Meta
Level: Delver
Rank Points: 39
Registered: 12062016
IP: Logged
File: TLP.png (115.9 KB) Downloaded 28 times. License: Public Domain

Pursuing the optimal solutions in DROD RPG (+13)
I Background

I am terrible at playing DROD RPG. Even for the published holds that are rated as very easy, I can barely conquer them. After finishing some holds and looking at the high scores, I almost convinced myself that those scores are not legit, since they looked impossibly high to reach. This motivates me to figure out the optimal solutions, not just the highest known score on the leaderboard, so that I can prove that those scores are fake. DROD is open source after all, and it should not be too hard to create a fake demo. (Later, I discovered that many of those scores are legit, and not yet optimal.)
I got distracted quickly, until Mike announced the Twisty Little Passages (TLP) late 2018. The puzzles there are claimed to have essentially unique solutions. I did not want to spend much time verifying the claim and solving the puzzles myself. So, I wrote a TLP solver, whose front end is a puzzle editor written in Python (together with some handdrawn tile set to avoid possible copyright issue), and whose back end is a puzzle solver written in C++. Although the algorithm used is a simple brute force searching, it worked perfect on the demos of TLP, and proved that the solutions are essentially unique.
The success encouraged me to try it on some easy DROD RPG holds. Not surprisingly, my naive program did better than me, but still far worse than the high scores. The main reason was that I only spent half an hour to write the program, and I wrote a lot of dirty adhoc codes. The program was very slow, and both CPU and memory hungry. My old laptop couldnot handle it.
Then, I put it aside for a while. When TLP collected shipping information, I recalled that I had written such a program. I decided to invest some time in doing it better. I managed to get several optimal solutions this time, and I would like to share some information with you in this article.
In Section II, we will analyze the optimization problems, together with its decision counterparts, and their complexities. In Section III, some algorithms used in practice are shown. In Section IV, some potential issues are discussed. In Appendix, we will list the holds that we have tried, and show some tricks and how we use the algorithm to optimize them.
II Complexity

In DROD RPG, we have a player, who has some stats (for example, HP, ATK, YK). A hold consists of many rooms which are connected with each other in certain ways, and each room consists of some resources (for example, potions, gems, or even monsters). The player wants to go from one room, called entrance, to another room, called exit, without ever running out of resources. Later, we will simply refer this as the puzzle.
Note that the rooms here are referring to an abstract concept of the rooms in DROD. They are more like the corridors in DROD. We assume the following axioms:
* if the player decides to visit a room, then the player must gather every resource in the room in a predescribed order (for example, yellow door, monster, then yellow keys);
* the player can only visit a room that is adjacent to a visited room;
* the player can only visit a room if the visited rooms satisfy some predescribed property (this allows the puzzle to create situations like choosing at most one from multiple rooms);
* a room cannot be visited twice.
These axioms alone are not enough to cover all the possible puzzles. We will use it as a base line, and add more mild axioms if necessary.
Let us consider one very simple class of puzzles, a puzzle consisting of only one kind of resource. For example, a hold consisting of only yellow keys and yellow doors (1 YD can be viewed as 1 YK), or a hold consists of only potions and monsters/hot tiles that deal only constant damages regardless the player's stats. In 2018 Year of RPG Compilation, the entry Suprematism of Health by Skell is such a hold.
The optimization problem asks us to find a way to go to the exit with largest amount of resources and without ever having a negative resource at any time. The decision counterpart is simply asking to find such a way from the entrance to the exit. These two problems are related each other, and we leave it as an exercise to the readers.
Prop. The puzzle decision problem is NPComplete, and the puzzle optimization problem is NPHard.
Proof.
We view a room as a vertex weighted by an integer. Suppose that we have a (edge)weighted graph G whose edgeweights are nonnegative integers. We create a vertexweighted graph H consists of two type of vertices:
* Type I: vertices with large fixed positive integers (compared with the edgeweights of G), which represents the vertices of G,
* Type II: vertices with weights being the negation of the edgeweights of G, which represent the edges of G,
so that the graph H encodes the graph G. Put it in another way, the graph H is the subdivision of G such that exactly one vertex is added for each edge, and that the vertices are suitable weighted.
Now, considering the puzzle optimization for the graph H. Clearly, we need to visit all type I vertices, and we want to visit a small amount of Type II vertices so that the visited vertices are connected at all time, and the total weights of the visited vertices are as small as possible. Since the weights of Type I vertices are fixed, we only need to optimize the total weights of visited Type II vertices. If we translate this problem from graph H to graph G, we will see that we need to find a Steiner tree for the graph G.
It is wellknown that the Steiner tree optimization problem is NPHard, and Steiner tree decision problem is NPComplete. Therefore, the puzzle optimization problem and the puzzle decision problem are both NPHard.
On the other hand, if we have a way to go from the entrance to the exit, such a way is a certificate of polynomial length for the puzzle decision problem, hence the puzzle decision problem are in NP. Thus, the results follow.
Q.E.D.
This proposition implies that we should not expect any effective algorithm to find the optimal solutions, unless P = NP. But we still have some good news. If we only care about near optimal solutions, for instance, a 70% optimal solution (more precisely, 1 / log 4), then a polynomial time algorithm exists for Steiner tree problem. Therefore, we still have some hope to expect a polynomial time algorithm to solve the puzzle.
III Algorithm in practice

We first discuss algorithm of finding (global) optimal solutions, and later algorithms finding local optimal solutions.
III.1 Dynamic Programming

In many puzzles, we can see that the following additional axiom is satisfied:
* except for HP, the stats of the player after visiting a given set of rooms does not depend on the order of the rooms visited.
At the first glance, some common room does not satisfy the axiom, such as a room with a choice of either ATK gems or DEF gems. But, we can always construct the rooms (recall that the room is an abstract concept) in a ways that they satisfy the axiom and still resemble the effect we want. In the scenario mentioned above, we can create two rooms:
* Room A: ATK gems,
* Room B: DEF gems,
and require that:
* Room A can only be visited if Room B has not been,
* Room B can only be visited if Room A has not been.
Let us give another example here. Suppose that we have a hold with two levels so that we cannot backtrack from the second level the first level. Then, we can use the following constraints:
* Rooms in the first level can only be visited if none of the rooms in the second level have been visited.
With this additional axiom, given a specific set of given rooms, we only need to optimize the HP, since all other stats are already determined from the set itself. And the puzzle optimization problem reduces to HP optimization problem for all subsets of rooms that contain the exit. Therefore, it can be solved using dynamic programming.
In the worst case scenario, the time complexity is exponential. But, we already know that the puzzle optimization problem is NPHard, and we cannot hope a much better algorithm. In practice, there are many ways to speedup the dynamic programming.
III.2 Local Constraints

In many fields, we have some kind of localglobal principle, that it, some global object is equivalent to all its corresponding compatible local objects. In this game, although we do not quite have an equivalence, at least we know that a global optimal solution should be a local optimal solution everywhere. By saying a solution is local optimal, I mean that it cannot be improve by making some "local" moves.
One such kind of move is simply changing the order in which the rooms are visited locally. Let us show an example here. Suppose that in a solution (i.e. a sequence of rooms), the player visit Room A at time 3, and visit Room B at time 6. A swap will let the player visit Room A at time 6 and visit Room B at time 3. If the solution is local optimal, then the solution after the swap should not be better than the original solution.
There are more kinds of local moves, and we will not discuss them here. From now on, we assume that we have some consense on the meaning of local moves.
It is quite easy to check if a solution is local optimal. It means that, given a solution, we can always try to improve it until it reaches a local optimal. The difficult part is how to get to the global optimal when we stuck at the local optimal. I do not have a good way, but I would try some deep neural network and see if it works.
Nevertheless, we have some algorithm that can be used to improve existing solutions, and the local constraints can be used to reduce the amount of states that are needed to be checked in the dynamic programming mentioned above.
IV Possible issues

In order to run any algorithms, we first need to model the holds in a way that the axioms are satisfied. If the number of the rooms is small, we do have a practical algorithm to determine the optimal solutions, and when that is large, at least we can get a local optimal solutions. So, if we can model the holds correctly and simplify the model, we have a high chance to get the optimal solutions in reasonable time.
On the other hand, I usually find in practice that a lot of mistakes could happen at the modeling and simplification step. If we overlook a trick needed, the algorithm can only give the optimal solutions under the assumption that the trick is not used. For example, it is quite easy to overlook possible door walkings.
The dynamic programming algorithm will use a lot of resources. We need to store information for a lot of states somewhere. We can use local constraints to reduce the number of states dramatically. I have not implemented many such constraints yet, and for many holds, my old laptop is very slow and does not have enough memory. I may ended up writing some better codes, and putting the data on the hard drives instead of the memory.
Appendix

I will show below, for some small holds (ordered by hold ID), how I use the algorithm to optimize it, and possible some tricks used. For most of them, I either failed to conquer them myself, or only get a very low score myself. I am amazed by the strategies discovered by the program. Now I can see how a person who is good at this game will play, and I know for sure that the high scores on the leaderboard are real. If you want to know exactly how my program plays, just send me an email.
Hold 444: 67th Grayman adventures by Slava

The first score point has a unique solution. For the second score point, my program optimizes the amount of HP to take into 1N2W, and the order to pick up gems and fight spider.
Hold 484: Judy's Mansion by Edward

Although it is rated as the easiest published hold, my program tells me that it is far from being easy to get the optimal solutions for all score points. For almost every score point, we need to do something different, especially for the one killing aumtlich due to the existence of aumtlich beam. One trick for this hold is to use door walking to get 2 yellow keys and 1 green key by killing only one monster.
Hold 643: TenMinutes Distraction by Suwako

Get sword, then shield. Gather DEF first if possible.
Hold 669: 2016 Year of RPG Compilation: One by Bomber50

This is a typical puzzle that can be solved easily by the algorithm.
Hold 669: 2016 Year of RPG Compilation: This or That by Bomber50

Due to its structure, we can speed up the algorithm dramatically and get the optimal solution in milliseconds, despite that it is quite long. I have no way to figure out the magic sequence myself.
Hold 669: 2016 Year of RPG Compilation: Escalation by RedXIII

The puzzle is a bit long to use the algorithm directly. So, I have to apply the algorithm to various key steps. Since the multiplier is very high, applying the algorithm to key steps should not effect the final result much. By the way, I used the wall walking to get the gems once I have 800 GR (door walking in, wall walking out).
Hold 670: One Way by RedXIII

We want to preserve as many keys as possible, which means that it is not that easy to get back once we pass 6N. I used the algorithm to optimize before 6N, and after 6N, with reasonable simplification.
Hold 675: Rock Island by Kieran Millar

The most important step is to get 36 ATK. We can either go clockwise, or go counterclockwise. One direction is far superior than the other. After making the initial moves, the algorithm takes care of the rest.
Hold 679: Get Your Stuff Back by Hyperme (first score point)

One advantage of having a program is that we can test the best altar use without thinking about it ourselves.
Hold 705: Royal Rescue by Bomber50 (first score point)

We need to use door walking to get the shield.
Hold 708: Goblin King's Fortress by Kieran Millar (first score point)

There are too many places where the pickaxe can be used. It turns out that we do not want to use the pickaxe at all.
Hold 711: Goblin Hollow by Bomber50

My algorithm teaches me that the priority is to get 20 DEF at 2N1W and 2N1E.
Hold 718: Throw by Xindaris

I am surprised to know that we want to kill Mud Baby at 2N1E first in order to get 24 ATK.
Hold 753: Minimal Design by Tio

This hold is very complicated to model. I am not sure if the solution given by my program is optimal.
[Last edited by Meta at 01092020 06:00 AM]
