Hello again
I've been working on this for the last couple of weeks, and here is what happened:
My game engine overall was slow, very slow. I spent a lot of time (too much) optimizing every function that was taking too long. The task was basically to optimize the whole game structure, since 90% of my battle-related code was being called over and over...
My game engine's performance has improved vastly since last time, but sadly I'm nowhere near the level of performance I'd like :/
I've removed all the utility-based AI from my code and decided to give MCTS a try. Mixed results so far.
If I disregard some issues in certain browsers (love you IE...), then with a simple fight involving a low number of skills/buffs/other costly mechanisms, I can run the MCTS process around 1500 times during my AI phase (1.5 seconds). That means 1500 nodes created, 1500 playouts (limited to 30 turns), and 1500 evaluations.
If I skip the playout phase entirely, I can run around 8,000 MCTS iterations. Conversely, if I allow a maximum of 300 turns during the playout, I get around 200 iterations. (300 turns is roughly how long my test battle takes to reach a win/loss state.)
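For reference, the loop I'm timing is roughly this (a rough TypeScript-style sketch; the hook names and shapes are placeholders, not my actual engine calls):

```
// Time-budgeted MCTS loop. The engine hooks are passed in as callbacks;
// names are assumptions made for this sketch, not my real API.
interface MctsHooks<Node> {
  select(root: Node): Node;                       // UCB1 descent to a leaf
  expand(leaf: Node): Node;                       // add and return one child
  playout(node: Node, maxTurns: number): number;  // random playout, returns a score
  backpropagate(node: Node, score: number): void; // update visit/score counts up the tree
}

function runMcts<Node>(root: Node, hooks: MctsHooks<Node>, budgetMs: number, playoutTurns: number): number {
  const deadline = Date.now() + budgetMs;
  let iterations = 0;
  while (Date.now() < deadline) {
    const child = hooks.expand(hooks.select(root));
    // The playout is capped (e.g. 30 turns) instead of running to a real
    // win/loss, which is why raising the cap to 300 cuts the iteration count so hard.
    const score = hooks.playout(child, playoutTurns);
    hooks.backpropagate(child, score);
    iterations++;
  }
  return iterations; // ~1500 with the 30-turn cap in my 1.5 s AI phase
}
```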
So far it makes OK-ish decisions (using the 30-turn max playout) but still makes some silly decisions from time to time.
I can't help thinking that 1500 MCTS iterations is too few to get something decent. At the same time, I've optimized my code as best I could and can't find anywhere else to gain speed at the moment. Which is kind of scary, because I anticipate that under real circumstances the AI will run even slower (it will have more things to check than in my test setup).
For now, one of the main issues besides execution speed is that it has trouble handling RNG. I've stopped using fixed random seeds (that was way too time-consuming to be viable). That means that if my AI tests a move and rolls a "miss" on the attack, it will likely discard that branch in favor of another attack that did not miss. The issue is not really that it selected the second attack, but that it avoided the first one based on a false idea of its chance of success. It basically means that to be chosen, an attack has to pass a hit check during the AI phase AND another one during the real move processing. Hit chance double-dip = bad design here.
I've found a not-so-elegant solution that doesn't solve the issue entirely but at least diminishes its impact: only at the root of my tree (the unit I actually need to pick a move for), I add 2-3 nodes for each possible move. So if that move is a fireball, the AI will create 3 children resulting from casting a fireball on a target. If one hits and two miss, for example, the AI will focus on the hit branch for that spell and is more likely to select it in the end. Conversely, if my hit chance is really low, there's a good chance my AI will miss all 3 times and not select the move.
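Roughly, that root expansion looks like this (a sketch with made-up names; applyMove is assumed to roll hit/crit/procs on the spot, so each copy is one concrete outcome):

```
// Minimal sketch of the "2-3 children per move at the root" workaround.
interface Move { id: string; }
interface BattleState { clone(): BattleState; }

interface RootChild {
  move: Move;
  state: BattleState; // state after one already-rolled outcome of the move
  visits: number;
  wins: number;
}

function expandRoot(
  state: BattleState,
  moves: Move[],
  applyMove: (s: BattleState, m: Move) => BattleState, // rolls hit/crit/procs internally
  copiesPerMove = 3
): RootChild[] {
  const children: RootChild[] = [];
  for (const move of moves) {
    // Each copy re-rolls the RNG: a fireball may show up once as a hit and
    // twice as a miss, and the hit branch then attracts the later visits.
    for (let i = 0; i < copiesPerMove; i++) {
      children.push({ move, state: applyMove(state.clone(), move), visits: 0, wins: 0 });
    }
  }
  return children;
}
```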
All in all, it has many flaws (mainly that the number of nodes added cannot be balanced properly against the percentage of each RNG stat, so it will still be RNG-dependent... just a bit less than with only 1 node per move).
I'm open to any kind of better solution for this!
I think that with more processing power (for example, if I was able to compute 15,000 moves per AI phase), the MCTS would likely be able to go much further down the tree and, over the course of XXX turns, see that a failed move at depth 1 is not a big deal, since everything after it can also fail or not; with a large number of turns evaluated, the hit/miss of each move kind of averages out along each branch.
EDIT: I was wrong here; giving it more time doesn't solve the underlying issue. I need to find a better way to handle the RNG nature of the moves.
One thing to note about my tree: since I know with certainty what moves the player will make, I've built a tree with only the monsters' positions in it. When creating a child node, I process turns until a monster unit should play. That allows me to have fewer nodes in my tree overall and go deeper/faster into the game.
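In code, child creation basically does this before anything else (a sketch with placeholder names; the real version obviously works on my engine's state object):

```
// Fast-forward through the known player turns so only monster decisions
// end up as nodes in the tree. Names here are illustrative only.
interface TurnState {
  activeUnitIsMonster(): boolean;
  battleOver(): boolean;
}

function advanceToNextMonsterTurn(
  state: TurnState,
  playScriptedPlayerTurn: (s: TurnState) => TurnState // the player's known/scripted move
): TurnState {
  let s = state;
  // Resolve every player turn in between; stop only when a monster unit
  // should act (or the battle ends).
  while (!s.battleOver() && !s.activeUnitIsMonster()) {
    s = playScriptedPlayerTurn(s);
  }
  return s;
}
```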
I haven't yet tried to do much optimization on the logic of the MCTS itself (like adding domain knowledge and such), but I will come to it soon. A few questions I've got about that:
- Is it worth it to add more than 1 child per expansion phase?
- Is it useful to do more than one evaluation per MCTS run?
- Should I initialize the win/simulation counts used during the selection process to something closer to the parent's values, or keep them at 0/0? (See the sketch after this list for what I mean.)
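To be clear about the third question, here's what I mean: standard UCB1 for selection, with the option of seeding a brand-new child with the parent's win rate instead of 0/0 (names and constants below are just for illustration):

```
// Plain UCB1 plus the two initialization options I'm asking about.
interface UcbNode {
  visits: number;
  wins: number;
  children: UcbNode[];
}

function ucb1(child: UcbNode, parentVisits: number, c = Math.SQRT2): number {
  if (child.visits === 0) return Infinity; // always try unvisited children first
  const exploitation = child.wins / child.visits;
  const exploration = c * Math.sqrt(Math.log(parentVisits) / child.visits);
  return exploitation + exploration;
}

function makeChild(parent: UcbNode, usePrior: boolean): UcbNode {
  // Option A: start at 0/0. Option B: seed with the parent's average as a
  // "virtual" visit, so the first real playout doesn't swing the node as hard.
  const priorVisits = usePrior ? 1 : 0;
  const priorWins = usePrior && parent.visits > 0 ? parent.wins / parent.visits : 0;
  return { visits: priorVisits, wins: priorWins, children: [] };
}
```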
I'm currently trying to find a way to prune entire branches based on game knowledge, but so far I've failed to find easy-to-assess situations where I could prune a move with quasi-certainty.
I'm also thinking about using faster, less accurate versions of the functions I use when processing normal moves. This would allow me to gain speed, potentially a lot, by skipping some computations that have less effect on the end result, but it comes with a potentially huge loss of accuracy overall and may lead the AI to do stupid things. For example, if I remove every life/mana leech computation from the AI process, it will be faster but will fail to properly evaluate situations where monsters or players rely heavily on leech.
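Something like this is what I have in mind (a sketch with invented names and fields, not my real move processing):

```
// The same move-resolution code, with a flag that skips the expensive
// secondary effects when it is called from inside the MCTS playouts.
interface SimUnit { hp: number; mp: number; }

interface ResolveOptions {
  simulateLeech: boolean; // skip during playouts for speed, keep for real moves
  simulateProcs: boolean;
}

function resolveAttack(attacker: SimUnit, target: SimUnit, damage: number, opts: ResolveOptions): void {
  target.hp -= damage;
  if (opts.simulateLeech) {
    // Accurate path: attacker heals for part of the damage dealt.
    attacker.hp += Math.floor(damage * 0.2);
  }
  // ...other optional computations gated the same way...
}
```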
I like the MCTS idea but my lack of processing power worries me a lot ^^
If you have ideas / suggestions / remarks / criticism / questions, do not hesitate
EDIT: It turns out that even if I allow my MCTS to run for longer than planned, it still doesn't give me decent results. So I'm shifting my focus from game engine optimization to fixing up an MCTS version that can give decent results, even if it needs a long time frame to start with.
One of my main issues is how to handle the non-deterministic outcomes of my moves. I've read a few papers about this but I'm still in the dark. Some suggest determinizing the game information, running an MCTS tree on that, repeating the process N times to cover a broad range of possible game states, and using that information to make the final decision. In the end it multiplies our computing time by a huge factor, since we have to compute N MCTS trees instead of one. I can't rely on that, since over the course of a fight I've got thousands of RNG elements: 2^1000 MCTS trees to compute, when I already struggle with one, is not an option :)
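For reference, my understanding of that determinization approach is roughly this (a sketch with placeholder names, not code I'm actually running):

```
// Fix all the RNG up front, run one full MCTS per determinized world,
// repeat N times, then vote on the most frequently chosen move.
function chooseMoveByDeterminization<World, Move>(
  determinize: () => World,           // sample one fully-determinized game state
  runMctsOn: (world: World) => Move,  // full MCTS on that deterministic world
  moveKey: (m: Move) => string,
  samples: number
): Move {
  if (samples < 1) throw new Error("need at least one determinized sample");
  const votes = new Map<string, { move: Move; count: number }>();
  for (let i = 0; i < samples; i++) {
    const best = runMctsOn(determinize());
    const key = moveKey(best);
    const entry = votes.get(key) ?? { move: best, count: 0 };
    entry.count++;
    votes.set(key, entry);
  }
  // Pick the move that won the most determinized trees.
  let winner: { move: Move; count: number } | undefined;
  for (const v of votes.values()) {
    if (!winner || v.count > winner.count) winner = v;
  }
  return winner!.move; // safe: samples >= 1 guarantees at least one vote
}
```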
My idea of adding X children for the same move doesn't seem to be leading to a good answer either. It smooths the RNG curve a bit, but can shift it in the opposite direction if the value of X is too big/small compared to the percentage of a particular RNG roll. And since I've got multiple RNG rolls per move (hit chance, crit chance, chance to proc something, etc.), I cannot find a decent value of X that satisfies every case. More of a band-aid than anything else.
Likewise, adding 1 node per RNG tuple {hit or miss, crit or not, proc1 or not, proc2 or not, etc.} for each move would cover every possible situation, but has some heavy drawbacks: with only 5 RNG mechanisms, that means 2^5 nodes to consider for each move, which is way too much to compute. If we managed to create them all, we could assign each a probability (derived from the probability of each RNG element in the node's tuple) and use that probability during the selection phase. This should work overall, but would be really hard on the CPU :/
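Enumerating those tuples with their probabilities would look something like this (a sketch with placeholder names; the probability would then weight the child during selection):

```
// Expand every combination of independent binary RNG mechanisms into one
// child per outcome tuple, each carrying the product of its chances.
interface RngMechanism { name: string; chance: number; } // e.g. { name: "hit", chance: 0.85 }

interface OutcomeChild {
  outcomes: Record<string, boolean>; // e.g. { hit: true, crit: false, proc1: false }
  probability: number;               // product of the individual chances
}

function enumerateOutcomes(mechanisms: RngMechanism[]): OutcomeChild[] {
  let children: OutcomeChild[] = [{ outcomes: {}, probability: 1 }];
  for (const m of mechanisms) {
    const next: OutcomeChild[] = [];
    for (const child of children) {
      next.push({ outcomes: { ...child.outcomes, [m.name]: true }, probability: child.probability * m.chance });
      next.push({ outcomes: { ...child.outcomes, [m.name]: false }, probability: child.probability * (1 - m.chance) });
    }
    children = next;
  }
  return children; // 2^mechanisms.length entries, probabilities sum to 1
}
```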
I also cannot "merge" them into one single node, since I've got no way of averaging the player/monster stat values accurately across two different game states. Averaging the move's result during the move processing itself is doable, but requires a lot of simplifications that are a pain to code and would hurt accuracy really fast anyway.