Advice needed - AI for a graph game
I'm new to AI game programming. I currently need to build an AI algorithm for a game, I'm getting stuck, and I'd really appreciate advice from the experts here!
I'm trying to use a general approach, so basically my game can be modelled as a graph. There are two players, the human player and the AI.
Each step, each player chooses an action, and the game state changes to a new one.
The job of the AI is to assist the human player to get to the goal state.
I need to develop an algorithm for the AI player.
My current algorithm is to do an offline search to construct the whole game graph and precompute the shortest distance from each state to the goal state. After that, the AI loads the precomputed shortest-distance table into memory, and each step the AI player simply chooses the action that leads to the state closest to the goal. This method seems to work well, but the problem is that as the graph gets bigger, it takes an extremely long time to precompute the shortest distances, and the table may not even fit in memory. I notice that not all the states in the game are actually needed - the set of states visited during real game play is smaller. But I don't know exactly how to prune the search yet.
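In code, that offline precomputation might look roughly like this. This is a minimal sketch, assuming hypothetical helpers: `goal_states` is the set of winning states, `predecessors(s)` yields (previous_state, action) pairs leading into s, and `successors(s)` yields (action, next_state) pairs:

```python
from collections import deque

def build_distance_table(goal_states, predecessors):
    """Backward BFS from every goal state; returns {state: distance_to_goal}."""
    dist = {g: 0 for g in goal_states}
    frontier = deque(goal_states)
    while frontier:
        s = frontier.popleft()
        for prev, _action in predecessors(s):
            if prev not in dist:              # first visit = shortest distance
                dist[prev] = dist[s] + 1
                frontier.append(prev)
    return dist

def greedy_action(state, successors, dist):
    """Each step, pick the action whose successor is closest to the goal."""
    return min(successors(state),
               key=lambda pair: dist.get(pair[1], float("inf")))[0]
```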
One approach I could think of is to learn only the necessary part of the graph, say from several games played between two human players, but I'm not sure if this can work. I'm new to AI game programming, so I'd really appreciate any techniques you can suggest.
Thanks!
For general advice, there is Google. If you want help with your specific situation, you are going to have to describe the game in some detail.
Sorry, I'll provide the details:
- So there's a 2D maze of size 10 x 10.
- There are a few monsters (4-10)
- Each player can "lure" the monsters by coming close to them.
- Basically the two players will lure the monsters and let them (the monsters) "bump" into one another, and they're dead.
- The goal is to destroy all the monsters
Dude... if you're gonna post rules, post 'em all.
- Are the players working together?
- Do they alternate taking turns?
- Do they need to avoid running into the monsters as well?
- What is the range from which they can lure the monsters? (If it is only 1, there's gonna be problems.)
That being said, a tree-based minimax algorithm like the ones used for tic-tac-toe or chess would work. The tree search would need to be hooked up to your simulation of what the monsters "would do" given any game state. Then all you are doing is "people vs. monsters". The only caveat is that you need an extra flag to say which person's turn it is (I assume they can't move each other).
The order of each ply would be...
Person A's move
Monsters move
Person B's move
Monsters move
Person A's move
Monsters move
...
If you do your homework on minimax searches, you can see how this would work. The scoring system would be how many monsters get eliminated. When you find that condition in your tree, you simply score it so that you can roll it up the branches of the tree. The problem is that there are open-ended sequences, such as luring a monster around in circles, so you can't search all the way to the bottom of the tree (it doesn't exist). However, you can stop searching an infinite branch once you reach a ply where the last of the monsters is dead.
As long as you are simulating the exact logic of what the monsters do when it is really their turn, the search will give you exactly the optimal path to solve the game. Of course, the first move in that sequence is your next "hint" to the players.
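A depth-limited sketch of that search, in case it helps. Since the two players cooperate, both plies maximize; there are no min nodes. `player_moves`, `simulate_monsters`, `monsters_killed`, `all_monsters_dead`, and `other` are assumed stand-ins for the real game logic, and the depth cutoff handles the open-ended branches mentioned above:

```python
def search(state, turn, depth):
    """Returns (score, best_move); score = monsters destroyed on the best line."""
    if all_monsters_dead(state) or depth == 0:
        return monsters_killed(state), None
    best_score, best_move = -1, None
    for move, after_move in player_moves(state, turn):   # (move, resulting state)
        after_monsters = simulate_monsters(after_move)   # monsters reply every ply
        score, _ = search(after_monsters, other(turn), depth - 1)
        if score > best_score:
            best_score, best_move = score, move
    return best_score, best_move
```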
To make it "not so bloody obvious", you could give them perhaps the top 2 or 3 hints or give them a random one out of the top hints. You could use weighted randoms so that the best one is selected proportionally more often than lesser moves. (This method has the convenient side-effect of automatically discarding bad moves when there is only one good move.)
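One possible way to realize that weighted-random hint selection, assuming `scored` is a best-first list of (score, move) pairs coming out of the search:

```python
import random

def pick_hint(scored, top_k=3):
    top = scored[:top_k]
    weights = [score + 1 for score, _ in top]   # proportional to quality; +1 keeps weights positive
    moves = [move for _, move in top]
    return random.choices(moves, weights=weights, k=1)[0]
```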
Dave Mark - President and Lead Designer of Intrinsic Algorithm LLC
Professional consultant on game AI, mathematical modeling, simulation modeling
Co-founder and 10 year advisor of the GDC AI Summit
Author of the book, Behavioral Mathematics for Game AI
Blogs I write:
IA News - What's happening at IA | IA on AI - AI news and notes | Post-Play'em - Observations on AI of games I play
"Reducing the world to mathematical equations!"
* The players collaborate, so there's no loser. One player is controlled by a human, the other by the computer, and I need to write an algorithm for the computer player to assist the human player.
* They alternate taking turns
* They don't need to avoid running into the monsters
* The range to lure the monsters is only 1
* The monsters' movement is deterministic; once "lured", they just follow the player wherever the player goes
(Actually there's a very similar game online - http://gambit.mit.ed...game/dearth.php)
Thanks for your suggestions. They seem really useful, in particular the selection of the best action based on weighted random choices.
But it seems too expensive to search all the way down to the plies where all the monsters are destroyed. Also, since this is a cooperation game, I'm not sure if techniques such as minimax and alpha-beta can still be used to prune the search tree.
I'm thinking of making a heuristic position evaluation function which includes components such as:
+ How many monsters have been destroyed? (the more the better)
+ Has the AI player lured any monster? (to give it incentive to lure)
+ Distance from the AI player to the nearest monster which has not been lured (to give it incentive to lure)
+ Distance to the other player (to give it incentive to come and cooperate to destroy the monsters)
...
Is there any technique that I can apply to prune the search tree, or to make a reasonable heuristic evaluation function?
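For illustration, such an evaluation function might look like the sketch below. The feature helpers are hypothetical, and the weights are placeholders to tune by playtesting:

```python
def evaluate(state):
    return ( 10.0 * monsters_destroyed(state)            # the more the better
           +  3.0 * has_lured_a_monster(state)           # 0 or 1; incentive to lure
           -  1.0 * dist_to_nearest_unlured(state)       # closer is better
           -  0.5 * dist_to_other_player(state))         # incentive to cooperate
```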
Also, since this is a cooperation game, I'm not sure if techniques such as minimax and alpha-beta can still be used to prune the search tree.
That's correct. If the behavior of the monsters is mechanical and not malicious, this is essentially a puzzle (at least if you can control both characters). Finding a path from the current configuration to one where all monsters have been destroyed is a pathfinding problem, and A* is the obvious candidate algorithm. If the problem is too large for your program to find a solution by searching from the beginning, you can use an inadmissible heuristic (one that is allowed to overestimate the distance to the goal) to get to a suboptimal solution faster.
If you cannot trust the player to do the optimal thing, you need to have a model of what the player will do, but this is always a messy business, so I would try to avoid it.
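A minimal sketch of that idea as weighted A*: multiplying an admissible heuristic (say, the number of monsters still alive) by some w > 1 makes it inadmissible and typically expands far fewer nodes, at the cost of optimality. `successors` and `heuristic` are assumed helpers, and states must be hashable:

```python
import heapq, itertools

def weighted_a_star(start, is_goal, successors, heuristic, w=2.0):
    tie = itertools.count()                     # tie-breaker so the heap never compares states
    frontier = [(w * heuristic(start), next(tie), 0, start, [])]
    closed = set()
    while frontier:
        _, _, g, state, path = heapq.heappop(frontier)
        if is_goal(state):
            return path                         # sequence of actions to the goal
        if state in closed:
            continue
        closed.add(state)
        for action, nxt in successors(state):
            if nxt not in closed:
                f = g + 1 + w * heuristic(nxt)  # unit step cost
                heapq.heappush(frontier, (f, next(tie), g + 1, nxt, path + [action]))
    return None                                 # no solution found
```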
Also, since this is a cooperation game, I'm not sure if techniques such as minimax and alpha-beta can still be used to prune the search tree.
Well, it can, but I agree it deserves some thought. Conventional competitive minimax assumes that the opponent will act rationally, but the strategy doesn't fail if the opponent doesn't act rationally. In the cooperative case, though, the computer could end up putting the player in some pretty risky situations, assuming that the player would correctly execute the exact moves required to not die. (And vice versa; the computer's strategy could rely on the player "saving" it).
Honestly, in my opinion the creation of good teammate AI is one of the least satisfactorily-solved problems in game AI. Players want their teammates to be clever, helpful, observant. But not really. They actually want "clever" (that is, the player can come up with a good reason for why the teammate is doing that), "helpful" (the teammate does what the player wants it to do, without being told, even if it's a bad idea), and "observant" (that is, the teammate appears to react strongest to whatever the player feels like the teammate should be paying attention to).
Take a look at this paper on sheep herding AI; it might be useful.
Dave Mark - President and Lead Designer of Intrinsic Algorithm LLC
the AI loads the precomputed shortest-distance table into memory, and each step the AI player simply chooses the action that leads to the state closest to the goal. This method seems to work well, but [one] problem is that the table may not even fit in memory
Usually, it's cheaper to store a policy than to store the cost-to-go. I.e., rather than storing for each node how far it is from the goal (what is this, a 32-bit integer?), just store what the optimal action is to take (only log(N) bits, where N is the number of actions. I'm guessing log(N) ~= 2?) Granted, this is just a constant factor improvement, but when that constant factor is 16, you can have something significant!
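A sketch of that packing, assuming the four moves are encoded as 0-3 and states can be indexed densely as 0..N-1 (the indexing scheme itself is up to you):

```python
class PackedPolicy:
    """Stores one 2-bit action per state; four states per byte."""
    def __init__(self, num_states):
        self.data = bytearray((num_states + 3) // 4)

    def set(self, idx, action):                 # action in 0..3
        byte, slot = divmod(idx, 4)
        shift = slot * 2
        self.data[byte] = (self.data[byte] & ~(0b11 << shift)) | (action << shift)

    def get(self, idx):
        byte, slot = divmod(idx, 4)
        return (self.data[byte] >> (slot * 2)) & 0b11
```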
I notice that not all the states in the game are actually needed - the set of states visited during real game play is smaller. But I don't know exactly how to prune the search yet.
First let me make sure that you're working backwards from the goal, and not doing, "For each node X{ Use A* to find the shortest path from X to the goal}". Running A* once to find a single path is efficient, but A* is not what you want if you're trying to construct a policy.
Assuming you're constructing your policy efficiently and it's still too slow, you may want to turn to approximate dynamic programming. The simplest method is state aggregation, which is exactly what it sounds like; you lump clusters of states together. A nice way to do this is probabilistically; i.e., although the original game was deterministic, you use a probabilistic interpretation for your state aggregation ("If I only know I'm in cluster C1, that means I'm in state x1 with probability p1, state x2 with probability p2, etc."); then you get probabilities for transitions between clusters in a straightforward way ("If I only know I'm in cluster C1 but not which state exactly, then if I take action u1 I'll end up in cluster C2 with probability P2, C3 with probability P3, etc"). Then you've got a Markov Decision Problem (MDP) on the smaller graph of clusters.
One way to define clusters is to define features -- functions of the state -- that you think are significant (e.g., how far each player is from each monster, etc). Then all the states with the same feature values get lumped into the same clusters.
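A sketch of that feature-based aggregation; the feature extractors are illustrative stand-ins for whatever you decide is significant:

```python
from collections import defaultdict

def features(state):
    """Map a state to the feature tuple that names its cluster."""
    return (monsters_remaining(state),            # assumed helper
            min_player_monster_dist(state),       # assumed helper
            players_adjacent(state))              # assumed helper

def aggregate(states):
    """Lump all states with identical feature values into one cluster."""
    clusters = defaultdict(list)
    for s in states:
        clusters[features(s)].append(s)
    return clusters
```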
First let me make sure that you're working backwards from the goal, and not doing, "For each node X{ Use A* to find the shortest path from X to the goal}". Running A* once to find a single path is efficient, but A* is not what you want if you're trying to construct a policy.
I don't understand this: He has a specific starting configuration and we are suggesting that he run A* once to reach a goal configuration. What's wrong with that?