
What AI for a card game?

Started by August 10, 2006 08:27 AM
22 comments, last by Bob Janova 18 years, 2 months ago
Gorogoro :
Do you have links to places where these kinds of techniques are used?
I thought about using a genetic algorithm to tune the values that I give to my evaluation function, but I don't know how to go any further.

DarkZoulz :
I am confused... How can a state machine tell me which card to play? I have a given state for the game and for the board (which cards are played, their location, and their current state: HP and some modifiers). Sure, once I have a robust algorithm to choose the card the AI should play, I can tweak some parameters based on the game state to change the strategy (like the AI is winning, so it gets more aggressive, or whatever), but what I'm really after is the core card chooser algorithm.


stevenmarky :
My current minimax algorithm uses alpha-beta pruning (what other kind of pruning is there? Pruning branches that are "obviously" bad ones?). It uses a depth of 4, with a choice between 12 cards that can be played on 5 locations. Not all the locations are free at any one time, but I can quickly reach 200,000 state evaluations with these numbers...
One more thing: my implementation is in Python (because all my game state transitions and card implementations are in Python), so 200,000 evaluations take quite some time.

stevenmarky and Kylotan :
I'm not sure how to use minimax with randomness:
If I am at one game state and a card attacks with a damage of 10 - 15, will it create 6 different states (one for each possible damage value)?
Minimax needs states in its branches. I understand I can use the probabilities to evaluate a given state: e.g. card A has a 25% chance to attack, and 10-15 damage makes a 12.5 / 4 = 3.125 global attack value.
But I can't create a new game state and a new branch from this... the results are really different depending on whether the opponent's card is dead or not...

kirkd :
I'm not sure I follow you...
You mean that for 'some' game state, I try all the possibilities a large number of times, and record which one is the best?
But I have two big issues with such a technique: I have a really large number of possible game states. Even by trying to find some kind of pattern, it is still really big.
The second issue is that I don't know how to evaluate whether the result is good unless the result is 'Player One wins' or 'Player Two wins'. If I had such a magic perfect evaluation function, I would just choose the better card. As in a chess game, a move may only show how good it was some time after it was played...



Everyone :
Thanks all for your advice...
For the moment, I will try to build a better evaluation function; it seems to be the most important part of it...


Emmanuel
Quote: Original post by Emmanuel77

DarkZoulz :
I am confused... How can a state machine tell me which card to play? I have a given state for the game and for the board (which cards are played, their location, and their current state: HP and some modifiers). Sure, once I have a robust algorithm to choose the card the AI should play, I can tweak some parameters based on the game state to change the strategy (like the AI is winning, so it gets more aggressive, or whatever), but what I'm really after is the core card chooser algorithm.

Emmanuel


Well, I was thinking that the "core card chooser" should probably choose a card based on a strategy or personality. Exactly which card it plays is really not that relevant. If the AI is aggressive, it plays cards that are marked as aggressive. If you want it to make even more specific card choices, you could just have more states and divide your cards into more specific categories.
Genetic algorithms with a greedy-probabilistic selection/deselection of cards during mutation sound like a method that would produce good combinations, personalized decks, and opponents ordered by skill. Just keep subpopulations for diversity, in which the best combinations are stored and the mutation factors vary greatly. This way, it can explore for "more global" maxima while fine-tuning local maxima; the latter kind would seem more natural to the player. If you have many different sorts of cards, you could also rank them according to a heuristic evaluation and the general level of the decks they appear in. After one or two poor decks with a card in them, you could for example reduce the probability of it being selected for another deck.
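A rough Python sketch of that scheme (everything here is hypothetical: `strength` is a per-card heuristic rating, and `fitness` rates a whole deck, e.g. by win rate in simulated matches):

```python
import random

DECK_SIZE = 12  # arbitrary deck size for the sketch

def mutate(deck, card_pool, strength, steps):
    """Greedy-probabilistic mutation: weak cards are more likely to be
    dropped, strong cards more likely to be picked as replacements."""
    deck = list(deck)
    for _ in range(steps):
        drop_weights = [1.0 / (1.0 + strength[c]) for c in deck]
        i = random.choices(range(len(deck)), weights=drop_weights)[0]
        pick_weights = [strength[c] for c in card_pool]
        deck[i] = random.choices(card_pool, weights=pick_weights)[0]
    return deck

def evolve(card_pool, strength, fitness, generations=100):
    """Subpopulations with different mutation strengths: the wild ones
    explore for 'more global' maxima, the tame ones fine-tune."""
    subpops = {steps: [random.sample(card_pool, DECK_SIZE) for _ in range(20)]
               for steps in (1, 3, 6)}
    for _ in range(generations):
        for steps, decks in subpops.items():
            decks.sort(key=fitness, reverse=True)
            parents = decks[: len(decks) // 2]   # elitism: keep the best half
            decks[len(decks) // 2:] = [
                mutate(random.choice(parents), card_pool, strength, steps)
                for _ in range(len(decks) - len(parents))]
    # best deck found across all subpopulations
    return max((d for ds in subpops.values() for d in ds), key=fitness)
```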
Thanks, CWenner...


Actually, I already planned to use a genetic algorithm to find decks, to tune the design of my cards, and to tune the evaluation function, in order to make the AI smarter.

However, my trouble is not in deck creation; it is, once I have a deck, choosing the move to make from the current hand.

Thanks anyway,


Emmanuel
Quote: Original post by Emmanuel77
stevenmarky and Kylotan :
I'm not sure how to use minimax with randomness:
If I am at one game state and a card attacks with a damage of 10 - 15, will it create 6 different states (one for each possible damage value)?
Minimax needs states in its branches. I understand I can use the probabilities to evaluate a given state: e.g. card A has a 25% chance to attack, and 10-15 damage makes a 12.5 / 4 = 3.125 global attack value.
But I can't create a new game state and a new branch from this... the results are really different depending on whether the opponent's card is dead or not...


Minimax is essentially about minimising risk. So to do it properly, the algorithm could assume that an opponent would do 15 damage with such a card, yet the current player would deal only 10. However, due to the random aspect, that is overcautious. So if the algorithm was willing to take more risks, it could assume that it will always do 12.5 damage. You could even tune it between these two extremes.

However, I don't see why you can't just create the 6 different states each time anyway. 6 is not a large branching factor, and many of those will lead to the same response from the opponent anyway. You just have to make sure you're pruning off the unlikely parts of the game tree. You could even generate all the states and then merge the similar ones. In the above example, you might find that damage scores 10-14 result in the target surviving and 15 results in it dying, which can be coalesced into one state with the target having taken 'about 12 damage' and another state where the target is dead, with the first fitness score multiplied by 5/6 and the second fitness score multiplied by 1/6 to accommodate the relative probabilities.
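A rough sketch of that coalescing in Python, with hypothetical `apply_damage` and `fitness` helpers (the bucket key here is just 'does the target die', as in the example):

```python
from collections import defaultdict

def expected_fitness(state, target, low, high):
    """Enumerate each damage roll, bucket the outcomes by whether the
    target dies, and weight each merged state's fitness by probability."""
    buckets = defaultdict(list)          # dies? -> list of damage rolls
    for dmg in range(low, high + 1):
        buckets[dmg >= target.hp].append(dmg)
    n = high - low + 1
    total = 0.0
    for dies, rolls in buckets.items():
        avg = sum(rolls) / len(rolls)    # one representative state per bucket
        merged = apply_damage(state, target, avg)    # hypothetical helper
        total += (len(rolls) / n) * fitness(merged)  # hypothetical helper
    return total
```

With damage 10-15 against a 15 hp target, this produces exactly the two branches above, weighted 5/6 and 1/6.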

Quote:
kirkd :
I'm not sure I follow you...
You mean that for 'some' game state, I try all the possibilities a large number of times, and record which one is the best?
But I have two big issues with such a technique: I have a really large number of possible game states. Even by trying to find some kind of pattern, it is still really big.


That's exactly what minimax and alpha-beta pruning are all about. Chess has exactly the same problem (20 possible moves on turn one, 400 possible positions by turn 2, probably about 8000 by turn 3, etc).

Quote: The second issue is that I don't know how to evaluate whether the result is good unless the result is 'Player One wins' or 'Player Two wins'. If I had such a magic perfect evaluation function, I would just choose the better card. As in a chess game, a move may only show how good it was some time after it was played...

...

For the moment, I will try to build a better evaluation function; it seems to be the most important part of it...


Definitely. In AI, representation is everything. You have to work out how best to represent your game states, and how to represent the quality of each state.

A good starting point here would be to simply count up the value of everything in play. E.g. in Magic: The Gathering, simply counting up the mana cost of cards in play, plus adding a bit extra for the number of mana producers you have, would yield a decent estimate of how well the game is going. Many simple chess programs simply add up the point values of the pieces on the board, and that is sufficient in many cases. You can then look into adding strategic bonuses for various situations.
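A rough sketch of such a material count in Python (the Card fields, the `cards_in_play` / `opponent_of` helpers, and the 0.5 producer bonus are all hypothetical tuning choices):

```python
def evaluate(state, player):
    """Sum up the mana cost of everything in play, with a small bonus
    per mana producer; positive means 'player' is ahead."""
    def material(p):
        cards = state.cards_in_play(p)               # hypothetical helper
        base = sum(c.mana_cost for c in cards)
        bonus = 0.5 * sum(1 for c in cards if c.is_mana_producer)
        return base + bonus
    return material(player) - material(state.opponent_of(player))
```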

Quote:
Minimax is essentially about minimising risk. So to do it properly, the algorithm could assume that an opponent would do 15 damage with such a card, yet the current player would deal only 10. However, due to the random aspect, that is overcautious. So if the algorithm was willing to take more risks, it could assume that it will always do 12.5 damage. You could even tune it between these two extremes.


I didn't think about it that way (using max damage). It is interesting, although it can lead to very incorrect situations (imagine a 2-8 soldier against a troll that regenerates 5 HP each round; in the real game, the soldier can't harm the troll, as it will regenerate all the wounds, and using the max hides this situation). But I admit this is really a special situation, and I'm not sure it can happen anywhere else than in discussion forums...

Quote:
However, I don't see why you can't just create the 6 different states each time anyway. 6 is not a large branching factor, and many of those will lead to the same response from the opponent anyway. You just have to make sure you're pruning off the unlikely parts of the game tree. You could even generate all the states and then merge the similar ones. In the above example, you might find that damage scores 10-14 result in the target surviving and 15 results in it dying, which can be coalesced into one state with the target having taken 'about 12 damage' and another state where the target is dead, with the first fitness score multiplied by 5/6 and the second fitness score multiplied by 1/6 to accommodate the relative probabilities.


Generating all the states is definitely too much, as I'm already limited by the number of nodes I generate / inspect, but I was thinking, as you suggest, of generating 'ranges' of damage with probabilities. Really interesting; I'm not sure I will have time to make it concrete, but it's a good thing to think about...

Quote:
That's exactly what minimax and alpha-beta pruning are all about. Chess has exactly the same problem (20 possible moves on turn one, 400 possible positions by turn 2, probably about 8000 by turn 3, etc).


Yep, I know that, but I think kirkd was talking about an offline pre-process to store all possible states (he was speaking of millions of iterations to construct the database).

Last thing, the evaluation function:
I already have a simple evaluation that uses the strength of all cards involved,
but it is not good enough.
So I was thinking about something more complex.
My problem here is that I would like this function not to have knowledge of card implementations / special powers, so that I can add cards without changing the evaluation function.
One way to deal with this is to have the cards implement an evaluate interface, so that they evaluate themselves...
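A rough sketch of that interface in Python (all names here are illustrative, not Emmanuel's actual code; `allies_of` is a hypothetical helper):

```python
class Card:
    def __init__(self, attack, hp):
        self.attack, self.hp = attack, hp

    def evaluate(self, state):
        """Default self-rating from the card's own stats."""
        return self.attack + self.hp

class ProtectorCard(Card):
    def evaluate(self, state):
        # A special power rates itself, so the global evaluation
        # function never needs to know about shields.
        allies = state.allies_of(self)                # hypothetical helper
        return super().evaluate(state) + len(allies)  # 1 absorbed point each

def evaluate_state(state, my_cards, their_cards):
    return (sum(c.evaluate(state) for c in my_cards)
            - sum(c.evaluate(state) for c in their_cards))
```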


Anyway, thanks for your post, it was definitely interesting!!


Emmanuel

Quote:
However, I don't see why you can't just create the 6 different states each time anyway. 6 is not a large branching factor, and many of those will lead to the same response from the opponent anyway. You just have to make sure you're pruning off the unlikely parts of the game tree. You could even generate all the states and then merge the similar ones. In the above example, you might find that damage scores 10-14 result in the target surviving and 15 results in it dying, which can be coalesced into one state with the target having taken 'about 12 damage' and another state where the target is dead, with the first fitness score multiplied by 5/6 and the second fitness score multiplied by 1/6 to accommodate the relative probabilities.

Alpha-beta merely rests on the assumption that all players will choose the best move according to some depth and underlying functions. This does eliminate risk, but only such risk as the agents can influence. Taking the best/worst outcomes for uncertain variables is quite different! What alpha-beta needs to do in such a situation is, as you say, branch and calculate the expected outcome (i.e. the average value of the outcomes), after potentially further search. For example, on reaching point A at depth 3, take each of the two outcomes B and C, search an additional two plies, return the alpha-beta values for B and C, and set A = P(B) * B + P(C) * C. The extra branching factor of 6 is generally horrible, though, unless you can do some sort of approximation.
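A rough sketch of that chance-node idea (expectiminimax) in Python; the node helpers (`is_chance`, `outcomes`, `moves`, `evaluate`) are hypothetical, and alpha-beta pruning is left out because pruning through chance nodes needs extra machinery (e.g. *-minimax bounds):

```python
# Chance nodes average their children by probability, which is exactly
# A = P(B) * B + P(C) * C from the post above.
def search(node, depth, maximizing):
    if depth == 0 or node.is_terminal():
        return node.evaluate()
    if node.is_chance():
        # Expected value over the random outcomes; whose turn it is
        # does not flip at a chance node.
        return sum(p * search(child, depth - 1, maximizing)
                   for p, child in node.outcomes())
    values = [search(child, depth - 1, not maximizing)
              for child in node.moves()]
    return max(values) if maximizing else min(values)
```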

General:
Something I wouldn't recommend when searching is simulation, unless you either plan on giving it several thousand runs or perform it only a few times. Since alpha-beta will pick the best value out there, a handful of simulations with high variance isn't going to give a rational result.
If you for example pick n random numbers in (0,1), the maximum has CDF F(p) = p^n, so its density is f(p) = n * p^(n-1), and you're going to find the expected maximum to be
E[max] = Int(p=0..1, p * n * p^(n-1) dp) = n * Int(p=0..1, p^n dp) = n/(n+1)
I.e. on average:
0.5, 0.67, 0.75, 0.8, 0.83, ... for n = 1, 2, 3, 4, 5
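A quick brute-force check of that expectation (purely illustrative):

```python
import random

# Empirically confirm E[max of n uniforms] = n/(n+1).
trials = 100_000
for n in range(1, 6):
    avg = sum(max(random.random() for _ in range(n))
              for _ in range(trials)) / trials
    print(n, round(avg, 3), round(n / (n + 1), 3))
```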
An exception may occur in the case where you can generate the random variables once and reuse them across simulations. An example of this would be the cards currently at the top of the deck: a hundred or so sets could be generated in advance and iterated over during simulation, for a much smaller effect from the variance (a proper model would be 50 per ply). Actually testing how much it affects things is recommended, though.

For integer ranges, you could add the random variables to calculate the mean, or the probabilities of being below or above some value (which generally is what you'd use in utility functions and successor functions - whether a creature dies / fully regenerates, or, if you reach the end, how many points it would have left on average). If the first attack does 3-4 damage and the second does 1-3, then the final damage will be between 4 and 7, but you can also calculate the probabilities for each outcome as 4 + X, where 0 <= X <= 3. The weights would be (1,2,2,1): 1 way to get 4 (3 and 1) or 7 (4 and 3), and 2 ways to get each of the others. If you have sets of weights <a_i> and <b_i>, you can easily construct a new set <c_i>: the weight of a total offset i is the sum, over each value j of the first outcome, of a_j times the weight of the second outcome being i - j. If the offset should be 3, for example, you go over 0, 1, 2 and 3 for the first outcome, pairwise multiplying with the weights for the second to be 3, 2, 1, 0. Formally, if n_a and n_b are the max offsets (i.e. 1 less than the number of weights):
c_i = sum(j = max{0, i-n_b} .. min{i, n_a}, a_j * b_(i-j))
total = sum(i = 0..n_c, c_i)
n_c = n_a + n_b
The mean and the probabilities for the min/max are then easy, e.g.
P(c >= k) = sum(i = k..n_c, c_i) / total
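A rough sketch of that convolution in Python (names are illustrative):

```python
def convolve(a, b):
    """Combine two weight vectors, where weights[i] is the number of
    ways to roll (minimum + i). This is exactly the c_i formula above."""
    c = [0] * (len(a) + len(b) - 1)
    for i, wa in enumerate(a):
        for j, wb in enumerate(b):
            c[i + j] += wa * wb
    return c

def prob_at_least(weights, minimum, k):
    """P(total >= k), i.e. P(c >= k) above."""
    total = sum(weights)
    return sum(w for i, w in enumerate(weights) if minimum + i >= k) / total

# First attack 3-4 (weights [1, 1]), second 1-3 (weights [1, 1, 1]):
combined = convolve([1, 1], [1, 1, 1])  # -> [1, 2, 2, 1] for totals 4..7
print(prob_at_least(combined, 4, 6))    # P(damage >= 6) = 3/6 = 0.5
```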

Such an approach would let you take the probabilities of the more critical actions (creatures dying) into account by generalizing over the possible outcomes - die or don't die - and leaving the rest to the utility function, for example (i.e. a branch of two if the maximum possible damage >= hp; the player would probably choose a different action knowing whether the creature dies).

The standard approach to card games with incomplete information generally rests on utility functions with sophisticated game theory, in which probability theory is used heavily. Only a few plies are taken into account. One could perhaps complement the utility function with a generated one (difference-wise) at a lesser weight, to accommodate all the generalizations done in the model.
I'm making a Magic-like game too, and right now I'm stuck on the abilities and effects of the cards.

Did you think about that? If you did, please tell me the solution you've found.

The abilities and effects can modify the rules of the game and that's something new for me.
You could easily set various variables to reflect what sort of rules are currently in effect. If you for example wish to change the rules so that you draw two cards instead of one, then have a variable named CardDrawCount and change it. These rulesets belong to the state of the game; they can change. Therefore, you construct states with variables, and in the successor function you merely make your alterations. The successor function is a function which takes the current state and returns the resulting state. Since your search deals with states, the 'rule' changes will not affect non-descendant states.
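A rough sketch of rules-as-state-variables in Python; apart from CardDrawCount (from the post above), the names and actions are illustrative:

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class GameState:
    card_draw_count: int = 1   # the CardDrawCount rule variable
    hand: tuple = ()
    deck: tuple = ()

def successor(state, action):
    """Take the current state, return the resulting state; a rule
    change only affects this state's descendants in the search."""
    if action == "play_double_draw":
        return replace(state, card_draw_count=2)
    if action == "draw":
        n = state.card_draw_count
        return replace(state, hand=state.hand + state.deck[:n],
                       deck=state.deck[n:])
    return state
```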
I have a GetNextState method that gives the next state from the current state.

This function is called from the real game AND by the AI, so the AI searches from real states.

Each card basically has 4 functions - Engage, Attack, Defend, Die - that have basic behaviour and can be overridden for specific behaviour.

And each action can call a modifier the card has subscribed to. This is an action another card has on the current card.

Say I have a 'protector card' in play that creates a shield that absorbs 1 point of damage for every other card.
All cards then have a defense modifier that will decrement the damage value given to the Defend method.
When the 'protector card' dies, the modifier is removed from the other cards.
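A rough Python sketch of that modifier scheme (names are illustrative; only the shield behaviour comes from the description above):

```python
class Card:
    def __init__(self, hp):
        self.hp = hp
        self.defense_modifiers = []   # callables this card subscribed to

    def defend(self, damage):
        for mod in self.defense_modifiers:
            damage = mod(damage)      # e.g. the protector's shield
        self.hp -= max(0, damage)
        if self.hp <= 0:
            self.die()

    def die(self):
        pass                          # overridden for specific behaviour

class ProtectorCard(Card):
    """Creates a shield absorbing 1 point of damage for every other card."""
    def __init__(self, hp, allies):
        super().__init__(hp)
        self.allies = allies
        for ally in allies:
            ally.defense_modifiers.append(self.absorb)

    def absorb(self, damage):
        return damage - 1

    def die(self):
        # When the protector dies, remove its modifier from the others.
        for ally in self.allies:
            ally.defense_modifiers.remove(self.absorb)
        super().die()
```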

Hope it helps,

Emmanuel

