Advertisement

My first AI attempt

Started by May 21, 2009 12:16 PM
13 comments, last by AlphaCoder 15 years, 6 months ago
I would like to make an AI that plays a very simple game: Rules: 19x19 grid. players take turn placing pieces on the board subject to the following conditions: -first piece gets placed in center -subsequent pieces must be placed horizontally, vertically, or diagonally adjacent to another already-placed piece -5 in a row wins You could call this a Gomoku variant.. with the modification that pieces must be placed adjacent to others. My question: What would be the best way to implement an engine for this game in terms of playing strength? -neural network approach? -straightforward minimax approach either way what types of scoring should i use? simple +1 for player1 win, -1 for player2 win, 0 for no win yet? would there be any better ways for evaluating positions, like looking for lots of 4-in-a-row's or lots of 3-in-a-rows.. maybe giving them piece values like in chess: X points for a 3 in a row, X' points for a 3 in a row only blocked on one end, X'' points for a 3 in a row not blocked on any ends. etc. and these scores being the weights or "dna" for the genetics or neural networks of the engine. so many ideas are occuring to me, but the most important problem i'm having is that i have no way of knowing or even.. making an educated guess as to which method would lead to the best.. or have a good chance of leading to the best results. Any help at all would be appreciated very much
http://www.sharpnova.com
Definitely minimax. You'll probably need an evaluation function that rewards open threats, as you were saying. You may benefit from some sort of quiescence search, but this is probably not trivial to get right. You can use extensions to look further into lines with forced moves.

Go in small steps. Start with minimax and a random evaluation function if the game is not decided yet. Add alpha-beta pruning and some move order heuristic ("killer move" will probably work well). Add a hash table. Implement iterative deepening. Keep refining your evaluation function.

It's a never-ending project, but you can get a decent playing level quickly.
Advertisement
Minimax might get expensive pretty quickly. There's a way other than minimax to do this.

1) Traverse the grid. Check is to see if the square is open, if so you want to score it. Proceed to 2) else next node.

2) Check to see if any neighbors are occupied. If not, score = 0 or -999 or whatever. If so...

3) for each occupied neighbor, check to see if there is a forming line of yours or opponents in each of the 8 directions. If the line is 4, mark square as high priority. If line is 3, slightly less, etc. Note that this is cumulative for both sides. Blocking 2 sets of opponent's 3s is better than 1 set. Blocking 1 set of opponent's and adding a 4th onto one of your sets of 3's is better than doing either action alone.

4) After finishing the eval of all possible plays, you can rank the plays by the evaluated score and select (most rationally) the top scoring play. For difficulty tweaking, you can select from among the top n plays where n expands to create gradually dumber play. Doing this also increases game to game variety.

The most important factor above is creating the scoring function. How much more important is a row of 3 than 2, for example? You can use a scoring function that is exponential in nature. e.g.:

n: Score
1: 1
2: 2
3: 4
4: 8

The problem with the above is that blocking two rows of 3 and a row of 1 with a single play will score higher than blocking the obvious winner of a row of 4. Therefore, a logarithmic curve can be used:

n: Score
1: 1
2: 10
3: 100
4: 1000

This method guarantees that no amount of multiple play can supersede the layer above it. Since there are only 8 possible things you can add together (one in each direction), blocking or adding to 8 3's would only be 800 points... you would still elect to block or complete the 4. Therefore, the only thing that the cumulative points assist with is ranking combination moves. e.g. 2 3's and a 2 are better than 2 3's and a 1.

I believe that you will find this method pretty quick to program, quick in its execution, and somewhat realistic in its nature of play.

Dave Mark - President and Lead Designer of Intrinsic Algorithm LLC
Professional consultant on game AI, mathematical modeling, simulation modeling
Co-founder and 10 year advisor of the GDC AI Summit
Author of the book, Behavioral Mathematics for Game AI
Blogs I write:
IA News - What's happening at IA | IA on AI - AI news and notes | Post-Play'em - Observations on AI of games I play

"Reducing the world to mathematical equations!"

Would this be a good data structure for a board position?

-19x19 matrix where cells correspond to empty, player1's piece, or player2's piece
-legal moves list
-depth
-the move that lead to this position (for use in doing a quick search around that move to check for a 5-in-a-row, and also for knowing whos turn it is at this plydepth)


is this too large of a data structure for generating the tree or is it about right?

note: this is in response to the first reply

i'd also like to try InnocuousFox's solution as well. i actually think really strong gomoku engines use his method.. in conjunction with some sort of highly pruned search
http://www.sharpnova.com
You don't need to expand the tree in memory. You can do everything using a single board structure on which you do and undo moves. The list of legal moves can be a local variable in the recursive search function. The check for whether the last move achieved five-in-a-row can be done at the time the move is performed (have the function return a bool).

What Dave described is a depth-one minimax search. I am sure one can do much better than that. I may try to write a simple searcher myself.

Regarding scoring:

Each move has to be adjacent to an existing piece. Therefore, each move is either a 'block' or 'extension' or a combination of multiple instances of them. There is no quiet move.

Therefore, instead of evaluating the entire board for every leaf node in the search tree, you can store the static evaluation in the data structure itself and incrementally update it for every move made because the change in evaluation only affects all chains that contain the last move square. The only disadvantage is that the evaluation will be computed at every step in the search tree, but since this method uses only incremental updates, the computations at each step are less.

Instead of storing evaluation as an integer score, you can store the evaluation as a list of open chains (a chain object can have data about how many elements in that chain (1-4) and whether it is open on only one end or both ends). You can then use a chain-map structure that can basically give you a list of pointers to chains that position X belongs to.

The total integer score for any board position can just aggregate the open chain list. Make sure to use logarithmic weights as InnocuousFox mentioned because a 4-chain is more valuable than three 2-chains. Also note that a game is trivially won when a player has a double open 4-chain or more than one 4-chain with distinct open ends (because whatever the opponent does, you can win at the next move).

Storing all this information for one position will not take much space. You should just make sure that your putPieceAt(X) and removePieceFrom(X) (for undoing the move in the search tree) should accurately update the chains-list and chain-map for position X.

This method is something that is difficult to do in a game like chess where the effects of a move may be seen elsewhere (eg. discovered check, exposing a passed pawn, a bishop's diagonal gets free) so the entire board has to be evaluated from scratch. In your game the effects of a move are concentrated on that part of the board only. The question of "this will cause a series of forced moves later" will be handled by the tree search. Use a minimax-like search with pruning techniques and other optimizations like alvaro mentioned.
--------------------------------------Amaze your friends! Astound your family! Kennify your text!
Advertisement
I'm not sure I understand why I need a function which undoes moves.

If I handle the minimax recursively, wouldn't the function look somewhat like this:

function score (state x) returns move m{   m=x.legal_moves[0]   if (x.depth%2)   {      for each y = x.legal_moves[]      {         m = min(score(do_move(x,m)), score(do_move(x,y)))      }   }   else   {      for each y = x.legal_moves[]      {         m = max(score(do_move(x,m)), score(do_move(x,y)))      }   }   return m}


i would guess do_move(x,y) would take a state x, a move y, and return the new position.

this is just a rough guess of how it would look and i could be completely wrong
http://www.sharpnova.com
yeah... completely forgot that with my method you only need to update part of the board after each move. Just extend feelers in all 8 directions to determine what scores have been changed. That makes this much faster than re-processing the entire board - much of which is redundant.

Dave Mark - President and Lead Designer of Intrinsic Algorithm LLC
Professional consultant on game AI, mathematical modeling, simulation modeling
Co-founder and 10 year advisor of the GDC AI Summit
Author of the book, Behavioral Mathematics for Game AI
Blogs I write:
IA News - What's happening at IA | IA on AI - AI news and notes | Post-Play'em - Observations on AI of games I play

"Reducing the world to mathematical equations!"

Quote: Original post by AlphaCoder
I'm not sure I understand why I need a function which undoes moves.

You need it because the state of the board is expensive to copy, and the board doesn't change a whole lot with each move you do or undo. It's simply much faster to do and undo moves than it is to copy states.

Quote: Original post by AlphaCoder
I'm not sure I understand why I need a function which undoes moves.


What you are doing is making a copy of the entire board with the new position. This means you are remaking the entire 19x19 grid and all associated properties. You should minimize the number of dynamic memory allocations in the search.

Instead if you use a single copy of the board and perform moves and undo moves on it, you will save quite a lot. In fact, in a game like gomoku it is much more efficient because all you have to do is set/unset a piece at a given position and update the chain structure. Why would you want to copy the entire board when the remaining 360 positions are going to be the same and hardly 2-3 chains are getting affected at every move?

I learnt this the hard way myself sometime back (thanks alvaro [smile]). The speedup acheived was tremendous.

Also, I'd suggest returning the score rather than the best move in your search. You need the best move only in your root search everywhere else you just want to compare scores.

state = { grid[19][19], player_turn, chains[] } /* global or class member */function search(depth) returns int score:  if(depth==0):    return state.calculate_score() /* sums up weights of chains[] */  else:     foreach(legal_moves() as x):       state.do_move(x)    /* Updates grid[][], player_turn and chains[] */       x_score = search(depth-1)       state.undo_move(x)  /* Reverts all changes */       if(state.player_turn is min):         best_score = min(best_score, x_score)       else if(state.player_turn is max):         best_score = max(best_score, x_score)   return best_score


You will see how the integer score is much more valuable to return when you do alpha-beta pruning, etc. If you use transposition tables, you can search even faster and there is no need of comparing moves at the root either because you can just get the best move from the hashtable.
--------------------------------------Amaze your friends! Astound your family! Kennify your text!

This topic is closed to new replies.

Advertisement