
Algorithm for mini-max evaluation for Board Game

Started March 12, 2012 04:00 PM
5 comments, last by alvaro 12 years, 8 months ago
I'm trying to create an engine to play a gomoku variant. The game is played by taking turns placing pieces on a board; the first player to get five in a row wins. A piece must be placed horizontally, vertically, or diagonally adjacent to a previously placed piece. (This adjacency requirement is what makes it different from standard gomoku.)

I'm just looking for help with the pseudocode for my functions.

This is what I have so far:

The data structure for a position is a 19x19 array of cells with three possible values: 0 = empty, -1 = black piece, 1 = white piece.

The move parameter is passed by reference, so the min and max functions return a score as their main value but also, through that parameter, return the move that produces said score.

The move datatype has x and y members.


max(position[][], depth, byref move)
    if depth = _maxDepth_ return 0
    max_score = -infinity
    best_move = null
    for each m in getLegalMoves(position)
        if testForWin(position, 1, m)
            move = m
            return 1
        temp = min(getNewPosition(position, depth, m), depth+1, childMove)
        if temp > max_score
            max_score = temp
            best_move = m
    move = best_move
    return max_score


min(position[][], depth, byref move)
    if depth = _maxDepth_ return 0
    min_score = infinity
    best_move = null
    for each m in getLegalMoves(position)
        if testForWin(position, -1, m)
            move = m
            return -1
        temp = max(getNewPosition(position, depth, m), depth+1, childMove)
        if temp < min_score
            min_score = temp
            best_move = m
    move = best_move
    return min_score



getLegalMoves(byval position)
    legal_moves[] = empty
    for each cell on board
        if cell.value = 0 AND sum of absolute values of the 8 adjacent cells != 0
            append this cell to legal_moves[]
    return legal_moves[]
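A Python sketch of that adjacency test (the board size, the 0/-1/1 cell encoding, and the name `get_legal_moves` follow the description above; everything else is an assumption):

```python
SIZE = 19  # standard go board, per the 19x19 array above

def get_legal_moves(position):
    """Return all empty cells adjacent to at least one occupied cell."""
    moves = []
    for x in range(SIZE):
        for y in range(SIZE):
            if position[x][y] != 0:
                continue
            # A move is legal if any of the 8 neighbours holds a piece.
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    if dx == 0 and dy == 0:
                        continue
                    nx, ny = x + dx, y + dy
                    if 0 <= nx < SIZE and 0 <= ny < SIZE and position[nx][ny] != 0:
                        moves.append((x, y))
                        break
                else:
                    continue
                break
    return moves
```

Note that under this rule the very first move of the game has no legal moves, so the opening move presumably needs special handling.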



getNewPosition(byval position, depth, move)
    new_position = copy of position
    new_position[move.x][move.y] = (depth % 2 = 0) ? 1 : -1   // max (white) moves at even depths
    return new_position



testForWin(position, moveType, move)
    starting from (move.x, move.y), iterate to the left until you hit the board edge, encounter a value other than moveType, or have gone four to the left
    then iterate to the right the same way, until the total count, including the placed piece, reaches five
    same for vertical and both diagonals




I really have no idea if I'm approaching this right or if this will even work so any pointers as to problems with my logic or flow or byref vs. byval parameters etc. would be appreciated.
One can get away with only one function to compute minimax. This is particularly easy if you adopt the convention that positive scores are good for the side to move. Look up "negamax" on your favorite search engine.

You probably shouldn't return 0 if you get to your maximum depth and haven't found a winner: This is a place where you can place heuristic knowledge that indicates who is ahead (e.g., count the number of open threats of several types and assign them values).
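As a purely illustrative example of such a heuristic (the window scan and the weights below are made up for this sketch, not something from the thread), you could score every 5-cell line that only one side can still complete:

```python
SIZE = 19
# Hypothetical value of a window containing n stones of one colour and no
# stones of the other; the numbers are placeholders to be tuned.
WEIGHTS = [0, 1, 4, 16, 64, 100000]

def evaluate(position):
    """Positive is good for white (1), negative for black (-1)."""
    score = 0
    for x in range(SIZE):
        for y in range(SIZE):
            for dx, dy in ((1, 0), (0, 1), (1, 1), (1, -1)):
                ex, ey = x + 4 * dx, y + 4 * dy
                if not (0 <= ex < SIZE and 0 <= ey < SIZE):
                    continue
                cells = [position[x + i * dx][y + i * dy] for i in range(5)]
                whites = cells.count(1)
                blacks = cells.count(-1)
                if blacks == 0:
                    score += WEIGHTS[whites]   # open line for white
                if whites == 0:
                    score -= WEIGHTS[blacks]   # open line for black
    return score
```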

There are many possible improvements to the search, of which a few important ones are (look at chess programming resources --like this Wiki-- for details):
* Alpha-beta pruning (allows you to prune a huge fraction of the tree search, which can be proven to not affect the result of the search at the root).
* Move-ordering heuristics (important to make alpha-beta prune a lot).
* Iterative deepening.
* Transposition tables.

In this particular game, I expect you might gain a lot by searching promising moves more deeply. There are several techniques that try to achieve that effect (singular extensions, late move reductions...), but perhaps the most direct way of addressing it is by using realization probability search. There isn't a whole lot of information about this technique online, but the basic idea is to model a distribution probability for the moves available (e.g., by looking at a few simple features of each move, taking a linear combination of them and then assigning probabilities that are proportional to the exponential of that linear combination) and then stop the search not when a particular depth has been reached, but when the probability of reaching this node is below a certain threshold. You can use -log(probability) as your notion of depth, and then it becomes additive again.
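A minimal sketch of the probability model and the -log(probability) bookkeeping described above (the feature function, weights, and budget are hypothetical placeholders):

```python
import math

def move_probabilities(moves, features, weights):
    """Softmax over a linear combination of per-move features.

    `features(move)` returns a list of numbers; `weights` is the matching
    list of coefficients. Both are stand-ins for a model you would fit.
    """
    scores = [sum(w * f for w, f in zip(weights, features(m))) for m in moves]
    mx = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - mx) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# In the search, a child reached with probability p costs -log(p) of the
# budget, so "depth" stays additive along a path:
#   child_cost = -math.log(p)
#   if spent + child_cost > budget: treat the child as a leaf
```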
Since negamax returns the score for a position, how do we determine which move enforces that score?



function negamax(node, depth, α, β, color)
    if node is a terminal node or depth = 0
        return color * the heuristic value of node
    else
        foreach child of node
            val := -negamax(child, depth-1, -β, -α, -color)
            {the following if statement constitutes alpha-beta pruning}
            if val ≥ β
                return val
            if val > α
                α := val
        return α


Would I simply add a reference parameter to a move and insert "move = child" before "return val" and before "α := val"?
Ahh I see. The top level that's calling the negamax function will be iterating through the moves and will keep track of which move is best.
[quote]Would I simply add a reference parameter to a move and insert "move = child" before "return val" and before "α := val"?[/quote]
Yes, that would work. However, you only really need to extract a move at the root, not at any of the other calls. There are several other things that you may want to do at the root (iterative deepening, time control, generate output to the user...), so I prefer to have a separate function for the root, which does all those things. This results in much more clear code with little code duplication. Totally worth it.
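A sketch of what such a root function might look like (the injected `negamax`, move generator, and time control are illustrative stand-ins, not code from this thread):

```python
import math
import time

def search_root(position, legal_moves, play, negamax, max_depth, time_limit):
    """Iterative-deepening driver; extracts the best move only at the root.

    `legal_moves(position)` yields moves, `play(position, move)` returns the
    child position, and `negamax(position, depth)` scores it for the side to
    move; all three are supplied by the engine.
    """
    deadline = time.monotonic() + time_limit
    best_move = None
    for depth in range(1, max_depth + 1):        # iterative deepening
        best_score, best_at_depth = -math.inf, None
        for move in legal_moves(position):
            score = -negamax(play(position, move), depth - 1)
            if score > best_score:
                best_score, best_at_depth = score, move
        if best_at_depth is not None:
            best_move = best_at_depth            # keep deepest finished result
        if time.monotonic() > deadline:          # time control lives here
            break
    return best_move
```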

Yep. My plan is to employ iterative deepening from a root caller. The search will be a negamax with alpha-beta.

For now the evaluation is going to be win/loss/draw (because I don't even know if this game is as complex as gomoku). I may be able to apply some transformation and solve the game relatively easily.

I'm amazed at my initial min and max pseudocode! I came up with it complete as I typed that post and it matches the pseudocode on http://chessprogramming.wikispaces.com/Minimax exactly. I nailed that one lol.

Regardless, negamax will be cleaner.

I'm worried that evaluation of this type of position is much more expensive than chess-like evaluations because there are so many permutations of ways to search for threats.
I don't think evaluation will be particularly difficult. Even if you don't want to write an evaluation function now, using a random number as the evaluation function performs better than just using 0. This might seem a little counterintuitive, but it makes your program capture some dynamic notion of mobility (prefer a position with 10 non-losing moves to a position with a single non-losing move), and it adds variety.

EDIT: Good job on figuring out minimax on your own! I did the same thing when I was 17 and I remember being really proud of myself.

This topic is closed to new replies.
