Advertisement

Problem in C with NegaMax / Alpha-Beta (Connect 4)

Started by May 23, 2012 06:49 PM
58 comments, last by Jaap85 12 years, 5 months ago
If the program thinks that it is losing, it makes sense to pick the move with the least negative score. Now, why moving in the center seems to have a lower score is a different issue. The first thing to get right is the winning conditions. Evaluation comes later.

So, if you set up positions with easy wins or where the program should block you, does it do the right thing?
The problem is that it picks up MAX score every time (in the list of scores - see the debug). If there wasn't a minus sign with all scores, it would be good.
I don't understand why negamax should return a negative score ?




The initial call to negamax doesn't make sense to me.
It is said that's the opposite mellow.png


When called, the arguments ? and ? should be set to the lowest and highest values possible for any node and color should be set to 1.
(* Initial call *)
negamax(origin, depth, -inf, +inf, 1)


Or if i call it with joueur^1 (knowing that AI = joueur 1) it would call it with "0" and not "1".
I don't even talk about the minus sign to negamax.

I'm lost tongue.png
Advertisement
A score of 0 means that the game looks even. A positive score means that the player to move is ahead. A negative score means that the player to move is behind. Of course negamax can return negative values.

I don't know where you got that description of the arguments to negamax, but it is roughly correct (except I don't know what color=1 means).

Let me explain what your code is trying to do. The function that picks a move for the AI consists of a loop over all possible moves. For each move, you play it on the board and now you use negamax to compute a score for this move. When you call negamax, you are giving it a position where the opponent is about to move (hence `joueur^1') and the score returned will be positive if the opponent is happy with the situation and negative if he is unhappy. You flip the sign (as you do in the recursive call to negamax) to convert to the convention that positive means good for the player to move, not the opponent.
Ok but what is the way i have to return score from the eval function ?
Maybe that's why it doesn't work as expected.
You need to return a positive score if the player to move is ahead. You seem to be trying to reward having pieces around the center of the board, but I think you got the sign wrong: You are subtracting from the score when you find a `0' piece and adding when you find a `1' piece, but then you return score if `0' is to move and you flip the sign if `1' is to move, which seems to indicate that the meaning of score was from the point of view of player 0.

Try adding to the score when you find a `0' and subtracting when you find a `1'.
Like that ? Because still doesn't work.


if (nbAlign == 3)
{
if (joueur == 0)
{
score += SCORE_3J;
}
else
{
score -= SCORE_3J;
}
}
else
{
if (nbAlign == 2)
{
if (joueur == 0)
{
score += SCORE_2J;
}
else
{
score -= SCORE_2J;
}
}
else
{
if (nbAlign == 1)
{
if (joueur == 0)
{
score += SCORE_1J;
}
else
{
score -= SCORE_1J;
}
}
}
}
return score;
Advertisement
I was talking about the first part of the evaluation function, where you give points for having pieces near the center. The part with nbAlign makes no sense to me. Actually, you probably shouldn't pass anything to the evaluation function than the position and the player to move.

And please don't just say it "doesn't work". What doesn't work? In what position? What did you expect it to do? What did it do instead? It's very frustrating. Imagine you are a doctor and a patient keeps coming back and telling you "I am still sick", and will not describe any symptoms.
Sorry about that but if i didn't tell you anything else it's because it was exactly the same problem i've shown you before.

I want to be sure about something...
How should i return the score from eval function ?
Currently it's:


if (joueur == 0)
{
return score;
}
return -score;


Because if it's the right way to do, it looks like it does not block me anymore.
If my way of return a score is good, tell me and i'll show you what happened.



Just forget it. I force my eval function to return 0 by default because i'm debugging with a depth=1 and it looks like there is something strange.
I'll let you know as soon as possible.
Yes, that way of returning the score is correct, assuming you write the rest of the evaluation function with the convention that positive scores are good for player 0 and negative ones are good for player 1. But you shouldn't get distracted by this if your search is still not working.

Could you just put the whole code somewhere I can download it? I have a bit of time this morning and I think I can fix it quickly.
I found a problem but don't know yet where it could come from.
This debug is with depth=1 and eval function returns 0 by default.

You can see here something wrong happened (seems it's doing the same thing twice sometimes):

INSERT COIN / AI (player 1) / column 4
CALL NEGAMAX: depth 1 / column 4 / player 0
NEGAMAX INSERT COIN: column 4 (player 0)
CALL NEGAMAX: depth 0 / column 4 / player 1
MAX DEPTH REACHED
CALL NEGAMAX: depth 0 / column 4 / player 1
MAX DEPTH REACHED
NEGAMAX DELETE COIN: column 4 (player 0)
NEGAMAX INSERT COIN: column 3 (player 0)
CALL NEGAMAX: depth 0 / column 3 / player 1
MAX DEPTH REACHED
CALL NEGAMAX: depth 0 / column 3 / player 1
MAX DEPTH REACHED
NEGAMAX DELETE COIN: column 3 (player 0)
NEGAMAX INSERT COIN: column 5 (player 0)
CALL NEGAMAX: depth 0 / column 5 / player 1
MAX DEPTH REACHED
CALL NEGAMAX: depth 0 / column 5 / player 1
MAX DEPTH REACHED
NEGAMAX DELETE COIN: column 5 (player 0)
NEGAMAX INSERT COIN: column 2 (player 0)
CALL NEGAMAX: depth 0 / column 2 / player 1
MAX DEPTH REACHED
CALL NEGAMAX: depth 0 / column 2 / player 1
MAX DEPTH REACHED
NEGAMAX DELETE COIN: column 2 (player 0)
NEGAMAX INSERT COIN: column 6 (player 0)
CALL NEGAMAX: depth 0 / column 6 / player 1
MAX DEPTH REACHED
CALL NEGAMAX: depth 0 / column 6 / player 1
MAX DEPTH REACHED
NEGAMAX DELETE COIN: column 6 (player 0)
NEGAMAX INSERT COIN: column 1 (player 0)
CALL NEGAMAX: depth 0 / column 1 / player 1
MAX DEPTH REACHED
CALL NEGAMAX: depth 0 / column 1 / player 1
MAX DEPTH REACHED
NEGAMAX DELETE COIN: column 1 (player 0)
NEGAMAX INSERT COIN: column 7 (player 0)
CALL NEGAMAX: depth 0 / column 7 / player 1
MAX DEPTH REACHED
CALL NEGAMAX: depth 0 / column 7 / player 1
MAX DEPTH REACHED
NEGAMAX DELETE COIN: column 7 (player 0)
DELETE COIN / AI (player 1) / column 4


And here is the debug code for all columns:

INSERT COIN / AI (player 1) / column 4
CALL NEGAMAX: depth 1 / column 4 / player 0
NEGAMAX INSERT COIN: column 4 (player 0)
CALL NEGAMAX: depth 0 / column 4 / player 1
MAX DEPTH REACHED
CALL NEGAMAX: depth 0 / column 4 / player 1
MAX DEPTH REACHED
NEGAMAX DELETE COIN: column 4 (player 0)
NEGAMAX INSERT COIN: column 3 (player 0)
CALL NEGAMAX: depth 0 / column 3 / player 1
MAX DEPTH REACHED
CALL NEGAMAX: depth 0 / column 3 / player 1
MAX DEPTH REACHED
NEGAMAX DELETE COIN: column 3 (player 0)
NEGAMAX INSERT COIN: column 5 (player 0)
CALL NEGAMAX: depth 0 / column 5 / player 1
MAX DEPTH REACHED
CALL NEGAMAX: depth 0 / column 5 / player 1
MAX DEPTH REACHED
NEGAMAX DELETE COIN: column 5 (player 0)
NEGAMAX INSERT COIN: column 2 (player 0)
CALL NEGAMAX: depth 0 / column 2 / player 1
MAX DEPTH REACHED
CALL NEGAMAX: depth 0 / column 2 / player 1
MAX DEPTH REACHED
NEGAMAX DELETE COIN: column 2 (player 0)
NEGAMAX INSERT COIN: column 6 (player 0)
CALL NEGAMAX: depth 0 / column 6 / player 1
MAX DEPTH REACHED
CALL NEGAMAX: depth 0 / column 6 / player 1
MAX DEPTH REACHED
NEGAMAX DELETE COIN: column 6 (player 0)
NEGAMAX INSERT COIN: column 1 (player 0)
CALL NEGAMAX: depth 0 / column 1 / player 1
MAX DEPTH REACHED
CALL NEGAMAX: depth 0 / column 1 / player 1
MAX DEPTH REACHED
NEGAMAX DELETE COIN: column 1 (player 0)
NEGAMAX INSERT COIN: column 7 (player 0)
CALL NEGAMAX: depth 0 / column 7 / player 1
MAX DEPTH REACHED
CALL NEGAMAX: depth 0 / column 7 / player 1
MAX DEPTH REACHED
NEGAMAX DELETE COIN: column 7 (player 0)
DELETE COIN / AI (player 1) / column 4

INSERT COIN / AI (player 1) / column 3
CALL NEGAMAX: depth 1 / column 3 / player 0
NEGAMAX INSERT COIN: column 4 (player 0)
CALL NEGAMAX: depth 0 / column 4 / player 1
MAX DEPTH REACHED
CALL NEGAMAX: depth 0 / column 4 / player 1
MAX DEPTH REACHED
NEGAMAX DELETE COIN: column 4 (player 0)
DELETE COIN / AI (player 1) / column 3

INSERT COIN / AI (player 1) / column 5
CALL NEGAMAX: depth 1 / column 5 / player 0
NEGAMAX INSERT COIN: column 4 (player 0)
CALL NEGAMAX: depth 0 / column 4 / player 1
MAX DEPTH REACHED
CALL NEGAMAX: depth 0 / column 4 / player 1
MAX DEPTH REACHED
NEGAMAX DELETE COIN: column 4 (player 0)
DELETE COIN / AI (player 1) / column 5

INSERT COIN / AI (player 1) / column 2
CALL NEGAMAX: depth 1 / column 2 / player 0
NEGAMAX INSERT COIN: column 4 (player 0)
CALL NEGAMAX: depth 0 / column 4 / player 1
MAX DEPTH REACHED
CALL NEGAMAX: depth 0 / column 4 / player 1
MAX DEPTH REACHED
NEGAMAX DELETE COIN: column 4 (player 0)
DELETE COIN / AI (player 1) / column 2

INSERT COIN / AI (player 1) / column 6
CALL NEGAMAX: depth 1 / column 6 / player 0
NEGAMAX INSERT COIN: column 4 (player 0)
CALL NEGAMAX: depth 0 / column 4 / player 1
MAX DEPTH REACHED
CALL NEGAMAX: depth 0 / column 4 / player 1
MAX DEPTH REACHED
NEGAMAX DELETE COIN: column 4 (player 0)
DELETE COIN / AI (player 1) / column 6

INSERT COIN / AI (player 1) / column 1
CALL NEGAMAX: depth 1 / column 1 / player 0
NEGAMAX INSERT COIN: column 4 (player 0)
CALL NEGAMAX: depth 0 / column 4 / player 1
MAX DEPTH REACHED
CALL NEGAMAX: depth 0 / column 4 / player 1
MAX DEPTH REACHED
NEGAMAX DELETE COIN: column 4 (player 0)
DELETE COIN / AI (player 1) / column 1

INSERT COIN / AI (player 1) / column 7
CALL NEGAMAX: depth 1 / column 7 / player 0
NEGAMAX INSERT COIN: column 4 (player 0)
CALL NEGAMAX: depth 0 / column 4 / player 1
MAX DEPTH REACHED
CALL NEGAMAX: depth 0 / column 4 / player 1
MAX DEPTH REACHED
NEGAMAX DELETE COIN: column 4 (player 0)
DELETE COIN / AI (player 1) / column 7



Debug in my C code:

int IA_Jouer(int grille[LIGNES][COLONNES], int emplacements[COLONNES], int joueur)
{
int x = COLONNES/2, y, i, k = 1, scoreCoup, scoreMeilleur = -INFINI, col = -1;

FILE *fichier = NULL;
fichier = fopen("test.txt", "w");
for(i = 0; i < COLONNES; i++)
{
// EFFICACITE: Balayer les colonnes du milieu vers l'extérieur (alternativement gauche/droite)
x += i*k;
k = -k;
//fprintf(fichier, "Test column %d\n", x+1);
if (emplacements[x] > -1)
{
y = InsererJeton(grille, emplacements, x, joueur);
fprintf(fichier, "INSERT COIN / AI (player %d) / column %d\n", joueur, x+1);
scoreCoup = -IA_NegaMaxAlphaBeta(MAXPROFONDEUR, -INFINI, -scoreMeilleur, grille, emplacements, x, y, joueur^1, fichier);
//fprintf(fichier, "FREE column %d / SCORE: %d\n", x+1, scoreCoup);
if (scoreCoup > scoreMeilleur)
{
//fprintf(fichier, "BEST column %d (score %d)\n", x+1, scoreCoup);
scoreMeilleur = scoreCoup;
col = x;
}
AnnulerJeton(grille, emplacements, y, x);
fprintf(fichier, "DELETE COIN / AI (player %d) / column %d\n\n\n\n", joueur, x+1);
}
}
fclose(fichier);
return col;
}

int IA_NegaMaxAlphaBeta(int profondeur, int alpha, int beta, int grille[LIGNES][COLONNES], int emplacements[COLONNES], int colonne, int emplacement, int joueur, FILE *fp)
{
int x = COLONNES/2, y, i, k = 1, nbAlign;
fprintf(fp, "CALL NEGAMAX: depth %d / column %d / player %d\n", profondeur, colonne+1, joueur);

nbAlign = check(grille, emplacements, colonne, emplacement, joueur);
if (nbAlign == 4)
{
fprintf(fp, "WIN: depth %d (player %d)\n", profondeur, joueur);
return -1000;
}

if (GrillePleine(emplacements))
{
fprintf(fp, "FULL GRID\n");
return SCORE_NUL;
}
if (profondeur == 0)
{
fprintf(fp, "MAX DEPTH REACHED\n");
return IA_Evaluation(grille, joueur, nbAlign);
}
for(i = 0; i < COLONNES && alpha < beta; i++)
{
// EFFICACITE: Balayer les colonnes du milieu vers l'extérieur (alternativement gauche/droite)
x += i*k;
k = -k;
if (emplacements[x] > -1)
{
y = InsererJeton(grille, emplacements, x, joueur);
fprintf(fp, "NEGAMAX INSERT COIN: column %d (player %d)\n", x+1, joueur);
alpha = max(alpha, -IA_NegaMaxAlphaBeta(profondeur-1, -beta, -alpha, grille, emplacements, x, y, joueur^1, fp));
AnnulerJeton(grille, emplacements, y, x);
fprintf(fp, "NEGAMAX DELETE COIN: column %d (player %d)\n", x+1, joueur);
}
}

return alpha;
}



Yeah i can upload it but there are parts in ASM, do you mind?

This topic is closed to new replies.

Advertisement