Advertisement

Transposition table problems

Started by April 03, 2008 07:29 AM
12 comments, last by kluster 16 years, 7 months ago
Hi everyone! I having some problem with my transposition table for my gomoku/connect five game. The AI don't take every possibility to win a game. When I remove the transposition code, the problem goes away. Sorry about the code not being commented, but I guess you understand the most of it from the names of the functions and variables.
[SOURCE]

int Brain::Min(Move& move, int depth, int a, int b) {

	long zobrist = calcHashValue(move);

	if(hashTable->entryExists(zobrist)) {
		if(hashTable->getDepth(zobrist) >= depth) {
			if(hashTable->getFlag(zobrist) == hashTable->HASH_EXACT) {
				return hashTable->getValue(zobrist);
			} else if(hashTable->getFlag(zobrist) ==hashTable->HASH_ALPHA) {
				return a;
			} else if (hashTable->getFlag(zobrist) == hashTable->HASH_BETA) {
				return b;
			}
		}
	}
	
	if (depth == 0 || move.winningMove) {
		hashTable->record(zobrist, depth, hashTable->HASH_EXACT, move.value);
		return move.value; 
	}

	vector<Move*> *childs = getChilds(move, RING); // get all childs in a sorted vector
	if (childs->empty()) {
		return 0;
	}

	int flag = hashTable->HASH_BETA;
	for (int i = 0; i < childs->size(); i++) {
	
		long zobrist = calcHashValue(*childs->at(i));
		int temp = Max(*childs->at(i), returnMove,  depth - 1, a, b);

		if (temp <= a) {
			hashTable->record(zobrist, depth, hashTable->HASH_ALPHA, temp);
			return a;
		}

		if (temp < b) {
			b = temp;
			flag = hashTable->HASH_EXACT;
		}
	}
	
	hashTable->record(zobrist, depth, flag, b);
	return b;
}

int Brain::Max(Move& move, int depth, int a, int b) {

	long zobrist = calcHashValue(move);

	if(hashTable->entryExists(zobrist)) {
		if(hashTable->getDepth(zobrist) >= depth) {
			if(hashTable->getFlag(zobrist) == hashTable->HASH_EXACT) {
				return hashTable->getValue(zobrist);
			} else if(hashTable->getFlag(zobrist) == hashTable->HASH_ALPHA) {
				return a;
			} else if (hashTable->getFlag(zobrist) == hashTable->HASH_BETA) {
				return b;
			}
		}
	}

	if (depth == 0 || move.winningMove) {
		hashTable->record(zobrist, depth, hashTable->HASH_EXACT, move.value);
		return move.value;
	}

	vector<Move*> *childs = getChilds(move, CROSS); // get all childs in a sorted vector

	if (childs->empty()) {
		return 0;
	}

	int flag = hashTable->HASH_ALPHA;
	for (int i = 0; i < childs->size(); i++) {

		zobrist = calcHashValue(*childs->at(i));
		int temp = Min(*childs->at(i), returnMove, depth - 1, a, b);

		if (temp >= b) {
			hashTable->record(zobrist, depth, hashTable->HASH_BETA, temp);
			return b;
		}

		if (temp > a) {
			a = temp;
			flag = hashTable->HASH_EXACT;
		}
	}

	hashTable->record(zobrist, depth, flag, a);
	return a;
}

[/SOURCE]
Why would you `return a' if the flag is HASH_ALPHA? All you can do is set b to the minimum of b and the value in the hash table, since you proved already that the score was at most that high.

Also, why don't you use NegaMax so you only have one function? It saves a lot of stupid errors.

I used code that looks like this (using NegaMax):
if(hash_found(&flag, &score, &hash_depth) {  if(depth <= hash_depth) {    switch(flag) {      case LOWER_BOUND:        alpha = max(alpha, score); break;      case UPPER_BOUND:        beta = min(beta, score); break;      case EXACT_SCORE:        return score;    }    if(alpha>=beta)      return alpha;  }}
Advertisement
Quote: Original post by alvaro
Why would you `return a' if the flag is HASH_ALPHA? All you can do is set b to the minimum of b and the value in the hash table, since you proved already that the score was at most that high.

Also, why don't you use NegaMax so you only have one function? It saves a lot of stupid errors.

I used code that looks like this (using NegaMax):
*** Source Snippet Removed ***



Thanks for the advice!
I implemented the negamax algorithm and also the transposition table.
The AI plays good but a little different when I compare it without the table.
What could be wrong?

[SOURCE]int negamax(Move& move, char player, int depth, int alpha, int beta) {	long zobrist = calcHashValue(move);	if(hashTable->entryExists(zobrist)) {		if(hashTable->getDepth(zobrist) >= depth) {			int flag = hashTable->getFlag(zobrist);			int value = hashTable->getValue(zobrist);			switch(flag) {				case HASH_ALPHA:					alpha = max(alpha, value); break;				case HASH_BETA:					beta = min(beta, value); break;				case HASH_EXACT:					return value;			}			if(alpha>=beta) 				return alpha;		}	}	if (depth == 0 || move.value == MAXSCORE) {		hashTable->record(zobrist, depth, HASH_EXACT, -move.value);			return -move.value;	} 	vector<Move*> *childs = getChilds(move, player);	if (childs->empty()) {		return 0;	}	int flag = HASH_BETA;	for (unsigned int i = 0; i < childs->size(); i++) {		Move* child = childs->at(i);		long zobrist = calcHashValue(*child);		int value = -negamax(*child, getOp(player), depth-1, -beta, -alpha);		if(value >= beta) {			hashTable->record(zobrist, depth, HASH_BETA, beta);			return beta;		}		if(value > alpha) {			alpha = value;			flag = HASH_EXACT;		}	}		hashTable->record(zobrist, depth, flag, alpha);	return alpha;}[/SOURCE]

I think you are never storing HASH_ALPHA as the flag. I suggest you use more self-evident labels, like LOWER_BOUND and UPPER_BOUND instead of HASH_ALPHA and HASH_BETA.

When a beta cut happens it means that the score that we are storing is a lower bound, and this can be used to improve alpha next time.

When the score returned is no more than alpha, it means the score we are storing is an upper bound, and this can be used to improve (decrease) beta next time.

Give it another try.
Quote: Original post by alvaro
I think you are never storing HASH_ALPHA as the flag. I suggest you use more self-evident labels, like LOWER_BOUND and UPPER_BOUND instead of HASH_ALPHA and HASH_BETA.

When a beta cut happens it means that the score that we are storing is a lower bound, and this can be used to improve alpha next time.

When the score returned is no more than alpha, it means the score we are storing is an upper bound, and this can be used to improve (decrease) beta next time.

Give it another try.


Okey, I saw that the flag was initilized to UPPER_BOUND instead of LOWER_BOUND before
the loop. But that made no difference :(
I also noticed that it makes no difference if I store move.value or -move.value as value
on a leaf-node :( strange.. Should it be -move.value?

[SOURCE]int negamax(Move& move, char player, int depth, int alpha, int beta) {	long zobrist = calcHashValue(move);	if(hashTable->entryExists(zobrist)) {		if(hashTable->getDepth(zobrist) >= depth) {			int flag = hashTable->getFlag(zobrist);			int value = hashTable->getValue(zobrist);			switch(flag) {				case LOWER_BOUND:					alpha = max(alpha, value); break;				case UPPER_BOUND:					beta = min(beta, value); break;				case EXACT:					return value;			}			if(alpha>=beta) {				return alpha;			}		}	}	if (depth == 0 || move.value == MAXSCORE) {		hashTable->record(zobrist, depth, EXACT, -move.value);			return -move.value;	} 	vector<Move*> *childs = getChilds(move, player);	if (childs->empty()) {		return 0;	}	int flag = LOWER_BOUND;		for (unsigned int i = 0; i < childs->size(); i++) {		Move* child = childs->at(i);		long zobrist = calcHashValue(*child);		int value = -negamax(*child, getOp(player), depth-1, -beta, -alpha);		if(value >= beta) {			hashTable->record(zobrist, depth, UPPER_BOUND, beta);			return beta;		}		if(value > alpha) {			alpha = value;			flag = EXACT;		}	}	hashTable->record(zobrist, depth, flag, alpha);	return alpha;}[/SOURCE]
I am not sure what move.value is (is it always from the point of view of the player to move, or from the other player, or from white...?).

You are storing the wrong flags. In a beta cut you should store LOWER_BOUND and if the score is less than alpha you should store UPPER_BOUND.
Advertisement
Quote: Original post by alvaro
I am not sure what move.value is (is it always from the point of view of the player to move, or from the other player, or from white...?).

You are storing the wrong flags. In a beta cut you should store LOWER_BOUND and if the score is less than alpha you should store UPPER_BOUND.


hmm.. are you sure? now the AI is a slower :(
This is very frustrating, I can't get it to work!

move.value stores the value of the move from the players side of view.

[SOURCE]int Brain::negamax(Move& move, char player, int depth, int alpha, int beta) {	long zobrist = calcHashValue(move);	if(hashTable->entryExists(zobrist)) {		if(hashTable->getDepth(zobrist) >= depth) {			int flag = hashTable->getFlag(zobrist);			int value = hashTable->getValue(zobrist);			switch(flag) {				case LOWER_BOUND:					alpha = max(alpha, value); break;				case UPPER_BOUND:					beta = min(beta, value); break;				case EXACT:					return value;			}			if(alpha>=beta) {				return alpha;			}		}	}	if (depth == 0 || move.value == MAXSCORE) {		hashTable->record(zobrist, depth, EXACT, -move.value);			return -move.value;	} 	vector<Move*> *childs = getChilds(move, player);	if (childs->empty()) {		return 0;	}	int flag = UPPER_BOUND;	for (unsigned int i = 0; i < childs->size(); i++) {		Move* child = childs->at(i);		zobrist = calcHashValue(*child);		int value = -negamax(*child, getOp(player), depth-1, -beta, -alpha);		if(value > alpha)  {			if(value >= beta) {				hashTable->record(zobrist, depth, LOWER_BOUND, beta);				return beta;			}					alpha = value;			flag = EXACT;		}	}	hashTable->record(zobrist, depth, flag, alpha);	return alpha;}[/SOURCE]
The hash table part seems OK to me now, although it's possible that I am missing something.

One thing that confuses me in your code is that you don't seem to have separate notions of Move and Position. I am guessing for the most part you mean "Position" where you said "Move". Certainly, the static evaluation is a property of the position, not the move.

Also, the description of your sign convention was ambiguous. When I write this type of program I usually write a function that returns the score from white's point of view, and then I flip the sign if necessary.

Are you computing the Zobrist key from scratch every time? You probably shouldn't. You just need to take the existing zobrist key and xor it with one number.

Oh, I just found a bug. Get rid of the line `zobrist = calcHashValue(*child);' inside the loop; you don't want to do that.

Quote: Original post by alvaro
The hash table part seems OK to me now, although it's possible that I am missing something.

One thing that confuses me in your code is that you don't seem to have separate notions of Move and Position. I am guessing for the most part you mean "Position" where you said "Move". Certainly, the static evaluation is a property of the position, not the move.

Also, the description of your sign convention was ambiguous. When I write this type of program I usually write a function that returns the score from white's point of view, and then I flip the sign if necessary.

Are you computing the Zobrist key from scratch every time? You probably shouldn't. You just need to take the existing zobrist key and xor it with one number.

Oh, I just found a bug. Get rid of the line `zobrist = calcHashValue(*child);' inside the loop; you don't want to do that.


Yes you are right about the position/move thing, but in gomoku they are almost the same.
Now Im calculation the Zobrist keys when Im generation the child-moves by XOR'ing the last
value with the new position. Unfortunately, the TT-AI still playes different comparing to the AI without a TT. But the TT-part seems ok? that makes it even more confusing. Im pretty sure that the TT implementation works and all that stuff.
Thanks alot for help so far! It's very helpful getting corrected, and I learn ALOT!

[SOURCE]int negamax(Position& pos, char player, int depth,  int alpha, int beta) {	if(hashTable->entryExists(pos.zobrist)) {		if(hashTable->getDepth(pos.zobrist) >= depth) {			int flag = hashTable->getFlag(pos.zobrist);			int value = hashTable->getValue(pos.zobrist);			switch(flag) {				case LOWER_BOUND:					alpha = max(alpha, value); break;				case UPPER_BOUND:					beta = min(beta, value); break;				case EXACT:					return value;			}			if(alpha>=beta) {				return alpha;			}		}	}	if (depth == 0 || pos.value == MAXSCORE) {		hashTable->record(pos.zobrist, depth, EXACT, -pos.value);			return -pos.value;	} 	vector<Position*> *childs = getChilds(pos, player);	if (childs->empty()) {		return 0;	}	int flag = UPPER_BOUND;	for (int i = 0; i < childs->size(); i++) {		Position* child = childs->at(i);		int value = -negamax(*child,  getOp(player), depth-1, -beta, -alpha);		if(value > alpha)  {			if(value >= beta) {				hashTable->record(pos.zobrist, depth, LOWER_BOUND, beta);				return beta;			}			alpha = value;			flag = EXACT;		} 	}	hashTable->record(pos.zobrist, depth, flag, alpha);	return alpha;}[/SOURCE]
Quote: Original post by kluster
Yes you are right about the position/move thing, but in gomoku they are almost the same.

I don't think so. "Place a black stone on E6" is a move, not a position. A position is the whole configuration on the board. Looking at your code, it looks like you are making many copies of the board when you generate children. Instead, you could just generate the moves, and then you can just do and undo each move on the existing board, without any copies. This saves a lot of time and is not too hard to implement.

Quote: Now Im calculation the Zobrist keys when Im generation the child-moves by XOR'ing the last
value with the new position.

I guess you mean "with the new move".
Quote: Unfortunately, the TT-AI still playes different comparing to the AI without a TT. But the TT-part seems ok? that makes it even more confusing. Im pretty sure that the TT implementation works and all that stuff.

It looks reasonable to me. You may still have a bug somewhere that I haven't seen (or is in a part of the code that you haven't posted).

Even if you get rid of all the bugs, you shouldn't expect the program with the transposition table to play the exact same moves as the program without it. Transposition tables introduce a well-known nightmare called "search instability", which makes search results less reproducible and your program harder to understand in general. Well, this is true at least for chess and checkers; I haven't thought carefully about how this would affect a gomoku program.

You can (should) store the best move from this position in the transposition table, so even if later on you can't use the score you stored, you can still search that move first, which will improve your move ordering and therefore the efficiency of your alpha-beta search.

In general, you should pay attention to move ordering. It is not obvious from looking at your code if you have made any efforts in that direction.

This topic is closed to new replies.

Advertisement