Advertisement

Iterative deepening

Started by July 19, 2008 08:37 AM
7 comments, last by crypto_rsa 16 years, 4 months ago
Hi, I have a question about IDA. My algorithm is able to search the game tree to e. g. depth 5 (complete search). During searching to the depth 6 the time runs out. However, some "top-level" branches were investigated completely to depth 6. Is it possible to somehow use the information from them or should I always stick with the result I got only from the completely searched depth (in this case 5). What is the common practice? Deeper results contain more information about the current position but they are not complete.
One would keep the moves at the root sorted, at least so the best move found so far is always the first one. Then you simply return the first move, regardless of whether the depth-6 search was completed or not.

Does that answer your question?

Advertisement
Quote: Original post by alvaro
One would keep the moves at the root sorted, at least so the best move found so far is always the first one. Then you simply return the first move, regardless of whether the depth-6 search was completed or not.

Does that answer your question?


Well, it sorts the moves at the beginning of each iteration based on the results stored in a transposition table (which are likely to come from the last completed iteration), but the order based on deeper results may be different. Yes, the best move from depth-5 search is the first one but that does not guarantee it will be the best one in depth-6 search as well.
Quote: Original post by crypto_rsa
Quote: Original post by alvaro
One would keep the moves at the root sorted, at least so the best move found so far is always the first one. Then you simply return the first move, regardless of whether the depth-6 search was completed or not.

Does that answer your question?


Well, it sorts the moves at the beginning of each iteration based on the results stored in a transposition table (which are likely to come from the last completed iteration), but the order based on deeper results may be different.

Typically you wouldn't use the transposition table for this. I'll post some pseudo-code below.

Quote: Yes, the best move from depth-5 search is the first one but that does not guarantee it will be the best one in depth-6 search as well.

Of course not. The point is that it is safe to interrupt depth 6 at any point, and you will get a move that is no worse than the move that was best at depth 5. If you don't get the same move as you got in depth 5, it's because the new move has a better score in depth 6.

Here's some pseudo-code (of the C++ variety):
Move think(Position P) {  vector<Move> moves = generate_moves(P);  for(int depth=1; enough_time_left(); ++depth) {    double best_score = -infinity;    for(unsigned i=0; i<moves.size(); ++i) {      double score = alphabeta(apply_move(P, moves), -infinity, -best_score, depth-1);      if(search_interrupted_because_of_timeout())        break;      if(score>best_score) {        score = best_score;        // Move the new best move to the beginning of the list        std::rotate(moves.begin(), moves.begin()+i, moves.begin()+i+1);      }    }  }    return moves[0];}


Quote: Original post by alvaro
Quote: Original post by crypto_rsa
Quote: Original post by alvaro
One would keep the moves at the root sorted, at least so the best move found so far is always the first one. Then you simply return the first move, regardless of whether the depth-6 search was completed or not.

Does that answer your question?


Well, it sorts the moves at the beginning of each iteration based on the results stored in a transposition table (which are likely to come from the last completed iteration), but the order based on deeper results may be different.

Typically you wouldn't use the transposition table for this. I'll post some pseudo-code below.


I use the TT for storing evaluations (and thus sorting the move list) in the whole game tree, not just at the root node.

Quote:
Quote: Yes, the best move from depth-5 search is the first one but that does not guarantee it will be the best one in depth-6 search as well.

Of course not. The point is that it is safe to interrupt depth 6 at any point, and you will get a move that is no worse than the move that was best at depth 5. If you don't get the same move as you got in depth 5, it's because the new move has a better score in depth 6.

Here's some pseudo-code (of the C++ variety):
Move think(Position P) {  vector<Move> moves = generate_moves(P);  for(int depth=1; enough_time_left(); ++depth) {    double best_score = -infinity;    for(unsigned i=0; i<moves.size(); ++i) {      double score = alphabeta(apply_move(P, moves), -infinity, -best_score, depth-1);      if(search_interrupted_because_of_timeout())        break;      if(score>best_score) {        score = best_score;        // Move the new best move to the beginning of the list        std::rotate(moves.begin(), moves.begin()+i, moves.begin()+i+1);      }    }  }    return moves[0];}


This is what I currently do in my algorithm, only written in a slightly different way. But basically I am storing the best score (and the appropriate move) until time runs out. Thanks for confirming my thoughts :-)

Quote: Original post by crypto_rsa
I use the TT for storing evaluations (and thus sorting the move list) in the whole game tree, not just at the root node.

This makes sense in all the other nodes, but at the root the order you get by using my pseudo-code is better. When a move has been the best candidate for a while and at a particular depth another move seems better, you will often see the old best move become the best again. The closer you have it to the beginning of the list, the faster your search will go.

Anyway, the only thing that is really important is that the first move you explore in depth 6 is the move that was the best move in depth 5, and you seem to be doing that.

Advertisement
Quote:
The point is that it is safe to interrupt depth 6 at any point, and you will get a move that is no worse than the move that was best at depth 5. If you don't get the same move as you got in depth 5, it's because the new move has a better score in depth 6.

Here's some pseudo-code (of the C++ variety):
Move think(Position P) {  vector<Move> moves = generate_moves(P);  for(int depth=1; enough_time_left(); ++depth) {    double best_score = -infinity;    for(unsigned i=0; i<moves.size(); ++i) {      double score = alphabeta(apply_move(P, moves), -infinity, -best_score, depth-1);      if(search_interrupted_because_of_timeout())        break;      if(score>best_score) {        score = best_score;        // Move the new best move to the beginning of the list        std::rotate(moves.begin(), moves.begin()+i, moves.begin()+i+1);      }    }  }    return moves[0];}


One more thing. You seem to be using the best result from the previous iteration as the lower bound in the next iteration. Is this safe? I think the evaluation function would have to be monotone or something. You can get e. g. evaluation 150 for depth 3 (for move A), then you sort the moves, next you begin searching to depth 4 but the correct (best) evaluation for this depth is 100 (move B). When using 150 as the lower bound, you won't discover move B as being the best one. Or am I wrong?

Quote: Original post by crypto_rsa
One more thing. You seem to be using the best result from the previous iteration as the lower bound in the next iteration. Is this safe? I think the evaluation function would have to be monotone or something. You can get e. g. evaluation 150 for depth 3 (for move A), then you sort the moves, next you begin searching to depth 4 but the correct (best) evaluation for this depth is 100 (move B). When using 150 as the lower bound, you won't discover move B as being the best one. Or am I wrong?

You didn't read my pseudo-code correctly. best_score is initialized to -infinity every time we start searching a new depth.
Quote: Original post by alvaro
Quote: Original post by crypto_rsa
One more thing. You seem to be using the best result from the previous iteration as the lower bound in the next iteration. Is this safe? I think the evaluation function would have to be monotone or something. You can get e. g. evaluation 150 for depth 3 (for move A), then you sort the moves, next you begin searching to depth 4 but the correct (best) evaluation for this depth is 100 (move B). When using 150 as the lower bound, you won't discover move B as being the best one. Or am I wrong?

You didn't read my pseudo-code correctly. best_score is initialized to -infinity every time we start searching a new depth.


Oh, sure. I am sorry.

This topic is closed to new replies.

Advertisement