
Hash Table values keep colliding

Started by May 09, 2005 10:29 AM
8 comments, last by alvaro 19 years, 6 months ago
Hi, I'm building a checkers program, but I have a problem. I implemented a transposition table where SIZE = 600000 entries. I use Zobrist keys for hashing, which I calculate like this:

public ulong GetHashKey()
{
    uint bit;
    ulong hashKey = 0;
    for (int i = 0; i < 32; i++)
    {
        bit = (uint)1 << i;
        if ((p1 & bit) > 0) hashKey = hashKey ^ random[0, i];
        if ((k1 & bit) > 0) hashKey = hashKey ^ random[1, i];
        if ((p2 & bit) > 0) hashKey = hashKey ^ random[2, i];
        if ((k2 & bit) > 0) hashKey = hashKey ^ random[3, i];
    }

    return hashKey;
}

The index into the hash table array is calculated as index = board.GetHashKey() % SIZE. I generate the random numbers like this:

Random r = new Random((int)DateTime.Now.Ticks);
for (int i = 0; i < 4; i++)
{
    for (int j = 0; j < 32; j++)
    {
        ulong randNumber =
            (ulong)r.Next() ^
            ((ulong)r.Next() << 15) ^
            ((ulong)r.Next() << 30) ^
            ((ulong)r.Next() << 45) ^
            ((ulong)r.Next() << 60);
        Console.WriteLine(randNumber);
        random[i, j] = randNumber;
    }
}
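The xor-of-shifted-Next() trick is needed because Random.Next() only returns 31 random bits. An alternative sketch (not the poster's code) that fills all 64 bits of each table entry directly, using NextBytes and BitConverter, with a fixed seed so runs are reproducible:

    // Sketch: fill each Zobrist entry from 8 random bytes instead of
    // xoring shifted Next() calls. The seed 12345 is an arbitrary constant.
    Random r = new Random(12345);
    byte[] buffer = new byte[8];
    for (int i = 0; i < 4; i++)
    {
        for (int j = 0; j < 32; j++)
        {
            r.NextBytes(buffer);                       // 8 random bytes
            random[i, j] = BitConverter.ToUInt64(buffer, 0);
        }
    }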

Now somehow different boards keep coming up with the same hash keys. When my hash table has about 10000 entries filled, some 500 different boards (5%) have collided with an existing hash key... Question: is this normal? Should I just ignore collisions and keep recursing down my Negamax function? Does anybody have any idea how to avoid these collisions? Every article I read on this subject says it's highly improbable that hash keys will collide, but my app collides in 5% of all cases. BTW: I don't make any distinction on the side to move. Thanks, Edo
Start by making a distinction on the side to move. It is possible that all of that 5% comes from there, since it's not hard to imagine situations where you can reach the same position with either side to move. I actually keep separate tables for positions with white to move and positions with black to move. You could also just xor some number into the key at the end when it's black's turn.
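The xor variant can be sketched like this (sideToMoveKey is a hypothetical extra random ulong, generated once at startup alongside the piece table; it is not in the poster's code):

    // Sketch: fold side-to-move into the Zobrist key by xoring one extra
    // random number whenever it is player 2's turn to move.
    public ulong GetHashKey(bool p2ToMove)
    {
        ulong hashKey = 0;
        for (int i = 0; i < 32; i++)
        {
            uint bit = (uint)1 << i;
            if ((p1 & bit) > 0) hashKey ^= random[0, i];
            if ((k1 & bit) > 0) hashKey ^= random[1, i];
            if ((p2 & bit) > 0) hashKey ^= random[2, i];
            if ((k2 & bit) > 0) hashKey ^= random[3, i];
        }
        if (p2ToMove) hashKey ^= sideToMoveKey;   // assumed extra random number
        return hashKey;
    }

A nice side effect of the xor approach is that the key can be updated incrementally when the side to move flips, without recomputing the whole thing.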

Is using the STL hash_multimap an option? You have to do an extra check, but it's fast, and collisions are OK.

No, I only use managed C# code. The .NET framework does have a Hashtable class, but it is a bit too slow.

Maybe I wasn't that clear what my problem exactly is... The problem is not that my GetHashKey() function does not generate unique keys.

The problem is that (key % SIZE) may map different keys to the same place in my array. SIZE = 591911 at the moment (don't ask me why). Should I make the array size a prime?

Anyway, thanks for your help, I'll start by implementing alvaro's suggestion..

Edo
Lots of the literature I've seen recommends using a table size that is prime, and not near a power of two. I've never verified that myself. However, it makes sense because prime numbers work well in hashing functions, and in effect "HashKey % Size" is just extending your hashing function.

good luck
If your hash function is good, the size of the table doesn't make a difference. Are you sure you are not having duplicated keys? I would also recommend you don't use any timers to initialize your pseudo-random number generator; use a fixed seed so you can easily reproduce problems and debug them.
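A quick way to test alvaro's question, i.e. whether the "collisions" are genuine duplicate 64-bit keys or just different keys landing in the same slot after % SIZE. This is a sketch: Board, GetHashKey and the boards collection stand in for the poster's own types.

    // Sketch: count real key duplicates (two different boards, same 64-bit
    // Zobrist key) separately from mere index collisions (same slot).
    var seen = new Dictionary<ulong, Board>();
    var usedSlots = new HashSet<ulong>();
    int keyDuplicates = 0, indexCollisions = 0;

    foreach (Board b in boards)           // positions visited by the search
    {
        ulong key = b.GetHashKey();
        if (seen.ContainsKey(key) && !seen[key].Equals(b))
            keyDuplicates++;              // true 64-bit collision: should be very rare
        else
            seen[key] = b;

        if (!usedSlots.Add(key % SIZE))
            indexCollisions++;            // expected and normal once the table fills
    }

If keyDuplicates is anywhere near 5%, the key generation itself is broken; if only indexCollisions is high, that is just the table filling up and needs a replacement scheme, not a better hash.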

Hash collisions are normal unless you have a one-to-one hash. 5% doesn't sound bad given the figures.

Don't bother using a prime table size with a Zobrist hash. Instead, use a power-of-2 table size for a little extra efficiency by replacing the modulo with a bitwise AND to crop your key down to the table size. (Modulo N is equivalent to AND (N-1) when N is a power of 2.)
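The masking trick looks like this (the table size 1 << 20 is just an example value):

    // With a power-of-two table size, the modulo reduction becomes a bitwise AND.
    const int SIZE = 1 << 20;                      // 1,048,576 entries
    ulong key = board.GetHashKey();
    int index = (int)(key & (ulong)(SIZE - 1));    // same result as key % SIZE, cheaper

This only works because SIZE - 1 is all ones in binary (0xFFFFF for 1 << 20), so the AND keeps exactly the low 20 bits of the key.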

As someone else mentioned, include ALL relevant information in the hashing. Side-to-move is very relevant. Move number is not.

I do think that your hash table should be much bigger for this particular game if you are going for engine strength.

- Rockoon (Joseph Koss)

Alright,

Made some changes, but my hash keys keep colliding: about 10% with depth = 11, 30-50% with depth = 12, sometimes 200% with depth = 13. That is, there are twice as many collisions as there are cut-offs.

Does anyone have any reliable stats on hash table size vs collisions vs cut-offs?

Also, I now have two options when hash keys collide:
1. Replace the position in the hash table with the new board
2. Keep the old board

I think I should keep the board in that position that has the LOWEST depth, assuming that you will get to that position quicker, which eventually will lead to the most cut-offs. Any ideas?

BTW: I'm using a fail-soft system for my Alpha-Beta window.

Thanks,

Edo

PS: sorry for reviving this thread, but I'm a bit stuck here...
Your hash table contains fewer elements than the total number of possible board positions, therefore there will be collisions. Can't you just store a linked list in each cell of the hash table?

I don't understand why the fact that you have collisions is a) unexpected or b) a problem. :/
Quote: Original post by Squirm
Your hash table contains fewer elements than the total number of possible board positions, therefore there will be collisions. Can't you just store a linked list in each cell of the hash table?

Bad idea. You don't want to do any dynamic memory allocation in an alpha-beta searcher. And actually, you are quickly going to search more boards than you can fit in RAM, so don't even try to keep everything.

Quote: I don't understand why the fact that you have collisions is a) unexpected or b) a problem. :/

It's not unexpected. It is a problem, but of the type that you just deal with. You need a replacement scheme to resolve collisions. Start by keeping the position with the highest analyzed depth, since that is the one that's going to save you the most time if found again. You can also try to store the second position somewhere else in the table, or have two entries per slot and keep the two most important boards... Many, many options, several of them covered in ICGA articles.
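A minimal depth-preferred replacement scheme can be sketched like this (TTEntry and its fields are assumed names, not from the thread; storing the full key lets you detect index collisions on lookup):

    // Sketch: keep whichever entry was searched deeper, since re-searching
    // a deep subtree is what costs the most time.
    struct TTEntry
    {
        public ulong Key;    // full Zobrist key, to verify hits on lookup
        public int Depth;    // depth the stored score was searched to
        public int Score;
    }

    void Store(TTEntry[] table, ulong key, int depth, int score)
    {
        int index = (int)(key % (ulong)table.Length);
        if (depth >= table[index].Depth)   // new entry is at least as deep: replace
        {
            table[index].Key = key;
            table[index].Depth = depth;
            table[index].Score = score;
        }
    }

On lookup, compare the stored Key against the full 64-bit key of the current board; a mismatch means two positions share a slot and the stored score must not be used.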

Also, making your transposition tables very large helps.

