Here''s a fairly comprehensive page I found:
http://swww.ee.uwa.edu.au/~plsd210/ds/hash_tables.html
Apparently, you''re trying to resolve collisions by linear probing--skipping to the next hash value in the table. This requires knowledge beforehand of exactly how long your table will be. If you don''t have this knowledge, you will run into problems.
A common hashing technique for strings is to make a polynomial value out of the string. Ideally, if you had the string "CAT", you could make a unique value from it in this way:
key = ''C''*26^2 + ''A''*26 + ''T''
A common implementation is here:
int hash_string (const char *stg){ int value = 0; while (*stg) { value += *stg + (value << 5); ++stg; } return value;}
This is different from the ideal because the polynomial is 32 intstead of 26, and there''s no accounting for the way ASCII gives non-zero values to letters and such, but it''s a decent and fairly fast method. It will overflow fairly quickly, but you''ll always have a word hash to the same value. Just take this key and modulo it with the size of your table.
Finally, I''ll address your question of searching for duplicates. If you''re using linear probing to resolve collision detection, you have two related problems:
1) If you want to avoid duplicates, you will have to have a way to check that, if a key exists at your hashed position, was it put there as a probe of a prior key, or is it the same value?
2) If you''re searching for data once you have your hash table put together, you''ll have to verify that the value you hashed to is actually the value you''re looking for; it might be from a linear probe of another value. Furthermore, how do you know when to stop looking for your value?
The solution will have to do with checking the value and hash key of the data already in the table. I''m really not sure what the best way to do it is; I personally always use chaining (array of linked lists) when I do hash tables. Hope that gives you some places to look.