Isn't it possible to get the least significant bit by doing b & (-b)? Though it's true this just clears the bitboard of all bits but the least significant one. Alas, it doesn't give me the shift needed to bring it to the first square, and JavaScript has no ready-made function for that, I'm afraid (so my function to get the coordinates of the least significant bit is indeed very slow so far, with up to 18 conditional branches. I'm looking for a way to get rid of it, maybe by indexing my squares with one-bit bitboards, but I'm afraid referencing a square without an incremental index might then become very slow...). Anyway, I'll switch the program to C++ as soon as I can afford to buy it...
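[Editor's note: a minimal C++ sketch of that idea, with illustrative names only. b & (-b) does isolate the lowest set bit, but you still have to count how far to shift it down to square 0, which is exactly the slow part being discussed.]

#include <cstdint>

// Isolate the least significant set bit with b & (-b), then count how far it
// must be shifted down to reach square 0. The counting loop is the slow part
// that a binary search, a lookup table, or a bsf instruction replaces.
int lsb_shift(uint64_t b)              // result is meaningless for b == 0
{
    uint64_t isolated = b & (0 - b);   // only the lowest set bit survives
    int shift = 0;
    while (isolated > 1) {
        isolated >>= 1;
        ++shift;
    }
    return shift;
}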
BTW Russell, I saw your functions to generate knight and rook moves in the general section and found the rank algorithm quite clever. I'll have to compare them speed-wise with my pre-calculated moves indexed by rank/file/diagonal occupancy. Since I could only generate those tables for one rank because of memory problems, and thus need to rotate the board for rook files and bishop diagonals, your functions might be much faster. Not going to be easy to adapt to my 9x9 board though.
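[Editor's note: for reference, a pre-calculated rank-attack table of the kind being compared here might look like the sketch below, written in C/C++ for a standard 8-wide rank; a 9-wide shogi rank would need 512 occupancy entries per file instead of 256. All names are illustrative and not taken from either program.]

#include <cstdint>

// rank_attacks[file][occ]: squares a rook on `file` attacks along its rank,
// given the 8-bit occupancy pattern `occ` of that rank.
static uint8_t rank_attacks[8][256];

void init_rank_attacks()
{
    for (int file = 0; file < 8; ++file) {
        for (int occ = 0; occ < 256; ++occ) {
            uint8_t attacks = 0;
            for (int f = file + 1; f < 8; ++f) {    // slide "east" until blocked
                attacks |= (uint8_t)(1 << f);
                if (occ & (1 << f)) break;
            }
            for (int f = file - 1; f >= 0; --f) {   // slide "west" until blocked
                attacks |= (uint8_t)(1 << f);
                if (occ & (1 << f)) break;
            }
            rank_attacks[file][occ] = attacks;
        }
    }
}

Files and diagonals can then reuse the same kind of table after rotating the board so that they line up as ranks, which is the rotation overhead mentioned above.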
David Antonini
There are free C/C++ compilers you can get if money is a problem.
I think you should be able to find the least significant bit in fewer than 18 branches. You could do something like:
// This is a most significant bit function, but you get the idea
// Untested code, in C/C++
int msb(unsigned __int64 bits)
{
    int result = 0;
    if (bits & 0xFFFFFFFF00000000) { bits >>= 32; result += 32; }
    if (bits & 0xFFFF0000)         { bits >>= 16; result += 16; }
    if (bits & 0xFF00)             { bits >>= 8;  result += 8;  }
    if (bits & 0xF0)               { bits >>= 4;  result += 4;  }
    if (bits & 0xC)                { bits >>= 2;  result += 2;  }
    if (bits & 0x2)                { bits >>= 1;  result += 1;  }
    return result;
}
I'm not sure if this is 100% correct, but you get the idea. You basically do a binary search on the 64-bit value: each if-statement tells you which half the most significant bit is in. You could also speed this up with a 256-entry lookup table, saving the last few if-statements and instead looking up the most significant bit of the remaining 8-bit value in the table. That would be very fast if the 8-bit lookup table is in the cache (which no doubt it would be).
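[Editor's note: a hedged sketch of that table variant, building on the routine above; the initialisation loop and the names are additions, not something Russell posted.]

#include <cstdint>

static int msb8[256];                 // msb8[x] = index of the highest set bit in x (x > 0)

void init_msb8()
{
    for (int x = 1; x < 256; ++x) {
        int m = 0;
        while ((x >> m) > 1) ++m;
        msb8[x] = m;
    }
}

int msb(uint64_t bits)                // result is meaningless for bits == 0
{
    int result = 0;
    if (bits & 0xFFFFFFFF00000000ULL) { bits >>= 32; result += 32; }
    if (bits & 0xFFFF0000)            { bits >>= 16; result += 16; }
    if (bits & 0xFF00)                { bits >>= 8;  result += 8;  }
    return result + msb8[bits];       // table replaces the last few if-statements
}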
Russell
I had just changed my GetLSBShift function to use a dichotomic (binary-search) method.
Alas, JavaScript is very surprising sometimes... The function seems even slower on average now. I guess the interpreter loses more time understanding the branching structure than actually testing the conditional branches...
Going to look for some C++ compilers tonight, while waiting to be able to get MSDev.
David Antonini.
If you're not using a compiled language, forget about speed.
If you're using JS, Java, or any other 'non-compiled' language, be careful with your memory. A bitboard representation (with one bitboard per piece type) will actually take up more memory than an 8x8 array of bytes. If you've got to move this data around a lot, you could be wasting many cycles copying data.
Uhm, just as a reference: I'm getting an average of about 350,000 nodes/sec on a PII-500. Sometimes it's up over 400,000, sometimes as low as 320,000. What is a normal benchmark? I know this is hard to tell because of different static evaluations, but there must be a ballpark figure somewhere.
Will
------------------
http://www.nentari.com
As far as using a non-compiled language, you're right. Don't worry about speed, because it's going to be slow. As far as using a non-compiled language with bitboards... ouch, that's going to be _SLOW_. I think the only way bitboards are able to keep up on 32-bit machines is because of the bsf/bsr x86 instructions.
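[Editor's note: for reference, this is roughly how those instructions are reached from C/C++ through compiler intrinsics; a sketch only, assuming GCC/Clang (__builtin_ctzll) or a 64-bit MSVC target (_BitScanForward64).]

#include <cstdint>

#if defined(_MSC_VER)
#include <intrin.h>
int lsb_index(uint64_t b)             // b must be non-zero
{
    unsigned long idx;
    _BitScanForward64(&idx, b);       // compiles down to bsf
    return (int)idx;
}
#else
int lsb_index(uint64_t b)             // b must be non-zero
{
    return __builtin_ctzll(b);        // count trailing zeros, compiles down to bsf/tzcnt
}
#endif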
Regarding nodes per second, don't bother. Nodes per second don't mean anything when you're comparing different engines. Fritz gets about 350,000 nps on my PIII 733 and it's world-class, Grandmaster level. I know of other programs that get ridiculously higher nps, but they're only rated around weak master level, many hundreds of Elo points below Fritz and GMs.
A brilliant program written in the slowest interpreted language could perform better than an absolute speed demon written in 100% assembler or highly optimized C. My experience has been that chess programmers put far too heavy an emphasis on how fast their program is. It's all about algorithms and searching smart, not raw speed.
Russell
That's an interesting point. So a good static evaluation can really go a long way. And it's true, another 100,000 nodes/sec is really nothing, not even 1 extra ply. I just optimized the hell out of something a few days back, and it shows up as a 2-second saving on a search that normally takes 27 seconds. That doesn't translate into anything useful.
Maybe the reason I, and many others, focus so much on execution speed is that our knowledge of chess is somewhat limited.
I can play okay chess (win 50% of my games on InstantChess.com), but not good chess. At best my understanding of the game is superficial, so I focus on what I can: brute force.
Cheers,
Will
------------------
http://www.nentari.com
Sigloxx,
I'm not too sure about JavaScript, but in Java I don't think there is a 'sign bit' (not in the C/C++ sense, anyway). I think Java may store information about the sign separately from the accessible part of the variable. This prevents wrap-arounds and other binary side-effects most programmers don't like dealing with.

I ran into this while writing a Genetic Programming suite in Java. I had a set of opcodes defined, where each opcode was a bit pattern (optimized to increase the effectiveness of random-bit mutation). I kept running into problems with this, and found that Java was managing the 'sign' of my numbers for me! In the end I was forced to use 2-byte values just so I could manage all 8 bits of a single byte. I suspect that JavaScript may have the same 'feature'.

It's been a while since I've used Java, and it is quite possible that I was going about this entirely wrong.

Trying to manage your 'native' variables at a low level may be causing IE to do an enormous amount of work. If I were you, I would forget about using any 'native' binary-type operations. They don't really exist in Java/JS anyway.

Let me know how it turns out.

Will
------------------
http://www.nentari.com
quote: Original post by RPGeezus
So a good static evaluation can really go a long way.
A static evaluation function isn't the real key either. Forward pruning is what makes top programs really strong. Don't get me wrong, a static eval is very important, but that isn't what turns a good engine into a great one. You can double your nps and, if you're lucky, you'll get an extra ply sometimes. So that's no big deal. What gets you significant improvements is forward pruning, things like null-move. With null-move you DO get many extra plies at the cost of a tiny, tiny, tiny reduction in the computer's razor-sharp tactical ability. So now your engine is using a Mach 3 razor instead of a straight-edge razor, but it gets several extra plies.
Null-move is just one of the forward pruning methods (as opposed to backward pruning, which is what alpha-beta does: it prunes AFTER it has searched forward, whereas forward pruning prunes before it searches, so you get far bigger savings). There are other forward pruning methods, but unfortunately they are used by top commercial programs like Fritz and Chess Tiger, and of course they aren't going to tell us what methods they use.
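[Editor's note: to make the idea concrete, here is a minimal sketch of null-move pruning inside a plain negamax alpha-beta search. The Position/Move interface is hypothetical, and R = 2 is just a commonly quoted reduction, not something taken from Fritz or Chess Tiger.]

#include <vector>

// Hypothetical interface, declared only so the sketch reads as a whole.
struct Move {};
struct Position {
    bool inCheck() const;
    bool hasNonPawnMaterial() const;          // guard against zugzwang positions
    void makeNullMove();   void unmakeNullMove();
    void makeMove(const Move&);  void unmakeMove(const Move&);
    std::vector<Move> generateLegalMoves() const;
};
int evaluate(const Position&);

const int R = 2;                              // null-move depth reduction

int search(Position& pos, int depth, int alpha, int beta)
{
    if (depth <= 0)
        return evaluate(pos);                 // or drop into quiescence search here

    // Forward pruning: give the opponent a free move. If even then the
    // reduced-depth search fails high, prune without trying any real move.
    if (depth > R && !pos.inCheck() && pos.hasNonPawnMaterial()) {
        pos.makeNullMove();
        int score = -search(pos, depth - 1 - R, -beta, -beta + 1);
        pos.unmakeNullMove();
        if (score >= beta)
            return beta;
    }

    // Ordinary alpha-beta (backward pruning) over the real moves.
    for (const Move& m : pos.generateLegalMoves()) {
        pos.makeMove(m);
        int score = -search(pos, depth - 1, -beta, -alpha);
        pos.unmakeMove(m);
        if (score >= beta)
            return beta;
        if (score > alpha)
            alpha = score;
    }
    return alpha;                             // mate/stalemate handling omitted
}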
Russell
I should have done a classic chess engine instead of a shogi one, I guess, as I have played chess a lot and have a FIDE rating close to 2200. I actually wanted to try it with a game I don't know well, in the hope that maybe my program would end up stronger than me.
About the sign: JavaScript uses classic 32-bit signed integers as its only integer type, with the sign carried by the highest bit. I noticed nothing unusual as far as binary operations are concerned.
The speed is indeed awfully slow in JavaScript... No wonder the few chess JS scripts I found were playing so badly. But by using bitboards I do go roughly 10 times faster than with a classic 8x8 or 12x12 array representation like I was using at first. The key is that IE uses a LOT of memory to allocate a new object. My bitboards are objects of 3 integers (I use the first 27 bits of each to represent 9x3 squares). So I define the bitboards for my position once at initialisation, and during the search I never "copy" them: I use the main position data as global variables, and I play and unplay moves on it. But still, I only reach ridiculous speeds... The main slowing factor is in fact the generation of the search tree (using an array); assigning a value to an element of a JavaScript array is slow... Once quiescence search is added, the program should play at my level though, I hope (which is very weak, hehe).

I came to consult this forum tonight mainly because I noticed something which gave me an interesting idea, one which is probably already used in many programs. I noticed that when I have considered a move in a given position (let's say a move at ply 1) and generated all the possible answers at ply 2, then if I consider a new move at ply 1, the possible answers to that new move are most of the time very similar to the answers to the first move. So, though it may seem a bit tricky to implement, wouldn't it be quite fast for a given ply to just generate the answers to a null move, and then work out which of them became invalid and which new ones became possible once the null move is replaced by the move you want to study?

In the initial position, for instance, the answers to any white move are exactly the same, so this would speed up move generation a lot... I suppose this is already used, or has already been tried? It seems even more valid for shogi, where each side often plays moves within its own camp which have absolutely no impact on the opponent's possible moves. But of course, determining which moves became valid or invalid because a move was played, compared to the ones that were valid after a null move, isn't easy to code efficiently.

Anyway, this wouldn't help me much while my program is in JS, as it spends more time putting the 30 possible moves into my tree array than actually determining what they are...

David Antonini.

P.S.: I downloaded the freeware Borland C++ compiler. I will probably migrate my program to C++ once it's working, provided I have the time to.
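[Editor's note: once the port to C++ happens, the three-integer board described above might look something like this; a sketch with illustrative names only.]

// 9x9 board (squares 0..80) split into three 27-bit chunks,
// mirroring the three-integer JavaScript bitboard described above.
struct ShogiBitboard {
    unsigned int part[3];                     // part[i] holds squares 27*i .. 27*i+26

    bool test(int sq) const { return (part[sq / 27] >> (sq % 27)) & 1u; }
    void set(int sq)        { part[sq / 27] |=  1u << (sq % 27); }
    void clear(int sq)      { part[sq / 27] &= ~(1u << (sq % 27)); }
};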
By the way, in JavaScript it isn't only assigning a value to an array element that is slow; getting the value contained in an array element is also very slow...
For instance, I rewrote the getLSBshift function using dichotomy in a better way, and it got slightly faster on average than the previous version. I then tried using an array of 512 elements (for my 9x9-square shogi board) to get the shift on a rank immediately. Replacing my last 4 branches by ONE reference to this array made 500,000 calls to the function take exactly 2 seconds more (7 to 9 seconds for the branch algorithm, 9 to 11 for the version taking the value from a pre-made array). And of course the 7 to 9 seconds are spent mostly on getting the values contained in my bitboard, which is an object, not on the branches... If I replace the bitboard elements by 3 plain integers, the 500,000 calls don't even take one second (but not using an object or an array at all for the board representation would make the program a hell to code, and for the search tree representation I don't even have the choice).
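[Editor's note: for what it's worth, the 512-entry table described above could be built like the following sketch, shown in C++ since the table itself is language-independent; names are illustrative.]

// rank_lsb_shift[occ]: shift of the least significant set bit in a 9-bit
// rank occupancy pattern; -1 marks the empty rank.
static int rank_lsb_shift[512];

void init_rank_lsb_shift()
{
    rank_lsb_shift[0] = -1;
    for (int occ = 1; occ < 512; ++occ) {
        int shift = 0;
        while (!(occ & (1 << shift))) ++shift;
        rank_lsb_shift[occ] = shift;
    }
}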
David Antonini.