If you were coding a chess AI with a neural network, how would you format the input and the output? And how many hidden layers would you use?
Each tile can hold one of 13 values (empty, 6 white pieces, 6 black pieces), so one tile can be represented with 4 bits and the entire board with 64*4 = 256 bits. So should I make the input layer 256 nodes?
But I am not sure what would give the best performance; I could use one-hot encoding, where each tile is represented with 13 bits and exactly one of them is 1.
As for the output, I am planning to represent moves like e2->e4, meaning the piece on e2 moved to e4. That can be expressed in 12 bits: 3 each for source row, source column, destination row, and destination column.
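A minimal sketch of both encodings might look like this (the function names and the piece numbering are my own assumptions, not from any standard):

```python
# Sketch of the two encodings discussed above (piece numbering is arbitrary).
# One-hot board: 64 squares x 13 piece types = 832 bits; dropping the
# "empty" plane and encoding it implicitly as all-zeros gives 768 bits.

EMPTY = 0
# 1..6 = white P,N,B,R,Q,K ; 7..12 = black p,n,b,r,q,k
N_PIECE_TYPES = 13

def one_hot_board(board):
    """board: list of 64 ints in 0..12 -> flat 0/1 vector of length 832."""
    vec = [0] * (64 * N_PIECE_TYPES)
    for square, piece in enumerate(board):
        vec[square * N_PIECE_TYPES + piece] = 1
    return vec

def encode_move(src_row, src_col, dst_row, dst_col):
    """Pack a move like e2->e4 into 12 bits: 3 bits per coordinate."""
    return (src_row << 9) | (src_col << 6) | (dst_row << 3) | dst_col

def decode_move(bits):
    return (bits >> 9) & 7, (bits >> 6) & 7, (bits >> 3) & 7, bits & 7

# e2 -> e4: file e is column 4, rows 1 and 3 (all 0-indexed)
m = encode_move(1, 4, 3, 4)
assert decode_move(m) == (1, 4, 3, 4)
```

Note that the one-hot version trades space for making each input independently meaningful to the network, which is usually why it is preferred over the packed 4-bit form.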
Where can I get more insight about this?
For example, this presentation has some creative representations, but it only handles specific pieces in specific numbers, so it is not generic enough to represent a full game.
http://www.slideshare.net/piuprabhu/chess-end-games-using-neural-networks-presentation
I know NNs are not well suited to this kind of game, but my aim is to create something that at least makes valid moves, and I read about a man who was beaten at chess by his own NN, so I'd like to see where this goes.
And do you have any advice on which library to use with C#?
Chess AI with Neural Networks
I don't know what to tell you, other than it won't work. Artificial neural networks are not some kind of magic thing where you plug in inputs and outputs and it learns anything. In particular, I don't think they can learn the rules of chess, let alone make decent moves.
There is more hope in having an ANN that takes a position as input and outputs an estimate of who is ahead (something like the probability of winning), and then use that in a search. It's still inferior to using an evaluation function that combines features that are known to be important for the game (material, pawn structure, center control, mobility, king safety, passed pawns...).
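For comparison, the simplest possible hand-written evaluation of the kind described above is a plain material count. A sketch (the board representation is my own assumption; real engines add pawn structure, mobility, king safety, and the rest):

```python
# Trivial hand-written evaluation: material count only.
# Conventional piece values; the king is excluded from the count.
PIECE_VALUES = {'P': 1, 'N': 3, 'B': 3, 'R': 5, 'Q': 9, 'K': 0}

def material_eval(board):
    """board: iterable of piece letters, uppercase = white, lowercase = black,
    '.' = empty square. Returns a score from white's point of view."""
    score = 0
    for piece in board:
        if piece == '.':
            continue
        value = PIECE_VALUES[piece.upper()]
        score += value if piece.isupper() else -value
    return score

print(material_eval("K.R.k"))  # white up a rook -> 5
```

An ANN evaluation would replace this hand-crafted function with learned weights, but it has to rediscover all of these features from data.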
Similarly, I would recommend against it. It's a problem that I suppose NNs could be applied to, but it's by no means easy. I actually think that if you wanted to apply NNs to chess, a better approach would be to use them to predict player preference, e.g. which pieces they attack/defend with most, which common mistakes they're most likely to make, etc. Probably inapplicable to grandmasters, but it may be useful against lower-level players. That could be used to weight a traditional chess search.
NNs tend to give an "in general, given similar inputs" answer. Board positions that are similar to each other (perhaps differing by one piece) may call for dramatically different moves, so there is no room for "in general" answers. As a result, your NN would only be perfectly viable once it has a move defined for EVERY SINGLE POSSIBLE BOARD COMBINATION IN THE WORLD. At that point, all you are doing is memorizing stuff -- not using the generalized pattern-matching of a NN.
Conclusion: "Don't."
Dave Mark - President and Lead Designer of Intrinsic Algorithm LLC
Professional consultant on game AI, mathematical modeling, simulation modeling
Co-founder and 10 year advisor of the GDC AI Summit
Author of the book, Behavioral Mathematics for Game AI
So are you saying that the man I read about was not telling the truth?
Since my experience with AI is probably nowhere near yours, I thought a pure NN AI could play chess with the right training. It feels so intuitive. There has to be a setup which makes this possible, but I am not sure whether that is feasible.
Most likely, or maybe you misinterpreted what he said. What exactly is it that you read?
It was this link:
http://www.chesscirc...RY-FOR-OCTAVIUS
I also found another link claiming that his NN can learn valid moves, but I haven't been able to compile it yet because it was written in VB6 back in 2001.
http://www.planet-source-code.com/vb/scripts/ShowCode.asp?txtCodeId=22202&lngWId=1
I mean, were people trying this in the early 2000s but gave up?
That looks a lot like the approach I described as having "more hope" in my first post. The post gives you the answers to many of your original questions:
- 768 = 12*64 inputs, i.e. one input per (piece type, square) pair: a bit saying "this piece type is on this square".
- Two hidden layers, with 1024 and 512 nodes each (I think that's probably overkill; too many parameters to tune means you need massive amounts of data to tune them without overfitting).
- 1 output node means you are not generating moves here, just evaluating positions.
It's also similar to the approach that they took for checkers in "Blondie 24".
I don't think there is any evidence that those ANN-based evaluation functions do any better than evaluation functions written by hand.
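The architecture described in that post (768 inputs, hidden layers of 1024 and 512, one evaluation output) is just a few matrix multiplies with a nonlinearity. A sketch with random, untrained weights, assuming numpy (nothing here comes from the Octavius project itself):

```python
import numpy as np

# Sketch of the 768-1024-512-1 evaluation net described above, with
# random (untrained) weights. tanh keeps the output in (-1, 1), which
# can be read as a crude "who is ahead" score.
rng = np.random.default_rng(0)
sizes = [768, 1024, 512, 1]
weights = [rng.normal(0, 0.01, (m, n)) for m, n in zip(sizes, sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]

def evaluate(position):
    """position: length-768 0/1 vector (12 piece planes x 64 squares).
    Returns a scalar in (-1, 1): the net's guess at who is ahead."""
    x = np.asarray(position, dtype=float)
    for w, b in zip(weights, biases):
        x = np.tanh(x @ w + b)
    return float(x[0])

score = evaluate(np.zeros(768))  # placeholder input; a real position sets 32 bits
assert -1.0 < score < 1.0
```

Counting parameters makes the overfitting worry concrete: 768*1024 + 1024*512 + 512*1 is over 1.3 million weights to fit from game data.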
If we skip past how you actually get there and focus on the end result -- we're talking about making a black box where the input is the state of a chess board, and the output is a chess move that is valid for that board-state.
We know that the number of chessboard states is bounded (somewhere between 10^43 and 10^47), and we can build algorithms that choose moves as well as a grandmaster (see IBM v Kasparov).
So in theory, if you iterated through all of those valid board-states and used these algorithms to pick the best move for each one, you could build a huge database that is a "solution to chess" -- this is ideally what your NN is trying to emulate. Each entry in this theoretical solution table has a 256-bit key and a 12-bit value. The total table size is at least 297539770599541952833533287048 petabytes.
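The arithmetic behind that figure, using the lower bound of 10^43 states and 2^50 bytes per petabyte:

```python
# Reproduce the table-size estimate: 10^43 entries, each a 256-bit key
# plus a 12-bit value, converted to petabytes (1 PB = 2**50 bytes).
entries = 10**43
bits_per_entry = 256 + 12
total_bytes = entries * bits_per_entry // 8
petabytes = total_bytes // 2**50
print(petabytes)  # -> 297539770599541952833533287048
```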
Instead of a table, you could also encode the exact same data as a hugely complex nonlinear function (which is essentially all that a NN is). If your function encoded the full table without errors, it would probably require similar storage to the above table. If it only approximated the table instead of representing it exactly, then your NN will be smaller, but it won't always output a valid or 'good' move.
Whenever the approximation produces an invalid move, you've got to adjust it (via NN training techniques) and try again, repeating until it does produce a valid move (even if it's not a very good one). If the internals of the NN aren't big enough to map the entire state-space of chess (billions of petabytes?), then this retraining will necessarily make it 'forget' some information and produce worse answers for other board-states.
As Dave said, when asking "given this 256-bit input, what 12-bit output move should I make?", getting fuzzy/general/"near enough" outputs isn't very useful, and that is all that a realistically sized NN can produce for this question.
NNs might be useful for smaller questions, such as which kind of algorithm would be ideal for choosing the next move, or classifying whether you're in the early/mid/late game, or determining whether the 'white' player on a specific board-state is in an aggressive stance or not, or looking at a defensive pattern and choosing which type of unit is best to attack it, or selecting a suitable opening gambit from a playbook, etc...
Generally, neural networks are very bad candidates for finite-state problems. There are specific algorithms for these problems that are a lot better.
NNs are well suited for:
- pattern recognition (generalization, classification)
- problems that can't be defined precisely, or that can evolve in undefined ways
- automatic learning (with a utility-based learning algorithm, for instance) of relatively simple tasks (competitive learning is interesting)
They can't be used in programs that must be proven correct. They are black boxes, and their behaviour is globally unpredictable.
Encoding inputs and outputs, setting network parameters (learning rates, how many layers, how many nodes per layer, etc.), and so on is awfully difficult, and it often requires a very good understanding of how NNs work (and patience...).
So I'd suggest avoiding them whenever possible.