Help with Neural Networks and Fitness Functions
I need some help on theory and design for a Neural Network to learn to play a game.
I have the 2048 game in mind, although I would appreciate general guidelines that work for a general category of similar problems.
The 2048 game can be solved easily with the Expectimax algorithm. Nonetheless, I want to use Neural Networks for my own learning experience. So, please bear with me even if Neural Networks are not the easiest way to solve this game.
Enough with the introductions; let me get to the heart of the matter:
I am familiar with NNs trained with the back-propagation learning algorithm. The problem is that this algorithm requires a set of inputs and desired outputs. Here the inputs would be the states of the game, and the desired outputs would be the best moves in those states.
However, I don't know which moves are the best for a given state. I want the NN to figure this out.
I intend to evaluate performance based on a FULL game. So, I want the NN to make moves until the game reaches a terminal state, and only then evaluate how well it played by calculating a Fitness Function.
How can I make the NN learn the best moves when the evaluation happens after a full session, rather than move by move?
I'm not very familiar with NNs, but a genetic algorithm sounds promising to me. You could create a population of NNs and feed each one its game state until its game is over (no more moves possible). Whenever a network combines 2 tiles, you add the resulting tile's value to its fitness. That 'encourages' them to combine higher-value tiles. When all of them are finished, you run the genetic algorithm, which uses the best performers to generate new NNs through crossover and mutation. If you are not familiar with genetic algorithms, I would suggest reading up on them; it's a pretty interesting way of generating neural networks.
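To make that suggestion concrete, here is a rough sketch of the loop in plain Python. Everything in it is an assumption made for illustration and not something from this thread: the network shape (16 inputs, 8 tanh hidden units, 4 outputs, no biases), the log2 input scaling, the population size, the "keep the top quarter" selection rule, and the mutation strength. Fitness is the sum of merged tile values over one full game, as described above.

```python
import math
import random

HIDDEN = 8                                  # hidden units (arbitrary choice)
N_WEIGHTS = 16 * HIDDEN + HIDDEN * 4        # flat weight vector, no biases

def slide_left(row):
    """Slide one row left, merging equal neighbours once. Returns (row, gained)."""
    tiles = [t for t in row if t]
    out, gained, i = [], 0, 0
    while i < len(tiles):
        if i + 1 < len(tiles) and tiles[i] == tiles[i + 1]:
            out.append(tiles[i] * 2)
            gained += tiles[i] * 2          # merged tile's value counts as fitness
            i += 2
        else:
            out.append(tiles[i])
            i += 1
    return out + [0] * (len(row) - len(out)), gained

def rotate(board):
    """Rotate the board 90 degrees counter-clockwise."""
    return [list(r) for r in zip(*board)][::-1]

def move(board, k):
    """Apply move k (0=left, 1=up, 2=right, 3=down). Returns (new_board, gained)."""
    for _ in range(k):
        board = rotate(board)
    rows = [slide_left(row) for row in board]
    new, gained = [r for r, _ in rows], sum(g for _, g in rows)
    for _ in range((4 - k) % 4):            # rotate back to the original orientation
        new = rotate(new)
    return new, gained

def spawn(board, rng):
    """Place a 2 (90%) or a 4 (10%) on a random empty cell."""
    empty = [(r, c) for r in range(4) for c in range(4) if board[r][c] == 0]
    r, c = rng.choice(empty)
    board[r][c] = 4 if rng.random() < 0.1 else 2

def policy_scores(weights, board):
    """Tiny feed-forward net: 16 scaled log2 inputs -> tanh hidden -> 4 move scores."""
    x = [math.log2(t) / 11 if t else 0.0 for row in board for t in row]
    h = [math.tanh(sum(x[i] * weights[j * 16 + i] for i in range(16)))
         for j in range(HIDDEN)]
    off = 16 * HIDDEN
    return [sum(h[j] * weights[off + k * HIDDEN + j] for j in range(HIDDEN))
            for k in range(4)]

def play_game(weights, rng):
    """Play one full game with the given network; return its fitness."""
    board = [[0] * 4 for _ in range(4)]
    spawn(board, rng)
    spawn(board, rng)
    fitness = 0
    while True:
        scores = policy_scores(weights, board)
        # Try moves from most to least preferred; skip ones that change nothing.
        for k in sorted(range(4), key=lambda k: -scores[k]):
            new, gained = move(board, k)
            if new != board:
                break
        else:
            return fitness                  # no legal move left: game over
        board = new
        fitness += gained
        spawn(board, rng)

def evolve(pop_size=20, generations=5, seed=0):
    """Evolve weight vectors; return (fitness, weights) of the last generation's best."""
    rng = random.Random(seed)
    pop = [[rng.gauss(0, 1) for _ in range(N_WEIGHTS)] for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(((play_game(w, rng), w) for w in pop),
                        key=lambda p: p[0], reverse=True)
        parents = [w for _, w in scored[:pop_size // 4]]
        pop = parents[:]                    # keep the best unchanged (elitism)
        while len(pop) < pop_size:
            a, b = rng.sample(parents, 2)
            # Uniform crossover plus small Gaussian mutation on every weight.
            pop.append([rng.choice(pair) + rng.gauss(0, 0.1) for pair in zip(a, b)])
    return scored[0]
```

Calling `evolve()` returns the best fitness of the final generation together with its weight vector. Note that each network plays only one game per generation here, so the fitness is noisy; averaging over several games per network would give a steadier signal at the cost of more compute.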
I am not a fan of genetic algorithms for NNs. Q-learning seems much more appropriate here. I don't have time to explain the details now, but you can find them in this paper: http://arxiv.org/abs/1312.5602
If you have a hard time extracting the relevant information, I'll try to write a simple description of the method tonight.
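As a rough illustration of the core idea only, and not the method from the linked paper: the sketch below is plain tabular Q-learning on a made-up toy problem, a chain of states where the only reward comes at the terminal state. The environment, the function name `q_learning`, and all hyperparameters are assumptions for illustration. The paper replaces the table with a neural network and adds further machinery such as experience replay. What the sketch does show is how a reward received only at the end of an episode propagates back to earlier moves, which is the question asked above.

```python
import random

def q_learning(n_states=6, episodes=500, alpha=0.5, gamma=0.9,
               epsilon=0.1, seed=0):
    """Tabular Q-learning on a toy chain.

    States are 0..n_states-1, the agent starts at 0, action 0 moves left
    (staying put at state 0) and action 1 moves right. Reaching the last
    state gives reward 1 and ends the episode; all other rewards are 0.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]   # q[state][action]
    for _ in range(episodes):
        s = 0
        for _ in range(100):                    # step cap per episode
            # Epsilon-greedy action choice, with random tie-breaking.
            if rng.random() < epsilon or q[s][0] == q[s][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[s][0] > q[s][1] else 1
            s2 = max(0, s - 1) if a == 0 else s + 1
            done = s2 == n_states - 1
            r = 1.0 if done else 0.0
            # Target: reward now plus discounted value of the best next move.
            target = r if done else r + gamma * max(q[s2])
            q[s][a] += alpha * (target - q[s][a])
            s = s2
            if done:
                break
    return q
```

After training, the "right" action has the higher value in every non-terminal state, even though only the final step was ever rewarded directly. For 2048 the state space is far too large for a table, which is why a network would be used to approximate the Q-values.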
This topic is closed to new replies.