Getting GAs to learn to generalise solutions to problems...
Hi Gamedev,
I want to make a simplified Tetris-style puzzle game in which a neural network is trained using a GA. The problem is, I cannot pin down what the problem domain is.
My proposed solution initially is:
1. Have a population (maybe 30) of NNs play the game
2. Record their fitness - fitness could be how many lines they clear plus how long they last in the game
3. Use a selection method to pick best parents for new generation
4. Create new generation of NNs using a reproduction technique
5. Repeat from step 1.
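As a rough sketch in Python, the five steps above might look like the loop below. Here `play_game`, `make_network`, `crossover` and `mutate` are placeholders for your own implementations, and truncation selection (keep the top half) stands in for whatever selection method you choose:

```python
import random

POP_SIZE = 30  # population size from step 1


def evolve(generations, play_game, make_network, crossover, mutate):
    """Steps 1-5 above. play_game(net) is assumed to return
    (lines_cleared, moves_survived); the other three callbacks
    are placeholders for your NN representation."""
    population = [make_network() for _ in range(POP_SIZE)]
    for _ in range(generations):
        # Step 2: fitness = lines cleared + survival time
        ranked = sorted(population, key=lambda net: sum(play_game(net)),
                        reverse=True)
        # Step 3: truncation selection - keep the top half as parents
        parents = ranked[:POP_SIZE // 2]
        # Step 4: breed and mutate a full replacement generation
        population = [mutate(crossover(random.choice(parents),
                                       random.choice(parents)))
                      for _ in range(POP_SIZE)]
    return population
```

Whether each call to `play_game` seeds the same block sequence or a fresh random one is exactly the option-1-vs-option-2 question below, and it only changes that one callback.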
My goal is to see if, over many generations, I can get the computer to learn to generalise, i.e. to play new games. For this to happen, as the evolution occurs, should I:
1. Get the population to play the same game every generation (i.e, same blocks always - seed the same game)
2. Or get the population to play a different game each generation.
You see, I'm more inclined to pick option 2, as I think it may give the whole learning process some variety in block combinations, but I don't know. One argument I have against option 2 is that if the problem it's trying to solve changes every generation (i.e. new games), then maybe it will never converge? But if I pick option 1, it will only learn to get better at playing one game (one fixed sequence of blocks) and won't be able to generalise to new games well, right?
Lastly, if I want to use mutation on the chromosomes, how do I do it when the chromosome is made up of the floating-point weights of the NN? Do I add an arbitrary floating-point number, e.g. 0.432, whenever it's time to perform a mutation?
Let me know what you all think
Thanks for your help
DarkStar
UK
[Edited by - Dark Star on February 9, 2006 11:11:53 AM]
---------------------------------------------
You Only Live Once - Don't be afraid to take chances.
February 09, 2006 11:10 AM
You can try both and see what works. Changing from one method to the other should only require changing one line of code.
I would say that method 1 will lead to a set of NNs that play that particular board very well but fail on anything else. Option 2 is my pick.
As for mutation, instead of adding an arbitrary random number to the weight, get your random number from a normal distribution (Gaussian). The mean would be zero, and the variance would be a tunable number. You could even include the variance as an evolving parameter.
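A minimal sketch of that idea in Python (the mutation rate and default sigma are assumed values for illustration, not anything from this thread):

```python
import random

MUTATION_RATE = 0.05  # chance each weight is perturbed (assumed value)
SIGMA = 0.1           # standard deviation of the noise - the tunable variance


def mutate(weights, rate=MUTATION_RATE, sigma=SIGMA):
    """Add zero-mean Gaussian noise to a fraction of the NN weights."""
    return [w + random.gauss(0.0, sigma) if random.random() < rate else w
            for w in weights]
```

Making `sigma` itself part of the chromosome, as suggested above, would just mean storing it alongside the weights and mutating it the same way.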
-Kirk
A friend and I did something similar for a class project once, using GAs to weight an evaluation function for Tetris. We had the same question: train on one game, or several? What we ended up doing was training on a single game, then later testing on several games to see if it worked. It appeared that the genes that did best on the training game also did best (or at least very well) on the new random games.
Our assumption was that if a member played the same game for long enough, it would come across most general situations; that is, I figured no two Tetris games, when played long enough with genuinely random (or close enough) pieces, are significantly different. That said, each member did in fact learn to play the training game much better than any other game thrown at it, and we never had time to test with multiple training games.
One thing to keep in mind, which was also part of the reason we used the same training game, is that it's possible to get a combination of pieces that forces the player to lose. This means you could get a lucky mutation, end up with a really great player that nevertheless appears to do poorly and is then removed from the population. So if you use many games to train, consider using the same set of games for every member, or use enough games that this won't (shouldn't) be a problem.
As far as mutating floating-point numbers goes, you can also just convert them to binary first and do normal GA bitwise mutation. You end up with some really crazy numbers that way (and, depending on the language you're using, watch out for infinity), but it works.
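If you want to try that bit-flip approach, here is one way it could be sketched in Python, using `struct` to reach the raw bits of a 32-bit float; the finiteness guard reflects the infinity warning above, and the `flip_prob` value is just an assumption:

```python
import math
import random
import struct


def bitflip_mutate(value, flip_prob=1 / 32):
    """Flip random bits in the 32-bit IEEE representation of a float."""
    bits = struct.unpack('<I', struct.pack('<f', value))[0]
    for i in range(32):
        if random.random() < flip_prob:
            bits ^= 1 << i
    result = struct.unpack('<f', struct.pack('<I', bits))[0]
    # Guard against the infinities (and NaNs) mentioned above
    return value if not math.isfinite(result) else result
```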
Hope that helps :).
Hi everyone,
Thanks very much for your replies; that's given me something great to work with. I think I will get them to play different games each generation, but of course it won't be difficult to modify it to play the same game each generation.
Cheers
DarkStar
UK
Another thing: when I have run some generations and want to test the NN's ability to play the game, do I pick the best NN from the current population as the 'learned' network? In other words, in GAs involving NNs, how do you take the effort of the population and put it back into solving the problem?
I don't know if that makes any sense?
Thanks
DarkStar
UK
Yeah, I'm not sure what I mean exactly :). Maybe an example of what you would do if you didn't just choose the best in the population would help clarify?
Hi,
What I mean is: I'm going to have a population of NNs playing the game, but when I want to stop all training and test how well they have done so far, I want to take only one of them as the "computer brain". Do I pick the fittest one in that population?
Thanks
DarkStar
UK