Training a neural network with genetic algorithms
I'm trying to use a genetic algorithm to train the network so it learns better :)
I defined 150 steps per generation and 100 cars per generation.
Each car (gene) has the weights of the network as its data.
Each generation I select the best genes of the generation (the best one, plus all genes whose fitness is at least 95% of the best).
The crossover method: each weight of the child is chosen from one of the "parents" (the best of the last generation).
The mutation method: select one of the weights and change it by +/-30% of its value.
Each generation has 150 steps; after each step I add to or subtract from each gene's fitness.
At the moment I just have a clear screen (half of the normal XNA screen).
If the car collides with a wall, I subtract 1 from its fitness.
If the car moves normally (without colliding), I add the ratio of the distance moved to the highest it can be. For example, if the car moved half a meter and its speed is one meter per step, I add 0.5/1, i.e. moved/carSpeed.
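In rough Python, the per-step reward/penalty described above would look something like this (the function and parameter names are illustrative, not from the actual code):

```python
def update_fitness(fitness, collided, distance_moved, car_speed):
    """Apply one simulation step's reward/penalty to a car's fitness."""
    if collided:
        return fitness - 1.0  # penalize touching a wall
    # reward progress as a fraction of the maximum possible step distance
    return fitness + distance_moved / car_speed

print(update_fitness(0.0, False, 0.5, 1.0))  # -> 0.5
print(update_fitness(0.0, True, 0.0, 1.0))   # -> -1.0
```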
Each car has four sensors that report the distance to each wall.
If a wall is out of sensor range, the sensor reads -1; otherwise it reads the distance to the wall divided by the maximum range, (width + height)/3.
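A small sketch of that sensor encoding, assuming `width` and `height` are the screen dimensions (names are illustrative):

```python
def sensor_reading(distance_to_wall, width, height):
    """Normalized sensor value: -1 when the wall is out of range,
    otherwise distance divided by the maximum range (width+height)/3."""
    max_range = (width + height) / 3
    if distance_to_wall > max_range:
        return -1.0
    return distance_to_wall / max_range

print(sensor_reading(100, 300, 300))  # -> 0.5 (max range is 200)
print(sensor_reading(250, 300, 300))  # -> -1.0 (out of range)
```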
As far as I can see, the cars learn to cover more and more area as the generations go on, but they can't learn to avoid the walls :S
Around generation 15 the best car of each generation starts touching the walls; after that it just gets faster and faster, but it still collides with the walls :S
Is there something wrong with my general idea?
Thanks very much for your help :)
[quote]each car (gene) has the weights of the network as its data.[/quote]
I think you are confusing genes and chromosomes.
[quote]each generation I select the best genes of the generation (the best one, plus all genes whose fitness is at least 95% of the best).[/quote]
Is this the set of cars used to generate the next generation? If very few or none of the other cars reach 95% of the best fitness, you will get very low genetic diversity, which is bad. With something like tournament selection, every car can still be chosen, just with different probabilities.
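Tournament selection is only a few lines; here is a minimal sketch (names are illustrative):

```python
import random

def tournament_select(population, fitnesses, k=3, rng=random):
    """Pick one parent: sample k individuals at random and return the
    fittest of them. Every car can be chosen, but fitter cars win more often."""
    contenders = rng.sample(range(len(population)), k)
    best = max(contenders, key=lambda i: fitnesses[i])
    return population[best]

# With k equal to the population size the fittest always wins:
print(tournament_select(['a', 'b', 'c'], [1, 5, 2], k=3))  # -> 'b'
```

A larger `k` means stronger selection pressure; `k=2` or `k=3` keeps the population diverse.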
[quote]the crossover method: each weight is chosen from one of the "parents" (the best of the last generation).[/quote]
I don't get this. Crossover is used to swap some genes between two chromosomes.
[quote]the mutation method: select one of the weights and change it by +/-30% of its value.[/quote]
When you say +/-30%, do you mean in the range [-30%, +30%]? I'm not sure, but 30% sounds like rather a lot.
[quote]if the car collides with a wall, I subtract 1 from its fitness.[/quote]
What happens if the car has been colliding since the last step but is not currently colliding? Will that still affect the fitness? If this is mishandled, it could explain why they don't learn to avoid walls.
[quote]if the car moves normally (without colliding), I add the ratio of the distance moved to the highest it can be, e.g. if the car moved half a meter and its speed is one meter, I add 0.5/1, i.e. moved/carSpeed.[/quote]
Are you taking the road direction into account, so that you don't reward movement in the wrong direction?
[quote]each car has four sensors that report the distance to each wall.[/quote]
Is there one sensor at the front, back, left, and right of the car? I have no idea what your track looks like, but imagine the car driving on a straight road with a 90° turn later on. No sensor will be able to see the turn until the car is already in it. Could that be too late, so that a crash is unavoidable? Maybe you need more sensors to be able to learn better.
[quote name='simshon' timestamp='1317779711' post='4869184']each car (gene) has the weights of the network as its data.[/quote]
[quote]I think you are confusing genes and chromosomes.[/quote]
[quote]Is this the set of cars used to generate the next generation? If very few or none of the other cars reach 95% of the best fitness, you will get very low genetic diversity, which is bad. With something like tournament selection, every car can still be chosen, just with different probabilities.[/quote]
Yes, all the "best" cars are used for the next generation's crossover.
[quote]I don't get this. Crossover is used to swap some genes between two chromosomes.[/quote]
If the program decides to do crossover for a car (the crossover rate is about 0.5), then for each weight in the network the new "child" randomly picks which parent it will take that weight from.
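What's being described here is essentially uniform crossover; a minimal sketch (names are illustrative):

```python
import random

def uniform_crossover(parent_a, parent_b, rng=random):
    """For each weight, the child takes that weight from a randomly
    chosen parent (uniform crossover over the weight vector)."""
    return [a if rng.random() < 0.5 else b
            for a, b in zip(parent_a, parent_b)]

child = uniform_crossover([0.0] * 8, [1.0] * 8)
print(child)  # a mix of 0.0s and 1.0s, one entry per weight
```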
[quote]When you say +/-30%, do you mean in the range [-30%, +30%]? Not sure, but 30% sounds like rather a lot.[/quote]
Yes. The program decides whether to add to or subtract from the weight (randomly, 50% each), then I add/subtract random.NextDouble() * (val / 3), which is up to ~33%.
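In Python that mutation step would look roughly like this (the `abs()` is my addition, so negative weights get the same magnitude of change; names are illustrative):

```python
import random

def mutate(weights, rng=random):
    """Pick one weight at random and nudge it up or down by a random
    amount of up to ~33% of its magnitude."""
    mutated = list(weights)
    i = rng.randrange(len(mutated))
    delta = rng.random() * (abs(mutated[i]) / 3)  # up to ~33% of the value
    if rng.random() < 0.5:                        # 50/50 add or subtract
        delta = -delta
    mutated[i] += delta
    return mutated

print(mutate([3.0]))  # a single weight somewhere in [2.0, 4.0]
```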
[quote]What happens if the car has been colliding since the last step but is not currently colliding? Will that still affect the fitness? If this is mishandled, it could explain why they don't learn to avoid walls.[/quote]
That situation can't happen: when a car collides I don't allow it to move into the wall, so from the step the car first collides it will keep colliding until the end of the generation.
How could I improve this, maybe?
[quote]Are you taking the road direction into account, so that you don't reward movement in the wrong direction?[/quote]
At the moment I just have a blank screen, and I'm trying to "teach" the cars to avoid the walls.
[quote]Is there one sensor at the front, back, left, and right of the car? Imagine the car driving on a straight road with a 90° turn later on: no sensor will see the turn until the car is already in it. Maybe you need more sensors to be able to learn better.[/quote]
Good idea, I'll add more sensors at the front of the car.
Thank you for your help :)
[quote]That situation can't happen: when a car collides I don't allow it to move into the wall, so from the step the car first collides it will keep colliding until the end of the generation. How could I improve this, maybe?[/quote]
How about:
fitness = fitnessBeforeCollisions / timeInContactWithWalls
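As a sketch in Python (the `+ 1` in the denominator is my addition, so a car that never touches a wall isn't divided by zero):

```python
def fitness(fitness_before_collisions, time_in_contact_with_walls):
    """Progress earned while clear of walls, scaled down by how long
    the car spent touching them."""
    return fitness_before_collisions / (1 + time_in_contact_with_walls)

print(fitness(10.0, 0))  # -> 10.0 (never touched a wall)
print(fitness(10.0, 4))  # -> 2.0  (spent 4 steps against a wall)
```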
To obtain a pool of decent car controllers that finish the race, you could evaluate them on a long test track with a great variety of curved and straight sections, and use the length of the path driven before crashing (not speed!) as fitness: improved mutants would crash further along, until you get controllers that finish the race, and only then do you start making them fast.
Can you explain the car controller representation and the type of neural network you are using? Are you considering speed, acceleration, centrifugal forces etc. as inputs?
Omae Wa Mou Shindeiru
I'm curious why the OP chose to use NN/GA technology anyway. Are we trying to create a racing game or a machine learning test project?
Dave Mark - President and Lead Designer of Intrinsic Algorithm LLC
Professional consultant on game AI, mathematical modeling, simulation modeling
Co-founder and 10 year advisor of the GDC AI Summit
Author of the book, Behavioral Mathematics for Game AI
Blogs I write:
IA News - What's happening at IA | IA on AI - AI news and notes | Post-Play'em - Observations on AI of games I play
"Reducing the world to mathematical equations!"
I agree that the mix-and-match approach seemed a bit odd. I can easily see the NN and GA working against each other... or at least being counter-productive because of the additional noise.
[quote]I'm curious why the OP chose to use NN/GA technology anyway. Are we trying to create a racing game or a machine learning test project?[/quote]
Wait, what's the problem, and why is this weird? ANNs are function approximators, GAs are optimizers, and using a GA to optimize the ANN weights is both totally reasonable and commonly done. Maybe a simpler linear architecture would be better, but that only really matters if you're trying to do something like approximate dynamic programming.
I like LorenzoGatti's suggestion to first get cars that can finish the race, and then to optimize lap times. I'm reminded of two-phase algorithms for convex optimization. Borrowing this idea, you'd think of your problem as minimizing lap time subject to the constraint that you finish the race. To do this, you'd start with a bunch of infeasible points, and, in phase 1, minimize constraint violation (say, maximize distance traveled around the track) until you have a large enough pool of feasible points; then you switch to minimizing lap time, keeping only feasible points from now on.
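One way to encode that two-phase idea as a single score, so finishers always outrank non-finishers (a sketch; names are illustrative):

```python
def two_phase_fitness(finished, distance_traveled, lap_time, track_length):
    """Phase 1: minimize constraint violation (maximize distance traveled,
    which is always < track_length for a non-finisher).
    Phase 2: among finishers, faster laps score higher."""
    if not finished:
        return distance_traveled               # phase-1 score
    return track_length + 1.0 / lap_time       # phase-2 score, always larger

# A finisher beats even the best non-finisher:
print(two_phase_fitness(True, 100.0, 10.0, 100.0))   # -> 100.1
print(two_phase_fitness(False, 99.9, 0.0, 100.0))    # -> 99.9
```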
The same two-phase strategy can really be applied with whatever optimizer you want. Since we've got a nonconvex problem if we stick with ANNs, stochastic methods (like GAs) seem reasonable, though I might prefer the (simpler, IMO) cross-entropy method.
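For reference, the cross-entropy method really is simple; a bare-bones sketch for maximizing a fitness function over a weight vector (all names and hyperparameters are illustrative):

```python
import random

def cross_entropy_method(evaluate, dim, iters=50, pop=100, elite_frac=0.2,
                         rng=random):
    """Sample weight vectors from a diagonal Gaussian, keep the elite
    fraction by fitness, refit the Gaussian to the elites, repeat."""
    mean = [0.0] * dim
    std = [1.0] * dim
    n_elite = max(1, int(pop * elite_frac))
    for _ in range(iters):
        samples = [[rng.gauss(m, s) for m, s in zip(mean, std)]
                   for _ in range(pop)]
        samples.sort(key=evaluate, reverse=True)  # highest fitness first
        elites = samples[:n_elite]
        mean = [sum(e[j] for e in elites) / n_elite for j in range(dim)]
        std = [max(1e-3, (sum((e[j] - mean[j]) ** 2 for e in elites)
                          / n_elite) ** 0.5)
               for j in range(dim)]
    return mean

# Toy check: maximizing -sum(x^2) should drive the weights toward zero.
best = cross_entropy_method(lambda w: -sum(x * x for x in w), dim=3)
```

Here `evaluate` would be the simulated fitness of a car controller with those weights.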
All this said, if you're interested more in results than in having the computer "automatically" "learn" how to race, maybe you really just want to use some nice hand-made controllers. Maybe you can optimize them locally by tuning some weights with gradient descent.
Some upsides of the evolutionary approach:
- structural evolution (i.e. you can evolve the ANN's structure, not just the weights)
- multi-objective optimization
- quite robust to local optima
- easily multithreadable
The main downside is the computational cost (and potentially the spatial cost, if your individuals are obese).