Input data for neural networks in genetic algorithms
I've got a problem here. I'm making a genetic algorithm that evolves neural networks. The aim is for the agents to navigate around the game world and pick up small green dots, i.e. just move to them.
But they don't evolve very well. I know the algorithms themselves work, because I've tested them on things that are easier to learn, the XOR function for example.
I've tried inputting several different sets of readings from the game world and tried doing different things with the outputs.
First, I thought I'd keep it simple by inputting the angle to the closest "food" (which is what I call the green dots they're supposed to seek); whatever the net outputs becomes the agent's new travel angle. The problem was that as the input angle became lower, so did the output angle, so it was extremely easy to be successful that way.
After that I tried inputting the coordinates of the agent and the coordinates of the closest food (four inputs). The output was still the agent's travel angle. Now they just kept travelling straight ahead, barely reacting to the food. I've heard neural nets work best if they can work with inputs and outputs that range from -1.0 to 1.0; is this correct? The screen coordinates they get as inputs are far larger than that.
And I don't know whether I should treat the output value as radians or degrees.
Any suggestions?
Does no one know what I mean? Or am I asking the wrong questions?
I've based the program pretty much on the one on ai-junkie, but with SDL for the graphics because I don't know DirectX or Win32 programming.
The problem is, I can't figure out from the code what he's giving the agents as inputs and what he's doing with the output.
Anyone care to help? [smile]
Quote: Original post by Mizipzor
But they don't evolve very well. I know the algorithms themselves work, because I've tested them on things that are easier to learn, the XOR function for example.
One problem here is that it's not obvious whether it's the neural net or the genetic algorithm that's at fault, which makes the problem twice as hard. Is there a reason why you can't use backpropagation on the neural net? It's easier to implement than to understand, and it's more likely to give you reliable results than training a NN with a GA, doubly so when you're not entirely sure how to set up a GA for the task in question. You may still not have much success, but at least you'll have eliminated an entire class of potential errors.
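To illustrate, here's a minimal sketch of what backpropagation looks like, using XOR as the test case since that's what you've already used to validate the GA. Everything here (the 2-2-1 layout, learning rate, epoch count) is an assumption for the example, not taken from your code or the ai-junkie tutorial:

```cpp
// Minimal sketch: backpropagation on a 2-2-1 sigmoid network learning XOR.
#include <cmath>
#include <cstdio>
#include <cstdlib>

double sigmoid(double x) { return 1.0 / (1.0 + std::exp(-x)); }

int main() {
    const double in[4][2]  = {{0,0},{0,1},{1,0},{1,1}};
    const double target[4] = {0,1,1,0};

    // Weights and biases, randomly initialised in [-1, 1].
    double w1[2][2], b1[2], w2[2], b2;
    for (int i = 0; i < 2; ++i) {
        b1[i] = 2.0 * rand() / RAND_MAX - 1.0;
        w2[i] = 2.0 * rand() / RAND_MAX - 1.0;
        for (int j = 0; j < 2; ++j)
            w1[i][j] = 2.0 * rand() / RAND_MAX - 1.0;
    }
    b2 = 2.0 * rand() / RAND_MAX - 1.0;

    const double lr = 0.5;  // assumed learning rate
    for (int epoch = 0; epoch < 20000; ++epoch) {
        for (int p = 0; p < 4; ++p) {
            // Forward pass.
            double h[2], out = b2;
            for (int i = 0; i < 2; ++i) {
                h[i] = sigmoid(w1[i][0]*in[p][0] + w1[i][1]*in[p][1] + b1[i]);
                out += w2[i] * h[i];
            }
            out = sigmoid(out);
            // Backward pass: delta rule at the output, chain rule for hidden.
            double dOut = (target[p] - out) * out * (1.0 - out);
            for (int i = 0; i < 2; ++i) {
                double dHid = dOut * w2[i] * h[i] * (1.0 - h[i]);
                w2[i]    += lr * dOut * h[i];
                w1[i][0] += lr * dHid * in[p][0];
                w1[i][1] += lr * dHid * in[p][1];
                b1[i]    += lr * dHid;
            }
            b2 += lr * dOut;
        }
    }

    // Check the trained net on all four patterns.
    for (int p = 0; p < 4; ++p) {
        double h[2], out = b2;
        for (int i = 0; i < 2; ++i) {
            h[i] = sigmoid(w1[i][0]*in[p][0] + w1[i][1]*in[p][1] + b1[i]);
            out += w2[i] * h[i];
        }
        std::printf("%g XOR %g -> %.3f\n", in[p][0], in[p][1], sigmoid(out));
    }
    return 0;
}
```

Note that a 2-2-1 sigmoid net can occasionally get stuck in a local minimum on XOR; if the outputs don't separate after training, re-run with different random weights.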
Quote: First, I thought I'd keep it simple by inputting the angle to the closest "food" (which is what I call the green dots they're supposed to seek); whatever the net outputs becomes the agent's new travel angle. The problem was that as the input angle became lower, so did the output angle, so it was extremely easy to be successful that way.
What do you mean by 'extremely easy to be successful'? This could hint at a significant problem with how you're handling fitness values for the GA, if it means what I think it might mean. Or it might mean something entirely different that I don't understand.
I'd recommend using relative angles as both the input (i.e. the angle between the direction of travel and the direction to the nearest food) and the output (i.e. how far to turn to direct the agent towards the food). You always want a minimal representation that can generalise. By way of comparison, assuming for a second that you deal in integer degree values, you only ever have 360 potential relative angles to contend with, whereas you have 360*360 = 129,600 combinations of angle of travel and angle to food. Which do you think will learn more quickly?
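As a concrete illustration of that representation, here's a small sketch of computing the relative bearing as the single input and interpreting the single output as a turn. The function names and the maximum-turn parameter are illustrative assumptions, not anything from your code:

```cpp
// A sketch of the minimal representation: one input (relative angle to the
// nearest food, scaled to [-1, 1]) and one output (how far to turn).
#include <cmath>

const double PI = 3.14159265358979323846;

// Wrap any angle into (-PI, PI] so "slightly left" and "slightly right"
// become small values of opposite sign rather than, say, 0.1 and 6.2.
double wrapAngle(double a) {
    while (a >   PI) a -= 2.0 * PI;
    while (a <= -PI) a += 2.0 * PI;
    return a;
}

// Input to the net: relative bearing from the agent's heading to the food,
// scaled into [-1, 1].
double foodInput(double agentX, double agentY, double agentHeading,
                 double foodX, double foodY) {
    double bearing = std::atan2(foodY - agentY, foodX - agentX);
    return wrapAngle(bearing - agentHeading) / PI;
}

// Output of the net (assumed to lie in [-1, 1]): interpret it as a turn,
// capped at some maximum turn rate per update.
void applyOutput(double netOutput, double maxTurn, double& agentHeading) {
    agentHeading = wrapAngle(agentHeading + netOutput * maxTurn);
}
```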
Quote: After that I tried inputting the coordinates of the agent and the coordinates of the closest food (four inputs). The output was still the agent's travel angle. Now they just kept travelling straight ahead, barely reacting to the food.
This sounds more like your initial random weights are poor, or that your fitness values for the genetic algorithm are.
Quote: I've heard neural nets work best if they can work with inputs and outputs that range from -1.0 to 1.0; is this correct?
Neural nets simply take numbers as inputs and give you numbers as outputs; what they work best or worst with depends on how you implement them. It's common to scale values down to 0 to 1 or -1 to 1 to suit the activation function used within the net, then to scale the outputs back up to whatever range you need. Since you've adapted your code from elsewhere, I suggest you look into the values that code is supposed to use and adjust yours accordingly to ensure it's given the correct values. If that's not easy to find out, you'll have to work out what function your neurons are using and what range of values it expects. It's very likely that your inputs should range from -1 to 1, but it's impossible to say without seeing your exact code.
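For the screen-coordinate inputs specifically, a sketch of that scaling might look like the following; the screen dimensions are placeholder assumptions for whatever your world actually uses:

```cpp
// Scale raw screen coordinates into [-1, 1] before feeding them to the net.
const double SCREEN_W = 800.0;  // assumed window width in pixels
const double SCREEN_H = 600.0;  // assumed window height in pixels

double scaleX(double x) { return 2.0 * x / SCREEN_W - 1.0; }  // [0, 800] -> [-1, 1]
double scaleY(double y) { return 2.0 * y / SCREEN_H - 1.0; }  // [0, 600] -> [-1, 1]
```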
Quote: And I don't know whether I should treat the output value as radians or degrees.
That makes no difference: 100 degrees relates to 1 degree exactly as 100 radians relates to 1 radian, a factor of 100 either way. Just make sure that you perform the correct conversions when calculating inputs to your net and when handling outputs from it.
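If your maths library works in radians (C's trig functions do) but you think about the game in degrees, a pair of helpers keeps those conversions in one place:

```cpp
const double PI = 3.14159265358979323846;

double degToRad(double d) { return d * PI / 180.0; }
double radToDeg(double r) { return r * 180.0 / PI; }
```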
You're walking into a minefield by trying to use a GA to evolve a neural controller without first giving consideration to the controller architecture and its feedback interaction with the environment. Furthermore, if you deal only in global coordinates for the positions of the agent and target, then your controller must also learn the implicit nonlinear relationship between coordinates and trajectories. You should stick to using angles and distances relative to the heading and position of the agent as inputs, and directions to turn and speeds to travel at as outputs.

You also need to give consideration to the delay learning problem, which is the problem of correctly rewarding actions that set up a solution. When evolving a controller, it may be that your GA population contains controllers that work well when they are far away from the food, but if you only assess them on their final performance of achieving the goal, you won't propagate these controllers into the future to develop controllers that are good at reaching the goal from all distances. Make sense?
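One common way to soften that delayed-reward problem is to shape the fitness function so that progress toward the food is rewarded every tick, not just the final capture. A minimal sketch, in which the struct layout and the bonus constant are assumptions rather than anything from the thread:

```cpp
// Reward incremental progress toward the nearest food as well as the
// capture itself, so controllers that close the distance from far away
// still score well.
struct Agent {
    double x, y, heading;
    double fitness;
};

// Call once per simulation tick, after the agent has moved.
void accumulateFitness(Agent& a, double prevDist, double newDist, bool ateFood)
{
    a.fitness += (prevDist - newDist);  // positive when the agent got closer
    if (ateFood)
        a.fitness += 100.0;             // assumed bonus for reaching the goal
}
```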
Cheers,
Timkin
Adding to the last poster: the "delay learning problem" is also a topic in Reinforcement Learning (RL), where it translates into a credit-assignment problem through time. One method of dealing with it is the Temporal Difference (TD) method invented by Prof. Richard Sutton (http://www.cs.ualberta.ca/~sutton/index.html); on his personal website, consult his online book and look at the random-walk example, which I think might be helpful. Basically it re-adjusts the rewards for previous steps taken based on the results of later steps. There are two papers you could consult, his '88 and '89 papers. Hope this points you more in the right direction, or at least gives you a clue as to why what you're doing doesn't work so well...
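For what that looks like in code, here's a minimal sketch of tabular TD(0) on the five-state random walk from Sutton's book (the example mentioned above); the step size, episode count, and initial values are assumptions made for the example:

```cpp
// Tabular TD(0) on the five-state random walk: states A..E sit between two
// terminals, the right terminal pays reward 1, the left pays 0.
#include <cstdio>
#include <cstdlib>

int main() {
    const int N = 5;            // non-terminal states, indices 1..5
    const double alpha = 0.1;   // assumed step size
    double V[N + 2] = {0};      // V[0] and V[N+1] are terminal, fixed at 0
    for (int i = 1; i <= N; ++i) V[i] = 0.5;  // assumed initial estimates

    for (int episode = 0; episode < 1000; ++episode) {
        int s = N / 2 + 1;      // every episode starts in the middle state
        while (s != 0 && s != N + 1) {
            int s2 = (rand() % 2) ? s + 1 : s - 1;  // random step left/right
            double r = (s2 == N + 1) ? 1.0 : 0.0;   // reward only on the right
            // TD(0): nudge V(s) toward the bootstrapped target r + V(s').
            V[s] += alpha * (r + V[s2] - V[s]);
            s = s2;
        }
    }
    for (int i = 1; i <= N; ++i)
        std::printf("V(%c) = %.3f  (true %.3f)\n", 'A' + i - 1, V[i], i / 6.0);
    return 0;
}
```

After enough episodes the estimates converge toward the true values 1/6 through 5/6, exactly the effect described above: the terminal reward gets propagated back through the states that led to it.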