
Neural Net Won't Learn

Started April 28, 2001 11:19 AM
7 comments, last by +AA_970+
OK, I've implemented a back-prop neural net to play what's basically a large tic-tac-toe game. The board is 6x6 and you need 4 in a row (across, down, etc.) to win. The problem is that the AI only learns the last move it was trained for. For example, if I train it to play the sequence of moves (2,2) (3,3) (4,4), the AI will only play (4,4).

Here's how I structured the network. There are 36 inputs, each representing a space on the board; the input is 1 if the space is empty or 0 if it's occupied. The hidden layer consists of 36 neurons, each of which is connected to all the inputs, so each neuron has 36 weights (one for each input). Then there's the output layer with one neuron connected to the 36 outputs from the hidden neurons. The output neuron produces a number between 0 and 1, and this number is scaled to represent a position on the board (the position where the AI should play).

Any ideas on how to fix this?

Digital Radiation
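(For reference, the forward pass described in that post would look roughly like the following sketch; the names and the exact scaling step are assumptions, not the actual code.)

#include <cmath>

// Rough sketch of the forward pass described above; names and the final
// scaling step are assumptions, not the poster's actual code.
const int N_IN  = 36;   // one input per square: 1 = empty, 0 = occupied
const int N_HID = 36;   // hidden neurons, each fully connected to the inputs

double sigmoid(double a) { return 1.0 / (1.0 + std::exp(-a)); }

// weightsIH[h][i]: input i -> hidden h;  weightsHO[h]: hidden h -> output
int forward(const double input[N_IN],
            const double weightsIH[N_HID][N_IN], const double biasH[N_HID],
            const double weightsHO[N_HID], double biasO)
{
    double act = biasO;
    for (int h = 0; h < N_HID; ++h) {
        double sum = biasH[h];
        for (int i = 0; i < N_IN; ++i)
            sum += weightsIH[h][i] * input[i];
        act += weightsHO[h] * sigmoid(sum);     // hidden neuron output
    }
    double out = sigmoid(act);                  // single output in (0, 1)
    return (int)(out * 35.0 + 0.5);             // scaled to a square index 0-35
}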
For a start I'd change the one output to two, to represent the x/y position on the board. Your current representation is probably confusing the situation, because nearby areas of the board end up with non-local values. For instance, if you number the board row by row (1,1 1,2 1,3 ...), then (1,1) and (2,1) are six spaces apart. Not conducive to learning.

Next up, how are you presenting the data to the network? Do you present (2,2), then update the board with the position taken and present (3,3), performing learning between each piece of data?

What's your learning rate?

Do you have inhibitory as well as excitatory connections?

Also, how have you implemented back prop?

If you tell me this stuff I may be able to help you.

Mike
Well, I think I figured out the problem. Every time the AI is trained for something new, it has to go through training again for the stuff it learned previously, right?

I made a simple console app using the same network and a few number patterns and this seemed to be the problem.
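So I guess the fix is to keep cycling over the whole set of examples, pass after pass, rather than finishing one example before moving on to the next - roughly something like this (just a sketch with placeholder names, not my actual code):

#include <vector>

// Sketch only: placeholder names, not the actual project code.
struct TrainingExample {
    double input[36];   // board state: 1 = empty, 0 = occupied
    double desired;     // the output the network should produce for it
};

// The point: keep cycling over ALL the examples for many passes (epochs),
// instead of fully training on one example before moving to the next.
template <typename Net>
void trainAll(Net& net, const std::vector<TrainingExample>& examples,
              double learningRate, int epochs)
{
    for (int e = 0; e < epochs; ++e)
        for (const TrainingExample& ex : examples)
            net.train(ex.input, ex.desired, learningRate);
}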

BTW, the locations on the board are represented by the numbers 0-35 going by rows (e.g. the first row is 0-5). Should I still change it to represent the x/y position? It doesn't seem necessary.

Digital Radiation
All I can give is my advice, and I can't guarantee that it'll help, but I'd definitely have separate outputs for the rows and columns.

The way backpropagation works is that it takes the error surface in n-dimensions and calculates the gradient of the surface at the point denoted by your inputs. It then works out the distance to zero error based on that gradient. So if you move all the way to that zero error point each time you train for one example then the network will be reasonably (though by no means perfectly) trained for that sample and probably not trained for any others.
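In symbols, the step behind all this is ordinary gradient descent, with the learning rate \eta deciding how far along the gradient each single example pulls the weights:

w_{ij} \leftarrow w_{ij} - \eta \, \frac{\partial E}{\partial w_{ij}}

With a large \eta, each sample's step jumps most of the way to that sample's own minimum and clobbers what was learned for the others.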
All this leads back to one of my original questions: what is your learning rate? If you're unsure of the algorithm or any part of it, do ask. Three days ago I wrote it out from first principles using partial differentiation for a neural networks exam for my Masters, so I know it off by heart, though due to alcohol consumption that knowledge is fading fast, so you may have limited time.

Mike
First, thanks for that explanation about back prop. I think I now understand why you usually don't get exact values for outputs: every time you train the network for one sample, precision is lost for another sample.

Anyway, the learning rate is 0.5, and this is one part of the network I don't understand. How do I know when to change this and by how much?

I think I pretty much understand the rest of the network. Just to make sure, here's how I understand it...

The weights are updated by the equation:

change_in_weight = learning_rate * input * delta

and for the bias weights it's the same, except the input is the bias (which is always 1, right?).

For every neuron in the output layer:

delta = ActualOutput * (1 - ActualOutput) * (DesiredOutput - ActualOutput)

and for every neuron in the hidden layer:

delta = NeuronOutput * (1 - NeuronOutput) * OutputWeight * delta_out

That equation is for an output layer with a single neuron, where OutputWeight is the weight the output neuron assigns to the output of the hidden neuron whose delta is being calculated, and delta_out is the delta value for the output neuron.
If there were more than one neuron in the output layer, you would sum OutputWeight * delta_out over every neuron in the output layer (using each output neuron's own weight and delta, not the same values repeated for each one).
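In code, I think those rules work out to roughly this (a sketch with my own names for things, assuming sigmoid activations on both layers):

#include <cmath>

// Sketch of those update rules for the 36-36-1 layout, assuming sigmoid
// activations everywhere. All the names here are made up, not the actual code.
const int N_IN = 36, N_HID = 36;

double sigmoid(double a) { return 1.0 / (1.0 + std::exp(-a)); }

struct TinyNet {
    double weightsIH[N_HID][N_IN];   // input -> hidden
    double biasH[N_HID];
    double weightsHO[N_HID];         // hidden -> single output
    double biasO;

    // One backprop step on a single example, with learning rate eta.
    void train(const double input[N_IN], double desired, double eta) {
        // Forward pass.
        double hiddenOut[N_HID];
        double act = biasO;
        for (int h = 0; h < N_HID; ++h) {
            double sum = biasH[h];
            for (int i = 0; i < N_IN; ++i)
                sum += weightsIH[h][i] * input[i];
            hiddenOut[h] = sigmoid(sum);
            act += weightsHO[h] * hiddenOut[h];
        }
        double output = sigmoid(act);

        // Output delta: ActualOutput * (1 - ActualOutput) * (Desired - Actual).
        double deltaOut = output * (1.0 - output) * (desired - output);

        for (int h = 0; h < N_HID; ++h) {
            // Hidden delta: NeuronOutput * (1 - NeuronOutput) * OutputWeight * delta_out
            // (computed with the old hidden->output weight).
            double deltaH = hiddenOut[h] * (1.0 - hiddenOut[h])
                          * weightsHO[h] * deltaOut;

            // Weight change = learning_rate * input_to_that_weight * delta.
            weightsHO[h] += eta * hiddenOut[h] * deltaOut;
            for (int i = 0; i < N_IN; ++i)
                weightsIH[h][i] += eta * input[i] * deltaH;
            biasH[h] += eta * 1.0 * deltaH;     // bias "input" is 1
        }
        biasO += eta * 1.0 * deltaOut;
    }
};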


Digital Radiation
Hi, thanks for explaining more of what you're doing. First of all, the learning rate: drop it to 0.1 or less. I often use learning rates of 0.01 for backprop, especially with large training sets. Normally you won't change this value over the course of training unless you want to try methods of avoiding local optima (places where any movement across the error landscape away from the current position increases the error, even though it is not the globally lowest error point), so leave it constant for now.
Secondly, the learning rule depends on the activation function you're using. From what you've posted, you're using the right algorithm for the sigmoid function: output * (1 - output) is the derivative of 1 / (1 + e^-activation) with respect to the activation (if that makes no sense, don't worry about it). If that's not the activation function you're using, please tell me what it is.
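For reference, that function and its derivative in code (just a quick sketch):

#include <cmath>

double sigmoid(double activation) {
    return 1.0 / (1.0 + std::exp(-activation));
}

// For y = sigmoid(a), dy/da = y * (1 - y), which is exactly the
// output * (1 - output) factor in the delta terms above.
double sigmoidDerivative(double output) {
    return output * (1.0 - output);
}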
Lastly, the delta_out value should be the opposite of the delta value. Since you're trying to reduce error, the error in a neuron (one not in the output layer) should be of the opposite sign to the change in weight. So if neuron i has error x in it (delta_out), then the weight from i to j is changed by learning_rate * -x (delta). Your delta value is correct, so try making delta_out = -1 * delta.

Apart from that it seems fine.

Tell me how it goes.

Mike
Just wanted to say thanks for all the help. It's gonna take me a while to fix the network and finish it up, but you helped me out a lot.

Digital Radiation
No problem. Happy to help.

Mike
Hi,

I agree with Mike's feeling that one output is not good, and wanted to add that this is particularly inappropriate given the number of output decisions (36). Since I take it you are using a sigmoid activation function on the output (given the a * (1 - a) derivative term in your output weight adjustments), the only stable outputs are high/low (1/0) - the intermediate activations correspond to the steepest part of the activation curve and so are difficult for backprop to stabilize (and thus difficult to learn general rules for). With so many decision regions mapped onto the [0,1] activation range, it will be extremely difficult, if not impossible, for the network to learn stable input-output mappings.

It might in fact be the case that the best output model is to use 'one-of-n' encoding, where each possible decision (i.e. each square) has its own output node, such that all the output activations are _competing_ to be the chosen decision. In the right context, these can be viewed as separate probability estimates of making each particular move. The decision logic is thus separated from the neural network learning itself, in that you just take the most likely move based on the network's predictions.
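As a rough sketch of that decision step with 36 output nodes (the names are placeholders, and skipping occupied squares is an extra assumption, not part of the network itself):

#include <vector>

// One output node per square; the network only scores the squares, and the
// move selection happens outside the network.
int pickMove(const std::vector<double>& outputs,        // 36 activations
             const std::vector<bool>& occupied) {       // 36 occupancy flags
    int best = -1;
    double bestScore = -1.0;
    for (int square = 0; square < 36; ++square) {
        if (occupied[square]) continue;                 // only consider legal moves
        if (outputs[square] > bestScore) {
            bestScore = outputs[square];
            best = square;
        }
    }
    return best;   // square index 0-35, or -1 if the board is full
}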

I very much encourage you to read the section in the comp.ai.neural-nets FAQ regarding output encodings - it's a very useful discussion of all these issues (as is everything in the FAQ, actually):

ftp://ftp.sas.com/pub/neural/FAQ2.html#A_cat

Enjoy,

Adam

