
Neural net learning?

Started by October 28, 2004 07:50 PM
4 comments, last by GameDev.net 20 years, 1 month ago
I'm currently in the process of (trying to!) learn neural networks, although I'm missing a few key points, I believe, which is impeding my progress. So far I've been following fup's nice tutorial on ai-junkie.com, and I've read a few more technical articles on ANNs. My ignorance may be painful, but try to endure me. ;)

I don't understand how having multiple neurons in the hidden layer (between the input and output) affects the output. I also don't see how the network can "learn" from the weights on these neurons. Plenty of tutorials have no problem explaining what a neural network looks like and what its properties are, but none seem to explain how they actually learn from their input. Could someone explain this to me in a bit more detail? Thanks in advance!
It helps to understand how control systems work. The weights change based on negative feedback so that when the output of the net is far from the target, the weights are changed more than when they are close.

Generally speaking, the easiest (and fastest) way to implement a neural net is to use perceptron neurons. "Perceptron" is a fancy name for a linear neuron trained to minimize the mean squared error. If the problem you are trying to solve is linearly separable, they can solve it; otherwise (if the problem is non-linear) they can get the closest linear approximation.

I would suggest using a perceptron ANN before using a backprop net simply because it is easier and more powerful.

If you want it (and if you don't), here is the perceptron learning equation: w(n+1) = w(n) + eta * (d(n) - u(n)) * x(n)
I am going to leave it up to you to figure out what the letters mean.
(w = weight, eta = learning rate, d = target output, u = weighted sum of the inputs, x = the input to that weight, n = the training step)
(I'm not that mean)
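In case seeing it as code helps, here's a rough Python sketch of that rule (the sample data, learning rate, and epoch count are just made-up illustration values, nothing standard):

```python
import random

def train_perceptron(samples, eta=0.1, epochs=50):
    """samples: list of (inputs, target) pairs; inputs is a list of floats."""
    n_inputs = len(samples[0][0])
    # small random starting weights, plus one extra weight for a bias input
    w = [random.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
    for _ in range(epochs):
        for x, d in samples:
            x = x + [1.0]                                # append the bias input
            u = sum(wi * xi for wi, xi in zip(w, x))     # weighted sum of inputs
            # w(n+1) = w(n) + eta * (d(n) - u(n)) * x(n)
            w = [wi + eta * (d - u) * xi for wi, xi in zip(w, x)]
    return w

# Learn d = 2*x1 - x2 from a handful of made-up samples.
data = [([0.0, 0.0], 0.0), ([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0), ([1.0, 1.0], 1.0)]
print(train_perceptron(data))   # should end up roughly [2, -1, 0]
```

Notice that the size of each weight change is proportional to the error (d - u): that's the negative feedback mentioned above.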
Quote: Original post by birdtracker
It helps to understand how control systems work. The weights change based on negative feedback so that when the output of the net is far from the target, the weights are changed more than when they are close.


I guess I was a bit vague originally, then. I already know that ANNs learn by weight modification, but I just don't see how feeding data into several inputs, turning it into numbers that get summed in the hidden layer, and outputting the result allows the net to figure out patterns.

I appreciate your mentioning perceptron NNs, and I will check those out too. However, my question isn't specific to backprop NNs; it applies to the theory of ANNs in general. I'll provide an example to illustrate what I'm trying to understand:

 Xvelocity    Yvelocity    Direction-of-nearest-powerup
     o            o            o            <--- Input layer
           8            8                   <--- Hidden layer
  X vel.       Y vel.      Ship facing
     *            *            *            <--- Output layer


Given this data, how is the NN going to learn, assuming the input data is summed in the hidden layer and sent to the output? You're going to have fairly ambiguous numbers, and I'm not seeing how the NN will be able to tell the difference between (Xvel=0.5, Yvel=0.3) and (Xvel=0.3, Yvel=0.5), because the total will be the same.

I'm guessing that my understanding is somewhat confuddled, which I'm attributing to my lack of understanding of these ANNs. :)

(Edit: I know my NN example doesn't make perfect sense, but that's not the point that I'm trying to poke at :P)
Umm.. I guess you already know some part of this, but:

A neural network does not care about which input is what in the physical world (only before training, though!). The inputs and outputs are all just numbers and not significant in themselves. The training process and the weight matrix (W = [w1 w2 ...] for each input) make them significant.

Don't think of the neural net training process as human learning; it's an optimisation algorithm that finds the correct weights for a given set of input/output pairs. So if you swap the inputs after training, you'll get bad output, which means a wrong answer to the problem the neural net is supposed to solve, but the neural net will not know it.

After the training is done, it will have a fixed set of weights and compute something like x*w1 + y*w2, and usually w1 is not equal to w2. So x*w1 + y*w2 will not be equal to y*w1 + x*w2.

Even at the start of training, the totals will not be the same if the initial weights are randomized.
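To put numbers on it, here is a tiny illustration (the weights are my own made-up values, not from any particular net):

```python
# Made-up weights, just to show that the order of the inputs matters once w1 != w2.
w1, w2 = 0.8, -0.2

def weighted_sum(x, y):
    return x * w1 + y * w2

print(weighted_sum(0.5, 0.3))   # 0.34
print(weighted_sum(0.3, 0.5))   # 0.14 -- different, so the net can tell them apart
```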

Quote: Original post by Anonymous Poster
A neural network does not care about which input is what in the physical world (only before training, though!). The inputs and outputs are all just numbers and not significant in themselves. The training process and the weight matrix (W = [w1 w2 ...] for each input) make them significant.

Don't think of the neural net training process as human learning; it's an optimisation algorithm that finds the correct weights for a given set of input/output pairs. So if you swap the inputs after training, you'll get bad output, which means a wrong answer to the problem the neural net is supposed to solve, but the neural net will not know it.


This is somewhat what I thought it would be. So in summary, you're saying that the input values themselves are fairly irrelevant (as long as they are consistent with the world), and the neural net, through training, will figure out the pattern on its own, simulating "learning"?
Yup :) At least for supervised learning and feed-forward MLP networks. There are some neural network architectures I'm not familiar with, but I guess the basic principle is the same.


I think a real brain also works like that: a network of neurons combined with evolution (think genetic algorithms), and human learning is a perceived illusion, a mirage of this mechanical learning. (This guy thinks like me: http://www.imagination-engines.com/technologies/ieitechnology.htm)

To summarize, all you need to train a neural network is a data set containing inputs and desired outputs. The problem must also be solvable, i.e. definable by pseudo-functions in math, like:

a*f(x) + b*g(y) = z

If the functions and formulae are too complex (too many terms, non-linear), the problem is a good candidate for neural nets.


For the simple problem a*f(x) + b*g(y) = z:

If you know x, y, and z (the inputs and outputs, since they are observable) but not a, b, or the functions f() and g() (i.e. you don't know anything about the problem), a neural net will find the correct values for a and b and will solve the rest of the problem: for new values x_new, y_new it will give an acceptable output z_neuralnetcomputed_new.

So if the neural net's output (the neural-computed z) is close to the real-world z, the neural net is said to work and to have learned the problem.

The weight matrix of the layers in the neural network will be correlated with a and b after it is trained. The transfer (or activation) functions in the neurons are used to approximate the real-world function (the problem), so for harder problems you need higher-order transfer functions, hence the need for a hidden layer. The more hidden layers a network has, the more capable it is of finding a solution to a more complex problem, but the harder it will be to train (CPU-speed- and memory-wise).
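If it helps to see the whole idea in code, here is a rough sketch (my own example, not anything from the thread): a one-hidden-layer net with tanh activations, trained by plain gradient descent to fit z = a*f(x) + b*g(y) when it only ever sees (x, y, z) samples. The function z = 2*sin(x) + 0.5*y^2 is just a made-up stand-in for the unknown real-world problem.

```python
import numpy as np

rng = np.random.default_rng(0)

# Generate (x, y, z) samples from the "unknown" real-world function.
# The constants and functions below are only used to create the data;
# the network itself never sees them.
X = rng.uniform(-2, 2, size=(500, 2))
z = (2.0 * np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2).reshape(-1, 1)

hidden = 10
W1 = rng.normal(0, 0.5, size=(2, hidden)); b1 = np.zeros(hidden)
W2 = rng.normal(0, 0.5, size=(hidden, 1)); b2 = np.zeros(1)
lr = 0.1

for epoch in range(5000):
    # forward pass: hidden layer with a non-linear activation, linear output
    h = np.tanh(X @ W1 + b1)
    pred = h @ W2 + b2
    err = pred - z
    # backward pass: gradients of the mean squared error
    grad_pred = 2 * err / len(X)
    gW2 = h.T @ grad_pred
    gb2 = grad_pred.sum(axis=0)
    grad_h = grad_pred @ W2.T * (1 - h ** 2)        # derivative of tanh
    gW1 = X.T @ grad_h
    gb1 = grad_h.sum(axis=0)
    # negative-feedback weight update: move each weight against its error gradient
    W1 -= lr * gW1; b1 -= lr * gb1
    W2 -= lr * gW2; b2 -= lr * gb2

# Try a point the network has never seen.
x_new, y_new = 1.2, -0.7
h = np.tanh(np.array([[x_new, y_new]]) @ W1 + b1)
print((h @ W2 + b2).item())                        # network's guess at z
print(2.0 * np.sin(x_new) + 0.5 * y_new ** 2)      # real-world z; should be close
```

The hidden tanh layer is what lets the net bend to fit the sin() and square terms; without a hidden layer it could only fit a plane a*x + b*y + c.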

The most important part of training is generalisation. You need lots of x, y, z data (i.e. a big training set), depending on the problem's complexity. If this training data set is small compared to what can be observed of the real-world problem, the neural network will most likely not work on new data and won't solve the more general problem (something like finding a local minimum rather than the global minimum, if you're into optimisation algorithms). The other extreme is overfitting, when the network is too complex for the real problem and simply memorizes the training data set (also a local minimum). You can avoid it by changing the network size: the number of nodes and the number of layers.
The general rule is to use as few hidden layers as possible and big training data sets for good generalisation.

I.e. don't expect it to find a solution to higher-order problems like aircraft flight data prediction using 200-300 values of training data; it needs thousands of training values and a couple of hidden layers.
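A quick sketch of the usual way to watch for the overfitting described above (my own illustration; the function and variable names are made up, and it assumes you already have code to train your net and measure its error): hold back part of the data and compare the error on seen versus unseen samples.

```python
import random

def split_data(samples, holdout_fraction=0.2, seed=42):
    """Shuffle and split (input, target) samples into a training and a validation set."""
    samples = samples[:]                       # copy so the caller's list is left alone
    random.Random(seed).shuffle(samples)
    cut = int(len(samples) * (1.0 - holdout_fraction))
    return samples[:cut], samples[cut:]

# train_set, val_set = split_data(all_samples)
# ...train only on train_set, measuring the error on both sets as you go...
# If the training error keeps dropping while the validation error starts rising,
# the network is memorizing the training data instead of generalising.
```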

Sorry if my post is too long; I worked with NNs 1.5 years ago and have forgotten most of the stuff, but the topic is interesting and most of the time I'm confused too. I suggest you obtain Matlab and its neural net toolbox for testing and working with these ideas, if possible. In Matlab's .m files you can see the actual work, which is easier than parsing C code. At least read its manual; it's very descriptive. For better insight you can read about approximation, optimisation, data fitting, prediction, even Fourier analysis.

We can say "the neural net has learned the problem" but not "the equation knows the problem". The contents of an ANN could be written in equation form; IMHO, ANNs are just a different way of describing real-world problems.

If you want to learn more, get the book "Neural Networks for Pattern Recognition" by Christopher M. Bishop. The book is hard, but the author knows a lot, doesn't repeat the same descriptions you find elsewhere, and shows how NNs relate to mathematics and other methods.

- dustdevil


