
Making new connections in NNs

Started by January 26, 2007 04:50 PM
6 comments, last by Timkin 17 years, 9 months ago
So I've recently been reading up on Neural Networks. From what I've read, it seems that the NN starts with the connections premade and then adjusts weights in order to learn. But I don't understand how these connections are originally formed. How does a node choose which other nodes to connect to? This boggles me especially in the biological sense, in that neurons must grow "intelligently" to make a new connection to another neuron. Also, it seems that a NN only deals with one problem. For example, a NN might be able to recognize faces, but it seems that in a real-world setting there are so many inputs that the weights for recognizing faces would be screwed up by all of the other inputs that occur (e.g. recognizing cars).
Often, all nodes in one layer are connected to all nodes in the next layer...

Instead of leaving nodes unconnected, you connect everything from the start, and then you set the unneeded weights to 0.
-Anders-Oredsson-Norway-
It sounds like you're talking about backpropagation, not neural networks in general. Backpropagation is a family of methods for training neural networks by propagating error signals backward through the network to adjust the weights.

Artificial neural networks (ANNs) themselves are networks of artificial neurons. They aren't the same as real neural networks made of real neurons. Real neurons are much more complicated and much less well understood, and simulations of them aren't really suitable for any task besides trying to model real neurons.

The concept of artificial neurons and ANNs is a mathematical abstraction that was inspired by real neural networks, which display a similar pattern of connectivity and distributed knowledge representation.

Any artificial neural network is ultimately a collection of nodes, connections, transfer functions, weights, etc. You give it inputs, and outputs are calculated in some deterministic way. Even the method of calculating the output can vary; for example, some ANNs model the passage of time and the delay required for signals to propagate, while others propagate signals instantly. Some ANNs allow arbitrary network topologies and others are more restricted.
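For concreteness, here's roughly what that deterministic calculation looks like for a small, fully connected feed-forward net in plain Python. This is just an illustrative sketch of my own; the weights are made up and the layer sizes are arbitrary.

import math

def sigmoid(x):
    # a common transfer (activation) function
    return 1.0 / (1.0 + math.exp(-x))

def forward(inputs, hidden_weights, output_weights):
    # each hidden node has one weight per input, plus a bias as the last entry
    hidden = [sigmoid(sum(w * x for w, x in zip(ws[:-1], inputs)) + ws[-1])
              for ws in hidden_weights]
    # the output layer works the same way, fed by the hidden activations
    return [sigmoid(sum(w * h for w, h in zip(ws[:-1], hidden)) + ws[-1])
            for ws in output_weights]

# 2 inputs -> 2 hidden nodes -> 1 output (weights chosen arbitrarily)
print(forward([0.5, 1.0],
              [[0.1, -0.4, 0.0], [0.7, 0.2, -0.1]],
              [[1.0, -1.0, 0.2]]))

"Learning" then just means choosing those weight numbers so the outputs match what you want.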

Some classes of ANNs that might be parameterized by things like weights, connections, transfer functions, and/or the number of nodes have been shown to be rather expressive. Feed-forward networks with at least one hidden layer can theoretically approximate any continuous function, provided they have sufficient nodes in the hidden layer and the weights are set appropriately. Backpropagation methods adjust the weights by gradient descent, trying to minimize the error (hmm... I have no idea what the results say about local minima in the weight space...). So backpropagation can be a relatively simple way to approximate an unknown function, given sample input/output pairs.

There are other methods to accomplish this task that aren't based on neural networks. I've heard that support vector machines do well, but I don't know much about them. You can really take any parameterized function and try to find the parameters that minimize the error in approximating an unknown function; for instance, you can use the least-squares method to find a polynomial of arbitrary degree that best fits a number of points. Feed-forward networks apparently tend to have a favorable smoothness to them, but such things are hard to quantify and not guaranteed.
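To make "adjust the weights by gradient descent" concrete, here is a toy backpropagation loop in plain Python for a one-hidden-layer net approximating an "unknown" function (here x squared) from sampled pairs. It's a minimal sketch, not production code; the network size, learning rate, epoch count and target function are all arbitrary choices of mine.

import math, random

random.seed(0)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# pretend we only have samples of some unknown function (here x^2 on [0, 1])
samples = [(x / 20.0, (x / 20.0) ** 2) for x in range(21)]

H = 5                                             # hidden nodes
w1 = [random.uniform(-1, 1) for _ in range(H)]    # input -> hidden weights
b1 = [random.uniform(-1, 1) for _ in range(H)]    # hidden biases
w2 = [random.uniform(-1, 1) for _ in range(H)]    # hidden -> output weights
b2 = random.uniform(-1, 1)                        # output bias

rate = 0.5
for epoch in range(5000):
    for x, target in samples:
        h = [sigmoid(w1[i] * x + b1[i]) for i in range(H)]   # forward pass
        y = sum(w2[i] * h[i] for i in range(H)) + b2         # linear output node
        err = y - target                                     # error signal
        for i in range(H):                                   # backward pass
            grad = err * w2[i] * h[i] * (1.0 - h[i])         # chain rule through sigmoid
            w2[i] -= rate * err * h[i]
            w1[i] -= rate * grad * x
            b1[i] -= rate * grad
        b2 -= rate * err

for x in (0.0, 0.5, 1.0):
    h = [sigmoid(w1[i] * x + b1[i]) for i in range(H)]
    print(x, round(sum(w2[i] * h[i] for i in range(H)) + b2, 3))  # should be near x squared

The same "tweak parameters downhill on the error" idea applies to any parameterized function, which is why least-squares polynomial fitting and the like are close cousins of this.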

There are other techniques for training ANNs, and some techniques allow things like the topology of the network to be learned as well. The actual methods of learning found in real brains are apparently not well understood and still being actively researched.

I think that real facial recognition software uses more mathematically advanced techniques that are better suited to the problem domain, involving precomputed databases of facial features and measurements of those features, etc. The problem with neural networks is that they are very general, and more specialized techniques tend to do better in most fields. In some areas they are good enough, especially when the specific domain is not understood well enough to develop more specialized methods.
Quote: Original post by orryx
So I've recently been reading up on Neural Networks. From what I've read, it seems that the NN starts with the connections premade and then adjusts weights in order to learn. But I don't understand how these connections are originally formed. How does a node choose which other nodes to connect to? This boggles me especially in the biological sense, in that neurons must grow "intelligently" to make a new connection to another neuron.

Also, it seems that a NN only deals with one problem. For example, a NN might be able to recognize faces, but it seems that in a real-world setting there are so many inputs that the weights for recognizing faces would be screwed up by all of the other inputs that occur (e.g. recognizing cars).




Usually all the nodes have a link to all the nodes on the previous level. The weights are all set to random values to start. The training data is then run through in cycles to warp/correct the weights into giving the correct answers on the training data (back propagation). Usually the corrections are made smaller and smaller as the cycles go on, to prevent pendulum swings that never settle. There can also be such a thing as 'overtraining', which causes the NN to get locked too tightly into the training data and not generalize well when given other data.

Often the weights get stuck in a bad combination that is never resolved, and the whole process has to be repeated with a different initial random arrangement.
The larger the NN is, the more likely it is to become dysfunctional (overtrained in some sections while undertrained in others).
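As a sketch of two of those ideas (shrinking the corrections over time, and stopping before the net overtrains by watching a held-out validation set), something like the following. Note that train_one_pass and error_on are hypothetical hooks you would write for your own network, and the constants are arbitrary.

def anneal_with_early_stop(net, train_set, validation_set,
                           train_one_pass, error_on,
                           rate=0.5, decay=0.99, patience=20):
    # train_one_pass(net, data, rate): one backprop cycle over the training set
    # error_on(net, data): average error on a data set the net was NOT trained on
    best_err, stale = float("inf"), 0
    while stale < patience:
        train_one_pass(net, train_set, rate)
        rate *= decay                      # smaller and smaller corrections
        err = error_on(net, validation_set)
        if err < best_err:
            best_err, stale = err, 0       # still generalising: keep going
        else:
            stale += 1                     # validation error stopped improving
    return net                             # stop before the net overtrains

If training still ends up stuck, you throw the weights away and restart with a different random initialisation, as described above.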



--------------------------------------------
Ratings are Opinion, not Fact
If you've got computational time up your sleeve, there are a variety of methods for growing and pruning the structure of ANNs. The general principle is to perform a search in the structure space and, for each candidate, perform an error gradient search in the parameter space. Evaluate that candidate on the data set and use the result to guide the search in the structure space. Unfortunately, because there is usually a nonlinear coupling between structures and parameters, this isn't an easy problem to solve. A heuristic approach is to start with an oversized, fully connected network and then to prune nodes and arcs based on their significance to the training and test errors. If removing a node or an arc reduces the error, then accept this removal and test another. You'd probably want to run an ensemble of evaluations to test different sequences of removing arcs/nodes. If you find that, irrespective of sequence, you're arriving at the same or a similar structure, then you can be reasonably assured that this is an appropriate structure for the training data. Unfortunately, this is rarely the case on real-world data sets. ;)
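Roughly, in Python-flavoured pseudocode, that pruning heuristic looks like the sketch below. The candidates/remove/restore/retrain/test_error names are hypothetical hooks for whatever network representation you use, and a real implementation would also roll back the retrained weights when a removal is rejected.

import random

def prune(net, candidates, remove, restore, retrain, test_error):
    # candidates(net): list of removable nodes/arcs
    # remove/restore(net, item): take an item out of the structure, or put it back
    # retrain(net): re-fit the remaining weights (e.g. by backpropagation)
    # test_error(net): error on a held-out test set
    baseline = test_error(net)
    order = candidates(net)
    random.shuffle(order)            # one removal sequence; repeat for an ensemble
    for item in order:
        remove(net, item)
        retrain(net)
        err = test_error(net)
        if err <= baseline:
            baseline = err           # removal didn't hurt: accept it
        else:
            restore(net, item)       # removal hurt: put it back
    return net

Running this several times with different shuffles and comparing the surviving structures is the "ensemble of evaluations" mentioned above.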

Cheers,

Timkin
Ok...

So in ANNs generally connections are preset to all nodes in the previous layer. Do they have any idea how this happens in the brain?

And my second question: can a single NN be used for more than one type of pattern recognition? E.g. a NN that could give appropriate responses to recognizing a bird vs. a plane and a car vs. a bus?
An NN isn't so much a simulation of a brain at all. It is a function approximator, often used in control theory problems.

You give it a set of inputs, train it, and then hope to get decent results on your real data set based on the results from your training set.

A single NN can recognize several different patterns (e.g. one that can tell you what number is displayed in a 16x16 pixel image). The issue comes when you have too few inputs: everything looks alike and you don't get results (a 1x1 image is useless). But if you have too much detail (a 2^12 x 2^12 image), then you have a hard time training the NN to recognize the patterns. So, with a correctly sized image, you are now limited by how similar two images are to each other, and by how you made your inputs.

So a NN that recognizes bird vs. plane probably can't recognize car vs. bus, but you could train a NN to tell you if it is a bird, plane, bus or car. At that point, though, you still have to retrain it if you want to add another type of image for it to recognize.
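The usual trick (just a sketch, with made-up class names) is to give the network one output node per class and pick whichever activation is largest; adding a new class means adding an output node and retraining, as described above.

CLASSES = ["bird", "plane", "car", "bus"]

def classify(outputs):
    # outputs: one activation per class, taken from the network's output layer
    best = max(range(len(CLASSES)), key=lambda i: outputs[i])
    return CLASSES[best]

print(classify([0.1, 0.7, 0.3, 0.2]))   # -> "plane"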
Quote: Original post by orryx
So in ANNs generally connections are preset to all nodes in the previous layer. Do they have any idea how this happens in the brain?

If you've never been told the following, you can be forgiven for associating ANNs and brains... "the only thing that artificial neural networks have in common with brains is that they are both parallel architectures". That's it. End of story. You should never make the mistake of thinking that ANNs tell us something about brains or that brains tell us something about ANNs. Even recurrent ANNs have little in common with brains.

As for the brain's wiring architecture: neurons are not connected in layers. They are wired in cliques; that is, most connections are local, forming highly parallel local subsystems, while some connections produce coupling of varying degree to external cliques, producing highly parallel global systems. The brain has a central router (the corpus callosum), but not all connections go through this router; many are direct couplings. Computations of certain types are localised within the brain (functional MRI tells us this), and this often depends on which sensors are involved in the information gathering (for example, your eyes are wired directly into your occipital lobe, the region of brain near the back of your head, and processing of visual stimuli occurs in this region). Language processing occurs in a small region of the temporal lobe (near its junction with the parietal lobe).

ANNs on the other hand are (for the most part) a single system. While there has been successful application of clique-based ANNs (most notably in the Creatures series of games and the application of the same technology to reinforcement learning problems), the complexity of these systems is often not warranted for the problems we want to solve (we don't generally need very robust networks capable of handling very broad, uncertain domains with many variables).

As for how the brain gets wired the way it is... part of the story is known (connectivity appears to be a function of use of that region and the complexity of information processing), while some of it is unknown (e.g. what forces particular connections to be made or discarded). We know that connectivity does increase and decrease at various stages of our life (for example, around the ages of 9-12 some parts of your brain are shrinking while others are growing).


Quote:
And my second question: can a single NN be used for more than one type of pattern recognition? E.g. a NN that could give appropriate responses to recognizing a bird vs. a plane and a car vs. a bus?


Yes, although the accuracy will decline with the number of disjoint partitions in the decision space. In English, that means that the more you generalise a network to handle broader decision problems, the less specialised the network is on any given decision problem. So, yes, you might be able to get it to choose between birds and planes and between cars and buses, but sometimes it might classify a plane as a bus, or perhaps a plane as a large bird!

