
Arbitrary values in Neural Networks

Started by June 20, 2002 10:44 AM
27 comments, last by Cedric
quote: Original post by fup
Hello everyone, I'm back from my biking holiday. Still in one piece! (although nothing else is ;0))


Mmm... biking... (lucky you : )

quote: Original post by fup
To date, there is *no* way to determine the number of hidden neurons except by trial and error (I include GAs and similar search techniques here). See the neural network 'bible' (Neural Networks for Pattern Recognition - Bishop) for confirmation. Also, Warren Sarle has a decent discussion about this in the comp.ai.neural-nets FAQ.


Don't remind me : ) Though I find this method to be quite usable (especially with only one hidden layer). I would say, however, that it is not "trial and error" but more of a systematic search -> not unlike any other search. There are criteria for when you have gone too far and when you have not -> I just find quantifying generality to be difficult at times.

quote: Original post by fup
Each problem you tackle with a NN will usually require a different architecture. If the networks are static then the optimal topology is discovered by repeated experimentation by hand. Usually you start off with too many units and reduce them, taking care that underfitting is avoided. If you have too many units your network will overfit and lose the ability to generalize. Too few, and it won't learn. The choice is not arbitrary.


I personally find that keeping underfitting from occurring is simple enough (just make sure it learns : ) But it is finding a decent number of neurons on the other end of the spectrum, without overfitting, that is difficult. (Hence the hurting of my poor little brain mentioned in an earlier post : )

I was wondering, fup, if you had ever read "Neural Smithing" by Reed & Marks. If you have, I was wondering what you thought of the chapter on Genetic Algorithms and Neural Networks (chapter 11). They mention that two successful individuals do not always yield a successful offspring; their bit-strings might represent points on two different local maxima and the combination might fall in a valley between them.

- mongrelprogrammer
- I hate these user ratings. Please rate me down. (Seriously) -
Ithyl_Chantresonge: Thanks for clarifying that. You are quite correct, a GA is not guaranteed to find the best solution. They are excellent at reducing the search space, however. With respect to ANNs, it has been demonstrated that the ideal weights may be found (if you have a training set) by searching first with a GA and then by tweaking the weights using gradient descent (backprop, RPROP, etc.). This way you get the best of both worlds.
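
(Editor's sketch: a minimal Python illustration of the hybrid search fup describes - a coarse GA first, then gradient descent to fine-tune. The loss function, constants, and names like ga_search and gradient_refine are invented for the example, not taken from the thread.)

import random

TARGET = [0.5, -1.2, 2.0]   # hypothetical 'ideal' weights the search must find

def loss(w):
    # Toy stand-in for a network's training error: squared distance to TARGET.
    return sum((x - t) ** 2 for x, t in zip(w, TARGET))

def mutate(w, rate):
    return [g + random.gauss(0, 1) if random.random() < rate else g for g in w]

def ga_search(pop_size=50, generations=100, mut_rate=0.1):
    # Coarse global search: evolve weight vectors toward low loss.
    pop = [[random.uniform(-5, 5) for _ in TARGET] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=loss)
        parents = pop[: pop_size // 5]            # keep the best 20%
        pop = parents + [mutate(random.choice(parents), mut_rate)
                         for _ in range(pop_size - len(parents))]
    return min(pop, key=loss)

def gradient_refine(w, lr=0.1, steps=200):
    # Local fine-tuning: plain gradient descent on the same loss.
    for _ in range(steps):
        grad = [2 * (x - t) for x, t in zip(w, TARGET)]
        w = [x - lr * g for x, g in zip(w, grad)]
    return w

best = gradient_refine(ga_search())
print(best, loss(best))   # should land very close to TARGET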

PS. Where on earth did you get the inspiration for your name?

cedricl:
quote: So in this sense, if it has sufficient time, it will try all the possible combinations and find the best one eventually.


It is extremely unlikely that a GA will traverse the entire search space. More often than not the population will converge prematurely at one or more local optima.




mongrelprogrammer: Yes, I guess ‘trial and error’ isn’t the best choice of words. Systematic search is a much better description. My apologies for the confusion.

Funnily enough I was just skimming through the Neural Smithing book the other day. Good book. Their section on GAs and ANNs is pretty sparse though and only provides a brief overview. The problem you describe - that two successful individuals do not always yield a successful offspring - is called the ‘competing conventions’ problem and is a bit of a headache for GANN researchers. In fact, because of it, many researchers have shunned the crossover operator altogether, relying entirely on mutation. This problem is hard to describe without visual aids, but basically it occurs when the encoding allows several different ways of representing networks that exhibit identical functionality. When this happens the offspring tend to have duplicated neurons - which in turn leads to a loss of functionality. Preventing this from happening involves constructing an extremely robust crossover operator. I’m sure if you do a search on Google for ‘competing conventions problem’ or maybe ‘structural-functional mapping problem’ you’ll find an explanation with some useful diagrams.
(My tutorial, btw, ignores this problem altogether, NNs are complicated enough for the beginner without confusing the issue!).
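
(Editor's sketch: the competing conventions problem in miniature. Two parents encode the same two-hidden-neuron network in opposite orders; naive one-point crossover then duplicates one neuron and loses the other. Purely illustrative Python - the neuron labels are made up.)

import random

# Parent B computes the same function as parent A; its hidden neurons are
# simply listed in the opposite order (a 'competing convention').
parent_a = ["edge_detector", "corner_detector"]
parent_b = ["corner_detector", "edge_detector"]

def one_point_crossover(a, b):
    cut = random.randrange(1, len(a))
    return a[:cut] + b[cut:]

child = one_point_crossover(parent_a, parent_b)
print(child)   # -> ['edge_detector', 'edge_detector']: one neuron duplicated,
               #    one lost, although both parents were equally (and fully) fit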




quote: Original post by fup
It is extremely unlikely that a GA will traverse the entire search space. More often than not the population will converge prematurely at one or more local optima.

Yes... But let's say I have n bits in my GA string, and each bit has a probability k of being mutated randomly; then the probability of every bit flipping at once (producing an entirely new string) is k^n. So, eventually, after a very, very long while, the entire search space should be covered.

Isn't that right? I know it's a limit case, it's not realistic, and it defeats the whole purpose of GAs, but doesn't it mean that the GA will eventually find the best solution (if the search space isn't infinite...)?
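
(Editor's note: a quick toy calculation of how long 'eventually' is. The figures below are invented for illustration.)

# Worst case: reaching a target string that differs from the current one in
# all n bits needs every bit to mutate in the same step, probability k^n.
n, k = 32, 0.01            # toy values: 32-bit string, 1% per-bit mutation
p_jump = k ** n
print(p_jump)              # 1e-64
print(1 / p_jump)          # expected steps to wait: ~1e64 generations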

Cédric
Yes, that's right. The space will be searched very slowly. In a practical situation, though, you are going to want the space to be traversed efficiently and in a reasonable amount of time. If the search space is large then relying on just mutation to do this would be unwise. What Ithyl_Chantresonge is saying is that there are other techniques (like simulated annealing) available that will do a much better job.
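
(Editor's sketch: simulated annealing, the technique fup mentions, on a toy bit-string problem. Everything here - the objective, the cooling schedule, the constants - is illustrative.)

import math, random

def fitness(bits):
    return sum(bits)            # toy objective: maximise the number of 1s

def anneal(n=32, steps=5000, t0=2.0):
    s = [random.randint(0, 1) for _ in range(n)]
    for step in range(steps):
        t = t0 * (1 - step / steps) + 1e-9     # linear cooling schedule
        cand = s[:]
        cand[random.randrange(n)] ^= 1         # flip one random bit
        delta = fitness(cand) - fitness(s)
        # Always accept improvements; accept worse moves with a probability
        # that shrinks as the temperature falls.
        if delta >= 0 or random.random() < math.exp(delta / t):
            s = cand
    return s

print(fitness(anneal()))       # should be at or near 32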





What about Evolutionary Programming or Evolution Strategies, which typically lack crossover completely and rely solely on mutation?

-Kirk
I was addressing the question from the point of view of excluding the crossover operator from a standard GA. That normally leaves you with a very low mutation rate and an inappropriate selection operator. Hence my comments.

As you say, evolutionary algorithms that omit crossover are often used. This is typically because the crossover operator is difficult to define or is just plain destructive. Also, many researchers believe that the crossover operator plays an insignificant role anyway - I've seen lots of discussion about this in newsgroups and on other forums. It is accepted, though, that although evolutionary algorithms are great at quickly finding the general area of the global maximum, they are poor at focusing the search from that point on. This is why hybrid solutions are becoming popular at the moment.

Generally speaking, most mutation-only algorithms tend to kill off 80% of the population and create offspring by mutating the best 20%. The mutation rate is usually set higher than for a standard GA.
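
(Editor's sketch: one generation of the mutation-only scheme just described - truncate to the best 20%, refill the other 80% with mutated copies. The objective and constants are placeholders.)

import random

def fitness(genome):
    return -sum(g * g for g in genome)     # toy objective: best at all-zeros

def step(pop, sigma=0.5):
    pop.sort(key=fitness, reverse=True)
    survivors = pop[: len(pop) // 5]       # the best 20% live on
    children = [[g + random.gauss(0, sigma) for g in random.choice(survivors)]
                for _ in range(len(pop) - len(survivors))]   # mutated copies
    return survivors + children

pop = [[random.uniform(-3, 3) for _ in range(4)] for _ in range(50)]
for _ in range(100):
    pop = step(pop)
print(max(fitness(g) for g in pop))        # approaches 0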


[edited by - fup on June 24, 2002 9:20:22 AM]
Gotcha. I agree that the standard GA doesn't perform well without crossover. Not too surprising considering the theory of schema as the foundation.

On another note (sorry to change the direction of the thread, but my curiosity is getting the best of me): biological principles have shown that gene duplication, genome expansion/contraction, and chromosomal rearrangement are the cornerstones of adaptation. Have you seen much application of these to modern Evolutionary Computation? Timkin, any thoughts there?

-Kirk
I imagine any frequenters of this forum are as curious as cats! It's what keeps us going...

I'm doing work at the moment based on variable-length genomes. They are pretty easy to use and work remarkably well in the right situation. For example, I use them to evolve neural network topology - the bigger the net, the bigger the genome. Using fixed-length genomes you quickly run into the 'scalability' problem (GAs applied to large chromosomes perform very poorly). They are fairly commonly used nowadays, I'd say.
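
(Editor's sketch: one plausible variable-length genome for network topology - a list with one entry of input weights per hidden neuron, so the genome grows and shrinks with the net. This illustrates the idea; it is not fup's actual representation.)

import random

def random_neuron(n_inputs):
    return [random.uniform(-1, 1) for _ in range(n_inputs)]

def grow(genome, n_inputs):
    genome.append(random_neuron(n_inputs))        # add one hidden neuron

def shrink(genome):
    if len(genome) > 1:                           # keep at least one neuron
        genome.pop(random.randrange(len(genome)))

genome = [random_neuron(3) for _ in range(2)]     # start with 2 hidden units
grow(genome, 3)
print(len(genome))   # -> 3: topology and genome length change together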

I'm not really qualified to answer the rest of your question properly though, as my biology background is not what I'd like it to be. What do you mean by chromosomal rearrangement? I could hazard a guess but no doubt I'd be wrong.
Howdy.

Here's a list of what I was referring to in my last post (a rough code sketch of all three follows the list).

gene duplication
It has been shown (suggested?) that during the course of evolution some portions of a particular chromosome are broken off and reannealed at a different location. Since our chromosomes come in pairs, we then effectively have duplicated that gene elsewhere and now work with two copies. Random mutations to one have no effect on the other and no effect on the pathway in which that gene functions. The result is that we get modifications of a theme. I can supply an example if you're interested.

genome expansion/contraction
The duplication above is an example of genome expansion. Now let's say this happens a few thousand times and we get lots of junk all over. Now, instead of duplicating a gene, let's just kill off a chunk of DNA. As long as it isn't necessary for survival, we've effectively reduced the genome. Viruses are probably the kings of this domain.

chromosomal rearrangement (or translocation)
Suppose you have a particular gene located at a particular position on a chromosome. This particular position provides the appropriate controls for the gene of interest. Now, let's shuffle the chromosome around or mix bits of different chromosomes. The result is that the same gene is under different control. Another example is the genes coding for antibodies. They seem to undergo a process of continuous random shuffling, which gives us the huge array of unique antibodies to fight disease. The gene duplication above might be an example of chromosomal rearrangement, but one involving duplication instead of just repositioning.
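
(Editor's sketch: the three mechanisms above translated, very loosely, into operators on a list genome. This is an analogy in code, not an implementation from the EC literature.)

import random

def duplicate(genome):
    # Gene duplication: copy a random slice and reinsert it elsewhere.
    i, j = sorted(random.sample(range(len(genome) + 1), 2))
    at = random.randrange(len(genome) + 1)
    return genome[:at] + genome[i:j] + genome[at:]

def contract(genome):
    # Genome contraction: drop a random slice (but never everything).
    i, j = sorted(random.sample(range(len(genome) + 1), 2))
    return (genome[:i] + genome[j:]) or genome

def translocate(genome):
    # Chromosomal rearrangement: move a slice to a new position.
    i, j = sorted(random.sample(range(len(genome) + 1), 2))
    chunk, rest = genome[i:j], genome[:i] + genome[j:]
    at = random.randrange(len(rest) + 1)
    return rest[:at] + chunk + rest[at:]

g = list("ABCDEFGH")
print("".join(translocate(duplicate(g))))   # e.g. 'ABFGHCDECD'-style shuffles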

I hope that clarifies. My background is Physiology, so if I've assumed something, please let me know.

-Kirk


[edited by - KirkD on June 24, 2002 1:54:15 PM]

This topic is closed to new replies.
