Initial values for neural network
Hi,
I'm implementing a general NeuralNetwork class in C++. The problem is: to what values should I initialize all my variables?
In other words, which value should I choose (initially) for:
- my weights (I know this doesn't really matter, but what is best?)
- the learning rate (I somewhere read 0.3 was standard)
- the momentum (0.7 ??)
And how many neurons should my hidden layer have if I have, say, 60 inputs and 10 outputs?
Thanks,
Edo
I'm sorry to say that there's no standard magic setup that will solve every problem. As you already know, it depends on what "problem" you're trying to solve.
But in general one sets the weights to small random values (drawn from a uniform or Gaussian distribution, for example).
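A minimal sketch of that in C++ (the range [-0.1, 0.1] is just a guess; tune it for your problem):

    #include <random>
    #include <vector>

    // Fill a weight vector with small random values.
    // Uniform here; swap in std::normal_distribution for Gaussian.
    void initWeights(std::vector<double>& weights)
    {
        static std::mt19937 gen(std::random_device{}());
        std::uniform_real_distribution<double> dist(-0.1, 0.1);
        for (double& w : weights)
            w = dist(gen);
    }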
The learning rate and momentum are not that easy to determine; you may have to experiment to find the best values.
Note that the momentum term is usually not kept fixed: it starts at some value, say 0.9, and decreases as the training progresses.
There are algorithms that modify the momentum as a function of time.
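As an illustration only, here is a simple linear decay schedule (the start and end values are assumptions, not a standard):

    // Linearly decay momentum from 0.9 to 0.5 over the training run.
    // Start/end values are assumed; tune them for your problem.
    double momentumAt(int epoch, int maxEpochs)
    {
        const double start = 0.9, end = 0.5;
        return start - (start - end) * static_cast<double>(epoch) / maxEpochs;
    }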
Another possibility is to use a genetic algorithm to evolve the best network setup.
Anyway, what type of problem are you trying to solve?
/Mankind gave birth to God.
Well,
I'm building a standard class that can generate *any* kind of neural network with all kinds of dimensions. I'm thinking of testing it on recognition of numbers presented in a noisy pattern. Debugging is taking a lot of my time (the network already runs, but I want to know whether it works correctly before I test it).
Thanks anyway,
Edo
Hmm, for your information, one rule of thumb to use when determining the number of hidden neurons is:
numHidden = numTrainingExamples / (10 * (numInputs + numOutputs))
which Widrow came up with in 1987.
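To make that concrete with Edo's 60 inputs and 10 outputs (the training-set size of 7000 is a made-up figure, just for the arithmetic):

    // Widrow's rule of thumb; 7000 training examples is an assumed number.
    int numInputs = 60, numOutputs = 10, numExamples = 7000;
    int numHidden = numExamples / (10 * (numInputs + numOutputs));
    // = 7000 / (10 * 70) = 10 hidden neurons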
/Mankind gave birth to God.
There is another rule of thumb for initializing the weights: a neuron's input weights should be smaller the more inputs it has (w := (1.0 / numInputs) * randomNumber).
And the weights should never all be identical at initialization time, because then you run into symmetry problems: the neurons compute the same output, receive the same updates, and can never differentiate.
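A quick sketch of that rule in C++ (the function name is mine, just for illustration):

    #include <random>
    #include <vector>

    // Scale each weight by 1/fan-in, so neurons with many inputs
    // start with proportionally smaller weights.
    std::vector<double> initNeuronWeights(int numInputs)
    {
        static std::mt19937 gen(std::random_device{}());
        std::uniform_real_distribution<double> dist(-1.0, 1.0);
        std::vector<double> w(numInputs);
        for (double& wi : w)
            wi = dist(gen) / numInputs;  // random, so never all identical
        return w;
    }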
@$3.1415rin
It is common to set neural network weights (and biases, if present) to "small enough" random values, distributed normally or uniformly within a range that depends on the number of neurons in the layer, say [-0.01, 0.01] for a layer with 10 neurons.
Initial values that are too small, however, can cause the network to get stuck around the local minimum at the zero point.
There are also techniques to initialize the starting weights in a more intelligent way, such as the Nguyen-Widrow rule. I don't remember the formulas exactly, but the idea is to heuristically pick a starting point that is probably close to the minimum we are searching for.
Anyway, random values are the way to go. If they are not quite good enough, learning will simply take a few more steps, especially if you use adaptive learning-rate adjustment or a more sophisticated training technique such as conjugate gradients, quickprop, second-order methods, or delta-bar-delta (DBD).
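From memory, the Nguyen-Widrow idea looks roughly like the sketch below: initialize randomly, then rescale each hidden neuron's weight vector to a norm of beta = 0.7 * H^(1/I), where H is the number of hidden neurons and I the number of inputs. Double-check the exact formula in the original paper before relying on it.

    #include <cmath>
    #include <random>
    #include <vector>

    // Rough sketch of Nguyen-Widrow initialization (reconstructed
    // from memory; verify the formula before relying on it).
    void nguyenWidrowInit(std::vector<std::vector<double>>& hiddenWeights,
                          int numInputs)
    {
        static std::mt19937 gen(std::random_device{}());
        std::uniform_real_distribution<double> dist(-0.5, 0.5);
        const double beta =
            0.7 * std::pow(hiddenWeights.size(), 1.0 / numInputs);
        for (auto& w : hiddenWeights)  // one weight vector per hidden neuron
        {
            double norm = 0.0;
            for (double& wi : w) { wi = dist(gen); norm += wi * wi; }
            norm = std::sqrt(norm);
            for (double& wi : w)
                wi *= beta / norm;     // rescale the vector to norm beta
        }
    }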
Halloween.