More than three layers in FF-ANNs
Hi,
So basically, three layers can be shown to be theoretically maximally expressive for feed-forward ANNs. Using four or more layers doesn't decrease this expressivity, of course, but it doesn't increase it either.
However, I've seen networks with four or more layers; is there a reason for this? Intuitively, the extra non-linearity introduced by a high number of layers would seem to just make learning more difficult; wouldn't it be better to stick with three layers and simply increase the number of neurons?
Thanks,
-- Mikko
See CCN (Cascade-Correlation Networks) and the two-spirals problem and you'll see a reason to have more than one hidden layer of neurons ;)
Greetings!
Vicente
>>> you'll see a reason to have more than one hidden layer of neurons
Ah, sorry, the definition of "layer" isn't totally consistent. I meant four neuron layers, corresponding to two hidden neuron layers, corresponding to three weight layers. This is known to be maximally expressive.
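In code terms, the counting I mean is roughly this (a throwaway sketch; the sizes are made up):

```python
import numpy as np

# Four neuron layers = input + two hidden + output
# = three weight layers (one matrix per gap between neuron layers).
sizes = [2, 8, 8, 1]                    # neuron layers: in, hidden, hidden, out
rng = np.random.default_rng(0)
weights = [rng.standard_normal((m, n))  # one weight matrix per weight layer
           for n, m in zip(sizes, sizes[1:])]
print(len(weights))                     # -> 3 weight layers

x = np.array([0.5, -1.0])
for W in weights[:-1]:
    x = np.tanh(W @ x)                  # the two non-linear hidden layers
y = weights[-1] @ x                     # linear output layer
```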
Still, thanks for the tip. I'm currently reading a book on ANNs; it should get to CCN soon.
-- Mikko
Hi,
(Vicente again).
The problem with training one hidden layer with lots of neurons is that usually ALL the neurons try to solve the problem at the same time, so it takes a long time to converge (that's why backprop needs so many training iterations).
CCN, for example, only allows one neuron per hidden layer; that neuron acts as a feature detector for the problem, so the problem is solved by chaining hidden layers.
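Roughly, the topology CCN grows looks like this (just a sketch with random weights to show the wiring; the real algorithm trains each candidate neuron to correlate with the remaining error before freezing it):

```python
import numpy as np

# Cascade topology sketch: each new hidden neuron sees the raw inputs
# PLUS the outputs of every earlier hidden neuron, so the net deepens
# one feature detector at a time. Weights here are random, for shape only.
rng = np.random.default_rng(0)

def cascade_forward(x, n_hidden=5):
    feats = list(x)                                   # start from the raw inputs
    for _ in range(n_hidden):
        w = rng.standard_normal(len(feats))
        feats.append(np.tanh(w @ np.asarray(feats)))  # one neuron per hidden "layer"
    w_out = rng.standard_normal(len(feats))
    return w_out @ np.asarray(feats)                  # the output sees everything

print(cascade_forward(np.array([0.3, -0.7])))
```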
It has its pros and its cons ;)
Greetings!
Vicente