Here is how I understand the algorithm:
1 Start with an input and the desired output.
2 Each input flows into each neuron in the first layer.
3 For each neuron, solve for e = sum x_i * w_i for i = 1 to num inputs, where w are the connection weights.
4 For each neuron, solve y = s(e), where s is a sigmoid function.
5 Repeat 3 and 4 for each subsequent layer, passing the y values on as the next layer's input.
6 Solve for the output error E = desired output - y, where y is the final layer's output.
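7 For each neuron in the current layer (starting with the output layer), solve for error as s'(e) * E, where E is the error term handed to that neuron.
8 For each of the layer's inputs j, propagate sum error_i * w_ij for i = 1 to num neurons in the layer; these sums become E for the previous layer.
9 Repeat 7 and 8 for all previous layers.
10 For each layer, update all weights per w_ij += error_i * a * y_j, where a is the learning rate and y_j is the layer's j-th input.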
OK, that's a pretty rough description, but hopefully someone's followed along!
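To make the steps above concrete, here is a minimal sketch of steps 3, 4, 6, 7 and 10 for a single sigmoid neuron. It is not taken from the network code in this post: the function names (sigmoid, forward, trainStep) and the parameters are made up for illustration, and it assumes the usual identity s'(e) = y * (1 - y) for the logistic sigmoid.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Logistic sigmoid s(e) = 1 / (1 + exp(-e)).
double sigmoid(double e) { return 1.0 / (1.0 + std::exp(-e)); }

// Steps 3 and 4: weighted sum of the inputs, then the sigmoid.
double forward(const std::vector<double>& x, const std::vector<double>& w)
{
    double e = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i)
        e += x[i] * w[i];
    return sigmoid(e);
}

// Steps 6, 7 and 10 for one output neuron (hypothetical helper).
// Returns the output error E measured before the weights are changed.
double trainStep(const std::vector<double>& x, std::vector<double>& w,
                 double target, double rate)
{
    const double y = forward(x, w);
    const double E = target - y;             // step 6: desired - actual
    const double delta = y * (1.0 - y) * E;  // step 7: s'(e) * E
    for (std::size_t i = 0; i < w.size(); ++i)
        w[i] += rate * delta * x[i];         // step 10: w_i += error * a * x_i
    return E;
}
```

Calling trainStep repeatedly on the same sample with a small rate should shrink the returned error a little each time; if it grows instead, the sign of E or of the update is a likely suspect.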
It's kind of a lot of code, so I'm only posting the core of it; maybe someone will spot some glaring errors. Any help is appreciated.
First, the train method on the NeuralNetwork class:
void NeuralNetwork::train( vector<TrainingPoint> set, uint iterations )
{
    //dumbly iterate x times
    for( int i = 0; i < iterations; i++ )
    {
        cout << "==================================" << endl;

        //iterate over each input/output pair in the training set
        for( int j = 0; j < set.size(); j++ )
        {
            int k;

            //forward step--training input goes into the first layer
            //previous layer's output goes into each subsequent layer
            for( k = 0; k < layers.size(); k++ )
            {
                layers[ k ].update( k == 0 ? set[ j ].inputs : layers[ k - 1 ].output );
            }

            vector<double> signal;

            //calculate output error: e = desired - actual
            for( k = 0; k < set[ j ].outputs.size(); k++ )
            {
                signal.push_back( set[ j ].outputs[ k ] - layers[ layers.size() - 1 ].output[ k ] );
            }

            //propagate the error back through the network.
            //each layer calculates all its neuron's errors
            for( k = layers.size() - 1; k > -1; k-- )
            {
                layers[ k ].propagateError( k == layers.size() - 1 ? signal : layers[ k + 1 ].signal );
            }

            //step back through and update the weights accordingly
            for( k = 0; k < layers.size(); k++ )
            {
                layers[ k ].updateWeights( k == 0 ? set[ j ].inputs : layers[ k - 1 ].output );
            }

            cout << "target: " << set[ j ].outputs[ 0 ] << " output: " << layers[ layers.size() - 1 ].output[ 0 ] << endl;
        }
    }
}
Next, the update, propagateError and updateWeights methods on the NeuralLayer class:
//solve for all neuron activations
//nn is num neurons, ni is num inputs (equal to previous layer's num neurons)
void NeuralLayer::update( const vector<double> input )
{
    for( int i = 0; i < nn; i++ )
    {
        double x = 0.0f;

        for( int j = 0; j < ni; j++ )
        {
            x += input[ j ] * weights[ i ][ j ];
        }

        output[ i ] = 1.0f / ( 1.0f + exp( -x ) );
    }
}

//solve for each neuron's error given the following layer's propagated error terms
//and produce those same terms for the previous layer
void NeuralLayer::propagateError( const vector<double> propagation )
{
    int i;

    for( i = 0; i < nn; i++ )
    {
        error[ i ] = output[ i ] * ( 1.0f - output[ i ] ) * propagation[ i ];
    }

    for( i = 0; i < ni; i++ )
    {
        signal[ i ] = 0.0f;

        for( int j = 0; j < nn; j++ )
        {
            signal[ i ] += weights[ j ][ i ] * error[ j ];
        }
    }
}

void NeuralLayer::updateWeights( const vector<double> input )
{
    for( int i = 0; i < nn; i++ )
    {
        for( int j = 0; j < ni; j++ )
        {
            weights[ i ][ j ] += error[ i ] * rate * input[ j ];
        }
    }
}
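The class declarations are not part of what I posted, so for anyone who wants to compile the three layer methods in isolation, here is a rough self-contained sketch. SketchLayer and trainOnce are hypothetical names. The member declarations are assumptions, in particular that the weights are stored per neuron as weights[ i ][ j ] (input j into neuron i) and that signal has one entry per input. There is no bias term, matching the methods above.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for NeuralLayer; member names mirror the post,
// but the declarations themselves are assumptions.
struct SketchLayer
{
    std::size_t nn;                            // number of neurons
    std::size_t ni;                            // number of inputs
    double rate;                               // learning rate
    std::vector<std::vector<double>> weights;  // weights[i][j]: input j -> neuron i
    std::vector<double> output, error, signal;

    SketchLayer(const std::vector<std::vector<double>>& w, double learningRate)
        : nn(w.size()), ni(w.empty() ? 0 : w[0].size()), rate(learningRate),
          weights(w), output(nn, 0.0), error(nn, 0.0), signal(ni, 0.0) {}

    // Forward pass: output[i] = sigmoid(sum_j input[j] * weights[i][j]).
    void update(const std::vector<double>& input)
    {
        for (std::size_t i = 0; i < nn; ++i)
        {
            double x = 0.0;
            for (std::size_t j = 0; j < ni; ++j)
                x += input[j] * weights[i][j];
            output[i] = 1.0 / (1.0 + std::exp(-x));
        }
    }

    // Backward pass: per-neuron error, then the terms handed to the previous layer.
    void propagateError(const std::vector<double>& propagation)
    {
        for (std::size_t i = 0; i < nn; ++i)
            error[i] = output[i] * (1.0 - output[i]) * propagation[i];
        for (std::size_t j = 0; j < ni; ++j)
        {
            signal[j] = 0.0;
            for (std::size_t i = 0; i < nn; ++i)
                signal[j] += weights[i][j] * error[i];
        }
    }

    // Weight update: w_ij += error_i * rate * input_j.
    void updateWeights(const std::vector<double>& input)
    {
        for (std::size_t i = 0; i < nn; ++i)
            for (std::size_t j = 0; j < ni; ++j)
                weights[i][j] += error[i] * rate * input[j];
    }
};

// One training step on a single sample (hypothetical helper).
// Returns the summed |target - output| measured before the update.
double trainOnce(std::vector<SketchLayer>& layers,
                 const std::vector<double>& in, const std::vector<double>& target)
{
    for (std::size_t k = 0; k < layers.size(); ++k)
        layers[k].update(k == 0 ? in : layers[k - 1].output);

    std::vector<double> sig;
    double total = 0.0;
    for (std::size_t k = 0; k < target.size(); ++k)
    {
        sig.push_back(target[k] - layers.back().output[k]);
        total += std::fabs(sig.back());
    }

    // Walk the layers from last to first without letting the index go negative.
    for (std::size_t k = layers.size(); k-- > 0;)
        layers[k].propagateError(k + 1 == layers.size() ? sig : layers[k + 1].signal);

    for (std::size_t k = 0; k < layers.size(); ++k)
        layers[k].updateWeights(k == 0 ? in : layers[k - 1].output);

    return total;
}
```

Feeding one fixed sample through trainOnce a few hundred times is a cheap sanity check: with a small rate the returned error should fall steadily. That only shows the update moves in the right direction on one point, not that the network can learn a full training set.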