
Back Propagation Error

Started August 29, 2010 06:51 PM
0 comments, last by newtomaths 14 years, 5 months ago
I'm trying to write a generic neural network library and things aren't working out too well. I must be missing something--I understand the concept, but my output is way off. I'm just trying to teach the network XOR.

Here is how I understand the algorithm:

1. Start with an input vector and its desired output.
2. Each input flows into each neuron in the first layer.
3. For each neuron, compute e = sum of x_i * w_i for i = 1 to num inputs, where the w_i are that neuron's connection weights.
4. For each neuron, compute y = s(e), where s is a sigmoid function.
5. Repeat 3 and 4 for each subsequent layer, passing the y values on as the next layer's inputs.
6. Compute the output error E = desired output - y, where y is the final layer's output.
7. For each neuron in the current layer (starting with the output layer), compute its error term as error = s'(e) * E, where E is the error signal arriving at that neuron; for the sigmoid, s'(e) = y * (1 - y).
8. Propagate back: the signal for input j of this layer is the sum of error_i * w_ij over the layer's neurons, and it becomes E for the previous layer.
9. Repeat 7 and 8 for all previous layers.
10. For each layer, update all weights per w_ij += error_i * a * y_j, where a is the learning rate and y_j is the input that came through that weight. (A concrete sketch of these steps in code follows the list.)
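
For reference, here is a minimal, self-contained sketch of those same steps for a fixed 2-2-1 network learning XOR. This is plain C++, not my library; all the names are made up, and I've added bias weights, which the steps above leave out (XOR is more or less unlearnable by a sigmoid net without them):

#include <cmath>
#include <cstdio>

double s( double e ) { return 1.0 / ( 1.0 + std::exp( -e ) ); }

int main()
{
	// weights[ neuron ][ input ]; the last entry in each row is the bias weight
	double w1[ 2 ][ 3 ] = { { 0.5, -0.4, 0.1 }, { -0.3, 0.6, -0.2 } };
	double w2[ 3 ]      = { 0.7, -0.5, 0.2 };
	const double a = 0.5;   // learning rate

	double in[ 4 ][ 2 ] = { { 0, 0 }, { 0, 1 }, { 1, 0 }, { 1, 1 } };
	double out[ 4 ]     = { 0, 1, 1, 0 };

	for( int iter = 0; iter < 10000; iter++ )
	{
		for( int p = 0; p < 4; p++ )
		{
			// steps 2-5: forward pass through the hidden and output layers
			double h[ 2 ];
			for( int i = 0; i < 2; i++ )
				h[ i ] = s( in[ p ][ 0 ] * w1[ i ][ 0 ] + in[ p ][ 1 ] * w1[ i ][ 1 ] + w1[ i ][ 2 ] );
			double y = s( h[ 0 ] * w2[ 0 ] + h[ 1 ] * w2[ 1 ] + w2[ 2 ] );

			// steps 6-7: output error and output delta, with s'(e) = y * (1 - y)
			double E  = out[ p ] - y;
			double dy = y * ( 1.0 - y ) * E;

			// steps 8-9: propagate dy * w back, then apply s'(e) at the hidden layer
			double dh[ 2 ];
			for( int i = 0; i < 2; i++ )
				dh[ i ] = h[ i ] * ( 1.0 - h[ i ] ) * ( dy * w2[ i ] );

			// step 10: w += delta * a * input (the bias "input" is 1)
			for( int i = 0; i < 2; i++ )
				w2[ i ] += dy * a * h[ i ];
			w2[ 2 ] += dy * a;
			for( int i = 0; i < 2; i++ )
			{
				for( int j = 0; j < 2; j++ )
					w1[ i ][ j ] += dh[ i ] * a * in[ p ][ j ];
				w1[ i ][ 2 ] += dh[ i ] * a;
			}
		}
	}

	// after training, the four outputs should end up near 0, 1, 1, 0
	for( int p = 0; p < 4; p++ )
	{
		double h[ 2 ];
		for( int i = 0; i < 2; i++ )
			h[ i ] = s( in[ p ][ 0 ] * w1[ i ][ 0 ] + in[ p ][ 1 ] * w1[ i ][ 1 ] + w1[ i ][ 2 ] );
		std::printf( "%g %g -> %g\n", in[ p ][ 0 ], in[ p ][ 1 ],
		             s( h[ 0 ] * w2[ 0 ] + h[ 1 ] * w2[ 1 ] + w2[ 2 ] ) );
	}

	return 0;
}

With a reasonable learning rate and enough passes this should converge, although a 2-2-1 sigmoid net can occasionally stall in a local minimum depending on the starting weights.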

OK, that's pretty crappy, hopefully someone's followed along!

It's kind of a lot of code, so I'm only posting the core of it; maybe someone will spot some glaring errors. Any help is appreciated.

First, the train method on the NeuralNetwork class:
void NeuralNetwork::train( vector<TrainingPoint> set, uint iterations )
{
	// dumbly iterate a fixed number of times
	for( uint i = 0; i < iterations; i++ )
	{
		cout << "==================================" << endl;

		// iterate over each input/output pair in the training set
		for( size_t j = 0; j < set.size(); j++ )
		{
			int k;

			// forward step: the training input goes into the first layer,
			// each previous layer's output goes into the subsequent layer
			for( k = 0; k < (int)layers.size(); k++ )
			{
				layers[ k ].update( k == 0 ? set[ j ].inputs : layers[ k - 1 ].output );
			}

			vector<double> signal;

			// calculate the output error: E = desired - actual
			for( k = 0; k < (int)set[ j ].outputs.size(); k++ )
			{
				signal.push_back( set[ j ].outputs[ k ] - layers.back().output[ k ] );
			}

			// propagate the error back through the network;
			// each layer calculates all of its neurons' errors
			for( k = (int)layers.size() - 1; k >= 0; k-- )
			{
				layers[ k ].propagateError( k == (int)layers.size() - 1 ? signal : layers[ k + 1 ].signal );
			}

			// step back through and update the weights accordingly
			for( k = 0; k < (int)layers.size(); k++ )
			{
				layers[ k ].updateWeights( k == 0 ? set[ j ].inputs : layers[ k - 1 ].output );
			}

			cout << "target: " << set[ j ].outputs[ 0 ]
			     << " output: " << layers.back().output[ 0 ] << endl;
		}
	}
}
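
For reference, the call site looks something like this. TrainingPoint's inputs and outputs members are the vectors used above; the default constructor and the net object are assumptions about the rest of the library:

// hypothetical harness for training on XOR
vector<TrainingPoint> xorSet;
double ins[ 4 ][ 2 ] = { { 0, 0 }, { 0, 1 }, { 1, 0 }, { 1, 1 } };
double outs[ 4 ]     = { 0, 1, 1, 0 };

for( int i = 0; i < 4; i++ )
{
	TrainingPoint p;
	p.inputs.assign( ins[ i ], ins[ i ] + 2 );
	p.outputs.assign( outs + i, outs + i + 1 );
	xorSet.push_back( p );
}

net.train( xorSet, 10000 );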


Next, the update, propagateError and updateWeights methods on the NeuralLayer class:
// solve for all neuron activations
// nn is num neurons, ni is num inputs (equal to the previous layer's num neurons)
void NeuralLayer::update( const vector<double>& input )
{
	for( int i = 0; i < nn; i++ )
	{
		double x = 0.0;

		// weights[ i ][ j ] is the weight on input j going into neuron i
		for( int j = 0; j < ni; j++ )
		{
			x += input[ j ] * weights[ i ][ j ];
		}

		// y = s(e), the sigmoid of the weighted sum
		output[ i ] = 1.0 / ( 1.0 + exp( -x ) );
	}
}

// solve for each neuron's error given the following layer's propagated error terms,
// and produce those same terms for the previous layer
void NeuralLayer::propagateError( const vector<double>& propagation )
{
	int i;

	// error_i = s'(e) * E = y * (1 - y) * E
	for( i = 0; i < nn; i++ )
	{
		error[ i ] = output[ i ] * ( 1.0 - output[ i ] ) * propagation[ i ];
	}

	// signal_j = sum of error_i * w_ij; this becomes the previous layer's E
	for( i = 0; i < ni; i++ )
	{
		signal[ i ] = 0.0;

		for( int j = 0; j < nn; j++ )
		{
			signal[ i ] += weights[ j ][ i ] * error[ j ];
		}
	}
}

void NeuralLayer::updateWeights( const vector<double>& input )
{
	// w_ij += error_i * a * y_j, where the y_j are this layer's inputs
	for( int i = 0; i < nn; i++ )
	{
		for( int j = 0; j < ni; j++ )
		{
			weights[ i ][ j ] += error[ i ] * rate * input[ j ];
		}
	}
}
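
For what it's worth, a generic way to hunt down this kind of logic problem is a finite-difference check: nudge a single weight by a small epsilon, re-run the forward pass, and compare the measured change in squared error against what the backprop terms predict. A rough sketch, assuming a hypothetical errorFor() helper that runs update() through the network and returns the squared error for one training point:

// debugging aid, not part of the library; 'point' is one TrainingPoint
const double eps = 1e-4;
double& w = layers[ 1 ].weights[ 0 ][ 1 ];   // any single weight
double saved = w;

w = saved + eps;
double ePlus = errorFor( point );
w = saved - eps;
double eMinus = errorFor( point );
w = saved;

// numeric estimate of dE/dw; for E = 1/2 * (desired - y)^2 this should
// roughly equal -( error_i * input_j ), the negative of the update
// direction used in updateWeights. A large mismatch points at the
// propagation code rather than the forward pass.
double numeric = ( ePlus - eMinus ) / ( 2.0 * eps );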
Worked it out, just a logic problem.

This topic is closed to new replies.
