"Ive been debugging my neural net class for a week now, I cant figure out whats wrong. I thought it could be that the back propagation equation isnt implented as intended."
I had similar problems a while ago. It turned out that the problem was not in the backpropagation. There are many other things a neural network is sensitive to:
* Initial weight values. I found it a good idea to use random numbers from some configurable bounds, e.g. [-0.1, +0.1]. My network actually refused to learn XOR with zero initial weights!
* Order of teaching. I found that stochastic teaching (i.e. choosing teaching samples at random) produces better results than orderly teaching (which can lead to bias). Again, my network often refused to learn XOR if teaching was done in a predictable fashion (00, 01, 10, 11). (A rough sketch of this point and the previous one follows after this list.)
* Learning constant. It's clear that the learning constant affects learning a great deal - try lowering your value first.
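Here is a rough sketch of those first two points, assuming a plain standalone C++ program (the bounds, the XOR table and the hypothetical Train call are illustrative, not taken from any posted code):

#include <cstdlib>
#include <iostream>
#include <vector>

// Illustrative helper: a random weight from a configurable range, e.g. [-0.1, +0.1].
float RandomWeight(float lo, float hi) {
    return lo + (hi - lo) * ((float)rand() / (float)RAND_MAX);
}

int main() {
    // Initial weight values: small random numbers, never all zero.
    std::vector<float> weights(9);
    for (size_t i = 0; i < weights.size(); ++i)
        weights[i] = RandomWeight(-0.1f, 0.1f);
    std::cout << "first weight: " << weights[0] << std::endl;

    // Order of teaching: pick the XOR sample at random each iteration
    // instead of always cycling 00, 01, 10, 11.
    const float samples[4][3] = { {0,0,0}, {0,1,1}, {1,0,1}, {1,1,0} }; // in, in, target
    for (int iter = 0; iter < 10; ++iter) {
        int s = rand() % 4; // stochastic choice of teaching sample
        std::cout << "teach " << samples[s][0] << "," << samples[s][1]
                  << " -> " << samples[s][2] << std::endl;
        // a real run would call something like net.Train(...) here (hypothetical)
    }
    return 0;
}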
As with all debugging, you should start with very simple test data, e.g. learning the AND or OR binary gates (a perceptron is enough for these). Then you can proceed to slightly more complicated examples, e.g. the XOR gate (this requires hidden layers), and off you go.
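For the perceptron stage, a rough self-contained test could look like this (step activation and the classic perceptron rule; all names and constants are illustrative, not from your class):

#include <cstdlib>
#include <iostream>

// Rough single-perceptron test for the AND gate.
int main() {
    float w[3] = { 0.05f, -0.02f, 0.03f };   // bias weight + two input weights
    const float rate = 0.1f;
    const float data[4][3] = { {0,0,0}, {0,1,0}, {1,0,0}, {1,1,1} }; // in, in, target

    for (int iter = 0; iter < 1000; ++iter) {
        int s = rand() % 4;                            // stochastic order of teaching
        float sum = w[0] + w[1]*data[s][0] + w[2]*data[s][1];
        float out = (sum > 0.0f) ? 1.0f : 0.0f;        // step activation
        float err = data[s][2] - out;
        w[0] += rate * err;                            // bias input is always 1
        w[1] += rate * err * data[s][0];
        w[2] += rate * err * data[s][1];
    }

    // Check what the perceptron learned.
    for (int s = 0; s < 4; ++s) {
        float sum = w[0] + w[1]*data[s][0] + w[2]*data[s][1];
        std::cout << data[s][0] << "," << data[s][1] << " -> "
                  << ((sum > 0.0f) ? 1 : 0) << std::endl;
    }
    return 0;
}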
Good luck,
-- Mikko
The maths behind back propagation
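For reference, the standard backpropagation rules for a sigmoid network (the same relations the comments in Train() below rely on) are, roughly:

\[
\delta_{\text{out}} = o\,(1 - o)\,(d - o), \qquad
\delta_j = x_j\,(1 - x_j)\sum_k w_{jk}\,\delta_k, \qquad
w_{ij} \leftarrow w_{ij} + \eta\, x_i\, \delta_j
\]

where o is the net output, d the desired output, x_j the output of hidden neuron j, w_{jk} the weight from j to neuron k in the next layer, and η the learning constant.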
Thanks for your input, uutee. [smile]
I've added stochastic training to my network. It didn't make it better, though.
I randomize the weights in the constructor so they are between -1 and 1.
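(For reference, a generic way to map rand() onto [-1, +1] - this snippet is illustrative, not taken from the posted code:)

#include <cstdlib>

// illustrative only: uniform random weight in [-1, +1]
float RandomWeightMinusOneToOne() {
    return 2.0f * ((float)rand() / (float)RAND_MAX) - 1.0f;
}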
I just lowered the learning constant to 0.1 and ran a whopping 500000 iterations. The net just gave 0.03 on every input... couldn't be further off. :P
I've already made a net with a single perceptron, which managed to learn AND. It's now that I've expanded the class to be of variable size (choose how many hidden layers and how many neurons in each layer) that the problems have become really hard to solve.
Here's the source to my net, if you (for any reason) want to look at it. The training algorithm was taken from a backpropagation article on generation5.org (see earlier post for link), and I tried to show the architecture (that is, which id a certain weight/output has) of my net a couple of posts up. Look at that if you don't understand how it's laid out from the source.
Here it is, anyway:
NeuralNet.h
#ifndef _NEURAL_NET
#define _NEURAL_NET

#include <vector>
#include <iostream>
#include <time.h>
#include <math.h>
#include <fstream>
#include <sstream>
#include <conio.h>

using namespace std;

#define LEARN_RATE 0.1
#define OUTPUTS 1

class NeuralNet {
private:
    float Sigmoid(float num) {return (float)(1/(1+exp(-num)));};

    vector<vector<float> > W; // weights
    vector<vector<float> > X; // the neurons outputs

    int Inputs, HiddenLayers, NeuronsInHidden;
    float NetOut; // the total output of the net

    float x(int l, int n); // output x in layer l from neuron n (l = 0 for the inputs to the net)
    float w(int l, int f, int n); // weight of input f (bias = 0) for neuron n in layer l, l = 0 is hidden, l = 1 is output layer
    void SetX(int l, int n, float NewX); // same as above but set the weight instead of read it
    void SetW(int l, int f, int n, float NewWeight); // same as above but set the weight instead of read it

    bool Debug; // debug output flag
    //bool StepThrough;

public:
    NeuralNet(int _Inputs, int _HiddenLayers, int _NeuronsInHidden, bool _Debug);
    ~NeuralNet();

    void Train(vector<float> NetIn, float CorrectOutput); // same as below but trains it to
    float Process(vector<float> NetIn); // takes the inputs, returns the outputs of the net
    void Print();
};

#endif
NeuralNet.cpp
#include "NeuralNet.h"NeuralNet::NeuralNet(int _Inputs, int _HiddenLayers, int _NeuronsInHidden, bool _Debug) { Inputs = _Inputs; HiddenLayers = _HiddenLayers; NeuronsInHidden = _NeuronsInHidden; NetOut = 0; // the total output of the net Debug = _Debug; //StepThrough = _StepThrough; srand( (unsigned)time( NULL ) ); // seed the randomizer W.resize(HiddenLayers+1); // setup weights for(int l = 0; l <= HiddenLayers; ++l) { if(l == 0) { // first layer, these inputs are the net inputs W[l].resize((Inputs+1) * NeuronsInHidden); for(int n = 0; n < NeuronsInHidden; ++n) { for(int f = 0; f < (Inputs+1); ++f) SetW(l,f,n, 2.0f*((float)rand()/(float)RAND_MAX)-0.5f); } } else { W[l].resize((NeuronsInHidden+1)*NeuronsInHidden); // hidden layers, inputs to these are the outputs from the former layer for(int n = 0; n < NeuronsInHidden; ++n) { for(int f = 0; f < (NeuronsInHidden+1); ++f) SetW(l,f,n, 2.0f*((float)rand()/(float)RAND_MAX)-0.5f); } } } W[HiddenLayers].resize((NeuronsInHidden+1) * OUTPUTS); // output layer for(int f = 0; f < ((NeuronsInHidden+1) * OUTPUTS); ++f) SetW(HiddenLayers, f, 0, 2.0f*((float)rand()/(float)RAND_MAX)-0.5f); // initialize the input/output holders for perceptrons X.resize(HiddenLayers+1); // +1 for the input layer X[0].resize(Inputs+1, 0); // the input layer for(int l = 1; l <= HiddenLayers; ++l) X[l].resize(NeuronsInHidden+1, 0); // store biases for(int l = 0; l <= HiddenLayers; ++l) SetX(l,0,1); if(Debug) { cout << "--SETUP-------------\n"; for(int l = 0; l <= HiddenLayers; ++l) cout << "W Layer " << l << " have size " << W[l].size() << endl; cout << endl; for(int l = 0; l <= HiddenLayers; ++l) cout << "X Layer " << l << " have size " << X[l].size() << endl; cout << endl; for(int l = 0; l <= HiddenLayers; ++l) { for(int n = 0; n < W[l].size(); ++n) cout << "w(" << l << "," << n << ") is " << W[l][n] << endl; cout << endl; } getch(); }}NeuralNet::~NeuralNet() {}float NeuralNet::w(int l, int f, int n) { // f = 0 is bias weight if(l > W.size()) { cout << "SetW error: Bad layer number: " << l << endl; return 0; } else if(l == 0 ) { // input layer if(((Inputs+1) * n) + f > W[l].size()) { cout << "SetW error: Bad weight id number: " << ((Inputs+1) * n) + f << " on layer " << l << "\n\n"; return 0; } return W[l][((Inputs+1) * n) + f]; } else if(l == 0 || l <= HiddenLayers) { if(((NeuronsInHidden+1) * n) + f > W[l].size()) { cout << "SetW error: Bad weight id number: " << ((NeuronsInHidden+1) * n) + f << " on layer " << l << "\n\n"; return 0; } return W[l][((NeuronsInHidden+1) * n) + f]; } return 0; // just in case}void NeuralNet::SetW(int l, int f, int n, float NewWeight) { if(l == HiddenLayers) // output layer W[HiddenLayers][((NeuronsInHidden+1) * n) + f] = NewWeight; else if(l == 0 ) // input layer W[l][((Inputs+1) * n) + f] = NewWeight; else if(l == 0 || l < HiddenLayers) // hidden layers W[l][((NeuronsInHidden+1) * n) + f] = NewWeight; else if(l < 0 && l > HiddenLayers) { cout << "W Error: Bad layer number: " << l << endl; return; }}void NeuralNet::SetX(int l, int n, float NewX) { if(l < 0 && l > HiddenLayers) cout << "SetX error: Bad layer number: " << l << endl; if(n >= X[l].size()) { cout << "SetX error: Bad layer number: " << l << endl; return; } // we are inside boundries X[l][n] = NewX;}float NeuralNet::x(int l, int n) { // n = 0 is bias (1) if(l < 0 && l > HiddenLayers) cout << "X Error: Bad layer number: " << l << endl; if(n >= X[l].size()) { cout << "X Error: Bad neuron number: " << n << endl; return 0; } // we are inside boundries return X[l][n];}void 
NeuralNet::Train(vector<float> NetIn, float d) { // first, process so we have the correct values stored inside the neural net Process(NetIn); vector<vector<float> > Delta; Delta.resize(HiddenLayers+1); // one for the output layer to for(int l = 0; l <= HiddenLayers; ++l) { if(l == HiddenLayers) // output layer Delta[l].resize(OUTPUTS, 0); else Delta[l].resize(NeuronsInHidden, 0); } // output layer delta (we only have one output now so the loop will only run once) // d(2,0) = x(3,0)(1 - x(3,0))(d - x(3,0)) //Delta[HiddenLayers][n] = x(HiddenLayers+1,n) * (1 - x(HiddenLayers+1,n)) * (d - x(HiddenLayers+1,n)); Delta[HiddenLayers][0] = NetOut * (1 - NetOut) * (d - NetOut); // hidden layer delta, first one before output // d2(1) = x2(1)(1 - x2(1))w2(1,1)d3(1) // formula: d(l,n) = x(l,n) (1 - x(l,n)) w(l+1,n,n-1) d(l+1,n) // loop through the net backwards for(int l = HiddenLayers-1; l >= 0; --l) { if(l == HiddenLayers-1) { // layer directly before output layer for(int n = 0; n < NeuronsInHidden; ++n) Delta[l][n] = x(l+1,n+1) * (1 - x(l+1,n+1)) * w(l+1,n+1,0) * Delta[HiddenLayers][0]; } else { for(int n = 0; n < NeuronsInHidden; ++n) Delta[l][n] = x(l+1,n+1) * (1 - x(l+1,n+1)) * w(l+1,n+1,n) * Delta[l+1][n]; } } // Delta calculated, now alter the weights (we only have one output now so the loop will only run once) // formula: w2(0,1) = h*x1(0)*d2(1) // formula: w(l,f,n) = h * x(l,f) * d(l,n) for(int f = 0; f < NeuronsInHidden+1; f++) SetW(HiddenLayers,f,0, w(HiddenLayers,f,0)+(LEARN_RATE * x(HiddenLayers,f) * Delta[HiddenLayers][0])); // alter the weights for the hidden layers to for(int l = 0; l < HiddenLayers; l++) { if(l == 0) { // first layer for(int n = 0; n < NeuronsInHidden; n++) { for(int f = 0; f < Inputs+1; f++) SetW(0,f,n, w(0,f,n)+(LEARN_RATE * x(0,f) * Delta[0][n])); } } else { for(int n = 0; n < NeuronsInHidden; n++) { for(int f = 0; f < NeuronsInHidden+1; f++) SetW(l,f,n, w(l,f,n)+(LEARN_RATE * x(l,f) * Delta[l][n])); } } } if(Debug) { cout << "--TRAIN-------------\n"; for(int l = HiddenLayers; l >= 0; --l) { if(l == HiddenLayers) { // output layer for(int n = 0; n < OUTPUTS; ++n) cout << "Delta(" << l << "," << n << ") " << Delta[l][n] << " "; cout << endl; } else if(l == 0) { // input layer for(int n = 0; n < Inputs; ++n) cout << "Delta(" << l << "," << n << ") " << Delta[l][n] << " "; cout << endl; } else { for(int n = 0; n < NeuronsInHidden; ++n) cout << "Delta(" << l << "," << n << ") " << Delta[l][n] << " "; cout << endl; } } cout << endl; for(int l = 0; l <= HiddenLayers; ++l) { for(int n = 0; n < W[l].size(); ++n) cout << "New weight (" << l << "," << n << ") is " << W[l][n] << endl; cout << endl; } getch(); }}float NeuralNet::Process(vector<float> NetIn) { // reset values in net for(int l = 0; l <= HiddenLayers; ++l) { if(l == 0) { // input layer for(int n = 1; n < Inputs+1; ++n) SetX(l,n,0); } else { for(int n = 1; n < NeuronsInHidden+1; ++n) SetX(l,n,0); } } NetOut = 0; // reset output neuron // initial net inputs for(int n = 1; n <= Inputs; ++n) SetX(0,n,NetIn[n-1]); // first layer float Fire = 0; // what the neuron fires for(int n = 1; n <= Inputs; ++n) { for(int i = 0; i <= Inputs; ++i) Fire += x(0, i) * w(0, i, n-1); SetX(1,n, Sigmoid(Fire)); // store it as output Fire = 0; // reset fire } // sort out the hidden layers outputs for(int l = 0; l < HiddenLayers; l++) { // loop through layers for(int n = 1; n <= NeuronsInHidden; n++) { // loop through hiddens, start at one so we dont overwrite the bias for(int i = 0; i < Inputs+1; i++) // loop through inputs Fire += x(l, 
i) * w(l, i, n-1); // store outputs as inputs in the next layer SetX(l+1,n, Sigmoid(Fire)); // store it as output Fire = 0; // reset fire } } // output neuron for(int i = 0; i <= NeuronsInHidden; i++) NetOut = x(HiddenLayers, i) * w(HiddenLayers, i, 0); NetOut = Sigmoid(NetOut); // --- Calculation done --- if(Debug) { cout << "--PROCESS-----------\n"; for(int l = 0; l <= HiddenLayers; ++l) { for(int n = 1; n < X[l].size(); ++n) cout << "x(" << l << "," << n << ") = " << x(l,n) << " "; cout << endl; } cout << "Netout: " << NetOut << "\n\n"; getch(); } return NetOut; }void NeuralNet::Print() { // print output stringstream str; // hidden layer weights str << "Hidden W: --- "; for(int l = 0; l < HiddenLayers; ++l) { if(l == 0) { // first layer str << "\n\nLayer " << l << "\n"; for(int n = 0; n < NeuronsInHidden; ++n) { for(int i = 0; i < Inputs+1; ++i) str << "w(" << l << "," << i << "," << n << "): " << w(l,i,n) << "\t"; str << endl; } } else { // every other hidden layer str << "\n\nLayer " << l << "\n"; for(int n = 0; n < NeuronsInHidden; ++n) { for(int i = 0; i < NeuronsInHidden+1; ++i) str << "w(" << l << "," << i << "," << n << "): " << w(l,i,n) << "\t"; str << endl; } } str << endl; } str << "\n\n"; // output layer weights str << "Output W: --- \n"; for(int n = 0; n < OUTPUTS; n++) { for(int i = 0; i <= NeuronsInHidden; i++) str << "w(" << HiddenLayers << "," << i << "," << n << "): " << w(HiddenLayers,i,n) << "\t"; } str << "\n\n"; // open file ofstream file("Net.txt"); if(!file.is_open()) { cout << "Print failed, unable to create file: Net.txt\n"; return; } // print it file << str.str(); cout << "Net data printed to file\n";}
Main.cpp
#include <iostream>
#include <conio.h>
#include "NeuralNet.h"

int main() {
    NeuralNet X(2, 2, 2, 1);
    vector<float> NetIn;
    NetIn.resize(2);

    // Train
    int s = 0; // for stochastic teaching
    for(int a = 0; a < 500000; ++a) {
        s = rand()%3;
        cout << s << endl;
        if(s == 0) {
            NetIn[0] = 0;
            NetIn[1] = 0;
            X.Train(NetIn, 0);
        } else if(s == 1) {
            NetIn[0] = 1;
            NetIn[1] = 0;
            X.Train(NetIn, 0);
        } else if(s == 2) {
            NetIn[0] = 0;
            NetIn[1] = 1;
            X.Train(NetIn, 0);
        } else if(s == 3) {
            NetIn[0] = 1;
            NetIn[1] = 1;
            X.Train(NetIn, 1);
        }
    }

    // Output what weve learned
    NetIn[0] = 0;
    NetIn[1] = 0;
    cout << "0,0 = " << X.Process(NetIn);
    NetIn[0] = 1;
    NetIn[1] = 0;
    cout << endl << "1,0 = " << X.Process(NetIn);
    NetIn[0] = 0;
    NetIn[1] = 1;
    cout << endl << "0,1 = " << X.Process(NetIn);
    NetIn[0] = 1;
    NetIn[1] = 1;
    cout << endl << "1,1 = " << X.Process(NetIn) << "\n\n";

    X.Print();
    getch();
    return 1;
}
Hey Miz,
You can send me your project file and I can try to debug it if you want me to. NickGeorgia@hotmail.com
"You can send me your project file and I can try to debug it if you want me to. NickGeorgia@hotmail.com"
I just sent the mail with the project attached in a .rar file. I also included some extra info.