I modified the code and added an additional hidden neuron; I have also rewritten parts of it for clarity. Here is the new training function.
void Layernet::Trainnet (int *inputarray)
{
    // present the inputs
    int *input = inputarray;
    int counter = 1;
    double delta, delta_hid1, delta_hid2;
    double alpha = .4;
    for (int j = 0; j < 100; j++)
    {
        // forward pass
        double hid1_in = w0 * input[0] + w1 * input[1] + bias1;
        double hid1_out = 1 / (1 + exp (-1 * hid1_in));
        double hid2_in = w2 * input[0] + w3 * input[1] + bias2;
        double hid2_out = 1 / (1 + exp (-1 * hid2_in));
        double out_in = w4 * hid1_out + w5 * hid2_out + bias3;
        double out_out = 1 / (1 + exp (-1 * out_in));

        // calculate the delta for the output layer first
        delta = (input[2] - out_out) * out_out * (1 - out_out);

        // the deltas for the hidden layer, using the delta calculated above
        delta_hid1 = delta * w4 * hid1_out * (1 - hid1_out);
        delta_hid2 = delta * w5 * hid2_out * (1 - hid2_out);

        // train using backpropagation, learning rate is alpha
        // w0, w1, w2, w3 are the weights going from the input neurons to the hidden neurons
        // w4 and w5 are the weights going from the hidden layer neurons to the o/p neuron
        w0 += alpha * delta_hid1 * input[0];
        w1 += alpha * delta_hid1 * input[1];
        bias1 += alpha * delta_hid1 * 1;
        w2 += alpha * delta_hid2 * input[0];
        w3 += alpha * delta_hid2 * input[1];
        bias2 += alpha * delta_hid2 * 1;
        w4 += alpha * delta * hid1_out;
        w5 += alpha * delta * hid2_out;
        bias3 += alpha * delta * 1;

        // cycle through the four training patterns (3 ints each)
        if (counter == 4) { counter = 1; input = inputarray; }
        else { counter++; input += 3; }
    }
}
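For reference, the training-data layout that the loop above expects is four patterns of three ints each (input0, input1, target), stepped through via the counter / input += 3 logic. A minimal sketch of how I set it up and call it, assuming the standard 0/1 XOR truth table (the name xor_data is just illustrative):

// Four XOR patterns stored back to back as (input0, input1, target) triplets,
// which is the layout Trainnet's "input += 3" stepping assumes.
int xor_data[12] = {
    0, 0, 0,   // 0 XOR 0 = 0
    0, 1, 1,   // 0 XOR 1 = 1
    1, 0, 1,   // 1 XOR 0 = 1
    1, 1, 0    // 1 XOR 1 = 0
};

Layernet net;            // weights initialised by the constructor below
net.Trainnet (xor_data); // 100 single-pattern updates, i.e. 25 passes over the 4 patterns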
The code that initialises the weights is as follows.
Layernet::Layernet (void)
{
    // random weights
    w0 = .21; w1 = .13; w2 = .11; w3 = .22; w4 = .31; w5 = .12;
    bias1 = .12; bias2 = .11; bias3 = .10;
}
But the net still doesn't work. I am using a 3-layer feed-forward network with 2 hidden neurons in the middle layer, trained with the standard back-propagation algorithm (no momentum, learning rate alpha = .4 as in the code above, sigmoid activation function).
Can someone please tell me why? For further clarification on the code: w0 and w1 are the weights going from the 2 input neurons to hidden neuron 1, and bias1 is the bias of the first hidden neuron. w2, w3 and bias2 are for the second hidden neuron. Finally, w4, w5 and bias3 are for the output neuron. Additionally, the derivative of the sigmoid function f(x) = 1 / (1 + exp(-x)) is f(x) * (1 - f(x)), and this identity has been used directly in the code (spelled out below).
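To spell the derivative step out, this is just the textbook chain-rule identity (nothing specific to my code):

$$f(x) = \frac{1}{1 + e^{-x}}, \qquad f'(x) = \frac{e^{-x}}{\left(1 + e^{-x}\right)^{2}} = f(x)\,\bigl(1 - f(x)\bigr)$$

which is why terms like out_out * (1 - out_out) and hid1_out * (1 - hid1_out) appear in the delta calculations.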
I have tried hard; maybe you can help me get this to work.
[Edit]
Here is the function that I use for output:
int Layernet::output (int *inputarray)
{
    double hid1_in = w0 * inputarray[0] + w1 * inputarray[1] + bias1;
    double hid1_out = 1 / (1 + exp (-1 * hid1_in));
    double hid2_in = w2 * inputarray[0] + w3 * inputarray[1] + bias2;
    double hid2_out = 1 / (1 + exp (-1 * hid2_in));
    double out_in = w4 * hid1_out + w5 * hid2_out + bias3;
    double out_out = 1 / (1 + exp (-1 * out_in));
    if (out_out >= .9) return 1;
    else return 0;
}
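And this is a minimal sketch of how I check all four patterns after training (it assumes the net and xor_data names from the sketch above, plus #include <cstdio> for printf):

// query the trained net on each (input0, input1) pair and compare to the target
for (int p = 0; p < 4; p++)
{
    int *pattern = xor_data + 3 * p;
    printf ("%d XOR %d -> %d (expected %d)\n",
            pattern[0], pattern[1], net.output (pattern), pattern[2]);
}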
Also, during training I used 1/-1 instead of 1/0 as the target values, but it made no difference. I get all the outputs to be 0, which most certainly isn't the XOR gate.
[/Edit]