
Neural network, image recognition

Started by September 17, 2008 03:20 AM
14 comments, last by Coldon 16 years, 2 months ago
Hi! I'm building a neural network which is supposed to recognize some smileys. Something seems to be wrong but I can't find it. First I train the network 1000 times by calling recognize() and then train() inside a loop. After that I do the same, but this time only calling recognize(), to see how many the NN could recognize. The percentage is always the same, 25%, which is exactly the probability without any training. The initial values for the link weights are set to some random real number between -1 and 1. The pixels in the picture are also normalized between -1 and 1. Anyway, I really need some help with this. I have searched my code for the error for a long time and I believe I need some help :( Feel free to ask questions about my implementation. This is my first NN.
[SOURCE]

#include <iostream>
#include <fstream>
#include <time.h>
#include <cmath>
#include "Image.h"
#include "Node.h"
#include "Parser.h"
#include "Happy.h"

using namespace std;

void Happy::buildNetwork(vector<Image*> *images) {

	int w = (images->at(0))->w;
	int h = (images->at(0))->h;

	nrOfInput = w * h;
	nrOfOutput = 4;
	nrOfHidden =  (nrOfInput + nrOfOutput) / 2; 

	Input = new Node[nrOfInput];
	
	for(int i=0; i<nrOfInput; i++) {
		Node *n = new Node();
		n->initWeights(nrOfHidden);
		Input[i] = *n;
	}

	Hidden = new Node[nrOfHidden];
	for(int i=0; i<nrOfHidden; i++) {
		Node *n = new Node();
		n->initWeights(nrOfOutput);
		Hidden[i] = *n;
	}

	Output = new Node[nrOfOutput];
	for(int i=0; i<nrOfOutput; i++) {
		Output[i] = *(new Node());
	}
}


float Happy::sigmoid(float x) {
	return 1.0 / (1.0 + exp(-x));
}


int Happy::recognize(Image *image) {

	/* Assign values to the input nodes from the image */
	for(int i=0; i<nrOfInput; i++) {
		Input[i].activation = image->pixels[i];
	}

	/* Set hidden nodes activation/output */ 
	for(int i=0; i<nrOfHidden; i++) {

		double total = 0.0;

		for(int j=0; j<nrOfInput; j++) {
			total += (Input[j].activation * Input[j].weights[i]);
		}

		Hidden[i].activation = sigmoid(total);
	}

	/* Set output nodes activation/output */
	for(int i=0; i<nrOfOutput; i++) {

		double total = 0.0;

		for(int j=0; j<nrOfHidden; j++) {
			total += Hidden[j].activation * Hidden[j].weights[i];
		}
		
		Output[i].activation = sigmoid(total);
	}

	/* which output has biggest value? */
	int maxIndex = 0;
	for(int i = 0; i<nrOfOutput; i++) {
		if(Output[i].activation > maxIndex) {
			maxIndex = i;
		}
	}

	/* the answers are given 1-4; that explains the maxIndex+1 */
	if(maxIndex+1 == image->answer) {
		correct++;
	}
	
	return maxIndex;
}


void Happy::train(Image* image) {


	/* Output error */
	for(int i=0; i<nrOfOutput; i++) {
		double activation = Output[i].activation;
		if(i + 1 == image->answer) {
			Output[i].error = activation * (1 - activation) * (1 - activation);
		} else {
			Output[i].error = activation * (1 - activation) * (0 - activation);
		}
	}

	/* Hidden error */
	for(int i = 0; i < nrOfHidden; i++) {
		double total = 0.0;
		for(int j=0; j < nrOfOutput; j++) {
			total += Hidden[i].weights[j] * Output[j].error;
		}
		Hidden[i].error = total;
	}


	/* Input error */
	for(int i = 0; i < nrOfInput; i++) {
		double total = 0.0;
		for(int j=0; j < nrOfHidden; j++) {
			total += Input[i].weights[j] * Hidden[j].error;
		}
		Input[i].error = total;
	}

	/* Update weights for hidden links*/
	for(int i = 0; i < nrOfOutput; i++) {
		for(int j=0; j < nrOfHidden; j++) {
			Hidden[j].weights[i] += (learningRate * Output[i].error * Hidden[j].activation);
		}
	}

	/* Update weights for input links*/
	for(int i = 0; i < nrOfHidden; i++) {
		for(int j=0; j < nrOfInput; j++) {
			Input[j].weights[i] += (learningRate * Hidden[i].error * Input[j].activation);
		}
	}
}

Happy::Happy() {

	this->learningRate = 0.2;
}


int main(int argc, char** argv) {


	if(argc != 3) {
		cout << "you fail" << endl;
	} else {
		if(!strcmp(argv[1], "train")) {

			Parser* p = new Parser();
			vector<Image*> *images = p->parseFile(argv[2]);

			Happy *h = new Happy();
			h->buildNetwork(images);
			h->correct = 0;

			/* Train */
			for(int i = 0;i< 1000; i++) {
				h->recognize(images->at(i));
				h->train(images->at(i));
			}
			
			/* reset the counter */
			h->correct = 0;

			for(int i = 0;i< 1000; i++) {
				h->recognize(images->at(i));
			}

			cout << "Correct: " << (h->correct/1000.0) * 100 << "%" <<  endl;
		}
	}

	return 0;
}
[/SOURCE]
Where/how do you initialise the weights and error values?

And you're doing some odd copying of Nodes - I hope you have the proper copy constructor and assignment operator?

How many inputs do you have, out of interest? (To put it another way, how big are the images?)

I would have thought some simple logging would help here. For every attempt at recognition, you can output the results before training and after training and confirm that things change in the way you expect. Merely looking at the end success rate tells you next to nothing.
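For example, a minimal sketch of that kind of logging, placed inside the training loop and assuming the Happy/Image interfaces from the code above (recognize() returns a 0-based index, answers are 1-4):

[SOURCE]
/* Inside the training loop: log the guess before and after one training step. */
Image* img = images->at(i);
int before = h->recognize(img);   /* 0-based guess before training */
h->train(img);
int after = h->recognize(img);    /* 0-based guess after training */
cout << "answer=" << img->answer
     << " before=" << (before + 1)
     << " after=" << (after + 1) << endl;
[/SOURCE]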
Thanks for the answers! I'm a little worried about the weird copying.
Please look at my answers and let me know more about it :)


Quote:
Where/how do you initialise the weights and error values?

In the Node's constructor I set the error and activation to 0.
The weights are initialised by a call to the Node's initWeights() function.

Quote:
And you're doing some odd copying of Nodes - I hope you have the proper copy constructor and assignment operator?


am I? :(
In Happy.h I store the Nodes like this:

Node* Input;
Node* Output;
Node* Hidden;

and then in Happy.cpp I allocate with:

Input = new Node[nrOfInput];


Quote:
How many inputs do you have, out of interest? (To put it another way, how big are the images?)


20x20



I just noticed one more thing:

When I set the activation values for the hidden nodes, the value is always very high or very low. The result returned from the sigmoid function is always between 0.97 and 1, or between 0 and 0.03. Nothing inside the interval [0.03, 0.97](!!) Is it supposed to be like that?

[SOURCE]
for(int i=0; i<nrOfHidden; i++) {
	double total = 0.0;
	for(int j=0; j<nrOfInput; j++) {
		total += (Input[j].activation * Input[j].weights[i]);
	}
	Hidden[i].activation = sigmoid(total);
}
[/SOURCE]



The input values are normalized to be -1 to 1, but more values are negative than positive (many pixels are white).

Quote: Original post by artificial stupidity
Quote:
And you're doing some odd copying of Nodes - I hope you have the proper copy constructor and assignment operator?


am I? :(

Yes. There's little point calling new to create an object, to just copy it straight afterwards. eg:
[SOURCE]
Node *n = new Node();
n->initWeights(nrOfHidden);
Input[i] = *n;
[/SOURCE]

This creates a Node called 'n', sets it up, then you copy those values into Input, which I assume you've already set up correctly. This requires that your Nodes have the correct copy semantics, which I can't comment on without seeing the Node class's implementation. Also, you've leaked the memory for 'n', by doing nothing with it after you've copied its values. Do you come from a Java background, by any chance? :) If you want a temporary, just create one on the stack with "Node n;"
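To make that concrete, a minimal sketch of the two versions (assuming Node's default constructor already zeroes its members, as described above):

[SOURCE]
/* Leaky version: heap-allocate, copy into the array, never delete 'n'. */
Node *n = new Node();
n->initWeights(nrOfHidden);
Input[i] = *n;                    /* copies the Node; the heap object is lost */

/* Leak-free alternative: the array element already exists, set it up in place. */
Input[i].initWeights(nrOfHidden);
[/SOURCE]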
Quote: Original post by Kylotan
This creates a Node called 'n', sets it up, then you copy those values into Input... Also, you've leaked the memory for 'n', by doing nothing with it after you've copied its values. If you want a temporary, just create one on the stack with "Node n;"


Ok, thanks!
Is this better?
In the .h file I now have:

Node** Input;

[SOURCE]
#include <iostream>
#include <fstream>
#include <time.h>
#include <cmath>
#include "Image.h"
#include "Node.h"
#include "Parser.h"
#include "Happy.h"

using namespace std;

void Happy::buildNetwork(vector<Image*> *images) {

	int w = (images->at(0))->w;
	int h = (images->at(0))->h;

	nrOfInput = w * h;
	nrOfOutput = 4;
	nrOfHidden = (nrOfInput + nrOfOutput) / 2;

	Input = new Node *[nrOfInput];
	for(int i=0; i<nrOfInput; i++) {
		Node *n = new Node();
		n->initWeights(nrOfHidden);
		Input[i] = n;
	}

	Hidden = new Node *[nrOfHidden];
	for(int i=0; i<nrOfHidden; i++) {
		Node *n = new Node();
		n->initWeights(nrOfOutput);
		Hidden[i] = n;
	}

	Output = new Node *[nrOfOutput];
	for(int i=0; i<nrOfOutput; i++) {
		Output[i] = new Node();
	}
}

float Happy::sigmoid(float x) {
	return 1.0 / (1.0 + exp(-x));
}

int Happy::recognize(Image *image) {

	/* Assign values to the input nodes from the image */
	for(int i=0; i<nrOfInput; i++) {
		Input[i]->activation = image->pixels[i];
	}

	/* Set hidden nodes' activation/output */
	for(int i=0; i<nrOfHidden; i++) {
		double total = 0.0;
		for(int j=0; j<nrOfInput; j++) {
			total += (Input[j]->activation * Input[j]->weights[i]);
		}
		Hidden[i]->activation = sigmoid(total);
	}

	/* Set output nodes' activation/output */
	for(int i=0; i<nrOfOutput; i++) {
		double total = 0.0;
		for(int j=0; j<nrOfHidden; j++) {
			total += Hidden[j]->activation * Hidden[j]->weights[i];
		}
		Output[i]->activation = sigmoid(total);
	}

	/* which output has the biggest value? */
	int maxIndex = 0;
	for(int i = 0; i<nrOfOutput; i++) {
		if(Output[i]->activation > maxIndex) {
			maxIndex = i;
		}
	}

	/* the answers are given 1-4; that explains the maxIndex+1 */
	if(maxIndex+1 == image->answer) {
		correct++;
	}

	return maxIndex;
}

void Happy::train(int answer) {

	/* Output error */
	for(int i=0; i<nrOfOutput; i++) {
		double activation = Output[i]->activation;
		if(i + 1 == answer) {
			Output[i]->error = activation * (1 - activation) * (1 - activation);
		} else {
			Output[i]->error = activation * (1 - activation) * (0 - activation);
		}
	}

	/* Hidden error */
	for(int i = 0; i < nrOfHidden; i++) {
		double total = 0.0;
		for(int j=0; j < nrOfOutput; j++) {
			total += Hidden[i]->weights[j] * Output[j]->error;
		}
		Hidden[i]->error = total;
	}

	/* Input error */
	for(int i = 0; i < nrOfInput; i++) {
		double total = 0.0;
		for(int j=0; j < nrOfHidden; j++) {
			total += Input[i]->weights[j] * Hidden[j]->error;
		}
		Input[i]->error = total;
	}

	/* Update weights for hidden links */
	for(int i = 0; i < nrOfOutput; i++) {
		for(int j=0; j < nrOfHidden; j++) {
			Hidden[j]->weights[i] += (learningRate * Output[i]->error * Hidden[j]->activation);
		}
	}

	/* Update weights for input links */
	for(int i = 0; i < nrOfHidden; i++) {
		for(int j=0; j < nrOfInput; j++) {
			Input[j]->weights[i] += (learningRate * Hidden[i]->error * Input[j]->activation);
		}
	}
}

Happy::Happy() {
	this->learningRate = 0.2;
}

void print(Node n) {
	for(int i=0; i<10; i++) {
		cout << n.weights[i] << " ";
	}
	cout << endl;
}

int main(int argc, char** argv) {

	if(argc != 3) {
		cout << "you fail" << endl;
	} else {
		if(!strcmp(argv[1], "train")) {

			Parser* p = new Parser();
			vector<Image*> *images = p->parseFile(argv[2]);

			Happy *h = new Happy();
			h->buildNetwork(images);
			h->correct = 0;

			/* Train */
			for(int i = 0; i < 200; i++) {
				h->recognize(images->at(i));
				h->train(images->at(i)->answer);
			}

			/* reset the counter */
			h->correct = 0;

			for(int i = 0; i < 200; i++) {
				h->recognize(images->at(i));
			}

			cout << "Correct: " << (h->correct/200.0) * 100 << "%" << endl;
		}
	}

	return 0;
}
[/SOURCE]


And by the way, do you know anything about the strange sigmoid values I wrote about earlier?

The "strange sigmoid values" simply mean that your weights are too large. You still haven't explained how you initialize your weights.

Think of what the situation is for a neuron in the hidden layer. You have 400 inputs between -1 and 1. Let's imagine that they are uniformly distributed. If you use weights that are also uniformly distributed in [-1,+1], the sum of weights * inputs will look very close to a normal distribution with a standard deviation around 6.5. That means that typical values will saturate the sigmoid function and you'll get outputs that are almost always 1 or 0. The situation is even worse if most inputs are either +1 or -1 (standard deviation jumps to over 11). If you start with weights in [-1/20,+1/20] you'll do much better. `20' is the square root of the number of inputs.
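As a minimal sketch of that scaled initialisation (a hypothetical helper, not the poster's initWeights(); fanIn would be 400 for the weights feeding a hidden node here):

[SOURCE]
#include <cstdlib>
#include <cmath>

/* Uniform weights in [-1/sqrt(fanIn), +1/sqrt(fanIn)].
   With 400 inputs, sqrt(400) = 20, i.e. the [-1/20, +1/20] range above. */
void initScaledWeights(float* weights, int nr, int fanIn) {
	float bound = 1.0f / std::sqrt((float)fanIn);
	for(int i = 0; i < nr; i++) {
		float r = rand() / (float)RAND_MAX;      /* uniform in [0, 1] */
		weights[i] = (2.0f * r - 1.0f) * bound;  /* uniform in [-bound, +bound] */
	}
}
[/SOURCE]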

On the programming part of it, since you are using C++, you should consider using STL containers instead of managing your own memory, particularly since you are not very familiar with memory management.
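For instance, a sketch of what the node storage might look like with containers (assuming Node is safely copyable):

[SOURCE]
#include <vector>

/* std::vector sizes, copies and frees itself; no new[]/delete[] to get wrong. */
std::vector<Node> Input(nrOfInput), Hidden(nrOfHidden), Output(nrOfOutput);

for(int i = 0; i < nrOfInput; i++) {
	Input[i].initWeights(nrOfHidden);
}
[/SOURCE]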
Quote:
The "strange sigmoid values" simply mean that your weights are too large. You still haven't explained how you initialize your weights.


Oh sorry, here are the weight function and the normalizing function for the input values. I modified it with a division by 20 at the end. No difference :(

[SOURCE]
#include <iostream>
#include <time.h>
#include "Node.h"

Node::Node() {
	activation = 0.0;
	error = 0.0;
}

void Node::initWeights(int nr) {
	srand(time(NULL));
	weights = new float[nr];
	float avg = 0.0;
	for(int i = 0; i<nr; i++) {
		weights[i] = ((rand() % 200) / 100.0) - 1.0;
		weights[i] /= 20.0;
	}
}
[/SOURCE]




#include "Image.h"Image::Image(string name, float* pixels, int w, int h) {	this->name = name;	this->pixels = pixels;	this->w = w;	this->h = h;	normalize();}void Image::normalize () {	float max = 0;	for(int i=0; i<(w*h); i++) {		if(pixels > max) {			max = pixels;		}	}			for(int i=0; i<(w*h); i++) {		pixels/= max;		pixels*=2.0;		pixels-=1;		pixels /= 20.0;	}}





Dividing all the weights by 20 did nothing? That's surprising.

Can you dump what all the inputs and weights are for one random node and see what you get? It looks like you are going to have to resort to that type of diagnostic tool. You can probably also benefit from reducing the size of your problem. For instance, start with 2x2, and try to recognize "bottom-heavy" patterns, or patterns with high contrast, or simply bright patterns. Those problems are much easier and you'll be able to easily see what's going on because there are only a few tens of numbers involved.
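A minimal sketch of that kind of dump, assuming the Node** layout and member names from the revised code above:

[SOURCE]
#include <iostream>
#include "Node.h"   /* the poster's Node class */
using namespace std;

/* Print the inputs, the weights feeding hidden node h, and the node's
   current activation (from the last recognize() call). */
void dumpHiddenNode(Node** Input, Node** Hidden, int nrOfInput, int h) {
	double total = 0.0;
	for(int j = 0; j < nrOfInput; j++) {
		cout << Input[j]->activation << " * " << Input[j]->weights[h] << "\n";
		total += Input[j]->activation * Input[j]->weights[h];
	}
	cout << "sum = " << total << ", activation = " << Hidden[h]->activation << endl;
}
[/SOURCE]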
Quote:
Dividing all the weights by 20 did nothing? That's surprising.


Well, it did something. Now the hidden nodes' activations are between 0 and 1, but that didn't change the fact that the result is the same after the training :(
It seems that it does not learn at all.

Quote:
Can you dump what all the inputs and weights are for one random node and see what you get?


I could dump the inputs and weights, but then I don't know what to look for.
The numbers are small, but I have the weight init bounded to +-1/20.
Here are 10 random weights:

-0.0176376 -0.0176381 0.00177376 0.0198532 2.38221e-44 0 0.22407 -3.25956e-05 1.726e-39 0

Quote:
It looks like you are going to have to resort to that type of diagnostic tool. You can probably also benefit from reducing the size of your problem. For instance, start with 2x2, and try to recognize "bottom-heavy" patterns, or patterns with high contrast, or simply bright patterns. Those problems are much easier and you'll be able to easily see what's going on because there are only a few tens of numbers involved.


I don't think reducing the problem size will do any good. If I had some kind of result I could start with 2x2 to see how to get it to learn better and fine-tune the NN. But for now, I just want to see one more percentage point of recognition before I start analysing it.

