SOM network

Started by April 10, 2006 07:12 AM
4 comments, last by Doggan 18 years, 7 months ago
Hello, I've begun exploring neural networks, and I'm loving it, by the way. To start off I'd like to write a digit recognition program. I've been searching and reading a lot on the internet, especially about neural network structures, and it seems hard to decide which way to go. Mostly only the layout of a single node/neuron is explained, and the rest, i.e. how to connect it all, is left blank.

At first I tried a simple feed-forward network: a 32*32 grid of input nodes, onto which I can project an image, with every input node connected to each of the 10 output nodes. I believe this is much like the Adaline network, or linear classification. As far as I can tell, it performs very poorly on novel items presented to the network.

I looked further for other architectures and stumbled onto SOM networks, and also feature maps. The SOM tutorials only cover networks with 3 inputs (colors, e.g.). I'd like to know how a SOM can be used for digit recognition. My thoughts so far are:

a layer of 32*32 input nodes
a layer of 32*32 map nodes
connect each node of the input layer to all the nodes in the output layer.

These connections are weighted, and the weights are stored as a vector in each output node (so node x in the output layer is connected to all the nodes in the input layer, and the weights of these links are stored within the node itself).

Now the functionality itself. Training:

lay a bitmap on the input layer; this is represented as a vector of length 32*32
search for the node whose weight vector is closest to the input vector
adjust the weight vector of that output node to resemble the input vector more closely
do the same, but increasingly less, for the neighboring nodes.

This is as far as (I think) I follow the concept. But how should the network react to novel input? When I train it on a bitmap of a '2', for example, should it converge to a picture of this '2' when I present another picture resembling a two? I guess I'm not all too sure how to visualize the representations in such a network.
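The training steps described above can be sketched roughly as follows. This is only a minimal illustration in plain Python, not a reference implementation: the function names (`make_som`, `best_matching_unit`, `train_step`), the Gaussian neighbourhood falloff, and the tiny 4x4 map with 16 inputs are all assumptions chosen to keep the example small. A 32*32 bitmap would simply use `dim=1024`.

```python
import math
import random

def make_som(rows, cols, dim, rng):
    """Create a rows x cols map; each node holds a weight vector of length dim."""
    return [[[rng.random() for _ in range(dim)] for _ in range(cols)]
            for _ in range(rows)]

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def best_matching_unit(som, x):
    """Return (row, col) of the node whose weight vector is closest to x."""
    best, best_d = None, float("inf")
    for r, row in enumerate(som):
        for c, w in enumerate(row):
            d = sq_dist(w, x)
            if d < best_d:
                best, best_d = (r, c), d
    return best

def train_step(som, x, lr, radius):
    """Pull the BMU and its grid neighbours towards x.

    The Gaussian falloff is an assumed (but common) choice: the BMU itself
    moves by the full learning rate, neighbours move increasingly less.
    """
    br, bc = best_matching_unit(som, x)
    for r, row in enumerate(som):
        for c, w in enumerate(row):
            grid_d2 = (r - br) ** 2 + (c - bc) ** 2
            h = math.exp(-grid_d2 / (2.0 * radius ** 2))
            for i in range(len(w)):
                w[i] += lr * h * (x[i] - w[i])
    return br, bc

rng = random.Random(0)
som = make_som(4, 4, 16, rng)            # 4x4 map, 16 "pixels" per input
x = [1.0 if i % 5 == 0 else 0.0 for i in range(16)]  # diagonal of a 4x4 bitmap
bmu = train_step(som, x, lr=0.5, radius=1.0)
print("best matching unit:", bmu)
```

Note that the map size and the input size are independent: the map does not need to be 32*32 just because the image is. Each map node stores one 1024-element weight vector regardless of how many map nodes there are.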
Also, the learning. As far as I can tell, you present the network with an image. Then the loop commences: find the best matching unit, then adjust its weight vector and those of its neighbors. But what about repetition? Should I present the same picture twice or more? Can anyone help me fill in these concrete questions, and maybe provide links to CLEAR tutorials or articles?

Besides all that: what do you think would be the best way to connect a network suited for digit recognition (or image resemblance)? SOMs / feature maps (also fascinating, but I need to know more about those two)? Backprop, single-layered, multilayered?
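On the repetition question: SOM training normally presents the whole training set many times, while the learning rate and neighbourhood radius shrink. The sketch below illustrates that loop, plus one common way to handle novel input: after training, tag each map node with the label of the nearest training sample, then classify a new image by the tag of its best matching unit. Everything here is an assumption for illustration: the linear decay schedule, the function names, and the 3x3 toy "glyphs" standing in for 32*32 digit bitmaps.

```python
import math
import random

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def bmu(som, x):
    """Grid position (row, col) of the node closest to x."""
    return min(((r, c) for r in range(len(som)) for c in range(len(som[0]))),
               key=lambda rc: sq_dist(som[rc[0]][rc[1]], x))

def train(som, samples, epochs, lr0, radius0, rng):
    """Present every sample once per epoch while lr and radius decay."""
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                    # linear decay (assumed)
        radius = max(radius0 * (1.0 - frac), 0.5)  # never let radius hit zero
        order = samples[:]
        rng.shuffle(order)                         # avoid presentation-order bias
        for x in order:
            br, bc = bmu(som, x)
            for r, row in enumerate(som):
                for c, w in enumerate(row):
                    d2 = (r - br) ** 2 + (c - bc) ** 2
                    h = math.exp(-d2 / (2.0 * radius ** 2))
                    for i in range(len(w)):
                        w[i] += lr * h * (x[i] - w[i])

def label_nodes(som, samples, labels):
    """Tag each node with the label of the training sample nearest to it."""
    tags = {}
    for r, row in enumerate(som):
        for c, w in enumerate(row):
            nearest = min(range(len(samples)),
                          key=lambda i: sq_dist(w, samples[i]))
            tags[(r, c)] = labels[nearest]
    return tags

def classify(som, tags, x):
    """A novel input gets the tag of its best matching unit."""
    return tags[bmu(som, x)]

# Toy data: two 3x3 "glyphs" (a vertical bar and a horizontal bar).
VBAR = [0, 1, 0, 0, 1, 0, 0, 1, 0]
HBAR = [0, 0, 0, 1, 1, 1, 0, 0, 0]
rng = random.Random(0)
som = [[[rng.random() for _ in range(9)] for _ in range(3)] for _ in range(3)]
train(som, [VBAR, HBAR], epochs=30, lr0=0.5, radius0=1.5, rng=rng)
tags = label_nodes(som, [VBAR, HBAR], ["|", "-"])
noisy_vbar = [0, 1, 0, 0, 1, 0, 0, 0, 0]   # vertical bar, bottom pixel missing
result = classify(som, tags, noisy_vbar)
print("noisy vertical bar classified as:", result)
```

So the network does not "converge to a picture" when shown a novel image. The weights stay fixed after training, and the answer is simply which region of the map the new image lands in.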
Quote: Original post by Kincaid

To start off I'd like to program a digit recognition program.

[...]

Besides all that.
What do you think would be the best way to connect a network suited for digit recognition (or image resemblance)? SOMs / feature maps (also fascinating, but I need to know more about those two)? Backprop, single-layered, multilayered?


The best way would be not to use a NN. In my experience they have never given very good results in image recognition. Plus, as you are surely aware given your long post on the subject, they are insanely difficult to configure for any particular problem.

If your goal is to write a digit recognition program using a NN, then somebody else will need to help you, but I warn you, years of research have shown it's not the best way.

If you just want to achieve digit recognition by whatever means, I suggest pattern recognition techniques or kernel-machine classifiers.

Good luck!
The goal is to understand NNs, so even though it might not be the most accurate way to go, I still want to use NNs.
The fact that I'm trying digit recognition is rather random. I just needed something to explore with NNs :)

If you're going to proceed with using an ANN for image recognition, then I cannot stress highly enough the need to preprocess the input data into an orthogonal space. Use a PCA or ICA of the data as the input to the network. Why? Because cross-correlations in the input data make training far more difficult and degrade the quality of classification. This isn't as noticeable in low dimensional systems, but images fed in as pixel maps are inherently high dimensional.

Cheers,

Timkin
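To illustrate the preprocessing Timkin suggests, here is a rough PCA sketch in plain Python. It is an assumption-laden illustration, not his method: it finds the leading principal directions by power iteration with deflation (a real project would more likely use a linear algebra library), and the names `pca` and `project` are made up for this example. The network would then be fed the projected coordinates instead of raw pixels.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def mean_center(data):
    """Subtract the per-dimension mean from every sample."""
    n, dim = len(data), len(data[0])
    mean = [sum(row[i] for row in data) / n for i in range(dim)]
    return [[v - m for v, m in zip(row, mean)] for row in data], mean

def top_component(data, iters, rng):
    """Power iteration on X^T X (proportional to the covariance matrix)."""
    dim = len(data[0])
    v = [rng.random() - 0.5 for _ in range(dim)]
    for _ in range(iters):
        proj = [dot(row, v) for row in data]                 # X v
        v = [sum(p * row[i] for p, row in zip(proj, data))   # X^T (X v)
             for i in range(dim)]
        norm = math.sqrt(dot(v, v))
        if norm == 0.0:          # no variance left; v stays the zero vector
            break
        v = [a / norm for a in v]
    return v

def pca(data, k, iters=200, seed=0):
    """Return (mean, components) for the first k principal directions."""
    rng = random.Random(seed)
    centered, mean = mean_center(data)
    components = []
    for _ in range(k):
        v = top_component(centered, iters, rng)
        components.append(v)
        # Deflate: remove the part of every sample that lies along v.
        deflated = []
        for row in centered:
            p = dot(row, v)
            deflated.append([a - p * b for a, b in zip(row, v)])
        centered = deflated
    return mean, components

def project(x, mean, components):
    """Coordinates of x in the decorrelated (principal component) space."""
    centered = [a - m for a, m in zip(x, mean)]
    return [dot(centered, c) for c in components]

# Toy usage: strongly correlated 2-D points reduced to one coordinate.
rng = random.Random(1)
points = [[t, 2.0 * t + rng.gauss(0.0, 0.05)]
          for t in [i / 10.0 for i in range(20)]]
mean, comps = pca(points, k=1)
print(project(points[0], mean, comps))
```

For 32*32 bitmaps each sample would be a 1024-element vector, and `k` would be chosen much smaller than 1024, so the network sees a short, decorrelated input vector.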
Does anyone else have any thoughts on the do's instead of the don'ts in NNs :) ??
I wrote a SOM application in C# last year to perform basic image recognition. In its current state, you present the application with a large assortment of images, and it is able to group them according to similarities. It can be tweaked pretty easily so that you can present a new image to the trained network and have it grouped accordingly (this is not implemented in the current version).

You can download it @ my webpage: http://www.shyammichael.com in the Downloads -> Applications section. It is called SOM_Image.

I use two methods for image classification: 1) brightness of the image, and 2) color patterns (breaking the image into 9 square regions and calculating an average color for each).
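The two feature types described above might look something like this in Python. This is a guess at the general idea, not the code from the C# application: the function names, the representation of an image as rows of `(r, g, b)` tuples, and the way brightness is averaged are all assumptions. The image is assumed to be at least 3 pixels wide and tall so that no region is empty.

```python
def brightness(pixels):
    """Mean of the per-pixel channel average, in the same range as the input."""
    values = [sum(p) / 3.0 for row in pixels for p in row]
    return sum(values) / len(values)

def region_features(pixels, grid=3):
    """Average (r, g, b) for each cell of a grid x grid partition.

    Returns a flat vector of grid * grid * 3 numbers, which could be used
    directly as the input vector of a SOM.
    """
    height, width = len(pixels), len(pixels[0])
    features = []
    for gy in range(grid):
        y0, y1 = gy * height // grid, (gy + 1) * height // grid
        for gx in range(grid):
            x0, x1 = gx * width // grid, (gx + 1) * width // grid
            cell = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            for ch in range(3):
                features.append(sum(p[ch] for p in cell) / len(cell))
    return features

# Toy usage: a 6x6 image whose left half is red and right half is blue.
RED, BLUE = (255, 0, 0), (0, 0, 255)
image = [[RED] * 3 + [BLUE] * 3 for _ in range(6)]
print(len(region_features(image)))   # 9 regions * 3 channels
print(brightness(image))
```

The appeal of features like these is that the SOM input shrinks from one value per pixel to 27 numbers (or a single number for brightness), at the cost of discarding fine detail.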

Maybe the code will help answer some of your questions :)

Edit: Apparently there's no code in the above referenced area, only the executable. I just uploaded it, so now you can get code here
