
Image Targeting

Started by April 15, 2002 09:31 PM
8 comments, last by Angelus_FU 22 years, 7 months ago
Hello everyone, I've been on and off this site for a couple of years now, but I've been absent for months. Usually I read rather than post. I'm fairly new to image processing and artificial intelligence, and what I'm trying to do is apply neural networks to following a targeted image region. I've already developed a basic back-propagation neural network, but I don't know how to properly apply a neural network to image recognition and the like.

Essentially there's a 640x480 image (this doesn't have to be real-time analysis), and there's a roughly 30x30 pixel region that I want to follow (for example, part of a pen). I want to identify its position wherever it is on the screen, using neural networks. I've done it using simple pixel-to-pixel comparison, but is there a way I can do it with a neural net? I have no idea how to pre-process the image to feed it into the NN.

I would really appreciate it if you guys can help me out. Thank you very much,
If at first you don't succeed, cheat. Repeat until caught, then lie.
One way of performing image recognition using an ANN is to have an input node for each pixel, or cluster of pixels, in the input image. Using a cluster of pixels would involve averaging out the pixel values. You then train the network on input images and corresponding output classifications.
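For example, here is a minimal sketch of that clustering step, assuming the frame is an 8-bit grayscale image held in a 2-D NumPy array (the function name and cluster size are purely illustrative):

import numpy as np

def image_to_inputs(image, cluster=8):
    # Average each cluster x cluster block of pixels and scale to 0..1,
    # giving one input node per block instead of one per pixel.
    h, w = image.shape
    h -= h % cluster                      # drop any ragged edge
    w -= w % cluster
    blocks = image[:h, :w].reshape(h // cluster, cluster, w // cluster, cluster)
    means = blocks.mean(axis=(1, 3))      # one average value per block
    return (means / 255.0).ravel()        # flatten to a 1-D input vector

# A 640x480 frame with 8x8 clusters gives 80*60 = 4800 input nodes.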

I presume, though, that you want to make successive identifications of the object easier by using prior knowledge of the object's location and velocity obtained from earlier frames. Correct?

If so, then you also want to represent this data in the input and output set of the network... and you'll need to use a recurrent ANN instead of a static network. Needless to say, this is not an easy problem and certainly harder than implementing a standard feedforward network.

Cheers,

Timkin

Normal backpropagation networks are not translation invariant, at least not if you are thinking of a net like a 640*480-n-m net.

You should probably try a search for "neocognitron"; that's a method for detecting features in a scale- and translation-invariant way.

@$3.1415rin
Thank you for your replies.

The image is currently 640x480, but it doesn't have to be; I can scale it down, no big deal.
Also, I've looked into neocognitrons without much success. If anybody can point me to a place where I can learn the principles of image recognition with those, that would help.

Anyway, another of my questions: let's assume that I will be using a simple back-prop network (I won't, but let's assume), and that I just compare two images with the object in the same position, just to see if it can detect it. How would I go about inputting the image into the network? Would I just feed each pixel into an input node, by giving each bit of, say, a 4-bit grayscale image to the input nodes, or is there something else?

I've only done simple pattern recognition with 1s and 0s, not with actual images. So any help would still be appreciated.

Thank you very much,
If at first you don't succeed, cheat. Repeat until caught, then lie.
I don't have any direct experience with this use of ANNs, but I can get you started.

Have a look at motters' Rodney project. He has some good documentation on his site here:

http://www.fuzzgun.btinternet.co.uk/rodney/vision.htm

If you have any questions, the best place to find motters is the generation5 forum:

www.generation5.org

Also, I think you may find this project very interesting:

http://dmtwww.epfl.ch/isr/east/team/kato/SmartEye/

It's pretty good stuff.



Stimulate
But even if I could use one of these (I'm beginning to understand the principle behind a neocognitron... sort of), how do I give the input nodes the values of the pixels?

Do I need one node for each RGB value of the pixel (that wouldn't make much sense...), or is there some way to boil it down to one value (I can't see how that would work...)? Also, I thought all the values in an NN (whether input or output) end up being between 0 and 1, so I can't input a number greater than one, and I don't think it works if I input a number other than 1 or 0.

My main difficulty is how to input the image into the NN.

Help please.
If at first you don't succeed, cheat. Repeat until caught, then lie.
quote: Original post by Angelus_FU
Do I need one node for each RGB value of the pixel (that wouldn't make much sense...), or is there some way to boil it down to one value (I can't see how that would work...)? Also, I thought all the values in an NN (whether input or output) end up being between 0 and 1, so I can't input a number greater than one, and I don't think it works if I input a number other than 1 or 0.



Neither the inputs nor the outputs have to be 0 or 1. The outputs are (usually) real numbers bounded to (usually) -1 < output < 1.

I wouldn't recommend inputting the raw RGB values. You should preprocess them first.




Stimulate
Doh, for some reason it can't handle less-than symbols.

In short, your inputs can be any size you like, but I would recommend standardizing them.
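As a rough illustration of that preprocessing (the luminance weights and array layout here are just one common choice, not a requirement):

import numpy as np

def preprocess(rgb):
    # rgb: (height, width, 3) array of 0..255 values.
    gray = rgb @ np.array([0.299, 0.587, 0.114])        # collapse RGB to one brightness value per pixel
    gray = gray / 255.0                                  # squash into 0..1
    return (gray - gray.mean()) / (gray.std() + 1e-8)   # standardize: zero mean, unit variance inputs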



Stimulate
Hey guys,

Again, thanks a lot for helping; I got that stuff all worked out. But maybe there is something else you can help me with. I've seen all these programs that do face recognition with NNs. But before they do the recognition, they automatically locate the face first.

Do they do this by simply running a loop over all of the pixels in the image to find a face, or is there a different way to automatically locate the face?

Because what I'm doing is locating a target. So is there an easy way to input the entire image and have the NN locate the target, or are there alternate methods?
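For concreteness, here is a minimal sketch of the brute-force scan described above: slide a window across the image and keep the position the classifier scores highest. 'classify' is a stand-in for whatever network is trained to score a 30x30 patch; the step size is an illustrative assumption.

import numpy as np

def locate_target(image, classify, win=30, step=5):
    # Slide a win x win window over the image and return the best-scoring
    # position; classify(patch) should return a single "target-ness" score.
    best_score, best_pos = -np.inf, None
    h, w = image.shape
    for y in range(0, h - win + 1, step):
        for x in range(0, w - win + 1, step):
            score = classify(image[y:y + win, x:x + win])
            if score > best_score:
                best_score, best_pos = score, (x, y)
    return best_pos, best_score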

Please help,

Thanks
If at first you don't succeed, cheat. Repeat until caught, then lie.
fup: use
& l t ;
(remove spaces: < )
when trying to put a less-than sign on a post

HTH,
ld
No Excuses

This topic is closed to new replies.
