Neural Network pattern recognition
If you used a neural network to recognize patterns in a 2D image, how would you go about identifying a given pattern at a varying scale or x/y translation? I don't really have a project for this, I'm just curious.
Typically, all input images are scaled to a fixed width × height, and the unused pixels around them are cropped away so that new images look like the training set. This resolves issues with scale and translation somewhat, although it yields difficulties of its own, such as improper stretching or the disappearance of thin lines.
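As a rough sketch of that preprocessing step (a hypothetical helper, assuming the image is a 2D numpy array where non-zero pixels are the "ink"):

```python
import numpy as np

def normalize_image(img, out_h=28, out_w=28):
    """Crop the bounding box of non-background pixels, then rescale
    to a fixed size with nearest-neighbour sampling, so every input
    matches the training-set dimensions."""
    # Find the bounding box of non-zero ("ink") pixels.
    ys, xs = np.nonzero(img)
    if len(ys) == 0:
        return np.zeros((out_h, out_w), dtype=img.dtype)
    cropped = img[ys.min():ys.max() + 1, xs.min():xs.max() + 1]
    # Nearest-neighbour resample to the fixed size. The aspect ratio
    # is not preserved -- this is the "improper stretching" drawback
    # mentioned above, and thin lines can vanish when shrinking.
    h, w = cropped.shape
    row_idx = np.arange(out_h) * h // out_h
    col_idx = np.arange(out_w) * w // out_w
    return cropped[row_idx][:, col_idx]
```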
So it just uses conventional programming techniques to isolate each character?
Quote: Original post by Kaze
If you used a neural network to recognize patterns in a 2D image, how would you go about identifying a given pattern at a varying scale or x/y translation? I don't really have a project for this, I'm just curious.
Most uses of ANNs on recognition tasks (such as OCR or feature extraction) rely on preprocessing of the data. Indeed, this can make or break the application. Eigenimages are an effective representation of the image space which can incorporate transformations and scalings, and are used on problems such as facial recognition from moving images. Is this the sort of problem you were considering?
Cheers,
Timkin
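For the curious: an eigenimage decomposition is essentially a principal-component analysis of the image set. A minimal sketch (function names are made up for illustration; the basis comes from numpy's SVD of the mean-centred image stack):

```python
import numpy as np

def eigenimages(images, k=4):
    """Compute the top-k 'eigenimages' (principal components) of a
    stack of equally sized images, as in eigenface-style recognition.
    `images` is an (n, h, w) array of n images."""
    n, h, w = images.shape
    flat = images.reshape(n, h * w).astype(float)
    mean = flat.mean(axis=0)
    # SVD of the mean-centred data; the rows of vt are the
    # eigenimages, ordered by how much variance they capture.
    _, _, vt = np.linalg.svd(flat - mean, full_matrices=False)
    return mean, vt[:k]

def project(img, mean, basis):
    """Describe an image by its coordinates in eigenimage space;
    recognition then compares these low-dimensional vectors instead
    of raw pixels."""
    return basis @ (img.ravel().astype(float) - mean)
```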
Very common problem. You'll be glad to know there is a simple solution. :)
A common approach to this is to look through the image at different scales, and feed the scaled data into your detection algorithm.
Let's pretend our algorithm examines a 10x10 block of image data. The picture we're looking at is 100x100.
First, break the image up into 100 10x10 blocks, and check each block.
Then we break the image up into 20x20 blocks, scale the 20x20 blocks so they are 10x10, and check again.
Then 30x30, 40x40, 50x50, and keep going until we get to the true size of the source image (100x100), each time scaling the 'blocks' down to the size our algorithm wants.
This is a good approach if your object detection routine spits out a 'yes / no' type of answer.
You may run into the problem of detecting the same object more than once, but you can use some simple logic to post-process the results and identify the true location.
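The multi-scale scan described above can be sketched like this (an illustrative version only, assuming the image is a 2D numpy array and `detector` is any yes/no callable that accepts a 10x10 block):

```python
import numpy as np

def scan_multiscale(image, detector, win=10):
    """Slide blocks of win x win, 2win x 2win, 3win x 3win, ... over
    the image, shrink each block back down to win x win, and record
    every block the detector answers 'yes' to."""
    h, w = image.shape
    hits = []
    size = win
    while size <= min(h, w):
        for y in range(0, h - size + 1, size):
            for x in range(0, w - size + 1, size):
                block = image[y:y + size, x:x + size]
                # Nearest-neighbour shrink of the block to win x win.
                idx = np.arange(win) * size // win
                small = block[idx][:, idx]
                if detector(small):
                    hits.append((x, y, size))
        size += win
    return hits  # (x, y, block size) of each detection
```

The same object will often fire at several adjacent positions and scales, which is exactly the duplicate-detection problem mentioned above; a post-processing pass over `hits` resolves it.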
I've actually been working (slowly but surely) on an object detector of my own, using a slightly different approach. Instead of having a boolean type output, I'm attempting to get the algorithm to tag pixels in the source image that it believes belong to a face. I don't actually want it to tell me _where_ a face exists, just that a group of pixels probably belongs to a face. There is a subtle difference. Anywho, I'm hoping that I'll be able to overcome the problem of scale this way.
I'll then use a second program to segment the original image into separate faces by processing the output of the first algorithm.
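That second pass could be something as simple as a connected-components labelling of the tagged pixels (a minimal sketch, assuming the first algorithm's output is a boolean "probably face" mask):

```python
import numpy as np

def segment_mask(mask):
    """Group 'face' pixels (True entries of a boolean mask) into
    4-connected components with an iterative flood fill -- one
    component per detected face, whatever its size or position."""
    labels = np.zeros(mask.shape, dtype=int)
    count = 0
    for sy in range(mask.shape[0]):
        for sx in range(mask.shape[1]):
            if mask[sy, sx] and labels[sy, sx] == 0:
                count += 1
                stack = [(sy, sx)]
                while stack:
                    y, x = stack.pop()
                    if (0 <= y < mask.shape[0] and 0 <= x < mask.shape[1]
                            and mask[y, x] and labels[y, x] == 0):
                        labels[y, x] = count
                        stack += [(y + 1, x), (y - 1, x),
                                  (y, x + 1), (y, x - 1)]
    return labels, count
```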
------------------
http://www.nentari.com
Quote: Original post by RPGeezus
Very common problem. You'll be glad to know there is a simple solution. :)
Rofl... you might like to know that an Eigenimage decomposition gives exactly this sort of multiscale analysis. I love how people think that solutions with fancy names are not 'simple' and that they need to reinvent the wheel. ;)
Timkin
I'm curious, Kaze: Why are you asking about using neural networks to solve this problem, as opposed to other approaches?
Quote: Original post by Sneftel
I'm curious, Kaze: Why are you asking about using neural networks to solve this problem, as opposed to other approaches?
Just that I'm currently interested in neural networks, mostly because of their combination of simplicity and power. Though on the note of other approaches, can anyone explain the process of SIFT?
[Edited by - Kaze on November 28, 2005 9:15:44 PM]
Interesting. The reason I ask is that a disproportionate number of people here are specifically interested in using ANNs for pretty much any game AI problem out there. To an extent, it makes sense; in many ways, ANNs are the "sexiest" area of AI. But in areas like this, ANNs turn out to be pretty much the worst solution imaginable because of their limited ability to cross-correlate inputs.