
Stereo Image processing, depth maps

Started by RPGeezus August 23, 2004 12:54 PM
2 comments, last by ick 20 years, 2 months ago
[corrected spelling mistakes] Not strictly an AI question, more of a computer vision question, but this seems to be the appropriate forum. Just for kicks I've been mucking about with stereo image processing. I have a program that examines two images of the same subject, taken from different angles, and determines the (relative) distance of each point in the image from the camera(s). There are a few problems with the method I'm using.

The method: I examine each point in the right-hand image and compare it to a set of points in the left-hand image, trying to find a match. The further apart the two points are (the source point and the best-matched point), the closer they must be to the camera. To find a match I'm using the Zero-Mean Normalized Cross-Correlation (ZMNCC) algorithm. In short, this algorithm normalizes the data against the mean of each set (set1[x][y] -= meanOfSet1; set2[x][y] -= meanOfSet2), multiplies the two sets together point by point (myVal = set1[x][y] * set2[x][y]), and divides the result by a measure of variance (myVal = myVal / variance). This produces a number between -1.0 and 1.0, where 1.0 indicates a perfect match. If anyone is interested I can post source code. Regardless, this part of the code works very well. Thank you NASA for the idea (the two Mars rovers use this method). Cross-correlation can be used to solve a variety of machine vision problems and works remarkably well.

The problems occur in the following areas:
1. Textureless features in the image cannot be reliably matched.
2. Certain features, such as horizontal lines, cannot be reliably matched.
3. Some parts of one image do not exist in the other, due to the different viewing angle.

So far my program copes with these problems as follows:
1. In my current implementation I detect featureless areas (using another measure of variance) and tag them as unmatchable.
2. When the program encounters a horizontal line-type feature, it gets tricked into thinking the feature is very close to the camera.
3. I currently have no way to detect features that cannot be matched due to the different viewing angles.

I would like to be able to address all three of these problems.

Textureless features: How can I determine the depth of a featureless area? I'm thinking of just filling the data in with the depth value to the left of the featureless pixel. I've seen some solutions that do this, and although they appear to be noise-free, the final output has some significant flaws.

Horizontal line-type features: This happens because any part of a horizontal line very closely matches any other part of the same line. This is not an issue for me with vertical lines because of the way I process the image (on a scanline-by-scanline basis). One idea I've had is to use an edge detector, like Canny, to mark regions of the image that contain this annoying signal; when I detect it I will treat the data like a textureless feature. Another idea is to bias the matching algorithm to prefer matches that give a depth reading similar to the previous depth reading. My hope is that the ZMNCC algorithm will return an almost equal value for almost all parts of the line; with the bias added, it will prefer the least extreme pixel match, distance-wise. The third option, and the least attractive to me, is to detect the edge (Canny again) and use a differently shaped search window (instead of 11x11 pixels, 5x30). I don't like this because it would mean a significant slowdown in the program.

Data unavailable due to different camera angles: I think the technical term for this is occlusion. If I could detect occlusion I would be very happy, as I could then draw many logical conclusions about various parts of the image. The only method I currently have that could detect this is watching for a no-match condition from the ZMNCC algorithm (a value less than 0.5, for example). The problem with this is that, due to image noise, specular highlights off smooth surfaces, and image artifacts, I get all kinds of no-match conditions, and in some places where I know occlusion should occur, I sometimes get good matches.

Sorry for such a long post. I welcome any comments and suggestions. This discussion is open to all.

Will

[Edited by - RPGeezus on August 23, 2004 2:16:34 PM]
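The core of the ZMNCC match described above can be sketched as a minimal NumPy function. This is an illustrative version for two equal-size patches, not the actual program; the zero-score fallback for flat (textureless) patches is my own choice:

```python
import numpy as np

def zmncc(patch_left, patch_right):
    """Zero-Mean Normalized Cross-Correlation of two equal-size patches.

    Returns a score in [-1.0, 1.0], where 1.0 indicates a perfect match.
    """
    # subtract each patch's own mean
    a = patch_left.astype(float) - patch_left.mean()
    b = patch_right.astype(float) - patch_right.mean()
    # normalize by the variance-based term
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    if denom == 0.0:
        # a completely flat (textureless) patch gives no reliable score
        return 0.0
    return (a * b).sum() / denom
```

A patch correlated with itself scores 1.0, and with its negated copy scores -1.0, which is why a threshold like the 0.5 mentioned above can flag "no match."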
------------------http://www.nentari.com
Hummmm.... It sounds to me that your problem is more of a feature-matching one than a stereo-processing one.

About the horizontal lines, that's normal. It doesn't happen with vertical ones because of the epipolar constraint: you don't look for features up or down, right? It will always be a problem with a line at the same angle (or close to it) as the epipolar line (which is horizontal if the positioning difference between the cams is a horizontal shift, or if you rectified the images).

I don't know what you want to achieve with your program, but maybe you could try a feature-based algorithm? You won't have the surface, but it's easier to detect occlusions and to deal with the horizontal line problem. If you want to check that out, I suggest you read the not-new but always nice paper "Good Features to Track". I don't remember the author right now. It talks about the KLT (Kanade-Lucas-Tomasi??? They must be the authors!!! Heheheh) feature tracker.

Anyway, your bias idea sounds good. You could give it a bias to match the closest unmatched point (region). Besides the epipolar constraint, you can also restrict the direction of the search. As you certainly know, you start searching at the same coordinates as in the other image and move toward the center of the "head" (if it's the camera on the right, you move from right to left). If the correlation is similar (up to a factor you have to choose), you pick the first one you found. Is that what you were thinking?
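The biased pick among near-equal correlation scores might look something like this. It's only a sketch: the function name and the `tolerance` knob (the "factor you have to choose") are made up for illustration, and `score` is assumed to hold one ZMNCC value per candidate disparity along the scanline:

```python
import numpy as np

def best_disparity(score, prev_disparity, tolerance=0.05):
    """Pick a disparity from a 1-D array of correlation scores.

    Among candidates whose score is within `tolerance` of the best score,
    prefer the one closest to the previous pixel's disparity, so a long
    horizontal line doesn't snap to an extreme (very close) depth.
    """
    best = score.max()
    # indices of all near-ties with the best score
    candidates = np.flatnonzero(score >= best - tolerance)
    # bias: the least surprising disparity wins the near-tie
    return int(candidates[np.argmin(np.abs(candidates - prev_disparity))])
```

On a horizontal line, where many shifts score almost identically, this keeps the disparity continuous instead of jumping to whichever tie happens to score a hair higher.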

For occlusions, you are on the right path. Sometimes it helps to change your dissimilarity function. Have you tried something like a plain sum of squared errors?
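For comparison with ZMNCC, the plain sum-of-squared-errors dissimilarity mentioned here is just (minimal sketch, my own wording of it):

```python
import numpy as np

def ssd(patch_a, patch_b):
    """Sum of squared differences between two equal-size patches.

    0.0 means identical patches; larger values mean a worse match.
    Unlike ZMNCC it is NOT normalized, so it is sensitive to brightness
    differences between the two cameras.
    """
    d = patch_a.astype(float) - patch_b.astype(float)
    return (d * d).sum()
```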

You see, it's sad, but it's really hard to make a 3D map of a scene in non-ideal conditions. Did you know that when trying to perform face recognition, several algorithms cut out the eye region? The specular highlights make that area really difficult to work with.

Another thing you can try is to filter the resulting depth image after the 3D reconstruction. Also not very nice. Or you can always cheat and: (1) get pictures of scenes without occlusions and fully textured; (2) get a third camera :)

[edited]
Hummm... I just remembered reading a paper about a stochastic approach to the occlusion detection problem using optical flow + Bayesian theory. If I remember correctly, the guy made a map of visible points, and the invisible ones were simply ignored during the matching process. I will try to google for it and post back (probably tomorrow, I'm really tired right now).
[/edited]
Thanks for the reply.

No, I have not tried sum of squared errors.. I'll look into it.

Filling in the textureless regions with existing depth values works reasonably well, though it gets a little messy in some instances.
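The fill-from-the-left idea for textureless regions can be sketched like this (illustrative only, not the actual program; `invalid` is my stand-in marker for pixels tagged unmatchable):

```python
import numpy as np

def fill_unmatched(depth, invalid=-1):
    """Fill unmatchable pixels with the nearest valid depth to their left.

    Scans each row left to right; pixels with no valid neighbor to the
    left stay marked as `invalid`.
    """
    out = depth.copy()
    for row in out:
        last = invalid
        for x in range(len(row)):
            if row[x] == invalid:
                row[x] = last       # propagate the last good depth
            else:
                last = row[x]       # remember the most recent valid depth
    return out
```

This is noise-free in the sense the poster describes, but it smears the left neighbor's depth across the whole flat region, which is exactly the kind of flaw mentioned above.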

Biasing the value based on history has helped quite a bit, but has not resolved all of the horizontal line problems. I think a trip to Canny land is in order.

I've also built in some 'noise' filtering, which also made a big improvement. I think the correct term for what I am doing is thresholding (if (this - last) < someValue then forget it).
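That thresholding rule, sketched over a sequence of depth readings (my own minimal version, with `min_jump` standing in for someValue and an absolute-value comparison assumed):

```python
def threshold_filter(depths, min_jump=2):
    """Suppress small jitter in a sequence of depth readings.

    Keep the previous accepted value unless the new reading differs from
    it by at least `min_jump`; small changes are "forgotten" as noise.
    """
    out = [depths[0]]
    for d in depths[1:]:
        out.append(d if abs(d - out[-1]) >= min_jump else out[-1])
    return out
```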

I don't really have an end goal.. Ideally, something that has minimal noise, extracts the most important details, and can run in real time.

This is more a personal-development project than anything else.. My idea of 'fun' typically involves the inane. lol

I've noticed the best implementations work on image sequences and build up a model. Not something I'm interested in just yet, but maybe later.

Will
------------------http://www.nentari.com
Hi,
I've found this article through Google and I'm very interested in this topic. I am an undergrad student doing research which requires stereo image processing. I am not a programmer myself and I am now facing the problem of analyzing images.
I am trying to analyze the porosity of an image which contains a lot of particles floating very deep in the middle of the pore (they can be seen in both images of the stereo pair), and I need some kind of program which is able to tell me how deep all of those particles are, so I can easily take them out if they're deeper than my threshold depth. I read through the topic you posted, and if I understand correctly, your program seems to do exactly what I need to solve this problem. I've been struggling to find a way to solve this for the past month and I am really, really in need of help. Matlab seems to be able to automatically detect the same particle in both images of the pair, but it would be really tough for me to measure all of them individually.
Would it be possible for me to try your program on my image? I would greatly appreciate it. I've been struggling and don't know what to do with this for more than a month now.... Please e-mail me at s-bhumiratana@northwestern.edu
Thank you very much,
Sarindr

This topic is closed to new replies.
