Computer Vision

Started by June 06, 2008 07:58 AM
6 comments, last by corsarius 16 years, 8 months ago
I'm interested in learning the techniques of 'computer vision', specifically taking a 2D image and defining geometry from it. It doesn't have to start out complex. At first it can be inaccurate, perhaps defining a basic planar structure and determining the distance of each plane; I think that's a good place to start. Later I could fill it out with a 'rasterizer'-like ability to understand an object's shape. Does anyone know a good place to start for this, maybe a link to some code? I would appreciate it.
Quote:
Original post by VprMatrix89
I'm interested in learning the techniques of 'computer vision', specifically taking a 2d image and defining geometry from it.


Defining geometry is already part of the harder stuff... Also, calculating the distance of objects from a single image (unless you meant in pixels) is virtually impossible unless you have lots of information about the objects themselves and/or the environment they are in.

Start by implementing some edge detection techniques (Sobel, to give you one keyword to look up); each technique has its drawbacks. Try to understand how they work mathematically, as this will help you understand what defines an edge.
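To give an idea of what the Sobel operator does, here is a minimal sketch in plain Python. It assumes a grayscale image stored as a list of rows and simply leaves the border pixels at zero; the function name and the tiny test image are made up for illustration, not taken from any library.

```python
import math

# Sobel kernels: approximate horizontal (KX) and vertical (KY) derivatives.
KX = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
KY = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def sobel_magnitude(img):
    """Gradient magnitude of a grayscale image given as a list of rows.
    Border pixels are left at 0 (an assumption made to keep the sketch short)."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gy = 0.0
            for j in range(3):
                for i in range(3):
                    p = img[y + j - 1][x + i - 1]
                    gx += KX[j][i] * p
                    gy += KY[j][i] * p
            out[y][x] = math.hypot(gx, gy)
    return out

# Hypothetical test image: dark left half, bright right half (one vertical edge).
image = [[0, 0, 0, 255, 255, 255] for _ in range(5)]
edges = sobel_magnitude(image)
```

The response is large only in the two columns that straddle the dark/bright boundary and zero in the flat regions, which is a reasonable working definition of an "edge".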

After that, you could try to make a corner detector (it could be useful later for detecting geometries like squares, rectangles, etc.). The Hough transform is also something useful to learn.
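As a rough illustration of the Hough transform for straight lines, the sketch below lets each edge point vote for every (rho, theta) line that could pass through it, then picks the bin with the most votes. It assumes you already have a list of edge pixel coordinates; the function names, the one-degree angle step and the sample points are arbitrary choices for this example.

```python
import math

def hough_lines(points, width, height, n_theta=180):
    """Accumulate votes in (rho, theta) space for a list of (x, y) edge points.
    Returns the accumulator and the offset added to rho to get a row index."""
    max_rho = int(math.ceil(math.hypot(width, height)))
    acc = [[0] * n_theta for _ in range(2 * max_rho + 1)]
    for x, y in points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            rho = int(round(x * math.cos(theta) + y * math.sin(theta)))
            acc[rho + max_rho][t] += 1
    return acc, max_rho

def strongest_line(acc, offset, n_theta=180):
    """Return (votes, rho, theta in degrees) of the fullest accumulator bin."""
    votes, r, t = max(
        (acc[r][t], r, t) for r in range(len(acc)) for t in range(n_theta)
    )
    return votes, r - offset, 180.0 * t / n_theta

# Hypothetical edge points lying on the horizontal line y = 3.
points = [(x, 3) for x in range(10)]
acc, offset = hough_lines(points, width=10, height=10)
votes, rho, theta_deg = strongest_line(acc, offset)
```

With such a short line, several neighbouring angle bins collect the same number of votes, so the reported angle is only near 90 degrees. Practical implementations deal with that by looking for local peaks in the accumulator rather than a single maximum.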

If you want to reconstruct 3D geometry you might need a stronger math background and lots of patience. Reconstruction using the silhouette of an object (Shape-From-Silhouettes) might be the easiest approach to understand and implement if you are new to this.

Quote:
Original post by VprMatrix89
Does anyone know a good place to start for this, maybe a link to some code?


A good place to start is Google; I guess I gave you enough keywords to start. Google has always been my first "reference" when I couldn't find something in my notes or needed more explanation than the teacher gave during his lecture.


JFF

It's a very, VERY large body of work. You can't just start by looking at some code.

First of all, do you want to work on a single image at a time, or on a series of images of the same scene (like a video, or multiple points of view of the same scene)?

Then, you need to choose the visual cue you will use.

There is shape-from-shadow, shape-from-perspective, shape-from-movement, shape-from-texture, and more exotic ones like shape-from-blur and shape-from-atmospheric-degradation...

Here are some books to start with:

Introductory Techniques for 3-D Computer Vision, aka "Trucco-Verri". Best beginner's book on computer vision.

You'll also need strong image analysis skills before you can take on computer vision, so I strongly recommend:

Digital Image Processing by Gonzalez and Woods. If you cannot understand the imaging techniques in this book, you probably shouldn't be thinking about computer vision yet.

Computer vision is pretty fun, enjoy yourself!

-- Mathieu
This paper took me a while to digest, but it is a good place to start for circle detection. (Link)
"shape-from-shadow"

This is probably the most important aspect, I think. I will probably start with a sphere and then add an overlapping one.
Quote:
Original post by VprMatrix89
"shape-from-shadow"

This is probably the most important aspect, I think. I will probably start with a sphere and then add an overlapping one.


The tricky part in shape-from-shadow is that you need to solve two problems: you need to find the scene depth (shape) AND find the direction the light comes from. If you have multiple light sources, or light sources that are too close to be considered at infinity, it gets ugly.

Well, I never really took much interest in shape-from-shadow (perspective and movement were my cup of tea). Start with a good ol'-fashioned literature review!
Quote:
Original post by VprMatrix89
I'm interested in learning the techniques of 'computer vision', specifically taking a 2D image and defining geometry from it. It doesn't have to start out complex.

Don't be put off by intimidating responses. In its simplest form it's extremely easy, if you are fluent with elementary vector/linear algebra.
If you don't want to code it yourself, just download ARToolkit or ARToolkit+, which are open-source marker registration packages.
But I'd recommend starting with the basics; it will also help you understand ARToolkit better.
Start with a simple black square marker on a white background. Try to build a coordinate system from it.
1. Binarize your image into black/white (a simple threshold will do for a start; use an adaptive threshold for real work).
2. Segment your image into connected components (google for an algorithm if you have problems with it; there are plenty).
3. Approximate each component with a polygon: make a "line segmentation" of the boundary. Again, there are several methods; google for them. I also posted an in-depth explanation on the Usenet group sci.image.processing a couple of years ago.
Actually, that is the hardest part of the "simple" approach.
4. Choose the components that consist of four lines: these are the suspected markers.
5. Build a coordinate system from the lines.
Suppose V0, V1, V2, V3 are the vectors representing the lines (counterclockwise).
Use V0 as X and V1 as Y.
The Z axis can be calculated as (V0 X V2) X (V1 X V3), where "X" is the vector cross product.
You now have your coordinate system. You can use additional structures (inside the marker, for example) to choose which corner of the marker to use as the origin.
6. Download the ARToolkit manual and read it.
7. Subscribe to the ARToolkit mailing list.
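Steps 1 and 2 of the outline above can be sketched in a few lines of plain Python. This is only an illustration under simple assumptions: a fixed global threshold of 128, 4-connectivity, and an image stored as a list of rows. The function names and the sample image are invented for the example and have nothing to do with ARToolkit's own code.

```python
from collections import deque

def binarize(img, threshold=128):
    """Step 1: mark dark pixels (the marker) as 1 and everything else as 0."""
    return [[1 if p < threshold else 0 for p in row] for row in img]

def connected_components(mask):
    """Step 2: group 4-connected 1-pixels with a breadth-first flood fill.
    Returns a list of components, each a list of (x, y) pixel coordinates."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    components = []
    for y0 in range(h):
        for x0 in range(w):
            if mask[y0][x0] == 0 or seen[y0][x0]:
                continue
            seen[y0][x0] = True
            queue, pixels = deque([(x0, y0)]), []
            while queue:
                x, y = queue.popleft()
                pixels.append((x, y))
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if 0 <= nx < w and 0 <= ny < h and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            components.append(pixels)
    return components

# Hypothetical image: two dark blobs on a white background.
image = [
    [255, 255, 255, 255, 255, 255],
    [255,  10,  10, 255, 255, 255],
    [255,  10,  10, 255,  20, 255],
    [255, 255, 255, 255,  20, 255],
]
blobs = connected_components(binarize(image))
```

Each entry of `blobs` is one candidate region whose boundary would then go through the polygon approximation of step 3.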


I reiterate: if you are fluent with vectors and matrices and are an experienced coder, it's not a big deal.
The problems start when you try to get rid of markers, or have bad lighting conditions, or use several markers, or want better stability, etc.
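To make the vector part of step 5 concrete, here is one possible reading of it, sketched in plain Python. The assumption (mine, not spelled out in the outline) is that each Vi is the homogeneous line through two adjacent marker corners, with the corners given in normalized camera coordinates, i.e. with the camera intrinsics already removed. Under that reading, V0 X V2 is the 3D direction shared by one pair of opposite sides, V1 X V3 is the direction of the other pair, and their cross product is the marker normal. All function names are hypothetical.

```python
import math

def cross(a, b):
    """3D vector cross product."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normalize(v):
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def edge_line(p, q):
    """Homogeneous line through image points p and q. It is also the normal
    of the plane through the camera centre and that marker edge."""
    return cross((p[0], p[1], 1.0), (q[0], q[1], 1.0))

def marker_axes(corners):
    """corners: four (x, y) points in order around the square, in normalized
    camera coordinates (assumed). Returns unit axes, each defined up to sign."""
    v = [edge_line(corners[i], corners[(i + 1) % 4]) for i in range(4)]
    x_axis = normalize(cross(v[0], v[2]))  # direction of sides 0 and 2
    y_axis = normalize(cross(v[1], v[3]))  # direction of sides 1 and 3
    z_axis = normalize(cross(x_axis, y_axis))  # (V0 X V2) X (V1 X V3)
    return x_axis, y_axis, z_axis

# Hypothetical marker facing the camera head-on.
axes = marker_axes([(-0.25, -0.25), (0.25, -0.25), (0.25, 0.25), (-0.25, 0.25)])
```

With noisy corner detections the two in-plane axes will not come out exactly perpendicular, so a real implementation would re-orthogonalize them; that refinement is left out here.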

However, if you have trouble understanding even this simplified outline, it's better not to start right away. Read some books on matrix/vector algebra, projective geometry and image segmentation instead. Or just use ARToolkit blindly, as an end user.


One of the hardest parts of the entire problem is pulling the "shapes" out of the image.

In order to recognise things as belonging to a "shape" you need to be able to have the top-down knowledge of what different objects/shapes are. Conversely in order to recognise a shape you need to be able to pull the visual primitives out of the image to begin with: it's a top-down AND bottom-up thing, so there's no easy prepackaged solution for this one.

This exact task was the major challenge in computer vision for decades... perhaps if you give us a little bit more information about the domain you're working in: are the 2d images all of a common "subset" of items - human faces, chairs, circles, whatever?

If you have a comparatively constrained set of items, even a comparatively simple single-layer perceptron will do the job (there are a number of papers about this online).
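For a sense of how small a single-layer perceptron is, here is a toy sketch in plain Python. The data set is invented for illustration: flattened 2x2 binary images, labelled 1 when the lit pixels are in the top row and 0 when they are in the bottom row. It is deliberately linearly separable, which is the only case a single-layer perceptron can learn exactly.

```python
def predict(weights, bias, x):
    """Step activation: 1 if the weighted sum is positive, else 0."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0

def train_perceptron(samples, labels, epochs=20, lr=1.0):
    """Classic perceptron rule: nudge the weights only on misclassified samples."""
    weights, bias = [0.0] * len(samples[0]), 0.0
    for _ in range(epochs):
        for x, target in zip(samples, labels):
            error = target - predict(weights, bias, x)
            if error:
                weights = [w + lr * error * xi for w, xi in zip(weights, x)]
                bias += lr * error
    return weights, bias

# Hypothetical training set: pixels listed as [top-left, top-right, bottom-left, bottom-right].
samples = [
    [1, 1, 0, 0], [1, 0, 0, 0], [0, 1, 0, 0],  # lit top row    -> label 1
    [0, 0, 1, 1], [0, 0, 1, 0], [0, 0, 0, 1],  # lit bottom row -> label 0
]
labels = [1, 1, 1, 0, 0, 0]
weights, bias = train_perceptron(samples, labels)
predictions = [predict(weights, bias, x) for x in samples]
```

Real inputs would be features extracted from the image rather than raw pixels, and classes that are not linearly separable need more than a single layer.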

If you're looking for a generic shape recogniser, you're in for a tougher time.
