3D-Anaglyph creation algorithm, using depth map image: where to find? - algorithm

I'm looking for a generic algorithm to calculate a red/cian anaglyph starting from the original image and his b/w depth map (example: http://www.swell3d.com/2008/07/turn-2d-painting-into-3d-anagl.html)
That algorythm are used, for example, in Photoshop but I can't find a readable explanation to reproduce it.
Thanks

After some researches I found what I was looking for.
First, I've readed some Photoshop/Gimp tutorials that describes how to make anaglyphs from two inputs: an image and its grayscale depth map. The core of the process is the use of "Displace Tool" and the depth map as a displacement map.
One of the several youtube tutorials: http://www.youtube.com/watch?v=gfYMe_vYhu4
So, I took some documentation about Gimp's Displace Tool by looking at this http://docs.gimp.org/en/plug-in-displace.html and directly at the source code of the tool (the method is very similar to the one proposed by Asgeir).
This lets us to produce two stereo images from the input, by looking at the depth map. The red and cyan colors of every image are calculated by reading this page http://3dtv.at/Knowhow/AnaglyphComparison_en.aspx ("Optimized" matrices are the best ones).
Then, the sum of the two images in one will produce the final anaglyph. Thanks everybody.

There are two algorithms involved. The first uses the original image and the depth map to produce a left and a right image. The second combines these images into a red-cyan anaglyph.
There are a couple ways to accomplish the first part. One is to take the original image and texture map it onto a fine mesh that lies flat in the XY plane. Then you tweak the Z values of each vertex in the mesh according to the corresponding value in the depth map. You've basically created a textured bas relief. You then use a 3D rendering algorithm to render the image from two vantage points that are offset horizontally by a small amount (essentially from the vantage point of a person's left and right eyes as they would view the bas relief).
There is probably a way to directly shift the pixels left and right which is a good fast approximation to what I described above.
Once you have the left and right images, you pass one through a cyan filter and one through a red filter. If you have RGB sources, that's as simple as taking the red channel from one image and combing it with the green and blue channels from the other image.
Anaglyphs work best with muted colors. If you have strong primaries, it won't look as good. You can use an algorithm to reduce the color saturation of the original image before you begin.

From the description in the link you provided I would assume that it is something like
for each pixel in depthmap
x_offset = (depthmap[x][y] / 255.0f) * MAX_PIXEL_OFFSET * DIRECTION
output[x + x_offset][y] = color_buffer[x][y]
blend output with color_buffer
Where MAX_PIXEL_OFFSET is the maximum shift in pixels and DIRECTION is -1 for one color and 1 for the other. This is assuming that the depthbuffer is one byte per pixel, range [0..255] and that 0 in the depthbuffer represents maximum distance.

Related

How can I find intersections of three hexagons on this figure?

I have a figure looks like this
I want to find the coordinates of all intersections of three hexagons.
How can I do this? Should I use OpenCV?
I am still trying to think of a faster/better method, but I think the following should work:
threshold your image to pure blacks and whites
generate and save a list of all black pixels for later
label your image so that each white hexagon is effectively flood-filled with a unique color (or shade of grey) - some folks call this "labelling", some call it "Blob Analysis", some call it "Connected Component Analysis". Whatever it is called, you will get something like this:
Now look at each black pixel from the list you saved in the second step and count how many different colours other than black are in the surrounding 9x9, or 15x15 area. If it's three it is probably an intersection like you are looking for.
Of course there are variations on this - you could implement a "minimum distance from other intersection" on top, for example. Or a "black line thinning first". Or a dilation of each blob to erode the black lines and make the three colours closer together. You could scale your image down (being careful to use NEAREST_NEIGHBOUR rather than interpolation) after labelling to reduce processing time - if important.
You can try to find these features using Harris corner detector.
Also check if findContours with analysis of result intersections could give you useful information.

How to find yellow objects on given picture?

I've this picture:
(this is just subimage of bigger image but only this part is for me important). I need algorithm for finding all these yellow objects in the image and find from them the object which contains the most yellow points. This is just one picture of thousands of similar pictures with more or less these yellow objects. What is the way to do this? I found that the scanline algorithm is good for this, but I haven't found some example which would help me. If you have some ideas or even algorithm it would be perfect. Those color lines are not important I just put them as some border in which I need to find the yellow objects.
Thanks a lot for answers
There are two basic steps:
Thresholding: Generate an array of yellow and not-yellow pixels. If the images you're working with are all like the example you provided, this should be very easy, but try adaptive thresholding if you have to deal with varying shades and hues. Store, e.g., a value of -1 for pixels that are yellow, and 0 everywhere else.
Segmentation: Initialize an ID value to 1. Scan every pixel of the thresholded image. When you encounter a pixel with a value of -1 (i.e., a yellow pixel), use a flood fill routine to write the ID value into this pixel and all the yellow pixels connected to it. Before the flood fill routine exits, you can store information such as the number of pixels it found and the average X and Y coordinates in an array indexed by the ID value. Then increment the ID value and resume scanning until you've covered the entire image.
Then search the data generated by the flood fill routine to find which yellow areas were the largest, and where they were located.
Here's a program that does something quite similar with red objects instead of yellow ones, and then draws circles around them.
It looks like OpenCV has blob detection options. I found this article showing how to detect the blobs using greyscale value, which you should be able to change to use the color value of your target color. It also mentions using the area of the blob as a threshold, so you should be able to use that to find the largest one in the image.
http://www.learnopencv.com/blob-detection-using-opencv-python-c/
One approach would be to generate a quad-tree of the image. Using this quad-tree it's pretty simple to find conjunct pieces that form a blob (even with holes) and calculate the size of these.

Which way is my yarn oriented?

I have an image processing problem. I have pictures of yarn:
The individual strands are partly (but not completely) aligned. I would like to find the predominant direction in which they are aligned. In the center of the example image, this direction is around 30-34 degrees from horizontal. The result could be the average/median direction for the whole image, or just the average in each local neighborhood (producing a vector map of local directions).
What I've tried: I rotated the image in small steps (1 degree) and calculated statistics in the vertical vs horizontal direction of the rotated image (for example: standard deviation of summed rows or summed columns). I reasoned that when the strands are oriented exactly vertically or exactly horizontally the difference in statistics would be greatest, and so that angle of rotation is the correct direction in the original image. However, for at least several kinds of statistical properties I tried, this did not work.
I further thought that perhaps this wasn't working because there were too many different directions at the same time in the whole image, so I tired it in a small neighborhood. In this case, there is always a very clear preferred direction (different for each neighborhood), but it is not the direction that the fibers really go... I can post my sample code but it is basically useless.
I keep thinking there has to be some kind of simple linear algebra/statistical property of the whole image, or some value derived from the 2D FFT that would give the correct direction in one step... but how?
What probably won't work: detecting individual fibers. They are not necessarily the same color, and the image can shade from light to dark so edge detectors don't work well, and the image may not even be in focus sometimes. Because of that, it is not always even possible to see individual fibers for a human (see top-right in the example), they kinda have to be detected as preferred direction in a statistical sense.
You might try doing this in the frequency domain. The output of a Fourier Transform is orientation dependent so, if you have some kind of oriented pattern, you can apply a 2D FFT and you will see a clustering around a specific orientation.
For example, making a greyscale out of your image and performing FFT (with ImageJ) gives this:
You can see a distinct cluster that is oriented orthogonally with respect to the orientation of your yarn. With some pre-processing on your source image, to remove noise and maybe enhance the oriented features, you can probably achieve a much stronger signal in the FFT. Once you have a cluster, you can use something like PCA to determine the vector for the major axis.
For info, this is a technique that is often used to enhance oriented features, such as fingerprints, by applying a selective filter in the FFT and then taking the inverse to obtain a clearer image.
An alternative approach is to try a series of Gabor filters see here pre-built with a selection of orientations and frequencies and use the resulting features as a metric for identifying the most likely orientation. There is a scikit article that gives some examples here.
UPDATE
Just playing with ImageJ to give an idea of some possible approaches to this - I started with the FFT shown above, then - in the following image, I performed these operations (clockwise from top left) - Threshold => Close => Holefill => Erode x 3:
Finally, rather than using PCA, I calculated the spatial moments of the lower left blob using this ImageJ Plugin which handily calculates the orientation of the longest axis based on the 2nd order moment. The result gives an orientation of approximately -38 degrees (with respect to the X axis):
Depending on your frame of reference you can calculate the approximate average orientation of your yarn from this rather than from PCA.
I tried to use Gabor filters to enhance the orientations of your yarns. The parameters I used are:
phi = x*pi/16; % x = 1, 3, 5, 7
theta = 3;
sigma = 0.65*theta;
filterSize = 3;
And the imag part of the convoluted image are shown below:
As you mentioned, the most orientations lies between 30-34 degrees, thus the filter with phi = 5*pi/16 in left bottom yields the best contrast among the four.
I would consider using a Hough Transform for this type of problem, there is a nice write-up here.

How can I normalize a perspectivly stretched image to get a sqared one?

Lets think, I would take a picture of a sheet on a table in an angle, that is not frontal. Of course, I will have a perspectivly stretched image.
Does anyone know an easy algorithm to "normalize" the area again to a sqared one, when all 4 edges/edgepoints in the source (taken photo) are defined/maybe clicked by the user?
The interpolation may be easy, I do not need algorithms to have smooth borders, nearest neighbour is enough (so, simply copying the pixel from the source position, which meets the rounded value from the calculated according pixel, ignoring the after-commas).
OpenCV has it all.
Basically, you get the 4 points and use
http://docs.opencv.org/modules/imgproc/doc/geometric_transformations.html#getperspectivetransform
(which basically inverts a matrix inside) to obtain the perspective transformation matrix, and then use
http://docs.opencv.org/modules/imgproc/doc/geometric_transformations.html#warpperspective
to apply the perspective transformation on the image.
The algorithms inside the latter can be found in http://en.wikipedia.org/wiki/Texture_mapping

How do I locate black rectangles in a grid and extract the binary code from that

i'm working in a project to recognize a bit code from an image like this, where black rectangle represents 0 bit, and white (white space, not visible) 1 bit.
Somebody have any idea to process the image in order to extract this informations? My project is written in java, but any solution is accepted.
thanks all for support.
I'm not an expert in image processing, I try to apply Edge Detection using Canny Edge Detector Implementation, free java implementation find here. I used this complete image [http://img257.imageshack.us/img257/5323/colorimg.png], reduce it (scale factor = 0.4) to have fast processing and this is the result [http://img222.imageshack.us/img222/8255/colorimgout.png]. Now, how i can decode white rectangle with 0 bit value, and no rectangle with 1?
The image have 10 line X 16 columns. I don't use python, but i can try to convert it to Java.
Many thanks to support.
This is recognising good old OMR (optical mark recognition).
The solution varies depending on the quality and consistency of the data you get, so noise is important.
Using an image processing library will clearly help.
Simple case: No skew in the image and no stretch or shrinkage
Create a horizontal and vertical profile of the image. i.e. sum up values in all columns and all rows and store in arrays. for an image of MxN (width x height) you will have M cells in horizontal profile and N cells in vertical profile.
Use a thresholding to find out which cells are white (empty) and which are black. This assumes you will get at least a couple of entries in each row or column. So black cells will define a location of interest (where you will expect the marks).
Based on this, you can define in lozenges in the form and you get coordinates of lozenges (rectangles where you have marks) and then you just add up pixel values in each lozenge and based on the number, you can define if it has mark or not.
Case 2: Skew (slant in the image)
Use fourier (FFT) to find the slant value and then transform it.
Case 3: Stretch or shrink
Pretty much the same as 1 but noise is higher and reliability less.
Aliostad has made some good comments.
This is OMR and you will find it much easier to get good consistent results with a good image processing library. www.leptonica.com is a free open source 'C' library that would be a very good place to start. It could process the skew and thresholding tasks for you. Thresholding to B/W would be a good start.
Another option would be IEvolution - http://www.hi-components.com/nievolution.asp for .NET.
To be successful you will need some type of reference / registration marks to allow for skew and stretch especially if you are using document scanning or capturing from a camera image.
I am not familiar with Java, but in Python, you can use the imaging library to open the image. Then load the height and the widths, and segment the image into a grid accordingly, by Height/Rows and Width/Cols. Then, just look for black pixels in those regions, or whatever color PIL registers that black to be. This obviously relies on the grid like nature of the data.
Edit:
Doing Edge Detection may also be Fruitful. First apply an edge detection method like something from wikipedia. I have used the one found at archive.alwaysmovefast.com/basic-edge-detection-in-python.html. Then convert any grayscale value less than 180 (if you want the boxes darker just increase this value) into black and otherwise make it completely white. Then create bounding boxes, lines where the pixels are all white. If data isn't terribly skewed, then this should work pretty well, otherwise you may need to do more work. See here for the results: http://imm.io/2BLd
Edit2:
Denis, how large is your dataset and how large are the images? If you have thousands of these images, then it is not feasible to manually remove the borders (the red background and yellow bars). I think this is important to know before proceeding. Also, I think the prewitt edge detection may prove more useful in this case, since there appears to be less noise:
The previous method of segmenting may be applied, if you do preprocess to bin in the following manner, in which case you need only count the number of black or white pixels and threshold after some training samples.

Resources