Constructing right-view image from left-view image and disparity map

I am trying to construct a right-view image from a left-view image and its disparity map. I use the Middlebury 2003 dataset (http://vision.middlebury.edu/stereo/data/scenes2003/) with the full-size images, which means the value v of each pixel in the disparity map corresponds to a shift of v pixels on the left-view image.
My algorithm is quite simple. For each pixel of coordinates (x, y) in the left-view image, I copy this pixel to the right-view image at the coordinates (x - d, y), where d is the value of the disparity map at the coordinates (x, y). If the disparity value is 0, I do nothing. I use OpenCV to manipulate the images.
Here is my code:
void computeCorrespondingImage(const cv::Mat &img, const cv::Mat &disparity, cv::Mat &dest,
                               const bool leftInput, const int disparityScale)
{
    const int shiftDirection = leftInput ? -1 : 1;
    dest.create(img.rows, img.cols, img.type());
    for (int i(0); i < img.rows; ++i) {
        for (int j(0); j < img.cols; ++j) {
            const uchar d(disparity.at<const uchar>(i, j));
            const int computedColumn(j + shiftDirection * (d / disparityScale));
            // No need to consider pixels that would fall outside of the image's bounds
            if (d > 0 && computedColumn >= 0 && computedColumn < img.cols) {
                dest.at<cv::Vec3b>(i, computedColumn) = img.at<const cv::Vec3b>(i, j);
            }
        }
    }
}
Since the disparity map is a ground-truth disparity map, I would expect to get an image quite similar to the right-view image provided in the dataset, with some black areas (where the disparity is unknown).
However, for some reason the computed right-view image looks as if it were split at the center, which makes it unusable.
Left-view image:
Ground-truth disparity map:
What I get:
Thank you in advance for your help.

Ok, I figured it out. I was loading the disparity image with imread without specifying that it is a grayscale image (with IMREAD_GRAYSCALE). Therefore, OpenCV loaded it as a 3-channel color image, and when I accessed a pixel of the disparity with at() I specified uchar as the wanted type. Since at() does not convert anything, reading uchar values from a Vec3b matrix just reinterprets the underlying bytes, which gave false disparity values.
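For reference, here is a minimal sketch of the same warp in Python/OpenCV rather than C++, assuming placeholder file names ("left.png", "disp.png") and a full-size disparity map (scale 1); the important part is the IMREAD_GRAYSCALE flag:

import cv2
import numpy as np

# "left.png" and "disp.png" are placeholder names for the left view and its
# ground-truth disparity map (full size, so one disparity unit = one pixel).
left = cv2.imread("left.png", cv2.IMREAD_COLOR)
disp = cv2.imread("disp.png", cv2.IMREAD_GRAYSCALE)  # crucial: load as a single channel

right = np.zeros_like(left)
rows, cols = disp.shape
for y in range(rows):
    for x in range(cols):
        d = int(disp[y, x])
        xr = x - d  # shift the left-view pixel to the left by the disparity
        if d > 0 and 0 <= xr < cols:
            right[y, xr] = left[y, x]

cv2.imwrite("right_reconstructed.png", right)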

Related

How to get the correct `RGB` value of a `PNG` image?

Mapbox provides global elevation data with the height encoded in a PNG image. The height is decoded by height = -10000 + ((R * 256 * 256 + G * 256 + B) * 0.1). Details are at https://www.mapbox.com/blog/terrain-rgb/.
I want to import the height data to generate terrains in Unity3D.
Texture2D dem = (Texture2D)AssetDatabase.LoadAssetAtPath("Assets/dem/12/12_3417_1536.png", typeof(Texture2D));
for (int i = 0; i < width; i++)
    for (int j = 0; j < height; j++)
    {
        Color c = dem.GetPixel(i, j);
        float R = c.r * 255;
        float G = c.g * 255;
        float B = c.b * 255;
        array[i, j] = -10000 + ((R * 256 * 256 + G * 256 + B) * 0.1f);
    }
Here I set a breakpoint, and the rgba value of the first pixel is RGBA(0.000, 0.592, 0.718, 1.000). c.r is 0. The height is incorrect, as this point represents the height of somewhere on a mountain.
Then I opened the image in Photoshop and got the RGB of the first pixel: R=1, G=152, B=179.
I also wrote a test program in C#:
System.Drawing.Bitmap bitmap = new System.Drawing.Bitmap("12_3417_1536.png");
Color a = bitmap.GetPixel(0, 0);
It shows Color a is (R,G,B,A)=(1,147,249,255)
Here is the image I test:
https://api.mapbox.com/v4/mapbox.terrain-rgb/12/3417/1536.pngraw?access_token=pk.eyJ1Ijoib2xlb3RpZ2VyIiwiYSI6ImZ2cllZQ3cifQ.2yDE9wUcfO_BLiinccfOKg
Why do I get different RGBA values with different methods? Which one is correct?
According to the comments below, a different read order and texture compression in Unity may result in different values for the rgba of the pixel at (0, 0).
Now I want to focus on: how do I convert the rgba (0~1) values to RGBA (0~255)?
Is it r_ps = r_unity * 255? But then how can I explain r=0 in Unity and R=1 in Photoshop for the pixel at (0, 0)?
Try disabling compression from the texture's import settings in Unity (No compression). Alternatively, if you fetch the data at runtime, you can use Texture.LoadBytes() to avoid compression artifacts.
I will assume you are using the same picture and that there aren't two 12_3417_1536.png files in separate folders.
Each of these functions has a different concept of which pixel is at (0, 0). Not sure what you mean by "first" pixel when you tested with Photoshop, but texture coordinates in Unity start at the lower-left corner.
When I tested the lower-left corner pixel using Paint, I got the same value as you did with Photoshop. However, if you test the upper-left corner, you get (1, 147, 249, 255), which is the result bitmap.GetPixel returns.
The Unity values that you're getting seem to be way off. Try calling dem.GetPixel(0,0) so that you're sure you're analyzing the simplest case.
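As a side note, the 0~1 to 0~255 conversion itself is just a multiplication by 255 (plus rounding); the discrepancies come from compression and from which corner counts as (0, 0). A small Python/Pillow sketch (not Unity, and using the tile name from the question) that reads the top-left pixel directly from the PNG and applies the Mapbox formula might look like this:

from PIL import Image

# Pillow reads the raw PNG, so the channel values are already 0-255 and
# (0, 0) is the top-left pixel, the same convention Bitmap.GetPixel uses.
img = Image.open("12_3417_1536.png").convert("RGB")
r, g, b = img.getpixel((0, 0))
height = -10000 + (r * 256 * 256 + g * 256 + b) * 0.1
print(r, g, b, height)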

How to extract color shade from a given sample image to convert another image using color of sample image?

I have a sample image and a target image. I want to transfer the color shades of the sample image to the target image. Please tell me how to extract the color from the sample image.
Here are the images:
input source image:
input map for desired output image
output image
You can use a technique called "histogram matching".
Basically, you use the histogram of your source image as a goal and transform the values for each input map pixel to get the output histogram as close to the source as possible. You do it for each RGB channel of the image.
Here is my python code for that:
from scipy.misc import imsave, imread
import numpy as np
imsrc = imread("source.jpg")
imtint = imread("tint_target.jpg")
nbr_bins=255
imres = imsrc.copy()
for d in range(3):
imhist,bins = np.histogram(imsrc[:,:,d].flatten(),nbr_bins,normed=True)
tinthist,bins = np.histogram(imtint[:,:,d].flatten(),nbr_bins,normed=True)
cdfsrc = imhist.cumsum() #cumulative distribution function
cdfsrc = (255 * cdfsrc / cdfsrc[-1]).astype(np.uint8) #normalize
cdftint = tinthist.cumsum() #cumulative distribution function
cdftint = (255 * cdftint / cdftint[-1]).astype(np.uint8) #normalize
im2 = np.interp(imsrc[:,:,d].flatten(),bins[:-1],cdfsrc)
im3 = np.interp(imsrc[:,:,d].flatten(),cdftint, bins[:-1])
imres[:,:,d] = im3.reshape((imsrc.shape[0],imsrc.shape[1] ))
imsave("histnormresult.jpg", imres)
The output for your samples will look like this:
You could also try doing the same in HSV color space - it might give better results.
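A rough sketch of that HSV variant (my own illustration, not part of the answer above) could match only the V channel and leave the hue untouched; the file names follow the snippet above:

import cv2
import numpy as np

def match_channel(src, ref):
    # classic CDF-based matching of a single 8-bit channel
    src_hist, _ = np.histogram(src.ravel(), 256, [0, 256])
    ref_hist, _ = np.histogram(ref.ravel(), 256, [0, 256])
    src_cdf = np.cumsum(src_hist).astype(np.float64)
    src_cdf /= src_cdf[-1]
    ref_cdf = np.cumsum(ref_hist).astype(np.float64)
    ref_cdf /= ref_cdf[-1]
    lut = np.searchsorted(ref_cdf, src_cdf).clip(0, 255).astype(np.uint8)
    return lut[src]

src = cv2.imread("source.jpg")
tint = cv2.imread("tint_target.jpg")
src_hsv = cv2.cvtColor(src, cv2.COLOR_BGR2HSV)
tint_hsv = cv2.cvtColor(tint, cv2.COLOR_BGR2HSV)
src_hsv[:, :, 2] = match_channel(src_hsv[:, :, 2], tint_hsv[:, :, 2])  # match brightness only
cv2.imwrite("histnorm_hsv.jpg", cv2.cvtColor(src_hsv, cv2.COLOR_HSV2BGR))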
I think the hardest part is to determine the dominant color of the first image. Just looking at it, with all the highlights and shadows, the best overall color will be the one that has the highest combination of brightness and saturation. I start with a blurred image to reduce the effects of noise and other anomalies, then convert each pixel to the HSV color space for the brightness and saturation measurement. Here's how it looks in Python with PIL and colorsys:
import colorsys
from PIL import ImageFilter

# im1 is assumed to be the source PIL image (RGB), loaded elsewhere
blurred = im1.filter(ImageFilter.BLUR)
ld = blurred.load()
max_hsv = (0, 0, 0)
for y in range(blurred.size[1]):
    for x in range(blurred.size[0]):
        r, g, b = tuple(c / 255. for c in ld[x, y])
        h, s, v = colorsys.rgb_to_hsv(r, g, b)
        if s + v > max_hsv[1] + max_hsv[2]:
            max_hsv = h, s, v
r, g, b = tuple(int(c * 255) for c in colorsys.hsv_to_rgb(*max_hsv))
For your image I get a color of (210, 61, 74) which looks like:
From that point it's just a matter of transferring the hue and saturation to the other image.
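As a hedged sketch of that last step (my own continuation, not part of the answer): keep each target pixel's brightness and overwrite its hue and saturation with the dominant color found above. "map.png" is a placeholder name for the second image, and max_hsv corresponds to the (210, 61, 74) color from the previous snippet:

import colorsys
from PIL import Image

max_hsv = (0.985, 0.71, 0.82)  # roughly the (210, 61, 74) dominant color in HSV
im2 = Image.open("map.png").convert("RGB")  # placeholder file name for the target image
px = im2.load()
for y in range(im2.size[1]):
    for x in range(im2.size[0]):
        r, g, b = (c / 255. for c in px[x, y])
        _, _, v = colorsys.rgb_to_hsv(r, g, b)  # keep only the pixel's brightness
        nr, ng, nb = colorsys.hsv_to_rgb(max_hsv[0], max_hsv[1], v)
        px[x, y] = (int(nr * 255), int(ng * 255), int(nb * 255))
im2.save("recolored.png")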
The histogram matching solutions above did not work for me. Here is my own, based on OpenCV:
import cv2
import numpy as np

def match_image_histograms(image, reference):
    chans1 = cv2.split(image)
    chans2 = cv2.split(reference)
    new_chans = []
    for ch1, ch2 in zip(chans1, chans2):
        hist1 = cv2.calcHist([ch1], [0], None, [256], [0, 256])
        hist1 /= hist1.sum()
        hist2 = cv2.calcHist([ch2], [0], None, [256], [0, 256])
        hist2 /= hist2.sum()
        # map each intensity of "image" to the intensity of "reference" whose
        # cumulative frequency is closest (note the cumsum order in searchsorted)
        lut = np.searchsorted(hist2.cumsum(), hist1.cumsum())
        lut = np.clip(lut, 0, 255).astype('uint8')  # cv2.LUT needs a 256-entry uint8 table here
        new_chans.append(cv2.LUT(ch1, lut))
    return cv2.merge(new_chans).astype('uint8')
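A possible usage, assuming both inputs are ordinary 8-bit color images (the file names here are just placeholders):

import cv2
matched = match_image_histograms(cv2.imread("source.jpg"), cv2.imread("tint_target.jpg"))
cv2.imwrite("matched.jpg", matched)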
obtain the average color from the color map, ignoring saturated white/black colors
convert the light map to grayscale
change the dynamic range of the light map to match your desired output (I use the maximum dynamic range; you could instead compute the range of the color map and set it for the light map)
multiply the light map by the average color
This is how it looks:
And this is the C++ source code:
//picture pic0,pic1,pic2;
// pic0 - source color
// pic1 - source light map
// pic2 - output
int x,y,rr,gg,bb,i,i0,i1;
double r,g,b,a;
// init output as source light map in grayscale i=r+g+b
pic2=pic1;
pic2.rgb2i();
// change light map dynamic range to maximum
i0=pic2.p[0][0].dd; // min
i1=pic2.p[0][0].dd; // max
for (y=0;y<pic2.ys;y++)
    for (x=0;x<pic2.xs;x++)
    {
        i=pic2.p[y][x].dd;
        if (i0>i) i0=i;
        if (i1<i) i1=i;
    }
for (y=0;y<pic2.ys;y++)
    for (x=0;x<pic2.xs;x++)
    {
        i=pic2.p[y][x].dd;
        i=(i-i0)*767/(i1-i0);
        pic2.p[y][x].dd=i;
    }
// extract average color from color map (normalized to unit vector)
for (r=0.0,g=0.0,b=0.0,y=0;y<pic0.ys;y++)
    for (x=0;x<pic0.xs;x++)
    {
        rr=BYTE(pic0.p[y][x].db[picture::_r]);
        gg=BYTE(pic0.p[y][x].db[picture::_g]);
        bb=BYTE(pic0.p[y][x].db[picture::_b]);
        i=rr+gg+bb;
        if (i<400) // ignore saturated colors (whitish), 3*255=white
            if (i>16) // ignore too dark colors (blackish), 0=black
            {
                r+=rr;
                g+=gg;
                b+=bb;
            }
    }
a=1.0/sqrt((r*r)+(g*g)+(b*b)); r*=a; g*=a; b*=a;
// recolor output
for (y=0;y<pic2.ys;y++)
    for (x=0;x<pic2.xs;x++)
    {
        a=DWORD(pic2.p[y][x].dd);
        rr=r*a; if (rr>255) rr=255; pic2.p[y][x].db[picture::_r]=BYTE(rr);
        gg=g*a; if (gg>255) gg=255; pic2.p[y][x].db[picture::_g]=BYTE(gg);
        bb=b*a; if (bb>255) bb=255; pic2.p[y][x].db[picture::_b]=BYTE(bb);
    }
I am using my own picture class, so here are some of its members:
xs, ys - size of the image in pixels
p[y][x].dd - pixel at (x, y) position as a 32-bit integer type
p[y][x].db[4] - pixel access by color bands (r, g, b, a)
[notes]
If this does not meet your needs, then please specify more and add more images, because your current example is really not self-explanatory.
Regarding the previous answer, one thing to be careful with:
once the CDF reaches its maximum (=1), the interpolation gets misled and matches your values incorrectly. To avoid this, you should give the interpolation function only the meaningful part of the CDF (not the part after it reaches its maximum) and the corresponding bins. Here is the answer adapted:
from scipy.misc import imsave, imread
import numpy as np

imsrc = imread("source.jpg")
imtint = imread("tint_target.jpg")
nbr_bins = 255
imres = imsrc.copy()

for d in range(3):
    imhist, bins = np.histogram(imsrc[:,:,d].flatten(), nbr_bins, normed=True)
    tinthist, bins = np.histogram(imtint[:,:,d].flatten(), nbr_bins, normed=True)

    cdfsrc = imhist.cumsum()  # cumulative distribution function
    cdfsrc = (255 * cdfsrc / cdfsrc[-1]).astype(np.uint8)  # normalize

    cdftint = tinthist.cumsum()  # cumulative distribution function
    cdftint = (255 * cdftint / cdftint[-1]).astype(np.uint8)  # normalize

    im2 = np.interp(imsrc[:,:,d].flatten(), bins[:-1], cdfsrc)

    if (cdftint == 1).sum() > 0:
        idx_max = np.where(cdftint == 1)[0][0]
        im3 = np.interp(im2, cdftint[:idx_max+1], bins[:idx_max+1])
    else:
        im3 = np.interp(im2, cdftint, bins[:-1])
Enjoy!

Estimate image line gradient (not pixel gradient)

I have a problem whereby I want to estimate the gradient of the line on the contour. Please note that I don't need the pixel gradient but the rate of change of the line.
If you look at the attached image, you will see a binary image with a green contour. I want to label each pixel based on the gradient of the contour at that pixel.
Why I need the gradient is because I want to compute the points where the gradient orientation changes from + to - or from - to +.
I cannot think of a good method to estimate these points on the image. Could someone help me with suggestions on how I can estimate these points?
Here is a small program that computes the tangent at each contour pixel location in a very simple way (there exist other and probably better ways; the easy ones are described at http://en.wikipedia.org/wiki/Finite_difference#Forward.2C_backward.2C_and_central_differences):
for a contour pixel c_{i}, get the neighbors c_{i-1} and c_{i+1}
the tangent direction at c_{i} is (c_{i-1} - c_{i+1})
So this is all on CONTOUR PIXELS, but maybe you could do something similar if you compute the orthogonal to the full image pixel gradient... not sure about that ;)
Here's the code:
#include <opencv2/opencv.hpp>
#include <vector>

int main()
{
    cv::Mat input = cv::imread("../inputData/ContourTangentBin.png");
    cv::Mat gray;
    cv::cvtColor(input, gray, CV_BGR2GRAY);

    // binarize
    cv::Mat binary = gray > 100;

    // find contours
    std::vector<std::vector<cv::Point> > contours;
    std::vector<cv::Vec4i> hierarchy;
    findContours(binary.clone(), contours, hierarchy, CV_RETR_TREE, CV_CHAIN_APPROX_NONE); // CV_CHAIN_APPROX_NONE to get each single pixel of the contour!!

    for (int i = 0; i < contours.size(); i++)
    {
        std::vector<cv::Point> & cCont = contours[i];
        std::vector<cv::Point2f> tangents;
        if (cCont.size() < 3) continue;

        // 1. compute tangent for first point
        cv::Point2f cPoint = cCont.front();
        cv::Point2f tangent = cCont.back() - cCont.at(1); // central tangent => you could use another method if you like to
        tangents.push_back(tangent);

        // display first tangent
        cv::Mat tmpOut = input.clone();
        cv::line(tmpOut, cPoint + 10*tangent, cPoint - 10*tangent, cv::Scalar(0,0,255), 1);
        cv::imshow("tangent", tmpOut);
        cv::waitKey(0);

        for (unsigned int j = 1; j < cCont.size(); ++j)
        {
            cPoint = cCont[j];
            tangent = cCont[j-1] - cCont[(j+1) % cCont.size()]; // central tangent => you could use another method if you like to
            tangents.push_back(tangent);

            // display current tangent:
            tmpOut = input.clone();
            cv::line(tmpOut, cPoint + 10*tangent, cPoint - 10*tangent, cv::Scalar(0,0,255), 1);
            cv::imshow("tangent", tmpOut);
            cv::waitKey(0);
            //if(cv::waitKey(0) == 's') cv::imwrite("../outputData/ContourTangentTangent.png", tmpOut);
        }

        // now there are all the tangent directions in "tangents", do whatever you like with them
    }

    for (int i = 0; i < contours.size(); i++)
    {
        drawContours(input, contours, i, cv::Scalar(0,255,0), 1, 8, hierarchy, 0);
    }

    cv::imshow("input", input);
    cv::imshow("binary", binary);
    cv::waitKey(0);
    return 0;
}
I used this image:
and got outputs like:
In the result, you get a vector with 2D tangent information (line direction) for each pixel of that contour.
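To get from these tangents to the "+ to -" points the question asks about, one possible interpretation (only one of several) is to look at the turning direction, i.e. the sign of the z-component of the cross product of consecutive tangents, and report where it flips. A small Python/OpenCV sketch of that idea, with "contour.png" as a placeholder input and assuming the OpenCV 4 findContours signature:

import cv2
import numpy as np

gray = cv2.imread("contour.png", cv2.IMREAD_GRAYSCALE)
binary = (gray > 100).astype(np.uint8) * 255
contours, _ = cv2.findContours(binary, cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)

for cnt in contours:
    pts = cnt.reshape(-1, 2).astype(np.float64)
    if len(pts) < 3:
        continue
    # central-difference tangents, same idea as the C++ code above
    tangents = np.roll(pts, 1, axis=0) - np.roll(pts, -1, axis=0)
    # z-component of the cross product of consecutive tangents: its sign says
    # whether the contour turns left or right at that point
    turning = np.cross(tangents, np.roll(tangents, -1, axis=0))
    flips = np.where(np.sign(turning[:-1]) * np.sign(turning[1:]) < 0)[0]
    print("orientation changes at contour indices:", flips)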

Upscaling images on Retina devices

I know images upscale by default on Retina devices, but the default scaling makes the images blurry.
I was wondering if there is a way to scale it in nearest-neighbor mode, where no transparent pixels are created, but rather each pixel is multiplied by 4 (i.e. turned into a 2x2 block), so it looks like it would on a non-Retina device.
An example of what I'm talking about can be seen in the image below.
example http://cclloyd.com/downloads/sdfsdf.png
CoreGraphics will not do a 2x scale like that; you need to write a bit of explicit pixel-mapping logic to do something like this. The following is some code I used to do this operation. You would of course need to fill in the details, as it operates on an input buffer of pixels and writes to an output buffer of pixels that is 2x larger.
// Use special case "DOUBLE" logic that will simply duplicate the exact
// RGB value from the indicated pixel into the 2x sized output buffer.
int numOutputPixels = resizedFrameBuffer.width * resizedFrameBuffer.height;
uint32_t *inPixels32 = (uint32_t*)cgFrameBuffer.pixels;
uint32_t *outPixels32 = (uint32_t*)resizedFrameBuffer.pixels;
int outRow = 0;
int outColumn = 0;

for (int i = 0; i < numOutputPixels; i++) {
    if ((i > 0) && ((i % resizedFrameBuffer.width) == 0)) {
        outRow += 1;
        outColumn = 0;
    }

    // Divide by 2 to get the column/row in the input framebuffer
    int inColumn = outColumn / 2;
    int inRow = outRow / 2;

    // Get the pixel for the row and column this output pixel corresponds to
    int inOffset = (inRow * cgFrameBuffer.width) + inColumn;
    uint32_t pixel = inPixels32[inOffset];
    outPixels32[i] = pixel;

    //fprintf(stdout, "Wrote 0x%.10X for 2x row/col %d %d (%d), read from row/col %d %d (%d)\n", pixel, outRow, outColumn, i, inRow, inColumn, inOffset);

    outColumn += 1;
}
This code of course depends on you creating a buffer of pixels and then wrapping it back up into a CGImageRef. But you can find all the code to do that kind of thing easily.
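The pixel-duplication idea itself is trivial to prototype outside of CoreGraphics; for example, a Python/NumPy sketch (purely as an illustration, not iOS code) is just a repeat along both axes:

import numpy as np

def upscale_2x_nearest(pixels):
    # pixels: (H, W) or (H, W, channels) array; every source pixel becomes a 2x2 block
    return np.repeat(np.repeat(pixels, 2, axis=0), 2, axis=1)

src = np.arange(12, dtype=np.uint8).reshape(3, 4)
print(upscale_2x_nearest(src).shape)  # (6, 8)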

Retrieve color information from images

I need to determine the amount/quality of color in an image in order to compare it with other images and recommend to the user (the owner of the image) that maybe it should be printed in black and white rather than in color.
So far I'm analyzing the image and extracting some data from it:
The number of different colors I find in the image
The percentage of color in the whole page (color pixels / total pixels)
For further analysis I may need other characteristics of these images. Do you know what else is important (or what I'm missing here) in image analysis?
After some time I found a missing characteristic (very important) which helped me a lot with the analysis of the images. I don't know if there is a name for that but I called it the average color of the image:
While looping over all the pixels of the image and counting each color, I also retrieved the RGB values and summed all the reds, greens and blues of all the pixels, just to come up with this average color, which, again, saved my life when I wanted to compare certain kinds of images.
The code is something like this:
File f = new File("image.jpg");
BufferedImage im = ImageIO.read(f);
int tot = 0;
int red = 0;
int blue = 0;
int green = 0;
int w = im.getWidth();
int h = im.getHeight();
// Going over all the pixels
for (int i = 0; i < w; i++) {
    for (int j = 0; j < h; j++) {
        int pix = im.getRGB(i, j);
        Color c = new Color(pix);
        if (!sameARGB(pix)) { // user-defined helper that compares the RGB values
            tot += 1;
            red += c.getRed();
            green += c.getGreen();
            blue += c.getBlue();
        }
    }
}
And you should get results like this:
// Percentage of color on the image
double per = (double) tot / (h * w);
// Average color <-------------
Color avg = new Color(red / tot, green / tot, blue / tot); // integer averages in the 0-255 range
