I calculate a disparity map
d = disparity(imgL,imgR, 'Method', 'SemiGlobal', 'BlockSize', 7);
I want to save the disparity map to an image file:
dis1 = d/63; imwrite(dis1,'dis.png');
How can I read this disparity map back in MATLAB?
I tried:
disparityMap= single(imread('dis.png')/63);
But it doesn't give the same matrix. Thanks
The problem with saving floating point images such as your disparity map to PNG with imwrite is that the function multiplies the data by 255 and truncates it to 8-bit unsigned integers before saving. Therefore, when you re-read this image you need to divide by 255 to get back what you had, and because of the truncation you will definitely get some precision loss. You can approximate what you had before by first dividing by 255 to undo the scaling, then multiplying by 63 to undo your previous division by 63. Also, you need to convert the datatype before doing the division, otherwise integer arithmetic truncates the result, and that is also where your attempt goes wrong:
disparityMap = single(imread('dis.png'))*(63/255);
Be wary that you will not get exactly the same values you had before, due to the precision loss from dividing by 63 and from writing to file. The division by 63 makes small disparities even smaller, so when you scale by 255, truncate, and save to file, these small disparities inevitably get mapped to smaller numbers when you read the file back into memory. Therefore, make absolutely sure that this is what you actually want to do.
I want to use the CreateBitmapFromMemory method, and it requires the stride as an input. This stride confuses me.
cbStride [in]
Type: UINT
The number of bytes between successive scanlines in pbBuffer.
And here it says: stride = image width + padding.
Why do we need this extra space (padding)? Why not just the image width?
This is how to calculate the stride, right?
lWidthByte = (lWidth * bits + 7) / 8;
lWidth→pixel count
bits→bits per pixel
I suppose dividing by 8 converts bits to bytes, but
what is the (+7) doing here?
and finally
cbStride =((lWidthByte + 3) / 4) * 4;
What's going on here? (Why not just cbStride = lWidthByte?)
Please help me clear these up.
The use of padding is due to various (old and current) memory layout optimizations.
Having image pixel rows with a length (in bytes) that is an integral multiple of 4/8/16 bytes can significantly simplify and optimize many image-based operations. The reason is that these sizes allow proper storage and parallel pixel processing in the CPU registers, e.g. with SSE/MMX, without mixing pixels from two consecutive rows.
Without padding, extra code has to be inserted to handle partial WORD/DWORD pixel data since two consecutive pixels in memory might refer to one pixel on the right of one row and the left pixel on the next row.
If your image is a single-channel image with 8-bit depth, i.e. grayscale in the range [0,255], then the stride would be the image width rounded up to the nearest multiple of 4 or 8 bytes. Note that the stride is always specified in bytes, even when a pixel may be more than one byte deep.
For images with more channels and/or more than one byte per pixel/channel, the stride would be the image width in bytes rounded up to the nearest multiple of 4 or 8 bytes.
The +7 and similar arithmetic examples you gave just make sure that the numbers are rounded up correctly, since integer math truncates the non-integer component of the division.
Just insert some numbers and see how it works. Don't forget to truncate (floor()) the intermediate division results.
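Plugging in a couple of numbers, as suggested above, makes the rounding obvious. Here is a minimal C++ sketch; the ComputeStride helper is only for illustration and is not part of WIC:

#include <cstdio>

// Round the row size up to whole bytes, then up to a multiple of 4 bytes,
// using exactly the integer arithmetic from the question.
unsigned int ComputeStride(unsigned int lWidth, unsigned int bits)
{
    unsigned int lWidthByte = (lWidth * bits + 7) / 8; // +7 rounds up; integer division truncates
    return ((lWidthByte + 3) / 4) * 4;                 // round up to the next multiple of 4 bytes
}

int main()
{
    std::printf("%u\n", ComputeStride(10, 24)); // 10 px * 24 bpp = 30 bytes of pixels, padded to 32
    std::printf("%u\n", ComputeStride(10, 1));  // 10 bits -> 2 bytes of pixels, padded to 4
    return 0;
}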
Say I have a grayscale image S and I want to ignore all values above 250. How do I do it without using NaN? The reason I don't want to use NaN is that I'm looking to take statistical information from the resultant image, such as the average, etc.
You can collect all image pixel intensities that are less than 250; that effectively accomplishes the same thing. If your image is stored in A, you can simply do:
pix = A(A < 250);
pix will be a single vector of all image pixels in A that have intensity of 249 or less. From there, you can perform whatever operations you want, such as the average, standard deviation, calculating the histogram of the above, etc.
Going with your post title, we can calculate the histogram of an image very easily using imhist that's part of the image processing toolbox, and so:
out = imhist(pix);
This will give you a 256-element vector where each value denotes the intensity count for a particular intensity. If we did this properly, you should only see non-zero bin counts up to intensity 249 (location 250 in the vector). If you don't have the image processing toolbox, you can do the same thing using histc and manually specifying the bin cutoffs to go from 0 up to 249:
out = histc(pix, 0:249);
The difference here is that we get a histogram of exactly 250 bins, whereas imhist gives you 256 bins by default. However, histc is soon to be deprecated and histcounts is the recommended replacement. Note that histcounts interprets the second argument as bin edges, so to get the same 250 integer bins you need edges running from 0 to 250:
out = histcounts(pix, 0:250);
You can use logical indexing to build a histogram only using values in your specified range. For example you might do something like:
histogram(imgData(imgData < 250))
I have the following matrix:
A = [0.01 0.02; 1.02 1.80];
I want to compress this using JPEG 2000 and then recover the data. I used imwrite and imread in MATLAB as follows:
imwrite(A,'newA.jpg','jp2','Mode','lossless');
Ahat = imread('newA.jpg');
MATLAB gives me the result in uint8. After converting the data to double I get:
Ahat_double = im2double(Ahat)
Ahat_double =
0.0118 0.0196
1.0000 1.0000
I know this is because of the quantization, but I don't know how to resolve it and get the exact input data, which is what lossless compression is supposed to do.
Converting data to uint8 at the beginning did not help.
The reason you are not getting the correct results is that A is a double precision matrix. When you write images to file in double precision, imwrite assumes that the values vary between [0,1]. In your matrix, you have two values that are > 1. When you write this to file, these values saturate to 1 before being saved. In fact, before writing, the intensities are scaled to uint8 so that they vary between [0,255]. When you re-read the values, they come back as intensity 255, or a double intensity of 1.0.
The other two values make sense when you read them back in, as 0.01 in double form is actually 255*0.01 = 2.55, which is rounded to 3, and 3 / 255 = 0.0118. For 0.02, this is 255*0.02 = 5.1, which is rounded to 5, and 5 / 255 = 0.0196.
The only way you can possibly get around this is to renormalize your data before you write the image so that it conforms to [0,1]. To get the original data back, you would have to know the minimum and maximum values you had before you normalized this. Even when you do this, there are only 256 possible double precision values that can be encoded in your image (assuming grayscale), and so you will not be able to capture all possible floating point values this way.
As such, there is basically no way around your problem, so you're SOL!
If you want to encode arbitrary data using the JPEG 2000 standard, perhaps you should download this library from MATLAB's File Exchange. I haven't taken a closer look at it, but it may be able to compress arbitrary data using the JPEG 2000 algorithm.
Take a look at these two example images:
I would like to be able to identify these types of images inside large set of photographs and similar images. By photograph I mean a photograph of people, a landscape, an animal etc.
I don't mind if some photographs are falsely identified as these uniform images but I wouldn't really want to "miss" some of these by identifying them as photographs.
The simplest thing that came to my mind was to analyze the images pixel by pixel to find highest and lowest R,G,B values (each channel separately). If the difference between lowest and highest value is large, then there are large color changes and such image is probably a photograph.
Another idea was to analyze the hue value of each pixel in a similar fashion. The problem is that in the HSL model, orangish-red and pinkish-red are roughly 350 degrees apart when looking clockwise and 10 degrees apart when looking counterclockwise, so I can't just compare each pixel's hue component directly or I'll get some weird results.
Also, there is a problem of noise - one white or black pixel will ruin tests like that. So I would need to somehow exclude extreme values if there are only few pixels with such extremes. But at this point it gets more and more complicated and I'm feeling it's not the best approach.
I was also thinking about bumping the contrast to the max and then running a test like the RGB one I described above. It would probably make things easier, but one or two abnormal pixels would still ruin the test anyway. How do I deal with such cases?
I don't mind running a few different algorithms that would cover different image types. But please note that I'm dealing with images from digital cameras, so 6 MP, 12 MP or even 16 MP are quite common. Because of that, running computationally intensive algorithms is not desired. I deal with hundreds or even thousands of images and have only limited CPU resources for image processing. Let's say a second or two per large image is the maximum I can accept.
I'm aware that for example a photograph of a blue sky might trigger a false positive, but that's OK. False positives are better than misses.
This is how I would do it (the whole method is at the bottom of the post, but just read from top to bottom):
Your quote:
"By photograph I mean a photograph of people, a landscape, an animal
etc."
My response to your quote:
This means that such images have edges and contours. The images you are trying to separate out have no edges, or few contours (for the second example image at least).
Your quote:
one white or black pixel will ruin tests like that. So I would need to
somehow exclude extreme values if there are only few pixels with such
extremes
My response:
Minimizing the noise through methods like DoG (Difference of Gaussians), etc. will suppress noisy individual pixels.
So I have taken your images and run them through the following code:
cv::cvtColor(image, imagec, CV_BGR2GRAY);                                   // where image is the example image you showed
cv::GaussianBlur(imagec, imagec, cv::Size(3,3), 0, 0, cv::BORDER_DEFAULT);  // blur image to suppress noise
cv::Canny(imagec, imagec, 20, 60, 3);                                       // detect edges (low/high thresholds 20/60)
Results for example image 1 you gave:
As you can see, after going through the code the image became blank (totally black). The image is quite big, hence it is a bit difficult to show it all in one window.
Results for example 2 you showed me:
The outline can be seen, but one way to solve this is to introduce an ROI inset by about 20 to 30 pixels relative to the image dimensions. For instance, if the image dimension is 640x320, the ROI may be 610x290, placed in the center of the image (a minimal sketch follows).
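Something like this could do it; the centerROI helper and the 15-pixel inset are just illustrative:

#include <opencv2/opencv.hpp>

// Crop a centered ROI inset ~15 px on each side so the image borders do not
// contribute spurious edges to the Canny output.
cv::Mat centerROI(const cv::Mat& img, int inset = 15)
{
    cv::Rect roi(inset, inset, img.cols - 2 * inset, img.rows - 2 * inset);
    return img(roi).clone();   // e.g. 640x320 becomes 610x290 with inset = 15
}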
So now, let me introduce you to my real method:
1) Run all the images through the code above to find edges.
2) Check which images don't have any edges (images with no edges will have zero pixels with values greater than 0, or only a few such pixels, so set a slightly higher threshold to play it safe; adjust how many pixels are allowed according to your images). See the sketch after this list.
3) Save/name out all the images without edges; these are the images you are trying to separate out from the rest.
4) The end.
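For step 2, the check could be as simple as counting non-zero pixels in the Canny output. A minimal sketch; the function name and the 50-pixel threshold are only illustrative and should be tuned to your images:

#include <opencv2/opencv.hpp>

// An image whose Canny output contains (almost) no non-zero pixels is treated
// as one of the uniform images you want to separate out.
bool hasNoEdges(const cv::Mat& edges, int maxEdgePixels = 50)   // edges: single-channel Canny result
{
    return cv::countNonZero(edges) <= maxEdgePixels;
}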
EDIT(TO ANSWER THE COMMENT, would have commented back, but my reply is too long):
True about the blurring part. To minimise the use of blurring, you can first do an "elimination-like process", so that smooth images like example 1 are already separated out and categorised as the images you are looking for.
From there you do a second test for the remaining images, which will be the "blurring".
If you really wish to avoid blurring, what I notice is that your example image 1 can be categorised as a "smooth surface" while your example image 2 can be categorised as a "rough-like surface", meaning that it is noisy, which is what led me to introduce the blurring in the first place.
From my experience, if I remember correctly, such rough-like surfaces work very well with the "watershed" or "clustering through colour" methods; they blend very well, unlike the smooth images.
Since the leftover images have a high chance of being rough images, you can try the watershed method; with Canny you will find the result to be a black image, if I am not wrong. Try a line maybe like this:
cv::pyrMeanShiftFiltering(image, images, 10, 20, 3);
I am not very sure whether such a method will be more expensive than Gaussian blurring, but you can try both and compare their computational speed.
In regard to your comment on grayscale images:
Converting to grayscale sounds risky - losing color information may
trigger lots of false positives
My answer:
I don't really think so. If the images you are trying to segment out are of one colour, changing to grayscale doesn't matter. Of course, if you snap a photo of a blue sky it might end up as a false positive, but as you said, those are OK.
If you think about it, in images with people etc. inside, the intensity changes quite a lot (unless, of course, your photograph is an extreme case, like a green ball on a field of grass).
I do admit that converting to grayscale loses information, but in your case I doubt it will affect much; in fact, working with grayscale images is faster and less expensive.
I would use an entropy-based approach. I don't have any custom code to share, but the following blog entry should push you in the right direction.
http://envalo.com/image-cropping-php-using-entropy-explained/
The thing is that the uniform images will have very low entropy compared to those with something interesting in them.
So the task is to find the correct threshold and process the whole set.
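For reference, computing the entropy yourself is cheap. Here is a minimal OpenCV/C++ sketch (not taken from the linked post) that scores one image from its grayscale histogram:

#include <opencv2/opencv.hpp>
#include <cmath>

// Shannon entropy of the grayscale histogram. Uniform images should score
// close to 0; busy photographs score noticeably higher.
double imageEntropy(const cv::Mat& bgr)
{
    cv::Mat gray;
    cv::cvtColor(bgr, gray, cv::COLOR_BGR2GRAY);

    int histSize = 256;
    float range[] = {0, 256};
    const float* ranges[] = {range};
    cv::Mat hist;
    cv::calcHist(&gray, 1, 0, cv::Mat(), hist, 1, &histSize, ranges);

    double total = static_cast<double>(gray.total());
    double entropy = 0.0;
    for (int i = 0; i < histSize; ++i)
    {
        double p = hist.at<float>(i) / total;
        if (p > 0.0)
            entropy -= p * std::log2(p);
    }
    return entropy;   // compare against a threshold tuned on your own image set
}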
I would generate a color histogram for each image and compare how much they differ from a given pattern.
Maybe you want to normalize the brightness first to simplify the matching.
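A minimal OpenCV/C++ sketch of that idea, assuming a coarse 8x8x8-bin BGR histogram and correlation as the similarity measure (both are only illustrative choices):

#include <opencv2/opencv.hpp>

// Coarse 3D colour histogram, L1-normalized so image size and overall
// brightness matter less, compared against a reference "uniform image" pattern.
cv::Mat colourHistogram(const cv::Mat& bgr)           // bgr: CV_8UC3
{
    int histSize[] = {8, 8, 8};                       // 8 bins per channel
    float range[] = {0, 256};
    const float* ranges[] = {range, range, range};
    int channels[] = {0, 1, 2};

    cv::Mat hist;
    cv::calcHist(&bgr, 1, channels, cv::Mat(), hist, 3, histSize, ranges);
    cv::normalize(hist, hist, 1.0, 0.0, cv::NORM_L1); // make the bins sum to 1
    return hist;
}

double similarityToPattern(const cv::Mat& bgr, const cv::Mat& patternHist)
{
    // Correlation: 1.0 means identical histograms, lower means more different.
    return cv::compareHist(colourHistogram(bgr), patternHist, cv::HISTCMP_CORREL);
}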
This is how I would solve it (a rough sketch follows the list):
Find the average R, G, and B values across the image
Calculate a value for each pixel that is the sum of the differences of each channel from the average
Remove the top 0.1% of values to ignore outliers
Check the largest remaining difference against a threshold (you'll probably need to determine this threshold by trial and error)
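A rough C++/OpenCV sketch of these steps, assuming an 8-bit BGR image; the threshold value and function name are illustrative only:

#include <opencv2/opencv.hpp>
#include <algorithm>
#include <cmath>
#include <vector>

// Per-pixel sum of absolute differences from the mean colour, with the top
// 0.1% of values dropped as outliers before testing the largest remaining one.
bool looksUniform(const cv::Mat& bgr, double threshold = 60.0)   // bgr: CV_8UC3
{
    cv::Scalar mean = cv::mean(bgr);                   // average B, G, R over the image

    std::vector<double> diffs;
    diffs.reserve(bgr.total());
    for (int y = 0; y < bgr.rows; ++y)
        for (int x = 0; x < bgr.cols; ++x)
        {
            cv::Vec3b px = bgr.at<cv::Vec3b>(y, x);
            diffs.push_back(std::abs(px[0] - mean[0]) +
                            std::abs(px[1] - mean[1]) +
                            std::abs(px[2] - mean[2]));
        }

    // Keep all but the top 0.1% of differences, then test the largest one left.
    std::sort(diffs.begin(), diffs.end());
    size_t keep = static_cast<size_t>(diffs.size() * 0.999);
    if (keep == 0) return true;
    return diffs[keep - 1] < threshold;
}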
The following approach might be useful.
Derive a local binary pattern in a 5x5 window centered around every pixel, so for each pixel you get a ring of boolean values (each neighbour on the ring compared against the center). Going around the ring in one direction (clockwise or anticlockwise), count the number of 1→0 and 0→1 transitions. This count is the feature value of the center pixel.
For each 20x20 window, compute the variance of the pixel feature values.
If you then take the variance of those variances, it should approach zero for a uniform image, whereas for other images it would be quite high. This way there may be no need to fix thresholds, and the local binary pattern takes care of potentially uneven illumination.
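A rough C++/OpenCV sketch of the idea; for brevity it uses the common 3x3 (8-neighbour) ring instead of the 5x5 ring described above, and a 20x20 block size:

#include <opencv2/opencv.hpp>
#include <vector>

// Per-pixel feature: number of 0/1 transitions around the 8-neighbour ring.
cv::Mat lbpTransitionCount(const cv::Mat& gray)   // gray: CV_8UC1
{
    const int dy[8] = {-1, -1, -1,  0, 1, 1,  1,  0};   // ring offsets, clockwise
    const int dx[8] = {-1,  0,  1,  1, 1, 0, -1, -1};

    cv::Mat feat = cv::Mat::zeros(gray.size(), CV_8UC1);
    for (int y = 1; y < gray.rows - 1; ++y)
        for (int x = 1; x < gray.cols - 1; ++x)
        {
            uchar c = gray.at<uchar>(y, x);
            int bits[8], transitions = 0;
            for (int k = 0; k < 8; ++k)
                bits[k] = gray.at<uchar>(y + dy[k], x + dx[k]) >= c ? 1 : 0;
            for (int k = 0; k < 8; ++k)
                transitions += bits[k] != bits[(k + 1) % 8];
            feat.at<uchar>(y, x) = static_cast<uchar>(transitions);
        }
    return feat;
}

// Variance of the blockwise variances of the feature image: near zero for uniform images.
double varianceOfBlockVariances(const cv::Mat& feat, int block = 20)
{
    std::vector<double> vars;
    for (int y = 0; y + block <= feat.rows; y += block)
        for (int x = 0; x + block <= feat.cols; x += block)
        {
            cv::Scalar bm, bs;
            cv::meanStdDev(feat(cv::Rect(x, y, block, block)), bm, bs);
            vars.push_back(bs[0] * bs[0]);
        }
    cv::Scalar m, s;
    cv::meanStdDev(vars, m, s);
    return s[0] * s[0];
}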
For each of the R, G and B channels, calculate the standard deviation of intensity. If it is low enough, you have a uniform image.
If you are worried about having different uniform areas, calculate the standard deviations for, say, each 20x20 square separately, then calculate the average of the standard deviations.
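A minimal OpenCV/C++ sketch of the whole-image version; the threshold is illustrative and needs tuning on your own data:

#include <opencv2/opencv.hpp>

// Per-channel standard deviation over the whole image; low values on all
// three channels suggest a uniform image.
bool lowColourSpread(const cv::Mat& bgr, double maxStdDev = 10.0)   // bgr: CV_8UC3
{
    cv::Scalar mean, stddev;
    cv::meanStdDev(bgr, mean, stddev);                 // one value per channel
    return stddev[0] < maxStdDev &&
           stddev[1] < maxStdDev &&
           stddev[2] < maxStdDev;
}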
You can probably solve your problem using machine learning (classification). It is easier than it sounds. I will give an example:
1 - Feature extraction: compute a color histogram from all images (a histogram of RGB values). You will probably want to reduce the number of possible values of R, G and B so your histogram does not grow too large (this is known as requantization). For example, you could make a histogram that accepts 4 different values of R, G and B, yielding a histogram with 4*4*4 = 64 bins: [(R=1, G=1, B=1), (R=1, G=1, B=2), ... (R=4, G=4, B=4)].
2 - Manually mark some images that you know are not photographs.
3 - Train a classifier: now that you have examples of images that are photographs and images that are not photographs, you can use this information to train a classifier. This classifier, given a histogram, can be used to predict whether an image is a photograph or not.
If you do not want to spend time on the classifier, you could try a simpler approach (sketched after this list):
Compute the histogram of the image It that you want to classify;
Compare the histogram of It with the histograms of all marked images and find the most similar one (for example, by summing the absolute differences between bins);
If the image with the most similar histogram is a photograph, classify It as a photograph; otherwise, classify It as not being a photograph.
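A rough C++ sketch of the simpler approach, assuming 8-bit BGR input; the names are illustrative. The same quantizedHistogram feature is what you would feed a real classifier in step 3 if you go that route:

#include <opencv2/opencv.hpp>
#include <cmath>
#include <vector>

// Requantize each channel to 4 levels and build a 4*4*4-bin histogram,
// normalized by the number of pixels so image size does not matter.
std::vector<float> quantizedHistogram(const cv::Mat& bgr)   // bgr: CV_8UC3
{
    std::vector<float> hist(4 * 4 * 4, 0.0f);
    for (int y = 0; y < bgr.rows; ++y)
        for (int x = 0; x < bgr.cols; ++x)
        {
            cv::Vec3b px = bgr.at<cv::Vec3b>(y, x);
            int b = px[0] / 64, g = px[1] / 64, r = px[2] / 64;   // 256 levels -> 4 levels
            hist[(r * 4 + g) * 4 + b] += 1.0f;
        }
    for (float& h : hist) h /= static_cast<float>(bgr.total());
    return hist;
}

struct Example { std::vector<float> hist; bool isPhotograph; };

// Label the query image with the label of its nearest marked example
// (sum of absolute bin differences as the distance).
bool classify(const cv::Mat& query, const std::vector<Example>& marked)
{
    std::vector<float> q = quantizedHistogram(query);
    double bestDist = 1e30;
    bool bestLabel = true;
    for (const Example& e : marked)
    {
        double d = 0.0;
        for (size_t i = 0; i < q.size(); ++i)
            d += std::fabs(q[i] - e.hist[i]);
        if (d < bestDist) { bestDist = d; bestLabel = e.isPhotograph; }
    }
    return bestLabel;   // true: photograph, false: uniform image
}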
Below is my answer. I wrote a simple demo in C to explain my idea; you can find it in a gist.
Ready:
one color/pixel contains three channels (four channels if you have alpha data)
every channel commonly has 8 bits (256 levels)
Make some defines:
#define IMAGEWIDTH 20 // Assumed
#define IMAGEHEIGHT 20 // Assumed
#define CHANNELBIT 8
#define COLORLEVEL 256
typedef struct tagPixel
{
unsigned int R : CHANNELBIT;
unsigned int G : CHANNELBIT;
unsigned int B : CHANNELBIT;
} Pixel;
Collect the count of every color level in each channel:
void TraverseAndCount(Pixel image_data[IMAGEWIDTH][IMAGEHEIGHT]
, unsigned int red_counts[COLORLEVEL]
, unsigned int green_counts[COLORLEVEL]
, unsigned int blue_counts[COLORLEVEL]);
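One possible body for this function, just to illustrate the idea (the actual implementation in the gist may differ):

// Zero the histograms, then walk every pixel and bump the bin that matches
// each channel's value.
void TraverseAndCount(Pixel image_data[IMAGEWIDTH][IMAGEHEIGHT]
    , unsigned int red_counts[COLORLEVEL]
    , unsigned int green_counts[COLORLEVEL]
    , unsigned int blue_counts[COLORLEVEL])
{
    int x, y, i;
    for (i = 0; i < COLORLEVEL; ++i)
        red_counts[i] = green_counts[i] = blue_counts[i] = 0;

    for (x = 0; x < IMAGEWIDTH; ++x)
        for (y = 0; y < IMAGEHEIGHT; ++y)
        {
            red_counts[image_data[x][y].R]++;
            green_counts[image_data[x][y].G]++;
            blue_counts[image_data[x][y].B]++;
        }
}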
The next step is very important. Analyze the color counts:
// just a very simple way to smooth the curve of the counts of colors
// and you can replace it with another way you want
unsigned int CalculateRange(unsigned int min_count
, unsigned int blur_size
, unsigned int color_counts[COLORLEVEL]);
This function does:
smooth the curve of each channel's counts along the COLORLEVEL axis by blur_size (you can smooth it another way)
calculate the range of color levels whose counts are greater than min_count
Finally, calculate the average of the ranges across the channels:
// calculate the average of the range for each channel of color
// the value is bigger if the image is more probably photographs
float AverageRange(unsigned int min_count, unsigned int blur_size
, unsigned int red_counts[COLORLEVEL]
, unsigned int green_counts[COLORLEVEL]
, unsigned int blue_counts[COLORLEVEL]);
Note:
the result depends on min_count; min_count should be bigger than 0.
a bigger result means the image is more probably a photo.
for a photograph, a bigger result is more likely with a smaller min_count.