Read a disparity map from a PNG file

I calculate a disparity map
d = disparity(imgL,imgR, 'Method', 'SemiGlobal', 'BlockSize', 7);
I want to save the disparity map to an image file:
dis1 = d/63; imwrite(dis1,'dis.png');
How can I read this disparity map back in MATLAB?
I tried:
disparityMap= single(imread('dis.png')/63);
But it doesn't give the same matrix. Thanks

The problem with saving PNG files with imwrite is that for floating-point images such as your disparity map, the function multiplies the data by 255 and converts it to 8-bit unsigned integers before saving. If you re-read this image, you therefore need to divide by 255 to undo that scaling, and because of the 8-bit conversion you will definitely see some precision loss. You also need to multiply by 63 to undo your earlier division by 63. Finally, you have to convert the data type to floating point before doing any division, otherwise the integer arithmetic rounds the result, which is where your attempt goes wrong:
disparityMap = single(imread('dis.png'))*(63/255);
Be wary that you will not get back exactly the same matrix as before, due to the precision loss from dividing by 63 and from writing to file. The division by 63 makes small disparities even smaller, so when they are scaled by 255, converted to 8-bit and saved, those small disparities inevitably map to slightly different values when you read the file back into memory. Make sure this is really what you want to do.
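As a minimal sketch of the full round trip (assuming, as your scaling implies, that the maximum disparity is 63):
d = disparity(imgL, imgR, 'Method', 'SemiGlobal', 'BlockSize', 7);
imwrite(d / 63, 'dis.png');                      % scaled to [0,1], stored as uint8
dRead = single(imread('dis.png')) * (63 / 255);  % undo both scalings on read-back
valid = d >= 0;                                  % unreliable pixels, if any, are marked with a large negative value
maxErr = max(abs(d(valid) - dRead(valid)))       % small but non-zero, from the 8-bit quantization
The remaining error is bounded by half of the quantization step, 63/255/2, or roughly 0.12 disparity levels.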

Related

A proper way to convert 2D Array into RGB or GrayScale image for precision difference

I have a 2D CNN model where I perform a classification task. My images all come from sensor data after conversion.
So normally I convert them into images using the following approach:
newsize = (9, 1000)
pic = acc_normalized[0]
img = Image.fromarray(np.uint8(pic*255), 'L')
img = img.resize(newsize)
image_path = "Images_Accel"
image_name = "D1." + str(2)
img.save(f"{image_path}/{image_name}.jpeg")
This is what I obtain:
However, their precision is sort of important. For instance, some of the numerical values are like:
117.79348187327987 or 117.76568758022673.
As you can see, these values differ only in the later digits; when I use uint8, both become 117 when converted to image pixels, so they look the same. But I'd like to keep them distinct. In some cases the difference only shows up at the 8th or 10th digit.
So when I try to use mode F in the Image.fromarray line and save as .jpeg, I get an error saying that PIL cannot write mode F to JPEG.
Then, I tried to first convert them RGB like following;
img = Image.fromarray(pic, 'RGB')
Here I do not cast pic (e.g. to np.float32) and I do not multiply it by 255; I pass it as it is. Then I convert this image to grayscale. This is what I got for the RGB image:
After converting RGB into grayscale:
As you can see, there seems to be a critical difference between the first picture and the last one. So what should be the proper way to use them in 2D CNN classification? Or should I convert them to RGB and then choose grayscale (a single channel) in the CNN implementation? My image dimensions are 1000x9, but I can change this, for example to 250x36 or 100x90; it doesn't matter too much. By the way, with the CNN I am able to get more than 90% test accuracy when I use the first type of image.
The main problem is which image conversion method will let me preserve those precision differences across the pixels. Could you give me some ideas?
---- EDIT -----
Using .tiff format I made some quick comparisons.
First of all, my data looks like the following;
So if I convert this first reading into an image using the following code, where I use np.float64 and mode L to get a grayscale image:
newsize = (9, 1000)
pic = acc_normalized[0]
img = Image.fromarray(np.float64(pic), 'L')
img = img.resize(newsize)
image_path = "Images_Accel"
image_name = "D1." + str(2)
img.save(f"{image_path}/{image_name}.tiff")
It gives me this image;
Then the first 15x9 block looks like the following image. The contradiction is that if you take a closer look at the numerical array, for instance the (1,4) element, the pixel is completely black even though the value is 0.4326132099074307. For grayscale images, black means close to 0 and white means close to 1. However, if the scaling were done per row, there is another value closer to 0, so I was expecting (1,5) to be black instead; if it were done per column, there is again something wrong. As I said, this data has already been normalized and varies between 0 and 1. So what is the logic by which the array is converted into an image? What kind of operation does it do?
Secondly, if I first get an RGB image of the data and then convert it into a grayscale image, why am I not getting exactly the same image as the one I obtained first? Shouldn't the image coming from direct grayscale conversion (mode L, np.float64) and the one coming from the RGB route (first RGB, then grayscale) be the same? There is a difference in the black and white pixels of those images, and I do not know why.
---- EDIT 2 ----
.tiff image with F mode and np.float32 gives the following;
I don't really understand your question, but you seem to want to store image differences that are less than 1, i.e. less than the resolution of integer values.
To do so, you need to use an image format that can store floats. JPEG, PNG, GIF, TGA and BMP cannot store floats. Instead, use TIFF, EXR or PFM formats which can handle floats.
Alternatively, you can create 16-bit PNG images wherein each pixel can store values in the range 0..65535. So, say the maximum difference you wanted to store was 60: you could calculate the difference, multiply it by 1000, round it to get an integer in the range 0..60000, and store that as a 16-bit PNG.
You could record the scale factor as a comment within the image if it is variable.
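A minimal MATLAB sketch of that scale-and-round round trip, assuming the data is already normalized to [0,1] as stated in the question:
pic = rand(1000, 9);                            % stand-in for the normalized sensor data
imwrite(uint16(round(pic * 65535)), 'D1.png');  % 16-bit grayscale PNG
recovered = double(imread('D1.png')) / 65535;   % undo the scaling on read-back
max(abs(pic(:) - recovered(:)))                 % bounded by 0.5/65535, about 7.6e-6
That keeps roughly 4 to 5 significant digits, far better than the 8-bit route but still not exact; for exact floating-point storage you would need TIFF, EXR or PFM as above.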

Normalization of an image

I applied some operations to a grayscale image and now I am getting new values, but the problem is that some of the intensity values are less than 0 or greater than 255, while others remain between 0 and 255. The values within [0,255] are not a problem, but intensities < 0 and > 255 are, as such values cannot occur in a grayscale image.
Therefore, I need to normalize the values so that all of them, whether negative, greater than 255 or anything else, end up in the range 0 to 255 and the image can be displayed.
For that I know two methods:
Method #1
newImg = ((255-0)/(max(img(:))-min(img(:))))*(img-min(img(:)))
where min(img(:)) and max(img(:)) are the minimum and maximum values obtained after doing some operations on the input image img. The min can be less than 0 and the max can be greater than 255.
Method #2
I just set all values less than 0 to 0 and all values greater than 255 to 255, so:
img(img < 0) = 0;
img(img > 255) = 255;
I tried both methods, and I get good results with the second one but not with the first. Can anyone tell me what the problem is?
That totally depends on the image content itself. Both of those methods are valid to ensure that the range of values is between [0,255]. However, before you decide which method to use, you need to ask yourself the following questions:
Question #1 - What is my image?
The first question you need to ask is what does your image represent? If this is the output of an edge detector for example, the method you choose will depend on the dynamic range of the values seen in the result (more below in Question #2). For example, it's preferable that you use the second method if there is a good distribution of pixels and a low variance. However, if the dynamic range is a bit smaller, then you'll want to use the first method to push up the contrast of your result.
If the output is an image subtraction, then it's preferable to use the first method because you want to visualize the exact differences between pixels. Truncating the result will not give you a good visualization of the differences.
Question #2 - What's the dynamic range of the values?
Another thing you need to take note of is how wide the dynamic range of the minimum and maximum values are. For example, if the minimum and maximum are not that far off from the limits of [0,255], then you can use the first or second method and you won't notice much of a difference. However, if your values are within a small range that is within [0,255], then doing the first method will increase contrast whereas the second method won't do anything. If it is your goal to also increase the contrast of your image and if the intensities are within the valid [0,255] range, then you should do the first method.
However, if you have minimum and maximum values that are quite far away from the [0,255] range, like min=-50 and max=350, then doing the first method won't bode very well - especially if the grayscale intensities have huge variance. What I mean by huge variance is that you would have values that are in the high range, values in the low range and nothing else. If you rescaled using the first method, this would mean that the minimum gets pushed to 0, the maximum gets shrunk to 255 and the rest of the intensities get scaled in between so for those values that are lower, they get scaled so that they're visualized as gray.
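As a small worked example (the numbers are made up for illustration): with min = -50 and max = 350, the first method squeezes a 400-wide range onto [0,255], so formerly valid intensities end up noticeably compressed:
img = [-50 0 128 255 350];                                  % hypothetical intensity values
newImg = (255 / (max(img) - min(img))) * (img - min(img))
% newImg = [0  31.875  113.475  194.4375  255]
% The formerly valid 0 and 255 are no longer pure black and pure white.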
Question #3 - Do I have a clean or noisy image?
This is something that not many people think about. Is your image very clean, or are there a couple of spurious noisy spots? The first method is very bad when it comes to noisy pixels. If you only had a couple of pixel values that have a very large value but the other pixels are within the range of [0,255], this would make all of the other pixels get rescaled accordingly and would thus decrease the contrast of your image. You probably want to ignore the contribution made by these pixels and so the second method is preferable.
Conclusion
Therefore, there is nothing wrong with either of the methods you mention. You need to be cognizant of what the image represents, the dynamic range of the values you see in the output, and whether the image is clean or noisy, and then make a smart choice with those three factors in mind. In your case, the first method probably didn't work because you have very large negative values and very large positive values, and perhaps only a few of them. Truncation is probably better for your application.
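For reference, a short sketch that applies both methods side by side (cameraman.tif is just an assumed test image that ships with the Image Processing Toolbox; the noise is added only to push values outside [0,255]):
img = double(imread('cameraman.tif')) + 100 * randn(256);   % force out-of-range values
% Method #1: linear rescale of the full range into [0,255]
img1 = (255 / (max(img(:)) - min(img(:)))) * (img - min(img(:)));
% Method #2: clip everything outside [0,255]
img2 = img;
img2(img2 < 0) = 0;
img2(img2 > 255) = 255;
figure; imshow(uint8(img1)); title('Method #1: rescaled');
figure; imshow(uint8(img2)); title('Method #2: clipped');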

Grayscale image compression using Huffman Coding in MATLAB

I am trying to compress a grayscale image using Huffman coding in MATLAB, and have tried the following code.
I have used a grayscale image with size 512x512 in tif format. My problem is that the size of the compressed image (length of the compressed codeword) is getting bigger than the size of the uncompressed image. The compression ratio is getting less than 1.
clc;
clear all;
A1 = imread('fig1.tif');
[M N]=size(A1);
A = A1(:);
count = [0:1:255]; % Distinct data symbols appearing in sig
total=sum(count);
for i=1:1:size((count)');
p(i)=count(i)/total;
end
[dict,avglen]=huffmandict(count,p) % build the Huffman dictionary
comp= huffmanenco(A,dict); %encode your original image with the dictionary you just built
compression_ratio= (512*512*8)/length(comp) %computing the compression ratio
%% DECODING
Im = huffmandeco(comp,dict); % Decode the code
I11=uint8(Im);
decomp=reshape(I11,M,N);
imshow(decomp);
There is a slight error in your code. I'm assuming you want to calculate the probability of encountering each pixel, which is the normalized histogram. You're not computing it properly. Specifically:
count = [0:1:255]; % Distinct data symbols appearing in sig
total=sum(count);
for i=1:1:size((count)');
p(i)=count(i)/total;
end
total is summing over [0,255], which is not correct. You're supposed to compute the probability distribution of your image, and you should use imhist for that. Do this instead:
count = 0:255;
p = imhist(A1) / numel(A1);
This will correctly calculate your probability distribution for your image. Remember, when you're doing Huffman coding, you need to specify the probability of encountering a pixel. Assuming that each pixel can equally be likely to be chosen, this is captured by calculating the image's histogram, then normalizing by the total number of pixels in your image. Try that and see if you get any better results.
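Putting the fix together, a sketch of the corrected pipeline could look like this (intensities that never occur are dropped, just as a precaution, so that every dictionary symbol has a positive probability):
A1 = imread('fig1.tif');
counts = imhist(A1);                         % 256-bin histogram of the uint8 image
symbols = (0:255).';
keep = counts > 0;                           % keep only intensities that actually occur
p = counts(keep) / numel(A1);                % normalized histogram = probabilities
[dict, avglen] = huffmandict(symbols(keep), p);
comp = huffmanenco(double(A1(:)), dict);
compression_ratio = (numel(A1) * 8) / length(comp)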
However, Huffman will only give you good compression ratios if you have frequently occurring symbols. Did you happen to take a look at the histogram or the spread of your pixels in your image?
If the spread is quite large, with very few entries per bin, then Huffman will not give you any compression savings. In fact it may give you a larger size as a result. Bear in mind that the TIFF compression standard only uses Huffman as part of the algorithm. There is also some pre- and post-processing done to further drive down the size.
As a further example, suppose I had an image that consisted of [0, 1, 2, ..., 255; 0, 1, 2, ..., 255; 0, 1, 2, ..., 255]; I have 3 rows of [0,255], but really it could be any number of rows. This means that each of the 256 symbols is equiprobable, with probability 1/256, which means that each symbol needs 8 bits... which is essentially the raw pixel value anyway!
The key behind Huffman is that frequently occurring symbols get assigned shorter sequences of bits. Because this particular image has equiprobable intensities, every intensity gets a code of the same 8-bit length, so on top of transmitting the dictionary you would effectively be sending the raw byte stream, one character at a time.
If you want your image to be compressed by raw Huffman, the distribution of pixels has to be skewed: for example, most of the intensities in your image being dark, or most being bright. If your image has good contrast, or if the spread of the pixel intensities is flat throughout the image, then Huffman will not give you any compression savings.
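If you want to see that numerically, huffmandict's second output (the average code length) makes the point; a rough sketch with two artificial distributions:
flat = ones(256, 1) / 256;                     % every intensity equally likely
[~, len_flat] = huffmandict((0:255).', flat)   % 8 bits per symbol: no savings possible
skew = [0.9; ones(255, 1) * 0.1 / 255];        % one dominant intensity
[~, len_skew] = huffmandict((0:255).', skew)   % well below 8 bits per symbol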

Quantization Error in Lossless JPEG2000 (Matlab)

I have the following matrix:
A = [0.01 0.02; 1.02 1.80];
I want to compress this using JPEG 2000 and then recover the data. I used imwrite and imread in MATLAB as follows:
imwrite(A,'newA.jpg','jp2','Mode','lossless');
Ahat = imread('newA.jpg');
MATLAB gives me the result in uint8. After converting the data to double I get:
Ahat_double = im2double(Ahat)
Ahat_double =
0.0118 0.0196
1.0000 1.0000
I know this is because of the quantization, but I don't know how to resolve it and get the exact input data, which is what lossless compression is supposed to do.
Converting data to uint8 at the beginning did not help.
The reason why you are not getting the correct results is because A is a double precision matrix. When you are writing images to file in double precision, it assumes that the values vary between [0,1]. In your matrix, you have 2 values that are > 1. When you write this to file, these values will saturate to 1, and then they are saved to file. Actually, before even writing, the intensities will be scaled so that they are uint8 and vary between [0,255]. When you try re-reading the values, it will be read in as intensity 255, or double intensity of 1.0.
The other two values make sense when you read them back in, as 0.01 in double form is actually 255*0.01 = 2.55, which rounds to 3, and 3 / 255 = 0.0118. For 0.02, this is 255*0.02 = 5.1, which rounds to 5, and 5 / 255 = 0.0196.
The only way you can possibly get around this is to renormalize your data before you write the image so that it conforms to [0,1]. To get the original data back, you would have to know the minimum and maximum values you had before you normalized this. Even when you do this, there are only 256 possible double precision values that can be encoded in your image (assuming grayscale), and so you will not be able to capture all possible floating point values this way.
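A sketch of that renormalize-then-write approach (the minimum and maximum have to be stored separately, and the recovered values are still quantized to 8-bit steps):
A = [0.01 0.02; 1.02 1.80];
Amin = min(A(:));  Amax = max(A(:));
Anorm = (A - Amin) / (Amax - Amin);              % now within [0,1]
imwrite(Anorm, 'newA.jp2', 'Mode', 'lossless');
Ahat = im2double(imread('newA.jp2'));            % read back, only 256 possible levels
Arec = Ahat * (Amax - Amin) + Amin;              % approximately the original A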
As such, there is basically no way around your problem, so you're SOL!
If you want to encode arbitrary data using the JPEG 2000 standard, perhaps you should download this library from MATLAB's File Exchange. I haven't taken a closer look at it, but it may be able to compress arbitrary data using the JPEG 2000 algorithm.

Save an imagesc output in Matlab

I am using imagesc to display an integral image. However, I only manage to display it and then I have to save it by hand; I can't find a way to save the image from the script with imwrite or imsave. Is it possible at all?
The code:
image='C:\image.jpg';
in1= imread((image));
in=rgb2gray(in1);
in_in= cumsum(cumsum(double(in)), 2);
figure, imagesc(in_in);
You can also use the print command. For instance if you are running over multiple images and want to serialize them and save them, you can do something like:
% Create a new figure
figure (fig_ct)
% Plot your figure
% save the figure to your working directory
print('-djpeg99', num2str(fig_ct));   % same as the command form: print -djpeg99 <counter>
% increment the counter for the next figure
fig_ct = fig_ct+1;
where fig_ct is just a counter. If you are interested in saving in a format other than JPEG, take a look at the documentation; you can do TIFF, EPS, and many more.
Hope this helps
I believe your problem may be that you are saving a double matrix that is not in the range [0, 1]. If you read the documentation, you'll see that
If the input array is of class double, and the image is a grayscale or
RGB color image, imwrite assumes the dynamic range is [0,1] and
automatically scales the data by 255 before writing it to the file as
8-bit values.
You can convert it yourself to a supported type (that's logical, uint8, uint16, or double) or get it in the range [0 1] by, for example, dividing it by the max:
imwrite (in_in / max (in_in(:)), 'out.jpg');
You may still want to further increase the dynamic range of the image you saved. For example, subtract the minimum before dividing by the max.
in_in = in_in - min (in_in(:));
in_in = in_in / max (in_in(:));
imwrite (in_in, 'out.jpg');
If you want exactly what imagesc displays
The imagesc function scales image data to the full range of the current colormap.
I don't know exactly what that means in your case, but call imagesc with an output argument, inspect the returned image handle and the current colormap to see how the data is being displayed, and pass the result to imwrite().
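A rough sketch of saving something close to what imagesc shows: rescale the data to [0,1], map it through a colormap (parula is assumed here as the default), and write the RGB result:
in_scaled = mat2gray(in_in);                     % linear rescale to [0,1], like imagesc
cmap = parula(256);                              % assumed colormap; swap in your own
rgb = ind2rgb(gray2ind(in_scaled, 256), cmap);   % map scaled values through the colormap
imwrite(rgb, 'out_imagesc.png');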
I'm a very new programmer, so apologies in advance if this isn't very helpful, but I just had the same problem and managed to figure it out. I used uint8 to convert it like this:
imwrite(uint8(in_in), 'in_in.jpg', 'jpg');

Resources