How do the following lines of code actually downsample the chrominance of an image?

I'm trying to make a JPEG compressor and found a bit of code which supposedly downsamples the chrominance of an image, but I don't understand how it achieves this.
x = 2
pic = imread("picture.jpg");
pic_ycbcr = rgb2ycbcr(pic); %convert rgb image to equivalent Y, Cb, Cr values
pic_downSampled = pic_ycbcr;
pic_downSampled(:,:,2) = x*round(pic_downSampled(:,:,2)/x);
pic_downSampled(:,:,3) = x*round(pic_downSampled(:,:,3)/x);
It is the final two lines which I don't understand.
It is difficult to notice the downsampling when x = 2, but with higher values of x, converting pic_downSampled back to RGB and comparing it against the original image shows clear differences.
I even took a small section of the picture to compare the chroma values in the matrix before and after the downsampling, but there are only minor changes to the values, and those are caused by the round function. This is why I really don't understand how this is downsampling anything or how it would make the size of an image smaller.
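A small sketch may help pin down what those two lines compute. It is written in plain Python rather than MATLAB, and it assumes the chroma planes are uint8 (which rgb2ycbcr returns for a uint8 input), so that MATLAB rounds the division to the nearest integer and saturates the product at 255. The helper name quantize and the sample values are made up for illustration.

```python
import math


def quantize(value, x):
    """Snap a non-negative value to the nearest multiple of x.

    Halves are rounded up and the result is clipped to 255, which is an
    assumption meant to mimic MATLAB's uint8 arithmetic.
    """
    return min(255, x * math.floor(value / x + 0.5))


# Hypothetical chroma samples from a small patch of the image.
chroma = [118, 119, 120, 121, 122, 123, 124, 125]

for x in (2, 8):
    snapped = [quantize(v, x) for v in chroma]
    # Same number of samples, fewer distinct values as x grows.
    print(x, snapped, len(snapped), len(set(snapped)))
```

With these sample values the list keeps its length while the number of distinct values shrinks as x grows. That suggests the operation coarsens the chroma levels (a quantization step) rather than reducing the number of chroma samples.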

Related

A proper way to convert 2D Array into RGB or GrayScale image for precision difference

I have a 2D CNN model where I perform a classification task. My images all come from sensor data after conversion.
So, normally, my way is to convert them into images using the following approach
newsize = (9, 1000)
pic = acc_normalized[0]
img = Image.fromarray(np.uint8(pic*255), 'L')
img = img.resize(newsize)
image_path = "Images_Accel"
image_name = "D1." + str(2)
img.save(f"{image_path}/{image_name}.jpeg")
This is what I obtain:
However, their precision is sort of important. For instance, some of the numerical values are like:
117.79348187327987 or 117.76568758022673.
As you can see, they differ only in the decimal digits. When I use uint8, both become 117 when converted into image pixels, so they look the same, right? But I'd like them to stay different. In some cases, the difference is only at the 8th or 10th digit.
So, when I try to use mode F in the Image.fromarray line and save the result as .jpeg, I get an error saying that PIL cannot write mode F as JPEG.
Then I tried to first convert them to RGB as follows:
img = Image.fromarray(pic, 'RGB')
I am not including np.float32 just before pic or multiplying it by 255; I pass it as it is. Then I convert this image to grayscale. This is what I got for the RGB image:
After converting RGB into grayscale:
As you can see, there is a critical difference between the first picture and the last one. So, what is the proper way to use them in 2D CNN classification? Or should I convert them into RGB, choose grayscale in the CNN implementation, and use a single channel? My image dimensions are 1000x9. I can change them to something like 250x36 or 100x90; it doesn't matter too much. By the way, in the CNN, I get more than 90% test accuracy when I use the first type of image.
The main problem is which image conversion method will let me take those precision differences across the pixels into account. Could you give me some ideas?
---- EDIT -----
Using .tiff format I made some quick comparisons.
First of all, my data looks like the following;
So, I convert this first reading into an image using the following code, where I use np.float64 and mode L to get a grayscale image:
newsize = (9, 1000)
pic = acc_normalized[0]
img = Image.fromarray(np.float64(pic), 'L')
img = img.resize(newsize)
image_path = "Images_Accel"
image_name = "D1." + str(2)
img.save(f"{image_path}/{image_name}.tiff")
It gives me this image;
Then, the first 15x9 matrix looks like the following image. The contradiction is that if you take a closer look at the numerical array, for instance the (1,4) member, the pixel is completely black even though the array value there is 0.4326132099074307. For grayscale images, black means the value is close to 0, and values close to 1 come out white. However, if it works row by row, there is another value closer to 0 at (1,5), and I was expecting to see black there. If it works column by column, something is wrong again. As I said, this data has already been normalized and varies between 0 and 1. So what is the logic by which the array is converted into an image? What kind of operation does it perform?
Secondly, if I first get an RGB image of the data and then convert it into a grayscale image, why don't I get exactly the same image as the one I obtained first? Should the image from direct grayscale conversion (L mode, np.float64) and the one from the RGB-based route (first RGB, then grayscale) be the same? There is a difference in the black and white pixels of those images, and I do not know why.
---- EDIT 2 ----
.tiff image with F mode and np.float32 gives the following;
I don't really understand your question, but you seem to want to store image differences that are less than 1, i.e. less than the resolution of integer values.
To do so, you need to use an image format that can store floats. JPEG, PNG, GIF, TGA and BMP cannot store floats. Instead, use TIFF, EXR or PFM formats which can handle floats.
Alternatively, you can create 16-bit PNG images, in which each pixel can store values in the range 0..65535. So, if the maximum difference you wanted to store was 60, you could calculate the difference, multiply it by 1000 and round it to get an integer in the range 0..60000, then store it as a 16-bit PNG.
You could record the scale factor as a comment within the image if it is variable.
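The scaling idea can be sketched in a few lines of plain Python. This only covers the encode and decode arithmetic, not writing the PNG itself; the function names to_uint16 and from_uint16 and the scale factor of 500 are assumptions chosen so that the two sample values from the question fit in the 0..65535 range.

```python
def to_uint16(values, scale):
    """Multiply by scale and round, checking the result fits in 16 bits."""
    encoded = []
    for v in values:
        q = round(v * scale)
        if not 0 <= q <= 65535:
            raise ValueError(f"{v} * {scale} does not fit in 16 bits")
        encoded.append(q)
    return encoded


def from_uint16(encoded, scale):
    """Undo the scaling; the error is at most half a step, 0.5 / scale."""
    return [q / scale for q in encoded]


values = [117.79348187327987, 117.76568758022673]

# Truncating to 8-bit integers collapses both values to the same pixel.
as_uint8 = [int(v) for v in values]

scale = 500  # hypothetical factor; would need to be stored alongside the image
encoded = to_uint16(values, scale)
decoded = from_uint16(encoded, scale)
print(as_uint8, encoded, decoded)
```

The two readings stay distinguishable after encoding, at the cost of a resolution limit of 1/scale. Differences at the 8th or 10th digit would still be lost, which is where a float-capable format such as TIFF is the safer choice.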

Convert RGB to color scale defined by any two colors [duplicate]

I am doing some image processing and I needed to reduce the number of colors of an image. I found that rgb2ind could do that and wrote the following snippet:
clc
clear all
[X,map] = rgb2ind(RGB,6,'nodither');
X = rgb2ind(RGB, map);
rgb=ind2rgb(X,map);
rgb_uint8=uint8(rgb*255+0.5);
imshow(rgb_uint8);
But the output looks like this and I doubt there are only 6 colors in it.
It may perceptually look like there are more than 6 colours, but there are truly 6 colours. If you take a look at your map variable, it will be a 6 x 3 matrix. Each row contains a colour that you want to quantize your image to.
To double check, convert this image into a grayscale image, then compute a histogram of it. If rgb2ind worked, you should see only 6 spikes in the histogram.
BTW, to be able to reproduce your problem, I noted that you used the peppers.png image that is built in to MATLAB's system path. As such, this is what I did to show what I'm talking about:
RGB = imread('peppers.png');
%// Your code
[X,map] = rgb2ind(RGB,6,'nodither');
X = rgb2ind(RGB, map);
rgb=ind2rgb(X,map);
rgb_uint8=uint8(rgb*255+0.5);
imshow(rgb_uint8);
%// My code - Double check colour distribution
figure;
imhist(rgb2gray(rgb_uint8));
axis tight;
This is the figure I get:
As you can see, there are 6 spikes in our histogram. If your code truly produced 6 unique colours, then there should be 6 corresponding grayscale intensities when you convert the image into grayscale, and the histogram above confirms this.
As such, you are quantizing your image to 6 colours, but it doesn't look like it due to the quantization noise in your image.
Don't doubt your result: the image contains exactly 6 colours.
As explained in the Matlab documentation, the rgb2ind function returns an indexed matrix (X in your code) and a colormap (map in your code). So if you want to check the number of colours in X, you can simply check the size of the colormap: size(map)
In your case the size will be 6x3: 6 colours described on 3 channels (red, green and blue).
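Another way to check is to count the distinct RGB triplets directly, which avoids relying on every colour keeping its own grey level after conversion. Below is a minimal Python sketch of that idea; the function name count_unique_colours and the tiny nested-list image are made up for illustration and stand in for the real image data.

```python
def count_unique_colours(image):
    """Count distinct (R, G, B) triplets in an image given as rows of pixels."""
    return len({tuple(pixel) for row in image for pixel in row})


# Hypothetical 2x2 image containing three distinct colours.
image = [
    [(255, 0, 0), (0, 255, 0)],
    [(255, 0, 0), (0, 0, 255)],
]

print(count_unique_colours(image))
```

For a quantized image like the one in the question, this count should not exceed the number of rows in the colormap.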

To imread Parula image in Matlab without losing resolution

There is no bijection between RGB and Parula, discussed here.
I am thinking about how best to do image processing on files that use the Parula colormap.
This challenge has been developed from this thread about removing black color from ECG images by extending the case to a generalized problem with Parula colors.
Data:
which is generated by
[X,Y,Z] = peaks(25);
imgParula = surf(X,Y,Z);
view(2);
axis off;
It is not the point of this thread to use this code in your solution to read the second image.
Code:
[imgParula, map, alpha] = imread('http://i.stack.imgur.com/tVMO2.png');
where map is [] and alpha is a completely white image. Doing imshow(imgParula) gives
where you see a lot of interference and loss of resolution because Matlab reads images as RGB, although the actual colormap is Parula.
Resizing this picture does not improve resolution.
How can you read an image into its corresponding colormap in Matlab?
I did not find any parameter to specify the colormap in reading.
The Problem
There is a one-to-one mapping from indexed colors in the parula colormap to RGB triplets. However, no such one-to-one mapping exists in the reverse direction, from an arbitrary RGB color back to a parula index (indeed there are an infinite number of ways to do so). Thus, there is no one-to-one correspondence or bijection between the two spaces. The plot below, which shows the R, G, and B values for each parula index, makes this clearer.
This is the case for most indexed colors. Any solution to this problem will be non-unique.
A Built-in Solution
After playing around with this a bit, I realized that there's already a built-in function that may be sufficient: rgb2ind, which converts RGB image data to indexed image data. This function uses dither (which in turn calls the mex function ditherc) to perform the inverse colormap transformation.
Here's a demonstration that uses JPEG compression to add noise and distort the colors in the original parula index data:
img0 = peaks(32); % Generate sample data
img0 = img0-min(img0(:));
img0 = floor(255*img0./max(img0(:))); % Convert to 0-255
fname = [tempname '.jpg']; % Save file in temp directory
map = parula(256); % Parula colormap
imwrite(img0,map,fname,'Quality',50); % Write data to compressed JPEG
img1 = imread(fname); % Read RGB JPEG file data
img2 = rgb2ind(img1,map,'nodither'); % Convert RGB data to parula colormap
figure;
image(img0); % Original indexed data
colormap(map);
axis image;
figure;
image(img1); % RGB JPEG file data
axis image;
figure;
image(img2); % rgb2ind indexed image data
colormap(map);
axis image;
This should produce images similar to the first three below.
Alternative Solution: Color Difference
Another way to accomplish this task is by comparing the difference between the colors in the RGB image with the RGB values that correspond to each colormap index. The standard way to do this is by calculating ΔE in the CIE L*a*b* color space. I've implemented a form of this in a general function called rgb2map that can be downloaded from my GitHub. This code relies on makecform and applycform in the Image Processing Toolbox to convert from RGB to the 1976 CIE L*a*b* color space.
The following code will produce an image like the one on the right above:
img3 = rgb2map(img1,map);
figure;
image(img3); % rgb2map indexed image data
colormap(map);
axis image;
For each RGB pixel in an input image, rgb2map calculates the color difference between it and every RGB triplet in the input colormap using the CIE 1976 standard. The min function is used to find the index of the minimum ΔE (if more than one minimum value exists, the index of the first is returned). More sophisticated means can be used to select the "best" color in the case of multiple ΔE minima, but they will be more costly.
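The nearest-colour search described above can be sketched in plain Python. Note the substitution: this sketch uses squared Euclidean distance in RGB as a stand-in for ΔE in CIE L*a*b*, so it is not what rgb2map computes, only the same "pick the first minimum" structure. The names nearest_index and rgb_to_index and the four-entry colormap are hypothetical.

```python
def nearest_index(pixel, colormap):
    """Return the index of the colormap entry closest to pixel.

    Distance here is squared Euclidean distance in RGB (an assumption;
    a perceptual version would convert to L*a*b* first). On ties,
    list.index returns the first minimum.
    """
    distances = [
        sum((p - c) ** 2 for p, c in zip(pixel, entry)) for entry in colormap
    ]
    return distances.index(min(distances))


def rgb_to_index(image, colormap):
    """Map every pixel of a row-major RGB image to a colormap index."""
    return [[nearest_index(pixel, colormap) for pixel in row] for row in image]


# Hypothetical colormap: black, red, green, blue.
colormap = [(0, 0, 0), (255, 0, 0), (0, 255, 0), (0, 0, 255)]

# Slightly distorted colours, as lossy compression might produce.
image = [
    [(250, 10, 5), (10, 10, 10)],
    [(5, 240, 20), (0, 0, 200)],
]

print(rgb_to_index(image, colormap))
```

Each pixel is compared against every colormap entry, so the cost grows with the product of pixel count and colormap size, which is consistent with this kind of search being much slower than a compiled routine.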
Conclusions
As a final example, I used an image of the namesake Parula bird to compare the two methods in the figure below. The two results are quite different for this image. If you manually adjust rgb2map to use the more complex CIE 1994 color difference standard, you'll get yet another rendering. However, for images that more closely match the original parula colormap (as above) both should return more similar results. Importantly, rgb2ind benefits from calling mex functions and is almost 100 times faster than rgb2map despite several optimizations in my code (if the CIE 1994 standard is used, it's about 700 times faster).
Lastly, those who want to learn more about colormaps in Matlab should read this four-part MathWorks blog post by Steve Eddins on the new parula colormap.
Update 6-20-2015: rgb2map code described above updated to use different color space transforms, which improves its speed by almost a factor of two.

Converting to 8-bit image causes white spots where black was. Why is this?

Img is a NumPy array with dtype=float64. When I run this code:
Img2 = np.array(Img, np.uint8)
the background of my images turns white. How can I avoid this and still get an 8-bit image?
Edit:
Sure, I can give more info. The single image is compiled from a stack of 400 images. They are each coming from an .avi video file, and each image is converted into a NumPy array like this:
gray_img = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
A more complicated operation is performed on this whole stack, but does not involve creating new images. It's simply performing calculations on each 1D array to yield a single pixel.
The interpolation is most likely linear (the default when plotting images with matplotlib). The images were saved as .PNGs.
You are probably seeing overflow. If you cast 257 to np.uint8, you will get 1. According to a Google search, AVI files contain images with a color depth of 15 to 24 bits. When you cast this depth to np.uint8, you will see white regions getting darker and (if a normalization takes place somewhere) dark regions turning white as well (-5 -> 251). For the regions that become bright, you could check whether you have negative pixel values in the original image Img.
The docs say that sometimes you have to do some scaling to get a proper cast, and that it is better to use a higher depth whenever possible to avoid artefacts.
The solution seems to be either to work at a higher depth, i.e. casting to np.uint16 or np.uint32, or to scale the pixel values before reducing the depth, i.e. with Img2 already being a NumPy array:
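# make sure that values are between 0 and 255, i.e. within 8-bit range
# (assumes Img2 is a float array with no negative values)
Img2 *= 255/Img2.max()
# cast the scaled array (Img2, not the unscaled Img) to 8 bit
Img2 = np.array(Img2, np.uint8)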
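The wraparound described above can be modelled in plain Python with modulo-256 arithmetic. This is only a model of the effect, not NumPy's actual cast, and the helper names wrap_to_8bit and rescale_to_8bit are made up for illustration. The rescaling helper also shifts by the minimum, which is an addition to the scaling shown above so that negative values are handled.

```python
def wrap_to_8bit(value):
    """Model of 8-bit overflow: keep only the lowest 8 bits of an integer."""
    return value % 256


def rescale_to_8bit(values):
    """Shift and scale values into 0..255 before rounding to integers."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # A constant image carries no contrast; map it to black.
        return [0 for _ in values]
    return [round((v - lo) / span * 255) for v in values]


pixels = [-5, 0, 257]

# Wrapping: a slightly negative background value becomes almost white.
print([wrap_to_8bit(v) for v in pixels])

# Rescaling first keeps the ordering: darkest stays darkest.
print(rescale_to_8bit(pixels))
```

In the wrapped output the darkest input ends up as the brightest pixel, which matches the white background in the question, while the rescaled output preserves the order of the intensities.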

Compression Ratio of an image using matlab

I was trying to calculate the CR (compression ratio) of an image that I compressed and decompressed using the FFT in Matlab. I have read a similar post here about CR calculation, but I did not understand the method it used to process the image. That post said that CR = numel(X)/numel(Y).
What I understood is that X is my image before the FFT and Y is the image after. So I wrote:
I = imread('flowers.tif');
RGB = im2double(I);
%process...
%iRGB = my reconstructed image after iFFT
CR = numel(RGB)/numel(iRGB);
But this results in CR = 1, which I do not think is the correct answer. Can someone explain what I am missing?
The compression ratio compares the number of elements (numel) in the uncompressed representation with the number in the compressed one. Your iRGB is a reconstructed representation and therefore has the same number of elements as RGB (you need to reconstruct the entire image). For the CR you need the numel of your compressed representation.
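As a rough illustration of counting the compressed representation instead of the reconstruction, here is a minimal Python sketch on a 1D signal. It assumes the compression step keeps only Fourier coefficients whose magnitude exceeds a threshold; the question does not say how the coefficients were selected, so the threshold, the naive dft helper and the sample signal are all assumptions. It also counts each complex coefficient as one element and ignores the cost of storing coefficient positions.

```python
import cmath
import math


def dft(signal):
    """Naive discrete Fourier transform, fine for short illustrative signals."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def compression_ratio(signal, threshold):
    """Return (ratio, kept): original element count over kept coefficients."""
    coeffs = dft(signal)
    kept = sum(1 for c in coeffs if abs(c) > threshold)
    if kept == 0:
        raise ValueError("threshold discards every coefficient")
    return len(signal) / kept, kept


# Hypothetical signal: a constant offset plus one cosine over 8 samples.
signal = [3 + 2 * math.cos(2 * math.pi * t / 8) for t in range(8)]

ratio, kept = compression_ratio(signal, threshold=1e-6)
print(ratio, kept)
```

The reconstruction from the kept coefficients would again have 8 samples, so dividing by its length would always give 1; the ratio only becomes informative when the denominator is the number of values actually stored.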
