Reshaping greyscale images for neural network training - how to do this correctly

I have a general question about convolutional neural networks and image processing for training when your images are greyscale.
Take this image for example:
It's a greyscale image, but when I do
image = cv2.imread("image.jpg")
print(image.shape)
I get
(1024, 1024, 3)
I know that OpenCV automatically creates 3 channels for jpg images. But when it comes to network training, it would be much more computationally efficient if I could use images of shape (1024, 1024, 1), just like many of the MNIST tutorials demonstrate. However, if I reshape this:
reshaped_image = image.reshape(1024, 1024, 1)
And then try, for example, to show the image:
plt.axis("off")
plt.imshow(reshaped_image)
plt.show()
I get
raise TypeError("Invalid dimensions for image data")
Does that mean that reshaping my images this way before network training is incorrect? I want to keep as much information in the image as possible but I don't want to have those extra channels if they aren't needed.

The reason that you're getting the error is that the output of your reshape does not have the same number of elements as the input. From the documentation for reshape:
No extra elements are included into the new matrix and no elements are excluded. Consequently, the product rows*cols*channels() must stay the same after the transformation.
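In other words, a (1024, 1024, 3) array holds 1024*1024*3 values, while a (1024, 1024, 1) array holds only a third of that, so the reshape cannot succeed. A quick sanity check, just as an illustration, assuming NumPy (cv2.imread returns a NumPy array):
import numpy as np
bgr = np.zeros((1024, 1024, 3), dtype=np.uint8)   # stand-in for the loaded image
print(bgr.size)               # 3145728 elements
# bgr.reshape(1024, 1024, 1)  # would need 1048576 elements, so this raises an error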
Instead, use cvtColor to convert your 3-channel BGR image to a 1-channel grayscale image:
In Python:
image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
Or in C++:
cv::cvtColor(image, image, cv::COLOR_BGR2GRAY);
You could also avoid conversion altogether by reading the image using the IMREAD_GRAYSCALE flag:
image = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
or
image = cv2.imread(image_path, 0)
(Thanks to @Alexander Reynolds for the Python code.)
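If your network expects an explicit channel dimension, as in the MNIST tutorials, you can add one after loading the single-channel image. A minimal sketch, assuming a 1024x1024 file called image.jpg:
import cv2
import numpy as np
image = cv2.imread("image.jpg", cv2.IMREAD_GRAYSCALE)
print(image.shape)              # (1024, 1024) - single channel, no third axis
image = image[..., np.newaxis]  # equivalent to np.expand_dims(image, axis=-1)
print(image.shape)              # (1024, 1024, 1) - ready for network training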

This worked for me.
import cv2
import numpy as np

X = []
for image_path in image_paths:  # list of image file paths
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    X.append(img)
X = np.array(X)                 # shape: (N, H, W)
X = np.expand_dims(X, axis=3)   # shape: (N, H, W, 1)
The axis argument tells np.expand_dims where to insert the new dimension: axis=3 (or axis=-1) appends a channel dimension at the end, while axis=0 would prepend one at the front.

Related

A proper way to convert 2D Array into RGB or GrayScale image for precision difference

I have a 2D CNN model where I perform a classification task. My images all come from sensor data after conversion.
So, normally, I convert them into images using the following approach:
newsize = (9, 1000)
pic = acc_normalized[0]
img = Image.fromarray(np.uint8(pic*255), 'L')
img = img.resize(newsize)
image_path = "Images_Accel"
image_name = "D1." + str(2)
img.save(f"{image_path}/{image_name}.jpeg")
This is what I obtain:
However, their precision is sort of important. For instance, some of the numerical values are like:
117.79348187327987 or 117.76568758022673.
As you see, the difference between them is only in the later digits. When I use uint8, only the integer part (117) is kept when converting into image pixels, so both values end up looking the same, right? But I'd like to make them different. In some cases, the difference only appears at the 8th or 10th digit.
So, when I try to use mode 'F' in the Image.fromarray line and save them as .jpeg, it gives me an error saying that PIL cannot write mode F to JPEG.
Then, I tried to first convert them to RGB, as follows:
img = Image.fromarray(pic, 'RGB')
Here I do not cast pic to np.float32 and do not multiply it by 255; I pass it as is. Then, I convert this image to grayscale. This is what I got for the RGB image:
After converting RGB into grayscale:
As you see, it seems that there is a critical difference between the first picture and the last one. So, what should be the proper way to use them in 2D CNN classification? Or should I convert them into RGB and then choose grayscale in the CNN implementation with a channel count of 1? My image dimensions are 1000x9. I can even change this dimension to something like 250x36 or 100x90; it doesn't matter too much. By the way, with the first type of image I am able to get more than 90% test accuracy in the CNN network.
The main problem here is which image conversion method will let me take those precision differences across the pixels into account. Could you give me some ideas?
---- EDIT -----
Using .tiff format I made some quick comparisons.
First of all, my data looks like the following;
So, if I convert the first reading into an image using the following code, where I use np.float64 and mode 'L', it gives me a grayscale image:
newsize = (9, 1000)
pic = acc_normalized[0]
img = Image.fromarray(np.float64(pic), 'L')
img = img.resize(newsize)
image_path = "Images_Accel"
image_name = "D1." + str(2)
img.save(f"{image_path}/{image_name}.tiff")
It gives me this image;
Then, the first 15x9 block of the matrix looks like the following image. The contradiction is that if you take a closer look at the numerical array, for instance the (1,4) element, it is completely black, while the numerical value there is 0.4326132099074307. For grayscale images, black means close to 0 and white means close to 1. If the conversion were working row-wise, there is another value closer to 0 in that row and I would expect to see black at the (1,5) location instead; if it were working column-wise, something is wrong there as well. As I said, this data has already been normalized and varies between 0 and 1. So, what is the logic by which it converts the array into an image? What kind of operation does it perform?
Secondly, if I first create an RGB image of the data and then convert it to grayscale, why am I not getting exactly the same image as the one I obtained first? Shouldn't the image coming from the direct grayscale conversion (mode 'L', np.float64) and the one coming from the RGB-based path (first RGB, then grayscale) be the same? There is a difference in the black-and-white pixels between those images, and I do not know why.
---- EDIT 2 ----
.tiff image with F mode and np.float32 gives the following;
I don't really understand your question, but you seem to want to store image differences that are less than 1, i.e. less than the resolution of integer values.
To do so, you need to use an image format that can store floats. JPEG, PNG, GIF, TGA and BMP cannot store floats. Instead, use TIFF, EXR or PFM formats which can handle floats.
Alternatively, you can create 16-bit PNG images, wherein each pixel can store values in the range 0..65535. So, say the maximum difference you wanted to store was 60: you could calculate the difference, multiply it by 1000, and round it to make an integer in the range 0..60000, then store that as a 16-bit PNG.
You could record the scale factor as a comment within the image if it is variable.
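As a rough sketch of both options, assuming pic is the normalized float array acc_normalized[0] from the question (and using OpenCV for the 16-bit PNG, which is my addition, not part of the original code):
import numpy as np
import cv2
from PIL import Image
pic = acc_normalized[0]                  # normalized floats in [0, 1], as in the question
# Option 1: keep the floats as-is in a 32-bit float TIFF (mode 'F')
Image.fromarray(pic.astype(np.float32), mode='F').save("D1.2.tiff")
# Option 2: rescale to 16-bit integers (0..65535) and save as PNG
pic16 = np.round(pic * 65535.0).astype(np.uint16)
cv2.imwrite("D1.2.png", pic16)           # OpenCV writes 16-bit PNGs for uint16 arrays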

Using PIL (Pillow) and Image but keeping a good resolution

I'm using PIL to resize my images. Most of them are 640x480, some of them are bigger. Most of them are in png, but I have jpeg extension too.
I want to resize all my images to be 32x32 pixels, but I noticed that the resolution seems to change after using PIL.
I found that this is a typical question, and the problem often turns out to be in how the image is saved.
I tried different values of "quality", read the documentation, and experimented with parameters such as "subsampling", trying both the jpeg and png formats.
Here is my code:
import os
from PIL import Image

im = Image.open(os.path.join(my_path, file_name))
img = im.resize((32, 32))
if grey_scale is True:
    img = img.convert('L')  # convert the resized image to grayscale
img.save(os.path.join(my_path,
         file_name[:file_name.index('.')] + '.jpg'), "JPEG", quality=100)
Here I have my input image
Here I have the grainy output obtained with my code
How can I resize my images to be smaller, but keeping a very good resolution?
The sort of image transformation you want is not possible. When you resize a raster image to smaller pixel dimensions, you have to either downsample it or simply resample it without filtering (a plain resize). A pixel can only represent one colour at a time (at least on a sub-pixel based display such as a monitor), and your final image has only 1024 of them, while the detail in the original image is far more than that number of pixels can represent, so the result will always be of considerably lower quality (a pixelated image with artifacts).
But this is not always the case, as it depends a lot on what sort of detail the image contains. If the image is not complex (does not contain a lot of colour changes), then it can be resized to a considerably smaller version without losing detail.
746x338 dimension image
32x32 dimension version of previous image
There is almost no difference between the two images (except for their physical size), even though their dimensions are very different. The reason is that these are non-complex images, containing the same pixel value over large areas, which makes it easy to resize them without loss of detail.
Now if the same process is tried out on a complex image, like the one you gave in the question, the result will be a pixelated image.
SOLUTION:
You can either choose a larger dimension for the final image (a lot more than 32x32) if you want to preserve the image quality,
or create a vector equivalent of your image, which is resolution independent and can be resized to a larger/smaller physical size without affecting image quality.
P.S.:
Don't save a .png image with a .jpg extension, as jpg is (for the most part) a lossy compression technique, which results in a lower-quality final image than the original even if no manipulations are made to it.
When reducing the size, you can't keep the image crisp, because you need pixels for that and you can't keep both.
There are different filters you can use in this case. See the below code
from PIL import Image
import os
import PIL

filters = [PIL.Image.NEAREST, PIL.Image.BILINEAR, PIL.Image.BICUBIC, PIL.Image.ANTIALIAS]
grey_scale = False
i = 0
for filter in filters:
    im = Image.open("./image.png")
    img = im.resize((32, 32), filter)
    if grey_scale is True:
        img = img.convert('L')  # convert the resized image to grayscale
    i = i + 1
    img.save("./" + str(i) + '.jpg', "JPEG", quality=100)
Results:
Next, resize does not maintain the aspect ratio. So instead of using resize, use the thumbnail method, which keeps the aspect ratio as well:
from PIL import Image
import os
import PIL

filters = [PIL.Image.NEAREST, PIL.Image.BILINEAR, PIL.Image.BICUBIC, PIL.Image.ANTIALIAS]
grey_scale = False
i = 5
for filter in filters:
    im = Image.open("./image.png")
    im.thumbnail((32, 32), filter)  # thumbnail resizes in place and returns None
    img = im
    if grey_scale is True:
        img = img.convert('L')  # convert the thumbnail to grayscale
    i = i + 1
    img.save("./" + str(i) + '.jpg', "JPEG", quality=100)
Results:

Size of Greyscale vs B&W image (.jpg)

In MATLAB, after assigning 1 to every pixel with intensity > 127 and 0 otherwise in a grayscale image of ".jpg" format, the overall size of the file increases.
Can anyone please explain what the reason for this might be?
Both the files have the following details:
grayscale img: 93KB; B&W img: 118KB.
Format: jpg
CodingMethod: Huffman
CodingProcess: Sequential
BitDepth: 8
Just looking at your question, I can't see any reason why this would happen, so I made a quick test:
im = imread('greens.jpg');
im = rgb2gray(im);
im2 = im>127;
% check the sizes of the variables in memory
imb = whos('im');
im2b = whos('im2');
fprintf('grey: %f\n b/w: %f\n', imb.bytes, im2b.bytes)
and the output was 150000 bytes for either.
Advice: check the variable sizes in the MATLAB workspace instead of the files themselves; if those are the same, the difference probably resides in the compression algorithm used when the files are saved.
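If you do want to compare the files on disk rather than the workspace variables, a rough Python/OpenCV version of the same experiment (my own sketch, not the original MATLAB code) would be:
import os
import cv2
img = cv2.imread("greens.jpg", cv2.IMREAD_GRAYSCALE)
bw = (img > 127).astype("uint8") * 255            # same thresholding as in the question
cv2.imwrite("grey.jpg", img)
cv2.imwrite("bw.jpg", bw)
print("grey:", os.path.getsize("grey.jpg"), "bytes")
print("b/w: ", os.path.getsize("bw.jpg"), "bytes")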

how can I get good binary image using Otsu method for this image?

here is my image
a.png
For binarization I tried this code:
im=rgb2gray(I);
maxp=uint16(max(max(im)));
minp=uint16(min(min(im)));
bw=im2bw(im,(double(minp+maxp))/(1.42*255));
bw=~bw;
imm=bw;
but I need binarization by Otsu's method. How can I get a good binary output using the Otsu method?
Please help, thanks.
MATLAB has its own implementation of Otsu thresholding called multithresh. In your case the code to obtain the segmented image should be something like this:
im=rgb2gray(I); % convert image to grayscale
thresh = multithresh(im); % find one threshold (using Otsu method)
segmented_im = imquantize(im, thresh); % segment image
imagesc(segmented_im); % show segmented image
I haven't tested it so I don't know how well it would perform on your image.
EDIT:
I tested it, and it doesn't work as expected. One of the problems is that Otsu's method works well when there is a clear bimodal distribution of the pixel intensities. This bimodality is lacking in your image. A call to imhist(im) after the grayscale conversion leads to this (comments added by me):
As you can see, the distribution is almost trimodal, and the threshold selected by multithresh is the first one, while you want the second one. The first workaround that comes to my mind (especially if all the images in your dataset are similar to the one you posted, i.e. have a similar intensity distribution) is to make multithresh output two thresholds, and then selecting the last (highest) one:
thresholds = multithresh(im, 2);
thresh = thresholds(end);
Then proceed with the segmentation of the image as stated above. This second method leads to this segmentation:
EDIT 2 (putting it all together):
Indeed, the output segmented_im is not a binary image but a label image. It's easy enough to convert it to a binary image; I will include all the code directly in this next snippet:
im=rgb2gray(I); % convert image to grayscale
thresholds = multithresh(im, 2); % find two thresholds using Otsu
thresh = thresholds(end); % select larger one
segmented_im = imquantize(im, thresh); % segment image
segmented_im(segmented_im == 1) = 0; % make background black (0)
segmented_im(segmented_im == 2) = 255; % make foreground white (255)
binary_im = im2bw(segmented_im); % make binary (logical) image
imshow(binary_im); % show binary image
binary_im is a logical matrix with false (0) for background and true (1) for foreground. segmented_im is a double matrix with 0 for background and 255 for foreground. I hope this serves your purposes!
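For completeness, if you ever need the same binarization outside MATLAB, OpenCV's cv2.threshold with the THRESH_OTSU flag is a rough Python equivalent (an alternative I am adding here, not part of the answer above):
import cv2
img = cv2.imread("a.png", cv2.IMREAD_GRAYSCALE)
# Otsu picks the threshold automatically; the 0 passed as the threshold value is ignored
thresh_value, binary_im = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
print("Otsu threshold:", thresh_value)
cv2.imwrite("a_binary.png", binary_im)
Note that this finds a single Otsu threshold, so on an image like this it may pick the lower of the two modes just as multithresh did; it is a starting point, not a drop-in replacement for the two-threshold workaround above.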

How to display a Gray scale image using boundary defined in another binary image

I have an original grayscale image (I am using a mammogram image with labels outside the image region).
I need to remove some objects (the labels) from that image, so I converted the grayscale image to a binary image. Then I followed the method from the answer provided in
How to Select Object with Largest area
Finally, I extracted the object with the largest area as a binary image. I want that region in grayscale so I can access and segment the small objects within it, for example the minor tissues in the region, and also detect their edges.
How can I get that separated object region as a grayscale image, or is there any other way to get the largest object region from the grayscale image directly, without converting it to binary?
(I am new to MATLAB. I don't know whether I explained it correctly or not; if you can't follow, I'll provide more detail.)
If I understood you correctly, you are looking to have a gray image with only the biggest blob being highlighted.
Code
img = imread(IMAGE_FILEPATH);
BW = im2bw(img,0.2); %%// 0.2 worked to get a good area for the biggest blob
%%// Biggest blob
[L, num] = bwlabel(BW);
counts = sum(bsxfun(@eq,L(:),1:num));
[~,ind] = max(counts);
BW = (L==ind);
%%// Close the biggest blob
[L,num] = bwlabel( ~BW );
counts = sum(bsxfun(@eq,L(:),1:num));
[~,ind] = max(counts);
BW = ~(L==ind);
%%// Original image with only the biggest blob highlighted
img1 = uint8(255.*bsxfun(@times,im2double(img),BW));
%%// Display input and output images
figure,
subplot(121),imshow(img)
subplot(122),imshow(img1)
Output
If I understand your question correctly, you want to use the binary map and access the corresponding pixel intensities in those regions.
If that's the case, then it's very simple. You can use the binary map to identify the spatial co-ordinates of where you want to access the intensities in the original image. Create a blank image, then copy these intensities over to the blank image using those spatial co-ordinates.
Here's some sample code that you can play around with.
% Assumptions:
% im - Original image
% bmap - Binary image
% Where the output image will be stored
outImg = uint8(zeros(size(im)));
% Find locations in the binary image that are white
locWhite = find(bmap == 1);
% Copy over the intensity values from these locations from
% the original image to the output image.
% The output image will only contain those pixels that were white
% in the binary image
outImg(locWhite) = im(locWhite);
% Show the original and the result side by side
figure;
subplot(1,2,1);
imshow(im); title('Original Image');
subplot(1,2,2);
imshow(outImg); title('Extracted Result');
Let me know if this is what you're looking for.
Method #2
As suggested by Rafael in his comments, you can skip using find altogether and use logical indexing:
outImg = img;
outImg(~bmap) = 0;
I decided to use find as it is less obfuscated for a beginner, even though it is less efficient. Either method will give you the correct result.
Some food for thought
The extracted region that you have in your binary image has several holes. I suspect you would want to grab the entire region without any holes. As such, I would recommend that you fill in these holes before you use the above code. The imfill function from MATLAB works nicely and it accepts binary images as input.
Check out the documentation here: http://www.mathworks.com/help/images/ref/imfill.html
As such, apply imfill on your binary image first, then go ahead and use the above code to do your extraction.
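For reference, here is a rough Python/OpenCV sketch of the same idea (threshold, keep the largest connected component, then mask the grayscale image); this is my own equivalent under assumed file names and threshold, not the MATLAB code above:
import cv2
import numpy as np
img = cv2.imread("mammogram.png", cv2.IMREAD_GRAYSCALE)    # hypothetical file name
_, bmap = cv2.threshold(img, 51, 255, cv2.THRESH_BINARY)   # roughly 0.2 * 255, as in the first answer
# Label connected components and keep the one with the largest area (label 0 is the background)
num, labels, stats, _ = cv2.connectedComponentsWithStats(bmap)
largest = 1 + np.argmax(stats[1:, cv2.CC_STAT_AREA])
mask = (labels == largest)
# Copy the original intensities only where the mask is true
out = np.zeros_like(img)
out[mask] = img[mask]
cv2.imwrite("extracted.png", out)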
