I'm using PIL to resize my images. Most of them are 640x480, and some are bigger. Most are PNG files, but some are JPEG.
I want to resize all my images to 32x32 pixels, but I noticed that the resolution seems to get visibly worse after resizing with PIL.
I found that this is a common question and that the problem often lies in how the image is saved.
I tried different values of "quality", and I read the documentation and tried other parameters such as "subsampling", with both JPEG and PNG output.
Here is my code:
import os
from PIL import Image

im = Image.open(os.path.join(my_path, file_name))
img = im.resize((32, 32))
if grey_scale is True:
    img = img.convert('L')  # convert the resized image to grey scale
img.save(os.path.join(my_path,
                      file_name[:file_name.index('.')] + '.jpg'), "JPEG", quality=100)
Here I have my input image
Here I have the grainy output obtained with my code
How can I make my images smaller while keeping a very good resolution?
The sort of image transformation you want is not possible. When you resize a raster image to lower pixel dimensions, it has to be downsampled, either with a smoothing filter or without one (a simple nearest-neighbour resize). A pixel can represent only one colour at a time, your final image has only 1024 of them (32x32), and the detail in the original image is far more than that many pixels can represent. The result will therefore always be of considerably lower quality: a pixelated image with artifacts.
But this is not the general case, as it depends a lot on what sort of detail the image contains. If the image is not complex (it does not contain many colour changes), it can be resized to much smaller dimensions without losing detail.
746x338 dimension image
32x32 dimension version of previous image
There is almost no difference between the two images (except for their physical size), even though their dimensions are very different. The reason is that these are non-complex images with the same pixel value over large areas, which makes it possible to resize them without loss of detail.
If the same process is applied to a complex image, like the one in your question, the result is a pixelated image.
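To make the difference between the two cases concrete, here is a minimal NumPy sketch. It is only an illustration under my own assumptions: the helper names are made up, the "resize" is a plain block average (not what PIL does internally), and the two test images are synthetic.

```python
import numpy as np

def block_mean_downsample(img, out_h, out_w):
    """Average non-overlapping blocks; assumes the sizes divide evenly."""
    h, w = img.shape
    return img.reshape(out_h, h // out_h, out_w, w // out_w).mean(axis=(1, 3))

def upsample_nearest(small, h, w):
    """Blow the small image back up by repeating each pixel."""
    rows = np.repeat(small, h // small.shape[0], axis=0)
    return np.repeat(rows, w // small.shape[1], axis=1)

h, w = 480, 640

# Simple image: left half black, right half white
simple = np.zeros((h, w))
simple[:, w // 2:] = 255.0

# Detailed image: a 1-pixel checkerboard
yy, xx = np.indices((h, w))
detailed = ((yy + xx) % 2) * 255.0

errors = {}
for name, img in (("simple", simple), ("detailed", detailed)):
    small = block_mean_downsample(img, 32, 32)
    restored = upsample_nearest(small, h, w)
    # Mean absolute difference between the original and the restored image
    errors[name] = float(np.abs(restored - img).mean())
    print(name, small.shape, errors[name])
```

The two-tone image survives the trip to 32x32 and back unchanged, while the checkerboard collapses into uniform grey: every block averages out and the pattern is gone.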
SOLUTION:-
You can either choose larger dimensions for the final image
(a lot more than 32x32) if you want to preserve the image quality.
Or create a vector equivalent of your image, which is resolution
independent and can be resized to a larger/smaller physical size
without affecting image quality.
P.S.:-
Don't save a .png image with a .jpg extension: JPEG is a lossy compression technique (for the most part), so the saved image ends up lower in quality than the original even if no manipulation is applied to it.
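The point about lossy saving can be checked with a quick in-memory round trip. This is just a sketch under my own assumptions: the test image is random noise (a worst case for JPEG), and `round_trip` is a helper name I made up.

```python
import io

import numpy as np
from PIL import Image

# Synthetic 32x32 RGB noise image
rng = np.random.default_rng(0)
pixels = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)
original = Image.fromarray(pixels)

def round_trip(img, fmt, **save_kwargs):
    """Save to an in-memory file in the given format and load it back."""
    buf = io.BytesIO()
    img.save(buf, format=fmt, **save_kwargs)
    buf.seek(0)
    return np.asarray(Image.open(buf).convert("RGB"))

def max_error(a, b):
    """Largest per-channel difference between two uint8 images."""
    return int(np.abs(a.astype(np.int16) - b.astype(np.int16)).max())

png_max_err = max_error(pixels, round_trip(original, "PNG"))
jpeg_max_err = max_error(pixels, round_trip(original, "JPEG", quality=100))
print("PNG max error:", png_max_err)
print("JPEG max error:", jpeg_max_err)
```

PNG gives back exactly the same pixels, while JPEG changes them even at quality=100.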
When you reduce the size, you can't keep the image crisp, because crisp detail needs pixels; you can't have both.
There are different resampling filters you can use in this case. See the code below.
from PIL import Image

# Image.LANCZOS replaces Image.ANTIALIAS, which was removed in Pillow 10
filters = [Image.NEAREST, Image.BILINEAR, Image.BICUBIC, Image.LANCZOS]
grey_scale = False
i = 0
for resample in filters:
    im = Image.open("./image.png")
    img = im.resize((32, 32), resample)
    if grey_scale is True:
        img = img.convert('L')  # convert the resized image to grey scale
    i = i + 1
    img.save("./" + str(i) + '.jpg', "JPEG", quality=100)
Results:
Next, resize does not maintain the aspect ratio. Instead of resize, you can use the thumbnail method, which keeps the aspect ratio.
from PIL import Image

# Image.LANCZOS replaces Image.ANTIALIAS, which was removed in Pillow 10
filters = [Image.NEAREST, Image.BILINEAR, Image.BICUBIC, Image.LANCZOS]
grey_scale = False
i = 5
for resample in filters:
    im = Image.open("./image.png")
    im.thumbnail((32, 32), resample)  # resizes in place and returns None
    img = im
    if grey_scale is True:
        img = img.convert('L')  # convert the resized image to grey scale
    i = i + 1
    img.save("./" + str(i) + '.jpg', "JPEG", quality=100)
Results:
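If the output has to be exactly 32x32 and you still want to keep the aspect ratio, one option is to thumbnail first and then paste the result onto a 32x32 canvas. A minimal sketch, assuming a black border is acceptable; `fit_on_canvas` is a name I made up, and the source image is a synthetic 640x480 one:

```python
from PIL import Image

def fit_on_canvas(im, size=(32, 32), fill=0):
    """Shrink `im` to fit inside `size` keeping its aspect ratio,
    then centre it on a canvas of exactly `size`."""
    small = im.copy()                      # thumbnail() works in place
    small.thumbnail(size, Image.LANCZOS)
    canvas = Image.new(im.mode, size, fill)
    offset = ((size[0] - small.width) // 2, (size[1] - small.height) // 2)
    canvas.paste(small, offset)
    return small, canvas

source = Image.new("RGB", (640, 480), (200, 50, 50))

stretched = source.resize((32, 32), Image.LANCZOS)   # aspect ratio is lost
small, padded = fit_on_canvas(source)                # aspect ratio is kept

print(stretched.size, small.size, padded.size)
```

For a 640x480 input the thumbnail comes out as 32x24, and the remaining rows of the canvas stay filled with the border colour.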
I am training custom object detection by using mask RCNN. I have custom images that are of different sizes, so I am wondering if I need to resize the images so that they are all of the same size or not?
And if so, which method should I use to resize them?
Also I guess that I have to resize before labeling the images right?
You don't necessarily have to resize them beforehand.
You can use this option in the model config file to set the size limits for training.
image_resizer {
  keep_aspect_ratio_resizer {
    min_dimension: 600
    max_dimension: 1024
  }
}
Please make sure all the bounding boxes are in range of the image dimensions, i.e. within the width and height of the image. The boxes and the images will then be resized automatically according to the parameters set here.
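A quick way to check the boxes before training could look like the sketch below. The box layout (xmin, ymin, xmax, ymax) in pixels and the function name are my assumptions; adapt them to whatever your annotation format uses.

```python
def boxes_out_of_range(width, height, boxes):
    """Return the boxes that do not fit inside a width x height image.

    Each box is assumed to be (xmin, ymin, xmax, ymax) in pixels.
    """
    bad = []
    for box in boxes:
        xmin, ymin, xmax, ymax = box
        inside = 0 <= xmin < xmax <= width and 0 <= ymin < ymax <= height
        if not inside:
            bad.append(box)
    return bad

# Hypothetical annotations for a 640x480 image
boxes = [(10, 20, 200, 300), (500, 100, 700, 400), (-5, 0, 50, 50)]
bad = boxes_out_of_range(640, 480, boxes)
print(bad)
```

Here the second box runs past the right edge and the third starts left of the image, so both are reported.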
In Matterport's Mask RCNN you can find the documentation in the config file:
# Input image resizing
# Generally, use the "square" resizing mode for training and predicting
# and it should work well in most cases. In this mode, images are scaled
# up such that the small side is = IMAGE_MIN_DIM, but ensuring that the
# scaling doesn't make the long side > IMAGE_MAX_DIM. Then the image is
# padded with zeros to make it a square so multiple images can be put
# in one batch.
# Available resizing modes:
# none: No resizing or padding. Return the image unchanged.
# square: Resize and pad with zeros to get a square image
# of size [max_dim, max_dim].
# pad64: Pads width and height with zeros to make them multiples of 64.
# If IMAGE_MIN_DIM or IMAGE_MIN_SCALE are not None, then it scales
# up before padding. IMAGE_MAX_DIM is ignored in this mode.
# The multiple of 64 is needed to ensure smooth scaling of feature
# maps up and down the 6 levels of the FPN pyramid (2**6=64).
# crop: Picks random crops from the image. First, scales the image based
# on IMAGE_MIN_DIM and IMAGE_MIN_SCALE, then picks a random crop of
# size IMAGE_MIN_DIM x IMAGE_MIN_DIM. Can be used in training only.
# IMAGE_MAX_DIM is not used in this mode.
IMAGE_RESIZE_MODE = "square"
IMAGE_MIN_DIM = 800
IMAGE_MAX_DIM = 1024
This is how I understand it: when you train or predict, this configuration is used without you having to resize anything manually. Of course, images of different sizes and aspect ratios can still be a problem.
512x512: ratio = 1, so the image is upscaled to 800x800 (IMAGE_MIN_DIM) and zero-padded to 1024x1024.
2054x2456: ratio = 0.836..., so the image is downscaled until its long side is 1024 (about 856x1024), keeping the ratio of 0.836..., and zero-padded to get the 1024x1024 square.
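The two cases can be worked through with a few lines of Python. This is only my sketch of the "square" behaviour as the config comment describes it, not the library's actual code, and `square_mode_shape` is a made-up name.

```python
def square_mode_shape(h, w, min_dim=800, max_dim=1024):
    """Scaled size and zero padding for 'square' mode, as described above."""
    # Scale up so that the short side reaches min_dim (never scale down here)
    scale = max(1.0, min_dim / min(h, w))
    # ...but do not let the long side exceed max_dim
    if round(max(h, w) * scale) > max_dim:
        scale = max_dim / max(h, w)
    new_h, new_w = round(h * scale), round(w * scale)
    # Zero padding needed to reach a max_dim x max_dim square
    top = (max_dim - new_h) // 2
    left = (max_dim - new_w) // 2
    padding = (top, max_dim - new_h - top, left, max_dim - new_w - left)
    return (new_h, new_w), padding

for h, w in [(512, 512), (2054, 2456)]:
    size, padding = square_mode_shape(h, w)
    print((h, w), "->", size, "padding (top, bottom, left, right):", padding)
```

Under these assumptions the 512x512 image becomes 800x800 with 112 pixels of padding on every side, and the 2054x2456 image becomes 856x1024 with 84 rows of padding above and below.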
Where it could go wrong is when an object's size relative to the image differs between images of different dimensions, so that after resizing the object ends up stretched or compressed compared with the other images. In that case you should preprocess the images manually so that in the end your object has the same size and shape after the Mask RCNN function has molded the image into the right shape.
The Matterport function is found in "utils.py" and is called "resize_image".
In "model.py" it is used during training when loading the data and during inference (detect) to resize the given NumPy array.
I have a general question about convolutional neural networks and image processing for training if your images are grey scale.
Take this image for example:
It's a grey scale image, but when I do
image = cv2.imread("image.jpg")
print(image.shape)
I get
(1024, 1024, 3)
I know that opencv automatically creates 3 channels for jpg images. But when it comes to network training, it would be much more computationally efficient if I could use images in (1024, 1024, 1) - just like many of the MNIST tutorials demonstrate. However, if I reshape this:
image.reshape(1024, 1024 , 1)
And then try for example to show the image
plt.axis("off")
plt.imshow(reshaped_image)
plt.show()
I get
raise TypeError("Invalid dimensions for image data")
Does that mean that reshaping my images this way before network training is incorrect? I want to keep as much information in the image as possible but I don't want to have those extra channels if they aren't needed.
The reason that you're getting the error is that the output of your reshape does not have the same number of elements as the input. From the documentation for reshape:
No extra elements are included into the new matrix and no elements are excluded. Consequently, the product rows*cols*channels() must stay the same after the transformation.
Instead, use cvtColor to convert your 3-channel BGR image to a 1-channel grayscale image:
In Python:
image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
Or in C++:
cv::cvtColor(image, image, cv::COLOR_BGR2GRAY);
You could also avoid conversion altogether by reading the image using the IMREAD_GRAYSCALE flag:
image = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
or
image = cv2.imread(image_path, 0)
(Thanks to @Alexander Reynolds for the Python code.)
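To see the element-count rule in NumPy terms, here is a small sketch that does not need OpenCV. The tiny array is a stand-in I made up for what cv2.imread returns for a grey image, i.e. three identical channels.

```python
import numpy as np

# Stand-in for a grey image loaded with three identical channels
grey = np.arange(4 * 6, dtype=np.uint8).reshape(4, 6)
bgr = np.stack([grey, grey, grey], axis=-1)        # shape (4, 6, 3)

# 4*6*3 = 72 elements cannot be reshaped into 4*6*1 = 24 elements
try:
    bgr.reshape(4, 6, 1)
    reshape_failed = False
except ValueError:
    reshape_failed = True

# Drop the redundant channels first, then add the channel axis back
single = bgr[:, :, 0]                              # shape (4, 6)
with_channel = single[..., np.newaxis]             # shape (4, 6, 1), for the network
for_display = with_channel.squeeze(axis=-1)        # shape (4, 6), for plt.imshow

print(reshape_failed, single.shape, with_channel.shape, for_display.shape)
```

Taking channel 0 only matches a proper grey scale conversion when the three channels really are identical; for colour images, use cvtColor as shown above. The (4, 6, 1) shape is what a network typically wants, while the 2-D (4, 6) shape is the safe one to pass to imshow.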
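This worked for me.
import cv2
import numpy as np

X = []
for image_path in image_paths:  # image_paths: your list of image file paths
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    X.append(img)
X = np.array(X)
X = np.expand_dims(X, axis=3)
The axis argument is an integer giving the position of the new dimension: axis=3 appends a channel dimension to the (N, height, width) array, whereas axis=0 would prepend a new dimension in front.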
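If the axis positions are confusing, a quick shape check helps. The batch below is a dummy array of zeros that I made up purely to print shapes.

```python
import numpy as np

# Dummy batch: 10 grey scale images of 28x28 pixels
batch = np.zeros((10, 28, 28), dtype=np.uint8)

channels_last = np.expand_dims(batch, axis=3)    # (10, 28, 28, 1)
also_last = np.expand_dims(batch, axis=-1)       # same thing, counted from the end
after_batch = np.expand_dims(batch, axis=1)      # (10, 1, 28, 28)
in_front = np.expand_dims(batch, axis=0)         # (1, 10, 28, 28)

for arr in (channels_last, also_last, after_batch, in_front):
    print(arr.shape)
```

Which layout you need depends on the framework: channels-last is (N, H, W, 1) and channels-first is (N, 1, H, W).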
I have a 565 * 584 image as shown
I want to reduce the radius of the circle by certain number of pixels without changing the size of the image. How can I do it? Please explain or give some ideas. Thank You.
I would use ImageMagick and an erosion like this:
convert http://i.stack.imgur.com/c8lfe.jpg -morphology erode octagon:8 out.png
If you know that the background of the image is a constant, as in your example, this is easy.
Resize the entire image by the ratio you wish to shrink by. Then create a new image at the size of the original and fill it with the background color, then paste the resized image into the center of it.
Here's how you'd do it in OpenCV Python. Going with Mark Setchell's approach, simply specify a round structuring element so that you can maintain or respect the round edges of the object. The closest thing that OpenCV has to offer is the elliptical mask.
As such:
import numpy as np # Import relevant packages - numpy and OpenCV
import cv2
# Read in image as grayscale and threshold it to a 0/1 uint8 mask
# (OpenCV functions expect numeric arrays rather than boolean ones)
im = (cv2.imread('c8lfe.jpg', 0) > 128).astype(np.uint8)
# Specify the size (full width) of the elliptical structuring element
radius = 21
# Obtain structuring element, then erode image
se = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (radius, radius))
# Scale the eroded 0/1 mask back to 0/255 grayscale
out = 255 * cv2.erode(im, se)
# Show the image, wait for user key, then close window and write image
cv2.imshow('Reduced shape', out)
cv2.waitKey(0)
cv2.destroyAllWindows()
cv2.imwrite('out.png', out)
We get:
Be advised that the small bump at the top right corner of your shape will mutate. As we are essentially shrinking the perimeter of the object, that bump will also shrink as well. If you wish to preserve the structure of the object while maintaining the image resolution, use Mark Ransom's approach or my slightly modified version of his approach. Both are shown below.
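On the question of how many pixels the erosion removes: eroding with a disk of radius n pulls the boundary in by n pixels. Here is a pure NumPy sketch of that relationship. It is a stand-in for cv2.erode written only for illustration, with helper names I made up and a synthetic circle instead of your image.

```python
import numpy as np

def disk_mask(size, radius):
    """Boolean image of the given size with a filled disk in the centre."""
    c = size // 2
    yy, xx = np.indices((size, size))
    return (yy - c) ** 2 + (xx - c) ** 2 <= radius ** 2

def erode_with_disk(mask, r):
    """Binary erosion with a disk of radius r (brute force, for illustration)."""
    h, w = mask.shape
    padded = np.pad(mask, r, mode="constant", constant_values=False)
    out = np.ones((h, w), dtype=bool)
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            if dy * dy + dx * dx <= r * r:
                # A pixel survives only if every neighbour under the disk is set
                out &= padded[r + dy:r + dy + h, r + dx:r + dx + w]
    return out

circle = disk_mask(201, 50)          # circle of radius 50
shrunk = erode_with_disk(circle, 8)  # ask for 8 pixels less

centre_row = 201 // 2
width_before = int(circle[centre_row].sum())
width_after = int(shrunk[centre_row].sum())
print(width_before, width_after)
```

The width along the centre row drops from 2*50+1 to 2*42+1 pixels, i.e. the radius shrinks by 8. With OpenCV this corresponds roughly to an elliptical structuring element of size (2n+1, 2n+1) for a reduction of n pixels.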
However, to be self-contained, we can certainly do what Mark Ransom has suggested. Resize the image, initialize a blank image that is size of the original image, and place it in the centre:
import numpy as np # Import relevant packages - numpy and OpenCV
import cv2
im = cv2.imread('c8lfe.jpg', 0) # Read in the image - grayscale
scale_factor = 0.75 # Set scale factor - We are shrinking the image by 25%
# Get the desired size (row and columns) of the shrunken image
desired_size = np.floor(scale_factor*np.array(im.shape)).astype('int')
# Make sure desired size is ODD for easier placement
if desired_size[0] % 2 == 0:
    desired_size[0] += 1
if desired_size[1] % 2 == 0:
    desired_size[1] += 1
# Resize the image. Columns come first, followed by rows, which is why we
# reverse the desired_size array (converted to plain ints for OpenCV)
rsz = cv2.resize(im, tuple(int(d) for d in desired_size[::-1]))
# Determine half width of both dimensions of shrunken image
half_way = np.floor(desired_size/2.0).astype('int')
# Create output image that is the same size as the input and find its centre
out = np.zeros_like(im, dtype='uint8')
centre = np.floor(np.array(im.shape)/2.0).astype('int')
# Place shrunken image in the centre of the larger output image
out[centre[0]-half_way[0]:centre[0]+half_way[0]+1, centre[1]-half_way[1]:centre[1]+half_way[1]+1] = rsz
# Show the image, wait for user key, then close window and write image
cv2.imshow('Reduced shape', out)
cv2.waitKey(0)
cv2.destroyAllWindows()
cv2.imwrite('out.png', out)
We get:
Another suggestion
What I can also recommend is to pad the array with zeroes and then shrink the image back to its original size. You would essentially extend the borders of the original image so that the borders contain zeroes. In this case, we do what Mark Ransom also suggested, but working from the inside out.
For padding a matrix with zeroes in OpenCV C++, see the question "Pad array with zeros - OpenCV". In Python, simply use numpy's pad function:
import numpy as np # Import relevant packages - numpy and OpenCV
import cv2
# Read in image as grayscale
im = cv2.imread('c8lfe.jpg', 0)
# Set how many pixels along the border you want to add on each side
pad_radius = 75
# Pad the image
out = np.pad(im, ((pad_radius, pad_radius), (pad_radius, pad_radius)), 'constant', constant_values=0)
# Shrink it back to what the original size was
out = cv2.resize(out, im.shape[::-1])
# Show the image, wait for user key, then close window and write image
cv2.imshow('Reduced shape', out)
cv2.waitKey(0)
cv2.destroyAllWindows()
cv2.imwrite('out.png', out)
We thus get:
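To pick pad_radius for a given reduction instead of guessing, the arithmetic is simple: padding p pixels on each side of a side of length s and resizing back scales everything by s / (s + 2p). A small sketch with helper names I made up; the radius of 200 in the example is a hypothetical value, not measured from your image.

```python
def shrink_factor(side, pad):
    """Scale factor after padding `pad` pixels on both ends and resizing back."""
    return side / (side + 2 * pad)

def pad_for_radius_reduction(side, radius, reduce_by):
    """Padding per side so that a shape of `radius` shrinks by about `reduce_by`."""
    target = (radius - reduce_by) / radius        # desired scale factor
    return round(side * (1 - target) / (2 * target))

# Padding 75 pixels on each end of a 584 pixel side
factor = shrink_factor(584, 75)
print(round(factor, 4))

# Hypothetical circle of radius 200 that should lose 20 pixels of radius
pad = pad_for_radius_reduction(584, 200, 20)
print(pad)
```

Note that with a non-square image (565 by 584 here) the same padding on both axes gives slightly different factors per axis, so the circle comes out very slightly elliptical. Padding each axis in proportion to its length avoids that.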
Img is a dtype=float64 numpy data type. When I run this code:
Img2 = np.array(Img, np.uint8)
the background of my images turns white. How can I avoid this and still get an 8-bit image?
Edit:
Sure, I can give more info. The single image is compiled from a stack of 400 images. They are each coming from an .avi video file, and each image is converted into a NumPy array like this:
gray_img = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
A more complicated operation is performed on this whole stack, but does not involve creating new images. It's simply performing calculations on each 1D array to yield a single pixel.
The interpolation is most likely linear (the default when plotting images with matplotlib). The images were saved as PNGs.
You probably see overflow. If you cast 257 to np.uint8, you will get 1. According to a Google search, AVI files contain images with a colour depth of 15 to 24 bits. When you cast this depth to np.uint8, white regions get darkened and (if a normalization takes place somewhere) dark regions turn white (-5 -> 251). For the regions that become bright, you could check whether the original image Img contains negative pixel values.
The docs say that sometimes you have to do some scaling to get a proper cast, and that it is better to use a higher depth whenever possible to avoid artefacts.
The solution seems to be either to work at a higher depth, i.e. cast to np.uint16 or np.uint32, or to scale the pixel values before reducing the depth, i.e. with Img already being a NumPy array of non-negative values:
# make sure that values are between 0 and 255, i.e. within 8bit range
Img2 = Img * (255.0 / Img.max())
# cast to 8bit
Img2 = Img2.astype(np.uint8)
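If the array may also contain negative values, scaling by the maximum alone is not enough. Below is a sketch of a min-max rescale before the cast. `to_uint8` is a name I made up, and stretching the full value range to 0..255 is my assumption about what is acceptable for your data.

```python
import numpy as np

# Integer wrap-around, as described above: 257 -> 1 and -5 -> 251
wrapped = np.array([257, -5], dtype=np.int64).astype(np.uint8)
print(wrapped)

def to_uint8(img):
    """Rescale any numeric array to the full 0..255 range, then cast."""
    img = np.asarray(img, dtype=np.float64)
    lo, hi = img.min(), img.max()
    if hi == lo:                       # flat image: avoid dividing by zero
        return np.zeros(img.shape, dtype=np.uint8)
    scaled = (img - lo) * (255.0 / (hi - lo))
    return np.clip(np.rint(scaled), 0, 255).astype(np.uint8)

Img = np.array([[-5.0, 0.0], [128.0, 257.0]])
Img2 = to_uint8(Img)
print(Img2.dtype, Img2.min(), Img2.max())
```

Casting out-of-range floats directly to uint8 is not something to rely on, which is why the values are brought into range and clipped first.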
Suppose I have an image of size 400 x 600. I know how to resize it to 80 x 80 by using the code below:
original_image = imread(my_image);
original_image_gray = rgb2gray(original_image);
Image_resized = imresize(original_image_gray, [80 80]);
But I think that imresize will resize the image with some loss of quality. How can I resize it without any loss of quality?
Image resizing itself will lose part of the image info, i.e. quality of the image.
What you can do is to choose the resizing method that fits your purpose by setting up the corresponding parameter:
[...] = imresize(...,method)
^^^^^^
MATLAB stores images as pixel arrays. It is impossible to store all the information contained in a 400x600 element matrix in an 80x80 matrix, so quality loss is unavoidable when resizing the pixel array, which is what imresize does.
If you want to reduce the physical size of your output, you should look at the imwrite documentation, in particular at the XResolution and YResolution parameters in the case of creating png images.
original_image = imread(my_image);
imwrite(original_image,'image.png','png','ResolutionUnit','cm','XResolution',400)
The above code will create a png of the original image with a resolution of 400px/cm, resulting in an image of 1cm width. The png will still be a 400x600px Bitmap.
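The relationship between pixel count, stored resolution and physical size is plain division. A tiny Python sketch (the function name is made up, and I assume the same density on both axes):

```python
def physical_size_cm(width_px, height_px, px_per_cm):
    """Physical size implied by a pixel size and a pixel density."""
    return width_px / px_per_cm, height_px / px_per_cm

size_cm = physical_size_cm(400, 600, 400)
print(size_cm)
```

Only the metadata changes; the pixel array, and therefore the image detail, stays exactly the same.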