Consider two images of arbitrary sizes. First bring the two images to the same size, then develop an algorithm to mix them so that alternate pixels are taken from the two image sources; it is a fusion of the two images. For example, pixel 1 comes from image 1, pixel 2 from image 2, the 3rd pixel from image 1, and so on.
I know you prefer to use MATLAB, but until someone gives you a MATLAB answer, you may like to play around with ImageMagick, which can do this for you, is included in most Linux distributions anyway, and is available for free for Windows and Mac OS X.
First, let's create 2 images of different sizes and colours:
convert -size 300x300 xc:blue image1.png
convert -size 200x400 xc:red image2.png
Basically, you can resize images as you read them in by specifying the image size in square brackets after the filename, so I am arbitrarily choosing to resize both images to 256x256 pixels. Then I use the extremely powerful fx operator to detect whether I am processing an odd- or an even-numbered pixel, and choose from either the first or the second image accordingly:
convert image1.png[256x256] image2.png[256x256] -fx "i%2?u:v" out.png
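For reference, the same alternating-pixel idea can be sketched in Python with NumPy and Pillow; the file names, the 256x256 size and the column-wise alternation are assumptions that simply mirror the command above:
import numpy as np
from PIL import Image

# Read both images and resize them to an arbitrary common size (assumed 256x256)
size = (256, 256)
img1 = np.array(Image.open('image1.png').convert('RGB').resize(size))
img2 = np.array(Image.open('image2.png').convert('RGB').resize(size))

# Even-numbered columns (0, 2, 4, ...) come from image 2 and odd-numbered columns
# from image 1, mirroring the i%2?u:v expression above
fused = img1.copy()
fused[:, 0::2, :] = img2[:, 0::2, :]

Image.fromarray(fused).save('out.png')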
Here is a way to do it with MATLAB.
clear
clc
%// Initialize red and blue images
RedImage = zeros(300,300,3,'uint8');
BlueImage = zeros(200,400,3,'uint8');
%// Color them
RedImage(:,:,1) = 255;
BlueImage(:,:,3) = 255;
figure('Color',[1 1 1]);
%// Show them
subplot(1,2,1)
imshow(RedImage)
subplot(1,2,2)
imshow(BlueImage)
It looks like this:
%// Resize them to same size
RedImage = imresize(RedImage,[256 256]);
BlueImage = imresize(BlueImage,[256 256]);
%// Initialize new image
NewImage = zeros(256,256,3,'uint8');
%// Assign alternate pixels to the new image
NewImage(1:2:end,1:2:end,:) = RedImage(1:2:end,1:2:end,:);
NewImage(2:2:end,2:2:end,:) = BlueImage(2:2:end,2:2:end,:);
figure
imshow(NewImage)
Which outputs this:
It looks dark, but resizing the figure will show you that it does indeed work!
Hope that helps! Have fun.
I have a 2D CNN model where I perform a classification task. My images all come from sensor data after conversion.
So, normally, my way is to convert them into images using the following approach:
from PIL import Image
import numpy as np

newsize = (9, 1000)
pic = acc_normalized[0]
img = Image.fromarray(np.uint8(pic*255), 'L')
img = img.resize(newsize)
image_path = "Images_Accel"
image_name = "D1." + str(2)
img.save(f"{image_path}/{image_name}.jpeg")
This is what I obtain:
However, their precision is sort of important. For instance, some of the numerical values are like:
117.79348187327987 or 117.76568758022673.
As you can see in the line above, they differ only in the later digits; when I use uint8, both become just 117 when converted into image pixels, so they look the same, right? But I'd like to make them different. In some cases, the difference only appears at the 8th or 10th digit.
So, when I try to use mode F in the Image.fromarray line and save as .jpeg, it gives me an error saying that PIL cannot write mode F to JPEG.
Then, I tried to first convert them to RGB like the following:
img = Image.fromarray(pic, 'RGB')
I am not wrapping pic in np.float32 and I am not multiplying it by 255; I use it as it is. Then I convert this image to grayscale. This is what I got for the RGB image:
After converting RGB into grayscale:
As you can see, there seems to be a critical difference between the first picture and the last picture. So, what would be the proper way to use them in 2D CNN classification? Or should I convert them into RGB and choose grayscale in the CNN implementation with a channel count of 1? My image dimensions are 1000x9. I can even change this dimension, e.g. to 250x36 or 100x90; it doesn't matter too much. By the way, with the CNN network I am able to get more than 90% test accuracy when I use the first type of image.
The main problem here is which image conversion method will let me take those precision differences across the pixels into account. Could you give me some ideas?
---- EDIT -----
Using .tiff format I made some quick comparisons.
First of all, my data looks like the following:
So, if I convert this first reading into an image using the following code, where I use np.float64 and mode L, it gives me a grayscale image:
newsize = (9, 1000)
pic = acc_normalized[0]
img = Image.fromarray(np.float64(pic), 'L')
img = img.resize(newsize)
image_path = "Images_Accel"
image_name = "D1." + str(2)
img.save(f"{image_path}/{image_name}.tiff")
It gives me this image:
Then, the first 15x9 block of the matrix corresponds to the following image. The contradiction is that, if you take a closer look at the numerical array, for instance the (1,4) element, it is completely black even though the numerical value there is 0.4326132099074307. For grayscale images, black means close to 0 and white means close to 1. However, if the conversion works row-wise, there is another value in that row closer to 0, and I was expecting to see black at the (1,5) location instead. If it works column-wise, there is again something wrong. As I said, this data has already been normalized and varies between 0 and 1. So, what is the logic by which the array is converted into an image? What kind of operation does it perform?
Secondly, if I first create an RGB image of the data and then convert it into a grayscale image, why don't I get exactly the same image as the one I obtained first? Should the image from the direct grayscale conversion (L mode, np.float64) and the one from the RGB-based route (first RGB, then convert to grayscale) be the same? There is a difference in black and white pixels between those images, and I do not know why.
---- EDIT 2 ----
.tiff image with F mode and np.float32 gives the following:
I don't really understand your question, but you seem to want to store image differences that are less than 1, i.e. less than the resolution of integer values.
To do so, you need to use an image format that can store floats. JPEG, PNG, GIF, TGA and BMP cannot store floats. Instead, use TIFF, EXR or PFM formats which can handle floats.
Alternatively, you can create 16-bit PNG images wherein each pixel can store values in the range 0..65535. So, say the maximum difference you wanted to store was 60: you could calculate the difference, multiply it by 1000, round it to make an integer in the range 0..60000, and store that as a 16-bit PNG.
You could record the scale factor as a comment within the image if it is variable.
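As a minimal sketch of both routes in Python/Pillow, assuming normalized float data in [0, 1] as in the question (the array contents, file names and the 65535 scale factor are only illustrative assumptions):
import numpy as np
from PIL import Image

# Stand-in for one normalized sensor reading (floats in [0, 1])
pic = np.random.rand(1000, 9).astype(np.float32)

# Option 1: keep the floats as they are in a 32-bit float TIFF (mode 'F')
Image.fromarray(pic, mode='F').save('reading_float.tiff')

# Option 2: quantize to 16 bits and save as PNG (65536 levels instead of 256)
pic16 = np.round(pic * 65535).astype(np.uint16)
Image.fromarray(pic16).save('reading_16bit.png')

# Divide by the same scale factor to recover (approximate) float values later
restored = np.asarray(Image.open('reading_16bit.png')).astype(np.float32) / 65535.0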
I have a general question about convolutional neural networks and image processing for training when your images are grayscale.
Take this image for example:
It's a grayscale image, but when I do
image = cv2.imread("image.jpg")
print(image.shape)
I get
(1024, 1024, 3)
I know that OpenCV automatically creates 3 channels for jpg images. But when it comes to network training, it would be much more computationally efficient if I could use images of shape (1024, 1024, 1), just like many of the MNIST tutorials demonstrate. However, if I reshape this:
image.reshape(1024, 1024, 1)
And then try for example to show the image
plt.axis("off")
plt.imshow(reshaped_image)
plt.show()
I get
raise TypeError("Invalid dimensions for image data")
Does that mean that reshaping my images this way before network training is incorrect? I want to keep as much information in the image as possible but I don't want to have those extra channels if they aren't needed.
The reason that you're getting the error is that the output of your reshape does not have the same number of elements as the input. From the documentation for reshape:
No extra elements are included into the new matrix and no elements are excluded. Consequently, the product rows*cols*channels() must stay the same after the transformation.
Instead, use cvtColor to convert your 3-channel BGR image to a 1-channel grayscale image:
In Python:
image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
Or in C++:
cv::cvtColor(image, image, cv::COLOR_BGR2GRAY);
You could also avoid conversion altogether by reading the image using the IMREAD_GRAYSCALE flag:
image = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
or
image = cv2.imread(image_path, 0)
(Thanks to @Alexander Reynolds for the Python code.)
This worked for me.
X = []
for image_path in dir:                  # 'dir' is an iterable of image file paths
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    X.append(img)
X = np.array(X)                         # shape: (num_images, height, width)
X = np.expand_dims(X, axis=3)           # shape: (num_images, height, width, 1)
Set axis to an integer position in the output array: axis=3 appends a channel dimension at the end (as above), while axis=0 would prepend a new dimension in front.
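For example, a tiny sketch with a hypothetical batch of three 1024x1024 grayscale images:
import numpy as np

X = np.zeros((3, 1024, 1024), dtype=np.uint8)   # three grayscale images

print(np.expand_dims(X, axis=3).shape)  # (3, 1024, 1024, 1): channel axis appended
print(np.expand_dims(X, axis=0).shape)  # (1, 3, 1024, 1024): new axis prepended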
Here is my image:
a.png
For binarization I tried this code:
im=rgb2gray(I);
maxp=uint16(max(max(im)));
minp=uint16(min(min(im)));
bw=im2bw(im,(double(minp+maxp))/(1.42*255));
bw=~bw;
imm=bw;
But I need binarization by Otsu's method. How can I get a good binary output using the Otsu method?
Please help, thanks.
MATLAB has its own implementation of Otsu thresholding called multithresh. In your case the code to obtain the segmented image should be something like this:
im=rgb2gray(I); % convert image to grayscale
thresh = multithresh(im); % find one threshold (using Otsu method)
segmented_im = imquantize(im, thresh); % segment image
imagesc(segmented_im); % show segmented image
I haven't tested it so I don't know how well it would perform on your image.
EDIT:
I tested it, and it doesn't work as expected. One of the problems is that Otsu's method works well when there is a clear bimodal distribution of the pixel intensities. This bimodality is lacking in your image. A call to imhist(im) after the grayscale conversion leads to this (comments added by me):
As you can see, the distribution is almost trimodal, and the threshold selected by multithresh is the first one, while you want the second one. The first workaround that comes to my mind (especially if all the images in your dataset are similar to the one you posted, i.e. have a similar intensity distribution) is to make multithresh output two thresholds, and then select the last (highest) one:
thresholds = multithresh(im, 2);
thresh = thresholds(end);
Then proceed with the segmentation of the image as stated above. This second method leads to this segmentation:
EDIT 2 (putting it all together):
Indeed the output segmented_im is not a binary image, but a label image. It's easy enough to convert it to a binary image. I will include directly all the code in this next snippet:
im=rgb2gray(I); % convert image to grayscale
thresholds = multithresh(im, 2); % find two thresholds using Otsu
thresh = thresholds(end); % select larger one
segmented_im = imquantize(im, thresh); % segment image
segmented_im(segmented_im == 1) = 0; % make background black (0)
segmented_im(segmented_im == 2) = 255; % make foreground white (255)
binary_im = im2bw(segmented_im); % make binary (logical) image
imshow(binary_im); % show binary image
binary_im is a logical matrix with false (0) for background, and true (1) for foreground. segmented_im is a double matrix with 0 for background and 255 for foreground. I hope this serves your purposes!
I have tried image subtraction in MATLAB, but realised that there is a big blue patch on the image. Please see the image for more details.
Another image showing approximately how far the blue patch extends.
The picture on the left in the top two images shows the picture after subtraction. You can ignore the picture on the right of the top two images. This is one of the original images:
and this is the background I am subtracting.
The purpose is to get the foreground image and blob it, followed by counting the number of blobs to see how many books are stacked vertically from their sides. I am experimenting with how the blob method works in MATLAB.
Does anybody have any idea? Below is the code showing how I carry out my background subtraction as well as display it. Thanks.
[filename, user_canceled] = imgetfile;
fullFileName=filename;
rgbImage = imread(fullFileName);
folder = fullfile('C:\Users\Aaron\Desktop\OPENCV\Book Detection\Sample books');
baseFileName = 'background.jpg';
fullFileName = fullfile(folder, baseFileName);
backgroundImage =imread(fullFileName);
rgbImage = rgbImage - backgroundImage;
%display foreground image after background subtraction%%%%%%%%%%%%%%
subplot( 1,2,1);
imshow(rgbImage, []);
Because the foreground objects (i.e. the books) are opaque, the background does not affect those pixels at all. In other words, you are subtracting out something that is not there. What you need is a method of detecting which pixels in your image correspond to foreground, and which correspond to background. Unfortunately, solving this problem might be at least as difficult as the problem you set out to solve in the first place.
If you just want a pixel-by-pixel comparison with the background you could try something like this:
thresh = 250;
imdiff = sum(((rgbImage-backgroundImage).^2),3);
mask = uint8(imdiff > thresh);
maskedImage = rgbImage.*cat(3,mask,mask,mask);
imshow(maskedImage, []);
You will have to play around with the threshold value until you get the desired masking. The problem you are going to have is that the background is poorly suited for the task. If you had the books in front of a green screen for example, you could probably do a much better job.
You are getting blue patches because you are subtracting two color RGB images. Ideally, in the difference image you expect to get zeros for the background pixels, and non-zeros for the foreground pixels. Since you are in RGB, the foreground pixels may end up having some weird color, which does not really matter. All you care about is that the absolute value of the difference is greater than 0.
By the way, your images are probably uint8, which is unsigned. You may want to convert them to double using im2double before you do the subtraction.
I want to load an RGB image in MATLAB and turn it into a binary image, where I can choose how many pixels the binary image has. For instance, I'd load a 300x300 png/jpg image into MATLAB and I'll end up with a binary image (pixels can only be #000 or #FFF) that could be 10x10 pixels.
This is what I've tried so far:
load trees % from MATLAB
gray=rgb2gray(map); % 'map' is loaded from 'trees'. Convert to grayscale.
threshold=128;
lbw=double(gray>threshold);
BW=im2bw(X,lbw); % 'X' is loaded from 'trees'.
imshow(X,map), figure, imshow(BW)
(I got some of the above from an internet search.)
I just end up with a black image when doing the imshow(BW).
Your first problem is that you are confusing indexed images (which have a colormap map) and RGB images (which don't). The sample built-in image trees.mat that you load in your example is an indexed image, and you should therefore use the function ind2gray to first convert it to a grayscale intensity image. For RGB images the function rgb2gray would do the same.
Next, you need to determine a threshold to use to convert the grayscale image to a binary image. I suggest the function graythresh, which will compute a threshold to plug into im2bw (or the newer imbinarize). Here is how I would accomplish what you are doing in your example:
load trees; % Load the image data
I = ind2gray(X, map); % Convert indexed to grayscale
level = graythresh(I); % Compute an appropriate threshold
BW = im2bw(I, level); % Convert grayscale to binary
And here is what the original image and result BW look like:
For an RGB image input, just replace ind2gray with rgb2gray in the above code.
With regard to resizing your image, that can be done easily with the Image Processing Toolbox function imresize, like so:
smallBW = imresize(BW, [10 10]); % Resize the image to 10-by-10 pixels
It is because gray is on a [0,1] scale, whereas threshold was given as 128, i.e. on a 0..255 scale.
This causes lbw to be a big array of false values. Here is a modified version of the code that solves the problem:
load trees % from MATLAB
gray=rgb2gray(map); % 'map' is loaded from 'trees'. Convert to grayscale.
threshold=128/256;
lbw=double(gray>threshold);
BW=im2bw(X,lbw); % 'X' is loaded from 'trees'.
imshow(X,map), figure, imshow(BW)
And the result is: