Get multiple image files as a 2D tensor in TensorFlow

For a softmax regression program using TensorFlow in Python, I want to get my 1000 JPEG image files as a 2D tensor x of shape [image index, pixel index]. The "image index" selects the image and the "pixel index" selects a specific pixel within that image.
The model equation is:
y = tf.nn.softmax(tf.matmul(x, W) + b)
where:
x = tf.placeholder(tf.float32, [None, image_size])
W = tf.Variable(tf.zeros([image_size, classes]))
b = tf.Variable(tf.zeros([classes]))
image_size = height*width of the image (constant for all images).
What is the best way in TensorFlow to get my image files into that form?

When I do image processing I like to use either OpenCV (cv2.imread(...)) or SciPy (scipy.ndimage.imread(...)) to read the image files; TensorFlow also has its own image readers you can use. Both functions return the image as a NumPy array, and you can specify in the arguments whether you want grayscale or color. Next you need to preprocess the images: you may have to convert the datatype (OpenCV uses 8-bit integers rather than float32) and normalize the data, and you can also resize at this point if the images are not all the same size.
You can then flatten these NumPy arrays to get a flat representation of each image: just call the flatten() method of the np.ndarray. After you have loaded and flattened the images you want for your batch, stack them together in a NumPy array, np.array([img1, img2, ..., imgN]), and this array will have shape [images, pixels]. You can then feed it to your x placeholder.
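A minimal sketch of that pipeline, assuming a hypothetical image_paths list of file paths and images that all share the same dimensions:
import cv2
import numpy as np

batch = []
for path in image_paths:                          # hypothetical list of JPEG paths
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)  # uint8 array of shape (H, W)
    img = img.astype(np.float32) / 255.0          # convert and normalize to [0, 1]
    batch.append(img.flatten())                   # 1D array of H*W pixel values

x_batch = np.array(batch)                         # shape: [images, pixels]
# pass feed_dict={x: x_batch, ...} when running the training step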

I would prefer to preprocess every image ahead of time for training, but for using TensorFlow online with a live image stream, I would try the following method, which reshapes the data dynamically in memory:
any_shape = [the most natural shape according to the data you already have...]
x_unshaped = tf.placeholder(tf.float32, any_shape)
x = tf.reshape(x_unshaped, [-1, image_size])
If your data is already properly ordered in memory, you could try tf.Tensor.set_shape():
The tf.Tensor.set_shape() method updates the static shape of a Tensor
object, and it is typically used to provide additional shape
information when this cannot be inferred directly. It does not change
the dynamic shape of the tensor.
Source: https://www.tensorflow.org/versions/r0.9/api_docs/python/framework.html
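A minimal sketch of the set_shape() variant, assuming the same graph-mode API as the question; no data is moved, only the static shape information is updated:
x = tf.placeholder(tf.float32)       # static shape unknown at this point
x.set_shape([None, image_size])      # record the known static shape;
                                     # the dynamic shape is unchanged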

Related

Reshaping greyscale images for neural network training - how to do this correctly

I have a general question about convolutional neural networks and image preprocessing for training when your images are grayscale.
Take this image for example:
It's a grayscale image, but when I do
image = cv2.imread("image.jpg")
print(image.shape)
I get
(1024, 1024, 3)
I know that OpenCV automatically creates 3 channels for JPG images. But for network training it would be much more computationally efficient if I could use images of shape (1024, 1024, 1), just like many of the MNIST tutorials demonstrate. However, if I reshape this:
reshaped_image = image.reshape(1024, 1024, 1)
and then try, for example, to show the image:
plt.axis("off")
plt.imshow(reshaped_image)
plt.show()
I get
raise TypeError("Invalid dimensions for image data")
Does that mean that reshaping my images this way before network training is incorrect? I want to keep as much information in the image as possible, but I don't want to have those extra channels if they aren't needed.
The reason that you're getting the error is that the output of your reshape does not have the same number of elements as the input. From the documentation for reshape:
No extra elements are included into the new matrix and no elements are excluded. Consequently, the product rows*cols*channels() must stay the same after the transformation.
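The same constraint applies to the NumPy reshape the questioner is actually calling. A minimal demonstration, assuming the (1024, 1024, 3) image from the question:
import numpy as np

image = np.zeros((1024, 1024, 3), dtype=np.uint8)
# 1024*1024*3 elements cannot be rearranged into 1024*1024*1,
# so NumPy rejects the reshape outright
try:
    image.reshape(1024, 1024, 1)
except ValueError as err:
    print(err)  # cannot reshape array of size 3145728 into shape (1024,1024,1)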
Instead, use cvtColor to convert your 3-channel BGR image to a 1-channel grayscale image:
In Python:
image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
Or in C++:
cv::cvtColor(image, image, cv::COLOR_BGR2GRAY);
You could also avoid conversion altogether by reading the image using the IMREAD_GRAYSCALE flag:
image = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
or
image = cv2.imread(image_path, 0)
(Thanks to @Alexander Reynolds for the Python code.)
This worked for me.
X = []
for image_path in image_paths:  # list of image file paths
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    X.append(img)
X = np.array(X)
X = np.expand_dims(X, axis=3)
The axis argument controls where the new singleton dimension is inserted; axis=3 appends a channel dimension at the end, giving X the shape (num_images, 1024, 1024, 1) that the network expects.
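If you later want to display one of these (1024, 1024, 1) arrays, note that plt.imshow expects (M, N) or (M, N, 3) data, which is what caused the error in the question. A small sketch, reusing X from above:
import matplotlib.pyplot as plt

plt.axis("off")
plt.imshow(X[0].squeeze(), cmap="gray")  # (1024, 1024, 1) -> (1024, 1024)
plt.show()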

Extracting pixel values using Pillow - every coordinate I use gives the same value

I am attempting to use the Pillow library in Python 2.7 to extract pixel values at given coordinates on PNG and JPG images. I don't get any errors, but I always get the same value irrespective of the coordinates I use, on an image where the values do vary.
This is an extract from my script where I print all the values (it's a small test image):
from PIL import Image
box = Image.open("col_grad.jpg")
pixels = list(box.getdata())
print(pixels)
And this is from when I try to extract a single value:
from PIL import Image, ImageFilter
box = Image.open("col_grad.jpg")
value = box.load()
print(value[10,10])
I have been using these previous questions on this topic for guidance including:
Getting list of pixel values from PIL
How can I read the RGB value of a given pixel in Python?
I'm not sure you can access it the way you want, because of the complexity of image data.
Just get the pixel:
Image.getpixel(xy)
Returns the pixel value at a given position.
Parameters: xy – The coordinate, given as (x, y).
Returns: The pixel value. If the image is a multi-layer image, this method returns a tuple.
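For the file from the question, usage would look like this (a short sketch using the questioner's col_grad.jpg):
from PIL import Image

box = Image.open("col_grad.jpg")
print(box.getpixel((10, 10)))  # e.g. an (r, g, b) tuple for an RGB image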
You should consider doing one of the following:
FIRST convert your image to grayscale and then find the pixel values present.
img_grey = img.convert('L')  # convert image to 8-bit grayscale
SECOND If you want the pixel values of RGB channels, you have to split your color image. Then find the pixel values in all the channels at a particular coordinate.
r, g, b = img.split()
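A brief sketch of the channel-split approach, again using the questioner's col_grad.jpg:
from PIL import Image

img = Image.open("col_grad.jpg")
r, g, b = img.split()  # one single-band image per channel
print(r.getpixel((10, 10)), g.getpixel((10, 10)), b.getpixel((10, 10)))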

How to display a Gray scale image using boundary defined in another binary image

I have an original grayscale image (I am using a mammogram image with labels outside the image region).
I need to remove some objects (labels) in that image, so I converted the grayscale image to a binary image. Then I followed the answer method provided in
How to Select Object with Largest area
Finally I extracted the object with the largest area as a binary image. I want that region in grayscale so I can access and segment the small objects within it, for example minor tissues in the region, and also detect their edges.
How can I get that separated object region as a grayscale image, or is there any way to get the largest object region from grayscale directly, without converting to binary?
(I am new to MATLAB. I don't know whether I explained it correctly or not; if you can't follow, I'll provide more detail.)
If I understood you correctly, you are looking to have a grayscale image with only the biggest blob highlighted.
Code
img = imread(IMAGE_FILEPATH);
BW = im2bw(img,0.2); %%// 0.2 worked to get a good area for the biggest blob
%%// Biggest blob
[L, num] = bwlabel(BW);
counts = sum(bsxfun(@eq,L(:),1:num));
[~,ind] = max(counts);
BW = (L==ind);
%%// Close the biggest blob
[L,num] = bwlabel( ~BW );
counts = sum(bsxfun(@eq,L(:),1:num));
[~,ind] = max(counts);
BW = ~(L==ind);
%%// Original image with only the biggest blob highlighted
img1 = uint8(255.*bsxfun(@times,im2double(img),BW));
%%// Display input and output images
figure,
subplot(121),imshow(img)
subplot(122),imshow(img1)
Output: [figure: the input image alongside the image with only the biggest blob highlighted]
If I understand your question correctly, you want to use the binary map and access the corresponding pixel intensities in those regions.
If that's the case, then it's very simple. You can use the binary map to identify the spatial coordinates of where you want to access the intensities in the original image. Create a blank image, then copy these intensities over to the blank image using those spatial coordinates.
Here's some sample code that you can play around with.
% Assumptions:
% im - Original image
% bmap - Binary image
% Where the output image will be stored
outImg = uint8(zeros(size(im)));
% Find locations in the binary image that are white
locWhite = find(bmap == 1);
% Copy over the intensity values from these locations from
% the original image to the output image.
% The output image will only contain those pixels that were white
% in the binary image
outImg(locWhite) = im(locWhite);
% Show the original and the result side by side
figure;
subplot(1,2,1);
imshow(im); title('Original Image');
subplot(1,2,2);
imshow(outImg); title('Extracted Result');
Let me know if this is what you're looking for.
Method #2
As suggested by Rafael in his comments, you can skip using find altogether and use logical indexing:
outImg = img;
outImg(~bmap) = 0;
I decided to use find as it is less obfuscated for a beginner, even though it is less efficient. Either method will give you the correct result.
Some food for thought
The extracted region that you have in your binary image has several holes. I suspect you would want to grab the entire region without any holes. As such, I would recommend that you fill in these holes before you use the above code. The imfill function from MATLAB works nicely and it accepts binary images as input.
Check out the documentation here: http://www.mathworks.com/help/images/ref/imfill.html
As such, apply imfill on your binary image first, then go ahead and use the above code to do your extraction.

How to identify and display images from a MATLAB .mat data file?

I have a MATLAB file (xyz.mat), and apparently there is image data in this file, but I have very little experience with MATLAB and no clue how to extract/open it.
This is the only clue I have:
The Matlab data file contains a structure "data" with a field "dataList" which is itself a structure array with one element per image. So the first image can be found in data.dataList(1).img
After loading the file into MATLAB (nothing happened) and typing the command data.dataList(1).img (I got a huge list of numbers) I still get no image.
Any help/ideas?
If data.dataList(1).img is 2D or 3D (check using size), you can use imshow to visualize the 2D array (grayscale) or 3D array (color) as an image.
im = data.dataList(1).img;
figure; imshow(im, []);
You can find the range of this image using min(im(:)) and max(im(:)), or plot the distribution of its values using imhist.
To view all images as a rectangular montage look into montage function:
montage(I) displays all the frames of a multiframe image array I in a
single image object. I can be a sequence of binary, grayscale, or
truecolor images. A binary or grayscale image sequence must be an
M-by-N-by-1-by-K array.
In effect, you can put K images (of the same M x N size) in an M x N x 1 x K array and invoke montage:
for k = 1:K
    I(:,:,1,k) = data.dataList(k).img;
end
figure; montage(I);

Matlab - How to obtain values of pixels?

If I have an image, how can I obtain the values of each pixel in that image using MATLAB?
Images are matrices (2D if grayscale, 3D if colored) in MATLAB.
You can use x(i,j) to access a pixel at location (i,j) in a grayscale image.
If the image is colored, you can use x(i,j,:) to access the r, g, b values as a 3-vector. If you need individual channels, you can use x(i,j,1) for red, for example.
You may read this page to learn more.
You can use reshape to extract all the pixel values of the image into a vector with pixel values:
frame = imread('picture.jpg');
frame_size = size(frame);
allpixels = reshape(frame, frame_size(1)*frame_size(2), frame_size(3));
This can be useful when you want to vectorize your MATLAB code (to avoid a for loop that goes through every pixel). To get back the original image representation:
frame2 = reshape(allpixels, frame_size);
To get the value at pixel (1,1), simply write image(1,1).
