Iterate over a tensor's rows and cols in TensorFlow - image

A part of my project is to use a thresholding kernel on an image.
The thresholding kernel could look like this:
[50 100]
[150 200]
I would like to go over each kernel-sized group of pixels (without overlap), and threshold them using my kernel.
For example, if I have this grayscale image:
[120 120 120 120]
[120 120 120 120]
[170 170 170 170]
[170 170 170 170]
Then after thresholding I should get this image:
[1 1 1 1]
[0 0 0 0]
[1 1 1 1]
[1 0 1 0]
I am using TensorFlow, and my network needs to handle batches of different shapes.
The input is:
data['input_tensor'] = tf.placeholder(tf.float32, shape=[None, None, None, 1], name='Input')
The thresholding kernel is a tf.Variable of size 5x5, and its values should be learned!
I can't find a way to make it happen.
I tried iterating over the input batch, but its size is unknown (it is only known when a session runs).
I don't want to duplicate the thresholding kernel, because then the network will try to learn all of its values (and it's too much for now).
Is there a way to do it without loops?
If not, how could I do it with loops?
Thanks.
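One loop-free way to express this is to tile the kernel up to the image size and compare element-wise; the tiled tensor is only a repeated view of the single variable, so gradients would accumulate back into the one kernel rather than into duplicated parameters. Below is a minimal NumPy sketch of the arithmetic, using the 2x2 example from the question (in TensorFlow, tf.tile with sizes taken from tf.shape(input) would play the role of np.tile; this is a sketch of the idea, not a drop-in graph):

```python
import numpy as np

# The 2x2 threshold kernel and 4x4 grayscale image from the question.
kernel = np.array([[50, 100],
                   [150, 200]], dtype=np.float32)
image = np.array([[120, 120, 120, 120],
                  [120, 120, 120, 120],
                  [170, 170, 170, 170],
                  [170, 170, 170, 170]], dtype=np.float32)

# Tile the kernel so it covers the whole image, then compare element-wise.
h, w = image.shape
kh, kw = kernel.shape
tiled = np.tile(kernel, (h // kh, w // kw))
result = (image > tiled).astype(np.float32)
```

Note that a hard comparison has zero gradient almost everywhere, so for the kernel values to actually be learned you would likely need a smooth surrogate during training, e.g. a sigmoid of (image - tiled).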

Related

Why are matrices used in computer graphics?

I understand how to apply matrices in computer graphics, but I don't quite understand why this is done. For example, in translation: to translate the vector (x, y, z) by (diffX, diffY, diffZ) you could simply add the two vectors together instead of creating a translation matrix:
[1 0 0 diffX]
[0 1 0 diffY]
[0 0 1 diffZ]
[0 0 0 1]
and then multiplying the vector by the matrix to get (x+diffX, y+diffY, z+diffZ). Surely applying matrices like this would be wasteful of performance and memory?
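The usual answer is composition: homogeneous coordinates turn translation into a matrix just like rotation and scaling, so an arbitrary chain of transforms collapses into a single 4x4 matrix that is applied uniformly to every vertex. A small NumPy illustration (the specific transform values are my own example):

```python
import numpy as np

# Translation by (2, 3, 4) as a 4x4 homogeneous matrix.
T = np.eye(4)
T[:3, 3] = [2, 3, 4]

# 90-degree rotation about the z-axis, also as a 4x4 matrix.
c, s = 0.0, 1.0  # cos(90°), sin(90°)
R = np.array([[c, -s, 0, 0],
              [s,  c, 0, 0],
              [0,  0, 1, 0],
              [0,  0, 0, 1]])

# Compose once, then apply the combined transform to any number of points.
M = T @ R
p = np.array([1.0, 0.0, 0.0, 1.0])  # homogeneous point (w = 1)
q = M @ p                            # rotate, then translate
```

With plain vector addition, translation would need special-case code whenever it is mixed with rotations; here T @ R is precomputed once and reused for every vertex with a single multiply.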

PostScript error with imagemask and raw data

In Adobe's PLRM I found the following example using the imagemask operator.
It works fine when run with Ghostscript.
54 112 translate % Locate lower-left corner of square
120 120 scale % Scale 1 unit to 120 points
0 setgray % Set current color to black
24 23 % Specify dimensions of source mask
true % Set polarity to paint the 1 bits
[24 0 0 -23 0 23] % Map unit square to mask
{< 003B00 002700 002480 0E4940
114920 14B220 3CB650 75FE88
17FF8C 175F14 1C07E2 3803C4
703182 F8EDFC B2BBC2 BB6F84
31BFC2 18EA3C 0E3E00 07FC00
03F800 1E1800 1FF800 >}
imagemask
showpage
As an exercise I tried to rewrite the above example using an ImageType-1 dictionary and raw data, and finally came up with this code:
54 112 translate
120 120 scale
0 setgray
<<
/ImageType 1
/Width 24
/Heigth 23
/BitsPerComponent 1
/Decode [1 0]
/ImageMatrix [24 0 0 -23 0 23]
/DataSource currentfile /ASCIIHexDecode filter
>>
imagemask
003B00 002700 002480 0E4940
114920 14B220 3CB650 75FE88
17FF8C 175F14 1C07E2 3803C4
703182 F8EDFC B2BBC2 BB6F84
31BFC2 18EA3C 0E3E00 07FC00
03F800 1E1800 1FF800>
showpage
However, when running this with Ghostscript I get the following error.
Error: /undefined in --imagemask--
I'm still scratching my head to find the bug, but in vain.
How can it be imagemask is undefined? Or did I miss something obvious?
I don't know if this is exactly the code you've written, but there's a typo:
/Heigth 23
which should obviously be:
/Height 23
If I correct that, the file runs to completion, and draws the turkey.

How to create a uniformly random matrix in Julia?

I want to get a matrix with uniformly random values sampled from [-1, 2]
x= rand([-1,2],(3,3))
3x3 Array{Int64,2}:
-1 -1 -1
2 -1 -1
-1 -1 -1
but it only samples from the two values -1 and 2, and I'm looking for continuous values, for instance -0.9, 0.75, -0.09, 1.80.
How can I do that?
Note: I am assuming here that you're looking for uniform random variables.
You can also use the Distributions package:
## Pkg.add("Distributions") # If you don't already have it installed.
using Distributions
rand(Uniform(-1,2), 3,3)
I do quite like isebarn's solution though, as it gets you thinking about the actual properties of the underlying probability distributions.
For a random number in the range [a, b]:
rand() * (b - a) + a
and it works for a matrix as well:
rand(3,3) * (2 - (-1)) - 1
3x3 Array{Float64,2}:
1.85611 0.456955 -0.0219579
1.91196 -0.0352324 0.0296134
1.63924 -0.567682 0.45602
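The rand() * (b - a) + a scaling trick is language-agnostic; for comparison, a NumPy sketch of the same idea:

```python
import numpy as np

# Uniform values in [a, b): scale the unit interval, then shift.
rng = np.random.default_rng(0)  # seeded for reproducibility
a, b = -1.0, 2.0
x = rng.random((3, 3)) * (b - a) + a
```

Every entry lies in [-1, 2), matching the Julia expression rand(3,3) * (2 - (-1)) - 1.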
You can also use a FloatRange{Float64} with the desired step, though note that this samples discrete values at that step rather than a continuous distribution:
julia> rand(-1.0:0.01:2.0, 3, 3)
3x3 Array{Float64,2}:
0.79 1.73 0.95
0.73 1.4 -0.46
1.42 1.68 -0.55

Sampling a Greyscale image into 8 levels

What I am trying to do:-
Using MATLAB, I am trying to read a grayscale image (with pixel values in the range 0-255, i.e. an 8-bit image) and quantize it to a 3-bit image, i.e. sample the range into 8 levels. For example, if a pixel value is 25, it falls in the range 0-31 and is assigned level 0; values 32-63 map to level 1, and so on, until finally the range 224-255 maps to level 7.
After that I am counting the total no of pixels in different levels.
Code:-
img=imread('Cameraman.bmp');
r=size(img,1);
c=size(img,2);
pixel_count=zeros(9,1);
for i=1:r
for j=1:c
if fix(img(i,j)/31)==8
img(i,j)
end
img(i,j)=fix(img(i,j)/33);
pixel_count(img(i,j)+1)=pixel_count(img(i,j)+1)+1;
end
end
pixel_count
My Problem:-
Even though the range of each pixel is 0-255 and I am dividing it into 8 levels, I am getting a total of 9 levels.
For debugging I added the if statement to the code, and my output is:
ans = 248
ans = 250
ans = 249
ans = 249
ans = 235
ans = 249
ans = 249
ans = 235
...and more
pixel_count =
11314
3741
2061
5284
12629
25590
4439
437
41
As you can see, for some values such as 249 and 235 I am getting the extra 9th level.
What is the problem here. Please help.
Thank You.
You aren't dividing by the right value. You need to divide by 32, then take the floor / fix. Between 0-31, if you divide by 32 and take the floor / fix, you get 0; between 32-63 you get 1, and so on up to 224-255, which gives you 7.
Also, your for loop is incorrect. You are mistakenly replacing the pixels of the input image with their bin locations. I would also change the precision to double: in my experiments, using fix combined with a uint8 image gives exactly that spurious 9th bin index you're talking about.
Take a look at some sample results from my REPL:
>> fix(240/32) + 1
ans =
8
>> fix(uint8(240)/32) + 1
ans =
9
>> fix(uint8(255)/32) + 1
ans =
9
>> fix(255/32) + 1
ans =
8
Therefore, it's a problem with the image type. Because the image is uint8, the division is done in integer arithmetic, which rounds to the nearest integer: 240 / 32 = 7.5, which rounds up to 8, and adding 1 gives 9. So any value of 240 or above ends up in a 9th bin.
So, simply change the division to be 32, not 33 or 31 and fix what I said above:
img=imread('Cameraman.bmp');
img = double(img); %// Change
r=size(img,1);
c=size(img,2);
pixel_count=zeros(8,1); %// Change
for i=1:r
for j=1:c
pix = fix(img(i,j)/32); %// Change here
pixel_count(pix+1)=pixel_count(pix+1) + 1; %// Change
end
end
pixel_count
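The same binning can be sketched in NumPy, with a synthetic random image standing in for Cameraman.bmp (which isn't available here); the cast to float before dividing mirrors the double(img) fix above:

```python
import numpy as np

# Synthetic 8-bit grayscale image standing in for Cameraman.bmp.
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)

# Cast to float, divide by 32, and floor: 0-255 maps onto levels 0-7.
# (MATLAB's uint8 division rounds to nearest, which is what produced
# the spurious 9th bin; flooring a float quotient can never exceed 7 here.)
levels = np.floor(img.astype(np.float64) / 32).astype(int)
pixel_count = np.bincount(levels.ravel(), minlength=8)
```

pixel_count has exactly 8 entries, one per level, and sums to the number of pixels.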
As a minor note, to check whether you're right, use histc:
pixel_count = histc(fix(double(img(:))/32) + 1, 1:8);
If your code is right, its output should match what histc gives. Using the cameraman.tif image that's built into the Image Processing Toolbox, let's compare the outputs:
>> pixel_count
pixel_count =
13532
2500
2104
8341
15333
22553
817
356
>> pixel_count2 = histc(fix(double(img(:))/32) + 1, 1:8)
pixel_count2 =
13532
2500
2104
8341
15333
22553
817
356
Looks good to me!

How can I find the most dense regions in an image?

Consider a black and white image like this
What I am trying to do is to find the regions where the white points are most dense. In this case there are 20-21 such dense regions (i.e. each cluster of points makes a dense region).
Can anyone give me any hint on how this can be achieved?
If you have access to the Image Processing Toolbox, you can take advantage of a number of filtering and morphological operations it contains. Here's one way you could approach your problem, using the functions imfilter, imclose, and imregionalmax:
% Load and plot the image data:
imageData = imread('lattice_pic.jpg'); % Load the lattice image
subplot(221);
imshow(imageData);
title('Original image');
% Gaussian-filter the image:
gaussFilter = fspecial('gaussian', [31 31], 9); % Create the filter
filteredData = imfilter(imageData, gaussFilter);
subplot(222);
imshow(filteredData);
title('Gaussian-filtered image');
% Perform a morphological close operation:
closeElement = strel('disk', 31); % Create a disk-shaped structuring element
closedData = imclose(filteredData, closeElement);
subplot(223);
imshow(closedData);
title('Closed image');
% Find the regions where local maxima occur:
maxImage = imregionalmax(closedData);
maxImage = imdilate(maxImage, strel('disk', 5)); % Dilate the points to see
% them better on the plot
subplot(224);
imshow(maxImage);
title('Maxima locations');
And here's the image the above code creates:
To get things to look good I just kept trying a few different combinations of parameters for the Gaussian filter (created using fspecial) and the structuring element (created using strel). That little bit of trial and error gave a very nice result.
NOTE: The image returned from imregionalmax doesn't always have just single pixels set to 1 (to indicate a maxima). The output image often contains clusters of pixels because neighboring pixels in the input image can have equal values, and are therefore both counted as maxima. In the code above I also dilated these points with imdilate just to make them easier to see in the image, which makes an even bigger cluster of pixels centered on the maxima. If you want to reduce the cluster of pixels to a single pixel, you should remove the dilation step and modify the image in other ways (add noise to the result or filter it, then find the new maxima, etc.).
Sliding Window (simple but slow)
You could create a sliding window (e.g. 10x10 pixels) which iterates over the image; for each position you count the number of white pixels in the 10x10 field, and store the positions with the highest counts.
This whole process is O(n*m), where n is the number of pixels in the image and m the size of the sliding window.
In other words, you convolve the image with a mean filter (here the box filter), and then use the extrema.
Sliding Window (fast)
First, calculate a summed area table, which can be done very efficiently in a single pass:
Create a 2D array sat with the same size as the original image img.
Iterate over each index x and y (treating out-of-range entries as 0) and calculate
sat[x, y] = img[x, y] + sat[x-1, y] + sat[x, y-1] - sat[x-1, y-1]
For example, given an image where 0 is dark and 1 is white, this is the result:
img             sat
0 0 0 1 0 0     0 0 0 1 1 1
0 0 0 1 0 0     0 0 0 2 2 2
0 1 1 1 0 0     0 1 2 5 5 5
0 1 0 0 0 0     0 2 3 6 6 6
0 0 0 0 0 0     0 2 3 6 6 6
Now iterate over the summed area table's indices with a sliding window, and calculate the number of white pixels in it by using the corners A, B, C, D of the sliding window:
img             sat             window
0 0 0 1 0 0     0 0 0 1 1 1     0 A-----B 1
0 0 0 1 0 0     0 0 0 2 2 2     0 | 0 2 | 2
0 1 1 1 0 0     0 1 2 5 5 5     0 | 2 5 | 5
0 1 0 0 0 0     0 2 3 6 6 6     0 | 3 6 | 6
0 0 0 0 0 0     0 2 3 6 6 6     0 D-----C 6
Calculate
density(x', y') = sat(A) + sat(C) - sat(B) - sat(D)
Which in the above example is
density(1, 0) = 0 + 6 - 1 - 2 = 3
This process requires a temporary image, but it is just O(n), so speed is independent of the sliding window's size.
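The summed-area-table steps above can be sketched in NumPy (the window_sum helper name is my own); it reproduces the worked example's density of 3:

```python
import numpy as np

# The example image from the answer: 0 is dark, 1 is white.
img = np.array([[0, 0, 0, 1, 0, 0],
                [0, 0, 0, 1, 0, 0],
                [0, 1, 1, 1, 0, 0],
                [0, 1, 0, 0, 0, 0],
                [0, 0, 0, 0, 0, 0]])

# Summed-area table: one cumulative sum per axis.
sat = img.cumsum(axis=0).cumsum(axis=1)

def window_sum(sat, top, left, bottom, right):
    """White-pixel count in img[top:bottom+1, left:right+1] from four corners."""
    total = sat[bottom, right]
    if top > 0:
        total -= sat[top - 1, right]
    if left > 0:
        total -= sat[bottom, left - 1]
    if top > 0 and left > 0:
        total += sat[top - 1, left - 1]
    return total

# Window over img rows 1-4, cols 2-3, as in the worked example:
count = window_sum(sat, top=1, left=2, bottom=4, right=3)  # -> 3
```

Building sat is O(n); after that each window query is constant time regardless of the window's size.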
If you have the Image Processing Toolbox, blur the image with a Gaussian filter, then find the peaks/extrema.
Vary the size of the Gaussian filter to get the number of 'dense' regions you want.
Maybe a naive approach:
You define a square of n*n which is the maximum size of the region in which you measure the density. For each point in the image you treat the point as the center of the square and count the number of black (b) and white (w) points around it. Using the difference w - b you can determine which square(s) are the most white.
The most dense regions must be determined in a fuzzy way. If one region has 600 white points and another 599 then, for the human eye, they are the same density. 600 is 100% dense while 599 is 99% dense and 1% non-dense. Use an epsilon for this.
n can be predefined or based on some function (ie. percent of image size).
You could also use a circle/ellipse instead of square/rectangle. Choose what fits your needs best
