TensorFlow output size from stride and filter

While trying to use Tensorflow I encountered a little problem regarding the stride.
I have an image of size 67*67, and I want to apply a filter of size 7*7 with stride 3. The output layer should have an edge length of 20, calculated from:
n = (67 - 7) / 3 = 20
where n is the output layer edge length (in this case, 20). It is calculated in the following way:
If we only consider the first row (since the other rows are the same), then out of the 67 elements in the first row, the first 7 go to the first cell of the output layer. The filter then moves 3 elements to the right, so it covers elements 4 to 10, which corresponds to the 2nd element of the output layer, and so on. Each time we advance 3 elements, and the total number of advances (counting the first step, where the filter covers 7 elements) is n. Thus the equation I used.
However, the output layer I got from TensorFlow was 23, which is 67/3 rounded up to the next integer. I don't understand the reasoning behind this.
Can someone explain why TensorFlow does it this way?
Thanks!

Output size is computed in two ways depending on the padding you are using. If you are using 'SAME' padding, the output size is computed as:
out_height = ceil(float(in_height) / float(strides[1]))
out_width = ceil(float(in_width) / float(strides[2]))
Whereas with 'VALID' padding, the output is computed as:
out_height = ceil(float(in_height - filter_height + 1) / float(strides[1]))
out_width = ceil(float(in_width - filter_width + 1) / float(strides[2]))
The latter is what you were using to calculate your output, but from the result we can see that you must be using 'SAME' padding.
So in your case you get:
out = ceil(67 / 3) = ceil(22.33) = 23
If you were actually using 'VALID' padding, the output would be closer to your approximation.
You can read more about how TensorFlow calculates feature map sizes and padding here.
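The two padding formulas are easy to check with a short, plain-Python sketch (the helper names here are mine, not TensorFlow's):

```python
# Output-size formulas from the answer above, using the question's
# numbers: 67x67 input, 7x7 filter, stride 3.
import math

def out_size_same(in_size, stride):
    # 'SAME' padding: the input is padded so every position is covered
    return math.ceil(in_size / stride)

def out_size_valid(in_size, filter_size, stride):
    # 'VALID' padding: only positions where the filter fits entirely
    return math.ceil((in_size - filter_size + 1) / stride)

print(out_size_same(67, 3))       # 23, matching what TensorFlow returned
print(out_size_valid(67, 7, 3))   # 21
```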

Showing two images with the same colorbar in log

I have two sparse matrices "Matrix1" and "Matrix2" of the same size p x n.
By sparse matrix I mean that it contains a lot of exactly zero elements.
I want to show the two matrices under the same colormap with a single colorbar. Doing this in MATLAB is straightforward:
bottom = min(min(min(Matrix1)),min(min(Matrix2)));
top = max(max(max(Matrix1)),max(max(Matrix2)));
subplot(1,2,1)
imagesc(Matrix1)
colormap(gray)
caxis manual
caxis([bottom top]);
subplot(1,2,2)
imagesc(Matrix2)
colormap(gray)
caxis manual
caxis([bottom top]);
colorbar;
My problem:
In fact, when I show a matrix using imagesc(Matrix), it hides the noise (or background) that always appears when using imagesc(10*log10(Matrix)).
That is why I want to show the 10*log10 of the matrices. But in this case the minimum value will be -Inf, since the matrices are sparse, and caxis will give an error because bottom is equal to -Inf.
What do you suggest? How can I modify the above code?
Any help would be very much appreciated!
A very important point is that the minimum value in your matrices will always be 0. Leveraging this, a very simple way to address your problem is to add 1 inside the log operation, so that values that are 0 in the original matrix also map to 0 after the log. This avoids the -Inf error you're encountering. In fact, this is a very common way of visualizing the Fourier transform. Adding 1 inside the logarithm ensures that the output has no negative values, yet the rate of change remains intact, as the effect is simply a translation of the curve by 1 unit to the left.
Therefore, simply do imagesc(10*log10(1 + Matrix));, then the minimum is always bounded at 0 while the maximum is unbounded but subject to the largest value that is seen in Matrix.
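As a quick numeric sanity check of the +1 trick (in plain Python rather than MATLAB, and with made-up sample values), zeros stay at 0 instead of mapping to -Inf:

```python
# 10*log10(1 + x): zeros map to 0, so the colour limits stay finite.
import math

values = [0.0, 1.0, 9.0, 99.0]   # a sparse row: mostly zeros plus some structure
scaled = [10 * math.log10(1 + v) for v in values]
print(scaled)  # [0.0, ~3.01, 10.0, 20.0] - bounded below by 0, no -inf
```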

Algorithm - Grid Map find number of sub-blocks with specific property

I have an N x N grid map. Each cell may have the value '0' or '1'. I am trying to find the exact number of distinct rectangular sub-blocks of the map that contain a specific number of '1's, where this number can be between 1 and 6. I have thought of searching every possible rectangle, but this is very slow for a map of size 500x500, and the solution must run in ~1 sec on a common desktop computer. Can someone point me to a corresponding problem so I can look for a working algorithm, or better, suggest a working algorithm for this problem? Thank you all in advance!
I imagine that your search of all the rectangles is slow because you are actually counting the ones inside each possible rectangle. The solution is not to count within each rectangle separately, but rather to create a second NxN array that contains, for each (x,y), the count for the rectangle (0,0..x,y); call this OriginCount. Then, to calculate the count for any given rectangle, you will not have to go through the rectangle and count. You can simply use
Count(a,b..c,d) = OriginCount(c,d) + OriginCount(a-1,b-1) -
OriginCount(a-1,d) - OriginCount(c,b-1)
That turns counting the ones in any given rectangle from an O(N^2) problem into a constant-time one, and your code becomes on the order of thousands of times faster (for your 500x500 case).
Mind you, to set up the OriginCount array you can use the same concept: don't count the ones for each rectangle from 0,0 to x,y separately. Rather, use the formula
OriginCount(x,y) = OriginCount(x-1,y) + OriginCount(x,y-1) - OriginCount(x-1,y-1) +
GridMap(x,y) == 1 ? 1 : 0;
Mind you, you have to account for edge cases - where x=0 or y=0.
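Here is a minimal Python sketch of the scheme (the names follow the answer's OriginCount/Count; the 0-based indexing and the zero-padded extra row and column are my choices for handling the x=0 / y=0 edge cases):

```python
# Summed-area table: one O(N^2) pass to build, O(1) per rectangle query.
def build_origin_count(grid):
    n, m = len(grid), len(grid[0])
    # oc[x+1][y+1] holds the count of ones in the rectangle (0,0..x,y);
    # the extra zero row/column absorbs the x=0 / y=0 edge cases.
    oc = [[0] * (m + 1) for _ in range(n + 1)]
    for x in range(n):
        for y in range(m):
            oc[x + 1][y + 1] = (oc[x][y + 1] + oc[x + 1][y] - oc[x][y]
                                + (1 if grid[x][y] == 1 else 0))
    return oc

def count(oc, a, b, c, d):
    # number of ones in the rectangle (a,b..c,d), inclusive, in O(1)
    return oc[c + 1][d + 1] + oc[a][b] - oc[a][d + 1] - oc[c + 1][b]

grid = [[1, 0, 1],
        [0, 1, 0],
        [1, 1, 0]]
oc = build_origin_count(grid)
print(count(oc, 0, 0, 2, 2))  # 5 ones in the whole grid
print(count(oc, 1, 1, 2, 2))  # 2 ones in the bottom-right 2x2 block
```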

tensorflow: reduce_max function

Consider the following code:
import numpy as np
import tensorflow as tf

a = tf.convert_to_tensor(np.array([[1001, 1002], [3, 4]]), dtype=tf.float32)
b = tf.reduce_max(a, reduction_indices=[1], keep_dims=True)
with tf.Session():
    print(b.eval())
What exactly is the purpose of keep_dims here? I tested quite a bit, and saw that the above is equivalent to:
b = tf.reduce_max(a, reduction_indices=[1], keep_dims=False)
b = tf.expand_dims(b, 1)
I may be wrong, but my guess is that if keep_dims is False, we get a column vector (of length 2), and if keep_dims=True, we have a 2x1 matrix. But how are they different?
If you reduce over one or more indices (i.e. dimensions of the tensor), you effectively reduce the rank of the tensor (i.e. its number of dimensions or, in other words, the number of indices you need in order to access an element). By setting keep_dims=True, you are telling TensorFlow to keep the dimensions over which you reduce: they will then have size 1, but they are still there. While a column vector and an nx1 matrix are conceptually the same thing, in TensorFlow these are tensors of rank 1 (you need a single index to access an element) and rank 2 (you need two indices to access an element), respectively.
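The same distinction is easy to see with NumPy's analogous keepdims argument (a NumPy sketch, not TensorFlow itself):

```python
# keepdims=True keeps the reduced axis with size 1 (rank stays 2);
# keepdims=False drops it (rank falls to 1).
import numpy as np

a = np.array([[1001., 1002.], [3., 4.]])
b_keep = a.max(axis=1, keepdims=True)   # shape (2, 1): still rank 2
b_drop = a.max(axis=1, keepdims=False)  # shape (2,):   rank 1
print(b_keep.shape, b_drop.shape)
print(b_keep[0, 0], b_drop[0])  # two indices vs. one index for the same value
```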

Minimum length routing path - Dynamic Programing

There are 2*N pins on a line, N of them are input pins, N of them are output pins. Every input pin must be connected to a single output pin and vice-versa, like in this image:
The connection lines can be made only vertically and horizontally in the upper half plane, and the connection lines cannot overlap.
The question is what is the minimum length of all lines that can be achieved when connecting all pins.
In the example above, the length is 31.
A greedy approach using a stack, similar to the matching-parentheses problem, is not the optimal solution.
If you look at the outermost line, between pins 1 and 8, it splits the remaining pins into two groups: one between 2 and 7, and the other between 9 and 10.
Each of these groups has a constraint on the maximum line height it can use without extending past the outer line: 2 for the first group, and some default (say 5) for the second.
This gives a function lineLength(leftPin, rightPin, maxHeight) that gets its value by finding a height h and a pin i such that h <= maxHeight, pin[i] lies between leftPin+1 and rightPin, and pin[i] is of the opposite type to pin[leftPin].
The line length would then be rightPin - leftPin + 2*h + lineLength(leftPin+1, i-1, h-1) + lineLength(i+1, rightPin-1, 5).
There are O(n^3) possible argument combinations for this function, and calculating each value, with memoization, requires O(n^2) time because of the iteration over h and i. So the total time is O(n^5).
It should be possible to improve this with a binary search on the maximum height.
Divide and conquer can get this down to n^2.
In general, the first pin must be paired with something, and the only options for its pairing are positions where the enclosed string of pins has an equal number of input and output pins. So in the example, #1 can pair with #8 or #10.
For each of these pairings, you add the cost of that wire to the sub-problem inside the wire, and the subproblem outside the wire.
For example: if we're pairing 1 and 8, then
cost = recursiveCost(2,7) + recursiveCost(9,end) + wireCost(1,8)
You'll also need to track the maximum recursive depth of the inside function call, because you'll need that to calculate wireCost(a,b).
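The "equal number of enclosed inputs and outputs" condition can be sketched like this (the pin sequence here is a made-up example with 'I' = input and 'O' = output, not the one from the question's image):

```python
# Legal partners for pin i: opposite type, and an equal number of
# inputs and outputs strictly between the two pins (so the enclosed
# pins can be matched among themselves without crossing the outer wire).
def candidate_partners(pins, i):
    out = []
    for j in range(i + 1, len(pins)):
        if pins[j] != pins[i]:
            inner = pins[i + 1:j]
            if inner.count('I') == inner.count('O'):
                out.append(j)
    return out

print(candidate_partners("IOIO", 0))  # [1, 3]: pin 0 may pair with pin 1 or pin 3
```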

Tile placement algorithm

I have a field of tiles which is 36 x 36 inches wide and high.
I have blocks of sizes 8x8, 6x6, 6x8, and 4x8, which can be rotated 90 degrees to fit wherever possible.
My task is to make an application that calculates which blocks, and how many of each, should be chosen so that together they fill a given wall opening, in this example 36 x 36.
Note: the opening should be filled with as few tiles as possible, meaning bigger tiles have priority.
Which algorithm should I use for tile placement?
Another example: fields of 30 x 30 and 50 x 50 are drawn like this:
Since amit gave the general-case answer, I'll make this one specific. With those four block sizes, and assuming it's even possible (dimensions are even and >= 6, etc.), you can use a semi-greedy algorithm:
The first objective is to maximize the number of 8x8 blocks. To do that, you need to figure out how many size-6 blocks you need in each direction. For each dimension, just check for divisibility by 8. If it's not divisible, subtract 6. Repeat until divisible (it shouldn't take more than 3 tries).
However many times it took, that's how many 6x6 blocks you need in that dimension. Form a rectangle out of them and put it in one corner. Form another rectangle out of 8x8 blocks and put it in the opposite corner. The corners of these two rectangles should be touching.
So now you probably have some leftover space, in the form of two rectangles in the opposite corners. We know that one dimension of each is divisible by 8, and one is divisible by 6. The easy way out here would be to fill them with 6x8 blocks rotated appropriately, but that doesn't guarantee the maximum number of large (8x8) blocks. For example, with 50x50 you'd have two rectangles of 18x32 left, and you could fill each with twelve 6x8 tiles. You can't do better than 12 blocks each, but you can fit more 8x8 blocks in there.
If that's not a concern, then you're done (hooray). The bonus this way is that you never need to use the 4x8 blocks.
If you do want to maximize the 8x8 blocks, you'll have to take another step. We're concentrating on the dimension divisible by 6 here, because the 8 is easy. Every size we might need(8x8,6x8,4x8) stacks there perfectly.
For the other side, there are only 3 possible numbers that it could be: 6, 12, and 18. If it's anything else, the first step wasn't done right. Then take the following action:
For 6, add a row of 6x8 (no optimization)
For 12, add a row of 4x8 and a row of 8x8
For 18, add a row of 4x8, a row of 6x8, a row of 8x8
Done!
To see the difference, here we have two 50x50 grids:
Blue - 8x8
Red - 6x6
Green - 6x8
Gray - 4x8
This first example gives us 49 total blocks. The blue is a 32x32 area (16 blocks), red is 18x18 (9 blocks), and the rest is simply filled with 6x8's (24 blocks).
This example still gives 49 total, but there are more 8x8 blocks. Here we have 24 large blocks, rather than 16 in the last example. There are now also 4x8 blocks being used.
Here you go, in Python:
def aux(x):
    # in h we store the pre-calculated results for small dimensions
    h = {18: [6, 6, 6], 16: [8, 8], 14: [8, 6], 12: [6, 6],
         10: [6, 4], 8: [8], 6: [6], 4: [4]}
    res = []
    while x > 18:
        # as long as the remaining space is large, we put there tiles of size 8
        res.append(8)
        x -= 8
    if x not in h:
        print("no solution found")
        return []
    return res + h[x]

def tiles(x, y):
    ax = aux(x)  # split the x-dimension into tiles
    ay = aux(y)  # split the y-dimension into tiles
    res = [[(x, y) for x in ax] for y in ay]
    for u in res:
        print(u)
    return res

tiles(30, 30)
The basic idea is that you can solve x and y independently, and then combine the two solutions.
Edit: As Dukeling says, this code happily uses 4x6 and 4x4 blocks, contrary to the requirements. However, I think it does so only if there is no other way, so if the result contains such blocks then there is no solution without them. And if you have no Python readily available, you can play with this code here: http://ideone.com/HHB7F8 (just press "fork" right above the source code).
Assuming you are looking for a general-case answer, I am sorry to say that this problem is NP-complete. It is basically a 2D variation of the subset sum problem.
The subset sum problem: given a set S and a number x, find out if there is a subset of S that sums to x.
It is easy to see this via a reduction from subset sum: take a "field" of size 1*x and, for every s in S, a tile of size 1*s; a solution to one problem is then also a solution to the other.
Thus there is no known polynomial solution to this problem, and most believe one does not exist.
Note however, there is a pseudo-polynomial dynamic programming solution to subset sum that might be utilized here as well.
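For reference, here is a minimal sketch of that pseudo-polynomial subset-sum DP (adapted to tile lengths; the names are mine):

```python
# reachable[s] is True if some subset of the given lengths sums to s.
# Runs in O(len(lengths) * target), pseudo-polynomial in the target value.
def subset_sum(lengths, target):
    reachable = [False] * (target + 1)
    reachable[0] = True
    for length in lengths:
        # iterate backwards so each length is used at most once
        for s in range(target, length - 1, -1):
            if reachable[s - length]:
                reachable[s] = True
    return reachable[target]

print(subset_sum([8, 6, 4], 14))  # True: 8 + 6
print(subset_sum([8, 6, 4], 5))   # False: no subset sums to 5
```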
