Does locally specifying an argument in a scope in TensorFlow overwrite all following arguments of the same type?

I am currently trying to understand the inception-v3 architecture and was taking a closer look at the definition of the model's layers:
with scopes.arg_scope([ops.conv2d, ops.max_pool, ops.avg_pool], stride=1, padding='VALID'):
    # 299 x 299 x 3
    end_points['conv0'] = ops.conv2d(inputs, 32, [3, 3], stride=2, scope='conv0')
    # 149 x 149 x 32
    end_points['conv1'] = ops.conv2d(end_points['conv0'], 32, [3, 3], scope='conv1')
    # 147 x 147 x 32
    end_points['conv2'] = ops.conv2d(end_points['conv1'], 64, [3, 3], padding='SAME', scope='conv2')
    # 147 x 147 x 64
    end_points['pool1'] = ops.max_pool(end_points['conv2'], [3, 3], stride=2, scope='pool1')
    # 73 x 73 x 64
    end_points['conv3'] = ops.conv2d(end_points['pool1'], 80, [1, 1], scope='conv3')
    # 73 x 73 x 80.
    end_points['conv4'] = ops.conv2d(end_points['conv3'], 192, [3, 3], scope='conv4')
    # 71 x 71 x 192.
    end_points['pool2'] = ops.max_pool(end_points['conv4'], [3, 3], stride=2, scope='pool2')
    # 35 x 35 x 192.
    net = end_points['pool2']
Checking the dimensions of each layer, I first had to look at the two padding styles, VALID and SAME. VALID adds no padding, so filter positions that would run past the edge are discarded, while SAME pads the input (as evenly as possible on both sides) so that the convolution also covers the edges.
This holds, for example, for the first layer: with a [3, 3] filter and a stride of 2, 299x299 pixels become 149x149, not 150x150, because the padding is VALID (edges are discarded). Convolving this layer again with the same filter size but a stride of 1 gives 147x147, because the edges are discarded again. That layer is then convolved once more, but this time with padding set to SAME, which keeps the dimension at 147x147.
Now comes the spot that confuses me:
Assuming that SAME padding applied only to the conv2 layer and the scope default is still VALID, the dimension for pool1 is correctly shown as 73x73, because the edge is discarded. For the next convolutional layer, conv3, I would expect 71x71 with VALID padding active. However, the output of conv3 remains 73x73, which suggests that SAME padding is used. But in conv4 the padding seems to be VALID again, reducing the dimension to 71x71, which confuses me completely.
In the README for slim's arg_scope on GitHub I found that setting an argument locally overrides the value given in the scope:
with slim.arg_scope([slim.ops.conv2d], padding='SAME', stddev=0.01, weight_decay=0.0005):
    net = slim.ops.conv2d(inputs, 64, [11, 11], scope='conv1')
    net = slim.ops.conv2d(net, 128, [11, 11], padding='VALID', scope='conv2')
    net = slim.ops.conv2d(net, 256, [11, 11], scope='conv3')
As the example illustrates, the use of arg_scope makes the code
cleaner, simpler and easier to maintain. Notice that while argument
values are specified in the arg_scope, they can be overwritten locally.
In particular, while the padding argument has been set to 'SAME', the
second convolution overrides it with the value of 'VALID'.
However, this would mean that conv4 should also have a dimension of 73x73, because the padding would be SAME and the edges would be preserved, and the final pooling layer pool2 would then even be 37x37.
What am I missing? Where is my mistake?
Thank you for helping me. I hope I have described the problem clearly.

I had overlooked that the filter size of the conv3 layer is actually [1, 1], so it does not reduce the dimensions. This has nothing to do with arg_scope; every layer behaves exactly as it should.
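The dimension bookkeeping can be checked with a small helper. This is a sketch that assumes the usual TensorFlow output-size rules, ceil(n / stride) for SAME and ceil((n - kernel + 1) / stride) for VALID; the function name `output_size` and the `layers` table are made up for illustration, with kernel sizes, strides and padding read off the listing in the question.

```python
import math


def output_size(size, kernel, stride=1, padding="VALID"):
    """Output size along one spatial dimension of a conv/pool layer."""
    if padding == "SAME":
        return math.ceil(size / stride)
    if padding == "VALID":
        return math.ceil((size - kernel + 1) / stride)
    raise ValueError(f"unknown padding: {padding!r}")


# (name, kernel, stride, padding); the scope default is stride=1, VALID.
layers = [
    ("conv0", 3, 2, "VALID"),
    ("conv1", 3, 1, "VALID"),
    ("conv2", 3, 1, "SAME"),   # local override
    ("pool1", 3, 2, "VALID"),
    ("conv3", 1, 1, "VALID"),  # 1x1 filter: nothing to discard
    ("conv4", 3, 1, "VALID"),
    ("pool2", 3, 2, "VALID"),
]

size = 299
sizes = {}
for name, kernel, stride, padding in layers:
    size = output_size(size, kernel, stride, padding)
    sizes[name] = size

print(sizes)
```

With VALID padding throughout (except conv2), the sizes come out as 149, 147, 147, 73, 73, 71, 35, which agrees with the comments in the model definition.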

Related

Create a matrix A = [4 ... 128] in Octave language

Can you help me create a matrix in Octave in a shorter way?
I need to have (matrix) A = [4, 8, 16, 32, 64, 128];
I would like to use something like A = [4: *2 : 128] (meaning start = 4, step = *2, finish = 128), but this doesn't work in Octave.
The same needs to be done for matrix B = [1 4 9 16 25 36], where the step is 3 at the beginning and increases by 2 on each following step.
Any ideas?
With the colon operator you can only do steps of the same size. But notice that your matrix
A = [4, 8, 16, 32, 64, 128];
has the structure [2^2, 2^3, 2^4, ..., 2^7], so you can make use of broadcasting and define it as
A = 2.^[2,3,4,5,6,7];
or simply
A = 2.^(2:7);
You can use a loop for that task. You just need to write a consistent rule in the statements of your loop. One possible way to do that is the following:
start=1;
finish=36;
matrix(1)=start; i=2; last_term=start; %initializations needed before the loop starts
while last_term < finish
  matrix(i)=matrix(i-1)+(1+2*(i-1)); %here you define your rule
  last_term=matrix(i);
  i=i+1;
endwhile
matrix %your output is printed in the console
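For comparison, both sequences also have closed forms, shown here as a Python sketch (the variable names are illustrative). A is the powers of two from 2^2 to 2^7, and B is the squares of 1 to 6, which is the same thing as a running sum of the odd steps 1, 3, 5, ...; in Octave the analogous one-liner for B would be (1:6).^2.

```python
from itertools import accumulate

# A: powers of two, 2^2 .. 2^7
A = [2 ** k for k in range(2, 8)]

# B: perfect squares, 1^2 .. 6^2
B = [k * k for k in range(1, 7)]

# Same B, built from steps that grow by 2 (1, 3, 5, 7, 9, 11)
B_from_steps = list(accumulate(range(1, 12, 2)))

print(A)
print(B)
print(B_from_steps)
```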

Behaviour of max pooling is confusing in TensorFlow

I am trying to reduce the resolution of an image to speed up training, so I used the tf.nn.max_pool method on my raw image. I expected the resulting image to be a smaller, blurred version of the original, but it is not.
My raw image has shape [320, 240, 3], and it looks like:
And after max_pooling, with ksize=[1,2,2,1] and strides=[1,2,2,1] it becomes
produced by the following code:
# `img` is a numpy.array with shape [320, 240, 3].
# tf.nn.max_pool only accepts tensors of shape
# [batch_size, height, width, channel], so I need to reshape
# the image to add a dummy batch dimension.
img_tensor = tf.placeholder(tf.float32, shape=[1,320,240,3])
pooled = tf.nn.max_pool(img_tensor, ksize=[1,2,2,1], strides=[1,2,2,1],padding='VALID')
pooled_img = pooled.eval(feed_dict={img_tensor: img.reshape([1,320,240,3])})
plt.imshow(np.squeeze(pooled_img, axis=0))
The pooled image has shape [160, 120, 3], which is expected. It is just the transformation behaviour that really confuses me. It shouldn't have that "repeated shifting" behaviour, since no overlapping pixels are involved in the computation.
Many thanks in advance.
I think the problem is how your image has been reshaped. The image actually has the shape [240, 320, 3] (height first, then width). reshape does not swap axes; it only reinterprets the same flat sequence of values, so a wrong target shape scrambles the rows.
So try [1, 240, 320, 3] instead of [1, 320, 240, 3]. It should work.
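Why a swapped height and width produces a shifted picture can be seen on a tiny example. The sketch below uses plain Python lists instead of NumPy, and `reshape_2d` is a hypothetical helper that mimics row-major reshaping: it cuts the same flat sequence into rows of a given length.

```python
def reshape_2d(flat, rows, cols):
    """Row-major reshape of a flat list into rows x cols."""
    assert len(flat) == rows * cols
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]


height, width = 2, 3
# Pixel value encodes its position: 10 * row + column.
image = [[10 * r + c for c in range(width)] for r in range(height)]
flat = [value for row in image for value in row]

correct = reshape_2d(flat, height, width)  # same layout as the image
wrong = reshape_2d(flat, width, height)    # height and width swapped

print(correct)
print(wrong)
```

In the `wrong` result every row starts at a different offset within the original rows, which is the small-scale version of the repeated shifting seen in the pooled image.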

How to visualize learned filters in TensorFlow

In the Caffe framework it is possible to watch the learned filters during CNN training, as well as the result of convolving them with input images. Is it possible to do the same with TensorFlow?
A Caffe example can be viewed in this link:
http://nbviewer.jupyter.org/github/BVLC/caffe/blob/master/examples/00-classification.ipynb
Grateful for your help!
To see just a few conv1 filters in TensorBoard, you can use this code (it works for cifar10):
# this should be a part of the inference(images) function in cifar10.py file
# conv1
with tf.variable_scope('conv1') as scope:
    kernel = _variable_with_weight_decay('weights', shape=[5, 5, 3, 64],
                                         stddev=1e-4, wd=0.0)
    conv = tf.nn.conv2d(images, kernel, [1, 1, 1, 1], padding='SAME')
    biases = _variable_on_cpu('biases', [64], tf.constant_initializer(0.0))
    bias = tf.nn.bias_add(conv, biases)
    conv1 = tf.nn.relu(bias, name=scope.name)
    _activation_summary(conv1)
    with tf.variable_scope('visualization'):
        # scale weights to [0 1], type is still float
        x_min = tf.reduce_min(kernel)
        x_max = tf.reduce_max(kernel)
        kernel_0_to_1 = (kernel - x_min) / (x_max - x_min)
        # to tf.image_summary format [batch_size, height, width, channels]
        kernel_transposed = tf.transpose(kernel_0_to_1, [3, 0, 1, 2])
        # this will display random 3 filters from the 64 in conv1
        tf.image_summary('conv1/filters', kernel_transposed, max_images=3)
I also wrote a simple gist to display all 64 conv1 filters in a grid.
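The scaling step in the visualization scope is ordinary min-max normalization. A minimal pure-Python sketch of the same arithmetic is below; the function name `scale_to_unit_range` is made up, and the guard for a constant input is an addition that the snippet above does not have.

```python
def scale_to_unit_range(values):
    """Map values linearly so that min -> 0.0 and max -> 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: treat a constant input as all zeros to avoid 0/0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


weights = [-0.5, 0.0, 0.25, 1.5]
scaled = scale_to_unit_range(weights)
print(scaled)
```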

Prolog "winning position"

This is a game for 2 people to remove numbers from a list. A player will lose if the player picks up the last number. Given the 2 rules of removing numbers in a list,
Prolog search for possible combination for subtracting 2 elements from a list
Prolog possible removal of elements in a list
There is a question,
write a predicate win(S), that succeed if S is a winning position for
the player whose turn it is to play and fails otherwise. Besides
giving the correct answers, your code for this should avoid evaluating
the same position more than once. For example, there are only 960
positions that can be reached from [30,30], but many billions of games
that could be played starting from there...
I am really confused about how [30,30] can reach 960 positions. According to the 2 rules, if I subtract N from one element only, I can only reach 60 states:
Y = [29, 30] or Y = [30, 29]
Y = [28, 30] or Y = [30, 28]
Y = [27, 30] or ...
...
Y = [1, 30]
Y = [30]
Or, if I subtract N from 2 elements, I can only reach 30 states:
Y = [29, 29]
Y = [28, 28]
...
Y = [1, 1]
Y = []
I am really confused about how 960 positions can be reached. Also, win(S) should evaluate whether S is a winning position. Does that mean that the current S can lead directly to [1] in just one move, or that the current S can lead to [1] through multiple moves?
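The requirement to avoid evaluating the same position more than once is usually met with memoization. The Python sketch below illustrates the idea under assumed rules, taken from the moves listed in the question: a move subtracts some N >= 1 from one element, or the same N from two elements; elements that reach 0 are removed; whoever empties the list loses. Treating a position as a sorted tuple (so [29, 30] and [30, 29] count as the same) is a further assumption, and the names `normalize`, `moves` and `win` are illustrative.

```python
from functools import lru_cache


def normalize(values):
    """Drop zeros and sort, so equivalent positions share one key."""
    return tuple(sorted(v for v in values if v > 0))


def moves(position):
    """All positions reachable in one move under the assumed rules."""
    items = list(position)
    reachable = set()
    # Assumed rule 1: subtract n >= 1 from a single element.
    for i, value in enumerate(items):
        for n in range(1, value + 1):
            rest = items[:i] + [value - n] + items[i + 1:]
            reachable.add(normalize(rest))
    # Assumed rule 2: subtract the same n >= 1 from two elements.
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            for n in range(1, min(items[i], items[j]) + 1):
                rest = list(items)
                rest[i] -= n
                rest[j] -= n
                reachable.add(normalize(rest))
    return reachable


@lru_cache(maxsize=None)
def win(position):
    """True if the player to move can force a win from `position`."""
    # Moving to the empty position means taking the last number (a loss),
    # so only non-empty successors that are lost for the opponent count.
    return any(nxt and not win(nxt) for nxt in moves(position))


print(win(normalize([1])))
print(win(normalize([30, 30])))
```

Under these assumptions a winning position is one from which some move leaves the opponent in a losing position, so the win may be several moves away; [1] is simply the smallest losing position, because the only move from it takes the last number.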

Generating a subset uniformly at random?

Here is an implementation of a combinatorial algorithm to choose a subset of an n-set, uniformly at random. Since there are 2^n subsets of an n-set, each subset should be selected with probability 2^-n.
I believe I have implemented the algorithm correctly (please let me know if there is a bug somewhere). When I run the program with Java 7 on my Linux box, however, I get results that I cannot quite explain. The mystery seems to be around the random number generator. I understand that one needs to run the program a 'large number' of times to 'see that the distribution reaches uniformity'. The question, however, is how large is large. A few runs I did suggest that unless the experiment is repeated at least 1 billion times, the distribution of chosen subsets is quite nonuniform.
The algorithm is based on Prof. Herbert Wilf's combinatorial algorithms book where the implementation (slightly different) is done in Fortran and the distribution is more-or-less uniform even when the program is run only 1280 times.
Here are a few sample runs (there's some variation among the run when n is constant) to get a random subset of a 4-set:
Number of times experiment is done n = 1280
Number of times experiment is done n = 12,800
Number of times experiment is done n = 128,000 (still 8 subsets only!)
Number of times experiment is done n = 1,280,000
Number of times experiment is done n = 12,800,000 (now it starts making sense)
Number of times experiment is done n = 1,280,000,000 (this is okay!)
Would you expect such performance? How could Prof. Wilf achieve similar results with only 1280 iterations of an equivalent program?
Every time you call ranInt(), you create a new Random seeded with the current time, which resets the RNG. Calls that fall within the same millisecond get the same seed and therefore return the same values, so in the long run these numbers are no longer random.
I moved Random r = new Random(System.currentTimeMillis()); to the top of the class and made it static:
class RandomSubsetSimulation {
    static Random r = new Random(System.currentTimeMillis());
    public static void main(String[] args) { ...
I am able to get the following results with 8-set
Total: 1000, number of subsets with a frequency > 0: 256
Total # of subsets possible: 256
Full results with 4-set
Frequencies of chosen subsets ....
[3] : 76, 4, 5.94
[4] : 72, 8, 5.63
[] : 83, -3, 6.48
[1] : 90, -10, 7.03
[2] : 80, 0, 6.25
[3, 4] : 86, -6, 6.72
[2, 3] : 88, -8, 6.88
[2, 4] : 55, 25, 4.30
[1, 2, 3] : 99, -19, 7.73
[1, 2, 4] : 75, 5, 5.86
[2, 3, 4] : 76, 4, 5.94
[1, 3] : 85, -5, 6.64
[1, 2] : 94, -14, 7.34
[1, 4] : 72, 8, 5.63
[1, 2, 3, 4] : 71, 9, 5.55
[1, 3, 4] : 78, 2, 6.09
Total: 1280, number of subsets with a frequency > 0: 16
Total # of subsets possible: 16
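The effect of creating the generator once can be reproduced in a few lines. The Python sketch below is not the Java program from the question and not necessarily the algorithm from Wilf's book: it assumes the simplest uniform scheme (include each element independently with probability 1/2), and the names `random_subset` and `simulate` are illustrative.

```python
import random
from collections import Counter


def random_subset(n, rng):
    """Pick a subset of {1..n}; each element is kept with probability 1/2."""
    return tuple(i for i in range(1, n + 1) if rng.random() < 0.5)


def simulate(n, trials, seed=0):
    """Count how often each subset is drawn, using ONE generator."""
    rng = random.Random(seed)  # created once, never re-seeded per draw
    return Counter(random_subset(n, rng) for _ in range(trials))


counts = simulate(4, 1280)
for subset, frequency in sorted(counts.items()):
    print(list(subset), frequency)
print("distinct subsets seen:", len(counts))
```

With a single generator, 1280 draws are far more than enough for all 16 subsets of a 4-set to appear, each with a frequency in the neighbourhood of the expected 80.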
