I am currently trying to understand the inception-v3 architecture and was taking a closer look at the definition of the model's layers:
with scopes.arg_scope([ops.conv2d, ops.max_pool, ops.avg_pool],
                      stride=1, padding='VALID'):
    # 299 x 299 x 3
    end_points['conv0'] = ops.conv2d(inputs, 32, [3, 3], stride=2, scope='conv0')
    # 149 x 149 x 32
    end_points['conv1'] = ops.conv2d(end_points['conv0'], 32, [3, 3], scope='conv1')
    # 147 x 147 x 32
    end_points['conv2'] = ops.conv2d(end_points['conv1'], 64, [3, 3], padding='SAME', scope='conv2')
    # 147 x 147 x 64
    end_points['pool1'] = ops.max_pool(end_points['conv2'], [3, 3], stride=2, scope='pool1')
    # 73 x 73 x 64
    end_points['conv3'] = ops.conv2d(end_points['pool1'], 80, [1, 1], scope='conv3')
    # 73 x 73 x 80.
    end_points['conv4'] = ops.conv2d(end_points['conv3'], 192, [3, 3], scope='conv4')
    # 71 x 71 x 192.
    end_points['pool2'] = ops.max_pool(end_points['conv4'], [3, 3], stride=2, scope='pool2')
    # 35 x 35 x 192.
    net = end_points['pool2']
Checking the dimensions of each layer, I first had to look at the two padding styles, VALID and SAME. VALID adds no padding, so filter positions that would run past the edge are dropped, while SAME zero-pads the input (as evenly as possible on both sides), so the convolution also covers the edges.
This holds for example for the first layer with 299x299 pixels to 149x149 with a stride of 2, so we only consider all odd pixels [Filter size: [3,3]] and end up with a dimension of 149x149, not 150x150 because padding is VALID (edges are discarded). Convolving this layer again, with the same filter size but now a stride of 1 we get 147x147 due to the edges "suffering" from being discarded. This layer then is again convolved but now with the twist, that padding is set to SAME which results in the same dimension of 147x147 as the layer before.
Now comes the spot that confuses me:
Assuming that SAME padding applied only to the conv2 layer and that the default is still VALID, the dimension for pool1 is correctly shown as 73x73 because the edge is discarded. Moving on to the next convolutional layer, conv3, I would expect 71x71 with VALID padding active. However, the output of conv3 remains at 73x73, which suggests that SAME padding is used. But in conv4 the padding seems to be VALID again, reducing the dimension to 71x71, which confuses me completely.
In the readme on github of slim's arg_scope I found, that setting one of the arguments locally overrides the global argument given:
with slim.arg_scope([slim.ops.conv2d], padding='SAME', stddev=0.01, weight_decay=0.0005):
    net = slim.ops.conv2d(inputs, 64, [11, 11], scope='conv1')
    net = slim.ops.conv2d(net, 128, [11, 11], padding='VALID', scope='conv2')
    net = slim.ops.conv2d(net, 256, [11, 11], scope='conv3')
As the example illustrates, the use of arg_scope makes the code
cleaner, simpler and easier to maintain. Notice that while argument
values are specified in the arg_scope, they can be overwritten locally.
In particular, while the padding argument has been set to 'SAME', the
second convolution overrides it with the value of 'VALID'.
However, this would mean, that conv4 should also have dimension of 73x73 because the padding would be SAME, so preserving the edges and the final pooling layer pool2 would then even be 37x37.
What is the thing that I am missing? Where is my mistake?
Thank you for helping me, I hope I have made the confusing problem clear.
I didn't see that the filter size for the conv3 layer is actually [1, 1], so it does not reduce the dimensions even with VALID padding. This has nothing to do with the arg_scope, which behaves exactly as it should.
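The dimension comments in the listing can be checked with a few lines of Python. This is only a sketch: the helper names `output_size` and `trace_sizes` and the `STEM` table are made up for illustration, and the formulas are the usual TensorFlow ones (VALID: ceil((in - filter + 1) / stride), SAME: ceil(in / stride)).

```python
import math


def output_size(size, kernel, stride=1, padding="VALID"):
    """Spatial output size of a conv/pool layer along one dimension."""
    if padding == "VALID":
        return math.ceil((size - kernel + 1) / stride)
    if padding == "SAME":
        return math.ceil(size / stride)
    raise ValueError("unknown padding: %r" % padding)


# (name, kernel, stride, padding) for the layers in the listing above.
# Only conv2 overrides the arg_scope default of padding='VALID'.
STEM = [
    ("conv0", 3, 2, "VALID"),
    ("conv1", 3, 1, "VALID"),
    ("conv2", 3, 1, "SAME"),
    ("pool1", 3, 2, "VALID"),
    ("conv3", 1, 1, "VALID"),  # 1x1 filter: VALID cannot shrink anything
    ("conv4", 3, 1, "VALID"),
    ("pool2", 3, 2, "VALID"),
]


def trace_sizes(size, layers):
    """Return {layer name: output size} when the layers are applied in order."""
    sizes = {}
    for name, kernel, stride, padding in layers:
        size = output_size(size, kernel, stride, padding)
        sizes[name] = size
    return sizes


print(trace_sizes(299, STEM))
```

With a 1x1 filter, VALID and SAME give the same output size, which is why conv3 stays at 73x73 without any override.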
I'm implementing the game gomoku in Ruby. It is a variation of tic-tac-toe played on a 15x15 board, and the first player who places 5 O's or X's in a horizontal, vertical or diagonal row wins.
First, I assign a Matrix to a variable and fill it with the numbers from 0 to 224, so that there are no repetitions and I can count them later:
require 'matrix'

gomoku = Matrix.zero(15)
num = 0
15.times do |i|
  15.times do |j|
    gomoku[i, j] = num
    num += 1
  end
end
Then players take turns, and after every turn I check for a win with the method win?:
def win? matrix
  15.times do |i|
    return true if matrix.row_vectors[i].chunk{|e| e}.map{|_, v| v.length}.max > 4 # thanks to sawa for this way of counting adjacent duplicates
    return true if matrix.column_vectors[i].chunk{|e| e}.map{|_, v| v.length}.max > 4
  end
  return false
end
I know that I'm probably doing it wrong, but that isn't my problem, though suggestions are welcome. The problem is with the diagonal rows: I don't know how to count duplicates along diagonals.
diagonal_vectors = (-10 .. 10).flat_map do |x|
  i = x < 0 ? 0 : x
  j = x < 0 ? -x : 0
  d = 15 - x.abs
  [
    d.times.map { |k|
      gomoku[i + k, j + k]
    },
    d.times.map { |k|
      gomoku[i + k, 14 - j - k]
    }
  ]
end
With this, you can apply the same test sawa gave you.
EDIT: What this does
When looking at diagonals, there are two kinds: going down-left, and going down-right. Let's focus on down-right ones for now. In a 15x15 matrix, there are 29 down-right diagonals: one starting at each element of the first row, one starting at each element of the first column, but taking care not to count the one starting at [0, 0] twice. But some diagonals are too short, so we want to only take those that start on the first eleven rows and columns (because others will be shorter than 5 elements). This is what the first three lines do: as x runs from -10 to 10, [i, j] will be [0, 10], [0, 9], ... [0, 0], [1, 0], ... [10, 0]. d is the length of a diagonal starting at that position. Then, d.times.map { |k| gomoku[i + k, j + k] } collects all the elements in that diagonal. Say we're working on [10, 0]: d is 5, so we have [10, 0], [11, 1], [12, 2], [13, 3], [14, 4]; and we collect values at those coordinates in a list. Simultaneously, we'll also work on a down-left diagonal; that's the other map's job, which flips one coordinate. Thus, the inner block will return a two-element array, which is two diagonals, one down-right, one down-left. flat_map will take care of iterating while squishing the two-element arrays so that we get one big array of diagonals, not an array of two-element arrays of diagonals.
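For anyone who wants to check the index arithmetic outside Ruby, here is the same idea sketched in Python. The function names `diagonals` and `longest_run` are made up for this illustration, and the board is a plain list of lists rather than a Matrix.

```python
from itertools import groupby


def diagonals(board, min_len=5):
    """Collect every diagonal (both directions) with at least min_len cells."""
    n = len(board)
    result = []
    for x in range(-(n - min_len), n - min_len + 1):
        i = max(x, 0)    # starting row
        j = max(-x, 0)   # starting column offset
        d = n - abs(x)   # length of this diagonal
        # down-right diagonal starting at [i, j]
        result.append([board[i + k][j + k] for k in range(d)])
        # down-left diagonal: same rows, column mirrored
        result.append([board[i + k][n - 1 - j - k] for k in range(d)])
    return result


def longest_run(cells):
    """Length of the longest run of equal adjacent values."""
    return max(len(list(group)) for _, group in groupby(cells))


# Board filled with 0..224 so that no two empty cells compare equal.
board = [[row * 15 + col for col in range(15)] for row in range(15)]
for k in range(5):
    board[3 + k][4 + k] = "X"  # five X's on a down-right diagonal

print(any(longest_run(d) >= 5 for d in diagonals(board)))
```

`groupby` plays the role of Ruby's `chunk` here: it groups adjacent equal values, so the longest group is the longest run.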
I am given M segments of the form [L,R] over an array of N elements. I need to change these segments in such a way that all segments have pairwise distinct left ends.
Example: Suppose we have 5 elements in the array and 4 segments: [1,2], [1,3], [2,4] and [4,5]. After making all the left ends pairwise distinct we have [1,2], [3,3], [2,4] and [4,5]. Here all segments have different left ends.
Let's see if I got this. I suggest
You sort all segments according to the right end.
Then you fix all the left ends, starting with the smallest right end working towards larger right ends. Fixing means you replace the current left end with the next available value.
In Python it looks like this:
def fit_intervals(datalist):
    d1 = sorted(datalist, key=lambda x: x[1])
    taken = set()

    def find_next_free(x):
        while x in taken:
            x = x + 1
        taken.add(x)
        return x

    for interval in d1:
        interval[0] = find_next_free(interval[0])

data = [[4, 5], [1, 9], [1, 2], [1, 3], [2, 4]]
fit_intervals(data)
print(data)

output: [[4, 5], [5, 9], [1, 2], [2, 3], [3, 4]]
The function find_next_free currently uses a simple linear search; if necessary, this could certainly be improved.
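One possible improvement, sketched here as an assumption rather than as part of the answer above, is to keep a "next free value" pointer for every taken value and compress the pointer chains, in the style of a disjoint-set structure. The name `fit_intervals_fast` is made up; unlike the original it returns new intervals instead of modifying the input, and like the original it does not check that the new left end stays at or below the right end.

```python
def fit_intervals_fast(intervals):
    """Give every interval a distinct left end, processing by right end."""
    next_free = {}  # taken value -> a larger value that may still be free

    def find(x):
        # Follow pointers to the first value that is not taken yet,
        # then point every value on the way directly at it.
        path = []
        while x in next_free:
            path.append(x)
            x = next_free[x]
        for p in path:
            next_free[p] = x
        return x

    result = [None] * len(intervals)
    order = sorted(range(len(intervals)), key=lambda i: intervals[i][1])
    for i in order:
        left, right = intervals[i]
        free = find(left)
        next_free[free] = free + 1  # mark as taken
        result[i] = [free, right]
    return result


print(fit_intervals_fast([[4, 5], [1, 9], [1, 2], [1, 3], [2, 4]]))
```

Repeated lookups that start from the same crowded left end then skip over the whole taken range in one step instead of walking it value by value.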
How do we recode a set of strictly increasing (or strictly decreasing) positive integers P, to decrease the number of positive integers that can occur between the integers in our set?
Why would we want to do this: Say we want to randomly sample P but 1.) P is too large to enumerate, and 2.) members of P are related in a nonrandom way, but in a way that is too complicated to sample by. However, we know a member of P when we see it. Say we know P[0] and P[n] but can't entertain the idea of enumerating all of P or understanding precisely how members of P are related. Likewise, the number of all possible integers occurring between P[0] and P[n] are many times greater than the size of P, making the chance of randomly drawing a member of P very small.
Example: Let P[0] = 2101010101 & P[n] = 505050505. Now, maybe we're only interested in integers between P[0] and P[n] that have a specific quality (e.g. all integers in P[x] sum to Q or less, each member of P has 7 or less as the largest integer). So, not all positive integers P[n] <= X <= P[0] belong to P. The P I'm interested in is discussed in the comments below.
What I've tried: If P is a strictly decreasing set and we know P[0] and P[n], then we can treat each member as if it were subtracted from P[0]. Doing so decreases each number, perhaps greatly, and maintains each member as a unique integer. For the P I'm interested in (below), one can treat each decreased value of P as being divided by a common denominator (9, 11, 99), which decreases the number of possible integers between members of P. I've found that, used in conjunction, these approaches decrease the set of all P[n] <= X <= P[0] by a few orders of magnitude, but the chance of randomly drawing a member of P from all positive integers P[n] <= X <= P[0] is still very small.
Note: As should be clear, we have to know something about P. If we don't, that basically means we have no clue of what we're looking for. When we randomly sample integers between P[0] and P[n] (recoded or not) we need to be able to say "Yup, that belongs to P.", if indeed it does.
A good answer could greatly increase the practical application of a computing algorithm I have developed. An example of the kind of P I'm interested in is given in comment 2. I am adamant about giving due credit.
While the original question is asking about a very generic scenario concerning integer encodings, I would suggest that it is unlikely that there exists an approach that works in complete generality. For example, if the P[i] are more or less random (from an information-theoretic standpoint), I would be surprised if anything should work.
So, instead, let us turn our attention to the OP's actual problem of generating partitions of an integer N containing exactly K parts. When encoding combinatorial objects as integers, it behooves us to preserve as much of the combinatorial structure as possible.
For this, we turn to the classic text Combinatorial Algorithms by Nijenhuis and Wilf, specifically Chapter 13. In fact, in this chapter, they demonstrate a framework to enumerate and sample from a number of combinatorial families -- including partitions of N where the largest part is equal to K. Using the well-known duality between partitions with K parts and partitions where the largest part is K (take the transpose of the Ferrers diagram), we find that we only need to make a change to the decoding process.
Anyways, here's some source code:
import sys
import random
import time

if len(sys.argv) < 4 :
    sys.stderr.write("Usage: {0} N K iter\n".format(sys.argv[0]))
    sys.stderr.write("\tN = number to be partitioned\n")
    sys.stderr.write("\tK = number of parts\n")
    sys.stderr.write("\titer = number of iterations (if iter=0, enumerate all partitions)\n")
    quit()

N = int(sys.argv[1])
K = int(sys.argv[2])
iters = int(sys.argv[3])

if (N < K) :
    sys.stderr.write("Error: N<K ({0}<{1})\n".format(N,K))
    quit()

# B[n][k] = number of partitions of n with largest part equal to k
B = [[0 for j in range(K+1)] for i in range(N+1)]

def calc_B(n,k) :
    for j in xrange(1,k+1) :
        for m in xrange(j, n+1) :
            if j == 1 :
                B[m][j] = 1
            elif m - j > 0 :
                B[m][j] = B[m-1][j-1] + B[m-j][j]
            else :
                B[m][j] = B[m-1][j-1]
def generate(n,k,r=None) :
    path = []
    append = path.append

    # Invalid input
    if n < k or n == 0 or k == 0:
        return []

    # Pick random number between 1 and B[n][k] if r is not specified
    if r == None :
        r = random.randrange(1,B[n][k]+1)

    # Construct path from r
    while r > 0 :
        if n==1 and k== 1:
            append('N')
            r = 0 ### Finish loop
        elif r <= B[n-k][k] and B[n-k][k] > 0 : # East/West Move
            append('E')
            n = n-k
        else : # Northeast/Southwest move
            append('N')
            r -= B[n-k][k]
            n = n-1
            k = k-1

    # Decode path into partition
    partition = []
    l = 0
    d = 0
    append = partition.append
    for i in reversed(path) :
        if i == 'N' :
            if d > 0 : # apply East moves all at once
                for j in xrange(l) :
                    partition[j] += d
                d = 0 # reset East moves
            append(1) # apply North move
            l += 1
        else :
            d += 1 # accumulate East moves
    if d > 0 : # apply any remaining East moves
        for j in xrange(l) :
            partition[j] += d

    return partition
t = time.clock()
sys.stderr.write("Generating B table... ")
calc_B(N, K)
sys.stderr.write("Done ({0} seconds)\n".format(time.clock()-t))

bmax = B[N][K]
Bits = 0
sys.stderr.write("B[{0}][{1}]: {2}\t".format(N,K,bmax))
while bmax > 1 :
    bmax //= 2
    Bits += 1
sys.stderr.write("Bits: {0}\n".format(Bits))

if iters == 0 : # enumerate all partitions
    for i in xrange(1,B[N][K]+1) :
        print i,"\t",generate(N,K,i)

else : # generate random partitions
    t=time.clock()
    for i in xrange(1,iters+1) :
        Q = generate(N,K)
        print Q
        if i%1000==0 :
            sys.stderr.write("{0} written ({1:.3f} seconds)\r".format(i,time.clock()-t))

    sys.stderr.write("{0} written ({1:.3f} seconds total) ({2:.3f} iterations per second)\n".format(i, time.clock()-t, float(i)/(time.clock()-t) if time.clock()-t else 0))
And here are some examples of the performance (on a MacBook Pro 8.3, 2GHz i7, 4 GB, Mac OSX 10.6.3, Python 2.6.1):
mhum$ python part.py 20 5 10
Generating B table... Done (6.7e-05 seconds)
B[20][5]: 84 Bits: 6
[7, 6, 5, 1, 1]
[6, 6, 5, 2, 1]
[5, 5, 4, 3, 3]
[7, 4, 3, 3, 3]
[7, 5, 5, 2, 1]
[8, 6, 4, 1, 1]
[5, 4, 4, 4, 3]
[6, 5, 4, 3, 2]
[8, 6, 4, 1, 1]
[10, 4, 2, 2, 2]
10 written (0.000 seconds total) (37174.721 iterations per second)
mhum$ python part.py 20 5 1000000 > /dev/null
Generating B table... Done (5.9e-05 seconds)
B[20][5]: 84 Bits: 6
100000 written (2.013 seconds total) (49665.478 iterations per second)
mhum$ python part.py 200 25 100000 > /dev/null
Generating B table... Done (0.002296 seconds)
B[200][25]: 147151784574 Bits: 37
100000 written (8.342 seconds total) (11987.843 iterations per second)
mhum$ python part.py 3000 200 100000 > /dev/null
Generating B table... Done (0.313318 seconds)
B[3000][200]: 3297770929953648704695235165404132029244952980206369173 Bits: 181
100000 written (59.448 seconds total) (1682.135 iterations per second)
mhum$ python part.py 5000 2000 100000 > /dev/null
Generating B table... Done (4.829086 seconds)
B[5000][2000]: 496025142797537184410324290349759736884515893324969819660 Bits: 188
100000 written (255.328 seconds total) (391.653 iterations per second)
mhum$ python part-final2.py 20 3 0
Generating B table... Done (0.0 seconds)
B[20][3]: 33 Bits: 5
1 [7, 7, 6]
2 [8, 6, 6]
3 [8, 7, 5]
4 [9, 6, 5]
5 [10, 5, 5]
6 [8, 8, 4]
7 [9, 7, 4]
8 [10, 6, 4]
9 [11, 5, 4]
10 [12, 4, 4]
11 [9, 8, 3]
12 [10, 7, 3]
13 [11, 6, 3]
14 [12, 5, 3]
15 [13, 4, 3]
16 [14, 3, 3]
17 [9, 9, 2]
18 [10, 8, 2]
19 [11, 7, 2]
20 [12, 6, 2]
21 [13, 5, 2]
22 [14, 4, 2]
23 [15, 3, 2]
24 [16, 2, 2]
25 [10, 9, 1]
26 [11, 8, 1]
27 [12, 7, 1]
28 [13, 6, 1]
29 [14, 5, 1]
30 [15, 4, 1]
31 [16, 3, 1]
32 [17, 2, 1]
33 [18, 1, 1]
I'll leave it to the OP to verify that this code indeed generates partitions according to the desired (uniform) distribution.
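One way to check this for small cases is to decode every rank r from 1 to B[N][K] and confirm that each partition with K parts appears exactly once; if the rank-to-partition map is a bijection, a uniformly drawn r gives a uniformly drawn partition. The following is a Python 3 sketch of that check, not the script above: the names `count_table` and `unrank` are made up, and the decoding step is simplified (each East move adds 1 to every existing part directly instead of accumulating the moves).

```python
def count_table(n_max, k_max):
    """B[n][k] = number of partitions of n into exactly k parts."""
    B = [[0] * (k_max + 1) for _ in range(n_max + 1)]
    for k in range(1, k_max + 1):
        for n in range(k, n_max + 1):
            if k == 1:
                B[n][k] = 1
            else:
                # B[n-k][k] is still 0 whenever n - k < k
                B[n][k] = B[n - 1][k - 1] + B[n - k][k]
    return B


def unrank(n, k, r, B):
    """Partition of n into k parts with rank r, where 1 <= r <= B[n][k]."""
    moves = []
    while not (n == 1 and k == 1):
        if B[n - k][k] > 0 and r <= B[n - k][k]:
            moves.append("E")   # every part grows by one
            n -= k
        else:
            moves.append("N")   # a new part of size one is added
            r -= B[n - k][k]
            n -= 1
            k -= 1
    moves.append("N")

    parts = []
    for move in reversed(moves):
        if move == "N":
            parts.append(1)
        else:
            parts = [p + 1 for p in parts]
    return parts


N, K = 20, 3
B = count_table(N, K)
seen = {tuple(unrank(N, K, r, B)) for r in range(1, B[N][K] + 1)}
print(B[N][K], len(seen))  # equal counts mean no partition is produced twice
```

This only exercises small N and K; it does not replace a statistical test of the random draws themselves.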
EDIT: Added an example of the enumeration functionality.
Below is a script that accomplishes what I've asked, as far as recoding integers that represent integer partitions of N with K parts. A better recoding method is needed for this approach to be practical for K > 4. This is definitely not a best or preferred approach. However, it's conceptually simple and easily argued as fundamentally unbiased. It's also very fast for small K. The script runs fine in Sage notebook and does not call Sage functions. It is NOT a script for random sampling. Random sampling per se is not the problem.
The method:
1.) Treat integer partitions as if their summands are concatenated together and padded with zeros according to size of largest summand in first lexical partition, e.g. [17,1,1,1] -> 17010101 & [5,5,5,5] -> 05050505
2.) Treat the resulting integers as if they are subtracted from the largest integer (i.e. the int representing the first lexical partition). e.g. 17010101 - 5050505 = 11959596
3.) Treat each resulting decreased integer as divided by a common denominator, e.g. 11959596/99 = 120804
So, if we wanted to choose a random partition we would:
1.) Choose a number between 0 and 120,804 (instead of a number between 5,050,505 and 17,010,101)
2.) Multiply the number by 99 and subtract from 17010101
3.) Split the resulting integer according to how we treated each integer as being padded with 0's
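The two lists of steps above can be sketched as a pair of small functions. This is a Python 3 illustration with made-up names (`encode`, `decode`, `is_partition`), using the N = 20, K = 4 example where the first lexical partition is coded as 17010101. The division by 99 is exact in that example because, with two digits per summand, every code is congruent to N modulo 99.

```python
def encode(part, fill, first_code, divisor):
    """Steps 1-3 of the method: concatenate, subtract from first_code, divide."""
    code = int("".join(str(p).zfill(fill) for p in part))
    diff = first_code - code
    if diff % divisor:
        raise ValueError("difference is not a multiple of the divisor")
    return diff // divisor


def decode(small, fill, k, first_code, divisor):
    """Reverse the recoding: multiply, subtract, split into k summands."""
    code = first_code - small * divisor
    digits = str(code).zfill(fill * k)
    return [int(digits[i:i + fill]) for i in range(0, fill * k, fill)]


def is_partition(part, n):
    """True if part is a non-increasing list of positive integers summing to n."""
    return (sum(part) == n and part[-1] >= 1
            and all(a >= b for a, b in zip(part, part[1:])))


FIRST = 17010101  # code of [17, 1, 1, 1]
print(encode([5, 5, 5, 5], 2, FIRST, 99))
print(decode(40200, 2, 4, FIRST, 99))
print(is_partition(decode(1, 2, 4, FIRST, 99), 20))
```

Most small integers decode to something that is not a partition of N with K parts, so a sampler built on this has to test each draw with a check like `is_partition` and reject the misses.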
Pros and cons: As stated in the body of the question, this particular recoding method doesn't do enough to greatly improve the chance of randomly selecting an integer representing a member of P. For small numbers of parts (e.g. K < 5) and substantially larger totals (e.g. N > 100), a function that implements this concept can be very fast because the approach avoids the time-consuming recursion (snake eating its tail) that slows other random partition functions or makes them impractical for large N.
At small K, the probability of drawing a member of P can be reasonable when considering how fast the rest of the process is. Coupled with quick random draws, decoding, and evaluation, this function can find uniform random partitions for combinations of N&K (e.g. N = 20000, K = 4) that are untenable with other algorithms. A better way to recode integers is greatly needed to make this a generally powerful approach.
import random
import sys
from math import floor  # used by last_partition; Sage provides floor automatically
First, some generally useful and straightforward functions
def first_partition(N,K):
    part = [N-K+1]
    ones = [1]*(K-1)
    part.extend(ones)
    return part

def last_partition(N,K):
    most_even = [int(floor(float(N)/float(K)))]*K
    _remainder = int(N%K)
    j = 0
    while _remainder > 0:
        most_even[j] += 1
        _remainder -= 1
        j += 1
    return most_even

def first_part_nmax(N,K,Nmax):
    part = [Nmax]
    N -= Nmax
    K -= 1
    while N > 0:
        Nmax = min(Nmax,N-K+1)
        part.append(Nmax)
        N -= Nmax
        K -= 1
    return part
#print first_partition(20,4)
#print last_partition(20,4)
#print first_part_nmax(20,4,12)
#sys.exit()
def portion(alist, indices):
    return [alist[i:j] for i, j in zip([0]+indices, indices+[None])]

def next_restricted_part(part,N,K): # *find next partition matching N&K w/out recursion
    if part == last_partition(N,K):return first_partition(N,K)
    for i in enumerate(reversed(part)):
        if i[1] - part[-1] > 1:
            if i[0] == (K-1):
                return first_part_nmax(N,K,(i[1]-1))
            else:
                parts = portion(part,[K-i[0]-1]) # split p
                h1 = parts[0]
                h2 = parts[1]
                next = first_part_nmax(sum(h2),len(h2),(h2[0]-1))
                return h1+next
""" *I don't know a math software that has this function and Nijenhuis and Wilf (1978)
don't give it (i.e. NEXPAR is not restricted by K). Apparently, folks often get the
next restricted part using recursion, which is unnecessary """
def int_to_list(i): # convert an int to a list of digits w/out padding with 0's
    return [int(x) for x in str(i)]

def int_to_list_fill(i,fill): # convert an int to a list of digit characters and pad with 0's
    return [x for x in str(i).zfill(fill)]

def list_to_int(l): # join a list into a string of digits (callers apply int() as needed)
    return "".join(str(x) for x in l)

def part_to_int(part,fill): # convert a partition of K parts to its coded form
    # and pad each summand with the respective number of 0's
    p_list = []
    for p in part:
        if len(int_to_list(p)) != fill:
            l = int_to_list_fill(p,fill)
            p = list_to_int(l)
        p_list.append(p)
    _int = list_to_int(p_list)
    return _int
def int_to_part(num,fill,K): # convert an int to a partition of K parts
    # and pad with the respective number of 0's
    # This function isn't called by the script, but I thought I'd include
    # it anyway because it would be used to recover the respective partition
    _list = int_to_list(num)
    if len(_list) != fill*K:
        ct = fill*K - len(_list)
        while ct > 0:
            _list.insert(0,0)
            ct -= 1
    new_list1 = []
    new_list2 = []
    for i in _list:
        new_list1.append(i)
        if len(new_list1) == fill:
            new_list2.append(new_list1)
            new_list1 = []
    part = []
    for i in new_list2:
        j = int(list_to_int(i))
        part.append(j)
    return part
Finally, we get to the total N and number of parts K. The following will print partitions satisfying N&K in lexical order, with associated recoded integers
N = 20
K = 4
print '#, partition, coded, _diff, smaller_diff'
first_part = first_partition(N,K) # first lexical partition for N&K
fill = len(int_to_list(max(first_part)))
# pad with zeros to 1.) ensure a strictly decreasing relationship w/in P,
# 2.) keep track of (encode/decode) partition summand values
first_num = part_to_int(first_part,fill)
last_part = last_partition(N,K)
last_num = part_to_int(last_part,fill)
print '1',first_part,first_num,'',0,' ',0
part = list(first_part)
ct = 1
while ct < 10:
    part = next_restricted_part(part,N,K)
    _num = part_to_int(part,fill)
    _diff = int(first_num) - int(_num)
    smaller_diff = (_diff/99)
    ct+=1
    print ct, part, _num,'',_diff,' ',smaller_diff
OUTPUT:
ct, partition, coded, _diff, smaller_diff
1 [17, 1, 1, 1] 17010101 0 0
2 [16, 2, 1, 1] 16020101 990000 10000
3 [15, 3, 1, 1] 15030101 1980000 20000
4 [15, 2, 2, 1] 15020201 1989900 20100
5 [14, 4, 1, 1] 14040101 2970000 30000
6 [14, 3, 2, 1] 14030201 2979900 30100
7 [14, 2, 2, 2] 14020202 2989899 30201
8 [13, 5, 1, 1] 13050101 3960000 40000
9 [13, 4, 2, 1] 13040201 3969900 40100
10 [13, 3, 3, 1] 13030301 3979800 40200
In short, integers in the last column could be a lot smaller.
Why a random sampling strategy based on this idea is fundamentally unbiased:
Each integer partition of N having K parts corresponds to one and only one recoded integer. That is, we don't pick a number at random, decode it, and then try to rearrange the elements to form a proper partition of N&K. Consequently, each integer (whether corresponding to partitions of N&K or not) has the same chance of being drawn. The goal is to inherently reduce the number of integers not corresponding to partitions of N with K parts, and so, to make the process of random sampling faster.
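To see how many draws this recoding still wastes, the small N = 20, K = 4 example can simply be brute-forced. The following Python 3 sketch (the function name `count_hits` is made up, and the divisor 99 assumes two digits per summand as in the example) decodes every recoded integer in the range and counts how many are partitions of N with K parts.

```python
def count_hits(n, k, fill, divisor):
    """Return (number of valid recoded integers, size of the sampling range)."""
    def to_code(part):
        return int("".join(str(p).zfill(fill) for p in part))

    first = [n - k + 1] + [1] * (k - 1)                    # first lexical partition
    base, rem = divmod(n, k)
    last = [base + 1] * rem + [base] * (k - rem)           # last lexical partition
    first_code = to_code(first)
    upper = (first_code - to_code(last)) // divisor

    hits = 0
    for small in range(upper + 1):
        digits = str(first_code - small * divisor).zfill(fill * k)
        part = [int(digits[i:i + fill]) for i in range(0, fill * k, fill)]
        if (sum(part) == n and part[-1] >= 1
                and all(a >= b for a, b in zip(part, part[1:]))):
            hits += 1
    return hits, upper + 1


hits, total = count_hits(20, 4, 2, 99)
print(hits, total, float(hits) / total)
```

For this example the ratio of hits to range size is the per-draw acceptance probability of a rejection sampler built on the recoding, which is the quantity a better recoding would need to raise.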