Count the number of all the shortest paths for any two vertices in a graph - algorithm

For any pair of distinct vertices in a given undirected graph G, I want to find the number of all the shortest paths ("SP" for short); it is not necessary to find or print the exact vertices on any particular path. For example, for the following graph given in edge-list format, there are two SPs between vertices 1 and 2: (1,3,2) and (1,4,2).
vertex =
1 3
2 4
1 4
2 3
1 8
4 7
3 6
5 2
I want to implement this on top of the Floyd-Warshall algorithm, a well-known dynamic-programming algorithm that finds the length of the shortest path for each pair of vertices in O(n^3), where n is the number of vertices. Say the result is a 2D array a[n][n]. For the above graph, it is:
0 2 1 1 3 2 2 1
2 0 1 1 1 2 2 3
1 1 0 2 2 1 3 2
1 1 2 0 2 3 1 2
3 1 2 2 0 3 3 4
2 2 1 3 3 0 4 3
2 2 3 1 3 4 0 3
1 3 2 2 4 3 3 0
The code for constructing graph matrix G and solving for matrix a is as follows:
v = vertex(:,1);
t = vertex(:,2);
G = zeros( max(max(v),max(t)));
% Build the matrix for graph:
for i = 1:length(v)
G(v(i), t(i)) = G(v(i), t(i)) + 1;
G(t(i), v(i)) = G(v(i), t(i)); % comment here if input is bi-directional
end
a = G;
n = length(a);
a(a==0) = Inf;
a(1:n+1:n^2)=0; % diagonal element to be zero
for k = 1:n
for i= 1:n
for j= 1:n % for j=i+1:n
if a(i,j) > a(i,k) + a(k,j)
a(i,j) = a(i,k) + a(k,j);
% a(j,i) = a(i,j);
end
end
end
end
Now, let's define a 2D array b[n][n] as the number of ALL the SPs for each pair of vertices. For example, we expect b[1][2] = 2.
I wrote the following code in MATLAB (if you are not familiar with MATLAB, just treat it as pseudocode). It gives correct values for most pairs but wrong values for a few. For example, after running the code, b[5][8] = 0, which is wrong (the correct answer is 2).
%%
% find the number of ALL SP paths for ALL pairs based on the "a" array:
% b is a two-dim array, b(i,j) is the total number of SP for pair( i,j)
b = G;
for k=1:n
for i=1:n
for j= i+1:n
if(i==j)
continue; % b(i,i)=0
end
if (k==j) % the same as : G(k,j)==0
continue;
end
if(k==i && G(k,j)~=0)
b(i,j) = 1;
continue;
end
if(a(i,j) ~= a(i,k)+G(k,j)) % w(u,v)=G(u,v) in unweighted graph)
continue;
end
% sigma(s,v) = sigma(s,v) + sigma(s,u);
b(i,j) = b(i,j) + b(k,i);
end
end
end
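For comparison, here is a minimal Python sketch of one common way to carry path counts through Floyd-Warshall. It is not a repaired version of the MATLAB code above. It assumes a simple graph (no repeated edges, no self-loops) with vertices numbered 1..n, and the function name `count_shortest_paths` is made up for this illustration. Whenever routing through k gives a strictly shorter path, the count is replaced by count[i][k] * count[k][j]; when it ties, that product is added.

```python
import math

def count_shortest_paths(n, edges):
    """Return (dist, count) for an undirected, unweighted simple graph on vertices 1..n."""
    dist = [[math.inf] * n for _ in range(n)]
    count = [[0] * n for _ in range(n)]
    for v in range(n):
        dist[v][v] = 0
        count[v][v] = 1  # the empty path from a vertex to itself
    for u, v in edges:
        u, v = u - 1, v - 1  # switch to 0-based indices
        dist[u][v] = dist[v][u] = 1
        count[u][v] = count[v][u] = 1
    for k in range(n):
        for i in range(n):
            for j in range(n):
                # skipping these cases keeps row k and column k fixed during pass k
                if i == k or j == k or i == j:
                    continue
                through_k = dist[i][k] + dist[k][j]
                if through_k < dist[i][j]:
                    dist[i][j] = through_k
                    count[i][j] = count[i][k] * count[k][j]
                elif through_k == dist[i][j] and through_k != math.inf:
                    count[i][j] += count[i][k] * count[k][j]
    return dist, count

edges = [(1, 3), (2, 4), (1, 4), (2, 3), (1, 8), (4, 7), (3, 6), (5, 2)]
dist, count = count_shortest_paths(8, edges)
print(count[0][1], count[4][7])  # pairs (1,2) and (5,8)
```

Each shortest path is counted exactly once, at the pass whose k is the highest-numbered intermediate vertex on that path, which is why the products can simply be summed.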

Related

2D bin packing on a grid

I have an n × m grid and a collection of polyominos. I would like to know if it is possible to pack them into the grid: no overlapping or rotation is allowed.
I expect that like most packing problems this version is NP-hard and difficult to approximate, so I'm not expecting anything crazy, but an algorithm that could find reasonable packings on a grid around 25 × 25 and be fairly comprehensive around 10 × 10 would be great. (My tiles are mostly tetrominos -- four blocks -- but they could have 5–9+ blocks.)
I'll take whatever anyone has to offer: an algorithm, a paper, an existing program which can be adapted.
Here is a prototype-like SAT-solver approach, which tackles:
a-priori fixed polyomino patterns (see Constants / Input in code)
if rotations should be allowed, rotated pieces have to be added to the set
every polyomino can be placed 0-inf times
there is no scoring-mechanic besides:
the number of non-covered tiles is minimized!
Considering classic off-the-shelf methods for combinatorial-optimization (SAT, CP, MIP), this one will probably scale best (educated guess). It will also be very hard to beat when designing customized heuristics!
If needed, these slides provide some practical introduction to SAT-solvers in practice. Here we are using CDCL-based solvers which are complete (will always find a solution in finite time if there is one; will always be able to prove there is no solution in finite time if there is none; memory of course also plays a role!).
More complex (linear) per-tile scoring-functions are hard to incorporate in general. This is where a (M)IP-approach can be better. But in terms of pure search SAT-solving is much faster in general.
The N=25 problem with my polyomino-set takes ~1 second (and one could easily parallelize this at multiple levels of granularity: inside the SAT-solver (threading parameter) or across the outer loop; the latter will be explained later).
Of course the following holds:
as this is an NP-hard problem, there will be easy and non-easy instances
I did not do scientific benchmarks with many different sets of polyominos
it's to be expected that some sets are easier to solve than others
this is one possible SAT-formulation (not the most trivial!) of infinitely many
each formulation has advantages and disadvantages
Idea
The general approach is creating a decision-problem and transforming it into CNF, which is then solved by highly efficient SAT-solvers (here: cryptominisat; the CNF will be in DIMACS CNF format), which will be used as black-box solvers (no parameter-tuning!).
As the goal is to optimize the number of filled tiles and we are using a decision-problem, we need an outer loop that adds a minimum-tiles-used constraint and tries to solve the problem. If that is not successful, it decreases this number. So in general we are calling the SAT-solver multiple times (from scratch!).
There are many different formulations / transformations to CNF possible. Here we use (binary) decision-variables X which indicate a placement. A placement is a tuple like polyomino, x_index, y_index (this index marks the top-left field of some pattern). There is a one-to-one mapping between the variables and the possible placements of all polyominos.
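To make the placement/variable mapping concrete, here is a small pure-Python sketch. It is an illustration only: pieces are given as lists of (row, col) offsets rather than NumPy arrays, and the names `enumerate_placements` and `covered_cells` are invented for this example. The placement at list position i would correspond to CNF variable i + 1.

```python
def enumerate_placements(polyominos, rows, cols):
    """List every (piece_index, top, left) whose bounding box fits inside the grid."""
    placements = []
    for p_ind, cells in enumerate(polyominos):
        height = 1 + max(r for r, _ in cells)
        width = 1 + max(c for _, c in cells)
        for top in range(rows - height + 1):
            for left in range(cols - width + 1):
                placements.append((p_ind, top, left))
    return placements

def covered_cells(polyominos, placement):
    """Return the set of grid cells covered by one placement."""
    p_ind, top, left = placement
    return {(top + r, left + c) for r, c in polyominos[p_ind]}

# Two example pieces: a horizontal I-tetromino and a 2x2 square.
pieces = [[(0, 0), (0, 1), (0, 2), (0, 3)],
          [(0, 0), (0, 1), (1, 0), (1, 1)]]
placements = enumerate_placements(pieces, 4, 5)
print(len(placements))  # 8 positions for the I-piece plus 12 for the square
```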
The core idea is: search in the space of all possible placement-combinations for one solution, which is not invalidating some constraints.
Additionally, we have decision-variables Y, which indicate a tile being filled. There are M*N such variables.
When having access to all possible placements, it's easy to calculate a collision-set for each tile-index (M*N). Given some fixed tile, we can check which placements can fill this one and constrain the problem to only select <=1 of those. This is active on X. In the (M)IP world this probably would be called convex-hull for the collisions.
n<=k-constraints (at most k of n variables may be true) are ubiquitous in SAT-solving and many different formulations are possible. A naive encoding would need an exponential number of clauses in general, which easily becomes infeasible. Using new variables, there are many variable-clause trade-offs (see Tseitin-encoding) possible. I'm reusing one (old code; the only reason why my code is python2-only) which worked well for me in the past. It's based on describing hardware-based counter-logic in CNF and provides good empirical and theoretical performance (see paper). Of course there are many alternatives.
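For the special case k = 1, the simplest alternative is the pairwise encoding sketched below. Note that this is a different technique from the sequential-counter encoding used in the answer's code: it needs no auxiliary variables but produces n(n-1)/2 clauses. The helper names are invented for this illustration, and clauses are represented as lists of signed integers in the DIMACS convention.

```python
from itertools import combinations

def at_most_one_pairwise(variables):
    """Clauses forbidding any two of the given variables from being true together."""
    return [[-a, -b] for a, b in combinations(variables, 2)]

def to_dimacs(clauses, n_vars):
    """Serialize clauses to DIMACS CNF text (header line, then one clause per line)."""
    lines = ["p cnf {} {}".format(n_vars, len(clauses))]
    lines += [" ".join(map(str, clause)) + " 0" for clause in clauses]
    return "\n".join(lines) + "\n"

clauses = at_most_one_pairwise([1, 2, 3])
print(to_dimacs(clauses, 3))
```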
Additionally, we need to force the SAT-solver not to make all variables negative. We have to add constraints describing the following (that's one approach):
if some field is used: there has to be at least one placement active (poly + x + y), which results in covering this field!
this is a basic logical implication easily formulated as one potentially big logical or
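The implication "field used implies some covering placement is active" becomes a single clause: the negated field variable followed by all covering placement variables. A tiny sketch, with hypothetical helper names and variable numbers chosen only for the example:

```python
def cover_implication(y_var, covering_placement_vars):
    """Clause for: y_var -> (x_1 or x_2 or ...), i.e. (not y_var) or x_1 or x_2 or ..."""
    return [-y_var] + list(covering_placement_vars)

def satisfied(clause, assignment):
    """Check one clause against a dict mapping variable number -> bool."""
    return any(assignment[abs(lit)] == (lit > 0) for lit in clause)

clause = cover_implication(10, [1, 2])
print(clause)
print(satisfied(clause, {10: True, 1: False, 2: False}))  # field marked, nothing covers it
```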
Then only the core-loop is missing: try to fill all M*N fields, then M*N-1, and so on until successful. This again uses the n<=k formulation mentioned earlier.
Code
This is python2-code, which needs the SAT-solver cryptominisat 5 in the directory the script is run from.
I'm also using tools from python's excellent scientific-stack.
# PYTHON 2!
import math
import copy
import subprocess
import numpy as np
import matplotlib.pyplot as plt      # plotting-only
import seaborn as sns                # plotting-only
np.set_printoptions(linewidth=120)   # more nice console-output

""" Constants / Input
    Example: 5 tetrominoes; no rotation """
M, N = 25, 25
polyominos = [np.array([[1,1,1,1]]),
              np.array([[1,1],[1,1]]),
              np.array([[1,0],[1,0], [1,1]]),
              np.array([[1,0],[1,1],[0,1]]),
              np.array([[1,1,1],[0,1,0]])]

""" Preprocessing
    Calculate:
    A: possible placements
    B: covered positions
    C: collisions between placements
"""
placements = []
covered = []
for p_ind, p in enumerate(polyominos):
    mP, nP = p.shape
    for x in range(M):
        for y in range(N):
            if x + mP <= M:          # assumption: no zero rows / cols in each p
                if y + nP <= N:      # could be more efficient
                    placements.append((p_ind, x, y))
                    cover = np.zeros((M,N), dtype=bool)
                    cover[x:x+mP, y:y+nP] = p
                    covered.append(cover)
covered = np.array(covered)

collisions = []
for m in range(M):
    for n in range(N):
        collision_set = np.flatnonzero(covered[:, m, n])
        collisions.append(collision_set)

""" Helper-function: Cardinality constraints """
# K-ARY CONSTRAINT GENERATION
# ###########################
# SINZ, Carsten. Towards an optimal CNF encoding of boolean cardinality constraints.
# CP, 2005, 3709. Jg., S. 827-831.

def next_var_index(start):
    next_var = start
    while(True):
        yield next_var
        next_var += 1

class s_index():
    def __init__(self, start_index):
        self.firstEnvVar = start_index

    def next(self,i,j,k):
        return self.firstEnvVar + i*k +j

def gen_seq_circuit(k, input_indices, next_var_index_gen):
    cnf_string = ''
    s_index_gen = s_index(next_var_index_gen.next())

    # write clauses of first partial sum (i.e. i=0)
    cnf_string += (str(-input_indices[0]) + ' ' + str(s_index_gen.next(0,0,k)) + ' 0\n')
    for i in range(1, k):
        cnf_string += (str(-s_index_gen.next(0, i, k)) + ' 0\n')

    # write clauses for general case (i.e. 0 < i < n-1)
    for i in range(1, len(input_indices)-1):
        cnf_string += (str(-input_indices[i]) + ' ' + str(s_index_gen.next(i, 0, k)) + ' 0\n')
        cnf_string += (str(-s_index_gen.next(i-1, 0, k)) + ' ' + str(s_index_gen.next(i, 0, k)) + ' 0\n')
        for u in range(1, k):
            cnf_string += (str(-input_indices[i]) + ' ' + str(-s_index_gen.next(i-1, u-1, k)) + ' ' + str(s_index_gen.next(i, u, k)) + ' 0\n')
            cnf_string += (str(-s_index_gen.next(i-1, u, k)) + ' ' + str(s_index_gen.next(i, u, k)) + ' 0\n')
        cnf_string += (str(-input_indices[i]) + ' ' + str(-s_index_gen.next(i-1, k-1, k)) + ' 0\n')

    # last clause for last variable
    cnf_string += (str(-input_indices[-1]) + ' ' + str(-s_index_gen.next(len(input_indices)-2, k-1, k)) + ' 0\n')

    return (cnf_string, (len(input_indices)-1)*k, 2*len(input_indices)*k + len(input_indices) - 3*k - 1)

def gen_at_most_n_constraints(vars, start_var, n):
    constraint_string = ''
    used_clauses = 0
    used_vars = 0
    index_gen = next_var_index(start_var)
    circuit = gen_seq_circuit(n, vars, index_gen)
    constraint_string += circuit[0]
    used_clauses += circuit[2]
    used_vars += circuit[1]
    start_var += circuit[1]
    return [constraint_string, used_clauses, used_vars, start_var]

def parse_solution(output):
    # assumes there is one
    vars = []
    for line in output.split("\n"):
        if line:
            if line[0] == 'v':
                line_vars = list(map(lambda x: int(x), line.split()[1:]))
                vars.extend(line_vars)
    return vars

def solve(CNF):
    p = subprocess.Popen(["cryptominisat5.exe"], stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    result = p.communicate(input=CNF)[0]
    sat_line = result.find('s SATISFIABLE')
    if sat_line != -1:
        # solution found!
        vars = parse_solution(result)
        return True, vars
    else:
        return False, None

""" SAT-CNF: BASE """
X = np.arange(1, len(placements)+1)                                     # decision-vars
                                                                        # 1-index for CNF
Y = np.arange(len(placements)+1, len(placements)+1 + M*N).reshape(M,N)
next_var = len(placements)+1 + M*N                                      # aux-var gen
n_clauses = 0

cnf = ''                                                                # slow string appends
                                                                        # int-based would be better
# <= 1 for each collision-set
for cset in collisions:
    constraint_string, used_clauses, used_vars, next_var = \
        gen_at_most_n_constraints(X[cset].tolist(), next_var, 1)
    n_clauses += used_clauses
    cnf += constraint_string

# if field marked: one of covering placements active
for x in range(M):
    for y in range(N):
        covering_placements = X[np.flatnonzero(covered[:, x, y])]  # could reuse collisions
        clause = str(-Y[x,y])
        for i in covering_placements:
            clause += ' ' + str(i)
        clause += ' 0\n'
        cnf += clause
        n_clauses += 1

print('BASE CNF size')
print('clauses: ', n_clauses)
print('vars: ', next_var - 1)

""" SOLVE in loop -> decrease number of placed-fields until SAT """
print('CORE LOOP')
N_FIELD_HIT = M*N
while True:
    print(' N_FIELDS >= ', N_FIELD_HIT)
    # sum(y) >= N_FIELD_HIT
    # == sum(not y) <= M*N - N_FIELD_HIT
    cnf_final = copy.copy(cnf)
    n_clauses_final = n_clauses

    if N_FIELD_HIT == M*N:  # awkward special case
        constraint_string = ''.join([str(y) + ' 0\n' for y in Y.ravel()])
        n_clauses_final += N_FIELD_HIT
    else:
        constraint_string, used_clauses, used_vars, next_var = \
            gen_at_most_n_constraints((-Y).ravel().tolist(), next_var, M*N - N_FIELD_HIT)
        n_clauses_final += used_clauses

    n_vars_final = next_var - 1
    cnf_final += constraint_string
    # header: use the clause count that includes the cardinality constraint
    cnf_final = 'p cnf ' + str(n_vars_final) + ' ' + str(n_clauses_final) + \
        ' \n' + cnf_final  # header

    status, sol = solve(cnf_final)
    if status:
        print(' SOL found: ', N_FIELD_HIT)

        """ Print sol """
        res = np.zeros((M, N), dtype=int)
        counter = 1
        for v in sol[:X.shape[0]]:
            if v>0:
                p, x, y = placements[v-1]
                pM, pN = polyominos[p].shape
                poly_nnz = np.where(polyominos[p] != 0)
                x_inds, y_inds = x+poly_nnz[0], y+poly_nnz[1]
                res[x_inds, y_inds] = p+1
                counter += 1
        print(res)

        """ Plot """
        # very very ugly code; too lazy
        ax1 = plt.subplot2grid((5, 12), (0, 0), colspan=11, rowspan=5)
        ax_p0 = plt.subplot2grid((5, 12), (0, 11))
        ax_p1 = plt.subplot2grid((5, 12), (1, 11))
        ax_p2 = plt.subplot2grid((5, 12), (2, 11))
        ax_p3 = plt.subplot2grid((5, 12), (3, 11))
        ax_p4 = plt.subplot2grid((5, 12), (4, 11))
        ax_p0.imshow(polyominos[0] * 1, vmin=0, vmax=5)
        ax_p1.imshow(polyominos[1] * 2, vmin=0, vmax=5)
        ax_p2.imshow(polyominos[2] * 3, vmin=0, vmax=5)
        ax_p3.imshow(polyominos[3] * 4, vmin=0, vmax=5)
        ax_p4.imshow(polyominos[4] * 5, vmin=0, vmax=5)
        ax_p0.xaxis.set_major_formatter(plt.NullFormatter())
        ax_p1.xaxis.set_major_formatter(plt.NullFormatter())
        ax_p2.xaxis.set_major_formatter(plt.NullFormatter())
        ax_p3.xaxis.set_major_formatter(plt.NullFormatter())
        ax_p4.xaxis.set_major_formatter(plt.NullFormatter())
        ax_p0.yaxis.set_major_formatter(plt.NullFormatter())
        ax_p1.yaxis.set_major_formatter(plt.NullFormatter())
        ax_p2.yaxis.set_major_formatter(plt.NullFormatter())
        ax_p3.yaxis.set_major_formatter(plt.NullFormatter())
        ax_p4.yaxis.set_major_formatter(plt.NullFormatter())

        mask = (res==0)
        sns.heatmap(res, cmap='viridis', mask=mask, cbar=False, square=True, linewidths=.1, ax=ax1)
        plt.tight_layout()
        plt.show()
        break

    N_FIELD_HIT -= 1  # binary-search could be viable in some cases
                      # but beware the empirical asymmetry in SAT-solvers:
                      # finding solution vs. proving there is none!
Output console
BASE CNF size
('clauses: ', 31509)
('vars: ', 13910)
CORE LOOP
(' N_FIELDS >= ', 625)
(' N_FIELDS >= ', 624)
(' SOL found: ', 624)
[[3 2 2 2 2 1 1 1 1 1 1 1 1 2 2 1 1 1 1 1 1 1 1 2 2]
[3 2 2 2 2 1 1 1 1 1 1 1 1 2 2 2 2 2 2 1 1 1 1 2 2]
[3 3 3 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 1 1 1 1 2 2]
[2 2 3 1 1 1 1 1 1 1 1 2 2 2 2 1 1 1 1 2 2 2 2 2 2]
[2 2 3 3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 2 2 2 2 2 2]
[1 1 1 1 3 2 2 2 2 2 2 2 2 2 2 2 2 2 2 1 1 1 1 2 2]
[1 1 1 1 3 3 3 2 2 1 1 1 1 2 2 2 2 2 2 2 2 1 1 1 1]
[2 2 1 1 1 1 3 2 2 2 2 2 2 2 2 1 1 1 1 2 2 2 2 2 2]
[2 2 2 2 2 2 3 3 3 2 2 2 2 1 1 1 1 2 2 2 2 2 2 2 2]
[2 2 2 2 2 2 2 2 3 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2]
[2 2 1 1 1 1 2 2 3 3 3 2 2 2 2 2 2 1 1 1 1 2 2 2 2]
[1 1 1 1 1 1 1 1 2 2 3 2 2 1 1 1 1 1 1 1 1 1 1 1 1]
[2 2 3 1 1 1 1 3 2 2 3 3 4 1 1 1 1 2 2 1 1 1 1 2 2]
[2 2 3 1 1 1 1 3 1 1 1 1 4 4 3 2 2 2 2 1 1 1 1 2 2]
[2 2 3 3 5 5 5 3 3 1 1 1 1 4 3 2 2 1 1 1 1 1 1 1 1]
[2 2 2 2 4 5 1 1 1 1 1 1 1 1 3 3 3 2 2 1 1 1 1 2 2]
[2 2 2 2 4 4 2 2 1 1 1 1 1 1 1 1 3 2 2 1 1 1 1 2 2]
[2 2 2 2 3 4 2 2 2 2 2 2 1 1 1 1 3 3 3 2 2 2 2 2 2]
[3 4 2 2 3 5 5 5 2 2 2 2 1 1 1 1 2 2 3 2 2 2 2 2 2]
[3 4 4 3 3 3 5 5 5 5 1 1 1 1 2 2 2 2 3 3 3 2 2 2 2]
[3 3 4 3 1 1 1 1 5 1 1 1 1 4 2 2 2 2 2 2 3 2 2 2 2]
[2 2 3 3 3 1 1 1 1 1 1 1 1 4 4 4 2 2 2 2 3 3 0 2 2]
[2 2 3 1 1 1 1 1 1 1 1 5 5 5 4 4 4 1 1 1 1 2 2 2 2]
[2 2 3 3 1 1 1 1 1 1 1 1 5 5 5 5 4 1 1 1 1 2 2 2 2]
[2 2 1 1 1 1 1 1 1 1 1 1 1 1 5 1 1 1 1 1 1 1 1 2 2]]
Output plot
One field cannot be covered in this parameterization!
Some other examples with a bigger set of patterns
Square M=N=61 (prime -> intuition: harder) where the base-CNF has 450,723 clauses and 185,462 variables. There is an optimal packing!
Non-square M,N =83,131 (double prime) where the base-CNF has 1,346,511 clauses and 553,748 variables. There is an optimal packing!
One approach could be using integer programming. I'll implement this using the python pulp package, though packages are available for pretty much any programming language.
The basic idea is to define a decision variable for every possible placement location for every tile. If a decision variable takes value 1, then its associated tile is placed there. If it takes value 0, then it is not placed there. The objective is therefore to maximize the sum of the decision variables times the number of squares in the variable's tile --- this corresponds to placing the maximum number of squares possible on the board.
My code implements two constraints:
Each tile can only be placed once (below we will relax this constraint)
Each square can have at most one tile on it
Here's the code, followed by its output for a set of five fixed tetrominoes on a 4x5 grid (set rows = 4 and cols = 5 in the listing to reproduce that output):
import itertools
import pulp
import string

def covered(tile, base):
    return {(base[0] + t[0], base[1] + t[1]): True for t in tile}

tiles = [[(0,0), (1,0), (0,1), (0,2)],
         [(0,0), (1,0), (2,0), (3,0)],
         [(1,0), (0,1), (1,1), (2,0)],
         [(0,0), (1,0), (0,1), (1,1)],
         [(1,0), (0,1), (1,1), (2,1)]]
rows = 25
cols = 25
squares = {x: True for x in itertools.product(range(rows), range(cols))}
vars = list(itertools.product(range(rows), range(cols), range(len(tiles))))
vars = [x for x in vars if all([y in squares for y in covered(tiles[x[2]], (x[0], x[1])).keys()])]
x = pulp.LpVariable.dicts('tiles', vars, lowBound=0, upBound=1, cat=pulp.LpInteger)
mod = pulp.LpProblem('polyominoes', pulp.LpMaximize)

# Objective value is number of squares in tile
mod += sum([len(tiles[p[2]]) * x[p] for p in vars])

# Don't use any shape more than once
for tnum in range(len(tiles)):
    mod += sum([x[p] for p in vars if p[2] == tnum]) <= 1

# Each square can be covered by at most one shape
for s in squares:
    mod += sum([x[p] for p in vars if s in covered(tiles[p[2]], (p[0], p[1]))]) <= 1

# Solve and output
mod.solve()
out = [['-'] * cols for rep in range(rows)]
chars = string.ascii_uppercase + string.ascii_lowercase
numset = 0
for p in vars:
    if x[p].value() == 1.0:
        for off in tiles[p[2]]:
            out[p[0] + off[0]][p[1] + off[1]] = chars[numset]
        numset += 1
for row in out:
    print(''.join(row))
It obtains the following optimal solution:
AAAB-
A-BBC
DDBCC
DD--C
If we allow repeats (comment out the constraint limiting to one copy of each shape), then we can completely tile the grid:
ABCDD
ABCDD
ABCEE
ABCEE
It worked near-instantaneously for a 10x10 grid:
ABCCDDEEFF
ABCCDDEEFF
ABGHHIJJKK
ABGHHIJJKK
LLGMMINOPP
LLGMMINOPP
QQRRSTNOUV
QQRRSTNOUV
WWXXSTYYUV
WWXXSTYYUV
The code obtains an optimal solution for the 25x25 grid in 100 seconds of runtime, though unfortunately there aren't enough letters and numbers for my output code to print the solution.
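The 0/1 model described in this answer can be sanity-checked on very small boards without a solver. The sketch below is only an exhaustive search over the same decision variables (one per tile placement), not an integer program, and it will not scale beyond toy sizes. The function name `best_packing` and the `allow_repeats` flag are invented for this illustration; tiles are lists of (row, col) offsets as in the listing above.

```python
def best_packing(tiles, rows, cols, allow_repeats):
    """Return the maximum number of squares coverable by non-overlapping placements."""
    placements = []
    for t_ind, tile in enumerate(tiles):
        for r in range(rows):
            for c in range(cols):
                cells = frozenset((r + dr, c + dc) for dr, dc in tile)
                if all(0 <= x < rows and 0 <= y < cols for x, y in cells):
                    placements.append((t_ind, cells))

    best = 0

    def search(start, used_cells, used_tiles):
        nonlocal best
        best = max(best, len(used_cells))
        for i in range(start, len(placements)):
            t_ind, cells = placements[i]
            if cells & used_cells:
                continue  # "each square covered at most once"
            if not allow_repeats and t_ind in used_tiles:
                continue  # "each tile used at most once"
            search(i + 1, used_cells | cells, used_tiles | {t_ind})

    search(0, frozenset(), frozenset())
    return best

domino = [(0, 0), (0, 1)]  # two squares side by side in one row
print(best_packing([domino], 2, 2, allow_repeats=True))
print(best_packing([domino], 2, 2, allow_repeats=False))
```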
I don't know if it's of any use to you, but I coded up a small, sketchy framework in Python. It doesn't place polyominoes yet, but the functions are there. The check for dead empty spaces is primitive, though, and needs a better approach. Then again, maybe it is all rubbish...
import functools
import itertools

M = 4 # x
N = 5 # y

# Grid with a border of 9999 sentinels; every row has N+2 entries
field = [[9999]*(N+2)]+[[9999]+[0]*N+[9999] for _ in range(M)]+[[9999]*(N+2)]

def field_rd(p2d):
    return field[p2d[0]+1][p2d[1]+1]

def field_add(p2d,val):
    field[p2d[0]+1][p2d[1]+1] += val

def add2d(p,k):
    return p[0]+k[0],p[1]+k[1]

def norm(polymino_2d):
    x0,y0 = min(x for x,y in polymino_2d),min(y for x,y in polymino_2d)
    return tuple(sorted(map(lambda p: add2d(p,(-x0,-y0)), polymino_2d)))

def create_cutoff(occupied):
    """Receive a polymino and create the outer area of squares which could be cut off by a placement of this polymino"""
    cutoff = set(itertools.chain.from_iterable(map(lambda p: add2d(p,(x,y)),occupied) for (x,y) in [(-1,0),(1,0),(0,-1),(0,1)])) #(-1,-1),(-1,0),(-1,1),(0,1),(1,1),(1,0),(1,-1)]))
    return tuple(cutoff.difference(occupied))

def is_occupied(p2d):
    # empty cells hold 0; borders and placed pieces are non-zero
    return field_rd(p2d) != 0

def is_cutoff(p2d):
    return not is_occupied(p2d) and all(map(is_occupied,map(lambda p: add2d(p,p2d),[(-1,0),(1,0),(0,-1),(0,1)])))

def polym_colliding(p2d,occupied):
    return any(map(is_occupied,map(lambda p: add2d(p,p2d),occupied)))

def polym_cutoff(p2d,cutoff):
    return any(map(is_cutoff,map(lambda p: add2d(p,p2d),cutoff)))

def put(p2d,occupied,polym_nr):
    for p in occupied:
        field_add(add2d(p2d,p),polym_nr)

def remove(p2d,occupied,polym_nr):
    for p in occupied:
        field_add(add2d(p2d,p),-polym_nr)

def place(p2d,polym_nr):
    """Try to place a polymino at point p2d. If it fits without cutting off unreachable single cells return True else False"""
    # polym is not defined in this sketch; it is expected to hold (occupied, cutoff) pairs
    occupied = polym[polym_nr][0]
    if polym_colliding(p2d,occupied):
        return False
    put(p2d,occupied,polym_nr)
    cutoff = polym[polym_nr][1]
    if polym_cutoff(p2d,cutoff):
        remove(p2d,occupied,polym_nr)
        return False
    return True

def NxM_array(N,M):
    return [[0]*N for _ in range(M)]

def generate_all_polyminos(n):
    """Create all polyminos with size n"""
    def gen_recur(polymino,i,result):
        if i > 1:
            new_pts = set(itertools.starmap(add2d,itertools.product(polymino,[(-1,0),(1,0),(0,-1),(0,1)])))
            new_pts = new_pts.difference(polymino)
            for p in new_pts:
                gen_recur(polymino.union({p}),i-1,result)
        else:
            result.add(norm(polymino))
    #---------------------------------------
    all_polyminos = set()
    gen_recur({(0,0)},n,all_polyminos)
    return all_polyminos

print("All possible Tetris blocks (all orientations): ",generate_all_polyminos(4))

Find number of shortest paths in an unweighted, undirected graph using adjacency-matrix

I am trying to write an algorithm that takes an adjacency-matrix A and gives me the number of shortest paths between all pairs of nodes from length 1 to 20. For example, if there are 4 nodes that have a direct neighbor and 2 nodes that are connected by a shortest path of length two (and no paths longer than 2) the algorithm should return a vector [4 2 0 ... ].
My idea is to use the fact that A^N gives the number of paths of length N between nodes (strictly speaking walks, since vertices may repeat).
Here is how my code looks so far:
function separations=FindShortestPaths(A)
#Save the original A to exponentiate
B = A;
#C(j,k) will be non-zero if there is a shorter path between nodes j and k
C = sparse(zeros(size(A)));
n = size(A,1);
#the vector in which i save the number of paths
separations = zeros(20,1);
for i = 1:20
#D(j,k) shows how many paths of length i there are from j to k
#if there is a shorter path it will be saved in C,
#so every index j,k that is non-zero in C will be zero in D.
D = A;
D(find(C)) = 0;
#Afterwards the diagonal of D is set to zero so that paths from nodes to themselves are not counted.
D(1:n+1:n*n) = 0;
#The number of remaining non-zero elements in D is the number of shortest paths of length i
separations(i) = size(find(D),1);
#C is updated by adding the matrix of length-i paths.
C = C + A;
#A is potentiated and now A(j,k) gives the number of length i+1 paths from j to k
A = A*B;
endfor
endfunction
I tried it with some smaller graphs of length 5 to 8 and it worked (each pair is counted twice though) but when I tried it with a larger graph it gave me odd numbers for some lengths, which can't be right as every pair is counted twice.
Edit:
Here is an example of one of the graphs I tried and with which it worked: Graph
The adjacency Matrix of this graph is
[010000
101010
010101
001000
010000
001000]
And the number of shortest paths of length n from any Node is:
n: 1 2 3 4 5 6
A: 1 2 2 0 0 0
B: 3 2 0 0 0 0
C: 3 2 0 0 0 0
D: 1 2 2 0 0 0
E: 1 2 2 0 0 0
F: 1 2 2 0 0 0
Total: 10 12 8 0 0 0
So the algorithm should (and in this case does) return a vector where the first 3 elements are 10 12 8 and the rest are zeros. However, with this bigger matrix it doesn't work.
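As a cross-check for the expected numbers, here is a hedged Python sketch of the same matrix-power idea. It uses exact Python integers, so entries of A^N cannot lose precision as they grow, and it tracks already-reached pairs in a boolean matrix. Whether precision is what goes wrong with the bigger matrix is not something the question establishes, so treat this only as a reference implementation to compare against. The function name is invented for the example, and ordered pairs are counted (each unordered pair twice), as in the table above.

```python
def shortest_path_length_histogram(adj, max_len):
    """histogram[d-1] = number of ordered pairs (i, j), i != j, at shortest distance d."""
    n = len(adj)
    seen = [[i == j for j in range(n)] for i in range(n)]  # pairs already reached
    power = [row[:] for row in adj]                        # adj raised to the current length
    histogram = []
    for _ in range(max_len):
        new_pairs = 0
        for i in range(n):
            for j in range(n):
                if power[i][j] and not seen[i][j]:
                    seen[i][j] = True
                    new_pairs += 1
        histogram.append(new_pairs)
        # multiply by adj once more to get walks that are one step longer
        power = [[sum(power[i][k] * adj[k][j] for k in range(n)) for j in range(n)]
                 for i in range(n)]
    return histogram

example = [[0, 1, 0, 0, 0, 0],
           [1, 0, 1, 0, 1, 0],
           [0, 1, 0, 1, 0, 1],
           [0, 0, 1, 0, 0, 0],
           [0, 1, 0, 0, 0, 0],
           [0, 0, 1, 0, 0, 0]]
print(shortest_path_length_histogram(example, 6))
```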

All possible N choose K WITHOUT recursion

I'm trying to create a function that is able to go through a row vector and output the possible combinations of an n choose k without recursion.
For example: 3 choose 2 on [a,b,c] outputs [a,b; a,c; b,c]
I found this: How to loop through all the combinations of e.g. 48 choose 5 which shows how to do it for a fixed n choose k and this: https://codereview.stackexchange.com/questions/7001/generating-all-combinations-of-an-array which shows how to get all possible combinations. Using the latter code, I managed to make a very simple and inefficient function in matlab which returned the result:
function [ combi ] = NCK(x,k)
%x - row vector of inputs
%k - number of elements in the combinations
combi = [];
letLen = 2^length(x);
for i = 0:letLen-1
temp=[0];
a=1;
for j=0:length(x)-1
if (bitand(i,2^j))
temp(a) = x(j+1); % append the selected element at position a
a=a+1;
end
end
if (nnz(temp) == k)
combi=[combi; temp];
end
end
combi = sortrows(combi);
end
This works well for very small vectors, but I need this to be able to work with vectors of at least 50 in length. I've found many examples of how to do this recursively, but is there an efficient way to do this without recursion and still be able to do variable sized vectors and ks?
Here's a simple function that will take a permutation of k ones and n-k zeros and return the next combination of nchoosek. It's completely independent of the values of n and k, taking the values directly from the input array.
function [nextc] = nextComb(oldc)
nextc = [];
o = find(oldc, 1); %// find the first one
z = find(~oldc(o+1:end), 1) + o; %// find the first zero *after* the first one
if length(z) > 0
nextc = oldc;
nextc(1:z-1) = 0;
nextc(z) = 1; %// make the first zero a one
nextc(1:nnz(oldc(1:z-2))) = 1; %// move previous ones to the beginning
else
nextc = zeros(size(oldc));
nextc(1:nnz(oldc)) = 1; %// start over
end
end
(Note that the else clause is only necessary if you want the combinations to wrap around from the last combination to the first.)
If you call this function with, for example:
A = [1 1 1 1 1 0 1 0 0 1 1]
nextCombination = nextComb(A)
the output will be:
A =
1 1 1 1 1 0 1 0 0 1 1
nextCombination =
1 1 1 1 0 1 1 0 0 1 1
You can then use this as a mask into your alphabet (or whatever elements you want combinations of).
C = ['a' 'b' 'c' 'd' 'e' 'f' 'g' 'h' 'i' 'j' 'k']
C(find(nextCombination))
ans = abcdegjk
The first combination in this ordering is
1 1 1 1 1 1 1 1 0 0 0
and the last is
0 0 0 1 1 1 1 1 1 1 1
To generate the first combination programmatically,
n = 11; k = 8;
nextCombination = zeros(1,n);
nextCombination(1:k) = 1;
Now you can iterate through the combinations (or however many you're willing to wait for):
for c = 2:nchoosek(n,k) %// start from 2; we already have 1
nextCombination = nextComb(nextCombination);
%// do something with the combination...
end
For your example above:
nextCombination = [1 1 0];
C(find(nextCombination))
for c = 2:nchoosek(3,2)
nextCombination = nextComb(nextCombination);
C(find(nextCombination))
end
ans = ab
ans = ac
ans = bc
Note: I've updated the code; I had forgotten to include the line to move all of the 1's that occur prior to the swapped digits to the beginning of the array. The current code (in addition to being corrected above) is on ideone here. Output for 4 choose 2 is:
allCombs =
1 2
1 3
2 3
1 4
2 4
3 4
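For readers who prefer Python, the following is a sketch of a port of the mask-stepping function above, plus a small driver that applies each mask to a sequence. The names `next_comb` and `all_combinations` are chosen for this illustration, and the driver assumes 1 <= k <= len(items). Indices are 0-based, so the MATLAB slice oldc(1:z-2) becomes old[:zero - 1].

```python
def next_comb(old):
    """Return the next 0/1 mask with the same number of ones (wraps around at the end)."""
    n = len(old)
    first_one = old.index(1)  # raises ValueError if the mask contains no 1
    zero = next((i for i in range(first_one + 1, n) if old[i] == 0), None)
    if zero is None:
        # no zero after the first one: this was the last mask, so start over
        k = sum(old)
        return [1] * k + [0] * (n - k)
    ones_to_reset = sum(old[:zero - 1])  # ones before the one that moves right
    new = old[:]
    new[:zero] = [0] * zero
    new[zero] = 1                        # the first zero becomes a one
    new[:ones_to_reset] = [1] * ones_to_reset
    return new

def all_combinations(items, k):
    """Collect every k-element selection of items by stepping through the masks."""
    first = [1] * k + [0] * (len(items) - k)
    mask, result = first, []
    while True:
        result.append([item for item, bit in zip(items, mask) if bit])
        mask = next_comb(mask)
        if mask == first:
            return result

print(all_combinations(["a", "b", "c"], 2))
```

Only one mask is held in memory at a time if the driver is rewritten as a generator, which matters for inputs of length 50.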

MATLAB identify adjacent regions in 3D image

I have a 3D image, divided into contiguous regions where each voxel has the same value. The value assigned to this region is unique to the region and serves as a label. The example image below describes the 2D case:
Im =
1 1 1 1 2 2 2
1 1 1 2 2 2 3
1 4 1 2 2 3 3
4 4 4 4 3 3 3
4 4 4 4 3 3 3
I want to create a graph describing adjacency between these regions. In the above case, this would be:
A =
0 1 0 1
1 0 1 1
0 1 0 1
1 1 1 0
I'm looking for a speedy solution to do this for large 3D images in MATLAB. I came up with a solution that iterates over all regions, which takes 0.05s per iteration. Unfortunately, this will take over half an hour for an image with 32'000 regions. Does anybody know a more elegant way of doing this? I'm posting the current algorithm below:
labels = unique(Im); % assuming labels go continuously from 1 to N
A = zeros(numel(labels)); % one row and column per label
for ii = labels(:)' % iterate over a row vector so ii is a scalar
% border mask to find neighbourhood
dil = imdilate( Im==ii, ones(3,3,3) );
border = dil - (Im==ii);
neighLabels = unique( Im(border>0) );
A(ii,neighLabels) = 1;
end
imdilate is the bottleneck I would like to avoid.
Thank you for your help!
I came up with a solution which is a combination of Divakar's and teng's answers, as well as my own modifications and I generalised it to the 2D or 3D case.
To make it more efficient, I should probably pre-allocate the r and c, but in the meantime, this is the runtime:
For a 3D image of dimension 117x159x126 and 32000 separate regions: 0.79s
For the above 2D example: 0.004671s with this solution, 0.002136s with Divakar's solution, 0.03995s with teng's solution.
I haven't tried extending the winner (Divakar) to the 3D case, though!
noDims = length(size(Im));
validim = ones(size(Im))>0;
labels = unique(Im);
if noDims == 3
Im = padarray(Im,[1 1 1],'replicate', 'post');
shifts = {[-1 0 0] [0 -1 0] [0 0 -1]};
elseif noDims == 2
Im = padarray(Im,[1 1],'replicate', 'post');
shifts = {[-1 0] [0 -1]};
end
% get value of the neighbors for each pixel
% by shifting the image in each direction
r=[]; c=[];
for i = 1:numel(shifts)
tmp = circshift(Im,shifts{i});
r = [r ; Im(validim)];
c = [c ; tmp(validim)];
end
A = sparse(r,c,ones(size(r)), numel(labels), numel(labels) );
% make symmetric, delete diagonal
A = (A+A')>0;
A(1:size(A,1)+1:end)=0;
Thanks for the help!
Try this out -
Im = padarray(Im,[1 1],'replicate');
labels = unique(Im);
box1 = [-size(Im,1)-1 -size(Im,1) -size(Im,1)+1 -1 1 size(Im,1)-1 size(Im,1) size(Im,1)+1];
mat1 = NaN(numel(labels),numel(labels));
for k2=1:numel(labels)
a1 = find(Im==k2);
for k1=1:numel(labels)
a2 = find(Im==k1);
t1 = bsxfun(@plus,a1,box1);
t2 = bsxfun(@eq,t1,permute(a2,[3 2 1]));
mat1(k2,k1) = any(t2(:));
end
end
mat1(1:size(mat1,1)+1:end)=0;
If it works for you, share with us the runtimes as comparison? Would love to see if the coffee brews any faster than half an hour!
Below is my attempt.
Im = [1 1 1 1 2 2 2;
1 1 1 2 2 2 3;
1 4 1 2 2 3 3;
4 4 4 4 3 3 3;
4 4 4 4 3 3 3];
% mark the borders
validim = zeros(size(Im));
validim(2:end-1,2:end-1) = 1;
% get value of the 4-neighbors for each pixel
% by shifting the images 4 times in each direction
numNeighbors = 4;
adj = zeros([prod(size(Im)),numNeighbors]);
shifts = {[0 1] [0 -1] [1 0] [-1 0]};
for i = 1:numNeighbors
tmp = circshift(Im,shifts{i});
tmp(validim == 0) = nan;
adj(:,i) = tmp(:);
end
% mark neighbors where it does not eq Im
imDuplicates = repmat(Im(:),[1 numNeighbors]);
nonequals = adj ~= imDuplicates;
% neglect the border
nonequals(isnan(adj)) = 0;
% get these neighbor values and the corresponding Im value
compared = [imDuplicates(nonequals == 1) adj(nonequals == 1)];
% construct your 'A' % possibly could be more optimized here.
labels = unique(Im);
A = zeros(numel(labels));
for i = 1:size(compared,1)
A(compared(i,1),compared(i,2)) = 1;
end
@Lisa
Your reasoning is elegant, though it gives wrong answers for labels on the edges.
Try this simple label matrix:
Im =
1 2 2
3 3 3
3 4 4
The resulting adjacency matrix, according to your code, is:
A =
0 1 1 0
1 0 1 1
1 1 0 1
0 1 1 0
which claims an adjacency between labels "2" and "4": obviously wrong. This happens simply because you are reading padded Im labels based on "validim" indices, which now doesn't match the new Im and goes all the way down to the lower borders.
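One way to sidestep this wrap-around problem is to compare each cell only with neighbours that are actually inside the image. The sketch below is written in Python rather than MATLAB, handles the 2D case with face neighbours only (no diagonals), and returns the set of adjacent label pairs instead of a matrix. The name `label_adjacency` is invented for this illustration. A 3D version would add a third loop and a third offset.

```python
def label_adjacency(im):
    """Return the set of (smaller_label, larger_label) pairs that share an edge."""
    rows, cols = len(im), len(im[0])
    pairs = set()
    for r in range(rows):
        for c in range(cols):
            # looking only right and down visits every neighbouring pair once
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < rows and cc < cols and im[r][c] != im[rr][cc]:
                    a, b = im[r][c], im[rr][cc]
                    pairs.add((min(a, b), max(a, b)))
    return pairs

edge_case = [[1, 2, 2],
             [3, 3, 3],
             [3, 4, 4]]
print(sorted(label_adjacency(edge_case)))
```

On the small label matrix above, the pair (2, 4) does not appear in the result.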

Print (or output to file) table of number of steps for Euclid's algorithm

I'd like to print (or send to a file in a human-readable format like below) arbitrary size square tables where each table cell contains the number of steps required to solve Euclid's algorithm for the two integers in the row/column headings like this (table written by hand, but I think the numbers are all correct):
1 2 3 4 5 6
1 1 1 1 1 1 1
2 1 1 2 1 2 1
3 1 2 1 2 3 1
4 1 1 2 1 2 2
5 1 2 3 2 1 2
6 1 1 1 2 2 1
The script would ideally allow me to choose the start integer (1 as above or 11 as below or something else arbitrary) and end integer (6 as above or 16 as below or something else arbitrary and larger than the start integer), so that I could do this too:
11 12 13 14 15 16
11 1 2 3 4 4 3
12 2 1 2 2 2 2
13 3 2 1 2 3 3
14 4 2 2 1 2 2
15 4 2 3 2 1 2
16 3 2 3 2 2 1
I realize that the table is symmetric about the diagonal and so only half of the table contains unique information, and that the diagonal itself is always a 1-step algorithm.
See this for a graphical representation of what I'm after, but I'd like to know the actual number of steps for any two integers, which the image doesn't show me.
I have the algorithms (there's probably better implementations, but I think these work):
The step counter:
def gcd(a,b):
    """Step counter."""
    if b > a:
        x = a
        a = b
        b = x
    counter = 0
    while b:
        c = a % b
        a = b
        b = c
        counter += 1
    return counter
The list builder:
def gcd_steps(n):
    """List builder."""
    print("Table of size", n - 1, "x", n - 1)
    list_of_steps = []
    for i in range(1, n):
        for j in range(1, n):
            list_of_steps.append(gcd(i,j))
    print(list_of_steps)
    return list_of_steps
but I'm totally hung up on how to write the table. I thought about a double nested for loop with i and j and stuff, but I'm new to Python and haven't a clue about the best way (or any way) to go about writing the table. I don't need special formatting like something to offset the row/column heads from the table cells as I can do that by eye, but just getting everything to line up so that I can read it easily is proving too difficult for me at my current skill level, I'm afraid. I'm thinking that it probably makes sense to print/output within the two nested for loops as I'm calculating the numbers I need which is why the list builder has some print statements as well as returning the list, but I don't know how to work the print magic to do what I'm after.
Try this. The program computes the data row by row and prints each row as soon as it is available, in order to limit memory usage.
import sys, os

def gcd(a,b):
    k = 0
    if b > a:
        a, b = b, a
    while b > 0:
        a, b = b, a%b
        k += 1
    return k

def printgcd(name, a, b):
    f = open(name, "wt")
    s = ""
    for i in range(a, b + 1):
        s = "{}\t{}".format(s, i)
    f.write("{}\n".format(s))
    for i in range(a, b + 1):
        s = "{}".format(i)
        for j in range (a, b + 1):
            s = "{}\t{}".format(s, gcd(i, j))
        f.write("{}\n".format(s))
    f.close()

printgcd("gcd-1-6.txt", 1, 6)
The preceding won't return a list with all computed values, since they are destroyed on purpose. It's easy to do however. Here is a solution with a hash table
def printgcd2(name, a, b):
    f = open(name, "wt")
    s = ""
    h = { }
    for i in range(a, b + 1):
        s = "{}\t{}".format(s, i)
    f.write("{}\n".format(s))
    for i in range(a, b + 1):
        s = "{}".format(i)
        for j in range (a, b + 1):
            k = gcd(i, j)
            s = "{}\t{}".format(s, k)
            h[i, j] = k
        f.write("{}\n".format(s))
    f.close()
    return h
And here is another with a list of lists
def printgcd3(name, a, b):
    f = open(name, "wt")
    s = ""
    u = [ ]
    for i in range(a, b + 1):
        s = "{}\t{}".format(s, i)
    f.write("{}\n".format(s))
    for i in range(a, b + 1):
        v = [ ]
        s = "{}".format(i)
        for j in range (a, b + 1):
            k = gcd(i, j)
            s = "{}\t{}".format(s, k)
            v.append(k)
        f.write("{}\n".format(s))
        u.append(v)
    f.close()
    return u
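If the table is meant to be read on screen rather than opened in a spreadsheet, fixed-width columns may line up more reliably than tabs. The following is a sketch along those lines. The function names `euclid_steps` and `steps_table` are made up for this example, and the column width is simply one more than the number of digits of the largest heading.

```python
def euclid_steps(a, b):
    """Number of modulo steps Euclid's algorithm needs for a and b."""
    if b > a:
        a, b = b, a
    steps = 0
    while b:
        a, b = b, a % b
        steps += 1
    return steps

def steps_table(start, end):
    """Return the table as a list of right-aligned text lines, header row first."""
    width = len(str(end)) + 1
    values = range(start, end + 1)
    lines = [" " * width + "".join("{:>{w}}".format(v, w=width) for v in values)]
    for i in values:
        cells = "".join("{:>{w}}".format(euclid_steps(i, j), w=width) for j in values)
        lines.append("{:>{w}}".format(i, w=width) + cells)
    return lines

print("\n".join(steps_table(1, 6)))
```

Writing the same lines to a file only needs the list joined with newlines and passed to a file object's write method.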

Resources