Optimal filling of grid figure with squares - algorithm

recently I have designed a puzzle for children to solve. However I would like to now the optimal solution.
The problem is as follows: You have this figure made up off small squares
You have to fill it in with larger squares and it is scored with the following table:
| Square Size | 1x1 | 2x2 | 3x3 | 4x4 | 5x5 | 6x6 | 7x7 | 8x8 |
|-------------|-----|-----|-----|-----|-----|-----|-----|-----|
| Points | 0 | 4 | 10 | 20 | 35 | 60 | 84 | 120 |
There are simply to many possible solutions to check them all. Some other people suggested dynamic programming, but I don't know how to divide the figure in smaller ones which put together have the same optimal solution.
I would like to find a way to find the optimal solutions to these kinds of problems in reasonable time (like a couple of days max on a regular desktop). The highest score found so far with a guessing algorithm and some manual work is 1112.
Solutions to similar problems with combining sub-problems are also appreciated.
I don't need all the code written out. An outline or idea for an algorithm would be enough.
Note: The biggest square that can fit is 8x8 so scores for bigger squares are not included.
[[1,1,0,0,0,1,0,0,0,0,0,0,1,1,1,1,1,1,0,0,1,1,1,1,1,0,0,1,1,1],
[1,1,0,0,0,0,0,0,0,0,0,0,1,1,1,0,0,0,0,0,0,0,0,1,1,0,0,0,1,1],
[1,0,0,0,0,0,0,0,0,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,1],
[0,0,0,1,1,0,0,0,0,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0],
[0,0,0,0,1,1,0,0,0,0,1,0,0,0,0,0,1,1,1,1,0,0,0,0,0,0,0,0,0,0],
[0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,0,0,0,0,0,0,1,1,1],
[0,0,0,0,0,0,0,0,1,1,0,0,0,0,1,1,1,1,1,1,1,1,0,0,0,0,1,1,1,1],
[1,0,0,0,0,0,0,1,1,1,1,0,0,0,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,1],
[1,1,0,0,0,0,0,1,1,1,1,0,0,0,1,1,1,1,1,1,1,1,0,0,1,0,0,0,0,1],
[1,1,1,0,0,0,0,1,1,1,1,1,0,0,1,1,1,1,1,1,1,0,0,0,1,1,1,0,0,0],
[0,1,1,1,0,0,0,1,1,1,1,1,0,0,0,0,1,1,1,0,0,0,0,1,1,1,1,0,0,0],
[0,0,1,1,1,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0],
[0,0,0,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0],
[0,0,0,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1],
[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1],
[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,0,0,0,0,0,0,0,0,0,0,1],
[0,0,0,1,1,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,1],
[0,0,1,1,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0],
[0,1,1,1,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0],
[1,1,1,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0],
[1,1,1,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0],
[1,1,1,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0],
[1,1,1,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0],
[1,1,1,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0],
[0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,0,0,0,0,0,0,0,0,0,0,0],
[0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1],
[0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1],
[0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1],
[0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1],
[0,0,0,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1],
[0,0,0,0,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1],
[0,0,0,0,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1],
[0,0,0,0,0,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1],
[0,0,0,0,0,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1],
[1,1,1,0,0,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1],
[1,1,1,0,0,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1],
[1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1],
[1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1],
[1,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1],
[1,1,0,0,0,1,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,1,0,0,1,1,1,1,1],
[1,0,0,0,0,1,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,1,0,0,1,1,1,1,1],
[1,0,0,0,0,1,0,0,0,1,1,1,0,0,0,0,0,0,0,0,0,1,1,0,0,1,1,1,1,1],
[1,0,0,0,0,1,0,0,0,1,1,1,0,0,0,0,0,0,0,0,1,1,1,0,0,1,1,1,1,1],
[0,0,0,0,0,1,0,0,0,1,1,1,1,1,1,0,0,0,0,1,1,1,1,0,0,0,1,1,1,1],
[0,0,0,0,0,1,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,1,1,1,1],
[0,0,0,0,0,1,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,1,1,1,1]];

Here is a quite general prototype using Mixed-integer-programming which solves your instance optimally (i obtained the value of 1112 like you deduced yourself) and might solve others too.
In general, your problem is np-complete and this makes it hard (there are some instances which will be trouble).
While i suspect that SAT-solver and CP-solver based approaches might be more powerful (because of the combinatoric nature; i even was surprised that MIP works here), the MIP-approach has also some advantages:
MIP-solvers are complete (as SAT and CP; but many random-based heuristics are not):
There are many commercial-grade solvers available if needed
The formulation is quite easy (especially compared to SAT; SAT-formulations will need advanced at most k out of n-formulations (for scoring-formulations) which are growing sub-quadratic (the naive approach grows exponentially)! They do exist, but are non-trivial)
The optimization-objective is just natural (SAT and CP would need iterative-refining = solve with some lower-bound; increment bound and re-solve)
MIP-solvers can also be quite powerful to obtain approximations of the optimal solution and also provide some proven bounds (e.g. optimum lower than x)
The following code is implemented in python using common scientific tools available (all of these are open-source). It allows setting the tile-range (e.g. adding 9x9 tiles) and different cost-functions. The comments should be enough to understand the ideas. It will use some (probably the best) open-source MIP-solver, but can also use commercial ones (outcommented line shows usage).
Code
import numpy as np
import itertools
from collections import defaultdict
import matplotlib.pyplot as plt # visualization only
import seaborn as sns # ""
from pulp import * # MIP-modelling & solver
""" INSTANCE """
instance = np.asarray([[1,1,0,0,0,1,0,0,0,0,0,0,1,1,1,1,1,1,0,0,1,1,1,1,1,0,0,1,1,1],
[1,1,0,0,0,0,0,0,0,0,0,0,1,1,1,0,0,0,0,0,0,0,0,1,1,0,0,0,1,1],
[1,0,0,0,0,0,0,0,0,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,1],
[0,0,0,1,1,0,0,0,0,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0],
[0,0,0,0,1,1,0,0,0,0,1,0,0,0,0,0,1,1,1,1,0,0,0,0,0,0,0,0,0,0],
[0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,0,0,0,0,0,0,1,1,1],
[0,0,0,0,0,0,0,0,1,1,0,0,0,0,1,1,1,1,1,1,1,1,0,0,0,0,1,1,1,1],
[1,0,0,0,0,0,0,1,1,1,1,0,0,0,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,1],
[1,1,0,0,0,0,0,1,1,1,1,0,0,0,1,1,1,1,1,1,1,1,0,0,1,0,0,0,0,1],
[1,1,1,0,0,0,0,1,1,1,1,1,0,0,1,1,1,1,1,1,1,0,0,0,1,1,1,0,0,0],
[0,1,1,1,0,0,0,1,1,1,1,1,0,0,0,0,1,1,1,0,0,0,0,1,1,1,1,0,0,0],
[0,0,1,1,1,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0],
[0,0,0,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0],
[0,0,0,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1],
[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1],
[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,0,0,0,0,0,0,0,0,0,0,1],
[0,0,0,1,1,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,1],
[0,0,1,1,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0],
[0,1,1,1,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0],
[1,1,1,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0],
[1,1,1,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0],
[1,1,1,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0],
[1,1,1,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0],
[1,1,1,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0],
[0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,0,0,0,0,0,0,0,0,0,0,0],
[0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1],
[0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1],
[0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1],
[0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1],
[0,0,0,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1],
[0,0,0,0,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1],
[0,0,0,0,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1],
[0,0,0,0,0,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1],
[0,0,0,0,0,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1],
[1,1,1,0,0,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1],
[1,1,1,0,0,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1],
[1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1],
[1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1],
[1,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1],
[1,1,0,0,0,1,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,1,0,0,1,1,1,1,1],
[1,0,0,0,0,1,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,1,0,0,1,1,1,1,1],
[1,0,0,0,0,1,0,0,0,1,1,1,0,0,0,0,0,0,0,0,0,1,1,0,0,1,1,1,1,1],
[1,0,0,0,0,1,0,0,0,1,1,1,0,0,0,0,0,0,0,0,1,1,1,0,0,1,1,1,1,1],
[0,0,0,0,0,1,0,0,0,1,1,1,1,1,1,0,0,0,0,1,1,1,1,0,0,0,1,1,1,1],
[0,0,0,0,0,1,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,1,1,1,1],
[0,0,0,0,0,1,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,1,1,1,1]], dtype=bool)
def plot_compare(instance, solution, subgrids):
f, (ax1, ax2) = plt.subplots(2, sharex=True, sharey=True)
sns.heatmap(instance, ax=ax1, cbar=False, annot=True)
sns.heatmap(solution, ax=ax2, cbar=False, annot=True)
plt.show()
""" PARAMETERS """
SUBGRIDS = 8 # 1x1 - 8x8
SUGBRID_SCORES = {1:0, 2:4, 3:10, 4:20, 5:35, 6:60, 7:84, 8:120}
N, M = instance.shape # free / to-fill = zeros!
""" HELPER FUNCTIONS """
def get_square_covered_indices(instance, pos_x, pos_y, sg):
""" Calculate all covered tiles when given a top-left position & size
-> returns the base-index too! """
N, M = instance.shape
neighbor_indices = []
valid = True
for sX in range(sg):
for sY in range(sg):
if pos_x + sX < N:
if pos_y + sY < M:
if instance[pos_x + sX, pos_y + sY] == 0:
neighbor_indices.append((pos_x + sX, pos_y + sY))
else:
valid = False
break
else:
valid = False
break
else:
valid = False
break
return valid, neighbor_indices
def preprocessing(instance, SUBGRIDS):
""" Calculate all valid placement / tile-selection combinations """
placements = {}
index2placement = {}
placement2index = {}
placement2type = {}
type2placement = defaultdict(list)
cover2index = defaultdict(list) # cell covered by placement-index
index_gen = itertools.count()
for sg in range(1, SUBGRIDS+1): # sg = subgrid size
for pos_x in range(N):
for pos_y in range(M):
if instance[pos_x, pos_y] == 0: # free
feasible, covering = get_square_covered_indices(instance, pos_x, pos_y, sg)
if feasible:
new_index = next(index_gen)
placements[(sg, pos_x, pos_y)] = covering
index2placement[new_index] = (sg, pos_x, pos_y)
placement2index[(sg, pos_x, pos_y)] = new_index
placement2type[new_index] = sg
type2placement[sg].append(new_index)
cover2index[(pos_x, pos_y)].append(new_index)
return placements, index2placement, placement2index, placement2type, type2placement, cover2index
def calculate_collisions(placements, index2placement):
""" Calculate collisions between tile-placements (position + tile-selection)
-> only upper triangle is used: a < b! """
n_p = len(placements)
coll_mat = np.zeros((n_p, n_p), dtype=bool) # only upper triangle is used
for pA in range(n_p):
for pB in range(n_p):
if pA < pB:
covered_A = placements[index2placement[pA]]
covered_B = placements[index2placement[pB]]
if len(set(covered_A).intersection(set(covered_B))) > 0:
coll_mat[pA, pB] = True
return coll_mat
""" PREPROCESSING """
placements, index2placement, placement2index, placement2type, type2placement, cover2index = preprocessing(instance, SUBGRIDS)
N_P = len(placements)
coll_mat = calculate_collisions(placements, index2placement)
""" MIP-MODEL """
prob = LpProblem("GridFill", LpMaximize)
# Variables
X = np.empty(N_P, dtype=object)
for x in range(N_P):
X[x] = LpVariable('x'+str(x), 0, 1, cat='Binary')
# Objective
placement_scores = [SUGBRID_SCORES[index2placement[p][0]] for p in range(N_P)]
prob += lpDot(placement_scores, X), "Score"
# Constraints
# C1: Forbid collisions of placements
for a in range(N_P):
for b in range(N_P):
if a < b: # symmetry-reduction
if coll_mat[a, b]:
prob += X[a] + X[b] <= 1 # not both!
""" SOLVE """
print('solve')
#prob.solve(GUROBI()) # much faster commercial solver; if available
prob.solve(PULP_CBC_CMD(msg=1, presolve=True, cuts=True))
print("Status:", LpStatus[prob.status])
""" INTERPRET AND COMPLETE SOLUTION """
solution = np.zeros((N, M), dtype=int)
for x in range(N_P):
if X[x].value() > 0.99:
sg, pos_x, pos_y = index2placement[x]
_, positions = get_square_covered_indices(instance, pos_x, pos_y, sg)
for pos in positions:
solution[pos[0], pos[1]] = sg
fill_with_ones = np.logical_and((solution == 0), (instance == 0))
solution[fill_with_ones] = 1
""" VISUALIZE """
plot_compare(instance, solution, SUBGRIDS)
Assumptions / Nature of algorithm
There are no constraints describing the need for every free cell to be covered
This works when there are not negative scores
A positive score will be placed if it improves the objective
A zero-score (like your example) might keep some cells free, but these are proven to be 1's then (added after optimizing)
Performance
This is a good example of the discrepancy between open-source and commercial solvers. The two solvers tried were cbc and Gurobi.
cbc example output (just some final parts)
Result - Optimal solution found
Objective value: 1112.00000000
Enumerated nodes: 0
Total iterations: 307854
Time (CPU seconds): 2621.19
Time (Wallclock seconds): 2627.82
Option for printingOptions changed from normal to all
Total time (CPU seconds): 2621.57 (Wallclock seconds): 2628.24
Needed: ~45 mins
Gurobi example output
Explored 0 nodes (7004 simplex iterations) in 5.30 seconds
Thread count was 4 (of 4 available processors)
Optimal solution found (tolerance 1.00e-04)
Best objective 1.112000000000e+03, best bound 1.112000000000e+03, gap 0.0%
Needed: 6 seconds
General remarks about solver-performance
Gurobi should have much more functionality recognizing the nature of the problem and using appropriate hyper-parameters internally
I also think there are some SAT-based approaches used internally (as one of the core-developers wrote his dissertation mostly about combining these very different algorithmic techniques)
There are much better heuristics used, which could provide non-optimal solutions fast (which will help the steps after)
Example output: optimal solution with score 1112 (click to enlarge)

It is possible to reformulate problem into another NP-hard problem :-)
Create weighted graph where vertices are all possible squares that can be put on the board with weights regarding size, and edges are between intersecting squares. There is no need to represent squares 1x1 since there weight is zero.
E.g. for simple empty board 3x3, there are:
- 5 vertices: one 3x3 and four 2x2,
- 7 edges: four between 3x3 square and each 2x2 square, and six between each pair of 2x2 squares.
Now problem is to find maximum weight independent set.
I am not experienced with the topic, but from Wikipedia description it seems that there could exist fast enough algorithm. This graph is not in one of classes with known polynomial time algorithm, but it is quite close to P5-free graph. It seems to me that only possibility to have P5 in this graph is between 2x2 squares, which means to have stripe of width 2 of length 5. There is one in lower left corner. These regions can be covered (removed) before finding independent set with loosing none or very little to optimal solution.

(This is not meant to be a full answer; I'm just sharing what I'm working on so that we can collaborate.)
I think a good first step is to transform the binary grid by giving every cell the value of the maximum size of square that the cell can be the top-left corner of, like this:
0,0,3,2,1,0,3,2,2,2,2,1,0,0,0,0,0,0,2,1,0,0,0,0,0,2,1,0,0,0
0,0,2,2,2,3,3,2,1,1,1,1,0,0,0,3,3,3,3,3,3,2,1,0,0,1,2,1,0,0
0,2,1,1,1,2,3,2,1,0,0,0,0,3,2,2,2,2,2,2,3,3,2,1,0,0,3,2,1,0
3,2,1,0,0,1,3,2,1,0,0,0,3,2,2,1,1,1,1,1,2,3,3,2,1,0,2,2,2,1
3,3,2,1,0,0,2,2,2,1,0,3,2,2,1,1,0,0,0,0,1,2,4,3,2,2,1,1,1,1
2,3,3,2,1,0,2,1,1,1,2,3,2,1,1,0,0,0,0,0,0,1,3,3,2,1,1,0,0,0
1,2,3,4,3,2,1,1,0,0,1,3,2,1,0,0,0,0,0,0,0,0,2,2,2,1,0,0,0,0
0,1,2,3,3,2,1,0,0,0,0,2,2,1,0,0,0,0,0,0,0,0,2,1,1,2,2,2,1,0
0,0,1,2,3,2,1,0,0,0,0,1,2,1,0,0,0,0,0,0,0,0,2,1,0,1,1,2,1,0
0,0,0,1,2,2,1,0,0,0,0,0,2,1,0,0,0,0,0,0,0,2,1,1,0,0,0,3,2,1
1,0,0,0,1,2,1,0,0,0,0,0,4,3,2,1,0,0,0,4,3,2,1,0,0,0,0,2,2,1
2,1,0,0,0,1,2,1,0,0,5,5,4,4,4,4,4,4,4,5,5,4,3,2,1,0,0,1,2,1
3,2,1,0,0,0,1,6,6,5,4,4,4,3,3,3,3,3,3,4,4,5,4,3,2,1,0,0,1,1
3,2,1,0,0,0,0,6,5,5,4,3,3,3,2,2,2,2,2,3,3,4,5,4,3,2,1,0,0,0
3,2,2,2,2,7,6,6,5,4,4,3,2,2,2,1,1,1,1,2,2,3,5,5,4,3,2,1,0,0
2,2,1,1,1,7,6,5,5,4,3,3,2,1,1,1,0,0,0,1,1,2,4,6,5,4,3,2,1,0
2,1,1,0,0,7,6,5,4,4,3,2,2,1,0,0,0,0,0,0,0,1,3,6,5,4,3,2,1,0
1,1,0,0,8,7,6,5,4,3,3,2,1,1,0,0,0,0,0,0,0,0,2,7,6,5,4,3,2,1
1,0,0,0,8,7,6,5,4,3,2,2,1,0,0,0,0,0,0,0,0,0,1,7,6,5,4,3,2,1
0,0,0,7,8,7,6,5,4,3,2,1,1,0,0,0,0,0,0,0,0,0,0,6,6,5,4,3,2,1
0,0,0,6,8,7,6,5,4,3,2,1,0,0,0,0,0,0,0,0,0,0,0,6,5,5,4,3,2,1
0,0,0,5,7,7,6,5,4,3,2,1,0,0,0,0,0,0,0,0,0,0,0,6,5,4,4,3,2,1
0,0,0,4,6,7,7,6,5,4,3,2,1,0,0,0,0,0,0,0,0,0,6,5,5,4,3,3,2,1
0,0,0,3,5,6,7,7,6,5,4,3,2,1,0,0,0,0,0,0,0,6,6,5,4,4,3,2,2,1
1,0,0,2,4,5,6,7,8,7,6,5,4,3,2,1,0,0,0,7,6,6,5,5,4,3,3,2,1,1
1,0,0,1,3,4,5,6,7,7,8,8,8,8,8,8,7,7,6,6,6,5,5,4,4,3,2,2,1,0
2,1,0,0,2,3,4,5,6,6,7,7,8,7,7,7,7,6,6,5,5,5,4,4,3,3,2,1,1,0
2,1,0,0,1,2,3,4,5,5,6,6,8,7,6,6,6,6,5,5,4,4,4,3,3,2,2,1,0,0
3,2,1,0,0,1,2,3,4,4,5,5,8,7,6,5,5,5,5,4,4,3,3,3,2,2,1,1,0,0
3,2,1,0,0,0,1,2,3,3,4,4,8,7,6,5,4,4,4,4,3,3,2,2,2,1,1,0,0,0
4,3,2,1,0,0,0,1,2,2,3,3,8,7,6,5,4,3,3,3,3,2,2,1,1,1,0,0,0,0
3,3,2,1,0,0,0,0,1,1,2,2,8,7,6,5,4,3,2,2,2,2,1,1,0,0,0,0,0,0
2,2,2,2,1,0,0,0,0,0,1,1,8,7,6,5,4,3,2,1,1,1,1,0,0,0,0,0,0,0
1,1,1,2,1,0,0,0,0,0,0,0,8,7,6,5,4,3,2,1,0,0,0,0,0,0,0,0,0,0
0,0,0,2,1,0,0,0,0,0,0,0,8,8,7,6,5,4,3,2,1,0,0,0,0,0,0,0,0,0
0,0,0,2,1,0,0,0,0,0,0,6,8,7,7,6,6,5,4,3,2,1,0,0,0,0,0,0,0,0
0,0,0,2,2,2,3,3,3,3,3,5,7,7,6,6,5,5,4,3,3,3,3,2,1,0,0,0,0,0
0,0,3,2,1,1,3,2,2,2,2,4,6,6,6,5,5,4,4,3,2,2,2,2,1,0,0,0,0,0
0,0,3,2,1,0,3,2,1,1,1,3,5,5,5,5,4,4,3,3,2,1,1,2,1,0,0,0,0,0
0,0,3,2,1,0,3,2,1,0,0,2,4,4,4,4,4,3,3,2,2,1,0,2,1,0,0,0,0,0
0,4,3,2,1,0,3,2,1,0,0,1,3,3,3,4,3,3,2,2,1,1,0,2,1,0,0,0,0,0
0,4,3,2,1,0,3,2,1,0,0,0,2,2,2,3,3,2,2,1,1,0,0,2,1,0,0,0,0,0
0,4,3,2,1,0,3,2,1,0,0,0,1,1,1,2,2,2,1,1,0,0,0,2,1,0,0,0,0,0
3,3,3,2,1,0,3,2,1,0,0,0,0,0,0,1,1,1,1,0,0,0,0,3,2,1,0,0,0,0
2,2,2,2,1,0,2,2,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,2,1,0,0,0,0
1,1,1,1,1,0,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,0,0,0,0
If you wanted to go through every option using brute force, you'd try every size of square that a cell could be the corner of (including 1x1), mark the square with zeros, adjust the values of the cells up to 7 places left/above the square, and recurse with the new grid.
If you iterated over the cells top-to-bottom and left-to-right, you'd only have to copy the grid starting from the current row to the bottom row, and you'd only have to adjust the values of cells up to 7 places to the left of the square.
The JS code I tested this with is fast for the top 2 or 3 rows of the grid (result: 24 and 44), takes 8 seconds to finish the top 4 rows (result: 70), and 30 minutes for 5 rows (result: 86). I'm not trying 6 rows.
But, as you can see from this grid, the number of possibilities is so huge that brute force will never be an option. On the other hand, trying something like adding large squares first, and then filling up the leftover space with smaller squares, is never going to guarantee the optimal result, I fear. It's too easy to come up with examples that would thwart such a strategy.
7,6,5,4,3,2,1,0,0,0,0,0,0,7,6,5,4,3,2,1
6,6,5,4,3,2,1,0,0,0,0,0,0,6,6,5,4,3,2,1
5,5,5,4,3,2,1,0,0,0,0,0,0,5,5,5,4,3,2,1
4,4,4,4,3,2,1,0,0,0,0,0,0,4,4,4,4,3,2,1
3,3,3,3,3,2,1,0,0,0,0,0,0,3,3,3,3,3,2,1
2,2,2,2,2,2,1,0,0,0,0,0,0,2,2,2,2,2,2,1
1,1,1,1,1,1,8,7,6,5,4,3,2,1,1,1,1,1,1,1
0,0,0,0,0,0,7,7,6,5,4,3,2,1,0,0,0,0,0,0
0,0,0,0,0,0,6,6,6,5,4,3,2,1,0,0,0,0,0,0
0,0,0,0,0,0,5,5,5,5,4,3,2,1,0,0,0,0,0,0
0,0,0,0,0,0,4,4,4,4,4,3,2,1,0,0,0,0,0,0
0,0,0,0,0,0,3,3,3,3,3,3,2,1,0,0,0,0,0,0
0,0,0,0,0,0,2,2,2,2,2,2,2,1,0,0,0,0,0,0
7,6,5,4,3,2,1,1,1,1,1,1,1,7,6,5,4,3,2,1
6,6,5,4,3,2,1,0,0,0,0,0,0,6,6,5,4,3,2,1
5,5,5,4,3,2,1,0,0,0,0,0,0,5,5,5,4,3,2,1
4,4,4,4,3,2,1,0,0,0,0,0,0,4,4,4,4,3,2,1
3,3,3,3,3,2,1,0,0,0,0,0,0,3,3,3,3,3,2,1
2,2,2,2,2,2,1,0,0,0,0,0,0,2,2,2,2,2,2,1
1,1,1,1,1,1,1,0,0,0,0,0,0,1,1,1,1,1,1,1
In the above example, putting an 8x8 square in the center and four 6x6 squares in the corners gives a lower score than putting a 6x6 square in the center and four 7x7 squares in the corners; so a greedy approach based on using the largest square possible will not give the optimal result.
This is how far I got by isolating zones connected by corridors of maximum width 3, and running the brute-force algorithm on the smaller grids. Where the border has no orange zone, adding another 2 cells doesn't increase the score of the isolated zone, so those cells can be used by the main zone unconditionally.

Related

Finding a perfect matching in graphs

I have this question :
Airline company has N different planes and T pilots. Every pilot has a list of planes he can fly. Every flight needs 2 pilots. The company want to have as much flights simultaneously as possible. Find an algorithm that finds if you can have all the flights simultaneously.
This is the solution I thought about is finding max flow on this graph:
I am just not sure what the capacity should be. Can you help me with that?
Great idea to find the max flow.
For each edge from source --> pilot, assign a capacity of 1. Each pilot can only fly one plane at a time since they are running simultaneously.
For each edge from pilot --> plane, assign a capacity of 1. If this edge is filled with flow of 1, it represents that the given pilot is flying that plane.
For each edge from plane --> sink, assign a capacity of 2. This represents that each plane must be supplied by exactly 2 pilots.
Now, find a maximum flow. If the resulting maximum flow is two times the number of planes, then it's possible to satisfy the constraints. In this case, the edges between planes and pilots that are at capacity represent the matching.
The other answer is fine but you don't really need to involve flow as this can be reduced just as well to ordinary maximum bipartite matching:
For each plane, add another auxiliary plane to the plane partition with edges to the same pilots as the first plane.
Find a maximum bipartite matching M.
The answer is now true if and only if M = 2 N.
If you like, you can think of this as saying that each plane needs a pilot and a co-pilot, and the two vertices associated to each plane now represents those two roles.
The reduction to maximum bipartite matching is linear time, so using e.g. the Hopcroft–Karp algorithm to find the matching, you can solve the problem in O(|E| √|V|) where E is the number of edges between the partitions, and V = T + N.
In practice, the improvement over using a maximum flow based approach should depend on the quality of your implementations as well as the particular choice of representation of the graph, but chances are that you're better off this way.
Implementation example
To illustrate the last point, let's give an idea of how the two reductions could look in practice. One representation of a graph that's often useful due to its built-in memory locality is that of a CSR matrix, so let us assume that the input is such a matrix, whose rows correspond to the planes, and whose columns correspond to the pilots.
We will use the Python library SciPy which comes with algorithms for both maximum bipartite matching and maximum flow, and which works with CSR matrix representations for graphs under the hood.
In the algorithm given above, we will then need to construct the biadjacency matrix of the graph with the additional vertices added. This is nothing but the result of stacking the input matrix on top of itself, which is straightforward to phrase in terms of the CSR data structures: Following Wikipedia's notation, COL_INDEX should just be repeated, and ROW_INDEX should be replaced with ROW_INDEX concatenated with a copy of ROW_INDEX in which all elements are increased by the final element of ROW_INDEX.
In SciPy, a complete implementation which answers yes or no to the problem in OP would look as follows:
import numpy as np
from scipy.sparse.csgraph import maximum_bipartite_matching
def reduce_to_max_matching(a):
i, j = a.shape
data = np.ones(a.nnz * 2, dtype=bool)
indices = np.concatenate([a.indices, a.indices])
indptr = np.concatenate([a.indptr, a.indptr[1:] + a.indptr[-1]])
graph = csr_matrix((data, indices, indptr), shape=(2*i, j))
return (maximum_bipartite_matching(graph) != -1).sum() == 2 * i
In the maximum flow approach given by #HeatherGuarnera's answer, we will need to set up the full adjacency matrix of the new graph. This is also relatively straightforward; the input matrix will appear as a certain submatrix of the adjacency matrix, and we need to add a row for the source vertex and a column for the target. The example section of the documentation for SciPy's max flow solver actually contains an illustration of what this looks like in practice. Adopting this, a complete solution looks as follows:
import numpy as np
from scipy.sparse.csgraph import maximum_flow
def reduce_to_max_flow(a):
i, j = a.shape
n = a.nnz
data = np.concatenate([2*np.ones(i, dtype=int), np.ones(n + j, dtype=int)])
indices = np.concatenate([np.arange(1, i + 1),
a.indices + i + 1,
np.repeat(i + j + 1, j)])
indptr = np.concatenate([[0],
a.indptr + i,
np.arange(n + i + 1, n + i + j + 1),
[n + i + j]])
graph = csr_matrix((data, indices, indptr), shape=(2+i+j, 2+i+j))
flow = maximum_flow(graph, 0, graph.shape[0]-1)
return flow.flow_value == 2*i
Let us compare the timings of the two approaches on a single example consisting of 40 planes and 100 pilots, on a graph whose edge density is 0.1:
from scipy.sparse import random
inp = random(40, 100, density=.1, format='csr', dtype=bool)
%timeit reduce_to_max_matching(inp) # 191 µs ± 3.57 µs per loop
%timeit reduce_to_max_flow(inp) # 1.29 ms ± 20.1 µs per loop
The matching-based approach is faster, but not by a crazy amount. On larger problems, we'll start to see the advantages of using matching instead; with 400 planes and 1000 pilots:
inp = random(400, 1000, density=.1, format='csr', dtype=bool)
%timeit reduce_to_max_matching(inp) # 473 µs ± 5.52 µs per loop
%timeit reduce_to_max_flow(inp) # 68.9 ms ± 555 µs per loop
Again, this exact comparison relies on the use of specific predefined solvers from SciPy and how those are implemented, but if nothing else, this hints that simpler is better.

Conditional sampling of binary vectors (?)

I'm trying to find a name for my problem, so I don't have to re-invent wheel when coding an algorithm which solves it...
I have say 2,000 binary (row) vectors and I need to pick 500 from them. In the picked sample I do column sums and I want my sample to be as close as possible to a pre-defined distribution of the column sums. I'll be working with 20 to 60 columns.
A tiny example:
Out of the vectors:
110
010
011
110
100
I need to pick 2 to get column sums 2, 1, 0. The solution (exact in this case) would be
110
100
My ideas so far
one could maybe call this a binary multidimensional knapsack, but I did not find any algos for that
Linear Programming could help, but I'd need some step by step explanation as I got no experience with it
as exact solution is not always feasible, something like simulated annealing brute force could work well
a hacky way using constraint solvers comes to mind - first set the constraints tight and gradually loosen them until some solution is found - given that CSP should be much faster than ILP...?
My concrete, practical (if the approximation guarantee works out for you) suggestion would be to apply the maximum entropy method (in Chapter 7 of Boyd and Vandenberghe's book Convex Optimization; you can probably find several implementations with your favorite search engine) to find the maximum entropy probability distribution on row indexes such that (1) no row index is more likely than 1/500 (2) the expected value of the row vector chosen is 1/500th of the predefined distribution. Given this distribution, choose each row independently with probability 500 times its distribution likelihood, which will give you 500 rows on average. If you need exactly 500, repeat until you get exactly 500 (shouldn't take too many tries due to concentration bounds).
Firstly I will make some assumptions regarding this problem:
Regardless whether the column sum of the selected solution is over or under the target, it weighs the same.
The sum of the first, second, and third column are equally weighted in the solution (i.e. If there's a solution whereas the first column sum is off by 1, and another where the third column sum is off by 1, the solution are equally good).
The closest problem I can think of this problem is the Subset sum problem, which itself can be thought of a special case of Knapsack problem.
However both of these problem are NP-Complete. This means there are no polynomial time algorithm that can solve them, even though it is easy to verify the solution.
If I were you the two most arguably efficient solution of this problem are linear programming and machine learning.
Depending on how many columns you are optimising in this problem, with linear programming you can control how much finely tuned you want the solution, in exchange of time. You should read up on this, because this is fairly simple and efficient.
With Machine learning, you need a lot of data sets (the set of vectors and the set of solutions). You don't even need to specify what you want, a lot of machine learning algorithms can generally deduce what you want them to optimise based on your data set.
Both solution has pros and cons, you should decide which one to use yourself based on the circumstances and problem set.
This definitely can be modeled as (integer!) linear program (many problems can). Once you have it, you can use a program such as lpsolve to solve it.
We model vector i is selected as x_i which can be 0 or 1.
Then for each column c, we have a constraint:
sum of all (x_i * value of i in column c) = target for column c
Taking your example, in lp_solve this could look like:
min: ;
+x1 +x4 +x5 >= 2;
+x1 +x4 +x5 <= 2;
+x1 +x2 +x3 +x4 <= 1;
+x1 +x2 +x3 +x4 >= 1;
+x3 <= 0;
+x3 >= 0;
bin x1, x2, x3, x4, x5;
If you are fine with a heuristic based search approach, here is one.
Go over the list and find the minimum squared sum of the digit wise difference between each bit string and the goal. For example, if we are looking for 2, 1, 0, and we are scoring 0, 1, 0, we would do it in the following way:
Take the digit wise difference:
2, 0, 1
Square the digit wise difference:
4, 0, 1
Sum:
5
As a side note, squaring the difference when scoring is a common method when doing heuristic search. In your case, it makes sense because bit strings that have a 1 in as the first digit are a lot more interesting to us. In your case this simple algorithm would pick first 110, then 100, which would is the best solution.
In any case, there are some optimizations that could be made to this, I will post them here if this kind of approach is what you are looking for, but this is the core of the algorithm.
You have a given target binary vector. You want to select M vectors out of N that have the closest sum to the target. Let's say you use the eucilidean distance to measure if a selection is better than another.
If you want an exact sum, have a look at the k-sum problem which is a generalization of the 3SUM problem. The problem is harder than the subset sum problem, because you want an exact number of elements to add to a target value. There is a solution in O(N^(M/2)). lg N), but that means more than 2000^250 * 7.6 > 10^826 operations in your case (in the favorable case where vectors operations have a cost of 1).
First conclusion: do not try to get an exact result unless your vectors have some characteristics that may reduce the complexity.
Here's a hill climbing approach:
sort the vectors by number of 1's: 111... first, 000... last;
use the polynomial time approximate algorithm for the subset sum;
you have an approximate solution with K elements. Because of the order of elements (the big ones come first), K should be a little as possible:
if K >= M, you take the M first vectors of the solution and that's probably near the best you can do.
if K < M, you can remove the first vector and try to replace it with 2 or more vectors from the rest of the N vectors, using the same technique, until you have M vectors. To sumarize: split the big vectors into smaller ones until you reach the correct number of vectors.
Here's a proof of concept with numbers, in Python:
import random
def distance(x, y):
return abs(x-y)
def show(ls):
if len(ls) < 10:
return str(ls)
else:
return ", ".join(map(str, ls[:5]+("...",)+ls[-5:]))
def find(is_xs, target):
# see https://en.wikipedia.org/wiki/Subset_sum_problem#Pseudo-polynomial_time_dynamic_programming_solution
S = [(0, ())] # we store indices along with values to get the path
for i, x in is_xs:
T = [(x + t, js + (i,)) for t, js in S]
U = sorted(S + T)
y, ks = U[0]
S = [(y, ks)]
for z, ls in U:
if z == target: # use the euclidean distance here if you want an approximation
return ls
if z != y and z < target:
y, ks = z, ls
S.append((z, ls))
ls = S[-1][1] # take the closest element to target
return ls
N = 2000
M = 500
target = 1000
xs = [random.randint(0, 10) for _ in range(N)]
print ("Take {} numbers out of {} to make a sum of {}", M, xs, target)
xs = sorted(xs, reverse = True)
is_xs = list(enumerate(xs))
print ("Sorted numbers: {}".format(show(tuple(is_xs))))
ls = find(is_xs, target)
print("FIRST TRY: {} elements ({}) -> {}".format(len(ls), show(ls), sum(x for i, x in is_xs if i in ls)))
splits = 0
while len(ls) < M:
first_x = xs[ls[0]]
js_ys = [(i, x) for i, x in is_xs if i not in ls and x != first_x]
replace = find(js_ys, first_x)
splits += 1
if len(replace) < 2 or len(replace) + len(ls) - 1 > M or sum(xs[i] for i in replace) != first_x:
print("Give up: can't replace {}.\nAdd the lowest elements.")
ls += tuple([i for i, x in is_xs if i not in ls][len(ls)-M:])
break
print ("Replace {} (={}) by {} (={})".format(ls[:1], first_x, replace, sum(xs[i] for i in replace)))
ls = tuple(sorted(ls[1:] + replace)) # use a heap?
print("{} elements ({}) -> {}".format(len(ls), show(ls), sum(x for i, x in is_xs if i in ls)))
print("AFTER {} splits, {} -> {}".format(splits, ls, sum(x for i, x in is_xs if i in ls)))
The result is obviously not guaranteed to be optimal.
Remarks:
Complexity: find has a polynomial time complexity (see the Wikipedia page) and is called at most M^2 times, hence the complexity remains polynomial. In practice, the process is reasonably fast (split calls have a small target).
Vectors: to ensure that you reach the target with the minimum of elements, you can improve the order of element. Your target is (t_1, ..., t_c): if you sort the t_js from max to min, you get the more importants columns first. You can sort the vectors: by number of 1s and then by the presence of a 1 in the most important columns. E.g. target = 4 8 6 => 1 1 1 > 0 1 1 > 1 1 0 > 1 0 1 > 0 1 0 > 0 0 1 > 1 0 0 > 0 0 0.
find (Vectors) if the current sum exceed the target in all the columns, then you're not connecting to the target (any vector you add to the current sum will bring you farther from the target): don't add the sum to S (z >= target case for numbers).
I propose a simple ad hoc algorithm, which, broadly speaking, is a kind of gradient descent algorithm. It seems to work relatively well for input vectors which have a distribution of 1s “similar” to the target sum vector, and probably also for all “nice” input vectors, as defined in a comment of yours. The solution is not exact, but the approximation seems good.
The distance between the sum vector of the output vectors and the target vector is taken to be Euclidean. To minimize it means minimizing the sum of the square differences off sum vector and target vector (the square root is not needed because it is monotonic). The algorithm does not guarantee to yield the sample that minimizes the distance from the target, but anyway makes a serious attempt at doing so, by always moving in some locally optimal direction.
The algorithm can be split into 3 parts.
First of all the first M candidate output vectors out of the N input vectors (e.g., N=2000, M=500) are put in a list, and the remaining vectors are put in another.
Then "approximately optimal" swaps between vectors in the two lists are done, until either the distance would not decrease any more, or a predefined maximum number of iterations is reached. An approximately optimal swap is one where removing the first vector from the list of output vectors causes a maximal decrease or minimal increase of the distance, and then, after the removal of the first vector, adding the second vector to the same list causes a maximal decrease of the distance. The whole swap is avoided if the net result is not a decrease of the distance.
Then, as a last phase, "optimal" swaps are done, again stopping on no decrease in distance or maximum number of iterations reached. Optimal swaps cause a maximal decrease of the distance, without requiring the removal of the first vector to be optimal in itself. To find an optimal swap all vector pairs have to be checked. This phase is much more expensive, being O(M(N-M)), while the previous "approximate" phase is O(M+(N-M))=O(N). Luckily, when entering this phase, most of the work has already been done by the previous phase.
from typing import List, Tuple
def get_sample(vects: List[Tuple[int]], target: Tuple[int], n_out: int,
max_approx_swaps: int = None, max_optimal_swaps: int = None,
verbose: bool = False) -> List[Tuple[int]]:
"""
Get a sample of the input vectors having a sum close to the target vector.
Closeness is measured in Euclidean metrics. The output is not guaranteed to be
optimal (minimum square distance from target), but a serious attempt is made.
The max_* parameters can be used to avoid too long execution times,
tune them to your needs by setting verbose to True, or leave them None (∞).
:param vects: the list of vectors (tuples) with the same number of "columns"
:param target: the target vector, with the same number of "columns"
:param n_out: the requested sample size
:param max_approx_swaps: the max number of approximately optimal vector swaps,
None means unlimited (default: None)
:param max_optimal_swaps: the max number of optimal vector swaps,
None means unlimited (default: None)
:param verbose: print some info if True (default: False)
:return: the sample of n_out vectors having a sum close to the target vector
"""
def square_distance(v1, v2):
return sum((e1 - e2) ** 2 for e1, e2 in zip(v1, v2))
n_vec = len(vects)
assert n_vec > 0
assert n_out > 0
n_rem = n_vec - n_out
assert n_rem > 0
output = vects[:n_out]
remain = vects[n_out:]
n_col = len(vects[0])
assert n_col == len(target) > 0
sumvect = (0,) * n_col
for outvect in output:
sumvect = tuple(map(int.__add__, sumvect, outvect))
sqdist = square_distance(sumvect, target)
if verbose:
print(f"sqdist = {sqdist:4} after"
f" picking the first {n_out} vectors out of {n_vec}")
if max_approx_swaps is None:
max_approx_swaps = sqdist
n_approx_swaps = 0
while sqdist and n_approx_swaps < max_approx_swaps:
# find the best vect to subtract (the square distance MAY increase)
sqdist_0 = None
index_0 = None
sumvect_0 = None
for index in range(n_out):
tmp_sumvect = tuple(map(int.__sub__, sumvect, output[index]))
tmp_sqdist = square_distance(tmp_sumvect, target)
if sqdist_0 is None or sqdist_0 > tmp_sqdist:
sqdist_0 = tmp_sqdist
index_0 = index
sumvect_0 = tmp_sumvect
# find the best vect to add,
# but only if there is a net decrease of the square distance
sqdist_1 = sqdist
index_1 = None
sumvect_1 = None
for index in range(n_rem):
tmp_sumvect = tuple(map(int.__add__, sumvect_0, remain[index]))
tmp_sqdist = square_distance(tmp_sumvect, target)
if sqdist_1 > tmp_sqdist:
sqdist_1 = tmp_sqdist
index_1 = index
sumvect_1 = tmp_sumvect
if sumvect_1:
tmp = output[index_0]
output[index_0] = remain[index_1]
remain[index_1] = tmp
sqdist = sqdist_1
sumvect = sumvect_1
n_approx_swaps += 1
else:
break
if verbose:
print(f"sqdist = {sqdist:4} after {n_approx_swaps}"
f" approximately optimal swap{'s'[n_approx_swaps == 1:]}")
diffvect = tuple(map(int.__sub__, sumvect, target))
if max_optimal_swaps is None:
max_optimal_swaps = sqdist
n_optimal_swaps = 0
while sqdist and n_optimal_swaps < max_optimal_swaps:
# find the best pair to swap,
# but only if the square distance decreases
best_sqdist = sqdist
best_diffvect = diffvect
best_pair = None
for i0 in range(M):
tmp_diffvect = tuple(map(int.__sub__, diffvect, output[i0]))
for i1 in range(n_rem):
new_diffvect = tuple(map(int.__add__, tmp_diffvect, remain[i1]))
new_sqdist = sum(d * d for d in new_diffvect)
if best_sqdist > new_sqdist:
best_sqdist = new_sqdist
best_diffvect = new_diffvect
best_pair = (i0, i1)
if best_pair:
tmp = output[best_pair[0]]
output[best_pair[0]] = remain[best_pair[1]]
remain[best_pair[1]] = tmp
sqdist = best_sqdist
diffvect = best_diffvect
n_optimal_swaps += 1
else:
break
if verbose:
print(f"sqdist = {sqdist:4} after {n_optimal_swaps}"
f" optimal swap{'s'[n_optimal_swaps == 1:]}")
return output
from random import randrange
C = 30 # number of columns
N = 2000 # total number of vectors
M = 500 # number of output vectors
F = 0.9 # fill factor of the target sum vector
T = int(M * F) # maximum value + 1 that can be appear in the target sum vector
A = 10000 # maximum number of approximately optimal swaps, may be None (∞)
B = 10 # maximum number of optimal swaps, may be None (unlimited)
target = tuple(randrange(T) for _ in range(C))
vects = [tuple(int(randrange(M) < t) for t in target) for _ in range(N)]
sample = get_sample(vects, target, M, A, B, True)
Typical output:
sqdist = 2639 after picking the first 500 vectors out of 2000
sqdist = 9 after 27 approximately optimal swaps
sqdist = 1 after 4 optimal swaps
P.S.: As it stands, this algorithm is not limited to binary input vectors, integer vectors would work too. Intuitively I suspect that the quality of the optimization could suffer, though. I suspect that this algorithm is more appropriate for binary vectors.
P.P.S.: Execution times with your kind of data are probably acceptable with standard CPython, but get better (like a couple of seconds, almost a factor of 10) with PyPy. To handle bigger sets of data, the algorithm would have to be translated to C or some other language, which should not be difficult at all.

Algorithm for Cutting Patterns

Let's say I have a given length c and I need to cut out several pieces of different lengths a{i}, where i is the index of a specific piece. The length of every piece is smaller or equal to the length c. I need to find all possible permutations of cutting patterns.
Does someone has a smart approach for such tasks or an algorithm to solve this?
The function could look something similar to this:
Pattern[] getPatternList(double.. a, double c);
The input is hence a list of different sizes and the total available space. My goal is to optimize/minimize the trim loss.
I'll use the simplex algorithm for that but to create an linear programming model, I need a smart way to determine all the cutting patterns.
There are exponentially many cutting-patterns in general. So it might not be feasible to construct them all (time and memory)
If you need to optimize some cutting based on some objective, enumerating all possible cuttings is a bad approach (like #harold mentioned)
A bad analogy (which does not exactly apply here as your base-problem is np-hard):
solving 2-SAT is possible in polynomial-time
enumerating all 2-SAT solutions is Sharp-P-complete (an efficient algorithm would imply P=NP, so there might be none!)
A simple approach (to generate all valid cutting-patterns):
Generate all permutations if items = ordering of items (bounded by !n)
Place them one after another and stop if c is exceeded
(It would be a good idea to do this incrementally; build one permutation after another)
Assumption: each item can only be selected once
Assumption: moving/shifting a cut within a free range does not generate a new solution. It it would: solution-space is possibly an uncountably infinite set
edit
Code
Here is a more powerful approach handling the problem with the same assumptions as described above. It uses integer-programming to minimize the trim-loss, implemented in python with the use of cvxpy (and a commercial-solver; can be replaced by an open-source solver like cbc):
import numpy as np
from cvxpy import *
np.random.seed(1)
# random problem
SPACE = 25000
N_ITEMS = 10000
items = np.random.randint(0, 10, size=N_ITEMS)
def minimize_loss(items, space):
N = items.shape[0]
X = Bool(N)
constraint = [sum_entries(mul_elemwise(items, X)) <= space]
objective = Minimize(space - sum_entries(mul_elemwise(items, X)))
problem = Problem(objective, constraint)
problem.solve(solver=GUROBI, verbose=True)
print('trim-loss: ', problem.value)
print('validated trim-loss: ', space - sum(np.dot(X.value.flatten(), items)))
print('# selected items: ', np.count_nonzero(np.round(X.value)))
print('items: ', items)
print('space: ', SPACE)
minimize_loss(items, SPACE)
Output
items: [5 8 9 ..., 5 3 5]
space: 25000
Parameter OutputFlag unchanged
Value: 1 Min: 0 Max: 1 Default: 1
Changed value of parameter QCPDual to 1
Prev: 0 Min: 0 Max: 1 Default: 0
Optimize a model with 1 rows, 10000 columns and 8987 nonzeros
Coefficient statistics:
Matrix range [1e+00, 9e+00]
Objective range [1e+00, 9e+00]
Bounds range [1e+00, 1e+00]
RHS range [2e+04, 2e+04]
Found heuristic solution: objective -25000
Presolve removed 1 rows and 10000 columns
Presolve time: 0.01s
Presolve: All rows and columns removed
Explored 0 nodes (0 simplex iterations) in 0.01 seconds
Thread count was 1 (of 4 available processors)
Optimal solution found (tolerance 1.00e-04)
Best objective -2.500000000000e+04, best bound -2.500000000000e+04, gap 0.0%
trim-loss: 0.0
validated trim-loss: [[ 0.]]
# selected items: 6516
edit v2
After read your new comments, it is clear, that your model-description was incomplete/imprecise and nothing above tackles the problem you want to solve. It's a bit sad.
You will need to enumerate all permutations of a, and then take the longest prefix that has length less than or equal to c.
This sounds like a version of the knapsack problem (https://en.wikipedia.org/wiki/Knapsack_problem), and nobody knows an efficient way to do this.

matlab: optimum amount of points for linear fit

I want to make a linear fit to few data points, as shown on the image. Since I know the intercept (in this case say 0.05), I want to fit only points which are in the linear region with this particular intercept. In this case it will be lets say points 5:22 (but not 22:30).
I'm looking for the simple algorithm to determine this optimal amount of points, based on... hmm, that's the question... R^2? Any Ideas how to do it?
I was thinking about probing R^2 for fits using points 1 to 2:30, 2 to 3:30, and so on, but I don't really know how to enclose it into clear and simple function. For fits with fixed intercept I'm using polyfit0 (http://www.mathworks.com/matlabcentral/fileexchange/272-polyfit0-m) . Thanks for any suggestions!
EDIT:
sample data:
intercept = 0.043;
x = 0.01:0.01:0.3;
y = [0.0530642513911393,0.0600786706929529,0.0673485248329648,0.0794662409166333,0.0895915873196170,0.103837395346484,0.107224784565365,0.120300492775786,0.126318699218730,0.141508831492330,0.147135757370947,0.161734674733680,0.170982455701681,0.191799936622712,0.192312642057298,0.204771365716483,0.222689541632988,0.242582251060963,0.252582727297656,0.267390860166283,0.282890010610515,0.292381165948577,0.307990544720676,0.314264952297699,0.332344368808024,0.355781519885611,0.373277721489254,0.387722683944356,0.413648156978284,0.446500064130389;];
What you have here is a rather difficult problem to find a general solution of.
One approach would be to compute all the slopes/intersects between all consecutive pairs of points, and then do cluster analysis on the intersepts:
slopes = diff(y)./diff(x);
intersepts = y(1:end-1) - slopes.*x(1:end-1);
idx = kmeans(intersepts, 3);
x([idx; 3] == 2) % the points with the intersepts closest to the linear one.
This requires the statistics toolbox (for kmeans). This is the best of all methods I tried, although the range of points found this way might have a few small holes in it; e.g., when the slopes of two points in the start and end range lie close to the slope of the line, these points will be detected as belonging to the line. This (and other factors) will require a bit more post-processing of the solution found this way.
Another approach (which I failed to construct successfully) is to do a linear fit in a loop, each time increasing the range of points from some point in the middle towards both of the endpoints, and see if the sum of the squared error remains small. This I gave up very quickly, because defining what "small" is is very subjective and must be done in some heuristic way.
I tried a more systematic and robust approach of the above:
function test
%% example data
slope = 2;
intercept = 1.5;
x = linspace(0.1, 5, 100).';
y = slope*x + intercept;
y(1:12) = log(x(1:12)) + y(12)-log(x(12));
y(74:100) = y(74:100) + (x(74:100)-x(74)).^8;
y = y + 0.2*randn(size(y));
%% simple algorithm
[X,fn] = fminsearch(#(ii)P(ii, x,y,intercept), [0.5 0.5])
[~,inds] = P(X, y,x,intercept)
end
function [C, inds] = P(ii, x,y,intercept)
% ii represents fraction of range from center to end,
% So ii lies between 0 and 1.
N = numel(x);
n = round(N/2);
ii = round(ii*n);
inds = min(max(1, n+(-ii(1):ii(2))), N);
% Solve linear system with fixed intercept
A = x(inds);
b = y(inds) - intercept;
% and return the sum of squared errors, divided by
% the number of points included in the set. This
% last step is required to prevent fminsearch from
% reducing the set to 1 point (= minimum possible
% squared error).
C = sum(((A\b)*A - b).^2)/numel(inds);
end
which only finds a rough approximation to the desired indices (12 and 74 in this example).
When fminsearch is run a few dozen times with random starting values (really just rand(1,2)), it gets more reliable, but I still wouln't bet my life on it.
If you have the statistics toolbox, use the kmeans option.
Depending on the number of data values, I would split the data into a relative small number of overlapping segments, and for each segment calculate the linear fit, or rather the 1-st order coefficient, (remember you know the intercept, which will be same for all segments).
Then, for each coefficient calculate the MSE between this hypothetical line and entire dataset, choosing the coefficient which yields the smallest MSE.

Distributing points over a surface within boundries

I'm interested in a way (algorithm) of distributing a predefined number of points over a 4 sided surface like a square.
The main issue is that each point has got to have a minimum and maximum proximity to each other (random between two predefined values). Basically the distance of any two points should not be closer than let's say 2, and a further than 3.
My code will be implemented in ruby (the points are locations, the surface is a map), but any ideas or snippets are definitely welcomed as all my ideas include a fair amount of brute force.
Try this paper. It has a nice, intuitive algorithm that does what you need.
In our modelization, we adopted another model: we consider each center to be related to all its neighbours by a repulsive string.
At the beginning of the simulation, the centers are randomly distributed, as well as the strengths of the
strings. We choose randomly to move one center; then we calculate the resulting force caused by all
neighbours of the given center, and we calculate the displacement which is proportional and oriented
in the sense of the resulting force.
After a certain number of iterations (which depends on the number of
centers and the degree of initial randomness) the system becomes stable.
In case it is not clear from the figures, this approach generates uniformly distributed points. You may use instead a force that is zero inside your bounds (between 2 and 3, for example) and non-zero otherwise (repulsive if the points are too close, attractive if too far).
This is my Python implementation (sorry, I don´t know ruby). Just import this and call uniform() to get a list of points.
import numpy as np
from numpy.linalg import norm
import pylab as pl
# find the nearest neighbors (brute force)
def neighbors(x, X, n=10):
dX = X - x
d = dX[:,0]**2 + dX[:,1]**2
idx = np.argsort(d)
return X[idx[1:11]]
# repulsion force, normalized to 1 when d == rmin
def repulsion(neib, x, d, rmin):
if d == 0:
return np.array([1,-1])
return 2*(x - neib)*rmin/(d*(d + rmin))
def attraction(neib, x, d, rmax):
return rmax*(neib - x)/(d**2)
def uniform(n=25, rmin=0.1, rmax=0.15):
# Generate randomly distributed points
X = np.random.random_sample( (n, 2) )
# Constants
# step is how much each point is allowed to move
# set to a lower value when you have more points
step = 1./50.
# maxk is the maximum number of iterations
# if step is too low, then maxk will need to increase
maxk = 100
k = 0
# Force applied to the points
F = np.zeros(X.shape)
# Repeat for maxk iterations or until all forces are zero
maxf = 1.
while maxf > 0 and k < maxk:
maxf = 0
for i in xrange(n):
# Force calculation for the i-th point
x = X[i]
f = np.zeros(x.shape)
# Interact with at most 10 neighbors
Neib = neighbors(x, X, 10)
# dmin is the distance to the nearest neighbor
dmin = norm(Neib[0] - x)
for neib in Neib:
d = norm(neib - x)
if d < rmin:
# feel repulsion from points that are too near
f += repulsion(neib, x, d, rmin)
elif dmin > rmax:
# feel attraction if there are no neighbors closer than rmax
f += attraction(neib, x, d, rmax)
# save all forces and the maximum force to normalize later
F[i] = f
if norm(f) <> 0:
maxf = max(maxf, norm(f))
# update all positions using the forces
if maxf > 0:
X += (F/maxf)*step
k += 1
if k == maxk:
print "warning: iteration limit reached"
return X
I presume that one of your brute force ideas includes just repeatedly generating points at random and checking to see if the constraints happen to be satisified.
Another way is to take a configuration that satisfies the constraints and repeatedly perturb a small part of it, chosen at random - for instance move a single point - to move to a randomly chosen nearby configuration. If you do this often enough you should move to a random configuration that is almost independent of the starting point. This could be justified under http://en.wikipedia.org/wiki/Metropolis%E2%80%93Hastings_algorithm or http://en.wikipedia.org/wiki/Gibbs_sampling.
I might try just doing it at random, then going through and dropping points that are to close to other points. You can compare the square of the distance to save some math time.
Or create cells with borders and place a point in each one. Less random, it depends on if this is a "just for looks thing" or not. But it could be very fast.
I made a compromise and ended up using the Poisson Disk Sampling method.
The result was fairly close to what I needed, especially with a lower number of tries (which also drastically reduces cost).

Resources