How to implement crossover for the distance-constrained vehicle routing problem? - genetic-algorithm

I'm trying to build a genetic algorithm that can solve the distance-constrained vehicle routing problem (DVRP). This is a variant of the Traveling Salesman Problem, and asks the following question:
"What is the optimal set of routes for a fleet of vehicles to traverse in order to deliver to a given set of customers, given that each vehicle has a maximum distance it can travel? Vehicles must all start and stop at the depot."
This secondary constraint has been giving me trouble: when I attempt to solve this the way one would other TSP/VRP problems, the approach no longer applies. The issue seems to stem specifically from mutation/crossover of solutions, as there is no long-term positive trend in distance/fitness.
This is demonstrated by the graph I plotted, which, with distance on the y-axis and generations on the x-axis, shows no discernible non-random trend. Blue represents the average distance of the solutions in each generation, and orange the best in that generation.
I am encoding my solution according to the algorithm described on page 7 of this paper.
As an example, take 10 destinations and 4 vehicles. Numbers 1-10 represent destinations, and 11-14 vehicles (14 is omitted, as it must be at the front). The number 0 represents the depot and is also omitted, but when calculating fitness it is added back in as the first and last stop of every vehicle's route.
With such an encoding in mind, I am using the ordered crossover operator commonly used for VRP problems. It and my mutation operator (simply swapping two elements of the encoding) follow their respective descriptions exactly, but fail to produce results, as shown in the graph described earlier.
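Since the example image is not reproduced here, a made-up chromosome plus a minimal decoder may help illustrate the scheme (the specific numbers below are hypothetical, not the ones from the original image):

# Hypothetical chromosome for 10 destinations (1-10) and 4 vehicles:
# separators 11-13 split the flat list into 4 routes; vehicle 14's
# separator is omitted because its route implicitly starts the list.
chromosome = [3, 1, 7, 11, 5, 2, 12, 9, 4, 10, 13, 6, 8]

def decode_routes(chromosome, n_destinations=10):
    """Split the flat encoding into per-vehicle routes, re-adding depot 0."""
    routes, current = [], [0]      # every route starts at the depot
    for gene in chromosome:
        if gene > n_destinations:  # separator: close the current route
            current.append(0)
            routes.append(current)
            current = [0]
        else:
            current.append(gene)
    current.append(0)              # the last route also ends at the depot
    routes.append(current)
    return routes

print(decode_routes(chromosome))
# [[0, 3, 1, 7, 0], [0, 5, 2, 0], [0, 9, 4, 10, 0], [0, 6, 8, 0]]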
Crossover
def crossover(self, path1, path2):
    # path1 and path2 are lists of integers as described above
    # Step 1: Select a random swath of consecutive alleles from parent 1.
    cut_point_1 = random.randint(0, len(path1) - 1)
    cut_point_2 = random.randint(0, len(path2) - 1)
    if cut_point_1 > cut_point_2:
        cut_point_1, cut_point_2 = cut_point_2, cut_point_1
    # Drop the swath down to child 1
    swath = path1[cut_point_1:cut_point_2]
    child = [None] * len(path1)
    child[cut_point_1:cut_point_2] = swath
    swath_points = [e for e in swath]
    # Mark out these alleles in Parent 2.
    # each point is visited only once
    final_path2 = path2.copy()
    for i in range(len(path2)):
        if path2[i] in swath_points:  # parents share some visited point
            final_path2.remove(path2[i])
    # fill empty spaces in child with remaining alleles from path2
    for i in range(len(child)):
        if child[i] is None:
            child[i] = final_path2.pop(0)
    if random.random() < self.mutation_rate:
        return self.mutate(child)
    return child
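One sanity check worth running on this operator (a sketch, using a standalone version of the method without `self` or the mutation step): every ordered-crossover child should be a permutation of its parents' gene multiset, regardless of the cut points.

import random

def ordered_crossover(path1, path2):
    # standalone copy of the logic above, mutation omitted
    a = random.randint(0, len(path1) - 1)
    b = random.randint(0, len(path1) - 1)
    if a > b:
        a, b = b, a
    swath = path1[a:b]
    child = [None] * len(path1)
    child[a:b] = swath
    rest = [g for g in path2 if g not in swath]
    return [rest.pop(0) if gene is None else gene for gene in child]

random.seed(0)
p1 = list(range(1, 14))  # 10 destinations + 3 vehicle separators
p2 = p1[::-1]
for _ in range(100):
    assert sorted(ordered_crossover(p1, p2)) == sorted(p1)
print("OX always preserves the gene multiset")

If this check passes but fitness still stagnates, the problem is more likely in how children interact with the distance constraint than in the permutation bookkeeping.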
Mutation
def mutate(self, solution):
    """
    Just swaps two random nodes. Keeps doing so until we find a valid solution
    (one where all individual vehicle routes are at most vehicle_max_distance).
    """
    # swap two random elements in solution
    while True:
        i = random.randint(0, len(solution) - 1)
        j = random.randint(0, len(solution) - 1)
        if i != j:
            solution[i], solution[j] = solution[j], solution[i]
            if self.is_valid_solution(solution):
                return solution
            solution[i], solution[j] = (
                solution[j],
                solution[i],
            )  # is not valid, reset
If the implementation is correct, is there any alternative genetic algorithm-based approach for solving this optimization problem?
The entire code can be found here.

Related

Coin change recursive algorithm unwinding

I am trying to unwind the recursive function in this algorithm. Coin change problem: given a target amount n and a list of distinct coin values, what's the fewest coins needed to make the change amount?
def rec_coin(target, coins):
    # Default to target value
    min_coins = target
    # Check to see if we have a single coin match (BASE CASE)
    if target in coins:
        return 1
    else:
        # for every coin value that is <= than target
        for i in [c for c in coins if c <= target]:
            # Recursive Call (add a count coin and subtract from the target)
            num_coins = 1 + rec_coin(target - i, coins)
            # Reset Minimum if we have a new minimum
            if num_coins < min_coins:
                min_coins = num_coins
    return min_coins

# rec_coin(63, [1, 5, 10, 25])
# 6
This is what I come up with after breaking it apart:
1 + (63-1) coins + (62-1) + (61-1) and so on..
Why do we need to add 1? What would be the correct way of unwinding the recursion?
The code you present is very inefficient. For finding the solution for an amount of 63, imagine that it will first recurse down to the target amount in steps of the smallest coin (i.e. 1). Then, after a lot of backtracking and attempts with other coins, it finally backtracks to the outermost level and tries a coin with value 5. Now the recursion kicks in again, just like before, adding coins of value 1. But this intermediate value (63-5) was already visited before (5 levels deep after picking coin 1), and it took a lot of function calls to get results for that value 58. Yet the algorithm just ignores that and does all of that work again.
A common solution to this is dynamic programming, i.e. memoizing earlier found solutions so they can be reused without extra work.
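For illustration, a minimal top-down variant of the original function with memoization might look like this (a sketch; `lru_cache` requires the coin container to be hashable, hence the tuple):

from functools import lru_cache

def rec_coin_memo(target, coins):
    coins = tuple(coins)  # hashable, so results can be cached

    @lru_cache(maxsize=None)
    def go(t):
        if t in coins:
            return 1
        # fewest coins over all coins that fit, reusing cached sub-results
        return 1 + min(go(t - c) for c in coins if c <= t)

    return go(target)

print(rec_coin_memo(63, [1, 5, 10, 25]))  # 6

This keeps the original recursive shape but visits each intermediate amount only once.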
I will present here a bottom-up method: it first checks all amounts that can be achieved with just one coin. These amounts are put in a queue. If the target is among them then the answer is 1. If not, all amounts in the queue are processed by adding all possible coins to each of them. Sometimes a value will be found that was already visited before, and in that case it is not put in the next queue, but otherwise it is. If now the target value is in that queue, you know that target can be reached with just 2 coins.
This process continues in a loop, which is in fact just a Breadth-first-search in a tree where amounts are nodes, and edges represent that one amount can be reached from another by adding one coin to it. The search starts in the node that represents an amount of 0.
Here is the code for it:
def rec_coin(target, coins):
    visited = set()  # Amounts that we have already achieved with a minimal number of coins
    amounts = [0]    # The latest series of amounts all using an equal number of coins
    for min_coins in range(1, target + 1):
        next_amounts = []
        for amount in amounts:
            for coin in coins:
                added = amount + coin
                if added == target:
                    return min_coins
                if added not in visited:
                    visited.add(added)
                    next_amounts.append(added)
        amounts = next_amounts

print(rec_coin(63, [1, 5, 10, 25]))

Shortest Path for n entities

I'm trying to solve a problem about minimizing the distance traveled by a group of n entities who have to go through a group of x points in a given order.
The n entities all start in the same position (1,1) and then I'm given x points that are in a queue and have to be "answered" in the correct order. However, I want the distance to be minimal.
My approach to the algorithm so far was to order the entities in increasing order of their distance to the point that is next in line. Then I'd check, from the closest to the furthest, whether each entity's distance to the next point in line is bigger than its distance to the point that comes afterwards. The closest one to not fulfill this condition went to answer. If all were closer to the point that came afterwards, I'd reorder them in increasing order of distance to that point and send the one furthest away from it to answer the current point. Since this is a test problem I'm doing as practice for a competition, I know what the result should be for my test case, and it seems I'm doing this wrong.
How should I implement such an algorithm that guarantees that the distance is minimal?
The algorithm you've described sounds like a greedy search algorithm. Greedy algorithms are not guaranteed to find optimal solutions except under specific conditions that don't seem to hold here.
This looks like a candidate for a dynamic programming formulation. Alternatively, you can use heuristic search such as the A* search algorithm. I'd go with the latter if I were in the competition. See the link for a description of how the algorithm works, how to implement it, and how you might apply it to your problem.
Although, since the points need to be visited in order, there is a bound on the number of possible arrangements, I couldn't think of anything more efficient than the following formulation. Let f(ns, i) represent the optimal arrangement up to the ith point, where ns is the list of the last visited point for each entity that has at least one point. Then we have two kinds of choices: either start a new entity if we haven't run out, or try the current point as the next visit for each existing entity.
Python recursion:
import math

def d(p1, p2):
    return math.sqrt(math.pow(p1[0] - p2[0], 2) + math.pow(p1[1] - p2[1], 2))

def f(ps, n):
    def g(ns, i):
        if i == len(ps):
            return 0
        # start a new entity if we haven't run out
        best = d((0, 0), ps[i]) + g(ns[:] + [i], i + 1) if len(ns) < n else float('inf')
        # try the current point as the next visit for each entity
        for entity_idx, point_idx in enumerate(ns):
            _ns = ns[:]
            _ns[entity_idx] = i
            best = min(best, d(ps[point_idx], ps[i]) + g(_ns, i + 1))
        return best
    return d((0, 0), ps[0]) + g([0], 1)
Output:
"""
      p5
      p3
      p1
(0,0)       p4  p6
      p2
"""
points = [(0,1), (0,-1), (0,2), (2,0), (0,3), (3,0)]
print(f(points, 3))  # 7.0
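One possible refinement, not in the original answer: since g can be called repeatedly with the same (ns, i) arguments, memoizing on a tuple version of ns may speed things up noticeably. A sketch reusing the d function and points list above:

from functools import lru_cache

def f_memo(ps, n):
    @lru_cache(maxsize=None)
    def g(ns, i):  # ns: tuple of the last point index visited by each entity
        if i == len(ps):
            return 0
        # start a new entity if we haven't run out
        best = d((0, 0), ps[i]) + g(ns + (i,), i + 1) if len(ns) < n else float('inf')
        # or route the current point to one of the existing entities
        for k, last in enumerate(ns):
            best = min(best, d(ps[last], ps[i]) + g(ns[:k] + (i,) + ns[k+1:], i + 1))
        return best
    return d((0, 0), ps[0]) + g((0,), 1)

print(f_memo(points, 3))  # 7.0, same result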

Optimal filling of grid figure with squares

Recently I designed a puzzle for children to solve. However, I would now like to know the optimal solution.
The problem is as follows: you have this figure made up of small squares.
You have to fill it in with larger squares, and it is scored with the following table:
| Square Size | 1x1 | 2x2 | 3x3 | 4x4 | 5x5 | 6x6 | 7x7 | 8x8 |
|-------------|-----|-----|-----|-----|-----|-----|-----|-----|
| Points | 0 | 4 | 10 | 20 | 35 | 60 | 84 | 120 |
There are simply too many possible solutions to check them all. Some other people suggested dynamic programming, but I don't know how to divide the figure into smaller ones which, put together, have the same optimal solution.
I would like to find a way to find the optimal solutions to these kinds of problems in reasonable time (like a couple of days max on a regular desktop). The highest score found so far with a guessing algorithm and some manual work is 1112.
Solutions to similar problems with combining sub-problems are also appreciated.
I don't need all the code written out. An outline or idea for an algorithm would be enough.
Note: The biggest square that can fit is 8x8 so scores for bigger squares are not included.
[[1,1,0,0,0,1,0,0,0,0,0,0,1,1,1,1,1,1,0,0,1,1,1,1,1,0,0,1,1,1],
[1,1,0,0,0,0,0,0,0,0,0,0,1,1,1,0,0,0,0,0,0,0,0,1,1,0,0,0,1,1],
[1,0,0,0,0,0,0,0,0,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,1],
[0,0,0,1,1,0,0,0,0,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0],
[0,0,0,0,1,1,0,0,0,0,1,0,0,0,0,0,1,1,1,1,0,0,0,0,0,0,0,0,0,0],
[0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,0,0,0,0,0,0,1,1,1],
[0,0,0,0,0,0,0,0,1,1,0,0,0,0,1,1,1,1,1,1,1,1,0,0,0,0,1,1,1,1],
[1,0,0,0,0,0,0,1,1,1,1,0,0,0,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,1],
[1,1,0,0,0,0,0,1,1,1,1,0,0,0,1,1,1,1,1,1,1,1,0,0,1,0,0,0,0,1],
[1,1,1,0,0,0,0,1,1,1,1,1,0,0,1,1,1,1,1,1,1,0,0,0,1,1,1,0,0,0],
[0,1,1,1,0,0,0,1,1,1,1,1,0,0,0,0,1,1,1,0,0,0,0,1,1,1,1,0,0,0],
[0,0,1,1,1,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0],
[0,0,0,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0],
[0,0,0,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1],
[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1],
[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,0,0,0,0,0,0,0,0,0,0,1],
[0,0,0,1,1,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,1],
[0,0,1,1,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0],
[0,1,1,1,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0],
[1,1,1,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0],
[1,1,1,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0],
[1,1,1,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0],
[1,1,1,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0],
[1,1,1,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0],
[0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,0,0,0,0,0,0,0,0,0,0,0],
[0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1],
[0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1],
[0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1],
[0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1],
[0,0,0,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1],
[0,0,0,0,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1],
[0,0,0,0,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1],
[0,0,0,0,0,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1],
[0,0,0,0,0,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1],
[1,1,1,0,0,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1],
[1,1,1,0,0,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1],
[1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1],
[1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1],
[1,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1],
[1,1,0,0,0,1,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,1,0,0,1,1,1,1,1],
[1,0,0,0,0,1,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,1,0,0,1,1,1,1,1],
[1,0,0,0,0,1,0,0,0,1,1,1,0,0,0,0,0,0,0,0,0,1,1,0,0,1,1,1,1,1],
[1,0,0,0,0,1,0,0,0,1,1,1,0,0,0,0,0,0,0,0,1,1,1,0,0,1,1,1,1,1],
[0,0,0,0,0,1,0,0,0,1,1,1,1,1,1,0,0,0,0,1,1,1,1,0,0,0,1,1,1,1],
[0,0,0,0,0,1,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,1,1,1,1],
[0,0,0,0,0,1,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,1,1,1,1]];
Here is a quite general prototype using mixed-integer programming which solves your instance optimally (I obtained the value of 1112, like you deduced yourself) and might solve others too.
In general, your problem is NP-complete, and this makes it hard (there are some instances which will be trouble).
While I suspect that SAT-solver and CP-solver based approaches might be more powerful (because of the combinatoric nature; I was even surprised that MIP works here), the MIP approach also has some advantages:
- MIP solvers are complete (as are SAT and CP; but many randomness-based heuristics are not)
- There are many commercial-grade solvers available if needed
- The formulation is quite easy (especially compared to SAT; SAT formulations need advanced "at most k out of n" formulations for the scoring, which grow sub-quadratically, while the naive approach grows exponentially! They do exist, but are non-trivial)
- The optimization objective is just natural (SAT and CP would need iterative refinement: solve with some lower bound, then increment the bound and re-solve)
- MIP solvers can also be quite powerful for obtaining approximations of the optimal solution, and they provide proven bounds (e.g. optimum lower than x)
The following code is implemented in Python using common scientific tools (all of them open-source). It allows setting the tile range (e.g. adding 9x9 tiles) and different cost functions. The comments should be enough to understand the ideas. It will use some (probably the best) open-source MIP solver, but can also use commercial ones (the commented-out line shows the usage).
Code
import numpy as np
import itertools
from collections import defaultdict
import matplotlib.pyplot as plt # visualization only
import seaborn as sns # ""
from pulp import * # MIP-modelling & solver
""" INSTANCE """
instance = np.asarray([[1,1,0,0,0,1,0,0,0,0,0,0,1,1,1,1,1,1,0,0,1,1,1,1,1,0,0,1,1,1],
[1,1,0,0,0,0,0,0,0,0,0,0,1,1,1,0,0,0,0,0,0,0,0,1,1,0,0,0,1,1],
[1,0,0,0,0,0,0,0,0,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,1],
[0,0,0,1,1,0,0,0,0,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0],
[0,0,0,0,1,1,0,0,0,0,1,0,0,0,0,0,1,1,1,1,0,0,0,0,0,0,0,0,0,0],
[0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,0,0,0,0,0,0,1,1,1],
[0,0,0,0,0,0,0,0,1,1,0,0,0,0,1,1,1,1,1,1,1,1,0,0,0,0,1,1,1,1],
[1,0,0,0,0,0,0,1,1,1,1,0,0,0,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,1],
[1,1,0,0,0,0,0,1,1,1,1,0,0,0,1,1,1,1,1,1,1,1,0,0,1,0,0,0,0,1],
[1,1,1,0,0,0,0,1,1,1,1,1,0,0,1,1,1,1,1,1,1,0,0,0,1,1,1,0,0,0],
[0,1,1,1,0,0,0,1,1,1,1,1,0,0,0,0,1,1,1,0,0,0,0,1,1,1,1,0,0,0],
[0,0,1,1,1,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0],
[0,0,0,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0],
[0,0,0,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1],
[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1],
[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,0,0,0,0,0,0,0,0,0,0,1],
[0,0,0,1,1,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,1],
[0,0,1,1,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0],
[0,1,1,1,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0],
[1,1,1,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0],
[1,1,1,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0],
[1,1,1,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0],
[1,1,1,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0],
[1,1,1,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0],
[0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,0,0,0,0,0,0,0,0,0,0,0],
[0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1],
[0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1],
[0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1],
[0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1],
[0,0,0,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1],
[0,0,0,0,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1],
[0,0,0,0,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1],
[0,0,0,0,0,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1],
[0,0,0,0,0,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1],
[1,1,1,0,0,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1],
[1,1,1,0,0,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1],
[1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1],
[1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1],
[1,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1],
[1,1,0,0,0,1,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,1,0,0,1,1,1,1,1],
[1,0,0,0,0,1,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,1,0,0,1,1,1,1,1],
[1,0,0,0,0,1,0,0,0,1,1,1,0,0,0,0,0,0,0,0,0,1,1,0,0,1,1,1,1,1],
[1,0,0,0,0,1,0,0,0,1,1,1,0,0,0,0,0,0,0,0,1,1,1,0,0,1,1,1,1,1],
[0,0,0,0,0,1,0,0,0,1,1,1,1,1,1,0,0,0,0,1,1,1,1,0,0,0,1,1,1,1],
[0,0,0,0,0,1,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,1,1,1,1],
[0,0,0,0,0,1,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,1,1,1,1]], dtype=bool)
def plot_compare(instance, solution, subgrids):
    f, (ax1, ax2) = plt.subplots(2, sharex=True, sharey=True)
    sns.heatmap(instance, ax=ax1, cbar=False, annot=True)
    sns.heatmap(solution, ax=ax2, cbar=False, annot=True)
    plt.show()

""" PARAMETERS """
SUBGRIDS = 8  # 1x1 - 8x8
SUGBRID_SCORES = {1: 0, 2: 4, 3: 10, 4: 20, 5: 35, 6: 60, 7: 84, 8: 120}
N, M = instance.shape  # free / to-fill = zeros!

""" HELPER FUNCTIONS """
def get_square_covered_indices(instance, pos_x, pos_y, sg):
    """ Calculate all covered tiles when given a top-left position & size
        -> returns the base-index too! """
    N, M = instance.shape
    neighbor_indices = []
    valid = True
    for sX in range(sg):
        for sY in range(sg):
            if pos_x + sX < N:
                if pos_y + sY < M:
                    if instance[pos_x + sX, pos_y + sY] == 0:
                        neighbor_indices.append((pos_x + sX, pos_y + sY))
                    else:
                        valid = False
                        break
                else:
                    valid = False
                    break
            else:
                valid = False
                break
    return valid, neighbor_indices

def preprocessing(instance, SUBGRIDS):
    """ Calculate all valid placement / tile-selection combinations """
    placements = {}
    index2placement = {}
    placement2index = {}
    placement2type = {}
    type2placement = defaultdict(list)
    cover2index = defaultdict(list)  # cell covered by placement-index
    index_gen = itertools.count()
    for sg in range(1, SUBGRIDS + 1):  # sg = subgrid size
        for pos_x in range(N):
            for pos_y in range(M):
                if instance[pos_x, pos_y] == 0:  # free
                    feasible, covering = get_square_covered_indices(instance, pos_x, pos_y, sg)
                    if feasible:
                        new_index = next(index_gen)
                        placements[(sg, pos_x, pos_y)] = covering
                        index2placement[new_index] = (sg, pos_x, pos_y)
                        placement2index[(sg, pos_x, pos_y)] = new_index
                        placement2type[new_index] = sg
                        type2placement[sg].append(new_index)
                        cover2index[(pos_x, pos_y)].append(new_index)
    return placements, index2placement, placement2index, placement2type, type2placement, cover2index

def calculate_collisions(placements, index2placement):
    """ Calculate collisions between tile-placements (position + tile-selection)
        -> only upper triangle is used: a < b! """
    n_p = len(placements)
    coll_mat = np.zeros((n_p, n_p), dtype=bool)  # only upper triangle is used
    for pA in range(n_p):
        for pB in range(n_p):
            if pA < pB:
                covered_A = placements[index2placement[pA]]
                covered_B = placements[index2placement[pB]]
                if len(set(covered_A).intersection(set(covered_B))) > 0:
                    coll_mat[pA, pB] = True
    return coll_mat

""" PREPROCESSING """
placements, index2placement, placement2index, placement2type, type2placement, cover2index = preprocessing(instance, SUBGRIDS)
N_P = len(placements)
coll_mat = calculate_collisions(placements, index2placement)

""" MIP-MODEL """
prob = LpProblem("GridFill", LpMaximize)

# Variables
X = np.empty(N_P, dtype=object)
for x in range(N_P):
    X[x] = LpVariable('x' + str(x), 0, 1, cat='Binary')

# Objective
placement_scores = [SUGBRID_SCORES[index2placement[p][0]] for p in range(N_P)]
prob += lpDot(placement_scores, X), "Score"

# Constraints
# C1: Forbid collisions of placements
for a in range(N_P):
    for b in range(N_P):
        if a < b:  # symmetry-reduction
            if coll_mat[a, b]:
                prob += X[a] + X[b] <= 1  # not both!

""" SOLVE """
print('solve')
#prob.solve(GUROBI())  # much faster commercial solver; if available
prob.solve(PULP_CBC_CMD(msg=1, presolve=True, cuts=True))
print("Status:", LpStatus[prob.status])

""" INTERPRET AND COMPLETE SOLUTION """
solution = np.zeros((N, M), dtype=int)
for x in range(N_P):
    if X[x].value() > 0.99:
        sg, pos_x, pos_y = index2placement[x]
        _, positions = get_square_covered_indices(instance, pos_x, pos_y, sg)
        for pos in positions:
            solution[pos[0], pos[1]] = sg
fill_with_ones = np.logical_and((solution == 0), (instance == 0))
solution[fill_with_ones] = 1

""" VISUALIZE """
plot_compare(instance, solution, SUBGRIDS)
Assumptions / Nature of algorithm
- There are no constraints requiring every free cell to be covered
- This works as long as there are no negative scores
- A positive-score placement will be used if it improves the objective
- A zero-score placement (like your 1x1 squares) might leave some cells free, but these are proven to be 1's then (added after optimizing)
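If one did want to force every free cell to be covered, the preprocessing above already provides cover2index; in the context of the model above, the extra constraint could look like this (my own addition, not part of the original formulation):

# C2 (optional): every free cell must be covered by exactly one placement
# (feasible because a 1x1 placement exists for every free cell)
for (pos_x, pos_y), indices in cover2index.items():
    prob += lpSum([X[i] for i in indices]) == 1, ""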
Performance
This is a good example of the discrepancy between open-source and commercial solvers. The two solvers tried were cbc and Gurobi.
cbc example output (just some final parts)
Result - Optimal solution found
Objective value: 1112.00000000
Enumerated nodes: 0
Total iterations: 307854
Time (CPU seconds): 2621.19
Time (Wallclock seconds): 2627.82
Option for printingOptions changed from normal to all
Total time (CPU seconds): 2621.57 (Wallclock seconds): 2628.24
Needed: ~45 mins
Gurobi example output
Explored 0 nodes (7004 simplex iterations) in 5.30 seconds
Thread count was 4 (of 4 available processors)
Optimal solution found (tolerance 1.00e-04)
Best objective 1.112000000000e+03, best bound 1.112000000000e+03, gap 0.0%
Needed: 6 seconds
General remarks about solver performance
- Gurobi has much more functionality for recognizing the nature of the problem and choosing appropriate hyper-parameters internally
- I also think there are some SAT-based approaches used internally (one of the core developers wrote his dissertation largely about combining these very different algorithmic techniques)
- Much better heuristics are used, which can provide non-optimal solutions fast (and these help the later steps)
Example output: optimal solution with score 1112.
It is possible to reformulate the problem into another NP-hard problem :-)
Create a weighted graph where the vertices are all possible squares that can be put on the board, with weights according to size, and edges between intersecting squares. There is no need to represent 1x1 squares, since their weight is zero.
E.g. for a simple empty 3x3 board, there are:
- 5 vertices: one 3x3 and four 2x2,
- 10 edges: four between the 3x3 square and each 2x2 square, and six between each pair of 2x2 squares.
Now the problem is to find a maximum-weight independent set.
I am not experienced with the topic, but from the Wikipedia description it seems that a fast enough algorithm could exist. This graph is not in one of the classes with a known polynomial-time algorithm, but it is quite close to a P5-free graph. It seems to me that the only possibility to have a P5 in this graph is between 2x2 squares, which means having a stripe of width 2 and length 5. There is one in the lower left corner. These regions can be covered (removed) before finding the independent set, losing nothing or very little relative to the optimal solution.
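A minimal sketch of this construction for the empty 3x3 board, using NetworkX (the maximum-weight independent set itself is not computed here, since that is the NP-hard part):

import itertools
import networkx as nx

SCORES = {2: 4, 3: 10}  # 1x1 squares are skipped: their weight is zero

def build_conflict_graph(n):
    g = nx.Graph()
    # vertices: placements (size, top-left row, top-left col), weighted by score
    for size in SCORES:
        for r in range(n - size + 1):
            for c in range(n - size + 1):
                cells = frozenset((r + i, c + j)
                                  for i in range(size) for j in range(size))
                g.add_node((size, r, c), weight=SCORES[size], cells=cells)
    # edges: placements whose cell sets intersect
    for u, v in itertools.combinations(g.nodes, 2):
        if g.nodes[u]['cells'] & g.nodes[v]['cells']:
            g.add_edge(u, v)
    return g

g = build_conflict_graph(3)
print(g.number_of_nodes(), g.number_of_edges())  # 5 10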
(This is not meant to be a full answer; I'm just sharing what I'm working on so that we can collaborate.)
I think a good first step is to transform the binary grid by giving every cell the value of the maximum size of square that the cell can be the top-left corner of, like this:
0,0,3,2,1,0,3,2,2,2,2,1,0,0,0,0,0,0,2,1,0,0,0,0,0,2,1,0,0,0
0,0,2,2,2,3,3,2,1,1,1,1,0,0,0,3,3,3,3,3,3,2,1,0,0,1,2,1,0,0
0,2,1,1,1,2,3,2,1,0,0,0,0,3,2,2,2,2,2,2,3,3,2,1,0,0,3,2,1,0
3,2,1,0,0,1,3,2,1,0,0,0,3,2,2,1,1,1,1,1,2,3,3,2,1,0,2,2,2,1
3,3,2,1,0,0,2,2,2,1,0,3,2,2,1,1,0,0,0,0,1,2,4,3,2,2,1,1,1,1
2,3,3,2,1,0,2,1,1,1,2,3,2,1,1,0,0,0,0,0,0,1,3,3,2,1,1,0,0,0
1,2,3,4,3,2,1,1,0,0,1,3,2,1,0,0,0,0,0,0,0,0,2,2,2,1,0,0,0,0
0,1,2,3,3,2,1,0,0,0,0,2,2,1,0,0,0,0,0,0,0,0,2,1,1,2,2,2,1,0
0,0,1,2,3,2,1,0,0,0,0,1,2,1,0,0,0,0,0,0,0,0,2,1,0,1,1,2,1,0
0,0,0,1,2,2,1,0,0,0,0,0,2,1,0,0,0,0,0,0,0,2,1,1,0,0,0,3,2,1
1,0,0,0,1,2,1,0,0,0,0,0,4,3,2,1,0,0,0,4,3,2,1,0,0,0,0,2,2,1
2,1,0,0,0,1,2,1,0,0,5,5,4,4,4,4,4,4,4,5,5,4,3,2,1,0,0,1,2,1
3,2,1,0,0,0,1,6,6,5,4,4,4,3,3,3,3,3,3,4,4,5,4,3,2,1,0,0,1,1
3,2,1,0,0,0,0,6,5,5,4,3,3,3,2,2,2,2,2,3,3,4,5,4,3,2,1,0,0,0
3,2,2,2,2,7,6,6,5,4,4,3,2,2,2,1,1,1,1,2,2,3,5,5,4,3,2,1,0,0
2,2,1,1,1,7,6,5,5,4,3,3,2,1,1,1,0,0,0,1,1,2,4,6,5,4,3,2,1,0
2,1,1,0,0,7,6,5,4,4,3,2,2,1,0,0,0,0,0,0,0,1,3,6,5,4,3,2,1,0
1,1,0,0,8,7,6,5,4,3,3,2,1,1,0,0,0,0,0,0,0,0,2,7,6,5,4,3,2,1
1,0,0,0,8,7,6,5,4,3,2,2,1,0,0,0,0,0,0,0,0,0,1,7,6,5,4,3,2,1
0,0,0,7,8,7,6,5,4,3,2,1,1,0,0,0,0,0,0,0,0,0,0,6,6,5,4,3,2,1
0,0,0,6,8,7,6,5,4,3,2,1,0,0,0,0,0,0,0,0,0,0,0,6,5,5,4,3,2,1
0,0,0,5,7,7,6,5,4,3,2,1,0,0,0,0,0,0,0,0,0,0,0,6,5,4,4,3,2,1
0,0,0,4,6,7,7,6,5,4,3,2,1,0,0,0,0,0,0,0,0,0,6,5,5,4,3,3,2,1
0,0,0,3,5,6,7,7,6,5,4,3,2,1,0,0,0,0,0,0,0,6,6,5,4,4,3,2,2,1
1,0,0,2,4,5,6,7,8,7,6,5,4,3,2,1,0,0,0,7,6,6,5,5,4,3,3,2,1,1
1,0,0,1,3,4,5,6,7,7,8,8,8,8,8,8,7,7,6,6,6,5,5,4,4,3,2,2,1,0
2,1,0,0,2,3,4,5,6,6,7,7,8,7,7,7,7,6,6,5,5,5,4,4,3,3,2,1,1,0
2,1,0,0,1,2,3,4,5,5,6,6,8,7,6,6,6,6,5,5,4,4,4,3,3,2,2,1,0,0
3,2,1,0,0,1,2,3,4,4,5,5,8,7,6,5,5,5,5,4,4,3,3,3,2,2,1,1,0,0
3,2,1,0,0,0,1,2,3,3,4,4,8,7,6,5,4,4,4,4,3,3,2,2,2,1,1,0,0,0
4,3,2,1,0,0,0,1,2,2,3,3,8,7,6,5,4,3,3,3,3,2,2,1,1,1,0,0,0,0
3,3,2,1,0,0,0,0,1,1,2,2,8,7,6,5,4,3,2,2,2,2,1,1,0,0,0,0,0,0
2,2,2,2,1,0,0,0,0,0,1,1,8,7,6,5,4,3,2,1,1,1,1,0,0,0,0,0,0,0
1,1,1,2,1,0,0,0,0,0,0,0,8,7,6,5,4,3,2,1,0,0,0,0,0,0,0,0,0,0
0,0,0,2,1,0,0,0,0,0,0,0,8,8,7,6,5,4,3,2,1,0,0,0,0,0,0,0,0,0
0,0,0,2,1,0,0,0,0,0,0,6,8,7,7,6,6,5,4,3,2,1,0,0,0,0,0,0,0,0
0,0,0,2,2,2,3,3,3,3,3,5,7,7,6,6,5,5,4,3,3,3,3,2,1,0,0,0,0,0
0,0,3,2,1,1,3,2,2,2,2,4,6,6,6,5,5,4,4,3,2,2,2,2,1,0,0,0,0,0
0,0,3,2,1,0,3,2,1,1,1,3,5,5,5,5,4,4,3,3,2,1,1,2,1,0,0,0,0,0
0,0,3,2,1,0,3,2,1,0,0,2,4,4,4,4,4,3,3,2,2,1,0,2,1,0,0,0,0,0
0,4,3,2,1,0,3,2,1,0,0,1,3,3,3,4,3,3,2,2,1,1,0,2,1,0,0,0,0,0
0,4,3,2,1,0,3,2,1,0,0,0,2,2,2,3,3,2,2,1,1,0,0,2,1,0,0,0,0,0
0,4,3,2,1,0,3,2,1,0,0,0,1,1,1,2,2,2,1,1,0,0,0,2,1,0,0,0,0,0
3,3,3,2,1,0,3,2,1,0,0,0,0,0,0,1,1,1,1,0,0,0,0,3,2,1,0,0,0,0
2,2,2,2,1,0,2,2,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,2,1,0,0,0,0
1,1,1,1,1,0,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,0,0,0,0
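The transform just described is a short dynamic program scanning from the bottom-right corner. A sketch, assuming grid is the 0/1 instance as a list of lists with 0 = free:

def max_square_sizes(grid):
    """Cell value = side of the largest free square whose top-left
    corner is at that cell (0 if the cell itself is blocked)."""
    n, m = len(grid), len(grid[0])
    size = [[0] * m for _ in range(n)]
    for r in range(n - 1, -1, -1):
        for c in range(m - 1, -1, -1):
            if grid[r][c] == 0:                      # free cell
                if r == n - 1 or c == m - 1:
                    size[r][c] = 1                   # border: 1x1 at most
                else:
                    size[r][c] = 1 + min(size[r + 1][c],
                                         size[r][c + 1],
                                         size[r + 1][c + 1])
    return size

(Values above 8 can simply be capped, since larger squares are not allowed in this puzzle.)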
If you wanted to go through every option using brute force, you'd try every size of square that a cell could be the corner of (including 1x1), mark the square with zeros, adjust the values of the cells up to 7 places left/above the square, and recurse with the new grid.
If you iterated over the cells top-to-bottom and left-to-right, you'd only have to copy the grid starting from the current row to the bottom row, and you'd only have to adjust the values of cells up to 7 places to the left of the square.
The JS code I tested this with is fast for the top 2 or 3 rows of the grid (result: 24 and 44), takes 8 seconds to finish the top 4 rows (result: 70), and 30 minutes for 5 rows (result: 86). I'm not trying 6 rows.
But, as you can see from this grid, the number of possibilities is so huge that brute force will never be an option. On the other hand, trying something like adding large squares first, and then filling up the leftover space with smaller squares, is never going to guarantee the optimal result, I fear. It's too easy to come up with examples that would thwart such a strategy.
7,6,5,4,3,2,1,0,0,0,0,0,0,7,6,5,4,3,2,1
6,6,5,4,3,2,1,0,0,0,0,0,0,6,6,5,4,3,2,1
5,5,5,4,3,2,1,0,0,0,0,0,0,5,5,5,4,3,2,1
4,4,4,4,3,2,1,0,0,0,0,0,0,4,4,4,4,3,2,1
3,3,3,3,3,2,1,0,0,0,0,0,0,3,3,3,3,3,2,1
2,2,2,2,2,2,1,0,0,0,0,0,0,2,2,2,2,2,2,1
1,1,1,1,1,1,8,7,6,5,4,3,2,1,1,1,1,1,1,1
0,0,0,0,0,0,7,7,6,5,4,3,2,1,0,0,0,0,0,0
0,0,0,0,0,0,6,6,6,5,4,3,2,1,0,0,0,0,0,0
0,0,0,0,0,0,5,5,5,5,4,3,2,1,0,0,0,0,0,0
0,0,0,0,0,0,4,4,4,4,4,3,2,1,0,0,0,0,0,0
0,0,0,0,0,0,3,3,3,3,3,3,2,1,0,0,0,0,0,0
0,0,0,0,0,0,2,2,2,2,2,2,2,1,0,0,0,0,0,0
7,6,5,4,3,2,1,1,1,1,1,1,1,7,6,5,4,3,2,1
6,6,5,4,3,2,1,0,0,0,0,0,0,6,6,5,4,3,2,1
5,5,5,4,3,2,1,0,0,0,0,0,0,5,5,5,4,3,2,1
4,4,4,4,3,2,1,0,0,0,0,0,0,4,4,4,4,3,2,1
3,3,3,3,3,2,1,0,0,0,0,0,0,3,3,3,3,3,2,1
2,2,2,2,2,2,1,0,0,0,0,0,0,2,2,2,2,2,2,1
1,1,1,1,1,1,1,0,0,0,0,0,0,1,1,1,1,1,1,1
In the above example, putting an 8x8 square in the center and four 6x6 squares in the corners gives a lower score than putting a 6x6 square in the center and four 7x7 squares in the corners; so a greedy approach based on using the largest square possible will not give the optimal result.
This is how far I got by isolating zones connected by corridors of maximum width 3, and running the brute-force algorithm on the smaller grids. Where the border has no orange zone, adding another 2 cells doesn't increase the score of the isolated zone, so those cells can be used by the main zone unconditionally.

Routing "paths" through a rectangular array

I'm trying to create my own implementation of a puzzle game.
To create my game board, I need to traverse each square in my array once and only once.
The traversal needs to be linked to an adjacent neighbor (horizontal, vertical or diagonal).
I'm using an array structure of the form:
board[n,m] = byte
Each bit of the byte represents a direction 0..7 and exactly 2 bits are always set
Directions are numbered clockwise
0 1 2
7 . 3
6 5 4
Board[0,0] must have some combination of bits 3,4,5 set
My current approach for constructing a random path is:
Start at a random position
While (squares remain)
    If no directions from this square are valid, step backwards
    Pick a random direction from those remaining in the bitfield for this square
    Erase the direction to this square from those not picked
    Move to the next square
This algorithm takes far too long on some medium sized grids, because earlier choices remove areas from consideration.
What I'd like to have is a function that takes an index into every possible path, and returns with the array filled in for that path. This would let me provide a 'seed' value to return to this particular board.
Other suggestions are welcome..
Essentially, you want to construct a Hamiltonian path: A path that visits each node of a graph exactly once.
In general, even if you only want to test whether a graph contains a Hamiltonian path, that's already NP-complete. In this case, it's obvious that the graph contains at least one Hamiltonian path, and the graph has a nice regular structure -- but enumerating all Hamiltonian paths still seems to be a difficult problem.
The Wikipedia entry on the Hamiltonian path problem has a randomized algorithm for finding a Hamiltonian path that is claimed to be "fast on most graphs". It's different from your algorithm in that it swaps around a whole branch of the path instead of backtracking by just one node. This more "aggressive" strategy might be faster -- try it and see.
You could let your users enter the seed for the random number generator to reconstruct a certain path. You still wouldn't be enumerating all possible paths, but I guess that's probably not necessary for your application.
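For the record, a minimal sketch of that rotation idea (often attributed to Pósa) on an adjacency-list graph; this illustrates the technique and is not the exact pseudocode from the Wikipedia article:

import random

def random_hamiltonian_path(adj, start, max_steps=100000):
    """Randomized search: extend the path when possible, otherwise
    rotate a branch to obtain a new endpoint. adj: node -> neighbours."""
    path = [start]
    in_path = {start}
    for _ in range(max_steps):
        if len(path) == len(adj):
            return path                  # visited every node exactly once
        head = path[-1]
        nxt = random.choice(adj[head])
        if nxt not in in_path:
            path.append(nxt)             # simple extension
            in_path.add(nxt)
        else:
            # rotation: reverse the tail after nxt, creating a new endpoint;
            # all edges of the new path exist because head-nxt is an edge
            i = path.index(nxt)
            path[i + 1:] = reversed(path[i + 1:])
    return None                          # give up after max_steps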
This was originally a comment to Martin B's post, but it grew a bit, so I'm posting it as an "answer".
Firstly, the problem of enumerating all possible Hamiltonian Paths is quite similar to the #P complexity class --- NP's terrifying older brother. Wikipedia, as always, can explain. It is for this reason that I hope you don't really need to know every path, but just most. If not, well, there's still hope if you can exploit the structure of your graphs.
(Here's a fun way of looking at why enumerating all paths is really hard: let's say you have 10 paths for the graph, and you think you're done. And evil old me comes over and says "well, prove to me that there isn't an 11th path". Having an answer that could both convince me and not take a while to demonstrate --- that would be pretty impressive! Proving that there doesn't exist any path is co-NP, and that's also quite hard. Proving that there only exists a certain number sounds very hard. It's late here, so I'm a bit fuzzy-headed, but so far I think everything I've said is correct.)
Anyways, my google-fu seems particularly weak on this subject, which is surprising, so I don't have a cold answer for you, but I have some hints?
Firstly, if each node has at most 2 outgoing edges, then the number of edges is linear in the number of vertices, and I'm pretty sure that means it's a "sparse" graph, which is typically happier to deal with. But the Hamiltonian path problem is exponential in the number of edges, so there's no really easy way out. :)
Secondly, if the structure of your graph lends itself to this, it may be possible to break the problem up into smaller chunks. Min-cut and strongly-connected-components come to mind. The basic idea would be, ideally, finding that your big graph is in fact, really, 4 smaller graphs with just a few connecting edges between the smaller graphs. Running your HamPath algorithm on those smaller chunks would be considerably faster, and the hope would be that it would be somewhat easy to piece together the sub-graph-paths into the whole-graph-paths. Coincidentally, these are great tongue-twisters. The downside is that these algorithms are a bit difficult to implement, and will require a lot of thought and care to use properly (in the combining of the HamPaths). Furthermore, if your graphs are in fact quite thoroughly connected, this whole paragraph was for naught! :)
So those were just a few ideas. If there is any other aspect of your graph (a particular starting point is required, or a particular ending point, or additional rules about what node can connect to which), it may be quite helpful! The border between NP-complete problems and P (fast) algorithms is often quite thin.
Hope this helps, -Agor.
I would start with a trivial path through the grid, then randomly select locations on the board (using the input seed to seed the random number generator) and perform "switches" on the path where possible. e.g.:
------        --\/--
------   to:  --/\--
With enough switches you should get a random-looking path.
Bizarrely enough, I spent an undergrad summer in my Math degree studying this problem as well as coming up with algorithms to address it. First I will comment on the other answers.
Martin B: Correctly identified this as a Hamiltonian path problem. However, if the graph is a regular grid (as you two discussed in the comments) then a Hamiltonian path can trivially be found (snaking row-major order for example.)
agnorest: Correctly talked about the Hamiltonian path problem being in a class of hard problems. agnorest also alluded to possibly exploiting the graph structure, which funny enough, is trivial in the case of a regular grid.
I will now continue the discussion by elaborating on what I think you are trying to achieve. As you mentioned in the comments, it is trivial to find certain classes of "space-filling" non intersecting "walks" on a regular lattice/grid. However, it seems you are not satisfied only with these and would like to find an algorithm that finds "interesting" walks that cover your grid at random. But before I explore that possibility, I'd like to point out that the "non-intersecting" property of these walks is extremely important and what causes the difficulties of enumerating them.
In fact, the thing that I studied in my summer internship is called the Self-Avoiding Walk. Surprisingly, the study of SAWs (self-avoiding walks) is very important to a few sub-domains of modeling in physics (and was a key ingredient of the Manhattan project!). The algorithm you gave in your question is actually a variant of an algorithm known as the "growth" algorithm or Rosenbluth's algorithm (named after none other than Marshall Rosenbluth). More details on both the general version of this algorithm (used to estimate statistics on SAWs) as well as its relation to physics can readily be found in literature like this thesis.
SAWs in 2 dimensions are notoriously hard to study. Very few theoretical results are known about SAWs in 2 dimensions. Oddly enough, in more than 3 dimensions you can say that most properties of SAWs are known theoretically, such as their growth constant. Suffice it to say, SAWs in 2 dimensions are very interesting things!
Reining in this discussion to talk about your problem at hand, you are probably finding that your implementation of the growth algorithm gets "cut off" quite frequently and cannot expand to fill the whole of your lattice. In this case, it is more appropriate to view your problem as a Hamiltonian path problem. My approach to finding interesting Hamiltonian paths would be to formulate the problem as an integer linear program, and fix random edges to be used in the ILP. Fixing random edges gives randomness to the generation process, and the ILP portion efficiently computes whether certain configurations are feasible and, if they are, returns a solution.
The Code
The following code implements the Hamiltonian path or cycle problem for arbitrary initial conditions, on a regular 2D lattice with 4-connectivity. The formulation follows the standard sub-tour elimination ILP approach. The sub-tours are eliminated lazily, meaning potentially many iterations can be required.
You could augment this to satisfy random or otherwise initial conditions that you deem "interesting" for your problem. If the initial conditions are infeasible it terminates early and prints this.
This code depends on NetworkX and PuLP.
"""
Hamiltonian path formulated as ILP, solved using PuLP, adapted from
https://projects.coin-or.org/PuLP/browser/trunk/examples/Sudoku1.py
Authors: ldog
"""
# Import PuLP modeler functions
from pulp import *
# Solve for Hamiltonian path or cycle
solve_type = 'cycle'
# Define grid size
N = 10
# If solving for path a start and end must be specified
if solve_type == 'path':
    start_vertex = (0, 0)
    end_vertex = (5, 5)

# Assuming 4-connectivity (up, down, left, right)
Edges = ["up", "down", "left", "right"]
Sequence = [i for i in range(N)]

# The Rows and Cols sequences follow this form, Vals will be which edge is used
Vals = Edges
Rows = Sequence
Cols = Sequence

# The prob variable is created to contain the problem data
prob = LpProblem("Hamiltonian Path Problem", LpMinimize)

# The problem variables are created
choices = LpVariable.dicts("Choice", (Vals, Rows, Cols), 0, 1, LpInteger)

# The arbitrary objective function is added
prob += 0, "Arbitrary Objective Function"

# A constraint ensuring that exactly two edges per node are used
# (a requirement for the solution to be a walk or path.)
for r in Rows:
    for c in Cols:
        if solve_type == 'cycle':
            prob += lpSum([choices[v][r][c] for v in Vals]) == 2, ""
        elif solve_type == 'path':
            if (r, c) == end_vertex or (r, c) == start_vertex:
                prob += lpSum([choices[v][r][c] for v in Vals]) == 1, ""
            else:
                prob += lpSum([choices[v][r][c] for v in Vals]) == 2, ""

# A constraint ensuring that edges between adjacent nodes agree
for r in Rows:
    for c in Cols:
        # The up direction
        if r > 0:
            prob += choices["up"][r][c] == choices["down"][r-1][c], ""
        # The down direction
        if r < N-1:
            prob += choices["down"][r][c] == choices["up"][r+1][c], ""
        # The left direction
        if c > 0:
            prob += choices["left"][r][c] == choices["right"][r][c-1], ""
        # The right direction
        if c < N-1:
            prob += choices["right"][r][c] == choices["left"][r][c+1], ""

# Ensure boundaries are not used
for c in Cols:
    prob += choices["up"][0][c] == 0, ""
    prob += choices["down"][N-1][c] == 0, ""
for r in Rows:
    prob += choices["left"][r][0] == 0, ""
    prob += choices["right"][r][N-1] == 0, ""

# Random conditions can be specified to give "interesting" paths or cycles
# that have this condition.
# In the simplest case, just specify one node with fixed edges used.
prob += choices["down"][2][2] == 1, ""
prob += choices["up"][2][2] == 1, ""

# Keep solving while the smallest cycle is not the whole thing
while True:
    # The problem is solved using PuLP's choice of Solver
    prob.solve()

    # The status of the solution is printed to the screen
    status = LpStatus[prob.status]
    print("Status:", status)
    if status == 'Infeasible':
        print('The set of conditions imposed are impossible to solve for. Change the conditions.')
        break

    import networkx as nx
    g = nx.Graph()
    g.add_nodes_from([i for i in range(N*N)])
    for r in Rows:
        for c in Cols:
            if value(choices['up'][r][c]) == 1:
                nr, nc = r - 1, c
                g.add_edge(r*N + c, nr*N + nc)
            if value(choices['down'][r][c]) == 1:
                nr, nc = r + 1, c
                g.add_edge(r*N + c, nr*N + nc)
            if value(choices['left'][r][c]) == 1:
                nr, nc = r, c - 1
                g.add_edge(r*N + c, nr*N + nc)
            if value(choices['right'][r][c]) == 1:
                nr, nc = r, c + 1
                g.add_edge(r*N + c, nr*N + nc)

    # Get connected components sorted by length
    cc = sorted(nx.connected_components(g), key=len)

    # For the shortest component, add the cycle-elimination condition
    def ngb(idx, v):
        # index of the neighbour of cell idx in direction v
        r = idx // N
        c = idx % N
        if v == 'up':
            nr, nc = r - 1, c
        if v == 'down':
            nr, nc = r + 1, c
        if v == 'left':
            nr, nc = r, c - 1
        if v == 'right':
            nr, nc = r, c + 1
        return nr*N + nc

    prob += lpSum([choices[v][idx // N][idx % N]
                   for v in Vals for idx in cc[0] if ngb(idx, v) in cc[0]]) \
            <= 2*len(cc[0]) - 1, ""

    # Pretty print the solution
    if len(cc[0]) == N*N:
        print('')
        print('***************************')
        print(' This is the final solution')
        print('***************************')
        for r in Rows:
            s = ""
            for c in Cols:
                if value(choices['up'][r][c]) == 1:
                    s += " | "
                else:
                    s += "   "
            print(s)
            s = ""
            for c in Cols:
                if value(choices['left'][r][c]) == 1:
                    s += "-"
                else:
                    s += " "
                s += "X"
                if value(choices['right'][r][c]) == 1:
                    s += "-"
                else:
                    s += " "
            print(s)
            s = ""
            for c in Cols:
                if value(choices['down'][r][c]) == 1:
                    s += " | "
                else:
                    s += "   "
            print(s)
    if len(cc[0]) != N*N:
        print('Press key to continue to next iteration (which eliminates a suboptimal subtour) ...')
    elif len(cc[0]) == N*N:
        print('Press key to terminate')
    input()
    if len(cc[0]) == N*N:
        break

Algorithm to establish ordering amongst a set of items

I have a set of students (referred to as items in the title for generality). Amongst these students, some have a reputation for being rambunctious. We are told about a set of hate relationships of the form 'i hates j'. 'i hates j' does not imply 'j hates i'. We are supposed to arrange the students in rows (front-most row numbered 1) such that if 'i hates j', then i is put in a row with a strictly lower number than j's (in other words, in some row in front of j's row), so that i doesn't throw anything at j (turning back is not allowed). What would be an efficient algorithm to find the minimum number of rows needed (each row need not have the same number of students)?
We will make the following assumptions:
1) If we model this as a directed graph, there are no cycles in the graph. The most basic cycle would be mutual hate: 'i hates j' and 'j hates i' both true. Otherwise, I think the ordering would become impossible.
2) Every student in the group is hated by at least one other student OR hates at least one other student. Of course, there would be students who are both hated by some and who in turn hate other students. This means that there are no stray students who don't form part of the graph.
Update: I have already thought of constructing a directed graph with an edge i --> j if 'i hates j' and doing a topological sort. However, the general topological sort would suit better if I had to line all the students up in a single line. Since rows are allowed here, I am trying to figure out how to factor that variation into the topological sort so it gives me what I want.
When you answer, please state the complexity of your solution. If anybody is giving code and you don't mind the language, then I'd prefer Java but of course any other language is just as fine.
JFYI This is not for any kind of homework (I am not a student btw :)).
It sounds to me like you need to investigate topological sorting.
This problem is basically the longest-path-in-a-directed-graph problem in another guise. The number of rows is the number of nodes on that path (number of edges + 1).
Assuming the graph is acyclic, the solution is topological sort.
Acyclic is a bit stronger than your assumption 1. Not only are A -> B and B -> A together invalid; so are A -> B, B -> C, C -> A, and any cycle of any length.
HINT: the question is how many rows are needed, not which student goes in which row. The answer to the question is the length of the longest path.
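Since this answer names both ingredients, here is a compact sketch combining them: topologically sort with Kahn's algorithm while relaxing edge lengths, which yields the longest path in a DAG (the adjacency-list input format is my own choice):

from collections import deque

def min_rows(n, edges):
    """n students, edges (i, j) meaning 'i hates j'.
    Returns the number of nodes on the longest path = minimum rows."""
    adj = [[] for _ in range(n)]
    indeg = [0] * n
    for i, j in edges:
        adj[i].append(j)
        indeg[j] += 1
    q = deque(u for u in range(n) if indeg[u] == 0)
    dist = [0] * n                 # longest path (in edges) ending at u
    processed = 0
    while q:
        u = q.popleft()
        processed += 1
        for v in adj[u]:
            dist[v] = max(dist[v], dist[u] + 1)
            indeg[v] -= 1
            if indeg[v] == 0:
                q.append(v)
    if processed < n:
        raise ValueError("cycle detected: no valid arrangement")
    return max(dist) + 1

print(min_rows(4, [(0, 1), (1, 2), (0, 3)]))  # 3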
This is from project management theory (or scheduling theory, I don't know the exact term). There the task is about ordering jobs (a vertex is a job, an arc is a job-order relationship).
Obviously we have some connected oriented graph without loops. There is an arc from vertex a to vertex b if and only if a hates b. Let's assume there is a source vertex (without incoming arcs) and a destination vertex (without outgoing arcs). If that is not the case, just add imaginary ones. Now we want to find the length of a longest path from source to destination (it will be the number of rows - 1, but mind the imaginary vertices).
We will define the vertex rank r[v] as the number of arcs in a longest path between the source and this vertex v. Obviously we want to know r[destination]. Algorithm for finding ranks:
0) r_0[v] := 0 for all vertices v
repeat
    t) r_t[end(j)] := max(r_{t-1}[end(j)], r_{t-1}[start(j)] + 1) for all arcs j
until r_{t+1}[end(j)] = r_t[end(j)] for all arcs j   // i.e. no changes on this iteration
On each step at least one vertex increases its rank. Therefore in this form the complexity is O(n^3).
By the way, this algorithm also gives you the student distribution among rows: just group students by their respective ranks.
Edit: Here is another version with the same idea; possibly it is easier to understand.
# Python
# V is a list of vertex indices, let it be something like V = range(N)
# source has index 0, destination has index N-1
# E is a list of edges, i.e. tuples of the form (start vertex, end vertex)
R = [0] * len(V)
changes = True
while changes:
    changes = False
    for e in E:
        if R[e[1]] < R[e[0]] + 1:
            changes = True
            R[e[1]] = R[e[0]] + 1
# The answer is derived from the value of R[N-1]
Of course this is the simplest implementation. It can be optimized, and the time estimate can be better.
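A self-contained run of the loop above, with a small made-up instance (the edge list is hypothetical):

# 0 = source, 3 = destination
V = list(range(4))
E = [(0, 1), (1, 2), (0, 2), (2, 3)]

R = [0] * len(V)
changes = True
while changes:
    changes = False
    for e in E:
        if R[e[1]] < R[e[0]] + 1:
            changes = True
            R[e[1]] = R[e[0]] + 1

print(R)  # [0, 1, 2, 3]: the longest path has 3 arcs, i.e. 4 rows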
Edit 2: An obvious optimization is to update only the vertices adjacent to those that were updated on the previous step, i.e. introduce a queue of vertices whose rank was updated. Also, for edge storage one should use adjacency lists. With this optimization the complexity would be O(N^2). Indeed, each vertex may appear in the queue at most rank times, and a vertex's rank never exceeds N, the number of vertices. Therefore the total number of algorithm steps will not exceed O(N^2).
Essentially the important thing in assumption #1 is that there must not be any cycles in this graph. If there are any cycles, you can't solve this problem.
I would start by seating all of the students who do not hate any other students in the back row. Then you can seat the students who hate these students in the next row, and so on.
The number of rows is the length of the longest path in the directed graph, plus one. As a limit case, if there is no hate relationship, everyone can fit in the same row.
To allocate the rows, put everyone who is not hated by anyone else in row one. These are the "roots" of your graph. Everyone else is put in row N + 1, where N is the length of the longest path from any of the roots to that person (this path has length one at least).
A simple O(N^3) algorithm is the following:
S = set of students
for s in S: s.row = -1   # initialize row field
rownum = 0               # start from first row below
flag = true              # when to finish
while (flag):
    rownum = rownum + 1  # proceed to next row
    flag = false
    for s in S:
        if (s.row != -1) continue  # already allocated
        ok = true
        foreach q in S:
            # Check if there is a student q who will sit
            # on this or a later row who hates s
            if ((q.row == -1 or q.row == rownum) and s hated by q):
                ok = false; break
        if (ok):  # can put s here
            s.row = rownum
            flag = true
Simple answer = 1 row.
Put all students in the same row.
Actually that might not solve the question as stated - lesser row, rather than equal row...
Put all students in row 1
For each hate relation, put the not-hating student in a row behind the hating student
Iterate till you have no activity, or iterate Num(relation) times.
But I'm sure there are better algorithms - look at acyclic graphs.
Construct a relationship graph where 'i hates j' produces a directed edge from i to j. The end result is a directed graph. It should be a DAG; otherwise there is no solution, as it is not possible to resolve circular hate relationships.
Now simply do a DFS, and in the post-node callback (once the DFS of all the children is done and before returning from the DFS call for this node), check the row numbers of all the children and assign this node's row number as the max child row + 1. In case someone doesn't hate anyone (a node with an empty adjacency list), simply assign him row 0.
Once all the nodes are processed, reverse the row numbers. This is easy, as it is just about finding the max and assigning each row number as the max minus the already-assigned row number.
Here is the sample code.
postNodeCb(graph g, int node)
{
    if ( /* no adjacency list */ )
        row[node] = 0;
    else
        row[node] = max(row numbers of all children) + 1;
}

main()
{
    ...
    for (int i = 0; i < NUM_VER; i++)
        if (!visited[i])
            graphTraverseDfs(g, i);
    ...
}
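For completeness, a runnable Python version of the same idea (a sketch, assuming the hate graph is a DAG given as adjacency lists; the names and input shape are my own):

def assign_rows(adj):
    """adj[i] = students that student i hates (edges i -> j).
    Returns 1-based row numbers such that row[i] < row[j] whenever
    i hates j. Assumes the graph is acyclic."""
    n = len(adj)
    depth = [-1] * n  # length of the longest hate-chain starting at i

    def dfs(u):
        if depth[u] != -1:
            return depth[u]
        depth[u] = max((dfs(v) + 1 for v in adj[u]), default=0)
        return depth[u]

    for u in range(n):
        dfs(u)
    # the deepest haters must sit furthest forward, so invert the depths
    m = max(depth)
    return [m - d + 1 for d in depth]

# 0 hates 1 and 2; 1 hates 2
print(assign_rows([[1, 2], [2], []]))  # [1, 2, 3]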
