Missing some paths in Edmonds-Karp max flow algorithm

I've implemented the Edmonds-Karp algorithm, but it doesn't seem to be correct and I'm not getting the right flow. Consider the following graph and a flow from 4 to 8:
The algorithm runs as follows:
First it finds 4→1→8,
then it finds 4→5→8,
and after that 4→1→6→8.
I think the third path is wrong, because by using it we can no longer use the edge 6→8 (it is already used), and therefore we can't use the path 4→5→6→8.
In fact, if we choose 4→5→6→8, then 4→1→3→7→8, and then 4→1→3→7→8, we get a better flow (40).
I implemented the algorithm from the wiki sample code. I think we can't just use any valid augmenting path, and that this greedy selection is wrong.
Am I wrong?
The code is below (in C#; threshold is 0 and doesn't affect the algorithm):
public decimal EdmondKarps(decimal[][] capacities/*Capacity matrix*/,
List<int>[] neighbors/*Neighbour lists*/,
int s /*source*/,
int t/*sink*/,
decimal threshold,
out decimal[][] flowMatrix
/*flowMatrix (A matrix giving a legal flowMatrix with the maximum value)*/
)
{
THRESHOLD = threshold;
int n = capacities.Length;
decimal flow = 0m; // (Initial flowMatrix is zero)
flowMatrix = new decimal[n][]; //array(1..n, 1..n) (Residual capacity from u to v is capacities[u,v] - flowMatrix[u,v])
for (int i = 0; i < n; i++)
{
flowMatrix[i] = new decimal[n];
}
while (true)
{
var path = new int[n];
var pathCapacity = BreadthFirstSearch(capacities, neighbors, s, t, flowMatrix, out path);
if (pathCapacity <= threshold)
break;
flow += pathCapacity;
//(Backtrack search, and update flowMatrix)
var v = t;
while (v != s)
{
var u = path[v];
flowMatrix[u][v] = flowMatrix[u][v] + pathCapacity;
flowMatrix[v][u] = flowMatrix[v][u] - pathCapacity;
v = u;
}
}
return flow;
}
private decimal BreadthFirstSearch(decimal[][] capacities, List<int>[] neighbors, int s, int t, decimal[][] flowMatrix, out int[] path)
{
var n = capacities.Length;
path = Enumerable.Range(0, n).Select(x => -1).ToArray();//array(1..n)
path[s] = -2;
var pathFlow = new decimal[n];
pathFlow[s] = Decimal.MaxValue; // INFINT
var Q = new Queue<int>(); // Q is exactly Queue :)
Q.Enqueue(s);
while (Q.Count > 0)
{
var u = Q.Dequeue();
for (int i = 0; i < neighbors[u].Count; i++)
{
var v = neighbors[u][i];
//(If there is available capacity, and v is not seen before in search)
if (capacities[u][v] - flowMatrix[u][v] > THRESHOLD && path[v] == -1)
{
// save path:
path[v] = u;
pathFlow[v] = Math.Min(pathFlow[u], capacities[u][v] - flowMatrix[u][v]);
if (v != t)
Q.Enqueue(v);
else
return pathFlow[t];
}
}
}
return 0;
}

The way you choose the paths is not important.
You do have to add the path capacity to the reverse edges of the path and reduce the capacity of the path's edges by that value.
In fact this solution works:
while there is a path with positive capacity from source to sink {
    find any path with positive capacity from source to sink, named P, with capacity C.
    add C to maximum_flow_value.
    reduce C from the capacity of the edges of P.
    add C to the capacity of the edges of reverse_P.
}
Finally, the value of the maximum flow is the sum of the Cs in the loop.
If you want to see the flow on each edge in the maximum flow you built, you can keep the initial graph somewhere; the flow on edge e is then original_capacity_e - current_capacity_e.
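As a minimal sketch of that loop body (in Java, with made-up names such as residual and parent, not the poster's code), the key part is the reverse-edge update: it is what lets a later augmenting path send flow "backwards" and reroute flow that an earlier path pushed along an edge, which is why the order in which paths are chosen doesn't matter:
// Sketch only: augment along the path found by BFS.
// parent[v] is the predecessor of v on the path; residual[u][v] is the
// remaining capacity of edge (u, v), i.e. capacity[u][v] - flow[u][v].
static double augment(double[][] residual, int[] parent, int s, int t) {
    // 1. find the bottleneck capacity C of the path
    double c = Double.MAX_VALUE;
    for (int v = t; v != s; v = parent[v]) {
        c = Math.min(c, residual[parent[v]][v]);
    }
    // 2. push C units of flow: forward edges lose capacity, reverse edges gain it,
    //    so a later path may cancel this flow by travelling the reverse edge
    for (int v = t; v != s; v = parent[v]) {
        int u = parent[v];
        residual[u][v] -= c;
        residual[v][u] += c;
    }
    return c;
}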


Bus Routes Algorithm

I'm trying to solve the following question:
https://leetcode.com/problems/bus-routes/
We have a list of bus routes. Each routes[i] is a bus route that the ith bus repeats forever. For example if routes[0] = [1, 5, 7], this means that the first bus (0th indexed) travels in the sequence 1→5→7→1→5→7→1→... forever.
We start at bus stop S (initially not on a bus), and we want to go to bus stop T. Travelling by buses only, what is the least number of buses we must take to reach our destination? Return -1 if it is not possible.
Example:
Input:
routes = [[1, 2, 7], [3, 6, 7]]
S = 1
T = 6
Output:
2
Explanation:
The best strategy is take the first bus to the bus stop 7, then take the second bus to the bus stop 6.
Note:
1 <= routes.length <= 500.
1 <= routes[i].length <= 500.
0 <= routes[i][j] < 10 ^ 6.
My idea is to treat each stop as a node. Each node has a color (the bus number) and a value (the stop number).
This problem would then be converted to a 0-1 BFS shortest-path problem.
Here's my code:
class Node {
int val;
int color;
boolean visited;
int distance;
public Node(int val, int color) {
this.val = val;
this.color = color;
this.visited = false;
this.distance = 0;
}
public String toString() {
return "{ val = " + this.val + ", color = " + this.color + " ,distance = " + this.distance + "}";
}
}
class Solution {
public int numBusesToDestination(int[][] routes, int S, int T) {
if(S == T) return 0;
// create nodes
// map each values node(s)
// distance between nodes of the same bus, have 0 distance
// if you're switching buses, the distance is 1
Map<Integer, List<Node>> adjacency = new HashMap<Integer, List<Node>>();
int color = 0;
Set<Integer> colorsToStartWith = new HashSet<Integer>();
for(int[] route : routes) {
for(int i = 0; i < route.length - 1; i++) {
int source = route[i];
int dest = route[i + 1];
adjacency.putIfAbsent(source, new ArrayList<Node>());
adjacency.putIfAbsent(dest, new ArrayList<Node>());
if(source == S) colorsToStartWith.add(color);
adjacency.get(source).add(new Node(dest, color));
adjacency.get(source).add(new Node(source, color));
}
if(route[route.length - 1] == S) colorsToStartWith.add(color);
adjacency.putIfAbsent(route[route.length - 1], new ArrayList<Node>());
adjacency.get(route[route.length - 1]).add(new Node(route[0], color));
adjacency.get(route[route.length - 1]).add(new Node(route[route.length - 1], color));
color++;
}
// run bfs
int minDistance = Integer.MAX_VALUE;
Deque<Node> q = new LinkedList<Node>();
Node start = new Node(S, 0);
start.distance = 1;
q.add(start);
boolean first = true;
boolean found = false;
while(!q.isEmpty()) {
Node current = q.remove();
current.visited = true;
System.out.println(current);
for(Node neighbor : adjacency.get(current.val)) {
if(!neighbor.visited) {
neighbor.visited = true;
if(neighbor.color == current.color || current.val == neighbor.val || first) {
q.addFirst(neighbor);
neighbor.distance = current.distance;
} else {
q.addLast(neighbor);
neighbor.distance = current.distance + 1;
}
if(neighbor.val == T) {
minDistance = Math.min(minDistance, neighbor.distance);
}
}
}
first = false;
}
return minDistance == Integer.MAX_VALUE ? -1 : minDistance;
}
}
I'm not sure why this is wrong.
The following test case fails:
Routes = [
[12,16,33,40,44,47,68,69,77,78,82,86,97],
[5,8,25,28,45,46,50,52,63,66,80,81,95,97],
[4,5,6,14,30,31,34,36,37,47,48,55,56,58,73,74,76,80,88,98],
[58,59],
[54,56,78,96,98],
[7,30,35,44,60,87,97],
[3,5,57,88],
[3,9,13,15,23,24,28,38,49,51,54,59,63,65,78,81,86,92,95],
[2,7,16,20,23,46,55,57,93],
[10,11,15,31,32,48,53,54,57,66,69,75,85,98],
[24,26,30,32,51,54,58,77,81],
[7,21,39,40,49,58,84,89],
[38,50,57],
[10,57],
[11,27,28,37,55,56,58,59,81,87,97],
[0,1,8,17,19,24,25,27,36,37,39,51,68,72,76,82,84,87,89],
[10,11,14,22,26,30,48,49,62,66,79,80,81,85,89,93,96,98],
[16,18,24,32,35,37,46,63,66,69,78,80,87,96],
[3,6,13,14,16,17,29,30,42,46,58,73,77,78,81],
[15,19,32,37,52,57,58,61,69,71,73,92,93]
]
S = 6
T = 30
What is the error in my code that makes this test fail?
The example input you give should return 1, since the one-but-last route contains both the source and target bus stop (6 and 30):
[3,6,13,14,16,17,29,30,42,46,58,73,77,78,81]
I ran your code with that input, and it returns 1, so your solution is rejected for another reason, which then must be a time out.
I see several causes for why your code is not optimal:
While ideally a BFS should stop when it has found the target node, your version must continue to visit all reachable unvisited nodes before it can decide what the solution is. So even if it finds the target on the same route as the source, it will continue to switch routes and do a lot of unnecessary work, even though there is no hope of finding a shorter path.
This is not how it is supposed to be. You should take care to perform your search in a way that gives priority to the edges that do not increase the distance, and only when there are no more of those, pick an edge that adds 1 to the distance. If you do it like that, you can stop the search as soon as you have found the target.
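For reference, a standard 0-1 BFS does exactly that with a double-ended queue: relaxing a 0-weight edge puts the node at the front (same level), a 1-weight edge puts it at the back (next level), so nodes come off the deque in non-decreasing distance order and the search can stop at the target. A minimal Java sketch with illustrative names, not taken from the code above:
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

class ZeroOneBfs {
    // edges[u] is a list of {v, w} pairs with w either 0 or 1
    static int shortest(List<int[]>[] edges, int source, int target) {
        int[] dist = new int[edges.length];
        Arrays.fill(dist, Integer.MAX_VALUE);
        Deque<Integer> deque = new ArrayDeque<>();
        dist[source] = 0;
        deque.addFirst(source);
        while (!deque.isEmpty()) {
            int u = deque.pollFirst();
            if (u == target) return dist[u]; // popped in distance order, so this is final
            for (int[] edge : edges[u]) {
                int v = edge[0], w = edge[1];
                if (dist[u] + w < dist[v]) {
                    dist[v] = dist[u] + w;
                    if (w == 0) deque.addFirst(v); // same distance: handle with current level
                    else deque.addLast(v);         // one switch further: next level
                }
            }
        }
        return -1; // target not reachable
    }
}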
A Node object is created repeatedly for the same combination of bus stop and "color" (i.e. route). As a consequence, when you later set visited to true, the duplicate Node objects will still have visited equal to false and so that bus stop will be visited several times with no gain.
You should make sure to only create new Node objects when there is no existing object with such combination yet.
Edges exist between two consecutive bus stops on the same route, meaning you may need to traverse several edges on the same route before finding the one that is either the target or the right place to switch to another route.
It would be more efficient to consider a whole route a Node: routes would be considered connected (with an edge) when they share at least one bus stop. Converting routes to Sets of bus stops would make it fast and easy to identify these edges.
The reflexive edges, from and to the same bus stop but specifying a color (route), also do not add to efficiency. The main issue you tried to solve with this setup is to make sure that the first choice of a route is free of charge (is not considered a switch). But that concern is no longer an issue if you apply the previous point.
Implementation
I chose JavaScript for the implementation, but I guess it won't be hard to rewrite this in Java:
function numBusesToDestination (routes, S, T) {
if (S === T) return 0;
// Create nodes of the graph
const nodes = routes;
// Map bus stops to routes: a map keyed by stops, with each an empty Set as value
const nodesAtBusStop = new Map([].concat(...routes.map(route => route.map(stop => [stop, new Set]))));
// ... and populate those empty Sets:
for (let node of nodes) {
for (let stop of node) {
nodesAtBusStop.get(stop).add(node);
}
}
// Build adjacency list of the graph
const adjList = new Map(nodes.map(node => [node, new Set]));
for (let [stop, nodes] of nodesAtBusStop.entries()) {
for (let a of nodes) {
for (let b of nodes) {
if (a !== b) adjList.get(a).add(b);
}
}
}
const startNodes = nodesAtBusStop.get(S);
const targetNodes = nodesAtBusStop.get(T);
if (!startNodes || !targetNodes) return -1;
// BFS
let queue = [...startNodes];
let distance = 1;
let visited = new Set;
while (queue.length) {
// Create a new queue for each distance increment
let nextLevel = [];
for (let node of queue) {
if (visited.has(node)) continue;
visited.add(node);
if (targetNodes.has(node)) return distance;
nextLevel.push(...adjList.get(node));
}
queue = nextLevel;
distance++;
}
return -1;
};
// I/O handling
(document.oninput = function () {
let result = "invalid JSON";
try {
let routes = JSON.parse(document.querySelector("#inputRoutes").value);
let S = +document.querySelector("#inputStart").value;
let T = +document.querySelector("#inputTarget").value;
result = numBusesToDestination(routes, S, T);
}
catch (e) {}
document.querySelector("#output").textContent = result;
})();
#inputRoutes { width: 100% }
Routes in JSON format:
<textarea id="inputRoutes">[[1,2,7],[3,6,7]]</textarea><br>
Start: <input id="inputStart" value="1"><br>
Target: <input id="inputTarget" value="6"><br>
Distance: <span id="output"></span>
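For completeness, a rough Java sketch of the same idea (a route is a node; two routes are adjacent when they share a stop; plain level-by-level BFS). This is only an illustration of the approach, not the answer's own code:
import java.util.*;

class RouteBfs {
    public int numBusesToDestination(int[][] routes, int S, int T) {
        if (S == T) return 0;
        // For each stop, the indices of the routes that serve it.
        Map<Integer, List<Integer>> routesAtStop = new HashMap<>();
        List<Set<Integer>> stopSets = new ArrayList<>();
        for (int r = 0; r < routes.length; r++) {
            stopSets.add(new HashSet<>());
            for (int stop : routes[r]) {
                stopSets.get(r).add(stop);
                routesAtStop.computeIfAbsent(stop, k -> new ArrayList<>()).add(r);
            }
        }
        if (!routesAtStop.containsKey(S) || !routesAtStop.containsKey(T)) return -1;
        // BFS over routes: start from every route serving S, one level per bus taken.
        boolean[] visited = new boolean[routes.length];
        Queue<Integer> queue = new ArrayDeque<>();
        for (int r : routesAtStop.get(S)) { visited[r] = true; queue.add(r); }
        int buses = 1;
        while (!queue.isEmpty()) {
            int levelSize = queue.size();
            for (int k = 0; k < levelSize; k++) {
                int r = queue.poll();
                if (stopSets.get(r).contains(T)) return buses;
                // Neighbouring routes: any unvisited route sharing a stop with r.
                for (int stop : routes[r]) {
                    for (int next : routesAtStop.get(stop)) {
                        if (!visited[next]) { visited[next] = true; queue.add(next); }
                    }
                }
            }
            buses++;
        }
        return -1;
    }
}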

Verification of algorithm for variant of gas station

I am studying this problem and I recognise it as a variant of the gas station problem. As a result, I use a greedy algorithm to solve it. Could anyone help me check whether my algorithm is correct or not? Thanks.
My algorithm
var x = input.distance, cost = input.cost, c = input.travelDistance, price = [Number.POSITIVE_INFINITY];
var result = [];
var lastFill = 0, tempMinIndex = 0, totalCost = 0;
for(var i=1; i<x.length; i++) {
var d = x[i] - x[lastFill];
if(d > c){ //car can not travel to this shop, has to decide which shop to refill in the previous possible shops
result.push(tempMinIndex);
lastFill = tempMinIndex;
totalCost += price[tempMinIndex];
tempMinIndex = i;
}
//calculate price
price[i] = d/c * cost[i];
if(price[i] <= price[tempMinIndex])
tempMinIndex = i;
}
//add last station to the list and the total cost
if(lastFill != x.length - 1){
result.push(x.length - 1);
totalCost += price[price.length-1];
}
You can try out the algorithm at this link
https://drive.google.com/file/d/0B4sd8MQwTpVnMXdCRU0xZFlVRlk/view?usp=sharing
First, regarding your solution.
There is a bug that breaks it even on the simplest inputs. When you decide that the distance has become too far and you should have filled up at some earlier shop, you don't update the distance, so the gas station charges you more than it should. The fix is simple:
if(d > c){
//car can not travel to this shop, has to decide which shop to refill
//in the previous possible shops
result.push(tempMinIndex);
lastFill = tempMinIndex;
totalCost += price[tempMinIndex];
tempMinIndex = i;
// Fix: update distance
var d = x[i] - x[lastFill];
}
Even with this fix, your algorithm fails on some input data, like this (first line: distances, second line: costs, third line: travel distance per full tank):
0 10 20 30
0 20 30 50
30
It should refill at every gas station to minimize cost, but it simply fills up at the last one.
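Under the cost model used in the code (price paid at a station = distance driven since the last fill / capacity, times the cost at that station), filling at every station costs 10/30·20 + 10/30·30 + 10/30·50 ≈ 33.3, while driving straight through and paying only at the last station costs 30/30·50 = 50, so the greedy choice is clearly not optimal here.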
After some research, I came up with a solution. I'll try to explain it as simply as possible to keep it language independent.
Idea
For every gas station G we will compute the cheapest cost of filling. We'll do that recursively: for each gas station G, find all gas stations i from which we can reach G. For every such i, take the cheapest cost of filling at i and add the cost of the fill-up needed at G for the gasoline used since i. For the start gas station the cost is 0. More formally (this recurrence is what the code below implements):
BestCost(Start) = 0
BestCost(G) = min over all i with Position(G) - Position(i) <= Capacity of
              BestCost(i) + CostOfFilling(G) * (Position(G) - Position(i)) / Capacity
CostOfFilling(x), Capacity and Position(x) can be retrieved from the input data.
So, the answer to the problem is simply BestCost(LastGasStation).
Code
Now, the solution in JavaScript to make things clearer.
function calculate(input)
{
// Array for keeping calculated values of cheapest filling at each station
best = [];
var x = input.distance;
var cost = input.cost;
var capacity = input.travelDistance;
// Array initialization
best.push(0);
for (var i = 0; i < x.length - 1; i++)
{
best.push(-1);
}
var answer = findBest(x, cost, capacity, x.length - 1);
return answer;
}
// Implementation of BestCost function
var findBest = function(distances, costs, capacity, distanceIndex)
{
// Return value if it's already have been calculated
if (best[distanceIndex] != -1)
{
return best[distanceIndex];
}
// Find cheapest way to fill by iterating on every available gas station
var minDistanceIndex = findMinDistance(capacity, distances, distanceIndex);
var answer = findBest(distances, costs, capacity, minDistanceIndex) +
calculateCost(distances, costs, capacity, minDistanceIndex, distanceIndex);
for (var i = minDistanceIndex + 1; i < distanceIndex; i++)
{
var newAnswer = findBest(distances, costs, capacity, i) +
calculateCost(distances, costs, capacity, i, distanceIndex);
if (newAnswer < answer)
{
answer = newAnswer;
}
}
// Save best result
best[distanceIndex] = answer;
return answer;
}
// Implementation of MinGasStation function
function findMinDistance(capacity, distances, distanceIndex)
{
for (var i = 0; i < distances.length; i++)
{
if (distances[distanceIndex] - distances[i] <= capacity)
{
return i;
}
}
}
// Implementation of Cost function
function calculateCost(distances, costs, capacity, a, b)
{
var distance = distances[b] - distances[a];
return costs[b] * (distance / capacity);
}
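(As a quick check against the counterexample above, calling calculate({ distance: [0, 10, 20, 30], cost: [0, 20, 30, 50], travelDistance: 30 }) should return 100/3 ≈ 33.33, the fill-at-every-station total, rather than the greedy answer of 50.)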
A full working HTML page with the code is available here.

How to model the three-glass pouring puzzle as a graph

I want to model the following puzzle with a graph.
The barman gives you three glasses whose sizes are 1000ml, 700ml, and 400ml, respectively. The 700ml and 400ml glasses start out full of beer, but the 1000ml glass is initially empty. You can get unlimited free beer if you win the following game:
Game rule: You can keep pouring beer from one glass into another, stopping only when the source glass is empty or the destination glass is full. You win if there is a sequence of pourings that leaves exactly 200ml in the 700ml or 400ml glass.
I was a little unsure of how to translate this problem into a graph. My thought was that the glasses would be represented by nodes in a weighted, undirected graph, where an edge indicates that glass u can be poured into glass v and vice versa, so that a walk would be a sequence of pourings leading to the correct solution.
However, this approach of having three single nodes and undirected edges doesn't quite work for Dijkstra's algorithm or other greedy algorithms, which is what I was going to use to solve the problem. Would modeling the permutations of the pourings as a graph be more suitable?
You should store the whole state as a vertex. I mean, the amount in each glass is one component of the state, so a state is an array of glassesCount numbers. For example, the initial state is (700, 400, 0).
After that you should add the initial state to a queue and run BFS. BFS is applicable because each edge has equal weight 1. The weights are equal because the weight is the number of pourings between two states, which is obviously 1, as we only generate states reachable in one pouring from each state in the queue.
You may also use DFS, but BFS returns the shortest sequence of pourings, because BFS gives shortest paths in graphs with unit edge weights. If you are not interested in the shortest sequence of pourings but in any solution, DFS is fine. I will describe BFS because it has the same complexity as DFS and returns a better (shorter) solution.
In each state of the BFS you have to generate all possible new states by pouring between all ordered pairs of glasses. Also, you should check whether the pouring is possible.
For 3 glasses there are 3*(3-1) = 6 possible branches from each state, but I implemented a more generic solution that allows you to use my code for N glasses.
public class Solution{
static HashSet<State> usedStates = new HashSet<State>();
static HashMap<State,State> prev = new HashMap<State, State>();
static ArrayDeque<State> queue = new ArrayDeque<State>();
static short[] limits = new short[]{700,400,1000};
public static void main(String[] args){
State initialState = new State(new Short[]{700,400,0});
usedStates.add(initialState);
queue.add(initialState);
prev.put(initialState,null);
boolean solutionFound = false;
while(!queue.isEmpty()){
State curState = queue.poll();
if(curState.isWinning()){
printSolution(curState);
solutionFound = true;
break; //stop BFS even if queue is not empty because solution already found
}
// go to all possible states
for(int i=0;i<curState.getGlasses().length;i++)
for(int j=0;j<curState.getGlasses().length;j++) {
if (i != j) { //pouring from i-th glass to j-th glass, can't pour to itself
short glassI = curState.getGlasses()[i];
short glassJ = curState.getGlasses()[j];
short possibleToPour = (short)(limits[j]-glassJ);
short amountToPour;
if(glassI<possibleToPour) amountToPour = glassI; //pour total i-th glass
else amountToPour = possibleToPour; //pour i-th glass partially
if(glassI!=0){ //prepare new state
Short[] newGlasses = Arrays.copyOf(curState.getGlasses(), curState.getGlasses().length);
newGlasses[i] = (short)(glassI-amountToPour);
newGlasses[j] = (short)(newGlasses[j]+amountToPour);
State newState = new State(newGlasses);
if(!usedStates.contains(newState)){ // if new state not handled before mark it as used and add to queue for future handling
usedStates.add(newState);
prev.put(newState, curState);
queue.add(newState);
}
}
}
}
}
if(!solutionFound) System.out.println("Solution does not exist");
}
private static void printSolution(State curState) {
System.out.println("below is 'reversed' solution. In order to get solution from initial state read states from the end");
while(curState!=null){
System.out.println("("+curState.getGlasses()[0]+","+curState.getGlasses()[1]+","+curState.getGlasses()[2]+")");
curState = prev.get(curState);
}
}
static class State{
private Short[] glasses;
public State(Short[] glasses){
this.glasses = glasses;
}
public boolean isWinning() {
return glasses[0]==200 || glasses[1]==200;
}
public Short[] getGlasses(){
return glasses;
}
@Override
public boolean equals(Object other){
return Arrays.equals(glasses,((State)other).getGlasses());
}
@Override
public int hashCode(){
return Arrays.hashCode(glasses);
}
}
}
Output:
below is 'reversed' solution. In order to get solution from initial
state read states from the end
(700,200,200)
(500,400,200)
(500,0,600)
(100,400,600)
(100,0,1000)
(700,0,400)
(700,400,0)
An interesting fact: this problem has no solution if you replace
200ml in g1 OR g2
with
200ml in g1 AND g2.
I mean, the state (200,200,700) is unreachable from (700,400,0). (You can check this by letting the same BFS run to completion and observing that (200,200,700) never enters usedStates.)
If we want to model this problem with a graph, each node should represent a possible assignment of beer volume to glasses. Suppose we represent each glass with an object like this:
{ volume: <current volume>, max: <maximum volume> }
Then the starting node is a list of three such objects:
[ { volume: 0, max: 1000 }, { volume: 700, max: 700 }, { volume: 400, max: 400 } ]
An edge represents the action of pouring one glass into another. To perform such an action, we pick a source glass and a target glass, then calculate how much we can pour from the source to the target:
function pour(indexA, indexB, glasses) { // Pour from A to B.
var a = glasses[indexA],
b = glasses[indexB],
delta = Math.min(a.volume, b.max - b.volume);
a.volume -= delta;
b.volume += delta;
}
From the starting node we try pouring from each glass to every other glass. Each of these actions results in a new assignment of beer volumes. We check each one to see if we have achieved the target volume of 200. If not, we push the assignment into a queue.
To find the shortest path from the starting node to a target node, we push newly discovered nodes onto the head of the queue and pop nodes off the end of the queue. This ensures that when we reach a target node, it is no farther from the starting node than any other node in the queue.
To make it possible to reconstruct the shortest path, we store the predecessor of each node in a dictionary. We can use the same dictionary to make sure that we don't explore a node more than once.
The following is a JavaScript implementation of this approach.
function pour(indexA, indexB, glasses) { // Pour from A to B.
var a = glasses[indexA],
b = glasses[indexB],
delta = Math.min(a.volume, b.max - b.volume);
a.volume -= delta;
b.volume += delta;
}
function glassesToKey(glasses) {
return JSON.stringify(glasses);
}
function keyToGlasses(key) {
return JSON.parse(key);
}
function print(s) {
s = s || '';
document.write(s + '<br />');
}
function displayKey(key) {
var glasses = keyToGlasses(key);
parts = glasses.map(function (glass) {
return glass.volume + '/' + glass.max;
});
print('volumes: ' + parts.join(', '));
}
var startGlasses = [ { volume: 0, max: 1000 },
{ volume: 700, max: 700 },
{ volume: 400, max: 400 } ];
var startKey = glassesToKey(startGlasses);
function solve(targetVolume) {
var actions = {},
queue = [ startKey ],
tail = 0;
while (tail < queue.length) {
var key = queue[tail++]; // Pop from tail.
for (var i = 0; i < startGlasses.length; ++i) { // Pick source.
for (var j = 0; j < startGlasses.length; ++j) { // Pick target.
if (i != j) {
var glasses = keyToGlasses(key);
pour(i, j, glasses);
var nextKey = glassesToKey(glasses);
if (actions[nextKey] !== undefined) {
continue;
}
actions[nextKey] = { key: key, source: i, target: j };
for (var k = 1; k < glasses.length; ++k) {
if (glasses[k].volume === targetVolume) { // Are we done?
var path = [ actions[nextKey] ];
while (key != startKey) { // Backtrack.
var action = actions[key];
path.push(action);
key = action.key;
}
path.reverse();
path.forEach(function (action) { // Display path.
displayKey(action.key);
print('pour from glass ' + (action.source + 1) +
' to glass ' + (action.target + 1));
print();
});
displayKey(nextKey);
return;
}
queue.push(nextKey);
}
}
}
}
}
}
solve(200);
body {
font-family: monospace;
}
I had the idea of demonstrating the elegance of constraint programming after the two independent brute-force solutions above were given. It doesn't actually answer the OP's question, it just solves the puzzle. Admittedly, I expected it to be shorter.
par int:N = 7; % only an alcoholic would try more than 7 moves
var 1..N: n; % the sequence of states is clearly at least length 1. ie the start state
int:X = 10; % capacities
int:Y = 7;
int:Z = 4;
int:T = Y + Z;
array[0..N] of var 0..X: x; % the amount of liquid in glass X the biggest
array[0..N] of var 0..Y: y;
array[0..N] of var 0..Z: z;
constraint x[0] = 0; % initial contents
constraint y[0] = 7;
constraint z[0] = 4;
% the total amount of liquid is the same as the initial amount at all times
constraint forall(i in 0..n)(x[i] + y[i] + z[i] = T);
% we get free unlimited beer if any of these glasses contains 2dl
constraint y[n] = 2 \/ z[n] = 2;
constraint forall(i in 0..n-1)(
% d is the amount we can pour from one glass to another: 6 ways to do it
let {var int: d = min(y[i], X-x[i])} in (x[i+1] = x[i] + d /\ y[i+1] = y[i] - d) \/ % y to x
let {var int: d = min(z[i], X-x[i])} in (x[i+1] = x[i] + d /\ z[i+1] = z[i] - d) \/ % z to x
let {var int: d = min(x[i], Y-y[i])} in (y[i+1] = y[i] + d /\ x[i+1] = x[i] - d) \/ % x to y
let {var int: d = min(z[i], Y-y[i])} in (y[i+1] = y[i] + d /\ z[i+1] = z[i] - d) \/ % z to y
let {var int: d = min(y[i], Z-z[i])} in (z[i+1] = z[i] + d /\ y[i+1] = y[i] - d) \/ % y to z
let {var int: d = min(x[i], Z-z[i])} in (z[i+1] = z[i] + d /\ x[i+1] = x[i] - d) % x to z
);
solve minimize n;
output[show(n), "\n\n", show(x), "\n", show(y), "\n", show(z)];
and the output is
[0, 4, 10, 6, 6, 2, 2]
[7, 7, 1, 1, 5, 5, 7]
[4, 0, 0, 4, 0, 4, 2]
which luckily coincides with the other solutions. Feed it to the MiniZinc solver and wait... and wait. No loops, no BFS, no DFS.

Optimized TSP Algorithms

I am interested in ways to improve or come up with algorithms that are able to solve the Travelling salesman problem for about n = 100 to 200 cities.
The wikipedia link I gave lists various optimizations, but it does so at a pretty high level, and I don't know how to go about actually implementing them in code.
There are industrial-strength solvers out there, such as Concorde, but those are way too complex for what I want, and the classic solutions that flood search results for TSP all present randomized algorithms or the classic backtracking or dynamic programming algorithms that only work for about 20 cities.
So, does anyone know how to implement a simple (by simple I mean that an implementation doesn't take more than 100-200 lines of code) TSP solver that works in reasonable time (a few seconds) for at least 100 cities? I am only interested in exact solutions.
You may assume that the input will be randomly generated, so I don't care for inputs that are aimed specifically at breaking a certain algorithm.
200 lines and no libraries is a tough constraint. The advanced solvers use branch and bound with the Held–Karp relaxation, and I'm not sure if even the most basic version of that would fit into 200 normal lines. Nevertheless, here's an outline.
Held Karp
One way to write TSP as an integer program is as follows (Dantzig, Fulkerson, Johnson). For every edge e, the constant w_e denotes the length of edge e, and the variable x_e is 1 if edge e is on the tour and 0 otherwise. For every subset S of vertices, ∂(S) denotes the edges connecting a vertex in S with a vertex not in S.
minimize   Σ (over edges e) w_e · x_e
subject to
1. for every vertex v,   Σ (over edges e in ∂({v})) x_e = 2
2. for every nonempty proper subset S of the vertices,   Σ (over edges e in ∂(S)) x_e ≥ 2
3. for every edge e in E,   x_e ∈ {0, 1}
Condition 1 ensures that the set of edges is a collection of tours. Condition 2 ensures that there's only one. (Otherwise, let S be the set of vertices visited by one of the tours.) The Held–Karp relaxation is obtained by replacing
3. for every edge e in E,   x_e ∈ {0, 1}
with
3. for every edge e in E,   0 ≤ x_e ≤ 1
Held–Karp is a linear program but it has an exponential number of constraints. One way to solve it is to introduce Lagrange multipliers and then do subgradient optimization. That boils down to a loop that computes a minimum spanning tree and then updates some vectors, but the details are sort of involved. Besides "Held–Karp" and "subgradient (descent|optimization)", "1-tree" is another useful search term.
(A slower alternative is to write an LP solver and introduce subtour constraints as they are violated by previous optima. This means writing an LP solver and a min-cut procedure, which is also more code, but it might extend better to more exotic TSP constraints.)
Branch and bound
By "partial solution", I mean an partial assignment of variables to 0 or 1, where an edge assigned 1 is definitely in the tour, and an edge assigned 0 is definitely out. Evaluating Held–Karp with these side constraints gives a lower bound on the optimum tour that respects the decisions already made (an extension).
Branch and bound maintains a set of partial solutions, at least one of which extends to an optimal solution. The pseudocode for one variant, depth-first search with best-first backtracking is as follows.
let h be an empty minheap of partial solutions, ordered by Held–Karp value
let bestsolsofar = null
let cursol be the partial solution with no variables assigned
loop
    while cursol is not a complete solution and cursol's H–K value is at least as good as the value of bestsolsofar
        choose a branching variable v
        let sol0 be cursol union {v -> 0}
        let sol1 be cursol union {v -> 1}
        evaluate sol0 and sol1
        let cursol be the better of the two; put the other in h
    end while
    if cursol is better than bestsolsofar then
        let bestsolsofar = cursol
        delete all heap nodes worse than cursol
    end if
    if h is empty then stop; we've found the optimal solution
    pop the minimum element of h and store it in cursol
end loop
The idea of branch and bound is that there's a search tree of partial solutions. The point of solving Held–Karp is that the value of the LP is at most the length OPT of the optimal tour but also conjectured to be at least 3/4 OPT (in practice, usually closer to OPT).
The one detail in the pseudocode I've left out is how to choose the branching variable. The goal is usually to make the "hard" decisions first, so fixing a variable whose value is already near 0 or 1 is probably not wise. One option is to choose the closest to 0.5, but there are many, many others.
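As an illustration of the "closest to 0.5" rule, a small Java sketch (illustrative only; the implementation below branches differently, by deleting edges of a vertex whose 1-tree degree exceeds 2):
// x[e] is the LP value of edge variable e in the current Held–Karp solution
static int chooseBranchingVariable(double[] x) {
    int best = -1;
    double bestGap = Double.MAX_VALUE;
    for (int e = 0; e < x.length; e++) {
        if (x[e] > 1e-9 && x[e] < 1 - 1e-9) { // still fractional
            double gap = Math.abs(x[e] - 0.5);
            if (gap < bestGap) { bestGap = gap; best = e; }
        }
    }
    return best; // -1 means the LP solution is already integral
}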
EDIT
Java implementation. 198 nonblank, noncomment lines. I forgot that 1-trees don't work with assigning variables to 1, so I branch by finding a vertex whose 1-tree has degree >2 and delete each edge in turn. This program accepts TSPLIB instances in EUC_2D format, e.g., eil51.tsp and eil76.tsp and eil101.tsp and lin105.tsp from http://www2.iwr.uni-heidelberg.de/groups/comopt/software/TSPLIB95/tsp/.
// simple exact TSP solver based on branch-and-bound/Held--Karp
import java.io.*;
import java.util.*;
import java.util.regex.*;
public class TSP {
// number of cities
private int n;
// city locations
private double[] x;
private double[] y;
// cost matrix
private double[][] cost;
// matrix of adjusted costs
private double[][] costWithPi;
Node bestNode = new Node();
public static void main(String[] args) throws IOException {
// read the input in TSPLIB format
// assume TYPE: TSP, EDGE_WEIGHT_TYPE: EUC_2D
// no error checking
TSP tsp = new TSP();
tsp.readInput(new InputStreamReader(System.in));
tsp.solve();
}
public void readInput(Reader r) throws IOException {
BufferedReader in = new BufferedReader(r);
Pattern specification = Pattern.compile("\\s*([A-Z_]+)\\s*(:\\s*([0-9]+))?\\s*");
Pattern data = Pattern.compile("\\s*([0-9]+)\\s+([-+.0-9Ee]+)\\s+([-+.0-9Ee]+)\\s*");
String line;
while ((line = in.readLine()) != null) {
Matcher m = specification.matcher(line);
if (!m.matches()) continue;
String keyword = m.group(1);
if (keyword.equals("DIMENSION")) {
n = Integer.parseInt(m.group(3));
cost = new double[n][n];
} else if (keyword.equals("NODE_COORD_SECTION")) {
x = new double[n];
y = new double[n];
for (int k = 0; k < n; k++) {
line = in.readLine();
m = data.matcher(line);
m.matches();
int i = Integer.parseInt(m.group(1)) - 1;
x[i] = Double.parseDouble(m.group(2));
y[i] = Double.parseDouble(m.group(3));
}
// TSPLIB distances are rounded to the nearest integer to avoid the sum of square roots problem
for (int i = 0; i < n; i++) {
for (int j = 0; j < n; j++) {
double dx = x[i] - x[j];
double dy = y[i] - y[j];
cost[i][j] = Math.rint(Math.sqrt(dx * dx + dy * dy));
}
}
}
}
}
public void solve() {
bestNode.lowerBound = Double.MAX_VALUE;
Node currentNode = new Node();
currentNode.excluded = new boolean[n][n];
costWithPi = new double[n][n];
computeHeldKarp(currentNode);
PriorityQueue<Node> pq = new PriorityQueue<Node>(11, new NodeComparator());
do {
do {
boolean isTour = true;
int i = -1;
for (int j = 0; j < n; j++) {
if (currentNode.degree[j] > 2 && (i < 0 || currentNode.degree[j] < currentNode.degree[i])) i = j;
}
if (i < 0) {
if (currentNode.lowerBound < bestNode.lowerBound) {
bestNode = currentNode;
System.err.printf("%.0f", bestNode.lowerBound);
}
break;
}
System.err.printf(".");
PriorityQueue<Node> children = new PriorityQueue<Node>(11, new NodeComparator());
children.add(exclude(currentNode, i, currentNode.parent[i]));
for (int j = 0; j < n; j++) {
if (currentNode.parent[j] == i) children.add(exclude(currentNode, i, j));
}
currentNode = children.poll();
pq.addAll(children);
} while (currentNode.lowerBound < bestNode.lowerBound);
System.err.printf("%n");
currentNode = pq.poll();
} while (currentNode != null && currentNode.lowerBound < bestNode.lowerBound);
// output suitable for gnuplot
// set style data vector
System.out.printf("# %.0f%n", bestNode.lowerBound);
int j = 0;
do {
int i = bestNode.parent[j];
System.out.printf("%f\t%f\t%f\t%f%n", x[j], y[j], x[i] - x[j], y[i] - y[j]);
j = i;
} while (j != 0);
}
private Node exclude(Node node, int i, int j) {
Node child = new Node();
child.excluded = node.excluded.clone();
child.excluded[i] = node.excluded[i].clone();
child.excluded[j] = node.excluded[j].clone();
child.excluded[i][j] = true;
child.excluded[j][i] = true;
computeHeldKarp(child);
return child;
}
private void computeHeldKarp(Node node) {
node.pi = new double[n];
node.lowerBound = Double.MIN_VALUE;
node.degree = new int[n];
node.parent = new int[n];
double lambda = 0.1;
while (lambda > 1e-06) {
double previousLowerBound = node.lowerBound;
computeOneTree(node);
if (!(node.lowerBound < bestNode.lowerBound)) return;
if (!(node.lowerBound < previousLowerBound)) lambda *= 0.9;
int denom = 0;
for (int i = 1; i < n; i++) {
int d = node.degree[i] - 2;
denom += d * d;
}
if (denom == 0) return;
double t = lambda * node.lowerBound / denom;
for (int i = 1; i < n; i++) node.pi[i] += t * (node.degree[i] - 2);
}
}
private void computeOneTree(Node node) {
// compute adjusted costs
node.lowerBound = 0.0;
Arrays.fill(node.degree, 0);
for (int i = 0; i < n; i++) {
for (int j = 0; j < n; j++) costWithPi[i][j] = node.excluded[i][j] ? Double.MAX_VALUE : cost[i][j] + node.pi[i] + node.pi[j];
}
int firstNeighbor;
int secondNeighbor;
// find the two cheapest edges from 0
if (costWithPi[0][2] < costWithPi[0][1]) {
firstNeighbor = 2;
secondNeighbor = 1;
} else {
firstNeighbor = 1;
secondNeighbor = 2;
}
for (int j = 3; j < n; j++) {
if (costWithPi[0][j] < costWithPi[0][secondNeighbor]) {
if (costWithPi[0][j] < costWithPi[0][firstNeighbor]) {
secondNeighbor = firstNeighbor;
firstNeighbor = j;
} else {
secondNeighbor = j;
}
}
}
addEdge(node, 0, firstNeighbor);
Arrays.fill(node.parent, firstNeighbor);
node.parent[firstNeighbor] = 0;
// compute the minimum spanning tree on nodes 1..n-1
double[] minCost = costWithPi[firstNeighbor].clone();
for (int k = 2; k < n; k++) {
int i;
for (i = 1; i < n; i++) {
if (node.degree[i] == 0) break;
}
for (int j = i + 1; j < n; j++) {
if (node.degree[j] == 0 && minCost[j] < minCost[i]) i = j;
}
addEdge(node, node.parent[i], i);
for (int j = 1; j < n; j++) {
if (node.degree[j] == 0 && costWithPi[i][j] < minCost[j]) {
minCost[j] = costWithPi[i][j];
node.parent[j] = i;
}
}
}
addEdge(node, 0, secondNeighbor);
node.parent[0] = secondNeighbor;
node.lowerBound = Math.rint(node.lowerBound);
}
private void addEdge(Node node, int i, int j) {
double q = node.lowerBound;
node.lowerBound += costWithPi[i][j];
node.degree[i]++;
node.degree[j]++;
}
}
class Node {
public boolean[][] excluded;
// Held--Karp solution
public double[] pi;
public double lowerBound;
public int[] degree;
public int[] parent;
}
class NodeComparator implements Comparator<Node> {
public int compare(Node a, Node b) {
return Double.compare(a.lowerBound, b.lowerBound);
}
}
If your graph satisfies the triangle inequality and you want a guarantee of being within 3/2 of the optimum, I suggest the Christofides algorithm. I've written an implementation in PHP at phpclasses.org.
As of 2013, it is possible to solve for 100 cities using only the exact formulation in CPLEX. Add the degree equations for each vertex, but include subtour-elimination constraints only as they appear. Most of them are not necessary. CPLEX has an example of this.
You should be able to solve for 100 cities. You will have to iterate every time a new subtour is found. I ran an example here and, a couple of minutes and 100 iterations later, I got my results.
I took the Held-Karp algorithm from the Concorde library and 25 cities are solved in 0.15 seconds. This performance is perfectly good for me! You can extract the Held-Karp code (written in ANSI C) from the Concorde library: http://www.math.uwaterloo.ca/tsp/concorde/downloads/downloads.htm. If the download has the extension gz, it should be tgz; you might need to rename it. Then you should make small adjustments to port it to VC++. First take the files heldkarp.h and heldkarp.c (rename it to .cpp) and about 5 other files, make the adjustments, and it should work by calling CCheldkarp_small(...) with edgelen: euclid_ceiling_edgelen.
TSP is an NP-hard problem. (As far as we know) there is no algorithm for NP-hard problems that runs in polynomial time, so you are asking for something that doesn't exist.
It's either fast enough to finish in a reasonable time, and then it's not exact, or exact but won't finish in your lifetime for 100 cities.
To give a dumb answer: me too. Everyone is interested in such an algorithm, but as others have already stated: it does not (yet?) exist. Especially your combination of exact, 200 nodes, a few seconds of runtime and just 200 lines of code is impossible. You already know that it is NP-hard, and if you have the slightest impression of asymptotic behaviour you should know that there is no way of achieving this (unless you prove that NP = P, and even that I would say is not possible). Even the exact commercial solvers need far more than a few seconds for such instances and, as you can imagine, they have far more than 200 lines of code (even if you just consider their kernels).
EDIT: The wiki algorithms are the "usual suspects" of the field: linear programming and branch-and-bound. Their solutions for the instances with thousands of nodes took years to solve (they just did it with very many CPUs in parallel, so they could do it faster). Some even use problem-specific knowledge for the bounding in branch-and-bound, so they are not general approaches.
Branch and bound just enumerates all possible paths (e.g. with backtracking) and, once it has a solution, uses it to stop a started recursion when it can prove that the result is not better than the already found solution (e.g. if you have only visited 2 of your cities and the path is already longer than a found 200-city tour, you can discard all tours that start with that 2-city combination). Here you can invest very much problem-specific knowledge in the function that tells you that the path is not going to be better than the already found solution. The better it is, the fewer paths you have to look at, and the faster your algorithm is.
Linear programming is an optimization method for solving linear inequality problems. It works in polynomial time (simplex only in practice, but that doesn't matter here), but the solution is real-valued. When you add the constraint that the solution must be integer, it becomes NP-complete. For small instances it is possible, e.g., to solve the relaxation, then look at which variable of the solution violates the integrality condition and add additional inequalities to change it (this is called a cutting plane; the name comes from the fact that the inequalities define (higher-dimensional) planes, the solution space is a polytope, and by adding additional inequalities you cut something off the polytope with a plane). The topic is very complex, and even a plain simplex is hard to understand if you don't want to dive deep into the math. There are several good books about it; one of the better ones is Chvátal's Linear Programming, but there are several more.
I have a theory, but I've never had the time to pursue it:
The TSP is a bounding problem (a single shape where all points lie on the perimeter) where the optimal solution is the one with the shortest perimeter.
There are plenty of simple ways to get all the points that lie on a minimum bounding perimeter (imagine a large elastic band stretched around a bunch of nails in a large board).
My theory is that if you start pushing in on the elastic band so that the length of band increases by the same amount between adjacent points on the perimeter, and each segment remains in the shape of an elliptical arc, the stretched elastic will cross points on the optimal path before crossing points on non-optimal paths. See this page on mathopenref.com on drawing ellipses, particularly steps 5 and 6. Points on the bounding perimeter can be viewed as the focal points of the ellipse (F1, F2) in the images below.
What I don't know is if the "bubble stretching" process needs to be reset after each new point is added, or if the existing "bubbles" continue to grow and each new point on the perimeter causes only the localized "bubble" to turn into two line segments. I'll leave that for you to figure out.

Cycle finding algorithm

I need to find a cycle beginning and ending at a given point. It is not guaranteed that one exists.
I use bool[,] points to indicate which points can be in the cycle. Points can only lie on a grid, and points indicates whether a given grid point can be in the cycle.
I need to find this cycle using the minimum number of points.
One point can be used only once.
Connections can only be vertical or horizontal.
Let this be our points (red is starting point):
[image removed: dead ImageShack link]
I realized that I can do this:
while(numberOfPointsChanged)
{
//remove points that are alone in row or column
}
So I have:
[image removed: dead ImageShack link]
Now I can find the path.
[image removed: dead ImageShack link]
But what if there are points that are not deleted by this loop but should not be in the path?
I have written code:
class MyPoint
{
public int X { get; set; }
public int Y { get; set; }
public List<MyPoint> Neighbours = new List<MyPoint>();
public MyPoint parent = null;
public bool marked = false;
}
private static MyPoint LoopSearch2(bool[,] mask, int supIndexStart, int recIndexStart)
{
List<MyPoint> points = new List<MyPoint>();
//here begins translation bool[,] to list of points
points.Add(new MyPoint { X = recIndexStart, Y = supIndexStart });
for (int i = 0; i < mask.GetLength(0); i++)
{
for (int j = 0; j < mask.GetLength(1); j++)
{
if (mask[i, j])
{
points.Add(new MyPoint { X = j, Y = i });
}
}
}
for (int i = 0; i < points.Count; i++)
{
for (int j = 0; j < points.Count; j++)
{
if (i != j)
{
if (points[i].X == points[j].X || points[i].Y == points[j].Y)
{
points[i].Neighbours.Add(points[j]);
}
}
}
}
//end of translating
List<MyPoint> queue = new List<MyPoint>();
MyPoint start = (points[0]); //beginning point
start.marked = true; //it is marked
MyPoint last=null; //last point. this will be returned
queue.Add(points[0]);
while(queue.Count>0)
{
MyPoint current = queue.First(); //taking point from queue
queue.Remove(current); //removing it
foreach(MyPoint neighbour in current.Neighbours) //checking Neighbours
{
if (!neighbour.marked) //in neighbour isn't marked adding it to queue
{
neighbour.marked = true;
neighbour.parent = current;
queue.Add(neighbour);
}
//if neighbour is marked checking if it is startig point and if neighbour's parent is current point. if it is not that means that loop already got here so we start searching parents to got to starting point
else if(!neighbour.Equals(start) && !neighbour.parent.Equals(current))
{
current = neighbour;
while(true)
{
if (current.parent.Equals(start))
{
last = current;
break;
}
else
current = current.parent;
}
break;
}
}
}
return last;
}
But it doesn't work. The path it finds contains only two points: the start and its first neighbour.
What am I doing wrong?
EDIT:
I forgot to mention... After a horizontal connection there has to be a vertical one, then horizontal, then vertical and so on...
What's more, in each row and column there may be at most two points (two or none) that are in the cycle. But this condition is the same as "the cycle has to be the shortest one".
First of all, you should change your representation to a more efficient one. You should make the vertex a structure/class which keeps the list of connected vertices.
Having changed the representation, you can easily find the shortest cycle using breadth-first search.
You can speed the search up with the following trick: traverse the graph in breadth-first order, marking the traversed vertices (and storing at each vertex the "parent vertex" number on the way back to the root). As soon as you find an already-marked vertex, the search is finished. You can find the two paths from the found vertex to the root by walking back along the stored "parent" vertices.
Edit:
Are you sure your code is right? I tried the following:
while (queue.Count > 0)
{
MyPoint current = queue.First(); //taking point from queue
queue.Remove(current); //removing it
foreach (MyPoint neighbour in current.Neighbours) //checking Neighbours
{
if (!neighbour.marked) //if neighbour isn't marked adding it to queue
{
neighbour.marked = true;
neighbour.parent = current;
queue.Add(neighbour);
}
else if (!neighbour.Equals(current.parent)) // not considering own parent
{
// found!
List<MyPoint> loop = new List<MyPoint>();
MyPoint p = current;
do
{
loop.Add(p);
p = p.parent;
}
while (p != null);
p = neighbour;
while (!p.Equals(start))
{
loop.Add(p);
p = p.parent;
}
return loop;
}
}
}
return null;
instead of the corresponding part in your code (I changed the return type to List<MyPoint>, too). It works and correctly finds a smaller loop, consisting of 3 points: the red point, the point directly above and the point directly below.
That is what I have done. I don't know if it is optimised, but it does work correctly. I have not done the sorting of the points as @marcog suggested.
private static bool LoopSearch2(bool[,] mask, int supIndexStart, int recIndexStart, out List<MyPoint> path)
{
List<MyPoint> points = new List<MyPoint>();
points.Add(new MyPoint { X = recIndexStart, Y = supIndexStart });
for (int i = 0; i < mask.GetLength(0); i++)
{
for (int j = 0; j < mask.GetLength(1); j++)
{
if (mask[i, j])
{
points.Add(new MyPoint { X = j, Y = i });
}
}
}
for (int i = 0; i < points.Count; i++)
{
for (int j = 0; j < points.Count; j++)
{
if (i != j)
{
if (points[i].X == points[j].X || points[i].Y == points[j].Y)
{
points[i].Neighbours.Add(points[j]);
}
}
}
}
List<MyPoint> queue = new List<MyPoint>();
MyPoint start = (points[0]);
start.marked = true;
queue.Add(points[0]);
path = new List<MyPoint>();
bool found = false;
while(queue.Count>0)
{
MyPoint current = queue.First();
queue.Remove(current);
foreach (MyPoint neighbour in current.Neighbours)
{
if (!neighbour.marked)
{
neighbour.marked = true;
neighbour.parent = current;
queue.Add(neighbour);
}
else
{
if (neighbour.parent != null && neighbour.parent.Equals(current))
continue;
if (current.parent == null)
continue;
bool previousConnectionHorizontal = current.parent.Y == current.Y;
bool currentConnectionHorizontal = current.Y == neighbour.Y;
if (previousConnectionHorizontal != currentConnectionHorizontal)
{
MyPoint prev = current;
while (true)
{
path.Add(prev);
if (prev.Equals(start))
break;
prev = prev.parent;
}
path.Reverse();
prev = neighbour;
while (true)
{
if (prev.Equals(start))
break;
path.Add(prev);
prev = prev.parent;
}
found = true;
break;
}
}
if (found) break;
}
if (found) break;
}
if (path.Count == 0)
{
path = null;
return false;
}
return true;
}
Your points removal step is worst case O(N^3) if implemented poorly, with the worst case being stripping a single point in each iteration. And since it doesn't always save you that much computation in the cycle detection, I'd avoid doing it as it also adds an extra layer of complexity to the solution.
Begin by creating an adjacency list from the set of points. You can do this efficiently in O(NlogN) if you sort the points by X and Y (separately) and iterate through the points in order of X and Y. Then to find the shortest cycle length (determined by number of points), start a BFS from each point by initially throwing all points on the queue. As you traverse an edge, store the source of the path along with the current point. Then you will know when the BFS returns to the source, in which case we've found a cycle. If you end up with an empty queue before finding a cycle, then none exists. Be careful not to track back immediately to the previous point or you will end up with a defunct cycle formed by two points. You might also want to avoid, for example, a cycle formed by the points (0, 0), (0, 2) and (0, 1) as this forms a straight line.
The BFS potentially has an exponential worst case, but I believe such a case can either be proven not to exist or is extremely rare: the denser the graph, the quicker you'll find a cycle, while the sparser the graph, the smaller your queue will be. On average it is more likely to be closer to the same runtime as the adjacency-list construction, or in the worst realistic cases O(N^2).
I think I'd use an adapted variant of Dijkstra's algorithm which stops and returns the cycle whenever it arrives at any node for the second time. If this never happens, you don't have a cycle.
This approach should be much more efficient than a breadth-first or depth-first search, especially if you have many nodes. It is guaranteed that you'll only visit each node once, so you have a linear runtime.
