How to Solve Assignment Problem With Constraints? - algorithm

Assume there are N people and M tasks, and a cost matrix that gives the cost of assigning each task to each person.
Assume we can assign more than one task to a person.
This means we can even assign all of the tasks to one person if that leads to the minimum cost.
I know this problem can be solved using various techniques. Some of them are listed below.
Bit Masking
Hungarian Algorithm
Min Cost Max Flow
Brute Force (all permutations, M!)
Question: But what if we add a constraint that only consecutive tasks can be assigned to a person?
     T1  T2  T3
P1    2   2   2
P2    3   1   4
Answer: 6 rather than 5
Explanation:
We might think that P1->T1, P2->T2, P1->T3 = 2+1+2 = 5 is the answer, but it is not, because T1 and T3 are not consecutive and so cannot both be assigned to P1.
P1->T1, P1->T2, P1->T3 = 2+2+2 = 6
How to approach solving this problem?

You can solve this problem using ILP.
Here is an OPL-like pseudo-code:
**input:
two integers N, M // N persons, M tasks
a cost matrix C[N][M]
**decision variables:
X[N][M][M] // An array with values in {0, 1}
// X[i][j][k] = 1 <=> the person i performs the tasks j to k
**constraints:
// one person can perform at most 1 sequence of consecutive tasks
for all i in {1, ..., N}, sum(j in {1, ..., M}, k in {1, ..., M}) X[i][j][k] <= 1
// each task is performed exactly once
for all t in {1, ..., M}, sum(i in {1, ..., N}, j in {1, ..., t}, k in {t, ..., M}) X[i][j][k] = 1
// impossible tasks sequences are discarded
for all i in {1, ..., N}, for all j in {1, ..., M}, sum(k in {1, ..., j-1}) X[i][j][k] = 0
**objective function:
minimize sum(i, j, k) X[i][j][k] * (sum(t in {j, ..., k}) C[i][t])
I think that ILP could be the tool of choice here, since more often than not scheduling and production-planning problems are solved using it.
If you do not have experience coding LP programs, don't worry: it is much easier than it looks, and this problem is rather easy and a nice one to get started with.
There is also a Stack Exchange site dedicated to this kind of problem and solution, the OR Stack Exchange.
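For small N, the ILP model above can be cross-checked with a bitmask dynamic program (the "Bit Masking" idea from the question). What follows is only a minimal sketch and not part of the original answer: the class name ConsecutiveAssignment and the method minCost are hypothetical, and it assumes cost[p][t] is the cost of person p doing task t, with at least one person and one task. The state is the set of people already used plus the person doing the latest task, so a person can only continue a run or start a first one.

```java
import java.util.Arrays;

// Hypothetical helper for illustration only.
class ConsecutiveAssignment {
    static final int INF = Integer.MAX_VALUE / 2;

    // Minimum total cost when each person gets at most one contiguous run of tasks.
    static int minCost(int[][] cost) {
        int n = cost.length, m = cost[0].length;
        // dp[mask][last]: cheapest cover of the tasks seen so far, where mask is
        // the set of people used and last is the person doing the latest task.
        int[][] dp = new int[1 << n][n];
        for (int[] row : dp) Arrays.fill(row, INF);
        for (int p = 0; p < n; p++) dp[1 << p][p] = cost[p][0];
        for (int t = 1; t < m; t++) {
            int[][] next = new int[1 << n][n];
            for (int[] row : next) Arrays.fill(row, INF);
            for (int mask = 1; mask < (1 << n); mask++) {
                for (int last = 0; last < n; last++) {
                    int cur = dp[mask][last];
                    if (cur >= INF) continue;
                    // The same person continues the current run.
                    next[mask][last] = Math.min(next[mask][last], cur + cost[last][t]);
                    // A person who has not worked yet starts a new run.
                    for (int p = 0; p < n; p++) {
                        if ((mask & (1 << p)) != 0) continue;
                        int nm = mask | (1 << p);
                        next[nm][p] = Math.min(next[nm][p], cur + cost[p][t]);
                    }
                }
            }
            dp = next;
        }
        int best = INF;
        for (int[] row : dp) for (int v : row) best = Math.min(best, v);
        return best;
    }

    public static void main(String[] args) {
        // The 2 x 3 example from the question.
        System.out.println(minCost(new int[][]{{2, 2, 2}, {3, 1, 4}}));
    }
}
```

The table has 2^N * N entries per task, so this is only practical when N is small (roughly 20 or fewer); the ILP or a search-based approach is the better fit beyond that.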

This looks NP-complete to me. If I am correct, there is not going to be a universally quick solution, and the best one can do is approach this problem using the best possible heuristics.
One approach you did not mention is a constructive approach using A* search. In this case, the search would move along the matrix from left to right, adding candidate items to a priority queue with every step. Each item in the queue would consist of the current column index, the total cost expended so far, and the list of people who have acted so far. The remaining-cost heuristic for any given state would be the sum of the columnar minima for all remaining columns.
I'm certain that this can find a solution; I'm just not sure it is the best approach. Some quick Googling shows that A* has been applied to several types of scheduling problems, though.
Edit: Here is an implementation.
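import java.util.HashSet;
import java.util.PriorityQueue;
import java.util.Set;
import java.util.function.Consumer;
import java.util.function.Function;
public class OrderedTasks {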
private class State {
private final State prev;
private final int position;
private final int costSoFar;
private final int lastActed;
public State(int position, int costSoFar, int lastActed, State prev) {
super();
this.prev = prev;
this.lastActed = lastActed;
this.position = position;
this.costSoFar = costSoFar;
}
public void getNextSteps(int[] task, Consumer<State> consumer) {
Set<Integer> actedSoFar = new HashSet<>();
State prev = this.prev;
if (prev != null) {
for (; prev!=null; prev=prev.prev) {
actedSoFar.add(prev.lastActed);
}
}
for (int person=0; person<task.length; ++person) {
if (actedSoFar.contains(person) && this.lastActed!=person) {
continue;
}
consumer.accept(new State(position+1,task[person]+this.costSoFar,
person, this));
}
}
}
public int minCost(int[][] tasksByPeople) {
int[] cumulativeMinCost = getCumulativeMinCostPerTask(tasksByPeople);
Function<State, Integer> totalCost = state->state.costSoFar+(state.position<cumulativeMinCost.length? cumulativeMinCost[state.position]: 0);
PriorityQueue<State> pq = new PriorityQueue<>((s1,s2)->{
return Integer.compare(totalCost.apply(s1), totalCost.apply(s2));
});
State state = new State(0, 0, -1, null);
for (; state.position<tasksByPeople.length; state = pq.poll()) {
state.getNextSteps(tasksByPeople[state.position], pq::add);
}
return state.costSoFar;
}
private int[] getCumulativeMinCostPerTask(int[][] tasksByPeople) {
int[] result = new int[tasksByPeople.length];
int cumulative = 0;
for (int i=tasksByPeople.length-1; i>=0; --i) {
cumulative += minimum(tasksByPeople[i]);
result[i] = cumulative;
}
return result;
}
private int minimum(int[] arr) {
if (arr.length==0) {
throw new RuntimeException("Not valid for empty arrays.");
}
int min = arr[0];
for (int i=1; i<arr.length; ++i) {
min = Math.min(min, arr[i]);
}
return min;
}
public static void main(String[] args) {
OrderedTasks ot = new OrderedTasks();
System.out.println(ot.minCost(new int[][]{{2, 3},{2,1},{2,4},{2,2}}));
}
}

I think your question is very similar to:
Finding the minimum value
Probably not the best approach if the number of workers is large, but an approach that is easy to understand and implement could be:
Get a list of all the possible combinations with repetition of workers W (one worker per task), for example using the algorithm in https://www.geeksforgeeks.org/combinations-with-repetitions/ . This would give you things like [[W1,W3,W2,W3,W1],[W3,W5,W5,W4,W5]]
Discard the combinations in which some worker's tasks are not contiguous:
// Assumes workerOrder[mm] is the 0-based worker assigned to task mm and
// numWorkers is the number of workers (the original loop used workerOrder.Length here).
bool isValid = true;
for (int kk = 0; kk < numWorkers; kk++)
{
    int state = 0;
    for (int mm = 0; mm < workerOrder.Length; mm++)
    {
        if (workerOrder[mm] == kk && state == 0) { state = 1; } // worker kk has appeared
        if (workerOrder[mm] != kk && state == 1) { state = 2; } // worker kk's run has ended
        if (workerOrder[mm] == kk && state == 2) { isValid = false; break; } // worker kk appeared again after a gap
    }
    if (!isValid) { break; }
}
Use the filtered list of lists to look up the costs in the table and keep the minimum one.
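The three steps above can also be sketched in Java, the language used elsewhere in this thread. This is an illustrative brute force under stated assumptions, not the answerer's code: BruteForceAssignment, isContiguous and minCost are hypothetical names, workers are numbered from 0, and cost[w][t] is the cost of worker w doing task t. It enumerates all N^M worker sequences with a base-N counter, so it is only usable for tiny inputs.

```java
// Hypothetical brute-force sketch for illustration only.
class BruteForceAssignment {
    // True if every worker's tasks form one contiguous block in workerOrder.
    static boolean isContiguous(int[] workerOrder, int numWorkers) {
        boolean[] finished = new boolean[numWorkers];
        for (int t = 0; t < workerOrder.length; t++) {
            int w = workerOrder[t];
            if (finished[w]) return false; // worker came back after a gap
            if (t + 1 < workerOrder.length && workerOrder[t + 1] != w) finished[w] = true;
        }
        return true;
    }

    static int minCost(int[][] cost) {
        int n = cost.length, m = cost[0].length;
        int[] order = new int[m]; // order[t] = worker for task t, starts at all zeros
        int best = Integer.MAX_VALUE;
        while (true) {
            if (isContiguous(order, n)) {
                int total = 0;
                for (int t = 0; t < m; t++) total += cost[order[t]][t];
                best = Math.min(best, total);
            }
            // Advance order like a base-n counter; stop once it wraps around.
            int pos = 0;
            while (pos < m && ++order[pos] == n) order[pos++] = 0;
            if (pos == m) break;
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(minCost(new int[][]{{2, 2, 2}, {3, 1, 4}}));
    }
}
```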

Related

Recursive Brute Force 0-1 Knapsack - add items selected output

I am practicing recursive algorithms because although I love recursion, I am still having trouble when there is "double" recursion going on. So I created this brute force 0-1 Knapsack algorithm which outputs the final weight and best value, and it's pretty good, but I decided that this information is only relevant if you know which items are behind those numbers. I am stuck here, though. I want to do this elegantly, without creating a mess of code, and perhaps I am over-limiting my thinking trying to meet that goal. I thought I would post the code here and see if anyone had some nifty ideas about adding code to output the chosen items. This is Java:
public class Knapsack {
static int num_items = 4;
static int weights[] = { 3, 5, 1, 4 };
static int benefit[] = { 2, 4, 3, 6 };
static int capacity = 10;
static int new_sack[] = new int[num_items];
static int max_value = 0;
static int weight = 0;
// O(n2^n) brute force algorithm (i.e. check all combinations) :
public static void findMaxValue(int n, int currentWeight, int currentValue) {
if ((n == 0) && (currentWeight <= capacity) && (currentValue > max_value)) {
max_value = currentValue;
weight = currentWeight;
}
if (n == 0) {
return;
}
findMaxValue(n - 1, currentWeight, currentValue);
findMaxValue(n - 1, currentWeight + weights[n - 1], currentValue + benefit[n - 1]);
}
public static void main(String[] args) {
findMaxValue(num_items, 0, 0);
System.out.println("The max value you can get is: " + max_value + " with weight: " + weight);
// System.out.println(Arrays.toString(new_sack));
}
}
The point of the 0-1 Knapsack algorithm is to find whether excluding or including an item in the knapsack results in a higher value. Your code doesn't compare these two possibilities. The code to do this would look like:
public int knapsack(int[] weights, int[] values, int n, int capacity) {
    if (n == 0 || capacity == 0)
        return 0;
    if (weights[n-1] > capacity) // if item won't fit in knapsack
        return knapsack(weights, values, n-1, capacity); // look at next item
    // Compare if excluding or including item results in greater value
    return Math.max(
        knapsack(weights, values, n-1, capacity), // exclude item
        values[n-1] + knapsack(weights, values, n-1, capacity - weights[n-1])); // include item
}
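Since the question asks specifically for the chosen items, one possible way to get them out of the same recursion is to return the item list together with the value. This is a sketch under assumptions rather than the answerer's method: the class KnapsackWithItems, its nested Result type and the method solve are hypothetical names, and on ties it keeps the solution that excludes the current item.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: the include/exclude recursion, returning the chosen items too.
class KnapsackWithItems {
    // Best value of a sub-problem plus the item indices that achieve it.
    static final class Result {
        final int value;
        final List<Integer> items;
        Result(int value, List<Integer> items) {
            this.value = value;
            this.items = items;
        }
    }

    static Result solve(int[] weights, int[] values, int n, int capacity) {
        if (n == 0 || capacity == 0) return new Result(0, new ArrayList<>());
        Result excl = solve(weights, values, n - 1, capacity);       // skip item n-1
        if (weights[n - 1] > capacity) return excl;                  // item does not fit
        Result incl = solve(weights, values, n - 1, capacity - weights[n - 1]);
        if (incl.value + values[n - 1] > excl.value) {
            List<Integer> items = new ArrayList<>(incl.items);      // copy, then add item n-1
            items.add(n - 1);
            return new Result(incl.value + values[n - 1], items);
        }
        return excl;
    }

    public static void main(String[] args) {
        // Data from the question: capacity 10.
        Result r = solve(new int[]{3, 5, 1, 4}, new int[]{2, 4, 3, 6}, 4, 10);
        System.out.println(r.value + " using items " + r.items);
    }
}
```

Copying the list at each step keeps the code short at the price of extra allocation; for a brute-force exercise of this size that trade-off seems acceptable.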

Best approach to fit numbers

I have the following set of integers {2,9,4,1,8}. I need to divide this set into two subsets so that the sums of the subsets are 14 and 10 respectively. In my example the answer is {2,4,8} and {9,1}. I am not looking for any code. I am pretty sure there must be a standard algorithm to solve this problem. Since I was not successful in googling and finding that out myself, I posted my query here. So what will be the best way to approach this problem?
My try was like this...
public class Test {
public static void main(String[] args) {
int[] input = {2, 9, 4, 1, 8};
int target = 14;
Stack<Integer> stack = new Stack<>();
for (int i = 0; i < input.length; i++) {
stack.add(input[i]);
for (int j = i+1;j<input.length;j++) {
int sum = sumInStack(stack);
if (sum < target) {
stack.add(input[j]);
continue;
}
if (target == sum) {
System.out.println("Eureka");
}
stack.remove(input[i]);
}
}
}
private static int sumInStack(Stack<Integer> stack) {
int sum = 0;
for (Integer integer : stack) {
sum+=integer;
}
return sum;
}
}
I know this approach is not even close to solving the problem.
I need to divide this set into two subsets so that the sum of the sets results in 14 and 10 respectively.
If the subsets have to sum to certain values, then it had better be true that the sum of the entire set is the sum of those values, i.e. 14+10=24 in your example. If you only have to find the two subsets, then the problem isn't very difficult — find any subset that sums to one of those values, and the remaining elements of the set must sum to the other value.
For the example set you gave, {2,9,4,1,8}, you said that the answer is {9,1}, {2,4,8}, but notice that that's not the only answer; there's also {2,8}, {9,4,1}.
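Finding "any subset that sums to one of those values" is the classic subset-sum problem. A minimal dynamic-programming sketch follows, offered as an illustration rather than as the answerer's code: the names SubsetSum and findSubset are hypothetical, and the sketch assumes positive integers and a non-negative target.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of subset-sum with reconstruction; assumes positive integers.
class SubsetSum {
    // Returns one subset of input that sums to target, or null if none exists.
    static List<Integer> findSubset(int[] input, int target) {
        boolean[] reachable = new boolean[target + 1];
        int[] usedIndex = new int[target + 1]; // element that first reached each sum
        Arrays.fill(usedIndex, -1);
        reachable[0] = true;
        for (int i = 0; i < input.length; i++) {
            // Go downwards so each element is used at most once.
            for (int s = target; s >= input[i]; s--) {
                if (!reachable[s] && reachable[s - input[i]]) {
                    reachable[s] = true;
                    usedIndex[s] = i;
                }
            }
        }
        if (!reachable[target]) return null;
        List<Integer> subset = new ArrayList<>();
        for (int s = target; s > 0; s -= input[usedIndex[s]]) {
            subset.add(input[usedIndex[s]]);
        }
        return subset;
    }

    public static void main(String[] args) {
        // Elements not in the returned subset form the other subset.
        System.out.println(findSubset(new int[]{2, 9, 4, 1, 8}, 14));
    }
}
```

This runs in O(n * target) time, which is fine when the target is a modest integer; the remaining elements then automatically sum to the other value if the totals match.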

Dynamic programming: Algorithm to solve the following?

I have recently completed the following interview exercise:
'A robot can be programmed to run "a", "b", "c"... "n" kilometers and it takes ta, tb, tc... tn minutes, respectively. Once it runs to programmed kilometers, it must be turned off for "m" minutes.
After "m" minutes it can again be programmed to run for a further "a", "b", "c"... "n" kilometers.
How would you program this robot to go an exact number of kilometers in the minimum amount of time?'
I thought it was a variation of the unbounded knapsack problem, in which the size would be the number of kilometers and the value, the time needed to complete each stretch. The main difference is that we need to minimise, rather than maximise, the value. So I used the equivalent of the following solution: http://en.wikipedia.org/wiki/Knapsack_problem#Unbounded_knapsack_problem
in which I select the minimum.
Finally, because we need an exact solution (if there is one), over the map constructed by the algorithm for all the different distances, I iterated through each and through each of the robot's programmed distances to find the exact distance and minimum time among those.
I think the pause the robot takes between runs is a bit of a red herring and you just need to include it in your calculations, but it does not affect the approach taken.
I am probably wrong, because I failed the test. I don't have any other feedback as to the expected solution.
Edit: maybe I wasn't wrong after all and I failed for different reasons. I just wanted to validate my approach to this problem.
import static com.google.common.collect.Sets.*;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import org.apache.log4j.Logger;
import com.google.common.base.Objects;
import com.google.common.base.Preconditions;
import com.google.common.collect.Lists;
import com.google.common.collect.Maps;
public final class Robot {
static final Logger logger = Logger.getLogger (Robot.class);
private Set<ProgrammedRun> programmedRuns;
private int pause;
private int totalDistance;
private Robot () {
//don't expose default constructor & prevent subclassing
}
private Robot (int[] programmedDistances, int[] timesPerDistance, int pause, int totalDistance) {
this.programmedRuns = newHashSet ();
for (int i = 0; i < programmedDistances.length; i++) {
this.programmedRuns.add (new ProgrammedRun (programmedDistances [i], timesPerDistance [i] ) );
}
this.pause = pause;
this.totalDistance = totalDistance;
}
public static Robot create (int[] programmedDistances, int[] timesPerDistance, int pause, int totalDistance) {
Preconditions.checkArgument (programmedDistances.length == timesPerDistance.length);
Preconditions.checkArgument (pause >= 0);
Preconditions.checkArgument (totalDistance >= 0);
return new Robot (programmedDistances, timesPerDistance, pause, totalDistance);
}
/**
* @return null if no strategy was found. An empty map if distance is zero. A
* map with the programmed runs as keys and number of time they need to be run
* as value.
*
*/
Map<ProgrammedRun, Integer> calculateOptimalStrategy () {
//for efficiency, consider this case first
if (this.totalDistance == 0) {
return Maps.newHashMap ();
}
//list of solutions for different distances. Element "i" of the list is the best set of runs that cover at least "i" kilometers
List <Map<ProgrammedRun, Integer>> runsForDistances = Lists.newArrayList();
//special case i = 0 -> empty map (no runs needed)
runsForDistances.add (new HashMap<ProgrammedRun, Integer> () );
for (int i = 1; i <= totalDistance; i++) {
Map<ProgrammedRun, Integer> map = new HashMap<ProgrammedRun, Integer> ();
int minimumTime = -1;
for (ProgrammedRun pr : programmedRuns) {
int distance = Math.max (0, i - pr.getDistance ());
int time = getTotalTime (runsForDistances.get (distance) ) + pause + pr.getTime();
if (minimumTime < 0 || time < minimumTime) {
minimumTime = time;
//new minimum found
map = new HashMap<ProgrammedRun, Integer> ();
map.putAll(runsForDistances.get (distance) );
//increase count
Integer num = map.get (pr);
if (num == null) num = Integer.valueOf (1);
else num++;
//update map
map.put (pr, num);
}
}
runsForDistances.add (map );
}
//last step: calculate the combination with exact distance
int minimumTime2 = -1;
int bestIndex = -1;
for (int i = 0; i <= totalDistance; i++) {
if (getTotalDistance (runsForDistances.get (i) ) == this.totalDistance ) {
int time = getTotalTime (runsForDistances.get (i) );
if (time > 0) time -= pause;
if (minimumTime2 < 0 || time < minimumTime2 ) {
minimumTime2 = time;
bestIndex = i;
}
}
}
//if solution found
if (bestIndex != -1) {
return runsForDistances.get (bestIndex);
}
//try all combinations, since none of the existing maps run for the exact distance
List <Map<ProgrammedRun, Integer>> exactRuns = Lists.newArrayList();
for (int i = 0; i <= totalDistance; i++) {
int distance = getTotalDistance (runsForDistances.get (i) );
for (ProgrammedRun pr : programmedRuns) {
//solution found
if (distance + pr.getDistance() == this.totalDistance ) {
Map<ProgrammedRun, Integer> map = new HashMap<ProgrammedRun, Integer> ();
map.putAll (runsForDistances.get (i));
//increase count
Integer num = map.get (pr);
if (num == null) num = Integer.valueOf (1);
else num++;
//update map
map.put (pr, num);
exactRuns.add (map);
}
}
}
if (exactRuns.isEmpty()) return null;
//finally return the map with the best time
minimumTime2 = -1;
Map<ProgrammedRun, Integer> bestMap = null;
for (Map<ProgrammedRun, Integer> m : exactRuns) {
int time = getTotalTime (m);
if (time > 0) time -= pause; //remove last pause
if (minimumTime2 < 0 || time < minimumTime2 ) {
minimumTime2 = time;
bestMap = m;
}
}
return bestMap;
}
private int getTotalTime (Map<ProgrammedRun, Integer> runs) {
int time = 0;
for (Map.Entry<ProgrammedRun, Integer> runEntry : runs.entrySet()) {
time += runEntry.getValue () * runEntry.getKey().getTime ();
//add pauses
time += this.pause * runEntry.getValue ();
}
return time;
}
private int getTotalDistance (Map<ProgrammedRun, Integer> runs) {
int distance = 0;
for (Map.Entry<ProgrammedRun, Integer> runEntry : runs.entrySet()) {
distance += runEntry.getValue() * runEntry.getKey().getDistance ();
}
return distance;
}
class ProgrammedRun {
private int distance;
private int time;
private transient float speed;
ProgrammedRun (int distance, int time) {
this.distance = distance;
this.time = time;
this.speed = (float) distance / time;
}
@Override public String toString () {
return "(distance =" + distance + "; time=" + time + ")";
}
@Override public boolean equals (Object other) {
return other instanceof ProgrammedRun
&& this.distance == ((ProgrammedRun)other).distance
&& this.time == ((ProgrammedRun)other).time;
}
@Override public int hashCode () {
return Objects.hashCode (Integer.valueOf (this.distance), Integer.valueOf (this.time));
}
int getDistance() {
return distance;
}
int getTime() {
return time;
}
float getSpeed() {
return speed;
}
}
}
public class Main {
/* Input variables for the robot */
private static int [] programmedDistances = {1, 2, 3, 5, 10}; //in kilometers
private static int [] timesPerDistance = {10, 5, 3, 2, 1}; //in minutes
private static int pause = 2; //in minutes
private static int totalDistance = 41; //in kilometers
/**
* @param args
*/
public static void main(String[] args) {
Robot r = Robot.create (programmedDistances, timesPerDistance, pause, totalDistance);
Map<ProgrammedRun, Integer> strategy = r.calculateOptimalStrategy ();
if (strategy == null) {
System.out.println ("No strategy that matches the conditions was found");
} else if (strategy.isEmpty ()) {
System.out.println ("No need to run; distance is zero");
} else {
System.out.println ("Strategy found:");
System.out.println (strategy);
}
}
}
Simplifying slightly, let t_i be the time (including downtime) that it takes the robot to run distance d_i. Assume that t_1/d_1 ≤ … ≤ t_n/d_n. If t_1/d_1 is significantly smaller than t_2/d_2 and d_1 and the total distance D to be run are large, then branch and bound likely outperforms dynamic programming. Branch and bound solves the integer programming formulation
minimize ∑_i t_i x_i
subject to
∑_i d_i x_i = D
∀i: x_i ∈ ℕ
by using the value of the relaxation where x_i can be any nonnegative real as a guide. The latter is easily verified to be at most (t_1/d_1)D, by setting x_1 to D/d_1 and ∀i ≠ 1: x_i = 0, and at least (t_1/d_1)D, by setting the sole variable of the dual program to t_1/d_1. Solving the relaxation is the bound step; every integer solution is a fractional solution, so the best integer solution requires time at least (t_1/d_1)D.
The branch step takes one integer program and splits it in two whose solutions, taken together, cover the entire solution space of the original. In this case, one piece could have the extra constraint x_1 = 0 and the other could have the extra constraint x_1 ≥ 1. It might look as though this would create subproblems with side constraints, but in fact, we can just delete the first move, or decrease D by d_1 and add the constant t_1 to the objective. Another option for branching is to add either the constraint x_i = ⌊D/d_i⌋ or x_i ≤ ⌊D/d_i⌋ - 1, which requires generalizing to upper bounds on the number of repetitions of each move.
The main loop of branch and bound selects one of a collection of subproblems, branches, computes bounds for the two subproblems, and puts them back into the collection. The efficiency over brute force comes from the fact that, when we have a solution with a particular value, every subproblem whose relaxed value is at least that much can be thrown away. Once the collection is emptied this way, we have the optimal solution.
Hybrids of branch and bound and dynamic programming are possible, for example, computing optimal solutions for small D via DP and using those values instead of branching on subproblems that have been solved.
Let D be the total distance and m the pause. Create an array a of size D+1 and initialize it:
a[0] = 0;
a[i] = infinite for 1 ≤ i ≤ D;
Then, for i from 1 to D:
a[i] = min{ min{a[i-j] + tj + m for all j in the possible kilometers of the robot with j < i}, ti if i is in the possible moves of the robot }
a[D] is the lowest possible time. You can also keep a second array b that records the selection made for each a[i], so the runs can be reconstructed. If a[D] is still infinite, covering the distance exactly is not possible.
Edit: we can solve it in another way by creating a digraph whose size again depends on the path length D. The graph has nodes labeled {0..D}. Start from node 0 and connect it to all possible nodes: if i is an available run length, connect 0 and vi with weight ti. For all other nodes, connect node i->j with weight t(j-i) + m whenever j > i and j-i is an available run length. Now find the shortest path from v0 to vD. This algorithm is still O(nD), where n is the number of available run lengths.
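The recurrence above can be sketched in Java as follows. This is an illustration under assumptions, not the answerer's code: RobotRuns and minTime are hypothetical names, run lengths are assumed to be positive integers, and the pause is charged between consecutive runs only (no pause before the first run or after the last one).

```java
import java.util.Arrays;

// Hypothetical sketch of the array-based recurrence described above.
class RobotRuns {
    // distances[k] kilometers take times[k] minutes; pause minutes separate two runs.
    // Returns the minimum minutes to cover exactly total kilometers, or -1 if impossible.
    static long minTime(int[] distances, int[] times, int pause, int total) {
        final long INF = Long.MAX_VALUE / 4;
        long[] best = new long[total + 1];
        Arrays.fill(best, INF);
        best[0] = 0;
        for (int d = 1; d <= total; d++) {
            for (int k = 0; k < distances.length; k++) {
                int prev = d - distances[k];
                if (prev < 0 || best[prev] >= INF) continue;
                // A pause is needed only if some run happened before this one.
                long candidate = best[prev] + times[k] + (prev > 0 ? pause : 0);
                best[d] = Math.min(best[d], candidate);
            }
        }
        return best[total] >= INF ? -1 : best[total];
    }

    public static void main(String[] args) {
        // Data from the question: pause 2, total distance 41.
        System.out.println(minTime(new int[]{1, 2, 3, 5, 10}, new int[]{10, 5, 3, 2, 1}, 2, 41));
    }
}
```

To recover the actual runs, a parallel array storing the index k chosen for each d (the array b mentioned above) can be walked backwards from total.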
Let G be the desired distance run.
Let n be the longest possible distance run without pause.
Let L = G / n (Integer arithmetic, discard fraction part)
Let R = G mod n (ie. The remainder from the above division)
Make the robot run its longest distance (i.e. n) L times, and then whichever distance (a, b, c, etc.) is greater than R by the least amount (i.e. the smallest available distance that is equal to or greater than R).
Either I understood the problem wrong, or you're all overthinking it.
I am a big believer in showing instead of telling. Here is a program that may be doing what you are looking for. Let me know if it satisfies your question. Simply copy, paste, and run the program. You should of course test with your own data set.
import java.util.Arrays;
public class Speed {
/***
*
* @param distance
* @param sprints ={{A,Ta},{B,Tb},{C,Tc}, ..., {N,Tn}}
*/
public static int getFastestTime(int distance, int[][] sprints){
long[] minTime = new long[distance+1];//distance from 0 to distance
Arrays.fill(minTime,Integer.MAX_VALUE);
minTime[0]=0;//key=distance; value=time
for(int[] speed: sprints)
for(int d=1; d<minTime.length; d++)
if(d>=speed[0] && minTime[d] > minTime[d-speed[0]]+speed[1])
minTime[d]=minTime[d-speed[0]]+speed[1];
return (int)minTime[distance];
}//
public static void main(String... args){
//sprints ={{A,Ta},{B,Tb},{C,Tc}, ..., {N,Tn}}
int[][] sprints={{3,2},{5,3},{7,5}};
int distance = 21;
System.out.println(getFastestTime(distance,sprints));
}
}

Optimized TSP Algorithms

I am interested in ways to improve or come up with algorithms that are able to solve the Travelling salesman problem for about n = 100 to 200 cities.
The wikipedia link I gave lists various optimizations, but it does so at a pretty high level, and I don't know how to go about actually implementing them in code.
There are industrial strength solvers out there, such as Concorde, but those are way too complex for what I want, and the classic solutions that flood the searches for TSP all present randomized algorithms or the classic backtracking or dynamic programming algorithms that only work for about 20 cities.
So, does anyone know how to implement a simple (by simple I mean that an implementation doesn't take more than 100-200 lines of code) TSP solver that works in reasonable time (a few seconds) for at least 100 cities? I am only interested in exact solutions.
You may assume that the input will be randomly generated, so I don't care for inputs that are aimed specifically at breaking a certain algorithm.
200 lines and no libraries is a tough constraint. The advanced solvers use branch and bound with the Held–Karp relaxation, and I'm not sure if even the most basic version of that would fit into 200 normal lines. Nevertheless, here's an outline.
Held–Karp
One way to write TSP as an integer program is as follows (Dantzig, Fulkerson, Johnson). For all edges e, constant w_e denotes the length of edge e, and variable x_e is 1 if edge e is on the tour and 0 otherwise. For all subsets S of vertices, ∂(S) denotes the edges connecting a vertex in S with a vertex not in S.
minimize ∑_{edges e} w_e x_e
subject to
1. for all vertices v, ∑_{edges e in ∂({v})} x_e = 2
2. for all nonempty proper subsets S of vertices, ∑_{edges e in ∂(S)} x_e ≥ 2
3. for all edges e in E, x_e in {0, 1}
Condition 1 ensures that the set of edges is a collection of tours. Condition 2 ensures that there's only one. (Otherwise, let S be the set of vertices visited by one of the tours.) The Held–Karp relaxation is obtained by replacing condition 3 with the following.
3'. for all edges e in E, 0 ≤ x_e ≤ 1
Held–Karp is a linear program but it has an exponential number of constraints. One way to solve it is to introduce Lagrange multipliers and then do subgradient optimization. That boils down to a loop that computes a minimum spanning tree and then updates some vectors, but the details are sort of involved. Besides "Held–Karp" and "subgradient (descent|optimization)", "1-tree" is another useful search term.
(A slower alternative is to write an LP solver and introduce subtour constraints as they are violated by previous optima. This means writing an LP solver and a min-cut procedure, which is also more code, but it might extend better to more exotic TSP constraints.)
Branch and bound
By "partial solution", I mean a partial assignment of variables to 0 or 1, where an edge assigned 1 is definitely in the tour, and an edge assigned 0 is definitely out. Evaluating Held–Karp with these side constraints gives a lower bound on the optimum tour that respects the decisions already made (an extension).
Branch and bound maintains a set of partial solutions, at least one of which extends to an optimal solution. The pseudocode for one variant, depth-first search with best-first backtracking is as follows.
let h be an empty minheap of partial solutions, ordered by Held–Karp value
let bestsolsofar = null
let cursol be the partial solution with no variables assigned
loop
while cursol is not a complete solution and cursol's H–K value is at least as good as the value of bestsolsofar
choose a branching variable v
let sol0 be cursol union {v -> 0}
let sol1 be cursol union {v -> 1}
evaluate sol0 and sol1
let cursol be the better of the two; put the other in h
end while
if cursol is better than bestsolsofar then
let bestsolsofar = cursol
delete all heap nodes worse than cursol
end if
if h is empty then stop; we've found the optimal solution
pop the minimum element of h and store it in cursol
end loop
The idea of branch and bound is that there's a search tree of partial solutions. The point of solving Held–Karp is that the value of the LP is at most the length OPT of the optimal tour but also conjectured to be at least 3/4 OPT (in practice, usually closer to OPT).
The one detail in the pseudocode I've left out is how to choose the branching variable. The goal is usually to make the "hard" decisions first, so fixing a variable whose value is already near 0 or 1 is probably not wise. One option is to choose the closest to 0.5, but there are many, many others.
EDIT
Java implementation. 198 nonblank, noncomment lines. I forgot that 1-trees don't work with assigning variables to 1, so I branch by finding a vertex whose 1-tree has degree >2 and delete each edge in turn. This program accepts TSPLIB instances in EUC_2D format, e.g., eil51.tsp and eil76.tsp and eil101.tsp and lin105.tsp from http://www2.iwr.uni-heidelberg.de/groups/comopt/software/TSPLIB95/tsp/.
// simple exact TSP solver based on branch-and-bound/Held--Karp
import java.io.*;
import java.util.*;
import java.util.regex.*;
public class TSP {
// number of cities
private int n;
// city locations
private double[] x;
private double[] y;
// cost matrix
private double[][] cost;
// matrix of adjusted costs
private double[][] costWithPi;
Node bestNode = new Node();
public static void main(String[] args) throws IOException {
// read the input in TSPLIB format
// assume TYPE: TSP, EDGE_WEIGHT_TYPE: EUC_2D
// no error checking
TSP tsp = new TSP();
tsp.readInput(new InputStreamReader(System.in));
tsp.solve();
}
public void readInput(Reader r) throws IOException {
BufferedReader in = new BufferedReader(r);
Pattern specification = Pattern.compile("\\s*([A-Z_]+)\\s*(:\\s*([0-9]+))?\\s*");
Pattern data = Pattern.compile("\\s*([0-9]+)\\s+([-+.0-9Ee]+)\\s+([-+.0-9Ee]+)\\s*");
String line;
while ((line = in.readLine()) != null) {
Matcher m = specification.matcher(line);
if (!m.matches()) continue;
String keyword = m.group(1);
if (keyword.equals("DIMENSION")) {
n = Integer.parseInt(m.group(3));
cost = new double[n][n];
} else if (keyword.equals("NODE_COORD_SECTION")) {
x = new double[n];
y = new double[n];
for (int k = 0; k < n; k++) {
line = in.readLine();
m = data.matcher(line);
m.matches();
int i = Integer.parseInt(m.group(1)) - 1;
x[i] = Double.parseDouble(m.group(2));
y[i] = Double.parseDouble(m.group(3));
}
// TSPLIB distances are rounded to the nearest integer to avoid the sum of square roots problem
for (int i = 0; i < n; i++) {
for (int j = 0; j < n; j++) {
double dx = x[i] - x[j];
double dy = y[i] - y[j];
cost[i][j] = Math.rint(Math.sqrt(dx * dx + dy * dy));
}
}
}
}
}
public void solve() {
bestNode.lowerBound = Double.MAX_VALUE;
Node currentNode = new Node();
currentNode.excluded = new boolean[n][n];
costWithPi = new double[n][n];
computeHeldKarp(currentNode);
PriorityQueue<Node> pq = new PriorityQueue<Node>(11, new NodeComparator());
do {
do {
boolean isTour = true;
int i = -1;
for (int j = 0; j < n; j++) {
if (currentNode.degree[j] > 2 && (i < 0 || currentNode.degree[j] < currentNode.degree[i])) i = j;
}
if (i < 0) {
if (currentNode.lowerBound < bestNode.lowerBound) {
bestNode = currentNode;
System.err.printf("%.0f", bestNode.lowerBound);
}
break;
}
System.err.printf(".");
PriorityQueue<Node> children = new PriorityQueue<Node>(11, new NodeComparator());
children.add(exclude(currentNode, i, currentNode.parent[i]));
for (int j = 0; j < n; j++) {
if (currentNode.parent[j] == i) children.add(exclude(currentNode, i, j));
}
currentNode = children.poll();
pq.addAll(children);
} while (currentNode.lowerBound < bestNode.lowerBound);
System.err.printf("%n");
currentNode = pq.poll();
} while (currentNode != null && currentNode.lowerBound < bestNode.lowerBound);
// output suitable for gnuplot
// set style data vector
System.out.printf("# %.0f%n", bestNode.lowerBound);
int j = 0;
do {
int i = bestNode.parent[j];
System.out.printf("%f\t%f\t%f\t%f%n", x[j], y[j], x[i] - x[j], y[i] - y[j]);
j = i;
} while (j != 0);
}
private Node exclude(Node node, int i, int j) {
Node child = new Node();
child.excluded = node.excluded.clone();
child.excluded[i] = node.excluded[i].clone();
child.excluded[j] = node.excluded[j].clone();
child.excluded[i][j] = true;
child.excluded[j][i] = true;
computeHeldKarp(child);
return child;
}
private void computeHeldKarp(Node node) {
node.pi = new double[n];
node.lowerBound = Double.MIN_VALUE;
node.degree = new int[n];
node.parent = new int[n];
double lambda = 0.1;
while (lambda > 1e-06) {
double previousLowerBound = node.lowerBound;
computeOneTree(node);
if (!(node.lowerBound < bestNode.lowerBound)) return;
if (!(node.lowerBound < previousLowerBound)) lambda *= 0.9;
int denom = 0;
for (int i = 1; i < n; i++) {
int d = node.degree[i] - 2;
denom += d * d;
}
if (denom == 0) return;
double t = lambda * node.lowerBound / denom;
for (int i = 1; i < n; i++) node.pi[i] += t * (node.degree[i] - 2);
}
}
private void computeOneTree(Node node) {
// compute adjusted costs
node.lowerBound = 0.0;
Arrays.fill(node.degree, 0);
for (int i = 0; i < n; i++) {
for (int j = 0; j < n; j++) costWithPi[i][j] = node.excluded[i][j] ? Double.MAX_VALUE : cost[i][j] + node.pi[i] + node.pi[j];
}
int firstNeighbor;
int secondNeighbor;
// find the two cheapest edges from 0
if (costWithPi[0][2] < costWithPi[0][1]) {
firstNeighbor = 2;
secondNeighbor = 1;
} else {
firstNeighbor = 1;
secondNeighbor = 2;
}
for (int j = 3; j < n; j++) {
if (costWithPi[0][j] < costWithPi[0][secondNeighbor]) {
if (costWithPi[0][j] < costWithPi[0][firstNeighbor]) {
secondNeighbor = firstNeighbor;
firstNeighbor = j;
} else {
secondNeighbor = j;
}
}
}
addEdge(node, 0, firstNeighbor);
Arrays.fill(node.parent, firstNeighbor);
node.parent[firstNeighbor] = 0;
// compute the minimum spanning tree on nodes 1..n-1
double[] minCost = costWithPi[firstNeighbor].clone();
for (int k = 2; k < n; k++) {
int i;
for (i = 1; i < n; i++) {
if (node.degree[i] == 0) break;
}
for (int j = i + 1; j < n; j++) {
if (node.degree[j] == 0 && minCost[j] < minCost[i]) i = j;
}
addEdge(node, node.parent[i], i);
for (int j = 1; j < n; j++) {
if (node.degree[j] == 0 && costWithPi[i][j] < minCost[j]) {
minCost[j] = costWithPi[i][j];
node.parent[j] = i;
}
}
}
addEdge(node, 0, secondNeighbor);
node.parent[0] = secondNeighbor;
node.lowerBound = Math.rint(node.lowerBound);
}
private void addEdge(Node node, int i, int j) {
// accumulate the adjusted edge cost into the bound
node.lowerBound += costWithPi[i][j];
node.degree[i]++;
node.degree[j]++;
}
}
class Node {
public boolean[][] excluded;
// Held--Karp solution
public double[] pi;
public double lowerBound;
public int[] degree;
public int[] parent;
}
class NodeComparator implements Comparator<Node> {
public int compare(Node a, Node b) {
return Double.compare(a.lowerBound, b.lowerBound);
}
}
If your graph satisfies the triangle inequality and you want a guarantee of being within 3/2 of the optimum, I suggest the Christofides algorithm. I've written an implementation in PHP at phpclasses.org.
As of 2013, it is possible to solve 100-city instances using only the exact formulation in CPLEX. Add degree equations for each vertex, but include subtour-avoiding constraints only as they appear; most of them are not necessary. CPLEX has an example of this.
You should be able to solve for 100 cities. You will have to iterate every time a new subtour is found. I ran an example here, and a couple of minutes and 100 iterations later I got my results.
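The iteration described above needs some way to spot subtours in each intermediate solution. Here is a minimal Java sketch of that step, assuming the solver's answer has already been decoded into a successor array `next` (a hypothetical representation chosen for illustration; `SubtourFinder` is not part of CPLEX, and no solver calls are shown):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits a solution in which every city has one successor into its cycles.
class SubtourFinder {
    // next[i] is the city visited right after city i; next is assumed to be a permutation.
    static List<List<Integer>> findSubtours(int[] next) {
        boolean[] seen = new boolean[next.length];
        List<List<Integer>> cycles = new ArrayList<>();
        for (int start = 0; start < next.length; start++) {
            if (seen[start]) continue;
            List<Integer> cycle = new ArrayList<>();
            // follow successors until we come back to a city already visited
            for (int c = start; !seen[c]; c = next[c]) {
                seen[c] = true;
                cycle.add(c);
            }
            cycles.add(cycle);
        }
        return cycles;
    }

    public static void main(String[] args) {
        int[] next = {1, 2, 0, 4, 3};
        System.out.println(findSubtours(next)); // prints [[0, 1, 2], [3, 4]]
    }
}
```

If more than one cycle comes back, each cycle S gives one new constraint (at most |S| - 1 edges inside S), and the model is solved again; a single cycle covering every city is a complete tour.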
I took the Held-Karp algorithm from the Concorde library, and 25 cities are solved in 0.15 seconds. This performance is perfectly good for me! You can extract the Held-Karp code (written in ANSI C) from the Concorde library: http://www.math.uwaterloo.ca/tsp/concorde/downloads/downloads.htm. If the download has the extension gz, it should be tgz; you might need to rename it. Then you need to make small adjustments to port it to VC++. First take the files heldkarp.h and heldkarp.c (rename the latter to .cpp) and about 5 other files, make the adjustments, and it should work by calling CCheldkarp_small(...) with edgelen: euclid_ceiling_edgelen.
TSP is an NP-hard problem. (As far as we know) there is no algorithm for NP-hard problems which runs in polynomial time, so you ask for something that doesn't exist.
It's either fast enough to finish in a reasonable time and then it's not exact, or exact but won't finish in your lifetime for 100 cities.
To give a dumb answer: me too. Everyone is interested in such an algorithm, but as others have already stated, it does not (yet?) exist. In particular, your combination of exact, 200 nodes, a few seconds of runtime and just 200 lines of code is impossible. You already know that it is NP-hard, and if you have the slightest feel for asymptotic behaviour you should know that there is no way of achieving this (unless you prove that P = NP, and I would say even that is not possible). Even the exact commercial solvers need far more than a few seconds for such instances, and as you can imagine they have far more than 200 lines of code (even if you consider just their kernels).
EDIT: The wiki algorithms are the "usual suspects" of the field: linear programming and branch-and-bound. Their solutions for the instances with thousands of nodes took years of CPU time (they ran on very many CPUs in parallel, so they finished sooner). Some even use problem-specific knowledge for the bounding step of branch-and-bound, so they are not general approaches.
Branch and bound just enumerates all possible paths (e.g. with backtracking) and, once it has a solution, uses it to cut off a started recursion as soon as it can prove that the result will not be better than the solution already found (e.g. if you have visited just 2 of your cities and the path is already longer than a known 200-city tour, you can discard all tours that start with that 2-city combination). Here you can invest a lot of problem-specific knowledge in the function that tells you that the path is not going to be better than the solution already found. The better it is, the fewer paths you have to look at, and the faster your algorithm is.
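To make the pruning rule concrete, here is a minimal depth-first sketch in Java. It uses the simplest bound mentioned above (the length of the partial path) and assumes non-negative distances; the class name `TinyBranchAndBound` is made up for illustration, and a serious solver would use a much stronger bound such as the Held-Karp 1-tree.

```java
// Illustrative depth-first branch and bound for a small TSP instance.
class TinyBranchAndBound {
    static double[][] dist;
    static double best;

    // Returns the length of the shortest tour that starts and ends at city 0.
    static double solve(double[][] d) {
        dist = d;
        best = Double.POSITIVE_INFINITY;
        boolean[] visited = new boolean[d.length];
        visited[0] = true;
        search(0, 1, 0.0, visited);
        return best;
    }

    static void search(int city, int count, double length, boolean[] visited) {
        // Bound: with non-negative distances, a partial path that is already
        // as long as the best known tour cannot lead to a better one.
        if (length >= best) return;
        if (count == dist.length) {
            // all cities visited: close the tour back to city 0
            best = Math.min(best, length + dist[city][0]);
            return;
        }
        // Branch: try every unvisited city as the next stop.
        for (int next = 1; next < dist.length; next++) {
            if (visited[next]) continue;
            visited[next] = true;
            search(next, count + 1, length + dist[city][next], visited);
            visited[next] = false;
        }
    }

    public static void main(String[] args) {
        double[][] d = {{0, 1, 2, 1}, {1, 0, 1, 2}, {2, 1, 0, 1}, {1, 2, 1, 0}};
        System.out.println(solve(d)); // prints 4.0
    }
}
```

Replacing the `length >= best` test with a tighter lower bound on the cost of completing the path is where the problem-specific knowledge goes.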
Linear programming is an optimization method to solve problems with linear inequalities. It works in polynomial time (simplex only in practice, but that doesn't matter here), but the solution is real-valued. With the additional constraint that the solution must be integer, it becomes NP-complete. For small instances it is feasible, e.g. one method is to solve it, then look at which variable of the solution violates the integrality requirement and add additional inequalities to change it (this is called cutting-plane; the name comes from the fact that the inequalities define (higher-dimensional) planes, the solution space is a polytope, and by adding inequalities you cut something off the polytope with a plane). The topic is very complex, and even a plain simplex is hard to understand if you don't want to dive deep into the math. There are several good books about it; one of the better ones is Chvatal's Linear Programming, but there are several more.
I have a theory, but I've never had the time to pursue it:
The TSP is a bounding problem (single shape where all points lie on the perimeter) where the optimal solution is that solution that has the shortest perimeter.
There are plenty of simple ways to get all the points that lie on a minimum bounding perimeter (imagine a large elastic band stretched around a bunch of nails in a large board.)
My theory is that if you start pushing in on the elastic band so that the length of the band increases by the same amount between adjacent points on the perimeter, and each segment remains in the shape of an elliptical arc, the stretched elastic will cross points on the optimal path before crossing points on non-optimal paths. See this page on mathopenref.com on drawing ellipses, particularly steps 5 and 6. Points on the bounding perimeter can be viewed as focal points of the ellipse (F1, F2) in the images below.
What I don't know is if the "bubble stretching" process needs to be reset after each new point is added, or if the existing "bubbles" continue to grow and each new point on the perimeter causes only the localized "bubble" to turn into two line segments. I'll leave that for you to figure out.

Sum array values with sum equals X

I have an integer collection. I need to get all possibilities where the sum of the values equals X.
I need something like this.
It can be written in: delphi, c#, php, RoR, python, cobol, vb, vb.net
That's a subset sum problem, and it is NP-complete.
The only way to implement this would be to generate all possible combinations and compare the sum values. Optimization techniques exist, though.
Here's one in C#:
using System;
using System.Collections.Generic;
using System.Linq;

static class Program
{
static int TargetSum = 10;
static int[] InputData = new[] { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
static void Main()
{
// find all permutations
var permutations = Permute(InputData);
// check each permutation for the sum
foreach (var item in permutations) {
if (item.Sum() == TargetSum) {
Console.Write(string.Join(" + ", item.Select(n => n.ToString()).ToArray()));
Console.Write(" = " + TargetSum.ToString());
Console.WriteLine();
}
}
Console.ReadKey();
}
static IEnumerable<int[]> Permute(int[] data) { return Permute(data, 0); }
static IEnumerable<int[]> Permute(int[] data, int level)
{
// reached the edge yet? backtrack one step if so.
if (level >= data.Length) yield break;
// yield the first #level elements
yield return data.Take(level + 1).ToArray();
// permute the remaining elements
for (int i = level + 1; i < data.Length; i++) {
var temp = data[level];
data[level] = data[i];
data[i] = temp;
foreach (var item in Permute(data, level + 1))
yield return item;
temp = data[i];
data[i] = data[level];
data[level] = temp;
}
}
}
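For comparison, the generate-and-check idea can also be written as a plain subset enumeration, where each element is either taken or skipped. This is a rough Java sketch under two assumptions that the question does not state: each element may be used at most once, and the order of elements within a result does not matter. The class name `SubsetSumEnumerator` is made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative exhaustive search: visits all 2^n subsets and keeps those with the target sum.
class SubsetSumEnumerator {
    static List<List<Integer>> subsetsWithSum(int[] data, int target) {
        List<List<Integer>> result = new ArrayList<>();
        collect(data, 0, target, new ArrayList<>(), result);
        return result;
    }

    private static void collect(int[] data, int index, int remaining,
                                List<Integer> chosen, List<List<Integer>> result) {
        if (index == data.length) {
            // every element has been decided: keep non-empty subsets that hit the target
            if (remaining == 0 && !chosen.isEmpty()) result.add(new ArrayList<>(chosen));
            return;
        }
        // Branch 1: take data[index].
        chosen.add(data[index]);
        collect(data, index + 1, remaining - data[index], chosen, result);
        chosen.remove(chosen.size() - 1); // undo the choice (removes by position)
        // Branch 2: skip data[index].
        collect(data, index + 1, remaining, chosen, result);
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
        for (List<Integer> subset : subsetsWithSum(data, 10)) {
            System.out.println(subset);
        }
    }
}
```

The running time is exponential in the number of elements, which is what you would expect for an NP-complete problem solved by enumeration.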
Dynamic Programming would yield the best runtime for an exact solution. The Subset Sum Problem page on Wikipedia has some pseudo-code for the algorithm. Essentially you order all the numbers and add up all the possible sequences in order such that you minimize the number of additions. The runtime is pseudo-polynomial.
For a polynomial algorithm you could use an Approximation Algorithm. Pseudo-code is also available at the Subset Sum Problem page.
Of the two algorithms I would choose the dynamic programming one since it is straight-forward and has a good runtime with most data sets.
However if the integers are all non-negative and fit with the description on the Wikipedia page then you could actually do this in polynomial time with the approximation algorithm.
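As a rough illustration of the dynamic programming idea, the Java sketch below counts how many subsets reach the target instead of listing them. It assumes non-negative integers, a non-negative target, and that each element is used at most once; `SubsetSumDp` is a made-up name, and this is a generic table-filling variant rather than a transcription of the Wikipedia pseudo-code.

```java
// Illustrative pseudo-polynomial DP: O(n * target) time, O(target) memory.
class SubsetSumDp {
    // Assumes every value in data is non-negative and target >= 0.
    static long countSubsets(int[] data, int target) {
        // ways[s] = number of subsets of the items processed so far whose sum is s
        long[] ways = new long[target + 1];
        ways[0] = 1; // the empty subset
        for (int value : data) {
            // go downwards so that each item is used at most once
            for (int s = target; s >= value; s--) {
                ways[s] += ways[s - value];
            }
        }
        return ways[target];
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
        System.out.println(countSubsets(data, 10)); // prints 10
    }
}
```

The table size depends on the target value rather than only on the number of elements, which is why the runtime is called pseudo-polynomial.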

Resources