I've seen online that one can write the travelling salesman problem as a linear expression and compute it using software such as CPLEX for java.
I have a 1000 towns and need to find a short distance. I plan on partitioning these 1000 towns into clusters of ~100 towns and performing some linear programming algorithm on these individual clusters.
The question I have is, how exactly do I represent this as a linear expression.
So I have 100 towns and I'm sure everyone's aware of how TSP works.
I literally have no clue how I can write linear constraints, objectives and variables which satisfy the TSP.
Could someone explain to me how this is done or send me a link which explains it clearly, because I've been researching a lot and can't seem to find anything.
EDIT:
A bit of extra information I found:
We label the cities with numbers 0 to n and define the matrix:
Would this yield the following matrix for 5 towns?
The constraints are:
i) Each city be arrived at from exactly one other city
ii) From each city there is a departure to exactly one other city
iii) The route isn't broken up into separate islands.
Again, this makes complete sense to me, but I'm still having trouble writing these constraints as a linear expression. Apparently it's a simple enough matrix.
Thanks for any help !
According to this Wikipedia article the travelling salesman problem can be modelled as an integer linear program, which I believe to be the key issue of the question. The idea is to have decision variables of permitted values in {0,1} which model selected edges in the graph. Suitable constraints must ensure that the selected edges cover every node, the selected edges form a collection of cycles and there is only one connected component (which in total means that there is exactly one cycle which contains every node). Note that the article also gives an explicit formulation and explains the interpretations of the constraints.
As the travelling salesman problem is NP-hard, it cannot be solved via (non-integral) linear programming unless P=NP.
The TSP problem is a rather complex integer programming problem due to its combinatorial nature.
There are several (exact and approximated) techniques to solve it. The model in Wikipedia is just one of them: it has constraints to ensure there is only one incoming and outgoing arc in each node and the constraints with u variables are for preventing sub-cycles in the solution. There is also several ways to write this sub-cycle preventing constraints, some of them can be found on section 4.1 of this article. The section presents some models for the TSP problem (all of them can be solved using CPLEX, Gurobi, MATLAB or other Integer/Linear Programming Software.
Hope I could be of any help. (=
Related
A couple weeks ago I encountered a problem that I virtually broke down to a variation of the traveling salesman problem. The twists are:
There are multiple salemen.
The list of cities is dynamically increasing (as in, live input)
Each city is only fully profitable for a limited amount of time, as in after a certain time the city will return less of a reward
And there is an overall time limit
Obviously, this problem is NP. I was wondering if there were any good TSP approximations that could have been modified to fit this problem?
You may be able to use some operations research software to solve your problem, e.g. Coin-OR, the reason being that it's generally easier to add new constraints / objectives to an OR constraint/linear/integer/etc programming solver than to e.g. a specialized TSP solver written in C or Fortran or whatever (and it's not likely that you'll find some C/Fortran code to solve your precise problem). Here is a tutorial on how to code a Tabu search to solve the basic TSP using Coin-OR. In addition, here is an article on modeling the time-dependent TSP (the article uses branch-and-bound to solve the problem which probably isn't appropriate for your problem as it doesn't scale beyond a hundred cities or so, but the model should carry over to an approximate solver like Coin-OR).
To account for having multiple salesmen, you may want to look into graph partitioning to divide up the cities among the different salesmen, for example here is a fast online graph partitioning algorithm. The advantage is that once you've partitioned the graphs you can reduce or even eliminate synchronization between the different salesmen.
I have a problem that has been effectively reduced to a Travelling Salesman Problem with multiple salesmen. I have a list of cities to visit from an initial location, and have to visit all cities with a limited number of salesmen.
I am trying to come up with a heuristic and was wondering if anyone could give a hand. For example, if I have 20 cities with 2 salesmen, the approach that I thought of taking is a 2 step approach. First, divide the 20 cities up randomly into 10 cities for 2 salesman each, and I'd find the tour for each as if it were independent for a few iterations. Afterwards, I'd like to either swap or assign a city to another salesman and find the tour. Effectively, it'd be a TSP and then minimum makespan problem. The problem with this is that it'd be too slow and good neighborhood generation of swapping or assigning a city is hard.
Can anyone give an advise on how I could improve the above?
EDIT:
The geo-location for each city are known, and the salesmen start and end at the same places. The goal is to minimize the max travelling time, making this sort of a minimum makespan problem. So for example, if salesman1 takes 10 hours and salesman2 takes 20 hours, the maximum travelling time would be 20 hours.
TSP is a difficult problem. Multi-TSP is probably much worse. I'm not sure you can find good solutions with ad-hoc methods like this. Have you tried meta-heuristic methods ? I'd try using the Cross Entropy method first : it shouldn't be too hard to use it for your problem. Otherwise look for Generic Algorithms, Ant Colony Optimization, Simulated Annealing ...
See "A Tutorial on the Cross-Entropy Method" from Boer et al. They explain how to use the CE method on the TSP. A simple adaptation for your problem might be to define a different matrix for each salesman.
You might want to assume that you only want to find the optimal partition of cities between the salesmen (and delegate the shortest tour for each salesman to a classic TSP implementation). In this case, in the Cross Entropy setting, you consider a probability for each city Xi to be in the tour of salesman A : P(Xi in A) = pi. And you work, on the space of p = (p1, ... pn). (I'm not sure it will work very well, because you will have to solve many TSP problems.)
When you start talking about multiple salesmen I start thinking about particle swarm optimization. I've found a lot of success with this using a gravitational search algorithm. Here's a (lengthy) paper I found covering the topic. http://eprints.utm.my/11060/1/AmirAtapourAbarghoueiMFSKSM2010.pdf
Why don't you convert your multiple TSP into the traditional TSP?
This is a well-known problem (transforming multiple salesman TSP into TSP) and you can find several articles on it.
For most transformations, you basically copy your depot (where the salesmen start and finish) into several depots (in your case 2), make the edge weights in a way to force a TSP to come back to the depot twice, and then remove the two depots and turn them into one.
Voila! got two min cost tours that cover the vertices exactly once.
I wouldn't start writing an algorythm for such complicated issue (unless that's my day job - to write optimization algorythms). Why don't you turn to a generic solution like http://www.optaplanner.org/ ? You have to define your problem and the program uses algorithms that top developers took years to create and optimize.
My first thought on reading the problem description would be to use a standard approach for the salesman problem (googling for an appropriate one as I've never actually had to write code for it); Then take the result and split it in half. Your algorithm then could be to decide where "half" is -- maybe it is half of the cities, or maybe it is based on distance, or maybe some combination. Or search the result for the largest distance between two cities and choose that as the separation between salesman #1's last city and salesman #2's first city. Of course it does not limit to two salesman, you would break into x pieces; But overall the idea is that your standard 1 salesman TSP solution should have already gotten the "nearby" cities next to each other in the travel graph, so you don't have to come up with a separate grouping algorithm...
Anyway, I'm sure there are better solutions, but this seems like a good first approach to me.
Have a look at this question (562904) - while not identical to yours there should be some good food for thought and references for further study.
As mentioned in the answer above the hierarchical clustering solution will work very well for your problem. Instead of continuing to dissolve clusters until you have a single path, however, stop when you have n, where n is the number of salesmen you have. You can probably improve it some by adding in some "fake" stops to improve the likelihood that your clusters end up evenly spaced from the initial destination if the initial clusters are too disparate. It's not optimal - but you're not going to get an optimal solution for a problem like this. I'd create an app that visualizes the problem and then test many variants of the solution to get a feel for whether your heuristic is optimal enough.
In any case I would not randomize the clusters, that would cause the majority of the clusters to be sub-optimal.
just by starting to read your question using genetic algorithm came to my head. just use two genetic algorithm in the same time one can solve how to assign cities to salesmen and the other can solve the TSP for each salesman you have.
I am looking for a name for this problem or any leads on an algorithm or source code:
Example: You want to find the best route to visit the 100 largest cities in the US (classic TSP) but before you can visit any given city you must visit the capital of the state it is in.
Example: You are collecting permission slips from the students of several professors. You need to visit every student and every professor but you can't visit a professor until you have seen all of his students.
Some Googling turns up the sequential ordering problem or "SOP" but there is not so much literature that I am convinced that this is a widely accepted name.
I don't think these partial orderings can be represented within the classic TSP simply by choosing which edges to use in the graph (e.g. you can't initially go from New York to Chicago, but once you visit Springfield you can) or weights, but I may be wrong.
The Sequential Ordering Problem was first introduced by Escudero in 1988 in a paper entitled "An Inexact Algorithm for the Sequential Ordering Problem" (this appeared in the European Journal of Operational Research), so this is the original name for the problem. The abstract of the paper reads:
Given the directed G= (N, A) and the
penalty matrix C, the Sequential
Ordering Problem (hereafter, SOP)
consists of finding the permutation of
the nodes from the set N, such that it
minimizes a C-based function and does
not violate the precedence
relationships given by the set A.
Strong sufficient conditions for the
infeasibility of a SOP's instance are
imbedded in a procedure for the SOP's
preprocessing. Note that it is one of
the key steps in any algorithm that
attempts to solve SOP. By dropping the
constraints related to the precedence
relationships, SOP can be converted in
the classical Asymmetric Traveling
Salesman Problem (hereafter, ATSP).
The algorithm obtains (hopefully)
satisfactory solutions by modifying
the optimal solution to the related
Assignment Problem (hereafter, AP) if
it is not a Feasible Sequential
Ordering (hereafter, FSO). The new
solution 'patches' the subtours (if
any) giving preference to the patches
with zero reduced cost in the linking
arcs. The AP-based lower bound on the
optimal solution to ATSP is tightened
by using some of the procedures given
in [1]. In any case, a local search
for improving the initial FSO is
performed; it uses 3- and 4-change
based procedures. Computational
results on a broad set of cases are
reported.
Escudero and his collaborators have a number of papers on the subject, with references to even more. Searching for papers by him or that reference this paper may help you if you're looking through the literature.
SOP is a well-studied constrained version of the Asymmetric Travelling Salesman Problem, so much of the literature on ATSP may be related.
You could build a state machine that takes into account the ordering requirements, annotate the transitions with your weights, and solve the traveling salesman on that. Except you'd have a lot more nodes: 2^(number of capitals) times the original number of nodes.
Given a list of cities and the cost to fly between each city, I am trying to find the cheapest itinerary that visits all of these cities. I am currently using a MATLAB solution to find the cheapest route, but I'd now like to modify the algorithm to allow the following:
repeat nodes - repeat nodes should be allowed, since travelling via hub cities can often result in a cheaper route
dynamic edge weights - return/round-trip flights have a different (usually lower) cost to two equivalent one-way flights
For now, I am ignoring the issue of flight dates and assuming that it is possible to travel from any city to any other city.
Does anyone have any ideas how to solve this problem? My first idea was to use an evolutionary optimisation method like GA or ACO to solve point 2, and simply adjust the edge weights when evaluating the objective function based on whether the itinerary contains return/round-trip flights, but perhaps somebody else has a better idea.
(Note: I am using MATLAB, but I am not specifically looking for coded solutions, more just high-level ideas about what algorithms can be used.)
Edit - after thinking about this some more, allowing "repeat nodes" seems to be too loose of a constraint. We could further constrain the problem so that, although nodes can be repeatedly visited, each directed edge can only be visited at most once. It seems reasonable to ignore any itineraries which include the same flight in the same direction more than once.
I haven't tested it myself; however, I have read that implementing Simulated Annealing to solve the TSP (or variants of it) can produce excellent results. The key point here is that Simulated Annealing is very easy to implement and requires minimal tweaking, while approximation algorithms can take much longer to implement and are probably more error prone. Skiena also has a page dedicated to specific TSP solvers.
If you want the cost of the solution produced by the algorithm is within 3/2 of the optimum then you want the Christofides algorithm. ACO and GA don't have a guaranteed cost.
Solving the TSP is a NP-hard problem for its subcycles elimination constraints, if you remove any of them (for your hub cities) you just make the problem easier.
But watch out: TSP has similarities with association problem in the meaning that you could obtain non-valid itineraries like:
Cities: New York, Boston, Dallas, Toronto
Path:
Boston - New York
New York - Boston
Dallas - Toronto
Toronto - Dallas
which is clearly wrong since we don't go across all cities.
The subcycle elimination constraints serve just to this purpose. Including a 'hub city' sounds like you need to add weights to the point and make an hybrid between flux problems and tsp problems. Sounds pretty hard but the first try may be: eliminate the subcycles constraints relative to your hub cities (and leave all the others). You can then link the subcycles obtained for the hub cities together.
Good luck
Firstly, what is approximate number of cities in your problem set? (Up to 100? More than 100?)
I have a fair bit of experience with GA (not ACO), and like epitaph says, it has a bit of gambling aspect. For some input, it might stop at a brutally inefficient solution. So, what I have done in the past is to use GA as the first option, compare the answer to some lower bound, and if that seems to be "way off", then run a second (usually a less efficient) algorithm.
Of course, I used plenty of terms that were not standard, so let us make sure that we agree what they would be in this context:
lower bound - of course, in this case, MST would be a lower bound.
"Way Off" - If triangle inequality holds, then an upper bound is UB = 2 * MST. A good "way off" in this context would be 2 * UB.
Second algorithm - In this case, both a linear programming based approach and Christofides would be good choices.
If you limit the problem to round-trips (i.e. the salesman can only buy round-trip tickets), then it can be represented by an undirected graph, and the problem boils down to finding the minimum spanning tree, which can be done efficiently.
In the general case I don't know of a clever way to use efficient algorithms; GA or similar might be a good way to go.
Do you want a near-optimal solution, or do you want the optimal solution?
For the optimal solution, there's still good ol' brute force. Due to requirement 1 involving repeat nodes, you'll have to make sure you search breadth-first, not dept-first. Otherwise you can end up in an infinite loop. You can slowly drop all routes that exceed your current minimum until all routes are exhausted and the minimal route is discovered.
I want to implement 2-SAT problem for 100000 literals. So there would be 200000 vertices. So I am stuck on having a array of all reachable vertices from each vertex, space complexity of O(200000^2) which is infeasible So please suggest a solution for this. And please throw some light on efficient implementation of 2-SAT problem.
From wikipedia:
... 2-satisfiability can be solved in polynomial time. As Aspvall, Plass & Tarjan (1979) observed, a 2-satisfiability instance is solvable if and only if every variable of the instance belongs to a different strongly connected component of the implication graph than the negation of the same variable. Since strongly connected components may be found in linear time by an algorithm based on depth first search, the same linear time bound applies as well to 2-satisfiability.
I won't pretend to understand most of that paragraph, but it would appear that there is an algorithm which can be used to solve the 2-SAT problem, and it is described within that referenced document (A linear-time algorithm for testing the truth of certain quantified boolean formulas). It can apparently be purchased online for about $20 USD. I'm Not sure if that's helpful or not, but there it is!
update: A free PDF of the same document can be found here. Credit goes to liori for the find.
This whole thread is a bit messed up. Yes, one can solve 2-sat in linear time, but no - you can not solve it for that many variables. The time to solve 2-sat is linear with respect to the number of the implication, which for 200 000 variables could reach up to (200000*199999)/2 and furthermore if you use this solution you will need about the same amount of memory. There is another solution(not using strongly connected components that is slower but doesn't need that much memory).