Algorithm to generate TSP solutions randomly - algorithm

The most common heuristics to solve the TSP problem (in particular the Kernighan–Lin heuristic) require to work on a randomly generated tour and to improve the solution starting from that. However, the only way I came up with to do that is to generate a random permutation of the vertices and to check if it is a solution or not.
For large instances of the problem (for example 1000 vertices) this process can take a while. Is there another smart way to generate a random tour for TSP problem faster?? Note that I'm looking for a tour, no matter the cost, and not an optimal solution.
Thanks in advance

If you're just looking for any tour, you can use Breadth or Depth First Search to generate a path while marking the nodes visited.

You could just create an array which contains ths problem's cities and then randomly shuffle that array (there are methods that could do this).
The resulting array is in fact a random permutation.

You want to use a space-filling-curve.

Related

Approximated closest pair algorithm

I have been thinking about a variation of the closest pair problem in which the only available information is the set of distances already calculated (we are not allowed to sort points according to their x-coordinates).
Consider 4 points (A, B, C, D), and the following distances:
dist(A,B) = 0.5
dist(A,C) = 5
dist(C,D) = 2
In this example, I don't need to evaluate dist(B,C) or dist(A,D), because it is guaranteed that these distances are greater than the current known minimum distance.
Is it possible to use this kind of information to reduce the O(n²) to something like O(nlogn)?
Is it possible to reduce the cost to something close to O(nlogn) if I accept a kind of approximated solution? In this case, I am thinking about some technique based on reinforcement learning that only converges to the real solution when the number of reinforcements go to infinite, but provides a great approximation for small n.
Processing time (measured by the big O notation) is not the only issue. To keep a very large amount of previous calculated distances can also be an issue.
Imagine this problem for a set with 10⁸ points.
What kind of solution should I look for? Was this kind of problem solved before?
This is not a classroom problem or something related. I have been just thinking about this problem.
I suggest using ideas that are derived from quickly solving k-nearest-neighbor searches.
The M-Tree data structure: (see http://en.wikipedia.org/wiki/M-tree and http://www.vldb.org/conf/1997/P426.PDF ) is designed to reduce the number distance comparisons that need to be performed to find "nearest neighbors".
Personally, I could not find an implementation of an M-Tree online that I was satisfied with (see my closed thread Looking for a mature M-Tree implementation) so I rolled my own.
My implementation is here: https://github.com/jon1van/MTreeMapRepo
Basically, this is binary tree in which each leaf node contains a HashMap of Keys that are "close" in some metric space you define.
I suggest using my code (or the idea behind it) to implement a solution in which you:
Search each leaf node's HashMap and find the closest pair of Keys within that small subset.
Return the closest pair of Keys when considering only the "winner" of each HashMap.
This style of solution would be a "divide and conquer" approach the returns an approximate solution.
You should know this code has an adjustable parameter the governs the maximum number of Keys that can be placed in an individual HashMap. Reducing this parameter will increase the speed of your search, but it will increase the probability that the correct solution won't be found because one Key is in HashMap A while the second Key is in HashMap B.
Also, each HashMap is associated a "radius". Depending on how accurate you want your result you maybe able to just search the HashMap with the largest hashMap.size()/radius (because this HashMap contains the highest density of points, thus it is a good search candidate)
Good Luck
If you only have sample distances, not original point locations in a plane you can operate on, then I suspect you are bounded at O(E).
Specifically, it would seem from your description that any valid solution would need to inspect every edge in order to rule out it having something interesting to say, meanwhile, inspecting every edge and taking the smallest solves the problem.
Planar versions bypass O(V^2), by using planar distances to deduce limitations on sets of edges, allowing us to avoid needing to look at most of the edge weights.
Use same idea as in space partitioning. Recursively split given set of points by choosing two points and dividing set in two parts, points that are closer to first point and points that are closer to second point. That is same as splitting points by a line passing between two chosen points.
That produces (binary) space partitioning, on which standard nearest neighbour search algorithms can be used.

Unknown algorithm name

I have a matching problem, and have designed a method of solving it.
I need to know if an algorithm exists, if so the name, for the following situation, I have looked a lot, but cannot find anything. The closest I have come to is round robin, but thats not quite the same.
It is similar to some networking problems, but they generally don't get the most optimal route, they just settle for a good one. I need the most optimal.
This is a long read, and an unusual request for SO, but I cannot find a name for it anywhere.
The Problem
I have a pile of items.
Each item can be potentially connected to 1 or more other items.
Each item can only be paired once.
Each connection has a value.
I need to find what combination of pairs will result in the highest connection value.
My Solution
find all pairs for all items and store it in a map
take the first item and pair it with the first item in the maps value.
take the next item and pair it with the first unused item.
keep doing this until no more unused items exist.
save this combination or pairs total value.
change the last pair in the pair if possible.
compare to saved combination, if more save new combination
when the pair can no longer be changed, delete it and change the new last pair, find more possible pairs.
this keeps going until the combination list is reduced to size zero
The last saved combination is the best one.
(Fin)
This problem is basically just a maximum weighted matching.
For bipartite graphs, finding a maximum matching is easy. For arbitrary graphs, it's harder but still doable. Wikipedia suggests an algorithm by Edmonds called the Blossom Algorithm.
As for your algorithm, it's not clear exactly what you're doing, but it looks like a greedy assignment followed by hill climbing. I'm concerned that your algorithm isn't guaranteed to produce an optimal result. Have you actually proven this? How do you know it won't just get stuck in a local minima?
I believe you are looking for the Gale-Shapley algorithm, which solves the stable marriage problem.

Travelling Salesman with multiple salesmen?

I have a problem that has been effectively reduced to a Travelling Salesman Problem with multiple salesmen. I have a list of cities to visit from an initial location, and have to visit all cities with a limited number of salesmen.
I am trying to come up with a heuristic and was wondering if anyone could give a hand. For example, if I have 20 cities with 2 salesmen, the approach that I thought of taking is a 2 step approach. First, divide the 20 cities up randomly into 10 cities for 2 salesman each, and I'd find the tour for each as if it were independent for a few iterations. Afterwards, I'd like to either swap or assign a city to another salesman and find the tour. Effectively, it'd be a TSP and then minimum makespan problem. The problem with this is that it'd be too slow and good neighborhood generation of swapping or assigning a city is hard.
Can anyone give an advise on how I could improve the above?
EDIT:
The geo-location for each city are known, and the salesmen start and end at the same places. The goal is to minimize the max travelling time, making this sort of a minimum makespan problem. So for example, if salesman1 takes 10 hours and salesman2 takes 20 hours, the maximum travelling time would be 20 hours.
TSP is a difficult problem. Multi-TSP is probably much worse. I'm not sure you can find good solutions with ad-hoc methods like this. Have you tried meta-heuristic methods ? I'd try using the Cross Entropy method first : it shouldn't be too hard to use it for your problem. Otherwise look for Generic Algorithms, Ant Colony Optimization, Simulated Annealing ...
See "A Tutorial on the Cross-Entropy Method" from Boer et al. They explain how to use the CE method on the TSP. A simple adaptation for your problem might be to define a different matrix for each salesman.
You might want to assume that you only want to find the optimal partition of cities between the salesmen (and delegate the shortest tour for each salesman to a classic TSP implementation). In this case, in the Cross Entropy setting, you consider a probability for each city Xi to be in the tour of salesman A : P(Xi in A) = pi. And you work, on the space of p = (p1, ... pn). (I'm not sure it will work very well, because you will have to solve many TSP problems.)
When you start talking about multiple salesmen I start thinking about particle swarm optimization. I've found a lot of success with this using a gravitational search algorithm. Here's a (lengthy) paper I found covering the topic. http://eprints.utm.my/11060/1/AmirAtapourAbarghoueiMFSKSM2010.pdf
Why don't you convert your multiple TSP into the traditional TSP?
This is a well-known problem (transforming multiple salesman TSP into TSP) and you can find several articles on it.
For most transformations, you basically copy your depot (where the salesmen start and finish) into several depots (in your case 2), make the edge weights in a way to force a TSP to come back to the depot twice, and then remove the two depots and turn them into one.
Voila! got two min cost tours that cover the vertices exactly once.
I wouldn't start writing an algorythm for such complicated issue (unless that's my day job - to write optimization algorythms). Why don't you turn to a generic solution like http://www.optaplanner.org/ ? You have to define your problem and the program uses algorithms that top developers took years to create and optimize.
My first thought on reading the problem description would be to use a standard approach for the salesman problem (googling for an appropriate one as I've never actually had to write code for it); Then take the result and split it in half. Your algorithm then could be to decide where "half" is -- maybe it is half of the cities, or maybe it is based on distance, or maybe some combination. Or search the result for the largest distance between two cities and choose that as the separation between salesman #1's last city and salesman #2's first city. Of course it does not limit to two salesman, you would break into x pieces; But overall the idea is that your standard 1 salesman TSP solution should have already gotten the "nearby" cities next to each other in the travel graph, so you don't have to come up with a separate grouping algorithm...
Anyway, I'm sure there are better solutions, but this seems like a good first approach to me.
Have a look at this question (562904) - while not identical to yours there should be some good food for thought and references for further study.
As mentioned in the answer above the hierarchical clustering solution will work very well for your problem. Instead of continuing to dissolve clusters until you have a single path, however, stop when you have n, where n is the number of salesmen you have. You can probably improve it some by adding in some "fake" stops to improve the likelihood that your clusters end up evenly spaced from the initial destination if the initial clusters are too disparate. It's not optimal - but you're not going to get an optimal solution for a problem like this. I'd create an app that visualizes the problem and then test many variants of the solution to get a feel for whether your heuristic is optimal enough.
In any case I would not randomize the clusters, that would cause the majority of the clusters to be sub-optimal.
just by starting to read your question using genetic algorithm came to my head. just use two genetic algorithm in the same time one can solve how to assign cities to salesmen and the other can solve the TSP for each salesman you have.

Depth First Search Basics

I'm trying to improve my current algorithm for the 8 Queens problem, and this is the first time I'm really dealing with algorithm design/algorithms. I want to implement a depth-first search combined with a permutation of the different Y values described here:
http://en.wikipedia.org/wiki/Eight_queens_puzzle#The_eight_queens_puzzle_as_an_exercise_in_algorithm_design
I've implemented the permutation part to solve the problem, but I'm having a little trouble wrapping my mind around the depth-first search. It is described as a way of traversing a tree/graph, but does it generate the tree graph? It seems the only way that this method would be more efficient only if the depth-first search generates the tree structure to be traversed, by implementing some logic to only generate certain parts of the tree.
So essentially, I would have to create an algorithm that generated a pruned tree of lexigraphic permutations. I know how to implement the pruning logic, but I'm just not sure how to tie it in with the permutation generator since I've been using next_permutation.
Is there any resources that could help me with the basics of depth first searches or creating lexigraphic permutations in tree form?
In general, yes, the idea of the depth-first search is that you won't have to generate (or "visit" or "expand") every node.
In the case of the Eight Queens problem, if you place a queen such that it can attack another queen, you can abort that branch; it cannot lead to a solution.
If you were solving a variant of Eight Queens such that your goal was to find one solution, not all 92, then you could quit as soon as you found one.
More generally, if you were solving a less discrete problem, like finding the "best" arrangement of queens according to some measure, then you could abort a branch as soon as you knew it could not lead to a final state better than a final state you'd already found on another branch. This is related to the A* search algorithm.
Even more generally, if you are attacking a really big problem (like chess), you may be satisfied with a solution that is not exact, so you can abort a branch that probably can't lead to a solution you've already found.
The DFS algorithm itself does not generate the tree/graph. If you want to build the tree and graph, it's as simple building it as you perform the search. If you only want to find one soution, a flat LIFO data structure like a linked list will suffice for this: when you visit a new node, append it to the list. When you leave a node to backtrack in the search, pop the node off.
A book called "Introduction to algorithms" by anany levitan has a proper explanation for your understanding. He also provided the solution to 8 queens problem just the way you desctribed it. It will helpyou for sure.
As my understanding, for finding one solution you dont need any permutation all you need is dfs.That will lonely suffice in finding solution

Matching algorithm

Odd question here not really code but logic,hope its ok to post it here,here it is
I have a data structure that can be thought of as a graph.
Each node can support many links but is limited to a value for each node.
All links are bidirectional. and each link has a cost. the cost depends on euclidian difference between the nodes the minimum value of two parameters in each node. and a global modifier.
i wish to find the maximum cost for the graph.
wondering if there was a clever way to find such a matching, rather than going through in brute force ...which is ugly... and i'm not sure how i'd even do that without spending 7 million years running it.
To clarify:
Global variable = T
many nodes N each have E,X,Y,L
L is the max number of links each node can have.
cost of link A,B = Sqrt( min([a].e | [b].e) ) x
( 1 + Sqrt( sqrt(sqr([a].x-[b].x)+sqr([a].y-[b].y)))/75 + Sqrt(t)/10 )
total cost =sum all links.....and we wish to maximize this.
average values for nodes is 40-50 can range to (20..600)
average node linking factor is 3 range 0-10.
For the sake of completeness for anybody else that looks at this article, i would suggest revisiting your graph theory algorithms:
Dijkstra
Astar
Greedy
Depth / Breadth First
Even dynamic programming (in some situations)
ect. ect.
In there somewhere is the correct solution for your problem. I would suggest looking at Dijkstra first.
I hope this helps someone.
If I understand the problem correctly, there is likely no polynomial solution. Therefore I would implement the following algorithm:
Find some solution by beng greedy. To do that, you sort all edges by cost and then go through them starting with the highest, adding an edge to your graph while possible, and skipping when the node can't accept more edges.
Look at your edges and try to change them to archive higher cost by using a heuristics. The first that comes to my mind: you cycle through all 4-tuples of nodes (A,B,C,D) and if your current graph has edges AB, CD but AC, BD would be better, then you make the change.
Optionally the same thing with 6-tuples, or other genetic algorithms (they are called that way because they work by mutations).
This is equivalent to the traveling salesman problem (and is therefore NP-Complete) since if you could solve this problem efficiently, you could solve TSP simply by replacing each cost with its reciprocal.
This means you can't solve exactly. On the other hand, it means that you can do exactly as I said (replace each cost with its reciprocal) and then use any of the known TSP approximation methods on this problem.
Seems like a max flow problem to me.
Is it possible that by greedily selecting the next most expensive option from any given start point (omitting jumps to visited nodes) and stopping once all nodes are visited? If you get to a dead end backtrack to the previous spot where you are not at a dead end and greedily select. It would require some work and probably something like a stack to keep your paths in. I think this would work quite effectively provided the costs are well ordered and non negative.
Use Genetic Algorithms. They are designed to solve the problem you state rapidly reducing time complexity. Check for AI library in your language of choice.

Resources