Metric travelling salesman, force an edge into solution - algorithm

Normally the TSP solution is the one so that the total cost on edges is minimal.
However in my case I need a specific edge on the solution, it does not matter if it the solution is not optimal anymore.
It does matter, however, that of all Hamiltonian cycles containing that edge the obtained solution is optimal. Or at least bounded.
More formally the problem would be: given a complete metric graph and a specific edge, what is the Hamiltonian cycle which cost is minimal passing through that specific edge?
Edit:
transform the graph is probably a good idea. But keep in mind the resulting graph must still be metric and complete. A non-complete graph is equivalent to a non-metric one in this case, just think that the missing edge is actually an overly expensive one.
This is important because there cannot be polinomial-time algorithm for general distances.
If you are curious the proof of this fact is in "P-complete approximation problems" of S. Sahni and T. Gonzalez (1976).

How about making that edge's cost low enough that no Hamiltonian cycle that contains it could be more costly than a Hamiltonian cycle that does not contain it?
Let S be the sum of all distances in the graph. Add 2*S to every edge's cost, except the fixed one. That way every Hamiltonian cycle that contains the fixed edge will have cost at most (N-1)*2*S+S, and every cycle that does not contain it will have cost at least N*2*S.
The triangle inequality is also preserved, since every triangle (x, y, z) becomes either (x+2*S, y+2*S, z+2*S) or (x, y+2*S, z+2*S).

If X-Y is the edge, you can introduce a new vertex Z such that Z is connected only to X and Y, and remove X-Y. Distance(X,Z) + Distance(Z,Y) = Distance(X,Y).

Related

Which of the following statements are true for the given special cases of the Traveling Salesman Problem?

I'm taking the Algorithms: Design and Analysis II class, one of the questions asks:
Which of the following statements is true?
Consider a TSP instance in which every edge cost is either 1 or 2. Then an optimal tour can be computed in polynomial time.
Consider a TSP instance in which every edge cost is negative. The dynamic programming algorithm covered in the video lectures might not
correctly compute the optimal (i.e., minimum sum of edge lengths) tour
of this instance.
Consider a TSP instance in which every edge cost is negative. Deleting a vertex and all of its incident edges cannot increase the
cost of the optimal (i.e., minimum sum of edge lengths) tour.
Consider a TSP instance in which every edge cost is the Euclidean distance between two points in the place (just like in Programming
Assignment #5). Deleting a vertex and all of its incident edges cannot
increase the cost of the optimal (i.e., minimum sum of edge lengths)
tour.
I argue as follows:
The DP algorithm doesn't make any assumptions on the edge costs, so option 2 is incorrect.
If all edge weights are negative, then deleting a vertex and all of its incident edges can certainly increase the minimum sum because in effect, that edge weight is now added to the previous minimum. Thus, option 3 is incorrect.
Take the optimal tour in the original instance. Now, instead of visiting the deleted vertex v, skip straight from v's predecessor to its successor on the tour. Because Euclidean distance satisfies the "Triangle Inequality", this shortcut only decreases the overall distance traveled. The best tour can of course only be better. Thus, option 4 is correct.
However, I'm not able to find any significance for TSP problems with unit edge costs. Is option 1 merely a trick, or is there more to it?
OP here:
Papadimitriou and Yannakakis have shown that it is possible to approximate the TSP problem 1 in polynomial time within accuracy 7/6. This guarantee has been further improved by Bläser and Shankar Ram to 65/56. However, no matter how good those results are, those are still approximations, and not an optimal solution. Thus, option 1 is incorrect.

Minimum spanning tree with two edges tied

I'd like to solve a harder version of the minimum spanning tree problem.
There are N vertices. Also there are 2M edges numbered by 1, 2, .., 2M. The graph is connected, undirected, and weighted. I'd like to choose some edges to make the graph still connected and make the total cost as small as possible. There is one restriction: an edge numbered by 2k and an edge numbered by 2k-1 are tied, so both should be chosen or both should not be chosen. So, if I want to choose edge 3, I must choose edge 4 too.
So, what is the minimum total cost to make the graph connected?
My thoughts:
Let's call two edges 2k and 2k+1 a edge set.
Let's call an edge valid if it merges two different components.
Let's call an edge set good if both of the edges are valid.
First add exactly m edge sets which are good in increasing order of cost. Then iterate all the edge sets in increasing order of cost, and add the set if at least one edge is valid. m should be iterated from 0 to M.
Run an kruskal algorithm with some variation: The cost of an edge e varies.
If an edge set which contains e is good, the cost is: (the cost of the edge set) / 2.
Otherwise, the cost is: (the cost of the edge set).
I cannot prove whether kruskal algorithm is correct even if the cost changes.
Sorry for the poor English, but I'd like to solve this problem. Is it NP-hard or something, or is there a good solution? :D Thanks to you in advance!
As I speculated earlier, this problem is NP-hard. I'm not sure about inapproximability; there's a very simple 2-approximation (split each pair in half, retaining the whole cost for both halves, and run your favorite vanilla MST algorithm).
Given an algorithm for this problem, we can solve the NP-hard Hamilton cycle problem as follows.
Let G = (V, E) be the instance of Hamilton cycle. Clone all of the other vertices, denoting the clone of vi by vi'. We duplicate each edge e = {vi, vj} (making a multigraph; we can do this reduction with simple graphs at the cost of clarity), and, letting v0 be an arbitrary original vertex, we pair one copy with {v0, vi'} and the other with {v0, vj'}.
No MST can use fewer than n pairs, one to connect each cloned vertex to v0. The interesting thing is that the other halves of the pairs of a candidate with n pairs like this can be interpreted as an oriented subgraph of G where each vertex has out-degree 1 (use the index in the cloned bit as the tail). This graph connects the original vertices if and only if it's a Hamilton cycle on them.
There are various ways to apply integer programming. Here's a simple one and a more complicated one. First we formulate a binary variable x_i for each i that is 1 if edge pair 2i-1, 2i is chosen. The problem template looks like
minimize sum_i w_i x_i (drop the w_i if the problem is unweighted)
subject to
<connectivity>
for all i, x_i in {0, 1}.
Of course I have left out the interesting constraints :). One way to enforce connectivity is to solve this formulation with no constraints at first, then examine the solution. If it's connected, then great -- we're done. Otherwise, find a set of vertices S such that there are no edges between S and its complement, and add a constraint
sum_{i such that x_i connects S with its complement} x_i >= 1
and repeat.
Another way is to generate constraints like this inside of the solver working on the linear relaxation of the integer program. Usually MIP libraries have a feature that allows this. The fractional problem has fractional connectivity, however, which means finding min cuts to check feasibility. I would expect this approach to be faster, but I must apologize as I don't have the energy to describe it detail.
I'm not sure if it's the best solution, but my first approach would be a search using backtracking:
Of all edge pairs, mark those that could be removed without disconnecting the graph.
Remove one of these sets and find the optimal solution for the remaining graph.
Put the pair back and remove the next one instead, find the best solution for that.
This works, but is slow and unelegant. It might be possible to rescue this approach though with a few adjustments that avoid unnecessary branches.
Firstly, the edge pairs that could still be removed is a set that only shrinks when going deeper. So, in the next recursion, you only need to check for those in the previous set of possibly removable edge pairs. Also, since the order in which you remove the edge pairs doesn't matter, you shouldn't consider any edge pairs that were already considered before.
Then, checking if two nodes are connected is expensive. If you cache the alternative route for an edge, you can check relatively quick whether that route still exists. If it doesn't, you have to run the expensive check, because even though that one route ceased to exist, there might still be others.
Then, some more pruning of the tree: Your set of removable edge pairs gives a lower bound to the weight that the optimal solution has. Further, any existing solution gives an upper bound to the optimal solution. If a set of removable edges doesn't even have a chance to find a better solution than the best one you had before, you can stop there and backtrack.
Lastly, be greedy. Using a regular greedy algorithm will not give you an optimal solution, but it will quickly raise the bar for any solution, making pruning more effective. Therefore, attempt to remove the edge pairs in the order of their weight loss.

Algorithm for determining largest covered area

I'm looking for an algorithm which I'm sure must have been studied, but I'm not familiar enough with graph theory to even know the right terms to search for.
In the abstract, I'm looking for an algorithm to determine the set of routes between reachable vertices [x1, x2, xn] and a certain starting vertex, when each edge has a weight and each route can only have a given maximum total weight of x.
In more practical terms, I have road network and for each road segment a length and maximum travel speed. I need to determine the area that can be reached within a certain time span from any starting point on the network. If I can find the furthest away points that are reachable within that time, then I will use a convex hull algorithm to determine the area (this approximates enough for my use case).
So my question, how do I find those end points? My first intuition was to use Dijkstra's algorithm and stop once I've 'consumed' a certain 'budget' of time, subtracting from that budget on each road segment; but I get stuck when the algorithm should backtrack but has used its budget. Is there a known name for this problem?
If I understood the problem correctly, your initial guess is right. Dijkstra's algorithm, or any other algorithm finding a shortest path from a vertex to all other vertices (like A*) will fit.
In the simplest case you can construct the graph, where weight of edges stands for minimum time required to pass this segment of road. If you have its length and maximum allowed speed, I assume you know it. Run the algorithm from the starting point, pick those vertices with the shortest path less than x. As simple as that.
If you want to optimize things, note that during the work of Dijkstra's algorithm, currently known shortest paths to the vertices are increasing monotonically with each iteration. Which is kind of expected when you deal with graphs with non-negative weights. Now, on each step you are picking an unused vertex with minimum current shortest path. If this path is greater than x, you may stop. There is no chance that you have any vertices with shortest path less than x from now on.
If you need to exactly determine points between vertices, that a vehicle can reach in a given time, it is just a small extension to the above algorithm. As a next step, consider all (u, v) edges, where u can be reached in time x, while v cannot. I.e. if we define shortest path to vertex w as t(w), we have t(u) <= x and t(v) > x. Now use some basic math to interpolate point between u and v with the coefficient (x - t(u)) / (t(v) - t(u)).
Using breadth first search from the starting node seems a good way to solve the problem in O(V+E) time complexity. Well that's what Dijkstra does, but it stops after finding the smallest path. In your case, however, you must continue collecting routes for your set of routes until no route can be extended keeping its weigth less than or equal the maximum total weight.
And I don't think there is any backtracking in Dijkstra's algorithm.

Minimum Weight Cycles with Floyd-Warshall Algorithm

"Let G be a directed weighted graph with no negative cycles. Design an algorithm to find a minimum weight cycle in G that runs with a time complexity of O(|V|^3)."
The above is a question I have been working on as part of my coursework. When I first read it, I immediately thought that the Floyd-Warshall algorithm would solve this problem - mainly because F-W runs in O(|V|^3) time and it works for both positive and negative weighted graphs with no negative cycles. However, I soon remembered that F-W is designed to find the shortest path of a graph, not a minimum weight cycle.
Am I on the right track with this question? Would it be possible to modify the Floyd-Warshall algorithm to find a minimum weight cycle in a graph?
I think you are pretty close. Once you do FW loop in O(|V|^3), you have the weight of a shortest path between each pair of vertices, let's call it D(i, j). Now the solution to our main problem is as follows: Brute force for the vertex which is gonna be in a loop, say it's X. Also brute force on Y, which is gonna be the last vertex in the cycle, before X itself. The weight of this cycle is obviously D(X, Y) + W(Y, X). Then you just choose the one with the smallest value. It's obviously a correct answer, because:
(1) All these values of D(X,Y)+D(Y,X) represent the weight of some(maybe non-simple) cycle;
(2) If the optimal cycle is some a->...->b->a, then we consider it when X=a, Y=b;
To reconstruct the answer, first you need to keep track of which X, Y gave us the optimal result. Once we have this X and Y, the only thing remaining is to find the shortest path from X to Y, which can be done in various ways.

How do I find the number of Hamiltonian cycles that do not use "forbidden" edges?

This question actually rephrases that one. The code jam problem is the following:
You are given a complete undirected graph with N nodes and K "forbidden" edges. N <= 300, K <= 15. Find the number of Hamiltonian cycles in the graph that do not use any of the K "forbidden" edges.
The straightforward DP approach of O(2^N*N^2) does not work for such N. It looks like that the winning solutions are O(2^K). Does anybody know how to solve this problem ?
Let's find out for each subset S of forbidden edges, how many Hamiltonian cycles there exist, that use all edges of S. If we solve this subtask then we'll solve the problem by inclusion-exclusion formula.
Now how do we solve the subtask? Let's draw all edges of S. If there exist a vertex of degree more than 2, then obviously we cannot complete the cycle and the answer is 0. Otherwise the graph is divided into connected components. Each component is a sole vertex, a cycle or a simple path.
If there exists a cycle, then it must pass through all vertices, otherwise we won't be able to complete the Hamiltonian cycle. If this is the case, the answer is 2. (The cycle can be traversed in 2 directions.) Otherwise the answer is 0.
The remaining case is when there are c paths and k sole vertices. To complete the Hamiltonian cycle we must choose the direction of each path (2^c ways) and then choose the order of components. We've got c+k components, so they can be rearranged in (c+k)! ways. But we are interested in cycles, so we don't distinguish the orderings which turn into one another by cyclic shifts. (So (1,2,3), (2,3,1) and (3,1,2) are the same.) It means that we must divide the answer by the number of shifts, c+k. So the answer (to the subtask) is 2^c (c+k-1)!.
This idea is implemented in bmerry's solution (very clean code, btw).
Hamiltonian cycle problem is a special case of travelling salesman problem (obtained by setting the distance between two cities to a finite constant if they are adjacent and infinity otherwise.)
These are NP Complete problems which in simple words means no fast solution to them is known.
A trivial heuristic algorithm for locating Hamiltonian paths is to construct a path abc... and extend it until no longer possible; when the path abc... xyz cannot be extended any longer because all neighbours of z already lie in the path, one goes back one step, removing the edge yz and extending the path with a different neighbour of y; if no choice produces a Hamiltonian path, then one takes a further step back, removing the edge xy and extending the path with a different neighbour of x, and so on. This algorithm will certainly find an Hamiltonian path (if any) but it runs in exponential time.
For more check NP Complete problem chapter of "Introduction to Algos" by Cormen

Resources