Complete Weighted Graph and Hamiltonian Tour - algorithm

I ran into a question on a midterm exam. Can anyone clarify the answer?
Problem A: Given a Complete Weighted Graph G, find a Hamiltonian Tour with minimum weight.
Problem B: Given a Complete Weighted Graph G and Real Number R, does G have a Hamiltonian Tour with weight at most R?
Suppose there is a machine that solves B. How many times do we need to call B (each time G and a real number R are given) to solve problem A with that machine? Suppose the sum of the edge weights in G is at most M.
1) We cannot do this, because there are uncountably many states.
2) O(|E|) times
3) O(lg M) times
4) Because A is NP-hard, this cannot be done.

First algorithm
The answer is (3) O(lg M) times. You just have to perform a binary search for the weight of the minimum Hamiltonian tour. Notice that if there is a Hamiltonian tour of weight L in the graph, there is no point in checking whether a tour of weight L' > L exists, since you are interested in the minimum-weight tour. So in each step of the algorithm you can eliminate half of the remaining possible tour weights. Consequently, you will have to call B on your machine O(lg M) times, where M stands for the total weight of all edges in the complete graph.
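For illustration, here is a minimal sketch of that binary search, assuming nonnegative integer edge weights (so the search space {0, ..., M} is finite); has_tour_at_most(G, r) is a hypothetical stand-in for the machine solving Problem B:

    def min_tour_weight(G, M, has_tour_at_most):
        # Binary-search the optimum in {0, ..., M}; has_tour_at_most(G, r)
        # is a hypothetical stand-in for the machine solving Problem B.
        lo, hi = 0, M
        while lo < hi:
            mid = (lo + hi) // 2
            if has_tour_at_most(G, mid):   # a tour of weight <= mid exists
                hi = mid                   # optimum lies in [lo, mid]
            else:
                lo = mid + 1               # optimum lies in [mid+1, hi]
        return lo                          # O(log M) oracle calls in total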
Edit:
Second algorithm
I have a slight modification of the above algorithm which uses the machine O(|E|) times, since some people said that we cannot apply binary search to an uncountable set of possible values (and they are possibly right): take every possible subset of edges of the graph, and for each subset store the sum of the weights of its edges. Let's store these values for all subsets in an array called Val. The size of this array is 2^|E|. Sort Val in increasing order, and then binary-search for the minimum-weight Hamiltonian tour, but this time call the machine that solves Problem B only with values from the Val array. Since the weight of every subset of edges is included in the sorted array, the solution is guaranteed to be found. The total number of calls to the machine is O(lg(2^|E|)), which is O(|E|). So the correct choice is (2) O(|E|) times.
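As a concrete sketch (with the same hypothetical has_tour_at_most oracle as above): the subset-sum enumeration is of course exponential in |E|; only the number of oracle calls is O(|E|):

    def min_tour_weight_subsets(edges, G, has_tour_at_most):
        # edges: list of (u, v, weight); G is whatever the oracle expects.
        sums = {0.0}
        for (_, _, w) in edges:               # all 2^|E| subset sums
            sums |= {s + w for s in sums}     # (duplicates collapse)
        val = sorted(sums)                    # the sorted array Val
        lo, hi = 0, len(val) - 1
        while lo < hi:                        # O(lg 2^|E|) = O(|E|) calls
            mid = (lo + hi) // 2
            if has_tour_at_most(G, val[mid]):
                hi = mid
            else:
                lo = mid + 1
        return val[lo]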
Note:
The first algorithm I proposed is probably incorrect, as some people noted that you cannot apply binary search to an uncountable set. Since we are talking about real numbers, we cannot apply binary search over the range [0, M].

I believe the choice that was meant to be the answer is (1): you can't do that.
The reason is that you can only do binary search on countable sets.
Note that the edges of the graph may even have negative weights, and besides, they may have fractional or even irrational weights. In that case the search space for the answer is the set of all real values less than M.
However, you may get arbitrarily close to the answer of A in log(n) steps, but you cannot find the exact answer (n being the size of the space being searched).

Supposing that in the encoding of graphs the weights are encoded as binary strings representing nonnegative integers, and that Problem B can actually be solved algorithmically by entering a real number and performing calculations based on it, things are apparently as follows.
It is possible to first do binary search over the integer interval {0,...,M} to obtain the minimum weight of a Hamiltonian tour in O(log M) calls to the algorithm for Problem B. As the optimum is then known, we can delete single edges from G and feed the resulting graph to the algorithm for Problem B to test whether or not the optimum changes. This process uses O(|E|) calls to the algorithm for Problem B to identify edges which occur in an optimal Hamiltonian tour. The overall running time of this approach is O((|E| + log M) * r(G)), where r(G) denotes the running time of the algorithm for Problem B on input G. I suppose that r is a polynomial, although the question does not explicitly state this; in total, the overall running time would then be polynomially bounded in the encoding length of the input, as M can be computed in polynomial time (and hence is pseudopolynomially bounded in the encoding length of G).
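A sketch of this two-phase approach, assuming nonnegative integer weights, a graph represented as a dict from edges to weights, and a hypothetical oracle solves_B(graph, r):

    def optimal_tour(G, M, solves_B):
        # Phase 1: binary search over {0, ..., M} -- O(log M) oracle calls.
        lo, hi = 0, M
        while lo < hi:
            mid = (lo + hi) // 2
            if solves_B(G, mid):
                hi = mid
            else:
                lo = mid + 1
        opt = lo                            # minimum tour weight
        # Phase 2: edge elimination -- O(|E|) oracle calls.
        H = dict(G)
        for e in list(G):
            w = H.pop(e)                    # tentatively delete edge e
            if not solves_B(H, opt):        # optimum got worse, so e lies on
                H[e] = w                    #   every remaining optimal tour
        return opt, H                       # H retains one optimal tour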
That being said, the supposed answers can be commented as follows.
1) This answer is wrong, as the set of necessary states is finite.
2) Might be true, but does not follow from the algorithm discussed above.
3) Might be true, but does not follow from the algorithm discussed above.
4) This answer is wrong. Strictly speaking, the NP-hardness of Problem A does not rule out a polynomial-time algorithm; furthermore, the algorithm for Problem B is not stated to be polynomial, so even P = NP does not follow if Problem A can be solved by a polynomial number of calls to the algorithm for Problem B (which is the case by the algorithm sketched above).

Related

Shortest Path in a Directed Acyclic Graph with two types of costs

I am given a directed acyclic graph G = (V,E), which can be assumed to be topologically ordered (if needed). The edges in G have two types of costs - a nominal cost w(e) and a spiked cost p(e).
The goal is to find the shortest path from a node s to a node t which minimizes the following cost:
sum_e (w(e)) + max_e (p(e)), where the sum and maximum are taken over all edges in the path.
Standard dynamic programming methods show that this problem is solvable in O(E^2) time. Is there a more efficient way to solve it? Ideally, an O(E*polylog(E,V)) algorithm would be nice.
---- EDIT -----
This is the O(E^2) solution I found using dynamic programming.
First, sort all costs p(e) in ascending order. This takes O(E log E) time.
Second, define the state space consisting of states (x,i) where x is a node in the graph and i is in 1,2,...,|E|. A state represents "we are at node x, and the highest edge cost p(e) we have seen so far is the i-th smallest".
Let V(x,i) be the length of the shortest path (in the classical sense) from s to x in which the highest p(e) encountered is the i-th smallest. It is easy to compute V(x,i) given V(y,j) for any predecessor y of x and any j in 1,...,|E| (there are two cases to consider: the edge y->x has the j-th smallest cost, or it does not).
At every state (x,i), this computation takes the minimum of about deg(x) values. Thus the complexity is O(|E| * sum_(x in V) deg(x)) = O(|E|^2), as each node is associated with |E| different states.
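A sketch of this O(|E|^2) dynamic program; names are illustrative, edges is a list of (u, v, w, p) tuples, and topo is a topological order of the nodes. Instead of "exactly the i-th smallest" it uses the equivalent monotone formulation "all ranks <= i":

    def min_cost_path(n, edges, s, t, topo):
        INF = float('inf')
        m = len(edges)
        order = sorted(range(m), key=lambda j: edges[j][3])
        rank = {j: i for i, j in enumerate(order)}    # p-rank, ascending
        incoming = [[] for _ in range(n)]
        for j, (u, v, w, p) in enumerate(edges):
            incoming[v].append(j)
        # V[x][i] = min w-sum over s->x paths using only edges of rank <= i
        V = [[INF] * m for _ in range(n)]
        V[s] = [0] * m
        for x in topo:
            if x == s:
                continue
            for j in incoming[x]:
                u, _, w, _ = edges[j]
                for i in range(rank[j], m):           # edge j is allowed
                    V[x][i] = min(V[x][i], V[u][i] + w)
        # charge the i-th smallest spiked cost on top of the w-sum
        return min(V[t][i] + edges[order[i]][3] for i in range(m))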
I don't see any way to get the complexity you want. Here's an algorithm that I think would be practical in real life.
First, reduce the graph to only the vertices and edges that lie on paths from s to t, and do a topological sort so that you can easily find shortest paths in O(E) time.
Let W(m) be the minimum sum(w(e)) cost over paths with max(p(e)) <= m, and let P(m) be the smallest max(p(e)) among those minimum-cost paths. The problem solution corresponds to W(m)+P(m) for some cost m. Note that we can find W(m) and P(m) simultaneously in O(E) time by finding a shortest W-cost path, using P-cost to break ties.
The relevant values for m are the p(e) costs that actually occur, so make a sorted list of those. Then use a Kruskal's algorithm variant to find the smallest m that connects s to t, and calculate P(infinity) to find the largest relevant m.
Now we have an interval [l,h] of m-values that might be the best. The best possible result in the interval is W(h)+P(l). Make a priority queue of intervals ordered by best possible result, and repeatedly remove the interval with the best possible result, and:
stop if the best possible result = an actual result W(l)+P(l) or W(h)+P(h)
stop if there are no p(e) costs between l and P(h)
stop if the difference between the best possible result and an actual result is within some acceptable tolerance; or
stop if you have exceeded some computation budget
otherwise, pick a p(e) cost t between l and P(h), find a shortest path to get W(t) and P(t), split the interval into [l,t] and [t,h], and put them back in the priority queue and repeat.
The worst case complexity to get an exact result is still O(E^2), but there are many economies and a lot of flexibility in how to stop.
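For reference, a minimal sketch of the W(m)/P(m) primitive this answer relies on: a lexicographic shortest-path DP over the topologically sorted DAG. Names are illustrative; edges is a list of (u, v, w, p) tuples:

    def W_and_P(n, edges, s, t, topo, m):
        INF = float('inf')
        adj = [[] for _ in range(n)]
        for u, v, w, p in edges:
            if p <= m:                       # drop edges with p(e) > m
                adj[u].append((v, w, p))
        best = [(INF, INF)] * n              # (w-sum, max-p) per node
        best[s] = (0, 0)                     # assumes p(e) >= 0
        for u in topo:
            if best[u][0] == INF:
                continue                     # u unreachable under the cap m
            for v, w, p in adj[u]:
                cand = (best[u][0] + w, max(best[u][1], p))
                if cand < best[v]:           # tuple order = tie-break on P
                    best[v] = cand
        return best[t]                       # (W(m), P(m))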
This is only a 2-approximation, not an approximation scheme, but perhaps it inspires someone to come up with a better answer.
Using binary search, find the minimum spiked cost θ* such that, letting C(θ) be the minimum nominal cost of an s-t path using edges with spiked cost ≤ θ, we have C(θ*) = θ*. Every solution has either nominal or spiked cost at least as large as θ*, hence θ* leads to a 2-approximate solution.
Each test in the binary search involves running Dijkstra on the subgraph of edges with spiked cost ≤ θ, hence this algorithm takes time O(|E| log² |E|); well, if you want to be technical about it and use Fibonacci heaps, O((|E| + |V| log |V|) log |E|).
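A sketch of this 2-approximation, using C(θ) <= θ as the (monotone) binary-search predicate; edges is a list of (u, v, w, p) tuples with nonnegative w, and all names are illustrative:

    import heapq

    def two_approx_cost(n, edges, s, t):
        def C(theta):                     # min nominal cost using p(e) <= theta
            adj = [[] for _ in range(n)]
            for u, v, w, p in edges:
                if p <= theta:
                    adj[u].append((v, w))
            dist = [float('inf')] * n
            dist[s] = 0
            pq = [(0, s)]
            while pq:                     # plain binary-heap Dijkstra
                d, u = heapq.heappop(pq)
                if d > dist[u]:
                    continue
                for v, w in adj[u]:
                    if d + w < dist[v]:
                        dist[v] = d + w
                        heapq.heappush(pq, (d + w, v))
            return dist[t]

        thetas = sorted({p for (_, _, _, p) in edges})
        lo, hi = 0, len(thetas) - 1
        while lo < hi:                    # O(log |E|) Dijkstra runs
            mid = (lo + hi) // 2
            if C(thetas[mid]) <= thetas[mid]:
                hi = mid
            else:
                lo = mid + 1
        theta = thetas[lo]                # assumes some feasible s-t path
        return C(theta) + theta           # upper bound on the path's cost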

algorithm to compute the largest subset of L in which every pair of segments intersects

I came across this question while preparing for the final exam, and I could not find the recursive formula, although I saw similar questions.
I would appreciate any help!
the problem is:
Suppose we are given a set L of n line segments in the plane, where the endpoints of each segment lie on the unit circle x² + y² = 1, and all 2n endpoints are distinct. Describe and analyze an algorithm to compute the largest subset of L in which every pair of segments intersects.
The solution needs to be an algorithm in dynamic programming approach (based on recursive formula)
I am assuming the question ("the largest subset of L...") is dealing with the subset size, and not that the subset cannot be extended. If the latter is true, the problem is trivial and the simple greedy algorithm works.
Now to your question. Following Matt Timmermans' hint (can you prove it?), this can be viewed as the longest common subsequence problem, except that we don't know what the two input strings are, i.e., where the splitting point between the two sequence occurrences is.
The longest common subsequence problem can be solved in O(m*n) time and linear memory. By moving the splitting point along your 2n-length array you create 2n instances of the LCS problem, each of which can be solved in O(n^2) time, yielding a total time complexity of O(n^3).
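A sketch under the hint's claim that chords pairwise cross exactly when, reading from a suitable point on the circle, their segment ids appear as i1..ik i1..ik; endpoints is the cyclic sequence of the 2n endpoint labels (segment ids), and the names are illustrative:

    def max_crossing_subset(endpoints):
        def lcs(a, b):                     # textbook O(|a|*|b|) LCS table
            dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
            for i, x in enumerate(a):
                for j, y in enumerate(b):
                    dp[i + 1][j + 1] = (dp[i][j] + 1 if x == y
                                        else max(dp[i][j + 1], dp[i + 1][j]))
            return dp[len(a)][len(b)]

        N = len(endpoints)                 # N = 2n endpoint labels
        # try every splitting point: 2n LCS instances, O(n^3) overall
        return max(lcs(endpoints[:s], endpoints[s:]) for s in range(1, N))

For example, max_crossing_subset([1, 2, 3, 1, 2, 3]) returns 3 (three mutually crossing chords), while [1, 1, 2, 2, 3, 3] (three disjoint chords) gives 1.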
Your problem is known as the maximum clique problem (with line segments corresponding to graph nodes, and line segments intersections corresponding to graph edges) of a circle graph and has been shown in 2010 to have a solution with O(n^2*log(n)) time complexity.
Please note that the maximum clique problem (the decision version) is NP-hard (NP-complete, to be exact) in the case of an arbitrary graph.

Is finding whether k different perfect matchings exist in a bipartite graph co-NP?

A few definitions first. A co-NP problem is a decision problem for which a "NO" answer can be verified in polynomial time. A perfect matching in a bipartite graph is a set of edges (pairs of nodes) in which every node of the graph occurs exactly once.
I am given an n x n bipartite graph, and I am trying to find out whether the problem of deciding if k different perfect matchings exist in the graph, where k = poly(n), is a co-NP problem.
Work done so far
To initially simplify the problem, I believe that if k=2, then this is a co-NP problem. I think this is true because the bipartite graph does not have 2 different perfect matchings if there does not exist an exchange of neighbors between 2 nodes. I define the exchange of neighbors as follows. Let G1 be the first side of the bipartition, and G2 the second. The exchange occurs when we have a subset of G1, S1={A,B}, and a subset of G2, S2={X,Y}, where {(A,X),(A,Y),(B,X),(B,Y)} is a subset of the edge set E. I call it an exchange because if A was initially matched with X, and B with Y, then when A gets paired with Y, and B with X, A and B have exchanged their neighbors. I believe that the only way to have 2 different perfect matchings is to have at least one such exchange.
Now, we can verify that no such exchange exists in polynomial time. This is true since enumerating all the possible subsets S1 and S2 has O(n^4) time complexity: we need (n choose 2) choices from G1 multiplied by (n choose 2) choices from G2, which gives an upper bound of n^4.
I am not sure if this is a co-NP problem, but it is certainly in NP. I think you have mixed up the definition of "verifying an answer" a little. In complexity theory, verifying an answer means that you provide a certificate proving that your answer is correct, and that certificate can be checked (verified) in polynomial time.
For example, in the case of your problem, a set of k different perfect matchings would be a good certificate; verifying it means checking that it is indeed a set of k distinct perfect matchings of your input graph. You can check this in polynomial time: check that all edges are in your graph, that in each matching no two edges share a vertex, and that all matchings are different. Since the number of edges in a matching is linear, verifying each matching can be done in polynomial time; since k is polynomial, we verify that property for all matchings in polynomial time as well. Finally, checking that all matchings are different can be done in k² times something polynomial in n, which is again polynomial. So yes, your problem can be verified in polynomial time, and thus it is in NP.
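A sketch of such a verifier, with illustrative names: vertices on both sides are assumed to be labeled 0..n-1, edge_set is a set of (left, right) pairs, and matchings is the claimed certificate:

    def verify_certificate(n, edge_set, matchings, k):
        seen = set()
        for m in matchings:
            m = frozenset(m)
            if m in seen or len(m) != n:   # distinct, and perfect => n edges
                return False
            seen.add(m)
            if {u for u, _ in m} != set(range(n)):
                return False               # every left vertex matched once
            if {v for _, v in m} != set(range(n)):
                return False               # every right vertex matched once
            if not m <= edge_set:
                return False               # only edges of the input graph
        return len(seen) >= k              # at least k distinct matchings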
Now, if you can find such a certificate in polynomial time, that is proof enough that your problem is in P, and all problems in P are in NP and in co-NP. So I see two possible ways to settle this: you may prove that your problem is in P, which yields a yes answer to your question, or you may prove that your problem is NP-complete, which yields a no, since NP-complete problems are not in co-NP unless NP = co-NP.
Any other way of proving that your problem is or is not in co-NP might be very difficult and confusing. In fact, the work you have done so far was moving towards proving that you can decide negative cases in polynomial time, which is a different thing from verifying them; that would prove the problem is in co-NP, but only because it would prove it is in P.

Finding subset of disjunctive intervals with maximal weights

I am looking for an algorithm I could use to solve this, not the code. I wondered about using linear programming with relaxation, but maybe there are more efficient ways for solving this?
The problem
I have a set of intervals with weights. Intervals can overlap. I need to find the maximal sum of weights over subsets of pairwise disjoint intervals.
Example
Intervals with weights :
|--3--| |---1-----|
|----2--| |----5----|
Answer: 8
I have an exact O(n log n) DP algorithm in mind. Since this is homework, here is a clue:
Sort the intervals by right edge position as Saeed suggests, then number them up from 1. Define f(i) to be the highest weight attainable by using only intervals that do not extend to the right of interval i's right edge.
EDIT: Clue 2: Calculate each f(i) in increasing order of i. Keep in mind that each interval will either be present or absent. To calculate the score for the "present" case, you'll need to hunt for the "rightmost" interval that is compatible with interval i, which will require a binary search through the solutions you've already computed.
That was a biggie, not sure I can give more clues without totally spelling it out ;)
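Spelling the clues out as code (a sketch with illustrative names; intervals are (start, end, weight) triples, and two intervals that merely touch at an endpoint count as disjoint):

    import bisect

    def max_weight_disjoint(intervals):
        intervals = sorted(intervals, key=lambda iv: iv[1])  # by right edge
        ends = [iv[1] for iv in intervals]
        f = [0] * (len(intervals) + 1)   # f[i]: best over first i intervals
        for i, (start, end, w) in enumerate(intervals):
            j = bisect.bisect_right(ends, start)  # rightmost compatible one
            f[i + 1] = max(f[i],                  # interval i absent
                           f[j] + w)              # interval i present
        return f[-1]

On the example above, with made-up coordinates matching the picture such as (0,6,3), (3,11,2), (8,17,1), (12,22,5), this returns 8.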
If there were no weights, it would be easy: use the greedy algorithm that sorts the intervals by their end times and in each step takes the compatible interval with the smallest end time.
In your case I think it's NPC (I should think more about it), but you can use a similar greedy heuristic: value each interval by weight/length, and each time take one of the possible intervals in that sorted order. You can also use simulated annealing: each time take the best interval by the above value with probability P (P close to 1), or select another interval with probability 1-P. You can do this in a loop n times to find a good answer.
Here's an idea:
Consider the following graph: create a node for each interval. If interval I1 and interval I2 do not overlap and I1 comes before I2, add a directed edge from node I1 to node I2. Note this graph is acyclic. Each node has a cost equal to the weight of the corresponding interval.
Now, the idea is to find the longest path in this graph, which can be found in polynomial time for acyclic graphs (using dynamic programming, for example). The problem is that the costs are on the nodes, not on the edges. Here is a trick: split each node v into v' and v''. All edges entering v now enter v', and all edges leaving v now leave v''. Then add an edge from v' to v'' with the node's cost, in this case the weight of the interval. All other edges get cost 0.
Well, if I'm not mistaken, the longest path in this graph will correspond to the set of disjoint intervals with maximum total weight.
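A sketch of this reduction, with illustrative names; node 2*i is v' and node 2*i+1 is v'' for interval i, and the longest path is computed by memoised DP over the DAG:

    from functools import lru_cache

    def best_by_longest_path(intervals):
        # intervals: list of (start, end, weight) triples
        n = len(intervals)
        adj = [[] for _ in range(2 * n)]
        for i, (s1, e1, w) in enumerate(intervals):
            adj[2 * i].append((2 * i + 1, w))        # v' -> v'' carries w
            for j, (s2, e2, _) in enumerate(intervals):
                if e1 <= s2:                         # i ends before j starts
                    adj[2 * i + 1].append((2 * j, 0))

        @lru_cache(maxsize=None)
        def longest(u):                              # longest path from u
            return max((c + longest(v) for v, c in adj[u]), default=0)

        return max((longest(2 * i) for i in range(n)), default=0)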
You could formulate this problem as a general IP (integer programming) problem with binary variables indicating whether an interval is selected or not. The objective function will then be a weighted linear combination of the variables. You would then need appropriate constraints to enforce disjunctiveness amongst the intervals...That should suffice given the homework tag.
Also, just because a problem can be formulated as an integer program (solving which is NP-Hard) it does not mean that the problem class itself is NP-Hard. So, as Ulrich points out there may be a polynomially-solvable formulation/algorithm such as formulating/solving the problem as a linear program.
Correct solution (end to end) is explained here: http://tkramesh.wordpress.com/2011/02/03/dynamic-programming-1-weighted-interval-scheduling/

"(1:k) Tree-Matching" - Solvable in polynomial time?

Some months ago there was a nice question regarding a "1:n matching problem" and there seems to be no poly-time algorithm.
I would like to add constraints to find a maximum matching for the 1:n matching problem with a polynomial algorithm. I would like to say: "For vertex A1 choose either {B1,B2,B5} or {B2,B3}, if those vertices are not already taken by another A-vertex", i.e. I would not allow all possible combinations.
This could be expressed if we introduce helper vertices H for each choice and substitute edges with trees => we get a problem similar to the ordinary bipartite matching. Every vertex of A or B can have only one edge in the matching. The edges to or from vertices in H are either all in the matching or none of them is present in the matching. Imagine the following tri-partite graph:
Now define h_ij = "the tree rooted at H_ij" to express the matching easily:
Then in the example, M={h12,h22} would be one 'maximum' matching, although not all vertices from B are involved.
The set {h12,h23} is not a matching, because then B3 would have been chosen twice.
Would this problem then be solvable in polynomial time? If yes, is there a polytime solution for the weighted (w(h_ij)) variant? If no, could you argue or even prove it for a "simple man" like me, or suggest other constraints that make the 1:n matching problem tractable?
E.g. could the graph be transformed into a general graph which could then be solved with weighted matching for general graphs? Or could branchings or even matching forests help here?
PS: not a homework ;-)
There is a difference between maximal and maximum. I have assumed you meant maximum for the below writeup.
You don't seem to have defined your problem very clearly, but if I have understood your intent correctly, it seems your problem is NP-complete (and 'equivalent' to Set Packing).
We can assume that the allowed set size is the same (k) for all A_i when looking for a [1:k] matching, as any other set size can be ignored. To find the maximum k, we just run the algorithm for [1:k] with k = 1, 2, 3, etc.
So your problem is (I think...):
Given m set families F_i = {S_1i, ..., S_n(i)i} (|F_i| = size of F_i = n(i), which need not equal |F_j|), each set of size k, you have to pick at most one set from each family (say S_i) such that
S_i and S_j are disjoint for any i neq j.
number of S_i's is maximum.
We can show that it is NP-Complete for k=3 in two steps:
The NP-complete problem Set Packing can be reduced to it. This shows that it is NP-hard.
Your problem is in NP and can be reduced to Set Packing. This and 1) imply your problem is NP-complete. It also lets you leverage any approximation/randomized algorithms that already exist for Set Packing.
Set Packing is the problem:
Given n sets S_1, S_2, ..., S_n, find the maximum number of pairwise disjoint sets among these.
This problem remains NP-Complete even if |S_1| = |S_2| = ... = |S_n| = 3 and is called the 3-Set packing problem.
We will use this to show that your problem is NP-Hard, by providing an easy reduction from 3-Set packing to your problem.
Given S_1, S_2, .., S_n just form the families
F_i = {S_i}.
Now if your problem had a polynomial time solution, then we get a set of Sets {S_1, S_2, ..., S_r} such that
S_i and S_j are disjoint
Number of S_i is maximum.
This easy reduction gives us a solution to the 3-set Packing problem and thus your problem is NP-Hard.
To see that this problem is in NP, we reduce it to Set-Packing as follows:
Given F_i = {S_1i, S_2i, ..., S_ni}
we consider the sets T_ji = S_ji U {i} (i.e. we add an id of the family into the set itself) and run them through the Set-Packing algorithm. I will leave it to you to see why a solution to Set-Packing gives a solution to your problem.
For a maximal solution, all you need is a greedy algorithm. Just keep picking up sets till you can pick no more. This would be polynomial time.
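A sketch of that greedy, with illustrative names (families is a list of lists of sets):

    def greedy_maximal(families):
        used = set()                       # elements taken so far
        chosen = []
        for fam in families:
            for s in fam:
                if used.isdisjoint(s):     # compatible with picks so far
                    chosen.append(s)
                    used |= s
                    break                  # at most one set per family
        return chosen                      # maximal, not necessarily maximum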
