"(1:k) Tree-Matching" - Solvable in polynomial time? - algorithm

Some months ago there was a nice question regarding a "1:n matching problem" and there seems to be no poly-time algorithm.
I would like to add constraints to find a maximum matching for the 1:n matching problem with a polynomial algorithm. I would like to say: "For vertex A1 choose either {B1,B2,B5} or {B2,B3} if the vertices are not already taken from another A-vertex" i.e. I would not allow all possible combinations.
This could be expressed if we introduce helper vertices H for each choice and substitute edges with trees => we get a problem similar to the ordinary bipartite matching. Every vertex of A or B can have only one edge in the matching. The edges to or from vertices in H are either all in the matching or none of them is present in the matching. Imagine the following tri-partite graph:
Now define h_ij="tree rooted that contains H_ij" to express the matching easily:
Then in the example M={h12,h22} would be one 'maximum' matching, although not all vertices from B are involved
The set {h12,h23} is not a matching because then B3 would have be choosen twice.
Would this problem then be solvable in polynomial time? If yes, is there a polytime solution for the weighted (w(h_ij)) variant? If no, could you argue or even proof it for a "simple-man" like me or suggest other constraints to solve the 1:n matching problem?
E.g. could the graph transformed to a general graph which then could be solved with the weighted matching for general graphs? Or could branchings or even matching forests help here?
PS: not a homework ;-)

There is a difference between maximal and maximum. I have assumed you meant maximum for the below writeup.
You don't seem to have defined your problem very clearly, but if I have understood your intent correctly, It seems like your problem is NP complete (and 'equivalent' to Set Packing).
We can assume that the allowed sets sizes is the same (k) for all A_i to find a [1:k] matching, as any other set size can be ignored. To find max k, we just run the algorithm for [1:k] for k = 1,2,3.. etc.
So your problem is (I think...):
Given m set families F_i = {S_1i, .., S_n(i)i} (|F_i| = size of F_i = n(i), need not be same as |F_j|), each set of size k, you have to find one set from each family (say S_i) such that
S_i and S_j are disjoint for any i neq j.
number of S_i's is maximum.
We can show that it is NP-Complete for k=3 in two steps:
The NP-Complete problem Set Packing can be reduced it. This shows that it is NP-Hard.
Your problem is in NP and can be reduced to Set Packing. This and 1) implies your problem is NP-Complete. It also helps you leverage any approximation/randomized algorithms already existing for Set-Packing.
Set Packing is the problem:
Given n sets S_1, S_2, ..., S_n, find the maximum number of pairwise disjoint sets among these.
This problem remains NP-Complete even if |S_1| = |S_2| = ... = |S_n| = 3 and is called the 3-Set packing problem.
We will use this to show that your problem is NP-Hard, by providing an easy reduction from 3-Set packing to your problem.
Given S_1, S_2, .., S_n just form the families
F_i = {S_i}.
Now if your problem had a polynomial time solution, then we get a set of Sets {S_1, S_2, ..., S_r} such that
S_i and S_j are disjoint
Number of S_i is maximum.
This easy reduction gives us a solution to the 3-set Packing problem and thus your problem is NP-Hard.
To see that this problem is in NP, we reduce it to Set-Packing as follows:
Given F_i = {S_1i, S_2i, ..., S_ni}
we consider the sets T_ji = S_ji U {i} (i.e. we add an id of the family into the set itself) and run them through the Set-Packing algorithm. I will leave it to you to see why a solution to Set-Packing gives a solution to your problem.
For a maximal solution, all you need is a greedy algorithm. Just keep picking up sets till you can pick no more. This would be polynomial time.

Related

Maximum weight independent set with divide and conquer

I was reading about the Maximum Weight Independent Set problem which is:
Input: An undirected graph G = (V, E)and a non-negative weight Wv for
each vertex v ∈ V
Output: An independent set S ∈ V of G with the maximum-possible sum
∑Vw of vertex weights
and that same source (not the SO post) mentions that the problem can be solved by 4 recursive calls with a divide & conquer approach.
I googled but couldn't find such an algorithm. Does anyone have an idea how would this be solved by divide & conquer? I do understand that the running time is much worse than the Dynamic Programming I am just curious on the approach
I understand the part in the manuscript in such a way that only line graphs are taken into consideration. In that case, I believe the footnote to mean the following.
If a larger line graph is taken as input and split (say at an edge which in incident with the nodes a and b), there are four cases to consider to have a proper combination step.
You would have to solve the "left" branch by solving for the cases
a is included in the maximum independent set
a is not included in the maximum independent set
and the same goes for the "right" branch. In total there are four possible ways to combine the results, out of which a maximal one is to be returned.

Proof of np-completeness

Show that the following problem is NP-complete.
The tv problem is to select tv shows for a weekly tv night so that
everyone in a group of people sees something that they like. You are
given a list of people (P1, . . . , Pn) in the group and a list of
possible shows (S1, . . . , Sk). For each show Si, there is a subset
of the people who would like that show choice. You also get w, the
number of weeks for which you can select shows. The question is
whether there are these many movies so that every person likes at
least one of them.
I can't figure out which np problem can be reduced to this and how to establish the certificate.
You can model this as the Set cover problem. You have elements {P1, ..., Pn}, and k subsets of these, T1, ..., Tk, defined as Ti = {Pj : Pj likes Si}. You then want to find the smallest collection of subsets such that their union is the whole set of people. Deciding whether the number of necessary subsets is less than or equal to a number is NP-complete. Finding the actual optimal collection of subsets is NP-hard.
As Matt commented above, your problem is a set cover problem. To prove it is NP complete we must show that it is in NP, and that a known NP-complete problem can be reduced to yours. As suggested I will use Vertex Cover as our known NP-complete problem.
NP Proof
For this we need to word the problem as a decision problem and supply a certificate that can be verified in P time. Our decision problem will be can we satisfy all people using at most k shows. The certificate will be a subset of shows (we will call this subset X). To verify this certificate we need to verify:
1) X is a subset of S
This is simply done by iterating through X and verifying each item appears in S. This can be accomplished in linear time.
2) |X|<= k
This also can be solved in linear time by iterating through X incrementing a count value and comparing it to k.
3) All people are satisfied
This can be accomplished by iterating through P, and checking to see if each person is accounted for by a selection in X. In worst case where most shows cater to many people, this can be accomplished in O(P^2) time.
Since all these steps take polynomial time, the problem is in NP.
NP Complete Proof by Vertex Cover Problem Reduction
Vertex cover is a problem that concerns finding the minimum subset of vertices such that every edge has an endpoint in that subset of vertices. The input to this problem is a graph G(V,E) and k (the number of vertices). To reduce this problem to the instance of set cover you have stated above, let k = the min number of shows required to satisfy everyone, P = E, and Sn = the set of edges incident to n.
This transformation can be easily accomplished in polynomial time, as the most expensive is the last transformation (Sn = the set of edges incident to n) which takes O(V*E) time.
Now, if G has a vertex cover G' of size k, and then in our problem X is a collection of subsets representing vertices in G. This implies |X| = k. Going further, X is a set cover of P as each edge (u,v) has at least u or v in G' (as it is a vertex cover) meaning u or v are in X in our set cover problem.
What this all means is that if you represent a vertex cover problem as your problem, finding a solution also solves the transformed vertex cover problem, as each subset represents a vertex in G, and since each person is accounted for, every edge in the vertex cover problem is also accounted for

Complete Weighted Graph and Hamiltonian Tour

I ran into a question on a midterm exam. Can anyone clarify the answer?
Problem A: Given a Complete Weighted Graph G, find a Hamiltonian Tour with minimum weight.
Problem B: Given a Complete Weighted Graph G and Real Number R, does G have a Hamiltonian Tour with weight at most R?
Suppose there is a machine that solves B. How many times can we call B (each time G and Real number R are given),to solve problem A with that machine? Suppose the sum of Edges in G up to M.
1) We cannot do this, because there is uncountable state.
2) O(|E|) times
3) O(lg m) times
4) because A is NP-Hard, This is cannot be done.
First algorithm
The answer is (3) O(lg m) times. You just have to perform a binary search for the minimum hamiltonian tour in the weighted graph. Notice that if there is a hamiltonian tour of length L in the graph, there is no point in checking if a hamiltonian tour of length L', where L' > L, exists, since you are interested in the minimum-weight hamiltonian tour. So, in each step of your algorithm you can eliminate half of the remaining possible tour-weights. Consequently, you will have to call B in your machine O(lg m) times, where m stands for the total weight of all edges in the complete graph.
Edit:
Second algorithm
I have a slight modification of the above algorithm, which uses the machine O(|E|) times, since some people said that we cannot apply binary search in an uncountable set of possible values (and they are possibly right): Take every possible subset of edges from the graph, and for each subset store a value that is the sum of weights of all edges from the subset. Lets store the values for all the subsets in an array called Val. The size of this array is 2^|E|. Sort Val in increasing order, and then apply binary search for the minimum hamiltonian path, but this time call the machine that solves problem B only with values from the Val array. Since every subset of edges is included in the sorted array, it is guaranteed that the solution will be found. The total number of calls of the machine is O(lg(2^|E|)), which is O(|E|). So, the correct choice is (2) O(|E|) times.
Note:
The first algorithm I proposed is probably incorrect, as some people noted that you cannot apply binary search in an uncountable set. Since we are talking about real numbers, we cannot apply binary search in the range [0-M]
I believe that choice that was meant to be the answer is 1- you can't do that.
The reason is that you can only do binary search on countable sets.
Note that the edges of the graph may even have negative weights, and besides, they may have fractional, or even irrational weights. In that case, the search space for the answer will be the set of all real values less than m.
However,you may get arbitrarily close to the answer of A in Log(n) time, but you cannot find the exact answer. (n being the size of the countable space).
Supposing that in the encoding of graphs the weights are encoded as binary strings representing nonnegative integers and that Problem B can actually algorithmically be solved by entering a real number and perform calculations based on that, things are apparently as follows.
It is possible to do first binary search over the integral interval {0,...,M} to obtain the minumum weight of a Hamiltonian tour in O(log M) calls to the algorithm for Problem B. As afterwards the optimum is known, we can eliminate single edges in G and use the resulting graph as an input to the algorithm for Problem B to test whether or not the optimum changes. This process uses O(|E|) calls to the algorithm for Problem B to identify edges which occur in an optimal Hamiltonian tour. The overall running time of this approach is O( (|E| + log M ) * r(G)), where r(G) denotes the running time of the algorithm for Problem B taking a graph G as an input. I suppose that r is a polynomial, although the question does not explicitly state this; in total, the overall running time would be polynomially bounded in the encoding length of the input, as M can be computed in polynomial time (and hence is pseudopolynomially bounded in the encoding length of the input G).
That being said, the supposed answers can be commented as follows.
The answer is wrong, as the set of necessary states are finite.
Might be true, but does not follow from the algorithm discussed above.
Might be true, but does not follow from the algorithm discussed above.
The answer is wrong. Strictly speaking, the NP-hardness of Problem A does not rule out a polynomial time algorithm; furthermore, the algorithm for Problem B is not stated to be polynomial, so even P=NP does not follow if Problem A can be solved by a polynomial number of calls to the algorithm for Problem B (which is the case by the algorithm sketched above).

Algorithm to divide a set of symbols with constraints into minimum number of subsets

I have a set S={a,c,d,e,f,j,m,q,s,t} with a constraint C={am,cm,de,df,dm,ds,ef,em,eq,es,et,fj,fm,fs,jm,js}. xy in C means that x and y cannot be in the same subset. I would like an algorithm to split set S into subsets Sj such that:
1.The number of Sj is minimized
2.The difference between size of each subset is as large as possible
For example in this case, both {{q,a,c,d,j,t},{m,s},{f},{e}} and {{a,c,e,j},{m,s,q,t},{d},{f}} are satisfying 1, but the first is optimal.
Coming from a computer science background, I wonder whether Mathematicians have devised an algorithm for this problem.
As I understand, your task can be rewritten as: find the largest independent subset of vertices S' of graph G=(S, C); repeat the step for graph G'=G\S'.
It's well-known (also pointed by #tobias_k in his comment) that largest independent set of the graph is NP-hard problem (as it's equivalent to the famous clique-problem).
I think this is very hard problem, and that is why. For finding minimum number of subsets, you must solve problem about minimum chromatic number of graph. This problem is generally solved by brute force.

System of nondistinct representatives efficiently solvable?

We have sets S1, S2, ..., Sn. These sets do not have to be disjoint. Our task is to select a representative member for each set, such that the total number of elements selected is as small as possible. One element may be present in more than one set, and can represent all the sets it is included in. Is there an algorithm to solve this efficiently?
It is easier to answer this question after restatement: let the original sets S1, S2, ..., Sn be elements of the universe and let the original set members be sets themselves: T1, T2, ..., Tm (where Ti contains elements {Sj} which are the original sets containing corresponding member).
Now we have to cover universe S1, S2, ..., Sn with sets T1, T2, ..., Tm. Which is exactly Set cover problem. It is a well known NP-hard problem, so there is no algorithm to solve it efficiently (unless P=NP, as theorists usually say). As you can see from Wikipedia page, there is a greedy approximation algorithm; it is efficient, but approximation ratio is not very good.
I'm assuming that by "efficiently," you mean in polynomial time.
Evgeny Kluev is correct, the problem is NP-hard. The decision version of it is known as the hitting set problem and was shown to be what we now call NP-complete soon after the introduction of that concept. While it's true that Evgeny's reduction is from the hitting set problem to the set cover problem, it's not hard to see an explicit inverse reduction.
Given a set C={C1,C2,...Cm} whose union is U={u1,u2,...,un}, we want to find a minimal-cardinality subset C' whose union is also equal to U. Define Si in your initial problem as {Cj in C | ui is an element of Cj}. The minimum hitting set of S={S1,S2,...,Sn} is then equal to our desired C'.
Not to steal Evgeny's glory, but here's a rather straightforward way of showing perhaps more rigorously that the general case of the poster's problem is NP-hard.
Consider the minimum vertex cover problem of finding a minimum set X from vertices V in an simple graph (V,E) where every edge in E is adjacent to at least one vertex in X.
An edge can be represented by an unordered two-element set {va, vb} where va and vb are distinct elements in V. Note that an edge e represented as {va, vb} is adjacent to vc if and only if vc is an element of {va, vb}.
Hence, the minimal vertex cover problem is the same as finding a minimum size subset X of V where each edge set {va, vb} defined by an edge in E contains an element that is in X.
If one has an algorithm to efficiently solve the original stated problem, then one has an algorithm to efficiently solve the above problem, and therefore one can solve the minimal vertex cover problem efficiently as well.
A couple of algorithms to look at are Simulated Annealing and Genetic Algorithms, if you can live with a solution close to optimal (they might get you the optimal solution, but not necessarily). Simulated Annealing can be made to work in production electronic CAD autoplacement (as I was part of the development team for Wintek's autoplacement program).

Resources