Algorithm for minimum vertex cover in Bipartite graph - algorithm

I am trying to figure out an algorithm for finding minimum vertex cover of a bipartite graph.
I was thinking about a solution, that reduces the problem to maximum matching in bipartite graph. It's known that it can be found using max flow in networ created from the bip. graph.
Max matching M should determine min. vertex cover C, but I can't cope with choosing the vertices to set C.
Let's say bip. graph has parts X, Y and vertices that are endpoints of max matching edges are in set A, those who are not belong to B.
I would say I should choose one vertex for an edge in M to C.
Specifically the endpoint of edge e in M that is connected to vertex in set B, else if it is connected only to vertices in A it does not matter.
This idea unfortunately doesn't work generally as there can be counterexamples found to my algorithm, since vertices in A can be also connected by other edges than those who are included in M.
Any help would be appriciated.

Kőnig's theorem proof does exactly that - building a minimum vertex cover from a maximum matching in a bipartite graph.
Let's say you have G = (V, E) a bipartite graph, separated between X and Y.
As you said, first you have to find a maximum matching (which can be achieved with Dinic's algorithm for instance). Let's call M this maximum matching.
Then to construct your minimum vertex cover:
Find U the set (possibly empty) of unmatched vertices in X1, ie. not connected to any edge in M
Build Z the set or vertices either in U, or connected to U by alternating paths (paths that alternate between edges of M and edges not in M)
Then K = (X \ Z) U (Y ∩ Z) is your minimum vertex cover
The Wikipedia article has details about how you can prove K is indeed a minimum vertex cover.
1 Or Y, both are symmetrical

Related

Double-matching in a bipartite graph

I've encountered the following problem studying for my Algorithm test, with no answer published to it:
Maximum double matching problem- given a bipartite graph G=(V=(LUR),E) describe an algorithm that returns a group of edges M in E s.t for each vertex v in V there are at most 2 edges in M that include v, of a maximum size.
Definition: a "Strong double matching" is a double matching s.t for each vertice v in V there is at least one edge in M that includes v. Given a bipartite graph G=(V=(LUR),E) and strong double matching M, describe an algorithm that returns a strong double matching M' of maximum size. Prove your answer.
so I've already managed to solve
1) using reduction to max-flow: adding vertices's s and t and edges from s to L and edges from R to t each with the capacity of 2, and defining the capacity of each edge between L and R with the infinite capacity. Finding a max flow using Dinic's algorithm and returning all edges with positive flow between L and R.
about 2) i thought about somehow manipulating the network so that there is positive flow from each vertex then using the algorithm from a somehow to construct a maximum solution. Any thoughts? The runtime restriction is O(V^2E) (Dinics runtime)
Here is a solution in O(n^3) using minimum cost flow.
Recall how we make a network for a standard bipartite matching.
For each vertex u from L, add a unit-capacity edge from S to u;
For each edge u-v, where u is from L and v is from R, add an edge from u to v. Note that its capacity does not matter as long as it is at least one;
For each vertex v from R, add a unit-capacity edge from u to R.
Now we keep the central part the same and change left and right parts a bit.
For each vertex u from L, add two unit-capacity edges from S to u, one of them of having cost -1 and another having cost 0;
Same for edges v->S.
Ignoring cost, this is the same network you built yourself. The maximum flow here corresponds to the maximum double-matching.
Now let's find the minimum cost flow of size k. It corresponds to some double-matching, and of those it corresponds to the matching that touches the maximum possible number of vertices, because touching a vertex (that is, pushing at least unit flow through it) decreases the cost by 1. Moreover, touching the vertex for the second time doesn't decrease the cost because the second edge has cost 0.
How we have the solution: for each k = 1, ..., 2n iteratively find the min-cost flow and take the value which corresponds to the minimum cost.
Using Johnson's algorithm (also called Dijkstra's with potentials) gives O(n^2) per iteration, which is O(n^3) overall.
P.S. The runtime of Dinic's algorithm on unit graphs is better, reaching O(E sqrt(V)) on bipartite graphs.

Minimal Vertex Cover using Meet-in-the-Middle

I was studying Meet-in-the-Middle algorithm and found the following exercise:
Given a graph of n nodes (n <= 30), find out a set with the smallest number of vertices such that each edge in the graph has at least one node inside the set.
I have no idea how to do that, only hint I got was
complexity O(3^(n/2))
can you explain the idea?
Take out an edge (u1, v1) from the graph, remove all edges that share a vertex with it. Take out another one (u2, v2), ... continue until the rest of the graph has no edges.
You end up with a number of pairs of vertices
(u1, v1), (u2, v2), ..., (uk, vk)
The rest of the vertices are:
w1, w2, ..., wm
Call the first set of vertices paired vertices, and the second set unpaired vertices. Note, 2k + m = n, there are no edges between unpaired vertices in the original graph.
A vertex cover must have either u1, v1, or both in it. There are 3 choices for each pair (uj, vj). Consider all 3^k ways to include the paired vertices to the vertex cover.
For each of those configurations, an unpaired vertex wi is to be included into the cover if and only if at least one of its neighbors is not in the cover (note that each of wi's neighbors are paired vertices so whether they are included is known).
For each of the 3^k selections of paired vertices, include the unpaired vertices according to the above criteria, then verify each edge between paired vertices has an incident vertex from the cover, if so, it is a candidate cover set. Take one of the candidate cover sets of minimum size as output.
Overall complexity of the above algorithm is O(3^(n/2)E) where E is the number of edges in the graph.

Algorithm to Maximize Degree Centrality of Subgraph

Say I have some graph with nodes and undirected edges (the edges may have a weight associated to them).
I want to find all (or at least one) connected subgraphs that maximize the sum of the degree centrality of all nodes in the subgraph (the degree centrality is based on the original graph) under the constraint that the sum of the weighted edges is < X.
Is there an algorithm that will do this?
A quick search took me to this description of degree centrality. It turns out that the "degree centrality" of a vertex is simply its degree (neighbour count).
Unfortunately your problem is NP-hard, so it's very unlikely that any algorithm exists that can solve every instance quickly. First notice that, assuming edge weights are positive, the edges in any optimal solution necessarily form a tree, since in any non-tree you can delete at least 1 edge without destroying connectivity, and doing so will decrease the total edge weight of the subgraph. (So, as a positive spinoff: If you compute the minimum spanning tree of your input graph and find that it happens to have total weight < X, then you can simply include every vertex in the graph in your solution.)
Let's formulate a decision version of your problem. Given a graph G = (V, E) with positive (I'll assume) weights on the edges, a number X and a number Y, we want to know: Does there exist a connected subgraph G' = (V', E') of G such that the sum of the edge weights in E' is at most X, and the sum of the degrees of V' (w.r.t. G) is at least Y? (Clearly this is no harder than your original problem: If you had an algorithm to solve your problem, then you could just run it, add up the degrees of the vertices in the subgraph it found and compare this to Y to answer "my" problem.)
Here's a reduction from the NP-hard Steiner Tree in Graphs problem, where we are given a graph G = (V, E) with positive weights on the edges, a subset S of its vertices, and a number k, and the task is to determine whether it's possible to connect the vertices in S using a subset of edges with total weight at most k. (As I showed above, the solution will necessarily be a tree.) If the sum of all degrees in G is d, then all we need to do to transform G into an input for your problem is the following: For each vertex s_i in S we add enough new "ballast" vertices that are each connected only to s_i, via edges with weight X+1, to bring the degree of s_i up to d+1. We set X to k, and set Y to |S|(d+1).
Now suppose that the solution to the Steiner Tree problem is YES -- that is, there exists a subset of edges having total weight <= k that does connect all the vertices in S. In that case, it's clear that the same subgraph in the instance of your problem constructed above connects (possibly among others) all the vertices in S, and since each vertex in S has degree d+1, the total degree is at least |S|(d+1), so the answer to your decision problem is also YES.
In the other direction, suppose that the answer to your decision problem is YES -- that is, there exists a subset of edges having total weight <= X ( = k) that connects a set of vertices having total degree at least |S|(d+1). We need to show that this implies a YES answer to the original Steiner Tree problem. Clearly it suffices to show that the vertex set V' of any subgraph satisfying the conditions above (i.e. edges have total weight <= k and vertices have total degree >= |S|(d+1)) contains S (possibly among other vertices). So let V' be the vertex set of such a solution, and suppose to the contrary that there is some vertex u in S that is not in V'. But then the largest sum of degrees that we could possibly make would be to include all other non-ballast vertices in the graph in V', which would give a degree total of at most (|S|-1)(d+1) + d (the first term is the degree sum for the other vertices in S; the second is an upper bound on the degree sum of all non-S vertices in G; note that none of the ballast vertices we added in could be in the subgraph, because the only way to include any of them is to use an edge of weight X+1, which we obviously can't do). But clearly (|S|-1)(d+1) + d = |S|(d+1) - 1, which is strictly less than |S|(d+1), contradicting our assumption that V' has a degree total at least |S|(d+1). So it follows that S is a subset of V', and thus that it is possible to use the same subset of edges to connect the vertices in S for a total weight of at most k, i.e. that the answer to the Steiner Tree problem is also YES.
So a YES answer to either problem implies a YES answer to the other one, in turn implying that a NO answer to either implies a NO answer to the other. Thus if it were possible to solve the decision version of your problem in polynomial time, it would imply a polynomial-time solution to the NP-hard Steiner Tree in Graphs problem. This means the decision version of your problem is itself NP-hard, and so is the optimisation version (which as I said above is at least as hard). (The decision form is also NP-complete, since a YES answer can be easily verified in polynomial time.)
Sidenote: At first I thought I had a very straightforward reduction from the NP-hard Knapsack problem: Given a list of n weights w_1, ..., w_n and a list of n profits p_1, ..., p_n, make a single central vertex c, and n other vertices v_1, ..., v_n. For each v_i, attach it to c with an edge of weight w_i, and add p_i other leaf vertices, each attached only to v_i with an edge of weight X+1. However this reduction doesn't actually work, because the profits can be exponential in the input size n, meaning that the constructed instance of your problem might need to have an exponential number of vertices, which isn't allowed for a polynomial-time reduction.

Minimum sum of distances from sensor nodes to all others

Is there a way to compute (accurate or hevristics) this problem on medium sized (up to 1000 nodes) weighted graph?
Place n (for example 5) sensors in nodes of the graph in such way that the sum of distances from every other node to the closest sensor will be minimal.
I'll show that this problem is NP-hard by reduction from Vertex Cover. This applies even if the graph is unweighted (you don't say whether it's weighted or not).
Given an unweighted graph G = (V, E) and an integer k, the question asked by Vertex Cover is "Does there exist a set of at most k vertices such that every edge has at least one endpoint in this set?" We will build a new graph G' = (V', E), which is the same as G except that all isolated vertices have been discarded, solve your problem on G', and then use it to answer the original question about Vertex Cover.
Suppose there does exist such a set S of k vertices. If we consider this set S to be the locations to put sensors in your problem, then every vertex in S has a distance of 0, and every other vertex is at a distance of exactly 1 away from a vertex that is in S (because if there was some vertex u for which this wasn't true, it would mean that none of u's neighbours are in S, so for each such neighbour u, the edge uv is not covered by the vertex cover, which would be a contradiction.)
This type of problem is called graph clustering. One of the popular methods to solve it is the Markov Cluster (MCL) Algorithm. A web search should provide some implementation examples. However it does not generally provide the optimal solution.

Directed maximum weighted bipartite matching allowing sharing of start/end vertices

Let G (U u V, E) be a weighted directed bipartite graph (i.e. U and V are the two sets of nodes of the bipartite graph and E contains directed weighted edges from U to V or from V to U). Here is an example:
In this case:
U = {A,B,C}
V = {D,E,F}
E = {(A->E,7), (B->D,1), (C->E,3), (F->A,9)}
Definition: DirectionalMatching (I made up this term just to make things clearer): set of directed edges that may share the start or end vertices. That is, if U->V and U'->V' both belong to a DirectionalMatching then V /= U' and V' /= U but it may be that U = U' or V = V'.
My question: How to efficiently find a DirectionalMatching, as defined above, for a bipartite directional weighted graph which maximizes the sum of the weights of its edges?
By efficiently, I mean polynomial complexity or faster, I already know how to implement a naive brute force approach.
In the example above the maximum weighted DirectionalMatching is: {F->A,C->E,B->D}, with a value of 13.
Formally demonstrating the equivalence of this problem to any other well known problem in graph theory would also be valuable.
Thanks!
Note 1: This question is based on Maximum weighted bipartite matching _with_ directed edges but with the extra relaxation that it is allowed for edges in the matching to share the origin or destination. Since that relaxation makes a big difference, I created an independent question.
Note 2: This is a maximum weight matching. Cardinality (how many edges are present) and the number of vertices covered by the matching is irrelevant for a correct result. Only the maximum weight matters.
Note 2: During my research to solve the problem I found this paper, I think it would be helpful to others trying to find a solution: Alternating cycles and paths in edge-coloured
multigraphs: a survey
Note 3: In case it helps, you can also think of the graph as its equivalent 2-edge coloured undirected bipartite multigraph. The problem formulation would then turn into: Find the set of edges without colour-alternating paths or cycles which has maximum weight sum.
Note 4: I suspect that the problem might be NP-hard, but I am not that experienced with reductions so I haven't managed to prove it yet.
Yet another example:
Imagine you had
4 vertices: {u1, u2} {v1, v2}
4 edges: {u1->v1, u1->v2, u2->v1, v2->u2}
Then, regardless of their weights, u1->v2 and v2->u2 cannot be in the same DirectionalMatching, neither can v2->u2 and u2->v1. However u1->v1 and u1->v2 can, and so can u1->v1 and u2->v1.
Define a new undirected graph G' from G as follows.
G' has a node (A, B) with weight w for each directed edge (A, B) with weight w in G
G' has undirected edge ((A, B),(B, C)) if (A, B) and (B, C) are both directed edges in G
http://en.wikipedia.org/wiki/Line_graph#Line_digraphs
Now find a maximal (weighted) independent vertex set in G'.
http://en.wikipedia.org/wiki/Vertex_independent_set
Edit: stuff after this point only works if all of the edge weights are the same - when the edge weights have different values its a more difficult problem (google "maximum weight independent vertex set" for possible algorithms)
Typically this would be an NP-hard problem. However, G' is a bipartite graph -- it contains only even cycles. Finding the maximal (weighted) independent vertex set in a bipartite graph is not NP-hard.
The algorithm you will run on G' is as follows.
Find the connected components of G', say H_1, H_2, ..., H_k
For each H_i do a 2-coloring (say red and blue) of the nodes. The cookbook approach here is to do a depth-first search on H_i alternating colors. A simple approach would be to color each vertex in H_i based on whether the corresponding edge in G goes from U to V (red) or from V to U (blue).
The two options for which nodes to select from H_i are either all the red nodes or all the blue nodes. Choose the colored node set with higher weight. For example, the red node set has weight equal to H_i.nodes.where(node => node.color == red).sum(node => node.w). Call the higher-weight node set N_i.
Your maximal weighted independent vertex set is now union(N_1, N_2, ..., N_k).
Since each vertex in G' corresponds to one of the directed edges in G, you have your maximal DirectionalMatching.
This problem can be solved in polynomial time using the Hungarian Algorithm. The "proof" by Vor above is wrong.
The method of structuring the problem for the above example is as follows:
D E F
A # 7 9
B 1 # #
C # 3 #
where "#" means negative infinity. You then resolve the matrix using the Hungarian algorithm to determine the maximum matching. You can multiply the numbers by -1 if you want to find a minimum matching.

Resources