Destroy weighted edges in a bidirected graph - algorithm

On my recent interview I was asked the following question:
There is a bidirectional graph G with no cycles. Each edge has some weight. Also there is a set of nodes K which should be disconnected from each other (by removing some edges). There is only one path between any two nodes in K set. The goal is to minimize total weight of removed edges and disconnect all nodes (from set K) from each other.
My approach was to run BFS for each node from K set and determine all paths between all pairs of nodes from K. So then I'll have set of paths (each path is a set of edges).
My next step is to apply dynamic programming approach to find minimum total weight of removed edges. And here I stuck. Could you please help me (just direct me) of how DP should be done.
Thanks.

This sounds like the Multiway Cut problem in trees, assuming a "bidirectional" graph is just like an undirected one. It can be solved in polynomial time by a straightforward dynamic programming. See Chopra and Rao: "On the multiway cut polyhedron", Networks 21(1):51–89, 1991. See also this question.

This problem can be solved using disjoint Sets. In this problem you need to make set of each vertex like forest. Then, join the vertices which belong to two different sets through the edges which are in the graph such that -
1) if one of the set contains a node which is mentioned in the k set of nodes, then the node becomes the representative element.
2) if both the sets contain the unwanted node (node present in the k set of nodes), then find the minimum weight edge in both the sets and compare them, then compare the minimum one of them with the edge to be joined and find the minimum. Then delete the edge which we found so that no path exists b/w them.
3) if none of the set contain the unwanted node, then join one set to the other set.
In this way you find the minimum total weight of the destroyed edges in a very good time complexity of the order of O(nlogn).
Your approach will also work but it will prove to be costly in terms of time complexity.
Here is the complete code - (GameAlgorithm.cpp)
https://github.com/KARTHEEKCIC/RoboAttack

Related

How to check if given set of nodes is the vertex cut set of the graph?

I am looking for efficient algorithm to discover whether removing a set of nodes in graph would split graph into multiple components.
Formally, given undirected graph G = (V,E) and nonempty set od vertices W ⊆ V, return true iff W is vertex cut set. There are no edge weights in graph.
What occurs to my mind until now is using disjoint set:
The disjoint set is initialized with all neigbors of nodes in W where every set contains one such neighbor.
During breadth-first traversal of all nodes of V \ W, exactly one of following cases holds for every newly explored node X:
X is already in same set as its predecessor
X is absent in disjoint set ⇒ is added into same set as its predecessor (both cases mean the connected component is being further explored)
X is in different set ⇒ merge disjoint sets (two components so far appeared as disconnected but turned out to be connected)
Whenever the disjoint set contains only single set (even before traversal is finished), result is false.
If the disjoint set contains 2 or more sets when traversal is finished, result is true.
Time complexity is O(|V|+|E|) (assuming time complexity of disjoint set is O(1) instead of more precise inverse Ackermann function).
Do you know better solution (or see any flaw in proposed one)?
Note: Since it occurs very often in Google search results, I would like to explicitly state I'm not looking for algorithm to find so-far-unknown vertex cut set, let alone optimal. The vertex set is given, the task is just to say yes or no.
Note 2: Also, I don't search for edge cut set validation (I'm aware of Find cut set in a graph but can't think up similar solution for vertices).
Thanks!
UPDATE: I figured out that in case of true result I will also need data of nodes in disconnected components in hierarchical structure depending on distance from removed nodes. Hence the choice of BFS. I apologize for post-edit.
The practical case behind is an outage in telecommunication network. When some node breaks so the whole network get disconnected, one component (containing the node with connectivity to higher level network) is still ok, every other components need to be reported.
Have a unordered_set<int> r to store the set of vertices you want to remove.
Run a DFS normally, but only go to adjacents who are not in r. Always you visit an unvisited node, add 1 to the number of nodes visited.
If in the end, the number of visited nodes is smaller than |V| - r, this set divides the graph.
With this approach, you don't need to make changes in the graph, just ignore nodes that are in r, which you can check in O(1) using an unordered_set<int>.
The complexity is the same than a normal DFS.
Can’t you get O(|V|) complexity by performing a depth first search on the graph? Eliminate the set W from V and perform DFS. Record the number of nodes processed and stop when you cannot reach any more nodes. If the number of nodes processed is less than |V| - |W|, then W is a cut set.

Maximal number of vertex pairs in undirected not weighted graph

Given undirected not weighted graph with any type of connectivity, i.e. it can contain from 1 to several components with or without single nodes, each node can have 0 to many connections, cycles are allowed (but no loops from node to itself).
I need to find the maximal amount of vertex pairs assuming that each vertex can be used only once, ex. if graph has nodes 1,2,3 and node 3 is connected to nodes 1 and 2, the answer is one (1-3 or 2-3).
I am thinking about the following approach:
Remove all single nodes.
Find the edge connected a node with minimal number of edges to node with maximal number of edges (if there are several - take any of them), count and remove this pair of nodes from graph.
Repeat step 2 while graph has connected nodes.
My questions are:
Does it provide maximal number of pairs for any case? I am
worrying about some extremes, like cycles connected with some
single or several paths, etc.
Is there any faster and correct algorithm?
I can use java or python, but pseudocode or just algo description is perfectly fine.
Your approach is not guaranteed to provide the maximum number of vertex pairs even in the case of a cycle-free graph. For example, in the following graph your approach is going to select the edge (B,C). After that unfortunate choice, there are no more vertex pairs to choose from, and therefore you'll end up with a solution of size 1. Clearly, the optimal solution contains two vertex pairs, and hence your approach is not optimal.
The problem you're trying to solve is the Maximum Matching Problem (not to be confused with the Maximal Matching Problem which is trivial to solve):
Find the largest subset of edges S such that no vertex is incident to more than one edge in S.
The Blossom Algorithm solves this problem in O(EV^2).
The way the algorithm works is not straightforward and it introduces nontrivial notions (like a contracted matching, forest expansions and blossoms) to establish the optimal matching. If you just want to use the algorithm without fully understanding its intricacies you can find ready-to-use implementations of it online (such as this Python implementation).

Will a standard Kruskal-like approach for MST work if some edges are fixed?

The problem: you need to find the minimum spanning tree of a graph (i.e. a set S of edges in said graph such that the edges in S together with the respective vertices form a tree; additionally, from all such sets, the sum of the cost of all edges in S has to be minimal). But there's a catch. You are given an initial set of fixed edges K such that K must be included in S.
In other words, find some MST of a graph with a starting set of fixed edges included.
My approach: standard Kruskal's algorithm but before anything else join all vertices as pointed by the set of fixed edges. That is, if K = {1,2}, {4,5} I apply Kruskal's algorithm but instead of having each node in its own individual set initially, instead nodes 1 and 2 are in the same set and nodes 4 and 5 are in the same set.
The question: does this work? Is there a proof that this always yields the correct result? If not, could anyone provide a counter-example?
P.S. the problem only inquires finding ONE MST. Not interested in all of them.
Yes, it will work as long as your initial set of edges doesn't form a cycle.
Keep in mind that the resulting tree might not be minimal in weight since the edges you fixed might not be part of any MST in the graph. But you will get the lightest spanning tree which satisfies the constraint that those fixed edges are part of the tree.
How to implement it:
To implement this, you can simply change the edge-weights of the edges you need to fix. Just pick the lowest appearing edge-weight in your graph, say min_w, subtract 1 from it and assign this new weight,i.e. (min_w-1) to the edges you need to fix. Then run Kruskal on this graph.
Why it works:
Clearly Kruskal will pick all the edges you need (since these are the lightest now) before picking any other edge in the graph. When Kruskal finishes the resulting set of edges is an MST in G' (the graph where you changed some weights). Note that since you only changed the values of your fixed set of edges, the algorithm would never have made a different choice on the other edges (the ones which aren't part of your fixed set). If you think of the edges Kruskal considers, as a sorted list of edges, then changing the values of the edges you need to fix moves these edges to the front of the list, but it doesn't change the order of the other edges in the list with respect to each other.
Note: As you may notice, giving the lightest weight to your edges is basically the same thing as you suggest. But I think it is a bit easier to reason about why it works. Go with whatever you prefer.
I wouldn't recommend Prim, since this algorithm expands the spanning tree gradually from the current connected component (in the beginning one usually starts with a single node). The case where you join larger components (because your fixed edges might not all be in a single component), would be needed to handled separately - it might not be hard, but you would have to take care of it. OTOH with Kruskal you don't have to adapt anything, but simply manipulate your graph a bit before running the regular algorithm.
If I understood the question properly, Prim's algorithm would be more suitable for this, as it is possible to initialize the connected components to be exactly the edges which are required to occur in the resulting spanning tree (plus the remaining isolated nodes). The desired edges are not permitted to contain a cycle, otherwise there is no spanning tree including them.
That being said, apparently Kruskal's algorithm can also be used, as it is explicitly stated that is can be used to find an edge that connects two forests in a cost-minimal way.
Roughly speaking, as the forests of a given graph form a Matroid, the greedy approach yields the desired result (namely a weight-minimal tree) regardless of the independent set you start with.

How to detect if the given graph has a cycle containing all of its nodes? Does the suggested algorithm have any flaws?

I have a connected, non-directed, graph with N nodes and 2N-3 edges. You can consider the graph as it is built onto an existing initial graph, which has 3 nodes and 3 edges. Every node added onto the graph and has 2 connections with the existing nodes in the graph. When all nodes are added to the graph (N-3 nodes added in total), the final graph is constructed.
Originally I'm asked, what is the maximum number of nodes in this graph that can be visited exactly once (except for the initial node), i.e., what is the maximum number of nodes contained in the largest Hamiltonian path of the given graph? (Okay, saying largest Hamiltonian path is not a valid phrase, but considering the question's nature, I need to find a max. number of nodes that are visited once and the trip ends at the initial node. I thought it can be considered as a sub-graph which is Hamiltonian, and consists max. number of nodes, thus largest possible Hamiltonian path).
Since i'm not asked to find a path, I should check if a hamiltonian path exists for given number of nodes first. I know that planar graphs and cycle graphs (Cn) are hamiltonian graphs (I also know Ore's theorem for Hamiltonian graphs, but the graph I will be working on will not be a dense graph with a great probability, thus making Ore's theorem pretty much useless in my case). Therefore I need to find an algorithm for checking if the graph is cycle graph, i.e. does there exist a cycle which contains all the nodes of the given graph.
Since DFS is used for detecting cycles, I thought some minor manipulation to the DFS can help me detect what I am looking for, as in keeping track of explored nodes, and finally checking if the last node visited has a connection to the initial node. Unfortunately
I could not succeed with that approach.
Another approach I tried was excluding a node, and then try to reach to its adjacent node starting from its other adjacent node. That algorithm may not give correct results according to the chosen adjacent nodes.
I'm pretty much stuck here. Can you help me think of another algorithm to tell me if the graph is a cycle graph?
Edit
I realized by the help of the comment (thank you for it n.m.):
A cycle graph consists of a single cycle and has N edges and N vertices. If there exist a cycle which contains all the nodes of the given graph, that's a Hamiltonian cycle. – n.m.
that I am actually searching for a Hamiltonian path, which I did not intend to do so:)
On a second thought, I think checking the Hamiltonian property of the graph while building it will be more efficient, which is I'm also looking for: time efficiency.
After some thinking, I thought whatever the number of nodes will be, the graph seems to be Hamiltonian due to node addition criteria. The problem is I can't be sure and I can't prove it. Does adding nodes in that fashion, i.e. adding new nodes with 2 edges which connect the added node to the existing nodes, alter the Hamiltonian property of the graph? If it doesn't alter the Hamiltonian property, how so? If it does alter, again, how so? Thanks.
EDIT #2
I, again, realized that building the graph the way I described might alter the Hamiltonian property. Consider an input given as follows:
1 3
2 3
1 5
1 3
these input says that 4th node is connected to node 1 and node 3, 5th to node 2 and node 3 . . .
4th and 7th node are connected to the same nodes, thus lowering the maximum number of nodes that can be visited exactly once, by 1. If i detect these collisions (NOT including an input such as 3 3, which is an example that you suggested since the problem states that the newly added edges are connected to 2 other nodes) and lower the maximum number of nodes, starting from N, I believe I can get the right result.
See, I do not choose the connections, they are given to me and I have to find the max. number of nodes.
I think counting the same connections while building the graph and subtracting the number of same connections from N will give the right result? Can you confirm this or is there a flaw with this algorithm?
What we have in this problem is a connected, non-directed graph with N nodes and 2N-3 edges. Consider the graph given below,
A
/ \
B _ C
( )
D
The Graph does not have a Hamiltonian Cycle. But the Graph is constructed conforming to your rules of adding nodes. So searching for a Hamiltonian Cycle may not give you the solution. More over even if it is possible Hamiltonian Cycle detection is an NP-Complete problem with O(2N) complexity. So the approach may not be ideal.
What I suggest is to use a modified version of Floyd's Cycle Finding algorithm (Also called the Tortoise and Hare Algorithm).
The modified algorithm is,
1. Initialize a List CYC_LIST to ∅.
2. Add the root node to the list CYC_LIST and set it as unvisited.
3. Call the function Floyd() twice with the unvisited node in the list CYC_LIST for each of the two edges. Mark the node as visited.
4. Add all the previously unvisited vertices traversed by the Tortoise pointer to the list CYC_LIST.
5. Repeat steps 3 and 4 until no more unvisited nodes remains in the list.
6. If the list CYC_LIST contains N nodes, then the Graph contains a Cycle involving all the nodes.
The algorithm calls Floyd's Cycle Finding Algorithm a maximum of 2N times. Floyd's Cycle Finding algorithm takes a linear time ( O(N) ). So the complexity of the modied algorithm is O(N2) which is much better than the exponential time taken by the Hamiltonian Cycle based approach.
One possible problem with this approach is that it will detect closed paths along with cycles unless stricter checking criteria are implemented.
Reply to Edit #2
Consider the Graph given below,
A------------\
/ \ \
B _ C \
|\ /| \
| D | F
\ / /
\ / /
E------------/
According to your algorithm this graph does not have a cycle containing all the nodes.
But there is a cycle in this graph containing all the nodes.
A-B-D-C-E-F-A
So I think there is some flaw with your approach. But suppose if your algorithm is correct, it is far better than my approach. Since mine takes O(n2) time, where as yours takes just O(n).
To add some clarification to this thread: finding a Hamiltonian Cycle is NP-complete, which implies that finding a longest cycle is also NP-complete because if we can find a longest cycle in any graph, we can find the Hamiltonian cycle of the subgraph induced by the vertices that lie on that cycle. (See also for example this paper regarding the longest cycle problem)
We can't use Dirac's criterion here: Dirac only tells us minimum degree >= n/2 -> Hamiltonian Cycle, that is the implication in the opposite direction of what we would need. The other way around is definitely wrong: take a cycle over n vertices, every vertex in it has exactly degree 2, no matter the size of the circle, but it has (is) an HC. What you can tell from Dirac is that no Hamiltonian Cycle -> minimum degree < n/2, which is of no use here since we don't know whether our graph has an HC or not, so we can't use the implication (nevertheless every graph we construct according to what OP described will have a vertex of degree 2, namely the last vertex added to the graph, so for arbitrary n, we have minimum degree 2).
The problem is that you can construct both graphs of arbitrary size that have an HC and graphs of arbitrary size that do not have an HC. For the first part: if the original triangle is A,B,C and the vertices added are numbered 1 to k, then connect the 1st added vertex to A and C and the k+1-th vertex to A and the k-th vertex for all k >= 1. The cycle is A,B,C,1,2,...,k,A. For the second part, connect both vertices 1 and 2 to A and B; that graph does not have an HC.
What is also important to note is that the property of having an HC can change from one vertex to the other during construction. You can both create and destroy the HC property when you add a vertex, so you would have to check for it every time you add a vertex. A simple example: take the graph after the 1st vertex was added, and add a second vertex along with edges to the same two vertices of the triangle that the 1st vertex was connected to. This constructs from a graph with an HC a graph without an HC. The other way around: add now a 3rd vertex and connect it to 1 and 2; this builds from a graph without an HC a graph with an HC.
Storing the last known HC during construction doesn't really help you because it may change completely. You could have an HC after the 20th vertex was added, then not have one for k in [21,2000], and have one again for the 2001st vertex added. Most likely the HC you had on 23 vertices will not help you a lot.
If you want to figure out how to solve this problem efficiently, you'll have to find criteria that work for all your graphs that can be checked for efficiently. Otherwise, your problem doesn't appear to me to be simpler than the Hamiltonian Cycle problem is in the general case, so you might be able to adjust one of the algorithms used for that problem to your variant of it.
Below I have added three extra nodes (3,4,5) in the original graph and it does seem like I can keep adding new nodes indefinitely while keeping the property of Hamiltonian cycle. For the below graph the cycle would be 0-1-3-5-4-2-0
1---3---5
/ \ / \ /
0---2---4
As there were no extra restrictions about how you can add a new node with two edges, I think by construction you can have a graph that holds the property of hamiltonian cycle.

Correctness of algorithm to calculate maximal independent set

I am trying to find the maximal set for an undirected graph and here is the algorithm that i am using to do so:
1) Select the node with minimum number of edges
2) Eliminate all it's neighbors
3) From the rest of the nodes, select the node with minimum number of edges
4) Repeat the steps until the whole graph is covered
Can someone tell me if this is right? If not, then why is this method wrong to calculate the maximal independent set in a graph?
What you have described will pick a maximal independent set. We can see this as follows:
This produces an independent set. By contradiction, suppose that it didn't. Then there would have to be two nodes connected by edges that were added into the set you produced. Take whichever one of them was picked first (call it u, let the other be v) Then when it was added to the set, you would have removed all of its neighboring nodes from the set, including node v. Then v wouldn't have been added to the set, giving a contradiction.
This produces a maximal independent set. By contradiction, suppose that it didn't. This means that there is some node v that can be added to the independent set produced by your algorithm, but was not added. Since this node wasn't added, it must have been removed from the graph by the algorithm. This means that it must have been adjacent to some node added to the set already. But this is impossible, because it would mean that the node v cannot be added to the produced independent set without making the result not an independent set. We have a contradiction.
Hope this helps!
There is not one definite maximal independent set in any graph; take for example the cycle over 3 nodes, each of the nodes forms a maximal independent set. Your algorithm will give you one of the maximal independent sets of the graph, without guaranteeing that it has maximum cardinality.On the other hand, finding the maximum independent set in a graph is NP-complete (since that problem is complementary to that of finding a maximum clique), so there probably isn't an efficient algorithm.
After your clarify situation in comments, your solutions is right.
Even better, according to Corollary 3 from this paper http://courses.engr.illinois.edu/cs598csc/sp2011/Lectures/lecture_7.pdf
your get good aproximation for subset order.
Greedy gives a 1 / (d + 1) -approximation for (unweighted) MIS in graphs of degree at most d

Resources