Find a MST in O(V+E) Time in a Graph - algorithm

Similar question has been asked before, as in https://cs.stackexchange.com/questions/28635/find-an-mst-in-a-graph-with-edge-weights-from-1-2
The question is:
Given a connected undirected graph G=(V,E) and a weight function w:E→{1,2}, suggest an efficient algorithm that finds an MST of the graph in O(V+E) without using Kruskal.
I had a look at the suggested solutions on the thread above, but am still not sure how to make it work. The first suggestion doesn't consider components. The second suggestion doesn't provide more details on how to identify if the newly considered edge will form a cycle if it is added to the current MST. The tricky part is how to identify if two vertices are in the same component in liner time.
My current thought is
sort the vertices, which can be done in linear time
consider edges with weight 1 first. add the edge to the MST when the number of edges is less than or equal to |V1|-1. V1 are the vertices on the edges with a weight of 1. We need to make sure all the vertices with a weight of 1 is checked. Hash set can be used to store V1 and edges.
add V2 to the graph by using the same logic.
Could anyone suggest if my thought has flaws? If so, what is the best way to tackle this question? Thank you very much!

I would suggest you to do something like the second answer in the given question:
This is prim's algorithm :
Start by picking any vertex to be the root of the tree.
While the tree does not contain all vertices in the graph find
shortest edge leaving the tree and add it to the tree .
So now if we can perform the finding in the set of edges leaving the tree in O(1) time and we can keep the set updated so the search can always happen in constant time in total O(|E|) time, then we are good to go.
Now for this to happen, think of the set of edges leaving the tree as a linked list. now whenever a vertex is added to the set of vertexes that form the MST, iterate through its adjacency list and add the edges of weight 1 to the front of the list, and add the edges of weight 2 to the end of the list. Now whenever you want the minimum edge leaving the tree just take one from the front of the linked list!
The only problem with this method is that you should only add the edges to the list that are "leaving the tree"! because if we don't, we might end up having cycles! For checking this "leaving the tree" property for each edge, we can use the set of selected vertices, and we can check each edge before adding, that it doesn't have both ends in the set, so simply when you are adding the edges of a newly added vertex to set of edges leaving the tree, first check if the vertex on the other end of the edge is in the set of selected vertices of tree, and add the edge to the list only if the other edge wasn't in the set. You can check existence of an element in a set in O(1) time and this way you won't end up with cycles!

Related

How to traverse on only a cycle in a graph?

I'm attempting ch23 in CLRS on MSTs, here's a question:
Given a graph G and a minimum spanning tree T , suppose that we decrease the weight of one of the edges not in T . Give an algorithm for finding the minimum spanning tree in the modified graph.
A solution I found was to add this new changed edge in T, then exactly one simple cycle is created in T, traverse this cycle and delete the max-weight edge in this cycle, voila, the new updated MST is found!
My question is, how do I only traverse nodes on this simple-cycle? Since DFS/BFS traversals might go out of the cycle if I, say, start the traversal in T from one endpoint of this newly added edge in T.
One solution I could think of was to find the biconnected components in T after adding the new edge. Only one BCC will be found, which is this newly formed simple-cycle, then I can put in a special condition in my DFS code saying to only traverse edges/nodes in this BCC, and once a back-edge is found, stop the traversal.
Edit: graph G is connected and undirected btw
Your solution is basically good. To make it more formal you can use Tarjan's bridge-finding algorithm
This algorithm find the cut-edges (aka bridges) in the graph in linear time. Consider E' to be the cut-edges set. It is easy to prove that every edge in E' can not be on circle. So, E / E' are must be the cycle in the graph.
You can use hash-map or array build function to find the difference between your E and the cut-edges set
From here you can run simple for-loop to find the max weight edge which you want to remove.
Hope that help!

Finding the heaviest edge in the graph that forms a cycle

Given an undirected graph, I want an algorithm (inO(|V|+|E|)) that will find me the heaviest edge in the graph that forms a cycle. For example, if my graph is as below, and I'll run DFS(A), then the heaviest edge in the graph will be BC.
(*) In this problem, I have at most 1 cycle.
I'm trying to write a modified DFS, that will return the desired heavy edge, but I'm having some trouble.
Because I have at most 1 cycle, I can save the edges in the cycle in an array, and find the maximum edge easily at the end of the run, but I think this answer seems a bit messy, and I'm sure there's a better recursive answer.
I think the easiest way to solve this is to use a union-find data structure (https://en.wikipedia.org/wiki/Disjoint-set_data_structure) in a manner similar to Kruskal's MST algorithm:
Put each vertex in its own set
Iterate through the edges in order of weight. For each edge, merge the sets of the adjacent vertices if they're not already in the same set.
Remember the last edge for which you found that its adjacent vertices were already in the same set. That's the one you're looking for.
This works because the last and heaviest edge that you visit in any cycle must already have its adjacent vertices connected by edges you visited earlier.
Use Tarjan's Strongly Connected Components algorithm.
Once you have split your graph into many strongly connected graphs assign a COMP_ID to each node which specifies the component ID to which this node belongs (This can be done with a small edit on the algorithm. Define a global integer value which starts at 1. Every time you pop nodes from the stack they all correspond to the same component, save the value of this variable to the COMP_ID of these nodes. When the pop loop ends increment the value of this integer by one).
Now, iterate over all the edges. You have 2 possibilities:
If this edge links two nodes from two different components, then this edge can't be the answer, since it can't possibly be a part of a cycle.
If this edge links two nodes from the same component, then this edge is a part of some cycle. All you have left to do now is to choose the maximum edge among all the edges of type 2.
The described approach runs in a total complexity of O(|V| + |E|) because every node and edge corresponds to at most one strongly connected component.
In the graph example you provided COMP_ID will be as follows:
COMP_ID[A] = 1
COMP_ID[B] = 2
COMP_ID[C] = 2
COMP_ID[D] = 2
Edge 10 connects COMP_ID 1 with COMP_ID 2, thus it can't be the answer. The answer is the maximum among edges {2, 5, 8} since they all connect COMP_ID 1 with it self, thus the answer is 8

Will a standard Kruskal-like approach for MST work if some edges are fixed?

The problem: you need to find the minimum spanning tree of a graph (i.e. a set S of edges in said graph such that the edges in S together with the respective vertices form a tree; additionally, from all such sets, the sum of the cost of all edges in S has to be minimal). But there's a catch. You are given an initial set of fixed edges K such that K must be included in S.
In other words, find some MST of a graph with a starting set of fixed edges included.
My approach: standard Kruskal's algorithm but before anything else join all vertices as pointed by the set of fixed edges. That is, if K = {1,2}, {4,5} I apply Kruskal's algorithm but instead of having each node in its own individual set initially, instead nodes 1 and 2 are in the same set and nodes 4 and 5 are in the same set.
The question: does this work? Is there a proof that this always yields the correct result? If not, could anyone provide a counter-example?
P.S. the problem only inquires finding ONE MST. Not interested in all of them.
Yes, it will work as long as your initial set of edges doesn't form a cycle.
Keep in mind that the resulting tree might not be minimal in weight since the edges you fixed might not be part of any MST in the graph. But you will get the lightest spanning tree which satisfies the constraint that those fixed edges are part of the tree.
How to implement it:
To implement this, you can simply change the edge-weights of the edges you need to fix. Just pick the lowest appearing edge-weight in your graph, say min_w, subtract 1 from it and assign this new weight,i.e. (min_w-1) to the edges you need to fix. Then run Kruskal on this graph.
Why it works:
Clearly Kruskal will pick all the edges you need (since these are the lightest now) before picking any other edge in the graph. When Kruskal finishes the resulting set of edges is an MST in G' (the graph where you changed some weights). Note that since you only changed the values of your fixed set of edges, the algorithm would never have made a different choice on the other edges (the ones which aren't part of your fixed set). If you think of the edges Kruskal considers, as a sorted list of edges, then changing the values of the edges you need to fix moves these edges to the front of the list, but it doesn't change the order of the other edges in the list with respect to each other.
Note: As you may notice, giving the lightest weight to your edges is basically the same thing as you suggest. But I think it is a bit easier to reason about why it works. Go with whatever you prefer.
I wouldn't recommend Prim, since this algorithm expands the spanning tree gradually from the current connected component (in the beginning one usually starts with a single node). The case where you join larger components (because your fixed edges might not all be in a single component), would be needed to handled separately - it might not be hard, but you would have to take care of it. OTOH with Kruskal you don't have to adapt anything, but simply manipulate your graph a bit before running the regular algorithm.
If I understood the question properly, Prim's algorithm would be more suitable for this, as it is possible to initialize the connected components to be exactly the edges which are required to occur in the resulting spanning tree (plus the remaining isolated nodes). The desired edges are not permitted to contain a cycle, otherwise there is no spanning tree including them.
That being said, apparently Kruskal's algorithm can also be used, as it is explicitly stated that is can be used to find an edge that connects two forests in a cost-minimal way.
Roughly speaking, as the forests of a given graph form a Matroid, the greedy approach yields the desired result (namely a weight-minimal tree) regardless of the independent set you start with.

How to get the new MST from the old one if one edge in the graph changes its weight?

We know the original graph and the original MST. Now we change an edge's weight in the graph. Beside the Prim and the Kruskal, is there any way we can generate the new MST from the old one?
Here's how I would do it:
If the changed edge is in the original MST:
If its weight was reduced, then of course it must be in the new MST.
If its weight was increased, then delete it from the original MST and look for the lowest-weight edge that connects the two subtrees that remain (this could select the original edge again). This can be done efficiently in a Kruskal-like way by building up a disjoint-set data structure to hold the two subtrees, and sorting the remaining edges by weight: select the first one with endpoints in different sets. If you know a way of quickly deleting an edge from a disjoint-set data structure, and you built the original MST using Kruskal's algorithm, then you can avoid recomputing them here.
Otherwise:
If its weight was increased, then of course it will remain outside the MST.
If its weight was reduced, add it to the original MST. This will create a cycle. Scan the cycle, looking for the heaviest edge (this could select the original edge again). Delete this edge. If you will be performing many edge mutations, cycle-finding may be sped up by calculating all-pairs shortest paths using the Floyd-Warshall algorithm. You can then find all edges in the cycle by initially leaving the new edge out, and looking for the shortest path in the MST between its two endpoints (there will be exactly one such path).
Besides linear-time algorithm, proposed by j_random_hacker, you can find a sub-linear algorithm in this book: "Handbook of Data Structures and Applications" (Chapter 36) or in these papers: Dynamic graphs, Maintaining Minimum Spanning Trees in Dynamic Graphs.
You can change the problem a little while the result is same.
Get the structure of the original MST, run DFS from each vertice and
you can get the maximum weighted edge in the tree-path between each
vertice-pair. The complexity of this step is O(N ^ 2)
Instead of changing one edge's weight to w, we can assume we are
adding a new edge (u,v) into the original MST whose weight is w. The
adding edge would make a loop on the tree and we should cut one edge
on the loop to generate a new MST. It's obviously that we can only
compare the adding edge with the maximum edge in the path(a,b), The
complexity of this step is O(1)

How to find the minimum set of vertices in a Directed Graph such that all other vertices can be reached

Given a directed graph, I need to find the minimum set of vertices from which all other vertices can be reached.
So the result of the function should be the smallest number of vertices, from which all other vertices can be reached by following the directed edges.
The largest result possible would be if there were no edges, so all nodes would be returned.
If there are cycles in the graph, for each cycle, one node is selected. It does not matter which one, but it should be consistent if the algorithm is run again.
I am not sure that there is an existing algorithm for this? If so does it have a name? I have tried doing my research and the closest thing seems to be finding a mother vertex
If it is that algorithm, could the actual algorithm be elaborated as the answer given in that link is kind of vague.
Given I have to implement this in javascript, the preference would be a .js library or javascript example code.
From my understanding, this is just finding the strongly connected components in a graph. Kosaraju's algorithm is one of the neatest approaches to do this. It uses two depth first searches as against some later algorithms that use just one, but I like it the most for its simple concept.
Edit: Just to expand on that, the minimum set of vertices is found as was suggested in the comments to this post :
1. Find the strongly connected components of the graph - reduce each component to a single vertex.
2. The remaining graph is a DAG (or set of DAGs if there were disconnected components), the root(s) of which form the required set of vertices.
[EDIT #2: As Jason Orendorff mentions in a comment, finding the feedback vertex set is overkill and will produce a vertex set larger than necessary in general. kyun's answer is (or will be, when he/she adds in the important info in the comments) the right way to do it.]
[EDIT: I had the two steps round the wrong way... Now we should guarantee minimality.]
Call all of the vertices with in-degree zero Z. No vertex in Z can be reached by any other vertex, so it must be included in the final set.
Using a depth-first (or breadth-first) traversal, trace out all the vertices reachable from each vertex in Z and delete them -- these are the vertices already "covered" by Z.
The graph now consists purely of directed cycles. Find a feedback vertex set F which gives you a smallest-possible set of vertices whose removal would break every cycle in the graph. Unfortunately as that Wikipedia link shows, this problem is NP-hard for directed graphs.
The set of vertices you're looking for is Z+F.

Resources