I am studying about MST algorithms. I am curious to find the key differences between prims and boruvka's algorithm but the resources online don't have much to say about them other than their implementation and algorithm. If someone can explain, it would be great help. Thanks!
Both algorithms use the facts that
For every vertex v, there exists a minimum spanning tree T such that the cheapest edge incident to v belongs to T.
For every edge e, the (minimum) spanning trees that contain e are in natural one-to-one correspondence with the (minimum) spanning trees of the graph where e is contracted.
Prim and Borůvka exploit these facts in different ways. Prim chooses a root vertex r and repeatedly contracts the cheapest edge incident to r (the usual description avoids graph contraction but is equivalent to this) until only r remains. Borůvka repeatedly contracts all of the cheapest incident edges "in parallel" until there is exactly one vertex remaining.
You can create a variety of minimum spanning tree algorithms by mixing and matching contraction strategies.
Related
First please note that this question is NOT asking about MST, instead, just all possible spanning trees.
So this is NOT the same as finding all minimal spanning trees or All minimum spanning trees implementation
I just need to generate all possible spanning trees from a graph.
I think the brute-force way is straight:
Suppose we have V nodes and E edges.
Get all edges of the graph
Get all possible combinations of V-1 out of E edges.
Filter out non-spanning-tree out of the combinations (for a spanning tree, all nodes inside one set of V-1 edges should appear exactly once)
But I think it is too slow when facing big graph.
Do we have a better way?
Set the weight of all edges to the same value, then use an algorithm to find all minimum spanning trees. Since all spanning trees have |V|-1 edges and all edge weights are equal, all spanning trees will be minimum spanning trees.
I've become interested in this question, and have yet to find a really satisfactory answer. However, I have found a number of references: Knuth's Algorithms S and S' in TAOCP Volume 4 Fascicle 4, a paper by Sorensen and Janssens, and GRAYSPAN, SPSPAN, and GRAYSPSPAN by Knuth. It's too bad none of them are implementations in a language I could use ... I guess I'll have to spend some time coding these ...
I'm trying to find an efficient method of detecting whether a given graph G has two different minimal spanning trees. I'm also trying to find a method to check whether it has 3 different minimal spanning trees. The naive solution that I've though about is running Kruskal's algorithm once and finding the total weight of the minimal spanning tree. Later , removing an edge from the graph and running Kruskal's algorithm again and checking if the weight of the new tree is the weight of the original minimal spanning tree , and so for each edge in the graph. The runtime is O(|V||E|log|V|) which is not good at all, and I think there's a better way to do it.
Any suggestion would be helpful,
thanks in advance
You can modify Kruskal's algorithm to do this.
First, sort the edges by weight. Then, for each weight in ascending order, filter out all irrelevant edges. The relevant edges form a graph on the connected components of the minimum-spanning-forest-so-far. You can count the number of spanning trees in this graph. Take the product over all weights and you've counted the total number of minimum spanning trees in the graph.
You recover the same running time as Kruskal's algorithm if you only care about the one-tree, two-trees, and three-or-more-trees cases. I think you wind up doing a determinant calculation or something to enumerate spanning trees in general, so you likely wind up with an O(MM(n)) worst-case in general.
Suppose you have a MST T0 of a graph. Now, if we can get another MST T1, it must have at least one edge E different from the original MST. Throw away E from T1, now the graph is separated into two components. However, in T0, these two components must be connected, so there will be another edge across this two components that has exactly the same weight as E (or we could substitute the one with more weight with the other one and get a smaller ST). This means substitute this other edge with E will give you another MST.
What this implies is if there are more than one MSTs, we can always change just a single edge from a MST and get another MST. So if you are checking for each edge, try to substitute the edge with the ones with the same weight and if you get another ST it is a MST, you will get a faster algorithm.
Suppose G is a graph with n vertices and m edges; that the weight of any edge e is W(e); and that P is a minimal-weight spanning tree on G, weighing Cost(W,P).
Let δ = minimal positive difference between any two edge weights. (If all the edge weights are the same, then δ is indeterminate; but in this case, any ST is an MST so it doesn't matter.) Take ε such that δ > n·ε > 0.
Create a new weight function U() with U(e)=W(e)+ε when e is in P, else U(e)=W(e). Compute Q, an MST of G under U. If Cost(U,Q) < Cost(U,P) then Q≠P. But Cost(W,Q) = Cost(W,P) by construction of δ and ε. Hence P and Q are distinct MSTs of G under W. If Cost(U,Q) ≥ Cost(U,P) then Q=P and distinct MSTs of G under W do not exist.
The method above determines if there are at least two distinct MSTs, in time O(h(n,m)) if O(h(n,m)) bounds the time to find an MST of G.
I don't know if a similar method can treat whether three (or more) distinct MSTs exist; simple extensions of it fall to simple counterexamples.
If we modify the shortest path problem such that the cost of a path between two vertices is the maximum of the costs of the edges on it, then for any pair of vertices u and v,
the path between them that follows a minimum-cost spanning tree is a min-cost path.
How can I prove this approach is true? It makes sense but I am not sure. Does anyone know if this algorithm exists in the literature? Is there a name for it?
The approach you mentioned is discussed in detail in literatures that discuss the relationship between Prim's algorithm and Dijkstra's algorithm, as usual wikipedia is a good place to start your research:
The process that underlies Dijkstra's algorithm is similar to the greedy process used in Prim's algorithm. Prim's purpose is to find a minimum spanning tree that connects all nodes in the graph; Dijkstra is concerned with only the lowest cost path beteen two nodes.
You can use some basic facts of MST (that are usually discussed in the correctness proof for Prim's & Kruskal's algorithms). The one that matters now is that
Lema 1:
Given a graph cut (a partitioning of the vertices into two
disjoint sets) the edge in the MST connecting the two parts will be
the cheapest of the edges connecting the two parts.
(The proof is straighfoward, if there were a cheaper edge we would be able to easily contruct a cheaper spanning tree)
We can now prove that the paths in a MST are all min-cost paths if you consider the maximum-cost:
Take any two vertices s and t in G and the path p that connects them in a MST of G. Now let uv be the most expensive edge in this path. We can describe a graph cut over this edge, with one partition with the vertices on the u side of the MST and the other partition with the vertices on the v side. We know that any path connecting s and t must pass this cut, therefore we can determine that the cost of any path from s to t must be at least the cost of the cheapest edge on this cut. But Lemma 1 tells us that uv is the cheapest edge on this cut so p must be a min-cost path.
I have an undirected, positive-edge-weight graph (V,E) for which I want a minimum spanning tree covering a subset k of vertices V (the Steiner tree problem).
I'm not limiting the size of the spanning tree to k vertices; rather I know exactly which k vertices must be included in the MST.
Starting from the entire MST I could pare down edges/nodes until I get the smallest MST that contains all k.
I can use Prim's algorithm to get the entire MST, and start deleting edges/nodes while the MST of subset k is not destroyed; alternatively I can use Floyd-Warshall to get all-pairs shortest paths and somehow union the paths. Are there better ways to approach this?
There's a lot of confusion going on here. Based on what the OP says:
I'm not limiting the size of the spanning tree to k vertices; rather I know exactly which k vertices must be included in the MST.
This is the Steiner tree problem on graphs. This is not the k-MST problem. The Steiner tree problem is defined as such:
Given a weighted graph G = (V, E), a subset S ⊆ V of the vertices,
and a root r ∈ V , we want to find a minimum weight tree which connects all the vertices in S to
r. 1
As others have mentionned, this problem is NP-hard. Therefore, you can use an approximation algorithm.
Early/Simple Approximation Algorithms
Two famous methods are Takahashi's method and Kruskal's method (both of which have been extended/improved by Rayward-Smith):
Takahashi H, Matsuyama A: An approximate solution for the Steiner problem in graphs. Math. Jap 1980, 24:573–577.
Kruskal JB: On the Shortest Spanning Subtree of a Graph and the Traveling Salesman Problem. In Proceedings of the American Mathematical Society, Volume 7. ; 1956:48–50.
Rayward-Smith VJ, Clare A: On finding Steiner vertices. Networks 1986, 16:283–294.
Shortest path approximation by Takahashi (with modification by Rayward-Smith)
Kruskal's approximation algorithm (with modification by Rayward-Smith)
Modern/More Advanced Approximation Algorithms
In biology, more recent approaches have treated the problem using the cavity method, which has led to a "modified belief propagation" method that has shown good accuracy on large data sets:
Bayati, M., Borgs, C., Braunstein, A., Chayes, J., Ramezanpour, A., Zecchina, R.: Statistical mechanics of steiner trees. Phys. Rev. Lett. 101(3), 037208 (2008) 15.
For an application: Steiner tree methods for optimal sub-network identification: an empirical study. BMC Bioinformatics. BMC Bioinformatics 2013 30;14:144. Epub 2013 Apr 30.
In the context of search engine problems, approaches have focused on efficiency for very large data sets that can be pre-processed to some degree.
G. Bhalotia, A. Hulgeri, C. Nakhe, S. Chakrabarti, and S. Sudarshan. Keyword Searching and Browsing in Databases using BANKS. In ICDE, pages 431–440.
G. Kasneci, M. Ramanath, M. Sozio, F. M. Suchanek, and G. Weikum. STAR: Steiner-tree approximation in relationship graphs. In ICDE’09, pages 868–879, 2009
The problem you stated is a famous NP-hard problem, called Steiner tree in graphs. There are no known solutions in polynomial time and many believe no such solutions exist.
Run Prim's algorithm on the restricted graph (k, E') where E' = {(x, y) ∈ V : x ∈ k and y ∈ k}). Constructing that graph takes O(|E|).
Is there any applicable approach to find two disjoint spanning trees of an undirected graph or to check if a certain graph has two disjoint spanning trees
This is an example of Matroid union. Consider the graphic matroid where the basis are given by the spanning trees. Now the union of this matroid with itself is again a matroid. Your question is about the size of the basis of this matroid. (whether there exist a basis of size $2(|V|-1)$.
The canonical algorithm for this is Matroid partitioning algorithm. There exist an algorithm which does does the following: It maintains a set of edges with a partitioning into two forests. At each step given a new edge $e$, it decides whether there exist a reshuffling of the current partition into a new partition such that the new edge can be added to the set and the partition remains independent. And if not, it somehow will provide a certificate that it cannot.
For details look at a course in Comb. Optimization or the book by Schriver.
Not sure it helps much in the applicable side but Tutte [1961a] and Nash-Williams [1961] independently characterized graphs having k pairwise edge-disjoint spanning trees:
A graph G has k pairwise edge-disjoint spanning trees iff for every partition of the vertices of G into r sets, there are at least k(r-1) edges of G whose endpoints are in different sets of the partition.
Use k=2 and it may give you a lead for your needs.
According to A Note on Finding Minimum-Cost Edge-Disjoint Spanning Trees, this can be solved in O(k2n2) where k is the number of disjoint spanning trees, and n is the number of vertices.
Unfortunately, all but the first page of the article is behind a paywall.
Assuming that the desire is to find spanning trees with disjoint edge sets, what about:
Given a graph G determining the minimum spanning tree A of G.
Defining B = G - A by deleting all edges from G that also lie in A.
Checking if B is connected.
The nature of a minimum spanning tree somehow makes me intuitively believe that choosing it as one of the two spanning trees gives you maximum freedom in constructing the other (that hopefully turns out to be edge disjunctive).
What do You guys think?
edit
The above algorithm makes no sense as a spanning tree is a tree and therefore needs to be acyclic. But there is no guarantee that B = G - A is acyclic.
However, this observations (thx#Tormer) led me to another idea:
Given a graph G determine the minimum spanning tree A of G.
Define B = (V[G], E[G] \ E[A]) where V[G] describes the vertices of G and E[G] describes the edges of G (A respectively).
Determine, if B has a spanning tree.
It could very well be that the above algorithm fails although G indeed has two edge disjunctive spanning trees - just no one of them is G's minimum spanning tree. I can't judge this (now), so I'm asking for Your opinion if it's wise to always chose the minimum spanning tree as one of the two.