Linear Time Algorithm to Find MST?

Given two algorithms for a graph G=(V,E):
One:
Sort edges from lowest to highest weight.
Set T = {}
For each edge e in that order, check whether T ∪ {e} is free of cycles.
If yes, add e to T.
Return T if it's a spanning tree.
Two:
Sort edges from highest to lowest weight.
Set T = E
For each edge e in that order, check whether T \ {e} is a connected graph.
If yes, remove e from T.
Return T if it's a spanning tree.
Do both algorithms return a minimum spanning tree for sure?
If not, I would like to see a counterexample.

Both of these algorithms will find an MST if one exists. The first is Kruskal's algorithm; the second is known as the reverse-delete algorithm, and it can be proven equivalent pretty easily.
Neither of them is linear time unless the weights are constrained somehow, because both start with an O(E log E) edge sort.
Disregarding the sort, the remainder of Kruskal's algorithm is very close to linear time, because it uses a disjoint-set data structure to check for connectivity.
The second algorithm doesn't have a similarly quick and straightforward implementation: anything fast is going to be more difficult than using Kruskal's algorithm instead.
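For illustration, here is a minimal sketch of the first algorithm (Kruskal's) with a disjoint-set forest. The function name `kruskal`, the `(weight, u, v)` edge tuples and the 0-based vertex numbering are assumptions made for this example, not part of the question.

```python
def kruskal(num_vertices, edges):
    """Return (total_weight, tree_edges), or None if no spanning tree exists.

    edges: list of (weight, u, v) tuples, vertices numbered 0..num_vertices-1.
    """
    parent = list(range(num_vertices))
    size = [1] * num_vertices

    def find(x):
        # Path halving keeps the trees shallow.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    total = 0
    for w, u, v in sorted(edges):       # the O(E log E) step
        ru, rv = find(u), find(v)
        if ru == rv:
            continue                    # u and v already connected: e would close a cycle
        if size[ru] < size[rv]:
            ru, rv = rv, ru
        parent[rv] = ru                 # union by size
        size[ru] += size[rv]
        tree.append((w, u, v))
        total += w
    if len(tree) != num_vertices - 1:
        return None                     # graph was not connected
    return total, tree
```

Everything after the `sorted` call is the near-linear part; the sort dominates the running time.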

Neither algorithm is guaranteed to build a tree, because G might not be connected.

Related

Building MST from a graph with "very few" edges in linear time

I was at an interview and interviewer asked me a question:
We have a graph G(V,E), and we can find an MST using Prim's or Kruskal's algorithm. But these algorithms do not take into account that there are "very few" edges in G. How can we use this information to improve the time complexity of finding an MST? Can we find an MST in linear time?
The only thing I could remember was that Kruskal's algorithm is faster on sparse graphs while Prim's algorithm is faster on really dense graphs. But I couldn't tell him how to use prior knowledge about the number of edges to build an MST in linear time.
Any insight or solution would be appreciated.
Kruskal's algorithm is pretty much linear after sorting the edges. If you use a union-find structure such as a disjoint-set forest, the amortized cost of processing a single edge is on the order of lg*(n), where n is the number of vertices, and this function grows so slowly that for this case it can be considered constant. The problem is that sorting the edges still takes O(m * log(m)), where m is the number of edges.
Prim's algorithm will not be able to take advantage of the fact that the edges are very few.
One approach that you can use is something like a 'reversed' MST approach, where you start off with all edges and repeatedly remove the heaviest remaining edge whose removal does not disconnect the graph. You keep doing that until only n - 1 edges are left. Still, note that this will be better than Kruskal only if the number of edges to remove, k, is small enough that k * n < m * log(m).
Let's say |E| = |V| + c, with c being a small constant. You can run DFS on the graph and, every time you detect a cycle, remove the heaviest edge on that cycle. You must do that c + 1 times, because a spanning tree has |V| - 1 edges. That gives O((c + 1) * |E|) = O(|E|), which is linear time in theory.
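A rough sketch of this idea follows, assuming a connected graph given as `(weight, u, v)` tuples over vertices `0..n-1`. The helper names `find_cycle` and `mst_by_cycle_removal` are made up for the example. Each pass builds a spanning forest with a stack-based traversal; any edge outside the forest closes a cycle, which is recovered by walking both endpoints up to their common ancestor.

```python
def find_cycle(num_vertices, edges, alive):
    """Return the indices of the edges on some cycle, or None if there is none."""
    adj = [[] for _ in range(num_vertices)]
    for i, (_, u, v) in enumerate(edges):
        if alive[i]:
            adj[u].append((v, i))
            adj[v].append((u, i))
    depth = [-1] * num_vertices          # -1 means "not reached yet"
    parent = [-1] * num_vertices
    parent_edge = [-1] * num_vertices    # edge index used to reach each vertex
    for root in range(num_vertices):
        if depth[root] != -1:
            continue
        depth[root] = 0
        stack = [root]
        while stack:
            u = stack.pop()
            for v, i in adj[u]:
                if i == parent_edge[u]:
                    continue             # do not walk back along the tree edge
                if depth[v] == -1:
                    depth[v] = depth[u] + 1
                    parent[v], parent_edge[v] = u, i
                    stack.append(v)
                else:
                    # Non-tree edge: it closes a cycle through the tree paths.
                    cycle = [i]
                    a, b = u, v
                    while a != b:
                        if depth[a] < depth[b]:
                            a, b = b, a
                        cycle.append(parent_edge[a])
                        a = parent[a]
                    return cycle
    return None


def mst_by_cycle_removal(num_vertices, edges):
    """Drop the heaviest edge of a cycle until no cycle is left."""
    alive = [True] * len(edges)
    while True:
        cycle = find_cycle(num_vertices, edges, alive)
        if cycle is None:
            break
        heaviest = max(cycle, key=lambda i: edges[i][0])
        alive[heaviest] = False
    return [e for i, e in enumerate(edges) if alive[i]]
```

Each pass costs O(|V| + |E|), and with |E| = |V| + c the loop removes exactly c + 1 edges before no cycle remains.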

How to find the minimum spanning tree by cycle finding?

By searching the web I can find two algorithms (Kruskal's and Prim's) for finding a minimum spanning tree. But this algorithm
*let T be initially the set of all edges
*while there is some cycle C in T
    remove edge e from T where e has the heaviest weight in C
I can't find by searching the web. How do I implement this algorithm? How can I find every possible cycle?
Sort the edges in decreasing order of weight, then try to delete each edge in turn, checking whether the graph is still connected. If the graph is still connected after deleting an edge, that edge was guaranteed to be on a cycle. Otherwise the edge is a bridge and must be put back.
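The easiest way to achieve this would be through the Union-Find data structure (see link below). Union-Find supports merging but not deleting, so it fits the equivalent forward formulation: add edges in increasing order of weight and skip any edge whose endpoints are already connected. Checking for cycles can be done in O(1) time and adding a new edge takes on average O(log V) processing per node, requiring an overall run-time of O(E + V log V) once the edges are sorted. We can use more advanced data structures to achieve near linear time bounds.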
http://people.cs.umass.edu/~barring/cs611/lecture/7.pdf
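As a hedged illustration of the delete-and-check approach, the sketch below tests connectivity with a plain breadth-first search rather than anything clever, so it runs in O(E * (V + E)). The names `is_connected` and `reverse_delete` and the `(weight, u, v)` edge format are assumptions for the example, and the input graph is assumed to be connected with at least one vertex.

```python
from collections import deque


def is_connected(num_vertices, edges, alive):
    """Breadth-first search from vertex 0 over the edges still alive."""
    adj = [[] for _ in range(num_vertices)]
    for i, (_, u, v) in enumerate(edges):
        if alive[i]:
            adj[u].append(v)
            adj[v].append(u)
    seen = [False] * num_vertices
    seen[0] = True
    reached = 1
    queue = deque([0])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if not seen[v]:
                seen[v] = True
                reached += 1
                queue.append(v)
    return reached == num_vertices


def reverse_delete(num_vertices, edges):
    """Try to delete edges from heaviest to lightest, keeping the graph connected."""
    alive = [True] * len(edges)
    order = sorted(range(len(edges)), key=lambda i: edges[i][0], reverse=True)
    for i in order:
        alive[i] = False
        if not is_connected(num_vertices, edges, alive):
            alive[i] = True          # the edge was a bridge: put it back
    return [e for i, e in enumerate(edges) if alive[i]]
```

If the input is not connected to begin with, every deletion fails the check and all edges are kept.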

Kruskal's MST Algorithm non-deterministic?

The following is pseudo code for Kruskal's Minimum Spanning Tree algorithm from our CS algorithms lecturer. I wanted to know if the MST algorithm is non-deterministic. Given two edges with the same weight, how would the algorithm decide between them if neither edge formed a cycle when adding to T. Surely if it was random then one could not determine the result of what exact edges are added to T?
Given an undirected connected graph G=(V,E)
T=Ø                  //Empty set
E'=E
while E'≠Ø do
begin
    pick an edge e in E' with minimum weight
    if adding e to T does not form a cycle then
        T = T∪{e}    //Set union, add e to T
    E' = E'\{e}      //Set difference, remove e from E'
end
Thanks!
Kruskal's algorithm is deterministic if you pick a deterministic choice function for the cases where you have a choice, otherwise it's non-deterministic. If you choose randomly, you can't tell which edges end up in the MST if there are several possibilities.
Given two edges with the same weight, how would the algorithm decide
between them if neither edge formed a cycle when adding to T. Surely
if it was random then one could not determine the result of what exact
edges are added to T?
This is up to the implementation.
Kruskal's algorithm finds one of the possibly many MSTs of a connected weighted graph (which is not a tree). This is because at each iteration you may have multiple choices (picking one edge from among edges with the same weight). This is the non-deterministic bit. Of course, when you implement the algorithm you will make a choice (i.e. impose an order); however, a different implementation can very well impose a different order. Thus, you can have two implementations of the algorithm, both solving the same problem correctly, but possibly with different end results.
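To make this concrete, here is a small sketch in which the tie-breaking rule is passed in as a sort key. The function `kruskal`, its `key` parameter and the triangle example are illustrative assumptions, not the lecturer's code.

```python
def kruskal(num_vertices, edges, key):
    """Kruskal's algorithm with an explicit, deterministic edge order."""
    parent = list(range(num_vertices))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for w, u, v in sorted(edges, key=key):
        ru, rv = find(u), find(v)
        if ru != rv:                 # adding the edge does not form a cycle
            parent[ru] = rv
            tree.append((w, u, v))
    return tree


# A triangle whose three edges all have weight 1: any two edges form an MST.
triangle = [(1, 0, 1), (1, 1, 2), (1, 0, 2)]

# Two different deterministic tie-breaking rules for equal weights.
first = kruskal(3, triangle, key=lambda e: (e[0], e[1], e[2]))
second = kruskal(3, triangle, key=lambda e: (e[0], -e[1], -e[2]))

print(first)   # [(1, 0, 1), (1, 0, 2)]
print(second)  # [(1, 1, 2), (1, 0, 2)]
```

Both results have total weight 2, so both are valid MSTs; only the choice function differs.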

Prim and Kruskal's algorithms complexity

Given an undirected connected graph with weights w:E->{1,2,3,4,5,6,7}, meaning there are only 7 possible weights.
I need to find a spanning tree using Prim's algorithm in O(n+m) and Kruskal's algorithm in O(m*a(m,n)).
I have no idea how to do this and really need some guidance about how the weights can help me in here.
You can sort the edges by weight faster.
In Kruskal's algorithm you don't need an O(M lg M) sort; you can use counting sort (or any other O(M) algorithm). The final complexity is then O(M) for sorting and O(M a(M,N)) for the union-find phase. In total it is O(M a(M,N)).
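A minimal sketch of the linear-time sort, assuming edges are `(weight, u, v)` tuples with integer weights in 1..7. The function name `sort_edges_by_small_weight` and the `max_weight` parameter are made up for the example.

```python
def sort_edges_by_small_weight(edges, max_weight=7):
    """Counting (bucket) sort of edges by weight in O(M + max_weight) time."""
    buckets = [[] for _ in range(max_weight + 1)]
    for edge in edges:
        buckets[edge[0]].append(edge)    # edge[0] is the integer weight
    # Concatenate the buckets from lightest to heaviest; the sort is stable.
    return [edge for bucket in buckets for edge in bucket]
```

The sorted list can then be fed to the usual union-find loop in place of a comparison sort.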
For Prim's algorithm, you don't need a heap; you need 7 lists/queues/arrays/anything (with constant-time insert and retrieval), one for each weight. Then, when you are looking for the cheapest outgoing edge, you check the lists in order from the cheapest weight and take an edge from the first non-empty one. Since 7 is a constant, the whole algorithm runs in O(M) time.
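One possible way to code this is sketched below, with one Python list per weight acting as the queue. The name `prim_small_weights`, the `(weight, u, v)` edge format and the start vertex 0 are assumptions for the example; at least one vertex is assumed. Edges whose far endpoint has already joined the tree are discarded lazily when they are popped.

```python
def prim_small_weights(num_vertices, edges, max_weight=7):
    """Prim's algorithm with one bucket per weight instead of a heap.

    Returns the tree edges, or None if the graph is not connected.
    """
    adj = [[] for _ in range(num_vertices)]
    for w, u, v in edges:
        adj[u].append((w, u, v))
        adj[v].append((w, v, u))
    in_tree = [False] * num_vertices
    buckets = [[] for _ in range(max_weight + 1)]
    tree = []

    def add_vertex(x):
        in_tree[x] = True
        for w, a, b in adj[x]:
            if not in_tree[b]:
                buckets[w].append((w, a, b))   # candidate outgoing edge

    add_vertex(0)
    while len(tree) < num_vertices - 1:
        # Scan the constant number of buckets for the lightest candidate.
        for bucket in buckets:
            if bucket:
                break
        else:
            return None                        # no candidates left: disconnected
        w, a, b = bucket.pop()
        if in_tree[b]:
            continue                           # stale edge, both ends in the tree
        tree.append((w, a, b))
        add_vertex(b)
    return tree
```

Each edge is pushed and popped at most once, and every scan touches at most `max_weight + 1` buckets, which gives the O(N + M) bound.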
As I understand, it is not popular to answer homework assignments, but this could hopefully be useful for other people than just you ;)
Prim:
Prim is an algorithm for finding a minimum spanning tree (MST), just as Kruskal is.
An easy way to visualize the algorithm is to draw the graph out on a piece of paper.
Then you create a movable line (cut) around all the nodes you have selected. In the summary below, the set A will be the nodes inside the cut. Then you choose the smallest edge running through the cut, i.e. from a node inside the line to a node on the outside. Always choose the edge with the lowest weight. After adding the new node, you move the cut so it contains the newly added node. Then you repeat until all nodes are within the cut.
A short summary of the algorithm is:
1. Create a set, A, which will contain the chosen vertices. It will initially contain an arbitrary starting node, chosen by you.
2. Create another set, B. This will initially be empty and is used to record all chosen edges.
3. Choose an edge (u, v), that is, an edge from node u to node v. It must be the edge with the smallest weight among those that have u inside the set A and v outside A. (If there are several such edges with equal weight, any can be chosen.)
4. Add the edge (u, v) to the set B and v to the set A.
5. Repeat steps 3 and 4 until A = V, where V is the set of all vertices.
The sets A and B now describe your spanning tree! The MST will contain the nodes within A, and B will describe how they connect.
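The steps above can be sketched roughly as follows, using a binary heap from Python's `heapq` module to find the smallest edge crossing the cut. The function name `prim`, the `(weight, u, v)` edge format and the `start` parameter are assumptions for this example.

```python
import heapq


def prim(num_vertices, edges, start=0):
    """Return (A, B): the vertices reached and the chosen tree edges."""
    adj = [[] for _ in range(num_vertices)]
    for w, u, v in edges:
        adj[u].append((w, u, v))
        adj[v].append((w, v, u))

    A = {start}                      # step 1: chosen vertices
    B = []                           # step 2: chosen edges
    heap = list(adj[start])          # edges leaving the cut
    heapq.heapify(heap)
    while heap and len(A) < num_vertices:
        w, u, v = heapq.heappop(heap)    # step 3: smallest edge out of A
        if v in A:
            continue                     # no longer crosses the cut
        A.add(v)                         # step 4
        B.append((w, u, v))
        for edge in adj[v]:
            if edge[2] not in A:
                heapq.heappush(heap, edge)
    return A, B
```

If the graph is not connected, the loop ends early and A holds only the vertices reachable from `start`.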
Kruskal:
Kruskal is similar to Prim, except you have no cut. So you always choose the smallest remaining edge, as long as it does not close a cycle.
1. Create a set A, which initially is empty. It will be used to store chosen edges.
2. Choose the edge e with minimum weight from the set E that has not been considered yet. The graph is undirected, so (u,v) = (v,u) and each edge is considered only once.
3. Add e to A if doing so does not create a cycle in A; otherwise discard it.
4. Repeat steps 2 and 3 until A contains |V| - 1 edges, or until every edge in E has been considered.
I am unsure about the exact performance of these algorithms, but I believe Kruskal is O(E log E), and the performance of Prim depends on which data structure you use to find the minimum edge. With a binary heap it is O(E log V), which on sparse graphs is faster than the O(V^2) you get from scanning an adjacency matrix for the minimum edge.
Hope this helps!

Fastest algorithm for detecting a loop in a graph

Given an undirected graph what is the best algorithm to detect if it contains a cycle or not?
A breadth first or depth first search while keeping track of visited nodes is one method, but it's O(n^2). Is there anything faster?
The BFS and DFS algorithms for a given graph G(V,E) have a time complexity of O(|V|+|E|), so the running time is linear in the size of the input. You can apply some heuristics if you have a very specialized graph, but in general DFS is a fine choice for this. In any case, you have to traverse the whole graph.
Here's your O(V) algorithm:
def hasCycles(G, V, E):
    # V and E are the numbers of vertices and edges of the undirected graph G
    if E>=V:
        return True
    else:
        # here E<V
        # perform O(E+V) = O(V) algorithm
        ...
The ... can be performed with DFS. If you have E<V and edges are stored in a meaningful way (as a list), you can probably do O(E)+logs which would make the whole algorithm O(min(E,V))+logs.
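One way to fill in that step is sketched below, under the assumption that the graph arrives as a vertex count plus a list of `(u, v)` pairs. The name `has_cycle` and this input format are illustrative and differ from the `hasCycles(G, V, E)` signature above.

```python
def has_cycle(num_vertices, edges):
    """Detect a cycle in an undirected graph in O(V) time overall."""
    if len(edges) >= num_vertices:
        return True                  # a forest has at most V - 1 edges
    # From here on E < V, so the traversal below is O(V + E) = O(V).
    adj = [[] for _ in range(num_vertices)]
    for i, (u, v) in enumerate(edges):
        adj[u].append((v, i))
        adj[v].append((u, i))
    visited = [False] * num_vertices
    for root in range(num_vertices):
        if visited[root]:
            continue
        visited[root] = True
        stack = [(root, -1)]         # (vertex, index of the edge used to reach it)
        while stack:
            u, via = stack.pop()
            for v, i in adj[u]:
                if i == via:
                    continue         # skip the edge we arrived by
                if visited[v]:
                    return True      # a second route to v closes a cycle
                visited[v] = True
                stack.append((v, i))
    return False
```

Tracking the edge index rather than the parent vertex means parallel edges and self-loops are reported as cycles too.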
Hope you like this answer, though a bit late!
Testing for the presence of a cycle in a graph G(V,E) using Depth First Search is O(|V|+|E|) if the graph is represented using an adjacency list.
It is necessary to traverse the entire graph to show there are no cycles. If you are simply interested in the presence/absence of a cycle, you can obviously finish at the point a cycle is discovered.
If you have a simple graph, you can calculate the cyclomatic number:
C = E − N + P
Where C is the number of independent cycles, E is the number of edges, N is the number of nodes and P is the number of connected components. The graph contains a cycle exactly when C > 0. If your graph is connected, it is:
C = E - N + 1
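As a small worked sketch of the formula, the function below counts the components P with a union-find pass and then evaluates C = E - N + P. The name `cyclomatic_number` and the `(u, v)` edge-list input are assumptions for the example.

```python
def cyclomatic_number(num_nodes, edges):
    """Return C = E - N + P for an undirected graph."""
    parent = list(range(num_nodes))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    components = num_nodes           # every node starts as its own component
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv          # the edge merges two components
            components -= 1
    return len(edges) - num_nodes + components
```

For a triangle this gives 3 - 3 + 1 = 1, and for a path on three nodes it gives 2 - 3 + 1 = 0.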

Resources