What is the overall Big O run time of Kruskal's algorithm if BFS was used to check whether adding an edge creates a cycle?

If Kruskal's algorithm was implemented using BFS to check whether adding an edge would create a cycle, what would the overall Big-O run time of the algorithm be?

It would be O(V * E + E * log E). Each BFS takes O(V) time because the partial tree has at most V - 1 edges (fewer while it is still being built), and a BFS is run for each of the E edges (V is the number of vertices, E is the number of edges). So the cycle checks take O(V * E) in total. The E * log E term comes from sorting the edges.
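A minimal Python sketch of this variant, assuming the graph is given as (weight, u, v) tuples and vertices are numbered 0..V-1 (the representation and names are illustrative, not from the original post):

```python
from collections import deque

def kruskal_bfs(num_vertices, edges):
    """Kruskal's MST where cycle detection uses BFS on the partial
    forest instead of union-find: O(V) per edge, O(V * E) in total,
    plus O(E log E) for the sort."""
    adj = {v: [] for v in range(num_vertices)}   # partial forest
    mst = []
    for w, u, v in sorted(edges):                # O(E log E) sort
        # BFS from u in the current forest; if v is reachable,
        # adding (u, v) would close a cycle.
        seen = {u}
        queue = deque([u])
        while queue:
            x = queue.popleft()
            for y in adj[x]:
                if y not in seen:
                    seen.add(y)
                    queue.append(y)
        if v not in seen:                        # no cycle: keep the edge
            adj[u].append(v)
            adj[v].append(u)
            mst.append((u, v, w))
    return mst
```

In practice a union-find structure replaces the BFS and brings the per-edge check down to near-constant amortized time.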

Related

What is the worst case time complexity of applying Kruskal algorithm in a loop?

Here is a snippet of my pseudocode to find the MST of each Strongly Connected Component (SCC) of a given graph G:
Number of SCCs, K <- apply Kosaraju's algorithm on graph G    O(V + E)
Loop through the K components:
    each component <- apply Kruskal's algorithm
According to what I have learnt, Kruskal's algorithm runs in O(E log V) time.
However, I am unsure of the worst-case time complexity of the loop. My thought is that the worst case occurs when K = 1, so the Big-O time complexity would simply be O(E log V).
I do not know if my thoughts are correct, or, if they are, what the justification for them is.
Yes, intuitively you're saving the cost of comparing edges in one component with edges in another. Formally, f(V, E) = E log V is superadditive, so f(V1, E1) + f(V2, E2) ≤ f(V1 + V2, E1 + E2), which implies that the cost of handling multiple components separately is never more than handling them together. Of course, as you observe, there may be only one component, in which case there are no savings to be had.
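A quick numeric sanity check of this inequality (the component sizes below are arbitrary sample values; natural log is used, which does not affect the Big-O argument):

```python
import math

def f(V, E):
    """Cost model f(V, E) = E log V; define f = 0 for V <= 1."""
    return E * math.log(V) if V > 1 else 0.0

# For each pair of components, handling them separately should
# never cost more than handling them as one merged component.
cases = [((10, 20), (30, 100)), ((5, 4), (5, 4)), ((100, 500), (2, 1))]
for (V1, E1), (V2, E2) in cases:
    separate = f(V1, E1) + f(V2, E2)
    together = f(V1 + V2, E1 + E2)
    assert separate <= together
```

The inequality holds because log is increasing: each term E_i log V_i only grows when V_i is replaced by the larger V1 + V2.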

Why is the time complexity of Dijkstra O((V + E) log V)?

I was reading about worst case time complexity for the Dijkstra algorithm using binary heap (the graph being represented as adjacency list).
According to wikipedia (https://en.wikipedia.org/wiki/Dijkstra%27s_algorithm#Running_time) and various stackoverflow questions, this is O((V + E) logV) where E - number of edges, V - number of vertices. However I found no explanation as to why it can't be done in O(V + E logV).
With a self-balancing binary search tree or binary heap, the algorithm requires Θ((E+V) logV) time in the worst case
In case E >= V, the complexity reduces to O(E logV) anyway. Otherwise, we have O(E) vertices in the same connected component as the start vertex (and the algorithm ends once we get to them). On each iteration we obtain one of these vertices, taking O(logV) time to remove it from the heap.
Each update of the distance to a connected vertex takes O(logV) and the number of these updates is bound by the number of edges E so in total we do O(E) such updates. Adding O(V) time for initializing distances, we get final complexity O(V + E logV).
Where am I wrong?

Dijkstra Time Complexity using Binary Heap

Let G(V, E) be an undirected graph with positive edge weights. Dijkstra's single source shortest path algorithm can be implemented using the binary heap data structure with time complexity:
1. O(|V|^2)
2. O(|E|+|V|log|V|)
3. O(|V|log|V|)
4. O((|E|+|V|)log|V|)
========================================================================
Correct answer is -
O((|E|+|V|)log|V|)
=========================================================================
My Approach is as follows -
O(V+V+VlogV+ElogV) = O(ElogV)
O(V) to initialize.
O(V) to Build Heap.
VlogV to perform Extract_Min.
ElogV to perform Decrease_Key.
Now, as I get O(ElogV), when I look at the options a part of me says the correct one is O(VlogV), because for a sparse graph |V| = |E|; but as I said, the given correct answer is O((|E|+|V|)log|V|). So, where am I going wrong?
Well, you are correct that the complexity is actually O(E log V).
Since E can be up to (V^2 - V)/2, this is not the same as O(V log V).
If every vertex has an edge, then V <= 2E, so in that case, O(E log V) = O( (E+V) log V). That is the usual case, and corresponds to the "correct" answer.
But technically, O(E log V) is not the same as O( (E+V) log V), because there may be a whole bunch of disconnected vertexes in V. When that is the case, however, Dijkstra's algorithm will never see all those vertexes, since it only finds vertexes connected to the single source. So, when the difference between these two complexities is important, you are right and the "correct answer" is not.
Let me put it this way. The correct answer is O((E+V)logV). If some of the other vertices are not reachable from the source vertex, VlogV could be more than ElogV. But if we assume that every other vertex is reachable from the source, the graph has at least V-1 edges, so it will be ElogV. It is more to do with reachability from the source vertex.
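For reference, a minimal binary-heap Dijkstra in Python, using lazy deletion in place of a true decrease-key (an implementation choice of mine; Python's heapq has no decrease-key operation):

```python
import heapq

def dijkstra(adj, source):
    """Dijkstra with a binary heap and lazy deletion: each edge
    relaxation may push one heap entry, so the heap holds O(E) items
    and every push/pop costs O(log E) = O(log V), giving
    O((V + E) log V) overall."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)           # O(log V) per extraction
        if d > dist.get(u, float("inf")):    # stale entry: skip it
            continue
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))  # "decrease-key" via re-insert
    return dist
```

Note that only vertices reachable from the source ever appear in `dist`, matching the reachability point above.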

Which one is better, O(V+E) or O(ElogE)?

I am trying to develop an algorithm which will be able to find the minimum spanning tree of a graph. I know there are already many existing algorithms for it. However, I am trying to eliminate the sorting of edges required in Kruskal's algorithm. The algorithm I have developed so far has a part where counting of disjoint sets is needed, and I need an efficient method for it. After a lot of study I came to know that the only possible way is using BFS or DFS, which has a complexity of O(V+E), whereas Kruskal's algorithm has a complexity of O(ElogE). Now my question is: which one is better, O(V+E) or O(ElogE)?
In general, E = O(V^2), but that bound may not be tight for all graphs. In particular, in a sparse graph, E = O(V), but for an algorithm complexity is usually stated as a worst-case value.
O(V + E) is a way of indicating that the complexity depends on how many edges there are. In a sparse graph, O(V + E) = O(V + V) = O(V), while in a dense graph O(V + E) = O(V + V^2) = O(V^2).
Another way of looking at it is to see that in big-O notation, O(X + Y) means the same thing as O(max(X, Y)).
Note that this is only useful when V and E might have the same magnitude. For Kruskal's algorithm, the dominating factor is that you need to sort the list of edges. Whether you have a sparse graph or a dense graph, that step dominates anything that might be O(V), so one simply writes O(E lg E) instead of O(V + E lg E).
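A sketch of the O(V + E) component count via BFS discussed above (the graph representation and names are illustrative):

```python
from collections import deque

def count_components(num_vertices, edges):
    """Count connected components with BFS: every vertex and every
    edge is touched a constant number of times, hence O(V + E)."""
    adj = {v: [] for v in range(num_vertices)}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    seen = set()
    components = 0
    for start in range(num_vertices):
        if start in seen:
            continue
        components += 1                 # new, unvisited component found
        seen.add(start)
        queue = deque([start])
        while queue:
            x = queue.popleft()
            for y in adj[x]:
                if y not in seen:
                    seen.add(y)
                    queue.append(y)
    return components
```

This step is asymptotically cheaper than the O(E lg E) sort, which is why the sort remains the dominating term in Kruskal's algorithm.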

Minimum cut over all pairs of vertices in directed and strongly connected graph

I have a graph G that is directed and strongly connected, and I am asked to find a minimum cut over all pairs of vertices, meaning for every pair S and T in the graph. This should be done in O(m^2 × n^2) time.
The best I came up with was to consider every vertex to be S, for each S consider every other vertex to be T, run the Ford-Fulkerson algorithm for each such pair, and then take the minimum cut. But if I am not mistaken, this algorithm will have a complexity of O(m^2 × n^2 × C).
How can I do this task in O(m^2 × n^2) time? Is it even possible?
Notation:
m: number of edges
n: number of nodes
c_max: maximal single edge capacity
C: max flow value
Dinic's algorithm, in combination with a brute-force loop over all source/sink pairs, can be employed to solve the task at hand. It runs in O(m * n^2). The brute-force approach of O(n^2) min cut computations then yields a total of O(m * n^2 * n^2), which is the desired result for m = Θ(n^2). For sparse graphs with m = o(n^2) I couldn't find a definite result; however, for m = O(n) this paper gives a result of O(n^2 + n^4 * log n) = O(m^2 * n^2 * log n).
There are several algorithms to compute the min cut (or, equivalently, the max flow) in a directed graph whose complexities stay within O(m * n^2). Wilf H.S., Algorithms and Complexity, 1st ed., has a survey on page 65. The most accessible algorithm is probably Dinic's (O(m * n^2)).
Though at first glance the Ford-Fulkerson algorithm has a superior time complexity of O(m * C), it sports some serious drawbacks:
The time complexity is only valid for integer edge capacities. In fact, with irrational edge capacities, the algorithm is not even guaranteed to terminate at all nor to converge to the maximum flow ( see this paper for a provably minimal counterexample; this paper is also referenced in the wikipedia article ).
The time complexity depends on the value of the maximum flow.
Significance of the flow value
The max flow value C is not necessarily a function of the number of nodes and edges. Even if it is (which depends on the graph topology), the following observation holds: the maximum possible flow value in any graph is bounded by the number of edges times the maximum edge capacity, m * c_max, which amounts to O(m) if c_max is treated as a constant.
That turns the complexity of Ford-Fulkerson for integer edge capacities into O(m^2), unless the maximum capacity of a single edge is a function of the number of nodes or edges in the graph, which is a non-standard assumption.
For other algorithms there is no such effect, since their execution hinges on the graph topology and the edge capacities relative to each other, but not on the absolute edge capacities (and, by consequence, not on the max flow value either).
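To illustrate the dependence on C, here is a minimal Ford-Fulkerson sketch (DFS augmenting paths; all names are my own), applied to the classic four-node graph where an unlucky sequence of augmenting paths through the middle unit-capacity edge can take up to 2C iterations:

```python
def max_flow_ford_fulkerson(n, capacity_edges, s, t):
    """Ford-Fulkerson with DFS augmenting paths. For integer
    capacities it runs in O(m * C): each augmenting path costs O(m)
    and raises the flow value by at least 1."""
    cap = {}                          # residual capacities
    adj = {u: set() for u in range(n)}
    for u, v, c in capacity_edges:
        cap[(u, v)] = cap.get((u, v), 0) + c
        cap.setdefault((v, u), 0)     # residual back edge
        adj[u].add(v)
        adj[v].add(u)

    def dfs(u, limit, seen):
        if u == t:
            return limit
        seen.add(u)
        for v in adj[u]:
            if v not in seen and cap[(u, v)] > 0:
                pushed = dfs(v, min(limit, cap[(u, v)]), seen)
                if pushed:
                    cap[(u, v)] -= pushed
                    cap[(v, u)] += pushed
                    return pushed
        return 0

    flow = 0
    while True:
        pushed = dfs(s, float("inf"), set())
        if not pushed:
            return flow
        flow += pushed
```

The returned flow value is independent of the DFS path order; only the number of augmenting iterations depends on it, which is exactly the drawback described above.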
