Why is the time complexity of Dijkstra's algorithm O((V + E) logV) and not O((V + E) logE) - algorithm

Suppose we have a graph like this:
5 3
A ───────────► B ───────┐
│ │
│ │
│ │
│ ▼
└─────────────────────► C
10
In this situation. Both B and C are added to the priority queue. When B is explored, C gets added to the queue again, because C is B's neighbor. So essentially, each Vertex is added in to the queue as many times as there are edges going into that vertex. So in the worst case scenario, the priority queue can have E number of items in the queue. Adding and removing items take logE cost. This multiplied by the standard BFS cost of O(V + E) should give us a total cost of O((V + E)logE) but in all textbooks and everywhere the cost is mentioned as O((V + E)logV) why is that?
Even here it says that O(logE) is the same as O(logV) and they claim the total cost is O((V + E)logV). Why is logE the same as logV?

Related

Dropping non-constants in Algorithm Complexity

So, basically I'm implementing an algorithm to calculate distances from one source node to every other node in a weighted graph, and if a node is in a negative cycle, it detects and marks that node as such.
My question regards the total time complexity of my algorithm. Assume V is number of nodes and E the number of edges.
The algorithm starts by asking E lines of input to specify the Edges of the graph and inserts it in the corresponding adjacency list. Such operation is O(E)
I apply the Bellman-Ford algorithm V-1 times to know the distances and then I apply the algorithm V-1 times once again to detect the Nodes in a negative cycle. This is 2 * O(VE) = O(VE).
I print a distances vector with the size V to display the distances and/or wether the node is in a negative cycle or not. O(V).
So I guess my total complexity would be O(VE + V + E). Now my question is: Since VE is almost always bigger than V+E (for large numbers, it's always!), can I drop the V+E in the complexity and make it simply O(VE)?
Yes, O(VE + V + E) simplifies to O(VE) given that V and E represent the number of vertices and edges in a graph. For a highly connected graph, E = O(V^2) and so in that case VE + V + E = O(V^3) = O(VE). For a sparse graph, E = O(V) (note, this is not necessarily a tight upper bound) and so VE + V + E = O(V^2) = O(VE). In all cases O(VE) is an appropriate upper bound on the complexity.
Yes, when dealing with asymptotic complexity, you always assume that V and E are very large (in theory, you study complexity by calculating limits when V and E approach infinity). Pretty much the same why you can write n^2 + n = O(n^2), in your case VE + V + E is O(VE).
Note that the worst-case complexity of Bellman-Ford actually is O(VE), which confirms that your reasoning is correct.

Breadth First Search time complexity analysis

The time complexity to go over each adjacent edge of a vertex is, say, O(N), where N is number of adjacent edges. So, for V numbers of vertices the time complexity becomes O(V*N) = O(E), where E is the total number of edges in the graph. Since removing and adding a vertex from/to a queue is O(1), why is it added to the overall time complexity of BFS as O(V+E)?
I hope this is helpful to anybody having trouble understanding computational time complexity for Breadth First Search a.k.a BFS.
Queue graphTraversal.add(firstVertex);
// This while loop will run V times, where V is total number of vertices in graph.
while(graphTraversal.isEmpty == false)
currentVertex = graphTraversal.getVertex();
// This while loop will run Eaj times, where Eaj is number of adjacent edges to current vertex.
while(currentVertex.hasAdjacentVertices)
graphTraversal.add(adjacentVertex);
graphTraversal.remove(currentVertex);
Time complexity is as follows:
V * (O(1) + O(Eaj) + O(1))
V + V * Eaj + V
2V + E(total number of edges in graph)
V + E
I have tried to simplify the code and complexity computation but still if you have any questions let me know.
Considering the following Graph we see how the time complexity is O(|V|+|E|) but not O(V*E).
Adjacency List
V E
v0:{v1,v2}
v1:{v3}
v2:{v3}
v3:{}
Operating How BFS Works Step by Step
Step1:
Adjacency lists:
V E
v0: {v1,v2} mark, enqueue v0
v1: {v3}
v2: {v3}
v3: {}
Step2:
Adjacency lists:
V E
v0: {v1,v2} dequeue v0;mark, enqueue v1,v2
v1: {v3}
v2: {v3}
v3: {}
Step3:
Adjacency lists:
V E
v0: {v1,v2}
v1: {v3} dequeue v1; mark,enqueue v3
v2: {v3}
v3: {}
Step4:
Adjacency lists:
V E
v0: {v1,v2}
v1: {v3}
v2: {v3} dequeue v2, check its adjacency list (v3 already marked)
v3: {}
Step5:
Adjacency lists:
V E
v0: {v1,v2}
v1: {v3}
v2: {v3}
v3: {} dequeue v3; check its adjacency list
Step6:
Adjacency lists:
V E
v0: {v1,v2} |E0|=2
v1: {v3} |E1|=1
v2: {v3} |E2|=1
v3: {} |E3|=0
Total number of steps:
|V| + |E0| + |E1| + |E2| +|E3| == |V|+|E|
4 + 2 + 1 + 1 + 0 == 4 + 4
8 == 8
Assume an adjacency list representation, V is the number of vertices, E the number of edges.
Each vertex is enqueued and dequeued at most once.
Scanning for all adjacent vertices takes O(|E|) time, since sum of lengths of adjacency lists is |E|.
Hence The Time Complexity of BFS Gives a O(|V|+|E|) time complexity.
The other answers here do a great job showing how BFS runs and how to analyze it. I wanted to revisit your original mathematical analysis to show where, specifically, your reasoning gives you a lower estimate than the true value.
Your analysis goes like this:
Let N be the average number of edges incident to each node (N = E / V).
Each node, therefore, spends O(N) time doing operations on the queue.
Since there are V nodes, the total runtime is the O(V) · O(N) = O(V) · O(E / V) = O(E).
You are very close to having the right estimate here. The question is where the missing V term comes from. The issue here is that, weirdly enough, you can't say that O(V) · O(E / V) = O(E).
You are totally correct that the average work per node is O(E / V). That means that the total work done asympotically is bounded from above by some multiple of E / V. If we think about what BFS is actually doing, the work done per node probably looks more like c1 + c2E / V, since there's some baseline amount of work done per node (setting up loops, checking basic conditions, etc.), which is what's accounted for by the c1 term, plus some amount of work proportional to the number of edges visited (E / V, times the work done per edge). If we multiply this by V, we get that
V · (c1 + c2E / V)
= c1V + c2E
= Θ(V + E)
What's happening here is that those lovely lower-order terms that big-O so conveniently lets us ignore are actually important here, so we can't easily discard them. So that's mathematically at least what's going on.
What's actually happening here is that no matter how many edges there are in the graph, there's some baseline amount of work you have to do for each node independently of those edges. That's the setup to do things like run the core if statements, set up local variables, etc.
Performing an O(1) operation L times, results to O(L) complexity.
Thus, removing and adding a vertex from/to the Queue is O(1), but when you do that for V vertices, you get O(V) complexity.
Therefore, O(V) + O(E) = O(V+E)
Will the time complexity of BFS be not O(V) considering we only have to traverse the vertices in the adjacency list? Am I missing something here?
For the below graph represented using adjacency list for ex:
0 ->1->2->null
1->3->4->null
3->null
4->null
While creating the graph we have the adjacency list which is an array of linked lists. So my understanding is during traversal this array is available to us and it's enough if we only traverse all the elements of this array and mark each element as visited to not visit it twice. What am I missing here?
I would just like to add to above answers that if we are using an adjacency matrix instead of a adjacency list, the time complexity will be O(V^2), as we will have to go through a complete row for each vertex to check which nodes are adjacent.
You are saying that total complexity should be O(V*N)=O(E). Suppose there is no edge between any pair of vertices i.e. Adj[v] is empty for all vertex v. Will BFS take a constant time in this case? Answer is no. It will take O(V) time(more accurately θ(V)). Even if Adj[v] is empty, running the line where you check Adj[v] will itself take some constant time for each vertex. So running time of BFS is O(V+E) which means O(max(V,E)).
One of the ways that I grasped the intuition of the time complexity
O ( V + E)
is that when we traverse the graph (let's take BFS pseudocode in Java):
for(v:V){ // segment 1
if(!v.isVisited) {
q = new Queue<>();
q.add(v);
v.isVisited = true
while(!q.isEmpty) {
curr = q.poll()
for(u: curr.adjacencyList ){ //Segment 2
//do some processing
u.isVisited = true
}
}
}
}
As, we can see there are two important segments 1 and 2 which determines the time complexity.
Case 1: Consider a graph with only vertices and a few edges, sparsely connected graph (100 vertices and 2 edges).
In that case, the segment 1 would dominate the course of traversal.
Hence making, O(V) as the time complexity as segment 1 checks all vertices in graph space once.
Therefore, T.C. = O(V) (since E is negligible).
Case 2: Consider a graph with few vertices but a complete graph (6 vertices and 15 edges) (n C 2).
Here the segment 2 will dominate as the number of edges are more and the segment 2 gets evaluated 2|E| times for an undirected graph.
T.C. of first vertex processing would be,
O(1) * O(2|E|) = O(E)
The rest of the vertex will not be evaluated for the segment 1 and would just add V-1 times of processing (since they are already visited in segment 2 which is O(V).
Thus, in this case its better to say T.C. = O(E) + O(V)
So, in the worst/best case of number of edges, we have
TC(taversing) O(E) + O(V) or
= O(E+V)

Understanding Time complexity calculation for Dijkstra Algorithm

As per my understanding, I have calculated time complexity of Dijkstra Algorithm as big-O notation using adjacency list given below. It didn't come out as it was supposed to and that led me to understand it step by step.
Each vertex can be connected to (V-1) vertices, hence the number of adjacent edges to each vertex is V - 1. Let us say E represents V-1 edges connected to each vertex.
Finding & Updating each adjacent vertex's weight in min heap is O(log(V)) + O(1) or O(log(V)).
Hence from step1 and step2 above, the time complexity for updating all adjacent vertices of a vertex is E*(logV). or E*logV.
Hence time complexity for all V vertices is V * (E*logV) i.e O(VElogV).
But the time complexity for Dijkstra Algorithm is O(ElogV). Why?
Dijkstra's shortest path algorithm is O(ElogV) where:
V is the number of vertices
E is the total number of edges
Your analysis is correct, but your symbols have different meanings! You say the algorithm is O(VElogV) where:
V is the number of vertices
E is the maximum number of edges attached to a single node.
Let's rename your E to N. So one analysis says O(ElogV) and another says O(VNlogV). Both are correct and in fact E = O(VN). The difference is that ElogV is a tighter estimation.
Adding a more detailed explanation as I understood it just in case:
O(for each vertex using min heap: for each edge linearly: push vertices to min heap that edge points to)
V = number of vertices
O(V * (pop vertex from min heap + find unvisited vertices in edges * push them to min heap))
E = number of edges on each vertex
O(V * (pop vertex from min heap + E * push unvisited vertices to min heap)). Note, that we can push the same node multiple times here before we get to "visit" it.
O(V * (log(heap size) + E * log(heap size)))
O(V * ((E + 1) * log(heap size)))
O(V * (E * log(heap size)))
E = V because each vertex can reference all other vertices
O(V * (V * log(heap size)))
O(V^2 * log(heap size))
heap size is V^2 because we push to it every time we want to update a distance and can have up to V comparisons for each vertex. E.g. for the last vertex, 1st vertex has distance 10, 2nd has 9, 3rd has 8, etc, so, we push each time to update
O(V^2 * log(V^2))
O(V^2 * 2 * log(V))
O(V^2 * log(V))
V^2 is also a total number of edges, so if we let E = V^2 (as in the official naming), we will get the O(E * log(V))
let n be the number of vertices and m be the number of edges.
Since with Dijkstra's algorithm you have O(n) delete-mins and O(m) decrease_keys, each costing O(logn), the total run time using binary heaps will be O(log(n)(m + n)). It is totally possible to amortize the cost of decrease_key down to O(1) using Fibonacci heaps resulting in a total run time of O(nlogn+m) but in practice this is often not done since the constant factor penalties of FHs are pretty big and on random graphs the amount of decrease_keys is way lower than its respective upper bound (more in the range of O(n*log(m/n), which is way better on sparse graphs where m = O(n)). So always be aware of the fact that the total run time is both dependent on your data structures and the input class.
In dense(or complete) graph, E logV > V^2
Using linked data & binary heap is not always best.
That case, I prefer to use just matrix data and save minimum length by row.
Just V^2 time needed.
In case, E < V / logV.
Or, max edges per vertex is less than some constant K.
Then use binary heap.
I find it easier to think at this complexity in the following way:
The nodes are first inserted in a priority queue and the extracted one by one leading to O(V log V).
Once a node is extracted, we iterate through its edges and update the priority queue accordingly. Note that every edge is explored only once, moreover, updating the priority queue is O(log V), leading to an overall O(E log V).
TLDR. You have V extractions from the priority queue and E updates to the priority queue, leading to an overall O((V + E) log V).
Let's try to analyze the algorithm as given in CLRS book.
for each loop in line 7: for any vertex say 'u' the number of times the loop runs is equal to the number of adjacent vertices of 'u'.
The number of adjacent vertices for a node is always less than or equal to the total number of edges in the graph.
If we take V (because of while loop in line 4) and E (because of for each in line 7) and compute the complexity as VElog(V) it would be equivalent to assuming each vertex has E edges incident on it, but in actual there will be atmost or less than E edges incident on a single vertex. (the atmost E adjacent vertices for a single vertex case happens in case of a star graph for the internal vertex)
V:Number of Vertices,
E:Number of total_edges
Suppose the Graph is dense
The complexity would be O(V*logV) + O( (1+2+...+V)*logV)
1+2+....+(V-1) = (v)*(v+1)/2 ~ V^2 ~ E because the graph is dense
So the complexity would be O(ElogV).
The sum 1+2+...+ V refers to: For each vertex v in G.adj[u] but not in S
If you think about Q before a vertex is extracted has V vertices then it has V-1 then V-2
... then 1.
E is edges and V is vertices. Number of edges
(V *(V-1)) / 2
approximately
V ^ 2
So we can add maximum V^2 edges to the min heap. So sorting the elements in min heap will take
O(Log(V ^ 2))
Every time we insert a new element into min heap, we are going to sort. We will have E edges so we will be sorting E times. so total time complexity
O(E * Log(V ^ 2)= O( 2 * E * Log(V))
Omitting the constant 2:
O( E * Log(V))

Tree Search algorithm (remove front node from the fringe/queue – goal test – expand)

How many nodes are visited (chosen from the queue) in the worst case using Breadth-First search, when the solution is at depth d, and the branching factor is b, and the depth of the maximum branch is m?
Give a formula.
How many nodes are generated (added to the queue as a result of expanding the parent) in the worst case using Breadth-First search, when the solution is at depth d, and the branching factor is b, and the depth of the maximum branch is m?
Give a formula.
What is the minimum possible size of the queue using Depth-First search, when the solution is at depth d, and the branching factor is b, and the depth of the maximum branch is m? Explain your answer on a small search tree.
1 + b + b2 + ... + bd that is O(bd)
1 + b + b2 + ... + bd+1 - b that is O(bd+1)
b * m
Note: fringe in the DFS is stack not queue (or you may call it a LIFO queue).
1, 2 : look at the following figure as an example with b = 3 in which I've shown the goal state by a red circle.
For this tree all the nodes in the purple box get visited while all of these nodes + nodes in the orange box get added to the fringe.
3 : In the following figure all the nodes inside the circuit (really poor designed ;D ) get added to the fringe.

why E dominates v?

I analyzed the running time for Kruskal algorithm and I come up with O(ElogE+Elogv+v)
I asked my prof and he said that if the graph is very sparse with many isolated vertices V dominates E which makes sense if not then E dominates V and I can not understand why?
I can give an example where graph is not sparse but still V is greater than E
Can anyone help me to clear this confusion?
A tree in a undirectional graph has |V|-1 edges.
Since a tree is the connected component with least edges as possible - it basically means that for each connected undirectional graph, |E| is in Omega(|V|), so |V| is dominated by |E|.
This basically means that if |E| < |V|-1 - the graph is not connected.
Now, since Kruskal algorithm is designed to find a spanning tree, you can abort the algorithm once you have found |E| < |V|-1 - there is no spanning tree at all, no point to look for one.
From this we conclude that when |E| < |V|-1, there is no point in discussing complexity of Kruskal Algorithm, and we can safely assume that |E| >= |V| -1 , so |V| is dominated by |E|.
Density = number of edges / number of possible edges = E / (V(V-1))/2
Let the graph be a tree E = V - 1
So V = (E + 1)
And Kruskal's complexity is
O(E log E + E log V + V) = O(E log E + E log (E + 1) + (E + 1)) = O( E log E )
So E dominates. E will dominate as long as E = O(V).

Resources