Breadth First Search time complexity analysis - algorithm

The time complexity to go over each adjacent edge of a vertex is, say, O(N), where N is number of adjacent edges. So, for V numbers of vertices the time complexity becomes O(V*N) = O(E), where E is the total number of edges in the graph. Since removing and adding a vertex from/to a queue is O(1), why is it added to the overall time complexity of BFS as O(V+E)?

I hope this is helpful to anybody having trouble understanding computational time complexity for Breadth First Search a.k.a BFS.
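Queue graphTraversal = new Queue();
graphTraversal.add(firstVertex);
// This while loop will run V times, where V is the total number of vertices in the graph.
while (graphTraversal.isEmpty() == false) {
    currentVertex = graphTraversal.getVertex();    // O(1)
    // This inner loop will run Eaj times, where Eaj is the number of edges adjacent to the current vertex.
    for each adjacentVertex of currentVertex:
        if adjacentVertex is not visited:
            mark adjacentVertex as visited and add it to graphTraversal    // O(1)
}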
Time complexity is as follows:
V * (O(1) + O(Eaj) + O(1))
= V + V * Eaj + V
= 2V + E (total number of edges in graph)
= O(V + E)
I have tried to simplify the code and complexity computation but still if you have any questions let me know.

Considering the following Graph we see how the time complexity is O(|V|+|E|) but not O(V*E).
How BFS works, step by step (adjacency list representation)
Adjacency lists:
v0: {v1,v2} mark, enqueue v0
v1: {v3}
v2: {v3}
v3: {}
Adjacency lists:
v0: {v1,v2} dequeue v0;mark, enqueue v1,v2
v1: {v3}
v2: {v3}
v3: {}
Adjacency lists:
v0: {v1,v2}
v1: {v3} dequeue v1; mark,enqueue v3
v2: {v3}
v3: {}
Adjacency lists:
v0: {v1,v2}
v1: {v3}
v2: {v3} dequeue v2, check its adjacency list (v3 already marked)
v3: {}
Adjacency lists:
v0: {v1,v2}
v1: {v3}
v2: {v3}
v3: {} dequeue v3; check its adjacency list
Adjacency lists:
v0: {v1,v2} |E0|=2
v1: {v3} |E1|=1
v2: {v3} |E2|=1
v3: {} |E3|=0
Total number of steps:
|V| + |E0| + |E1| + |E2| +|E3| == |V|+|E|
4 + 2 + 1 + 1 + 0 == 4 + 4
8 == 8
Assume an adjacency list representation, V is the number of vertices, E the number of edges.
Each vertex is enqueued and dequeued at most once.
Scanning for all adjacent vertices takes O(|E|) time, since sum of lengths of adjacency lists is |E|.
Hence BFS runs in O(|V|+|E|) time.
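The step count above can be checked mechanically. The following is a minimal sketch in Python, not part of the original answer; the function name bfs_step_count is made up for illustration, and it counts one step per dequeue plus one step per adjacency-list entry scanned.

```python
from collections import deque

def bfs_step_count(adj, start):
    """Return (vertices dequeued, adjacency entries scanned) for a BFS from start."""
    visited = {start}
    queue = deque([start])
    dequeued = 0
    scanned = 0
    while queue:
        v = queue.popleft()
        dequeued += 1
        for w in adj[v]:
            scanned += 1              # one step per adjacency-list entry
            if w not in visited:
                visited.add(w)
                queue.append(w)
    return dequeued, scanned

# The four-vertex graph from the walkthrough above.
adj = {"v0": ["v1", "v2"], "v1": ["v3"], "v2": ["v3"], "v3": []}
dequeued, scanned = bfs_step_count(adj, "v0")
print(dequeued, scanned, dequeued + scanned)  # 4 4 8
```

The two counters correspond to the |V| and |E| terms: each vertex is dequeued once, and each adjacency list is scanned once.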

The other answers here do a great job showing how BFS runs and how to analyze it. I wanted to revisit your original mathematical analysis to show where, specifically, your reasoning gives you a lower estimate than the true value.
Your analysis goes like this:
Let N be the average number of edges incident to each node (N = E / V).
Each node, therefore, spends O(N) time doing operations on the queue.
Since there are V nodes, the total runtime is the O(V) · O(N) = O(V) · O(E / V) = O(E).
You are very close to having the right estimate here. The question is where the missing V term comes from. The issue here is that, weirdly enough, you can't say that O(V) · O(E / V) = O(E).
You are totally correct that the average work per node is O(E / V). That means that the work done per node is asymptotically bounded from above by some multiple of E / V. If we think about what BFS is actually doing, the work done per node probably looks more like c1 + c2 · E / V, since there's some baseline amount of work done per node (setting up loops, checking basic conditions, etc.), which is what's accounted for by the c1 term, plus some amount of work proportional to the number of edges visited (E / V, times the work done per edge). If we multiply this by V, we get that
V · (c1 + c2E / V)
= c1V + c2E
= Θ(V + E)
What's happening here is that those lovely lower-order terms that big-O so conveniently lets us ignore are actually important here, so we can't easily discard them. So that's mathematically at least what's going on.
What's actually happening here is that no matter how many edges there are in the graph, there's some baseline amount of work you have to do for each node independently of those edges. That's the setup to do things like run the core if statements, set up local variables, etc.

Performing an O(1) operation L times results in O(L) complexity.
Thus, removing and adding a vertex from/to the Queue is O(1), but when you do that for V vertices, you get O(V) complexity.
Therefore, O(V) + O(E) = O(V+E)

Will the time complexity of BFS be not O(V) considering we only have to traverse the vertices in the adjacency list? Am I missing something here?
For the below graph represented using adjacency list for ex:
0 ->1->2->null
While creating the graph we have the adjacency list which is an array of linked lists. So my understanding is during traversal this array is available to us and it's enough if we only traverse all the elements of this array and mark each element as visited to not visit it twice. What am I missing here?

I would just like to add to the above answers that if we are using an adjacency matrix instead of an adjacency list, the time complexity will be O(V^2), as we will have to go through a complete row for each vertex to check which nodes are adjacent.
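To make the matrix case concrete, here is a small Python sketch (an illustration only; the function name and the example graph are assumptions, not taken from the thread). It counts how many matrix cells a BFS inspects when the graph is stored as a V x V matrix.

```python
from collections import deque

def bfs_matrix_cells_scanned(matrix, start):
    """BFS on an adjacency matrix; returns the number of cells inspected."""
    n = len(matrix)
    visited = [False] * n
    visited[start] = True
    queue = deque([start])
    cells = 0
    while queue:
        v = queue.popleft()
        for w in range(n):            # the whole row is read, edge or not
            cells += 1
            if matrix[v][w] and not visited[w]:
                visited[w] = True
                queue.append(w)
    return cells

# A path 0 - 1 - 2 - 3 has only 3 edges, but every row is scanned in full.
path = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
print(bfs_matrix_cells_scanned(path, 0))  # 16
```

With V = 4 reachable vertices the scan touches V * V = 16 cells, regardless of how few edges exist.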

You are saying that total complexity should be O(V*N)=O(E). Suppose there is no edge between any pair of vertices, i.e. Adj[v] is empty for every vertex v. Will BFS take constant time in this case? The answer is no. It will take O(V) time (more accurately Θ(V)). Even if Adj[v] is empty, running the line where you check Adj[v] will itself take some constant time for each vertex. So the running time of BFS is O(V+E), which is the same as O(max(V,E)).

One of the ways that I grasped the intuition of the time complexity
O ( V + E)
is that when we traverse the graph (let's take BFS pseudocode in Java):
for (v : V) {                              // segment 1
    if (!v.isVisited) {
        q = new Queue<>();
        q.add(v);
        v.isVisited = true;
        while (!q.isEmpty()) {
            curr = q.poll();
            for (u : curr.adjacencyList) { // segment 2
                if (!u.isVisited) {
                    // do some processing
                    u.isVisited = true;
                    q.add(u);
                }
            }
        }
    }
}
As we can see, there are two important segments, 1 and 2, which determine the time complexity.
Case 1: Consider a sparsely connected graph with many vertices and only a few edges (100 vertices and 2 edges).
In that case, segment 1 dominates the traversal.
This makes the time complexity O(V), since segment 1 checks every vertex in the graph once.
Therefore, T.C. = O(V) (since E is negligible).
Case 2: Consider a complete graph with few vertices (6 vertices and 15 edges, i.e. n C 2).
Here segment 2 dominates, since there are more edges than vertices, and segment 2 is evaluated 2|E| times for an undirected graph.
The T.C. of processing the first vertex would be
O(1) * O(2|E|) = O(E)
The remaining vertices do not enter the body of segment 1 (they were already marked visited in segment 2); segment 1 only adds V-1 visited checks, which is O(V).
Thus, in this case it is better to say T.C. = O(E) + O(V)
So, whether the number of edges is at its lowest or its highest, we have
T.C. (traversing) = O(E) + O(V)
= O(E+V)
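The two cases can be tried out with a short Python sketch. This is only an illustration under assumptions: the function name traverse_all and the concrete edge choices for the sparse graph are made up here. It counts how often the outer loop (segment 1) and the inner adjacency loop (segment 2) run.

```python
from collections import deque

def traverse_all(adj):
    """BFS over every component; returns (segment 1 checks, segment 2 iterations)."""
    visited = set()
    segment1 = segment2 = 0
    for s in adj:                      # segment 1: one check per vertex
        segment1 += 1
        if s in visited:
            continue
        visited.add(s)
        queue = deque([s])
        while queue:
            curr = queue.popleft()
            for u in adj[curr]:        # segment 2: one iteration per adjacency entry
                segment2 += 1
                if u not in visited:
                    visited.add(u)
                    queue.append(u)
    return segment1, segment2

# Case 1: 100 vertices and 2 undirected edges (0-1 and 2-3).
sparse = {v: [] for v in range(100)}
for a, b in [(0, 1), (2, 3)]:
    sparse[a].append(b)
    sparse[b].append(a)

# Case 2: complete graph on 6 vertices (15 undirected edges).
complete = {u: [v for v in range(6) if v != u] for u in range(6)}

print(traverse_all(sparse))    # (100, 4)
print(traverse_all(complete))  # (6, 30)
```

In the sparse case the vertex term dominates (100 versus 4), and in the complete case the edge term dominates (30 = 2|E| versus 6), which is why both terms are kept.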


Find the N highest cost vertices that has a path to S, where S is a vertex in an undirected Graph G

I would like to know, what would be the most efficient way (w.r.t., Space and Time) to solve the following problem:
Given an undirected Graph G = (V, E), a positive number N and a vertex S in V. Assume that every vertex in V has a cost value. Find the N highest cost vertices that are connected to S.
For example:
G = (V, E)
V = {v1, v2, v3, v4},
E = {(v1, v2),
(v1, v3),
(v2, v4),
(v3, v4)}
v1 cost = 1
v2 cost = 2
v3 cost = 3
v4 cost = 4
N = 2, S = v1
result: {v3, v4}
This problem can be solved easily by a graph traversal algorithm (e.g., BFS or DFS). To find the vertices connected to S, we can run either BFS or DFS starting from S. As the space and time complexity of BFS and DFS are the same (i.e., time complexity: O(V+E), space complexity: O(E)), here I am going to show the pseudocode using DFS:
Parameter Definition:
* G -> Graph
* S -> Starting node
* N -> Number of connected (highest cost) vertices to find
* Cost -> Array of size V, contains the vertex cost value
procedure DFS-traversal(G, S, N, Cost):
    let St be a stack
    let Q be a min-priority-queue containing <cost, vertex-id> pairs
    let discovered be an array (of size V) to mark already visited vertices
    St.push(S)
    // Comment: if you do not want to consider the case "S is connected to S"
    // then you can comment out the following line
    Q.push(make-pair(Cost[S], S))
    label S as discovered
    while St is not empty
        v = St.pop()
        for all edges from v to w in G.adjacentEdges(v) do
            if w is not labeled as discovered:
                label w as discovered
                St.push(w)
                Q.push(make-pair(Cost[w], w))
                if Q.size() == N + 1:
                    Q.pop()    // drop the current lowest-cost vertex
    let ret be an N sized array
    while Q is not empty:
        ret.append(Q.top().vertex-id)
        Q.pop()
    return ret
Let's first describe the process. Here, I run the iterative version of DFS to traverse the graph starting from S. During the traversal, I use a priority queue to keep the N highest cost vertices that are reachable from S. Instead of the priority queue, we can use a simple array (or even reuse the discovered array) to keep a record of the reachable vertices with their costs.
Analysis of space-complexity:
To store the graph: O(E)
Priority-queue: O(N)
Stack: O(V)
For labeling discovered: O(V)
So, as O(E) is the dominating term here, we can consider O(E) as the overall space complexity.
Analysis of time-complexity:
DFS-traversal: O(V+E)
To track N highest cost vertices:
By maintaining priority-queue: O(V*logN)
Or alternatively using array: O(V*logV)
The overall time-complexity would be: O(V*logN + E) or O(V*logV + E)
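For reference, the priority-queue variant could look roughly like this in Python. It is a sketch under assumptions, not the answer's own code: the function name top_n_reachable and the dict-based graph are chosen here for illustration, S itself is counted as connected to S, and heapq serves as the min-priority-queue.

```python
import heapq

def top_n_reachable(adj, cost, s, n):
    """Return the n highest-cost vertices reachable from s (s included), sorted by name."""
    heap = []                                  # min-heap of (cost, vertex), size <= n

    def offer(v):
        heapq.heappush(heap, (cost[v], v))     # O(log n)
        if len(heap) > n:
            heapq.heappop(heap)                # evict the current cheapest vertex

    discovered = {s}
    stack = [s]
    offer(s)
    while stack:
        v = stack.pop()
        for w in adj[v]:
            if w not in discovered:
                discovered.add(w)
                stack.append(w)
                offer(w)
    return sorted(v for _, v in heap)

# The example from the question.
adj = {"v1": ["v2", "v3"], "v2": ["v1", "v4"], "v3": ["v1", "v4"], "v4": ["v2", "v3"]}
cost = {"v1": 1, "v2": 2, "v3": 3, "v4": 4}
print(top_n_reachable(adj, cost, "v1", 2))  # ['v3', 'v4']
```

Because the heap never holds more than N + 1 entries, each of the at most V pushes costs O(log N), which matches the O(V*logN + E) bound above.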

Understanding Time complexity calculation for Dijkstra Algorithm

As per my understanding, I have calculated time complexity of Dijkstra Algorithm as big-O notation using adjacency list given below. It didn't come out as it was supposed to and that led me to understand it step by step.
1. Each vertex can be connected to (V-1) vertices, hence the number of adjacent edges to each vertex is V - 1. Let us say E represents V-1 edges connected to each vertex.
2. Finding & updating each adjacent vertex's weight in the min heap is O(log(V)) + O(1) or O(log(V)).
3. Hence from step 1 and step 2 above, the time complexity for updating all adjacent vertices of a vertex is E*logV.
4. Hence time complexity for all V vertices is V * (E*logV), i.e. O(VElogV).
But the time complexity for Dijkstra Algorithm is O(ElogV). Why?
Dijkstra's shortest path algorithm is O(ElogV) where:
V is the number of vertices
E is the total number of edges
Your analysis is correct, but your symbols have different meanings! You say the algorithm is O(VElogV) where:
V is the number of vertices
E is the maximum number of edges attached to a single node.
Let's rename your E to N. So one analysis says O(ElogV) and another says O(VNlogV). Both are correct and in fact E = O(VN). The difference is that ElogV is a tighter estimation.
Adding a more detailed explanation as I understood it just in case:
O(for each vertex using min heap: for each edge linearly: push vertices to min heap that edge points to)
V = number of vertices
O(V * (pop vertex from min heap + find unvisited vertices in edges * push them to min heap))
E = number of edges on each vertex
O(V * (pop vertex from min heap + E * push unvisited vertices to min heap)). Note, that we can push the same node multiple times here before we get to "visit" it.
O(V * (log(heap size) + E * log(heap size)))
O(V * ((E + 1) * log(heap size)))
O(V * (E * log(heap size)))
E = V because each vertex can reference all other vertices
O(V * (V * log(heap size)))
O(V^2 * log(heap size))
heap size is V^2 because we push to it every time we want to update a distance and can have up to V comparisons for each vertex. E.g. for the last vertex, 1st vertex has distance 10, 2nd has 9, 3rd has 8, etc, so, we push each time to update
O(V^2 * log(V^2))
O(V^2 * 2 * log(V))
O(V^2 * log(V))
V^2 is also a total number of edges, so if we let E = V^2 (as in the official naming), we will get the O(E * log(V))
let n be the number of vertices and m be the number of edges.
Since with Dijkstra's algorithm you have O(n) delete-mins and O(m) decrease_keys, each costing O(logn), the total run time using binary heaps will be O(log(n)(m + n)). It is totally possible to amortize the cost of decrease_key down to O(1) using Fibonacci heaps, resulting in a total run time of O(nlogn+m), but in practice this is often not done since the constant factor penalties of FHs are pretty big and on random graphs the amount of decrease_keys is way lower than its respective upper bound (more in the range of O(n*log(m/n)), which is way better on sparse graphs where m = O(n)). So always be aware of the fact that the total run time is both dependent on your data structures and the input class.
In a dense (or complete) graph, E log V > V^2.
So linked adjacency data with a binary heap is not always the best choice.
In that case, I prefer to use a plain matrix and keep the current minimum length for each row.
Then only V^2 time is needed.
If instead E < V^2 / log V,
or the maximum number of edges per vertex is less than some constant K,
then use a binary heap.
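A rough Python sketch of the matrix-based variant mentioned above, for illustration only. The function name dijkstra_dense is made up, and using math.inf to mark a missing edge is an assumption of this sketch. Each of the V rounds does one O(V) scan for the minimum and one O(V) row scan, giving O(V^2) overall.

```python
import math

def dijkstra_dense(weight, source):
    """weight is a V x V matrix; math.inf marks a missing edge. Returns distances."""
    n = len(weight)
    dist = [math.inf] * n
    dist[source] = 0
    done = [False] * n
    for _ in range(n):
        # O(V) scan for the closest vertex that is not finished yet
        u = -1
        for v in range(n):
            if not done[v] and (u == -1 or dist[v] < dist[u]):
                u = v
        if dist[u] == math.inf:
            break                      # remaining vertices are unreachable
        done[u] = True
        # O(V) scan of row u to relax its outgoing edges
        for v in range(n):
            if dist[u] + weight[u][v] < dist[v]:
                dist[v] = dist[u] + weight[u][v]
    return dist

INF = math.inf
# Vertices 0..3 with edges 0->1 (4), 0->2 (1), 1->3 (1), 2->1 (2), 2->3 (5).
weight = [
    [INF, 4, 1, INF],
    [INF, INF, INF, 1],
    [INF, 2, INF, 5],
    [INF, INF, INF, INF],
]
print(dijkstra_dense(weight, 0))  # [0, 3, 1, 4]
```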
I find it easier to think at this complexity in the following way:
The nodes are first inserted in a priority queue and then extracted one by one, leading to O(V log V).
Once a node is extracted, we iterate through its edges and update the priority queue accordingly. Note that every edge is explored only once, moreover, updating the priority queue is O(log V), leading to an overall O(E log V).
TLDR. You have V extractions from the priority queue and E updates to the priority queue, leading to an overall O((V + E) log V).
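As an illustration of the extraction and update counts, here is a small Python sketch using heapq. It is not the poster's code, and the names are chosen here. One assumption to note: heapq has no decrease-key operation, so this sketch pushes a fresh entry instead and skips stale ones when they are popped. The heap can then hold up to E entries, but log E <= 2 log V, so the bound stays O((V + E) log V).

```python
import heapq

def dijkstra(adj, source):
    """adj maps vertex -> list of (neighbor, weight); returns shortest distances."""
    dist = {source: 0}
    done = set()
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)            # extraction: O(log heap size)
        if u in done:
            continue                          # stale entry left by an earlier update
        done.add(u)
        for v, w in adj[u]:                   # each adjacency list is scanned once
            nd = d + w
            if v not in dist or nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v)) # update: O(log heap size)
    return dist

adj = {
    "a": [("b", 4), ("c", 1)],
    "b": [("d", 1)],
    "c": [("b", 2), ("d", 5)],
    "d": [],
}
print(dijkstra(adj, "a"))  # {'a': 0, 'b': 3, 'c': 1, 'd': 4}
```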
Let's try to analyze the algorithm as given in CLRS book.
for each loop in line 7: for any vertex say 'u' the number of times the loop runs is equal to the number of adjacent vertices of 'u'.
The number of adjacent vertices for a node is always less than or equal to the total number of edges in the graph.
If we take V (because of the while loop in line 4) and E (because of the for each loop in line 7) and compute the complexity as VElog(V), that would be equivalent to assuming each vertex has E edges incident on it, but in reality at most E edges are incident on a single vertex. (The case of E adjacent vertices for a single vertex happens for the internal vertex of a star graph.)
V:Number of Vertices,
E:Number of total_edges
Suppose the Graph is dense
The complexity would be O(V*logV) + O( (1+2+...+V)*logV)
1+2+...+V = V*(V+1)/2 ~ V^2 ~ E because the graph is dense
So the complexity would be O(ElogV).
The sum 1+2+...+ V refers to: For each vertex v in G.adj[u] but not in S
If you think about Q, before a vertex is extracted it has V vertices, then it has V-1, then V-2, ... then 1.
E is the number of edges and V is the number of vertices. The number of edges is at most
(V * (V-1)) / 2
which is O(V^2).
So we can add at most V^2 entries to the min heap, and each insertion into the min heap (which restores the heap order) takes
O(Log(V ^ 2))
Every time we insert a new element into the min heap, the heap is reordered. We insert once per edge, so this happens E times, and the total time complexity is
O(E * Log(V ^ 2)) = O(2 * E * Log(V))
Omitting the constant 2:
O(E * Log(V))

Time complexity of Prim's MST Algorithm

Can someone explain to me why Prim's Algorithm using an adjacency matrix results in a time complexity of O(V^2)?
(Sorry in advance for the sloppy looking ASCII math, I don't think we can use LaTEX to typeset answers)
The traditional way to implement Prim's algorithm with O(V^2) complexity is to have an array in addition to the adjacency matrix, let's call it distance, which holds the minimum distance from each vertex to the tree built so far.
This way, we only ever check distance to find the next target, and since we do this V times and there are V members of distance, our complexity is O(V^2).
This on its own wouldn't be enough, as the original values in distance would quickly become out of date. To update this array, all we do at the end of each step is iterate through the newly added vertex's row of our adjacency matrix and update distance appropriately. This doesn't affect our time complexity since it merely means that each step takes O(V+V) = O(2V) = O(V). Therefore our algorithm is O(V^2).
Without using distance we have to iterate through all E edges every single time, which at worst contains V^2 edges, meaning our time complexity would be O(V^3).
To prove that without the distance array it is impossible to compute the MST in O(V^2) time, consider that then on each iteration with a tree of size n, there are V-n vertices to potentially be added.
To calculate which one to choose we must check each of these to find their minimum distance from the tree and then compare that to each other and find the minimum there.
In the worst case scenario, each of the nodes contains a connection to each node in the tree, resulting in n * (V-n) edges and a complexity of O(n(V-n)).
Since our total would be the sum of each of these steps as n goes from 1 to V, our final time complexity is:
(sum O(n(V-n)) as n = 1 to V) = O(1/6(V-1) V (V+1)) = O(V^3)
Note: This answer just borrows jozefg's answer and tries to explain it more fully since I had to think a bit before I understood it.
An Adjacency Matrix representation of a graph constructs a V x V matrix (where V is the number of vertices). The value of cell (a, b) is the weight of the edge linking vertices a and b, or zero if there is no edge.
Adjacency Matrix
    A  B  C  D  E
A   0  1  0  3  2
B   1  0  0  0  2
C   0  0  0  4  3
D   3  0  4  0  1
E   2  2  3  1  0
Prim's Algorithm is an algorithm that takes a graph and a starting node, and finds a minimum spanning tree on the graph - that is, it finds a subset of the edges so that the result is a tree that contains all the nodes and the combined edge weights are minimized. It may be summarized as follows:
Place the starting node in the tree.
Repeat until all nodes are in the tree:
Find all edges that join nodes in the tree to nodes not in the tree.
Of those edges, choose one with the minimum weight.
Add that edge and the connected node to the tree.
We can now start to analyse the algorithm like so:
At every iteration of the loop, we add one node to the tree. Since there are V nodes, it follows that there are O(V) iterations of this loop.
Within each iteration of the loop, we need to find and test edges in the tree. If there are E edges, the naive searching implementation uses O(E) to find the edge with minimum weight.
So in combination, we should expect the complexity to be O(VE), which may be O(V^3) in the worst case.
However, jozefg gave a good answer to show how to achieve O(V^2) complexity.
Distance to Tree
            | A   B   C   D   E
Iteration 0 | 0   1*  #   3   2
          1 | 0   0   #   3   2*
          2 | 0   0   3   1*  0
          3 | 0   0   3*  0   0
          4 | 0   0   0   0   0
NB. # = infinity (not connected to tree)
    * = minimum weight edge in this iteration
Here the distance vector represents the smallest weighted edge joining each node to the tree, and is used as follows:
Initialize with the edge weights to the starting node A with complexity O(V).
To find the next node to add, simply find the minimum element of distance (and remove it from the list). This is O(V).
After adding a new node, there are O(V) new edges connecting the tree to the remaining nodes; for each of these determine if the new edge has less weight than the existing distance. If so, update the distance vector. Again, O(V).
Using these three steps reduces the searching time from O(E) to O(V), and adds an extra O(V) step to update the distance vector at each iteration. Since each iteration is now O(V), the overall complexity is O(V^2).
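The three steps can be sketched in Python as follows. This is a minimal illustration, not the answerer's implementation: the names prim_dense, distance and parent are chosen here, and 0 in the matrix is taken to mean "no edge" as in the adjacency matrix above.

```python
import math

def prim_dense(matrix, start=0):
    """matrix[a][b] is the edge weight, 0 meaning no edge. Returns (total weight, edges)."""
    n = len(matrix)
    in_tree = [False] * n
    distance = [math.inf] * n          # cheapest known edge from the tree to each vertex
    parent = [None] * n
    distance[start] = 0
    total, edges = 0, []
    for _ in range(n):                 # V iterations, one vertex added per iteration
        # O(V): pick the vertex outside the tree with the smallest distance
        u = min((v for v in range(n) if not in_tree[v]), key=lambda v: distance[v])
        if distance[u] == math.inf:
            raise ValueError("graph is not connected")
        in_tree[u] = True
        total += distance[u]
        if parent[u] is not None:
            edges.append((parent[u], u))
        # O(V): scan row u and update distance where the new edge is cheaper
        for v in range(n):
            w = matrix[u][v]
            if w and not in_tree[v] and w < distance[v]:
                distance[v] = w
                parent[v] = u
    return total, edges

# The adjacency matrix from above, with A..E as indices 0..4.
matrix = [
    [0, 1, 0, 3, 2],
    [1, 0, 0, 0, 2],
    [0, 0, 0, 4, 3],
    [3, 0, 4, 0, 1],
    [2, 2, 3, 1, 0],
]
total, edges = prim_dense(matrix)
print(total, edges)  # 7 [(0, 1), (0, 4), (4, 3), (4, 2)]
```

Both inner scans touch at most V entries, so the V iterations give O(V^2) in total, independent of the number of edges.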
First of all, it's obviously at least O(V^2), because that is how big the adjacency matrix is.
Looking at, you need to execute the step "Repeat until Vnew = V" V times.
Inside that step, you need to work out the shortest link between any vertex in V and any vertex outside V. Maintain an array of size V, holding for each vertex either infinity (if the vertex is in V) or the length of the shortest link between any vertex in V and that vertex, and its length (so in the beginning this just comes from the length of links between the starting vertex and every other vertex). To find the next vertex to add to V, just search this array, at cost V. Once you have a new vertex, look at all the links from that vertex to every other vertex and see if any of them give shorter links from V to that vertex. If they do, update the array. This also cost V.
So you have V steps (V vertexes to add) each taking cost V, which gives you O(V^2)

Why is the time complexity of both DFS and BFS O( V + E )

The basic algorithm for BFS:
set start vertex to visited
load it into queue
while queue not empty
    remove vertex from the front of the queue
    for each edge incident to vertex
        if the vertex at its other end is not visited
            load that vertex into queue
            mark that vertex as visited
So I would think the time complexity would be:
v1 + (incident edges) + v2 + (incident edges) + .... + vn + (incident edges)
where v is vertex 1 to n
Firstly, is what I've said correct? Secondly, how is this O(N + E), and intuition as to why would be really nice. Thanks
Your sum
v1 + (incident edges) + v2 + (incident edges) + .... + vn + (incident edges)
can be rewritten as
(v1 + v2 + ... + vn) + [(incident_edges v1) + (incident_edges v2) + ... + (incident_edges vn)]
and the first group is O(N) while the other is O(E).
For DFS:
Setting/getting a vertex/edge label takes O(1) time
Each vertex is labeled twice
once as UNEXPLORED
once as VISITED
Each edge is labeled twice
Method incidentEdges is called once for each vertex
DFS runs in O(n + m) time provided the graph is represented by the adjacency list structure
Recall that Σv deg(v) = 2m
For BFS:
Setting/getting a vertex/edge label takes O(1) time
Each vertex is labeled twice
once as UNEXPLORED
once as VISITED
Each edge is labeled twice
Each vertex is inserted once into a sequence Li
Method incidentEdges is called once for each vertex
BFS runs in O(n + m) time provided the graph is represented by the adjacency list structure
Recall that Σv deg(v) = 2m
Very simplified without much formality: every edge is considered exactly twice, and every node is processed exactly once, so the complexity has to be a constant multiple of the number of edges as well as the number of vertices.
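The "every edge twice, every node once" claim can be checked on a small undirected graph. The following recursive DFS in Python is only a sketch; the function name and the example graph (a 4-cycle with n = 4 vertices and m = 4 edges) are assumptions made for illustration.

```python
def dfs_counts(adj, start):
    """Return (vertex visits, adjacency entries scanned) for a DFS from start."""
    visited = set()
    counts = {"visits": 0, "scans": 0}

    def visit(v):
        visited.add(v)
        counts["visits"] += 1
        for w in adj[v]:
            counts["scans"] += 1       # each undirected edge shows up in two lists
            if w not in visited:
                visit(w)

    visit(start)
    return counts["visits"], counts["scans"]

# A 4-cycle: n = 4 vertices, m = 4 undirected edges.
square = {1: [2, 3], 2: [1, 4], 3: [1, 4], 4: [2, 3]}
visits, scans = dfs_counts(square, 1)
print(visits, scans)  # 4 8
```

Here visits equals n and scans equals 2m, so the total work is n + 2m, which is O(n + m).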
An intuitive explanation to this is by simply analysing a single loop:
visit a vertex -> O(1)
a for loop on all the incident edges -> O(e) where e is a number of edges incident on a given vertex v.
So the total time for a single loop is O(1)+O(e). Now sum it for each vertex as each vertex is visited once. This gives
For every V
=> O(V) + O(E)
Time complexity is O(E+V) instead of O(2E+V) because if the time complexity is n^2+2n+7 then it is written as O(n^2).
Hence, O(2E+V) is written as O(E+V)
because difference between n^2 and n matters but not between n and 2n.
I think every edge has been considered twice and every node has been visited once, so the total time complexity should be O(2E+V).
Short but simple explanation:
In the worst case you need to visit every vertex and every edge, hence
the time complexity in the worst case is O(V+E)
In BFS, each neighboring vertex is inserted once into a queue. This is done by looking at the edges of the vertex. Each visited vertex is marked so it cannot be visited again: each vertex is visited exactly once, and all edges of each vertex are checked. So the complexity of BFS is O(V+E).
In DFS, each node maintains a list of all its adjacent edges, then, for each node, you need to discover all its neighbors by traversing its adjacency list just once in linear time. For a directed graph, the sum of the sizes of the adjacency lists of all the nodes is E(total number of edges). So, the complexity of DFS is O(V + E).
It's O(V+E) because each visit to a vertex v in V must examine every edge e incident to v, and there are at most V-1 such edges per vertex. There are V vertex visits, which is O(V). On top of that, the edge examinations summed over all vertices total E, which is O(E). So the total time complexity is O(V + E).

Graph Minimum Spanning Tree using BFS

This is a problem from a practice exam that I'm struggling with:
Let G = (V, E) be a weighted undirected connected graph, with positive
weights (you may assume that the weights are distinct). Given a real
number r, define the subgraph Gr = (V, {e in E | w(e) <= r}). For
example, G0 has no edges (obviously disconnected), and Ginfinity = G
(which by assumption is connected). The problem is to find the
smallest r such that Gr is connected.
Describe an O(mlogn)-time algorithm that solves the problem by
repeated applications of BFS or DFS.
The real problem is doing it in O(mlogn). Here's what I've got:
r = min( w(e) )                                => O(m)
while true do                                  => O(m)
    Gr = G with edges e | w(e) > r removed     => O(m)
    if | BFS( Gr ).V | < |V|                   => O(m + n)
        r++ (or r = next smallest w(e))
    else
        return r
That's a whopping O(m^2 + mn). Any ideas for getting it down to O(mlogn)? Thanks!
You are iterating over all possible edge costs which results in the outer loop of O(m). Notice that if the graph is disconnected when you discard all edges >w(e), it is also disconnected for >w(e') where w(e') < w(e). You can use this property to do a binary search over the edge costs and thus do this in O(log(n)).
lo = min(w(e) for e in edges), hi = max(w(e) for e in edges)
while lo < hi:
    mid = (lo + hi) / 2
    if connected(graph after discarding all e where w(e) > mid):
        hi = mid
    else:
        lo = mid + 1
return lo
The binary search has a complexity of O(log (max_e-min_e)) (you can actually bring it down to O(log(edges))), and discarding edges and determining connectivity can be done in O(edges+vertices), so this can be done in O((edges+vertices)*log(edges)).
Warning: I have not tested this in code yet, so there may be bugs. But the idea should work.
How about the following algorithm?
First take a list of all edges (or all distinct edge lengths, using ) from the graph and sort them. That takes O(m*log m) = O(m*log n) time: m is usually less than n^2, so O(log m)=O(log n^2)=O(2*log n)=O(log n).
It is obvious that r should be equal to the weight of some edge. So you can do a binary search on the index of the edge in the sorted array.
For each index you try, you take the length of the corresponding edge as r, and check the graph for connectivity with BFS or DFS, using only the edges of length <= r.
Each iteration of the binary search takes O(m), and you have to make O(log m)=O(log n) iterations.
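A possible Python sketch of this approach follows. It is untested against the original exam setting and makes some assumptions for illustration: vertices are numbered 0..n-1, edges are (u, v, w) triples, the full graph is connected, and the function names are chosen here.

```python
from collections import deque

def is_connected(n, edges, r):
    """BFS from vertex 0 using only edges of weight <= r; O(n + m)."""
    adj = [[] for _ in range(n)]
    for u, v, w in edges:
        if w <= r:
            adj[u].append(v)
            adj[v].append(u)
    seen = [False] * n
    seen[0] = True
    queue = deque([0])
    count = 1
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if not seen[v]:
                seen[v] = True
                count += 1
                queue.append(v)
    return count == n

def smallest_connecting_r(n, edges):
    """Binary search over the sorted distinct weights; O(m log n) overall."""
    weights = sorted({w for _, _, w in edges})   # O(m log m) = O(m log n)
    lo, hi = 0, len(weights) - 1                 # G is connected at weights[hi]
    while lo < hi:
        mid = (lo + hi) // 2
        if is_connected(n, edges, weights[mid]):
            hi = mid                             # connected: try a smaller r
        else:
            lo = mid + 1                         # disconnected: r must be larger
    return weights[lo]

edges = [(0, 1, 1.0), (1, 2, 5.0), (2, 3, 2.0), (0, 3, 7.0), (0, 2, 3.0)]
print(smallest_connecting_r(4, edges))  # 3.0
```

Searching over indices of the sorted weights, rather than over the numeric range of weights, keeps the number of BFS runs at O(log m) even when the weights are arbitrary real numbers.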
