Space Complexity of DFS and BFS in graph - depth-first-search

I am trying to understand the space complexity of DFS and BFS on a graph.
I understand that the space complexity of BFS with an adjacency matrix would be O(v^2), where v is the number of vertices.
With an adjacency list the space would be lower in the average case, i.e. less than v^2, but in the worst case it would still be O(v^2).
Even including the queue, it would be O(v^2) (neglecting the lower-order term).
But what is the situation with DFS?
Whether we use an adjacency matrix or a list, the space complexity would be O(v^2). But that seems to be a very loose bound, and it does not even consider stack frames.
Am I correct regarding the complexities?
If not, what are the space complexities of BFS and DFS?
And when calculating the space complexity of DFS, do we count the stack frames or not?
What is the tight bound on the space complexity of BFS and DFS on a graph?

As Pseudocode 1 shows, the adjacency matrix or adjacency list is not allocated by the BFS algorithm. It is the input to the algorithm, so its size is not counted in the algorithm's space complexity. The same holds for DFS.
Pseudocode 1
Input: A graph G and a starting vertex start_v of G
Output: Goal state. The parent links trace the shortest path back to start_v
procedure BFS(G,start_v):
    let Q be a queue
    label start_v as discovered
    Q.enqueue(start_v)
    while Q is not empty
        v = Q.dequeue()
        if v is the goal:
            return v
        for all edges from v to w in G.adjacentEdges(v) do
            if w is not labeled as discovered:
                label w as discovered
                w.parent = v
                Q.enqueue(w)
The space complexity of BFS can be expressed as O(|V|), where |V| is the cardinality of the set of vertices, because in the worst case you would need to hold all vertices in the queue.
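To make the O(|V|) queue bound concrete, here is a minimal Python sketch of the BFS above (without the goal test) that also records the largest queue length it sees. The function name `bfs_peak_queue`, the dict-of-lists graph representation, and the star-graph example are assumptions chosen for illustration, not part of the original answer.

```python
from collections import deque

def bfs_peak_queue(adj, start):
    """Run BFS from start; return (visit order, largest queue length seen)."""
    discovered = {start}
    queue = deque([start])
    order = []
    peak = 1  # the queue starts with one vertex in it
    while queue:
        v = queue.popleft()
        order.append(v)
        for w in adj[v]:
            if w not in discovered:
                discovered.add(w)
                queue.append(w)
        peak = max(peak, len(queue))
    return order, peak

# Star graph: vertex 0 is adjacent to vertices 1..5 (hypothetical example).
star = {0: [1, 2, 3, 4, 5], 1: [0], 2: [0], 3: [0], 4: [0], 5: [0]}
order, peak = bfs_peak_queue(star, 0)
print(order, peak)
```

On a star graph every vertex except the center sits in the queue at the same time, which is the kind of input that drives the queue to |V| - 1 entries.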
The space complexity of DFS depends on the implementation. A non-recursive implementation of DFS with worst-case space complexity O(|E|) is shown below, where |E| is the cardinality of the set of edges:
procedure DFS-iterative(G,v):
    let S be a stack
    S.push(v)
    while S is not empty
        v = S.pop()
        if v is not labeled as discovered:
            label v as discovered
            for all edges from v to w in G.adjacentEdges(v) do
                S.push(w)
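The O(|E|) bound comes from the fact that this variant pushes a neighbor even when it has already been discovered, so a vertex can sit on the stack several times. A minimal Python sketch of the same procedure, with a counter for the peak stack size, might look as follows; the name `dfs_peak_stack` and the complete-graph example are assumptions for illustration.

```python
def dfs_peak_stack(adj, start):
    """Iterative DFS as in the pseudocode; return (visit order, peak stack size)."""
    discovered = set()
    stack = [start]
    order = []
    peak = 1
    while stack:
        v = stack.pop()
        if v not in discovered:
            discovered.add(v)
            order.append(v)
            for w in adj[v]:
                # Pushed unconditionally, exactly as the pseudocode does,
                # so duplicates of the same vertex can pile up on the stack.
                stack.append(w)
            peak = max(peak, len(stack))
    return order, peak

# Complete graph on 4 vertices (hypothetical example).
complete4 = {u: [w for w in range(4) if w != u] for u in range(4)}
order, peak = dfs_peak_stack(complete4, 0)
print(order, peak)
```

Here the stack grows to 8 entries although the graph has only 4 vertices, which is why the bound is stated in terms of edges. Checking `w not in discovered` before pushing reduces the duplicates but, in this form, still does not guarantee an O(|V|) stack.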
Breadth-first search is complete, while depth-first search is not.

Related

Time Complexity Analysis of BFS

I know there are a ton of questions out there about the time complexity of BFS, which is O(V+E).
However, I still struggle to understand why the time complexity is O(V+E) and not O(V*E).
I know that O(V+E) stands for O(max[V,E]), and my only guess is that it has something to do with the density of the graph and not with the algorithm itself, unlike, say, merge sort, whose time complexity is always O(n*logn).
Examples I've thought of are :
A directed graph with |E| = |V|-1, where the time complexity will be O(V)
A directed graph with |E| = |V|*(|V|-1), where the complexity would in fact be O(|E|) = O(|V|*|V|), as each vertex has an outgoing edge to every other vertex besides itself
Am I in the right direction? Any insight would be really helpful.
Your "examples of thought" illustrate that the complexity is not O(V*E), but O(E). True, E can be a large number in comparison with V, but it doesn't matter when you say the complexity is O(E).
When the graph is connected, then you can always say it is O(E). The reason to include V in the time complexity, is to cover for the graphs that have many more vertices than edges (and thus are disconnected): the BFS algorithm will not only have to visit all edges, but also all vertices, including those that have no edges, just to detect that they don't have edges. And so we must say O(V+E).
The complexity comes off easily if you walk through the algorithm. Let Q be the FIFO queue where initially it contains the source node. BFS basically does the following
while Q not empty
    pop u from Q
    for each adjacency v of u
        if v is not marked
            mark v
            push v into Q
Since each node is added once and removed once, the while loop runs O(V) times. Each time we pop u we perform |adj[u]| operations, where |adj[u]| is the number of adjacencies of u.
Therefore the total complexity is the sum of (1+|adj[u]|) over all vertices u, which is O(V+E), since the sum of the adjacency counts is O(E) (2E for an undirected graph and E for a directed one).
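The counting argument can be checked directly. The sketch below is the same loop in Python with two counters added, one per dequeue and one per adjacency examined; the function name `bfs_count_work` and the small undirected example graph are assumptions made for illustration.

```python
from collections import deque

def bfs_count_work(adj, source):
    """BFS from source; return (number of dequeues, number of adjacency checks)."""
    marked = {source}
    queue = deque([source])
    dequeues = 0
    edge_checks = 0
    while queue:
        u = queue.popleft()
        dequeues += 1
        for v in adj[u]:
            edge_checks += 1  # one unit of work per adjacency-list entry
            if v not in marked:
                marked.add(v)
                queue.append(v)
    return dequeues, edge_checks

# Undirected graph with 4 vertices and 4 edges: 0-1, 1-2, 0-2, 0-3.
graph = {0: [1, 2, 3], 1: [0, 2], 2: [1, 0], 3: [0]}
dequeues, edge_checks = bfs_count_work(graph, 0)
print(dequeues, edge_checks)
```

For this connected graph the counters come out as V = 4 dequeues and 2E = 8 adjacency checks, a sum rather than a product.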
Consider a situation where you have a tree-like graph, maybe even one with cycles. You start the search from the root and your target is the last vertex to be reached. In this case you will traverse all the edges before you reach your destination.
E.g.
0 - 1
1 - 2
0 - 2
0 - 3
In this scenario you will check 4 edges before you actually find node #3.
It depends on how the adjacency list is implemented. A properly implemented adjacency list is a list/array of vertices with a list of related edges attached to each vertex entry.
The key is that the edge entries point directly to their corresponding vertex array/list entry. They never have to search through the vertex array/list for a matching entry; they can just look it up directly. This ensures that the total number of edge accesses is 2E and the total number of vertex accesses is V+2E, which makes the total time O(E+V).
In improperly implemented adjacency lists, the vertex array/list is not directly indexed, so to go from an edge entry to a vertex entry you have to search through the vertex list which is O(V), which means that the total time is O(E*V).

shortest path between 2 vertices in undirected weighted graph

I am trying to find the shortest path between 2 vertices in an undirected weighted graph. It is also known that the weights are integers less than log(log|V|), where |V| is the number of vertices. It is easy to solve using the Bellman-Ford or Dijkstra algorithms, but is there any algorithm which can do it faster?
So far, I have been thinking of using BFS and splitting each edge with weight greater than 1 into several edges of weight 1, but that is not really a good idea if |V| is a large number. No, it is not my homework, I am just wondering.
One way to think of this question is to improve the running time of using Dijkstra's algorithm to find the shortest path between two vertices in the undirected weighted graph. So in this case, you can use a binary heap as the data structure. A heap is a complete binary tree with the heap property that every parent node is smaller (greater) than its children nodes in the tree in a min heap (a max heap). Here you can use the min heap to store the cost to each node from the starting node.
More information about heap can be found here: https://courses.csail.mit.edu/6.006/fall10/handouts/recitation10-8.pdf
With a binary heap, the running time of Dijkstra's algorithm can be reduced from O(V^2) to O((V + E) log V), which is also O(E log E) for a connected graph. Extracting the vertex with the minimum distance takes O(log V) (reading the minimum is O(1) and restoring the heap property takes O(log V)), and updating the distances to vertices takes O(E log V) in total (neighbors are examined E times, and each cost change takes O(log V) to fix the heap).
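As a rough illustration, here is a minimal Python sketch of Dijkstra's algorithm using the standard-library `heapq` module as the min heap. It is one common variant, not necessarily the one the answer has in mind: instead of decreasing a key in place it pushes a new entry and skips stale ones when they are popped, so the heap can hold up to O(E) entries, which gives the O(E log E) bound. The function name `dijkstra` and the example graph are assumptions.

```python
import heapq

def dijkstra(adj, source):
    """Shortest distances from source; adj maps a vertex to (neighbor, weight) pairs."""
    dist = {source: 0}
    heap = [(0, source)]  # (distance so far, vertex)
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale entry: a shorter distance to u was already found
        for v, w in adj[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

# Small undirected weighted graph (hypothetical example).
graph = {
    "a": [("b", 2), ("c", 5)],
    "b": [("a", 2), ("c", 1)],
    "c": [("a", 5), ("b", 1), ("d", 3)],
    "d": [("c", 3)],
}
dist = dijkstra(graph, "a")
print(dist)
```

If only the distance to a single target is needed, the loop can stop as soon as that target is popped from the heap.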
Hope this helps.

Bellman-Ford Algorithm Space Complexity

I have been searching for the space complexity of the Bellman-Ford algorithm. The Wikipedia article on the Bellman-Ford algorithm says the space complexity is O(V), but on this link it says O(V^2). My question is: what is the true space complexity, and why?
It depends on the way we define it.
If we assume that the graph is given, the extra space complexity is O(V) (for an array of distances).
If we assume that the graph also counts, it can be O(V^2) for an adjacency matrix and O(V+E) for an adjacency list.
They both are "true" in some sense. It's just about what we want to count in a specific problem.
There are two cases:
If we assume that the graph is given, then we have to create 2 arrays (an array of distances and an array of parents), so the extra space complexity is O(V).
If we also count the storage of the graph, then:
a) O(V^2) for an adjacency matrix
b) O(V+E) for an adjacency list
c) O(E) if we just create an edge list, which stores only the edges
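The two cases can be seen in a minimal Python sketch of Bellman-Ford over an edge list. The input `edges` takes O(E) space, while the algorithm itself only allocates the two arrays `dist` and `parent`, each of length V. The function name, the `(u, v, w)` tuple layout, and the example edges are assumptions made for illustration.

```python
def bellman_ford(num_vertices, edges, source):
    """edges is a list of (u, v, w) tuples; returns (dist, parent) arrays."""
    INF = float("inf")
    dist = [INF] * num_vertices      # O(V) extra space
    parent = [None] * num_vertices   # O(V) extra space
    dist[source] = 0
    for _ in range(num_vertices - 1):
        changed = False
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                parent[v] = u
                changed = True
        if not changed:
            break  # no relaxation succeeded, distances are final
    # One more pass: any further improvement means a negative cycle.
    for u, v, w in edges:
        if dist[u] + w < dist[v]:
            raise ValueError("graph contains a negative-weight cycle")
    return dist, parent

# Directed example with one negative edge (hypothetical data).
edges = [(0, 1, 4), (0, 2, 1), (2, 1, -2), (1, 3, 3)]
dist, parent = bellman_ford(4, edges, 0)
print(dist, parent)
```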
It does not matter whether we use an adjacency list or an adjacency matrix: if the given graph is complete, then
space complexity = input + extra
1. If we use an adjacency matrix, space = input + extra = O(V^2) + O(V) (using a min heap) = O(V^2)
2. If we use an adjacency list, space = input + extra. In a complete graph E = O(V^2), so O(V + E) + O(V) (min heap) = O(V^2)
When we talk about the space complexity of an algorithm we always go with the worst case that can happen. For both Dijkstra and Bellman-Ford the worst case is a complete graph, so the space complexity is O(V^2).

Graph In-degree Calculation from Adjacency-list

I came across this question in which it was required to calculate in-degree of each node of a graph from its adjacency list representation.
for each u
    for each Adj[i] where i!=u
        if (i,u) ∈ E
            in-degree[u]+=1
Now, in my view, its time complexity should be O(|V||E|+|V|^2), but the solution I referred to instead described it as O(|V||E|).
Please help and tell me which one is correct.
Rather than O(|V||E|), the complexity of computing indegrees is O(|E|). Let us consider the following pseudocode for computing indegrees of each node:
for each u
    indegree[u] = 0;
for each u
    for each v ∈ Adj[u]
        indegree[v]++;
The first loop has linear complexity O(|V|). For the second part: the inner loop executes at most |E| times for any single u, while the outer loop executes |V| times, so the second part appears to have complexity O(|V||E|). In fact, over the whole execution the inner body runs exactly once for each edge, so a more accurate bound for the second part is O(|E|), and O(|V| + |E|) for the whole computation.
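The same two loops in Python, as a minimal sketch. It assumes the graph is a dict in which every vertex appears as a key, including vertices with no outgoing edges; the function name and the example graph are chosen for illustration.

```python
def in_degrees(adj):
    """Return a dict mapping each vertex to its in-degree."""
    indegree = {u: 0 for u in adj}   # first loop: O(|V|)
    for u in adj:                    # second part: one increment per edge, O(|E|)
        for v in adj[u]:
            indegree[v] += 1
    return indegree

# Directed graph given as adjacency lists (hypothetical example).
adj = {"a": ["b", "c"], "b": ["c"], "c": [], "d": ["c"]}
result = in_degrees(adj)
print(result)
```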
According to http://www.cs.yale.edu/homes/aspnes/pinewiki/C(2f)Graphs.html, Section 4.2, with an adjacency list representation,
Finding predecessors of a node u is extremely expensive, requiring looking through every list of every node in time O(n+m), where m is the total number of edges.
So, in the notation used here, the time complexity of computing the in-degree of a node is O(|V| + |E|).
This can be reduced at the cost of using extra space, however. The Wiki also states that
adding a second copy of the graph with reversed edges lets us find all predecessors of u in O(d-(u)) time, where d-(u) is u's in-degree.
An example of a package which implements this approach is the Python package Networkx. As you can see from the constructor of the DiGraph object for directional graphs, networkx keeps track of both self._succ and self._pred, which are dictionaries representing the successors and predecessors of each node, respectively. This allows it to compute each node's in_degree efficiently.
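The idea of keeping a reversed copy can be sketched in plain Python without any library. This is not Networkx's actual code, only an illustration of the approach; the names `build_predecessors`, `succ`, and `pred` are assumptions, and every vertex is assumed to appear as a key of `succ`.

```python
def build_predecessors(succ):
    """Given successor lists, build the reversed graph of predecessor lists."""
    pred = {u: [] for u in succ}
    for u, targets in succ.items():
        for v in targets:
            pred[v].append(u)  # edge u -> v makes u a predecessor of v
    return pred

# Directed graph given as successor lists (hypothetical example).
succ = {"a": ["b", "c"], "b": ["c"], "c": [], "d": ["c"]}
pred = build_predecessors(succ)
print(pred["c"], len(pred["c"]))
```

Building `pred` costs O(|V| + |E|) time and space once. After that, the in-degree of a vertex is just the length of its predecessor list.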
O(|V|+|E|) is the correct answer, because you visit each vertex, which is O(|V|), and each time you visit a fraction of the edges, which is O(|E|) in total. Also, usually |E| >> |V|, in which case O(|E|) is also correct.

Breadth-first search algorithm (graph represented by the adjacency list) has a quadratic time complexity?

A friend told me that the breadth-first search algorithm (with the graph represented by an adjacency list) has quadratic time complexity. But all the sources say that the complexity of the BFS algorithm is exactly O(|V| + |E|), or O(n + m). From what do we obtain a quadratic complexity?
All the sources are right :-) With BFS you visit each vertex and each edge exactly once, resulting in linear complexity. Now, if it's a completely connected graph, i.e. each pair of vertices is connected by an edge, then the number of edges grows quadratically with the number of vertices:
|E| = |V| * (|V|-1) / 2
Then one might say the complexity of BFS is quadratic in the number of vertices: O(|V|+|E|) = O(|V|^2)
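As a quick sanity check of the edge-count formula, the following Python sketch enumerates the edges of a complete undirected graph and compares the count with |V| * (|V|-1) / 2. The helper name `complete_graph_edge_count` is an assumption for illustration.

```python
from itertools import combinations

def complete_graph_edge_count(n):
    """Count the edges of the complete undirected graph on n vertices by listing them."""
    return sum(1 for _ in combinations(range(n), 2))

for n in (1, 2, 5, 10, 100):
    # Every unordered pair of distinct vertices is exactly one edge.
    assert complete_graph_edge_count(n) == n * (n - 1) // 2

print(complete_graph_edge_count(10))
```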
BFS is O(E+V), so in terms of the input given it is a linear-time algorithm. But the number of edges can be O(|V|^2) in dense graphs, so if we express the time complexity in terms of the number of vertices alone, BFS is O(|V|^2) and can be considered quadratic in the number of vertices.
O(n + m) is linear in complexity and not quadratic. O(n*m) is quadratic.
0. Initially all the vertices are labelled as unvisited. We start from a given vertex as the current vertex.
1. A BFS will cover (visit) all the adjacent unvisited vertices to the current vertex queuing up these children.
2. It would then label the current vertex as 'visited' so that it is not visited (queued) again.
3. BFS would then take out the first vertex from the queue and would repeat the steps from 1 till no more unvisited vertices remain.
The runtime for the above algorithm is linear in the total no. of vertices and edges together because the algorithm would visit each vertex once and check each of its edges once and thus it would take no. of vertices + no. of edges steps to completely search the graph

Resources