For a graph (V, E), where V is the number of vertices and E is the number of edges, what is the time complexity of deleting an edge? I thought it would be O(V) in the worst case, since the maximum number of edges any vertex can have is V-1. But I have been told the time complexity is O(M), where M is the number of edges the vertex has. Which is correct?
It depends on the structure of your graph.
If you implement the graph as an adjacency list, removing an edge is O(V), since you may have to iterate through the whole neighbor list of a vertex.
However, you can implement the graph as a collection of sets (each set holding the neighbors of one node), in which case deletion is O(log V) if the sets are kept sorted (e.g. as balanced search trees), or expected O(1) if they are hash sets.
If your graph is represented as an adjacency matrix, deletion is also O(1), since you just have to clear E[u][v] and E[v][u].
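As a concrete sketch of the three options above (in Python; the function names are mine, not from any particular library):

```python
def delete_edge_list(adj, u, v):
    # Adjacency list of Python lists: list.remove scans the list,
    # so each removal is O(deg) = O(V) in the worst case.
    adj[u].remove(v)
    adj[v].remove(u)

def delete_edge_set(adj, u, v):
    # Adjacency structure of hash sets: expected O(1) per removal.
    adj[u].discard(v)
    adj[v].discard(u)

def delete_edge_matrix(mat, u, v):
    # Adjacency matrix: O(1), just clear both cells.
    mat[u][v] = 0
    mat[v][u] = 0
```

All three assume an undirected graph, so the edge is removed in both directions.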
[Images: pseudocode of BFS and DFS]
Above is the pseudocode of BFS and DFS.
By my calculation the time complexity of both is O(n), but I have also seen it given as O(V+E), where V stands for vertices and E for edges. Can anyone give a detailed time-complexity analysis of both pieces of pseudocode?
So, in short: what is the time complexity of BFS and DFS on both the adjacency matrix and the adjacency list?
Let us first analyze the time complexity of BFS for the adjacency-list implementation.
For breadth-first search, this is what we do:
Start from a node and mark it as visited. Then mark all of that node's neighbors as visited and add them to a queue. Fetch the next node from the queue and perform the same operation, until the queue is empty. If the queue is empty but there are still unvisited nodes, call the BFS function again for one of them.
When we are at a node, we check each of its neighbors to fill up the queue. If a neighbor is already visited (visited[int(neighbor) - 1] == 1), we do not add it to the queue. A neighbor of a node is another node connected to it by an edge, so checking all neighbors of all nodes means checking all the edges; this contributes O(E). Also, since we add each node to the queue once (and pop it later), that contributes O(V).
So which one should we take?
Well, we take both. Since O(V+E) = O(max(V, E)), if one of them is larger than the other, the smaller term is absorbed by the larger one. That's why we say O(V+E).
For example, in a complete graph with N nodes, every node has N-1 neighbors, so across all nodes we perform N*(N-1) neighbor checks. The time complexity is therefore max(N, N*(N-1)) = O(N^2).
On the other hand, in a sparse graph with N nodes and, say, sqrt(N) edges, the time complexity of BFS is O(N).
The same logic applies to DFS: you visit each node and check each edge while diving into the depths of the graph, and again that makes it O(V+E).
As for your assumption, it is partially correct. However, as explained above, we cannot say the time complexity will always be O(n). (I assume n is the number of vertices; you didn't specify that in your question.)
Notice that these are for the adjacency list implementation.
For the adjacency-matrix implementation, checking the neighbors of a node means scanning all the columns of the corresponding row, which is O(V). We have to do this for every vertex, so it is O(V^2).
So for the matrix implementation, the time complexity does not depend on the number of edges. Since in most cases O(V+E) << O(V^2), prefer the adjacency-list implementation.
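To make the O(V^2) cost of the matrix version concrete, here is a minimal BFS over an adjacency matrix (a Python sketch, not from the question's pseudocode): the inner loop scans a full row of V columns for every vertex popped, regardless of how many edges actually exist.

```python
from collections import deque

def bfs_matrix(mat, source):
    # BFS on an adjacency matrix: V row scans of V columns each -> O(V^2).
    n = len(mat)
    visited = [False] * n
    visited[source] = True
    order = []
    q = deque([source])
    while q:
        u = q.popleft()
        order.append(u)
        for v in range(n):  # O(V) scan even if u has few neighbors
            if mat[u][v] and not visited[v]:
                visited[v] = True
                q.append(v)
    return order
```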
I know there are a ton of questions out there about the time complexity of BFS, which is O(V+E).
However, I still struggle to understand why the time complexity is O(V+E) and not O(V*E).
I know that O(V+E) stands for O(max(V, E)), and my only guess is that it has something to do with the density of the graph, not with the algorithm itself, unlike, say, merge sort, whose time complexity is always O(n log n).
Examples I've thought of are :
A directed graph with |E| = |V|-1: the time complexity is O(V).
A directed graph with |E| = |V|*(|V|-1): the complexity is indeed O(|E|) = O(|V|^2), as each vertex has an outgoing edge to every other vertex besides itself.
Am I in the right direction? Any insight would be really helpful.
Your examples illustrate that the complexity is not O(V*E) but O(E). True, E can be large in comparison with V, but that does not change the fact that the complexity is O(E).
When the graph is connected, you can always say it is O(E). The reason to include V in the time complexity is to cover graphs that have many more vertices than edges (and are thus disconnected): the BFS algorithm must not only visit all edges but also all vertices, including those that have no edges, just to detect that they have none. And so we must say O(V+E).
The complexity falls out easily if you walk through the algorithm. Let Q be the FIFO queue, initially containing the source node. BFS basically does the following:
while Q not empty
    pop u from Q
    for each vertex v adjacent to u
        if v is not marked
            mark v
            push v into Q
Since each node is added to Q once and removed once, the while loop executes O(V) times. Each time we pop u, we also perform |adj[u]| operations, where |adj[u]| is the number of vertices adjacent to u.
Therefore the total complexity is the sum of (1 + |adj[u]|) over all vertices, which is O(V+E), since the adjacencies sum to O(E) (2E for an undirected graph, E for a directed one).
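A quick way to check the sum of (1 + |adj[u]|) is to instrument the pseudocode above. This Python sketch (my own, assuming a connected undirected graph stored as an adjacency list with both directions) counts one unit per queue pop and one per adjacency examined; the total comes out to V + 2E.

```python
from collections import deque

def bfs_op_count(adj, source):
    # Instrumented BFS: returns the total of (1 + |adj[u]|) over all
    # vertices reached, i.e. V + 2E for a connected undirected graph.
    visited = {source}
    q = deque([source])
    ops = 0
    while q:
        u = q.popleft()
        ops += 1                  # the "1" per vertex popped
        for v in adj[u]:          # |adj[u]| units of work
            ops += 1
            if v not in visited:
                visited.add(v)
                q.append(v)
    return ops
```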
Consider a situation where you have a graph, possibly even with cycles: you start the search from one node, and your target is the farthest node from it. In this case you will traverse all the edges before you reach your destination.
E.g.
0 - 1
1 - 2
0 - 2
0 - 3
In this scenario you will check 4 edges before you actually find node 3.
It depends on how the adjacency list is implemented. A properly implemented adjacency list is a list/array of vertices, with a list of incident edges attached to each vertex entry.
The key is that the edge entries point directly to their corresponding entries in the vertex array/list; you never have to search through the vertex array for a matching entry, you can look it up directly. This ensures that the total number of edge accesses is 2E and the total number of vertex accesses is V+2E, making the total time O(E+V).
In an improperly implemented adjacency list, the vertex array/list is not directly indexed, so going from an edge entry to a vertex entry requires searching through the vertex list, which is O(V); the total time is then O(E*V).
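A minimal sketch of the "properly implemented" version described above (class and method names are mine): vertices are identified by array index, and each edge entry stores the endpoint's index directly, so reaching a vertex record from an edge is O(1).

```python
class Graph:
    def __init__(self, n):
        # Vertex array indexed by id: no searching ever needed.
        self.vertices = list(range(n))
        self.adj = [[] for _ in range(n)]

    def add_edge(self, u, v):
        # Each undirected edge appears in two lists -> 2E edge entries total.
        self.adj[u].append(v)
        self.adj[v].append(u)

    def neighbors(self, u):
        # O(1) lookup of the list; the entries are vertex ids themselves,
        # so no O(V) scan of the vertex array is required.
        return self.adj[u]
```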
A simple greedy algorithm finds a maximal independent set. I think it should take O(n) time, since no vertex is visited more than twice. Why does Wikipedia say it takes O(m) time?
Greedy(G)
    while G is not empty
        pick the next vertex v (in an arbitrary order)
        mark v as IS and v's neighbors as non-IS
        remove v and its neighbors from G
    return all IS vertices
If you run it on the complete bipartite graph K_{n/2,n/2}, the neighbors on the side not chosen each get marked as non-IS n/2 times.
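This Python sketch of the greedy (names and counter are mine) makes the point measurable: on a complete bipartite graph, every chosen vertex re-marks the entire opposite side, so the marking work is proportional to m, not n.

```python
def greedy_mis_mark_count(adj):
    # Greedy maximal independent set over a dict {vertex: neighbor list},
    # counting marking operations. On K_{n/2,n/2} the chosen side marks the
    # other side n/2 times each -> (n/2)*(n/2) = m markings in total.
    status = {}            # vertex -> "IS" or "non-IS"
    marks = 0
    for v in adj:                      # arbitrary fixed order
        if v in status:
            continue                   # already excluded by a neighbor
        status[v] = "IS"
        for w in adj[v]:
            marks += 1                 # work happens even if w is re-marked
            if w not in status:
                status[w] = "non-IS"
    mis = [v for v, s in status.items() if s == "IS"]
    return mis, marks
```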
I am reading "Algorithm Design" by Kleinberg and Tardos, and chapter 3 mentions that the adjacency matrix has complexity O(n^2) while the adjacency list has O(m+n), where m is the total number of edges and n is the total number of nodes. It says that in the case of an adjacency list we will need only lists of size m for each node.
Won't we end up with something similar to the matrix in the case of an adjacency list, since lists are also 1D arrays? So by my reckoning it is O(m*n). Please guide me.
An adjacency matrix keeps a value (1/0) for every pair of nodes, whether the edge exists or not, so it requires n*n space.
An adjacency list only contains existing edges, so its total size is on the order of the number of edges (or the number of nodes, when there are fewer edges than nodes).
It says that in-case of adjacency list we will need only lists of size m for each node.
I think you misunderstood that part. An adjacency list does not hold a list of size m for every node; m is the total number of edges in the graph.
In a complete graph there is an edge between every pair of nodes, so both the adjacency list and the matrix require n*n space, but in every other case an adjacency list is smaller.
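A small Python sketch (my own) comparing the actual storage of the two representations, counting stored entries rather than bytes:

```python
def build_matrix(n, edges):
    # Adjacency matrix: n*n cells are allocated regardless of |E|.
    mat = [[0] * n for _ in range(n)]
    for u, v in edges:
        mat[u][v] = mat[v][u] = 1
    return mat

def build_adj_list(n, edges):
    # Adjacency list: n list headers plus two entries per undirected edge.
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return adj

def storage_cells(structure):
    # Total stored entries: sum of inner-list lengths.
    return sum(len(row) for row in structure)
```

For a sparse graph the difference is dramatic: 100 nodes with 3 edges costs 10,000 matrix cells but only 6 list entries (plus the 100 headers).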
Because we know that the integers representing a vertex can take values in [0,...,|V|-1] range, we can use counting sort in order to sort each entry of the adjacency list in O(V) time.
Since we have V lists to sort, that would give us a O(V^2) time algorithm. I don't see how we can transform this into an O(V+E) time algorithm...
In fact you need to sort E elements in total, but a counting sort of a single list over the range [0, ..., |V|-1] still costs O(V) just to scan the counting array, so sorting each list separately really is O(V^2). The way to get O(V+E) is to sort all the lists in one pass: iterate v over 0, 1, ..., |V|-1 and append v to the (new) adjacency list of each of v's neighbors. Every list then receives its entries in increasing order, and the total work is O(V) for the sweep plus O(E) for the appends, i.e. O(V+E).
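One way to reach O(V+E) overall is a single sweep instead of sorting each list separately. This Python sketch (mine; it assumes an undirected graph stored with both directions of every edge) visits the vertices in increasing order and appends each one to its neighbors' new lists, so every list comes out already sorted:

```python
def sort_adjacency_lists(adj):
    # adj: list of neighbor lists, vertices numbered 0..n-1.
    # Sweeping v in increasing order means each sorted_adj[u] receives its
    # entries in increasing order: O(V) for the sweep + O(E) appends.
    n = len(adj)
    sorted_adj = [[] for _ in range(n)]
    for v in range(n):
        for u in adj[v]:
            sorted_adj[u].append(v)
    return sorted_adj
```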