Time complexity of adjacency list representation?

I am going through this link for adjacency list representation.
http://www.geeksforgeeks.org/graph-and-its-representations/
I have a simple doubt about one part of the code, which follows:
// A utility function to print the adjacency list representation of a graph
void printGraph(struct Graph* graph)
{
    int v;
    for (v = 0; v < graph->V; ++v)
    {
        struct AdjListNode* pCrawl = graph->array[v].head;
        printf("\n Adjacency list of vertex %d\n head ", v);
        while (pCrawl)
        {
            printf("-> %d", pCrawl->dest);
            pCrawl = pCrawl->next;
        }
        printf("\n");
    }
}
Here, for every vertex v, the while loop executes d times, where d is the degree of that vertex.
So I think the time complexity is
d0 + d1 + d2 + ... + d(V-1), where di is the degree of vertex i.
This sums to O(E), but the link says the time complexity is O(|V|+|E|).
I am not sure what is wrong with my understanding. Some help is needed here.

The important thing here is that, for the time complexity to be valid, we need to cover every possible situation:
The outer loop executes O(|V|) times regardless of the graph structure.
Even if we had no edges at all, every iteration of the outer loop would still do a constant amount of work (O(1)).
The inner loop executes once for every edge incident to the current vertex, thus O(deg(v)) times, where deg(v) is the degree of the current vertex.
Thus the runtime of a single iteration of the outer loop is O(1 + deg(v)). Note that we cannot leave out the 1, because deg(v) might be 0, yet we still do some work in that iteration.
Summing it all up, we get a runtime of O(|V| * 1 + deg(v1) + deg(v2) + ...) = O(|V| + |E|), since the degrees sum to 2|E|.
As mentioned before, |E| could be rather small, so we still need to account for the work done exclusively in the outer loop. Thus, we cannot simply drop the |V| term.
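To make the counting concrete, here is a minimal Java sketch (my own illustration, not from the linked article) that instruments the same traversal: outerSteps always reaches |V|, and innerSteps reaches the sum of all degrees, which is 2|E| in an undirected graph, so the total work is V + 2E = O(|V| + |E|) even when E = 0.

import java.util.ArrayList;
import java.util.List;

public class PrintCost {
    public static void main(String[] args) {
        int V = 5;
        List<List<Integer>> adj = new ArrayList<>();
        for (int v = 0; v < V; v++) adj.add(new ArrayList<>());
        // a single undirected edge 0-1, stored in both lists
        adj.get(0).add(1);
        adj.get(1).add(0);

        int outerSteps = 0, innerSteps = 0;
        for (int v = 0; v < V; v++) {
            outerSteps++;                  // paid even for isolated vertices
            for (int w : adj.get(v)) {
                innerSteps++;              // paid once per adjacency entry
            }
        }
        System.out.println("outer = " + outerSteps + ", inner = " + innerSteps);
        // prints: outer = 5, inner = 2
    }
}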

Related

Time Complexity of Printing a Graph in Adjacency List Representation

What is the order of growth of the running time of the following code if the graph uses an adjacency-list representation, where V is the number of vertices, and E is the total number of edges?
// G.V() returns the number of vertices; G is the graph.
for (int v = 0; v < G.V(); v++) {
    for (int w : G.adj(v)) {
        System.out.println(v + "-" + w);
    }
}
Why is the time complexity of the above code Theta(V+E), where V is the number of vertices and E is the number of edges?
I believe that if we let printing be the cost function, then it should be Theta(sum of degrees of each v) = Theta(2E) = Theta(E) because we enter the inner loop deg(v) times for vertex v.
if we let printing be the cost function, then
Under that assumption, yes, there will be Theta(E) println calls.
However, the execution time generally does not depend only on printing, but also on other instructions such as v++; counting everything, there are Theta(V+E) operations. For instance, on a graph with many vertices and no edges, the v++ increments alone account for Theta(V) work.

Search a word in a matrix runtime complexity

Trying to analyze the runtime complexity of the following algorithm:
Problem: We have an m * n array A consisting of lowercase letters and a target string s. The goal is to determine whether the target string appears in A or not.
algorithm:
for(int i = 0; i < m; i++){
    for(int j = 0; j < n; j++){
        if(A[i][j] is equal to the starting character in s) search(i, j, s)
    }
}

boolean search(int i, int j, target s){
    if(the current position relative to s is the length of s) then we have found the target
    loop through the four possible directions starting from (i, j): {p, q} = {i+1, j} or {i-1, j} or {i, j+1} or {i, j-1}; if the coordinate has never been visited before:
        search(p, q, target s)
}
One runtime complexity analysis that I read is the following:
At each position in the array A, we are first presented with 4 possible directions to explore. After the first round, we are only given 3 possible choices because we can never go back. So the worst runtime complexity is O(m * n * 3**len(s))
However, I disagree with this analysis: even though we are only presented with 3 possible choices each round, we still need to spend one operation checking whether that direction has been visited before. For instance, in Java you would probably use a boolean array to track whether a spot has been visited, so finding out whether a spot has been visited requires a conditional check, which costs one operation. The analysis I mentioned does not seem to take this into account.
What should be the runtime complexity?
update:
Let us suppose that the length of the target string is l and the runtime complexity at a given position in the matrix is T(l). Then we have:
T(l) = 4T(l-1) + 4 = 4(3T(l-2) + 4) + 4 = 4(3(3T(l-3) + 4) + 4) + 4 = ... = 4 * 3**(l-1) + 4 + 4*4 + 4*3*4 + ...
The +4 comes from the fact that we loop through four directions in each round, besides the recursive calls (three of them after the first round).
What should be the runtime complexity?
The mentioned analysis is correct and the complexity is indeed O(m * n * 3**len(s)).
For instance, in java you probably just use a boolean array to track whether one spot has been visited before, so in order to know whether a spot has been visited or not, one needs a conditional check, and that costs one operation.
That is correct and does not contradict the analysis.
The worst case we can construct is a matrix filled with only the letter a and a string aaaa...aaaax (many letters a and one x at the end). If m, n, and len(s) are large enough, almost every call of the search function will generate 3 recursive calls to itself. Each of those calls will generate another 3 calls (giving 9 calls at depth 2), each of which will generate another 3 calls (giving 27 calls at depth 3), and so on. Checking the current string character, the conditional checks, and spawning a recursive call are all O(1), so the complexity of the whole search function is O(3**len(s)).
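As a sanity check, here is a small Java sketch (my own, not from the answer) that evaluates the call-count recurrence implied by this worst case, calls(l) = 1 + 3 * calls(l - 1): the counts grow by roughly a factor of 3 per remaining character, i.e. Theta(3**l).

public class CallCount {
    // calls(l): number of search() invocations when l characters of the
    // word remain unmatched; each invocation spawns up to 3 more.
    static long calls(int l) {
        if (l == 0) return 1;         // base case: only the length check happens
        return 1 + 3 * calls(l - 1);  // this call plus 3 recursive branches
    }

    public static void main(String[] args) {
        for (int l = 1; l <= 10; l++) {
            System.out.println("len = " + l + ", calls = " + calls(l));
        }
    }
}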
The solution is brute force. We have to touch each point on the board, which makes for O(m*n) operations.
Then, for each point, we have to run dfs() to check whether the word exists there. So we get
O(m * n * time complexity of dfs)
This is a dfs written in Python. Let us examine its time complexity:
def dfs(r, c, i):
    # O(1)
    if i == len(word):
        return True
    # O(1)
    # path is a set, implemented as a hash table,
    # so the membership lookup is O(1)
    if r < 0 or c < 0 or r >= ROWS or c >= COLS or word[i] != board[r][c] or (r, c) in path:
        return False
    # O(1)
    path.add((r, c))
    # up to 4 recursive calls
    res = (dfs(r + 1, c, i + 1) or
           dfs(r - 1, c, i + 1) or
           dfs(r, c + 1, i + 1) or
           dfs(r, c - 1, i + 1))
    # O(1)
    path.remove((r, c))
    return res
Since dfs recursively calls itself, think about how many dfs calls can be on the call stack: in the worst case, the length of the word. Note, however, that the recursion depth alone does not bound the total work: each call can branch into up to 3 further calls (we never revisit the cell we came from), so the number of calls per starting cell is O(3**len(word)). That gives
O(m * n * 3**len(word))
in total, matching the analysis above.

Time Complexity of Straight Forward Dijkstra's Algorithm

I am having a hard time seeing the O(mn) bound for the straightforward implementation of Dijkstra's algorithm (without a heap). In my implementation, and others I have found, the main loop iterates n-1 times (once for each vertex other than the source). In each iteration, finding the minimum vertex is O(n) (examining each vertex in the queue and finding the minimum distance to the source), and each extracted minimum vertex has at most n-1 neighbors, so updating all neighbors is O(n). This seems to me to lead to a bound of O(n^2). My implementation is provided below.
public int[] dijkstra(int s) {
    int[] dist = new int[vNum];
    LinkedList<Integer> queue = new LinkedList<>();
    for (int i = 0; i < vNum; i++) {
        queue.add(i);                // add all vertices to the queue
        dist[i] = Integer.MAX_VALUE; // set all initial shortest paths to the max int value
    }
    dist[s] = 0; // the source is 0 away from itself
    while (!queue.isEmpty()) { // iterates over the n vertices, O(n)
        int minV = getMinDist(queue, dist); // get the vertex with minimum distance from the source, O(n)
        queue.remove(Integer.valueOf(minV)); // remove the Integer object, not the element at that index
        for (int neighbor : adjList[minV]) { // O(n), at most n - 1 neighbors
            int shortestPath = dist[minV] + edgeLenghts[minV][neighbor];
            if (shortestPath < dist[neighbor]) {
                dist[neighbor] = shortestPath; // a new shortest path has been found
            }
        }
    }
    return dist;
}
I don't think this is correct, but I am having trouble seeing where m factors in.
Your implementation indeed removes the M factor, at least if we consider only simple graphs (no multiple edges between two vertices). It is O(N^2)! The complexity would be O(N*M) if you iterated through all the possible edges instead of the vertices.
EDIT: Well, it is actually O(M + N^2) to be more specific. Changing a value at some vertex takes O(1) time in your algorithm, and it might happen each time you consider an edge, in other words, M times. That's why there is an M in the complexity.
Unfortunately, if you want to use a simple binary heap, the complexity is going to be O(M log M) (or equivalently O(M log N), since log M <= 2 log N for simple graphs). Why? You are not able to quickly change values in a heap. So if dist[v] suddenly decreases, because you've found a new, better path to v, you can't just change it in the heap, because you don't know its location. You may put a duplicate of v in your heap, but then the size of the heap would be O(M). Even if you store the location and update it cleverly, keeping only O(N) items in the heap, you still have to fix up the heap after each change, which takes O(log(size of heap)). You may change a value up to O(M) times, which gives you O(M log M) (or O(M log N)) complexity.
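To make the "duplicate of v in your heap" idea concrete, here is a Java sketch (my own illustration, not the asker's code; it assumes an adjacency list adj where adj[u] holds {v, weight} pairs). Stale entries are simply skipped when popped, the heap can hold O(M) entries, and the total cost is O(M log M):

import java.util.*;

class LazyDijkstra {
    // adj[u] is a list of {v, weight} pairs (an assumption of this sketch)
    static int[] dijkstra(List<int[]>[] adj, int s) {
        int n = adj.length;
        int[] dist = new int[n];
        Arrays.fill(dist, Integer.MAX_VALUE);
        dist[s] = 0;
        // heap entries are {distance, vertex}, ordered by distance
        PriorityQueue<int[]> pq = new PriorityQueue<>(Comparator.comparingInt((int[] e) -> e[0]));
        pq.add(new int[]{0, s});
        while (!pq.isEmpty()) {
            int[] top = pq.poll();
            int d = top[0], u = top[1];
            if (d > dist[u]) continue; // stale duplicate: a better path was already found
            for (int[] edge : adj[u]) {
                int v = edge[0], w = edge[1];
                if (dist[u] + w < dist[v]) {
                    dist[v] = dist[u] + w;
                    pq.add(new int[]{dist[v], v}); // push a duplicate instead of decrease-key
                }
            }
        }
        return dist;
    }
}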

Count triangles (cycles of length 3) in a graph

In an undirected graph with V vertices and E edges how would you count the number of triangles in O(|V||E|)? I see the algorithm here but I'm not exactly sure how that would be implemented to achieve that complexity. Here's the code presented in that post:
for each edge (u, v):
    for each vertex w:
        if (v, w) is an edge and (w, u) is an edge:
            return true
return false
Would you use an adjacency list representation of the graph to traverse all edges in the outer loop and then an adjacency matrix to check for the existence of the 2 edges in the inner loop?
Also, I saw another solution presented as O(|V||E|) which involves performing a depth-first search on the graph, and when you encounter a backedge (u, v) from the vertex u you're visiting, checking whether the grandparent of the vertex u is vertex v. If it is, then you have found a triangle. Is this algorithm correct? If so, wouldn't it be O(|V|+|E|)? In the post I linked to there is a counterexample for the breadth-first search solution offered up, but based on the examples I came up with, it seems like the depth-first search method I outlined above works.
Firstly, note that the algorithm does not so much count the number of triangles, but rather returns whether one exists at all.
For the first algorithm, the analysis becomes simple if we assume that we can look up whether (a, b) is an edge in constant time. (Since we loop over all vertices for every edge and only do constant-time work, we get O(|V||E| * 1).) Telling whether something is a member of a set in constant time can be done using, for example, a hash table/set. We could also, as you said, use the adjacency matrix, which we could create beforehand by looping over all the edges, without changing our total complexity.
An adjacency list representation could perhaps be used for looping over the edges, but traversing it may be O(|V|+|E|), giving a total complexity of O(|V||V| + |V||E|), which may be more than we wanted. If that is the case, we should instead traverse the lists once first and add all the edges to a plain collection (like a list), as in the sketch below.
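For concreteness, here is a short Java sketch of that first algorithm (my own illustration; it assumes the edge list edges, an int[][] of {u, v} pairs, and the boolean adjacency matrix adjMatrix have already been built as described above):

class TriangleCheck {
    static boolean hasTriangle(int[][] edges, boolean[][] adjMatrix) {
        int n = adjMatrix.length;
        for (int[] e : edges) {            // |E| iterations
            int u = e[0], v = e[1];
            for (int w = 0; w < n; w++) {  // |V| iterations per edge
                // two O(1) lookups close the triangle u-v-w
                if (adjMatrix[v][w] && adjMatrix[w][u]) {
                    return true;
                }
            }
        }
        return false;
    }
}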
For your proposed DFS algorithm, the problem is that we cannot be sure to encounter a certain edge as a backedge at the correct moment, as is illustrated by the following counterexample:
A -- B --- C -- D
      \   /     |
       \ /      |
        E ----- F
Here, if we search along A-B-C-E and then find the backedge E-B, we correctly find the triangle; but if we instead go A-B-C-D-F-E, the backedges E-B and E-C no longer satisfy our condition.
This is a naive approach to counting the number of triangles.
We need the input in the form of an adjacency matrix.
public int countTricycles(int[][] adj) {
    int n = adj.length;
    int count = 0;
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            if (adj[i][j] != 0) { // edge i-j exists
                for (int k = 0; k < n; k++) {
                    // i, j, k form a triangle if edges j-k and i-k also exist
                    if (k != i && adj[j][k] != 0 && adj[i][k] != 0) {
                        count++;
                    }
                }
            }
        }
    }
    // each triangle is counted once per ordered (i, j, k) triple, i.e. 3! = 6 times
    return count / 6;
}
The complexity would be O(n^3).
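As a quick sanity check (my own illustration; it assumes countTricycles is made static, or is called on an instance of its enclosing class), the triangle graph K3 contains exactly one triangle, which the triple loop counts 3! = 6 times before the division by 6:

public static void main(String[] args) {
    int[][] k3 = {
        { 0, 1, 1 },
        { 1, 0, 1 },
        { 1, 1, 0 }
    };
    // assuming countTricycles is static in the same class
    System.out.println(countTricycles(k3)); // prints 1
}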

Running time of minimum spanning tree? ( Prim method )

I have written code that solves MST using the Prim method. I read that this kind of implementation (using a priority queue) should have O(E + VlogV) = O(VlogV), where E is the number of edges and V the number of vertices, but when I look at my code it simply doesn't look that way. I would appreciate it if someone could clear this up for me.
To me it seems the running time is this:
The while loop takes O(E) time (until we go through all the edges).
Inside that loop we extract an element from the Q, which takes O(logE) time.
And the second inner loop takes O(V) time (although we don't run this loop every time, it is clear that it will be run V times since we have to add all the vertices).
My conclusion would be that the running time is O(E(logE + V)) = O(E*V).
This is my code:
#define p_int pair < int, int >

int N, M; // N - number of vertices, M - number of edges
int graph[100][100] = { 0 }; // adjacency matrix
bool in_tree[100] = { false }; // whether a node is in the MST
priority_queue< p_int, vector < p_int >, greater < p_int > > Q;
/*
  Keeps track of the smallest edge connecting a node in the MST and a node
  outside the tree. The first part of the pair is the weight of the edge and
  the second is the node. We don't remember the parent node because we don't
  need it :-)
*/
int mst_prim()
{
    Q.push( make_pair( 0, 0 ) );
    int nconnected = 0;
    int mst_cost = 0;
    while( nconnected < N )
    {
        p_int node = Q.top(); Q.pop();
        if( in_tree[ node.second ] == false )
        {
            mst_cost += node.first;
            in_tree[ node.second ] = true;
            for( int i = 0; i < N; ++i )
                if( graph[ node.second ][i] > 0 && in_tree[i] == false )
                    Q.push( make_pair( graph[ node.second ][i], i ) );
            nconnected++;
        }
    }
    return mst_cost;
}
You can use adjacency lists to speed your solution up (though not for dense graphs), but even then, you are not going to get O(V log V) without a Fibonacci heap.
Maybe the Kruskal algorithm would be simpler for you to understand. It features no priority queue; you only have to sort an array of edges once. It goes like this, basically:
Insert all edges into an array and sort them by weight
Iterate over the sorted edges, and for each edge connecting nodes i and j, check if i and j are connected. If they are, skip the edge, else add the edge into the MST.
The only catch is to be quickly able to say if two nodes are connected. For this you use the Union-Find data structure, which goes like this:
int T[MAX_NODES]; // MAX_NODES: the maximum number of nodes

int getParent(int a)
{
    if (T[a] == -1) return a;
    return T[a] = getParent(T[a]); // path compression
}

void Unite(int a, int b)
{
    if (rand() & 1)
        T[a] = b;
    else
        T[b] = a;
}
In the beginning, just initialize T to all -1. Then, every time you want to find out whether nodes A and B are connected, just compare their parents - if they are the same, they are connected (that is, getParent(A) == getParent(B)). When you insert an edge into the MST, make sure to update the Union-Find with Unite(getParent(A), getParent(B)).
The analysis is simple: you sort the edges in O(E logE) and then iterate over them, using Union-Find operations that take practically O(1) (amortized, thanks to the path compression). So it is O(E logE + E), which equals O(E logE).
That is it ;-)
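Putting the outline together, here is a compact Java sketch of Kruskal (my own illustration; it uses a path-compressing find instead of the randomized Unite above, and stores edges as {weight, u, v} triples):

import java.util.*;

class KruskalSketch {
    static int[] parent; // parent[a] < 0 means a is a root

    static int find(int a) {
        if (parent[a] < 0) return a;
        return parent[a] = find(parent[a]); // path compression
    }

    // edges are {weight, u, v} triples; returns the total MST cost
    static long mst(int n, int[][] edges) {
        parent = new int[n];
        Arrays.fill(parent, -1);
        Arrays.sort(edges, Comparator.comparingInt((int[] e) -> e[0])); // O(E log E)
        long cost = 0;
        for (int[] e : edges) {
            int ru = find(e[1]), rv = find(e[2]);
            if (ru != rv) {      // different components: the edge joins them
                parent[ru] = rv;
                cost += e[0];
            }
        }
        return cost;
    }
}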
I have not dealt with this algorithm before, but what you have implemented does not match the algorithm as explained on Wikipedia. The algorithm there works as follows.
Put all vertices into the queue. O(V)
While the queue is not empty... O(V)
Take the vertex with the minimum weight from the queue. O(log(V))
Update the weights of adjacent vertices. O(E / V), the average number of adjacent vertices.
Reestablish the queue structure. O(log(V))
This gives
O(V) + O(V) * (O(log(V)) + O(E / V))
= O(V) + O(V) * O(log(V)) + O(V) * O(E / V)
= O(V) + O(V * log(V)) + O(E)
= O(V * log(V)) + O(E)
exactly what one expects.
