Inverse of adjacency list in O(|V| + |E|) - algorithm

Let G = (V, E) be a directed graph, given in the adjacency list format. Define a directed graph G' = (V, E') where an edge (u, v) ∈ E' if and only if (v, u) ∈ E (namely, G' reverses the direction of each edge in G). Describe an algorithm to obtain an adjacency list representation of G' in O(|V| + |E|) time.
Is there a simple way to invert the adjacency list?
Say it went from:
a-> b
b-> de
c-> c
d-> ab
e->
to:
a-> d
b-> ad
c-> c
d-> b
e-> b

Let's say you iterate the adjacency lists of all vertices in the graph as follows.
for each u in V:
    for each v in adjacency_list[u]:
        reverse_adjacency_list[v].append(u)
With this approach, you visit the adjacency lists of all |V| vertices, which contributes O(|V|) to the overall time complexity of your algorithm. Also, as you traverse all of those adjacency lists, you effectively traverse all the edges in the graph. In other words, if you concatenated all the adjacency lists you traverse, the length of the resulting list would be |E|. Thus, another O(|E|) is contributed to the overall complexity.
Consequently, the time complexity will be O(|V| + |E|) with this pretty standard approach, and you do not need to devise any peculiar method to achieve this complexity.
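A minimal Python sketch of the loop above, assuming the graph is stored as a dict that has a key for every vertex (the function name reverse_adjacency_list and the dict layout are choices made for this illustration, not part of the question):

```python
def reverse_adjacency_list(adjacency_list):
    """Return the adjacency list of the graph with every edge reversed."""
    # Create an empty list for every vertex first, so that vertices with
    # no incoming edges still appear in the result: O(|V|).
    reversed_list = {u: [] for u in adjacency_list}
    # Visit every edge (u, v) exactly once and record it as (v, u): O(|E|).
    for u, neighbours in adjacency_list.items():
        for v in neighbours:
            reversed_list[v].append(u)
    return reversed_list


# The graph from the question: a->b, b->d,e, c->c, d->a,b, e has no out-edges.
graph = {"a": ["b"], "b": ["d", "e"], "c": ["c"], "d": ["a", "b"], "e": []}
reversed_graph = reverse_adjacency_list(graph)
for vertex, neighbours in reversed_graph.items():
    print(vertex, "->", "".join(neighbours))
```

Appending to a Python list is amortized O(1), so the two loops together stay within O(|V| + |E|).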

Related

Find the N highest cost vertices that have a path to S, where S is a vertex in an undirected graph G

I would like to know what would be the most efficient way (w.r.t. space and time) to solve the following problem:
Given an undirected graph G = (V, E), a positive number N and a vertex S in V. Assume that every vertex in V has a cost value. Find the N highest cost vertices that are connected to S.
For example:
G = (V, E)
V = {v1, v2, v3, v4},
E = {(v1, v2),
(v1, v3),
(v2, v4),
(v3, v4)}
v1 cost = 1
v2 cost = 2
v3 cost = 3
v4 cost = 4
N = 2, S = v1
result: {v3, v4}
This problem can be solved easily with a graph traversal algorithm (e.g., BFS or DFS). To find the vertices connected to S, we can run either BFS or DFS starting from S. As the space and time complexity of BFS and DFS is the same (i.e., time complexity: O(V+E), space complexity: O(E)), here I am going to show the pseudocode using DFS:
Parameter Definition:
* G -> Graph
* S -> Starting node
* N -> Number of connected (highest cost) vertices to find
* Cost -> Array of size V, contains the vertex cost value
procedure DFS-traversal(G, S, N, Cost):
    let St be a stack
    let Q be a min-priority-queue containing <cost, vertex-id> pairs
    let discovered be an array (of size V) to mark already visited vertices
    St.push(S)
    // Comment: if you do not want to consider the case "S is connected to S"
    // then, you can consider commenting the following line
    Q.push(make-pair(Cost[S], S))
    label S as discovered
    while St is not empty:
        v = St.pop()
        for all edges from v to w in G.adjacentEdges(v) do:
            if w is not labeled as discovered:
                label w as discovered
                St.push(w)
                Q.push(make-pair(Cost[w], w))
                if Q.size() == N + 1:
                    Q.pop()
    let ret be an empty array (it will hold at most N vertices)
    while Q is not empty:
        ret.append(Q.top().second)
        Q.pop()
Let's describe the process first. Here, I run the iterative version of DFS to traverse the graph starting from S. During the traversal, I use a priority queue to keep the N highest cost vertices that are reachable from S. Instead of the priority queue, we can use a simple array (or we can even reuse the discovered array) to keep a record of the reachable vertices with their costs.
Analysis of space-complexity:
To store the graph: O(E)
Priority-queue: O(N)
Stack: O(V)
For labeling discovered: O(V)
So, as O(E) is the dominating term here, we can consider O(E) as the overall space complexity.
Analysis of time-complexity:
DFS-traversal: O(V+E)
To track the N highest cost vertices:
    By maintaining a priority queue: O(V*logN)
    Or alternatively using an array: O(V*logV)
The overall time-complexity would be: O(V*logN + E) or O(V*logV + E)
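A minimal runnable sketch of the approach in Python, using the standard heapq module as the min-priority-queue. The function name n_highest_cost_reachable and the dict-based graph layout are assumptions made for this illustration:

```python
import heapq


def n_highest_cost_reachable(adjacency, cost, source, n):
    """Return up to n vertices reachable from source, highest cost first."""
    if n <= 0:
        return []
    discovered = {source}
    stack = [source]
    # Min-heap of (cost, vertex): the cheapest of the current best n is on top.
    # Start with an empty list instead if the source itself should not count.
    best = [(cost[source], source)]
    while stack:
        v = stack.pop()
        for w in adjacency[v]:
            if w not in discovered:
                discovered.add(w)
                stack.append(w)
                heapq.heappush(best, (cost[w], w))
                if len(best) > n:
                    heapq.heappop(best)  # discard the cheapest candidate
    return [vertex for _, vertex in sorted(best, reverse=True)]


# The example from the question.
adjacency = {
    "v1": ["v2", "v3"],
    "v2": ["v1", "v4"],
    "v3": ["v1", "v4"],
    "v4": ["v2", "v3"],
}
cost = {"v1": 1, "v2": 2, "v3": 3, "v4": 4}
result = n_highest_cost_reachable(adjacency, cost, "v1", 2)
print(result)  # ['v4', 'v3']
```

Because the heap never holds more than N + 1 entries, each push or pop costs O(log N), which is where the O(V*logN + E) bound comes from.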

Kruskal's Algorithm: Update MST when an edge becomes mandatory

Given an undirected graph G = (V, E), I am first asked for the cost of its MST.
I can easily find it using Kruskal's algorithm, like this:
G = (V, E)
for each edge (u, v) in E sorted by weight
{
    if(Find(u) != Find(v))
    {
        Add (u, v) to the MST
        Union(u, v); // put u and v in the same set
    }
}
After that, for each edge in the initial graph, I am asked what the cost of the new MST will be if that edge must be present in the minimum spanning tree.
If an edge is already present in the MST, the answer remains the same. Otherwise, I can run Kruskal once again. The pseudocode is the following:
G = (V, E)
G1 = runKruskal(G)
for each edge (u, v) in E
{
    ClearUnionSets()
    if (u, v) in G1
    {
        print costOf(G1)
    } else {
        Union(u, v)
        G2 = runKruskal(G)
        print costOf(G2)
    }
}
The problem with that approach is that the total complexity would be O(E*E).
My question is whether there exists a better solution for updating the MST as described above.
What I was thinking is that when running Kruskal for the first time, for every edge (u, v) where u and v are already in the same set, I find the maximum weight edge already present in the partial MST that makes a cycle with (u, v) and store that information in a matrix M at M[u][v]. Doing this, the problem of updating the MST when an edge becomes mandatory would be solved in O(1).
Can anyone help me with this?
For every edge u-v that is not on the MST, the smallest spanning tree including the edge is the one where u-v replaces the largest edge on the path from u to v on the MST.
The edge to be replaced can be found efficiently as follows. First, root the MST at an arbitrary vertex. We will modify the algorithm that finds the lowest common ancestor (LCA) of two vertices from a table of 2^i-th parents. In addition to storing the 2^i-th parent for each vertex, we will also store the largest edge on the path to the 2^i-th parent. Using this table, while we calculate the LCA we will also calculate the largest edge on the path to the LCA, which gives us the largest edge on the path between the two vertices.
Preprocessing involves finding the MST in O(E log E) and building the parent table for LCA in O(N log N), with the requirement of O(N log N) space, where N is the number of vertices. After this, finding the cost of the modified MST for each edge requires only one LCA evaluation, which can be performed in O(log N). Thus the total complexity is only O(E log E).
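A minimal Python sketch of that table, assuming the MST has already been computed and is passed in as (u, v, weight) edges over vertices numbered 0..n-1. The names max_edge_queries and max_edge are made up for this illustration:

```python
def max_edge_queries(n, tree_edges):
    """Return a function giving the heaviest edge weight on the tree path u-v."""
    adj = [[] for _ in range(n)]
    for u, v, w in tree_edges:
        adj[u].append((v, w))
        adj[v].append((u, w))

    levels = max(1, (n - 1).bit_length())  # enough for jumps up to depth n - 1
    depth = [0] * n
    # up[i][v] is the 2^i-th parent of v (the root is its own parent);
    # heavy[i][v] is the heaviest edge on the path from v to that parent.
    up = [[0] * n for _ in range(levels)]
    heavy = [[float("-inf")] * n for _ in range(levels)]

    # Root the tree at vertex 0 with an iterative DFS.
    visited = [False] * n
    visited[0] = True
    stack = [0]
    while stack:
        u = stack.pop()
        for v, w in adj[u]:
            if not visited[v]:
                visited[v] = True
                depth[v] = depth[u] + 1
                up[0][v] = u
                heavy[0][v] = w
                stack.append(v)

    for i in range(1, levels):
        for v in range(n):
            mid = up[i - 1][v]
            up[i][v] = up[i - 1][mid]
            heavy[i][v] = max(heavy[i - 1][v], heavy[i - 1][mid])

    def max_edge(u, v):
        best = float("-inf")
        if depth[u] < depth[v]:
            u, v = v, u
        diff = depth[u] - depth[v]
        for i in range(levels):  # lift u up to the depth of v
            if (diff >> i) & 1:
                best = max(best, heavy[i][u])
                u = up[i][u]
        if u == v:
            return best
        for i in reversed(range(levels)):  # lift both to just below the LCA
            if up[i][u] != up[i][v]:
                best = max(best, heavy[i][u], heavy[i][v])
                u, v = up[i][u], up[i][v]
        return max(best, heavy[0][u], heavy[0][v])

    return max_edge


mst_edges = [(0, 1, 1), (1, 2, 2), (2, 3, 3)]
mst_cost = sum(w for _, _, w in mst_edges)
max_edge = max_edge_queries(4, mst_edges)
# Forcing a non-tree edge (0, 3) of weight 5 into the spanning tree:
forced_cost = mst_cost - max_edge(0, 3) + 5
print(forced_cost)  # 8
```

For a non-tree edge (u, v) of weight w, the cost of the cheapest spanning tree that contains it is then mst_cost - max_edge(u, v) + w; for an edge already in the MST the answer is simply mst_cost.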

single-source shortest algorithm

Let G = (V, E) be a weighted, directed graph with weight function w : E → R. Give an O(VE)-time algorithm to find, for each vertex v ∈ V, the value δ*(v) = min_{u∈V} {δ(u, v)}.
I don't understand the question. Could someone give me some ideas?
This basically means:
* "G = (V, E)": a graph with vertex set V and edge set E.
* "weighted, directed graph with weight function w : E → R": the graph is directed and weighted, and each edge has a real-valued weight.
* "O(VE)-time algorithm": find an algorithm whose number of operations is proportional to the number of vertices multiplied by the number of edges.
* "for each vertex v ∈ V, the value δ*(v) = min_{u∈V} {δ(u, v)}": δ(u, v) is not defined here, but it is most probably the shortest-path weight from u to v, that is, the smallest total edge weight over all paths from u to v. So for each vertex v you are asked for the smallest such value over all possible start vertices u.
And the answer to your question is Bellman-Ford. One way to stay within O(VE) is to run it once with every distance initialized to 0 instead of ∞, which acts like an extra source joined to every vertex by a zero-weight edge.
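A minimal Python sketch of one way to get δ*(v) for all v from a single Bellman-Ford run: every distance starts at 0 (since δ(v, v) = 0, that is always a valid upper bound), which behaves like an extra source with a zero-weight edge to every vertex. The function name and the edge-list format are assumptions made for this illustration, and the sketch assumes δ is the shortest-path weight:

```python
def min_distance_into_each_vertex(vertices, edges):
    """Return {v: min over u of delta(u, v)}; edges are (u, v, weight) triples."""
    # Every vertex reaches itself with weight 0, so start all distances at 0.
    dist = {v: 0 for v in vertices}
    # Standard Bellman-Ford relaxation: at most |V| - 1 passes over all edges.
    for _ in range(len(vertices) - 1):
        changed = False
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                changed = True
        if not changed:
            break
    # If an edge can still be relaxed, some distances are unbounded below.
    for u, v, w in edges:
        if dist[u] + w < dist[v]:
            raise ValueError("graph contains a negative-weight cycle")
    return dist


vertices = ["a", "b", "c"]
edges = [("a", "b", -2), ("b", "c", -1), ("c", "a", 4)]
delta_star = min_distance_into_each_vertex(vertices, edges)
print(delta_star)  # {'a': 0, 'b': -2, 'c': -3}
```

Each pass touches every edge once, so the whole run is O(VE).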

Explanation of Prim's algorithm

I have to implement Prim's algorithm using a min-heap based priority queue. If my graph contained the vertices A, B, C, and D with the below undirected adjacency list... [it is sorted as (vertex name, weight to adjacent vertex)]
A -> B,4 -> D,3
B -> A,4 -> C,1 -> D,7
C -> B,1
D -> B,7 -> A,3
Rough Graph:
A-4-B-1-C
|   /
3  7
| /
D
What would the priority queue look like? I have no idea what I should put into it. Should I put everything? Should I put just A, B, C and D? I have no clue and I would really like an answer.
Prim's: grow the tree by adding the edge of min weight with exactly one end in the tree.
The PQ contains the edges with one end in the tree.
Start with vertex 0 added to the tree and add all edges incident to vertex 0 into the PQ.
DeleteMin() will give you the min weight edge (v, w). If w is not yet in the tree, you add the edge to the MST and add all edges incident to w into the PQ.
Is this enough to get you started?
---
So, in your example, in the first iteration, the MST will contain vertex A, and the PQ will contain the 2 edges going out from A:
A-4-B
A-3-D
Here's Prim's algorithm:
Choose a node.
Mark it as visited.
Place all edges from this node into a priority queue (sorted to give smallest weights first).
While queue not empty:
    pop edge from queue
    if both ends are visited, continue
    add this edge to your minimum spanning tree
    add all edges coming out of the end node that hasn't been visited to the queue
    mark that node as visited
So to answer your question, you put the edges in from one node.
If you put all of the edges into the priority queue, you've got Kruskal's algorithm, which is also used for minimum spanning trees.
The running time depends on how you represent your graph. With adjacency lists, Kruskal's is O(E log E) and Prim's is O(E log V), unless you use a Fibonacci heap, in which case Prim's can achieve O(E + V log V).
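A minimal Python sketch of the edge-based version, using heapq as the priority queue and the graph from the question. The function name prim_mst and the (neighbour, weight) list layout are assumptions made for this illustration, and the graph is assumed to be connected:

```python
import heapq


def prim_mst(adjacency, start):
    """Return the MST edges of a connected undirected graph as (u, v, weight)."""
    visited = {start}
    # Heap entries are (weight, u, v): u is in the tree, v may not be yet.
    queue = [(w, start, v) for v, w in adjacency[start]]
    heapq.heapify(queue)
    tree = []
    while queue:
        w, u, v = heapq.heappop(queue)
        if v in visited:
            continue  # both ends are already in the tree
        visited.add(v)
        tree.append((u, v, w))
        for nxt, weight in adjacency[v]:
            if nxt not in visited:
                heapq.heappush(queue, (weight, v, nxt))
    return tree


graph = {
    "A": [("B", 4), ("D", 3)],
    "B": [("A", 4), ("C", 1), ("D", 7)],
    "C": [("B", 1)],
    "D": [("B", 7), ("A", 3)],
}
mst = prim_mst(graph, "A")
print(mst)  # [('A', 'D', 3), ('A', 'B', 4), ('B', 'C', 1)]
```

Starting from A, the queue first holds only A-4-B and A-3-D. Edges whose far end has already joined the tree are skipped when they are popped rather than removed eagerly, which keeps the code simple at the cost of a larger queue.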
You can assign weights (keys) to your vertices, then use a priority queue based on these weights. This is a reference from the wiki: http://en.wikipedia.org/wiki/Prim's_algorithm
MST-PRIM (G, w, r) {
    for each u ∈ G.V
        u.key = ∞
        u.parent = NIL
    r.key = 0
    Q = G.V
    while (Q ≠ ∅)
        u = Extract-Min(Q)
        for each v ∈ G.Adj[u]
            if (v ∈ Q) and w(u,v) < v.key
                v.parent = u
                v.key = w(u,v)
}
Q will be your priority queue. You can use a struct to hold the information of the vertices.

Graph - Square of a directed graph

Yes, this is a homework-style question (I am self-learning, not for university), but I am not asking for a solution. Instead, I am hoping to clarify the question itself.
In CLRS 3rd edition, page 593, exercise 22.1-5,
The square of a directed graph G = (V, E) is the graph G^2 = (V, E^2) such that (u,v) ∈ E^2 if and only if G contains a path with at most two edges between u and v. Describe efficient algorithms for computing G^2 from G for both the adjacency-list and adjacency-matrix representations of G. Analyze the running times of your algorithms.
However, in CLRS 2nd edition (I can't find the book link any more), page 530, there is the same exercise but with a slightly different description:
The square of a directed graph G = (V, E) is the graph G^2 = (V, E^2) such that (u,w) ∈ E^2 if and only if for some v ∈ V, both (u,v) ∈ E and (v,w) ∈ E. That is, G^2 contains an edge between u and w whenever G contains a path with exactly two edges between u and w. Describe efficient algorithms for computing G^2 from G for both the adjacency-list and adjacency-matrix representations of G. Analyze the running times of your algorithms.
For the old exercise with "exactly two edges", I can understand and solve it. For example, for the adjacency list, I just follow v->neighbour->neighbour.neighbour, then add (v, neighbour.neighbour) to the new E^2.
But for the new exercise with "at most two edges", I am confused.
What does "if and only if G contains a path with at most two edges between u and v" mean?
Since one edge meets the condition "at most two edges", if u and v have only one path, which contains only one edge, should I add (u, v) to E^2?
What if u and v have a path with 2 edges, but also have another path with 3 edges, can I add (u, v) to E^2?
Yes, that's exactly what it means. E^2 should contain (u,v) iff E contains (u,v) or there is a w in V such that E contains both (u,w) and (w,v).
In other words, E^2 according to the new definition is the union of E and E^2 according to the old definition.
Regarding your last question: it doesn't matter what other paths between u and v exist (if they do). So, if there are two paths between u and v, one with 2 edges and one with 3 edges, then (u,v) should be in E^2 (according to both definitions).
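A minimal Python sketch of the adjacency-list case covering both definitions. The function name square_graph and the include_one_edge_paths flag are made up for this illustration: the flag set to True gives the "at most two edges" version, and False gives the "exactly two edges" version.

```python
def square_graph(adjacency_list, include_one_edge_paths=True):
    """Return the adjacency list of G^2 for a directed graph G."""
    squared = {}
    for u, neighbours in adjacency_list.items():
        # New definition: every original edge (u, v) is also an edge of G^2.
        reachable = set(neighbours) if include_one_edge_paths else set()
        # Both definitions: add (u, w) whenever (u, v) and (v, w) are edges.
        for v in neighbours:
            reachable.update(adjacency_list[v])
        squared[u] = sorted(reachable)
    return squared


graph = {1: [2], 2: [3], 3: []}
at_most_two = square_graph(graph)
exactly_two = square_graph(graph, include_one_edge_paths=False)
print(at_most_two)  # {1: [2, 3], 2: [3], 3: []}
print(exactly_two)  # {1: [3], 2: [], 3: []}
```

The set removes duplicates when several two-edge paths lead to the same vertex, so each edge of G^2 is stored once.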
The square of a graph G, G^2, is defined by those vertices V' for which d(u,v) <= 2, and the edges E' of G^2 are all those edges of G which have both end vertices in V'.
