Prim's MST algorithm in O(|V|^2) - algorithm

Time complexity of Prim's MST algorithm is O(|V|^2) if you use adjacency matrix representation.
I am trying to implement Prim's algorithm using adjacency matrix. I am using this
as a reference.
V = {1,2,...,n}
U = {1}
T = NULL
while V != U:
    /*
       Now this implementation means that
       I find lowest cost edge in O(n).
       How do I do that using adjacency list?
    */
    let (u, v) be the lowest cost edge
        such that u is in U and v is in V - U;
    T = T + {(u,v)}
    U = U + {v}
EDIT:
I understand Prim's algorithm very well.
I know how to implement it efficiently using heaps and priority queues.
I also know about better algorithms.
I want to use adjacency matrix representation of graph and get O(|V|^2) implementation.
I WANT THE INEFFICIENT IMPLEMENTATION

Finding the lowest cost edge (u,v), such that u is in U and v is in V-U, is done with a priority queue. More precisely, the priority queue contains each node v from V-U together with the lowest cost edge from v into the current tree U. In other words, the queue contains exactly |V-U| elements.
After adding a new node u to U, you have to update the priority queue by checking whether the neighboring nodes of u can now be reached by an edge of lower cost than previously.
The choice of priority queue determines the time complexity. You will get O(|V|^2) by implementing the priority queue as a simple array cheapest_edges[1..|V|]. That's because finding the minimum in this queue takes O(|V|) time, and you repeat that |V| times.
In pseudo-code:
V = {1,2,...,n}
U = {1}
T = NULL
P = array, for each v in V - U set P[v] = (1,v)
while V != U
    (u,v) = P[v] with v in V - U such that length P[v] is minimal
    T = T + {(u,v)}
    U = U + {v}
    for each w in V - U adjacent to v
        if length (v,w) < length P[w] then
            P[w] = (v,w)
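For what it's worth, here is how the array-based version might look in Python with an adjacency matrix. It is only a sketch: the name prim_dense, the use of math.inf to mark a missing edge, and starting from vertex 0 are my own choices, not part of the pseudocode above.

```python
import math

def prim_dense(weights):
    """Return MST edges (u, v, cost) for a connected undirected graph.

    weights[i][j] is the edge cost, or math.inf when there is no edge
    (an assumption of this sketch).
    """
    n = len(weights)
    in_tree = [False] * n
    best_cost = [math.inf] * n   # cheapest known edge from the tree to each vertex
    best_from = [-1] * n         # tree endpoint of that cheapest edge
    best_cost[0] = 0             # start growing the tree from vertex 0
    tree_edges = []
    for _ in range(n):
        # A linear scan plays the role of the priority queue: O(|V|) per round.
        v = min((i for i in range(n) if not in_tree[i]), key=lambda i: best_cost[i])
        if best_cost[v] == math.inf:
            raise ValueError("graph is not connected")
        in_tree[v] = True
        if best_from[v] != -1:
            tree_edges.append((best_from[v], v, weights[best_from[v]][v]))
        # Row v of the matrix updates the cheapest edge into each outside vertex.
        for w in range(n):
            if not in_tree[w] and weights[v][w] < best_cost[w]:
                best_cost[w] = weights[v][w]
                best_from[w] = v
    return tree_edges

INF = math.inf
graph = [
    [INF, 4, INF, 3],
    [4, INF, 1, 7],
    [INF, 1, INF, INF],
    [3, 7, INF, INF],
]
mst = prim_dense(graph)
```

The outer loop runs |V| times and does two O(|V|) scans per round, which is where the O(|V|^2) bound comes from.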

You do it as in Dijkstra's algorithm: select the node that is connected to your current partial tree by the minimum cost edge (one that doesn't create a cycle). I think Wikipedia explains Prim better than the pseudocode you have. Give it a look and let me know if you have more questions.

You can sort the edges by the cost and then iterate the edges in the order of the cost, and if that edge joins two distinct subgraphs use that edge.
I have an implementation here. It reads the number of vertices (N), the number of edges (M) and the edges in order (A, B, Cost) and then outputs the edges. This is Kruskal's algorithm.
An implementation of Prim's algorithm with a heap, using the same input, can be found here.
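To make the sort-and-join idea concrete, here is a minimal Python sketch of Kruskal's algorithm. It is not the linked implementation: the function name kruskal, the 0-based vertex labels and the (a, b, cost) tuple layout are assumptions of mine, and the union-find is kept deliberately simple.

```python
def kruskal(n, edges):
    """n vertices labelled 0..n-1; edges is a list of (a, b, cost) tuples."""
    parent = list(range(n))

    def find(x):
        # Follow parent links to the component root, halving paths on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    chosen = []
    for a, b, cost in sorted(edges, key=lambda e: e[2]):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:          # the edge joins two distinct subgraphs
            parent[root_a] = root_b   # merge them
            chosen.append((a, b, cost))
    return chosen

example_edges = [(0, 1, 4), (0, 3, 3), (1, 2, 1), (1, 3, 7)]
tree = kruskal(4, example_edges)
```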

Related

Vertex Cover of a Tree Linear or Polynomial Time?

I have the following algorithm to find the minimum vertex cover of a tree, that is, a minimum-size set S of vertices such that for every edge (v,u) in G either v is in S or u is in S.
I have been told the algorithm has linear time complexity. However, I don't understand how this is the case: isn't the number of edges incident to u of the order O(n), so that the complexity would be O(n^2)?
Let T = <V, E> be a Tree. That is, the vertex set is V, the edge set is E. Also suppose the cover set = C. The algorithm can be described as follows:
while V != [] do
    Identify a leaf vertex v
    Locate u = parent(v), the parent vertex of v.
    Add u to C
    Remove all the edges incident to u
return C.
In a tree, |E| = |V| - 1, so there are O(n) edges to deal with in total.
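One way to see the linear bound is to write the leaf-and-parent greedy so that every edge is touched a constant number of times. The Python sketch below is my own arrangement, not the asker's code: it assumes the tree is given as an adjacency dict, roots it at an arbitrary vertex, and processes vertices deepest-first instead of physically deleting edges.

```python
from collections import deque

def tree_vertex_cover(adj, root=0):
    """adj maps each vertex to a list of its neighbours in an undirected tree."""
    parent = {root: None}
    order = []
    queue = deque([root])
    while queue:                      # BFS: each edge is examined twice in total
        v = queue.popleft()
        order.append(v)
        for w in adj[v]:
            if w not in parent:
                parent[w] = v
                queue.append(w)
    cover = set()
    for v in reversed(order):         # children are handled before their parents
        p = parent[v]
        if p is not None and v not in cover and p not in cover:
            cover.add(p)              # edge (v, p) is still uncovered: take the parent
    return cover

path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
path_cover = tree_vertex_cover(path)
star_cover = tree_vertex_cover(star)
```

Both loops are bounded by the sum of the degrees, which is 2|E| = 2(|V| - 1), so the total work is O(n).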

Find the N highest cost vertices that have a path to S, where S is a vertex in an undirected Graph G

I would like to know, what would be the most efficient way (w.r.t., Space and Time) to solve the following problem:
Given an undirected Graph G = (V, E), a positive number N and a vertex S in V. Assume that every vertex in V has a cost value. Find the N highest cost vertices that are connected to S.
For example:
G = (V, E)
V = {v1, v2, v3, v4},
E = {(v1, v2),
(v1, v3),
(v2, v4),
(v3, v4)}
v1 cost = 1
v2 cost = 2
v3 cost = 3
v4 cost = 4
N = 2, S = v1
result: {v3, v4}
This problem can be solved easily with a graph traversal algorithm (e.g., BFS or DFS). To find the vertices connected to S, we can run either BFS or DFS starting from S. As the space and time complexity of BFS and DFS are the same (i.e., time complexity: O(V+E), space complexity: O(E)), here I am going to show the pseudocode using DFS:
Parameter Definition:
* G -> Graph
* S -> Starting node
* N -> Number of connected (highest cost) vertices to find
* Cost -> Array of size V, contains the vertex cost value
procedure DFS-traversal(G,S,N,Cost):
    let St be a stack
    let Q be a min-priority-queue containing <cost, vertex-id> pairs
    let discovered be an array (of size V) to mark already visited vertices
    St.push(S)
    // Comment: if you do not want to consider the case "S is connected to S"
    // then, you can consider commenting the following line
    Q.push(make-pair(Cost[S], S))
    label S as discovered
    while St is not empty
        v = St.pop()
        for all edges from v to w in G.adjacentEdges(v) do
            if w is not labeled as discovered:
                label w as discovered
                St.push(w)
                Q.push(make-pair(Cost[w], w))
                if Q.size() == N + 1:
                    Q.pop()
    let ret be an empty array
    while Q is not empty:
        ret.append(Q.top().second)
        Q.pop()
    return ret
Let's describe the process first. Here, I run the iterative version of DFS to traverse the graph starting from S. During the traversal, I use a priority queue to keep the N highest cost vertices that are reachable from S. Instead of the priority queue, we can use a simple array (or even reuse the discovered array) to keep a record of the reachable vertices with their costs.
Analysis of space-complexity:
    To store the graph: O(E)
    Priority-queue: O(N)
    Stack: O(V)
    For labeling discovered: O(V)
So, as O(E) is the dominating term here, we can consider O(E) as the overall space complexity.
Analysis of time-complexity:
    DFS-traversal: O(V+E)
    To track N highest cost vertices:
        By maintaining priority-queue: O(V*logN)
        Or alternatively using array: O(V*logV)
The overall time-complexity would be: O(V*logN + E) or O(V*logV + E)
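For reference, the same idea in runnable Python might look like the sketch below. The function name top_n_reachable, the adjacency dict and the cost dict are assumptions made for the example; heapq stands in for the min-priority-queue and is capped at N entries.

```python
import heapq

def top_n_reachable(adj, cost, start, n):
    """Return the n highest-cost vertices reachable from start (start included)."""
    if n <= 0:
        return []
    discovered = {start}
    stack = [start]
    heap = [(cost[start], start)]     # min-heap of (cost, vertex), at most n entries
    while stack:
        v = stack.pop()
        for w in adj[v]:
            if w not in discovered:
                discovered.add(w)
                stack.append(w)
                heapq.heappush(heap, (cost[w], w))
                if len(heap) > n:
                    heapq.heappop(heap)   # drop the current cheapest vertex
    return sorted((v for _, v in heap), key=lambda v: cost[v], reverse=True)

adj = {"v1": ["v2", "v3"], "v2": ["v1", "v4"], "v3": ["v1", "v4"], "v4": ["v2", "v3"]}
cost = {"v1": 1, "v2": 2, "v3": 3, "v4": 4}
result = top_n_reachable(adj, cost, "v1", 2)
```

On the example graph from the question this yields v4 and v3, matching the expected result {v3, v4}.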

Clarkson's 2-approximation Weighted Vertex Cover Algorithm Runtime analysis

A well-known 2-approximation for a Minimum Weighted Vertex Cover Problem is the one proposed by Clarkson:
Clarkson, Kenneth L. "A modification of the greedy algorithm for vertex cover." Information Processing Letters 16.1 (1983): 23-25.
Easy-to-read pseudo code of the algorithm can be found here (see section 32.1.2).
The algorithm, according to the paper, has a runtime complexity of O(|E|*log|V|) where E is the set of edges and V the set of vertices. I'm not entirely sure how they get this result.
Let d(v) be the degree of vertex v in a graph, and w(v) be some weight function.
Excluding some of the technicalities from the algorithm, the algorithm looks like this:
while( |E| != 0){ //While there are still edges in the graph
    Pick a vertex v \in V for which w(v)/d(v) is minimized;
    for( u : (u,v) \in E){
        update w(u);
        ...
    }
    delete v and all edges incident to it from the graph.
}
The outer loop produces the term |E| in the runtime complexity. That means that picking a vertex out of a list of vertices which minimizes some ratio can be done in log n time. As far as I can tell, finding a minimum value out of a list of values takes n-1 comparisons, not log n. Finally, the inner for loop runs for every neighbor of v, so yields a complexity of d(v) which is dominated by n-1. Hence I would conclude that the algorithm has a runtime complexity of O(|E|*|V|).
What am I missing here?
Keep the vertices in a balanced binary search tree ordered by w(v)/d(v). Finding the min is O(log |V|). Each time we delete an edge uv, we have to update u's key (by removing it and reinserting it into the tree with the new key), which takes time O(log |V|). Each of these steps is done at most |E| times.
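To illustrate the bookkeeping, here is a rough Python sketch that uses heapq with lazy deletion in place of a balanced search tree: every key change pushes a fresh entry, and stale entries are skipped when popped. Two things are assumptions on my part and not taken from the pseudocode in the question: the update rule w(u) -= w(v)/d(v), which is my reading of the usual description of the algorithm, and the function and variable names.

```python
import heapq

def clarkson_vertex_cover(weight, edges):
    """weight: dict vertex -> positive weight; edges: list of (u, v) pairs."""
    adj = {v: set() for v in weight}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    residual = dict(weight)           # w(v), lowered as neighbours are picked
    heap = [(residual[v] / len(adj[v]), v) for v in adj if adj[v]]
    heapq.heapify(heap)
    cover = []
    while heap:
        ratio, v = heapq.heappop(heap)            # O(log n) minimum
        # Stale entry: v has no edges left, or its key changed after this push.
        if not adj[v] or ratio != residual[v] / len(adj[v]):
            continue
        cover.append(v)
        for u in adj[v]:
            adj[u].discard(v)                     # delete edge (u, v)
            residual[u] -= ratio                  # assumed update rule, see lead-in
            if adj[u]:                            # one O(log n) push per deleted edge
                heapq.heappush(heap, (residual[u] / len(adj[u]), u))
        adj[v] = set()
    return cover

star_weight = {0: 10, 1: 1, 2: 1, 3: 1}
star_edges = [(0, 1), (0, 2), (0, 3)]
star_cover = clarkson_vertex_cover(star_weight, star_edges)
```

Each edge deletion causes at most one push, so the heap sees O(|E|) operations of O(log |V|) each.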

Maximum weighted path between two vertices in a directed acyclic Graph

Love some guidance on this problem:
G is a directed acyclic graph. You want to move from vertex c to vertex z. Some edges reduce your profit and some increase your profit. How do you get from c to z while maximizing your profit? What is the time complexity?
Thanks!
The problem has an optimal substructure. To find the longest path from vertex c to vertex z, we first need to find the longest path from c to all the predecessors of z. Each problem of these is another smaller subproblem (longest path from c to a specific predecessor).
Let's denote the predecessors of z as u1,u2,...,uk and let dist[z] be the length of the longest path from c to z. Then dist[z] = max over i of (dist[ui] + w(ui,z)).
[Illustration: z with 3 predecessors, edge weights omitted.]
So to find the longest path to z we first need to find the longest path to its predecessors and take the maximum over (their values plus their edges weights to z).
This requires that whenever we visit a vertex u, all of u's predecessors have already been analyzed and computed.
So the question is: for any vertex u, how do we make sure that once we set dist[u], dist[u] will never be changed later on? Put another way: how do we make sure that we have considered all paths from c to u before considering any edge originating at u?
Since the graph is acyclic, we can guarantee this condition by finding a topological sort of the graph. A topological sort is like a chain of vertices where all edges point left to right. So if we are at vertex vi then we have considered all paths leading to vi and have the final value of dist[vi].
The time complexity: topological sort takes O(V+E). In the worst case where z is a leaf and all other vertices point to it, we will visit all the graph edges which gives O(V+E).
Let f(u) be the maximum profit you can get going from c to u in your DAG. Then you want to compute f(z). This can be easily computed in linear time using dynamic programming/topological sorting.
Initialize f(u) = -infinity for every u other than c, and f(c) = 0. Then, proceed computing the values of f in some topological order of your DAG. Thus, as the order is topological, for every incoming edge of the node being computed, the other endpoints are calculated, so just pick the maximum possible value for this node, i.e. f(u) = max(f(v) + cost(v, u)) for each incoming edge (v, u).
It's better to use topological sorting instead of Bellman-Ford since it's a DAG.
Source:- http://www.utdallas.edu/~sizheng/CS4349.d/l-notes.d/L17.pdf
EDIT:
G is a DAG with negative edges.
"Some edges reduce your profit and some increase your profit"
Edges that increase profit: positive value
Edges that decrease profit: negative value
After the topological sort (TS), for each vertex u in TS order, relax each outgoing edge.
dist[] = {-INF, -INF, ...}
dist[c] = 0 // source
for every vertex u in topological order
    if (u == z) break; // dest vertex
    for every adjacent vertex v of u
        if (dist[v] < (dist[u] + weight(u, v))) // < for longest path = max profit
            dist[v] = dist[u] + weight(u, v)
ans = dist[z];
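Put together, the topological sort plus relaxation could be written in Python roughly as follows. This is a sketch under my own assumptions: vertices are numbered 0..n-1, edges come as (u, v, profit) triples, and Kahn's algorithm is used for the topological order (any valid order would do).

```python
import math
from collections import deque

def max_profit_path(n, edges, source, target):
    """edges: list of (u, v, profit) triples describing a DAG on vertices 0..n-1."""
    adj = [[] for _ in range(n)]
    indegree = [0] * n
    for u, v, w in edges:
        adj[u].append((v, w))
        indegree[v] += 1
    # Kahn's algorithm: repeatedly output a vertex with no unprocessed predecessor.
    queue = deque(v for v in range(n) if indegree[v] == 0)
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v, _ in adj[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)
    dist = [-math.inf] * n
    dist[source] = 0
    for u in order:
        if dist[u] == -math.inf:
            continue                  # u cannot be reached from the source
        for v, w in adj[u]:
            dist[v] = max(dist[v], dist[u] + w)   # relax for the longest path
    return dist[target]

dag = [(0, 1, 5), (0, 2, -2), (1, 3, -4), (2, 3, 10)]
best = max_profit_path(4, dag, 0, 3)
```

A result of -inf means the target is not reachable from the source.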

Explanation of Prim's algorithm

I have to implement Prim's algorithm using a min-heap based priority queue. If my graph contained the vertices A, B, C, and D with the below undirected adjacency list... [it is sorted as (vertex name, weight to adjacent vertex)]
A -> B,4 -> D,3
B -> A,4 -> C,1 -> D,7
C -> B,1
D -> B,7 -> A,3
Rough Graph:
A-4-B-1-C
|   /
3  7
| /
D
What would the priority queue look like? I have no idea what I should put into it. Should I put everything? Should I put just A B C and D. I have no clue and I would really like an answer.
Prim's: grow the tree by adding the edge of min weight with exactly one end in the tree.
The PQ contains the edges with one end in the tree.
Start with vertex 0 added to the tree and add all edges incident to 0 into the PQ.
DeleteMin() will give you the min weight edge (v, w); you add it to the MST and add all edges incident to w into the PQ.
Is this enough to get you started?
---
So, in your example, in the first iteration the MST will contain vertex A, and the PQ will contain the 2 edges going out from A:
A-4-B
A-3-D
Here's Prim's algorithm:
Choose a node.
Mark it as visited.
Place all edges from this node into a priority queue (sorted to give smallest weights first).
While queue not empty:
    pop edge from queue
    if both ends are visited, continue
    add this edge to your minimum spanning tree
    add all edges coming out of the node that hasn't been visited to the queue
    mark that node as visited
So to answer your question, you put the edges in from one node.
If you put all of the edges into the priority queue, you've got Kruskal's algorithm, which is also used for minimum spanning trees.
The running time depends on how you represent your graph. With adjacency lists, Kruskal's is O(E log E) and Prim's is O(E log V), unless you use a Fibonacci heap, in which case you can achieve O(E + V log V).
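As a concrete illustration of what goes into the queue, here is a small Python sketch of this edge-based version run on the A, B, C, D graph from the question. The function name prim_lazy and the (weight, from, to) tuple layout are my own choices; heapq is used as the min-heap.

```python
import heapq

def prim_lazy(adj, start):
    """adj maps a vertex to a list of (neighbour, weight) pairs."""
    visited = {start}
    # The queue holds edges leaving the tree, keyed by weight.
    heap = [(w, start, v) for v, w in adj[start]]
    heapq.heapify(heap)
    mst = []
    while heap:
        w, u, v = heapq.heappop(heap)
        if v in visited:
            continue                  # both ends already in the tree
        visited.add(v)
        mst.append((u, v, w))
        for x, wx in adj[v]:
            if x not in visited:
                heapq.heappush(heap, (wx, v, x))
    return mst

graph = {
    "A": [("B", 4), ("D", 3)],
    "B": [("A", 4), ("C", 1), ("D", 7)],
    "C": [("B", 1)],
    "D": [("B", 7), ("A", 3)],
}
mst = prim_lazy(graph, "A")
```

Starting from A, the queue first holds A-4-B and A-3-D; the edges are then taken in the order A-D, A-B, B-C, and the leftover D-7-B entry is discarded because both of its ends are already visited.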
You can assign weights to your vertices, then use a priority queue based on these weights. This is a reference from Wikipedia: http://en.wikipedia.org/wiki/Prim's_algorithm
MST-PRIM (G, w, r) {
    for each u ∈ G.V
        u.key = ∞
        u.parent = NIL
    r.key = 0
    Q = G.V
    while (Q ≠ ø)
        u = Extract-Min(Q)
        for each v ∈ G.Adj[u]
            if (v ∈ Q) and w(u,v) < v.key
                v.parent = u
                v.key = w(u,v)
}
Q will be your priority queue. You can use a struct to hold the information for each vertex (its key and parent).

Resources