Explanation of Prim's algorithm

I have to implement Prim's algorithm using a min-heap based priority queue. Suppose my graph contains the vertices A, B, C, and D with the undirected adjacency list below [each entry is (adjacent vertex name, weight to that vertex)]:
A -> B,4 -> D,3
B -> A,4 -> C,1 -> D,7
C -> B,1
D -> B,7 -> A,3
Rough Graph:
A-4-B-1-C
|  /
3 7
| /
D
What would the priority queue look like? I have no idea what I should put into it. Should I put every edge in it, or just A, B, C, and D? I have no clue and would really appreciate an answer.

Prim's: grow the tree by adding the edge of min weight with exactly one end in the tree.
The PQ contains the edges with one end in the tree.
Start with vertex 0 added to the tree and put all edges incident to 0 into the PQ.
DeleteMin() gives you the min-weight edge (v, w); add it to the MST and put all edges incident to w (whose other end is not yet in the tree) into the PQ.
Is this enough to get you started?
---
So, in your example, in the first iteration the MST will contain vertex A, and the PQ will contain the 2 edges going out from A:
A-4-B
A-3-D
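To make this concrete, here is a tiny Python sketch (an assumption of mine: the queue holds (weight, from, to) tuples in a heapq, so the lightest edge is popped first) showing what the queue contains at this point:

import heapq

pq = []
heapq.heappush(pq, (4, 'A', 'B'))   # edge A-4-B
heapq.heappush(pq, (3, 'A', 'D'))   # edge A-3-D

print(heapq.heappop(pq))            # (3, 'A', 'D') -- the next edge added to the MST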

Here's Prim's algorithm:
Choose a node.
Mark it as visited.
Place all edges from this node into a priority queue (sorted to give smallest weights first).
While the queue is not empty:
    pop an edge from the queue
    if both ends are visited, continue
    add this edge to your minimum spanning tree
    add all edges coming out of the endpoint that hasn't been visited to the queue
    mark that endpoint as visited
So to answer your question, you put the edges in from one node.
If you put all of the edges into the priority queue, you've got Kruskal's algorithm, which is also used for minimum spanning trees.
The running time depends on how you represent your graph. With adjacency lists, Kruskal's is O(E log E) and Prim's is O(E log V), unless you use a Fibonacci heap, in which case Prim's achieves O(E + V log V).
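As a sketch of the procedure above in Python (assumptions of mine: the graph is a dict mapping each vertex to a list of (neighbor, weight) pairs, mirroring the adjacency list in the question, and the priority queue is a heapq of (weight, u, v) tuples):

import heapq

def prim(graph, start):
    """Lazy Prim's: graph is {vertex: [(neighbor, weight), ...]}, undirected."""
    visited = {start}
    mst = []
    # Heap of (weight, u, v) for edges with u already in the tree.
    pq = [(w, start, v) for v, w in graph[start]]
    heapq.heapify(pq)
    while pq and len(visited) < len(graph):
        w, u, v = heapq.heappop(pq)
        if v in visited:              # both ends already in the tree, skip
            continue
        visited.add(v)
        mst.append((u, v, w))
        for nxt, nw in graph[v]:      # push edges leaving the newly added vertex
            if nxt not in visited:
                heapq.heappush(pq, (nw, v, nxt))
    return mst

graph = {'A': [('B', 4), ('D', 3)],
         'B': [('A', 4), ('C', 1), ('D', 7)],
         'C': [('B', 1)],
         'D': [('B', 7), ('A', 3)]}
print(prim(graph, 'A'))   # [('A', 'D', 3), ('A', 'B', 4), ('B', 'C', 1)]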

You can assign weights (keys) to your vertices and then use a priority queue based on these weights. This is the reference pseudocode from the wiki: http://en.wikipedia.org/wiki/Prim's_algorithm
MST-PRIM(G, w, r) {
    for each u ∈ G.V
        u.key = ∞
        u.parent = NIL
    r.key = 0
    Q = G.V
    while (Q ≠ ø)
        u = Extract-Min(Q)
        for each v ∈ G.Adj[u]
            if (v ∈ Q) and w(u,v) < v.key
                v.parent = u
                v.key = w(u,v)
}
Q will be your priority queue. You can use a struct to hold the per-vertex information (key and parent).
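As a rough illustration (the names Vertex and mst_prim, and the dict-based graph format, are my own), here is a Python sketch that keeps the key/parent fields of MST-PRIM in a small struct-like class and handles the decrease-key lazily by pushing duplicate heap entries:

import heapq
from dataclasses import dataclass

@dataclass
class Vertex:                          # the "struct" holding the vertex information
    name: str
    key: float = float('inf')
    parent: str = None

def mst_prim(graph, r):
    """graph: {name: [(neighbor, weight), ...]}. Returns {vertex: parent}."""
    V = {name: Vertex(name) for name in graph}
    V[r].key = 0
    in_q = set(graph)                  # Q = G.V
    heap = [(0, r)]
    while heap:
        key, u = heapq.heappop(heap)   # Extract-Min(Q)
        if u not in in_q:              # stale duplicate entry, skip it
            continue
        in_q.remove(u)
        for v, w in graph[u]:
            if v in in_q and w < V[v].key:
                V[v].parent = u        # the relaxation step of MST-PRIM
                V[v].key = w
                heapq.heappush(heap, (w, v))
    return {v.name: v.parent for v in V.values()}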

Related

Find the N highest cost vertices that have a path to S, where S is a vertex in an undirected graph G

I would like to know what would be the most efficient way (w.r.t. space and time) to solve the following problem:
Given an undirected graph G = (V, E), a positive number N and a vertex S in V. Assume that every vertex in V has a cost value. Find the N highest cost vertices that are connected to S.
For example:
G = (V, E)
V = {v1, v2, v3, v4},
E = {(v1, v2),
(v1, v3),
(v2, v4),
(v3, v4)}
v1 cost = 1
v2 cost = 2
v3 cost = 3
v4 cost = 4
N = 2, S = v1
result: {v3, v4}
This problem can be solved easily by a graph traversal algorithm (e.g., BFS or DFS). To find the vertices connected to S, we can run either BFS or DFS starting from S. As the space and time complexities of BFS and DFS are the same (time complexity: O(V+E), space complexity: O(E)), here I am going to show the pseudocode using DFS:
Parameter Definition:
* G -> Graph
* S -> Starting node
* N -> Number of connected (highest cost) vertices to find
* Cost -> Array of size V, contains the vertex cost value
procedure DFS-traversal(G, S, N, Cost):
    let St be a stack
    let Q be a min-priority-queue of <cost, vertex-id> pairs
    let discovered be an array (of size V) to mark already visited vertices
    St.push(S)
    // Comment: if you do not want to consider the case "S is connected to S",
    // you can comment out the following line
    Q.push(make-pair(Cost[S], S))
    label S as discovered
    while St is not empty
        v = St.pop()
        for all edges from v to w in G.adjacentEdges(v) do
            if w is not labeled as discovered:
                label w as discovered
                St.push(w)
                Q.push(make-pair(Cost[w], w))
                if Q.size() == N + 1:
                    Q.pop()
    let ret be an N-sized array
    while Q is not empty:
        ret.append(Q.top().second)
        Q.pop()
Let's first describe the process. Here, I run the iterative version of DFS to traverse the graph starting from S. During the traversal, I use a priority queue to keep the N highest cost vertices that are reachable from S. Instead of the priority queue, we can use a simple array (or even reuse the discovered array) to keep a record of the reachable vertices together with their costs.
Analysis of space-complexity:
To store the graph: O(E)
Priority-queue: O(N)
Stack: O(V)
For labeling discovered: O(V)
So, as O(E) is the dominating term here, we can consider O(E) as the overall space complexity.
Analysis of time-complexity:
DFS-traversal: O(V+E)
To track N highest cost vertices:
By maintaining priority-queue: O(V*logN)
Or alternatively using array: O(V*logV)
The overall time-complexity would be: O(V*logN + E) or O(V*logV + E)
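A compact Python version of the pseudocode above (the function name and the dict-based graph/cost representation are my own choices) that keeps the min-heap capped at N entries:

import heapq

def n_highest_cost_connected(graph, cost, s, n):
    """graph: {vertex: [neighbors]}, cost: {vertex: value}.
    Returns up to n vertices reachable from s with the highest costs."""
    heap = [(cost[s], s)]          # min-heap of (cost, vertex), capped at n entries
    discovered = {s}
    stack = [s]
    while stack:
        v = stack.pop()
        for w in graph[v]:
            if w not in discovered:
                discovered.add(w)
                stack.append(w)
                heapq.heappush(heap, (cost[w], w))
                if len(heap) > n:  # drop the cheapest, keeping the n largest seen so far
                    heapq.heappop(heap)
    return [v for _, v in heap]

graph = {'v1': ['v2', 'v3'], 'v2': ['v1', 'v4'],
         'v3': ['v1', 'v4'], 'v4': ['v2', 'v3']}
cost = {'v1': 1, 'v2': 2, 'v3': 3, 'v4': 4}
print(n_highest_cost_connected(graph, cost, 'v1', 2))   # ['v3', 'v4']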

Algorithm for finding weight of path with lowest weight in weighted directed graph

I am given a G=(V,E) directed graph, and all of its edges have weight of either "0" or "1".
I'm given a vertex named "A" in the graph, and for each v in V, I need to find the weight of the path from A to v which has the lowest weight, in time O(V+E).
I have to use only BFS or DFS (although this is probably a BFS problem).
I thought about making a new graph where vertices that have an edge of 0 between them are united and then running BFS on it, but that would ruin the graph direction (this would work if the graph were undirected, or if the weights were {2,1}, where for an edge of 2 I would create a new vertex).
I would appreciate any help.
Thanks
I think it can be done with a combination of DFS and BFS.
In the original BFS for an unweighted graph, we have the invariant that unexplored nodes have a distance greater than or equal to that of explored nodes.
In our BFS, for each node we first do a DFS through all 0-weighted edges, mark down the distance, and mark those nodes as explored. Then we can continue with the other nodes in our BFS.
Array Seen[] = false
Empty queue Q
E' = {(a, b) | w(a, b) = 0 and (a, b) is in E}

DFS(V, E', u)
    Seen[u] = true
    Enqueue(Q, u)
    for each v adjacent to u in E'      // (u, v) is an edge of weight 0
        if Seen[v] = false
            v.dist = u.dist
            DFS(V, E', v)

BFS(V, E, source)
    source.dist = 0
    DFS(V, E', source)                  // marks and enqueues source and its whole 0-weight closure
    while Q is not empty
        u = Dequeue(Q)
        for each v adjacent to u in E
            if Seen[v] = false          // 0-weight neighbors of u are already seen
                v.dist = u.dist + 1
                DFS(V, E', v)           // mark and enqueue v, spread v.dist over 0-weight edges
After running the BFS, you have the shortest distance from the source to every node. If you only want the shortest distance to a single node, simply terminate when you see the destination node. And yes, it meets the required O(V+E) time complexity.
This problem can be reduced to the Single Source Shortest Path problem.
You just need to reverse all the edge directions and find the minimum distance of each vertex v from the vertex A.
It can easily be observed that if in the initial graph we had a minimal path from some vertex v to A, then after changing the edge directions we would have the same minimal path from A to v.
This can be done either by Dijkstra, or, as the edges have only two weights {0, 1}, by a modified BFS (first visit vertices at distance 0, then 1, then 2, and so on).
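That "modified BFS" is commonly realized as a 0-1 BFS with a double-ended queue. A sketch, assuming the graph is given as a dict {u: [(v, w), ...]} with w in {0, 1} (apply it to the reversed graph if you follow the edge-reversal idea above):

from collections import deque

def zero_one_bfs(graph, source):
    """Returns the minimum path weight from source to every vertex, in O(V + E)."""
    dist = {u: float('inf') for u in graph}
    dist[source] = 0
    dq = deque([source])
    while dq:
        u = dq.popleft()
        for v, w in graph[u]:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                if w == 0:
                    dq.appendleft(v)   # 0-edge: v belongs to the same "level"
                else:
                    dq.append(v)       # 1-edge: v is one level further away
    return dist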

Given an undirected graph G = (V, E), determine whether G is a complete graph

I'm pretty sure this problem is P and not NP, but I'm having difficulty coming up with a polynomially bound algorithm to solve it.
You can:
check that the number of edges in the graph is n(n-1)/2;
check that each vertex is connected to exactly n-1 distinct vertices.
This will run in O(V²), which is polynomial.
Hope it helped.
Here's an O(|E|) algorithm that also has a small constant.
It's trivial to enumerate every edge in a complete graph. So all you need to do is scan your edge list and verify that every such edge exists.
For each edge (i, j), let f(i, j) = i*|V| + j. Assuming vertices are numbered 0 to |V|-1.
Let bitvec be a bit vector of length |V|², initialized to 0.
For each edge (i, j), set bitvec[f(i, j)] = 1.
G is a complete graph if and only if every element of bitvec == 1.
This algorithm not only touches E once, but it's also completely vectorizable if you have a scatter instruction. That also means it's trivial to parallelize.
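For example, a Python sketch of this bit-vector check (assumptions: a bytearray stands in for the bit vector, vertices are numbered 0 to |V|-1, self-loops are ignored, and both orientations of each undirected edge are marked, so only the off-diagonal entries are verified):

def is_complete(num_vertices, edges):
    n = num_vertices
    bitvec = bytearray(n * n)            # the bit vector of length |V|^2
    for i, j in edges:
        if i != j:                       # ignore self-loops
            bitvec[i * n + j] = 1        # f(i, j) = i*|V| + j
            bitvec[j * n + i] = 1        # undirected: mark both directions
    return all(bitvec[i * n + j]
               for i in range(n) for j in range(n) if i != j)

print(is_complete(3, [(0, 1), (0, 2), (1, 2)]))   # True
print(is_complete(3, [(0, 1), (1, 2)]))           # False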
Here is an O(E) algorithm:
Scan the graph in O(E), which is just the time needed to read the input.
Meanwhile, record each vertex p's degree; increase the degree only if the neighbor is not p itself (a self-loop) and is not a vertex q for which an edge between p and q has already been counted (a parallel edge). These checks can be done in O(1).
Check whether every vertex's degree is |V|-1; this step is O(V). If yes, then it is a complete graph.
The total is O(E).
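A possible Python rendering of this degree-counting check (the edge-list input format and the function name are my own; a set of already-counted pairs filters out self-loops and parallel edges):

def is_complete_by_degree(num_vertices, edges):
    degree = [0] * num_vertices
    counted = set()
    for p, q in edges:
        if p == q:                       # self-loop, ignore
            continue
        key = (min(p, q), max(p, q))
        if key in counted:               # parallel edge, already counted
            continue
        counted.add(key)
        degree[p] += 1
        degree[q] += 1
    return all(d == num_vertices - 1 for d in degree)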
For a given graph G = (V,E), check for each pair u, v in the V, and see if edge (u,v) is in E.
The total number of u, v pairs is |V|*(|V|-1)/2. As a result, with a time complexity of O(|V|²), you can check whether a graph is complete or not.

Finding a New Minimum Spanning Tree After a New Edge Was Added to The Graph

Let G = (V, E) be a weighted, connected and undirected graph and let T be a minimum spanning tree. Let e be any edge not in E (with weight W(e)).
Prove or disprove:
T U {e} is an edge set that contains a minimum spanning tree of G' = (V, E U {e}).
Well, it sounds true to me, so I decided to prove it but I just get stuck every time...
For example, if e is the new edge with minimum weight, who can promise us that the edges in T weren't chosen in a bad way that would prevent us from obtaining a new minimum weight without the 'help' of other edges in E - T?
I would appreciate any help,
Thanks in advance.
Let [a(1), a(2), ..., a(n-1)] be a sequence of edges selected from E to construct MST of G by Kruskal's algorithm (in the order they were selected - weight(a(i)) <= weight(a(i + 1))).
Let's now consider how Kruskal's Algorithm behaves being given as input E' = E U {e}.
Let i = min{i: weight(e) < weight(a(i))}. First, the algorithm decides to choose the edges [a(1), ..., a(i - 1)] (e hasn't been processed yet, so it behaves the same). Then it needs to decide on e. If e is dropped, the solution for E' will be the same as for E. So let's suppose that the first i edges selected by the algorithm are [a(1), ..., a(i - 1), e]; I will call this new sequence a'. The algorithm continues, and as long as its subsequent selections (for j > i) satisfy a'(j) = a(j - 1) we are fine. There are two scenarios that break such a streak (let's say the streak breaks at index k + 1):
1) The algorithm selects some edge e' that is not in T, with weight(e') < weight(a(k+1)). By now the a' sequence is:
[a(1), ..., a(i-1), e, a(i), a(i+1), ..., a(k-1), a(k), e']
But if it were possible to append e' to this list, it would also be possible to append it to [a(1), ..., a(k-1), a(k)]. Yet Kruskal's algorithm didn't do that when looking for the MST of G. That leads to a contradiction.
2) Algorithm politely selected:
[a(1), ..., a(i-1), e, a(i), a(i+1), ..., a(k-1), a(k)]
but decided to drop edge a(k+1). However, if e were not present in the list, the algorithm would decide to append a(k+1). That means that in the graph (V, {a(1), ..., a(k)}) edge a(k+1) would connect the same components as edge e. And that means that after the algorithm considers edge a(k + 1), in the case of both G and G' the division into connected components (determined by the set of selected edges) is the same. So after processing a(k+1) the algorithm will proceed in the same way in both cases.
Whenever an edge is added to a graph without adding a node, that edge creates a cycle in the minimum spanning tree of the graph; the cycle length may vary from 2 to n, where n = number of nodes in the graph.
T = minimum spanning tree of G
Now, to find the MST for (T + added edge), we just have to remove one edge from that cycle: remove the edge which has the maximum weight.
So T' always comes from T U {e}.
And if you think this doesn't prove that the new MST will be an edge set of T U {e}, then analyse Kruskal's algorithm for the new graph: if e is of minimum weight it must have been selected for the MST according to Kruskal's algorithm, and likewise here, if it is of minimum weight it cannot be removed from the cycle.
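That cycle argument translates directly into an update procedure: add e = (u, v, w) to T, find the unique u-v path in T, and drop the heaviest edge on the resulting cycle (if that heaviest edge is e itself, T is unchanged). A rough Python sketch, assuming T is given as a list of (u, v, weight) tuples (the function names are my own):

def update_mst(tree_edges, new_edge):
    u, v, _ = new_edge
    adj = {}
    for a, b, wt in tree_edges:
        adj.setdefault(a, []).append((b, wt))
        adj.setdefault(b, []).append((a, wt))

    def tree_path(start, goal):
        # Iterative DFS returning the edges on the unique tree path start -> goal.
        stack = [(start, None, [])]
        while stack:
            node, parent, path = stack.pop()
            if node == goal:
                return path
            for nxt, wt in adj.get(node, []):
                if nxt != parent:
                    stack.append((nxt, node, path + [(node, nxt, wt)]))
        return []

    cycle = tree_path(u, v) + [new_edge]
    heaviest = max(cycle, key=lambda edge: edge[2])
    if heaviest is new_edge:
        return list(tree_edges)                 # e is dropped, T stays optimal
    a, b, wt = heaviest                         # remove the heaviest tree edge on the cycle
    return [e for e in tree_edges
            if e not in ((a, b, wt), (b, a, wt))] + [new_edge]

tree = [('A', 'B', 4), ('A', 'D', 3), ('B', 'C', 1)]
print(update_mst(tree, ('C', 'D', 2)))   # [('A', 'D', 3), ('B', 'C', 1), ('C', 'D', 2)]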

Prim's MST algorithm in O(|V|^2)

Time complexity of Prim's MST algorithm is O(|V|^2) if you use adjacency matrix representation.
I am trying to implement Prim's algorithm using an adjacency matrix. I am using this as a reference.
V = {1, 2, ..., n}
U = {1}
T = NULL
while V != U:
    /*
     * Now this implementation means that
     * I find lowest cost edge in O(n).
     * How do I do that using adjacency list?
     */
    let (u, v) be the lowest cost edge
        such that u is in U and v is in V - U;
    T = T + {(u,v)}
    U = U + {v}
EDIT:
I understand Prim's algorithm very well.
I know how to implement it efficiently using heaps and priority queues.
I also know about better algorithms.
I want to use adjacency matrix representation of graph and get O(|V|^2) implementation.
I WANT THE INEFFICIENT IMPLEMENTATION
Finding the lowest cost edge (u,v), such that u is in U and v is in V-U, is done with a priority queue. More precisely, the priority queue contains each node v from V-U together with the lowest cost edge from v into the current tree U. In other words, the queue contains exactly |V-U| elements.
After adding a new node u to U, you have to update the priority queue by checking whether the neighboring nodes of u can now be reached by an edge of lower cost than previously.
The choice of priority queue determines the time complexity. You will get O(|V|^2) by implementing the priority queue as a simple array cheapest_edges[1..|V|]. That's because finding the minimum in this queue takes O(|V|) time, and you repeat that |V| times.
In pseudo-code:
V = {1, 2, ..., n}
U = {1}
T = NULL
P = array, for each v set P[v] = (1, v)
while V != U
    (u, v) = P[v], where v in V - U is chosen so that length P[v] is minimal
    T = T + {(u, v)}
    U = U + {v}
    for each w adjacent to v
        if length (v, w) < length P[w] then
            P[w] = (v, w)
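Here is roughly what that looks like in Python against an adjacency matrix (assumptions on my part: INF marks a missing edge and the function name is mine). Both the minimum search and the update are plain array scans, which gives the requested O(|V|^2):

INF = float('inf')

def prim_adjacency_matrix(adj):
    """adj: n x n matrix of edge weights, adj[i][j] == INF if there is no edge.
    Returns the MST as a list of (u, v, weight) edges in O(|V|^2)."""
    n = len(adj)
    in_tree = [False] * n
    best_weight = [INF] * n    # cheapest known edge connecting each vertex to the tree
    best_from = [-1] * n       # the tree endpoint of that edge
    best_weight[0] = 0
    mst = []
    for _ in range(n):
        # O(|V|) scan for the cheapest fringe vertex (the "priority queue" is an array).
        v = min((i for i in range(n) if not in_tree[i]), key=lambda i: best_weight[i])
        in_tree[v] = True
        if best_from[v] != -1:
            mst.append((best_from[v], v, best_weight[v]))
        # O(|V|) update of the cheapest edges after adding v.
        for w in range(n):
            if not in_tree[w] and adj[v][w] < best_weight[w]:
                best_weight[w] = adj[v][w]
                best_from[w] = v
    return mst

# The A, B, C, D graph from the first question (0=A, 1=B, 2=C, 3=D):
adj = [[INF, 4, INF, 3],
       [4, INF, 1, 7],
       [INF, 1, INF, INF],
       [3, 7, INF, INF]]
print(prim_adjacency_matrix(adj))   # [(0, 3, 3), (0, 1, 4), (1, 2, 1)]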
You do it like in Dijkstra's algorithm, by selecting the node that is connected to your current partial tree with the minimum cost edge (that doesn't generate a cycle). I think wikipedia explains Prim better than that pseudocode you have. Give it a look and let me know if you have more questions.
You can sort the edges by the cost and then iterate the edges in the order of the cost, and if that edge joins two distinct subgraphs use that edge.
I have an implementation here. It reads the number of vertices (N), the number of edges (M) and the edges in order (A, B, Cost), and then outputs the edges of the tree. This is Kruskal's algorithm.
An implementation of Prim's algorithm with a heap, using the same input, can be found here.
