Use Dijkstra's to find a Minimum Spanning Tree? - algorithm

Dijkstra's is typically used to find the shortest distance between two nodes in a graph. Can it be used to find a minimum spanning tree? If so, how?
Edit: This isn't homework, but I am trying to understand a question on an old practice exam.

The answer is no. To see why, let's first articulate the question like so:
Q: For a connected, undirected, weighted graph G = (V, E, w) with only nonnegative edge weights, does the predecessor subgraph produced by Dijkstra's Algorithm form a minimum spanning tree of G?
(Note that undirected graphs are a special class of directed graphs, so it is perfectly ok to use Dijkstra's Algorithm on undirected graphs. Furthermore, MST's are defined only for connected, undirected graphs, and are trivial if the graph is not weighted, so we must restrict our inquiry to these graphs.)
A: Dijkstra's Algorithm at every step greedily selects the next edge that is closest to some source vertex s. It does this until s is connected to every other vertex in the graph. Clearly, the predecessor subgraph that is produced is a spanning tree of G, but is the sum of edge weights minimized?
Prim's Algorithm, which is known to produce a minimum spanning tree, is highly similar to Dijkstra's Algorithm, but at each stage it greedily selects the next edge that is closest to any vertex currently in the working MST at that stage. Let's use this observation to produce a counterexample.
Counterexample: Consider the undirected graph G = (V, E, w) where
V = { a, b, c, d }
E = { (a,b), (a,c), (a,d), (b,d), (c,d) }
w = {
( (a,b) , 5 )
( (a,c) , 5 )
( (a,d) , 5 )
( (b,d) , 1 )
( (c,d) , 1 )
}
Take a as the source vertex.
Dijkstra's Algorithm takes edges { (a,b), (a,c), (a,d) }.
Thus, the total weight of this spanning tree is 5 + 5 + 5 = 15.
Prim's Algorithm takes edges { (a,d), (b,d), (c,d) }.
Thus, the total weight of this spanning tree is 5 + 1 + 1 = 7.
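The counterexample can be checked with a short script. This is only a sketch: the helper names (`build_graph`, `grow_tree`, `tree_weight`) are made up for illustration, and the single `key` parameter is one way to show that the two algorithms differ only in how a candidate vertex is prioritised.

```python
import heapq

# The counterexample graph from above, as undirected weighted edges.
EDGES = [("a", "b", 5), ("a", "c", 5), ("a", "d", 5), ("b", "d", 1), ("c", "d", 1)]


def build_graph(edges):
    """Build an undirected adjacency map {vertex: [(neighbor, weight), ...]}."""
    graph = {}
    for u, v, w in edges:
        graph.setdefault(u, []).append((v, w))
        graph.setdefault(v, []).append((u, w))
    return graph


def grow_tree(graph, source, key):
    """Grow a spanning tree from `source` and return {vertex: (parent, weight)}.

    key(d, w) is the priority of reaching a new vertex over an edge of
    weight w from a vertex that was itself extracted with priority d.
    """
    best = {source: 0}
    parent = {}
    done = set()
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue  # stale heap entry
        done.add(u)
        for v, w in graph[u]:
            cand = key(d, w)
            if v not in done and cand < best.get(v, float("inf")):
                best[v] = cand
                parent[v] = (u, w)
                heapq.heappush(heap, (cand, v))
    return parent


def tree_weight(parent):
    """Sum of the weights of the tree edges."""
    return sum(w for _, w in parent.values())


graph = build_graph(EDGES)
dijkstra_tree = grow_tree(graph, "a", lambda d, w: d + w)  # Dijkstra: path length
prim_tree = grow_tree(graph, "a", lambda d, w: w)          # Prim: edge weight only
print(tree_weight(dijkstra_tree), tree_weight(prim_tree))
```

Depending on tie-breaking, Prim's may start with a different weight-5 edge than (a,d), but the total weight is 7 either way.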

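Strictly, the answer is no. Dijkstra's algorithm finds shortest paths from a source vertex to the other vertices of a graph. However, a very small change to the algorithm (ordering candidate vertices by the weight of the connecting edge instead of by their total distance from the source) produces another algorithm, Prim's, which does efficiently produce an MST.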
The Algorithm Design Manual is the best book I've found to answer questions like this one.

Prim's algorithm uses the same underlying principle as Dijkstra's algorithm.

I'd keep to a greedy algorithm such as Prim's or Kruskal's. I fear Dijkstra's won't do, simply because it minimizes the cost between pairs of nodes, not for the whole tree.

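Of course, it's possible to use Dijkstra for a minimum spanning tree, with one change to the update step:
dijkstra(s):
    dist[s] = 0;
    while (some vertices are unmarked) {
        v = unmarked vertex with smallest dist;
        mark v;  // v leaves the "table"
        for (each w adj to v) {
            // As written this is the shortest-path update. Using c(v,w) alone
            // in place of dist[v] + c(v,w), for unmarked w only, gives Prim's
            // MST algorithm.
            dist[w] = min(dist[w], dist[v] + c(v,w));
        }
    }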
You can find further explanation in the book Foundations of Algorithms, chapter 4, section 2.
Hope this helps.

Related

Minimum Spanning tree of a Complete Graph

Assume G = (V,E) is a complete graph.
Let the vertices be a set of points in the plane and let the edges be line segments between the points. Let the weight of each edge [a, b] be the length of the segment 'ab'.
After reading about Prim's Algorithm and Kruskal's Algorithm, I have some sound knowledge that these greedy algorithms output the minimum spanning tree of a graph.
My Question is: After obtaining a minimum spanning tree of G, Is there a way to prove that the minimum spanning tree of G is a plane graph?
You can check whether the minimum spanning tree is planar as you would for any graph. A simple necessary condition follows from the well-known Euler formula:
“If G is a connected simple planar graph with e edges and v vertices, where v >= 3, then e <= 3v - 6. Also, G must have a vertex of degree at most 5.”
or you can rely on the following method:
Theorem – “Let G be a connected simple planar graph with e edges and v vertices. Then the number of faces f in the graph is equal to f = e-v+2.”
Euler also showed that for any connected planar graph, the following relationship holds:
v - e + f = 2.
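As a quick check of these formulas on a spanning tree (a sketch that uses only the fact that a tree on v vertices has e = v - 1 edges):

```latex
\begin{align*}
e &= v - 1 \\
f &= e - v + 2 = (v - 1) - v + 2 = 1 \\
3v - 6 - e &= 3v - 6 - (v - 1) = 2v - 5 \ge 1 \quad \text{for } v \ge 3
\end{align*}
```

So every tree, minimum or not, satisfies both conditions, and its single face is the outer face.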
Good luck

Algorithm to Compute square of a directed graph(represented in form of an adjacency list)

I am working on an algorithm to compute G^2 of a directed graph given in adjacency-list form, where G^2 = (V,E') and (u,v)∈E′ if there is a path of length 2 between u and v in G. I understand the question and have found an algorithm that I believe is correct; however, its runtime is O(VE^2), where V is the number of vertices and E is the number of edges of the graph. How could I do this in O(VE) time to make it more efficient?
Here is the algorithm, I came up with:
for vertex in Vertices
    for neighbor in Neighbors
        for n in Neighbors
            if (n != neighbor)
                then -> if (n.value == neighbor)
                    add this to a new adjacency list
                    break; // this means we have found a path of size 2 between vertex and neighbor
            continue otherwise
The problem can be solved in O(VE) time using BFS (breadth-first search). BFS traverses the graph level by level: first it visits all the vertices at a distance of 1 from the source vertex, then all the vertices at a distance of 2, and so on. We can take advantage of this and terminate the BFS once we have reached the vertices at a distance of 2.
Following is the pseudocode:
For each vertex v in V
{
    Do a BFS with v as source vertex
    {
        For all vertices u at distance of 2 from v
            add u to adjacency list of v
        and terminate BFS
    }
}
Since BFS takes O(V + E) time and we invoke it once per vertex, the total time is O(V(V + E)) = O(V^2 + VE), which is O(VE) whenever E is at least proportional to V. Just remember to start with fresh data structures for every BFS traversal.
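For comparison, here is a minimal sketch that builds G^2 straight from the adjacency lists, taking "path of length 2" to mean exactly two edges, as in the question. The function name `square_graph` and the dict-of-lists input format are assumptions made for this example.

```python
def square_graph(adj):
    """Return the adjacency lists of G^2 for a directed graph.

    adj maps each vertex to a list of its out-neighbors. (u, w) is an edge
    of G^2 when some v exists with u -> v and v -> w in G.
    """
    squared = {}
    for u in adj:
        two_hop = set()
        for v in adj[u]:
            # Every out-neighbor of v is two edges away from u.
            two_hop.update(adj.get(v, ()))
        # Sorting is only for reproducible output; it is not needed for
        # correctness and can be dropped to stay within the O(VE) bound.
        squared[u] = sorted(two_hop)
    return squared


graph = {1: [2, 3], 2: [3], 3: [4], 4: []}
print(square_graph(graph))
```

For each u the inner loops scan at most E adjacency entries, which gives the O(VE) bound. Note that vertex 3 ends up in the list of vertex 1 even though it is also a direct neighbor. A BFS that reports only vertices at distance exactly 2 would leave it out, so which behavior is wanted depends on the definition of G^2 in use.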

Linear-time algorithm for number of distinct paths from each vertex in a directed acyclic graph

I am working on the following past paper question for an algorithms module:
Let G = (V, E) be a simple directed acyclic graph (DAG).
For a pair of vertices v, u in V, we say v is reachable from u if there is a (directed) path from u to v in G.
(We assume that every vertex is reachable from itself.)
For any vertex v in V, let R(v) be the reachability number of vertex v, which is the number of vertices u in V that are reachable from v.
Design an algorithm which, for a given DAG, G = (V, E), computes the values of R(v) for all vertices v in V.
Provide the analysis of your algorithm (i.e., correctness and running time analysis).
(Optimally, one should try to design an algorithm running in O(n + m) time.)
So far, I have the following thoughts:
The following algorithm for finding a topological sort of a DAG might be useful:
TopologicalSort(G)
1. Run DFS on G and compute a DFS-numbering, N // A DFS-numbering is a numbering (starting from 1) of the vertices of G, representing the point at which the DFS-call on a given vertex v finishes.
2. Let the topological sort be the function a(v) = n - N[v] + 1 // n is the number of nodes in G and N[v] is the DFS-number of v.
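A minimal Python sketch of the two steps above, assuming the DAG is given as a dict that maps every vertex to a list of its successors (the name `topological_order` is made up for this example):

```python
def topological_order(adj):
    """Return a(v) = n - N[v] + 1, where N[v] is the DFS finishing number."""
    finish = {}     # N[v], numbered from 1 in order of completion
    visited = set()

    def dfs(v):
        visited.add(v)
        for w in adj[v]:
            if w not in visited:
                dfs(w)
        finish[v] = len(finish) + 1  # v finishes after all its descendants

    for v in adj:
        if v not in visited:
            dfs(v)

    n = len(adj)
    return {v: n - finish[v] + 1 for v in adj}


dag = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
order = topological_order(dag)
print(order)
```

Every edge u -> v then satisfies a(u) < a(v). The recursive DFS is fine for small inputs, but a deep graph may need an explicit stack to stay within Python's recursion limit.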
My second thought is that dynamic programming might be a useful approach, too.
However, I am currently not sure how to combine these two ideas into a solution.
I would appreciate any hints!
EDIT: Unfortunately the approach below is not correct in general. It may count multiple times the nodes that can be reached via multiple paths.
The ideas below are valid if the DAG is a polytree, since this guarantees that there is at most one path between any two nodes.
You can use the following steps:
1. Find all nodes with 0 in-degree (i.e. no incoming edges). This can be done in O(n + m), e.g. by looping through all edges and marking those nodes that are the end of any edge. The nodes with 0 in-degree are those which have not been marked.
2. Start a DFS from each node with 0 in-degree. After the DFS call for a node ends, we want to have computed for that node the information of its reachability. In order to achieve this, we need to add the reachability of the successors of this node. Some of these values might have already been computed (if the successor was already visited by DFS), therefore this is a dynamic programming solution.
The following pseudocode describes the DFS code:
function DFS(node) {
    visited[node] = true;
    reachability[node] = 1;
    for each successor of node {
        if (!visited[successor]) {
            DFS(successor);
        }
        reachability[node] += reachability[successor];
    }
}
After calling this for all nodes with 0 in-degree, the reachability array will contain the reachability for all nodes in the graph.
The overall complexity is O(n + m).
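For a general DAG, where adding up successor counts can count a vertex more than once (as the EDIT above notes), one correct but not linear-time alternative is to propagate sets of reachable vertices instead of counts. The sketch below stores each set as a Python integer used as a bitset. The function name and input format are assumptions, and the running time is roughly O((n + m) * n / word size), not O(n + m).

```python
def reachability_numbers(adj):
    """R(v) for every vertex of a DAG given as {vertex: [successors]}."""
    index = {v: i for i, v in enumerate(adj)}
    reach = {}  # vertex -> int used as a bitset of reachable vertices

    def dfs(v):
        bits = 1 << index[v]  # every vertex reaches itself
        for w in adj[v]:
            if w not in reach:  # safe as a visited check because G is acyclic
                dfs(w)
            bits |= reach[w]  # set union, so shared descendants count once
        reach[v] = bits

    for v in adj:
        if v not in reach:
            dfs(v)
    return {v: bin(bits).count("1") for v, bits in reach.items()}


diamond = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
result = reachability_numbers(diamond)
print(result)
```

On this diamond-shaped DAG the sketch gives R(a) = 4, whereas the summation in the DFS pseudocode above would give 1 + 2 + 2 = 5, because d is reached through both b and c.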
I'd suggest using a Breadth First Search approach.
For every node, add all the nodes connected to it to the queue. In addition to that, maintain a separate array for calculating the reachability.
For example, if A->B, then
1.) Mark A as traversed
2.) B is added to the queue
3.) arr[B]+=1
This way, we can get R(v) for all vertices in O(|V| + |E|) time through arr[].

Finding paths of fixed cost in weighted undirected graph?

I have the following problem and I'm not quite sure how to solve it:
Given a graph G = (V, E) in which every edge e has a positive integer cost c_e, and a starting vertex s ∈ V, design an O(V + E) algorithm that marks all vertices reachable from s by a path (not necessarily a simple path) whose total cost is a multiple of 5.
How can I keep track of the total cost of the path I've already traversed? I've been studying BFS in undirected weighted graphs and made some attempts at using it here, but most of the BFS references focus on finding the shortest path (and not on something like keeping the cost a multiple of 5).
What do you think about the following algorithm?
Consider a new directed graph built from the source graph. For every vertex v of the source graph, create 5 new vertices v[0], v[1], ..., v[4] in the new graph, one for each remainder of the path cost modulo 5. Then, if vertices v and u are connected in the source graph by an edge of weight w, add edges from v[i] to u[(i + w) % 5] and from u[j] to v[(j + w) % 5] in the new graph, for i = 0..4 and j = 0..4. Then run BFS from s[0], where s is the starting vertex in the source graph.
Now consider the vertices with index 0, such as v[0]. Reaching v[0] corresponds to a path to v in the source graph whose length is a multiple of 5. All such vertices marked by the BFS as reachable from s[0] form the answer. The new graph has 5V vertices and 10E edges, so the total complexity is linear.
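A minimal sketch of this construction, assuming vertices are numbered 0..n-1 and edges are given as undirected (u, v, cost) triples. The function name is made up, and the layered graph is explored implicitly as (vertex, remainder) states instead of being built explicitly.

```python
from collections import deque


def reachable_with_cost_multiple_of_5(n, edges, s):
    """Vertices reachable from s by a walk whose total cost is divisible by 5."""
    adj = [[] for _ in range(n)]
    for u, v, c in edges:
        adj[u].append((v, c % 5))
        adj[v].append((u, c % 5))

    # seen[v][r] is True once the state v[r] has been reached.
    seen = [[False] * 5 for _ in range(n)]
    seen[s][0] = True
    queue = deque([(s, 0)])
    while queue:
        v, r = queue.popleft()
        for u, c in adj[v]:
            nr = (r + c) % 5
            if not seen[u][nr]:
                seen[u][nr] = True
                queue.append((u, nr))
    return [v for v in range(n) if seen[v][0]]


# Vertex 2 is not connected to the rest, so it can never be marked.
print(reachable_with_cost_multiple_of_5(3, [(0, 1, 5)], 0))
```

Each of the 5n states is enqueued at most once and each edge is examined at most 10 times, which matches the linear bound claimed above.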

The Edge Set Grown in Kruskal's Algorithm [closed]

Let G = (V, E) be a weighted, connected and undirected graph. Let T be the edge set that is grown in Kruskal's algorithm and stopped after k iterations (so T might contain fewer than |V|-1 edges). Let W(T) be the weighted sum of this set.
Let T’ be an acyclic edge set such that |T| = |T’|. Prove that W(T) <= W(T’).
I understand the original proof of the algorithm and I’ve tried several approaches to tackle this, but none worked.
For example: I thought an induction on |T| might work.
For |T| = 1 it’s obvious.
We assume correctness for |T|=k and prove (or not…) for k+1. Assume by contradiction that there exists an edge set T’ such that |T’|=k+1 and W(T’) < W(T).
Let e be the last edge added by Kruskal algorithm. So for any edge f in T’, W(f) < W(e) (otherwise we remove the edges from the 2 sets and get a contradiction).
This can only happen if every edge in T’ is already in T or forms a cycle with T – {e}.
…
Note: It's not the same proof as in Kruskal's algorithm. We don't even know whether T' is connected.
I have no idea what to do next. I would really appreciate any help,
Thanks in advance
Let T’ be an edge set such that |T| = |T’|. Prove that W(T) <= W(T’).
You'll have a hard time doing that, since it's false in general.
Consider
    1
  A---B
 2 \ / 3
    C
    | 4
    D
Kruskal's algorithm produces the edge set T = { (A,B), (A,C), (C,D) }, which is the unique minimal spanning tree.
But the edge set T' = { (A,B), (A,C), (B,C) } has the same cardinality as T, and
W(T') = 6 < W(T) = 7
There's some condition missing in the problem statement (like that T' should connect the graph).
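The numbers can be checked with a small Kruskal sketch. The union-find helper and the edge-list format are assumptions made for this example, not part of the original problem.

```python
def kruskal(vertices, edges):
    """Return the edges chosen by Kruskal's algorithm, in order of selection."""
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    chosen = []
    for u, v, w in sorted(edges, key=lambda e: e[2]):
        ru, rv = find(u), find(v)
        if ru != rv:  # the edge joins two components, so it closes no cycle
            parent[ru] = rv
            chosen.append((u, v, w))
    return chosen


def total_weight(edge_list):
    return sum(w for _, _, w in edge_list)


edges = [("A", "B", 1), ("A", "C", 2), ("B", "C", 3), ("C", "D", 4)]
T = kruskal("ABCD", edges)
T_prime = [("A", "B", 1), ("A", "C", 2), ("B", "C", 3)]  # contains a cycle
print(total_weight(T), total_weight(T_prime))
```

The edge (B,C) is skipped by the algorithm because it would close the cycle A-B-C, which is exactly why T' can be lighter than T here.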
You're right. I forgot to mention that there is no cycle in T'
In that case, T' spans a tree(1). And since |T'| = |T| is assumed, the tree that T' spans connects the graph, i.e. is a spanning tree.
(1) From the absence of cycles, it follows directly that each connected component of T' is a tree. A tree with n vertices has n-1 edges. Thus if T' has k connected components, the number of vertices in the graph is
|V| = |T'| + k
But T is a spanning tree, and |T| = |T'|, hence
|V| = |T| + 1 = |T'| + 1
which implies k = 1.
Thus you are asked to simply prove the correctness of Kruskal's algorithm. You can find proofs in the literature easily, for example on Wikipedia.
A proof of correctness (by induction on the number of vertices):
Lemma: Let G be a connected graph with N > 1 vertices, and T a minimal spanning tree of G. Let e be an edge in T.
Then T \ {e} projects to a minimal spanning tree of the graph G' obtained from G by identifying the two endpoints a and b of e. Conversely, if T' is a set of edges of G that projects to a minimal spanning tree of G', then T' ∪ {e} is a minimal spanning tree of G.
Proof: Let p : G -> G' be the projection identifying a and b.
Then p(T \ {e}) has no cycles.
Suppose p(T \ {e}) contained a cycle C. Then p^(-1)(C) must be a path connecting a and b. But then T would contain the cycle p^(-1)(C) ∪ {e}, contradicting the premise that T is a tree.
Thus p(T \ {e}) is a cycle-free set of edges of G' with cardinality N - 2, and that implies (see above) that it is a spanning tree.
Let T'' be a minimal spanning tree of G' and S = p^(-1)(T'').
Then S ∪ {e} has no cycles.
If there were a cycle in S, that would project to a cycle in T'', so every cycle in S ∪ {e} must contain e. Suppose C were a cycle in S ∪ {e}. Then C \ {e} is a path connecting a and b, thus C \ {e} projects to a cycle in G', since a and b project to the same vertex of G'. That contradicts the premise that T'' is a tree.
So S ∪ {e} is an edge set of cardinality N - 1 without cycles, and hence (see above) a spanning tree of G.
Then W(T) <= W(S ∪ {e}) since T is a minimal spanning tree, and thus
W(p(T \ {e})) = W(T \ {e}) <= W(S) = W(T'')
Since T'' is assumed to be a minimal spanning tree of G', it follows that equality holds, and that p(T \ {e}) is a minimal spanning tree of G', and that S ∪ {e} is a minimal spanning tree of G.
Now to the induction to prove the correctness of Kruskal's algorithm:
For a graph with at most two vertices, it is obvious that the algorithm produces a minimal spanning tree.
For n >= 2, assume the correctness of the algorithm for all connected graphs with at most n vertices. (Induction hypothesis)
Let G be a connected graph with n+1 vertices. Let e be the first edge chosen in the algorithm, and a and b its endpoints.
Let G' be the graph obtained from G by identifying a and b, and p : G -> G' the projection.
Let T be the edge set selected by the algorithm.
Then p(T \ {e}) is the edge set selected by Kruskal's algorithm on G'. By the induction hypothesis, p(T \ {e}) is therefore a minimal spanning tree of G', and thus, by the Lemma above, T is a minimal spanning tree of G.
(Okay, the proof on Wikipedia is probably simpler, but I wanted to produce a different one.)

Resources