Minimum Spanning tree different from another - algorithm

Assume we are given
an undirected graph g where every node i,1 <= i < n is connected to all j,i < j <=n
and a source s.
We want to find the total cost (defined as the sum of all edges' weights) of the cheapest spanning tree that differs from the minimum distance tree of s (i.e. from the MST obtained by running Prim/Dijkstra on s) by at least one edge.
What would be the best way to tackle this? Currently, I can only think of some kind of fixed-point iteration:
1. run dijkstra on (g,s) to obtain the reference graph r that we need to differ from
2. costs := sum(edge_weights_of(r))
3. change := 0
4. for each vertex u in r, run a bfs and note for each reached vertex v the longest edge on the path from u to v
5. iterate through all edges e = (a,b) in g and find the e'=(a',b') that is NOT in r and minimizes newchange := weight(e') - weight(longest_edge(a',b'))
6. if(first_time_here OR newchange < 0) then change += newchange
7. if(newchange < 0) goto 4
8. result := costs + change
That seems to waste a lot of time... It relies on the fact that adding an edge to a spanning tree creates a cycle from which we can remove the longest edge.
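The exchange step this idea relies on can be sketched in Python as follows. This is only a minimal sketch, not the code from the question: the function name `cheapest_single_swap`, the `(u, v, w)` edge format with vertices numbered `0..n-1`, and passing the reference tree as a set of edge indices are assumptions made for illustration. It evaluates a single exchange; if the reference tree is itself a minimum spanning tree, one exchange should be enough to reach the cheapest tree that differs from it.

```python
def cheapest_single_swap(n, edges, tree):
    """Smallest cost change from one edge exchange, or None if no non-tree edge exists.

    edges: list of (u, v, w) tuples, vertices numbered 0..n-1 (assumed format).
    tree:  set of indices into `edges` that form the reference spanning tree.
    """
    # Adjacency list of the reference tree only.
    adj = [[] for _ in range(n)]
    for i in tree:
        u, v, w = edges[i]
        adj[u].append((v, w))
        adj[v].append((u, w))

    # heaviest[s][v] = weight of the heaviest tree edge on the path s..v.
    neg_inf = float("-inf")
    heaviest = [[neg_inf] * n for _ in range(n)]
    for s in range(n):
        stack = [(s, -1)]
        while stack:
            u, parent = stack.pop()
            for v, w in adj[u]:
                if v != parent:
                    heaviest[s][v] = max(heaviest[s][u], w)
                    stack.append((v, u))

    # Adding a non-tree edge (a, b) closes a cycle; dropping the heaviest
    # tree edge on that cycle changes the total cost by `delta`.
    best = None
    for i, (a, b, w) in enumerate(edges):
        if i not in tree:
            delta = w - heaviest[a][b]
            if best is None or delta < best:
                best = delta
    return best
```

The result would then be `costs + cheapest_single_swap(...)`. The traversal from every vertex makes this O(n^2 + m), which matches step 4 of the list above but runs it only once.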
I also thought about using Kruskal to get an overall minimum spanning tree and only using the above algorithm to replace a single edge when the trees from both, prim and kruskal, happen to be the same, but that doesn't seem to work as the result would be highly dependent on the edges selected during a run of kruskal.
Any suggestions/hints?

You can do it using Prim's algorithm.
Prim's algorithm:
let T be a single vertex x
while (T has fewer than n vertices)
    1. find the smallest edge connecting T to G-T
    2. add it to T
Now let's modify it.
Suppose you already have one minimum spanning tree, say Tree(E,V).
Then use this algorithm:
Prim's algorithm (modified):
let T be a single vertex
let isOther = false
while (T has fewer than n vertices)
    1. find the smallest edge (say e) connecting T to G-T
    2. if more than one smallest edge is found {
           check which one is in E(Tree)
           choose one different from it
           add it to T
           set isOther = true
       } else if only one edge is found {
           add it to T
           if E(Tree) doesn't contain this edge, set isOther = true
           else don't touch isOther (keep its value)
       }
If isOther = true, you have found another tree, T, that differs from Tree(E,V).
Otherwise the graph has a single minimum spanning tree.


how do I reduce this spanning tree problem to np-completeness?

I have the following algorithmic problem:
If I have a graph G=(V,E), does G have a spanning tree with exactly k leaves?
A leaf is a vertex with only one neighbor in the spanning tree.
Also, I'm not looking for a minimum spanning tree, just a spanning tree.
To sum up, a solution algorithm would take as inputs a graph G and a number k, and return either true or false, depending on whether G has a spanning tree with exactly k leaves.
For this graph:
if k is 6, then my algorithm would output "True" because:
Now I am pretty sure that this problem is NP-complete, so I need to perform a reduction from a known NP-complete problem.
I just have no idea which problem to use or what the reduction should look like. Can you help out?
The Hamiltonian path problem is a special case of your problem - a spanning tree with exactly k = 2 leaves is a Hamiltonian path. Testing for the existence of one is NP-complete.
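To make the special case concrete, here is a small Python sketch of a checker that takes a candidate set of tree edges and reports how many leaves it has. The function name `count_leaves_if_spanning_tree` and the input format (vertices `0..n-1`, edges as pairs) are assumptions for illustration. A candidate that passes with exactly 2 leaves is a Hamiltonian path of the graph. The check itself runs in polynomial time; the hard part is finding the candidate.

```python
def count_leaves_if_spanning_tree(n, graph_edges, tree_edges):
    """Return the number of leaves if tree_edges is a spanning tree of the graph, else None."""
    allowed = {frozenset(e) for e in graph_edges}
    # A spanning tree on n vertices has exactly n - 1 edges, all taken from the graph.
    if len(tree_edges) != n - 1:
        return None
    if any(frozenset(e) not in allowed for e in tree_edges):
        return None

    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    degree = [0] * n
    for u, v in tree_edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            return None  # this edge would close a cycle
        parent[ru] = rv
        degree[u] += 1
        degree[v] += 1

    # n - 1 acyclic edges on n vertices form a spanning tree; count its leaves.
    return sum(1 for d in degree if d == 1)
```

For a 4-cycle `0-1-2-3-0`, the candidate `[(0, 1), (1, 2), (2, 3)]` has 2 leaves, so it is a Hamiltonian path.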
Not a real answer to your question, but you might want to try to simplify the graph before you embark on those 1.x^N algorithms.
Simplifying things (untested code ahead):
if (nodes.size() < K)
    return false;
Remove all nodes with only one edge, as they are forced to be leaves.
while (!nodes.empty() && nodes.front().edges.size() == 1) {
    nodes.erase(nodes.begin()); // updates one other node, which could then have 1 edge
    K--; // assumed: each forced leaf uses up one of the K leaves
    if (K < 0 || nodes.size() < K)
        return false;
}
Remove every node that has exactly 2 edges and whose removal would disconnect the graph, and connect its two neighbours directly. The node is not a bridge if there is any other path from edge1 to edge2. Checking this for all nodes is O(N^2).
node = nodes.begin();
while (node != nodes.end()) {
    if (node->edges.size() == 2 && DisconnectingBridge(node)) {
        edges = node->edges;
        node = nodes.erase(node); // returns next node
        nodes.addEdge(edges.front(), edges.back()); // connect the two parts
    } else
        node++; // next node
}

Linear-time algorithm for number of distinct paths from each vertex in a directed acyclic graph

I am working on the following past paper question for an algorithms module:
Let G = (V, E) be a simple directed acyclic graph (DAG).
For a pair of vertices v, u in V, we say v is reachable from u if there is a (directed) path from u to v in G.
(We assume that every vertex is reachable from itself.)
For any vertex v in V, let R(v) be the reachability number of vertex v, which is the number of vertices u in V that are reachable from v.
Design an algorithm which, for a given DAG, G = (V, E), computes the values of R(v) for all vertices v in V.
Provide the analysis of your algorithm (i.e., correctness and running time).
(Optimally, one should try to design an algorithm running in O(n + m) time.)
So, far I have the following thoughts:
The following algorithm for finding a topological sort of a DAG might be useful:
1. Run DFS on G and compute a DFS-numbering, N // A DFS-numbering is a numbering (starting from 1) of the vertices of G, representing the point at which the DFS-call on a given vertex v finishes.
2. Let the topological sort be the function a(v) = n - N[v] + 1 // n is the number of nodes in G and N[v] is the DFS-number of v.
My second thought is that dynamic programming might be a useful approach, too.
However, I am currently not sure how to combine these two ideas into a solution.
I would appreciate any hints!
EDIT: Unfortunately the approach below is not correct in general. It may count multiple times the nodes that can be reached via multiple paths.
The ideas below are valid if the DAG is a polytree, since this guarantees that there is at most one path between any two nodes.
You can use the following steps:
find all nodes with 0 in-degree (i.e. no incoming edges).
This can be done in O(n + m), e.g. by looping through all edges
and marking those nodes that are the end of any edge. The nodes with 0
in-degree are those which have not been marked.
Start a DFS from each node with 0 in-degree.
After the DFS call for a node ends, we want to have computed for that
node the information of its reachability.
In order to achieve this, we need to add the reachability of the
successors of this node. Some of these values might have already been
computed (if the successor was already visited by DFS), therefore this
is a dynamic programming solution.
The following pseudocode describes the DFS code:
function DFS(node) {
    visited[node] = true;
    reachability[node] = 1;
    for each successor of node {
        if (!visited[successor]) {
            DFS(successor);
        }
        reachability[node] += reachability[successor];
    }
}
After calling this for all nodes with 0 in-degree, the reachability
array will contain the reachability for all nodes in the graph.
The overall complexity is O(n + m).
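The same recurrence can be written as a short Python sketch. The name `reachability_counts` and the input format (vertices `0..n-1`, edges as `(u, v)` pairs) are assumptions for illustration, and the sketch processes vertices in reverse topological order instead of using recursive DFS, which avoids deep recursion but computes the same sums. As noted in the edit above, the result is only correct when there is at most one directed path between any two nodes.

```python
from collections import deque


def reachability_counts(n, edges):
    """Sum successor counts in reverse topological order (valid for single-path DAGs only)."""
    succ = [[] for _ in range(n)]
    indeg = [0] * n
    for u, v in edges:
        succ[u].append(v)
        indeg[v] += 1

    # Kahn's algorithm: start from the nodes with 0 in-degree.
    queue = deque(v for v in range(n) if indeg[v] == 0)
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in succ[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)

    # Every vertex reaches itself; successors are final before their predecessors.
    reach = [1] * n
    for u in reversed(order):
        for v in succ[u]:
            reach[u] += reach[v]
    return reach
```

On the out-tree `0->1, 0->2, 2->3` this returns `[4, 1, 2, 1]`. On the diamond `0->1, 0->2, 1->3, 2->3` it returns 5 for vertex 0 although only 4 vertices are reachable, which is exactly the double counting described above.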
I'd suggest using a Breadth First Search approach.
For every node, add all the nodes that are connected to the queue. In addition to that, maintain a separate array for calculating the reachability.
For example, if A->B, then
1.) Mark A as traversed
2.) B is added to the queue
3.) arr[B]+=1
This way, we can get R(v) for all vertices in O(|V| + |E|) time through arr[].

Algorithm for finding spanning tree with minimum range in a given graph

Given a weighted undirected graph G(v,e) with weights w(e), find the set of edges such that each pair of vertices (u,v)∈G are connected (in short, spanning tree) and the range of weights of selected edges is minimum (or the difference between the minimum weight and the maximum weight is minimum).
I tried a greedy approach: I sorted the edges by weight and selected the two consecutive edges in the sorted array with the minimum weight difference (g[index = current_left], g[index+1 = current_right]). Then I moved left or right depending on which difference was smaller, (current_left, current_left-j) or (current_right, current_right+j), where j is incremented until we find an edge with at least one non-visited vertex.
For example:
Here the minimum range that we can get is by selecting edges with weight {2,3,5} and the range is 3.
Please point a test case where the suggested algorithm fails and suggest an algorithm for finding such spanning tree.
Expected time complexity is O(|E|log|E|) where |E| is number of edges.
You should be able to do it in O(E * (cost of MST computation)):
T = no tree
for all edge weights w_fix sorted in ascending order:
    for all edges e:
        if w(e) >= w_fix:
            set w'(e) = w(e) - w_fix
        else:
            set w'(e) = infinity
    find MST T' according to w'
    if T == no tree or max edge weight(T) > max edge weight(T'):
        set T := T'
print T
The idea is that some edge weight has to be the minimum edge weight among the edges in an optimal spanning tree; so fix a minimum edge weight and find an MST that only contains edges heavier than that. Since all MSTs are also minimum bottleneck spanning trees, this will work.
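A minimal Python sketch of this idea follows. It is not the answerer's code: the name `min_range_spanning_tree`, the `(w, u, v)` edge format with vertices `0..n-1`, and the use of Kruskal with union-find are assumptions made for illustration. Instead of reweighting edges, it simply ignores every edge lighter than the fixed minimum and runs Kruskal on the rest, which gives the smallest possible maximum weight for that minimum. It assumes n >= 2.

```python
def find(parent, x):
    # Union-find lookup with path halving.
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def min_range_spanning_tree(n, edges):
    """Return (range, min_weight, max_weight) of a minimum-range spanning tree, or None."""
    edges = sorted(edges)  # (w, u, v) tuples sort by weight first
    best = None
    for start in range(len(edges)):
        low = edges[start][0]  # fixed minimum edge weight for this round
        parent = list(range(n))
        components = n
        for w, u, v in edges[start:]:
            ru, rv = find(parent, u), find(parent, v)
            if ru != rv:
                parent[ru] = rv
                components -= 1
                if components == 1:
                    # w is the bottleneck of the tree built from this suffix.
                    if best is None or w - low < best[0]:
                        best = (w - low, low, w)
                    break
        else:
            # This suffix cannot connect the graph, so no later suffix can.
            break
    return best
```

Each round is one Kruskal pass over already sorted edges, so the total cost is roughly O(E) passes of near-linear work each.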
Here's an improvement that is optimal up to a log-square factor; the basic idea remains the same.
sort edge array E[] by increasing weights
low := high := 0
opt_low := opt_high := 0
opt := infinity
connected := false
while (high < E.length - 1) or (connected):
    if not connected:
        high = high + 1
    else:
        low = low + 1
    connected := (the edges E[low..high] connect all vertices)
    if connected:
        if E[high].weight - E[low].weight < opt:
            opt = E[high].weight - E[low].weight
            opt_low = low
            opt_high = high
print(opt, opt_low, opt_high)
The idea is to keep a sliding window over the edges and use connectivity to maintain the window. To maintain connectivity information, you would use special data structures. There's a number of them that allow for polylogarithmic time costs to maintain connectivity information for both deleting and adding edges, and you can find information on those data structures in these MIT 6.851 lecture notes.
The algorithm described by G Bach does in fact work correctly and has a run time of O(m*m), where m is the number of edges (assuming the MST computation takes O(m) time). This was a question asked in the Codeforces EDU section.

Algorithm - Finding the number of pairs with diameter distance in a tree?

I have a non-rooted bidirectional unweighted non-binary tree. I know how to find the diameter of the tree, the greatest distance between any pair of points in the tree, but I'm interested in finding the number of pairs with that max distance. Is there an algorithm to find the number of pairs with diameter distance in better than O(V^2) time, where V is the number of nodes?
Thank you!
Yes, there's a linear-time algorithm that operates bottom-up and resembles the algorithm for just finding the diameter. Here's the signature in Java-ish pseudocode; I'll leave the algorithm itself as an exercise.
class Node {
    Collection<Node> children;
}
class Result {
    int height; // height of the tree
    int num_deep_nodes; // number of nodes whose depth equals the height
    int diameter; // length of the longest path inside the tree
    int num_long_paths; // number of pairs of nodes at distance |diameter|
}
Result computeNumberOfLongPaths(Node root); // recursive
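For readers who want to check their solution to the exercise, here is one possible bottom-up sketch in Python. It is not the answerer's implementation: the name `count_diameter_pairs`, the edge-list input with vertices `0..n-1`, and the iterative traversal rooted at vertex 0 are assumptions for illustration. It tracks the same four quantities as the `Result` class, with the diameter and pair count kept as running totals.

```python
from collections import deque


def count_diameter_pairs(n, edges):
    """Return (diameter, number of unordered vertex pairs at that distance) for a tree."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    # Root the tree at vertex 0 and record a BFS order (parents before children).
    parent = [-1] * n
    seen = [False] * n
    seen[0] = True
    order = []
    queue = deque([0])
    while queue:
        u = queue.popleft()
        order.append(u)
        for w in adj[u]:
            if not seen[w]:
                seen[w] = True
                parent[w] = u
                queue.append(w)

    height = [0] * n  # height of the subtree merged so far
    deep = [1] * n    # number of nodes at that height
    diameter, pairs = 0, 0
    # Reverse BFS order: every child is final before it is merged into its parent.
    for v in reversed(order):
        p = parent[v]
        if p == -1:
            continue
        # Longest paths through p that end in v's subtree and in the part merged earlier.
        length = height[p] + height[v] + 1
        ways = deep[p] * deep[v]
        if length > diameter:
            diameter, pairs = length, ways
        elif length == diameter:
            pairs += ways
        # Merge v's subtree into p.
        if height[v] + 1 > height[p]:
            height[p], deep[p] = height[v] + 1, deep[v]
        elif height[v] + 1 == height[p]:
            deep[p] += deep[v]
    return diameter, pairs
```

A path on 4 vertices gives `(3, 1)`, and a star with 3 leaves gives `(2, 3)`.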
Yes, there is an algorithm with O(V+E) time. It is simply a modified version of finding the diameter.
As we know, we can find the diameter using two calls of BFS: make the first call on any node and remember the last node discovered, u; then run a second call, BFS(u), and remember the last node discovered, say v. The distance between u and v gives us the diameter.
Now to the number of pairs with that max distance.
1. Before invoking the first BFS, initialize an array distance of length |V| and set distance[s]=0, where s is the starting vertex of the first BFS call (any node).
2.In the BFS,modify the while loop as:
while(Q is not empty)
    e = dequeue(Q)
    for all vertices w adjacent to e
        if(w is not visited)
            mark w as visited
            distance[w] = distance[e] + 1
            enqueue w
3. As I said, remember the last node visited; say u is that node. Now count the number of vertices that are at the same level as vertex u. mark is an array of length n with all values initialized to 0, where 0 means that the vertex has not been counted yet.
for i = 1 to number of vertices
    if(distance[i]==distance[u])
        n1++
        mark[i]=1/*vertex counted*/
n1 gives the number of vertices that are at the same level as vertex u. All vertices with mark[i] = 1 are marked and will not be counted again.
4. Similarly, before performing the second BFS on u, initialize another array distance2 of length |V| and set distance2[u]=0.
5. Run BFS(u) and again get the last node discovered, say v.
6. Repeat the 3rd step, this time on the distance2 array and with a different variable, say n2=0, and the condition being
else if(distance2[i]==distance2[v]&&mark[i]==1)
7. set_common is a global variable that is set when there is a set of vertices such that the path between any two of them is a diameter, and the first BFS did not mark all of those vertices but did mark at least one of them; that is why the condition checks mark[i]==1.
Suppose the first BFS did mark all such vertices in the first call; then n2 would be 0 and set_common would not be set, and there is no need for it either. But this situation is the same as above.
In that case the number of pairs giving the diameter is:
(n1+n2) choose 2 - X = (n1+n2)!/((2!)((n1+n2-2)!)) - X
I will elaborate on what X is. Otherwise the number of pairs is n1*n2, which is the case when 2 disjoint sets of vertices give the diameter.
So the Condition used is
else n1*n2
Now about X. It can happen that the marked vertices have a common parent. In that case we must not count their combinations. So before using the above condition, it is advised to run the following algorithm:
for(i = 1 to number of vertices)
s = 0,p = -1
while((i+1)<=number_of_vertices&& p==parent[i+1])
Proof of correctness
It is straightforward. Since BFS traverses a tree level by level, n1 gives the number of vertices at the level of u and n2 gives the number of vertices at the level of v, and the distance between u and v equals the diameter. Therefore, the distance between any vertex on the level of u and any vertex on the level of v will be equal to the diameter.
The time taken is 2(|V|) + 2*time_of_BFS = O(V+E).

Maximum weighted path between two vertices in a directed acyclic Graph

Love some guidance on this problem:
G is a directed acyclic graph. You want to move from vertex c to vertex z. Some edges reduce your profit and some increase your profit. How do you get from c to z while maximizing your profit. What is the time complexity?
The problem has an optimal substructure. To find the longest path from vertex c to vertex z, we first need to find the longest path from c to all the predecessors of z. Each problem of these is another smaller subproblem (longest path from c to a specific predecessor).
Let's denote the predecessors of z as u1,u2,...,uk and dist[z] as the length of the longest path from c to z. Then dist[z]=max(dist[ui]+w(ui,z)), where the maximum is taken over i = 1..k.
Here is an illustration with 3 predecessors omitting the edge set weights:
So to find the longest path to z we first need to find the longest path to its predecessors and take the maximum over (their values plus their edges weights to z).
This requires whenever we visit a vertex u, all of u's predecessors must have been analyzed and computed.
So the question is: for any vertex u, how to make sure that once we set dist[u], dist[u] will never be changed later on? Put it in another way: how to make sure that we have considered all paths from c to u before considering any edge originating at u?
Since the graph is acyclic, we can guarantee this condition by finding a topological sort of the graph. A topological sort is like a chain of vertices where all edges point left to right. So if we are at vertex vi, then we have considered all paths leading to vi and have the final value of dist[vi].
The time complexity: the topological sort takes O(V+E). In the worst case, where z is a leaf and all other vertices point to it, we will visit all the graph edges, which also gives O(V+E).
Let f(u) be the maximum profit you can get going from c to u in your DAG. Then you want to compute f(z). This can be easily computed in linear time using dynamic programming/topological sorting.
Initialize f(u) = -infinity for every u other than c, and f(c) = 0. Then, proceed computing the values of f in some topological order of your DAG. Thus, as the order is topological, for every incoming edge of the node being computed, the other endpoints are calculated, so just pick the maximum possible value for this node, i.e. f(u) = max(f(v) + cost(v, u)) for each incoming edge (v, u).
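As a rough illustration, the recurrence could look like this in Python. The function name `max_profit_path`, the `(u, v, profit)` edge format with vertices `0..n-1`, and the use of Kahn's algorithm for the topological order are assumptions of this sketch. It pushes values along outgoing edges rather than pulling over incoming ones, which yields the same f.

```python
from collections import deque


def max_profit_path(n, edges, c, z):
    """Maximum total profit of a path from c to z in a DAG (-inf if z is unreachable)."""
    adj = [[] for _ in range(n)]
    indeg = [0] * n
    for u, v, w in edges:
        adj[u].append((v, w))
        indeg[v] += 1

    # Kahn's algorithm gives one valid topological order.
    queue = deque(v for v in range(n) if indeg[v] == 0)
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v, _ in adj[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)

    neg_inf = float("-inf")
    f = [neg_inf] * n
    f[c] = 0
    # When u is processed, every edge into u has already been relaxed, so f[u] is final.
    for u in order:
        if f[u] == neg_inf:
            continue  # u is not reachable from c
        for v, w in adj[u]:
            if f[u] + w > f[v]:
                f[v] = f[u] + w
    return f[z]
```

Both phases touch each vertex and edge a constant number of times, so the sketch runs in O(V+E).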
It's better to use topological sorting instead of Bellman-Ford, since the graph is a DAG.
G is a DAG with negative edges:
"Some edges reduce your profit and some increase your profit"
Edges that increase profit - positive value
Edges that decrease profit - negative value
After the topological sort (TS), for each vertex u in TS order, relax each outgoing edge.
dist[] = {-INF, -INF, ...}
dist[c] = 0 // source
for every vertex u in topological order
    if (u == z) break; // dest vertex
    for every adjacent vertex v of u
        if (dist[v] < (dist[u] + weight(u, v))) // < for longest path = max profit
            dist[v] = dist[u] + weight(u, v)
ans = dist[z];
