Kruskal's MST Algorithm non-deterministic? - algorithm

The following is pseudo code for Kruskal's Minimum Spanning Tree algorithm from our CS algorithms lecturer. I wanted to know if the MST algorithm is non-deterministic. Given two edges with the same weight, how would the algorithm decide between them if neither edge formed a cycle when adding to T. Surely if it was random then one could not determine the result of what exact edges are added to T?
Given an undirected connected graph G=(V,E)
T=Ø //Empty set, i.e. empty
E'=E
while E'≠Ø do
begin
pick an edge e in E' with minumum weight
if adding e to T does not form a cycle then
T = T∪{e} //Set union, add e to T
E' = E'\{e} //Set difference, remove e from E'
end
Thanks!

Kruskal's algorithm is deterministic if you pick a deterministic choice function for the cases where you have a choice, otherwise it's non-deterministic. If you choose randomly, you can't tell which edges end up in the MST if there are several possibilities.

Given two edges with the same weight, how would the algorithm decide
between them if neither edge formed a cycle when adding to T. Surely
if it was random then one could not determine the result of what exact
edges are added to T?
This is upto the implementation.
Kruskal's algorithm finds one of the many possible MSTs of a connected weighted graph (which is not a tree). This is because at each iteration you have multiple choices (of choosing an edge from within edges with the same weight). This is the non-deterministic bit. Of course, when you will implement the algorithm you will make a choice (i.e. impose an order) however a different implementation can very well impose a different order. Thus, you will have two implementations of the algorithm, solving the same problem, correctly but possibly with different end results.

Related

Necessary to MST edges in a graph

Given a G(V,E) weighted(on edges) graph i need to find the number of edges that belong in every MST as well as the number of the edges that belong in at least one but not all and the ones that belong in none.
The graph is given as input in the following form(example):
3 3
1 2 1
1 3 1
2 3 2
First 3 is the number of nodes,second 3 is the number of edges.The following three lines are the edges where first number is where they start,second is where they end and third is the value.
I've thought about running kruskal once to find an MST and then for each edge belonging in G check in(linear?) time if it can be replace an edge in this MST without altering it's overall weight.If it can't it belongs in none.If it can it belongs in one but not all.I can also mark the edges in the first MST(maybe with a second value 1 or 0)and in the end check how many of them could not be replaced.These are the ones belonging in every possible MST. This algorithm is probably O(V^2) and i am not quite sure how to write it in C++.My questions are,could we somehow reduce it's complexity? If i add an edge to the MST how can i check(and implement in C++) if the cycle formed contains an edge that has less weight?
You can do this by adding some extra work to Kruskal's algorithm.
In Kruskal's algorithm, edges with the same weight can be in any order, and the order in which they are actually checked determines which MST you get out of all possible MSTs. For every MST, there is a sort-consistent order of the edges that will produce that tree.
No matter what sort-consistent order is used, the state of the union-find structure will be the same between weights.
If there is only one edge of a specific weight, then it is in every MST if Kruskal's algorithm selects it, because it would be selected with any sort-consistent edge order, and otherwise it is in no MSTs.
If there are multiple edges with the same weight, and Kruskal's algorithm would select at least one of them, then you can pause Kruskal's at that point and make a new (small) graph with just those edges that connect different sets, using the sets that they connect as vertices.
All of those edges are in at least one MST, because Kruskal's might pick them first. Any of those edges that are bridges in the new graph are in every MST, because Kruskal's would select them no matter what. See: https://en.wikipedia.org/wiki/Bridge_(graph_theory)
So, modify Kruskal's algorithm as follows:
When Kruskal's would select an edge, before doing the union, gather and process ALL the non-redundant edges with the same weight.
Make a small graph with those edges using the find() sets they connect as vertices
Use Tarjan's algorithm or equivalent (see the link) to find all the bridges in the graph. Those edges are in ALL MSTs.
The remaining edges in the small graph are in some, but not all, MSTs.
perform all the unions for edges in the small graph and move on to the next weight.
Since bridge-finding can be done in linear time, and every edge is in at most one small graph, the whole algorithm is still O(|V| + |E| log |E|).

Linear algorithm to make edge weights unique

I have a weighted graph, and I want to compute a new weighting function for the graph, such that the edge weights are distinct, and every MST in the new graph corresponds to an MST in the old graph.
I can't find a feasible algorithm. I doubled all the weights, but that won't make them distinct. I also tried doubling the weights and adding different constants to edges with the same weights, but that doesn't feel right, either.
The new graph will have only 1 MST, since all edges are distinct.
Very simple: we multiply all weights by a factor K large enough to ensure that our small changes cannot affect the validity of an MST. I'll go overboard on this one:
K = max(<sum of all graph weights>,
<quantity of edges>)
+ 1
Number the N edges in any order, 0 through N-1. To each edge weight, add the edge number. It's trivial to show that
the edge weights are now unique (new difference between different weights is larger than the changes);
any MST in the new graph maps directly to a corresponding MST in the
old one (each new path sum is K times the old one, plus a quantity smaller than K -- the comparison (less or greater than) on any two paths cannot be affected).
Yes, this is overkill: you can restrict the value of K quite a bit. However, making it that large reduces the correctness proofs to lemmas a junior-high algebra student can follow.
We definitely cannot guarantee that all of the MSTs in the old graph are MSTs in the new graph; a counterexample is the complete graph on three vertices where all edges have equal weights. So I assume that you do not require the construction to give all MSTs, as that is not possible in the general case.
Can we always make it so that the new graph's MSTs are a subset of the old graph's? This would be easy if we could construct a graph without a MST. Of course, that doesn't make any sense and is impossible, since all graphs have at least one MST. Is it possible to change edge weights so that only one of the old graph's MSTs is an MST for the new graph? I propose that this is possible in general.
Determine some MST of the old graph.
Construct a new graph with the same edges and vertices, but with weights assigned as follows:
if the edge in the new graph belongs to the MST determined in step 1, give it a unique weight between 1 and n, the number of edges in the graph.
otherwise, give the edge in the new graph a unique weight greater than or equal to n^2, the square of the number of edges in the graph.
Without proof, it seems like this should guarantee that only the nominated MST from the old graph is an MST of the new graph, and therefore every MST in the new graph (there is just the one) is an MST in the old graph.
Now, one could ask whether we can do the deed with additional restrictions:
Can you do it if you only want to change the values of edges that are not unique in the old graph?
Can you do it if you want to keep relative weights the same for edges which were unique in the old graph?
One could even pose optimization problems:
What is the minimum number of edge weights that must be changed to guarantee it?
What is the weighting with minimum distance (according to some metric) from the old weighting that guarantees it?
What is the weighting that minimizes the average change while also guaranteeing it?
I am hesitant to attempt these, what I believe to be much more difficult, problems.

Finding MST such that a specific vertex has a minimum degree

Given undirected, connected graph G={V,E}, a vertex in V(G), label him v, and a weight function f:E->R+(Positive real numbers), I need to find a MST such that v's degree is minimal. I've already noticed that if all the edges has unique weight, the MST is unique, so I believe it has something to do with repetitive weights on edges. I though about running Kruskal's algorithm, but when sorting the edges, I'll always consider edges that occur on v last. For example, if (a,b),(c,d),(v,e) are the only edges of weight k, so the possible permutations of these edges in the sorted edges array are: {(a,b),(c,d),(v,e)} or {(c,d),(a,b),(v,e)}. I've ran this variation over several graphs and it seems to work, but I couldn't prove it. Does anyone know how to prove the algorithm's correct (Meaning proving v's degree is minimal), or give a contrary example of the algorithm failing?
First note that Kruskal's algorithm can be applied to any weighted graph, whether or not it is connected. In general it results in a minimum-weight spanning forest (MSF), with one MST for each connected component. To prove that your modification of Kruskal's algorithm succeeds in finding the MST for which v has minimal degree, it helps to prove the slightly stronger result that if you apply your algorithm to a possibly disconnected graph then it succeeds in finding the MSF where the degree of v is minimized.
The proof is by induction on the number, k, of distinct weights.
Basis Case (k = 1). In this case weights can be ignored and we are trying to find a spanning forest in which the degree of v is minimized. In this case, your algorithm can be described as follows: pick edges for as long as possible according to the following two rules:
1) No selected edge forms a cycle with previously selected edges
2) An edge involving v isn't selected unless any edge which doesn't
involve v violates rule 1.
Let G' denote the graph from which v and all incident edges have been removed from G. It is easy to see that the algorithm in this special case works as follows. It starts by creating a spanning forest for G'. Then it takes those trees in the forest that are contained in v's connected component in the original graph G and connects each component to v by a single edge. Since the components connected to v in the second stage can be connected to each other in no other way (since if any connecting edge not involving v exists it would have been selected by rule 2) it is easy to see that the degree of v is minimal.
Inductive Case: Suppose that the result is true for k and G is a weighted graph with k+1 distinct weights and v is a specified vertex in G. Sort the distinct weights in increasing order (so that weight k+1 is the longest of the distinct weights -- say w_{k+1}). Let G' be the sub-graph of G with the same vertex set but with all edges of weight w_{k+1} removed. Since the edges are sorted in the order of increasing weight, note that the modified Kruskal's algorithm in effect starts by applying itself to G'. Thus -- by the induction hypothesis prior to considering edges of weight w_{k+1}, the algorithm has succeeded in constructing an MSF F' of G' for which the degree, d' of v in G' is minimized.
As a final step, modified Kruskal's applied to the overall graph G will merge certain of the trees in F' together by adding edges of weight w_{k+1}. One way to conceptualize the final step is the think of F' as a graph where two trees are connected exactly when there is an edge of weight w_{k+1} from some node in the first tree to some node in the second tree. We have (almost) the basis case with F'. Modified Kruskal's will add edged of weight w_{k+1} until it can't do so anymore -- and won't add an edge connecting to v unless there is no other way to connect to trees in F' that need to be connected to get a spanning forest for the original graph G.
The final degree of v in the resulting MSF is d = d'+d" where d" is the number of edges of weight w_{k+1} added at the final step. Neither d' nor d" can be made any smaller, hence it follows that d can't be made any smaller (since the degree of v in any spanning forest can be written as the sum of the number of edges whose weight is less than w_{k+1} coming into v and the number off edges of weight w_{k+1} coming into v).
QED.
There is still an element of hand-waving in this, especially with the final step -- but Stack Overflow isn't a peer-reviewed journal. Anyway, the overall logic should be clear enough.
One final remark -- it seems fairly clear that Prim's algorithm can be similarly modified for this problem. Have you looked into that?

Graph Has Two / Three Different Minimal Spanning Trees ?

I'm trying to find an efficient method of detecting whether a given graph G has two different minimal spanning trees. I'm also trying to find a method to check whether it has 3 different minimal spanning trees. The naive solution that I've though about is running Kruskal's algorithm once and finding the total weight of the minimal spanning tree. Later , removing an edge from the graph and running Kruskal's algorithm again and checking if the weight of the new tree is the weight of the original minimal spanning tree , and so for each edge in the graph. The runtime is O(|V||E|log|V|) which is not good at all, and I think there's a better way to do it.
Any suggestion would be helpful,
thanks in advance
You can modify Kruskal's algorithm to do this.
First, sort the edges by weight. Then, for each weight in ascending order, filter out all irrelevant edges. The relevant edges form a graph on the connected components of the minimum-spanning-forest-so-far. You can count the number of spanning trees in this graph. Take the product over all weights and you've counted the total number of minimum spanning trees in the graph.
You recover the same running time as Kruskal's algorithm if you only care about the one-tree, two-trees, and three-or-more-trees cases. I think you wind up doing a determinant calculation or something to enumerate spanning trees in general, so you likely wind up with an O(MM(n)) worst-case in general.
Suppose you have a MST T0 of a graph. Now, if we can get another MST T1, it must have at least one edge E different from the original MST. Throw away E from T1, now the graph is separated into two components. However, in T0, these two components must be connected, so there will be another edge across this two components that has exactly the same weight as E (or we could substitute the one with more weight with the other one and get a smaller ST). This means substitute this other edge with E will give you another MST.
What this implies is if there are more than one MSTs, we can always change just a single edge from a MST and get another MST. So if you are checking for each edge, try to substitute the edge with the ones with the same weight and if you get another ST it is a MST, you will get a faster algorithm.
Suppose G is a graph with n vertices and m edges; that the weight of any edge e is W(e); and that P is a minimal-weight spanning tree on G, weighing Cost(W,P).
Let δ = minimal positive difference between any two edge weights. (If all the edge weights are the same, then δ is indeterminate; but in this case, any ST is an MST so it doesn't matter.) Take ε such that δ > n·ε > 0.
Create a new weight function U() with U(e)=W(e)+ε when e is in P, else U(e)=W(e). Compute Q, an MST of G under U. If Cost(U,Q) < Cost(U,P) then Q≠P. But Cost(W,Q) = Cost(W,P) by construction of δ and ε. Hence P and Q are distinct MSTs of G under W. If Cost(U,Q) ≥ Cost(U,P) then Q=P and distinct MSTs of G under W do not exist.
The method above determines if there are at least two distinct MSTs, in time O(h(n,m)) if O(h(n,m)) bounds the time to find an MST of G.
I don't know if a similar method can treat whether three (or more) distinct MSTs exist; simple extensions of it fall to simple counterexamples.

Weighted Directed Acyclic Graphs: algorithm to find edge weights such that they define a distance function?

I have this technical problem that can be formuated with a Directed Acyclic Graph (DAG).
Nodes represent events (with unknown timing), with directed edges encoding the relation ship: "I'm younger than you/I happened before you".
I need to estimate edge weights (i.e. "dynamic weights") such that the Weighted DAG (WDAG) implies a distance function on the DAG. In other words the sum of weights for paths between nodes A and B should be equal for all paths.
This is an undetermined problem, even if the weights are integers (same underlying reason that topological sorting is not unique, I suppose). In all generality, the edge weights, which represent timeintervals between nodes/events, are real numbers. Therefore I'm introducing some preset objective function C=J(WDAG) on the weighted DAG, here unspecified.
My question is: is there an algorithm that distributes positive definite weights on the WDAG, subject to the constraint 1) that the weights form a distance function of the DAG and 2) minimizing the objective function cost C.
This seems unrelated to the shortest-path or minimum-spanning-tree problems classically associated to WDAG's. Any ideas to a formal or heuristic solution to the above problem?
Regards,
Stephane
I think all you need is topological sorting.
Sort the nodes according to a topological sort.
Distribute them in the [0,1] interval in the order you got.
Now the distance of node-pairs on the [0,1] line will give you the weight of the edge connecting them.
This is one possible solution. If you have more constraints on the weights than what you describe in the question, the problem might be more difficult.

Resources