Why, when we change the cost of every edge in G to c' = log_17(c), is every MST in G still an MST in G′ (and vice versa)?

Remarks: c' is the logarithm of c with base 17.
MST means minimum spanning tree.
It's easy to prove that the conclusion is correct when we use a linear function to transform the cost of every edge.
But the log function is not a linear function, and I could not understand why this conclusion is correct.
Supplementary notes:
I did not consider specific algorithms, such as the greedy algorithm. I simply considered the relationship between the sums of the weights of the two trees after the transformation.
Numerically, if (a + b) > (c + d), it does not follow that (log a + log b) > (log c + log d).
Suppose one spanning tree of G has two edges a and b, another spanning tree of G has edges c and d, a + b < c + d, and the first tree is an MST. In the transformed graph G', the sum of the edge weights of the second tree may then be smaller.
Because of this, I wanted to construct a counterexample based on "if (a + b) > (c + d), it does not follow that (log a + log b) > (log c + log d)", but I failed.

One way to characterize when a spanning tree T is a minimum spanning tree is that, for every edge e not in T, the cycle formed by e and edges of T (the fundamental cycle of e with respect to T) has no edge more expensive than e. Using this characterization, I hope you see how to prove that transforming the costs with any increasing function preserves minimum spanning trees.
There's a one line proof that this condition is necessary. If the fundamental cycle contained a more expensive edge, we could replace it with e and get a spanning tree that costs less than T.
It's less obvious that this condition is sufficient, since at first glance it looks like we're trying to prove global optimality from a local optimality condition. To prove this statement, let T be a spanning tree that satisfies the condition, let T' be a minimum spanning tree, and let G' be the graph whose edges are the union of the edges of T and T'. Run Kruskal's algorithm on G', breaking ties by favoring edges in T over edges not in T. Let T'' be the resulting minimum spanning tree in G'. Since T' is a spanning tree in G', the cost of T'' is not greater than T', hence T'' is a minimum spanning tree in G as well as G'.
Suppose to the contrary that T'' ≠ T. Then there exists an edge in T but not in T''. Let e be the first such edge considered by Kruskal's algorithm. At the time that e was considered, it formed a cycle C in the edges that had been selected from T''. Since T is acyclic, C \ T is nonempty. By the tie breaking criterion, we know that every edge in C \ T costs less than e. Observing that some edge e' in C \ T must have one endpoint in each of the two connected components of T \ {e}, we infer that the fundamental cycle of e' with respect to T contains e, which violates the local optimality condition. In conclusion, T = T'', hence is a minimum spanning tree in G.
If you want a deeper dive, this logic gets abstracted out in the theory of matroids.

Well, it's pretty easy to understand...let's see if I can break it down for you:
c' = log_17(c) // here 17 is the base
log may not be a linear function...but we can say that:
log_b(x) > log_b(y) if x > y and b > 1 (and of course x > 0 and y > 0)
I hope you get the inequality I've written...In words, it means: consider a base "b" such that b > 1; then log_b(x) is greater than log_b(y) if x > y.
So, if we apply this rule to the edge costs of G, we see that the relative order of the edges does not change, so the edges that were selected for an MST of G still form an MST of G' when c' = log_17(c) // here 17 is the base.
UPDATE: As I can see you have a problem understanding the proof, I'm elaborating a bit:
I guess you know MST construction is greedy. We're going to use Kruskal's algorithm to prove why it is correct. (In case you don't know how Kruskal's algorithm works, you can read about it somewhere, or just google it; you'll find millions of resources.) Now, let me write some steps of Kruskal's edge selection for the MST of G:
// the following edges are sorted by cost..i.e. c_0 <= c_1 <= c_2 ....
c_0: A, F // here, edge c_0 connects A, F, we've to take the edge in MST
c_1: A, B // it is also taken to construct MST
c_2: B, R // it is also taken to construct MST
c_3: A, R // we won't take it to construct the MST, because A and R are already connected through A -> B -> R
c_4: F, X // it is also taken to construct MST
...
...
so on...
Now, when constructing the MST of G', we have to select edges whose costs are of the form c' = log_17(c) // where 17 is the base
If we convert the edge costs using log of base 17, then c_0 becomes c_0', c_1 becomes c_1' and so on...
But we know that:
log_b(x) > log_b(y) if x > y and b > 1 (and of course x > 0 and y > 0)
So, we may say that
log_17(c_0) <= log_17(c_1), because c_0 <= c_1
and in general,
log_17(c_i) <= log_17(c_j), where i <= j
And now, we may say:
c_0' <= c_1' <= c_2' <= c_3' <= ....
So, the edge selection process to construct the MST of G' would be:
// the following edges are sorted by cost..i.e. c_0' <= c_1' <= c_2' ....
c_0': A, F // here, edge c_0' connects A, F, we've to take the edge in MST
c_1': A, B // it is also taken to construct MST
c_2': B, R // it is also taken to construct MST
c_3': A, R // we won't take it to construct the MST, because A and R are already connected through A -> B -> R
c_4': F, X // it is also taken to construct MST
...
...
so on...
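Which is the same as the MST of G...
The same argument works in the other direction, because c = 17^(c') is also an increasing function of c'.
That ultimately proves the theorem....
I hope you get it...if not, ask me in the comments what is not clear to you...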
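To make the argument above concrete, here is a minimal sketch in Python. The graph, the vertex numbering and the helper name `kruskal` are made up for illustration; the only assumption is that all costs are positive so that the logarithm is defined. Kruskal's algorithm only looks at the order of the costs, so running it on the original costs and on the log_17 costs should pick the same edges.

```python
import math

def kruskal(n, edges):
    """Return the indices of the edges chosen by Kruskal's algorithm.

    edges is a list of (cost, u, v) on vertices 0..n-1. Ties are broken
    by edge index, so the result depends only on the order of the costs.
    """
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    chosen = []
    for i in sorted(range(len(edges)), key=lambda i: (edges[i][0], i)):
        _, u, v = edges[i]
        ru, rv = find(u), find(v)
        if ru != rv:          # the edge joins two different components
            parent[ru] = rv
            chosen.append(i)
    return chosen

# Hypothetical example graph on 5 vertices (all costs positive).
edges = [(2.0, 0, 1), (3.0, 0, 2), (5.0, 1, 2), (7.0, 1, 3),
         (11.0, 2, 3), (13.0, 3, 4), (4.0, 2, 4)]
log_edges = [(math.log(c, 17), u, v) for c, u, v in edges]

mst = kruskal(5, edges)
mst_log = kruskal(5, log_edges)
print(mst, mst_log)
```

Both runs select the same edge indices, even though the sums of the transformed costs are completely different numbers.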

Related

How to prune this type of sorted weighted trees to maximize this particular function?

Disclaimer #1: I'm not a pro, so many of my nomenclatures might be not standard or useful. Please bear with me / edit me.
Disclaimer #2: As the tags suggest, this may start out as a theoretical question, but I think it's a programming one, though some theory would also be nice.
First, let me describe this type of sorted weighted trees, now called SWR trees. Let T = (V, E, W, U, m, r) be an SWR tree. The only defining properties of T are:
T is an m-ary rooted tree with root r, and every leaf has the same height/level in T
T has predefined and unchanged weights on edges, defined by the function W: E -> R+ (R+ is the set of positive real numbers)
T has predefined and unchanged weights on leaves, defined by the function U: V_L -> R+ (V_L is the set of leaves in V)
For each non-leaf node v of T, its children are sorted in the increasing values of the edges connecting them to v
Now, let me describe the function on T, now called F(T). F will produce a number on T as follows:
Extend the function U to U*: V -> R+ as follows: for each non-leaf node v, assign to v the largest value of the child edges of v (the edges connecting v to its children)
For each height/level h of T, calculate f(h) as the minimum value of the vertices (defined by U*) at that height/level
Sum all of the f(h) to get F(T)
Also, let me describe the proper pruning process on T. Consider the pruning of the edges. When an edge is pruned, its sub-tree is removed. Not only that, all of its larger edges (and their sub-trees) are also removed (keep in mind, due to the sorting, only consider the larger sibling edges). Hence, the remaining tree T' is still an SWR tree and properly inherits all properties from T. Obviously, F(T') has changed (even U* and f have changed).
Therefore, the problem arises. Given an SWR tree T, how can one properly prune it to get an SWR tree T' with the maximum value of F ?
Disclaimer #3: I'm aware of the fact that the problem is like fallen from the sky and rather messy. Please feel free to reformulate it as you like. Also, just to formulate the problem itself exhausts me a bit, so I have had no handle to solve this yet.
Let's first simplify your problem definition slightly by removing the leaf weights. Now that none of the weights are negative, we can put a single child under each leaf and move each leaf's weight to its new child edge.
I can write down what seems like a pretty tight integer program that captures this problem. For each edge e, the variable x[e] is 1 if we keep the edge, 0 otherwise. The variable y[e] is 1 if e is the minimum value of the maximum sibling on its level, 0 otherwise.
maximize sum_{e} W(e) y[e]
subject to
for all e, x[e] ∈ {0, 1}
for all e, y[e] ∈ {0, 1}
for all e sibling of e' with W(e) ≤ W(e'), x[e'] − x[e] ≤ 0
for all e parent of e', x[e'] − x[e] ≤ 0
for all levels ℓ, for all e at level ℓ, for all p at level ℓ−1, y[e] + x[p] − sum_{e' child of p with W(e) ≤ W(e')} x[e'] ≤ 1
for all levels ℓ, sum_{e at level ℓ} y[e] = 1
The first two constraint groups enforce the restrictions on pruning. The next constraint group says, essentially, an edge cannot be the minimum value of the maximum sibling on its level unless each sibling group on its level has an edge at least as valuable or is totally gone. The final constraint is only needed to break ties.
This formulation can be solved as is with an integer program solver, but I strongly suspect that there's a more efficient algorithm.

How to update MST from the old MST if one edge is deleted

I am studying algorithms, and I have seen an exercise like this
I can solve this problem in exponential time, but I don't know how to do it, and prove it, in linear time O(E+V).
I will appreciate any help.
Let G be the graph where the minimum spanning tree T is embedded; let A and B be the two trees remaining after (u,v) is removed from T.
Premise P: Select minimum weight edge (x,y) from G - (u,v) that reconnects A and B. Then T' = A + B + (x,y) is a MST of G - (u,v).
Proof of P: It's obvious that T' is a tree. Suppose it were not minimum. Then there would be a MST - call it M - of smaller weight. And either M contains (x,y), or it doesn't.
If M contains (x,y), then it must have the form A' + B' + (x,y) where A' and B' are minimum weight trees that span the same vertices as A and B. These can't have weight smaller than A and B, otherwise T would not have been an MST. So M is not smaller than T' after all, a contradiction; M can't exist.
If M does not contain (x,y), then there is some other path Q from x to y in M. One or more edges of Q pass from a vertex in A to another in B. Call such an edge c. Now, c has weight at least that of (x,y), else we would have picked it instead of (x,y) to form T'. Note Q+(x,y) is a cycle. Consequently, M - c + (x,y) is also a spanning tree. If c were of greater weight than (x,y) then this new tree would have smaller weight than M. This contradicts the assumption that M is a MST. If c has the same weight as (x,y), then M - c + (x,y) is a MST that contains (x,y), which is the previous case. Again M can't exist.
Since in either case, M can't exist, T' must be a MST. QED
Algorithm
Traverse A and color all its vertices Red. Similarly label B's vertices Blue. Now traverse the edge list of G - (u,v) to find a minimum weight edge connecting a Red vertex with a Blue. The new MST is this edge plus A and B.
When you remove one of the edges, the MST breaks into two parts; let's call them a and b. Iterate over all vertices of part a and look at their adjacent edges, ignoring the deleted one. Among the edges that link part a to part b, the one with minimum weight, together with a and b, gives the new MST.
Pseudocode :
best = none;
for(all vertices in part a){
    u = current vertex;
    for(all adjacent edges of u, except the deleted edge){
        v = adjacent vertex of u for the current edge
        if(v belongs to part b and (best is none or weight(u,v) < weight(best))) best = (u,v);
    }
}
new MST = part a + part b + best;
Complexity is O(V + E)
Note : You can keep a simple array to check if vertex is in part a of the MST or part b.
Also note that in order to get the O(V + E) complexity, you need to have an adjacency list representation of the graph.
Let's say you have graph G' after removing the edge. The old MST, without the removed edge, consists of two connected components.
Let each node in the graph have a componentID. Set the componentID for all the nodes based on which component they belong to. This can be done with a simple BFS over the remaining tree edges, for example. This is an O(V) operation, as the broken tree only has V nodes and V-2 edges.
Once all the nodes have been flagged, iterate over all unused edges of G' and find the one with the least weight that connects the two components (the componentIDs of its two nodes will be different). This is an O(E) operation.
Thus the total runtime is O(V+E).
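The colouring approach described in the answers above can be sketched in Python as follows. The function name `mst_after_deletion` and the edge format `(u, v, w)` are assumptions made for this example, and the sketch assumes that every edge appears as the same tuple in the graph's edge list and in the tree's edge list.

```python
from collections import deque

def mst_after_deletion(n, graph_edges, tree_edges, removed):
    """Repair an MST after one of its edges is deleted from the graph.

    graph_edges, tree_edges: lists of (u, v, w) on vertices 0..n-1.
    removed: the tree edge (u, v, w) that was deleted.
    Returns the new MST edge list, or None if the graph is disconnected.
    """
    remaining = [e for e in tree_edges if e != removed]
    adj = [[] for _ in range(n)]
    for u, v, _ in remaining:
        adj[u].append(v)
        adj[v].append(u)

    # BFS over the remaining tree edges marks component A: O(V).
    in_a = [False] * n
    in_a[removed[0]] = True
    queue = deque([removed[0]])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if not in_a[y]:
                in_a[y] = True
                queue.append(y)

    # One pass over the edge list finds the cheapest crossing edge: O(E).
    best = None
    for e in graph_edges:
        u, v, w = e
        if e != removed and in_a[u] != in_a[v]:
            if best is None or w < best[2]:
                best = e
    return None if best is None else remaining + [best]

# Hypothetical example: the path 0-1-2-3 is the MST, edge (1, 2) is deleted.
graph_edges = [(0, 1, 1), (1, 2, 2), (2, 3, 3), (0, 3, 4), (0, 2, 5)]
tree_edges = [(0, 1, 1), (1, 2, 2), (2, 3, 3)]
new_tree = mst_after_deletion(4, graph_edges, tree_edges, (1, 2, 2))
print(new_tree)
```

Here the two parts are {0, 1} and {2, 3}, and the cheapest edge between them is (0, 3) with weight 4.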

The Edge Set Grown in Kruskal's Algorithm [closed]

Let G = (V, E) be a weighted, connected and undirected graph. Let T be the edge set that is grown in Kruskal's algorithm and stopped after k iterations (so T might contain fewer than |V|-1 edges). Let W(T) be the weighted sum of this set.
Let T’ be an acyclic edge set such that |T| = |T’|. Prove that W(T) <= W(T’)
I understand the original proof of the algorithm and I’ve tried several approaches to tackle this, but none worked.
For example: I thought an induction on |T| might work.
For |T| = 1 it’s obvious.
We assume correctness for |T|=k and prove (or not…) for k+1. Assume by contradiction that there exists an edge set T’ such that |T’|=k+1 and W(T’) < W(T).
Let e be the last edge added by Kruskal algorithm. So for any edge f in T’, W(f) < W(e) (otherwise we remove the edges from the 2 sets and get a contradiction).
This can only happen if every edge in T’ is already in T or forms a cycle with T – {e}.
…
Note: It's not the same proof as in Kruskal's algorithm. We don't even know whether T' is connected.
I have no idea what to do next. I would really appreciate any help,
Thanks in advance
Let T’ be an edge set such that |T| = |T’|. Prove that W(T) <= W(T’).
You'll have a hard time doing that, since it's false in general.
Consider
    1
  A---B
 2 \ / 3
    C
    | 4
    D
Kruskal's algorithm produces the edge set T = { (A,B), (A,C), (C,D) }, which is the unique minimal spanning tree.
But the edge set T' = { (A,B), (A,C), (B,C) } has the same cardinality as T, and
W(T') = 6 < W(T) = 7
There's some condition missing in the problem statement (like that T' should connect the graph).
You're right. I forgot to mention that there is no cycle in T'
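The small example graph above is easy to check by brute force. The following sketch (the helper `is_acyclic` is written just for this check) enumerates all 3-edge subsets of the four edges and compares the cheapest subset overall with the cheapest acyclic one.

```python
from itertools import combinations

# The four edges of the example graph with their weights.
edges = {("A", "B"): 1, ("A", "C"): 2, ("B", "C"): 3, ("C", "D"): 4}

def is_acyclic(edge_set):
    """Union-find check: an edge inside one component closes a cycle."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            x = parent[x]
        return x

    for u, v in edge_set:
        ru, rv = find(u), find(v)
        if ru == rv:
            return False
        parent[ru] = rv
    return True

weights = {s: sum(edges[e] for e in s) for s in combinations(edges, 3)}
cheapest_any = min(weights.values())
cheapest_acyclic = min(w for s, w in weights.items() if is_acyclic(s))
print(cheapest_any, cheapest_acyclic)
```

Without the acyclicity requirement the triangle wins with weight 6; once cycles are excluded, the cheapest 3-edge set is the one Kruskal's algorithm finds, with weight 7.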
In that case, T' spans a tree(1). And since |T'| = |T| is assumed, the tree that T' spans connects the graph, i.e. is a spanning tree.
(1) From the absence of cycles, it follows directly that each connected component of T' is a tree. A tree with n vertices has n-1 edges. Thus if T' has k connected components, the number of vertices in the graph is
V = |T'| + k
But T is a spanning tree, and |T| = |T'|, hence
V = |T| + 1 = |T'| + 1
which implies k = 1.
Thus you are asked to simply prove the correctness of Kruskal's algorithm. You can find proofs in the literature easily, for example on wikipedia.
A proof of correctness (by induction on the number of vertices):
Lemma: Let G be a connected graph with N > 1 vertices, and T a minimal spanning tree of G. Let e be an edge in T.
Then T \ {e} projects to a minimal spanning tree of the graph G' obtained from G by identifying the two endpoints a and b of e. Conversely, if T' is a set of edges of G that projects to a minimal spanning tree of G', then T' ∪ {e} is a minimal spanning tree of G.
Proof: Let p : G -> G' be the projection identifying a and b.
Then p(T \ {e}) has no cycles.
Suppose p(T \ {e}) contained a cycle C. Then p^(-1)(C) must be a path connecting a and b. But then T would contain the cycle p^(-1)(C) ∪ {e}, contradicting the premise that T is a tree.
Thus p(T \ {e}) is a cycle-free set of edges of G' with cardinality N - 2, and that implies (see above) that it is a spanning tree.
Let T'' be a minimal spanning tree of G' and S = p^(-1)(T'').
Then S ∪ {e} has no cycles.
If there were a cycle in S, that would project to a cycle in T'', so every cycle in S ∪ {e} must contain e. Suppose C were a cycle in S ∪ {e}. Then C \ {e} is a path connecting a and b, thus C \ {e} projects to a cycle in G', since a and b project to the same vertex of G'. That contradicts the premise that T'' is a tree.
So S ∪ {e} is an edge set of cardinality N - 1 without cycles, and hence (see above) a spanning tree of G.
Then W(T) <= W(S ∪ {e}) since T is a minimal spanning tree, and thus
W(p(T \ {e})) = W(T \ {e}) <= W(S) = W(T'')
Since T'' is assumed to be a minimal spanning tree of G', it follows that equality holds, and that p(T \ {e}) is a minimal spanning tree of G', and that S ∪ {e} is a minimal spanning tree of G.
Now to the induction to prove the correctness of Kruskal's algorithm:
For a graph with at most two vertices, it is obvious that the algorithm produces a minimal spanning tree.
For n >= 2, assume the correctness of the algorithm for all connected graphs with at most n vertices. (Induction hypothesis)
Let G be a connected graph with n+1 vertices. Let e be the first edge chosen in the algorithm, and a and b its endpoints.
Let G' be the graph obtained from G by identifying a and b, and p : G -> G' the projection.
Let T be the edge set selected by the algorithm.
Then p(T \ {e}) is the edge set selected by Kruskal's algorithm on G'. Thus, by the Lemma above, T is a minimal spanning tree of G.
(Okay, probably the proof in wikipedia is simpler, but I wanted to produce a different one.)

Finding a New Minimum Spanning Tree After a New Edge Was Added to The Graph

Let G = (V, E) be a weighted, connected and undirected graph and let T be a minimum spanning tree. Let e be any edge not in E (and has a weight W(e)).
Prove or disprove:
T U {e} is an edge set that contains a minimum spanning tree of G' = (V, E U {e}).
Well, it sounds true to me, so I decided to prove it but I just get stuck every time...
For example, if e is the new edge with minimum weight, who can promise us that the edges in T weren't chosen in a bad way that would prevent us from obtaining a new minimum weight without the 'help' of other edges in E - T ?
I would appreciate any help,
Thanks in advance.
Let [a(1), a(2), ..., a(n-1)] be a sequence of edges selected from E to construct MST of G by Kruskal's algorithm (in the order they were selected - weight(a(i)) <= weight(a(i + 1))).
Let's now consider how Kruskal's Algorithm behaves being given as input E' = E U {e}.
Let i = min{i: weight(e) < weight(a(i))}. First the algorithm chooses the edges [a(1), ..., a(i - 1)] (e hasn't been processed yet, so it behaves the same). Then it needs to decide on e: if e is dropped, the solution for E' will be the same as for E. So let's suppose that the first i edges selected by the algorithm are [a(1), ..., a(i - 1), e]; I will call this new sequence a'. The algorithm continues, and as long as its following selections (for j > i) satisfy a'(j) = a(j - 1), we are cool. There are two scenarios that break such a great streak (let's say the streak breaks at index k + 1):
1) Algorithm selects some edge e' that is not in T, and weight(e') < weight(a(k+1)). By now a' sequence is:
[a(1), ..., a(i-1), e, a(i), a(i+1), ..., a(k-1), a(k), e']
But if it was possible to append e' to this list it would be also possible to append it to [a(1), ..., a(k-1), a(k)]. But Kruskal's algorithm didn't do it when looking for MST for G. That leads to contradiction.
2) Algorithm politely selected:
[a(1), ..., a(i-1), e, a(i), a(i+1), ..., a(k-1), a(k)]
but decided to drop edge a(k+1). But if e were not present in the list, the algorithm would have appended a(k+1). That means that in the graph (V, {a(1), ..., a(k)}) edge a(k+1) would connect the same components as edge e. And that means that, once the algorithm has considered edge a(k+1), the division into connected components (determined by the set of selected edges) is the same for both G and G'. So after processing a(k+1) the algorithm will proceed in the same way in both cases.
Whenever an edge is added to a graph without adding a node, that edge creates a cycle with the minimum spanning tree of the graph. The cycle length may vary from 2 to n, where n is the number of nodes in the graph.
T = Minimum spanning tree of G
Now, to find the MST for (T + added edge), we just have to remove one edge from that cycle, so remove the edge that has maximum weight.
So T' always comes from T U {e}.
And if you are thinking that this doesn't prove that the new MST will be a subset of T U {e}, then analyse Kruskal's algorithm on the new graph: if e has minimum weight, it must be selected for the MST by Kruskal's algorithm, and likewise here, if it has minimum weight it cannot be the edge removed from the cycle.
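The cycle argument above can be sketched in Python. The function name `mst_after_insertion` and the edge format `(u, v, w)` are assumptions made for this example; the sketch finds the tree path between the endpoints of the new edge, which closes the cycle, and swaps out the heaviest tree edge on it only if that edge is strictly heavier than the new one.

```python
def mst_after_insertion(n, tree_edges, new_edge):
    """Update an MST after new_edge = (u, v, w) is added to the graph.

    tree_edges: the old MST as a list of (u, v, w) on vertices 0..n-1.
    """
    adj = [[] for _ in range(n)]
    for e in tree_edges:
        u, v, _ = e
        adj[u].append((v, e))
        adj[v].append((u, e))

    a, b, w = new_edge
    # DFS from a, remembering the tree edge used to reach each vertex.
    reached_by = {a: None}
    stack = [a]
    while stack:
        x = stack.pop()
        for y, e in adj[x]:
            if y not in reached_by:
                reached_by[y] = (x, e)
                stack.append(y)

    # Walk back from b to a: these tree edges plus new_edge form the cycle.
    heaviest = None  # heaviest path edge that is strictly heavier than new_edge
    x = b
    while reached_by[x] is not None:
        x, e = reached_by[x]
        if e[2] > w and (heaviest is None or e[2] > heaviest[2]):
            heaviest = e
    if heaviest is None:
        return list(tree_edges)  # new_edge is a heaviest edge on its cycle
    return [e for e in tree_edges if e != heaviest] + [new_edge]

# Hypothetical example: the path 0-1-2-3 is the old MST.
old_tree = [(0, 1, 1), (1, 2, 5), (2, 3, 2)]
improved = mst_after_insertion(4, old_tree, (0, 2, 3))
unchanged = mst_after_insertion(4, old_tree, (0, 3, 9))
print(improved, unchanged)
```

Only the old tree and the new edge are inspected, which matches the claim that T U {e} contains a minimum spanning tree of the new graph.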

find the minimum size dominating set for a tree using greedy algorithm

Dominating Set (DS) := given an undirected graph G = (V, E), a set of
vertices S ⊆ V is a dominating set if for every vertex v in V, either v is in S
or there is a vertex in S that is adjacent to v. The entire vertex set V is a
trivial dominating set in any graph.
Find minimum size dominating set for a tree.
I'll attempt to prove this in a more formal way.
OUTLINE
To prove your greedy algorithm is correct, you need to prove two things:
First, that your greedy choice is valid and can always be used in the formation of an optimal solution, and
second, that your problem has an optimal substructure property, that is, you can form an optimal solution from optimal solutions to subproblems of your own problem.
Greedy Choice: In your tree T = (V, E), find a vertex v in the tree with the highest number of adjacent leaves. Add it to your dominant set.
Optimal Substructure
T' = (V', E') such that:
V' = V \ ({a : a ϵ V, a is adjacent to v, and a's degree ≤ 2} ∪ {v})
E' = E - any edge involving any of the removed vertices
In other words
Look for a vertex with the highest number of leaves, remove any of its adjacent vertices with degree less than or equal to 2, then remove v itself, and add it to your dominant set. Repeat this until you have no vertices left.
PROOF
Greedy choice proof
For any leaf l, it must be that either itself or its parent is in the dominant set. In our case, the vertex v we would have chosen is in this situation.
Let A = {v1 , v2 , ... , vk} be a minimum dominant set of T. If A already has v as member, we are done. If it does not, we see two situations:
v has some neighbouring leaf l. Then l must be part of the dominant set, otherwise our set does not dominate the entire tree. We can thus simply form A' = (A \ {l}) ∪ {v} and still have a dominant set. Since |A'| = |A|, A' is still optimal.
v does not have any neighbouring leaves. Then, because v was chosen such that it has the highest number of leaves, no vertex in T has any leaves. Then T is not a tree. Contradiction.
Thus, we will always be able to form an optimal solution with our greedy choice.
Optimal Substructure proof
Suppose that A is a minimum dominant set for T = (V, E), but that A' = A \ {v} is not a minimum dominant set for T' as defined above.
Make a minimum dominant set for T', call it B. By the assumption above, |B| < |A'|. It can be shown that B' = B ∪ {v} is a dominating set for T. Then, since |A'| = |A| - 1 and |B'| = |B| + 1, we get |B'| < |A|. This is a contradiction, since we assumed that A is a minimum dominant set. Thus it must be that A' is also a minimum dominant set of T'.
Proving B' = B ∪ {v} is a dominating set for T:
v may have had adjacent vertices that are not in T'. We will show that any vertices that were not considered in T' will be dominated by vertices in B' (this means that we picked our set optimally): Let y be some vertex adjacent to v and not in T'. By definition of T', y can only have degree 1 or 2. Now, y is dominated by v. If y is a leaf, then we are done. However, if y is of degree 2, then y is connected to another node which is necessarily in the dominant set B. This is because, when we removed v to make T', the degree of y became 1, meaning that y or its parent was necessarily added to the dominant set. Hence, B' is a dominant set for T.
1- Always start from the leaves
2- Add their parents to the DS and cut the children
3- Mark the parents of the selected parents as already dominated
4- After completing the process, check whether those marked nodes have children that are not dominated, and add them to the DS
Good luck
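For reference, here is a minimal Python sketch of the leaves-first idea. It is the standard bottom-up greedy for trees rather than a literal transcription of either answer above: vertices are visited children before parents, and whenever a vertex is still undominated, its parent (or the vertex itself, if it is the root) is added to the set. The function name, the choice of vertex 0 as root and the edge-list input format are assumptions made for this example.

```python
def min_dominating_set(n, tree_edges):
    """Greedy minimum dominating set of a tree on vertices 0..n-1.

    tree_edges: list of (u, v) pairs forming a tree.
    """
    adj = [[] for _ in range(n)]
    for u, v in tree_edges:
        adj[u].append(v)
        adj[v].append(u)

    # Root the tree at vertex 0; BFS order lists parents before children.
    parent = [-1] * n
    seen = [False] * n
    seen[0] = True
    order = [0]
    for x in order:  # the list grows while we iterate over it
        for y in adj[x]:
            if not seen[y]:
                seen[y] = True
                parent[y] = x
                order.append(y)

    chosen = [False] * n
    dominated = [False] * n
    # Reverse BFS order visits children before parents (leaves first).
    for x in reversed(order):
        if not dominated[x]:
            pick = parent[x] if parent[x] != -1 else x
            chosen[pick] = True
            dominated[pick] = True
            for y in adj[pick]:
                dominated[y] = True
    return [v for v in range(n) if chosen[v]]

# Hypothetical examples: a path on 5 vertices and a star with 3 leaves.
path_ds = min_dominating_set(5, [(0, 1), (1, 2), (2, 3), (3, 4)])
star_ds = min_dominating_set(4, [(0, 1), (0, 2), (0, 3)])
print(path_ds, star_ds)
```

Picking the parent of an undominated vertex is safe because all of its descendants are already dominated at that point, so the parent covers at least as much as any other choice.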
