Minimal spanning tree with degree constraint - algorithm

I have to solve this problem:
Given a weighted connected undirected graph G=(V,E) and vertex u in V.
Describe an algorithm that finds MST for G such that the degree of u
is minimal; the output T of the algorithm is MST and for each another
minimal spanning tree T' being the degree of u in T less than or
equal to the degree of u in T'.
I thought about this algorithm (after some googling I found this solution for similar problem here):
Temporarily delete vertex u.
For each of the resulting connected components C1,…,Cm find a MST using e.g. Kruskal's or Prim's algorithm.
Re-add vertex u and for each Ci add the cheapest edge between 1 and Ci.
EDIT:
I understood that this algorithm may get a wrong MST (see #AndyG comment) so I thought about another one:
let k be the minimal increment between each two weights in G and add 0 < x < k to each adjacent edge of u. (e.g. if all the weights are natural numbers so k=1 and x is fraction).
find a MST using Kruskal's algorithm.
This solution is based on the fact that Kruskal's algorithm iterate the edges
ordered by weigh, so the difference between all the MSTs of G is each edge was chosen from among all edges of the same weight. Therefore, if we increase the degree of the adjacent edges of u, the algorithm will choose the others edges in the same degree and not the adjacent of u unless this edge is necessary for the MST and the degree of u will be the minimal in all the MSTs of G.
I still don't know if it works and how to prove the correctness of this algorithm.
I will appreciate any help.

To summarize the suggested algorithm [with tightened requirements on epsilon (which you called x)]:
Pick a tiny epsilon (such that epsilon * deg(u) is less than d, the smallest non-zero weight difference between any pair of subgraphs). In the case all the original weights are natural numbers, epsilon = 1/(deg(u)+1) suffices.
Add the epsilon to the weights of all edges incident to u
Find a minimal spanning tree.
We'll prove that this procedure finds an MST of the original graph that minimizes the number of edges incident to u.
Let W be the weight of any minimal spanning tree in the original graph.
First, we'll show every MST of the new graph is an MST of the original graph. Any non-MST in the original graph must have weight at least W + d. Any MST in the new graph must have weight at most W + deg(u)*epsilon (since any MST in the original graph has at most this weight in the new graph). Since we chose epsilon such that deg(u)*epsilon < d, we conclude that any MST in the new graph is also an MST in the original graph.
Second, we'll show that the MST of the new graph is the MST of the original graph that minimizes the number of edges incident to u. An MST, T, of the original graph has weight W + k * epsilon in the new graph, where k is the number of edges of T incident to u. We've already shown that every MST of the new graph is also an MST of the original graph. Therefore, the MST of the new graph is the MST of the original graph that minimizes k (the number of edges incident to u).

Related

Backtrack after running Johnson algorithm

I have a question which I was asked in some past exams at my school and I can't find an answer to it.
Is it possible knowing the final matrix after running the Johnson Algorithm on a graph, to know if it previously had negative cycles or not? Why?
Johnson Algorithm
Johnson's Algorithm is a technique that is able to compute shortest paths on graphs. Which is able to handle negative weights on edges, as long as there does not exist a cycle with negative weight.
The algorithm consists of (from Wikipedia):
First, a new node q is added to the graph, connected by zero-weight edges to each of the other nodes.
Second, the Bellman–Ford algorithm is used, starting from the new vertex q, to find for each vertex v the minimum weight h(v) of a path from q to v. If this step detects a negative cycle, the algorithm is terminated.
Next the edges of the original graph are reweighted using the values computed by the Bellman–Ford algorithm: an edge from u to v, having length w(u, v), is given the new length w(u,v) + h(u) − h(v).
Finally, q is removed, and Dijkstra's algorithm is used to find the shortest paths from each node s to every other vertex in the reweighted graph.
If I understood your question correctly, which should have been as follows:
Is it possible knowing the final pair-wise distances matrix after running the Johnson Algorithm on a graph, to know if it originally had any negative-weight edges or not? Why?
As others commented here, we must first assume the graph has no negative weight cycles, since otherwise the Johnson algorithm halts and returns False (due to the internal Bellman-Form detection of negative weight cycles).
The answer then is that if any negative weight edge e = (u, v) exists in the graph, then the shortest weighted distance between u --> v cannot be > 0 (since at the worst case you can travel the negative edge e between those vertices).
Therefore, at least one of the edges had negative weight in the original graph iff any value in the final pair-wise distances is < 0
If the question is supposed to be interpreted as:
Is it possible, knowing the updated non-negative edge weights after running the Johnson Algorithm on a graph, to know if it originally had any negative-weight edges or not? Why?
Then no, you can't tell.
Running the Johnson algorithm on a graph that has only non-negative edge weights will leave the weights unchanged. This is because all of the shortest distances q -> v will be 0. Therefore, given the edge weights after running Johnson, the initial weights could have been exactly the same.

Algorithm to Maximize Degree Centrality of Subgraph

Say I have some graph with nodes and undirected edges (the edges may have a weight associated to them).
I want to find all (or at least one) connected subgraphs that maximize the sum of the degree centrality of all nodes in the subgraph (the degree centrality is based on the original graph) under the constraint that the sum of the weighted edges is < X.
Is there an algorithm that will do this?
A quick search took me to this description of degree centrality. It turns out that the "degree centrality" of a vertex is simply its degree (neighbour count).
Unfortunately your problem is NP-hard, so it's very unlikely that any algorithm exists that can solve every instance quickly. First notice that, assuming edge weights are positive, the edges in any optimal solution necessarily form a tree, since in any non-tree you can delete at least 1 edge without destroying connectivity, and doing so will decrease the total edge weight of the subgraph. (So, as a positive spinoff: If you compute the minimum spanning tree of your input graph and find that it happens to have total weight < X, then you can simply include every vertex in the graph in your solution.)
Let's formulate a decision version of your problem. Given a graph G = (V, E) with positive (I'll assume) weights on the edges, a number X and a number Y, we want to know: Does there exist a connected subgraph G' = (V', E') of G such that the sum of the edge weights in E' is at most X, and the sum of the degrees of V' (w.r.t. G) is at least Y? (Clearly this is no harder than your original problem: If you had an algorithm to solve your problem, then you could just run it, add up the degrees of the vertices in the subgraph it found and compare this to Y to answer "my" problem.)
Here's a reduction from the NP-hard Steiner Tree in Graphs problem, where we are given a graph G = (V, E) with positive weights on the edges, a subset S of its vertices, and a number k, and the task is to determine whether it's possible to connect the vertices in S using a subset of edges with total weight at most k. (As I showed above, the solution will necessarily be a tree.) If the sum of all degrees in G is d, then all we need to do to transform G into an input for your problem is the following: For each vertex s_i in S we add enough new "ballast" vertices that are each connected only to s_i, via edges with weight X+1, to bring the degree of s_i up to d+1. We set X to k, and set Y to |S|(d+1).
Now suppose that the solution to the Steiner Tree problem is YES -- that is, there exists a subset of edges having total weight <= k that does connect all the vertices in S. In that case, it's clear that the same subgraph in the instance of your problem constructed above connects (possibly among others) all the vertices in S, and since each vertex in S has degree d+1, the total degree is at least |S|(d+1), so the answer to your decision problem is also YES.
In the other direction, suppose that the answer to your decision problem is YES -- that is, there exists a subset of edges having total weight <= X ( = k) that connects a set of vertices having total degree at least |S|(d+1). We need to show that this implies a YES answer to the original Steiner Tree problem. Clearly it suffices to show that the vertex set V' of any subgraph satisfying the conditions above (i.e. edges have total weight <= k and vertices have total degree >= |S|(d+1)) contains S (possibly among other vertices). So let V' be the vertex set of such a solution, and suppose to the contrary that there is some vertex u in S that is not in V'. But then the largest sum of degrees that we could possibly make would be to include all other non-ballast vertices in the graph in V', which would give a degree total of at most (|S|-1)(d+1) + d (the first term is the degree sum for the other vertices in S; the second is an upper bound on the degree sum of all non-S vertices in G; note that none of the ballast vertices we added in could be in the subgraph, because the only way to include any of them is to use an edge of weight X+1, which we obviously can't do). But clearly (|S|-1)(d+1) + d = |S|(d+1) - 1, which is strictly less than |S|(d+1), contradicting our assumption that V' has a degree total at least |S|(d+1). So it follows that S is a subset of V', and thus that it is possible to use the same subset of edges to connect the vertices in S for a total weight of at most k, i.e. that the answer to the Steiner Tree problem is also YES.
So a YES answer to either problem implies a YES answer to the other one, in turn implying that a NO answer to either implies a NO answer to the other. Thus if it were possible to solve the decision version of your problem in polynomial time, it would imply a polynomial-time solution to the NP-hard Steiner Tree in Graphs problem. This means the decision version of your problem is itself NP-hard, and so is the optimisation version (which as I said above is at least as hard). (The decision form is also NP-complete, since a YES answer can be easily verified in polynomial time.)
Sidenote: At first I thought I had a very straightforward reduction from the NP-hard Knapsack problem: Given a list of n weights w_1, ..., w_n and a list of n profits p_1, ..., p_n, make a single central vertex c, and n other vertices v_1, ..., v_n. For each v_i, attach it to c with an edge of weight w_i, and add p_i other leaf vertices, each attached only to v_i with an edge of weight X+1. However this reduction doesn't actually work, because the profits can be exponential in the input size n, meaning that the constructed instance of your problem might need to have an exponential number of vertices, which isn't allowed for a polynomial-time reduction.

Finding MST such that a specific vertex has a minimum degree

Given undirected, connected graph G={V,E}, a vertex in V(G), label him v, and a weight function f:E->R+(Positive real numbers), I need to find a MST such that v's degree is minimal. I've already noticed that if all the edges has unique weight, the MST is unique, so I believe it has something to do with repetitive weights on edges. I though about running Kruskal's algorithm, but when sorting the edges, I'll always consider edges that occur on v last. For example, if (a,b),(c,d),(v,e) are the only edges of weight k, so the possible permutations of these edges in the sorted edges array are: {(a,b),(c,d),(v,e)} or {(c,d),(a,b),(v,e)}. I've ran this variation over several graphs and it seems to work, but I couldn't prove it. Does anyone know how to prove the algorithm's correct (Meaning proving v's degree is minimal), or give a contrary example of the algorithm failing?
First note that Kruskal's algorithm can be applied to any weighted graph, whether or not it is connected. In general it results in a minimum-weight spanning forest (MSF), with one MST for each connected component. To prove that your modification of Kruskal's algorithm succeeds in finding the MST for which v has minimal degree, it helps to prove the slightly stronger result that if you apply your algorithm to a possibly disconnected graph then it succeeds in finding the MSF where the degree of v is minimized.
The proof is by induction on the number, k, of distinct weights.
Basis Case (k = 1). In this case weights can be ignored and we are trying to find a spanning forest in which the degree of v is minimized. In this case, your algorithm can be described as follows: pick edges for as long as possible according to the following two rules:
1) No selected edge forms a cycle with previously selected edges
2) An edge involving v isn't selected unless any edge which doesn't
involve v violates rule 1.
Let G' denote the graph from which v and all incident edges have been removed from G. It is easy to see that the algorithm in this special case works as follows. It starts by creating a spanning forest for G'. Then it takes those trees in the forest that are contained in v's connected component in the original graph G and connects each component to v by a single edge. Since the components connected to v in the second stage can be connected to each other in no other way (since if any connecting edge not involving v exists it would have been selected by rule 2) it is easy to see that the degree of v is minimal.
Inductive Case: Suppose that the result is true for k and G is a weighted graph with k+1 distinct weights and v is a specified vertex in G. Sort the distinct weights in increasing order (so that weight k+1 is the longest of the distinct weights -- say w_{k+1}). Let G' be the sub-graph of G with the same vertex set but with all edges of weight w_{k+1} removed. Since the edges are sorted in the order of increasing weight, note that the modified Kruskal's algorithm in effect starts by applying itself to G'. Thus -- by the induction hypothesis prior to considering edges of weight w_{k+1}, the algorithm has succeeded in constructing an MSF F' of G' for which the degree, d' of v in G' is minimized.
As a final step, modified Kruskal's applied to the overall graph G will merge certain of the trees in F' together by adding edges of weight w_{k+1}. One way to conceptualize the final step is the think of F' as a graph where two trees are connected exactly when there is an edge of weight w_{k+1} from some node in the first tree to some node in the second tree. We have (almost) the basis case with F'. Modified Kruskal's will add edged of weight w_{k+1} until it can't do so anymore -- and won't add an edge connecting to v unless there is no other way to connect to trees in F' that need to be connected to get a spanning forest for the original graph G.
The final degree of v in the resulting MSF is d = d'+d" where d" is the number of edges of weight w_{k+1} added at the final step. Neither d' nor d" can be made any smaller, hence it follows that d can't be made any smaller (since the degree of v in any spanning forest can be written as the sum of the number of edges whose weight is less than w_{k+1} coming into v and the number off edges of weight w_{k+1} coming into v).
QED.
There is still an element of hand-waving in this, especially with the final step -- but Stack Overflow isn't a peer-reviewed journal. Anyway, the overall logic should be clear enough.
One final remark -- it seems fairly clear that Prim's algorithm can be similarly modified for this problem. Have you looked into that?

Minimum spanning tree. unique min edge vs non unique proof

So I have an exercise that I should prove or disprove that:
1) if e is a minimum weight edge in the connected graph G such that not all edges are necessarily distinct, then every minimum spanning tree of G contains e
2) Same as 1) but now all edge weights are distinct.
Ok so intuitively, I understand that for 1) since not all edge weights are distinct, then it's possible that a vertex has the path with edge e but also another edge e_1 such that if weight(e) = weight (e_1) then there is a spanning tree which does not contain the edge e since the graph is connected. Otherwise if both e_1 and e are in the minimum spanning tree, then there is a cycle
and for 2) since all edge weights are distinct, then of course the minimum spanning tree will contain the edge e since any algorithm will always choose the smaller path.
Any suggestions on how to prove these two though? induction? Not sure how to approach.
Actually in your first proof when you say that if both e and e_1 are in G, then there's a cycle, that's not true, because they're minimal edges, so there doesn't have to be a cycle, and you do need to include them both into the MST, because if |E| > 1 and |V| > 2 then they both have to be there.
Anyways, a counter example for the first one is a complete graph with all edges of the same weight as e, the MST will contain only |V|-1 edges, but you didn't include all the other edges of that same weight, hence you have a contradiction.
As for the second one, if all edges are distinct, then if you remove the minimum edge and want to reconstruct the MST, the only way to go about this is to add a an edge connecting the 2 disjoint sets that were broken up by that minimum-weight edge.
Now suppose that you didn't remove that minimum-weight edge, and added that other edge, now you've created a cycle, and since all edges are distinct the cycle-creating edge will be greater than all of them, hence if you remove any former MST edge from that cycle, it will only increase the size of the MST. Which means that pretty much all former MST edges are critical when all edges have distinct weights.

How to find maximum spanning tree?

Does the opposite of Kruskal's algorithm for minimum spanning tree work for it? I mean, choosing the max weight (edge) every step?
Any other idea to find maximum spanning tree?
Yes, it does.
One method for computing the maximum weight spanning tree of a network G –
due to Kruskal – can be summarized as follows.
Sort the edges of G into decreasing order by weight. Let T be the set of edges comprising the maximum weight spanning tree. Set T = ∅.
Add the first edge to T.
Add the next edge to T if and only if it does not form a cycle in T. If
there are no remaining edges exit and report G to be disconnected.
If T has n−1 edges (where n is the number of vertices in G) stop and
output T . Otherwise go to step 3.
Source: https://web.archive.org/web/20141114045919/http://www.stats.ox.ac.uk/~konis/Rcourse/exercise1.pdf.
From Maximum Spanning Tree at Wolfram MathWorld:
"A maximum spanning tree is a spanning tree of a weighted graph having maximum weight. It can be computed by negating the weights for each edge and applying Kruskal's algorithm (Pemmaraju and Skiena, 2003, p. 336)."
If you invert the weight on every edge and minimize, do you get the maximum spanning tree? If that works you can use the same algorithm. Zero weights will be a problem, of course.
Although this thread is too old, I have another approach for finding the maximum spanning tree (MST) in a graph G=(V,E)
We can apply some sort Prim's algorithm for finding the MST. For that I have to define Cut Property for the maximum weighted edge.
Cut property: Let say at any point we have a set S which contains the vertices that are in MST( for now assume it is calculated somehow ). Now consider the set S/V ( vertices not in MST ):
Claim: The edge from S to S/V which has the maximum weight will always be in every MST.
Proof: Let's say that at a point when we are adding the vertices to our set S the maximum weighted edge from S to S/V is e=(u,v) where u is in S and v is in S/V. Now consider an MST which does not contain e. Add the edge e to the MST. It will create a cycle in the original MST. Traverse the cycle and find the vertices u' in S and v' in S/V such that u' is the last vertex in S after which we enter S/V and v' is the first vertex in S/V on the path in cycle from u to v.
Remove the edge e'=(u',v') and the resultant graph is still connected but the weight of e is greater than e' [ as e is the maximum weighted edge from S to S/V at this point] so this results in an MST which has sum of weights greater than original MST. So this is a contradiction. This means that edge e must be in every MST.
Algorithm to find MST:
Start from S={s} //s is the start vertex
while S does not contain all vertices
do
{
for each vertex s in S
add a vertex v from S/V such that weight of edge e=(s,v) is maximum
}
end while
Implementation:
we can implement using Max Heap/Priority Queue where the key is the maximum weight of the edge from a vertex in S to a vertex in S/V and value is the vertex itself. Adding a vertex in S is equal to Extract_Max from the Heap and at every Extract_Max change the key of the vertices adjacent to the vertex just added.
So it takes m Change_Key operations and n Extract_Max operations.
Extract_Min and Change_Key both can be implemented in O(log n). n is the number of vertices.
So This takes O(m log n) time. m is the number of edges in the graph.
Let me provide an improvement algorithm:
first construct an arbitrary tree (using BFS or DFS)
then pick an edge outside the tree, add to the tree, it will form a cycle, drop the smallest weight edge in the cycle.
continue doing this util all the rest edges are considered
Thus, we'll get the maximum spanning tree.
This tree satisfies any edge outside the tree, if added will form a cycle and the edge outside <= any edge weights in the cycle
In fact, this is a necessary and sufficient condition for a spanning tree to be maximum spanning tree.
Pf.
Necessary: It's obvious that this is necessary, or we could swap edge to make a tree with a larger sum of edge weights.
Sufficient: Suppose tree T1 satisfies this condition, and T2 is the maximum spanning tree.
Then for the edges T1 ∪ T2, there're T1-only edges, T2-only edges, T1 ∩ T2 edges, if we add a T1-only edge(x1, xk) to T2, we know it will form a cycle, and we claim, in this cycle there must exist one T2-only edge that has the same edge weights as (x1, xk). Then we can exchange these edges will produce a tree with one more edge in common with T2 and has the same sum of edge weights, repeating doing this we'll get T2. so T1 is also a maximum spanning tree.
Prove the claim:
suppose it's not true, in the cycle we must have a T2-only edge since T1 is a tree. If none of the T2-only edges has a value equal to that of (x1, xk), then each of T2-only edges makes a loop with tree T1, then T1 has a loop leads to a contradiction.
This algorithm taken from UTD professor R. Chandrasekaran's notes. You can refer here: Single Commodity Multi-terminal Flows
Negate the weight of original graph and compute minimum spanning tree on the negated graph will give the right answer. Here is why: For the same spanning tree in both graphs, the weighted sum of one graph is the negation of the other. So the minimum spanning tree of the negated graph should give the maximum spanning tree of the original one.
Only reversing the sorting order, and choosing a heavy edge in a vertex cut does not guarantee a Maximum Spanning Forest (Kruskal's algorithm generates forest, not tree). In case all edges have negative weights, the Max Spanning Forest obtained from reverse of kruskal, would still be a negative weight path. However the ideal answer is a forest of disconnected vertices. i.e. a forest of |V| singleton trees, or |V| components having total weight of 0 (not the least negative).
Change the weight in a reserved order(You can achieve this by taking a negative weight value and add a large number, whose purpose is to ensure non-negative) Then run your family geedy-based algorithm on the minimum spanning tree.

Resources