MST recursive construction on subgraph - algorithm

MST denotes Minimum Spanning Tree.
Given a graph G = (V, E), arbitrarily partition the vertices into two disjoint sets, V1 and V2.
Let E1 be the set of all edges with both endpoints in V1.
Let E2 be the set of all edges with both endpoints in V2.
Let E3 be the set of all edges with one endpoint in V1 and one in V2.
Now construct an MST M1 on the subgraph (V1, E1) and an MST M2 on the subgraph (V2, E2). Then add the lowest-weight edge in E3 that connects M1 and M2. Does this construct an MST of the original graph G?

My answer is no.
Consider the graph G with vertices {A, B, C, D} and edges {AB = 1, BD = 10, DC = 3, AC = 2}. When it is divided as V1 = {A, C}, V2 = {B, D}, we get E1 = {AC}, E2 = {BD}, E3 = {AB, CD}. According to the description, the resulting tree is {AC, AB, BD} (weight 13), while the true MST is {AB, AC, CD} (weight 6).
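The counterexample can be checked mechanically. The following is a minimal sketch, assuming a plain union-find Kruskal; the helper name kruskal and the (weight, u, v) edge encoding are choices made for this illustration only.

```python
def kruskal(vertices, edges):
    """Return the edges of a minimum spanning forest.

    edges is a list of (weight, u, v) tuples of an undirected graph.
    """
    parent = {v: v for v in vertices}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    chosen = []
    for w, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:  # the edge does not close a cycle
            parent[ru] = rv
            chosen.append((w, u, v))
    return chosen


edges = [(1, "A", "B"), (10, "B", "D"), (3, "C", "D"), (2, "A", "C")]
v1, v2 = {"A", "C"}, {"B", "D"}

e1 = [e for e in edges if e[1] in v1 and e[2] in v1]
e2 = [e for e in edges if e[1] in v2 and e[2] in v2]
e3 = [e for e in edges if (e[1] in v1) != (e[2] in v1)]

# Split construction: MST of each side plus the lightest crossing edge.
split_tree = kruskal(v1, e1) + kruskal(v2, e2) + [min(e3)]
true_mst = kruskal(v1 | v2, edges)

split_weight = sum(w for w, _, _ in split_tree)
true_weight = sum(w for w, _, _ in true_mst)
print(split_weight, true_weight)  # prints "13 6"
```

The split construction is forced to keep the heavy edge BD because it is the only edge inside V2, which is exactly why it loses to the true MST.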
Recall Kruskal's algorithm: edges are sorted by weight in ascending order, and each edge that does not form a cycle with the edges already in the MST is added, one by one. An MST is a tree, so |V|-1 edges will be selected (assuming G is connected, so no vertex is isolated). If |E1| < |V1|-1 or |E2| < |V2|-1, and |E3| > 1, then more than one edge from E3 will be added to the MST when Kruskal's algorithm is run on the whole graph G. If the tree is instead constructed from M1, M2 and the smallest edge in E3, it may miss some of these edges.
Also, if we run Kruskal's algorithm on the whole graph G, more than one of the light edges in E3 may be added (call this set E3'). If we run Kruskal's algorithm separately on (V1, E1) and (V2, E2), edges that are heavier than those in E3' will be added to M1 and M2, and only the smallest edge of E3 will be added. So the total weight of the second tree is larger than that of the first, and it is not a true MST.
Are there any cases where building the MST separately gives the same result as building it on the whole graph?
When E3 has only one edge:
When Kruskal's algorithm is run on the whole graph G, this edge will be added to the MST because it cannot form a cycle with any other edges. It also does not affect the decision on any other edge.
When M1 and M2 are both connected trees with no isolated vertices, and at least |E3|-1 edges of E3 are the heaviest edges in E:
When Kruskal's algorithm is run on the whole graph, the smallest edge in E3 will be added to the MST. It does not affect the decision on any edge of E1 or E2 considered after it, because it cannot form a cycle with edges of E1 and E2 alone. The other edges in E3 will not be added, because they are the last edges Kruskal's algorithm considers and each of them forms a cycle with edges already in the MST.

Related

Shortest path distance from source(s) to all nodes in the graph - O(m + n log(n)) time

Let G(V,E) be a directed weighted graph with edge lengths, where all of the edge lengths are positive except that two of the edges have negative lengths. Given a fixed vertex s, give an algorithm computing shortest paths from s to every other vertex in O(e + v log(v)) time.
My work:
I am thinking about using the reweighting technique of Johnson's algorithm: run the Bellman-Ford algorithm once and then apply Dijkstra's algorithm v times. This gives a time complexity of O(v^2 log v + ve).
That is the standard all-pairs shortest-path problem. As I only need one source vertex (s), my time complexity will be O(v log v + e), right?
For this kind of problem, changing the graph is often a lot easier than changing the algorithm. Let's call the two negative-weight edges N1 and N2; a path by definition cannot use the same edge more than once, so there are four kinds of path:
A. Those which use neither N1 nor N2,
B. Those which use N1 but not N2,
C. Those which use N2 but not N1,
D. Those which use both N1 and N2.
So we can construct a new graph with four copies of each node from the original graph, such that for each node u in the original graph, (u, A), (u, B), (u, C) and (u, D) are nodes in the new graph. The edges in the new graph are as follows:
For each positive weight edge u-v in the original graph, there are four copies of this edge in the new graph, (u, A)-(v, A) ... (u, D)-(v, D). Each edge in the new graph has the same weight as the corresponding edge in the original graph.
For the first negative-weight edge (N1), there are two copies of this edge in the new graph; one from layer A to layer B, and one from layer C to layer D. These new edges have weight 0.
For the second negative-weight edge (N2), there are two copies of this edge in the new graph; one from layer A to layer C, and one from layer B to layer D. These new edges have weight 0.
Now we can run any standard single-source shortest-path algorithm, e.g. Dijkstra's algorithm, just once on the new graph. The shortest path from the source to a node u in the original graph will be one of the following four paths in the new graph, whichever corresponds to a path of the lowest weight in the original graph:
(source, A) to (u, A) with the same weight.
(source, A) to (u, B) with the weight in the new graph plus the (negative) weight of N1.
(source, A) to (u, C) with the weight in the new graph plus the (negative) weight of N2.
(source, A) to (u, D) with the weight in the new graph plus the (negative) weights of N1 and N2.
Since the new graph has 4V vertices and 4(E - 2) + 4 = 4E - 4 edges, the worst-case performance of Dijkstra's algorithm (with a Fibonacci heap) is O((4E - 4) + 4V log 4V), which simplifies to O(E + V log V) as required.
To ensure that a shortest path in the new graph corresponds to a genuine path in the original graph, it remains to be proved that a path from e.g. (source, A) to (u, B) will not use two copies of the same edge from the original graph. That is quite easy to show, but I'll leave it to you as something to think about.
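The layered construction can be sketched as follows. This is an illustration under stated assumptions, not a reference implementation: vertices are numbered 0..n-1, the graph has no negative cycle, and the function name shortest_paths_two_negative is made up for this example. It uses Python's binary heap, so it runs in O((E + V) log V); reaching the stated O(E + V log V) bound would need a Fibonacci heap instead.

```python
import heapq
from math import inf


def shortest_paths_two_negative(n, edges, source):
    """edges: list of directed (u, v, w) edges, at most two with w < 0.

    Assumes there is no negative cycle. Returns the list of distances
    from source (inf for unreachable vertices).
    """
    negative = [e for e in edges if e[2] < 0]
    assert len(negative) <= 2
    # A layer is a bitmask: bit 0 is set once N1 was used, bit 1 once N2 was.
    # Layers 0, 1, 2, 3 play the roles of A, B, C, D in the text.
    adj = [[] for _ in range(4 * n)]
    for u, v, w in edges:
        if w >= 0:
            for layer in range(4):
                adj[layer * n + u].append((layer * n + v, w))
    for bit, (u, v, _) in enumerate(negative):
        for layer in range(4):
            if not layer & (1 << bit):
                target = layer | (1 << bit)
                adj[layer * n + u].append((target * n + v, 0))

    # Ordinary Dijkstra on the layered graph, starting at (source, A).
    dist = [inf] * (4 * n)
    dist[source] = 0
    heap = [(0, source)]
    while heap:
        d, x = heapq.heappop(heap)
        if d > dist[x]:
            continue  # stale heap entry
        for y, w in adj[x]:
            if d + w < dist[y]:
                dist[y] = d + w
                heapq.heappush(heap, (dist[y], y))

    # Add back the true (negative) weights of the edges each layer has used.
    result = []
    for u in range(n):
        best = inf
        for layer in range(4):
            used = [negative[b][2] for b in range(len(negative)) if layer & (1 << b)]
            best = min(best, dist[layer * n + u] + sum(used))
        result.append(best)
    return result


example = [(0, 1, 4), (0, 2, 2), (2, 1, -3), (1, 3, 5), (0, 3, 7), (3, 2, -1)]
print(shortest_paths_two_negative(4, example, 0))
```

Encoding the layer as a bitmask keeps the four cases symmetric: using a negative edge simply sets its bit, so an edge whose bit is already set has no outgoing copy from that layer.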

How to calculate the expected value of random graph generation

Hello, this is my first question. I came across a homework problem on algorithms and probability that I can't figure out how to approach.
Question:
Computing Number of Triangles in a Graph: Given an undirected graph G = (V, E), a triangle in G is a clique of size 3 (formally, a set of nodes {u, v, w} is a triangle in G if (u, v), (v, w), (u, w) are all edges of G). Consider the following algorithm for approximating the number of triangles in a graph. First construct a sampled graph G' = (V, E') as follows. The vertex set of G' is the same as that of G. For every e ∈ E, put e in E' with probability p (think of p as, say, 0.1). In this new sampled graph G', count the number of triangles and let T' be the number of triangles in G' (assume that you are given a black-box subroutine to count the number of triangles in G'). Then output T̃ = T'/p^3.
Show that the expected value of T̃ is T, where T is the number of triangles in the original graph G.
I am confused because the edges of G or G' that form triangles are not independent, since two adjacent triangles in G might share an edge. Also, not every pair of vertices in G can form an edge in G'; only edges that are in G will be present in G', each with probability p. It's hard for me to see the relationship between the number of edges and the number of triangles in G or G'.
I hope someone can give me some hints; it doesn't have to be the whole solution.
"the edges of G or G' that form triangles are not independent, since two adjacent triangles in G might share an edge"
Doesn't matter. The sum of expectations is the expectation of the sum regardless of correlation, so you can reason about the triangles individually. (Higher moments, were you concerned about analyzing the estimation quality of this algorithm, would be trickier.)
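To see linearity of expectation at work, here is a small sketch that computes the expectation exactly by enumerating every possible sampled edge set of a tiny graph. It assumes the estimator divides by p^3, since each triangle survives sampling with probability p^3; the function names and the example graph are made up for this illustration.

```python
from itertools import combinations


def count_triangles(vertices, edges):
    """Count vertex triples whose three connecting edges are all present."""
    edge_set = {frozenset(e) for e in edges}
    return sum(
        1
        for a, b, c in combinations(vertices, 3)
        if {frozenset((a, b)), frozenset((b, c)), frozenset((a, c))} <= edge_set
    )


def expected_estimate(vertices, edges, p):
    """Exact E[T' / p^3], summing over every possible sampled edge set E'."""
    total = 0.0
    m = len(edges)
    for k in range(m + 1):
        for kept in combinations(edges, k):
            prob = p**k * (1 - p) ** (m - k)  # probability of exactly this E'
            total += prob * count_triangles(vertices, kept) / p**3
    return total


# Two triangles sharing the edge (1, 2): {0, 1, 2} and {1, 2, 3}.
vertices = [0, 1, 2, 3]
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3)]
print(count_triangles(vertices, edges), expected_estimate(vertices, edges, 0.1))
```

The two triangles share an edge, so their survival events are correlated, yet the expectation still comes out at the true triangle count up to floating-point rounding.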

Find a minimum spanning tree in different sets

Here I have two connected undirected graphs
G1 = [V; E1] and G2 = [V; E2] on the same set of vertices V. Assume the edges in E1 and E2 have different colors.
Let w(e) be the weight of edge e ∈ E1 ∪ E2.
I want to find a minimum weight spanning tree (MST) among those spanning trees which have at least one edge from each of the sets E1 and E2. How can I find a suitable algorithm for this? I have been stuck on it for a whole night.
Consider two edges e1 ∈ E1, e2 ∈ E2. They connect between 2 and 4 different vertices in V. If they connect 3 or 4 vertices, suppose you first contract the vertices which e1 connects (same as each step in Kruskal's algorithm), then the ones which e2 connects, and then run any minimum spanning tree algorithm on the resulting graph. Then the result is the MST containing e1 and e2.
It follows that you can find the overall MST by looping over all e1 ∈ E1, e2 ∈ E2 (which don't connect exactly the same two vertices), and taking the lightest solution. The proof of correctness can be easily adapted from that of Kruskal's algorithm.
In fact, though, you can make this more efficient, since either the lightest edge in E1 or the lightest edge in E2 must be used in some MST. Suppose that the lightest edge in E1, say e'1, is not used, and consider a cut agreeing with e'1. The MST must contain some e ≠ e'1 crossing the cut. Clearly, if e ∈ E1, then e'1 can be used instead of e. If e ∈ E2, though, and e'1 can't be used in its place, then e is lighter than e'1. In this case, repeating the argument for E2 yields that the lightest edge in E2 can be part of the MST.
Consequently, only the lightest edge of E1 along with any edge in E2, or the lightest edge in E2 along with any edge in E1, must be considered for the first two contractions mentioned above.
The complexity is Θ((|E1| + |E2|) · f(V, E1 ∪ E2)), where f is the complexity of the MST algorithm.
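The simple pair-looping version of this idea (not the more efficient refinement) can be sketched as follows. The function names, the 0..n-1 vertex numbering and the (weight, u, v, colour) edge tuples are assumptions made for this illustration; contraction is simulated by merging the forced edges in a union-find before running Kruskal.

```python
def forced_mst_weight(n, edges, forced):
    """Weight of the lightest spanning tree containing every forced edge.

    Edges are (weight, u, v, colour) tuples on vertices 0..n-1. Returns
    None if the forced edges close a cycle or no spanning tree exists.
    """
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(u, v):
        ru, rv = find(u), find(v)
        if ru == rv:
            return False
        parent[ru] = rv
        return True

    total, used = 0, 0
    for w, u, v, _ in forced:  # contract the forced edges first
        if not union(u, v):
            return None  # e1 and e2 join the same two vertices
        total, used = total + w, used + 1
    for w, u, v, _ in sorted(edges):  # ordinary Kruskal on the rest
        if union(u, v):
            total, used = total + w, used + 1
    return total if used == n - 1 else None


def two_colour_mst_weight(n, red, blue):
    """Try every (red, blue) pair as the two forced edges; keep the lightest."""
    best = None
    for e1 in red:
        for e2 in blue:
            w = forced_mst_weight(n, red + blue, [e1, e2])
            if w is not None and (best is None or w < best):
                best = w
    return best


red = [(1, 0, 1, "red"), (1, 1, 2, "red")]
blue = [(5, 0, 1, "blue"), (5, 1, 2, "blue"), (4, 0, 2, "blue")]
print(two_colour_mst_weight(3, red, blue))
```

In this example the unconstrained MST uses only the two red edges (weight 2), so requiring a blue edge raises the optimum to one red edge plus the blue edge of weight 4.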

Decide whether there is a MST that contains some edges of 2 distinct edge sets

Let G = (V, E) be a weighted, connected and undirected graph. Let T1 and T2 be two different MSTs. Suppose we can write E = A1 ∪ B ∪ A2 such that:
B is the intersection of the edges of T1 and T2, and
A1 = T1 - B
A2 = T2 - B
Assuming that every MST T of G contains all the edges of B, find an algorithm that decides whether there is an MST T that contains at least one edge in A1 and at least one edge in A2.
Edit: I've dropped the part that was here. I think that it does more harm than good.
You should order the edges so that red edges are preferred over blue edges when choosing. Then you can use any MST algorithm, such as Prim's algorithm:
If a graph is empty then we are done immediately. Thus, we assume otherwise. The algorithm starts with a tree consisting of a single vertex, and continuously increases its size one edge at a time, until it spans all vertices.
Input: A non-empty connected weighted graph with vertices V and edges E (the weights can be negative).
Initialize: Vnew = {x}, where x is an arbitrary node (starting point) from V, Enew = {}
Repeat until Vnew = V:
Choose an edge {u, v} with minimal weight such that u is in Vnew and v is not (if there are multiple edges with the same weight, any of them may be picked)
Add v to Vnew, and {u, v} to Enew
Output: Vnew and Enew describe a minimal spanning tree
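The quoted description of Prim's algorithm translates fairly directly into code. This is a minimal sketch, assuming the graph is given as a dict from each vertex to a list of (weight, neighbour) pairs; the function name prim and that representation are choices made here, and the colour preference from the answer is not modelled.

```python
import heapq


def prim(graph, start):
    """graph: dict mapping vertex -> list of (weight, neighbour) pairs.

    Returns the tree edges (u, v, weight) of a minimum spanning tree of
    the connected component containing start.
    """
    v_new = {start}
    e_new = []
    # Heap of candidate edges (weight, u, v) leaving the current tree.
    heap = [(w, start, v) for w, v in graph[start]]
    heapq.heapify(heap)
    while heap and len(v_new) < len(graph):
        w, u, v = heapq.heappop(heap)
        if v in v_new:
            continue  # both endpoints are already in the tree
        v_new.add(v)
        e_new.append((u, v, w))
        for w2, x in graph[v]:
            if x not in v_new:
                heapq.heappush(heap, (w2, v, x))
    return e_new


graph = {
    "A": [(1, "B"), (2, "C")],
    "B": [(1, "A"), (10, "D")],
    "C": [(2, "A"), (3, "D")],
    "D": [(10, "B"), (3, "C")],
}
tree = prim(graph, "A")
print(tree)
```

A preference between equal-weight edges could be expressed by adding a tie-break field after the weight in each heap tuple, since the heap compares tuples field by field.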

Graph - Square of a directed graph

Yes, this will be a homework (I am self-learning not for university) question but I am not asking for solution. Instead, I am hoping to clarify the question itself.
In CLRS 3rd edition, page 593, exercise 22.1-5,
The square of a directed graph G = (V, E) is the graph G^2 = (V, E^2) such that (u,v) ∈ E^2 if and only if G contains a path with at most two edges between u and v. Describe efficient algorithms for computing G^2 from G for both the adjacency-list and adjacency-matrix representations of G. Analyze the running times of your algorithms.
However, in CLRS 2nd edition (I can't find the book link any more), page 530, the same exercise appears with a slightly different description:
The square of a directed graph G = (V, E) is the graph G^2 = (V, E^2) such that (u,w) ∈ E^2 if and only if for some v ∈ V, both (u,v) ∈ E and (v,w) ∈ E. That is, G^2 contains an edge between u and w whenever G contains a path with exactly two edges between u and w. Describe efficient algorithms for computing G^2 from G for both the adjacency-list and adjacency-matrix representations of G. Analyze the running times of your algorithms.
For the old exercise with "exactly two edges", I can understand and can solve it. For example, for adjacency-list, I just do v->neighbour->neighbour.neighbour, then add (v, neighbour.neighbour) to the new E^2.
But for the new exercise with "at most two edges", I am confused.
What does "if and only if G contains a path with at most two edges between u and v" mean?
Since one edge meets the condition "at most two edges", if u and v have only one path between them and it contains only one edge, should I add (u, v) to E^2?
What if u and v have a path with 2 edges, but also another path with 3 edges? Can I add (u, v) to E^2?
Yes, that's exactly what it means. E^2 should contain (u,v) iff E contains (u,v) or there is w in V, such that E contains both (u,w) and (w,v).
In other words, E^2 according to the new definition is the union of E and E^2 according to the old definition.
Regarding your last question: it doesn't matter what other paths between u and v exist (if they do). So, if there are two paths between u and v, one with 2 edges and one with 3 edges, then (u,v) should be in E^2 (according to both definitions).
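Under the "at most two edges" definition, both representations can be handled along these lines. This is a sketch with made-up function names, assuming the adjacency list is a dict that has a key for every vertex and the adjacency matrix is a list of 0/1 rows. The list version does work proportional to the sum over edges (u, w) of the out-degree of w, which is O(V·E) in the worst case; the matrix version is a straightforward O(V^3) boolean product.

```python
def square_adjacency_list(adj):
    """adj: dict mapping every vertex u to a list of its out-neighbours."""
    squared = {}
    for u, neighbours in adj.items():
        reach = set(neighbours)  # paths with one edge
        for w in neighbours:
            reach.update(adj[w])  # paths with exactly two edges
        squared[u] = reach
    return squared


def square_adjacency_matrix(a):
    """a: n x n list of 0/1 rows. Entry (u, v) of the result is 1 when
    a[u][v] is 1 or some w has a[u][w] and a[w][v] both 1."""
    n = len(a)
    return [
        [int(a[u][v] or any(a[u][w] and a[w][v] for w in range(n))) for v in range(n)]
        for u in range(n)
    ]


chain = {0: [1], 1: [2], 2: [3], 3: []}
matrix = [
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
print(square_adjacency_list(chain))
print(square_adjacency_matrix(matrix))
```

Dropping the initial copy of the one-edge neighbours (and the a[u][v] term in the matrix version) gives the older "exactly two edges" definition instead.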
The square of a graph G, G^2, is defined by those vertices V' for which d(u,v) <= 2, and the edges of G^2 are all those edges of G which have both end vertices in V'.

Resources