DFS and depth tree - data-structures

DFS and depth tree - data-structures

I need to decide whether the following statement is true or false. If true, explain why, if false give a counterexample.
Let T be the depth-tree resulting from running the DFS algorithm on a graph G and a source vertex s.
G is an undirected graph with a cycle of exactly 3 vertices.
So T necessarily does not contain all the shortest paths from s to the other vertices in G.
I think it's true but I don't know how to prove it or to explain why

This is a little tricky, because the graph G may contain other cycles. There are many ways that the 3-cycle might be embedded in the tree, and it may be that none of the edges in the cycle are in the tree at all, so it's difficult to make a proof by dividing all the possible embeddings into a few cases.
I think the easiest way to prove this is:
If T is a DFS tree for G rooted at s, then every edge in G is either in T, or it connects a node to an ancestor in T.
Note that the vertices adjacent to any edge are at different heights. For edges in the tree, they differ by 1. For other edges they differ by at least 1.
The sum of height changes around a cycle must be 0, since it starts and ends at the same vertex.
It cannot be that all the height differences in the 3-cycle are of magnitude 1, because then their sum would be an odd number -- not 0.
At least one edge in the 3-cycle must therefore have a height difference > 1. This edge will be a short-cut to its lower vertex, since all paths in the tree can change height by at most one per step.

Related

Show that the heuristic solution to vertex cover is at most twice as large as the optimal solution

The heuristic solution that I've been given is:
Perform a depth-first-search on the graph
Delete all the leaves
The remaining graph forms a vertex cover
I've been given the question: "Show that this heuristic is at most twice as large as the optimal solution to the vertex cover". How can I show this?

I assume that the graph is connected (if it's not the case, we can solve this problem for each component separately).
I also assume that a dfs-tree is rooted and a leaf is a vertex that doesn't have children in the rooted dfs-tree (it's important. If we define it differently, the algorithm may not work).
We need to show to things:
The set of vertices returned by the algorithm is a vertex cover. Indeed, there can be only types of edges in the dfs-tree of any undirected graph: tree edges (such an edge is covered as at least on of its endpoints is not a leaf) and a back edge (again, one of its endpoint is not a leaf because back edge goes from a vertex to its ancestor. A leaf cannot be an ancestor of a leaf).
Let's consider the dfs-tree and ignore the rest of the edges. I'll show that it's not possible to cover tree edges using less than half non-leave vertices. Let S be a minimum vertex cover. Consider a vertex v, such that v is not a leaf and v is not in S (that is, v is returned by the heuristic in question but it's not in the optimal answer). v is not a leaf, thus there is an edge v -> u in the dfs-tree (where u is a successor of v). The edge v -> u is covered by S. Thus, u is in S. Let's define a mapping f from vertices returned by the heuristic that are not in S as f(v) = u (where v and u have the same meaning as in the previous sentence). Note that v is a parent of u in the dfs-tree. But there can be only one parent for any vertex in a tree! Thus, f is an injection. It means that the number of vertices in the set returned by the heuristic but not in the optimal answer is not greater than the size of the optimal answer. That's exactly what we needed to show.

Bad news: heuristics does not work.
Strictly said, 1 isolated vertex is counter-example for the question.
Nevertheless, heuristic does not provide vertex cover solution at all, even if you correct it for isolated vertex and for 2-point cliques.
Take a look at fully connected graphs with number of vertexes from 1 to 3:
1 - strictly said, isolated vertex is not a leaf (it has degree 0, while leaf is a vertex with degree 1), so heuristic will keep it, while vertex cover will not
2 - heuristic will drop both leaves, while vertex cover will keep at least 1 of them
3 - heuristic will leave 1 vertex, while vertex cover has to keep at least 2 vertexes of this clique

Negative weight edges

Full question: Argue that if all edge weights of a graph are positive, then any subset of edges that connects all vertices and has minimum total weight must be a tree. Give an example to show that the same conclusion does not follow if we allow some weights to be nonpositive.
My answer: Since the edges connects all vertices, it must be a tree. In a graph, you can remove one of the edges and still connect all the vertices. Also, negative edges can be allowed in a graph (e.g. Prim and Kruskal's algorithms).
Please let me know if there's a definite answer to this and explain to me how you got the conclusion. I'm a little bit lost with this question.

First off, a tree is a type of graph. So " In a graph, you can remove one of the edges and still connect all the vertices" isn't true. A tree is a graph without cycles - i.e., with only one path between any two nodes.
Negatives weights in general can exist in either a tree or a graph.
The way to approach this problem is to show that if you have a graph that connects all components, but is NOT a tree, then it is also not of minimum weight (i.e., there is some other graph that does the same thing, with a lower total weight.) This conclusion is only true if the graph contains only positive edges, so you should also provide a counterexample - a graph which is NOT a tree, which IS of minimum weight, and which IS fully connected.

With non-negative weights, adding an edge to traverse from one node to another always results in the weight increasing, so for minimum weight you always avoid that.
If you allow negative weights, adding an edge may result in reducing the weight. If you have a cycle with negative weight overall, minimum weight demands that you stay in that cycle infinitely (leading to infinitely negative weight for the path overall).

Finding MST such that a specific vertex has a minimum degree

Given undirected, connected graph G={V,E}, a vertex in V(G), label him v, and a weight function f:E->R+(Positive real numbers), I need to find a MST such that v's degree is minimal. I've already noticed that if all the edges has unique weight, the MST is unique, so I believe it has something to do with repetitive weights on edges. I though about running Kruskal's algorithm, but when sorting the edges, I'll always consider edges that occur on v last. For example, if (a,b),(c,d),(v,e) are the only edges of weight k, so the possible permutations of these edges in the sorted edges array are: {(a,b),(c,d),(v,e)} or {(c,d),(a,b),(v,e)}. I've ran this variation over several graphs and it seems to work, but I couldn't prove it. Does anyone know how to prove the algorithm's correct (Meaning proving v's degree is minimal), or give a contrary example of the algorithm failing?

First note that Kruskal's algorithm can be applied to any weighted graph, whether or not it is connected. In general it results in a minimum-weight spanning forest (MSF), with one MST for each connected component. To prove that your modification of Kruskal's algorithm succeeds in finding the MST for which v has minimal degree, it helps to prove the slightly stronger result that if you apply your algorithm to a possibly disconnected graph then it succeeds in finding the MSF where the degree of v is minimized.
The proof is by induction on the number, k, of distinct weights.
Basis Case (k = 1). In this case weights can be ignored and we are trying to find a spanning forest in which the degree of v is minimized. In this case, your algorithm can be described as follows: pick edges for as long as possible according to the following two rules:
1) No selected edge forms a cycle with previously selected edges
2) An edge involving v isn't selected unless any edge which doesn't
involve v violates rule 1.
Let G' denote the graph from which v and all incident edges have been removed from G. It is easy to see that the algorithm in this special case works as follows. It starts by creating a spanning forest for G'. Then it takes those trees in the forest that are contained in v's connected component in the original graph G and connects each component to v by a single edge. Since the components connected to v in the second stage can be connected to each other in no other way (since if any connecting edge not involving v exists it would have been selected by rule 2) it is easy to see that the degree of v is minimal.
Inductive Case: Suppose that the result is true for k and G is a weighted graph with k+1 distinct weights and v is a specified vertex in G. Sort the distinct weights in increasing order (so that weight k+1 is the longest of the distinct weights -- say w_{k+1}). Let G' be the sub-graph of G with the same vertex set but with all edges of weight w_{k+1} removed. Since the edges are sorted in the order of increasing weight, note that the modified Kruskal's algorithm in effect starts by applying itself to G'. Thus -- by the induction hypothesis prior to considering edges of weight w_{k+1}, the algorithm has succeeded in constructing an MSF F' of G' for which the degree, d' of v in G' is minimized.
As a final step, modified Kruskal's applied to the overall graph G will merge certain of the trees in F' together by adding edges of weight w_{k+1}. One way to conceptualize the final step is the think of F' as a graph where two trees are connected exactly when there is an edge of weight w_{k+1} from some node in the first tree to some node in the second tree. We have (almost) the basis case with F'. Modified Kruskal's will add edged of weight w_{k+1} until it can't do so anymore -- and won't add an edge connecting to v unless there is no other way to connect to trees in F' that need to be connected to get a spanning forest for the original graph G.
The final degree of v in the resulting MSF is d = d'+d" where d" is the number of edges of weight w_{k+1} added at the final step. Neither d' nor d" can be made any smaller, hence it follows that d can't be made any smaller (since the degree of v in any spanning forest can be written as the sum of the number of edges whose weight is less than w_{k+1} coming into v and the number off edges of weight w_{k+1} coming into v).
QED.
There is still an element of hand-waving in this, especially with the final step -- but Stack Overflow isn't a peer-reviewed journal. Anyway, the overall logic should be clear enough.
One final remark -- it seems fairly clear that Prim's algorithm can be similarly modified for this problem. Have you looked into that?

Minimum spanning tree. unique min edge vs non unique proof

So I have an exercise that I should prove or disprove that:
1) if e is a minimum weight edge in the connected graph G such that not all edges are necessarily distinct, then every minimum spanning tree of G contains e
2) Same as 1) but now all edge weights are distinct.
Ok so intuitively, I understand that for 1) since not all edge weights are distinct, then it's possible that a vertex has the path with edge e but also another edge e_1 such that if weight(e) = weight (e_1) then there is a spanning tree which does not contain the edge e since the graph is connected. Otherwise if both e_1 and e are in the minimum spanning tree, then there is a cycle
and for 2) since all edge weights are distinct, then of course the minimum spanning tree will contain the edge e since any algorithm will always choose the smaller path.
Any suggestions on how to prove these two though? induction? Not sure how to approach.

Actually in your first proof when you say that if both e and e_1 are in G, then there's a cycle, that's not true, because they're minimal edges, so there doesn't have to be a cycle, and you do need to include them both into the MST, because if |E| > 1 and |V| > 2 then they both have to be there.
Anyways, a counter example for the first one is a complete graph with all edges of the same weight as e, the MST will contain only |V|-1 edges, but you didn't include all the other edges of that same weight, hence you have a contradiction.
As for the second one, if all edges are distinct, then if you remove the minimum edge and want to reconstruct the MST, the only way to go about this is to add a an edge connecting the 2 disjoint sets that were broken up by that minimum-weight edge.
Now suppose that you didn't remove that minimum-weight edge, and added that other edge, now you've created a cycle, and since all edges are distinct the cycle-creating edge will be greater than all of them, hence if you remove any former MST edge from that cycle, it will only increase the size of the MST. Which means that pretty much all former MST edges are critical when all edges have distinct weights.

minimum connected subgraph containing a given set of nodes

I have an unweighted, connected graph. I want to find a connected subgraph that definitely includes a certain set of nodes, and as few extras as possible. How could this be accomplished?
Just in case, I'll restate the question using more precise language. Let G(V,E) be an unweighted, undirected, connected graph. Let N be some subset of V. What's the best way to find the smallest connected subgraph G'(V',E') of G(V,E) such that N is a subset of V'?
Approximations are fine.

This is exactly the well-known NP-hard Steiner Tree problem. Without more details on what your instances look like, it's hard to give advice on an appropriate algorithm.

I can't think of an efficient algorithm to find the optimal solution, but assuming that your input graph is dense, the following might work well enough:
Convert your input graph G(V, E) to a weighted graph G'(N, D), where N is the subset of vertices you want to cover and D is distances (path lengths) between corresponding vertices in the original graph. This will "collapse" all vertices you don't need into edges.
Compute the minimum spanning tree for G'.
"Expand" the minimum spanning tree by the following procedure: for every edge d in the minimum spanning tree, take the corresponding path in graph G and add all vertices (including endpoints) on the path to the result set V' and all edges in the path to the result set E'.
This algorithm is easy to trip up to give suboptimal solutions. Example case: equilateral triangle where there are vertices at the corners, in midpoints of sides and in the middle of the triangle, and edges along the sides and from the corners to the middle of the triangle. To cover the corners it's enough to pick the single middle point of the triangle, but this algorithm might choose the sides. Nonetheless, if the graph is dense, it should work OK.

The easiest solutions will be the following:
a) based on mst:
- initially, all nodes of V are in V'
- build a minimum spanning tree of the graph G(V,E) - call it T.
- loop: for every leaf v in T that is not in N, delete v from V'.
- repeat loop until all leaves in T are in N.
b) another solution is the following - based on shortest paths tree.
- pick any node in N, call it v, let v be a root of a tree T = {v}.
- remove v from N.
loop:
1) select the shortest path from any node in T and any node in N. the shortest path p: {v, ... , u} where v is in T and u is in N.
2) every node in p is added to V'.
3) every node in p and in N is deleted from N.
--- repeat loop until N is empty.
At the beginning of the algorithm: compute all shortest paths in G using any known efficient algorithm.
Personally, I used this algorithm in one of my papers, but it is more suitable for distributed enviroments.
Let N be the set of nodes that we need to interconnect. We want to build a minimum connected dominating set of the graph G, and we want to give priority for nodes in N.
We give each node u a unique identifier id(u). We let w(u) = 0 if u is in N, otherwise w(1).
We create pair (w(u), id(u)) for each node u.
each node u builds a multiset relay node. That is, a set M(u) of 1-hop neigbhors such that each 2-hop neighbor is a neighbor to at least one node in M(u). [the minimum M(u), the better is the solution].
u is in V' if and only if:
u has the smallest pair (w(u), id(u)) among all its neighbors.
or u is selected in the M(v), where v is a 1-hop neighbor of u with the smallest (w(u),id(u)).
-- the trick when you execute this algorithm in a centralized manner is to be efficient in computing 2-hop neighbors. The best I could get from O(n^3) is to O(n^2.37) by matrix multiplication.
-- I really wish to know what is the approximation ration of this last solution.
I like this reference for heuristics of steiner tree:
The Steiner tree problem, Hwang Frank ; Richards Dana 1955- Winter Pawel 1952

You could try to do the following:
Creating a minimal vertex-cover for the desired nodes N.
Collapse these, possibly unconnected, sub-graphs into "large" nodes. That is, for each sub-graph, remove it from the graph, and replace it with a new node. Call this set of nodes N'.
Do a minimal vertex-cover of the nodes in N'.
"Unpack" the nodes in N'.
Not sure whether or not it gives you an approximation within some specific bound or so. You could perhaps even trick the algorithm to make some really stupid decisions.

As already pointed out, this is the Steiner tree problem in graphs. However, an important detail is that all edges should have weight 1. Because |V'| = |E'| + 1 for any Steiner tree (V',E'), this achieves exactly what you want.
For solving it, I would suggest the following Steiner tree solver (to be transparent: I am one of the developers):
https://scipjack.zib.de/
For graphs with a few thousand edges, you will usually get an optimal solution in less than 0.1 seconds.

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio