undirected graphs algorithms - algorithm

Suppose that we have an n-node, m-edge undirected graph G = (V, E) and two distinct nodes
called s and t. Suppose that the distance between s and t is strictly greater than n/2. Show that there
is a node v, different from s and t, such that every path from s to t goes through v.
Give an algorithm with running time O(n + m) to find such a vertex. You do not have to
prove that your algorithm is correct, but you must give a proof that a vertex like v exists.
I cannot figure out an exact answer to this past paper question. Help me out!

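Suppose there are two paths between s and t that share no node other than s and t. Let d > n/2 be the distance between s and t. Each path has at least d edges, so at least d - 1 nodes strictly between s and t. Together with s and t, the graph then has at least 2(d - 1) + 2 = 2d > n nodes, which is a contradiction. So two such paths cannot exist, and by Menger's theorem (s and t are not adjacent, since d >= 2) a single node v separates s from t.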
For the algorithm, it is enough to find any s-t path and then see where the subgraph that is connected to one side (s) without using path nodes ends. In more detail:
if s is connected to only one node, then that node is the one we are looking for.
if not, run a BFS from s
find a path s-t
find the nodes connected to s without using edges going out of nodes on the s-t path
the last node on the s-t path that is in this connected part is the node we are looking for.
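A minimal sketch of one way to get the O(n + m) bound, using BFS layers rather than the path-based steps above (a swapped-in technique, not the answerer's exact method). It assumes the graph is given as a dict `adj` mapping each node to a list of neighbours and that t is reachable from s; the function name `find_cut_vertex` is made up for illustration. Every s-t path must cross each BFS layer strictly between s and t, so a layer holding a single node yields the vertex we want, and the counting argument guarantees such a layer when the distance exceeds n/2.

```python
from collections import deque


def find_cut_vertex(adj, s, t):
    """Return a node v (v != s, v != t) lying on every s-t path, or None.

    Assumption: adj is {node: [neighbours]} for an undirected graph and
    t is reachable from s.
    """
    # Plain BFS from s to get the distance (layer index) of every node.
    dist = {s: 0}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)

    d = dist[t]
    # Count the nodes in each layer strictly between s and t.
    count = [0] * (d + 1)
    representative = [None] * (d + 1)
    for v, dv in dist.items():
        if 0 < dv < d:
            count[dv] += 1
            representative[dv] = v

    # An edge joins layers that differ by at most 1, so every s-t path
    # visits every layer 1..d-1; a singleton layer is a cut vertex.
    for i in range(1, d):
        if count[i] == 1:
            return representative[i]
    return None
```

Both the BFS and the layer scan touch each node and edge a constant number of times, which gives O(n + m).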

Related

Find Two vertices with lowest path weight

I am trying to solve this question but got stuck.
Need some help, thanks.
Given an undirected connected graph G with non-negative edge weights.
Let A be a subset of V(G), where V(G) is the set of vertices of G.
- Find a pair of vertices (a, b) in A such that the weight of the shortest path between them in G is minimal, in O((E + V) log V).
I got the idea of running Dijkstra's algorithm from each node, which gives O(V (E + V) log V), which is too much.
So I thought about connecting the vertices in A somehow, but didn't find any useful way.
I also tried changing the way Dijkstra's algorithm works, but it gets too hard to prove, with no improvement in time complexity.
Note that if the optimal pair is (a, b), then from every node u in the optimal path, a and b are the closest two nodes in A.
I believe we should extend Dijkstra's algorithm in the following manners:
Start with all nodes in A, instead of a single source_node.
For each node, don't just remember the shortest_distance and the previous_node, but also the closest_source_node to remember which node in A gave the shortest distance.
Also, for each node, remember the second_shortest_distance, the second_closest_source_node, and previous_for_second_closest_source_node (shorter name suggestions are welcome). Make sure that second_closest_source_node is never the closest_source_node. Also, think carefully about how you update these variables: the optimal path for a node can become part of the second-best path for its neighbour.
Visit the entire graph, don't just stop at the first node whose closest_source and second_closest_source are found.
Once the entire graph is covered, search for the node whose shortest_distance + second_shortest_distance is smallest.
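The multi-source idea can be sketched in a slightly simpler form than the one described above: instead of tracking a second-closest source per node, run one multi-source Dijkstra that records only the closest source, then scan the edges whose endpoints were claimed by different sources. This is a named simplification, not the answerer's exact bookkeeping. The input format (`adj` as `{node: [(neighbour, weight), ...]}`) and the function name are assumptions made for illustration, and the graph is assumed connected, as in the question.

```python
import heapq


def closest_pair_in_subset(adj, sources):
    """Return (distance, a, b) for the closest pair of distinct nodes in sources.

    Assumption: adj is {node: [(neighbour, weight), ...]} for a connected,
    undirected graph with non-negative weights.
    """
    dist = {v: float("inf") for v in adj}
    src = {}
    heap = []
    for a in sources:
        dist[a] = 0
        src[a] = a          # every source is its own closest source
        heap.append((0, a))
    heapq.heapify(heap)

    # Standard Dijkstra, started from all sources at once.
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue        # stale heap entry
        for v, w in adj[u]:
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                src[v] = src[u]
                heapq.heappush(heap, (nd, v))

    # An edge whose endpoints belong to different sources closes a walk
    # between those two sources; the cheapest such walk is the answer.
    best = (float("inf"), None, None)
    for u in adj:
        for v, w in adj[u]:
            if src[u] != src[v]:
                cand = dist[u] + w + dist[v]
                if cand < best[0]:
                    best = (cand, src[u], src[v])
    return best
```

The Dijkstra pass costs O((E + V) log V) and the edge scan O(E), so the total stays within the required bound.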

Removing a node in an undirected graph that destroys a path between two other nodes

I need help with this problem that I'm currently working on, which involves finding a node v in an undirected graph that, when removed, will destroy all paths between two other nodes s and t.
Suppose that an n-node undirected graph G = (V, E) contains two nodes s and t such that the distance between s and t is strictly greater than n/2. Show that there must exist some node v, not equal to either s or t, such that deleting v from G destroys all s-t paths. (In other words, the graph obtained from G by deleting v contains no path from s to t.)
Give an algorithm with running time O(m + n) to find such a node v.
(For solution, you can use either plain English or pseudo code.)
My understanding is that the solution would involve a breadth-first search that finds the node v and removes it, but I'm not certain how to prove that a node whose removal destroys all s-t paths exists in the first place.
First, the proof:
Assume no such node v exists. Then there are at least two paths from s to t that use entirely different intermediate nodes, and each has length greater than n/2. This is impossible, since the graph does not have enough nodes for two such paths. This contradicts our assumption; therefore such a node v exists.
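The counting step can be made explicit. Writing d for the s-t distance (a symbol introduced here for illustration), each s-t path has at least d edges and hence at least d - 1 intermediate nodes, so two paths sharing no intermediate node would require:

```latex
\[
  n \;\ge\; \underbrace{2}_{s,\;t} \;+\; \underbrace{2(d-1)}_{\text{intermediate nodes}}
    \;=\; 2d \;>\; 2 \cdot \frac{n}{2} \;=\; n ,
\]
```

which is impossible.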
Second, the algorithm:
Use bidirectional BFS. Since such a node v exists, searches started from s and from t will meet at it. In the worst case you go through all of V and E, so it is O(V + E).

graph - The implementation of updating Minimum Spanning Tree after adding a new edge

Here is an exercise:
Suppose we are given the minimum spanning tree T of a given graph G
(with n vertices and m edges) and a new edge e = (u, v) of weight w
that we will add to G. Give an efficient algorithm to find the minimum
spanning tree of the graph G + e. Your algorithm should run in O(n)
time to receive full credit.
I have this idea:
In the MST, just find out the path between u and v. Then find the edge (along the path) with maximum weight; if the maximum weight is bigger than w, then remove that edge from the MST and add the new edge to the MST.
The tricky part is how to do this in O(n) time, and that is where I get stuck.
The question is how the MST is stored. In the usual Prim's algorithm, the MST is stored as a parent array, i.e., each element is the parent of the corresponding vertex.
So suppose the exercise gives me a parent array representing the MST. How can I implement the above algorithm in O(n)?
First, how can I identify the path between u and v from the parent array? I can have two ancestor arrays for u and v, then check on the common ancestor, then I can get the path, although in backwards. I think for this part, to find the common ancestor, at least I have to do it in O(n^2), right?
Then, we have the path. But we still need to find the weight of each edge along the path. Since I suppose the graph will use adjacency-list for Prim's algorithm, we have to do O(m) (m is the number of edges) to locate each weight of the edge.
...
So I don't see how it is possible to do the algorithm in O(n). Am I wrong?
The idea you have is right. Note that finding the path between u and v is O(n). I'll assume you have a parent array identifying the MST. Tracing the path (for the max edge) from u up to v or to the root vertex takes only O(n). If you reach the root vertex, just trace the path from v up to u or to the root vertex.
Now that you have the path from u -> u1 ... -> max_path_vert1 -> max_path_vert2 -> ... -> v, remove the edge max_path_vert1->max_path_vert2 (assuming this is greater than the added edge) and reverse the parents for u->...->max_path_vert1 and mark parent[u] = v.
Edit: More explanation for clarity
Note that in an MST there is exactly one path between any pair of vertices. So, if you trace from u->y and v->y (where y is the root or a common ancestor of u and v), you have traced through at most n vertices. If you traced more than n vertices, that would mean you visited a vertex twice, which cannot happen in an MST. OK, now hopefully you're convinced it's O(n) to trace from u->y and v->y. Once you have these paths, you have established a path from u->v. Do you see how? I'm assuming this is an undirected graph, since finding an MST for a directed graph is a different concept in itself. For an undirected graph, when you have a path from x->y you have a path from y->x. So, u->y->v exists. You don't even need to trace back from y->v, since the weights for v->y are the same as those for y->v. Just find the edge with the maximum weight when you trace from u->y and v->y.
Now for finding edge weights in O(1); how are you storing your current weights? Adjacency list or adjacency matrix? For O(1) access, store it the way parent vertex array is stored. So, weight[v] = weight(v, parent[v]). So, you'll have O(1) access. Hope this helps.
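A minimal sketch of the parent-array update described in this answer, under stated assumptions: `parent[x]` is the parent of x in the MST (`None` for the root), `weight[x]` is the weight of the edge (x, parent[x]) as suggested above, and vertices are numbered 0..n-1. The function names are invented for illustration. Each loop walks at most one root-ward path, so the whole update is O(n).

```python
def reverse_chain(parent, weight, start, stop):
    """Reverse parent pointers along start -> ... -> stop so that start
    ends up on top of that chain (its parent is left as None)."""
    prev, prev_w = None, None
    node = start
    while True:
        nxt, nxt_w = parent[node], weight[node]
        parent[node], weight[node] = prev, prev_w
        if node == stop:
            break
        prev, prev_w = node, nxt_w
        node = nxt


def add_edge_to_mst(parent, weight, u, v, w):
    """Update the MST in place after adding edge (u, v) of weight w.
    Returns True if the tree changed."""
    # Mark u and all of its ancestors.
    above_u = set()
    x = u
    while x is not None:
        above_u.add(x)
        x = parent[x]

    # Climb from v until we hit an ancestor of u (the common ancestor),
    # remembering the heaviest tree edge that is strictly heavier than w.
    heaviest = (w, None, None)  # (weight, lower endpoint, side it is on)
    x = v
    while x not in above_u:
        if weight[x] > heaviest[0]:
            heaviest = (weight[x], x, v)
        x = parent[x]
    common = x

    # Climb from u to the common ancestor as well.
    x = u
    while x != common:
        if weight[x] > heaviest[0]:
            heaviest = (weight[x], x, u)
        x = parent[x]

    _, cut, side = heaviest
    if cut is None:
        return False  # no tree edge on the path is heavier than the new edge

    # Drop the edge (cut, parent[cut]), re-hang the detached subtree from
    # the endpoint on that side, and attach it through the new edge.
    other = v if side == u else u
    reverse_chain(parent, weight, side, cut)
    parent[side], weight[side] = other, w
    return True
```

For example, with `parent = [None, 0, 1, 0]` and `weight = [None, 5, 3, 2]`, adding the edge (2, 3) with weight 1 removes the weight-5 edge (1, 0) and re-hangs vertices 1 and 2 below vertex 3.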
Well - your solution is correct.
But regarding implementation, I don't see why you are using G instead of T to find the path between u and v. Any search traversal in T for the path between u and v takes O(n). That is, you can take v as the root and perform a depth-first search [in this case, you treat all neighbors of v as children] and stop the DFS once you find u; the nodes on the stack then correspond to the path between u and v.
It is easy afterward to find the cost of each edge in the path (O(n)), and it is easy as well to delete/add edges. In total O(n).
Does that help somehow ?
Or maybe you are getting O(n^2), according to my understanding, because you access the children of a vertex v in T in O(n). Here, you have to represent your data structure as a mapped array so that the cost is reduced to O(1) [for instance, {a,b,c,u,w} (vertices) -> {0,1,2,3,4} (indices of vertices)].

minimum connected subgraph containing a given set of nodes

I have an unweighted, connected graph. I want to find a connected subgraph that definitely includes a certain set of nodes, and as few extras as possible. How could this be accomplished?
Just in case, I'll restate the question using more precise language. Let G(V,E) be an unweighted, undirected, connected graph. Let N be some subset of V. What's the best way to find the smallest connected subgraph G'(V',E') of G(V,E) such that N is a subset of V'?
Approximations are fine.
This is exactly the well-known NP-hard Steiner Tree problem. Without more details on what your instances look like, it's hard to give advice on an appropriate algorithm.
I can't think of an efficient algorithm to find the optimal solution, but assuming that your input graph is dense, the following might work well enough:
Convert your input graph G(V, E) to a weighted graph G'(N, D), where N is the subset of vertices you want to cover and D is distances (path lengths) between corresponding vertices in the original graph. This will "collapse" all vertices you don't need into edges.
Compute the minimum spanning tree for G'.
"Expand" the minimum spanning tree by the following procedure: for every edge d in the minimum spanning tree, take the corresponding path in graph G and add all vertices (including endpoints) on the path to the result set V' and all edges in the path to the result set E'.
This algorithm is easy to trip up to give suboptimal solutions. Example case: equilateral triangle where there are vertices at the corners, in midpoints of sides and in the middle of the triangle, and edges along the sides and from the corners to the middle of the triangle. To cover the corners it's enough to pick the single middle point of the triangle, but this algorithm might choose the sides. Nonetheless, if the graph is dense, it should work OK.
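The three steps of this heuristic can be sketched as follows for an unweighted graph. This is a rough illustration under stated assumptions, not a tuned implementation: `adj` is a dict `{node: [neighbours]}`, `terminals` is a list holding the set N, distances come from one BFS per terminal, and the spanning tree on the terminals is grown Prim-style. All function names are made up here. The expanded paths may overlap, so the returned edge set is connected but not guaranteed to be a tree.

```python
from collections import deque


def bfs_tree(adj, root):
    """Return (dist, prev) for a BFS from root in an unweighted graph."""
    dist, prev = {root: 0}, {root: None}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                prev[v] = u
                queue.append(v)
    return dist, prev


def steiner_heuristic(adj, terminals):
    """Connect all terminals via an MST of their pairwise distances.

    Returns (nodes, edges), with edges stored as frozenset({x, y}).
    """
    terminals = list(terminals)
    info = {t: bfs_tree(adj, t) for t in terminals}  # "collapse" step

    in_tree = {terminals[0]}
    nodes, edges = {terminals[0]}, set()
    while len(in_tree) < len(terminals):
        # Prim step: cheapest link from a connected terminal to a new one.
        _, a, b = min(
            ((info[a][0][b], a, b)
             for a in in_tree for b in terminals if b not in in_tree),
            key=lambda item: item[0],
        )
        in_tree.add(b)
        # "Expand" step: walk the BFS predecessors from b back to a.
        prev = info[a][1]
        x = b
        while prev[x] is not None:
            nodes.update((x, prev[x]))
            edges.add(frozenset((x, prev[x])))
            x = prev[x]
    return nodes, edges
```

On a path 0-1-2-3-4 with an extra branch 2-5, calling `steiner_heuristic(adj, [0, 4])` keeps only the path nodes and leaves out node 5.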
The easiest solutions will be the following:
a) based on mst:
- initially, all nodes of V are in V'
- build a minimum spanning tree of the graph G(V,E) - call it T.
- loop: for every leaf v in T that is not in N, delete v from V'.
- repeat loop until all leaves in T are in N.
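Option a) can be sketched like this. Since the graph in the question is unweighted, any spanning tree is a minimum spanning tree, so the sketch simply uses a BFS tree; that choice, the input format (`adj` as `{node: [neighbours]}`), and the function name are assumptions made for illustration.

```python
from collections import deque


def prune_spanning_tree(adj, terminals):
    """Build a spanning tree, then repeatedly delete leaves not in terminals.

    Returns the remaining node set V'. Assumes a connected graph and a
    non-empty terminal set.
    """
    terminals = set(terminals)
    root = next(iter(terminals))

    # BFS spanning tree, stored as an undirected adjacency map.
    tree = {root: set()}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in tree:
                tree[v] = {u}
                tree[u].add(v)
                queue.append(v)

    # Peel off non-terminal leaves until every leaf is a terminal.
    leaves = deque(v for v in tree if len(tree[v]) <= 1 and v not in terminals)
    while leaves:
        v = leaves.popleft()
        for u in tree.pop(v):
            tree[u].discard(v)
            if len(tree[u]) == 1 and u not in terminals:
                leaves.append(u)
    return set(tree)
```

The result depends heavily on which spanning tree is built, so this is only a quick heuristic with no quality guarantee.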
b) another solution is the following - based on shortest paths tree.
- pick any node in N, call it v, let v be a root of a tree T = {v}.
- remove v from N.
loop:
1) select the shortest path between any node in T and any node in N: the shortest path p: {v, ... , u} where v is in T and u is in N.
2) every node in p is added to V'.
3) every node in p and in N is deleted from N.
--- repeat loop until N is empty.
At the beginning of the algorithm: compute all shortest paths in G using any known efficient algorithm.
Personally, I used this algorithm in one of my papers, but it is more suitable for distributed environments.
Let N be the set of nodes that we need to interconnect. We want to build a minimum connected dominating set of the graph G, and we want to give priority for nodes in N.
We give each node u a unique identifier id(u). We let w(u) = 0 if u is in N, otherwise w(u) = 1.
We create pair (w(u), id(u)) for each node u.
Each node u builds a multipoint relay (MPR) set. That is, a set M(u) of 1-hop neighbors such that each 2-hop neighbor is a neighbor of at least one node in M(u) [the smaller M(u), the better the solution].
u is in V' if and only if:
u has the smallest pair (w(u), id(u)) among all its neighbors.
or u is selected in M(v), where v is the 1-hop neighbor of u with the smallest pair (w(v), id(v)).
-- the trick when you execute this algorithm in a centralized manner is to be efficient in computing 2-hop neighbors. The best I could do was to go from O(n^3) to O(n^2.37) by matrix multiplication.
-- I really wish to know what the approximation ratio of this last solution is.
I like this reference for heuristics of steiner tree:
The Steiner Tree Problem, by Frank K. Hwang, Dana S. Richards, and Pawel Winter.
You could try to do the following:
Create a minimal vertex-cover for the desired nodes N.
Collapse these, possibly unconnected, sub-graphs into "large" nodes. That is, for each sub-graph, remove it from the graph, and replace it with a new node. Call this set of nodes N'.
Do a minimal vertex-cover of the nodes in N'.
"Unpack" the nodes in N'.
Not sure whether or not it gives you an approximation within some specific bound or so. You could perhaps even trick the algorithm to make some really stupid decisions.
As already pointed out, this is the Steiner tree problem in graphs. However, an important detail is that all edges should have weight 1. Because |V'| = |E'| + 1 for any Steiner tree (V',E'), this achieves exactly what you want.
For solving it, I would suggest the following Steiner tree solver (to be transparent: I am one of the developers):
https://scipjack.zib.de/
For graphs with a few thousand edges, you will usually get an optimal solution in less than 0.1 seconds.

Algorithm to check if directed graph is strongly connected

I need to check if a directed graph is strongly connected, or, in other words, if all nodes can be reached by any other node (not necessarily through direct edge).
One way of doing this is to run a DFS or BFS from every node and check that all others are reachable.
Is there a better approach to do that?
Consider the following algorithm.
Start at a random vertex v of the graph G, and run a DFS(G, v).
If DFS(G, v) fails to reach every other vertex in the graph G, then there is some vertex u, such that there is no directed path from v to u, and thus G is not strongly connected.
If it does reach every vertex, then there is a directed path from v to every other vertex in the graph G.
Reverse the direction of all edges in the directed graph G.
Again run a DFS starting at v.
If the DFS fails to reach every vertex, then there is some vertex u, such that in the original graph there is no directed path from u to v.
On the other hand, if it does reach every vertex, then in the original graph there is a directed path from every vertex u to v.
Thus, if G "passes" both DFSs, it is strongly connected. Furthermore, since a DFS runs in O(n + m) time, this algorithm runs in O(2(n + m)) = O(n + m) time, since it requires 2 DFS traversals.
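The two-traversal check above can be sketched as follows. The representation is an assumption made for illustration: `adj` is a dict mapping every vertex to the list of vertices its outgoing edges point to, and the function names are invented here. An iterative DFS is used so that large graphs do not hit Python's recursion limit.

```python
def reaches_all(adj, start):
    """Iterative DFS: True if every vertex of adj is reachable from start."""
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen) == len(adj)


def is_strongly_connected(adj):
    """Two-pass test: forward reachability from v, then reachability in
    the graph with all edges reversed."""
    if not adj:
        return True
    # Build the reversed graph in O(n + m).
    rev = {u: [] for u in adj}
    for u, nbrs in adj.items():
        for v in nbrs:
            rev[v].append(u)
    start = next(iter(adj))
    return reaches_all(adj, start) and reaches_all(rev, start)
```

For instance, the directed cycle `{0: [1], 1: [2], 2: [0]}` passes both traversals, while the path `{0: [1], 1: [2], 2: []}` fails the second one.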
Tarjan's strongly connected components algorithm (or Gabow's variation) will of course suffice; if there's only one strongly connected component, then the graph is strongly connected.
Both are linear time.
As with a normal depth first search, you track the status of each node: new, seen but still open (it's in the call stack), and seen and finished. In addition, you store the depth when you first reached a node, and the lowest such depth that is reachable from the node (you know this after you finish a node). A node is the root of a strongly connected component if the lowest reachable depth is equal to its own depth. This works even if the depth by which you reach a node from the root isn't the minimum possible.
To check just for whether the whole graph is a single SCC, initiate the dfs from any single node, and when you've finished, if the lowest reachable depth is 0, and every node was visited, then the whole graph is strongly connected.
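A hedged sketch of this single-traversal check, assuming vertices are numbered 0..n-1 and `adj[v]` lists the targets of v's outgoing edges (the function name is made up for illustration). Because the search stops as soon as any non-root vertex turns out to be the root of its own component, no component is ever popped before that point, which is why the sketch needs no separate component stack: every visited vertex is still "open" in Tarjan's sense.

```python
def is_single_scc(adj):
    """True if the directed graph is strongly connected, using discovery
    indices and low-link values in one DFS from vertex 0."""
    n = len(adj)
    if n == 0:
        return True
    disc = [-1] * n          # discovery index, -1 means not visited
    low = [0] * n            # lowest discovery index reachable
    disc[0] = low[0] = 0
    counter = 1
    stack = [(0, iter(adj[0]))]
    while stack:
        v, neighbours = stack[-1]
        descended = False
        for w in neighbours:
            if disc[w] == -1:
                # Tree edge: visit w before continuing with v.
                disc[w] = low[w] = counter
                counter += 1
                stack.append((w, iter(adj[w])))
                descended = True
                break
            # Edge to an already visited vertex.
            low[v] = min(low[v], disc[w])
        if descended:
            continue
        stack.pop()          # v is finished
        if stack:
            if low[v] == disc[v]:
                return False  # v heads a component that excludes the root
            p = stack[-1][0]
            low[p] = min(low[p], low[v])
    return counter == n      # every vertex must have been visited
```

This visits each vertex and edge once, so it runs in O(n + m) like the two-pass approach.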
To check if every node has both paths to and from every other node in a given graph:
1. DFS/BFS from all nodes:
Tarjan's algorithm assigns every node a depth d[i]. Initially, the root has the smallest depth. We then do a post-order DFS update d[i] = min(d[j]) over every neighbor j of i. BFS also works with the reduction rule d[i] = min(d[j]) here.
function dfs(i)
    d[i] = i
    mark i as visited
    for each neighbor j of i:
        if j is not visited then dfs(j)
        d[i] = min(d[i], d[j])
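If there is a forward path from u to v, then d[u] <= d[v]. Within an SCC, d[v] <= d[u] <= d[v], so all nodes of an SCC have the same depth. To tell whether the whole graph is one SCC, we check whether all nodes have the same d[i]. Note that this simplified update reads d[j] while j may still be on the call stack and not yet final, so it can assign different values to nodes of the same SCC (for example, with edges 0->1, 1->2, 2->3, 3->1, 2->0, when the neighbors of 2 are explored in the order 3, 0); the full Tarjan algorithm avoids this with a stack and low-link values.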
2. Two DFS/BFS from the single node:
It is a simplified version of Kosaraju’s algorithm. Starting from the root, we check whether every node can be reached by DFS/BFS. Then we reverse the direction of every edge and check again whether every node can be reached from the same root.
You can calculate the All-Pairs Shortest Path and see if any is infinite.
Tarjan's Algorithm has been already mentioned. But I usually find Kosaraju's Algorithm easier to follow even though it needs two traversals of the graph. IIRC, it is also pretty well explained in CLRS.
test-connected(G)
{
    // Checks reachability from x only. For strong connectivity of a directed
    // graph, run it once on G and once on G with every edge reversed.
    choose a vertex x
    make a list L of vertices reachable from x,
    and another list K of vertices to be explored.
    initially, L = K = x.
    while K is nonempty
        find and remove some vertex y in K
        for each edge (y, z)
            if (z is not in L)
                add z to both L and K
    if L has fewer than n items
        return disconnected
    else return connected
}
You can use Kosaraju’s DFS based simple algorithm that does two DFS traversals of graph:
The idea is, if every node can be reached from a vertex v, and every node can reach v, then the graph is strongly connected.
In step 2 of the algorithm, we check if all vertices are reachable from v. In step 5, we check if all vertices can reach v (in the reversed graph, if all vertices are reachable from v, then all vertices can reach v in the original graph).
Algorithm :
1) Initialize all vertices as not visited.
2) Do a DFS traversal of graph starting from any arbitrary vertex v. If DFS traversal doesn’t visit all vertices, then return false.
3) Reverse all arcs (or find transpose or reverse of graph)
4) Mark all vertices as not-visited in reversed graph.
5) Do a DFS traversal of reversed graph starting from same vertex v (Same as step 2). If DFS traversal doesn’t visit all vertices, then return false. Otherwise return true.
Time Complexity: Time complexity of above implementation is same as Depth First Search which is O(V+E) if the graph is represented using adjacency list representation.
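One way of doing this would be to generate the Laplacian matrix of the graph, calculate its eigenvalues, and count the zeros. For an undirected graph, the number of zero eigenvalues equals the number of connected components, so the graph is connected if there is exactly one. For a directed graph, a single zero eigenvalue is not enough on its own to conclude strong connectivity: the one-edge graph u -> v also has exactly one.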
Note: Pay attention to the slightly different method for creating the Laplacian matrix for directed graphs.
The algorithm to check if a graph is strongly connected is quite straightforward. But why does the below algorithm work?
Algorithm: suppose there is a graph with vertices [A, B, C......Z]
1. Choose any random node, say J, and perform DFS from it. If all the nodes are reachable then continue to step 2.
2. Reverse the directions of the edges of the graph by doing transpose.
3. Again run DFS from node J and check if all the nodes are visited. If yes then the graph is strongly connected and return true.
Performing step 1 makes sense because we have to check if we can reach all the nodes from that node. After this, next logical step could be
i) Now do this for all other nodes
ii) or try to reach node J from every other node. Because once you reach node J, you are sure that you can reach every other node because of step 1.
This is what we are trying to do in steps 2 & 3. If in a transposed graph node J is able to reach all other nodes then this implies that in original graph all other nodes can reach J.

Resources