How to detect whether a directed graph is uniquely connected? - algorithm

A directed graph is said to be uniquely connected if there exists exactly one path between every pair of vertices. How can I tell whether a graph has this property? This needs to be done in O(n+m) time, where n is the number of vertices of the graph and m is the number of edges.
It is quite clear that there shouldn't be any cross-edges or forward-edges in the graph. But what about back-edges?

If there is exactly one directed path between every pair of nodes, then
every node must have at least one out-edge (else no paths from that node to other nodes)
no node can have more than one out-edge (if there is an edge from X to Y and an edge from X to Z, and there are paths from Y to T and from Z to T, then there are multiple paths from X to T)
But now, with every node having exactly one out-edge, and every node being reachable from every other node, the graph must be a single directed cycle.
That is trivial to check in O(n) time.
Edit: As Erik P notes in the comments, this argument only applies if the paths in question are simple paths. In the same spirit, a graph of size 3 may need special treatment, because the X-Y-Z-T reasoning above doesn't apply, which means a graph with nodes X,Y,Z and edges from X to Y and Z, and from Y and Z to X would be legal.
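The single-cycle test described above can be sketched as follows. This is a minimal sketch, not a full decision procedure: it only checks the "exactly one out-edge per node, one cycle through all nodes" condition, so it rejects the three-node X,Y,Z graph mentioned in the edit. The function name and the assumption that nodes are numbered 0..n-1 are illustrative and not part of the original answer.

```python
def is_single_directed_cycle(n, edges):
    """Return True if nodes 0..n-1 and the given edges form one directed cycle."""
    if n == 0:
        return False
    successor = {}
    for u, v in edges:
        if u in successor:          # a second out-edge from u
            return False
        successor[u] = v
    if len(successor) != n:         # some node has no out-edge
        return False
    # Follow the unique out-edges from node 0 until a node repeats.
    visited = set()
    node = 0
    while node not in visited:
        visited.add(node)
        node = successor[node]
    # A single cycle returns to the start after visiting every node.
    return node == 0 and len(visited) == n
```

Building the successor map takes O(m) time and the walk takes O(n), so the whole check stays within O(n+m).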

Related

Strongly Connected Components Question

If you don't know how the SCC algorithm works, read this article: https://www.hackerearth.com/practice/algorithms/graphs/strongly-connected-components/tutorial/ (this is the best article I could find).
After finding the finish time of each node, we reverse the original graph and run DFS starting from the node with the highest finish time. What if we instead start the DFS from the node with the smallest finish time in the original graph? Why doesn't that work?
That's because the first DFS's finish times give you the topological order (which tells you which node depends on which).
An SCC is a set of nodes in which every node is reachable from every other node in the component.
If you start from the node with the smallest finish time (so backward), the algorithm will give a false result, because somewhere in the transposed graph it won't find a path between two nodes that are actually connected, or it will find an incorrect path, because you 'walk through' a node before its 'parent'.
Simple example (-> means "depends on"). Starting from X, the topological order is X, Y, Z, W:
X -> Y -> Z
     ^   /
      \ v
       W
If you transpose the graph above and start from Z, it will look as if the whole graph is one SCC, but it is not. You must process the parent element before the child. So if you start from X, you cannot reach Z in the original graph before Y, and you cannot reach W before Y either. In the transposed graph there is a route between Z and Y, but you may only use it if the inverse route was there in the original graph, and the topological order (TO) describes whether it was or wasn't. If a node topologically precedes another node and there is a route between them in the transposed graph, then they are strongly connected.
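To make the ordering argument concrete, here is a minimal sketch of the two-pass (Kosaraju-style) procedure. It assumes the example graph has the edges X->Y, Y->Z, Z->W and W->Y, which is one reading of the drawing above; the function name is illustrative.

```python
from collections import defaultdict

def kosaraju_scc(nodes, edges):
    """Return the strongly connected components as a list of sets."""
    graph, transposed = defaultdict(list), defaultdict(list)
    for u, v in edges:
        graph[u].append(v)
        transposed[v].append(u)

    # Pass 1: DFS on the original graph, recording nodes as they finish.
    finished, seen = [], set()
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack = [(start, iter(graph[start]))]
        while stack:
            node, neighbours = stack[-1]
            for nxt in neighbours:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append((nxt, iter(graph[nxt])))
                    break
            else:
                finished.append(node)   # all neighbours explored
                stack.pop()

    # Pass 2: DFS on the transposed graph, latest finisher first.
    components, assigned = [], set()
    for start in reversed(finished):
        if start in assigned:
            continue
        component, stack = set(), [start]
        assigned.add(start)
        while stack:
            node = stack.pop()
            component.add(node)
            for nxt in transposed[node]:
                if nxt not in assigned:
                    assigned.add(nxt)
                    stack.append(nxt)
        components.append(component)
    return components

example = kosaraju_scc("XYZW", [("X", "Y"), ("Y", "Z"), ("Z", "W"), ("W", "Y")])
```

With these assumed edges the first pass finishes W, Z, Y, X in that order, so the second pass starts at X and isolates it before exploring Y, Z and W. If `finished` were processed without `reversed`, the second pass would start at W and reach all four nodes in the transposed graph, which is the false "one big SCC" result described above.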

Directed graph decomposition

I want to decompose a directed acyclic graph into the minimum number of components such that in each component the following property holds:
For every pair of vertices (u,v) in a component, there is a path from u to v or from v to u.
Is there any algorithm for this?
I know that when the "or" is replaced by "and" in the condition, it is the same as finding the strongly connected components (which is possible using DFS).
*EDIT: * What happens if the Directed graph contains cycles (i.e. it is not acyclic)?
My idea is to order the graph topologically (O(n+m) using DFS), and then think about the vertices for which this property can be false. It can be false for vertices where 2 different branches join, or where a path splits into 2 different branches.
I would start from any starting vertex (lowest in the topological ordering) and follow its path, choosing branches at random, until you cannot go further, then delete this path from the graph (first component). This would be repeated until the graph is empty and you have all such components.
It seems like a greedy algorithm, but consider that you might find a very short path in the first run (bad luck) or the longest path (good luck). Either way, you would still have to find the small branch component in another step of the algorithm.
Complexity would be O(n*number of components).
With the "and" condition, you would have to consider general directed graphs, since a DAG cannot have a strongly connected component with more than one vertex.
The two existing answers both have problems that I've outlined in comments. But there's a more fundamental reason why no decomposition into components can work in general. First, let's concisely express the relation "u and v belong in the same component of the decomposition" as u # v.
It's not transitive
In order to represent a relation # as vertices in a component, that relation must be an equivalence relation, which means among other things that it must be transitive: That is, if x # y and y # z, it must necessarily be true that x # z. Is our relation # transitive? Unfortunately the answer is "No", since it may be that there is a path from x to y (so that x # y), and a path from z to y (so that y # z), but no path from x to z or from z to x (so that x # z does not hold), as the following graph shows:
       z
       |
       |
       v
x----->y
The problem is that according to the above graph, x and y belong in the same component, and y and z belong in the same component, but x and z belong in different components, which is a contradiction. This means that, in general, it's impossible to represent the relationship # as a decomposition into components.
If an instance happens to be transitive
So there is no solution in general -- but there can still be input graphs for which the relation # happens to be transitive, and for which we can therefore compute a solution. Here is one way to do that (though probably not the most efficient way).
Compute shortest paths between all pairs of vertices (using e.g. the Floyd-Warshall algorithm, in O(n^3) time for n vertices). Now, for every vertex pair (u, v), either d(u, v) = inf, indicating that there is no way to reach v from u at all, or not, indicating that there is some path from u to v. To answer the question "Does u # v hold?" (i.e., "Do u and v belong in the same component of the decomposition?"), we can simply calculate d(u, v) != inf || d(v, u) != inf.
This gives us a relation that we can use to build an undirected graph G' in which there is a vertex u' for each original vertex u, and an edge between two vertices u' and v' if and only if d(u, v) != inf || d(v, u) != inf. Intuitively, every connected component in this new graph must be a clique. This property can be checked in O(n^2) time by first performing a series of DFS traversals from each vertex to assign a component label to each vertex, and then checking that each pair of vertices belongs to the same component if and only if they are connected by an edge. If the property holds then the resulting cliques correspond to the desired decomposition; otherwise, there is no valid decomposition.
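The procedure above can be sketched as follows. As a simplification that the answer does not spell out, the sketch computes a boolean reachability matrix with a Floyd-Warshall-style triple loop instead of actual distances, since only "d(u, v) != inf" is ever used. Vertices are assumed to be numbered 0..n-1 and the function name is illustrative.

```python
def decompose_if_transitive(n, edges):
    """Return the components if the relation # is transitive, otherwise None."""
    # reach[u][v] is True when there is a path from u to v.
    reach = [[u == v for v in range(n)] for u in range(n)]
    for u, v in edges:
        reach[u][v] = True
    for k in range(n):                      # Floyd-Warshall style closure
        for i in range(n):
            if reach[i][k]:
                for j in range(n):
                    if reach[k][j]:
                        reach[i][j] = True

    def related(u, v):                      # "u # v" from the text
        return reach[u][v] or reach[v][u]

    # Label the connected components of the undirected graph G'.
    label, components = [-1] * n, []
    for start in range(n):
        if label[start] != -1:
            continue
        label[start] = len(components)
        members, stack = [start], [start]
        while stack:
            u = stack.pop()
            for v in range(n):
                if label[v] == -1 and related(u, v):
                    label[v] = label[start]
                    members.append(v)
                    stack.append(v)
        components.append(sorted(members))

    # Every component must be a clique: same label <=> related.
    for u in range(n):
        for v in range(u + 1, n):
            if (label[u] == label[v]) != related(u, v):
                return None
    return components
```

For the three-vertex counterexample (x = 0, y = 1, z = 2, with edges x->y and z->y) the function returns None, because x and z end up in the same connected component of G' without being related.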
Interestingly, there are graphs that are not chains of strongly connected components (as claimed by Zotta), but which nonetheless do have transitive # relations. For example, a tournament is a digraph in which there is an edge, in some direction, between every pair of vertices -- so clearly # holds for every pair of vertices in such a graph. But if we number the vertices 1 to n and include only edges from lower-numbered to higher-numbered vertices, there will be no cycles, and thus the graph is not strongly connected (and if n > 2, then clearly it's not a path).

Turning Recursive algorithm into breadth-first queue

So I am working with a "course". The course is full of coordinates. Each coordinate has attributes that allow movement(#left #right #up #down). The course is built upon a coordinate system so left would be x-1, right would be x+1, up would be y-1, and down would be y+1.
My goal is to get the shortest distance of each reachable coordinate.
Distance is defined by the number of moves from the start point (the start coordinate of the course that is provided in the parameters). So the distance from (0,0) to (1,2) would be 3. 1 right and 2 down
I've originally solved this problem using recursion:
Answer: Rather than following each path as far as possible in depth, use an array as a queue and process all the coordinates at one distance before moving on to the next distance.
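A breadth-first version of that idea might look like the sketch below. The original recursive code is not shown, so the data layout here is an assumption: `course` is a hypothetical dict mapping each (x, y) coordinate to the set of moves allowed from it.

```python
from collections import deque

# Offsets follow the question: up is y-1, down is y+1.
MOVES = {"left": (-1, 0), "right": (1, 0), "up": (0, -1), "down": (0, 1)}

def shortest_distances(course, start):
    """Return {coordinate: number of moves from start} for reachable coordinates."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        x, y = queue.popleft()
        for move in course[(x, y)]:
            dx, dy = MOVES[move]
            nxt = (x + dx, y + dy)
            # The first visit is the shortest, because BFS expands
            # coordinates in order of increasing distance.
            if nxt in course and nxt not in distances:
                distances[nxt] = distances[(x, y)] + 1
                queue.append(nxt)
    return distances

# Hypothetical course in which (1, 2) is three moves from (0, 0).
course = {
    (0, 0): {"right", "down"},
    (1, 0): {"down"},
    (0, 1): {"right"},
    (1, 1): {"down"},
    (1, 2): set(),
}
result = shortest_distances(course, (0, 0))
```

Each coordinate enters the queue at most once, so the running time is proportional to the number of coordinates plus the number of allowed moves.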
Think of your problem as an undirected graph with the nodes being the coordinates, where each pair of (distinct) nodes [x0,y0] and [x1,y1] is "adjacent" if:
(x0-x1).abs + (y0-y1).abs == 1
Two nodes are connected by an undirected link if they are adjacent, in which case the length of that link is 1. If two nodes are connected by a path of nodes and links, the distance between them equals the sum of the lengths of the links on the path (i.e., the number of links in the path).
You can find the shortest distance between all pairs of coordinates by employing an algorithm that computes shortest paths between all pairs of nodes in an undirected graph, such as the Floyd-Warshall algorithm (which also works for directed graphs).
Floyd-Warshall treats non-adjacent pairs of nodes as being connected by a link of infinite length (which may be implemented as a suitably large number). If the length of the shortest path between a given pair of nodes is found to be "infinite", you know the nodes are not connected (i.e., there is no path between the coordinates).

Single edge addition to minimize number of bridges in a graph

You are given a graph G with N vertices and M edges with N<=10^4 and M<=10^5. Now, you have to add exactly one edge (u,v) to the graph so that the total number of bridges is minimized. G may have multiple edges, but no self loops. On the other hand, the newly generated graph, after adding the edge, G', may have both self loops and multiple edges. If many such (u,v) with u<=v is possible then output the lexicographically smallest one (the vertices are numbered from 1..n).
A trivial idea would be to try all edges in order and then use the bridge-finding algorithm to count the bridges. This takes time O(V^2 * E), so it is clearly useless. How can this be done faster?
EDIT: Following advice by j_random_hacker, I add the following details about the source of the above problem. This is a problem named Computer Network (specifically problem 3) from India's IOI Training Camp '14 Practice Test (Test 3). It was an onsite offline test, so I cannot prove that it is not from a present contest, by giving a link. But I have a PDF of the problem statement.
This is not a complete answer but some ideas to steer you towards it:
To avoid having to run the bridge-finding algorithm after trying each possible edge, it pays to ask: By how much can adding a single edge (u, v) change the number of bridges in a graph G?
1. If u and v are not already connected by any path in G, then certainly (u, v) will itself become a bridge. What about the "bridgeness" (bridgity? bridgulence?) of all other pairs of vertices? Does it change? (Most importantly: Can any edge go from being a bridge to being a non-bridge? If you can prove that this can never happen, then you can immediately discard all such vertex pairs (u, v) from consideration as they can only ever make the situation worse.)
2. If u and v are already connected in G, there are 2 possibilities:
   2.1. Every path P that connects them shares some edge (x, y) (note that x and y are not necessarily distinct from u and v). Then (x, y) is a bridge in G, and adding (u, v) will cause (x, y) to stop being a bridge, because it will then become possible to get from x to y "the long way", by going from x back to u, via the new edge (u, v) to v, and then back up to y. (This assumes that x is closer to u on P than y is, but clearly the argument still works if y is closer: just swap u and v.) There could be multiple such bridges (x, y): in that case, all of them will become non-bridges after (u, v) is added.
   2.2. There are at least 2 edge-disjoint paths P and Q already connecting u and v. Obviously no edge (x, y) on P or Q can be a bridge, since if (x, y) on P were deleted, it's still possible to get from x to y "the long way" via Q. The question is, again: What about the bridgeness of all other vertex pairs? You should be able to prove that this property doesn't change, meaning that adding the edge (u, v) leaves the total number of bridges unchanged, and can therefore be disregarded as a useless move (unless there are no bridges at all to start with).
We see that 2.1 above is the only case in which adding an edge (u, v) can be useful. Furthermore, it seems that the more bridges we can find in a single path in G, the more of them we can neutralise by choosing to connect the endpoints of that path.
So it seems like "Find the path in G that contains the most bridges" might be the right criterion. But first we need to ask ourselves: Does the number of bridges in a path P accurately count the number of bridges eliminated by adding an edge from the start of P to the end? (We know that adding such an edge must eliminate at least those bridges, but perhaps some others are also eliminated as a "side effect" -- and if so, then we need to count them somehow to make sure that we add the edge that eliminates the most bridges overall.)
Happily the answer is that no other bridges are eliminated. This time I'll do the proof myself.
Suppose that there is a path P from u to v, and suppose to the contrary that adding the edge (u, v) would eliminate a bridge (x, y) that is not on P. Then it must be that the single edge (x, y) is the only path from x to y, and that adding (u, v) would create a second path Q from x, via the edge (u, v) in either direction, to y that avoids the edge (x, y). But for any such Q, we could replace the edge (u, v) in Q with the path P, which from our initial assumption avoids (x, y), and still get a path Q' from x to y that avoids the edge (x, y) -- this means that (x, y) must have already been connected by two edge-disjoint paths (namely the single edge (x, y) and Q'), so it could not have been a bridge in the first place. Since this is a contradiction, it follows that no such "removed as a side effect" bridge (x, y) can exist.
So "Find the path in G that contains the most bridges, and add an edge between its endpoints" definitely gives the right answer -- but there is still a problem: this sounds a lot like the Longest Path problem, which is NP-hard for general graphs, and therefore slow to solve.
However, there is a way out. (There must be: you already have an O(V^2*E) algorithm, so it can't be that your problem is NP-hard :-) ) Think of the biconnected components in your input graph G as being vertices in another graph G'. What do the edges between these vertices (in G') correspond to in G? Do they have any particular structure? Final (big) hint: What is a critical path?
This answer is a spoiler. You should probably think along with j_random_hacker's answer instead.
If I understand your problem correctly:
Think of the graph as a tree of biconnected components. Find the longest path in this tree and link up its ends with the new edge.
There is a linear-time algorithm for finding biconnected components using depth first search. Finding the longest path in a tree takes linear time and can be done using depth-first search---make it do "find the farthest vertex and return both it and its distance" and use that. So this takes linear time overall.
(You can roll it all into a single depth-first search that returns the number of bridge edges in the bridgiest path and the farthest vertex in said bridgiest path.)
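The approach can be sketched in three separate steps rather than the single combined search: find the bridges, contract everything joined without crossing a bridge, and take the longest path in the resulting forest. This is a sketch under assumptions: vertices are numbered 0..n-1 (the problem uses 1..n), the function name is illustrative, and it returns some best pair rather than the lexicographically smallest one the problem asks for.

```python
from collections import deque

def best_edge_to_add(n, edges):
    """Return (bridges_removed, u, v); parallel edges are allowed."""
    adj = [[] for _ in range(n)]
    for idx, (u, v) in enumerate(edges):
        adj[u].append((v, idx))
        adj[v].append((u, idx))

    # Step 1: mark bridges with an iterative DFS and low-link values.
    disc, low = [-1] * n, [0] * n
    is_bridge = [False] * len(edges)
    timer = 0
    for root in range(n):
        if disc[root] != -1:
            continue
        disc[root] = low[root] = timer
        timer += 1
        stack = [(root, -1, iter(adj[root]))]
        while stack:
            u, parent_edge, neighbours = stack[-1]
            descended = False
            for v, idx in neighbours:
                if idx == parent_edge:      # skip only the edge we arrived by,
                    continue                # so a parallel edge still counts
                if disc[v] == -1:
                    disc[v] = low[v] = timer
                    timer += 1
                    stack.append((v, idx, iter(adj[v])))
                    descended = True
                    break
                low[u] = min(low[u], disc[v])
            if not descended:
                stack.pop()
                if stack:
                    parent = stack[-1][0]
                    low[parent] = min(low[parent], low[u])
                    if low[u] > disc[parent]:
                        is_bridge[parent_edge] = True

    # Step 2: contract everything that is joined without crossing a bridge.
    comp, rep = [-1] * n, []
    for s in range(n):
        if comp[s] != -1:
            continue
        comp[s] = len(rep)
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v, idx in adj[u]:
                if not is_bridge[idx] and comp[v] == -1:
                    comp[v] = comp[s]
                    queue.append(v)
        rep.append(s)                       # one representative vertex per component
    tree = [[] for _ in rep]
    for idx, (u, v) in enumerate(edges):
        if is_bridge[idx]:
            tree[comp[u]].append(comp[v])
            tree[comp[v]].append(comp[u])

    # Step 3: longest path in the bridge forest via two BFS passes per tree.
    def farthest(src):
        dist, far, queue = {src: 0}, src, deque([src])
        while queue:
            c = queue.popleft()
            if dist[c] > dist[far]:
                far = c
            for d in tree[c]:
                if d not in dist:
                    dist[d] = dist[c] + 1
                    queue.append(d)
        return far, dist

    best, seen = (0, None, None), set()
    for c in range(len(rep)):
        if c in seen:
            continue
        a, reached = farthest(c)
        seen.update(reached)
        b, dist = farthest(a)
        if dist[b] > best[0]:
            best = (dist[b], rep[a], rep[b])
    return best
```

Note that the contracted pieces here are 2-edge-connected components (maximal sets joined without crossing a bridge), which is the notion of component that matters for bridges. Each step is linear in the number of vertices and edges.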

Using a minimum spanning tree algorithm

Suppose I have a weighted non-directed graph G = (V,E). Each vertex has a list of elements.
We start at a vertex root and look for all occurrences of elements with a value x. We wish to travel the least distance (in terms of edge weight) to uncover all occurrences of elements with value x.
The way I think of it, an MST will contain all vertices (and hence all vertices that satisfy our condition). Therefore the algorithm to uncover all occurrences can just be done by finding the shortest path from root to all other vertices (this will be done on the MST of course).
Edit :
As Louis pointed out, the MST will not work in all cases if the root is chosen arbitrarily. However, to make things clear, the root is part of the input and therefore there will be one and only one MST possible (given that the edges have distinct weights). This spanning tree will indeed have all minimum-cost paths to all other vertices in the graph starting from the root.
I don't think this will work. Consider the following example:
x
|
3
|
y--3--root
|     /
5    3
|   /
|  /
x
The minimum spanning tree contains all three edges with weight 3, but this is clearly not the optimum solution.
If I understand the problem correctly, you want to find the minimum-weight tree in the graph which includes all vertices labeled x. (That is, the correct answer would have total weight 8, and would be the two edges drawn vertically in this drawing.) But this does not include your arbitrarily selected root at all.
I am pretty confident that the following variation on Prim's algorithm would work. Not sure if it's optimal, though.
Let's say the label we are looking for is called L.
1. Use an all-pairs shortest path algorithm to compute d(v, w) for all v, w.
2. Pick some node labeled L; call this the root. (We can be sure that this will be in the result tree, since we are including all nodes labeled L.)
3. Initialize a priority queue with the root initialized to 0. (The priority queue will consist of vertices labeled L, and their minimum distance from any node in the tree, including vertices not labeled L.)
4. While the priority queue is nonempty, do the following:
   4.1. Pick out the top vertex in the queue; call it v, and its distance from the tree d.
   4.2. For each vertex w on the path from v to the tree, v inclusive, find the nearest L-labeled node x to w, and add x to the priority queue, or update its priority. Add w to the tree.
The answer is no, if I'm understanding correctly. Finding the minimum spanning tree will contain all vertices V, but you only want to find the vertices with value x. Thus, your MST result may have unneeded vertices adding extra path length and therefore be sub-optimal.
An example has been given where the MST M1 from Root differs from an MST M2 containing all x nodes but not containing Root.
Here's an example where Root is in both MSTs: Let graph G contain nodes R,S,T,U,V (R=Root), and a clockwise path R-S-T-U-V-R, with edge weights 1,1,3,2,2 going clockwise, and x at R, S, T, U. The first MST, M1, will have subtrees S-T and V-U below R, with cost 6 = 2+4, and cost-3 edge T-U not included in M1. But M2 has subtree S-T-U (only) below R, at cost 5.
Negative. If the idea is to find for every node that contains 'x' a separate path from root to it, and minimize the total cost of the paths, then you can just use simple shortest-path calculation separately for every node starting from the root, and put the paths together.
Some of those shortest paths will not be in the minimum spanning tree, so if this is your goal, the MST solution does not work. MST optimizes the cost of the tree, not the sum of costs of paths from root to the nodes.
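A small sketch of that difference, using Dijkstra's algorithm for the per-node shortest paths. The triangle graph below is a hypothetical example that does not come from the thread, and its MST (the two weight-2 edges) is written out by hand rather than computed.

```python
import heapq

def shortest_path_costs(adjacency, root):
    """Dijkstra: return {node: cost of the cheapest path from root}."""
    costs = {root: 0}
    heap = [(0, root)]
    while heap:
        cost, node = heapq.heappop(heap)
        if cost > costs.get(node, float("inf")):
            continue                        # stale heap entry
        for neighbour, weight in adjacency.get(node, []):
            new_cost = cost + weight
            if new_cost < costs.get(neighbour, float("inf")):
                costs[neighbour] = new_cost
                heapq.heappush(heap, (new_cost, neighbour))
    return costs

def undirected(edges):
    """Build an adjacency dict from (u, v, weight) triples."""
    adjacency = {}
    for u, v, w in edges:
        adjacency.setdefault(u, []).append((v, w))
        adjacency.setdefault(v, []).append((u, w))
    return adjacency

# Hypothetical triangle: the MST keeps the two weight-2 edges (total 4)
# and drops the weight-3 edge between root and b.
full_graph = undirected([("root", "a", 2), ("a", "b", 2), ("root", "b", 3)])
mst_only = undirected([("root", "a", 2), ("a", "b", 2)])

in_graph = shortest_path_costs(full_graph, "root")["b"]
in_mst = shortest_path_costs(mst_only, "root")["b"]
```

Here the cheapest route from root to b costs 3 in the full graph but 4 when restricted to the MST, even though the MST has the smallest total edge weight.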
If your idea is to find one path that starts from root and traverses all nodes that contain 'x', then this is a variant of the traveling salesman problem, which is NP-hard, i.e. very hard.

Resources