If you don't know how the SCC algorithm works, read this article: https://www.hackerearth.com/practice/algorithms/graphs/strongly-connected-components/tutorial/ (this is the best article I could find).
After finding the finish time of each node, we reverse the original graph and run DFS starting from the node with the highest finish time. What if we start the DFS from the node with the smallest finish time in the original graph? Why doesn't that work?
That's because the finish times from the first DFS give you the topological order (which means that one node depends on another).
SCC means that every node in the component is reachable from every other node in the component.
If you start from the smallest node (that is, go backward), the algorithm gives a wrong result: in the transposed graph it either fails to find a path between two nodes that really are connected, or it finds an incorrect path because you 'walk through' a node before its 'parent'.
Simple example (-> means depends on). Starting from X, the topological order is X, Y, Z, W:
X -> Y -> Z
^   /
 \ v
  W
If you transpose the graph above and start from Z, the whole graph looks like one SCC, but it is not. You must process a parent before its children. If you start from X in the original graph, you cannot reach Z before Y, nor W before Y. In the transposed graph there is a route from Z to Y, but you may only use it if the inverse route existed in the original graph, and the topological order tells you whether it did. If a node topologically precedes another node and there is a route between them in the transposed graph, then they are strongly connected.
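To make the ordering argument concrete, here is a minimal Python sketch of the two-pass (Kosaraju-style) procedure. The function name `kosaraju_scc` and the `highest_first` flag are hypothetical, and the edge list is my reading of the diagram above (X -> Y, Y -> Z, Y -> W, W -> X), so treat it as an assumption. It uses recursion, which is fine only for small graphs.

```python
from collections import defaultdict

def kosaraju_scc(nodes, edges, highest_first=True):
    """Return the components found by two DFS passes (list of sets)."""
    graph = defaultdict(list)
    transposed = defaultdict(list)
    for u, v in edges:
        graph[u].append(v)
        transposed[v].append(u)

    # Pass 1: record nodes in order of increasing DFS finish time.
    visited = set()
    finish_order = []

    def dfs_finish(u):
        visited.add(u)
        for v in graph[u]:
            if v not in visited:
                dfs_finish(v)
        finish_order.append(u)

    for u in nodes:
        if u not in visited:
            dfs_finish(u)

    # Pass 2: DFS on the transposed graph.
    # highest_first=False reproduces the wrong order discussed above.
    order = list(reversed(finish_order)) if highest_first else finish_order
    assigned = set()
    components = []

    def dfs_collect(u, component):
        assigned.add(u)
        component.add(u)
        for v in transposed[u]:
            if v not in assigned:
                dfs_collect(v, component)

    for u in order:
        if u not in assigned:
            component = set()
            dfs_collect(u, component)
            components.append(component)
    return components

# Edges assumed from the diagram: X -> Y, Y -> Z, Y -> W, W -> X.
example_edges = [("X", "Y"), ("Y", "Z"), ("Y", "W"), ("W", "X")]
example_nodes = ["X", "Y", "Z", "W"]
correct = kosaraju_scc(example_nodes, example_edges)
wrong = kosaraju_scc(example_nodes, example_edges, highest_first=False)
```

With these edges, `correct` contains the two components {X, Y, W} and {Z}, while `wrong` starts from Z and lumps all four nodes into a single component, which is the failure described above.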
What is the minimum number of new edges that need to be built to make all the nodes reachable from the root?
Given: a directed graph and the index of the root.
I tried finding all the components (as if the graph were undirected), then counting the nodes with no parents; I call that number sol. I then go through every component and count its orphan nodes. If a component has none and does not contain the root, I add 1 to sol, because that component must be connected to the root. This can happen when the component is a cycle. The component that contains the root is a special case: if it has no node with 0 parents, sol stays the same; if it has some, sol becomes sol + number_of_those_nodes (if the root has a parent) or sol + number_of_those_nodes - 1 (otherwise).
Can you help me solve this and write valid pseudocode?
Idea 1: Find the condensation of the graph (the graph of strongly connected components). The condensation is acyclic, but the answer is the same: it makes no sense to add an edge inside a strongly connected component, and it doesn't matter how exactly distinct strongly connected components are connected. The problem is now the same, but on an acyclic graph (the root becomes its strongly connected component).
Idea 2: Note that every source vertex (a vertex with in-degree zero) except the root needs at least one new edge (an inbound one). That gives us a lower bound on the answer.
Idea 3: If that lower bound is zero, the bound can be achieved, as the graph is already good.
Proof by contradiction: take an arbitrary vertex X that is not reached from the root. It has at least one inbound edge, otherwise our lower bound would not be zero. Say it has an inbound edge from some vertex Y. If Y were reached, then X would also be reached, so Y is unreached. Repeat the same argument for Y, and we get an infinite path of vertices. It cannot contain the same vertex twice, otherwise there would be a cycle in our graph, but the graph is already condensed. On the other hand, there are only finitely many vertices, a contradiction. Q.E.D.
Idea 4: If that lower bound is greater than zero, we can draw an edge from the root to each source vertex, which lowers the bound to zero, and then apply Idea 3, thus achieving exactly the lower bound.
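The four ideas can be sketched in Python roughly as follows. This is only an illustration under assumptions: nodes are numbered 0..n-1, the function name `min_new_edges` is hypothetical, and the components are found with an iterative two-pass DFS (any SCC algorithm would do).

```python
def min_new_edges(n, edges, root):
    """Count source components of the condensation, excluding the root's."""
    graph = [[] for _ in range(n)]
    rev = [[] for _ in range(n)]
    for u, v in edges:
        graph[u].append(v)
        rev[v].append(u)

    # Pass 1 (iterative DFS): nodes in order of increasing finish time.
    seen = [False] * n
    order = []
    for s in range(n):
        if seen[s]:
            continue
        seen[s] = True
        stack = [(s, iter(graph[s]))]
        while stack:
            u, it = stack[-1]
            for v in it:
                if not seen[v]:
                    seen[v] = True
                    stack.append((v, iter(graph[v])))
                    break
            else:
                # All neighbours of u are done, so u finishes now.
                order.append(u)
                stack.pop()

    # Pass 2: label components on the reversed graph, highest finish first.
    comp = [-1] * n
    count = 0
    for s in reversed(order):
        if comp[s] != -1:
            continue
        comp[s] = count
        stack = [s]
        while stack:
            u = stack.pop()
            for v in rev[u]:
                if comp[v] == -1:
                    comp[v] = count
                    stack.append(v)
        count += 1

    # A component is a source if no edge enters it from another component.
    has_inbound = [False] * count
    for u, v in edges:
        if comp[u] != comp[v]:
            has_inbound[comp[v]] = True
    return sum(1 for c in range(count)
               if not has_inbound[c] and c != comp[root])
```

For example, with edges 0 -> 1, 2 -> 3, 3 -> 2, 4 -> 1 and root 0, the source components other than the root's are {2, 3} and {4}, so two new edges are needed.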
Given a directed graph and one of its vertices x, I need to find the strongly connected component of the graph that includes x.
The algorithm should be linear and use only BFS.
I don't know how this got to the top of my page after all this time but it's missing an answer, so...
To find the strongly connected component that contains a given node x, you can do a BFS from x to find all the nodes reachable from x. Then reverse all the edges and do a BFS from x to find all the nodes reachable through reverse edges. Those are all the nodes in the original graph from which x can be reached.
The intersection of those sets of nodes is the SCC that contains x.
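A minimal Python sketch of this forward/backward BFS idea, assuming the graph is given as a list of (u, v) edge pairs; the helper names `reachable` and `scc_containing` are hypothetical:

```python
from collections import deque

def reachable(adj, start):
    """Plain BFS: return the set of nodes reachable from start."""
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return seen

def scc_containing(edges, x):
    """Intersect what x can reach with what can reach x."""
    forward, backward = {}, {}
    for u, v in edges:
        forward.setdefault(u, []).append(v)
        backward.setdefault(v, []).append(u)  # reversed edge
    return reachable(forward, x) & reachable(backward, x)
```

Both searches are O(n + m), so the whole procedure stays linear.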
A directed graph is said to be uniquely connected if there exists exactly one path between every pair of vertices. How can we identify whether a graph has this property? This needs to be done in O(n+m), where n is the number of vertices of the graph and m is the number of edges.
It is quite clear that there shouldn't be any cross-edges or forward-edges in the graph. But what about back-edges?
If there is exactly one directed path between every pair of nodes, then
every node must have at least one out-edge (else no paths from that node to other nodes)
no node can have more than one out-edge (if there is an edge from X to Y and an edge from X to Z, and there are paths from Y to T and from Z to T, then there are multiple paths from X to T)
But now, with every node having exactly one out-edge, and every node being reachable from every other node, the graph must be a single directed cycle.
That is trivial to check in O(n) time.
Edit: As Erik P notes in the comments, this argument only applies if the paths in question are simple paths. In the same spirit, a graph of size 3 may need special treatment, because the X-Y-Z-T reasoning above doesn't apply, which means a graph with nodes X,Y,Z and edges from X to Y and Z, and from Y and Z to X would be legal.
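The single-directed-cycle check that the main argument arrives at could look like the Python sketch below. It covers only that check, not the exceptions raised in the Edit; nodes are assumed to be numbered 0..n-1 and the function name is hypothetical.

```python
def is_single_directed_cycle(n, edges):
    """True if the graph is exactly one directed cycle through all n nodes."""
    if n == 0 or len(edges) != n:
        return False
    successor = [None] * n
    for u, v in edges:
        if successor[u] is not None:
            return False  # a second out-edge from u
        successor[u] = v
    # n edges and no node with two out-edges: every node has exactly one.
    seen = set()
    node = 0
    while node not in seen:
        seen.add(node)
        node = successor[node]
    # The walk must close back at node 0 after visiting every node.
    return node == 0 and len(seen) == n
```

The check reads each edge once and then follows at most n successor links.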
How can I get a rooted tree from a set of nodes and edges?
(I'm working with a connectivity matrix; each edge has a weight graph[i][j], and there are no negative edges.) Later I need to run DFS and find LCAs in that tree, so it would be good for this to be efficient.
I suppose that your matrix represents the child relationship (i.e. M[i][j] tells that j is the child of i), for a directed graph G(V,E).
You have 2 different strategies:
use a bit vector: go through each cell of your matrix and mark the child index in the vector if the cell's weight is not null; the root is the vertex not set in the vector,
look for the columns (or rows, if your matrix is column first) whose cells are all null (no ancestors).
The second solution is better for dense matrices. Its worst case is when the root is the last entry (O(V²)). You can stop at the first hit, or run until the end to get all the roots if your graph has several.
The first one is better suited for sparse matrices, since you have to go through all the stored cells. Its running time is in O(E) when only the non-empty cells are stored (scanning a full matrix is O(V²)). You also get all the roots with this algorithm.
If you are certain that your graph has only one root, you can use the walk-up-the-edges technique, as described in other answers.
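Both strategies can be sketched in Python as follows. The assumptions here are mine: the matrix is a square list of lists, `matrix[i][j] != 0` means an edge from i to its child j, and 0 stands for "no edge"; the function names are hypothetical.

```python
def find_roots(matrix):
    """Strategy 1: mark every node that appears as a child."""
    n = len(matrix)
    has_parent = [False] * n
    for i in range(n):
        for j in range(n):
            if matrix[i][j] != 0:
                has_parent[j] = True
    # Roots are the nodes that were never marked.
    return [v for v in range(n) if not has_parent[v]]

def first_root_by_column(matrix):
    """Strategy 2: return the first node whose column is all zero."""
    n = len(matrix)
    for j in range(n):
        if all(matrix[i][j] == 0 for i in range(n)):
            return j
    return None
```

On a full matrix both functions inspect up to V² cells; the second one can stop early at the first empty column.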
Here is a computationally MUCH SLOWER version that is also much easier to code. For small graphs, it is just fine.
Find the node with in-degree zero!
You have to compute all node degrees, O(n), but depending on the setting, this is often much easier to code and thus less prone to error.
Pick one node in the tree and walk up, that is, against the orientation of the edges. When you find a node without an ancestor you have the root.
If you need to do something like this often, just remember the parent node for each node.
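A short Python sketch of the walk-up idea, assuming the input really is a tree (a cycle would make the loop run forever) and that each node has at most one parent. The names `build_parent_map` and `find_root_by_walking_up` are hypothetical.

```python
def build_parent_map(edges):
    """Remember the parent of each node; edges are (parent, child) pairs."""
    return {child: parent for parent, child in edges}

def find_root_by_walking_up(parent, start):
    """Follow parent links from start until a node has no parent."""
    node = start
    while parent.get(node) is not None:
        node = parent[node]
    return node
```

Once the parent map exists, each lookup costs at most the depth of the starting node.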
A DFS on any graph gives you a tree (assuming the graph is connected, of course).
You can iterate it, starting from each node as a possible root; you will eventually get a spanning tree this way, if there is one. The complexity will be O(V^2+VE).
EDIT: It works because, for any finite graph, if there is a route from node a to node b, there will be a path from a to b in the tree that a DFS from a creates. So, assuming a spanning tree is possible, there is a root r from which you can reach each v in V. On the iteration where r is chosen as the root, there is a path from r to each v in V, so there will be a path from r to it in the spanning tree.
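As a rough Python sketch of this try-every-root idea (nodes assumed to be numbered 0..n-1, function name hypothetical; the traversal is stack-based, so the visiting order may differ from a recursive DFS, but it still yields a spanning tree when one exists):

```python
def find_spanning_root(n, edges):
    """Return (root, tree_parent) for the first root reaching all nodes, else None."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
    for r in range(n):
        tree_parent = {r: None}  # also serves as the visited set
        stack = [r]
        while stack:
            u = stack.pop()
            for v in adj[u]:
                if v not in tree_parent:
                    tree_parent[v] = u
                    stack.append(v)
        if len(tree_parent) == n:
            return r, tree_parent
    return None
```

Each candidate root costs O(V + E), which gives the O(V^2 + VE) total mentioned above.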
There is a directed graph (not necessarily connected) of which one or more nodes are distinguished as sources. Any node accessible from any one of the sources is considered 'lit'.
Now suppose one of the edges is removed. The problem is to determine the nodes that were previously lit and are not lit anymore.
An analogy with a city's electricity system may be considered, I presume.
This is a "dynamic graph reachability" problem. The following paper should be useful:
A fully dynamic reachability algorithm for directed graphs with an almost linear update time. Liam Roditty, Uri Zwick. Theory of Computing, 2002.
This gives an algorithm with O(m * sqrt(n))-time updates (amortized) and O(sqrt(n))-time queries on a possibly-cyclic graph (where m is the number of edges and n the number of nodes). If the graph is acyclic, this can be improved to O(m)-time updates (amortized) and O(n/log n)-time queries.
It's always possible you could do better than this given the specific structure of your problem, or by trading space for time.
If instead of just "lit" or "unlit" you would keep a set of nodes from which a node is powered or lit, and consider a node with an empty set as "unlit" and a node with a non-empty set as "lit", then removing an edge would simply involve removing the source node from the target node's set.
EDIT: Forgot this:
And if you remove the last lit-from-node in the set, traverse the edges and remove the node you just "unlit" from their set (and possibly traverse from there too, and so on)
EDIT2 (rephrase for tafa):
Firstly: I misread the original question and thought that it stated that for each node it was already known to be lit or unlit, which as I re-read it now, was not mentioned.
However, if for each node in your network you store a set containing the nodes it was lit through, you can easily traverse the graph from the removed edge and fix up any lit/unlit references.
So for example if we have nodes A,B,C,D like this: (lame attempt at ascii art)
A -> B >- D
 \-> C >-/
Then at node A you would store that it was a source (and thus lit by itself), and in both B and C you would store they were lit by A, and in D you would store that it was lit by both B and C.
Then say we remove the edge from B to D: in D we remove B from the lit-source list, but D remains lit as it is still lit by C. Next, say we remove the edge from A to C: A is removed from C's set, and thus C is no longer lit. We then traverse the edges that originate at C and remove C from D's set, so D is now also unlit. In this case we are done, but if the graph were bigger, we'd just go on from D.
This algorithm will only ever visit the nodes that are directly affected by a removal or addition of an edge, and as such (apart from the extra storage needed at each node) should be close to optimal.
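Here is a small Python sketch of the lit-by-set bookkeeping, using the A, B, C, D example. It assumes the graph is acyclic (on a cycle, nodes would keep each other lit after losing their source), and the class and method names are hypothetical.

```python
from collections import defaultdict, deque

class LitGraph:
    def __init__(self, edges, sources):
        self.succ = defaultdict(set)
        for u, v in edges:
            self.succ[u].add(v)
        self.sources = set(sources)
        # lit_by[v] = lit predecessors of v that currently feed it.
        self.lit_by = defaultdict(set)
        lit = set(self.sources)
        queue = deque(self.sources)
        while queue:
            u = queue.popleft()
            for v in self.succ[u]:
                self.lit_by[v].add(u)
                if v not in lit:
                    lit.add(v)
                    queue.append(v)

    def is_lit(self, v):
        return v in self.sources or bool(self.lit_by[v])

    def remove_edge(self, u, v):
        """Remove u -> v and return the nodes that went from lit to unlit."""
        self.succ[u].discard(v)
        if u not in self.lit_by[v]:
            return set()  # u was not feeding v, nothing changes
        self.lit_by[v].discard(u)
        if self.is_lit(v):
            return set()  # v is still fed by something else
        newly_unlit = {v}
        queue = deque([v])
        while queue:
            x = queue.popleft()
            for y in self.succ[x]:
                self.lit_by[y].discard(x)
                if not self.is_lit(y) and y not in newly_unlit:
                    newly_unlit.add(y)
                    queue.append(y)
        return newly_unlit

g = LitGraph([("A", "B"), ("B", "D"), ("A", "C"), ("C", "D")], ["A"])
first = g.remove_edge("B", "D")   # D is still fed by C
second = g.remove_edge("A", "C")  # C goes dark, then D
```

In this run `first` is empty and `second` is {C, D}, matching the walkthrough above.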
Is this your homework?
The simplest solution is to do a DFS (http://en.wikipedia.org/wiki/Depth-first_search) or a BFS (http://en.wikipedia.org/wiki/Breadth-first_search) on the original graph starting from the source nodes. This will get you all the original lit nodes.
Now remove the edge in question and do the DFS again. You get the nodes which still remain lit.
Output the nodes that appear in the first set but not the second.
This is an asymptotically optimal algorithm, since you do two DFSs (or BFSs), which take O(n + m) time and space (where n = number of nodes, m = number of edges) and dominate the complexity. You need at least Ω(n + m) time and space to read the input, therefore the algorithm is optimal.
Now if you want to remove several edges, that would be interesting. In this case, we would be talking about dynamic data structures. Is this what you intended?
EDIT: Taking into account the comments:
not connected is not a problem, since nodes in unreachable connected components will not be reached during the search
there is a smart way to do the DFS or BFS from all nodes at once (I will describe BFS). You just have to put them all at the beginning on the stack/queue.
Pseudo code for a BFS which searches for all nodes reachable from any of the starting nodes:
Queue q = [all starting nodes]
mark all starting nodes as visited
while (q not empty)
{
    x = q.pop()
    forall (y neighbour of x) {
        if (y was not visited) {
            visited[y] = true
            q.push(y)
        }
    }
}
Replace Queue with a Stack and you get a sort of DFS.
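Putting this answer together, a minimal Python sketch of the search-twice-and-compare approach could look like this. It assumes the graph is a list of (u, v) edge pairs without parallel edges; the function names `lit_nodes` and `newly_unlit` are hypothetical.

```python
from collections import deque

def lit_nodes(edges, sources):
    """Multi-source BFS: all nodes reachable from any source."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
    visited = set(sources)  # the sources themselves count as lit
    queue = deque(sources)
    while queue:
        x = queue.popleft()
        for y in adj.get(x, ()):
            if y not in visited:
                visited.add(y)
                queue.append(y)
    return visited

def newly_unlit(edges, sources, removed_edge):
    """Nodes lit before the removal but not after it."""
    before = lit_nodes(edges, sources)
    remaining = [e for e in edges if e != removed_edge]
    after = lit_nodes(remaining, sources)
    return before - after
```

Each call to `lit_nodes` is O(n + m), so a single removal costs two linear passes.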
How big and how connected are the graphs? You could store all paths from the source nodes to all other nodes and look for nodes where all paths to that node contain one of the removed edges.
EDIT: Extend this description a bit
Do a DFS from each source node. Keep track of all paths generated to each node (as edges, not vertices, so then we only need to know the edges involved, not their order, and so we can use a bitmap). Keep a count for each node of the number of paths from source to node.
Now iterate over the paths. Remove any path that contains the removed edge(s) and decrement the counter for that node. If a node counter is decremented to zero, it was lit and now isn't.
I would keep the information about connected source nodes on the edges while building the graph (for example, if an edge has connectivity to the sources S1 and S2, its source list contains S1 and S2), and create the nodes with the information of their input edges and output edges. When an edge is removed, update the output edges of that edge's target node by considering the node's input edges, then traverse all the target nodes of the updated edges using DFS or BFS. (In a graph with cycles, consider marking visited nodes.) While updating the graph, it is also possible to find the nodes without any edge that has a source connection (the lit->unlit nodes). However, this might not be a good solution if you'd like to remove multiple edges at the same time, since that may cause the same edges to be traversed again and again.