Say we have a DAG comprised of a list of nodes A, B, C, D, and E.
Each node has a list of reachable nodes - for example:
A --> B, C
A --> B
D --> E
In this case, we would have to visit nodes A and D to comprehensively visit all nodes in the graph. What is the best algorithm to approach this problem in general?
Here is a linear approach:
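For every node, count its in-degree (the number of edges pointing to it).
Because the graph is a DAG (no cycles), we can take all nodes with an in-degree of 0 as our starting subset. Each of them must be included, since no other node reaches it, and every other node is reachable from at least one of them.
Time complexity: O(N + M), linear in the size of the graph.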
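The in-degree idea above can be sketched in a few lines of Python. The adjacency-dict representation and the function name `source_nodes` are assumptions made for illustration, not part of the question:

```python
def source_nodes(graph):
    """Return the nodes of a DAG that have in-degree 0.

    `graph` is assumed to map each node to a list of its direct successors.
    """
    in_degree = {node: 0 for node in graph}
    for successors in graph.values():
        for succ in successors:
            # .get covers nodes that only ever appear as a successor
            in_degree[succ] = in_degree.get(succ, 0) + 1
    return [node for node, degree in in_degree.items() if degree == 0]


example = {"A": ["B", "C"], "B": [], "C": [], "D": ["E"], "E": []}
print(source_nodes(example))  # ['A', 'D']
```

Each node and each edge is touched once, which is where the O(N + M) bound comes from.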
Here is another approach.
Let's say that node A is a parent of node B if there is an arc from A to B.
And node C is a most-parent of node B if C has no parent and there is a path from C to B.
Mark every node as not visited.
For every node in the DAG, determine its parents.
For every node A that is not visited:
Find A's most-parent MP
Mark all nodes that are reachable from MP as visited
Put MP into the array
After this, the array holds the smallest subset of nodes from which every node in the DAG can be reached.
The time complexity of the algorithm is O(n^2).
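A rough sketch of these steps, assuming the same adjacency-dict input as in the question; the name `covering_roots` and the choice to follow the first listed parent when climbing are illustrative assumptions:

```python
def covering_roots(graph):
    """Collect one most-parent per still-unvisited node (sketch, assumes a DAG)."""
    # Build the parent lists by reversing every arc.
    parents = {node: [] for node in graph}
    for node, successors in graph.items():
        for succ in successors:
            parents.setdefault(succ, []).append(node)

    visited = set()
    roots = []
    for node in parents:
        if node in visited:
            continue
        # Climb until we reach a node without parents (terminates in a DAG).
        mp = node
        while parents[mp]:
            mp = parents[mp][0]
        roots.append(mp)
        # Mark everything reachable from the most-parent as visited.
        stack = [mp]
        while stack:
            current = stack.pop()
            if current in visited:
                continue
            visited.add(current)
            stack.extend(graph.get(current, []))
    return roots


print(covering_roots({"A": ["B", "C"], "B": [], "C": [], "D": ["E"], "E": []}))  # ['A', 'D']
```

The result is the same set of in-degree-0 nodes as in the linear approach; this version only reaches them by a longer route.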
I was working on a problem of finding a bridge in an undirected connected graph, so I looked up Tarjan's algorithm on Wikipedia. Here is what it says:
Tarjan's bridge-finding algorithm
The first linear time algorithm for finding the bridges in a graph was described by Robert Tarjan in 1974. It performs the following steps:
Find a spanning forest of G
Create a rooted forest F from the spanning forest
Traverse the forest F in preorder and number the nodes. Parent nodes in the forest now have lower numbers than child nodes.
For each node v in preorder (denoting each node using its preorder number), do:
Compute the number of forest descendants ND(v) for this node, by adding one to the sum of its children's descendants.
Compute L(v), the lowest preorder label reachable from v by a path for which all but the last edge stays within the subtree rooted at v. This is the minimum of the set consisting of the preorder label of v, of the values of L(w) at child nodes of v and of the preorder labels of nodes reachable from v by edges that do not belong to F.
Similarly, compute H(v), the highest preorder label reachable by a path for which all but the last edge stays within the subtree rooted at v. This is the maximum of the set consisting of the preorder label of v, of the values of H(w) at child nodes of v and of the preorder labels of nodes reachable from v by edges that do not belong to F.
For each node w with parent node v, if L(w) = w and H(w) < w + ND(w) then the edge from v to w is a bridge.
I wonder whether I have misunderstood the steps above, since it seems to me that L(w) = w can never happen except at the root. In every other case, L(w) should be at most the label of w's parent.
The English description of L and H is slightly wrong -- they should exclude paths that contain the parent edge, or else it's as if there are parallel edges between each pair of adjacent nodes, hence no bridges. The algorithm for computing L and H correctly iterates over children only.
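To make the point about the parent edge concrete, here is a sketch of the L/H/ND test in which the edge to the parent is skipped explicitly. It assumes vertices numbered 0..n-1 and an edge list as input, and it builds the spanning forest with an iterative DFS; the name `find_bridges` and these representation choices are assumptions, not taken from the Wikipedia text:

```python
def find_bridges(n, edges):
    """Return the bridges of an undirected graph given as a list of (u, v) pairs."""
    adj = [[] for _ in range(n)]
    for idx, (u, v) in enumerate(edges):
        adj[u].append((v, idx))
        adj[v].append((u, idx))

    pre = [-1] * n          # preorder label of each vertex
    parent = [-1] * n       # parent in the spanning forest
    parent_edge = [-1] * n  # index of the edge leading to the parent
    order = []              # vertices listed in preorder

    for root in range(n):
        if pre[root] != -1:
            continue
        stack = [(root, -1, -1)]
        while stack:
            v, p, e = stack.pop()
            if pre[v] != -1:
                continue  # stale entry: v was reached another way first
            pre[v] = len(order)
            order.append(v)
            parent[v], parent_edge[v] = p, e
            for w, idx in adj[v]:
                if pre[w] == -1:
                    stack.append((w, v, idx))

    nd = [1] * n     # ND(v): number of vertices in the subtree of v
    low = pre[:]     # L(v)
    high = pre[:]    # H(v)
    for v in reversed(order):  # children are handled before their parents
        for w, idx in adj[v]:
            if idx == parent_edge[v]:
                continue  # exclude the parent edge itself
            if parent[w] == v and parent_edge[w] == idx:
                continue  # tree edge to a child: folded in by the child below
            low[v] = min(low[v], pre[w])
            high[v] = max(high[v], pre[w])
        p = parent[v]
        if p != -1:
            nd[p] += nd[v]
            low[p] = min(low[p], low[v])
            high[p] = max(high[p], high[v])

    bridge_indices = sorted(
        parent_edge[w]
        for w in range(n)
        if parent[w] != -1 and low[w] == pre[w] and high[w] < pre[w] + nd[w]
    )
    return [edges[i] for i in bridge_indices]


print(find_bridges(4, [(0, 1), (1, 2), (2, 0), (2, 3)]))  # [(2, 3)]
```

Edges are compared by index rather than by endpoints, so a parallel edge to the parent still counts as a non-tree edge and correctly prevents a bridge. If the `parent_edge` check is removed, every non-root vertex sees its parent's label, L(w) = w never holds, and no bridge is reported, which is exactly the confusion in the question.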
Suppose we have a connected undirected graph G, a start vertex s, and a spanning tree T of G. How can I describe an algorithm to decide whether T is a depth-first spanning tree rooted at s?
All DFS trees T for an undirected graph G have the following property:
If {u, v} is an edge in G, then u is an ancestor of v in T or v is an ancestor of u in T.
To see why, assume without loss of generality that u is visited before v in the DFS. When building the DFS tree node for u, we will either (1) choose to visit node v as a neighbor of u, making node u a parent of node v, or (2) starting at node u we will visit some other neighbor z, and in recursively exploring z we will visit v, in which case u is a parent of z and z is an ancestor of v.
Moreover, we can make a stronger claim: any spanning tree meeting the above criterion is a DFS tree for some DFS of G. Here’s how to see this. Start with the root node of T and look at its children. Given any two subtrees of the root, no node in one subtree can be adjacent in G to a node in the other, since otherwise by the above property one of those nodes would have to be an ancestor of the other. Therefore, each subtree consists of a set of nodes that are all reachable from one another via paths that only involve the nodes within that subtree. We can then recursively assemble one possible DFS ordering by starting at the root, recursively building DFS trees for the subgraphs represented by the subtrees in any order we’d like, and gluing those DFS orders together.
With this observation in mind, we can check very quickly with a second DFS whether T can be a DFS tree rooted at s, tracking which nodes have been visited as the DFS runs. After all children of a node v have been processed, check whether all the neighbors of v in graph G have been visited. If so, great! If not, it means that some neighbor of v is neither an ancestor nor a descendant, and the tree isn’t a DFS tree. If this process terminated without finding any violations, the process itself traces out a DFS of G using the edges of T, so T is definitely a valid DFS tree.
This algorithm runs in time O(m + n), which is as fast as possible here. After all, if you don’t look at all the nodes or edges of G, you can’t be sure whether the tree is a valid DFS tree because you can’t check the core property listed above.
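A minimal sketch of that check, assuming G is given as an adjacency dict and T as a dict mapping each node to its list of children (both representations and the name `is_dfs_tree` are assumptions for illustration):

```python
def is_dfs_tree(graph, tree_children, s):
    """Check whether the spanning tree T rooted at s could be a DFS tree of G."""
    visited = set()
    stack = [(s, False)]  # (node, True once all of its children are processed)
    while stack:
        v, children_done = stack.pop()
        if children_done:
            # Every neighbour of v in G must already have been visited;
            # otherwise it is neither an ancestor nor a descendant of v.
            if any(w not in visited for w in graph[v]):
                return False
            continue
        visited.add(v)
        stack.append((v, True))
        for child in tree_children.get(v, []):
            stack.append((child, False))
    # T is assumed to be a spanning tree; this guards against a partial one.
    return len(visited) == len(graph)


triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
print(is_dfs_tree(triangle, {0: [1], 1: [2]}, 0))  # True: the path 0-1-2
print(is_dfs_tree(triangle, {0: [1, 2]}, 0))       # False: edge 1-2 joins siblings
```

An explicit stack is used instead of recursion so that deep trees do not hit Python's recursion limit.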
Given a Graph with N nodes. Two players A and B start from node 1 and node N respectively. A can visit all the adjacent nodes to the nodes already visited by A but can not visit any nodes which are already visited by B and similarly for B also. Suppose A moves first. Find the winner and maximum nodes visited by the winner.
I know a solution for the tree using DFS, but for Graph, I am not able to construct a solution.
In Dijkstra's algorithm, what if one of the unvisited nodes in a graph is "cut off" from the currently visited node by some other visited node? Say I'm in node A, which is linked to node B, and node B is linked to unvisited node C. However, node B has already been visited, so I can't visit it again. How do I get to C?
A value for C will already have been computed in the distance array while you were visiting node B. When you are at node A, that existing value is simply not updated: A has no edge to C, so the candidate dist(src, A) + inf is greater than dist(src, B) + dist(B, C) (or your implementation has some other way of expressing that there is no such edge). The value dist(src, B) + dist(B, C) was computed while visiting B, at which point the shortest path from src to B was already known.
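A small heap-based sketch shows the effect. The graph below, the node name `S` for the source, and the function name `dijkstra` are made up for illustration: B is settled before A, and C receives its distance at that moment, so nothing is lost when A is processed later.

```python
import heapq


def dijkstra(graph, src):
    """Shortest distances from src; graph maps node -> list of (neighbour, weight)."""
    dist = {node: float("inf") for node in graph}
    dist[src] = 0
    heap = [(0, src)]
    settled = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in settled:
            continue  # outdated heap entry
        settled.add(u)
        for v, weight in graph[u]:
            # Relaxing u's outgoing edges is what gives its neighbours a value,
            # even if they are never reached from any other node.
            if d + weight < dist[v]:
                dist[v] = d + weight
                heapq.heappush(heap, (dist[v], v))
    return dist


graph = {
    "S": [("B", 1), ("A", 5)],
    "B": [("C", 2)],
    "A": [("B", 1)],
    "C": [],
}
print(dijkstra(graph, "S"))  # {'S': 0, 'B': 1, 'A': 5, 'C': 3}
```

You never need to walk from A to C: the algorithm always continues with the unsettled node that has the smallest tentative distance, wherever it is in the graph.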
Consider the following graph G, and suppose that in one execution of the DFS algorithm on G the edges are classified as tree edges (t), back edges (b), forward edges (f) and cross edges (c), as shown in the graph. For each node of the graph, find the discovery time and the finishing time of the node.
In other words, for each node v of the graph, find the values d[v] and f[v] that the DFS algorithm associates with this node.
Notice that there is only one possible assignment of the values d[v] and f[v].
Could you give me a hint how we can find the initial node in order to start applying the Depth first search algorithm?
Look at node a - what could DFS do in node a? It could go either to b or e. We see that it chose b, because a->b is a tree edge and a->e is a forward edge (check the definition of tree/forward edge). In b the only choice was to visit f. In f DFS could go either to a, e or g. We can assume that it tried to visit a (f->a is marked as a back edge, so everything is correct until now), then it visited e and then tried to visit b. However, we now have a problem with edge f->g. It is marked as a cross edge, which means that DFS had already visited g before. Otherwise, this edge would have been marked as a tree edge. So, we know that a was not the initial node. We need to try other options. What about c? Again, all of the edges coming out of c are marked as cross, not tree, so c was not the initial node.
What about d? If DFS started in d, it could go from d to g, and that's what happened because d->g is marked as a tree edge. There were no nodes to go to from g, so it backtracked to d and visited h. From h it tried to visit g, but g had already been visited, so h->g is marked as cross - correct. Great, so d was the initial node for this DFS execution. After visiting the connected component which contains d, g and h, DFS could start again either from a or c, but we already know that it did not start from c because of those cross edges. So it started from a, and after visiting b, f and e it started from c.
Tree edges should form a forest. A node at which the DFS could have started is a node that has no incoming tree edges.
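That last observation is easy to apply mechanically. The sketch below assumes the labelled graph is given as a list of (from, to, label) triples with "t" marking tree edges; the tiny example graph and the name `possible_dfs_roots` are hypothetical and are not the graph from the question:

```python
def possible_dfs_roots(nodes, labelled_edges):
    """Return the nodes that have no incoming tree edge (roots of the DFS forest)."""
    has_incoming_tree_edge = {
        target for (_, target, label) in labelled_edges if label == "t"
    }
    return [node for node in nodes if node not in has_incoming_tree_edge]


# Hypothetical labelled graph: x->y is a tree edge, z->y a cross edge, y->x a back edge.
edges = [("x", "y", "t"), ("z", "y", "c"), ("y", "x", "b")]
print(possible_dfs_roots(["x", "y", "z"], edges))  # ['x', 'z']
```

The order in which DFS restarted from these roots still has to be worked out from the cross edges, as in the reasoning above: a cross edge always points to a node that was already finished.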