Does the starting node matter for Breadth First Search and Depth First Search in order to visit all of the nodes?

For DFS and BFS, should we always start at the root in order to make sure we traverse all the nodes?

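No, you can start at any node you like, but the order in which the nodes are visited can be different for each starting node. This holds when every node is reachable from the start (for example, a connected undirected graph). In a directed or disconnected graph, a single traversal only visits the nodes reachable from the start, so you may have to restart from nodes that are still unvisited.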
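To make the answer concrete, here is a minimal sketch in Python. It assumes the graph is stored as an adjacency dictionary; the names `bfs_order` and `graph` and the sample tree are made up for illustration and are not from the question.

```python
from collections import deque

def bfs_order(graph, start):
    """Return the nodes in the order BFS visits them from `start`."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return order

# Hypothetical undirected tree: 1 is the root with children 2 and 3; 2 has child 4.
graph = {1: [2, 3], 2: [1, 4], 3: [1], 4: [2]}
from_root = bfs_order(graph, 1)  # start at the root
from_leaf = bfs_order(graph, 4)  # start at a leaf
```

Both calls visit the same set of nodes, only in a different order, because the edges here can be followed in both directions.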

Related

How to define the LCA (Least Common Ancestor) of two nodes in the case of a graph?

I googled and tried to read about the LCA (of two nodes) in a graph, but unfortunately I didn't find much descriptive and understandable content.
So can someone please elaborate on the LCA in a graph (both directed and undirected)?
To find the closest common ancestor of two nodes in a directed graph, do a breadth-first search upwards (following edges in reverse, from child to parent) from each node. As each node is visited, add it to a vector of nodes, one vector per search. When the searches are complete, find the first common node in the two vectors.
Optimization: pause the searches each time they increment their depth. If a common node has been found, stop; otherwise continue to the next depth.
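The upward search described above might look roughly like this in Python. This is only a sketch under some assumptions: the graph is given as a `parents` dictionary mapping each node to the nodes with an edge into it, a node counts as its own ancestor, and "first common node" is read as the first node in the first search's visit order that the second search also reached. The function names and the sample graph are hypothetical, and the depth-by-depth optimization is not included.

```python
from collections import deque

def ancestors_bfs(parents, start):
    """List `start` and its ancestors in BFS order (nearest first)."""
    seen = {start}
    order = [start]
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for parent in parents.get(node, []):
            if parent not in seen:
                seen.add(parent)
                order.append(parent)
                queue.append(parent)
    return order

def closest_common_ancestor(parents, a, b):
    """Return the first node in a's upward BFS order that b's search also reached."""
    from_a = ancestors_bfs(parents, a)
    from_b = set(ancestors_bfs(parents, b))
    for node in from_a:
        if node in from_b:
            return node
    return None  # no common ancestor

# Hypothetical DAG with edges r->x, r->y, x->a, x->b, y->b.
parents = {"x": ["r"], "y": ["r"], "a": ["x"], "b": ["x", "y"]}
result = closest_common_ancestor(parents, "a", "b")
```

In a general graph there may be several equally close common ancestors; this sketch simply returns whichever one the first search encounters first.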

Binary Search Tree Algorithm - Start Searching From a Node Other Than Root

Is there a way to search in a binary search tree starting from a node other than the root? E.g., to start searching from a node in the third level of the tree.
Is there a way to search in a binary search tree starting from a node
other than the root?
Yes. There is.
But not with the traditional Binary Search Tree data structure.
You will have to modify the structure of a Node to achieve that.
The most straightforward way is to store a pointer to the root node in every other node. That way, you can go directly to the root from any given node and apply the regular binary search algorithm. This method adds the overhead of storing one more pointer per node and does not change the time complexity of the search, which remains O(lg N) for a balanced tree.
Another, slightly more complex way is to store a pointer to the immediate parent in every node. Given any node, first perform a binary search on the subtree rooted at that node to check whether the target is there. That finishes in O(lg M), where M is the number of nodes in the subtree, which is a bit more efficient because M < N. If the target is not in the subtree, traverse back up to the root (the root has null as its parent pointer). While traversing back, also check whether the node you are currently at is the target; if it is, return directly from there. Once you reach the root, apply the standard binary search algorithm. The time complexity of this method remains O(lg N). When the target lies in the starting subtree, the algorithm finishes faster than a standard search from the root.
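A minimal Python sketch of the parent-pointer variant follows. The `Node` class, `insert`, `search_down` and `search_from` are illustrative names, not part of any standard library, and the tree is assumed to hold distinct keys.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.parent = None  # extra pointer; None for the root

def insert(root, key):
    """Plain BST insert that also sets the parent pointer; returns the new node."""
    node = Node(key)
    cur = root
    while True:
        side = "left" if key < cur.key else "right"
        child = getattr(cur, side)
        if child is None:
            setattr(cur, side, node)
            node.parent = cur
            return node
        cur = child

def search_down(node, key):
    """Standard binary search in the subtree rooted at `node`."""
    while node is not None and node.key != key:
        node = node.left if key < node.key else node.right
    return node

def search_from(start, key):
    """Search the subtree of `start` first, then climb to the root and search from there."""
    found = search_down(start, key)
    if found is not None:
        return found
    node = start
    while node.parent is not None:
        node = node.parent
        if node.key == key:  # check each ancestor on the way up
            return node
    return search_down(node, key)  # `node` is now the root

# Hypothetical tree: 8 at the root, then 4, 12, 2, 6, 10, 14.
root = Node(8)
nodes = {k: insert(root, k) for k in [4, 12, 2, 6, 10, 14]}
hit = search_from(nodes[4], 14)  # 14 is outside the subtree of 4
```

Here the search for 14 fails in the subtree of 4, climbs to the root 8, and then succeeds with an ordinary search from the root.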

iterative approach for tree traversal

Can someone help me out with an algorithm to traverse a binary tree iteratively without using any other data structure, such as a stack?
I read somewhere that we can have a flag named visited on each node and turn it on when the node is visited, but my BinaryTreeNode class does not have a visited variable defined, so I cannot do something like node.left.visited = false.
Is there any other way to traverse iteratively?
One option would be to thread the binary tree.
Whenever some node points to NULL (be it left or right), make that node point to the node which comes next in its traversal (pre-order, post-order, etc). In this way, you can traverse the entire tree in one iteration.
Sample threaded binary tree:
Note that the left pointer of each node points to the largest value smaller than it, and the right pointer of each node points to the smallest value larger than it. Following these pointers therefore gives an in-order traversal.
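If the tree cannot be threaded permanently, a closely related technique is Morris traversal, which creates the right-hand threads temporarily during the walk and removes them again. It is a different technique from the fully threaded tree described above, shown here only as a sketch; the `BinaryTreeNode` constructor below is an assumption about what the asker's class looks like.

```python
class BinaryTreeNode:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def morris_inorder(root):
    """In-order traversal with no stack, no recursion and no visited flag."""
    result = []
    node = root
    while node is not None:
        if node.left is None:
            result.append(node.value)
            node = node.right  # may follow a temporary thread
        else:
            # Find the in-order predecessor: rightmost node of the left subtree.
            pred = node.left
            while pred.right is not None and pred.right is not node:
                pred = pred.right
            if pred.right is None:
                pred.right = node  # create a temporary thread back to `node`
                node = node.left
            else:
                pred.right = None  # left subtree is done: remove the thread
                result.append(node.value)
                node = node.right
    return result

# Hypothetical tree: 4 at the root, 2 and 6 below it, leaves 1, 3, 5, 7.
tree = BinaryTreeNode(
    4,
    BinaryTreeNode(2, BinaryTreeNode(1), BinaryTreeNode(3)),
    BinaryTreeNode(6, BinaryTreeNode(5), BinaryTreeNode(7)),
)
values = morris_inorder(tree)
```

The tree is modified while the traversal runs, so this approach is not suitable if other code reads the tree concurrently.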

How can I do this graph traversal?

I have a directed cyclic graph consisting of nodes a, b, c, d, e, f, g, where every node is connected to every other node. The edges may be unidirectional or bidirectional. I need to print out a valid order, e.g. f->a->c->b->e->d->g, such that I can reach the end node from the start node. Note that all the nodes must be present in the output list.
Also note that there may be cycles in the graph.
What I came up with:
Basically, first we can try to find a start node: a node with no incoming edge (there can be at most one such node). I may or may not find one. I will also do some preprocessing to find the total number of nodes (let's call it n). Then I will start a DFS from the start node, marking nodes as visited when I reach them and counting how many nodes I have visited. If I can reach n nodes this way, I am done. If I hit a node from which there are no outgoing edges to any unvisited node, I have hit a dead end; I will mark that node as unvisited again, move the pointer back to the previous node, and try a different route.
That covers the case where I find a start node. If I don't find a start node, I will just have to try this with various nodes.
I have no idea if I am even close to the solution. Can anyone help me in this regard?
In my opinion, if there is no incoming edge to a node, that node must be the start node. You can traverse the graph from this start node, and if it cannot visit all n nodes, then there is no solution (as you said, all the nodes must be present in the output list). This is because if you start from any other node, you won't be able to reach this start node.
The problem with your solution is that if you enter a loop you don't know if and when to exit.
A DFS in these conditions can easily become a non-polynomial task!
Let me introduce a polynomial algorithm for your problem.
It looks complicated; I hope there's room for simplification.
Here is my suggested solution:
1) For each node, construct the table of the nodes it can reach (if a can reach b and c, b can reach d, and c can reach e, then a can reach b, c, d, e, even though there is no single path from a passing through all of them).
If no node can reach all the other ones, you're done: the path you're looking for does not exist.
2) Find loops. That's easy: if a node can reach itself, there is a loop. This should be part of the construction of the table at the previous point.
Once you have found a loop, you can shrink it (and its nodes) to a representative node whose ingoing (outgoing) connections are the union of the ingoing (outgoing) connections of the nodes in the loop.
You keep reducing loops until you cannot do any more.
3) At this point you are left with an acyclic graph. If there is a path connecting all nodes, there is a single node connected to all of them, and starting from it you can perform a depth-first search.
4) Write down the path by replacing the traversal of representative nodes with a loop from the entry point of the loop to the exit point.
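Steps 1 and 2 above (the reachability table and loop detection) can be sketched in Python as follows. This covers only those two steps, not the shrinking of loops or the final path construction. The adjacency dictionary and the name `reachability` are assumptions for illustration; the sample edges are the ones from the example in step 1.

```python
from collections import deque

def reachability(graph):
    """Map each node to the set of nodes reachable from it via at least one edge."""
    table = {}
    for source in graph:
        reached = set()
        queue = deque(graph[source])
        while queue:
            node = queue.popleft()
            if node not in reached:
                reached.add(node)
                queue.extend(graph[node])
        table[source] = reached
    return table

# Edges from the example in step 1: a->b, a->c, b->d, c->e.
graph = {"a": ["b", "c"], "b": ["d"], "c": ["e"], "d": [], "e": []}
table = reachability(graph)

# Step 1: nodes that can reach every other node are the only possible starts.
starts = [n for n in graph if table[n] | {n} == set(graph)]

# Step 2: a node that can reach itself lies on a loop.
on_loop = [n for n in graph if n in table[n]]
```

Running one BFS per node costs O(|V| * (|V| + |E|)) in total, which is polynomial.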

Most efficient way to visit nodes of a DAG in order

I have a large (100,000+ nodes) Directed Acyclic Graph (DAG) and would like to run a "visitor" type function on each node in order, where order is defined by the arrows in the graph. i.e. all parents of a node are guaranteed to be visited before the node itself.
If two nodes do not refer to each other directly or indirectly, then I don't care which order they are visited in.
What's the most efficient algorithm to do this?
You would have to perform a topological sort on the nodes, and visit the nodes in the resulting order.
The complexity of such an algorithm is O(|V|+|E|), which is quite good. You want to traverse all nodes, so any faster algorithm would have to work without even looking at all the edges, which would be dangerous, because a single edge can change the required order completely.
There are some answers here:
Good graph traversal algorithm
and here:
http://en.wikipedia.org/wiki/Topological_sorting
In general, after visiting a node, you should visit its related nodes, but only the nodes that are not already visited. In order to keep track of the visited nodes, you need to keep the IDs of the nodes in a set (or map), or you can mark the node as visited (somehow).
If you care about the topological order, you must first get hold of a collection of all the un-traversed links ("remaining links") to a node, sorted by the id of the referenced node (typically: map(node-ID -> link-count)). If you haven't got that, you might need to build it using an approach similar to the one above. Then, start by visiting a node whose remaining incoming link count is zero. For each link from that node, reduce the remaining link count for each related node, adding the related node to the set of nodes-to-visit (or just visiting the node) if the count reaches zero.
As mentioned in the other answers, this problem can be solved by Topological Sorting.
A very simple algorithm for that (not the most efficient):
Keep an array (or map) indegree[] where indegree[node] = number of incoming edges of node
while there is at least one node n with indegree[n] = 0:
    for each node n in nodes where indegree[n] = 0:
        visit(n)
        indegree[n] = -1   # mark n as visited
        for each node x adjacent to n:
            indegree[x] = indegree[x] - 1   # its parent has been visited, so one less edge coming into it
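The same in-degree idea can be made runnable with a queue of ready nodes, so that no rescan of all nodes is needed and the total work is O(|V|+|E|). This queue-based variant is usually called Kahn's algorithm. The sketch below assumes an adjacency dictionary that maps every node to its children; `topological_order` and the sample graph are illustrative names.

```python
from collections import deque

def topological_order(graph):
    """Return the nodes so that every parent comes before its children."""
    indegree = {node: 0 for node in graph}
    for node in graph:
        for child in graph[node]:
            indegree[child] += 1

    ready = deque(node for node in graph if indegree[node] == 0)
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)  # this is where visit(node) would be called
        for child in graph[node]:
            indegree[child] -= 1  # one less unvisited parent
            if indegree[child] == 0:
                ready.append(child)

    if len(order) != len(graph):
        raise ValueError("graph contains a cycle")
    return order

# Hypothetical DAG with edges a->b, a->c, b->d, c->b, c->d.
dag = {"a": ["b", "c"], "b": ["d"], "c": ["b", "d"], "d": []}
order = topological_order(dag)
```

The length check at the end is a cheap way to detect that the input was not actually acyclic.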
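You can visit every node of a DAG in O(|V|+|E|) by running DFS from every node with zero indegree, because those are the valid starting points. This works because the graph has no cycles, so zero-indegree nodes must exist and every node is reachable from at least one of them. Note, however, that the order in which a plain DFS first reaches the nodes does not guarantee that all parents are visited before a node; to get that guarantee, record the nodes in reverse post-order, which is itself a topological sort.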
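A DFS-based ordering could be sketched as below. With edges a->b, a->c and c->b, a plain DFS from a that follows a->b first reaches b before c, even though c is a parent of b, so this sketch records nodes when they are finished and reverses that list. It is written iteratively because a recursive DFS may hit Python's recursion limit on a graph with 100,000+ nodes. The function name and sample graph are illustrative assumptions, and the input is assumed to be acyclic.

```python
def dfs_topological_order(graph):
    """Reverse post-order of an iterative DFS; assumes `graph` is acyclic."""
    visited = set()
    postorder = []
    for root in graph:
        if root in visited:
            continue
        visited.add(root)
        stack = [(root, iter(graph[root]))]
        while stack:
            node, children = stack[-1]
            for child in children:
                if child not in visited:
                    visited.add(child)
                    stack.append((child, iter(graph[child])))
                    break  # descend into the child first
            else:
                # All children are finished, so this node is finished too.
                postorder.append(node)
                stack.pop()
    postorder.reverse()
    return postorder

# Hypothetical DAG with edges a->b, a->c, b->d, c->b, c->d.
dag2 = {"a": ["b", "c"], "b": ["d"], "c": ["b", "d"], "d": []}
dfs_order = dfs_topological_order(dag2)
```

Starting the outer loop from every node, rather than only from zero-indegree nodes, still produces a valid order for a DAG and avoids computing in-degrees first.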
