I am looking at the non-recursive DFS and BFS of a general graph. Besides the fact that the former uses a stack instead of a queue, the only difference is that it "delays checking whether a vertex has been discovered until the vertex is popped from the stack rather than making this check before pushing the vertex." Why is this "visited" check order different? Or, to put it another way, can we change BFS into non-recursive DFS by simply replacing the queue in BFS with a stack?
I checked all the posts I could find, such as this and this, but none of them clarifies this question.
Yes, that is the only difference.
The DFS algorithm you show from Wikipedia has a bug (well, at least a serious inefficiency) in it -- it will reinsert into S nodes which have already been visited. The BFS one is more sensibly designed, and you could change it to use a stack.
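To make the check-order difference concrete, here is a minimal sketch (my own illustration, not code from the question or the Wikipedia article) of the pop-time-check DFS being discussed, in Python. `adj` is an assumed adjacency-list dict; the one-line guard on the push removes the reinsertion inefficiency mentioned above.

# Iterative DFS in the style being discussed: the visited check is delayed
# until a vertex is popped from the stack.
def dfs_pop_check(adj, start):
    visited = set()
    order = []
    stack = [start]                       # the stack S from the question
    while stack:
        v = stack.pop()
        if v in visited:                  # the delayed, pop-time check
            continue
        visited.add(v)
        order.append(v)
        for w in adj[v]:
            if w not in visited:          # guard that avoids reinserting
                stack.append(w)           # already-visited vertices into S
    return order

With adj = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]} and start 0, this visits 0, 2, 3, 1 and silently skips the second copy of 1 that was left sitting on the stack. That late skip is exactly the check the quoted sentence is talking about, and it is what changes when you simply swap BFS's queue for a stack, because BFS filters duplicates at push time instead.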
So hopefully this is a simple question, but I can't seem to find the answer.
The time complexity of DFS is allegedly O(|V|+|E|). Now I'm having issues seeing why it depends on the number of edges. The usual explanation I've seen goes as follows:
Say we implement a DFS using an explicit stack (for simplicity). Say we have a graph where each node is connected to all the rest. We start at some node, visit it, and then push all its neighbors onto the stack. Now we pop the next node and put all of its neighbors onto the stack. We repeat until we visit all the nodes.
Let's pretend that the node that finds itself on top of the stack is not visited yet in each iteration (the best-case scenario for this graph). In this case we visited all the nodes in |V| moves, but for each of them we pushed |V|-1 nodes onto the stack, which means that all the edges are pushed onto the stack and the complexity is O(|E|).
A few notes. I'm arguing that the complexity is LESS than that, so this proof, which only looks at the best scenario for a worst-case graph, is fine. I'm also assuming that |E| is always larger than |V|. In fact, I'm assuming it's O(|V|^2). This means that O(|V|+|E|) and O(|E|) mean the same thing to me.
Ok, now here's my deal. What if we don't use an explicit stack?
The explosion here is due to the fact that we keep stacking up useless nodes that will never be processed. What if we instead just recurse? The advantage is that we can check if we're done before each recursive call.
Since there's no explicit stack and I'm still only visiting nodes I haven't seen before, I don't see how I can exceed the complexity of O(|V|).
"The explosion here is due to the fact that we keep stacking up useless nodes that will never be processed. What if we instead just recurse? The advantage is that we can check if we're done before each recursive call."
That check still contributes to the run time. For each node you visit, you need to see which of its neighbors still need to be visited, which means checking each adjacent edge.
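To see where the edge term comes from even without an explicit stack, here is a small sketch (my own, with an assumed adjacency-list dict `adj`) of the recursive version with a counter on the neighbor check. The loop over neighbors runs once per adjacency-list entry, so it contributes O(|E|) across the whole traversal even though each vertex is recursed into only once.

# Recursive DFS with a counter showing that the "is this neighbour visited?"
# check runs once per adjacency-list entry, i.e. O(|E|) in total.
def dfs_count_checks(adj, start):
    visited = set()
    checks = [0]                      # mutable counter shared by the closure

    def go(v):
        visited.add(v)
        for w in adj[v]:
            checks[0] += 1            # runs sum of deg(v) times overall
            if w not in visited:
                go(w)

    go(start)
    return checks[0]                  # 2*|E| on a connected undirected graph

On a complete graph with |V| vertices this returns |V|*(|V|-1): the recursion visits each vertex once but still pays for every edge, which is why the bound is O(|V|+|E|) rather than O(|V|).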
I had this question asked in an interview today. I told them it's traversal, and that DFS can be used to see if the graph is connected. They said it was too simple.
What are some more important practical uses of DFS and BFS?
On a lighter note, this always comes to my mind when I hear DFS or BFS.
Note: This does not provide the direct answer to your question.
BFS:
Minimum spanning tree or shortest path in an unweighted graph
Peer-to-peer networking
Crawlers for search engines
Social network websites - finding people within a given distance k from a person by running BFS up to k levels (a quick sketch follows this list).
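A minimal sketch of that last item (my own illustration, not from the answer; `adj` is an assumed adjacency-list dict mapping each person to their friends): a plain BFS that stops expanding once it reaches level k.

from collections import deque

# BFS limited to k levels: returns all people within distance k of `person`.
def within_k_levels(adj, person, k):
    dist = {person: 0}
    queue = deque([person])
    while queue:
        u = queue.popleft()
        if dist[u] == k:            # do not expand beyond level k
            continue
        for v in adj[u]:
            if v not in dist:       # BFS visited-check at push time
                dist[v] = dist[u] + 1
                queue.append(v)
    return [p for p in dist if p != person]

Because BFS discovers nodes in non-decreasing order of distance, dist[v] is the true friendship distance, so cutting off expansion at level k is safe.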
DFS:
Minimum spanning tree (of an unweighted graph)
Topological sorting (a quick sketch follows this list)
Solving puzzles/mazes that have a single solution
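And a small sketch of DFS-based topological sorting (again my own illustration, assuming a DAG given as an adjacency-list dict `adj`): a vertex is appended to the output only after all of its successors are finished, so reversing the finish order gives a valid topological order.

# Topological sort of a DAG via DFS finish times.
def topological_sort(adj):
    visited = set()
    order = []

    def dfs(u):
        visited.add(u)
        for v in adj[u]:
            if v not in visited:
                dfs(v)
        order.append(u)             # u is finished after all its successors

    for u in adj:
        if u not in visited:
            dfs(u)
    return order[::-1]              # reverse finish order = topological order

For example, topological_sort({'a': ['b', 'c'], 'b': ['d'], 'c': ['d'], 'd': []}) returns ['a', 'c', 'b', 'd'], one of the valid orders.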
Here are some points which come to my mind while reading your question:
DFS:
not usable on infinite trees (because you might go down an infinite branch). Also, if there are cycles in your search graph, you must take precautions to avoid running around a cycle forever.
you will most likely not find the nearest solution first
you need only O(depth) memory
BFS
the first solution you find is one of the nearest ones
you will need quite a lot of memory, because the search tree can get very broad even at a small depth.
works on infinite trees and cyclic structures without any precaution
Of course you will find much more on Wikipedia:
BFS
DFS
I saw this from an answer to another question
IVlad says that the stack will contain the cycle. But while searching through a graph, wouldn't the nodes that make up the cycle have been popped off in the process?
Maybe he meant a stack of visited nodes? But even then, the visited stack does not cleanly contain the cycle. What I mean is that although the cycle is there, it could have other visited nodes sandwiched inside the cycle, no?
When you use DFS to find a cycle in a graph, you usually implement the DFS as a recursive method. Recursive methods use the call stack to store their data, and when you find a cycle in your recursive method, the call stack holds the whole path used to reach the current node. What IVlad means is the running program's call stack, not a stack you create yourself to implement your DFS method.
Alternatively, you can store the nodes on the path in a separate stack of your own.
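Here is a rough sketch of that idea (mine, not IVlad's code): a recursive DFS on a directed graph `adj` (an assumed adjacency-list dict) that keeps the current path in an explicit list, so when a back edge to a vertex already on the path is found, the cycle can be cut cleanly out of the path with no other visited nodes sandwiched in.

# DFS cycle detection that returns the actual cycle, using an explicit path
# stack alongside the recursion.
def find_cycle(adj):
    visited = set()
    on_path = set()
    path = []                         # explicit stack mirroring the call stack

    def dfs(u):
        visited.add(u)
        on_path.add(u)
        path.append(u)
        for v in adj[u]:
            if v in on_path:          # back edge: v is an ancestor on the path
                return path[path.index(v):] + [v]
            if v not in visited:
                cycle = dfs(v)
                if cycle:
                    return cycle
        on_path.discard(u)
        path.pop()
        return None

    for u in adj:
        if u not in visited:
            cycle = dfs(u)
            if cycle:
                return cycle
    return None

For example, find_cycle({1: [2], 2: [3], 3: [1]}) returns [1, 2, 3, 1] - just the cycle, with nothing else in between, because nodes are popped off the path as soon as their subtree is finished.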
So I recently implemented a non-recursive version of DFS. It turns out that I can mark the nodes "visited" either as soon as they are pushed on the stack or when they are popped out. The problem I was working on specifically stated to mark a node "visited" when it is pushed on the stack. Are both versions some kind of DFS? Or is it that one is DFS and the other is not? Any suggestions are welcome.
What I think is that if I do it the second way, it will mimic the recursive DFS. But why does the other one work?
A recursive DFS (please ignore this):
# mark a node as visited as soon as we start processing it,
# and only recurse into neighbours that have not been visited yet
def dfsRec(node, visitedArray, adj):
    visitedArray[node] = 1
    for neighbour in adj[node]:
        if not visitedArray[neighbour]:
            dfsRec(neighbour, visitedArray, adj)

def dfs(startNode, adj):
    visitedArray = [0] * len(adj)        # nothing is visited at the start
    dfsRec(startNode, visitedArray, adj)
The problem with the second way (i.e. marking a node visited only when it is popped out) is that nodes are not marked until they actually come off the stack, so whenever your graph has a cycle, a node that is still waiting on the stack can be pushed again and again by its other neighbours. The stack can blow up badly, and if on top of that you never skip already-visited nodes when popping, the traversal on a cyclic graph will keep going in circles until you run out of memory.
Note that the issue is not too different from a recursive implementation of DFS that lacks the visited check: the recursion would cause a stack overflow instead of running out of memory, but the reason for it would be the same.
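For comparison, here is a minimal sketch (mine, not code from the question) of the first way, written in the same style as the recursive version above: marking at push time guarantees each node enters the stack at most once, so a cycle cannot keep inflating the stack.

# Iterative DFS, marking nodes visited at push time.
def dfsIter(startNode, adj):
    visitedArray = [0] * len(adj)
    stack = [startNode]
    visitedArray[startNode] = 1           # marked as soon as it is pushed
    while stack:
        node = stack.pop()
        for neighbour in adj[node]:
            if not visitedArray[neighbour]:
                visitedArray[neighbour] = 1   # mark before pushing
                stack.append(neighbour)
    return visitedArray

With this variant the stack never holds more than |V| entries, even on a cyclic graph.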
I have read about DFS and BFS many times, but I have had this doubt lingering in my mind for a long time. In a lot of articles it is mentioned that DFS can get stuck in infinite loops.
As far as I know, this limitation can easily be removed by keeping track of the visited nodes. In fact, in all the books that I have read, this little check is a part of DFS.
So why are 'infinite loops' mentioned as a disadvantage of DFS? Is it just because the original DFS algorithm did not have this check for visited nodes? Please explain.
(1) In graph search algorithms [used frequently in AI], DFS's main advantage is space efficiency; this is its main advantage over BFS. However, if you keep track of visited nodes, you lose this advantage, since you need to store all visited nodes in memory. Don't forget that the set of visited nodes grows drastically over time, and for very large/infinite graphs it might not fit in memory.
(2) Sometimes DFS can end up in an infinite branch [in infinite graphs]. An infinite branch is a branch that does not end [there are always more children] and that also does not get you to your target node, so DFS might keep expanding this branch infinitely and "miss" the good branch that leads to the target node.
Bonus:
You can overcome this flaw in DFS, while maintaining a relatively small memory footprint, by using a combination of DFS and BFS: Iterative Deepening DFS.
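A minimal sketch of iterative deepening (my own illustration; the successor function `neighbours(state)` and the goal test `is_goal(state)` are assumed to be supplied by the caller): it runs a depth-limited DFS with limits 0, 1, 2, ..., so memory stays O(depth) like DFS while the first solution found is one of the nearest ones, like BFS.

from itertools import count

# Iterative Deepening DFS: repeated depth-limited DFS with growing limits.
# Assumes a solution exists somewhere in the (possibly infinite) tree.
def iddfs(start, is_goal, neighbours):
    def dls(state, limit):            # depth-limited search
        if is_goal(state):
            return [state]
        if limit == 0:
            return None
        for nxt in neighbours(state):
            found = dls(nxt, limit - 1)
            if found is not None:
                return [state] + found
        return None

    for limit in count():             # limits 0, 1, 2, ... until a solution appears
        found = dls(start, limit)
        if found is not None:
            return found              # path from start to the goal

Note that there is deliberately no visited set here: IDDFS keeps the O(depth) memory of DFS by re-expanding shallow nodes on each deeper pass instead of remembering them.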
A conventional DFS algorithm does keep track of visited nodes. A local search algorithm does not keep track of visited states and behaves with amnesia. So I think the "infinite loop" mainly refers to an infinite branch (a branch with infinitely many possible states). In that case, DFS simply keeps going down and becomes too focused on one branch.
If you do not check for cycles, then DFS can get stuck in one and never find its target, whereas BFS will always expand out to all nodes at the next depth and therefore will eventually find its target, even if cycles exist.
Put simply:
If your graph can have cycles and you're using DFS, then you must account for cycles. On the other hand, BFS provides the option to ignore cycles at the expense of efficiency, which is often acceptable when searching a small number of nodes.