Why is Depth-First Search said to suffer from infinite loops? - algorithm

I have read about DFS and BFS many times, but this doubt has been lingering in my mind for a long time. A lot of articles mention that DFS can get stuck in infinite loops.
As far as I know, this limitation can easily be removed by keeping track of the visited nodes. In fact, in all the books that I have read, this little check is a part of DFS.
So why are 'infinite loops' mentioned as a disadvantage of DFS? Is it just because the original DFS algorithm did not have this check for visited nodes? Please explain.

(1) In graph search algorithms (used frequently in AI), DFS's main advantage is space efficiency; this is its main advantage over BFS. However, if you keep track of visited nodes, you lose this advantage, since you need to store all visited nodes in memory. Don't forget that the set of visited nodes grows drastically over time and, for very large or infinite graphs, might not fit in memory.
(2) Sometimes DFS can end up in an infinite branch (in infinite graphs). An infinite branch is a branch that does not end (it always has "more children") and also does not lead to your target node, so DFS might keep expanding this branch infinitely and 'miss' the good branch that leads to the target node.
Bonus:
You can overcome this flaw in DFS, while maintaining a relatively small memory footprint, by using a combination of DFS and BFS: Iterative Deepening DFS.
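For illustration, here is a minimal Python sketch of iterative deepening DFS. The `neighbors(node)` successor function and the equality-based goal test are assumptions standing in for whatever interface your problem actually exposes:

```python
def depth_limited_dfs(node, goal, neighbors, limit, path):
    """DFS that refuses to descend deeper than `limit` levels."""
    if node == goal:
        return path
    if limit == 0:
        return None
    for child in neighbors(node):
        if child not in path:  # avoid cycles along the current path only
            found = depth_limited_dfs(child, goal, neighbors, limit - 1, path + [child])
            if found is not None:
                return found
    return None

def iddfs(start, goal, neighbors, max_depth=50):
    """Memory stays O(depth): only the current path is stored, not every visited node."""
    for limit in range(max_depth + 1):
        result = depth_limited_dfs(start, goal, neighbors, limit, [start])
        if result is not None:
            return result
    return None
```

Note the trade-off: nodes near the root are re-expanded on every iteration, but since most nodes of a tree live near the leaves, the repeated work only adds a constant factor.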

A conventional DFS algorithm does keep track of visited nodes. A local search algorithm, by contrast, does not track visited states and behaves with amnesia. So I think the 'loop' mainly refers to an infinite branch (a branch with infinitely many possible states). In that case, DFS simply goes deeper and deeper and becomes too focused on one branch.

If you do not check for cycles, then DFS can get stuck in one and never find its target whereas BFS will always expand out to all nodes at the next depth and therefore will eventually find its target, even if cycles exist.
Put simply:
If your graph can have cycles and you're using DFS, then you must account for cycles. On the other hand, BFS provides the option to ignore cycles at the expense of efficiency, which is often acceptable when searching a small number of nodes.
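For concreteness, here is a minimal sketch of the visited-set check being discussed, assuming the graph is given as an adjacency list (a dict mapping each node to a list of neighbors):

```python
def dfs(graph, start, goal):
    visited = set()
    stack = [start]
    while stack:
        node = stack.pop()
        if node == goal:
            return True
        if node in visited:
            continue
        visited.add(node)  # this one check is what prevents infinite loops on cyclic graphs
        stack.extend(graph[node])
    return False
```

Dropping the `visited` set makes this loop forever on any graph with a cycle reachable from `start`, which is exactly the failure mode the question asks about.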

Related

Infinite nodes in BFS vs DFS

People always talk about how if there are infinite nodes downwards, then DFS will get stuck traversing this infinitely long branch and never reaching the answer in another branch.
Isn't this applicable to BFS as well? For example if the root node has an infinite amount of neighbours, wouldn't the program just spend an infinite amount of time trying to add each one into a queue?
In some cases, yes.
However, in order to have an infinite graph you basically need an implicit graph (https://en.wikipedia.org/wiki/Implicit_graph), and many of them have bounded degree, which avoids that problem.
Additionally, another advantage of BFS over DFS is that a path with fewer vertices is often "better" in some way; by assigning a cost to the vertices, this can be formulated using algorithms like Dijkstra's, which in some cases can be extended even to unbounded degrees.
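To make "implicit graph with bounded degree" concrete, here is a toy sketch; the integer-doubling successor function is an invented example, not something from the linked article:

```python
from collections import deque

def successors(n):
    # Every node has exactly two children, so the graph is infinite,
    # but its degree is bounded and BFS still makes steady progress.
    return [2 * n, 2 * n + 1]

def bfs_implicit(start, goal, max_nodes=1_000_000):
    seen = {start}
    queue = deque([start])
    while queue and len(seen) < max_nodes:  # cap so the demo always terminates
        node = queue.popleft()
        if node == goal:
            return True
        for child in successors(node):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return False

print(bfs_implicit(1, 13))  # True: 13 is reached via 1 -> 3 -> 6 -> 13
```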
Yes, you are right: in the second case BFS will not make any real progress. For these theoretical infinite scenarios, let's discuss all three possible cases:
If the graph has infinitely many nodes downwards and finitely many neighbors, then we should use BFS (you already explained the reason).
But if the graph has infinitely many neighbors and finitely many nodes downwards, then we should use DFS, since while doing a DFS search we would be able to explore each neighbor's complete path in finite time and then move on to the next neighbor. Here, BFS wouldn't make any real progress while searching.
If the graph has both infinitely many neighbors and infinitely many nodes downwards, then DFS and BFS cease to differ, as we are dealing with infinity on both ends.

What is the point of IDA* vs A* algorithm

I don't understand how IDA* saves memory space.
From how I understand IDA* is A* with iterative deepening.
What's the difference between the amount of memory A* uses vs IDA*.
Wouldn't the last iteration of IDA* behave exactly like A* and use the same amount of memory? When I trace IDA*, I realize that it also has to remember a priority queue of the nodes that are below the f(n) threshold.
I understand that iterative-deepening depth-first search helps depth-first search by allowing it to do a breadth-first-like search while not having to remember every node. But I thought A* already behaves like depth-first search, in that it ignores some sub-trees along the way. How does iterative deepening make it use less memory?
Another question: depth-first search with iterative deepening lets you find the shortest path by making it behave breadth-first-like. But A* already returns the optimal shortest path (given that the heuristic is admissible). How does iterative deepening help it? I feel like IDA*'s last iteration is identical to A*.
In IDA*, unlike A*, you don't need to keep a set of tentative nodes which you intend to visit; therefore, your memory consumption is dedicated only to the local variables of the recursive function.
Although this algorithm has lower memory consumption, it has its own flaws:
Unlike A*, IDA* doesn't utilize dynamic programming and therefore often ends up exploring the same nodes many times. (IDA* In Wiki)
The heuristic function still needs to be specified for your case in order to avoid scanning the whole graph, yet the memory required at any moment is only the path you are currently scanning, without its surrounding nodes.
To compare the memory required by each algorithm: in A*, all of the nodes and their surrounding nodes need to be kept in the "need to visit" list, while in IDA* you get the next nodes "lazily" when you reach their previous node, so you don't need to keep them in an extra set.
As mentioned in the comments, IDA* is basically just IDDFS with heuristics.
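Here is a minimal IDA* sketch along the lines described above (it follows the standard formulation, e.g. the Wikipedia pseudocode). `h(node)` is assumed to be an admissible heuristic and `neighbors(node)` is assumed to yield `(child, step_cost)` pairs; both are stand-ins for your real problem:

```python
import math

def ida_star(start, goal, neighbors, h):
    def search(path, g, bound):
        node = path[-1]
        f = g + h(node)
        if f > bound:
            return f  # report the smallest f that exceeded the bound
        if node == goal:
            return path
        minimum = math.inf
        for child, cost in neighbors(node):
            if child not in path:  # only the current path is ever held in memory
                t = search(path + [child], g + cost, bound)
                if isinstance(t, list):
                    return t
                minimum = min(minimum, t)
        return minimum

    bound = h(start)
    while True:
        t = search([start], 0, bound)
        if isinstance(t, list):
            return t      # found a path
        if t == math.inf:
            return None   # no solution
        bound = t         # raise the f-bound and restart from scratch
```

Notice there is no priority queue anywhere: the only storage is the current path, which is where the memory savings over A* come from.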

Which Procedure we can use for Maze exploration BFS or DFS

I know we can use DFS for maze exploration, but I think we can also use BFS. I'm a little bit confused here because most of the books and articles that I've read use DFS for this problem.
What I think is that the best-case time complexity of DFS will be better compared to BFS, but the average- and worst-case time complexities will be the same for both BFS and DFS, and that's why we prefer DFS over BFS.
Am I right, or am I having some misconception?
I'm quite amazed that nobody has so far mentioned the difference in the results given by DFS and BFS.
The main difference between these two algorithms is that BFS returns the shortest path and DFS returns just a path.
So if you want to get the shortest path use BFS, otherwise consider other pros and cons (memory etc.)
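As a rough sketch of the BFS approach (assuming the maze is given as a list of strings where '#' marks a wall):

```python
from collections import deque

def shortest_path(maze, start, goal):
    """Return the length of the shortest path from start to goal, or -1."""
    rows, cols = len(maze), len(maze[0])
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        (r, c), dist = queue.popleft()
        if (r, c) == goal:
            return dist
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and maze[nr][nc] != '#' and (nr, nc) not in seen):
                seen.add((nr, nc))
                queue.append(((nr, nc), dist + 1))
    return -1
```

Because BFS explores cells in order of increasing distance, the first time it dequeues the goal the recorded distance is guaranteed to be minimal, which a DFS traversal cannot promise.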
They have similar running time, but either may greatly outperform the other on any given problem simply due to the order in which the cells are visited.
In terms of space usage, BFS will on average use more memory for trees, but for more general graphs, in certain cases, it could use significantly less memory.
For mazes specifically (if we define a maze as there being only one way to reach a cell from the starting point without backtracking, meaning it's essentially a tree), BFS will generally use more memory, as we'll need to keep multiple paths in memory at the same time, where DFS only needs to keep track of a single path at any given time.
For more general grids, it's much less obvious which one will be better in terms of memory, especially when considering how we keep track of cells we've visited thus far to prevent repeatedly visiting cells.
If you're not concerned about memory, you can pick either. If you're fairly comfortable with recursion, DFS should be easier to implement.
However, if you're looking for the shortest path to some given cell in a general grid, use BFS (or A*), as that guarantees to find the shortest path, where DFS does not (you can still use either in a maze where there's only a single path to any given cell).
Both should be equivalent. DFS is used more because it is a bit easier to implement.
BFS takes too much memory; it's not good for a huge maze.

Iterative deepening vs depth-first search

I keep reading about iterative deepening, but I don't understand how it differs from depth-first search.
I understood that depth-first search keeps going deeper and deeper.
In iterative deepening you establish a value of a level, if there is no solution at that level, you increment that value, and start again from scratch (the root).
Wouldn't this be the same thing as depth-first search?
I mean you would keep incrementing and incrementing, going deeper until you find a solution. I see this as the same thing! I would be going down the same branch, because if I start again from scratch I would go down the same branch as before.
In a depth-first search, you begin at some node in the graph and continuously explore deeper and deeper into the graph while you can find new nodes that you haven't yet reached (or until you find the solution). Any time the DFS runs out of moves, it backtracks to the latest point where it could make a different choice, then explores out from there. This can be a serious problem if your graph is extremely large and there's only one solution, since you might end up exploring the entire graph along one DFS path only to find the solution after looking at each node. Worse, if the graph is infinite (perhaps your graph consists of all the numbers, for example), the search might not terminate. Moreover, once you find the node you're looking for, you might not have the optimal path to it (you could have looped all over the graph looking for the solution even though it was right next to the start node!)
One potential fix to this problem would be to limit the depth of any one path taken by the DFS. For example, we might do a DFS search, but stop the search if we ever take a path of length greater than 5. This ensures that we never explore any node that's of distance greater than five from the start node, meaning that we never explore out infinitely or (unless the graph is extremely dense) we don't search the entire graph. However, this does mean that we might not find the node we're looking for, since we don't necessarily explore the entire graph.
The idea behind iterative deepening is to use this second approach but to keep increasing the depth at each level. In other words, we might try exploring using all paths of length one, then all paths of length two, then length three, etc. until we end up finding the node in question. This means that we never end up exploring along infinite dead-end paths, since the length of each path is capped by some length at each step. It also means that we find the shortest possible path to the destination node, since if we didn't find the node at depth d but did find it at depth d + 1, there can't be a path of length d (or we would have taken it), so the path of length d + 1 is indeed optimal.
The reason that this is different from a DFS is that it never runs into the case where it takes an extremely long and circuitous path around the graph without ever terminating. The lengths of the paths are always capped, so we never end up exploring unnecessary branches.
The reason that this is different from BFS is that in a BFS, you have to hold all of the fringe nodes in memory at once. This takes memory O(b^d), where b is the branching factor and d is the search depth. Compare this to the O(d) memory usage from iterative deepening (to hold the state for each of the d nodes in the current path). Of course, BFS never explores the same path multiple times, while iterative deepening may explore any path several times as it increases the depth limit. However, asymptotically the two have the same runtime. BFS terminates in O(b^d) steps after considering all O(b^d) nodes at distance d. Iterative deepening uses O(b^d) time per level, which sums up to O(b^d) overall, but with a higher constant factor.
In short:
DFS is not guaranteed to find an optimal path; iterative deepening is.
DFS may explore the entire graph before finding the target node; iterative deepening only does this if the distance between the start and end node is the maximum in the graph.
BFS and iterative deepening both run in time O(b^d), but iterative deepening likely has a higher constant factor.
BFS uses O(b^d) memory, while iterative deepening uses only O(d).
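For a back-of-the-envelope feel for that constant factor, assume a branching factor of b = 10 and a solution at depth d = 5:

```python
b, d = 10, 5
bfs_nodes = sum(b**i for i in range(d + 1))      # nodes BFS examines once each
idd_nodes = sum(sum(b**i for i in range(L + 1))  # nodes IDDFS examines, summed
                for L in range(d + 1))           # over every depth limit L = 0..d
print(bfs_nodes, idd_nodes)  # 111111 123456 -> only ~11% extra work
```

Because the deepest level dominates a geometric series, re-running the shallow levels over and over costs surprisingly little.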
There is a decent page on wikipedia about this.
The basic idea I think you missed is that iterative deepening is primarily a heuristic. When a solution is likely to be found close to the root, iterative deepening will find it relatively fast, while straightforward depth-first search could make a "wrong" decision and spend a lot of time on a fruitless deep branch.
(This is particularly important when the search tree can be infinite. In this case they are even less equivalent since DFS can get stuck forever while BFS or iterative deepening are sure to find the answer one day if it exists)
Just adding to what's already here, but here are some videos from University of Denver's Moving AI Lab that show the differences.
http://movingai.com/dfid.html
You can see in their examples that iterative deepening wins when the goal is shallow (solution depth 3) and the solution is on the right, but DFS wins no matter what if the solution is in the last row.
I got into this while reading about chess programming; the next thing for me was quiescence search. Check that out if you want to know more about search strategies for AI programming.

Question about breadth-first completeness vs depth-first incompleteness

According to Norvig in AIMA (Artificial Intelligence: A modern approach), the Depth-first algorithm is not complete (will not always produce a solution) because there are cases when the subtree being descended will be infinite.
On the other hand, the breadth-first approach is said to be complete if the branching factor is not infinite. But isn't that somewhat the same "thing" as the subtree being infinite in DFS?
Can't DFS be said to be complete if the tree's depth is finite? How is it, then, that BFS is complete and DFS is not, when the completeness of BFS relies on the branching factor being finite?
A tree can be infinite without having an infinite branching factor. As an example, consider the state tree for Rubik's Cube. Given a configuration of the cube, there is a finite number of moves (18, I believe, since a move consists of picking one of the 9 "planes" and rotating it in one of the two possible directions). However, the tree is infinitely deep, since it is perfectly possible to e.g. only rotate the same plane alternatingly back and forth (never making any progress). In order to prevent a DFS from doing this, one normally caches all the visited states (effectively pruning the state tree) - as you probably know. However, if the state space is too large (or actually infinite), this won't help.
I have not studied AI extensively, but I assume that the rationale for saying that BFS is complete while DFS is not (completeness is, after all, just a term that is defined somewhere) is that infinitely deep trees occur more frequently than trees with infinite branching factors (since having an infinite branching factor means that you have an infinite number of choices, which I believe is not common - games and simulations are usually discrete). Even for finite trees, BFS will normally perform better because DFS is very likely to start out on a wrong path, exploring a large portion of the tree before reaching the goal. Still, as you point out, in a finite tree, DFS will eventually find the solution if it exists.
DFS cannot get stuck in cycles (if we keep lists of open and closed states). Still, the algorithm is not complete, since it does not find a solution in an infinite space, even when the solution lies at some depth d that is much lower than infinity.
Imagine a strangely defined state space where each node has the same number of successors as the corresponding number in the Fibonacci sequence. It's recursively defined and therefore infinite. We're looking for node 2 (marked green in the graph). If DFS starts with the right branch of the tree, it will take an infinite number of steps to verify that our node is not there. Therefore it's not complete (it won't finish in reasonable time). BFS would find the solution in the 3rd iteration.
The Rubik's Cube state space is finite; it is huge, but finite (a human may get stuck in cycles, but DFS with visited-state tracking won't repeat the same state twice). DFS would find a very inefficient way to solve it, and sometimes this kind of solution is infeasible. Usually we consider the maximum depth infinite, but our resources (memory) are always finite.
The properties of depth-first search depend strongly on whether the graph-search or tree-search version is used. The graph-search version, which avoids repeated states and redundant paths, is complete in finite state spaces because it will eventually expand every node. The tree-search version, on the other hand, is not complete—for example, in Figure 3.6 the algorithm will follow the Arad–Sibiu–Arad–Sibiu loop forever.
Source: Artificial Intelligence: A Modern Approach
