I've looked for ways to count the number of connected components online. I noticed that on most sites, the algorithm used is depth-first search. I believe you can achieve the same thing with breadth-first search and union-find as well. So why do people prefer using DFS for finding the number of connected components?
Mainly because of two reasons:
It's simple and short. No extra data structure is required (well, we need a stack, but recursion takes care of that).
It's memory friendly. Breadth-first search has a memory complexity of O(V) (V is the number of nodes), while DFS has O(h) (h is the maximum depth of the recursion tree).
It is not better in terms of time complexity, as in all cases you will visit each node exactly once. In terms of memory usage, you will always have to know which nodes have been visited and which nodes have been found but not visited yet. Depth-first takes a child node and visits its descendants before its siblings, while breadth-first visits the siblings before the descendants. Depth-first is shorter, simpler and arguably more intuitive, though, which could be the reason it is chosen in tutorials, books and presentations more frequently than the others.
In many cases the stack to be used is handled by recursion, but that is not necessarily a good idea, especially in the case of large graphs.
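To make the point concrete, here is a minimal sketch of DFS-based component counting. The dict-of-adjacency-lists representation is my assumption for the example, not something from the question:

from typing import Dict, Hashable, List

def count_components(adj: Dict[Hashable, List[Hashable]]) -> int:
    # adj maps each node to a list of its neighbours (undirected graph).
    visited = set()

    def dfs(u):
        visited.add(u)
        for v in adj[u]:
            if v not in visited:
                dfs(v)

    components = 0
    for node in adj:
        if node not in visited:
            dfs(node)        # one DFS sweep visits an entire component
            components += 1
    return components

For example, count_components({0: [1], 1: [0], 2: []}) returns 2. Swapping the recursive dfs for a queue-based BFS would count exactly the same components; the DFS version is just the shortest to write.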
I keep reading about iterative deepening, but I don't understand how it differs from depth-first search.
I understood that depth-first search keeps going deeper and deeper.
In iterative deepening you establish a value of a level, if there is no solution at that level, you increment that value, and start again from scratch (the root).
Wouldn't this be the same thing as depth-first search?
I mean you would keep incrementing and incrementing, going deeper until you find a solution. I see this as the same thing! I would be going down the same branch, because if I start again from scratch I would go down the same branch as before.
In a depth-first search, you begin at some node in the graph and continuously explore deeper and deeper into the graph while you can find new nodes that you haven't yet reached (or until you find the solution). Any time the DFS runs out of moves, it backtracks to the latest point where it could make a different choice, then explores out from there. This can be a serious problem if your graph is extremely large and there's only one solution, since you might end up exploring the entire graph along one DFS path only to find the solution after looking at each node. Worse, if the graph is infinite (perhaps your graph consists of all the numbers, for example), the search might not terminate. Moreover, once you find the node you're looking for, you might not have the optimal path to it (you could have looped all over the graph looking for the solution even though it was right next to the start node!)
One potential fix to this problem would be to limit the depth of any one path taken by the DFS. For example, we might do a DFS search, but stop the search if we ever take a path of length greater than 5. This ensures that we never explore any node that's of distance greater than five from the start node, meaning that we never explore out infinitely or (unless the graph is extremely dense) we don't search the entire graph. However, this does mean that we might not find the node we're looking for, since we don't necessarily explore the entire graph.
The idea behind iterative deepening is to use this second approach but to keep increasing the depth limit on each iteration. In other words, we might try exploring using all paths of length one, then all paths of length two, then length three, etc., until we end up finding the node in question. This means that we never end up exploring along infinite dead-end paths, since the length of each path is capped at each step. It also means that we find the shortest possible path to the destination node, since if we didn't find the node at depth d but did find it at depth d + 1, there can't be a path of length d (or we would have taken it), so the path of length d + 1 is indeed optimal.
The reason that this is different from a DFS is that it never runs into the case where it takes an extremely long and circuitous path around the graph without ever terminating. The lengths of the paths are always capped, so we never end up exploring unnecessary branches.
The reason that this is different from BFS is that in a BFS, you have to hold all of the fringe nodes in memory at once. This takes memory O(b^d), where b is the branching factor and d is the search depth. Compare this to the O(d) memory usage from iterative deepening (to hold the state for each of the d nodes in the current path). Of course, BFS never explores the same path multiple times, while iterative deepening may explore any path several times as it increases the depth limit. However, asymptotically the two have the same runtime. BFS terminates in O(b^d) steps after considering all O(b^d) nodes at distance d. Iterative deepening uses O(b^i) time at each level i, which sums up to O(b^d) overall, but with a higher constant factor.
In short:
DFS is not guaranteed to find an optimal path; iterative deepening is.
DFS may explore the entire graph before finding the target node; iterative deepening only does this if the distance between the start and end node is the maximum in the graph.
BFS and iterative deepening both run in time O(b^d), but iterative deepening likely has a higher constant factor.
BFS uses O(b^d) memory, while iterative deepening uses only O(d).
There is a decent page on wikipedia about this.
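As a rough illustration of the scheme described above, here is a minimal iterative deepening sketch. It assumes a tree whose nodes expose a children list; the depth_limited helper name and interface are my invention for the example:

def depth_limited(node, goal, limit):
    # Plain DFS from node, but never deeper than `limit` edges.
    if node == goal:
        return [node]
    if limit == 0:
        return None
    for child in node.children:
        path = depth_limited(child, goal, limit - 1)
        if path is not None:
            return [node] + path
    return None

def iddfs(root, goal, max_depth):
    # Rerun the depth-limited search with limits 0, 1, 2, ...
    # The first limit that succeeds yields a shortest path (in edges).
    for limit in range(max_depth + 1):
        path = depth_limited(root, goal, limit)
        if path is not None:
            return path
    return None

Note how the only memory held at any moment is the current path, which is where the O(d) memory bound comes from.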
The basic idea I think you missed is that iterative deepening is primarily a heuristic. When a solution is likely to be found close to the root, iterative deepening will find it relatively fast, while straightforward depth-first search could make a "wrong" decision and spend a lot of time on a fruitless deep branch.
(This is particularly important when the search tree can be infinite. In this case they are even less equivalent, since DFS can get stuck forever, while BFS or iterative deepening are sure to find the answer one day if it exists.)
Just adding to what's already here, but here are some videos from University of Denver's Moving AI Lab that show the differences.
http://movingai.com/dfid.html
You can see in their examples that iterative deepening wins when the goal is shallow (solution depth 3, tree depth) and the solution is on the right, but DFS wins no matter what if the solution is in the last row.
I got into this reading about chess programming; next up for me was quiescence search. Check that out if you want to know more about search strategies for AI programming.
I have found many algorithms and approaches that talk about finding the shortest path or the best/optimal solution to a problem. However, what I want is an algorithm that finds the first K shortest paths from one point to another. The problem I'm facing is more like searching through a tree: at each step you take, there are multiple options, each one with its own weight. What kinds of algorithms are used to tackle this kind of problem?
There is a 2006 paper by Jose Santos comparing three different K-shortest-path finding algorithms.
Yen's algorithm implementation:
http://code.google.com/p/k-shortest-paths/
Easier algorithm & discussion:
Suggestions for KSPA on undirected graph
EDIT: apparently I clicked on a link, because I thought I was answering a new question; ignore this if, as is very likely, this question isn't important to you anymore.
Given the restricted version of the problem you're dealing with, this becomes a lot simpler to implement. The most important thing to notice is that in trees, shortest paths are the only paths between two nodes. So what you do is solve all-pairs shortest paths, which is O(n²) in trees by doing n BFS traversals, and then you take the k minimal values. This can probably be optimized in some way, but the naive approach is to sort the O(n²) distances in O(n² log n) time and take the k smallest values; with some bookkeeping, you can keep track of which distance corresponds to which path without any time-complexity overhead. This will give you better complexity than running a KSPA algorithm for O(n²) possible s-t pairs.
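A naive sketch of that approach, assuming the tree is given as a dict of adjacency lists with comparable node labels (those representation choices are mine, not the asker's):

from collections import deque
import heapq

def bfs_distances(adj, src):
    # Unweighted single-source distances via BFS.
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def k_shortest_tree_paths(adj, k):
    # n BFS runs give all pairwise distances; keep the k smallest.
    pairs = []
    for s in adj:
        for t, d in bfs_distances(adj, s).items():
            if s < t:                    # count each unordered pair once
                pairs.append((d, s, t))
    return heapq.nsmallest(k, pairs)     # (distance, s, t) triples

Because paths in a tree are unique, each (s, t) pair identifies its path, which is the bookkeeping mentioned above.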
If what you actually meant is fixing a source and get the k nodes with the smallest distance from that source, one BFS will do. In case you meant fixing both source and target, one BFS is enough as well.
I don't see how you can use the fact that all edges going from a node to the nodes in the level below have the same weight without knowing more about the structure of the tree.
I want to know how the results change when we use an open maze versus a closed maze for the DFS, BFS, and A* search algorithms. Is there any big difference in the output, like an increase in the number of expanded nodes, cost, etc.?
A naive DFS can go into an infinite loop on certain open mazes, whereas on a closed maze it will always finish. I don't think BFS or A* can fall into that trap. (By "naive DFS" I mean one that doesn't mark nodes as "visited" as it traverses them.)
Edit: Daniel's comment has forced me to rethink this answer in the light of day rather than the sleepy moments before I went to bed. I will concede that A* marks nodes as visited as part of its basic functioning. However, I still think BFS can solve even open mazes without marking nodes. It won't be efficient, but if there is a solution to the maze, BFS will find it. By definition, it tries all possible paths at a certain depth before moving on to the next depth. So if a solution exists with length 10, BFS will find it before trying any solutions of depth 11.
Yes, there is a big difference, as the different strategies traverse the maze in totally different orders.
A* can be quite efficient compared to naive DFS and BFS, but you need to find a good function to estimate the cost from your current position to the target.
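For instance, a generic A* sketch; the neighbors callback yielding (state, step_cost) pairs and the heuristic function are assumptions of this example, not a fixed API. States are assumed hashable and comparable, e.g. (row, col) cells in a maze, where Manhattan distance to the target is a common admissible heuristic:

import heapq

def astar(start, goal, neighbors, heuristic):
    # A* search: orders the frontier by f(n) = g(n) + h(n).
    # heuristic(n) must never overestimate the true remaining cost.
    frontier = [(heuristic(start), 0, start)]
    best_g = {start: 0}
    while frontier:
        f, g, node = heapq.heappop(frontier)
        if node == goal:
            return g                        # cost of a cheapest path
        if g > best_g.get(node, float("inf")):
            continue                        # stale heap entry, skip it
        for nxt, step in neighbors(node):
            ng = g + step
            if ng < best_g.get(nxt, float("inf")):
                best_g[nxt] = ng
                heapq.heappush(frontier, (ng + heuristic(nxt), ng, nxt))
    return None                             # goal unreachable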
I understand the differences between DFS and BFS, but I'm interested to know what factors to consider when choosing DFS vs BFS.
Things like avoiding DFS for very deep trees, etc.
That heavily depends on the structure of the search tree and the number and location of solutions (aka searched-for items).
- If you know a solution is not far from the root of the tree, a breadth-first search (BFS) might be better.
- If the tree is very deep and solutions are rare, depth-first search (DFS) might take an extremely long time, but BFS could be faster.
- If the tree is very wide, a BFS might need too much memory, so it might be completely impractical.
- If solutions are frequent but located deep in the tree, BFS could be impractical.
- If the search tree is very deep you will need to restrict the search depth for depth-first search (DFS) anyway (for example with iterative deepening).
But these are just rules of thumb; you'll probably need to experiment.
I think in practice you'll usually not use these algorithms in their pure form anyway. There could be heuristics that help to explore promising parts of the search space first, or you might want to modify your search algorithm to be able to parallelize it efficiently.
Depth-first Search
Depth-first searches are often used in simulations of games (and game-like situations in the real world). In a typical game you can choose one of several possible actions. Each choice leads to further choices, each of which leads to further choices, and so on into an ever-expanding tree-shaped graph of possibilities.
For example, in games like chess or tic-tac-toe, when you are deciding what move to make, you can mentally imagine a move, then your opponent's possible responses, then your responses, and so on. You can decide what to do by seeing which move leads to the best outcome.
Only some paths in a game tree lead to your win; some lead to a win by your opponent. When you reach such an ending, you must back up, or backtrack, to a previous node and try a different path. In this way you explore the tree until you find a path with a successful conclusion. Then you make the first move along this path.
Breadth-first search
The breadth-first search has an interesting property: It first finds all the vertices that are one edge away from the starting point, then all the vertices that are two edges away, and so on. This is useful if you’re trying to find the shortest path from the starting vertex to a given vertex. You start a BFS, and when you find the specified vertex, you know the path you’ve traced so far is the shortest path to the node. If there were a shorter path, the BFS would have found it already.
Breadth-first search can be used to find the neighbouring nodes in peer-to-peer networks like BitTorrent, in GPS systems to find nearby locations, in social networking sites to find people within a specified distance, and things like that.
Nice Explanation from
http://www.programmerinterview.com/index.php/data-structures/dfs-vs-bfs/
An example of BFS
Here's an example of what a BFS would look like. This is something like level-order tree traversal, where we use a QUEUE with an ITERATIVE approach (recursion mostly ends up as DFS). The numbers represent the order in which the nodes are accessed in a BFS:
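The original picture isn't reproduced here, but a minimal queue-based sketch of the level-order traversal it depicts might look like this (tree nodes with a children list are my assumed interface):

from collections import deque

def bfs_order(root):
    # Visit tree nodes level by level; returns them in BFS order.
    order, queue = [], deque([root])
    while queue:
        node = queue.popleft()
        order.append(node)
        for child in node.children:
            queue.append(child)
    return order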
In a depth-first search, you start at the root and follow one of the branches of the tree as far as possible until either the node you are looking for is found or you hit a leaf node (a node with no children). If you hit a leaf node, then you continue the search at the nearest ancestor with unexplored children.
An example of DFS
Here's an example of what a DFS would look like. Note that a post-order traversal of a binary tree starts its work from the leaf level first. The numbers represent the order in which the nodes are accessed in a DFS:
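Again, the picture isn't reproduced, but a sketch with an explicit stack (same assumed children interface, shown here as preorder for simplicity) gives the flavour of the visit order:

def dfs_order(root):
    # Preorder DFS using an explicit stack instead of recursion.
    order, stack = [], [root]
    while stack:
        node = stack.pop()
        order.append(node)
        for child in reversed(node.children):  # reversed so the leftmost child pops first
            stack.append(child)
    return order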
Differences between DFS and BFS
Comparing BFS and DFS, the big advantage of DFS is that it has much lower memory requirements than BFS, because it’s not necessary to store all of the child pointers at each level. Depending on the data and what you are looking for, either DFS or BFS could be advantageous.
For example, given a family tree if one were looking for someone on the tree who’s still alive, then it would be safe to assume that person would be on the bottom of the tree. This means that a BFS would take a very long time to reach that last level. A DFS, however, would find the goal faster. But, if one were looking for a family member who died a very long time ago, then that person would be closer to the top of the tree. Then, a BFS would usually be faster than a DFS. So, the advantages of either vary depending on the data and what you’re looking for.
One more example is Facebook's friend suggestions (friends of friends): we need immediate friends first, so we can use BFS. For tasks like detecting a cycle (using recursion), we can use DFS.
Breadth First Search is generally the best approach when the depth of the tree can vary, and you only need to search part of the tree for a solution. For example, finding the shortest path from a starting value to a final value is a good place to use BFS.
Depth First Search is commonly used when you need to search the entire tree. It's easier to implement (using recursion) than BFS, and requires less state: while BFS requires you to store the entire 'frontier', DFS only requires you to store the list of parent nodes of the current element.
DFS is more space-efficient than BFS, but may go to unnecessary depths.
Their names are revealing: if there's a big breadth (i.e. a big branching factor) but very limited depth (e.g. a limited number of "moves"), then DFS can be preferable to BFS.
On IDDFS
It should be mentioned that there's a less-known variant that combines the space efficiency of DFS with (cumulatively) the level-order visitation of BFS: iterative deepening depth-first search. This algorithm revisits some nodes, but that only contributes a constant factor of asymptotic difference.
When you approach this question as a programmer, one factor stands out: if you're using recursion, then depth-first search is simpler to implement, because you don't need to maintain an additional data structure containing the nodes yet to explore.
Here's depth-first search for an undirected graph if you're storing “already visited” information in the nodes:
def dfs(origin):                       # DFS from origin:
    origin.visited = True              # Mark the origin as visited
    for neighbor in origin.neighbors:  # Loop over the neighbors
        if not neighbor.visited:       # Visit each neighbor
            dfs(neighbor)              #   if not already visited
If storing “already visited” information in a separate data structure:
def dfs(node, visited):                # DFS from node, with already-visited set:
    visited.add(node)                  # Mark the node as visited
    for neighbor in node.neighbors:    # Loop over the neighbors
        if neighbor not in visited:    # If the neighbor hasn't been visited yet,
            dfs(neighbor, visited)     #   then visit the neighbor

dfs(origin, set())
Contrast this with breadth-first search where you need to maintain a separate data structure for the list of nodes yet to visit, no matter what.
One important advantage of BFS is that it can be used to find the shortest path between any two nodes in an unweighted graph, whereas we cannot use DFS for the same.
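A minimal sketch of that BFS shortest-path reconstruction, assuming an adjacency-list dict (the representation is my assumption for the example):

from collections import deque

def shortest_path(adj, source, target):
    # BFS from source; parent pointers let us rebuild the path.
    parent = {source: None}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if u == target:
            path = []
            while u is not None:       # walk parents back to the source
                path.append(u)
                u = parent[u]
            return path[::-1]          # source ... target
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                queue.append(v)
    return None                        # target unreachable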
The following is a comprehensive answer to what you are asking.
In simple terms:
The Breadth First Search (BFS) algorithm, as its name "breadth" suggests, discovers all the neighbours of a node through the out-edges of the node, then it discovers the unvisited neighbours of those neighbours through their out-edges, and so forth, till all the nodes reachable from the original source are visited (we can continue and take another original source if there are remaining unvisited nodes, and so forth). That's why it can be used to find the shortest path (if there is one) from a node (the original source) to another node if the weights of the edges are uniform.
The Depth First Search (DFS) algorithm, as its name "depth" suggests, discovers the unvisited neighbours of the most recently discovered node x through its out-edges. If there is no unvisited neighbour of node x, the algorithm backtracks to discover the unvisited neighbours of the node (through its out-edges) from which node x was discovered, and so forth, till all the nodes reachable from the original source are visited (we can continue and take another original source if there are remaining unvisited nodes, and so forth).
Both BFS and DFS can be incomplete. For example, if the branching factor of a node is infinite, or too big for the resources (memory) to support (e.g. when storing the nodes to be discovered next), then BFS is not complete even though the searched key may be only a few edges away from the original source. This infinite branching factor can arise from infinite choices (neighbouring nodes) available from a given node.
If the depth is infinite, or too big for the resources (memory) to support (e.g. when storing the nodes to be discovered next), then DFS is not complete even though the searched key could be the third neighbour of the original source. This infinite depth can arise when, for every node the algorithm discovers, there is at least one new choice (neighbouring node) that has not been visited before.
Therefore, we can conclude when to use BFS and DFS. Suppose we are dealing with a manageable limited branching factor and a manageable limited depth. If the searched node is shallow, i.e. reachable within a few edges of the original source, then it is better to use BFS. On the other hand, if the searched node is deep, i.e. reachable only after many edges from the original source, then it is better to use DFS.
For example, in a social network, if we want to search for people who have similar interests to a specific person, we can apply BFS from this person as the original source, because mostly these people will be his direct friends or friends of friends, i.e. one or two edges away.
On the other hand, if we want to search for people who have completely different interests from a specific person, we can apply DFS from this person as the original source, because mostly these people will be very far from him, i.e. a friend of a friend of a friend... that is, many edges away.
Applications of BFS and DFS can also vary because of the searching mechanism of each one. For example, we can use either BFS (assuming the branching factor is manageable) or DFS (assuming the depth is manageable) when we just want to check the reachability of one node from another, having no information on where that node might be. Also, both of them can solve the same tasks, like topological sorting of a graph (if it has one).
BFS can be used to find the shortest path, with unit-weight edges, from a node (the original source) to another. DFS, on the other hand, can be used to exhaust all the choices thanks to its nature of going in depth, like discovering the longest path between two nodes in an acyclic graph. DFS can also be used for cycle detection in a graph.
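As a sketch of that last point, DFS cycle detection in a directed graph; the three-colour scheme and the dict-of-lists representation are standard choices I'm assuming here, not something given in the answer:

def has_cycle(adj):
    # white = unvisited, grey = on the current DFS path, black = finished.
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {u: WHITE for u in adj}

    def dfs(u):
        colour[u] = GREY
        for v in adj[u]:
            if colour[v] == GREY:      # back edge to the current path: cycle
                return True
            if colour[v] == WHITE and dfs(v):
                return True
        colour[u] = BLACK
        return False

    return any(colour[u] == WHITE and dfs(u) for u in adj)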
In the end, if we have infinite depth and an infinite branching factor, we can use Iterative Deepening Search (IDS).
I think it depends on what problems you are facing.
shortest path on a simple graph -> BFS
all possible results -> DFS
search on a graph (treat a tree or matrix as a graph too) -> DFS
....
Some algorithms depend on particular properties of DFS (or BFS) to work. For example the Hopcroft and Tarjan algorithm for finding 2-connected components takes advantage of the fact that each already visited node encountered by DFS is on the path from root to the currently explored node.
For BFS, we can consider the Facebook example. We receive suggestions to add friends on our FB profile from other friends' profiles. Suppose A->B, while B->E and B->F; then A will get suggestions for E and F. They must be using BFS to read up to the second level.
DFS is more suited to scenarios where we want to forecast something based on the data we have from source to destination, as mentioned already for chess or sudoku.
One thing I see differently here: I believe DFS should be used for the shortest path because DFS will cover the whole path first, and then we can decide the best one. But as BFS uses a greedy approach, it might look like the shortest path, yet the final result might differ.
Let me know whether my understanding is wrong.
It comes down to the properties of DFS and BFS. For example, when we want to find the shortest path, we usually use BFS, since it can guarantee the 'shortest'. DFS can only guarantee that we can get from this point to that point; it cannot guarantee the 'shortest' path.
Because Depth-First Searches use a stack as the nodes are processed, backtracking is provided with DFS. Because Breadth-First Searches use a queue, not a stack, to keep track of what nodes are processed, backtracking is not provided with BFS.
When the tree width is very large and the depth is low, use DFS, as the recursion stack will not overflow. Use BFS when the width is low and the depth is very large to traverse the tree.
This is a good example to demonstrate that BFS is better than DFS in certain cases: https://leetcode.com/problems/01-matrix/
When correctly implemented, both solutions should only visit cells whose distance is farther than the current cell's distance + 1. But DFS is inefficient and repeatedly visits the same cell, resulting in O(n*n) complexity.
For example,
1,1,1,1,1,1,1,1,
1,1,1,1,1,1,1,1,
1,1,1,1,1,1,1,1,
0,0,0,0,0,0,0,0,
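To illustrate why BFS wins here, a sketch of the standard multi-source BFS for this problem (one common approach to 01-matrix, not necessarily the poster's exact solution):

from collections import deque

def update_matrix(mat):
    # Start BFS from every 0 at once; each cell is enqueued exactly once,
    # so the whole thing runs in O(rows * cols).
    rows, cols = len(mat), len(mat[0])
    dist = [[None] * cols for _ in range(rows)]
    queue = deque()
    for r in range(rows):
        for c in range(cols):
            if mat[r][c] == 0:
                dist[r][c] = 0
                queue.append((r, c))
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and dist[nr][nc] is None:
                dist[nr][nc] = dist[r][c] + 1
                queue.append((nr, nc))
    return dist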