How is backtracking used in a depth-first traversal?

Can anyone tell me in simple terms, how is backtracking used in a depth-first traversal? I am struggling to understand so I could use an example.
Thanks.

Backtracking is used in depth-first traversals every time you hit a “dead end”. It ensures that every path is eventually explored. For example, say you do a depth-first traversal of this tree, starting at A. (We’ll use the convention that left children go first.)

    A
   / \
  B   C
  |
  D
 / \
E   F

1. A -> B
B has a child! Nice, we’ll go there.
2. B -> D
D has 2 children! Nice, let’s go left first.
3. D -> E
Uh oh, E has no children! This is a dead end. This means we need to backtrack one level (go back “up” to D), and see if we have other paths we can search.
Yes — D has another unexplored child. Let’s go there.
4. D -> F
F has no children! This is another dead end. Let’s backtrack one level (“up” to D), and see if we have other paths to search.
No — D has no children left. Let’s backtrack another level (“up” to B).
No — B has no children left. Let’s backtrack another level.
Yes — A has another unexplored child! Let’s go there.
5. A -> C
C has no children. Let’s backtrack another level.
No — A has no children left to explore, and it’s the root. This means our traversal is done! :)
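The walkthrough above can be sketched in a few lines of Python (the adjacency structure matches the tree described in the steps; the function and variable names are my own):

```python
# Adjacency list for the tree above:
# A -> B, C; B -> D; D -> E, F; leaves have no children.
tree = {
    "A": ["B", "C"],
    "B": ["D"],
    "D": ["E", "F"],
    "C": [], "E": [], "F": [],
}

def dfs(node, order):
    order.append(node)        # visit the node
    for child in tree[node]:  # explore children left to right
        dfs(child, order)
    # returning from the recursive call IS the backtracking step:
    # we go back "up" to the parent and try its next unexplored child
    return order

print(dfs("A", []))  # ['A', 'B', 'D', 'E', 'F', 'C']
```

Note that in a recursive implementation the backtracking is implicit: hitting a node with no children simply means the `for` loop runs zero times and the call returns.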


Could Kruskal’s algorithm be implemented in this way instead of using a disjoint-set forest?

I am studying Kruskal's MST from this geeksforgeeks article. The steps given are:
Sort all the edges in non-decreasing order of their weight.
Pick the smallest edge. Check if it forms a cycle with the spanning tree formed so far. If cycle is not formed, include this edge. Else, discard it.
Repeat step (2) until there are (V-1) edges in the spanning tree.
I really don't feel any need to use a disjoint set. Instead, to check for a cycle, we can just store vertices in a visited array and mark them as true whenever an edge is selected. Then, looping through the edges, if we find an edge whose vertices are both in the visited array, we ignore that edge.
In other words, instead of storing a disjoint-set forest, can’t we just store an array of bits indicating which vertices have been linked to another edge in some previous step?
The approach you’re describing will not work properly in all cases. As an example, consider this line graph:
A -- B -- C -- D
Let’s assume A - B has weight 1, C - D has weight 2, and B - C has weight 3. What will Kruskal’s algorithm do here? First, it’ll add in A - B, then C - D, and then B - C.
Now imagine what your implementation will do. When we add A - B, you’ll mark A and B as having been visited. When we then add C - D, you’ll mark C and D as having been visited. But then when we try to add B - C, since both B and C are visited, you’ll decide not to add the edge, leaving a result that isn’t connected.
The issue here is that when building up an MST you may add edges linking nodes that have already been linked to other nodes in the past. The criterion for adding an edge is therefore less “have these nodes been linked before?” and more “is there already a path between these nodes?” That’s where the disjoint-set forest comes in.
It’s great that you’re poking and prodding conventional implementations and trying to find ways to improve them. You’ll learn a lot about those algorithms if you do! In this case, it just so happens that what you’re proposing doesn’t quite work, and seeing why it doesn’t work helps shed light on why the existing approach is what it is.
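To make the fix concrete, here is a minimal Python sketch of Kruskal's algorithm with a disjoint-set forest, run on the line graph above (the names and structure are my own, not from the question):

```python
# Disjoint-set forest with path compression.
parent = {}

def find(x):
    while parent[x] != x:
        parent[x] = parent[parent[x]]  # path compression
        x = parent[x]
    return x

def kruskal(vertices, edges):
    for v in vertices:
        parent[v] = v                  # each vertex starts in its own set
    mst = []
    for w, u, v in sorted(edges):      # non-decreasing order of weight
        ru, rv = find(u), find(v)
        if ru != rv:                   # no path between u and v yet
            parent[ru] = rv            # union the two components
            mst.append((u, v))
    return mst

# The line graph from the answer: A-B (1), C-D (2), B-C (3).
edges = [(1, "A", "B"), (2, "C", "D"), (3, "B", "C")]
print(kruskal("ABCD", edges))  # [('A', 'B'), ('C', 'D'), ('B', 'C')]
```

When B - C is considered, B and C have both been touched by earlier edges, so a visited-bit array would wrongly reject it; `find` instead reports that they are still in different components, so the edge is correctly added.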
I really don't feel any need to use a disjoint set. Instead, to check for a cycle, we can just store vertices in a visited array and mark them as true whenever an edge is selected. Then, looping through the edges, if we find an edge whose vertices are both in the visited array, we ignore that edge.
Yes, of course you can do that. The point of using a disjoint set in this algorithm is performance: a suitable disjoint-set implementation yields better asymptotic performance than a list-based check can.

Haskell depth-first-search for a graph

For hours now I have been trying to implement a depth-first search in Haskell. My depthFirst function is given a starting node and a graph. This is what I have so far, plus the definition of the graph datatype.
data Graph a = G [a] (BRfun a)
with:
type BRfun a = a -> [a]
current attempt:
depthFirst :: Eq a => a -> Graph a -> [a]
depthFirst a (G [a] sucs) = [a]
So if only one element is in the nodes list, that's the only one I have to put in the final list (I think that should be the termination condition).
But now I am struggling to create a recursive algorithm to first get the deepest nodes.
I've had one drink too many and have a somewhat fuzzy idea of what I'm talking about, but here's a solution I came up with.
import Data.List (foldl')

depthFirst :: Eq a => a -> Graph a -> [a]
depthFirst root (G _nodes edges) = reverse $ go [] root
  where
    go seen x
      | x `elem` seen = seen
      | otherwise     = foldl' go (x : seen) (edges x)
I use foldl' from Data.List here because we want to traverse nodes left-to-right, which is somewhat challenging with foldr. And straight up using foldl without ' is usually not a good idea, since it builds up thunks unless forced (while forcing is exactly what foldl' does).
So, the general idea, as I outlined in my comment, is as follows. Go down the tree the first chance you get, maintaining the list of nodes you've seen along the way. If a node has no outgoing edges, cool, you're done here. If you've already seen a given node, bail, you don't need infinite recursion.
Fold starts from current node prepended to the list of already seen nodes (at the beginning, empty list). Then, from left to right, it visits every node directly reachable from current node. At every "next" node, it builds reverse depth-first order of a subtree plus already seen nodes. Already seen nodes are carried over to each "next" node (left-to-right order). If there are no nodes reachable from current node, it returns just current node prepended to list of all seen nodes.
The list of seen nodes is reversed because prepending is O(1) while appending is O(n). Easier to reverse once and get complexity O(n) rather than append every time and get complexity of roughly O(n²). (Complexities are from the top of my head, and I'm more than a bit tipsy, so apply salt liberally.)
If elem x seen, function bails returning the list of all nodes seen so far. It makes sure we don't recurse into the nodes we've visited already, and hence avoids infinite recursion on cyclic graphs.
This is classical depth-first search. It could be optimized, and the potential for optimization is rather obvious (for one, elem x seen has O(n) worst-case complexity, while it could've been O(log n)). Feel free to improve on the code.
As a last bit of advice, the type of Graph doesn't guarantee that nodes are unique. A stricter implementation would look like this: data Graph a = G (Set a) (BRfun a), where Set is from Data.Set (or something similar). Given the stated definition with a list, it might be a good idea to relabel all nodes, e.g. nodes' = zip [1..] nodes or something like that.
For graph searches like DFS and BFS, you need to keep around a list of vertices that you've previously visited. This makes it possible to check if you've seen a vertex before, so that you don't visit a vertex twice (and this handles cycles too, although it can't actually detect for sure if cycles exist).
Here's my implementation. The visited list keeps track of which vertices have been visited. For each vertex we encounter, we check to see if it's been visited by traversing the list. When we "visit" a vertex (that is, in the else branch), we add the vertex to the list. The visited list is kept up-to-date by passing it around in the foldl.
In this approach, we can actually hijack the visited list for recording the depth-first order. Since we add vertices to the list when we first see them, the visited list is in reverse depth-first order. So we simply reverse it once the search has completed.
depthFirst source (G _ sucs) = reverse (search [] source)
  where
    search visited v =
      if v `elem` visited
        then visited -- already seen v, so skip it
        else foldl search (v : visited) (sucs v)
I'd recommend walking through how the code executes on a small graph to get a sense for how it works and why it is correct. For example, try it on the graph defined as follows, from source 0.
edges = [[1,2,3],[4],[5],[4,6],[5],[1],[4]]
g = G [0,1,2,3,4,5,6] (edges!!)
Finally, note that this implementation is correct but highly inefficient, taking time O(nm) for a graph of n vertices and m edges, because we traverse the visited list once per edge. In a more efficient implementation, you would want to keep around two data structures, one for looking up whether or not a vertex has been visited (such as a hash set or binary search tree) and a second one for writing down the depth-first ordering.
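As a sketch of that more efficient shape (in Python rather than Haskell, with a hash set for the visited test and a separate list for the ordering; the names are my own):

```python
def depth_first(source, succs):
    visited = set()   # O(1) membership tests
    order = []        # depth-first ordering, built front to back

    def search(v):
        if v in visited:
            return            # already seen v, so skip it
        visited.add(v)
        order.append(v)       # record v when it is first seen
        for w in succs(v):
            search(w)

    search(source)
    return order

# The same example graph as above: vertex i's successors are edges[i].
edges = [[1, 2, 3], [4], [5], [4, 6], [5], [1], [4]]
print(depth_first(0, lambda v: edges[v]))  # [0, 1, 4, 5, 2, 3, 6]
```

Because membership is checked in a set rather than by scanning a list, this version runs in O(n + m) time instead of O(nm).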

Dfs Vs Bfs confusion

From a topcoder article:
"In BFS we mark a vertex visited as we push it into the queue, not as we pop it in case of DFS."
NOTE: This is said in case of dfs implementation using explicit stack.(pseudo dfs).
My question is: why so? Why can't we mark a vertex visited after popping it from the queue, instead of while pushing it onto the queue, in the case of BFS?
Your confusion probably comes from thinking about trees too much, but BFS and DFS can be run on any graph. Consider for example a graph with a cycle like A-B-C-A. If you go breadth-first starting from A, you will first add B and C to the queue. Then you will pop B and, unless they were marked as visited, you will add C and A to the queue again, which is obviously wrong. If instead you go depth-first from A, you will visit B and from there go to C and then to A, unless A was already marked as visited.
So, in summary, you need to mark a vertex as seen as soon as you first see it, no matter which algorithm you take. However, if you only consider DAGs, you will find that things get a bit easier, because there you simply don't have any loop like the above. Anyway, the whole point is that you don't get stuck in a loop, and for that there are multiple variants. Setting a flag is one way, checking a set of visited vertices is another and in some cases like trees, you don't need to do anything but just iterate the edges in order.
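To see the difference concretely, here is a Python sketch of BFS on the triangle A-B-C-A, once marking on push and once marking on pop (the helper names and the push counter are my own):

```python
from collections import deque

graph = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B"]}

def bfs(start, mark_on_push):
    visited = set()
    queue = deque([start])
    if mark_on_push:
        visited.add(start)
    pushes = 0                  # count how many times we enqueue
    order = []
    while queue:
        v = queue.popleft()
        if not mark_on_push:
            if v in visited:
                continue        # duplicate entry in the queue, skip it
            visited.add(v)      # marking on pop
        order.append(v)
        for w in graph[v]:
            if w not in visited:
                queue.append(w)
                pushes += 1
                if mark_on_push:
                    visited.add(w)  # marking on push
    return order, pushes

print(bfs("A", mark_on_push=True))   # (['A', 'B', 'C'], 2)
print(bfs("A", mark_on_push=False))  # (['A', 'B', 'C'], 3)
```

Marking on pop still terminates here, but C gets enqueued twice (3 pushes instead of 2); on denser graphs those duplicates multiply, which is exactly the wasted work the quoted rule avoids.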

Understanding ordering within a graph when doing traversals

I'm trying to understand depth first and breadth first traversals within the context of a graph. Most visual examples I've seen use trees to illustrate the difference between the two. The ordering of nodes within a tree is much more intuitive than in a graph (at least to me) and it makes perfect sense that nodes would be ordered top down, left to right from the root node.
When dealing with graphs, I see no such natural ordering. I've seen an example with various nodes labeled A though F, where the author explains traversals with nodes assuming the lexical order of their label. This would seem to imply that the type of value represented by a node must be inherently comparable. Is this the case? Any clarification would be much appreciated!
Node values in graphs need not be comparable.
An intuitive/oversimplified way to think about BFS vs DFS is this:
In DFS, you choose a direction to move, then you go as far as you can in that direction until you hit a dead end. Then you backtrack as little as possible, until you find a different direction you can go in. Follow that to its end, then backtrack again, and so on.
In BFS, you sequentially take one step in every possible direction. Then you take two steps in every possible direction, and so on.
Consider the following simple graph (I've deliberately chosen labels that are not A, B, C... to avoid the implication that the ordering of labels matters):
Q --> X --> T
|     |
|     |
v     v
K --> W
A DFS starting at Q might proceed like this: Q to X to W (dead end), backtrack to X, go to T (dead end), backtrack to X and then to Q, go to K (dead end since W has already been visited).
A BFS starting at Q might proceed like this: Q, then X (one step away from Q), then K (one step away from Q), then W (two steps away from Q), then T (two steps away from Q).
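Since the labels are only ever iterated, never compared, here is a small Python sketch of both traversals on the graph above; the visit order comes entirely from the order in which each adjacency list happens to store its neighbors (chosen here to match the orders described, which is just one valid choice):

```python
from collections import deque

# The directed graph drawn above.
graph = {
    "Q": ["X", "K"],
    "X": ["W", "T"],
    "K": ["W"],
    "T": [],
    "W": [],
}

def dfs(v, seen=None):
    seen = seen if seen is not None else []
    if v in seen:
        return seen          # already visited: dead end, backtrack
    seen.append(v)
    for w in graph[v]:
        dfs(w, seen)
    return seen

def bfs(start):
    seen, queue = [start], deque([start])
    while queue:
        for w in graph[queue.popleft()]:
            if w not in seen:
                seen.append(w)
                queue.append(w)
    return seen

print(dfs("Q"))  # ['Q', 'X', 'W', 'T', 'K']
print(bfs("Q"))  # ['Q', 'X', 'K', 'W', 'T']
```

Reordering any adjacency list gives a different, equally valid DFS or BFS order, which is why no comparability of node values is required.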

How to resolve this game problem

I have a simple game problem to solve using A*:
We have several nodes in a tree. Each node contains:
a monster, with a power level and an element;
links to other nodes;
the bonus points we get after we kill the monster.
There are five elements: Metal, Wood, Water, Fire, Land.
Our character can only kill a monster if our element's score is greater than or equal to the monster's.
And after killing a monster we must add all of the bonus points to a single element's score; we can't split them up across several elements.
Goal: find the shortest way to a specific node.
My solution:
I will use A*:
heuristic: Dijkstra
find(mainCharacter, node, plusPoint) {
    // node here is the node with the smallest f
    shortestWay[5] ways;
    foreach (element in elements) {
        mainCharacter->element += plusPoint;
        if (mainCharacter can beat the monster in node) {
            bestNode = the node with the smallest f among node->neighbourNodes
            *ways[element]++ << the steps; we add the points to the first element at the very first path. It can be -1 if we can't go.
            find(mainCharacter, bestNode, node->plusPoint)
        }
    }
}
Our goal will be the *ways[element] with the smallest step.
My questions:
Is my solution correct and good enough?
Is there a better solution for this game?
Thanks first :)
I'm not sure A* is going to allow you to do this.
The main issue here is that your available nodes change as you explore new nodes. This means it might be worthwhile to backtrack sometimes.
Example: You are at node A, which opens to B and C. B opens to E. E opens to F, which opens to G, which opens to D. C opens to D which is your destination.
B is guarded by a power-2 elemental, and C by a power-4 elemental. You are at power 3. E, F, and G all have power-2 elementals.
From A you can only go to B and C. C is too powerful so you go to B. You could keep going around to yield A B E F G D, or you could backtrack: A B A C D. (After you take out B, C is no longer too powerful.)
So, you are going to end up doing a lot of re-evaluation in whatever algorithm you come up with. This isn't even bounded by O(n!), because of the potential backtracking.
The approach I would take is to look at the shortest route without backtracking. This is your upper bound, and should be easy to do with A* (I think...) or something like it. Then you can find the geographical paths (ignoring power levels) that are shorter than this distance. From there, you can start eliminating blocks in power until 1) you get them all down, or 2) your geographic distance required to acquire the additional power to get through the blocks pushes the distance over the upper bound.
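One way to see why plain A* over nodes isn't enough is to search over (node, set of monsters killed) states instead of over nodes alone. Here is a Python sketch of the example above; note the assumption (mine, not stated in the question) that killing a monster adds its power to yours, and that edges can be walked in both directions:

```python
from collections import deque

# The example graph from the answer, with edges walkable both ways.
graph = {
    "A": ["B", "C"],
    "B": ["A", "E"],
    "C": ["A", "D"],
    "D": ["C", "G"],
    "E": ["B", "F"],
    "F": ["E", "G"],
    "G": ["F", "D"],
}
guard = {"B": 2, "C": 4, "E": 2, "F": 2, "G": 2}  # elemental powers

def shortest_path(start, goal, power):
    # BFS over (node, frozenset of killed monsters): revisiting a node
    # with a DIFFERENT kill set is a genuinely new state, which is what
    # makes backtracking like A B A C D representable at all.
    initial = (start, frozenset())
    queue = deque([(initial, [start])])
    seen = {initial}
    while queue:
        (node, killed), path = queue.popleft()
        if node == goal:
            return path
        my_power = power + sum(guard[m] for m in killed)
        for nxt in graph[node]:
            if nxt in guard and nxt not in killed:
                if my_power < guard[nxt]:
                    continue              # can't beat this monster yet
                state = (nxt, killed | {nxt})
            else:
                state = (nxt, killed)
            if state not in seen:
                seen.add(state)
                queue.append((state, path + [nxt]))
    return None

print(shortest_path("A", "D", power=3))  # ['A', 'B', 'A', 'C', 'D']
```

Under these assumptions the search finds the backtracking route A B A C D (4 moves) rather than the detour A B E F G D (5 moves), which a node-only A* with visited marking could never produce. The cost is the blown-up state space: up to n · 2^m states for m monsters, which matches the answer's warning about re-evaluation.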
