Difference between 'backtracking' and 'branch and bound' - depth-first-search

In backtracking we use both BFS and DFS. Even in branch and bound we use both BFS and DFS, in addition to least-cost search.
So when do we use backtracking, and when do we use branch and bound?
Does using branch and bound decrease time complexity?
What is least-cost search in branch and bound?

Backtracking
It is used to find all possible solutions available to a problem.
It traverses the state space tree by DFS(Depth First Search) manner.
It realizes that it has made a bad choice & undoes the last choice by backing up.
It searches the state space tree until it has found a solution.
It involves feasibility function.
Branch-and-Bound
It is used to solve optimization problems.
It may traverse the tree in any manner, DFS or BFS.
It realizes that it already has a better optimal solution than the one the pre-solution leads to, so it abandons that pre-solution.
It completely searches the state space tree to get optimal solution.
It involves a bounding function.

Backtracking
Backtracking is a general concept to solve discrete constraint satisfaction problems (CSPs). It uses DFS. Once it reaches a point where it is clear that the solution cannot be completed, it goes back to the last point where there was a choice. This way it iterates over all potential solutions, sometimes aborting a branch a bit early.
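To make this concrete, here is a minimal backtracking sketch in Python, using N-queens as the example CSP (the function names are illustrative): the DFS makes one choice per row and undoes it when it leads nowhere.

```python
# Minimal backtracking sketch: N-queens as a constraint satisfaction problem.
# DFS places one queen per row; on a conflict it undoes the choice and backs up.

def solve_n_queens(n):
    solutions = []
    cols = []  # cols[r] = column of the queen placed in row r

    def safe(row, col):
        # Feasibility check: no shared column or diagonal with earlier rows.
        return all(c != col and abs(c - col) != row - r
                   for r, c in enumerate(cols))

    def place(row):
        if row == n:                      # all rows filled: a complete solution
            solutions.append(list(cols))
            return
        for col in range(n):
            if safe(row, col):
                cols.append(col)          # make a choice
                place(row + 1)            # explore deeper (DFS)
                cols.pop()                # undo the choice: backtrack

    place(0)
    return solutions

print(len(solve_n_queens(6)))  # 4 solutions for n = 6
```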
Branch-and-Bound
Branch-and-Bound (B&B) is a concept to solve discrete constrained optimization problems (COPs). They are similar to CSPs, but besides having the constraints they have an optimization criterion. In contrast to backtracking, B&B uses Breadth-First Search.
One part of the name, the bound, refers to the way B&B prunes the space of possible solutions: it uses a heuristic that yields an upper bound. If the bound shows that a subtree cannot improve on the best solution found so far, that subtree can be discarded.
Besides that, I don't see a difference to Backtracking.
Other Sources
There are other answers on the web which make very different statements:
Branch-and-Bound is backtracking with pruning (source)

Backtracking
Backtracking is a general algorithm for finding all (or some) solutions to some computational problems, notably constraint satisfaction problems, that incrementally builds candidates to the solutions, and abandons each partial candidate c ("backtracks") as soon as it determines that c cannot possibly be completed to a valid solution.
It enumerates a set of partial candidates that, in principle, could be completed in various ways to give all the possible solutions to the given problem. The completion is done incrementally, by a sequence of candidate extension steps.
Conceptually, the partial candidates are represented as the nodes of a tree structure, the potential search tree. Each partial candidate is the parent of the candidates that differ from it by a single extension step; the leaves of the tree are the partial candidates that cannot be extended any further.
It traverses this search tree recursively, from the root down, in depth-first order (DFS). It realizes that it has made a bad choice & undoes the last choice by backing up.
For more details: Sanjiv Bhatia's presentation on Backtracking for UMSL.
Branch And Bound
A branch-and-bound algorithm consists of a systematic enumeration of candidate solutions by means of state space search: the set of candidate solutions is thought of as forming a rooted tree with the full set at the root.
The algorithm explores branches of this tree, which represent subsets of the solution set. Before enumerating the candidate solutions of a branch, the branch is checked against upper and lower estimated bounds on the optimal solution, and is discarded if it cannot produce a better solution than the best one found so far by the algorithm.
It may traverse the tree in any following manner:
BFS (Breadth First Search) or (FIFO) Branch and Bound
D-Search or (LIFO) Branch and Bound
Least Cost Search or (LC) Branch and Bound
For more information: Sanjiv Bhatia's presentation on Branch and Bound for UMSL.
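To make the LC (least cost) variant concrete, here is a hedged Python sketch, assuming 0/1 knapsack as the optimization problem and the standard fractional-relaxation upper bound (both illustrative choices, not taken from the presentation above): a best-first search over a priority queue that discards any node whose bound cannot beat the best complete solution found so far.

```python
import heapq

# Least-cost (best-first) branch and bound sketch for 0/1 knapsack.
# Items are pre-sorted by value/weight; bound() is the fractional relaxation.

def bound(i, value, weight, items, capacity):
    # Upper bound: greedily add remaining items, allowing one fractional item.
    for v, w in items[i:]:
        if weight + w <= capacity:
            value += v
            weight += w
        else:
            return value + v * (capacity - weight) / w
    return value

def knapsack_bb(items, capacity):
    items = sorted(items, key=lambda vw: vw[0] / vw[1], reverse=True)
    best = 0
    # Max-heap via negated bound: always expand the most promising node first.
    heap = [(-bound(0, 0, 0, items, capacity), 0, 0, 0)]  # (-bound, i, value, weight)
    while heap:
        neg_ub, i, value, weight = heapq.heappop(heap)
        if -neg_ub <= best or i == len(items):
            continue  # prune: this subtree cannot beat the incumbent solution
        v, w = items[i]
        if weight + w <= capacity:        # branch 1: take item i
            best = max(best, value + v)
            heapq.heappush(heap, (-bound(i + 1, value + v, weight + w, items, capacity),
                                  i + 1, value + v, weight + w))
        # branch 2: skip item i
        heapq.heappush(heap, (-bound(i + 1, value, weight, items, capacity),
                              i + 1, value, weight))
    return best

print(knapsack_bb([(60, 10), (100, 20), (120, 30)], 50))  # 220
```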

Backtracking:
-solutions are selected from the whole solution space.
-traversed through DFS.
Branch and Bound:
-BFS traversal.
-here only fruitful (promising) nodes are generated rather than generating all possible ones.

Related

Unordered Tree Pattern Matching Algorithm

I am trying to find a reasonable algorithm to find the first tree pattern match in unordered, rooted trees. According to some research I have come across, this problem is NP-complete. I don't need to find every pattern match, I just need to find any pattern match that exists. Preferably, I would rather not have to perform "deletions" on my tree (nor do I want to make a copy to delete nodes from).
Another thing to note is that the tree will be updated between tree matching queries, so I'm also hoping that there may be some algorithms that take advantage of this fact, possibly using an online approach that keeps track of previous partial matches in the tree to optimize a future match.
Is there a straightforward algorithm that can solve this problem given the criteria I mentioned, but one that is still better than the pure naive brute force approach?
Note: my problem is similar to this previously asked question, but that question is specific to ordered trees.
According to http://www.sciencedirect.com/science/article/pii/S1570866704000644 the problem that is NP-complete is tree inclusion. That means that the pattern tree can fit in while potentially skipping generations. So, for instance, a tree with one root and 1000 leaves could fit into a tree that branches in 2, 10 levels deep. And because this problem is NP-complete, you cannot fundamentally do better than exponential growth as the trees grow.
But you can reduce that exponent and do much better than brute force. For example for each node in the tree record the maximum depth below it and total number of descendants. As you try to fit one tree into the other, stop searching whenever you're trying to fit a subtree with too much depth or too many children. This will let you avoid following a lot of lost causes.
You can also use dynamic programming to help. What you try to do is store, for each pair of nodes from the two trees, whether or not the subtree below one can be mapped into the other. When you're looking at whether a can go to b, what you first do is check where each child of a can be mapped among the children of b. If any child can't go anywhere, then you know that the answer is no. If all can go, then sort the children of a from the one fitting in the fewest places to the one fitting in the most. Now do a brute force search for how to fit the one into the other. You'll tend to find your dead ends very quickly with this way of organizing the search.
However if the trees are large, if the one won't fit into the other you can spend a very, very long time figuring that fact out.
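Here is a minimal Python sketch of just the pruning statistics suggested above (the Node class and function names are hypothetical; this is not a full unordered-tree-inclusion matcher):

```python
# Precompute, for every node, the height of its subtree and its total number
# of descendants, so a candidate mapping can be rejected in O(1) before any
# recursive matching work is done.

class Node:
    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)
        self.height = 0      # max depth below this node
        self.size = 0        # total number of descendants

def annotate(node):
    # Bottom-up pass filling in height and size for every node.
    for child in node.children:
        annotate(child)
    node.height = 1 + max((c.height for c in node.children), default=-1)
    node.size = len(node.children) + sum(c.size for c in node.children)

def may_fit(pattern, target):
    # Necessary conditions: a pattern subtree cannot embed into a target
    # subtree that is shallower or has fewer descendants.
    return pattern.height <= target.height and pattern.size <= target.size
```

Before recursing on a candidate pair (p, t), check may_fit(p, t) and memoize the boolean result per node pair; that combination is the dead-end avoidance described above.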

If memoization is top-down depth-first, and DP is bottom-up breadth-first, what are the top-down breadth-first / bottom-up depth-first equivalents?

I just read this short post about mental models for Recursive Memoization vs Dynamic Programming, written by Professor Krishnamurthi. In it, Krishnamurthi represents memoization's top-down structure as a recursion tree, and DP's bottom-up structure as a DAG where the source vertices are the first – likely smallest – subproblems solved, and the sink vertex is the final computation (essentially the graph is the same as the aforementioned recursive tree, but with all the edges flipped). Fair enough, that makes perfect sense.
Anyways, towards the end he gives a mental exercise to the reader:
Memoization is an optimization of a top-down, depth-first computation for an answer. DP is an optimization of a bottom-up, breadth-first computation for an answer.
We should naturally ask, what about
top-down, breadth-first
bottom-up, depth-first
Where do they fit into the space of techniques for avoiding recomputation by trading off space for time?
Do we already have names for them? If so, what? Or
have we been missing one or two important tricks? Or
is there a reason we don't have names for these?
However, he stops there, without giving his thoughts on these questions.
I'm lost, but here goes:
My interpretation is that a top-down, breadth-first computation would require a separate process for each function call. A bottom-up, depth-first approach would somehow piece together the final solution, as each trace reaches the "sink vertex". The solution would eventually "add up" to the right answer once all calls are made.
How off am I? Does anyone know the answer to his three questions?
Let's analyse what the edges in the two graphs mean. An edge from subproblem a to b represents a relation where a solution of b is used in the computation of a and must be solved before it. (The other way round in the other case.)
Does topological sort come to mind?
One way to do a topological sort is to perform a Depth First Search and on your way out of every node, process it. This is essentially what Recursive memoization does. You go down Depth First from every subproblem until you encounter one that you haven't solved (or a node you haven't visited) and you solve it.
Dynamic Programming, or the bottom-up breadth-first problem solving approach, involves solving smaller problems and constructing solutions to larger ones from them. This is the other approach to doing a topological sort: you visit a node with an in-degree of 0, process it, and then remove it. In DP, the smallest problems are solved first because they have a lower in-degree. (What counts as "smaller" depends on the problem at hand.)
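As a small illustration, here is a Python sketch of the two topological-sort strategies just described, on a made-up dependency graph (an edge u -> v below means "u depends on v", so v must be solved first):

```python
from collections import defaultdict

# Tiny subproblem DAG: "a" needs "b" and "c", which both need "d".
deps = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}

def topo_dfs(deps):
    # Memoization-style: DFS, emit each node on the way out (postorder).
    order, done = [], set()
    def visit(u):
        if u in done:
            return
        done.add(u)
        for v in deps[u]:
            visit(v)
        order.append(u)          # all dependencies were emitted first
    for u in deps:
        visit(u)
    return order

def topo_indegree(deps):
    # DP-style: repeatedly take a node none of whose dependencies are pending
    # (in-degree 0 once the edges are flipped to "must be solved before").
    pending = {u: len(vs) for u, vs in deps.items()}
    dependents = defaultdict(list)   # v -> nodes that depend on v
    for u, vs in deps.items():
        for v in vs:
            dependents[v].append(u)
    ready = [u for u, c in pending.items() if c == 0]
    order = []
    while ready:
        v = ready.pop()
        order.append(v)
        for u in dependents[v]:
            pending[u] -= 1
            if pending[u] == 0:
                ready.append(u)
    return order

print(topo_dfs(deps))        # ['d', 'b', 'c', 'a']
print(topo_indegree(deps))   # ['d', 'c', 'b', 'a'] (same constraints respected)
```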
The problem here is the generation of a sequence in which the set of subproblems must be solved. Neither top-down breadth-first nor bottom-up depth-first can produce one.
Top-down breadth-first will still end up doing something very similar to its depth-first counterpart, even if the process is separated into threads. There is an order in which the problems must be solved.
A bottom-up depth-first approach might be able to partially solve problems, but the end result would still be similar to its breadth-first counterpart. The subproblems would be solved in a similar order.
Given that these approaches offer almost no improvement over the established ones, do not translate well into analogies, and are tedious to implement, they haven't become established.
@AndyG's comment is pretty much on the point here. I also like @shebang's answer, but here's one that directly answers these questions in this context, not through reduction to another problem.
It's just not clear what a top-down, breadth-first solution would look like. But even if you somehow paused the computation to not do any sub-computations (one could imagine various continuation-based schemes that might enable this), there would be no point to doing so, because there would be sharing of sub-problems.
Likewise, it's unclear that a bottom-up, depth-first solution could solve the problem at all. If you proceed bottom-up but charge all the way up some spine of the computation, but the other sub-problems' solutions aren't already ready and lying in wait, then you'd be computing garbage.
Therefore, top-down, breadth-first offers no benefit, while bottom-up, depth-first doesn't even offer a solution.
Incidentally, a more up-to-date version of the above blog post is now a section in my text (this is the 2014 edition; expect updates).

Better heuristic than A*

I am enrolled in Stanford's ai-class.com and have just learned in my first week of lectures about the A* algorithm and how it is better than other search algorithms.
I also saw one of my classmates implement it on the 4x4 sliding block puzzle, which he has published at: http://george.mitsuoka.org/StanfordAI/slidingBlocks/
While I very much appreciate and thank George for implementing A* and publishing the result for our amusement,
I (and he also) were wondering if there is any way to make the process more optimized, or if there is a better heuristic for A*, i.e. a better heuristic function than the max of "number of blocks out of place" and "sum of distances to goals", that would speed things up.
Also, if there is a better algorithm than A* for such problems, I would like to know about it as well.
Thanks for the help, and in case of mistakes, before downvoting please allow me a chance to improve my approach, or if required to delete the question, as I am still learning the ways of Stack Overflow.
It depends on your heuristic function. For example, if you have a perfect heuristic [h*], then a greedy algorithm (*) will yield a better result than A*, and will still be optimal [since your heuristic is perfect!]. It will develop only the nodes needed for the solution. Unfortunately, it is seldom the case that you have a perfect heuristic.
(*) greedy algorithm: always develop the node with the lowest h value.
However, if your heuristic is very bad, h = 0, then A* degenerates into plain BFS (more precisely, uniform-cost search; the two coincide here because every move costs the same). In this case A* will develop O(B^d) nodes, where B is the branching factor and d is the number of steps required for solving.
In this case, since you have a single target state, a bi-directional search (*) will be more efficient, since it needs to develop only O(2*B^(d/2)) = O(B^(d/2)) nodes, which is much less than what A* will develop.
(*) bi-directional search: run BFS from the target and from the start node; each iteration makes one step from each side, and the algorithm ends when there is a common vertex in both frontiers.
For the average case, if you have a heuristic which is not perfect but not completely terrible, A* will probably perform better than both solutions.
Possible optimization for the average case: you can also run bi-directional search with A*: from the start side, run A* with your heuristic, and a regular BFS from the target side. Will it get a solution faster? No idea; you should probably benchmark the two possibilities and find which is better. However, the solution found with this algorithm will also be optimal, like BFS and A*.
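For concreteness, a hedged Python sketch of the bidirectional BFS described above, assuming unit edge costs and an adjacency-dict graph (the example graph is made up):

```python
# Bidirectional BFS: grow one layer per iteration, alternating sides, and stop
# once the two frontiers touch. Explores roughly O(2*B^(d/2)) nodes.

def bidirectional_bfs(graph, start, goal):
    if start == goal:
        return 0
    dist_a, dist_b = {start: 0}, {goal: 0}    # distances from each side
    front_a, front_b = {start}, {goal}
    while front_a and front_b:
        if len(front_a) > len(front_b):       # expand the smaller frontier
            front_a, front_b = front_b, front_a
            dist_a, dist_b = dist_b, dist_a
        best, next_front = None, set()
        for u in front_a:
            for v in graph[u]:
                if v in dist_b:               # frontiers met through edge u-v
                    cand = dist_a[u] + 1 + dist_b[v]
                    best = cand if best is None else min(best, cand)
                elif v not in dist_a:
                    dist_a[v] = dist_a[u] + 1
                    next_front.add(v)
        if best is not None:                  # finish the layer, take the min
            return best
        front_a = next_front
    return None                               # no path exists

graph = {1: [2, 3], 2: [1, 4], 3: [1, 4], 4: [2, 3, 5], 5: [4]}
print(bidirectional_bfs(graph, 1, 5))         # 3 (1 -> 2 -> 4 -> 5)
```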
The performance of A* depends on the quality of the expected-cost heuristic, as you learned in the videos. The closer your expected-cost heuristic matches the actual cost from that state, the fewer total states need to be expanded. There are also a number of variations that perform better under certain circumstances, for instance when faced with hardware restrictions in large state-space searches.
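As an illustration of how the heuristic drives A*, here is a hedged Python sketch for the 3x3 version of the puzzle (the 4x4 case only changes the constants), using the "sum of distances to goals" (Manhattan) heuristic from the question; the state encoding is an illustrative choice:

```python
import heapq

# A* for the 3x3 sliding puzzle. States are tuples of 9 tiles, 0 = blank;
# returns the number of moves in an optimal solution.

GOAL = (1, 2, 3, 4, 5, 6, 7, 8, 0)

def manhattan(state):
    # Sum over tiles of |row - goal_row| + |col - goal_col|; admissible and
    # consistent, so A* with it is optimal.
    total = 0
    for idx, tile in enumerate(state):
        if tile:
            goal_idx = tile - 1
            total += abs(idx // 3 - goal_idx // 3) + abs(idx % 3 - goal_idx % 3)
    return total

def neighbors(state):
    i = state.index(0)                       # blank position
    r, c = divmod(i, 3)
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < 3 and 0 <= nc < 3:
            j = nr * 3 + nc
            s = list(state)
            s[i], s[j] = s[j], s[i]          # slide the adjacent tile
            yield tuple(s)

def astar(start):
    g = {start: 0}
    heap = [(manhattan(start), start)]       # priority = f = g + h
    while heap:
        f, state = heapq.heappop(heap)
        if state == GOAL:
            return g[state]
        if f > g[state] + manhattan(state):  # stale queue entry, skip
            continue
        for nxt in neighbors(state):
            if nxt not in g or g[state] + 1 < g[nxt]:
                g[nxt] = g[state] + 1
                heapq.heappush(heap, (g[nxt] + manhattan(nxt), nxt))
    return None

print(astar((1, 2, 3, 4, 5, 6, 0, 7, 8)))    # 2 moves
```

Manhattan distance dominates the "number of blocks out of place" count, so A* with it never expands more nodes; stronger (but costlier to precompute) options exist, e.g. pattern databases.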

Difference between back tracking and dynamic programming

I heard the only difference between dynamic programming and backtracking is that DP allows overlapping of subproblems, e.g.
fib(n) = fib(n-1) + fib(n-2)
Is that right? Are there any other differences?
Also, I would like to know some common problems solved using these techniques.
There are two typical implementations of Dynamic Programming approach: bottom-to-top and top-to-bottom.
Top-to-bottom Dynamic Programming is nothing else than ordinary recursion, enhanced with memorizing the solutions for intermediate sub-problems. When a given sub-problem arises a second (third, fourth...) time, it is not solved from scratch, but instead the previously memorized solution is used right away. This technique is known under the name memoization (no 'r' before 'i').
This is actually what your example with the Fibonacci sequence is supposed to illustrate. Just use the recursive formula for the Fibonacci sequence, but build the table of fib(i) values along the way, and you get a top-to-bottom DP algorithm for this problem (so that, for example, if you need to calculate fib(5) a second time, you get it from the table instead of calculating it again).
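A minimal sketch of exactly that in Python (the table dictionary is the memo built along the way):

```python
table = {0: 0, 1: 1}   # memoized fib(i) values, built along the way

def fib(n):
    # Ordinary recursion, except solved sub-problems are looked up, not recomputed.
    if n not in table:
        table[n] = fib(n - 1) + fib(n - 2)
    return table[n]

print(fib(5))   # 5: fills table[2..5] on the way
print(fib(5))   # 5: the second call is a single dictionary lookup
```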
In Bottom-to-top Dynamic Programming the approach is also based on storing sub-solutions in memory, but they are solved in a different order (from smaller to bigger), and the resultant general structure of the algorithm is not recursive. LCS algorithm is a classic Bottom-to-top DP example.
Bottom-to-top DP algorithms are usually more efficient, but they are generally harder (and sometimes impossible) to build, since it is not always easy to predict which primitive sub-problems you are going to need to solve the whole original problem, and which path you have to take from small sub-problems to get to the final solution in the most efficient way.
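For reference, a minimal bottom-to-top sketch of the LCS example just mentioned, in Python (standard textbook recurrence; sub-solutions are filled from smaller prefixes to larger ones, with no recursion):

```python
def lcs_length(a, b):
    # dp[i][j] = length of the longest common subsequence of a[:i] and b[:j]
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1   # extend the common subsequence
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]

print(lcs_length("ABCBDAB", "BDCABA"))  # 4 (e.g. "BCAB")
```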
Dynamic programming also requires "optimal substructure".
According to Wikipedia:
Dynamic programming is a method of solving complex problems by breaking them down into simpler steps. It is applicable to problems that exhibit the properties of 1) overlapping subproblems which are only slightly smaller and 2) optimal substructure.
Backtracking is a general algorithm for finding all (or some) solutions to some computational problem, that incrementally builds candidates to the solutions, and abandons each partial candidate c ("backtracks") as soon as it determines that c cannot possibly be completed to a valid solution.
For a detailed discussion of "optimal substructure", please read the CLRS book.
Common problems for backtracking I can think of are:
Eight queen puzzle
Map coloring
Sudoku
DP problems:
This website at MIT has a good collection of DP problems with nice animated explanations.
A chapter from a book from a professor at Berkeley.
One more difference could be that dynamic programming problems usually rely on the principle of optimality. The principle of optimality states that in an optimal sequence of decisions or choices, each subsequence must also be optimal.
Backtracking problems are usually NOT optimal on their way! They can only be applied to problems which admit the concept of a partial candidate solution.
Say that we have a solution tree, whose leaves are the solutions for the original problem, and whose non-leaf nodes are the suboptimal solutions for part of the problem. We try to traverse the solution tree for the solutions.
Dynamic programming is more like BFS: we find all possible suboptimal solutions represented by the non-leaf nodes, and only grow the tree by one layer under those non-leaf nodes.
Backtracking is more like DFS: we grow the tree as deep as possible and prune the tree at one node if the solutions under the node are not what we expect.
Then there is one inference derived from the aforementioned theory: Dynamic programming usually takes more space than backtracking, because BFS usually takes more space than DFS (O(N) vs O(log N)). In fact, dynamic programming requires memorizing all the suboptimal solutions in the previous step for later use, while backtracking does not require that.
DP allows for solving a large, computationally intensive problem by breaking it down into subproblems whose solution requires only knowledge of the immediate prior solution. You will get a very good idea by picking up Needleman-Wunsch and solving a sample because it is so easy to see the application.
Backtracking seems to be more complicated: there, the solution tree is pruned if it is known that a specific path will not yield an optimal result.
Therefore one could say that backtracking optimizes for memory, whereas DP assumes that all the computations are performed and then the algorithm goes back, stepping through the lowest-cost nodes.
IMHO, the difference is very subtle since both (DP and BCKT) are used to explore all possibilities to solve a problem.
As of today, I see two subtleties:
BCKT is a brute force solution to a problem. DP is not a brute force solution. Thus, you might say: DP explores the solution space more optimally than BCKT. In practice, when you want to solve a problem using the DP strategy, it is recommended to first build a recursive solution. Well, that recursive solution could also be considered the BCKT solution.
There are hundreds of ways to explore a solution space (welcome to the world of optimization) "more optimally" than a brute force exploration. DP is DP because at its core it implements a mathematical recurrence relation, i.e., the current value is a combination of past values (bottom-to-top). So we might say that DP is DP because the problem space allows exploring its solution space via a recurrence relation. If you explore the solution space based on another idea, then that won't be a DP solution. As with any problem, the problem itself may lend itself to one optimization technique or another, based on its structure. The structure of some problems enables the DP optimization technique. In this sense, BCKT is more general, though not all problems allow BCKT either.
Example: Sudoku allows BCKT to explore its whole solution space. However, it does not allow the use of DP to explore its solution space more efficiently, since there is no recurrence relation anywhere that can be derived. However, there are other optimization techniques that fit the problem and improve brute force BCKT.
Example: Just get the minimum of a classic mathematical function. This problem does not allow BCKT to explore the state space of the problem.
Example: Any problem that can be solved using DP can also be solved using BCKT. In this sense, the recursive solution of the problem could be considered the BCKT solution.
Hope this helps a bit.
In a very simple sentence I can say: dynamic programming is a strategy to solve optimization problems, and an optimization problem asks for a minimum or maximum result (a single result). Backtracking, in contrast, is a brute force approach; it is not for optimization problems, but for when you have multiple results and you want all or some of them.
Depth-first node generation of the state space tree with a bounding function is called backtracking. Here the current node depends on the node that generated it.
Depth-first node generation of the state space tree with a memory function is called top-down dynamic programming. Here the current node depends on the nodes it generates.

Correct formulation of the A* algorithm

I'm looking at definitions of the A* path-finding algorithm, and it seems to be defined somewhat differently in different places.
The difference is in the action performed when going through the successors of a node, and finding that a successor is on the closed list.
One approach (suggested by Wikipedia, and this article) says: if the successor is on the closed list, just ignore it
Another approach (suggested here and here, for example) says: if the successor is on the closed list, examine its cost. If it's higher than the currently computed score, remove the item from the closed list for future examination.
I'm confused - which method is correct? Intuitively, the first makes more sense to me, but I wonder about the difference in definition. Is one of the definitions wrong, or are they somehow isomorphic?
The first approach is optimal only if the optimal path to any repeated state is always the first to be followed. This property holds if the heuristic function has the property of consistency (also called monotonicity). A heuristic function h is consistent if, for every node n and every successor n' of n, the estimated cost of reaching the goal from n is no greater than the step cost of getting to n' plus the estimated cost of reaching the goal from n': h(n) <= c(n, n') + h(n').
The second approach is optimal if the heuristic function is merely admissible, that is, it never overestimates the cost to reach the goal.
Every consistent heuristic function is also admissible. Although consistency is a stricter requirement than admissibility, one has to work quite hard to concoct heuristic functions that are admissible but not consistent.
Thus, even though the second approach is more general, as it works with a strictly larger class of heuristic functions, the first approach is usually sufficient in practice.
Reference: the subsection A* search: Minimizing the total estimated solution cost in section 4.1 Informed (Heuristic) Search Strategies of the book Artificial Intelligence: A Modern Approach.
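To see exactly where the two formulations differ, here is a hedged Python sketch of graph-search A* with a reopen_closed switch (all names are illustrative; neighbors(node) is assumed to yield (successor, edge_cost) pairs, and h is the heuristic):

```python
import heapq
from itertools import count

# Graph-search A* with both closed-list policies from the question.

def astar(start, goal, neighbors, h, reopen_closed):
    g = {start: 0}
    tie = count()                            # tie-breaker so states never compare
    open_heap = [(h(start), next(tie), start)]
    closed = set()
    while open_heap:
        _, _, node = heapq.heappop(open_heap)
        if node == goal:
            return g[node]
        if node in closed:
            continue                         # stale duplicate queue entry
        closed.add(node)
        for nxt, cost in neighbors(node):
            new_g = g[node] + cost
            if nxt in closed:
                if not reopen_closed or new_g >= g[nxt]:
                    continue                 # approach 1: ignore closed successors
                closed.discard(nxt)          # approach 2: cheaper path found, reopen
            if nxt not in g or new_g < g[nxt]:
                g[nxt] = new_g
                heapq.heappush(open_heap, (new_g + h(nxt), next(tie), nxt))
    return None                              # goal unreachable
```

With reopen_closed=False this is the first formulation (safe when h is consistent); with True it is the second (needed when h is merely admissible).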
