Can someone explain the branch and bound search technique for me? I need to find a path with the smallest cost from any start node to an end node of any random graph using branch and bound search algorithm.
The basic idea of B & B is:
When solving an optimisation problem ("Find an X satisfying criteria Y so as to minimise the cost f(X)"), you build a solution piece by piece -- at any point in time, you have a partial solution, which has a cost.
If the nature of the problem is such that the cost of a partial solution can only stay the same or go up as you continue adding pieces to it, then you know that there's no point continuing to add pieces to a partial solution if there's already a full solution with lower cost. In this case, you can abandon (or "prune", or "fathom") further processing of this partial solution.
Many problems have the latter property, making B & B a widely applicable algorithm technique.
The process of searching for solutions can be represented by a search tree, where the root node represents the starting point where no decisions have been made, and each edge leading from a node represents a decision about something to be included in a partial solution. Each node is a partial solution comprising the decisions made (edges) from the root to that node.
Example: if we want to solve a Sudoku puzzle, the root node would represent the board with just the originally supplied numbers filled in; there might be 9 edges from this root, each representing the decision to assign a number 1-9 to the top-left cell. Each of those 9 partial solution nodes could have 8 branches, representing the valid assignments to the cell at position (1, 2), and so on. Usually, each edge represents a recursion step in a program.
With B & B, in the best case a good solution is found early, meaning that unpromising areas of the search tree can be pruned near the root; but in the worst case, the entire tree of valid solutions will be generated. For this reason B & B is usually only used to solve problems for which no faster algorithm is known (such as NP-hard problems).
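To make the idea concrete for the shortest-path question, here is a minimal sketch of depth-first branch and bound on a small weighted graph. It assumes non-negative edge costs (so a partial path can never get cheaper) and a graph stored as a dict of dicts; the function name `branch_and_bound_path` and the example graph are made up for illustration.

```python
import math

def branch_and_bound_path(graph, start, goal):
    """Cheapest start->goal path by depth-first branch and bound.

    graph maps node -> {neighbour: non-negative edge cost}.
    Returns (cost, path), or (inf, None) if the goal is unreachable.
    """
    best_cost = math.inf
    best_path = None

    def extend(node, cost, path):
        nonlocal best_cost, best_path
        # Bound: costs never decrease, so this partial path cannot win.
        if cost >= best_cost:
            return
        if node == goal:
            best_cost, best_path = cost, list(path)
            return
        # Branch: try cheaper edges first to find a good solution early.
        for nxt, weight in sorted(graph.get(node, {}).items(), key=lambda kv: kv[1]):
            if nxt not in path:  # never revisit a node on the current path
                path.append(nxt)
                extend(nxt, cost + weight, path)
                path.pop()

    extend(start, 0, [start])
    return best_cost, best_path

example_graph = {
    "A": {"B": 1, "C": 4},
    "B": {"C": 2, "D": 6},
    "C": {"D": 1},
    "D": {},
}
print(branch_and_bound_path(example_graph, "A", "D"))  # (4, ['A', 'B', 'C', 'D'])
```

In this run the first full path found (A, B, C, D with cost 4) lets the search prune both B to D (cost 7) and A to C (cost 4) without exploring them further.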
This link provides a graphical representation of concepts related to B & B.
This link provides an explanation of the algorithm and sample C# code in a downloadable zip file.
Hope this helps.
There are a lot of references about branch and bound algorithms on the web.
Here you can find some theoretical explanation,
whereas the code in C# is here.
Fantastic answer #j_random_hacker !!!!
See pg 439 (example 18.2) in Papadimitriou and Steiglitz, Combinatorial Optimization.
This book is a classic, and it discusses your exact problem.
Please forgive me if I'm not using the correct terms or have overlooked an existing solution. I'm not experienced in search algorithms and the theories behind it. I just would like to solve a problem.
I've previously used what I was told to be the A* algorithm to solve a different problem. But reading up on it I've realized that what I learned is not quite what wikipedia tells me.
What I learned was:
Start at your origin node
Open a new solution for each path you can take
Recursively create a new subsolution for each path you can take from there
When you arrive at the same place with multiple solutions, drop those who took longer than the fastest
Now if I understand wikipedia correctly, this is what I was supposed to do:
Start at your origin node
Open a new solution for each path you can take
Order the solutions by "cost of path taken" + "estimated cost to target"
Take cheapest solution and create subsolutions for each possible path
order those solutions into the others then rinse repeat
I can see how this would help with not calculating quite as many solutions, but my problem is that I see no possibility to create an "optimistic" estimate.
I'm not searching for a path on a geographical map. I'm trying to find the best sequence of actions. There's a minimum sequence of - say - ABCDEFGH. You cannot do F before E, but repeating previous actions in a particular ordering might make later actions more efficient.
Do I need a different search algorithm? Do I do what I originally learned and just live with the fact that doing more work is the price for not having a good heuristic function?
I believe my teacher recognized this problem. And what I learned was simply A* with a heuristic function of h(n) = 0.
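For comparison, here is a rough sketch of the ordered search described above, with the heuristic as an optional parameter. With the default heuristic of 0 it degrades to uniform-cost search, which still finds the cheapest solution but expands more states. The names `a_star`, `is_goal`, `successors` and the toy graph are assumptions for the example, not anything from the question.

```python
import heapq
import itertools

def a_star(start, is_goal, successors, h=lambda state: 0):
    """Return (cost, path) of a cheapest path, or None if there is none.

    successors(state) yields (next_state, step_cost) pairs with step_cost >= 0.
    h must never overestimate the remaining cost; h = 0 always qualifies.
    """
    tie = itertools.count()  # tie-breaker so the heap never compares states
    frontier = [(h(start), next(tie), 0, start, [start])]
    best_g = {start: 0}
    while frontier:
        _, _, g, state, path = heapq.heappop(frontier)
        if is_goal(state):
            return g, path
        if g > best_g.get(state, float("inf")):
            continue  # stale entry: a cheaper route to this state was found
        for nxt, step_cost in successors(state):
            new_g = g + step_cost
            if new_g < best_g.get(nxt, float("inf")):
                best_g[nxt] = new_g
                heapq.heappush(
                    frontier, (new_g + h(nxt), next(tie), new_g, nxt, path + [nxt])
                )
    return None

toy = {"S": [("A", 2), ("B", 5)], "A": [("B", 1), ("G", 7)], "B": [("G", 2)], "G": []}
print(a_star("S", lambda s: s == "G", lambda s: toy[s]))  # (5, ['S', 'A', 'B', 'G'])
```

The `best_g` dictionary plays the role of "drop those who took longer than the fastest" from the first list: a state reached again at a higher cost is never expanded.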
I'm not searching for a path on a geographical map. I'm trying to find
the best sequence of actions. There's a minimum sequence of - say -
ABCDEFGH. You cannot do F before E but repeating previous actions in
particular ordering might make later actions more efficient.
It is not clear to me whether you can repeat one action, i.e., a solution is ABCDEFGH, but would ABBBBCDEFGH be possible?
If not, then you might be able to use the A* algorithm, implemented like this:
1. At some stage (say the first, "empty"), you have one of several actions
available.
2. The cost of going from Empty City to A City is the cost of action A.
3. The cost of going from Empty City to B city is the cost of action B.
When you've reached B, the cost of doing C is constant (if it is not, then you can't use A* as is) and you insert the cost of going from B City to C City as the cost of C.
So you can handle the case in which an action has different costs, provided that this difference is completely described by the previous state. For example, if you can only do C if you have done A or B, and the cost of C is 5 after A and 8 after B, you enter the "distance" from A to C as 5, and from B to C as 8.
If the cost of, say, D depends on the two previous states, you can still use a more complicated A* implementation where you define the virtual "cities" BC, AB and AC, and the distance from BC to D is "the cost of D having done B and C", and so on. The cost of reaching BC from A is "the cost of B given A, and the cost of C given A and B". So if these costs depend on the previous states, things get even more complicated.
In the end, the complexity of this revised A* will grow until it becomes your algorithm, where every state potentially depends on the sequence of all preceding states. The more this is true, the more convenient your algorithm is; the more every state carries a cost of its own, the more convenient A* is.
And of course the possibility of closed loops (visiting the same state/action twice, making this a cyclic graph) blows A* straight out of the water.
I'm looking for leads on algorithms to deduce the timeline/chronology of a series of novels. I've split the texts into days and created a database of relationships between them, e.g.: X is a month before Y, Y and Z are consecutive, date of Z is known, X is on a Tuesday, etc. There is uncertainty ('month' really only means roughly 30 days) and also contradictions. I can mark some relationships as more reliable than others to help resolve ambiguity and contradictions.
What kind of algorithms exist to deduce a best-fit chronology from this kind of data, assigning a highest-probability date to each day? At least time is 1-dimensional, but dealing with a complex relationship graph with inconsistencies seems non-trivial. I have a CS background so I can code something up, but some idea about the names of applicable algorithms would be helpful. I guess what I have is a graph with days as nodes and relationships as edges.
A simple, crude first approximation to your problem would be to store information like "A happened before B" in a directed graph with edges like "A -> B". Test the graph to see whether it is a Directed Acyclic Graph (DAG). If it is, the information is consistent in the sense that there is a consistent chronology of what happened before what else. You can get a sample linear chronology by printing a "topological sort" (topsort) of the DAG. If events C and D happened simultaneously or there is no information to say which came before the other, they might appear in the topsort as ABCD or ABDC. You can even get the topsort algorithm to print all possibilities (so both ABCD and ABDC) for further analysis using more detailed information.
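As a small illustration of the DAG test and topological sort, here is a sketch using Kahn's algorithm. The function name `chronology` and the event labels are placeholders, and it only handles plain "happened before" facts, not the fuzzy durations from the question.

```python
from collections import deque

def chronology(events, before):
    """Return one consistent ordering of events, or None if there is a cycle.

    before is a list of (earlier, later) pairs.
    """
    successors = {e: [] for e in events}
    indegree = {e: 0 for e in events}
    for earlier, later in before:
        successors[earlier].append(later)
        indegree[later] += 1

    # Events with nothing known to precede them can be placed first.
    ready = deque(e for e in events if indegree[e] == 0)
    order = []
    while ready:
        event = ready.popleft()
        order.append(event)
        for nxt in successors[event]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)

    # If some events were never released, they sit on a cycle (a contradiction).
    return order if len(order) == len(events) else None

print(chronology(["A", "B", "C", "D"], [("A", "B"), ("B", "C"), ("B", "D")]))
# ['A', 'B', 'C', 'D']
print(chronology(["A", "B"], [("A", "B"), ("B", "A")]))  # None
```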
If the graph you obtain is not a DAG, you can use an algorithm like Tarjan's algorithm to quickly identify "strongly connected components", which are areas of the graph which contain chronological contradictions in the form of cycles. You could then analyze them more closely to determine which less reliable edges might be removed to resolve contradictions. Another way to identify edges to remove to eliminate cycles is to search for "minimum feedback arc sets". That's NP-hard in general but if your strongly connected components are small the search could be feasible.
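A compact recursive sketch of Tarjan's algorithm follows, in case it helps to see how the contradictory regions are found. The graph representation (a dict from node to successor list) and the event names are assumptions, and for very large graphs the recursion would need to be made iterative.

```python
def strongly_connected_components(graph):
    """Tarjan's algorithm. graph maps node -> iterable of successor nodes."""
    index = {}        # discovery order of each node
    low = {}          # lowest discovery index reachable from the node
    stack = []
    on_stack = set()
    components = []
    counter = 0

    def visit(v):
        nonlocal counter
        index[v] = low[v] = counter
        counter += 1
        stack.append(v)
        on_stack.add(v)
        for w in graph.get(v, ()):
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:  # v is the root of a component
            component = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                component.append(w)
                if w == v:
                    break
            components.append(component)

    for v in graph:
        if v not in index:
            visit(v)
    return components

# Y -> Z -> Y is a chronological contradiction; X and W are consistent.
timeline = {"X": ["Y"], "Y": ["Z"], "Z": ["Y", "W"], "W": []}
conflicts = [sorted(c) for c in strongly_connected_components(timeline) if len(c) > 1]
print(conflicts)  # [['Y', 'Z']]
```

Only components with more than one node contain cycles, so those are the ones worth inspecting for unreliable edges.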
Constraint programming is what you need. In propagation-based CP, you alternate between (a) making a decision at the current choice point in the search tree and (b) propagating the consequences of that decision as far as you can. Notionally you do this by maintaining a domain D of possible values for each problem variable x such that D(x) is the set of values for x which have not yet been ruled out along the current search path. In your problem, you might be able to reduce it to a large set of Boolean variables, x_ij, where x_ij is true iff event i precedes event j. Initially D(x) = {true, false} for all variables. A decision is simply reducing the domain of an undecided variable (for a Boolean variable this means reducing its domain to a single value, true or false, which is the same as an assignment). If at any point along a search path D(x) becomes empty for any x, you have reached a dead-end and have to backtrack.
If you're smart, you will try to learn from each failure and also retreat as far back up the search tree as required to avoid redundant search (this is called backjumping -- for example, if you identify that the dead-end you reached at level 7 was caused by the choice you made at level 3, there's no point in backtracking just to level 6 because no solution exists in this subtree given the choice you made at level 3!).
Now, given you have different degrees of confidence in your data, you actually have an optimisation problem. That is, you're not just looking for a solution that satisfies all the constraints that must be true, but one which also best satisfies the other "soft" constraints according to the degree of trust you have in them. What you need to do here is decide on an objective function assigning a score to a given set of satisfied/violated partial constraints. You then want to prune your search whenever you find the current search path cannot improve on the best previously found solution.
If you do decide to go for the Boolean approach, you could profitably look into SAT solvers, which tear through these kinds of problems. But the first place I'd look is at MiniZinc, a CP language which maps on to a whole variety of state of the art constraint solvers.
Best of luck!
Given a bunch of sets of people (similar to):
[p1,p2,p3]
[p2,p3]
[p1]
[p1]
Select 1 from each set, trying to minimize the maximum number of times any one person is selected.
For the sets above, the max number of times a given person MUST be selected is 2.
I'm struggling to get an algorithm for this. I don't think it can be done with a greedy algorithm, more thinking along the lines of a dynamic programming solution.
Any hints on how to go about this? Or do any of you know any good websites about this stuff that I could have a look at?
This is neither dynamic programming nor greedy. Let's look at a different problem first -- can it be done by selecting every person at most once?
You have P people and S sets. Create a graph with S+P vertices, representing sets and people. There is an edge between person pi and set sj iff pi is an element of sj. This is a bipartite graph and the decision version of your problem is then equivalent to testing whether the maximum cardinality matching in that graph has size S.
As detailed on that page, this problem can be solved by using a maximum flow algorithm (note: if you don't know what I'm talking about, then take your time to read it now, as you won't understand the rest otherwise): first create a super-source, add an edge linking it to all people with capacity 1 (representing that each person may only be used once), then create a super-sink and add edges linking every set to that sink with capacity 1 (representing that each set may only be used once) and run a suitable max-flow algorithm between source and sink.
Now, let's consider a slightly different problem: can it be done by selecting every person at most k times?
If you paid attention to the remarks in the last paragraph, you should know the answer: just change the capacity of the edges leaving the super-source from 1 to k, to indicate that each person may be used up to k times in this case.
Therefore, you now have an algorithm to solve the decision problem in which people are selected at most k times. It's easy to see that if you can do it with k, then you can also do it with any value greater than k, that is, it's a monotonic function. Therefore, you can run a binary search on the decision version of the problem, looking for the smallest k possible that still works.
Note: You could also get rid of the binary search by testing each value of k sequentially, and augmenting the residual network obtained in the last run instead of starting from scratch. However, I decided to explain the binary search version as it's conceptually simpler.
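Here is a rough sketch of the binary-search version. Instead of a general max-flow routine, it uses augmenting paths directly on the bipartite graph with a capacity of k per person, which is the same flow computation specialised to this network. The function names are invented for the example, and it assumes there is at least one set and that no set is empty.

```python
def can_assign(sets, k):
    """True if one person can be picked per set with nobody picked more than k times."""
    chosen = {}  # person -> list of set indices currently assigned to that person

    def place(s, seen):
        # Try to give set s a person, re-routing earlier sets if needed.
        for person in sets[s]:
            if person in seen:
                continue
            seen.add(person)
            holders = chosen.setdefault(person, [])
            if len(holders) < k:          # spare capacity: take it
                holders.append(s)
                return True
            for i, other in enumerate(holders):
                if place(other, seen):    # move 'other' elsewhere (augmenting path)
                    holders[i] = s
                    return True
        return False

    return all(place(s, set()) for s in range(len(sets)))

def min_max_selections(sets):
    """Smallest k for which can_assign succeeds, found by binary search."""
    lo, hi = 1, len(sets)  # k = len(sets) always works when no set is empty
    while lo < hi:
        mid = (lo + hi) // 2
        if can_assign(sets, mid):
            hi = mid
        else:
            lo = mid + 1
    return lo

people_sets = [["p1", "p2", "p3"], ["p2", "p3"], ["p1"], ["p1"]]
print(min_max_selections(people_sets))  # 2
```

For the sets from the question, k = 1 fails because the two `[p1]` sets compete for the same person, and k = 2 succeeds.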
I'm trying to improve my current algorithm for the 8 Queens problem, and this is the first time I'm really dealing with algorithm design/algorithms. I want to implement a depth-first search combined with a permutation of the different Y values described here:
http://en.wikipedia.org/wiki/Eight_queens_puzzle#The_eight_queens_puzzle_as_an_exercise_in_algorithm_design
I've implemented the permutation part to solve the problem, but I'm having a little trouble wrapping my mind around the depth-first search. It is described as a way of traversing a tree/graph, but does it generate the tree/graph? It seems this method would be more efficient only if the depth-first search generates the tree structure to be traversed, by implementing some logic to only generate certain parts of the tree.
So essentially, I would have to create an algorithm that generates a pruned tree of lexicographic permutations. I know how to implement the pruning logic, but I'm just not sure how to tie it in with the permutation generator since I've been using next_permutation.
Are there any resources that could help me with the basics of depth-first searches or creating lexicographic permutations in tree form?
In general, yes, the idea of the depth-first search is that you won't have to generate (or "visit" or "expand") every node.
In the case of the Eight Queens problem, if you place a queen such that it can attack another queen, you can abort that branch; it cannot lead to a solution.
If you were solving a variant of Eight Queens such that your goal was to find one solution, not all 92, then you could quit as soon as you found one.
More generally, if you were solving a less discrete problem, like finding the "best" arrangement of queens according to some measure, then you could abort a branch as soon as you knew it could not lead to a final state better than a final state you'd already found on another branch. This is related to the A* search algorithm.
Even more generally, if you are attacking a really big problem (like chess), you may be satisfied with a solution that is not exact, so you can abort a branch that probably can't lead to a solution better than one you've already found.
The DFS algorithm itself does not generate the tree/graph. If you want to build the tree or graph, it's as simple as building it while you perform the search. If you only want to find one solution, a flat LIFO data structure like a linked list will suffice for this: when you visit a new node, append it to the list. When you leave a node to backtrack in the search, pop the node off.
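A minimal sketch of that idea for Eight Queens: the search tree is never stored, only the current path, kept in a plain list used as a stack. A branch is abandoned as soon as a queen would be attacked. The function name `solve_queens` and the one-queen-per-column layout are choices made for this example.

```python
def solve_queens(n=8):
    """Return all solutions as tuples; rows[c] is the row of the queen in column c."""
    solutions = []
    rows = []  # current partial placement, used as a LIFO stack

    def safe(row):
        col = len(rows)
        # Reject a shared row or a shared diagonal with any earlier queen.
        return all(r != row and abs(r - row) != col - c for c, r in enumerate(rows))

    def dfs():
        if len(rows) == n:
            solutions.append(tuple(rows))
            return
        for row in range(n):
            if safe(row):         # prune: attacked squares are never expanded
                rows.append(row)  # visit the child node
                dfs()
                rows.pop()        # backtrack

    dfs()
    return solutions

print(len(solve_queens(8)))  # 92
```

Because every column gets exactly one queen and rows are never repeated, each complete placement is a permutation of the row numbers, so this generates the pruned permutation tree directly without a separate permutation generator.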
A book called "Introduction to the Design and Analysis of Algorithms" by Anany Levitin has a proper explanation that should help your understanding. He also provides a solution to the 8 queens problem just the way you described it. It will help you for sure.
As I understand it, for finding one solution you don't need any permutation; all you need is DFS. That alone will suffice for finding a solution.
Odd question here, not really code but logic. I hope it's OK to post it here. Here it is:
I have a data structure that can be thought of as a graph.
Each node can support many links, but the number of links is limited by a per-node value.
All links are bidirectional, and each link has a cost. The cost depends on the Euclidean distance between the nodes, the minimum value of a parameter of the two nodes, and a global modifier.
I wish to find the maximum cost for the graph.
I'm wondering if there is a clever way to find such a matching, rather than going through it by brute force, which is ugly, and I'm not sure how I'd even do that without spending 7 million years running it.
To clarify:
Global variable = T
many nodes N each have E,X,Y,L
L is the max number of links each node can have.
cost of link A,B = Sqrt( min([a].e | [b].e) ) x
( 1 + Sqrt( sqrt(sqr([a].x-[b].x)+sqr([a].y-[b].y)))/75 + Sqrt(t)/10 )
total cost =sum all links.....and we wish to maximize this.
average values for nodes is 40-50 can range to (20..600)
average node linking factor is 3 range 0-10.
For the sake of completeness, for anybody else who looks at this article, I would suggest revisiting your graph theory algorithms:
Dijkstra
Astar
Greedy
Depth / Breadth First
Even dynamic programming (in some situations)
etc.
In there somewhere is the correct solution for your problem. I would suggest looking at Dijkstra first.
I hope this helps someone.
If I understand the problem correctly, there is likely no polynomial solution. Therefore I would implement the following algorithm:
Find some solution by being greedy. To do that, you sort all edges by cost and then go through them starting with the highest, adding an edge to your graph while possible, and skipping it when one of its nodes can't accept more edges.
Look at your edges and try to change them to achieve a higher cost by using a heuristic. The first that comes to my mind: you cycle through all 4-tuples of nodes (A,B,C,D) and if your current graph has edges AB, CD but AC, BD would be better, then you make the change.
Optionally the same thing with 6-tuples, or other genetic algorithms (they are called that way because they work by mutations).
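The greedy first step could be sketched like this. It takes the link limits and precomputed link costs as plain dictionaries; the function name `greedy_links` and the four-node example are made up, and the cost values are arbitrary numbers rather than results of the cost formula from the question.

```python
def greedy_links(limits, costs):
    """Greedy starting solution for the link-selection problem.

    limits maps node -> maximum number of links (L in the question).
    costs maps (a, b) -> cost of linking a and b.
    Returns (chosen_edges, total_cost).
    """
    degree = {node: 0 for node in limits}
    chosen = []
    # Most expensive edges first, since the total is being maximised.
    for (a, b), cost in sorted(costs.items(), key=lambda kv: kv[1], reverse=True):
        if degree[a] < limits[a] and degree[b] < limits[b]:
            chosen.append((a, b))
            degree[a] += 1
            degree[b] += 1
    return chosen, sum(costs[edge] for edge in chosen)

link_limits = {"A": 1, "B": 1, "C": 1, "D": 1}
link_costs = {("A", "B"): 10, ("A", "C"): 9, ("B", "D"): 9, ("C", "D"): 1}
print(greedy_links(link_limits, link_costs))  # ([('A', 'B'), ('C', 'D')], 11)
```

This example is chosen so that greedy alone is not optimal: swapping AB, CD for AC, BD, as in the 4-tuple step, raises the total from 11 to 18.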
This is equivalent to the traveling salesman problem (and is therefore NP-Complete) since if you could solve this problem efficiently, you could solve TSP simply by replacing each cost with its reciprocal.
This means you can't solve exactly. On the other hand, it means that you can do exactly as I said (replace each cost with its reciprocal) and then use any of the known TSP approximation methods on this problem.
Seems like a max flow problem to me.
Is it possible to do this by greedily selecting the next most expensive option from any given start point (omitting jumps to visited nodes) and stopping once all nodes are visited? If you get to a dead end, backtrack to the previous spot where you are not at a dead end and greedily select again. It would require some work and probably something like a stack to keep your paths in. I think this would work quite effectively provided the costs are well ordered and non-negative.
Use genetic algorithms. They are designed to solve the kind of problem you state, rapidly reducing time complexity. Check for an AI library in your language of choice.