Decrease and Conquer in the Real World [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 9 years ago.
Can anyone suggest real-world problems solved by these algorithms: insertion sort, breadth-first search, depth-first search, or topological sorting? Thank you.
Real-world examples of recursion
I saw samples there, but what I need are specific problems for the insertion sort, breadth-first search, depth-first search, or topological sorting algorithms.
I hope you can help me.

How more real can it get than our daily, humdrum lives?
Insertion sort is what we (or at least I) most commonly use when we need to sort things by hand. Consider a deck of cards: one would go over them one by one, inserting each card into its proper place among the cards already sorted. Or a pile of paperwork which needs to be sorted by date: same algorithm.
In CS, insertion sort is less commonly used because we have much better algorithms for large inputs (quicksort and merge sort come to mind). A human can perform those as well, but that would be a much more tedious task indeed.
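The card-sorting analogy translates almost line for line into code; here is a minimal sketch:

```python
def insertion_sort(cards):
    """Sort a list in place, the way you'd sort a hand of cards:
    take each new card and slide it left into its proper place."""
    for i in range(1, len(cards)):
        card = cards[i]
        j = i - 1
        # Shift larger cards one slot to the right to make room.
        while j >= 0 and cards[j] > card:
            cards[j + 1] = cards[j]
            j -= 1
        cards[j + 1] = card
    return cards
```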
Breadth-first search's use is in the name: When we want to go over a tree horizontally, and not vertically. Let's say you heard your family is connected to Ilya Repin, a Russian artist. You go to the attic, crack open the giant wooden chest that's been gathering dust, and take out the old family tree which dates back to the 19th century (doesn't everyone have that?). You know he's closer to the top of the tree than he is to the bottom, so you go breadth-first: Take the first line, followed by the second, and so on...just a little more...Efim Repin...bingo!
If Ilya Repin happened to be in the leftmost branch of the tree, then depth-first would've made more sense. However, in the average case, you'd want to go breadth-first because we know our target is closer to the root than it is to the leaves. In CS there are a buttload of use cases (Cheney's algorithm, A*, etc.; you can see several more on Wikipedia).
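The level-by-level family-tree hunt above can be sketched like this (the names and tree shape here are just illustrative):

```python
from collections import deque

def bfs_find(tree, root, target):
    """Search a tree level by level (top rows first). `tree` maps
    a person to their children; returns the depth where `target`
    is found, or None if absent."""
    queue = deque([(root, 0)])
    while queue:
        node, depth = queue.popleft()
        if node == target:
            return depth
        for child in tree.get(node, []):
            queue.append((child, depth + 1))
    return None
```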
Depth-first search is used when we... drumroll ...want to go to the depth of the tree first, travel vertically. There are so many uses I can't even begin, but the simplest and most common is solving a cereal-box maze. You go through one path until you reach a dead-end, and then you backtrack. We don't do it perfectly, since we sometimes skip over a path or forget which ones we took, but we still do it.
In CS, there are a whole lot of use cases, so I'll redirect you to Wikipedia again.
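The cereal-box maze idea, following one path until a dead end and then backtracking, fits a short recursive sketch (the grid encoding and coordinates here are my own illustration):

```python
def solve_maze(maze, pos, goal, path=None):
    """Depth-first search with backtracking on a grid maze where
    maze[r][c] == 1 is a wall. Returns a list of (row, col) steps
    from pos to goal, or None if there is no way through."""
    if path is None:
        path = []
    r, c = pos
    if (r < 0 or r >= len(maze) or c < 0 or c >= len(maze[0])
            or maze[r][c] == 1 or pos in path):
        return None  # off the grid, a wall, or already visited
    path = path + [pos]
    if pos == goal:
        return path
    for step in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
        found = solve_maze(maze, step, goal, path)
        if found:
            return found
    return None  # dead end: backtrack
```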
Topological sort is something some of us use in the back of our heads, but it's easily seen in chefs, cooks, programmers, anyone and everyone who has to do an ordered set of tasks. My grandmother made the best cannelloni I've ever eaten, and her very simple recipe consisted of several simple steps (which I've managed to forget, but here is their very general outline): making the "pancake" wrapper, making the sauce, and wrapping them together. Now, we can't wrap these two up without making them, so naturally, we first have to make the wrapper and sauce, and only then wrap 'em.
In CS it's used for exactly the same thing: scheduling. I think Excel uses it to calculate dependent spreadsheet formulas (or maybe it just uses a simple recursive algorithm, which may be less efficient). For some more, you can see our good friend Wikipedia and a random article I found.

I work with hierarchical data structures, and I'm always in need of BFS to find objects nested under a specific root...
e.g. Find(/Replace)
Sometimes (significantly less often) I use DFS in order to check some design constraints that cannot be evaluated without investigating the leaves.
Though not used by me, and not exactly BFS:
GPS navigation software uses A* to search for a good path,
which is kind of a "weighted BFS".

Insertion sort - hardly any; it's good for learning. Outside computers it is often used to sort, for example, cards. In real-world code, merge sort or quicksort are better.
BFS - finding connected nodes, finding shortest paths in unweighted graphs. The basis for Dijkstra's algorithm and A* (a faster, heuristic-guided version of Dijkstra's).
DFS - finding connected nodes, numbering nodes in tree.
Topological sorting - finding correct order of tasks.

Related

Genetic Algorithm - better crossover/mutation algorithm? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
For the basic genetic algorithm implementation, with a random crossover boundary and a random number of mutations at random bit positions, a lot of inferior children are created, which leaves the optimum solution to be discovered by chance. This wastes a lot of CPU, and the user does not know when the optimum solution has been found, because it could always be "the next one".
Is there an algorithm to consistently get better children rather than leave this important process to chance?
Thank you.
As others have said, the quality of offspring depends on a lot of factors and can often require experimentation, using known solutions, to get right.
However, one of the biggest factors in determining the quality of the children is the selection of the parent chromosome. Since stronger parents are more likely to create strong children the type of selection plays a big part.
The best type of selection (the more common types are rank-based, roulette-wheel and tournament selection), like most things related to genetic algorithms, is largely dependent on the problem and can often require experimentation to get right.
On whether there is a better crossover/mutation algorithm for the basic Genetic Algorithm the answer is, not really. You can experiment with different kinds of crossover (1-point, two-point, n-point) and mutation (swap or replace). The values for each can also be altered. There are also plenty of things you can change or add to the Genetic Algorithm to improve efficiency (things like culling, duplicate removal, allowing the best chromosome into the next generation) but then your Genetic Algorithm would no longer be a basic Genetic Algorithm. Adding these features also means that you may have to do a lot more experimentation to get the features used, and their parameters, right.
As Michalewicz states in his book How to Solve It: Modern Heuristics, there is no such thing as an off-the-shelf genetic algorithm. So the answer to your question is basically what @OnABauer stated.
I would only like to complete his answer with a suggestion for you to look into a memetic algorithm (there is an interesting introduction here). If you add a local optimization operator, chances are that offspring will be improved (beware of local entrapment only).
For optimization problems like the traveling salesperson problem, you can encode the solution so that every possible crossover forms a valid solution.
For example, instead of treating the genome as a list of cities (and thereby making every genome that misses a city or revisits a city as invalid), you can treat the genome as a list of transformations on a list of cities, starting with some (arbitrary) canonical list of cities.
Suppose we have a list of cities:
Azusa
Boca Raton
Cincinnati
Denver
If you treat each pair of bits as an encoding of one of the cities, then only a small number of bit patterns encodes a valid tour. Mutating and crossing between valid tours has a very small probability of resulting in another valid tour.
If you instead treat every four bits as a swap instruction, then any list of bits is valid. To determine the corresponding tour, you start with an "official" ordering of the cities and apply the list of swaps in order. You'll end with a valid tour, even if some of the swaps are no-ops.
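One possible sketch of this encoding for the four cities above, using 2 bits per city index and 4 bits per swap instruction (these exact widths are my own assumption, not spelled out in the answer):

```python
def decode_tour(genome, cities):
    """Decode a bit string into a tour: read the genome four bits
    at a time, treating each group as a swap of two positions
    (2 bits each) in a canonical city list. Every bit string
    decodes to a valid tour."""
    tour = list(cities)
    n = len(cities)
    for k in range(0, len(genome) - 3, 4):
        i = int(genome[k:k + 2], 2) % n      # first position to swap
        j = int(genome[k + 2:k + 4], 2) % n  # second position to swap
        tour[i], tour[j] = tour[j], tour[i]
    return tour
```

Since any crossover or mutation of two such genomes is still a list of swaps, the offspring always decode to valid tours.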
I've used this approach in a couple of optimization problems with good results.
In essence, genetic algorithm is a type of search algorithm.
GA is a particular kind of heuristic search.
You are trying to explore the answers which you think are more likely to be the best first.
In GA, the basis for choosing to explore an answer is that it is similar to previously known good answers (the parents).
GA also traditionally can terminate before exploring all the possible answers, which I think is the aspect that worries you the most.
If you want to always look at all possible answers, then you are considering an exhaustive search: for example, doing a depth-first search through all possible answers.
In conclusion, GA is a heuristic search.
You choose it if:
exhaustive search isn't fast enough;
you don't mind if the final result is not the best (globally optimal) one;
you understand how to guess better answers based on already-explored ones. This depends on the problem domain, and it is what determines your mutation and crossover operators.

concrete examples of heuristics

What are concrete examples of heuristics (e.g. alpha-beta pruning, with tic-tac-toe as an example, and how it is applicable there)? I already saw an answered question about what heuristics are, but I still don't get the part where estimation is used. Can you give me a concrete example of a heuristic and how it works?
Warnsdorff's rule is a heuristic, but the A* search algorithm isn't. A* is, as its name implies, a search algorithm, which is not problem-dependent; the heuristic is. An example: you can use A* (if correctly implemented) to solve the Fifteen puzzle and to find the shortest way out of a maze, but the heuristics used will be different. With the Fifteen puzzle your heuristic could be how many tiles are out of place: the number of moves needed to solve the puzzle will always be greater than or equal to the heuristic.
To get out of the maze you could use the Manhattan Distance to a point you know is outside of the maze as your heuristic. Manhattan Distance is widely used in game-like problems as it is the number of "steps" in horizontal and in vertical needed to get to the goal.
Manhattan distance = abs(x2-x1) + abs(y2-y1)
It's easy to see that in the best case (there are no walls) that will be the exact distance to the goal; in all other cases you will need more moves. This is important: your heuristic must be optimistic (an admissible heuristic) so that your search algorithm is optimal. It must also be consistent. However, in some applications (such as games with very big maps) you use non-admissible heuristics because a suboptimal solution suffices.
A heuristic is just an approximation of the real cost (never higher than the real cost, if admissible). The better the approximation, the fewer states the search algorithm will have to explore. But better approximations usually mean more computing time, so you have to find a compromise.
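A small A* sketch on a grid, using the Manhattan distance as the admissible heuristic (the grid encoding and coordinates here are illustrative):

```python
import heapq

def manhattan(a, b):
    """Steps in horizontal plus steps in vertical to the goal."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def astar(walls, start, goal, size):
    """A* on a size-by-size grid. `walls` is a set of blocked
    (x, y) cells. Returns the length of a shortest path from
    start to goal, or None if the goal is unreachable."""
    frontier = [(manhattan(start, goal), 0, start)]  # (f, g, cell)
    best = {start: 0}
    while frontier:
        f, g, cell = heapq.heappop(frontier)
        if cell == goal:
            return g
        x, y = cell
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if (0 <= nxt[0] < size and 0 <= nxt[1] < size
                    and nxt not in walls
                    and g + 1 < best.get(nxt, float("inf"))):
                best[nxt] = g + 1
                heapq.heappush(
                    frontier, (g + 1 + manhattan(nxt, goal), g + 1, nxt))
    return None
```

Because the heuristic never overestimates, the first time the goal is popped its cost is optimal.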
Most demonstrative is the usage of heuristics in informed search algorithms, such as A-Star. For realistic problems you usually have large search space, making it infeasible to check every single part of it. To avoid this, i.e. to try the most promising parts of the search space first, you use a heuristic. A heuristic gives you an estimate of how good the available subsequent search steps are. You will choose the most promising next step, i.e. best-first. For example if you'd like to search the path between two cities (i.e. vertices, connected by a set of roads, i.e. edges, that form a graph) you may want to choose the straight-line distance to the goal as a heuristic to determine which city to visit first (and see if it's the target city).
Heuristics should have similar properties as metrics for the search space and they usually should be optimistic, but that's another story. The problem of providing a heuristic that works out to be effective and that is side-effect free is yet another problem...
For an application of different heuristics being used to find the path through a given maze also have a look at this answer.
Your question interests me, as I've heard about heuristics too during my studies but never saw an application of them. I googled a bit and found this: http://www.predictia.es/blog/aco-search
That code simulates an "ant colony optimization" algorithm to search through a website.
The "ants" are workers which will search through the site; some will search randomly, others will follow the "best path" determined by the previous ones.
A concrete example: I've been writing a solver for the game JT's Block, which is roughly equivalent to the Same Game. The algorithm performs a breadth-first search on all possible hits, stores the values, and proceeds to the next ply. The problem is that the number of possible hits quickly grows out of control (an estimated 10^30 positions per game), so I need to prune the list of positions at each turn and only keep the "best" ones.
Now, the definition of the "best" positions is quite fuzzy: they are the positions that are expected to lead to the best final scores, but nothing is sure. And here comes the heuristics. I've tried a few of them:
sort positions by score obtained so far
increase score by the best score obtained with an x-depth search
increase score based on a complex formula using the number of tiles, their color and their proximity
improve the last heuristic by tweaking its parameters and seeing how they perform
etc...
The last of these heuristics could have led to an ant-march optimization: there are half a dozen parameters that can be tweaked from 0 to 1, and an optimizer could find the optimal combination of them. For the moment I've just manually improved some of them.
The second of these heuristics is interesting: it could lead to the optimal score through a full depth-first search, but such a goal is of course impossible because it would take too much time. In general, increasing X leads to a better heuristic, but increases the computing time a lot.
So here they are, some examples of heuristics. Anything can be a heuristic as long as it helps your algorithm perform better, and that's what makes them so hard to grasp: they're not deterministic. Another point about heuristics: they're supposed to give quick and dirty approximations of the real thing, so there's a trade-off between their execution time and their accuracy.
A couple of concrete examples: for solving the Knight's Tour problem, one can use Warnsdorff's rule, a heuristic. Or for solving the Fifteen puzzle with a search algorithm such as A*, a possible heuristic is the number of misplaced tiles.
The original question asked for concrete examples for heuristics.
Some of these concrete examples were already given. Another one would be the number of misplaced tiles in the 15-puzzle or its improvement, the Manhattan distance, based on the misplaced tiles.
One of the previous answers also claimed that heuristics are always problem-dependent, whereas algorithms are problem-independent. While there are, of course, also problem-dependent algorithms (for instance, for every problem you can just give an algorithm that immediately solves that very problem, e.g. the optimal strategy for any tower-of-hanoi problem is known) there are also problem-independent heuristics!
Consequently, there are also different kinds of problem-independent heuristics. Thus, in a certain way, every such heuristic can be regarded a concrete heuristic example while not being tailored to a specific problem like 15-puzzle. (Examples for problem-independent heuristics taken from planning are the FF heuristic or the Add heuristic.)
These problem-independent heuristics are based on a general description language and then perform a problem relaxation. That is, the problem relaxation is based only on the syntax (and, of course, its underlying semantics) of the problem description, without "knowing" what it represents. If you are interested in this, you should get familiar with "planning" and, more specifically, with "planning as heuristic search". I also want to mention that these heuristics, while problem-independent, do of course depend on the problem description language. (E.g., the heuristics I mentioned are specific to "planning problems", and even within planning there are various sub-classes of problems with differing kinds of heuristics.)

Depth First Search Basics

I'm trying to improve my current algorithm for the 8 Queens problem, and this is the first time I'm really dealing with algorithm design/algorithms. I want to implement a depth-first search combined with a permutation of the different Y values described here:
http://en.wikipedia.org/wiki/Eight_queens_puzzle#The_eight_queens_puzzle_as_an_exercise_in_algorithm_design
I've implemented the permutation part to solve the problem, but I'm having a little trouble wrapping my mind around the depth-first search. It is described as a way of traversing a tree/graph, but does it generate the tree/graph? It seems this method would only be more efficient if the depth-first search generates the tree structure to be traversed, implementing some logic to generate only certain parts of the tree.
So essentially, I would have to create an algorithm that generates a pruned tree of lexicographic permutations. I know how to implement the pruning logic, but I'm just not sure how to tie it in with the permutation generator, since I've been using next_permutation.
Are there any resources that could help me with the basics of depth-first search or creating lexicographic permutations in tree form?
In general, yes, the idea of the depth-first search is that you won't have to generate (or "visit" or "expand") every node.
In the case of the Eight Queens problem, if you place a queen such that it can attack another queen, you can abort that branch; it cannot lead to a solution.
If you were solving a variant of Eight Queens such that your goal was to find one solution, not all 92, then you could quit as soon as you found one.
More generally, if you were solving a less discrete problem, like finding the "best" arrangement of queens according to some measure, then you could abort a branch as soon as you knew it could not lead to a final state better than one you'd already found on another branch. This is the idea behind branch and bound, and it is related to the A* search algorithm.
Even more generally, if you are attacking a really big problem (like chess), you may be satisfied with a solution that is not exact, so you can abort a branch that probably can't lead to a better solution than one you've already found.
The DFS algorithm itself does not generate the tree/graph. If you want the tree or graph, it's as simple as building it as you perform the search. If you only want to find one solution, a flat LIFO structure such as a stack (or a linked list used as one) will suffice: when you visit a new node, push it; when you leave a node to backtrack in the search, pop it off.
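A compact sketch of the pruned depth-first search described above: place one queen per row, and abandon any branch as soon as the newest queen is attacked.

```python
def queens(n, cols=()):
    """Depth-first search for n-queens. `cols` holds the column of
    the queen placed in each row so far; a branch is pruned the
    moment the newest queen shares a column or diagonal."""
    if len(cols) == n:
        return [cols]
    solutions = []
    row = len(cols)
    for c in range(n):
        # Safe if no earlier queen shares the column or a diagonal.
        if all(c != pc and abs(c - pc) != row - r
               for r, pc in enumerate(cols)):
            solutions.extend(queens(n, cols + (c,)))
    return solutions
```

The recursion stack plays the role of the LIFO structure: the current `cols` tuple is exactly the path from the root of the implicit tree to the node being visited.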
The book Introduction to the Design and Analysis of Algorithms by Anany Levitin has a proper explanation for your understanding. He also provides a solution to the 8 queens problem just the way you described it. It will help you for sure.
To my understanding, for finding one solution you don't need any permutations; all you need is DFS. That alone will suffice to find a solution.

I need an algorithm to find the best path

I need an algorithm to find the best solution of a path finding problem. The problem can be stated as:
At the starting point I can proceed along multiple different paths.
At each step there are another multiple possible choices where to proceed.
There are two operations possible at each step:
A boundary condition that determine if a path is acceptable or not.
A condition that determine if the path has reached the final destination and can be selected as the best one.
At each step a number of paths can be eliminated, letting only the "good" paths grow.
I hope this sufficiently describes my problem, and also a possible brute force solution.
My question is: is brute force the best/only solution to the problem? I would also like some hints about the best coding structure for the algorithm.
Take a look at A*, and use the length as boundary condition.
http://en.wikipedia.org/wiki/A%2a_search_algorithm
You are looking for some kind of state space search algorithm. Without knowing more about the particular problem, it is difficult to recommend one over another.
If your space is open-ended (infinite tree search), or nearly so (chess, for example), you want an algorithm that prunes unpromising paths, as well as selects promising ones. The alpha-beta algorithm (used by many OLD chess programs) comes immediately to mind.
The A* algorithm can give good results. The key to getting good results out of A* is choosing a good heuristic (weighting function) to evaluate the current node and the various successor nodes, to select the most promising path. Simple path length is probably not good enough.
Elaine Rich's AI textbook (oldie but goodie) spent a fair amount of time on various search algorithms. Full Disclosure: I was one of the guinea pigs for the text, during my undergraduate days at UT Austin.
Did you try breadth-first search (BFS)? That is, if length is a criterion for the best path.
You will also have to modify the algorithm to disregard "unacceptable" paths.
If your problem is exactly as you describe it, you have two choices: depth-first search and breadth-first search.
Depth first search considers a possible path, pursues it all the way to the end (or as far as it is acceptable), and only then is it compared with other paths.
Breadth first search is probably more appropriate, at each junction you consider all possible next steps and use some score to rank the order in which each possible step is taken. This allows you to prioritise your search and find good solutions faster, (but to prove you have found the best solution it takes just as long as depth-first searching, and is less easy to parallelise).
However, your problem may also be suitable for Dijkstra's algorithm depending on the details of your problem. If it is, that is a much better approach!
This would also be a good starting point to develop your own algorithm that performs much better than iterative searching (if such an algorithm is actually possible, which it may not be!)
A* plus flood fill and dynamic programming. It is hard to implement, too hard to describe in a simple post, and too valuable to just give away, so sorry, I can't provide more. But searching on flood fill and dynamic programming will put you on the path if you want to go that route.

How do 20 questions AI algorithms work?

Simple online games of 20 questions powered by an eerily accurate AI.
How do they guess so well?
You can think of it as the Binary Search Algorithm.
In each iteration, we ask a question which should eliminate roughly half of the possible word choices. If there is a total of N words, then we can expect to get an answer after about log2(N) questions.
With 20 questions, we should optimally be able to find a word among 2^20 = 1,048,576 (about a million) words.
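The halving idea can be demonstrated with a number-guessing sketch: each yes/no question ("is it <= mid?") discards half of the remaining range, so a range of 2^20 values always takes exactly 20 questions.

```python
def guess(secret, lo, hi):
    """Identify `secret` in [lo, hi] by halving the range with
    each yes/no question, 20-Questions style.
    Returns (answer, number of questions asked)."""
    asked = 0
    while lo < hi:
        mid = (lo + hi) // 2
        asked += 1
        if secret <= mid:   # the question: "is it <= mid?"
            hi = mid
        else:
            lo = mid + 1
    return lo, asked
```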
One easy way to eliminate outliers (wrong answers) would be to use something like RANSAC. This means that instead of taking into account all questions which have been answered, you randomly pick a smaller subset, which is enough to give you a single answer. You then repeat that a few times with different random subsets of questions, until you see that most of the time you are getting the same result; then you know you have the right answer.
Of course this is just one way of many ways of solving this problem.
I recommend reading about the game here: http://en.wikipedia.org/wiki/Twenty_Questions
In particular the Computers section:
The game suggests that the information (as measured by Shannon's entropy statistic) required to identify an arbitrary object is about 20 bits. The game is often used as an example when teaching people about information theory. Mathematically, if each question is structured to eliminate half the objects, 20 questions will allow the questioner to distinguish between 2^20 or 1,048,576 subjects. Accordingly, the most effective strategy for Twenty Questions is to ask questions that will split the field of remaining possibilities roughly in half each time. The process is analogous to a binary search algorithm in computer science.
A decision tree supports this kind of application directly. Decision trees are commonly used in artificial intelligence.
A decision tree is a binary tree that asks "the best" question at each branch to distinguish between the collections represented by its left and right children. The best question is determined by some learning algorithm that the creators of the 20 questions application use to build the tree. Then, as other posters point out, a tree 20 levels deep gives you a million things.
A simple way to define "the best" question at each point is to look for a property that most evenly divides the collection into half. That way when you get a yes/no answer to that question, you get rid of about half of the collection at each step. This way you can approximate binary search.
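A toy sketch of picking "the best" question: choose the one whose yes-set splits the remaining candidates closest to half and half (the animals and questions below are made up for illustration):

```python
def best_question(items, questions):
    """Pick the question whose yes/no split divides `items` most
    evenly. `questions` maps question text to the set of items
    whose answer is 'yes'."""
    half = len(items) / 2
    # The smaller |yes-count - half| is, the closer the split
    # is to eliminating half the candidates either way.
    return min(questions,
               key=lambda q: abs(len(questions[q] & items) - half))
```

Asking such a question at every node and recursing on the yes and no subsets is exactly how the decision tree approximating binary search gets built.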
Wikipedia gives a more complete example:
http://en.wikipedia.org/wiki/Decision_tree_learning
And some general background:
http://en.wikipedia.org/wiki/Decision_tree
It bills itself as "the neural net on the internet", and therein lies the key. It likely stores the question/answer probabilities in a sparse matrix. Using those probabilities, it's able to use a decision tree algorithm to deduce which question would best narrow down the candidates next. Once it narrows the number of possible answers to a few dozen, or once it has reached 20 questions already, it starts reading off the most likely answers.
The really intriguing aspect of 20q.net is that unlike most decision tree and neural network algorithms I'm aware of, 20q supports a sparse matrix and incremental updates.
Edit: Turns out the answer's been on the net this whole time. Robin Burgener, the inventor, described his algorithm in detail in his 2005 patent filing.
It is using a learning algorithm.
k-NN is a good example of one of these.
Wikipedia: k-Nearest Neighbor Algorithm
