Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 7 years ago.
Improve this question
What I found are:
1. Naive Bayes classifier
2. K nearest neighbors classifier
3. Decision tree Algorithms(C4.5, Random Forest)
4. Kernel Discriminant Analysis
5. Support vector machines
If any other, can someone please help me with the remaining algorithms under this? I need complete list of supervised ML classification algorithms for my academic purpose. Thank you
Although this is an active area of research, I wouldn't say new algorithms are invented every day, not good ones anyway. The invention of a new ML algorithm that is better than the rest in even some semi-important particular cases would be pretty big news.
Usually, known algorithms are adapted to a given problem. Adapting one properly can itself be an area of research (spam classification is done with classical ML algorithms, but it's not trivial to perfect, so is digit recognition etc.)
Regardless, it's hard to find a source that lists all the known, classical algorithms. There are a lot, and it's unlikely that an author somewhere lists them all. They usually list the ones they work with, or the ones they consider the most important.
That said, I'm going to try to give you a longer list, and I'm making this community wiki to encourage other people to add more.
Naive Bayes classifier
K nearest neighbors classifier
Decision tree Algorithms(C4.5, Random Forest)
Kernel Discriminant Analysis
Support vector machines
Logistic Regression
Passive Aggressive Classifiers
Gaussian Processes
Neural networks
The Winnow algorithm
Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 9 years ago.
Improve this question
I have a small confusion with the nature of heuristics.
We know that heuristics need not give correct outputs for all input instances.
But then, why are heuristics proposed??
Heuristics are used to trade off performance (usually execution speed, but also memory consumption) with potential accuracy or generality. For example, your anti virus software uses heuristics to characterize what a virus might look like, and can take advantage of that piece of information to determine which files it should spend more time analyzing. A good heuristic has the property that it can save substantial time with minimal cost.
In graph traversal theory, a heuristic for an A* search algorithm need not be perfect. It just needs to have a predicted cost function h(x) that is less than or equal to the true cost to the goal state in order to guarantee an optimal solution. The closer h(x) equals the true cost, the quicker an optimal solution will be found.
Let me give you an example which might help you understand the importance of heuristics.
In Artificial Intelligence, search problems are mainly classified as blind search and directed search. Blind search is where you make use of algorithms such as BFS and DFS and there is a reason they are called blind search, they don't have any knowledge about the direction you should go, you just have to explore and explore until you reach the goal node, imagine the time and space complexity for those algorithms.
Now if you look at the directed search algorithm such as A*, where you have some kind of heuristic function or in simple terms an assumption about which direction you should take the next step.
Although heuristics does not guarantee the best result but rather will try to give you a better solution and sometimes even the best. There are so many classes of problems (Ex. games you play) where a better solution does the task rather than wasting so much time and space in finding the best solution.
I hope it helps.
It's difficult to tell what is being asked here. This question is ambiguous, vague, incomplete, overly broad, or rhetorical and cannot be reasonably answered in its current form. For help clarifying this question so that it can be reopened, visit the help center.
Closed 12 years ago.
What are some algorithms of legitimate utility that are simply too complex to implement?
Let me be clear: I'm not looking for algorithms like the current asymptotic optimal matrix multiplication algorithm, which is reasonable to implement but has a constant that makes it useless in practice. I'm looking for algorithms that could plausibly have practical value, but are so difficult to code that they have never been implemented, only implemented in extremely artificial settings, or only implemented for remarkably special-purpose applications.
Also welcome are near-impossible-to-implement algorithms that have good asymptotics but would likely have poor real performance.
I don't think there is any algorithm with practical use that has never been coded, but there are plenty that are difficult to code.
An example of an algorithm that is asymptotically optimal, but very difficult to code is Chazelle's O(n) polygon triangulation algorithm. According to Skiena (author of The Algorithm Design Manual), "[the] algorithm is quite hopeless to implement."
In general, triangulation and other computational geometry algorithms (such as 3D convex hull, and Voronoi diagrams) can be quick tricky to implement. A lot of the trickiness comes down to handling floating point inaccuracies.
The Piano Mover's Problem of moving a robot through an environment with obstacles can be defined mathematically and solved with algorithms with known asymptotic complexity.
It is amazing that such algorithms exist; however, it is also unfortunate that they are both extremely challenging to implement and not efficient enough for most applications.
While every new thesis on robot motion planning has to mention Canny's Roadmap Algorithm, it is doubtful if it has ever been implemented:
no general implementation of Canny's algorithm appears to exist at present.
If we can equate "tedious" with "difficult" then some mathematical proofs can have a very large number of special cases such as Hale's proof or Kepler's conjecture: http://en.wikipedia.org/wiki/Kepler_conjecture
Following the approach suggested by
Fejes Tóth (1953), Thomas Hales, then
at the University of Michigan,
determined that the maximum density of
all arrangements could be found by
minimizing a function with 150
variables. In 1992, assisted by his
graduate student Samuel Ferguson, he
embarked on a research program to
systematically apply linear
programming methods to find a lower
bound on the value of this function
for each one of a set of over 5,000
different configurations of spheres.
If a lower bound (for the function
value) could be found for every one of
these configurations that was greater
than the value of the function for the
cubic close packing arrangement, then
the Kepler conjecture would be proved.
To find lower bounds for all cases
involved solving around 100,000 linear
programming problems.
When presenting the progress of his
project in 1996, Hales said that the
end was in sight, but it might take "a
year or two" to complete. In August
1998 Hales announced that the proof
was complete. At that stage it
consisted of 250 pages of notes and 3
gigabytes of computer programs, data
and results.
I'm not sure I know what you're asking, but standard NP Incomplete calculations are pretty difficult as far as I know, and they have real world value in many ways, for example computing the most efficient routes for data transmission, or cutting circuit boards, or routing power to power grids...the possibilities are legion.
What properties should the problem have so that I can decide which method to use dynamic programming or greedy method?
Dynamic programming problems exhibit optimal substructure. This means that the solution to the problem can be expressed as a function of solutions to subproblems that are strictly smaller.
One example of such a problem is matrix chain multiplication.
Greedy algorithms can be used only when a locally optimal choice leads to a totally optimal solution. This can be harder to see right away, but generally easier to implement because you only have one thing to consider (the greedy choice) instead of multiple (the solutions to all smaller subproblems).
One famous greedy algorithm is Kruskal's algorithm for finding a minimum spanning tree.
The second edition of Cormen, Leiserson, Rivest and Stein's Algorithms book has a section (16.4) titled "Theoretical foundations for greedy methods" that discusses when the greedy methods yields an optimum solution. It covers many cases of practical interest, but not all greedy algorithms that yield optimum results can be understood in terms of this theory.
I also came across a paper titled "From Dynamic Programming To Greedy Algorithms" linked here that talks about certain greedy algorithms can be seen as refinements of dynamic programming. From a quick scan, it may be of interest to you.
There's really strict rule to know it. As someone already said, there are some things that should turn the red light on, but at the end, only experience will be able to tell you.
We apply greedy method when a decision can be made on the local information available at each stage.We are sure that following the set of decisions at each stage,we will find the optimal solution.
However, in dynamic approach we may not be sure about making a decision at one stage, so we carry a set of probable decisions , one of the probable elements may take to a solution.
What is the difference between a heuristic and an algorithm?
An algorithm is the description of an automated solution to a problem. What the algorithm does is precisely defined. The solution could or could not be the best possible one but you know from the start what kind of result you will get. You implement the algorithm using some programming language to get (a part of) a program.
Now, some problems are hard and you may not be able to get an acceptable solution in an acceptable time. In such cases you often can get a not too bad solution much faster, by applying some arbitrary choices (educated guesses): that's a heuristic.
A heuristic is still a kind of an algorithm, but one that will not explore all possible states of the problem, or will begin by exploring the most likely ones.
Typical examples are from games. When writing a chess game program you could imagine trying every possible move at some depth level and applying some evaluation function to the board. A heuristic would exclude full branches that begin with obviously bad moves.
In some cases you're not searching for the best solution, but for any solution fitting some constraint. A good heuristic would help to find a solution in a short time, but may also fail to find any if the only solutions are in the states it chose not to try.
An algorithm is typically deterministic and proven to yield an optimal result
A heuristic has no proof of correctness, often involves random elements, and may not yield optimal results.
Many problems for which no efficient algorithm to find an optimal solution is known have heuristic approaches that yield near-optimal results very quickly.
There are some overlaps: "genetic algorithms" is an accepted term, but strictly speaking, those are heuristics, not algorithms.
Heuristic, in a nutshell is an "Educated guess". Wikipedia explains it nicely. At the end, a "general acceptance" method is taken as an optimal solution to the specified problem.
Heuristic is an adjective for
experience-based techniques that help
in problem solving, learning and
discovery. A heuristic method is used
to rapidly come to a solution that is
hoped to be close to the best possible
answer, or 'optimal solution'.
Heuristics are "rules of thumb",
educated guesses, intuitive judgments
or simply common sense. A heuristic is
a general way of solving a problem.
Heuristics as a noun is another name
for heuristic methods.
In more precise terms, heuristics
stand for strategies using readily
accessible, though loosely applicable,
information to control problem solving
in human beings and machines.
While an algorithm is a method containing finite set of instructions used to solving a problem. The method has been proven mathematically or scientifically to work for the problem. There are formal methods and proofs.
Heuristic algorithm is an algorithm that is able to produce an
acceptable solution to a problem in
many practical scenarios, in the
fashion of a general heuristic, but
for which there is no formal proof of
its correctness.
An algorithm is a self-contained step-by-step set of operations to be performed 4, typically interpreted as a finite sequence of (computer or human) instructions to determine a solution to a problem such as: is there a path from A to B, or what is the smallest path between A and B. In the latter case, you could also be satisfied with a 'reasonably close' alternative solution.
There are certain categories of algorithms, of which the heuristic algorithm is one. Depending on the (proven) properties of the algorithm in this case, it falls into one of these three categories (note 1):
Exact: the solution is proven to be an optimal (or exact solution) to the input problem
Approximation: the deviation of the solution value is proven to be never further away from the optimal value than some pre-defined bound (for example, never more than 50% larger than the optimal value)
Heuristic: the algorithm has not been proven to be optimal, nor within a pre-defined bound of the optimal solution
Notice that an approximation algorithm is also a heuristic, but with the stronger property that there is a proven bound to the solution (value) it outputs.
For some problems, noone has ever found an 'efficient' algorithm to compute the optimal solutions (note 2). One of those problems is the well-known Traveling Salesman Problem. Christophides' algorithm for the Traveling Salesman Problem, for example, used to be called a heuristic, as it was not proven that it was within 50% of the optimal solution. Since it has been proven, however, Christophides' algorithm is more accurately referred to as an approximation algorithm.
Due to restrictions on what computers can do, it is not always possible to efficiently find the best solution possible. If there is enough structure in a problem, there may be an efficient way to traverse the solution space, even though the solution space is huge (i.e. in the shortest path problem).
Heuristics are typically applied to improve the running time of algorithms, by adding 'expert information' or 'educated guesses' to guide the search direction. In practice, a heuristic may also be a sub-routine for an optimal algorithm, to determine where to look first.
(note 1): Additionally, algorithms are characterised by whether they include random or non-deterministic elements. An algorithm that always executes the same way and produces the same answer, is called deterministic.
(note 2): This is called the P vs NP problem, and problems that are classified as NP-complete and NP-hard are unlikely to have an 'efficient' algorithm. Note; as #Kriss mentioned in the comments, there are even 'worse' types of problems, which may need exponential time or space to compute.
There are several answers that answer part of the question. I deemed them less complete and not accurate enough, and decided not to edit the accepted answer made by #Kriss
Actually I don't think that there is a lot in common between them. Some algorithm use heuristics in their logic (often to make fewer calculations or get faster results). Usually heuristics are used in the so called greedy algorithms.
Heuristics is some "knowledge" that we assume is good to use in order to get the best choice in our algorithm (when a choice should be taken). For example ... a heuristics in chess could be (always take the opponents' queen if you can, since you know this is the stronger figure). Heuristics do not guarantee you that will lead you to the correct answer, but (if the assumptions is correct) often get answer which are close to the best in much shorter time.
An Algorithm is a clearly defined set of instructions to solve a problem, Heuristics involve utilising an approach of learning and discovery to reach a solution.
So, if you know how to solve a problem then use an algorithm. If you need to develop a solution then it's heuristics.
Heuristics are algorithms, so in that sense there is none, however, heuristics take a 'guess' approach to problem solving, yielding a 'good enough' answer, rather than finding a 'best possible' solution.
A good example is where you have a very hard (read NP-complete) problem you want a solution for but don't have the time to arrive to it, so have to use a good enough solution based on a heuristic algorithm, such as finding a solution to a travelling salesman problem using a genetic algorithm.
Algorithm is a sequence of some operations that given an input computes something (a function) and outputs a result.
Algorithm may yield an exact or approximate values.
It also may compute a random value that is with high probability close to the exact value.
A heuristic algorithm uses some insight on input values and computes not exact value (but may be close to optimal).
In some special cases, heuristic can find exact solution.
A heuristic is usually an optimization or a strategy that usually provides a good enough answer, but not always and rarely the best answer. For example, if you were to solve the traveling salesman problem with brute force, discarding a partial solution once its cost exceeds that of the current best solution is a heuristic: sometimes it helps, other times it doesn't, and it definitely doesn't improve the theoretical (big-oh notation) run time of the algorithm
I think Heuristic is more of a constraint used in Learning Based Model in Artificial Intelligent since the future solution states are difficult to predict.
But then my doubt after reading above answers is
"How would Heuristic can be successfully applied using Stochastic Optimization Techniques? or can they function as full fledged algorithms when used with Stochastic Optimization?"
http://en.wikipedia.org/wiki/Stochastic_optimization
One of the best explanations I have read comes from the great book Code Complete, which I now quote:
A heuristic is a technique that helps you look for an answer. Its
results are subject to chance because a heuristic tells you only how
to look, not what to find. It doesn’t tell you how to get directly
from point A to point B; it might not even know where point A and
point B are. In effect, a heuristic is an algorithm in a clown suit.
It’s less predict- able, it’s more fun, and it comes without a 30-day,
money-back guarantee.
Here is an algorithm for driving to someone’s house: Take Highway 167
south to Puy-allup. Take the South Hill Mall exit and drive 4.5 miles
up the hill. Turn right at the light by the grocery store, and then
take the first left. Turn into the driveway of the large tan house on
the left, at 714 North Cedar.
Here’s a heuristic for getting to someone’s house: Find the last
letter we mailed you. Drive to the town in the return address. When
you get to town, ask someone where our house is. Everyone knows
us—someone will be glad to help you. If you can’t find anyone, call us
from a public phone, and we’ll come get you.
The difference between an algorithm and a heuristic is subtle, and the
two terms over-lap somewhat. For the purposes of this book, the main
difference between the two is the level of indirection from the
solution. An algorithm gives you the instructions directly. A
heuristic tells you how to discover the instructions for yourself, or
at least where to look for them.
They find a solution suboptimally without any guarantee as to the quality of solution found, it is obvious that it makes sense to the development of heuristics only polynomial. The application of these methods is suitable to solve real world problems or large problems so awkward from the computational point of view that for them there is not even an algorithm capable of finding an approximate solution in polynomial time.