Related
I'm reading up on the one dimensional bin packing problem and the different solutions that can be used to solve it.
Bin Packing Problem Definition: Given a list of objects and their weights, and a collection of bins of fixed size, find the smallest number of bins so that all of the objects are assigned to a bin.
Solutions I'm studying: Next Fit, First Fit, Best Fit, Worst Fit, First Fit Decreasing, Best Fit Decreasing
I notice that some articles I read call these "approximation algorithms", and others call these "heuristics". I know that there is a difference between approximation algorithms and heuristics:
Heuristic: With some hard problems, it's difficult to get an acceptable solution in a decent run time, so we can get an "okay" solution by applying some educated guesses, or arbitrarily choosing.
Approximation Algorithm: This gives an approximate solution, with some "guarantee" on it's performance (maybe a ratio, or something like that)
So, my question is are these solutions that I'm studying heuristic or approximation algorithms? I'm more inclined to believe that they are heuristic because we're choosing the next item to be placed in a bin by some "guess". We're not guaranteed of some optimal solution. So why do some people call them approximation algorithms?
If these aren't heuristic algorithms, then what are examples of heuristic algorithms to solve the bin packing problem?
An algorithm can be both a heuristic and an approximation algorithm -- the two terms don't conflict. If some "good but not always optimal" strategy (the heuristic) can be proven to be "not too bad" (the approximation guarantee), then it qualifies as both.
All of the algorithms you listed are heuristic because they prescribe a "usually good" strategy, which is the heuristic. For any of the algorithms where there is an approximation guarantee (the "error" must be bounded somehow), then you can also say it's an approximation algorithm.
From what I understand, randomised algorithm could give wrong answer.For example, using contraction algorithm to solve graph min-cut problem, you need to run the algorithm n^2*ln(n) times so that the possibility of failing to get the correct answer is at most 1/n. No matter how small the possibility of failure is, the answer could be incorrect, so when is the right time that we allow the incorrect answer?
To begin with, I think you need to differentiate between different classes of randomized algorithms:
A Monte Carlo algorithm is an algorithm which is random w.r.t. correctness. The randomized min-cut algorithm, from your question, is an example of such an algorithm.
A Las Vegas algorithm is an algorithm which is random w.r.t. running time. Randomized quicksort, for example, is such an algorithm.
You seem to mean Monte-Carlo algorithms in your question.
The question of whether a Monte-Carlo algorithm is suitable to you, probably can't be answered objectively, because it is based on something like the ecomonic theory of utility. Given two algorithms, A and B, then each invocation of A or B takes some time t and gives you the result whose correctness is c. The utility U(t, c) is a random variable, and only you can determine whether the distribution of UA(T, C) is better or worse than UB(T, C). Some examples, where algorithm A performs twice as fast as B, but errs with probability 1e-6:
If these are preference recommendations on a website, then it might be worth it for you to have your website twice as responsive as that of a competitor, at the risk that, rarely, a client gets wrong recommendations.
If these are control systems for a nuclear reactor (to borrow from TemplateTypedef's comment), then a slight chance of failure might not be worth the time saving (e.g., you probably would be better investing in a processor twice as fast running the slower algorithm).
The two examples above show that each of the two choices might be correct for different settings. In fact, utility theory rarely shows sets of choices that are clearly wrong. In the introduction to the book Randomized Algorithms by Motwani and Raghavan, however, the authors do give such an example for the fallacy of avoiding Monte-Carlo algorithms. The probability of a CPU malfunctioning due to cosmic radiation is some α (whose value I forget). Thus avoiding running a Monte-Carlo algorithm with probability of error much lower than α, is probably simply irrational.
You'll always need to analyze the properties of the algorithm and decide if the risk of a non-optimal answer is bearable in your application. (If the answer is Boolean, then "non-optimal" is the same as "wrong.")
There are many kinds of programming problems where some answer that's close to optimal and obtained in reasonable time is much better than the optimal answer provided too late or not at all.
The Traveling Salesman problem is an example. If you are Walmart and need to plan delivery routes each night for given sets of cities, getting a route that's close to optimal is much better than no route or a naively chosen one or the best possible route obtained 2 days from now.
There are many kinds of guarantees provided by randomized algorithms. They often have the form error <= F(cost), where error and cost can be almost anything. The cost may be expressed in run time or how many repeat runs are spent looking for better answers. Space may also figure in cost. The error may be probability of a wrong 1/0 answer, a distance metric from an optimal result, a discrete count of erroneous components, etc., etc.
Sometimes you just have to live with a maybe-wrong answer because there's no useful alternative. Primality testing on big numbers is in this category. Though there are polynomial time deterministic tests, they are still much slower than a probabilistic test that produces the correct answer for all practical purposes.
For example, if you have a Boolean randomized function where True results are always correct, but False are wrong 50% of the time, then you are in pretty good shape. (The Miller-Rabin primality test is actually better than this.)
Suppose you can afford to run the algorithm 40 times. If any of the runs says False, you know the answer is False. If they're all True then the probability of that the real answer if false is roughly 2^40 = 1/(1 trillion).
Even in safety-critical applications, this may be a fine result. The chance of being hit by lightning in a lifetime is about 1/10,000. We all live with that and don't give it a second thought.
Hello I've just started learning greedy algorithm and I've first looked at the classic coin changing problem. I could understand the greediness (i.e., choosing locally optimal solution towards a global optimum.) in the algorithm as I am choosing the highest value of coin such that the
sum+{value of chosen coin}<=total value . Then I started to solve some greedy algorithm problem in some sites. I could solve most of the problems but couldn't figure out exactly where the greediness is applied in the problem. I coded the only solution i could think of, for the problems and got it accepted. The editorials also show the same way of solving problem but i could not understand the application of greedy paradigm in the algorithm.
Are greedy algorithms the only way of solving a particular range of problems? Or they are one way of solving problems which could be more efficient?
Could you give me pseudo codes of a same problem with and without the application of greedy paradigm?
There are lots of real life examples of greedy algorithms. One of the obvious is the coin changing problem, to make change in a certain currency, we repeatedly dispense the largest denomination, thus , to give out seventeen dollars and sixty one cents in change, we give out a ten-dollar bill, a five-dollar bill, two one-dollar bills, two quarters , one dime, and one penny. By doing this, we are guaranteed to minimize the number of bills and coins. This algorithm does not work in all monetary systems...more here
I think that there is always another way to solve a problem, but sometimes, as you've stated, it probably will be less efficient.
For example, you can always check all the options (coins permutations), store the results and choose the best, but of course the efficiency is terrible.
Hope it helps.
Greedy algorithms are just a class of algorithms that iteratively construct/improve a solution.
Imagine the most famous problem - TSP. You can formulate it as Integer Linear Programming problem and give it to an ILP solver and it will give you globally optimal solution (if it has enought time). But you could do it in a greedy way. You can construct some solution (e.g. randomly) and then look for changes (e.g. switch an order of two cities) that improve your solution and you keep doing these changes until there is no such change possible.
So the bottom line is: greedy algorithms are only a method of solving hard problems efficiently (in time, but not necessary in the quality of solution), but there are other classes of algorithms for solving such problems.
For coins, greedy algorithm is also the optimal one, therefore the "greediness" is not as visible as with some other problems.
In some cases you prefer solution, which is not the best one, but you can compute it much faster (computing the real best solution can takes years for example).
Then you choose heuristic, that should give you the best results - based on average input data, its structure and what you want to want to accomplish.
On wikipedia, there is good solution on finding the biggest sum of numbers in tree
Imagine that you have for example 2^1000 nodes in this tree. To find optimal solution, you have to visit each node once. Personal computer today is not able to do this in your lifetime, therefore you want some heuristic. Greedy alghoritm however find solution in just 1000 steps (which does not take more than one milisecond)
I'm developing a trip planer program. Each city has a property called rateOfInterest. Each road between two cities has a time cost. The problem is, given the start city, and the specific amount of time we want to spend, how to output a path which is most interesting (i.e. the sum of the cities' rateOfInterest). I'm thinking using some greedy algorithm, but is there any algorithm that can guarantee an optimal path?
EDIT Just as #robotking said, we allow visit places multiple times and it's only interesting the first visit. We have 50 cities, and each city approximately has 5 adjacent cities. The cost function on each edge is either time or distance. We don't have to visit all cities, just with the given cost function, we need to return an optimal partial trip with highest ROI. I hope this makes the problem clearer!
This sounds very much like an instance of a TSP in a weighted manner meaning there is some vertices that are more desirable than other...
Now you could find an optimal path trying every possible permutation (using backtracking with some pruning to make it faster) depending on the number of cities we are talking about. See the TSP problem is a n! problem so after n > 10 you can forget it...
If your number of cities is not that small then finding an optimal path won't be doable so drop the idea... however there is most likely a good enough heuristic algorithm to approximate a good enough solution.
Steven Skiena recommends "Simulated Annealing" as the heuristics of choice to approximate such hard problem. It is very much like a "Hill Climbing" method but in a more flexible or forgiving way. What I mean is that while in "Hill Climbing" you always only accept changes that improve your solution, in "Simulated Annealing" there is some cases where you actually accept a change even if it makes your solution worse locally hoping that down the road you get your money back...
Either way, whatever is used to approximate a TSP-like problem is applicable here.
From http://en.wikipedia.org/wiki/Travelling_salesman_problem, note that the decision problem version is "(where, given a length L, the task is to decide whether any tour is shorter than L)". If somebody gives me a travelling salesman problem to solve I can set all the cities to have the same rate of interest and then the decision problem is whether a most interesting path for time L actually visits all the cities and returns.
So if there was an efficient solution for your problem there would be an efficient solution for the travelling salesman problem, which is unlikely.
If you want to go further than a greedy search, some of the approaches of the travelling salesman problem may be applicable - http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.26.5150 describes "Iterated Local Search" which looks interesting, with reference to the TSP.
If you want optimality, use a brute force exhaustive search where the leaves are the one where the time run out. As long as the expected depth of the search tree is less than 10 and worst case less than 15 you can produce a practical algorithm.
Now if you think about the future and expect your city network to grow, then you cannot ensure optimality. In this case you are dealing with a local search problem.
What is the difference between a heuristic and an algorithm?
An algorithm is the description of an automated solution to a problem. What the algorithm does is precisely defined. The solution could or could not be the best possible one but you know from the start what kind of result you will get. You implement the algorithm using some programming language to get (a part of) a program.
Now, some problems are hard and you may not be able to get an acceptable solution in an acceptable time. In such cases you often can get a not too bad solution much faster, by applying some arbitrary choices (educated guesses): that's a heuristic.
A heuristic is still a kind of an algorithm, but one that will not explore all possible states of the problem, or will begin by exploring the most likely ones.
Typical examples are from games. When writing a chess game program you could imagine trying every possible move at some depth level and applying some evaluation function to the board. A heuristic would exclude full branches that begin with obviously bad moves.
In some cases you're not searching for the best solution, but for any solution fitting some constraint. A good heuristic would help to find a solution in a short time, but may also fail to find any if the only solutions are in the states it chose not to try.
An algorithm is typically deterministic and proven to yield an optimal result
A heuristic has no proof of correctness, often involves random elements, and may not yield optimal results.
Many problems for which no efficient algorithm to find an optimal solution is known have heuristic approaches that yield near-optimal results very quickly.
There are some overlaps: "genetic algorithms" is an accepted term, but strictly speaking, those are heuristics, not algorithms.
Heuristic, in a nutshell is an "Educated guess". Wikipedia explains it nicely. At the end, a "general acceptance" method is taken as an optimal solution to the specified problem.
Heuristic is an adjective for
experience-based techniques that help
in problem solving, learning and
discovery. A heuristic method is used
to rapidly come to a solution that is
hoped to be close to the best possible
answer, or 'optimal solution'.
Heuristics are "rules of thumb",
educated guesses, intuitive judgments
or simply common sense. A heuristic is
a general way of solving a problem.
Heuristics as a noun is another name
for heuristic methods.
In more precise terms, heuristics
stand for strategies using readily
accessible, though loosely applicable,
information to control problem solving
in human beings and machines.
While an algorithm is a method containing finite set of instructions used to solving a problem. The method has been proven mathematically or scientifically to work for the problem. There are formal methods and proofs.
Heuristic algorithm is an algorithm that is able to produce an
acceptable solution to a problem in
many practical scenarios, in the
fashion of a general heuristic, but
for which there is no formal proof of
its correctness.
An algorithm is a self-contained step-by-step set of operations to be performed 4, typically interpreted as a finite sequence of (computer or human) instructions to determine a solution to a problem such as: is there a path from A to B, or what is the smallest path between A and B. In the latter case, you could also be satisfied with a 'reasonably close' alternative solution.
There are certain categories of algorithms, of which the heuristic algorithm is one. Depending on the (proven) properties of the algorithm in this case, it falls into one of these three categories (note 1):
Exact: the solution is proven to be an optimal (or exact solution) to the input problem
Approximation: the deviation of the solution value is proven to be never further away from the optimal value than some pre-defined bound (for example, never more than 50% larger than the optimal value)
Heuristic: the algorithm has not been proven to be optimal, nor within a pre-defined bound of the optimal solution
Notice that an approximation algorithm is also a heuristic, but with the stronger property that there is a proven bound to the solution (value) it outputs.
For some problems, noone has ever found an 'efficient' algorithm to compute the optimal solutions (note 2). One of those problems is the well-known Traveling Salesman Problem. Christophides' algorithm for the Traveling Salesman Problem, for example, used to be called a heuristic, as it was not proven that it was within 50% of the optimal solution. Since it has been proven, however, Christophides' algorithm is more accurately referred to as an approximation algorithm.
Due to restrictions on what computers can do, it is not always possible to efficiently find the best solution possible. If there is enough structure in a problem, there may be an efficient way to traverse the solution space, even though the solution space is huge (i.e. in the shortest path problem).
Heuristics are typically applied to improve the running time of algorithms, by adding 'expert information' or 'educated guesses' to guide the search direction. In practice, a heuristic may also be a sub-routine for an optimal algorithm, to determine where to look first.
(note 1): Additionally, algorithms are characterised by whether they include random or non-deterministic elements. An algorithm that always executes the same way and produces the same answer, is called deterministic.
(note 2): This is called the P vs NP problem, and problems that are classified as NP-complete and NP-hard are unlikely to have an 'efficient' algorithm. Note; as #Kriss mentioned in the comments, there are even 'worse' types of problems, which may need exponential time or space to compute.
There are several answers that answer part of the question. I deemed them less complete and not accurate enough, and decided not to edit the accepted answer made by #Kriss
Actually I don't think that there is a lot in common between them. Some algorithm use heuristics in their logic (often to make fewer calculations or get faster results). Usually heuristics are used in the so called greedy algorithms.
Heuristics is some "knowledge" that we assume is good to use in order to get the best choice in our algorithm (when a choice should be taken). For example ... a heuristics in chess could be (always take the opponents' queen if you can, since you know this is the stronger figure). Heuristics do not guarantee you that will lead you to the correct answer, but (if the assumptions is correct) often get answer which are close to the best in much shorter time.
An Algorithm is a clearly defined set of instructions to solve a problem, Heuristics involve utilising an approach of learning and discovery to reach a solution.
So, if you know how to solve a problem then use an algorithm. If you need to develop a solution then it's heuristics.
Heuristics are algorithms, so in that sense there is none, however, heuristics take a 'guess' approach to problem solving, yielding a 'good enough' answer, rather than finding a 'best possible' solution.
A good example is where you have a very hard (read NP-complete) problem you want a solution for but don't have the time to arrive to it, so have to use a good enough solution based on a heuristic algorithm, such as finding a solution to a travelling salesman problem using a genetic algorithm.
Algorithm is a sequence of some operations that given an input computes something (a function) and outputs a result.
Algorithm may yield an exact or approximate values.
It also may compute a random value that is with high probability close to the exact value.
A heuristic algorithm uses some insight on input values and computes not exact value (but may be close to optimal).
In some special cases, heuristic can find exact solution.
A heuristic is usually an optimization or a strategy that usually provides a good enough answer, but not always and rarely the best answer. For example, if you were to solve the traveling salesman problem with brute force, discarding a partial solution once its cost exceeds that of the current best solution is a heuristic: sometimes it helps, other times it doesn't, and it definitely doesn't improve the theoretical (big-oh notation) run time of the algorithm
I think Heuristic is more of a constraint used in Learning Based Model in Artificial Intelligent since the future solution states are difficult to predict.
But then my doubt after reading above answers is
"How would Heuristic can be successfully applied using Stochastic Optimization Techniques? or can they function as full fledged algorithms when used with Stochastic Optimization?"
http://en.wikipedia.org/wiki/Stochastic_optimization
One of the best explanations I have read comes from the great book Code Complete, which I now quote:
A heuristic is a technique that helps you look for an answer. Its
results are subject to chance because a heuristic tells you only how
to look, not what to find. It doesn’t tell you how to get directly
from point A to point B; it might not even know where point A and
point B are. In effect, a heuristic is an algorithm in a clown suit.
It’s less predict- able, it’s more fun, and it comes without a 30-day,
money-back guarantee.
Here is an algorithm for driving to someone’s house: Take Highway 167
south to Puy-allup. Take the South Hill Mall exit and drive 4.5 miles
up the hill. Turn right at the light by the grocery store, and then
take the first left. Turn into the driveway of the large tan house on
the left, at 714 North Cedar.
Here’s a heuristic for getting to someone’s house: Find the last
letter we mailed you. Drive to the town in the return address. When
you get to town, ask someone where our house is. Everyone knows
us—someone will be glad to help you. If you can’t find anyone, call us
from a public phone, and we’ll come get you.
The difference between an algorithm and a heuristic is subtle, and the
two terms over-lap somewhat. For the purposes of this book, the main
difference between the two is the level of indirection from the
solution. An algorithm gives you the instructions directly. A
heuristic tells you how to discover the instructions for yourself, or
at least where to look for them.
They find a solution suboptimally without any guarantee as to the quality of solution found, it is obvious that it makes sense to the development of heuristics only polynomial. The application of these methods is suitable to solve real world problems or large problems so awkward from the computational point of view that for them there is not even an algorithm capable of finding an approximate solution in polynomial time.