Should I use recursion or memoization for an algorithm?

If I have a choice to use recursion or memoization to solve a problem which should I use? In other words if they are both viable solutions in that they give the correct output and can be reasonably expressed in the code I'm using, when would I use one over the other?

They are not mutually exclusive. You can use them both.
Personally, I'd build the base recursive function first, and add memoization afterwards, as an optimisation step.

The rule of thumb is based on the amount of overlap between subproblems. If you're calculating Fibonacci numbers (the classic recursion example), there's a lot of needless recalculation done if you use plain recursion.
For example, to calculate F(4) I need F(3) and F(2), and calculating F(3) in turn requires F(2) and F(1), so with plain recursion I end up computing F(2), and most other F(n), more than once. With memoization, each value is computed once and simply looked up after that.
If you're doing binary search there is no overlap between subproblems, so recursion is okay. Splitting the input array in half at each step results in two unique arrays, which represent two subproblems with no overlap. Memoization wouldn't be a benefit in cases like this.
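A minimal Python sketch of that overlap, with functools.lru_cache supplying the memoization:

    from functools import lru_cache

    def fib_naive(n):
        # Recomputes the same subproblems over and over: roughly phi^n calls.
        if n < 2:
            return n
        return fib_naive(n - 1) + fib_naive(n - 2)

    @lru_cache(maxsize=None)
    def fib_memo(n):
        # Identical recursion, but each F(k) is computed only once: O(n) calls.
        if n < 2:
            return n
        return fib_memo(n - 1) + fib_memo(n - 2)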

Recursion has a performance penalty from the creation of stack frames; memoization's penalty is the storage and lookup of cached results. If performance is a concern, the only way to know for sure is to test in your application.
Personally, I'd start with whichever method is easiest to write and understand, which for me is recursion, and add memoization only once you've demonstrated a need for it.

Memoization is just a caching technique that happens to be commonly used to optimize recursion. It cannot replace recursion.

Not sure I can say without knowing the problem. Often you'd want to use memoization with recursion. Still, memoization is likely to be significantly quicker than plain recursion if you can in fact use it as an alternative solution. Both have performance costs, but they scale differently with the nature of the problem and the size of the input.

I pick memoization because it's usually possible to access more heap memory than stack memory.
That is, if your algorithm is run on a lot of data, in most languages you'll run out of stack space recursing before you run out of space on the heap saving data.

I believe you might be confusing memoization (which is, as others have noted, an optimization strategy for recursive algorithms) with dynamic programming (which simulates a recursive solution but does not actually use recursion). If that was your question I'd say it would depend on your priorities: high runtime efficiency (dynamic programming) or high readability (memoization, as the recursive solution of the problem is still present in the code).
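As a sketch of that distinction in Python, using Fibonacci once more: the memoized version keeps the recursive shape visible, while the dynamic-programming version fills in the same subproblems bottom-up with no recursion at all.

    def fib_top_down(n, memo=None):
        # Memoized recursion: the recursive statement of the problem stays visible.
        if memo is None:
            memo = {0: 0, 1: 1}
        if n not in memo:
            memo[n] = fib_top_down(n - 1, memo) + fib_top_down(n - 2, memo)
        return memo[n]

    def fib_bottom_up(n):
        # Dynamic programming: the same subproblems, filled in iteratively,
        # with no risk of exhausting the call stack.
        prev, curr = 0, 1
        for _ in range(n):
            prev, curr = curr, prev + curr
        return prev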

It depends on what you're going for. Dynamic programming (memoization) is almost always faster, often by a LOT (e.g., cubic down to quadratic, or exponential down to polynomial), but in my experience recursion is easier to read and debug.
Then again, a lot of people avoid recursion like the plague, so they don't find it easy to read...
Also, (third hand?) I find that it's easiest to find the Dynamic solution after I've come up with the recursive one, so I end up doing both. But if you've already got both solutions, Dynamic may be your best bet.
I'm not sure if I've been helpful, but there you go. As in many things of algorithm choice, YMMV.

If your problem is a recursive one, what choice do you have but to recurse?
You can write your recursive function in a way that short circuits using memoization, to gain maximum speed for the second call.

I don't agree with Tomalak's assertion that with a recursive problem you have no choice but to recurse.
The best example is the above-mentioned Fibonacci series.
On my computer the recursive version of F(45) (F for Fibonacci) takes 15 seconds for 2269806339 additions; the iterative version takes 43 additions and executes in a few microseconds.
Another well-known example is Towers of Hanoi. After your class on the topic it may seem like recursion is the only way to solve it. But even here there's an iterative solution, although it's not as obvious as the recursive one. Even so, the iterative one is faster, mainly because no expensive stack operations are required.
In case you're interested in the non-recursive version of Towers of Hanoi, here's the Delphi source code:
procedure TForm1.TowersOfHanoi(Ndisks: Word);
var
  I: LongWord;
begin
  // Move number I encodes its source and target pegs in its binary
  // representation; a solution for Ndisks takes 2^Ndisks - 1 moves.
  for I := 1 to (1 shl Ndisks) - 1 do
    Memo1.Lines.Add(Format('%4d: move from pole %d to pole %d',
      [I, (I and (I - 1)) mod 3, ((I or (I - 1)) + 1) mod 3]));
  Memo1.Lines.Add('done')
end;

Recursion does not need to use a significant amount of stack space if the problem can be solved with tail-recursion techniques. As said previously, it depends on the problem.
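A Python illustration, purely as a sketch since CPython itself does not eliminate tail calls:

    def sum_to_tail(n, acc=0):
        # Tail-recursive form: the recursive call is the last action, so a
        # language with tail-call optimization can reuse the stack frame.
        if n == 0:
            return acc
        return sum_to_tail(n - 1, acc + n)

    def sum_to_loop(n):
        # The mechanical translation of that tail call into a loop; this is
        # effectively what a TCO-capable compiler does for you.
        acc = 0
        while n > 0:
            acc += n
            n -= 1
        return acc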

In the usual case, you are faced with two criteria which help with your decision:
Run time
Readability
Recursive code is usually slower but much more readable (not always, but most often; as noted, tail recursion can help if your language supports it - if not, there is not much you can do).
The iterative version of a recursive problem is usually faster in terms of runtime, but the code tends to be harder to understand and, because of that, fragile.
If both versions have the same run time and the same readability, there is no reason to choose either over the other. In this case, you have to check other things: Will this code change? How about maintenance? Are you comfortable with recursive algorithms or are they giving you nightmares?

var memoizer = function (fund, memo) {
    var shell = function (arg) {
        // Compute and cache on first request; serve from the cache afterwards.
        if (typeof memo[arg] !== 'number') {
            memo[arg] = fund(shell, arg);
        }
        return memo[arg];
    };
    return shell;
};
// The memo array [0, 1] seeds the base cases F(0) = 0 and F(1) = 1.
var fibonacci = memoizer(function (recur, n) {
    return recur(n - 1) + recur(n - 2);
}, [0, 1]);
use both!

Combine both. Optimize your recursive solution by adding memoization; that's what memoization is for: spending memory space to speed up the recursion.

Related

Algorithm for finding Time Complexity of Algorithm [duplicate]

I wonder whether there is any automatic way of determining (at least roughly) the Big-O time complexity of a given function?
If I graphed an O(n) function vs. an O(n lg n) function I think I would be able to visually ascertain which is which; I'm thinking there must be some heuristic solution which enables this to be done automatically.
Any ideas?
Edit: I am happy to find a semi-automated solution, just wondering whether there is some way of avoiding doing a fully manual analysis.
It sounds like what you are asking for is an extension of the Halting Problem. I do not believe that such a thing is possible, even in theory.
Just answering the question "Will this line of code ever run?" would be very difficult if not impossible to do in the general case.
Edited to add:
Although the general case is intractable, see here for a partial solution: http://research.microsoft.com/apps/pubs/default.aspx?id=104919
Also, some have stated that doing the analysis by hand is the only option, but I don't believe that is really the correct way of looking at it. An intractable problem is still intractable even when a human being is added to the system/machine. Upon further reflection, I suppose that a 99% solution may be doable, and might even work as well as or better than a human.
You can run the algorithm over various size data sets, and you could then use curve fitting to come up with an approximation. (Just looking at the curve you create probably will be enough in most cases, but any statistical package has curve fitting).
Note that some algorithms exhibit one shape with small data sets, but another with large... and the definition of large remains a bit nebulous. This means that an algorithm with a good performance curve could have so much real world overhead that (for small data sets) it doesn't work as well as the theoretically better algorithm.
As far as code inspection techniques, none exist. But instrumenting your code to run at various lengths and outputting a simple file (RunSize RunLength would be enough) should be easy. Generating proper test data could be more complex (some algorithms work better/worse with partially ordered data, so you would want to generate data that represented your normal use-case).
Because of the problems with the definition of "what is large" and the fact that performance is data dependent, I find that static analysis often is misleading. When optimizing performance and selecting between two algorithms, the real world "rubber hits the road" test is the only final arbitrator I trust.
A short answer is that it's impossible because constants matter.
For instance, I might write a function that runs in O((n^3/k) + n^2). This simplifies to O(n^3) because as n approaches infinity, the n^3 term will dominate the function, irrespective of the constant k.
However, if k is very large in the above example function, the function will appear to run in almost exactly n^2 until some crossover point (around n ≈ k here), at which the n^3 term will begin to dominate. Because the constant k will be unknown to any profiling tool, it will be impossible to know just how large a dataset to test the target function with. If k can be arbitrarily large, you cannot craft test data to determine the big-oh running time.
I am surprised to see so many attempts to claim that one can "measure" complexity by a stopwatch. Several people have given the right answer, but I think that there is still room to drive the essential point home.
Algorithm complexity is not a "programming" question; it is a "computer science" question. Answering the question requires analyzing the code from the perspective of a mathematician, such that computing the Big-O complexity is practically a form of mathematical proof. It requires a very strong understanding of the fundamental computer operations, algebra, perhaps calculus (limits), and logic. No amount of "testing" can be substituted for that process.
The Halting Problem applies, so the complexity of an algorithm is fundamentally undecidable by a machine.
The limits of automated tools apply, so it might be possible to write a program to help, but it would only help about as much as a calculator helps with one's physics homework, or as much as a refactoring browser helps with reorganizing a code base.
For anyone seriously considering writing such a tool, I suggest the following exercise. Pick a reasonably simple algorithm, such as your favorite sort, as your subject algorithm. Get a solid reference (book, web-based tutorial) to lead you through the process of calculating the algorithm complexity and ultimately the "Big-O". Document your steps and results as you go through the process with your subject algorithm. Perform the steps and document your progress for several scenarios, such as best-case, worst-case, and average-case. Once you are done, review your documentation and ask yourself what it would take to write a program (tool) to do it for you. Can it be done? How much would actually be automated, and how much would still be manual?
Best wishes.
I am curious as to why it is that you want to be able to do this. In my experience when someone says: "I want to ascertain the runtime complexity of this algorithm" they are not asking what they think they are asking. What you are most likely asking is what is the realistic performance of such an algorithm for likely data. Calculating the Big-O of a function is of reasonable utility, but there are so many aspects that can change the "real runtime performance" of an algorithm in real use that nothing beats instrumentation and testing.
For example, the following algorithms have the same exact Big-O (wacky pseudocode):
example a:
huge_two_dimensional_array foo
for i = 0; i < foo[i].length; i++
    for j = 0; j < foo[j].length; j++
        do_something_with foo[i][j]
example b:
huge_two_dimensional_array foo
for j = 0; j < foo[j].length; j++
    for i = 0; i < foo[i].length; i++
        do_something_with foo[i][j]
Again, exactly the same big-O... but one of them uses row ordinality and one of them uses column ordinality. It turns out that due to locality of reference and cache coherency you might have two completely different actual runtimes, especially depending on the actual size of the array foo. This doesn't even begin to touch the actual performance characteristics of how the algorithm behaves if it's part of a piece of software that has some concurrency built in.
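A rough Python sketch of the same pair of loops, for anyone who wants to see the effect first-hand (the difference is far more dramatic in languages with flat arrays, since Python's lists of pointers dilute the locality, but the pattern is the same):

    import time

    N = 2000
    foo = [[1] * N for _ in range(N)]

    def row_major(a):
        total = 0
        for i in range(N):        # outer loop over rows: the inner loop
            for j in range(N):    # walks one row's storage sequentially
                total += a[i][j]
        return total

    def col_major(a):
        total = 0
        for j in range(N):        # outer loop over columns: every inner
            for i in range(N):    # step jumps to a different row's storage
                total += a[i][j]
        return total

    for f in (row_major, col_major):
        start = time.perf_counter()
        f(foo)
        print(f.__name__, time.perf_counter() - start)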
Not to be a negative nelly but big-O is a tool with a narrow scope. It is of great use if you are deep inside algorithmic analysis or if you are trying to prove something about an algorithm, but if you are doing commercial software development the proof is in the pudding, and you are going to want to have actual performance numbers to make intelligent decisions.
Cheers!
This could work for simple algorithms, but what about O(n^2 lg n), or O(n lg^2 n)?
You could get fooled visually very easily.
And if it's a really bad algorithm, maybe it wouldn't return even on n=10.
Proof that this is undecidable:
Suppose that we had some algorithm HALTS_IN_FN(Program, function) which determined whether a program halted in O(f(n)) for all n, for some function f.
Let P be the following program:
if (HALTS_IN_FN(P, f(n))) {
    while (1);
}
halt;
Since the function and the program are fixed, HALTS_IN_FN on this input is constant time. If HALTS_IN_FN returns true, the program runs forever and of course does not halt in O(f(n)) for any f(n). If HALTS_IN_FN returns false, the program halts in O(1) time.
Thus we have a contradiction, so HALTS_IN_FN cannot exist: the problem is undecidable.
A lot of people have commented that this is an inherently unsolvable problem in theory. Fair enough, but beyond that, even solving it for any but the most trivial cases would seem to be incredibly difficult.
Say you have a program with a pair of nested loops, each based on the number of items in an array: O(n^2). But what if the inner loop is only executed in a very specific set of circumstances, say in approximately log(n) cases on average? Suddenly our "obviously" O(n^2) algorithm is really O(n log n). Writing a program that could determine whether the inner loop would be run, and how often, is potentially more difficult than the original problem.
Remember O(N) isn't god; high constants can and will change the playing field. Quicksort is O(n log n) of course, but when the recursion gets small enough, say down to 20 items or so, many implementations switch tactics to a separate algorithm, as it's actually quicker to use a different sort, say insertion sort, with a worse big-O but a much smaller constant.
So, understand your data, make educated guesses, and test.
I think it's pretty much impossible to do this automatically. Remember that O(g(n)) is the worst-case upper bound and many functions perform better than that for a lot of data sets. You'd have to find the worst-case data set for each one in order to compare them. That's a difficult task on its own for many algorithms.
You must also take care when running such benchmarks. Some algorithms will have a behavior heavily dependent on the input type.
Take Quicksort for example. It is O(n²) in the worst case but usually O(n log n), and two inputs of the same size can land anywhere in between.
The traveling salesman problem is O(n!) for the brute-force algorithm, but most practical algorithms get rather good approximate solutions much faster.
This means that the benchmarking structure usually has to be adapted on an ad hoc basis. Imagine writing something generic for the two examples mentioned: it would be very complex, probably unusable, and would likely give incorrect results anyway.
Jeffrey L Whitledge is correct. A simple reduction from the halting problem proves that this is undecidable...
ALSO, if I could write this program, I'd use it to solve P vs NP, and have $1million... B-)
I'm using a big_O library (link here) that fits the change in execution time against independent variable n to infer the order of growth class O().
The package automatically suggests the best fitting class by measuring the residual from collected data against each class growth behavior.
Check the code in this answer.
Example of output,
Measuring .columns[::-1] complexity against rapid increase in # rows
--------------------------------------------------------------------------------
Big O() fits: Cubic: time = -0.017 + 0.00067*n^3
--------------------------------------------------------------------------------
Constant: time = 0.032 (res: 0.021)
Linear: time = -0.051 + 0.024*n (res: 0.011)
Quadratic: time = -0.026 + 0.0038*n^2 (res: 0.0077)
Cubic: time = -0.017 + 0.00067*n^3 (res: 0.0052)
Polynomial: time = -6.3 * x^1.5 (res: 6)
Logarithmic: time = -0.026 + 0.053*log(n) (res: 0.015)
Linearithmic: time = -0.024 + 0.012*n*log(n) (res: 0.0094)
Exponential: time = -7 * 0.66^n (res: 3.6)
--------------------------------------------------------------------------------
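If that is the big_O package on PyPI, a minimal invocation looks roughly like this (the calls below follow its README, but treat the exact signatures as an assumption):

    import big_o

    # Fit candidate growth classes to measured run times of sorted()
    # on random integer lists of increasing size n.
    best, others = big_o.big_o(
        sorted,
        lambda n: big_o.datagen.integers(n, 0, 10000),
        n_repeats=10,
    )
    print(best)  # e.g. Linearithmic: time = ... (sec)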
I guess this isn't possible in a fully automatic way since the type and structure of the input differs a lot between functions.
Well, since you can't prove whether or not a function even halts, I think you're asking a little much.
Otherwise @Godeke has it.
I don't know what your objective is in doing this, but we had a similar problem in a course I was teaching. The students were required to implement something that works at a certain complexity.
In order not to go over their solutions manually and read their code, we used the method @Godeke suggested. The objective was to find students who used a linked list instead of a balanced search tree, or students who implemented bubble sort instead of heap sort (i.e. implementations that do not work in the required complexity - but without actually reading their code).
Surprisingly, the results did not reveal students who cheated. That might be because our students are honest and want to learn (or just knew that we'll check this ;-) ). It is possible to miss cheating students if the inputs are small, or if the input itself is ordered or such. It is also possible to be wrong about students who did not cheat, but have large constant values.
But in spite of the possible errors, it is well worth it, since it saves a lot of checking time.
As others have said, this is theoretically impossible. But in practice, you can make an educated guess as to whether a function is O(n) or O(n^2), as long as you don't mind being wrong sometimes.
First, time the algorithm, running it on inputs of various sizes n. Plot the points on a log-log graph. Draw the best-fit line through the points. If the line fits all the points well, then the data suggests that the algorithm is O(n^k), where k is the slope of the line.
I am not a statistician. You should take all this with a grain of salt. But I have actually done this in the context of automated testing for performance regressions. The patch here contains some JS code for it.
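A sketch of that fit with numpy, reusing the bubble-sort timings printed near the end of this page (slope ≈ 2, as expected for an O(n^2) sort):

    import numpy as np

    # Input sizes and run times, taken from the bubble-sort timing below.
    ns = np.array([1000, 2000, 3000, 4000, 5000])
    ts = np.array([0.078, 0.344, 0.765, 1.344, 2.141])

    # Fit a line to (log n, log t); its slope estimates k in O(n^k).
    slope, intercept = np.polyfit(np.log(ns), np.log(ts), 1)
    print("estimated exponent k = %.2f" % slope)  # ~2.05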
If you have lots of homogeneous computational resources, I'd time the algorithm against several samples and do linear regression, then simply take the highest term.
It's easy to get an indication (e.g. "is the function linear? sub-linear? polynomial? exponential")
It's hard to find the exact complexity.
For example, here's a Python solution: you supply the function, and a function that creates parameters of size N for it. You get back a list of (n, time) values to plot or to perform regression analysis on. It times each call once, for speed; to get a really good indication it would have to time each one many times to minimize interference from environmental factors (e.g. with the timeit module).
import time

# Note: this is Python 2 code (xrange, and range returning a list).

def measure_run_time(func, args):
    start = time.time()
    func(*args)
    return time.time() - start

def plot_times(func, generate_args, plot_sequence):
    return [
        (n, measure_run_time(func, generate_args(n+1)))
        for n in plot_sequence
    ]
And to use it to time bubble sort:
def bubble_sort(l):
    for i in xrange(len(l)-1):
        for j in xrange(len(l)-1-i):
            if l[j+1] < l[j]:   # swap adjacent out-of-order elements
                l[j], l[j+1] = l[j+1], l[j]

import random

def gen_args_for_sort(list_length):
    result = range(list_length)  # list of 0..N-1
    random.shuffle(result)       # randomize order
    # should return a tuple of arguments
    return (result,)

# timing for N = 1000, 2000, ..., 5000
times = plot_times(bubble_sort, gen_args_for_sort, xrange(1000, 6000, 1000))

import pprint
pprint.pprint(times)
This printed on my machine:
[(1000, 0.078000068664550781),
(2000, 0.34400010108947754),
(3000, 0.7649998664855957),
(4000, 1.3440001010894775),
(5000, 2.1410000324249268)]

Is it recommended to use recursive algorithm to calculate sum of n cubes in terms of time and space efficiency?

Is it recommended to use a recursive algorithm to calculate the sum of n cubes, in terms of time and space efficiency, compared to a non-recursive one?
What exactly do you mean? Summing the first n cubes is best done by computing (n^2*(n + 1)^2)/4, but if you're given a list of numbers to sum their cubes, that's not much of an option.
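A quick Python check of that closed form:

    def sum_cubes_closed(n):
        # Sum of the first n cubes: (n^2 * (n + 1)^2) / 4, exactly.
        return (n * n * (n + 1) * (n + 1)) // 4

    def sum_cubes_loop(n):
        return sum(i ** 3 for i in range(1, n + 1))

    assert all(sum_cubes_closed(n) == sum_cubes_loop(n) for n in range(100))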
If you are in a language that does tail call optimization, a tail call recursive implementation is certainly recommended. If you're not, it may still be worth writing the recursive function if that is easier for you to reason about (a very important aspect of organizing code!). But do keep in mind that a recursion of depth n will take, depending on your language, compiler, etc., anywhere from 4*n to at least several 100*n bytes of memory, and stack space isn't unlimited.
I'd go for a loop in most languages: for large n because it is more resource-efficient, and for small n because I find it easier to read than a recursive version. But that's tied to my personal background and experience, and what is easier for you and whoever else needs to see your code may be completely different.
It depends on what you want to accomplish. If you want each step to depend on previous outcomes, you can make it recursive. Otherwise I would suggest making it non-recursive.
Most compiled languages have tail-recursion removal, and for a simple case like this it will not be a problem. Math people find it much easier to work in functional languages, so recursion comes more naturally to them. You can, however, write it very efficiently:
var sumOf0To10Cubes = Enumerable.Range(0, 11).Select(o => Math.Pow(o, 3)).Sum();
Note that math people prefer:
Sum[x^3, {x, 0, 10}]

recursion versus iteration

Is it correct to say that everywhere recursion is used a for loop could be used? And if recursion is usually slower what is the technical reason for ever using it over for loop iteration?
And if it is always possible to convert a recursion into a for loop, is there a rule-of-thumb way to do it?
Recursion is usually much slower because all function calls must be stored in a stack to allow the return back to the caller functions. In many cases, memory has to be allocated and copied to implement scope isolation.
Some optimizations, like tail call optimization, make recursions faster but aren't always possible, and aren't implemented in all languages.
The main reasons to use recursion are
that it's more intuitive in many cases when it mimics our approach of the problem
that some data structures like trees are easier to explore using recursion (or would need stacks in any case)
Of course every recursion can be modeled as a kind of loop: that's what the CPU will ultimately do. And the recursion itself, more directly, means putting the function calls and scopes on a stack. But changing your recursive algorithm to a looping one might take a lot of work and make your code less maintainable: as with every optimization, it should only be attempted when profiling or other evidence shows it to be necessary.
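A sketch of that mechanical translation in Python, for a tree traversal (Node here is a hypothetical minimal tree type):

    class Node:
        def __init__(self, value, children=()):
            self.value = value
            self.children = list(children)

    def visit_recursive(node):
        # The natural recursive depth-first traversal.
        print(node.value)
        for child in node.children:
            visit_recursive(child)

    def visit_iterative(root):
        # The same traversal with the recursion replaced by an explicit
        # stack: the loop the CPU would run anyway, now managed by hand.
        stack = [root]
        while stack:
            node = stack.pop()
            print(node.value)
            stack.extend(reversed(node.children))  # keep left-to-right order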
Is it correct to say that everywhere recursion is used a for loop could be used?
Yes, because recursion in most CPUs is modeled with loops and a stack data structure.
And if recursion is usually slower what is the technical reason for using it?
It is not "usually slower": it's recursion that is applied incorrectly that's slower. On top of that, modern compilers are good at converting some recursions to loops without even asking.
And if it is always possible to convert a recursion into a for loop, is there a rule-of-thumb way to do it?
Write iterative programs for algorithms best understood when explained iteratively; write recursive programs for algorithms best explained recursively.
For example, searching binary trees, running quicksort, and parsing expressions in many programming languages are often explained recursively. These are best coded recursively as well. On the other hand, computing factorials and calculating Fibonacci numbers are much easier to explain in terms of iteration. Using recursion for them is like swatting flies with a sledgehammer: it is not a good idea, even when the sledgehammer does a really good job at it+.
+ I borrowed the sledgehammer analogy from Dijkstra's "Discipline of Programming".
Question:
And if recursion is usually slower what is the technical reason for ever using it over for loop iteration?
Answer:
Because some algorithms are hard to solve iteratively. Try to solve depth-first search both recursively and iteratively; you will find that it is plain hard to solve DFS with iteration.
Another good thing to try: write merge sort iteratively. It will take you quite some time.
Question:
Is it correct to say that everywhere recursion is used a for loop could be used?
Answer:
Yes. This thread has a very good answer for this.
Question:
And if it is always possible to convert a recursion into a for loop is there a rule-of-thumb way to do it?
Answer:
Trust me. Try to write your own version of depth-first search iteratively. You will notice that some problems are easier to solve recursively.
Hint: recursion is good when you are solving a problem that can be solved by the divide-and-conquer technique.
Besides being slower, recursion can also result in stack overflow errors depending on how deep it goes.
To write an equivalent method using iteration, we must explicitly use a stack. The fact that the iterative version requires a stack for its solution indicates that the problem is difficult enough that it can benefit from recursion. As a general rule, recursion is most suitable for problems that cannot be solved with a fixed amount of memory and consequently require a stack when solved iteratively.
Having said that, recursion and iteration can produce the same outcome while following different patterns. Deciding which method works better is case by case, and best practice is to choose based on the pattern the problem follows.
For example, consider finding the nth number of the triangular sequence 1 3 6 10 15 …
Using an iterative algorithm:
//Triangular.java
import java.util.*;

class Triangular {
    public static int iterativeTriangular(int n) {
        int sum = 0;
        for (int i = 1; i <= n; i++)
            sum += i;
        return sum;
    }

    public static void main(String args[]) {
        Scanner stdin = new Scanner(System.in);
        System.out.print("Please enter a number: ");
        int n = stdin.nextInt();
        System.out.println("The " + n + "-th triangular number is: " +
            iterativeTriangular(n));
    }
}
Using a recursive algorithm:
//Triangular.java
import java.util.*;

class Triangular {
    public static int recursiveTriangular(int n) {
        if (n == 1)
            return 1;
        return recursiveTriangular(n-1) + n;
    }

    public static void main(String args[]) {
        Scanner stdin = new Scanner(System.in);
        System.out.print("Please enter a number: ");
        int n = stdin.nextInt();
        System.out.println("The " + n + "-th triangular number is: " +
            recursiveTriangular(n));
    }
}
Yes, as said by Thanakron Tandavas,
Recursion is good when you are solving a problem that can be solved by divide and conquer technique.
For example: Towers of Hanoi.
N rings in increasing size, 3 poles. The rings start stacked on pole 1, and the goal is to move them so that they are stacked on pole 3, but:
you can only move one ring at a time, and
you can't put a larger ring on top of a smaller one.
The iterative solution is “powerful yet ugly”; the recursive solution is “elegant”.
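For comparison with the iterative Delphi version earlier on this page, the elegant recursive solution fits in a few lines of Python:

    def hanoi(n, source, target, spare):
        # Move n disks from source to target; the code reads almost exactly
        # like the inductive argument that the puzzle is solvable.
        if n == 0:
            return
        hanoi(n - 1, source, spare, target)   # clear the n-1 smaller disks
        print("move disk %d from pole %d to pole %d" % (n, source, target))
        hanoi(n - 1, spare, target, source)   # put them back on top

    hanoi(3, 1, 3, 2)  # 2^3 - 1 = 7 moves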
I seem to remember my computer science professor saying, back in the day, that all problems that have recursive solutions also have iterative solutions. He said that a recursive solution is usually slower, but that recursion is frequently used when it is easier to reason about and code than an iterative solution.
However, in the case of more advanced recursive solutions, I don't believe it will always be possible to implement them using a simple for loop.
Most of the answers seem to assume that iterative = for loop. If your for loop is unrestricted (a la C, you can do whatever you want with your loop counter), then that is correct. If it's a real for loop (say as in Python or most functional languages where you cannot manually modify the loop counter), then it is not correct.
All (computable) functions can be implemented both recursively and using while loops (or conditional jumps, which are basically the same thing). If you truly restrict yourself to for loops, you will only get a subset of those functions (the primitive recursive ones, if your elementary operations are reasonable). Granted, it's a pretty large subset which happens to contain every single function you're likely to encounter in practice.
What is much more important is that a lot of functions are very easy to implement recursively and awfully hard to implement iteratively (manually managing your call stack does not count).
Recursion + memoization can lead to a more efficient solution compared with a pure iterative approach, e.g. check this:
http://jsperf.com/fibonacci-memoized-vs-iterative-for-large-n
Short answer: the trade-off is that recursion is faster and for loops take up less memory, in almost all cases. However, there are usually ways to change the for loop or the recursion to make it run faster.

Your favourite algorithm and the lesson it taught you [closed]

What algorithm taught you the most about programming or a specific language feature?
We have all had those moments where all of a sudden we know, just know, we have learned an important lesson for the future based on finally understanding an algorithm written by a programmer a couple of steps up the evolutionary ladder. Whose ideas and code had the magic touch on you?
General algorithms:
Quicksort (and its average-case complexity analysis), which shows that randomizing your input can be a good thing!;
balanced trees (AVL trees for example), a neat way to balance search/insertion costs;
Dijkstra and Ford-Fulkerson algorithms on graphs (I like the fact that the second one has many applications);
the LZ* family of compression algorithms (LZW for example), data compression sounded kind of magic to me until I discovered it (a long time ago :) );
the FFT, ubiquitous (re-used in so many other algorithms);
the simplex algorithm, ubiquitous as well.
Numerical related:
Euclid's algorithm to compute the gcd of two integers: one of the first algorithms, simple and elegant, powerful, has lots of generalizations;
fast multiplication of integers (Cooley-Tukey for example);
Newton iterations to invert / find a root, a very powerful meta-algorithm.
Number theory-related:
AGM-related algorithms (examples): leads to very simple and elegant algorithms to compute pi (and much more!), though the theory is quite profound (Gauss introduced elliptic functions and modular forms from it, so you can say that it gave birth to algebraic geometry...);
the number field sieve (for integer factorization): very complicated, but quite a nice theoretical result (this also goes for the AKS algorithm, which proved that PRIMES is in P).
I also enjoyed studying quantum computing (the Shor and Deutsch-Jozsa algorithms, for example): it teaches you to think outside the box.
As you can see, I'm a bit biased towards maths-oriented algorithms :)
"To iterate is human, to recurse divine" - quoted in 1989 at college.
P.S. Posted by Woodgnome while waiting for invite to join
Floyd-Warshall all-pairs shortest paths algorithm
procedure FloydWarshall()
    for k := 1 to n
        for i := 1 to n
            for j := 1 to n
                path[i][j] = min(path[i][j], path[i][k] + path[k][j]);
Here's why it's cool: when you first learn about the shortest-path problem in your graph theory course, you probably start with Dijkstra's algorithm, which solves single-source shortest path. It's quite complicated at first, but then you get over it and fully understand it.
Then the teacher says "Now we want to solve the same problem but for ALL sources". You think to yourself, "Oh god, this is going to be a much harder problem! It's going to be at least N times more complicated than Dijkstra's algorithm!!!".
Then the teacher gives you Floyd-Warshall. And your mind explodes. Then you start to tear up at how beautifully simple the algorithm is. It's just a triply-nested loop. It only uses a simple array for its data structure.
The most eye-opening part for me is the following realization: say you have a solution for problem A. Then you have a bigger "superproblem" B which contains problem A. The solution to problem B may in fact be simpler than the solution to problem A.
This one might sound trivial but it was a revelation for me at the time.
I was in my very first programming class (VB6), and the Prof had just taught us about random numbers. He gave the following instructions: "Create a virtual lottery machine. Imagine a glass ball full of 100 ping pong balls marked 0 to 99. Pick them randomly and display their number until they have all been selected, no duplicates."
Everyone else wrote their program like this: pick a ball, put its number into an "already selected" list, and then pick another ball. Check to see if it's already selected; if so, pick another ball, if not, put its number on the "already selected" list, etc....
Of course by the end they were making hundreds of comparisons to find the few balls that had not already been picked. It was like throwing the balls back into the jar after selecting them. My revelation was to throw balls away after picking.
I know this sounds mind-numbingly obvious but this was the moment that the "programming switch" got flipped in my head. This was the moment that programming went from trying to learn a strange foreign language to trying to figure out an enjoyable puzzle. And once I made that mental connection between programming and fun there was really no stopping me.
Huffman coding would be mine. I had originally made my own dumb version by minimizing the number of bits used to encode text from 8 down to fewer, but had not thought about using a variable number of bits depending on frequency. Then I found Huffman coding described in a magazine article, and it opened up lots of new possibilities.
Quicksort. It showed me that recursion can be powerful and useful.
Bresenham's line drawing algorithm got me interested in realtime graphics rendering. This can be used to render filled polygons, like triangles, for things like 3D model rendering.
Recursive Descent Parsing - I remember being very impressed how such simple code could do something so seemingly complex.
Quicksort in Haskell:
qsort [] = []
qsort (x:xs) = qsort (filter (< x) xs) ++ [x] ++ qsort (filter (>= x) xs)
Although I couldn't write Haskell at the time, I did understand this code, and with it recursion and the quicksort algorithm. It just clicked, and there it was...
The iterative algorithm for Fibonacci, because for me it nailed down the fact that the most elegant code (in this case, the recursive version) is not necessarily the most efficient.
To elaborate: the "fib(10) = fib(9) + fib(8)" approach means that fib(9) will be evaluated as fib(8) + fib(7), so fib(8) (and therefore fib(7), fib(6), and so on) will all be evaluated redundantly.
The iterative method, (curr = prev1 + prev2 in a forloop) does not tree out this way, nor does it take as much memory since it's only 3 transient variables, instead of n frames in the recursion stack.
I tend to strive for simple, elegant code when I'm programming, but this is the algorithm that helped me realize that this isn't the end-all-be-all for writing good software, and that ultimately the end users don't care how your code looks.
For some reason I like the Schwartzian transform
@sorted = map  { $_->[0] }
          sort { $a->[1] cmp $b->[1] }
          map  { [$_, foo($_)] }
          @unsorted;
Where foo($_) represents a compute-intensive expression that takes $_ (each item of the list in turn) and produces the corresponding value that is to be compared on its behalf.
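The same decorate-sort-undecorate idea expressed in Python (unsorted and foo below stand in for the answer's input list and expensive function):

    import math

    unsorted = [9, 25, 4, 16]
    foo = math.sqrt  # stand-in for a compute-intensive key function

    # Decorate: compute the expensive key once per item.
    decorated = [(foo(x), x) for x in unsorted]
    # Sort on the precomputed key.
    decorated.sort()
    # Undecorate: strip the keys back off.
    sorted_items = [x for _, x in decorated]

    # Python's key= argument performs the same transform internally:
    assert sorted_items == sorted(unsorted, key=foo)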
Minimax taught me that chess programs aren't smart, they can just think more moves ahead than you can.
I don't know if this qualifies as an algorithm, or just a classic hack. In either case, it helped to get me to start thinking outside the box.
Swap 2 integers without using an intermediate variable (in C++)
void InPlaceSwap(int& a, int& b) {
    a ^= b;
    b ^= a;
    a ^= b;  // note: zeroes both values if a and b alias the same object
}
Quicksort: Until I got to college, I had never questioned whether brute force Bubble Sort was the most efficient way to sort. It just seemed intuitively obvious. But being exposed to non-obvious solutions like Quicksort taught me to look past the obvious solutions to see if something better is available.
For me it's the weak-heapsort algorithm because it shows (1) how much a wise chosen data structure (and the algorithms working on it) can influence the performance and (2) that fascinating things can be discovered even in old, well-known things. (weak-heapsort is the best variant of all heap sorts, which was proven eight years later.)
This is a slow one :)
I learned lots about both C and computers in general by understanding Duff's Device and XOR swaps
EDIT:
@Jason Z, that's my XOR swap :) cool isn't it.
For some reason Bubble Sort has always stood out to me. Not because it's elegant or good just because it had/has a goofy name I suppose.
The iterative algorithm for Fibonacci, because for me it nailed down the fact that the most elegant code (in this case, the recursive version) is not necessarily the most efficient.
The iterative method, (curr = prev1 + prev2 in a forloop) does not tree out this way, nor does it take as much memory since it's only 3 transient variables, instead of n frames in the recursion stack.
You know that fibonacci has a closed form solution that allows direct computation of the result in a fixed number of steps, right? Namely, (phi^n - (1 - phi)^n) / sqrt(5). It always strikes me as somewhat remarkable that this should yield an integer, but it does.
phi is the golden ratio, of course; (1 + sqrt(5)) / 2.
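A quick Python check of that formula (rounding absorbs the floating-point error, though in double precision the result is only trustworthy up to roughly n = 70):

    import math

    PHI = (1 + math.sqrt(5)) / 2

    def fib_closed(n):
        # Binet's formula: the (1 - phi)^n term shrinks toward zero, so
        # rounding the dominant term recovers the exact integer for moderate n.
        return round((PHI ** n - (1 - PHI) ** n) / math.sqrt(5))

    print([fib_closed(n) for n in range(10)])  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]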
I don't have a favourite -- there are so many beautiful ones to pick from -- but one I've always found intriguing is the Bailey–Borwein–Plouffe (BBP) formula, which enables you to calculate an arbitrary digit of pi without knowledge about the preceding digits.
RSA introduced me to the world of modular arithmetic, which can be used to solve a surprising number of interesting problems!
Hasn't taught me much, but the Johnson–Trotter Algorithm never fails to blow my mind.
Binary decision diagrams, though formally not an algorithm but a data structure, lead to elegant and minimal solutions for various sorts of (boolean) logic problems. They were invented and developed to minimise the gate count in chip design, and can be viewed as one of the foundations of the silicon revolution. The resulting algorithms are amazingly simple.
What they taught me:
a compact representation of any problem is important; small is beautiful
a small set of constraints/reductions applied recursively can be used to accomplish this
for problems with symmetries, transformation to a canonical form should be the first step to consider
not every piece of literature gets read: Knuth found out about BDDs several years after their invention/introduction (and spent almost a year investigating them)
For me, the simple swap in Kelly & Pohl's A Book on C to demonstrate call-by-reference flipped me out when I first saw it. I looked at that, and pointers snapped into place. Verbatim...
void swap(int *p, int *q)
{
    int temp;

    temp = *p;
    *p = *q;
    *q = temp;
}
The Towers of Hanoi algorithm is one of the most beautiful algorithms. It shows how you can use recursion to solve a problem in a much more elegant fashion than the iterative method.
Alternatively, the recursive algorithms for the Fibonacci series and for calculating powers of a number demonstrate the reverse situation: recursion used for the sake of recursion rather than for the value it provides.
An algorithm that generates a list of primes by testing each number against the current list of primes, adding it if no prime in the list divides it, and returning the list of primes at the end. Mind-bending in several ways, not the least of which being the idea of using the partially-completed output as the primary search criterion.
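A direct Python rendering of that idea:

    def primes_up_to(limit):
        primes = []
        for candidate in range(2, limit + 1):
            # Use the partially-built output as the test set: candidate is
            # prime iff no smaller prime divides it.
            if all(candidate % p != 0 for p in primes):
                primes.append(candidate)
        return primes

    print(primes_up_to(30))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]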
Storing two pointers in a single word for a doubly linked list taught me the lesson that you can do very bad things in C indeed (things a conservative GC will have lots of trouble with).
The most proud I've been of a solution was writing something very similar to the DisplayTag package. It taught me a lot about code design, maintainability, and reuse. I wrote it well before DisplayTag, and it was sunk into an NDA agreement, so I couldn't open source it, but I can still speak gushingly about that one in job interviews.
Map/Reduce. Two simple concepts that fit together to make a load of data-processing tasks easier to parallelize.
Oh... and it's only the basis of massively-parallel indexing:
http://labs.google.com/papers/mapreduce.html
Not my favorite, but the Miller-Rabin algorithm for testing primality showed me that being right almost all the time is good enough almost all the time. (i.e. Don't mistrust a probabilistic algorithm just because it has a probability of being wrong.)
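A compact Python sketch of that test (a standard Miller-Rabin implementation, supplied here for illustration; it is not from any answer above):

    import random

    def is_probably_prime(n, rounds=20):
        # Each random witness that fails to prove n composite cuts the
        # error probability by at least a factor of 4.
        if n < 2:
            return False
        for p in (2, 3, 5, 7):
            if n % p == 0:
                return n == p
        d, s = n - 1, 0
        while d % 2 == 0:
            d //= 2
            s += 1
        for _ in range(rounds):
            a = random.randrange(2, n - 1)
            x = pow(a, d, n)
            if x in (1, n - 1):
                continue
            for _ in range(s - 1):
                x = pow(x, 2, n)
                if x == n - 1:
                    break
            else:
                return False          # a witnesses that n is composite
        return True                   # composite with probability <= 4**-rounds

    print([k for k in range(2, 40) if is_probably_prime(k)])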
@Krishna Kumar
The bitwise solution is even more fun than the recursive solution.
