How is the average complexity of an algorithm calculated? The worst case is obvious, the best case too, but how is the average calculated?
Calculate the complexity for all possible inputs and take a weighted sum based on their probabilities. This is also called the expected runtime (analogous to expectation in probability theory).
E[T] = P(X = I1)*T(I1) + P(X = I2)*T(I2) + P(X = I3)*T(I3) + ...
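For instance, a quick sketch of that weighted sum in Python; the probabilities and step counts below are made up purely to show the arithmetic:

probabilities = {"I1": 0.5, "I2": 0.3, "I3": 0.2}   # assumed P(X = Ik); must sum to 1
steps         = {"I1": 10,  "I2": 25,  "I3": 100}   # assumed T(Ik), e.g. comparison counts

expected_steps = sum(probabilities[i] * steps[i] for i in probabilities)
print(expected_steps)   # 0.5*10 + 0.3*25 + 0.2*100 = 32.5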
Average performance (time, space, etc.) complexity is found by considering all possible inputs of a given size and stating the asymptotic bound for the average of the respective measure across all those inputs.
For example, average "number of comparisons" complexity for a sort would be found by considering all N! permutations of input of size N and stating bounds on the average number of comparisons performed across all those inputs.
That is, it is the sum of the numbers of comparisons over all N! possible inputs, divided by N!.
Because the average performance across all possible inputs is equal to the expected value of the same performance measure, average performance is also called expected performance.
Quicksort presents an interesting, non-trivial example of calculating average run-time performance. As you can see, the math can get quite involved, so unfortunately I don't think there's a general equation for calculating average performance.
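As a brute-force illustration of the procedure, the sketch below enumerates all N! permutations for a small N, counts the comparisons made by one simple quicksort variant (first element as pivot, chosen here only for illustration), and averages them:

from itertools import permutations
from math import factorial

def quicksort_comparisons(arr):
    # Comparisons made by a simple quicksort using the first element as pivot;
    # we count one comparison per non-pivot element against the pivot at each level.
    if len(arr) <= 1:
        return 0
    pivot, rest = arr[0], arr[1:]
    left  = [x for x in rest if x < pivot]
    right = [x for x in rest if x >= pivot]
    return len(rest) + quicksort_comparisons(left) + quicksort_comparisons(right)

n = 7
avg = sum(quicksort_comparisons(list(p)) for p in permutations(range(n))) / factorial(n)
print(avg)   # average comparisons over all n! inputs of size n; this grows as Theta(n log n)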
We have three ways to evaluate an algorithm:
Worst case
Best case
Average case
The first tells us to look at the worst possible input for the algorithm and evaluate its performance.
The second tells us to look at the best input for our algorithm.
The last tells us to look at the average case of input to the algorithm, so it may be a more accurate measure of an algorithm's performance.
Why don't we also evaluate an algorithm by its median case? Surely that would be more accurate than the average case, or at least a complementary measure to it.
With the median, we look at an input such that half of the possible inputs fall below it and half above it.
The median gives weight to typical inputs in a way that the average may not.
Median doesn't really have very useful statistical properties.
One useful thing about the average is that it accounts for how likely (or unlikely) it is that you will hit a bad input.
Suppose that the run-time of your algorithm is f(n) in 60% of cases and g(n) in 40% of cases, where g(n) >> f(n). Then your median is Θ(f(n)), but your solution would often not fit in a time slot sized for an f(n) algorithm. However, even if the probability of g(n) is a very small constant, the average will still be Θ(g(n)), alerting you that the algorithm might run for a long time.
Another useful property of the expected value is that it sums: if you have a number of tasks executed sequentially, then the average total run-time is equal to the sum of the average run-times. This makes averages easier to both derive and use. There is no similar property for medians.
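A small numeric sketch of the median-vs-average point, with a made-up run-time distribution (99% of inputs cost n steps, 1% cost n^2 steps):

# Made-up run-time distribution for inputs of size n:
# 99% of inputs cost n steps ("easy"), 1% cost n**2 steps ("bad").
def median_cost(n):
    return n                        # the 50th-percentile input is an easy one

def average_cost(n):
    return 0.99 * n + 0.01 * n**2   # the expectation picks up the rare expensive inputs

for n in (100, 10_000, 1_000_000):
    print(n, median_cost(n), average_cost(n))
# The median stays Theta(n) while the average grows like n**2, warning you
# that the algorithm can occasionally run for a very long time.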
I know that O(log n) refers to an iterative reduction of the problem set N by a fixed ratio (in big O notation), but how do I actually calculate how many iterations an algorithm with log N complexity would have to perform on the problem set N before it is done (has one element left)?
You can't. You don't calculate the exact number of iterations with BigO.
You can "derive" BigO when you have an exact formula for the number of iterations.
BigO just gives information about how the number of iterations grows as N grows, and only for "big" N.
Nothing more, nothing less. With this, plus some sample runs, you can draw conclusions about how many more operations (or how much more time) the algorithm will take.
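That said, for some algorithms the exact iteration count is easy to write down, and the big O is then just its growth shape. A sketch with binary search, whose worst-case count is the standard floor(log2 N) + 1:

from math import floor, log2

def binary_search_iterations(n):
    # Loop iterations of a binary search over n sorted items when the search
    # always continues into the larger half (the longest possible run).
    lo, hi, iterations = 0, n - 1, 0
    while lo <= hi:
        iterations += 1
        mid = (lo + hi) // 2
        lo = mid + 1
    return iterations

for n in (10, 1000, 1_000_000):
    print(n, binary_search_iterations(n), floor(log2(n)) + 1)
# For these n the exact count equals floor(log2(n)) + 1; O(log n) only records that growth shape.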
Expressed in the words of Tim Roughgarden in his courses on algorithms:
The big-Oh notation tries to provide a sweet spot for high level algorithm reasoning
That means it is intended to describe the relation between an algorithm's execution time and the size of its input, while avoiding dependencies on the system architecture, programming language or chosen compiler.
Imagine that big-Oh notation could provide the exact execution time: that would mean that, for any algorithm whose big-Oh time complexity function you know, you could predict how it would behave on any machine whatsoever.
On the other hand, it is centered on asymptotic behaviour. That is, its description is more accurate for big n values (that is why lower-order terms of your algorithm's time function are ignored in big-Oh notation). One can reason that small n values do not demand that you push forward trying to improve your algorithm's performance.
Big O notation only shows the order of growth, not the actual number of operations the algorithm would perform. If you need to calculate the exact number of loop iterations or elementary operations, you have to do it by hand. However, for most practical purposes the exact number is irrelevant: O(log n) tells you that the number of operations will rise logarithmically as n rises.
From big O notation you can't tell precisely how many iterations the algorithm will do; it's just an estimate. For small n, the difference between log(n) and the actual number of iterations can be significant, but the closer you get to infinity, the less significant the difference becomes.
If you make some assumptions, you can estimate the time up to a constant factor. The big assumption is that the limiting behavior as the size tends to infinity is the same as the actual behavior for the problem sizes you care about.
Under that assumption, the upper bound on the time for a size N problem is C*log(N) for some constant C. The constant will change depending on the base you use for calculating the logarithm. The base does not matter as long as you are consistent about it. If you have the measured time for one size, you can estimate C and use that to guesstimate the time for a different size.
For example, suppose a size 100 problem takes 20 seconds. Using common logarithms, C is 10. (The common log of 100 is 2). That suggests a size 1000 problem might take about 30 seconds, because the common log of 1000 is 3.
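A minimal sketch of that extrapolation (the size-100, 20-second figures are just the made-up measurement from above):

from math import log10

measured_n, measured_seconds = 100, 20.0    # the hypothetical sample run from above
C = measured_seconds / log10(measured_n)    # assume time is roughly C * log10(n), so C = 10 here

for n in (1000, 10_000, 100_000):
    print(n, C * log10(n))                  # 30.0, 40.0, 50.0 seconds -- rough guesses only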
However, this is very rough. The approach is most useful for estimating whether an algorithm might be usable for a large problem. In that sort of situation, you also have to pay attention to memory size. Generally, setting up a problem will be at least linear in size, so its cost will grow faster than an O(log N) operation.
I have tried hard, but I'm unable to come up with the expected running time (number of comparisons) for the randomized median algorithm (finding the median of an unsorted array in O(n) time). Also, I wanted to make sure that we CANNOT take the expectation of the recurrence we use to find the randomized median, or of any other recurrence in any other problem, as the terms belong to different probability spaces. Is this statement right?
This depends on the algorithm; the general name for the problem is selection, solved by a selection algorithm. One popular algorithm is quickselect, whose average performance is linear (i.e. the number of comparisons is about k*N for a small constant k, roughly 2 to 3.5 depending on which element you select), but whose worst-case performance is bad, O(N^2). There are other algorithms with other trade-offs.
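To see the linear average empirically, here is a sketch of a randomized quickselect that counts comparisons while finding the median; the counting convention (roughly one comparison per element against the pivot at each level) is an assumption of the sketch:

import random

def quickselect(arr, k, counter):
    # Return the k-th smallest element of arr (0-based), counting roughly
    # one comparison per element compared against the pivot at each level.
    if len(arr) == 1:
        return arr[0]
    pivot = random.choice(arr)
    lows   = [x for x in arr if x < pivot]
    highs  = [x for x in arr if x > pivot]
    pivots = [x for x in arr if x == pivot]
    counter[0] += len(arr)
    if k < len(lows):
        return quickselect(lows, k, counter)
    if k < len(lows) + len(pivots):
        return pivot
    return quickselect(highs, k - len(lows) - len(pivots), counter)

n, trials, total = 501, 1000, 0
for _ in range(trials):
    data = random.sample(range(10 * n), n)
    counter = [0]
    quickselect(data, n // 2, counter)      # select the median
    total += counter[0]
print(total / trials / n)   # comparisons per element: a small constant, not growing with n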
Suppose you run an O(log n) algorithm with an input size of 1000 and the algorithm requires 110 operations. When you double the input size to 2000, the algorithm now requires 120 operations. What is your best guess for the number of operations required when you again double the input size to 4000?
Big-O notation indicates the runtime of the algorithm with respect to the input size in the worst case. It does not predict anything about the actual number of operations, and it does not take lower-order terms and constant factors into account.
There's an additive constant, corresponding to run-time overhead, in the solution, i.e. a model of the form f(n) = a + c*log(n). The following presumes that the result is Θ(log n) rather than just O(log n). Under that model, doubling n always adds the same number of operations (here 10, going from 1000 to 2000), so the best guess for n = 4000 is about 130 operations.
You could go on and explicitly solve for the constants if you wanted to make generalized predictions, but doing so based on two points would be pretty dubious.
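With that caveat in mind, here is a minimal sketch of what explicitly solving for the constants looks like, assuming the f(n) = a + c*log2(n) model described above:

from math import log2

# Model with an additive constant: f(n) = a + c*log2(n), fitted to the two data points.
n1, ops1 = 1000, 110
n2, ops2 = 2000, 120

c = (ops2 - ops1) / (log2(n2) - log2(n1))   # log2(2000) - log2(1000) = 1, so c = 10
a = ops1 - c * log2(n1)

print(a + c * log2(4000))   # 130.0 -- each doubling of n adds another c = 10 operations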
Let f(n) be the estimate of the number of operations; just put your question into an equation:
f(n) = c * log(n) // O(log n) algorithm
f(1000) = 110
f(2000) = 120
f(4000) = ?
Find c and you'll have your answer. But of course, it would only be a best-guess estimate based on the given data and the limiting behavior of f.
It won't be an accurate prediction for multiple reasons:
Big O notation only gives you the limiting behavior of the algorithm's complexity, not the actual complexity formula.
The number of operations may depend strongly on the nature of the data, not just on its size n.
The limiting behavior is calculated for a particular case of the possible data (usually the worst case).
While answering this question, a debate began in the comments about the complexity of QuickSort. What I remember from my university days is that QuickSort is O(n^2) in the worst case, O(n log(n)) in the average case and O(n log(n)) (but with a tighter bound) in the best case.
What I need is a correct mathematical explanation of the meaning of average complexity, to explain clearly what it is about to someone who believes the big-O notation can only be used for the worst case.
What I remember is that to define average complexity you should consider the complexity of the algorithm for all possible inputs and count how many degenerate and normal cases there are. If the number of degenerate cases divided by n tends towards 0 as n gets big, then you can speak of the average complexity of the overall function being that of the normal cases.
Is this definition right, or is the definition of average complexity different? And if it's correct, can someone state it more rigorously than I have?
You're right.
Big O (big Theta, etc.) is used to measure functions. When you write f = O(g), it doesn't matter what f and g mean. They could be average time complexity, worst-case time complexity, space complexity, the distribution of primes, etc.
Worst-case complexity is a function that takes a size n and tells you the maximum number of steps the algorithm performs on an input of size n.
Average-case complexity is a function that takes a size n and tells you the expected number of steps the algorithm performs on an input of size n.
As you can see, worst-case and average-case complexity are both functions, so you can use big O to express their growth.
If you're looking for a formal definition, then:
Average complexity is the expected running time for a random input.
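Spelled out in the same style as the formula earlier in this thread (a standard formulation, assuming the inputs of size n are drawn from some distribution, usually uniform; here I_n denotes the set of inputs of size n and T(x) the number of steps on input x):

A(n) = E[T] = sum over all x in I_n of P(X = x)*T(x)

which, for the uniform distribution, is just (1/|I_n|) * sum over all x in I_n of T(x), i.e. the plain average over all inputs of size n.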
Let's refer to Big O notation on Wikipedia:
Let f and g be two functions defined on some subset of the real numbers. One writes f(x)=O(g(x)) as x --> infinity if ...
So the premise of the definition states that the function f should take a number as input and yield a number as output. What input number are we talking about? It is supposedly the number of elements in the sequence to be sorted. What output number could we be talking about? It could be the number of operations done to order the sequence. But stop: what is a function? From Wikipedia:
a function is a relation between a set of inputs and a set of permissible outputs with the property that each input is related to exactly one output.
Are we producing exactly one output with our prior definition? No, we are not. For a given sequence size we can get a wide variation in the number of operations. So to ensure the definition is applicable to our case, we need to reduce the set of possible outcomes (numbers of operations) to a single value. It can be a maximum ("the worst case"), a minimum ("the best case") or an average.
The conclusion is that talking about the best/worst/average case is mathematically correct, and that using big O notation without specifying one of them, in the context of sorting complexity, is somewhat sloppy.
On the other hand, we could be more precise and use big Theta notation instead of big O notation.
I think your definition is correct, but your conclusions are wrong.
It's not necessarily true that if the proportion of "bad" cases tends to 0, then the average complexity is equal to the complexity of the "normal" cases.
For example, suppose that a fraction 1/(n^2) of the cases are "bad" and the rest are "normal", and that "bad" cases take exactly n^4 operations, whereas "normal" cases take exactly n operations.
Then the average number of operations required is equal to:
(n^4)/(n^2) + n(n^2 - 1)/(n^2) = n^2 + n - 1/n
This function is O(n^2), but not O(n).
In practice, though, you might find that time is polynomial in all cases, and the proportion of "bad" cases shrinks exponentially. That's when you'd ignore the bad cases in calculating an average.
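A quick numerical check of the 1/(n^2) counterexample above:

# A fraction 1/n^2 of the inputs are "bad" and cost n^4 steps; the rest cost n steps.
def average_ops(n):
    bad_fraction = 1 / n**2
    return bad_fraction * n**4 + (1 - bad_fraction) * n

for n in (10, 100, 1000):
    print(n, average_ops(n), n**2)
# The average tracks n^2 even though the bad cases become vanishingly rare.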
Average case analysis does the following:
Take all inputs of a fixed length (say n), sum up the running times over all instances of this length, and compute the average.
The problem is that you will probably have to enumerate all inputs of length n in order to come up with an average complexity.
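For an input space small enough to enumerate, this is literally doable. As a toy illustration (linear search, where the "input" is taken to be just the position of the target), the averaged cost comes out to the textbook (n + 1)/2:

def linear_search_comparisons(items, target):
    # Number of comparisons a linear search performs before finding target.
    for i, x in enumerate(items):
        if x == target:
            return i + 1
    return len(items)

n = 1000
items = list(range(n))
# Enumerate every possible successful-search input and average the cost.
average = sum(linear_search_comparisons(items, t) for t in items) / n
print(average)   # (n + 1) / 2 = 500.5 comparisons on average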