Can someone please explain to me how one can determine the worst-case complexity of an algorithm. I know that we need to use the equation W(n) = max{ t(I) | I ∈ D }, where D is the set of inputs of size n. Do I calculate the number of operations performed for each input I and then take the maximum? Is there an easier way to accomplish this?
Starting from the equation is thinking of it a bit backwards. What you really care about is scalability, or, what is it going to do as you increase the size of the input.
If you just have a single loop over the input, for instance, you have an O(n) time complexity algorithm. If you have a loop nested within another loop, though, it becomes O(n^2), because the inner loop runs n times for each of the n iterations of the outer loop.
When you are talking about worst case, you are usually talking about algorithms whose running time can vary, for example a loop that can stop prematurely. What you want to do for this is assume the worst and pretend the loop will stop as late as possible. So if we have:
for (int i = 0; i < n; i++) {
    for (int j = 0; j < n; j++) {
        // flip a coin; on "heads" bail out of the inner loop early
        if (rand() / (double) RAND_MAX > 0.5) break;
    }
}
We would say that the worst case is O(n^2). Even though we know it is very likely that the inner loop will bail out early, we are looking for the worst possible performance.
That equation is more of a definition than an algorithm.
Does the algorithm in question care about anything other than the size of its input? If not then calculating W(n) is "easy".
If it does, try to come up with a pathological input. For example, with quicksort (using a naive pivot choice such as always picking the first element) it might be fairly obvious that a sorted input is pathological, and you can do some counting to see that it takes O(n^2) steps. At that point you can either
Argue that your input is "maximally" pathological
Exhibit a matching upper bound on the runtime on any input
Example of #1:
Each pass of quicksort will put the pivot in the right place, and then recurse on the two parts. (handwave alert) The worst case is to have the rest of the array end up on one side of the pivot every time, and a sorted input (with a first-element pivot) achieves this.
Example of #2:
Each pass of quicksort puts the pivot in the right place, so there are no more than O(n) passes. Each pass requires no more than O(n) work. As such, no input can cause quicksort to take more than O(n^2).
In this case #2 is a lot easier.
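To make argument #1 concrete, here is a rough Java sketch (not from the original answer; it assumes a naive first-element pivot, since the question does not fix a pivot rule) that counts comparisons. Feeding it an already-sorted array yields about n*(n-1)/2 comparisons, i.e. quadratic growth:

public class QuicksortCount {
    static long comparisons = 0;

    // Quicksort with the first element as pivot (Lomuto-style partition).
    static void quicksort(int[] a, int lo, int hi) {
        if (lo >= hi) return;
        int pivot = a[lo];
        int i = lo;
        for (int j = lo + 1; j <= hi; j++) {
            comparisons++;                           // one comparison per element in this pass
            if (a[j] < pivot) {
                i++;
                int t = a[i]; a[i] = a[j]; a[j] = t;
            }
        }
        int t = a[lo]; a[lo] = a[i]; a[i] = t;       // put the pivot in its final place
        quicksort(a, lo, i - 1);
        quicksort(a, i + 1, hi);
    }

    public static void main(String[] args) {
        int n = 2000;
        int[] sorted = new int[n];
        for (int k = 0; k < n; k++) sorted[k] = k;   // the pathological, already-sorted input
        quicksort(sorted, 0, n - 1);
        // Every pass leaves the rest of the range on one side of the pivot, so the passes
        // cost (n-1) + (n-2) + ... + 1 = n*(n-1)/2 comparisons.
        System.out.println("comparisons on sorted input of size " + n + ": " + comparisons);
    }
}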
I have a question about O-notation (big O).
In my code, I am using a for loop to iterate through an array of users.
The for loop has an if-statement that breaks out of the loop if the right user is found.
My question is: how do I measure the O-notation?
Is the O-notation O(N), since I loop through all the users in the array?
Or is the O-notation O(1), since the loop breaks and never runs again?
O notation defines an "order of" relationship between an amount of work (however measured) and the number of items processed (usually 'n'). So "O(n)" means "in direct proportion to the number of items n", and "O(1)" simply means "constant". If a loop processes every item once then the amount of work is intuitively in direct proportion to n. If your exit condition gets hit, on average, halfway through, we might be tempted to say that this is O(n/2), but we still say it is O(n) because the relationship to n is still direct/linear. Similarly, if you were to assess the relationship to be O(7n^3 + 2n), you'd say it was simply O(n^3), because n^3 is the term that dominates as n grows large.
The answer to your specific question is therefore O(n) because the number of iterations is in direct proportion to n. All that this says is that if N user records take M milliseconds to process, 2N should take about 2M milliseconds.
It is probably worth noting that O notation is usually quoted for the worst case rather than the average cost of an algorithm (although it is quite common for people to use it in the latter sense). It is always a good idea to specify which you mean, to avoid ambiguity.
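For concreteness, here is a minimal sketch of the kind of loop the question describes, assuming the users are Strings in an array and are matched by equality (the names findUser and target are made up for illustration). The worst case (match at the end, or no match at all) does n iterations; the best case (match at the front) does one:

public class UserSearch {
    // Linear search with early exit, like the loop described in the question.
    static int findUser(String[] users, String target) {
        for (int i = 0; i < users.length; i++) {
            if (users[i].equals(target)) {
                return i;    // early exit: best case is O(1) when the right user is first
            }
        }
        return -1;           // had to look at every user: worst case is O(N)
    }

    public static void main(String[] args) {
        String[] users = {"ann", "bob", "cara", "dave"};
        System.out.println(findUser(users, "ann"));   // best case: 1 iteration
        System.out.println(findUser(users, "zoe"));   // worst case: 4 iterations, returns -1
    }
}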
Big O notation answers the following two questions:
If there are N data elements, how many steps will the algorithm take?
How will the performance of the algorithm change if the number of data elements increases?
The best-case scenario in your case is that the user you are searching for is found at the first index. The time complexity in that case would be O(1), because the number of steps taken by the algorithm is constant and does not change as the number of elements in the array grows.
The worst-case scenario is that your loop will have to iterate over all the users. That makes the time complexity O(N), because the number of steps taken by the algorithm is directly proportional to the number of elements in the array.
Big O notation generally refers to the worst-case scenario, so you can say that the time complexity in your case is O(N).
The best-case complexity of your for loop is O(1) and the worst-case complexity is O(N). In a linear search, the best case is O(1) (the element is found at the first position) and the worst case is O(N). It also depends on the approach you use to solve the problem: a loop like for(int i = n; i > 1; i = i/2) has complexity O(log N), and the complexity of an if-else condition is O(1).
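And for the halving loop mentioned just above, a quick counter (a sketch, not from the original answer) shows the iteration count growing like log2(N) rather than N:

public class HalvingLoop {
    public static void main(String[] args) {
        for (int n : new int[] {16, 1024, 1000000}) {
            int steps = 0;
            for (int i = n; i > 1; i = i / 2) {
                steps++;              // i halves each time, so about log2(n) iterations
            }
            System.out.println("n=" + n + " -> " + steps + " iterations");
        }
    }
}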
Suppose that I have 2 nested for loops, and 1 array of size N as shown in my code below:
int result = 0;
for (int i = 0; i < N; i++)
{
    for (int j = i; j < N; j++)
    {
        result = array[i] + array[j]; // just some funny operation
    }
}
Here are 2 cases:
(1) if the constraint is that N >= 1,000,000 strictly, then we can definitely say that the time complexity is O(N^2). This is true for sure as we all know.
(2) Now, if the constraint is that N < 25 strictly, could people say that, because N is always so small, the time complexity is effectively O(1), since these two for loops take very little time to run and complete ON MODERN COMPUTERS? Does that sound right?
Please tell me whether the value of N plays a role in deciding the time complexity. If yes, how big does N need to be in order to play that role (1,000? 5,000? 20,000? 500,000?)? In other words, what is the general rule of thumb here?
INTERESTING THEORETICAL QUESTION: If, 15 years from now, computers are so fast that even with N = 25,000,000 these two for loops complete in 1 second, could we then say that the time complexity is O(1) even for N = 25,000,000? I suppose the answer would be yes at that time. Do you agree?
tl;dr: No. The value of N has no effect on time complexity. O(1) versus O(N) is a statement about "all N", i.e. about how the amount of computation grows as N increases.
Great question! It reminds me of when I was first trying to understand time complexity. I think many people have to go through a similar journey before it ever starts to make sense so I hope this discussion can help others.
First of all, your "funny operation" is actually funnier than you think since your entire nested for-loops can be replaced with:
result = array[N - 1] + array[N - 1]; // just some hilarious operation hahaha ha ha
Since result is overwritten each time, only the last iteration affects the outcome. We'll come back to this.
As far as what you're really asking here, the purpose of Big-O is to provide a meaningful way to compare algorithms in a way that is independent of input size and independent of the computer's processing speed. In other words, O(1) versus O(N) has nothing to do with the size of N and nothing to do with how "modern" your computer is. Those things affect the execution time of the algorithm on a particular machine with a particular input, but they do not affect its time complexity, i.e. O(1) versus O(N).
It is actually a statement about the algorithm itself, so a math discussion is unavoidable, as dxiv has so graciously alluded to in his comment. Disclaimer: I'm going to omit certain nuances in the math since the critical stuff is already a lot to explain and I'll defer to the mountains of complete explanations elsewhere on the web and textbooks.
Your code is a great example to understand what Big-O does tell us. The way you wrote it, its complexity is O(N^2). That means that no matter what machine or what era you run your code in, if you were to count the number of operations the computer has to do, for each N, and graph it as a function, say f(N), there exists some quadratic function, say g(N)=9999N^2+99999N+999 that is greater than f(N) for all N.
But wait, if we just need to find big enough coefficients in order for g(N) to be an upper bound, can't we just claim that the algorithm is O(N) and find some g(N)=aN+b with gigantic enough coefficients that it's an upper bound of f(N)? THE ANSWER TO THIS IS THE MOST IMPORTANT MATH OBSERVATION YOU NEED TO UNDERSTAND TO REALLY UNDERSTAND BIG-O NOTATION. Spoiler alert: the answer is no.
For visuals, try this graph on Desmos, where you can adjust the coefficients: https://www.desmos.com/calculator/3ppk6shwem
No matter what coefficients you choose, a function of the form aN^2+bN+c will ALWAYS eventually outgrow a function of the form aN+b (both having positive a). You can push a line as high as you want, like g(N)=99999N+99999, but even the function f(N)=0.01N^2+0.01N+0.01 crosses that line and grows past it after N=9999900. There is no linear function that is an upper bound to a quadratic. Similarly, there is no constant function that is an upper bound to a linear or quadratic function. Yet we can find a quadratic upper bound to this f(N), such as h(N)=0.01N^2+0.01N+0.02, so f(N) is in O(N^2). This observation is what allows us to just say O(1) and O(N^2) without having to distinguish between O(1), O(3), O(999), O(4N+3), O(23N+2), O(34N^2+4N+7), etc. By using phrases like "there exists a function such that" we can brush all the constant coefficients under the rug.
So having a quadratic upper bound, aka being in O(N^2), means that the function f(N) is no bigger than quadratic and in this case happens to be exactly quadratic. It sounds like this just comes down to comparing the degree of polynomials, why not just say that the algorithm is a degree-2 algorithm? Why do we need this super abstract "there exists an upper bound function such that bla bla bla..."? This is the generalization necessary for Big-O to account for non-polynomial functions, some common ones being logN, NlogN, and e^N.
For example if the number of operations required by your algorithm is given by f(N)=floor(50+50*sin(N)), we would say that it's O(1) because there is a constant function, e.g. g(N)=101 that is an upper bound to f(N). In this example, you have some bizarre algorithm with oscillating execution times, but you can convey to someone else how much it doesn't slow down for large inputs by simply saying that it's O(1). Neat. Plus we have a way to meaningfully say that this algorithm with trigonometric execution time is more efficient than one with linear complexity O(N). Neat. Notice how it doesn't matter how fast the computer is because we're not measuring in seconds, we're measuring in operations. So you can evaluate the algorithm by hand on paper and it's still O(1) even if it takes you all day.
As for the example in your question, we know it's O(N^2) because there are aN^2+bN+c operations involved for some a, b, c. It can't be O(1), or even O(N), because no matter what aN+b you pick, I can find a large enough input size N such that your algorithm requires more than aN+b operations. On any computer, in any time zone, with any chance of rain outside. Nothing physical affects O(1) versus O(N) versus O(N^2). What changes it to O(1) is changing the algorithm itself to the one-liner that I provided above, where you just add two numbers and spit out the result no matter what N is. Let's say that for N=10 it takes 4 operations to do both array lookups, the addition, and the variable assignment. If you run it again on the same machine with N=10000000, it's still doing the same 4 operations. The amount of operations required by the algorithm doesn't grow with N. That's why the algorithm is O(1).
It's why problems like finding a O(NlogN) algorithm to sort an array are math problems and not nano-technology problems. Big-O doesn't even assume you have a computer with electronics.
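If you want to see this "counting operations" view of the original nested loops directly, here is a small sketch (the class name is made up) that counts inner-loop iterations for a few values of N; the count is N*(N+1)/2, which is quadratic no matter what machine runs it:

public class CountOps {
    public static void main(String[] args) {
        for (int N : new int[] {10, 100, 1000}) {
            long ops = 0;
            for (int i = 0; i < N; i++)
                for (int j = i; j < N; j++)
                    ops++;                       // one "funny operation" per inner iteration
            System.out.println("N=" + N + " -> " + ops
                    + " operations (N*(N+1)/2 = " + (long) N * (N + 1) / 2 + ")");
        }
    }
}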
Hopefully this rant gives you a hint as to what you don't understand so you can do more effective studying for a complete understanding. There's no way to cover everything needed in one post here. It was some good soul-searching for me, so thanks.
The following is a homework assignment, so I would rather get hints or bits of information that would help me figure this out, and not complete answers.
Consider S an algorithm solution to a problem that takes as input an array A of size n. After analysis, the following conclusion was obtained:
Algorithm S executes an O(n)-time computation for each even number in A.
Algorithm S executes an O(logn)-time computation for each odd number in A.
What are the best and worst case time for algorithm S?
From this I understand that the time complexity changes in accordance to n being even or odd. In other words, if n is even, S takes O(n) time and when n is odd, S takes O(logn).
Is it a simple matter of taking the best case and the worst case of both growth-rates, and choosing their boundaries? Meaning:
Best case of O(n) is O(1), and worst case is O(n).
Best case of O(logn) is O(logn) and worst case is O(logn).
Therefore the best case for Algorithm S is O(logn) and the worst case is O(n)?
Am I missing something? or am I wrong in assessing the different best/worst case of both cases of big-Oh?
1st attempt:
Ok, so I completely misunderstood the problem. Thanks to candu, I can now better understand what is required of me, and so try to calculate the best and worst case better.
It seems that Algorithm S changes its runtime according to EACH number in A. If the number is even, the runtime is O(n), and if the number is odd, we get O(logn).
The worst case will be composed of an array A of n even numbers, and for each the algorithm will run O(n). In other words, the worst case runtime for Algorithm S should be n*O(n).
The best case will be composed of an array A of n odd numbers, and for each the algorithm will run O(logn). The best case runtime for algorithm S should be n*O(logn).
Am I making any sense? is it true then that:
Best case of algorithm S is nO(logn) and worst case is nO(n)?
If that is true, can it be rewritten? For example, as O(log^n(n)) and O(n^n)? Or is this an arithmetic mistake?
2nd attempt:
Following JuanLopes' response, it seems like I can rewrite nO(n) as O(n*n) or O(n^2), and nO(logn) as O(nlogn).
Does it make sense now that Algorithm S runs at O(nlogn) at the best case, and O(n^2) at the worst case?
There's a bit of confusion here: the algorithm runtime doesn't depend on n being even or odd, but on whether the numbers in A are even or odd.
With that in mind, what sort of input A would make Algorithm S run faster? Slower?
Also: it doesn't make sense to say that the best case of O(n) is O(1). Suppose I have an algorithm ("Algorithm Q") that is O(n); all I can say is that there exists a constant c such that, for any input of size n, Algorithm Q takes less than cn time. There is no guarantee that I can find specific inputs for which Algorithm Q is O(1).
To give a concrete example, this takes linear time no matter what input it is passed:
def length(A):
    len = 0
    for x in A:
        len += 1
    return len
A few thoughts.
First, there is no mention of asymptotically tight time. So an O(n) algorithm can actually be an O(logn) one. So just imagine the best case running time this algorithm can be in this case. I know, this is a little picky. But this is a homework, I guess it's always welcome to mention all the possibilities.
Second, even if it's asymptotically tight, it doesn't necessarily mean it's tight for all elements. Consider insertion sort. For each new element to insert, we need to find the correct position in the previous, already-sorted subarray. The time is proportional to the number of elements in that subarray, which has the upper bound O(n). But it doesn't mean each new element needs exactly n comparisons to insert. Actually, the shorter the subarray, the quicker the insertion.
Back to this question. "Executes an O(logn)-time computation for each odd number in A." Let's assume all the numbers are odd. It could be that the first odd number takes O(log1), the second takes O(log2), ..., and the nth takes O(logn). In total, that is O(log(n!)), which does not contradict "O(logn) for each odd number".
As for the worst case, you can analyze it in much the same way.
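A rough sanity check of the rewriting in the 2nd attempt, assuming each even element really costs at most c*n and each odd element at most c*logn for some constant c:

Worst case (all n elements even): c*n + c*n + ... + c*n (n terms) = c*n^2, which is O(n^2).
Best case (all n elements odd): c*logn + c*logn + ... + c*logn (n terms) = c*n*logn, which is O(nlogn); the finer count above, c*(log1 + log2 + ... + logn) = c*log(n!), is still O(nlogn).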
Two algorithms, say A and B, are written to solve the same problem.
Algorithm A is O(n).
Algorithm B is O(n^2).
You expect algorithm A to work better.
However, when you run a specific example on the same machine, Algorithm B runs quicker.
Give reasons to explain how such a thing can happen.
Algorithm A, for example, can run in time 10000000*n which is O(n).
If algorithm B runs in n*n time, which is O(n^2), A will be slower for every n < 10000000.
O(n) and O(n^2) are asymptotic runtimes that describe the behavior as n -> infinity.
EDIT - EXAMPLE
Suppose you have the two following functions:
boolean flag;

void algoA(int n) {
    for (int i = 0; i < n; i++)
        for (int j = 0; j < 1000000; j++)
            flag = !flag;
}

void algoB(int n) {
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            flag = !flag;
}
algoA has n*1000000 flag flip operations so it is O(n) whereas algoB has n^2 flag flip operations so it is O(n^2).
Just solve the inequality 1000000n > n^2 and you'll get that for n < 1000000 it holds. That is, the O(n) method will be slower.
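A quick numeric check of that inequality (just a sketch; the operation counts are taken from the two functions above):

public class Crossover {
    public static void main(String[] args) {
        for (long n : new long[] {10, 1000, 100000, 1000000, 10000000}) {
            long opsA = n * 1000000L;   // algoA: O(n), but with a huge constant factor
            long opsB = n * n;          // algoB: O(n^2), with a small constant factor
            System.out.println("n=" + n + "  algoA=" + opsA + " flips, algoB=" + opsB + " flips"
                    + (opsB < opsA ? "  -> B is faster" : "  -> A is faster (or equal)"));
        }
    }
}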
Knowing the algorithms would help give a more exact answer.
But for the general case, I could think of a few relevant factors:
Hardware related
e.g. if the slower algorithm makes good use of caching & locality or similar low-level mechanisms (see Quicksort's performance compared to other theoretically faster sorting algorithms). Timsort is also worth reading about, as an example where an "efficient" algorithm is used to break the problem up into smaller input sets and a "simpler", theoretically "higher complexity" algorithm is used on those sets, because in practice it is faster there.
Properties of the input set
e.g. if the input size is small, the efficiency will not come through in a test; also, for example with sorting again, if the input is mostly pre-sorted vs completely random, you will see different results. Many different inputs should be used in a test of this type for an accurate result. Using just one example is simply not enough, as the input can be engineered to favor one algorithm instead of another.
Specific implementation of either algorithm
e.g. there's a long way to go from the theoretical description of an algorithm to implementation; poor use of data structures, recursion, memory management etc. can seriously affect performance
Big-O-notation says nothing about the speed itself, only about how the speed will change when you change n.
If both algorithms take the same time for a single iteration, Itay's example above is also correct.
While all of the answers so far seem correct... none of them feel really "right" in the context of a CS class. In a computational complexity course you want to be precise and use definitions. I'll outline a lot of the nuances of this question and of computational complexity in general. By the end, we'll conclude why Itay's solution at the top is probably what you should've written. My main issue with Itay's solution is that it lacks definitions which are key to writing a good proof for a CS class. Note that my definitions may differ slightly from your class' so feel free to substitute in whatever you want.
When we say "an algorithm is O(n)" we actually mean "this algorithm is in the set O(n)". And the set O(n) contains all algorithms whose worst-case asymptotic complexity f(n) has the property that f(n) <= c*n + c_0 for some constant c and c_0 where c, c_0 > 0.
Now we want to prove your claim. First of all, the way you stated the problem, it has a trivial solution. That's because our asymptotic bounds are "worst-case". For many "slow" algorithms there is some input on which they run remarkably quickly. For instance, insertion sort is linear if the input is already sorted! So take insertion sort (O(n^2) in the worst case, but linear on sorted input) and merge sort (O(nlog(n))) and notice that insertion sort will run faster if you pass in a sorted array! Boom, proof done.
But I am assuming that your exam meant something more like "show why a linear algorithm might run faster than a quadratic algorithm in the worst-case." As Alex noted above, this is an open ended question. The crux of the issue is that runtime analysis makes assumptions that certain operations are O(1) (e.g. for some problem you might assume that multiplication is O(1) even though it becomes quadratically slower for large numbers (one might argue that the numbers for a given problem are bounded to be 100-bits so it's still "constant time")). Since your class is probably focusing specifically on computational complexity then they probably want you to gloss over this issue. So we'll prove the claim assuming that our O(1) assumptions are right, and so there aren't details like "caching makes this algorithm way faster than the other one".
So now we have one algorithm which runs in f(n), which is O(n), and some other algorithm which runs in g(n), which is O(n^2). We want to use the definitions above to show that for some n we can have g(n) < f(n). The trick is that our assumptions have not fixed the c, c_0, c', c_0'. As Itay mentions, we can choose values for those constants such that g(n) < f(n) for many n. And the rest of the proof is what he wrote above (e.g. let c, c_0 be the constants for f(n) and say they are both 100, while c', c_0' are the constants for g(n) and they are both 1. Then g(n) < f(n) => n^2 + 1 < 100n + 100 => n^2 - 100n - 99 < 0, which you can factor or solve with the quadratic formula to get the actual range of n, roughly n <= 100).
It depends on the scenario. There are three types of scenario: best, average, and worst case. If you know sorting techniques, the same thing happens there. For more information see the following link:
http://en.wikipedia.org/wiki/Sorting_algorithm
Please correct me if I am wrong.
This is my assignment question:
Explain quick sort, merge sort and heap sort with an example.
Further, count the number of operations performed by each of these sorting methods.
I don't understand what exactly I have to answer in the context of "count the number of operations".
I found something in the Cormen book, in chapter 2, where they explain insertion sort and work out the running time of the algorithm by counting the run time of each statement...
Do I have to do it in a similar way?
To count the number of operations is also known as to analyze the algorithm complexity. The idea is to have a rough idea how many operations are in the worst case needed to execute the algorithm on an input of size N, which gives you the upper bound of the computational resources required for that algorithm. And since each operation by itself (like multiplication or comparison for example) is a finite operation and takes deterministic time (even though it might be different on different machines), to get an idea of how good or bad an algorithm is, especially compared to other algorithms, all you need to know is the rough number of operations.
Here's an example with bubble sort. Let's say you have an array of two numbers. To sort it, you need to compare the two numbers and potentially exchange them. Since comparing and exchanging are single operations, the exact time to execute them is minimal and not important by itself. Thus, you can say that with N=2, the number of operations is 1. For three numbers, though, you need three operations in the worst case: compare the first and the second and potentially exchange them, then compare the second and the third and exchange them, then compare the first with the second again. When you generalize bubble sort, you will find that to sort N numbers you potentially need N-1 comparisons on the first pass, N-2 on the second, and so on. In other words, the total is (N-1) + (N-2) + ... + 2 + 1 = N * (N-1) / 2, which for big enough N grows like N^2, so the complexity is O(N^2).
Of course, you could just cheat and find out on the web the O(N) number for each of the three sort algorithms, but I would urge you to spend the time and try to come up with that number yourself at first. Even if you get it wrong, comparing your estimate and how you got it with the actual way to estimate their complexity will help you understand better the process of analyzing the complexity of particular piece of software you write in future.
This is called the big O notation.
This page shows you the most common sorting algorithms and their comparison expressed through big O.
Computational complexity (worst, average and best number of comparisons for several typical test cases, see below). Typically, a good average number of comparisons/operations is O(n log n) and a bad one is O(n^2).
From http://www.softpanorama.org/Algorithms/sorting.shtml
I think this assignment is to give you an idea that how a complexity of an algorithm is calculated. For example bubble sort algorithm has a complexity of O(n^2).
// Bubble sort method.
// ref: http://www.metalshell.com/source_code/105/Bubble_Sort.html
for (x = 0; x < ARRAY_SIZE; x++)
    for (y = 0; y < ARRAY_SIZE - 1; y++)
        if (iarray[y] > iarray[y + 1]) {
            holder = iarray[y + 1];
            iarray[y + 1] = iarray[y];
            iarray[y] = holder;
        }
As you see above, two loops are used to sort the array. Let ARRAY_SIZE be n. Then the number of operations is n*(n-1). That makes n^2-n, which is denoted by O(N^2). That is big O notation: we just keep the term with the largest exponent, the one with the highest growth rate. If it were 2n^2+2n, that would still be O(N^2), because constants are also omitted when calculating complexity. The Wikipedia article on Big O Notation is really helpful (as Leniel mentioned in his post).
That's your homework, so I did not go into the details of the algorithms you mentioned, but you need to do the math like this. I think what you are being asked for is the actual number of operations. So, for the example above, if ARRAY_SIZE is 10, the answer is 10*9 = 90. To see the differences, you need to use the same input array with each of your sample codes.
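To make the "actual number of operations" part concrete, here is a hedged sketch in Java that instruments the same double loop with a comparison counter (the array contents are just made up); with ARRAY_SIZE = 10 it prints 90, i.e. n*(n-1):

public class BubbleCount {
    public static void main(String[] args) {
        int[] iarray = {9, 3, 7, 1, 8, 2, 6, 5, 4, 0};   // ARRAY_SIZE = 10
        int n = iarray.length;
        long comparisons = 0;
        for (int x = 0; x < n; x++) {
            for (int y = 0; y < n - 1; y++) {
                comparisons++;                           // one comparison per inner iteration
                if (iarray[y] > iarray[y + 1]) {
                    int holder = iarray[y + 1];
                    iarray[y + 1] = iarray[y];
                    iarray[y] = holder;
                }
            }
        }
        System.out.println("comparisons: " + comparisons);   // 10 * 9 = 90
    }
}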