Big O time complexity of worst case quick sort?

What I think I understand.
In the worst case the quick-sort algorithm picks the largest or smallest key in the list/array being sorted on every recursive call (in a recursive implementation). I understand that n determines both the number of recursive calls and the number of comparisons (which decreases by 1 with every level of recursion). So we have n+(n-1)+(n-2)+...+2+1 primitive comparisons in total.
What I don't understand.
The part I don't quite grasp is how this is O(n^2). I know that O(n^2) is an upper bound, since n+(n-1)+(n-2)+...+2+1 < n^2, but how do I know it isn't also, say, O(n*logn)? Do I have to prove that result to safely confirm it is O(n^2), or is that immediately obvious in a way I can't see? In general, how do I know that I have found the smallest function that represents the big O time complexity?

n+(n-1)+(n-2)+...+1 = n(n+1)/2 (you can prove it by mathematical induction), which is n^2/2 + n/2 and therefore O(n^2): the n^2/2 term dominates, and constant factors are dropped.
https://en.wikipedia.org/wiki/1_%2B_2_%2B_3_%2B_4_%2B_%E2%8B%AF
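To see the quadratic growth concretely, here is a small sketch (not from the original post; the function name is mine) of a quicksort that always picks the first element as the pivot and is fed already-sorted input, which is its worst case. The comparison count comes out to n(n-1)/2, the same order as the sum above:

def quicksort_comparisons(arr):
    # Quicksort that always picks the first element as pivot.
    # Returns the number of pivot comparisons made while partitioning.
    if len(arr) <= 1:
        return 0
    pivot = arr[0]
    smaller = [x for x in arr[1:] if x < pivot]
    larger = [x for x in arr[1:] if x >= pivot]
    # len(arr) - 1 elements were each compared against the pivot once.
    return (len(arr) - 1) + quicksort_comparisons(smaller) + quicksort_comparisons(larger)

# Already-sorted input is the worst case for this pivot choice.
for n in (10, 100, 500):
    print(n, quicksort_comparisons(list(range(n))), n * (n - 1) // 2)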

Related

How do I measure the O-notation of a for loop?

I have a question about O-notation (big O).
In my code, I am using a for loop to iterate through an array of users.
The for loop has an if-statement that makes it break out of the loop if the right user is found.
My question is: how do I measure the O-notation?
Is the O-notation O(N), since I loop through all the users in the array?
Or is it O(1), since the loop breaks and never runs again?
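The loop being described presumably looks something like the following sketch (made-up names, not the asker's actual code):

def find_user(users, right_user):
    # Iterate through the array of users; stop as soon as the right one is found.
    for user in users:
        if user == right_user:
            return user   # best case: found at the first index
    return None           # worst case: every user was checked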
O notation defines an "order of" relationship between an amount of work (however measured) and the number of items processed (usually 'n'). So "O(n)" means "in direct proportion to the number of items n", and "O(1)" simply means "constant". If a loop processes every item once, then the amount of work is intuitively in direct proportion to n. But say your exit condition gets hit, on average, half way through: we might be tempted to call that O(n/2), but we still say it is O(n), because the relationship to n is still direct/linear. Similarly, if you were to assess the relationship to be O(7n^3 + 2n), you'd say it was simply O(n^3), because n^3 is the term that dominates as n grows large.
The answer to your specific question is therefore O(n) because the number of iterations is in direct proportion to n. All that this says is that if N user records take M milliseconds to process, 2N should take about 2M milliseconds.
It is probably worth noting that O notation is strictly concerned with worst case and not the average cost of algorithms (although I have started to find that it is quite common for people to use it in the latter sense). It is always a good idea to specify to avoid ambiguity.
Big O notation answers the following two questions:
If there are N data elements, how many steps will the algorithm take?
How will the performance of the algorithm change if the number of data elements increases?
The best-case scenario in your case is that the user you are searching for is found at the first index. The time complexity in this case would be O(1), because the number of steps taken by the algorithm is constant and does not change when the number of elements in the array changes.
The worst-case scenario is that your loop will have to iterate over all the users. That makes the time complexity O(N), because the number of steps taken by the algorithm will be directly proportional to the number of elements in the array.
Big O notation generally refers to the worst-case scenario, so you can say that the time complexity in your case is O(N).
The best-case complexity of the loop is O(1) and the worst-case complexity is O(N). In linear search the best case is O(1) and the worst case is O(N). It also depends on the approach you take to solve the problem: for a loop like for(int i = n; i > 1; i = i/2), the complexity is O(log N). The complexity of an if-else condition is O(1).
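As a side note, you can convince yourself that the halving loop is O(log N) by counting its iterations directly; a small sketch (illustrative only):

import math

def halving_steps(n):
    # Mirrors for(int i = n; i > 1; i = i/2): count how many times the body runs.
    steps, i = 0, n
    while i > 1:
        i //= 2
        steps += 1
    return steps

# The iteration count tracks log2(n).
for n in (16, 1024, 10**6):
    print(n, halving_steps(n), round(math.log2(n), 1))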

Best and worst case time for Algorithm S when time complexity changes in accordance to n being even/odd

The following is a homework assignment, so I would rather get hints or bits of information that would help me figure this out, and not complete answers.
Consider an algorithm S that solves a problem whose input is an array A of size n. After analysis, the following conclusions were obtained:
Algorithm S executes an O(n)-time computation for each even number in A.
Algorithm S executes an O(logn)-time computation for each odd number in A.
What are the best and worst case time for algorithm S?
From this I understand that the time complexity changes in accordance to n being even or odd. In other words, if n is even, S takes O(n) time and when n is odd, S takes O(logn).
Is it a simple matter of taking the best case and the worst case of both growth-rates, and choosing their boundaries? Meaning:
Best case of O(n) is O(1), and worst case is O(n).
Best case of O(logn) is O(logn) and worst case is O(logn).
Therefore the best case for Algorithm S is O(logn) and the worst case is O(n)?
Am I missing something? or am I wrong in assessing the different best/worst case of both cases of big-Oh?
1st attempt:
OK, so I completely misunderstood the problem. Thanks to candu, I now better understand what is required of me, so I'll try again to work out the best and worst case.
It seems that Algorithm S changes its runtime according to EACH number in A. If the number is even, the runtime is O(n), and if the number is odd, we get O(logn).
The worst case will be composed of an array A of n even numbers, and for each the algorithm will run O(n). In other words, the worst case runtime for Algorithm S should be n*O(n).
The best case will be composed of an array A of n odd numbers, and for each the algorithm will run O(logn). The best case runtime for algorithm S should be n*O(logn).
Am I making any sense? Is it true then that:
Best case of algorithm S is nO(logn) and worst case is nO(n)?
If that is true, can it be rewritten? For example, as O(log^n(n)) and O(n^n)? Or is this an arithmetic mistake?
2nd attempt:
Following JuanLopes' response, it seems like I can rewrite nO(n) as O(n*n) or O(n^2), and nO(logn) as O(nlogn).
Does it make sense now to say that Algorithm S runs in O(nlogn) in the best case, and O(n^2) in the worst case?
There's a bit of confusion here: the algorithm runtime doesn't depend on n being even or odd, but on whether the numbers in A are even or odd.
With that in mind, what sort of input A would make Algorithm S run faster? Slower?
Also: it doesn't make sense to say that the best case of O(n) is O(1). Suppose I have an algorithm ("Algorithm Q") that is O(n); all I can say is that there exists a constant c such that, for any input of size n, Algorithm Q takes less than cn time. There is no guarantee that I can find specific inputs for which Algorithm Q is O(1).
To give a concrete example, this takes linear time no matter what input it is passed:
def length(A):
    count = 0        # use 'count' rather than shadowing the built-in len
    for x in A:      # touches every element exactly once
        count += 1
    return count
A few thoughts.
First, there is no mention of an asymptotically tight bound. So an algorithm described as O(n) could actually be an O(logn) one. Just imagine how good the best-case running time could be in that case. I know this is a little picky, but since this is homework, I guess it's always welcome to mention all the possibilities.
Second, even if the bounds are asymptotically tight, it doesn't necessarily mean they are tight for every element. Consider insertion sort. For each new element to insert, we need to find the correct position in the previous already-sorted subarray. The time is proportional to the number of elements in that subarray, which has the upper bound O(n). But that doesn't mean each new element needs exactly n comparisons to insert. Actually, the shorter the subarray, the quicker the insertion.
Back to this question. "Executes an O(logn)-time computation for each odd number in A." Let's assume all odd numbers. It could be that the first odd number takes O(log 1), the second takes O(log 2), ..., and the nth takes O(log n). In total, that is O(log(n!)). This doesn't contradict "O(logn) for each odd number".
As for the worst case, you can analyze it in much the same way.
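To make the best-case and worst-case inputs concrete, here is a purely hypothetical sketch of an algorithm with the stated shape; the real Algorithm S is not given in the assignment, so the two inner computations are placeholders:

def algorithm_s(A):
    n = len(A)
    for x in A:
        if x % 2 == 0:
            # stand-in for the O(n)-time computation done for an even number
            for _ in range(n):
                pass
        else:
            # stand-in for the O(log n)-time computation done for an odd number
            i = n
            while i > 1:
                i //= 2

# An all-even array drives the worst case, n * O(n) = O(n^2);
# an all-odd array drives the best case, n * O(log n) = O(n log n).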

Which is bigger: O(n*logn) or O(1)?

We are going over the master theorem in my algorithms class, and for one problem, I'm trying to compare nlogn vs 1 to figure out which case of the MT it falls under. But I'm having a hard time figuring out which is bigger.
Edit: This is for solving a recurrence. The recurrence is T(n) = 2T(n/4) + N*LogN. Just threw this in in case it helps.
Think about it this way:
O(N*LogN) will increase with N in such a way that for any X, no matter how large, you can find a value of N such that N*LogN is greater than X.
O(1) will stay the same, no matter what N is.
This means that O(1) is asymptotically better, i.e. beyond some (perhaps very high) value of N, the O(N*LogN) algorithm will be slower.
If an algorithm is O(NlogN) that means that there exists a number A and a quantity of execution time B, such that for any input size N greater than A, the execution time will be less than B times NlogN.
If an algorithm is O(1), that would mean that there exists some fixed amount of time C in which the algorithm would be guaranteed to complete regardless of the input size.
In comparing two algorithms, one of which is O(NlgN) and one of which is O(1), one will generally discover that the O(1) algorithm is faster for values of N that are sufficiently large, but in many cases the O(NlgN) algorithm may be faster for small values of N.
Indeed, while something like an O(N^3) or O(N^4) algorithm would generally seem pretty bad, it's possible that even an O(N^4) algorithm may outperform an O(1) algorithm if N is usually a small number (e.g. 1-5 or so) and never gets very big (even an occasional value of 50 could seriously dog performance).
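As a side note on the recurrence in the question's edit, T(n) = 2T(n/4) + n*logn, here is a rough sketch of how the standard master theorem applies (my working, not part of the original answers):

a = 2, b = 4, so n^(log_b a) = n^(log_4 2) = n^(1/2).
f(n) = n*logn is polynomially larger than n^(1/2), i.e. f(n) = Omega(n^(1/2 + e)) for, say, e = 1/4.
Regularity check: a*f(n/b) = 2*(n/4)*log(n/4) = (n/2)*(logn - log4) <= (1/2)*n*logn for large n.
So case 3 of the master theorem applies, and T(n) = Theta(n*logn).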

What's the complexity of for i: for o = i+1

for i = 0 to size(arr)
for o = i + 1 to size(arr)
do stuff here
What's the worst-case time complexity of this? It's not N^2, because the inner loop shrinks by one on every iteration of i. It's not N either; it should be bigger: N-1 + N-2 + N-3 + ... + N-N+1.
It is N^2, since asymptotically it's the product of two loops that are each linear in N.
(There's a reason asymptotic complexity is called asymptotic and not identical...)
See Wikipedia's explanation on the simplifications made.
Think of it like you are working with an n x n matrix. You are working on approximately half of the elements in the matrix, but O(n^2/2) is the same as O(n^2).
When you want to determine the complexity class of an algorithm, all you need to do is find the fastest-growing term in the complexity function of the algorithm. For example, if you have the complexity function f(n)=n^2-10000*n+400, to find O(f(n)), you just have to find the "strongest" term in the function. Why? Because for n big enough, only that term dictates the behavior of the entire function. Having said that, it is easy to see that both f1(n)=n^2-n-4 and f2(n)=n^2 are in O(n^2). However, for the same input size n, they don't run for the same amount of time.
In your algorithm, if n=size(arr), the do stuff here code will run f(n)=n+(n-1)+(n-2)+...+2+1 times. It is easy to see that f(n) represents a sum of an arithmetic series, which means f(n)=n*(n+1)/2, i.e. f(n)=0.5*n^2+0.5*n. If we assume that do stuff here is O(1), then your algorithm has O(n^2) complexity.
for i = 0 to size(arr)
I assumed that the loop ends when i becomes greater than size(arr), not equal to it. However, if the latter is the case, then f(n)=0.5*n^2-0.5*n, and it is still in O(n^2). Remember that O(1), O(n), O(n^2), ... are complexity classes, and that complexity functions of algorithms are functions that describe, for the input size n, how many steps there are in the algorithm.
It's n*(n-1)/2, which is O(n^2).
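If you would rather check the arithmetic-series count than take it on faith, here is a small sketch (loop bounds assumed exclusive of size(arr), as in most languages):

def count_inner_iterations(n):
    # Mirrors: for i = 0 to n { for o = i + 1 to n { do stuff } }
    count = 0
    for i in range(n):
        for o in range(i + 1, n):
            count += 1
    return count

# The measured count agrees with n*(n-1)/2.
for n in (5, 10, 100):
    print(n, count_inner_iterations(n), n * (n - 1) // 2)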

Trying to understand Big-oh notation

Hi, I would really appreciate some help with Big-O notation. I have an exam on it tomorrow, and while I can define what "f(x) is O(g(x))" means, I can't say I thoroughly understand it.
The following question ALWAYS comes up on the exam and I really need to try and figure it out. The first part seems easy (I think): do you just pick a value for n, compute them all on a calculator, and put them in order? This seems too easy though, so I'm not sure. I'm finding it very hard to find examples online.
From lowest to highest, what is the correct order of the complexities O(n^2), O(log2 n), O(1), O(2n), O(n!), O(n log2 n)?
What is the worst-case computational complexity of the Binary Search algorithm on an ordered list of length n = 2^k?
That guy should help you.
From lowest to highest, what is the correct order of the complexities O(n^2), O(log2 n), O(1), O(2n), O(n!), O(n log2 n)?
The order is the same as what you get by comparing their limits at infinity: look at lim(a/b); if it is a finite nonzero constant, they are of the same order, while infinity or 0 means one of them grows faster than the other.
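As a quick worked example of that limit test (my example, not from the original answer): lim (n*log2 n)/(n^2) = lim (log2 n)/n = 0 as n goes to infinity, so n*log2 n is of strictly lower order than n^2; by contrast, lim (2n)/n = 2, a nonzero constant, so O(2n) and O(n) are the same order.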
What is the worst-case computational complexity of the Binary Search algorithm on an ordered list of length n = 2^k?
Find binary search best/worst Big-O.
Find linked list access by index best/worst Big-O.
Make conclusions.
Hey there. Big-O notation is tough to figure out if you don't really understand what the "n" means. You've already seen people talking about how O(n) == O(2n), so I'll try to explain exactly why that is.
When we describe an algorithm as having "order-n space complexity", we mean that the size of the storage space used by the algorithm gets larger with a linear relationship to the size of the problem that it's working on (referred to as n.) If we have an algorithm that, say, sorted an array, and in order to do that sort operation the largest thing we did in memory was to create an exact copy of that array, we'd say that had "order-n space complexity" because as the size of the array (call it n elements) got larger, the algorithm would take up more space in order to match the input of the array. Hence, the algorithm uses "O(n)" space in memory.
Why does O(2n) = O(n)? Because when we talk in terms of O(n), we're only concerned with the behavior of the algorithm as n gets as large as it could possibly be. If n was to become infinite, the O(2n) algorithm would take up two times infinity spaces of memory, and the O(n) algorithm would take up one times infinity spaces of memory. Since two times infinity is just infinity, both algorithms are considered to take up a similar-enough amount of room to be both called O(n) algorithms.
You're probably thinking to yourself "An algorithm that takes up twice as much space as another algorithm is still relatively inefficient. Why are they referred to using the same notation when one is much more efficient?" Because the gain in efficiency for arbitrarily large n when going from O(2n) to O(n) is absolutely dwarfed by the gain in efficiency for arbitrarily large n when going from O(n^2) to O(500n). When n is 10, n^2 is 10 times 10 or 100, and 500n is 500 times 10, or 5000. But we're interested in n as n becomes as large as possible. They cross over and become equal for an n of 500, but once more, we're not even interested in an n as small as 500. When n is 1000, n^2 is one MILLION while 500n is a "mere" half million. When n is one million, n^2 is one thousand billion - 1,000,000,000,000 - while 500n looks on in awe with the simplicity of its five-hundred-million - 500,000,000 - points of complexity. And once more, we can keep making n larger, because when using O(n) logic, we're only concerned with the largest possible n.
(You may argue that when n reaches infinity, n^2 is infinity times infinity, while 500n is five hundred times infinity, and didn't you just say that anything times infinity is infinity? That doesn't actually work for infinity times infinity. I think. It just doesn't. Can a mathematician back me up on this?)
This gives us the weirdly counterintuitive result where O(Seventy-five hundred billion spillion kajillion n) is considered an improvement on O(n * log n). Due to the fact that we're working with arbitrarily large "n", all that matters is how many times and where n appears in the O(). The rules of thumb mentioned in Julia Hayward's post will help you out, but here's some additional information to give you a hand.
One, because n gets as big as possible, O(n^2+61n+1682) = O(n^2), because the n^2 contributes so much more than the 61n as n gets arbitrarily large that the 61n is simply ignored, and the 61n term already dominates the 1682 term. If you see addition inside a O(), only concern yourself with the n with the highest degree.
Two, O(log10 n) = O(log_b n) for any base b, because log10(x) = log_b(x)/log_b(10). Hence, O(log10 n) = O(log_b(n) * 1/log_b(10)). That 1/log_b(10) factor is a constant, which we've already shown drops out of O(n) notation.
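A quick numeric check of that change-of-base point (illustrative only, not from the original answer):

import math

# log10(n) and log2(n) differ only by the constant factor log10(2) ~= 0.301,
# so they belong to the same complexity class.
for n in (10, 1000, 10**6):
    print(n, math.log10(n), math.log2(n), math.log10(n) / math.log2(n))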
Very loosely, you could imagine picking extremely large values of n, and calculating them. Might exceed your calculator's range for large factorials, though.
If the definition isn't clear, a more intuitive description is that "higher order" means "grows faster than, as n grows". Some rules of thumb:
O(n^a) is a higher order than O(n^b) if a > b.
log(n) grows more slowly than any positive power of n.
exp(n) grows more quickly than any power of n.
n! grows more quickly than exp(kn).
Oh, and as far as complexity goes, ignore the constant multipliers.
That's enough to deduce that the correct order is O(1), O(log n), O(2n) = O(n), O(n log n), O(n^2), O(n!)
For big-O complexities, the rule is that if two things vary only by constant factors, then they are the same. If one grows faster than another ignoring constant factors, then it is bigger.
So O(2n) and O(n) are the same -- they only vary by a constant factor (2). One way to think about it is to just drop the constants, since they don't impact the complexity.
The other problem with picking n and using a calculator is that it will give you the wrong answer for certain n. Big O is a measure of how fast something grows as n increases, but at any given n the complexities might not be in the right order. For instance, at n=2, n^2 is 4 and n! is 2, but n! grows quite a bit faster than n^2.
It's important to get that right, because for running times with multiple terms, you can drop the lesser terms -- i.e., if f(n) is 3n^2+2n+5, you can drop the 5 (constant), drop the 2n (3n^2 grows faster), then drop the 3 (constant factor) to get O(n^2)... but if you don't know that n^2 is bigger, you won't get the right answer.
In practice, you can just know that n is linear, log(n) grows more slowly than linear, n^a > n^b if a>b, 2^n is faster than any n^a, and n! is even faster than that. (Hint: try to avoid algorithms that have n in the exponent, and especially avoid ones that are n!.)
For the second part of your question, what happens with a binary search in the worst case? At each step, you cut the search space in half until eventually you find your item (or run out of places to look). That is log2(2^k) = k steps. A search where you just walk through the list to find your item would take n steps. And we know from the first part that O(log(n)) < O(n), which is why binary search is faster than just a linear search.
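Here is a minimal sketch of binary search with a step counter, just to watch the log2(n) bound in action (not from the exam question; the names are mine):

def binary_search_steps(sorted_list, target):
    # Returns (index or None, number of halving steps taken).
    lo, hi, steps = 0, len(sorted_list) - 1, 0
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if sorted_list[mid] == target:
            return mid, steps
        elif sorted_list[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return None, steps

# A missing element is the worst case: about log2(n) steps on a list of length n = 2^k.
print(binary_search_steps(list(range(1024)), -1))   # (None, 10)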
Good luck with the exam!
In easy to understand terms the Big-O notation defines how quickly a particular function grows. Although it has its roots in pure mathematics its most popular application is the analysis of algorithms which can be analyzed on the basis of input size to determine the approximate number of operations that must be performed.
The benefit of using the notation is that you can categorize function growth rates by their complexity. Many different functions (an infinite number really) could all be expressed with the same complexity using this notation. For example, n+5, 2*n, and 4*n + 1/n all have O(n) complexity because the function g(n)=n most simply represents how these functions grow.
I put an emphasis on "most simply" because the focus of the notation is on the dominating term of the function. For example, O(2*n + 5) = O(2*n) = O(n) because n is the dominating term in the growth. This is because the notation assumes that n goes to infinity, which causes the remaining terms to play less of a role in the growth rate. And, by convention, any constants or multiplicative factors are omitted.
Read Big O notation and Time complexity for a more in-depth overview.
