In the worst case, R-select is O(n^2), whereas select is O(n). Can someone explain and contrast their behavior in the average case?
P.S. I am not sure whether this is a duplicate question. I can delete it if that's the case! Thanks!!
By R-select, I'm assuming you're talking about the randomized selection algorithm that works by choosing a pivot, partitioning on that pivot, and recursively proceeding from there. If that's not the case, let me know!
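For concreteness, here is a minimal sketch of the randomized selection procedure I have in mind. The name `r_select` and the list-based three-way partition are my own choices for illustration (a real implementation would usually partition in place), and `k` is assumed to be a valid 0-based rank.

```python
import random


def r_select(items, k):
    """Return the k-th smallest element of items (k is 0-based)."""
    items = list(items)  # work on a copy so the caller's list is untouched
    while True:
        if len(items) == 1:
            return items[0]
        pivot = random.choice(items)  # uniformly random pivot
        smaller = [x for x in items if x < pivot]
        equal = [x for x in items if x == pivot]
        larger = [x for x in items if x > pivot]
        if k < len(smaller):
            items = smaller  # answer lies left of the pivot
        elif k < len(smaller) + len(equal):
            return pivot  # the pivot itself has rank k
        else:
            k -= len(smaller) + len(equal)
            items = larger  # answer lies right of the pivot


data = [7, 2, 9, 4, 4, 1, 8]
print(r_select(data, 3))  # median of the seven values
```

Each round throws away everything on the wrong side of the pivot, which is why a run of pivots close to the min or max is what it takes to reach the quadratic case.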
You're correct that the R-select algorithm's worst-case is Θ(n^2), but that's extremely unlikely to arise in practice. This requires you to very frequently pick a pivot that's within a constant number of elements away from the min or max value, and the likelihood that this occurs is exponentially low. The average-case runtime of O(n) is actually quite likely to occur; you can prove, for example, that for any constant k, the probability that the runtime is O(n log n) is at least 1 - 1/n^k.
The constant factor hidden in the big-O notation of R-select is actually very low, so low in fact that R-select is typically much, much faster than the median-of-medians selection algorithm. In fact, the two are sometimes combined. The introselect algorithm works by running R-select and monitoring its progress, switching to the median-of-medians selection algorithm if the runtime starts to look bad. The overall runtime is then worst-case O(n) and, in the typical case, comparable to R-select.
To avoid the O(n^2) worst case scenario for quick select, I am aware of 2 options:
1. Randomly choose a pivot index
2. Use median of medians (MoM) to select an approximate median and pivot around that
When using MoM with quick select, we can guarantee worst case O(n). When using (1), we can't guarantee worst case O(n), but the probability of the algorithm going to O(n^2) should be extremely small. The overhead cost of (2) is much more than (1), where the latter adds little to no additional complexity.
So when should we use one over the other?
As you've noted, the median-of-medians approach is slower than quickselect, but has a better worst-case runtime. Assuming quickselect is truly using a random choice of pivot at each step, you can prove that not only is the expected runtime O(n), but that the probability that its runtime exceeds Θ(n log n) is very, very small (at most 1 / n^k for any choice of constant k). So in that sense, if you have the ability to select pivots at random, quickselect will likely be faster.
However, not all implementations of quickselect use true randomness for the pivots, and some use deterministic pivot selection algorithms. This, unfortunately, can lead to pathological inputs that trigger the Θ(n^2) worst-case runtime, which is a problem if you have adversarially-chosen inputs.
One nice compromise between the two is introselect. The basic idea behind introselect is to use quickselect with a deterministic pivot selection algorithm. As the algorithm is running, it keeps track of how many times it's picked a pivot without throwing away at least 30% of the input array. If that number exceeds some threshold, it stops using its usual pivot choice and switches to the median-of-medians approach to select a good pivot, forcing a 30% size reduction. This approach means that in the common case when quickselect rapidly reduces the input size, introselect is basically identical to quickselect with a tiny bookkeeping overhead. However, in cases where quickselect would degrade to quadratic, introselect stops and switches to the worst-case efficient median-of-medians approach, ensuring the worst-case runtime is O(n). This gives you, essentially, the best of both worlds - it's fast on average, and its worst-case is never worse than O(n).
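To make the idea concrete, here is a rough sketch of introselect in Python. It is not taken from any particular library: the names `mom_select` and `introselect`, the middle-element pivot, and the `max_bad_rounds` threshold are all assumptions made for illustration. As a simplification, once the threshold is reached the sketch hands the remaining work to median-of-medians entirely instead of using it only to pick the next pivot.

```python
def mom_select(items, k):
    """Deterministic selection using median of medians (groups of 5)."""
    if len(items) <= 5:
        return sorted(items)[k]
    groups = (items[i:i + 5] for i in range(0, len(items), 5))
    medians = [sorted(g)[len(g) // 2] for g in groups]
    pivot = mom_select(medians, len(medians) // 2)  # approximate median
    smaller = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    if k < len(smaller):
        return mom_select(smaller, k)
    if k < len(smaller) + len(equal):
        return pivot
    larger = [x for x in items if x > pivot]
    return mom_select(larger, k - len(smaller) - len(equal))


def introselect(items, k, max_bad_rounds=3):
    """Quickselect with a cheap pivot, falling back to median of medians."""
    items = list(items)
    bad_rounds = 0
    while len(items) > 5:
        if bad_rounds >= max_bad_rounds:
            return mom_select(items, k)  # too many poor splits: switch
        pivot = items[len(items) // 2]  # cheap deterministic pivot (assumed)
        smaller = [x for x in items if x < pivot]
        equal = [x for x in items if x == pivot]
        if k < len(smaller):
            remaining = smaller
        elif k < len(smaller) + len(equal):
            return pivot
        else:
            k -= len(smaller) + len(equal)
            remaining = [x for x in items if x > pivot]
        if len(remaining) > 0.7 * len(items):
            bad_rounds += 1  # this round discarded less than 30%
        items = remaining
    return sorted(items)[k]


print(introselect([9, 1, 8, 2, 7, 3, 6, 4, 5, 0], 4))
```

Because the number of poor rounds is capped by a constant and every other round shrinks the input by at least 30%, the total work before any fallback is still linear.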
This question has appeared in my algorithms class. Here's my thought:
I think the answer is no, an algorithm with worst-case time complexity of O(n) is not always faster than an algorithm with worst-case time complexity of O(n^2).
For example, suppose we have total-time functions S(n) = 99999999n and T(n) = n^2. Then clearly S(n) = O(n) and T(n) = O(n^2), but T(n) is faster than S(n) for all n < 99999999.
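The crossover point in this example can be checked directly. The sketch below just encodes the two total-time functions from above as Python functions; nothing else is assumed.

```python
def S(n):
    """Linear-time cost with a huge constant factor."""
    return 99_999_999 * n


def T(n):
    """Quadratic-time cost with a constant factor of 1."""
    return n * n


for n in (10, 1_000_000, 99_999_998, 99_999_999, 100_000_000):
    if T(n) < S(n):
        verdict = "T is faster"
    elif T(n) == S(n):
        verdict = "tie"
    else:
        verdict = "S is faster"
    print(n, verdict)
```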
Is this reasoning valid? I'm slightly skeptical that, while this is a counterexample, it might be a counterexample to the wrong idea.
Thanks so much!
Big-O notation says nothing about the speed of an algorithm for any given input; it describes how the time increases with the number of elements. If your algorithm executes in constant time, but that time is 100 billion years, then it's certainly slower than many linear, quadratic and even exponential algorithms for large ranges of inputs.
But that's probably not really what the question is asking. The question is asking whether an algorithm A1 with worst-case complexity O(N) is always faster than an algorithm A2 with worst-case complexity O(N^2); and by faster it probably refers to the complexity itself. In which case you only need a counter-example, e.g.:
A1 has normal complexity O(log n) but worst-case complexity O(n^2).
A2 has normal complexity O(n) and worst-case complexity O(n).
In this example, A1 is normally faster (i.e. scales better) than A2 even though it has a greater worst-case complexity.
Since the question says "always", a single counterexample is enough to prove that the answer is no.
Here is an example for O(n^2) versus O(n log n); the same reasoning applies to O(n^2) versus O(n).
One simple example is bubble sort, where you keep comparing adjacent pairs until a full pass makes no swaps. Bubble sort is O(n^2).
If you run bubble sort on an already sorted array, it finishes after a single pass, which is faster than algorithms with time complexity O(n log n).
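Here is a small sketch that counts comparisons, assuming the common bubble sort variant that stops as soon as a pass makes no swaps (the function name and the counter are just for illustration).

```python
def bubble_sort(values):
    """Return (sorted copy, number of comparisons) using early-exit bubble sort."""
    data = list(values)
    comparisons = 0
    n = len(data)
    while True:
        swapped = False
        for i in range(n - 1):
            comparisons += 1
            if data[i] > data[i + 1]:
                data[i], data[i + 1] = data[i + 1], data[i]
                swapped = True
        if not swapped:
            break  # a full pass without swaps means the data is sorted
        n -= 1  # the largest remaining element is now in its final place
    return data, comparisons


_, on_sorted = bubble_sort(range(1000))
_, on_reversed = bubble_sort(range(999, -1, -1))
print(on_sorted, on_reversed)
```

On sorted input the count is n - 1, while on reversed input it is n(n - 1)/2.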
You're talking about worst-case complexity here, and for some algorithms the worst case never happens in a practical application.
Saying that an algorithm runs faster than another means it runs faster for all input data and for all input sizes. So the answer to your question is obviously no, because worst-case time complexity is not an accurate measure of running time; it measures the order of growth of the number of operations in the worst case.
In practice, the running time depends on the implementation, and is not only about this number of operations. For example, one has to care about memory allocation, cache efficiency, and spatial/temporal locality. And obviously, one of the most important things is the input data.
If you want examples of an algorithm running faster than another while having a higher worst-case complexity, look at all the sorting algorithms and how their running time depends on the input.
You are correct in every sense: you have provided a counterexample to the statement. If this is for an exam, then period, it should earn you full marks.
Yet for a better understanding of big-O notation and complexity, I will share my own reasoning below. I also suggest that you always think of the following graph when you are confused, especially the O(n) and O(n^2) lines:
Big-O notation
My own reasoning when I first learnt computational complexity is that,
Big-O notation says that for sufficiently large inputs, a higher-order algorithm will always be slower than a lower-order one. How large "sufficiently" is depends on the exact formulas (in the graph, it is n = 20 when comparing the O(n) and O(n^2) lines).
That means that, for small inputs, there is no guarantee that a higher-order algorithm will run slower than a lower-order one.
But Big-O notation does tell you one thing: as the input size keeps increasing, it eventually reaches a "sufficient" size, and after that point the higher-order algorithm will always be slower. Such a "sufficient" size is guaranteed to exist*.
Worst-time complexity
While Big-O notation provides an upper bound on the running time of an algorithm, an algorithm may in general have a best-case, an average-case, and a worst-case complexity, depending on the structure of the input and the implementation.
The famous example is sorting algorithm: QuickSort vs MergeSort!
QuickSort, with a worst case of O(n^2)
MergeSort, with a worst case of O(n lg n)
However, in practice quicksort is usually faster than merge sort on typical inputs!
So, if your question is about worst-case complexity, quicksort and merge sort may be the best counterexample I can think of (because both of them are common and famous).
Therefore, combining the two parts: whether you look at it from the point of view of input size, input structure, or algorithm implementation, the answer to your question is NO.
Consider an^2 + bn + c. I understand that for large n, bn and c become insignificant.
I also understand that for large n, the differences between 2n^2 and n^2 are pretty insignificant compared to the differences between, say n^2 and n*log(n).
However, there is still a factor of 2 difference between 2n^2 and n^2. Does this matter in practice? Or do people just think about algorithms without coefficients? Why?
The actual coefficients matter if you're interested in timing. But big-O isn't actually about timing, it's about scalability. When you see an algorithm described as O(n^2), you don't really know how long it will take to solve a problem of size n on a particular computer in a particular language with a particular compiler, but you know that a problem of size 2n should take about 4 times as long.
The reason you can ignore the coefficients is that if you consider the ratio of different size problems, the lower order terms' coefficients are asymptotically dominated, and the highest order term's coefficients cancel in the ratio.
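A quick numeric check of that ratio argument, using made-up coefficients (the values a = 3, b = 50, c = 1000 are arbitrary assumptions, not measurements of anything):

```python
def cost(n, a=3.0, b=50.0, c=1000.0):
    """Hypothetical running time a*n^2 + b*n + c."""
    return a * n * n + b * n + c


for n in (10, 1_000, 100_000, 10_000_000):
    # Ratio of the cost at size 2n to the cost at size n.
    print(n, cost(2 * n) / cost(n))
```

For small n the lower-order terms dominate and the ratio is well below 4; as n grows the ratio approaches 4 regardless of the value of a, because a cancels in the quotient of the leading terms.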
We use time complexity analysis to help us estimate the time cost and understand how far we can go. For example, the lower bound for comparison-based sorting is Ω(n lg n); this is proved in theory, so we should never try to design a comparison-based sorting algorithm that does better than this.
For the coefficient, in many cases it's not easy to find an accurate number in theory, since it can be affected by the input data. But that doesn't mean it's not important. Quicksort is the most widely used sorting algorithm because its coefficient is really small: about 1.39 N lg N comparisons in the average case.
Another interesting fact about quicksort: we all know that its worst case costs O(N^2). We can use the median-of-medians algorithm to choose pivots and reduce the worst-case time complexity of quicksort to O(N lg N), but we seldom use this version in practice, because its coefficient is too big to make it worthwhile.
I have been learning big-O efficiency at school as the "go to" method for describing algorithm runtimes as better or worse than others. What I want to know is: will the algorithm with the better efficiency always outperform the worst of the lot, like bubble sort, in every single situation? Are there any situations where bubble sort or another O(n^2) algorithm will be better for a task than an algorithm with a lower O() runtime?
Generally, O() notation gives the asymptotic growth of a particular algorithm. That is, the larger category that an algorithm is placed into in terms of asymptotic growth indicates how long the algorithm will take to run as n grows (for some n number of items).
For example, we say that if a given algorithm is O(n), then it "grows linearly", meaning that as n increases, its running time grows at about the same rate as that of any other O(n) algorithm.
That doesn't mean that it's exactly as long as any other algorithm that grows as O(n), because we disregard some things. For example, if the runtime of an algorithm takes exactly 12n+65ms, and another algorithm takes 8n+44ms, we can clearly see that for n=1000, algorithm 1 will take 12065ms to run and algorithm 2 will take 8044ms to run. Clearly algorithm 2 requires less time to run, but they are both O(n).
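The arithmetic above is easy to reproduce. The sketch below only encodes the two example formulas; the function names are placeholders.

```python
def algorithm_1_ms(n):
    """Example runtime from the text: 12n + 65 milliseconds."""
    return 12 * n + 65


def algorithm_2_ms(n):
    """Example runtime from the text: 8n + 44 milliseconds."""
    return 8 * n + 44


for n in (10, 1000, 1_000_000):
    t1, t2 = algorithm_1_ms(n), algorithm_2_ms(n)
    print(n, t1, t2, t1 / t2)
```

The ratio between the two settles near 12/8 = 1.5 as n grows: a constant factor, which is exactly what O(n) leaves out.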
There are also situations where, for small values of n, an algorithm that is O(n^2) might outperform another algorithm that's O(n), due to constants in the runtime that aren't being considered in the analysis.
Basically, Big-O notation gives you an estimate of the complexity of the algorithm, and can be used to compare different algorithms. In terms of application, though, you may need to dig deeper to find out which algorithm is best suited for a given project/program.
Big O is usually quoted for the worst-case scenario. That means it assumes the input is in the worst possible order. It also ignores the coefficient. If you use insertion sort on an array that is reverse sorted, it will run in n^2 time. If you use insertion sort on a sorted array, it will run in n time. Therefore insertion sort would run faster than many other sort algorithms on an already sorted list and slower than most (reasonable) algorithms on a reverse sorted list.
Edit: I originally wrote selection sort here, but I meant insertion sort. Selection sort is always n^2.
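A small sketch that counts comparisons for insertion sort on the two extreme inputs (the counter is added purely for illustration):

```python
def insertion_sort(values):
    """Return (sorted copy, number of comparisons) using insertion sort."""
    data = list(values)
    comparisons = 0
    for i in range(1, len(data)):
        key = data[i]
        j = i - 1
        while j >= 0:
            comparisons += 1
            if data[j] <= key:
                break  # key is already in the right place
            data[j + 1] = data[j]  # shift the larger element right
            j -= 1
        data[j + 1] = key
    return data, comparisons


_, on_sorted = insertion_sort(range(1000))
_, on_reversed = insertion_sort(range(999, -1, -1))
print(on_sorted, on_reversed)
```

Sorted input needs n - 1 comparisons; reverse-sorted input needs n(n - 1)/2.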
So I just started learning about Asymptotic bounds for an algorithm
Question:
What can we say about theta of a function if for the algorithm we find different lower and upper bounds?? (say omega(n) and O(n^2)). Or rather what can we say about tightness of such an algorithm?
The book which I read says Theta is for same upper and lower bounds of the function.
What about in this case?
I don't think you can say anything, in that case.
The definition of Θ(f(n)) is:
A function is Θ(f(n)) if and only if it is Ω(f(n)) and O(f(n)).
For some pathological function that exhibits those behaviors, such as one oscillating between n and n^2, no such simple Θ bound exists.
Example:
f(n) = n    if n is odd
f(n) = n^2  if n is even
Your bounds Ω(n) and O(n^2) would be tight on this, but f is neither Θ(n) nor Θ(n^2).
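To see why neither Θ(n) nor Θ(n^2) fits, here is a short sketch that evaluates the example function and its ratios to n and n^2:

```python
def f(n):
    """Oscillating example: n for odd n, n^2 for even n."""
    return n if n % 2 == 1 else n * n


for n in (10, 11, 1000, 1001, 100_000, 100_001):
    # f(n)/n is unbounded on even n, so f is not O(n);
    # f(n)/n^2 tends to 0 on odd n, so f is not Ω(n^2).
    print(n, f(n) / n, f(n) / (n * n))
```

For every n >= 1 the value stays between n and n^2, so Ω(n) and O(n^2) both hold, and each bound is attained infinitely often.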
See also: What is the difference between Θ(n) and O(n)?
Just for a bit of practicality, one algorithm that is not in Θ(f(n)) for any f(n) would be insertion sort. It runs in Ω(n) since for a list that is already sorted, you only need one operation for the insert in every step, but it runs in O(n^2) in the general case. Constructing functions that oscillate or are non-continuous otherwise usually is done more for didactic purposes, but in my experience such functions rarely, if ever, appear with actual algorithms.
Regarding tightness, I have only ever heard the term in this context with reference to the upper and lower bounds proposed for algorithms. Again taking the example of insertion sort, the given bounds are tight in the sense that there are instances of the problem that actually can be done in time linear in their size (the lower bound) and other instances of the problem that will not execute in time less than quadratic in their size. Bounds that are valid but not tight for insertion sort would be Ω(1), since sorting a list of arbitrary size always takes more than constant time, and O(n^3), since you can always sort a list of n elements in O(n^2) time, which is a factor of n less, so you can certainly do it in O(n^3). What bounds are for is to give us a crude idea of what we can expect as performance of our algorithms so we get an idea of how efficient our solutions are; tight bounds are the most desirable, since they both give us that crude idea and that idea is optimal in the sense that there are extreme cases (which sometimes are also the general case) where we actually need all the complexity the bound allows.
The average-case complexity is not a bound; it "only" describes how efficient an algorithm is "in most cases". Take for example quick sort, which has a worst-case complexity of O(n^2) and an average-case complexity of O(n log n); some variants (for example, three-way partitioning on inputs with many equal keys) also have a linear best case. This tells us that for almost all cases, quick sort is as fast as sorting gets in general (i.e. the average-case complexity), while there can be instances of the problem that it solves faster than that (best-case complexity -> lower bound) and also instances of the problem that take quick sort longer to solve than that (worst-case complexity -> upper bound).