Big-O runtime of running sort twice - algorithm

I'm relatively new to the practice of determining algorithm runtimes using big-O notation and I have a question regarding the runtime of a sorting algorithm. Let's say I have a set of pairs (a, b) in an array and I sort the data using a known sorting algorithm that runs in O(n log n). Next, I take a subset of some number of the n data points and run the same sorting algorithm on that subset (so theoretically I could sort the entire array twice - the first sort would be comparing a's and the second sort would be comparing b's). So in other words my code is
pairArray[n];
Sort(pairArray);                 // runs in O(n log n)
subsetArray[subset];             // where subset <= n
for (int i = 0; i < subset; i++) {
    subsetArray[i] = pairArray[i];
}
Sort(subsetArray);               // runs in O(n log n)
Is the runtime of this code still O(n log n)? I guess I have two questions: does running an O(something) sort twice increase complexity from the original "something", and does the iteration to reassign to a different array increase complexity? I'm more worried about the first one as the iteration can be eliminated with pointers.

Constant factors are ignored in big-O notation. Sorting twice is still O(n log n).
The copy loop you are doing is an O(n) operation. This is also dropped: only the largest term is kept in big-O notation.
If you want to decide which of two algorithms is better but their big-O is the same, you can use performance measurements on realistic data. When measuring actual performance you can see whether one algorithm is typically twice as slow as another; this cannot be seen from the big-O notation.
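For concreteness, here is a minimal C++ sketch of the scenario in the question (the name sortTwice is illustrative, not from the original code), using std::sort for both passes; two O(n log n) sorts plus an O(n) copy is still O(n log n):

#include <algorithm>
#include <utility>
#include <vector>

// Sort all n pairs by their first component, then copy a prefix
// (subset <= n) and re-sort it by the second component.
// Each std::sort call is O(n log n) and the copy is O(n),
// so the total remains O(n log n).
std::vector<std::pair<int, int>> sortTwice(
        std::vector<std::pair<int, int>> pairs, std::size_t subset) {
    std::sort(pairs.begin(), pairs.end(),
              [](const auto& x, const auto& y) { return x.first < y.first; });

    std::vector<std::pair<int, int>> subsetArray(
        pairs.begin(), pairs.begin() + subset);   // O(n) copy

    std::sort(subsetArray.begin(), subsetArray.end(),
              [](const auto& x, const auto& y) { return x.second < y.second; });
    return subsetArray;
}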

Related

Best sorting algorithm for a partly sorted sequence?

I have to answer the following question:
What sorting algorithm is recommended if the first n-m part
is already sorted and the remaining part m is unsorted? Are there any algorithms that take O(n log m) comparisons? What about O(m log n) comparisons?
I just can't find the solution.
My first idea was insertion sort, because it is O(n) for an almost-sorted sequence. But since we don't know the size of m, the runtime is very likely to be O(n^2) even though the sequence is partly sorted already, isn't it?
Then I thought perhaps it's quicksort, because it takes (sum from k=1 to n) Cavg(1-m) + Cavg(n-m) comparisons. But after ignoring the n-m part of the sequence, the remaining sequence is 1-m in quicksort and not m.
Merge sort and heapsort should have a runtime of O(m log m) for the remaining sequence m, I would say.
Does anyone have an idea or can give me some advice?
Greetings
Have you tried sorting the remaining part m separately, at O(m log m) complexity (with any algorithm you like: merge sort, heapsort, quicksort, ...), and then merging that part with the already-sorted part? You won't even need to implement full merge sort - a single pass of its inner loop body suffices to merge two sorted sequences.
That would result in O(m*log(m) + n + m) = O(m*log(m) + n) complexity. I don't believe it is possible to find a better asymptotic complexity on a single-core CPU, although it will require an additional O(n + m) memory for the merged result array.
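A minimal sketch of that strategy in C++ (the function name and the use of std::inplace_merge are my choices, not the answer's):

#include <algorithm>
#include <vector>

// Assumes the first n - m elements of v are already sorted.
// Sort the tail in O(m log m), then merge the two sorted runs.
// std::inplace_merge runs in O(n) when a temporary buffer is
// available, matching the O(m log m + n) total and the extra
// memory noted above.
void sortPartlySorted(std::vector<int>& v, std::size_t m) {
    auto mid = v.end() - static_cast<std::ptrdiff_t>(m);
    std::sort(mid, v.end());                       // O(m log m)
    std::inplace_merge(v.begin(), mid, v.end());   // merge the two runs
}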
Which sort algorithm works best on mostly sorted data?
Insertion sort and bubble sort are good candidates. You are free to implement as many as you want and then test to see which is faster / uses fewer operations by supplying them with partially sorted data.

Choosing comparing algorithms to find k max values

Let's say that I want to find the k max values in an array of n elements and also return them in sorted output.
k may be, for example:
k = 30, or k = n/5 ...
I thought about some efficient algorithms, but all I could think of was O(n log n) complexity. Can I do it in O(n)? Maybe with some modification of quicksort?
Thanks!
The problem can be solved using a min-heap-based priority queue in O(N log K + K log K) time.
If k is constant (the k = 30 case), the complexity is equal to O(N).
If k = O(N) (the k = n/5 case), the complexity is equal to O(N log N).
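A sketch of the heap approach in C++ (topK is an illustrative name):

#include <algorithm>
#include <functional>
#include <queue>
#include <vector>

// Keep a min-heap of the K largest values seen so far: each of the
// N elements costs at most O(log K) heap work, and emitting the K
// results in sorted order costs O(K log K).
std::vector<int> topK(const std::vector<int>& data, std::size_t k) {
    std::priority_queue<int, std::vector<int>, std::greater<int>> heap;
    for (int x : data) {
        if (heap.size() < k) {
            heap.push(x);
        } else if (x > heap.top()) {
            heap.pop();        // drop the smallest of the current top K
            heap.push(x);
        }
    }
    std::vector<int> result;
    while (!heap.empty()) { result.push_back(heap.top()); heap.pop(); }
    std::reverse(result.begin(), result.end());   // largest first
    return result;
}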
Another option for constant k is the quickselect (k-select) algorithm based on the quicksort partition, with O(N) average time (though a worst case of O(N^2) can occur).
There is a way of sorting elements in nearly O(n), if you assume that you only want to sort integers. This can be done with algorithms like bucket sort or radix sort, which do not rely on comparisons between two elements (comparison sorts are limited to O(n*log(n))).
Note, however, that these algorithms also have worst-case runtimes that might be slower than O(n*log(n)).
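For instance, a minimal counting sort sketch, which runs in O(n + k) time for n integers known to lie in [0, k) (countingSort is an illustrative name):

#include <vector>

// Counting sort: O(n + k) time, no element comparisons.
// Assumes all values lie in the range [0, k).
std::vector<int> countingSort(const std::vector<int>& in, int k) {
    std::vector<int> counts(k, 0);
    for (int x : in) counts[x]++;           // O(n): tally each value
    std::vector<int> out;
    out.reserve(in.size());
    for (int v = 0; v < k; v++)             // O(k): emit values in order
        out.insert(out.end(), counts[v], v);
    return out;
}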
No comparison-based sorting algorithm can achieve a better average-case complexity than O(n lg n).
There are many papers with proofs out there, but this site provides a nice visual example.
So unless you are given a sorted array, your best case is going to be an O(n lg n) algorithm.
There are sorts like radix sort and bucket sort, but they are not comparison-based sorts, as your title seems to imply.

Is n or nlog(n) better than constant or logarithmic time?

In the Princeton tutorial on Coursera the lecturer explains the common order-of-growth functions that are encountered. He says that linear and linearithmic running times are "what we strive" for and his reasoning was that as the input size increases so too does the running time. I think this is where he made a mistake because I have previously heard him refer to a linear order-of-growth as unsatisfactory for an efficient algorithm.
While he was speaking he also showed a chart that plotted the different running times - constant and logarithmic running times looked to be more efficient. So was this a mistake or is this true?
It is a mistake when taken in the context that O(n) and O(n log n) functions have better complexity than O(1) and O(log n) functions. When looking at typical cases of complexity in big-O notation:
O(1) < O(log n) < O(n) < O(n log n) < O(n^2)
Notice that this doesn't necessarily mean that they will always be better performance-wise - we could have an O(1) function that takes a long time to execute even though its complexity is unaffected by element count. Such a function would look better in big O notation than an O(log n) function, but could actually perform worse in practice.
Generally speaking: a function with lower complexity (in big O notation) will outperform a function with greater complexity (in big O notation) when n is sufficiently high.
You're missing the broader context in which those statements must have been made. Different kinds of problems have different demands, and often even have theoretical lower bounds on how much work is absolutely necessary to solve them, no matter the means.
For operations like sorting or scanning every element of a simple collection, there is a hard lower bound of the number of elements in the collection, because the output depends on every element of the input. [1] Thus, O(n) or O(n*log(n)) is the best one can do.
For other kinds of operations, like accessing a single element of a hash table or linked list, or searching in a sorted set, the algorithm needn't examine all of the input. In those settings, an O(n) operation would be dreadfully slow.
[1] Others will note that sorting by comparisons also has an n*log(n) lower bound, from information-theoretic arguments. There are non-comparison based sorting algorithms that can beat this, for some types of input.
Generally speaking, what we strive for is the best we can manage to do. But depending on what we're doing, that might be O(1), O(log log N), O(log N), O(N), O(N log N), O(N^2), O(N^3), or (for certain algorithms) perhaps O(N!) or even O(2^N).
Just for example, when you're dealing with searching in a sorted collection, binary search borders on trivial and gives O(log N) complexity. If the distribution of items in the collection is reasonably predictable, we can typically do even better, around O(log log N) (with interpolation search). Knowing that, an algorithm that was O(N) or O(N^2) (for a couple of obvious examples) would probably be pretty disappointing.
On the other hand, sorting is generally quite a bit higher complexity: the "good" algorithms manage O(N log N), and the poorer ones are typically around O(N^2). Therefore, for sorting, an O(N) algorithm is actually very good (in fact, only possible for rather constrained types of inputs), and we can pretty much count on the fact that something like O(log log N) simply isn't possible.
Going even further, we'd be happy to manage a matrix multiplication in only O(N^2) instead of the usual O(N^3). We'd be ecstatic to get optimal, reproducible answers to the traveling salesman problem or subset-sum problem in only O(N^3), given that optimal solutions to these normally require O(N!).
Algorithms with sublinear behavior like O(1) or O(log N) are special in that they do not require looking at all elements. In a way this is a fallacy, because if there are really N elements, it will take O(N) just to read or compute them.
Sublinear algorithms are often possible only after some preprocessing has been performed. Think of binary search in a sorted table, taking O(log N). If the data is initially unsorted, it will cost O(N log N) to sort it first. The cost of sorting can be amortized if you perform many searches, say K, on the same data set. Indeed, without the sort, the cost of the searches will be O(K N), and with pre-sorting O(N log N + K log N). You win if K >> log N.
This said, when no preprocessing is allowed, O(N) behavior is ideal, and O(N log N) is quite comfortable as well (for a million elements, lg N is only 20). You start screaming with O(N²) and worse.
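A small C++ illustration of that trade-off (function names are illustrative):

#include <algorithm>
#include <vector>

// Without preprocessing: each of the K lookups is a linear scan,
// so K lookups cost O(K * N).
bool containsLinear(const std::vector<int>& v, int key) {
    return std::find(v.begin(), v.end(), key) != v.end();   // O(N)
}

// With preprocessing: one O(N log N) sort up front, then each of
// the K lookups is O(log N). Total O(N log N + K log N), which
// wins once K >> log N.
bool containsSorted(const std::vector<int>& sorted, int key) {
    return std::binary_search(sorted.begin(), sorted.end(), key);  // O(log N)
}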
He said those algorithms are what we strive for, which is generally true. Many algorithms cannot possibly be improved better than logarithmic or linear time, and while constant time would be better in a perfect world, it's often unattainable.
constant time is always better because the time (or space) complexity doesn't depend on the problem size... isn't it a great feature? :-)
then we have O(N), and then O(N log N).
Did you know? Problems with constant time complexity exist!
For example:
Let A[N] be an array of N integer values, with N > 3. Find an algorithm to tell whether the sum of the first three elements is positive or negative.
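A sketch of that example (the function name is illustrative); only the first three elements matter, so the work done is independent of N:

#include <vector>

bool firstThreeSumPositive(const std::vector<int>& A) {
    // Assumes A.size() > 3, as the problem states.
    return A[0] + A[1] + A[2] > 0;   // O(1): three reads, two additions
}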
What we strive for is efficiency, in the sense of designing algorithms with a time (or space) complexity that does not exceed their theoretical lower bound.
For instance, using comparison-based algorithms, you can't find a value in a sorted array faster than Omega(log N), and you cannot sort an array faster than Omega(N log N) in the worst case.
Thus, binary search, O(log N), and heapsort, O(N log N), are efficient algorithms, while linear search, O(N), and bubble sort, O(N²), are not.
The lower bound depends on the problem to be solved, not on the algorithm.
Yes, constant time, i.e. O(1), is better than linear time O(n), because the former does not depend on the input size of the problem. The order, from best to worst, is O(1), O(log n), O(n), O(n log n).
Linear or linearithmic time is what we strive for because going for O(1) might not be realistic, as every sorting algorithm needs at least a few comparisons, which the professor proves with his decision-tree comparison analysis, where he sorts three elements a, b, c and derives a lower bound of n log n. Check his "Complexity of Sorting" in the Mergesort lecture.

Can an O(n) algorithm ever exceed O(n^2) in terms of computation time?

Assume I have two algorithms:
for (int i = 0; i < n; i++) {
    for (int j = 0; j < n; j++) {
        // do something in constant time
    }
}
This is naturally O(n^2). Suppose I also have:
for (int i = 0; i < 100; i++) {
    for (int j = 0; j < n; j++) {
        // do something in constant time
    }
}
This is O(n) + O(n) + O(n) + ... + O(n) (100 times) = O(n).
It seems that even though my second algorithm is O(n), it will take longer. Can someone expand on this? I bring it up because I often see algorithms where, for example, they will perform a sorting step first, and when determining total complexity, it's just the most complex element that bounds the algorithm.
Asymptotic complexity (which is what both big-O and big-Theta represent) completely ignores the constant factors involved - it's only intended to give an indication of how running time will change as the size of the input gets larger.
So it's certainly possible that a Θ(n) algorithm can take longer than a Θ(n²) one for some given n - which n this will happen for will really depend on the algorithms involved - for your specific example, this will be the case for n < 100, ignoring the possibility of optimizations differing between the two.
For any two given algorithms taking Θ(n) and Θ(n²) time respectively, what you're likely to see is that either:
The Θ(n) algorithm is slower when n is small, then the Θ(n²) one becomes slower as n increases
(which happens if the Θ(n) one is more complex, i.e. has higher constant factors), or
The Θ(n²) one is always slower.
It's also possible that the Θ(n) algorithm is slower, then the Θ(n²) one, then the Θ(n) one again, and so on as n increases, until n gets very large, from which point onwards the Θ(n²) one will always be slower - although this is quite unlikely to happen.
In slightly more mathematical terms:
Let's say the Θ(n²) algorithm takes cn² operations for some c.
And the Θ(n) algorithm takes dn operations for some d.
This is in line with the formal definition, since we can assume this holds for n greater than 0 (i.e. for all n) and that the two functions between which the running time lies are the same.
In line with your example, if you were to say c = 1 and d = 100, then the Θ(n) algorithm would be slower until n = 100, at which point the Θ(n²) algorithm would become slower.
Notation note:
Technically big-O is only an upper bound, meaning you can say an O(1) algorithm (or really any algorithm taking O(n²) or less time) takes O(n²) time as well. Thus I instead used big-Theta (Θ) notation, which is a tight bound. See the formal definitions for more information.
Big-O is often informally treated as or taught to be a tight bound, so you may already have been essentially using big-Theta without knowing it.
If we're talking about an upper bound only (as per the formal definition of big-O), that would rather be an "anything goes" situation: the O(n) one can be faster, the O(n²) one can be faster, or they can take the same amount of time (asymptotically). One usually can't draw particularly meaningful conclusions from comparing the big-O of algorithms; one can only say that, given a big-O bound on some algorithm, that algorithm won't take any longer than that amount of time (asymptotically).
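For reference, the standard formal definitions (as given in CLRS, for instance) are:

\[ f(n) \in O(g(n)) \iff \exists\, c > 0,\ n_0 > 0 \ \text{such that} \ 0 \le f(n) \le c \cdot g(n) \ \text{for all} \ n \ge n_0 \]
\[ f(n) \in \Theta(g(n)) \iff \exists\, c_1, c_2 > 0,\ n_0 > 0 \ \text{such that} \ c_1 \cdot g(n) \le f(n) \le c_2 \cdot g(n) \ \text{for all} \ n \ge n_0 \]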
Yes, an O(n) algorithm can exceed an O(n²) algorithm in terms of running time. This happens when the constant factor (which we omit in the big-O notation) is large. For example, in your code above, the O(n) algorithm has a large constant factor, so it performs worse than the algorithm that runs in O(n²) for n < 100.
Here, n = 100 is the cross-over point. So when a task can be performed in both O(n) and O(n²), and the constant factor of the linear algorithm is larger than that of the quadratic algorithm, then for inputs below the cross-over point we prefer the algorithm with the asymptotically worse running time.
For example, when sorting an array, we switch to insertion sort for smaller arrays, even though merge sort and quicksort run asymptotically faster. This is because insertion sort has a smaller constant factor than merge/quicksort and will run faster, as sketched below.
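A sketch of such a hybrid in C++ (the cutoff of 16 and the helper names are illustrative; real libraries tune the cutoff empirically):

#include <algorithm>
#include <vector>

const std::size_t kCutoff = 16;   // illustrative value; tuned in practice

// Insertion sort on the half-open range [lo, hi): small constant factor.
void insertionSortRange(std::vector<int>& v, std::size_t lo, std::size_t hi) {
    for (std::size_t i = lo + 1; i < hi; i++) {
        int key = v[i];
        std::size_t j = i;
        while (j > lo && v[j - 1] > key) { v[j] = v[j - 1]; j--; }
        v[j] = key;
    }
}

// Merge sort that hands small subarrays to insertion sort.
void hybridMergeSort(std::vector<int>& v, std::size_t lo, std::size_t hi) {
    if (hi - lo <= kCutoff) { insertionSortRange(v, lo, hi); return; }
    std::size_t mid = lo + (hi - lo) / 2;
    hybridMergeSort(v, lo, mid);
    hybridMergeSort(v, mid, hi);
    std::inplace_merge(v.begin() + lo, v.begin() + mid, v.begin() + hi);
}

A call like hybridMergeSort(v, 0, v.size()) sorts the whole vector.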
Big-O is not meant to compare the relative speed of different algorithms. It is meant to measure how fast the running time increases when the size of the input increases. For example,
O(n) means that if n is multiplied by 1000, then the running time is roughly multiplied by 1000.
O(n^2) means that if n is multiplied by 1000, then the running time is roughly multiplied by 1,000,000.
So when n is large enough, any O(n) algorithm will beat an O(n^2) algorithm. It doesn't mean anything for a fixed n.
Long story short, yes, it can. The definition of O is based on the fact that O(f(x)) < O(g(x)) implies that g(x) will definitively take more time to run than f(x), given a big enough x.
For example, it is a known fact that for small inputs merge sort is outperformed by insertion sort (if I remember correctly, that should hold true for n smaller than 31).
Yes. O() denotes only asymptotic complexity. A linear algorithm can be slower than a quadratic one if it has a large enough constant factor (e.g. if the body of its loop takes 10 times longer to run, it will be slower than its quadratic version for small inputs).
The O() notation is only an extrapolation, although a quite good one.
The only guarantee you get is that, no matter the constant factors, for big enough n the O(n) algorithm will spend fewer operations than the O(n^2) one.
As an example, let's count the operations in the OP's neat example. His two algorithms differ in only one line:
for (int i = 0; i < n; i++) {      // A, the O(n*n) algorithm
vs.
for (int i = 0; i < 100; i++) {    // B, the O(n) algorithm
Since the rest of his programs are the same, the difference in actual running times will be decided by these two lines.
For n=100, both lines do 100 iterations, so A and B perform exactly the same at n=100.
For n<100, say, n=10, A does only 10 iterations, whereas B does 100. Clearly A is faster.
For n>100, say, n=1000. Now the loop of A does 1000 iterations, whereas, the B loop still does its fixed 100 iterations. Clearly A is slower.
Of course, how big n has to get for the O(n) algorithm to be faster depends on the constant factor. If you change the constant 100 to 1000 in B, then the cutoff also changes to 1000.

Big theta notation of insertion sort algorithm

I'm studying asymptotic notations from the book and I can't understand what the author means. I know that if f(n) = Θ(n^2) then f(n) = O(n^2). However, I understand from the author's words that for the insertion sort algorithm f(n) = Θ(n) and f(n) = O(n^2).
Why? Does the big omega or big theta change with different inputs?
He says that:
"The Θ(n^2) bound on the worst-case running time of insertion sort, however, does not imply a Θ(n^2) bound on the running time of insertion sort on every input. "
However, it is different for big-O notation. What does he mean? What is the difference between them?
I'm so confused. I'm copy-pasting it below:
Since O-notation describes an upper bound, when we use it to bound the worst-case running time of an algorithm, we have a bound on the running time of the algorithm on every input. Thus, the O(n^2) bound on the worst-case running time of insertion sort also applies to its running time on every input. The Θ(n^2) bound on the worst-case running time of insertion sort, however, does not imply a Θ(n^2) bound on the running time of insertion sort on every input. For example, when the input is already sorted, insertion sort runs in Θ(n) time.
Does the big omega or big theta change with different inputs?
Yes. To give a simpler example, consider linear search in an array from left to right. In the worst and average case, this algorithm takes f(n) = a × n/2 + b expected steps for some constants a and b. But when the left element is guaranteed to always hold the key you're looking for, it always takes a + b steps.
Since Θ denotes a strict bound, and Θ(n) != Θ(n²), it follows that the Θ for the two kinds of input is different.
EDIT: as for Θ and big-O being different on the same input, yes, that's perfectly possible. Consider the following (admittedly trivial) example.
When we set n to 5, then n = 5 and n < 6 are both true. But when we set n to 1, then n = 5 is false while n < 6 is still true.
Similarly, big-O is just an upper bound, just like < on numbers, while Θ is a strict bound like =.
(Actually, Θ is more like a < n < b for constants a and b, i.e. it defines something analogous to a range of numbers, but the principle is the same.)
Refer to CLRS, 3rd edition, page 44 (Asymptotic notation, functions, and running times). It says:
Even when we use asymptotic notation to apply to the running time of an algorithm, we need to understand which running time we mean. Sometimes we are interested in the worst-case running time. Often, however, we wish to characterize the running time no matter what the input. In other words, we often wish to make a blanket statement that covers all inputs, not just the worst case.
Takeaways from the above paragraph:
The worst case provides an upper limit on the running time of an algorithm.
Thus, the O(n^2) bound on the worst-case running time of insertion sort also applies to its running time on every input.
But the Θ(n^2) bound on the worst-case running time of insertion sort does not imply a Θ(n^2) bound on its running time on every input,
because the best-case running time of insertion sort is Θ(n) (when the list is already sorted).
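To make the best and worst cases concrete, here is an insertion sort sketch in C++: on an already-sorted array the inner while condition fails immediately on every iteration, so the total work is Θ(n), while a reverse-sorted array forces Θ(n^2) shifts.

#include <vector>

// Insertion sort: Θ(n²) in the worst case (reverse-sorted input),
// Θ(n) in the best case (already-sorted input, where the while
// loop body never executes).
void insertionSort(std::vector<int>& v) {
    for (std::size_t i = 1; i < v.size(); i++) {
        int key = v[i];
        std::size_t j = i;
        while (j > 0 && v[j - 1] > key) {   // 0 iterations if sorted
            v[j] = v[j - 1];
            j--;
        }
        v[j] = key;
    }
}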
We usually state the worst-case time complexity of an algorithm, but when the best case and average case are taken into account, the time complexity varies across these cases.
In simple words, the running time of a program is described as a function of its input size, i.e. f(n).
The = is asymmetric; thus an + b = O(n) means f(n) belongs to the set O(g(n)). So we can also say an + b = O(n^2), and it is true, because f(n) also belongs to the set O(n^2) for suitable constants.
Thus Big-Oh (O) only gives an upper bound; you could say the notation gives a blanket statement, meaning all inputs of a given size are covered, not just the worst-case one (for insertion sort, that would be an array of size n in reverse order).
So n = O(n^2) is true, but it would be an abuse when describing the worst-case running time of an algorithm, since the worst-case running time gives an upper bound on the running time for any input. And as we all know, in the case of insertion sort the running time depends on how sorted the input array of a given size already is: if the array is sorted, the running time will be linear.
So we need a tight asymptotic bound notation to describe our worst case, which is provided by Θ notation. Thus the worst case of insertion sort is Θ(n^2) and the best case is Θ(n).
we have a bound on the running time of the algorithm on every input
It means that if there is a set of inputs with running time n^2 while others have less, then the algorithm is O(n^2).
The Θ(n^2) bound on the worst-case running time of insertion sort, however, does not imply a Θ(n^2) bound on the running time of insertion sort on every input
He is saying that the converse is not true: if an algorithm is O(n^2), it does not mean every single input will run in quadratic time.
My academic theory on the insertion sort algorithm is far in the past, but from what I understand of your copy-paste:
big-O notation gives an upper bound (which is why a worst-case O bound covers every input), while big-Theta gives a tight bound for the particular case being described.
Take a look at this: What is the difference between Θ(n) and O(n)?

Resources