Sorting Algorithm Comparison - algorithm

Let A and B be two algorithms that solve the same problem.
Claim: If A is faster than B both in the worst case and in the average case, then
necessarily A is faster than B in the best case as well.

No. Consider merge sort vs insertion sort. Merge sort is always O(n log n) in the best, average, and worst cases. Insertion sort is O(n^2) in the average and worst cases; however, it is O(n) in the best case.
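A minimal sketch of the counterexample (function name and test data are mine, purely illustrative): on an already sorted input, insertion sort's inner loop exits immediately for every element, which is where the O(n) best case comes from.

```python
# Insertion sort: on sorted input the inner while loop never runs,
# so the whole pass is O(n); on reversed input it is O(n^2).
def insertion_sort(a):
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        # On already sorted input this condition fails immediately.
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key
    return a

print(insertion_sort([5, 2, 4, 6, 1, 3]))  # [1, 2, 3, 4, 5, 6]
```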

Related

Sorting algorithm with quadratic and linear runtime

Is it possible to design a sorting algorithm whose worst-case running time is quadratic, O(n^2),
but whose running time is linear, O(n), in most cases
(that is, on more than half of the inputs of size n)?
I was thinking about Radix Sort and just making the worst case worse,
but I do not know if that is possible.
Yes, Bucket Sort.
You can read about the algorithm here:
https://www.geeksforgeeks.org/bucket-sort-2/?ref=lbp
In the worst case, a single bucket could contain all n values of the
array. Because each bucket is sorted with insertion sort, whose worst-case running
time is O(n^2), bucket sort inherits that O(n^2) worst case. We can avoid this by
sorting each bucket with merge sort instead, which has worst-case running time O(n lg n).
Yes, it is possible.
The analysis of bucket sort shows exactly this behaviour (with a reasonable number of buckets): linear expected time on most inputs, quadratic in the worst case.
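A minimal sketch of the idea, assuming the keys are uniformly distributed in [0, 1) and using n buckets (the function name and the per-bucket use of Python's built-in sort are my choices; the built-in sort keeps the worst case at O(n log n), matching the merge-sort suggestion above):

```python
# Bucket sort for keys assumed uniform in [0, 1).
# With n buckets, each bucket holds O(1) elements in expectation,
# so the expected total running time is linear.
def bucket_sort(a):
    n = len(a)
    if n == 0:
        return []
    buckets = [[] for _ in range(n)]
    for x in a:
        buckets[int(x * n)].append(x)  # assumes 0 <= x < 1
    out = []
    for b in buckets:
        out.extend(sorted(b))  # built-in sort per bucket keeps worst case O(n log n)
    return out

print(bucket_sort([0.78, 0.17, 0.39, 0.26, 0.72, 0.94, 0.21]))
```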

Time Complexity to Sort a K Sorted Array Using Quicksort Algorithm

Problem:
I have to analyze the time complexity to sort (using Quick sort) a list of integer values which are almost sorted.
What I have done?
I have read SO Q1, SO Q2, SO Q3 and this one.
However, I have not found anything which explicitly mentions the time complexity of sorting a k-sorted array using Quick sort.
Since the time complexity of Quick sort depends on the pivot-selection strategy, and almost-sorted data can trigger the worst case, I have used the median of three values (first, middle, last) as the pivot, as referred here, to avoid that worst case.
What do I think?
Since the average-case time complexity of Quick sort is O(n log(n)), and since, as mentioned here, "For any non trivial value of n, a divide and conquer algorithm will need many O(n) passes, even if the array be almost completely sorted",
I think the time complexity of sorting a k-sorted array using Quick sort is O(n log(n)), provided the worst case does not occur.
My Question:
Am I right that the time complexity of sorting a k-sorted array using Quick sort is O(n log(n)), provided I avoid the worst case by selecting a proper pivot and the worst case does not actually occur?
When you say "time complexity of Quick Sort", it is O(n^2), because the worst case is assumed by default. Even if you use another strategy to choose the pivot, such as Randomized Quick Sort, the time complexity is still O(n^2) by default; what improves is the expected time complexity, which is O(n log(n)), since the occurrence of the worst case is highly unlikely. So if you can somehow prove that the worst case is guaranteed not to happen, you can claim a time complexity below O(n^2); otherwise, by default, the worst case is what counts, no matter how unlikely it is.
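A minimal sketch of the median-of-three pivot the question refers to (function name and the Hoare-style partition are my choices, not from the question): sorted input no longer triggers the classic quadratic case, though contrived inputs can still do so.

```python
# Quicksort with a median-of-three pivot (first, middle, last element).
# This avoids the O(n^2) behaviour on already-sorted input, but the
# worst case is still O(n^2) in principle.
def quicksort(a, lo=0, hi=None):
    if hi is None:
        hi = len(a) - 1
    if lo >= hi:
        return
    mid = (lo + hi) // 2
    pivot = sorted((a[lo], a[mid], a[hi]))[1]  # median of three
    i, j = lo, hi
    while i <= j:
        while a[i] < pivot:
            i += 1
        while a[j] > pivot:
            j -= 1
        if i <= j:
            a[i], a[j] = a[j], a[i]
            i, j = i + 1, j - 1
    quicksort(a, lo, j)
    quicksort(a, i, hi)

data = [1, 2, 3, 5, 4, 6, 8, 7, 9]  # almost sorted
quicksort(data)
print(data)
```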

What is the appropriate data structure for insertion sort?

I revisited insertion sort algorithm and noticed something funny.
One obviously shouldn't use an array with this sort, as upon insertion, one will have to shift all subsequent elements O(n^2 log(n)). However a linked list is also not good here, since we preferably find the right placement using binary search, which isn't possible for a simple linked list (so we end up with O(n^2)).
Which makes me wonder: what is a data structure on which this sorting algorithm provides its premise of O(n log(n)) complexity?
From where did you get the premise of O(n log n)? Wikipedia disagrees, as does my own experience. The premises of the insertion sort include components that are O(n) for each of the n elements.
Also, I believe that your claim of O(n^2 log n) is incorrect. The binary search is log n and the ensuing "move sideways" is n, but these two steps happen in succession, not nested, so the cost per insertion is n + log n, not their product. The result is the expected O(n^2).
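A minimal sketch of the "binary insertion sort" this answer describes (function name is mine, and Python's bisect/list.insert stand in for the search and the shift): the search is O(log n), but the insertion still shifts O(n) elements, so the total stays O(n^2).

```python
import bisect

# Binary insertion sort: O(log n) search + O(n) shift per element,
# which is n + log n per insertion and O(n^2) overall.
def binary_insertion_sort(a):
    out = []
    for x in a:
        pos = bisect.bisect_right(out, x)  # O(log n) search
        out.insert(pos, x)                 # O(n) shift
    return out

print(binary_insertion_sort([5, 2, 4, 6, 1, 3]))
```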
If you use a gapped array and a binary search to figure out where to insert things, then with high probability your sort will be O(n log(n)). See https://en.wikipedia.org/wiki/Library_sort for details.
However this is not as efficient as a wide variety of other sorts that are widely implemented. So this knowledge is only of theoretical interest.
Insertion sort is defined over an array or a list; if you use some other data structure, it becomes a different algorithm.
Of course, if you use a BST, insertion and search are O(log(n)) and the overall complexity is O(n log(n)) on average (bear in mind that it is O(n^2) in the worst case), but this is no longer insertion sort; it is tree sort. If you use an AVL tree, you get O(n log(n)) worst-case complexity.
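A minimal sketch of the tree sort this answer describes, using an unbalanced BST (class and function names are mine): O(n log n) on average, degrading to O(n^2) on already-sorted input; a balanced tree such as an AVL tree would restore the O(n log n) worst case.

```python
# Tree sort: insert every element into a BST, then read it back in order.
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def bst_insert(root, key):
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = bst_insert(root.left, key)
    else:
        root.right = bst_insert(root.right, key)
    return root

def tree_sort(a):
    root = None
    for x in a:
        root = bst_insert(root, x)
    out = []
    def inorder(node):
        if node:
            inorder(node.left)
            out.append(node.key)
            inorder(node.right)
    inorder(root)
    return out

print(tree_sort([5, 2, 4, 6, 1, 3]))
```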
In insertion sort, the best case occurs when the sequence is already sorted, which takes linear time; the worst case takes O(n^2) time. I do not know how you got the logarithmic part in the complexity.

Using median selection in quicksort?

I have a slight question about Quicksort. When the minimum or maximum value of the array is selected as the pivot, the partition is very inefficient, since the size of the subproblem decreases by only 1.
However, if I add code to select the median of the array as the pivot, I think it will be more efficient. Since the partition step is already O(N), that would give an O(N log N) algorithm.
Can this be done?
You absolutely can use a linear-time median selection algorithm to compute the pivot in quicksort. This gives you a worst-case O(n log n) sorting algorithm.
However, the constant factor on linear-time selection tends to be so high that the resulting algorithm will, in practice, be much, much slower than a quicksort that just randomly chooses the pivot on each iteration. Therefore, it's not common to see such an implementation.
A completely different approach to avoiding the O(n^2) worst case is to use an approach like the one in introsort. This algorithm monitors the recursion depth of the quicksort. If the algorithm appears to be starting to degenerate, it switches to a different sorting algorithm (usually heapsort) with a guaranteed O(n log n) worst case. This makes the overall algorithm O(n log n) without noticeably decreasing performance. A rough sketch of the idea follows.
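This is only a sketch of the introsort idea, not any library's actual implementation; the depth bound, the middle-element pivot, and the heapq-based fallback are my choices for illustration.

```python
import heapq
import math

# Introsort-style quicksort: track recursion depth and fall back to a
# heap-based sort once the depth exceeds ~2*log2(n), which bounds the
# overall worst case at O(n log n).
def introsort(a):
    def heapsort_slice(lo, hi):
        heap = a[lo:hi + 1]
        heapq.heapify(heap)
        for k in range(lo, hi + 1):
            a[k] = heapq.heappop(heap)

    def sort(lo, hi, depth):
        if lo >= hi:
            return
        if depth == 0:
            heapsort_slice(lo, hi)  # degenerating: switch algorithms
            return
        pivot = a[(lo + hi) // 2]
        i, j = lo, hi
        while i <= j:
            while a[i] < pivot:
                i += 1
            while a[j] > pivot:
                j -= 1
            if i <= j:
                a[i], a[j] = a[j], a[i]
                i, j = i + 1, j - 1
        sort(lo, j, depth - 1)
        sort(i, hi, depth - 1)

    if a:
        sort(0, len(a) - 1, 2 * int(math.log2(len(a))) + 1)

data = list(range(1000, 0, -1))
introsort(data)
print(data[:5], data[-5:])
```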
Hope this helps!

Generic and practical sorting algorithm faster than O(n log n)?

Is there any practical algorithm for generic elements (unlike counting sort or bucket sort) that runs faster than O(n log n)?
Many people have mentioned the information-theoretic Ω(n lg n) bound on comparison sorting algorithms, which can't be broken in comparison sorts. (This earlier question explores why that's the case.)
However, there are some types of comparison sorts that, while not breaking O(n lg n) in the average case, can be shown to run faster on inputs that are already presorted to some extent. For example, Dijkstra's smoothsort runs in O(n) on already-sorted inputs with O(n lg n) worst-case behavior. One of my favorite sorts, Cartesian tree sort, provably takes optimal advantage of presortedness in a few metrics. For example, it can sort any sequence with a constant number of increasing or decreasing subsequences in time O(n), degrading gracefully to O(n lg n) in the worst case.
On the subject of non-comparison sorts, there are some famous but tricky sorting algorithms for integers that surpass O(n lg n) by doing clever bit-manipulation tricks. The best known integer sorting algorithm is a randomized algorithm that can sort in O(n √lg lg n), while the fastest deterministic algorithm for integer sorting runs in O(n lg lg n) time. You may have heard that radix sort works in O(n), though technically it's O(n lg U), where U is the largest value in the array to sort.
In short, no, you can't do much better than O(n lg n), but you can do marginally better if you know something about your input.
For generic elements that you can only compare and not access the internals of, it is impossible to have a sorting algorithm faster than Theta(n log n). That is because there are n! (n factorial) possible orders of the elements, and you need Theta(n log n) comparisons to distinguish all of them.
No. This is one of the few rigorous lower bounds we have for algorithms. For a collection of n elements, there are n! different orders, so to specify a given order we need log(n!) bits. By Stirling's approximation, this is approximately n log n. Each comparison between elements gives us essentially one bit of information (ignoring the possibility of equal elements).
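A quick numeric check of the log(n!) ≈ n log n claim above (the values and loop are purely illustrative):

```python
import math

# Bits needed to identify one of n! orderings versus n*log2(n):
# both grow at the same rate, which is the Omega(n log n) comparison bound.
for n in (10, 100, 1000):
    bits_needed = math.log2(math.factorial(n))
    print(n, round(bits_needed, 1), round(n * math.log2(n), 1))
```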
For how many elements? Even though it's something like N^1.2, a Shell-Metzner sort is often faster than most others up to a few thousand elements (or so).
It also depends on what you mean by "generic" and "practical". A radix sort can beat O(n log n), and it works for a fairly wide variety of data (but definitely not everything).
If your idea of practical and generic limits the algorithm to one that directly compares elements, then no -- nothing is (or ever can be) better than O(n log n). That's been proven for quite some time.

Resources