Why do Rust's sort methods allocate memory?

Methods like sort_by on std::slice::MutableSliceAllocating or sort_by on collections::vec::Vec are documented to "allocate approximately 2 * n, where n is the length". I don't think good C++ std::sort implementations allocate on the heap, yet they achieve the same O(n log n) complexity. The Rust sort methods are, however, stable, unlike C++ std::sort.
Why do the Rust sort methods allocate? To me, it doesn't fit the "zero cost abstraction" bill advertised here.

I realise this is an old post, but I found it on Google and the popular answer is wrong. It is in fact possible to perform a stable in-place sort using O(1) (not even logarithmic) extra memory and worst-case O(n log n) time. See for example GrailSort or WikiSort.

As stated in the comment, this is a stable sort, which requires O(n) space to execute. The best O(n log n) stable sort, mergesort, requires about ½n temporary items. (I'm not familiar with Rust, so I don't know why it needs four times that.)
A stable sort can be achieved in O(log n) space, but only by a mergesort variant that takes O(n log² n) time.
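To make the ½n figure concrete: a textbook top-down merge sort only needs a temporary buffer for the left half of each merge, because the right half can be merged from where it already sits. Here is a rough Java sketch of that idea (the class and method names are mine, and this is not what Rust's sort actually does):

import java.util.Arrays;

public class HalfBufferMergeSort {
    // Sorts a[lo..hi] (inclusive); each merge allocates at most (hi - lo + 1) / 2 scratch elements.
    static void sort(int[] a, int lo, int hi) {
        if (hi - lo < 1) return;
        int mid = lo + (hi - lo) / 2;
        sort(a, lo, mid);
        sort(a, mid + 1, hi);
        merge(a, lo, mid, hi);
    }

    // Copies the left half into aux, then merges aux with the right half back into a.
    static void merge(int[] a, int lo, int mid, int hi) {
        int[] aux = Arrays.copyOfRange(a, lo, mid + 1); // about n/2 temporary items
        int i = 0, j = mid + 1, k = lo;
        while (i < aux.length && j <= hi) {
            a[k++] = (aux[i] <= a[j]) ? aux[i++] : a[j++]; // <= keeps the sort stable
        }
        while (i < aux.length) a[k++] = aux[i++];          // remaining left-half items
        // any remaining right-half items are already in place
    }

    public static void main(String[] args) {
        int[] data = {5, 3, 8, 1, 9, 2, 7};
        sort(data, 0, data.length - 1);
        System.out.println(Arrays.toString(data)); // [1, 2, 3, 5, 7, 8, 9]
    }
}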

Related

How does the Franceschini method work?

Wikipedia mentions that this method sorts an array in O(n log n) time, and that it is also stable and in-place. That sounds like a very good sorting algorithm, since no other common sorting algorithm does all of those at once (insertion sort isn't O(n log n), heapsort isn't stable, quicksort (or introsort) isn't stable, mergesort is not in-place). However, Wikipedia mentions only its name and nothing else. The reference given is Franceschini, Gianni (1 June 2007), "Sorting Stably, in Place, with O(n log n) Comparisons and O(n) Moves", Theory of Computing Systems 40 (4): 327–353. That doesn't really explain how it actually works; it is more about why it exists.
My question is: how does this method work (what steps does it actually take), and why are there so few resources related to it, considering there are no other known O(n log n) stable in-place methods of sorting?
"considering there are no other known O(n log n) stable in-place methods of sorting"
It's sufficiently easy to implement merge sort in-place with O(log n) additional space, which I guess is close enough in practice.
In fact there is a merge sort variant which is stable and uses only O(1) additional memory: "Practical in-place mergesort" by Katajainen, Pasanen and Teuhola. It has an optimal O(n log n) running time, but it is not optimal in the number of moves: it uses Ω(n log n) element moves, whereas O(n) moves suffice, as the Franceschini paper demonstrates.
It seems to run slower than a traditional merge sort, but not by a large margin. In contrast, the Franceschini version seems to be a lot more complicated and have a huge constant overhead.
Just a relevant note: it IS possible to turn any unstable sorting algorithm into a stable one, by simply holding the original array index alongside the key. When performing the comparison, if the keys are equal, the indices are compared instead.
Using such a technique would turn heapsort, for example, into an in-place, worst-case O(n log n), stable algorithm.
However, since we need to store O(1) of 'additional' data for every entry, we technically need O(n) extra space, so this isn't really in-place unless you consider the original index part of the key. Franceschini's algorithm does not require holding any additional data.
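Here is a minimal Java sketch of that index-tagging trick (the class name and the key/index packing are mine; packing assumes non-negative keys so the combined values sort correctly):

import java.util.Arrays;

public class StabilizedSort {
    public static void main(String[] args) {
        int[] keys = {3, 1, 3, 2, 1};   // non-negative keys assumed for this packing trick

        // Pair each key with its original position: key in the high 32 bits, index in the low 32 bits.
        int n = keys.length;
        long[] tagged = new long[n];
        for (int i = 0; i < n; i++) {
            tagged[i] = ((long) keys[i] << 32) | i;
        }

        // Any unstable O(n log n) sort would do here; equal keys now differ in their
        // low (index) bits, so they end up in their original relative order.
        Arrays.sort(tagged);

        for (int i = 0; i < n; i++) {
            keys[i] = (int) (tagged[i] >> 32);
        }
        System.out.println(Arrays.toString(keys)); // [1, 1, 2, 3, 3]
    }
}

For general keys you would instead sort an array of (key, index) pairs with a comparator that falls back to the index on ties; the packed-long version above just keeps the example short.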
The paper can be found here: https://link.springer.com/article/10.1007/s00224-006-1311-1. However, it's rather complicated, splitting into cases based on whether the number of distinct elements is o(n / (log n)^3) or not. The hidden constants probably make this an unattractive solution for sorting in practice, especially since sorting usually doesn't have to be stable unless the elements carry secondary information whose original order needs to be preserved.

Why are we always using quicksort? Or any specific sorting algorithm?

Why are we always using quicksort? Or any specific sorting algorithm?
I tried some experiments on my PC using quicksort, merge sort, heapsort, and flash sort.
Results:
sorting algorithm : time in nanoseconds -> time in minutes
quick sort time : 135057597441 -> 2.25095995735
flash sort time : 137704213630 -> 2.29507022716667
merge sort time : 138317794813 -> 2.30529658021667
heap sort time  : 148662032992 -> 2.47770054986667
The times were measured using Java's built-in function:
long startTime = System.nanoTime();
The times above are in nanoseconds; converted to seconds, there is hardly any difference between them for 20,000,000 random integers (the maximum array size in Java is 2,147,483,647). Even up to the maximum array size, an in-place algorithm might only make a difference of one to two minutes.
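Presumably the measurement looked roughly like the sketch below; the class name, the fixed seed, and the use of Arrays.sort as a stand-in for the individual quick/merge/heap/flash sort calls are mine, not the original poster's code:

import java.util.Arrays;
import java.util.Random;

public class SortTiming {
    public static void main(String[] args) {
        int n = 20_000_000;                     // same input size as in the experiment above
        int[] data = new Random(42).ints(n).toArray();

        int[] copy = Arrays.copyOf(data, n);    // sort a copy so every algorithm sees the same input
        long startTime = System.nanoTime();
        Arrays.sort(copy);                      // stand-in for quick/merge/heap/flash sort
        long elapsed = System.nanoTime() - startTime;

        System.out.printf("time: %d ns (%.2f s)%n", elapsed, elapsed / 1e9);
    }
}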
If the difference is so small, why should we care?
All of the algorithms presented have the same average-case bound, O(n lg n), which is the best a comparison sort can do.
Since they share the same average bound, the expected performance of these algorithms over random data should be similar - which is what the findings show. However, the devil is in the details. Here is a very quick summary; follow the links for further details.
Quicksort is generally not stable (but there are stable variations). While quicksort has an average bound of O(n lg n), it has a worst-case bound of O(n^2), although there are ways to mitigate this. Quicksort, like heapsort, works in-place.
Merge sort is a stable sort. It has a worst-case bound of O(n lg n), which means it has predictable performance. Basic merge sort requires O(n) extra space, so it's generally not an in-place sort (although there is an in-place variant, and the extra memory for a linked-list implementation is constant).
Heapsort is not stable; it also has a worst-case bound of O(n lg n), but it has the benefit of a constant space bound and being in-place. It has worse cache behavior and parallelism characteristics than merge sort.
Exactly which one is "best" depends upon the use-case, data, and exact implementation/variant.
Merge sort (or a hybrid such as Timsort) is the "default" sort implementation in many libraries/languages. A common Quicksort-based hybrid, Introsort, is used in several C++ implementations. Vanilla/plain Quicksort implementations, should they be provided, are usually secondary implementations.
Merge-sort: a stable sort with consistent performance and acceptable memory bounds.
Quicksort/heapsort: trivially work in-place and [effectively] don't require additional memory.
We rarely need to sort integer data. One of the biggest overheads in a sort is the time it takes to make comparisons. Quicksort reduces the number of comparisons required compared with, say, a bubble sort. If you're sorting strings this is much more significant. As a real-world example, some years ago I wrote a sort/merge that took 40 minutes with a bubble sort and 17 with a quicksort. (It was on a Z80 CPU a long time ago. I'd expect much better performance now.)
Your conclusion is correct: most people who care about this in most situations are wasting their time. Differences between these algorithms in terms of time and memory complexity become significant in particular scenarios where:
you have huge number of elements to sort
performance is really critical (for example: real-time systems)
resources are really limited (for example: embedded systems)
(please note the really)
Also, there is the concern of stability which may be important more often. Most standard libraries provide stable sort algorithms (for example: OrderBy in C#, std::stable_sort in C++, sort in Python, sort methods in Java).
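For instance, Java's Arrays.sort for object arrays is documented to be stable, so equal keys keep their original relative order. A small illustration (the Entry record is mine, and it needs Java 16+ for the record syntax):

import java.util.Arrays;
import java.util.Comparator;

public class StableSortDemo {
    record Entry(String name, int group) {}

    public static void main(String[] args) {
        Entry[] entries = {
            new Entry("alice", 2), new Entry("bob", 1),
            new Entry("carol", 2), new Entry("dave", 1)
        };

        // Arrays.sort on object arrays is stable (TimSort), so within each group
        // the original relative order (alice before carol, bob before dave) is kept.
        Arrays.sort(entries, Comparator.comparingInt(Entry::group));

        System.out.println(Arrays.toString(entries));
        // [Entry[name=bob, group=1], Entry[name=dave, group=1],
        //  Entry[name=alice, group=2], Entry[name=carol, group=2]]
    }
}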
Correctness. While switching between sort algorithms might offer speed-ups under some specific scenarios, the cost of proving that an algorithm works can be quite high.
For instance, TimSort, a popular sorting algorithm used by Android, Java, and Python, had an implementation bug that went unnoticed for years. This bug could cause a crash and was easily induced by the user.
It took a dedicated team "looking for a challenge" to isolate and solve the issue.
For this reason, any time a standard implementation of a data structure or algorithm is available, I will use that standard implementation. The time saved by using a smarter implementation is rarely worth uncertainty about the implementation's security and correctness.

Why is Insertion sort better than Quick sort for small lists of elements?

Isn't Insertion sort O(n^2) > Quicksort O(n log n)...so for a small n, won't the relation be the same?
Big-O Notation describes the limiting behavior when n is large, also known as asymptotic behavior. This is an approximation. (See http://en.wikipedia.org/wiki/Big_O_notation)
Insertion sort is faster for small n because Quick Sort has extra overhead from the recursive function calls. Insertion sort is also stable, unlike Quick sort, and requires less memory.
This question describes some further benefits of insertion sort. ( Is there ever a good reason to use Insertion Sort? )
Define "small".
When benchmarking sorting algorithms, I found out that switching from quicksort to insertion sort - despite what everybody was saying - actually hurts performance (recursive quicksort in C) for arrays larger than 4 elements. And those arrays can be sorted with a size-dependent optimal sorting algorithm.
That being said, always keep in mind that O(n...) only counts the number of comparisons (in this specific case), not the speed of the algorithm. The speed depends on the implementation, e.g., whether or not your quicksort function is recursive and how quickly function calls are dealt with.
Last but not least, big-O notation is only an upper bound.
If algorithm A requires 10000 n log n comparisons and algorithm B requires 10 n^2, the first is O(n log n) and the second is O(n^2). Nevertheless, for modest n the second will (probably) be faster.
O()-notation is typically used to characterize performance for large problems, while deliberately ignoring constant factors and additive offsets to performance.
This is important because constant factors and overhead can vary greatly between processors and between implementations: the performance you get for a single-threaded Basic program on a 6502 machine will be very different from the same algorithm implemented as a C program running on an Intel i7-class processor. Note that implementation optimization is also a factor: attention to detail can often get you a major performance boost, even if all other factors are the same!
However, the constant factor and overhead are still important. If your application ensures that N never gets very large, the asymptotic behavior of O(N^2) vs. O(N log N) doesn't come into play.
Insertion sort is simple and, for small lists, it is generally faster than a comparably implemented quicksort or mergesort. That is why a practical sort implementation will generally fall back on something like insertion sort for the "base case", instead of recursing all the way down to single elements.
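A hedged sketch of that base-case fallback in Java: a plain recursive quicksort that hands small sub-ranges to insertion sort. The cutoff of 16 and all names are mine, chosen for illustration rather than tuned:

import java.util.Random;

public class HybridQuickSort {
    static final int CUTOFF = 16; // arbitrary small-subarray threshold for illustration

    static void sort(int[] a, int lo, int hi) {
        if (hi - lo + 1 <= CUTOFF) {      // small range: insertion sort is cheaper
            insertionSort(a, lo, hi);
            return;
        }
        int p = partition(a, lo, hi);
        sort(a, lo, p - 1);
        sort(a, p + 1, hi);
    }

    // Plain insertion sort on a[lo..hi] (inclusive).
    static void insertionSort(int[] a, int lo, int hi) {
        for (int i = lo + 1; i <= hi; i++) {
            int key = a[i], j = i - 1;
            while (j >= lo && a[j] > key) {
                a[j + 1] = a[j];
                j--;
            }
            a[j + 1] = key;
        }
    }

    // Lomuto partition with the last element as pivot (kept simple to read).
    static int partition(int[] a, int lo, int hi) {
        int pivot = a[hi], i = lo;
        for (int j = lo; j < hi; j++) {
            if (a[j] < pivot) { int t = a[i]; a[i] = a[j]; a[j] = t; i++; }
        }
        int t = a[i]; a[i] = a[hi]; a[hi] = t;
        return i;
    }

    public static void main(String[] args) {
        int[] data = new Random(1).ints(1000, 0, 1000).toArray();
        sort(data, 0, data.length - 1);
        System.out.println(isSorted(data)); // true
    }

    static boolean isSorted(int[] a) {
        for (int i = 1; i < a.length; i++) if (a[i - 1] > a[i]) return false;
        return true;
    }
}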
It's a matter of the constants attached to the running time, which we ignore in big-O notation (because we are concerned with the order of growth). For insertion sort, the running time is O(n^2), i.e. T(n) <= c(n^2), whereas for Quicksort it is T(n) <= k(n lg n). As c is quite small, for small n the running time of insertion sort is less than that of Quicksort.
Hope it helps...
A good real-world example of insertion sort used in conjunction with quicksort is the implementation of the qsort function from glibc.
The first thing to point out is that qsort implements the quicksort algorithm with an explicit stack, because that consumes less memory; the stack is implemented through macro directives.
Summary of the current implementation from the source code (you'll find a lot of useful information in the comments if you take a look at it):
Non-recursive
Chooses the pivot element using a median-of-three decision tree
Only quicksorts TOTAL_ELEMS / MAX_THRESH partitions, leaving insertion sort to order the MAX_THRESH items within each partition. This is a big win, since insertion sort is faster for small, mostly sorted array segments.
The larger of the two sub-partitions is always pushed onto the stack first
What does the MAX_THRESH value stand for? Well, it's just a small magic constant which was chosen to work best on a Sun 4/260.
How about binary insertion sort? You can find the insertion position using binary search.
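A sketch of what that might look like in Java: binary search cuts the comparisons to O(n log n), although the element shifts are still O(n^2) in the worst case, so the asymptotic running time does not improve. The class and method names are mine:

import java.util.Arrays;

public class BinaryInsertionSort {
    static void sort(int[] a) {
        for (int i = 1; i < a.length; i++) {
            int key = a[i];
            // Find the insertion point in the sorted prefix a[0..i-1] by binary search.
            int lo = 0, hi = i;
            while (lo < hi) {
                int mid = (lo + hi) >>> 1;
                if (a[mid] <= key) lo = mid + 1;   // <= keeps equal keys stable
                else hi = mid;
            }
            // Shift the tail right by one and drop the key into place.
            System.arraycopy(a, lo, a, lo + 1, i - lo);
            a[lo] = key;
        }
    }

    public static void main(String[] args) {
        int[] data = {5, 2, 4, 2, 1};
        sort(data);
        System.out.println(Arrays.toString(data)); // [1, 2, 2, 4, 5]
    }
}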

Comparison between timsort and quicksort

Why is it that I mostly hear about Quicksort being the fastest overall sorting algorithm when Timsort (according to wikipedia) seems to perform much better? Google didn't seem to turn up any kind of comparison.
TimSort is a highly optimized mergesort; it is stable and faster than the old mergesort.
Compared with quicksort, it has two advantages:
It is unbelievably fast for nearly sorted data sequences (including reverse-sorted data);
The worst case is still O(n log n).
To be honest, I don't think #1 is an advantage, but it did impress me.
Here are QuickSort's advantages:
QuickSort is very, very simple; even for a highly tuned implementation, we can write down its pseudocode within 20 lines;
QuickSort is fastest in most cases;
The memory consumption is O(log n).
Currently, the Java 7 SDK implements timsort and a new quicksort variant: Dual-Pivot QuickSort.
If you need stable sort, try timsort, otherwise start with quicksort.
More or less, it has to do with the fact that Timsort is a hybrid sorting algorithm. This means that while the two underlying sorts it uses (Mergesort and Insertion sort) are both worse than Quicksort for many kinds of data, Timsort only uses them when it is advantageous to do so.
On a slightly deeper level, as Patrick87 states, quicksort is a worst-case O(n^2) algorithm. Choosing a good pivot isn't hard, but guaranteeing an O(n log n) quicksort comes at the cost of generally slower sorting on average.
For more detail on Timsort, see this answer, and the linked blog post. It basically assumes that most data is already partially sorted, and constructs "runs" of sorted data that allow for efficient merges using mergesort.
Generally speaking, quicksort is the best algorithm for primitive arrays. This is due to memory locality and caching.
JDK 7 uses TimSort for Object arrays. An Object array only holds object references; the objects themselves are stored on the heap. To compare objects, we need to read them from the heap. This is like reading one object from one part of the heap, then randomly reading an object from another part of the heap, so there will be a lot of cache misses. I guess that is why memory locality no longer matters as much, and may be why the JDK only uses TimSort for Object arrays instead of primitive arrays.
This is only my guess.
Here are benchmark numbers from my machine (i7-6700 CPU, 3.4GHz, Ubuntu 16.04, gcc 5.4.0, parameters: SIZE=100000 and RUNS=3):
$ ./demo
Running tests
stdlib qsort time: 12246.33 us per iteration
##quick sort time: 5822.00 us per iteration
merge sort time: 8244.33 us per iteration
...
##tim sort time: 7695.33 us per iteration
in-place merge sort time: 6788.00 us per iteration
sqrt sort time: 7289.33 us per iteration
...
grail sort dyn buffer sort time: 7856.67 us per iteration
The benchmark comes from Swenson's sort project, in which he has implemented several sorting algorithms in C. Presumably his implementations are good enough to be representative, but I haven't investigated them.
So you really can't tell. Benchmark numbers only stay relevant for at most two years and then you have to repeat them. Possibly, timsort beat qsort waaay back in 2011 when the question was asked, but the times have changed. Or qsort was always the fastest, but timsort beat it on non-random data. Or Swenson's code isn't so good and a better programmer would turn the tide in timsort's favor. Or perhaps I suck and didn't use the right CFLAGS when compiling the code. Or... You get the point.
Tim Sort is great if you need an order-preserving sort, or if you are sorting a complex array (comparing heap-based objects) rather than a primitive array. As mentioned by others, quicksort benefits significantly from the locality of data and processor caching for primitive arrays.
The fact that the worst case of quicksort is O(n^2) was raised. Fortunately, you can achieve O(n log n) worst-case time with quicksort. The quicksort worst case occurs when the pivot is always the smallest or largest value, such as when the pivot is the first or last element of an already sorted array.
We can achieve O(n log n) worst-case quicksort by choosing the median value as the pivot, since finding the median can be done in linear time, O(n). Since O(n) + O(n log n) = O(n log n), that becomes the worst-case time complexity.
In practice, however, most implementations find that a random pivot is sufficient, so they do not search for the median value.
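For illustration, here are hedged Java sketches of the two cheaper pivot-selection strategies mentioned in this thread (random pivot and median-of-three); both simply move the chosen pivot to the end so a standard last-element partition can be reused. The class, method names and the example are mine:

import java.util.Arrays;
import java.util.Random;

public class PivotChoice {
    // Median-of-three: order a[lo], a[mid], a[hi] and move the median to a[hi].
    // This cheaply defuses the already-sorted and reverse-sorted worst cases,
    // though unlike a true linear-time median it gives no hard O(n log n) guarantee.
    static void medianOfThreeToEnd(int[] a, int lo, int hi) {
        int mid = lo + (hi - lo) / 2;
        if (a[mid] < a[lo]) swap(a, mid, lo);
        if (a[hi] < a[lo]) swap(a, hi, lo);
        if (a[hi] < a[mid]) swap(a, hi, mid);
        swap(a, mid, hi);   // a[mid] now holds the median of the three; park it at hi
    }

    // Random pivot: expected O(n log n) running time for every input,
    // which is the "random pivot is sufficient" approach mentioned above.
    static void randomPivotToEnd(int[] a, int lo, int hi, Random rng) {
        swap(a, lo + rng.nextInt(hi - lo + 1), hi);
    }

    static void swap(int[] a, int i, int j) { int t = a[i]; a[i] = a[j]; a[j] = t; }

    public static void main(String[] args) {
        int[] a = {1, 2, 3, 4, 5, 6, 7};          // already sorted: a worst case for "last element" pivots
        medianOfThreeToEnd(a, 0, a.length - 1);
        System.out.println(Arrays.toString(a));   // the old middle element (4) now sits in the pivot slot
    }
}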
Timsort is a popular hybrid sorting algorithm designed in 2002 by Tim Peters. It is a combination of insertion sort and merge sort. It was developed to perform well on various kinds of real-world data sets. It is a fast, stable and adaptive sorting technique with average and worst-case performance of O(n log n).
How Timsort works
First of all, the input array is split into sub-arrays/blocks known as runs.
A simple insertion sort is used to sort each run.
Merge sort is used to merge the sorted runs into a single array.
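A heavily simplified, hedged Java sketch of that run-then-merge structure follows. Real Timsort enforces a minimum run length, extends short runs with insertion sort, keeps a merge stack and uses galloping; none of that is shown here, and all names are mine:

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MiniRunMergeSort {
    // 1. Split the input into maximal non-decreasing runs.
    // 2. (Real Timsort would extend short runs with insertion sort; skipped here.)
    // 3. Repeatedly merge adjacent runs until one sorted run remains.
    static int[] sort(int[] a) {
        List<int[]> runs = new ArrayList<>();
        int start = 0;
        for (int i = 1; i <= a.length; i++) {
            if (i == a.length || a[i] < a[i - 1]) {    // current run ends here
                runs.add(Arrays.copyOfRange(a, start, i));
                start = i;
            }
        }
        while (runs.size() > 1) {                       // one merge pass over adjacent runs
            List<int[]> next = new ArrayList<>();
            for (int i = 0; i + 1 < runs.size(); i += 2) {
                next.add(merge(runs.get(i), runs.get(i + 1)));
            }
            if (runs.size() % 2 == 1) next.add(runs.get(runs.size() - 1));
            runs = next;
        }
        return runs.isEmpty() ? new int[0] : runs.get(0);
    }

    // Standard stable two-way merge.
    static int[] merge(int[] x, int[] y) {
        int[] out = new int[x.length + y.length];
        int i = 0, j = 0, k = 0;
        while (i < x.length && j < y.length) out[k++] = (x[i] <= y[j]) ? x[i++] : y[j++];
        while (i < x.length) out[k++] = x[i++];
        while (j < y.length) out[k++] = y[j++];
        return out;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 5, 3, 4, 9, 0, 8};             // partially sorted: few runs, few merges
        System.out.println(Arrays.toString(sort(data)));   // [0, 1, 2, 3, 4, 5, 8, 9]
    }
}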
Advantages of Timsort
It performs better on nearly ordered data.
It is well-suited to dealing with real-world data.
Quicksort is a highly useful and efficient sorting algorithm that divides a large array of data into smaller ones; it is based on the concept of divide and conquer. Tony Hoare designed this sorting algorithm in 1959; its average performance is O(n log n).
How Quicksort works
Pick any element as the pivot.
Divide the array into two partitions around the pivot.
Recursively apply quick sort to the left partition.
Recursively apply quick sort to the right partition.
Advantages of Quicksort
It performs better on random data as compared to Timsort.
It is useful when there is limited space availability.
It is better suited for large data sets.

Average time complexity of quicksort vs insertion sort

I'm led to believe that quicksort should be faster than insertion sort on a medium-size unordered int array. I've implemented both algorithms in Java and I notice quicksort is significantly slower than insertion sort.
I have a theory: quicksort is slower because it's recursive, and the calls it makes to its own method signature are quite slow in the JVM, which is why my timer is giving much higher readings than I expected, whereas insertion sort isn't recursive and all the work is done within one method, so the JVM doesn't have to do any extra grunt work? Am I right?
You may be interested in these Sorting Algorithm Animations.
Probably not, unless your recursive methods are making big allocations. It's more likely that there's a quirk in your code or that your data set is small.
The JVM shouldn't have any trouble with recursive calls.
Unless you've hit one of Quicksort's pathological cases (often, a list that is already sorted), Quicksort should be O(n log n) — substantially faster than insertion sort's O(n^2) as n increases.
You may want to use merge sort or heap sort instead; they don't have pathological cases. They are both O(n log n).
(When I did these long ago in C++, quicksort was faster than insertion sort even with fairly small n. Radix sort is notably faster with mid-size n as well.)
Theoretically, quicksort should work faster than insertion sort for random data of medium to large size.
I guess the differences are in the way QS is implemented:
pivot selection for the given data (median-of-three is a better approach)
using the same swap mechanism for QS and insertion sort?
is the input random enough? i.e., if you have clusters of ordered data, performance will suffer.
I did this exercise in C and the results are in accordance with theory.
Actually, for small values of n, insertion sort is better than quicksort, because for small n the time depends more on the constants than on n^2 versus n log n.
The fastest implementations of quicksort use looping instead of recursion. Recursion typically isn't very fast.
You have to be careful how you make the recursive calls, and because it's Java, you can't rely on tail calls being optimized, so you should probably manage your own stack for the recursion.
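A hedged Java sketch of managing that stack yourself: pairs of indices replace the recursive calls, and pushing the larger partition first (as the glibc summary above also does) keeps the stack at O(log n) entries. The class name, the simple Lomuto partition and the example are mine:

import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

public class IterativeQuickSort {
    static void sort(int[] a) {
        Deque<int[]> stack = new ArrayDeque<>();   // {lo, hi} pairs replace recursive calls
        stack.push(new int[] {0, a.length - 1});
        while (!stack.isEmpty()) {
            int[] range = stack.pop();
            int lo = range[0], hi = range[1];
            if (lo >= hi) continue;
            int p = partition(a, lo, hi);
            // Push the larger side first so the smaller side is processed next;
            // this bounds the pending work at O(log n) stack entries.
            if (p - lo > hi - p) {
                stack.push(new int[] {lo, p - 1});
                stack.push(new int[] {p + 1, hi});
            } else {
                stack.push(new int[] {p + 1, hi});
                stack.push(new int[] {lo, p - 1});
            }
        }
    }

    // Same simple Lomuto partition as in the earlier hybrid sketch (last element as pivot).
    static int partition(int[] a, int lo, int hi) {
        int pivot = a[hi], i = lo;
        for (int j = lo; j < hi; j++) {
            if (a[j] < pivot) { int t = a[i]; a[i] = a[j]; a[j] = t; i++; }
        }
        int t = a[i]; a[i] = a[hi]; a[hi] = t;
        return i;
    }

    public static void main(String[] args) {
        int[] data = {9, 4, 7, 1, 3, 8, 2};
        sort(data);
        System.out.println(Arrays.toString(data)); // [1, 2, 3, 4, 7, 8, 9]
    }
}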
Everything that is available to be known about quicksort vs insertion sort can be found in Bob Sedgewick's doctoral dissertation. The boiled-down version can be found in his algorithms textbooks.
I remember that in school, when we did sorting in Java, we would actually do a hybrid of the two. So for recursive algorithms like quicksort and mergesort, we would actually do insertion sort for segments that were very small, say 10 records or so.
Recursion is slow, so use it with care. And as was noted before, if you can figure a way to implement the same algorithm in an iterative fashion, then do that.
There are three things to consider here. First, insertion sort is much faster (O(n) vs O(n log n)) than quicksort IF the data set is already sorted, or nearly so; second, if the data set is very small, the "start-up time" to set up the quicksort, find a pivot point and so on dominates the rest; and third, quicksort is a little subtle, so you may want to re-read the code after a night's sleep.
How are you choosing your pivot in Quicksort?
This simple fact is the key to your question, and probably why Quicksort is running slower. In cases like this it's a good idea to post at least the important sections of your code if you're looking for some real help.
