Why is binary insertion sort faster than direct insertion sort?

Implementation of direct insertion sort:
void insertion_sort(int a[], int n) {
    for (int i = 1; i < n; i++) {
        int key = a[i];
        int j = i - 1;
        // Scan left, shifting larger elements one slot to the right
        while (j >= 0 && a[j] > key) {
            a[j + 1] = a[j];
            j--;
        }
        a[j + 1] = key;
    }
}
Implementation of binary insertion sort:
#include <algorithm>  // for std::upper_bound

void binary_insertion_sort(int a[], int n) {
    for (int i = 1; i < n; i++) {
        int key = a[i];
        // Binary-search the sorted prefix a[0..i) for the insertion point
        int mid = std::upper_bound(a, a + i, key) - a;
        // Shift a[mid..i-1] one slot to the right, then insert
        for (int j = i - 1; j >= mid; j--) a[j + 1] = a[j];
        a[mid] = key;
    }
}
Online tutorials say that binary insertion sort is faster than direct insertion sort because binary search is faster than sequential search, but I'm not convinced.
Direct insertion sort uses sequential search, but by the time it has found the insertion position the elements have already been shifted. Binary insertion sort spends O(log i) time finding the insertion position, and at that point no element has been moved yet. Looking only at the body of the loop, binary insertion sort seems to do O(log i) extra work per iteration, so why do so many people say the former is faster than the latter?

First of all, the overall complexity does not change: the "expensive" part is moving the elements, which can be O(n) per insertion, so the whole algorithm is still O(n²). If you look at the algorithms using asymptotic measures, there's no difference between the two. If you count the actual operations, however, a difference does appear.
Obviously, in both algorithms you have the same number of move operations. What changes is the number of comparison operations:
With binary search you need only O(log i) comparisons to place the i-th element.
With direct insertion sort you compare within the while-loop condition, so you perform one comparison per move (actually one more, since the condition is also checked when the while loop exits); in the worst case that is i comparisons for the i-th element, i.e. up to n per insertion.
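To make the comparison counts concrete, here is a minimal instrumented sketch (my own illustration, not from the answer; the cmps/moves counters are hypothetical additions):

#include <algorithm>
#include <iostream>
#include <random>
#include <vector>

// Global counters, for illustration only.
static long cmps = 0, moves = 0;

void direct_insertion(std::vector<int> a) {      // by value: keep the caller's copy
    for (int i = 1; i < (int)a.size(); i++) {
        int key = a[i];
        int j = i - 1;
        while (j >= 0 && (++cmps, a[j] > key)) { // count every comparison with key
            a[j + 1] = a[j]; moves++;
            j--;
        }
        a[j + 1] = key;
    }
}

void binary_insertion(std::vector<int> a) {
    for (int i = 1; i < (int)a.size(); i++) {
        int key = a[i];
        // std::upper_bound does O(log i) comparisons on the sorted prefix a[0..i)
        int mid = (int)(std::upper_bound(a.begin(), a.begin() + i, key,
                            [](int x, int y) { ++cmps; return x < y; })
                        - a.begin());
        for (int j = i - 1; j >= mid; j--) { a[j + 1] = a[j]; moves++; }
        a[mid] = key;
    }
}

int main() {
    std::mt19937 rng(42);
    std::vector<int> v(1000);
    for (int& x : v) x = (int)(rng() % 1000);

    direct_insertion(v);
    std::cout << "direct: " << cmps << " comparisons, " << moves << " moves\n";
    cmps = moves = 0;
    binary_insertion(v);
    std::cout << "binary: " << cmps << " comparisons, " << moves << " moves\n";
}

On random input both variants report the same number of moves, while the binary variant does on the order of n log n comparisons against roughly n²/4 for the direct one.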

Tries for sorting strings in linear time?

I am trying to sort strings alphabetically in linear time and thought about using tries for this. My question is: what's the time complexity of running a pre-order traversal on a trie? Is it O(n)?
You have to be a little careful with the way you measure complexity in this case. A lot of times, people pretend that sorting N strings with a comparison-based sort takes O(N log N) time, but that is not really true in the worst case unless the length of the strings is bounded. It is the expected time if the strings are randomized, however, so it's not a bad approximation for many use cases.
If you want to account for possible long strings with long common prefixes, then you change the meaning of N to refer to the total size of the input, including all the strings. With this new definition, you can sort a list of strings in O(N) time.
Inserting the strings into a trie, or better, a radix tree (https://en.wikipedia.org/wiki/Radix_tree), and then doing a preorder traversal is one way, and yes, that works in O(N) time, where N is the total size of the input.
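For the trie route, a minimal sketch (my own, assuming lowercase ASCII keys only) of insertion plus a preorder walk, each linear in the total number of characters:

#include <array>
#include <memory>
#include <string>
#include <vector>

// Each string is inserted in time proportional to its length, and the
// preorder walk visits every node once, so the whole sort is O(N) where
// N is the total number of characters.
struct TrieNode {
    int ends = 0;  // number of input strings terminating at this node
    std::array<std::unique_ptr<TrieNode>, 26> child{};
};

void insert(TrieNode& root, const std::string& s) {
    TrieNode* node = &root;
    for (char c : s) {
        auto& next = node->child[c - 'a'];
        if (!next) next = std::make_unique<TrieNode>();
        node = next.get();
    }
    node->ends++;
}

// Preorder traversal: emit the current prefix, then recurse a..z.
void collect(const TrieNode& node, std::string& prefix,
             std::vector<std::string>& out) {
    for (int k = 0; k < node.ends; k++) out.push_back(prefix);
    for (int c = 0; c < 26; c++)
        if (node.child[c]) {
            prefix.push_back((char)('a' + c));
            collect(*node.child[c], prefix, out);
            prefix.pop_back();
        }
}

One caveat: the O(N) bound hides the alphabet-size factor in the traversal (26 child checks per node here).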
But it's faster and easier to do a radix sort: https://en.wikipedia.org/wiki/Radix_sort The Most-Significant-Digit-First variant works best with variable-length inputs.
Radix sort can be applied here to sort the strings in O(n); see the following C++ implementation:
#include <iostream>
#include <string>
using namespace std;

class RadixSort {
public:
    static char charAt(const string& s, int n) {
        return s[n];
    }

    static void countingSort(string arr[], int n, int index, char lower, char upper) {
        const int countLen = (upper - lower) + 2;  // bucket 0 = "string too short"
        // Note: variable-length arrays are a GCC/Clang extension.
        int countArray[countLen];
        string tempArray[n];
        for (int i = 0; i < countLen; i++)
            countArray[i] = 0;
        // Increase count for the char at index
        for (int i = 0; i < n; i++) {
            int charIndex = ((int)arr[i].length() - 1 < index) ? 0 : (charAt(arr[i], index) - lower + 1);
            countArray[charIndex]++;
        }
        // Prefix-sum countArray; it will then hold the last index for each char
        for (int i = 1; i < countLen; i++) {
            countArray[i] += countArray[i - 1];
        }
        // Stable placement, scanning right to left
        for (int i = n - 1; i >= 0; i--) {
            int charIndex = ((int)arr[i].length() - 1 < index) ? 0 : (charAt(arr[i], index) - lower + 1);
            tempArray[countArray[charIndex] - 1] = arr[i];
            countArray[charIndex]--;
        }
        for (int i = 0; i < n; i++) {
            arr[i] = tempArray[i];
        }
    }

    static void radixSort(string arr[], int n, char lower, char upper) {
        int maxIndex = 0;
        for (int i = 0; i < n; i++) {
            if ((int)arr[i].length() - 1 > maxIndex) {
                maxIndex = (int)arr[i].length() - 1;
            }
        }
        // One stable counting sort per character position, least significant first
        for (int i = maxIndex; i >= 0; i--) {
            countingSort(arr, n, i, lower, upper);
        }
    }
};

int main() {
    string arr[] = {"a", "aa", "aaa", "kinga", "bishoy", "computer", "az"};
    int n = sizeof(arr) / sizeof(arr[0]);
    RadixSort::radixSort(arr, n, 'a', 'z');
    for (int i = 0; i < n; i++) {
        cout << arr[i] << " ";
    }
    return 0;
}
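Assuming a compiler that accepts the variable-length arrays used above (a GCC/Clang extension), this should print the strings in sorted order: a aa aaa az bishoy computer kinga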
No, it is not O(n); it is Ω(k·log(k)·n).
Without any other restriction (and that is the case, as I understand your question), this is just a comparison-based sorting problem.
Sorting an array of length k is in Ω(k·log(k)), and doing it n times, with no connection between the runs, leads to Ω(k·log(k)·n).
You can read more here:
https://www.geeksforgeeks.org/lower-bound-on-comparison-based-sorting-algorithms/
If you treat k as bounded, because no English word is longer than 10^1000000 characters (which is probably more than the number of atoms on Earth), then sorting an array of bounded length is O(1), and doing it n times leads to O(n).
You get a lot from dealing with infinity, but sometimes you have to pay it back...
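For reference, a standard back-of-the-envelope derivation of the Ω(k·log(k)) comparison lower bound this answer leans on (a sketch, not taken from the linked article): a comparison sort must distinguish all k! orderings and each comparison has two outcomes, so its decision tree needs depth at least

$$\log_2(k!) \;\ge\; \log_2\!\left(\left(\tfrac{k}{2}\right)^{k/2}\right) \;=\; \tfrac{k}{2}\log_2\tfrac{k}{2} \;=\; \Omega(k \log k).$$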

Is there a sorting algorithm with a worst case time complexity of n^3?

I'm familiar with other sorting algorithms and the worst I've heard of in polynomial time is insertion sort or bubble sort. Excluding the truly terrible bogosort and those like it, are there any sorting algorithms with a worse polynomial time complexity than n^2?
Here's one, implemented in C#:
public void BadSort<T>(T[] arr) where T : IComparable
{
    for (int i = 0; i < arr.Length; i++)
    {
        var shortest = i;
        for (int j = i; j < arr.Length; j++)
        {
            bool isShortest = true;
            for (int k = j + 1; k < arr.Length; k++)
            {
                if (arr[j].CompareTo(arr[k]) > 0)
                {
                    isShortest = false;
                    break;
                }
            }
            if (isShortest)
            {
                shortest = j;
                break;
            }
        }
        var tmp = arr[i];
        arr[i] = arr[shortest];
        arr[shortest] = tmp;
    }
}
It's basically a really naive sorting algorithm, coupled with a needlessly-complex method of calculating the index with the minimum value.
The gist is this: for each index, find the first element from that point forward which, when compared with all the other elements after it, turns out to be <= all of them; then swap this "shortest" element with the element at the current index.
The innermost loop (with the comparison) will be executed O(n^3) times in the worst case (descending-sorted input), and every iteration of the outer loop will put one more element into the correct place, getting you just a bit closer to being fully sorted.
If you work hard enough, you could probably find a sorting algorithm with just about any complexity you want. But, as the commenters pointed out, there's really no reason to seek out an algorithm with a worst-case like this. You'll hopefully never run into one in the wild. You really have to try to come up with one this bad.
Here's an example of an elegant algorithm called slowsort, which runs in Ω(n^(log(n)/(2+ɛ))) for any positive ɛ:
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.116.9158&rep=rep1&type=pdf (section 5).
Slow Sort
SlowSort sorts the vector in place.
It is a sorting algorithm of a humorous nature and not useful in practice.
It's based on the principle of multiply and surrender, a tongue-in-cheek parody of divide and conquer.
It was published in 1986 by Andrei Broder and Jorge Stolfi in their paper Pessimal Algorithms and Simplexity Analysis.
The algorithm multiplies a single problem into multiple subproblems.
It is interesting because it is provably the asymptotically least efficient sorting algorithm that can be built under the restriction that such an algorithm, while being slow, must still be making progress toward a result at all times.
void SlowSort(vector<int> &a, int i, int j)
{
    if (i >= j)
        return;
    int m = i + (j - i) / 2;
    // Recursively "sort" both halves, ...
    SlowSort(a, i, m);
    SlowSort(a, m + 1, j);
    // ...move the larger of the two maxima to position j, ...
    if (a[j] < a[m])
    {
        int temp = a[j];
        a[j] = a[m];
        a[m] = temp;
    }
    // ...then recurse on everything except the element now in place.
    SlowSort(a, i, j - 1);
}

Time Complexity of a Recursive Algorithm with a Condition not Based on Size

I have a question about the complexity of a recursive function
The code (in C#) is like this:
public void sort(int[] a, int n)
{
    bool done = true;
    int j = 0;
    // Forward pass: bubble larger elements to the right
    while (j <= n - 2)
    {
        if (a[j] > a[j + 1])
        {
            // swap a[j] and a[j + 1]
            int temp = a[j]; a[j] = a[j + 1]; a[j + 1] = temp;
            done = false;
        }
        j++;
    }
    // Backward pass: bubble smaller elements to the left
    j = n - 1;
    while (j >= 1)
    {
        if (a[j] < a[j - 1])
        {
            // swap a[j] and a[j - 1]
            int temp = a[j]; a[j] = a[j - 1]; a[j - 1] = temp;
            done = false;
        }
        j--;
    }
    // Recurse until a full double pass makes no swaps
    if (!done)
        sort(a, n);
}
Now, the difficulty I have is the recursive part of the function.
In all of the recursions I have seen so far, we can determine the number of recursive calls from the input size, because each call is made on a smaller input.
But for this problem, the recursive part doesn't depend on the input size; it depends on whether the elements are sorted or not. For instance, if the array is already sorted, the function will run in O(n) because of the two loops and no recursive call (I hope I'm right about this part).
How can we determine the time complexity of the recursive part?
O(f(n)) is an upper bound: the algorithm never takes more than a constant times f(n) steps, regardless of which input of size n it gets. So you should look for the worst case over inputs of size n.
This looks like a (weirdly complicated) bubble sort, which is O(n^2). In the worst case, every call of the sort function takes O(n) and moves the highest remaining number to the end of the array; with n items that gives O(n)·O(n) => O(n^2).
This is bubble sort. It's O(n^2). Since the algorithm swaps only adjacent elements, the running time is proportional to the number of inversions in the list, which is O(n^2) in the worst case. The number of recursive calls will be O(n). The backward pass just means it recurses about half as many times, but that doesn't affect the complexity; it's still doing the same amount of work.
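To see why adjacent swaps tie the running time to inversions (a sketch of my own, not from the answers above): every adjacent swap of an out-of-order pair removes exactly one inversion, so the total number of swaps across all passes and recursive calls equals the initial inversion count, which is at most n(n-1)/2:

#include <cstddef>
#include <vector>

// Naive O(n^2) inversion count: pairs (i, j) with i < j and a[i] > a[j].
// A bubble-sort-style adjacent swap fixes exactly one such pair, so the
// number of swaps the algorithm above performs equals this count.
std::size_t countInversions(const std::vector<int>& a) {
    std::size_t inv = 0;
    for (std::size_t i = 0; i + 1 < a.size(); i++)
        for (std::size_t j = i + 1; j < a.size(); j++)
            if (a[i] > a[j]) inv++;
    return inv;
}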

Why should Insertion Sort be used after threshold crossover in Merge Sort

I have read everywhere that for divide and conquer sorting algorithms like Merge-Sort and Quicksort, instead of recursing until only a single element is left, it is better to shift to Insertion-Sort when a certain threshold, say 30 elements, is reached. That is fine, but why only Insertion-Sort? Why not Bubble-Sort or Selection-Sort, both of which have similar O(N^2) performance? Insertion-Sort should come in handy only when many elements are already sorted (although that advantage applies to Bubble-Sort too), so otherwise why should it be more efficient than the other two?
And secondly, at this link, in the 2nd answer and its accompanying comments, it says that O(N log N) performs poorly compared to O(N^2) up to a certain N. How come? N^2 should always perform worse than N log N, since N > log N for all N >= 2, right?
If you bail out of each branch of your divide-and-conquer Quicksort when it hits the threshold, your data looks like this:
[the least 30-ish elements, not in order] [the next 30-ish ] ... [last 30-ish]
Insertion sort has the rather pleasing property that you can call it just once on that whole array, and it performs essentially the same as it does if you call it once for each block of 30. So instead of calling it in your loop, you have the option to call it last. This might not be faster, especially since it pulls the whole data through cache an extra time, but depending how the code is structured it might be convenient.
Neither bubble sort nor selection sort has this property, so I think the answer might quite simply be "convenience". If someone suspects selection sort might be better, the burden of proof lies on them to "prove" that it's faster.
Note that this use of insertion sort also has a drawback: if there's a bug in your partition code that partitions elements incorrectly without losing any of them, you'll never notice.
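A minimal sketch of the pattern described above (my own illustration, not from any of the cited sources; CUTOFF = 30 is an arbitrary threshold): recurse only while a block is larger than the threshold, then run one insertion sort over the whole array at the end:

#include <algorithm>
#include <vector>

const int CUTOFF = 30;  // arbitrary threshold, for illustration

// Hoare-style partition around a middle pivot; returns the split point.
int partition(std::vector<int>& a, int lo, int hi) {
    int pivot = a[lo + (hi - lo) / 2];
    int i = lo - 1, j = hi + 1;
    while (true) {
        do { i++; } while (a[i] < pivot);
        do { j--; } while (a[j] > pivot);
        if (i >= j) return j;
        std::swap(a[i], a[j]);
    }
}

// Leave blocks of <= CUTOFF elements unsorted; each block is already in
// the right place relative to the other blocks.
void quicksortCoarse(std::vector<int>& a, int lo, int hi) {
    if (hi - lo + 1 <= CUTOFF) return;
    int p = partition(a, lo, hi);
    quicksortCoarse(a, lo, p);
    quicksortCoarse(a, p + 1, hi);
}

void insertionSort(std::vector<int>& a) {
    for (int i = 1; i < (int)a.size(); i++) {
        int key = a[i], j = i - 1;
        while (j >= 0 && a[j] > key) { a[j + 1] = a[j]; j--; }
        a[j + 1] = key;
    }
}

void sortWithDelayedInsertion(std::vector<int>& a) {
    if (a.empty()) return;
    quicksortCoarse(a, 0, (int)a.size() - 1);
    insertionSort(a);  // one final pass over the whole array
}

Because every element ends up within a threshold-sized block of its final position, the single insertion-sort pass does O(n · CUTOFF) work.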
Edit: apparently this modification is by Sedgewick, who wrote his PhD on QuickSort in 1975. It was analyzed more recently by Musser (the inventor of Introsort). Reference https://en.wikipedia.org/wiki/Introsort
Musser also considered the effect on caches of Sedgewick's delayed small sorting, where small ranges are sorted at the end in a single pass of insertion sort. He reported that it could double the number of cache misses, but that its performance with double-ended queues was significantly better and should be retained for template libraries, in part because the gain in other cases from doing the sorts immediately was not great.
In any case, I don't think the general advice is "whatever you do, don't use selection sort". The advice is, "insertion sort beats Quicksort for inputs up to a surprisingly non-tiny size", and this is pretty easy to prove to yourself when you're implementing a Quicksort. If you come up with another sort that demonstrably beats insertion sort on the same small arrays, none of those academic sources is telling you not to use it. I suppose the surprise is that the advice is consistently towards insertion sort, rather than each source choosing its own favorite (introductory teachers have a frankly astonishing fondness for bubble sort; I wouldn't mind if I never heard of it again). Insertion sort is generally thought of as "the right answer" for small data. The issue isn't whether it "should be" fast, it's whether it actually is or not, and I've never particularly noticed any benchmarks dispelling this idea.
One place to look for such data would be in the development and adoption of Timsort. I'm pretty sure Tim Peters chose insertion for a reason: he wasn't offering general advice, he was optimizing a library for real use.
Insertion sort is faster in practice than bubble sort, at least. Their asymptotic running time is the same, but insertion sort has better constants (fewer/cheaper operations per iteration). Most notably, it requires only a linear number of swaps of pairs of elements, and in each inner loop it performs comparisons between each of n/2 elements and a "fixed" element that can be stored in a register (while bubble sort has to read values from memory). I.e. insertion sort does less work in its inner loop than bubble sort.
The answer claims that 10000 n lg n > 10 n² for "reasonable" n. Dividing both sides by 10n, that holds while 1000 lg n > n, which is true up to about 14000 elements.
I am surprised no-one's mentioned the simple fact that insertion sort is simply much faster for "almost" sorted data. That's the reason it's used.
The easier one first: why insertion sort over selection sort? Because insertion sort is in O(n) for optimal input sequences, i.e. when the sequence is already sorted. Selection sort is always in Θ(n^2).
Why insertion sort over bubble sort? Both need only a single pass on an already-sorted input sequence, but insertion sort degrades more gracefully. To be more specific, insertion sort usually performs better with a small number of inversions than bubble sort does (source). This can be explained by the fact that bubble sort always iterates over N-i elements in pass i, while insertion sort works more like a "find" and only needs to iterate over (N-i)/2 elements on average (in pass N-i-1) to locate the insertion position. So insertion sort is expected to be about two times faster than bubble sort on average.
Here is an empirical demonstration that insertion sort is faster than bubble sort (for 30 elements, on my machine, with the attached implementation, using Java...).
I ran the attached code and found that bubble sort ran in 6338.515 ns on average, while insertion sort took 3601.0 ns.
I used the Wilcoxon signed-rank test to check the probability that this is a mistake and they should actually be the same; the result is below the range of the numerical error (effectively P_VALUE ~= 0).
import java.util.Arrays;
import java.util.Random;

private static void swap(int[] arr, int i, int j) {
    int temp = arr[i];
    arr[i] = arr[j];
    arr[j] = temp;
}

public static void insertionSort(int[] arr) {
    for (int i = 1; i < arr.length; i++) {
        int j = i;
        while (j > 0 && arr[j - 1] > arr[j]) {
            swap(arr, j, j - 1);
            j--;
        }
    }
}

public static void bubbleSort(int[] arr) {
    for (int i = 0; i < arr.length; i++) {
        boolean swapped = false;
        for (int j = 0; j < arr.length - i; j++) {
            if (j + 1 < arr.length && arr[j] > arr[j + 1]) {
                swapped = true;
                swap(arr, j, j + 1);
            }
        }
        if (!swapped) break;
    }
}

public static void main(String... args) throws Exception {
    Random r = new Random(1);
    int SIZE = 30;
    int N = 1000;
    int[] arr = new int[SIZE];
    int[] millisBubble = new int[N];
    int[] millisInsertion = new int[N];
    System.out.println("start");
    // warm up:
    for (int t = 0; t < 100; t++) {
        insertionSort(arr);
    }
    for (int t = 0; t < N; t++) {
        arr = generateRandom(r, SIZE);
        int[] tempArr = Arrays.copyOf(arr, arr.length);
        long start = System.nanoTime();
        insertionSort(tempArr);
        millisInsertion[t] = (int) (System.nanoTime() - start);
        tempArr = Arrays.copyOf(arr, arr.length);
        start = System.nanoTime();
        bubbleSort(tempArr);
        millisBubble[t] = (int) (System.nanoTime() - start);
    }
    int sum1 = 0;
    for (int x : millisBubble) {
        System.out.println(x);
        sum1 += x;
    }
    System.out.println("end of bubble. AVG = " + ((double) sum1) / millisBubble.length);
    int sum2 = 0;
    for (int x : millisInsertion) {
        System.out.println(x);
        sum2 += x;
    }
    System.out.println("end of insertion. AVG = " + ((double) sum2) / millisInsertion.length);
    System.out.println("bubble took " + ((double) sum1) / millisBubble.length
            + " while insertion took " + ((double) sum2) / millisInsertion.length);
}

private static int[] generateRandom(Random r, int size) {
    int[] arr = new int[size];
    for (int i = 0; i < size; i++)
        arr[i] = r.nextInt(size);
    return arr;
}
EDIT:
(1) Optimizing the bubble sort (updated above) reduced its average to 6043.806 ns; not enough to make a significant change. The Wilcoxon test is still conclusive: insertion sort is faster.
(2) I also added a selection sort test (code attached) and compared it against insertion sort. The results: selection took 4748.35 ns while insertion took 3540.114 ns.
The P_VALUE for the Wilcoxon test is still below the range of the numerical error (effectively ~= 0).
Code for the selection sort used:
public static void selectionSort(int[] arr) {
    for (int i = 0; i < arr.length; i++) {
        int min = arr[i];
        int minElm = i;
        for (int j = i + 1; j < arr.length; j++) {
            if (arr[j] < min) {
                min = arr[j];
                minElm = j;
            }
        }
        swap(arr, i, minElm);
    }
}
EDIT: As IVlad points out in a comment, selection sort does only n swaps (and therefore only 3n writes) for any dataset, so insertion sort is very unlikely to beat it on account of doing fewer swaps; but it will likely do substantially fewer comparisons. The reasoning below better fits a comparison with bubble sort, which will do a similar number of comparisons but many more swaps (and thus many more writes) on average.
One reason why insertion sort tends to be faster than the other O(n^2) algorithms like bubble sort and selection sort is because in the latter algorithms, every single data movement requires a swap, which can be up to 3 times as many memory copies as are necessary if the other end of the swap needs to be swapped again later.
With insertion sort OTOH, if the next element to be inserted isn't already the largest so far, it can be saved into a temporary location, and the larger elements to its left shunted one position to the right, starting from the right and using single data copies (i.e. without swaps). This opens up a gap in which to put the saved element.
C code for insertion-sorting integers without using swaps:
void insertion_sort(int *v, int n) {
    int i = 1;
    while (i < n) {
        int temp = v[i]; // Save the current element here
        int j = i;
        // Shunt everything forwards
        while (j > 0 && v[j - 1] > temp) {
            v[j] = v[j - 1]; // Look ma, no swaps! :)
            --j;
        }
        v[j] = temp;
        ++i;
    }
}

Amortized Time Cost using Accounting Method

I have written an algorithm to calculate the next lexicographic permutation of an array of integers (e.g. 123, 132, 213, 231, 312, 321). I don't think the code is necessary, but I included it below.
I think I have correctly determined the worst-case time cost as O(n), where n is the number of elements in the array. I understand, however, that with amortized analysis the cost can be shown to be O(1) per call on average.
Question:
I would like to learn the accounting method to show this is O(1), but I am having difficulty understanding how to assign a cost to each operation. Accounting method link: Accounting_Method_Explained
Thoughts:
I've thought about assigning a cost to changing a value at a position, or assigning the cost to a swap, but it really doesn't make much sense to me.
public static int[] getNext(int[] array) {
    int temp;
    int j = array.length - 1;
    int k = array.length - 1;
    // Find largest index j with a[j] < a[j+1], i.e. the rightmost
    // adjacent pair of values not in descending order
    do {
        j--;
        if (j < 0) {
            // Edge case: this is the largest permutation, return reverse order
            for (int x = 0, y = array.length - 1; x < y; x++, y--) {
                temp = array[x];
                array[x] = array[y];
                array[y] = temp;
            }
            return array;
        }
    } while (array[j] > array[j + 1]);
    // Find index k such that a[k] is the smallest integer
    // greater than a[j] to the right of a[j]
    for (; array[j] > array[k]; k--);
    // Swap the two elements found at j and k
    temp = array[k];
    array[k] = array[j];
    array[j] = temp;
    // Reverse the (descending) elements to the right of j so they are
    // ascending; this yields the next smallest order after swapping j and k
    int r = array.length - 1;
    int s = j + 1;
    while (r > s) {
        temp = array[s];
        array[s++] = array[r];
        array[r--] = temp;
    }
    return array;
} // end getNext
Measure running time in swaps, since the other work per iteration is worst-case O(#swaps).
The swap of array[j] and array[k] has virtual cost 2. The other swaps have virtual cost 0. Since at most one swap per iteration is costly, the running time per iteration is amortized constant (assuming that we don't go into debt).
To show that we don't go into debt, it suffices to show that, if the swap of array[j] and array[k] leaves a credit at position j, then every other swap involves a position with a credit available, which is consumed. Case analysis and induction reveal that, between iterations, if an item is larger than the one immediately following it, then it was put in its current position by a swap that left an as-yet unconsumed credit.
This problem is not a great candidate for the accounting method, given the comparatively simple potential function that can be used: number of indexes j such that array[j] > array[j + 1].
From the aggregate analysis we see T(n) < n! · e < n! · 3, so we can pay $3 for each operation, and that is enough to cover the total of n! operations; the credit collected is therefore an upper bound on the actual cost. So the total amortized cost per operation is O(1).
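A hedged sketch of where the T(n) < n! · e bound comes from (my derivation, not part of the answer above): each call's cost is dominated by the length s of the descending suffix it scans and reverses, and among all n! permutations exactly n!/s! have their last s elements in descending order. Summing the scan work over the whole enumeration gives

$$T(n) \;\le\; c \sum_{s=1}^{n} \#\{\text{permutations whose last } s \text{ elements descend}\} \;=\; c \sum_{s=1}^{n} \frac{n!}{s!} \;<\; c\, n! \sum_{s=0}^{\infty} \frac{1}{s!} \;=\; c\, e \cdot n!.$$

Since e < 3, charging $3 per call covers the whole enumeration, which matches the $3 credit above.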
