Help verifying Big O - sorting

Hey, so I am trying to verify the running times of some sorting algorithms:
Insertion Sort
Mergesort
Quicksort using “median of three” partitioning and a cutoff of 10 (using Insertion Sort for the small array portions)
Analysis:
The worst case running time for Insertion Sort was discussed to be O(n²), with actual running time O(n) for sorted input (as long as the inside loop is properly coded). Mergesort was discussed to be O(n log n) for all input. Quicksort using the “median of three” partitioning and a reasonable cutoff was discussed to be O(n log n) for all input, but faster in practice than Mergesort. Do your timings validate these computations?
Note that N = MAX/2 and 2N = MAX.
After running the programs, I found that
The diff for insertion sorted with sorted max/2 is 0.000307083129882812
The diff for insertion sorted with sorted max is 0.000623941421508789
The diff for insertion reverse with sorted max/2 is 0.000306129455566406
The diff for insertion reverse with sorted max is 0.000745058059692383
The diff for insertion random with sorted max/2 is 2.39158606529236
The diff for insertion random with sorted max is 9.72073698043823
The diff for merge sort with sorted max/2 is 0.00736188888549805
The diff for merge sort with sorted max is 0.0154471397399902
The diff for merge reverse with sorted max/2 is 0.00730609893798828
The diff for merge reverse with sorted max is 0.0154309272766113
The diff for merge random with sorted max/2 is 0.0109999179840088
The diff for merge random with sorted max is 0.0232758522033691
The diff for quick sorted with sorted max/2 is 3.10367894172668
The diff for quick sorted with sorted max is 12.5512340068817
The diff for quick reverse with sorted max/2 is 3.09689497947693
The diff for quick reverse with sorted max is 12.5547797679901
The diff for quick random with sorted max/2 is 0.0112619400024414
The diff for quick random with sorted max is 0.0221798419952393
I know that the insertion sort is working correctly, since for the random input I computed 9.72073698043823 / 2.39158606529236 ≈ 4 = 2², which is what O(n²) predicts when the input size doubles.
But I don't know how to verify whether the other ones are O(n log n) or not. Please help.

Let me remind you that f(n) = O(n) means that, as n grows very big, f(n)/n tends towards a constant.
What is not said:
that constant can be very big
that it means anything for small values: e.g. 10^9/n + n is O(n), but for n = 1 it's 10^9 + 1 ;-)
O(something) is not the argument killer; the topology of your data may sometimes affect the algorithm (e.g. on "almost" sorted data, bubble sort performs well)
If you want to draw conclusions, run tests with big samples relevant to your application, and don't draw conclusions too early (note that modern CPUs might fool you with caching, pipelining and multiple cores, if you use them, which you can for sorting)
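To answer the "how do I verify the O(n log n) ones" part concretely: for an O(n log n) algorithm, doubling the input from N to 2N should multiply the time by roughly 2·log(2N)/log(N), which is only slightly more than 2 for large N, whereas an O(n²) algorithm should show a factor of about 4. A minimal Python sketch of that check, using the random-input timings posted above (the value of N here is an assumption; substitute the actual MAX/2 used in the tests):

    import math

    # Measured times copied from the post above: (time for N elements, time for 2N elements).
    timings = {
        "insertion random": (2.39158606529236, 9.72073698043823),
        "merge random": (0.0109999179840088, 0.0232758522033691),
        "quick random": (0.0112619400024414, 0.0221798419952393),
    }

    N = 100_000  # assumption: replace with the MAX/2 actually used in the experiments

    expected_nlogn = (2 * N * math.log(2 * N)) / (N * math.log(N))  # about 2.1 for N = 100000
    expected_n2 = 4.0                                               # (2N)^2 / N^2

    for name, (t_n, t_2n) in timings.items():
        print(f"{name}: measured {t_2n / t_n:.2f}, "
              f"n log n predicts {expected_nlogn:.2f}, n^2 predicts {expected_n2:.2f}")

If the measured ratio sits near the n log n prediction rather than near 4, the timings are consistent with O(n log n); remember the earlier caveat that you need big samples for this to be meaningful.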

Related

Running time of merge sort, all elements are identical

Given n numbers that are all identical, what would be the running time of merge sort?
Will it run in linear time, O(n), or
in the usual best case, O(n log n)?
For a pure merge sort, the number of moves is always the same, O(n log n). If all elements are the same, or in order, or in reverse order, the number of compares is about half the number of compares for the worst case.
A natural merge sort that creates runs based on existing ordering of data would take O(n) time for all identical values or in order or reverse order. A variation of this is a hybrid insertion sort + merge sort called Timsort.
https://en.wikipedia.org/wiki/Timsort
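To make the "natural" idea concrete, here is a rough Python sketch (illustrative names, not any particular library's code) of the run-detection pass followed by repeated merging of runs. It only detects non-decreasing runs, so a sorted or all-identical array becomes a single run and the sort finishes in O(n); handling reversed input in O(n), as Timsort does, would additionally require recognizing and reversing descending runs.

    def find_runs(a):
        # Split a into maximal non-decreasing runs (the first pass of a natural merge sort).
        runs, start = [], 0
        for i in range(1, len(a)):
            if a[i] < a[i - 1]:          # a run ends where the order breaks
                runs.append(a[start:i])
                start = i
        runs.append(a[start:])
        return runs

    def merge(left, right):
        # Standard linear-time merge of two sorted lists.
        out, i, j = [], 0, 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                out.append(left[i]); i += 1
            else:
                out.append(right[j]); j += 1
        out.extend(left[i:]); out.extend(right[j:])
        return out

    def natural_merge_sort(a):
        runs = find_runs(a)
        while len(runs) > 1:             # repeatedly merge adjacent runs
            runs = [merge(runs[i], runs[i + 1]) if i + 1 < len(runs) else runs[i]
                    for i in range(0, len(runs), 2)]
        return runs[0] if runs else []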
You need to recheck the recursive formula that you have for the merge sort:
T(n) = 2T(n/2) + Θ(n)
Now, when all values are identical, let's see what changes in this formulation. The Θ(n) term is for merging two subarrays. Since merging two subarrays sweeps through both of them regardless of whether their members are identical, that term stays the same in your case.
Therefore, the recurrence is unchanged for the specified case; hence the time complexity will be Θ(n log n). That can be considered one of the shortcomings of mergesort.
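You can also see this empirically by counting comparisons. A small Python sketch (a textbook top-down merge sort instrumented with a counter, not the code from any answer above):

    import random

    def merge_sort(a, counter):
        # Top-down merge sort that counts element comparisons in counter[0].
        if len(a) <= 1:
            return a
        mid = len(a) // 2
        left = merge_sort(a[:mid], counter)
        right = merge_sort(a[mid:], counter)
        out, i, j = [], 0, 0
        while i < len(left) and j < len(right):
            counter[0] += 1
            if left[i] <= right[j]:
                out.append(left[i]); i += 1
            else:
                out.append(right[j]); j += 1
        out.extend(left[i:]); out.extend(right[j:])
        return out

    n = 1 << 14
    for label, data in [("identical", [7] * n), ("random", [random.random() for _ in range(n)])]:
        c = [0]
        merge_sort(data, c)
        print(label, c[0])   # both counts grow like n log n; identical input uses roughly half the compares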

Does merge sort essentially trade space for time when compared to insertion sort

I am trying to understand, kinda intuitively, how the runtime for merge sort is so much better than insertion sort.
Even though we divide and conquer with merge sort, on a single CPU, each of the nodes of the merge sort execution tree will be executed serially. Is the smaller input size on every recursive call (iteration) somehow the key thing for merge sort?
Or is it the fact that, since merge sort is not in-place and uses O(n) space, it saves on the number of shifts we have to do in insertion sort to make space for the element being inserted?
But what about the penalty of copying the elements in left and right temporary arrays in every merge step?
Yes, that smaller input size is in large part where the speed up of mergesort comes from compared with insertion sort. The fact that mergesort uses more space is more an artifact of how it works than an inherent reason for the speedup.
Here’s one way to see this. We know that insertion sort, on average, takes time Θ(n²). Now, suppose you want to insertion sort an array of n elements. Instead, you cut the array apart into two smaller arrays of size roughly n/2 and insertion sort each of those. How long will this take? Since insertion sort has quadratic runtime, the cost of insertion sorting each half will be roughly one quarter the cost of insertion sorting the whole array ((n/2)² = n²/4). Since there are two of those arrays, the total cost of sorting things this way will be roughly
2(n²/4) = n²/2,
which is half the time required to sort the original array. This gives rise to a simple sorting algorithm that’s an improvement over insertion sort:
Split the array in half.
Insertion sort each half.
Merge the two halves together.
That last step introduces linear space overhead for the merge, though you could do it with an in-place merge at a higher cost.
This algorithm, “split sort,” is about twice as fast as insertion sort. So then you might ask - why split in halves? Why not quarters? After all, the cost of sorting one quarter of the array is about
(n/4)² = n²/16,
which is sixteen times faster than sorting the original array! We could turn that into a sorting algorithm like this:
Split the array into quarters.
Insertion sort each quarter.
Merge the quarters into halves.
Merge the halves into the full array.
This will be about four times faster than insertion sort (each sort takes one sixteenth the time of the original sort, and we do four of them).
You can think of mergesort as the “limit” of this process where we never stop splitting and divide the array into the smallest units possible and then merge them all back together at the end. The speedup is based on the fact that sorting smaller arrays is inherently faster than sorting larger arrays, with the memory overhead for the merge being more of an implementation detail than an intrinsic reason for the speedup.
Another way to see that the space usage isn’t necessary for the speedup is to compare insertion sort to heapsort. Heapsort also runs in time O(n log n) but uses only O(1) auxiliary space.
Hope this helps!
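To make the “split sort” idea above concrete, here is a minimal Python sketch of the half-and-half version (split, insertion sort each half, merge); the function names are just for this illustration:

    def insertion_sort(a):
        # In-place insertion sort.
        for i in range(1, len(a)):
            key, j = a[i], i - 1
            while j >= 0 and a[j] > key:   # shift larger elements right
                a[j + 1] = a[j]
                j -= 1
            a[j + 1] = key
        return a

    def split_sort(a):
        # Split in half, insertion sort each half, merge the halves.
        mid = len(a) // 2
        left, right = insertion_sort(a[:mid]), insertion_sort(a[mid:])
        out, i, j = [], 0, 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                out.append(left[i]); i += 1
            else:
                out.append(right[j]); j += 1
        out.extend(left[i:]); out.extend(right[j:])
        return out

Pushing the splitting all the way down, so the pieces handed to insertion sort become trivially small, is exactly the limit process described above that gives mergesort.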
Even an in-place merge sort (O(1) space) is faster than insertion sort for n >= ~128 on a typical x86.
For smaller n, insertion sort is faster, due to cache and related constant factors, and because of this, most library implementations of stable sort use a hybrid of insertion sort (to create small sorted runs) and bottom-up merge sort.
An example of an in-place merge sort is block merge sort (Grailsort): O(1) space, still O(n log n) time complexity, but about 50% slower than a standard merge sort, and the code is complicated:
https://github.com/Mrrl/GrailSort/blob/master/GrailSort.h
But what about the penalty of copying the elements in left and right temporary arrays in every merge step?
A typical merge sort avoids copying data by doing a one-time allocation of a temporary array and then switching the direction of the merge based on the merge pass (for bottom-up merge sort) or the level of recursion (for top-down merge sort).
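Here is a rough Python sketch of that hybrid scheme: insertion sort creates small sorted runs, then bottom-up merge passes alternate between the array and a single preallocated buffer so nothing is copied back between passes. The run size of 32 is just an example value, not a tuned constant.

    RUN = 32  # example cutoff; real libraries tune this per platform

    def insertion_sort_range(a, lo, hi):
        # Sort a[lo:hi] in place with insertion sort.
        for i in range(lo + 1, hi):
            key, j = a[i], i - 1
            while j >= lo and a[j] > key:
                a[j + 1] = a[j]
                j -= 1
            a[j + 1] = key

    def hybrid_merge_sort(a):
        n = len(a)
        for lo in range(0, n, RUN):                 # 1) create small sorted runs
            insertion_sort_range(a, lo, min(lo + RUN, n))
        buf = a[:]                                  # 2) one-time temporary buffer
        width = RUN
        while width < n:                            # 3) bottom-up merge passes
            for lo in range(0, n, 2 * width):
                mid, hi = min(lo + width, n), min(lo + 2 * width, n)
                i, j, k = lo, mid, lo
                while i < mid and j < hi:
                    if a[i] <= a[j]:
                        buf[k] = a[i]; i += 1
                    else:
                        buf[k] = a[j]; j += 1
                    k += 1
                buf[k:hi] = a[i:mid] if i < mid else a[j:hi]
            a, buf = buf, a                         # swap roles instead of copying back
            width *= 2
        return a

    # usage: sorted_data = hybrid_merge_sort(my_list)
    # use the return value; depending on the number of passes, the original list
    # object may end up holding the scratch data rather than the sorted result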

Time Complexity Of Merge Sort In Special Condition

What would be the time complexity If I will apply merge sort on an already sorted array?
The usual merge sort still takes O(n log n) for sorted data.
But there is a natural merge sort variant that provides linear complexity for sorted arrays.
Note that natural merge sort also gives O(n log n) for arbitrary data, compared with insertion sort, which behaves well for sorted data but becomes quadratic in the worst case.
According to the Wikipedia page for merge sort, merge sort has both a best and worst-case performance of O(n log n). Given an array that is already sorted, merge sort would still need to go through the same sorting process as for any other array. As a result, even for a sorted array, the running time would still be O(n log n).
For the case of an already-sorted array, there are other algorithms which actually beat merge sort, e.g. insertion sort. For insertion sort, the performance of an already sorted array is O(n), i.e. linear.

What is the running time of two sorting algorithms if one is merge sort and one is insertion sort

If I have to sort one list and merge it with another, already-sorted one, then what will the running time be if I use merge sort, and what if I use insertion sort?
Merge sort is: n log n
Insertion sort is: n^2
But together they are?
EDIT: Oh, so what I actually meant was that I had to sort one of the lists and merge them together.
I have made the pseudocode for the insertion sort, but I don't know what the running time of the two algorithms will be.
http://gyazo.com/0010f053f0fe64a82dad1dd383740a3f
The complexity of merging two sorted lists with lengths n1 and n2 is O(n1 + n2); that should be enough to work out the big-O of the entire algorithm.
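As a sketch of the whole task under that hint (insertion sort the unsorted list, then a linear merge with the sorted one); the function name sort_then_merge is made up for this illustration:

    def sort_then_merge(unsorted, already_sorted):
        a = list(unsorted)
        # Insertion sort the first list: O(n1^2) worst case, O(n1) if nearly sorted.
        for i in range(1, len(a)):
            key, j = a[i], i - 1
            while j >= 0 and a[j] > key:
                a[j + 1] = a[j]
                j -= 1
            a[j + 1] = key
        # Linear merge with the already-sorted list: O(n1 + n2).
        b, out, i, j = already_sorted, [], 0, 0
        while i < len(a) and j < len(b):
            if a[i] <= b[j]:
                out.append(a[i]); i += 1
            else:
                out.append(b[j]); j += 1
        out.extend(a[i:]); out.extend(b[j:])
        return out

So with insertion sort for the first step the total is O(n1² + n2); swapping that step for merge sort gives O(n1 log n1 + n2).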

What is the worst-case time for insertion sort within merge sort?

Recently I stumbled upon this problem from Introduction to Algorithms, 3rd Edition.
Problem 2-1:
Although merge sort runs in O(n log n) worst-case time and insertion sort runs in O(n²), the latter runs faster for small problem sizes. Consider a modification to merge sort in which n/k sublists of length k are sorted using insertion sort and then merged using the standard merging mechanism.
(A) Show that insertion sort can sort the n/k sublists, each of length k, in O(nk) worst-case time.
The answer given is:
Ans: Insertion sort takes Θ(k²) time per k-element list in the worst case. Therefore,
sorting n/k lists of k elements each takes Θ(k² · n/k) = Θ(nk) worst-case time.
How do they get Θ(k² · n/k) from the given data? I'm not understanding this at all and would greatly appreciate an explanation.
The sublists are of length k, so insertion sort takes k² time for each sublist. Now, there are n/k sublists in total, so n/k · k² is nk. The key understanding here is that there are n/k sublists, and insertion sort takes k² time to sort each one.
Another thing to note is that knowing merge sort runs in O(n log n) is actually not important at all for this problem, because they don't ask for the time to sort the whole list, just the time to sort all of the sublists.
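A quick numeric sanity check of that count, assuming n is a multiple of k and using the exact worst-case shift count k(k-1)/2 per sublist:

    # Worst-case shifts for insertion sort on one length-k sublist: k*(k-1)/2.
    # Summed over the n/k sublists this is n*(k-1)/2, i.e. Theta(nk).
    n = 1 << 16
    for k in (4, 16, 64, 256):
        per_sublist = k * (k - 1) // 2
        total = (n // k) * per_sublist
        print(k, total, n * k)  # 'total' grows in proportion to n*k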
