When I read the answer at http://clrs.skanev.com/08/03/02.html for exercise 8.3-2, I could not understand how the index is actually used to solve it. Could someone show the scheme step by step, or explain why it is Θ(n)?
Here is the question and answer:
Which of the following sorting algorithms are stable: insertion sort, merge sort, heapsort, and quicksort? Give a simple scheme that makes any sorting algorithm stable. How much additional time and space does your scheme entail?
Stable: Insertion sort, merge sort
Not stable: Heapsort, quicksort
We can make any algorithm stable by mapping the array to an array of pairs, where the first element in each pair is the original element and the second is its index. Then we sort lexicographically. This scheme takes additional Θ(n) space.
In the context of sorting, "stable" means that when a collection containing some elements with equivalent value is sorted, those elements stay in the same order with respect to each other.
So a sorting algorithm can be made stable by storing the original index of each element, and using that index as a secondary way of sorting elements with equal primary value.
To implement this, the comparison function (for example <) would be defined so that A < B returns true if A.PrimarySortValue < B.PrimarySortValue, returns (A.OriginalIndex < B.OriginalIndex) when A.PrimarySortValue == B.PrimarySortValue, and otherwise (when A.PrimarySortValue > B.PrimarySortValue) returns false.
This requires one additional OriginalIndex value to be stored per element. There are n elements hence Θ(n) extra space is required.
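As a minimal sketch of that scheme (in Python, not the linked site's code; the helper name and the use of the built-in sorted() as a stand-in sort are my own), each element is paired with its original index and the pairs are sorted lexicographically:

```python
def stable_sort_by_index(arr, sort_fn):
    # Pair each element with its original position: (value, original index).
    decorated = [(value, index) for index, value in enumerate(arr)]

    # Sort the pairs lexicographically: the value is the primary key, and the
    # original index breaks ties, so equal values keep their relative order
    # no matter how sort_fn reorders them internally.
    result = sort_fn(decorated)

    # Strip the indices again. Storing one index per element is where the
    # additional Theta(n) space comes from.
    return [value for value, _ in result]

# Any comparison sort can be plugged in; here sorted() stands in for an
# unstable sort such as heapsort or quicksort.
print(stable_sort_by_index([3, 1, 3, 2], sorted))  # [1, 2, 3, 3]
```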
Related
I've made a program that counts the cost of the mergesort algorithm for different values of n. I keep a cost variable and increment it every time a loop iteration or a condition check occurs. Once I get the sorted array, I feed that sorted array back into merge sort as input, and in a third case I reverse the sorted array so it should be the worst case, but in all three cases I get the same cost. So what are the best and worst cases for mergesort?
The cost of mergesort implemented classically, either as a top-down recursive function or as a bottom-up iterative version with a small local array of pointers, is the same: O(N log N). The number of comparisons will vary depending on the actual contents of the array, but by at most a factor of 2.
You can improve this algorithm at a linear cost by adding an initial comparison between the last element of the left slice and the first element of the right slice in the merge phase. If the comparison yields <= then you can skip the merge phase for this pair of slices.
With this modification, a fully sorted array will sort much faster, with a linear complexity, making it the best case, and a partially sorted array will behave better as well.
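A rough sketch of that check, assuming a plain top-down merge sort in Python (the names are mine, not from any particular library):

```python
def merge_sort(arr):
    if len(arr) <= 1:
        return arr
    mid = len(arr) // 2
    left = merge_sort(arr[:mid])
    right = merge_sort(arr[mid:])

    # The extra comparison: if the two sorted halves are already in order,
    # skip the merge entirely. On a fully sorted input every call takes this
    # branch, so the number of comparisons drops to roughly linear.
    if left[-1] <= right[0]:
        return left + right

    # Standard merge of the two sorted halves.
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged
```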
After dividing an array using merge sort until the subarrays have length k, I'm supposed to use insertion sort on the length-k subarrays and then continue with merging. What should be the optimal value of k?
Also, I found these questions similar to mine, but they didn't give a definite answer:
Choosing minimum length k of array for merge sort where use of insertion sort to sort the subarrays is more optimal than standard merge sort
Modification to merge sort to implement merge sort with insertion sort Java
Just measure.
The best threshold value depends on your programming language, data type, data set value distribution, computer hardware, mergesort and insertion sort implementation details, and so on.
Usually this value is in range 10-200, and the gain for the best value is not very significant.
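For concreteness, here is a rough Python sketch of the hybrid being benchmarked, with the threshold k left as a parameter to tune by measurement (the default of 32 below is just an arbitrary starting point, not a recommendation):

```python
def insertion_sort(arr, lo, hi):
    # Sort arr[lo:hi] in place with insertion sort.
    for i in range(lo + 1, hi):
        key = arr[i]
        j = i
        while j > lo and arr[j - 1] > key:
            arr[j] = arr[j - 1]
            j -= 1
        arr[j] = key


def hybrid_merge_sort(arr, lo=0, hi=None, k=32):
    if hi is None:
        hi = len(arr)
    # Small runs are handed to insertion sort instead of recursing further.
    if hi - lo <= k:
        insertion_sort(arr, lo, hi)
        return
    mid = (lo + hi) // 2
    hybrid_merge_sort(arr, lo, mid, k)
    hybrid_merge_sort(arr, mid, hi, k)
    # Merge the two sorted halves back into arr[lo:hi].
    merged = []
    i, j = lo, mid
    while i < mid and j < hi:
        if arr[i] <= arr[j]:
            merged.append(arr[i])
            i += 1
        else:
            merged.append(arr[j])
            j += 1
    merged.extend(arr[i:mid])
    merged.extend(arr[j:hi])
    arr[lo:hi] = merged
```

Timing this for a range of k values on your own data is the only reliable way to pick the threshold.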
I feel this was a more fitting answer to my question: http://atekihcan.github.io/CLRS/P02-01/. Quoting it here:
For the modified algorithm to have the same asymptotic running time as standard merge sort,
Θ(nk + n lg(n/k)) = Θ(nk + n lg n − n lg k)
must be the same as
Θ(n lg n).
To satisfy this condition, k cannot grow faster than lg n asymptotically (if k grows faster than lg n, then because of the nk term the algorithm will run in asymptotically worse time than Θ(n lg n)). But this argument alone is not enough; we still have to check whether the requirement holds for k = Θ(lg n).
If we assume k = Θ(lg n),
Θ(nk + n lg(n/k)) = Θ(nk + n lg n − n lg k) = Θ(n lg n + n lg n − n lg(lg n)) = Θ(2n lg n − n lg(lg n))† = Θ(n lg n)
† lg(lg n) is very small compared to lg n for sufficiently large values of n.
So I guess it's because it just compares A[k] and A[k-1] and does the work in one sweep, but it's still not clear to me. Can someone explain it better?
Thanks.
This link shows a graphical representation of sorting algorithms on different types of data sets.
As you can see, when the data is already sorted the algorithm's complexity is reduced to N, which is equivalent to the number of input elements.
The link provided gives a clear picture of why it is more efficient.
You answered your own question: for a nearly sorted array, insertion sort will only need a handful of O(n) passes to complete. Contrast that with a divide-and-conquer sorting algorithm like merge sort, which takes O(n lg n). For any non-trivial value of n, a divide-and-conquer algorithm will need many O(n) passes even if the array is almost completely sorted, whereas insertion sort might only require a few.
Insertion sort is a faster and more refined sorting algorithm than selection sort. In selection sort the algorithm iterates through all of the remaining data on every pass, whether it is already sorted or not. Insertion sort works differently: instead of scanning all of the data after every pass, it only traverses the data it needs to until the segment being sorted is in order. Insertion sort uses two loops and therefore two main index variables, here named 'i' and 'j'. On every pass of the outer loop, 'i' and 'j' begin at the same index; the inner loop executes only while 'j' is greater than index 0 AND arr[j] < arr[j - 1]. In other words, while 'j' has not reached the beginning of the data AND the value at 'j' is smaller than the value to its left, the two elements are swapped and 'j' is decremented. As long as these two conditions are met the inner loop keeps executing, and this is what sets insertion sort apart from selection sort: only the data that needs to be moved is touched.
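A short Python rendering of the two loops described above (the variable names i and j are kept to match the description; this is a sketch, not the original answer's code):

```python
def insertion_sort(arr):
    for i in range(1, len(arr)):
        j = i  # i and j start at the same index on each pass of the outer loop
        # Walk the new element left only as far as it needs to go: stop as
        # soon as j reaches index 0 or the element to its left is not larger.
        while j > 0 and arr[j] < arr[j - 1]:
            arr[j], arr[j - 1] = arr[j - 1], arr[j]
            j -= 1
    return arr
```

On already sorted data the inner loop never runs, so each pass costs only a single comparison.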
The general goal of a sorting algorithm is to minimize the number of comparisons. Sorting algorithms have a lower bound and an upper bound on the number of comparisons (n log n worst case for merge sort and heapsort, n log n average case for quicksort). In the most general case, you'd go with an algorithm that happens to have the best average or best worst-case number of comparisons. However, when you know something about the data (e.g., the array is already sorted, or almost sorted), you can exploit the fact that insertion sort's lower bound is far lower than that of the n log n sorts.
For example, if you have an array [1,2,3,4,5,6,7,9] and you need to insert 8 into it, you can insert it at the end and sort the array with a vanilla n log n sort, which will do roughly 28 comparisons to get to [1,2,3,4,5,6,7,8,9]. Insertion sort, however, lets you insert the 8 at the right position in only about 8 comparisons.
I have a problem: I'm very confused about the shell sort and insertion sort algorithms. How do we distinguish them from each other?
Shell sort is a generalized version of insertion sort. The basic principle is the same for both algorithms: you have a sorted sequence of length n, you insert an unsorted element into it, and you get a sorted sequence of length n+1.
The difference is this: insertion sort works with only one sequence (initially just the first element of the array) and expands it (using the next element).
Shell sort, on the other hand, uses a diminishing increment, which means that there is a gap between the compared elements (initially n/2). Hence there are n/2 sequences to be sorted using insertion sort. In each step the increment is shrunk (often just divided by 2.2) and the number of sequences is reduced. In the last step there is no gap and the algorithm degenerates to a simple insertion sort.
Because of the diminishing increment, large and small elements are moved rapidly to the correct part of the array, and in the last step insertion sort finishes the job very quickly. This leads to a reduced time complexity of roughly O(n^(4/3)), depending on the gap sequence.
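A compact Python sketch of that structure, using the common halving gap sequence rather than the 2.2 divisor mentioned above:

```python
def shell_sort(arr):
    gap = len(arr) // 2
    while gap > 0:
        # Each pass is an insertion sort over elements that are `gap` apart,
        # so there are effectively `gap` interleaved sequences being sorted.
        for i in range(gap, len(arr)):
            key = arr[i]
            j = i
            while j >= gap and arr[j - gap] > key:
                arr[j] = arr[j - gap]
                j -= gap
            arr[j] = key
        # Shrink the increment; the final pass (gap == 1) is plain insertion sort.
        gap //= 2
    return arr
```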
You can implement insertion sort as a series of comparisons and swaps of contiguous elements. That makes it a "stable sort". Shell sort, instead, compares and swaps elements which are far from each other. That makes it faster.
I suppose that your confusion comes from the fact that shell sort can be implemented as several insertion sorts applied to different subsets of the data. Note that these subsets are composed of noncontiguous elements of the data sequence.
See the Wikipedia for more details ;-)
The insertion sort is a simple, in-place, O(N^2) sort. Shell sort is a little more complex and harder to understand, and somewhere around O(N^(5/4)). Check the links out for examples -- it should be easy to see the difference.
A sorting algorithm is stable if it preserves the relative order of any two elements with equals keys. Under which conditions is quicksort stable?
Quicksort is stable when no item is passed unless it has a smaller key.
What other conditions make it stable?
Well, it is quite easy to make a stable quicksort that uses O(N) space rather than the O(log N) that an in-place, unstable implementation uses. Of course, a quicksort that uses O(N) space doesn't have to be stable, but it can be made to be so.
I've read that it is possible to make an in-place quicksort that uses O(log N) memory, but it ends up being significantly slower (and the details of the implementation are kind of beastly).
Of course, you can always just go through the array being sorted and add an extra key that is its place in the original array. Then the quicksort will be stable and you just go through and remove the extra key at the end.
The condition is easy to state: just keep the original order for equal elements. There is no other condition that differs essentially from this one. The point is how to achieve it.
Here is a good example:
1. Use the middle element as the pivot.
2. Create two lists, one for smaller, the other for larger.
3. Iterate from the first element to the last and put the elements into the two lists: append an element to the smaller list if it is smaller than the pivot, or equal to it and ahead of it; append an element to the larger list if it is larger than the pivot, or equal to it and behind it. Then recurse on the smaller list and the larger list.
https://www.geeksforgeeks.org/stable-quicksort/
The latter two points are both necessary. Using the middle element as the pivot is optional. If you choose the last element as the pivot, just append all equal elements to the smaller list, one by one from the beginning, and they will naturally keep their original order.
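A hedged Python sketch of the scheme from the linked article (middle element as pivot, two auxiliary lists, equal elements routed by their position relative to the pivot):

```python
def stable_quicksort(arr):
    if len(arr) <= 1:
        return arr
    mid = len(arr) // 2
    pivot = arr[mid]
    smaller, larger = [], []
    for i, x in enumerate(arr):
        if i == mid:
            continue  # the pivot itself is placed between the two lists
        # Elements equal to the pivot that appear before it go left, those
        # after it go right, so equal keys keep their original relative order.
        if x < pivot or (x == pivot and i < mid):
            smaller.append(x)
        else:
            larger.append(x)
    return stable_quicksort(smaller) + [pivot] + stable_quicksort(larger)
```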
You can add these to a list, if that's your approach:
When the elements have absolute ordering
When the implementation takes O(N) time to note the relative orderings and restores them after the sort
When the pivot chosen is ensured to have a unique key, or to be the first occurrence of its key in the current sublist.