QuickSort for sorting part of Mergesort? - sorting

Ques: Mergesort divides a list of numbers into two halves and calls itself recursively on both of them. Instead can you perform quicksort on the left half and mergesort on the right half? If yes, show how it will sort the following list of numbers by showing every step. If no, explain why you cannot.
I am supposed to sort a list of numbers using mergesort, where the left half is to be sorted using quicksort?
I figured it out.
Ans: Yes, we can.
Sort the right half of the array using mergesort.
Sort the left half using quicksort.
Merge the two halves using the merge function of merge sort.

Yes, you can do this. The basic idea behind mergesort is the following:
Split the array into two (or more) pieces.
Sort each piece independently.
Apply a merge step to combine the sorted pieces into one overall sorted list.
From the perspective of correctness, it doesn't actually matter how you sort the lists generated in part (2). All that matters is that those lists get sorted. A typical implementation of mergesort does step (2) by recursively applying itself to the left and right halves, but there's no fundamental reason you have to do this. (In fact, in some optimized versions of mergesort, you specifically don't do this and instead switch to an algorithm like insertion sort when the arrays get sufficiently small).
In your case, you are correct that using quicksort on the left and mergesort on the right would still produce a sorted sequence. However, the way in which it would work would look quite different from what you're describing. What would end up happening is something like this: the first half of the array would get quicksorted (because you quicksort the left half), then you'd recursively sort the right half. The first half of that would get quicksorted, then you'd recursively sort the right half. The first half of that would get quicksorted, etc. Overall this would look something like this:
You quicksort the first half of the array, then the first half of what's left, then the first half of what's left, etc. until there are no elements left.
Then, working from left to right, you'd merge the last two elements together, then the last four, then the last eight, etc.
This would be a pretty cool-looking sort, but doing it by hand would be a total pain. You might be better off writing a program that just does this and showing all the intermediate steps. :-)
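If you do want to write that program, a minimal Python sketch could look like the following (the names hybrid_sort, quicksort and merge are mine, not from the question); print the intermediate lists inside merge to see every step:
    def quicksort(a):
        # Simple out-of-place quicksort, first element as the pivot.
        if len(a) <= 1:
            return a
        pivot, rest = a[0], a[1:]
        return (quicksort([x for x in rest if x <= pivot])
                + [pivot]
                + quicksort([x for x in rest if x > pivot]))

    def merge(left, right):
        # The usual merge step of merge sort.
        out, i, j = [], 0, 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                out.append(left[i]); i += 1
            else:
                out.append(right[j]); j += 1
        return out + left[i:] + right[j:]

    def hybrid_sort(a):
        if len(a) <= 1:
            return a
        mid = len(a) // 2
        left = quicksort(a[:mid])       # left half: quicksort
        right = hybrid_sort(a[mid:])    # right half: the same hybrid, recursively
        return merge(left, right)

    print(hybrid_sort([5, 2, 9, 1, 7, 3, 8, 6]))   # [1, 2, 3, 5, 6, 7, 8, 9]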

No, you cannot do it. At least if you still want to call it "merge sort". The most fundamental difference between merge sort and quick sort is that the first is a stable algorithm, i.e. equally ordered elements keep their relative positions unaltered after sorting. This is important in many scenarios.
If you sort one half using quicksort, the relative position of equal elements can (and very likely will) change. The resulting sequence will not preserve stability, so it can't still be considered merge sort.
By the way, the previous answer is correct regarding insertion sort used as the last step of merge sort. Most efficient merge sort implementations will use something like insertion sort when the number of elements is small. Insertion sort is also stable, which is why it can be used without breaking merge sort's stability.

Related

Quicksort to already sorted array

In this question: https://www.quora.com/What-is-randomized-quicksort
Alejo Hausner said in "Cost of quicksort, in the worst case" that
Ironically, if you apply quicksort to an array that is already sorted, you will probably get this costly behavior
I cannot understand it. Can someone explain it to me?
https://www.quora.com/What-will-be-the-complexity-of-quick-sort-if-array-is-already-sorted may be an answer to this, but it did not give me a complete answer.
The Quicksort algorithm is this:
select a pivot
move elements smaller than the pivot to the beginning, and elements larger than the pivot to the end
now the array looks like [<=p, <=p, <=p, p, >p, >p, >p]
recursively sort the first and second "halves" of the array
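As a rough Python sketch of those steps (an illustration, not anyone's reference implementation; it takes the first element as the pivot, which is one common choice):
    def quicksort(arr, lo=0, hi=None):
        # In-place quicksort using the first element of the range as the pivot.
        if hi is None:
            hi = len(arr) - 1
        if lo >= hi:
            return
        pivot = arr[lo]                    # 1. select a pivot
        i = lo + 1
        for j in range(lo + 1, hi + 1):    # 2. move smaller elements to the front
            if arr[j] < pivot:
                arr[i], arr[j] = arr[j], arr[i]
                i += 1
        arr[lo], arr[i - 1] = arr[i - 1], arr[lo]   # 3. pivot sits between the two parts
        quicksort(arr, lo, i - 2)          # 4. recursively sort the first "half"
        quicksort(arr, i, hi)              #    ... and the second "half"
On an already sorted array the test arr[j] < pivot never succeeds, so one recursive call always gets an empty range and the other gets everything except the pivot, which is exactly the n-level, n²-cost behavior discussed below.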
Quicksort will be efficient, with a running time close to n log n, if the pivot always ends up close to the middle of the array. This works perfectly if the pivot is the median value, but selecting the actual median would be costly in itself. If the pivot happens, out of bad luck, to be the smallest or largest element in the array, you'll get an array like this: [p, >p, >p, >p, >p, >p, >p]. If this happens too often, your "quicksort" effectively behaves like selection sort. In that case, since the size of the subarray to be recursively sorted only shrinks by 1 at every level, there will be n levels of recursion, each one costing n operations, so the overall complexity will be n².
Now, since we're not willing to use costly operations to find a good pivot, we might as well pick an element at random. And since we also don't really care about any kind of true randomness, we can just pick an arbitrary element from the array, for instance the first one.
If the array was shuffled uniformly at random, then picking the first element is great. You can reasonably hope it will regularly give you an "average" element. But if the array was already sorted... Then by definition the first element is the smallest. So we're in the bad case where the complexity is n^2.
A simple way to avoid "bad lists" is to pick a truly random element instead of an arbitrary one. Or, if you have reason to believe that quicksort will often be called on lists that are almost sorted, you could pick the element in position n/2 instead of the one in position 1.
There are also several research papers about different ways to select the pivot, with precise calculations on the impact on complexity. For instance, you could pick three random elements, rank them from smallest to largest and keep the middle one. But the conclusion usually is: if you try to write a better pivot-selection, then it will also be more costly, and the overall complexity of the algorithm won't be improved that much.
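As an illustration of the three-random-elements idea, here is a small hedged sketch (the helper name is mine): it samples three positions and returns the index whose value ranks in the middle; that index can then be swapped into the pivot position before partitioning.
    import random

    def median_of_three_index(arr, lo, hi):
        # Sample three positions in arr[lo..hi] and return the index
        # whose value ranks in the middle of the three.
        i, j, k = (random.randrange(lo, hi + 1) for _ in range(3))
        return sorted([(arr[i], i), (arr[j], j), (arr[k], k)])[1][1]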
Depending on the implementations there are several 'common' ways to choose the pivot.
In general, for an 'unsorted' source there is no good or bad way to choose it.
So some implementations just take the first element as pivot.
In the case of an already sorted source this results in the worst pivot possible, because the left interval will always be empty.
-> recursion steps = O(n) instead of the desired O(log n).
This leads to O(n²) complexity, which is very bad for sorting.
Choosing the pivot at random avoids this behavior. It is extremely unlikely that a randomly chosen pivot will have the same bad characteristics in every recursion as described above.
Also, a deliberately bad input cannot be constructed, because you cannot predict the choices of the random generator (if it is a good one).
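In code, that random choice can be as small as this sketch (the helper name is mine); it simply moves a random element into the usual pivot position so the normal "first element as pivot" partitioning can be reused unchanged:
    import random

    def randomize_pivot(arr, lo, hi):
        # Swap a uniformly random element of arr[lo..hi] into position lo.
        r = random.randint(lo, hi)
        arr[lo], arr[r] = arr[r], arr[lo]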

About bubble sort vs merge sort

This is an interview question that I recently found on the Internet:
If you are going to implement a function which takes an integer array as input and returns the maximum, would you use bubble sort or merge sort to implement this function? What if the array size is less than 1000? What if it is greater than 1000?
This is how I think about it:
First, it is really weird to use sorting to implement the above function. You can just go through the array once and find the maximum.
Second, if I have to make a choice between the two, then bubble sort is better - you don't have to run the whole bubble sort procedure but only need to do the first pass. It is better than merge sort both in time and space.
Are there any mistakes in my answer? Did I miss anything?
It's a trick question. If you just want the maximum (or indeed the kth value for any k, which includes finding the median), there's a perfectly good O(n) algorithm. Sorting is a waste of time. That's what they want to hear.
As you say, the algorithm for the maximum is really trivial. To ace a question like this, you should have the quick-select algorithm ready, and also be able to suggest a heap data structure in case you need to be able to mutate the list of values and always be able to produce the maximum rapidly.
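For completeness, a minimal sketch of the O(n) scan, plus a heap via Python's standard heapq module for the "mutable list, repeated maximum" case:
    import heapq

    def maximum(values):
        # One pass, O(n): what the interviewer is really asking for.
        best = values[0]
        for v in values[1:]:
            if v > best:
                best = v
        return best

    # If the collection changes and you repeatedly need the current maximum,
    # keep a heap instead (heapq is a min-heap, so store negated values):
    heap = [-v for v in [3, 1, 4, 1, 5]]
    heapq.heapify(heap)
    current_max = -heap[0]      # peek the maximum in O(1); updates cost O(log n)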
I just googled the algorithms. Bubble sort wins in both situations because of the large benefit of only having to run through the array once. Merge sort cannot take any shortcuts when only the largest number has to be found. Merge sort takes the length of the list, finds the middle, and then compares all the numbers below the middle on the left and all the numbers above the middle on the right, as opposed to creating unique pairs to compare. That means for every number left in the array an equal number of comparisons needs to be made. In addition, each number is compared twice, so the lowest numbers of the array will most likely get eliminated in both of their comparisons, meaning only one fewer number remains in the array after two comparisons in many situations. Bubble sort would dominate.
Firstly, I agree with everything you have said, but perhaps the question is asking about knowing the time complexities of the algorithms and how the input size is a big factor in which will be fastest.
Bubble sort is O(n²) and merge sort is O(n log n). So, on a small set it won't be that different, but on a lot of data bubble sort will be much slower.
Barring the maximum part, bubble sort is slower asymptotically, but it has a big advantage for small n in that it doesn't require the merging/creation of new arrays. In some implementations, this might make it faster in real time.
Only one pass is needed; in the worst case, to find the maximum you just have to traverse the whole array, so bubble sort would be better.
Merge sort is easy for a computer to run and it takes less time to sort than bubble sort. The best case with merge sort is n log₂ n and the worst case is also n log₂ n. With bubble sort the best case is O(n) and the worst case is O(n²).

Shell sort and insertion sort

I got a problem. I'm very confused over shell sort and insertion sort algorithms. How should we distinguish from each other?
Shell sort is a generalized version of insertion sort. The basic principle is the same for both algorithms: you have a sorted sequence of length n, you insert the unsorted element into it, and you get a sorted sequence of length n+1.
The difference is the following: insertion sort works with only one sequence (initially the first element of the array) and expands it (using the next element).
Shell sort, however, has a diminishing increment, which means that there is a gap between the compared elements (initially n/2). Hence there are n/2 sequences to be sorted using insertion sort. In each step the increment is shrunk (often just divided by 2.2) and the number of sequences is reduced. In the last step there is no gap and the algorithm degenerates to simple insertion sort.
Because of the diminishing increment, the large and small elements are moved rapidly to the correct part of the array, and then in the last step they are sorted by insertion sort really fast. This leads to a reduced time complexity, around O(n^(4/3)) for suitable gap sequences.
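A short Python sketch of shell sort along those lines (here the gap is simply halved each time; the 2.2 divisor mentioned above is just one of several possible gap-sequence choices):
    def shell_sort(arr):
        n = len(arr)
        gap = n // 2                     # initial increment
        while gap > 0:
            # Insertion sort applied to each gap-separated subsequence.
            for i in range(gap, n):
                current = arr[i]
                j = i
                while j >= gap and arr[j - gap] > current:
                    arr[j] = arr[j - gap]
                    j -= gap
                arr[j] = current
            gap //= 2                    # shrink the increment; gap == 1 is plain insertion sort
        return arr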
You can implement insertion sort as a series of comparisons and swaps of contiguous elements. That makes it a "stable sort". Shell sort, instead, compares and swaps elements which are far from each other. That makes it faster.
I suppose that your confusion comes from the fact that shell sort can be implemented as several insertion sorts applied to different subsets of the data. Note that these subsets are composed of noncontiguous elements of the data sequence.
See the Wikipedia for more details ;-)
The insertion sort is a simple, in-place, O(N^2) sort. Shell sort is a little more complex and harder to understand, and somewhere around O(N^(5/4)). Check the links out for examples -- it should be easy to see the difference.

Insertion sort better than Bubble sort?

I am doing my revision for the exam.
I would like to know under what conditions insertion sort performs better than bubble sort, given that they have the same average-case complexity of O(N²).
I did find some related articles, but I can't understand them.
Would anyone mind explaining it in a simple way?
The advantage of bubblesort is in the speed of detecting an already sorted list:
BubbleSort Best Case Scenario: O(n)
However, even in this case insertion sort gets the same or better performance.
Bubble sort is, more or less, only good for understanding and/or teaching the mechanism of sorting algorithms, but it won't find proper usage in programming these days, because its complexity
O(n²)
means that its efficiency decreases dramatically on lists of more than a small number of elements.
Following things came to my mind:
Bubble sort always takes one more pass over the array to determine if it's sorted. On the other hand, insertion sort does not need this -- once the last element is inserted, the algorithm guarantees that the array is sorted.
Bubble sort does n comparisons on every pass. Insertion sort does fewer than n comparisons: once the algorithm finds the position where to insert the current element, it stops making comparisons and takes the next element.
Finally, a quote from the Wikipedia article:
Bubble sort also interacts poorly with modern CPU hardware. It requires at least twice as many writes as insertion sort, twice as many cache misses, and asymptotically more branch mispredictions. Experiments by Astrachan sorting strings in Java show bubble sort to be roughly 5 times slower than insertion sort and 40% slower than selection sort
You can find a link to the original research paper there.
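To make the write-count point concrete, here are minimal versions of both sorts (a sketch, not tuned code); bubble sort swaps neighbours (two writes per inversion fixed), while insertion sort shifts elements (one write each) and stops comparing as soon as the insertion point is found:
    def bubble_sort(arr):
        n = len(arr)
        for end in range(n - 1, 0, -1):
            swapped = False
            for i in range(end):
                if arr[i] > arr[i + 1]:
                    arr[i], arr[i + 1] = arr[i + 1], arr[i]   # swap = two writes
                    swapped = True
            if not swapped:              # the swap-free pass is what detects "already sorted"
                break
        return arr

    def insertion_sort(arr):
        for i in range(1, len(arr)):
            current = arr[i]
            j = i
            while j > 0 and arr[j - 1] > current:
                arr[j] = arr[j - 1]      # shift = one write
                j -= 1
            arr[j] = current             # comparisons stop once the slot is found
        return arr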
I guess the answer you're looking for is here:
Bubble sort may also be efficiently used on a list that is already sorted except for a very small number of elements. For example, if only one element is not in order, bubble sort will take only 2n time. If two elements are not in order, bubble sort will take only at most 3n time...
and
Insertion sort is a simple sorting algorithm that is relatively efficient for small lists and mostly sorted lists, and often is used as part of more sophisticated algorithms
Could you provide links to the related articles you don't understand? I'm not sure what aspects they might be addressing. Other than that, there is a theoretical difference which might be that bubble sort is more suited for collections represented as arrays (than it is for those represented as linked lists), while insertion sort is suited for linked lists.
The reasoning would be that bubble sort always swaps two items at a time, which is trivial on both arrays and linked lists (more efficient on arrays), while insertion sort inserts at a place in a given list, which is trivial for linked lists but involves moving all subsequent elements in an array to the right.
That being said, take it with a grain of salt. First of all, sorting arrays is, in practice, almost always faster than sorting linked lists, simply because even scanning the list once already makes an enormous difference. Apart from that, moving n elements of an array to the right is much faster than performing n (or even n/2) swaps. This is why other answers correctly claim insertion sort to be superior in general, and why I really wonder about the articles you read, because I fail to think of a simple way of saying this one is better in cases A and that one is better in cases B.
In the worst case both tend to perform at O(n^2)
In the best case scenario, i.e., when the array is already sorted, Bubble sort can perform at O(n).

Quicksort - conditions that makes it stable

A sorting algorithm is stable if it preserves the relative order of any two elements with equals keys. Under which conditions is quicksort stable?
Quicksort is stable when no item is passed unless it has a smaller key.
What other conditions make it stable?
Well, it is quite easy to make a stable quicksort that uses O(N) space rather than the O(log N) that an in-place, unstable implementation uses. Of course, a quicksort that uses O(N) space doesn't have to be stable, but it can be made to be so.
I've read that it is possible to make an in-place quicksort that uses O(log N) memory, but it ends up being significantly slower (and the details of the implementation are kind of beastly).
Of course, you can always just go through the array being sorted and add an extra key that is its place in the original array. Then the quicksort will be stable and you just go through and remove the extra key at the end.
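A minimal Python sketch of that "extra key" trick (the function name is mine): each value is paired with its original index, so equal values compare by position and the result is stable.
    def stable_quicksort_by_index(values):
        # Decorate with the original position, quicksort the (value, index) pairs,
        # then strip the index off again.
        def qsort(pairs):
            if len(pairs) <= 1:
                return pairs
            pivot, rest = pairs[0], pairs[1:]
            smaller = [p for p in rest if p < pivot]
            larger = [p for p in rest if p > pivot]
            return qsort(smaller) + [pivot] + qsort(larger)
        decorated = [(v, i) for i, v in enumerate(values)]
        return [v for v, _ in qsort(decorated)]
For example, stable_quicksort_by_index([3, 1, 3, 2]) keeps the two 3s in their original relative order.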
The condition is easy to figure out: just keep the original order for equal elements. There is no other condition that differs essentially from this one. The point is how to achieve it.
There is a good example.
1. Use the middle element as the pivot.
2. Create two lists, one for smaller, the other for larger.
3. Iterate from the first element to the last and put elements into the two lists: append an element to the smaller list if it is smaller than the pivot, or equal to it and ahead of it; append an element to the larger list if it is larger than the pivot, or equal to it and behind it. Then recurse on the smaller list and the larger list.
https://www.geeksforgeeks.org/stable-quicksort/
The latter two points are both necessary. The middle element as the pivot is optional. If you choose the last element as the pivot, just append all equal elements to the smaller list one by one from the beginning and they will naturally keep their original order.
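A hedged Python sketch of the scheme described above (middle element as pivot; elements equal to the pivot are routed by whether they appear before or after it):
    def stable_quicksort(arr):
        if len(arr) <= 1:
            return arr
        mid = len(arr) // 2
        pivot = arr[mid]
        smaller, larger = [], []
        for i, x in enumerate(arr):
            if i == mid:
                continue                              # the pivot itself goes in the middle
            if x < pivot or (x == pivot and i < mid):
                smaller.append(x)                     # equal elements before the pivot stay left
            else:
                larger.append(x)                      # equal elements after the pivot go right
        return stable_quicksort(smaller) + [pivot] + stable_quicksort(larger)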
You can add these to a list, if that's your approach:
When the elements have absolute ordering
When the implementation takes O(N) time to note the relative orderings and restores them after the sort
When the pivot chosen is ensured to be of a unique key, or the first occurrence in the current sublist.

Resources