What I have seen:
First, I read these two other SO posts:
Why is Insertion sort better than Quick sort for small list of elements?
Is there ever a good reason to use Insertion Sort?
But the answers there are not specific enough for me.
The answers to those two posts mainly point out that Merge Sort and Quick Sort can be slow on small inputs because of the extra overhead from the recursive function calls. But I am wondering how the specific threshold of 7 was chosen.
My Question:
I want to know why the cutoff is around 7 elements, below which a quadratic sorting algorithm like Insertion Sort is faster than an O(n log n) sorting algorithm like Quick Sort or Merge Sort.
Use insertion sort on small subarrays. Mergesort has too much overhead for tiny subarrays.
Cutoff to insertion sort for ~ 7 elements.
I got this from a Princeton lecture slide, which I think is a reputable enough source; see the 11th slide, under the "Mergesort: Practical Improvements" section.
I would really appreciate it if your answer included examples or a mathematical proof.
Big-O only notes the factor that dominates as n gets large. It ignores constant factors and lesser terms, which pretty much always exist and are more significant when n is small. As a consequence, Big-O is near useless for comparing algorithms that will only ever need to work on tiny inputs.
For example, you can have an O(n log n) function with a time graph like t = 5n log n + 2n + 3, and an O(n^2) function whose time graph is like t = 0.5n^2 + n + 2.
Compare those two graphs, and you'll find that in spite of Big-O, the O(n^2) function would be slightly faster until n reaches about 13.
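To make the practice from the slide concrete, here is a minimal sketch of a mergesort that cuts off to insertion sort below a small threshold. This is not the Princeton code; the CUTOFF value, function names, and int-only element type are my own illustrative choices.

```c
#include <string.h>

#define CUTOFF 7  /* switch to insertion sort for subarrays this small or smaller */

/* Simple insertion sort on a[lo..hi] (inclusive). */
static void insertion_sort(int a[], int lo, int hi) {
    for (int i = lo + 1; i <= hi; i++) {
        int key = a[i];
        int j = i - 1;
        while (j >= lo && a[j] > key) {
            a[j + 1] = a[j];   /* shift larger elements right */
            j--;
        }
        a[j + 1] = key;
    }
}

/* Merge the sorted halves a[lo..mid] and a[mid+1..hi] using aux as scratch space. */
static void merge(int a[], int aux[], int lo, int mid, int hi) {
    memcpy(aux + lo, a + lo, (size_t)(hi - lo + 1) * sizeof(int));
    int i = lo, j = mid + 1;
    for (int k = lo; k <= hi; k++) {
        if (i > mid)              a[k] = aux[j++];
        else if (j > hi)          a[k] = aux[i++];
        else if (aux[j] < aux[i]) a[k] = aux[j++];
        else                      a[k] = aux[i++];
    }
}

/* Mergesort with a cutoff: tiny subarrays are handled by insertion sort,
   avoiding the recursion and merging overhead that dominates at small n.
   aux must be an array at least as large as a. */
static void merge_sort(int a[], int aux[], int lo, int hi) {
    if (hi - lo + 1 <= CUTOFF) {
        insertion_sort(a, lo, hi);
        return;
    }
    int mid = lo + (hi - lo) / 2;
    merge_sort(a, aux, lo, mid);
    merge_sort(a, aux, mid + 1, hi);
    merge(a, aux, lo, mid, hi);
}
```

The exact threshold that wins depends on the machine, the element type, and the comparison cost, which is why libraries tend to pick a small constant like 7 empirically rather than deriving it analytically.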
I am now revising the sorting algorithms. Here is the question:
It is given that this sorting algorithm is written in C and treats 'a' and 'A' as equal. After running this sorting algorithm, the results are as follows:
10000 random data -> 0.016 sec
100000 random data -> 0.304 sec
10000 ordered data -> 0.006 sec
100000 ordered data -> 0.108 sec
10000 reversed data -> 0.010 sec
100000 reversed data -> 0.138 sec
Question: In point form briefly state the conclusions that you can draw from the test results above.
What I have done
I know this sorting algorithm is not stable (as stated in the question), and I can guess it is a quicksort.
I know that quicksort has worst case O(n^2) and average and best case O(n log n), but I have no idea how to argue this from the results. I can't just say "it's not stable, and quicksort does badly on reversed input, so it must be quicksort."
Is there anything specific I can tell from the results? It would be nice if there were some maths calculations or other important observations based on the results.
We can tell that this isn't a quadratic-time algorithm like selection sort or insertion sort, since raising the input size by a factor of 10 raised the runtime by a factor of 13-19. This is behavior we'd expect from an O(n*log(n)) average-case algorithm, like mergesort or a good quicksort.
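As a rough back-of-the-envelope check (my own arithmetic, not part of the original question), you can compare the predicted slowdown for a 10x increase in n under each growth rate against the observed ratios:

```c
#include <math.h>
#include <stdio.h>

int main(void) {
    double n1 = 10000.0, n2 = 100000.0;

    /* Predicted slowdown when n grows by 10x. */
    double nlogn_ratio = (n2 * log(n2)) / (n1 * log(n1)); /* ~12.5 */
    double quad_ratio  = (n2 * n2) / (n1 * n1);           /* 100   */

    printf("predicted n log n slowdown: %.1f\n", nlogn_ratio);
    printf("predicted n^2 slowdown:     %.1f\n", quad_ratio);

    /* Observed ratios from the timings in the question. */
    printf("observed (random):   %.1f\n", 0.304 / 0.016); /* ~19.0 */
    printf("observed (ordered):  %.1f\n", 0.108 / 0.006); /* ~18.0 */
    printf("observed (reversed): %.1f\n", 0.138 / 0.010); /* ~13.8 */
    return 0;
}
```

The observed factors of roughly 13-19 are far closer to the ~12.5 predicted for n log n than to the factor of 100 a quadratic algorithm would show.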
We can tell that the algorithm isn't adaptive, or at least, not very adaptive. An adaptive algorithm would have performed much better on the sorted input, and probably on the reversed input, too. In particular, raising the size of the sorted input by a factor of 10 would have raised the runtime by a factor of about 10. While the algorithm did do better on sorted or reverse-sorted input than random input, this looks more like a result of, say, more effective branch prediction in those cases.
We don't have any information that would indicate whether the sort is stable.
I don't see anything that would distinguish whether this is a quicksort, mergesort, heapsort, or other O(n*log(n)) algorithm. We can exclude certain types of pivot selection for quicksort - for example, a quicksort that always picks the first element as the pivot would run in quadratic time on sorted input - but beyond that, I can't tell.
This was an interview question and I am wondering if my analysis was correct:
A 'magic select' function basically returns the mth smallest value in an array of size n. The task was to sort the m smallest elements in ascending order using an efficient algorithm. My analysis was to first use the 'magic select' function to get the mth smallest value. I then used a partition function to create a pivot so that all smaller elements end up on the left. After that, I felt that a bucket sort should accomplish the task of sorting the left portion efficiently.
I was just wondering if this was the best way to sort the m smallest elements. I see the possibility of a quicksort being used here too. However, I thought that avoiding a comparison-based sort could lead to an O(n) solution. Could radix sort or heap sort (O(n log n)) be used for this too? If I didn't do it in the best possible way, what would be the best way to accomplish this? An array was the data structure I was allowed to use.
Many thanks!
I'm pretty sure you can't do any better than the standard algorithms for selecting the k lowest elements of an array in sorted order. The time complexity of your "magic machine" is O(n), which is the same time complexity you'd get from a standard selection algorithm like the median-of-medians algorithm or quickselect.
Consequently, your approaches seem very reasonable. I doubt you can do any better asymptotically.
Hope this helps!
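For what it's worth, here is a minimal sketch of the approach described in the question, with an ordinary quickselect-style partition standing in for the "magic select" function; the function names, the int element type, and the qsort call for the final step are my own illustrative choices.

```c
#include <stdlib.h>

/* Lomuto-style partition around the last element; returns the pivot's final index. */
static int partition(int a[], int lo, int hi) {
    int pivot = a[hi], i = lo;
    for (int j = lo; j < hi; j++) {
        if (a[j] < pivot) {
            int t = a[i]; a[i] = a[j]; a[j] = t;
            i++;
        }
    }
    int t = a[i]; a[i] = a[hi]; a[hi] = t;
    return i;
}

/* Quickselect: rearrange a[lo..hi] so a[k] holds the k-th smallest (0-based)
   and everything before it is <= a[k]. This stands in for "magic select". */
static void quickselect(int a[], int lo, int hi, int k) {
    while (lo < hi) {
        int p = partition(a, lo, hi);
        if (k < p)      hi = p - 1;
        else if (k > p) lo = p + 1;
        else            return;
    }
}

static int cmp_int(const void *x, const void *y) {
    return (*(const int *)x > *(const int *)y) - (*(const int *)x < *(const int *)y);
}

/* Sort the m smallest elements of a[0..n-1] into a[0..m-1]:
   expected O(n) for the selection step plus O(m log m) for the final sort. */
void sort_m_smallest(int a[], int n, int m) {
    if (m <= 0 || m >= n) { qsort(a, (size_t)n, sizeof(int), cmp_int); return; }
    quickselect(a, 0, n - 1, m - 1); /* a[0..m-1] now holds the m smallest */
    qsort(a, (size_t)m, sizeof(int), cmp_int);
}
```

The total expected cost is O(n + m log m). A radix or bucket sort on the front segment could replace the final comparison sort when the keys allow it, but the O(n) selection/partition step still dominates whenever m is small.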
I am a little unsure of my answer to the question below. Please help:
Suppose you are given a list of N integers. All but one of the integers are sorted in numerical order. Identify a sorting algorithm which will sort this special case in O(N) time and explain why this sorting algorithm achieves O(N) runtime in this case.
I think it is insertion sort but am not sure why that is the case.
Thanks!!
Insertion sort is adaptive and is efficient for substantially sorted data sets. It can sort almost-sorted data in O(n + d) time, where d is the number of inversions; in your case the single out-of-place element contributes at most n - 1 inversions, so the runtime is still O(n).
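As a concrete illustration (a minimal sketch, not code from the original question), here is a plain insertion sort. The outer loop always runs n - 1 times, while the inner loop runs once per inversion it removes, which is where the O(n + d) bound comes from:

```c
/* Insertion sort: O(n + d), where d is the number of inversions.
   On an array where only one element is out of place, d <= n - 1,
   so the whole sort runs in O(n) time. */
void insertion_sort(int a[], int n) {
    for (int i = 1; i < n; i++) {
        int key = a[i];
        int j = i - 1;
        while (j >= 0 && a[j] > key) { /* one iteration per inversion removed */
            a[j + 1] = a[j];
            j--;
        }
        a[j + 1] = key;
    }
}
```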
I was trying to understand the selection algorithm for finding the median. I have pasted the pseudocode below.
SELECT(A[1 .. n], k):
    if n <= 25
        use brute force
    else
        m = ceiling(n/5)
        for i = 1 to m
            B[i] = SELECT(A[5i-4 .. 5i], 3)
        mom = SELECT(B[1 .. m], floor(m/2))
        r = PARTITION(A[1 .. n], mom)
        if k < r
            return SELECT(A[1 .. r-1], k)
        else if k > r
            return SELECT(A[r+1 .. n], k - r)
        else
            return mom
I have a very trivial doubt: I was wondering what the author means by "brute force" above, for n <= 25. Does he compare each element one by one with every other element and check whether it is the kth largest, or something else?
The code must come from here.
A brute-force algorithm can be any simple and stupid algorithm. In your example, you can just sort the (at most) 25 elements and pick out the kth one. This is simple and stupid compared to the full selection algorithm, since sorting takes O(n log n) while selection takes only linear time.
A brute force algorithm is often good enough when n is small. Besides, it is easier to implement. Read more about brute force here.
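In code, the base case could look something like this. It is a hedged sketch of one reasonable reading of "brute force" (sort the handful of elements, then index the kth), not the author's actual implementation; the function name and 1-based k are my own choices to match the pseudocode.

```c
/* Brute-force base case for SELECT(A[1..n], k) when n is small (<= 25):
   sort the few elements and return the k-th smallest directly.
   The O(n^2) worst case of insertion sort is irrelevant for constant n. */
static int select_small(int a[], int n, int k) { /* k is 1-based, 1 <= k <= n */
    for (int i = 1; i < n; i++) {   /* insertion sort */
        int key = a[i];
        int j = i - 1;
        while (j >= 0 && a[j] > key) {
            a[j + 1] = a[j];
            j--;
        }
        a[j + 1] = key;
    }
    return a[k - 1];
}
```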
Common wisdom is that Quicksort is slower than insertion sort for small inputs. Therefore many implementations switch to insertion sort at some threshold.
There is a reference to this practice in the Wikipedia page on Quicksort.
Here's an example of commercial mergesort code that switches to insertion sort for small inputs. Here the threshold is 7.
The "brute force" almost certainly refers to the fact that the code here is using the same practice: insertion sort followed by picking the middle element(s) for the median.
However, I've found in practice that the common wisdom is not generally true. When I've run benchmarks, the switch has had either very little positive effect or a negative one. That was for quicksort. In the selection algorithm it's even more likely to be negative, because one side of the partition is thrown away at each step, so less time is spent on small inputs. This is verified in @Dennis's response to this SO question.