In many research papers on sorting, the authors conclude that their algorithm takes n-1 comparisons, or some similar count, to sort an array of size n, but when it comes to actual code, the implementation uses more comparisons than the paper claims.

More specifically, what assumptions do the authors make when counting comparisons? What kinds of comparisons do they not take into account?

For example, if you look at freezing sort or Enhanced Insertion sort, the number of comparisons these algorithms perform in actual code is higher than what is specified in the published graph (number of comparisons vs. number of elements).
The least possible number of comparisons for a sorting algorithm is n-1. With that many comparisons you aren't really sorting at all, you're only verifying that the data is already sorted, essentially comparing each element to the ones directly before and after it (this is what insertion sort does in its best case). It's fairly easy to see that you can't do fewer comparisons than this: otherwise the elements you've compared would fall into at least two disjoint sets, and you wouldn't know how elements from different sets compare to each other.
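To make that concrete, here is a minimal Python sketch of my own (the function name and counter are not from any of the papers) that counts the element comparisons insertion sort makes; on already-sorted input it performs exactly n-1 of them:

def insertion_sort_count(a):
    # Standard insertion sort that also counts element-to-element comparisons.
    comparisons = 0
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        while j >= 0:
            comparisons += 1            # one element comparison
            if a[j] > key:
                a[j + 1] = a[j]
                j -= 1
            else:
                break
        a[j + 1] = key
    return comparisons

# Already-sorted input: each pass stops after one comparison -> n-1 total.
print(insertion_sort_count(list(range(10))))          # 9
# Reverse-sorted input: every pass runs to the front -> n(n-1)/2 total.
print(insertion_sort_count(list(range(10, 0, -1))))   # 45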
If we're talking about the average or worst case, it has actually been proven that any comparison-based sort requires Ω(n log n) comparisons.
An algorithm being recursive or iterative doesn't (directly) affect the number of comparisons. The only thing I can think of that is specific to recursive sorting algorithms is the recursion depth. This depends greatly on the algorithm, but quick-sort, specifically, has a worst-case recursion depth of about n-1.
Comparisons that are often ignored in papers, but are performed in real code, are the comparisons in branches (if (<stop clause>) return ...;) and similarly in loop conditions.

One reason they are mostly ignored is that they operate on indices, which have constant size, while the compared elements (which we do count) may take more time to compare depending on their actual type (strings might take longer to compare than integers, for example).
Also note that an array cannot be sorted using only n-1 comparisons in the worst/average case, since sorting is an Ω(n log n) problem. However, it is possible that what the author meant is that the algorithm performs n-1 comparisons at each step, and there are multiple (typically O(log n)) such steps.
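As a rough illustration of that "n-1 comparisons per step" reading, here is a Python sketch of my own (not taken from the papers in question) of a single merge pass: merging two sorted halves totalling n elements never costs more than n-1 element comparisons, and a full merge sort performs O(log n) such merge levels.

def merge_count(left, right):
    # Merge two sorted lists; return (merged, number of element comparisons).
    merged, comparisons = [], 0
    i = j = 0
    while i < len(left) and j < len(right):
        comparisons += 1
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, comparisons

# Two sorted halves of a 10-element array: at most 9 (= n-1) comparisons.
print(merge_count([1, 3, 5, 7, 9], [0, 2, 4, 6, 8]))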
Suppose that we are given an array A already sorted in increasing order. Which is asymptotically faster, insertion-sort or merge-sort?
Likewise, suppose we are given an array B sorted in decreasing order, so it needs to be reversed. Which is asymptotically faster now?
I'm having a hard time grasping this. I already know that insertion sort is better for smaller data sets and merge sort is better for larger data sets, but I'm not sure why one is faster than the other depending on whether the data set is already sorted.
Speaking about the worst case, merge sort is faster, with O(N log N) against O(N^2) for insertion sort. However, another characteristic of an algorithm is its best-case complexity, which is Omega(N) for insertion sort against Omega(N log N) for merge sort.
The latter can be explained when looking at the algorithms at hand:
Merge sort works by dividing the array in half (if possible), recursively sorting the halves and merging them. Notice how it does not depend on the actual order of the elements: we make the recursive calls regardless of whether the part we're sorting is already in order (unless it's the base case).
Insertion sort looks for the first element that is out of the desired order and shifts it to the left until it is in place. If there is no such element, no shifting occurs and the algorithm finishes, doing only O(N) comparisons.
However, merge sort is quite fixable with respect to its best-case running time! You can check whether the part at hand is already sorted before recursing into it. This does not change the worst-case complexity of O(N log N) (although the constant roughly doubles), but it brings the best-case complexity down to Omega(N).
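One way that fix could look, as a Python sketch of my own (not a reference implementation): an O(N) "already sorted?" scan before recursing, which leaves the worst case at O(N log N) but makes the best case linear.

def merge_sort_with_check(a):
    # Merge sort plus an O(n) "already sorted?" scan before recursing.
    if len(a) <= 1:
        return a
    if all(a[i] <= a[i + 1] for i in range(len(a) - 1)):
        return a                      # best case: one linear scan, no recursion
    mid = len(a) // 2
    left = merge_sort_with_check(a[:mid])
    right = merge_sort_with_check(a[mid:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    return merged + left[i:] + right[j:]

print(merge_sort_with_check([5, 2, 8, 1, 9]))   # [1, 2, 5, 8, 9]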
In the case where the data is sorted in reverse order, insertion sort's worst case shows itself: we have to move each element (in iteration order) from its position all the way to the front, doing N(N-1)/2 moves, which is O(N^2). Merge sort, however, still takes O(N log N) because of its recursive approach.
I'm learning about data structures and sorting algorithms, and I have some questions:
When should we choose an array and when a linked list for a sorting algorithm?
Which sorting algorithm should we use for small data sets and which for large ones? I know it depends on the situation and we should choose a suitable algorithm, but I can't understand the specifics.
Linked-list or array
Array is the more common choice.
Linked-list is mostly used when your data is already in a linked-list, or you need it in a linked-list for your application.
Not that I've really seen a justifiable reason to use one over the other (except that most sorting algorithms are designed around arrays). Both can be sorted in O(n log n), at least with comparison-based sorting algorithms.
When to use what
With comparison-based sorting, insertion sort is typically used for < ~10-20 elements, as it has low constant factors, even though it has O(n²) running time. For more elements, quick-sort or merge-sort (both running in O(n log n)) or some derivation of either is typically faster (although there are other O(n log n) sorting algorithms).
Insertion sort also performs well (O(n)) on nearly sorted data.
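For illustration, here is a rough Python sketch of such a hybrid (the names and the cutoff of 16 are my own placeholders; real libraries tune the threshold by measurement):

import random

CUTOFF = 16   # illustrative threshold, not a recommended value

def small_insertion_sort(a, lo, hi):
    # Sort a[lo..hi] in place; cheap for short ranges.
    for i in range(lo + 1, hi + 1):
        key = a[i]
        j = i - 1
        while j >= lo and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key

def hybrid_quicksort(a, lo=0, hi=None):
    # Quicksort that hands small partitions over to insertion sort.
    if hi is None:
        hi = len(a) - 1
    if hi - lo + 1 <= CUTOFF:
        small_insertion_sort(a, lo, hi)
        return
    pivot = a[random.randint(lo, hi)]
    i, j = lo, hi
    while i <= j:
        while a[i] < pivot:
            i += 1
        while a[j] > pivot:
            j -= 1
        if i <= j:
            a[i], a[j] = a[j], a[i]
            i += 1
            j -= 1
    hybrid_quicksort(a, lo, j)
    hybrid_quicksort(a, i, hi)

data = [random.randint(0, 999) for _ in range(100)]
hybrid_quicksort(data)
print(data == sorted(data))   # True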
For non-comparison-based sorting, it really depends on your data. Radix sort, bucket sort and counting sort are all well-known examples, and each have their respective uses. A brief look at their running time should give you a good idea of when they should be used. Counting sort, for example, is good if the range of values to be sorted is really small.
You can see Wikipedia for a list of sorting algorithms.
Keep in mind that sorting less than like 10000 elements would be blazingly fast with any of these sorting algorithms - unless you need the absolute best performance, you could really pick whichever one you want.
To my understanding, for both questions there is no definitive answer, as both depend on the context of usage. However, the following points might be of importance:
If the records to be sorted are large and implemented as a value type, an array might be unfavourable, since exchanging records involves copying data, which might be slower than redirecting references.
The instance size at which to switch sorting algorithms is usually found by experimentation in a specific context; perhaps Quicksort is used for the 'large' instances whereas Merge Sort is used for the 'small' ones, with the best cut-off between 'large' and 'small' found by trying both out in that context.
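If it helps, that kind of experiment can be sketched roughly like this in Python (sort_a and sort_b are placeholders for whichever two implementations you are comparing):

import random
import timeit

def crossover_experiment(sort_a, sort_b, sizes, repeats=50):
    # Time two sorting functions on random data of increasing size and
    # print both, to see roughly where one overtakes the other.
    for n in sizes:
        data = [random.random() for _ in range(n)]
        t_a = timeit.timeit(lambda: sort_a(list(data)), number=repeats)
        t_b = timeit.timeit(lambda: sort_b(list(data)), number=repeats)
        print(n, round(t_a, 4), round(t_b, 4))

# Example usage with two stand-in implementations of your own:
# crossover_experiment(my_insertion_sort, my_merge_sort, [8, 16, 32, 64, 128, 256])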
I am presently studying sorting algorithms. I have read that quicksort's performance depends on the initial organization of the data: if the array is already sorted, quicksort becomes slower. Is there any other sort whose performance depends on the initial organization of the data?
Of course. Insertion sort will be O(n) on input that is already sorted in descending order:

def insertion_sort(arr):
    out = []
    while arr:
        x = arr.pop(0)            # take the first remaining element
        # insert x into the already-sorted list `out`, scanning from the front
        i = 0
        while i < len(out) and out[i] < x:
            i += 1
        out.insert(i, x)
    return out

because on descending input each popped element is smaller than everything already in out, so every insert succeeds after a single comparison. If the last element is popped instead of the first (arr.pop() instead of arr.pop(0)), the same code is fastest on ascending input instead (this assumes popping and inserting at the front are O(1) themselves, as with a linked list).
All fast sort algorithms minimize comparison and move operations. Minimizing move operations is dependent on the initial element ordering (I'm assuming that by "initial organization" you mean the initial element ordering).
Additionally, the fastest real-world algorithms exploit locality of reference, which also depends on the initial ordering.
If you are only interested in dependencies that slow down or speed up the sorting dramatically: bubble sort, for example, will complete in a single pass on already-sorted data.
Finally, many sorting algorithms have an average time complexity of O(N log N) but a worst case of O(N^2). This means there exist specific inputs (e.g. sorted or reverse-sorted data) for these algorithms that provoke the bad run-time behaviour. Some quicksort versions are examples of such algorithms.
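As an illustration of such a version, here is a deliberately naive Python quicksort of my own that always picks the first element as pivot (not any specific library's implementation); already-sorted input drives it into its quadratic worst case:

import sys

def naive_quicksort(a):
    # Quicksort that always picks the first element as pivot.
    if len(a) <= 1:
        return a
    pivot = a[0]
    smaller = [x for x in a[1:] if x < pivot]
    larger = [x for x in a[1:] if x >= pivot]
    return naive_quicksort(smaller) + [pivot] + naive_quicksort(larger)

# On already-sorted input every partition is maximally lopsided (0 vs n-1
# elements), so the recursion depth is n-1 and the total work is Theta(n^2).
sys.setrecursionlimit(5000)
print(naive_quicksort(list(range(1000)))[:5])   # [0, 1, 2, 3, 4]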
If what you're asking is "should I be worried about which sorting algorithm to pick on a case-by-case basis?", then unless you're processing thousands of millions of operations, the short answer is "no". Most of the time quicksort will be just fine (quicksort with a calculated pivot, like Java's).
In general cases, quicksort is good enough.
On the other hand, if your system always receives the source data in a consistent, already-sorted form, and sorting it costs significant CPU time and power each time, then you should definitely find the right algorithm for that corner case.
Everybody knows that an optimized sort takes less time than a plain one. Knowing that, the first thought that comes to my mind is: what makes it faster?
The primary reason an optimized sort runs faster is that it reduces the number of times two elements are compared and/or moved.
If you think about a naive worst-case sort of N items, where you compare each of the N items to each of the N-1 other items, there is a total of N*(N-1) = N^2-N comparisons.
One possible way of optimizing the process is to sort a smaller list first, say one of N/2 items. That takes N/2*(N/2-1) comparisons, or N^2/4-N/2. Of course, you have to do that on both halves of the list, which doubles it to N^2/2-N comparisons, and you need roughly another N/2 comparisons (at most N-1) to merge the two sorted halves back together. So the total for sorting two half-sized lists and merging them is about N^2/2-N+N/2 = N^2/2-N/2, roughly half the comparisons needed to sort the whole list naively. Recursively subdividing the lists down to pieces of 2 items, which can be sorted with a single comparison, provides significant savings in the number of comparisons.
There are other divide-and-conquer ideas, as well as ways to exploit discovered symmetries in the lists, that help optimize the sorting algorithm, and coupled with the different ways to arrange data in memory (arrays vs. linked lists vs. heaps vs. trees, etc.) this leads to the different optimized algorithms.
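To make that arithmetic concrete, here is a tiny Python sketch that just plugs a specific N into the formulas above (these are the algebraic counts from the text, not measured numbers):

# Plugging N = 64 into the counts above (algebra only, not measurements).
N = 64
naive = N * (N - 1)                          # compare every item with every other
one_half = (N // 2) * (N // 2 - 1)           # naive sort of one half-sized list
halves_plus_merge = 2 * one_half + N // 2    # sort both halves, then merge

print(naive)                                 # 4032
print(halves_plus_merge)                     # 2016
print(N * N // 2 - N // 2)                   # 2016, i.e. N^2/2 - N/2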
Given an input set of n integers in the range [0..n^3-1], provide a linear time sorting algorithm.
This is a review question for my test on Thursday, and I have no idea how to approach this problem.
Take a look at related sorts too: pigeonhole sort or counting sort, as well as radix sort as mentioned by Pukku.
Have a look at radix sort.
When people say "sorting algorithm" they are often referring to a "comparison sorting algorithm", i.e. any algorithm that only depends on being able to ask "is this thing bigger or smaller than that?". If you are limited to asking that one question about the data, you will never do better than n*log(n): that is the cost of a binary search through the n! possible orderings of a data set, since log2(n!) is on the order of n*log(n).
If you can escape the constraints of comparison sorting and ask a more sophisticated question about a piece of data, for instance "what is this value's digit in base 10 at a given position?", then you can come up with any number of linear-time sorting algorithms; they just take more memory.
This is a time/space trade-off. A comparison sort takes little or no extra RAM and runs in n*log(n) time. Radix sort, for example, runs in O(n) time but needs extra memory proportional to the radix (for the counting buckets) plus an output buffer.
Wikipedia lists quite a lot of different sorting algorithms and their complexities; you might want to check them out.
It's really simple if the values fit in a fixed range (say, 32-bit integers) and the numbers are unique:
Construct an array of bits (2^31-1 bits => ~256 MB) and initialize them to zero.
Read the input; for each value you see, set the respective bit in the array to 1.
Scan the array; for each bit that is set, output the respective value.
Complexity => O(n) to read the input and set the bits, plus a scan proportional to the size of the bit array.
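A rough Python sketch of that bit-array idea (names are mine; it assumes the values are unique and bounded by max_value):

def bitmap_sort(values, max_value):
    # Bit-array sort: assumes the values are unique and lie in [0, max_value].
    bits = bytearray(max_value // 8 + 1)          # one bit per possible value
    for v in values:
        bits[v >> 3] |= 1 << (v & 7)              # mark v as present
    return [v for v in range(max_value + 1)
            if bits[v >> 3] & (1 << (v & 7))]     # emit marked values in order

print(bitmap_sort([5, 1, 9, 3], 10))              # [1, 3, 5, 9]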
Otherwise, use Radix Sort:
Complexity => O(kn) (hopefully)
Think of the numbers as three-digit numbers in base n, where each digit ranges from 0 to n-1. Sort these numbers with radix sort. For each digit there is a call to counting sort, which takes Theta(n + n) = Theta(n) time, so the total running time is Theta(n).
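Here is one way that could look as a Python sketch (function names are mine): three stable counting-sort passes, treating each value as a three-digit number in base n.

def counting_sort_by_digit(a, base, digit):
    # Stable counting sort of a by base-`base` digit number `digit`.
    counts = [0] * base
    for x in a:
        counts[(x // base ** digit) % base] += 1
    total = 0
    for d in range(base):                 # exclusive prefix sums: start positions
        counts[d], total = total, total + counts[d]
    out = [0] * len(a)
    for x in a:
        d = (x // base ** digit) % base
        out[counts[d]] = x
        counts[d] += 1
    return out

def radix_sort_cubed_range(a):
    # n integers in [0, n^3 - 1]: three counting-sort passes in base n -> Theta(n).
    n = len(a)
    if n <= 1:
        return list(a)
    for digit in range(3):                # n^3 values fit in exactly 3 base-n digits
        a = counting_sort_by_digit(a, n, digit)
    return a

print(radix_sort_cubed_range([7, 0, 26, 3, 15, 1, 20, 9]))   # [0, 1, 3, 7, 9, 15, 20, 26]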
A set of numbers from a limited range can be represented by a bitmap of RANGE bits.
In this case that would be roughly a 500 MB bitmap, so for anything but huge lists you'd be better off with radix sort. As you encounter the number k, set bitmap[k] = 1. That's a single traversal through the list, O(N).
A similar algorithm is possible:

int M[];       // unsorted array
int lngth;     // number of items in M
for (int i = 0; i < lngth; i++)
    sorted[M[i]] = M[i];   // place each value directly at its own index
                           // (works only if the values are unique and
                           //  sorted[] spans the whole value range)

This is another possible algorithm with linear time complexity, but it needs O(k*N) memory (where N is the number of array elements and k is the element's length).