Given an input set of n integers in the range [0..n^3-1], provide a linear time sorting algorithm.
This is a review question for my test on Thursday, and I have no idea how to approach this problem.
Also take a look at related sorts: pigeonhole sort and counting sort, as well as radix sort as mentioned by Pukku.
Have a look at radix sort.
When people say "sorting algorithm" they often are referring to "comparison sorting algorithm", which is any algorithm that only depends on being able to ask "is this thing bigger or smaller than that". So if you are limited to asking this one question about the data then you will never get more than n*log(n) (this is the result of doing a log(n) search of the n factorial possible orderings of a data set).
If you can escape the constraints of a "comparison sort" and ask a more sophisticated question about a piece of data, for instance "what is a given base-10 digit of this value", then you can come up with any number of linear-time sorting algorithms; they just take more memory.
This is a time/space trade-off. A comparison sort takes little or no extra RAM and runs in O(n*log(n)) time; radix sort (for example) runs in O(n) time but uses extra memory that grows with the radix (plus an output buffer).
Wikipedia lists quite a few different sorting algorithms and their complexities; you might want to check them out.
It's really simple if the numbers are unique and fit in a fixed range (here, non-negative 31-bit integers):
Construct an array of bits (2^31-1 bits => ~256MB). Initialize them to zero.
Read the input, for each value you see set the respective bit in the array to 1.
Scan the array, for each bit set, output the respective value.
Complexity => O(2n)
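A minimal C++ sketch of that bit-vector approach, assuming the values are unique non-negative 31-bit integers (the 2^31 bits are the ~256MB mentioned above; the names are illustrative):

#include <cstddef>
#include <vector>

// Sketch: sort unique values in [0, 2^31 - 1] via a bit array.
std::vector<unsigned> bitmap_sort(const std::vector<unsigned>& in) {
    const std::size_t range = std::size_t(1) << 31;    // 2^31 bits => ~256 MB
    std::vector<bool> seen(range, false);              // packed bit array, initialized to zero

    for (unsigned v : in)                              // pass 1: set the bit for each value
        seen[v] = true;

    std::vector<unsigned> out;
    out.reserve(in.size());
    for (std::size_t v = 0; v < range; ++v)            // pass 2: output each value whose bit is set
        if (seen[v])
            out.push_back(static_cast<unsigned>(v));
    return out;                                        // O(n + range) work overall
}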
Otherwise, use Radix Sort:
Complexity => O(kn) (hopefully)
Think of the numbers as three-digit base-n numbers, where each digit ranges from 0 to n-1, and sort them with radix sort. For each digit there is a call to counting sort, which takes Theta(n + n) time, so the total running time is Theta(n).
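A hedged sketch of that idea in C++: LSD radix sort with three stable counting-sort passes over base-n digits (the function and variable names are illustrative, not from the question):

#include <cstddef>
#include <cstdint>
#include <vector>

// Sketch: sort n integers in [0, n^3 - 1] in Theta(n) time by treating each
// value as three base-n digits and doing one counting sort per digit.
void radix_sort_cubed(std::vector<std::uint64_t>& a) {
    const std::size_t n = a.size();
    if (n < 2) return;

    std::vector<std::uint64_t> tmp(n);
    std::uint64_t divisor = 1;                            // n^0, then n^1, then n^2
    for (int pass = 0; pass < 3; ++pass, divisor *= n) {
        std::vector<std::size_t> count(n + 1, 0);
        for (std::uint64_t v : a)                         // histogram of the current digit
            ++count[(v / divisor) % n + 1];
        for (std::size_t d = 0; d < n; ++d)               // prefix sums -> start positions
            count[d + 1] += count[d];
        for (std::uint64_t v : a)                         // stable scatter into tmp
            tmp[count[(v / divisor) % n]++] = v;
        a.swap(tmp);
    }
}

Each pass does Theta(n) work for the histogram, the prefix sums, and the scatter, so three passes are still Theta(n).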
A set of numbers from a limited range can be represented by a bitmap of RANGE bits.
In this case that is a ~500 MB bitmap, so for anything but huge lists you'd be better off with radix sort. As you encounter the number k, set bitmap[k] = 1. That's a single traversal through the list, O(N).
A similar algorithm is possible:

M;      // unsorted array
length; // number of items in M
for (int i = 0; i < length; i++) sorted[M[i]] = M[i];

This is about the only kind of algorithm with linear time complexity, but it costs O(k*N) memory (N = number of array elements, k = an element's length), and as written it assumes the values are distinct.
In this sorting animation, I saw that heap sort and merge sort work best for an array containing random numbers. But what if we compare these sorting algorithms with radix sort and introsort?
In short, which type of sorting algorithm is best for sorting an array of random numbers?
Thanks
For an array of random numbers, a least-significant-digit-first counting variation of radix sort is normally fastest for smaller arrays that fit within cache, while for larger arrays, using one most-significant-digit-first pass to split the array into smaller sub-arrays that fit in cache will be faster. Since the data is random, the main time overhead for a radix sort is the randomly distributed writes, which are not cache friendly if the array is significantly larger than the cache. If the original and working arrays fit within cache, the random-access writes don't incur a significant time penalty on most systems.
There is also a choice for the base used in a radix sort. For example 32 bit numbers can be sorted in 4 passes if using base 256 (8 bit "digits"). Using base 65536 (16 bit "digits") usually exceeds the size of the L1 and/or L2 caches, so it's not faster in most cases, even though it only takes two passes. For 64 bit numbers, four 11 bit "digits" and two 10 bit "digits" could be used to sort in 6 passes, instead of using eight 8 bit "digits" to sort in 8 passes. However, the 11/10 bit digit variation isn't going to be faster unless the array is large enough and the distribution of the random numbers is uniform enough to use up most of the storage used to hold the counts / indexes.
Link to a prior thread about radix sort variations:
Radix Sort Optimization
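As a concrete illustration of the 4-pass, base-256 case described above, here is a minimal LSD radix sort sketch for 32-bit keys in C++ (a plain version for clarity, not the optimized code from the linked thread):

#include <cstddef>
#include <cstdint>
#include <vector>

// Sketch: LSD radix sort of 32-bit unsigned keys using base 256 (8-bit digits),
// i.e. four stable counting passes.
void radix_sort_u32(std::vector<std::uint32_t>& a) {
    std::vector<std::uint32_t> tmp(a.size());
    for (int shift = 0; shift < 32; shift += 8) {
        std::size_t count[257] = {0};
        for (std::uint32_t v : a)                       // histogram of this byte
            ++count[((v >> shift) & 0xFF) + 1];
        for (int d = 0; d < 256; ++d)                   // prefix sums -> start positions
            count[d + 1] += count[d];
        for (std::uint32_t v : a)                       // stable scatter into the work array
            tmp[count[(v >> shift) & 0xFF]++] = v;
        a.swap(tmp);                                    // after 4 passes the result is back in a
    }
}

The 256-entry count table stays comfortably within L1 cache, which is the point made above; a base-65536 version would need a 65536-entry table per pass.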
merge sort "best case"
For a standard merge sort, the number of moves is always the same, but if the data is already sorted, only half the number of compares is done.
quick sort / intro sort "best case"
Best case for quick sort is random data. Using the middle value for the partition doesn't make much difference for random data, but if the data is already sorted it ends up as the best case. Intro sort generally involves extra code to check whether the recursion is getting too deep, at which point it switches to heap sort. For random data this shouldn't happen, so the check for switching to heap sort is just extra overhead.
Here you can see time complexities of various sorting algorithms in best, average and worst cases: http://bigocheatsheet.com/
As you want to compare the time complexities of sorting algorithms with random numbers, we can simply compare their average-case time complexities.
You can further look into their algorithms and analyze their time complexities.
https://www.geeksforgeeks.org/merge-sort/
https://www.geeksforgeeks.org/radix-sort/
https://www.geeksforgeeks.org/heap-sort/
https://www.geeksforgeeks.org/know-your-sorting-algorithm-set-2-introsort-cs-sorting-weapon/
Merge sort is a widely used sorting algorithm in the sort-function implementations of various libraries.
Merge sort sorts in O(n log n) time and O(n) space.
Heap sort sorts in O(n log n) time and O(1) space.
Radix sort sorts in O(nk) time and O(n + k) space.
Intro sort sorts in O(n log n) time and O(log n) space.
Intro sort is a mix of quicksort, insertion sort, and heapsort.
Intro Sort is probably the best one.
There is no perfect algorithm, different algorithms have a different set of advantages and disadvantages.
Note: All time complexities are average case.
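In practice you usually reach for a library sort rather than writing one. In C++, for example, std::sort is typically implemented as introsort and std::stable_sort as a merge sort variant (the exact implementation depends on the standard library); a minimal comparison harness:

#include <algorithm>
#include <cstdio>
#include <random>
#include <vector>

int main() {
    std::mt19937 rng(42);                               // fixed seed for reproducibility
    std::uniform_int_distribution<int> dist(0, 1000000);

    std::vector<int> a(100000);
    for (int& x : a) x = dist(rng);                     // array of random numbers
    std::vector<int> b = a;

    std::sort(a.begin(), a.end());                      // typically introsort: O(n log n) time, O(log n) extra space
    std::stable_sort(b.begin(), b.end());               // typically merge sort: O(n log n) time, O(n) extra space

    std::printf("same result: %s\n", a == b ? "yes" : "no");
}

To decide which is actually fastest on your data, time them on your own input sizes and hardware; the constant factors and cache behaviour matter more than the shared O(n log n) bound.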
I am presently studying sorting algorithms. I have learned that the quick sort algorithm's performance depends on the initial organization of the data: if the array is already sorted, quick sort becomes slower. Is there any other sort whose performance depends on the initial organization of the data?
Of course. Insertion sort will be O(n) with the descending sorted input:
def insertion_sort(arr):
    out = []
    while arr:
        x = arr.pop()                    # take the element at the back of the input
        i = len(out)
        while i > 0 and out[i - 1] > x:  # scan from the back of out for x's spot
            i -= 1
        out.insert(i, x)                 # place x at its sorted position
    return out
because, for descending input, each insert is O(1): every popped element belongs at the very end of out. If elements are popped from the front of the input instead of the back, the fast case becomes ascending input (this assumes the pop operation itself is O(1), e.g. on a deque).
All fast sort algorithms minimize comparison and move operations. Minimizing move operations depends on the initial element ordering (I'm assuming that by "initial organization" you mean the initial element ordering).
Additionally, the fastest real-world algorithms exploit locality of reference, which also shows a dependence on the initial ordering.
If you are only interested in orderings that slow down or speed up the sort dramatically: bubble sort, for example, will complete in a single pass on already-sorted data.
Finally, many sort algorithms have average time complexity O(N log N) but worst-case complexity O(N^2). What this means is that there exist specific inputs (e.g. sorted or reverse-sorted) that provoke the bad run-time behaviour of these O(N^2) algorithms. Some quicksort versions are examples of such algorithms.
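To make that last point concrete, here is a hedged sketch of a textbook quicksort that always picks the last element as the pivot; on already-sorted (or reverse-sorted) input every partition is maximally unbalanced, so it degrades to O(N^2) time and O(N) recursion depth:

#include <utility>
#include <vector>

// Naive quicksort, Lomuto partition, last element as pivot:
// O(N log N) on random data, O(N^2) on sorted or reverse-sorted input.
void naive_quicksort(std::vector<int>& a, int lo, int hi) {
    if (lo >= hi) return;
    int pivot = a[hi];                      // last element as pivot
    int i = lo;
    for (int j = lo; j < hi; ++j)           // partition: smaller elements to the left
        if (a[j] < pivot)
            std::swap(a[i++], a[j]);
    std::swap(a[i], a[hi]);                 // put the pivot in its final position
    naive_quicksort(a, lo, i - 1);          // on sorted input one side is always empty
    naive_quicksort(a, i + 1, hi);
}

Calling naive_quicksort(v, 0, (int)v.size() - 1) on a large already-sorted vector shows the quadratic slowdown (and may even overflow the stack, since the recursion depth becomes N). Median-of-three or random pivot selection avoids this particular worst case.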
If what you're asking is "should I worry about which sorting algorithm to pick on a case-by-case basis?", then unless you're processing billions of operations, the short answer is "no". Most of the time quicksort will be just fine (a quicksort with a well-chosen pivot, like Java's).
In general cases, quicksort is good enough.
On the other hand, if your system always receives the source data in a consistent initial ordering, and each sort costs significant CPU time and power, then you should definitely find the right algorithm for that corner case.
In (most of the) research papers on sorting, authors conclude that their algorithm takes n-1 comparisons to sort an array of size n
...so and so
but when it comes to coding, the code uses more comparisons than concluded.
More specifically, what assumptions do they take for the comparisons?
What kind of comparisons they don't take into account?
For example, take a look at Freezing sort or Enhanced Insertion sort. The number of comparisons these algorithms take in actual code is more than what is specified in their graphs (number of comparisons vs. number of elements).
The least possible number of comparisons done by a sorting algorithm is n-1. In that case you wouldn't actually be sorting at all; you'd just be checking whether the data is already sorted, essentially comparing each element to the ones directly before and after it (this is the best case for insertion sort). It's fairly easy to see that it's impossible to do fewer comparisons than this, because then you'd have more than one disjoint set of compared elements, meaning you wouldn't know how elements across those sets compare to each other.
If we're talking about average / worst case, it's actually been proven that the number of comparisons required is Ω(n log n).
An algorithm being recursive or iterative doesn't (directly) affect the number of comparisons. The only statement I could think that we could make specifically about recursive sorting algorithms is perhaps the recursion depth. This greatly depends on the algorithm, but quick-sort, specifically, has a (worst-case) recursion depth around n-1.
More comparisons that are often ignored in papers, but are performed in real code, are the comparisons for branches (if (<stop clause>) return ...;), and similarly for loop iterators.
One reason why they are mostly ignored is that they are done on indices, which have constant size, while the compared elements (which we do count) might take more time, depending on the actual type being compared (strings might take longer to compare than integers, for example).
Also note that an array cannot be sorted using n-1 comparisons (worst/average case), since sorting is an Omega(n log n) problem.
However, it is possible that what the author meant is that the sorting takes n-1 comparisons at each step of the algorithm, and there could be multiple (typically O(log n)) such steps.
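A small sketch that makes this distinction visible: the comparator below counts only element-vs-element comparisons (the quantity papers usually report), while the index and branch comparisons inside the sort are not counted:

#include <algorithm>
#include <cstdio>
#include <vector>

int main() {
    long long comparisons = 0;
    auto counting_less = [&](int a, int b) { ++comparisons; return a < b; };

    std::vector<int> sorted_input{1, 2, 3, 4, 5, 6, 7, 8};
    std::vector<int> random_input{5, 2, 8, 1, 7, 3, 6, 4};

    std::sort(sorted_input.begin(), sorted_input.end(), counting_less);
    std::printf("already sorted: %lld element comparisons\n", comparisons);

    comparisons = 0;
    std::sort(random_input.begin(), random_input.end(), counting_less);
    std::printf("random order:   %lld element comparisons\n", comparisons);
}

Even counts measured this way will differ from a hand-derived formula, because a library sort makes additional comparator calls for its own bookkeeping (pivot selection, small-range insertion-sort cutoffs, and so on).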
What's the time complexity of the following program?
sum=0;
for(i=1;i<=5;i++)
sum=sum+i;
And how can this complexity be expressed in terms of log? I would highly appreciate it if someone explained the complexity step by step, and furthermore how to show it in big-O notation and in terms of log n.
[Edited]
sum=0; //1 time
i=1; //1 time
i<=5; //6 times
i++ //5 times
sum=sum+i;//5 times
Is the time complexity 18? Is that correct?
Preliminaries
Time complexity isn't usually expressed in terms of a specific integer, so the statement "The time complexity of operation X is 18" isn't clear without a unit, e.g., 18 "doodads".
One usually expresses time complexity as a function of the size of the input to some function/operation.
You often want to ignore the specific amount of time a particular operation takes, due to differences in hardware or even differences in constant factors between different languages. For example, summation is still O(n) (in general) in C and in Python (you still have to perform n additions), but differences in constant factors between the two languages will result in C being faster in terms of absolute time the operation takes to halt.
One also usually assumes that "Big-Oh", e.g. O(f(n)), denotes the "worst-case" running time of an algorithm. There are other symbols used to study tighter upper and lower bounds.
Your question
Instead of summing from 1 to 5, let's look at summing from 1 to n.
The complexity of this is O(n) where n is the number of elements you're summing together.
Each addition (with +) takes constant time, which you're doing n times in this case.
However, this particular operation that you've shown can be accomplished in O(1) (constant time), because the sum of the numbers from 1 to n can be expressed as a single arithmetic operation. I'll leave the details of that up to you to figure out.
As far as expressing this in terms of logarithms: not exactly sure why you'd want to, but here goes:
Because exp(log(n)) is n, you could express it as O(exp(log(n))). Why would you want to do this? O(n) is perfectly understandable without needing to invoke log or exp.
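A minimal sketch contrasting the O(n) loop with the O(1) closed form the answer above alludes to (the single arithmetic expression is the standard identity 1 + 2 + ... + n = n(n+1)/2):

#include <cstdio>

// O(n): one addition per loop iteration.
long long sum_loop(long long n) {
    long long sum = 0;
    for (long long i = 1; i <= n; ++i)
        sum = sum + i;
    return sum;
}

// O(1): a fixed number of arithmetic operations, independent of n.
long long sum_formula(long long n) {
    return n * (n + 1) / 2;
}

int main() {
    std::printf("%lld %lld\n", sum_loop(5), sum_formula(5));   // both print 15
}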
First of all, the loop runs 5 times for 5 inputs, hence it has a time complexity of O(n). I am assuming here that the values of i are the inputs for sum.
Secondly, you can't just express time complexity in terms of a logarithm; it should always be in big-O notation. For example, if you perform a binary search, then the worst-case time complexity of that algorithm is O(log n), because you get the result in, say, 3 iterations when the input array has 8 elements.
Complexity = log2(8) = 3
Now here your complexity involves a log.
Is it theoretically possible to sort an array of n integers in an amortized complexity of O(n)?
What about trying to create a worst case of O(n) complexity?
Most of the algorithms today are built on O(n log n) average + O(n^2) worst case.
Some, while using more memory, are O(n log n) worst case.
Can you with no limitation on memory usage create such an algorithm?
What if your memory is limited? how will this hurt your algorithm?
Any page on the intertubes that deals with comparison-based sorts will tell you that you cannot sort faster than O(n lg n) with comparison sorts. That is, if your sorting algorithm decides the order by comparing 2 elements against each other, you cannot do better than that. Examples include quicksort, bubblesort, mergesort.
Some algorithms, like count sort or bucket sort or radix sort do not use comparisons. Instead, they rely on the properties of the data itself, like the range of values in the data or the size of the data value.
Those algorithms might have faster complexities. Here is an example scenario:
You are sorting 10^6 integers, and each integer is between 0 and 10. Then you can just count the number of zeros, ones, twos, etc. and spit them back out in sorted order. That is how countsort works, in O(n + m) where m is the number of values your datum can take (in this case, m=11).
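A minimal counting-sort sketch for that scenario in C++ (values limited to a small known range; the names are illustrative):

#include <cstddef>
#include <vector>

// Counting sort for values in [0, max_value]; O(n + m) time, where m = max_value + 1.
void counting_sort(std::vector<int>& a, int max_value) {
    std::vector<std::size_t> count(max_value + 1, 0);
    for (int v : a)                        // count how many times each value occurs
        ++count[v];
    std::size_t k = 0;
    for (int v = 0; v <= max_value; ++v)   // spit the values back out in sorted order
        for (std::size_t c = 0; c < count[v]; ++c)
            a[k++] = v;
}

For the example above you would call counting_sort(data, 10) on the 10^6 integers.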
Another:
You are sorting 10^6 binary strings that are all at most 5 characters in length. You can use the radix sort for that: first split them into 2 buckets depending on their first character, then radix-sort them for the second character, third, fourth and fifth. As long as each step is a stable sort, you should end up with a perfectly sorted list in O(nm), where m is the number of digits or bits in your datum (in this case, m=5).
But in the general case, you cannot sort faster than O(n lg n) reliably (using a comparison sort).
I'm not quite happy with the accepted answer so far. So I'm retrying an answer:
Is it theoretically possible to sort an array of n integers in an amortized complexity of O(n)?
The answer to this question depends on the machine that would execute the sorting algorithm. If you have a random access machine, which can operate on exactly 1 bit, you can do radix sort for integers with at most k bits, which was already suggested. So you end up with complexity O(kn).
But if you are operating on a fixed-size word machine with a word size of at least k bits (which all consumer computers are), the best you can achieve is O(n log n). This is because either log n < k, or you could do a count sort first and then sort with an O(n log n) algorithm, which would yield the first case as well.
What about trying to create a worst case of O(n) complexity?
That is not possible. A link was already given. The idea of the proof is that in order to be able to sort, you have to decide for every element to be sorted whether it is larger or smaller than any other element to be sorted. Using transitivity, this can be represented as a decision tree, which has n! leaves and, at best, log(n!) = Ω(n log n) depth. So if you want performance better than Ω(n log n), that means removing edges from the decision tree. But if the decision tree is not complete, then how can you be sure that you have made a correct decision about some elements a and b?
Can you with no limitation on memory usage create such an algorithm?
So as from above that is not possible. And the remaining questions are therefore of no relevance.
If the integers are in a limited range, then an O(n) "sort" of them would involve having a bit vector of "n" bits, one for each possible value ... looping over the integers in question and, for each value v, setting bit v % 8 of byte v // 8 in that byte array to true. That is an O(n) operation. Another loop over that bit array to list/enumerate/return/print all the set bits is, likewise, an O(n) operation. (Naturally O(2n) reduces to O(n).)
This is a special case where the range is small enough to fit within memory or in a file (with seek() operations). It is not a general solution; but it is described in Bentley's "Programming Pearls" --- and was allegedly a practical solution to a real-world problem (involving something like a "freelist" of telephone numbers ... something like: find the first available phone number that could be issued to a new subscriber).
(Note: log2(10^7) is ~24 bits to represent every possible integer up to 7 digits in length ... so there's plenty of room in the 2^31 bits of a typical Unix/Linux maximum-sized memory mapping.)
I believe you are looking for radix sort.