Researching big O notation, I understand the concept of O(log n) as a binary search and O(n log n) as a quick sort.
Can anyone put into layman's terms what the main difference in runtime is between these two, and why that is the case? Intuitively, they seem similarly related.
Basically: a factor of N.
A binary search only touches a small number of elements. If there are a billion elements, the binary search only touches about 30 of them.
A quicksort touches every single element, a small number of times. If there are a billion elements, the quicksort touches all of them about 30 times: roughly 30 billion touches in total.
See how log(n) stays nearly flat (not literally, but in comparison to other functions), while n*log(n) has already crossed 600 by n = 100. That's how different they are.
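A quick way to see these numbers for yourself (a minimal Python sketch, just for illustration):

```python
import math

n = 1_000_000_000
print(math.log2(n))          # ~29.9: a binary search touches ~30 elements
print(n * math.log2(n))      # ~3e10: a quicksort does ~30 billion touches
print(100 * math.log2(100))  # ~664: n*log2(n) has crossed 600 by n = 100
```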
In simple terms, both appear in the analysis of sorting algorithms, but quicksort's O(n log n) comes with a caveat. In most situations each of the n elements costs about log n work (that is why the n sits in front of the log n), but in special cases quicksort degrades to O(n²). So quicksort is very good most of the time, but if you need a guarantee when sorting millions or billions of elements, merge sort's O(n log n) worst case makes it the safer choice.
So I have an exam coming up, and a big part of it will be the quicksort algorithm. As everyone knows, the best case and also the average case for this algorithm is O(n log n); the worst case is O(n^2).
As for the worst case, I know how to explain it: it happens when the selected pivot is the smallest or the biggest value in the array; then we get n quicksort calls, each of which may take up to n time (I mean the partition operation). Am I right?
Now the best/average case. I've read Cormen's book and understood many things thanks to it, but for quicksort he focuses on the mathematical formulas behind the O(n log n) complexity. I just want to know why it is O(n log n) without getting into a mathematical proof. So far I've only seen the Wikipedia explanation that if we choose a pivot which divides the array into parts of n/2 and n/2 + 1 each time, we get a call tree of depth log n, but I don't know whether that is true and, even if it is, why the depth is log n.
I know there are many materials covering quicksort on the internet, but they only cover the implementation, or just state the complexity without explaining it.
Am I right?
Yes.
we would have a call tree of depth logn but I don't know if that is true
It is.
why is it logn?
Because we partition the array in half at every step, the call graph has depth log n. Picture the recursion tree: repeatedly halving n reaches size 1 after log₂ n levels. It's the same reason a search in a BST, or a binary search in a sorted array, costs log n.
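The halving-to-depth-log-n claim can be checked directly with a tiny sketch (plain Python, purely for illustration):

```python
def halving_depth(n):
    """Count how many times n can be halved before reaching 1."""
    depth = 0
    while n > 1:
        n //= 2
        depth += 1
    return depth

print(halving_depth(1024))       # 10 == log2(1024)
print(halving_depth(1_000_000))  # 19, i.e. roughly log2(1e6)
```

Each halving is one level of the quicksort call tree, so a perfectly split million-element array recurses only ~20 levels deep.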
PS: The math tells the truth. Invest in understanding it, and you shall become a better computer scientist! =)
For the best case, quick sort splits the current array 50% / 50% (in half) on each partition step, giving a recursion depth of log2(n) (1/.5 = 2); each level does O(n) work, and since the constant base 2 is ignored, the total is O(n log(n)).
If each partition step instead produced a 20% / 80% split, the depth would be governed by the 80% side, log1.25(n) (1/.8 = 1.25); the constant base 1.25 is likewise ignored, so it's also O(n log(n)), even though it's about 3 times slower than the 50% / 50% case for sorting 1 million elements.
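The "about 3 times slower" figure can be checked numerically; this sketch just evaluates the two logarithms above:

```python
import math

# Recursion depth for a 1,000,000-element sort under the two splits:
# a 50/50 split shrinks the problem 2x per level (log base 2), while a
# 20/80 split shrinks the larger side only 1.25x per level (log base 1.25).
n = 1_000_000
depth_50_50 = math.log(n, 2)
depth_20_80 = math.log(n, 1.25)
print(depth_50_50)                # ~20 levels
print(depth_20_80)                # ~62 levels
print(depth_20_80 / depth_50_50)  # ~3.1x deeper, hence ~3x slower
```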
The O(n^2) time complexity occurs when the partition split only produces a linear reduction in partition size with each partition step. The simplest and worst case example is when only 1 element is removed per partition step.
I am presently studying sorting algorithms. I have learned that the quick sort algorithm's performance depends on the initial organization of the data: if the array is already sorted, quick sort becomes slower. Is there any other sort whose performance depends on the initial organization of the data?
Of course. Insertion sort will run in O(n) comparisons on descending sorted input:

    def insertion_sort(arr):
        out = []
        while arr:
            x = arr.pop(0)   # take the first remaining element
            i = 0            # find x's place in the sorted list `out`
            while i < len(out) and out[i] < x:
                i += 1
            out.insert(i, x)
        return out

because each insert then costs O(1) comparisons: with descending input, every popped element is smaller than everything already in out, so it lands at the head immediately. If the last element is popped (arr.pop()) instead of the first, the sort is fastest on ascending sorted input (this assumes the pops and head-inserts are themselves O(1), e.g. on a linked list).
All fast sort algorithms minimize comparison and move operations. The number of move operations depends on the initial element ordering (I'm assuming that by "initial organization" you mean initial element ordering).
Additionally, the fastest real-world algorithms exploit locality of reference, which also depends on the initial ordering.
If you are only interested in cases where the initial ordering slows or speeds up the sorting dramatically: bubble sort, for example, will complete in one pass on already-sorted data (given the usual early-exit check).
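A short sketch of that early-exit behaviour (this variant returns the number of passes, purely for illustration):

```python
def bubble_sort(arr):
    """Bubble sort with an early-exit check; returns the pass count."""
    passes = 0
    n = len(arr)
    swapped = True
    while swapped:
        swapped = False
        passes += 1
        for i in range(n - 1):
            if arr[i] > arr[i + 1]:
                arr[i], arr[i + 1] = arr[i + 1], arr[i]
                swapped = True
    return passes

print(bubble_sort([1, 2, 3, 4, 5]))  # already sorted: done in 1 pass
print(bubble_sort([5, 4, 3, 2, 1]))  # reversed: takes 5 passes
```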
Finally, many sort algorithms have average time complexity O(N log N) but worst case complexity O(N^2). What this means is that there exist specific inputs (e.g. sorted or reverse sorted) for these O(N^2) algorithms that provoke the bad run time behaviour. Some quicksort versions are example of these algorithms.
If what you're asking is "should I be worried about which sorting algorithm to pick on a case-by-case basis?", then unless you're processing thousands of millions of elements, the short answer is "no". Most of the time quicksort will be just fine (a quicksort with a well-chosen pivot, like Java's).
In general cases, quicksort is good enough.
On the other hand, if your system always receives the source data in a consistent initial ordering, and sorting it costs significant CPU time and power on every run, then you should definitely find the right algorithm for that corner case.
I just wrote the quicksort and merge sort algorithms, and I want to make a log-log plot of their run time vs. the size of the array being sorted.
As I have never done this, my question is: does it matter if I choose arbitrary numbers for the array length (size of input), or should I follow a pattern (something like 10^3, 10^4, 10^5, etc.)?
In general, you need to choose array lengths, for each method, that are large enough to display the expected O(n log n) or O(n^2) behavior.
If your n is too small, the run time may be dominated by other growth rates; for example, an algorithm with run time = 1000000*n + n^2 will look to be ~O(n) until n approaches 10^6. For most algorithms this small-n behavior means that your log-log plot will initially be curved.
On the other hand, if your n is too large your algorithm may take too long to complete.
The best compromise may be to start with a small n, time runs for n, 2n, 4n, ..., or n, 3n, 9n, ..., and keep increasing until you can clearly see the log-log plots asymptote to straight lines.
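A minimal timing-harness sketch along those lines (the quicksort here is a simple stand-in; substitute the implementation you are actually measuring):

```python
import random
import time

def quicksort(a):
    # stand-in implementation, used only as the thing being timed
    if len(a) <= 1:
        return a
    pivot = a[len(a) // 2]
    return (quicksort([x for x in a if x < pivot])
            + [x for x in a if x == pivot]
            + quicksort([x for x in a if x > pivot]))

# Doubling sizes give evenly spaced points on a log-log plot.
n = 1_000
while n <= 64_000:
    data = [random.random() for _ in range(n)]
    start = time.perf_counter()
    quicksort(data)
    elapsed = time.perf_counter() - start
    print(n, elapsed)   # one (n, time) point per doubling
    n *= 2
```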
Everybody knows that an optimized sort takes less time than a naive one. Knowing that, the first thought that comes to my mind is: what makes it faster?
The primary reason an optimized sort runs faster is that it reduces the number of times two elements are compared and/or moved.
If you think about a naive worst-case sort of N items, where you compare each of the N items to each of the N-1 other items, there is a total of N*(N-1) = N^2-N comparisons.
One possible way of optimizing the process would be to sort a smaller list - say N/2. That would take N/2*(N/2-1) comparisons, or N^2/4-N/2. But of course, you would have to do that on both halves of the list, which would double that - so N^2/2-N comparisons. You would also need N/2 additional comparisons to merge the two halves back together, so the total number of comparisons to sort two half-sized lists and merge them back together would be N^2/2-N+N/2, or N^2/2-N/2 - half the number of comparisons for sorting the whole list. Recursively subdividing your lists down to 2 items that can be sorted with just one comparison can provide significant savings in the number of comparisons.
There are other divide-and-conquer ideas, as well as ways to exploit discovered symmetries in the lists, that help optimize the sorting algorithm, and coupled with the different ways to arrange data in memory (arrays vs. linked lists vs. heaps vs. trees, etc.) this leads to different optimized algorithms.
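The arithmetic above can be checked numerically; this sketch counts comparisons under the same model (N*(N-1) for the naive sort, two half-sized sorts plus N/2 merge comparisons for one split):

```python
def naive_comparisons(n):
    # compare each of n items against the n-1 others
    return n * (n - 1)

def split_once_comparisons(n):
    # sort two half-sized lists naively, then merge with n/2 comparisons
    return 2 * naive_comparisons(n // 2) + n // 2

def fully_split_comparisons(n):
    # recurse down to 2-element lists (one comparison each)
    if n <= 2:
        return 1 if n == 2 else 0
    return 2 * fully_split_comparisons(n // 2) + n // 2

n = 1024
print(naive_comparisons(n))        # 1,047,552
print(split_once_comparisons(n))   # 523,776 -- about half
print(fully_split_comparisons(n))  # 5,120 -- about (n/2)*log2(n)
```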
Let's say we want to find some known key in an array and extract its value. There are two possible approaches (maybe more?). The linear approach, in which we compare each array key with the needle: O(N). Or we can sort the array, O(N*log(N)), and then apply binary search, O(log(N)). I have several questions about this.
So, as far as I can see, sorting is closely related to searching, but a standalone sort is useless: sorting is an instrument to simplify searching. Am I correct? Or are there other applications of sorting?
If we talk about searching, we can search unsorted data in O(N), or sort and then search in O(N*log(N)) + O(log(N)). Searching can exist separately from sorting. So when we need to find something in an array only once we should use linear search, and if the search is repeated we should sort the data first and then use binary search?
Don't think an O(n * lg(n)) sort is needed before every search. That would be ridiculous, because O(n * lg(n)) + O(lg(n)) > O(n): it would be quicker to do a linear search on randomly ordered data, which examines n/2 elements on average.
The idea is to sort your random data only once, using an O(n * lg(n)) algorithm, and then insert any later additions in sorted position, so that every search thereafter can be done in O(lg(n)) time.
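A sketch of that sort-once, search-many pattern using Python's bisect module (the data values here are illustrative):

```python
import bisect

data = [17, 3, 44, 8, 23, 5]
data.sort()                     # sort once: O(n log n)

def contains(sorted_list, needle):
    # binary search: O(log n) per lookup
    i = bisect.bisect_left(sorted_list, needle)
    return i < len(sorted_list) and sorted_list[i] == needle

print(contains(data, 23))       # True
print(contains(data, 9))        # False

bisect.insort(data, 9)          # later additions go in sorted position
print(contains(data, 9))        # True
```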
You might be interested in looking at hash tables, which are a kind of unsorted array offering O(1) expected access time.
It is extremely rare that you would create an array of N items and then search it only once. Therefore it is usually profitable to improve the data structure holding the items so as to improve search time (amortize the set-up time over all the searches and see if you save overall time).
However, there are many other considerations: Do you need to add new items to the collection? Do you need to remove items from the collection? Are you willing to spend extra memory in order to improve search time? Do you care about the original order in which the items were added to the collection? All of these factors, and more, influence your choice of container and searching technique.