I am preparing for software development interviews, and I have always had trouble distinguishing O(log n) from O(n log n). Can anyone explain this with some examples, or share some resources with me? I don't have any code to show. I understand O(log n), but I haven't understood O(n log n).
Think of it as O(n*log(n)), i.e. "doing log(n) work n times". For example, searching for an element in a sorted list of length n is O(log(n)). Searching for the element in n different sorted lists, each of length n is O(n*log(n)).
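A minimal Python sketch of that example (bisect is the standard-library binary search; the helper names are mine):

    import bisect

    def contains(sorted_list, x):
        # One binary search over a sorted list of length n: O(log n).
        i = bisect.bisect_left(sorted_list, x)
        return i < len(sorted_list) and sorted_list[i] == x

    def contains_in_all(sorted_lists, x):
        # n binary searches, one per list: O(n * log n) in total.
        return all(contains(lst, x) for lst in sorted_lists)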
Remember that O(n) is defined relative to some real quantity n: the size of a list, the number of distinct elements in a collection, and so on. Every variable that appears inside O(...) therefore stands for a quantity whose growth increases the runtime. O(n*m) could be written O(n_1 + n_2 + ... + n_m) and mean the same thing: "doing n work, m times".
Let's take a concrete example of this, mergesort. For n input elements: On the very last iteration of our sort, we have two halves of the input, each half size n/2, and each half is sorted. All we have to do is merge them together, which takes n operations. On the next-to-last iteration, we have twice as many pieces (4) each of size n/4. For each of our two pairs of size n/4, we merge the pair together, which takes n/2 operations for a pair (one for each element in the pair, just like before), i.e. n operations for the two pairs.
From here, we can extrapolate that every level of our mergesort takes n operations to merge. The big-O complexity is therefore n times the number of levels. On the last level, the size of the chunks we're merging is n/2. Before that, it's n/4, before that n/8, and so on, all the way down to size 1. How many times must you divide n by 2 to get 1? log(n). So we have log(n) levels, and our total runtime is O(n * log(n)): n work per level, times log(n) levels.
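A minimal top-down mergesort in Python mirrors this analysis (the names are mine): each recursive level does O(n) total merge work, and there are O(log n) levels.

    def merge(left, right):
        # Standard linear-time merge: one comparison and one write per element.
        out, i, j = [], 0, 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                out.append(left[i])
                i += 1
            else:
                out.append(right[j])
                j += 1
        return out + left[i:] + right[j:]

    def merge_sort(a):
        if len(a) <= 1:
            return a
        mid = len(a) // 2
        # log(n) levels of halving, each doing O(n) total merge work.
        return merge(merge_sort(a[:mid]), merge_sort(a[mid:]))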
Related
Quicksort is an in-place algorithm that does not use any auxiliary array, so why is its memory complexity O(n log(n))?
Similarly, I understand that its worst-case time complexity is O(n^2), but I don't get why the average-case time complexity is O(n log(n)). Basically, I am not sure what we mean when we say "average-case complexity".
To your second point, here is an excerpt from Wikipedia:
The most unbalanced partition occurs when the partitioning routine returns one of the sublists of size n − 1. This may occur if the pivot happens to be the smallest or largest element in the list, or in some implementations (e.g., the Lomuto partition scheme as described above) when all the elements are equal.
If this happens repeatedly in every partition, then each recursive call processes a list of size one less than the previous list. Consequently, we can make n − 1 nested calls before we reach a list of size 1. This means that the call tree is a linear chain of n − 1 nested calls. The ith call does O(n − i) work to do the partition, and ∑_{i=0}^{n} (n − i) = O(n²), so in that case quicksort takes O(n²) time.
Because you usually don't know exactly which numbers you have to sort, and you don't know which pivot element you will choose, there is a chance that your pivot element isn't the smallest or biggest number in the array you sort. If you have an array of n distinct numbers, the chance of avoiding a worst-case pivot is (n − 2)/n.
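To see that at work, here is a sketch of quicksort with a random pivot (a plain out-of-place version, not the Lomuto scheme from the quote; the naming is mine):

    import random

    def quicksort(a):
        if len(a) <= 1:
            return a
        pivot = random.choice(a)  # only 2 of n distinct values are worst-case pivots
        less    = [x for x in a if x < pivot]
        equal   = [x for x in a if x == pivot]
        greater = [x for x in a if x > pivot]
        return quicksort(less) + equal + quicksort(greater)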
Was reading CLRS when I encountered this:
Why do we not ignore the constant k in the big-O expressions in (a), (b), and (c)?
In this case, you aren't considering the run time of a single algorithm, but of a family of algorithms parameterized by k. Considering k lets you compare the extremes: with k = n you sort n/n == 1 list, and with k = 2 you sort n/2 two-element lists. Somewhere in the middle is the value of k that part (c) asks you to find, where Θ(nk + n lg(n/k)) and Θ(n lg n) coincide.
Going into more detail, insertion sort is O(n^2) because (roughly speaking) in the worst case, any single insertion can take O(n) time. However, if the sublists have a fixed length k, then each insertion takes at most O(k) = O(1) time with respect to n, independent of how many lists you are sorting. (That is, the bottleneck is no longer in the insertion step, but in the merge phase.)
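To make the scheme from the problem concrete, here is a sketch in Python (the naming is mine, and heapq.merge stands in for the merge phase): insertion-sort each length-k chunk, then merge the n/k sorted chunks.

    import heapq

    def insertion_sort(chunk):
        # O(k^2) per chunk of length k; O(nk) across all n/k chunks.
        for i in range(1, len(chunk)):
            x = chunk[i]
            j = i - 1
            while j >= 0 and chunk[j] > x:
                chunk[j + 1] = chunk[j]
                j -= 1
            chunk[j + 1] = x
        return chunk

    def chunked_sort(a, k):
        chunks = [insertion_sort(a[i:i + k]) for i in range(0, len(a), k)]
        # Merging the n/k sorted chunks accounts for the n lg(n/k) term.
        return list(heapq.merge(*chunks))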
k is not a constant when you compare different algorithms with different values of k.
In the mergesort algorithm, instead of splitting the array into equal halves, suppose we split the array at a random point in each call. I want to calculate the average running time of this algorithm.
Our notes calculate it the same way as for normal merge sort. Any formal idea?
Here is a proof that its time complexity is O(n log n) (it's not very formal).
Let's call a split "good" if the size of the largest part is at most 3/4 of the initial subarray (for an array with 8 elements, the split points look like this: bad bad good good good good bad bad). The probability that a split is good is 1/2. That means that out of every two splits, we expect one to be "good".
Let's draw a tree of recursive merge sort calls:
        [a_1, a_2, a_3, ..., a_n]                     --- level 1
            /              \
[a_1, ..., a_k]          [a_{k+1}, ..., a_n]          --- level 2
    /     \                  /     \
   ...    ...               ...    ...                --- level 3
                    ...
                                                      --- level m
It is clear that there are at most n elements at each level, so the time complexity is O(n * m).
But the observation about good splits implies that the expected number of levels m is about 2 * log(n, 4/3), where log(a, b) is the logarithm of a to base b, which is O(log n).
Thus, the time complexity is O(n * log n).
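For concreteness, a sketch of the random-split variant under discussion (my naming; heapq.merge supplies the standard linear-time two-way merge):

    import heapq
    import random

    def random_split_merge_sort(a):
        if len(a) <= 1:
            return a
        k = random.randint(1, len(a) - 1)  # split point chosen uniformly at random
        left = random_split_merge_sort(a[:k])
        right = random_split_merge_sort(a[k:])
        return list(heapq.merge(left, right))  # standard O(n) two-way merge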
I assume you're talking about recursive merge sort.
In standard merge sort, you split the array at the midpoint, so you end up with (mostly) same-sized subarrays at each level. But if you split somewhere else then, except in pathological cases, you still end up with nearly the same number of subarrays.
Look at it this way: the divide and conquer approach of standard merge sort results in log n "levels" of sorting, with each level containing all n items. You do n comparisons at each level to sort the subarrays. That's where the n log n comes from.
If you randomly split your array, then you're bound to have more levels, but not all items are at all levels. That is, smaller subarrays result in single-item arrays before the longer ones do. So not all items are compared at all levels of the algorithm. Meaning that some items are compared more often than others but on average, each item is compared log n times.
So what you're really asking is, given a total number of items N split into k sorted arrays, is it faster to merge if each of the k arrays is the same length, rather than the k arrays being of varying lengths.
The answer is no. Merging N items from k sorted arrays takes the same amount of time regardless of the lengths of the individual arrays. See How to sort K sorted arrays, with MERGE SORT for an example.
So the answer to your question is that the average case (and the best case) of doing a recursive merge sort with a random split will be O(n log n), with stack space usage of O(log n). The worst case, which would occur only if your random split always split the array into one subarray that contains a single item, and the other contains the remainder, would require O(n) stack space, but still only O(n log n) time.
Note that if you use an iterative merge sort, there is no asymptotic difference in time or space usage.
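For reference, a bottom-up (iterative) mergesort sketch along those lines (my naming; heapq.merge again supplies the two-way merge). It makes O(log n) passes of doubling run width, with no recursion stack:

    import heapq

    def merge_sort_bottom_up(a):
        a = list(a)
        width = 1
        while width < len(a):
            # One pass: merge adjacent runs of length `width`, O(n) work.
            # The width doubles each pass, so there are O(log n) passes.
            merged = []
            for i in range(0, len(a), 2 * width):
                merged.extend(heapq.merge(a[i:i + width], a[i + width:i + 2 * width]))
            a = merged
            width *= 2
        return a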
A Merge algorithm merges two sorted input arrays into a sorted output array, by repeatedly comparing the smallest elements of the two input arrays, and moving the smaller one of the two to the output.
Now we need to merge three sorted input arrays (A1, A2, and A3) of the same length into a (sorted) output array, and there are two methods:
1. Using the above Merge algorithm to merge A1 and A2 into A4, then using the same algorithm to merge A4 and A3 into the output array.
2. Revising the above Merge algorithm, by repeatedly comparing the smallest elements of the three input arrays and moving the smallest of the three to the output array.
Which of the above two algorithms is more efficient, if only considering the worst case of array element movements (i.e., assignments)?
Which of the above two algorithms is more efficient, if only considering the worst case of array element comparisons?
Between these two algorithms, which one has a higher overall efficiency in worst case?
If all that you care about are the number of array writes, the second version (three-way merge) is faster than the first algorithm (two instances of two-way merge). The three-way merge algorithm will do exactly 3n writes (where n is the length of any of the sequences), since it merges all three ranges in one pass. The first approach will merge two of the ranges together, doing 2n writes, and will then merge that sequence with the third sequence, doing 3n writes for a net total of 5n writes.
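A sketch of that one-pass three-way merge (my naming; no sentinels, for clarity), which writes each of the 3n elements exactly once:

    def merge3(a1, a2, a3):
        out = []
        i = j = k = 0
        while i < len(a1) or j < len(a2) or k < len(a3):
            # Find the smallest current head among the non-exhausted arrays.
            candidates = []
            if i < len(a1): candidates.append((a1[i], 1))
            if j < len(a2): candidates.append((a2[j], 2))
            if k < len(a3): candidates.append((a3[k], 3))
            val, which = min(candidates)
            out.append(val)  # exactly 3n writes in total
            if which == 1: i += 1
            elif which == 2: j += 1
            else: k += 1
        return out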
More generally, suppose that you have k ranges of elements, all of length n. If you merge those ranges pairwise, then merge those merges pairwise again, etc., then you will do roughly k/2 merge steps merging ranges of length n, then k/4 merges of ranges of length 2n, then k/8 merges of length 4n, etc. This gives the sum
kn + kn + ... + kn (lg k times)
For a net number of array writes that is O(kn lg k). If, on the other hand, you use a k-way comparison at each step, you do exactly kn writes, which is much smaller.
Now, let's think about how many comparisons you do in each setup. In the three-way merge, each element written into the output sequence requires finding the minimum of three values. This requires two comparisons - one to compare the first values of the first two sequences, and one to compare the minimum of those two values to the first value of the third array. Thus for each value written out to the resulting sequence, we use two comparisons, and since there are 3n values written we need to do a total of at most 6n comparisons.
A much better way to do this would be to store the sequences in a min-heap, where sequences are compared by their first element. On each step, we dequeue the sequence with the smallest first value from the heap, write that value to the result, then enqueue the rest of the sequence back into the heap. With k sequences, this means that each element written out requires at most O(lg k) comparisons, since heap insertion runs in O(lg k). This gives a net runtime of O(kn lg k), since each of the kn elements written out requires O(lg k) processing time.
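Here is a sketch of that heap-based k-way merge (Python's heapq as the min-heap; the index fields break ties so tuple comparison is always well-defined; the naming is mine):

    import heapq

    def k_way_merge(seqs):
        # Seed the heap with (first value, sequence index, element index).
        heap = [(s[0], i, 0) for i, s in enumerate(seqs) if s]
        heapq.heapify(heap)
        out = []
        while heap:
            val, i, j = heapq.heappop(heap)  # O(lg k)
            out.append(val)                  # one write per element: kn writes total
            if j + 1 < len(seqs[i]):
                heapq.heappush(heap, (seqs[i][j + 1], i, j + 1))  # O(lg k)
        return out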
In the other version, we begin by doing a standard two-way merge, which requires one comparison per element written out, for a net total of 2n comparisons. In the second pass of the merge, in the worst case we do a total of 3n comparisons, since there are 3n elements being merged. This gives a net total of 5n comparisons. If we use the generalized construction for pairwise merging described above, we will need O(kn lg k) comparisons, since each element written requires one comparison and we do O(kn lg k) writes.
In short, for the specific case of k=3, we have that the three-way merge does 3n writes and 6n comparisons for a net of 9n memory reads and writes. The iterated two-way merge does 5n writes and 5n comparisons for a net total of 10n memory reads and writes, and so the three-way-merge version is better.
If we consider the generalized constructions, the k-way merge does O(nk) writes and O(nk lg k) comparisons, for a total of O(nk lg k) memory operations. The iterated two-way merge does O(nk lg k) writes and O(nk lg k) comparisons, also O(nk lg k) in total. The totals match asymptotically, but the k-way merge does a factor of lg k fewer writes, so it is never worse and is clearly preferable when writes dominate the cost.
Hope this helps!
I know there are quite a bunch of questions about big O notation, I have already checked:
Plain english explanation of Big O
Big O, how do you calculate/approximate it?
Big O Notation Homework--Code Fragment Algorithm Analysis?
to name a few.
I know by "intuition" how to calculate it for n, n^2, n! and so, however I am completely lost on how to calculate it for algorithms that are log n , n log n, n log log n and so.
What I mean is, I know that Quick Sort is n log n (on average).. but, why? Same thing for merge/comb, etc.
Could anybody explain to me, in a not too math-y way, how you calculate this?
The main reason is that I'm about to have a big interview and I'm pretty sure they'll ask for this kind of stuff. I have been researching for a few days now, and everybody seems to have either an explanation of why bubble sort is n^2, or an explanation that is unreadable (for me), like the one on Wikipedia.
The logarithm is the inverse operation of exponentiation. An example of exponentiation is when you double the number of items at each step. Thus, a logarithmic algorithm often halves the number of items at each step. For example, binary search falls into this category.
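Binary search in code makes the halving explicit (a standard textbook version):

    def binary_search(a, x):
        lo, hi = 0, len(a)
        while lo < hi:             # each iteration halves [lo, hi): O(log n) steps
            mid = (lo + hi) // 2
            if a[mid] < x:
                lo = mid + 1       # discard the left half
            else:
                hi = mid           # discard the right half
        return lo if lo < len(a) and a[lo] == x else -1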
Many algorithms require a logarithmic number of big steps, but each big step requires O(n) units of work. Mergesort falls into this category.
Usually you can identify these kinds of problems by visualizing them as a balanced binary tree. For example, here's merge sort:
6 2 0 4 1 3 7 5
2 6 0 4 1 3 5 7
0 2 4 6 1 3 5 7
0 1 2 3 4 5 6 7
At the top is the input, as leaves of the tree. The algorithm creates each new row by merging the two sorted nodes above it. We know the height of a balanced binary tree is O(log n), so there are O(log n) big steps. However, creating each new row takes O(n) work. O(log n) big steps of O(n) work each means that mergesort is O(n log n) overall.
Generally, O(log n) algorithms look like the function below. They get to discard half of the data at each step.
def function(data, n):
    if n <= constant:
        return do_simple_case(data, n)
    if some_condition():
        return function(data[:n // 2], n // 2)       # Recurse on the first half of the data
    else:
        return function(data[n // 2:], n - n // 2)   # Recurse on the second half of the data
While O(n log n) algorithms look like the function below. They also split the data in half, but they need to consider both halves.
def function(data, n):
    if n <= constant:
        return do_simple_case(data, n)
    part1 = function(data[:n // 2], n // 2)          # Recurse on the first half of the data
    part2 = function(data[n // 2:], n - n // 2)      # Recurse on the second half of the data
    return combine(part1, part2)
Where do_simple_case() takes O(1) time and combine() takes no more than O(n) time.
The algorithms don't need to split the data exactly in half. They could split it into one-third and two-thirds, and that would be fine. For average-case performance, splitting it in half on average is sufficient (like QuickSort). As long as the recursion is done on pieces of (n/something) and (n - n/something), it's okay. If it's breaking it into (k) and (n-k) then the height of the tree will be O(n) and not O(log n).
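As a quick illustration of that last point, compare the recursion depths (hypothetical helper functions, just to make the claim concrete):

    def depth_of_fractional_split(n, frac=2/3):
        # The larger piece keeps a constant fraction of the input:
        # it shrinks geometrically, so the depth is O(log n).
        depth = 0
        while n > 1:
            n = int(n * frac)
            depth += 1
        return depth

    def depth_of_constant_split(n, k=1):
        # The larger piece keeps n - k elements:
        # it shrinks by a constant amount, so the depth is O(n).
        depth = 0
        while n > 1:
            n -= k
            depth += 1
        return depth

    print(depth_of_fractional_split(10**6))  # small (logarithmic)
    print(depth_of_constant_split(10**6))    # 999999 (linear)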
You can usually claim log n for algorithms that halve the search space each time they run. A good example of this is any binary algorithm (e.g., binary search). You pick either left or right, which cuts the space you're searching in half. The pattern of repeatedly halving is log n.
For some algorithms, getting a tight bound for the running time through intuition is close to impossible (I don't think I'll ever be able to intuit an O(n log log n) running time, for instance, and I doubt anyone will ever expect you to). If you can get your hands on the CLRS Introduction to Algorithms text, you'll find a pretty thorough treatment of asymptotic notation which is appropriately rigorous without being completely opaque.
If the algorithm is recursive, one simple way to derive a bound is to write out a recurrence and then set out to solve it, either iteratively or using the Master Theorem or some other way. For instance, if you're not looking to be super rigorous about it, the easiest way to get QuickSort's running time is through the Master Theorem -- QuickSort entails partitioning the array into two relatively equal subarrays (it should be fairly intuitive to see that this is O(n)), and then calling QuickSort recursively on those two subarrays. Then if we let T(n) denote the running time, we have T(n) = 2T(n/2) + O(n), which by the Master Method is O(n log n).
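Unrolling that recurrence by hand shows where the log comes from (a sanity check rather than a formal proof; c is the constant hidden in the O(n) term):

    T(n) = 2T(n/2) + cn
         = 4T(n/4) + 2cn
         = 8T(n/8) + 3cn
         = ...
         = 2^k T(n/2^k) + k*cn

Setting k = log2(n) gives n*T(1) + cn*log2(n) = O(n log n).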
Check out the "phone book" example given here: What is a plain English explanation of "Big O" notation?
Remember that Big-O is all about scale: how many more operations will this algorithm require as the data set grows?
O(log n) generally means you can cut the dataset in half with each iteration (e.g. binary search)
O(n log n) means you're performing an O(log n) operation for each item in your dataset
I'm not sure 'O(n log log n)' comes up much in practice; it's an unusual bound, and I can't think of a common algorithm with it off the top of my head.
I'll attempt to do an intuitive analysis of why Mergesort is n log n and if you can give me an example of an n log log n algorithm, I can work through it as well.
Mergesort is a sorting algorithm that works by splitting a list of elements repeatedly until only single elements exist, and then merging these lists back together. The primary operation in each of these merges is comparison, and each merge requires at most n comparisons, where n is the length of the two lists combined. From this you can derive the recurrence and easily solve it, but we'll avoid that method.
Instead, consider how Mergesort is going to behave. We're going to take a list and split it, then take those halves and split them again, until we have n partitions of length 1. I hope it's easy to see that this recursion will only go log(n) deep before we have split the list up into our n partitions.
Each of these n partitions will need to be merged; once those are merged, the next level will need to be merged, until we have a list of length n again. Refer to Wikipedia's graphic for a simple example of this process: http://en.wikipedia.org/wiki/File:Merge_sort_algorithm_diagram.svg.
Now consider the amount of time this process will take. We're going to have log(n) levels, and at each level we will have to merge all of the lists. As it turns out, each level will take n time to merge, because we'll be merging a total of n elements each time. Then you can fairly easily see that it will take n log(n) time to sort an array with mergesort, if you take the comparison operation to be the most important operation.
If anything is unclear or I skipped somewhere please let me know and I can try to be more verbose.
Edit: second explanation:
Let me think if I can explain this better.
The problem is broken into a bunch of smaller lists; the smaller lists are then sorted and merged until you return to the original list, which is now sorted.
When you break up the problem, you have several different levels of size. First you'll have two lists of size n/2, n/2; at the next level you'll have four lists of size n/4, n/4, n/4, n/4; at the next level you'll have eight lists of size n/8, ..., n/8; and this continues until n/2^k is equal to 1. (Each subdivision is the length divided by a power of 2; not every length will divide evenly, so it won't be quite this pretty.) This repeated division by two can continue at most log_2(n) times, because 2^(log_2(n)) = n, so any further division by 2 would take the lists below a single element.
Now the important thing to note is that at every level we have n elements in total, so for each level the merge will take n time, because merging is a linear operation. If there are log(n) levels of recursion, then we will perform this linear operation log(n) times; therefore, our running time will be n log(n).
Sorry if that isn't helpful either.
When applying a divide-and-conquer algorithm where you partition the problem into sub-problems until they are so simple that they are trivial, if the partitioning goes well, the size of each sub-problem is n/2 or thereabouts. This is often the origin of the log(n) that crops up in big-O complexity: O(log(n)) is the depth of the recursion when partitioning goes well.