Should we ignore constant k in O(nk)?

I was reading CLRS when I encountered this (Problem 2-1, which uses insertion sort on small sublists within merge sort):
Why do we not ignore the constant k in the big-O expressions in parts (a), (b), and (c)?

In this case, you aren't considering the running time of a single algorithm, but of a family of algorithms parameterized by k. Considering k lets you compare the extremes: sorting one list of n elements (k = n) versus sorting n/2 two-element lists (k = 2). Somewhere in the middle is the value of k you want to compute for part (c), where Θ(nk + n lg(n/k)) and Θ(n lg n) are equal.
Going into more detail, insertion sort is O(n^2) because (roughly speaking) in the worst case any single insertion can take O(n) time. However, if the sublists have a fixed length k, then each insertion takes at most O(k), independent of how many lists you are sorting. (That is, the bottleneck is no longer the insertion step, but the merge phase.)
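To make the setup concrete, here is a minimal sketch of such a hybrid sort (Python; the default cutoff k=16 and the function names are illustrative choices, not from CLRS):

    def insertion_sort(a, lo, hi):
        # Sort a[lo:hi] in place; O(k^2) worst case for a sublist of length k.
        for i in range(lo + 1, hi):
            x = a[i]
            j = i - 1
            while j >= lo and a[j] > x:
                a[j + 1] = a[j]
                j -= 1
            a[j + 1] = x

    def merge(left, right):
        # Standard two-way merge: O(len(left) + len(right)).
        out, i, j = [], 0, 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                out.append(left[i]); i += 1
            else:
                out.append(right[j]); j += 1
        return out + left[i:] + right[j:]

    def hybrid_merge_sort(a, k=16):
        # Phase 1: insertion-sort the n/k sublists of length k -> Theta(nk) total.
        n = len(a)
        for lo in range(0, n, k):
            insertion_sort(a, lo, min(lo + k, n))
        # Phase 2: merge runs bottom-up, doubling the run width each level.
        # There are lg(n/k) levels of n work each -> Theta(n lg(n/k)).
        width = k
        while width < n:
            for lo in range(0, n, 2 * width):
                mid = min(lo + width, n)
                hi = min(lo + 2 * width, n)
                a[lo:hi] = merge(a[lo:mid], a[mid:hi])
            width *= 2

The first phase is the Θ(nk) insertion-sort pass over the n/k sublists; the bottom-up merging then contributes the Θ(n lg(n/k)) term.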

k is not a constant when you compare different algorithms with different values of k.

Related

Counting Sort has a lower bound of O(n)

The running time of counting sort is Θ(n + k), where k represents the range of the input elements. If k = O(n), the algorithm is O(n).
Can I say that counting sort has a lower bound of Ω(n) because the algorithm takes O(n) time, and that this lower bound shows there is no hope of solving the problem in time better than Ω(n)?
Well, yes: since T(n, k) = Θ(n + k), we have T(n, k) = Ω(n + k). Since k is nonnegative, n + k = Ω(n), and so T(n, k) = Ω(n) as required.
Another perspective on why the lower bound is indeed Ω(n): if you want to sort an array of n elements, you need to at least look at all the array elements. If you don’t, you can’t form a sorted list of all the elements of the array because you won’t know what those array elements are. :-)
That gives an immediate Ω(n) lower bound for sorting any sequence of n elements, unless you can read multiple elements of the sequence at once (say, using parallelism or if the array elements are so small that you can read several with a single machine instruction.)
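For reference, a minimal counting sort sketch (Python; assumes nonnegative integer keys of value at most k):

    def counting_sort(a, k):
        # Sort integers in range [0, k]: Theta(n + k) time and space.
        count = [0] * (k + 1)          # Theta(k) to allocate
        for x in a:                    # Theta(n): every element must be read,
            count[x] += 1              # which is where the Omega(n) bound bites
        out = []
        for v, c in enumerate(count):  # Theta(n + k) overall
            out.extend([v] * c)
        return out

If k = O(n), both loops are O(n), matching the Θ(n + k) analysis above.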

Difference between O(log n) and O(n log n)

I am preparing for software development interviews, and I have always had trouble distinguishing between O(log n) and O(n log n). Can anyone explain with some examples, or share some resources with me? I don't have any code to show. I understand O(log n), but I haven't understood O(n log n).
Think of it as O(n*log(n)), i.e. "doing log(n) work n times". For example, searching for an element in a sorted list of length n is O(log(n)). Searching for the element in n different sorted lists, each of length n, is O(n*log(n)).
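For instance (Python; bisect provides the O(log n) binary search):

    from bisect import bisect_left

    def find_in_one(sorted_list, x):
        # One binary search: O(log n).
        i = bisect_left(sorted_list, x)
        return i < len(sorted_list) and sorted_list[i] == x

    def find_in_many(sorted_lists, x):
        # n binary searches, each O(log n): O(n log n) total.
        return [find_in_one(lst, x) for lst in sorted_lists]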
Remember that O(n) is defined relative to some real quantity n. This might be the size of a list, or the number of distinct elements in a collection. Therefore, every variable that appears inside O(...) represents a quantity that interacts to increase the runtime. O(n*m) can be read as n + n + ... + n (m times), which represents the same thing: "doing n work, m times".
Let's take a concrete example of this, mergesort. For n input elements: On the very last iteration of our sort, we have two halves of the input, each of size n/2, and each half is sorted. All we have to do is merge them together, which takes n operations. On the next-to-last iteration, we have twice as many pieces (4), each of size n/4. For each of our two pairs of size n/4, we merge the pair together, which takes n/2 operations for a pair (one for each element in the pair, just like before), i.e. n operations for the two pairs.
From here, we can extrapolate that every level of our mergesort takes n operations to merge. The big-O complexity is therefore n times the number of levels. On the last level, the size of the chunks we're merging is n/2. Before that, it's n/4, before that n/8, etc. all the way to size 1. How many times must you divide n by 2 to get 1? log(n). So we have log(n) levels. Therefore, our total runtime is O(n (work per level) * log(n) (number of levels)), n work log(n) times.
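To see the "number of levels" claim concretely, here is a toy check (Python; not part of the sort itself) that counts how many halvings it takes to get from n down to 1:

    def levels(n):
        # Number of times n can be halved before reaching 1: floor(log2(n)).
        count = 0
        while n > 1:
            n //= 2
            count += 1
        return count

    print(levels(8), levels(1024))  # prints: 3 10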

Big O analysis of Modified Merge Sort (divide into √n arrays, instead of 2)

I am working on a modified merge sort algorithm that uses a procedure similar to the one for merging two sorted arrays, but instead merges √n sorted arrays of size √n each. It starts with an array of size n, which is recursively divided into √n subproblems as stated above. The following algorithm is used:
1.) Divide array of n elements into √n pieces
2.) Pass elements back into method for recursion
3.) Compare pieces from step 1
4.) Merge components together to form sorted array
I am fairly certain this is the proper algorithm, but I am unsure how to find the Big O run time. Any guidance in the proper direction is greatly appreciated!
The key part is to find the complexity of the merging step. Assuming that an analogous method to that of the 2-way case is used:
Finding the minimum element out of all √n arrays is O(√n).
This needs to be done for all n elements to be merged; possible edge cases when some of the arrays are depleted only contribute a subtracted O(√n) in complexity.
Hence the complexity of merging is O(n√n), and the recurrence for the whole sort is:

    T(n) = √n * T(√n) + O(n√n)

Expanding the recurrence (writing n√n as n^(3/2)):

    T(n) = n^(1/2) * T(n^(1/2)) + n^(3/2)
         = n^(1/2 + 1/4) * T(n^(1/4)) + n^(1 + 1/4) + n^(3/2)                        (*)
         = n^(1/2 + 1/4 + 1/8) * T(n^(1/8)) + n^(1 + 1/8) + n^(1 + 1/4) + n^(3/2)    (*)

Where (*) marks an expansion of the T() terms. Spotting the pattern for the m-th expansion:

- The coefficient of the T-term is n raised to the sum of the powers 1/2, 1/4, ..., 1/2^m.
- The argument of the T-term is n^(1/2^m).
- The accumulated terms are the sum of n raised to 1 + 1/2^j, for j = 1 .. m.

Writing the above rules as a compact series:

    T(n) = n^(1 - 1/2^m) * T(n^(1/2^m)) + sum over j = 1..m of n^(1 + 1/2^j)    (*)
         = n^(1 - 1/2^m) * T(n^(1/2^m)) + Θ(n^(3/2))                            (**)

(*) used the standard formula for a geometric series: 1/2 + 1/4 + ... + 1/2^m = 1 - 1/2^m.
(**) notes that in a summation of powers of n, the highest power dominates; here that is 1 + 1/2. Assume the stopping condition is some small constant; since repeated square roots never actually reach 1, stop at n^(1/2^m) = 2, i.e. m = lg lg n:

    T(n) = n^(1 - 1/lg n) * T(2) + Θ(n^(3/2))

Note that n^(1 - 1/lg n) = 2^(lg n - 1) = n/2, so the first term is bounded from above by O(n), which is overshadowed by the second term.
The time complexity of √n-way merge-sort is therefore O(n^1.5), which is worse than the O(n log n) complexity of 2-way merge-sort.
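A sketch of the analyzed algorithm (Python; note that the linear scan over the √n run heads is exactly what makes each extraction O(√n), matching the analysis above — a heap-based k-way merge would instead give O(n log n)):

    import math

    def sqrt_merge_sort(a):
        n = len(a)
        if n <= 1:
            return a
        # Split into ~sqrt(n) pieces of ~sqrt(n) elements each and recurse.
        size = max(1, math.isqrt(n))
        runs = [sqrt_merge_sort(a[i:i + size]) for i in range(0, n, size)]
        # Merge by scanning every run head for the minimum: O(sqrt(n)) per
        # extracted element, hence O(n * sqrt(n)) for the whole merge step.
        heads = [0] * len(runs)
        out = []
        for _ in range(n):
            best = None
            for r, h in enumerate(heads):
                if h < len(runs[r]) and (best is None or runs[r][h] < runs[best][heads[best]]):
                    best = r
            out.append(runs[best][heads[best]])
            heads[best] += 1
        return out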

Algorithm that sorts n numbers from 0 to n^m in O(n), where m is a constant

So I came upon this question:
We have to sort n numbers between 0 and n^3, and the answer for the time complexity is O(n). The author solved it this way:
First we convert the base of these numbers to n in O(n); now we have numbers with at most 3 digits (because of n^3).
Now we use radix sort, and therefore the time is O(n).
So I have three questions:
1. Is this correct? And is it the best time possible?
2. How is it possible to convert the base of n numbers in O(n), i.e. O(1) for each number? Some previous topics on this website said it's O(M(n) log(n))?!
3. And if this is true, does it mean we can sort any n numbers from 0 to n^m in O(n)?!
(I searched about converting the base of n numbers; some said it's
O(log n) per number and some said it's O(n) for n numbers, so I got confused about this too.)
1) Yes, it's correct. It is the best complexity possible, because any sort has to at least look at all the numbers, and that alone takes Ω(n).
2) Yes, each number is converted to base-n in O(1). Simple ways to do this take O(m^2) in the number of digits, under the usual assumption that you can do arithmetic operations on numbers up to O(n) in O(1) time. m is constant so O(m^2) is O(1)... But really this step is just to say that the radix you use in the radix sort is in O(n). If you implemented this for real, you'd use the smallest power of 2 >= n so you wouldn't need these conversions.
3) Yes, if m is constant. The simplest way takes m passes in an LSB-first radix sort with a radix of around n. Each pass takes O(n) time, and the algorithm requires O(n) extra memory (measured in words that can hold n).
So the author is correct. In practice, though, this is usually approached from the other direction. If you're going to write a function that sorts machine integers, then at some large input size it's going to be faster if you switch to a radix sort. If W is the maximum integer size, then this tradeoff point will be when n >= 2^(W/m) for some constant m. This says the same thing as your constraint, but makes it clear that we're thinking about large-sized inputs only.
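A hedged sketch of that scheme (Python; stable bucket passes stand in for the counting-sort passes, and the base guard for tiny inputs is my own addition):

    def radix_sort_base_n(a, m=3):
        # Sort nonnegative integers, each less than n**m, in O(m * n) time.
        n = max(len(a), 2)  # use the input size as the radix (guard tiny inputs)
        for p in range(m):  # m passes, one per base-n digit, least significant first
            buckets = [[] for _ in range(n)]
            for x in a:
                buckets[(x // n**p) % n].append(x)  # digit extraction: O(1) per number
            a = [x for b in buckets for x in b]     # stable gather: O(n)
        return a

Each pass is stable, so after the m-th pass the list is sorted by all m base-n digits; with m constant, the total is O(n).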
There is a wrong assumption here: radix sort is not O(n).
As described on e.g. Wikipedia:

    if all n keys are distinct, then w has to be at least log n for a
    random-access machine to be able to store them in memory, which gives
    at best a time complexity O(n log n).

So the answer is no: the author's implementation is (at best) O(n log n). Converting these numbers can also probably take more than O(n).
is this correct?
Yes it's correct. If n is used as the base, then it will take 3 radix sort passes, where 3 is a constant, and since time complexity ignores constant factors, it's O(n).
and the best time possible?
Not always. Depending on the maximum value of n, a larger base could be used so that the sort is done in 2 radix sort passes or 1 counting sort pass.
how is it possible to convert the base of n numbers in O(n)? like O(1) for each number?
O(1) just means a constant time complexity == fixed number of operations per number. It doesn't matter if the method chosen is not the fastest if only time complexity is being considered. For example, using a, b, c to represent most to least significant digits and x as the number, then using integer math: a = x/(n^2), b = (x-(a*n^2))/n, c = x%n (assumes x >= 0). (side note - if n is a constant, then an optimizing compiler may convert the divisions into a multiply and shift sequence).
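In code, those formulas look like this (a Python sketch; assumes 0 <= x < n^3):

    def base_n_digits(x, n):
        # Most to least significant base-n digits of x, O(1) per number.
        a = x // n**2
        b = (x - a * n**2) // n
        c = x % n
        return a, b, c

    # e.g. base_n_digits(26, 3) == (2, 2, 2), since 2*9 + 2*3 + 2 == 26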
and if this is true, then it means we can sort any n numbers from 0 to n^m in O(n) ?!
Only if m is considered a constant. Otherwise it's O(m n).

log(n) vs log(k) in runtime of an algorithm with k < n

I am having trouble understanding the difference between log(k) and log(n) in complexity analysis.
I have an array of size n. I have another number k < n that is an input of the algorithm (so it's not a constant known ahead of time). What are some examples of algorithms that would have log(n) vs those that would have log(k) in their complexity? I can only think of algorithms that have log(n) in their complexity.
For example, mergesort has log(n) in its runtime analysis (O(n log n)).
If your algorithm takes a list of size n and a number of magnitude k < n, the input size is on the order of n + log(k) (assuming k may be of the same asymptotic order as n). Why? Because k is a number represented in a place-value system (e.g., binary or decimal), and a number of magnitude k requires on the order of log k digits to represent.
Therefore, if your algorithm takes an input k and uses it in a way that requires all its digits to be used or checked (e.g., equality is being checked, etc.), then the complexity of the whole algorithm is at least on the order of log k. If you do more complicated things with the number, the complexity could be even higher. For instance, if you have something like for i = 1 to k do ..., the complexity of your algorithm is at least k, and maybe higher, since you're comparing to a log k-bit number k times (although i will use fewer bits than k for many values of i, depending on the base).
There's no "one-size-fits-all" explanation as to where an O(log k) term might come up.
You sometimes see this runtime arise in searching and sorting algorithms where you only need to rearrange some small part of the sequence. For example, the C++ standard library's std::partial_sort function rearranges the sequence so that the first k elements are in sorted order and the remainder are in arbitrary order, in time O(n log k). One way this could be implemented is to maintain a min-heap of size at most k and do n insertions/deletions on it, which is n operations that each take time O(log k). Similarly, there's an O(n log k)-time algorithm for finding the k largest elements in a data stream, which works by maintaining a max-heap of at most k elements.
(Neither of these approaches is optimal, though - you can do a partial sort in time O(n + k log k) using a linear-time selection algorithm, and can similarly find the top k elements of a data stream in O(n).)
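A sketch of the heap idea (Python; heapq maintains a min-heap of the k largest elements seen so far, so each of the n updates costs O(log k)):

    import heapq

    def top_k(stream, k):
        # O(n log k): each push/replace on a heap of size <= k costs O(log k).
        heap = []  # min-heap holding the k largest elements seen so far
        for x in stream:
            if len(heap) < k:
                heapq.heappush(heap, x)
            elif x > heap[0]:
                heapq.heapreplace(heap, x)  # pop the smallest, push x: O(log k)
        return sorted(heap, reverse=True)   # extra O(k log k) to report in order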
You also sometimes see this runtime in algorithms that involve a divide-and-conquer strategy where the size of the problem in question depends on some parameter of the input size. For example, the Kirkpatrick-Seidel algorithm for computing a convex hull does linear work per level in a recurrence, and the number of layers is O(log k), where k is the number of elements in the resulting convex hull. The total work is then O(n log k), making it an output-sensitive algorithm.
In some cases, an O(log k) term can arise because you are processing a collection of elements one digit at a time. For example, radix sort has a runtime of O(n log k) when used to sort n values that range from 0 to k, inclusive, with the log k term arising from the fact that there are O(log k) digits in the number k.
In graph algorithms, where the number of edges (m) is related to but can be independent of the number of nodes (n), you often see runtimes like O(m log n), as is the case if you implement Dijkstra's algorithm with a binary heap.
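For instance, a lazy-deletion Dijkstra sketch (Python; the graph is assumed to be an adjacency-list dict, so there are O(m) heap pushes, each costing O(log n)):

    import heapq

    def dijkstra(graph, source):
        # graph: {u: [(v, weight), ...]}; O(m log n) with a binary heap.
        dist = {source: 0}
        pq = [(0, source)]
        while pq:
            d, u = heapq.heappop(pq)
            if d > dist.get(u, float('inf')):
                continue  # stale queue entry (lazy deletion)
            for v, w in graph.get(u, []):
                nd = d + w
                if nd < dist.get(v, float('inf')):
                    dist[v] = nd
                    heapq.heappush(pq, (nd, v))  # O(log n) per edge relaxation
        return dist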
