Big O Recursive Method - big-o

I have a method called binary sum
Algorithm BinarySum(A, i, n):
Input: An array A and integers i and n
Output: The sum of the n integers in A starting at index i
if n = 1 then
return A[i]
return BinarySum(A, i, n/ 2) + BinarySum(A, i + n/ 2, n/ 2)
Ignoring the fact of making a simple problem complicated I have been asked to find the Big O. Here is my thought process. For an array of size N I will be making 1 + 2 + 4 .. + N recursive calls. This is close to half the sum from 1 to N so I will say it is about N(N + 1)/4. After making this many calls now I need to add them together. So once again I need to perform N(N+1)/4 additions. Adding them together we are left with N^2 as the dominate term.
So would the big O of this algorithm be O(N^2)? Or am I doing something wrong. It feels strange to have binary recursion and not have a 2^n or log n in the final answer

There are in-fact 2^n and log n terms in the final result... sort of.
For each call to a sub-array of length n, two recursive calls are made to both halves of this array, plus a constant amount of work (if-statement, addition, pushing onto the call stack etc). Thus the recurrence relation is given by:
At this point we could just use the Master theorem to directly arrive at the final result - O(n). But let's instead derive it by repeated expansion:
The stopping condition n = 1 gives the maximum value of m (ignoring rounding):
In step (*) we used the standard formula for geometric series. So as you can see the answer does involve log n and 2^n terms in a sense, but they "cancel" out to give a simple linear term, which is the same as for a simple loop.

Related

Counting primitive operations on recursive functions

I'm reading Algorithm Design and Applications, by Michael T. Goodrich and Roberto Tamassia, published by Wiley. They teach the concept of primitive operations and how to count then in a given algorithm. Everything was clear to me until the moment they showed a recursive function (a simple recursive way to calculate the maximum value of an array) and its primitive operation count.
The function (in pseudo-code) is this:
Algorithm recursiveMax(A, n):
Input: An array A storing n ≥ 1 integers.
Output: The maximum element in A.
if n = 1 then
return A[0]
return max{recursiveMax(A, n − 1), A[n − 1]}
where A is an array and n its length. The author states what follows concerning how we calculate the number of primitive operations this function has:
As with this example, recursive algorithms are often quite elegant. Analyzing
the running time of a recursive algorithm takes a bit of additional work, however.
In particular, to analyze such a running time, we use a recurrence equation, which
defines mathematical statements that the running time of a recursive algorithm must
satisfy. We introduce a function T (n) that denotes the running time of the algorithm
on an input of size n, and we write equations that T (n) must satisfy. For example,
we can characterize the running time, T (n), of the recursiveMax algorithm as T(n) = 3 if n = 1 or T(n - 1) + 7 otherwise, assuming that we count each comparison, array reference, recursive call, max calculation, or return as a single primitive operation. Ideally, we would like to characterize a recurrence equation like that above in closed form, where no references to the function T appear on the righthand side. For the recursiveMax algorithm, it isn’t too hard to see that a closed form would be T (n) = 7(n − 1) + 3 = 7n − 4.
I can clearly understand that in the case of a single item array, our T(n) would be just 3 (only 3 primitive operations will occur, i.e. the comparision n = 1, the array index A[0] and the return operation), but I cannot understand why in the case where n is not 1 we have T(n-1) + 7. Why + 7? From where did we get this constant?
Also, I cannot comprehend this closed form: how did he get that T(n) = 7(n - 1) + 3?
I appreciate any help.

Binary vs Linear searches for unsorted N elements

I try to understand a formula when we should use quicksort. For instance, we have an array with N = 1_000_000 elements. If we will search only once, we should use a simple linear search, but if we'll do it 10 times we should use sort array O(n log n). How can I detect threshold when and for which size of input array should I use sorting and after that use binary search?
You want to solve inequality that rougly might be described as
t * n > C * n * log(n) + t * log(n)
where t is number of checks and C is some constant for sort implementation (should be determined experimentally). When you evaluate this constant, you can solve inequality numerically (with uncertainty, of course)
Like you already pointed out, it depends on the number of searches you want to do. A good threshold can come out of the following statement:
n*log[b](n) + x*log[2](n) <= x*n/2 x is the number of searches; n the input size; b the base of the logarithm for the sort, depending on the partitioning you use.
When this statement evaluates to true, you should switch methods from linear search to sort and search.
Generally speaking, a linear search through an unordered array will take n/2 steps on average, though this average will only play a big role once x approaches n. If you want to stick with big Omicron or big Theta notation then you can omit the /2 in the above.
Assuming n elements and m searches, with crude approximations
the cost of the sort will be C0.n.log n,
the cost of the m binary searches C1.m.log n,
the cost of the m linear searches C2.m.n,
with C2 ~ C1 < C0.
Now you compare
C0.n.log n + C1.m.log n vs. C2.m.n
or
C0.n.log n / (C2.n - C1.log n) vs. m
For reasonably large n, the breakeven point is about C0.log n / C2.
For instance, taking C0 / C2 = 5, n = 1000000 gives m = 100.
You should plot the complexities of both operations.
Linear search: O(n)
Sort and binary search: O(nlogn + logn)
In the plot, you will see for which values of n it makes sense to choose the one approach over the other.
This actually turned into an interesting question for me as I looked into the expected runtime of a quicksort-like algorithm when the expected split at each level is not 50/50.
the first question I wanted to answer was for random data, what is the average split at each level. It surely must be greater than 50% (for the larger subdivision). Well, given an array of size N of random values, the smallest value has a subdivision of (1, N-1), the second smallest value has a subdivision of (2, N-2) and etc. I put this in a quick script:
split = 0
for x in range(10000):
split += float(max(x, 10000 - x)) / 10000
split /= 10000
print split
And got exactly 0.75 as an answer. I'm sure I could show that this is always the exact answer, but I wanted to move on to the harder part.
Now, let's assume that even 25/75 split follows an nlogn progression for some unknown logarithm base. That means that num_comparisons(n) = n * log_b(n) and the question is to find b via statistical means (since I don't expect that model to be exact at every step). We can do this with a clever application of least-squares fitting after we use a logarithm identity to get:
C(n) = n * log(n) / log(b)
where now the logarithm can have any base, as long as log(n) and log(b) use the same base. This is a linear equation just waiting for some data! So I wrote another script to generate an array of xs and filled it with C(n) and ys and filled it with n*log(n) and used numpy to tell me the slope of that least squares fit, which I expect to equal 1 / log(b). I ran the script and got b inside of [2.16, 2.3] depending on how high I set n to (I varied n from 100 to 100'000'000). The fact that b seems to vary depending on n shows that my model isn't exact, but I think that's okay for this example.
To actually answer your question now, with these assumptions, we can solve for the cutoff point of when: N * n/2 = n*log_2.3(n) + N * log_2.3(n). I'm just assuming that the binary search will have the same logarithm base as the sorting method for a 25/75 split. Isolating N you get:
N = n*log_2.3(n) / (n/2 - log_2.3(n))
If your number of searches N exceeds the quantity on the RHS (where n is the size of the array in question) then it will be more efficient to sort once and use binary searches on that.

time complexity to find k elements in unsorted array using quick partition [duplicate]

According to Wikipedia, partition-based selection algorithms such as quickselect have runtime of O(n), but I am not convinced by it. Can anyone explain why it is O(n)?
In the normal quick-sort, the runtime is O(n log n). Every time we partition the branch into two branches (greater than the pivot and lesser than the pivot), we need to continue the process in both branches, whereas quickselect only needs to process one branch. I totally understand these points.
However, if you think in the Binary Search algorithm, after we chose the middle element, we are also searching only one side of the branch. So does that make the algorithm O(1)? No, of course, the Binary Search Algorithm is still O(log N) instead of O(1). This is also the same thing as the search element in a Binary Search Tree. We only search for one side, but we still consider O(log n) instead of O(1).
Can someone explain why in quickselect, if we continue the search in one side of pivot, it is considered O(1) instead of O(log n)? I consider the algorithm to be O(n log n), O(N) for the partitioning, and O(log n) for the number of times to continue finding.
There are several different selection algorithms, from the much simpler quickselect (expected O(n), worst-case O(n2)) to the more complex median-of-medians algorithm (Θ(n)). Both of these algorithms work by using a quicksort partitioning step (time O(n)) to rearrange the elements and position one element into its proper position. If that element is at the index in question, we're done and can just return that element. Otherwise, we determine which side to recurse on and recurse there.
Let's now make a very strong assumption - suppose that we're using quickselect (pick the pivot randomly) and on each iteration we manage to guess the exact middle of the array. In that case, our algorithm will work like this: we do a partition step, throw away half of the array, then recursively process one half of the array. This means that on each recursive call we end up doing work proportional to the length of the array at that level, but that length keeps decreasing by a factor of two on each iteration. If we work out the math (ignoring constant factors, etc.) we end up getting the following time:
Work at the first level: n
Work after one recursive call: n / 2
Work after two recursive calls: n / 4
Work after three recursive calls: n / 8
...
This means that the total work done is given by
n + n / 2 + n / 4 + n / 8 + n / 16 + ... = n (1 + 1/2 + 1/4 + 1/8 + ...)
Notice that this last term is n times the sum of 1, 1/2, 1/4, 1/8, etc. If you work out this infinite sum, despite the fact that there are infinitely many terms, the total sum is exactly 2. This means that the total work is
n + n / 2 + n / 4 + n / 8 + n / 16 + ... = n (1 + 1/2 + 1/4 + 1/8 + ...) = 2n
This may seem weird, but the idea is that if we do linear work on each level but keep cutting the array in half, we end up doing only roughly 2n work.
An important detail here is that there are indeed O(log n) different iterations here, but not all of them are doing an equal amount of work. Indeed, each iteration does half as much work as the previous iteration. If we ignore the fact that the work is decreasing, you can conclude that the work is O(n log n), which is correct but not a tight bound. This more precise analysis, which uses the fact that the work done keeps decreasing on each iteration, gives the O(n) runtime.
Of course, this is a very optimistic assumption - we almost never get a 50/50 split! - but using a more powerful version of this analysis, you can say that if you can guarantee any constant factor split, the total work done is only some constant multiple of n. If we pick a totally random element on each iteration (as we do in quickselect), then on expectation we only need to pick two elements before we end up picking some pivot element in the middle 50% of the array, which means that, on expectation, only two rounds of picking a pivot are required before we end up picking something that gives a 25/75 split. This is where the expected runtime of O(n) for quickselect comes from.
A formal analysis of the median-of-medians algorithm is much harder because the recurrence is difficult and not easy to analyze. Intuitively, the algorithm works by doing a small amount of work to guarantee a good pivot is chosen. However, because there are two different recursive calls made, an analysis like the above won't work correctly. You can either use an advanced result called the Akra-Bazzi theorem, or use the formal definition of big-O to explicitly prove that the runtime is O(n). For a more detailed analysis, check out "Introduction to Algorithms, Third Edition" by Cormen, Leisserson, Rivest, and Stein.
Let me try to explain the difference between selection & binary search.
Binary search algorithm in each step does O(1) operations. Totally there are log(N) steps and this makes it O(log(N))
Selection algorithm in each step performs O(n) operations. But this 'n' keeps on reducing by half each time. There are totally log(N) steps.
This makes it N + N/2 + N/4 + ... + 1 (log(N) times) = 2N = O(N)
For binary search it is 1 + 1 + ... (log(N) times) = O(logN)
In Quicksort, the recursion tree is lg(N) levels deep and each of these levels requires O(N) amount of work. So the total running time is O(NlgN).
In Quickselect, the recurision tree is lg(N) levels deep and each level requires only half the work of the level above it. This produces the following:
N * (1/1 + 1/2 + 1/4 + 1/8 + ...)
or
N * Summation(1/i^2)
1 < i <= lgN
The important thing to note here is that i goes from 1 to lgN, but not from 1 to N and also not from 1 to infinity.
The summation evaluates to 2. Hence Quickselect = O(2N).
Quicksort does not have a big-O of nlogn - it's worst case runtime is n^2.
I assume you're asking about Hoare's Selection Algorithm (or quickselect) not the naive selection algorithm that is O(kn). Like quicksort, quickselect has a worst case runtime of O(n^2) (if bad pivots are chosen), not O(n). It can run in expectation time n because it's only sorting one side, as you point out.
Because for selection, you're not sorting, necessarily. You can simply count how many items there are which have any given value. So an O(n) median can be performed by counting how many times each value comes up, and picking the value that has 50% of items above and below it. It's 1 pass through the array, simply incrementing a counter for each element in the array, so it's O(n).
For example, if you have an array "a" of 8 bit numbers, you can do the following:
int histogram [ 256 ];
for (i = 0; i < 256; i++)
{
histogram [ i ] = 0;
}
for (i = 0; i < numItems; i++)
{
histogram [ a [ i ] ]++;
}
i = 0;
sum = 0;
while (sum < (numItems / 2))
{
sum += histogram [ i ];
i++;
}
At the end, the variable "i" will contain the 8-bit value of the median. It was about 1.5 passes through the array "a". Once through the entire array to count the values, and half through it again to get the final value.

Choosing minimum length k of array for merge sort where use of insertion sort to sort the subarrays is more optimal than standard merge sort

This is a question from Introduction to Algorithms By Cormen. But this isn't a homework problem instead self-study.
There is an array of length n. Consider a modification to merge sort in which n/k sublists each of length k are sorted using insertion sort and then merged using merging mechanism, where k is a value to be determined.
The relationship between n and k isn't known. The length of array is n. k sublists of n/k means n * (n/k) equals n elements of the array. Hence k is simply a limit at which the splitting of array for use with merge-sort is stopped and instead insertion-sort is used because of its smaller constant factor.
I was able to do the mathematical proof that the modified algorithm works in Θ(n*k + n*lg(n/k)) worst-case time. Now the book went on to say to
find the largest value of k as a function of n for which this modified algorithm has the same running time as standard merge sort, in terms of Θ notation. How should we choose k in practice?
Now this got me thinking for a lot of time but I couldn't come up with anything. I tried to solve
n*k + n*lg(n/k) = n*lg(n) for a relationship. I thought that finding an equality for the 2 running times would give me the limit and greater can be checked using simple hit-and-trial.
I solved it like this
n k + n lg(n/k) = n lg(n)
k + lg(n/k) = lg(n)
lg(2^k) + lg(n/k) = lg(n)
(2^k * n)/k = n
2^k = k
But it gave me 2 ^ k = k which doesn't show any relationship. What is the relationship? I think I might have taken the wrong equation for finding the relationship.
I can implement the algorithm and I suppose adding an if (length_Array < k) statement in the merge_sort function here(Github link of merge sort implementation) for calling insertion sort would be good enough. But how do I choose k in real life?
Well, this is a mathematical minimization problem, and to solve it, we need some basic calculus.
We need to find the value of k for which d[n*k + n*lg(n/k)] / dk == 0.
We should also check for the edge cases, which are k == n, and k == 1.
The candidate for the value of k that will give the minimal result for n*k + n*lg(n/k) is the minimum in the required range, and is thus the optimal value of k.
Attachment, solving the derivitives equation:
d[n*k + n*lg(n/k)] / dk = d[n*k + nlg(n) - nlg(k)] / dk
= n + 0 - n*1/k = n - n/k
=>
n - n/k = 0 => n = n/k => 1/k = 1 => k = 1
Now, we have the candidates: k=n, k=1. For k=n we get O(n^2), thus we conclude optimal k is k == 1.
Note that we found the derivitives on the function from the big Theta, and not on the exact complexity function that uses the needed constants.
Doing this on the exact complexity function, with all the constants might yield a bit different end result - but the way to solve it is pretty much the same, only take derivitives from a different function.
maybe k should be lg(n)
theta(nk + nlog(n/k)) have two terms, we have the assumption that k>=1, so the second term is less than nlog(n).
only when k=lg(n), the whole result is theta(nlog(n))

Why is the runtime of the selection algorithm O(n)?

According to Wikipedia, partition-based selection algorithms such as quickselect have runtime of O(n), but I am not convinced by it. Can anyone explain why it is O(n)?
In the normal quick-sort, the runtime is O(n log n). Every time we partition the branch into two branches (greater than the pivot and lesser than the pivot), we need to continue the process in both branches, whereas quickselect only needs to process one branch. I totally understand these points.
However, if you think in the Binary Search algorithm, after we chose the middle element, we are also searching only one side of the branch. So does that make the algorithm O(1)? No, of course, the Binary Search Algorithm is still O(log N) instead of O(1). This is also the same thing as the search element in a Binary Search Tree. We only search for one side, but we still consider O(log n) instead of O(1).
Can someone explain why in quickselect, if we continue the search in one side of pivot, it is considered O(1) instead of O(log n)? I consider the algorithm to be O(n log n), O(N) for the partitioning, and O(log n) for the number of times to continue finding.
There are several different selection algorithms, from the much simpler quickselect (expected O(n), worst-case O(n2)) to the more complex median-of-medians algorithm (Θ(n)). Both of these algorithms work by using a quicksort partitioning step (time O(n)) to rearrange the elements and position one element into its proper position. If that element is at the index in question, we're done and can just return that element. Otherwise, we determine which side to recurse on and recurse there.
Let's now make a very strong assumption - suppose that we're using quickselect (pick the pivot randomly) and on each iteration we manage to guess the exact middle of the array. In that case, our algorithm will work like this: we do a partition step, throw away half of the array, then recursively process one half of the array. This means that on each recursive call we end up doing work proportional to the length of the array at that level, but that length keeps decreasing by a factor of two on each iteration. If we work out the math (ignoring constant factors, etc.) we end up getting the following time:
Work at the first level: n
Work after one recursive call: n / 2
Work after two recursive calls: n / 4
Work after three recursive calls: n / 8
...
This means that the total work done is given by
n + n / 2 + n / 4 + n / 8 + n / 16 + ... = n (1 + 1/2 + 1/4 + 1/8 + ...)
Notice that this last term is n times the sum of 1, 1/2, 1/4, 1/8, etc. If you work out this infinite sum, despite the fact that there are infinitely many terms, the total sum is exactly 2. This means that the total work is
n + n / 2 + n / 4 + n / 8 + n / 16 + ... = n (1 + 1/2 + 1/4 + 1/8 + ...) = 2n
This may seem weird, but the idea is that if we do linear work on each level but keep cutting the array in half, we end up doing only roughly 2n work.
An important detail here is that there are indeed O(log n) different iterations here, but not all of them are doing an equal amount of work. Indeed, each iteration does half as much work as the previous iteration. If we ignore the fact that the work is decreasing, you can conclude that the work is O(n log n), which is correct but not a tight bound. This more precise analysis, which uses the fact that the work done keeps decreasing on each iteration, gives the O(n) runtime.
Of course, this is a very optimistic assumption - we almost never get a 50/50 split! - but using a more powerful version of this analysis, you can say that if you can guarantee any constant factor split, the total work done is only some constant multiple of n. If we pick a totally random element on each iteration (as we do in quickselect), then on expectation we only need to pick two elements before we end up picking some pivot element in the middle 50% of the array, which means that, on expectation, only two rounds of picking a pivot are required before we end up picking something that gives a 25/75 split. This is where the expected runtime of O(n) for quickselect comes from.
A formal analysis of the median-of-medians algorithm is much harder because the recurrence is difficult and not easy to analyze. Intuitively, the algorithm works by doing a small amount of work to guarantee a good pivot is chosen. However, because there are two different recursive calls made, an analysis like the above won't work correctly. You can either use an advanced result called the Akra-Bazzi theorem, or use the formal definition of big-O to explicitly prove that the runtime is O(n). For a more detailed analysis, check out "Introduction to Algorithms, Third Edition" by Cormen, Leisserson, Rivest, and Stein.
Let me try to explain the difference between selection & binary search.
Binary search algorithm in each step does O(1) operations. Totally there are log(N) steps and this makes it O(log(N))
Selection algorithm in each step performs O(n) operations. But this 'n' keeps on reducing by half each time. There are totally log(N) steps.
This makes it N + N/2 + N/4 + ... + 1 (log(N) times) = 2N = O(N)
For binary search it is 1 + 1 + ... (log(N) times) = O(logN)
In Quicksort, the recursion tree is lg(N) levels deep and each of these levels requires O(N) amount of work. So the total running time is O(NlgN).
In Quickselect, the recurision tree is lg(N) levels deep and each level requires only half the work of the level above it. This produces the following:
N * (1/1 + 1/2 + 1/4 + 1/8 + ...)
or
N * Summation(1/i^2)
1 < i <= lgN
The important thing to note here is that i goes from 1 to lgN, but not from 1 to N and also not from 1 to infinity.
The summation evaluates to 2. Hence Quickselect = O(2N).
Quicksort does not have a big-O of nlogn - it's worst case runtime is n^2.
I assume you're asking about Hoare's Selection Algorithm (or quickselect) not the naive selection algorithm that is O(kn). Like quicksort, quickselect has a worst case runtime of O(n^2) (if bad pivots are chosen), not O(n). It can run in expectation time n because it's only sorting one side, as you point out.
Because for selection, you're not sorting, necessarily. You can simply count how many items there are which have any given value. So an O(n) median can be performed by counting how many times each value comes up, and picking the value that has 50% of items above and below it. It's 1 pass through the array, simply incrementing a counter for each element in the array, so it's O(n).
For example, if you have an array "a" of 8 bit numbers, you can do the following:
int histogram [ 256 ];
for (i = 0; i < 256; i++)
{
histogram [ i ] = 0;
}
for (i = 0; i < numItems; i++)
{
histogram [ a [ i ] ]++;
}
i = 0;
sum = 0;
while (sum < (numItems / 2))
{
sum += histogram [ i ];
i++;
}
At the end, the variable "i" will contain the 8-bit value of the median. It was about 1.5 passes through the array "a". Once through the entire array to count the values, and half through it again to get the final value.

Resources