Is the big-O complexity of these functions correct?

I am learning about algorithm complexity, and I just want to verify my understanding is correct.
1) T(n) = 2n + 1 = O(n)
This is because we drop the constant factor 2 and the additive constant 1, and we are left with n. Therefore, we have O(n).
2) T(n) = n * n - 100 = O(n^2)
This is because we drop the constant term -100 and are left with n * n, which is n^2. Therefore, we have O(n^2).
Am I correct?

Basically, you have these different levels determined by the "dominant" term of your function, starting from the lowest complexity:
O(1) if your function only contains constants
O(log(n)) if the dominant part is a logarithm (log, ln, ...)
O(n^p) if the dominant part is polynomial and the highest power is p (e.g. O(n^3) for T(n) = n*(3n^2 + 1) - 3)
O(p^n) if the dominant part is a fixed number raised to the n-th power (e.g. O(3^n) for T(n) = 3 + n^99 + 2*3^n)
O(n!) if the dominant part is factorial
and so on...
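As a quick sanity check on the two examples from the question, here is a small Python sketch showing why the dropped terms stop mattering as n grows: the ratio T(n)/n settles near 2 for T(n) = 2n + 1, and T(n)/n^2 settles near 1 for T(n) = n*n - 100.

for n in (10, 100, 1_000, 10_000):
    t1 = 2 * n + 1                     # T(n) = 2n + 1
    t2 = n * n - 100                   # T(n) = n*n - 100
    print(n, t1 / n, t2 / n ** 2)      # ratios approach 2 and 1 respectively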

Related

Complexity of f(k) when f(n) = O(n!) and k=n*(n-1)

I have the following problem. Let's suppose we have function f(n). Complexity of f(n) is O(n!). However, there is also parameter k=n*(n-1). My question is - what is the complexity of f(k)? Is it f(k)=O(k!/k^2) or something like that, taking into consideration that there is a quadratic relation between k and n?
Computational complexity is interpreted based on the size of the input. Hence, if f(n) = O(n!) when your input is n, then f(k) = O(k!) when your input is k.
Therefore, you don't need to recompute the complexity for each particular value of the input. For example, even though 10 = 5 * 2, you don't write the complexity of f(10) as O((5*2)!) and then try to simplify it in terms of 2!; you simply say f(10) = O(10!).
Anyhow, if you do want to rewrite it in terms of n:
(n*(n-1))! = (n^2 - n)! = (n^2)! / [(n^2 - n + 1)(n^2 - n + 2) ... (n^2)]
The denominator is a product of n factors, each of them at least n^2 - n, so in particular (n^2 - n)! = O((n^2)! / n^(2.9)).
Did you consider that there is an m such that the n you used in your f(n) is equal to m * (m - 1)?
Does that change the complexity?
The n in f(n) = O(n!) represents all the valid inputs.
You are trying to pass a variable k whose actual value in terms of another variable is n * (n - 1). That does not change the complexity. It will still be O(k!).

Calculating execution time of an algorithm

I have this algorithm:
S(n)
  if n = 1 then return(0)
  else
    S(n/3)
    x <- 0
    while x <= 3n^3 do
      x <- x + 3
    S(n/3)
Is 2 * T(n/3) + n^3 the recurrence relation?
Is T(n) = O(n^3) the execution time?
The recurrence expression is correct. The time complexity of the algorithm is O(n^3).
The recurrence stops at T(1).
Running an example for n = 27 helps derive a general expression:
T(n) = 2*T(n/3)+n^3 =
= 2*(2*T(n/9)+(n/3)^3)+n^3 =
= 2*(2*(2*T(n/27)+(n/9)^3)+(n/3)^3)+n^3 =
= ... =
= 2*(2*2*T(n/27)+2*(n/9)^3+(n/3)^3)+n^3 =
= 2*2*2*T(n/27)+2*2*(n/9)^3+2*(n/3)^3+n^3
From this example we can see that, after k expansions, the general expression is given by:
T(n) = 2^k * T(n/3^k) + sum_{i=0}^{k-1} 2^i * (n/3^i)^3
Which, taking k = log_3(n) so that n/3^k = 1, is equivalent to:
T(n) = 2^(log_3(n)) * T(1) + n^3 * sum_{i=0}^{log_3(n)-1} (2/27)^i
Which, in turn, can be solved (as a geometric series) to the following closed form:
T(n) = 2^(log_3(n)) * T(1) + (27/25) * (n^3 - 2^(log_3(n)))
The dominating term in this expression is (27/25)*n^3: the other term involves 2^(log_3(n)) = n^(log_3(2)) ≈ n^0.63, which is O(n) and therefore grows far more slowly than n^3. Thus, the recurrence is O(n^3).
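If you want to double-check that closed form, here is a short numeric sketch (taking T(1) = 1 and restricting n to powers of 3 so that n/3 stays an integer); the recurrence and the closed form produce the same values, up to floating-point rounding:

import math

def T(n):
    return 1 if n == 1 else 2 * T(n // 3) + n ** 3   # the recurrence, with T(1) = 1

def closed_form(n):
    k = round(math.log(n, 3))                        # number of expansion levels, log_3(n)
    return 2 ** k + (27 / 25) * (n ** 3 - 2 ** k)

for n in (3, 9, 27, 81, 243):
    print(n, T(n), closed_form(n))                   # the two values agree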
2 * T(n/3) + n^3
Yes, I think this is a correct recurrence relation.
Time complexity:
while x <= 3n^3 do
  x <- x + 3
This has a time complexity of O(n^3). Also, at each step, the function calls itself twice with n/3. So the sizes form the series
n, n/3, n/9, ...
The total complexity, adding up the work at each depth, is
n^3 + (2/27)*n^3 + (4/729)*n^3 + ...
This series is bounded by k*n^3, where k is a constant.
Proof: the ratio between consecutive terms is 2/27, which is less than 1/2, so the series is bounded term by term by a geometric series with ratio 1/2, whose sum is 2*n^3. Hence the total is less than 2*n^3.
So in my opinion the complexity = O(n^3).
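To make the argument concrete, here is a direct Python transcription of S(n) that counts how many times the x <- x + 3 step runs; dividing the count by n^3 approaches the constant 27/25 = 1.08, consistent with the O(n^3) bound:

def S(n, counter):
    # counter is a one-element list used as a mutable tally of loop iterations
    if n <= 1:
        return 0
    S(n // 3, counter)
    x = 0
    while x <= 3 * n ** 3:
        x += 3
        counter[0] += 1
    S(n // 3, counter)
    return 0

for n in (9, 27, 81):
    counter = [0]
    S(n, counter)
    print(n, counter[0], counter[0] / n ** 3)   # ratio tends to 27/25 = 1.08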

Mergesort: How does the computational complexity vary when changing the split point?

I have to do an exercise from my algorithms book. Suppose a mergesort is implemented to split an array at a fraction α, where α ranges from 0.1 to 0.9.
This is the original method to calculate the split point
middle = fromIndex + (toIndex - fromIndex)/2;
I would like to change it to this:
factor = 0.1; //varies in range from 0.1 to 0.9
middle = fromIndex + (toIndex - fromIndex)*factor;
So my questions are:
Does this impact the computational complexity?
What's the impact on recursion tree depths?
This does change the actual complexity, but not the asymptotic complexity.
If you think about the new recurrence relation you'll get, it will be
T(1) = 1
T(n) = T(αn) + T((1 - α)n) + Θ(n)
Looking over the recursion tree, each level of the tree still has a total of Θ(n) work, but the number of levels will be greater. Specifically, let's suppose that 0.5 ≤ α < 1. Then after k recursive calls, the largest block remaining in the recursion has size n * α^k. The recursion bottoms out when this hits size one. Solving, we get:
n * α^k = 1
α^k = 1/n
k log α = -log n
k = -log n / log α
k = log n / log(1/α)
In other words, varying α varies the constant factor on the logarithmic term of the depth of the recursion. The above equation is minimized when α = 0.5 (since we are subject to the restriction that α ≥ 0.5), so this would be the optimal way to split. However, picking other splits still gives runtime Θ(n log n), though with a higher constant term.
Hope this helps!
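Here is a small sketch that measures the depth of this recursion for several values of α and compares it with log(n) / log(1/α); the helper name and the rounding rule are just for illustration:

import math

def depth(n, alpha):
    """Recursion depth when a range of size n splits into pieces of size roughly
    alpha*n and (1-alpha)*n; for alpha >= 0.5 the larger piece drives the depth,
    so it is enough to follow it."""
    if n <= 1:
        return 0
    larger = max(1, min(n - 1, round(alpha * n)))   # guard so the piece always shrinks
    return 1 + depth(larger, alpha)

n = 1_000_000
for alpha in (0.5, 0.6, 0.7, 0.8, 0.9):
    predicted = math.log(n) / math.log(1 / alpha)
    print(alpha, depth(n, alpha), round(predicted))  # measured depth roughly matches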

Worst Case Performance of Quicksort

I am trying to prove the following worst-case scenario for the Quicksort algorithm but am having some trouble. Initially, we have an array of size n, where n = ij. The idea is that at every partition step of Quicksort, you end up with two sub-arrays where one is of size i and the other is of size i(j-1). i in this case is an integer constant greater than 0. I have drawn out the recursive tree of some examples and understand why this is a worst-case scenario and that the running time will be theta(n^2). To prove this, I've used the iteration method to solve the recurrence equation:
T(n) = T(ij) = m if j = 1
T(n) = T(ij) = T(i) + T(i(j-1)) + cn if j > 1
T(i) = m
T(2i) = m + m + c*2i = 2m + 2ci
T(3i) = m + 2m + 2ci + 3ci = 3m + 5ci
So it looks like the recurrence is:
T(n) = jm + ci * sum_{k=1}^{j} (k - 1)
At this point, I'm a bit lost as to what to do. It looks like the summation at the end will result in roughly j^2 when expanded out, but I need to show that it somehow equals n^2. Any explanation on how to continue with this would be appreciated.
Note that the worst-case scenario for quicksort is when the partition produces two subproblems of size 0 and n-1. In this scenario, you have these recurrence equations, one per level of the tree (c*n is the cost of the partition step):
T(n) = T(n-1) + T(0) + c*n <-- at the first level of the tree
T(n-1) = T(n-2) + T(0) + c*(n-1) <-- at the second level of the tree
T(n-2) = T(n-3) + T(0) + c*(n-2) <-- at the third level of the tree
.
.
.
The sum of the partition costs over all levels is an arithmetic series:
T(n) = c * sum_{k=1}^{n} k = c * n(n+1)/2 ~ n^2 (as n -> +infinity)
It is O(n^2).
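To see this worst case in practice, here is a minimal quicksort sketch (using the first element as the pivot) that counts how many elements get compared against a pivot; on already-sorted input every partition splits into sizes 0 and n-1, and the count comes out to exactly n(n-1)/2:

import sys
sys.setrecursionlimit(10_000)   # the worst case recurses about n levels deep

def quicksort(a):
    """Return (sorted copy of a, number of elements compared against a pivot)."""
    if len(a) <= 1:
        return list(a), 0
    pivot, rest = a[0], a[1:]
    left = [x for x in rest if x < pivot]
    right = [x for x in rest if x >= pivot]
    sorted_left, c_left = quicksort(left)
    sorted_right, c_right = quicksort(right)
    return sorted_left + [pivot] + sorted_right, len(rest) + c_left + c_right

for n in (100, 200, 400, 800):
    _, comparisons = quicksort(list(range(n)))   # sorted input: the worst case
    print(n, comparisons, n * (n - 1) // 2)      # the two counts are equal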
It's a problem of simple mathematics. The complexity, as you have calculated correctly, is
O(jm + ij^2)
What you have found is a parameterized complexity. The standard O(n^2) is contained in it as follows: assuming i = 1, you have the standard base case, so m = O(1) and j = n, which gives O(n^2). If instead you substitute ij = n, you get O(nm/i + n^2/i). Now remember that m is a function of i, depending on which algorithm you use for the base case, so m = f(i), and you are left with O(n*f(i)/i + n^2/i). Since there is no linear-time algorithm for general sorting, f(i) = Ω(i log i), which gives O(n log i + n^2/i). So you have only one degree of freedom, namely i. Check that for any value of i you cannot push this below n log n, which is the best possible bound for comparison-based sorting.
What confuses me is that you are doing a worst-case analysis of quicksort, and this is not how it is usually done. When you talk about the worst case, it usually implies you are using randomization, in which case the worst case is when i = 1, giving the O(n^2) bound. An elegant way to do this is explained in the randomized algorithms book by R. Motwani and P. Raghavan; alternatively, if you are a programmer, look at Cormen (CLRS).
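If you want to see the parameterized bound numerically, here is a rough sketch; it takes m = i*log2(i) as a stand-in for the base-case cost and c = 1, both of which are assumptions for illustration. The ratio T(n) / (n*log2(i) + n^2/i) stays bounded (roughly between 0.5 and 0.7 here) as n grows, for several choices of i:

import math

def T(n, i, c=1.0):
    """Unrolls T(ij) = T(i) + T(i(j-1)) + c*ij iteratively, with T(i) = i*log2(i)."""
    base = i * math.log2(i)
    total = 0.0
    while n > i:
        total += base + c * n   # T(i) for the split-off block plus the partition cost
        n -= i
    return total + base         # the final block of size <= i

for i in (2, 10, 50):
    for n in (1_000, 10_000, 100_000):
        bound = n * math.log2(i) + n ** 2 / i
        print(i, n, round(T(n, i) / bound, 3))   # the ratio stays bounded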

Drawing Recurrence Tree and Analysis

I am watching Intro to Algorithms (MIT) lecture 1. There's something like the below (analysis of merge sort):
T(n) = 2T(n/2) + O(n)
A few questions:
Why does the work at the bottom level become O(n)? It is said that the boundary case may have a different constant ... but I still don't get it ...
It is said that the total is c*n*lg(n) + O(n). Where does the O(n) part come from? Is it the original O(n)?
Although this one's been answered a lot, here's one way to reason it:
If you expand the recursion you get:
t(n) = 2 * t(n/2) + O(n)
t(n) = 2 * (2 * t(n/4) + O(n/2)) + O(n)
t(n) = 2 * (2 * (2 * t(n/8) + O(n/4)) + O(n/2)) + O(n)
...
t(n) = 2^k * t(n / 2^k) + O(n) + 2*O(n/2) + ... + 2^(k-1) * O(n/2^(k-1))
The above stops when 2^k = n, that is, when k = log_2(n).
That makes n / 2^k = 1, which makes the first part of the equality simple to express, if we consider t(1) = c (constant).
t(n) = n * c + O(n) + 2*O(n/2) + ... + 2^(k-1) * O(n / 2^(k-1))
If we consider the sum O(n) + ... + 2^(k-1) * O(n / 2^(k-1)), we can observe that there are exactly k terms and that each term is actually equivalent to n. So we can rewrite it like so:
t(n) = n * c + {n + n + n + .. + n} <-- where n appears k times
t(n) = n * c + n * k
but since k = log_2(n), we have
t(n) = n * c + n * log_2(n)
And since in Big-Oh notation n * log_2(n) is equivalent to n * log n, and it grows faster than n * c, it follows that the Big-O of the closed form is:
O(n * log n)
I hope this helps!
EDIT
To clarify your first question: the work at the bottom becomes O(n) because you have n unit operations taking place (there are n leaf nodes in the expansion tree, and each takes a constant time c to complete). In the closed formula, the work at the bottom is the first term of the sum, 2^k * t(1) = n * c: the tree has k levels, hence 2^k = n leaves, and the unit operation t(1) takes constant time.
To answer the second question, the O(n) does not actually come from the original O(n); it represents the work at the bottom (see answer to first question above).
The original O(n) is the time complexity required to merge the two sub-solutions t(n/2). Since the time complexity of the merge operation is assumed to grow linearly with the size of the problem, at each level you have a sum of 2^level terms, each of size O(n / 2^level); this is equivalent to one O(n) operation per level. Now, since you have k levels, the merge complexity for the initial problem is {O(n) at each level} * {number of levels}, which is essentially O(n) * k. Since there are k = log(n) levels, it follows that the time complexity of the merge operation is O(n * log n).
Finally, when you examine all the operations performed, you see that the work at the bottom is less than the actual work performed to merge the solutions. Mathematically speaking, the work performed for each of the n items, grows asymptotically slower than the work performed to merge the sub-solutions; put differently, for large values of n, the merge operation dominates. So in Big-Oh analysis, the formula becomes: O(n * log(n)).
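As a concrete check of the c*n*lg(n) merge term, here is a short merge sort sketch that counts the elements written out by all merge steps; for n = 2^k that count is exactly n * log2(n), and the n leaf calls account for the extra O(n):

import random

def merge_sort(a):
    """Return (sorted copy of a, number of elements written by all merges)."""
    if len(a) <= 1:
        return list(a), 0
    mid = len(a) // 2
    left, w_left = merge_sort(a[:mid])
    right, w_right = merge_sort(a[mid:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, w_left + w_right + len(merged)

for k in (4, 8, 12, 16):
    n = 2 ** k
    _, work = merge_sort([random.random() for _ in range(n)])
    print(n, work, n * k)   # merge work equals n * log2(n) here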
