How to convert time analysis to O(n)? - big-o

Rookie computer science student here, with a question I'm having some trouble answering.
I have a tree traversal algorithm whose time performance is O(bm), where b is the branching factor and m is the max depth of the tree. I was wondering how one takes this and converts it into standard asymptotic time analysis (i.e. O(n), O(n^2), etc).
Same question for a different algorithm I have which is O(b^m).
I have gone through my textbook extensively and not found a clear answer about this. Asymptotic time analysis usually relates to input (n) but I'm not sure what n would mean in this instance. I suppose it would be m?
In general, what do you do when you have multiple inputs?
Thank you for your time.

You should start with building a recurrence. For example, let us consider binary search. The recurrence comes as: T(n) = T(n/2) + c. When you solve it, you will get
T(n) = T(n/2) + c
= T(n/4) + c + c
= T(n/8) + c + c + c
...
= T(n/2^k) + kc
The expansion stops when n = 2^k, or k = log_2(n). So the complexity is c*log_2(n), i.e. O(log n).
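As a rough check of the expansion above, here is a minimal Python sketch that unrolls T(n) = T(n/2) + c directly and compares it with c*log_2(n). The base case T(1) = 0 and the name `unrolled_cost` are assumptions made for illustration, and n is taken to be a power of two.

```python
import math

def unrolled_cost(n: int, c: int = 1) -> int:
    """Evaluate T(n) = T(n/2) + c with the assumed base case T(1) = 0."""
    cost = 0
    while n > 1:
        n //= 2      # one halving step of the recurrence
        cost += c    # constant work per step
    return cost

for n in (2, 16, 1024):
    # For powers of two the unrolled cost equals c * log2(n).
    print(n, unrolled_cost(n), int(math.log2(n)))
```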
Now, let us look at another situation where the input is divided into 5 parts, and the results combined in linear time. This recurrence will be
T(n) = 5T(n/5) + n
= 5^2T(n/5^2) + 2n
...
= 5^kT(n/5^k) + kn
This will stop when n = 5^k, or k = log_5(n). So, substituting above, the complexity is n*log_5(n), i.e. O(n log n).
I guess you should be able to take it from here on.
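The five-way recurrence can be checked the same way. This is only a sketch: the base case T(1) = 0 and the name `five_way_cost` are assumptions, and n is restricted to powers of 5 so that n/5 is always an integer.

```python
def five_way_cost(n: int) -> int:
    """T(n) = 5*T(n/5) + n with the assumed base case T(1) = 0."""
    if n <= 1:
        return 0
    return 5 * five_way_cost(n // 5) + n

for k in range(1, 6):
    n = 5 ** k
    # Closed form from the expansion: n * log_5(n) = n * k.
    print(n, five_way_cost(n), n * k)
```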

Related

Calculate the time complexity of recurrence relation f(n) = f(n/2) + f(n/3)

How do you calculate the time complexity of the recurrence relation f(n) = f(n/2) + f(n/3)? We have base cases at n=1 and n=0.
How do you calculate the time complexity for the general case, i.e. f(n) = f(n/x) + f(n/y), where x<n and y<n?
Edit-1 (after first answer posted): every number considered is an integer.
Edit-2 (after first answer posted): I like the answer given by Mbo, but is it possible to answer this without using any fancy theorem like the master theorem, e.g. by drawing the tree?
However, users are free to answer the way they like and I will try to understand.
In "layman's terms", you can bound the recurrence from above by one with a larger coefficient:
T(n) = T(n/2) + T(n/2) + O(1)
Build the call tree for n = 2^k: the last level contains 2^k nodes, the level above it 2^(k-1), the next one 2^(k-2), and so on. The sum of this geometric progression is
2^k + 2^(k-1) + 2^(k-2) + ... + 1 = 2^(k+1) - 1 ≈ 2*n
so the complexity of this recurrence is linear.
Now bound it from below by a recurrence with a smaller (zero) second coefficient:
T(n) = T(n/2) + O(1)
which is logarithmic.
So the complexity of the recurrence in question lies between the complexities of these simpler examples: at least logarithmic and at most linear.
In the general case, recurrences with complex branching can be solved with the Akra-Bazzi method (a more general approach than the Master theorem).
I assume that the recurrence is
T(n) = T(n/2) + T(n/3) + O(1)
In this case g(u) = 1, and to find p we numerically solve
(1/2)^p + (1/3)^p = 1
which gives p ≈ 0.79. Then integrate:
T(x) = Theta(x^p * (1 + Int[1..x](g(u)/u^(p+1) du)))
= Theta(x^0.79 * (1 + Int[1..x](du/u^1.79)))
= Theta(x^0.79 * (1 + (1 - x^(-0.79))/0.79))
= Theta(x^0.79)
So the complexity is about Theta(n^0.79), which is sublinear and consistent with the bounds above.
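The exponent p in the Akra-Bazzi equation has no closed form, so it has to be found numerically. The sketch below is one way to do that with plain bisection, and it also counts the calls made by a naive recursive evaluation so the growth can be compared against candidate bounds. The function names, the integer division, and the choice to treat n <= 1 as a single base-case call are assumptions for illustration.

```python
from functools import lru_cache

def solve_p(lo: float = 0.0, hi: float = 1.0, iters: int = 60) -> float:
    """Bisection on h(p) = (1/2)^p + (1/3)^p - 1, which is decreasing in p."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if 0.5 ** mid + (1 / 3) ** mid > 1:
            lo = mid   # sum still too large: the root lies to the right
        else:
            hi = mid
    return (lo + hi) / 2

@lru_cache(maxsize=None)
def calls(n: int) -> int:
    """Calls made by a naive recursive f(n) = f(n//2) + f(n//3), base cases n <= 1."""
    if n <= 1:
        return 1
    return 1 + calls(n // 2) + calls(n // 3)

p = solve_p()
print(round(p, 4))
for n in (10**3, 10**6, 10**9):
    # Compare the call count with n and with n**p.
    print(n, calls(n), calls(n) / n, calls(n) / n ** p)
```

The memoisation is only there to make the counting fast; the count itself is that of the unmemoised recursion.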

Algorithms: Find recursive equation of divide and conquer algorithm

I have the following "divide and conquer" algorithm A1.
A1 divides a problem of size n into 4 sub-problems of size n/4.
Then it solves them and combines the solutions in 12n time.
How can I write the recurrence equation that gives the runtime of the algorithm?
Answering the question "How can I write the recurrence equation that gives the runtime of the algorithm":
You should write it this way:
Let T(n) denote the run time of your algorithm for input size of n
T(n) = 4*T(n/4) + 12*n;
Although the master theorem does give a shortcut to the answer, it is important to understand the derivation of the Big O runtime. Divide and conquer recurrence relations are written in the form T(n) = q * T(n/j) + cn, where q is the number of subproblems, j is the factor by which we divide the data for each subproblem, and cn is the time it takes to divide/combine/manipulate the subproblems at each level. cn could also be cn^2 or c, whatever the runtime would be.
In your case, you have 4 subproblems of size n/4 with 12n time spent at each level, giving a recurrence relation of T(n) = 4 * T(n/4) + 12n. From this recurrence, we can then derive the runtime of the algorithm. Given it is a divide and conquer relation, we can assume that the base case is T(1) = 1.
To solve the recurrence, I will use a technique called substitution. We know that T(n) = 4 * T(n/4) + 12n, so we substitute for T(n/4) using T(n/4) = 4 * T(n/16) + 12(n/4). Plugging this in gives T(n) = 4 * (4 * T(n/16) + 12n/4) + 12n, which simplifies to T(n) = 4^2 * T(n/16) + 2 * 12n. Substituting again for T(n/16) gives T(n) = 4^3 * T(n/64) + 3 * 12n.
The pattern is T(n) = 4^k * T(n/4^k) + k * 12n. We want to go all the way down to the base case T(1), which happens when n/4^k = 1, that is, when the algorithm is called on a single element. Solving for k gives k = log_4(n), so we have done log_4(n) substitutions.
Plugging that in for k gives T(n) = 4^(log_4 n) * T(1) + log_4(n) * 12n, which simplifies to T(n) = n * 1 + 12n * log_4(n). Since log_4(n) differs from log_2(n) only by a constant factor (change of base), T(n) = n + 12n * log_4(n) is in O(n log n).
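The closed form from the substitution can be compared with the recurrence itself. A minimal sketch, assuming n is a power of 4 and the base case T(1) = 1 used above; the names `runtime` and `closed_form` are made up for this example.

```python
def runtime(n: int) -> int:
    """T(n) = 4*T(n/4) + 12*n with base case T(1) = 1, for n a power of 4."""
    if n == 1:
        return 1
    return 4 * runtime(n // 4) + 12 * n

def closed_form(k: int) -> int:
    """n + 12*n*log_4(n) for n = 4**k, where log_4(n) = k."""
    n = 4 ** k
    return n + 12 * n * k

for k in range(6):
    # Both columns should agree for every power of 4.
    print(4 ** k, runtime(4 ** k), closed_form(k))
```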
The recurrence relation that best describes the algorithm is:
T(n) = 4*T(n/4) + 12*n
where T(n) is the run time of the algorithm for input of size n, 4 is the number of subproblems, and n/4 is the size of each subproblem.
Using the Master Theorem, the time complexity is Theta(n*log n).

Recurrence: T(n) = 3T(n/2) + n^2(lgn)

Here is the full question...
Analysis of recurrence trees. Find the nice nonrecursive function f(n) such that T(n) = Θ(f(n)). Show your work: what is the number of levels, number of instances on each level, work of each instance and the total work on that level.
This is a homework question so I do not expect exact answers, but I would like some guidance because I have no idea where to start. Here is part a:
a) T(n) = 3T(n/2) + n^2(lgn)
I really have no idea where to begin.
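These types of recurrences are solved with the Master theorem.
In your case a = 3 and b = 2, so the critical exponent is c = log_2(3) ≈ 1.58 < 2.
Since f(n) = n^2 lg n grows polynomially faster than n^c (and 3 f(n/2) <= (3/4) f(n)), you are in the third case and your complexity is Θ(n^2 * log(n)).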
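For the recursion-tree view the question asks about, level i has 3^i instances of size n/2^i, each doing (n/2^i)^2 lg(n/2^i) work. The sketch below tabulates that under some assumptions of mine: n is a power of two, the tree stops at size 1, and the base-case cost of the leaves is ignored (the non-recursive work at size 1 is lg 1 = 0). The function names are hypothetical.

```python
import math

def level_work(n: int, i: int) -> float:
    """Non-recursive work on level i of the tree for T(n) = 3T(n/2) + n^2 lg n."""
    instances = 3 ** i          # each node spawns 3 children
    size = n / 2 ** i           # subproblem size on this level
    return instances * size ** 2 * math.log2(size)

def total_work(n: int) -> float:
    """Sum of the per-level work over all levels (sizes n, n/2, ..., 1)."""
    levels = int(math.log2(n)) + 1
    return sum(level_work(n, i) for i in range(levels))

for n in (2 ** 4, 2 ** 10, 2 ** 20):
    root = n ** 2 * math.log2(n)
    # Level i contributes at most (3/4)^i times the root's work,
    # so the ratio stays below the geometric sum 1/(1 - 3/4) = 4.
    print(n, total_work(n) / root)
```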

Solving a Recurrence Relation: T(n)=T(n-1)+T(n/2)+n

Solve: T(n)=T(n-1)+T(n/2)+n.
I tried solving this using recursion trees. There are two branches, T(n-1) and T(n/2). T(n-1) will go to a greater depth, so we get O(2^n). Is this idea correct?
This is a very strange recurrence for a CS class. This is because from one point of view: T(n) = T(n-1) + T(n/2) + n is bigger than T(n) = T(n-1) + n which is O(n^2).
But from another point of view, the functional equation has an exact solution: T(n) = -2(n + 1). You can easily see that this is an exact solution by substituting it back into the equation: -2(n + 1) = -2n + -(n + 2) + n. I am not sure whether this is the only solution.
Here is how I got it: T(n) = T(n-1) + T(n/2) + n. Because you calculate things for very big n, n-1 is almost the same as n. So you can rewrite it as T(n) = T(n) + T(n/2) + n, which is T(n/2) + n = 0, which is equivalent to T(n) = -2n, so it is linear. This was counterintuitive to me (the minus sign here), but armed with this solution, I tried T(n) = -2n + a and found the value of a.
I believe you are right. The recurrence relation will always split into two parts, namely T(n-1) and T(n/2). Looking at these two, it is clear that n-1 decreases in value more slowly than n/2, or in other words, you will have more branches from the n-1 portion of the tree. Despite this, when considering big-O, it is useful to just consider the 'worst-case' scenario, which in this case is that both sides of the tree decrease by 1 at each step (since n-1 decreases more slowly and you would need more branches). In all, you would need to split the relation in two a total of n times, hence you are right to say O(2^n).
Your reasoning is correct, but you give away far too much. (For example, it is also correct to say that 2x^3+4=O(2^x), but that’s not as informative as 2x^3+4=O(x^3).)
The first thing we want to do is get rid of the inhomogeneous term n. This suggests that we may look for a solution of the form T(n)=an+b. Substituting that in, we find:
an+b = a(n-1)+b + an/2+b + n
which reduces to
0 = (a/2+1)n + (b-a)
implying that a=-2 and b=a=-2. Therefore, T(n)=-2n-2 is a solution to the equation.
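The linear solution is easy to verify mechanically. This small sketch uses exact rational arithmetic so that n/2 is not rounded; the names `candidate` and `residual` are made up for the example.

```python
from fractions import Fraction

def candidate(n):
    """The linear candidate T(n) = -2n - 2."""
    return -2 * n - 2

def residual(n):
    """T(n) - (T(n-1) + T(n/2) + n); zero means the recurrence holds at n."""
    n = Fraction(n)
    return candidate(n) - (candidate(n - 1) + candidate(n / 2) + n)

# Every residual should be zero, for odd and even n alike.
print([residual(n) for n in (1, 2, 7, 100)])
```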
We now want to find other solutions by subtracting off the solution we’ve already found. Let’s define U(n)=T(n)+2n+2. Then the equation becomes
U(n)-2n-2 = U(n-1)-2(n-1)-2 + U(n/2)-2(n/2)-2 + n
which reduces to
U(n) = U(n-1) + U(n/2).
U(n)=0 is an obvious solution to this equation, but how do the non-trivial solutions to this equation behave?
Let’s assume that U(n)∈Θ(n^k) for some k>0, so that U(n)=cn^k+o(n^k). This makes the equation
cn^k+o(n^k) = c(n-1)^k+o((n-1)^k) + c(n/2)^k+o((n/2)^k)
Now, (n-1)^k=n^k+Θ(n^{k-1}), so that the above becomes
cn^k+o(n^k) = cn^k+Θ(cn^{k-1})+o(n^k+Θ(n^{k-1})) + cn^k/2^k+o((n/2)^k)
Absorbing the lower order terms and subtracting the common cn^k, we arrive at
o(n^k) = cn^k/2^k
But this is false because the right hand side grows faster than the left. Therefore, U(n-1)+U(n/2) grows faster than U(n), which means that U(n) must grow faster than our assumed Θ(n^k). Since this is true for any k, U(n) must grow faster than any polynomial.
A good example of something that grows faster than any polynomial is an exponential function. Consequently, let’s assume that U(n)∈Θ(c^n) for some c>1, so that U(n)=ac^n+o(c^n). This makes the equation
ac^n+o(c^n) = ac^{n-1}+o(c^{n-1}) + ac^{n/2}+o(c^{n/2})
Rearranging and using some order of growth math, this becomes
c^n = o(c^n)
This is false (again) because the left hand side grows faster than the right. Therefore, U(n) grows faster than U(n-1)+U(n/2), which means that U(n) must grow slower than our assumed Θ(c^n). Since this is true for any c>1, U(n) must grow more slowly than any exponential.
This puts us into the realm of quasi-polynomials, where ln U(n)∈O(log^c n), and subexponentials, where ln U(n)∈O(n^ε). Either of these mean that we want to look at L(n):=ln U(n), where the previous paragraphs imply that L(n)∈ω(ln n)∩o(n). Taking the natural log of our equation, we have
ln U(n) = ln( U(n-1) + U(n/2) ) = ln U(n-1) + ln(1+ U(n/2)/U(n-1))
or
L(n) = L(n-1) + ln( 1 + e^{-L(n-1)+L(n/2)} ) = L(n-1) + e^{-(L(n-1)-L(n/2))} + Θ(e^{-2(L(n-1)-L(n/2))})
So everything comes down to: how fast does L(n-1)-L(n/2) grow? We know that L(n-1)-L(n/2)→∞, since otherwise L(n)∈Ω(n). And it’s likely that L(n)-L(n/2) will be just as useful, since L(n)-L(n-1)∈o(1) is much smaller than L(n-1)-L(n/2).
Unfortunately, this is as far as I’m able to take the problem. I don’t see a good way to control how fast L(n)-L(n/2) grows (and I’ve been staring at this for months). The only thing I can end with is to quote another answer: “a very strange recursion for a CS class”.
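The two growth claims for the homogeneous equation can at least be probed numerically. The sketch below makes choices that are not in the argument above: n/2 is rounded down to an integer, the seed is U(0) = 1, and the name `homogeneous_values` is hypothetical. It is an experiment, not a proof.

```python
import math

def homogeneous_values(limit: int, u0: int = 1) -> list:
    """U(n) = U(n-1) + U(n // 2) for n >= 1, with the assumed seed U(0) = u0."""
    u = [u0]
    for n in range(1, limit + 1):
        u.append(u[n - 1] + u[n // 2])
    return u

u = homogeneous_values(4000)
for n in (1000, 2000, 4000):
    log_u = math.log(u[n])
    # Super-polynomial growth should make ln U / ln n rise with n;
    # sub-exponential growth should make ln U / n fall.
    print(n, log_u / math.log(n), log_u / n)
```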
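I think we can look at it this way:
T(n) = 2T(n/2) + n < T(n) = T(n−1) + T(n/2) + n < T(n) = 2T(n−1) + n
Solving the two outer recurrences (the Master theorem handles the first; expanding the second gives an exponential), we get:
Θ(n log n) < Θ(T(n)) < Θ(2^n)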
Remember that the claim that T(n) = T(n-1) + T(n/2) + n is (asymptotically) bigger than T(n) = T(n-1) + n only applies to functions that are asymptotically positive. In that case, we have T = Ω(n^2).
Note that T(n) = -2(n + 1) is a solution to the functional equation, but it doesn't interest us, since it is not asymptotically positive, so O notation has no meaningful application to it.
You can also easily check that T(n) = O(2^n). (Refer to yyFred's solution, if needed.)
If you try the definition of O with functions of the form n^a (lg n)^b, where a >= 2 and b are positive constants, the Substitution Method shows that these are not possible solutions either.
In fact, the only kind of function that allows a proof with the Substitution Method is exponential, but we know that this recurrence doesn't grow as fast as T(n) = 2T(n-1) + n, so if T(n) = O(a^n), we can have a < 2.
Assume that T(m) <= c(a^m), for some constant c, real and positive. Our hypothesis is that this relation is valid for all m < n. Trying to prove this for n, we get:
T(n) <= (1/a+1/a^(n/2))c(a^n) + n
we can get rid of the n easily by changing the hypothesis by a term of lower order. What is important here is that:
1/a+1/a^(n/2) <= 1
a^(n/2+1)-a^(n/2)-a >= 0
Changing variables:
a^(N+1)-a^N-a >= 0
We want to find a bound as tight as possible, so we are searching for the lowest possible a. The inequality above accepts values of a that are pretty close to 1, but is a allowed to get arbitrarily close to 1? The answer is no. Let a be of the form a = 1 + 1/N. Substituting a into the inequality and taking the limit N -> INF gives:
e-e-1 >= 0
which is absurd. Hence, the inequality above has some fixed number N* as its maximum solution, which can be found computationally. A quick Python program allowed me to find that a < 1+1e-45 (with a little extrapolation), so we can at least be sure that:
T(n) = o((1+1e-45)^n)
T(n) = T(n-1) + T(n/2) + n is approximately the same as T(n) = T(n) + T(n/2) + n, since we are solving for extremely large values of n. That can only hold if T(n/2) + n = 0, which means T(n/2) = -n, or T(n) = -2n. Its magnitude grows linearly, so this heuristic gives O(n).

Worst Case Performance of Quicksort

I am trying to prove the following worst-case scenario for the Quicksort algorithm but am having some trouble. Initially, we have an array of size n, where n = ij. The idea is that at every partition step of Quicksort, you end up with two sub-arrays where one is of size i and the other is of size i(j-1). i in this case is an integer constant greater than 0. I have drawn out the recursive tree of some examples and understand why this is a worst-case scenario and that the running time will be theta(n^2). To prove this, I've used the iteration method to solve the recurrence equation:
T(n) = T(ij) = m if j = 1
T(n) = T(ij) = T(i) + T(i(j-1)) + cn if j > 1
T(i) = m
T(2i) = m + m + c*2i = 2m + 2ci
T(3i) = m + 2m + 2ci + 3ci = 3m + 5ci
So it looks like the closed form is:
T(n) = jm + ci * ((sum_{k=1}^{j} k) - 1)
At this point, I'm a bit lost as to what to do. It looks like the summation at the end will result in j^2 if expanded out, but I need to show that it somehow equals n^2. Any explanation on how to continue with this would be appreciated.
Note that the quicksort worst case occurs when the two subproblems have sizes 0 and n-1. In this scenario, with a partitioning cost of cn at each level and T(0) treated as a constant, you have these recurrence equations:
T(n) = T(n-1) + T(0) + cn <-- at first level of tree
T(n-1) = T(n-2) + T(0) + c(n-1) <-- at second level of tree
T(n-2) = T(n-3) + T(0) + c(n-2) <-- at third level of tree
...
The sum of the costs at each level is an arithmetic series:
T(n) = c * sum_{k=1}^{n} k = c * n(n+1)/2 ~ c*n^2/2 (for n -> +inf)
It is O(n^2).
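The 0 / n-1 split can be observed directly by counting comparisons on already-sorted input. This is a sketch of a simple first-element-pivot quicksort, not any particular library's implementation; it is written iteratively to avoid Python's recursion limit, and the name `quicksort_comparisons` is made up. Partitioning k elements is counted as k-1 comparisons, so the total is n(n-1)/2, which has the same quadratic growth as the series above.

```python
def quicksort_comparisons(items: list) -> int:
    """Count comparisons made by a first-element-pivot quicksort (iterative)."""
    comparisons = 0
    stack = [list(items)]
    while stack:
        part = stack.pop()
        if len(part) <= 1:
            continue
        pivot, rest = part[0], part[1:]
        comparisons += len(rest)   # counted as one comparison per non-pivot element
        stack.append([x for x in rest if x < pivot])
        stack.append([x for x in rest if x >= pivot])
    return comparisons

for n in (10, 100, 1000):
    # Sorted input: one side is always empty, so the sizes are 0 and n-1.
    print(n, quicksort_comparisons(list(range(n))), n * (n - 1) // 2)
```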
It's a problem of simple mathematics. The complexity, as you have calculated correctly, is
O(jm + ij^2)
What you have found is a parameterized complexity. The standard O(n^2) is contained in it as follows: assuming i = 1, you have a standard base case, so m = O(1) and j = n, which gives O(n^2). If you put ij = n, you get O(nm/i + n^2/i). Now remember that m is a function of i that depends on the algorithm you use for the base case, so m = f(i) and you are left with O(n*f(i)/i + n^2/i). Since there is no linear-time algorithm for general comparison sorting, f(i) = Ω(i log i), which gives n log i + n^2/i. So you have only one degree of freedom, namely i. Check that for any value of i you cannot reduce this below n log n, which is the best bound for comparison-based sorting.
Now what I am confused about is that you are doing a worst-case analysis of quicksort this way. This is not how it is usually done. When you say worst case, it implies you are using randomization, in which case the worst case is always i = 1, so the worst-case bound is O(n^2). An elegant way to do this is explained in the book Randomized Algorithms by R. Motwani and P. Raghavan; alternatively, if you are a programmer, look at Cormen.
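To connect the question's expansion with the O(jm + ij^2) bound, the recurrence T(ij) = T(i) + T(i(j-1)) + c*ij with T(i) = m can be evaluated directly and compared with the pattern jm + ci*(j(j+1)/2 - 1). This is a sketch with hypothetical function names; substituting j = n/i into the closed form gives a leading term of c*n^2/(2i), which is quadratic in n for constant i.

```python
def iterated(j: int, i: int, m: int, c: int) -> int:
    """T(ij) = T(i) + T(i(j-1)) + c*(ij), with T(i) = m, evaluated iteratively."""
    total = m                     # T(i), the j = 1 case
    for k in range(2, j + 1):
        total += m + c * i * k    # add T(i) and the partition cost c*(ik)
    return total

def closed_form(j: int, i: int, m: int, c: int) -> int:
    """jm + ci * (j(j+1)/2 - 1), the pattern suggested by the first few terms."""
    return j * m + c * i * (j * (j + 1) // 2 - 1)

for j in (1, 2, 3, 50):
    print(j, iterated(j, i=4, m=7, c=2), closed_form(j, i=4, m=7, c=2))
```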
