I know that we can apply the Master Theorem to find the running time of a divide and conquer algorithm, when the recurrence relation has the form of:
T(n) = a*T(n/b) + f(n)
We know the following:
a is the number of subproblems into which the algorithm divides the original problem,
b is the factor by which the problem size shrinks, i.e. each subproblem has size n/b,
and finally, f(n) encompasses the cost of dividing the problem and combining the results of the subproblems.
Now we then find something (I will come back to the term "something")
and we have 3 cases to check.
The case that f(n) = O(n^(log_b(a) - ε)) for some ε > 0; then T(n) is Θ(n^log_b(a)).
The case that f(n) = Θ(n^log_b(a)); then T(n) is Θ(n^log_b(a) * log n).
If f(n) = Ω(n^(log_b(a) + ε)) for some constant ε > 0, and if a*f(n/b) <= c*f(n) for some constant
c < 1 and almost all n, then T(n) = Θ(f(n)).
All fine, but back to that term "something". How can we use general examples (i.e. ones with variables rather than actual numbers) to decide which case the algorithm falls in?
For instance, consider the following:
T(n) = 8T(n/2) + n
So a = 8, b = 2 and f(n) = n
How do I proceed then? How can I decide which case applies? Since f(n) is compared against some big-Oh expression, how are these two things comparable?
The above is just an example to show you where I don't get it, so the question is in general.
Thanks
As CLRS suggests, the basic idea is comparing f(n) with n^log_b(a), i.e. n to the power (log a to the base b). In your hypothetical example, we have:
f(n) = n
n^log_b(a) = n^3, i.e. n-cubed, as your recurrence yields 8 problems of half the size at every step.
Thus, in this case, n^log_b(a) is larger, because n = O(n^(3-ε)) (f(n) is polynomially smaller than n^3), and the solution is: T(n) = Θ(n^3).
Clearly, the number of subproblems vastly outpaces the work (linear, f(n) = n) you are doing for each subproblem. Thus, intuition suggests, and the master theorem verifies, that it is n^log_b(a) that dominates the recurrence.
There is a subtle technicality: the master theorem requires that f(n) be not only smaller than n^log_b(a) O-wise, but smaller polynomially, i.e. by a factor n^ε for some ε > 0.
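As a quick numeric sanity check (a sketch, not a proof; it assumes an arbitrary base case T(1) = 1), one can evaluate the recurrence directly and watch the growth rate: if T(n) = Θ(n^3), then doubling n should multiply T(n) by roughly 2^3 = 8.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def T(n):
    # T(n) = 8*T(n/2) + n, with an arbitrary constant base case
    if n <= 1:
        return 1
    return 8 * T(n // 2) + n

# If T(n) = Theta(n^3), then T(2n)/T(n) should approach 2^3 = 8.
for k in (10, 15, 20):
    n = 2 ** k
    print(k, T(2 * n) / T(n))  # ratios are very close to 8
```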
Related
I'm solving some recurrences and I didn't quite understand when to put O or Theta on the result.
If I have a recurrence like this:
T(n) = 2T(n/2) + n
and I solve it with the iterative method, I know the result is O(n log n).
Could I also use Θ(n log n)?
Is there any verification I could do to establish this?
By definition,
If T(n) = O(g(n)) as well as T(n) = Ω(g(n)),
then we can say that T(n) = Θ(g(n))
In the case of T(n) = 2T(n/2) + n, you're dividing the n inputs into two branches and then merging those inputs from the bottom up. This method is also known as Divide and Conquer.
So at every level you're doing n operations. For how many levels? As many as the height of the tree. For the binary tree here, the height is log_2(n). Thus the total number of operations you're doing is at most n * log_2(n).
Since the work per level and the number of levels are bounded from below in the same way, the total number of operations you'll be doing is also at least on the order of n * log_2(n).
Thus for your above function, T(n) = Θ(n * log_2(n)).
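A small numeric check (a sketch, again assuming a base case T(1) = 1) supports this: T(n)/(n * log_2(n)) settles near a constant, as Θ(n log n) predicts.

```python
from functools import lru_cache
from math import log2

@lru_cache(maxsize=None)
def T(n):
    # T(n) = 2*T(n/2) + n, with base case T(1) = 1
    if n <= 1:
        return 1
    return 2 * T(n // 2) + n

for k in (10, 16, 20):
    n = 2 ** k
    print(n, T(n) / (n * log2(n)))  # ratio settles near 1
```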
This is my recursive function:
function abc(n):
    if n == 0
        return xyz(n)
    for i = 1 to n
        print(xyz(n))
    return abc(n/2) + abc(n/2)
and xyz() is ϴ(n^3). Will the Master theorem be valid here? If yes, how will I write it?
The master theorem concerns recurrence relations of this form:
T(n) = a * T(n/b) + f(n)
T being the recursive procedure, a the number of subproblems into which we divide the input n, n/b the size of each subproblem and f(n) the cost for the division of the input into subproblems and the combination of the results.
If n == 0 then n/b becomes 0 as well, so there are no further subproblems and the recursive term vanishes. This leaves us with:
T(0) = 0 + f(0)
Since there's no more recursion, it basically comes down to f(0). In your hypothetical case this has a complexity ϴ(n^3).
Since f(n) is the cost for the division of n into a subproblems and the combination of results, f(0) would normally have a cost of 0 or a constant. If function f(n) has a complexity of ϴ(n^3), then actually for n == 0 this still leads to a cost of 0 with regards to input size.
The master theorem provides information on the asymptotic bound for T(n), depending on the complexity of f(n), a and b. This depends on how the complexity of f(n) can be expressed using a form that employs log_b(a) (log with base b of a). The log of 0 is undefined for b > 0.
What it comes down to is that it makes no sense to ask whether the master theorem holds for some specific input. Moreover, the master theorem holds anyway; it just states that, depending on f(n), you can make some claims about the complexity of T or not. This depends on a and b, so without that information it is senseless to ask. If your f(n) is O(n^3) outside of the base case as well (n > 0), then you can make claims about T depending on how that 3 relates to a and b. For example, if 3 < log_b(a), you'd be sure that T is ϴ(n^log_b(a)).
Suppose that the a in your algorithm is 2^n, then the master theorem could no longer be used to say anything about T's complexity.
EDIT
After your question edit, the form of your recursive procedure has become this:
T(n) = 2 * T(n/2) + f(n)
So a == 2 and b == 2 are the parameters in your case, since you divide the input into two subproblems, each of which gets an input that's half the size. The combination of the two recursive calls is constant (a simple addition, abc(n/2) + abc(n/2)), and the division of the problem is trivial too, but in your case this part simulates a ϴ(n^4) algorithm for dividing the input into subproblems:
for i = 1 to n
print(xyz(n))
Note that it's ϴ(n^4) because you stated xyz(n) is ϴ(n^3) and you repeat it n times in the loop. So your f(n) = ϴ(n^4).
The master theorem can't really state anything about this. However, if f(n) = Ω(n^4) (note the omega here), then 4 > log2(2) (the logb(a) with b = 2 and a = 2 in your case). In order to make a statement about T's complexity, another condition must now hold, the regularity condition. It states that a * f(n/b) <= k * f(n) must be true for some k < 1 and sufficiently large n.
So that gives us 2 * f(n/2) <= k * f(n). Since 2 * (n/2)^4 = n^4/8, this holds for any constant k with 1/8 <= k < 1, e.g. k = 1/8. This, finally, lets us state that T = ϴ(f(n)), so T = ϴ(n^4).
Meaning that final part is true if your f(n) (the loop with the xyz call) can be proven to be Ω(n^4) (again, note the omega instead of theta). Since omega is the lower bound, and your f(n) = ϴ(n^4), that should be true.
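To make that concrete, here is a numeric sketch that models the recurrence as T(n) = 2T(n/2) + n^4, taking the loop cost as exactly n^4 (an assumption for illustration): T(n)/n^4 settles to a constant, consistent with T = ϴ(n^4).

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def T(n):
    # T(n) = 2*T(n/2) + n^4, modelling the loop cost as exactly n^4
    if n <= 1:
        return 1
    return 2 * T(n // 2) + n ** 4

for k in (5, 10, 15):
    n = 2 ** k
    print(n, T(n) / n ** 4)  # approaches 8/7 ≈ 1.1428
```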
In big O notation, we always say that we should ignore constant factors for most cases. That is, rather than writing,
3n^2-100n+6
we are almost always satisfied with
n^2
since that term is the fastest growing term in the equation.
But I found that many algorithm courses start by comparing functions with many terms,
2n^2 + 120n + 5 = big O of n^2
then finding c and n0 for those long functions, before recommending to ignore the low-order terms in the end.
My question is: what would I get from trying to understand and analyse these kinds of functions with many terms? Before this month I was comfortable with understanding what O(1), O(n), O(log(n)), O(n^3) mean. But am I missing some important concepts if I just rely on these typically used functions? What will I miss if I skip analysing those long functions?
Let's first of all describe what we mean when we say that f(n) is in O(g(n)):
... we can say that f(n) is O(g(n)) if we can find a constant c such
that f(n) is less than c·g(n) for all n larger than n0, i.e., for all
n > n0.
In equation form: we need to find one pair of constants (c, n0) that fulfils
f(n) < c · g(n), for all n > n0, (+)
Now, the result that f(n) is in O(g(n)) is sometimes presented in different forms, e.g. as f(n) = O(g(n)) or f(n) ∈ O(g(n)), but the statement is the same. Hence, from your question, the statement 2n^2 + 120n + 5 = big O of n^2 is just:
f(n) = 2n^2 + 120n + 5
a result after some analysis: f(n) is in O(g(n)), where
g(n) = n^2
Ok, with this out of the way, let us look at the constant term in the functions we want to analyse asymptotically, and let's look at it educationally, using your example.
As the result of any big-O analysis is the asymptotic behaviour of a function, in all but some very unusual cases the constant term has no effect whatsoever on this behaviour. The constant term can, however, affect how we choose the constant pair (c, n0) used to show that f(n) is in O(g(n)) for some functions f(n) and g(n), i.e., the non-unique constant pair (c, n0) used to show that (+) holds. We can say that the constant term will have no effect on the result of the analysis, but it can affect our derivation of this result.
Let's look at your function as well as another related function
f(n) = 2n^2 + 120n + 5 (x)
h(n) = 2n^2 + 120n + 22500 (xx)
Using a similar approach as in this thread, for f(n), we can show:
linear term:
120n < n^2 for n > 120 (verify: 120n = n^2 at n = 120) (i)
constant term:
5 < n^2 for e.g. n > 3 (verify: 3^2 = 9 > 5) (ii)
This means that if we replace both 120n and 5 in (x) by n^2, we can state the following inequality result:
Given that n > 120, we have:
2n^2 + n^2 + n^2 = 4n^2 > {by (i) and (ii)} > 2n^2 + 120n + 5 = f(n) (iii)
From (iii), we can choose (c, n0) = (4, 120), and (iii) then shows that these constants fulfil (+) for f(n) with g(n) = n^2, and hence
result: f(n) is in O(n^2)
Now, for h(n), we analogously have:
linear term (same as for f(n))
120n < n^2 for n > 120 (verify: 120n = n^2 at n = 120) (I)
constant term:
22500 < n^2 for e.g. n > 150 (verify: 150^2 = 22500) (II)
In this case, we again replace 120n and 22500 in (xx) by n^2, but we need a larger constraint on n for both of these to hold, namely n > 150. Hence, the following holds:
Given that n > 150, we have:
2n^2 + n^2 + n^2 = 4n^2 > {by (I) and (II)} > 2n^2 + 120n + 22500 = h(n) (III)
In the same way as for f(n), we can here choose (c, n0) = (4, 150), and (III) then shows that these constants fulfil (+) for h(n), with g(n) = n^2, and hence
result: h(n) is in O(n^2)
Hence, we have the same result for both functions f(n) and h(n), but we had to use different constants (c, n0) to show it (i.e., a somewhat different derivation). Note finally that:
Naturally the constants (c,n0) = (4,150) (used for h(n) analysis) are also valid to show that f(n) is in O(n^2), i.e., that (+) holds for f(n) with g(n)=n^2.
However, not the reverse: (c,n0) = (4,120) cannot be used to show that (+) holds for h(n) (with g(n)=n^2).
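These claims are easy to verify numerically (a sketch over a finite range, which of course only spot-checks the inequalities rather than proving them):

```python
def f(n): return 2 * n ** 2 + 120 * n + 5
def h(n): return 2 * n ** 2 + 120 * n + 22500
def g(n): return n ** 2

# (c, n0) = (4, 120) works for f(n) ...
assert all(f(n) <= 4 * g(n) for n in range(121, 5001))
# ... and (c, n0) = (4, 150) works for h(n) ...
assert all(h(n) <= 4 * g(n) for n in range(151, 5001))
# ... but (4, 120) fails for h(n): some n in (120, 150] violates the bound
assert any(h(n) > 4 * g(n) for n in range(121, 151))
print("all constant-pair checks passed")
```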
The core of this discussion is that:
As long as you look at sufficiently large values of n, you will be able to bound the constant terms by relations of the form constant < dominantTerm(n), where, in our example, we look at the relation with regard to the dominant term n^2.
The asymptotic behaviour of a function will not (in all but some very unusual cases) depend on the constant terms, so we might as well skip looking at them at all. However, for a rigorous proof of the asymptotic behaviour of some function, we need to take into account also the constant terms.
Ever have intermediate steps in your work? That is likely what this is: when you are computing a big O, chances are you don't already know for sure what the highest-order term is, so you keep track of them all and then determine which complexity class makes sense in the end. There is also something to be said for understanding why the lower-order terms can be ignored.
Take some graph algorithms, like minimum spanning tree or shortest path. Now, just by looking at the algorithm, can you tell what the highest term will be? I know I couldn't, and so I'd trace through the algorithm and collect a bunch of terms.
If you want another example, consider Sorting Algorithms and whether you want to memorize all the complexities or not. Bubble Sort, Shell Sort, Merge Sort, Quick Sort, Radix Sort and Heap Sort are a few of the more common algorithms out there. You could either memorize both the algorithm and complexity or just the algorithm and derive the complexity from the pseudo code if you know how to trace them.
The question is :
T(n) = √2*T(n/2) + log n
I'm not sure whether the master theorem works here, and kinda stuck.
This looks more like a job for the Akra-Bazzi theorem: http://en.wikipedia.org/wiki/Akra%E2%80%93Bazzi_method#The_formula with k = 1, h = 0, g(n) = log n, a = 2^{1/2}, b = 1/2. In that case, p = 1/2 and you need to evaluate the integral \int_1^x log(u)/u^{3/2} du. You can use integration by parts, or a symbolic integrator. Wolfram Alpha tells me the indefinite integral is -2(log u + 2)/u^{1/2} + C, so the definite integral is 4 - 2(log x + 2)/x^{1/2}. Adding 1 and multiplying by x^{1/2}, we get T(x) = Θ(5x^{1/2} - 2 log x - 4), which simplifies to T(x) = Θ(x^{1/2}).
The master theorem only has the constraints a >= 1 and b > 1 on your a and b, which hold in your case. The fact that a is irrational and that you have log(n) as your f(n) is no obstacle.
So in your case c = log2(sqrt(2)) = 1/2. Since n^c grows polynomially faster than your log(n), case 1 applies and the complexity of the recursion is O(sqrt(n)).
P.S. Danyal's solution is wrong, as the complexity is not n log n; Edward Doolittle's solution is correct, though it is overkill in this simple case.
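A quick numeric iteration (a sketch, with an assumed base case T(1) = 1 and logs taken base 2) supports the Θ(sqrt(n)) bound: T(n)/sqrt(n) tends to a constant.

```python
from math import sqrt

def T_at_power(k):
    # iterate T(n) = sqrt(2)*T(n/2) + log2(n) for n = 2^k, with T(1) = 1
    t = 1.0
    for j in range(1, k + 1):
        t = sqrt(2) * t + j  # log2(2**j) = j
    return t

r30 = T_at_power(30) / sqrt(2 ** 30)
r60 = T_at_power(60) / sqrt(2 ** 60)
print(r30, r60)  # both close to the same constant
```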
As per the master theorem, f(n) should be polynomial, but here
f(n) = log n
which is not a polynomial, so by the standard rules it can not be solved by the master theorem. I read somewhere about a fourth case as well, so I should mention that too.
It is also discussed here:
Master's theorem with f(n)=log n
However, there is a limited "fourth case" for the master theorem, which allows it to apply to polylogarithmic functions.
If f(n) = Θ(n^log_b(a) * log^k n) for some k >= 0, then T(n) = Θ(n^log_b(a) * log^(k+1) n).
In other words, suppose you have T(n) = 2T(n/2) + n log n. f(n) isn't a polynomial, but f(n) = n log n = Θ(n^log_2(2) * log^1 n), so k = 1. Therefore, T(n) = Θ(n log^2 n).
See this handout for more information: http://cse.unl.edu/~choueiry/S06-235/files/MasterTheorem-HandoutNoNotes.pdf
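A numeric sketch of that example (assuming a base case T(1) = 1): for T(n) = 2T(n/2) + n log2(n), the ratio T(n)/(n * log2(n)^2) settles near a constant, consistent with Θ(n log^2 n).

```python
from functools import lru_cache
from math import log2

@lru_cache(maxsize=None)
def T(n):
    # T(n) = 2*T(n/2) + n*log2(n), with base case T(1) = 1
    if n <= 1:
        return 1
    return 2 * T(n // 2) + n * log2(n)

for k in (10, 20, 30):
    n = 2 ** k
    print(n, T(n) / (n * log2(n) ** 2))  # tends toward 1/2
```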
The Master Method is a direct way to get the solution. The Master Method works only for following type of recurrences or for recurrences that can be transformed to following type.
T(n) = a T(n/b) + f(n), where a >= 1, b > 1, and f(n) = Θ(n^c).
There are the following three cases:
If c < log_b(a) then T(n) = Θ(n^log_b(a)).
If c = log_b(a) then T(n) = Θ(n^c log(n)).
If c > log_b(a) then T(n) = Θ(f(n)).
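The three cases can be captured in a small helper (a hypothetical illustration, not a library function), taking a, b and the exponent c of f(n) = Θ(n^c):

```python
from math import log

def master_case(a, b, c, eps=1e-9):
    # Apply the simplified three-case rule for T(n) = a*T(n/b) + Theta(n^c).
    crit = log(a, b)  # the critical exponent log_b(a)
    if abs(c - crit) < eps:
        return f"Theta(n^{c:g} log n)"
    if c < crit:
        return f"Theta(n^{round(crit, 6):g})"
    return f"Theta(n^{c:g})"

print(master_case(8, 2, 1))  # case 1 -> Theta(n^3)
print(master_case(2, 2, 1))  # case 2 -> Theta(n^1 log n)
print(master_case(2, 2, 2))  # case 3 -> Theta(n^2)
```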
In the Master Method, if f(n) contains a log(n) term, is it possible to solve the recurrence by the Master Method?
for example in
T(n) = 4T(n/2) + n^2 log n
Is the master theorem applicable here or not?
It is not really possible to tell directly whether or not the Master Method works for some logarithmic function. This would depend on the specific recurrence you're trying to solve. It all depends on how f grows in comparison to n^log_b(a).
In the example given by JPC (where T(n) = 4T(n/2) + log(n)), it is indeed possible. However, also consider the example T(n) = 2T(n/5) + log(n). In this recurrence it is harder to see at a glance whether n^log_5(2) grows faster than log(n). If the logarithmic function f(n) gets more complex (e.g. log^3(n/2)), it becomes even harder.
In short, it may be hard to determine how a logarithmic function grows compared to a polynomial n^c when the exponent c is less than 1 (for c >= 1, n^c is clearly the faster one; in fact n^c dominates log(n) for every c > 0, but showing the required polynomial gap takes more care). If it doesn't seem to work for you, you'll have to use other techniques to solve the recurrence.
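To see why n^c still wins even for a small exponent like c = log_5(2) ≈ 0.43, a quick numeric sketch shows the ratio n^c / log(n) growing without bound:

```python
from math import log

c = log(2, 5)  # ≈ 0.43, the critical exponent for T(n) = 2T(n/5) + log(n)
for n in (10, 10 ** 3, 10 ** 6, 10 ** 9):
    print(n, n ** c / log(n))  # the ratio keeps increasing
```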