Find the big-O of a function - algorithm

Please help me on following two functions, I need to simplify them.
O(nlogn + n^1.01)
O(log (n^2))
My current idea is
O(nlogn + n^1.01) = O(nlogn)
O(log (n^2)) = O (log (n^2))
Please kindly help me on these two simplification problems and briefly give an explanation, thanks.

For the second, you have O(lg(n²)) = O(2lg(n)) = O(lg(n)).
For the first, you have O(nlg(n) + n^(1.01)) = O(n(lg(n) + n^(0.01)), you've to decide whatever lg(n) or n^(0.01) grows larger.
For that purpose, you can take the derivative of n^0.01 - lg(n) and see if, at the limit for n -> infinity, it is positive or negative: 0.01/x^(0.99) - 1/x; at the limit, x is bigger than x^0.99, so the difference is positive and thus n^0.01 grows asymptotically faster than log(n), so the complexity is O(n^1.01).

Remember:
log (x * y) = log x + log y
and n^k always grows faster than log n for any k>0.

Putting things together, for the first question O(n*log(n)+n^1.01) the first function grows faster than the second summand, i.e. since nlog(n) > n^1.01 for n greater than about 3, it is O(nlog(n))
In the second case use the formula mentioned by KennyTM, so we get
O(log(n^2)) = O(log(n*n)) = O(log(n)+log(n)) = O(2*log(n)) = O(log(n))
because constant terms can be ignored.

Related

Is 2^(log n) = O(log(n))?

Are these two equal? I read somewhere that O(2lg n) = O(n). Going by this observation, I'm guessing the answer would be no, but I'm not entirely sure. I'd appreciate any help.
Firstly, O(2log(n)) isn't equal to O(n).
To use big O notation, you would find a function that represents the complexity of your algorithm, then you would find the term in that function with the largest growth rate. Finally, you would eliminate any constant factors you could.
e.g. say your algorithm iterates 4n^2 + 5n + 1 times, where n is the size of the input. First, you would take the term with the highest growth rate, in this case 4n^2, then remove any constant factors, leaving O(n^2) complexity.
In your example, O(2log(n)) can be simplified to O(log(n))
Now on to your question.
In computer science, unless specified otherwise, you can generally assume that log(n) actually means the log of n, base 2.
This means, using log laws, 2^log(n) can be simplified to O(n)
Proof:
y = 2^log(n)
log(y) = log(2^log(n))
log(y) = log(n) * log(2) [Log(2) = 1 since we are talking about base 2 here]
log(y) = log(n)
y = n

Solving a Recurrence Relation: T(n)=T(n-1)+T(n/2)+n

Solve: T(n)=T(n-1)+T(n/2)+n.
I tried solving this using recursion trees.There are two branches T(n-1) and T(n/2) respectively. T(n-1) will go to a higher depth. So we get O(2^n). Is this idea correct?
This is a very strange recurrence for a CS class. This is because from one point of view: T(n) = T(n-1) + T(n/2) + n is bigger than T(n) = T(n-1) + n which is O(n^2).
But from another point of view, the functional equation has an exact solution: T(n) = -2(n + 2). You can easily see that this is the exact solution by substituting it back to the equation: -2(n + 2) = -2(n + 1) + -(n + 2) + n. I am not sure whether this is the only solution.
Here is how I got it: T(n) = T(n-1) + T(n/2) + n. Because you calculate things for very big n, than n-1 is almost the same as n. So you can rewrite it as T(n) = T(n) + T(n/2) + n which is T(n/2) + n = 0, which is equal to T(n) = - 2n, so it is linear. This was counter intuitive to me (the minus sign here), but armed with this solution, I tried T(n) = -2n + a and found the value of a.
I believe you are right. The recurrence relation will always split into two parts, namely T(n-1) and T(n/2). Looking at these two, it is clear that n-1 decreases in value slower than n/2, or in other words, you will have more branches from the n-1 portion of the tree. Despite this, when considering big-o, it is useful to just consider the 'worst-case' scenario, which in this case is that both sides of the tree decreases by n-1 (since this decreases more slowly and you would need to have more branches). In all, you would need to split the relation into two a total of n times, hence you are right to say O(2^n).
Your reasoning is correct, but you give away far too much. (For example, it is also correct to say that 2x^3+4=O(2^n), but that’s not as informative as 2x^3+4=O(x^3).)
The first thing we want to do is get rid of the inhomogeneous term n. This suggests that we may look for a solution of the form T(n)=an+b. Substituting that in, we find:
an+b = a(n-1)+b + an/2+b + n
which reduces to
0 = (a/2+1)n + (b-a)
implying that a=-2 and b=a=-2. Therefore, T(n)=-2n-2 is a solution to the equation.
We now want to find other solutions by subtracting off the solution we’ve already found. Let’s define U(n)=T(n)+2n+2. Then the equation becomes
U(n)-2n-2 = U(n-1)-2(n-1)-2 + U(n/2)-2(n/2)-2 + n
which reduces to
U(n) = U(n-1) + U(n/2).
U(n)=0 is an obvious solution to this equation, but how do the non-trivial solutions to this equation behave?
Let’s assume that U(n)∈Θ(n^k) for some k>0, so that U(n)=cn^k+o(n^k). This makes the equation
cn^k+o(n^k) = c(n-1)^k+o((n-1)^k) + c(n/2)^k+o((n/2)^k)
Now, (n-1)^k=n^k+Θ(n^{k-1}), so that the above becomes
cn^k+o(n^k) = cn^k+Θ(cn^{k-1})+o(n^k+Θ(n^{k-1})) + cn^k/2^k+o((n/2)^k)
Absorbing the lower order terms and subtracting the common cn^k, we arrive at
o(n^k) = cn^k/2^k
But this is false because the right hand side grows faster than the left. Therefore, U(n-1)+U(n/2) grows faster than U(n), which means that U(n) must grow faster than our assumed Θ(n^k). Since this is true for any k, U(n) must grow faster than any polynomial.
A good example of something that grows faster than any polynomial is an exponential function. Consequently, let’s assume that U(n)∈Θ(c^n) for some c>1, so that U(n)=ac^n+o(c^n). This makes the equation
ac^n+o(c^n) = ac^{n-1}+o(c^{n-1}) + ac^{n/2}+o(c^{n/2})
Rearranging and using some order of growth math, this becomes
c^n = o(c^n)
This is false (again) because the left hand side grows faster than the right. Therefore,
U(n) grows faster than U(n-1)+U(n/2), which means that U(n) must grow slower than our assumed Θ(c^n). Since this is true for any c>1, U(n) must grow more slowly than any exponential.
This puts us into the realm of quasi-polynomials, where ln U(n)∈O(log^c n), and subexponentials, where ln U(n)∈O(n^ε). Either of these mean that we want to look at L(n):=ln U(n), where the previous paragraphs imply that L(n)∈ω(ln n)∩o(n). Taking the natural log of our equation, we have
ln U(n) = ln( U(n-1) + U(n/2) ) = ln U(n-1) + ln(1+ U(n/2)/U(n-1))
or
L(n) = L(n-1) + ln( 1 + e^{-L(n-1)+L(n/2)} ) = L(n-1) + e^{-(L(n-1)-L(n/2))} + Θ(e^{-2(L(n-1)-L(n/2))})
So everything comes down to: how fast does L(n-1)-L(n/2) grow? We know that L(n-1)-L(n/2)→∞, since otherwise L(n)∈Ω(n). And it’s likely that L(n)-L(n/2) will be just as useful, since L(n)-L(n-1)∈o(1) is much smaller than L(n-1)-L(n/2).
Unfortunately, this is as far as I’m able to take the problem. I don’t see a good way to control how fast L(n)-L(n/2) grows (and I’ve been staring at this for months). The only thing I can end with is to quote another answer: “a very strange recursion for a CS class”.
I think we can look at it this way:
T(n)=2T(n/2)+n < T(n)=T(n−1)+T(n/2)+n < T(n)=2T(n−1)+n
If we apply the master's theorem, then:
Θ(n∗logn) < Θ(T(n)) < Θ(2n)
Remember that T(n) = T(n-1) + T(n/2) + n being (asymptotically) bigger than T(n) = T(n-1) + n only applies for functions which are asymptotically positive. In that case, we have T = Ω(n^2).
Note that T(n) = -2(n + 2) is a solution to the functional equation, but it doesn't interest us, since it is not an asymptotically positive solution, hence the notations of O don't have meaningful application.
You can also easily check that T(n) = O(2^n). (Refer to yyFred solution, if needed)
If you try using the definition of O for functions of the type n^a(lgn)^b, with a(>=2) and b positive constants, you see that this is not a possible solution too by the Substitution Method.
In fact, the only function that allows a proof with the Substitution Method is exponential, but we know that this recursion doesn't grow as fast as T(n) = 2T(n-1) + n, so if T(n) = O(a^n), we can have a < 2.
Assume that T(m) <= c(a^m), for some constant c, real and positive. Our hypothesis is that this relation is valid for all m < n. Trying to prove this for n, we get:
T(n) <= (1/a+1/a^(n/2))c(a^n) + n
we can get rid of the n easily by changing the hypothesis by a term of lower order. What is important here is that:
1/a+1/a^(n/2) <= 1
a^(n/2+1)-a^(n/2)-a >= 0
Changing variables:
a^(N+1)-a^N-a >= 0
We want to find a bond as tight as possible, so we are searching for the lowest a possible. The inequality we found above accept solutions of a which are pretty close to 1, but is a allowed to get arbitrarily close to 1? The answer is no, let a be of the form a = (1+1/N). Substituting a at the inequality and applying the limit N -> INF:
e-e-1 >= 0
which is a absurd. Hence, the inequality above has some fixed number N* as maximum solution, which can be found computationally. A quick Python program allowed me to find that a < 1+1e-45 (with a little extrapolation), so we can at least be sure that:
T(n) = ο((1+1e-45)^n)
T(n)=T(n-1)+T(n/2)+n is the same as T(n)=T(n)+T(n/2)+n since we are solving for extremely large values of n. T(n)=T(n)+T(n/2)+n can only be true if T(n/2) + n = 0. That means T(n) = T(n) + 0 ~= O(n)

Algorithm Peer Review

So I am currently taking an algorithms class and have been asked to prove
Prove: ((n^2 / log n) + 10^5 n * sqrt(n)) / n^2 = O(n^2 / log n)
I have come up with n0 = 1 and c = 5 when solving it I end up with 1 <= 5 I just wanted to see if I could get someone to verify this for me.
I'm not sure if this is the right forum to post in, if it's wrong I apologize and if you could point me in the right direction to go to that would be wonderful.
If I am not wrong, you have to prove that the upper bound of the given function is n^2 logn.
Which can be the case if for very large values of n,
n^2/logn >= n * sqrt(n)
n >= sqrt(n) * log(n)
Since, log(n) < sqrt(n), log(n)*sqrt(n) will always be less than n. Hence, our inequality is correct. So, the upper bound is O(n^2/ logn).
You can use the limit method process:
Thus, the solution of your case should look like this:
Assuming functions f and g are increasing, by definition f(x) = O(g(x)) iff limit x->inf f(x)/g(x) is non-negative. If you substitute your functions for f and g, and simplify the expression, you will see that the limit trivially comes out to be 0.

Algorithm complexity, solving recursive equation

I'm taking Data Structures and Algorithm course and I'm stuck at this recursive equation:
T(n) = logn*T(logn) + n
obviously this can't be handled with the use of the Master Theorem, so I was wondering if anybody has any ideas for solving this recursive equation. I'm pretty sure that it should be solved with a change in the parameters, like considering n to be 2^m , but I couldn't manage to find any good fix.
The answer is Theta(n). To prove something is Theta(n), you have to show it is Omega(n) and O(n). Omega(n) in this case is obvious because T(n)>=n. To show that T(n)=O(n), first
Pick a large finite value N such that log(n)^2 < n/100 for all n>N. This is possible because log(n)^2=o(n).
Pick a constant C>100 such that T(n)<Cn for all n<=N. This is possible due to the fact that N is finite.
We will show inductively that T(n)<Cn for all n>N. Since log(n)<n, by the induction hypothesis, we have:
T(n) < n + log(n) C log(n)
= n + C log(n)^2
< n + (C/100) n
= C * (1/100 + 1/C) * n
< C/50 * n
< C*n
In fact, for this function it is even possible to show that T(n) = n + o(n) using a similar argument.
This is by no means an official proof but I think it goes like this.
The key is the + n part. Because of this, T is bounded below by o(n). (or should that be big omega? I'm rusty.) So let's assume that T(n) = O(n) and have a go at that.
Substitute into the original relation
T(n) = (log n)O(log n) + n
= O(log^2(n)) + O(n)
= O(n)
So it still holds.

Meaning of lg * N in Algorithmic Analysis

I'm currently reading about algorithmic analysis and I read that a certain algorithm (weighted quick union with path compression) is of order N + M lg * N. Apparently though this is linear because lg * N is a constant in this universe. What mathematical operation is being referred to here. I am unfamiliar with the notation lg * N.
The answers given here so far are wrong. lg* n (read "log star") is the iterated logarithm. It is defined as recursively as
0 if n <= 1
lg* n =
1 + lg*(lg n) if n > 1
Another way to think of it is the number of times that you have to iterate logarithm before the result is less than or equal to 1.
It grows extremely slowly. You can read more on Wikipedia which includes some examples of algorithms for which lg* n pops up in the analysis.
I'm assuming you're talking about the algorithm analyzed on slide 44 of this lecture:
http://www.cs.princeton.edu/courses/archive/fall05/cos226/lectures/union-find.pdf
Where they say "lg * N is a constant in this universe" I believe they aren't being entirely literal.
lg*N does appear to increase with N as per their table on the right side of the slide; it just happens to grow at such a slow rate that it can't be considered much else (N = 2^65536 -> log*n = 5). As such it seems they're saying that you can just ignore the log*N as a constant because it will never increase enough to cause a problem.
I could be wrong, though. That's simply how I read it.
edit: it might help to note that for this equation they're defining "lg*N" to be 2^(lg*(N-1)). Meaning that an N value of 2^(2^(65536)) [a far larger number] would give lg*N = 6, for example.
The recursive definition of lg*n by Jason is equivalent to
lg*n = m when 2 II m <= n < 2 II (m+1)
where
2 II m = 2^2^...^2 (repeated exponentiation, m copies of 2)
is Knuth's double up arrow notation. Thus
lg*2= 1, lg*2^2= 2, lg*2^{2^2}= 3, lg*2^{2^{2^2}} = 4, lg*2^{2^{2^{2^2}}} = 5.
Hence lg*n=4 for 2^{16} <= n < 2^{65536}.
The function lg*n approaches infinity extremely slowly.
(Faster than an inverse of the Ackermann function A(n,n) which involves n-2 up arrows.)
Stephen
lg is "LOG" or inverse exponential. lg typically refers to base 2, but for algorithmic analysis, the base usually doesnt matter.
lg n refers to log base n. It is the answer to the equation 2^x = n. In Big O complexity analysis, the base to log is irrelevant. Powers of 2 crop up in CS, so it is no surprise if we have to choose a base, it will be base 2.
A good example of where it crops up is a fully binary tree of height h, which has 2^h-1 nodes. If we let n be the number of nodes this relationship is the tree is height lg n with n nodes. The algorithm traversing this tree takes at most lg n to see if a value is stored in the tree.
As to be expected, wiki has great additional info.
Logarithm is denoted by log or lg. In your case I guess the correct interpretation is N + M * log(N).
EDIT: The base of the logarithm does not matter when doing asymptotic complexity analysis.

Resources