Is log(n!) = Θ(n·log(n))? - algorithm

I am to show that log(n!) = Θ(n·log(n)).
A hint was given that I should show the upper bound with n^n and show the lower bound with (n/2)^(n/2). This does not seem all that intuitive to me. Why would that be the case? I can definitely see how to convert n^n to n·log(n) (i.e. take the log of both sides of an equation), but that's kind of working backwards.
What would be the correct approach to tackle this problem? Should I draw the recursion tree? There is nothing recursive about this, so that doesn't seem like a likely approach.

Remember that
log(n!) = log(1) + log(2) + ... + log(n-1) + log(n)
You can get the upper bound by
log(1) + log(2) + ... + log(n) <= log(n) + log(n) + ... + log(n)
= n*log(n)
And you can get the lower bound by doing a similar thing after throwing away the first half of the sum:
log(1) + ... + log(n/2) + ... + log(n) >= log(n/2) + ... + log(n)
= log(n/2) + log(n/2+1) + ... + log(n-1) + log(n)
>= log(n/2) + ... + log(n/2)
= n/2 * log(n/2)
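The two bounds above can be sanity-checked numerically. The following is a minimal Python sketch, not part of the original answer: the helper names log_factorial and sandwich are made up for illustration, and natural logs are used (the base only changes a constant factor).

```python
import math

def log_factorial(n):
    """Return ln(n!) by summing ln(k) for k = 1..n."""
    return sum(math.log(k) for k in range(1, n + 1))

def sandwich(n):
    """Return (lower, ln(n!), upper) using the two bounds derived above."""
    lower = (n / 2) * math.log(n / 2)   # (n/2) * log(n/2)
    upper = n * math.log(n)             # n * log(n)
    return lower, log_factorial(n), upper

for n in (10, 100, 1000):
    lower, exact, upper = sandwich(n)
    print(n, lower, exact, upper)
```

For each n the middle value should sit between the other two, which is what the Θ(n·log(n)) claim needs.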

I realize this is a very old question with an accepted answer, but none of these answers actually use the approach suggested by the hint.
It is a pretty simple argument:
n! (= 1*2*3*...*n) is a product of n numbers each less than or equal to n. Therefore it is at most the product of n numbers all equal to n; i.e., n^n.
At least half of the numbers in the n! product (about n/2 of them) are greater than or equal to n/2. Therefore their product, and hence n! itself, is at least the product of n/2 numbers all equal to n/2; i.e. (n/2)^(n/2).
Take logs throughout to establish the result.

Sorry, I don't know how to use LaTeX syntax on stackoverflow..

See Stirling's Approximation:
ln(n!) = n*ln(n) - n + O(ln(n))
where the last 2 terms are less significant than the first one.
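To see how small the O(ln(n)) correction is in practice, here is a short Python sketch. It is an illustration only: stirling_gap is a hypothetical helper name, and ln(n!) is computed with the standard library's math.lgamma(n + 1).

```python
import math

def stirling_gap(n):
    """Return ln(n!) - (n*ln(n) - n), i.e. the term Stirling's formula lumps into O(ln(n))."""
    return math.lgamma(n + 1) - (n * math.log(n) - n)

for n in (10, 10**3, 10**6):
    print(n, stirling_gap(n), math.log(n))
```

The gap stays positive and below ln(n) for these inputs, while n*ln(n) itself grows far faster.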

For the lower bound,
lg(n!) = lg(n) + lg(n-1) + ... + lg(n/2) + ... + lg(2) + lg(1)
Drop the last term lg(1) (= 0), replace each of the first n/2 terms by lg(n/2), and replace each of the last (n-1)/2 terms by lg(2), which makes the cancellation easier later:
lg(n!) >= lg(n/2) + lg(n/2) + ... + lg(n/2) + ((n-1)/2) lg(2)
= (n/2) lg(n/2) + (n/2) lg(2) - (1/2) lg(2)
= (n/2) lg(n) - (n/2) lg(2) + n/2 - 1/2
= (n/2) lg(n) - 1/2
so
lg(n!) >= (1/2) (n lg(n) - 1)
Combining both bounds:
(1/2) (n lg(n) - 1) <= lg(n!) <= n lg(n)
By choosing a lower-bound constant smaller than 1/2 we can compensate for the -1 inside the bracket: for example, (1/4) n lg(n) <= (1/2) (n lg(n) - 1) whenever n >= 2.
Thus lg(n!) = Theta(n lg(n))

Helping you further, where Mick Sharpe left you:
Its derivation is quite simple, at least heuristically:
see http://en.wikipedia.org/wiki/Logarithm -> Group Theory
log(n!) = log(n * (n-1) * (n-2) * ... * 2 * 1) = log(n) + log(n-1) + ... + log(2) + log(1)
Think of n as infinitely big. What is infinity minus one? Or minus two? And so on.
log(inf) + log(inf) + log(inf) + ... = inf * log(inf)
And then think of inf as n.

Thanks, I found your answers convincing but in my case, I must use the Θ properties:
log(n!) = Θ(n·log n) => log(n!) = O(n log n) and log(n!) = Ω(n log n)
To verify the problem I found this web page, where the whole process is explained: http://www.mcs.sdsmt.edu/ecorwin/cs372/handouts/theta_n_factorial.htm

http://en.wikipedia.org/wiki/Stirling%27s_approximation
Stirling approximation might help you. It is really helpful in dealing with problems on factorials related to huge numbers of the order of 10^10 and above.

This might help:
e^(ln(x)) = x
and
(l^m)^n = l^(m*n)

If you reframe the problem, you can solve this with calculus! This method was originally shown to me via Arthur Breitman https://twitter.com/ArthurB/status/1436023017725964290.
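First, take the integral of log(x) from 1 to n; it is n*log(n) - n + 1. Since log is monotonically increasing, for every integer k the integral of log(x) from k to k+1 is greater than log(k) * 1, and 1 * log(k) is greater than the integral of log(x) from k-1 to k. Summing the second inequality over k = 2, ..., n gives the lower bound log(n!) >= n*log(n) - n + 1, and summing the first over k = 1, ..., n gives the upper bound log(n!) <= (n+1)*log(n+1) - n, which is the integral of log(x) from 1 to n+1. You can also craft a slightly weaker lower bound using log(x-1): the integral of log(x) from 0 to n-1 is (n-1)*(log(n-1) - 1), or n*log(n-1) - n - log(n-1) + 1.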
These are very tight bounds!
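As a rough numerical check of the integral comparison, the sketch below assumes the standard sandwich for an increasing function: the integral of log(x) from 1 to n is at most log(n!), which is at most the integral from 1 to n+1. The function name integral_bounds is hypothetical, and natural logs are used.

```python
import math

def integral_bounds(n):
    """Return (lower, ln(n!), upper) from integrating ln(x) over [1, n] and [1, n+1]."""
    lower = n * math.log(n) - n + 1            # integral of ln(x) from 1 to n
    upper = (n + 1) * math.log(n + 1) - n      # integral of ln(x) from 1 to n+1
    return lower, math.lgamma(n + 1), upper

for n in (10, 100, 1000):
    print(n, integral_bounds(n))
```

The gap between the two integrals is only about ln(n), which is why this approach gives much tighter bounds than the n^n and (n/2)^(n/2) argument.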

Related

Calculating the Recurrence Relation T(n) = sqrt(n * T(sqrt(n)) + n)

I think the complexity of this recursion is O(n^(2/3)), by a change of variables and induction, but I'm not sure. Is this solution correct?
This is a fascinating recurrence and it does not solve to Θ(n). Rather, it appears to solve to Θ(n^(2/3)).
To give an intuition for why this isn't likely to be Θ(n), let's imagine that we're dealing with a really, really large value of n. Then since
T(n) = (nT(√n) + n)^(1/2)
under the assumption that T(√n) ≈ √n, we'd get that
T(n) = (n√n + n)^(1/2)
= (n^(3/2) + n)^(1/2)
≈ n^(3/4).
In other words, assuming that T(n) = Θ(n) would give us a different value of T(n) as n gets large.
On the other hand, let's assume that T(n) = Θ(n^(2/3)), so that T(√n) ≈ (√n)^(2/3) = n^(1/3). Then the same calculation gives us that
T(n) = (nT(√n) + n)^(1/2)
= (n · n^(1/3) + n)^(1/2)
≈ (n^(4/3))^(1/2)
= n^(2/3),
which is consistent with itself.
To validate this, I wrote a short program that printed out different values of T(n) given different inputs and plotted the results. Here's the version of T(n) that I wrote up:
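#include <math.h>

/* Evaluates T(n) = sqrt(n * T(sqrt(n)) + n), with base case T(n) = n for n <= 2. */
double T(double n) {
    if (n <= 2) return n;
    return sqrt(n * T(sqrt(n)) + n);
}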
I decided to use 2 as a base case, since repeatedly taking square roots will never let n drop to one. I also decided to use real-valued arguments rather than discrete integer values just to make the math easier.
If you plot the values of T(n), you get this curve:
[Figure: plot of T(n) versus n]
This doesn't look like what I'd expect from a linear plot. To figure out what this was, I plotted it on a log/log plot, which has the nice property that all polynomial functions get converted to straight lines whose slope is equal to the exponent. Here's the result:
[Figure: log/log plot of T(n) versus n]
I consulted my Handy Neighborhood Regression Software and asked it to determine the slope of this line. Here's what it gave back:
Slope: 0.653170918815869
R^2: 0.999942627574643
That's a very good fit, and the slope of 0.653 is pretty close to 2/3. So that's more empirical evidence supporting that the recurrence solves to Θ(n^(2/3)).
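The experiment can be approximated in a few lines of Python. This is only a sketch: it reuses the same recurrence and base case as the program above, but instead of a regression over many points it takes the slope of the secant between two points on the log/log plot. The name loglog_slope and the sample points 2^10 and 2^40 are arbitrary choices made for illustration.

```python
import math

def T(n):
    """Same recurrence and base case as the program above: T(n) = n for n <= 2."""
    if n <= 2:
        return n
    return math.sqrt(n * T(math.sqrt(n)) + n)

def loglog_slope(n_lo, n_hi):
    """Slope of the line through (log n, log T(n)) at the two given values of n."""
    rise = math.log(T(n_hi)) - math.log(T(n_lo))
    run = math.log(n_hi) - math.log(n_lo)
    return rise / run

print(loglog_slope(2.0**10, 2.0**40))
```

A two-point slope is cruder than a fitted line, but it should land in the same neighborhood as the regression result, a little below 2/3.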
All that's left to do now is to work out the math. We'll solve this recurrence using a series of substitutions.
First, I'm generally not that comfortable working with exponents in the way that this recurrence uses them, so let's take the log of both sides. (Throughout this exposition, I'll use lg n to mean log_2 n.)
lg T(n) = lg (nT(√n) + n)^(1/2)
= (1/2) lg (nT(√n) + n)
= (1/2) lg (T(√n) + 1) + (1/2) lg n
≈ (1/2) lg T(√n) + (1/2) lg n
Now, let's define S(n) = lg T(n). Then we have
S(n) = lg T(n)
≈ (1/2) lg T(√n) + (1/2) lg n
= (1/2) S(√n) + (1/2) lg n
That's a lot easier to work with, though we still have the problem of the recurrence shrinking by powers each time. To address this, let's do one more substitution, which is a fairly common one when working with these sorts of expressions. Let's define R(n) = S(2^n). Then we have that
R(n) = S(2^n)
≈ (1/2) S(√(2^n)) + (1/2) lg 2^n
= (1/2) S(2^(n/2)) + (1/2) n
= (1/2) R(n/2) + (1/2) n
Great! All that's left to do now is to solve R(n).
Now, there is a slight catch here. We could immediately use the Master Theorem to conclude that R(n) = Θ(n). The problem with this is that just knowing that R(n) = Θ(n) won't allow us to determine what T(n) is. Specifically, let's suppose that we just know R(n) = Θ(n). Then we could say that
S(n) = S(2^(lg n)) = R(lg n) = Θ(log n)
to get that S(n) = Θ(log n). However, we get stuck when trying to solve for T(n) in terms of S(n). Specifically, we know that
T(n) = 2^(S(n)) = 2^(Θ(log n)),
but we cannot go from this to saying that T(n) = Θ(n). The reason is that the hidden coefficient in the Θ(log n) is significant here. Specifically, if S(n) = k lg n, then we have that
2^(k lg n) = 2^(lg n^k) = n^k,
so the leading coefficient of the logarithm will end up determining the exponent on the polynomial. As a result, when solving R, we need to determine the exact coefficient of the linear term, which translates into the exact coefficient of the logarithmic term for S.
So let's jump back to R(n), which we know is
R(n) ≈ (1/2) R(n/2) + (1/2)n.
If we iterate this a few times, we see this pattern:
R(n) ≈ (1/2) R(n/2) + (1/2)n
≈ (1/2)((1/2) R(n/4) + (1/4)n) + (1/2)n
≈ (1/4) R(n/4) + (1/8)n + (1/2)n
≈ (1/4)((1/2) R(n/8) + (1/8)n) + (1/8)n + (1/2)n
≈ (1/8) R(n/8) + (1/32)n + (1/8)n + (1/2)n.
The pattern appears to be that, after k iterations, we get that
R(n) ≈ (1/2^k) R(n/2^k) + n(1/2 + 1/8 + 1/32 + 1/128 + ... + 1/2^(2k-1)).
This means we should look at the sum
(1/2) + (1/8) + (1/32) + (1/128) + ...
This is
(1/2)(1 + 1/4 + 1/16 + 1/64 + ... )
which, as the sum of a geometric series, solves to
(1/2)(4/3)
= 2/3.
Hey, look! It's the 2/3 we were talking about earlier. This means that R(n) works out to approximately (2/3)n + c for some constant c that depends on the base case of the recurrence. Therefore, we see that
T(n) = 2^(S(n))
= 2^(S(2^(lg n)))
= 2^(R(lg n))
≈ 2^((2/3) lg n + c)
= 2^(lg n^(2/3) + c)
= 2^c · 2^(lg n^(2/3))
= 2^c · n^(2/3)
= Θ(n^(2/3))
Which matches the theoretically predicted and empirically observed values from earlier.
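The coefficient 2/3 in R(n) can also be checked by simply iterating the simplified recurrence. In the sketch below the base case (R(n) = base once n <= 1) is an assumption made for illustration, since the derivation above only says that the constant c depends on the base case.

```python
def R(n, base=1.0):
    """Iterate R(n) = (1/2) R(n/2) + (1/2) n, with an assumed base case R(n) = base for n <= 1."""
    if n <= 1:
        return base
    return 0.5 * R(n / 2, base) + 0.5 * n

for k in (4, 10, 20):
    n = 2.0 ** k
    print(k, R(n) / n)   # ratio R(n)/n for n = 2^k
```

Changing base should only shift the result by a term that vanishes relative to n, so the ratio R(n)/n should approach 2/3 either way.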
This was a very fun problem to work through and I'll admit I'm surprised by the answer! I am a bit nervous, though, that I may have missed something when going from
lg T(n) = (1/2) lg (T(√n) + 1) + (1/2) lg n
to
lg T(n) ≈ (1/2) lg T(√n) + (1/2) lg n.
It's possible that this +1 term actually introduces some other term into the recurrence that I didn't recognize. For example, is there an O(log log n) term that arises as a result? That wouldn't surprise me, given that we have a recurrence that shrinks by a square root. However, I've done some simple data explorations and I'm not seeing any terms in there that look like there's a double log involved.
Hope this helps!
We know that:
T(n) = sqrt(n) * sqrt(T(sqrt(n)) + 1)
Hence:
T(n) < sqrt(n) * sqrt(T(sqrt(n)) + T(sqrt(n)))
1 is replaced by T(sqrt(n)). So,
T(n) < sqrt(2) * sqrt(n) * sqrt(T(sqrt(n)))
Now, to find an upper bound we need to solve the following recurrence relation:
G(n) = sqrt(2n) * sqrt(G(sqrt(n)))
To solve this, we need to expand it (suppose n = 2^{2^k}, so that k square roots bring n down to 2, and G(2) is a constant):
G(n) = (2n)^{1/2} * (2n^{1/2})^{1/4} * (2n^{1/4})^{1/8} * ... * G(2)^{1/2^k} =>
G(n) = 2^{1/2 + 1/4 + ... + 1/2^k} * n^{1/2 + 1/8 + 1/32 + ... + 1/2^{2k-1}} * G(2)^{1/2^k}
The power of 2 is less than 2^1 and G(2)^{1/2^k} is bounded, so only the exponent of n matters. If we take a factor 1/2 from 1/2 + 1/8 + 1/32 + ... + 1/2^{2k-1} we will have 1/2 * (1 + 1/4 + 1/16 + ... + 1/4^{k-1}).
As we know that 1 + 1/4 + 1/16 + ... + 1/4^{k-1} is a geometric series with a ratio 1/4, it is equal to 4/3 at infinity, so the exponent of n tends to 2/3. Therefore G(n) = Theta(n^{2/3}) and T(n) = O(n^{2/3}).
Notice that as sqrt(n) * sqrt(T(sqrt(n))) < T(n), we can show similar to the previous case that T(n) = Omega(n^{2/3}). It means T(n) = Theta(n^{2/3}).

Calculating the Recurrence Relation T(n)=T(n / [(log n)^2]) + Θ(1)

I tried to solve this problem for many hours and I think the solution is O(log n/[log (log n)^2]), but I'm not sure. Is this solution correct?
Expand the equation:
T(n) = (T(n/(log^2(n) * log^2(n/log^2(n)))) + Theta(1)) + Theta(1) =
T(n/(log^4(n) + 4 log^2(n) (loglog(n))^2 - 4 log^3(n) loglog(n))) + 2 * Theta(1)
We know n/(log^4(n) + 4 log^2(n) (log(log(n)))^2 - 4 log^3(n) log(log(n))) is greater than n/log^4(n) asymptotically. As you can see, each time n is divided by at most log^2(n). Hence, if we compute the number of times n can be divided by log^2(n) before reaching 1, it will be a lower bound for T(n).
Hence, the height of the expansion tree will be k such that
n = (log^2(n))^k = log^(2k)(n) => (take a log)
log(n) = 2k log(log(n)) => k = log(n)/(2 * log(log(n)))
Therefore, T(n) = Omega(log(n)/log(log(n))).
For the upper bound, as we know that the argument after the i-th step is less than n/log^i(n) (instead of dividing by log^2(n), we divide by log(n)), the number of times n can be divided by log(n) will be an upper bound for T(n). Hence, as:
n = log^k(n) => log(n) = k log(log(n)) => k = log(n) / log(log(n))
we can say T(n) = O(log(n) / log(log(n))).
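To get a feel for these bounds, the recursion depth can be counted directly. The Python sketch below rests on assumptions that are not in the answer: logs are taken base 2, the recursion is taken to stop once n <= 4, and the names steps and bounds are made up. It compares one sample depth with log(n)/(2 log(log(n))) and log(n)/log(log(n)); the bounds are asymptotic, so a single sample is only suggestive.

```python
import math

def steps(n):
    """Count how many times n can be replaced by n / (log2 n)^2 before n <= 4."""
    count = 0
    while n > 4:
        n /= math.log2(n) ** 2   # log2(n) > 2 here, so n shrinks by a factor > 4
        count += 1
    return count

def bounds(n):
    """Return (log n / (2 log log n), log n / log log n) with base-2 logs."""
    lg = math.log2(n)
    lglg = math.log2(lg)
    return lg / (2 * lglg), lg / lglg

n = 2.0 ** 64
print(steps(n), bounds(n))
```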

What is the complexity of sum of log functions

I have an algorithm that has the following complexity in big O:
log(N) + log(N+1) + log(N+2) + ... + log(N+M)
Is it the same as log(N+M) since it is the largest element?
OR is it M*log(N+M) because it is a sum of M elements?
The important rules to know in order to solve it are:
Log a + Log b = Log ab, and
Log a - Log b = Log a/b
Add and subtract Log 2, Log 3, ..., Log (N-1) to the given value.
This will give you Log 2 + Log 3 + ... + Log (N+M) - (Log 2 + Log 3 + ... + Log (N-1))
The first part will compute to Log ((N+M)!) and the part after the subtraction sign will compute to Log ((N-1)!)
Hence, this complexity comes to Log ( (N+M)! / (N-1)! ).
UPDATE after OP asked another good question in the comment:
If we have N + N^2 + N^3, it will reduce to just N^3 (the largest element), right? Why we can't apply the same logic here - log(N+M) - largest element?
If we have just two terms that look like Log(N) + Log(M+N), then we can combine them both and say that they will definitely be less than 2 * Log(M+N) and will therefore be O(Log (M+N)).
However, if there is a relation between the number of items being summed and the highest value among the items, the calculation is no longer so straightforward.
For example, the big O of the sum of 2 (Log N)'s is O(Log N), while the big O of the sum of N (Log N)'s is not O(Log N) but O(N * Log N).
In the given summation, both the values and the number of values depend on M and N, so we cannot put this complexity as Log(M+N). However, we can definitely write it as M * (Log (M+N)).
How? Each of the values in the given summation is less than or equal to Log(M + N), and there are M + 1 such values in total. Hence the sum of these values will be at most (M + 1) * (Log (M+N)) and therefore will be O(M * (Log (M+N))).
Thus, both answers are correct but O(Log ( (N+M)! / (N-1)! )) is a tighter bound.
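The identity and the looser bound can be compared numerically. In this Python sketch the function names are hypothetical, natural logs are used, and the factorial ratio is evaluated through math.lgamma so that large N and M do not overflow.

```python
import math

def log_sum(N, M):
    """Direct sum log(N) + log(N+1) + ... + log(N+M)."""
    return sum(math.log(N + k) for k in range(M + 1))

def closed_form(N, M):
    """log((N+M)! / (N-1)!) computed as lgamma(N+M+1) - lgamma(N)."""
    return math.lgamma(N + M + 1) - math.lgamma(N)

def loose_bound(N, M):
    """(M+1) * log(N+M): every one of the M+1 terms is at most log(N+M)."""
    return (M + 1) * math.log(N + M)

N, M = 50, 200
print(log_sum(N, M), closed_form(N, M), loose_bound(N, M))
```

The first two numbers should agree up to rounding, and the third should be somewhat larger.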
If M does not depend on N and does not vary, then the complexity is O(log(N)).
For k such that 0 <= k <= M, with N >= M and N >= 2,
log(N+k) = log(N(1+k/N)) = log(N) + log(1+k/N) <= log(N) + log(2)
<= log(N) + log(N) <= 2 log(N)
So
log(N) + log(N+1) + log(N+2) + ... + log(N+M) <= 2(M+1) log(N)
So the complexity in big O is: log(N)
To answer your questions:
1) Yes, because there is a fixed number of elements, all less than or equal to log(N+M)
2) In fact there are M + 1 elements (from 0 to M)
Note that, for fixed M, O((M+1)log(N+M)) is O(log(N)).

Is log(n!) = O((log(n))^2)?

I am practicing problems on asymptotic analysis and I am stuck with this problem.
Is log(n!) = O((log(n))^2) ?
I am able to show that
log(n!) = O(n*log(n))
(log 1 + log 2 + .. + log n <= log n + log n + ... + log n)
and
(log(n))^2 = O(n*log(n))
(log n <= n => (log n)^2 <= n*logn )
I am not able to proceed further. Any hint or intuition on how to proceed further? Thanks
According to Stirling's Approximation:
log(n!) = n*log(n) - n + O(log(n))
So clearly an upper bound for log(n!) is O(n log n).
A lower bound can be calculated by dropping the first half of the sum:
log(1) + ... + log(n/2) + ... + log(n) >= log(n/2) + ... + log(n)
>= log(n/2) + ... + log(n/2)
= n/2 * log(n/2)
So the lower bound is also n log n. Since n log n grows strictly faster than (log n)^2, the answer is clearly NO.
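A quick way to see that no constant c can make log(n!) <= c * (log(n))^2 hold for all large n is to watch the ratio of the two sides. The sketch below is illustrative only; ratio is a hypothetical helper and math.lgamma(n + 1) stands in for ln(n!).

```python
import math

def ratio(n):
    """Return ln(n!) / (ln n)^2; this would stay bounded if log(n!) were O((log n)^2)."""
    return math.lgamma(n + 1) / math.log(n) ** 2

for n in (10**3, 10**6, 10**9):
    print(n, ratio(n))
```

The ratio keeps growing, roughly like n / ln(n), which is consistent with the answer NO.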
I think I got the answer to my own question. We will prove the following facts:
1) n*log(n) is a tight bound for log(n!)
2) n*log(n) is an upper bound for (log(n))^2
3) n*log(n) is not a lower bound for (log(n))^2
For proof of (1) see this.
Proofs of (2) and (3) are provided in the question itself.
growth rate of log n < growth rate of n.
So growth rate of (log(n))^2 < growth rate of n*log(n).
So (log(n))^2 = o(n*log(n)). (Here I have used little-o to denote that the growth rate of n*log(n) is strictly greater than the growth rate of (log(n))^2.)
So the conclusion is that log(n!) = big-omega((log(n))^2), and log(n!) is not O((log(n))^2).
Correct me if I have made any mistake.

Tree sort: time complexity

Why is the average case time complexity of tree sort O(n log n)?
From Wikipedia:
Adding one item to a binary search tree is on average an O(log n)
process (in big O notation), so adding n items is an O(n log n)
process
But we don't each time add an item to a tree of n items. We start with an empty tree, and gradually increase the size of the tree.
So it looks more like
log1 + log2 + ... + logn = log (1*2*...*n) = log n!
Am I missing something?
The reason why O(log(n!)) = O(nlog(n)) is a two-part answer. First, expand log(n!),
log(1) + log(2) + ... + log(n)
We can both agree here that log(1), log(2), and all the terms up to log(n-1) are each less than log(n). Therefore, the following inequality can be made,
log(1) + log(2) + ... + log(n) <= log(n) + log(n) + ... + log(n)
Now the other half of the answer depends on the fact that half of the numbers from 1 to n are at least n/2. This means that log(n!) is at least n/2*log(n/2), which bounds the second half of the sum from below,
log(1) + log(2) + ... + log(n) >= log(n/2) + log(n/2) + ... + log(n/2)
The reason is that each of the (at least n/2) terms in the second half of the sum, log(n/2) + ... + log(n), is at least log(n/2), while the terms in the first half are all non-negative, so adding them back cannot bring the total below n/2*log(n/2).
So with these two inequalities, it can be proven that log(n!) = Θ(nlog(n)), and in particular O(log(n!)) = O(nlog(n)).
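The question's observation that the tree grows as items are added can be tested empirically. The following Python sketch is an illustration under stated assumptions: it uses a plain unbalanced binary search tree, random distinct keys with an arbitrarily chosen seed, and a made-up function name tree_sort_comparisons. It counts the key comparisons made during all insertions and divides by n*log2(n).

```python
import math
import random

def tree_sort_comparisons(keys):
    """Insert keys into an unbalanced BST and return the total number of key comparisons."""
    root = None                      # each node is a list: [key, left, right]
    comparisons = 0
    for key in keys:
        if root is None:
            root = [key, None, None]
            continue
        node = root
        while True:
            comparisons += 1
            side = 1 if key < node[0] else 2   # index of the child to descend into
            if node[side] is None:
                node[side] = [key, None, None]
                break
            node = node[side]
    return comparisons

rng = random.Random(0)               # fixed seed so the run is repeatable
n = 2000
keys = rng.sample(range(10 * n), n)  # n distinct keys in random order
ratio = tree_sort_comparisons(keys) / (n * math.log2(n))
print(ratio)
```

For random input the ratio should stay within a small constant, in line with the average case being O(n log n) even though early insertions go into a small tree. Already-sorted input degenerates the tree into a chain, which is the quadratic worst case.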
O(log(n!)) = O(nlog(n)).
https://en.wikipedia.org/wiki/Stirling%27s_approximation
(Answers must be 30 characters.)

Resources