What is the complexity of sum of log functions - algorithm

I have an algorithm that has the following complexity in big O:
log(N) + log(N+1) + log(N+2) + ... + log(N+M)
Is it the same as log(N+M), since that is the largest term?
Or is it M*log(N+M), because it is a sum of M terms?

The important rules to know in order to solve it are:
Log a + Log b = Log ab, and
Log a - Log b = Log a/b
Add and subtract Log 2 + Log 3 + ... + Log(N-1) to the given sum.
This will give you Log 2 + Log 3 + ... + Log (N+M) - (Log 2 + Log 3 + ... + Log (N-1))
The first part will compute to Log ((N+M)!) and the part after the subtraction sign will compute to Log ((N-1)!)
Hence, this complexity comes to Log ( (N+M)! / (N-1)! ).
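As a quick numeric sanity check (my own sketch, not part of the original answer; the values of N and M below are arbitrary), the raw sum, the factorial form, and the looser (M+1)*log(N+M) bound can be compared directly:

import math

N, M = 1000, 500  # arbitrary illustrative values

# The raw sum log(N) + log(N+1) + ... + log(N+M)
raw_sum = sum(math.log(N + k) for k in range(M + 1))

# The closed form log((N+M)! / (N-1)!), via lgamma to avoid huge factorials
closed_form = math.lgamma(N + M + 1) - math.lgamma(N)

# The looser upper bound (M+1) * log(N+M)
loose_bound = (M + 1) * math.log(N + M)

print(raw_sum, closed_form, loose_bound)
# raw_sum and closed_form agree, and both stay below loose_bound.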
UPDATE after OP asked another good question in the comment:
If we have N + N^2 + N^3, it will reduce to just N^3 (the largest element), right? Why we can't apply the same logic here - log(N+M) - largest element?
If we have just two terms that look like Log(N) + Log(M+N), then we can combine them and say that the sum is definitely less than 2 * Log(M+N) and is therefore O(Log(M+N)).
However, if the number of terms being summed is itself related to the size of the largest term, then the calculation is not quite so straightforward.
For example, the sum of two Log N terms is O(Log N), while the sum of N terms of Log N each is not O(Log N) but O(N * Log N).
In the given summation, both the size of the terms and the number of terms depend on M and N, so we cannot state the complexity as Log(M+N); we can, however, write it as M * Log(M+N).
How? Each term in the summation is at most Log(M+N), and there are M + 1 such terms. Hence the sum is at most (M+1) * Log(M+N) and is therefore O(M * Log(M+N)).
Thus the bound O(M * Log(M+N)) is correct, but O(Log((N+M)! / (N-1)!)) is tighter.

If M does not depend on N and does not vary, then the complexity is O(log(N)).
For k such that 0 <= k <= M, with N >= M and N >= 2:
log(N+k) = log(N(1 + k/N)) = log(N) + log(1 + k/N) <= log(N) + log(2)
<= log(N) + log(N) = 2 log(N)
So
log(N) + log(N+1) + log(N+2) + ... + log(N+M) <= 2(M+1) log(N)
So the complexity in big O is O(log(N)).
To answer your questions:
1) Yes, because there is a fixed number of terms, all less than or equal to log(N+M).
2) In fact there are M + 1 terms (k runs from 0 to M).
Note that O((M+1) log(N+M)) is then O(log(N)).
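A small numeric illustration of that bound (my own sketch; the fixed value of M below is just an example):

import math

M = 10  # fixed; does not depend on N
for N in (100, 10_000, 1_000_000):
    s = sum(math.log(N + k) for k in range(M + 1))
    bound = 2 * (M + 1) * math.log(N)
    # The ratio s / log(N) stays close to the constant M + 1, so s = O(log N).
    print(N, round(s, 2), round(bound, 2), round(s / math.log(N), 2))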

Related

How do you find the complexity of an algorithm given the number of computations performed each iteration?

Say there is an algorithm with input of size n. On the first iteration, it performs n computations, then is left with a problem instance of size floor(n/2) - for the worst case. Now it performs floor(n/2) computations. So, for example, an input of n=25 would see it perform 25+12+6+3+1 computations until an answer is reached, which is 47 total computations. How do you put this into Big O form to find worst case complexity?
You just need to write the corresponding recurrence in a formal manner:
T(n) = T(n/2) + n = n + n/2 + n/4 + ... + 1 =
n(1 + 1/2 + 1/4 + ... + 1/n) < 2 n
=> T(n) = O(n)
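For a concrete check, here is a small sketch (mine, not from the original answer) that counts the computations described in the question and compares the total against the 2n bound:

def total_computations(n):
    # n + floor(n/2) + floor(n/4) + ... + 1, as described in the question
    total = 0
    while n >= 1:
        total += n
        n //= 2
    return total

print(total_computations(25))  # 25 + 12 + 6 + 3 + 1 = 47, as in the example
for n in (25, 1_000, 1_000_000):
    # The total always stays below 2n, consistent with T(n) = O(n).
    print(n, total_computations(n), 2 * n)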

Why is O(n/2 + 5 log n) O(log n) and not O(n log n)?

For n/2 + 5 log n, I would have thought the lower-order constants 5 and 2 would be dropped, thus leaving n log n.
Where am I going wrong?
Edit:
Thank you, I believe I can now correct my mistake:
O(n/2 + 5 log n) = O(n/2 + log n) = O(n + log n) = O(n)
n/2 + 5 log n <= n/2 + 5n <= 6n, for all n >= 1 (c = 6, n0 = 1)
Let us define the function f as follows for n >= 1:
f(n) = n/2 + 5*log(n)
This function is not O(log n); it grows more quickly than that. To show that, we can show that for any constant c > 0, there is a choice of n0 such that for n > n0, f(n) > c * log(n). For 0 < c <= 5, this is trivial, since f(n) > 5*log(n) by definition. For c > 5, we get
n/2 + 5*log(n) > c*log(n)
<=> n/2 > (c - 5)*log(n)
<=> (1/(2(c - 5))) * n/log(n) > 1
We can now note that the expression on the LHS is monotonically increasing for n > 1 and find the limit as n grows without bound using l'Hopital:
lim(n->infinity) (1/(2(c - 5))) * n/log(n)
= (1/(2(c - 5))) * lim(n->infinity) n/log(n)
= (1/(2(c - 5))) * lim(n->infinity) 1/(1/n)
= (1/(2(c - 5))) * lim(n->infinity) n
-> infinity
Using l'Hopital we find there is no limit as n grows without bound; the value of the LHS grows without bound as well. Because the LHS is monotonically increasing and grows without bound, there must be an n0 after which the value of the LHS exceeds the value 1, as required.
This all proves that f is not O(log n).
It is true that f is O(n log n). This is not hard to show at all: choose c = 5 + 1/2, and it is clear that
f(n) = n/2 + 5*log(n) <= n*log(n)/2 + 5*n*log(n) = (5 + 1/2)*n*log(n) for all n >= 2 (taking logs base 2).
However, this is not the best bound we can get for your function. Your function is actually O(n) as well. Choosing the same value for c as before, we need only notice that n > log(n) for all n >= 1, so
f(n) = n/2 + 5*log(n) <= n/2 + 5*n = (5 + 1/2)*n
So, f is also O(n). We can show that f(n) is Omega(n) which proves it is also Theta(n). That is left as an exercise but is not difficult to do either. Hint: what if you choose c = 1/2?
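A short numeric check of these claims (my own sketch, using natural logs): f(n)/n stays bounded while f(n)/log(n) grows without bound.

import math

def f(n):
    return n / 2 + 5 * math.log(n)

for n in (10, 1_000, 1_000_000):
    # f(n)/n approaches 1/2 (bounded), while f(n)/log(n) keeps growing,
    # matching f being Theta(n) but not O(log n).
    print(n, round(f(n) / n, 3), round(f(n) / math.log(n), 1))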
It's neither O(log n) nor O(n*log n). It'll be O(n), because for larger values of n, log(n) is much smaller than n, so the log term is dropped.
Consider n = 10000: 5*log(n), i.e. 5*log(10000) ≈ 46 (natural log), which is far less than n/2 (= 5000).

What is the complexity of multiple runs of an O(n log n) algorithm?

If the problem size is n, and every time an algorithm reduces the problem size by half, I believe the complexity is O(n log n), e.g. merge sort. So, basically you are running a (log n) algorithm (the comparison) n times...
Now the problem is, if I have a problem of size n. My algorithm is able to reduce the size by half in a run and each run takes O(n log n). What is the complexity in this case?
If the problem takes n steps at size n, plus an additional run at size floor(n/2) when n > 1, then it takes O(n) time in total: n + n/2 + n/4 + ... =~ 2n = O(n).
Similarly, if each run takes time O(n log n) and an additional run at size floor(n/2) when n > 1, the total time is O(n log n).
Since the size of the problem gets halved in each iteration and at each level the time taken is n log n, the recurrence relation is
T(n) = T(n/2) + n log n
Applying Master theorem,
Comparing with T(n) = a T(n/b) + f(n), we have a=1 and b=2.
Hence n^(log_b a) = n^(log_2 1)
= n^0 = 1.
Thus f(n) = n log n, which grows faster than n^(log_b a).
Applying Master theorem we get T(n) = Θ(f(n)) = Θ(n log n).
Hence the complexity is T(n) = Θ(n log n).
EDIT after comments:
If the size of the problem halves at every run, you'll have log(n) runs to complete it. If every run takes n*log(n) time, that is log(n) runs of n*log(n) each, so the total complexity will be:
O(n log(n)^2)
If I don't misunderstand the question, the first run completes in time proportional to n log n. The second run has only n/2 left, so it completes in (n/2) log(n/2), and so on.
For large n, which is what you assume when analyzing the time complexity, log(n/2) = log n - log 2 can be replaced by log n.
Summing over all steps:
log(n) * (n + n/2 + n/4 + ...) = 2n log(n), i.e. time complexity O(n log n)
In other words: the time complexity is the same as for your first/basic run; all the others together contribute "only" the same amount once more.
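A quick empirical sum (my own sketch) supports this: the total work over all the halving runs stays within a factor of two of the first run's n log n.

import math

def total_work(n):
    # n*log(n) + (n/2)*log(n/2) + (n/4)*log(n/4) + ... down to size 2
    total = 0.0
    while n > 1:
        total += n * math.log(n)
        n //= 2
    return total

for n in (2**10, 2**20):
    first_run = n * math.log(n)
    # The ratio stays below 2, consistent with an overall O(n log n) bound.
    print(n, round(total_work(n) / first_run, 3))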
I'm pretty sure that comes to O(n^2 log n). You create a geometric series of n + n/2 + n/4 + ... = 2n (for large n). But you ignore the coefficient and just get the n.
This is fine unless you mean the inner nlogn to be the same n value as the outer n.
Edit:
I think that what the OP means here is that on each run the inner n log n also gets halved. In other words,
n log n + (n/2) log(n/2) + (n/4) log(n/4) + ... + (n/2^(n-1)) log(n/2^(n-1))
If this is the case, then one thing to consider is that at some point
2^(n-1) > n
At that point the log breaks down (because the log of a number between 0 and 1 is negative). But you don't really need the log, since all that is left in those iterations is 1 operation. So from there on you are just adding 1s.
This occurs at iteration log n / log 2. So, for the first log n / log 2 iterations we have the sum as above, and after that it is just a sum of 1s:
n log n + (n/2) log(n/2) + (n/4) log(n/4) + ... + (n/2^(log n / log 2)) log(n/2^(log n / log 2)) + 1 + 1 + 1 + ... ((n - log n / log 2) times)
Unfortunately, this expression is not an easy one to simplify...

Drawing Recurrence Tree and Analysis

I am watching Intro to Algorithms (MIT) lecture 1. There's something like the below (analysis of merge sort):
T(n) = 2T(n/2) + O(n)
A few questions:
Why does the work at the bottom level become O(n)? It said that the boundary case may have a different constant ... but I still don't get it ...
It's said that total = cn(lg n) + O(n). Where does the O(n) part come from? The original O(n)?
Although this one's been answered a lot, here's one way to reason it:
If you expand the recursion you get:
t(n) = 2 * t(n/2) + O(n)
t(n) = 2 * (2 * t(n/4) + O(n/2)) + O(n)
t(n) = 2 * (2 * (2 * t(n/8) + O(n/4)) + O(n/2)) + O(n)
...
t(n) = 2^k * t(n / 2^k) + O(n) + 2*O(n/2) + ... + 2^k * O(n/2^k)
The above stops when 2^k = n. So, that means k = log_2(n).
That makes n / 2^k = 1 which makes the first part of the equality simple to express, if we consider t(1) = c (constant).
t(n) = n * c + O(n) + 2*O(n/2) + ... + (2^k * O(n / 2^k))
If we consider the sum of O(n) + .. + 2^k * O(n / 2^k) we can observe that there are exactly k terms, and that each term is actually equivalent to n. So we can rewrite it like so:
t(n) = n * c + {n + n + n + .. + n} <-- where n appears k times
t(n) = n * c + n * k
but since k = log_2(n), we have
t(n) = n * c + n * log_2(n)
And since in Big-Oh notation n * log_2(n) is equivalent to n * log n, and it grows faster than n * c, it follows that the Big-O of the closed form is:
O(n * log n)
I hope this helps!
EDIT
To clarify, your first question, regarding why work at the bottom becomes O(n) is basically because you have n unit operations that take place (you have n leaf nodes in the expansion tree, and each takes a constant c time to complete). In the closed-formula, the work-at-the-bottom is expressed as the first term in the sum: 2 ^ k * t(1). As I said above, you have k levels in the tree, and the unit operation t(1) takes constant time.
To answer the second question, the O(n) does not actually come from the original O(n); it represents the work at the bottom (see answer to first question above).
The original O(n) is the time complexity required to merge the two sub-solutions t(n/2). Since the time complexity of the merge operation is assumed to grow (or shrink) linearly with the size of the problem, at each level you have 2^level terms of O(n / 2^level) each, which together amount to one O(n) operation. Now, since you have k levels, the merge complexity for the initial problem is {O(n) at each level} * {number of levels}, which is essentially O(n) * k. Since k = log(n), it follows that the time complexity of the merge operation is O(n * log n).
Finally, when you examine all the operations performed, you see that the work at the bottom is less than the actual work performed to merge the solutions. Mathematically speaking, the work performed for each of the n items grows asymptotically slower than the work performed to merge the sub-solutions; put differently, for large values of n, the merge operation dominates. So in Big-Oh analysis, the formula becomes O(n * log(n)).
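To make the recursion-tree reasoning concrete, here is a small sketch of mine that evaluates T(n) = 2T(n/2) + n directly (with T(1) = 1 as the unit leaf cost, an assumption made only for illustration) and compares it with n*log2(n) + n:

import math
from functools import lru_cache

@lru_cache(maxsize=None)
def T(n):
    # Merge-sort style recurrence; unit cost at the leaves is assumed to be 1.
    if n <= 1:
        return 1
    return 2 * T(n // 2) + n

for k in (4, 10, 20):
    n = 2 ** k
    # For powers of two the two columns match exactly:
    # n*log2(n) is the merge work across the levels, the extra n is the leaf work.
    print(n, T(n), n * int(math.log2(n)) + n)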

Is log(n!) = Θ(n·log(n))?

I am to show that log(n!) = Θ(n·log(n)).
A hint was given that I should show the upper bound with n^n and show the lower bound with (n/2)^(n/2). This does not seem all that intuitive to me. Why would that be the case? I can definitely see how to convert n^n to n·log(n) (i.e. log both sides of an equation), but that's kind of working backwards.
What would be the correct approach to tackle this problem? Should I draw the recursion tree? There is nothing recursive about this, so that doesn't seem like a likely approach..
Remember that
log(n!) = log(1) + log(2) + ... + log(n-1) + log(n)
You can get the upper bound by
log(1) + log(2) + ... + log(n) <= log(n) + log(n) + ... + log(n)
= n*log(n)
And you can get the lower bound by doing a similar thing after throwing away the first half of the sum:
log(1) + ... + log(n/2) + ... + log(n) >= log(n/2) + ... + log(n)
= log(n/2) + log(n/2+1) + ... + log(n-1) + log(n)
>= log(n/2) + ... + log(n/2)
= n/2 * log(n/2)
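A quick numeric check of this sandwich (my own sketch; natural logs via lgamma, which does not affect the Θ statement):

import math

for n in (10, 100, 10_000):
    log_fact = math.lgamma(n + 1)         # ln(n!)
    upper = n * math.log(n)               # n * ln(n)
    lower = (n / 2) * math.log(n / 2)     # (n/2) * ln(n/2)
    # lower <= ln(n!) <= upper for every n >= 2, matching the bounds above.
    print(n, round(lower, 1), round(log_fact, 1), round(upper, 1))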
I realize this is a very old question with an accepted answer, but none of these answers actually use the approach suggested by the hint.
It is a pretty simple argument:
n! (= 1*2*3*...*n) is a product of n numbers each less than or equal to n. Therefore it is less than the product of n numbers all equal to n; i.e., n^n.
Half of the numbers -- i.e. n/2 of them -- in the n! product are greater than or equal to n/2. Therefore their product is greater than the product of n/2 numbers all equal to n/2; i.e. (n/2)^(n/2).
Take logs throughout to establish the result.
Sorry, I don't know how to use LaTeX syntax on stackoverflow..
See Stirling's Approximation:
ln(n!) = n*ln(n) - n + O(ln(n))
where the last 2 terms are less significant than the first one.
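A brief check of how quickly the approximation tightens (my own sketch):

import math

for n in (10, 1_000, 100_000):
    exact = math.lgamma(n + 1)        # ln(n!)
    stirling = n * math.log(n) - n    # the two leading terms of Stirling's formula
    # The relative gap shrinks as n grows; the O(ln n) correction becomes negligible.
    print(n, round(exact, 1), round(stirling, 1), round((exact - stirling) / exact, 4))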
For lower bound,
lg(n!) = lg(n) + lg(n-1) + ... + lg(n/2) + ... + lg(2) + lg(1)
>= lg(n/2) + lg(n/2) + ... + lg(n/2) + ((n-1)/2) lg 2
(drop the last term lg(1) = 0; replace the first n/2 terms with lg(n/2) and the remaining (n-1)/2 terms with lg(2), which will make the cancellation easier later)
= n/2 lg(n/2) + (n/2) lg 2 - 1/2 lg 2
= n/2 lg n - (n/2)(lg 2) + n/2 - 1/2
= n/2 lg n - 1/2
lg(n!) >= (1/2) (n lg n - 1)
Combining both bounds :
1/2 (n lg n - 1) <= lg(n!) <= n lg n
By choosing a lower-bound constant smaller than 1/2 (for example 1/4), we can compensate for the -1 inside the bracket.
Thus lg(n!) = Theta(n lg n)
Helping you further, where Mick Sharpe left you:
Its derivation is quite simple:
see http://en.wikipedia.org/wiki/Logarithm -> Group Theory
log(n!) = log(n * (n-1) * (n-2) * ... * 2 * 1) = log(n) + log(n-1) + ... + log(2) + log(1)
Think of n as infinitely big. What is infinity minus one? Or minus two? And so on.
log(inf) + log(inf) + log(inf) + ... = inf * log(inf)
And then think of inf as n.
Thanks, I found your answers convincing, but in my case I must use the Θ properties:
log(n!) = Θ(n·log n) => log(n!) = O(n log n) and log(n!) = Ω(n log n)
To verify the problem I found this page, where the whole process is explained: http://www.mcs.sdsmt.edu/ecorwin/cs372/handouts/theta_n_factorial.htm
http://en.wikipedia.org/wiki/Stirling%27s_approximation
Stirling's approximation might help you. It is really helpful when dealing with factorial problems involving huge numbers, on the order of 10^10 and above.
This might help:
e^(ln(x)) = x
and
(l^m)^n = l^(m*n)
If you reframe the problem, you can solve this with calculus! This method was originally shown to me via Arthur Breitman https://twitter.com/ArthurB/status/1436023017725964290.
First, take the integral of log(x) from 1 to n; it is n*log(n) - n + 1. This gives an upper bound of the right order, since log is monotonic: for every k, log(k) < the integral of log(x) from k to k+1, so summing gives log(n!) < the integral of log(x) from 1 to n+1, which is still of order n*log(n). You can similarly craft the lower bound using log(x-1): for every k, 1*log(k) > the integral of log(x) from k-1 to k, so log(n!) > the integral of log(x) from 0 to n-1, which is (n-1)*(log(n-1) - 1), or n*log(n-1) - n - log(n-1) + 1.
These are very tight bounds!
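A numeric illustration of those integral bounds (my own sketch; natural logs, and I integrate out to n+1 for the upper bound so the termwise inequality carries through):

import math

for n in (10, 1_000, 100_000):
    log_fact = math.lgamma(n + 1)                  # ln(n!)
    lower = (n - 1) * (math.log(n - 1) - 1)        # integral of ln(x) from 0 to n-1
    upper = (n + 1) * math.log(n + 1) - n          # integral of ln(x) from 1 to n+1
    # Both integrals sandwich ln(n!) and both are of order n*ln(n).
    print(n, round(lower, 1), round(log_fact, 1), round(upper, 1))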
