O(log n) clarification - performance

There are plenty of questions around O(log n) and what it actually means; I'm not trying to open that up again.
However, in this particular answer, https://stackoverflow.com/a/36877205/10894153, the table shown in the image makes no sense to me:
That being said, seeing as that answer has over 100 upvotes, has been up for more than two years, and has no comments indicating anything might be wrong, I assume I am misunderstanding something, hence asking here for clarification (I can't post comments because of low reputation).
Mainly, I don't understand why O(log(n)) is 10 when n == 1024. Shouldn't this be 32, seeing as 32^2 = 1024?
Clearly this has an effect on O(n * log(n)) as well, but I just need to understand why O(log(1024)) = 10.

The table is correct, except that the headings could be misleading, because the values below them correspond to the expressions inside the big-O rather than to the big-O classes themselves. But that is understandable, because big-O notation disregards multiplicative constants anyway.
Something similar happens with log(n): the log is written without specifying its base. That's fine in this context because:
log_b(n) = log_2(n)/log_2(b) ; see below why this is true
meaning that the function log_b() is just a multiplicative constant away, namely 1/log_2(b), from log_2().
And since the table is purposely emphasizing the fact that the big-O notation disregards multiplicative constants, it is fine to assume that all logs in it are 2-based.
In particular, O(log(1024)) can be interpreted as log_2(1024) which is nothing but 10 (because 2^10 = 1024).
To verify the equation above, we need to check that
log_2(n) = log_2(b) * log_b(n)
By the definition of log, we have to check that n equals 2 raised to the right-hand side, i.e., that
n = 2^{log_2(b) * log_b(n)}
But the right-hand side is
(2^{log_2(b)})^{log_b(n)} = b^{log_b(n)} = n
again by the definition of log (applied twice).
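As a quick numeric check of the change-of-base identity and of log_2(1024) = 10, here is a small Python sketch (my own addition, not part of the original answer):
import math

# Change of base: log_b(n) = log_2(n) / log_2(b); different bases differ only
# by a multiplicative constant, which big-O disregards.
n, b = 1024, 10
print(math.log2(n))                 # 10.0, because 2**10 == 1024
print(math.log(n, b))               # log base 10 of 1024, about 3.0103
print(math.log2(n) / math.log2(b))  # same value, via the identity above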

Related

Why does O(n*n*n!) get simplified to O((n+2)!) when calculating big O

So I have been reading the Cracking the Coding Interview book, and there is a problem where we have a function that does O(n*n*n!) work. The book then says this can be expressed as O((n+2)!). It says that, similarly, O(n*n!) can be expressed as O((n+1)!). I looked at all the rules of permutations and did not find any way to logically get there. My first step was: cool, I have O(n^2 * n!), now what? I don't know what steps to take next.
You already know (I think) that n! = 1*2*3*...*n.
So n*n*n! = 1*2*3*...*n*n*n.
As n gets really big, bumping a factor up by 1 or 2 has a decreasingly significant effect. I'm no specialist, but what matters for O() is either the power of n or, in our case, the number inside the (...)! expression.
Since n <= n+1 and n <= n+2, we can bound 1*2*3*...*n*n*n by 1*2*3*...*n*(n+1)*(n+2) = (n+2)!.
Eventually, O(n*n*n!) can be expressed as O((n+2)!).
To calculate x! you compute x*(x-1)! recursively until you reach 1, so x! = x*(x-1)*(x-2)*...*1. Therefore x*x! = x*x*(x-1)*...*1 is at most (x+1)*x*(x-1)*...*1 = (x+1)!, i.e. one extra factor at the top of the product. Similarly, x*x*x! = x^2*x! is at most (x+2)*(x+1)*x*(x-1)*...*1 = (x+2)!, hence the O((n+2)!) bound.
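To make the bound concrete, here is a small Python check of my own (not part of either answer) that n*n*n! never exceeds (n+2)!:
from math import factorial

# n * n * n! <= (n+1) * (n+2) * n! = (n+2)!, so O(n*n*n!) can be written O((n+2)!).
for n in range(1, 10):
    value = n * n * factorial(n)
    bound = factorial(n + 2)
    assert value <= bound
    print(n, value, bound)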

How to simplify Big-O expressions

I'm trying to understand the algebra behind Big-O expressions. I have gone through several questions but still don't have a very clear idea how it's done.
When dealing with powers do we always omit the lower powers, for example:
O(10n^4-n^2-10) = O(10n^4)
What difference does it make when multiplication is involved? For example:
O(2n^3+10^2n) * O(n) = O(2n^3) ??
And finally, how do we deal with logs? For example:
O(n^2) + O(5*log(n))
I think we try to get rid of all constants and lower powers. I'm not sure how logarithms are involved in the simplification and what difference a multiplication sign makes. Thank you.
Big-O expressions are more closely related to calculus, specifically limits, than they are to algebraic concepts/rules. The easiest way I've found to think about expressions like the examples you've provided is to start by plugging in a small number, then a really large number, and observing how the result changes:
Expression: O(10n^4-n^2-10)
use n = 2: O(10(2^4) - 2^2 - 10)
O(10 * 16 - 4 - 10) = 146
use n = 100: O(10(100^4) - 100^2 - 10)
O(10(100,000,000) - 10,000 - 10) = 999,989,990
What you can see from this, is that the n^4 term overpowers all other terms in the expression. Therefore, this algorithm would be denoted as having a run-time of O(n^4).
So yes, your assumptions are correct: you should generally go with the highest power, drop constant factors, and drop lower-order terms.
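As a quick illustration of that rule, here is a small Python sketch (my own addition) showing that the ratio of the full expression to n^4 settles near the dropped constant 10:
# Dropping constants and lower-order terms: the full expression divided by n^4
# tends to the constant 10, so only the n^4 term matters for growth.
for n in (10, 100, 1000, 10_000):
    full = 10 * n**4 - n**2 - 10
    print(n, full / n**4)   # 9.989..., 9.9999..., approaching 10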
Logarithms are effectively "undoing" exponentiation, so a logarithmic term grows very slowly. When it is added to a higher-order term such as a polynomial, it gets overruled by that larger-order term. In the example you provided, if we again evaluate using real numbers:
Expression: O(n^2) + O(5*log(n))
use n = 2: O(2^2) + O(5*log(2))
O(4) + O(3.47) ≈ 7.47 (taking log to be the natural log)
use n = 100: O(100^2) + O(5*log(100))
O(10,000) + O(23.03) ≈ 10,023
You will notice that although the logarithmic term is increasing, its growth is tiny compared to the increase in n, whereas the n^2 term still produces a massive increase. Because of this, the combined expression still boils down to O(n^2).
If you're interested in further reading about the mathematics side of this, you may want to check out this post: https://secweb.cs.odu.edu/~zeil/cs361/web/website/Lectures/algebra/page/algebra.html
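The same plug-in-numbers idea can be scripted. Here is a small Python sketch (my own, not from the linked post) that evaluates both terms of the last example side by side:
import math

# n^2 versus 5*log(n): the quadratic term dominates as n grows, so the sum is O(n^2).
for n in (2, 100, 10_000, 1_000_000):
    quadratic = n ** 2
    logarithmic = 5 * math.log(n)   # natural log, matching the numbers above
    print(f"n={n:>9,}  n^2={quadratic:>15,}  5*log(n)={logarithmic:>10.2f}")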

Runtime of the following algorithm (example from Cracking the Coding Interview)

One of the problems in the Cracking the Coding Interview book asks for the run-time of the following algorithm, which prints the powers of 2 from 1 through n, inclusive:
int powersOf2(int n) {
    if (n < 1) {
        return 0;
    } else if (n == 1) {
        print(1);
        return 1;
    } else {
        int prev = powersOf2(n / 2);
        int curr = prev * 2;
        print(curr);
        return curr;
    }
}
The author answers that it runs in O(log n).
It makes perfect sense, but... n is the VALUE of the input! (pseudo-sublinear run-time).
Isn't it more correct to say that the run-time is O(m) where m is the length of input to the algorithm? (O(log(2^m)) = O(m)).
Or is it perfectly fine to simply say it runs in O(log n) without mentioning anything about pseudo-runtimes...
I am preparing for an interview, and wondering whether I need to mention that the run-time is pseudo-sublinear for questions like this that depend on value of an input.
I think the term that you're looking for here is "weakly polynomial," meaning "polynomial in the number of bits in the input, but still dependent on the numeric value of the input."
Is this something you need to mention in an interview? Probably not. Analyzing the algorithm and saying that the runtime is O(log n) describes the runtime perfectly as a function of the input parameter n. Taking things a step further and then looking at how many bits are required to write out the number n, then mentioning that the runtime is linear in the size of the input, is a nice flourish and might make an interviewer happy.
I'd actually be upset if an interviewer held it against you if you didn't mention this - this is the sort of thing you'd only know if you had a good university education or did a lot of self-studying.
When you say that an algorithm takes O(N) time, and it's not specified what N is, then it's taken to be the size of the input.
In this case, however, the algorithm is said to take O(n) time, where n identifies a specific input parameter. That is also perfectly OK, and is common when the size of the input isn't what you would practically want to measure against. You will also see complexity given in terms of multiple parameters, like O(|V|+|E|) for graph algorithms, etc.
To make things a bit more confusing, the input value n is a single machine word, and numbers that fit into 1 or 2 machine words are usually considered to be constant size, because in practice they are.
Since giving a complexity in terms of the size of n is therefore not useful in any way, if you were asked to give a complexity without any specific instructions of how to measure the input size, you would measure it in terms of the value of n, because that is the useful way to do it.
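A quick way to see both points at once is to count the recursive calls and compare them with the bit-length of n. This is a Python sketch of my own (not the book's code; the function and counter names are mine), with the printing dropped so only the counts show:
# Count recursive calls of the powers-of-2 algorithm.
def powers_of_2(n, counter):
    counter[0] += 1
    if n < 1:
        return 0
    if n == 1:
        return 1
    prev = powers_of_2(n // 2, counter)
    return prev * 2

for n in (8, 1000, 1_000_000):
    calls = [0]
    powers_of_2(n, calls)
    print(n, calls[0], n.bit_length())   # calls grow like log2(n), i.e. linearly in the bits of n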

Order of growth rate in increasing order

Arrange the following functions in increasing order of growth rate
(with g(n) following f(n) in your list if and only if f(n)=O(g(n))).
a)2^log(n)
b)2^2log(n)
c)n^5/2
d)2^n^2
e)n^2 log(n)
So I think the answer, in increasing order, is
CEDAB
Is that correct? I am confused about options A and B.
I think option A should be in first place (the smallest one, I mean), so please help me with how to solve this.
I faced this question in an assignment of the Algorithms, Part I course (Coursera).
Firstly, any positive power of n eventually exceeds log n (here, the extra n^(1/2) factor in C beats the log n in E), so E comes before C, not after.
Also, D comes after every other function, as either interpretation of 2^n^2 (it could be 2^(n^2) or (2^n)^2 = 2^(2n); I could be wrong in ignoring BIDMAS though...) is an exponential in n itself.
Taking log to be base a, some arbitrary constant:
a) 2^(log_a(n)) = n^(log_a(2))
b) 2^(2*log_a(n)) = n^(2*log_a(2))
Thus, unfortunately, the actual order depends on the value of a: e.g. if log_a(2) is greater than 2 (i.e. if a < sqrt(2)), then A comes after E, otherwise before. Curiously, the base of the log term in E is irrelevant (E still maintains its place).
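A quick numeric check of that base dependence (a Python sketch of my own): 2^(log_a(n)) equals n^(log_a(2)), so the exponent on n, and hence where A lands, hinges on the base a:
import math

# 2^(log_a(n)) == n^(log_a(2)) for any base a, so the exponent on n depends on a.
n = 10_000
for a in (2, math.e, 10, 1.2):
    lhs = 2 ** math.log(n, a)
    rhs = n ** math.log(2, a)
    print(f"a={a}: 2^log_a(n)={lhs:.4g}, n^log_a(2)={rhs:.4g}, exponent={math.log(2, a):.3f}")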
The answer is aecbd
The easiest way to see why is to create a table with different values of n and compare them (a sketch of such a table follows after these points). But here is some intuition:
a grows more slowly than all of the others, in particular c, because it has a log term in its exponent rather than the term itself
e is roughly a with an n^2 factor multiplied in, which is better than having that factor in an exponent
b is a double exponent, but still better than d, which has a quadratic power in its exponent
d is the obvious worst because it grows exponentially with a quadratic exponent!
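Here is one way to build such a table, as a Python sketch of my own, assuming base-2 logs and reading (b) as 2^(2^log(n)) = 2^n, which is the interpretation this answer appears to use; to keep 2^(n^2) printable it works with log10 of each function:
import math

# Compare growth rates via log10 of each function (so 2^(n^2) stays printable).
def log10_values(n):
    lg = math.log2(n)
    return {
        "a": lg * math.log10(2),        # 2^log2(n) = n
        "e": math.log10(n * n * lg),    # n^2 * log2(n)
        "c": 2.5 * math.log10(n),       # n^(5/2)
        "b": n * math.log10(2),         # 2^(2^log2(n)) = 2^n
        "d": n * n * math.log10(2),     # 2^(n^2)
    }

for n in (64, 256, 1024):
    vals = log10_values(n)
    print(n, sorted(vals, key=vals.get))   # -> ['a', 'e', 'c', 'b', 'd'] each time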

Showing that a recurrence relation is O(n log n)

Show that T(n) = T(xn) + T((1 − x)n) + n is O(n log n),
where x is a constant in the range 0 < x < 1. Is the asymptotic complexity the same when x = 0.5, 0.1 and 0.001?
What happens to the constant hidden in the O() notation? Use the substitution method.
I'm trying to use the example on ~page 15 but I find it weird that in that example the logs change from default base to base 2.
I also do not really understand why it needed to be simplified so much just to remove cnlog2n from the left side. Couldn't that be done in the first step, so that the left side would just be "stuff - cnlog2n <= 0", and then evaluated for any c and n like so?
From what I tried it could not prove that T(n)=O(n)
Well, if you break this up into a recursion tree, each level has a constant total "amount" to calculate. You know this because x + (1 − x) = 1.
Thus, the time depends on the number of levels in the tree, which is logarithmic, since the pieces shrink by a constant factor each time. Since you do O(n) work per level, your overall complexity is O(n log n).
I expect this will be a little more complicated to "prove". Remember that it doesn't matter what base your logs are in, they're all just some constant factor. Refer to logarithmic relations for this.
PS: Looks like homework. Think harder yourself!
This seems to me to be exactly the recurrence for the average case of Quicksort.
You should look at the CLRS explanation of "Balanced Partitioning".
but I find it weird that in that example the logs change from default base to base 2
That is indeed weird. The simple fact is that it's easier to prove with base 2 than with an unknown base. For example, log(2n) = 1 + log(n) in base 2, which is a bit easier. You don't have to use base 2; you can pick any base you want, or even use base x. To be absolutely correct, the induction hypothesis must have the base in it: T(K) <= c*K*log_2(K). You can't change the IH later on, so what is happening now is not correct in a strict sense. You're free to pick any IH you like, so just pick one that makes the proof easier: in this case, logs with base 2.
the left side would just have "stuff-cnlog2n<=0" then evaulated for any c and n
What do you mean by 'evaluated for any c and n'? stuff - cnlog2n <= 0 is correct, but how do you prove that there is a c such that it holds for all n? Okay, c = 2 is a good guess. To prove it like that in WolframAlpha, you would need to check stuff <= 0 where c=2, n=1 (OK!), stuff <= 0 where c=2, n=2 (OK!), stuff <= 0 where c=2, n=3 (OK!), ..., etc., taking n all the way to infinity. Hmm, it will take you an infinite amount of time to check all of them... The only practical way (that I can think of right now) of solving this is to simplify stuff - cnlog2n <= 0. Or maybe you prefer this argument: you won't have WolframAlpha at your exam, so you must simplify.
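Not a proof (the question asks for the substitution method), but a quick numerical sanity check, as a Python sketch of my own: evaluate the recurrence directly and compare it with n*log2(n) for the three values of x; the ratio stays bounded for each fixed x, and that bound, the constant hidden in the O(), grows as x moves away from 1/2:
import math

# T(n) = T(x*n) + T((1-x)*n) + n, with T(n) = 0 for n < 2.
def T(n, x):
    total, stack = 0.0, [n]          # explicit stack instead of recursion (depth gets large for tiny x)
    while stack:
        m = stack.pop()
        if m < 2:
            continue
        total += m
        stack.append(x * m)
        stack.append((1 - x) * m)
    return total

n = 10_000   # kept modest: the x = 0.001 tree already has millions of nodes
for x in (0.5, 0.1, 0.001):
    print(f"x = {x:<6}  T(n) / (n log2 n) = {T(n, x) / (n * math.log2(n)):.2f}")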

Resources