I'm trying to understand the algebra behind Big-O expressions. I have gone through several questions but still don't have a very clear idea how it's done.
When dealing with powers do we always omit the lower powers, for example:
O(10n^4-n^2-10) = O(10n^4)
What difference does it make when multiplication is involved? For example:
O(2n^3+10^2n) * O(n) = O(2n^3) ??
And finally, how do we deal with logs? For example:
O(n^2) + O(5*log(n))
I think we try to get rid of all constants and lower powers. I'm not sure how logarithms are involved in the simplification and what difference a multiplication sign would do. Thank you.
Big-O expressions are more closely related to Calculus, specifically limits, than they are to algebraic concepts/rules. The easiest way I've found to think about expressions like the examples you've provided, is to start by plugging in a small number, and then a really large number, and observe how the result changes:
Expression: O(10n^4-n^2-10)
use n = 2: O(10(2^4) - 2^2 - 10)
O(10 * 16 - 4 - 10) = 146
use n = 100: O(10(100^4) - 100^2- 10)
O(10(100,000,000) - 10,000 - 10) = 999,989,990
What you can see from this, is that the n^4 term overpowers all other terms in the expression. Therefore, this algorithm would be denoted as having a run-time of O(n^4).
So yes, your assumptions are correct: you should generally keep the highest power, drop constant coefficients, and drop the lower-order terms.
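Here is a quick Python check of that dominance (my own snippet, not part of the original answer):

# The full expression becomes indistinguishable from its leading term as n grows,
# which is why only the n^4 factor survives in the big-O.
for n in (2, 10, 100, 10_000):
    full = 10 * n**4 - n**2 - 10
    lead = 10 * n**4
    print(n, full / lead)   # ratio approaches 1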
Logarithms are effectively "undoing" exponentiation, so they grow very slowly. When a logarithmic term is added to a polynomial term like n^2, it gets overruled by the higher-order term. In the example you provided, if we again evaluate using real numbers:
Expression: O(n^2) + O(5*log(n))
use n=2: O(2^2) + O(5*log(2))
O(4) + O(3.4657) = 7.47
use n=100: O(100^2) + O(5*log(100))
O(10,000) + O(23.03) = 10,023
You will notice that although the logarithm term is growing, its gain is tiny compared to the increase in n, whereas the n^2 term still produces a massive increase. Because of this, the Big O of the combined expression still boils down to O(n^2).
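The same experiment in Python (again my own snippet) makes the point for the logarithm:

import math

# The log term's share of the total shrinks toward zero as n grows,
# so the n^2 term decides the big-O.
for n in (2, 100, 10_000, 1_000_000):
    total = n**2 + 5 * math.log(n)
    print(n, 5 * math.log(n) / total)   # fraction contributed by the log term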
If you're interested in further reading about the mathematics side of this, you may want to check out this post: https://secweb.cs.odu.edu/~zeil/cs361/web/website/Lectures/algebra/page/algebra.html
Related
So I have been reading the Cracking the Coding Interview book, and there is a problem where we have a function that does O(n * n * n!) work. The book then says this can be expressed as O((n+2)!). It says that, similarly, O(n * n!) can be expressed as O((n+1)!). I looked at all the rules of permutations and did not find any way to logically get there. My first step was: cool, I have O(n^2 * n!), now what? I don't know what steps to take next.
You already know (I think) that n! = 1*2*3*...*n.
So n*n*n! = 1*2*3*...*n*n*n.
As n gets really big, adding 1 or 2 to a factor has a decreasingly significant effect. I'm no specialist, but what matters with O() is either the power of n or, in our case, the number inside the (…)! expression.
Since n*n <= (n+1)*(n+2), we can bound this by 1*2*3*...*n*(n+1)*(n+2) = (n+2)!.
So, O(n*n*n!) can be expressed as O((n+2)!).
To calculate x! you compute x*(x-1)! recursively until you reach 1, so x! = x*(x-1)*(x-2)*...*1. Therefore x*x! just adds one more factor on top, and since that factor is at most x+1, we have x*x! <= (x+1)*x! = (x+1)!.
Similarly, x*x*x! <= (x+1)*(x+2)*x! = (x+2)!, hence O(n*n*n!) = O((n+2)!).
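A quick numeric check of the bound both answers rely on (my own Python snippet, not from the book):

import math

# n*n*n! never exceeds (n+2)!, and the ratio (n+2)!/(n*n*n!) = (n+1)(n+2)/n^2
# tends to 1, so the two expressions grow at the same rate.
for n in (2, 10, 100, 1000):
    lhs = n * n * math.factorial(n)
    rhs = math.factorial(n + 2)
    print(n, lhs <= rhs, rhs / lhs)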
There are plenty of questions around O(log n) and what it actually means, not trying to open that up again.
However on this particular answer https://stackoverflow.com/a/36877205/10894153, this image makes no sense to me:
That being said, seeing as that answer has over 100 upvotes and has been up for more than 2 years with no comments to indicate anything might be wrong, I assume I am misunderstanding something, hence asking here for clarification (I can't post comments because of low reputation).
Mainly, I don't understand why O(log(n)) is 10 when n == 1024. Shouldn't this be 32, seeing as 32^2 = 1024?
Clearly this has an effect on O(n * log(n)) as well, but I just need to understand why O(log(1024)) = 10.
The table is correct, except that the headings could be misleading because the values below them correspond to the expressions inside the big-O, rather than to the big-O themselves. But that is understandable because O-notations have the meaning of disregarding multiplicative constants.
Something similar happens with log(n). In this context, the log notation also disregards the base of the logarithm. That's fine here because:
log_b(n) = log_2(n)/log_2(b) ; see below why this is true
meaning that the function log_b() is just a multiplicative constant away, namely 1/log_2(b), from log_2().
And since the table is purposely emphasizing the fact that the big-O notation disregards multiplicative constants, it is fine to assume that all logs in it are 2-based.
In particular, O(log(1024)) can be interpreted as log_2(1024) which is nothing but 10 (because 2^10 = 1024).
To verify the equation above, we need to check that
log_2(n) = log_2(b)log_b(n)
By the definition of log, we have to check that n equals 2 raised to the right-hand side, i.e.,
n = 2^{log_2(b)log_b(n)}
but the right hand side is
(2^{log_2(b)})^{log_b(n)} = b^{log_b(n)} = n
again by definition (applied twice).
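A quick numeric check of both facts (my own Python snippet):

import math

# Changing the base of a logarithm only rescales it by a constant factor,
# and log base 2 of 1024 is exactly 10.
print(math.log2(1024))   # 10.0
n = 1024
for b in (2, 10, math.e):
    print(math.log(n, b), math.log2(n) / math.log2(b))   # same value twice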
Source: Google Code Jam. https://code.google.com/codejam/contest/10224486/dashboard#s=a&a=1
We're asked to calculate Prob(K successes from N trials) where each of the N trials has a known success probability of p_n.
Some Analysis and thoughts on the problem are given after the Code Jam.
They observe that evaluating all possible outcomes of your N trials would take you an exponential time in N, so instead they provide a nice "dynamic programming" style solution that's O(N^2).
Let P(p#q) = Prob(p Successes after the first q Trials)
Then observe the fact that:
P(p#q) = Prob(qth trial succeeds)*P(p-1#q-1) + Prob(qth trial fails)*P(p#q-1)
Now we can build up a table of P(i#j) where i<=j, for i = 1...N
That's all lovely - I follow all of this and could implement it.
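For reference, a direct implementation of that table might look like this (my own Python sketch, not from the official analysis):

def success_distribution(probs):
    # P[q][p] = Prob(p successes after the first q trials)
    n = len(probs)
    P = [[0.0] * (n + 1) for _ in range(n + 1)]
    P[0][0] = 1.0
    for q in range(1, n + 1):
        pq = probs[q - 1]   # success probability of the qth trial
        for p in range(q + 1):
            P[q][p] = (1 - pq) * P[q - 1][p]
            if p > 0:
                P[q][p] += pq * P[q - 1][p - 1]
    return P[n]   # P[n][k] = Prob(exactly k successes)

print(success_distribution([0.5, 0.5, 0.5]))   # [0.125, 0.375, 0.375, 0.125]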
Then as the last comment, they say:
In practice, in problems like this, one should store the logarithms of
probabilities instead of the actual values, which can become small
enough for floating-point precision errors to matter.
I think I broadly understand the point they're trying to make, but I specifically can't figure out how to use this suggestion.
Taking the above equation and substituting in some lettered variables:
P = A*B + C*D
If we want to work in Log Space, then we have:
Log(P) = Log(A*B + C*D),
where we have recursively pre-computed Log(B) and Log(D), and A & C are known, easily-handled decimals.
But I don't see any way to calculate Log(P) without just doing e^(Log(B)), etc., which feels like it would defeat the point of working in log space?
Does anyone understand in better detail what I'm supposed to be doing?
Starting from the initial relation:
P = A⋅B + C⋅D
Due to its symmetry we can assume that B is larger than D, without loss of generality.
The following processing is useful:
log(P) = log(A⋅B + C⋅D) = log(A⋅e^(log(B)) + C⋅e^(log(D))) = log(e^(log(B))⋅(A + C⋅e^(log(D) - log(B))))
log(P) = log(B) + log(A + C⋅e^(log(D) - log(B))).
This is useful because, in this case, log(B) and log(D) are both negative numbers (logarithms of some probabilities). It was assumed that B is larger than D, thus its log is closer to zero. Therefore log(D) - log(B) is still negative, but not as negative as log(D).
So now, instead of needing to perform exponentiation of log(B) and log(D) separately, we only need to perform exponentiation of log(D) - log(B), which is a mildly negative number. So the above processing leads to better numerical behavior than using logarithms and applying exponentiation in the trivial way, or, equivalently, than not using logarithms at all.
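A minimal sketch of this update in Python (my own code; the variable names follow the formula above):

import math

def log_add_scaled(A, logB, C, logD):
    # Returns log(A*B + C*D) given log(B) and log(D), without ever
    # exponentiating log(B) or log(D) on their own.
    if logD > logB:   # make sure B is the larger of the two probabilities
        A, logB, C, logD = C, logD, A, logB
    # log(A*B + C*D) = log(B) + log(A + C*exp(log(D) - log(B)))
    return logB + math.log(A + C * math.exp(logD - logB))

# Check against the direct computation with numbers that are still comfortable:
A, B, C, D = 0.3, 1e-5, 0.7, 2e-6
print(log_add_scaled(A, math.log(B), C, math.log(D)))
print(math.log(A * B + C * D))   # should match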
Why does:
O(3n)=O(2n)=O(n)
whereas their derivatives w.r.t. n are 3, 2 and 1 respectively?
Asymptotic notation has no relation to derivatives. It is a measure of how a function grows with respect to the size of n, i.e. it tells us how the function changes when we change n. If two functions change in the same manner when n is changed in the same way, we call them functions of the same order. For example, let f(n) = 3n^2 + n + 1 and g(n) = 5n^2 + 3n + 1.
If we double n, both functions roughly quadruple. Hence they are both of order O(n^2). We removed the constant coefficients (3 and 5) in Big-O notation because they contribute nothing to how the function grows w.r.t. n (in every case, the function quadruples). However, we didn't remove the constant 2 (the power, or exponent, of n), because it does contribute: had we removed it, the function would only double instead of quadruple, so we know it matters. Formally, we define Big-O notation as follows:
See here: Definition of Big O notation
f(n) = O(g(n)) if and only if f(n) <= c*g(n) for some c > 0 and for all n >= n0, for some n0
Now let me show you how O(3n)=O(2n)=O(n)
Informal Proof:
f(n) = 2n and g(n) = 3n both grow linearly w.r.t. n. By linearly, I mean that if we double/halve n, the function output also gets doubled/halved. It doesn't matter whether we change the coefficient to 2, 3, or 1000; it still grows linearly w.r.t. n. So that's why O(n) = O(2n) = O(3n). Notice that it is not about removing constants; it's about whether those constants contribute to telling us how the function grows w.r.t. n.
As a counter-example for this, Let's suppose
f(n) = 2^n and g(n) = 2^(2n)
We can't remove the 2 in the exponent, because that 2 is contributing: g(n) changes as the square of how f(n) changes. So f(n) = O(2^n) while g(n) = O(4^n).
Formal Proof:
Suppose n is sufficiently large. If f(n) = O(n), g(n) = O(2n) and h(n) = O(3n), then:
f(n) <= c1*n for some c1 > 0
g(n) <= 2*c2*n for some c2 > 0; letting c3 = 2*c2, we get g(n) <= c3*n, so g(n) = O(n), i.e. O(2n) = O(n)
Similarly, h(n) <= 3*c4*n, so h(n) = O(n), i.e. O(3n) = O(n)
Hence, O(3n) = O(2n) = O(n)
Final words: the key point is just to check how a function grows. After some practice you'll recognize patterns like a*n^p + b*n^(p-1) + ... + c = O(n^p), and many more. Read the CLRS book; I don't remember exactly, but I think Chapter 3 is dedicated to this concept.
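A tiny Python illustration of the "how does it scale when n doubles" idea (my own snippet):

f = lambda n: 3 * n**2 + n + 1
g = lambda n: 5 * n**2 + 3 * n + 1
for n in (10, 100, 1000):
    print(f(2 * n) / f(n), g(2 * n) / g(n))   # both ratios approach 4
for n in (10, 20, 30):
    print((2 ** (2 * n)) // (2 ** n) == 2 ** n)   # 2^(2n) is (2^n) squared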
I'm pretty sure that this is the right site for this question, but feel free to move it to some other stackexchange site if it fits there better.
Suppose you have a sum of fractions a1/d1 + a2/d2 + … + an/dn. You want to compute a common numerator and denominator, i.e., rewrite it as p/q. We have the formula
p = a1*d2*…*dn + d1*a2*d3*…*dn + … + d1*d2*…d(n-1)*an
q = d1*d2*…*dn.
What is the most efficient way to compute these things, in particular, p? You can see that if you compute it naïvely, i.e., using the formula I gave above, you compute a lot of redundant things. For example, you will compute d1*d2 n-1 times.
My first thought was to iteratively compute d1*d2, d1*d2*d3, … and dn*d(n-1), dn*d(n-1)*d(n-2), … but even this is inefficient, because you will end up computing multiplications in the "middle" twice (e.g., if n is large enough, you will compute d3*d4 twice).
I'm sure this problem could be expressed somehow using maybe some graph theory or combinatorics, but I haven't studied enough of that stuff to have a good feel for it.
And one note: I don't care about cancelation, just the most efficient way to multiply things.
UPDATE:
I should have known that people on stackoverflow would be assuming that these were numbers, but I've been so used to my use case that I forgot to mention this.
We cannot just "divide" out an from each term. The use case here is a symbolic system. Actually, I am trying to fix a function called .as_numer_denom() in the SymPy computer algebra system which presently computes this the naïve way. See the corresponding SymPy issue.
Dividing out things has some problems, which I would like to avoid. First, there is no guarantee that things will cancel. This is because mathematically, (a*b)**n != a**n*b**n in general (it holds if a and b are positive, but e.g. if a == b == -1 and n == 1/2, you get (a*b)**n == 1**(1/2) == 1 but (-1)**(1/2)*(-1)**(1/2) == I*I == -1). So I don't think it's a good idea to assume that dividing by an will cancel it in the expression (this may actually be unfounded; I'd need to check what the code does).
Second, I'd like to also apply this algorithm to computing the sum of rational functions. In this case, the terms would automatically be multiplied together into a single polynomial, and "dividing" out each an would involve applying the polynomial division algorithm. You can see that in this case you really do want to compute the most efficient multiplication in the first place.
UPDATE 2:
I think my fears for cancelation of symbolic terms may be unfounded. SymPy does not cancel things like x**n*x**(m - n) automatically, but I think that any exponents that would combine through multiplication would also combine through division, so powers should be canceling.
There is an issue with constants automatically distributing across additions, like:
In [13]: 2*(x + y)*z*(S(1)/2)
Out[13]:
z⋅(2⋅x + 2⋅y)
─────────────
2
But this is first a bug and second could never be a problem (I think) because 1/2 would be split into 1 and 2 by the algorithm that gets the numerator and denominator of each term.
Nonetheless, I still want to know how to do this without "dividing out" di from each term, so that I can have an efficient algorithm for summing rational functions.
Instead of adding up n quotients in one go I would use pairwise addition of quotients.
If things cancel out in partial sums then the numbers or polynomials stay smaller, which makes computation faster.
You avoid the problem of computing the same product multiple times.
You could try to order the additions in a certain way, to make canceling more likely (maybe add quotients with small denominators first?), but I don't know if this would be worthwhile.
If you start from scratch this is simpler to implement, though I'm not sure it fits as a replacement of the problematic routine in SymPy.
Edit: To make it more explicit, I propose to compute a1/d1 + a2/d2 + … + an/dn as (…(a1/d1 + a2/d2) + … ) + an/dn.
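A minimal Python sketch of that pairwise scheme, using Fraction as a stand-in for whatever quotient type you actually have (my own code, not a drop-in replacement for the SymPy routine):

from fractions import Fraction

def add_quotients(terms):
    # terms is a list of (numerator, denominator) pairs;
    # each step only combines two quotients, so partial sums can cancel and stay small.
    total = Fraction(0)
    for a, d in terms:
        total += Fraction(a, d)   # (…((a1/d1 + a2/d2) + a3/d3) + …) + an/dn
    return total.numerator, total.denominator

print(add_quotients([(1, 2), (1, 3), (1, 6)]))   # (1, 1), since 1/2 + 1/3 + 1/6 = 1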
Compute two new arrays:
The first contains partial products from the left: l[0] = 1, l[i] = l[i-1] * d[i] (so l[i] = d[1]*…*d[i])
The second contains partial products from the right: r[n+1] = 1, r[i] = d[i] * r[i+1] (so r[i] = d[i]*…*d[n])
In both cases, 1 is the multiplicative identity of whatever ring you are working in.
Then each of your terms on the top, t[i] = l[i-1] * a[i] * r[i+1]
This assumes multiplication is associative, but it need not be commutative.
As a first optimization, you don't actually have to create r as an array: you can do a first pass to calculate all the l values, and accumulate the r values during a second (backward) pass to calculate the summands. No need to actually store the r values since you use each one once, in order.
In your question you say that this computes d3*d4 twice, but it doesn't. It does multiply two different values by d4 (one a right-multiplication and the other a left-multiplication), but that's not exactly a repeated operation. Anyway, the total number of multiplications is about 4*n, vs. 2*n multiplications and n divisions for the other approach that doesn't work in non-commutative multiplication or non-field rings.
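A Python sketch of the scheme, with the suffix products accumulated during the backward pass as described (my own code, using 0-based indexing):

def numer_denom(a, d):
    # Prefix products from the left: l[i] = d[0] * … * d[i-1], with l[0] the identity.
    n = len(a)
    l = [1] * (n + 1)
    for i in range(n):
        l[i + 1] = l[i] * d[i]
    q = l[n]   # common denominator d[0] * … * d[n-1]
    # Accumulate the suffix products on the fly in a backward pass.
    p, r = 0, 1
    for i in range(n - 1, -1, -1):
        p += l[i] * a[i] * r   # term i: left part * a[i] * right part
        r = d[i] * r           # order matters if multiplication is non-commutative
    return p, q

print(numer_denom([1, 1, 1], [2, 3, 6]))   # (36, 36): 1/2 + 1/3 + 1/6 = 1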
If you want to compute p in the above expression, one way to do this would be to multiply together all of the denominators (in O(n), where n is the number of fractions), letting this value be D. Then, iterate across all of the fractions and for each fraction with numerator ai and denominator di, compute ai * D / di. This last term is equal to the product of the numerator of the fraction and all of the denominators other than its own. Each of these terms can be computed in O(1) time (assuming you're using hardware multiplication, otherwise it might take longer), and you can sum them all up in O(n) time.
This gives an O(n)-time algorithm for computing the numerator and denominator of the new fraction.
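For the numeric case, this is just a few lines of Python (my own sketch; it relies on exact division, which is what the symbolic use case is trying to avoid):

def numer_denom_division(a, d):
    D = 1
    for di in d:
        D *= di   # common denominator
    # Each term is the numerator times the product of all the other denominators.
    p = sum(ai * (D // di) for ai, di in zip(a, d))
    return p, D

print(numer_denom_division([1, 1, 1], [2, 3, 6]))   # (36, 36)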
It was also pointed out to me that you could manually sift out common denominators and combine those trivially without multiplication.