ratio of running times when N doubled in big O notation - algorithm

I learned that, using Big O notation,
O(f(n)) + O(g(n)) -> O(max(f(n), g(n)))
O(f(n)) * O(g(n)) -> O(f(n) g(n))
but now, I have this equation for running time T for input size N
T(N) = O(N^2) // O of N square
I need to find the ratio T(2N) / T(N)
I tried this
T(2N) / T(N) --> O((2N)^2) /O( N^2) --> 4
Is this correct? Or is the above division invalid?

I would also say this is incorrect. My intuition was that if T(N) is O(N^2), then T(2N)/T(N) is O(1), consistent with your suggestion that T(2N)/T(N) = 4. I think however that the intuition is wrong.
Consider a counter-example.
Let T(N) be 1 if N is odd, and N^2 if N is even, as below.
This is clearly O(N^2), because we can choose a constant p=1, such that T(N) ≤ pN^2 for sufficiently large N.
What is T(2N)? This is (2N)^2 = 4N^2, as below, because 2N is always even.
Hence T(2N)/T(N) is 4N^2/1 = 4N^2 when N is odd, and 4N^2/N^2=4 when N is even, as below.
Clearly T(2N)/T(N) is not 4. It is, however, O(N^2), because we can choose a constant q=4 such that T(2N)/T(N) ≤ qN^2 for sufficiently large N.
R code for the plots below
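x=1:50                               # input sizes N
t1=ifelse(x%%2==1, 1, x^2)           # T(N): 1 if N is odd, N^2 if N is even
plot(t1~x,type="l")
x2=2*x                               # doubled input sizes 2N (always even)
t2=ifelse(x2%%2==1, 1, x2^2)         # T(2N) = (2N)^2 = 4N^2
plot(t2~x,type="l")
ratio=t2/t1                          # T(2N)/T(N): 4N^2 for odd N, 4 for even N
plot(ratio~x,type="l")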
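The same counter-example can be checked numerically without plotting. This is only a sketch in Python (the function names `T` and `ratio` are mine, not from the answer): it shows the ratio jumping between 4 and 4N^2 while staying below the bound 4N^2.

```python
def T(n):
    # Running time from the counter-example: 1 for odd n, n^2 for even n.
    return 1 if n % 2 else n * n


def ratio(n):
    # 2n is always even, so T(2n) is always (2n)^2 = 4n^2.
    return T(2 * n) / T(n)


for n in range(1, 9):
    # Odd n gives 4n^2, even n gives 4.
    print(n, T(n), T(2 * n), ratio(n))
```

The ratio never settles on a single value, which is why "T(2N)/T(N) = 4" cannot be concluded from T(N) = O(N^2) alone.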
This problem is an interesting one and strikes me as belonging in the realm of pure mathematics, i.e. limits of sequences and the like. I am not trained in pure mathematics and I would be nervous of claiming that T(2N)/T(N) is always O(N^2), as it might be possible to invent some rather tortuous counter-examples.

Even if T(N) = Θ(N²) (big-theta) this doesn't work. (I'm not even going to talk about big-O.)
c1 * N² <= T(N) <= c2 * N²
c1 * 4 * N² <= T(2N) <= c2 * 4 * N²
T(N) = c_a * N² + f(N)
T(2N) = c_b * 4 * N² + g(N)
Here c_a and c_b lie somewhere between c1 and c2, and f(N) and g(N) are small-o of N². (Thanks G. Bach!) Nothing guarantees that the quotient equals 4, since c_a, c_b, f(N) and g(N) can each be all sorts of things. For example, take c_a = 1, c_b = 2 and f(N) = g(N) = 0. Divide them and you get
T(2N)/T(N) = (2 * 4 * N²)/N² = 8
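One concrete function that behaves this way, as a hedged illustration: the piecewise `T` below is my own hypothetical choice, picked so that c_a = 1 applies at odd N and c_b = 2 applies at the (always even) argument 2N. It stays between 1·N² and 2·N², so it is Θ(N²), yet the doubling ratio is 8 for odd N.

```python
def T(n):
    # Hypothetical running time: n^2 for odd n, 2*n^2 for even n.
    # It satisfies 1*n^2 <= T(n) <= 2*n^2, so it is Theta(n^2).
    return n * n if n % 2 else 2 * n * n


def ratio(n):
    # T(2n) always uses the "even" branch: 2*(2n)^2 = 8n^2.
    return T(2 * n) / T(n)


for n in range(1, 7):
    print(n, ratio(n))
```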

Good question!
This is incorrect.
Running time is not the same as time complexity (here, Big O). The time complexity only says that the running time cannot be worse than a constant times N^2 for large enough N. The actual running time can be quite different: maybe very low, maybe close to the asymptotic limit. That depends purely on the hidden constants. If someone asked you this question, it's a trick question.
Hidden constants refer to the actual number of primitive instructions carried out. So in this case the total number of operations could be:
5*N^2
or
1000*N^2.
or maybe
100*N^2+90N
or maybe just
100*N (Recall this is also O(N^2))
The factor depends on the implementation and the actual instructions carried out. It is found by analysing the algorithm.
So whichever the case we simply say the Big-O is O(N^2).
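To see how much the doubling ratio depends on which of these counts is the real one, here is a small sketch in Python. The four operation counts are the ones listed above; the dictionary and function names are just illustrative.

```python
# Candidate operation counts, all of which are O(N^2).
candidates = {
    "5*N^2": lambda n: 5 * n**2,
    "1000*N^2": lambda n: 1000 * n**2,
    "100*N^2 + 90*N": lambda n: 100 * n**2 + 90 * n,
    "100*N": lambda n: 100 * n,
}


def doubling_ratio(count, n):
    # Ratio of operation counts when the input size goes from n to 2n.
    return count(2 * n) / count(n)


for name, count in candidates.items():
    ratios = [round(doubling_ratio(count, n), 3) for n in (10, 1000, 100000)]
    print(name, ratios)
```

The pure quadratic counts give exactly 4, the mixed one only approaches 4 as N grows, and 100*N gives 2, even though every one of them is O(N^2).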

Related

What is the exact runtime in Big-O of those functions

I wanted to ask if someone could answer this questions and also check my solutions. I don't really get the Big-O of these functions.
e(n)= n! + 2^n
f(n)= log10(n) * n/2 + n
g(n) = n^2 + n * log(n)
h(n) = 2^30n * 4^log2(n)
I thought that:
e(n) n! because n! is exponentially rising.
f(n) n(log)n but I don't really know why
g(n) n^2
h(n) n
I would appreciate any answer. Thanks.
Big O notation lets us describe how a function grows with respect to some quantity as that quantity tends towards infinity.
The Big O of a function is determined by its fastest-growing constituent. Big O notation bounds the growth of a function from above.
For example, if a function f is said to be O(n), there exist some k and n0 such that f(n) <= k * n for all n >= n0.
We ignore all other constituents of the equation as we are looking at values that tend towards infinity. As n tends towards infinity the other constituents of the equation are drowned out.
We ignore any constants as we are describing the general relationship of the function as it tends towards some value. Constants make things harder to analyze.
e(n) = n! + 2^n. A factorial is the fastest growing constituent in the equation so the Big O notation is O(n!).
f(n) = log10(n) * n/2 + n. The fastest-growing constituent is log10(n) * n/2, so we say that the Big O notation is O(n log n). It does not matter what base of log we use, as we can convert between log bases by using a constant factor.
g(n) = n^2 + n * log(n). The Big O notation is O(n^2). This is because n^2 grows faster than n * log(n) with respect to n.
h(n) = (2^30) * n * 4^log2(n). Here 2^30 is a constant factor and can be ignored, but the 4 is the base of an exponent, not a constant factor, so it cannot be dropped. Since 4^log2(n) = (2^2)^log2(n) = (2^log2(n))^2 = n^2, we get h(n) = 2^30 * n * n^2 = 2^30 * n^3, so the time complexity is O(n^3).
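The dominant-term claims for e, f and g can be sanity-checked numerically. This is a rough sketch, not a proof: if the claimed bound is tight, the ratio of each function to its claimed class should stay bounded as n grows. The helper names are mine, and the natural log is used for the n log n comparison (the base only changes a constant factor).

```python
import math


def e(n):
    return math.factorial(n) + 2**n


def f(n):
    return math.log10(n) * n / 2 + n


def g(n):
    return n**2 + n * math.log(n)


for n in (10, 20, 40):
    # e(n) / n! tends to 1: the 2^n term is drowned out by n!.
    print("e", n, e(n) / math.factorial(n))

for n in (10, 10**3, 10**6):
    # f(n) / (n ln n) stays bounded; g(n) / n^2 tends to 1.
    print("f", n, f(n) / (n * math.log(n)))
    print("g", n, g(n) / n**2)
```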

The Mathematical Relationship Between Big-Oh Classes

My textbook describes the relationship as follows:
There is a very nice mathematical intuition which describes these classes too. Suppose we have an algorithm which has running time N0 when given an input of size n, and a running time of N1 on an input of size 2n. We can characterize the rates of growth in terms of the relationship between N0 and N1:
Big-Oh      Relationship
O(log n)    N1 ≈ N0 + c
O(n)        N1 ≈ 2N0
O(n²)       N1 ≈ 4N0
O(2ⁿ)       N1 ≈ (N0)²
Why is this?
That is because if f(n) is in O(g(n)) then it can be thought of as acting like k * g(n) for some k.
So for example if f(n) = O(log(n)) then it acts like k log(n), and now f(2n) ≈ k log(2n) = k (log(2) + log(n)) = k log(2) + k log(n) ≈ k log(2) + f(n) and that is your desired equation with c = k log(2).
Note that this is a rough intuition only. An example of where it breaks down is that f(n) = (2 + sin(n)) log(n) = O(log(n)). The oscillating 2 + sin(n) bit means that f(2n)-f(n) can be basically anything.
I personally find this kind of rough intuition to be misleading and therefore worse than useless. Others find it very helpful. Decide for yourself how much weight you give it.
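The breakdown with the oscillating example is easy to observe. The sketch below (Python, natural log, names chosen by me) evaluates f(2n) - f(n) for f(n) = (2 + sin(n)) log(n) over a range of n. If the "N1 ≈ N0 + c" intuition held, the differences would cluster around one constant; instead they swing widely and even go negative.

```python
import math


def f(n):
    # O(log n), but with an oscillating factor between 1 and 3.
    return (2 + math.sin(n)) * math.log(n)


def doubling_difference(n):
    return f(2 * n) - f(n)


diffs = [doubling_difference(n) for n in range(1000, 2000)]
print("smallest difference:", min(diffs))
print("largest difference:", max(diffs))
```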
Basically what they are trying to show is just basic algebra after substituting 2n for n in the functions.
O(log n)
log(2n) = log(2) + log(n)
N1 ≈ c + N0
O(n)
2n = 2(n)
N1 ≈ 2N0
O(n²)
(2n)^2 = 4n^2 = 4(n^2)
N1 ≈ 4N0
O(2ⁿ)
2^(2n) = 2^(n*2) = (2^n)^2
N1 ≈ (N0)²
Since O(f(n)) ~ k * f(n) (almost by definition), you want to look at what happens when you put 2n in for n. In each case:
N1 ≈ k*log 2n = k*(log 2 + log n) = k*log n + k*log 2 ≈ N0 + c where c = k*log 2
N1 ≈ k*(2n) = 2*k*n ≈ 2N0
N1 ≈ k*(2n)^2 = 4*k*n^2 ≈ 4N0
N1 ≈ k*2^(2n) = k*(2^n)^2 ≈ N0*2^n ≈ N0^2/k
So the last one is not quite right, anyway. Keep in mind that these relationships are only true asymptotically, so the approximations will be more accurate as n gets larger. Also, f(n) = O(g(n)) only means g(n) is an upper bound for f(n) for large enough n. So f(n) = O(g(n)) does not necessarily mean f(n) ~ k*g(n). Ideally, you want that to be true, since your big-O bound will be tight when that is the case.
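For the idealised case where the running time really is exactly k * g(n), the four relationships can be verified directly. A minimal sketch in Python, assuming an arbitrary constant k = 3 (my choice) and the natural log:

```python
import math

k = 3.0  # hypothetical hidden constant


def log_time(n):
    return k * math.log(n)


def linear_time(n):
    return k * n


def quadratic_time(n):
    return k * n**2


def exponential_time(n):
    return k * 2**n


n = 20
# O(log n): N1 - N0 is the constant c = k * log(2).
print(log_time(2 * n) - log_time(n), k * math.log(2))
# O(n): N1 / N0 = 2.
print(linear_time(2 * n) / linear_time(n))
# O(n^2): N1 / N0 = 4.
print(quadratic_time(2 * n) / quadratic_time(n))
# O(2^n): N1 = N0^2 / k, not exactly N0^2.
print(exponential_time(2 * n), exponential_time(n) ** 2 / k)
```

The last line shows the point made above: for the exponential case the relationship picks up a factor 1/k, so "N1 ≈ (N0)²" is only right up to that constant.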

Role of lower order terms in big O notation

In big O notation, we always say that we should ignore constant factors for most cases. That is, rather than writing,
3n^2-100n+6
we are almost always satisfied with
n^2
since that term is the fastest growing term in the equation.
But I found that many algorithm courses start by comparing functions with many terms
2n^2+120n+5 = big O of n^2
then finding c and n0 for those long functions, before recommending to ignore low order terms in the end.
My question is: what would I gain from trying to understand and analyse these kinds of functions with many terms? Until this month I was comfortable with understanding what O(1), O(n), O(log(n)), O(n^3) mean. But am I missing some important concepts if I just rely on these typically used functions? What will I miss if I skip analysing those long functions?
Let's first of all describe what we mean when we say that f(n) is in O(g(n)):
... we can say that f(n) is O(g(n)) if we can find a constant c such
that f(n) is less than c·g(n) for all n larger than n0, i.e., for all
n>n0.
In equation form: we need to find one set of constants (c, n0) that fulfils
f(n) < c · g(n), for all n > n0, (+)
Now, the result that f(n) is in O(g(n)) is sometimes presented in different forms, e.g. as f(n) = O(g(n)) or f(n) ∈ O(g(n)), but the statement is the same. Hence, from your question, the statement 2n^2+120n+5 = big O of n^2 is just:
f(n) = 2n^2 + 120n + 5
a result after some analysis: f(n) is in O(g(n)), where
g(n) = n^2
OK, with this out of the way, let's look at the constant term in the functions we want to analyse asymptotically, and let's look at it educationally, using your example.
Since the result of any big-O analysis is the asymptotic behaviour of a function, in all but some very unusual cases the constant term has no effect whatsoever on this behaviour. The constant term can, however, affect how we choose the constant pair (c, n0) used to show that f(n) is in O(g(n)) for some functions f(n) and g(n), i.e., the non-unique constant pair (c, n0) used to show that (+) holds. We can say that the constant term will have no effect on the result of our analysis, but it can affect our derivation of this result.
Let's look at your function as well as another related function
f(n) = 2n^2 + 120n + 5 (x)
h(n) = 2n^2 + 120n + 22500 (xx)
Using a similar approach as in this thread, for f(n), we can show:
linear term:
120n < n^2 for n > 120 (verify: 120n = n^2 at n = 120) (i)
constant term:
5 < n^2 for e.g. n > 3 (verify: 3^2 = 9 > 5) (ii)
This means that if we replace both 120n as well as 5 in (x) by n^2 we can state the following inequality result:
Given that n > 120, we have:
2n^2 + n^2 + n^2 = 4n^2 > {by (i) and (ii)} > 2n^2 + 120n + 5 = f(n) (iii)
From (iii), we can choose (c, n0) = (4, 120), and (iii) then shows that these constants fulfil (+) for f(n) with g(n) = n^2, and hence
result: f(n) is in O(n^2)
Now, for h(n), we analogously have:
linear term (same as for f(n))
120n < n^2 for n > 120 (verify: 120n = n^2 at n = 120) (I)
constant term:
22500 < n^2 for e.g. n > 150 (verify: 150^2 = 22500) (II)
In this case, we replace 120n as well as 22500 in (xx) by n^2, but we need a larger lower bound on n for both replacements to hold, namely n > 150. Hence, the following holds:
Given that n > 150, we have:
2n^2 + n^2 + n^2 = 4n^2 > {by (I) and (II)} > 2n^2 + 120n + 22500 = h(n) (III)
In same way as for f(n), we can, here, choose (c, n0) = (4, 150), and (III) then shows that these constants fulfil (+) for h(n), with g(n) = n^2, and hence
result: h(n) is in O(n^2)
Hence, we have the same result for both functions f(n) and h(n), but we had to use different constants (c,n0) to show these (i.e., somewhat different derivation). Note finally that:
Naturally the constants (c,n0) = (4,150) (used for h(n) analysis) are also valid to show that f(n) is in O(n^2), i.e., that (+) holds for f(n) with g(n)=n^2.
However, not the reverse: (c,n0) = (4,120) cannot be used to show that (+) holds for h(n) (with g(n)=n^2).
The core of this discussion is that:
As long as you look at sufficiently large values of n, you will be able to describe the constant terms in relations as constant < dominantTerm(n), where, in our example, we look at the relation with regard to dominant term n^2.
The asymptotic behaviour of a function will not (in all but some very unusual cases) depend on the constant terms, so we might as well skip looking at them at all. However, for a rigorous proof of the asymptotic behaviour of some function, we need to take into account also the constant terms.
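The choice of constants can be spot-checked by brute force. The following is only a sketch in Python: `holds` is a hypothetical helper of mine, and a finite check up to `n_max` is of course not a proof, just a quick way to see which pairs (c, n0) fail.

```python
def f(n):
    return 2 * n * n + 120 * n + 5


def h(n):
    return 2 * n * n + 120 * n + 22500


def holds(func, c, n0, n_max=10_000):
    # Check func(n) < c * n^2 for every integer n with n0 < n <= n_max.
    return all(func(n) < c * n * n for n in range(n0 + 1, n_max + 1))


print(holds(f, 4, 120))  # constants used for f(n)
print(holds(h, 4, 150))  # constants used for h(n)
print(holds(f, 4, 150))  # the h(n) constants also work for f(n)
print(holds(h, 4, 120))  # but the f(n) constants fail for h(n)
```

The last check fails already at n = 121, where 4n^2 = 58564 but h(121) = 66302.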
Ever have intermediate steps in your work? That is likely what this is: when you are computing a big O, chances are you don't already know for sure what the highest-order term is, so you keep track of all the terms and then determine which complexity class makes sense in the end. There is also something to be said for understanding why the lower-order terms can be ignored.
Take some graph algorithms like a minimum spanning tree or shortest path. Now, can you tell what the highest term will be just by looking at the algorithm? I know I couldn't, and so I'd trace through the algorithm and collect a bunch of terms.
If you want another example, consider Sorting Algorithms and whether you want to memorize all the complexities or not. Bubble Sort, Shell Sort, Merge Sort, Quick Sort, Radix Sort and Heap Sort are a few of the more common algorithms out there. You could either memorize both the algorithm and complexity or just the algorithm and derive the complexity from the pseudo code if you know how to trace them.

Confused on big O notation

According to this book, big O means:
f(n) = O(g(n)) means c · g(n) is an upper bound on f(n). Thus there exists some constant c such that f(n) is always ≤ c · g(n), for large enough n (i.e. , n ≥ n0 for some constant n0).
I have trouble understanding the following big O equation
3n^2 − 100n + 6 = O(n^2), because I choose c = 3 and 3n^2 > 3n^2 − 100n + 6;
How can 3 be a factor? In 3n^2 − 100n + 6, if we drop the low order terms -100n and 6, aren't 3n^2 and 3·n^2 the same? How to solve this equation?
I'll take the liberty to slightly paraphrase the question to:
Why do 3n^2 − 100n + 6 and n^2 have the same asymptotic complexity?
For that to be true, the definition should hold in both directions.
First:
3n^2 − 100n + 6 ≤ c · n^2
let c = 3.
Then for n ≥ 1 the inequality is always satisfied.
The other way around:
n^2 ≤ c · (3n^2 − 100n + 6)
let c = 1.
Then 2n^2 − 100n + 6 ≥ 0 describes a parabola opened upwards, therefore there is again some n0 after which the inequality is always satisfied.
Let's look at the definition you posted for f(n) in O(g(n)):
f(n) = O(g(n)) means c · g(n) is an upper bound on f(n). Thus there
exists some constant c such that f(n) is always ≤ c · g(n), for
large enough n (i.e. , n ≥ n0 for some constant n0).
So, we only need to find one set of constants (c, n0) that fulfils
f(n) < c · g(n), for all n > n0, (+)
but this set is not unique. I.e., the problem of finding the constants (c, n0) such that (+) holds is degenerate. In fact, if any such pair of constants exists, there exist infinitely many such pairs.
Note that here I've switched to strict inequalities, which is really only a matter of taste, but I prefer this latter convention. Now, we can re-state the Big-O definition in possibly more easy-to-understand terms:
... we can say that f(n) is O(g(n)) if we can find a constant c such
that f(n) is less than c·g(n) for all n larger than n0, i.e., for all
n>n0.
Now, let's look at your function f(n)
f(n) = 3n^2 - 100n + 6 (*)
Let's describe your function as a sum of its highest term and another function
f(n) = 3n^2 + h(n) (**)
h(n) = 6 - 100n (***)
We now study the behaviour of h(n) and f(n), respectively:
h(n) = 6 - 100n
what can we say about this expression?
=> if n > 6/100, then h(n) < 0, since 6 - 100*(6/100) = 0
=> h(n) < 0, given n > 6/100 (i)
f(n) = 3n^2 + h(n)
what can we say about this expression, given (i)?
=> if n > 6/100, then f(n) = 3n^2 + h(n) < 3n^2
=> f(n) < c*n^2, with c=3, given n > 6/100 (ii)
Ok!
From (ii) we can choose constant c=3, given that we choose the other constant n0 larger than 6/100. Let's choose the first integer that fulfils this: n0=1.
Hence, we've shown that (+) holds for the constant set (c,n0) = (3,1), and subsequently, f(n) is in O(n^2).
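The decomposition f(n) = 3n^2 + h(n) makes the bound easy to check mechanically. A small Python sketch (function names follow the ones used above; the range limit is an arbitrary choice of mine and a finite loop is only a sanity check, not a proof):

```python
def h(n):
    # Lower-order part of f(n).
    return 6 - 100 * n


def f(n):
    # f(n) = 3n^2 - 100n + 6, written as highest term plus h(n).
    return 3 * n**2 + h(n)


c, n0 = 3, 1
for n in range(n0, 6):
    # h(n) is negative, so f(n) stays strictly below c * n^2.
    print(n, h(n), f(n), c * n**2)
```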
For a reference on asymptotic behaviour, see e.g.
https://www.khanacademy.org/computing/computer-science/algorithms/asymptotic-notation/a/big-o-notation
y=3n^2 (top graph) vs y=3n^2 - 100n + 6
Consider the sketch above. By your definition, 3n^2 only needs to be bigger than 3n^2 - 100n + 6 for large enough n (i.e. , n ≥ n0 for some constant n0). Let that n0 = 5 in this case (it could be something a little smaller, but it's clear which graph is bigger by n=5 so we'll just go with that).
Clearly from the graph, 3n^2 >= 3n^2 - 100n + 6 in the range we've plotted. The only way for 3n^2 - 100n + 6 to get bigger than 3n^2 then is for it to grow more steeply.
But the gradients of 3n^2 and 3n^2 - 100n + 6 are 6n and 6n - 100 respectively, so 3n^2 - 100n + 6 can't grow more steeply, therefore must always be underneath.
So your definition holds - 3n^2 - 100n + 6 <= 3n^2 for all n>=5
I am not an expert, but this looks a lot similar to what we just had in our real analysis course.
Basically if you have something like f(n) = 3n^2 − 100n + 6, the "fastest growing" term "wins" over the other terms when you have really, really big n.
So in this case 3n^2 surpasses whatever 100n is, when n is really big.
Another example would be something like f(n) = n/n^2 or f(n) = n! + n^2.
The first one gets smaller, as n simply cannot "keep up" with n^2. In the second example n! clearly grows faster than n^2, so the answer for that should be O(n!), because the n^2 term stops mattering for big n. (This only works for sums: in a product such as n! * n^2 the factor n^2 cannot be dropped.)
And terms like +6, which have no n affecting them, are constants and matter even less, as they cannot grow even if n grows.
It is all about what happens when n is really big. If your n is 34934854385754385463543856, then n^2 is a hell of a lot bigger than 100n, because n^2 = n * n = 34934854385754385463543856 * 34934854385754385463543856.

Difference between Big-Theta and Big O notation in simple language

While trying to understand the difference between Theta and O notation I came across the following statement :
The Theta-notation asymptotically bounds a function from above and below. When
we have only an asymptotic upper bound, we use O-notation.
But I do not understand this. The book explains it mathematically, but it's too complex and gets really boring to read when I am really not understanding.
Can anyone explain the difference between the two using simple, yet powerful examples.
Big O is giving only upper asymptotic bound, while big Theta is also giving a lower bound.
Everything that is Theta(f(n)) is also O(f(n)), but not the other way around.
T(n) is said to be Theta(f(n)), if it is both O(f(n)) and Omega(f(n))
For this reason big-Theta is more informative than big-O notation, so if we can say something is big-Theta, it's usually preferred. However, it is harder to prove something is big Theta, than to prove it is big-O.
For example, merge sort is both O(n*log(n)) and Theta(n*log(n)), but it is also O(n^2), since n^2 is asymptotically "bigger" than n*log(n). However, it is NOT Theta(n^2), since the algorithm is NOT Omega(n^2).
Omega is an asymptotic lower bound. If T(n) is Omega(f(n)), it means that there is a constant C1 > 0 such that, from a certain n0 on, T(n) >= C1 * f(n). Whereas big-O says there is a constant C2 such that T(n) <= C2 * f(n).
All three (Omega, O, Theta) give only asymptotic information ("for large input"):
Big O gives upper bound
Big Omega gives lower bound and
Big Theta gives both lower and upper bounds
Note that this notation is not related to the best, worst and average cases analysis of algorithms. Each one of these can be applied to each analysis.
I will just quote from Knuth's TAOCP Volume 1 - page 110 (I have the Indian edition). I recommend reading pages 107-110 (section 1.2.11 Asymptotic representations)
People often confuse O-notation by assuming that it gives an exact order of Growth; they use it as if it specifies a lower bound as well as an upper bound. For example, an algorithm might be called inefficient because its running time is O(n^2). But a running time of O(n^2) does not necessarily mean that running time is not also O(n)
On page 107,
1^2 + 2^2 + 3^2 + ... + n^2 = O(n^4) and
1^2 + 2^2 + 3^2 + ... + n^2 = O(n^3) and
1^2 + 2^2 + 3^2 + ... + n^2 = (1/3) n^3 + O(n^2)
Big-Oh is for approximations. It allows you to replace ~ with an equals = sign. In the example above, for very large n, we can be sure that the quantity will stay below n^4 and n^3 and (1/3)n^3 + n^2 [and not simply n^2]
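The third statement can be made concrete with a few numbers. A hedged sketch in Python (helper names are mine): the exact sum is n^3/3 + n^2/2 + n/6, so the part left over after subtracting (1/3) n^3 should behave like a constant times n^2, which is what the O(n^2) term is hiding.

```python
def sum_squares(n):
    # 1^2 + 2^2 + ... + n^2, computed directly.
    return sum(i * i for i in range(1, n + 1))


def remainder_ratio(n):
    # (sum - n^3/3) / n^2; stays bounded if the remainder is O(n^2).
    return (sum_squares(n) - n**3 / 3) / n**2


for n in (10, 100, 1000):
    print(n, sum_squares(n), remainder_ratio(n))
```

The ratio approaches 1/2, so the remainder is bounded by a constant times n^2, while the full sum is also (more loosely) below n^3 and n^4.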
Big Omega is for lower bounds - An algorithm with Omega(n^2) will not be as efficient as one with O(N logN) for large N. However, we do not know at what values of N (in that sense we know approximately)
Big Theta is for exact order of Growth, both lower and upper bound.
I am going to use an example to illustrate the difference.
Let the function f(n) be defined as
if n is odd f(n) = n^3
if n is even f(n) = n^2
From CLRS
A function f(n) belongs to the set Θ(g(n)) if there exist positive
constants c1 and c2 such that it can be "sandwiched" between c1g(n)
and c2g(n), for sufficiently large n.
AND
O(g(n)) = {f(n): there exist positive constants c and n0 such that 0 ≤
f(n) ≤ cg(n) for all n ≥ n0}.
AND
Ω(g(n)) = {f(n): there exist positive constants c and n0 such that 0 ≤
cg(n) ≤ f(n) for all n ≥ n0}.
The upper bound on f(n) is n^3. So our function f(n) is clearly O(n^3).
But is it Θ(n^3)?
For f(n) to be in Θ(n^3) it has to be sandwiched between two functions, one forming the lower bound and the other the upper bound, both of which grow as n^3. While the upper bound is obvious, the lower bound cannot be n^3. The lower bound is in fact n^2; f(n) is Ω(n^2)
From CLRS
For any two functions f(n) and g(n), we have f(n) = Θ(g(n)) if and
only if f(n) = O(g(n)) and f(n) = Ω(g(n)).
Hence f(n) is not in Θ(n^3) while it is in O(n^3) and Ω(n^2)
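A quick numeric look at this piecewise function, sketched in Python: the upper bound n^3 and the lower bound n^2 both hold, but f(n)/n^3 shrinks towards 0 on even n, so no positive constant c1 can give c1 · n^3 ≤ f(n) for all large n.

```python
def f(n):
    # n^3 for odd n, n^2 for even n.
    return n**3 if n % 2 else n**2


for n in (10, 11, 1000, 1001):
    # The ratio to n^3 is 1 on odd n and 1/n on even n.
    print(n, f(n), f(n) / n**3, f(n) / n**2)
```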
If the running time is expressed in big-O notation, you know that the running time will not be slower than the given expression. It expresses the worst-case scenario.
But with Theta notation you also know that it will not be faster. That is, there is no best-case scenario where the algorithm will return faster.
This gives a more exact bound on the expected running time. However, for most purposes it is simpler to ignore the lower bound (the possibility of faster execution), since you are generally only concerned about the worst-case scenario.
Here's my attempt:
A function f(n) is O(g(n)) if and only if there exist constants c and n0 such that f(n) <= c*g(n) for all n >= n0.
Using this definition, could we say that the function f(n) = 2^(n+1) is O(2^n)?
In other words, does a constant 'c' exist such that 2^(n+1) <= c*(2^n)? Note that the second function (2^n) is the function inside the Big O in the above problem. This confused me at first.
So, then use your basic algebra skills to simplify that equation. 2^(n+1) breaks down to 2 * 2^n. Doing so, we're left with:
2 * 2^n <= c(2^n)
Now it's easy: the inequality holds for any value of c where c >= 2. So, yes, we can say that f(n) = 2^(n+1) is O(2^n).
Big Omega works the same way, except it evaluates f(n) >= c*g(n) for some constant 'c'.
So, simplifying the above functions the same way, we're left with (note the >= now):
2 * 2^n >= c(2^n)
So, the inequality holds for any c in the range 0 < c <= 2 (c has to be positive). So, we can say that f(n) = 2^(n+1) is Big Omega of (2^n).
Now, since BOTH of those hold, we can say the function is Big Theta (2^n). If either one failed for every constant 'c', then it would not be Big Theta.
The above example was taken from the Algorithm Design Manual by Skiena, which is a fantastic book.
Hope that helps. This really is a hard concept to simplify. Don't get hung up so much on what 'c' is, just break it down into simpler terms and use your basic algebra skills.
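The 2^(n+1) versus 2^n comparison can also be checked in a couple of lines. A minimal Python sketch (the function names are mine): the ratio is exactly 2 for every n, so c = 2 works as both the upper-bound and the lower-bound constant.

```python
def f(n):
    return 2 ** (n + 1)


def g(n):
    return 2**n


for n in (1, 10, 100):
    # f(n) / g(n) is always exactly 2, so f(n) <= 2*g(n) and f(n) >= 2*g(n).
    print(n, f(n) // g(n))
```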

Resources