Understanding Big(O) in loops - algorithm

I am trying to get the correct Big-O of the following code snippet:
s = 0
for x in seq:
for y in seq:
s += x*y
for z in seq:
for w in seq:
s += x-w
According to the book I got this example from (Python Algorithms), they explain it like this:
The z-loop is run for a linear number of iterations, and
it contains a linear loop, so the total complexity there is quadratic, or Θ(n2). The y-loop is clearly Θ(n).
This means that the code block inside the x-loop is Θ(n + n2). This entire block is executed for each
round of the x-loop, which is run n times. We use our multiplication rule and get Θ(n(n + n2)) = Θ(n2 + n3)
= Θ(n3), that is, cubic.
What I don't understand is: how could O(n(n+n2)) become O(n3). Is the math correct?

The math being done here is as follows. When you say O(n(n + n2)), that's equivalent to saying O(n2 + n3) by simply distributing the n throughout the product.
The reason that O(n2 + n3) = O(n3) follows from the formal definition of big-O notation, which is as follows:
A function f(n) = O(g(n)) iff there exists constants n0 and c such that for any n ≥ n0, |f(n)| ≤ c|g(n)|.
Informally, this says that as n gets arbitrary large, f(n) is bounded from above by a constant multiple of g(n).
To formally prove that n2 + n3 is O(n3), consider any n ≥ 1. Then we have that
n2 + n3 ≤ n3 + n3 = 2n3
So we have that n2 + n3 = O(n3), with n0 = 1 and c = 2. Consequently, we have that
O(n(n + n2)) = O(n2 + n3) = O(n3).
To be truly formal about this, we would need to show that if f(n) = O(g(n)) and g(n) = O(h(n)), then f(n) = O(h(n)). Let's walk through a proof of this. If f(n) = O(g(n)), there are constants n0 and c such that for n ≥ n0, |f(n)| ≤ c|g(n)|. Similarly, since g(n) = O(h(n)), there are constants n'0, c' such that for n ≥ n'0, g(n) ≤ c'|h(n)|. So this means that for any n ≥ max(c, c'), we have that
|f(n)| ≤ c|g(n)| ≤ c|c'h(n)| = c x c' |h(n)|
And so f(n) = O(h(n)).
To be a bit more precise - in the case of the algorithm described here, the authors are saying that the runtime is Θ(n3), which is a stronger result than saying that the runtime is O(n3). Θ notation indicates a tight asymptotic bound, meaning that the runtime grows at the same rate as n3, not just that it is bounded from above by some multiple of n3. To prove this, you would also need to show that n3 is O(n2 + n3). I'll leave this as an exercise to the reader. :-)
More generally, if you have any polynomial of order k, that polynomial is O(nk) using a similar argument. To see this, let P(n) = ∑i=0k(aini). Then, for any n ≥ 1, we have that
∑i=0k(aini) ≤ ∑i=0k(aink) = (∑i=0k(ai))nk
so P(n) = O(nk).
Hope this helps!

n(n+n2) == n2 + n3
Big-O notation only cares about the dominant term as n goes to infinity, so the whole algorithm is thought of as Θ(n3).

O(n(n+n^2)) = O(n^2 + n^3)
Since the n^3 term dominates the n^2 term, the n^2 term is negligible and thus it is O(n^3).

The y loop can be discounted because of the z loop (O(n) + O(n^2) -> O(n^2))
Forget the arithmetic.
Then you're left with three nested loops that all iterate over the full length of 'seq', so it's O(n^3)

Related

Finding the values in Big Oh

I am going through the Asymptotic notations from here. I am reading this f(n) ≤ c g(n)
For example, if f(n) = 2n + 2, We can satisfy it in any way as f(n) is O (c.g(n)) by adjusting the value of n and c. Or is there any specific rule or formula for selecting the value of c and n. Will no always be 1?
There is no formula per se. You can find the formal definition here:
f(n) = O(g(n)) means there are positive constants c and k, such that 0 ≤ f(n) ≤ cg(n) for all n ≥ k. The values of c and k must be fixed for the function f and must not depend on n. (big-O notation).
What I understood from your question is, you are not getting the essence of big-O notation. If your complexity is, for example, O(n^2), then you can guarantee that there is some value of n (greater than k) after which f(n) in no case will exceed c g(n).
Let's try to prove f(n) = 2n + 2 is O(n):
As it seems from the function itself, you cannot set the value of c equal to 2 as you want to find f(n) ≤ c g(n). If you plug in c = 2, you have to find k such that f(n) ≤ c g(n) for n ≥ k. Clearly, there is no n for which 2n ≥ 2n + 2. So, we move on to c = 3.
Now, let's find the value of k. So, we solve the equation 3n ≥ 2n + 2. Solving it:
3n ≥ 2n + 2
=> 3n - 2n ≥ 2
=> n ≥ 2
Therefore, for c = 3, we found value of k = 2 (n ≥ k).
You must also understand, your function isn't just O(n). It is also O(n^2), O(n^3), O(n^4) and so on. All because corresponding values of c and k exist for g(n) = n^2, g(n) = n^3 and g(n) = n^4.
Hope it helps.

The Mathematical Relationship Between Big-Oh Classes

My textbook describes the relationship as follows:
There is a very nice mathematical intuition which describes these classes too. Suppose we have an algorithm which has running time N0 when given an input of size n, and a running time of N1 on an input of size 2n. We can characterize the rates of growth in terms of the relationship between N0 and N1:
Big-Oh Relationship
O(log n) N1 ≈ N0 + c
O(n) N1 ≈ 2N0
O(n²) N1 ≈ 4N0
O(2ⁿ) N1 ≈ (N0)²
Why is this?
That is because if f(n) is in O(g(n)) then it can be thought of as acting like k * g(n) for some k.
So for example if f(n) = O(log(n)) then it acts like k log(n), and now f(2n) ≈ k log(2n) = k (log(2) + log(n)) = k log(2) + k log(n) ≈ k log(2) + f(n) and that is your desired equation with c = k log(2).
Note that this is a rough intuition only. An example of where it breaks down is that f(n) = (2 + sin(n)) log(n) = O(log(n)). The oscillating 2 + sin(n) bit means that f(2n)-f(n) can be basically anything.
I personally find this kind of rough intuition to be misleading and therefore worse than useless. Others find it very helpful. Decide for yourself how much weight you give it.
Basically what they are trying to show is just basic algebra after substituting 2n for n in the functions.
O(log n)
log(2n) = log(2) + log(n)
N1 ≈ c + N0
O(n)
2n = 2(n)
N1 ≈ 2N0
O(n²)
(2n)^2 = 4n^2 = 4(n^2)
N1 ≈ 4N0
O(2ⁿ)
2^(2n) = 2^(n*2) = (2^n)^2
N1 ≈ (N0)²
Since O(f(n)) ~ k * f(n) (almost by definition), you want to look at what happens when you put 2n in for n. In each case:
N1 ≈ k*log 2n = k*(log 2 + log n) = k*log n + k*log 2 ≈ N0 + c where c = k*log 2
N1 ≈ k*(2n) = 2*k*n ≈ 2N0
N1 ≈ k*(2n)^2 = 4*k*n^2 ≈ 4N0
N1 ≈ k*2^(2n) = k*(2^n)^2 ≈ N0*2^n ≈ N0^2/k
So the last one is not quite right, anyway. Keep in mind that these relationships are only true asymptotically, so the approximations will be more accurate as n gets larger. Also, f(n) = O(g(n)) only means g(n) is an upper bound for f(n) for large enough n. So f(n) = O(g(n)) does not necessarily mean f(n) ~ k*g(n). Ideally, you want that to be true, since your big-O bound will be tight when that is the case.

Confused on big O notation

According to this book, big O means:
f(n) = O(g(n)) means c · g(n) is an upper bound on f(n). Thus there exists some constant c such that f(n) is always ≤ c · g(n), for large enough n (i.e. , n ≥ n0 for some constant n0).
I have trubble understanding the following big O equation
3n2 − 100n + 6 = O(n2), because I choose c = 3 and 3n2 > 3n2 − 100n + 6;
How can 3 be a factor? In 3n2 − 100n + 6, if we drop the low order terms -100n and 6, aren't 3n2 and 3.n2 the same? How to solve this equation?
I'll take the liberty to slightly paraphrase the question to:
Why do and have the same asymptotic complexity.
For that to be true, the definition should be in effect both directions.
First:
let
Then for the inequality is always satisfied.
The other way around:
let
We have a parabola opened upwards, therefore there is again some after which the inequality is always satisfied.
Let's look at the definition you posted for f(n) in O(g(n)):
f(n) = O(g(n)) means c · g(n) is an upper bound on f(n). Thus there
exists some constant c such that f(n) is always ≤ c · g(n), for
large enough n (i.e. , n ≥ n0 for some constant n0).
So, we only need to find one set of constants (c, n0) that fulfils
f(n) < c · g(n), for all n > n0, (+)
but this set is not unique. I.e., the problem of finding the constants (c, n0) such that (+) holds is degenerate. In fact, if any such pair of constants exists, there will exist an infinite amount of different such pairs.
Note that here I've switched to strict inequalities, which is really only a matter of taste, but I prefer this latter convention. Now, we can re-state the Big-O definition in possibly more easy-to-understand terms:
... we can say that f(n) is O(g(n)) if we can find a constant c such
that f(n) is less than c·g(n) or all n larger than n0, i.e., for all
n>n0.
Now, let's look at your function f(n)
f(n) = 3n^2 - 100n + 6 (*)
Let's describe your functions as a sum of it's highest term and another functions
f(n) = 3n^2 + h(n) (**)
h(n) = 6 - 100n (***)
We now study the behaviour of h(n) and f(n), respectively:
h(n) = 6 - 100n
what can we say about this expression?
=> if n > 6/100, then h(n) < 0, since 6 - 100*(6/100) = 0
=> h(n) < 0, given n > 6/100 (i)
f(n) = 3n^2 + h(n)
what can we say about this expression, given (i)?
=> if n > 6/100, the f(n) = 3n^2 + h(n) < 3n^2
=> f(n) < c*n^2, with c=3, given n > 6/100 (ii)
Ok!
From (ii) we can choose constant c=3, given that we choose the other constant n0 as larger than 6/100. Lets choose the first integer that fulfils this: n0=1.
Hence, we've shown that (+) golds for constant set **(c,n0) = (3,1), and subsequently, f(n) is in O(n^2).
For a reference on asymptotic behaviour, see e.g.
https://www.khanacademy.org/computing/computer-science/algorithms/asymptotic-notation/a/big-o-notation
y=3n^2 (top graph) vs y=3n^2 - 100n + 6
Consider the sketch above. By your definition, 3n^2 only needs to be bigger than 3n^2 - 100n + 6 for large enough n (i.e. , n ≥ n0 for some constant n0). Let that n0 = 5 in this case (it could be something a little smaller, but it's clear which graph is bigger by n=5 so we'll just go with that).
Clearly from the graph, 3n^2 >= 3n^2 - 100n + 6 in the range we've plotted. The only way for 3n^2 - 100n + 6 to get bigger than 3n^2 then is for it to grow more steeply.
But the gradients of 3n^2 and 3n^2 - 100n + 6 are 6n and 6n - 100 respectively, so 3n^2 - 100n + 6 can't grow more steeply, therefore must always be underneath.
So your definition holds - 3n^2 - 100n + 6 <= 3n^2 for all n>=5
I am not an expert, but this looks a lot similar to what we just had in our real analysis course.
Basically if you have something like f(n) = 3n^2 − 100n + 6, the "fastest growing" term "wins" the other terms, when you have really really big n.
So in this case 3n^2 surpasses what ever 100n is, when the n is really big.
Another example would be something like f(n) = n/n^2 or f(n) = n! * n^2.
The first one gets smaller, as n simply cannot "keep up" with n^2. In the second example n! clearly grows faster than n^2, so I guess the answer for that should be f(n) = n! then, because the n^2 technically stops mattering with big n.
And terms like +6, which have no n affecting them are constants and matter even less as they cannot grow even if n grows.
It is all about what happends when n is really big. If your n is 34934854385754385463543856, then n^2 is hell of a bigger than 100n, because n^2 = n * n = 34934854385754385463543856 * 34934854385754385463543856.

Big O vs Small omega

Why is ω(n) smaller than O(n)?
I know what is little omega (for example, n = ω(log n)), but I can't understand why ω(n) is smaller than O(n).
Big Oh 'O' is an upper bound and little omega 'ω' is a Tight lower bound.
O(g(n)) = { f(n): there exist positive constants c and n0 such that 0 ≤ f(n) ≤ cg(n) for all n ≥ n0}
ω(g(n)) = { f(n): for all constants c > 0, there exists a constant n0 such that 0 ≤ cg(n) < f(n) for all n ≥ n0}.
ALSO: infinity = lim f(n)/g(n)
n ∈ O(n) and n ∉ ω(n).
Alternatively:
n ∈ ω(log(n)) and n ∉ O(log(n))
ω(n) and O(n) are at the opposite side of the spectrum, as is illustrated below.
Formally,
For more details, see CSc 345 — Analysis of Discrete Structures
(McCann), which is the source of the graph above. It also contains a compact representation of the definitions, which makes them easy to remember:
I can't comment, so first of all let me say that n ≠ Θ(log(n)). Big Theta means that for some positive constants c1, c2, and k, for all values of n greater than k, c1*log(n) ≤ n ≤ c2*log(n), which is not true. As n approaches infinity, it will always be larger than log(n), no matter log(n)'s coefficient.
jesse34212 was correct in saying that n = ω(log(n)). n = ω(log(n)) means that n ≠ Θ(log(n)) AND n = Ω(log(n)). In other words, little or small omega is a loose lower bound, whereas big omega can be loose or tight.
Big O notation signifies a loose or tight upper bound. For instance, 12n = O(n) (tight upper bound, because it's as precise as you can get), and 12n = O(n^2) (loose upper bound, because you could be more precise).
12n ≠ ω(n) because n is a tight bound on 12n, and ω only applies to loose bounds. That's why 12n = ω(log(n)), or even 12n = ω(1). I keep using 12n, but that value of the constant does not affect the equality.
Technically, O(n) is a set of all functions that grow asymptotically equal to or slower than n, and the belongs character is most appropriate, but most people use "= O(n)" (instead of "∈ O(n)") as an informal way of writing it.
Algorithmic complexity has a mathematic definition.
If f and g are two functions, f = O(g) if you can find two constants c (> 0) and n such as f(x) < c * g(x) for every x > n.
For Ω, it is the opposite: you can find constants such as f(x) > c * g(x).
f = Θ(g) if there are three constants c, d and n such as c * g(x) < f(x) < d * g(x) for every x > n.
Then, O means your function is dominated, Θ your function is equivalent to the other function, Ω your function has a lower limit.
So, when you are using Θ, your approximation is better for you are "wrapping" your function between two edges ; whereas O only set a maximum. Ditto for Ω (minimum).
To sum up:
O(n): in worst situations, your algorithm has a complexity of n
Ω(n): in best case, your algorithm has a complexity of n
Θ(n): in every situation, your algorithm has a complexity of n
To conclude, your assumption is wrong: it is Θ, not Ω. As you may know, n > log(n) when n has a huge value. Then, it is logic to say n = Θ(log(n)), according to previous definitions.

What is the difference between O(1) and Θ(1)?

I know the definitions of both of them, but what is the reason sometimes I see O(1) and other times Θ(1) written in textbooks?
Thanks.
O(1) and Θ(1) aren't necessarily the same if you are talking about functions over real numbers. For example, consider the function f(n) = 1/n. This function is O(1) because for any n ≥ 1, f(n) ≤ 1. However, it is not Θ(1) for the following reason: one definition of f(n) = Θ(g(n)) is that the limit of |f(n) / g(n)| as n goes to infinity is some finite value L satisfying 0 < L. Plugging in f(n) = 1/n and g(n) = 1, we take the limit of |1/n| as n goes to infinity and get that it's 0. Therefore, f(n) ≠ Θ(1).
Hope this helps!
Big-O notation expresses an asymptotic upper bound, whereas Big-Theta notation additionally expresses an asymptotic lower bound. Often, the upper bound is what people are interested in, so they write O(something), even when Theta(something) would also be true. For example, if you wanted to count the number of things that are equal to x in an unsorted list, you might say that it can be done in linear time and is O(n), because what matters to you is that it won't take any longer than that. However, it would also be true that it's Omega(n) and therefore Theta(n), since you have to examine all of the elements in the list - it can't be done in sub-linear time.
UPDATE:
Formally:
f in O(g) iff there exists a c and an n0 such that for all n > n0, f(n) <= c * g(n).
f in Omega(g) iff there exists a c and an n0 such that for all n > n0, f(n) >= c * g(n).
f in Theta(g) iff f in O(g) and f in Omega(g), i.e. iff there exist a c1, a c2 and an n0 such that for all n > n0, c1 * g(n) <= f(n) <= c2 * g(n).

Resources