I have these two questions that I think I know how to answer (answers after the questions). I just want to check that I understand time-complexity calculations and how to find the Big-O.
The generic form is just the product of the values on the right-hand side of each loop condition.
The Big-O is the largest power in the resulting polynomial. Is this way of thinking correct?
int sum = 0;
for (int i = 0; i < n; i++)
    for (int j = 0; j < n * n; j++)
        for (int k = 0; k < 10; k++)
            sum += i;
How many generic time units does this code take? n(n^2)*10
What is the big-oh run time of this code? O(n^3)
Yes. Basically the definition of big O states that the time units (as you call them) are bounded from above by a constant times your expression, from some (arbitrarily high) natural number onwards. In more mathematical notation this is:
A function f(n) is O(g(n)) if there exist a constant C and a number N such that f(n) < C*g(n) for all n > N.
In your context f(n) = n(n^2)*10 and g(n) = n^3.
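For example, the constants can be made explicit (these particular values are just one valid choice): take C = 11 and N = 1, and then

f(n) = 10 \cdot n^3 < 11 \cdot n^3 = C \cdot g(n) \quad \text{for all } n > N,

so f(n) is O(n^3) by the definition above.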
You could, by the way, also say that the function is O(n^4). You can use big-Theta notation to indicate that n^3 is also a lower bound: f(n) is Θ(n^3).
See more on this here: https://en.wikipedia.org/wiki/Big_O_notation
Yes, your understanding is correct. But sometimes you also have to deal with logarithmic terms.
One way to look at a term like n·log n is to view it as n^(1+epsilon), where epsilon is a small positive quantity.
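To see why that view is safe asymptotically (the epsilon here is just an illustrative small constant): for any fixed epsilon > 0,

n \le n \log n \le n \cdot n^{\varepsilon} = n^{1+\varepsilon} \quad \text{for all sufficiently large } n,

so n·log n sits between n and n^(1+epsilon), and in particular n·log n = O(n^(1+epsilon)).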
Related
Here is a segment of an algorithm I came up with:
for (int i = 0; i < n - 1; i++)
    for (int j = i; j < n; j++)
        (...)
I am using this "double loop" to test all possible 2-element sums in an array of size n.
Apparently (and I have to agree with it), this "double loop" is O(n²):
n + (n-1) + (n-2) + ... + 1 = sum from 1 to n = (n (n - 1))/2
Here is where I am confused:
for (int i = 0; i < n; i++)
    for (int j = 0; j < n; j++)
        (...)
This second "double loop" also has a complexity of O(n²), even though it is clearly (at worst) noticeably worse than the first.
What am I missing? Is the information accurate? Can someone explain this "phenomenon"?
(n (n - 1))/2 simplifies to n²/2 - n/2. If you use really large numbers for n, the growth rate of n/2 will be dwarfed in comparison to n², so for the sake of calculating Big-O complexity, you effectively ignore it. Likewise, the "constant" value of 1/2 doesn't grow at all as n increases, so you ignore that too. That just leaves you with n².
Just remember that complexity calculations are not the same as "speed". One algorithm can be five thousand times slower than another and still have a smaller Big-O complexity. But as you increase n to really large numbers, general patterns emerge that can typically be classified using simple formulae: 1, log n, n, n log n, n², etc.
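As a quick illustration of that constant-factor gap (a throwaway counting sketch, not part of the original code; the value of n and the counter names are made up for the example):

#include <iostream>

int main() {
    long long n = 5000;                    // example input size, chosen arbitrarily
    long long firstLoop = 0, secondLoop = 0;

    // triangular double loop: j starts at i
    for (long long i = 0; i < n - 1; i++)
        for (long long j = i; j < n; j++)
            firstLoop++;

    // full double loop: j always starts at 0
    for (long long i = 0; i < n; i++)
        for (long long j = 0; j < n; j++)
            secondLoop++;

    // firstLoop is roughly n^2/2, secondLoop is exactly n^2;
    // the ratio approaches the constant 2, which Big-O discards.
    std::cout << firstLoop << " vs " << secondLoop
              << " (ratio " << (double)secondLoop / firstLoop << ")\n";
}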
It sometimes helps to plot both functions and see what kind of curve appears: even though the zoom factors of the two graphs end up very different, you can see that the curves they produce are almost exactly the same shape.
Constant factors.
Big-O notation ignores constant factors, so even though the second loop is slower by a constant factor, they end up with the same time complexity.
Right there in the definition it tells you that you can pick any old constant factor:
... if and only if there is a positive constant M ...
This is because we want to analyse the growth rate of an algorithm - constant factors just complicate things and are often system-dependent (operations may vary in duration on different machines).
You could just count certain types of operations, but then the question becomes which operation to pick, and what if that operation isn't predominant in some algorithm? Then you'd need to relate operations to each other (in a system-independent way, which is probably impossible), or you could assign the same weight to each, but that would be fairly inaccurate, as some operations take significantly longer than others.
And how useful would saying O(15n² + 568n + 8 log n + 23 sqrt(n) + 17) (for example) really be? As opposed to just O(n²).
(For the purpose of the below, assume n >= 2)
Note that we actually have asymptotically smaller (i.e. lower-order) terms here, not just constant factors, but we can always absorb them into the constant factors. (Also, the sum 1 + 2 + ... + n is n(n+1)/2, not n(n-1)/2.)
n(n+1)/2 = n²/2 + n/2
and
n²/2 <= n²/2 + n/2 <= n²
Given that we've just shown that n(n+1)/2 lies between C·n² and D·n², for two constants C and D, we've also just shown that it's O(n²).
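Spelled out with explicit constants (these particular values of C and D are just one valid choice):

\tfrac{1}{2} n^2 \;\le\; \tfrac{n(n+1)}{2} \;\le\; 1 \cdot n^2 \quad \text{for all } n \ge 2,

so C = 1/2 and D = 1 work, which gives Θ(n²) and hence, in particular, O(n²).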
Note - big-O notation is actually strictly an upper bound (so we only care that it's smaller than a function, not between two), but it's often used to mean Θ (big-Theta), which cares about both bounds.
From the Big O notation page on Wikipedia:
In typical usage, the formal definition of O notation is not used directly; rather, the O notation for a function f is derived by the following simplification rules:
If f(x) is a sum of several terms, the one with the largest growth rate is kept, and all others omitted
Big-O is used only to give the asymptotic behaviour - that one is a bit faster than the other doesn't come into it - they're both O(N^2)
You could also say that the first loop is O(n(n-1)/2). The fancy mathematical definition of big-O is something like:
function "f" is big-O of function "g" if there exists constants c, n such that f(x) < c*g(x) for some c and all x > n.
It's a fancy way of saying g is an upper bound past some point with some constant applied. It then follows that O(n(n-1)/2) = O((n^2-n)/2) is big-O of O(n^2), which is neater for quick analysis.
AFAIK, your second code snippet
for (int i = 0; i < n; i++)      <-- this loop runs n times
    for (int j = 0; j < n; j++)  <-- this loop also runs n times
(...)
So essentially, it has O(n*n) = O(n^2) time complexity.
Per Big-O theory, constant factors are neglected and only the highest-order term is considered. That is to say, if the cost is n^2 + k (with k a constant), the complexity is O(n^2); the constant k is ignored.
(Or) if the cost is n^2 + n, the complexity is still O(n^2); the lower-order term n is ignored.
So in your first case the cost n(n - 1)/2 can be simplified to
n(n - 1)/2 = n^2/2 - n/2
           ≈ n^2/2            (ignoring the lower-order term n/2)
           = (1/2) * n^2
           = O(n^2)           (ignoring the constant factor 1/2)
sum = 0;
for (int i = 0; i < N; i++)
    for (int j = i; j >= 0; j--)
        sum++;
I found out the big-oh to be O(n^2) but I am not sure how to find the theta bound for it. Can someone help?
You can make use of sigma notation to derive asymptotic bounds for your algorithm:
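Here is a reconstruction of the kind of derivation being referred to (the original worked expression is not reproduced here); writing n for the loop bound N, T(n) counts how many times sum++ executes:

T(n) = \sum_{i=0}^{n-1} \sum_{j=0}^{i} 1 = \sum_{i=0}^{n-1} (i + 1) = \frac{n(n+1)}{2} = \frac{1}{2}n^2 + \frac{1}{2}n

For n ≥ 1 this is sandwiched as (1/2)·n^2 ≤ T(n) ≤ n^2, which gives both the upper and the lower bound below.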
Since we can show that T(n) is in O(n^2) (upper asymptotic bound) as well as in Ω(n^2) (lower asymptotic bound), it follows that T(n) is in Θ(n^2) ("sandwiched" between the upper and lower asymptotic bounds).
In case it's unclear: the inner and outer sums in the expression above describe your inner and outer for loops, respectively.
The following sample loops have O(n^2) time complexity.
Can anyone explain to me why they are O(n^2)? Surely it depends on the value of c...
Loop 1:
for (int i = 1; i <= n; i += c)
{
    for (int j = 1; j <= n; j += c)
    {
        // some O(1) expressions
    }
}
Loop 2:
for (int i = n; i > 0; i -= c)
{
    for (int j = i+1; j <= n; j += c)
    {
        // some O(1) expressions
    }
}
If c = 0, the loops run an infinite number of times; similarly, as the value of c increases, the number of times the inner loops run decreases.
Can anyone explain it to me?
Each of these pieces of code takes time O(n^2/c^2). Here c is presumably considered a strictly positive constant, and therefore O(n^2/c^2) = O(n^2). But it all depends on the context...
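A rough count behind that figure (treating c as a fixed value with c ≥ 1, and ignoring rounding at the loop boundaries): in loop 1, i takes about n/c values and, for each of them, j also takes about n/c values, so the body runs roughly

\frac{n}{c} \cdot \frac{n}{c} = \frac{n^2}{c^2}

times. Loop 2 is similar, except that the inner loop shrinks as i decreases, which only changes the count by another constant factor (roughly n^2/(2c^2)).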
Big-O notation is a relative representation of the complexity of an algorithm.
Big-O does not say anything about how many iterations your algorithm will make in any particular case.
It says that, in the worst case, your algorithm will be making on the order of n squared computations, which is useful if you have to compare two algorithms.
In your code, if we assume c is a constant, then it can be dropped from the Big-O notation, because Big-O is all about comparison and how things scale, and constants play no role there.
But when c is not a constant the correct Big-O notation would be O(n^2/c^2).
Read this awesome explanation of Big-O by cletus.
For every FIXED c > 0, the time is O(n^2). The number of iterations is roughly n^2 / c^2 (and the loops never terminate when c = 0), and n^2 / c^2 is O(n^2) for every fixed c.
If you had code where c was changed during the loop, you might get a different answer.
for (int i = 0; i < 5; i++) {
    for (int j = 0; j < 5; j++) {
        for (int k = 0; k < 5; k++) {
            for (int l = 0; l < 5; l++) {
                // look up in a perfect constant-time hash table
            }
        }
    }
}
What would the running time of this be in big theta?
My best guess, a shot in the dark: I always see that nested for loops are O(n^k), where k is the number of loops, so these loops would be O(n^4); then would I multiply by O(1) for the constant-time lookup? What would this all be in big theta?
If you consider that accessing a hash table really is Θ(1), then this algorithm runs in Θ(1) too, because it makes only a constant number (5^4 = 625) of lookups into the hash table.
However, if you change 5 to n, it will be Θ(n^4), because you'll do exactly n^4 constant-time operations.
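A quick sanity check of that claim (a throwaway sketch; the value of n and the counter are made up for the example, and the hash lookup is replaced by a counter increment):

#include <cassert>

int main() {
    long long n = 20;                     // stand-in for the loop bound (was 5)
    long long lookups = 0;

    for (long long i = 0; i < n; i++)
        for (long long j = 0; j < n; j++)
            for (long long k = 0; k < n; k++)
                for (long long l = 0; l < n; l++)
                    lookups++;            // stands in for the O(1) hash lookup

    assert(lookups == n * n * n * n);     // exactly n^4 constant-time operations
}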
The big-theta running time would be Θ(n^4).
Big-O is an upper bound, where-as big-theta is a tight bound. What this means is that to say the code is O(n^5) is also correct (but Θ(n^5) is not), whatever's inside the big-O just has to be asymptotically bigger than or equal to n^4.
I'm assuming the 5 can be substituted with another value (i.e. is really n); if not, the loops run in constant time (O(1) and Θ(1)), since 5^4 is a constant.
Using Sigma notation:
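(The sum the answer refers to, reconstructed here since the original expression is not shown:)

\sum_{i=0}^{4} \sum_{j=0}^{4} \sum_{k=0}^{4} \sum_{l=0}^{4} 1 = 5 \cdot 5 \cdot 5 \cdot 5 = 5^4 = 625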
Indeed, instructions inside the innermost loop will execute 625 times.