How can I represent its complexity with Big-O notation? I am a little bit confused, since the range of the inner for loop changes with the index of the outer loop. Is it still O(n^2), or less complex? Thanks in advance.
for (int k = 0; k < arr.length; k++) {
    for (int m = k; m < arr.length; m++) {
        // do something
    }
}
Your estimate comes from the arithmetic progression formula:
n + (n-1) + ... + 1 = n(n+1)/2
and is thus O(n^2). Why is your case a progression? Because your loops perform n + (n-1) + ... + 1 iterations in total.
If you add up all iterations of the inner loop, you get 1+2+3+...+n, which is equal to n(n+1)/2 (n is the array length). That is n^2/2 + n/2. As you may already know, the relevant term in big-O notation is the one with the highest power, and coefficients are not relevant. So your complexity is still O(n^2).
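As a quick sanity check of that sum, here is a minimal Java sketch that counts how often the inner body runs and compares it with n(n+1)/2. The class and method names are made up for illustration, and the counter simply stands in for the "do something" body.

```java
class TriangularLoopCount {
    // Counts how many times the inner body runs for an array of length n.
    static long countIterations(int n) {
        long count = 0;
        for (int k = 0; k < n; k++) {
            for (int m = k; m < n; m++) {
                count++; // stands in for "do something"
            }
        }
        return count;
    }

    public static void main(String[] args) {
        for (int n : new int[] {10, 100, 1000}) {
            long formula = (long) n * (n + 1) / 2; // closed form of n + (n-1) + ... + 1
            System.out.println("n=" + n + " counted=" + countIterations(n) + " formula=" + formula);
        }
    }
}
```

The counted value and the closed form agree for every n, and both grow like n^2/2, which is why the loop is still classified as O(n^2).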
Well, the runtime is roughly half that of the full n^2 loop,
but in big-O notation it is still O(n^2),
because any constant factor counts as O(1) and is dropped,
so O((n^2)/2) -> O((n^2)/c) -> O(n^2).
Unofficially, many people (including me) write O((n^2)/2) for their own purposes, since it is more intuitive, easier to compare, and closer to the actual cycle count or runtime.
hope it helps
Here is a segment of an algorithm I came up with:
for (int i = 0; i < n - 1; i++)
    for (int j = i; j < n; j++)
        (...)
I am using this "double loop" to test all possible 2-element sums in an array of size n.
Apparently (and I have to agree with it), this "double loop" is O(n²):
n + (n-1) + (n-2) + ... + 1 = sum from 1 to n = (n (n - 1))/2
Here is where I am confused:
for (int i = 0; i < n; i++)
    for (int j = 0; j < n; j++)
        (...)
This second "double loop" also has a complexity of O(n²), when it is clearly (at worst) much (?) better than the first.
What am I missing? Is the information accurate? Can someone explain this "phenomenon"?
(n (n - 1))/2 simplifies to n²/2 - n/2. If you use really large numbers for n, the growth rate of n/2 will be dwarfed in comparison to n², so for the sake of calculating Big-O complexity, you effectively ignore it. Likewise, the "constant" value of 1/2 doesn't grow at all as n increases, so you ignore that too. That just leaves you with n².
Just remember that complexity calculations are not the same as "speed". One algorithm can be five thousand times slower than another and still have a smaller Big-O complexity. But as you increase n to really large numbers, general patterns emerge that can typically be classified using simple formulae: 1, log n, n, n log n, n², etc.
It sometimes helps to create a graph and see what kind of line appears:
Even though the zoom factors of these two graphs are very different, you can see that the type of curve it produces is almost exactly the same.
Constant factors.
Big-O notation ignores constant factors, so even though the second loop is slower by a constant factor, they end up with the same time complexity.
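To see the constant factor concretely, here is a small Java sketch that counts the inner-body executions of both loops from the question. The class and method names are hypothetical, and the counters replace the elided `(...)` bodies.

```java
class LoopComparison {
    // First snippet from the question: i in [0, n-1), j in [i, n).
    static long triangular(int n) {
        long count = 0;
        for (int i = 0; i < n - 1; i++)
            for (int j = i; j < n; j++)
                count++;
        return count;
    }

    // Second snippet from the question: i and j both in [0, n).
    static long square(int n) {
        long count = 0;
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                count++;
        return count;
    }

    public static void main(String[] args) {
        for (int n : new int[] {10, 100, 1000, 10000}) {
            double ratio = (double) triangular(n) / square(n);
            System.out.println("n=" + n + " triangular/square=" + ratio);
        }
    }
}
```

The ratio settles near 1/2 as n grows, so the two loops differ only by a constant factor and land in the same O(n²) class.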
Right there in the definition it tells you that you can pick any old constant factor:
... if and only if there is a positive constant M ...
This is because we want to analyse the growth rate of an algorithm - constant factors just complicate things and are often system-dependent (operations may vary in duration on different machines).
You could just count certain types of operations, but then the question becomes which operation to pick, and what if that operation isn't predominant in some algorithm. Then you'll need to relate operations to each other (in a system-independent way, which is probably impossible), or you could just assign the same weight to each, but that would be fairly inaccurate as some operations would take significantly longer than others.
And how useful would saying O(15n² + 568n + 8 log n + 23 sqrt(n) + 17) (for example) really be? As opposed to just O(n²).
(For the purpose of the below, assume n >= 2)
Note that we actually have asymptotically smaller (i.e. smaller as we approach infinity) terms here, but we can always simplify that to a matter of constant factors. (It's n(n+1)/2, not n(n-1)/2)
n(n+1)/2 = n²/2 + n/2
and
n²/2 <= n²/2 + n/2 <= n²
Given that we've just shown that n(n+1)/2 lies between C.n² and D.n², for two constants C and D, we've also just shown that it's O(n²).
Note - big-O notation is actually strictly an upper bound (so we only care that it's smaller than a function, not between two), but it's often used to mean Θ (big-Theta), which cares about both bounds.
From the Big O page on Wikipedia:
In typical usage, the formal definition of O notation is not used directly; rather, the O notation for a function f is derived by the following simplification rules:
If f(x) is a sum of several terms, the one with the largest growth rate is kept, and all others omitted
Big-O is used only to give the asymptotic behaviour - that one is a bit faster than the other doesn't come into it - they're both O(N^2)
You could also say that the first loop is O(n(n-1)/2). The fancy mathematical definition of big-O is something like:
function "f" is big-O of function "g" if there exist constants c and n such that f(x) <= c*g(x) for all x > n.
It's a fancy way of saying that g is an upper bound past some point, with some constant applied. It then follows that n(n-1)/2 = (n^2-n)/2 is O(n^2), which is neater for quick analysis.
AFAIK, your second code snippet
for (int i = 0; i < n; i++)      // this loop runs n times
    for (int j = 0; j < n; j++)  // this loop also runs n times
        (...)
So essentially, it's getting a O(n*n) = O(n^2) time complexity.
Per big-O theory, constant factors are neglected and only the highest-order term is kept. That is to say, if the complexity is O(n^2+k), then it is simply O(n^2); the constant k is ignored.
Likewise, if the complexity is O(n^2+n), then it is O(n^2); the lower-order term n is ignored.
So in your first case, the complexity O(n(n - 1)/2) can be simplified to
O(n^2/2 - n/2) = O(n^2/2) (Ignoring the lower order n/2)
= O(1/2 * n^2)
= O(n^2) (Ignoring the constant factor 1/2)
I'm studying for an exam, and i've come across the following question:
Provide a precise (Θ notation) bound for the running time as a
function of n for the following function
for i = 1 to n {
    j = i
    while j < n {
        j = j + 4
    }
}
I believe the answer would be O(n^2), although I'm certainly an amateur at the subject. My reasoning is that the outer loop takes O(n) and the inner loop takes O(n/4), resulting in O(n^2/4). Since n^2 dominates, it simplifies to O(n^2).
Any clarification would be appreciated.
If you proceed using Sigma notation, and obtain T(n) equals something, then you get Big Theta.
If T(n) is less or equal, then it's Big O.
If T(n) is greater or equal, then it's Big Omega.
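Applied to the loop in the question, the inner while loop runs ceil((n - i)/4) times for each i, so the total is roughly n^2/8, which gives Θ(n^2) if each iteration is assumed to cost constant time. A minimal Java sketch that counts the iterations directly (the class and method names are made up for illustration):

```java
class StepFourLoopCount {
    // Counts executions of "j = j + 4" in the pseudocode from the question.
    static long countInner(int n) {
        long count = 0;
        for (int i = 1; i <= n; i++) {
            int j = i;
            while (j < n) {
                j = j + 4;
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        for (int n : new int[] {8, 100, 1000, 10000}) {
            long count = countInner(n);
            // The ratio count / n^2 should approach 1/8 = 0.125 as n grows.
            System.out.println("n=" + n + " count=" + count
                    + " count/n^2=" + (double) count / ((double) n * n));
        }
    }
}
```

Because the count is an exact sum that is bounded above and below by constant multiples of n^2, the bound is a Big Theta rather than just a Big O.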
I would like to know, because I couldn't find any information online, how is an algorithm like O(n * m^2) or O(n * k) or O(n + k) supposed to be analysed?
Does only the n count?
The other terms are superfluous?
So O(n * m^2) is actually O(n)?
No, here the k and m terms are not superfluous; they are essential for computing the time complexity. Together with n, they describe the concrete complexity of the code.
It may seem like the terms n and k are independent of each other in the code, but together they determine the complexity of the algorithm.
Say you have to iterate over a loop of n elements, and inside it you have another loop of k iterations; then the overall complexity becomes O(nk).
Complexity of order O(nk); you can't discard k here.
for(i=0;i<n;i++)
    for(j=0;j<k;j++)
        // do something
Complexity of order O(n+k), you can't dump/discard k here.
for(i=0;i<n;i++)
    // do something
for(j=0;j<k;j++)
    // do something
Complexity of order O(nm^2), you can't dump/discard m here.
for(i=0;i<n;i++)
    for(j=0;j<m;j++)
        for(k=0;k<m;k++)
            // do something
Answer to the last question: is O(n * m^2) actually O(n)?
No, O(nm^2) can't be reduced further to O(n), as that would mean m has no significance, which is not the case.
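The three loop shapes above can be counted directly to see that the second parameter really matters. This is only a sketch: the class and method names are hypothetical, and each counter increment stands in for "do something".

```java
class TwoParameterCount {
    // Nested loops over n and k: n * k steps.
    static long nested(int n, int k) {
        long count = 0;
        for (int i = 0; i < n; i++)
            for (int j = 0; j < k; j++)
                count++;
        return count;
    }

    // Two loops one after the other: n + k steps.
    static long sequential(int n, int k) {
        long count = 0;
        for (int i = 0; i < n; i++)
            count++;
        for (int j = 0; j < k; j++)
            count++;
        return count;
    }

    // One loop over n wrapping two nested loops over m: n * m * m steps.
    static long cubicInM(int n, int m) {
        long count = 0;
        for (int i = 0; i < n; i++)
            for (int j = 0; j < m; j++)
                for (int k = 0; k < m; k++)
                    count++;
        return count;
    }

    public static void main(String[] args) {
        int n = 10; // n is held fixed while the second parameter grows
        for (int other : new int[] {1, 10, 100}) {
            System.out.println("n=" + n + " other=" + other
                    + " nested=" + nested(n, other)
                    + " sequential=" + sequential(n, other)
                    + " cubicInM=" + cubicInM(n, other));
        }
    }
}
```

With n fixed at 10, the step counts still change by orders of magnitude as the second parameter grows, so a bound that mentions only n would not describe the running time.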
FORMALLY: O(f(n)) is the SET of ALL functions T(n) that satisfy:
There exist positive constants c and N such that, for all n >= N,
T(n) <= c f(n)
Here are some examples of when and why factors other than n matter.
[1] 1,000,000 n is in O(n). Proof: set c = 1,000,000, N = 0.
Big-Oh notation doesn't care about (most) constant factors. We generally leave constants out; it's unnecessary to write O(2n), because O(2n) = O(n). (The 2 is not wrong; just unnecessary.)
[2] n is in O(n^3). [That's n cubed]. Proof: set c = 1, N = 1.
Big-Oh notation can be misleading. Just because an algorithm's running time is in O(n^3) doesn't mean it's slow; it might also be in O(n). Big-Oh notation only gives us an UPPER BOUND on a function.
[3] n^3 + n^2 + n is in O(n^3). Proof: set c = 3, N = 1.
Big-Oh notation is usually used only to indicate the dominating (largest
and most displeasing) term in the function. The other terms become
insignificant when n is really big.
These aren't generalizable, and each case may be different. That's the answer to the questions: "Does only the n count? The other terms are superfluous?"
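The constants c and N in examples [1] to [3] can be spot-checked numerically. The Java sketch below tests T(n) <= c f(n) on a finite range only, so it is a sanity check rather than a proof; the class and method names, and the upper limit of 1000, are assumptions made for illustration.

```java
import java.util.function.LongUnaryOperator;

class BigOWitnessCheck {
    // Returns true if t(n) <= c * f(n) for every n in [start, limit].
    static boolean holdsOnRange(LongUnaryOperator t, LongUnaryOperator f,
                                long c, long start, long limit) {
        for (long n = start; n <= limit; n++) {
            if (t.applyAsLong(n) > c * f.applyAsLong(n)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        long limit = 1000; // arbitrary cutoff; small enough to avoid long overflow
        // [1] 1,000,000 n is in O(n) with c = 1,000,000, N = 0
        System.out.println(holdsOnRange(n -> 1_000_000L * n, n -> n, 1_000_000L, 0, limit));
        // [2] n is in O(n^3) with c = 1, N = 1
        System.out.println(holdsOnRange(n -> n, n -> n * n * n, 1, 1, limit));
        // [3] n^3 + n^2 + n is in O(n^3) with c = 3, N = 1
        System.out.println(holdsOnRange(n -> n * n * n + n * n + n, n -> n * n * n, 3, 1, limit));
        // Counterexample: n^3 is not bounded by 1000 * n on this range
        System.out.println(holdsOnRange(n -> n * n * n, n -> n, 1000, 1, limit));
    }
}
```

The first three checks succeed and the last one fails, since n^3 exceeds 1000 n once n reaches 32.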
Although there is already an accepted answer, I'd still like to provide the following inputs :
O(n * m^2) : Can be viewed as n*m*m and assuming that the bounds for n and m are similar then the complexity would be O(n^3).
Similarly -
O(n * k) : Would be O(n^2) (with the bounds for n and k being similar)
and -
O(n + k) : Would be O(n) (again, with the bounds for n and k being similar).
PS: It would be better not to assume the similarity between the variables and to first understand how the variables relate to each other (Eg: m=n/2; k=2n) before attempting to conclude.
I recently had an interview and was given a small problem that I was to code up.
The problem was basically to find the duplicate in an array of length n, using constant space and O(n) time. Each element is in the range 1 to n-1, and a duplicate is guaranteed to exist. This is what I came up with:
public int findDuplicate(int[] vals) {
    int indexSum = 0;
    int valSum = 0;
    for (int i = 0; i < vals.length; i++) {
        indexSum += i;
        valSum += vals[i];
    }
    return valSum - indexSum;
}
Then we got into a discussion about the runtime of this algorithm. The sum of the series from 0 to n is (n^2 + n)/2, which is quadratic. However, isn't the algorithm O(n) time? The number of operations is bounded by the length of the array, right?
What am I missing? Is this algorithm O(n^2)?
The fact that the sum of the integers from 0 to n is O(n^2) is irrelevant here.
Yes, you run through the loop exactly n times, which is O(n).
The big question is, what order of complexity are you assuming on addition?
If O(1) then yeah, this is linear. Most people will assume that addition is O(1).
But what if addition is actually O(b) (b is the number of bits, and in our case b = log n)? If you assume this, then the algorithm is actually O(n * log n) (adding n numbers, each of which needs log n bits to represent).
Again, most people assume that addition is O(1).
Algorithms researchers have standardized on the unit-cost RAM model, where words are Theta(log n) bits and operations on words are Theta(1) time. An alternative model where operations on words are Theta(log n) time is not used any more because it's ridiculous to have a RAM that can't recognize palindromes in linear time.
Your algorithm clearly runs in time O(n) and extra space O(1), since convention is for the default unit of space to be the word. Your interviewer may have been worried about overflow, but your algorithm works fine if addition and subtraction are performed modulo any number M ≥ n, as would be the case for two's complement.
tl;dr Whatever your interviewer's problem was is imaginary or rooted in an improper understanding of the conventions of theoretical computer science.
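The overflow point can be demonstrated with a short Java sketch. It assumes the input holds every value from 1 to n-1 once plus one extra copy of a single value, which is what the summation trick relies on; `buildInput` and the class name are hypothetical helpers, not part of the original question.

```java
class FindDuplicateOverflow {
    // Same algorithm as in the question; int arithmetic wraps modulo 2^32 in Java.
    static int findDuplicate(int[] vals) {
        int indexSum = 0;
        int valSum = 0;
        for (int i = 0; i < vals.length; i++) {
            indexSum += i;
            valSum += vals[i];
        }
        return valSum - indexSum;
    }

    // Hypothetical helper: the values 1..n-1 followed by one extra copy of dup.
    static int[] buildInput(int n, int dup) {
        int[] vals = new int[n];
        for (int i = 1; i < n; i++) {
            vals[i - 1] = i;
        }
        vals[n - 1] = dup;
        return vals;
    }

    public static void main(String[] args) {
        int n = 100_000;
        long trueIndexSum = (long) n * (n - 1) / 2; // about 5 * 10^9, too big for an int
        System.out.println("index sum exceeds int range: " + (trueIndexSum > Integer.MAX_VALUE));
        System.out.println("duplicate found: " + findDuplicate(buildInput(n, 12345)));
    }
}
```

Both sums wrap around, but their difference is still correct modulo 2^32, so the duplicate comes out intact after a single linear pass.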
You work on each of the n cells exactly once. Linear time.
Yes the algorithm is linear*. The result of valSum doesn't affect the running time. Take it to extreme, the function
int f(int[] vals) {
return vals.length * vals.length;
}
gives n^2 in 1 multiplication. Obviously this doesn't mean f is O(n^2) ;)
(*: assuming addition is O(1))
The sum of i from i=0 to n is n*(n+1)/2 which is bounded by n^2 but that has nothing to do with running time... that's just the closed form of the summation.
The running time of your algorithm is linear, O(n), where n is the number of elements in your array (assuming the addition operation is a constant time operation, O(1)).
I hope this helps.
Hristo