Need help understanding Big-O - algorithm

I'm in a Data Structures class now, and we're covering Big-O as a means of algorithm analysis. Unfortunately after many hours of study, I'm still somewhat confused. I understand what Big-O is, and several good code examples I found online make sense. However I have a homework question I don't understand. Any explanation of the following would be greatly appreciated.
Determine how many times the output statement is executed in each of the following fragments (give a number in terms of n). Then indicate whether the algorithm is O(n) or O(n^2):
for (int i = 0; i < n; i++)
    for (int j = 0; j < i; j++)
        if (j % i == 0)
            System.out.println(i + " " + j);

Suppose n = 5. Then the values of i would be 0, 1, 2, 3, and 4. Since the inner loop runs while j < i, it iterates 0, 1, 2, 3, and 4 times, respectively. The total number of times that the if comparison executes is therefore 0 + 1 + 2 + 3 + 4 = 10. In general, the sum of the integers from 0 to n-1 is n*(n-1)/2. Expanded, this gives us n^2/2 - n/2.
Therefore, the algorithm itself is O(n^2).
For the number of times that something is printed, we need to look at when j % i == 0. Since 0 <= j < i inside the loop, this can only be true when j = 0 (and i > 0). That happens exactly once in each iteration of the outer loop except the first one, because when i = 0 the inner loop never runs.
Therefore, System.out.println is called n-1 times.
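If you want to sanity-check this, here is a minimal Java harness of my own (the class and variable names are mine, not from the assignment) that counts both the if checks and the prints:

public class FragmentCount {
    public static void main(String[] args) {
        int n = 5;
        int checks = 0; // how many times the if condition is evaluated
        int prints = 0; // how many times the output statement runs
        for (int i = 0; i < n; i++)
            for (int j = 0; j < i; j++) {
                checks++;
                if (j % i == 0)
                    prints++;
            }
        // Expect checks = n*(n-1)/2 = 10 and prints = n-1 = 4 for n = 5.
        System.out.println("checks = " + checks + ", prints = " + prints);
    }
}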

A simple way to look at it is:
A single loop has a complexity of O(n).
A loop within a loop has a complexity of O(n^2), and so on, assuming each loop's bound grows with n.
So the above loop has a complexity of O(n^2).

This fragment executes in quadratic time, O(n^2).
Here's a trick for something like this. For each nested for loop, add one to the exponent of n. If there were three nested loops, the algorithm would run in cubic time, O(n^3). If there is only one loop (no halving involved), it is linear, O(n). If the problem size is halved each time (recursively or iteratively), it is considered logarithmic time, O(log n), base 2.
Hope that helps.
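To make the halving case concrete, here is a small Java sketch of my own (not from the question) where the loop variable is halved on each pass, so the iteration count is about log2(n):

public class HalvingLoop {
    public static void main(String[] args) {
        int n = 1024;
        int iterations = 0;
        // The loop variable is halved each pass, so it runs about log2(n) times.
        for (int i = n; i > 1; i /= 2)
            iterations++;
        System.out.println(iterations); // prints 10 for n = 1024, i.e. log2(1024)
    }
}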

Related

Time complexity of an algorithm that runs 1+2+...+n times

To start off, I found this Stack Overflow question that references the time complexity as O(n^2), but it doesn't answer why O(n^2) is the time complexity; instead it asks for an example of such an algorithm. From my understanding, an algorithm that runs 1+2+3+...+n times would be less than O(n^2). For example, take this function:
function f(n: number): number {
    let sum = 0;
    for (let i = 0; i < n; i++) {
        for (let j = 0; j < i + 1; j++) {
            sum += 1;
        }
    }
    return sum;
}
Here are some input and return values:

n    sum
1      1
2      3
3      6
4     10
5     15
6     21
7     28
From this table you can see that this algorithm runs in less than O(n^2) but more than O(n). I also realize that an algorithm that runs 1+(1+2)+(1+2+3)+...+(1+2+3+...+n) times has true O(n^2) time complexity. For the algorithm stated in the problem, do we just say it runs in O(n^2) because it runs more than O(log n) times?
It's known that 1 + 2 + ... + n has the closed form n * (n + 1) / 2. Even if you didn't know that, consider that the inner loop runs at most n times. So you have exactly n iterations of the outer loop (over i), each running the inner loop (over j) at most n times, so the O(n^2) bound becomes more apparent.
I agree that the count would be exactly n^2 if the inner loop also ran from 0 to n, so you have your reasons to think that a loop i from 0 to n with an inner loop j from 0 to i has to perform better, and that's true. But with big-O notation you're measuring the growth rate of the algorithm's running time, not the exact number of operations.
P.S. O(log n) is usually achieved when you repeatedly split the main problem into smaller sub-problems.
I think you should interpret the table differently. The O(N^2) complexity says that if you double the input N, the runtime should roughly quadruple (take 4 times as long). In this case, the function f(n: number) returns a number mirroring its runtime. I use f(N) as shorthand for it.
So say N goes from 1 to 2, which means the input has doubled (2/1 = 2). The runtime then has gone from f(1) to f(2), which means it has increased f(2)/f(1) = 3/1 = 3 times. That is not 4 times, but the Big-O complexity measure is asymptotic, dealing with the situation where N approaches infinity. If we test another input doubling from the table, we have f(6)/f(3) = 21/6 = 3.5. It is already closer to 4.
Let us now stray outside the table and try more doublings with bigger N. For example we have f(200)/f(100) = 20100/5050 = 3.980 and f(5000)/f(2500) = 12502500/3126250 = 3.999. The trend is clear. As N approaches infinity, a doubled input tends toward a quadrupled runtime. And that is the hallmark of O(N^2).
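If you want to reproduce those ratios, here is a small Java sketch of my own, using the closed form f(N) = N(N+1)/2 instead of actually running the loops:

public class DoublingRatios {
    // f(N) = 1 + 2 + ... + N = N * (N + 1) / 2
    static long f(long n) {
        return n * (n + 1) / 2;
    }

    public static void main(String[] args) {
        for (long n : new long[] {1, 3, 100, 2500}) {
            // How much does the runtime proxy grow when the input doubles?
            double ratio = (double) f(2 * n) / f(n);
            System.out.println("f(" + 2 * n + ") / f(" + n + ") = " + ratio);
        }
    }
}

The printed ratios (3.0, 3.5, 3.98..., 3.999...) approach 4, as described above.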

How do you find the algorithmic complexity of code fragments?

I don't know what the procedure for this would be. How do I think about this, and how do I determine what the big-O will be? What is the process for solving these?
Example 1:
for (i = 1; i <= n; i++)
    for (j = 1; j <= n*3; j++)
        System.out.println("Apple");
Example 2:
for (i = 1; i < n*n*n; i *= n)
    System.out.println("Banana");
Thank you
The short answer is that you count the loops. If there is no loop, it is constant, O(1); if there is one, it is O(N); if there are two nested loops, it is O(N^2); and if there are three, it is O(N^3).
However, that's only the short answer. You can also have loops which reduce the input by half on each iteration; those contribute a log N term. And you can have pathological brute-force functions which try every possibility; these are non-polynomial. Usually they make heavy use of recursion, and the problem is hardly chipped away at on each recursive step.
Be aware that library functions are often not constant-time, and that has to be factored in.
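As an illustration of that last point, here is a sketch of my own where an innocent-looking single loop is actually O(N^2), because java.util.ArrayList.contains does a linear scan:

import java.util.ArrayList;
import java.util.List;

public class HiddenCost {
    public static void main(String[] args) {
        List<Integer> seen = new ArrayList<>();
        // Looks like one loop (O(N)), but contains() itself is O(N),
        // so the whole thing is O(N^2).
        for (int x = 0; x < 1000; x++) {
            if (!seen.contains(x))
                seen.add(x);
        }
        System.out.println(seen.size());
    }
}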
Big-O measures efficiency. Say you were to loop through an array of size n, and say n is 2,000. O(n) would signify that your algorithm is doing, in the worst case, on the order of 2,000 total calculations. Big-O describes an upper bound and is usually quoted for the worst-case scenario of your algorithm. There are other notations for other kinds of bounds: Ω(n) and Θ(n).
Check this out to kind of get an idea of the difference in efficiency:
http://bigocheatsheet.com/
Informally:
"T(n)T(n)T(n) is O(f(n))O(f(n))O(f(n))" basically means that f(n)f(n)f(n) describes the upper bound for T(n)T(n)T(n)
"T(n)T(n)T(n) is Ω(f(n))\Omega(f(n))Ω(f(n))" basically means that f(n)f(n)f(n) describes the lower bound for T(n)T(n)T(n)
"T(n)T(n)T(n) is Θ(f(n))\Theta(f(n))Θ(f(n))" basically means that f(n)f(n)f(n) describes the exact bound for T(n)T(n)T(n)
A good way to approach this for simple situations is to plug a couple of easy numbers in for n and see what happens. So say n is size 10:
In Example 1:

for (i = 1; i <= n; i++)             // loop through this n times
    for (j = 1; j <= n*3; j++)       // for each of those n times, loop through 3*n times
        System.out.println("Apple"); // negligible time (O(1))

If it were just the outside loop, it would be O(n). However, since you add the inside loop, you get O(n^2): although your input is (say) 10, you're doing 300 operations (30 prints for each of the 10 outer iterations; 30*10). That is 3 * n^2 operations, but we generally drop the constant 3, so it is O(n^2). Most nested for loops where you aren't modifying the counter by n are O(n^2).
If it's easier, you can visualize it as the polynomial 3n * n = 3n^2 in the worst case.
I'll let you try the next one... hint: note the remark above about loops that modify the counter by n.
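If it helps, here is a quick Java sketch of my own that counts the prints in Example 1 and confirms the 3*n^2 figure:

public class Example1Count {
    public static void main(String[] args) {
        int n = 10;
        long prints = 0;
        for (int i = 1; i <= n; i++)
            for (int j = 1; j <= n * 3; j++)
                prints++; // stands in for the println
        System.out.println(prints);     // 300
        System.out.println(3L * n * n); // 300 = 3*n^2
    }
}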

Calculate Big O Notation

I currently have the following pseudo code, and I am trying to figure out why the answer to the question is O(n).
sum = 0;
for (i = 0; i < n; i++) do
    for (j = n/3; j < 2*n; j += n/3) do
        sum++;
I thought the answer would be O(n^2), since the first for loop runs n times and the second for loop steps by n/3, giving it another n-divided-by-something iterations, which I assumed would simplify to O(n^2). Could somebody explain why it is O(n)?
This is because the second loop runs a constant number of times (it does not depend on n): it goes from n/3 to 2n with a step of n/3, which is like going from 1/3 to 2 with a step of 1/3.
It will run 5 or 6 times for any reasonable n (not 0); the exact number is unimportant and depends on how the division rounds.
The inner loop increments by a multiple of n, not by 1, so its iteration count is bounded by a constant (about 6). So the total number of steps is bounded by a constant multiple of n (namely 6n).
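Here is a quick Java sketch of my own that confirms the inner loop's iteration count stays flat as n grows:

public class InnerLoopCount {
    public static void main(String[] args) {
        for (int n : new int[] {30, 100, 3000, 1000000}) {
            int inner = 0;
            for (int j = n / 3; j < 2 * n; j += n / 3)
                inner++;
            // Prints 5 or 6 regardless of n (integer division causes the wobble).
            System.out.println("n = " + n + ": inner loop ran " + inner + " times");
        }
    }
}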

Order of growth for loops

What would be the order of growth of the code below? My guess was that each loop's growth is linear, but the if statement is confusing me. How do I include it in the overall analysis? I would very much appreciate an explanatory answer so I can understand the process involved.
int count = 0;
for (int i = 0; i < N; i++)
    for (int j = i+1; j < N; j++)
        for (int k = j+1; k < N; k++)
            if (a[i] + a[j] + a[k] == 0)
                count++;
There are two things that can be confusing when trying to determine the code's complexity.
The fact that not all loops start from 0. The second loop starts from i + 1 and the third from j + 1. Does this affect the complexity? It does not. Let's consider only the first two loops. For i = 0, the second runs N - 1 times, for i = 1 it runs N - 2 times, ..., for i = N - 1 it runs 0 times. Add all these up:
0 + 1 + ... + N - 1 = N(N - 1) / 2 = O(N^2).
So not starting from 0 does not affect the complexity (remember that big-oh ignores lower-order terms and constants). Therefore, even under this setting, the entire thing is O(N^3).
The if statement. The if statement is clearly irrelevant here, because it's only part of the last loop and contains no break statement or other code that would affect the loops. It only affects the incrementation of a count, not the execution of any of the loops, so we can safely ignore it. Even if the count isn't incremented (an O(1) operation), the if condition is checked (also an O(1) operation), so the same rough number of operations is performed with and without the if.
Therefore, even with the if statement, the algorithm is still O(N^3).
Order of growth of the code would be O(N^3).
In general, k nested loops each running up to N times contribute growth of O(N^k).
Here are two ways to find that the time complexity is Theta(N^3) without much calculation.
First, you select i<j<k from the range 0 through N-1. The number of ways to choose 3 objects out of N is the binomial coefficient N choose 3 = N*(N-1)*(N-2)/(3*2*1) ~ (N^3)/6 = O(N^3), and more precisely Theta(N^3).
Second, an upper bound is that you choose i, j, and k from N possibilities, so there are at most N*N*N = N^3 choices. This is O(N^3). You can also find a lower bound of the same type since you can choose i from 0 through N/3-1, j from N/3 through 2N/3-1, and k from 2N/3 through N-1. This gives you at least floor(N/3)^3 choices, which is about N^3/27. Since you have an upper bound and lower bound of the same form, the time complexity is Theta(N^3).
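Here is a quick Java sketch of my own to check the binomial-coefficient argument; the innermost statement runs exactly N choose 3 times:

public class TripleLoopCount {
    public static void main(String[] args) {
        int N = 50;
        long count = 0;
        for (int i = 0; i < N; i++)
            for (int j = i + 1; j < N; j++)
                for (int k = j + 1; k < N; k++)
                    count++; // stands in for the if-check on a[i] + a[j] + a[k]
        long choose3 = (long) N * (N - 1) * (N - 2) / 6;
        System.out.println(count + " == " + choose3); // 19600 == 19600
    }
}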

Theoretical analysis of comparisons

I'm first asked to develop a simple sorting algorithm that sorts an array of integers in ascending order and put it into code:
int i, j;
for (i = 0; i < n - 1; i++)
{
    if (A[i] > A[i+1])
        swap(A, i+1, i);
    for (j = n - 2; j > 0; j--)
        if (A[j] < A[j-1])
            swap(A, j-1, j);
}
Now that I have the sort function, I'm asked to do a theoretical analysis for the running time of the algorithm. It says that the answer is O(n^2) but I'm not quite sure how to prove that complexity.
What I know so far is that the 1st loop runs from 0 to n-1, (so n-1 times), and the 2nd loop from n-2 to 0, (so n-2 times).
Doing the recurrence relation:
let C(n) = the number of comparisons
for C(2) = C(n-1) + C(n-2)
= C(1) + C(0)
C(2) = 0 comparisons?
C(n) in general would then be: C(n-1) + C(n-2) comparisons?
If anyone could guide me step by step, that would be greatly appreciated.
When doing a "real" big O - time complexity analysis, you select one operation that you count, obviously the one that dominates the running time. In your case you could either choose the comparison or the swap, since worst case there will be a lot of swaps right?
Then you calculate how many times this will be evoked, scaling to input. So in your case you are quite right with your analysis, you simply do this:
C = O((n - 1)(n - 2)) = O(n^2 -3n + 2) = O(n^2)
I come up with these numbers through reasoning about the flow of data in your code. You have one outer for-loop iterating right? Inside that for-loop you have another for-loop iterating. The first for-loop iterates n - 1 times, and the second one n - 2 times. Since they are nested, the actual number of iterations are actually the multiplication of these two, because for every iteration in the outer loop, the whole inner loop runs, doing n - 2 iterations.
As you might know you always remove all but the dominating term when doing time complexity analysis.
There is a lot to add about worst-case complexity and average case, lower bounds, but this will hopefully make you grasp how to reason about big O time complexity analysis.
I've seen a lot of different techniques for actually analyzing the expression, such as your recurrence relation. However I personally prefer to just reason about the code instead. There are few algorithms which have hard upper bounds to compute, lower bounds on the other hand are in general very hard to compute.
Your analysis is correct: the outer loop makes n-1 iterations, and the inner loop makes n-2.
So, for each iteration of the outer loop, you have n-2 iterations of the inner loop. Thus, the total number of steps is (n-1)(n-2) = n^2 - 3n + 2.
The dominating term (which is what matters in big-O analysis) is n^2, so you get O(n^2) runtime.
I personally wouldn't use the recurrence method in this case. Writing the recurrence equation is usually helpful in recursive functions, but in simpler algorithms like this, sometimes it's just easier to look at the code and do some simple math.
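To tie the reasoning to the code, here is a small Java sketch of my own that instruments the sort with a counter for the inner-loop comparisons, matching the (n-1)(n-2) figure above (the extra n-1 comparisons from the if before the inner loop don't change the O(n^2) result):

public class SortComparisons {
    static long comparisons = 0; // counts inner-loop comparisons only

    static void swap(int[] A, int x, int y) {
        int tmp = A[x];
        A[x] = A[y];
        A[y] = tmp;
    }

    static void sort(int[] A) {
        int n = A.length;
        for (int i = 0; i < n - 1; i++) {
            if (A[i] > A[i + 1])
                swap(A, i + 1, i);
            for (int j = n - 2; j > 0; j--) {
                comparisons++; // one comparison per inner-loop iteration
                if (A[j] < A[j - 1])
                    swap(A, j - 1, j);
            }
        }
    }

    public static void main(String[] args) {
        int n = 100;
        int[] A = new int[n];
        for (int k = 0; k < n; k++)
            A[k] = n - k; // reverse-sorted input
        sort(A);
        // The inner loop runs n-2 times for each of the n-1 outer iterations:
        System.out.println(comparisons + " == " + (long) (n - 1) * (n - 2));
    }
}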
