Calculating the number of times an if statement is executed - algorithm

This code counts how many integer triples sum to 0: The full code is here.
initialise an int array of length n
int cnt = 0 // cnt is the number of triples that sum to 0
for (int i = 0; i < n; i++) {
for (int j = i+1; j < n; j++) {
for (int k = j+1; k < n; k++) {
if (array[i]+array[j]+array[k] == 0) {
cnt++;
}
}
}
}
Now, from the book Algorithms by Robert Sedgewick, I read that:
The initialisation of cnt to 0 is executed exactly once.
cnt++ is executed from 0 to the number of times a triple is found.
The if statement is executed n(n-1)(n-2)/6 times.
I've done some experiments and all of them are true. But I completely don't know how they calculate the number of times the if statement got executed.
I'm not sure, but I think that:
n means from i to n
(n-1) means from i+1 to n
(n-2) means from j+1 to n
/6 I don't know what's this for.
Can anyone explain how to calculate this?

It's sum of sums.
The inner loop is executed n-j-1 times each time it is being reached
The middle loop is executed n-i-1 times each time it is being reached
The outer loop is executed n times.
Sum all of these and you get total number of times the cnt++ is invoked.
Note that the number of times the middle loop is executed each time is NOT n-1, it is n-i-1, where i is the index of the outer loop. Similarly for middle loop.
The /6 factor is coming from taking it into account in the summation formula.

First loop executes for N times (0 to N-1)
Time to execute outer loop is:
Fi(0) + Fi(1) + Fi(2)...Fi(N-1)
When i is 0, middle loop executes N-1 times (1 to N-1)
When i is 1, middle loop executes N-2 times (2 to N-1)
...
Time to execute middle loop is:
Fi(0) = Fj(1) + Fj(2) ... Fj(N-1)
Fi(1) = Fj(2) + Fj(3) ... Fj(N-1)
Fi(0) + Fi(1) + Fi(2)...Fi(N-1) = Fj(1) + 2Fj(2) + ... (N-1)Fj(N-1)
Now come to the inner most loop:
When j is 1, inner loop executes N-2 times (2 to N-2)
When j is 2, inner loop executes N-3 times (3 to N-2)
...
Fj(1) = Fk(2) + Fk(3) ... Fk(N-1) = 2 + 3 + ... N-1
Fj(2) = Fk(3) + Fk(4) ... Fk(N-1) = 3 + 4 + ... N-1
Fj(1) + 2Fj(2) + ... (N-1)Fj(N-1) = (2 + 3 + ... N-1) + (3 + 4 + ... N-1) ... (N-1)
= 1 x 2 + 2 x 3 + 3 x 4 .... (N-2) x (N-1)
= 1x1 + 2x2 + 3x3 + 4x4 .... (N-1)*(N-1) - (1 + 2 + 3 + 4 + N-1)
= (N-1) N (N+1) / 6 - N (N-1) / 2
= N (N-1) ((N+1)/2 - 1/2)
= N (N-1) (N-2) / 6
You may want to also check: Formula to calculate the sum of squares of first N natural numbers and sum of first N natural numbers.
Alternate explanation:
You are finding all pairs of triplets. This can be done in NC3 ways. i.e. (N) * (N-1) * (N-2) / (1 * 2 * 3) ways.

This can be viewed as a combinatorial problem. To pick 3 unique items from n items (k=3 in the linked article) gives n!/(n-3)! = n*(n-1)*(n-2) possibilities. However, in the code the order of the 3 items doesn't matter. For each combination of 3 items, there are 3! = 6 permutations. So we need to divide by 6 to get only orderless possibilities. So we get n!/(3!(n-3)!) = n(n-1)(n-2)/6

The basis of this formula comes from the sum of a progression:
1+2 = 3
1+2+3 = 6
1+2+3+4 = 10
There exists the Formula:
Sum(1..N) == N*(N+1)/2
1+2+3+4 = 4*5/2 = 10
With a recursive progression (like in this case) you get another formula for the sums.

In your code, where i runs from 0 to n, j from i to n, k from j to n, the if statement is executed about n^3 / 6 times. To see why that is so, look at this code which will obviously execute the if statement just as often:
int cnt = 0 // cnt is the number of triples that sum to 0
for (int i = 0; i < n; i++) {
for (int j = 0; j < n; j++) {
for (int k = 0; k < n; k++) {
if (j > i && k > j) {
if (array[i]+array[j]+array[k] == 0) {
cnt++;
}
}
}
}
}
The inner loop now obviously executes n^3 times. The if statement is executed if i < j < k. We ignore the case that i == j or i == k or j == k. The three variables i, j and k could be sorted in six different orders (i < j < k, i < k < j, j < i < k etc.). Since each of these six different sorting orders happens equally often, about n^3 / 6 times we have the order i < j < k.

Related

What is the time complexity of this code that has 3 nested loops but j stops when it's equal to i * 2?

for (int i = 1 to n) {
for (int j = i to n) {
for (int k = j to n) {
sum += a[i] * b[j] * c[k]; //O(1)
}
if (j == 2 * i) {
j = n;
}
}
}
I've spent so many hours tracing the code and tried different n values. I realize that j doesn't run more than [n/2 + 1] times. I made a table that looks like this for n = 8, 9, 12.
(Apologies for the picture, I am new to Stack Overflow)
i and j are values of i and j, whereas the k is the number of times that innermost loop runs
So it turns into something like:
[n+(n-1)] + [(n-1)+(n-2)+(n-3)] + [(n-2)+(n-3)+(n-4)+(n-5)] + ...+ [3+2+1] + [2+1] + 1
I'm not sure how to put this into a arithmetic progression or summation. Please help.
There's n-j work in the inner loop.
For any particular value of j, there's approximately j/2 + 1 possible values of i. For example j can take the value 10 when i is 5,6,7,8,9 or 10.
So the total time complexity is sum((j/2+1) * (n-j), j=1..n), which is n^3/12+n^2/2-7/12 = O(n^3)
I've been sloppy with rounding here, but it doesn't affect the complexity.

Why is this snippet O(n)?

This code snippet is suppose to have a complexity of O(n). Yet, I don't understand why.
sum = 0;
for (k = 1; k <= n; k *= 2) // For some arbitrary n
for (j = 1; j <= k; j++)
sum++;
Now, I understand that the outer loop by itself is O(log n), so why is it that adding the inner loop makes this O(n).
Let's assume that n is a power of 2 for a moment.
The final iteration of the inner loop will run n times. The iteration before that will run n/2 times, the second-to-last iteration n/4 times, and so on up until the first iteration which will run once. This forms a series which sums to 2n − 1 total iterations. This is O(n).
(For example, with n = 16, the inner loop runs 1 + 2 + 4 + 8 + 16 = 31 total times.)
Let m = floor(lg(n)). Then 2^m = C*n with 1 <= C < 2. The number k of steps in the inner loop goes like:
1, 2, 4, 8, ..., 2^m = 2^0, 2^1, ..., 2^m
Therefore the total number of operations is
2^0 + 2^1 + ... + 2^m = 2^{m+1} - 1 ; think binary
= 2*2^m - 1
= 2*C*n - 1 ; replace
= O(n)

Big O Notation for two code fragments

I've have two fragments of code and an explanation of what Big O category they fall into. However, try as I might, I can't tally the explanation with what I can come up either by looking at it or doing sample runs.
The first:
long count = 0;
long n = 1000;
long i, j, k;
for(i = 0; i < n; i++)
for (j = 0; j < i * i; j++)
for (k = 0; k < j; k++)
count++;
Sample runs of this consistently give me N^4, but the answer I've been given is "j can be as large as i^2, which could be as large as N^2. k can be as large as j, which is N^2. The running time is thus proportional to N^N^2^N^2, which is O(N^5)"
Second snippet:
long i, j, k;
long n = 1000;
long count = 0;
for (i = 1; i < n; i++)
for (j = 1; j < i * i; j++)
if (j % i == 0)
for (k = 0; k < j; k++)
count++;
For this the notes say "The if statement is executed at most N3 times, by previous arguments, but it is true only O(N^2) times (because it is true exactly i times for each i). Thus the innermost loop is only executed O(N^2) times. Each time through, it takes O(j^2) = O(N^2) time, for a total of O(N^4)"
For this the notes seem to be accurate enough for the N^4 (although I keep getting a result of N^4 / 10). I don't follow the modulo calculation only being true i times for each i however, it seems to enter that loop a lot less.
So the question is can anyone clarify what I'm not understanding?
For the first one:
sum from i = 0 to n-1 of
sum from j = 0 to i*i-1 of
sum from k = 0 to j-1 of
1
We know the sum of 1 m times is equal to m, so we can reduce this to
sum from i = 0 to n-1 of
sum from j = 0 to i*i-1 of
j
We know the sum 1 + 2 + ... + m = m * (m + 1) / 2, so we can reduce further:
sum from i = 1 to n-1 of
(i * i - 1) * i * i / 2 = (1/2) * (i * i * i * i - i * i)
We can make this easier by taking the (1/2) outside the summation and then splitting up the i * i * i * i and i * i terms; however, the summations are still harder and less well-known than for i alone. It does turn out to be Theta(n^5) hence O(n^5); to at least get an intuitive feeling for why this turns out, recognize that the difference f(n+1) - f(n) = (1/2)(n^4-n^2) which is on the order of n^4, so if f were a continuous function and this difference were the derivative, then the order of f would be one higher.
For the second case:
sum from i = 0 to n-1 of
sum from j = 0 to i-1 of
sum from k = 0 to i*j-1
1
Note that j now assumes only i different values for the purposes of the innermost loop: 0, i, 2i, ..., (i-1)i. The inner loop runs for i times as many iterations as the counter value for j. We do this multiplication shifting to avoid introducing a "step" notation so we can use our usual mathematical results.
sum from i = 0 to n-1 of
sum from j = 0 to i-1 of
i*j
sum from i = 0 to n-1 of
i * (1/2) * i * (i - 1) = (1/2)(i * i * i - i)
Again, we can cheat or do the math or we can use our intuition again to (correctly) surmise this turns out to be Theta(n^4).

Time complexity of the algorithm?

This is the algorithm: I think its time complexity is O(n^2) because of loop in loop. How can I explain that?
FindSum(array, n, t)
i := 0
found := 0
array := quick_sort(array, 0, n - 1)
while i < n – 2
j = i + 1
k = n - 1
while k > j
sum = array[i] + array[j] + array[k]
if sum == t
found += 1
k -= 1
j += 1
else if sum > t
k -= 1
else
j += 1
Yes, the complexity is indeed O(n^2).
The inner loops runs anywhere between k-j = n-1-(i+1) = n-i-2 to (k-j)/2 = (n-i-2)/2 iterations.
Summing it up for all possible values of i from 0 to n-2 gives you:
T = n-0-2 + n-1-2 + n-2-2 + ... + n-(n-2)-2
= n-2 + n-3 + ... + 0
This is sum of arithmetic progression, that sums in (n-1)(n-2)/2 (sum of arithmetic progression), which is quadric. Note that dividing by extra 2 (for "best" case of inner loop) does not change time complexity in terms of big O notation.

Asymptotic analysis of three interdependent nested for loops

The code fragment I am to analyse is below:
int sum = 0;
for (int i = 0; i < n; i++)
for (int j = 0; j < i * i; j++)
for (int k = 0; k < j; k++)
sum++;
I know that the first loop is O(n) but that's about as far as I've gotten. I think that the second loop may be O(n^2) but the more I think about it the less sense it makes. Any guidance would be much appreciated.
The first loop executes n times. Each time, the value i grows. For each such i, the second loop executes i*i times. That means the second loop executes 1*1 + 2*2 + 3*3 + ... + n*n times.
This is a summation of squares, and the formula for this is well-known. Hence we have the second loop executing (n(1 + n)(1 + 2 n))/6 times.
Thus, we know that in total there will be (n(1 + n)(1 + 2 n))/6 values of j, and that for each of these the third loop will execute 1 + 2 + ... + j = j(j+1)/2 times. Actually calculating how many times the third loop executes in total would be very difficult. Luckily, all you really need is a least upper bound for the order of the function.
You know that for each of the (n(1 + n)(1 + 2 n))/6 values of j, the third loop will execute less than n(n+1)/2 times. Therefore you can say that the operation sum++ will execute less than [(n(1 + n)(1 + 2 n))/6] * [n(n+1)/2] times. After some quick mental math, that amounts to a polynomial of maximal degree 5, therefore your program is O(n^5).
int sum = 0;
for (int i = 0; i < n; i++) // Let's call this N
for (int j = 0; j < i * i; j++) // Let's call this M
for (int k = 0; k < j; k++) // Let's call this K
sum++;
N is the number of steps of the entire program, M is the number of steps the two inner loops do and lastly K is the number of steps the last loop does.
It is easy to see that K = j, it takes j steps to do.
Then M = Sum(j=0,i^2,K) = Sum(j=0, i^2, j)
(First param is the iterator, second is the upper bound and last param is what we are adding)
This is actually now a sum of n numbers to i*i. Which means we can apply the formula ((n+1)*n)/2
M = Sum(j=0,i^2,j) = ((i^2+1)*(i^2))/2 = (i^4 + i^2)/2
N = Sum(i=0, n, M) = 1/2 * ( Sum(i=0, n, (i^4)) + Sum(i=0, n, (i^2)) )
These are both well known formulas and after a little playing you get:
N = (n^5)/10 + (n^4)/4 + (n^3)/3 + (n^2)/4 + n/15
This should be the exact number of steps the loop takes, but if you are interested in the O notation you can note that n^5 is the strongest growing part so the solution is O(n^5)
If you proceed methodically using Sigma Notation, you'll end up with the following result:
Try to count how many times the inner loop is executed:
The middle loop runs
0*0 times when i == 0
1*1 times when i == 1
2*2 times when i == 2
...
n*n = n^2 times when i == n.
So it is O(n^2).

Resources