So I was looking at this code from a textbook:
for (int i=0; i<N; i++)
for(int j=i+1; j<N; j++)
The author stated that the inner for-loop iterates for exactly N*(N-1)/2 times but gives no basis for how he arrived to such an equation. I understand N*(N-1) but why divide by 2? I ran the code myself and sure enough when N is 10, the inner loop iterates 45 times (10*9/2).
I messed around with the code myself and tried the following (assigned only i to j):
for (int i=0; i<N; i++)
for(int j=i; j<N; j++)
With N = 10, this results in 55. So I'm having trouble understanding the underlying math here. Sure I could just plug in all the values and bruteforce my way through the problem, but I feel there is something essential and very simple I'm missing. How would you come up with an equation for describing the for loop I just constructed? Is there a way to do it without relying on the outputs? Would really appreciate any help thanks!
Think about what happens each time the outer loop iterates. The first time, i == 0, so the inner loop starts at 1 and runs to N-1, which is N-1 iterations in total. The next time through the outer loop, i has incremented to 1, so the inner loop starts at 2 and runs up to N-1, for a total of N-2 iterations. And that pattern continues: the third time through the outer loop, you get N-3 iterations, the fourth time through, N-4, etc. When you get to the last iteration of the outer loop, i == N-1, so the inner loop starts with j = N and stops immediately. So that's zero iterations.
The total number of iterations is the sum of all these numbers:
(N-1) + (N-2) + (N-3) + ... + 1 + 0
To look at it another way, this is just the sum of the positive integers from 1 to N-1. The result of this sum is called the (N-1)th triangular number, and Wikipedia explains how you can find that the formula for the n'th triangular number is n(n+1)/2. But here you have the (N-1)th triangular number, so if you set n=N-1, you get
(N-1)(N-1+1)/2 = N(N-1)/2
You're looking at nested loops where the outer one runs N times and the inner one (N-1). You're in effect adding up the sum of 1 + 2 + 3 + ....
The N * (N+1) / 2 is a "classic" formula in mathematics. Young Carl Gauss, later a famous mathematician, was given in-class busywork: Adding up the numbers from 1 to 100. The teacher expected to keep the kids busy for an hour but Carl came up with the answer almost immediately: 5050. He explained: 1 + 100; 2 + 99; 3 + 98; 4 + 97; and so on up to 50 + 51. That's 50 sums of 101 each. You could also see that as (100 / 2) * (100 + 1); that's where the /2 comes from.
As for why it's (N-1) instead of the (N+1) I mentioned... that could have to do with starting from 1 rather than 0, that would drop one iteration from the inner loop, I think.
Look at how many times the inner (j) loop runs for each value of i. When N = 10, the outer (i) loop runs 10 times, and the j loop should run 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9 times. Now you just add up those numbers to see how many times the inner loop runs. You can sum the numbers from 0 to N-1 with the formula N(N-1)/2. This is a very slight modification of a well-known formula for adding the numbers from 1 to N.
For a visual aid, you can see why 1 + 2 + 3 + ... + n = n * (n+1) / 2
If you count the iterations of the inner loop, you get:
1 2 3 4 5 6 7 8 9 10
To get the total for an arbitrary number of iterations, you can "wrap" the numbers around like this:
0 1 2 3 4
9 8 7 6 5
Now, if we add each of those columns, the all add to 9 (N-1), and there are 5 (N/2) columns. It's pretty obvious that for any even N, we'd still get N/2 columns that each added up to (N-1). As such, when the total number of iterations is even, the total number of iterations is always (N/2)(N-1), which (thanks to the commutative property) we can rewrite as N(N-1)/2.
If we did the same for an odd number of iterations, we'd have one "odd" column that couldn't be paired. In this case, we can ignore the '0' since we know it won't affect the overall sum in any case. For example, let's consider N=9 instead of N=10. For that, we get:
1 2 3 4
8 7 6 5
This gives us (N-1)/2 columns (9-1=8, 8/2=4) that each add up to N, so the sum will be N*(N-1)/2. Even though we've arrived at it slightly differently, this is an exact match for the formula above for when N is even. Again, it seems pretty obvious that this would remain true regardless of the number of columns we used (i.e., total number of iterations).
For any N (odd or even), the sum of the numbers from 0 through N-1 is N*(N-1)/2.
Related
I have some pseudocode where I have to calculate the number of times the code is executed. The code is provided below.
Historgram1(A,B,N)
1 for k = 1 to N
2 B[k] = 0
3 for i = 1 to N
4 if A[i] == k
5 then B[k] = B[k] + 1
I know that line one would run n times but I am not sure about the others. I am also assuming that line 2 would run at least once since we are setting B[k] to a value of zero or would it be n + 1?
I am also unsure how to calculate th subsequent lines. Any help or guidance would be much appreciated.
Obviously, 1 and the whole loop body (2 to 5) are executed N times (for all values of k).
On every iteration of the outer loop, the inner loop (3) is executed N times (for all values of i), as is line 4; but line 5 is executed conditionally, only if A[i]==k.
So far we can say:
1, 2: N times
3, 4: N² times.
Now line 5 is executed every time some A[i] equals some k in [1, N]. This may occur at most N times (once per A[i]), but might never occur, that's all we can say without knowing more about A.
The global behavior of the algorithm is O(N²), due to lines 3 and 4.
Anyway, the function name hints that we are computing an histogram, so it is likely that most of the values are in range [1,k] (unless the choice of the bins is very poor). So we can conjecture that line 5 is executed close to N times, without exceeding it.
This program uses a very poor method to compute the histogram, as this can be done in time O(N).
I can't understand this particular use of the sigma(summation) notation in the explanation of the Insertion sort of the book Introduction to Algorithms by CLRS:
Let tj denote the number of times the while loop test in line 5 is executed for that value of j.
Can someone explain the use of sigma(summation) in Line 5,6,7?
I am aware of the summation formulas and uses.
I think I finally can clearly understand.
The Sigma is expressing that for each j, the while loop may run up to t times. So, when j is equal to 2, the while loop will run t times, when j is equal to 3, the while loop will run t times, but since we don't know if this t, when j=3, is equal to the previous t, when j = 2, we add a subscript to indicate that there are different t's.
The sum runs from 2 to n, and this already represent the for loop that is running in the outer layer.
So, in summary, the limits are from 2 to n, and each time we are in the for loop and we get to the while loop, this while loop will run t times.
Note - the sigma expressions should be j instead of t[j]. I'm not sure why the book used t[j] since it's using expressions of n for the other lines.
The sigma's are expressing the number of times that the corresponding line will be executed (worst case). For Sigma(j=2 to n) on line 5, that's a total count of 2+3+4+...+n = (n+2)(n-1)/2 = 1/2 n^2 + 1/2 n - 1. Note that for line 5, for j = 2, it's counting line 5 twice, once for i = 1 (the first compare of i > 0) and once for i = 0 (the second compare of i > 0).
Lines 6 and 7 depend A[i]> key, and worst case (A[i] always > key), loop one less than line 5, which explains the (tj - 1) factor => (2-1)+(3-1) + ... = 1+2+3+...+n-1 = (n)(n-1)/2.
This can be explained like:
[Outer loop] Line five will iterate 2 to n times (Let it is t times)
[Inside loop] Line six and seven will iterate same times from `2 to (t-1) times.
This question already has answers here:
Big O, how do you calculate/approximate it?
(24 answers)
Closed 5 years ago.
I'm studying algorithm's complexity and I'm still not able to determine the complexity of some algorithms ... Ok I'm able to figure out basic O(N) and O(N^2) loops but I'm having some difficult in routines like this one:
// What is time complexity of fun()?
int fun(int n)
{
int count = 0;
for (int i = n; i > 0; i /= 2)
for (int j = 0; j < i; j++)
count += 1;
return count;
}
Ok I know that some guys can calculate this with the eyes closed but I would love to to see a "step" by "step" how to if possible.
My first attempt to solve this would be to "simulate" an input and put the values in some sort of table, like below:
for n = 100
Step i
1 100
2 50
3 25
4 12
5 6
6 3
7 1
Ok at this point I'm assuming that this loop is O(logn), but unfortunately as I said no one solve this problem "step" by "step" so in the end I have no clue at all of what was done ....
In case of the inner loop I can build some sort of table like below:
for n = 100
Step i j
1 100 0..99
2 50 0..49
3 25 0..24
4 12 0..11
5 6 0..5
6 3 0..2
7 1 0..0
I can see that both loops are decreasing and I suppose a formula can be derived based on data above ...
Could someone clarify this problem? (The Answer is O(n))
Another simple way to probably look at it is:
Your outer loop initializes i (can be considered step/iterator) at n and divides i by 2 after every iteration. Hence, it executes the i/2 statement log2(n) times. So, a way to think about it is, your outer loop run log2(n) times. Whenever you divide a number by a base continuously till it reaches 0, you effectively do this division log number of times. Hence, outer loop is O(log-base-2 n)
Your inner loop iterates j (now the iterator or the step) from 0 to i every iteration of outer loop. i takes the maximum value of n, hence the longest run that your inner loop will have will be from 0 to n. Thus, it is O(n).
Now, your program runs like this:
Run 1: i = n, j = 0->n
Run 2: i = n/2, j = 0->n/2
Run 3: i = n/4, j = 0->n/4
.
.
.
Run x: i = n/(2^(x-1)), j = 0->[n/(2^(x-1))]
Now, runnning time always "multiplies" for nested loops, so
O(log-base-2 n)*O(n) gives O(n) for your entire code
Lets break this analysis up into a few steps.
First, start with the inner for loop. It is straightforward to see that this takes exactly i steps.
Next, think about which different values i will assume over the course of the algorithm. To start, consider the case where n is some power of 2. In this case, i starts at n, then n/2, then n/4, etc., until it reaches 1, and finally 0 and terminates. Because the inner loop takes i steps each time, then the total number of steps of fun(n) in this case is exactly n + n/2 + n/4 + ... + 1 = 2n - 1.
Lastly, convince yourself this generalizes to non-powers of 2. Given an input n, find smallest power of 2 greater than n and call it m. Clearly, n < m < 2n, so fun(n) takes less than 2m - 1 steps which is less than 4n - 1. Thus fun(n) is O(n).
Say i have the following code
def func(A,n):
for i = 0 to n-1:
for k = i+1 to n-1:
for l = k+1 to n-1:
if A[i]+A[k]+A[l] = 0:
return True
A is an array, and n denotes the length of A.
As I read it, the code checks if any 3 consecutive integers in A sum up to 0. I see the time complexity as
T(n) = (n-2)(n-1)(n-2)+O(1) => O(n^3)
Is this correct, or am I missing something? I have a hard time finding reading material about this (and I own CLRS)
You have the functionality wrong: it checks to see whether any three elements add up to 0. To improve execution time, it considers them only in index order: i < k < j.
You are correct about the complexity. Although each loop takes a short-cut, that short-cut is merely a scalar divisor on the number of iterations. Each loop is still O(n).
As for the coding, you already have most of it done -- and Stack Overflow is not a coding service. Give it your best shot; if that doesn't work and you're stuck, post another question.
If you really want to teach yourself a new technique, look up Python's itertools package. You can use this to generate all the combinations in triples. You can then merely check sum(triple) in each case. In fact, you can use the any method to check whether any one triple sums to 0, which could reduce your function body to a single line of Python code.
I'll leave that research to you. You'll learn other neat stuff on the way.
Addition for OP's comment.
Let's set N to 4, and look at what happens:
i = 0
for k = 1 to 3
... three k loop
i = 1
for k = 2 to 3
... two k loops
i = 2
for k = 3 to 3
... one k loop
The number of k-loop executions is the "triangle" number of n-1: 3 + 2 + 1. Let m = n-1; the formula is T(m) = m(m-1)/2.
Now, you propagate the same logic to the l loops. You run T(k) loops on l for k= 1, 2, 3. If I recall, this third-order "pyramid" formula is P(m) = m(m-1)(m-2)/6.
In terms of n, this is (n-1)(n-2)(n-3)/6 loops on l. When you multiply this out, you get a straightforward cubic formula in n.
Here is the sequence for n=5:
0 1 2
0 1 3
0 1 4
change k
0 2 3
0 2 4
change k
0 3 4
change k
change k
change l
1 2 3
1 2 4
change k
1 3 4
change k
change k
change l
2 3 4
BTW, l is a bad variable name, easily confused with 1.
Analyze the following sorting algorithm:
for (int i = 0; i < SIZE; i++)
{
if (list[i] > list[i + 1])
{
swap list[i] with list[i + 1];
i = 0;
}
}
I want to determine the time complexity for this, in the worse case...I don't understand how it is O(n^3)
Clearly the for loop by itself is O(n). The question is, how many times can it run?
Every time you do a swap, the loop starts over. How many times will you do a swap? You will do a swap for each element from its starting position until it reaches its proper spot in the sorted output. For an input that is reverse sorted, that will average to n/2 times, or O(n) again. But that's for each element, giving another O(n). That's how you get to O(n^3).
I ran an analysis for n = 10 and n = 100. The number of comparisons seems to be O(n3) which makes sense because i gets set to 0 an average of n / 2 times so it's somewhere around n2*(n/2) comparison and increment operations for your for loop, but the number of swaps seems to be only O(n2) because obviously no more swaps are necessary to sort the entire list. The best case is still n-1 comparisons and 0 swaps of course.
For best-case testing I use an already sorted array of n elements: [0...n-1].
For worst-case testing I use a reverse-sorted array of n elements: [n-1...0]
def analyzeSlowSort(A):
comparison_count = swap_count = i = 0
while i < len(A) - 1:
comparison_count += 1
if A[i] > A[i+1]:
A[i], A[i+1] = A[i+1], A[i]
swap_count += 1
i = 0
i += 1
return comparison_count, swap_count
n = 10
# Best case
print analyzeSlowSort(range(n)) # ->(9, 0)
# Worst case
print analyzeSlowSort(range(n, 0, -1)) # ->(129, 37)
n = 100
# Best case
print analyzeSlowSort(range(n)) # ->(99, 0)
# Worst case
print analyzeSlowSort(range(n, 0, -1)) # ->(161799, 4852)
Clearly this is a very inefficient sorting algorithm in terms of comparisons. :)
Okay.. here goes..
in the worst case lets say we have a completely flipped array..
9 8 7 6 5 4 3 2 1 0
Each time there is a swap.. the i is getting resetted to 0.
Lets start by flipping 9 8 : We have now 8 9 7 6 5 4 3 2 1 0 and the i is set back to zero.
Now the loop runs till 2 and we have a flip again.. : 8 7 9 6 5 4 3 2 1 0 i reset again.. but to get 7 to the first place we have another flip for 8 and 7. : 7 8 9 6 5 4 3 2 1 0
So the number of loops are like this :
T(1) = O(1)
T(2) = O(1 + 2)
T(3) = O(1 + 2 + 3)
T(4) = O(1 + 2 + 3 + 4) and so on..
Finally For nth term which is the biggest in this case its T(n) = O(n(n-1)/2).
But for the entire thing you need to sum all of these terms up
Which can be bounded by the case Summation of (T(n)) = O(Summation of (n^2)) = O(n^3)
Addition
Think of it this way: For each element you need to go up to it and bring it back.. but when you bring it back its just by one space. I hope that makes it a little more clear.
Another Edit
If any of the above is not making sense. Think of it this way : You have to bring 0 to the front of the array. You have initially walk up to the zero 9 steps and put it before 1. But after that you are magically transported (i=0) to the beginning of the array. So now you have to walk 8 steps to the zero and then bring it in two's position. Again Zap! and you are back to start of the array. How many steps approximately you have to take to get to zero each time so that its right at the front. 9 + 8 + 7 + 6 + 5 + .. this is the last term of the recurrence and so is bounded by the Square of the length of the array. Does this make sense? Now to do this for each of the element on average you are doing O(n) work.. right? Which translates to summing all the terms up.. And we have O(n^3).
Please comment if things help or don't make sense.