CORMEN explanation of Insertion Sort Algorithm

I can't understand this particular use of the sigma (summation) notation in the explanation of insertion sort in the book Introduction to Algorithms by CLRS:
Let tj denote the number of times the while loop test in line 5 is executed for that value of j.
Can someone explain the use of sigma (summation) in lines 5, 6, and 7?
I am aware of the summation formulas and their uses.

I think I finally can clearly understand.
The sigma expresses that, for each j, the while loop test may run some number of times, t. When j is 2 the test runs t times, and when j is 3 it runs t times again, but since we don't know whether the t for j = 3 equals the t for j = 2, we add a subscript to show that these are different t's.
The sum runs from 2 to n, which already represents the for loop running in the outer layer.
So, in summary, the limits are 2 to n, and each time the for loop reaches the while loop, the while loop test runs t (with subscript j) times; the sigma adds up those counts.

Note - in the worst case, t[j] equals j, so the sigma expressions can be written in terms of j. The book keeps t[j] because that count depends on the input, whereas the counts for the other lines depend only on n.
The sigmas express the number of times the corresponding line is executed. In the worst case, Sigma(j=2 to n) on line 5 gives a total count of 2+3+4+...+n = (n+2)(n-1)/2 = 1/2 n^2 + 1/2 n - 1. Note that for j = 2, line 5 is counted twice: once for i = 1 (the first compare of i > 0) and once for i = 0 (the second compare of i > 0).
Lines 6 and 7 depend on A[i] > key and, in the worst case (A[i] always > key), run one time fewer than line 5, which explains the (tj - 1) term => (2-1)+(3-1)+...+(n-1) = 1+2+3+...+(n-1) = (n)(n-1)/2.
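The counts above can be checked with a small instrumented version of the algorithm. This is only a sketch in Python, not the book's code: the function name `insertion_sort_counts` is made up for illustration, and it uses 0-based indexing, so the book's j = 2..n becomes 1..n-1 here.

```python
def insertion_sort_counts(a):
    """Sort a copy of `a` with insertion sort and count executions.

    Returns (test_count, body_count): the total number of while-loop tests
    (the book's line 5) and while-loop body executions (lines 6 and 7).
    """
    a = list(a)
    test_count = 0   # sum of t_j over all j
    body_count = 0   # sum of (t_j - 1) over all j
    for j in range(1, len(a)):       # 0-based; the book's j runs from 2 to n
        key = a[j]
        i = j - 1
        while True:
            test_count += 1          # one evaluation of the loop test
            if not (i >= 0 and a[i] > key):
                break
            a[i + 1] = a[i]          # line 6
            i -= 1                   # line 7
            body_count += 1
        a[i + 1] = key
    return test_count, body_count


n = 6
worst = list(range(n, 0, -1))   # reverse-sorted input, so t_j = j
best = list(range(1, n + 1))    # already sorted input, so t_j = 1
print(insertion_sort_counts(worst))  # (20, 15): n(n+1)/2 - 1 and n(n-1)/2
print(insertion_sort_counts(best))   # (5, 0): n - 1 tests, no shifts
```

The sorted input shows why the book keeps t_j general: the same code gives a much smaller sum when the input is favorable.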

This can be explained like this:
[Outer loop] For each j from 2 to n, line five is executed some number of times (call it t).
[Inner loop] Lines six and seven are each executed one time fewer for that j, that is (t-1) times.

Related

How do I calculate the number of times each line of code is executed?

I have some pseudocode where I have to calculate the number of times the code is executed. The code is provided below.
Histogram1(A,B,N)
1 for k = 1 to N
2     B[k] = 0
3     for i = 1 to N
4         if A[i] == k
5             then B[k] = B[k] + 1
I know that line one would run N times but I am not sure about the others. I am also assuming that line 2 would run at least once since we are setting B[k] to a value of zero, or would it be N + 1?
I am also unsure how to calculate the subsequent lines. Any help or guidance would be much appreciated.
Obviously, 1 and the whole loop body (2 to 5) are executed N times (for all values of k).
On every iteration of the outer loop, the inner loop (3) is executed N times (for all values of i), as is line 4; but line 5 is executed conditionally, only if A[i]==k.
So far we can say:
1, 2: N times
3, 4: N² times.
Now line 5 is executed every time some A[i] equals some k in [1, N]. This may occur at most N times (once per A[i]), but might never occur, that's all we can say without knowing more about A.
The global behavior of the algorithm is O(N²), due to lines 3 and 4.
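As a rough check of these counts, here is a Python sketch that mirrors the pseudocode and tallies how often lines 2, 4 and 5 run. The function name `histogram1_counts` and the sample input are assumptions for illustration; the list `A` is 0-based in Python, so `A[i - 1]` stands in for the pseudocode's `A[i]`.

```python
def histogram1_counts(A, N):
    """Mirror Histogram1 and count executions of lines 2, 4 and 5."""
    B = [0] * (N + 1)               # index 0 unused, to mimic 1-based B[1..N]
    counts = {2: 0, 4: 0, 5: 0}     # pseudocode line number -> executions
    for k in range(1, N + 1):
        B[k] = 0
        counts[2] += 1
        for i in range(1, N + 1):
            counts[4] += 1          # the comparison on line 4
            if A[i - 1] == k:
                B[k] += 1
                counts[5] += 1
    return B[1:], counts


# Sample input: the value 7 lies outside [1, N], so line 5 runs fewer than N times.
print(histogram1_counts([1, 3, 3, 7], 4))
# ([1, 0, 2, 0], {2: 4, 4: 16, 5: 3})
```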
Anyway, the function name hints that we are computing a histogram, so it is likely that most of the values are in the range [1, N] (unless the choice of the bins is very poor). So we can conjecture that line 5 is executed close to N times, without exceeding it.
This program uses a very poor method to compute the histogram, as this can be done in time O(N).
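One possible O(N) version, sketched in Python, uses each value directly as a bin index instead of scanning the whole array once per bin. The name `histogram_linear` is hypothetical, and the sketch assumes the entries of A are integers.

```python
def histogram_linear(A, N):
    """Count occurrences of each value in 1..N with a single pass over A."""
    B = [0] * (N + 1)          # index 0 unused, to mimic 1-based B[1..N]
    for value in A:
        if 1 <= value <= N:    # values outside the bins are ignored
            B[value] += 1
    return B[1:]


print(histogram_linear([1, 3, 3, 7], 4))  # [1, 0, 2, 0]
```

Each element of A is touched once, so the work is proportional to N plus the cost of initializing B.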

Complexity of a loop

I've just started to learn about data structures and I'd be glad for some help.
Let's say I have this pseudocode of a selection sort function for an array of numbers A:
for i = n downto 1 by 1, do
    maxPos = i
    for j = i - 1 downto 1 by 1, do
        if A[ j ] > A[maxPos]
            maxPos = j
    swap (A[maxPos], A[ i ])
I want to calculate the number of steps this function takes.
From what I know so far:
Sentence #1 runs n times (if you also count the i-- then the number of steps taken in this sentence is 2n)
Sentence #2 runs n - 1 times.
Sentence no.3 onwards is where I'm starting to get confused:
no.3 runs, from what I understand, $$\sum_{i=1}^{n-1} i$$ times.
Sentence no.4 runs the same number of steps as no.3, only minus 1. That means: $$\sum_{i=1}^{n-1} i - 1$$
Sentence no.5 I don't know how to calculate. I mean, for the worst case it should be the same as no.4, and for the best case it should be 0. But is there any way to write it mathematically for the general case?
no.6 takes the same number of steps as no.2.
I'd appreciate your help very much. <3
P.S. - I know about the Big O cheat sheets and I know the worst case for this function is $$O(n^2)$$
P.S. no.2 - if anybody knows why MathJax doesn't work here, I'd be glad to know what alternatives I can use.
You have, essentially:
for i = n downto 1
for j = i-1 downto 1
The outer loop runs n times. The inner loop runs (n-1) times for the first iteration, (n-2) times for the second iteration, etc. So the total number of iterations of the inner loop is:
(n-1) + (n-2) + (n-3) + ... + (n-n)
That's the sum of the numbers from 1 to n-1, which is n(n-1)/2, making the asymptotic complexity O(n^2).
See also Selection sort:Complexity.
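To see these numbers on a concrete input, here is a Python sketch of the same loops with two counters. The name `selection_sort_counts` is made up for illustration, and the indices are shifted to 0-based. The comparison on sentence no.4 always runs n(n-1)/2 times, while the assignment on sentence no.5 depends on the input and lies somewhere between 0 and that number.

```python
def selection_sort_counts(A):
    """Selection sort (largest element to the back) with step counters.

    Returns (sorted_list, comparisons, updates), where `comparisons` counts
    the `if` test and `updates` counts the `maxPos = j` assignments.
    """
    a = list(A)
    n = len(a)
    comparisons = 0
    updates = 0
    for i in range(n - 1, -1, -1):        # i = n downto 1 in the pseudocode
        max_pos = i
        for j in range(i - 1, -1, -1):    # j = i - 1 downto 1
            comparisons += 1
            if a[j] > a[max_pos]:
                max_pos = j
                updates += 1
        a[max_pos], a[i] = a[i], a[max_pos]
    return a, comparisons, updates


print(selection_sort_counts([3, 1, 4, 1, 5]))  # ([1, 1, 3, 4, 5], 10, 2)
print(selection_sort_counts([1, 2, 3, 4, 5]))  # ([1, 2, 3, 4, 5], 10, 0)
```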

How to effectively calculate an algorithm's time complexity?

I'm studying algorithm complexity and I'm still not able to determine the complexity of some algorithms ... OK, I'm able to figure out basic O(N) and O(N^2) loops, but I'm having some difficulty with routines like this one:
// What is time complexity of fun()?
int fun(int n)
{
    int count = 0;
    for (int i = n; i > 0; i /= 2)
        for (int j = 0; j < i; j++)
            count += 1;
    return count;
}
OK, I know that some people can calculate this with their eyes closed, but I would love to see how to do it step by step, if possible.
My first attempt to solve this would be to "simulate" an input and put the values in some sort of table, like below:
for n = 100
Step i
1 100
2 50
3 25
4 12
5 6
6 3
7 1
OK, at this point I'm assuming that this loop is O(log n), but unfortunately, as I said, no one solves this problem step by step, so in the end I have no clue at all about what was done ....
In case of the inner loop I can build some sort of table like below:
for n = 100
Step i j
1 100 0..99
2 50 0..49
3 25 0..24
4 12 0..11
5 6 0..5
6 3 0..2
7 1 0..0
I can see that both loops are decreasing and I suppose a formula can be derived based on data above ...
Could someone clarify this problem? (The Answer is O(n))
Another simple way to probably look at it is:
Your outer loop initializes i (which can be considered the step/iterator) to n and divides i by 2 after every iteration. Whenever you repeatedly divide a number by 2 until it reaches 0, you perform about log2(n) divisions, so the i /= 2 statement executes about log2(n) times. Hence, the outer loop is O(log-base-2 n).
Your inner loop iterates j (now the iterator or the step) from 0 to i - 1 on every iteration of the outer loop. i takes the maximum value of n, so the longest run of the inner loop is from 0 to n - 1. Thus, a single run of the inner loop is O(n).
Now, your program runs like this:
Run 1: i = n, j = 0->n
Run 2: i = n/2, j = 0->n/2
Run 3: i = n/4, j = 0->n/4
.
.
.
Run x: i = n/(2^(x-1)), j = 0->[n/(2^(x-1))]
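Now, multiplying the bounds of the nested loops, O(log-base-2 n)*O(n), only gives O(n log n), which is a valid but loose upper bound because the inner loop gets shorter on every run.
Adding up the actual work of the runs above, n + n/2 + n/4 + ... < 2n, gives O(n) for your entire code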
Let's break this analysis up into a few steps.
First, start with the inner for loop. It is straightforward to see that this takes exactly i steps.
Next, think about which different values i will assume over the course of the algorithm. To start, consider the case where n is some power of 2. In this case, i starts at n, then n/2, then n/4, etc., until it reaches 1, and finally 0 and terminates. Because the inner loop takes i steps each time, then the total number of steps of fun(n) in this case is exactly n + n/2 + n/4 + ... + 1 = 2n - 1.
Lastly, convince yourself this generalizes to non-powers of 2. Given an input n that is not a power of 2, find the smallest power of 2 greater than n and call it m. Clearly, n < m < 2n, and fun(n) takes no more steps than fun(m), so fun(n) takes at most 2m - 1 steps, which is less than 4n - 1. Thus fun(n) is O(n).
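The bound can also be checked empirically. Below is a direct Python port of fun (a sketch; integer division `//` plays the role of C's `i /= 2`), followed by a comparison of the returned count against 2n.

```python
def fun(n):
    """Python port of the C function: returns how many times count is incremented."""
    count = 0
    i = n
    while i > 0:
        for _ in range(i):   # inner loop: exactly i steps
            count += 1
        i //= 2              # same as the C statement i /= 2 for positive ints
    return count


print(fun(64))    # 127, which is 2n - 1 for a power of 2
print(fun(100))   # 197, which is 100 + 50 + 25 + 12 + 6 + 3 + 1

# The count stays below 2n for every n tried here, consistent with O(n).
print(all(fun(n) < 2 * n for n in range(1, 1000)))  # True
```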

Theta Notation and Worst Case Running time nested loops

This is the code I need to analyse:
i = 1
while i < n
do
    j = 0;
    while j <= i
    do
        j = j + 1
    i = 2i
So, the first loop should run log(2,n) and the innermost loop should run log(2,n) * (i + 1), but I'm pretty sure that's wrong.
How do I use a theta notation to prove it?
An intuitive way to think about this is to see how much work your inner loop does for a fixed value of the outer loop variable i. It's clearly proportional to i itself: j runs from 0 through i, so j = j + 1 executes i + 1 times, which is at most 2i. Thus, if the value of i is 256, you will do j = j + 1 about that many times.
Thus, the total work done is bounded by a constant times the sum of the values that i takes during the outer loop's execution. That variable doubles on every iteration until it catches up with n. With i = 2i (it should be i = 2*i), the bound 2i on the inner loop's work takes the values 2, 4, 8, 16, ..., because we start with 2 iterations of the inner loop when i = 1. This is a geometric series: a, ar, ar^2 ... with a = 2 and r = 2. When n is a power of 2, the last term will be n, as you figured out, and there will be log2 n terms in the series. That is a simple summation of a geometric series.
It doesn't make much sense to have a worst case or a best case for this algorithm because there are no different permutations of the input which is just a number n in this case. Best case or worst case are relevant when a particular input (e.g. a particular sequence of numbers) affects the running time of the algorithm.
The running time is then bounded by the sum of the geometric series (a.(r^num_terms - 1)/(r-1)), here with a = 2, r = 2 and log2 n terms:
T(n) ⩽ 2 + 4 + ... + 2^(log2 n)
= 2 . (2^(log2 n) - 1)
= 2 . (n - 1)
⩽ 3n = O(n)
Thus, you can't be doing work that is more than some constant multiple of n. Hence, the running time of this algorithm is O(n).
You can't be doing some work that is less than some (other) constant multiple of n, since you have to go through the increment in inner loop as shown above. Thus, the running time of this algorithm is also ≥ c.n i.e. it is Ω(n).
Together, this means that running time of this algorithm is Θ(n).
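A quick simulation supports the Θ(n) claim. This Python sketch (the name `inner_increments` is made up for illustration) counts how many times j = j + 1 executes and checks that the total stays between n and 3n for n ≥ 2.

```python
def inner_increments(n):
    """Count executions of j = j + 1 in the nested while loops."""
    total = 0
    i = 1
    while i < n:
        j = 0
        while j <= i:      # runs for j = 0..i, i.e. i + 1 times
            j = j + 1
            total += 1
        i = 2 * i
    return total


print(inner_increments(16))  # 19 = 2 + 3 + 5 + 9

# Lower and upper bounds that are both constant multiples of n.
print(all(n <= inner_increments(n) <= 3 * n for n in range(2, 2000)))  # True
```

For n a power of 2 the exact total works out to n - 1 + log2 n, which is why both bounds hold with small constants.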
You can't use i in your final expression; only n.
You can easily see that the inner loop executes i times each time it is reached. And it sounds like you've figured out the different values that i can have. So add up those values, and you have the total amount of work.

Number of iterations in nested for-loops?

So I was looking at this code from a textbook:
for (int i=0; i<N; i++)
    for(int j=i+1; j<N; j++)
The author stated that the inner for-loop iterates exactly N*(N-1)/2 times but gives no basis for how he arrived at such an equation. I understand N*(N-1) but why divide by 2? I ran the code myself and sure enough when N is 10, the inner loop iterates 45 times (10*9/2).
I messed around with the code myself and tried the following (assigned only i to j):
for (int i=0; i<N; i++)
    for(int j=i; j<N; j++)
With N = 10, this results in 55. So I'm having trouble understanding the underlying math here. Sure I could just plug in all the values and bruteforce my way through the problem, but I feel there is something essential and very simple I'm missing. How would you come up with an equation for describing the for loop I just constructed? Is there a way to do it without relying on the outputs? Would really appreciate any help thanks!
Think about what happens each time the outer loop iterates. The first time, i == 0, so the inner loop starts at 1 and runs to N-1, which is N-1 iterations in total. The next time through the outer loop, i has incremented to 1, so the inner loop starts at 2 and runs up to N-1, for a total of N-2 iterations. And that pattern continues: the third time through the outer loop, you get N-3 iterations, the fourth time through, N-4, etc. When you get to the last iteration of the outer loop, i == N-1, so the inner loop starts with j = N and stops immediately. So that's zero iterations.
The total number of iterations is the sum of all these numbers:
(N-1) + (N-2) + (N-3) + ... + 1 + 0
To look at it another way, this is just the sum of the positive integers from 1 to N-1. The result of this sum is called the (N-1)th triangular number, and Wikipedia explains how you can find that the formula for the n'th triangular number is n(n+1)/2. But here you have the (N-1)th triangular number, so if you set n=N-1, you get
(N-1)(N-1+1)/2 = N(N-1)/2
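Both loop variants from the question can be counted directly and compared with the closed forms. This is a Python sketch; `count_inner_iterations` and its `offset` parameter are hypothetical names, where offset 1 corresponds to j = i+1 and offset 0 to j = i.

```python
def count_inner_iterations(N, offset):
    """Count inner-loop iterations for: for i in 0..N-1, for j in i+offset..N-1."""
    count = 0
    for i in range(N):
        for j in range(i + offset, N):
            count += 1
    return count


N = 10
print(count_inner_iterations(N, 1), N * (N - 1) // 2)  # 45 45
print(count_inner_iterations(N, 0), N * (N + 1) // 2)  # 55 55
```

Starting j at i instead of i+1 adds one iteration per pass of the outer loop, which is the difference of N between 55 and 45.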
You're looking at nested loops where the outer one runs N times and the inner one runs at most (N-1) times. You're in effect adding up the sum 1 + 2 + 3 + ... + (N-1).
The N * (N+1) / 2 is a "classic" formula in mathematics. Young Carl Gauss, later a famous mathematician, was given in-class busywork: Adding up the numbers from 1 to 100. The teacher expected to keep the kids busy for an hour but Carl came up with the answer almost immediately: 5050. He explained: 1 + 100; 2 + 99; 3 + 98; 4 + 97; and so on up to 50 + 51. That's 50 sums of 101 each. You could also see that as (100 / 2) * (100 + 1); that's where the /2 comes from.
As for why it's (N-1) instead of the (N+1) I mentioned: the inner loop starts at j = i+1 rather than j = i, which drops one iteration from each pass, so you are summing 1 through N-1 and the formula becomes (N-1) * N / 2.
Look at how many times the inner (j) loop runs for each value of i. When N = 10, the outer (i) loop runs 10 times, and the j loop should run 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9 times. Now you just add up those numbers to see how many times the inner loop runs. You can sum the numbers from 0 to N-1 with the formula N(N-1)/2. This is a very slight modification of a well-known formula for adding the numbers from 1 to N.
For a visual aid, you can see why 1 + 2 + 3 + ... + n = n * (n+1) / 2
If you count the iterations of the inner loop for each pass of the outer loop with N = 10, you get (in increasing order):
0 1 2 3 4 5 6 7 8 9
To get the total for an arbitrary number of iterations, you can "wrap" the numbers around like this:
0 1 2 3 4
9 8 7 6 5
Now, if we add each of those columns, they all add up to 9 (N-1), and there are 5 (N/2) columns. It's pretty obvious that for any even N, we'd still get N/2 columns that each add up to (N-1). As such, when the total number of iterations is even, the sum is always (N/2)(N-1), which (thanks to the commutative property) we can rewrite as N(N-1)/2.
If we did the same for an odd number of iterations, we'd have one "odd" column that couldn't be paired. In this case, we can ignore the '0' since we know it won't affect the overall sum in any case. For example, let's consider N=9 instead of N=10. For that, we get:
1 2 3 4
8 7 6 5
This gives us (N-1)/2 columns (9-1=8, 8/2=4) that each add up to N, so the sum will be N*(N-1)/2. Even though we've arrived at it slightly differently, this is an exact match for the formula above for when N is even. Again, it seems pretty obvious that this would remain true regardless of the number of columns we used (i.e., total number of iterations).
For any N (odd or even), the sum of the numbers from 0 through N-1 is N*(N-1)/2.
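The wrap-around pairing can be written out as a short Python sketch. The name `paired_columns` is made up for illustration; for odd N it drops the 0 first, as described above, so that every remaining number has a partner.

```python
def paired_columns(N):
    """Pair the numbers 0..N-1 into columns as in the wrap-around picture.

    Returns (columns, total), where total is the sum over all columns.
    """
    values = list(range(N))          # 0, 1, ..., N-1
    if N % 2 == 1:
        values = values[1:]          # odd N: ignore the 0, it adds nothing
    half = len(values) // 2
    top = values[:half]              # first row, left to right
    bottom = reversed(values[half:]) # second row, written right to left
    columns = list(zip(top, bottom))
    total = sum(a + b for a, b in columns)
    return columns, total


print(paired_columns(10))  # ([(0, 9), (1, 8), (2, 7), (3, 6), (4, 5)], 45)
print(paired_columns(9))   # ([(1, 8), (2, 7), (3, 6), (4, 5)], 36)
```

In both cases the total equals N*(N-1)/2: five columns of 9 for N = 10, four columns of 9 for N = 9.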

Resources