Complexity of a loop - data-structures

I've just started to learn about data structures and I'd be glad for some help.
Let's say I have this pseudocode of a selection sort function for the array A of numbers:
for i = n downto 1 by 1, do
    maxPos = i
    for j = i - 1 downto 1 by 1, do
        if A[ j ] > A[maxPos]
            maxPos = j
    swap (A[maxPos], A[ i ])
I want to calculate the number of steps this function takes.
From what I know so far:
Sentence #1 runs n times (if you also count the i-- then the number of steps taken in this sentence is 2n)
Sentence #2 runs n - 1 times.
Sentence no.3 onwards is where I start to get confused:
From what I understand, no.3 runs $$\sum_{i=1}^{n-1} i$$ times.
Sentence no.4 runs the same number of steps as no.3, only minus 1. That means: $$\left(\sum_{i=1}^{n-1} i\right) - 1$$
Sentence no.5 I don't know how to calculate. I mean, for the worst case it should be the same as no.4, and for the best case it should be 0. But is there any way to write it mathematically for any case?
No.6 takes the number of steps no.2 takes.
I'd appreciate your help very much. <3
P.S. I know about the Big O cheat sheets and I know the worst case for this function is O($$n^2$$)
P.S. no.2: if anybody knows why MathJax doesn't work here, I'd be glad to know what alternatives I can use.

You have, essentially:
for i = n downto 1
for j = i-1 downto 1
The outer loop runs n times. The inner loop runs (n-1) times for the first iteration, (n-2) times for the second iteration, etc. So the number of iterations of the inner loop is:
(n-1) + (n-2) + (n-3) + ... + (n-n)
That is the sum of the numbers from 0 to n-1, which equals n(n-1)/2, making the asymptotic complexity O(n^2).
See also Selection sort:Complexity.
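As a quick check of that count, here is a minimal Python sketch of the same selection sort with two counters added. The function name `selection_sort_counts` and the counters are illustrative assumptions, not part of the original pseudocode, and the indices are 0-based rather than 1-based.

```python
def selection_sort_counts(values):
    """Sort a copy of `values`; return (sorted_list, comparisons, max_updates)."""
    a = list(values)
    comparisons = 0   # how often the `if A[j] > A[maxPos]` test runs
    max_updates = 0   # how often `maxPos = j` runs (depends on the input)
    for i in range(len(a) - 1, -1, -1):      # i = n downto 1, shifted to 0-based
        max_pos = i
        for j in range(i - 1, -1, -1):       # j = i-1 downto 1, shifted to 0-based
            comparisons += 1
            if a[j] > a[max_pos]:
                max_pos = j
                max_updates += 1
        a[max_pos], a[i] = a[i], a[max_pos]  # swap(A[maxPos], A[i])
    return a, comparisons, max_updates


if __name__ == "__main__":
    data = [3, 1, 4, 1, 5, 9, 2, 6]
    result, comparisons, max_updates = selection_sort_counts(data)
    n = len(data)
    print(result, comparisons, n * (n - 1) // 2, max_updates)
```

The comparison count depends only on n, while the number of `maxPos = j` assignments varies with the input, which is why that line has no single closed form for every case.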

Related

CORMEN explanation of Insertion Sort Algorithm

I can't understand this particular use of sigma (summation) notation in the explanation of insertion sort in the book Introduction to Algorithms (CLRS):
Let tj denote the number of times the while loop test in line 5 is executed for that value of j.
Can someone explain the use of sigma (summation) in lines 5, 6 and 7?
I am aware of the summation formulas and uses.
I think I finally can clearly understand.
The sigma expresses that, for each j, the while loop test may run up to t times. So, when j is equal to 2, the while loop test runs t times, and when j is equal to 3 it again runs t times. Since we don't know whether this t for j = 3 equals the previous t for j = 2, we add a subscript to indicate that these are different t's.
The sum runs from 2 to n, and this already represents the for loop that is running in the outer layer.
So, in summary, the limits are from 2 to n, and each time we are in the for loop and get to the while loop, its test runs t_j times.
Note: in the worst case the sigma expressions have j in place of t[j]. The book keeps t[j] because the count depends on the input, unlike the expressions in n used for the other lines.
The sigmas express the number of times that the corresponding line will be executed (worst case). For Sigma(j=2 to n) on line 5, that's a total count of 2+3+4+...+n = (n+2)(n-1)/2 = 1/2 n^2 + 1/2 n - 1. Note that for line 5, for j = 2, it's counting line 5 twice, once for i = 1 (the first compare of i > 0) and once for i = 0 (the second compare of i > 0).
Lines 6 and 7 depend on A[i] > key, and in the worst case (A[i] always > key) they loop one time fewer than line 5, which explains the (tj - 1) term => (2-1) + (3-1) + ... + (n-1) = 1+2+3+...+(n-1) = (n)(n-1)/2.
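To see t_j concretely, here is a small Python sketch that follows the shape of the CLRS insertion sort and records how often the while test runs for each j. The name `insertion_sort_tests` and the returned dictionary are assumptions made for illustration. The list is 0-based, so A[i] in the book becomes `a[i - 1]` here.

```python
def insertion_sort_tests(values):
    """Sort a copy of `values`; return (sorted_list, {j: t_j}) with 1-based j."""
    a = list(values)
    n = len(a)
    tests = {}
    for j in range(2, n + 1):          # for j = 2 to n
        key = a[j - 1]
        i = j - 1
        t = 0
        while True:
            t += 1                     # one evaluation of the while test
            if not (i > 0 and a[i - 1] > key):
                break
            a[i] = a[i - 1]            # A[i + 1] = A[i]
            i -= 1
        a[i] = key                     # A[i + 1] = key
        tests[j] = t
    return a, tests


if __name__ == "__main__":
    n = 5
    _, worst = insertion_sort_tests(range(n, 0, -1))   # reverse-sorted input
    _, best = insertion_sort_tests(range(1, n + 1))    # already sorted input
    print(worst, sum(worst.values()))   # t_j = j in the worst case
    print(best, sum(best.values()))     # t_j = 1 in the best case
```

On reverse-sorted input every t_j equals j, so the total matches (n+2)(n-1)/2. On sorted input every t_j is 1, which is why the book keeps the general t_j.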
This can be explained like:
[Outer loop] For each j from 2 to n, line five is executed some number of times (call it t).
[Inner loop] Lines six and seven are then executed one time fewer, i.e. (t - 1) times, for that same j.

How to effectively calculate an algorithm's time complexity? [duplicate]

I'm studying algorithm complexity and I'm still not able to determine the complexity of some algorithms. I can figure out basic O(N) and O(N^2) loops, but I'm having some difficulty with routines like this one:
// What is time complexity of fun()?
int fun(int n)
{
    int count = 0;
    for (int i = n; i > 0; i /= 2)
        for (int j = 0; j < i; j++)
            count += 1;
    return count;
}
OK, I know that some people can calculate this with their eyes closed, but I would love to see a step-by-step how-to if possible.
My first attempt to solve this would be to "simulate" an input and put the values in some sort of table, like below:
for n = 100
Step   i
1      100
2      50
3      25
4      12
5      6
6      3
7      1
OK, at this point I'm assuming that this loop is O(log n), but unfortunately, as I said, no one solves this problem step by step, so in the end I have no clue at all of what was done.
In case of the inner loop I can build some sort of table like below:
for n = 100
Step   i     j
1      100   0..99
2      50    0..49
3      25    0..24
4      12    0..11
5      6     0..5
6      3     0..2
7      1     0..0
I can see that both loops are decreasing and I suppose a formula can be derived based on data above ...
Could someone clarify this problem? (The Answer is O(n))
Another simple way to probably look at it is:
Your outer loop initializes i (which can be considered the step/iterator) at n and divides i by 2 after every iteration. Hence, it executes the i /= 2 statement about log2(n) times. So, a way to think about it is that your outer loop runs about log2(n) times. Whenever you repeatedly divide a number by a base until it reaches 0, you perform that division a logarithmic number of times. Hence, the outer loop is O(log-base-2 n).
Your inner loop iterates j (now the iterator or the step) from 0 to i on every iteration of the outer loop. i takes the maximum value of n, hence the longest run that your inner loop will have will be from 0 to n. Thus, it is O(n).
Now, your program runs like this:
Run 1: i = n, j = 0->n
Run 2: i = n/2, j = 0->n/2
Run 3: i = n/4, j = 0->n/4
.
.
.
Run x: i = n/(2^(x-1)), j = 0->[n/(2^(x-1))]
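Now, multiplying the bounds for the nested loops, O(log-base-2 n) * O(n), only gives the loose upper bound O(n log n), because the inner loop does not run n times on every pass.
Adding up the runs above instead gives n + n/2 + n/4 + ... < 2n, so your entire code is O(n).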
Let's break this analysis up into a few steps.
First, start with the inner for loop. It is straightforward to see that this takes exactly i steps.
Next, think about which different values i will assume over the course of the algorithm. To start, consider the case where n is some power of 2. In this case, i starts at n, then n/2, then n/4, etc., until it reaches 1, and finally 0 and terminates. Because the inner loop takes i steps each time, then the total number of steps of fun(n) in this case is exactly n + n/2 + n/4 + ... + 1 = 2n - 1.
Lastly, convince yourself this generalizes to non-powers of 2. Given an input n, find the smallest power of 2 greater than n and call it m. Clearly, n < m < 2n, so fun(n) takes at most the 2m - 1 steps of fun(m), which is less than 4n - 1. Thus fun(n) is O(n).
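The bound can be checked empirically with a direct Python port of fun(). This is a sketch only: the port uses integer division `//` to mirror C's `i /= 2` on ints, and the range of n tested is arbitrary.

```python
def fun(n):
    """Python port of the C function: counts inner-loop iterations."""
    count = 0
    i = n
    while i > 0:              # for (int i = n; i > 0; i /= 2)
        for _ in range(i):    # for (int j = 0; j < i; j++)
            count += 1
        i //= 2               # integer division, as in C
    return count


if __name__ == "__main__":
    for n in (64, 100, 1000):
        print(n, fun(n), 2 * n - 1)
```

For n = 100 this adds up the rows of the table in the question: 100 + 50 + 25 + 12 + 6 + 3 + 1. The count stays below 2n for every n tried, which is a tighter bound than the 4n - 1 used in the argument above.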

Why is this algorithm O(nlogn)?

I'm reading a book on algorithm analysis and have found an algorithm which I don't know how to get the time complexity of, although the book says that it is O(nlogn).
Here is the algorithm:
sum1=0;
for(k=1; k<=n; k*=2)
    for(j=1; j<=n; j++)
        sum1++;
Perhaps the easiest way to convince yourself of the O(n*lgn) running time is to run the algorithm on a sheet of paper. Consider what happens when n is 64. Then the outer loop variable k would take the following values:
1 2 4 8 16 32 64
log_2(64) is 6, which is one less than the number of terms above (7). In general the outer loop runs floor(log_2(n)) + 1 times, so you can continue this line of reasoning to conclude that the outer loop takes O(lgn) iterations.
The inner loop, which is completely independent of the outer loop, is O(n). Multiplying these two terms together yields O(lgn*n).
In your first loop for(k=1; k<=n; k*=2), variable k reaches the value of n in log n steps since you're doubling the value in each step.
The second loop for(j=1; j<=n; j++) is just a linear cycle, so requires n steps.
Therefore, total time is O(nlogn) since the loops are nested.
To add a bit of mathematical detail...
Let a be the number of times the outer loop for(k=1; k<=n; k*=2) runs. After a iterations k has been doubled a times, so k = 2^a (note the loop increment k*=2), and the loop stops once k exceeds n. So we have roughly n = 2^a. Solve for a by taking the base-2 log of both sides, and you get a = log_2(n), up to rounding.
Since the inner loop runs n times, total is O(nlog_2(n)).
To add to @qwerty's answer, if a is the number of times the outer loop runs:
    k takes the values 1, 2, 4, ..., 2^(a-1), and 2^(a-1) <= n
    Taking log on both sides: log_2(2^(a-1)) <= log_2(n), i.e. a <= log_2(n) + 1
So the outer loop has an upper bound of log_2(n) + 1, i.e. it cannot run more than log_2(n) + 1 times.
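The counts can be confirmed with a small Python port of the loop. The function name `count_sum1` and the extra `outer_runs` counter are illustrative assumptions. `int.bit_length()` is used because, for n >= 1, it equals floor(log_2(n)) + 1.

```python
def count_sum1(n):
    """Return (sum1, outer_runs) for the doubling outer loop and linear inner loop."""
    sum1 = 0
    outer_runs = 0
    k = 1
    while k <= n:                  # for(k=1; k<=n; k*=2)
        outer_runs += 1
        for _ in range(1, n + 1):  # for(j=1; j<=n; j++)
            sum1 += 1
        k *= 2
    return sum1, outer_runs


if __name__ == "__main__":
    for n in (1, 10, 64, 1000):
        sum1, outer_runs = count_sum1(n)
        print(n, outer_runs, n.bit_length(), sum1)
```

For n = 64 the outer loop runs 7 times (k = 1, 2, 4, ..., 64), so sum1 ends at 64 * 7, in line with the n * (floor(log_2(n)) + 1) count.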

Theta Notation and Worst Case Running time nested loops

This is the code I need to analyse:
i = 1
while i < n
do
    j = 0;
    while j <= i
    do
        j = j + 1
    i = 2i
So, the first loop should run log(2,n) and the innermost loop should run log(2,n) * (i + 1), but I'm pretty sure that's wrong.
How do I use a theta notation to prove it?
An intuitive way to think about this is to see how much work your inner loop is doing for a fixed value of the outer loop variable i. It's clearly about as much as i itself (exactly i + 1 increments, since j runs from 0 up to i). Thus, if the value of i is 256, then you will do j = j + 1 about that many times.
Thus, the total work done is the sum of the values that i takes in the outer loop's execution, up to a constant factor. That variable doubles each time, so it catches up with n rapidly. Since i + 1 <= 2i, the work per outer iteration, with i updated by i = 2i (it should be i = 2*i), is bounded by 2, 4, 8, 16, ..., starting with 2 iterations of the inner loop when i = 1. This is a geometric series: a, ar, ar^2 ... with a = 2 and r = 2. The last term, as you figured out, will be n, and there will be log2 n terms in the series. And that is a simple summation of a geometric series.
It doesn't make much sense to have a worst case or a best case for this algorithm because there are no different permutations of the input which is just a number n in this case. Best case or worst case are relevant when a particular input (e.g. a particular sequence of numbers) affects the running time of the algorithm.
The running time then is bounded by the sum of a geometric series (a.(r^num_terms - 1)/(r-1)):
T(n) ⩽ 2 + 4 + ... + 2^(log2 n)
     = 2 . (2^(log2 n) - 1)
     = 2 . (n - 1)
     ⩽ 3n = O(n)
Thus, you can't be doing work that is more than some constant multiple of n. Hence, the running time of this algorithm is O(n).
You can't be doing some work that is less than some (other) constant multiple of n, since you have to go through the increment in inner loop as shown above. Thus, the running time of this algorithm is also ≥ c.n i.e. it is Ω(n).
Together, this means that running time of this algorithm is Θ(n).
You can't use i in your final expression; only n.
You can easily see that the inner loop executes i + 1 times each time it is reached. And it sounds like you've figured out the different values that i can have. So add up those values, and you have the total amount of work.
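Adding up those values can also be done mechanically. The sketch below is a Python port of the pseudocode that counts executions of j = j + 1. The name `count_increments` is an assumption, and the constants 1 and 3 in the check are just one convenient choice of bounds for Θ(n).

```python
def count_increments(n):
    """Count how often `j = j + 1` runs in the nested while loops."""
    increments = 0
    i = 1
    while i < n:
        j = 0
        while j <= i:
            j = j + 1
            increments += 1
        i = 2 * i
    return increments


if __name__ == "__main__":
    for n in (2, 16, 100, 1024):
        c = count_increments(n)
        print(n, c, n <= c <= 3 * n)
```

For n = 16 the outer variable takes the values 1, 2, 4, 8 and the inner loop contributes 2 + 3 + 5 + 9 = 19 increments, which sits between n and 3n as Θ(n) requires.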

How to calculate worst case analysis of this algorithm?

sum = 0;
for(int i = 0; i < N; i++)
    for(int j = i; j >= 0; j--)
        sum++;
From what I understand, the first line is 1 operation, 2nd line is (i+1) operations, 3rd line is (i-1) operations, and 4th line is n operations. Does this mean that the running time would be 1 + (i+1)(i-1) + n? It's just these last steps that confuse me.
To analyze the algorithm you don't want to go line by line asking "how much time does this particular line contribute?" The reason is that each line doesn't execute the same number of times. For example, the innermost line is executed a whole bunch of times, compared to the first line which is run just once.
To analyze an algorithm like this, try identifying some quantity whose value is within a constant factor of the total runtime of the algorithm. In this case, that quantity would probably be "how many times does the line sum++ execute?", since if we know this value, we know the total amount of time that's spent by the two loops in the algorithm. To figure this out, let's trace out what happens with these loops. On the first iteration of the outer loop, i == 0 and so the inner loop will execute exactly once (counting down from 0 to 0). On the second iteration of the outer loop, i == 1 and the inner loop executes exactly twice (first with j == 1, then with j == 0). More generally, when i == k, the inner loop executes k + 1 times. This means that the total number of iterations of the innermost loop is given by
1 + 2 + 3 + ... + N
This quantity can be shown to be equal to
N (N + 1) / 2 = (N^2 + N) / 2 = N^2 / 2 + N / 2
Of these two terms, the N^2 / 2 term is the dominant growth term, and so if we ignore its constant factors we get a runtime of O(N^2).
Don't look at this answer as something you should memorize - think of all of the steps required to get to the answer. We started by finding some quantity to count, and then saw how that quantity was influenced by the execution of the loops. From this, we were able to derive a mathematical expression for that quantity, which we then simplified. Finally, we took the resulting expression and determined the dominant term, which serves as the big-O for the overall function.
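The counted quantity can be checked directly. Here is a minimal Python port of the two loops, with the hypothetical name `count_sum`, compared against N(N + 1)/2.

```python
def count_sum(n):
    """Return the final value of `sum` for the nested loops with N = n."""
    total = 0
    for i in range(n):                # for(int i = 0; i < N; i++)
        for _ in range(i, -1, -1):    # for(int j = i; j >= 0; j--)
            total += 1                # sum++
    return total


if __name__ == "__main__":
    for n in (1, 4, 10, 100):
        print(n, count_sum(n), n * (n + 1) // 2)
```

For N = 4 the inner loop runs 1 + 2 + 3 + 4 = 10 times, which is 4 * 5 / 2.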
Work from inside-out.
sum++
This is a single operation on its own, as it doesn't repeat.
for(int j = i; j >= 0; j--)
This loops i+1 times. There are several operations in there, but you probably don't mean to count the number of asm instructions. So I'll assume for this question this is a multiplier of i+1. Since the loop contents is a single operation, the loop and its block perform i+1 operations.
for(int i = 0; i < N; i++)
This loops N times. So as before, this is a multiplier of N. Since the block performs i+1 operations, this loop performs N(N+1)/2 operations in total. And that's your answer! If you want to consider big-O complexity, then this simplifies to O(N^2).
It's not additive: the inner loop happens once for EACH iteration of the outer loop. So it's O(n^2).
By the way, this is a good example of why we use asymptotic notation for this kind of thing -- depending on the definition of "operation" the exact details of the count could vary pretty widely. (Like, is sum++ a single operation, or is it add sum to 1 giving temp; load temp to sum?) But since we know that all that can be hidden in a constant factor, it's still going to be O(n^2).
No; you don't count a specific number of operations for each line and then add them up. The entire point of constructions like 'for' is to make it possible for a given line of code to run more than once. You're supposed to use thinking and logic skills to figure out how many times the line 'sum++' will run, as a function of N. Hint: it runs once for every time that the third line is encountered.
How many times is the second line encountered?
Each time the second line is encountered, the value of 'i' is set. How many times does the third line run with that value of i? Therefore, how many times will it run overall? (Hint: if I give you a different amount of money on several different occasions, how do you find out the total amount I gave you?)
Each time the third line is encountered, the fourth line happens once.
Which line happens most often? How often does it happen, in terms of N?
So I guess what interests you is sum++ and how many times you execute it.
The final state of sum gives you that answer.
Actually your loop is just:
Sigma(n), where n goes from 1 to N,
which is equal to N*(N+1) / 2. This gives you O(N^2) in big-O notation.
Also, despite the title of your question, there is no worst case in your algorithm.
Or you could say that the worst case is when N goes to infinity.
Using Sigma notation to represent your loops:
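A sketch of that sum, assuming each execution of sum++ counts as one step (the index j is relabeled to run upward from 0 to i, which does not change the count):

$$\sum_{i=0}^{N-1} \sum_{j=0}^{i} 1 = \sum_{i=0}^{N-1} (i + 1) = \frac{N(N+1)}{2}$$

The dominant term is N^2 / 2, so the loops are O(N^2).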
