Why Bubble sort complexity is O(n^2)? - complexity-theory

As I understand, the complexity of an algorithm is a maximum number of operations performed while sorting. So, the complexity of Bubble sort should be a sum of arithmmetic progression (from 1 to n-1), not n^2.
The following implementation counts number of comparisons:
public int[] sort(int[] a) {
int operationsCount = 0;
for (int i = 0; i < a.length; i++) {
for(int j = i + 1; j < a.length; j++) {
operationsCount++;
if (a[i] > a[j]) {
int temp = a[i];
a[i] = a[j];
a[j] = temp;
}
}
}
System.out.println(operationsCount);
return a;
}
The ouput for array with 10 elements is 45, so it's a sum of arithmetic progression from 1 to 9.
So why Bubble sort's complexity is n^2, not S(n-1) ?

This is because big-O notation describes the nature of the algorithm. The major term in the expansion (n-1) * (n-2) / 2 is n^2. And so as n increases all other terms become insignificant.
You are welcome to describe it more precisely, but for all intents and purposes the algorithm exhibits behaviour that is of the order n^2. That means if you graph the time complexity against n, you will see a parabolic growth curve.

Let's do a worst case analysis.
In the worst case, the if (a[i] > a[j]) test will always be true, so the next 3 lines of code will be executed in each loop step. The inner loop goes from j=i+1 to n-1, so it will execute Sum_{j=i+1}^{n-1}{k} elementary operations (where k is a constant number of operations that involve the creation of the temp variable, array indexing, and value copying). If you solve the summation, it gives a number of elementary operations that is equal to k(n-i-1). The external loop will repeat this k(n-i-1) elementary operations from i=0 to i=n-1 (ie. Sum_{i=0}^{n-1}{k(n-i-1)}). So, again, if you solve the summation you see that the final number of elementary operations is proportional to n^2. The algorithm is quadratic in the worst case.
As you are incrementing the variable operationsCount before running any code in the inner loop, we can say that k (the number of elementary operations executed inside the inner loop) in our previous analysis is 1. So, solving Sum_{i=0}^{n-1}{n-i-1} gives n^2/2 - n/2, and substituting n with 10 gives a final result of 45, just the same result that you got by running the code.

Worst case scenario:
indicates the longest running time performed by an algorithm given any input of size n
so we will consider the completely backward list for this worst-case scenario
int[] arr= new int[]{9,6,5,3,2};
Number of iteration or for loops required to completely sort it = n-1 //n - number of elements in the list
1st iteration requires (n-1) swapping + 2nd iteration requires (n-2) swapping + ……….. + (n-1)th iteration requires (n-(n-1)) swapping
i.e. (n-1) + (n-2) + ……….. +1 = n/2(a+l) //sum of AP
=n/2((n-1)+1)=n^2/2
so big O notation = O(n^2)

Related

Calculating program running time with outer loop running log(n) times and inner loop running k times?

I found this example problem on the internet that I just cannot understand how the author came to their conclusion.
sum1 = 0;
for(k=1; k<=n; k*=2) // Do log n times
for (j=1; j<=n; j++) // Do n times
sum1++;`
sum2 = 0;
for (k=1; k<=n; k*=2) // Do log n times
for (j=1; j<=k; j++) // Do k times
sum2++;
I understand that the running time for the first loop is O(n) = nlog(n), but the author claims that for the second loop, the running time is O(n) = n.
Why is that?
The closest I can get to an answer is:
O(n) = k * log(n)
k = 2^i
O(n) = 2^i * log(n) ----> this is where I get stuck
I'm guessing some property of logarithms is used, but I can't figure out which one. Can someone point me in the right direction?
Thanks.
In the second example, the complexity is sum_j 2^j, i.e. the total number of operations in the inner loop.
As 2^j <= n, there are logn terms.
This sum is equal to 2^{jmax+1} - 1, with 2^jmax roughly (<=) equal to n.
Then, effectively, a complexity O(2n) = O(n).
sum2++ is executed 1+2+4+8+...+K times, where K is the largest power of 2 less than or equal to n. That sum is equal to 2K-1.
Since n/2 < K <= n (because K is the largest power of 2 less than or equal to n), the number of iterations is between n-1 and 2n-1. That's Theta(n) if you want to express it in asymptotic notation.

Loop Analysis - Analysis of Algorithms

This question is based off of this resource http://algs4.cs.princeton.edu/14analysis.
Can someone break down why Exercise 6 letter b is linear? The outer loop seems to be increasing i by a factor of 2 each time, so I would assume it was logarithmic...
From the link:
int sum = 0;
for (int n = N; n > 0; n /= 2)
for (int i = 0; i < n; i++)
sum++;
This is a geometric series.
The inner loops runs i iterations per iteration of the outer loop, and the outer loop decreases by half each time.
So, summing it up gives you:
n + n/2 + n/4 + ... + 1
This is geometric series, with r=1/2 and a=n - that converges to a/(1-r)=n/(1/2)=2n, so:
T(n) <= 2n
And since 2n is in O(n) - the algorithm runs in linear time.
This is a perfect example to see that complexity is NOT achieved by multiplying the complexity of each nested loop (that would have got you O(nlogn)), but by actually analyzing how many iterations are needed.
Yes its simple
See the value of n is decreasing by half each time and I runs n times.
So for the first time i goes from 1 to n
next time 0 to n/2
and hence 0 to n/k on kth term.
Now total time inner loop would run = Log(n)
So its a GP the number of times i is running.
with terms
n,n/2,n/4,n/8....0
so we can find the sum of the GP
2^(long(n) +1)-1 / (2-1)
2^(long(n)+1) = n
hence n-1/(1) = >O(n)

Complexity of a nested geometric sequence

Below code is actually bounded by O(n^2), could anyone please explain why?
for (int i = 1; i < n; i++) {
int j = i;
while (j < n) {
do operation with cost O(j)
j = j * 3;
}
}
It's not that tricky.
Your inner loop's complexity forms a geometric progression with total omplexity O(n).
Without filling the details all the way (this seems like a HW problem), note that the formula for a geometric sequence is
a_0 (q^k - 1) / q - 1, (a_0 = first element, q = multiplication factor, k = num elements).
and your q^k here is O(n).
Your outer loop is O(n).
Since it's a nested loop, and the inner term does not depend on the outer index, you can multiply.
The proof is geometric progression :)
For i = n, the inner loop doesn't execute more than once if i > n/3 (because in next iteration j is >= n).
So for 2n/3 iterations of outer loop the inner loop iterates only once and does O(j) operation. So for 2n/3 iterations the complexity is O(n^2).
To calculate complexity of remaining iterations set n = n/3, now apply steps 1, 2 and 3
Methodically, using Sigma notation, you may do the following:

Order of growth for loops

What would be the order of growth of the code below. My guess was, each loop's growth is linear but the if statement is confusing me. How do I include that with the whole thing. I would very much appreciate an explanatory answer so I can understand the process involved.
int count = 0;
for (int i = 0; i < N; i++)
for (int j = i+1; j < N; j++)
for (int k = j+1; k < N; k++)
if(a[i] + a[j] + a[k] == 0)
count++;
There are two things that can be confusing when trying to determine the code's complexity.
The fact that not all loops start from 0. The second loop starts from i + 1 and the third from j + 1. Does this affect the complexity? It does not. Let's consider only the first two loops. For i = 0, the second runs N - 1 times, for i = 1 it runs N - 2 times, ..., for i = N - 1 it runs 0 times. Add all these up:
0 + 1 + ... + N - 1 = N(N - 1) / 2 = O(N^2).
So not starting from 0 does not affect the complexity (remember that big-oh ignores lower-order terms and constants). Therefore, even under this setting, the entire thing is O(N^3).
The if statement. The if statement is clearly irrelevant here, because it's only part of the last loop and contains no break statement or other code that would affect the loops. It only affects the incrementation of a count, not the execution of any of the loops, so we can safely ignore it. Even if the count isn't incremented (an O(1) operation), the if condition is checked (also an O(1) operation), so the same rough number of operations is performed with and without the if.
Therefore, even with the if statement, the algorithm is still O(N^3).
Order of growth of the code would be O(N^3).
In general k nested loops of length N contribute growth of O(N^k).
Here are two was to find that the time complexity is Theta(N^3) without much calculation.
First, you select i<j<k from the range 0 through N-1. The number of ways to choose 3 objects out of N is the binomial coefficient N choose 3 = N*(N-1)*(N-2)/(3*2*1) ~ (N^3)/6 = O(N^3), and more precisely Theta(N^3).
Second, an upper bound is that you choose i, j, and k from N possibilities, so there are at most N*N*N = N^3 choices. This is O(N^3). You can also find a lower bound of the same type since you can choose i from 0 through N/3-1, j from N/3 through 2N/3-1, and k from 2N/3 through N-1. This gives you at least floor(N/3)^3 choices, which is about N^3/27. Since you have an upper bound and lower bound of the same form, the time complexity is Theta(N^3).

Worst Case Time Complexity for an algorithm

What is the Worst Case Time Complexity t(n) :-
I'm reading this book about algorithms and as an example
how to get the T(n) for .... like the selection Sort Algorithm
Like if I'm dealing with the selectionSort(A[0..n-1])
//sorts a given array by selection sort
//input: An array A[0..n - 1] of orderable elements.
//output: Array A[0..n-1] sorted in ascending order
let me write a pseudocode
for i <----0 to n-2 do
min<--i
for j<--i+1 to n-1 do
ifA[j]<A[min] min <--j
swap A[i] and A[min]
--------I will write it in C# too---------------
private int[] a = new int[100];
// number of elements in array
private int x;
// Selection Sort Algorithm
public void sortArray()
{
int i, j;
int min, temp;
for( i = 0; i < x-1; i++ )
{
min = i;
for( j = i+1; j < x; j++ )
{
if( a[j] < a[min] )
{
min = j;
}
}
temp = a[i];
a[i] = a[min];
a[min] = temp;
}
}
==================
Now how to get the t(n) or as its known the worst case time complexity
That would be O(n^2).
The reason is you have a single for loop nested in another for loop. The run time for the inner for loop, O(n), happens for each iteration of the outer for loop, which again is O(n). The reason each of these individually are O(n) is because they take a linear amount of time given the size of the input. The larger the input the longer it takes on a linear scale, n.
To work out the math, which in this case is trivial, just multiple the complexity of the inner loop by the complexity of the outer loop. n * n = n^2. Because remember, for each n in the outer loop, you must again do n for the inner. To clarify: n times for each n.
O(n * n).
O(n^2)
By the way, you shouldn't mix up complexity (denoted by big-O) and the T function. The T function is the number of steps the algorithm has to go through for a given input.
So, the value of T(n) is the actual number of steps, whereas O(something) denotes a complexity. By the conventional abuse of notation, T(n) = O( f(n) ) means that the function T(n) is of at most the same complexity as another function f(n), which will usually be the simplest possible function of its complexity class.
This is useful because it allows us to focus on the big picture: We can now easily compare two algorithms that may have very different-looking T(n) functions by looking at how they perform "in the long run".
#sara jons
The slide set that you've referenced - and the algorithm therein
The complexity is being measured for each primitive/atomic operation in the for loop
for(j=0 ; j<n ; j++)
{
//...
}
The slides rate this loop as 2n+2 for the following reasons:
The initial set of j=0 (+1 op)
The comparison of j < n (n ops)
The increment of j++ (n ops)
The final condition to check if j < n (+1 op)
Secondly, the comparison within the for loop
if(STudID == A[j])
return true;
This is rated as n ops. Thus the result if you add up +1 op, n ops, n ops, +1 op, n ops = 3n+2 complexity. So T(n) = 3n+2
Recognize that T(n) is not the same as O(n).
Another doctoral-comp flashback here.
First, the T function is simply the amount of time (usually in some number of steps, about which more below) an algorithm takes to perform a task. What a "step" is, is somewhat defined by the use; for example, it's conventional to count the number of comparisons in sorting algorithms, but the number of elements searched in search algorithms.
When we talk about the worst-case time of an algorithm, we usually express that with "big-O notation". Thus, for example, you hear that bubble sort takes O(n²) time. When we use big O notation, what we're really saying is that the growth of some function -- in this case T -- is no faster than the growth of some other function times a constant. That is
T(n) = O(n²)
means for any n, no matter how large, there is a constant k for which T(n) ≤ kn². A point of some confustion here is that we're using the "=" sign in an overloaded fashion: it doesn't mean the two are equal in the numerical sense, just that we are saying that T(n) is bounded by kn².
In the example in your extended question, it looks like they're counting the number of comparisons in the for loop and in the test; it would help to be able to see the context and the question they're answering. In any case, though, it shows why we like big-O notation: W(n) here is O(n). (Proof: there exists a constant k, namely 5, for which W(n) ≤ k(3n)+2. It follows by the definition of O(n).)
If you want to learn more about this, consult any good algorithms text, eg, Introduction to Algorithms, by Cormen et al.
write pseudo codes to search, insert and remove student information from the hash table. calculate the best and the worst case time complexities
3n + 2 is the correct answer as far as the loop is concerned. At each step of the loop, 3 atomic operations are done. j++ is actually two operations, not one. and j

Resources