Tricky Big-O complexity - big-o

public void foo(int n, int m) {
int i = m;
while (i > 100) {
i = i / 3;
}
for (int k = i ; k >= 0; k--) {
for (int j = 1; j < n; j *= 2) {
System.out.print(k + "\t" + j);
}
System.out.println();
}
}
I figured the complexity would be O(logn).
That is as a product of the inner loop, the outer loop -- will never be executed more than 100 times, so it can be omitted.
What I'm not sure about is the while clause, should it be incorporated into the Big-O complexity? For very large i values it could make an impact, or arithmetic operations, doesn't matter on what scale, count as basic operations and can be omitted?

The while loop is O(log m) because you keep dividing m by 3 until it is below or equal to 100.
Since 100 is a constant in your case, it can be ignored, yes.
The inner loop is O(log n) as you said, because you multiply j by 2 until it exceeds n.
Therefore the total complexity is O(log n + log m).
or arithmetic operations, doesn't matter on what scale, count as basic operations and can be omitted?
Arithmetic operations can usually be omitted, yes. However, it also depends on the language. This looks like Java and it looks like you're using primitive types. In this case it's ok to consider arithmetic operations O(1), yes. But if you use big integers for example, that's not really ok anymore, as addition and multiplication are no longer O(1).

The complexity is O(log m + log n).
The while loop executes log3(m) times - a constant (log3(100)). The outer for loop executes a constant number of times (around 100), and the inner loop executes log2(n) times.

The while loop divides the value of m by a factor of 3, therefore the number of such operations will be log(base 3) m
For the for loops you could think of the number of operations as 2 summations -
summation (k = 0 to i) [ summation (j = 0 to lg n) (1)]
summation (k = 0 to i) [lg n + 1]
(lg n + 1) ( i + 1) will be total number of operations, of which the log term dominates.
That's why the complexity is O(log (base3) m + lg n)
Here the lg refers to log to base 2

Related

Time Complexity for Sieve of Eratosthenes: why is it not linear? [duplicate]

From Wikipedia:
The complexity of the algorithm is
O(n(logn)(loglogn)) bit operations.
How do you arrive at that?
That the complexity includes the loglogn term tells me that there is a sqrt(n) somewhere.
Suppose I am running the sieve on the first 100 numbers (n = 100), assuming that marking the numbers as composite takes constant time (array implementation), the number of times we use mark_composite() would be something like
n/2 + n/3 + n/5 + n/7 + ... + n/97 = O(n^2)
And to find the next prime number (for example to jump to 7 after crossing out all the numbers that are multiples of 5), the number of operations would be O(n).
So, the complexity would be O(n^3). Do you agree?
Your n/2 + n/3 + n/5 + … n/97 is not O(n), because the number of terms is not constant. [Edit after your edit: O(n2) is too loose an upper bound.] A loose upper-bound is n(1+1/2+1/3+1/4+1/5+1/6+…1/n) (sum of reciprocals of all numbers up to n), which is O(n log n): see Harmonic number. A more proper upper-bound is n(1/2 + 1/3 + 1/5 + 1/7 + …), that is sum of reciprocals of primes up to n, which is O(n log log n). (See here or here.)
The "find the next prime number" bit is only O(n) overall, amortized — you will move ahead to find the next number only n times in total, not per step. So this whole part of the algorithm takes only O(n).
So using these two you get an upper bound of O(n log log n) + O(n) = O(n log log n) arithmetic operations. If you count bit operations, since you're dealing with numbers up to n, they have about log n bits, which is where the factor of log n comes in, giving O(n log n log log n) bit operations.
That the complexity includes the loglogn term tells me that there is a sqrt(n) somewhere.
Keep in mind that when you find a prime number P while sieving, you don't start crossing off numbers at your current position + P; you actually start crossing off numbers at P^2. All multiples of P less than P^2 will have been crossed off by previous prime numbers.
The inner loop does n/i steps, where i is prime => the whole
complexity is sum(n/i) = n * sum(1/i). According to prime harmonic
series, the sum (1/i) where i is prime is log (log n). In
total, O(n*log(log n)).
I think the upper loop can be optimized by replacing n with sqrt(n) so overall time complexity will O(sqrt(n)loglog(n)):
void isPrime(int n){
int prime[n],i,j,count1=0;
for(i=0; i < n; i++){
prime[i] = 1;
}
prime[0] = prime[1] = 0;
for(i=2; i <= n; i++){
if(prime[i] == 1){
printf("%d ",i);
for(j=2; (i*j) <= n; j++)
prime[i*j] = 0;
}
}
}
int n = 100;
int[] arr = new int[n+1];
for(int i=2;i<Math.sqrt(n)+1;i++) {
if(arr[i] == 0) {
int maxJ = (n/i) + 1;
for(int j=2;j<maxJ;j++)
{
arr[i*j]= 1;
}
}
}
for(int i=2;i<=n;i++) {
if(arr[i]==0) {
System.out.println(i);
}
}
For all i>2, Ti = sqrt(i) * (n/i) => Tk = sqrt(k) * (n/k) => Tk = n/sqrt(k)
Loop stops when k=sqrt(n) => n[ 1/sqrt(2) + 1/sqrt(3) + ...] = n * log(log(n)) => O(nloglogn)
see take the above explanation the inner loop is harmonic sum of all prime numbers up to sqrt(n). So, the actual complexity of is O(sqrt(n)*log(log(sqrt(n))))

how i can find the time complexity of the above code

for(i=0; i<n; i++) // time complexity n+1
{
k=1; // time complexity n
while(k<=n) // time complexity n*(n+1)
{
for(j=0; j<k; j++) // time complexity ??
printf("the sum of %d and %d is: %d\n",j,k,j+k); time complexity ??
k++;
}
What is the time complexity of the above code? I stuck in the second (for) and i don't know how to find the time complexity because j is less than k and not less than n.
I always having problems related to time complexity, do you guys got some good article on it?
especially about the step count and loops.
From the question :
because j is less than k and not less than n.
This is just plain wrong, and I guess that's the assumption that got you stuck. We know what values k can take. In your code, it ranges from 1 to n (included). Thus, if j is less than k, it is also less than n.
From the comments :
i know the the only input is n but in the second for depends on k an not in n .
If a variable depends on anything, it's on the input. j depends on k that itself depends on n, which means j depends on n.
However, this is not enough to deduce the complexity. In the end, what you need to know is how many times printf is called.
The outer for loop is executed n times no matter what. We can factor this out.
The number of executions of the inner for loop depends on k, which is modified within the while loop. We know k takes every value from 1 to n exactly once. That means the inner for loop will first be executed once, then twice, then three times and so on, up until n times.
Thus, discarding the outer for loop, printf is called 1+2+3+...+n times. That sum is very well known and easy to calculate : 1+2+3+...+n = n*(n+1)/2 = (n^2 + n)/2.
Finally, the total number of calls to printf is n * (n^2 + n)/2 = n^3/2 + n^2/2 = O(n^3). That's your time complexity.
A final note about this kind of codes. Once you see the same patterns a few times, you quickly start to recognize the kind of complexity involved. Then, when you see that kind of nested loops with dependent variables, you immediately know that the complexity for each loop is linear.
For instance, in the following, f is called n*(n+1)*(n+2)/6 = O(n^3) times.
for (i = 1; i <= n; ++i) {
for (j = 1; j <= i; ++j) {
for (k = 1; k <= j; ++k) {
f();
}
}
}
First, simplify the code to show the main loops. So, we have a structure of:
for(int i = 0; i < n; i++) {
for(int k = 1; k <= n; k++) {
for(int j = 0; j < k; j++) {
}
}
}
The outer-loops run n * n times but there's not much you can do with this information because the complexity of the inner-loop changes based on which iteration of the outer-loop you're on, so it's not as simple as calculating the number of times the outer loops run and multiplying by some other value.
Instead, I would find it easier to start with the inner-loop, and then add the outer-loops from the inner-most to outer-most.
The complexity of the inner-most loop is k.
With the middle loop, it's the sum of k (the complexity above) where k = 1 to n. So 1 + 2 + ... + n = (n^2 + n) / 2.
With the outer loop, it's done n times so another multiplication by n. So n * (n^2 + n) / 2.
After simplifying, we get a total of O(n^3)
The time complexity for the above code is : n x n x n = n^3 + 1+ 1 = n^3 + 2 for the 3 loops plus the two constants. Since n^3 carries the heaviest growing rate the constant values can be ignored, so the Time complexity would be n^3.
Note: Take each loop as (n) and to obtained the total time, multiple the (n) values in each loop.
Hope this will help !

Big O of this code

I am doing the exercise of Skiena's book on algorithms and I am stuck in this question:
I need to calculate the big O of the following algorithm:
function mystery()
r=0
for i=1 to n-1 do
for j=i+1 to n do
for k=1 to j do
r=r+1
Here, the big O of the outermost loop will be O(n-1) and the middle loop will be O(n!). Please tell me if I am wrong here.
I am not able to calculate the big O of the innermost loop.
Can anyone please help me with this?
Here's a more rigorous way to approach solving this problem:
Define the run-time of the algorithm to be f(n) since n is our only input. The outer loop tells us this
f(n) = Sum(g(i,n), 1, n-1)
where Sum(expression, min, max) is the sum of expression from i = min to i = max. Notice that, the expression in this case is an evaluation of g(i, n) with a fixed i (the summation index) and n (the input to f(n)). And we can peel another layer and defineg(i, n):
g(i, n) = Sum(h(j), i+1, n), where i < n
which is the sum of h(j) where j ranges of i+1 to n. Finally we can just define
h(j) = Sum(O(1), 1, j)
since we've assumed that r = r+1 takes time O(1).
Notice at this point that we haven't done any hand-waving, saying stuff like "oh you can just multiply the loops together. The 'innermost operation' is the only one that counts." That statement isn't even true for all algorithms. Here's an example:
for (i = 0; i < n; i++)
{
for (j = 0; j < n; j++)
{
Solve_An_NP_Complete_Problem(n);
for (k = 0; k < n; k++)
{
count++;
}
}
}
The above algorithm isn't O(n^3)... it's not even polynomial.
Anyways, now that I've established the superiority of a rigorous evaluation (:D) we need to work our way upwards so we can figure out what the upper bound is for f(n). First, it's easy to see that h(j) is O(j) (just use the definition of Big-Oh). From that we can now rewrite g(i, n) as:
g(i, n) = Sum(O(j), i+1, n)
=> g(i, n) = O(i+1) + O(i+2) + ... O(n-1) + O(n)
=> g(i, n) = O(n^2 - i^2 - 2i - 1) (because we can sum Big-Oh functions
in this way using lemmas based on
the definition of Big-Oh)
=> g(i, n) = O(n^2) (because g(i, n) is only defined for i < n. Also, note
that before this step we had a Theta bound, which is
stronger!)
And so we can rewrite f(n) as:
f(n) = Sum(O(n^2), 1, n-1)
=> f(n) = (n-1)*O(n^2)
=> f(n) = O(n^3)
You might consider proving the lower bound to show that f(n) = Theta(n^3). The trick there is note simplifying g(i, n) = O(n^2) but keeping the tight bound when computing f(n). It requires some ugly algebra, but I'm pretty sure (i.e. I haven't actually done it) that you will be able to prove f(n) = Omega(n^3) as well (or just Theta(n^3) directly if you're really meticulous).
The analysis takes place as follows:
the outer loop goes from i = [1, n - 1], so the program would have linear complexity O(i) = O(n) if, inside this loop, only constant operations were performed;
however, inside the first loop, for each i, some operation is executed n times -- starting from j = [i + 1, n - 1]. This gives a total complexity of O(i * j) = O(n * n) = O(n^2);
finally, the innermost loop will be executed, for each i, and for each j, k times, with k ranging from k = [j, n - 1]. As j begins with i + 1 = 1 + 1, k is assymptotically equal to n, which gives you O(i * j * k) = O(n * n * n) = O(n^3).
The complexity of the program itself corresponds to the number of iterations of your innermost operations -- or the sums of complexities of innermost operations. If you had:
for i = 1:n
for j = 1:n
--> innermost operation (executed n^2 times)
for k = 1:n
--> innermost operation (executed n^3 times)
endfor
for l = 1:n
--> innermost operation (executed n^3 times)
endfor
endfor
endfor
The total complexity would be given as O(n^2) + O(n^3) + O(n^3), which is equal to max(O(n^2), O(n^3), O(n^3)), or O(n^3).
It cannot be factorial. For example if u have two nested cycles the big O will be n^2, not n^n. So three cycles cannot give more than n^3. Keep digging ;)

O(n log log n) time complexity

I have a short program here:
Given any n:
i = 0;
while (i < n) {
k = 2;
while (k < n) {
sum += a[j] * b[k]
k = k * k;
}
i++;
}
The asymptotic running time of this is O(n log log n). Why is this the case? I get that the entire program will at least run n times. But I'm not sure how to find log log n. The inner loop is depending on k * k, so it's obviously going to be less than n. And it would just be n log n if it was k / 2 each time. But how would you figure out the answer to be log log n?
For mathematical proof, inner loop can be written as:
T(n) = T(sqrt(n)) + 1
w.l.o.g assume 2 ^ 2 ^ (t-1)<= n <= 2 ^ (2 ^ t)=>
we know 2^2^t = 2^2^(t-1) * 2^2^(t-1)
T(2^2^t) = T(2^2^(t-1)) + 1=T(2^2^(t-2)) + 2 =....= T(2^2^0) + t =>
T(2^2^(t-1)) <= T(n) <= T(2^2^t) = T(2^2^0) + log log 2^2^t = O(1) + loglogn
==> O(1) + (loglogn) - 1 <= T(n) <= O(1) + loglog(n) => T(n) = Teta(loglogn).
and then total time is O(n loglogn).
Why inner loop is T(n)=T(sqrt(n)) +1?
first see when inner loop breaks, when k>n, means before that k was at least sqrt(n), or in two level before it was at most sqrt(n), so running time will be T(sqrt(n)) + 2 ≥ T(n) ≥ T(sqrt(n)) + 1.
Time Complexity of a loop is O(log log n) if the loop variables is reduced / increased exponentially by a constant amount. If the loop variable is divided / multiplied by a constant amount then complexity is O(Logn).
Eg: in your case value of k is as follow. Let i in parenthesis denote the number of times the loop has been executed.
2 (0) , 2^2 (1), 2^4 (2), 2^8 (3), 2^16(4), 2^32 (5) , 2^ 64 (6) ...... till n (k) is reached.
The value of k here will be O(log log n) which is the number of times the loop has executed.
For the sake of assumption lets assume that n is 2^64. Now log (2^64) = 64 and log 64 = log (2^6) = 6. Hence your program ran 6 times when n is 2^64.
I think if the codes are like this, it should be n*log n;
i = 0;
while (i < n) {
k = 2;
while (k < n) {
sum += a[j] * b[k]
k *= c;// c is a constant bigger than 1 and less than k;
}
i++;
}
Okay, So let's break this down first -
Given any n:
i = 0;
while (i < n) {
k = 2;
while (k < n) {
sum += a[j] * b[k]
k = k * k;
}
i++;
}
while( i<n ) will run for n+1 times but we'll round it off to n times.
now here comes the fun part, k<n will not run for n times instead it will run for log log n times because here instead of incrementing k by 1,in each loop we are incrementing it by squaring it. now this means it'll take only log log n time for the loop. you'll understand this when you learn design and analysis of algorithm
Now we combine all the time complexity and we get n.log log n time here I hope you get it now.

Time complexity of Sieve of Eratosthenes algorithm

From Wikipedia:
The complexity of the algorithm is
O(n(logn)(loglogn)) bit operations.
How do you arrive at that?
That the complexity includes the loglogn term tells me that there is a sqrt(n) somewhere.
Suppose I am running the sieve on the first 100 numbers (n = 100), assuming that marking the numbers as composite takes constant time (array implementation), the number of times we use mark_composite() would be something like
n/2 + n/3 + n/5 + n/7 + ... + n/97 = O(n^2)
And to find the next prime number (for example to jump to 7 after crossing out all the numbers that are multiples of 5), the number of operations would be O(n).
So, the complexity would be O(n^3). Do you agree?
Your n/2 + n/3 + n/5 + … n/97 is not O(n), because the number of terms is not constant. [Edit after your edit: O(n2) is too loose an upper bound.] A loose upper-bound is n(1+1/2+1/3+1/4+1/5+1/6+…1/n) (sum of reciprocals of all numbers up to n), which is O(n log n): see Harmonic number. A more proper upper-bound is n(1/2 + 1/3 + 1/5 + 1/7 + …), that is sum of reciprocals of primes up to n, which is O(n log log n). (See here or here.)
The "find the next prime number" bit is only O(n) overall, amortized — you will move ahead to find the next number only n times in total, not per step. So this whole part of the algorithm takes only O(n).
So using these two you get an upper bound of O(n log log n) + O(n) = O(n log log n) arithmetic operations. If you count bit operations, since you're dealing with numbers up to n, they have about log n bits, which is where the factor of log n comes in, giving O(n log n log log n) bit operations.
That the complexity includes the loglogn term tells me that there is a sqrt(n) somewhere.
Keep in mind that when you find a prime number P while sieving, you don't start crossing off numbers at your current position + P; you actually start crossing off numbers at P^2. All multiples of P less than P^2 will have been crossed off by previous prime numbers.
The inner loop does n/i steps, where i is prime => the whole
complexity is sum(n/i) = n * sum(1/i). According to prime harmonic
series, the sum (1/i) where i is prime is log (log n). In
total, O(n*log(log n)).
I think the upper loop can be optimized by replacing n with sqrt(n) so overall time complexity will O(sqrt(n)loglog(n)):
void isPrime(int n){
int prime[n],i,j,count1=0;
for(i=0; i < n; i++){
prime[i] = 1;
}
prime[0] = prime[1] = 0;
for(i=2; i <= n; i++){
if(prime[i] == 1){
printf("%d ",i);
for(j=2; (i*j) <= n; j++)
prime[i*j] = 0;
}
}
}
int n = 100;
int[] arr = new int[n+1];
for(int i=2;i<Math.sqrt(n)+1;i++) {
if(arr[i] == 0) {
int maxJ = (n/i) + 1;
for(int j=2;j<maxJ;j++)
{
arr[i*j]= 1;
}
}
}
for(int i=2;i<=n;i++) {
if(arr[i]==0) {
System.out.println(i);
}
}
For all i>2, Ti = sqrt(i) * (n/i) => Tk = sqrt(k) * (n/k) => Tk = n/sqrt(k)
Loop stops when k=sqrt(n) => n[ 1/sqrt(2) + 1/sqrt(3) + ...] = n * log(log(n)) => O(nloglogn)
see take the above explanation the inner loop is harmonic sum of all prime numbers up to sqrt(n). So, the actual complexity of is O(sqrt(n)*log(log(sqrt(n))))

Resources