Running time for printing all primes under N - algorithm

#include <iostream>
using namespace std;

const int N = 100; // example value; the book's snippet assumes N is defined elsewhere

int main() {
    int i, a[N];
    // initialize the array
    for (i = 2; i < N; i++) a[i] = 1;
    // cross out multiples of each prime
    for (i = 2; i < N; i++)
        if (a[i])
            for (int j = i; j*i < N; j++) a[i*j] = 0;
    // print the primes less than N
    for (i = 2; i < N; i++)
        if (a[i]) cout << " " << i;
    cout << endl;
}
The algorithm book I am reading says the running time of the above program is proportional to N + N/2 + N/3 + N/5 + N/7 + N/11 + ...
Please help me understand how the author arrived at this expression from the program.
Thanks!
Venkata

This is the "Sieve of Eratosthenes" method for finding primes. For each prime, the if(a[i]) test succeeds and the inner loop gets executed. Consider how this inner loop terminates at each step (remember, the condition is j*i < N, or equivalently, j < N/i):
i = 2 -> j = 2, 3, 4, ..., N/2
i = 3 -> j = 3, 4, 5, ..., N/3
i = 4 -> not prime
i = 5 -> j = 5, 6, 7, ..., N/5
...
Summing the total number of operations (including initialising the array/extracting the primes) gives the runtime mentioned in the book.
See this question for more, including a discussion of how, in terms of bit operations, this turns into an expansion of O(n(log n)(log log n)) as per the Wikipedia article.

This algorithm is called the Sieve of Eratosthenes. The animated figure on the Wikipedia page illustrates the crossing-out process step by step.

The inner loop (inside if(a[i])) is executed for prime i only, i.e. for i equal to 2, 3, 5, 7, 11, ... And for a single i, this loop has approximately N/i iterations. Thus, we have N/2 + N/3 + N/5 + N/7 + N/11 + ... iterations overall.


How to work out the time complexity in terms of the number of operations

So I was wondering how I would work out the time complexity (T(n)) of a piece of code, for example the one below, in terms of the number of operations.
for( int i = n; i > 0; i /= 2 ) {
    for( int j = 1; j < n; j *= 2 ) {
        for( int k = 0; k < n; k += 2 ) {
            ... // constant number of operations
        }
    }
}
I'm sure it's simple, but this concept wasn't taught very well by my lecturer and I really want to know how to work out the time complexity!
Thank you in advance!
To approach this, one method is to break down the complexity of your three loops individually.
A key observation we can make is that:
(P): The number of steps in each of the loops does not depend on the value of the "index" of its parent loop.
Let's call
f(n) the number of operations aggregated in the outer loop (1)
g(n) in the intermediate inner loop (2)
h(n) in the most inner loop (3).
for( int i = n; i > 0; i /= 2 ) {         // (1): f(n)
    for( int j = 1; j < n; j *= 2 ) {     // (2): g(n)
        for( int k = 0; k < n; k += 2 ) { // (3): h(n)
            // constant number of operations => (P)
        }
    }
}
Loop (1)
Number of steps
i gets the values n, n/2, n/4, ... (integer division) until n/2^k = 0, which happens once 2^k > n; at that point you exit the loop.
Another way to say it: you have step 1 (i = n), step 2 (i = n/2), step 3 (i = n/4), ..., step k (i = n/2^(k-1)), then you exit the loop. These are k steps.
Now what is the value of k? It is the smallest integer such that 2^k > n, i.e. k = INT(log2(n)) + 1, or loosely speaking k = log2(n).
Cost of each step
Now how many operations do you have for each individual step?
At step i, it is g(i) = g(n) according to the notations we chose and the property (P).
Loop (2)
Number of steps
You have step (1) (j = 1), step (2) (j = 2), step (3) (j = 4), etc.; step (p) has j = 2^(p-1), and you exit once 2^p >= n, which makes p loosely speaking log2(n).
Cost of each step
The cost of step j is h(j) = h(n) according to the notations we chose and the property (P).
Loop (3)
Number of steps
Again, let's count the steps: step (1): k = 0, step (2): k = 2, step (3): k = 4, ..., until k reaches n - 1 or n - 2 (depending on parity). This amounts to n/2 steps.
Cost of each step
Because of (P), it is constant. Let's call this constant K.
All loops altogether
The number of aggregated operations is
T(n) = f(n) = sum(i = 0, i < log2(n), g(i))
= sum(i = 0, i < log2(n), g(n))
= log2(n).g(n)
= log2(n).sum(j = 0, j < log2(n), h(j))
= log2(n).log2(n).h(n)
= log2(n).log2(n).(n/2).K
So T(n) = (K/2).(log2(n))^2.n
Write a method, add a counter, return the result:
int nIterator (int n) {
    int counter = 0;
    for( int i = n; i > 0; i /= 2 ) {
        for( int j = 1; j < n; j *= 2 ) {
            for( int k = 0; k < n; k += 2 ) {
                ++counter;
            }
        }
    }
    return counter;
}
Run it for rapidly increasing N and record the results in a readable manner:
int old = 0;
for (int i = 0, j = 1; i < 18; ++i, j *= 2) {
    int res = nIterator (j);
    double quote = (old == 0) ? 0.0 : (res * 1.0) / old;
    System.out.printf ("%6d %10d %3f\n", j, res, quote);
    old = res;
}
Result:
1 0 0,000000
2 2 0,000000
4 12 6,000000
8 48 4,000000
16 160 3,333333
32 480 3,000000
64 1344 2,800000
128 3584 2,666667
256 9216 2,571429
512 23040 2,500000
1024 56320 2,444444
2048 135168 2,400000
4096 319488 2,363636
8192 745472 2,333333
16384 1720320 2,307692
32768 3932160 2,285714
65536 8912896 2,266667
131072 20054016 2,250000
So as n increases by a factor of 2, the ratio of successive counter values starts above 2^2 but then drops steadily towards something not much higher than 2. This should help you find the way.

Dynamic Programming (max sum)

Given a sequence of n real numbers A(1) ... A(n), determine a contiguous subsequence A(i) ... A(j) for which the sum of elements in the subsequence is maximized.
The solution is:
M(j) = max sum over all windows ending in j
M(j) = max{M(j-1) +A[j], A[j]}
Could someone please explain how this works for the following sub sequence:
1, 5, -10, 5. Because between the first 5 and -10, the recurrence selects between a sum of -4 (M(j-1) + A[j]) or -10 (A[j]). But the best sum is 6.
So shouldn't the recurrence be:
M(j) = max{M(j-1) +A[j], A[j], M(j-1)}
As you said, M[j] = maximum sum over all windows ending at j.
So after computing M[j] for all j between 0 and n - 1, you output the maximum of them.
Here is a sample code in C++.
int findMax(int a[], int n) {
    std::vector<int> M(n);   // avoids the leak from new[] with no delete[]
    M[0] = a[0];
    for (int i = 1; i < n; i++)
        M[i] = std::max(M[i-1] + a[i], a[i]);
    int ans = M[0];
    for (int i = 1; i < n; i++)
        ans = std::max(ans, M[i]);
    return ans;
}
For your sequence {1, 5, -10, 5}, M = {1, 6, -4, 5} and the answer is the maximum value in M, which is 6.

Can someone please explain the big-O complexity of these two Java code snippets? I'm so confused.

for ( int i = n; i > 0; i /= 2 ) {
    for ( int j = 1; j < n; j *= 2 ) {
        for ( int k = 0; k < n; k += 2 ) {
        } // not nested
    }
}
Answer: O(n (log n)^2) (the 2 is an exponent, i.e. log n squared, by the way).
The two outer loops are each log n, because one keeps halving i and the other keeps doubling j, and the inner one is n because k just steps up by 2, right?
For this second piece of code, the correct answer is O(n^2). I understand the outer loop is n, and the middle loop is log n, and the inner loop should be n too, so why is the answer not n * n * log n?
for( int i = n; i > 0; i-- ) {
    for( int j = 1; j < n; j *= 2 ) {
        for( int k = 0; k < j; k++ ) {
            // constant number C of operations
        }
    }
}
Finally, how do I know when to add or multiply loops? if two loops are nested I just multiply them right? and when do I take the greatest N value over the other loops?
Here it is #2 formatted for readability:
for( int i = n; i > 0; i-- )
    for( int j = 1; j < n; j *= 2 )
        for( int k = 0; k < j; k++ )
            action
Forget the i-loop; we know it multiplies the inner bits by N.
The number of times action gets done by the nested j-, k-loops is then
1 + 2 + 4 + 8 + ... N. (If N is not a power of 2, replace it with the next lower power of 2.)
Put this in binary and sum it. For my example, let's let N be 16, but you can easily generalize.
00001
00010
00100
01000
10000
which sums to
11111
which is 2*N-1, or O(N).
Multiplying that by the i-loop range of N gives us O(N^2).
Interesting problem!
The outer loop executes n times and the middle loop log2(n) times, but you can't just multiply in the inner loop's bound, because it depends on j: the inner loop runs j times, so the middle and inner loops together perform 1 + 2 + 4 + ... , which is less than 2n steps. Nesting that inside the outer loop gives roughly 2n^2 critical-step executions, and the limit as n approaches infinity puts this in the O(n^2) classification of functions. When loops are nested and their counts are independent, yes, you multiply them; when the inner count depends on the outer index (as here), you sum the inner counts over the outer index. And you don't really just take the greatest N value, you take the asymptotic limit, when asked which family the function belongs to.

Identify and state the running time using Big-O

For each of the following algorithms, identify and state the running time using Big-O.
// i
for (int i = 0; Math.sqrt(i) < n; i++)
    cout << i << endl;

// ii
for (int i = 0; i < n; i++) {
    cout << i << endl;
    int k = n;
    while (k > 0)
    {
        k /= 2;
        cout << k << endl;
    } // while
}

// iii
int k = 1;
for (int i = 0; i < n; i++)
    k = k * 2;
for (int j = 0; j < k; j++)
    cout << j << endl;
I've calculated the loop counts for the first question using n = 1 and n = 2. The loop in (i) will run n^2 - 1 times. Please help and guide me to identify the Big-O notation.
(i) for (int i = 0; Math.sqrt(i) < n; i++)
        cout << i << endl;
The loop runs while squareRoot(i) < N, i.e. while i < N^2. Thus the running time will be O(N^2), i.e. quadratic.
(ii) for (int i = 0; i < n; i++){
        cout << i << endl;
        int k = n;
        while (k > 0)
        {
            k /= 2;
            cout << k << endl;
        } // while
    }
The outer loop will run for N iterations. The inner loop will run for log N iterations (because k takes the values N, N/2, N/2^2, N/2^3, ..., i.e. about log N values). Thus the running time will be O(N log N), i.e. linearithmic.
(iii)
int k = 1;
for (int i = 0; i < n; i++)
    k = k * 2;
for (int j = 0; j < k; j++)
    cout << j << endl;
The value of k after the execution of the first loop will be 2^n, as k is multiplied by 2 n times. The second loop runs k times; thus it will run for 2^n iterations. Running time is O(2^N), i.e. exponential.
For the first question, you will have to loop until Math.sqrt(i) >= n, that means that you will stop when i >= n*n, thus the first program runs in O(n^2).
For the second question, the outer loop will execute n times, and the inner loop keeps repeatedly halving k (which is initially equal to n). So the inner loop executes log n times, thus the total time complexity is O(n log n).
For the third question, the first loop executes n times, and on each iteration you double the value of k which is initially 1. After the loop terminates, you will have k = 2^n, and the second loop executes k times, so the total complexity will be O(2^n)
A couple of hints will let you solve most running-time complexity problems in CS tests/homework.
If something decrease by a factor of 2 on each iteration, that's a log(N). In your second case the inner loop index is halved each time.
Geometric series:
a r^0 + a r^1 + ... + a r^(n-1) = a (r^n - 1) / (r - 1).
Write out third problem:
2 + 4 + 8 + 16 ... = 2^1 + 2^2 + 2^3 + 2^4 + ...
and use the closed form formula.
Generally it helps to look for log2 patterns and to write out a few terms to see if there is a repeatable pattern.
Other common questions require you to know factorials and their approximation (Stirling's approximation).

How to generate permutations where a[i] != i?

Suppose I have an array of integers int a[] = {0, 1, ..., N-1}, where N is the size of a. Now I need to generate all permutations of a such that a[i] != i for all 0 <= i < N. How would you do that?
Here's some C++ implementing an algorithm based on a bijective proof of the recurrence
!n = (n-1) * (!(n-1) + !(n-2)),
where !n is the number of derangements of n items.
#include <algorithm>
#include <ctime>
#include <iostream>
#include <vector>

static const int N = 12;
static int count;

template<class RAI>
void derange(RAI p, RAI a, RAI b, int n) {
    if (n < 2) {
        if (n == 0) {
            for (int i = 0; i < N; ++i) p[b[i]] = a[i];
            if (false) {  // flip to true to print each derangement instead of counting
                for (int i = 0; i < N; ++i) std::cout << ' ' << p[i];
                std::cout << '\n';
            } else {
                ++count;
            }
        }
        return;
    }
    for (int i = 0; i < n - 1; ++i) {
        // !(n-1) branch of the recurrence
        std::swap(a[i], a[n - 1]);
        derange(p, a, b, n - 1);
        std::swap(a[i], a[n - 1]);
        // !(n-2) branch of the recurrence
        int j = b[i];
        b[i] = b[n - 2];
        b[n - 2] = b[n - 1];
        b[n - 1] = j;
        std::swap(a[i], a[n - 2]);
        derange(p, a, b, n - 2);
        std::swap(a[i], a[n - 2]);
        j = b[n - 1];
        b[n - 1] = b[n - 2];
        b[n - 2] = b[i];
        b[i] = j;
    }
}

int main() {
    std::vector<int> p(N);
    clock_t begin = clock();
    std::vector<int> a(N);
    std::vector<int> b(N);
    for (int i = 0; i < N; ++i) a[i] = b[i] = i;
    derange(p.begin(), a.begin(), b.begin(), N);
    std::cout << count << " permutations in " << clock() - begin << " clocks for derange()\n";
    count = 0;
    begin = clock();
    for (int i = 0; i < N; ++i) p[i] = i;
    while (std::next_permutation(p.begin(), p.end())) {
        for (int i = 0; i < N; ++i) {
            if (p[i] == i) goto bad;
        }
        ++count;
    bad:
        ;
    }
    std::cout << count << " permutations in " << clock() - begin << " clocks for next_permutation()\n";
}
On my machine, I get
176214841 permutations in 13741305 clocks for derange()
176214841 permutations in 14106430 clocks for next_permutation()
which IMHO is a wash. Probably there are improvements to be made on both sides (e.g., reimplement next_permutation with the derangement test that scans only the elements that changed); that's left as an exercise to the reader.
If you have access to C++ STL, use next_permutation, and do an additional check of a[i] != i in a do-while loop.
If you want to avoid the filter approach that others have suggested (generate the permutations in lexicographic order and skip those with fixed points), then you should generate them based on cycle notation rather than one-line notation (discussion of notation).
The cycle-type of a permutation of n is a partition of n, that is a weakly decreasing sequence of positive integers that sums to n. The condition that a permutation has no fixed points is equivalent to its cycle-type having no 1s. For example, if n=5, then the possible cycle-types are
5
4,1
3,2
3,1,1
2,2,1
2,1,1,1
1,1,1,1,1
Of those, only 5 and 3,2 are valid for this problem since all others contain a 1. Therefore the strategy is to generate partitions with smallest part at least 2, then for each such partition, generate all permutations with that cycle-type.
The permutations you are looking for are called derangements. As others have observed, uniformly distributed derangements can be generated by generating uniformly distributed permutations and then rejecting those that have fixed points (where a[i] == i). The rejection method runs in time e*n + o(n), where e = 2.71828... is the base of the natural logarithm. An alternative algorithm similar to #Per's runs in time 2*n + O(log^2 n). However, the fastest algorithm I've been able to find, an early rejection algorithm, runs in time (e-1)*(n-1). Instead of waiting for the permutation to be generated and then rejecting it (or not), the permutation is tested for fixed points while it is being constructed, allowing rejection at the earliest possible moment. Here's my implementation of the early rejection method for derangements in Java.
// assumes a shared generator, e.g.: private static final Random rand = new Random();
public static int[] randomDerangement(int n)
        throws IllegalArgumentException {
    if (n < 2)
        throw new IllegalArgumentException("argument must be >= 2 but was " + n);
    int[] result = new int[n];
    boolean found = false;
    while (!found) {
        for (int i = 0; i < n; i++) result[i] = i;
        boolean fixed = false;
        for (int i = n - 1; i >= 0; i--) {
            int j = rand.nextInt(i + 1);
            if (i == result[j]) {  // would create a fixed point: reject early
                fixed = true;
                break;
            } else {
                int temp = result[i];
                result[i] = result[j];
                result[j] = temp;
            }
        }
        if (!fixed) found = true;
    }
    return result;
}
For an alternative approach, see my post at Shuffle list, ensuring that no item remains in same position.
Just a hunch: I think lexicographic permutation might be possible to modify to solve this.
Re-arrange the array 1,2,3,4,5,6,... by swapping pairs of odd and even elements into 2,1,4,3,6,5,... to construct the permutation with lowest lexicographic order. Then use the standard algorithm, with the additional constraint that you cannot swap element i into position i.
If the array has an odd number of elements, you will have to make another swap at the end to ensure that element N-1 is not in position N-1.
Here's a small recursive approach in Python:
def perm(array, permutation=[], i=1):
    if len(array) > 0:
        for element in array:
            if element != i:
                newarray = list(array)
                newarray.remove(element)
                newpermutation = list(permutation)
                newpermutation.append(element)
                perm(newarray, newpermutation, i + 1)
    else:
        print(permutation)
Running perm(range(1,5)) will give the following output:
[2, 1, 4, 3]
[2, 3, 4, 1]
[2, 4, 1, 3]
[3, 1, 4, 2]
[3, 4, 1, 2]
[3, 4, 2, 1]
[4, 1, 2, 3]
[4, 3, 1, 2]
[4, 3, 2, 1]
