Inverting a Function in the Sieve of Eratosthenes

I think this is technically wheel factorization. I'm trying to re-compress my program's representation of the Sieve of Eratosthenes, which only contains indexes of numbers which are possibly prime.
Some background:
The most basic wheel is [2]: keep track of 2 as the first prime, and the sieve only contains odd indexes. (50%)
The next wheel is [2 3]: keep track of 2 and 3 as the first primes, and the sieve only contains the residues coprime to 2*3=6 (i.e., 1 and 5). Indexes are of the form 6k+1 and 6k+5. (33%)
The next wheel is [2 3 5]: keep track of 2, 3 and 5 as the first primes, and the sieve only needs 8 bits to represent intervals of size 30. (27%)
When clearing the bits for multiples of a number, I find those multiples using this loop:
def multiplesIndices (self, i):
    for gap in self.gaps[1:]:
        ret = i * (self.product * 0 + gap)
        if ret > len (self): break
        yield ret
    for k in xrange (1, len (self) / i / self.product + 1):
        for gap in self.gaps:
            ret = i * (self.product * k + gap)
            if ret > len (self): break
            yield ret
The problem is the time needed to set up a wheel, combined with the diminishing returns on the compression ratio. Well, that, and right now changing to a different wheel size involves a lot of recomputation. Also, by varying the wheel size, I think I can affect the asymptotic complexity.
So my proposed solution is to use small wheels to initialize larger wheels:
[2 3] (6/2) to get the gaps in the [2 3 5] (30/8) wheel
[2 3 5] to get the gaps in the [2 3 5 7] (210/48) wheel
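As a sketch of the bootstrapping step I have in mind (next_wheel is a new helper I'm introducing here, not part of my code above): given the current wheel's product and gap list, the next wheel's gaps are the values k*product + g that the new prime p does not divide:
def next_wheel(product, gaps, p):
    # Candidates are one period of the old wheel repeated p times;
    # drop the ones divisible by the new prime p.
    new_gaps = [k * product + g
                for k in range(p)
                for g in gaps
                if (k * product + g) % p != 0]
    return product * p, sorted(new_gaps)
# next_wheel(6, [1, 5], 5) == (30, [1, 7, 11, 13, 17, 19, 23, 29])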
Where I need help is mapping the already-computed small sieve to the to-be-computed bigger sieve, so I can avoid re-sieving everything from 1. Get the first 30 primes, use them to find the next 210-30 primes, use them to find the next 480-210 primes.
More specifically, I need help inverting this function (or correctly implementing invIndexOf()):
def indexOf (self, n):
    if not self.isValidIndex (n): raise Exception ()
    ret = n / self.product * len (self.gaps) + self.gaps.index (n % self.product)
    assert n in self.invIndexOf (ret)
    return ret
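For reference, indexOf looks directly invertible with divmod; a minimal sketch of invIndexOf (returning a one-element list so the assert's membership test holds):
def invIndexOf(self, ret):
    # Undo indexOf: ret = k * len(gaps) + j, where n = k * product + gaps[j].
    k, j = divmod(ret, len(self.gaps))
    return [k * self.product + self.gaps[j]]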
Also, it's been a few years since I've worked out the asymptotic complexity of anything. I'm pretty sure this is an improvement, though not a drastic one.

Related

Easily implementable solution for this brain teaser?

So I have a brain teaser I read at one of the algorithm and puzzle meetups we have at our uni that goes like this:
There's a school that awards students who, during a given period, are
never late more than once and who are never absent for three or more
consecutive days. How many possible permutations with repetitions of
presence (or lack thereof) can we build for a given timeframe that
grant the student an award? Assume that each day is just a state
(On-time, Late or Absent) for the whole day; don't worry about specific
classes. Example: for three-day timeframes, we can create 19 such
permutations with repetitions that grant an award.
I already posted it on math.SE yesterday because I was interested in whether there was some ready-made formula we could derive to solve it, but it turns out there isn't, and all the transformations really are rather complex.
Thus, I'm asking here: how would you approach such a problem with an algorithm? I tried narrowing down the space of possibilities, but after a while enumerating all the possible permutations with repetitions became far too much, and the algorithm grew really complex, while I believe there should be some easy-to-implement way to solve it, especially since most of the puzzles we exchange at the meetup are rather like that.
Here is a simplified version of Python 3 code implementing the recursion in the answer by @ProgrammerPerson:
from functools import lru_cache

def count_variants(max_late, base_absent, period_length):
    """
    max_late - maximum allowed number of days the student can be late;
    base_absent - the number of consecutive days the student can be absent;
    period_length - days in a period.
    """
    @lru_cache(max_late * base_absent * period_length)
    def count(late, absent, days):
        if late < 0: return 0
        if absent < 0: return 0
        if days == 0: return 1
        return (count(late, base_absent, days-1) +   # Student is on time. Absent reset.
                count(late-1, base_absent, days-1) + # Student is late. Absent reset.
                count(late, absent-1, days-1))       # Student is absent.
    return count(max_late, base_absent, period_length)
Run example:
In [2]: count_variants(1, 2, 3)
Out[2]: 19
This screams recursion (and/or dynamic programming)!
Suppose we try and solve a slightly general problem:
We give an award if a student is late no more than L times, and isn't
absent for A or more consecutive days.
Now we want to compute the number of possibilities for an n days time frame.
Call this method P(L, A, n)
Now try to build up a recursion based on three cases for the first day of the period.
1) If the student is on-time for the first day, then the number is simply
P(L, A, n-1)
2) If the student is late the first day, then the number is
P(L-1, A, n-1)
3) If the student is absent the first day, then the number is
P(L, A-1, n-1)
This gives us the recursion:
P(L, A, n) = P(L, A, n-1) + P(L-1, A, n-1) + P(L, A-1, n-1)
You can either memoize the recursion, or just have tables which you lookup.
Be careful about the base cases, which are P(0, *, *), P(*, 0, *), and P(*, *, 0); these can be computed with easy mathematical formulae.
Here is quick Python code, with memoization + recursion, to demonstrate:
import math

def binom(n, r):
    return math.factorial(n)/(math.factorial(r)*math.factorial(n-r))

# The memoization table.
table = {}

def P(L, A, n):
    if L == 0:
        # Only on-time or absent.
        if A > n:
            # More absents allowed than days in the period.
            return 2**n
        # 2^n total possibilities,
        # of which n-A+1 are non-rewarding.
        return 2**n - (n - A + 1)
    if A == 0:
        # Only late or on-time.
        # Need fewer than L+1 lates:
        # n choose 0 + n choose 1 + ... + n choose L
        total = 0
        for l in xrange(0, min(L, n)):
            total += binom(n, l)
        return total
    if n == 0:
        return 1
    if (L, A, n) in table:
        return table[(L, A, n)]
    result = P(L, A, n-1) + P(L-1, A, n-1) + P(L, A-1, n-1)
    table[(L, A, n)] = result
    return result

print P(1, 3, 3)
Output is 19.
Let S(n) be the number of strings of length n without 3 consecutive 1s.
Any such string (with length at least 3) ends in "0", "01" or "011" (and after removing the suffix, any string without three consecutive 1s can appear).
Then for n > 2, S(n) = S(n-1) + S(n-2) + S(n-3), and S(0)=1, S(1)=2, S(2)=4.
If you have a late day on day i (counting from 0), then you have S(i) ways of arranging absent days before, and S(n-i-1) ways of arranging absent days after.
Thus, the solution to the original problem is S(n) + sum(S(i)*S(n-i-1) | i = 0...n-1)
We can compute solutions iteratively like this:
def ways(n):
    S = [1, 2, 4] + [0] * (n - 2)
    for i in xrange(3, n + 1):
        S[i] = S[i-1] + S[i-2] + S[i-3]
    return S[n] + sum(S[i] * S[n-i-1] for i in xrange(n))

for i in xrange(1, 20):
    print i, ways(i)
Output:
1 3
2 8
3 19
4 43
5 94
6 200
7 418
8 861
9 1753
10 3536
11 7077
12 14071
13 27820
14 54736
15 107236
16 209305
17 407167
18 789720
19 1527607

Better algorithm to find the maximum number whose square divides K

Given a number K which is a product of two different numbers (A, B), find the maximum number (<= A and <= B) whose square divides K.
E.g.: K = 54 (6*9). Both numbers are given, i.e., 6 and 9.
My approach is fairly trivial:
Take the smaller of the two (6 in this case); call it A.
Square it and divide K; if the division is exact, that's the number.
Else set A = A-1, and repeat until A = 1.
For the given example, 3*3 = 9 divides K, and hence 3 is the answer.
Looking for a better algorithm than the trivial solution.
Note: the test cases are in the 1000's, so the best possible approach is needed.
I am sure someone else will come up with a nice answer involving modulus arithmetic. Here is a naive approach...
Each of the factors can themselves be factored (though it might be an expensive operation).
Given the factors, you can then look for groups of repeated factors.
For instance, using your example:
Prime factors of 9: 3, 3
Prime factors of 6: 2, 3
All prime factors: 2, 3, 3, 3
There are two 3s, so you have your answer (the square of 3 divides 54).
A second example: 36 * 9 = 324
Prime factors of 36: 2, 2, 3, 3
Prime factors of 9: 3, 3
All prime factors: 2, 2, 3, 3, 3, 3
So you have two 2s and four 3s, which means 2x3x3 is repeated. 2x3x3 = 18, so the square of 18 divides 324.
Edit: python prototype
import math

def factors(num, counts):
    """Recursively find the prime factors of num.
    This is not the most efficient algorithm, and I
    have not tested it a lot; you should probably
    use another one. counts is a dictionary that looks
    like {factor: occurrences, factor: occurrences, ...}.
    It must contain at least {2: 0} but need not have
    any other pre-populated elements. Factors are added
    to this dictionary as they are found.
    """
    while num % 2 == 0:
        num //= 2
        counts[2] += 1
    i = 3
    found = False
    while not found and i <= int(math.sqrt(num)):
        if num % i == 0:
            found = True
            factors(i, counts)
            factors(num // i, counts)
        else:
            i += 2
    if not found:
        if num in counts:
            counts[num] += 1
        else:
            counts[num] = 1

# MAIN ROUTINE IS HERE
n1 = 37           # first number (6 in your example)
n2 = 41           # second number (9 in your example)
counts = {2: 0}   # initialise factors (start with "no factors of 2")
factors(n1, counts)  # find the factors of n1 and add them to the dictionary
factors(n2, counts)  # find the factors of n2 and add them to the dictionary
sqfac = 1
# now find all factors repeated twice and multiply them together
for k in counts:
    sqfac *= k ** (counts[k] // 2)
# here is the result
print(sqfac)
Answer in C++
#include <cmath>
#include <iostream>
using namespace std;

void func(int i, int j)
{
    const int k = 54;
    double result = pow(i, 2) / k;
    if (static_cast<int>(result) == result)
    {
        if (i < j)
        {
            func(j, i);
        }
        else
        {
            cout << "Number is correct: " << i << endl;
        }
    }
    else
    {
        cout << "Number is wrong" << endl;
        func(j, i);
    }
}
Explanation:
Recurse first, then test whether the result is a positive integer. If it is, check whether the other multiple is less or greater; if greater, the recursive call tries the other multiple, otherwise the number is correct. If the result is not a positive integer, print "Number is wrong" and make another recursive call to test j.
If I got the problem correctly, you have a rectangle of length A, width B, and area K, and you want to convert it to a square while losing the minimum possible area.
If this is the case, then the problem with your algorithm is not the cost of iterating through multiple iterations until getting the output. Rather, the problem is that your algorithm depends heavily on the length A and width B of the input rectangle, while it should depend only on the area K.
For example: assume A = 1 and B = 25. Then K = 25 (the rectangle's area). Your algorithm will take the minimum value, which is A, and accept it as the answer after a single iteration, which is very fast but leads to a wrong answer: it produces a square of area 1 and wastes the remaining 24 (whatever cm or m), while the correct answer here should be 5, which will never be reached by your algorithm.
So, in my solution I assume a single input K.
My idea is as follows:
x = sqrt(K)
if x is an integer, x is the answer
else loop from x-1 down to 1 (x--)
if K/x^2 is an integer, x is the answer
This might take extra iterations but will guarantee an accurate answer. Also, there might be some concern about the cost of sqrt(K), but it is called just once, to avoid being misled by the rectangle's length and width.
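A direct Python rendering of this idea (a sketch; the function name is mine, and math.isqrt needs Python 3.8+):
import math

def largest_square_root_divisor(k):
    # Scan down from floor(sqrt(k)); the first x whose square divides k
    # is the largest possible, since any valid x satisfies x*x <= k.
    for x in range(math.isqrt(k), 1, -1):
        if k % (x * x) == 0:
            return x
    return 1

# largest_square_root_divisor(54) == 3
# largest_square_root_divisor(25) == 5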

Find the sum of least common multiples of all subsets of a given set

Given: a set A = {a0, a1, ..., aN-1} (1 ≤ N ≤ 100), with 2 ≤ ai ≤ 500.
Asked: Find the sum of all least common multiples (LCM) of all subsets of A of size at least 2.
The LCM of a set B = {b0, b1, ..., bk-1} is defined as the minimum integer Bmin such that bi | Bmin for all 0 ≤ i < k.
Example:
Let N = 3 and A = {2, 6, 7}, then:
LCM({2, 6}) = 6
LCM({2, 7}) = 14
LCM({6, 7}) = 42
LCM({2, 6, 7}) = 42
----------------------- +
answer 104
The naive approach would be to simply calculate the LCM for all O(2^N) subsets, which is not feasible for reasonably large N.
Solution sketch:
The problem is obtained from a competition*, which also provided a solution sketch. This is where my problem comes in: I do not understand the hinted approach.
The solution reads (modulo some small fixed grammar issues):
The solution is a bit tricky. If we observe carefully we see that the integers are between 2 and 500. So, if we prime factorize the numbers, we get the following maximum powers:
2 8
3 5
5 3
7 3
11 2
13 2
17 2
19 2
Other than this, all primes have power 1. So, we can easily calculate all possible states, using these integers, leaving 9 * 6 * 4 * 4 * 3 * 3 * 3 * 3 states, which is nearly 70000. For other integers we can make a dp like the following: dp[70000][i], where i can be 0 to 100. However, as dp[i] is dependent on dp[i-1], so dp[70000][2] is enough. This leaves the complexity to n * 70000 which is feasible.
I have the following concrete questions:
What is meant by these states?
Does dp stand for dynamic programming and if so, what recurrence relation is being solved?
How is dp[i] computed from dp[i-1]?
Why do the big primes not contribute to the number of states? Each of them occurs either 0 or 1 times. Should the number of states not be multiplied by 2 for each of these primes (leading to a non-feasible state space again)?
*The original problem description can be found from this source (problem F). This question is a simplified version of that description.
Discussion
After reading the actual contest description (page 10 or 11) and the solution sketch, I have to conclude the author of the solution sketch is quite imprecise in their writing.
The high level problem is to calculate an expected lifetime if components are chosen randomly by fair coin toss. This is what's leading to computing the LCM of all subsets -- all subsets effectively represent the sample space. You could end up with any possible set of components. The failure time for the device is based on the LCM of the set. The expected lifetime is therefore the average of the LCM of all sets.
Note that this ought to include the LCM of sets with only one item (in which case we'd assume the LCM to be the element itself). The solution sketch seems to gloss over this, perhaps because they handled it in a less elegant manner.
What is meant by these states?
The sketch author only uses the word state twice, but apparently manages to switch meanings. In the first use of the word state it appears they're talking about a possible selection of components. In the second use they're likely talking about possible failure times. They could be muddling this terminology because their dynamic programming solution initializes values from one use of the word and the recurrence relation stems from the other.
Does dp stand for dynamic programming?
I would say either it does or it's a coincidence as the solution sketch seems to heavily imply dynamic programming.
If so, what recurrence relation is being solved? How is dp[i] computed from dp[i-1]?
All I can think is that in their solution, state i represents a time to failure, T(i), with the number of times this time to failure has been counted, dp[i]. The resulting sum would be the sum of all dp[i] * T(i).
dp[i][0] would then be the failure times counted for only the first component. dp[i][1] would then be the failure times counted for the first and second component. dp[i][2] would be for the first, second, and third. Etc..
Initialize dp[i][0] with zeroes except for dp[T(c)][0] (where c is the first component considered) which should be 1 (since this component's failure time has been counted once so far).
To populate dp[i][n] from dp[i][n-1] for each component c:
For each i, copy dp[i][n-1] into dp[i][n].
Add 1 to dp[T(c)][n].
For each i, add dp[i][n-1] to dp[LCM(T(i), T(c))][n].
What is this doing? Suppose you knew that you had a time to failure of j, but you added a component with a time to failure of k. Regardless of what components you had before, your new time to fail is LCM(j, k). This follows from the fact that for two sets A and B, LCM(A union B) = LCM(LCM(A), LCM(B)).
Similarly, if we're considering a time to failure of T(i) and our new component's time to failure of T(c), the resultant time to failure is LCM(T(i), T(c)). Note that we recorded this time to failure for dp[i][n-1] configurations, so we should record that many new times to failure once the new component is introduced.
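For concreteness, here is a small Python sketch of this counting scheme (my own illustration, not the contest solution: it keys the table by the LCM value itself rather than by the ~70,000-state encoding, and needs Python 3.9+ for math.lcm):
from math import lcm  # Python 3.9+

def sum_of_subset_lcms(a):
    # counts[v] = number of non-empty subsets of the items seen so far
    # whose LCM is exactly v.
    counts = {}
    for x in a:
        new = dict(counts)              # subsets that skip x
        for v, c in counts.items():
            w = lcm(v, x)               # subsets with LCM v, now joined by x
            new[w] = new.get(w, 0) + c
        new[x] = new.get(x, 0) + 1      # the singleton {x}
        counts = new
    return sum(v * c for v, c in counts.items())
For A = {2, 6, 7} this returns 119; subtracting the singleton LCMs (2 + 6 + 7 = 15) leaves the 104 from the example above.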
Why do the big primes not contribute to the number of states?
Each of them occurs either 0 or 1 times. Should the number of states not be multiplied by 2 for each of these primes (leading to a non-feasible state space again)?
You're right, of course. However, the solution sketch states that numbers with large primes are handled in another (unspecified) fashion.
What would happen if we did include them? The number of states we would need to represent would explode into an impractical number. Hence the author accounts for such numbers differently. Note that if a number less than or equal to 500 includes a prime larger than 19 the other factors multiply to 21 or less. This makes such numbers amenable for brute forcing, no tables necessary.
The first part of the editorial seems useful, but the second part is rather vague (and perhaps unhelpful; I'd rather finish this answer than figure it out).
Let's suppose for the moment that the input consists of pairwise distinct primes, e.g., 2, 3, 5, and 7. Then the answer (for summing all sets, where the LCM of 0 integers is 1) is
(1 + 2) (1 + 3) (1 + 5) (1 + 7),
because the LCM of a subset is exactly equal to the product here, so just multiply it out.
Let's relax the restriction that the primes be pairwise distinct. If we have an input like 2, 2, 3, 3, 3, and 5, then the multiplication looks like
(1 + (2^2 - 1) 2) (1 + (2^3 - 1) 3) (1 + (2^1 - 1) 5),
because 2 appears with multiplicity 2, and 3 appears with multiplicity 3, and 5 appears with multiplicity 1. With respect to, e.g., just the set of 3s, there are 2^3 - 1 ways to choose a subset that includes a 3, and 1 way to choose the empty set.
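As a quick sanity check of this product formula (my own snippet; math.lcm with no arguments returns 1, which conveniently handles the empty set, and needs Python 3.9+):
from itertools import combinations
from math import lcm, prod

primes = [2, 3, 5, 7]
# Sum of LCMs over all subsets, counting the empty set's LCM as 1.
brute = sum(lcm(*s)
            for r in range(len(primes) + 1)
            for s in combinations(primes, r))
assert brute == prod(1 + p for p in primes) == 576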
Call a prime small if it's 19 or less and large otherwise. Note that integers 500 or less are divisible by at most one large prime (with multiplicity). The small primes are more problematic. What we're going to do is to compute, for each possible small portion of the prime factorization of the LCM (i.e., one of the ~70,000 states), the sum of LCMs for the problem derived by discarding the integers that could not divide such an LCM and leaving only the large prime factor (or 1) for the other integers.
For example, if the input is 2, 30, 41, 46, and 51, and the state is 2, then we retain 2 as 1, discard 30 (= 2 * 3 * 5; 3 and 5 are small), retain 41 as 41 (41 is large), retain 46 as 23 (= 2 * 23; 23 is large), and discard 51 (= 3 * 17; 3 and 17 are small). Now, we compute the sum of LCMs using the previously described technique. Use inclusion-exclusion to get rid of the subsets whose LCM's small portion properly divides the state instead of being exactly equal to it. Maybe I'll work a complete example later.
What is meant by these states?
I think here, states refer to whether a number is in the set B = {b0, b1, ..., bk-1} of LCMs of set A.
Does dp stand for dynamic programming and if so, what recurrence relation is being solved?
dp in the solution sketch stands for dynamic programming, I believe.
How is dp[i] computed from dp[i-1]?
We can figure out the states of the next group of LCMs from the previous states, so we only need an array of 2 and can toggle back and forth.
Why do the big primes not contribute to the number of states? Each of them occurs either 0 or 1 times. Should the number of states not be multiplied by 2 for each of these primes (leading to a non-feasible state space again)?
We can use the prime factorization, keeping only the exponents, to represent a number.
Here is one example.
6 = (2^1)(3^1)(5^0) -> state "1 1 0" to represent 6
18 = (2^1)(3^2)(5^0) -> state "1 2 0" to represent 18
Here is how we can get the LCM of 6 and 18 using prime factorization:
LCM(6, 18) = (2^max(1,1))(3^max(1,2))(5^max(0,0)) = (2^1)(3^2)(5^0) = 18
2^9 > 500, 3^6 > 500, 5^4 > 500, 7^4>500, 11^3 > 500, 13^3 > 500, 17^3 > 500, 19^3 > 500
so we can use just the exponents of the primes 2, 3, 5, 7, 11, 13, 17, 19 to represent the LCMs in the set B = {b0, b1, ..., bk-1}
for the given set A = {a0, a1, ..., aN-1} (1 ≤ N ≤ 100), with 2 ≤ ai ≤ 500.
9 * 6 * 4 * 4 * 3 * 3 * 3 * 3 <= 70000, so we only need two copies of dp[9][6][4][4][3][3][3][3] to keep track of all LCMs' states. So, dp[70000][2] is enough.
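A small sketch of that exponent encoding (my own illustration; small_state is a hypothetical helper):
SMALL_PRIMES = [2, 3, 5, 7, 11, 13, 17, 19]

def small_state(n):
    # Exponent vector of n over the primes up to 19, plus the leftover
    # cofactor (1, or a single prime > 19, for any n <= 500).
    exps = []
    for p in SMALL_PRIMES:
        e = 0
        while n % p == 0:
            n //= p
            e += 1
        exps.append(e)
    return tuple(exps), n

# small_state(18) == ((1, 2, 0, 0, 0, 0, 0, 0), 1)
# small_state(46) == ((1, 0, 0, 0, 0, 0, 0, 0), 23)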
I put together a small C++ program to illustrate how we can get the sum of LCMs of the given set A = {a0, a1, ..., aN-1} (1 ≤ N ≤ 100), with 2 ≤ ai ≤ 500. Following the solution sketch, we loop through at most 70000 possible LCMs.
#include <cstring>
#include <iostream>
using namespace std;

int gcd(int a, int b) {
    int remainder = 0;
    do {
        remainder = a % b;
        a = b;
        b = remainder;
    } while (b != 0);
    return a;
}

int lcm(int a, int b) {
    if (a == 0 || b == 0) {
        return 0;
    }
    return (a * b) / gcd(a, b);
}

int sum_of_lcm(int A[], int N) {
    // get the max LCM from the array
    int max = A[0];
    for (int i = 1; i < N; i++) {
        max = lcm(max, A[i]);
    }
    max++;

    int dp[max][2];  // variable-length array; a GCC extension
    memset(dp, 0, sizeof(dp));
    int pri = 0;
    int cur = 1;
    // loop through n x 70000
    for (int i = 0; i < N; i++) {
        for (int v = 1; v < max; v++) {
            int x = A[i];
            if (dp[v][pri] > 0) {
                x = lcm(A[i], v);
                dp[v][cur] = (dp[v][cur] == 0) ? dp[v][pri] : dp[v][cur];
                if (x % A[i] != 0) {
                    dp[x][cur] += dp[v][pri] + dp[A[i]][pri];
                } else {
                    dp[x][cur] += (x == v) ? (dp[v][pri] + dp[v][pri]) : (dp[v][pri]);
                }
            }
        }
        dp[A[i]][cur]++;
        pri = cur;
        cur = (pri + 1) % 2;
    }
    for (int i = 0; i < N; i++) {
        dp[A[i]][pri] -= 1;
    }
    long total = 0;
    for (int j = 0; j < max; j++) {
        if (dp[j][pri] > 0) {
            total += dp[j][pri] * j;
        }
    }
    cout << "total:" << total << endl;
    return total;
}

int test() {
    int a[] = {2, 6, 7};
    int n = sizeof(a) / sizeof(a[0]);
    int total = sum_of_lcm(a, n);
    return 0;
}
Output
total:104
The number of states for each prime is one more than its maximum power. You have numbers up to 2^8, so the power of 2 is in [0..8], which is 9 states. Similarly for the other primes.
"dp" could well stand for dynamic programming, I'm not sure.
The recurrence relation is the heart of the problem, so you will learn more by solving it yourself. Start with some small, simple examples.
For the large primes, try solving a reduced problem without using them (or their equivalents) and then add them back in to see their effect on the final result.

Find next prime given all prior

I'm writing a recursive infinite prime number generator, and I'm almost sure I can optimize it better.
Right now, aside from a lookup table of the first dozen primes, each call to the recursive function receives a list of all previous primes.
Since it's a lazy generator, right now I'm just filtering out any number that is 0 modulo any of the previous primes, and taking the first unfiltered result. (The check I'm using short-circuits, so the first time a previous prime divides the current number evenly, it aborts with that information.)
Right now, my performance degrades around the 4,000th prime (37,813). I'm looking for ways to use the unique fact that I have a list of all prior primes, and am only searching for the next one, to improve my filtering algorithm. (Most information I can find offers non-lazy sieves to find primes under a limit, or ways to find the nth prime given p(n-1), not optimizations for finding p(n) given the primes 2...p(n-1).)
For example, I know that p(n) must reside in the range (p(n-1) + 1)...(p(n-1) + p(n-2)). Right now I start my filtering of integers at p(n-1) + 2 (since p(n-1) + 1 can only be prime for p(n-1) = 2, which is precomputed). But since this is a lazy generator, knowing the upper bound of the range (p(n-1) + p(n-2)) doesn't help me filter anything.
What can I do to filter more effectively given all previous primes?
Code Sample
#doc """
Creates an infinite stream of prime numbers.
iex> Enum.take(primes, 5)
[2, 3, 5, 7, 11]
iex> Enum.take_while(primes, fn(n) -> n < 25 end)
[2, 3, 5, 7, 11, 13, 17, 19, 23]
"""
#spec primes :: Stream.t
def primes do
Stream.unfold( [], fn primes ->
next = next_prime(primes)
{ next, [next | primes] }
end )
end
defp next_prime([]), do: 2
defp next_prime([2 | _]), do: 3
defp next_prime([3 | _]), do: 5
defp next_prime([5 | _]), do: 7
# ... etc
defp next_prime(primes) do
start = Enum.first(primes) + 2
Enum.first(
Stream.drop_while(
Integer.stream(from: start, step: 2),
fn number ->
Enum.any?(primes, fn prime ->
rem(number, prime) == 0
end )
end
)
)
end
The primes function starts with an empty array, gets the next prime for it (2 initially), and then 1) emits it from the Stream and 2) adds it to the top of the primes stack used in the next call. (I'm sure this stack is the source of some slowdown.)
The next_prime function takes in that stack. Starting from the last known prime + 2, it creates an infinite stream of integers, drops each integer that is evenly divisible by any known prime in the list, and then returns the first remaining occurrence.
This is, I suppose, something similar to a lazy incremental Eratosthenes's sieve.
You can see some basic attempts at optimization: I start checking at p(n-1) + 2, and I step over even numbers.
I tried a more verbatim Eratosthenes's sieve by just passing the Integer.stream through each calculation, and after finding a prime, wrapping the Integer.stream in a new Stream.drop_while that filtered just multiples of that prime out. But since Streams are implemented as anonymous functions, that mutilated the call stack.
It's worth noting that I'm not assuming you need all prior primes to generate the next one. I just happen to have them around, thanks to my implementation.
For any number k you only need to try division by the primes up to and including √k. This is because any divisor larger than √k must be paired with a cofactor smaller than √k.
Proof:
√k * √k = k, so (√k + a) * √k > k for all a > 0. Hence, if √k + a divides k, its cofactor k/(√k + a) must be smaller than √k; in other words, k has a divisor greater than √k only if it also has one smaller than √k.
This is commonly used to speed up finding primes tremendously.
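Applied to the question's setup, the inner check might look like this (a Python sketch of the idea rather than the asker's Elixir; assumes primes is the ascending list of primes found so far, starting from [2, 3]):
def next_prime(primes):
    n = primes[-1] + 2
    while True:
        composite = False
        for p in primes:
            if p * p > n:       # no need to test primes above sqrt(n)
                break
            if n % p == 0:
                composite = True
                break
        if not composite:
            return n
        n += 2                  # skip even numbers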
You don't need all prior primes, just those below the square root of your current production point are enough, when generating composites from primes by the sieve of Eratosthenes algorithm.
This greatly reduces the memory requirements. The primes are then simply those odd numbers which are not among the composites.
Each prime p produces a chain of its multiples, starting from its square, enumerated with the step of 2p (because we work only with odd numbers). These multiples, each with its step value, are stored in a dictionary, thus forming a priority queue. Only the primes up to the square root of the current candidate are present in this priority queue (the same memory requirement as that of a segmented sieve of E.).
Symbolically, the sieve of Eratosthenes is
P = {3,5,7,9,...} \ ⋃ { {p², p²+2p, p²+4p, p²+6p, ...} | p in P }
Each odd prime generates a stream of its multiples by repeated addition; all these streams merged together give us all the odd composites; and primes are all the odd numbers without the composites (and the one even prime number, 2).
In Python (can be read as an executable pseudocode, hopefully),
def postponed_sieve():                   # postponed sieve, by Will Ness,
    yield 2; yield 3;                    # https://stackoverflow.com/a/10733621/849891
    yield 5; yield 7;                    # original code David Eppstein / Alex Martelli
    D = {}                               # 2002, http://code.activestate.com/recipes/117119
    ps = (p for p in postponed_sieve())  # a separate Primes Supply:
    p = ps.next() and ps.next()          # (3) a Prime to add to dict
    q = p*p                              # (9) when its sQuare is
    c = 9                                # the next Candidate
    while True:
        if c not in D:                   # not a multiple of any prime seen so far:
            if c < q: yield c            # a prime, or
            else:   # (c==q):            # the next prime's square:
                add(D, c + 2*p, 2*p)     # (9+6, 6 : 15,21,27,33,...)
                p = ps.next()            # (5)
                q = p*p                  # (25)
        else:                            # 'c' is a composite:
            s = D.pop(c)                 # step of increment
            add(D, c + s, s)             # next multiple, same step
        c += 2                           # next odd candidate

def add(D, x, s):                        # make no multiple keys in Dict
    while x in D: x += s                 # increment by the given step
    D[x] = s
Once a prime is produced, it can be forgotten. A separate prime supply is taken from a separate invocation of the same generator, recursively, to maintain the dictionary. And the prime supply for that one is taken from another, recursively as well. Each needs to be supplied only up to the square root of its production point, so very few generators are needed overall (on the order of log log N generators), and their sizes are asymptotically insignificant (sqrt(N), sqrt( sqrt(N) ), etc).
I wrote a program that generates the prime numbers in order, without limit, and used it to sum the first billion primes at my blog. The algorithm uses a segmented Sieve of Eratosthenes; additional sieving primes are calculated at each segment, so the process can continue indefinitely, as long as you have space to store the sieving primes. Here's pseudocode:
function init(delta)                # Sieve of Eratosthenes
    m, ps, qs := 0, [], []
    sieve := makeArray(2 * delta, True)
    for p from 2 to delta
        if sieve[p]
            m := m + 1; ps.insert(p)
            qs.insert(p + (p-1) / 2)
            for i from p+p to 2*delta-1 step p
                sieve[i] := False
    return m, ps, qs, sieve
function advance(m, ps, qs, sieve, bottom, delta)
    for i from 0 to delta - 1
        sieve[i] := True
    for i from 0 to m - 1
        qs[i] := (qs[i] - delta) % ps[i]
    p := ps[0] + 2
    while p * p <= bottom + 2 * delta
        if isPrime(p)               # trial division
            m := m + 1; ps.insert(p)
            qs.insert((p*p - bottom - 1) / 2)
        p := p + 2
    for i from 0 to m - 1
        for j from qs[i] to delta step ps[i]
            sieve[j] := False
    return m, ps, qs, sieve
Here ps is the list of sieving primes less than the current maximum and qs is the offset of the smallest multiple of the corresponding ps in the current segment. The advance function clears the bitarray, resets qs, extends ps and qs with new sieving primes, then sieves the next segment.
function genPrimes()
    bottom, i, delta := 0, 1, 50000
    m, ps, qs, sieve := init(delta)
    yield 2
    while True
        if i == delta               # reset for next segment
            i, bottom := -1, bottom + 2 * delta
            m, ps, qs, sieve := advance(m, ps, qs, sieve, bottom, delta)
        else if sieve[i]            # found prime
            yield bottom + 2*i + 1
        i := i + 1
The segment size 2 * delta is arbitrarily set to 100000. This method requires O(sqrt(n)) space for the sieving primes plus constant space for the sieve.
It is slower but saves space to generate candidates with a wheel and test the candidates for primality.
function genPrimes()
    w, wheel := 0, [1,2,2,4,2,4,2,4,6,2,6,4,2,4,
                    6,6,2,6,4,2,6,4,6,8,4,2,4,2,4,8,6,4,6,
                    2,4,6,2,6,6,4,2,4,6,2,6,4,2,4,2,10,2,10]
    p := 2; yield p
    repeat
        p := p + wheel[w]
        if w == 51 then w := 4 else w := w + 1
        if isPrime(p) yield p
It may be useful to begin with a sieve and switch to a wheel when the sieve grows too large. Even better is to continue sieving with some fixed set of sieving primes, once the set grows too large, then report only those values bottom + 2*i + 1 that pass a primality test.
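Both pseudocode fragments leave isPrime unspecified; a minimal trial-division version (a Python sketch, adequate as long as the candidates stay moderate) could be:
def is_prime(n):
    # Trial division by 2 and by odd numbers up to sqrt(n).
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    f = 3
    while f * f <= n:
        if n % f == 0:
            return False
        f += 2
    return True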

Why is my implementation of the Atkin sieve slower than Eratosthenes? [closed]

I'm doing problems from Project Euler in Ruby and implemented Atkin's sieve for finding prime numbers, but it runs slower than the sieve of Eratosthenes. What is the problem?
def atkin_sieve(n)
  primes = [2, 3, 5]
  sieve = Array.new(n+1, false)

  y_upper = n-4 > 0 ? Math.sqrt(n-4).truncate : 1
  for x in (1..Math.sqrt(n/4).truncate)
    for y in (1..y_upper)
      k = 4*x**2 + y**2
      sieve[k] = !sieve[k] if k%12 == 1 or k%12 == 5
    end
  end

  y_upper = n-3 > 0 ? Math.sqrt(n-3).truncate : 1
  for x in (1..Math.sqrt(n/3).truncate)
    for y in (1..y_upper)
      k = 3*x**2 + y**2
      sieve[k] = !sieve[k] if k%12 == 7
    end
  end

  for x in (1..Math.sqrt(n).truncate)
    for y in (1..x)
      k = 3*x**2 - y**2
      if k < n and k%12 == 11
        sieve[k] = !sieve[k]
      end
    end
  end

  for j in (5...n)
    if sieve[j]
      prime = true
      for i in (0...primes.length)
        if j % (primes[i]**2) == 0
          prime = false
          break
        end
      end
      primes << j if prime
    end
  end
  primes
end
def erato_sieve(n)
  primes = []
  for i in (2..n)
    if primes.all? { |x| i % x != 0 }
      primes << i
    end
  end
  primes
end
As Wikipedia says, "The modern sieve of Atkin is more complicated, but faster when properly optimized" (my emphasis).
The first obvious place to save some time in the first set of loops would be to stop iterating over y when 4*x**2 + y**2 is greater than n. For example, if n is 1,000,000 and x is 450, then you should stop iterating when y is greater than 435 (instead of continuing to 999 as you do at the moment). So you could rewrite the first loop as:
for x in (1..Math.sqrt(n/4).truncate)
  X = 4 * x ** 2
  for y in (1..Math.sqrt(n - X).truncate)
    k = X + y ** 2
    sieve[k] = !sieve[k] if k%12 == 1 or k%12 == 5
  end
end
(This also avoids re-computing 4*x**2 each time round the loop, though that is probably a very small improvement, if any.)
Similar remarks apply, of course, to the other loops over y.
A second place where you could speed things up is in the strategy for looping over y. You loop over all values of y in the range, and then check to see which ones lead to values of k with the correct remainders modulo 12. Instead, you could just loop over the right values of y only, and avoid testing the remainders altogether.
Since squares are only ever 0, 1, 4, or 9 modulo 12, 4*x**2 is either 0 or 4 modulo 12. If 4*x**2 is 4 modulo 12 (that is, x is not a multiple of 3), then y**2 must be 1 or 9 modulo 12, which every odd y satisfies, so y can be 1, 3, 5, 7, 9, or 11 modulo 12. If 4*x**2 is 0 modulo 12 (x a multiple of 3), then y**2 must be 1 modulo 12 (5 is not a square residue), so y must be 1, 5, 7, or 11 modulo 12.
I also note that your sieve of Eratosthenes is doing useless work by testing divisibility by all primes below i. You can halt the iteration once you've tested divisibility by all primes less than or equal to the square root of i.
It would help a lot if you actually implemented the Sieve of Eratosthenes properly in the first place.
The critical feature of that sieve is that you only do one operation per time a prime divides a number. By contrast you are doing work for every prime less than the number. The difference is subtle, but the performance implications are huge.
Here is the actual sieve that you failed to implement:
def eratosthenes_primes(n)
  could_be_prime = (0..n).map { |i| true }
  could_be_prime[0] = false
  could_be_prime[1] = false
  i = 0
  while i*i <= n
    if could_be_prime[i]
      j = i*i
      while j <= n
        could_be_prime[j] = false
        j += i
      end
    end
    i += 1
  end
  return (2..n).find_all { |i| could_be_prime[i] }
end
Compare this with your code for finding all of the primes up to 50,000. Also note that this can easily be sped up by a factor of 2 by special casing the logic for even numbers. With that tweak, this algorithm should be fast enough for every Project Euler problem that needs you to compute a lot of primes.
@Gareth mentions some redundant calculations regarding 4x^2 + y^2. Both here and in other places where you have calculations within a loop, you can make use of calculations you've already performed and reduce this to simple addition.
Rather than X = 4 * x ** 2, you could rely on the fact that X already holds the value of 4 * (x-1) ** 2. Since 4x^2 = 4(x-1)^2 + 4(2x - 1), all you need to do is add 8 * x - 4 to X. You can use the same trick for k, and in the other places where you have repeated calculations (like 3x^2 + y^2).
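To illustrate the strength reduction (a Python sketch rather than Ruby; the loop bound is arbitrary):
# Maintain X = 4*x**2 incrementally: 4x^2 - 4(x-1)^2 = 8x - 4,
# so each step costs one addition instead of a squaring.
X = 0
for x in range(1, 11):
    X += 8 * x - 4
    assert X == 4 * x ** 2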
