Project Euler number 35 efficiency - performance

https://projecteuler.net/problem=35
All problems on Project Euler are supposed to be solvable by a program in under 1 minute. My solution, however, has a runtime of almost 3 minutes. Other solutions I've seen online are similar to mine conceptually, but have runtimes that are orders of magnitude faster. Can anyone help make my code more efficient/run faster?
Thanks!
#genPrimes takes an argument n and returns a list of all prime numbers less than n
def genPrimes(n):
    primeList = [2]
    number = 3
    while(number < n):
        isPrime = True
        for element in primeList:
            if element > number**0.5:
                break
            if number%element == 0 and element <= number**0.5:
                isPrime = False
                break
        if isPrime == True:
            primeList.append(number)
        number += 2
    return primeList
#isCircular takes a number as input and returns True if all rotations of that number are prime
def isCircular(prime):
    original = prime
    isCircular = True
    prime = int(str(prime)[-1] + str(prime)[:len(str(prime)) - 1])
    while(prime != original):
        if prime not in primeList:
            isCircular = False
            break
        prime = int(str(prime)[-1] + str(prime)[:len(str(prime)) - 1])
    return isCircular
primeList = genPrimes(1000000)
circCount = 0
for prime in primeList:
if isCircular(prime):
circCount += 1
print circCount

Two modifications of your code yield a pretty fast solution (roughly 2 seconds on my machine):
Generating primes is a common problem with many solutions on the web. I replaced yours with rwh_primes1 from this article:
def genPrimes(n):
    sieve = [True] * (n/2)
    for i in xrange(3,int(n**0.5)+1,2):
        if sieve[i/2]:
            sieve[i*i/2::i] = [False] * ((n-i*i-1)/(2*i)+1)
    return [2] + [2*i+1 for i in xrange(1,n/2) if sieve[i]]
It is about 65 times faster (0.04 seconds).
The most important step I'd suggest, however, is to filter the list of generated primes. Since each circularly shifted version of an integer has to be prime, the circular prime must not contain certain digits. The prime 23, e.g., can be easily spotted as an invalid candidate, because it contains a 2, which indicates divisibility by two when this is the last digit. Thus you might remove all such bad candidates by the following simple method:
def filterPrimes(primeList):
    for i in primeList[3:]:
        if '0' in str(i) or '2' in str(i) or '4' in str(i) \
                or '5' in str(i) or '6' in str(i) or '8' in str(i):
            primeList.remove(i)
    return primeList
Note that the loop starts at the fourth prime number to avoid removing the number 2 or 5.
The filtering step takes most of the computing time (about 1.9 seconds), but reduces the number of circular prime candidates dramatically from 78498 to 1113 (= 98.5 % reduction)!
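If the filtering step itself becomes the bottleneck, one possible alternative (a sketch of mine, not part of the original answer; filter_primes_fast is a made-up name) is a single comprehension over a set of disallowed digits, which avoids the repeated list.remove() calls:
def filter_primes_fast(primeList):
    # Keep 2 and 5 themselves; drop any other prime containing a digit that
    # would make some rotation end in 0, 2, 4, 5, 6 or 8.
    bad = set('024568')
    return [p for p in primeList if p in (2, 5) or not (bad & set(str(p)))]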
The last step, the circulation of each remaining candidate, can be done as you suggested. If you wish, you can simplify the code as follows:
circCount = sum(map(isCircular, primeList))
Due to the reduced candidate set this step is completed in only 0.03 seconds.

Ruby's digits method performance

I'm solving some Project Euler problems using Ruby, and specifically here I'm talking about problem 25 (What is the index of the first term in the Fibonacci sequence to contain 1000 digits?).
At first, I was using Ruby 2.2.3 and I coded the problem as such:
number = 3
a = 1
b = 2
while b.to_s.length < 1000
  a, b = b, a + b
  number += 1
end
puts number
But then I found out that version 2.4.2 has a method called digits which is exactly what I needed. I transformed the code to:
while b.digits.length < 1000
And when I compared the two methods, digits was much slower.
Time
./025/problem025.rb 0.13s user 0.02s system 80% cpu 0.190 total
./025/problem025.rb 2.19s user 0.03s system 97% cpu 2.275 total
Does anyone have an idea why?
Ruby's digits
... is implemented in rb_int_digits.
Which for non-tiny numbers (i.e., most of your numbers) uses rb_int_digits_bigbase.
Which extracts digit after digit naively with division/modulo by base.
So it should take quadratic time (at least with a small base such as 10).
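As an illustration only, here is a Python sketch of that naive scheme (not Ruby's actual C code): each division of a d-digit bignum by a small base costs roughly O(d), and it is repeated d times, which is where the quadratic behaviour comes from.
def digits_naive(n, base=10):
    # Repeated divmod, similar in spirit to rb_int_digits_bigbase: d iterations,
    # each costing about O(d) on a d-digit bignum, so O(d^2) overall.
    out = []
    while n > 0:
        n, r = divmod(n, base)
        out.append(r)
    return out  # least-significant digit first, like Integer#digits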
Ruby's to_s
... is implemented in int_to_s.
Which uses rb_int2str.
Which for non-tiny numbers uses rb_big2str.
Which uses rb_big2str1.
Which might use big2str_gmp if available (which sounds/looks like it uses the fast GMP library) or ...
... uses big2str_generic.
Which uses big2str_karatsuba (sweet, I recognize that name!).
Which looks like it has something to do with ...
... Karatsuba's algorithm, which is a fast multiplication algorithm. If you multiply two n-digit numbers the naive way you learned in school, you take n^2 single-digit products. Karatsuba on the other hand only needs about n^1.585, which is quite a lot better. And I didn't read into this further, but I suspect what Ruby does here is also this efficient. Eric Lippert's answer with a base conversion algorithm uses Karatsuba multiplication and says "this [base conversion] algorithm is utterly dominated by the cost of the multiplication".
Comparing quadratic to n^1.585 over number lengths from 1 digit to 1000 digits gives a factor of about 15:
(1..1000).sum { |i| i**2 } / (1..1000).sum { |i| i**1.585 }
=> 15.150583254950678
Which is roughly the factor you observed as well. Of course that's a rather naive comparison, but, well, why not.
GMP by the way apparently uses/used a "near O(n * log(n)) FFT-based multiplication algorithm".
Thanks to @Drenmi's answer for motivating me to dig into the source after all. I hope I did this right, no guarantees, I'm a Ruby beginner. But that's why I left all the links there for you to check for yourself :-P
Integer#digits doesn't just "split" the number. From the documentation:
Returns the array including the digits extracted by place-value
notation with radix base of int.
This extraction is done even if a base argument is omitted. The relevant source:
# ruby/numeric.c:4809
while (!FIXNUM_P(num) || FIX2LONG(num) > 0) {
    VALUE qr = rb_int_divmod(num, base);
    rb_ary_push(digits, RARRAY_AREF(qr, 1));
    num = RARRAY_AREF(qr, 0);
}
As you can see, this process includes repeated modulo arithmetics, which likely accounts for the additional runtime.
Many Ruby methods create objects (strings, arrays, etc.), and in Ruby object creation is "expensive".
For instance, to_s creates a string and digits creates an array every time the while condition is evaluated.
If you want to optimize your example, you can do the following:
# create the smallest possible 1000 digits number
max = 10**999
number = 3
a = 1
b = 2
# do not create objects in while condition
while b < max
  a, b = b, a + b
  number += 1
end
puts number
I have not answered your question, but wish to suggest an improved algorithm for the problem you have addressed. For a given number of decimal digits, n, I have implemented the following algorithm.
estimate the number f of Fibonacci numbers ("FNs") that have n or fewer decimal digits.
compute the fth and (f-1)st FNs, and the number of digits m in the fth FN.
if m >= n, back down from the (f-1)st FN until the (f-1)st FN has fewer than n decimal digits, at which time the fth FN is the smallest FN to have n decimal digits.
if m < n, increase the fth FN until it has n decimal digits, at which time it is the smallest FN to have n decimal digits.
The key is to compute a close estimate f in the first step.
Code
AVG_FNs_PER_DIGIT = 4.784971966781667
def first_fibonacci_with_n_digits(n)
  return [1, 1] if n == 1
  idx = (n * AVG_FNs_PER_DIGIT).round
  fn, prev_fn = fib(idx)
  fn.to_s.size >= n ? fib_down(n, fn, prev_fn, idx) : fib_up(n, fn, prev_fn, idx)
end
def fib(idx)
  a = 1
  b = 2
  (idx - 2).times { a, b = b, a + b }
  [b, a]
end
def fib_up(n, b, a, idx)
  loop do
    a, b = b, a + b
    idx += 1
    break [idx, b] if b.to_s.size == n
  end
end
def fib_down(n, b, a, idx)
  loop do
    a, b = b - a, a
    break [idx, b] if a.to_s.size == n - 1
    idx -= 1
  end
end
Benchmarks
In computing each Fibonacci number two operations are typically performed:
compute the number of digits in the last-computed Fibonacci number and if that number is equal to the target number of digits, terminate (for reasons made clear in the Explanation section below, it cannot be larger than the target number); else
compute the next number in the Fibonacci sequence.
By contrast, the method I have proposed performs the first step a relatively small number of times.
How important is the first step relative to the second and how does the use of n.digits.size compare with that of n.to_s.size in the first step? Let's run some benchmarks to find out.
def use_to_s(ndigits)
  case ndigits
  when 1
    [1, 1]
  else
    a = 1
    b = 2
    idx = 3
    loop do
      break [idx, b] if b.to_s.length == ndigits
      a, b = b, a + b
      idx += 1
    end
  end
end
def use_digits(ndigits)
  case ndigits
  when 1
    [1, 1]
  else
    a = 1
    b = 2
    idx = 3
    loop do
      break [idx, b] if b.digits.size == ndigits
      a, b = b, a + b
      idx += 1
    end
  end
end
require 'fruity'
def test(ndigits)
  nfibs, last_fib = use_to_s(ndigits)
  puts "\nndigits = #{ndigits}, nfibs=#{nfibs}, last_fib=#{last_fib}"
  compare do
    try_use_to_s { use_to_s(ndigits) }
    try_use_digits { use_digits(ndigits) }
    try_estimate { first_fibonacci_with_n_digits(ndigits) }
  end
end
test 20
ndigits = 20, nfibs=93, last_fib=12200160415121876738
Running each test 128 times. Test will take about 1 second.
try_estimate is faster than try_use_to_s by 2x ± 0.1
try_use_to_s is faster than try_use_digits by 80.0% ± 10.0%
test 100
ndigits = 100, nfibs=476, last_fib=13447...37757 (90 digits omitted)
Running each test 16 times. Test will take about 4 seconds.
try_estimate is faster than try_use_to_s by 5x ± 0.1
try_use_to_s is faster than try_use_digits by 10x ± 1.0
test 500
ndigits = 500, nfibs=2390, last_fib=13519...63145 (490 digits omitted)
Running each test 2 times. Test will take about 27 seconds.
try_estimate is faster than try_use_to_s by 9x ± 0.1
try_use_to_s is faster than try_use_digits by 60x ± 1.0
test 1000
ndigits = 1000, nfibs=4782, last_fib=10700...27816 (990 digits omitted)
Running each test once. Test will take about 1 minute.
try_estimate is faster than try_use_to_s by 12x ± 10.0
try_use_to_s is faster than try_use_digits by 120x ± 100.0
There are two main take-aways from these results:
"try_estimate" is the fastest because it performs the first step relatively few times; and
the use of to_s is much faster than that of digits.
Further to the first of these observations note that the initial estimates of the index of the first FN having a given number of digits, compared to the actual index, are as follows:
for 20 digits: 96 est. vs 93 actual
for 100 digits: 479 est. vs 476 actual
for 500 digits: 2392 est. vs 2390 actual
for 1000 digits: 4785 est. vs 4782 actual
The deviation was at most 3, meaning numbers of digits had to be calculated for at most 3 FNs to obtain the desired result.
Explanation
The only explanation of the methods given in the section Code above is the derivation of the constant AVG_FNs_PER_DIGIT, which is used to calculate an estimate of the index of the first FN having the specified number of digits.
The derivation of this constant comes from the question and selected answer given here. (The Wikipedia article on Fibonacci numbers provides a good overview of the mathematical properties of FNs.)
It is known that the first 7 FNs (including zero) have one digit; thereafter the FNs gain an additional digit every 4 or 5 FNs (i.e., sometimes 4, else 5). Therefore, as a very crude bound, the index of the first FN with n digits, n >= 2, will not be less than 4*n. For n = 1000, that would be 4,000. (In fact, the 4,782nd is the smallest to have 1,000 digits.) In other words, we don't need to calculate the number of digits in the first 4,000 FNs. We can improve on this estimate, however.
As n approaches infinity, the ratio of ranges 10**n...10**(n+1) (n-digit intervals) that contain 5 FNs to those that contain 4 FNs can be computed as follows.
LOG_10 = Math.log(10)
#=> 2.302585092994046
GR = (1 + Math.sqrt(5))/2
#=> 1.618033988749895
LOG_GR = Math.log(GR)
#=> 0.48121182505960347
RATIO_5to4 = (LOG_10 - 4*LOG_GR)/(5*LOG_GR - LOG_10)
#=> 3.6505564183095474
where GR is the Golden Ratio.
Over a large number of n-digit intervals let n4 be the number of those intervals containing 4 FNs and n5 be the number containing 5 FNs. The average number of FNs per interval is therefore (n4*4 + n5*5)/(n4 + n5). Since n5/n4 converges to RATIO_5to4, n5 approaches RATIO_5to4 * n4 in the limit (discarding roundoff error). If we substitute out n5, and let
b = 1/(1 + RATIO_5to4)
#=> 0.21502803321833364
we find the average number of FNs per n-digit interval converges to
avg = b * 4 + (1-b) *5
#=> 4.784971966781667
If fn is the first FN to have n decimal digits, the number of FNs in the sequence up to and including fn can therefore be approximated to be
n * avg
For example, the estimate of the index of the first FN to have 1000 decimal digits is (1000 * 4.784971966781667).round #=> 4785.

Code Optimization - Generating Prime Numbers

I am trying to write a code for the following problem:
Input
The input begins with the number t of test cases in a single line (t<=10). In each of the next t lines there are two numbers m and n (1 <= m <= n <= 1000000000, n-m<=100000) separated by a space.
Output
For every test case print all prime numbers p such that m <= p <= n, one number per line, test cases separated by an empty line.
Sample Input:
2
1 10
3 5
Sample Output:
2
3
5
7
3
5
My code:
def prime?(number)
  return false if number == 1
  (2..number-1).each do |n|
    return false if number % n == 0
  end
  true
end
t = gets.strip.to_i
for i in 1..t
  mi, ni = gets.strip.split(' ')
  mi = mi.to_i
  ni = ni.to_i
  i = mi
  while i <= ni
    puts i if prime?(i)
    i += 1
  end
  puts "\n"
end
The code is running fine; the only problem I am having is that it takes a lot of time when run against big input ranges, compared to other programming languages.
Am I doing something wrong here? Can this code be further optimized for a faster runtime?
I have tried using a for loop, a normal loop, and creating an array and then printing it.
Any suggestions?
Ruby is slower than some other languages, depending on what language you compare it to; certainly slower than C/C++. But your problem is not the language (although it influences the run-time behavior), but your way of finding primes. There are many better algorithms for finding primes, such as the Sieve of Eratosthenes or the Sieve of Atkin. You might also read the “Generating Primes” page on Wikipedia and follow the links there.
By the way, for the Sieve of Eratosthenes, there is even a ready-to-use piece of code on Stackoverflow. I'm sure a little bit of googling will turn up implementations for other algorithms, too.
Since your problem is finding primes within a certain range, this is the Sieve of Eratosthenes code found at the above link modified to suit your particular problem:
def better_sieve_upto(first, last)
  sieve = [nil, nil] + (2..last).to_a
  sieve.each do |i|
    next unless i
    break if i*i > last
    (i*i).step(last, i) {|j| sieve[j] = nil }
  end
  sieve.reject {|i| !i || i < first}
end
Note the change from "sieve.compact" to a more complex "sieve.reject" with a corresponding condition.
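As an aside of mine (not from the original answers): because n can be as large as 10**9 while n - m <= 100000, allocating a sieve all the way up to n may be impractical. The standard workaround is a segmented sieve that marks only the [m, n] window using the primes up to sqrt(n). A Python sketch of the idea, with names of my own choosing:
import math

def primes_in_range(m, n):
    # 1. Ordinary sieve for the "base" primes up to sqrt(n).
    limit = math.isqrt(n)
    base = [True] * (limit + 1)
    base[0:2] = [False, False]
    for i in range(2, math.isqrt(limit) + 1):
        if base[i]:
            base[i*i::i] = [False] * len(base[i*i::i])
    small_primes = [i for i, is_p in enumerate(base) if is_p]
    # 2. Mark composites inside the window [m, n] only.
    window = [True] * (n - m + 1)
    for p in small_primes:
        start = max(p * p, ((m + p - 1) // p) * p)
        for multiple in range(start, n + 1, p):
            window[multiple - m] = False
    if m == 1:
        window[0] = False  # 1 is not prime
    return [m + i for i, is_p in enumerate(window) if is_p]
For the sample input, primes_in_range(1, 10) yields [2, 3, 5, 7] and primes_in_range(3, 5) yields [3, 5].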
Return true if the number is 2, false if the number is evenly divisible by 2.
Start iterating at 3, instead of 2. Use a step of two.
Iterate up to the square root of the number, instead of the number minus one.
def prime?(number)
  return true if number == 2
  return false if number <= 1 or number % 2 == 0
  (3..Math.sqrt(number)).step(2) do |n|
    return false if number % n == 0
  end
  true
end
This will be much faster, but still not very fast, as @Technation explains.
Here's how to do it using the Sieve of Eratosthenes built into Ruby. You'll need to precompute all the primes up to the maximum maximum, which will be very quick, and then select the primes that fall within each range.
require 'prime'
ranges = Array.new(gets.strip.to_i) do
  min, max = gets.strip.split.map(&:to_i)
  Range.new(min, max)
end
primes = Prime.each(ranges.map(&:max).max, Prime::EratosthenesGenerator.new)
ranges.each do |range|
  primes.each do |prime|
    next if prime < range.min
    break if prime > range.max
    puts prime
  end
  primes.rewind
  puts "\n"
end
Here's how the various solutions perform with the range 50000 200000:
Your original prime? function: 1m49.639s
My modified prime? function: 0m0.687s
Prime::EratosthenesGenerator: 0m0.221s
The more ranges being processed, the faster the Prime::EratosthenesGenerator method should be.

Why does this code take 8 minutes to finish?

This is a (pretty bad) solution to one of the project Euler problems. The problem was to find the 10_001st prime number. The code below does it, but it takes 8 minutes to run. Can you explain why that is the case and how to optimize it?
primes = []
number = 2.0
until primes[10000] != nil
  if (2..(number - 1)).any? do |n|
    number % n == 0
  end == false
    primes << number
  end
  number = number + 1.0
end
puts primes[10000]
Some simple optimizations to prime finding:
Start by pushing 2 onto your primes list, and start by checking if 3 is a prime. (This eliminates needing to write special case code for the numbers 0 to 2)
You only have to check numbers that are odd for prime candidacy. (Or, if you start by adding 2/3/5 and checking 7, you only need to check numbers that are 1 or 5 after doing % 6. Or... You get the idea)
You only have to see if your current candidate x is divisible by factors up to sqrt(x)—because any factor above sqrt(x) divides x into a number below sqrt(x), and you've already checked all of those.
You only have to check numbers in your prime list, instead of all numbers, for divisors of x - since all composite numbers are divisible by primes. For example, 81 is 9*9 - but 9*9 is 3*3*9, 9 being composite, so you'll already discover that 81 isn't prime when you check it against 3. Therefore you never need to test if 9 is a factor, and so on for every composite factor.
There are very optimized, sped up prime finding functions (see the Sieve of Atkin for a start), but these are the common optimizations that are easy to come up with.
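A compact sketch of those optimizations (in Python, for illustration; the function name is mine):
def nth_prime(n):
    # Trial division using only previously found primes, odd candidates only,
    # and divisors up to sqrt(candidate).
    primes = [2]
    candidate = 3
    while len(primes) < n:
        is_prime = True
        for p in primes:
            if p * p > candidate:
                break
            if candidate % p == 0:
                is_prime = False
                break
        if is_prime:
            primes.append(candidate)
        candidate += 2
    return primes[n - 1]
nth_prime(10001) then answers the Project Euler problem.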
Do you really have to check if the number divides with all previous numbers? Check only with the smaller primes you already discovered. Also, why using floats where integers are perfectly fine?
EDIT:
Some possible changes (not best algorithm, can be improved):
primes = [2, 3, 5]
num = 7
until primes[10000]
  is_prime = true
  i = 0
  sqrtnum = Math.sqrt(num).ceil
  while (n=primes[i+=1]) <= sqrtnum
    if num % n == 0
      is_prime = false
      break
    end
  end
  if is_prime
    primes << num
  end
  num += 2
end
puts primes[10000]
On my computer (for 1000 primes):
Yours:
real 0m3.300s
user 0m3.284s
sys 0m0.000s
Mine:
real 0m0.045s
user 0m0.040s
sys 0m0.004s

Find the minimum number of operations required to compute a number using a specified range of numbers

Let me start with an example -
I have a range of numbers from 1 to 9. And let's say the target number that I want is 29.
In this case the minimum number of operations required would be 2: (9*3)+2 = 29. Similarly, for 18 the minimum number of operations is 1 (9*2=18).
I can use any of the 4 arithmetic operators - +, -, / and *.
How can I programmatically find out the minimum number of operations required?
Thanks in advance for any help provided.
clarification: integers only, no decimals allowed mid-calculation. i.e. the following is not valid (from comments below): ((9/2) + 1) * 4 == 22
I must admit I didn't think about this thoroughly, but for my purpose it doesn't matter if decimal numbers appear mid-calculation. ((9/2) + 1) * 4 == 22 is valid. Sorry for the confusion.
For the special case where set Y = [1..9] and n > 0:
n <= 9 : 0 operations
n <=18 : 1 operation (+)
otherwise : Remove any divisor found in Y. If this is not enough, do a recursion on the remainder for all offsets -9 .. +9. Offset 0 can be skipped as it has already been tried.
Notice how division is not needed in this case. For other Y this does not hold.
This algorithm is exponential in log(n). The exact analysis is a job for somebody with more knowledge about algebra than I.
For more speed, add pruning to eliminate some of the search for larger numbers.
Sample code:
def findop(n, maxlen=9999):
    # Return a short postfix list of numbers and operations
    # Simple solution to small numbers
    if n<=9: return [n]
    if n<=18: return [9,n-9,'+']
    # Find direct multiply
    x = divlist(n)
    if len(x) > 1:
        mults = len(x)-1
        x[-1:] = findop(x[-1], maxlen-2*mults)
        x.extend(['*'] * mults)
        return x
    shortest = 0
    for o in range(1,10) + range(-1,-10,-1):
        x = divlist(n-o)
        if len(x) == 1: continue
        mults = len(x)-1
        # We spent len(divlist) + mults + 2 fields for offset.
        # The last number is expanded by the recursion, so it doesn't count.
        recursion_maxlen = maxlen - len(x) - mults - 2 + 1
        if recursion_maxlen < 1: continue
        x[-1:] = findop(x[-1], recursion_maxlen)
        x.extend(['*'] * mults)
        if o > 0:
            x.extend([o, '+'])
        else:
            x.extend([-o, '-'])
        if shortest == 0 or len(x) < shortest:
            shortest = len(x)
            maxlen = shortest - 1
            solution = x[:]
    if shortest == 0:
        # Fake solution, it will be discarded
        return '#' * (maxlen+1)
    return solution
def divlist(n):
    l = []
    for d in range(9,1,-1):
        while n%d == 0:
            l.append(d)
            n = n/d
    if n>1: l.append(n)
    return l
The basic idea is to test all possibilities with k operations, for k starting from 0. Imagine you create a tree of height k that branches for every possible new operation with operand (4*9 branches per level). You need to traverse and evaluate the leaves of the tree for each k before moving to the next k.
I didn't test this pseudo-code:
for every k from 0 to infinity
    for every n from 1 to 9
        if compute(n,0,k):
            return k
boolean compute(n,j,k):
    if (j == k):
        return (n == target)
    else:
        for each operator in {+,-,*,/}:
            for every i from 1 to 9:
                if compute((n operator i),j+1,k):
                    return true
        return false
It doesn't take into account arithmetic operators precedence and braces, that would require some rework.
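Here is a runnable Python sketch of the same iterative-deepening idea, evaluating strictly left to right with no precedence or brackets (the function name and the max_depth cap are mine; only exact integer division is explored, which suffices for the example):
def min_ops(target, digits=range(1, 10), max_depth=4):
    # Depth k = number of operators applied; the search is exponential in k,
    # so keep max_depth small.
    def search(value, depth, k):
        if depth == k:
            return value == target
        for d in digits:
            results = [value + d, value - d, value * d]
            if value % d == 0:
                results.append(value // d)
            if any(search(r, depth + 1, k) for r in results):
                return True
        return False
    for k in range(max_depth + 1):
        if any(search(d, 0, k) for d in digits):
            return k
    return None  # not found within max_depth operations
min_ops(29) should return 2, matching (9*3)+2 from the question.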
Really cool question :)
Notice that you can start from the end! From your example (9*3)+2 = 29 is equivalent to saying (29-2)/3=9. That way we can avoid the double loop in cyborg's answer. This suggests the following algorithm for set Y and result r:
nextleaves = {r}
nops = 0
while(true):
    nops = nops+1
    leaves = nextleaves
    nextleaves = {}
    for leaf in leaves:
        for y in Y:
            if (leaf+y) or (leaf-y) or (leaf*y) or (leaf/y) is in Y:
                return(nops)
            else:
                add (leaf+y) and (leaf-y) and (leaf*y) and (leaf/y) to nextleaves
This is the basic idea, performance can be certainly be improved, for instance by avoiding "backtracks", such as r+a-a or r*a*b/a.
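A runnable Python sketch of this backward search (a rough illustration without the backtrack pruning mentioned above; the function name is mine, and only exact integer division is inverted):
from itertools import count

def min_ops_backward(target, digits=range(1, 10)):
    # Work backwards from the target: each BFS level undoes one operation.
    # Assumes the target is reachable from some starting digit.
    if target in digits:
        return 0
    frontier = {target}
    for nops in count(1):
        next_frontier = set()
        for leaf in frontier:
            for y in digits:
                candidates = [leaf + y, leaf - y, leaf * y]
                if leaf % y == 0:
                    candidates.append(leaf // y)
                if any(c in digits for c in candidates):
                    return nops
                next_frontier.update(candidates)
        frontier = next_frontier
min_ops_backward(29) should return 2, since (29-2)/3 = 9.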
I guess my idea is similar to the one of Peer Sommerlund:
For big numbers, you advance fast by multiplying with big digits.
Is Y=29 prime? If not, divide it by its largest divisor from (2 to 9).
Else you could subtract a number to reach a divisible one. 27 is fine, since it is divisible by 9, so
(29-2)/9=3 =>
3*9+2 = 29
So maybe (I didn't think this through to the end): search for the next number below Y that is divisible by 9. If you don't reach a number which is a digit, repeat.
The formula is the steps reversed.
(I'll try it for some numbers. :) )
I tried with 2551, which is
echo $((((3*9+4)*9+4)*9+4))
But I didn't test every intermediate result whether it is prime.
But
echo $((8*8*8*5-9))
is 2 operations less. Maybe I can investigate this later.

Why is my implementation of the Atkin sieve slower than Eratosthenes? [closed]

I'm doing problems from Project Euler in Ruby and implemented Atkin's sieve for finding prime numbers but it runs slower than sieve of Eratosthenes. What is the problem?
def atkin_sieve(n)
  primes = [2,3,5]
  sieve = Array.new(n+1, false)
  y_upper = n-4 > 0 ? Math.sqrt(n-4).truncate : 1
  for x in (1..Math.sqrt(n/4).truncate)
    for y in (1..y_upper)
      k = 4*x**2 + y**2
      sieve[k] = !sieve[k] if k%12 == 1 or k%12 == 5
    end
  end
  y_upper = n-3 > 0 ? Math.sqrt(n-3).truncate : 1
  for x in (1..Math.sqrt(n/3).truncate)
    for y in (1..y_upper)
      k = 3*x**2 + y**2
      sieve[k] = !sieve[k] if k%12 == 7
    end
  end
  for x in (1..Math.sqrt(n).truncate)
    for y in (1..x)
      k = 3*x**2 - y**2
      if k < n and k%12 == 11
        sieve[k] = !sieve[k]
      end
    end
  end
  for j in (5...n)
    if sieve[j]
      prime = true
      for i in (0...primes.length)
        if j % (primes[i]**2) == 0
          prime = false
          break
        end
      end
      primes << j if prime
    end
  end
  primes
end
def erato_sieve(n)
  primes = []
  for i in (2..n)
    if primes.all?{|x| i % x != 0}
      primes << i
    end
  end
  primes
end
As Wikipedia says, "The modern sieve of Atkin is more complicated, but faster when properly optimized" (my emphasis).
The first obvious place to save some time in the first set of loops would be to stop iterating over y when 4*x**2 + y**2 is greater than n. For example, if n is 1,000,000 and x is 450, then you should stop iterating when y is greater than 435 (instead of continuing to 999 as you do at the moment). So you could rewrite the first loop as:
for x in (1..Math.sqrt(n/4).truncate)
  X = 4 * x ** 2
  for y in (1..Math.sqrt(n - X).truncate)
    k = X + y ** 2
    sieve[k] = !sieve[k] if k%12 == 1 or k%12 == 5
  end
end
(This also avoids re-computing 4*x**2 each time round the loop, though that is probably a very small improvement, if any.)
Similar remarks apply, of course, to the other loops over y.
A second place where you could speed things up is in the strategy for looping over y. You loop over all values of y in the range, and then check to see which ones lead to values of k with the correct remainders modulo 12. Instead, you could just loop over the right values of y only, and avoid testing the remainders altogether.
If 4*x**2 is 4 modulo 12, then y**2 must be 1 or 9 modulo 12, and so y must be 1, 3, 5, 7, or 11 modulo 12. If 4*x**2 is 8 modulo 12, then y**2 must be 5 or 9 modulo 12, so y must be 3 or 9 modulo 12. And finally, if 4*x**2 is 0 modulo 12, then y**2 must be 1 or 5 modulo 12, so y must be 1, 5, 7, 9, or 11 modulo 12.
I also note that your sieve of Eratosthenes is doing useless work by testing divisibility by all primes below i. You can halt the iteration once you've test for divisibility by all primes less than or equal to the square root of i.
It would help a lot if you actually implemented the Sieve of Eratosthenes properly in the first place.
The critical feature of that sieve is that you only do one operation per time a prime divides a number. By contrast you are doing work for every prime less than the number. The difference is subtle, but the performance implications are huge.
Here is the actual sieve that you failed to implement:
def eratosthenes_primes(n)
  primes = []
  could_be_prime = (0..n).map{|i| true}
  could_be_prime[0] = false
  could_be_prime[1] = false
  i = 0
  while i*i <= n
    if could_be_prime[i]
      j = i*i
      while j <= n
        could_be_prime[j] = false
        j += i
      end
    end
    i += 1
  end
  return (2..n).find_all{|i| could_be_prime[i]}
end
Compare this with your code for finding all of the primes up to 50,000. Also note that this can easily be sped up by a factor of 2 by special casing the logic for even numbers. With that tweak, this algorithm should be fast enough for every Project Euler problem that needs you to compute a lot of primes.
@Gareth mentions some redundant calculations regarding 4x^2+y^2. Both here and in other places where you have calculations within a loop, you can make use of calculations you've already performed and reduce this to simple addition.
Rather than X=4 * x ** 2, you could rely on the fact that X already has the value of 4 * (x-1) ** 2. Since 4x^2 = 4(x-1)^2 + 4(2x - 1), all you need to do is add 8 * x - 4 to X. You can use this same trick for k, and the other places where you have repeated calculations (like 3x^2 + y^2).
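A minimal illustration of that strength reduction (a Python sketch; X mirrors the variable from the answer above):
# 4*x**2 via incremental updates: 4x^2 = 4(x-1)^2 + 8x - 4
X = 0                      # corresponds to 4*(x-1)**2 just before x becomes 1
for x in range(1, 11):
    X += 8 * x - 4         # now X == 4*x**2, without recomputing the square
    assert X == 4 * x * x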
