Why does this code take 8 minutes to finish? - ruby

This is a (pretty bad) solution to one of the project Euler problems. The problem was to find the 10_001st prime number. The code below does it, but it takes 8 minutes to run. Can you explain why that is the case and how to optimize it?
primes = []
number = 2.0
until primes[10000] != nil
if (2..(number - 1)).any? do |n|
number % n == 0
end == false
primes << number
end
number = number + 1.0
end
puts primes[10000]

Some simple optimizations to prime finding:
Start by pushing 2 onto your primes list, and start by checking if 3 is a prime. (This eliminates needing to write special case code for the numbers 0 to 2)
You only have to check numbers that are odd for prime candidacy. (Or, if you start by adding 2/3/5 and checking 7, you only need to check numbers that are 1 or 5 after doing % 6. Or... You get the idea)
You only have to see if your current candidate x is divisible by factors up to sqrt(x)—because any factor above sqrt(x) divides x into a number below sqrt(x), and you've already checked all of those.
You only have to check numbers in your prime list, instead of all numbers, for divisors of x - since all composite numbers are divisible by primes. For example, 81 is 9*9 - but 9*9 is 3*3*9, 9 being composite, so you'll discover it's a prime when you check it against 3. Therefore you never need to test if 9 is a factor, and so on for every composite factor.
There are very optimized, sped up prime finding functions (see the Sieve of Atkin for a start), but these are the common optimizations that are easy to come up with.

Do you really have to check if the number divides with all previous numbers? Check only with the smaller primes you already discovered. Also, why using floats where integers are perfectly fine?
EDIT:
Some possible changes (not best algorithm, can be improved):
primes = [2, 3, 5]
num = 7
until primes[10000]
is_prime = true
i = 0
sqrtnum = Math.sqrt(num).ceil
while (n=primes[i+=1]) <= sqrtnum
if num % n == 0
is_prime = false
break
end
end
if is_prime
primes << num
end
num += 2
end
puts primes[10000]
On my computer (for 1000 primes):
Yours:
real 0m3.300s
user 0m3.284s
sys 0m0.000s
Mine:
real 0m0.045s
user 0m0.040s
sys 0m0.004s

Related

Code Optimization - Generating Prime Numbers

I am trying to write a code for the following problem:
Input
The input begins with the number t of test cases in a single line (t<=10). In each of the next t lines there are two numbers m and n (1 <= m <= n <= 1000000000, n-m<=100000) separated by a space.
Output
For every test case print all prime numbers p such that m <= p <= n, one number per line, test cases separated by an empty line.
Sample Input:
2
1 10
3 5
Sample Output:
2
3
5
7
3
5
My code:
def prime?(number)
return false if number == 1
(2..number-1).each do |n|
return false if number % n == 0
end
true
end
t = gets.strip.to_i
for i in 1..t
mi, ni = gets.strip.split(' ')
mi = mi.to_i
ni = ni.to_i
i = mi
while i <= ni
puts i if prime?(i)
i += 1
end
puts "\n"
end
The code is running fine, only problem I am having is that it is taking a lot of time when run against big input ranges as compared to other programming languages.
Am I doing something wrong here? Can this code be further optimized for faster runtime?
I have tried using a for loop, normal loop, creating an array and then printing it.
Any suggestions.
Ruby is slower than some other languages, depending on what language you compare it to; certainly slower than C/C++. But your problem is not the language (although it influences the run-time behavior), but your way of finding primes. There are many better algorithms for finding primes, such as the Sieve of Eratosthenes or the Sieve of Atkin. You might also read the “Generating Primes” page on Wikipedia and follow the links there.
By the way, for the Sieve of Eratosthenes, there is even a ready-to-use piece of code on Stackoverflow. I'm sure a little bit of googling will turn up implementations for other algorithms, too.
Since your problem is finding primes within a certain range, this is the Sieve of Eratosthenes code found at the above link modified to suit your particular problem:
def better_sieve_upto(first, last)
sieve = [nil, nil] + (2..last).to_a
sieve.each do |i|
next unless i
break if i*i > last
(i*i).step(last, i) {|j| sieve[j] = nil }
end
sieve.reject {|i| !i || i < first}
end
Note the change from "sieve.compact" to a complexer "sieve.reject" with a corresponding condition.
Return true if the number is 2, false if the number is evenly divisible by 2.
Start iterating at 3, instead of 2. Use a step of two.
Iterate up to the square root of the number, instead of the number minus one.
def prime?(number)
return true if number == 2
return false if number <= 1 or number % 2 == 0
(3..Math.sqrt(number)).step(2) do |n|
return false if number % n == 0
end
true
end
This will be much faster, but still not very fast, as #Technation explains.
Here's how to do it using the Sieve of Eratosthenes built into Ruby. You'll need to precompute all the primes up to the maximum maximum, which will be very quick, and then select the primes that fall within each range.
require 'prime'
ranges = Array.new(gets.strip.to_i) do
min, max = gets.strip.split.map(&:to_i)
Range.new(min, max)
end
primes = Prime.each(ranges.map(&:max).max, Prime::EratosthenesGenerator.new)
ranges.each do |range|
primes.each do |prime|
next if prime < range.min
break if prime > range.max
puts prime
end
primes.rewind
puts "\n"
end
Here's how the various solutions perform with the range 50000 200000:
Your original prime? function: 1m49.639s
My modified prime? function: 0m0.687s
Prime::EratosthenesGenerator: 0m0.221s
The more ranges being processed, the faster the Prime::EratosthenesGenerator method should be.

How to create a method that returns the nth prime number?

I'm trying to write a method that returns the nth prime number.
I've worked out a solution but the problem is in my method. I create a large array of numbers that seems to process super slow. (1..104729).to_a to be exact. I chose 104729 because the max n can be is 10000 and the 10000th integer is 104729. I'm looking for a way to optimize my method.
Is 104729 is too large a value? Is there a way to write this so that I'm not creating a large array?
Here's the method:
def PrimeMover(num)
def is_prime(x)
i = 0
nums = (2..x).to_a
while nums[i] < nums.max
if x % nums[i] != 0
i += 1
else
return false
end
end
return true
end
primes_arr = (3..104729).to_a.select {|y| is_prime(y)}
primes_arr[num]
end
require "prime"
def find_prime(nth)
Prime.take(nth).last
end
Combine Ruby's built-in prime library, and a lazy enumerator for performance:
require 'prime'
(1...100_000).lazy.select(&:prime?).take(100).to_a
Or simply, as highlighted by Arturo:
Prime.take(100)
You can use Ruby's built in #prime? method, which seems pretty efficient.
The code:
require 'prime'
primes_arr = (3..104729).to_a.select &:prime?
runs in 2-3 seconds on my machine, which I find somewhat acceptable.
If you need even better performance or if you really need to write your own method, try implementing the Sieve of Erathostenes. Here are some Ruby samples of that: http://rosettacode.org/wiki/Sieve_of_Eratosthenes#Ruby
Here's an optimal a trial division implementation of is_prime without relying on the Prime class:
A prime number is a whole number divisible only by 1 and itself, and 1 is not prime. So we want to know if x divides into anything less than x and greater than 1. So we start the count at 2, and we end at x - 1.
def prime?(x)
return false if x < 2
2.upto(x - 1) do |n|
return false if (x % n).zero?
end
true
end
As soon as x % n has a remainder, we can break the loop and say this number is not prime. This saves you from looping over the entire range. If all the possible numbers were exhausted, we know the number is prime.
This is still not optimal. For that you would need a sieve, or a different detection algorithm to trial division. But it's a big improvement on your code. Taking the nth up to you.

Project Euler number 35 efficiency

https://projecteuler.net/problem=35
All problems on Project Euler are supposed to be solvable by a program in under 1 minute. My solution, however, has a runtime of almost 3 minutes. Other solutions I've seen online are similar to mine conceptually, but have runtimes that are exponentially faster. Can anyone help make my code more efficient/run faster?
Thanks!
#genPrimes takes an argument n and returns a list of all prime numbers less than n
def genPrimes(n):
primeList = [2]
number = 3
while(number < n):
isPrime = True
for element in primeList:
if element > number**0.5:
break
if number%element == 0 and element <= number**0.5:
isPrime = False
break
if isPrime == True:
primeList.append(number)
number += 2
return primeList
#isCircular takes a number as input and returns True if all rotations of that number are prime
def isCircular(prime):
original = prime
isCircular = True
prime = int(str(prime)[-1] + str(prime)[:len(str(prime)) - 1])
while(prime != original):
if prime not in primeList:
isCircular = False
break
prime = int(str(prime)[-1] + str(prime)[:len(str(prime)) - 1])
return isCircular
primeList = genPrimes(1000000)
circCount = 0
for prime in primeList:
if isCircular(prime):
circCount += 1
print circCount
Two modifications of your code yield a pretty fast solution (roughly 2 seconds on my machine):
Generating primes is a common problem with many solutions on the web. I replaced yours with rwh_primes1 from this article:
def genPrimes(n):
sieve = [True] * (n/2)
for i in xrange(3,int(n**0.5)+1,2):
if sieve[i/2]:
sieve[i*i/2::i] = [False] * ((n-i*i-1)/(2*i)+1)
return [2] + [2*i+1 for i in xrange(1,n/2) if sieve[i]]
It is about 65 times faster (0.04 seconds).
The most important step I'd suggest, however, is to filter the list of generated primes. Since each circularly shifted version of an integer has to be prime, the circular prime must not contain certain digits. The prime 23, e.g., can be easily spotted as an invalid candidate, because it contains a 2, which indicates divisibility by two when this is the last digit. Thus you might remove all such bad candidates by the following simple method:
def filterPrimes(primeList):
for i in primeList[3:]:
if '0' in str(i) or '2' in str(i) or '4' in str(i) \
or '5' in str(i) or '6' in str(i) or '8' in str(i):
primeList.remove(i)
return primeList
Note that the loop starts at the fourth prime number to avoid removing the number 2 or 5.
The filtering step takes most of the computing time (about 1.9 seconds), but reduces the number of circular prime candidates dramatically from 78498 to 1113 (= 98.5 % reduction)!
The last step, the circulation of each remaining candidate, can be done as you suggested. If you wish, you can simplify the code as follows:
circCount = sum(map(isCircular, primeList))
Due to the reduced candidate set this step is completed in only 0.03 seconds.

Largest Prime Factor Ruby using Fermat's factorization method

I have this code that seems to be working for number 6-8 digits, in a normal time period.
When I enter bigger values the amount get ridiculously bug. It takes more than 4 hours to complete.
Here is my code.
#Fermat's factorization method
def get_largest_prime n
t=(Math.sqrt(n)+1).floor
k=0
prime_numbers=[]
while (t+k)<n
element = (t+k)**2-n
if is_integer? Math.sqrt(element)
#store prime numbers
prime_numbers << t+k+Math.sqrt(element)
prime_numbers << t+k-Math.sqrt(element)
#puts "Prime Factors of #{n} are: #{t+k+Math.sqrt(element)} and #{t+k-Math.sqrt(element)}"
end
k+=1
end
puts "Prime Factors: "+prime_numbers.to_s
end
#making sure 450.0 is 450, for example.
def is_integer? number
number.to_i == number ? true : false
end
get_largest_prime 600851475143
Running this will take more than 4 hours.
But running it for value ' 600851' for example or ' 60085167' does not take a lot of time. Any help ?
First note that Fermat factorisation doesn't give you prime factors in general.
Then, you run it until t+k >= n, that means you run the while loop n - t times, since t is roughly sqrt(n), that is an O(n) algorithm. For a largish n like 600851475143 (about 6*10^11), that is bound to take long.
You need to change the algorithm. When you have found a pair of divisors (both larger than 1), factorise them both recursively. If the smaller of the found factors is 1, that is a prime factor.
Doing that (forgive the bad style, I barely know ruby):
#Fermat's factorization method
def get_largest_prime n
t=(Math.sqrt(n)+1).floor
k=0
prime_numbers=[]
while (t+k)<n
element = (t+k)**2-n
if is_integer? Math.sqrt(element)
#store prime numbers
a = t+k+Math.sqrt(element)
b = t+k-Math.sqrt(element)
if b == 1
prime_numbers << a
break
end
prime_numbers += get_largest_prime a
prime_numbers += get_largest_prime b
break
#puts "Prime Factors of #{n} are: #{t+k+Math.sqrt(element)} and #{t+k-Math.sqrt(element)}"
end
k+=1
end
return prime_numbers
end
#making sure 450.0 is 450, for example.
def is_integer? number
number.to_i == number ? true : false
end
a = get_largest_prime 600851475143
puts "Prime Factors: "+a.to_s
solves the given problem quickly.
However, it will still take a long time for numbers that have no divisors close to the square root.
The standard factorisation by trial division has much better worst-case behaviour (O(sqrt(n) worst case). A mixed approach can be slightly faster than pure trial division, though.
Two effects here:
1) When an integer gets larger than 2**31 in Ruby, it uses a different, and slower, representation
2) There are no known factorisation algorithms that don't eventually perform badly once the number gets large enough - technically they all get slower worse than any polynomial of the (number of digits of) the number you want to factorise.
You could speed things up by using
Math.sqrt(element)
less. Assign result of it to a variable, before all the tests. Note this will not "fix" your problem. Ultimately it won't run fast enough above a certain number - even if you transferred everything to C (although you might squeeze out a couple of extra digits before C got slow)
Possibly, you can try the code below.
def prime_factor limit
(2..Math.sqrt(limit).to_i).inject([]) do |memo, var|
memo << var if limit % var == 0 and !memo.any? {|d| var % d == 0}
memo
end
end
prime_result = prime_factor 600851475143
puts prime_result.max
The less loops you use, the faster your code will run ;-) (less cpu cycles). Try to do everything recursively like here on this program that finds the largest prime factor

Why is my implementation of Atkin sieve is slower than Eratosthenes? [closed]

Closed. This question is not reproducible or was caused by typos. It is not currently accepting answers.
This question was caused by a typo or a problem that can no longer be reproduced. While similar questions may be on-topic here, this one was resolved in a way less likely to help future readers.
Closed 12 months ago.
Improve this question
I'm doing problems from Project Euler in Ruby and implemented Atkin's sieve for finding prime numbers but it runs slower than sieve of Eratosthenes. What is the problem?
def atkin_sieve(n)
primes = [2,3,5]
sieve = Array.new(n+1, false)
y_upper = n-4 > 0 ? Math.sqrt(n-4).truncate : 1
for x in (1..Math.sqrt(n/4).truncate)
for y in (1..y_upper)
k = 4*x**2 + y**2
sieve[k] = !sieve[k] if k%12 == 1 or k%12 == 5
end
end
y_upper = n-3 > 0 ? Math.sqrt(n-3).truncate : 1
for x in (1..Math.sqrt(n/3).truncate)
for y in (1..y_upper)
k = 3*x**2 + y**2
sieve[k] = !sieve[k] if k%12 == 7
end
end
for x in (1..Math.sqrt(n).truncate)
for y in (1..x)
k = 3*x**2 - y**2
if k < n and k%12 == 11
sieve[k] = !sieve[k]
end
end
end
for j in (5...n)
if sieve[j]
prime = true
for i in (0...primes.length)
if j % (primes[i]**2) == 0
prime = false
break
end
end
primes << j if prime
end
end
primes
end
def erato_sieve(n)
primes = []
for i in (2..n)
if primes.all?{|x| i % x != 0}
primes << i
end
end
primes
end
As Wikipedia says, "The modern sieve of Atkin is more complicated, but faster when properly optimized" (my emphasis).
The first obvious place to save some time in the first set of loops would be to stop iterating over y when 4*x**2 + y**2 is greater than n. For example, if n is 1,000,000 and x is 450, then you should stop iterating when y is greater than 435 (instead of continuing to 999 as you do at the moment). So you could rewrite the first loop as:
for x in (1..Math.sqrt(n/4).truncate)
X = 4 * x ** 2
for y in (1..Math.sqrt(n - X).truncate)
k = X + y ** 2
sieve[k] = !sieve[k] if k%12 == 1 or k%12 == 5
end
end
(This also avoids re-computing 4*x**2 each time round the loop, though that is probably a very small improvement, if any.)
Similar remarks apply, of course, to the other loops over y.
A second place where you could speed things up is in the strategy for looping over y. You loop over all values of y in the range, and then check to see which ones lead to values of k with the correct remainders modulo 12. Instead, you could just loop over the right values of y only, and avoid testing the remainders altogether.
If 4*x**2 is 4 modulo 12, then y**2 must be 1 or 9 modulo 12, and so y must be 1, 3, 5, 7, or 11 modulo 12. If 4*x**2 is 8 modulo 12, then y**2 must be 5 or 9 modulo 12, so y must be 3 or 9 modulo 12. And finally, if 4*x**2 is 0 modulo 12, then y**2 must be 1 or 5 modulo 12, so y must be 1, 5, 7, 9, or 11 modulo 12.
I also note that your sieve of Eratosthenes is doing useless work by testing divisibility by all primes below i. You can halt the iteration once you've test for divisibility by all primes less than or equal to the square root of i.
It would help a lot if you actually implemented the Sieve of Eratosthenes properly in the first place.
The critical feature of that sieve is that you only do one operation per time a prime divides a number. By contrast you are doing work for every prime less than the number. The difference is subtle, but the performance implications are huge.
Here is the actual sieve that you failed to implement:
def eratosthenes_primes(n)
primes = []
could_be_prime = (0..n).map{|i| true}
could_be_prime[0] = false
could_be_prime[1] = false
i = 0
while i*i <= n
if could_be_prime[i]
j = i*i
while j <= n
could_be_prime[j] = false
j += i
end
end
i += 1
end
return (2..n).find_all{|i| could_be_prime[i]}
end
Compare this with your code for finding all of the primes up to 50,000. Also note that this can easily be sped up by a factor of 2 by special casing the logic for even numbers. With that tweak, this algorithm should be fast enough for every Project Euler problem that needs you to compute a lot of primes.
#Gareth mentions some redundant calculations regarding 4x^2+y^2. Both here and in other places where you have calculations within a loop, you can make use of calculations you've already performed and reduce this to simple addition.
Rather than X=4 * x ** 2, you could rely on the fact that X already has the value of 4 * (x-1) ** 2. Since 4x^2 = 4(x-1)^2 + 4(2x - 1), all you need to do is add 8 * x - 4 to X. You can use this same trick for k, and the other places where you have repeated calculations (like 3x^2 + y^2).

Resources