Finding primes up to a certain number in Racket - performance

I'm learning Racket (with the HtDP course) and it's my first shot at a program in a functional language.
I've tried to design a function that finds all primes under a certain input n using (what I think is) a functional approach to the problem, but the program can get really slow (86 seconds for 100.000, while my Python, C and C++ quickly-written solutions take just a couple of seconds).
The following is the code:
;; Natural Natural -> Boolean
;; Helper function to avoid writing the handful (= 0 (modulo na nb))
(define (divisible na nb) (= 0 (modulo na nb)))
;; Natural ListOfNatural -> Boolean
;; n is the number to check, lop is ALL the prime numbers less than n
(define (is-prime? n lop)
(cond [(empty? lop) true]
[(divisible n (first lop)) false]
[ else (is-prime? n (rest lop))]))
;; Natural -> ListOfNatural
(define (find-primes n)
(if (= n 2)
(list 2)
(local [(define LOP (find-primes (sub1 n)))]
(if (is-prime? n LOP)
(append LOP (list n))
LOP))))
(time (find-primes 100000))
I'm using the divisible function instead of just plowing the rest in because I really like to have separated functions when they could be of use in another part of the program. I also should probably define is-prime? inside of find-primes, since no one will ever call is-prime? on a number while also giving all the prime numbers less than that number.
Any pointers on how to improve this?

Here are some ideas for improving the performance, the procedure now returns in under two seconds for n = 100000.
(define (is-prime? n lop)
(define sqrtn (sqrt n))
(if (not (or (= (modulo n 6) 1) (= (modulo n 6) 5)))
false
(let loop ([lop lop])
(cond [(or (empty? lop) (< sqrtn (first lop))) true]
[(zero? (modulo n (first lop))) false]
[else (loop (rest lop))]))))
(define (find-primes n)
(cond [(<= n 1) '()]
[(= n 2) '(2)]
[(= n 3) '(2 3)]
[else
(let loop ([lop '(2 3)] [i 5])
(cond [(> i n) lop]
[(is-prime? i lop) (loop (append lop (list i)) (+ i 2))]
[else (loop lop (+ i 2))]))]))
Some of the optimizations are language-related, others are algorithmic:
The recursion was converted to be in tail position. In this way, the recursive call is the last thing we do at each step, with nothing else to do after it - and the compiler can optimize it to be as efficient as a loop in other programming languages.
The loop in find-primes was modified for only iterating over odd numbers. Note that we go from 3 to n instead of going from n to 2.
divisible was inlined and (sqrt n) is calculated only once.
is-prime? only checks up until sqrt(n), it makes no sense to look for primes after that. This is the most important optimization, instead of being O(n) the algorithm is now O(sqrt(n)).
Following #law-of-fives's advice, is-prime? now skips the check when n is not congruent to 1 or 5 modulo 6.
Also, normally I'd recommend to build the list using cons instead of append, but in this case we need the prime numbers list to be constructed in ascending order for the most important optimization in is-prime? to work.

Here's Óscar López's code, tweaked to build the list in the top-down manner:
(define (is-prime? n lop)
(define sqrtn (sqrt n))
(let loop ([lop lop])
(cond [(or (empty? lop) (< sqrtn (mcar lop))) true]
[(zero? (modulo n (mcar lop))) false]
[else (loop (mcdr lop))])))
(define (find-primes n)
(let* ([a (mcons 3 '())]
[b (mcons 2 a)])
(let loop ([p a] [i 5] [d 2] ; d = diff +2 +4 +2 ...
[c 2]) ; c = count of primes found
(cond [(> i n) c]
[(is-prime? i (mcdr a))
(set-mcdr! p (mcons i '()))
(loop (mcdr p) (+ i d) (- 6 d) (+ c 1))]
[else (loop p (+ i d) (- 6 d) c )]))))
Runs at about ~n1.25..1.32, empirically; compared to the original's ~n1.8..1.9, in the measured range, inside DrRacket (append is the culprit of that bad behaviour). The "under two seconds" for 100K turns into under 0.05 seconds; two seconds gets you well above 1M (one million):
; (time (length (find-primes 100000))) ; with cons times in milliseconds
; 10K 156 ; 20K 437 ; 40K 1607 ; 80K 5241 ; 100K 7753 .... n^1.8-1.9-1.7 OP's
; 10K 62 ; 20K 109 ; 40K 421 ; 80K 1217 ; 100K 2293 .... n^1.8-1.9 Óscar's
; mcons:
(time (find-primes 2000000))
; 100K 47 ; 200K 172 ; 1M 1186 ; 2M 2839 ; 3M 4851 ; 4M 7036 .... n^1.25-1.32 this
; 9592 17984 78498 148933 216816 283146
It's still just a trial division though... :) The sieve of Eratosthenes will be much faster yet.
edit: As for set-cdr!, it is easy to emulate any lazy algorithm with it... Otherwise, we could use extendable arrays (lists of...), for the amortized O(1) snoc/append1 operation (that's lots and lots of coding); or maintain the list of primes split in two (three, actually; see the code below), building the second portion in reverse with cons, and appending it in reverse to the first portion only every so often (specifically, judging the need by the next prime's square):
; times: ; 2M 1934 ; 3M 3260 ; 4M 4665 ; 6M 8081 .... n^1.30
;; find primes up to and including n, n > 2
(define (find-primes n)
(let loop ( [k 5] [q 9] ; next candidate; square of (car LOP2)
[LOP1 (list 2)] ; primes to test by
[LOP2 (list 3)] ; more primes
[LOP3 (list )] ) ; even more primes, in reverse
(cond [ (> k n)
(append LOP1 LOP2 (reverse LOP3)) ]
[ (= k q)
(if (null? (cdr LOP2))
(loop k q LOP1 (append LOP2 (reverse LOP3)) (list))
(loop (+ k 2)
(* (cadr LOP2) (cadr LOP2)) ; next prime's square
(append LOP1 (list (car LOP2)))
(cdr LOP2) LOP3 )) ]
[ (is-prime? k (cdr LOP1))
(loop (+ k 2) q LOP1 LOP2 (cons k LOP3)) ]
[ else
(loop (+ k 2) q LOP1 LOP2 LOP3 ) ])))
;; n is the number to check, lop is list of prime numbers to check it by
(define (is-prime? n lop)
(cond [ (null? lop) #t ]
[ (divisible n (car lop)) #f ]
[ else (is-prime? n (cdr lop)) ]))
edit2: The easiest and simplest fix though, closest to your code, was to decouple the primes calculations of the resulting list, and of the list to check divisibility by. In your
(local [(define LOP (find-primes (sub1 n)))]
(if (is-prime? n LOP)
LOP is used as the list of primes to check by, and it is reused as part of the result list in
(append LOP (list n))
LOP))))
immediately afterwards. Breaking this entanglement enables us to stop the generation of testing primes list at the sqrt of the upper limit, and thus it gives us:
;times: ; 1M-1076 2M-2621 3M-4664 4M-6693
; n^1.28 ^1.33 n^1.32
(define (find-primes n)
(cond
((<= n 4) (list 2 3))
(else
(let* ([LOP (find-primes (inexact->exact (floor (sqrt n))))]
[lp (last LOP)])
(local ([define (primes k ps)
(if (<= k lp)
(append LOP ps)
(primes (- k 2) (if (is-prime? k LOP)
(cons k ps)
ps)))])
(primes (if (> (modulo n 2) 0) n (- n 1)) '()))))))
It too uses the same is-prime? code as in the question, unaltered, as does the second variant above.
It is slower than the 2nd variant. The algorithmic reason for this is clear — it tests all numbers from sqrt(n) to n by the same list of primes, all smaller or equal to the sqrt(n) — but in testing a given prime p < n it is enough to use only those primes that are not greater than sqrt(p), not sqrt(n). But it is the closest to your original code.
For comparison, in Haskell-like syntax, under strict evaluation,
isPrime n lop = null [() | p <- lop, rem n p == 0]
-- OP:
findprimes 2 = [2]
findprimes n = lop ++ [n | isPrime n lop]
where lop = findprimes (n-1)
= lop ++ [n | n <- [q+1..n], isPrime n lop]
where lop = findprimes q ; q = (n-1)
-- 3rd:
findprimes n | n < 5 = [2,3]
findprimes n = lop ++ [n | n <- [q+1..n], isPrime n lop]
where lop = findprimes q ;
q = floor $ sqrt $ fromIntegral n
-- 2nd:
findprimes n = g 5 9 [2] [3] []
where
g k q a b c
| k > n = a ++ b ++ reverse c
| k == q, [h] <- b = g k q a (h:reverse c) []
| k == q, (h:p:ps) <- b = g (k+2) (p*p) (a++[h]) (p:ps) c
| isPrime k a = g (k+2) q a b (k:c)
| otherwise = g (k+2) q a b c
The b and c together (which is to say, LOP2 and LOP3 in the Scheme code) actually constitute a pure functional queue a-la Okasaki, from which sequential primes are taken and appended at the end of the maintained primes prefix a (i.e. LOP1) now and again, on each consecutive prime's square being passed, for a to be used in the primality testing by isPrime.
Because of the rarity of this appending, its computational inefficiency has no impact on the time complexity of the code overall.

Related

how to append elements in a list in Scheme

I have made a program that returns the number of a fibonacci series by using tail recursion, and I would like to add its results to a list. I have done the following:
(define listAux '())
(define (fibTail n1 n2 c)
(if (= c 0)
(appendList -1)
(begin
(appendList n2)
(fibT (+ n1 n2) n1 (- c 1))
)))
(define (appendList n)
(if (= n -1)
listAux
(append (list n) listAux)))
(define (fib n)
(fibTail 1 0 n))
I would like that appendList returns a list with the elements of the fibonacci series, when I call it like (fib 8) for example.
Any help?
Thanks
When programming in Lisp, we avoid using append for building lists - it's very inefficient because to insert a single element at the end, we have to traverse the whole list... and then again, and again. It's better to build the list in reverse order using cons and invert it at the end. Also, the ideal way of writing a tail recursion is to accumulate the results in a parameter, not in an externally defined variable (which anyway won't work, unless you set! its value somewhere). This is what I mean:
(define (fib n)
(fibTail 1 0 n '()))
(define (fibTail n1 n2 c lst)
(if (< c 0)
(reverse lst)
(fibTail (+ n1 n2) n1 (- c 1) (cons n2 lst))))
For example:
(fib 10)
=> '(0 1 1 2 3 5 8 13 21 34 55)

Scheme - speed up a heavily recursive function

I must write a Scheme predicate that computes the function f(N -> N) defined as :
if n < 4 :f(n)= (n^2) + 5
if n ≥ 4 :f(n) = [f(n−1) + f(n−2)] * f(n−4)
I wrote a simple predicate that works :
(define functionfNaive
(lambda (n)
(if (< n 4) (+ (* n n) 5)
(* (+ (functionfNaive (- n 1)) (functionfNaive (- n 2)))
(functionfNaive (- n 4))))))
Now, I try a method with an accumulator but it doesn't work...
My code :
(define functionf
(lambda(n)
(functionfAux n 5 9 14)))
(define functionfAux
(lambda (n n1 n2 n4)
(cond
[(< n 4) (+ (* n n) 5)]
[(= n 4) (* n1 (+ n2 n4))]
[else (functionfAux (- n 1) n2 n4 (* n1 (+ n2 n4)))])))
As requested, here's a memoized version of your code that performs better than the naïve version:
(define functionf
(let ((cache (make-hash)))
(lambda (n)
(hash-ref!
cache
n
(thunk
(if (< n 4)
(+ (* n n) 5)
(* (+ (functionf (- n 1)) (functionf (- n 2))) (functionf (- n 4)))))))))
BTW... computing the result for large values of n is very quick, but printing takes a lot of time. To measure the time, use something like
(time (functionf 50) 'done)
AND here's a generic memoize procedure, should you need it:
(define (memoize fn)
(let ((cache (make-hash)))
(λ arg (hash-ref! cache arg (thunk (apply fn arg))))))
which in your case could be used like
(define functionf
(memoize
(lambda (n)
(if (< n 4)
(+ (* n n) 5)
(* (+ (functionf (- n 1)) (functionf (- n 2))) (functionf (- n 4)))))))
First, that's not a predicate. A Predicate is a function which returns a Boolean value.
To calculate the nth result, start with the first four and count up, maintaining the last four known elements. Stop when n is reached:
(define (step a b c d n)
(list b c d (* (+ c d) a)) (+ n 1)))
etc. Simple. The first call will be (step 5 6 9 14 3).
The depth of the recursion tree may be the biggest question, so may be use the iteration which means use some variables to memory the intermediate processes.
#lang racket
(define (functionf n)
(define (iter now n1 n2 n3 n4 back)
(if (= n now)
back
(iter (+ now 1) back n1 n2 n3 (* n3 (+ back n1)))))
(if (< n 4)
(+ 5 (* n n))
(iter 4 14 9 6 5 125)))
(functionf 5)
in this way, the depth of the stack only be 1 and the code is speeded up.

Different ways to calculate number

I need to write a function that will return the number of ways in which can be n (n is a natural number) written as the sum of natural numbers.
For example: 4 can be written as 1+1+1+1, 1+1+2, 2+2, 3+1 and 4.
I have written a function that returns the number of all the options, but does not take into account that the possibilities 1 + 1 + 2 and 2 + 1 + 1 (and all similar cases) are equal. So for n=4 it returns 8 instead of 5.
Here is my function:
(define (possibilities n)
(define (loop i)
(cond [(= i n) 1]
[(> i n) 0]
[(+ (possibilities (- n i)) (loop (+ i 1)))]))
(cond [(< n 1) 0]
[#t (loop 1)]))
Could you please help me with fixing my function, so it will work the way it should be. Thank you.
This is a well-known function, it's called the partition function P, its possible values are referenced as A000041 in the on-line encyclopedia of integer sequences.
One simple solution (not the fastest!) would be to use this helper function, which denotes the number of ways of writing n as a sum of exactly k terms:
(define (p n k)
(cond ((> k n) 0)
((= k 0) 0)
((= k n) 1)
(else
(+ (p (sub1 n) (sub1 k))
(p (- n k) k)))))
Then we just have to add the possible results, being careful with the edge cases:
(define (possibilities n)
(cond ((negative? n) 0)
((zero? n) 1)
(else
(for/sum ([i (in-range (add1 n))])
(p n i)))))
For example:
(map possibilities (range 11))
=> '(1 1 2 3 5 7 11 15 22 30 42)

How to go about composing core functions, rather then using imperative style?

I have translated this code, the snippet below, from Python to Clojure. I replaced Python's while construct with Clojure's loop-recur here. But this doesn't look idiomatic.
(loop [d 2 [n & more] (list 256)]
(if (> n 1)
(recur (inc d)
(loop [x n sublist more]
(if (= (rem x d) 0)
(recur (/ x d) (conj sublist d))
(conj sublist x))))
(sort more)))
This routine gives me (3 3 31), that is prime factors of 279. For 256, it gives, (2 2 2 2 2 2 2 2), that means, 2^8.
Moreover, it performs worse for large values, say 987654123987546 instead of 279; whereas Python's counterpart works like charm.
How to start composing core functions, rather then translating imperative code as is? And specifically, how to improve this bit?
Thanks.
[Edited]
Here is the python code, I referred above,
def prime_factors(n):
factors = []
d = 2
while n > 1:
while n % d == 0:
factors.append(d)
n /= d
d = d + 1
return factors
A straight translation of the Python code in Clojure would be:
(defn prime-factors [n]
(let [n (atom n) ;; The Python code makes use of mutability which
factors (atom []) ;; isn't idiomatic in Clojure, but can be emulated
d (atom 2)] ;; using atoms
(loop []
(when (< 1 #n)
(loop []
(when (== (rem #n #d) 0)
(swap! factors conj #d)
(swap! n quot #d)
(recur)))
(swap! d inc)
(recur)))
#factors))
(prime-factors 279) ;; => [3 3 31]
(prime-factors 987654123987546) ;; => [2 3 41 14389 279022459]
(time (prime-factors 987654123987546)) ;; "Elapsed time: 13993.984 msecs"
;; same performance on my machine
;; as the Rosetta Code solution
You can improve this code to make it more idiomatic:
from nested loops to a single loop:
(loop []
(cond
(<= #n 1) #factors
(not= (rem #n #d) 0) (do (swap! d inc)
(recur))
:else (do (swap! factors conj #d)
(swap! n quot #d)
(recur))))))
get rid of the atoms:
(defn prime-factors [n]
(loop [n n
factors []
d 2]
(cond
(<= n 1) factors
(not= (rem n d) 0) (recur n factors (inc d))
:else (recur (quot n d) (conj factors d) d))))
replace == 0 by zero?:
(not (zero? (rem n d))) (recur n factors (inc d))
You can also overhaul it completely to make a lazy version of it:
(defn prime-factors [n]
((fn step [n d]
(lazy-seq
(when (< 1 n)
(cond
(zero? (rem n d)) (cons d (step (quot n d) d))
:else (recur n (inc d)))))
n 2))
I planned to have a section on optimization here, but I'm no specialist. The only thing I can say is that you can trivially make this code faster by interrupting the loop when d is greater than the square root of n:
(defn prime-factors [n]
(if (< 1 n)
(loop [n n
factors []
d 2]
(let [q (quot n d)]
(cond
(< q d) (conj factors n)
(zero? (rem n d)) (recur q (conj factors d) d)
:else (recur n factors (inc d)))))
[]))
(time (prime-factors 987654123987546)) ;; "Elapsed time: 7.124 msecs"
Not every loop unrolls cleanly into an elegant "functional" decomposition.
The Rosetta Code solution suggested by #edbond is pretty simple and concise; I would say it's idiomatic since no obvious "functional" solution is apparent. That solution runs noticeably faster on my machine than your Python version for 987654123987546.
More generally, if you're looking to expand your understanding of functional idioms, Bedra and Halloway's "Programming Clojure" (pp.90-95) presents an excellent comparison of different versions of the Fibonacci sequence, using loop, lazy seqs, and an elegant "functional" version. Chouser and Fogus's "Joy of Clojure" (MEAP version) also has a nice section on function composition.

Sort prime factors in ascending order Scheme

I am new to Scheme and I want to sort the prime factors of a number into ascending order. I found this code, but it does not sort.
(define (primefact n)
(let loop ([n n] [m 2] [factors (list)])
(cond [(= n 1) factors]
[(= 0 (modulo n m)) (loop (/ n m) 2 (cons m factors))]
[else (loop n (add1 m) factors)])))
Can you please help.
Thank you
I would say it sorts, but descending. If you want to sort the other way, just reverse the result:
(cond [(= n 1) (reverse factors)]
Usually when you need something sorted in the order you get them you can
cons them like this:
(define (primefact-asc n)
(let recur ((n n) (m 2))
(cond ((= n 1) '())
((= 0 (modulo n m)) (cons m (recur (/ n m) m))) ; replaced 2 with m
(else (recur n (+ 1 m))))))
Note that this is not tail recursive since it needs to cons the result, but since the amount of factors in an answer is few (thousands perhaps) it won't matter much.
Also, since it does find the factors in order you don't need to start at 2 every round but the number you found.
Which dialect of Scheme is used?
Three hints:
You need only to test for divisors less equal as the square-root of your Number.
a * b = N ; a < b ---> a <= sqrt( N ).
If you need all Prime-Numbers less some Number, you should use the sieve of eratothenes. See Wikipedia.
Before you start to write a program, look in Wikipedia.
If

Resources