How to go about composing core functions, rather then using imperative style? - performance

I have translated this code, the snippet below, from Python to Clojure. I replaced Python's while construct with Clojure's loop-recur here. But this doesn't look idiomatic.
(loop [d 2 [n & more] (list 256)]
(if (> n 1)
(recur (inc d)
(loop [x n sublist more]
(if (= (rem x d) 0)
(recur (/ x d) (conj sublist d))
(conj sublist x))))
(sort more)))
This routine gives me (3 3 31), that is prime factors of 279. For 256, it gives, (2 2 2 2 2 2 2 2), that means, 2^8.
Moreover, it performs worse for large values, say 987654123987546 instead of 279; whereas Python's counterpart works like charm.
How to start composing core functions, rather then translating imperative code as is? And specifically, how to improve this bit?
Thanks.
[Edited]
Here is the python code, I referred above,
def prime_factors(n):
factors = []
d = 2
while n > 1:
while n % d == 0:
factors.append(d)
n /= d
d = d + 1
return factors

A straight translation of the Python code in Clojure would be:
(defn prime-factors [n]
(let [n (atom n) ;; The Python code makes use of mutability which
factors (atom []) ;; isn't idiomatic in Clojure, but can be emulated
d (atom 2)] ;; using atoms
(loop []
(when (< 1 #n)
(loop []
(when (== (rem #n #d) 0)
(swap! factors conj #d)
(swap! n quot #d)
(recur)))
(swap! d inc)
(recur)))
#factors))
(prime-factors 279) ;; => [3 3 31]
(prime-factors 987654123987546) ;; => [2 3 41 14389 279022459]
(time (prime-factors 987654123987546)) ;; "Elapsed time: 13993.984 msecs"
;; same performance on my machine
;; as the Rosetta Code solution
You can improve this code to make it more idiomatic:
from nested loops to a single loop:
(loop []
(cond
(<= #n 1) #factors
(not= (rem #n #d) 0) (do (swap! d inc)
(recur))
:else (do (swap! factors conj #d)
(swap! n quot #d)
(recur))))))
get rid of the atoms:
(defn prime-factors [n]
(loop [n n
factors []
d 2]
(cond
(<= n 1) factors
(not= (rem n d) 0) (recur n factors (inc d))
:else (recur (quot n d) (conj factors d) d))))
replace == 0 by zero?:
(not (zero? (rem n d))) (recur n factors (inc d))
You can also overhaul it completely to make a lazy version of it:
(defn prime-factors [n]
((fn step [n d]
(lazy-seq
(when (< 1 n)
(cond
(zero? (rem n d)) (cons d (step (quot n d) d))
:else (recur n (inc d)))))
n 2))
I planned to have a section on optimization here, but I'm no specialist. The only thing I can say is that you can trivially make this code faster by interrupting the loop when d is greater than the square root of n:
(defn prime-factors [n]
(if (< 1 n)
(loop [n n
factors []
d 2]
(let [q (quot n d)]
(cond
(< q d) (conj factors n)
(zero? (rem n d)) (recur q (conj factors d) d)
:else (recur n factors (inc d)))))
[]))
(time (prime-factors 987654123987546)) ;; "Elapsed time: 7.124 msecs"

Not every loop unrolls cleanly into an elegant "functional" decomposition.
The Rosetta Code solution suggested by #edbond is pretty simple and concise; I would say it's idiomatic since no obvious "functional" solution is apparent. That solution runs noticeably faster on my machine than your Python version for 987654123987546.
More generally, if you're looking to expand your understanding of functional idioms, Bedra and Halloway's "Programming Clojure" (pp.90-95) presents an excellent comparison of different versions of the Fibonacci sequence, using loop, lazy seqs, and an elegant "functional" version. Chouser and Fogus's "Joy of Clojure" (MEAP version) also has a nice section on function composition.

Related

Finding primes up to a certain number in Racket

I'm learning Racket (with the HtDP course) and it's my first shot at a program in a functional language.
I've tried to design a function that finds all primes under a certain input n using (what I think is) a functional approach to the problem, but the program can get really slow (86 seconds for 100.000, while my Python, C and C++ quickly-written solutions take just a couple of seconds).
The following is the code:
;; Natural Natural -> Boolean
;; Helper function to avoid writing the handful (= 0 (modulo na nb))
(define (divisible na nb) (= 0 (modulo na nb)))
;; Natural ListOfNatural -> Boolean
;; n is the number to check, lop is ALL the prime numbers less than n
(define (is-prime? n lop)
(cond [(empty? lop) true]
[(divisible n (first lop)) false]
[ else (is-prime? n (rest lop))]))
;; Natural -> ListOfNatural
(define (find-primes n)
(if (= n 2)
(list 2)
(local [(define LOP (find-primes (sub1 n)))]
(if (is-prime? n LOP)
(append LOP (list n))
LOP))))
(time (find-primes 100000))
I'm using the divisible function instead of just plowing the rest in because I really like to have separated functions when they could be of use in another part of the program. I also should probably define is-prime? inside of find-primes, since no one will ever call is-prime? on a number while also giving all the prime numbers less than that number.
Any pointers on how to improve this?
Here are some ideas for improving the performance, the procedure now returns in under two seconds for n = 100000.
(define (is-prime? n lop)
(define sqrtn (sqrt n))
(if (not (or (= (modulo n 6) 1) (= (modulo n 6) 5)))
false
(let loop ([lop lop])
(cond [(or (empty? lop) (< sqrtn (first lop))) true]
[(zero? (modulo n (first lop))) false]
[else (loop (rest lop))]))))
(define (find-primes n)
(cond [(<= n 1) '()]
[(= n 2) '(2)]
[(= n 3) '(2 3)]
[else
(let loop ([lop '(2 3)] [i 5])
(cond [(> i n) lop]
[(is-prime? i lop) (loop (append lop (list i)) (+ i 2))]
[else (loop lop (+ i 2))]))]))
Some of the optimizations are language-related, others are algorithmic:
The recursion was converted to be in tail position. In this way, the recursive call is the last thing we do at each step, with nothing else to do after it - and the compiler can optimize it to be as efficient as a loop in other programming languages.
The loop in find-primes was modified for only iterating over odd numbers. Note that we go from 3 to n instead of going from n to 2.
divisible was inlined and (sqrt n) is calculated only once.
is-prime? only checks up until sqrt(n), it makes no sense to look for primes after that. This is the most important optimization, instead of being O(n) the algorithm is now O(sqrt(n)).
Following #law-of-fives's advice, is-prime? now skips the check when n is not congruent to 1 or 5 modulo 6.
Also, normally I'd recommend to build the list using cons instead of append, but in this case we need the prime numbers list to be constructed in ascending order for the most important optimization in is-prime? to work.
Here's Óscar López's code, tweaked to build the list in the top-down manner:
(define (is-prime? n lop)
(define sqrtn (sqrt n))
(let loop ([lop lop])
(cond [(or (empty? lop) (< sqrtn (mcar lop))) true]
[(zero? (modulo n (mcar lop))) false]
[else (loop (mcdr lop))])))
(define (find-primes n)
(let* ([a (mcons 3 '())]
[b (mcons 2 a)])
(let loop ([p a] [i 5] [d 2] ; d = diff +2 +4 +2 ...
[c 2]) ; c = count of primes found
(cond [(> i n) c]
[(is-prime? i (mcdr a))
(set-mcdr! p (mcons i '()))
(loop (mcdr p) (+ i d) (- 6 d) (+ c 1))]
[else (loop p (+ i d) (- 6 d) c )]))))
Runs at about ~n1.25..1.32, empirically; compared to the original's ~n1.8..1.9, in the measured range, inside DrRacket (append is the culprit of that bad behaviour). The "under two seconds" for 100K turns into under 0.05 seconds; two seconds gets you well above 1M (one million):
; (time (length (find-primes 100000))) ; with cons times in milliseconds
; 10K 156 ; 20K 437 ; 40K 1607 ; 80K 5241 ; 100K 7753 .... n^1.8-1.9-1.7 OP's
; 10K 62 ; 20K 109 ; 40K 421 ; 80K 1217 ; 100K 2293 .... n^1.8-1.9 Óscar's
; mcons:
(time (find-primes 2000000))
; 100K 47 ; 200K 172 ; 1M 1186 ; 2M 2839 ; 3M 4851 ; 4M 7036 .... n^1.25-1.32 this
; 9592 17984 78498 148933 216816 283146
It's still just a trial division though... :) The sieve of Eratosthenes will be much faster yet.
edit: As for set-cdr!, it is easy to emulate any lazy algorithm with it... Otherwise, we could use extendable arrays (lists of...), for the amortized O(1) snoc/append1 operation (that's lots and lots of coding); or maintain the list of primes split in two (three, actually; see the code below), building the second portion in reverse with cons, and appending it in reverse to the first portion only every so often (specifically, judging the need by the next prime's square):
; times: ; 2M 1934 ; 3M 3260 ; 4M 4665 ; 6M 8081 .... n^1.30
;; find primes up to and including n, n > 2
(define (find-primes n)
(let loop ( [k 5] [q 9] ; next candidate; square of (car LOP2)
[LOP1 (list 2)] ; primes to test by
[LOP2 (list 3)] ; more primes
[LOP3 (list )] ) ; even more primes, in reverse
(cond [ (> k n)
(append LOP1 LOP2 (reverse LOP3)) ]
[ (= k q)
(if (null? (cdr LOP2))
(loop k q LOP1 (append LOP2 (reverse LOP3)) (list))
(loop (+ k 2)
(* (cadr LOP2) (cadr LOP2)) ; next prime's square
(append LOP1 (list (car LOP2)))
(cdr LOP2) LOP3 )) ]
[ (is-prime? k (cdr LOP1))
(loop (+ k 2) q LOP1 LOP2 (cons k LOP3)) ]
[ else
(loop (+ k 2) q LOP1 LOP2 LOP3 ) ])))
;; n is the number to check, lop is list of prime numbers to check it by
(define (is-prime? n lop)
(cond [ (null? lop) #t ]
[ (divisible n (car lop)) #f ]
[ else (is-prime? n (cdr lop)) ]))
edit2: The easiest and simplest fix though, closest to your code, was to decouple the primes calculations of the resulting list, and of the list to check divisibility by. In your
(local [(define LOP (find-primes (sub1 n)))]
(if (is-prime? n LOP)
LOP is used as the list of primes to check by, and it is reused as part of the result list in
(append LOP (list n))
LOP))))
immediately afterwards. Breaking this entanglement enables us to stop the generation of testing primes list at the sqrt of the upper limit, and thus it gives us:
;times: ; 1M-1076 2M-2621 3M-4664 4M-6693
; n^1.28 ^1.33 n^1.32
(define (find-primes n)
(cond
((<= n 4) (list 2 3))
(else
(let* ([LOP (find-primes (inexact->exact (floor (sqrt n))))]
[lp (last LOP)])
(local ([define (primes k ps)
(if (<= k lp)
(append LOP ps)
(primes (- k 2) (if (is-prime? k LOP)
(cons k ps)
ps)))])
(primes (if (> (modulo n 2) 0) n (- n 1)) '()))))))
It too uses the same is-prime? code as in the question, unaltered, as does the second variant above.
It is slower than the 2nd variant. The algorithmic reason for this is clear — it tests all numbers from sqrt(n) to n by the same list of primes, all smaller or equal to the sqrt(n) — but in testing a given prime p < n it is enough to use only those primes that are not greater than sqrt(p), not sqrt(n). But it is the closest to your original code.
For comparison, in Haskell-like syntax, under strict evaluation,
isPrime n lop = null [() | p <- lop, rem n p == 0]
-- OP:
findprimes 2 = [2]
findprimes n = lop ++ [n | isPrime n lop]
where lop = findprimes (n-1)
= lop ++ [n | n <- [q+1..n], isPrime n lop]
where lop = findprimes q ; q = (n-1)
-- 3rd:
findprimes n | n < 5 = [2,3]
findprimes n = lop ++ [n | n <- [q+1..n], isPrime n lop]
where lop = findprimes q ;
q = floor $ sqrt $ fromIntegral n
-- 2nd:
findprimes n = g 5 9 [2] [3] []
where
g k q a b c
| k > n = a ++ b ++ reverse c
| k == q, [h] <- b = g k q a (h:reverse c) []
| k == q, (h:p:ps) <- b = g (k+2) (p*p) (a++[h]) (p:ps) c
| isPrime k a = g (k+2) q a b (k:c)
| otherwise = g (k+2) q a b c
The b and c together (which is to say, LOP2 and LOP3 in the Scheme code) actually constitute a pure functional queue a-la Okasaki, from which sequential primes are taken and appended at the end of the maintained primes prefix a (i.e. LOP1) now and again, on each consecutive prime's square being passed, for a to be used in the primality testing by isPrime.
Because of the rarity of this appending, its computational inefficiency has no impact on the time complexity of the code overall.

Returning the sum of positive squares

I'm trying to edit the current program I have
(define (sumofnumber n)
(if (= n 0)
1
(+ n (sumofnumber (modulo n 2 )))))
so that it returns the sum of an n number of positive squares. For example if you inputted in 3 the program would do 1+4+9 to get 14. I have tried using modulo and other methods but it always goes into an infinite loop.
The base case is incorrect (the square of zero is zero), and so is the recursive step (why are you taking the modulo?) and the actual operation (where are you squaring the value?). This is how the procedure should look like:
(define (sum-of-squares n)
(if (= n 0)
0
(+ (* n n)
(sum-of-squares (- n 1)))))
A definition using composition rather than recursion. Read the comments from bottom to top for the procedural logic:
(define (sum-of-squares n)
(foldl + ; sum the list
0
(map (lambda(x)(* x x)) ; square each number in list
(map (lambda(x)(+ x 1)) ; correct for range yielding 0...(n - 1)
(range n))))) ; get a list of numbers bounded by n
I provide this because you are well on your way to understanding the idiom of recursion. Composition is another of Racket's idioms worth exploring and often covered after recursion in educational contexts.
Sometimes I find composition easier to apply to a problem than recursion. Other times, I don't.
You're not squaring anything, so there's no reason to expect that to be a sum of squares.
Write down how you got 1 + 4 + 9 with n = 3 (^ is exponentiation):
1^2 + 2^2 + 3^2
This is
(sum-of-squares 2) + 3^2
or
(sum-of-squares (- 3 1)) + 3^2
that is,
(sum-of-squares (- n 1)) + n^2
Notice that modulo does not occur anywhere, nor do you add n to anything.
(And the square of 0 is 0 , not 1.)
You can break the problem into small chunks.
1. Create a list of numbers from 1 to n
2. Map a square function over list to square each number
3. Apply + to add all the numbers in squared list
(define (sum-of-number n)
(apply + (map (lambda (x) (* x x)) (sequence->list (in-range 1 (+ n 1))))))
> (sum-of-number 3)
14
This is the perfect opportunity for using the transducers technique.
Calculating the sum of a list is a fold. Map and filter are folds, too. Composing several folds together in a nested fashion, as in (sum...(filter...(map...sqr...))), leads to multiple (here, three) list traversals.
But when the nested folds are fused, their reducing functions combine in a nested fashion, giving us a one-traversal fold instead, with the one combined reducer function:
(define (((mapping f) kons) x acc) (kons (f x) acc)) ; the "mapping" transducer
(define (((filtering p) kons) x acc) (if (p x) (kons x acc) acc)) ; the "filtering" one
(define (sum-of-positive-squares n)
(foldl ((compose (mapping sqr) ; ((mapping sqr)
(filtering (lambda (x) (> x 0)))) ; ((filtering {> _ 0})
+) 0 (range (+ 1 n)))) ; +))
; > (sum-of-positive-squares 3)
; 14
Of course ((compose f g) x) is the same as (f (g x)). The combined / "composed" (pun intended) reducer function is created just by substituting the arguments into the definitions, as
((mapping sqr) ((filtering {> _ 0}) +))
=
( (lambda (kons)
(lambda (x acc) (kons (sqr x) acc)))
((filtering {> _ 0}) +))
=
(lambda (x acc)
( ((filtering {> _ 0}) +)
(sqr x) acc))
=
(lambda (x acc)
( ( (lambda (kons)
(lambda (x acc) (if ({> _ 0} x) (kons x acc) acc)))
+)
(sqr x) acc))
=
(lambda (x acc)
( (lambda (x acc) (if (> x 0) (+ x acc) acc))
(sqr x) acc))
=
(lambda (x acc)
(let ([x (sqr x)] [acc acc])
(if (> x 0) (+ x acc) acc)))
which looks almost as something a programmer would write. As an exercise,
((filtering {> _ 0}) ((mapping sqr) +))
=
( (lambda (kons)
(lambda (x acc) (if ({> _ 0} x) (kons x acc) acc)))
((mapping sqr) +))
=
(lambda (x acc)
(if (> x 0) (((mapping sqr) +) x acc) acc))
=
(lambda (x acc)
(if (> x 0) (+ (sqr x) acc) acc))
So instead of writing the fused reducer function definitions ourselves, which as every human activity is error-prone, we can compose these reducer functions from more atomic "transformations" nay transducers.
Works in DrRacket.

Optimisation of the number of divisors of a number

I am using the below code to get the number of divisors for the given number as per this answer https://stackoverflow.com/a/110365/2610955. I calculate the number of times a prime factor is repeated and then increment each of them and make a product out of them. But the algorithm seems too slow. Is there a way to optimise the program. I tried with type hints but they don't have any use. Is there something wrong with the algorithm or am I missing any optimisation here?
(defn prime-factors
[^Integer n]
(loop [num n
divisors {}
limit (-> n Math/sqrt Math/ceil Math/round inc)
init 2]
(cond
(or (>= init limit) (zero? num)) divisors
(zero? (rem num init)) (recur (quot num init) (update divisors init #(if (nil? %) 1 (inc %))) limit init)
:else (recur num divisors limit (inc init)))))
(prime-factors 6)
(set! *unchecked-math* true)
(dotimes [_ 10] (->> (reduce *' (range 30 40))
prime-factors
vals
(map inc)
(reduce *' 1)
time))
Edit : Removed rem as its not needed in the final output
Minor thing: use int or long instead of Integer as #leetwinski mentioned.
Major reason:
It seems that you loop from 2 to sqrt(n), which will loop over unnecessary numbers.
Take a look at the code below: (This is just a dirty patch)
(defn prime-factors
[n]
(loop [num n
divisors {}
limit (-> n Math/sqrt int)
init 2]
(cond
(or (>= init limit) (zero? num)) divisors
(zero? (rem num init)) (recur (quot num init) (update divisors init #(if (nil? %) 1 (inc %))) limit init)
:else (recur num divisors limit (if (= 2 init ) (inc init) (+ 2 init))))))
I just replace (inc init) with (if (= 2 init ) (inc init) (+ 2 init)). This will loop over 2 and odd numbers. If you run it, you will notice that the execution time reduced almost half of the original version. Because it skip the even numbers (except 2).
If you loop over only prime numbers, it will be much faster than this. You can get the sequence of primes from clojure.contrib.lazy-seqs/primes. Though this contrib namespace has deprecated, you can still use it.
This is my approach:
(ns util
(:require [clojure.contrib.lazy-seqs :refer (primes)]))
(defn factorize
"Returns a sequence of pairs of prime factor and its exponent."
[n]
(loop [n n, ps primes, acc []]
(let [p (first ps)]
(if (<= p n)
(if (= 0 (mod n p))
(recur (quot n p) ps (conj acc p))
(recur n (rest ps) acc))
(->> (group-by identity acc)
(map (fn [[k v]] [k (count v)])))))))
Then you can use this function like this:
(dotimes [_ 10] (->> (reduce *' (range 30 40))
factorize
(map second)
(map inc)
(reduce *' 1)
time))
you can increase the performance by using primitive int in loop. Just replace (loop [num n... with (loop [num (int n)...
for me it works 4 to 5 times faster
another variant (which is in fact the same) is to change the type hint in the function signature to ^long.
The problem is that the ^Integer type hint doesn't affect the performance in your case (as far as i know). This kind of hint just helps to avoid reflection overhead when you call some methods of the object (which you don't), and primitive type hint (only ^long and ^double are accepted) actually converts value to the primitive type.

Scheme - speed up a heavily recursive function

I must write a Scheme predicate that computes the function f(N -> N) defined as :
if n < 4 :f(n)= (n^2) + 5
if n ≥ 4 :f(n) = [f(n−1) + f(n−2)] * f(n−4)
I wrote a simple predicate that works :
(define functionfNaive
(lambda (n)
(if (< n 4) (+ (* n n) 5)
(* (+ (functionfNaive (- n 1)) (functionfNaive (- n 2)))
(functionfNaive (- n 4))))))
Now, I try a method with an accumulator but it doesn't work...
My code :
(define functionf
(lambda(n)
(functionfAux n 5 9 14)))
(define functionfAux
(lambda (n n1 n2 n4)
(cond
[(< n 4) (+ (* n n) 5)]
[(= n 4) (* n1 (+ n2 n4))]
[else (functionfAux (- n 1) n2 n4 (* n1 (+ n2 n4)))])))
As requested, here's a memoized version of your code that performs better than the naïve version:
(define functionf
(let ((cache (make-hash)))
(lambda (n)
(hash-ref!
cache
n
(thunk
(if (< n 4)
(+ (* n n) 5)
(* (+ (functionf (- n 1)) (functionf (- n 2))) (functionf (- n 4)))))))))
BTW... computing the result for large values of n is very quick, but printing takes a lot of time. To measure the time, use something like
(time (functionf 50) 'done)
AND here's a generic memoize procedure, should you need it:
(define (memoize fn)
(let ((cache (make-hash)))
(λ arg (hash-ref! cache arg (thunk (apply fn arg))))))
which in your case could be used like
(define functionf
(memoize
(lambda (n)
(if (< n 4)
(+ (* n n) 5)
(* (+ (functionf (- n 1)) (functionf (- n 2))) (functionf (- n 4)))))))
First, that's not a predicate. A Predicate is a function which returns a Boolean value.
To calculate the nth result, start with the first four and count up, maintaining the last four known elements. Stop when n is reached:
(define (step a b c d n)
(list b c d (* (+ c d) a)) (+ n 1)))
etc. Simple. The first call will be (step 5 6 9 14 3).
The depth of the recursion tree may be the biggest question, so may be use the iteration which means use some variables to memory the intermediate processes.
#lang racket
(define (functionf n)
(define (iter now n1 n2 n3 n4 back)
(if (= n now)
back
(iter (+ now 1) back n1 n2 n3 (* n3 (+ back n1)))))
(if (< n 4)
(+ 5 (* n n))
(iter 4 14 9 6 5 125)))
(functionf 5)
in this way, the depth of the stack only be 1 and the code is speeded up.

Performance of vector and array in Clojure

I am trying to solve the Maximum subarray problem on hacker rank. This is a standard DP problem and I write an O(n) solution:
(defn dp
[v]
(let [n (count v)]
(loop [i 1 f (v 0) best f]
(if (< i n)
(let [fi (max (v i) (+ f (v i)))]
(recur (inc i) fi (max fi best)))
best))))
(defn positive-only
[v]
(reduce + (filterv #(> % 0) v)))
(defn line->ints
[line]
(->>
(clojure.string/split line #" ")
(map #(Integer. %))
(into [])
))
(let [T (Integer. (read-line))]
(loop [test 0]
(when (< test T)
(let [_ (read-line)
x (read-line)
v (line->ints x)
a (dp-array v)
b (let [p (positive-only v)]
(if (= p 0) (reduce max v) p))]
(printf "%d %d\n" a b))
(recur (inc test)))))
To my surprise, I got time-limited-exceed for a large test case. I downloaded the input file, and found that the above version needs about 3 seconds to run.
I thought the bottleneck is in (v i) (getting the i-th element in vector v). So I changed the data structure from vector to an array:
(defn dp-array
[v0]
(let [v (into-array v0)
n (int (alength v))]
(loop [i 1
f (aget v 0)
best f]
(if (< i n)
(let [fi (max (aget v i) (+ f (aget v i)))]
(recur (inc i) fi (max fi best)))
best))))
This array version is even slower. On the same input, it costs 33 seconds, much slower than the vector version. I think the slowness is due to boxing and unboxing. I tried to add type hints, but encountered run-time errors. Could anyone help me improve dp-array function? Thanks!
Also, great appreciate if anyone knows how to improve the vector version.
UPDATE:
Finally I managed to get my clojure program accepted, not by optimizing over the dynamic programming function, but by changing (Integer. str) to (Integer/parseInt str). In this way, reflection is avoided in converting from string to integer.
I also replace into-array by int-array. But the speed of both versions are still on par with each other. I would expect the array version be faster than the vector version.
The Clojure compiler can't infer the type of v in the array version of the dp-array function whose argument v0 has unknown type. This causes costs to reflections when evaluating the following alength and aget. In order to avoid these unnecessary reflections, you have to replace into-array with long-array.

Resources