Calculating the cosine function in Scheme

I'm making a Scheme program that calculates

cos(x) = 1 - (x^2/2!) + (x^4/4!) - (x^6/6!) + ...

What's the most efficient way to finish the program, and how would you do the alternating addition and subtraction? That's what I used the modulo for, but it doesn't work for 0 and 1 (the first two terms). x is the initial value of x and num is the number of terms.
(define cosine-taylor
  (lambda (x num)
    (do ((i 0 (+ i 1)))
        ((= i num))
      (if (= 0 (modulo i 2))
          (+ x (/ (pow-tr2 x (* i 2)) (factorial (* 2 i))))
          (- x (/ (pow-tr2 x (* i 2)) (factorial (* 2 i))))))
    x))

Your questions:
What's the most efficient way to finish the program? Assuming you want to use the Taylor series expansion and simply sum up the terms n times, your iterative approach is fine. I've refined it below, but your algorithm is fine. Others have pointed out possible loss-of-precision issues; see below for my approach.
How would you do the alternating addition and subtraction? Use another argument/local variable odd?, a boolean, and alternate it with not: when odd?, subtract; when not odd?, add.
(define (cosine-taylor x n)
  (let computing ((result 1) (i 1) (odd? #t))
    (if (> i n)
        result
        (computing ((if odd? - +) result (/ (expt x (* 2 i)) (factorial (* 2 i))))
                   (+ i 1)
                   (not odd?)))))
> (cos 1)
0.5403023058681398
> (cosine-taylor 1.0 100)
0.5403023058681397
Not bad?
The above is the Scheme-ish way of performing a 'do' loop. You should easily be able to see the correspondence to a do with three locals: i, result, and odd?.
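For reference, here is a sketch of that correspondence written with do (my transcription, not part of the original answer; it assumes the same factorial helper):
(define (cosine-taylor x n)
  (do ((i 1 (+ i 1))
       (odd? #t (not odd?))
       (result 1 ((if odd? - +) result
                  (/ (expt x (* 2 i)) (factorial (* 2 i))))))
      ((> i n) result)))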
Regarding loss of numeric precision - if you really want to solve the precision problem, then convert x to an 'exact' number and do all computation using exact numbers. By doing that, you get a natural, Scheme-ly algorithm with 'perfect' precision.
> (cosine-taylor (exact 1.0) 100)
3982370694189213112257449588574354368421083585745317294214591570720658797345712348245607951726273112140707569917666955767676493702079041143086577901788489963764057368985531760218072253884896510810027045608931163026924711871107650567429563045077012372870953594171353825520131544591426035218450395194640007965562952702049286379961461862576998942257714483441812954797016455243/7370634274437294425723020690955000582197532501749282834530304049012705139844891055329946579551258167328758991952519989067828437291987262664130155373390933935639839787577227263900906438728247155340669759254710591512748889975965372460537609742126858908788049134631584753833888148637105832358427110829870831048811117978541096960000000000000000000000000000000000000000000000000
> (inexact (cosine-taylor (exact 1.0) 100))
0.5403023058681398

We should calculate the terms iteratively, to avoid the loss of precision that comes from dividing very large numbers:
(define (cosine-taylor-term x)
  (let ((t 1.0) (k 0))
    (lambda (msg)
      (case msg
        ((peek) t)
        ((pull)
         (let ((p t))
           (set! k (+ k 2))
           (set! t (* (- t) (/ x (- k 1)) (/ x k)))
           p))))))
Then it should be easy to build a function to produce the n-th term, or to sum the terms until a term is smaller than a pre-set precision value:
(define t (cosine-taylor-term (atan 1)))
;Value: t
(reduce + 0 (map (lambda (x) (t 'pull)) '(1 2 3 4 5)))
;Value: .7071068056832942
(cos (atan 1))
;Value: .7071067811865476
(t 'peek)
;Value: -2.4611369504941985e-8
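Building on this, a possible way to sum terms until the next one drops below a preset precision (my sketch, not from the original answer):
(define (cosine-sum x precision)
  (let ((term (cosine-taylor-term x)))
    (let loop ((sum 0.0))
      (if (< (abs (term 'peek)) precision)
          sum
          (loop (+ sum (term 'pull)))))))
;; e.g. (cosine-sum (atan 1) 1e-10) should approach (cos (atan 1))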

A few suggestions:
Reduce your input modulo 2*pi: most polynomial expansions converge very slowly for large arguments.
Keep track of your factorials rather than computing them from scratch each time (once you have 4!, you get 5! by multiplying by 5, etc.).
Similarly, all your powers are powers of x^2. Compute x^2 just once, then multiply the "x power so far" by this number (x2), rather than raising x to the n-th power.
Here is some Python code that implements this; it converges with very few terms, and you can control the precision with the while abs(delta) > precision: statement.
from math import *

def myCos(x):
    precision = 1e-5  # pick whatever you need
    xr = (x + pi/2) % (2*pi)
    if xr > pi:
        sign = -1
    else:
        sign = 1
    xr = (xr % pi) - pi/2
    x2 = xr * xr
    xp = 1
    f = 1
    c = 0
    ans = 1
    temp = 0
    delta = 1
    while abs(delta) > precision:
        c += 1
        f *= c
        c += 1
        f *= c
        xp *= x2
        temp = xp / f
        c += 1
        f *= c
        c += 1
        f *= c
        xp *= x2
        delta = xp/f - temp
        ans += delta
    return sign * ans
Other than that I can't help you much, as I am not familiar with Scheme...
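Since the question is about Scheme, here is a rough transcription of the same ideas (my sketch, not from the original answer; it reduces the argument only to [-pi, pi] rather than playing the quadrant trick above, and builds each term from the previous one):
(define (my-cos x)
  (define precision 1e-5)
  (define pi (* 4 (atan 1)))
  ;; reduce the argument to [-pi, pi]; cos has period 2*pi
  (let* ((xr (- x (* 2 pi (round (/ x (* 2 pi))))))
         (x2 (* xr xr)))
    ;; each term is the previous one times -x^2/((2k+1)(2k+2))
    (let loop ((term 1.0) (k 0) (sum 0.0))
      (if (< (abs term) precision)
          sum
          (loop (/ (* term (- x2)) (* (+ (* 2 k) 1) (+ (* 2 k) 2)))
                (+ k 1)
                (+ sum term))))))
;; (my-cos 1.0) should come out near 0.54030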

For your general enjoyment, here is a stream implementation. The stream returns an infinite sequence of Taylor terms based on the provided func, which is called with the current index.
(define (stream-taylor func)
  (stream-map func (stream-from 0)))

(define (stream-cosine x)
  (stream-taylor (lambda (n)
                   (if (zero? n)
                       1
                       (let ((odd? (= 1 (modulo n 2))))
                         ;; Use `exact` if desired...
                         ;; and see #WillNess above; save 'last'; use for next; avoid expt/factorial
                         ((if odd? - +) (/ (expt x (* 2 n)) (factorial (* 2 n)))))))))
> (stream-fold + 0 (stream-take 10 (stream-cosine 1.0)))
0.5403023058681397
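Following the hint in that comment, a possible incremental variant (my sketch, not from the original answer; it assumes an SRFI 41-style stream-cons, which delays its rest argument) derives each term from the previous one, avoiding expt and factorial entirely:
(define (stream-cosine-incremental x)
  (let ((x2 (* x x)))
    ;; term k+1 = term k * -x^2/((2k+1)(2k+2))
    (let next ((term 1.0) (k 0))
      (stream-cons term
                   (next (/ (* term (- x2))
                            (* (+ (* 2 k) 1) (+ (* 2 k) 2)))
                         (+ k 1))))))
;; (stream-fold + 0 (stream-take 10 (stream-cosine-incremental 1.0)))
;; should again come out near 0.5403023058681397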

Here's the most streamlined function I could come up with.
It takes advantage of the fact that every term is multiplied by (-x^2) and divided by (i+1)*(i+2) to come up with the next term.
It also takes advantage of the fact that we are computing factorials of 2, 4, 6, etc., so it increments the position counter by 2 and compares it with 2*N to stop the iteration.
(define (cosine-taylor x num)
  (let ((mult (* x x -1))
        (twice-num (* 2 num)))
    (define (helper iter prev-term prev-out)
      (if (= iter twice-num)
          (+ prev-term prev-out)
          (helper (+ iter 2)
                  (/ (* prev-term mult) (+ iter 1) (+ iter 2))
                  (+ prev-term prev-out))))
    (helper 0 1 0)))
Tested at repl.it.
Here are some results:
(cosine-taylor 1.0 2)
=> 0.5416666666666666
(cosine-taylor 1.0 4)
=> 0.5403025793650793
(cosine-taylor 1.0 6)
=> 0.5403023058795627
(cosine-taylor 1.0 8)
=> 0.5403023058681398
(cosine-taylor 1.0 10)
=> 0.5403023058681397
(cosine-taylor 1.0 20)
=> 0.5403023058681397

Related

First power of 2 that has leading decimal digits of 12 in Common Lisp

On the Rosetta Code website (http://rosettacode.org/wiki/First_power_of_2_that_has_leading_decimal_digits_of_12#C) there is the following challenge: find the first power of 2 that has leading decimal digits of 12.
I have written several versions in Common Lisp, but I fail miserably in terms of performance compared to what other languages report.
The following code is one of the many versions that I have tried.
(defun log10 (x) (/ (log x) (log 10)))

(defun first-digits (n l)
  (let* ((len-n (1+ (floor (log10 n))))
         (tens (expt 10 (- len-n l))))
    (truncate n tens)))

(defun p (rem n)
  (do* ((len-rem (1+ (floor (log10 rem))))
        (i 0 (1+ i))
        (k 1 (* 2 k)))
       ((= n 0) (1- i))
    (when (= rem (first-digits k len-rem))
      (decf n))))
The performance is really poor, but I refuse to admit that Common Lisp is slower than its competitors. Any idea how to achieve the run time of a few seconds reported for C#, C++, etc.?
Here is an answer which is now substantially equivalent to yours (based on the idea in this maths SE answer, but a little more general and a little more optimized). In particular:
it lets you specify the number you want to raise to a power;
it lets you specify the base you want to work in (so not just 10);
it knows that (log x b) is the log of x in base b;
it computes the log you need just once;
it reorganizes some of the float operations to avoid overflows (at some point your version will call (expt 10 n) with a big enough n that you'll get a floating-point overflow).
I think (in fact I'm reasonably sure) that the remaining thing that is making it slow is float consing. It ought to be possible to prevail upon a compiler to avoid this, and I spent a small amount of time trying, but to no avail.
Anyway, here it is.
(defun p (L n &key (b 2) (base 10.0d0))
  ;; nth occurrence of power of b with leading digits L in base:
  ;; returns power (p below)
  (let ((k-1 (floor (log L base)))         ;(digits of L in base) - 1
        (logb (log (float b 1.0d0) base))) ;log of b in base
    (do* ((p 0 (1+ p))                     ;power
          (d 1 (floor (expt base (+ (rem (* p logb) 1) k-1)))) ;digits
          (c (if (= L 1) 1 0)              ;count of hits
             (if (= L d) (1+ c) c)))
         ((= c n) p))))
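As a quick sanity check (my example, not from the original answer): (p 12 1) should return 7, since the powers of 2 run 1, 2, 4, 8, 16, 32, 64, 128, and 2^7 = 128 is the first whose leading decimal digits are 12; the timing output further below likewise shows 2^80 as the second occurrence.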
Below is an earlier, useless answer: I've left it for posterity.
Note that the first part of what's below deals only with the two-leading-digit case, because I didn't bother looking at the original Rosetta Code task: see the end for a generalisation.
First of all, to pull out the first two decimal digits of something, you want to start by dividing it by a suitable power of 10 such that the result is a two-digit number. That means taking logs, but you can cheat: if you know the number of bits, b, in the number, then the number is 2^b plus change (assuming it's a positive integer), and finding the number of bits is very, very quick. And then if 10^x = 2^y, then x = y/(log_2 10), where log_2 is log base 2. And this is a constant.
So you can write this function after fiddling around with pen and paper for a bit:
(defun leading-two-decimal-digits (n)
  (truncate (truncate n (expt 10 (1- (truncate (integer-length n)
                                               (load-time-value (log 10 2))))))
            10))
Note the use of load-time-value to compute (log 10 2) just once. And note also that this relies on either having correct integer arithmetic, or at least a function that tells you how many bits a number would have if you did.
So now
> (leading-two-decimal-digits 59468049869823987435)
5
9
OK, looks good (extensive testing...).
So now you just need to start from 1 and successively multiply by the base, looking for the leading two digits being what you care about. As a hack, if the base is 2 you can just left-shift: my original function assumed the base was 2 and always left-shifted; this one has a configurable base but still special-cases 2, which I suspect may no longer help at all.
(defun nth-ld-power (n &key (b 2) (leading 1) (second 2))
  ;; iterate is tfeb.github.io/tfeb-lisp-hax/, also should be in
  ;; Quicklisp
  (iterate looking ((v 1)  ;value
                    (p 0)  ;power (so we can report it)
                    (c 0)) ;hit count
    (multiple-value-bind (d1 d2) (leading-two-decimal-digits v)
      (if (and (= d1 leading) (= d2 second))
          (let ((c+ (1+ c)))
            (if (= c+ n)
                (values v p)
                (looking (if (= b 2) (ash v 1) (* v b)) (1+ p) c+)))
          (looking (if (= b 2) (ash v 1) (* v b)) (1+ p) c)))))
Again you would want to deal with the multiple-leading-digits case by passing a list of leading digits, but.
OK, so now:
> (time (nth-ld-power 2))
Timing the evaluation of (nth-ld-power 2)
User time = 0.000
System time = 0.000
Elapsed time = 0.000
Allocation = 2848 bytes
0 Page faults
GC time = 0.000
1208925819614629174706176
80
> (time (nth-ld-power 10))
Timing the evaluation of (nth-ld-power 10)
User time = 0.000
System time = 0.000
Elapsed time = 0.000
Allocation = 40704 bytes
0 Page faults
GC time = 0.000
124330809102446660538845562036705210025114037699336929360115994223289874253133343883264
286
OK, this is too quick to measure, let's make it do some real work:
> (time (nth-ld-power 1000))
Timing the evaluation of (nth-ld-power 1000)
User time = 3.264
System time = 0.039
Elapsed time = 3.384
Allocation = 845193424 bytes
517 Page faults
GC time = 0.020
12[...long number truncated here ...]
28745
So that's at least reasonable performance, I think. These figures were for LispWorks on a 2013 macbook. Obviously you can rewrite it without Tim's iterate, but I like it.
Here is a generalisation to the n-digit case.
(defun leading-decimal-digits (n m)
  ;; Return M leading digits from N: no sanity check
  (iterate peel ((q (truncate n (expt 10 (- (truncate (integer-length n)
                                                      (load-time-value (log 10 2)))
                                            (- m 1)))))
                 (digits '()))
    (multiple-value-bind (qq d) (truncate q 10)
      (cond
        ((zerop qq)
         (cons d digits))
        ((< qq 10)
         (list* qq d digits))
        (t
         (peel qq (cons d digits)))))))
(defun nth-power-with-leading-decimal-digits (n b leading)
  ;; Find the N'th occurrence of a power of B whose leading digits are
  ;; LEADING (a list of digits). Return the value of B^P and P, the
  ;; power.
  (let ((nleading (length leading)))
    (iterate looking ((v 1)  ;value
                      (p 0)  ;power (so we can report it)
                      (c 0)) ;hit count
      (if (equal (leading-decimal-digits v nleading) leading)
          (let ((c+ (1+ c)))
            (if (= c+ n)
                (values v p)
                (looking (if (= b 2) (ash v 1) (* v b)) (1+ p) c+)))
          (looking (if (= b 2) (ash v 1) (* v b)) (1+ p) c)))))
So:
> (nth-power-with-leading-decimal-digits 2 2 '(1 2 8))
12855504354071922204335696738729300820177623950262342682411008
203
> (nth-power-with-leading-decimal-digits 2 3 '(2 7 8))
278954761343915929031866324148580803686773879062609352173281933430969939572023921519256921927359964084535215107090906022143908601839272147120823008337941481521208646465304746378648054338849857759629806700446921838039313884792762356955010344065298744426691826196079923777796821513452648753573059469525738664313409324728161550430310432705576201607066435772343529511415605105251217669677767280155388600975280964482318641251059803074701681895639266095014303466041595626938522373004626313927779288797843485898785327074755040298312905780373591994824646107875196292150692772609024142420597241695173176699988995274347532223981851419958515807471788303361043192244700492703668505222371093498892685289816397935636213340656919509760640657215827785179171568543835684943671352357398679232652259639784388841436515656182938820127024910154405205830179436914341715546500034485031465189386918271898229029201
1860
2^203 is the second occurrence of a power of 2 beginning with 128, and 3^1860 is the second occurrence of a power of 3 beginning 278. The second one took 0.017 seconds.

How would I go about creating this function

How would I create an approximate cos function? What I have so far:
(define k 0)

(define (approx-cos x n)
  (cond
    [(> 0 n) 0]
    [else (* (/ (expt -1 k) (factorial (* 2 k))) (expt x (* 2 k)))]))
Your solution requires a lot of work before it meets the expectations. For starters, your parameters are switched: the first one is the number you want to calculate and the second one is the number of iterations...
Which leads me to the major problem in your solution: you're not iterating at all! You're supposed to call approx-cos at some point, or some helper procedure to do the looping (as I did).
Last but not least, you're not correctly implementing the formula. Where's the -1 part, for instance? And where are you multiplying by x^2k? I'm afraid a complete rewrite is in order:
; main procedure
(define (approx-cos x n)
  ; call helper procedure
  (loop 0 0 x n))

; define a helper procedure
(define (loop acc k x n)
  ; loop with k from 0 to n, accumulating the result
  (cond [(> k n) acc] ; return accumulator
        [else
         (loop (+ acc ; update accumulator
                  (* (/ (expt -1.0 k) ; implement the formula
                        (factorial (* 2.0 k)))
                     (expt x (* 2.0 k))))
               (add1 k) ; increment iteration variable
               x n)]))
This will pass all the check-expects:
(approx-cos 0 0)
=> 1
(approx-cos (/ pi 2) 0)
=> 1
(approx-cos 0 10)
=> 1
(approx-cos pi 10)
=> -0.9999999999243502
(approx-cos (* 3 (/ pi 2)) 9)
=> -1.1432910825361444e-05
(approx-cos 10 100)
=> -0.8390715290756897
Some final thoughts: your implementation of factorial is very slow; if you plan to do a larger number of iterations, your factorial will freeze the execution at some point.
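If factorial is the bottleneck, one simple improvement (my sketch; the original answer doesn't show one) is a tail-recursive version that runs in linear time without growing the stack:
(define (factorial n)
  ;; accumulate the product n * (n-1) * ... * 1 iteratively
  (let loop ((k n) (acc 1))
    (if (<= k 1)
        acc
        (loop (- k 1) (* k acc)))))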

SICP 1.45 - Why are these two higher order functions not equivalent?

I'm going through the exercises in SICP and am wondering if someone can explain the difference between these two seemingly equivalent functions that are giving different results! Is this because of rounding? I'm thinking the order of functions shouldn't matter here, but somehow it does. Can someone explain what's going on here and why it's different?
Details:
Exercise 1.45: ...we saw that finding a fixed point of y => x/y does not converge, and that this can be fixed by average damping. The same method works for finding cube roots as fixed points of the average-damped y => x/y^2. Unfortunately, the process does not work for fourth roots: a single average damp is not enough to make a fixed-point search for y => x/y^3 converge. On the other hand, if we average damp twice (i.e., use the average damp of the average damp of y => x/y^3) the fixed-point search does converge. Do some experiments to determine how many average damps are required to compute nth roots as a fixed-point search based upon repeated average damping of y => x/y^(n-1). Use this to implement a simple procedure for computing nth roots using fixed-point, average-damp, and the repeated procedure of Exercise 1.43. Assume that any arithmetic operations you need are available as primitives.
My answer (note the order of repeat and average-damping):
(define (nth-root-me x n num-repetitions)
  (fixed-point (repeat (average-damping (lambda (y)
                                          (/ x (expt y (- n 1)))))
                       num-repetitions)
               1.0))
I see an alternate web solution where repeat is called directly on average-damping, and then that function is called with the argument:
(define (nth-root-web-solution x n num-repetitions)
  (fixed-point
    ((repeat average-damping num-repetitions)
     (lambda (y) (/ x (expt y (- n 1)))))
    1.0))
Now, calling both of these, there seems to be a difference in the answers and I can't understand why! My understanding is that the order of the functions shouldn't affect the output (they're associative, right?), but clearly it does!

> (nth-root-me 10000 4 2)
10.050110705350287
> (nth-root-web-solution 10000 4 2)
10.0

I did more tests and it's always like this: my answer is close, but the other answer is almost always closer! Can someone explain what's going on? Why aren't these equivalent? My guess is that the order of calling these functions is messing with it, but they seem associative to me.
For example:
(repeat (average-damping (lambda (y) (/ x (expt y (- n 1)))))
        num-repetitions)
vs
((repeat average-damping num-repetitions)
 (lambda (y) (/ x (expt y (- n 1)))))
Other Helper functions:
(define tolerance 0.00001)            ; assumed; not shown in the original post
(define (average a b) (/ (+ a b) 2)) ; assumed; not shown in the original post

(define (fixed-point f first-guess)
  (define (close-enough? v1 v2)
    (< (abs (- v1 v2))
       tolerance))
  (let ((next-guess (f first-guess)))
    (if (close-enough? next-guess first-guess)
        next-guess
        (fixed-point f next-guess))))

(define (average-damping f)
  (lambda (x) (average x (f x))))

(define (repeat f k)
  (define (repeat-helper f k acc)
    (if (<= k 1)
        acc
        ;; compose the original function with the modified one
        (repeat-helper f (- k 1) (compose f acc))))
  (repeat-helper f k f))

(define (compose f g)
  (lambda (x)
    (f (g x))))
You are asking why “two seemingly equivalent functions” produce a different result, but the two functions are in effect very different.
Let's try to simplify the problem to see why they are different. The only difference between the two functions is in these two expressions:
(repeat (average-damping (lambda (y) (/ x (expt y (- n 1)))))
        num-repetitions)
((repeat average-damping num-repetitions)
 (lambda (y) (/ x (expt y (- n 1)))))
In order to simplify our discussion, we assume num-repetitions equal to 2, and a simpler function than that lambda, for instance the following:
(define (succ x) (+ x 1))
So the two different parts are now:
(repeat (average-damping succ) 2)
and
((repeat average-damping 2) succ)
Now, for the first expression, (average-damping succ) returns a numeric function that calculates the average between a parameter and its successor:
(define h (average-damping succ))
(h 3) ; => (3 + succ(3))/2 = (3 + 4)/2 = 3.5
So, the expression (repeat (average-damping succ) 2) is equivalent to:
(lambda (x) ((compose h h) x))
which is equivalent to:
(lambda (x) (h (h x)))
Again, this is a numeric function, and if we apply it to 3 we have:
((lambda (x) (h (h x))) 3) ; => (h 3.5) => (3.5 + 4.5)/2 = 4
In the second case, instead, we have (repeat average-damping 2), which produces a completely different function:
(lambda (x) ((compose average-damping average-damping) x))
which is equivalent to:
(lambda (x) (average-damping (average-damping x)))
You can see that the result this time is a higher-order function, not a numeric one: it takes a function x and applies the average-damping function to it twice. Let's verify this by applying this function to succ and then applying the result to the number 3:
(define g ((lambda (x) (average-damping (average-damping x))) succ))
(g 3) ; => 3.25
The difference in the result is not due to numeric approximation, but to a different computation: first (average-damping succ) returns the function h, which computes the average between the parameter and its successor; then (average-damping h) returns a new function that computes the average between the parameter and the result of the function h. Such a function, if passed a number like 3, first calculates the average between 3 and 4, which is 3.5, then calculates the average between 3 (again the parameter), and 3.5 (the previous result), producing 3.25.
The definition of repeat entails
((repeat f k) x) = (f (f (f (... (f x) ...))))
;                   1  2  3       k
with k nested calls to f in total. Let's write this as
= ((f^k) x)
and also define
(define (foo n) (lambda (y) (/ x (expt y (- n 1)))))
; ((foo n) y) = (/ x (expt y (- n 1)))
Then we have
(nth-root-you x n k) = (fixed-point ((average-damping (foo n))^k) 1.0)
(nth-root-web x n k) = (fixed-point ((average-damping^k) (foo n)) 1.0)
So your version makes k steps with the once-average-damped (foo n) function on each iteration step performed by fixed-point; the web's uses the k-times-average-damped (foo n) as its iteration step. Notice that no matter how many times it is used, a once-average-damped function is still average-damped only once, and using it several times is probably only going to exacerbate a problem, not solve it.
For k == 1 the two resulting iteration step functions are of course equivalent.
In your case k == 2, and so

(your-step y) = ((average-damping (foo n))
                 ((average-damping (foo n)) y))

(web-step y) = ((average-damping (average-damping (foo n))) y)

Since

((average-damping f) y) = (average y (f y))

we have

(your-step y) = ((average-damping (foo n))
                 (average y ((foo n) y)))
              = (let ((z (average y ((foo n) y))))
                  (average z ((foo n) z)))

(web-step y) = (average y ((average-damping (foo n)) y))
             = (average y (average y ((foo n) y)))
             = (+ (* 0.5 y) (* 0.5 (average y ((foo n) y))))
             = (+ (* 0.75 y) (* 0.25 ((foo n) y)))
             ;; and in general:
             ;; = (2^k - 1)/2^k * y + 1/2^k * ((foo n) y)
The difference is clear. Average damping is used to dampen the possibly erratic jumps of (foo n) at certain ys, and the higher the k the stronger the damping effect, as is clearly seen from the last formula.
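A quick numeric check of that last formula (my example, not from the original answers), damping y -> 16/y^3 once versus twice near the true fourth root y = 2:
;; once-damped:  (average y (16/y^3))             = 0.5*y  + 0.5*(16/y^3)
;; twice-damped: (average y (average y (16/y^3))) = 0.75*y + 0.25*(16/y^3)
(define (once-damped y)  (+ (* 0.5 y)  (* 0.5 (/ 16 (expt y 3)))))
(define (twice-damped y) (+ (* 0.75 y) (* 0.25 (/ 16 (expt y 3)))))
(once-damped 2.1)  ; => about 1.914, overshooting below 2
(twice-damped 2.1) ; => about 2.007, much closer to 2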

Why does this simplification make my function slower?

The following function computes Fibonacci numbers by tail recursion and squaring:
(defun fib1 (n &optional (a 1) (b 0) (p 0) (q 1))
  (cond ((zerop n) b)
        ((evenp n)
         (fib1 (/ n 2)
               a
               b
               (+ (* p p) (* q q))
               (+ (* q q) (* 2 p q))))
        (t
         (fib1 (1- n)
               (+ (* b q) (* a (+ p q)))
               (+ (* b p) (* a q))
               p
               q))))
Basically, it reduces every odd input to an even one, and halves every even input. For example,
F(21)
= F(21 1 0 0 1)
= F(20 1 1 0 1)
= F(10 1 1 1 1)
= F(5 1 1 2 3)
= F(4 8 5 2 3)
= F(2 8 5 13 21)
= F(1 8 5 610 987)
= F(0 17711 10946 610 987)
= 10946
When I saw this, I thought it might be better to combine the even and odd cases (since odd minus one is even), so I wrote
(defun fib2 (n &optional (a 1) (b 0) (p 0) (q 1))
  (if (zerop n) b
      (fib2 (ash n -1)
            (if (evenp n) a (+ (* b q) (* a (+ p q))))
            (if (evenp n) b (+ (* b p) (* a q)))
            (+ (* p p) (* q q))
            (+ (* q q) (* 2 p q)))))
hoping this would make it faster, as the equations above now become
F(21)
= F(21 1 0 0 1)
= F(10 1 1 1 1)
= F(5 1 1 2 3)
= F(2 8 5 13 21)
= F(1 8 5 610 987)
= F(0 17711 10946 1346269 2178309)
= 10946
However, it turned out to be much slower (taking about 50% more time in e.g. Clozure CL, CLISP and LispWorks) when I checked the time needed for Fib(1000000) with the following code. (Ignore the progn; I just don't want my screen filled with numbers.)

(time (progn (fib1 1000000) ()))
(time (progn (fib2 1000000) ()))
I can only see that fib2 may do more evenp checks than fib1, so why is it that much slower?
EDIT: I think n.m. got it right, and I have edited the second group of formulae accordingly. E.g. in the example of F(21) above, fib2 actually computes F(31) and F(32) in p and q, which are never used. So for F(1000000), fib2 computes F(1048575) and F(1048576).
Lazy evaluation rocks; that's a very good point. I guess in Common Lisp only some macros like and and or are evaluated lazily?
The following modified fib2 (defined for n>0) actually runs faster:
(defun fib2 (n &optional (a 1) (b 0) (p 0) (q 1))
  (if (= n 1) (+ (* b p) (* a q))
      (fib2 (ash n -1)
            (if (evenp n) a (+ (* b q) (* a (+ p q))))
            (if (evenp n) b (+ (* b p) (* a q)))
            (+ (* p p) (* q q))
            (+ (* q q) (* 2 p q)))))
Insert printing of the intermediate results. Pay attention to p and q towards the end of the computation.
You will notice that fib2 computes much larger values for p and q at the last step. These two values account for all the performance difference.
The ironic thing is that these expensive values are unused. This is why Haskell doesn't suffer from this performance problem: the unused values are not actually computed.
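To make the "unused values" point concrete, here is a hypothetical sketch (mine, in Scheme rather than Common Lisp) that wraps p and q in promises, so the expensive final pair is never forced:
(define (fib2-lazy n a b p q)
  ;; p and q are promises; they are forced only when a value is needed
  (if (zero? n)
      b
      (let ((pv (force p)) (qv (force q)))
        (fib2-lazy (quotient n 2)
                   (if (even? n) a (+ (* b qv) (* a (+ pv qv))))
                   (if (even? n) b (+ (* b pv) (* a qv)))
                   (delay (+ (* pv pv) (* qv qv)))
                   (delay (+ (* qv qv) (* 2 pv qv)))))))
;; (fib2-lazy 21 1 0 (delay 0) (delay 1)) ; => 10946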
If nothing else, fib2 has more conditionals (while computing the arguments). That may well change how the code flow is done. Conditionals imply jumps, and jumps imply pipeline stalls.
It would probably be instructive to look at the generated code (try (disassemble #'fib1) and (disassemble #'fib2)) and see if there are any blatant differences. It might also be worth changing the optimization settings; there are usually a fair few optimizations that are not done unless you request heavy optimization for speed.

Fermat factorization method limit

I am trying to implement Fermat's factorization (Algorithm C in The Art of Computer Programming, Vol. 2). Unfortunately, in my edition (ISBN 81-7758-335-2) this algorithm is printed incorrectly. What should the condition on the factor-inner loop below be? I am running the loop until y <= n [passed in as limit].
(if (< limit y) 0 (factor-inner x (+ y 2) (- r y) limit))
Is there any way to avoid this condition altogether? That would double the speed of the loop.
(define (factor n)
  (let ((square-root (inexact->exact (floor (sqrt n)))))
    (factor-inner (+ (* 2 square-root) 1)
                  1
                  (- (* square-root square-root) n)
                  n)))

(define (factor-inner x y r limit)
  (if (= r 0)
      (/ (- x y) 2)
      (begin
        (display x) (display " ") (display y) (display " ") (display r) (newline)
        ;;(sleep-current-thread 1)
        (if (< r 0)
            (factor-inner (+ x 2) y (+ r x) limit)
            (if (< limit y)
                0
                (factor-inner x (+ y 2) (- r y) limit))))))
The (< limit y) check is not necessary because, in the worst case, the algorithm will eventually find this solution:

x = N + 2
y = N

(Then (x - y)/2 = 1 and (x + y - 2)/2 = N, so r = 1*N - N = 0.) It will then return 1.
Looking through Algorithm C, it looks like the issue is with the recursion step, which effectively skips step C4 whenever r < 0, because x is not incremented and r is only decremented by y.
Using the notation of a, b and r from the 1998 edition of Vol. 2 (ISBN 0-201-89684-2), a Scheme version would be as follows:
(define (factor n)
  (let ((x (inexact->exact (floor (sqrt n)))))
    (factor-inner (+ (* x 2) 1)
                  1
                  (- (* x x) n))))

(define (factor-inner a b r)
  (cond ((= r 0) (/ (- a b) 2))
        ((< 0 r) (factor-inner a (+ b 2) (- r b)))
        (else (factor-inner (+ a 2) (+ b 2) (+ r (- a b))))))
EDIT to add: Basically, we are doing a trick that repeatedly checks whether

r = ((a - b)/2) * ((a + b - 2)/2) - N

is 0, and we're doing it by simply tracking how r changes when we increment a or b. If we were to set b to b+2 in the expression for r above, it would be equivalent to reducing r by the old value of b, which is why both are done in parallel in step C4 of the algorithm. I encourage you to expand out the algebraic expression above and convince yourself that this is true.
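Spelling that expansion out (this step is mine, not in the original answer): write u = (a - b)/2 and v = (a + b - 2)/2, so r = u*v - N. Replacing b by b + 2 turns (u, v) into (u - 1, v + 1), and (u - 1)(v + 1) - N = r + (u - v - 1) = r - b. Replacing a by a + 2 turns (u, v) into (u + 1, v + 1), and (u + 1)(v + 1) - N = r + (u + v + 1) = r + a. Doing both at once, as the code above does, leaves u alone and turns v into v + 2, changing r by 2u = a - b.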
As long as r > 0, you want to keep decreasing it to find the right value of b, so you keep repeating step C4. However, if you overshoot and r < 0, you need to increase it. You do this by increasing a, because increasing a by 2 is equivalent to increasing r by the old value of a, as in step C3. You will always have a > b, so increasing r by a in step C3 automatically makes r positive again, so you just proceed directly on to step C4.
It's also easy to prove that a > b. We start with a manifestly greater than b, and if we ever increase b to the point where b = a - 2, we have

N = ((a - (a - 2))/2) * ((a + (a - 2) - 2)/2) = 1 * (a - 2)

This means that N is prime, as the largest factor it has that is less than sqrt(N) is 1, and the algorithm has terminated.
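As a concrete check (my worked trace, not from the original answer), factoring N = 8051 = 83 * 97:
(factor 8051)
; a=179 b=1  r=-130  ; floor(sqrt 8051) = 89, and 89^2 - 8051 = -130
; a=181 b=3  r=48    ; r < 0: a and b step up, r grows by a - b = 178
; a=181 b=5  r=45    ; r > 0: b steps up, r shrinks by the old b
; a=181 b=7  r=40
; a=181 b=9  r=33
; a=181 b=11 r=24
; a=181 b=13 r=13
; a=181 b=15 r=0     ; done: (/ (- 181 15) 2) => 83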
