I have this code to calculate derivatives:
(define (diff x expr)
  (if (not (list? expr))
      (if (equal? x expr) 1 0)
      (let ((u (cadr expr)) (v (caddr expr)))
        (case (car expr)
          ((+) (list '+ (diff x u) (diff x v)))
          ((-) (list '- (diff x u) (diff x v)))
          ((*) (list '+
                     (list '* u (diff x v))
                     (list '* v (diff x u))))
          ((/) (list 'div (list '-
                                (list '* v (diff x u))
                                (list '* u (diff x v)))
                     (list '* v v)))))))
How can I simplify algebraic expressions? For example, instead of x + x, show 2x, and instead of x * x, show x^2.
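For instance, with the code above, (diff 'x '(* x x)) evaluates to the unsimplified expression
(+ (* x 1) (* x 1))
which I'd like to see displayed as 2x.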
Simplification of algebraic expressions is quite hard, especially compared to the computation of derivatives. Simplification should be done recursively: you simplify the innermost expressions first. Don't try too much at a time. I'd start with the most basic simplifications only, e.g.:
0 + x -> x
0 * x -> 0
1 * x -> x
x ^ 0 -> 1
x ^ 1 -> x
Replace subtraction by addition and division by multiplication:
x - y -> x + (-1)*y
x / y -> x * y^(-1)
This may not look like a simplification, but it will simplify your code. You can always reverse this step at the end.
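A minimal sketch of that rewriting pass, assuming expressions are binary prefix lists like the ones diff produces (the name normalize is made up here for illustration):
(define (normalize expr)
  (if (not (list? expr))
      expr
      (let ((op (car expr))
            (u (normalize (cadr expr)))
            (v (normalize (caddr expr))))
        (case op
          ((-) (list '+ u (list '* -1 v)))     ; x - y -> x + (-1)*y
          ((/) (list '* u (list '^ v -1)))     ; x / y -> x * y^(-1)
          (else (list op u v))))))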
Then you use associativity and commutativity to sort your terms. Move numerical values to the left side, sort variables by some predefined order (it doesn't have to be alphabetical but it should always be the same)
(x * y) * z -> x * (y * z)
x * 2 -> 2 * x
2 * (x * 3) -> 2 * (3 * x)
Simplify exponents
(x ^ y) ^ z -> x^(y * z)
Simplify the numerical parts.
2 * (3 * x) -> 6 * x
2 + (3 + x) -> 5 + x
Once you have done this you can think about collecting common expressions.
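Putting the basic identities and the constant folding described above together, a minimal recursive pass might look something like this (a sketch only; it assumes binary prefix expressions built from the operators +, * and ^):
(define (simp expr)
  (if (not (list? expr))
      expr
      (let ((op (car expr))
            (u (simp (cadr expr)))   ; simplify the innermost parts first
            (v (simp (caddr expr))))
        (cond ((and (eq? op '+) (equal? u 0)) v)          ; 0 + x -> x
              ((and (eq? op '+) (equal? v 0)) u)          ; x + 0 -> x
              ((and (eq? op '*) (or (equal? u 0)
                                    (equal? v 0))) 0)     ; 0 * x -> 0
              ((and (eq? op '*) (equal? u 1)) v)          ; 1 * x -> x
              ((and (eq? op '*) (equal? v 1)) u)          ; x * 1 -> x
              ((and (eq? op '^) (equal? v 0)) 1)          ; x ^ 0 -> 1
              ((and (eq? op '^) (equal? v 1)) u)          ; x ^ 1 -> x
              ((and (number? u) (number? v))              ; fold constants
               (case op
                 ((+) (+ u v))
                 ((*) (* u v))
                 ((^) (expt u v))
                 (else (list op u v))))
              (else (list op u v))))))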
Perhaps this code from PAIP will be useful. It's Common Lisp, which is quite similar to Scheme. The entry point is the simp function.
Note that it also uses this file.
Historical note: the relevant PAIP chapter refers to Macsyma, the computer algebra system developed at MIT in the 1960s, which influenced later tools such as Mathematica and Maple (whose engine was for a time used inside Matlab's Symbolic Math Toolbox).
Here's a start:
Change your derivative function from (define (diff x expr) ...) to something like (define (simp expr) ...).
For the x + x case, do something like:
(case (car expr)
  ((+) (if (equal? u v)              ; check for identical subexpressions
           `(* ,(simp u) 2)          ; if u == v, simplify to 2u
           `(+ ,(simp u) ,(simp v))))
  ...)
The x * x case should be similar. Eventually you may want to convert the if into a cond if you need to do a lot of different simplifications.
This is a Hard Problem to solve completely and the links eliben gives are worth looking at.
The general problem is hard, but you can get a long way with a sum-of-products normal form, represented as a finite map from key (variable name) to coefficient. This form is great for linear equations and linear solving, and it can be extended to multiplication and powers without too much trouble. If you define "smart constructors" for arithmetic on this form you can get some reasonable symbolic differentiation and equation solving going.
This is a quick hack, but I've used it in a few applications. In several it worked well; a couple of times it wasn't good enough. For something more serious you're talking years of work.
If you want code examples you can read about one of my equation solvers.
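As a rough illustration of the idea (a hedged sketch, not the solver mentioned above): represent a linear form as an association list from variable name to coefficient, with the constant term stored under a made-up key 'const, and give it "smart constructor" operations:
(define (constant k) (list (cons 'const k)))
(define (variable v) (list (cons v 1) (cons 'const 0)))

(define (add-term form var coeff)
  (let ((entry (assq var form)))
    (if entry
        (map (lambda (pair)
               (if (eq? (car pair) var)
                   (cons var (+ (cdr pair) coeff))
                   pair))
             form)
        (cons (cons var coeff) form))))

;; the "smart constructors" for the arithmetic:
(define (add-forms f g)                       ; f + g
  (let loop ((g g) (acc f))
    (if (null? g)
        acc
        (loop (cdr g) (add-term acc (caar g) (cdar g))))))

(define (scale-form k f)                      ; k * f
  (map (lambda (pair) (cons (car pair) (* k (cdr pair)))) f))

;; Example: 2x + 3
;; (add-forms (scale-form 2 (variable 'x)) (constant 3))
;;   => ((x . 2) (const . 3))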
I'm going through the exercises in [SICP][1] and am wondering if someone can explain the difference between these two seemingly equivalent functions that give different results. Is this because of rounding? I thought the order of the functions shouldn't matter here, but somehow it does. Can someone explain what's going on and why the results differ?
Details:
Exercise 1.45: ..saw that finding a fixed point of y => x/y does not
converge, and that this can be fixed by average damping. The same
method works for finding cube roots as fixed points of the
average-damped y => x/y^2. Unfortunately, the process does not work
for fourth roots—a single average damp is not enough to make a
fixed-point search for y => x/y^3 converge.
On the other hand, if we
average damp twice (i.e., use the average damp of the average damp of
y => x/y^3) the fixed-point search does converge. Do some experiments
to determine how many average damps are required to compute nth roots
as a fixed-point search based upon repeated average damping of y => x/y^(n-1).
Use this to implement a simple procedure for computing the roots
using fixed-point, average-damp, and the repeated procedure
of Exercise 1.43. Assume that any arithmetic operations you need are
available as primitives.
My answer (note order of repeat and average-damping):
(define (nth-root-me x n num-repetitions)
  (fixed-point (repeat (average-damping (lambda (y)
                                          (/ x (expt y (- n 1)))))
                       num-repetitions)
               1.0))
I saw an alternate web solution where repeat is applied directly to average-damping, and the resulting function is then applied to the lambda:
(define (nth-root-web-solution x n num-repetitions)
  (fixed-point
   ((repeat average-damping num-repetitions)
    (lambda (y) (/ x (expt y (- n 1)))))
   1.0))
Now, calling both of these, there seems to be a difference in the answers and I can't understand why. My understanding is that the order of the functions shouldn't affect the output (they're associative, right?), but clearly it does!
> (nth-root-me 10000 4 2)
10.050110705350287
> (nth-root-web-solution 10000 4 2)
10.0
I did more tests and it's always like this: my answer is close, but the other answer is almost always closer. Can someone explain what's going on? Why aren't these equivalent? My guess is that the order of calling these functions matters, but they seem associative to me.
For example:
(repeat (average-damping (lambda (y) (/ x (expt y (- n 1)))))
        num-repetitions)
vs
((repeat average-damping num-repetitions)
 (lambda (y) (/ x (expt y (- n 1)))))
Other helper functions:
(define tolerance 0.00001)               ; not shown in the post; SICP's usual value
(define (average a b) (/ (+ a b) 2))     ; likewise assumed

(define (fixed-point f first-guess)
  (define (close-enough? v1 v2)
    (< (abs (- v1 v2))
       tolerance))
  (let ((next-guess (f first-guess)))
    (if (close-enough? next-guess first-guess)
        next-guess
        (fixed-point f next-guess))))

(define (average-damping f)
  (lambda (x) (average x (f x))))

(define (repeat f k)
  (define (repeat-helper f k acc)
    (if (<= k 1)
        acc
        ;; compose the original function with the accumulated composition
        (repeat-helper f (- k 1) (compose f acc))))
  (repeat-helper f k f))

(define (compose f g)
  (lambda (x)
    (f (g x))))
You are asking why “two seemingly equivalent functions” produce a different result, but the two functions are in effect very different.
Let's try to simplify the problem to see why they are different. The only difference between the two functions is in these two expressions:
(repeat (average-damping (lambda (y) (/ x (expt y (- n 1)))))
        num-repetitions)

((repeat average-damping num-repetitions)
 (lambda (y) (/ x (expt y (- n 1)))))
In order to simplify our discussion, we assume num-repetitions is equal to 2, and use a simpler function than that lambda, for instance the following one:
(define (succ x) (+ x 1))
So the two different parts are now:
(repeat (average-damping succ) 2)
and
((repeat average-damping 2) succ)
Now, for the first expression, (average-damping succ) returns a numeric function that calculates the average between a parameter and its successor:
(define h (average-damping succ))
(h 3) ; => (3 + succ(3))/2 = (3 + 4)/2 = 3.5
So, the expression (repeat (average-damping succ) 2) is equivalent to:
(lambda (x) ((compose h h) x))
which is equivalent to:
(lambda (x) (h (h x)))
Again, this is a numeric function and if we apply this function to 3, we have:
((lambda (x) (h (h x))) 3) ; => (h 3.5) => (3.5 + 4.5)/2 = 4
In the second case, instead, we have (repeat average-damping 2) that produces a completely different function:
(lambda (x) ((compose average-damping average-damping) x))
which is equivalent to:
(lambda (x) (average-damping (average-damping x)))
You can see that the result this time is a higher-order function, not a numeric one: it takes a function x and applies the average-damping function to it twice. Let's verify this by applying this function to succ and then applying the result to the number 3:
(define g ((lambda (x) (average-damping (average-damping x))) succ))
(g 3) ; => 3.25
The difference in the result is not due to numeric approximation, but to a different computation. First, (average-damping succ) returns the function h, which computes the average between the parameter and its successor; then (average-damping h) returns a new function that computes the average between the parameter and the result of h. Given a number like 3, that function first calculates the average between 3 and 4, which is 3.5, and then the average between 3 (again the parameter) and 3.5 (the previous result), producing 3.25.
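For completeness, here is a small transcript reproducing both computations (a sketch; it assumes the question's repeat and average-damping, plus the average helper shown with them):
(define (succ x) (+ x 1))

((repeat (average-damping succ) 2) 3)   ; => 4     (h applied twice)
(((repeat average-damping 2) succ) 3)   ; => 13/4  (i.e. 3.25)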
The definition of repeat entails
((repeat f k) x) = (f (f (f (... (f x) ...))))
; 1 2 3 k
with k nested calls to f in total. Let's write this as
= ((f^k) x)
and also define
(define (foo n) (lambda (y) (/ x (expt y (- n 1)))))
; ((foo n) y) = (/ x (expt y (- n 1)))
Then we have
(nth-root-you x n k) = (fixed-point ((average-damping (foo n))^k) 1.0)
(nth-root-web x n k) = (fixed-point ((average-damping^k) (foo n)) 1.0)
So your version makes k steps with the once-average-damped (foo n) function on each iteration step performed by fixed-point; the web's version uses the k-times-average-damped (foo n) as its iteration step. Notice that no matter how many times it is used, a once-average-damped function is still average-damped only once, and using it several times is probably only going to exacerbate a problem, not solve it.
For k == 1 the two resulting iteration step functions are of course equivalent.
In your case k == 2, and so
(your-step y) = ((average-damping (foo n))
                 ((average-damping (foo n)) y))   ; and,
(web-step y)  = ((average-damping (average-damping (foo n))) y)
Since
((average-damping f) y) = (average y (f y))
we have
(your-step y) = ((average-damping (foo n))
                 (average y ((foo n) y)))
              = (let ((z (average y ((foo n) y))))
                  (average z ((foo n) z)))

(web-step y)  = (average y ((average-damping (foo n)) y))
              = (average y (average y ((foo n) y)))
              = (+ (* 0.5 y) (* 0.5 (average y ((foo n) y))))
              = (+ (* 0.75 y) (* 0.25 ((foo n) y)))
;; and in general:
;; = ((2^k - 1)/2^k) * y + (1/2^k) * ((foo n) y)
The difference is clear. Average damping is used to dampen the possibly erratic jumps of (foo n) at certain ys, and the higher the k the stronger the damping effect, as is clearly seen from the last formula.
(Details of my miniKanren in Racket setup appear at the bottom[1].)
The way quotes and unquotes work in The Reasoned Schemer appears not to match the way they work in Racket. For instance, verse 2 of chapter 2 suggests[2] the following query:
(run #f (r)
  (fresh (y x)
    (== '(,x ,y) r)))
If I evaluate that, I get '((,x ,y)). If instead I rewrite it as this:
(run #f (r)
  (fresh (y x)
    (== (list x y) r)))
I get the expected result, '((_.0 _.1)).
This might seem like a minor problem, but in many cases the required translation is extremely verbose. For instance, in exercise 45 of chapter 3 (page 34), the book provides roughly[3] the following query:
(run 5 (r)
  (fresh (w x y z)
    (loto (('g 'g) ('e w) (x y) . z))
    (== (w (x y) z) r)))
In order to get the results they get, I had to rewrite it like this:
(run 5 (r)
  (fresh (w x y z)
    (loto (cons '(g g)
                (cons (list 'e w)
                      (cons (list x y)
                            z))))
    (== (list w (list x y) z)
        r)))
[1] As described here, I ran raco pkg install minikanren and then defined a few missing pieces.
[2] Actually, they don't write precisely that, but if you heed the advice in the footnotes to that verse and an earlier verse, it's what you get.
[3] Modulo some implicit quoting and unquoting that I cannot deduce.
Use the backquote ` instead of the simple quote ' you have been using.
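For instance, the first query from the question becomes (quasiquote lets the ,x and ,y unquotes refer to the fresh variables, so it behaves like the (list x y) version):
(run #f (r)
  (fresh (y x)
    (== `(,x ,y) r)))   ; => '((_.0 _.1))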
I am having trouble: my code feels incomplete and plain wrong. For my function (terms-needed x tol) I am supposed to find the smallest k such that the difference between x and (square (babylonian x k)) is less than tol (the tolerance). In other words, we are supposed to measure how large k needs to be in (babylonian x k) to provide a good approximation of the square root.
As of right now I am getting an "application: not a procedure" error with my code:
(define (square x)
(* x x))
(define (first-value-k-or-higher x tol k)
(if (<= (x)
(square (babylonian x k)) tol)
k)
(first-value-k-or-higher x tol (+ k 1))
)
(define (terms-needed x tol)
(first-value-k-or-higher x tol 1))
We are supposed to use a helper function (first-value-k-or-higher x tol k) that evaluates to k if (square (babylonian x k)) is within tol of the argument x, and otherwise calls itself recursively with a larger k.
This is the function that is needed to make (terms-needed x tol) work:
(define (babylonian x k)
(if (>= x 1)
(if (= k 0)
(/ x 2)
(* (/ 1 2) (+ (expt x (/ 1 2)) (/ x (expt x (/ 1 2))))))
1)
)
Here is the full text, providing the full context on what the problem is supposed to be.
We will now measure how large k needs to be in the above function to provide a good approximation
of the square root. You will write a SCHEME function (terms-needed x tol) that will
evaluate to the number of terms in the infinite sum needed to be within tol, that is, the smallest
k such that the difference between x and (square (babylonian x k)) is less than tol.
Remark 2. At first glance, the problem of defining (terms-needed x tol) appears a little challenging, because it's not at all obvious how to express it in terms of a smaller problem. But you might consider writing a helper function (first-value-k-or-higher x tol k) that evaluates to k if (square (babylonian x k)) is within tol of the argument x, and otherwise calls itself recursively with a larger k.
You have several problems.
First, you have parentheses around x in
(if (<= (x)
That's causing the error you're seeing, because it's trying to call the function named x, but x names a number, not a function.
Second, you're not calculating the difference between x and (square (babylonian x k)). Instead, you gave 3 arguments to <=.
Third, you're not making the recursive call when the comparison fails. It's outside the if, so it's being done all the time (if you make use of an editor's automatic indentation feature, you might have noticed this problem yourself).
Fourth, you need to get the absolute value of the difference, not just the difference itself. Otherwise, if the difference is a large negative number, you'll consider it within the tolerance, which it shouldn't be.
(define (first-value-k-or-higher x tol k)
  (if (<= (abs (- x (square (babylonian x k))))
          tol)
      k
      (first-value-k-or-higher x tol (+ k 1))))
I'm making a Scheme program that calculates
cos(x) = 1 - (x^2/2!) + (x^4/4!) - (x^6/6!) + ...
What's the most efficient way to finish the program, and how would you do the alternating addition and subtraction? That's what I used the modulo for, but it doesn't work for 0 and 1 (the first two terms). x is the initial value of x and num is the number of terms.
(define cosine-taylor
(lambda (x num)
(do ((i 0 (+ i 1)))
((= i num))
(if(= 0 (modulo i 2))
(+ x (/ (pow-tr2 x (* i 2)) (factorial (* 2 i))))
(- x (/ (pow-tr2 x (* i 2)) (factorial (* 2 i))))
))
x))
Your questions:
What's the most efficient way to finish the program? Assuming you want to use the Taylor series expansion and simply sum up the terms n times, then your iterative approach is fine. I've refined it below, but your algorithm is fine. Others have pointed out possible loss-of-precision issues; see below for my approach.
How would you do the alternating addition and subtraction? Use another argument/local variable odd?, a boolean, and have it alternate by using not. When odd? is true, subtract; when it is false, add.
(define (cosine-taylor x n)
  (let computing ((result 1) (i 1) (odd? #t))
    (if (> i n)
        result
        (computing ((if odd? - +) result (/ (expt x (* 2 i)) (factorial (* 2 i))))
                   (+ i 1)
                   (not odd?)))))
> (cos 1)
0.5403023058681398
> (cosine-taylor 1.0 100)
0.5403023058681397
Not bad?
The above is the Scheme-ish way of performing a 'do' loop. You should easily be able to see the correspondence to a do with three locals for i, result and odd?.
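For comparison, here is a sketch of the same computation written with do (factorial is not shown in the post, so the usual recursive definition is assumed):
(define (factorial n)
  (if (<= n 1) 1 (* n (factorial (- n 1)))))

(define (cosine-taylor-do x n)
  (do ((i 1 (+ i 1))
       (odd? #t (not odd?))
       (result 1 ((if odd? - +) result
                  (/ (expt x (* 2 i)) (factorial (* 2 i))))))
      ((> i n) result)))   ; all three state variables step in parallel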
Regarding loss of numeric precision - if you really want to solve the precision problem, then convert x to an 'exact' number and do all computation using exact numbers. By doing that, you get a natural, Scheme-ly algorithm with 'perfect' precision.
> (cosine-taylor (exact 1.0) 100)
3982370694189213112257449588574354368421083585745317294214591570720658797345712348245607951726273112140707569917666955767676493702079041143086577901788489963764057368985531760218072253884896510810027045608931163026924711871107650567429563045077012372870953594171353825520131544591426035218450395194640007965562952702049286379961461862576998942257714483441812954797016455243/7370634274437294425723020690955000582197532501749282834530304049012705139844891055329946579551258167328758991952519989067828437291987262664130155373390933935639839787577227263900906438728247155340669759254710591512748889975965372460537609742126858908788049134631584753833888148637105832358427110829870831048811117978541096960000000000000000000000000000000000000000000000000
> (inexact (cosine-taylor (exact 1.0) 100))
0.5403023058681398
We should calculate the terms in an iterative fashion to prevent the loss of precision from dividing very large numbers:
(define (cosine-taylor-term x)
  (let ((t 1.0) (k 0))
    (lambda (msg)
      (case msg
        ((peek) t)
        ((pull)
         (let ((p t))
           (set! k (+ k 2))
           (set! t (* (- t) (/ x (- k 1)) (/ x k)))
           p))))))
Then it should be easy to build a function to produce an n-th term, or to sum the terms up until a term is smaller than a pre-set precision value:
(define t (cosine-taylor-term (atan 1)))
;Value: t
(reduce + 0 (map (lambda(x)(t 'pull)) '(1 2 3 4 5)))
;Value: .7071068056832942
(cos (atan 1))
;Value: .7071067811865476
(t 'peek)
;Value: -2.4611369504941985e-8
A few suggestions:
reduce your input modulo 2pi - most polynomial expansions converge very slowly with large numbers
Keep track of your factorials rather than computing them from scratch each time (once you have 4!, you get 5! by multiplying by 5, etc)
Similarly, all your powers are powers of x^2. Compute x^2 just once, then multiply the "x power so far" by this number (x2), rather than taking x to the n'th power
Here is some Python code that implements this - it converges with very few terms (and you can control the precision with the while(abs(delta) > precision): statement):
from math import *

def myCos(x):
    precision = 1e-5  # pick whatever you need
    xr = (x+pi/2) % (2*pi)
    if xr > pi:
        sign = -1
    else:
        sign = 1
    xr = (xr % pi) - pi/2
    x2 = xr * xr
    xp = 1
    f = 1
    c = 0
    ans = 1
    temp = 0
    delta = 1
    while(abs(delta) > precision):
        c += 1
        f *= c
        c += 1
        f *= c
        xp *= x2
        temp = xp / f
        c += 1
        f *= c
        c += 1
        f *= c
        xp *= x2
        delta = xp/f - temp
        ans += delta
    return sign * ans
Other than that I can't help you much, as I am not familiar with Scheme...
For your general enjoyment, here is a stream implementation. The stream returns an infinite sequence of Taylor terms based on the provided func. The func is called with the current index.
(define (stream-taylor func)
  (stream-map func (stream-from 0)))

(define (stream-cosine x)
  (stream-taylor (lambda (n)
                   (if (zero? n)
                       1
                       (let ((odd? (= 1 (modulo n 2))))
                         ;; Use `exact` if desired...
                         ;; and see #WillNess above; save 'last'; use for next; avoid expt/factorial
                         ((if odd? - +) (/ (expt x (* 2 n)) (factorial (* 2 n)))))))))
> (stream-fold + 0 (stream-take 10 (stream-cosine 1.0)))
0.5403023058681397
Here's the most streamlined function I could come up with.
It takes advantage of the fact that every term is multiplied by (-x^2) and divided by (i+1)*(i+2) to come up with the next term.
It also takes advantage of the fact that we are computing factorials of 2, 4, 6, etc. So it increments the position counter by 2 and compares it with 2*N to stop iteration.
(define (cosine-taylor x num)
  (let ((mult (* x x -1))
        (twice-num (* 2 num)))
    (define (helper iter prev-term prev-out)
      (if (= iter twice-num)
          (+ prev-term prev-out)
          (helper (+ iter 2)
                  (/ (* prev-term mult) (+ iter 1) (+ iter 2))
                  (+ prev-term prev-out))))
    (helper 0 1 0)))
Tested at repl.it.
Here are some answers:
(cosine-taylor 1.0 2)
=> 0.5416666666666666
(cosine-taylor 1.0 4)
=> 0.5403025793650793
(cosine-taylor 1.0 6)
=> 0.5403023058795627
(cosine-taylor 1.0 8)
=> 0.5403023058681398
(cosine-taylor 1.0 10)
=> 0.5403023058681397
(cosine-taylor 1.0 20)
=> 0.5403023058681397
This isn't a homework question, I'm just left unsatisfied with my understanding of interval arithmetic and the implications of exercise 2.16.
The interval arithmetic defined in section 2.1.4 does not exhibit the properties of normal arithmetic. Two operations that should be equivalent, (r1*r2)/(r1 + r2) and 1/(1/r1 + 1/r2), give different results. The exercise asks why this is the case, and whether it is possible to construct an interval-arithmetic system in which this is not the case.
The section is about calculating the error margins of the resistance of electrical components. I'm not sure I understand what it would mean, in those terms, to multiply and divide intervals. What is the application of multiplying two intervals together?
Is it possible to construct an interval-arithmetic system without the problem in this example?
http://mitpress.mit.edu/sicp/full-text/book/book-Z-H-14.html#%_sec_2.1.4
(define (make-interval a b)
  (cons a b))
(define (make-center-width c w)
  (make-interval (- c w) (+ c w)))
(define (make-center-percent c p)
  (make-center-width c (* c (/ p 100.0))))
(define (lower-bound i)
  (car i))
(define (upper-bound i)
  (cdr i))
(define (center i)
  (/ (+ (upper-bound i) (lower-bound i)) 2))
(define (width i)
  (/ (- (upper-bound i) (lower-bound i)) 2))
(define (percent i)
  (* 100.0 (/ (width i) (center i))))
(define (add-interval x y)
  (make-interval (+ (lower-bound x) (lower-bound y))
                 (+ (upper-bound x) (upper-bound y))))
(define (sub-interval x y)
  (make-interval (- (lower-bound x) (lower-bound y))
                 (- (upper-bound x) (upper-bound y))))
(define (mul-interval x y)
  (let ((p1 (* (lower-bound x) (lower-bound y)))
        (p2 (* (lower-bound x) (upper-bound y)))
        (p3 (* (upper-bound x) (lower-bound y)))
        (p4 (* (upper-bound x) (upper-bound y))))
    (make-interval (min p1 p2 p3 p4)
                   (max p1 p2 p3 p4))))
(define (div-interval x y)
  (if (= (width y) 0)
      (error "division by interval with width 0")
      (mul-interval x
                    (make-interval (/ 1.0 (upper-bound y))
                                   (/ 1.0 (lower-bound y))))))
(define (parl1 r1 r2)
  (div-interval (mul-interval r1 r2)
                (add-interval r1 r2)))
(define (parl2 r1 r2)
  (let ((one (make-interval 1 1)))
    (div-interval one
                  (add-interval (div-interval one r1)
                                (div-interval one r2)))))
(define r1 (make-interval 4.0 3.2))
(define r2 (make-interval 3.0 7.2))
(center (parl1 r1 r2))
(width (parl1 r1 r2))
(newline)
(center (parl2 r1 r2))
(width (parl2 r1 r2))
This happens because the operations of interval arithmetic do not have the arithmetic structure of a field.
As Sussman says, the exercise is difficult -- you need to check each axiom of the field structure and see which ones are not satisfied.
The exercise asks us to show that the interval arithmetic is not the arithmetic of the ranges of functions.
A function like f(x) = x^2 defined on the domain [-1, 1] has the range [0, 1], which is strictly included in [-1, 1] * [-1, 1] = [-1, 1], the interval obtained by replacing the symbol x by its domain.
If we define a similar function that uses a different variable for each dimension, like f(x, y) = x * y, then the range of this function on the domain [-1, 1] * [-1, 1] is exactly the interval [-1, 1] * [-1, 1] = [-1, 1], because x is used only once, and so is y.
In general, when a function f(..., x, ...) is continuous in each variable, range arithmetic agrees with interval arithmetic whenever each symbol is used only once in the definition of f.
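A small illustration using the (corrected) mul-interval from the question: squaring the interval [-1, 1] by interval multiplication treats the two factors as independent, so the result is wider than the true range of x^2.
(define i (make-interval -1 1))
(mul-interval i i)   ; => (-1 . 1), although the range of x^2 on [-1, 1] is only [0, 1]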
In the first formula for the parallel resistors (parl1 above), the variable R1 is repeated twice and the variable R2 is repeated twice. By the same argument, the range of that function is included in the interval obtained from the formula by replacing each name by its corresponding domain interval, but it is not exactly the same.
We are asked either to rewrite each function so that the range of the rewritten function is the same as the interval obtained by evaluating its formula with every name replaced by the interval that is that name's domain, or to show that this cannot be done for every function.
This problem is called the dependency problem; it is a large problem, whose full understanding is beyond the scope of SICP and requires differential equations in multiple variables.
The purpose of this exercise is, as Sussman himself said, just to show that data can be encoded in multiple ways. The focus is not on mathematics, but on data abstraction.
The easiest way to understand the problem is to look at the very simple expression e = x/x, where x is in [2, 3]. If we use interval arithmetic, we get that e is in [2/3, 3/2]; however, basic algebra shows that e = x/x = 1. So what gives?
Well, it is actually very simple: when using interval arithmetic I made the mistake of assuming x can have two different values at the same time.
The maximum value of e would be reached when the numerator is 3 and the denominator is 2, but since both must always be the same value, this is not possible.
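With the interval package from the question, the effect is easy to reproduce (a small illustration):
(define x (make-interval 2 3))
(div-interval x x)   ; => approximately (0.667 . 1.5), i.e. [2/3, 3/2] rather than [1, 1]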
So, is it ever possible to use interval arithmetic? Yes: when every interval appears only once, you do not have the problem of the same variable taking different values in different parts of the calculation.
Is it possible to create an arithmetic package which does not have this problem? No, since not every function can be rewritten so that every variable appears only once. This problem is known as the dependency problem.