I have a project for school for which I'm supposed to optimize a compiler/evaluator for Scheme.
The task is to implement tail-call optimization wherever possible.
I'm aware of the known tail-call optimization as shown below:
(define (f x)
  <send some rockets into space>
  (f (+ x 1)))
However, I'm thinking about evaluating operands in tail position as well. Suppose the following:
; The function
(define (f a b c)
  <send some monkeys into space>
  1)
; Call the function with (f 3 4 5)
(f (+ 1 2) (begin (define x 4) x) 5)
Evaluating the operands (+ 1 2), (begin (define x 4) x), and 5 could be done in tail position, right?
Each of the operands is evaluated in its own environment. I tried this by using the regular R5RS language in DrRacket with the following expression:
(+ (begin (define x 5) x) x)
If the operands were evaluated in the same environment, I would be able to use the x defined in the first operand as the second operand. This is, however, not possible.
So, is it correct for me to assume that each operand can be evaluated in tail position?
"Tail position" is always relative to some outer expression. For example, consider this:
(define (norm . args)
  (define (sum-of-squares sum args)
    (if (null? args)
        sum
        (let ((arg (car args)))
          (sum-of-squares (+ sum (* arg arg)) (cdr args)))))
  (sqrt (sum-of-squares 0 args)))
The recursive call to sum-of-squares is indeed in tail position relative to sum-of-squares. But is it in tail position relative to norm? No, because the return value from sum-of-squares is sent to sqrt, not directly to norm's caller.
The key to analysing whether an expression A is in tail position relative to outer expression B, is to see whether A's return value is directly returned by B, without any further processing.
In your case, with your expression (f (+ 1 2) (begin (define x 4) x) 5) (which isn't actually valid, by the way: perhaps you meant (f (+ 1 2) (let () (define x 4) x) 5) instead), none of the subexpressions (+ 1 2), (let () (define x 4) x), and 5 is in tail position with respect to f. Their values have to be collected first, and then passed to f (which is a tail call).
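To make "collected first, then passed" concrete, here is a rough hand expansion of that call (the names a, b, and c are just for illustration; this is only a sketch of the evaluation, not something the standard mandates):
(let ((a (+ 1 2))                   ; operand evaluations: NOT tail positions
      (b (let () (define x 4) x))
      (c 5))
  (f a b c))                        ; only this call is in tail position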
None of the operands of an application (op1 op2 ...) is in tail position.
For R5RS Scheme you can see the position in which an application occurs in a tail context here:
https://groups.csail.mit.edu/mac/ftpdir/scheme-reports/r5rs-html.old/r5rs_22.html
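To make the rule concrete, here is a small sketch based on the R5RS tail-context rules linked above (the procedure names are made up for illustration):
(define (last-element xs)
  (if (null? (cdr xs))              ; the test expression is NOT in tail position
      (car xs)                      ; both branches of an if ARE tail positions...
      (last-element (cdr xs))))     ; ...so this recursive call is a tail call

(define (add-first xs acc)
  (+ acc (car xs)))                 ; acc and (car xs) are operands, not tail calls;
                                    ; only the outer call to + is in tail position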
So I finally figured it out.
In regular R6RS, operands can never be evaluated in tail position, because R6RS specifies no strict order in which they are evaluated.
However, in this self-built evaluator for Scheme I do specify the order in which they are evaluated. Ergo, I can strictly define which operand is evaluated last, and that one can be evaluated in tail position.
Related
I have the following accumulate function:
; accumulate
(define (accumulate op initial sequence)
  (if (null? sequence)
      initial
      (op (car sequence) (accumulate op initial (cdr sequence)))))
I'm trying to write a length function to get the length of a sequence using the accumulate function.
For the function to plug into accumulate, why is it (+ y 1) instead of (+ x 1)? That's the part I can't figure out:
(define (length sequence)
  (accumulate (lambda (x y) (+ x 1)) ; wrong
              0
              sequence))

(define (length sequence)
  (accumulate (lambda (x y) (+ y 1)) ; correct
              0
              sequence))
Your problem is that the names x and y don't tell you anything about what they are. However, if you look at accumulate, you can see how op is called:
(op (car sequence)                          ; first argument is the element
    (accumulate op initial (cdr sequence))) ; second argument is the accumulated value
While it doesn't really look that way, imagine that the second argument is the result of calling accumulate on the empty sequence. Then you get this:
(op (car sequence)
    initial)
So let's make length:
(define (length sequence)
  (accumulate (lambda (element initial)
                ;; initial is often also called acc / accumulator
                (+ 1 initial))
              0
              sequence))
So the answer is that the first argument is the individual element, while the second is either the initial value (0) or the previously calculated value, which is 0 with as many 1s added as the tail of the sequence has elements; in other words, a number. The reason you don't use the first argument is that you can't use "a" or whatever else the list contains to count elements: you need to count the elements, not use them as values. If you used the first argument and it happened to be a string, how would (+ "a" 0) help in finding out that the list has length 1?
If you use (lambda (x y) (+ x 1)) as op, then your length (or, to be precise, the accumulate) function will not use the result of the recursive calls to accumulate. It will essentially only do one computation, (+ x 1), where x is (car sequence), the first element of sequence. That one computation may or may not even make sense, depending on whether x is a number, and even if it did make sense, the answer would be wrong.
On the other hand, if op is (lambda (x y) (+ y 1)), then your function will replace
(op (car sequence) (accumulate op initial (cdr sequence)))
with
(+ (accumulate op initial (cdr sequence)) 1)
The recursion bottoms out with the computation (+ 0 1), so you ultimately get the length of the list, as each of the nested recursive calls to accumulate returns the length of its sub-list to its caller.
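To see this bottoming-out in action, here is a hand expansion (just a sketch using the definitions above) for a two-element list:
; (accumulate op 0 '(a b))          with op = (lambda (x y) (+ y 1))
; = (op 'a (accumulate op 0 '(b)))
; = (op 'a (op 'b (accumulate op 0 '())))
; = (op 'a (op 'b 0))
; = (op 'a (+ 0 1))
; = (+ (+ 0 1) 1)
; = 2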
Is there any difference between
(define make-point cons)
and
(define (make-point x y)
(cons x y))
?
Is one more efficient than the other, or are they totally equivalent?
There are a few different issues here.
As Oscar Lopez points out, one is an alias, and one is a wrapper (an indirection). Christophe De Troyer did some timing and noted that without optimization, the indirection can take twice as much time as the alias. That's because the alias makes the values of the two variables be the same function. When the system evaluates (cons …) and (make-point …), it evaluates the variables cons and make-point and gets the same function back. In the indirection version, make-point and cons are not the same function: make-point is a new function that makes another call to cons. That's two function calls instead of one. So speed can be an issue, but a good optimizing compiler might be able to make the difference negligible.
However, there's a very important difference if you have the ability to change the value of either of these variables later. When you evaluate (define make-point kons), you evaluate the variable kons once and set the value of make-point to that one value that you get at that evaluation time. When you evaluate (define (make-point x y) (kons x y)), you're setting the value of make-point to a new function. Each time that function is called, the variable kons is evaluated, so any change to the variable kons is reflected. Let's look at an example:
(define (kons x y)
  (cons x y))
(display (kons 1 2))
;=> (1 . 2)
Now, let's write an indirection and an alias:
(define (kons-indirection x y)
  (kons x y))

(define kons-alias kons)
These produce the same output now:
(display (kons-indirection 1 2))
;=> (1 . 2)
(display (kons-alias 1 2))
;=> (1 . 2)
Now let's redefine the kons function:
(set! kons (lambda (x y) (cons y x))) ; "backwards" cons
The function that was a wrapper around kons, that is, the indirection, sees the new value of kons, but the alias does not:
(display (kons-indirection 1 2))
;=> (2 . 1) ; NEW value of kons
(display (kons-alias 1 2))
;=> (1 . 2) ; OLD value of kons
Semantically they're equivalent: make-point will cons two elements. But the first one creates an alias of the cons function, whereas the second one defines a new function that simply calls cons, so it'll be slightly slower; the extra overhead is negligible, though, and may even be nonexistent if the compiler is good.
For cons, there is no difference between your two versions.
For variadic procedures like +, the difference between + and (lambda (x y) (+ x y)) is that the latter constrains the procedure to being called with two arguments only.
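For example (the names plus-alias and plus-wrapper are just for illustration):
(define plus-alias +)                 ; alias: still accepts any number of arguments
(define (plus-wrapper x y) (+ x y))   ; wrapper: exactly two arguments

(plus-alias 1 2 3)     ;=> 6
(plus-wrapper 1 2)     ;=> 3
;(plus-wrapper 1 2 3)  ;=> error: arity mismatch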
Out of curiosity I did a quick and dirty experiment. It seems to be the case that just aliasing cons is almost twice as fast as wrapping it in a new function.
(define mk-point cons)

(define (make-point x y)
  (cons x y))

(let ((start (current-inexact-milliseconds)))
  (let loop ((n 100000000))
    (mk-point 10 10)
    (if (> n 0)
        (loop (- n 1))
        (- (current-inexact-milliseconds) start))))

(let ((start (current-inexact-milliseconds)))
  (let loop ((n 100000000))
    (make-point 10 10)
    (if (> n 0)
        (loop (- n 1))
        (- (current-inexact-milliseconds) start))))
;;; Result
4141.373046875
6241.93212890625
Ran in DrRacket 5.3.6 on Xubuntu.
I have to define a variadic function in Scheme that takes the following form:
(define (n-loop procedure [a list of pairs (x,y)]) where the list of pairs can be any length.
Each pair specifies a lower and upper bound. That is, the following function call: (n-loop (lambda (x y) (inspect (list x y))) (0 2) (0 3)) produces:
(list x y) is (0 0)
(list x y) is (0 1)
(list x y) is (0 2)
(list x y) is (1 0)
(list x y) is (1 1)
(list x y) is (1 2)
Obviously, car and cdr are going to have to be involved in my solution. But the stipulation that makes this difficult is the following. There are to be no assignment statements or iterative loops (while and for) used at all.
I could handle it using while and for to index through the list of pairs, but it appears I have to use recursion. I don't want any code solutions, unless you feel it is necessary for explanation, but does anyone have a suggestion as to how this might be attacked?
The standard way to do looping in Scheme is to use tail recursion. In fact, let's say you have this loop:
(do ((a 0 b)
     (b 1 (+ a b))
     (i 0 (+ i 1)))
    ((>= i 10) a)
  (eprintf "(fib ~a) = ~a~%" i a))
This actually gets macro-expanded into something like the following:
(let loop ((a 0)
           (b 1)
           (i 0))
  (cond ((>= i 10) a)
        (else (eprintf "(fib ~a) = ~a~%" i a)
              (loop b (+ a b) (+ i 1)))))
Which, further, gets macro-expanded into this (I won't macro-expand the cond, since that's irrelevant to my point):
(letrec ((loop (lambda (a b i)
                 (cond ((>= i 10) a)
                       (else (eprintf "(fib ~a) = ~a~%" i a)
                             (loop b (+ a b) (+ i 1)))))))
  (loop 0 1 0))
You should be seeing the letrec here and thinking, "aha! I see recursion!". Indeed you do (specifically in this case, tail recursion, though letrec can be used for non-tail recursions too).
Any iterative loop in Scheme can be rewritten as that (the named let version is how loops are idiomatically written in Scheme, but if your assignment won't let you use named let, expand one step further and use the letrec). The macro-expansions I've described above are straightforward and mechanical, and you should be able to see how one gets translated to the other.
Since your question asked about variadic functions, you can write a variadic function this way:
(define (sum x . xs)
  (if (null? xs)
      x
      (apply sum (+ x (car xs)) (cdr xs))))
(This is, BTW, a horribly inefficient way to write a sum function; I am just using it to demonstrate how you would send (using apply) and receive (using an improper lambda list) arbitrary numbers of arguments.)
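For comparison, here is a sketch of the more usual way to consume the rest list, using the named let idiom from earlier, which avoids re-applying the function at every step:
(define (sum x . xs)
  (let loop ((acc x) (xs xs))   ; loop over the rest list, accumulating the total
    (if (null? xs)
        acc
        (loop (+ acc (car xs)) (cdr xs)))))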
Update
Okay, so here is some general advice: you will need two loops:
an outer loop, that goes through the range levels (that's your variadic stuff)
an inner loop, that loops through the numbers in each range level
In each of these loops, think carefully about:
what the starting condition is
what the ending condition is
what you want to do at each iteration
whether there is any state you need to keep between iterations
In particular, think carefully about the last point, as that is how you will nest your loops, given an arbitrary number of nesting levels. (In my sample solution below, that's what the cur variable is.)
After you have decided on all these things, you can then frame the general structure of your solution. I will post the basic structure of my solution below, but you should have a good think about how you want to go about solving the problem, before you look at my code, because it will give you a good grasp of what differences there are between your solution approach and mine, and it will help you understand my code better.
Also, don't be afraid to write it using an imperative-style loop first (like do), then transforming it to the equivalent named let when it's all working. Just reread the first section to see how to do that transformation.
All that said, here is my solution (with the specifics stripped out):
(define (n-loop proc . ranges)
  (let outer ((cur ???)
              (ranges ranges))
    (cond ((null? ranges) ???)
          (else (do ((i (caar ranges) (+ i 1)))
                    ((>= i (cadar ranges)))
                  (outer ??? ???))))))
Remember, once you get this working, you will still need to transform the do loop into one based on named let. (Or, you may have to go even further and transform both the outer and inner loops into their letrec forms.)
In the expression (call/cc (lambda (k) (k 12))), there are three continuations: (k 12), (lambda (k) (k 12)), and (call/cc (lambda (k) (k 12))). Which one is the "current continuation"?
Also, some books describe a continuation as a procedure that is waiting for a value and that returns immediately when it's applied to a value. Is that right?
Can anyone explain what current continuations are in detail?
Things like (k 12) are not continuations. There is a continuation associated with each subexpression in some larger program. So for example, the continuation of x in (* 3 (+ x 42)) is (lambda (_) (* 3 (+ _ 42))).
In your example, the "current continuation" of (call/cc (lambda (k) (k 12))) would be whatever is surrounding that expression. If you just typed it into a scheme prompt, there is nothing surrounding it, so the "current continuation" is simply (lambda (_) _). If you typed something like (* 3 (+ (call/cc (lambda (k) (k 12))) 42)), then the continuation is (lambda (_) (* 3 (+ _ 42))).
Note that the lambdas I used to represent the "current continuation" are not the same as what call/cc passes in (named k in your example). k has a special control effect of aborting the rest of the computation after evaluating the current continuation.
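A quick sketch you can try at a REPL to check both points (the error call is only there to show it is never reached):
(* 3 (+ (call/cc (lambda (k) (k 12))) 42))                        ;=> 162
(* 3 (+ (call/cc (lambda (k) (k 12) (error "unreachable"))) 42))  ;=> 162, k aborted the rest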
The continuation in this case is the "thing" that receives the return value of the call/cc invocation. Thus:
(display (call/cc (lambda (k) (k 12))))
has the same result as
(display 12)
Continuations in Scheme "look and feel" like procedures, but they do not actually behave like procedures. One thing that can help you understand continuations better is CPS transformations.
In CPS transformation, instead of a function returning a value, instead it takes a continuation parameter, and invokes the continuation with the result. So, a CPS-transformed sqrt function would be invoked with (sqrt 64 k) and rather than returning 8, it just invokes (k 8) in tail position.
Because continuations (in a CPS-transformed function) are tail-called, the function doesn't have to worry about the continuation returning, and in fact, in most cases, they are not expected to return.
With this in mind, here's a simple example of a function:
(define (hypot x y)
  (sqrt (+ (* x x) (* y y))))
and its CPS-transformed version:
(define (hypot x y k)
  (* x x (lambda (x2)
           (* y y (lambda (y2)
                    (+ x2 y2 (lambda (sum)
                               (sqrt sum k))))))))
(assuming that *, +, and sqrt have all been CPS-transformed also, to accept a continuation argument).
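To make that assumption concrete, here is a small self-contained sketch; the names *&, +&, sqrt&, and hypot& are made up here so the snippet can run alongside the ordinary operators:
(define (*& x y k) (k (* x y)))      ; CPS-style wrappers around the primitives
(define (+& x y k) (k (+ x y)))
(define (sqrt& x k) (k (sqrt x)))

(define (hypot& x y k)
  (*& x x (lambda (x2)
            (*& y y (lambda (y2)
                      (+& x2 y2 (lambda (sum)
                                  (sqrt& sum k))))))))

(hypot& 3 4 display)   ; prints 5: the result is passed to the continuation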
So now, the interesting part: a CPS-transformed call/cc has the following definition:
(define (call/cc fn k)
  (fn k k))
With CPS transformation, call/cc is easy to understand and easy to implement. Without CPS transformation, call/cc is likely to require a highly magical implementation (e.g., via stack copying, etc.).
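Continuing the sketch above, a hypothetical call/cc& shows what (display (call/cc (lambda (c) (c 12)))) would look like after the transformation:
(define (call/cc& fn k)
  (fn k k))   ; fn receives the current continuation both as its argument
              ; and as its own continuation

;; direct style: (display (call/cc (lambda (c) (c 12))))
(call/cc& (lambda (c k2) (c 12)) display)   ; prints 12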
The following example involves jumping into a continuation and exiting out of it. Can somebody explain the flow of the function? I am going in circles around the continuations, and cannot find the entry and exit points of the function.
(define (prod-iterator lst)
  (letrec ((return-result empty)
           (resume-visit (lambda (dummy) (process-list lst 1)))
           (process-list
            (lambda (lst p)
              (if (empty? lst)
                  (begin
                    (set! resume-visit (lambda (dummy) 0))
                    (return-result p))
                  (if (= 0 (first lst))
                      (begin
                        (call/cc ; Want to continue here after delivering result
                         (lambda (k)
                           (set! resume-visit k)
                           (return-result p)))
                        (process-list (rest lst) 1))
                      (process-list (rest lst) (* p (first lst))))))))
    (lambda ()
      (call/cc
       (lambda (k)
         (set! return-result k)
         (resume-visit 'dummy))))))
(define iter (prod-iterator '(1 2 3 0 4 5 6 0 7 0 0 8 9)))
(iter) ; 6
(iter) ; 120
(iter) ; 7
(iter) ; 1
(iter) ; 72
(iter) ; 0
(iter) ; 0
Thanks.
The procedure iterates over a list, multiplying non-zero members and returning a result each time a zero is found. Resume-visit stores the continuation for processing the rest of the list, and return-result has the continuation of the call-site of the iterator. In the beginning, resume-visit is defined to process the entire list. Each time a zero is found, a continuation is captured, which when invoked executes (process-list (rest lst) 1) for whatever value lst had at the time. When the list is exhausted, resume-visit is set to a dummy procedure. Moreover, every time the program calls iter, it executes the following:
(call/cc
 (lambda (k)
   (set! return-result k)
   (resume-visit 'dummy)))
That is, it captures the continuation of the caller; invoking that continuation returns a value to the caller. The continuation is stored, and the program jumps to processing the rest of the list.
When the procedure calls resume-visit, the loop is entered; when return-result is called, the loop is exited.
If we want to examine process-list in more detail, let's assume the list is non-empty. The procedure employs basic recursion, accumulating a result until a zero is found. At that point, p is the accumulated value and lst is the list containing the zero. When we have a construction like (begin (call/cc (lambda (k) first)) rest), we first execute the first expressions with k bound to a continuation: a procedure that, when invoked, executes the rest expressions. In this case, that continuation is stored and another continuation is invoked, which returns the accumulated result p to the caller of iter. The stored continuation will be invoked the next time iter is called, and then the loop continues with the rest of the list. That is the point of the continuations; everything else is basic recursion.
What you need to keep in mind is that a call to (call/cc f) applies the function f passed to call/cc to the current continuation. If that continuation is called with some argument a inside the function f, execution goes back to the corresponding call to call/cc, and the argument a is returned as the value of that call/cc.
Your program stores the continuation of "calling call/cc in iter" in the variable return-result, and begins processing the list. It multiplies the first 3 non-zero elements of the list before encountering the first 0. When it sees the 0, the continuation "processing the list element 0" is stored in resume-visit, and the value p is returned to the continuation return-result by calling (return-result p). This call will make the execution go back to the call/cc in iter, and that call/cc returns the passed value of p. So you see the first output 6.
The remaining calls to iter are similar and make the execution go back and forth between these two continuations. Manual analysis may be a little brain-twisting; you have to know what the execution context is when a continuation is restored.
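As a rough sketch of that back-and-forth for the list '(1 2 3 0 4 5 6 0 ...):
; First (iter):
;   captures the caller's continuation in return-result
;   (resume-visit 'dummy) starts (process-list '(1 2 3 0 ...) 1)
;   multiplies 1*2*3 = 6, reaches the 0
;   saves "continue after this 0" in resume-visit
;   (return-result 6) jumps back: the first (iter) returns 6
; Second (iter):
;   captures a fresh return-result, then invokes the saved resume-visit,
;   which resumes with (process-list '(4 5 6 0 ...) 1) and eventually
;   returns 120 the same way.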
You could achieve the same this way:
(define (prod-iter lst) (fold * 1 (remove zero? lst)))
... even though it could perform better by traversing the list only once.
For continuations, recall (pun intended) that all call/cc does is wait for "k" to be applied this way:
(call/cc (lambda (k) (k 'return-value)))
=> return-value
The trick here is that you can let call/cc return its own continuation, so that it can be applied elsewhere after call/cc has returned, like this:
;; returns twice; once to get bound to k, the other to return blah
(let ([k (call/cc (lambda (k) k))]) ;; k gets bound to a continuation
  (k 'blah)) ;; k returns here
=> blah
This lets the call/cc expression return more than once, by saving its continuation in a variable. Continuations simply return the value they are applied to.
Closures are functions that carry their environment variables along with them before arguments get bound to them. They are ordinary lambdas.
Continuation-passing style is a way to pass closures as arguments to be applied later. We say that these closure arguments are continuations. Here's half of the current code from my sudoku generator/solver as an example demonstrating how continuation-passing style can simplify your algorithms:
#| the grid is internally represented as a vector of 81 numbers
   example: (make-vector 81 0)
   this builds a list of indexes |#
(define (cell n) (list (+ (* (car n) 9) (cadr n))))
(define (row n) (iota 9 (* n 9)))
(define (column n) (iota 9 n 9))
(define (region n)
  (let* ([end (+ (* (floor-quotient n 3) 27)
                 (* (remainder n 3) 3))]
         [i (+ end 21)])
    (do ([i i (- i (if (zero? (remainder i 3)) 7 1))]
         [ls '() (cons (vector-ref *grid* i) ls)])
        ((= i end) ls))))
#| f is the continuation
usage examples:
(grid-ref *grid* row 0)
(grid-set! *grid* region 7) |#
(define (grid-ref g f n)
  (map (lambda (i) (vector-ref g i)) (f n)))

(define (grid-set! g f n ls)
  (for-each (lambda (i x) (vector-set! g i x))
            (f n) ls))
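A hypothetical usage sketch, assuming the definitions above plus a global *grid* as described in the comment:
(define *grid* (make-vector 81 0))

(grid-set! *grid* row 0 '(5 3 0 0 7 0 0 0 0))  ; fill the first row
(grid-ref *grid* row 0)                        ; => (5 3 0 0 7 0 0 0 0)
(grid-ref *grid* column 0)                     ; => (5 0 0 0 0 0 0 0 0)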