What is the difference between encapsulation and closure? - scheme

There is something I really don't understand about encapsulation and closures. I believe encapsulation means that something cannot be changed except by the code responsible for it, but I can't really explain how closure and encapsulation apply when I am asked to point them out in a piece of code.
For example :
(define new-cercle #f)

(let ((n 0))
  (set! new-cercle
        (lambda (rayon)
          (begin
            (set! n (+ n 1))
            (lambda (msg)
              (cond ((eq? msg 'circonference)
                     (* 2 3.14 rayon))
                    ((eq? msg 'surface)
                     (* 3.14 rayon rayon))
                    ((eq? msg 'nb-cercles)
                     n)))))))
The n is encapsulated, right? So the question is: explain how encapsulation and closure are applied in this code.
Another thing I do not understand: why does the let have to be above the lambda here? When I put it below the lambda, the function doesn't work properly and there is no accumulator:
(define acc
  (let ((n 1))
    (lambda (x)
      (set! n (* n x))
      n)))
I hope someone can explain this to me in a simple way, because when I googled it, I honestly didn't understand anything from the complicated examples most topics use.

Encapsulation is the name of a pattern involving any situation in which some related items are put together into a container, travel with that container, and are referenced through some access mechanism on that container. The items could be run-time values, compile-time identifiers, or whatever. An object made up of multiple fields encapsulates those fields: a cons cell encapsulates car and cdr. A class encapsulates slots; in some object systems, methods also. Compilation units encapsulate their global definitions, such as functions and variables.
The popular use of "encapsulate" in OOP refers to defining a class as a unit which contains the definition of data, together with the methods which operate on it: the code and data are one "capsule". (The Common Lisp object system isn't like this: methods are not encapsulated in classes.)
A closure is something else, something very specific: it is a body of program code, together with its lexical environment, reified into an object of function type. The body of the closure, when invoked, has visibility to two groups of names: the closure's function parameters, and the surrounding names in the lexical scope where the closure is created. A closure is an example of encapsulation: it encapsulates the body of code together with the lexical scope. The only means of access into the capsule is through the function: the function is like a "method", and the elements of the captured lexical environment are like "slots" in an object.
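For instance, here is a minimal sketch (the names make-adder and add5 are ours):

(define (make-adder n)      ; n lives in the captured lexical environment
  (lambda (x) (+ x n)))     ; the body sees the parameter x and the captured n

(define add5 (make-adder 5))
(add5 10) ; => 15           ; n is reachable only through the function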
(By combining code and data, Lisp closures resemble the popular notion of encapsulation more than do Lisp class objects.)
About that funny word: in computer science, to "reify" some aspect of a program is to take something which is not a first class object, and somehow turn it into one.
Almost any recognizable concept which is applicable to the understanding of a program can potentially be reified. (Someone clever just has to come up with a sensible proposal about how.)
For instance, the entire future computation at a given point of execution can be reified, and the resulting object is called a continuation (an undelimited continuation, more precisely).
When the continuation operator captures a future computation, that future becomes hypothetical: it doesn't actually happen (doesn't execute). Instead, an alternative future executes in which the continuation is returned to the operator's caller, or passed into a function which the caller designates. The code which now has this continuation in its grasp can use it to explicitly invoke the original, captured future, as if it were a function. Or choose not to do that. In other words, program control flow (execute this block or don't execute this, or execute it several times) has become a function object (call this function or don't call it, or call it several times).
Objects are another example of reification: the reification of modules. Old-fashioned programs are divided into modules which have global functions and global variables. This "module" structure is a concept that we can recognize in a program and usefully apply in describing such programs. It is susceptible to reification: we can imagine, what if we had a run-time object which is "module", having all those same attributes: namely containing functions and data? And, presto: object-based programming is born, with advantages like multiple instantiation of the same module, possible because the variables are no longer global.
About cercle and rayon:
Firstly, new-cercle behaves like a constructor for objects: it is a global function that can be called from anywhere. It maintains a count of how many objects have been constructed. Only that function can access the counter, so it is encapsulated. (Actually, not only that function can access it, but also the closures representing the circle instances!) This is an example of module-like encapsulation. It simulates modules, as in the language Modula-2 and similar, or C translation units with static variables at file scope.
When we call new-cercle we must supply an argument for the rayon parameter. An object is produced and returned. That object happens to be a function produced as a lexical closure. This closure has captured the rayon parameter, thereby encapsulating that value: the object knows its own radius. We can call new-cercle repeatedly and obtain different instances of circles, each carrying its own rayon. This rayon is not externally visible; it is packaged inside the closure and is only visible to that function.
We gain access into the rayon-container indirectly, through a "message" API on the container. We can call the function with the message symbol surface, and it replies by returning the surface area. None of the currently available messages reveals rayon directly, but we could provide an accessor message for that, and even a message to change the radius. There is even a message to access the shared variable n, the count of circles, which behaves like a class variable in an object system (a static slot): any instance of a circle can report how many circles have been constructed. (Note that this count doesn't tell us how many circles currently exist: it doesn't decrement when a circle becomes garbage and is reclaimed; there is no finalization.)
In any case, we clearly have a container whose contents are not accessible except through an interface. That container binds together code and data, so it is not only encapsulation, but arguably encapsulation in the popular OOP sense.
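To make this concrete, here is how the object might be used (return values assuming the definition above):

(define c1 (new-cercle 2))
(define c2 (new-cercle 5))

(c1 'circonference) ; => 12.56
(c2 'surface)       ; => 78.5
(c1 'nb-cercles)    ; => 2   ; the shared count, visible from any instance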

You're perhaps having difficulties because in the trivial case, some differences go away. For example, both (let ((n 1)) (lambda (x) n)) and (lambda (x) (let ((n 1)) n)) give you basically the same function.
In your example
(define acc
  (let ((n 1))
    (lambda (x) (set! n (* n x)) n)))
the ordering of let and lambda is significant. If you interchange them, as in (lambda (x) (let ((n 1)) ...)), then every time you call the function, n will again be bound to 1. Instead you want there to be some location n that starts out with the value 1, can be modified by your function, and does not go away when your function is done, which is what you get with (let ((n 1)) (lambda (x) (set! n ...)...).
The function constructed by the inner lambda captures the use of the outer n and holds on to its location for as long as it itself lives. It also encapsulates n as nothing else can refer to it but this function. We also say that the function is closed by the surrounding binding of n, and that the function is a closure (of n).
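A minimal side-by-side sketch of the two orderings (not-acc is our name for the swapped version):

;; let over lambda: one location n, created once, shared by all calls
(define acc
  (let ((n 1))
    (lambda (x) (set! n (* n x)) n)))

(acc 2) ; => 2
(acc 3) ; => 6    ; n persisted between calls

;; lambda over let: a fresh n is bound to 1 on every call
(define not-acc
  (lambda (x)
    (let ((n 1))
      (set! n (* n x))
      n)))

(not-acc 2) ; => 2
(not-acc 3) ; => 3  ; no accumulation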
Reading about lexical scope might help you as well.

Related

How does call/cc work with the CPS transformation from "Lisp in Small Pieces"?

The book Lisp in Small Pieces demonstrates a transformation from Scheme into continuation passing style (chapter 5.9.1, for those who have access to the book). The transformation represents continuations by lambda forms and call/cc is supposed to become equivalent to a simple (lambda (k f) (f k k)).
I do not understand how this can work because there is no distinction between application of functions and continuations.
Here is a version of the transformation stripped from everything except application (the full version can be found in this gist):
(define (cps e)                       ; expression -> (continuation -> form)
  (if (pair? e)
      (case (car e)
        ; ...
        (else (cps-application e)))
      (lambda (k) (k `,e))))          ; an atom is simply passed to k

(define (cps-application e)           ; transform a call: the callee receives
  (lambda (k)                         ; the reified continuation first
    ((cps-terms e)
     (lambda (t*)
       (let ((d (gensym)))
         `(,(car t*) (lambda (,d) ,(k d))
           . ,(cdr t*)))))))

(define (cps-terms e*)                ; CPS-convert a list of expressions,
  (if (pair? e*)                      ; collecting their values into a list
      (lambda (k)
        ((cps (car e*))
         (lambda (a)
           ((cps-terms (cdr e*))
            (lambda (a*)
              (k (cons a a*)))))))
      (lambda (k) (k '()))))
Now consider the CPS example from Wikipedia:
(define (f return)
  (return 2)
  3)
The above transformation would convert the application in the function body, (return 2), to something like (return (lambda (g13) ...) 2). A continuation is passed as the first argument and the value 2 as the second argument. This would be fine if return were an ordinary function. However, return is supposed to be a continuation, which only takes a single argument.
I don't see how the pieces fit together. How can the transformation represent continuations as lambda forms but not give special consideration to their application?
I do not understand how this can work because there is no distinction between application of functions and continuations.
Implementing continuations without CPS requires approaches at the virtual machine level, such as using "spaghetti stacks": allocating lexical variables in heap-allocated frames that are subject to garbage collection. Capturing a continuation then just means obtaining an environment pointer which refers to a lexical frame in the spaghetti stack.
CPS builds a de facto spaghetti stack out of closures. A closure captures lexical bindings into an object with an indefinite lifetime. Under CPS, all closures capture the hidden variable k. That k serves the role of the parent frame pointer in the spaghetti stack; it chains the closures together.
Because the whole program is consistently CPS-transformed, there is a k parameter everywhere which points to a dynamically linked chain of closed-over environments that amounts to a de facto stack where execution can be restored.
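For illustration, here is a tiny expression CPS-transformed by hand (the names are ours; following the transformation above, the continuation is passed as the first argument):

(define (mul-cps k x y) (k (* x y)))
(define (add-cps k x y) (k (+ x y)))

;; (+ 1 (* 2 3)) becomes:
(mul-cps (lambda (t)             ; continuation of (* 2 3)
           (add-cps (lambda (r)  ; continuation of (+ 1 ...)
                      (display r))
                    1 t))
         2 3)
;; prints 7; each lambda is a closure capturing its chain of ks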
The one missing piece of the puzzle is that CPS depends on tail calls. Tail calls ensure that we are not using the real stack; everything interesting is in the closed-over environments.
(However, even tail calls are not strictly required, as Henry Baker's approach, embodied in Chicken Scheme, teaches us. Our CPS-transformed code can use real calls that consume stack, but never return. Every once in a while we can move the reachable environment frames (and all contingent objects) from the stack into the heap, and rewind the stack pointer.)
Now consider the CPS example from Wikipedia:
Ah, but that's not a CPS example; that's an example of application code that uses continuations that are available somehow via call/cc.
It becomes CPS if either we transform it to CPS by hand, or use a compiler which does that mechanically.
However, return is supposed to be a continuation, which only takes a single argument.
Thus, return only takes a single argument because we're looking at application source code that hasn't been CPS-transformed.
The application-level continuations take one argument.
The CPS-implementation-level continuations will have the hidden k argument, like all functions.
The k parameter is analogous to a piece of machine context, like a stack or frame pointer. When you use a conventional language and call print("hello"), you don't ask: how come there is only one argument? Doesn't print have to receive the stack pointer so it knows where the parameters are? Of course, when the print call is compiled, the compiled code has a way of conveying that context from one function to another, invisible to the high-level language.
In the case of CPS in Scheme, it's easy to get confused because the source and target language are both Scheme.

Why does Scheme need the special notion of procedure's location tag?

The standard says:
Each procedure created as the result of evaluating a lambda expression is (conceptually) tagged with a storage location, in order to make eqv? and eq? work on procedures.
The eqv? procedure returns #t if:
obj1 and obj2 are procedures whose location tags are equal
Eq? and eqv? are guaranteed to have the same behavior on ... procedures ...
But at the same time:
Variables and objects such as pairs, vectors, and strings implicitly denote locations or sequences of locations
The eqv? procedure returns #t if:
obj1 and obj2 are pairs, vectors, or strings that denote the same locations in the store
Eq? and eqv? are guaranteed to have the same behavior on ... pairs ... and non-empty strings and vectors
Why not just apply "implicitly denote locations or sequences of locations" to procedures too?
I thought this concerned them as well; I don't see anything special about procedures in that matter.
Pairs, vectors, and strings are mutable. Hence, the identity (or location) of such objects matters.
Procedures are immutable, so they can be copied or coalesced arbitrarily with no apparent difference in behaviour. In practice, that means that some optimising compilers can inline them, effectively making them "multiple copies". R6RS, in particular, says that for an expression like
(let ((p (lambda (x) x)))
  (eqv? p p))
the result is not guaranteed to be true, since it could have been inlined as (eqv? (lambda (x) x) (lambda (x) x)).
R7RS's notion of location tags gives the assurance that that expression does indeed evaluate to true, even if an implementation does inlining.
Treating procedures as values works in languages like ML where they are truly immutable. But in Scheme, procedures can actually be mutated, because their local variables can be. In effect, procedures are poor man's objects (though the case can also be made that OO-style objects are just poor man's procedures!) The location tag serves the same purpose as the object identity that distinguishes two pairs with identical cars and cdrs.
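A minimal sketch of such "poor man's objects" (plain Scheme; the names are ours): two procedures with identical code but distinct captured state, which is why coalescing them would be observable:

(define (make-counter)
  (let ((n 0))
    (lambda () (set! n (+ n 1)) n)))

(define c1 (make-counter))
(define c2 (make-counter))

(c1) ; => 1
(c1) ; => 2
(c2) ; => 1          ; distinct state
(eqv? c1 c2) ; => #f ; so they must have distinct identities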
In particular, giving global procedures identity means that it's possible to ask directly whether a predicate we have been passed is specifically eq? or eqv? or equal?, which is not portably possible in R6RS (though possible in R6RS implementations in practice).
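For example (a sketch relying on the identity guaranteed by the location tags):

(define (predicate-name p)
  (cond ((eq? p eq?)    'eq?)
        ((eq? p eqv?)   'eqv?)
        ((eq? p equal?) 'equal?)
        (else           'unknown)))

(predicate-name eqv?) ; => eqv?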

(Clojure) Make a partial function instead of throwing an arity exception with too few arguments?

If you give a function too few arguments, it complains:
user=> (map-indexed vector)
ArityException Wrong number of args (1) passed to: core$map-indexed
clojure.lang.AFn.throwArity (AFn.java:437)
Suppose I want this to do something handy instead, like automagically calling (partial map-indexed vector), and I want this new rule to work with every function without having to rewrite all of them. Is there a way to accomplish that, or is there some good reason it's not possible/not idiomatic?
You have answered your own question: partial is the way to go. You should explain your use case more so that a better answer can be given.
Besides, map-indexed expects a function of arity 2 as the first argument and a collection as the second.
The following returns a function that does what you want (I guess).
(defn foo [f] (fn [] (map-indexed f vector)))
EDIT
I misunderstood the use of vector, as was pointed out by amalloy. It's not vector as data but as a function.
Apart from the use of fn as shown above and partial as mentioned earlier, perhaps you could create a single-character-name synonym (or a really simple macro) which would expand to a call to partial. If you chose $, it would be ($ map-indexed vector).
You could do something like this for any function you define
(defn f
  ;; The "real" f
  ([x y z] (whatever-f-does x y z))
  ;; Overloads to "automagically" construct partial applications
  ([x] (partial f x))
  ([x y] (partial f x y)))
Of course, this can be abstracted with a macro, but that is the pattern.
I don't know whether this is a good idea. It's probably not what most Lispers would expect from most functions, but I reckon it could be quite useful in some contexts.
There are also some limitations to this approach. Here are a few I thought of:
It's only useful for functions you write, or ones that happen to be written by others who also use that pattern.
It introduces ambiguity when multiple arity is involved (i.e., if f is a function of either 2 or 3 arguments, is (f x y) a complete application of f or a partial application?)
It can't really handle variable arity either (you run into the same problems with ambiguity).
Perhaps a better approach would be to introduce a different function to do the partial application. For example:
(defn partial-f [& args] (apply partial f args))
Of course, you would want to choose a better name than "partial-f". For instance for map, you might use mapper. And for map-indexed, perhaps indexed-mapper would make sense.
Transducers (will) do exactly this for sequence functions, among other cool things.
See: http://blog.cognitect.com/blog/2014/8/6/transducers-are-coming

Common lisp macro syntax keywords: what do I even call this?

I've looked through On Lisp, Practical Common Lisp and the SO archives in order to answer this on my own, but those attempts were frustrated by my inability to name the concept I'm interested in. I would be grateful if anyone could just tell me the canonical term for this sort of thing.
This question is probably best explained by an example. Let's say I want to implement Python-style list comprehensions in Common Lisp. In Python I would write:
[x*2 for x in range(1,10) if x > 3]
So I begin by writing down:
(listc (* 2 x) x (range 1 10) (> x 3))
and then defining a macro that transforms the above into the correct comprehension. So far so good.
The interpretation of that expression, however, would be opaque to a reader not already familiar with Python list comprehensions. What I'd really like to be able to write is the following:
(listc (* 2 x) for x in (range 1 10) if (> x 3))
but I haven't been able to track down the Common Lisp terminology for this. It seems that the loop macro does exactly this sort of thing. What is it called, and how can I implement it? I tried macro-expanding a sample loop expression to see how it's put together, but the resulting code was unintelligible. Could anyone guide me in the right direction?
Thanks in advance.
Well, what one does is essentially parse the forms supplied as the macro's body. For example:
(defmacro listc (expr &rest forms)
  ;;
  ;; (listc EXP for VAR in GENERATOR [if CONDITION])
  ;;
  (labels ((keyword-p (thing name)
             (and (symbolp thing)
                  (string= name thing))))
    (destructuring-bind (for* variable in* generator &rest tail) forms
      (unless (and (keyword-p for* "FOR") (keyword-p in* "IN"))
        (error "malformed comprehension"))
      (let ((guard (if (null tail)
                       't
                       (destructuring-bind (if* condition) tail
                         (unless (keyword-p if* "IF")
                           (error "malformed comprehension"))
                         condition))))
        `(loop
           :for ,variable :in ,generator
           :when ,guard
           :collecting ,expr)))))

(defun range (start end &optional (by 1))
  (loop
    :for k :upfrom start :below end :by by
    :collecting k))
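A quick check of the macro (expected result, given the expansion into loop above):

(listc (* 2 x) for x in (range 1 10) if (> x 3))
;; => (8 10 12 14 16 18)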
Apart from the hackish "parser" I used, this solution has a disadvantage that is not easily solved in Common Lisp, namely the construction of intermediate lists if you want to chain your comprehensions:
(listc x for x in (listc ...) if (evenp x))
Since there is no moral equivalent of yield in Common Lisp, it is hard to create a facility that does not require intermediate results to be fully materialized. One way out of this might be to encode the knowledge of possible "generator" forms in the expander of listc, so the expander can optimize/inline the generation of the base sequence without having to construct the entire intermediate list at run time.
Another way might be to introduce "lazy lists" (the link points to Scheme, since there is no equivalent facility in Common Lisp -- you would have to build that first, though it's not particularly hard).
Also, you can always have a look at other people's code, in particular if they tried to solve the same or a similar problem, for example:
Iterate
Loop in SBCL
Pipes (which does the lazy list thing)
Macros are code transformers.
There are several ways of implementing the syntax of a macro:
destructuring
Common Lisp gives macros an argument list which provides a form of destructuring. When a macro is used, the source form is destructured according to that argument list.
This limits what macro syntax can look like, but for many uses of macros it provides enough machinery.
See Macro Lambda Lists in Common Lisp.
parsing
Common Lisp also gives the macro access to the whole macro call form. The macro is then responsible for parsing the form; the parser needs to be provided by the macro author, or be part of the author's macro implementation.
An example would be an INFIX macro:
(infix (2 + x) * (3 + sin (y)))
The macro implementation needs to implement an infix parser and return a prefix expression:
(* (+ 2 x) (+ 3 (sin y)))
rule-based
Some Lisps provide syntax rules, which are matched against the macro call form. For a matching syntax rule the corresponding transformer will be used to create the new source form. One can easily implement this in Common Lisp, but by default it is not a provided mechanism in Common Lisp.
See syntax case in Scheme.
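For example, a rough syntax-rules sketch of the listc comprehension from earlier (in Scheme, since Common Lisp does not provide this mechanism by default):

(define-syntax listc
  (syntax-rules (for in if)
    ((_ expr for var in lst if test)
     (let loop ((rest lst))
       (if (null? rest)
           '()
           (let ((var (car rest)))
             (if test
                 (cons expr (loop (cdr rest)))
                 (loop (cdr rest)))))))))

(listc (* 2 x) for x in '(1 2 3 4 5 6 7 8 9) if (> x 3))
;; => (8 10 12 14 16 18)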
LOOP
For the implementation of a LOOP-like syntax one needs to write a parser which is called in the macro to parse the source expression. Note that the parser does not work on text, but on interned Lisp data.
In the past (the 1970s) this was used in Interlisp in the so-called 'Conversational Lisp', a Lisp syntax with a more natural-language-like surface. Iteration was a part of this, and the iteration idea was then brought to other Lisps (like Maclisp's LOOP, from where it was brought to Common Lisp).
See the PDF on 'Conversational Lisp' by Warren Teitelman from the 1970s.
The syntax for the LOOP macro is a bit complicated and it is not easy to see the boundaries between individual sub-statements.
See the extended syntax for LOOP in Common Lisp.
(loop for i from 0 when (oddp i) collect i)
same as:
(loop
  for i from 0
  when (oddp i)
  collect i)
One problem with the LOOP macro is that symbols like FOR, FROM, WHEN and COLLECT are not special symbols from the "COMMON-LISP" package (a namespace). When I use LOOP in source code in a different package (namespace), these words intern new symbols in that source namespace. For that reason some like to write:
(loop
  :for i :from 0
  :when (oddp i)
  :collect i)
In the above code the identifiers for the LOOP-relevant symbols are in the KEYWORD namespace.
To make both parsing and reading easier it has been proposed to bring parentheses back.
An example for such a macro usage might look like this:
(iter (for i from 0) (when (oddp i) (collect i)))
same as:
(iter
  (for i from 0)
  (when (oddp i)
    (collect i)))
In the above version it is easier to find the sub-expressions and to traverse them.
The ITERATE macro for Common Lisp uses this approach.
But in both examples, one needs to traverse the source code with custom code.
To complement Dirk's answer a little:
Writing your own macros for this is entirely doable, and perhaps a nice exercise.
However there are several facilities for this kind of thing (albeit in a lisp-idiomatic way) out there of high quality, such as
Loop
Iterate
Series
Loop is very expressive, but its syntax doesn't resemble the rest of Common Lisp. Some editors don't like it and will indent it poorly. However, loop is defined in the standard. It is usually not possible to write extensions to loop.
Iterate is even more expressive, and has a familiar lispy syntax. It doesn't require any special indentation rules, so all editors that indent Lisp properly will also indent iterate nicely. Iterate isn't in the standard, so you'll have to get it yourself (use Quicklisp).
Series is a framework for working on sequences. In most cases series makes it possible to avoid storing intermediate values.

Scheme Infix to Postfix

Let me establish that this is part of a class assignment, so I'm definitely not looking for a complete code answer. Essentially we need to write a converter in Scheme that takes a list representing a mathematical equation in infix format and then output a list with the equation in postfix format.
We've been provided with the algorithm to do so, simple enough. The issue is that there is a restriction against using any of the available imperative language features. I can't figure out how to do this in a purely functional manner. This is our first introduction to functional programming in my program.
I know I'm going to be using recursion to iterate over the list of items in the infix expression, like so:
(define (itp ifExpr)
  ;; do some processing using a cond expression
  (itp (cdr ifExpr)))
I have all of the processing implemented (at least as best I can without knowing how to do the rest) but the algorithm I'm using to implement this requires that operators be pushed onto a stack and used later. My question is how do I implement a stack in this function that is available to all of the recursive calls as well?
(Updated in response to the OP's comment; see the new section below the original answer.)
Use a list for the stack and make it one of the loop variables. E.g.
(let loop ((stack (list))
           ...  ; other loop variables here,
                ; like e.g. what remains of the infix expression
           )
  ...  ; loop body
  )
Then whenever you want to change what's on the stack at the next iteration, well, basically just do so.
(loop (cons 'foo stack) ...)
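As a minimal, self-contained sketch of the pattern (the procedure name rev is ours), here is a named let whose loop variable is a list used as a stack; an infix-to-postfix loop would thread its operator stack through in exactly the same way:

(define (rev lst)
  (let loop ((stack '())     ; the stack is just a loop variable
             (rest lst))
    (if (null? rest)
        stack
        (loop (cons (car rest) stack)  ; "push" by passing a new list
              (cdr rest)))))

(rev '(1 2 3)) ; => (3 2 1)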
Also note that if you need to make a bunch of "updates" in sequence, you can often model that with a let* form. This doesn't really work with vectors in Scheme (though it does work with Clojure's persistent vectors, if you care to look into them), but it does with scalar values and lists, as well as SRFI 40/41 streams.
In response to your comment about loops being ruled out as an "imperative" feature:
(let loop ((foo foo-val)
           (bar bar-val))
  (do-stuff))
is syntactic sugar for
(letrec ((loop (lambda (foo bar) (do-stuff))))
  (loop foo-val bar-val))
letrec then expands to a form of let which is likely to use something equivalent to a set! or local define internally, but is considered perfectly functional. You are free to use some other symbol in place of loop, by the way. Also, this kind of let is called 'named let' (or sometimes 'tagged').
You will likely remember that the basic form of let:
(let ((foo foo-val)
      (bar bar-val))
  (do-stuff))
is also syntactic sugar over a clever use of lambda:
((lambda (foo bar) (do-stuff)) foo-val bar-val)
so it all boils down to procedure application, as is usual in Scheme.
Named let makes self-recursion prettier, that's all; and as I'm sure you already know, (self-) recursion with tail calls is the way to go when modelling iterative computational processes in a functional way.
Clearly this particular "loopy" construct lends itself pretty well to imperative programming too -- just use set! or data structure mutators in the loop's body if that's what you want to do -- but if you stay away from destructive function calls, there's nothing inherently imperative about looping through recursion or the tagged let itself. In fact, looping through recursion is one of the most basic techniques in functional programming, and the whole point of this kind of homework would have to be teaching precisely that... :-)
If you really feel uncertain about whether it's ok to use it (or whether it will be clear enough that you understand the pattern involved if you just use a named let), then you could just desugar it as explained above (possibly using a local define rather than letrec).
I'm not sure I understand this all correctly, but what's wrong with this simpler solution:
First:
You test if your argument is indeed a list:
If yes: append a list containing only the head to the map of the function over the tail, (map postfixer (cdr lst)). The map just applies the postfixer again to each element of the tail.
If not, just return the argument unchanged.
My implementation is three lines of Scheme; it translates:
(postfixer '(= 7 (/ (+ 10 4) 2)))
To:
(7 ((10 4 +) 2 /) =)
The recursion via map needs no looping, not even tail looping, and no mutation, and it shows the functional style by applying map. Unless I'm totally misunderstanding your point here, I don't see the need for all that complexity above.
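For reference, a sketch of the three-liner being described (matching the example output above; postfixer is the name used in this answer):

(define (postfixer lst)
  (if (pair? lst)
      (append (map postfixer (cdr lst))  ; recurse into the operands
              (list (car lst)))          ; the operator moves to the end
      lst))

(postfixer '(= 7 (/ (+ 10 4) 2)))
;; => (7 ((10 4 +) 2 /) =)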
Edit: Oh, now I see: infix, not prefix, to postfix. Well, the same general idea applies, except taking the second element rather than the first.
