Mutator elisp functions - elisp

How can I define mutator elisp functions? That is, how can I pass parameters to an elisp function so that they can be modified inside the function for use outside it (similar to non-const reference variables or pointers in C++)? For example, suppose I had a function foo defined like
(defun foo (a b c d)
;do some stuff to b, c, and d
.
.
.
)
I might like to call it, say, as follows
(defun bar (x)
(let ((a) (b) (c) (y))
.
.
.
;a, b and c are nil at this point
(foo x a b c)
(setq y (some-other-function-of a b c x and-other-variables))
.
.
.
)) ... )
y)
I know that I could throw all my parameters local to some function into one big old list, evaluate the list at the end of the function and then go fetch these variables from some other list set to be the return value of that function (a list of stuff), i.e.
(setq return-list (foo read-only-x read-only-y))
(setq v_1 (car return-list))
(setq v_2 (cadr return-list))
.
.
but are there any better ways? All my attempts so far have ended with the variables leaving the function no different from how they were passed in.
As for why I want to be able to do this: I am simply trying to refactor some large function F in such a way that all collections of expressions related to some nameable concepts c live in their own little modules c_1, c_2, c_3, ... c_n that I can call from within F with whatever arguments I need to be updated along the way. That is to say, I would like F to look something like:
(defun F ( ... )
(let ((a_1) (a_2) ... )
(c_1 a_1 ... a_m)
(c_2 a_h ... a_i)
.
.
.
(c_n a_j ... a_k)
.
.
.
))...))

Two ways I can think of:
make the "function" foo a macro and not a function (if possible)
pass a newly created cons (or more of them) into the function, and replace the car and cdr of them via setcar/setcdr
In case the function is too complex, you can also combine both approaches - have a macro foo that creates a cons of a and b and calls a function foo0 with that cons, and later unpacks the car and cdr again.
In case you need more than 2 args, just use more than one cons as a parameter.

Just to show you how it can be done, but please don't do it, it's bad style.
(defun set-to (in-x out-y)
(set out-y in-x))
(let (x)
(set-to 10 'x)
x)
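There's a case when this won't work though. Assuming dynamic binding (which this trick relies on), if the symbol you pass in has the same name as one of set-to's own parameters, set modifies that parameter's binding instead of the caller's variable: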
(let (in-x)
(set-to 10 'in-x)
in-x)
It's a bit like this C++ code
void set_to(int x, int* y) {
*y = x;
}
int y;
set_to(10, &y);
Actually I wish there were no non-const references in C++, so that
each mutator would have to be called with a pointer like above.
Again, don't do it unless it's really necessary.
Use multiple-value-bind or cl-flet instead.

Related

Is there normal "set" function (not special form) in Scheme language?

Is there normal "set" function (not special form) in Scheme language or some way to implement it?
I'd like to write code something like:
(map (lambda (var)
(set var 0))
'(a b c))
which could assign a value (here it is '0') to variables from list (here they are 'a', 'b' and 'c').
No. To see why there isn't, consider something like this:
(define (mutant var val)
(let ((x 1))
(set var val)
x))
Now, what should (mutant 'x 3) return? If it should return 3, then:
set can't be a function since it needs access to the lexical environment of mutant;
any kind of reasonable compilation of this function is not possible.
If you want set to be a function then catastrophe follows. Consider this definition:
(define (mutant-horror f)
(let ([x 3])
(f)
x))
Now, you would think that this can be optimised to this:
(define (mutant-horror f)
(f)
3)
But it can't. Because you might call it like this:
(mutant-horror (λ () (set 'x 3)))
or, more generally, you might call it with a function which, somewhere eventually in some function called indirectly from it might end up saying (set 'x 3).
This means that no binding can ever be optimised at all, which is a disaster. It's also at least very close to meaning lexical scope is not possible: if as well as set, a function called get exists, which retrieves the binding of a symbol, then you have, essentially, dynamic scope. That in turn makes things like tail-call elimination at least difficult and probably impossible (in fact set probably does this on its own).
Reasons like this are why even very old Lisps, where things like set did exist and did superficially work, actually made special exemptions for compiled code, where set didn't work (see for instance the Lisp 1.5 programmer's manual (PDF link), appendix D). This divergence between the semantics of compiled and interpreted code is one of the things that later Lisps and Lisp-related languages such as CL and Scheme did away with.
If instead you want something like Common Lisp's semantics, where the equivalent thing
(defun mutant (var val)
(let ((x 1))
(set var val)
x))
would return 1 (unless x was a globally (see below) special variable, in which case it might return something else) and as a side-effect modify the value cell of whatever symbol was named by var (which might be x), then, well, Scheme has no notion of that at all, and that's a good thing on the whole.
Note that a modified version of the function will also work for locally special variables:
(defun mutant/local-special (a b)
(let ((x 1))
(declare (special x))
(set a b)
x))
But in this case you always know there's a special binding happening because you can always see the declaration.
When you write something like
(map (lambda (var)
(set var 0))
'(a b c))
my first thought was that you were trying to accumulate an unordered set of the form ((a 0) (b 0) (c 0)).
You cannot implement your own setter for any of the internal data structures provided by the language, as this would mean writing a Scheme function to modify a data structure that is implemented in C. For data structures implemented in C you need to provide setters written in C -- supposing the underlying language is C.
If you want to implement your own setter you either
-- check how the data structure is implemented, and if it's implemented in Scheme you will understand how to modify it
-- define your own data structure using already existing data structures and define setters for it.
By convention, a setter that mutates a data structure has a ! at the end of its name, such as set!, append!, etc.
A function call just evaluates a series of instructions in an extended environment: specifically, the environment of definition extended with the function's parameters. This is pretty much the case in any language...
If you do this:
(define (my-set var val)
(set! var val))
you will bind the value of val to var, but only within the scope of the current call to my-set. The reason you cannot write such a function has to do with the nature of Scheme itself: var is a pointer to whatever you pass to the function, but set! will make this pointer point to something else (still within the scope of my-set). my-set could work if we had some mechanism for using actual pointers, as some low-level languages allow. But Scheme does not...
Note that scheme goes very well with the functional programming style as well as recursion, so if you have a need for a function as you described, you are probably "doing something wrong"... :)
You can, however, do this:
(define my-list (list 1 2 3))
(define (my-set a-list a-value)
  (set-car! a-list a-value)) ; mutate the cons cell that was passed in
> (my-set my-list 4)
> my-list
(4 2 3)
This works because a-list is a pointer to a cons cell: set-car! modifies the contents of the cons cell but does not affect the pointer to it.
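For the original goal of zeroing several variables from a list, one possible workaround (a sketch only, not something proposed in the answers above) is to pass closures that perform the set! rather than passing symbols. The names a, b, c and setters are made up for illustration:

```scheme
;; Three ordinary top-level variables (hypothetical names).
(define a 1)
(define b 2)
(define c 3)

;; Each closure captures one variable and knows how to assign to it.
(define setters
  (list (lambda (v) (set! a v))
        (lambda (v) (set! b v))
        (lambda (v) (set! c v))))

;; for-each is used because we only want the side effects.
(for-each (lambda (setter) (setter 0)) setters)

(list a b c) ; => (0 0 0)
```

This keeps set! a special form: every assignment is still written lexically next to the variable it changes.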

Understanding parentheticals on let

I'm having a hard time understanding the syntax of let vs some of the other statements. For example, a "normal" statement has one pair of parentheses:
(+ 2 2)
$2 = 4
Yet the let statement has two:
(let ((x 2)) (+ x 2))
$3 = 4
Why is this so? I find it quite confusing to remember how many parentheses to put around various items.
Firstly, note that let syntax contains two parts, both of which can have zero or more elements. It binds zero or more variables, and evaluates zero or more forms.
All such Lisp forms create a problem: if the elements are represented as a flat list, there is an ambiguity: we don't know where one list ends and the other begins!
(let <var0> <var1> ... <form0> <form1> ...)
For instance, suppose we had this:
(let (a 1) (b 2) (print a) (list b))
What is (print a): is that the variable print being bound to a? Or is it form0 to be evaluated?
Therefore, Lisp constructs like this are almost always designed in such a way that one of the two lists is a single object, or possibly both. In other words: one of these possibilities:
(let <var0> <var1> ... (<form0> <form1> ...))
(let (<var0> <var1> ...) (<form0> <form1> ...))
(let (<var0> <var1> ...) <form0> <form1> ...)
Traditional Lisp has followed the third idea above in the design of let. That idea has the benefit that the pieces of the form are easily and efficiently accessed in an interpreter, compiler or any code that processes code. Given an object L representing let syntax, the variables are easily retrieved as (cadr L) and the body forms as (cddr L).
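To make the cadr/cddr point concrete, here is a small sketch that treats a let form as plain list data. The variable name L follows the text; the quoted form itself is just an example:

```scheme
;; A let form held as data (quoted, so it is not evaluated).
(define L '(let ((a 1) (b 2)) (display a) (list b)))

(car L)  ; => let
(cadr L) ; => ((a 1) (b 2))          the binding list, as one object
(cddr L) ; => ((display a) (list b)) the body forms
```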
Now, within this design choice, there is still a bit of design freedom. The variables could follow a structure similar to a property list:
(let (a 1 b 2 c 3) ...)
or they could be enclosed:
(let ((a 1) (b 2) (c 3)) ...)
The second form is traditional. In the Arc dialect of Lisp designed by Paul Graham, the former syntax appears.
The traditional form has more parentheses. However, it allows the initialization forms to be omitted: if the initial value of a variable should be nil, instead of writing (a nil), you can just write a:
;; These two are equivalent:
(let ((a nil) (b nil) (c)) ...)
(let (a b c) ...)
This is a useful shorthand in the context of a traditional Lisp which uses the symbol nil for the Boolean false and for the empty list. We have compactly defined three variables that are either empty lists or false Booleans by default.
Basically, we can regard the traditional let as being primarily designed to bind a simple list of variables as in (let (a b c) ...) which default to nil. Then, this syntax is extended to support initial values, by optionally replacing a variable var with a (var init) pair, where init is an expression evaluated to specify its initial value.
In any case, thanks to macros, you can have any binding syntax you want. In more than one program I have seen a let1 macro which binds just one variable, and has no parentheses. It is used like this:
(let1 x 2 (+ x 2)) -> 4
In Common Lisp, we can define let1 very easily like this:
(defmacro let1 (var init &rest body)
  `(let ((,var ,init)) ,@body))
If we restrict let1 to have a one-form body, we can then write the expression with obsessively few parentheses:
(let1 x 2 + x 2) -> 4
That one is:
(defmacro let1 (var init &rest form)
  `(let ((,var ,init)) (,@form)))
Remember that let allows you to bind multiple variables. Each variable binding is of the form (variable value), and you collect all the bindings into a list. So the general form looks like
(let ((var1 value1)
(var2 value2)
(var3 value3)
...)
body)
That's why there are two parentheses around x 2 -- the inner parentheses are for that specific binding, the outer parentheses are for the list of all bindings. It's only confusing because you're binding just one variable; it becomes clearer with multiple variables.

binding values to frames in the environment model

I am a little confused on how the environment model of evaluation works, and hoping someone could explain.
SICP says:
The environment model specifies: To apply a procedure to arguments,
create a new environment containing a frame that binds the parameters
to the values of the arguments. The enclosing environment of this
frame is the environment specified by the procedure. Now, within this
new environment, evaluate the procedure body.
First example:
If I:
(define y 5)
in the global environment, then call
(f y)
where
(define (f x) (set! x 1))
We construct a new environment (e1). Within e1, x would be bound to the value of y (5). In the body, the value of x would now be 1. I found that y is still 5. I believe the reason for this is because x and y are located in different frames. That is, I completely replaced the value of x. I modified the frame where x is bound, not just its value. Is that correct?
Second example:
If we have in the global environment:
(define (cons x y)
(define (set-x! v) (set! x v))
(define (set-y! v) (set! y v))
(define (dispatch m)
(cond ((eq? m 'car) x)
((eq? m 'cdr) y)
((eq? m 'set-car!) set-x!)
((eq? m 'set-cdr!) set-y!)
(else (error "Undefined operation: CONS" m))))
dispatch)
(define (set-car! z new-value)
((z 'set-car!) new-value)
z)
Now I say:
(define z2 (cons 1 2))
Suppose z2 has as its value the dispatch procedure in an environment called e2, and I call:
(set-car! z2 3)
set-car! creates a new environment e3. Within e3, the parameter z is bound to the value of z2 (the dispatch procedure in e2) just like in my first example. After the body is executed, z2 now represents the pair (3 . 2). I think set-car! works the way it does because I am changing the state of the object held by z (which is also referenced by z2 in global), but not replacing it. That is, I did not modify the frame where z is bound.
In this second example it appears that z2 in global and z in e3 are shared. I am not sure about my first example though. Based on the rules for applying procedures in the environment model, it appears x and y are shared although it is completely undetectable because 5 does not have local state.
Is everything I said correct? Did I misunderstand the quote?
To answer your first question: assuming that you meant to write (f y) in your first question rather than (f 5), the reason that y is not modified is that Racket (like most languages) is a "call by value" language. That is, values are passed to procedure calls. In this case, the argument y is evaluated to 5 before the call to f is made. Mutating the x binding does not affect the y binding.
To answer your second question: in your second example, there are shared environments. That is, z is a function that is closed over an environment (you called it e2). Each call to z creates a new environment that is linked to the existing e2 environment. Performing mutation on either x or y in this environment affects all future references to the e2 environment.
Summary: passing the value of a variable is different from passing a closure that contains that variable. If I say
(f y)
... then after the call is done, 'y' will still refer to the same value[*]. If I write
(f (lambda (...) ... y ...))
(that is, passing a closure that has a reference to y), then y might be bound to a different value after the call to f.
If you find this confusing, you're not alone. The key is this: don't stop using closures. Instead, stop using mutation.
[*] if y is a mutable value, it may be mutated, but it will still be the "same" value. see note above about confusion.
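A minimal sketch of that difference, reusing y and f from the question; the procedure g is a hypothetical helper that simply calls the closure it receives:

```scheme
(define y 5)

;; f receives the value 5; set! changes only f's own binding of x.
(define (f x) (set! x 1) x)
(f y) ; => 1
y     ; => 5, unchanged

;; g receives a closure that refers to y itself.
(define (g thunk) (thunk))
(g (lambda () (set! y 1)))
y     ; => 1, the closure assigned to the shared binding
```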
TL;DR: simple values in Scheme are immutable and are copied in full when passed as arguments to functions. Compound values are mutable and are passed as a copy of a pointer, where the copied pointer points to the same memory location as the original pointer does.
What you're grappling with is known as "mutation". Simple values like 5 are immutable. There's no "set-int!" to change 5 to henceforth hold the value 42 in our program. And it is good that there isn't.
But a variable's value is mutable. A variable is a binding in a function invocation's frame, and it can be changed with set!. If we have
(define y 5)
(define (foo x) (set! x 42) (display (list x x)))
(foo y)
--> foo is entered
foo invocation environment frame is created as { x : {int 5} }
x's binding's value is changed: the frame is now { x : {int 42} }
(42 42) is displayed
y still refers to 5 in the global environment
But if foo receives a value that is itself holding mutable references, which can be mutated, i.e. changed "in place", then though foo's frame itself doesn't change, the value to which a binding in it is referring can be.
(define y (cons 5 6)) ; Scheme's standard cons
--> a cons cell is created in memory, at {memory-address : 123}, as
{cons-cell {car : 5} {cdr : 6} }
(define (foo x) (set-car! x 42) (display (list x x)))
(foo y)
--> foo is entered
foo invocation environment frame is created as
{ x : {cons-cell-reference {memory-address : 123}} }
x's binding's value is *mutated*: the frame is still
{ x : {cons-cell-reference {memory-address : 123}} }
but the cons cell at {memory-address : 123} is now
{cons-cell {car : 42} {cdr : 6} }
((42 . 6) (42 . 6)) is displayed
y still refers to the same binding in the global environment
which still refers to the same memory location, which has now
been altered in-place: at {memory-address : 123} is now
{cons-cell {car : 42} {cdr : 6} }
In Scheme, cons is a primitive which creates mutable cons cells which can be altered in-place with set-car! and set-cdr!.
What these SICP exercises intend to show is that it is not necessary to have it as a primitive built-in procedure; that it could be implemented by a user, even if it weren't built-in in Scheme. Having set! is enough for that.
Another piece of jargon for this is to speak of "boxed" values. If I pass 5 into some function, when that function returns I'm guaranteed to still have my 5, because it was passed by copying its value, setting the function invocation frame's binding to reference the copy of the value 5 (which is also just an integer 5 of course). This is what is referred to as "pass-by-value".
But if I "box" it and pass (list 5) in to some function, the value that is copied -- in Lisp -- is a pointer to this "box". This is still pass-by-value, where the value passed is a pointer; it is sometimes called "call by sharing".
If the function mutates that box with (set-car! ... 42), it is changed in-place and I will henceforth have 42 in that box, (list 42) -- under the same memory location as before. My environment frame's binding will be unaltered -- it will still reference the same object in memory -- but the value itself will have been changed, altered in place, mutated.
This works because a box is a compound datum. Whether I put a simple or compound value in it, the box itself (i.e. the mutable cons cell) is not simple, so will be passed by pointer value -- only the pointer will be copied, not what it points to.
x bound to the value of y means that x is a new binding which receives a copy of the same value that y contains. x and y are not aliases to a shared memory location.
Though due to issues of optimization, bindings are not exactly memory locations, you can model their behavior that way. That is to say, you can regard an environment to be a bag of storage locations named by symbols.
Educational Scheme-in-Scheme evaluators, in fact, use association lists for representing environments. Thus (let ((x 1) (y 2)) ...) creates an environment which simply looks like ((y . 2) (x . 1)). The storage locations are the cdr fields of the cons pairs in this list, and their labels are the symbols in the car fields. The cell itself is the binding; the symbol and location are bound together by virtue of being in the same cons structure.
If there is an outer environment surrounding this let, then these association pairs can just be pushed onto it with cons:
(let ((z 3))
;; env is now ((z . 3))
(let ((x 1) (y 2))
;; env is now ((y . 2) (x . 1) (z . 3))
The environment is just a stack of bindings that we push onto. When we capture a lexical closure, we just take the current pointer and stash it into the closure object.
(let ((z 3))
;; env is now ((z . 3))
(let ((x 1) (y 2))
;; env is now ((y . 2) (x . 1) (z . 3))
(lambda (a) (+ x y z a))
;; lambda is an object with these three pieces:
;; - the environment ((y . 2) (x . 1) (z . 3))
;; - the code (+ x y z a)
;; - the parameter list (a)
)
;; after this let is done, the environment is again ((z . 3))
;; but the above closure maintains the captured one
)
So suppose we call that lambda with an argument 10. The lambda takes the parameter list (a) and binds it to the argument list to create a new environment:
((a . 10))
This new environment is not made in a vacuum; it is created as an extension to the captured environment. So, really:
((a . 10) (y . 2) (x . 1) (z . 3))
Now, in this effective environment, the body (+ x y z a) is executed.
Everything you need to understand about environments can be understood in reference to this cons pair model of bindings.
Assignment to a variable? That's just set-cdr! on a cons-based binding.
What is "extending an environment"? It's just pushing a cons-based binding onto the front.
What is "fresh binding" of a variable? That's just the allocation of a new cell with (cons variable-symbol value) and extending the environment with it by pushing it on.
What is "shadowing" of a variable? If an environment contains (... (a . 2) ...) and we push a new binding (a . 3) onto this environment, then this new a is now visible, and (a . 2) is hidden, simply because the assoc function searches linearly and finds (a . 3) first! The inner-to-outer environment lookup is perfectly modeled by assoc. Inner bindings appear to the left of outer bindings, closer to the head of the list and are found first.
The semantics of sharing all follow from the semantics of these lists of cells. In the assoc list model, environment sharing occurs when two environment assoc lists share the same tail. For instance, each time we call our lambda above, a new (a . whatever) argument environment is created, but it extends the same captured environment tail. If the lambda changes a, that is not seen by the other invocations, but if it changes x, then the other invocations will see it. a is private to the lambda invocation, but x, y and z are external to the lambda, in its captured environment.
If you fall back on this assoc list model mentally, you will not go wrong as far as working out the behavior of environments, including arbitrarily complex situations.
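Here is a small sketch of that assoc list model. The helper names extend, lookup and assign! are invented for illustration (they are not taken from any particular evaluator), and the cells are built with cons so that set-cdr! may legally mutate them:

```scheme
;; Push a fresh (name . value) binding onto the front of env.
(define (extend env name value)
  (cons (cons name value) env))

;; Variable reference: the first matching cell wins (inner before outer).
(define (lookup env name)
  (let ((cell (assq name env)))
    (if cell (cdr cell) (error "unbound variable" name))))

;; Assignment: set-cdr! on the first matching cell.
(define (assign! env name value)
  (let ((cell (assq name env)))
    (if cell (set-cdr! cell value) (error "unbound variable" name))))

;; The captured environment ((y . 2) (x . 1) (z . 3)).
(define captured (extend (extend (extend '() 'z 3) 'x 1) 'y 2))

;; Two invocations of the lambda: each gets its own a, both share the tail.
(define call-1 (extend captured 'a 10))
(define call-2 (extend captured 'a 20))

(assign! call-1 'a 99)
(lookup call-2 'a)               ; => 20, a is private to each call
(assign! call-1 'x 7)
(lookup call-2 'x)               ; => 7, x lives in the shared tail
(lookup (extend call-1 'x 0) 'x) ; => 0, the new binding shadows x
(lookup call-1 'x)               ; => 7, the outer binding is untouched
```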
Real implementations basically just optimize around this. For instance, a variable that is initialized from a constant like 42 and never assigned does not have to exist as an actual environment entry at all; the optimization called "constant propagation" can just replace occurrences of that variable with 42, as if it were a macro. Real implementations may use hash tables or other structures for the environment levels, not assoc lists. Real implementations may be compiled: lexical environments can be compiled according to various strategies such as "closure conversion". Basically, an entire lexical scope can be flattened into a single vector-like object. When a closure is made at run time, the entire vector is duplicated and initialized. Compiled code doesn't refer to variable symbols, but to offsets in the closure vector, which is substantially faster: no linear search through an assoc list is required.

Generic functions allow different order of arguments

I defined a generic function taking 2 arguments:
(defgeneric interact (a b))
The order of the arguments should not be important, so (interact x y) and (interact y x) should be the same, but I don't want to define two methods that do the same for every combination of different objects.
A Method-Combination of this type should help:
(defmethod interact :around (a b)
  (if (some-function a b)
      ;; some-function has to be true if (eq (class-of a) (class-of b));
      ;; otherwise (some-function a b) is (not (some-function b a)),
      ;; similar to #'<=
      (call-next-method)
      (interact b a)))
But I would have to know #'some-function and be able to know the type of the arguments I have to define.
Edit: both proposed approaches have a few limitations discussed in the comments below. Please read them before using this answer!
Can I suggest two options - a working but hacky option for when you only have two arguments, and a vaguely sketched out generic approach which I think should work but I haven't written:
Option 1:
(defparameter *in-interact-generic-call* nil)
(defgeneric interact (x y))
(defmethod interact ((x T) (y T))
; this can be called on pretty much anything
(if *in-interact-generic-call*
(cause-some-kind-of-error) ; Replace this with a more sensible error call
(let ((*in-interact-generic-call* T))
(interact y x))))
(defmethod interact ((x integer) (y string))
; example
(print x )(prin1 y))
(interact 5 "hello") ; should print 5 "hello"
(interact "hello" 5) ; should print 5 "hello"
;(interact "hello" "hello") ; should cause an error
Essentially the idea is to define a generic function which always matches anything, use it to try to swap the arguments (to see if that matches anything better) and if it's already swapped the arguments then to raise some kind of error (I've not really done that right here).
Option 2
Define the generic function as something like interact-impl. Actually call the standard function (defined by defun) interact.
In interact, define a loop over all permutations of the order of your arguments. For each permutation try calling interact-impl (e.g. using (apply #'interact-impl current-permutation).)
At least in sbcl, no matching arguments gives me a simple-error. You probably would want to do a more detailed check that it's actually the right error. Thus the code in interact looks something like
; completely untested!
(do ((all-permutations all-permutations (cdr all-permutations)))
    (...) ; some code to detect when all permutations are exhausted and raise an error
  (let ((current-permutation (first all-permutations)))
    (handler-case
        (return (apply #'interact-impl current-permutation))
      (simple-error () nil)))) ; ignore and try the next option
So what you are looking for is an arbitrary linear order on the class objects.
How about string order on class names?
(defun class-less-p (a b)
  "Lexicographic order on printable representation of class names."
  ;; class-name returns the symbol naming the class; the package and the
  ;; name string are both taken from that symbol.
  (let* ((sym-a (class-name (class-of a)))
         (name-a (symbol-name sym-a))
         (pack-a (package-name (symbol-package sym-a)))
         (sym-b (class-name (class-of b)))
         (name-b (symbol-name sym-b))
         (pack-b (package-name (symbol-package sym-b))))
    (or (string< pack-a pack-b)
        (and (string= pack-a pack-b)
             (string<= name-a name-b)))))

Does the state of local variable created by `let` change during recursive call Scheme?

For example,
I want to check if an element is in a list. The algorithm is straightforward, let's do it in C++
bool element_of( const std::vector<int>& lst, int elem ) {
for( int i( 0 ), ie = lst.size(); i < ie; ++i )
if( elem == lst[i] )
return true;
return false;
}
Since Scheme doesn't let me use a single if statement to return early, I can't do something similar to the C++ code above. So I came up with a temporary variable, namely result. result has an initial value of #f; next I recursively call the function to check the next item in the list, i.e. cdr lst ... So my question is: does a variable created with let get its initial value again each time a new function call is entered, or does its value stay the same until the last call?
On the other hand, using fold, my solution was,
(define (element-of x lst)
(fold (lambda (elem result)
(if (eq? elem x) (or result #t) result))
#f
lst))
Thanks,
Each Let call creates a new set of variables in the environment that the main body of the Let is being evaluated in. The Let syntax is "syntactic sugar" for a lambda applied to arguments that have already been evaluated. For instance
(let ((a (func object))
(b (func object2)))
(cons a b))
is the same as writing
((lambda (a b) (cons a b)) (func object) (func object2))
So you can see that in the Let syntax, the arguments are first evaluated, and then the body is evaluated, and the definitions of a and b are utilized in the local environment scope. So if you recursively call Let, each time you enter the body of the Let call, you are evaluating the body in a new environment (because the body is inside a newly defined lambda), and the definition of the arguments defined in the local Let scope will be different (they are actually new variables in a nested environment setup by the new lambda, not simply variables that have been mutated or "re-defined" like you would find in a C++ loop).
Another way of saying this is that your variables will be like the local scope variables in a C++ recursive function ... for each function's stack-frame, the locally scoped variables will have their own definition, and their own memory location ... they are not mutated variables like you might see in a loop that re-uses the same memory variables in the local scope.
Hope this helps,
Jason
let always reinitialises variables; it's evident, since you must always provide new binding values. e.g.,
(let ((a 42))
...)
Inside the ..., a starts out as 42, always. It doesn't "retain" values from previous invocations.
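A quick way to convince yourself of this is a recursive procedure that mutates its let variable before recursing. This is only an illustrative sketch, and count-down is a made-up name:

```scheme
(define (count-down n)
  (let ((a 42))        ; a fresh binding on every call
    (set! a (+ a n))   ; mutates only this call's a
    (if (= n 0)
        (list a)
        (cons a (count-down (- n 1))))))

(count-down 2) ; => (44 43 42)
```

If a kept its value between calls, the results would accumulate; instead each call starts from 42 again.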
By the way, I think you meant to write (or result (equal? elem x)) rather than (if (eq? elem x) (or result #t) result). :-)
(or result (equal? elem x)) translates to the following C++ code:
return result || elem == x;
(assuming that the == operator has been overloaded to perform what equal? does, of course.) The benefit of this is that if result is already true, no further comparisons are performed.
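Note that fold still visits every element of the list, even once result is true. If you want the early exit of the C++ loop, one possible sketch is plain recursion with cond, with no temporary variable at all (this is an alternative to the fold version, not a correction of it):

```scheme
(define (element-of x lst)
  (cond ((null? lst) #f)                ; ran off the end: not found
        ((equal? (car lst) x) #t)       ; found: stop here
        (else (element-of x (cdr lst))))) ; keep looking in the rest

(element-of 3 '(1 2 3)) ; => #t
(element-of 4 '(1 2 3)) ; => #f
```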

Resources