Inlining a function with Clojure macros - performance

More out of curiosity than anything else (but with the expectation that it might occasionally be a useful trick for performance tuning), is it possible to use Clojure macros to "inline" an existing function?
i.e. I would like to be able to do something like:
(defn my-function [a b] (+ a b))
(defn add-3-numbers [a b c]
  (inline (my-function
            a
            (inline (my-function
                      b
                      c)))))
And have it produce (at compile time) exactly the same function as if I had inlined the additions myself, such as:
(defn add-3-numbers [a b c]
  (+ a (+ b c)))

In case you didn't know, you can define inlined functions using definline
(doc definline)
-------------------------
clojure.core/definline
([name & decl])
Macro
Experimental - like defmacro, except defines a named function whose
body is the expansion, calls to which may be expanded inline as if
it were a macro. Cannot be used with variadic (&) args.
nil
Also checking the source,
(source definline)
-------------------------
(defmacro definline
  [name & decl]
  (let [[pre-args [args expr]] (split-with (comp not vector?) decl)]
    `(do
       (defn ~name ~@pre-args ~args ~(apply (eval (list `fn args expr)) args))
       (alter-meta! (var ~name) assoc :inline (fn ~name ~args ~expr))
       (var ~name))))
definline simply defines a var with metadata {:inline (fn definition)}. So although it's not exactly what you were asking for, you can alter an existing var's metadata in the same way to get inlined behavior.
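For completeness, here is a rough, untested sketch of doing that by hand on the my-function from the question: attach an :inline expander to the var, mimicking the metadata definline sets up (the expander below is my own illustration, not something the answer above defines):

(defn my-function [a b] (+ a b))

;; The :inline fn receives the argument forms and returns the expansion form.
(alter-meta! #'my-function assoc
             :inline (fn [a b] `(+ ~a ~b)))

;; Functions compiled after this point may have direct calls to my-function
;; expanded inline, roughly as if you had written (+ a (+ b c)) yourself.
(defn add-3-numbers [a b c]
  (my-function a (my-function b c)))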

Related

Understanding parentheticals on let

I'm having a hard time understanding the syntax of let vs. some of the other statements. For example, a "normal" statement has one set of parentheses:
(+ 2 2)
$2 = 4
Yet the let statement has two:
(let ((x 2)) (+ x 2))
$3 = 4
Why is this so? I find it quite confusing to remember how many parentheses to put around various items.
Firstly, note that let syntax contains two parts, both of which can have zero or more elements. It binds zero or more variables, and evaluates zero or more forms.
All such Lisp forms create a problem: if the elements are represented as a flat list, there is an ambiguity: we don't know where one list ends and the other begins!
(let <var0> <var1> ... <form0> <form1> ...)
For instance, suppose we had this:
(let (a 1) (b 2) (print a) (list b))
What is (print a): is that the variable print being bound to a? Or is it form0 to be evaluated?
Therefore, Lisp constructs like this are almost always designed in such a way that one of the two lists is a single object, or possibly both. In other words: one of these possibilities:
(let <var0> <var1> ... (<form0> <form1> ...))
(let (<var0> <var1> ...) (<form0> <form1> ...))
(let (<var0> <var1> ...) <form0> <form1> ...)
Traditional Lisp has followed the third idea above in the design of let. That idea has the benefit that the pieces of the form are easily and efficiently accessed in an interpreter, compiler or any code that processes code. Given an object L representing let syntax, the variables are easily retrieved as (cadr L) and the body forms as (cddr L).
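As a quick illustration of how easily those pieces are picked apart (shown here as a Clojure sketch on a quoted form; cadr and cddr in a traditional Lisp do the same job):

(def L '(let (a b c) (print a) (list b)))

(second L) ; => (a b c), i.e. the variable list (cadr L)
(nnext L)  ; => ((print a) (list b)), i.e. the body forms (cddr L)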
Now, within this design choice, there is still a bit of design freedom. The variables could follow a structure similar to a property list:
(let (a 1 b 2 c 3) ...)
or they could be enclosed:
(let ((a 1) (b 2) (c 3)) ...)
The second form is traditional. In the Arc dialect of Lisp designed by Paul Graham, the former syntax appears.
The traditional form has more parentheses. However, it allows the initialization forms to be omitted: if the initial value of a variable is meant to be nil, instead of writing (a nil), you can just write a:
;; These two are equivalent:
(let ((a nil) (b nil) (c)) ...)
(let (a b c) ...)
This is a useful shorthand in the context of a traditional Lisp which uses the symbol nil for the Boolean false and for the empty list. We have compactly defined three variables that are either empty lists or false Booleans by default.
Basically, we can regard the traditional let as being primarily designed to bind a simple list of variables as in (let (a b c) ...) which default to nil. Then, this syntax is extended to support initial values, by optionally replacing a variable var with a (var init) pair, where init is an expression evaluated to specify its initial value.
In any case, thanks to macros, you can have any binding syntax you want. In more than one program I have seen a let1 macro which binds just one variable, and has no parentheses. It is used like this:
(let1 x 2 (+ x 2)) -> 4
In Common Lisp, we can define let1 very easily like this:
(defmacro let1 (var init &rest body)
  `(let ((,var ,init)) ,@body))
If we restrict let1 to have a one-form body, we can then write the expression with obsessively few parentheses:
(let1 x 2 + x 2) -> 4
That one is:
(defmacro let1 (var init &rest form)
  `(let ((,var ,init)) (,@form)))
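Since the rest of this page is mostly about Clojure, it may be worth noting that the same one-binding macro is just as easy to write there, where let already takes a flat binding vector (a sketch, not part of the original answer):

(defmacro let1 [var init & body]
  `(let [~var ~init] ~@body))

(let1 x 2 (+ x 2)) ; => 4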
Remember that let allows you to bind multiple variables. Each variable binding is of the form (variable value), and you collect all the bindings into a list. So the general form looks like
(let ((var1 value1)
      (var2 value2)
      (var3 value3)
      ...)
  body)
That's why there are two parentheses around x 2: the inner parentheses are for that specific binding, and the outer parentheses are for the list of all bindings. It's only confusing because you're binding just one variable; it becomes clearer with multiple variables.

How can I trace code execution in Clojure?

While learning Clojure, I sometimes need to see what a function does at each step. For example:
(defn kadane [coll]
  (let [pos+         (fn [sum x] (if (neg? sum) x (+ sum x)))
        ending-heres (reductions pos+ 0 coll)]
    (reduce max ending-heres)))
Should I insert println here and there (where, how); or is there a suggested workflow/tool?
This may not be what you're after at the level of a single function (see Charles Duffy's comment below), but if you wanted to get an overview of what's going on at the level of a namespace (or several), you could use tools.trace (disclosure: I'm a contributor):
(ns foo.core)
(defn foo [x] x)
(defn bar [x] (foo x))
(in-ns 'user) ; standard REPL namespace
(require '[clojure.tools.trace :as trace])
(trace/trace-ns 'foo.core)
(foo.core/bar 123)
TRACE t20387: (foo.core/bar 123)
TRACE t20388: | (foo.core/foo 123)
TRACE t20388: | => 123
TRACE t20387: => 123
It won't catch inner functions and such (as pointed out by Charles), and might be overwhelming with large code graphs, but when exploring small-ish code graphs it can be quite convenient.
(It's also possible to trace individually selected Vars if the groups of interest aren't perfectly aligned with namespaces.)
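For example, a sketch of tracing a single Var, reusing the foo.core namespace from above (I believe trace-vars / untrace-vars are the relevant tools.trace entry points, but double-check against the version you're using):

(require '[clojure.tools.trace :as trace])
(trace/trace-vars foo.core/foo)   ; trace only this Var
(foo.core/bar 123)                ; now only the inner call to foo is reported
(trace/untrace-vars foo.core/foo) ; turn tracing back off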
If you use Emacs with CIDER as most Clojurians do, you already have a built-in debugger:
https://docs.cider.mx/cider/debugging/debugger.html
Chances are your favorite IDE/Editor has something built-in or a plugin already.
There is also (in no particular order):
spyscope
timbre/spy
tupelo/spyx
sayid
tools.trace
good old println
I would look at the above first. However there were/are other possibilities:
https://gist.github.com/ato/252421
https://github.com/philoskim/debux
https://github.com/pallet/ritz/tree/develop/nrepl-core
https://github.com/hozumi/eyewrap
probably many more
Also, if the function is simple enough, you can add defs at development time to peek at the bindings at a given point inside your function.
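For instance, here is a throwaway sketch of that def trick applied to the kadane function from the question (the debug-ending-heres name is just a placeholder):

(defn kadane [coll]
  (let [pos+         (fn [sum x] (if (neg? sum) x (+ sum x)))
        ending-heres (reductions pos+ 0 coll)]
    (def debug-ending-heres ending-heres) ; dev-only: capture the intermediate value
    (reduce max ending-heres)))

(kadane (range 5))   ; => 10
debug-ending-heres   ; => (0 0 1 3 6 10)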
Sayid is a tool presented at Clojure Conj 2016 that's directly appropriate to the purpose and comes with an excellent Emacs plugin. See the talk at which it was presented.
To see inside invocations of inner (non-top-level) functions, see ws-add-inner-trace-fn (previously, ws-add-deep-trace-fn).
I frequently use the spyx and related functions like spy-let from the Tupelo library for this purpose:
(ns tst.clj.core
  (:require [tupelo.core :as t]))
(t/refer-tupelo)

(defn kadane [coll]
  (spy-let [pos+         (fn [sum x] (if (neg? sum) x (+ sum x)))
            ending-heres (reductions pos+ 0 coll)]
    (spyx (reduce max ending-heres))))
(spyx (kadane (range 5)))
will produce output:
pos+ => #object[tst.clj.core$kadane$pos_PLUS___21786 0x3e7de165 ...]
ending-heres => (0 0 1 3 6 10)
(reduce max ending-heres) => 10
(kadane (range 5)) => 10
IMHO it is hard to beat a simple println or similar for debugging. Log files are also invaluable as you get closer to production.

Contracts in Let language

I saw a question which asked about the contract of a method in a pet language known as let. The language is not important, but does "contract" mean the things that the method takes as arguments and its value after evaluation?
(define extend-env*
  (lambda (syms vals old-env)
    (if (null? syms)
        old-env
        (extended-env-record
         (car syms)
         (car vals)
         (extend-env* (cdr syms)
                      (cdr vals)
                      old-env)))))
So here the method takes a symbol, a value, and an environment, and I think it produces a new environment.
Does that mean the contract for this method is Identifier (Variable), Value, Environment = Environment?
Your function starts like this:
(lambda (syms vals old-env) ...)
Here sym stands for symbol, and thus syms stands for a list of syms, a.k.a. a list of symbols. In the same manner, vals stands for a list of values. Finally, old-env is an environment.
This covers the input to the function. To confirm that syms is supposed to be a list of symbols, look at how syms is used in the body. We see three uses: (null? syms), (car syms), and (cdr syms). This means we guessed correctly.
To see type of the output, look for the expression(s) that produce return values.
The simplest is old-env, which is an environment. If the function always returns the same type of value, we have determined that the output is an environment. It is best to check that the other return expressions also return environments, though.
To sum up: the contract seen from Racket is:
extend-env* : list-of-symbols list-of-values environment -> environment
Now in your program the symbols represent identifiers, so you could also write:
extend-env* : list-of-identifiers list-of-values environment -> environment
if you document that identifiers are represented as symbols.
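If it helps to see that contract in executable form, here is a rough Clojure analogue; the map-as-environment representation is only an assumption for illustration, not how the language's extended-env-record actually works:

;; extend-env* : list-of-symbols list-of-values environment -> environment
(defn extend-env* [syms vals old-env]
  (if (empty? syms)
    old-env
    (assoc (extend-env* (rest syms) (rest vals) old-env)
           (first syms)
           (first vals))))

(extend-env* '[x y] [1 2] {}) ; => {y 2, x 1}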

Generic functions allow different order of arguments

I defined a generic function taking 2 arguments:
(defgeneric interact (a b))
The order of the arguments should not be important, so (interact x y) and (interact y x) should be the same, but I don't want to define two methods that do the same for every combination of different objects.
A Method-Combination of this type should help:
(defmethod interact :around (a b)
  (if (some-function a b)
      ;; some-function has to be true if (eq (class-of a) (class-of b));
      ;; else (some-function a b) is (not (some-function b a)),
      ;; similar to #'<=
      (call-next-method)
      (interact b a)))
But I would have to know #'some-function, and to define it I would have to know the types of the arguments in advance.
Edit: both proposed approaches have a few limitations discussed in the comments below. Please read them before using this answer!
Can I suggest two options: a working but hacky option for when you only have two arguments, and a vaguely sketched-out generic approach which I think should work but which I haven't actually written:
Option 1:
(defparameter *in-interact-generic-call* nil)

(defgeneric interact (x y))

(defmethod interact ((x T) (y T))
  ; this can be called on pretty much anything
  (if *in-interact-generic-call*
      (cause-some-kind-of-error) ; Replace this with a more sensible error call
      (let ((*in-interact-generic-call* T))
        (interact y x))))

(defmethod interact ((x integer) (y string))
  ; example
  (print x) (prin1 y))

(interact 5 "hello")        ; should print 5 "hello"
(interact "hello" 5)        ; should print 5 "hello"
;(interact "hello" "hello") ; should cause an error
Essentially the idea is to define a generic function which always matches anything, use it to try to swap the arguments (to see if that matches anything better) and if it's already swapped the arguments then to raise some kind of error (I've not really done that right here).
Option 2
Define the generic function as something like interact-impl. Make the function that callers actually use, interact, an ordinary function (defined by defun).
In interact, define a loop over all permutations of the order of your arguments. For each permutation try calling interact-impl (e.g. using (apply #'interact-impl current-permutation).)
At least in sbcl, no matching arguments gives me a simple-error. You probably would want to do a more detailed check that it's actually the right error. Thus the code in interact looks something like
; completely untested!
(do ((all-permutations all-permutations (cdr all-permutations)))
    (...) ; some code to detect when all permutations are exhausted and raise an error
  (let ((current-permutation (first all-permutations)))
    (handler-case
        (return (apply #'interact-impl current-permutation))
      (simple-error () nil)))) ; ignore and try the next option
So what you are looking for is an arbitrary linear order on the class objects.
How about string order on class names?
(defun class-less-p (a b)
  "Lexicographic order on printable representation of class names."
  (let* ((class-a (class-of a))
         (name-a  (class-name class-a))  ; the class's name, as a symbol
         (pack-a  (package-name (symbol-package name-a)))
         (class-b (class-of b))
         (name-b  (class-name class-b))
         (pack-b  (package-name (symbol-package name-b))))
    (or (string< pack-a pack-b)
        (and (string= pack-a pack-b)
             (string<= name-a name-b)))))

Does the state of a local variable created by `let` change during a recursive call in Scheme?

For example, I want to check if an element is in a list. The algorithm is straightforward; let's do it in C++:
bool element_of( const std::vector<int>& lst, int elem ) {
    for( int i( 0 ), ie = lst.size(); i < ie; ++i )
        if( elem == lst[i] )
            return true;
    return false;
}
Since Scheme doesn't let me use a single if statement the way C++ does, I can't do something similar to the C++ code above. So I came up with a temporary variable, namely result. result has an initial value of #f; next I recursively call the function to check the next item in the list, i.e. (cdr lst) ... So my question is: does the variable created with let restore its initial value each time it enters a new function call, or does its value stay the same until the last call?
On the other hand, using fold, my solution was,
(define (element-of x lst)
  (fold (lambda (elem result)
          (if (eq? elem x) (or result #t) result))
        #f
        lst))
Thanks,
Each Let call creates a new set of variables in the environment that the main body of the Let is being evaluated in. The Let syntax is syntactic sugar for a lambda being evaluated with arguments passed to it that have been evaluated. For instance
(let ((a (func object))
      (b (func object2)))
  (cons a b))
is the same as writing
((lambda (a b) (cons a b)) (func object) (func object2))
So you can see that in the Let syntax, the arguments are first evaluated, and then the body is evaluated, with the definitions of a and b available in the local environment scope. So if you re-enter the Let recursively, each time you reach the body of the Let you are evaluating it in a new environment (because the body is inside a newly defined lambda), and the bindings made in the local Let scope will be different: they are actually new variables in a nested environment set up by the new lambda, not variables that have been mutated or "re-defined" like you would find in a C++ loop.
Another way of saying this is that your variables will be like the local-scope variables in a C++ recursive function ... for each function's stack frame, the locally scoped variables have their own definition and their own memory location; they are not mutated variables like you might see in a loop that re-uses the same variables in the local scope.
Hope this helps,
Jason
let always reinitialises variables; it's evident, since you must always provide new binding values. e.g.,
(let ((a 42))
  ...)
Inside the ..., a starts out as 42, always. It doesn't "retain" values from previous invocations.
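To make the fresh-binding-per-call behaviour concrete, here is a small made-up example (written in Clojure; Scheme's let behaves the same way on this point):

(defn countdown [n]
  (let [doubled (* 2 n)]   ; a brand-new binding on every call
    (println "n =" n "doubled =" doubled)
    (when (pos? n)
      (countdown (dec n)))))

(countdown 2)
; n = 2 doubled = 4
; n = 1 doubled = 2
; n = 0 doubled = 0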
By the way, I think you meant to write (or result (equal? elem x)) rather than (if (eq? elem x) (or result #t) result). :-)
(or result (equal? elem x)) translates to the following C++ code:
return result || elem == x;
(assuming that the == operator has been overloaded to perform what equal? does, of course.) The benefit of this is that if result is already true, no further comparisons are performed.
