Entropy formula in J language - fork

I'm playing a bit with the J programming language, and I've tried to create a verb for computing entropy from a list of probabilities (of the outcomes of an event); the formula in Python/pseudocode would be: -sum([p*log(p,2) for p in ps]).
The version I've tried using composition (@:) works, but the one based on hook & fork seems to be doing something else, and I'd like to understand why it does what it does. I'm trying to grok working with hook and fork, and this case really proves my intuitions are wrong.
Here is the code:
probs =: 0.75 0.25 NB. probabilities
entropy =: +/ @: (- * 2&^.)
entropyWrong =: +/ (- * 2&^.)
entropy probs NB. this is correct
0.811278
entropyWrong probs NB. this is wrong!
1.06128 1.25
0.561278 0.75
NB. shouldn't the following be the same as above (wrong)?
+/ (- * 2&^.) probs
0.811278
The point of my question isn't "how to compute the entropy of probabilities in J", but "why does entropyWrong do what it does, and why isn't it the same as its body, which apparently does the right thing?"

The entropyWrong definition is a hook that you are using monadically.
entropyWrong =: +/ (- * 2&^.)
If a monadic hook is represented as (u v) y then in your case +/ is u and (- * 2&^.) is v; v is a fork. y of course is probs, the noun argument.
J defines the action of a monadic hook as equivalent to y u (v y), so that u becomes dyadic with y as its left argument and v y as its right argument. This is consistent with J's right-to-left order of execution.
By the way, forks are defined as (f g h) y where f, g and h are verbs, and the result is (f y) g (h y). Each verb can be described as a tine of the fork; the middle tine g is dyadic, while f and h are monadic when a fork is applied monadically.
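You can check the hook expansion directly in a session. Dyadically, +/ builds an addition table between its left and right arguments, which is exactly where the 2x2 result comes from (a quick sketch of the expansion):
(- * 2&^.) probs NB. the fork alone: (-p) * log2 p
0.311278 0.5
probs +/ (- * 2&^.) probs NB. y u (v y): same as entropyWrong probs
1.06128 1.25
0.561278 0.75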
entropy =: +/ @: (- * 2&^.) is doing something different. entropy is in the form u @: v; it takes the result of the fork v and applies the verb u to it monadically.
If you would like to get rid of the use of @: in entropy, you can do that by using the verb [: . When used as the left tine of a fork, [: returns no result, and this creates a monadic centre tine instead of a dyadic one.
entropy2=: [: +/ (- * 2&^.) NB. with three verbs this is now a fork
probs =: 0.75 0.25
entropy2 probs
0.811278

SICP solution to Fibonacci, set `a + b = a`, why not `a + b = b`?

I am reading the Tree Recursion section of SICP, where fib is computed by a linear recursion.
We can also formulate an iterative process for computing the Fibonacci numbers. The idea is to use a pair of integers a and b, initialized to Fib(1) = 1 and Fib(0) = 0, and to repeatedly apply the simultaneous transformations

a ← a + b
b ← a
It is not hard to show that, after applying this transformation n
times, a and b will be equal, respectively, to Fib(n + 1) and Fib(n).
Thus, we can compute Fibonacci numbers iteratively using the procedure
(rewritten in Emacs Lisp, substituting for Scheme)
#+begin_src emacs-lisp :session sicp
(defun fib-iter (a b count)
  (if (= count 0)
      b
    (fib-iter (+ a b) a (- count 1))))

(defun fib (n)
  (fib-iter 1 0 n))

(fib 4)
#+end_src
"Set a + b = a and b = a", it's hard to wrap my mind around it.
The general idea to find a fib is simple:
Suppose a completed Fibonacci number table, search X in the table by jumping step by step from 0 to X.
The solution is barely intuitive.
It's reasonably to set a + b = b, a = b:
(defun fib-iter (a b count)
  (if (= count 0)
      a
    (fib-iter b (+ a b) (- count 1))))

(defun fib (n)
  (fib-iter 0 1 n))
So the authors' choice seems to be nothing more than counterintuitively placing b at the head, with no special purpose.
However, I surely acknowledge that SICP deserves digging deeper and deeper.
What key points am I missing? Why set a + b = a rather than a + b = b?
As far as I can see, your problem is that you don't like that the order of the arguments to fib-iter is not what you think it should be. The answer is that the order of arguments to functions is very often simply arbitrary and/or conventional: it's a choice made by the person writing the function. It does not matter to anyone but the person reading or writing the code: it's a stylistic choice. It doesn't seem particularly more intuitive to me to have fib defined as
(define (fib n)
  (fib-iter 1 0 n))

(define (fib-iter next current n)
  (if (zero? n)
      current
      (fib-iter (+ next current) next (- n 1))))
rather than
(define (fib n)
  (fib-iter 0 1 n))

(define (fib-iter current next n)
  (if (zero? n)
      current
      (fib-iter (+ next current) current (- n 1))))
There are instances where this isn't true. For instance, Standard Lisp defined mapcar so that the list being mapped over was the first argument, with the function being mapped the second. This means you can't extend it in the way it has been extended in more recent dialects, where it takes any positive number of lists and the function is applied to the corresponding elements of all the lists.
Similarly I think it would be extremely unintuitive to define the arguments of - or / the other way around.
But in many, many cases it's just a matter of making a choice and sticking to it.
The recurrence is given in an imperative form. For instance, in Common Lisp, we could use parallel assignment in the body of a loop:
(psetf a (+ a b)
       b a)
To reduce confusion, we should think about this functionally and give the old and new variables different names:
a = a' + b'
b = a'
This is no longer an assignment but a pair of equalities; we are justified in using the ordinary "=" operator of mathematics instead of the assignment arrow.
The linear recursion does this implicitly, because it avoids assignment. The value of the expression (+ a b) is passed as the parameter a. But that's a fresh instance of a in new scope which uses the same name, not an assignment; the binding just induces the two to be equivalent.
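You can watch those fresh bindings slide along by tracing the book's procedure; the values below are derived directly from the definition above:
(fib 4)
(fib-iter 1 0 4)  ; a = Fib(1), b = Fib(0)
(fib-iter 1 1 3)  ; a = Fib(2), b = Fib(1)
(fib-iter 2 1 2)  ; a = Fib(3), b = Fib(2)
(fib-iter 3 2 1)  ; a = Fib(4), b = Fib(3)
(fib-iter 5 3 0)  ; a = Fib(5), b = Fib(4)
3                 ; count = 0, so the result is b = Fib(4)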
We can see it also like this with the help of a "Fibonacci slide rule":
1  1  2  3  5  8  13
--------------------- <-- sliding interface
         b' a'
            b  a
As we calculate the sequence, there is a two-number window whose entries we are calling a and b, which slides along the sequence. You can read the equalities at any position directly off the slide rule: look, b = a' = 5 and a = b' + a' = 8.
You may be confused by a referring to the higher position in the sequence. You might be thinking of this labeling:
1  1  2  3  5  8  13
---------------------
         a' b'
            a  b
Indeed, under this naming arrangement, now we have b = a' + b', as you expect, and a = b'.
It's just a matter of which variable is designated as the leading one farther along the sequence, and which is the trailing one.
The "a is leading" convention comes from the idea that a is before b in the alphabet, and so it receives the newer "updates" from the sequence first, which then pass off to b.
This may seem counterintuitive, but such a pattern appears elsewhere in mathematics, such as convolution of functions.

Switch from dyadic to monadic interpretation in a J sentence

I am trying to understand composition in J, after struggling to mix and match different phrases. I would like help switching between monadic and dyadic phrases in the same sentence.
I just made a simple dice roller in J, which will serve as an example:
d=.1+[:?[#]
4 d 6
2 3 1 1
8 d 12
10 2 11 11 5 11 1 10
This is a chain: "d is one plus the (capped) roll of x occurrences of y"
But what if I wanted to use >: to increment (and skip the cap [: ), such that it "switched" to monadic interpretation after the first fork?
It would read: "d is the incremented roll of x occurrences of y".
Something like this doesn't work, even though it looks to me to have about the right structure:
d=.>:&?[#]
d
>:&? ([ # ])
(If this approach is against the grain for J and I should stick to capped forks, that is also useful information.)
Let's look at a dyadic fork a (c d f h g) b, where c, d, f, h and g are verbs and a and b are arguments. It is evaluated as: (a c b) d ((a f b) h (a g b)). The arguments are applied dyadically to the verbs in the odd positions (the tines c, f and g), and those results are fed, right to left, into the even tines d and h. Also, a fork can be either of the form (v v v) or (n v v), where v stands for verbs and n stands for nouns. In the case of (n v v) you just get the value of n as the left argument of the middle tine.
If you look at your original definition of d=.1+[:?[#] you might notice it simplifies to a dyadic fork with five tines, (1 + [: ? #), where [ # ] can be replaced by # because x ([ # ]) y is just x # y (see the fork definition above).
The [: (Cap) verb supplies no result as the left argument of ?, which means that ? acts monadically on the result of a # b, and this becomes the right argument of +, which has a left argument of 1.
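To make the evaluation order concrete, here is the first example expanded step by step; each line is an equivalent J sentence (the rolls are random, so actual values will differ):
4 (1 + [: ? #) 6 NB. the simplified five-tine fork
1 + ? 4 # 6 NB. the cap makes ? apply monadically to 4 # 6
1 + ? 6 6 6 6 NB. four rolls in the range 0..5, shifted up to 1..6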
So, on to the question of how to get rid of the [: and use >: instead of 1 + ...
You can also write ([: f g) as f@:g to get rid of the Cap, which means that ([: ? #) becomes ?@:# and now since you want to feed this result into >: you can do that by either:
d1=.>:@:?@:#
d2=. [: >: ?@:#
4 d1 6
6 6 1 5
4 d2 6
2 3 4 5
8 d1 12
7 6 6 4 6 9 8 7
8 d2 12
2 10 10 9 8 12 4 3
Hope this helps; it is a good fundamental question about how forks are evaluated. Whether you use the ([: f g) or f@:g form of composition is a matter of preference.
To summarize the main simple patterns of verb mixing in J:
(f @: g) y = f (g y) NB. (1) monadic "at"
x (f @: g) y = f (x g y) NB. (2) dyadic "at"
x (f &: g) y = (g x) f (g y) NB. (3) "appose"
(f g h) y = (f y) g (h y) NB. (4) monadic fork
x (f g h) y = (x f y) g (x h y) NB. (5) dyadic fork
(f g) y = y f (g y) NB. (6) monadic hook
x (f g) y = x f (g y) NB. (7) dyadic hook
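A few of these patterns can be checked at the session; here is a quick sketch using *: (square) and + (any verbs would do):
2 (*: @: +) 3 NB. (2): *: (2 + 3)
25
2 (+ &: *:) 3 NB. (3): (*: 2) + (*: 3)
13
(+ *:) 3 NB. (6): 3 + (*: 3)
12
2 (+ *:) 3 NB. (7): 2 + (*: 3)
11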
A nice review of those can be found in the J wiki pages on compositions and trains.
Usually there are many possible forms for a verb. To complicate matters more, you can mix many primitives in different ways to achieve the same result.
Experience, style, performance and other such factors influence the way you'll combine the above to form your verb.
In this particular case, I would use @bob's d1 because I find it clearer to read: increase the roll of x copies of y:
>:@:?@:$
For the same reason, I am replacing # with $. When I see # in this context, I automatically read "number of elements of", but maybe that's just me.

Isabelle/HOL: proof by 'simp' is slow while 'value' is instantaneous

I am new to Isabelle/HOL, still working through the prog-prove exercises. In the meantime, I am exercising by applying these proof techniques to questions about combinatorial words. I observe a very different behavior (in terms of efficiency) between 'value' and 'lemma'.
Can one explain the different evaluation/search strategies between the two commands?
Is there a way to have the speed of 'value' used inside a proof of a 'lemma'?
Of course, I am asking because I have not found the answer in the documentation (so far). In which manual is this difference in efficiency documented and explained?
Here is a minimal piece of source to reproduce the problem.
theory SlowLemma
imports Main
begin
(* Alphabet for Motzkin words. *)
datatype alphabet = up | lv | dn
(* Keep the [...] notation for lists. *)
no_notation Cons (infixr "#" 65) and append (infixr "@" 65)
primrec count :: "'a ⇒ 'a list ⇒ nat" where
  "count _ Nil = 0" |
  "count s (Cons h q) = (if h = s then Suc (count s q) else count s q)"

(* prefix n l simply returns undefined if n > length l. *)
fun prefix :: "'a list ⇒ nat ⇒ 'a list" where
  "prefix _ 0 = []" |
  "prefix (Cons h q) (Suc n) = Cons h (prefix q n)"

definition M_ex_7 :: "alphabet list" where
  "M_ex_7 ≡ [lv, lv, up, up, lv, dn, dn]"

definition M_ex_19 :: "alphabet list" where
  "M_ex_19 ≡ [lv, lv, up, up, lv, up, lv, dn, lv, dn, lv, up, dn, dn, lv, up, dn, lv, lv]"

fun height :: "alphabet list ⇒ int" where
  "height w = (int (count up w + count up w)) - (int (count dn w + count dn w))"

primrec is_pre_M :: "alphabet list ⇒ nat ⇒ bool" where
  "is_pre_M _ (0 :: nat) = True"
| "is_pre_M w (Suc n) = (let w' = prefix w (Suc n) in is_pre_M w' n ∧ height w' ≥ 0)"

fun is_M :: "alphabet list ⇒ bool" where
  "is_M w = (is_pre_M w (length w) ∧ height w = 0)"
(* These two calls to value are fast. *)
value "is_M M_ex_7"
value "is_M M_ex_19"
(* This first lemma goes fast. *)
lemma is_M_M_ex_7: "is_M M_ex_7"
  by (simp add: M_ex_7_def)

(* This second lemma takes five minutes. *)
lemma is_M_M_ex_19: "is_M M_ex_19"
  by (simp add: M_ex_19_def)
end
simp is a proof method that goes through the proof kernel, i.e., every step has to be justified. For long rewriting chains, this may be quite expensive.
On the other hand, value uses the code generator where possible. All used constants are translated into ML code, which is then executed. You have to trust the result, i.e., it didn't go through the kernel and may be wrong.
The equivalent of value as a proof method is eval. Thus, an easy way to speed up your proofs is to use this:
lemma is_M_M_ex_19: "is_M M_ex_19"
  by eval
Opinions in the Isabelle community about whether or not this should be used differ. Some say it's similar to axiomatization (because you have to trust it), others consider it a reasonable way if going through the kernel is prohibitively slow. Everyone agrees though that you have to be really careful about custom setup of the code generator (which you haven't done, so it should be fine).
There's middle ground: the code_simp method will set up simp to use only the equations that would otherwise be used by eval. That means: a much smaller set of rules for simp, while still going through the kernel. In your case, it is actually the same speed as by eval, so I would highly recommend doing that:
lemma is_M_M_ex_19: "is_M M_ex_19"
  by code_simp
In your case, the reason why code_simp is much faster than simp is because of a simproc that has exponential runtime in the number of nested let expressions. Hence, another solution would be to use simp add: Let_def to just unfold let expressions.
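Following that last suggestion, the lemma should also go through quickly by unfolding let during simplification:
lemma is_M_M_ex_19: "is_M M_ex_19"
  by (simp add: M_ex_19_def Let_def)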
Edited to reflect comment by Andreas Lochbihler

Systematically extract noun arguments from J expression

What is the systematic approach to extracting nouns as arguments from an expression in J? To be clear, an expression containing two literals should become a dyadic expression with the left and right arguments used instead of the literals.
I'm trying to learn tacit style so I prefer not to use named variables if it is avoidable.
A specific example is a simple die roll simulator I made:
>:?10#6 NB. Roll ten six sided dice.
2 2 6 5 3 6 4 5 4 3
>:?10#6
2 1 2 4 3 1 3 1 5 4
I would like to systematically extract the arguments 10 and 6 to the outside of the expression so it can roll any number of any sized dice:
d =. <new expression here>
10 d 6 NB. Roll ten six sided dice.
1 6 4 6 6 1 5 2 3 4
3 d 100 NB. Roll three one hundred sided dice.
7 27 74
Feel free to illustrate using my example, but I'm looking to be able to follow the procedure for arbitrary expressions.
Edit: I just found out that a quoted version using x and y can be automatically converted to tacit form using e.g. 13 : '>:?x#y'. If someone can show me how to find the definition of 13 : I might be able to answer my own question.
If your goal is to learn tacit style, it's better that you simply learn it from the ground up rather than try to memorize an explicit algorithm—J4C and Learning J are good resources—because the general case of converting an expression from explicit to tacit is intractable.
Even ignoring the fact that there have been no provisions for tacit conjunctions since J4, in the explicit definition of a verb you can (1) use control words, (2) use and modify global variables, (3) put expressions containing x and/or y as the operands of an adverb or conjunction, and (4) reference itself. Solving (1), (3), or (4) is very hard in the general case and (2) is just flat out impossible.*
If your J sentence is one of a small class of expressions, there is an easy way to apply the fork rules to make it tacit, and this is more or less what is implemented in 13 :. Recall that
(F G H) y is (F y) G (H y), and x (F G H) y is (x F y) G (x H y) (Monad/Dyad Fork)
([: G H) y is G (H y), and x ([: G H) y is G (x H y) (Monad/Dyad Capped Fork)
x [ y is x, x ] y is y, and both of [ y and ] y are y (Left/Right)
Notice how forks use their center verbs as the 'outermost' verb: Fork gives a dyadic application of G, while Capped Fork gives a monadic one. This corresponds exactly to the two modes of application of a verb in J, monadic and dyadic. So a quick-and-dirty algorithm for making a "dyadic" expression tacit might look like the following, for F, G, H verbs and N nouns:
Replace x with (x [ y) and y with (x ] y). (Left/Right)
Replace any other noun N with (x N"_ y).
If you see the pattern (x F y) G (x H y), replace it with x (F G H) y. (Fork)
If you see the pattern G (x H y), replace it with x ([: G H) y. (Capped Fork)
Repeat 1 through 4 until you attain the form x F y, at which point you win.
If no more simplifications can be performed and you have not yet won, you lose.
A similar algorithm can be derived for "monadic expressions", expressions only dependent on y. Here's a sample derivation.
<. (y - x | y) % x NB. start
<. ((x ] y) - (x [ y) | (x ] y)) % (x [ y) NB. 1
<. ((x ] y) - (x ([ | ]) y)) % (x [ y) NB. 3
<. (x (] - ([ | ])) y) % (x [ y) NB. 3
<. x ((] - ([ | ])) % [) y NB. 3
x ([: <. ((] - ([ | ])) % [)) y NB. 4 and we win
This neglects some obvious simplifications, but attains the goal. You can mix in various other rules to simplify, like the long-train rule (if Train is a train of odd length, then (F G (Train)) and (F G Train) are equivalent), or the observation that x ([ F ]) y and x F y are equivalent. After learning the rules, it shouldn't be hard to modify the algorithm to get the result [: <. [ %~ ] - |, which is what 13 : '<. (y - x | y) % x' gives.
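Both the derived fork and the simplified form can be sanity-checked in a session; with x = 7 and y = 30, the expression computes <. (30 - 7 | 30) % 7:
7 ([: <. ((] - ([ | ])) % [)) 30
4
7 ([: <. [ %~ ] - |) 30
4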
The fail condition is attained whenever an expression containing x and/or y is an operand to an adverb or conjunction. It is sometimes possible to recover a tacit form with some deep refactoring, and knowledge of the verb and gerundial forms of ^: and }, but I am doubtful that this can be done programmatically.
This is what makes (1), (3), and (4) hard instead of impossible. Given knowledge of how $: works, a tacit programmer can find a tacit form for, say, the Ackermann function without too much trouble, and a clever one can even refactor that for efficiency. If you could find an algorithm doing that, you'd obviate programmers, period.
ack1 =: (1 + ])`(([ - 1:) $: 1:)`(([ - 1:) $: [ $: ] - 1:)@.(, i. 0:)
ack2 =: $: ^: (<:@[`]`1:) ^: (0 < [) >:
3 (ack1, ack2) 3
61 61
TimeSpace =: 6!:2, 7!:2@] NB. iterations TimeSpace code
10 TimeSpace '3 ack1 8'
2.01708 853504
10 TimeSpace '3 ack2 8'
0.937484 10368
* This is kind of a lie. You can refactor the entire program involving such a verb through some advanced voodoo magic, cf. Pepe Quintana's talk at the 2012 J Conference. It isn't pretty.
13 : is documented in the vocabulary or NuVoc under : (Explicit).
The basic idea is that the value you want to be x becomes [ and the value you want to be y becomes ]. But as soon as the rightmost token changes from a noun (value) to a verb like [ or ], the entire statement becomes a train, and you may need to use the verb [: or the conjunctions @ or @: to restore the composition behavior you had before.
You can also replace the values with the actual names x and y, and then wrap the whole thing in (dyad : ' ... '). That is:
>:?10#6 NB. Roll ten six sided dice.
can become:
10 (dyad : '>: ? x # y') 6 NB. dyad is predefined. It's just 4.
If you only need the y argument, you can use monad, which is predefined as 3. The name verb is also 3. I tend to use verb : when I provide both a monadic and dyadic version, and monad when I only need the monadic meaning.
If your verb is a one-liner like this, you can sometimes convert it automatically to tacit form by replacing the 3 or 4 with 13.
I have some notes on factoring verbs in j that can help you with the step-by-step transformations.
addendum: pseudocode for converting a statement to tacit dyad
This only covers a single statement (one line of code) and may not work if the constant values you're trying to extract are being passed to a conjunction or adverb.
Also, the statement must not make any reference to other variables.
Append [ x=. xVal [ y =. yVal to the statement.
Substitute appropriate values for xVal and yVal.
Rewrite the original expression in terms of the new x and y.
Rewrite statement [ x=. xVal [ y=. yVal as:
newVerb =: (4 : 0)
  statement ] y NB. we'll fill in x later.
)
(xVal) newVerb yVal
Now you have an explicit definition in terms of x and y. The reason for putting it on multiple lines instead of using x (4 : 'expr') y is that if expr still contains a string literal, you will have to fiddle with escaping the single quotes.
Converting the first noun
Since you only had a pipeline before, the rightmost expression inside statement must be a noun. Convert it to a fork using the following rules:
y → (])
x → ]x ([)
_, __, _9 ... 9 → (_:), (__:), (_9:) ... (9:)
n → n"_ (for any other arbitrary noun)
This keeps the overall meaning the same because the verb you've just created is invoked immediately and applied to the ] y.
Anyway, this new tacit verb in parentheses becomes the core of the train you will build. From here on out, you work by consuming the rightmost expression in the statement, and moving it inside the parentheses.
Fork normal form
From here on out, we will assume the tacit verb we're creating is always a fork.
This new tacit verb isn't actually a fork, but we will pretend it is, because any single-token verb can be rewritten as a fork using the rule:
v → ([: ] v).
There is no reason to actually do this transformation, it's just so I can simplify the rule below and always call it a fork.
We will not use hooks because any hook can be rewritten as a fork with the rule:
(u v) → (] u [: v ])
The rules below should produce trains in this form automatically.
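A quick session check of that hook-to-fork rewrite (a sketch using %: , square root, as v):
(- %:) 16 NB. hook: 16 - %: 16
12
(] - [: %: ]) 16 NB. the fork rewrite gives the same result
12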
Converting the remaining tokens
Now we can use the following rules to convert the rest of the original pipeline, moving one item at a time into the fork.
For all of these rules, the (]x)? isn't J syntax. It means the ]x may or may not be there. You can't put the ]x in until you transform a usage of x, or you would change the meaning of the code. Once you transform an instance of x, the ]x is required.
Following the J convention, u and v represent arbitrary verbs, and n is an arbitrary noun. Note that these include verbs built from adverbs and conjunctions, which are treated as single units (see below).
tokens y u (]x)? (fork) ] y → tokens (]x)? (] u fork) ] y
tokens x u (]x)? (fork) ] y → tokens ]x ([ u fork) ] y
tokens n u (]x)? (fork) ] y → tokens (]x)? (n u fork) ] y
tokens u v (]x)? (fork) ] y → tokens u (]x)? ([: v fork) ] y
There are no rules for adverbs or conjunctions, because you should just treat those as part of the verbs. For example +:^:3 should be treated as a single verb. Similarly, anything in parentheses should be left alone as a single phrase.
Anyway, keep applying these rules until you run out of tokens.
Cleanup
You should end up with:
newVerb =: (4 : 0)
] x (fork) ] y
)
(xVal) newVerb yVal
This can be rewritten as:
(xVal) (fork) yVal
And you are done.
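As a worked example (my own, not part of the original recipe), here is the whole procedure applied to the >:?10#6 roller from the question, with xVal = 10 and yVal = 6:
>: ? x # y ] y NB. the 4 : 0 body: statement ] y
>: ? x # (]) ] y NB. convert the first noun: y → (])
>: ? ]x ([ # ]) ] y NB. x-rule: # has x as its left argument
>: ]x ([: ? [ # ]) ] y NB. verb rule: cap ? into the fork
]x ([: >: [: ? [ # ]) ] y NB. verb rule again: cap >:
10 ([: >: [: ? [ # ]) 6 NB. cleanup: substitute xVal and yVal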

Unable to evaluate a lambda expression as argument in SICP ex-1.37

The problem can be found at http://mitpress.mit.edu/sicp/full-text/book/book-Z-H-12.html#%_thm_1.37
The problem is to expand a continued fraction in order to approximate 1/φ. It suggests that your procedure should be able to do so by evaluating:
(cont-frac (lambda (i) 1.0)
           (lambda (i) 1.0)
           k)
My solution is as follows:
(define (cont-frac n d k)
  (if (= k 1)
      d
      (/ n (+ d (cont-frac n d (- k 1))))))
This solution works when calling (cont-frac 1 1 k), but not when using the lambda expressions as the problem suggests. I get what looks like a type error
;ERROR: "ex-1.37.scm": +: Wrong type in arg1 #<CLOSURE <anon> (x) 1.0>
; in expression: (##+ ##d (##cont-frac ##n ##d (##- ##k 1)))
; in scope:
; (n d k) procedure cont-frac
; defined by load: "ex-1.37.scm"
;STACK TRACE
1; ((##if (##= ##k 1) ##d (##/ ##n (##+ ##d (##cont-frac ##n ##d ...
My question is two-part:
Question 1. Why am I getting this error when using the lambda arguments? I (mistakenly, for sure) thought that (lambda (x) 1) should evaluate to 1. It clearly does not. I'm not sure I understand what it DOES evaluate to: I presume that it doesn't evaluate to anything (i.e., "return a value" -- maybe the wrong term for it) without being passed an argument for x.
It still leaves unanswered why you would have a lambda that returns a constant. If I understand correctly, (lambda (x) 1.0) will always evaluate to 1.0, regardless of what the x is. So why not just put 1.0? This leads to:
Question 2. Why should I use them? I suspect that this will be useful in ex-1.38, which I've glanced at, but I can't understand why using (lambda (x) 1.0) is any different that using 1.0.
In Scheme, a lambda expression creates a function; therefore an expression such as:
(lambda (i) 1.0)
really does have a result: it is a function object.
If you apply that function to an argument, it will indeed evaluate to 1.0 as you expected:
((lambda (i) 1.0) 42)
Using lambdas in that exercise is necessary for building a general solution. As you've correctly noticed, in exercise 1.38 you'll be using the same implementation of the cont-frac function, but with different numerator and denominator functions, and you'll see an example where you must calculate one of them at runtime using the loop counter.
You could compare your exercise solutions with mine, e.g. 1.37 and 1.38
(/ n (+ d (cont-frac n d (- k 1))))))
In this case, d being a lambda expression, it doesn't make any sense to + it; the same goes for n and /. Try something like
(/ (n k) (+ (d k) (cont-frac n d (- k 1))))))
and you'll see why in the next exercise. You can also make this tail-recursive.
I named my variables F-d and F-n instead of d and n, because they accept a function that calculates the numerator and denominator terms. (lambda (i) 1.0) is a function that accepts one argument and returns 1.0; 1.0 is just a number. In other continued fractions, the value may vary with the depth (which is why you need to pass k to the numerator and denominator functions, to calculate the proper term).
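Putting the pieces together, a minimal corrected version might look like this (my sketch, not from the answers above; note that the base case must also call the procedures):
(define (cont-frac n d k)
  (if (= k 1)
      (/ (n 1) (d 1))
      (/ (n k) (+ (d k) (cont-frac n d (- k 1))))))

; Counting down consumes the terms in reverse order, which is harmless
; for the constant lambdas of exercise 1.37:
(cont-frac (lambda (i) 1.0) (lambda (i) 1.0) 11) ; => ~0.6180556, close to 1/phi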
