SML syntax: `val rec` and `fun` compared to each other


What are the known things that are possible with one and not with the other? What are the known idioms to work around the limitations of either of the two?
What I know of it
In another question, Andreas Rossberg pointed to a restriction applying to val rec in SML: it must be of the form of an fn‑match, even when other expressions would make sense.
The fun syntax does not have such a restriction, but can't be used to introduce a simple binding (I mean, simply a name with an optional type annotation and nothing else), as it requires arguments to be exposed.
In an older question that I have lost track of, there were discreet comments in favour of fun over val / val rec.
I personally use val / val rec more, because it exposes the distinction between self‑recursive and non‑self‑recursive bindings (what is exposed as self‑recursive may not actually be so, but the converse always holds: what is exposed as not self‑recursive is never self‑recursive), and also because it uses the same syntax as anonymous lambda expressions (more consistency).
The (all related) questions
These are the things I know. Are there others? I know less about workaround idioms. Are there any?
The limitations of both seem to me to be purely syntactic, with no real semantic or soundness background. Is this indeed the case, or is there a semantic or soundness background for these limitations?
A sample case (you can skip it)
If it's not abusing, I'm posting below a snippet, a variation of the one posted in the question linked above. This snippet exposes a case where I have an issue with both forms (I could not be happy with either one). The comments tell where the two issues are and why they are issues in my eyes. This sample can't really be simplified: the issues are syntactic, so the real use case matters.
(* ======================================================================== *)
(* A process chain. *)
datatype 'a process = Chain of ('a -> 'a process)
(* ------------------------------------------------------------------------ *)
(* An example controlling iterator using a process chain. it ends up to be
* a kind of co‑iteration (if that's not misusing the word). *)
val rec iter =
   fn process: int -> int process =>
   fn first: int =>
   fn last: int =>
      let
         val rec step =
            fn (i, Chain process) =>
               if i < first then ()
               else if i = last then (process i; ())
               else if i > last then ()
               else
                  let val Chain process = process i
                  in step (i + 1, Chain process)
                  end
      in step (first, Chain process)
      end
(* ------------------------------------------------------------------------ *)
(* A tiny test use case. *)
val rec process: int -> int process =
fn a: int =>
(print (Int.toString a);
Chain (fn a => (print "-";
Chain (fn a => (print (Int.toString a);
Chain (fn a => (print "|";
Chain process)))))))
(* Note the above is recursive: fn x => (a x; Chain (fn x => …)). We can't
 * easily extract separate `fn`s, which would be nice to help composition.
 * This is solved in the next section. *)
val () = iter process 0 20
val () = print "\n"
(* ======================================================================== *)
(* This section attempts to set‑up functions and operators to help write
* `process` in more pleasant way or with a more pleasant look (helps
* readability).
*)
(* ------------------------------------------------------------------------ *)
(* Make nested functions, parameters, with an helper function. *)
val chain: ('a -> unit) -> ('a -> 'a process) -> ('a -> 'a process) =
fn e =>
fn p =>
fn a => (e a; Chain p)
(* Now that we can extract the nested functions, we can rewrite: *)
val rec process: int -> int process =
fn a =>
let
val e1 = fn a => print (Int.toString a)
val e2 = fn a => print "-"
val e3 = fn a => print (Int.toString a)
val e4 = fn a => print "|"
in
(chain e1 (chain e2 (chain e3 (chain e4 process)))) a
end
(* Using this:
* val e1 = fn a => print (Int.toString a)
* val e2 = fn a => print "-"
* …
*
* Due to an SML syntactical restriction, we can't write this:
* val rec process = chain e1 (chain e2 ( … process))
*
* This requires to add a parameter on both side, but this, is OK:
* fun process a = (chain e1 (chain e2 ( … process))) a
*)
val e1 = fn a => print (Int.toString a)
val e2 = fn a => print "-"
val e3 = fn a => print (Int.toString a)
val e4 = fn a => print "|"
(* An unfortunate consequence of the need to use `fun`: the parameter added
* for `fun`, syntactically appears at the end of the expression, while it
* will be the parameter passed to `e1`. This syntactical distance acts
* against readability.
*)
fun process a = (chain e1 (chain e2 (chain e3 (chain e4 process)))) a
(* Or else, this, not better, with a useless `fn` wrapper: *)
val rec process = fn a =>
(chain e1 (chain e2 (chain e3 (chain e4 process)))) a
(* A purely syntactical function, to move the last argument to the front. *)
val start: 'a -> ('a -> 'b) -> 'b = fn a => fn f => f a
(* Now that we can write `start a f` instead of `f a`, we can write: *)
fun process a = start a (chain e1 (chain e2 (chain e3 (chain e4 process))))
infixr 0 THEN
val op THEN = fn (e, p) => (chain e p)
fun process a = start a (e1 THEN e2 THEN e3 THEN e4 THEN process)
(* This is already more pleasant (while still not perfect). Let's test it: *)
val () = iter process 0 20
val () = print "\n"

The val rec form computes a smallest fixpoint. Such a fixpoint isn't always well-defined or unique in the general case (at least not in a strict language). In particular, what should the meaning of a recursive binding be if the right-hand side(s) contain expressions that require non-trivial computation, and these computations already depend on what's being defined?
No useful answer exists, so SML (like many other languages) restricts recursion to (syntactic) functions. This way, it has a clear semantic explanation in terms of well-known fixpoint operators like Y, and can be given simple enough evaluation rules.
The same applies to fun, of course. More specifically,
fun f x y = e
is merely defined as syntactic sugar for
val rec f = fn x => fn y => e
So there has to be at least one parameter to fun to satisfy the syntactic requirement for val rec.
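The reading of a recursive function definition as a fixpoint can be sketched outside SML too. The following is only an illustration in Haskell, not part of the SML definition: `factStep` and `fact` are hypothetical names, and `fix` from `Data.Function` plays the role of the fixpoint operator mentioned above. The recursive call is passed in as an ordinary parameter, so the right-hand side is a plain function and nothing has to be computed before the binding exists.

```haskell
import Data.Function (fix)

-- A non-recursive "step": the recursive call is the explicit parameter `self`.
factStep :: (Integer -> Integer) -> Integer -> Integer
factStep self n = if n <= 1 then 1 else n * self (n - 1)

-- Tying the knot with a fixpoint operator yields the recursive function.
fact :: Integer -> Integer
fact = fix factStep

main :: IO ()
main = print (fact 5)  -- prints 120
```

Because the right-hand side is a function, the fixpoint is only unfolded when the function is applied, which is roughly the situation the syntactic restriction on val rec guarantees.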

I will attempt to start to answer my own question.
For the case of the forced use of a wrapper fn due to syntactic restrictions (maybe an issue to consider addressing with sML?), I could find, not really a workaround, but an idiom which helps make these cases less noisy.
I reused the start function from the sample (see question) and renamed it n_equiv, for the reason given in the comment. This would just require a few prior words to explain what η-equivalence is, and to tell about the syntactic restrictions which justify the definition and use of this function (which is always good for learning material anyway, and I'm planning to post some SML samples on a French forum).
(* A purely syntactical function, to try to make forced use of `fn` wrappers
 * a bit more acceptable. The function is named `n_equiv`, which refers to
 * the η-equivalence axiom. It explicitly tells that the construction has no
 * effect. The function syntactically swaps the function expression and its
 * argument, so that both occurrences of the argument appear close
 * to each other in the text, which helps avoid disturbance.
 *)
val n_equiv: 'a -> ('a -> 'b) -> 'b = fn a => fn f => f a
The use case from the sample in the question now looks like this:
fun process a = n_equiv a (chain e1 (chain e2 (chain e3 (chain e4 process))))
…
fun process a = n_equiv a (e1 THEN e2 THEN e3 THEN e4 THEN process)
That's already better, as now one is clearly told the surrounding construct is neutral.
To answer another part of the question, this case at least is more easily handled with fun than with val rec, as with val rec, the n_equiv self‑documenting idiom cannot be applied. That's a point in favour of fun over val rec … = fn …
Update #1
A page which mentions the compared verbosity of fun vs that of val: TipsForWritingConciseSML (mlton.org). See “Clausal Function Definitions” around the middle of the page. For non‑self‑recursive functions, val … fn is less verbose than fun; it may vary for self‑recursive functions.

Related

Why does QuickCheck take a long time when testing a Functor instance with a specific type signature?

I'm working through the wonderful Haskell Book. While solving some exercises I ran a QuickCheck test that took a relatively long time to run, and I can't figure out why.
The exercise I am solving is in Chapter 16 - I need to write a Functor instance for
data Parappa f g a =
  DaWrappa (f a) (g a)
Here is a link to the full code of my solution. The part I think is relevant is this:
functorCompose' :: (Eq (f c), Functor f)
                => Fun a b -> Fun b c -> f a -> Bool
functorCompose' fab gbc x =
  (fmap g (fmap f x)) == (fmap (g . f) x)
  where f = applyFun fab
        g = applyFun gbc

type ParappaComp =
     Fun Integer String
  -> Fun String [Bool]
  -> Parappa [] Maybe Integer
  -- -> Parappa (Either Char) Maybe Integer
  -> Bool

main :: IO ()
main = do
  quickCheck (functorCompose' :: ParappaComp)
When I run this in the REPL it takes ~6 seconds to complete. If I change ParappaComp to use Either Char instead of [] (see comment in code), it finishes instantaneously like I'm used to seeing in all other exercises.
I suspect that maybe QuickCheck is using very long lists causing the test to take a long time, but I am not familiar enough with the environment to debug this or to test this hypothesis.
Why does this take so long?
How should I go about debugging this?

I suspect that maybe QuickCheck is using very long lists causing the test to take a long time, but I am not familiar enough with the environment to debug this or to test this hypothesis.
I'm not sure of the actual cause either, but one way to start debugging this is to use the collect function from QuickCheck to collect statistics about test cases. To start, you can collect the size of the result.
A simple way to obtain a size is to use the length function, which requires the functor f to be Foldable.
You will need to implement or derive Foldable for Parappa (add {-# LANGUAGE DeriveFoldable #-} at the top of the file, and add deriving Foldable to Parappa).
To use collect, you need to generalize Bool to Property (in the signature of functorCompose' and in the type synonym ParappaComp).
functorCompose' :: (Eq (f c), Functor f, Foldable f)
                => Fun a b -> Fun b c -> f a -> Property
functorCompose' fab gbc x =
  collect (length x) $
    (fmap g (fmap f x)) == (fmap (g . f) x)
  where f = applyFun fab
        g = applyFun gbc
With that you can see that the distribution of the lengths of generated lists is clustered around 20, with a long tail up to 100. That alone doesn't seem to explain the slowness, as one would expect that traversing lists of that size should be virtually instantaneous.
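One further way to probe the "long lists" hypothesis is to cap the size parameter QuickCheck hands to its generators and see whether the run time changes. The sketch below is not the code from the question: `prop_mapCompose` and `bucket` are hypothetical names, and the property is a simplified stand-in over plain lists. It relies only on `collect`, `quickCheckWith`, `stdArgs` and its `maxSize` field from Test.QuickCheck.

```haskell
import Test.QuickCheck

-- Round a length down to a multiple of 10 so the histogram stays short.
bucket :: Int -> Int
bucket n = n `div` 10 * 10

-- Simplified stand-in for the functor composition law, on plain lists.
prop_mapCompose :: [Integer] -> Property
prop_mapCompose xs =
  collect (bucket (length xs)) $
    fmap (show . (+ 1)) xs == fmap show (fmap (+ 1) xs)

main :: IO ()
main = do
  -- Default arguments: sizes grow up to 100.
  quickCheck prop_mapCompose
  -- Same property with generated sizes capped at 10.
  quickCheckWith stdArgs { maxSize = 10 } prop_mapCompose
```

If the capped run of the real property is dramatically faster, the cost is tied to the size of the generated values (possibly the generated functions or strings rather than the list itself); if not, the cause lies elsewhere.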

Scheme syntax-rules pattern matching algorithm

I'm writing a macro expansion algorithm for a programming project, and I am attempting to add an r7rs-small compliant macro expansion pass. One part of this expansion algorithm requires matching patterns.
However, I'm having difficulty coming up with a pattern-matching algorithm that deals with Scheme's repetition patterns. The given example is from the r7rs small spec, which expands let* into nested lets:
(define-syntax let*
  (syntax-rules ()
    ((let* () body1 body2 ...)
     (let () body1 body2 ...))
    ((let* ((name1 val1) (name2 val2) ...)
       body1 body2 ...)
     (let ((name1 val1))
       (let* ((name2 val2) ...)
         body1 body2 ...)))))
As you can see, the p ... syntax needs to be able to repeatedly match 0 or more repetitions of the pattern p.
My first attempt was:
-- This datatype is a given
data SExp = SAtom String | SPair SExp SExp | SEmpty

data Pat = PatEmpty         -- ()
         | PatPair Pat Pat  -- (p1 . p2)
         | PatWild          -- _
         | PatVar String    -- x
         | PatRepeated Pat  -- p ...
         | PatAtom String   -- a

type MatchResult = Map String SExp

matchPat :: Pat -> SExp -> Maybe MatchResult
matchPat p e =
  case (p, e) of
    (PatEmpty, SEmpty) -> Just Map.empty
    (PatWild, _) -> Just Map.empty
    (PatVar x, _) -> Just (Map.singleton x e)
    (PatAtom a, SAtom a') | a == a' -> Just Map.empty
    (PatPair p1 p2, SPair e1 e2) -> do
      -- This is the problem case
      -- This implementation cannot handle repetitions,
      -- since p1 should be able to consume parts of e2
      res1 <- matchPat p1 e1
      res2 <- matchPat p2 e2
      Just (Map.union res1 res2)
    _ -> Nothing
However, I'm skeptical that this pattern representation is good enough for implementing this kind of matching algorithm. Any help would be great.
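One possible way to handle repetitions, offered only as a sketch and not as the definitive algorithm, is to change the representation rather than the `PatPair` case. The assumptions here are mine: `PatRepeated` carries both the repeated pattern and the pattern for the rest of the list, and a match result maps each variable to a `Binding` that is either a single expression or a list of bindings (one per repetition). The helpers `unroll`, `reroll`, `minLen`, `patVars`, `slist` and `plist` are hypothetical names.

```haskell
import qualified Data.Map as Map
import Data.Map (Map)

data SExp = SAtom String | SPair SExp SExp | SEmpty deriving (Eq, Show)

data Pat = PatEmpty | PatPair Pat Pat | PatWild | PatVar String
         | PatRepeated Pat Pat  -- (p ... . rest): assumption, differs from the question
         | PatAtom String

-- A variable under an ellipsis binds one value per repetition.
data Binding = One SExp | Many [Binding] deriving (Eq, Show)
type MatchResult = Map String Binding

patVars :: Pat -> [String]
patVars (PatVar x) = [x]
patVars (PatPair a b) = patVars a ++ patVars b
patVars (PatRepeated p rest) = patVars p ++ patVars rest
patVars _ = []

-- Split a (possibly improper) list into its elements and its final tail.
unroll :: SExp -> ([SExp], SExp)
unroll (SPair x rest) = let (xs, t) = unroll rest in (x : xs, t)
unroll t = ([], t)

reroll :: [SExp] -> SExp -> SExp
reroll xs t = foldr SPair t xs

-- Number of list elements the patterns after the ellipsis must consume.
minLen :: Pat -> Int
minLen (PatPair _ rest) = 1 + minLen rest
minLen (PatRepeated _ rest) = minLen rest
minLen _ = 0

matchPat :: Pat -> SExp -> Maybe MatchResult
matchPat pat e = case (pat, e) of
  (PatEmpty, SEmpty) -> Just Map.empty
  (PatWild, _) -> Just Map.empty
  (PatVar x, _) -> Just (Map.singleton x (One e))
  (PatAtom a, SAtom a') | a == a' -> Just Map.empty
  (PatPair p1 p2, SPair e1 e2) -> Map.union <$> matchPat p1 e1 <*> matchPat p2 e2
  (PatRepeated p rest, _) -> matchRepeated p rest e
  _ -> Nothing

-- Greedy match: repeat p on every element except those reserved for rest.
matchRepeated :: Pat -> Pat -> SExp -> Maybe MatchResult
matchRepeated p rest e
  | n < 0 = Nothing
  | otherwise = do
      ms <- mapM (matchPat p) reps
      restMatch <- matchPat rest (reroll remaining tl)
      let collected = Map.fromList [(v, Many [m Map.! v | m <- ms]) | v <- patVars p]
      Just (Map.union collected restMatch)
  where
    (items, tl) = unroll e
    n = length items - minLen rest
    (reps, remaining) = splitAt n items

slist :: [SExp] -> SExp
slist xs = reroll xs SEmpty

plist :: [Pat] -> Pat
plist = foldr PatPair PatEmpty

-- ((name1 val1) (name2 val2) ...) matched against ((a 1) (b 2) (c 3))
example :: Maybe MatchResult
example = matchPat
  (PatPair (plist [PatVar "name1", PatVar "val1"])
           (PatRepeated (plist [PatVar "name2", PatVar "val2"]) PatEmpty))
  (slist [slist [SAtom "a", SAtom "1"], slist [SAtom "b", SAtom "2"], slist [SAtom "c", SAtom "3"]])

main :: IO ()
main = print example
```

The design choice is that the ellipsis is resolved at the list level: the matcher counts how many elements the trailing patterns need and gives all the others to the repeated pattern, so no backtracking is required. This assumes at most one ellipsis per list level, as in r7rs syntax-rules patterns.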

Speeding up a stream like data type

I've made a type which is supposed to emulate a "stream". This is basically a list without memory.
data Stream a = forall s. Stream (s -> Maybe (a, s)) s
Basically a stream has two elements. A state s, and a function that takes the state, and returns an element of type a and the new state.
I want to be able to perform operations on streams, so I've imported Data.Foldable and defined streams on it as such:
import Data.Foldable
instance Foldable Stream where
  foldr k z (Stream sf s) = go (sf s)
    where
      go Nothing = z
      go (Just (e, ns)) = e `k` go (sf ns)
To test the speed of my stream, I've defined the following function:
mysum = foldl' (+) 0
And now we can compare the speed of ordinary lists and my stream type:
x1 = [1..n]
x2 = Stream (\s -> if (s == n + 1) then Nothing else Just (s, s + 1)) 1
--main = print $ mysum x1
--main = print $ mysum x2
My streams are about half the speed of lists (full code here).
Furthermore, here's a best case situation, without a list or a stream:
bestcase :: Int
bestcase = go 1 0 where
  go i c = if i == n then c + i else go (i+1) (c+i)
This is a lot faster than both the list and stream versions.
So I've got two questions:
How do I get my stream version to be at least as fast as a list?
How do I get my stream version to be close to the speed of bestcase?

As it stands the foldl' you are getting from Foldable is defined in terms of the foldr you gave it. The default implementation is the brilliant and surprisingly good
foldl' :: (b -> a -> b) -> b -> t a -> b
foldl' f z0 xs = foldr f' id xs z0
  where f' x k z = k $! f z x
But foldl' is the specialty of your type; fortunately the Foldable class includes foldl' as a method, so you can just add this to your instance.
  -- The bang patterns below need {-# LANGUAGE BangPatterns #-} at the top of the file.
  foldl' op acc0 (Stream sf s0) = loop s0 acc0
    where
      loop !s !acc = case sf s of
        Nothing -> acc
        Just (a,s') -> loop s' (op acc a)
For me this seems to give about the same time as bestcase.
Note that this is a standard case where we need a strictness annotation on the accumulator. You might look in the vector package's treatment of a similar type https://hackage.haskell.org/package/vector-0.10.12.2/docs/src/Data-Vector-Fusion-Stream.html for some ideas; or in the hidden 'fusion' modules of the text library https://github.com/bos/text/blob/master/Data/Text/Internal/Fusion .
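Putting the pieces together, a self-contained version of the instance with the specialised `foldl'` might look like the sketch below. The `upTo` helper and the bound of one million are assumptions made for the example; the `foldr` is written slightly differently from the question's but does the same thing.

```haskell
{-# LANGUAGE BangPatterns #-}
{-# LANGUAGE ExistentialQuantification #-}

import Data.Foldable (foldl')

data Stream a = forall s. Stream (s -> Maybe (a, s)) s

instance Foldable Stream where
  -- Lazy right fold, kept for the other Foldable operations.
  foldr k z (Stream sf s0) = go s0
    where
      go s = case sf s of
        Nothing      -> z
        Just (a, s') -> a `k` go s'

  -- Strict left fold written as a direct loop over the state,
  -- with a strict accumulator.
  foldl' op acc0 (Stream sf s0) = loop s0 acc0
    where
      loop !s !acc = case sf s of
        Nothing      -> acc
        Just (a, s') -> loop s' (op acc a)

-- Hypothetical helper: the stream 1, 2, ..., n.
upTo :: Int -> Stream Int
upTo n = Stream (\s -> if s > n then Nothing else Just (s, s + 1)) 1

main :: IO ()
main = print (foldl' (+) 0 (upTo 1000000))  -- prints 500000500000
```

Whether this reaches the speed of bestcase depends on optimisation settings; compiling with -O2 gives GHC the chance to inline the step function into the loop.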

Representing a Transducer Systems in sml

I need help writing code such that:
Given two functions, say f1 and f2, and an initial input i1 for f1, I will feed i1 to f1; whatever output it returns, I will feed to f2; whatever f2 returns, I will feed to f1; and so on.
Thus it will look like this:
fun pair(m1, m2, i1) = ...
m1 and m2 here actually represent finite state transducers, such that m1 = (state, f1). The state here is the initial state, and we have i1 as the initial input. f1 takes in (state, input) and returns (next state, output); the output is then fed to the other machine, and so on.
For clarification, this represents a transducer system. This means that two FSTs with complementary inputs and outputs can be run in parallel, with the output of each serving as the input for the other.
This is supposed to return, say, the list of outputs generated.
To help, I have already written a function run that takes in an FST m and a list of inputs, and gives out the list of outputs obtained by running m on the inputs.
However, my head flipped when trying to write this function, because I kind of entered an infinite loop. Also, my code was unbelievably long, while this can be done easily using my helper function run.
Any ideas?

Interesting question. I think you should somehow use lazy evaluation. I'm not sure how to use it, since I have never done that and I have to admit I didn't really dig into it, but after a short search I think I can provide a few useful links.
So, my first guess was:
fun pairFirst f1 f2 i1 =
fn () => pairFirst f2 f1 (f1 i1)
as you would do it in LISP, but it obviously doesn't work in SML. So I googled it.
First, I found out that SML actually does support lazy evaluation:
http://www.cs.cmu.edu/~rwh/introsml/core/lazydata.htm
Quote:
"First off, the lazy evaluation mechanisms of SML/NJ must be enabled by evaluating the following declarations:
Compiler.Control.Lazy.enabled := true;
open Lazy;"
I tried it, but it also didn't work, so I googled some more:
https://github.com/trptcolin/euler-sml/blob/master/lazy_utils.sml
Quote:
" (* most lazy details are from Programming in Standard ML, Robert Harper
* notable exception: Compiler.Control.Lazy.enabled has moved to Control.lazysml *)
Control.lazysml := true;
open Lazy;"
From the content of these two links, I constructed my second guess:
Control.lazysml := true;
open Lazy;
fun lazy pair (f1: 'a -> 'a, f2: 'a -> 'a, i1: 'a) : 'a susp =
pair (f2, f1, (f1 i1))
SML somehow "swallows" it:
- [opening /home/spela/test.sml]
val it = () : unit
opening Lazy
datatype 'a susp = $ of 'a
val pair = fn : ('a -> 'a) * ('a -> 'a) * 'a -> 'a susp
val pair_ = fn : ('a -> 'a) * ('a -> 'a) * 'a -> 'a
val it = () : unit
Does it work? I have no idea :)
- pair ((fn x => x + 1), (fn y => y - 1), 1);
val it = $$ : int susp
I haven't read these links, but I also found an article which I also haven't read but I believe it provides answers you are looking for:
http://www.cs.mcgill.ca/~bpientka/courses/cs302-fall10/handouts/lazy-hof.pdf
I believe those links could answer your questions.
If there is anyone familiar with this topic, PLEASE, answer the question, I think it would be interesting for many of us.
Best regards, Špela

Thank you for the push, Špela!
Your ideas are on the right track.
So typically here is how it goes:
You do in fact use lazy evaluation. Here we work with our own lazy structure anyhow (you can create your own structures in ML).
Using the function run I mentioned earlier, I can make a function that runs m1 on i1 and then call it in a mutually recursive function just beneath it. Finally, I call the functions all together!
Here is what it will look like:
fun pair(m1, m2, i1)=
let
fun p1 () = run (m1) (delay(fn() => Gen(i1,p2())))
and p2 () = run (m2) (p1())
in
p1()
end
Here delay and Gen are part of my structure. Gen represents a stream with i1 as the first element and p2() as the rest. delay takes in a function and typically represents the laziness part in this implementation. Using mutually recursive functions (functions that call each other, enabled by typing "and" instead of "fun" as above), I could go back and forth and so on.
There is another, simpler method to implement this, believe it or not, but this is for starters. If you see any way to improve this answer (or have another solution), you are welcome to share! Thank you.
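The same knot-tying idea can be sketched in a language where lists are lazy by default. This is not the SML structure used above: the Haskell type `FST`, the functions `run` and `pair`, and the two sample machines `m1` and `m2` are all assumptions made for the illustration. Each machine's output list is defined in terms of the other's, and laziness lets the two lists be produced element by element.

```haskell
-- A transducer: an initial state and a step function.
type FST s i o = (s, s -> i -> (s, o))

-- Run a transducer over a (possibly infinite) list of inputs.
run :: FST s i o -> [i] -> [o]
run (s, step) (x : xs) = let (s', o) = step s x in o : run (s', step) xs
run _ [] = []

-- Feed i1 to m1, m1's outputs to m2, and m2's outputs back to m1.
-- Returns the outputs of m1.
pair :: FST s a b -> FST t b a -> a -> [b]
pair m1 m2 i1 = out1
  where
    out1 = run m1 (i1 : out2)
    out2 = run m2 out1

-- Sample machines: m1 adds a counter to its input, m2 doubles its input.
m1 :: FST Int Int Int
m1 = (0, \s i -> (s + 1, i + s))

m2 :: FST () Int Int
m2 = ((), \_ i -> ((), i * 2))

main :: IO ()
main = print (take 5 (pair m1 m2 1))  -- prints [1,3,8,19,42]
```

The result is an infinite list, so the caller decides how many outputs to observe, here with take.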

Is it possible to debug pattern matching in a Haskell function?

I have defined a type
data Expr =
    Const Double
  | Add Expr Expr
  | Sub Expr Expr
and declared it as an instance of Eq typeclass:
instance Eq Expr where
  (Add (Const a1) (Const a2)) == Const b = a1+a2 == b
  (Add (Const a1) (Const a2)) == (Add (Const b1) (Const b2)) = a1+a2 == b1 + b2
Of course, the evaluation of the expression Sub (Const 1) (Const 1) == Const 0 will fail. How can I debug at runtime the pattern matching process to spot that it's failing? I would like to see how Haskell takes the arguments of == and walks through the patterns. Is it possible at all?

edit: providing a real answer to the question...
I find the easiest way to see what patterns are matching is to add trace statements, like so:
import Debug.Trace
instance Eq Expr where
  (Add (Const a1) (Const a2)) == Const b = trace "Expr Eq pat 1" $ a1+a2 == b
  (Add (Const a1) (Const a2)) == (Add (Const b1) (Const b2)) = trace "Expr Eq pat 2" $ a1+a2 == b1 + b2
  -- catch any unmatched patterns
  l == r = error $ "Expr Eq failed pattern match. \n l: " ++ show l ++ "\n r: " ++ show r
If you don't include a final statement to catch any otherwise unmatched patterns, you'll get a runtime exception, but I find it's more useful to see what data you're getting. Then it's usually simple to see why it doesn't match the previous patterns.
Of course you don't want to leave this in production code. I only insert traces as necessary then remove them when I've finished. You could also use CPP to leave them out of production builds.
I also want to say that I think pattern matching is the wrong way to go about this. You'll end up with a combinatorial explosion in the number of patterns, which quickly grows unmanageable. If you want to make a Float instance for Expr for example, you'll need several more primitive constructors.
Instead, you presumably have an interpreter function interpret :: Expr -> Double, or at least could write one. Then you can define
instance Eq Expr where
  l == r = interpret l == interpret r
By pattern matching, you're essentially re-writing your interpret function in the Eq instance. If you want to make an Ord instance, you'll end up re-writing the interpret function yet again.
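As a minimal sketch of that suggestion, assuming `interpret` is simply arithmetic evaluation (the question does not show one), the whole instance then fits in a few lines:

```haskell
data Expr
  = Const Double
  | Add Expr Expr
  | Sub Expr Expr
  deriving (Show)

-- Assumed interpreter: evaluate an expression to a number.
interpret :: Expr -> Double
interpret (Const x) = x
interpret (Add a b) = interpret a + interpret b
interpret (Sub a b) = interpret a - interpret b

-- Two expressions are equal when they evaluate to the same number.
instance Eq Expr where
  l == r = interpret l == interpret r

main :: IO ()
main = print (Sub (Const 1) (Const 1) == Const 0)  -- prints True
```

Every constructor combination is now covered, so the failing case from the question evaluates without a pattern-match error. Note that this equality compares values, not structure, so Add (Const 1) (Const 2) and Const 3 are considered equal.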

If you wish to get some examples of how the matching may fail, you could have a look at QuickCheck. There's an example in the manual (on the size of test data) about generating and testing recursive data types that seems to suit your needs perfectly.
While the -Wall flag gives you a list of unmatched patterns, a run of QuickCheck gives you examples of input data that make your given proposition fail.
For example, if I write a generator for your Expr and give quickCheck a proposition prop_expr_eq :: Expr -> Bool that checks whether an Expr is equal to itself, I very quickly obtain Const 0.0 as a first example of non-matching input.
import Test.QuickCheck
import Control.Monad
data Expr =
    Const Double
  | Add Expr Expr
  | Sub Expr Expr
  deriving (Show)

instance Eq Expr where
  (Add (Const a1) (Const a2)) == Const b = a1+a2 == b
  (Add (Const a1) (Const a2)) == (Add (Const b1) (Const b2)) = a1+a2 == b1 + b2

instance Arbitrary Expr where
  arbitrary = sized expr'
    where
      expr' 0 = liftM Const arbitrary
      expr' n | n > 0 =
        let subexpr = expr' (n `div` 2)
        in oneof [liftM Const arbitrary,
                  liftM2 Add subexpr subexpr,
                  liftM2 Sub subexpr subexpr]

prop_expr_eq :: Expr -> Bool
prop_expr_eq e = e == e
As you can see, running the test gives you a counterexample proving that your equality test is wrong. I know this may be a bit of an overkill, but the advantage, if you write things well, is that you also get unit tests for your code that look at arbitrary properties, not only pattern-matching exhaustiveness.
*Main> quickCheck prop_expr_eq
*** Failed! Exception: 'test.hs:(11,5)-(12,81): Non-exhaustive patterns in function ==' (after 1 test):
Const 0.0
PS: Another good read about unit testing with QuickCheck is in the free book Real World Haskell.

You can break your complex pattern into simpler patterns and use trace to see what's going on. Something like this:
-- Requires import Debug.Trace and a Show instance for Expr.
instance Eq Expr where
  x1 == x2 | trace ("Top level: " ++ show (x1, x2)) True,
             Add x11 x12 <- x1,
             trace ("First argument Add: " ++ show (x11, x12)) True,
             Const a1 <- x11,
             trace ("Matched first Const: " ++ show a1) True,
             Const a2 <- x12,
             trace ("Matched second Const: " ++ show a2) True,
             Const b <- x2,
             trace ("Second argument Const: " ++ show b) True
           = a1+a2 == b
It's a bit desperate, but desperate times call for desperate measures. :)
As you get used to Haskell you rarely, if ever, need to do this.
