Representing a Transducer System in SML - lazy evaluation

I need help writing code such that:
Given two functions, say f1 and f2, and an initial input i1 for f1, I will feed i1 to f1, feed whatever output it returns to f2, feed whatever f2 returns back to f1, and so on...
Thus it will look like this:
fun pair(m1, m2, i1) = ...
m1 and m2 here actually represent finite state transducers, such that m1 = (state, f1). The state here is the initial state, and we have i1 as the initial input. f1 takes in (state, input) and returns (next state, output); the output is then fed to m2, and so on.
For clarification, this represents a transducer system: two FSTs with complementary inputs and outputs can be run in parallel, with the output of each serving as the input for the other.
This is supposed to return, say, the list of outputs generated.
To help, I have already written a function run that takes an FST m and a list of inputs and gives the list of outputs obtained by running m on the inputs.
However, my head spun when trying to write this function because I kind of entered an infinite loop; also, my code was unbelievably long, while this can be done easily using my helper function run.
Any ideas?

Interesting question. I think you should somehow use lazy evaluation. I'm not sure how to use it since I never did that, and I have to admit I didn't really dig into it, but after a short "googling" I think I can provide a few useful links.
So, my first guess was:
fun pairFirst f1 f2 i1 =
fn () => pairFirst f2 f1 (f1 i1)
as you would do it in LISP, but it obviously doesn't work in SML. So I googled it.
First, I found out that SML actually does support lazy evaluation:
http://www.cs.cmu.edu/~rwh/introsml/core/lazydata.htm
Quote:
"First off, the lazy evaluation mechanisms of SML/NJ must be enabled by evaluating the following declarations:
Compiler.Control.Lazy.enabled := true;
open Lazy;"
I tried it, but it also didn't work, so I googled some more:
https://github.com/trptcolin/euler-sml/blob/master/lazy_utils.sml
Quote:
" (* most lazy details are from Programming in Standard ML, Robert Harper
* notable exception: Compiler.Control.Lazy.enabled has moved to Control.lazysml *)
Control.lazysml := true;
open Lazy;"
From the content of these two links, I constructed my second guess:
Control.lazysml := true;
open Lazy;
fun lazy pair (f1: 'a -> 'a, f2: 'a -> 'a, i1: 'a) : 'a susp =
pair (f2, f1, (f1 i1))
SML somehow "swallows" it:
- [opening /home/spela/test.sml]
val it = () : unit
opening Lazy
datatype 'a susp = $ of 'a
val pair = fn : ('a -> 'a) * ('a -> 'a) * 'a -> 'a susp
val pair_ = fn : ('a -> 'a) * ('a -> 'a) * 'a -> 'a
val it = () : unit
Does it work? I have no idea :)
- pair ((fn x => x + 1), (fn y => y - 1), 1);
val it = $$ : int susp
I haven't read these links, but I also found an article which I also haven't read but I believe it provides answers you are looking for:
http://www.cs.mcgill.ca/~bpientka/courses/cs302-fall10/handouts/lazy-hof.pdf
I believe those links could answer your questions.
If there is anyone familiar with this topic, PLEASE, answer the question, I think it would be interesting for many of us.
Best regards, Špela

Thank you for the push, Špela!
Your ideas are on the right track.
So typically here is how it goes:
You do in fact use lazy evaluation. Here we work with our own lazy structure anyhow (you can create your own structures in ML).
Using the function run I mentioned earlier, I can make a function that runs m1 on i1 and then call it from a mutually recursive function just beneath it. Finally, I call the whole thing!
Here is how it will look:
fun pair(m1, m2, i1)=
let
fun p1 () = run (m1) (delay(fn() => Gen(i1,p2())))
and p2 () = run (m2) (p1())
in
p1()
end
Here delay and Gen are part of my structure. Gen represents a stream with i1 as the first element and p2() as the rest. delay takes in a function and represents the laziness part of this implementation. Using mutually recursive functions (functions that call each other, enabled by typing "and" instead of "fun" as above), I can go back and forth between the two machines.
There is another, simpler method to implement this, believe it or not, but this is for starters. If you can see any way to improve this answer (or have another solution), you are welcome to share! Thank you.
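For comparison, here is a minimal Python sketch of the same feedback loop, written as an explicit loop inside a generator rather than as mutually recursive streams. It assumes each machine is a (state, step) pair where step(state, input) returns (next_state, output); the names pair and step and the two example machines are made up for illustration. Unlike the SML version above, which returns m1's output stream, this sketch yields the outputs of both machines in alternation.

```python
from itertools import islice


def pair(m1, m2, i1):
    """Lazily yield the outputs of two machines that feed each other.

    Each machine is assumed to be a (state, step) pair, where
    step(state, input) returns (next_state, output).
    """
    s1, f1 = m1
    s2, f2 = m2
    x = i1
    while True:
        s1, x = f1(s1, x)  # m1 consumes the latest value
        yield x
        s2, x = f2(s2, x)  # m2 consumes m1's output
        yield x


# Hypothetical machines: the state just counts how many steps were taken.
add_one = (0, lambda s, x: (s + 1, x + 1))
double = (0, lambda s, x: (s + 1, x * 2))

first_four = list(islice(pair(add_one, double, 1), 4))
```

Because pair is a generator, nothing runs until a consumer asks for a value, so the infinite loop is harmless as long as you only take a finite prefix (here with islice).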

Related

SML Syntax Breakdown

I am trying to study SML (for full transparency, this is in preparation for an exam; the exam has not started) and one area that I have been struggling with is higher-order functions such as map and foldl/foldr. I understand that they are used in situations where you would use a for loop in OOP languages (I think). What I am struggling with, though, is what each part of a fold or map call is doing. Here are some examples; if someone could break them down I would be very appreciative.
fun cubiclist L = map (fn x=> x*x*x) L;
fun min (x::xs) = foldr (fn (a,b) => if (a < b) then a else b) x xs;
I will break down the parts I see and highlight the parts I'm struggling with; I believe that would be helpful.
Right off the bat you have the names of the functions and the parameters being passed in, but one question I have is: why do we just pass a variable to cubiclist, but for min we pass (x::xs)? Is it because map automatically applies the function to each element of the list? Also, will the fold functions typically take x::xs parameters while map will just take a variable?
Then we have the higher-order function along with the anonymous function containing the logic we want to apply to each element in the list. But I'm not quite sure about the parameters passed to the foldr anonymous function. I understand we are trying to capture the lowest element in the list, and that then a else b returns either a or b to be compared with the other elements in the list. I'm pretty sure the returned value is treated as a in future comparisons, but where do we get the following b's from? Where do we say b is the next element in the list?
Then the part that I really don't understand is the L; and x xs; at the end of the respective functions. Why are they there? What is their purpose? Is it just syntax, or do they actually do something? Are those variables that could be swapped for something else to give a different answer?
Any help/explanation is much appreciated.
In addition to what @molbdnilo has already stated, it can be helpful to a newcomer to functional programming to think about what we're actually doing when we create a loop: we're specifying a piece of code to run repeatedly. We need an initial state, a condition for the loop to terminate, and an update between each iteration.
Let's look at simple implementation of map.
fun map f [] = []
| map f (x :: xs) = f x :: map f xs
The initial state is the contents of the list.
The termination condition is the list is empty.
The update is that we tack f x onto the front of the result of mapping f to the rest of the list.
The usefulness of map is that we abstract away f. It can be anything, and we don't have to worry about writing the loop boilerplate.
Fold functions are both more complex and more instructive when comparing to loops in procedural languages.
A simple implementation of fold.
fun foldl f init [] = init
  | foldl f init (x :: xs) = foldl f (f (x, init)) xs
We explicitly provide an initial value, and a list to operate on.
The termination condition is the list being empty. If it is, we return the initial value provided.
The update is to call the function again. This time the initial value is updated, and the list is the tail of the original.
Consider summing a list of integers.
foldl op+ 0 [1,2,3,4]
foldl op+ 1 [2,3,4]
foldl op+ 3 [3,4]
foldl op+ 6 [4]
foldl op+ 10 []
10
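The same sequence of accumulator values can be reproduced in Python, as a rough cross-check of the trace above. This is only a sketch: it uses itertools.accumulate with the initial keyword (Python 3.8 or later) to expose every intermediate value, and functools.reduce as the closest standard equivalent of foldl.

```python
from functools import reduce
from itertools import accumulate
from operator import add

numbers = [1, 2, 3, 4]

# Every intermediate accumulator, starting from the initial value 0.
steps = list(accumulate(numbers, add, initial=0))

# reduce keeps only the final accumulator, like foldl.
total = reduce(add, numbers, 0)
```

Here steps holds the accumulator at each line of the trace and total is the last of them.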
Folds are important to understand because so many fundamental functions can be implemented in terms of foldl or foldr. Think of folding as a means of reducing (many programming languages refer to these functions as "reduce") a list to another value of some type.
map takes a function and a list and produces a new list.
In map (fn x=> x*x*x) L, the function is fn x=> x*x*x, and L is the list.
This list is the same list as cubiclist's parameter.
foldr takes a function, an initial value, and a list and produces some kind of value.
In foldr (fn (a,b) => if (a < b) then a else b) x xs, the function is fn (a,b) => if (a < b) then a else b, the initial value is x, and the list is xs.
x and xs are given to the function by pattern-matching; x is the argument's head and xs is its tail.
(It follows from this that min will fail if it is given an empty list.)
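To make the roles of the function, the initial value, and the list concrete, here is a small Python sketch of the same min. The foldr helper is written by hand for illustration (Python has no built-in of that name) and follows the SML convention that the function receives (element, accumulator); minimum is a made-up name that stands in for min.

```python
from functools import reduce


def foldr(f, init, xs):
    """Right fold: f receives (element, accumulator), as in SML's foldr."""
    return reduce(lambda acc, x: f(x, acc), reversed(xs), init)


def minimum(lst):
    # Mirrors the pattern x::xs: head and tail of a non-empty list.
    # Unpacking an empty list raises ValueError, just as the SML version
    # fails on [].
    x, *xs = lst
    return foldr(lambda a, b: a if a < b else b, x, xs)


smallest = minimum([3, 1, 2])
```

In each call of the lambda, a is the next list element (taken from the right) and b is the result carried over from the elements already processed; the very first b is the initial value x.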

adding a number to a list within a function OCaml

Here is what I have and the error that I am getting sadly is
Error: This function has type 'a * 'a list -> 'a list
It is applied to too many arguments; maybe you forgot a `;'.
Why is that the case? I plan on passing two lists to the deleteDuplicates function, a sorted list, and an empty list, and expect the duplicates to be removed in the list r, which will be returned once the original list reaches [] condition.
will be back with updated code
I don't know how useful this might be, but here is some code that does what you want, written in a fairly standard OCaml style. Spend some time making sure you understand how and why it works. Maybe you should start with something simpler (e.g. how would you sum the elements of a list of integers?). Actually, you should probably start with an OCaml tutorial, reading carefully and making sure you understand the code examples.
let deleteDuplicates u =
  (*
    u    : the sorted list
    v    : the result so far, in reverse order
    last : the last element we read from u
  *)
  let rec aux u v last =
    match u with
    | [] -> v
    | x :: xs when x = last -> aux xs v last
    | x :: xs -> aux xs (x :: v) x
  in
  (* the first element is a special case *)
  match u with
  | [] -> []
  | x :: xs -> List.rev (aux xs [x] x)
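If the accumulator style is unfamiliar, the same idea can be sketched in Python, where the "result so far" is an ordinary list and the "last element read" is simply its final item. This is an illustrative translation with a made-up name (delete_duplicates), not part of the OCaml answer, and like the OCaml code it assumes the input is already sorted.

```python
def delete_duplicates(sorted_items):
    """Drop repeated neighbours from an already sorted sequence."""
    result = []
    for x in sorted_items:
        # Keep x only if it differs from the last element we kept.
        if not result or result[-1] != x:
            result.append(x)
    return result


deduped = delete_duplicates([1, 1, 2, 3, 3, 3])
```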
This is not a direct answer to your question.
The standard way of defining an "n-ary" function is
let myfunc_caml_way arg0 arg1 = ...
rather than
let myfunc_java_way(arg0, arg1) = ...
Then you can call your function in this way:
myfunc_caml_way "10" 123
rather than
myfunc_java_way("10", 123)
See examples here:
https://github.com/ocaml/ocaml/blob/trunk/stdlib/complex.ml
By switching from myfunc_java_way to myfunc_caml_way, you will benefit from what's called "currying":
What is 'Currying'?
However, please note that you sometimes need to enclose an inner invocation in parentheses
myfunc_caml_way (otherfunc_caml_way "foo" "bar") 123
in order to tell the compiler not to interpret your code as
((myfunc_caml_way otherfunc_caml_way "foo") "bar" 123)
You seem to be thinking that OCaml uses tuples (a, b) to indicate arguments of function calls. This isn't the case. Whenever some expressions stand next to each other, that's a function call. The first expression is the function, and the rest of the expressions are the arguments to the function.
So, these two lines:
append(first,r)
deleteDuplicates(remaining, r)
represent a function call with three arguments. The function is append. The first argument is (first, r). The second argument is deleteDuplicates. The third argument is (remaining, r).
Since append has just one argument (a tuple), you're passing it too many arguments. This is what the compiler is telling you.
You also seem to be thinking that append(first, r) will change the value of r. This is not the case. Variables in OCaml are immutable. You can't do anything that will change the value of r.
Update
I think you have too many questions for SO to help you effectively at this point. You might try reading some OCaml tutorials. It will be much faster than asking a question here for every error you see :-)
Nonetheless, here's what "match failure" means. It means that somewhere you have a match that you're applying to an expression, but none of the patterns of the match matches the expression. Your deleteDuplicates code clearly has a pattern coverage error; i.e., it has a pattern that doesn't cover all cases. Your first match only works for empty lists or for lists of 2 or more elements. It doesn't work for lists of 1 element.

FP homework. Is it possible to define a function using nested pattern matching instead of auxiliary function?

I am solving the programming assignment for the Harvard CS 51 programming course in OCaml.
The problem is to define a function that compresses a list of chars to a list of pairs, where each pair contains the number of consecutive occurrences of a character in the list and the character itself, i.e. after applying this function to the list ['a';'a';'a';'a';'a';'b';'b';'b';'c';'d';'d';'d';'d'] we should get the list [(5,'a');(3,'b');(1,'c');(4,'d')].
I came up with the function that uses auxiliary function go to solve this problem:
let to_run_length (lst : char list) : (int*char) list =
let rec go i s lst1 =
match lst1 with
| [] -> [(i,s)]
| (x::xs) when s <> x -> (i,s) :: go 0 x lst1
| (x::xs) -> go (i + 1) s xs
in match lst with
| x :: xs -> go 0 x lst
| [] -> []
My question is: is it possible to define a recursive function to_run_length with nested pattern matching, without defining an auxiliary function go? In that case, how can we store the state of the counter of elements already passed?
The way you have implemented to_run_length is correct, readable and efficient. It is a good solution. (only nitpick: the indentation after in is wrong)
If you want to avoid the intermediary function, you must use the information present in the return from the recursive call instead. This can be described in a slightly more abstract way:
the run length encoding of the empty list is the empty list
the run length encoding of the list x::xs is:
if the run length encoding of xs starts with x, then ...
if it doesn't, then (1,x) :: run length encoding of xs
(I intentionally do not provide source code to let you work the detail out, but unfortunately there is not much to hide with such relatively simple functions.)
Food for thought: you usually encounter this kind of technique when considering tail-recursive and non-tail-recursive functions (what I've done resembles turning a tail-recursive function into non-tail-recursive form). In this particular case, your original function was not tail-recursive. A function is tail-recursive when the flow of arguments/results only goes "down" the recursive calls (you return them, rather than reusing them to build a larger result). In my function, the flow of arguments/results only goes "up" the recursive calls (the calls carry the least information possible, and all the code logic is done by inspecting the results). In your implementation, the flow goes both "down" (the integer counter) and "up" (the encoded result).
Edit: upon request of the original poster, here is my solution:
let rec run_length = function
| [] -> []
| x::xs ->
match run_length xs with
| (n,y)::ys when x = y -> (n+1,x)::ys
| res -> (1,x)::res
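For readers more at home in Python, the same "inspect the result of the recursive call" idea can be sketched as follows. It is an illustrative translation of the OCaml above, not an efficient implementation: slicing copies the list at each step and the recursion depth grows with the input length.

```python
def run_length(chars):
    """Run-length encode a list, building the result on the way back up."""
    if not chars:
        return []
    x = chars[0]
    rest = run_length(chars[1:])  # encoding of the tail
    if rest and rest[0][1] == x:
        # The tail's encoding starts with the same character: extend that run.
        n, _ = rest[0]
        return [(n + 1, x)] + rest[1:]
    # Otherwise x starts a new run of length 1.
    return [(1, x)] + rest


encoded = run_length(list("aaaaabbbcdddd"))
```

No counter is passed down: each call only learns the count by looking at the first pair returned from the call on the tail.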
I don't think it is a good idea to write this function. Current solution is OK.
But if you still want to do it you can use one of two approaches.
1) Without changing the arguments of your function. You can define some toplevel mutable values to hold the accumulators that are currently used in your auxiliary function.
2) You can add argument to your function to store some data. You can find some examples when googling for continuation-passing style.
Happy hacking!
P.S. I still want to underline that your current solution is OK and you don't need to improve it!

Is there a name for the function that returns a positionally-expanding version of its argument?

Consider splatter in this Python code:
def splatter(fn):
    return lambda (args): fn(*args)
def add(a, b):
    return a + b
list1 = [1, 2, 3]
list2 = [4, 5, 6]
print map(splatter(add), zip(list1, list2))
Mapping an n-ary function over n zipped sequences seems like a common enough operation that there might be a name for this already, but I have no idea where I'd find that. It vaguely evokes currying, and it seems like there are probably other related argument-centric HOFs that I've never heard of. Does anyone know if this is a "well-known" function? When discussing it I am currently stuck with the type of awkward language used in the question title.
Edit
Wow, Python's map does this automatically. You can write:
map(add, list1, list2)
And it will do the right thing, saving you the trouble of splattering your function. The only difference is that zip returns a list whose length is the length of its shortest argument, whereas map extends shorter lists with None.
I think zipWith is the function that you are searching for (this name is at least used in Haskell). It is even a bit more general. In Haskell zipWith is defined as follows (where the first line is just the type):
zipWith :: (a -> b -> c) -> [a] -> [b] -> [c]
zipWith f (a:as) (b:bs) = f a b : zipWith f as bs
zipWith _ _ _ = []
And your example would be something like
zipWith (+) [1, 2, 3] [4, 5, 6]
Since I do not know Python very well I can only point to "zipWith analogue in Python?".
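As a rough Python 3 counterpart to zipWith, the standard library's itertools.starmap applies a function to pre-zipped argument tuples, and the built-in map accepts several iterables directly. The sketch below reuses the question's add example via operator.add. Note that in Python 3 map stops at the shortest input rather than padding with None.

```python
from itertools import starmap
from operator import add

list1 = [1, 2, 3]
list2 = [4, 5, 6]

# starmap unpacks each zipped tuple into positional arguments.
via_starmap = list(starmap(add, zip(list1, list2)))

# map with several iterables does the zipping itself.
via_map = list(map(add, list1, list2))

# Python 3: the result is as long as the shortest input.
truncated = list(map(add, [1, 2, 3], [10, 20]))
```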
I randomly saw this in my list of "Questions asked," and was surprised that I now know the answer.
There are two interpretations of the function that I asked about.
The first was my intent: to take a function that takes a fixed number of arguments and convert it into a function that takes those arguments as a fixed-size list or tuple. In Haskell, the function that does this operation is called uncurry.
uncurry :: (a -> b -> c) -> ((a, b) -> c)
(Extra parens for clarity.)
It's easy to imagine extending this to functions of more than two arguments, though it can't be expressed in Haskell. But uncurry3, uncurry4, etc. would not be out of place.
So I was right that it "vaguely evokes currying," as it is really the opposite.
The second interpretation is to take a function that takes an intentionally variable number of arguments and return a function that takes a single list.
Because splat is so weird as a syntactic construct in Python, this is hard to reason about.
But if we imagine, say, JavaScript, which has a first-class named function for "splatting:"
varFn.apply(null, args)
var splatter = function(f) {
    return function(arg) {
        return f.apply(null, arg);
    };
};
Then we could rephrase that as merely a partial application of the "apply" function:
var splatter = function(f) {
    return Function.prototype.apply.bind(f, null);
};
Or, using Underscore's partial, we can come up with the point-free definition:
var splatter = _.partial(Function.prototype.bind.bind(Function.prototype.apply), _, null)
Yes, that is a nightmare.
(The alternative to _.partial requires defining some sort of swap helper and would come out even less readable, I think.)
So I think that the name of this operation is just "a partial application of apply", or in the Python case it's almost like a section of the splat operator -- if splat were an "actual" operator.
But the particular combination of uncurry, zip, and map in the original question is exactly zipWith, as chris pointed out. In fact, HLint by default includes a rule to replace this complex construct with a single call to zipWith.
I hope that clears things up, past Ian.

Haskell mutable map/tree

I am looking for a mutable (balanced) tree/map/hash table in Haskell, or a way to simulate one inside a function, i.e. when I call the same function several times, the structure is preserved. So far I have tried Data.HashTable (which is OK, but somewhat slow) and Data.Array.Judy, but I was unable to make the latter work with GHC 6.10.4. Are there any other options?
If you want mutable state, you can have it. Just keep passing the updated map around, or keep it in a state monad (which turns out to be the same thing).
import qualified Data.Map as Map
import Control.Monad.ST
import Data.STRef
memoize :: Ord k => (k -> ST s a) -> ST s (k -> ST s a)
memoize f = do
    mc <- newSTRef Map.empty
    return $ \k -> do
        c <- readSTRef mc
        case Map.lookup k c of
            Just a  -> return a
            Nothing -> do
                a <- f k
                -- insert into the current cache, not the snapshot c:
                -- f k may itself have added entries in the meantime
                modifySTRef mc (Map.insert k a)
                return a
You can use this like so. (In practice, you might want to add a way to clear items from the cache, too.)
import Control.Monad
main :: IO ()
main = do
    fib <- stToIO $ fixST $ \fib -> memoize $ \n ->
        if n < 2 then return n else liftM2 (+) (fib (n-1)) (fib (n-2))
    mapM_ (print <=< stToIO . fib) [1..10000]
At your own risk, you can unsafely escape from the requirement of threading state through everything that needs it.
import System.IO.Unsafe
unsafeMemoize :: Ord k => (k -> a) -> k -> a
unsafeMemoize f = unsafePerformIO $ do
    f' <- stToIO $ memoize $ return . f
    return $ unsafePerformIO . stToIO . f'
fib :: Integer -> Integer
fib = unsafeMemoize $ \n -> if n < 2 then n else fib (n-1) + fib (n-2)
main :: IO ()
main = mapM_ (print . fib) [1..1000]
Building on @Ramsey's answer, I also suggest you reconceive your function to take a map and return a modified one. Then code using good ol' Data.Map, which is pretty efficient at modifications. Here is a pattern:
import qualified Data.Map as Map
-- | takes input and a map, and returns a result and a modified map
myFunc :: a -> Map.Map k v -> (r, Map.Map k v)
myFunc a m = … -- put your function here
-- | run myFunc over a list of inputs, gathering the outputs
mapFuncWithMap :: [a] -> Map.Map k v -> ([r], Map.Map k v)
mapFuncWithMap as m0 = foldr step ([], m0) as
    where step a (rs, m) = let (r, m') = myFunc a m in (r:rs, m')
    -- this starts with an initial map, uses successive versions of the map
    -- on each iteration, and returns a tuple of the results, and the final map
-- | run myFunc over a list of inputs, gathering the outputs
mapFunc :: [a] -> [r]
mapFunc as = fst $ mapFuncWithMap as Map.empty
    -- same as above, but starts with an empty map, and ignores the final map
It is easy to abstract this pattern and make mapFuncWithMap generic over functions that use maps in this way.
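The state-threading pattern is not specific to Haskell. Here is a small Python sketch of the same shape, in which each call returns a result together with a new table instead of mutating a shared one. The names map_func_with_map and count_seen are hypothetical stand-ins for mapFuncWithMap and myFunc, and unlike the foldr version above this sketch walks the inputs left to right.

```python
def map_func_with_map(my_func, inputs, table):
    """Run my_func over inputs, passing each returned table to the next call."""
    results = []
    for a in inputs:
        r, table = my_func(a, table)  # rebind to the new version of the table
        results.append(r)
    return results, table


def count_seen(key, table):
    """Example my_func: report how many times key has been seen so far."""
    count = table.get(key, 0) + 1
    new_table = {**table, key: count}  # copy; the old table is left untouched
    return count, new_table


initial = {}
results, final_table = map_func_with_map(count_seen, ["a", "b", "a"], initial)
```

The caller decides what the initial table is and what to do with the final one, which is exactly the control the mutable version hides.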
Although you ask for a mutable type, let me suggest that you use an immutable data structure and that you pass successive versions to your functions as an argument.
Regarding which data structure to use,
There is an implementation of red-black trees at Kent
If you have integer keys, Data.IntMap is extremely efficient.
If you have string keys, the bytestring-trie package from Hackage looks very good.
The problem is that I cannot use (or I don't know how to use) a non-mutable type.
If you're lucky, you can pass your table data structure as an extra parameter to every function that needs it. If, however, your table needs to be widely distributed, you may wish to use a state monad where the state is the contents of your table.
If you are trying to memoize, you can try some of the lazy memoization tricks from Conal Elliott's blog, but as soon as you go beyond integer arguments, lazy memoization becomes very murky—not something I would recommend you try as a beginner. Maybe you can post a question about the broader problem you are trying to solve? Often with Haskell and mutability the issue is how to contain the mutation or updates within some kind of scope.
It's not so easy learning to program without any global mutable variables.
If I read your comments right, then you have a structure with possibly ~500k total values to compute. The computations are expensive, so you want them done only once, and on subsequent accesses, you just want the value without recomputation.
In this case, use Haskell's laziness to your advantage! ~500k is not so big: Just build a map of all the answers, and then fetch as needed. The first fetch will force computation, subsequent fetches of the same answer will reuse the same result, and if you never fetch a particular computation - it never happens!
You can find a small implementation of this idea using 3D point distances as the computation in the file PointCloud.hs. That file uses Debug.Trace to log when the computation actually gets done:
> ghc --make PointCloud.hs
[1 of 1] Compiling Main ( PointCloud.hs, PointCloud.o )
Linking PointCloud ...
> ./PointCloud
(1,2)
(<calc (1,2)>)
Just 1.0
(1,2)
Just 1.0
(1,5)
(<calc (1,5)>)
Just 1.0
(1,2)
Just 1.0
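Python is not lazy, so there is no direct equivalent of building the whole map up front and letting the runtime defer the work. The nearest sketch is a dictionary that computes a value on first access and stores it, using dict's __missing__ hook. LazyTable and its calls counter are made-up names for illustration; the counter plays the role that Debug.Trace plays in the log above.

```python
class LazyTable(dict):
    """A dict that computes each missing value once, on first lookup."""

    def __init__(self, compute):
        super().__init__()
        self._compute = compute
        self.calls = 0  # how many times the expensive function actually ran

    def __missing__(self, key):
        self.calls += 1
        value = self[key] = self._compute(key)
        return value


# Hypothetical expensive computation: distance between two 1-D points.
table = LazyTable(lambda pair: float(abs(pair[0] - pair[1])))

first = table[(1, 2)]   # computed now
second = table[(1, 2)]  # fetched from the table, not recomputed
```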
Are there any other options?
A mutable reference to a purely functional dictionary like Data.Map.

Resources