SML Syntax Breakdown - syntax

I am trying to study SML (for full transparency, this is in preparation for an exam; the exam has not started), and one area that I have been struggling with is higher-order functions such as map and foldl/foldr. I understand that they are used in situations where you would use a for loop in OOP languages (I think). What I am struggling with, though, is what each part in a fold or map function is doing. Here are some examples; if someone could break them down I would be very appreciative.
fun cubiclist L = map (fn x=> x*x*x) L;
fun min (x::xs) = foldr (fn (a,b) => if (a < b) then a else b) x xs;
So if I could break down the parts I see and highlight the parts I'm struggling with, I believe that would be helpful.
Obviously, right off the bat you have the names of the functions and the parameters that are being passed in, but one question I have on that part is: why are we just passing in a variable to cubiclist, but for min we pass in (x::xs)? Is it because the map function is automatically applying the function to each part in the map? Also, along with that, will the fold functions typically take the x::xs parameters while map will just take a variable?
Then we have the higher-order function along with the anonymous functions with the logic/operations that we want to apply to each element in the list. But I'm not quite sure about the parameters being passed in to the foldr anonymous function. I understand we are trying to capture the lowest element in the list, and the "then a else b" part returns either a or b to be compared with the other elements in the list. I'm pretty sure that they are returned and treated as a in future comparisons, but where do we get the following b's from? Where do we say b is the next element in the list?
Then the part that I really don't understand and have no clue about is the L; and x xs; at the end of the respective functions. Why are they there? What are they doing? What is their purpose? Is it just syntax, or is there actually a purpose for them being there? I'm not saying that syntax isn't a purpose or a valid reason, but do they actually do something? Are those variables that can be changed out with something else that would provide a different answer?
Any help/explanation is much appreciated.

In addition to what @molbdnilo has already stated, it can be helpful to a newcomer to functional programming to think about what we're actually doing when we create a loop: we're specifying a piece of code to run repeatedly. We need an initial state, a condition for the loop to terminate, and an update between each iteration.
Let's look at a simple implementation of map.
fun map f [] = []
  | map f (x :: xs) = f x :: map f xs
The initial state is the contents of the list.
The termination condition is the list is empty.
The update is that we tack f x onto the front of the result of mapping f to the rest of the list.
The usefulness of map is that we abstract away f. It can be anything, and we don't have to worry about writing the loop boilerplate.
Fold functions are both more complex and more instructive when comparing to loops in procedural languages.
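A simple implementation of foldl. As in the SML Basis Library, f takes a pair (element, accumulator), which is why op+ can be passed directly in the example further down.
fun foldl f init [] = init
  | foldl f init (x :: xs) = foldl f (f (x, init)) xs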
We explicitly provide an initial value, and a list to operate on.
The termination condition is the list being empty. If it is, we return the initial value provided.
The update is to call the function again. This time the initial value is updated, and the list is the tail of the original.
Consider summing a list of integers.
foldl op+ 0 [1,2,3,4]
foldl op+ 1 [2,3,4]
foldl op+ 3 [3,4]
foldl op+ 6 [4]
foldl op+ 10 []
10
Folds are important to understand because so many fundamental functions can be implemented in terms of foldl or foldr. Think of folding as a means of reducing (many programming languages refer to these functions as "reduce") a list to another value of some type.
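To make the "many functions are folds" point concrete, here is a rough sketch in OCaml rather than SML (the names length, reverse, map_via_fold and minimum are made up for illustration). Note the differences from SML: OCaml's List.fold_left takes a curried function of the accumulator and then the element, and List.fold_right takes the list before the initial value.

```ocaml
(* Count elements: the accumulator is the running count. *)
let length l = List.fold_left (fun n _ -> n + 1) 0 l

(* Reverse: push each element onto the front of the accumulator. *)
let reverse l = List.fold_left (fun acc x -> x :: acc) [] l

(* map written as a right fold: apply f and cons onto the folded tail. *)
let map_via_fold f l = List.fold_right (fun x acc -> f x :: acc) l []

(* Minimum: the head seeds the fold, so the empty list needs its own case. *)
let minimum = function
  | [] -> None
  | x :: xs -> Some (List.fold_left min x xs)
```

The minimum function mirrors the min example from the question: the head of the list is the initial value and the tail is the list being folded, except that the empty list is handled explicitly here instead of failing.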

map takes a function and a list and produces a new list.
In map (fn x=> x*x*x) L, the function is fn x=> x*x*x, and L is the list.
This list is the same list as cubiclist's parameter.
foldr takes a function, an initial value, and a list and produces some kind of value.
In foldr (fn (a,b) => if (a < b) then a else b) x xs, the function is fn (a,b) => if (a < b) then a else b, the initial value is x, and the list is xs.
x and xs are given to the function by pattern-matching; x is the argument's head and xs is its tail.
(It follows from this that min will fail if it is given an empty list.)

Related

Functional programming with OCAML

I'm new to functional programming and I'm trying to implement a basic algorithm in OCaml for a course that I'm currently following.
I'm trying to implement the following algorithm:
Entries :
- E : a non-empty set of integers
- s : an integer
- d : a strictly positive float
Output :
- T : a set of integers that is a subset of E
m <- min(E)
T <- {m}
FOR EACH e ∈ sort_ascending(E \ {m}) DO
    IF e > (1+d)m AND e <= s THEN
        T <- T U {e}
        m <- e
RETURN T
let f = fun (l: int list) (s: int) (d: float) ->
List.fold_left (fun acc x -> if ... then (list_union acc [x]) else acc)
[(list_min l)] (list_sort_ascending l) ;;
So far, this is what I have, but I don't know how to handle the modification of the "m" variable mentioned in the algorithm. So I need help understanding the best way to implement the algorithm; maybe I'm not going in the right direction.
Thanks in advance to anyone who takes the time to help me!
The basic trick of functional programming is that although you can't modify the values of any variables, you can call a function with different arguments. In the initial stages of switching away from imperative ways of thinking, you can imagine making every variable you want to modify into the parameters of your function. To modify the variables, you call the function recursively with the desired new values.
This technique will work for "modifying" the variable m. Think of m as a function parameter instead.
You are already using this technique with acc. Each call inside the fold gets the old value of acc and returns the new value, which is then passed to the function again. You might imagine having both acc and m as parameters of this inner function.
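A minimal sketch of that idea, assuming sets are represented as int lists. The accumulator of the fold is a pair (acc, m), so each step returns both the updated result and the updated m. Here list_min is a hypothetical stand-in for the helper in the question, and the standard List.sort and List.filter stand in for list_sort_ascending and for removing m from E.

```ocaml
(* Hypothetical helper: smallest element of a non-empty list. *)
let list_min (l : int list) : int =
  match l with
  | [] -> invalid_arg "list_min: empty list"
  | x :: xs -> List.fold_left min x xs

let f (l : int list) (s : int) (d : float) : int list =
  let m0 = list_min l in
  (* E \ {m}, sorted in ascending order *)
  let rest = List.sort compare (List.filter (fun e -> e <> m0) l) in
  (* One loop iteration: take the old (acc, m), return the new pair. *)
  let step (acc, m) e =
    if float_of_int e > (1. +. d) *. float_of_int m && e <= s
    then (e :: acc, e)   (* T <- T U {e}; m <- e *)
    else (acc, m)
  in
  let (acc, _) = List.fold_left step ([m0], m0) rest in
  List.rev acc           (* acc was built back to front *)
```

For example, f [1; 2; 3; 10; 20] 15 0.5 keeps 1, then 2 (since 2 > 1.5), skips 3 (3 is not greater than 3.0), keeps 10, and rejects 20 because it exceeds s, giving [1; 2; 10].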
Assuming list_min is defined, you should think about the problem methodically. Let's say you represent a set with a list. Your function takes this set and some arguments and returns a subset of the original set, containing the elements that meet certain conditions.
Now, when I read this for the first time, List.filter automatically came to my mind.
List.filter : ('a -> bool) -> 'a list -> 'a list
But you wanted to modify the m so this wouldn't be useful. It's important to know when you can use library functions and when you really need to create your own functions from scratch. You could clearly use filter while handling m as a reference but it wouldn't be the functional way.
First let's focus on your predicate:
fun s d m e -> (float e) > (1. +. d)*.(float m) && (e <= s)
Note that +. and *. are the plus and product functions for floats, and float is a function that casts an int to float.
Let's call the function I just wrote predicate.
Now, this is also a matter of opinion. In my experience, I wouldn't use fold_left here, because it complicates things and isn't necessary.
So let's begin with my idea of the code:
let m = list_min l;;
So this is the initial m
Then I will define an auxiliary function that reads the m as an argument, with l as your original set, and s, d and m the variables you used in your original imperative code.
let rec f' l s d m =
  match l with
  | [] -> []
  | x :: xs ->
    if predicate s d m x then
      (* keep x, and let x play the role of m from now on *)
      x :: f' xs s d x
    else
      (* skip x, m is unchanged *)
      f' xs s d m
Then for each element of your set, you check if it satisfies the predicate, and if it does, you call the function again but you replace the value of m with x.
Finally you could just call f' from a function f:
let f (l: int list) (s: int) (d: float) =
  let m = list_min l in
  f' l s d m
Be careful when creating a function like your list_min: what would happen if the list was empty? Normally you would use the option type to handle those cases, but you assumed you're dealing with a non-empty set, so that's fine.
When doing functional programming, it's important to think functionally. Pattern matching is highly recommended, while pointers/references should be kept to a minimum. I hope this is useful. Contact me if you have any other doubts or recommendations.

Recursive algorithm that returns every pair of a set

I was wondering if an algorithm of that kind exists; I don't have the slightest idea how to program it...
For example, if you give it [1;5;7]
it should return [(1,5);(1,7);(5,1);(5,7);(7,1);(7,5)]
I don't want to use any for loop.
Do you have any clue on how to achieve this ?
You have two cases: list is empty -> return empty list; list is not empty -> take first element x, for each element y yield (x, y) and make a recursive call on the tail of the list. Haskell:
pairs :: [a] -> [(a, a)]
pairs [] = []
pairs (x:xs) = [(x, x') | x' <- xs] ++ pairs xs
--*Main> pairs [1..10]
--[(1,2),(1,3),(1,4),(1,5),(1,6),(1,7),(1,8),(1,9),(1,10),(2,3),(2,4),(2,5),(2,6),(2,7),(2,8),(2,9),(2,10),(3,4),(3,5),(3,6),(3,7),(3,8),(3,9),(3,10),(4,5),(4,6),(4,7),(4,8),(4,9),(4,10),(5,6),(5,7),(5,8),(5,9),(5,10),(6,7),(6,8),(6,9),(6,10),(7,8),(7,9),(7,10),(8,9),(8,10),(9,10)]
I don't know if the algorithm used is a recursive one or not, but what you are asking for is the itertools.combinations('ABCD', 2) function from Python (or itertools.permutations('ABCD', 2), if both orders of each pair are wanted, as in your example). I suppose the same thing is implemented in other programming languages, so you can probably use a native function.
But if you need to write your own, then you can take a look at "Algorithm to return all combinations of k elements from n" (on this site) for some ideas.
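For the exact output in the question (both (1,5) and (5,1) appear), here is a rough recursive sketch in OCaml with no for loops. The function names are made up for illustration. The idea is to walk the list while remembering the elements already passed, so each element can be paired with everything except itself.

```ocaml
(* Pair x with every element of the given list. *)
let rec pairs_with x = function
  | [] -> []
  | y :: ys -> (x, y) :: pairs_with x ys

(* All ordered pairs of elements taken from two distinct positions. *)
let all_pairs l =
  (* before: elements already visited, stored in reverse order *)
  let rec go before = function
    | [] -> []
    | x :: after ->
      (* List.rev_append before after = every element except x, in order *)
      pairs_with x (List.rev_append before after) @ go (x :: before) after
  in
  go [] l
```

Called as all_pairs [1; 5; 7], this gives [(1,5); (1,7); (5,1); (5,7); (7,1); (7,5)]. Pairing is by position, so a list with repeated values would produce repeated pairs.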

adding a number to a list within a function OCaml

Here is what I have and the error that I am getting sadly is
Error: This function has type 'a * 'a list -> 'a list
It is applied to too many arguments; maybe you forgot a `;'.
Why is that the case? I plan on passing two lists to the deleteDuplicates function, a sorted list, and an empty list, and expect the duplicates to be removed in the list r, which will be returned once the original list reaches [] condition.
will be back with updated code
I don't know how useful this might be, but here is some code that does what you want, written in a fairly standard OCaml style. Spend some time making sure you understand how and why it works. Maybe you should start with something simpler (e.g., how would you sum the elements of a list of integers?). Actually, you should probably start with an OCaml tutorial, reading carefully and making sure you understand the code examples.
let deleteDuplicates u =
  (*
    u : the sorted list
    v : the result so far
    last : the last element we read from u
  *)
  let rec aux u v last =
    match u with
      [] -> v
    | x::xs when x = last -> aux xs v last
    | x::xs -> aux xs (x::v) x
  in
  (* the first element is a special case *)
  match u with
    [] -> []
  | x::xs -> List.rev (aux xs [x] x)
This is not a direct answer to your question.
The standard way of defining an "n-ary" function is
let myfunc_caml_way arg0 arg1 = ...
rather than
let myfunc_java_way(arg0, arg1) = ...
Then you can call your function in this way:
myfunc_caml_way "10" 123
rather than
myfunc_java_way("10", 123)
See examples here:
https://github.com/ocaml/ocaml/blob/trunk/stdlib/complex.ml
By switching from myfunc_java_way to myfunc_caml_way, you benefit from what's called "currying" (see the question "What is 'Currying'?").
However, please note that you sometimes need to enclose a whole invocation in parentheses
myfunc_caml_way (otherfunc_caml_way "foo" "bar") 123
in order to tell the compiler not to interpret your code as a call of myfunc_caml_way with four arguments:
((((myfunc_caml_way otherfunc_caml_way) "foo") "bar") 123)
You seem to be thinking that OCaml uses tuples (a, b) to indicate arguments of function calls. This isn't the case. Whenever some expressions stand next to each other, that's a function call. The first expression is the function, and the rest of the expressions are the arguments to the function.
So, these two lines:
append(first,r)
deleteDuplicates(remaining, r)
represent a single function call with three arguments. The function is append. The first argument is (first, r). The second argument is deleteDuplicates. The third argument is (remaining, r).
Since append has just one argument (a tuple), you're passing it too many arguments. This is what the compiler is telling you.
You also seem to be thinking that append(first, r) will change the value of r. This is not the case. Variables in OCaml are immutable. You can't do anything that will change the value of r.
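A small sketch of both points. The append below is hypothetical (the original code is not shown); it is only written to have the type from the error message, 'a * 'a list -> 'a list.

```ocaml
(* Hypothetical tupled function: it takes ONE argument, a pair. *)
let append (x, r) = x :: r

let r = []

(* Calling append builds a new list; r itself is not modified. *)
let r' = append (1, r)

(* To "update" a value, pass the new one along to the next call instead. *)
let r'' = append (2, r')
```

After this, r is still [], r' is [1] and r'' is [2; 1]. Writing append (1, r) on one line and another call on the next line, with nothing in between, is what makes the compiler read everything as arguments to append.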
Update
I think you have too many questions for SO to help you effectively at this point. You might try reading some OCaml tutorials. It will be much faster than asking a question here for every error you see :-)
Nonetheless, here's what "match failure" means. It means that somewhere you have a match that you're applying to an expression, but none of the patterns of the match matches the expression. Your deleteDuplicates code clearly has a pattern coverage error; i.e., it has a pattern that doesn't cover all cases. Your first match only works for empty lists or for lists of 2 or more elements. It doesn't work for lists of 1 element.
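As an illustration only (the original deleteDuplicates is not shown, so this is not a repair of that code), here is a sketch of a match on a sorted list that covers all three shapes: empty, exactly one element, and two or more elements. The name dedup_sorted is made up.

```ocaml
(* Remove adjacent duplicates from a sorted list. *)
let rec dedup_sorted = function
  | [] -> []
  | [x] -> [x]   (* the one-element case that is easy to forget *)
  | x :: (y :: _ as rest) ->
    if x = y then dedup_sorted rest   (* drop x, y stays in rest *)
    else x :: dedup_sorted rest
```

Leaving out the [x] case would still compile (with a non-exhaustive match warning), but any call that reaches a one-element list would then raise Match_failure at run time.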

Optimize "list" indexing in Haskell

Say you have a very deterministic algorithm that produces a list, like inits in Data.List. Is there any way that a Haskell compiler can optimally perform an "indexing" operation on this algorithm without actually generating all the intermediate results?
For example, inits [1..] !! 10000 is pretty slow. Could a compiler somehow deduce what inits would produce on the 10000th element without any recursion, etc? Of course, this same idea could be generalized beyond lists.
Edit: While inits [1..] !! 10000 is constant, I am wondering about any "index-like" operation on some algorithm. For example, could \i -> inits [1..] !! i be optimized such that no [or minimal] recursion is performed to reach the result for any i?
Yes and no. If you look at the definition for Data.List.inits:
inits :: [a] -> [[a]]
inits xs = [] : case xs of
    [] -> []
    x : xs' -> map (x :) (inits xs')
you'll see that it's defined recursively. That means that each element of the resulting list is built on the previous element of the list. So if you want any nth element, you have to build all n-1 previous elements.
Now you could define a new function
inits' xs = [] : [take n xs | (n, _) <- zip [1..] xs]
which has the same behavior. If you try to take inits' [1..] !! 10000, it finishes very quickly because the successive elements of the list do not depend on the previous ones. Of course, if you were actually trying to generate a list of inits instead of just a single element, this would be much slower.
The compiler would have to know a lot of information to be able to optimize away recursion from a function like inits. That said, if a function really is "very deterministic", it should be trivial to rewrite it in a non recursive way.
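To illustrate the manual rewrite, here is a rough sketch in OCaml on finite lists (OCaml lists are strict, so this does not reproduce the infinite [1..] case). The names inits, take and nth_init are made up for illustration. The observation is that the nth element of inits xs is just the first n elements of xs, so it can be computed directly without building the earlier prefixes.

```ocaml
(* Recursive definition, same shape as the Haskell one above. *)
let rec inits = function
  | [] -> [ [] ]
  | x :: xs -> [] :: List.map (fun l -> x :: l) (inits xs)

(* First n elements of a list (fewer if the list is shorter). *)
let rec take n = function
  | x :: xs when n > 0 -> x :: take (n - 1) xs
  | _ -> []

(* Direct "indexing": no intermediate prefixes are built. *)
let nth_init n xs = take n xs
```

For instance, List.nth (inits [1; 2; 3]) 2 and nth_init 2 [1; 2; 3] both give [1; 2], but only the first builds every shorter prefix along the way.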

FP homework. Is it possible to define a function using nested pattern matching instead of auxiliary function?

I am solving a programming assignment for the Harvard CS 51 programming course in OCaml.
The problem is to define a function that compresses a list of chars into a list of pairs, where each pair contains the number of consecutive occurrences of a character in the list and the character itself, i.e. after applying this function to the list ['a';'a';'a';'a';'a';'b';'b';'b';'c';'d';'d';'d';'d'] we should get the list [(5,'a');(3,'b');(1,'c');(4,'d')].
I came up with a function that uses an auxiliary function go to solve this problem:
let to_run_length (lst : char list) : (int*char) list =
  let rec go i s lst1 =
    match lst1 with
    | [] -> [(i,s)]
    | (x::xs) when s <> x -> (i,s) :: go 0 x lst1
    | (x::xs) -> go (i + 1) s xs
  in match lst with
    | x :: xs -> go 0 x lst
    | [] -> []
My question is: is it possible to define a recursive function to_run_length with nested pattern matching, without defining an auxiliary function go? In that case, how can we store the state of the counter of elements already passed?
The way you have implemented to_run_length is correct, readable and efficient. It is a good solution. (only nitpick: the indentation after in is wrong)
If you want to avoid the intermediary function, you must use the information present in the return from the recursive call instead. This can be described in a slightly more abstract way:
the run length encoding of the empty list is the empty list
the run length encoding of the list x::xs is:
if the run length encoding of xs starts with x, then ...
if it doesn't, then (1,x) :: run length encoding of xs
(I intentionally do not provide source code to let you work the detail out, but unfortunately there is not much to hide with such relatively simple functions.)
Food for thought: You usually encounter this kind of technique when considering tail-recursive and non-tail-recursive functions (what I've done resembles turning a tail-recursive function into non-tail-recursive form). In this particular case, your original function was not tail-recursive. A function is tail-recursive when the flow of arguments/results only goes "down" the recursive calls (you return the results as they are, rather than reusing them to build a larger result). In my function, the flow of arguments/results only goes "up" the recursive calls (the calls carry the least information possible, and all the code logic is done by inspecting the results). In your implementation, the flow goes both "down" (the integer counter) and "up" (the encoded result).
Edit: upon request of the original poster, here is my solution:
let rec run_length = function
  | [] -> []
  | x::xs ->
    match run_length xs with
    | (n,y)::ys when x = y -> (n+1,x)::ys
    | res -> (1,x)::res
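For contrast with the "down" versus "up" discussion, here is a sketch of a version where information flows only down the recursive calls, so it is tail-recursive. The name run_length_tr is made up; this is one possible way to write it, not the course's reference solution. The result so far is carried in an accumulator whose head is the run currently being counted.

```ocaml
let run_length_tr (lst : char list) : (int * char) list =
  (* acc holds the encoded runs in reverse order; its head is the current run *)
  let rec go acc = function
    | [] -> List.rev acc
    | x :: xs ->
      (match acc with
       | (n, y) :: rest when x = y -> go ((n + 1, y) :: rest) xs  (* extend run *)
       | _ -> go ((1, x) :: acc) xs)                              (* start new run *)
  in
  go [] lst
```

The price of tail recursion here is the final List.rev, since the accumulator is built back to front.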
I don't think it is a good idea to write this function. Current solution is OK.
But if you still want to do it you can use one of two approaches.
1) Without changing the arguments of your function: you can define some toplevel mutable values to hold the accumulators that are currently used in your auxiliary function.
2) You can add an argument to your function to store some data. You can find some examples by googling for continuation-passing style.
Happy hacking!
P.S. I still want to underline that your current solution is OK and you don't need to improve it!

Resources