Meaning of this OCaml syntax

I'm reading soup.ml in aantron/lambdasoup on GitHub, but I don't understand the syntax.
and 'a node =
{mutable self : 'b. 'b node option;
mutable parent : general node option;
values : [ `Element of element_values
| `Text of string
| `Document of document_values ]}
I don't understand 'b. 'b node option. If the separator were * it would be a tuple, but this is the first time I've seen it written with a dot. Also, why the backtick in the branches (e.g. `Element)?

A type of the form 'a. typ is explicitly polymorphic in 'a. So in your example, 'b. 'b node option declares a field whose contents must be polymorphic. In other words, any value assigned to the field must itself be polymorphic in 'b.
Here's an example with list rather than node:
type a = { mutable self : 'b. 'b list option; }
# let x = { self = None };;
val x : a = {self = None}
# x.self <- None;;
- : unit = ()
# x.self <- Some [];;
- : unit = ()
# x.self <- Some [3];;
Error: This field value has type int list option
which is less general than 'b. 'b list option
#
You can assign None to x.self because None is polymorphic (its type is 'a option, which works for any option type). You can assign Some [] to x.self because it's also polymorphic (its type is 'a list option, which works for any optional list). But you can't assign Some [3] to x.self because its type is int list option; in other words, it's not polymorphic.
You can find a discussion of explicitly polymorphic types in Section 5.2.1 of the OCaml manual.
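To see why explicit polymorphism is useful rather than just restrictive, here is a minimal sketch with a function-valued field. The names poly_id and use_twice are made up for illustration and do not come from lambdasoup.

```ocaml
(* A record whose field must work for every type 'a. *)
type poly_id = { id : 'a. 'a -> 'a }

(* Because the field is polymorphic, the same field can be used
   at two different types inside one function body. *)
let use_twice (p : poly_id) = (p.id 1, p.id "one")

let pair = use_twice { id = (fun x -> x) }

(* By contrast, { id = (fun x -> x + 1) } would be rejected:
   int -> int is less general than 'a. 'a -> 'a. *)
```

Without the 'a. annotation, the field would need a type parameter on the record, and p.id could then only be used at a single type per record value.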
Variant values with leading backquote like `A or `B are so-called polymorphic variants. This is a different feature than the usual variant types. The basic idea is that a polymorphic variant represents a value that is not necessarily part of any predefined type. The associated types are essentially sets of these values. Polymorphic variants can also be constructors as in your example type; that is, they can take an associated value. Just as you can have Some "yes", your definition allows one to have `Text "yes".
You can find some discussion of polymorphic variants in Section 7.4 of the OCaml manual (search for "polymorphic variant types").
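As a small illustration of the "not part of any predefined type" point, the sketch below uses tags without declaring any type first. The payloads are simplified assumptions: here `Element carries a plain string, whereas lambdasoup uses its own element_values record.

```ocaml
(* No type declaration is needed; the compiler infers a type
   from the tags that appear in the match. *)
let describe = function
  | `Text s -> "text: " ^ s
  | `Element name -> "element: " ^ name
  | `Document -> "document"

(* Tags can be mixed freely in one list. *)
let descriptions =
  List.map describe [ `Text "yes"; `Element "p"; `Document ]
```

The inferred argument type of describe is a set of allowed tags, which is what the bracketed list of tags in the lambdasoup record field spells out explicitly.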

Related

Implementation of a circular doubly linked list in OCaml

I am attempting to implement a circular doubly linked list using an OCaml type declaration. Here is what I have:
type 'a cList =
{
mutable value : 'a;
mutable left : 'a cList option;
mutable right : 'a cList option
}
;;
The problem comes up when I need to create an initial list containing a single element. Because the cell cannot be referenced before it has been bound, I cannot make its left and right members point to the cell itself.
The only workaround I have so far is to give the left and right members an option type, set them to None, and then modify them so that they point to the cell itself.
let unitClist v = let a = {
value = v;
left = None;
right = None
}
in
a.left <- Some a;
a.right <- Some a;
a
;;
It works, but it is cumbersome to have to work with the option type when you know a value is always present.
Is there a better way to do it?
Thanks in advance.
Apparently you can also define a recursive value directly with a record. By using a rec binding you can refer to the binding recursively in the definition (in certain cases):
type 'a cList = {
mutable value : 'a;
mutable left : 'a cList;
mutable right : 'a cList
}
let unitClist v =
let rec a = {
value = v;
left = a;
right = a
}
in a
This is documented in Chapter 8.1 of the OCaml Manual
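Building on the option-free record above, here is one possible sketch of inserting into and traversing such a ring. The helpers insert_right and to_list are hypothetical names added for illustration. Note the use of physical equality (==) to detect the start cell, since structural equality (=) may not terminate on cyclic values.

```ocaml
type 'a cList = {
  mutable value : 'a;
  mutable left : 'a cList;
  mutable right : 'a cList;
}

(* One-element ring: the cell is its own left and right neighbour. *)
let unitClist v =
  let rec a = { value = v; left = a; right = a } in
  a

(* Insert a new cell holding v immediately to the right of cell. *)
let insert_right cell v =
  let n = { value = v; left = cell; right = cell.right } in
  cell.right.left <- n;
  cell.right <- n;
  n

(* Walk rightwards once around the ring, starting at start. *)
let to_list start =
  let rec go cell acc =
    if cell == start then List.rev acc
    else go cell.right (cell.value :: acc)
  in
  start.value :: go start.right []

let ring =
  let first = unitClist 1 in
  let second = insert_right first 2 in
  let _third = insert_right second 3 in
  first
```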
You could use objects instead, which are late-bound and therefore allow self-reference:
object(self)
method value = v
method left = self
method right = self
end
There's a performance cost to this, of course, because all method calls will be dispatched dynamically, but this wouldn't be possible otherwise so that's the essential trade-off. You can't reference something that doesn't yet exist without employing some kind of indirection (edit: which a heap-allocated record already is. See my other answer).

Do any functional programming languages have syntax sugar for changing part of an object?

In imperative programming, there is concise syntax sugar for changing part of an object, e.g. assigning to a field:
foo.bar = new_value
Or to an element of an array, or in some languages an array-like list:
a[3] = new_value
In functional programming, the idiom is not to mutate part of an existing object, but to create a new object with most of the same values, but a different value for that field or element.
At the semantic level, this brings about significant improvements in ease of understanding and composing code, albeit not without trade-offs.
I am asking here about the trade-offs at the syntax level. In general, creating a new object with most of the same values, but a different value for one field or element, is a much more heavyweight operation in terms of how it looks in your code.
Is there any functional programming language that provides syntax sugar to make that operation look more concise? Obviously you can write a function to do it, but imperative languages provide syntax sugar to make it more concise than calling a procedure; do any functional languages provide syntax sugar to make it more concise than calling a function? I could swear that I have seen syntax sugar for at least the object.field case, in some functional language, though I forget which one it was.
(Performance is out of scope here. In this context, I am talking only about what the code looks like and does, not how fast it does it.)
Haskell records have this functionality. You can define a record to be:
data Person = Person
{ name :: String
, age :: Int
}
And an instance:
johnSmith :: Person
johnSmith = Person
{ name = "John Smith"
, age = 24
}
And create an altered copy:
johnDoe :: Person
johnDoe = johnSmith {name = "John Doe"}
-- Result:
-- johnDoe = Person
-- { name = "John Doe"
-- , age = 24
-- }
This syntax, however, is cumbersome when you have to update deeply nested records. The lens library solves this problem quite well.
However, Haskell lists do not provide an update syntax because updating on lists will have an O(n) cost - they are singly-linked lists.
If you want efficient update on list-like collections, you can use Arrays in the array package, or Vectors in the vector package. They both have the infix operator (//) for updating:
alteredVector = someVector // [(1, "some value")]
-- similar to `someVector[1] = "some value"`
It is not built in, but I think the infix notation is convenient enough!
One language with that kind of sugar is F#. It allows you to write
let myRecord3 = { myRecord2 with Y = 100; Z = 2 }
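OCaml, from which F# inherits much of its syntax, has the same with form for records. A minimal sketch, with a made-up point type:

```ocaml
type point = { x : int; y : int; z : int }

let p1 = { x = 1; y = 2; z = 3 }

(* Copy p1, replacing only the y and z fields. p1 is unchanged. *)
let p2 = { p1 with y = 100; z = 2 }
```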
Scala also has sugar for updating a Map:
ms + (k -> v)
ms updated (k,v)
In a language such as Haskell, you would need to write this yourself. If you can express the update as a key-value pair, you might define
let structure' =
update structure key value
or
update structure (key, value)
which would let you use infix notation such as
structure `update` (key, value)
structure // (key, value)
As a proof of concept, here is one possible (inefficient) implementation, which also fails if your index is out of range:
module UpdateList (updateList, (//)) where
import Data.List (splitAt)
updateList :: [a] -> (Int,a) -> [a]
updateList xs (i,y) = let ( initial, (_:final) ) = splitAt i xs
in initial ++ (y:final)
infixl 6 // -- Same precedence as +
(//) :: [a] -> (Int,a) -> [a]
(//) = updateList
With this definition, ["a","b","c","d"] // (2,"C") returns ["a","b","C","d"]. And [1,2] // (2,3) throws a runtime exception, but I leave that as an exercise for the reader.
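For comparison, here is one way the out-of-range case could be handled, sketched in OCaml rather than Haskell. The name update_list and the choice to return an option are assumptions made for this example, not part of the answer above.

```ocaml
(* Return Some of the updated list, or None if i is out of range.
   Like the Haskell version, this is O(n) and builds a new list. *)
let update_list xs (i, y) =
  if i < 0 || i >= List.length xs then None
  else Some (List.mapi (fun j x -> if j = i then y else x) xs)
```

With this sketch, update_list ["a"; "b"; "c"; "d"] (2, "C") gives Some ["a"; "b"; "C"; "d"], and update_list [1; 2] (2, 3) gives None instead of failing at run time.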
H. Rhen gave an example of Haskell record syntax that I did not know about, so I’ve removed the last part of my answer. See theirs instead.

Inline records in polymorphic variants?

Chapter 8 of the OCaml manual, "Language extensions", describes "inline records" (8.17):
The arguments of sum-type constructors can now be defined using the same syntax as records. Mutable and polymorphic fields are allowed. GADT syntax is supported. Attributes can be specified on individual fields. [...]
I am looking for that with polymorphic variants:
# type a = B of {x:int; mutable y:int} ;;
type a = B of { x : int; mutable y : int; }
# type b = `A of {u:int; mutable v:int} ;;
Line 1, characters 9-10:
Error: Syntax error
But that does not work, so right now I use an explicit auxiliary record type instead...
As I understand it now, this both takes more memory and is somewhat slower.
Can I get this cool feature with polymorphic variants, too?
In the cases of ordinary constructors, the compiler can use the type definition to distinguish between:
type t = A of int * int | B
let f = function
| A (_,y) -> y
| B -> 0
and
type 'a t = A of 'a | B
let f = function
| A (_,y) -> y
| B -> 0
Thus, it is possible to optimize the first
A (_,y) -> y
into "access the second field of the block" while still compiling the second case
A (_,y) -> y
to "access the tuple in the first field of the block, and then access the second field of that tuple".
For polymorphic variants, it is not possible to rely on the non-existing type definition to distinguish between those two solutions. Consequently, their memory representation must be uniform. This means that polymorphic variants always take one argument, and it is not really useful to label each argument of the constructor when there is only one argument.
This is why inline records cannot be combined with polymorphic variants.
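The workaround mentioned in the question, an explicit auxiliary record type, might look like the sketch below. The names uv and bump are hypothetical; the field names follow the question.

```ocaml
(* The record is declared separately and used as the single argument. *)
type uv = { u : int; mutable v : int }

type b = [ `A of uv ]

(* Mutable fields still work through the auxiliary record. *)
let bump (x : b) = match x with `A r -> r.v <- r.v + 1

let r = { u = 1; v = 2 }
let () = bump (`A r)
```

The cost is the extra indirection described in the question: the variant block points to a separately allocated record instead of holding the fields itself.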

Queues in Ocaml - enq

I can't seem to get my enq code working. Also, how am I supposed to declare the queue datatype?
let enq (q, x) =
match q with
queue ([], b) -> queue([], x::b)
| queue (f, []) -> queue(f, [x])
| queue (f, b) -> queue(f, x::b);;
You need to distinguish between the name of the type and the constructor names used to make elements of the type. For your code, the type name could be queue (type names begin with lower case) and you could have just a single constructor named Queue (constructors begin with upper case).
The type declaration would look like this then:
type 'a queue = Queue of 'a list * 'a list
Note that this type is parameterized by the type of the elements of the queue (denoted as 'a in the declaration).
In your code, the pattern matches should be against the constructor Queue, not against the type name queue.
I would also say as a side comment that all the cases in your code look the same to me. I suspect there might be just one case.
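Putting these points together, a corrected version might look like the following sketch. It assumes the common two-list representation (front list plus reversed back list). The names empty and deq are additions for illustration and were not part of the question.

```ocaml
type 'a queue = Queue of 'a list * 'a list

let empty = Queue ([], [])

(* A single case is enough: always push onto the back list. *)
let enq (Queue (f, b), x) = Queue (f, x :: b)

(* Hypothetical companion: take from the front list, reversing
   the back list into the front when the front runs out. *)
let rec deq = function
  | Queue ([], []) -> None
  | Queue ([], b) -> deq (Queue (List.rev b, []))
  | Queue (x :: f, b) -> Some (x, Queue (f, b))
```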

Syntax sugar of OCaml functors

Why is it that given:
module type ENTRY = sig type t end
module type LOG = functor (E : ENTRY) -> sig type t end
This is a valid implementation of LOG
module Log :LOG = functor (LogEntry : ENTRY) ->
struct type t = LogEntry.t list end
But this isn't
module Log (LogEntry: ENTRY) :LOG = struct
type t = LogEntry.t list end
Error: Signature mismatch:
Modules do not match: sig type t = LogEntry.t list end is not included in LOG
If I remove the signature annotation (:LOG) from both definitions of Log, then they have the same type, since one is just syntactic sugar for the other [1].
[1] http://caml.inria.fr/pub/docs/oreilly-book/html/book-ora132.html
The error message is confusing but the reason the first example passes and the second fails is actually very simple. Compare:
type entry = int
type log = int -> string
let log : log = fun s -> string_of_int s
and
let log (s : entry) : log = string_of_int s
In the second definition, the annotation : log applies to the body string_of_int s, which is a string rather than a function of type log, so it is rejected. The same thing happens with modules: the error message states that a module field is not included in a functor, because an unapplied functor does not have fields.
ETA: a functor logically cannot have fields, because functions and functors are a different sort of beast from data structures and modules. This is what makes the error message confusing: it sounds as if we are being asked to introduce a field, although that field is already present in the result of the functor.
Let me elaborate on lukstafi's answer.
LOG is the type of the functor. In the first case it is matched against the Log implementation itself (which is indeed a functor), but in the second case it is matched against the result of applying the functor (which is an ordinary module), hence the mismatch.
All in all it looks like syntax misunderstanding. Read carefully the section on module types in the manual.
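To make the distinction concrete, here is a sketch of the sugared form with the annotation moved to where it belongs. In the form module Log (LogEntry : ENTRY) : S = ..., the signature S describes the result of the functor, so it should be the result signature rather than LOG. The module name Check is made up for this example.

```ocaml
module type ENTRY = sig type t end
module type LOG = functor (E : ENTRY) -> sig type t end

(* The annotation after the parameter constrains the functor's result. *)
module Log (LogEntry : ENTRY) : sig type t end = struct
  type t = LogEntry.t list
end

(* The functor as a whole still matches LOG. *)
module Check : LOG = Log
```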

Resources