F# discriminated union syntax clarification - syntax

I'm reading Expert F# 4.0 and at some point (p.93) the following syntax is introduced for list:
type 'T list =
| ([])
| (::) of 'T * 'T list
Although I understand conceptually what's going on here, I do not understand the syntax. Apparently you can put [] or :: between parentheses and they mean something special.
Other symbols aren't allowed, for example (++) or (||). So what's going on here?
And another thing is the 'operator' nature of (::). Suppose I have the following (weird) type:
type 'T X =
| None
| Some of 'T * 'T X
| (::) of 'T * 'T X
Now I can say:
let x: X<string> = Some ("", None)
but these aren't allowed:
let x: X<string> = :: ("", None)
let x: X<string> = (::) ("", None)
So (::) is actually something completely different than Some, although both are cases in a discriminated union.

Theoretically, the F# spec (see section 8.5) says that union case identifiers must be alphanumeric sequences starting with an upper-case letter.
However, this way of defining list cons is an ML idiomatic thing. There would be riots in the streets if we were forced to write Cons (x, Cons(y, Cons (z, Empty))) instead of x :: y :: z :: [].
So an exception was made for just these two identifiers - ([]) and (::). You can use these, but only these two. Besides these two, only capitalized alphanumeric names are allowed.
However, you can define free-standing functions with these funny names:
let (++) a b = a * b
These functions are usually called "operators" and can be called via infix notation:
let x = 5 ++ 6 // x = 30
As opposed to regular functions that only support prefix notation - i.e. f 5 6.
There is a separate quite intricate set of rules about which characters are allowed in operators, which can be only unary, which can be only binary, which can be both, and how they define the resulting operator precedence. See section 4.1 of the spec or here for full reference.
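As a rough illustration, here is the same idea sketched in OCaml (F#'s close ML relative) rather than in F# itself. The type my_list and the operators ( ++ ) and ( @: ) are hypothetical names chosen for this example. In OCaml, precedence and associativity are derived from an operator's first character, which is the kind of rule the paragraph above alludes to.

```ocaml
(* A user-defined list type with ordinary, capitalized constructors. *)
type 'a my_list =
  | Empty
  | Cons of 'a * 'a my_list

(* A hypothetical infix operator; like the F# example, it multiplies. *)
let ( ++ ) a b = a * b

let thirty = 5 ++ 6           (* infix use *)
let also_thirty = ( ++ ) 5 6  (* prefix use, operator wrapped in parentheses *)

(* An operator starting with + gets the precedence of +, so the
   multiplication binds tighter: this parses as 2 ++ (3 * 4). *)
let twenty_four = 2 ++ 3 * 4

(* A hypothetical cons-like operator built on Cons. Operators starting
   with @ are right-associative in OCaml, so no parentheses are needed. *)
let ( @: ) x rest = Cons (x, rest)
let xs = 1 @: 2 @: 3 @: Empty
```

Here xs is the same value as Cons (1, Cons (2, Cons (3, Empty))), which is the readability gain the special-cased (::) gives the built-in list.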

Related

Scope of implicit type variables in OCaml constraints

In Ocaml you can introduce new type variables inside a constraint, which is useful to enforce type-identities in the type-checker:
let f g n = (g (n:'n):'n) ;;
val f : ('n -> 'n) -> 'n -> 'n = <fun>
It is obviously possible to re-use these type variables (otherwise it would be a rather pointless exercise). However, since they are not introduced by some special statement, I wonder what their scope is. Is it the enclosing function, let-binding or top-level statement?
Is there a way to limit the scope of such an implicitly introduced type-variable?
The scope of any type variable used in a type constraint is the body of the enclosing let-expression. If the expression is mutually recursive, the scope extends to the whole set of mutually recursive expressions. The scope cannot be reduced: the let-expression is a typing primitive, and it is not possible to hide or override a type variable.
Whenever a type variable appears, it is looked up in the current typing context. If it was already introduced, it is unified with the existing one. Otherwise a new type variable is added to the context, where it can later be used for unification.
An example to clarify the idea:
let rec f g h x y = g (x : 'a) + h (y : 'a) and e (x : 'a) = x + 1;;
Here, the 'a used to constrain x in e is the same 'a that was used to constrain x and y in the body of function f. Since x in e is unified with type int, the unification extends to function f, constraining functions g and h to type int -> int.
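A small sketch of how this plays out in practice. The definitions incr_it, shout and id are hypothetical names added for illustration; only f and e come from the example above.

```ocaml
(* 'a is shared across the whole let rec ... and ... group, so it is
   unified with int in both f and e. *)
let rec f g h x y = g (x : 'a) + h (y : 'a)
and e (x : 'a) = x + 1

(* Each top-level definition starts with a fresh scope, so these two
   uses of 'a are unrelated: one becomes int, the other string. *)
let incr_it (x : 'a) = x + 1
let shout (x : 'a) = x ^ "!"

(* To demand real polymorphism instead of a unification variable,
   an explicitly quantified annotation can be used. *)
let id : 'a. 'a -> 'a = fun x -> x
```

With these definitions, f succ pred 1 2 evaluates to 3, and id can be applied to both an int and a string.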

Understanding the syntax of Agda

Using the following as an example
postulate DNE : {A : Set} → ¬ (¬ A) → A
data ∨ (A B : Set) : Set where
  inl : A → A ∨ B
  inr : B → A ∨ B
-- Use double negation to prove exclude middle
classical-2 : {A : Set} → A ∨ ¬ A
classical-2 = λ {A} → DNE (λ z → z (inr (λ x → z (inl x))))
I know this is correct, purely because of how agda works, but I am new to this language and can't get my head around how its syntax works, I would appreciate if anyone can walk me through what is going on, thanks :)
I have experience in haskell, although that was around a year ago.
Let's start with the postulate. The syntax is simply:
postulate name : type
This asserts that there exists some value of type type called name. Think of postulates as axioms in logic: things that are defined to be true and are not to be questioned (by Agda, in this case).
Next up is the data definition. There's a slight oversight with the mixfix declaration so I'll fix it and explain what it does. The first line:
data _∨_ (A B : Set) : Set where
Introduces a new type (constructor) called _∨_. _∨_ takes two arguments of type Set and then returns a Set.
I'll compare it with Haskell. The A and B are more or less equivalent to a and b in the following example:
data Or a b = Inl a | Inr b
This means that the data definition defines a polymorphic type (a template or a generic, if you will). Set is the Agda equivalent of Haskell's *.
What's up with the underscores? Agda allows you to define arbitrary operators (prefix, postfix, infix... usually just called by a single name - mixfix). The underscores just tell Agda where the arguments are. This is best seen with prefix/postfix operators:
-_ : Integer → Integer -- unary minus
- n = 0 - n
_++ : Integer → Integer -- postfix increment
x ++ = x + 1
You can even create crazy operators such as:
if_then_else_ : ...
The next part is the definition of the data constructors themselves. If you've seen Haskell's GADTs, this is more or less the same thing. If you haven't:
When you define a constructor in Haskell, say Inr above, you just specify the type of the arguments and Haskell figures out the type of the whole thing, that is Inr :: b -> Or a b. When you write GADTs or define data types in Agda, you need to specify the whole type (and there are good reasons for this, but I won't get into that now).
So, the data definition specifies two constructors, inl of type A → A ∨ B and inr of type B → A ∨ B.
Now comes the fun part: first line of classical-2 is a simple type declaration. What's up with the Set thing? When you write polymorphic functions in Haskell, you just use lower case letters to represent type variables, say:
id :: a -> a
What you really mean is:
id :: forall a. a -> a
And what you really mean is:
id :: forall (a :: *). a -> a
I.e. it's not just any kind of a, but that a is a type. Agda makes you do this extra step and declare this quantification explicitly (that's because you can quantify over more things than just types).
And the curly braces? Let me use the Haskell example above again. When you use the id function somewhere, say id 5, you don't need to specify that a = Integer.
If you used normal parentheses, you'd have to provide the actual type A every time you called classical-2. However, most of the time, the type can be deduced from the context (much like the id 5 example above), so for those cases, you can "hide" the argument. Agda then tries to fill that in automatically - and if it cannot, it complains.
And for the last line: λ x → y is the Agda way of saying \x -> y. That should explain most of the line, the only thing that remains are the curly braces yet again. I'm fairly sure that you can omit them here, but anyways: hidden arguments do what they say - they hide. So when you define a function from {A} to B, you just provide something of type B (because {A} is hidden). In some cases, you need to know the value of the hidden argument and that's what this special kind of lambda does: λ {A} → allows you to access the hidden A!

adding a number to a list within a function OCaml

Here is what I have and the error that I am getting sadly is
Error: This function has type 'a * 'a list -> 'a list
It is applied to too many arguments; maybe you forgot a `;'.
Why is that the case? I plan on passing two lists to the deleteDuplicates function, a sorted list, and an empty list, and expect the duplicates to be removed in the list r, which will be returned once the original list reaches [] condition.
will be back with updated code
I don't know how useful this might be, but here is some code that does what you want, written in a fairly standard OCaml style. Spend some time making sure you understand how and why it works. Maybe you should start with something simpler (e.g. how would you sum the elements of a list of integers?). Actually, you should probably start with an OCaml tutorial, reading carefully and making sure you understand the code examples.
let deleteDuplicates u =
  (*
    u : the sorted list
    v : the result so far
    last : the last element we read from u
  *)
  let rec aux u v last =
    match u with
      [] -> v
    | x::xs when x = last -> aux xs v last
    | x::xs -> aux xs (x::v) x
  in
  (* the first element is a special case *)
  match u with
    [] -> []
  | x::xs -> List.rev (aux xs [x] x)
This is not a direct answer to your question.
The standard way of defining an "n-ary" function is
let myfunc_caml_way arg0 arg1 = ...
rather than
let myfunc_java_way(arg0, arg1) = ...
Then you can call your function in this way:
myfunc_caml_way "10" 123
rather than
myfunc_java_way("10", 123)
See examples here:
https://github.com/ocaml/ocaml/blob/trunk/stdlib/complex.ml
By switching from myfunc_java_way to myfunc_caml_way, you benefit from what is called "currying":
What is 'Currying'?
However, please note that you sometimes need to enclose a nested invocation in parentheses
myfunc_caml_way (otherfunc_caml_way "foo" "bar") 123
in order to tell the compiler not to interpret your code as
((((myfunc_caml_way otherfunc_caml_way) "foo") "bar") 123)
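A minimal sketch of the difference, using hypothetical functions add_tupled and add_curried that are not part of the question's code:

```ocaml
(* Tupled style: one argument, which happens to be a pair. *)
let add_tupled (a, b) = a + b      (* int * int -> int *)

(* Curried style: the usual OCaml way to take two arguments. *)
let add_curried a b = a + b        (* int -> int -> int *)

(* Partial application falls out of currying. *)
let add_five = add_curried 5       (* int -> int *)

(* Parentheses group a nested call into a single argument. *)
let nested = add_curried (add_curried 1 2) 10
```

Without the inner parentheses, add_curried add_curried 1 2 10 would try to pass the function add_curried itself as the first argument and fail to type-check.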
You seem to be thinking that OCaml uses tuples (a, b) to indicate arguments of function calls. This isn't the case. Whenever some expressions stand next to each other, that's a function call. The first expression is the function, and the rest of the expressions are the arguments to the function.
So, these two lines:
append(first,r)
deleteDuplicates(remaining, r)
represent a single function call with three arguments. The function is append. The first argument is (first, r). The second argument is deleteDuplicates. The third argument is (remaining, r).
Since append has just one argument (a tuple), you're passing it too many arguments. This is what the compiler is telling you.
You also seem to be thinking that append(first, r) will change the value of r. This is not the case. Variables in OCaml are immutable. You can't do anything that will change the value of r.
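A short sketch of what immutability means here, assuming a hypothetical helper prepend and sample values that are not from the question. Functions such as List.append return a new list and leave their arguments untouched, so the result has to be bound to a name or passed on.

```ocaml
let r = [1; 2]

(* List.append returns a new list; r itself is not modified. *)
let r_extended = List.append r [3]

(* A hypothetical helper: it builds a new list with x in front. *)
let prepend x acc = x :: acc
let r_with_zero = prepend 0 r
```

After these definitions r is still [1; 2]; the extended lists live under their own names.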
Update
I think you have too many questions for SO to help you effectively at this point. You might try reading some OCaml tutorials. It will be much faster than asking a question here for every error you see :-)
Nonetheless, here's what "match failure" means. It means that somewhere you have a match that you're applying to an expression, but none of the patterns of the match matches the expression. Your deleteDuplicates code clearly has a pattern coverage error; i.e., it has a pattern that doesn't cover all cases. Your first match only works for empty lists or for lists of 2 or more elements. It doesn't work for lists of 1 element.
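For illustration, here is one way a match can cover all three shapes of list (empty, one element, two or more). This is a sketch of the idea, not the asker's code, and the name dedup is hypothetical; it assumes the input list is sorted, so duplicates are adjacent.

```ocaml
(* Remove adjacent duplicates from a list. *)
let rec dedup = function
  | [] -> []                        (* empty list *)
  | [x] -> [x]                      (* exactly one element *)
  | x :: (y :: _ as rest) ->        (* two or more elements *)
      if x = y then dedup rest else x :: dedup rest
```

Dropping the [x] case would make the compiler warn that the match is not exhaustive, and a one-element list would raise Match_failure at run time.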

About Prolog syntax

Sometimes I see terms like:
X = a:b
or
X = a-b
I can do requests like
X = Y:Z
and the compiler unifies Y with a and Z with b, as expected.
Now my question:
Which characters (or sequence of characters) am I allowed to use to combine two Prolog atoms?!
Maybe you can give me some links with further information about this issue.
Thanks for your help and kind regards from Germany
Which characters (or sequence of characters) am I allowed to use to combine two Prolog atoms?!
What you are asking for here is the entire operator syntax definition of Prolog. For the full answer, please refer to the tag iso-prolog for information on how to obtain the Prolog standard ISO/IEC 13211-1.
But as a short answer to start with:
Prolog syntax consists of
functional notation, like +(a,b), plus
a dynamically redefinable operator syntax, plus
some extra.
It seems you want to know which "characters" can be used as operators.
The short answer is that you can use all atoms Op that succeed for current_op(Pri,Fix,Op). So you can ask dynamically, which operators are present:
?- current_op(Pri, Fix, Op).
Pri = 1, Fix = fx, Op = ($)
; Pri = 1150, Fix = fx, Op = (module_transparent)
; Pri = 700, Fix = xfx, Op = (=@=)
; Pri = 700, Fix = xfx, Op = (@>=)
; Pri = 700, Fix = xfx, Op = (>=)
; ... .
All those operators can be used in the specified manner, as pre-, in-, or postfix with the indicated priorities. Some of these operators are specific to SWI, and some are defined by the standard. Above, only @>= and >= are standard operators.
Most of the operators consist of the graphic characters #$&*+-./:<=>?@^~ only or of letters, digits and underscores starting with a lower case letter. There are two solo characters !; and then there are ,| which are even more special. Operator names that are different to above need quoting - you rarely will encounter them.
To see how operators nest, use write_canonical(Term).
The long answer is that you are also able to define such operators yourself. However, be aware that changing the operator syntax has often many implications that are very difficult to fathom. Even more so, since many systems differ in some rarely used configurations. For example, the system you mentioned, SWI differs in several ways.
I'd suggest to avoid defining new operators until you have learned more about the Prolog language.
Let's see what's inside X = Y:Z
?- display( X = Y:Z ).
=(_G3,:(_G1,_G2))
true.
then we have a nested structure, where functors are operators.
An operator is an atom, and the rule for atom syntax says that we have three kinds to consider:
a sequence of any printable characters enclosed in single quotes
a sequence of special characters only, where a special character is one of `.=:-+*/><#@~? (I hope I have found all of them, from this page you can check if I forgot any!)
a sequence of lowercase/uppercase characters or the underscore, starting with a lowercase character
edit
A functor (shorthand for function constructor, I think, but function is misleading in a Prolog context) is the symbol that 'ties' several arguments together. The number of arguments is called the arity. In Prolog a term is an atomic literal (like a number, or an atom), or a recursive structure, composed of a functor and a number of arguments (at least 1), each being a term itself.
Given the appropriate declaration, i.e. op/3, unary and binary terms can be represented as expressions, like that one you show.
An example of operator, using the : special char, is ':-'
member(X,[X|_]).
member(X,[_|T]) :- member(X, T).
The O.P., said (and I quote):
Sometimes I see terms like: X = a:b or X = a-b
I can do requests like X = Y:Z and the compiler unifies Y with a and Z with b, as expected.
Now my question: Which characters (or sequence of characters) am I allowed
to use to combine two Prolog atoms?!
The short answer is Pretty much whatever you want (provided it is an atom).
The longer answer is this:
What you are seeing are infix (x infix_op b), prefix (pfx_op b) and suffix (b sfx_op ) operators. Any structure with an arity of 2 can be an infix operator. Any structure with an arity of 1 can be a prefix or suffix operator. As a result, any atom may be an operator.
Prolog is parsed via a precedence driven, recursive descent parser (written in Prolog, naturally). Operators are defined with op/3 and enumerated, along with their precedence and associativity, with current_op/3. Associativity has to do with how the parse tree is constructed. An expression like a - b - c could be parsed as ( a - ( b - c ) ) (right-associative), or ( ( a - b ) - c ) (left-associative).
Precedence has to do with how tightly operators bind. An expression like a + b * c binds as ( a + ( b * c ) ) not because of associativity, but because '*'/2 (multiplication) binds more tightly than '+'/2 (addition). In Prolog's numbering, tighter binding means a lower priority number: 400 for * versus 500 for +.
You can add, remove and change operators to your heart's content. Note that this gives you a lot of room to shoot yourself in the foot by breaking Prolog's syntax.
It should be noted, however, that any operator expression can also be written via ordinary notation:
a + b * c
is exactly identical to
'+'( a , '*'(b,c) )
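To make the "operators are just functors" point concrete, here is a small sketch in OCaml rather than Prolog. The type term and the function canonical are hypothetical and only model the idea: an operator expression is an ordinary compound term, and printing it in plain functional notation loosely mimics what write_canonical or display show.

```ocaml
(* A toy model of Prolog terms. *)
type term =
  | Atom of string
  | Var of string
  | Compound of string * term list   (* functor name and arguments *)

(* Print a term in plain functional notation, ignoring operator syntax. *)
let rec canonical = function
  | Atom a -> a
  | Var v -> v
  | Compound (f, args) ->
      f ^ "(" ^ String.concat "," (List.map canonical args) ^ ")"

(* The tree for a + b * c: the product is nested under the sum. *)
let sum = Compound ("+", [Atom "a"; Compound ("*", [Atom "b"; Atom "c"])])

(* The tree for X = Y:Z. *)
let unify = Compound ("=", [Var "X"; Compound (":", [Var "Y"; Var "Z"])])
```

Here canonical sum yields +(a,*(b,c)) and canonical unify yields =(X,:(Y,Z)), matching the shape of the display output shown earlier.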

Functional programming languages with methods, method chaining etc

I've been investigating functional programming, and it occurred to me that there could be a functional language which has (immutable) objects with methods, and which therefore supports method chaining (where chainable methods would return new instances rather than mutating the instance the method is called on and returning it).
This would have readability advantages as...
o.f().g().h()
... is arguably more readable than:
h(g(f(o)))
It would also allow you to associate particular functions with particular types of object, by making them methods of those types (which I understand to be one advantage of object-oriented languages).
Are there any languages which behave like this? Are there any reasons to believe that this would be a bad idea?
(I know that you can program like this in e.g Javascript, but Javascript doesn't enforce immutability.)
Yes. For example, F# has the forward pipe operator (|>), which makes such code very readable:
[1..20]
|> Seq.map(functionFoo)
|> Seq.map(functionBoo)
and so on...
Frege has this, it is known as TDNR (type directed name resolution).
Specifically, if x has type T, and y occurs in the namespace of T, then x.y is the same as (T.y x), which in plain English is y from the namespace T applied to x.
Practical applications of this are: convenient syntax for record field access and access to native (i.e. Java, as Frege is compiled to Java) methods.
Scala sounds like a good fit - it's a hybrid functional / object-oriented language.
You don't need objects for that, just define your own reverse apply infix operator, which most functional languages allow you to do. Currying then does the rest. For example, in OCaml:
let (>>) x f = f x
Demo:
let f x y z = z * (x - y)
let g x = x + 1
let h x y = y * x
5 >> f 6 2 >> g >> h 2 (* = h 2 (g (f 6 2 5)) *)
(Or choose whatever operator name you prefer; others use |> for example.)
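As a further sketch in OCaml: the standard library has provided |> as a built-in reverse-application operator since OCaml 4.01, so no custom definition is needed. OCaml also supports immutable objects whose methods return updated copies, which gives literal method chaining. The class counter and the values below are hypothetical names for this example.

```ocaml
let f x y z = z * (x - y)
let g x = x + 1
let h x y = y * x

(* Same chain as the demo above, using the built-in |> operator. *)
let piped = 5 |> f 6 2 |> g |> h 2

(* A typical list pipeline: double, keep values above 2, then sum. *)
let total =
  [1; 2; 3]
  |> List.map (fun x -> x * 2)
  |> List.filter (fun x -> x > 2)
  |> List.fold_left ( + ) 0

(* An immutable object: incr returns an updated copy via {< ... >}. *)
class counter init = object
  val n = init
  method incr = {< n = n + 1 >}
  method get = n
end

let start = new counter 0
let two = start#incr#incr#get
```

Because incr returns a fresh object, start#get is still 0 after the chained calls.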
