ocaml: Basic syntax for function of several arguments - syntax

I am learning OCaml and I'm a complete beginner at this point. I'm trying to get used to the syntax and I just spent 15 minutes debugging a stupid syntax error.
let foo a b = "bar";;
let biz = foo 2. -1.;;   (* error: parsed as (foo 2.) - 1., not as foo applied to two arguments *)
I was getting the error This expression has type 'a -> string but an expression was expected of type int. I resolved it, but it prompted me to ask what the best way is to handle this syntax peculiarity.
Basically, OCaml treats what I intended as the numeric constant -1. as two separate tokens, - and 1., so I end up passing just one argument to foo. In the other languages I'm familiar with this doesn't happen, because arguments are separated by commas (or, in Scheme, wrapped in parentheses).
What is the usual way to handle this syntax peculiarity in OCaml? Is it surrounding the number with parentheses (foo 2. (-1.)), or is there some other way?

There is a unary minus operator ~-. that can be used to avoid this issue: foo 2. ~-.1. (and its integer counterpart ~-), but it is generally simpler to add parentheses around the problematic expression.
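A small sketch showing both spellings side by side, reusing the foo from the question:
let foo a b = "bar"
let example1 = foo 2. (-1.)   (* parenthesize the negative literal *)
let example2 = foo 2. ~-.1.   (* prefix unary minus binds tighter than function application *)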

Related

What does :^ stand for in Ruby?

Not sure if I am searching the wrong way, but I couldn't find the answer anywhere online...
What does :^ stand for in Ruby? In particular trying to understand the code below:
# this returns the element in array_of_numbers, which occurs an odd number of times
array_of_numbers.reduce(:^)
# this returns 0
[1,2,3].reduce(:^)
# this returns 4
[1,2,3,4].reduce(:^)
I was trying to understand the logic by playing with different arrays, but I think I am missing something. Thanks in advance!
: in front of a name produces a Symbol.
In some contexts, a Symbol can be used as a message to an object. The object that receives the message reacts to it by calling its method that has the same name as the symbol (if such a method exists).
In your examples, this method is Integer#^, which is the bitwise exclusive OR operator.
[1,2,3].reduce(:^) is, more or less, the same as 1 ^ 2 ^ 3.*
Since Ruby is an OOP language, 1 ^ 2 ^ 3 is syntactic sugar for (1.^(2)).^(3).
Read more about the exclusive OR bit operator.
* They produce the same result but the explicit expression should be faster.
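For readers coming from the OCaml question above, here is a rough sketch of the same XOR-fold trick in OCaml (my own illustration, not part of the original answer), using List.fold_left and lxor:
(* pairs of equal numbers cancel out under XOR, leaving the element that occurs an odd number of times *)
let odd_one_out nums = List.fold_left (lxor) 0 nums

let () =
  Printf.printf "%d\n" (odd_one_out [1; 2; 3]);    (* 0 *)
  Printf.printf "%d\n" (odd_one_out [1; 2; 3; 4])  (* 4 *)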

Overriding Ruby's & and | methods doesn't require . operator? [duplicate]

I'm wondering why calls to operator methods don't require a dot? Or rather, why can't normal methods be called without a dot?
Example
class Foo
  def +(object)
    puts "this will work"
  end
  def plus(object)
    puts "this won't"
  end
end
f = Foo.new
f + "anything" # "this will work"
f plus "anything" # NoMethodError: undefined method `plus' for main:Object
The answer to this question, as to pretty much every language design question, is: "Just because". Language design is a series of mostly subjective trade-offs. And for most of those subjective trade-offs, the only correct answer to the question of why something is the way it is, is simply "because Matz said so".
There are certainly other choices:
Lisp doesn't have operators at all. +, -, ::, >, = and so on are simply normal legal function names (variable names, actually), just like foo or bar?
(plus 1 2)
(+ 1 2)
Smalltalk almost doesn't have operators. The only special casing Smalltalk has is that methods whose names consist only of operator characters do not have to end with a colon. In particular, since there are no real operators, all binary messages have the same precedence and are evaluated strictly left-to-right: 2 + 3 * 4 is 20, not 14.
1 plus: 2
1 + 2
Scala almost doesn't have operators. Just like Lisp and Smalltalk, *, -, #::: and so on are simply legal method names. (Actually, they are also legal class, trait, type and field names.) Any method can be called either with or without a dot. If you use the form without the dot and the method takes only a single argument, then you can leave off the brackets as well. Scala does have precedence, though, although it is not user-definable; it is simply determined by the first character of the name. As an added twist, operator method names that end with a colon are inverted or right-associative, i.e. a :: b is equivalent to b.::(a) and not a.::(b).
1.plus(2)
1 plus(2)
1 plus 2
1.+(2)
1 +(2)
1 + 2
In Haskell, any function whose name consists of operator symbols is considered an operator. Any function can be treated as an operator by enclosing it in backticks and any operator can be treated as a function by enclosing it in brackets. In addition, the programmer can freely define associativity, fixity and precedence for user-defined operators.
plus 1 2
1 `plus` 2
(+) 1 2
1 + 2
There is no particular reason why Ruby couldn't support user-defined operators in a style similar to Scala. There is a reason why Ruby can't support arbitrary methods in operator position, simply because
foo plus bar
is already legal, and thus this would be a backwards-incompatible change.
Another thing to consider is that Ruby wasn't actually fully designed in advance. It was designed through its implementation. Which means that in a lot of places, the implementation is leaking through. For example, there is absolutely no logical reason why
puts(!true)
is legal but
puts(not true)
isn't. The only reason why this is so, is because Matz used an LALR(1) parser to parse a non-LALR(1) language. If he had designed the language first, he would have never picked an LALR(1) parser in the first place, and the expression would be legal.
The Refinement feature currently being discussed on ruby-core is another example. The way it is currently specified will make it impossible to optimize method calls and inline methods, even if the program in question doesn't actually use Refinements at all. With just a simple tweak, it can be just as expressive and powerful, and ensure that the pessimization cost is only incurred for scopes that actually use Refinements. Apparently, the sole reason why it was specified this way is that a) it was easier to prototype this way, and b) YARV doesn't have an optimizer, so nobody even bothered to think about the implications (well, nobody except Charles Oliver Nutter).
So, for basically any question you have about Ruby's design, the answer will almost always be either "because Matz said so" or "because in 1993 it was easier to implement that way".
The implementation doesn't have the additional complexity that would be needed to allow generic definition of new operators.
Instead, Ruby has a Yacc parser that uses a statically defined grammar. You get the built-in operators and that's it. Symbols occur in a fixed set of sentences in the grammar. As you have noted, the operators can be overloaded, which is more than most languages offer.
Certainly it's not because Matz was lazy.
Ruby actually has a fiendishly complex grammar that is roughly at the limit of what can be accomplished in Yacc. To get more complex would require using a less portable compiler generator or it would have required writing the parser by hand in C, and doing that would have limited future implementation portability in its own way as well as not providing the world with the Yacc input. That would be a problem because Ruby's Yacc source code is the only Ruby grammar documentation and is therefore "the standard".
Because Ruby has "syntax sugar" that allows for a variety of convenient syntax for preset situations. For example:
class Foo
  def bar=( o ); end
end
# This is actually calling the bar= method with a parameter, not assigning a value
Foo.new.bar = 42
Here's a list of the operator expressions that may be implemented as methods in Ruby.
Because Ruby's syntax was designed to look roughly like popular OO languages, and those use the dot operator to call methods. The language it borrowed its object model from, Smalltalk, didn't use dots for messages, and in fact had a fairly "weird" syntax that many people found off-putting. Ruby has been called "Smalltalk with an Algol syntax," where Algol is the language that gave us the conventions you're talking about here. (Of course, there are actually more differences than just the Algol syntax.)
Omitting parentheses was something of an "advantage" in Ruby 1.8, but with Ruby 1.9 you can't even write method_0 method_1 some_param; it will be rejected, so the language is moving toward the stricter form rather than free-form syntax.

Pythonesque blocks and postfix expressions

In JavaScript,
f = function(x) {
  return x + 1;
}
(5)
seems at a glance as though it should assign f the successor function, but actually assigns the value 6, because the lambda expression followed by parentheses is interpreted by the parser as a postfix expression, specifically a function call. Fortunately this is easy to fix:
f = function(x) {
  return x + 1;
};
(5)
behaves as expected.
If Python allowed a block in a lambda expression, there would be a similar problem:
f = lambda(x):
  return x + 1
(5)
but this time we can't solve it the same way because there are no semicolons. In practice Python avoids the problem by not allowing multiline lambda expressions, but I'm working on a language with indentation-based syntax where I do want multiline lambda and other expressions, so I'm trying to figure out how to avoid having a block parse as the start of a postfix expression. Thus far I'm thinking maybe each level of the recursive descent parser should have a parameter along the lines of 'we have already eaten a block in this statement so don't do postfix'.
Are there any existing languages that encounter this problem, and how do they solve it if so?
Python has semicolons. This is perfectly valid (though ugly and not recommended) Python code: f = lambda(x): x + 1; (5).
There are many other problems with multi-line lambdas in otherwise standard Python syntax, though. It is completely incompatible with how Python handles indentation (whitespace in general, actually) inside expressions - it doesn't, and that's the complete opposite of what you want. You should read the numerous python-ideas threads about multi-line lambdas. It's somewhere between very hard and impossible.
If you want arbitrarily complex compound statements inside lambdas you can't use the existing rules for multi-line expressions even if you made all statements expressions. You'd have to change the indentation handling (see the language reference for how it works right now) so that expressions can also contain blocks. This is hard to do without breaking perfectly fine Python code, and will certainly result in a language many Python programmers will consider worse in several regards: Harder to understand, more complex to implement, permits some stupid errors, etc.
Most languages don't solve this exact problem at all. Most candidates (Scala, Ruby, Lisps, and variants of these three) have explicit end-of-block tokens. I know of two languages that have the same problem, one of which (Haskell) has been mentioned in another answer. CoffeeScript also uses indentation without end-of-block tokens. It parses the transliteration of your example correctly. However, I could not find any specification of how or why it does this (and I won't dig through the parser source code). Both differ significantly from Python in syntax as well as design philosophy, so their solution is of little (if any) use for Python.
In Haskell, there is an implicit semicolon whenever you start a line with the same indentation as a previous one, assuming the parser is in a layout-sensitive mode.
More specifically, after a token is encountered that signals the start of a (layout-sensitive) block, the indentation level of the first token of the first block item is remembered. Each line that is indented more continues the current block item; each line that is indented the same starts a new block item, and the first line that is indented less implies the closure of the block.
How your last example would be treated depends on whether the f = is a block item in some block or not. If it is, then there will be an implicit semicolon between the lambda expression and the (5), since the latter is indented the same as the former. If it is not, then the (5) will be treated as continuing whatever block item the f = is a part of, making it an argument to the lambda function.
The details are a bit messier than this; look at the Haskell 2010 report.

Shorthand logic to prepend a variable in many languages

I'm wondering why the shorthand forms of the assignment operators only work in one direction, that is, by appending the value to the variable.
Ex. (In Javascript):
x += y    x = x + y
x -= y    x = x - y
x *= y    x = x * y
x /= y    x = x / y
x %= y    x = x % y
Frequently I find situations where I need to prepend to the variable instead:
Ex.
x=y+x
Suppose x and y are strings and you are concatenating.
I would like to have a syntax that allows something like:
x=+y
just as i++ or ++i increment a number.
Is there some language that support this?
surely x=y+x is the same as y+=x
I'm puzzled as to why you would learn a new language just to save on 1 character!
However, I would suggest jQuery's .prepend() method
http://api.jquery.com/prepend/
There are languages that allow you to define new operators and/or overload existing operators (see operator overloading).
But operators, and the use of them, should be unambiguous. In your example, x=+y could be interpreted as x=y+x (as you intended), but also as x=(+y) (with + as a unary operation, like the negation in -1). This ambiguity can make a language hard to use, especially when programmers want to make their code short and concise. That's also why some languages don't have syntactic sugar like pre/post increment/decrement operators (e.g. Python).
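As a sketch of the user-defined-operator route, here is how a prepend-assign operator could look in OCaml (the operator name ^= is invented for this example, not a standard library operator):
(* define an infix operator that prepends a string to the contents of a string ref *)
let ( ^= ) r s = r := s ^ !r

let () =
  let x = ref "world" in
  x ^= "hello, ";
  print_endline !x   (* prints "hello, world" *)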

OCaml delimiters and scopes

I'm learning OCaml and, although I have years of experience with imperative programming languages (C, C++, Java), I'm having some trouble with the delimiters between declarations or expressions in OCaml syntax.
Basically, I understood that I have to use ; to sequence expressions, and the value returned by the sequence will be that of the last expression, so for example if I have
exp1; exp2; exp3
it will be considered as an expression that returns the value of exp3. Starting from this I could use
let t = something in exp1; exp2; exp3
and it should be ok, right?
When am I supposed to use the double semicolon ;;? What exactly does it mean?
Are there other delimiters that I must use to avoid syntax errors?
I'll give you an example:
let rec satisfy dtmc state pformula =
  match (state, pformula) with
    (state, `Next sformula) ->
      let s = satisfy_each dtmc sformula
      and adder a state =
        let p = 0.;
        for i = 0 to dtmc.matrix.rows do
          p <- p +. get dtmc.matrix i state.index
        done;
        a +. p
      in
      List.fold_left adder 0. s
  | _ -> []
It gives me a syntax error on | but I don't get why... What am I missing? This is a problem that occurs often, and I have to try many different solutions until it suddenly works :/
A side question: does declaring with let instead of let .. in define a binding that lasts from the point it is defined onward?
What I'm basically asking is: what are the delimiters I have to use, and when do I have to use them? In addition, are there differences I should consider when using the interpreter ocaml instead of the compiler ocamlc?
Thanks in advance!
The ;; delimiter terminates a top-level entity. In the ocaml toplevel (interpreter), it signals to the interpreter that a particular piece of input is finished and should be evaluated.
In programs to be compiled with ocamlc or ocamlopt, you don't need it nearly as often, as consecutive top-level let (without in), module, type, exception, and similar statements automatically signal the beginning of a new "phrase". If you include a top-level expression in a module that is to be evaluated only for its side-effects (such as generating some output or registering a module), you'll need a ;; before it to tell the compiler to stop compiling the previous phrase and start compiling a new thing. Otherwise, if the previous thing is a let, it will assume that the new expression is part of the let. For example:
let msg = "Hello, world";; (* we need ;; here *)
print_endline msg;; (* ;; is optional here, unless we have another expression *)
When you do and don't need ;; is somewhat subtle, so I usually terminate all my module-level entities with it so I don't have to worry about when it is and isn't needed.
; is used to separate sequential "statements" within a single expression. So foo; bar is a single sequential expression composed of foo and bar, while foo;; bar is only valid at the top level of a module and signifies two expressions.
On let without in: that construct is only valid in a module definition and variables so bound will be bound through the end of the module. Often, this is just the end of the file; if you have nested modules, however, its scope can be more limited. It does not work inside another expression or definition such as a function definition, unless it is within a local module definition.
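A small sketch of that scoping rule (the names are mine, purely for illustration):
let outside = 1            (* bound from here to the end of the file *)

module M = struct
  let inside = 2           (* bound to the end of M; afterwards reachable as M.inside *)
end

let total = outside + M.inside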
let p = 0.;
This is the error. The ; needs to be an in. You can use let without in only to define module-level (global) values; you can't use it inside an expression.
A side question: does declaring with let instead of let .. in define a binding that lasts from the point it is defined onward?
You can only ever use one or the other (except in the interactive interpreter where you are allowed to mix expressions and definitions). When defining a global function or value, you need let without in. Inside an expression you need let with in.
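A minimal sketch of the two forms (the names are invented for the example):
(* module-level definition: let without in *)
let pi = 3.14159

(* local binding inside an expression: let ... in *)
let circle_area r =
  let r_squared = r *. r in
  pi *. r_squared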
;; is used to terminate input and start interpreting in the OCaml REPL; it has no special meaning when compiling with ocamlc or ocamlopt.
You cannot assign to an arbitrary value with the <- operator; you have to use the ref type for mutable variables:
let p = ref 0. in
for i = 0 to dtmc.matrix.rows do
  p := !p +. get dtmc.matrix i state.index
done;
a +. !p
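A self-contained sketch of the same ref-and-for pattern, outside the context of the question's code:
(* sum the elements of a float array with a mutable accumulator *)
let sum_array a =
  let total = ref 0. in
  for i = 0 to Array.length a - 1 do
    total := !total +. a.(i)
  done;
  !total

let () = Printf.printf "%f\n" (sum_array [| 1.; 2.; 3. |])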
