Is it necessary to convert infix notation to postfix when creating an expression tree from it? - data-structures

I want to create an expression tree given expression in infix form. Is it necessary to convert the expression to postfix first and then create the tree? I understand that it somehow depends on the problem itself. But assume that it is simple expression of mathematical function with unknowns and operators like: / * ^ + -.

No. If you're going to build an expression tree, then it's not necessary to convert the expression to postfix first. It will be simpler just to build the expression tree as you parse.
I usually write recursive descent parsers for expressions. In that case each recursive call just returns the tree for the subexpression that it parses. If you want to use an iterative shunting-yard-like algorithm, then you can do that too.
Here's a simple recursive descent parser in python that makes a tree with tuples for nodes:
import re
def toTree(infixStr):
# divide string into tokens, and reverse so I can get them in order with pop()
tokens = re.split(r' *([\+\-\*\^/]) *', infixStr)
tokens = [t for t in reversed(tokens) if t!='']
precs = {'+':0 , '-':0, '/':1, '*':1, '^':2}
#convert infix expression tokens to a tree, processing only
#operators above a given precedence
def toTree2(tokens, minprec):
node = tokens.pop()
while len(tokens)>0:
prec = precs[tokens[-1]]
if prec<minprec:
break
op=tokens.pop()
# get the argument on the operator's right
# this will go to the end, or stop at an operator
# with precedence <= prec
arg2 = toTree2(tokens,prec+1)
node = (op, node, arg2)
return node
return toTree2(tokens,0)
print toTree("5+3*4^2+1")
This prints:
('+', ('+', '5', ('*', '3', ('^', '4', '2'))), '1')
Try it here:
https://ideone.com/RyusvI
Note that the above recursive descent style is the result of having written many parsers. Now I pretty much always parse expressions in this way (the recursive part, not the tokenization). It is just about as simple as an expression parser can be, and it makes it easy to handle parentheses, and operators that associate right-to-left like the assignment operator.

Related

What is the exact difference between the expression and the statement in programming [duplicate]

In Python, what is the difference between expressions and statements?
Expressions only contain identifiers, literals and operators, where operators include arithmetic and boolean operators, the function call operator () the subscription operator [] and similar, and can be reduced to some kind of "value", which can be any Python object. Examples:
3 + 5
map(lambda x: x*x, range(10))
[a.x for a in some_iterable]
yield 7
Statements (see 1, 2), on the other hand, are everything that can make up a line (or several lines) of Python code. Note that expressions are statements as well. Examples:
# all the above expressions
print 42
if x: do_y()
return
a = 7
Expression -- from the New Oxford American Dictionary:
expression: Mathematics a collection
of symbols that jointly express a
quantity : the expression for the
circumference of a circle is 2πr.
In gross general terms: Expressions produce at least one value.
In Python, expressions are covered extensively in the Python Language Reference In general, expressions in Python are composed of a syntactically legal combination of Atoms, Primaries and Operators.
Python expressions from Wikipedia
Examples of expressions:
Literals and syntactically correct combinations with Operators and built-in functions or the call of a user-written functions:
>>> 23
23
>>> 23l
23L
>>> range(4)
[0, 1, 2, 3]
>>> 2L*bin(2)
'0b100b10'
>>> def func(a): # Statement, just part of the example...
... return a*a # Statement...
...
>>> func(3)*4
36
>>> func(5) is func(a=5)
True
Statement from Wikipedia:
In computer programming a statement
can be thought of as the smallest
standalone element of an imperative
programming language. A program is
formed by a sequence of one or more
statements. A statement will have
internal components (e.g.,
expressions).
Python statements from Wikipedia
In gross general terms: Statements Do Something and are often composed of expressions (or other statements)
The Python Language Reference covers Simple Statements and Compound Statements extensively.
The distinction of "Statements do something" and "expressions produce a value" distinction can become blurry however:
List Comprehensions are considered "Expressions" but they have looping constructs and therfore also Do Something.
The if is usually a statement, such as if x<0: x=0 but you can also have a conditional expression like x=0 if x<0 else 1 that are expressions. In other languages, like C, this form is called an operator like this x=x<0?0:1;
You can write you own Expressions by writing a function. def func(a): return a*a is an expression when used but made up of statements when defined.
An expression that returns None is a procedure in Python: def proc(): pass Syntactically, you can use proc() as an expression, but that is probably a bug...
Python is a bit more strict than say C is on the differences between an Expression and Statement. In C, any expression is a legal statement. You can have func(x=2); Is that an Expression or Statement? (Answer: Expression used as a Statement with a side-effect.) The assignment statement of x=2 inside of the function call of func(x=2) in Python sets the named argument a to 2 only in the call to func and is more limited than the C example.
Though this isn't related to Python:
An expression evaluates to a value.
A statement does something.
>>> x + 2 # an expression
>>> x = 1 # a statement
>>> y = x + 1 # a statement
>>> print y # a statement (in 2.x)
2
An expression is something that can be reduced to a value, for example "1+3" is an expression, but "foo = 1+3" is not.
It's easy to check:
print(foo = 1+3)
If it doesn't work, it's a statement, if it does, it's an expression.
Another statement could be:
class Foo(Bar): pass
as it cannot be reduced to a value.
Statements represent an action or command e.g print statements, assignment statements.
print 'hello', x = 1
Expression is a combination of variables, operations and values that yields a result value.
5 * 5 # yields 25
Lastly, expression statements
print 5*5
An expression is something, while a statement does something.
An expression is a statement as well, but it must have a return.
>>> 2 * 2         #expression
>>> print(2 * 2)     #statement
PS:The interpreter always prints out the values of all expressions.
An expression is a statement that returns a value. So if it can appear on the right side of an assignment, or as a parameter to a method call, it is an expression.
Some code can be both an expression or a statement, depending on the context. The language may have a means to differentiate between the two when they are ambiguous.
STATEMENT:
A Statement is a action or a command that does something. Ex: If-Else,Loops..etc
val a: Int = 5
If(a>5) print("Hey!") else print("Hi!")
EXPRESSION:
A Expression is a combination of values, operators and literals which yields something.
val a: Int = 5 + 5 #yields 10
Expressions always evaluate to a value, statements don't.
e.g.
variable declaration and assignment are statements because they do not return a value
const list = [1,2,3];
Here we have two operands - a variable 'sum' on the left and an expression on the right.
The whole thing is a statement, but the bit on the right is an expression as that piece of code returns a value.
const sum = list.reduce((a, b)=> a+ b, 0);
Function calls, arithmetic and boolean operations are good examples of expressions.
Expressions are often part of a statement.
The distinction between the two is often required to indicate whether we require a pice of code to return a value.
References
Expressions and statements
2.3 Expressions and statements - thinkpython2 by Allen B. Downey
2.10. Statements and Expressions - How to Think like a Computer Scientist by Paul Resnick and Brad Miller
An expression is a combination of values, variables, and operators. A value all by itself is
considered an expression, and so is a variable, so the following are all legal expressions:
>>> 42
42
>>> n
17
>>> n + 25
42
When you type an expression at the prompt, the interpreter evaluates it, which means that
it finds the value of the expression. In this example, n has the value 17 and n + 25 has the
value 42.
A statement is a unit of code that has an effect, like creating a variable or displaying a
value.
>>> n = 17
>>> print(n)
The first line is an assignment statement that gives a value to n. The second line is a print
statement that displays the value of n.
When you type a statement, the interpreter executes it, which means that it does whatever
the statement says. In general, statements don’t have values.
An expression translates to a value.
A statement consumes a value* to produce a result**.
*That includes an empty value, like: print() or pop().
**This result can be any action that changes something; e.g. changes the memory ( x = 1) or changes something on the screen ( print("x") ).
A few notes:
Since a statement can return a result, it can be part of an expression.
An expression can be part of another expression.
Statements before could change the state of our Python program: create or update variables, define function, etc.
And expressions just return some value can't change the global state or local state in a function.
But now we got :=, it's an alien!
Expressions:
Expressions are formed by combining objects and operators.
An expression has a value, which has a type.
Syntax for a simple expression:<object><operator><object>
2.0 + 3 is an expression which evaluates to 5.0 and has a type float associated with it.
Statements
Statements are composed of expression(s). It can span multiple lines.
A statement contains a keyword.
An expression does not contain a keyword.
print "hello" is statement, because print is a keyword.
"hello" is an expression, but list compression is against this.
The following is an expression statement, and it is true without list comprehension:
(x*2 for x in range(10))
Python calls expressions "expression statements", so the question is perhaps not fully formed.
A statement consists of pretty much anything you can do in Python: calculating a value, assigning a value, deleting a variable, printing a value, returning from a function, raising an exception, etc. The full list is here: http://docs.python.org/reference/simple_stmts.html#
An expression statement is limited to calling functions (e.g.,
math.cos(theta)"), operators ( e.g., "2+3"), etc. to produce a value.

How to keep track of nesting information on iterative (non-recursive) function

Say I have a recursive descent parser that defines a bunch of nested rules.
Expr ← Sum
Sum ← Product (('+' / '-') Product)*
Product ← Value (('*' / '/') Value)*
Value ← [0-9]+ / '(' Expr ')'
Say I am right ● here on the second Value in the process:
Expr ← Sum
Sum ← Product (('+' / '-') Product)*
Product ← Value (('*' / '/') ●)*
Value ← [0-9]+ / '(' Expr ')'
That would mean that I am somewhere in here in a nesting level let's say:
Expr
Sum
|Product
+
Product
|Product
-
Product
|Value
*
Value
|Value
*
●
When parsing with recursive descent, it is recursive so when Value returns, we get back to the "sequence" * parsing node, which then returns to the Product node, which returns to the product sequence node, etc. So it's easy to build up the parsing tree.
But let's say that you want to do this using an iterative stack. The question is, how to keep track of the nesting information so that you can say in your code (eventually):
function handleValue(state, string) {
// ...
}
function handleValueSequence(state, string) {
if (state.startedValueSequenceEarlier) {
wrapItUp(new ValueSequence(state.values))
}
}
function handleProduct(state, string) {
// ...
}
function handleProductSequence(state, string) {
if (state.startedProductSequenceEarlier) {
wrapItUp(new ProductSequence(state.products))
}
}
The tricky part is, this can be arbitrarily nested, so you might have:
Product
Value
Product
Value
Product
...
So if your function like handleProductSequence doesn't have any context other than the function's arguments, I can't tell how it should figure out how to "wrapItUp" and finally create that ProductSequence object. In that state object I added, I am trying to think of ways of adding a state.stack property or something, but I'm not sure what would go in there. Any help would be much appreciated.
Your stack has to contain "where you are" in the control flow. In a recursive descent parser, that will be effectively the same as where you are in the parse, so you could write a generalised LL parser in this fashion. Personally, I'd probably represent a production as an object with a list of tokens and a handler function. (Plus some extension for EBNF operators like *.) A state would then be a production, a position in the production, and a list of already matched values.
But it's hard to see a good reason to do that when LR parser generators already exist, using essentially this representation, and they can handle many more grammars.

Mixing mathematical expression with control flow

I would like to know, how expressions are parsed when are mixed with control flow.
Let's assume such syntax:
case
when a == Method() + 1
then Something(1)
when a == Other() - 2
then 1
else 0
end
We've got here two conditional expressions, Method() + 1, Something(1) and 0. Each can be translated to postfix by Shunting-yard algorithm and then easily translated into AST. But is it possible to extend this algorithm to handle control - flow also? Or are there other approaches to solve such mixing of expressions and control flows?
another example:
a == b ? 1 : 2
also how can I classify such expression: a between b and c, can I say that between is three arguments function? Or is there any special name for such expressions?
You can certainly parse the ternary operator with an operator-precedence grammar. In
expr ? expr : expr
the binary "operator" here is ? expr :, which conveniently starts and ends with an operator token (albeit different ones). To adapt shunting yard to that, assign the right-precedence of ? and the left-precedence of : to the precedence of the ?: operator. The left-precedence of ? and the right-precedence of : are ±∞, just like parentheses (which, in effect, they are).
Since the case statement is basically repeated application of the ternary operator, using slightly different spellings for the tokens, and yields to a similar solution. (Here case when and end are purely parenthetic, while then and the remaining whens correspond to ? and :.)
Having said that, it really is simpler to use an LALR(1) parser generator, and there is almost certainly one available for whatever language you are writing in.
It's clear that both the ternary operator and OP's case statement are operator grammars:
Ternary operator:
ternary-expr: non-ternary-expr
| non-ternary-expr '?' expr ':' ternary-expr
Normally, the ternary operator will be lower precedence from any other operator and associate to the right, which is how the above is written. In C and other languages ternary expressions have the same precedence as assignment expressions, which is straightforward to add. That results in the relationships
X ·> ?
? <· X
? ·=· :
X ·> :
: <· X
Case statement (one of many possible formulations):
case_statement: 'case' case_body 'else' expr 'end'
case_body: 'when' expr 'then' expr
| case_body 'when' expr 'then' expr
Here are the precedence relationships for the above grammar:
case <· when
case ·=· else
when <· X (see below)
when ·=· then
then ·> when
then ·> else
else <· X
else ·=· end
X ·> then
X ·> when
X ·> end
X in the above relations refers to any binary or unary operator, any value terminal, ( and ).
It's straightforward to find left- and right-precedence functions for all of those terminals; the pattern will be similar to that of parentheses in a standard algebraic grammar.
The Shunting-yard algorithm is for expressions with unary and binary operators. You need something more powerful such as LL(1) or LALR(1) to parse control flow statements, and once you have that it will also handle expressions as well. No need for the Shunting-yard algorithm at all.

What does a Ruby case expression mean without an expression to inspect?

Take this example Ruby expression:
case
when 3 then "foo"
when 4 then "bar"
end
I was surprised to learn that this is not a syntax error. Instead, it evaluates to "foo"!
Why? What are the syntax and evaluation rules being applied here?
In this form of the case expression, the then clause associated with the lexically first when clause that evaluates to a truthy value is evaluated.
See clause b) 2) of §11.5.2.2.4 Semantics of the ISO Ruby Language Specification (bold emphasis mine):
Semantics
A case-expression is evaluated as follows:
a) […]
b. The meaning of the phrase “O is matching” in Step c) is defined as follows:
[…]
If the case-expression is a case-expression-without-expression, O is matching if and only if O is a trueish object.
c) Take the following steps:
Search the when-clauses in the order they appear in the program text for a matching when-clause as follows:
i) If the operator-expression-list of the when-argument is present:
I) For each of its operator-expressions, evaluate it and test if the resulting value is matching.
II) If a matching value is found, other operator-expressions, if any, are not evaluated.
ii) If no matching value is found, and the splatting-argument (see 11.3.2) is present:
I) Construct a list of values from it as described in 11.3.2. For each element of the resulting list, in the same order in the list, test if it is matching.
II) If a matching value is found, other values, if any, are not evaluated.
iii) A when-clause is considered to be matching if and only if a matching value is found in its when-argument. Later when-clauses, if any, are not tested in this case.
If one of the when-clauses is matching, evaluate the compound-statement of the then-clause of this when-clause. The value of the case-expression is the resulting value.
If none of the when-clauses is matching, and if there is an else-clause, then evaluate the compound-statement of the else-clause. The value of the case-expression is the resulting value.
Otherwise, the value of the case-expression is nil.
The RDoc documentation, while much less precise, also states that truthiness is the selection criteria, when the condition is omitted; and lexical ordering determines the order in which when clauses are checked (bold emphasis mine):
case
The case statement operator. Case statements consist of an optional condition, which is in the position of an argument to case, and zero or more when clauses. The first when clause to match the condition (or to evaluate to Boolean truth, if the condition is null) "wins", and its code stanza is executed. The value of the case statement is the value of the successful when clause, or nil if there is no such clause.
It is by design that case statement without a value to match against behaves as an if statement.
It is actually the same as writing:
if 3
'foo'
elsif 4
'bar'
end

What are the precise rules for when you can omit parenthesis, dots, braces, = (functions), etc.?

What are the precise rules for when you can omit (omit) parentheses, dots, braces, = (functions), etc.?
For example,
(service.findAllPresentations.get.first.votes.size) must be equalTo(2).
service is my object
def findAllPresentations: Option[List[Presentation]]
votes returns List[Vote]
must and be are both functions of specs
Why can't I go:
(service findAllPresentations get first votes size) must be equalTo(2)
?
The compiler error is:
"RestServicesSpecTest.this.service.findAllPresentations
of type
Option[List[com.sharca.Presentation]]
does not take parameters"
Why does it think I'm trying to pass in a parameter? Why must I use dots for every method call?
Why must (service.findAllPresentations get first votes size) be equalTo(2) result in:
"not found: value first"
Yet, the "must be equalTo 2" of
(service.findAllPresentations.get.first.votes.size) must be equalTo 2, that is, method chaining works fine? - object chain chain chain param.
I've looked through the Scala book and website and can't really find a comprehensive explanation.
Is it in fact, as Rob H explains in Stack Overflow question Which characters can I omit in Scala?, that the only valid use-case for omitting the '.' is for "operand operator operand" style operations, and not for method chaining?
You seem to have stumbled upon the answer. Anyway, I'll try to make it clear.
You can omit dot when using the prefix, infix and postfix notations -- the so called operator notation. While using the operator notation, and only then, you can omit the parenthesis if there is less than two parameters passed to the method.
Now, the operator notation is a notation for method-call, which means it can't be used in the absence of the object which is being called.
I'll briefly detail the notations.
Prefix:
Only ~, !, + and - can be used in prefix notation. This is the notation you are using when you write !flag or val liability = -debt.
Infix:
That's the notation where the method appears between an object and it's parameters. The arithmetic operators all fit here.
Postfix (also suffix):
That notation is used when the method follows an object and receives no parameters. For example, you can write list tail, and that's postfix notation.
You can chain infix notation calls without problem, as long as no method is curried. For example, I like to use the following style:
(list
filter (...)
map (...)
mkString ", "
)
That's the same thing as:
list filter (...) map (...) mkString ", "
Now, why am I using parenthesis here, if filter and map take a single parameter? It's because I'm passing anonymous functions to them. I can't mix anonymous functions definitions with infix style because I need a boundary for the end of my anonymous function. Also, the parameter definition of the anonymous function might be interpreted as the last parameter to the infix method.
You can use infix with multiple parameters:
string substring (start, end) map (_ toInt) mkString ("<", ", ", ">")
Curried functions are hard to use with infix notation. The folding functions are a clear example of that:
(0 /: list) ((cnt, string) => cnt + string.size)
(list foldLeft 0) ((cnt, string) => cnt + string.size)
You need to use parenthesis outside the infix call. I'm not sure the exact rules at play here.
Now, let's talk about postfix. Postfix can be hard to use, because it can never be used anywhere except the end of an expression. For example, you can't do the following:
list tail map (...)
Because tail does not appear at the end of the expression. You can't do this either:
list tail length
You could use infix notation by using parenthesis to mark end of expressions:
(list tail) map (...)
(list tail) length
Note that postfix notation is discouraged because it may be unsafe.
I hope this has cleared all the doubts. If not, just drop a comment and I'll see what I can do to improve it.
Class definitions:
val or var can be omitted from class parameters which will make the parameter private.
Adding var or val will cause it to be public (that is, method accessors and mutators are generated).
{} can be omitted if the class has no body, that is,
class EmptyClass
Class instantiation:
Generic parameters can be omitted if they can be inferred by the compiler. However note, if your types don't match, then the type parameter is always infered so that it matches. So without specifying the type, you may not get what you expect - that is, given
class D[T](val x:T, val y:T);
This will give you a type error (Int found, expected String)
var zz = new D[String]("Hi1", 1) // type error
Whereas this works fine:
var z = new D("Hi1", 1)
== D{def x: Any; def y: Any}
Because the type parameter, T, is inferred as the least common supertype of the two - Any.
Function definitions:
= can be dropped if the function returns Unit (nothing).
{} for the function body can be dropped if the function is a single statement, but only if the statement returns a value (you need the = sign), that is,
def returnAString = "Hi!"
but this doesn't work:
def returnAString "Hi!" // Compile error - '=' expected but string literal found."
The return type of the function can be omitted if it can be inferred (a recursive method must have its return type specified).
() can be dropped if the function doesn't take any arguments, that is,
def endOfString {
return "myDog".substring(2,1)
}
which by convention is reserved for methods which have no side effects - more on that later.
() isn't actually dropped per se when defining a pass by name paramenter, but it is actually a quite semantically different notation, that is,
def myOp(passByNameString: => String)
Says myOp takes a pass-by-name parameter, which results in a String (that is, it can be a code block which returns a string) as opposed to function parameters,
def myOp(functionParam: () => String)
which says myOp takes a function which has zero parameters and returns a String.
(Mind you, pass-by-name parameters get compiled into functions; it just makes the syntax nicer.)
() can be dropped in the function parameter definition if the function only takes one argument, for example:
def myOp2(passByNameString:(Int) => String) { .. } // - You can drop the ()
def myOp2(passByNameString:Int => String) { .. }
But if it takes more than one argument, you must include the ():
def myOp2(passByNameString:(Int, String) => String) { .. }
Statements:
. can be dropped to use operator notation, which can only be used for infix operators (operators of methods that take arguments). See Daniel's answer for more information.
. can also be dropped for postfix functions
list tail
() can be dropped for postfix operators
list.tail
() cannot be used with methods defined as:
def aMethod = "hi!" // Missing () on method definition
aMethod // Works
aMethod() // Compile error when calling method
Because this notation is reserved by convention for methods that have no side effects, like List#tail (that is, the invocation of a function with no side effects means that the function has no observable effect, except for its return value).
() can be dropped for operator notation when passing in a single argument
() may be required to use postfix operators which aren't at the end of a statement
() may be required to designate nested statements, ends of anonymous functions or for operators which take more than one parameter
When calling a function which takes a function, you cannot omit the () from the inner function definition, for example:
def myOp3(paramFunc0:() => String) {
println(paramFunc0)
}
myOp3(() => "myop3") // Works
myOp3(=> "myop3") // Doesn't work
When calling a function that takes a by-name parameter, you cannot specify the argument as a parameter-less anonymous function. For example, given:
def myOp2(passByNameString:Int => String) {
println(passByNameString)
}
You must call it as:
myOp("myop3")
or
myOp({
val source = sourceProvider.source
val p = myObject.findNameFromSource(source)
p
})
but not:
myOp(() => "myop3") // Doesn't work
IMO, overuse of dropping return types can be harmful for code to be re-used. Just look at specification for a good example of reduced readability due to lack of explicit information in the code. The number of levels of indirection to actually figure out what the type of a variable is can be nuts. Hopefully better tools can avert this problem and keep our code concise.
(OK, in the quest to compile a more complete, concise answer (if I've missed anything, or gotten something wrong/inaccurate please comment), I have added to the beginning of the answer. Please note this isn't a language specification, so I'm not trying to make it exactly academically correct - just more like a reference card.)
A collection of quotes giving insight into the various conditions...
Personally, I thought there'd be more in the specification. I'm sure there must be, I'm just not searching for the right words...
There are a couple of sources however, and I've collected them together, but nothing really complete / comprehensive / understandable / that explains the above problems to me...:
"If a method body has more than one
expression, you must surround it with
curly braces {…}. You can omit the
braces if the method body has just one
expression."
From chapter 2, "Type Less, Do More", of Programming Scala:
"The body of the upper method comes
after the equals sign ‘=’. Why an
equals sign? Why not just curly braces
{…}, like in Java? Because semicolons,
function return types, method
arguments lists, and even the curly
braces are sometimes omitted, using an
equals sign prevents several possible
parsing ambiguities. Using an equals
sign also reminds us that even
functions are values in Scala, which
is consistent with Scala’s support of
functional programming, described in
more detail in Chapter 8, Functional
Programming in Scala."
From chapter 1, "Zero to Sixty: Introducing Scala", of Programming Scala:
"A function with no parameters can be
declared without parentheses, in which
case it must be called with no
parentheses. This provides support for
the Uniform Access Principle, such
that the caller does not know if the
symbol is a variable or a function
with no parameters.
The function body is preceded by "="
if it returns a value (i.e. the return
type is something other than Unit),
but the return type and the "=" can be
omitted when the type is Unit (i.e. it
looks like a procedure as opposed to a
function).
Braces around the body are not
required (if the body is a single
expression); more precisely, the body
of a function is just an expression,
and any expression with multiple parts
must be enclosed in braces (an
expression with one part may
optionally be enclosed in braces)."
"Functions with zero or one argument
can be called without the dot and
parentheses. But any expression can
have parentheses around it, so you can
omit the dot and still use
parentheses.
And since you can use braces anywhere
you can use parentheses, you can omit
the dot and put in braces, which can
contain multiple statements.
Functions with no arguments can be
called without the parentheses. For
example, the length() function on
String can be invoked as "abc".length
rather than "abc".length(). If the
function is a Scala function defined
without parentheses, then the function
must be called without parentheses.
By convention, functions with no
arguments that have side effects, such
as println, are called with
parentheses; those without side
effects are called without
parentheses."
From blog post Scala Syntax Primer:
"A procedure definition is a function
definition where the result type and
the equals sign are omitted; its
defining expression must be a block.
E.g., def f (ps) {stats} is
equivalent to def f (ps): Unit =
{stats}.
Example 4.6.3 Here is a declaration
and a de?nition of a procedure named
write:
trait Writer {
def write(str: String)
}
object Terminal extends Writer {
def write(str: String) { System.out.println(str) }
}
The code above is implicitly completed
to the following code:
trait Writer {
def write(str: String): Unit
}
object Terminal extends Writer {
def write(str: String): Unit = { System.out.println(str) }
}"
From the language specification:
"With methods which only take a single
parameter, Scala allows the developer
to replace the . with a space and omit
the parentheses, enabling the operator
syntax shown in our insertion operator
example. This syntax is used in other
places in the Scala API, such as
constructing Range instances:
val firstTen:Range = 0 to 9
Here again, to(Int) is a vanilla
method declared inside a class
(there’s actually some more implicit
type conversions here, but you get the
drift)."
From Scala for Java Refugees Part 6: Getting Over Java:
"Now, when you try "m 0", Scala
discards it being a unary operator, on
the grounds of not being a valid one
(~, !, - and +). It finds that "m" is
a valid object -- it is a function,
not a method, and all functions are
objects.
As "0" is not a valid Scala
identifier, it cannot be neither an
infix nor a postfix operator.
Therefore, Scala complains that it
expected ";" -- which would separate
two (almost) valid expressions: "m"
and "0". If you inserted it, then it
would complain that m requires either
an argument, or, failing that, a "_"
to turn it into a partially applied
function."
"I believe the operator syntax style
works only when you've got an explicit
object on the left-hand side. The
syntax is intended to let you express
"operand operator operand" style
operations in a natural way."
Which characters can I omit in Scala?
But what also confuses me is this quote:
"There needs to be an object to
receive a method call. For instance,
you cannot do “println “Hello World!”"
as the println needs an object
recipient. You can do “Console
println “Hello World!”" which
satisfies the need."
Because as far as I can see, there is an object to receive the call...
I find it easier to follow this rule of thumb: in expressions spaces alternate between methods and parameters. In your example, (service.findAllPresentations.get.first.votes.size) must be equalTo(2) parses as (service.findAllPresentations.get.first.votes.size).must(be)(equalTo(2)). Note that the parentheses around the 2 have a higher associativity than the spaces. Dots also have higher associativity, so (service.findAllPresentations.get.first.votes.size) must be.equalTo(2)would parse as (service.findAllPresentations.get.first.votes.size).must(be.equalTo(2)).
service findAllPresentations get first votes size must be equalTo 2 parses as service.findAllPresentations(get).first(votes).size(must).be(equalTo).2.
Actually, on second reading, maybe this is the key:
With methods which only take a single
parameter, Scala allows the developer
to replace the . with a space and omit
the parentheses
As mentioned on the blog post: http://www.codecommit.com/blog/scala/scala-for-java-refugees-part-6 .
So perhaps this is actually a very strict "syntax sugar" which only works where you are effectively calling a method, on an object, which takes one parameter. e.g.
1 + 2
1.+(2)
And nothing else.
This would explain my examples in the question.
But as I said, if someone could point out to be exactly where in the language spec this is specified, would be great appreciated.
Ok, some nice fellow (paulp_ from #scala) has pointed out where in the language spec this information is:
6.12.3:
Precedence and associativity of
operators determine the grouping of
parts of an expression as follows.
If there are several infix operations in an expression, then
operators with higher precedence bind
more closely than operators with lower
precedence.
If there are consecutive infix operations e0 op1 e1 op2 . . .opn en
with operators op1, . . . , opn of the
same precedence, then all these
operators must have the same
associativity. If all operators are
left-associative, the sequence is
interpreted as (. . . (e0 op1 e1) op2
. . .) opn en. Otherwise, if all
operators are rightassociative, the
sequence is interpreted as e0 op1 (e1
op2 (. . .opn en) . . .).
Postfix operators always have lower precedence than infix operators. E.g.
e1 op1 e2 op2 is always equivalent to
(e1 op1 e2) op2.
The right-hand operand of a
left-associative operator may consist
of several arguments enclosed in
parentheses, e.g. e op (e1, . . .
,en). This expression is then
interpreted as e.op(e1, . . . ,en).
A left-associative binary operation e1
op e2 is interpreted as e1.op(e2). If
op is rightassociative, the same
operation is interpreted as { val
x=e1; e2.op(x ) }, where x is a fresh
name.
Hmm - to me it doesn't mesh with what I'm seeing or I just don't understand it ;)
There aren't any. You will likely receive advice around whether or not the function has side-effects. This is bogus. The correction is to not use side-effects to the reasonable extent permitted by Scala. To the extent that it cannot, then all bets are off. All bets. Using parentheses is an element of the set "all" and is superfluous. It does not provide any value once all bets are off.
This advice is essentially an attempt at an effect system that fails (not to be confused with: is less useful than other effect systems).
Try not to side-effect. After that, accept that all bets are off. Hiding behind a de facto syntactic notation for an effect system can and does, only cause harm.

Resources