Is incorrect terminology used in the compile error message about 'for' syntax in Go? - go

I tried to use something like
for i := 0; i < len(bytes); ++i {
...
}
This is not valid Go, and I got the error
syntax error: unexpected ++, expecting expression
I thought this was because ++i is not an expression.
Then I found out that, according to the documentation, i++ (which does work in a for loop) is not an expression either.
I have also seen that in some cases (I now think in all cases) a statement cannot be used in place of an expression.
Coming back to the error: it says the for loop requires an expression, which confused me. I checked another part of the documentation, and it turns out that for requires a statement there.
For statements with for clause
A "for" statement with a ForClause is also controlled by its
condition, but additionally it may specify an init and a post
statement
I started with this question (which I liked more than the final one, because I thought it was about my unfamiliarity with the language):
Is it a special case of the for loop syntax that statements are accepted as expressions, or are there other such cases in Go?
While writing the question and checking the documentation, I ended up with these questions:
Is the terminology in the error message incorrect, and should it be fixed to avoid confusion? Or is it normal in some cases to use the terms statement and expression interchangeably?

The Go Programming Language Specification
Primary expressions
Primary expressions are the operands for unary and binary expressions.
PrimaryExpr =
Operand |
Conversion |
PrimaryExpr Selector |
PrimaryExpr Index |
PrimaryExpr Slice |
PrimaryExpr TypeAssertion |
PrimaryExpr Arguments .
Selector = "." identifier .
Index = "[" Expression "]" .
Slice = "[" [ Expression ] ":" [ Expression ] "]" |
"[" [ Expression ] ":" Expression ":" Expression "]" .
TypeAssertion = "." "(" Type ")" .
Arguments = "(" [ ( ExpressionList | Type [ "," ExpressionList ] ) [ "..." ] [ "," ] ] ")" .
Operators and punctuation
The following character sequences represent operators:
++
--
Operators
Operators combine operands into expressions.
Expression = UnaryExpr | Expression binary_op Expression .
UnaryExpr = PrimaryExpr | unary_op UnaryExpr .
binary_op = "||" | "&&" | rel_op | add_op | mul_op .
rel_op = "==" | "!=" | "<" | "<=" | ">" | ">=" .
add_op = "+" | "-" | "|" | "^" .
mul_op = "*" | "/" | "%" | "<<" | ">>" | "&" | "&^" .
unary_op = "+" | "-" | "!" | "^" | "*" | "&" | "<-" .
Operator precedence
The ++ and -- operators form statements, not expressions.
IncDec statements
The "++" and "--" statements increment or decrement their operands by
the untyped constant 1. As with an assignment, the operand must be
addressable or a map index expression.
IncDecStmt = Expression ( "++" | "--" ) .
++ and -- are operators. The ++ and -- operators form statements, not expressions.
IncDecStmt = Expression ( "++" | "--" ) .
When the compiler encounters a ++ operator, it expects it to be immediately preceded by an expression.
For example,
package main
func main() {
// syntax error: unexpected ++, expecting expression
for i := 0; i < 1; ++i {}
}
Playground: https://play.golang.org/p/y2d9ijeMdw
Output:
main.go:6:21: syntax error: unexpected ++, expecting expression
The compiler complains about the syntax. It found a ++ operator without an immediately preceding expression: syntax error: unexpected ++, expecting expression.
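The same distinction can be checked without the compiler, as a rough sketch using the standard go/parser package. The helper name parseErr and the wrapper function f are made up for illustration, and go/parser words its error messages differently from the compiler, so the sketch only reports whether each form parses.

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
)

// parseErr wraps a statement in a minimal function body and returns the
// syntax error reported by go/parser, or nil if the source parses.
// (Hypothetical helper, not part of any library.)
func parseErr(stmt string) error {
	src := "package p\nfunc f() {\n\tvar i int\n\t" + stmt + "\n}\n"
	_, err := parser.ParseFile(token.NewFileSet(), "example.go", src, 0)
	return err
}

func main() {
	for _, stmt := range []string{
		"for i = 0; i < 1; i++ {}", // postfix IncDec statement as the post statement
		"for i = 0; i < 1; ++i {}", // prefix form, which the grammar does not allow
		"j := i++",                 // IncDec used where an expression is required
	} {
		fmt.Printf("%-28s syntax ok: %v\n", stmt, parseErr(stmt) == nil)
	}
}
```

Only the first form is accepted: i++ is a statement that starts with an expression, while ++i starts with an operator and i++ cannot appear on the right-hand side of an assignment.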

The Go spec says the post statement of a for clause accepts (among other things) an IncDec statement.
The IncDec statement is defined as: IncDecStmt = Expression ( "++" | "--" ) .
The parser finds an IncDec statement with an empty expression and thus reports the error "expecting expression".
Edit: this probably fails because the fallback node to parse for a SimpleStmt is an expression. Parsing the IncDecStmt failed, so the parser moved on to the default. The error accurately reflects the last error that bubbled up.
While the error message is correct, it is a little misleading. However, fixing it would involve passing more context about the current tree being parsed, e.g.: bad ForClause: bad PostStmt: bad SimpleStmt: expected expression.
There is still the problem that "expected expression" is the last error encountered. Before that, the parser failed to parse the IncDecStmt, but that error is swallowed because it falls back on an expression. The same applies at higher levels of the tree.
Even without that problem, it would be rather heavy-handed and probably even more confusing than the current error messages. You may want to ask for input from the Go folks, though.
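To make the idea concrete, here is a toy model of a simple-statement parser that reads an expression first and only then looks for ++ or --. It is not the Go compiler's code: the functions isIdent, parseExpr and parseSimpleStmt, the token representation, and the restriction of expressions to a single identifier are all assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// isIdent reports whether tok consists only of ASCII letters and underscores.
func isIdent(tok string) bool {
	if tok == "" {
		return false
	}
	for _, r := range tok {
		if r != '_' && (r < 'a' || r > 'z') && (r < 'A' || r > 'Z') {
			return false
		}
	}
	return true
}

// parseExpr is a stand-in for an expression parser: it accepts exactly one
// identifier and returns the remaining tokens.
func parseExpr(toks []string) ([]string, error) {
	if len(toks) == 0 {
		return toks, errors.New("syntax error: unexpected EOF, expecting expression")
	}
	if !isIdent(toks[0]) {
		return toks, fmt.Errorf("syntax error: unexpected %s, expecting expression", toks[0])
	}
	return toks[1:], nil
}

// parseSimpleStmt parses the leading expression, then decides which kind of
// statement it is from the token that follows.
func parseSimpleStmt(toks []string) (string, error) {
	rest, err := parseExpr(toks)
	if err != nil {
		return "", err // the only error the caller ever sees
	}
	switch {
	case len(rest) == 0:
		return "ExpressionStmt", nil
	case len(rest) == 1 && (rest[0] == "++" || rest[0] == "--"):
		return "IncDecStmt", nil
	default:
		return "", fmt.Errorf("syntax error: unexpected %s", rest[0])
	}
}

func main() {
	for _, toks := range [][]string{{"i", "++"}, {"++", "i"}} {
		kind, err := parseSimpleStmt(toks)
		fmt.Println(toks, kind, err)
	}
}
```

In this model the input ++ i never reaches the IncDec check at all: the expression parser rejects the leading ++, so the message talks about an expression even though the surrounding construct is a statement.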

Related

Gocc to ignore things in lexical parser

Is there a way to tell gocc to ignore things in the lexical part? E.g., for
2022-01-18 11:33:21.9885 [21] These are strings that I need to ignore, until - MYKW - Start Active One: 1/18/2022 11:33:21 AM
I want to tell gocc to ignore everything from [21] all the way to until. Here is what I've been trying:
/* Lexical part */
_letter : 'A'-'Z' | 'a'-'z' | '_' ;
_digit : '0'-'9' ;
_timestamp1 : _digit | ' ' | ':' | '-' | '.' ;
_timestamp2 : _digit | ' ' | ':' | '/' | 'A' | 'P' | 'M' ;
_ignore : '[' { . } ' ' '-' ' ' 'M' 'Y' 'K' 'W' ' ' '-' ' ' ;
_lineend : [ '\r' ] '\n' ;
timestamp : _timestamp1 { _timestamp1 } _ignore ;
taskLogStart : 'S' 't' 'a' 'r' 't' ' ' ;
jobName : { . } _timestamp2 { _timestamp2 } _lineend ;
/* Syntax part */
Log
: timestamp taskLogStart jobName ;
However, the parser failed at:
error: expected timestamp; got: unknown/invalid token "2022-01-18 11:33:21.9885 [21] T"
The reason I think it should work is that the following ignore rules work perfectly fine for white space:
!lineComment : '/' '/' { . } '\n' ;
!blockComment : '/' '*' { . | '*' } '*' '/' ;
and I'm just applying the same approach to my normal text parsing.
It doesn't work that way.
The EBNF looks very much like a regular expression, but it does not work like one at all. What I mean is this. The line
2022-01-18 11:33:21.9885 [21] These are strings that I need to ignore, until - MYKW - Start Active One: 1/18/2022 11:33:21 AM
can be matched with a regular expression as simply as:
([0-9.: -]+).*? - MYKW - Start ([^:]+):.*$
However, that cannot be translated directly into an EBNF definition, because the regular expression relies on the context between its elements to pinpoint a match (e.g., the .*? is a local rule that only works based on the context it is in), whereas gocc generates an LR parser, which works from a context-free grammar.
Basically, a context-free grammar means that each time, the .* match is attempted against all existing lexical symbols (i.e., each lexical symbol can be considered a global rule that is not affected by the context it is in). I cannot quite describe it, but neither the previous context nor the symbol that follows is involved in the next match. That is why the OP's attempt fails.
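For comparison, the regular-expression route can be sketched directly in Go with the standard regexp package, which supports the lazy .*? used above. The function name extract and the variable logLine are made up for illustration, and the first group is trimmed because the character class also captures the space before [21].

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// logLine is the pattern from the text above, unchanged.
var logLine = regexp.MustCompile(`([0-9.: -]+).*? - MYKW - Start ([^:]+):.*$`)

// extract returns the leading timestamp and the job name of one log line.
// (Hypothetical helper for illustration.)
func extract(line string) (timestamp, job string, ok bool) {
	m := logLine.FindStringSubmatch(line)
	if m == nil {
		return "", "", false
	}
	// Group 1 ends with the space that precedes "[21]", so trim it.
	return strings.TrimSpace(m[1]), m[2], true
}

func main() {
	line := "2022-01-18 11:33:21.9885 [21] These are strings that I need to ignore, until - MYKW - Start Active One: 1/18/2022 11:33:21 AM"
	ts, job, ok := extract(line)
	fmt.Printf("timestamp=%q job=%q ok=%v\n", ts, job, ok)
}
```

The lazy .*? stops as soon as the literal " - MYKW - Start " can follow it, which is exactly the kind of context-dependent matching that a token definition such as '[' { . } ... does not give you.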
For a real example of how '{.}' can be used, see
How to describe this event log in formal BNF?

Need an example for how Go syntax for assignment operator uses the grammar rules specified using EBNF

As mentioned in the docs, syntax in Go is specified using Extended Backus-Naur Form (EBNF):
Production = production_name "=" [ Expression ] "." .
Expression = Alternative { "|" Alternative } .
Alternative = Term { Term } .
Term = production_name | token [ "…" token ] | Group | Option | Repetition .
Group = "(" Expression ")" .
Option = "[" Expression "]" .
Repetition = "{" Expression "}" .
I am trying to understand how Go's syntax grammar is defined, and how to break down/derive/understand i++ and i+=1 using these grammar rules. How would these production rules be substituted step by step for the purpose of illustration?
The statement i++ (it is a statement, not an expression) uses the grammar rule for IncDec statements:
IncDecStmt = Expression ( "++" | "--" ) .
Here, production_name is IncDecStmt. Its right-hand side consists of two Terms: the production_name Expression and the Group ( "++" | "--" ), whose alternatives are the tokens "++" and "--".
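As for i+=1, the spec's Assignment production applies: Assignment = ExpressionList assign_op ExpressionList, with assign_op = [ add_op | mul_op ] "=", so "+=" is the add_op "+" followed by "=". One way to see which production each form ends up in is to ask the standard go/parser package which AST node it builds. This is only a sketch: the helper describe, the wrapper function f and the output labels are made up for illustration.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// describe parses one statement inside a minimal function body and names
// the AST node that go/parser built for it. (Hypothetical helper.)
func describe(stmt string) (string, error) {
	src := "package p\nfunc f() {\n\tvar i int\n\t" + stmt + "\n}\n"
	file, err := parser.ParseFile(token.NewFileSet(), "example.go", src, 0)
	if err != nil {
		return "", err
	}
	// The statement under test is the last one in the body of f.
	body := file.Decls[0].(*ast.FuncDecl).Body.List
	switch s := body[len(body)-1].(type) {
	case *ast.IncDecStmt:
		return fmt.Sprintf("IncDecStmt(%s)", s.Tok), nil
	case *ast.AssignStmt:
		return fmt.Sprintf("Assignment(%s)", s.Tok), nil
	default:
		return fmt.Sprintf("%T", s), nil
	}
}

func main() {
	for _, stmt := range []string{"i++", "i += 1"} {
		kind, err := describe(stmt)
		fmt.Println(stmt, "=>", kind, err)
	}
}
```

Both forms are SimpleStmt alternatives in the spec, but they are derived through different productions: i++ through IncDecStmt and i += 1 through Assignment.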

ANTLR mismatched token in simple grammar

I am currently debugging my grammar in ANTLRworks, and reduced it far more than is reasonable to this:
grammar DebugInternalGrammar;
RULE_STRING :
'"' (
('\\' .) |
(~ (
'\\' |
'"'
))
)* '"'
;
When tested in the interpreter against the string
"L"
just yields
MismatchedTokenException(76!=34)
Matching "" does work. Also, reducing the grammar to:
grammar DebugInternalGrammar;
RULE_STRING :
'"' (
(~ (
'\\' |
'"'
))
)* '"'
;
matches "L" (I assume this is what it means when the parse tree in ANTLRworks shows <epsilon> as leaf).
What is wrong here? This is not the part of the grammar which caused me trouble before, so I am scratching my head as to what the problem could be and what ANTLRworks is trying to tell me.

Which line continuations are valid and which ones are invalid in shell scripting for POSIX shell?

In the following example, although I have split the line if true && true into two lines, it works fine and produces the output hi.
if true &&
true
then
echo hi
fi
But in the following example, where the redirection operator and the filename have been split across two lines, I get an error.
wc -l <
/var/log/messages
The error I get is,
foo.sh: line 1: syntax error near unexpected token `newline'
foo.sh: line 1: `wc -l <'
Is there a rule defined by POSIX that I can use to easily understand where line continuations are valid and where they aren't?
You want to search for "control operators" in the POSIX Shell Command Language document (http://pubs.opengroup.org/onlinepubs/009695399/utilities/xcu_chap02.html)
Some excerpts:
2.9.1 Simple Commands
A "simple command" is a sequence of optional variable assignments and redirections, in any sequence, optionally followed by words and redirections, terminated by a control operator.
2.9.2 Pipelines
A pipeline is a sequence of one or more commands separated by the control operator '|'.
2.9.3 Lists
An AND-OR list is a sequence of one or more pipelines separated by the operators "&&" and "||" .
A list is a sequence of one or more AND-OR lists separated by the operators ';' and '&' and optionally terminated by ';', '&', or <newline>.
According to the grammar, the control operators that can be followed by a linebreak are:
&& and ||
|
; and &
Additionally, for and while loops, if and case statements, function definitions and subshell and grouping constructs can have liberal numbers of newlines in them.
Easily? Maybe not.
Thoroughly? 2.10 Shell Grammar.
Specifically AND_IF and io_file.
%token AND_IF OR_IF DSEMI
/* '&&' '||' ';;' */
and_or : pipeline
| and_or AND_IF linebreak pipeline
| and_or OR_IF linebreak pipeline
command : simple_command
| compound_command
| compound_command redirect_list
| function_definition
redirect_list : io_redirect
| redirect_list io_redirect
;
io_redirect : io_file
| IO_NUMBER io_file
| io_here
| IO_NUMBER io_here
;
io_file : '<' filename
| LESSAND filename
| '>' filename
| GREATAND filename
| DGREAT filename
| LESSGREAT filename
| CLOBBER filename
;
filename : WORD /* Apply rule 2 */

“IF ELSE” statement inside basic calculator

I’m trying to implement my own calculator with “IF ELSE” statements.
Here is the basic calculator example:
/* description: Parses and executes mathematical expressions. */
/* lexical grammar */
%lex
%%
\s+ /* skip whitespace */
[0-9]+("."[0-9]+)?\b return 'NUMBER'
"*" return '*'
"/" return '/'
"-" return '-'
"+" return '+'
"^" return '^'
"(" return '('
")" return ')'
"PI" return 'PI'
"E" return 'E'
<<EOF>> return 'EOF'
. return 'INVALID'
/lex
/* operator associations and precedence */
%left '+' '-'
%left '*' '/'
%left '^'
%left UMINUS
%start expressions
%% /* language grammar */
expressions
: e EOF
{return $1;}
;
e
: e '+' e
{$$ = $1+$3;}
| e '-' e
{$$ = $1-$3;}
| e '*' e
{$$ = $1*$3;}
| e '/' e
{$$ = $1/$3;}
| e '^' e
{$$ = Math.pow($1, $3);}
| '-' e %prec UMINUS
{$$ = -$2;}
| '(' e ')'
{$$ = $2;}
| NUMBER
{$$ = Number(yytext);}
| E
{$$ = Math.E;}
| PI
{$$ = Math.PI;}
;
What I don’t understand is what happens when I add the “IF” statement like this:
IfStatement
: "IF" "(" Expression ")" Statement
{
$$ = new IfStatementNode($3, $5, null, createSourceLocation(null, #1, #5));
}
| "IF" "(" Expression ")" Statement "ELSE" Statement
{
$$ = new IfStatementNode($3, $5, $7, createSourceLocation(null, #1, #7));
}
;
The parser is generated without errors.
So how can I use a statement like IF(5>2)THEN (5+2) ELSE (5*2)?
The calculator’s functionality works well, of course, but “IF” doesn’t.
It seems that you are looking for two sorts of constructs: an IF statement and an IF expression. Fortunately, your example uses the THEN keyword to distinguish them. Your IF expression production would be something like:
IfExpression
: "IF" "(" Expression ")" "THEN" "(" Expression ")"
{
$$ = new IfExpressionNode(/* pass arguments as desired */);
}
| "IF" "(" Expression ")" "THEN" "(" Expression ")" "ELSE" "(" Expression ")"
{
$$ = new IfExpressionNode(/* arguments */);
}
;
You don't show how your two pieces of grammar are bound together, so it's hard to answer. Have you also looked at other questions, such as Reforming the grammar to remove shift reduce conflict in if-then-else?
