I am attempting to parse Lua, which depends on whitespace in some cases because it doesn't use braces for scope. I figure that throwing out whitespace only when no other rule matches is the best way, but I have no clue how to do that. Can someone help me?
Looking at Lua's documentation, I see no need to take spaces into account.
The parser rule ifStatement:
ifStatement
: 'if' exp 'then' block ('elseif' exp 'then' block 'else' block)? 'end'
;
exp
: /* todo */
;
block
: /* todo */
;
should match both:
if j==10 then print ("j equals 10") end
and:
if j<10 then
  print ("j < 10")
elseif j>100 then
  print ("j > 100")
else
  print ("j >= 10 && j <= 100")
end
No need to take spaces into account, AFAIK. So you can just add:
Space
: (' ' | '\t' | '\r' | '\n'){$channel=HIDDEN;}
;
in your grammar.
EDIT
It seems there is a Lua grammar on the ANTLR wiki: http://www.antlr.org/grammar/1178608849736/Lua.g
And it seems my suggestion for an if statement differs slightly from the one in that grammar:
'if' exp 'then' block ('elseif' exp 'then' block)* ('else' block)? 'end'
which is the correct one, as you can see.
Related
I am working on a project where I need to check whether an employee entered *done* in a text field. However, employees also enter '* done *', '*done *', or '* done*' and similar variants, with leading blanks, trailing blanks, or both. I have to check the column for all three or four possible entries in a LIKE condition. I tried TRIM and RTRIM, but nothing seems to work.
case when
  col like ('*done*')
  or col like ('* done*')
  or col like ('*done *')
  or col like ('* done *')
  then 'done'
end as work_status
doesn't seem like a smart way to do it. What is the best way to check this? Any help will be appreciated. Thank you.
Remove spaces:
case when replace(col, ' ') = '*done*' then 'done'
else 'not'
end as work_status
You can look for the done substring with anything preceding and following using the LIKE operator and % wildcards:
CASE WHEN col LIKE '%done%' THEN 'done' END AS work_status
Or you can trim the leading and trailing space characters:
CASE WHEN TRIM(col) = 'done' THEN 'done' END AS work_status
Or you can replace all the leading/trailing white spaces (in case the users have entered new lines, tabs, etc. rather than space characters) using a regular expression:
CASE
WHEN REGEXP_REPLACE(col, '^[[:space:]]+|[[:space:]]+$') = 'done'
THEN 'done'
END AS work_status
So, I'm running into this issue wherein I want to have three conditions be checked before the routine continues, but it keeps throwing up syntax errors saying it didn't expect the multiple conditions. Now, I know I've seen other people use lines such as:
if x > 100 && x % 2 == 1
#Do something
end
But, for whatever reason, this line:
if (letters.eql? letters.upcase && dash.eql? '-' && numbers.to_i.to_s.eql? numbers)
is throwing up tons of errors. Is it something to do with '.eql?' or is it something extraneous about Ruby that I haven't encountered yet?
Here's the rest of the code for reference:
print "Enter license plate: ";
input = gets.strip;
if input.length == 8
  letters = input[0,2];
  dash = input[3];
  numbers = input[4,7];
  if (letters.eql? letters.upcase && dash.eql? '-' && numbers.to_i.to_s.eql? numbers)
    puts "#{input} is a valid license plate."
  else
    print "All valid license plates are three (3) uppercase letters, followed by a dash (-), followed by four (4) digits";
  end
else
  print "All valid license plates are 8 characters long.";
end
Also, these are the errors:
LicensePlate.rb:7: syntax error, unexpected tSTRING_BEG, expecting ')'
...? letters.upcase && dash.eql? '-' && numbers.to_i.to_s.eql? ...
... ^
LicensePlate.rb:7: syntax error, unexpected tIDENTIFIER, expecting ')'
... numbers.to_i.to_s.eql? numbers)
...
This should do it:
if letters.eql?(letters.upcase) && dash.eql?('-') && numbers.to_i.to_s.eql?(numbers)
You can still wrap the entire conditional in parentheses if you would like, but with Ruby (unlike JavaScript), you don't need to.
Think you're just missing some parens - try this:
if (letters.eql?(letters.upcase) && dash.eql?('-') && numbers.to_i.to_s.eql?(numbers))
This also works:
letters.eql? letters.upcase and dash.eql? '-' and numbers.to_i.to_s.eql? numbers
I believe this is due to operator precedence since this also works:
(letters.eql? letters.upcase) && (dash.eql? '-') && (numbers.to_i.to_s.eql? numbers)
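Without the parentheses, Ruby seems to take everything after the first eql? (the whole && chain) as that call's argument instead of evaluating each comparison first.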
EDIT: Just saw that Lurker was mentioning precedence previously.
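To make the grouping visible, here is a minimal sketch with made-up sample values. It uses only two operands, because the three-operand line from the question does not parse at all; the variable names unparenthesized, parenthesized, and with_and are just for illustration.

```ruby
letters = "ABC"
dash = "-"

# Without parentheses, Ruby takes everything to the right of `eql?` as its
# argument, so this is read as letters.eql?(letters.upcase && dash).
# `letters.upcase && dash` evaluates to "-", and "ABC".eql?("-") is false.
unparenthesized = (letters.eql? letters.upcase && dash)

# With explicit parentheses each comparison is evaluated on its own.
parenthesized = letters.eql?(letters.upcase) && dash.eql?("-")

# `and` binds more loosely than a parenthesis-free method call, so this
# groups the same way as the parenthesized version.
with_and = (letters.eql? letters.upcase and dash.eql? "-")

puts unparenthesized  # false
puts parenthesized    # true
puts with_and         # true
```

The first result is false even though both comparisons are individually true, which is the same mis-grouping that turns into a syntax error once a second parenthesis-free eql? call appears inside the argument.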
In addition to the other answers - consider using a regular expression to check the format:
print "Enter license plate: "
input = gets.chomp
if input.length != 8
  puts "All valid license plates are 8 characters long."
elsif input !~ /^[A-Z]{3}-\d{4}$/
  print "All valid license plates are three (3) uppercase letters, followed by a dash (-), followed by four (4) digits"
else
  puts "#{input} is a valid license plate."
end
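If the check is needed in more than one place, it could be wrapped in a small method. The sketch below assumes a hypothetical helper name, valid_plate?, and uses the \A and \z anchors: in Ruby, ^ and $ match at line boundaries, so they only behave like whole-string anchors when the input is known to be a single line (as it is after gets.chomp).

```ruby
# Hypothetical helper: true when the plate is three uppercase letters,
# a dash, and four digits, with nothing before or after.
def valid_plate?(plate)
  plate.match?(/\A[A-Z]{3}-\d{4}\z/)
end

puts valid_plate?("ABC-1234")       # true
puts valid_plate?("abc-1234")       # false
puts valid_plate?("ABC-12345")      # false
puts valid_plate?("ABC-1234\nxyz")  # false

# For comparison, the line-based anchors accept the multi-line string:
line_anchored = ("ABC-1234\nxyz" =~ /^[A-Z]{3}-\d{4}$/)
p line_anchored                     # 0 (matched at index 0)
```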
Why is there a semicolon at the end of Proc.num_stack_slots.(i) <- 0 in the following code?
I thought semicolons are separators in OCaml. Can we always put an optional semicolon for the last expression of a block?
for i = 0 to Proc.num_register_classes - 1 do
  Proc.num_stack_slots.(i) <- 0;
done;
See https://github.com/def-lkb/ocaml-tyr/blob/master/asmcomp/coloring.ml line 273 for the complete example.
There is no need for a semicolon after this expression, but it is allowed here as a syntactic courtesy. In the example you referenced, there is a semicolon because a second expression follows it.
Essentially, you can view a semicolon as a binary operator that takes two expressions, evaluates them from left to right, and returns the value of the second. When both are unit expressions, its type is roughly:
val (;): unit -> unit -> unit
then the following example will be more understandable:
for i = 1 to 5 do
  printf "Hello, ";
  printf "world\n"
done
Here ; works just as glue. You are allowed to put a ; after the second expression, but only as syntactic sugar; it is nothing more than a courtesy from the compiler developers.
If you open the parser definition of the OCaml compiler, you will see that an expression inside a seq_expr can be ended by a semicolon:
seq_expr:
| expr %prec below_SEMI { $1 }
| expr SEMI { reloc_exp $1 }
| expr SEMI seq_expr { mkexp(Pexp_sequence($1, $3)) }
That means that you can even write such strange code:
let x = 2 in x; let y = 3 in y; 25
These parse and execute fine:
"=".scan(/=/)
"=".scan (/=/)
This causes "unterminated regexp meets end of file":
"=".scan /=/
If I insert something before the = the error goes away:
"=".scan /^=/
What's going on?
I'm guessing that you're hitting this in the parser:
case '/':
    if (IS_BEG()) {
        lex_strterm = NEW_STRTERM(str_regexp, '/', 0);
        return tREGEXP_BEG;
    }
    if ((c = nextc()) == '=') {
        set_yylval_id('/');
        lex_state = EXPR_BEG;
        return tOP_ASGN;
    }
Note the nextc() check in the second if. For reference, tOP_ASGN is:
%token <id> tOP_ASGN /* +=, -= etc. */
so it is used for operator-assign tokens.
This suggests that the /=/ in
'='.scan /=/
is being seen as the divide-assign operator (/=) followed by a start-regex-literal (/).
You'll have trouble (of a slightly different sort) with this:
' ='.scan / =/
but not this:
' ='.scan(/ =/)
There is often ambiguity when a method call doesn't have parentheses. In this case, I think operator precedence rules apply and that's not what you're expecting.
I tend to put parentheses on all my method calls because I'm too old and cranky to want to worry about how the parser is going to behave.
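As a small sketch of two spellings that avoid the ambiguity (the sample string and variable names are made up for illustration): parentheses put the lexer in a position where / can only start a regex, and the %r{...} literal sidesteps the / character altogether.

```ruby
text = "a=b=c"

# Parenthesized call: the slash after "(" can only begin a regex literal.
with_parens = text.scan(/=/)

# %r{...} literal: no slash at all, so "/=" is never seen as an operator.
with_percent_r = text.scan(%r{=})

p with_parens     # ["=", "="]
p with_percent_r  # ["=", "="]
```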
I need to be able to match a certain string ('[', then any number of equals signs or none, then '['), and then I need to match a matching close bracket (']', then the same number of equals signs, then ']') after some other match rules ((options{greedy=false;}:.)* if you must know). I have no clue how to do this in ANTLR. How can I do it?
An example: I need to match [===[whatever arbitrary text ]===] but not [===[whatever arbitrary text ]==].
I need to do it for an arbitrary number of equals signs as well, so therein lies the problem: how do I get it to match the same number of equals signs in the close as in the open? The parser rules supplied so far don't seem to help.
You can't easily write a lexer for it; you need parsing rules. Two rules should be sufficient: one responsible for matching the braces and one for matching the equals signs.
Something like this:
braces : '[' ']'
| '[' equals ']'
;
equals : '=' equals '='
| '=' braces '='
;
This should cover the use case you described. I'm not absolutely sure, but you may have to use a predicate in the first alternative of 'equals' to avoid ambiguous interpretations.
Edit:
It is hard to integrate your non-greedy rule and at the same time avoid a lexer context switch or something similar (which is hard in ANTLR). But if you are willing to integrate a little bit of Java in your grammar, you can write a lexer rule.
The following example grammar shows how:
grammar TestLexer;
SPECIAL : '[' { int counter = 0; } ('=' { counter++; } )+ '[' (options{greedy=false;}:.)* ']' ('=' { counter--; } )+ { if(counter != 0) throw new RecognitionException(input); } ']';
ID : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')*
;
WS : ( ' '
| '\t'
| '\r'
| '\n'
) {$channel=HIDDEN;}
;
rule : ID
| SPECIAL
;
Your tags mention lexing, but your question itself doesn't. What you're trying to do is non-regular, so I don't think it can be done as part of lexing (though I don't remember if ANTLR's lexer is strictly regular -- it's been a couple of years since I last used ANTLR).
What you describe should be possible in parsing, however. Here's the grammar for what you described:
thingy : LBRACKET middle RBRACKET;
middle : EQUAL middle EQUAL
| LBRACKET RBRACKET;
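Outside ANTLR, the "same number of equals signs on both sides" requirement can be sanity-checked with a regex backreference, which is handy for building test cases for whichever grammar you end up with. The following is only a Ruby sketch; the constant LONG_BRACKET and the helper long_bracket_body are hypothetical names, not part of ANTLR or any Lua grammar.

```ruby
# Hypothetical pattern: "[", zero or more "=", "[", a non-greedy body,
# then "]", the same run of "=" (backreference \1), and "]".
LONG_BRACKET = /\[(=*)\[(.*?)\]\1\]/m

# Returns the text between the brackets, or nil when the closing
# bracket does not carry the same number of equals signs.
def long_bracket_body(text)
  match = LONG_BRACKET.match(text)
  match && match[2]
end

p long_bracket_body("[===[whatever arbitrary text ]===]")  # "whatever arbitrary text "
p long_bracket_body("[===[whatever arbitrary text ]==]")   # nil
p long_bracket_body("[[plain]]")                            # "plain"
p long_bracket_body("[==[ has ]=] inside ]==]")             # " has ]=] inside "
```

The backreference is what makes this non-regular in the formal sense, which is the same reason the ANTLR answers above need either a recursive parser rule or an embedded counter.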