How to match EOF condition using antlr 4 - comments

I am new to ANTLR and am currently writing a lexer for the COOL language in ANTLR 4.
For more about the COOL language, please refer to http://theory.stanford.edu/~aiken/software/cool/cool-manual.pdf.
One rule of the COOL language that I was trying to implement is detecting EOF inside comments (which may be nested) or string constants and reporting it as an error.
This is the rule that I wrote:
ERROR : '(*' (COMMENT|~['(*'|'*)'])*? (~['*)']) EOF {reportError("EOF in comment");}
|'"' (~[\n"])* EOF {reportError("EOF in string");};
fragment COMMENT : '(*' (COMMENT|~['(*'|'*)'])*? '*)'
Here the fragment COMMENT is a recursive rule.
The reportError function used above is given below:
public void reportError(String errorString) {
    setText(errorString);
    setType(ERROR);
}
But when I run it on the test file given below:
"Test String
It gives the following output :
line 1:0 token recognition error at: '"Test String\n'
#name "helloworld.cl"
Clearly, the string with EOF in it was not recognised and the ERROR token was not produced.
Can someone help point out where I am going wrong? The EOF (and hence the error rule) is somehow not being detected by the lexer.
If something is not clear, please mention it.

'"' (~[\n"])* EOF
Here the ~[\n"]* part will stop at the first \n or " or at the end of the file.
If it stops at a ", the rule does not match because the EOF does not match and that's what we want because the string literal is properly terminated.
If it stops at the end of file, then the subsequent EOF will match and you'll get an ERROR token. So that's also what you want.
But if it stops at a \n, the EOF will not match and you won't get an error token even though you'd want one in this case. And since your input ends with a \n, that's exactly the scenario you're running into here. So in addition to EOF, you should also allow for erroneous string literals to end in \n:
'"' (~[\n"])* ('\n' | EOF)
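To make the three cases concrete, here is a minimal Python sketch that mimics the corrected alternative with a regular expression. It only illustrates the matching behaviour and is not ANTLR code; the names UNTERMINATED and unterminated_string_prefix are made up for this example.

```python
import re

# Mirrors '"' (~[\n"])* ('\n' | EOF): an opening quote, any run of characters
# other than newline or quote, then either a newline or the end of the input.
UNTERMINATED = re.compile(r'"[^\n"]*(?:\n|\Z)')


def unterminated_string_prefix(text):
    """Return the erroneous string prefix at the start of text, or None."""
    match = UNTERMINATED.match(text)
    return match.group(0) if match else None


if __name__ == "__main__":
    for sample in ['"Test String\n', '"no closing quote', '"closed" rest']:
        print(repr(sample), "->", repr(unterminated_string_prefix(sample)))
```

The first two samples are reported as erroneous (ended by a newline and by the end of input respectively), while the properly closed literal is not matched by this pattern at all.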

You don't need a dedicated ERROR rule. You can handle the specific situation of an unfinished string directly in your error listener. Your comment rule shouldn't be a fragment, however, as it has to recognize a lexeme on its own that must be handled (fragment rules are meant to be used only from within other lexer rules).
When the lexer reaches a string but cannot finish it due to the end of the input, you can get the offending input from the current lexer state in your error listener. You can then check that to see what exactly wasn't finished, like I do here for 3 quoted text types in MySQL:
void LexerErrorListener::syntaxError(Recognizer *recognizer, Token *, size_t line,
                                     size_t charPositionInLine, const std::string &, std::exception_ptr ep) {
  // The passed in string is the ANTLR generated error message which we want to improve here.
  // The token reference is always null in a lexer error.
  std::string message;
  try {
    std::rethrow_exception(ep);
  } catch (LexerNoViableAltException &) {
    Lexer *lexer = dynamic_cast<Lexer *>(recognizer);
    CharStream *input = lexer->getInputStream();
    std::string text = lexer->getErrorDisplay(input->getText(misc::Interval(lexer->tokenStartCharIndex, input->index())));
    if (text.empty())
      text = " "; // Should never happen.
    switch (text[0]) {
      case '/':
        message = "Unfinished multiline comment";
        break;
      case '"':
        message = "Unfinished double quoted string literal";
        break;
      case '\'':
        message = "Unfinished single quoted string literal";
        break;
      case '`':
        message = "Unfinished back tick quoted string literal";
        break;
      default:
        // Hex or bin string?
        if (text.size() > 1 && text[1] == '\'' && (text[0] == 'x' || text[0] == 'b')) {
          message = std::string("Unfinished ") + (text[0] == 'x' ? "hex" : "binary") + " string literal";
          break;
        }
        // Something else the lexer couldn't make sense of (likely there is no rule that accepts this input).
        message = "\"" + text + "\" is no valid input at all";
        break;
    }
    owner->addError(message, 0, lexer->tokenStartCharIndex, line, charPositionInLine,
                    input->index() - lexer->tokenStartCharIndex);
  }
}
This code was taken from the parser module in MySQL Workbench.
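The same first-character classification could be adapted to the COOL case from the question. The Python sketch below shows only the decision logic, under the assumption that the unfinished text starts with the opening delimiter; classify_unfinished is a hypothetical helper, and wiring it into an ANTLR error listener is left out.

```python
def classify_unfinished(text):
    """Map the text the lexer could not finish to an error message."""
    if text.startswith("(*"):
        # COOL comments open with "(*", so this input ended inside a comment.
        return "EOF in comment"
    if text.startswith('"'):
        # A string constant was opened but never closed.
        return "EOF in string"
    # Anything else: the lexer simply has no rule for this input.
    return '"' + text + '" is no valid input at all'


if __name__ == "__main__":
    print(classify_unfinished("(* never closed"))
    print(classify_unfinished('"Test String'))
```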


SonarQube OpenEdge custom rule to verify &IF preprocessor with Proparse

I am trying to create a custom plugin, based on the Riverside OpenEdge plugin and its version of Proparse, with a rule that validates an &IF preprocessor directive.
This rule needs to verify whether the application is using a deprecated value of a &GLOBAL-DEFINE, like this:
/* "A" is the deprecated value of "opts" so I want to create a new ISSUE here */
&IF "{&opts}" = "A" &THEN
MESSAGE "DEPRECATED CODE".
&ENDIF
&IF "{&opts}" > "A" &THEN
MESSAGE "OK CODE".
&ENDIF
For this rule, I tried to do something like this:
if (unit.getMacroGraph().macroEventList.stream().noneMatch(macro -> macro instanceof NamedMacroRef
        && ((NamedMacroRef) macro).getMacroDef().getName().equalsIgnoreCase("opts"))) {
    return;
}
TokenSource stream = unit.lex();
ProToken tok = (ProToken) stream.nextToken();
while (tok.getNodeType() != ABLNodeType.EOF_ANTLR4) {
    if (tok.getNodeType() == ABLNodeType.AMPIF) {
        // Verify node.
        System.out.println(tok);
    }
    tok = (ProToken) stream.nextToken();
}
But I don't know if it's the best way to verify this (I based it on code from other sources), and it's not working because the next node comes back as an empty "QSSTRING". I am very new to the Proparse world; any help is appreciated.
First, you have to know that Proparse doesn't give access to every detail of the preprocessor. That said, the method unit.getMacroGraph() will give you access to the visible part of the preprocessor, so that's a good starting point.
If you're looking for usages of a given preprocessor variable, you can search for NamedMacroRef instances pointing to the right MacroDef object (with NamedMacroRef#getMacroDef()#getName()) and the right value.
In an old-style for-each loop:
for (MacroRef ref : unit.getMacroSourceArray()) {
    if (ref instanceof NamedMacroRef) {
        if ("opts".equalsIgnoreCase(((NamedMacroRef) ref).getMacroDef().getName())
                && "A".equalsIgnoreCase(((NamedMacroRef) ref).getMacroDef().getValue())) {
            System.out.println("OPTS variable usage with value 'A' at file " + ref.getFileIndex() + ":" + ref.getLine());
        }
    }
}
On this file:
&global-define opts a
&IF "{&opts}" = "A" &THEN
MESSAGE "DEPRECATED CODE".
&ENDIF
&undefine opts
&global-define opts b
&IF "{&opts}" > "A" &THEN
MESSAGE "OK CODE".
&ENDIF
This gives:
OPTS variable usage with value 'A' at file 0:2
So you don't have access to the expression engine, but I think that the current API is enough for what you want to do.
You can then report the issue to SonarQube with OpenEdgeProparseCheck#reportIssue().

@returns throwing error when running locally on docker

I'm trying to return an array of assets from a transaction. As per the syntax, I have added @returns and @commit(false) in the cto file, but it throws this error:
✖ Installing business network. This may take a minute...
ParseException: Expected ")", "-", "false", "true", comment, end of
line, number, string or whitespace but "P" found. File
models/org.zcon.healthcare.cto line 370 column 10
Command failed
When I remove the @returns annotation, it does not throw any error.
It also does not throw any error when I remove the parameter "Patient[]" from the @returns annotation. But that is against the syntax, right?
I'm running the application locally using docker swarm.
My docker composer version is v0.19.12.
What's wrong? Is this a bug?
Here is the transaction definition in the cto file, in case you want to see it:
@commit(false)
@returns(Patient[])
transaction SearchPatient {
    o String firstName optional
    o String lastName optional
}
And in logic file
/**
 * Sample transaction
 * @param {org.zcon.healthcare.SearchPatient} tx
 * @returns {org.zcon.healthcare.Patient[]}
 * @transaction
 */
async function SearchPatient(tx) {
    let queryString = `SELECT org.zcon.healthcare.Patient WHERE (`;
    let conditions = [];
    if (tx.hasOwnProperty('firstName')) {
        var firstName = tx.firstName;
        conditions.push(`(firstName == "${firstName}")`)
    };
    if (tx.hasOwnProperty('lastName')) {
        var lastName = tx.lastName;
        conditions.push(`(lastName == "${lastName}")`)
    };
    queryString += conditions.join(' AND ') + ')';
    let finalQuery = buildQuery(queryString);
    const searchPatient = await query(finalQuery);
    if (searchPatient.length == 0) {
        throw "No Patient Records found!!"
    } else
        return searchPatient;
}
I've not seen this error with composer network install (deploying to a running Fabric). I did the network install just fine (see screenshot) with your model and code. I suggest that your error may lie elsewhere in your business network. Can you add the complete sequence of steps that led to the error? How did you build your bna file?
I also tried your code (ending....):
const searchPatient = await query(finalQuery);
console.log("results are " + searchPatient);
console.log("element 1 of array is " + searchPatient[0]);
if(searchPatient.length ==0){
throw "No Patient Records found!!"
} else
return searchPatient;
}
and can see the returned results fine (as shown in console.log - just using a different network name obviously FYI)

hl7 message encoding error while parsing the message in map-reduce

I am trying to parse an HL7 message with HAPI in a map-reduce function. I get an EncodingNotSupportedException when I run the map task.
I tried adding \n or \r to the end of each segment, but I am facing the same error.
The message is saved in a text file and uploaded to HDFS. Do I need to add something? This is my code:
String v = value.toString();
InputStream is = new StringBufferInputStream(v);
is = new BufferedInputStream(is);
Hl7InputStreamMessageStringIterator iter = new Hl7InputStreamMessageStringIterator(is);
HapiContext hcontext = new DefaultHapiContext();
Message hapiMsg;
Parser p = hcontext.getGenericParser();
while (iter.hasNext()) {
    String msg = iter.next();
    try {
        hapiMsg = p.parse(msg);
    } catch (EncodingNotSupportedException e) {
        e.printStackTrace();
        return;
    } catch (HL7Exception e) {
        e.printStackTrace();
        return;
    }
}
The sample message:
MSH|^~\&|HIS|RIH|EKG|EKG|20150121002000||ADT^A01||P|2.5.1
EVN||20150121002000|||||CITY GENL HOSP^0133195934^NPI
PID|1||95101100001^^^^PI^CITY GENL HOSP&0133195934&NPI||SNOW^JOHN^^^MR^^L||19560121002000|M||2054-5^White^CDCREC|470 Ocean Ave^^NEW YORK^^11226^USA^C^^29051||^^^^^513^5551212|||||95101100001||||2186-5^White American^CDCREC|||1
PV1||E||E||||||||||1||||||||||||||||||||||||||||||
OBX|1|NM|21612-7^PATIENT AGE REPORTED^LN||60|a^YEAR^UCUM|||||F|||201601131443
OBX|2|NM|21613-7^Urination^LN||2|a^DAY^UCUM|||||F|||19740514201500
DG1|001||4158^Diabetes^I9CDX||19740514201500|A|5478^Non-infectious
DG1|002||2222^Huntington^I9CDX||19610718121500|A|6958^Genetic
Never store HL7 messages as text files; store them as binary. Are you sure that the segment delimiters are OK?
Check your HL7 message after reading it from HDFS, either by printing it to the console or with a debugger, to see whether it contains only \r as the segment delimiter before parsing.
The segment delimiter has to be a \r, i.e. 0x0D, "carriage return", and not a \n, i.e. 0x0A, "newline". There are probably some tools, maybe HL7 editors, that accept alternative segment delimiters or write the wrong delimiter, but this is not standard.
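The check and the fix described above can be sketched in a few lines of Python. This is a minimal illustration of the delimiter handling only, not part of HAPI; the function names are invented for this example, and the sample text is a shortened stand-in for a real message.

```python
def has_only_cr_delimiters(message):
    """True if the message contains no line feed (0x0A) characters."""
    return "\n" not in message


def normalize_segment_delimiters(message):
    """Turn CR LF and bare LF line endings into the HL7 delimiter CR (0x0D)."""
    # Replace CR LF first so that it does not become a double CR.
    return message.replace("\r\n", "\r").replace("\n", "\r")


if __name__ == "__main__":
    raw = "MSH|^~\\&|HIS|RIH\nEVN||20150121002000\r\nPID|1\n"
    fixed = normalize_segment_delimiters(raw)
    print(has_only_cr_delimiters(raw), has_only_cr_delimiters(fixed))
    print(repr(fixed))
```

Printing the repr of the message, as in the last line, is one way to see which delimiter characters actually survived the round trip through a text file.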

What are the rules that Vala's Process.spawn_command_line_async follows in interpreting CLI arguments?

Specifically, how does it interpret arguments that are in quotes or that feature redirects from standard input (e.g. <)?
I've got the following string:
string cmd = "mail -s 'Work Order #%s' -c %s -r email@server.com %s < email.txt".printf(wo.get_text(), ownmail, outmail.get_text());
When I use
Posix.system(cmd);
The command runs as expected and an email is sent, with the body taken from email.txt.
When I use
Process.spawn_command_line_async(cmd);
I get the error from the mail command that 'option -c is not found' or words to that effect. When I lose the quotes around Work Order #%s and instead escape the spaces, the email sends (with the subject line containing the back slashes) but instead of getting the body of the message from email.txt, it treats email.txt as another recipient of the email (it shows up in my inbox with 'email.txt' under the To: section). The < is being ignored or dropped. To check things out, I used
Process.spawn_command_line_async("echo %s".printf(cmd));
This showed me that the quotes around the subject line were being dropped but the < was still there. I can use Posix.system() in my program but for the sake of simplicity and reducing dependencies (and being more idiomatic), I'd prefer to use Process.spawn_command_line(). What am I missing?
Thank you!
You probably want to play around with Shell.quote() and Shell.unquote() in your "".printf() arguments.
The Vala Process.spawn_command_line_async() function is bound to GLib's g_spawn_command_line_async() function, so a good place to start looking for more details is the GLib documentation. The GLib documentation states that g_spawn_command_line_async() uses g_shell_parse_argv() to parse the command line. This parses the command line so that the "results are defined to be the same as those you would get from a UNIX98 /bin/sh, as long as the input contains none of the unsupported shell expansions."
Also on that page are g_shell_quote() and g_shell_unquote(). These functions are bound to Vala as Shell.quote() and Shell.unquote().
mail only accepts the body of the message from STDIN, and g_spawn_command_line_async() won't handle the redirect. So you will either need a command line tool that takes the body as an argument or need to use something like Subprocess instead.
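As a rough analogy (this uses Python's shlex.split, not GLib's parser, so details may differ), a POSIX-style tokenizer strips the quotes but passes < through as an ordinary argument, because redirection is something a shell performs, not the tokenizer. The addresses and the work order number below are placeholders.

```python
import shlex

# A command line similar to the one in the question, with placeholder values.
cmd = ("mail -s 'Work Order #42' -c cc@example.org "
       "-r sender@example.org to@example.org < email.txt")

# Split into an argument vector the way a POSIX shell would tokenize it,
# but without performing any redirection or expansion.
argv = shlex.split(cmd)

if __name__ == "__main__":
    for index, arg in enumerate(argv):
        print(index, repr(arg))
```

The quoted subject ends up as a single argument without its quotes, and "<" and "email.txt" are just two more arguments handed to the program, which matches the symptom of email.txt being treated as a recipient.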
Thanks to both AIThomas and Jens sending me looking in the right direction, I was able to get it working with the following code:
static int main(string[] args) {
    string subject = "-s " + Shell.quote("Work Order #123131");
    string cc = "-c ccemail@org.org";
    string frommail = "-r " + "senderemail@org.org";
    string[] argv = {"mail", subject, cc, frommail, "destinationemail@org.org"};
    int standard_input;
    int child_pid;
    Process.spawn_async_with_pipes (
        ".",
        argv,
        null,
        SpawnFlags.SEARCH_PATH,
        null,
        out child_pid,
        out standard_input,
        null,
        null);
    FileStream instream = FileStream.fdopen(standard_input, "w");
    instream.write("This is what will be emailed\n".data);
    return 0;
}

JMeter BeanShell Assertion encountered "\\" after ""

My BeanShell Assertion returns the following result as error:
Assertion error: true
Assertion failure: false
Assertion failure message: org.apache.jorphan.util.JMeterException: Error invoking bsh method: eval
Sourced file: inline evaluation of: `` String sentText = \"Changed the TEXT\"; String receivedText = \"Changed the TEXT\"; . . . '' Token Parsing Error: Lexical error at line 2, column 18. Encountered: "\\" (92), after : ""
I have used a BeanShell PreProcessor to set a property as follows, and I use it in an edit, which works fine.
${__setProperty(textEdit,\"Changed the TEXT\")}
Then I get the information using a GET call and I use the following regular expression to get that specific information back.
\"edittedText\":(\".*?\")}
Then I use BeanShell Assertion to put the result from that regular expression in the property textEditPost like this. In that BeanShell Assertion I also check if the changed value is the new value.
${__setProperty(textEditPost,${textEditPost})}
String sentText = ${__property(textEdit)};
String receivedText = ${__property(textEditPost)};
if (sentText.equals(receivedText))
{
    Failure = false;
}
else
{
    Failure = true;
    FailureMessage = "The Text does not match, expected: " + sentText + " but found: " + receivedText;
}
I have absolutely no idea where the error on encountering the two backslashes is coming from, as both Strings contain the same data.
Does anyone have an idea why this is happening and a possible solution?
I have found the problem after making some BeanShell Assertions for other things. And I also feel pretty stupid now for not realizing this earlier...
The problem is that the value in the property textEdit is \"Changed the TEXT\" and thus starts with a backslash. Due to this backslash the program has no idea what to do with it when trying to assign it to the String variable sentText or when using the property directly in the if statement.
By placing the property call between quotes the program can properly save it in the String variable. Like this:
${__setProperty(textEditPost,${textEditPost})}
String sentText = "${__property(textEdit)}";
String receivedText = "${__property(textEditPost)}";
if (sentText.equals(receivedText))
{
    Failure = false;
}
else
{
    Failure = true;
    FailureMessage = "The Text does not match, expected: " + sentText + " but found: " + receivedText;
}
I hope this can help others with similar problems too.
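To see why the quotes matter, it can help to look at the script text after the property value has been pasted in. The Python sketch below is a simplified model of that textual substitution, not JMeter's actual implementation; expand_properties and the props dictionary are made up for this illustration.

```python
import re


def expand_properties(script, props):
    """Replace each ${__property(name)} with its value, as plain text."""
    pattern = r"\$\{__property\((\w+)\)\}"
    # A function is used as the replacement so backslashes are kept verbatim.
    return re.sub(pattern, lambda m: props[m.group(1)], script)


# The property value from the question: it starts and ends with \"
props = {"textEdit": r'\"Changed the TEXT\"'}

unquoted = expand_properties("String sentText = ${__property(textEdit)};", props)
quoted = expand_properties('String sentText = "${__property(textEdit)}";', props)

if __name__ == "__main__":
    print(unquoted)
    print(quoted)
```

In the unquoted line a backslash appears where the interpreter expects an expression to start, which matches the reported lexical error; in the quoted line the same characters sit inside a string literal, where \" is a valid escape sequence.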
