Antlr v3 comment processing VHDL - comments

I am facing an ANTLR problem in a VHDL grammar I wrote. VHDL doesn't have true multiline comments, and no pragmas, so tool vendors invented a comment based mechanism to exclude certain parts of the code, something like
-- pragma translate_off
code to disregard
-- pragma translate_on
('--' introduces a comment in VHDL.) The actual wording of the pragma varies; "synopsys translate off" and "rtl translate_off" are known variants.
The part of the ANTLR grammar that handles comments currently looks like this:
@lexer::members {
private static final Pattern translateOnPattern = Pattern.compile("\\s*--\\s*(rtl_synthesis\\s+on|(pragma|synthesis|synopsys)\\s+translate(\\s|_)on)\\s*");
private static final Pattern translateOffPattern = Pattern.compile("\\s*-- \\s*(rtl_synthesis\\s+off|(pragma|synthesis|synopsys)\\s+translate(\\s|_)off)\\s*");
private boolean translateOn = true;
}
[...]
COMMENT
: '--' ( ~( '\n' | '\r' ) )*
{
$channel = CHANNEL_COMMENT;
String content = getText();
Matcher mOn = translateOnPattern.matcher(content);
if(mOn.matches()) {
translateOn = true;
}
Matcher mOff = translateOffPattern.matcher(content);
if(mOff.matches()) {
translateOn = false;
}
}
;
The problem is that my comments go to the hidden channel. I can recognize these pragmas by processing the comment text with a regex in a lexer action, but I have not found a way to direct all subsequent tokens to the hidden channel until "-- pragma translate_on" is seen. Is that possible, or would you generally use a different approach?
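The toggle described above can be tried out in isolation before wiring it into a lexer. The following is a minimal plain-Java sketch, not ANTLR code: the class name PragmaFilter and its methods are hypothetical, input is handled line by line instead of token by token, and the "off" pattern is written symmetrically to the "on" pattern. In an ANTLR 3 lexer, the same flag could presumably be consulted in an overridden nextToken() to move tokens to the hidden channel while translateOn is false; that wiring is an assumption and is not shown here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical stand-in for the lexer state: only the on/off toggle logic. */
class PragmaFilter {
    // Patterns taken from the grammar in the question.
    private static final Pattern ON = Pattern.compile(
        "\\s*--\\s*(rtl_synthesis\\s+on|(pragma|synthesis|synopsys)\\s+translate(\\s|_)on)\\s*");
    private static final Pattern OFF = Pattern.compile(
        "\\s*--\\s*(rtl_synthesis\\s+off|(pragma|synthesis|synopsys)\\s+translate(\\s|_)off)\\s*");

    private boolean translateOn = true;

    /** Returns true if the line should stay visible to the parser. */
    boolean isVisible(String line) {
        if (ON.matcher(line).matches()) {
            translateOn = true;
            return false;                 // the pragma itself is a comment
        }
        if (OFF.matcher(line).matches()) {
            translateOn = false;
            return false;
        }
        if (line.trim().startsWith("--")) {
            return false;                 // ordinary comment
        }
        return translateOn;               // code is visible only while translation is on
    }

    /** Convenience helper: keeps only the lines a parser would see. */
    static List<String> visibleLines(List<String> lines) {
        PragmaFilter filter = new PragmaFilter();
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            if (filter.isVisible(line)) {
                out.add(line);
            }
        }
        return out;
    }
}
```

Keeping the state in one boolean means that nested or unbalanced pragmas are not detected; whether that matters depends on the vendor conventions you need to support.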

Related

Where is the best place to define variable?

I am wondering whether the placement of variable declarations makes any difference for performance or anything else. For example, here there are three variables, and each one is declared just before it is used.
bool myFunc()
{
string networkName;
if ( !Parse(example, XML_ATTRIBUTE_NAME, networkName) )
{
return false;
}
BYTE networkId;
if ( !Parse(example, XML_ATTRIBUTE_ID, networkId) )
{
return false;
}
string baudRate;
if ( !Parse(example, XML_ATTRIBUTE_BAUDRATE, baudRate) )
{
return false;
}
}
Is there any difference, in performance or otherwise, between the code above and the code below?
bool myFunc()
{
string networkName;
string baudRate;
BYTE networkId;
if ( !Parse(example, XML_ATTRIBUTE_NAME, networkName) )
{
return false;
}
if ( !Parse(example, XML_ATTRIBUTE_ID, networkId) )
{
return false;
}
if ( !Parse(example, XML_ATTRIBUTE_BAUDRATE, baudRate) )
{
return false;
}
}
Code Readability
The recommended practice is to put the declaration as close as possible to the first place where the variable is used. This also minimizes the scope.
From Steve McConnell's "Code Complete" book:
Ideally, declare and define each variable close to where it’s first
used. A declaration establishes a variable’s type. A definition assigns
the variable a specific value. In languages that support it, such as
C++ and Java, variables should be declared and defined close to where
they are first used. Ideally, each variable should be defined at the
same time it’s declared.
Nevertheless, a few sources recommend placing declarations at the beginning of the block ({}).
From the obsolete Java Code Conventions:
Put declarations only at the beginning of blocks. (A block is any code
surrounded by curly braces "{" and "}".) Don't wait to declare
variables until their first use; it can confuse the unwary programmer
and hamper code portability within the scope.
Declaring variables only at the top of the function is considered bad practice. Place declarations in the most local blocks.
Performance
In fact, it depends. Declaring POD types should not affect performance at all: the memory for all local variables is allocated when you call the function (C, JavaScript, ActionScript...).
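For non-POD types (C++), the constructor runs at the point of declaration, so a late declaration can skip that work on an early return. Still, the compiler optimizes your code, so I guess this is rarely a problem in practice.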
Usually choosing the place to declare a variable is a premature optimization, so the performance is an unimportant point here because of its insignificant microscopic boost (or overhead). The major argument is still the code readability.
Additional Note
Before the C99 standard, C required variables to be declared at the beginning of a block.
Summarizing
Considering the above, the best approach (though not mandatory) is to declare a variable as close as possible to its first use, keeping the scope small.
In general, it's just a matter of code readability.
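As an illustration of the "declare at first use" style in one of the languages mentioned above, here is a small Java sketch. The class NetworkConfig, the attribute keys, and the output format are all made up for the example; they only loosely mirror the Parse calls in the question.

```java
import java.util.Map;

class NetworkConfig {
    /** Hypothetical helper: returns null as soon as a required attribute is missing. */
    static String describe(Map<String, String> attrs) {
        String name = attrs.get("name");        // declared and initialized at first use
        if (name == null) {
            return null;
        }

        String idText = attrs.get("id");
        if (idText == null) {
            return null;
        }
        int id = Integer.parseInt(idText);      // exists only once a value is available

        String baudRate = attrs.get("baudrate");
        if (baudRate == null) {
            return null;
        }

        return name + "#" + id + "@" + baudRate;
    }
}
```

Each variable is initialized in the same statement that declares it, so there is no window in which it holds a meaningless value.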

Decision can match input such as "{'A'..'Z', '_', 'a'..'z'}" using multiple alternatives: 1, 3

I am a newbie to ANTLR 3.5. I understand that left recursion is accepted in ANTLR 4.0 but not in 3.5, and I am getting an ambiguity warning for my grammar.
I am just trying to validate an email address with this grammar. Can someone fix it?
grammar HelloWorld;
options
{
// antlr will generate java lexer and parser
language = Java;
// generated parser should create abstract syntax tree
output = AST;
backtrack = true;
}
//as the generated lexer will reside in com.nuwaza.aqua.antlr
//package, we have to add package declaration on top of it
@lexer::header {
package com.nuwaza.aqua.antlr;
}
//as the generated parser will reside in org.meri.antlr_step_by_step.parsers
//package, we have to add package declaration on top of it
@parser::header {
package com.nuwaza.aqua.antlr;
}
// ***************** parser rules:
//our grammar accepts only salutation followed by an end symbol
expression : EmailId At Domain Dot Web EOF;
// ***************** lexer rules:
//the grammar must contain at least one lexer rule
EmailId: (Domain)+;
At : '@';
Domain:(Identifier)+;
Dot: DotOperator;
Web:(Identifier)+|(DotOperator)+|(Identifier)+;
/*Space
:
(
' '
| '\t'
| '\r'
| '\n'
| '\u000C'
)
{
skip();
}
;*/
Identifier
:
(
'a'..'z'
| 'A'..'Z'
| '_'
)
(
'a'..'z'
| 'A'..'Z'
| '_'
| Digit
)*
;
fragment
Digit
:
'0'..'9'
;
fragment DotOperator:'.';
I assume that your problem is in your rule: Identifier. If I were you, I would do something like:
Identifier : ID (ID |Digit)*;
fragment ID : ('a'..'z' | 'A'..'Z' | '_');
I hope this would help you. ;)
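To check what the Identifier rule is meant to match, independently of ANTLR, the same character classes can be written as a java.util.regex pattern. This is only a sketch under the assumption that the intended input shape is Identifier '@' Identifier '.' Identifier; the class name EmailShape is hypothetical, and real email addresses allow far more than this.

```java
import java.util.regex.Pattern;

class EmailShape {
    // Mirrors the Identifier rule: a letter or '_' followed by letters, digits or '_'.
    static final String IDENTIFIER = "[A-Za-z_][A-Za-z_0-9]*";

    // Identifier '@' Identifier '.' Identifier (assumed shape, see the note above).
    static final Pattern EMAIL = Pattern.compile(
        IDENTIFIER + "@" + IDENTIFIER + "\\." + IDENTIFIER);

    static boolean looksLikeEmail(String s) {
        return EMAIL.matcher(s).matches();
    }
}
```

Note that the three name parts all use the same pattern. A lexer given several token rules that match exactly the same characters cannot tell them apart, which is a likely source of warnings like the one in the title.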
I have two separate grammar files, and I am trying to use a combined grammar on top of them for a different level of abstraction.
My code is as follows:
HelloWorldParser.g
parser grammar HelloWorldParser;
options
{
// antlr will generate java lexer and parser
language = Java;
// generated parser should create abstract syntax tree
output = AST;
}
//as the generated parser will reside in org.meri.antlr_step_by_step.parsers
//package, we have to add package declaration on top of it
// ***************** parser rules:
//our grammar accepts only salutation followed by an end symbol
expression1
:
Hello World EOF;
and HelloWorldLexer.g
lexer grammar HelloWorldLexer;
//as the generated lexer will reside in com.nuwaza.aqua.antlr
//package, we have to add package declaration on top of it
//as the generated parser will reside in org.meri.antlr_step_by_step.parsers
//package, we have to add package declaration on top of it
// ***************** lexer rules:
Hello: 'Hello';
World: 'World';
My combined grammar is
Test.g
grammar Test;
options
{
// antlr will generate java lexer and parser
language = Java;
// generated parser should create abstract syntax tree
output = AST;
}
import HelloWorldLexer, HelloWorldParser;
@lexer::header {
package com.nuwaza.aqua.antlr;
}
@parser::header {
package com.nuwaza.aqua.antlr;
}
// ***************** parser rules:
//our grammar accepts only salutation followed by an end symbol
expression:expression1;
My LexerParserGenerator is :
package com.nuwaza.aqua.antlr.generator;
import org.antlr.Tool;
public class LexerParserGenerator {
private static final String OUTPUT_DIRECTORY_KEY = "-o";
public static void main(String[] args) {
//provide the grammar ( .g file) residing path
String grammarPath = "./src/main/resources/grammar/Test.g";
//Specify the path with which grammar has to be generated.
String outputPath = "./src/main/java/com/nuwaza/aqua/antlr/";
Tool tool = new Tool(new String[] { grammarPath, OUTPUT_DIRECTORY_KEY,
outputPath });
tool.process();
}
}

Modelsim / reading a signal value

In my simulation, I want read/write access to signals wherever they are in the project. For write access, I use the "signal_force" procedure from the modelsim_lib library. For read access, however, I haven't found a corresponding function.
signal_force fits my needs because I'm working with input text files: I have the name and the value of the signal in a "string" or "line" variable and can pass these variables directly to the procedure.
I cannot use the "init_signal_spy" procedure because it doesn't return a value as a string; it just mirrors the behavior of one signal onto another. As my project has to be as generic as possible, I work with variables declared inside procedures, and I cannot link a signal to a variable.
Thanks for your help
Sorry, I win the "did not read very carefully" award for the day...
Just for completeness, I'm leaving the part of my answer that deals with signal spy (which is a proprietary ModelSim method), even though you said it wouldn't work for you:
library modelsim_lib;
use modelsim_lib.util.all;
architecture ...
signal local_sig ...
begin
process
begin
init_signal_spy("/sim/path/to/signal/internal_sig", "local_sig");
With VHDL-2008 (if you have support for it), the standard way to access signals not in scope is hierarchical/external names, and as a bonus, it does both "write" and "read". I may be a bit rusty on the nuances, but you access them like:
<<signal .sim.path.to.signal.internal_sig : std_logic>>
And you should be able to use that in place of any normal in-scope identifier, I believe. Aliases, assignments, etc.
If you're comfortable writing C code, it should be straightforward to achieve what you want using the VHPI, although sadly, despite it being part of the VHDL standard, Mentor are not planning to implement it. It is also possible using the FLI, although that locks you into a proprietary interface.
Something like this:
procedure get_signal_value_as_string(
vhdl_path : IN string;
vhdl_value: OUT string);
attribute FOREIGN of get_signal_value_as_string : procedure is "my_func mylib.so";
procedure get_signal_value_as_string(
vhdl_path : IN string;
vhdl_value: OUT string) is
begin
report "ERROR: foreign subprogram get_signal_value_as_string not called";
end;
Then in C:
#include <stdio.h>
#include "mti.h"
/* Convert a VHDL String array into a NULL terminated string */
static char *get_string(mtiVariableIdT id)
{
static char buf[1000];
mtiTypeIdT type;
int len;
mti_GetArrayVarValue(id, buf);
type = mti_GetVarType(id);
len = mti_TickLength(type);
buf[len] = 0;
return buf;
}
void my_func (
mtiVariableIdT vhdl_path, /* IN string */
mtiVariableIdT vhdl_value /* OUT string */
)
{
mtiSignalIdT sigID = mti_FindSignal(get_string(vhdl_path));
mtiInt32T value = mti_GetSignalValue(sigID);
...
}
Plenty of example code in the FLI manual.

JavaCC: A LOOKAHEAD of 2 or greater make my compiler crash?

I am using the Grammar defined in the official Java 8 Language Specification to write a Parser for Java.
In my .jj file I have all of the usual kinds of choice conflicts such as
Warning: Choice conflict involving two expansions at
line 25, column 3 and line 31, column 3 respectively.
A common prefix is:
Consider using a lookahead of 2 for earlier expansion.
or
Warning: Choice conflict in (...)* construct at line 25, column 8.
I carefully read the JavaCC Lookahead tutorial, but whenever I set a LOOKAHEAD(n) with n > 1 and compile the .jj file, the compilation gets stuck and I need to kill the java process.
Why?
CODE
Since I cannot pinpoint the code that causes the problem, I am also unable to isolate the relevant portions.
I was able to narrow down the search for the erroneous code fragments as follows:
I have uploaded the code at scribd here.
Please note:
The first rules have a leading // OK comment. This means that when I only have these rules, the compiler warns about choice conflicts, but when I add
LOOKAHEAD(3)
at the corresponding position, the warnings disappear.
When I add all subsequent rules (at once), I can no longer add the
LOOKAHEAD(3) statement. When I do, my Eclipse IDE freezes and the javaw.exe process seems to deadlock or run into an infinite loop when I try to compile the file with JavaCC (which is my actual problem).
Your grammar is so far from LL(1) that it is hard to know where to begin. Let's look at types. After correcting it to follow the grammar in the JLS 8, you have
void Type() :
{ }
{
PrimitiveType() |
ReferenceType()
}
where
void PrimitiveType() :
{ }
{
(Annotation())* NumericType() |
(Annotation())* <KW_boolean>
}
void ReferenceType() :
{ }
{
ClassOrInterfaceType() |
TypeVariable() |
ArrayType()
}
void ClassOrInterfaceType() :
{ }
{
(Annotation())* <Identifier> (TypeArguments())? |
(Annotation())* <Identifier> (TypeArguments())? M()
}
And the error for Type is
Warning: Choice conflict involving two expansions at
line 796, column 3 and line 797, column 3 respectively.
A common prefix is: "@" <Identifier>
Consider using a lookahead of 3 or more for earlier expansion.
The error message tells you exactly what the problem is. There can be annotations at the start of both alternatives in Type. One way to deal with this is to factor out what's common, which is annotations.
Now you have
void Type() :
{ }
{
( Annotation() )*
( PrimitiveType() | ReferenceType() )
}
void PrimitiveType() :
{ }
{
NumericType() |
<KW_boolean>
}
void ReferenceType() :
{ }
{
ClassOrInterfaceType() |
TypeVariable() |
ArrayType()
}
void ClassOrInterfaceType() :
{ }
{
<Identifier> (TypeArguments())? |
<Identifier> (TypeArguments())? M()
}
That fixes the problem with Type. There are still lots of problems, but now there is one less.
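To see why factoring out the common prefix removes the conflict, it can help to write the factored rule as a hand-written predictive parser. The sketch below is plain Java, not JavaCC output. It assumes tokens arrive as plain strings, treats an annotation as just "@" followed by an identifier, and reduces ReferenceType to a single identifier; the class name TypeParser is hypothetical.

```java
import java.util.Arrays;
import java.util.List;

/** Toy predictive parser for: Type := ("@" Identifier)* (PrimitiveType | Identifier). */
class TypeParser {
    private static final List<String> PRIMITIVES = Arrays.asList(
        "boolean", "byte", "short", "int", "long", "char", "float", "double");

    private final List<String> tokens;
    private int pos = 0;

    TypeParser(List<String> tokens) {
        this.tokens = tokens;
    }

    private String peek() {
        return pos < tokens.size() ? tokens.get(pos) : "<EOF>";
    }

    /** Returns "primitive" or "reference"; throws on a syntax error. */
    String parseType() {
        // The common prefix is consumed once, before the choice is made...
        while (peek().equals("@")) {
            pos++;                  // consume "@"
            expectIdentifier();     // annotation name
        }
        // ...so a single token of lookahead is enough to pick the alternative.
        if (PRIMITIVES.contains(peek())) {
            pos++;
            return "primitive";
        }
        expectIdentifier();
        return "reference";
    }

    private void expectIdentifier() {
        String t = peek();
        if (!t.matches("[A-Za-z_$][A-Za-z_$0-9]*") || PRIMITIVES.contains(t)) {
            throw new IllegalStateException("identifier expected at token " + pos + ": " + t);
        }
        pos++;
    }
}
```

In the unfactored version, the parser would have to look past an unbounded number of annotations before it could choose between the two alternatives, which is what the warning is pointing at.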
For example, all three choices in ReferenceType can start with an identifier. In the end you will want something like this
void Type() :
{ }
{
( Annotation() )*
( PrimitiveType() | ReferenceTypesOtherThanArrays() )
( Dims() )?
}
void PrimitiveType() :
{ }
{
NumericType() | <KW_boolean>
}
void ReferenceTypesOtherThanArrays() :
{ }
{
<Identifier>
( TypeArguments() )?
(
<Token_Dot>
( Annotation() )*
<Identifier>
( TypeArguments() )?
)*
}
Notice that TypeVariable is gone. This is because there is no way to syntactically distinguish a type variable from a class (or interface) name. Thus the grammar just above will accept, say, T.x, where T is a type variable, whereas the JLS grammar does not. This is the kind of error you can only rule out using a symbol table. There are a few situations like this in Java; for example, without a symbol table, you can't tell a package name from a class name or a class name from a variable name; in an expression a.b.c, a could be a package name, a class name, an interface name, a type variable, a variable, or a field name.
You can handle these sorts of issues in one of two ways: you can deal with the problem after parsing, i.e. in a later phase, or you can have a symbol table present during the parsing phase and use it to guide the parser using semantic lookahead. The latter option is not a good one for Java, however; it is best to parse first and deal with all issues that need a symbol table later. This is because, in Java, a symbol can be declared after it is used. It might even be declared in another file. What we did in the Java compiler for the Teaching Machine was to parse all files first, then build a symbol table, then do semantic analysis. Of course, if your application does not require diagnosing all errors, then these considerations can largely be ignored.
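The "parse first, resolve later" idea can be sketched in a few lines of Java. Everything here is hypothetical (the class LateBinding, the Kind enum, the method names); it is not taken from the Teaching Machine compiler and only shows why deferring the decision copes with a use that appears before its declaration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class LateBinding {
    enum Kind { PACKAGE, CLASS, VARIABLE, UNKNOWN }

    private final Map<String, Kind> symbols = new HashMap<>();
    private final List<String> uses = new ArrayList<>();

    // Phase 1 (parsing): only record what was seen; decide nothing yet.
    void sawUse(String dottedName) {
        uses.add(dottedName);
    }

    void sawDeclaration(String name, Kind kind) {
        symbols.put(name, kind);
    }

    // Phase 2 (after all files are parsed): classify the head of each dotted name.
    Map<String, Kind> resolve() {
        Map<String, Kind> result = new HashMap<>();
        for (String use : uses) {
            String head = use.split("\\.")[0];
            result.put(use, symbols.getOrDefault(head, Kind.UNKNOWN));
        }
        return result;
    }
}
```

Because resolve() runs only after every declaration has been recorded, the order in which uses and declarations were encountered does not matter.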

How do I combine LINQ expressions into one?

I've got a form with multiple fields on it (company name, postcode, etc.) which allows a user to search for companies in a database. If the user enters values in more than one field then I need to search on all of those fields. I am using LINQ to query the database.
So far I have managed to write a function which will look at their input and turn it into a List of expressions. I now want to turn that List into a single expression which I can then execute via the LINQ provider.
My initial attempt was as follows
private Expression<Func<Company, bool>> Combine(IList<Expression<Func<Company, bool>>> expressions)
{
if (expressions.Count == 0)
{
return null;
}
if (expressions.Count == 1)
{
return expressions[0];
}
Expression<Func<Company, bool>> combined = expressions[0];
expressions.Skip(1).ToList().ForEach(expr => combined = Expression.And(combined, expr));
return combined;
}
However this fails with an exception message along the lines of "The binary operator And is not defined for...". Does anyone have any ideas what I need to do to combine these expressions?
EDIT: Corrected the line where I had forgotten to assign the result of and'ing the expressions together to a variable. Thanks for pointing that out folks.
You can use Enumerable.Aggregate combined with Expression.AndAlso. Here's a generic version:
Expression<Func<T, bool>> AndAll<T>(
IEnumerable<Expression<Func<T, bool>>> expressions) {
if(expressions == null) {
throw new ArgumentNullException("expressions");
}
var list = expressions.ToList();
if(list.Count == 0) {
return t => true;
}
// AndAlso needs boolean operands, so each lambda is invoked on one
// shared parameter instead of being combined as a lambda directly.
// Note: some LINQ providers cannot translate Invoke nodes.
ParameterExpression parameter = Expression.Parameter(typeof(T), "t");
Expression combined = list
.Select(e => (Expression)Expression.Invoke(e, parameter))
.Aggregate((e1, e2) => Expression.AndAlso(e1, e2));
return Expression.Lambda<Func<T, bool>>(combined, parameter);
}
Your current code is never assigning to combined:
expr => Expression.And(combined, expr);
returns a new Expression that is the result of bitwise anding combined and expr but it does not mutate combined.
EDIT: Jason's answer is now fuller than mine was in terms of the expression tree stuff, so I've removed that bit. However, I wanted to leave this:
I assume you're using these for a Where clause... why not just call Where with each expression in turn? That should have the same effect:
var query = ...;
foreach (var condition in conditions)
{
query = query.Where(condition);
}
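The same "fold a list of conditions into one" idea can be shown outside of LINQ. The sketch below is a Java analogue using java.util.function.Predicate; it works on in-memory predicates only and says nothing about how a LINQ provider translates expression trees. The class name PredicateCombiner is hypothetical.

```java
import java.util.List;
import java.util.function.Predicate;

class PredicateCombiner {
    /** ANDs all predicates together; an empty list yields a predicate that accepts everything. */
    static <T> Predicate<T> andAll(List<Predicate<T>> predicates) {
        Predicate<T> combined = t -> true;
        for (Predicate<T> p : predicates) {
            combined = combined.and(p);   // short-circuiting AND, comparable to AndAlso
        }
        return combined;
    }
}
```

Starting from an always-true predicate plays the same role as returning t => true for an empty list in the C# version above.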
This is a general question about combining LINQ expressions, and I have a general solution for it. I will give an answer for the specific problem posted, although it is definitely not the way to go in simple cases like this one. When simple solutions fail, though, you may want to try this approach.
First you need a library consisting of two simple functions. They use System.Linq.Expressions.ExpressionVisitor to dynamically modify expressions. The key feature is unifying parameters inside the expression, so that two parameters with the same name are made identical (UnifyParametersByName). The remaining part is replacing a named parameter with a given expression (ReplacePar). The library is available under the MIT license on GitHub: LinqExprHelper, but you could quickly write something similar on your own.
The library allows a fairly simple syntax for combining complex expressions. You can mix inline lambda expressions, which are nice to read, with dynamic expression creation and composition, which is very capable.
private static Expression<Func<Company, bool>> Combine(IList<Expression<Func<Company, bool>>> expressions)
{
if (expressions.Count == 0)
{
return null;
}
// Prepare a master expression, used to combine other
// expressions. It needs more input parameters, they will
// be reduced later.
// There is a small inconvenience here: you have to use
// the same name "c" for the parameter in your input
// expressions. But it may be all done in a smarter way.
Expression <Func<Company, bool, bool, bool>> combiningExpr =
(c, expr1, expr2) => expr1 && expr2;
LambdaExpression combined = expressions[0];
foreach (var expr in expressions.Skip(1))
{
// ReplacePar comes from the library, it's an extension
// requiring `using LinqExprHelper`.
combined = combiningExpr
.ReplacePar("expr1", combined.Body)
.ReplacePar("expr2", expr.Body);
}
return (Expression<Func<Company, bool>>)combined;
}
Assuming you have two expressions e1 and e2, you can try this:
var combineBody = Expression.AndAlso(e1.Body, Expression.Invoke(e2, e1.Parameters[0]));
var finalExpression = Expression.Lambda<Func<TestClass, bool>>(combineBody, e1.Parameters).Compile();
