I'm trying to model a YAML-like DSL in Xtext. In this DSL, I need a multiline string, as in YAML:
description: |
  Line 1
  line 2
  ...
My first try was this:
terminal BEGIN:
    'synthetic:BEGIN'; // increase indentation
terminal END:
    'synthetic:END'; // decrease indentation
terminal MULTI_LINE_STRING:
    "|"
    BEGIN ANY_OTHER END;
and my second try was
terminal MULTI_LINE_STRING:
    "|"
    BEGIN
    ((!('\n'))+ '\n')+
    END;
but neither of them succeeded. Is there any way to do this in Xtext?
UPDATE 1:
I've tried this alternative as well.
terminal MULTI_LINE_STRING:
    "|"
    BEGIN -> END
When I triggered the "Generate Xtext Artifacts" process, I got this error:
3492 [main] INFO nerator.ecore.EMFGeneratorFragment2 - Generating EMF model code
3523 [main] INFO clipse.emf.mwe.utils.GenModelHelper - Registered GenModel 'http://...' from 'platform:/resource/.../model/generated/....genmodel'
error(201): ../.../src-gen/.../parser/antlr/lexer/Internal..Lexer.g:236:71: The following alternatives can never be matched: 1
error(3): cannot find tokens file ../.../src-gen/.../parser/antlr/internal/Internal...Lexer.tokens
error(201): ../....idea/src-gen/.../idea/parser/antlr/internal/PsiInternal....g:4521:71: The following alternatives can never be matched: 1
This slide deck shows how we implemented whitespace block scoping in an Xtext DSL.
We used synthetic tokens called BEGIN corresponding to an indent, and END corresponding to an outdent.
(Note: the language was subsequently renamed to RAPID-ML, included as a feature of RepreZen API Studio.)
I think your main problem is that you have not defined where your multiline token ends. Before you arrive at a solution, you have to make clear in your mind how an algorithm should determine the end of the token. No tool can take this mental burden from you.
Issue: there is no end-marking character. Either you define such a character (unlike YAML) or you define the end of the token in another way, for example through some sort of semantic whitespace (I think YAML does it like that).
The first approach would make things very easy: just read content until you find the closing character. The second approach would probably be manageable using a custom lexer. Basically, you replace the generated lexer with your own implementation that is able to count blanks or similar.
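For the first approach, Xtext's "until" operator covers reading everything up to a closing marker. A minimal sketch, assuming you are willing to introduce an explicit end delimiter (the '>>>' marker here is hypothetical, not part of your DSL or of YAML):

terminal MULTI_LINE_STRING:
    '|' -> '>>>';

This mirrors the multi-line comment terminal from Xtext's default Terminals grammar (terminal ML_COMMENT: '/*' -> '*/';), which consumes content, including newlines, until the closing delimiter appears.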
Here are some starting points for how this could be done (several approaches are conceivable):
Writing a custom Xtext/ANTLR lexer without a grammar file
http://consoliii.blogspot.de/2013/04/xtext-is-incredibly-powerful-framework.html
I have several processes with almost the same flow, like "get some parameters, extract data from a database according to them, and upload it to a target". The parameters vary slightly across processes, as do the targets, but only a bit; most of the process is the same. I would like to extract those differences into a parameter context and load them dynamically. My idea is to have the parameters defined in the following way and then use them.
So the core of the question is:
How do I dynamically choose which parameter group to load and use?
Having several parameter contexts with same-named/different-valued parameters and dynamically switching between them would probably be best, but as far as I know that is not possible.
Duplicating flows is also off the table: any error correction would be spread out over several places and maintenance would be a nightmare.
Moreover, I know I could do it like "in GenerateFlowFile for process A set value1=#{A_value1}, and in GenerateFlowFile for process B set value1=#{B_value1}". But this is tedious, error-prone and scales badly, not to mention situations with dozens of parameters and several processes. It is also a kind of hardcoding, not configuring...
I was hoping for something like defining group=A and then using it like value1=#{ ${ group:append('_value1') } }, but this does not work - it is evaluated as a parameter literally named ${ group:append('_value1') }.
TL;DR: Use evaluateELString().
The actual solution is to set group=A in the GenerateFlowFile processor and then, in the next UpdateAttribute processor, set the following:
value1=${ group:prepend('hash{ '):append('_value1 }'):replace('hash', '#'):evaluateELString() }
The magic being done here is: "take the value of group, wrap it with #{ and _value1 } to make it a valid NiFi Expression Language statement, and then evaluate it." (Notice: the word hash and the replace function are there because I didn't manage to escape the # character directly before {.)
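Step by step, the expression expands like this (illustrative, assuming group=A and a parameter A_value1 defined in the context):

group:prepend('hash{ '):append('_value1 }')  ->  hash{ A_value1 }
...:replace('hash', '#')                     ->  #{ A_value1 }
...:evaluateELString()                       ->  the value of parameter A_value1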
If you would like to have value1 at the beginning of the statement, you can use the following code. The result is the same; it is easier to use (the often-changed value value1 sits at the beginning of the statement) but less readable "what is really going on?"-wise.
value1=${ literal('value1'):prepend('_'):prepend(${ group }):prepend('hash{ '):append(' }'):replace('hash', '#'):evaluateELString() }
I get the following error while documenting a module variable json_class_index (See source), which does not have a docstring.
The generated documentation seems to be fine. What is a good fix?
reading sources... [100%] sanskrit_data_schema_common
/home/vvasuki/sanskrit_data/sanskrit_data/schema/common.py:docstring of sanskrit_data.schema.common.json_class_index:3: WARNING: Unexpected indentation.
/home/vvasuki/sanskrit_data/sanskrit_data/schema/common.py:docstring of sanskrit_data.schema.common.json_class_index:4: WARNING: Block quote ends without a blank line; unexpected unindent.
/home/vvasuki/sanskrit_data/sanskrit_data/schema/common.py:docstring of sanskrit_data.schema.common.json_class_index:7: WARNING: Unexpected indentation.
/home/vvasuki/sanskrit_data/sanskrit_data/schema/common.py:docstring of sanskrit_data.schema.common.json_class_index:8: WARNING: Inline strong start-string without end-string.
Edit:
PS: Note that removing the docstring below makes the error disappear, so it seems to be the thing to fix.
.. autodata:: json_class_index
:annotation: Maps jsonClass values to Python object names. Useful for (de)serialization. Updated using update_json_class_index() calls at the end of each module file (such as this one) whose classes may be serialized.
The warning messages indicate that the reStructuredText syntax of your docstrings is not valid and needs to be corrected.
Additionally, your source code does not comply with PEP 8: indentation should be 4 spaces, but your code uses 2, which might cause problems with Sphinx.
First, make your code compliant with PEP 8 indentation.
Second, you must have a blank line separating whatever precedes an info field list and the info field list itself.
Third, if the warnings persist, look at the line numbers in the warnings (3, 4, 7, and 8) and at the warnings themselves. It appears that the warnings correspond to this block of code:
@classmethod
def make_from_dict(cls, input_dict):
  """Defines *our* canonical way of constructing a JSON object from a dict.
  All other deserialization methods should use this.
  Note that this assumes that json_class_index is populated properly!
  - ``from sanskrit_data.schema import *`` before using this should take care of it.
  :param input_dict:
  :return: A subclass of JsonObject
  """
Try this instead, post-PEP-8-ification, which should correct most of the warnings caused by faulty whitespace in your docstring:
@classmethod
def make_from_dict(cls, input_dict):
    """
    Defines *our* canonical way of constructing a JSON object from a dict.

    All other deserialization methods should use this.

    Note that this assumes that json_class_index is populated properly!

    - ``from sanskrit_data.schema import *`` before using this should take care of it.

    :param input_dict:
    :return: A subclass of JsonObject
    """
This style is acceptable according to PEP 257. The indentation is visually and vertically consistent: the triple quotes align with the left indentation. I think it's easier to read.
The fix was to add a docstring for the variable as follows:
#: Maps jsonClass values to Python object names. Useful for (de)serialization. Updated using update_json_class_index() calls at the end of each module file (such as this one) whose classes may be serialized.
json_class_index = {}
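With the #: comment in place, Sphinx autodoc should pick it up as the variable's documentation, so the directive from the question no longer needs the :annotation: text (a sketch):

.. autodata:: json_class_index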
I've been working on implementing the simple serial forking described in the TM module's documentation (the Q values are stored as priority weights in a MySQL table), where my proxy queries a database to determine which domain to forward to.
I've verified through extensive use of xlog that the variable I'm using to build the new URI for seturi is populated correctly. I use an append_branch call in a subsequent while loop iterating over my SQL query results, and it has no problem taking a very similarly formatted parameter. However, when I go to restart Kamailio, it simply gripes at me that a string is expected; the line it points to in the console is just the seturi call. I've tried casting to a string, but that doesn't seem to be part of 4.4 (or my syntax is wrong).
I've thought about building the URI strings and storing them in an AVP, but I suspect I'd have the same problem.
For reference, this is what I'm doing:
$var(basedest) = "sip:" + $var(number) + "@" + $(dbr(destination=>[0,0])) + ":" + $var(port);
seturi($var(basedest));
And what it's outputting when trying to load the config:
<core> [cfg.y:3368]: yyerror_at(): parse error in config file //etc/kamailio/kamailio.cfg, line 570, column 9-22: syntax error
<core> [cfg.y:3371]: yyerror_at(): parse error in config file //etc/kamailio/kamailio.cfg, line 570, column 23: bad argument, string expected
Naturally, when I put $var(basedest) in double quotes, it's interpreted literally as a string; single quotes behave similarly. Is there something I can do to work around this? When I feed it an explicit hardcoded string, it's happy as can be and the routing works fine, but when I try something very simple like the above, it gets upset. If possible, I'd like to avoid upgrading, as I initially grabbed Kamailio from the yum repo.
Thanks in advance - this has been bugging me a good while.
Apparently, this is not a new problem. I ended up finding a workaround.
For reference, seturi and the $ru pseudo-variable refer to the same thing. So basically you'd just do:
$var(mynewru) = "sip:user@domain:5060";
$ru = $var(mynewru);
This achieves what I was originally attempting based on the TM module's documentation. For serial forking, issuing some number of append_branch calls is fine.
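Put together, a minimal sketch of the pattern (assuming $var(number), $var(port) and the dbr result set are populated as in the question; the second-row lookup is purely illustrative):

# build the primary destination and assign it to the request URI
$var(basedest) = "sip:" + $var(number) + "@" + $(dbr(destination=>[0,0])) + ":" + $var(port);
$ru = $var(basedest);  # same effect as seturi() with a literal string
# further rows from the query become serial branches
$var(altdest) = "sip:" + $var(number) + "@" + $(dbr(destination=>[1,0])) + ":" + $var(port);
append_branch("$var(altdest)");
t_relay();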
I have added a spec as the first of three in a describe block. When it is the only spec in that block, I get the following output:
updateTimerDisplay
- calls removeCountdownTimer($widget) if no seconds are remaining... SlideshowHome
(no checkmark or cross, but a minus sign at the beginning of the line, and the final line break is missing, so the title of the next describe block follows on the same output line)
When I disable that spec (with xit(...)), I get this output:
updateTimerDisplay
* calls removeCountdownTimer($widget) if no seconds are remaining
But when I enable it together with the other specs, the output of the first spec is missing completely, although it is counted in the final spec count output from the jasmine task run.
Any ideas?
I had this same problem, and I fixed it by removing an underscore dependency in my vendor option. It may have something to do with grunt-contrib-jasmine using lodash.
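For illustration, a sketch of the relevant grunt-contrib-jasmine configuration (file paths are hypothetical; the point is removing underscore from the vendor list):

jasmine: {
  all: {
    src: 'src/**/*.js',
    options: {
      specs: 'spec/**/*Spec.js',
      vendor: [
        'vendor/jquery.js'
        // 'vendor/underscore.js' // removed: seems to clash with the lodash used by grunt-contrib-jasmine
      ]
    }
  }
}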
Is it possible to extract the FIRST and FOLLOW sets from a rule using ANTLR4? I played around with this a little bit in ANTLR3 and did not find a satisfactory solution, but if anyone has info for either version, it would be appreciated.
I would like to parse user input up to the user's cursor location and then provide a list of possible choices for auto-completion. At the moment, I am not interested in auto-completing tokens which are partially entered; I want to display all possible following tokens at some point mid-parse.
For example:
sentence:
    subjects verb (adverb)? '.' ;
subjects:
    firstSubject (otherSubjects)* ;
firstSubject:
    'The' (adjective)? noun ;
otherSubjects:
    'and the' (adjective)? noun ;
adjective:
    'small' | 'orange' ;
noun:
    CAT | DOG ;
verb:
    'slept' | 'ate' | 'walked' ;
adverb:
    'quietly' | 'noisily' ;

CAT : 'cat';
DOG : 'dog';
Given the grammar above...
If the user had not typed anything yet, the auto-complete list would be ['The'] (note that I would have to retrieve the FIRST and not the FOLLOW of rule sentence, since the follow of the base rule is always EOF).
If the input was "The", the auto-complete list would be ['small', 'orange', 'cat', 'dog'].
If the input was "The cat slept", the auto-complete list would be ['quietly', 'noisily', '.'].
So ANTLR3 provides a way to get the follow set like this:
BitSet followSet = state.following[state._fsp];
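For illustration, that one-liner can be expanded into a dump of the candidate token names (a sketch for the ANTLR 3 Java runtime, meant to run inside a generated parser, where state is the inherited RecognizerSharedState):

// Collect the token types in the dynamic FOLLOW set at the current
// rule-invocation depth and print their names.
BitSet followSet = state.following[state._fsp];
for (int tokenType : followSet.toArray()) {
    System.out.println(getTokenNames()[tokenType]);
}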
This works well. I can embed some logic into my parser so that when the parser calls the rule at which the user is positioned, it retrieves the follows of that rule and provides them to the user. However, this does not work as well for nested rules (for instance, the base rule), because the follow set ignores any sub-rule follows, as it should.
I think I need to provide the FIRST set if the user has completed a rule (this could be hard to determine) as well as the FOLLOW set to cover all valid options. I also think I will need to structure my grammar such that two tokens are never subsequent at the rule level.
I would have to break the above firstSubject rule into some sub-rules...
from
firstSubject:
    'The' (adjective)? CAT | DOG;
to
firstSubject:
    the (adjective)? CAT | DOG;
the:
    'the';
I have yet to find any information on retrieving the FIRST set from a rule.
ANTLR4 appears to have drastically changed the way it works with follows at the level of the generated parser, so at this point I'm not really sure whether I should continue with ANTLR3 or make the jump to ANTLR4.
Any suggestions would be greatly appreciated.
ANTLRWorks 2 (AW2) performs a similar operation, which I'll describe here. If you reference the source code for AW2, keep in mind that it is only released under an LGPL license.
1. Create a special token which represents the location of interest for code completion.
   - In some ways, this token behaves like the EOF. In particular, the ParserATNSimulator never consumes this token; a decision is always made at or before it is reached.
   - In other ways, this token is very unique. In particular, if the token is located at an identifier or keyword, it is treated as though the token type was "fuzzy", and allowed to match any identifier or keyword for the language. For ANTLR 4 grammars, if the caret token is located at a location where the user has typed g, the parser will allow that token to match a rule name or the keyword grammar.
2. Create a specialized ATN interpreter that can return all possible parse trees which lead to the caret token, without looking past the caret for any decision, and without constraining the exact token type of the caret token.
3. For each possible parse tree, evaluate your code completion in the context of whatever the caret token matched in a parser rule.
4. The union of all the results found in step 3 is a superset of the complete set of valid code completion results, and can be presented in the IDE.
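To make step 1 concrete, here is a hypothetical Java sketch (not AW2's actual code; the sentinel value and constructor shape are assumptions):

import org.antlr.v4.runtime.CommonToken;

// Hypothetical caret token for the ANTLR 4 Java runtime. The specialized
// interpreter checks for CARET_TOKEN_TYPE instead of matching a concrete type.
public class CaretToken extends CommonToken {
    public static final int CARET_TOKEN_TYPE = -10; // assumed sentinel value

    public CaretToken(int start, int stop) {
        super(CARET_TOKEN_TYPE);
        setStartIndex(start); // caret position in the input stream
        setStopIndex(stop);
    }
}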
The following describes AW2's implementation of the above steps.
1. In AW2, this is the CaretToken, and it always has the token type CARET_TOKEN_TYPE.
2. In AW2, this specialized operation is represented by the ForestParser<TParser> interface, with most of the reusable implementation in AbstractForestParser<TParser> and specialized for parsing ANTLR 4 grammars for code completion in GrammarForestParser.
3. In AW2, this analysis is performed primarily by GrammarCompletionQuery.TaskImpl.runImpl(BaseDocument).