goyacc: getting context to the yacc parser / no `%param` - go

What is the most idiomatic way to get some form of context to the yacc parser in goyacc, i.e. emulate the %param command in traditional yacc?
I need to pass some context to my .Parse function (in this case including, for instance, where to build its parse tree).
The goyacc .Parse function is declared as:
func ($$rcvr *$$ParserImpl) Parse($$lex $$Lexer) int {
Things I've thought of:
$$ParserImpl cannot be changed by the .y file, so the obvious solution (to add fields to it) is right out, which is a pity.
As $$Lexer is an interface, I could stuff the parser context into the Lexer implementation and then type-assert $$lex back to that implementation (assuming my parser always uses the same lexer), but this seems pretty disgusting (for which read non-idiomatic). Moreover, there is (seemingly) no way to put a user-supplied line such as c := yylex.(*lexer).c at the top of the Parse function, so in the many tens of places where I want to refer to this variable, I have to use the rather ugly form yylex.(*lexer).c rather than just c.
Normally I'd use %param in normal yacc / C (well, bison anyway), but that doesn't exist in goyacc.
I'd like to avoid postprocessing my generated .go file with sed or perl for what are hopefully obvious reasons.
I want to be able to (go)yacc parse more than one file at once, so a global variable is not possible (and global variables are hardly idiomatic).
What's the most idiomatic solution here? I keep thinking I must be missing something simple.

My own solution is to modify goyacc (see this PR), adding a %param directive that allows one or more fields to be added to the $$ParserImpl structure (accessible as $$rcvr in code). This seems the most idiomatic route. It not only permits passing context in, but also lets the user add further funcs that use $$ParserImpl as a receiver.
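For anyone who cannot use a patched goyacc, the lexer-based workaround from the question can be sketched roughly as below. This is only an illustration under stated assumptions: yySymType, yyLexer and fakeParse are hand-written stand-ins for what goyacc generates with the default yy prefix, and ParseContext, ctxLexer, ctxOf and ParseTokens are hypothetical names that exist in neither goyacc nor the PR. The small helper ctxOf keeps the type assertion in one place, so each grammar action can write ctxOf(yylex) instead of repeating yylex.(*lexer).c.

```go
package main

import "fmt"

// Hand-written stand-ins for the types goyacc generates with the
// default "yy" prefix. In a real project they come from the generated file.
type yySymType struct{ text string }

type yyLexer interface {
	Lex(lval *yySymType) int
	Error(s string)
}

// ParseContext is a hypothetical per-parse state (e.g. where the tree is built).
type ParseContext struct {
	Nodes  []string
	Errors []string
}

// ctxLexer is a hypothetical lexer that also carries the parse context.
type ctxLexer struct {
	tokens []string
	pos    int
	ctx    *ParseContext
}

// Lex returns 0 at end of input, as a yacc-style parser expects.
func (l *ctxLexer) Lex(lval *yySymType) int {
	if l.pos >= len(l.tokens) {
		return 0
	}
	lval.text = l.tokens[l.pos]
	l.pos++
	return 1
}

// Error records parse errors in the context instead of a global.
func (l *ctxLexer) Error(s string) { l.ctx.Errors = append(l.ctx.Errors, s) }

// ctxOf hides the type assertion. It panics if a different lexer
// implementation is passed in, which is the weakness noted in the question.
func ctxOf(yylex yyLexer) *ParseContext { return yylex.(*ctxLexer).ctx }

// fakeParse stands in for the generated yyParse; the loop body plays
// the role of a grammar action written in the .y file.
func fakeParse(yylex yyLexer) int {
	var lval yySymType
	for yylex.Lex(&lval) != 0 {
		c := ctxOf(yylex)
		c.Nodes = append(c.Nodes, lval.text)
	}
	return 0
}

// ParseTokens creates a fresh context per call, so several parses can
// run side by side without any global variable.
func ParseTokens(tokens []string) (*ParseContext, int) {
	ctx := &ParseContext{}
	rc := fakeParse(&ctxLexer{tokens: tokens, ctx: ctx})
	return ctx, rc
}

func main() {
	a, _ := ParseTokens([]string{"x", "y"})
	b, _ := ParseTokens([]string{"z"})
	fmt.Println(a.Nodes, b.Nodes)
}
```

Each call gets its own context, which addresses the "more than one file at once" requirement, but the parser remains tied to one concrete lexer type, which is what the %param approach avoids.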

Related

Why do Julia programmers need to prefix macros with the at-sign?

Whenever I see a Julia macro in use like @assert or @time I'm always wondering about the need to distinguish a macro syntactically with the @ prefix. What should I be thinking of when using @ for a macro? For me it adds noise and distraction to an otherwise very nice language (syntactically speaking).
I mean, for me '@' has a meaning of reference, i.e. a location like a domain or address. In the location sense @ does not have a meaning for macros other than that it is a different compilation step.
The @ should be seen as a warning sign which indicates that the normal rules of the language might not apply. E.g., a function call
f(x)
will never modify the value of the variable x in the calling context, but a macro invocation
@mymacro x
(or @mymacro f(x) for that matter) very well might.
Another reason is that macros in Julia are not based on textual substitution as in C, but substitution in the abstract syntax tree (which is much more powerful and avoids the unexpected consequences that textual substitution macros are notorious for).
Macros have special syntax in Julia, and since they are expanded after parse time, the parser also needs an unambiguous way to recognise them
(without knowing which macros have been defined in the current scope).
ASCII characters are a precious resource in the design of most programming languages, Julia very much included. I would guess that the choice of @ mostly comes down to the fact that it was not needed for something more important, and that it stands out pretty well.
Symbols always need to be interpreted within the context they are used. Having multiple meanings for symbols, across contexts, is not new and will probably never go away. For example, no one should expect #include in a C program to go viral on Twitter.
Julia's Documentation entry Hold up: why macros? explains pretty well some of the things you might keep in mind while writing and/or using macros.
Here are a few snippets:
Macros are necessary because they execute when code is parsed,
therefore, macros allow the programmer to generate and include
fragments of customized code before the full program is run.
...
It is important to emphasize that macros receive their arguments as
expressions, literals, or symbols.
So, if a macro is called with an expression, it gets the whole expression, not just the result.
...
In place of the written syntax, the macro call is expanded at parse
time to its returned result.
It actually fits quite nicely with the semantics of the @ symbol on its own.
If we look up the Wikipedia entry for 'At symbol' we find that it is often used as a replacement for the preposition 'at' (yes it even reads 'at'). And the preposition 'at' is used to express a spatial or temporal relation.
Because of that we can use the @-symbol as an abbreviation for the preposition at to refer to a spatial relation, i.e. a location like @tony's bar, @france, etc., to some memory location @0x50FA2C (e.g. for pointers/addresses), to the receiver of a message (@user0851, as Twitter and other forums use it, etc.), but also for a temporal relation, i.e. @05:00 am, @midnight, @compile_time or @parse_time.
Macros are processed at parse time (there you have it), and this is totally distinct from the other code, which is evaluated at run time (yes, there are many different phases in between, but that's not the point here).
So we use @ to explicitly direct the programmer's attention to the fact that the following code fragment is processed at parse time, as opposed to run time.
For me this explanation fits nicely in the language.
thanks@all ;)

What is the best way to manage a large quantity of constants

I am currently working on a very complex program that processes rows from an input table and has a huge number of possible outcomes for each record. Because of this I have a very large number of constants defined for the outcome messages. There is one success message for the record, but a multitude of possible warnings and errors.
My first thought was to define all of my constants for these messages at the package body level, but then I decided to move each constant to the procedure where it is used. I'm now second guessing that decision and thinking of moving everything back to package body level. What is the best way to define this many constants? Ease of maintainability is my ultimate goal for this program since it is so complex.
I think this is a matter of taste. In my application I put all error codes into an Error-Package. All main and commonly used constants I put into a separate package (without a package body).
Again, a matter of taste, but I tend to put a list of named constants at the package spec level rather than the package body so that they can be referenced by any portion of the application. If I ever want to change the error code that c_err_for_specific_reason_x uses, it becomes a single place to do so.
If I wanted to hide the codes and put them within the body I would have a get_error_code(p_get_error_name varchar) function that did the translation based on you passing a valid constant name.
I've done both on different projects, but tend towards the list over the function most times. I tend to use the function if it is a table-driven source of the data.
It ... wait for it ... depends!
Since you currently define your constants in the package body, you don't need them to be publicly accessible outside the package. So defining them in a spec really doesn't buy you anything.
Here is the rule I follow: Define constants within the smallest scope needed. So if a constant is used only within one procedure, define it in that procedure. If it is used within more than one procedure, define it in the body. If it is used elsewhere by code in other packages (or non-packaged SPs) but only when using a particular package, define it in the spec of that package. If it is used by other code for general use, put it in a separate spec of such general constants.

what is the use of erlang compile options: "-compile({parse_transform, ms_transform})".?

As the title says, could anybody explain the use of parse_transform with ms_transform?
What is the difference between compiling with it and without it?
The -compile({parse_transform, ms_transform}). syntax invokes a parse transform.
A parse transform is a module which the compiler calls after the file or input has been parsed. The module is called with the full abstract syntax of the whole module and must return a new abstract for a whole module. The parse transform is allowed to do whatever it wants as long as the result is legal erlang syntax. It is like a super macro facility which works on the whole module not just single function calls. The resulting module is then compiled. You can have many parse transforms.
Parse transforms are typically used to do compile-time evaluation and code transformations. The ets:fun2ms call mentioned by @P_A is a typical example of this as it takes a fun and at compile-time transforms this into a match specification, see Matchspecs and ets:fun2ms. But parse transforms allow you to do much more, for example add and remove functions. An example of this is a parse transform which generates access functions for all the fields in a record.
It is a very powerful tool, but unfortunately easy to get wrong and so create a real mess. There are, however, some 3rd party support tools which can be very helpful.
The ms_transform module implements the parse transform that translates fun syntax into match specifications. For example, ets:fun2ms uses it.
Also you can use
-include_lib("stdlib/include/ms_transform.hrl").

Why do format conversion libraries lack a single method to write the output to a file?

From my experience with Ruby, libraries that parse/convert a format (such as YAML, JSON, XML, SASS, etc.) into objects often have a single method that covers everything from reading the file to parsing, which is usually named something like load, load_file, etc. (In addition, they usually have a method that only parses a string that was read in advance, which is usually named something like decode, parse, etc.)
On the other hand, when it comes to converting the objects into the target file format, such libraries rarely have a single method that covers from conversion to writing to the destination file. Usually, they only have a single method that does only conversion, which is usually named like encode, render, etc., and the result string has to be written to the file using another method such as File.write.
What is the reason for this asymmetry? Why does writing to a file require an extra step?
I'd guess that it's because of error handling. Reading a file can go wrong in plenty of ways, but writing a file is even more error prone. It seems silly for a library whose main purpose is parsing to have to deal with file writing. I don't know why these libraries even include file read & parse methods.
Also, for a library to include these kinds of methods is useless as soon as you need to access any of the options of the file writing and reading methods. So then the library includes an options parameter that gets passed to the file method, and now the code is just an unclear mess.
That's my 2¢.

Why aren't the arguments to File.new symbols instead of strings?

I was wondering why the people who wrote the File library decided to make the arguments that determine what mode the file is opened in strings instead of symbols.
For example, this is how it is now:
f = File.new('file', 'rw')
But wouldn't it be a better design to do
f = File.new('file', :rw)
or even
f = File.new(:file, :rw)
for example? This seems to be the perfect place to use them since the argument definitely doesn't need to be mutable.
I am interested in knowing why it came out this way.
Update: I just got done reading a related question about symbols vs. strings, and I think the consensus was that symbols are just not as well known as strings, and everyone is used to using strings to index hash tables anyway. However, I don't think it would be valid for the designers of Ruby's standard library to plead ignorance on the subject of symbols, so I don't think that's the reason.
I'm no expert in the history of ruby, but you really have three options when you want parameters to a method: strings, symbols, and static classes.
For example, exception handling. Each exception is actually a type of class Exception.
ArgumentError.is_a? Class
=> true
So you could have each permission for the stream be its own class. But that would require even more classes to be generated for the system.
The thing about symbols is that, in Ruby versions before 2.2, they are never deleted. Every symbol you generate is preserved indefinitely; it's why using the method '.to_sym' lightly is discouraged. It leads to memory leaks.
Strings are just easier to manipulate. If you got the input mode from the user, you would need a '.to_sym' somewhere in your code, or at the very least, a large switch statement. With a string, you can just pass the user input directly to the method (if you were so trusting, of course).
Also, in C, you pass a character to the file i/o method. There are no Chars in ruby, just strings. Seeing as how ruby is built on C, that could be where it comes from.
It is simply a relic from previous languages.
