I have some code which depends on some include files which are partly defined at the start of source files (which is usual) and others which are used within functions.
I typical example for that are the OpenFOAM solver sources.
Because the scheme of this code is highly procedural, but I want to put all this into a class which provides init(), run() and maybe release(), I plan to put some of the variables into the classes as private making them members.
I don't want to modify the included files because they belong to a library.
The reason for using a class is that other routines classes run together with this code.
Here is the thing. init() must prepare some variable and there situations that theses variables (being type of other clases) not explicit constructors and special arguments. It is called once. run() is called several times. The procedural code has a loop only and the contents of that loop are put into the run() method.
So the best solution was to put these variables into std::unique_ptr and init can construct whatever it needs to. Obviously with that trick the variable signature changed, so I created a second declaration of a reference like this:
std::unique_ptr<volScalarField> mp_p;
volScalarField &p = *mp_p;
Now this is a bit tedious so I created a macro
FOAMPTR(volVectorField, p)
which does all the work for me:
#define FOAMPTR(TYPE,NAME) std::unique_ptr<TYPE> mp_##NAME; TYPE &NAME=*mp_##NAME
It works pretty well, but I'm not fan of macros in general, especially if you need to debug code.
Now my question is: Is there a better way to tackle this and use something else like a template definition which might do all the magic?
Edit: With 'works pretty well' I mean, that the compiler can translate that. The reference though still is invalid.
Edit: Okay, I solved the invalid pointer problem using two Macros:
#define FOAMPTR(TYPE,NAME) std::unique_ptr<TYPE> mp_##NAME
#define FETCHFOAMREF(NAME) auto &NAME=*mp_##NAME
Now I put FOAMPTR(TYPE,NAME) to the member and I get my unique ptrs. In the run() method the second macro FETCHFOAMREF(NAME) is used. Of course init() must be sure to correctly initialize the object or else the program is going to crash.
I still leave the question open because I'm not satisfied with that solution.
Related
• The function will be inline; that is, the compiler will try to generate code for the function at each point of call rather than using function-call instructions to use common code. This can be a significant performance advantage for functions, such as month(), that hardly do anything but
are used a lot.
• All uses of the class will have to be recompiled whenever we make a change to the body of an inlined function. If the function body is out of the class declaration, recompilation of users is needed only when the class declaration is itself changed. Not recompiling when the body is
changed can be a huge advantage in large programs.
• The class definition gets larger. Consequently, it can be harder to find the members among the member function definitions.
All uses of the class will have to be recompiled whenever we make a change to the body of an inlined function. If the function body is out of the class declaration, recompilation of users is needed only when the class declaration is itself changed. Not recompiling when the body is
changed can be a huge advantage in large programs.
I don't know what the book is trying to say exactly in this point. What do we mean by "have to be recompiled" and "recompilation is needed only when the class declaration is itself changed"
I suppose, from the context, that the quoted part discusses the pros & cons of putting member definitions inside the class declaration.
Suppose you have class X. You have to declare it somewhere. In a typical scenario, it will be placed in a header file whose only role will be to hold this declaration. Let's call it x.h.
A class usually has member functions. Now you can choose to either put them inside the header file inside the class declaration or in a separate file (typically: x.cpp).
Solution 1:
// file x.h contains everything
class X
{
public:
X() { std::cout << "X() has been hit\n"; }
};
Solution 2:
// file x.h contains only the declaration(s)
class X
{
public:
X();
};
// file x.cpp contains the class member definitions
#include "x.h"
X::X() { std::cout << "X() has been hit\n"; }
Whichever solution you use, you surely have some code that uses your class, and typically it is located in a different source file(s), e.g.:
// main.cpp
#include "x.h"
int main()
{
X x;
}
The first thing to notice: the user (here: main.cpp) looks the same whether you choose Solution 1 or 2. This is great. Now, here comes the message Bjarne wants to tell you: consider how changes to the class code will impact the users.
In Solution 1 you've packed everything into the header file. Any change to the class, even so apparently harmless as adding a new member function or just changing class formatting (you know, tabs, spaces, etc.) or adding a comment will force the compiler to recompile main.cpp. Why? Professional C++ programs are composed of many, many source files and their compilation is controlled and executed by special utility programs, like cmake, make, and many others. They simply look at the timestamps of the files that make up the program. Any change is a signal to recompile. Header files are never compiled, but all source files (= *.cpp) that include them (even indirectly, via other header files) have to be recompiled. This explains this:
All uses of the class will have to be recompiled whenever we make a change to the body of an inlined function.
(just to be sure: all class member functions declared inside the class declaration are considered inline by default). Here, main.cpp is an example of a "uses" mentioned above.
In Solution 2, file main.cpp will be recompiled only if x.h has been changed (in any way). If a programmer touches only x.cpp, then main.cpp will not be recompiled, because (a) C++ is designed in such a way to allow it and (b) professional C++ programs use other programs (I've mentioned above) that facilitate the efficient compilation of even large C++ programs. To be explicit: they are not compiled using commands like g++ *.cpp that can be found in some introductory C++ textbooks.
One final remark. The inline keyword was introduced essentially to allow Solution 1. Solution 2 is the original C language way. Solution 1 is sometimes used in C++ for better performance (but modern compilers can in many situations do the same job without it) and very often for templates (which are absent in C). Solution 1 is the most common way of programming templates, Solution 2 is typical for "ordinary" member functions. What Bjarne writes about is extremely important for library designers, I hope now you understand why.
I'm moderately new to common lisp, but have extended experience with other 'separate compilation' languages (think C/C++/FORTRAN and such)
I know how to do an ASDF system definition. I know how to separate stuff in packages. I'm using SBCL, by the way.
The question is this: what's the best practice for splitting code (large packages) between .lisp files? I mean, in C there are include files, while lisp lives with the current image state. So with multiple files I need to handle dependencies or serial order in the system definition. But without something like forward declarations it's painful.
Simple example on what I want to do: I have, for example, two defstructs that are part of the same bigger data structure (like struct1 is a parent of some set of struct2). Some functions works on one, some other works on the other and some other use both.
So I would have: a packages.lisp, a fun1.lisp (with the first defstruct and related functions), a fun2.lisp (with the other defstruct and functions) and a funmix.lisp (with functions that use both). In an ideal world everything is sealed and compiling these in this order would be fine. As most of you know, this in practice almost never happen.
If I need to use struct2 functions from the struct1 ones I would need to either reorder or add a dependency. But then if there's some kind of back call (that can't be done with a closure) I would have struct1.lisp depending on struct2.lisp and vice-versa which is obviously not valid. So what? I could break the loop putting the defstruct in a separate file (say, structs.lisp) but what if either of the struct's function need to access the common functions in the third file? I would like to avoid style notes.
What's the common way to solve this, i.e. keeping loosely related code in the same file but still be able to interface to other ones. Is the correct solution to seal everything in a compilation unit (a single file)? use a package for every file with exports?
Lisp dependencies are simple, because in many cases, a Lisp implementation doesn't need to process the definition of something in order to compile its use.
Some exceptions to the rule are:
Macros: macros must be loaded in order to be expanded. There is a compile-time dependency between a file which uses macro and the file which defines them.
Packages: a package foo must be defined in order to use symbols like foo:bar or foo::priv. If foo is defined by a defpackage form in some foo.lisp file, then that file has to be loaded (either in source or compiled form).
Constants: constants defined with defconstant should be seen before their use. Similar remarks apply to inline functions, compiler macros.
Any custom things in a "domain specific language" which enforces definition before use. E.g. if Whizbang Inference Engine needs rules to be defined when uses of the rules are compiled, you have to arrange for that.
For certain diagnostics to be suppressed like calls to undefined functions, the defining and using files must be taken to be as a single compilation unit. (See below.)
All the above remarks also have implications for incremental recompilation.
When there is dependency like the above between files so that one is a prerequisite of the other, when the prerequisite is touched, the dependent one must be recompiled.
How to split code into files is going to be influenced by all the usual things: cohesion, coupling and what have you. Common-Lisp-specific reasons to keep certain things together in one file is inlining. The call to a function which is in the same file as the caller may be inlined. If your program supports any in-service upgrade, the granularity of code loading is individual files. If some functions foo and bar should be independently redefinable, don't put them in the same file.
Now about compilation units. Suppose you have a file foo.lisp which defines a function called foo and bar.lisp which calls (foo). If you just compile bar.lisp, you will likely get a warning that an undefined function foo has been called. You could compile foo.lisp first and then load it, and then compile bar.lisp. But that will not work if there is a circular reference between the two: say foo.lisp also calls (bar) which bar.lisp defines.
In Common Lisp, you can defer such warnings to the end of a compilation unit, and what defines a compilation unit isn't a single file, but a dynamic scope established by a macro called with-compilation-unit. Simply put, if we do this:
(with-compilation-unit
(compile-file "foo.lisp") ;; contains (defun foo () (bar))
(compile-file "bar.lisp")) ;; contains (defun bar () (foo))
If a compile-file isn't surrounded by with-compilation-unit then there is a compilation unit spanning that file. Otherwise, the outermost nesting of the with-compilation-unit macro determines the scope of what is in the compilation unit.
Warnings about undefined functions (and such) are deferred to the end of the compilation unit. So by putting foo.lisp and bar.lisp compilation into one unit, we suppress the warnings about either foo or bar not being defined and we can compile the two in any order.
Build systems use with-compilation-unit under the hood, as appropriate.
The compilation unit isn't about dependencies but diagnostics. Above, we don't have a compile time dependency. If we touch foo.lisp, bar.lisp doesn't have to be recompiled or vice versa.
By and large, Lisp codebases don't have a lot of hard dependencies among the files. Incremental compilation often means that just the affected files that were changed have to be recompiled. The C or C++ problem that everything has to be rebuilt because a core header file was touched is essentially nonexistent.
but what if
No matter how you first organize your code, if you change it significantly you are going to have to refactor. IMO there is no ideal way of grouping dependencies in advance.
As a rule of thumb it is generally safe to define generic functions first, then types, then actual methods, for example. For non-generic functions, you can cut circular dependencies by adding forward declarations:
(declaim (ftype function ...))
Having too much circular dependency is a bit of a code smell.
Is the correct solution to seal everything in a compilation unit
Yes, if you group the definitions in the same compilation unit (the same file), the file compiler will be able to silence the style notes until it reaches the end of file: at this point it knows if there are still missing references or if all the cross-references are resolved.
But then if there's some kind of back call (that can't be done with a closure)
If you have a specific example in mind please share, but typically you can define struct1 and its functions in a way that can be self-contained; maybe it can accept a map that binds event names to callbacks:
(make-struct-1 :callbacks (list :on-empty one-is-empty
:on-full one-is-full))
Similarly, struct2 can accept callbacks too (Dependency Injection) and the main struct ties them using closures (?).
Alternatively, you can design your data-structures so that they signal conditions, and the in the caller code you intercept them to bind things together.
What is the most idiomatic way to get some form of context to the yacc parser in goyacc, i.e. emulate the %param command in traditional yacc?
I need to parse to my .Parse function some context (in this case including for instance where to build its parse tree).
The goyacc .Parse function is declared
func ($$rcvr *$$ParserImpl) Parse($$lex $$Lexer) int {
Things I've thought of:
$$ParserImpl cannot be changed by the .y file, so the obvious solution (to add fields to it) is right out, which is a pity.
As $$Lexer is an interface, I could stuff the parser context into the Lexer implementation, then force type convert $$lex to that implementation (assuming my parser always used the same lexer), but this seems pretty disgusting (for which read non-idiomatic). Moreover there is (seemingly) no way to put a user-generated line at the top of the Parse function like c := yylex.(*lexer).c, so in the many tens of places I want to refer to this variable, I have to use the rather ugly form yylex.(*lexer).c rather than just c.
Normally I'd use %param in normal yacc / C (well, bison anyway), but that doesn't exist in goyacc.
I'd like to avoid postprocessing my generated .go file with sed or perl for what are hopefully obvious reasons.
I want to be able to (go)yacc parse more than one file at once, so a global variable is not possible (and global variables are hardly idiomatic).
What's the most idiomatic solution here? I keep thinking I must be missing something simple.
My own solution is to modify goyacc (see this PR) which adds a %param directive allowing one or more fields to be added to the $$ParserImpl structure (accessible as $$rcvr in code). This seems the most idiomatic route. This permits not only passing context in, but the ability for the user to add additional func()s using $$ParserImpl as a receiver.
I've inherited a C++98 codebase which has two major uses of memset() on C++ classes, with macros expanded for clarity:
// pattern #1:
Obj o;
memset(&o, 0, sizeof(o));
// pattern #2:
// (elsewhere: Obj *o;)
memset(something->o, 0, sizeof(*something->o));
As you may have guessed, this codebase does not use STL or otherwise non-POD classes. When I try to put as little as an std::string into one of its classes, bad things generally happen.
It was my understanding that these patterns could be rewrited as follows in C++11:
// pattern #1
Obj o = {};
// pattern #2
something->o = {};
Which is to say, assignment of {} would rewrite the contents of the object with the default-initialized values in both cases. Nice and clean, isn't it?
Well, yes, but it doesn't work. It works on *nix systems, but results in fairly inexplicable results (in essence, garbage values) when built with VS2013 with v120_xp toolset, which implies that my understanding of initializer lists is somehow lacking.
So, the questions:
Why didn't this work?
What's a better way to replace this use of memset that ensures that members with constructors are properly default-initialized, and which can preferably be reliably applied with as little as search-and-replace (there are unfortunately no tests). Bonus points if it works on pre-VS2013.
The behavior of brace-initialization depends on what kind of object you try to initialize.
On aggregates (e.g. simple C-style structures) using an empty brace-initializer zero-initializes the aggregate, i.e. it makes all members zero.
On non-aggregates an empty brace-initializer calls the default constructor. And if the constructor doesn't explicitly initialize the members (which the compilers auto-generated constructor doesn't) then the members will be constructed but otherwise uninitialized. Members with their own constructors that initialize themselves will be okay, but e.g. an int member will have an indeterminate value.
The best way to solve your problems, IMO, is to add a default constructor (if the classes doesn't have it already) with an initializer list that explicitly initializes the members.
It works on *nix systems, but results in fairly inexplicable results (in essence, garbage values) when built with VS2013 with v120_xp toolset, which implies that my understanding of initializer lists is somehow lacking.
The rules for 'default' initialization have changed from version to version of C++, but VC++ has stuck with the C++98 rules, ignoring even the updates from C++03 I think.
Other compilers have implemented new rules, with gcc at one point even implementing some defect resolutions that hadn't been accepted for future inclusion in the official spec.
So even though what you want is guaranteed by the standard, for the most part it's probably best not to try to rely on the behavior of initialization of members that don't have explicit initializers.
I think placement new is established enough that it works on VS, so you might try:
#include <new>
new(&o) T();
new(something->p) T();
Make sure not to do this on any object that hasn't been allocated and destructed/uninitialized first! (But it was pointed out below that this might fail if a constructor throws an exception.)
You might be able to just assign from a default object, that is, o = T(); or *(something->p) = T();. A good general strategy might be to give each of these POD classes a trivial default constructor with : o() in the initializer-list.
If I write a function or module that calls another module, how can I get the name of the calling function/module? This would be helpful for debugging purposes.
The Stack function will do almost exactly what you want, giving a list of the "tags" (for your purposes, read "functions") that are in the call stack. It's not bullet-proof, because of the existence of other functions like StackBegin and StackInhibit, but those are very exotic to begin with.
In most instances, Stack will return the symbols that name the functions being evaluated. To figure out what context those symbols are from, you can use the Context function, which is aboput as close as you can come to figuring out what package they're a part of. This requires some care, though, as symbols can be added to packages dynamically (via Get, Import, ToExpression or Symbol) and they can be redefined or modified (with new evaluation rules, for instance) in other packages as well.