I am converting C++ code to C++11. Since C++11 supports move construction I am replacing methods like
void foo(const Bar& obj);
with
void foo(Bar obj);
in places where I think it makes sense, for example in assignment operators. Unfortunately, the only way I know of that detects if the move constructor is actually used is adding debug messages and run the code.
What I would like to have is information about what the compiler is doing while compiling the source, to get an idea where it is using move construction and where not (and possibly why) so that I can get a better understanding where I need to change the program and where changes actually improved it (to avoid unnecessary copy constructions).
Is there a way to get this information? Maybe with CLang?
Related
I have some code which depends on some include files which are partly defined at the start of source files (which is usual) and others which are used within functions.
I typical example for that are the OpenFOAM solver sources.
Because the scheme of this code is highly procedural, but I want to put all this into a class which provides init(), run() and maybe release(), I plan to put some of the variables into the classes as private making them members.
I don't want to modify the included files because they belong to a library.
The reason for using a class is that other routines classes run together with this code.
Here is the thing. init() must prepare some variable and there situations that theses variables (being type of other clases) not explicit constructors and special arguments. It is called once. run() is called several times. The procedural code has a loop only and the contents of that loop are put into the run() method.
So the best solution was to put these variables into std::unique_ptr and init can construct whatever it needs to. Obviously with that trick the variable signature changed, so I created a second declaration of a reference like this:
std::unique_ptr<volScalarField> mp_p;
volScalarField &p = *mp_p;
Now this is a bit tedious so I created a macro
FOAMPTR(volVectorField, p)
which does all the work for me:
#define FOAMPTR(TYPE,NAME) std::unique_ptr<TYPE> mp_##NAME; TYPE &NAME=*mp_##NAME
It works pretty well, but I'm not fan of macros in general, especially if you need to debug code.
Now my question is: Is there a better way to tackle this and use something else like a template definition which might do all the magic?
Edit: With 'works pretty well' I mean, that the compiler can translate that. The reference though still is invalid.
Edit: Okay, I solved the invalid pointer problem using two Macros:
#define FOAMPTR(TYPE,NAME) std::unique_ptr<TYPE> mp_##NAME
#define FETCHFOAMREF(NAME) auto &NAME=*mp_##NAME
Now I put FOAMPTR(TYPE,NAME) to the member and I get my unique ptrs. In the run() method the second macro FETCHFOAMREF(NAME) is used. Of course init() must be sure to correctly initialize the object or else the program is going to crash.
I still leave the question open because I'm not satisfied with that solution.
Consider the following code:
int a;
int b;
Is there a way to force that a precedes b on the stack?
One way to do the ordering would be to put b in a function:
void foo() {
int b;
}
...
int a;
foo();
However, that would generally work only if b isn't inlined.
Maybe there's a different way to do that? Putting an inline assembler between the two declarations may do a trick, but I am not sure.
Your initial question was about forcing a function call to not be inlined.
To improve on Jordy Baylac's answer, you might try to declare the function within the block calling it, and perhaps use a statement expr:
#define FOO_WITHOUT_INLINING(c,i) ({ \
extern int foo (char, int) __attribute__((noinline)); \
int r = foo(c,i); \
r; })
(If the type of foo is unknown, you could use typeof)
However, I still think that your question is badly formulated (and is meaningless, if one avoid reading your comments which should really go inside the question, which should have mentioned your libmill). By definition of inlining, a compiler can inline any function as it wants without changing the semantics of the program.
For example, a user of your library might legitimately compile it with -flto -O2 (both at compiling and at linking stage). I don't know what would happen then.
I believe you might redesign your code, perhaps using -fsplit-stack; are you implementing some call/cc in C? Then look inside the numerous existing implementations of it, and inside Gabriel Kerneis CPC.... See also setcontext(3) & longjmp(3)
Perhaps you might need to use somewhere the return_twice (and/or nothrow) function attribute of GCC, or some _Pragma like GCC optimize
Then you edited your question to change it completely (asking about order of variables on the call stack), still without mentioning in the question your libmill and its go macro (as you should; comments are volatile so should not contain most of the question).
But the C compiler is not even supposed to have a call stack (an hypothetical C99 conforming compiler could do whole program optimization to avoid any call stack) in the compiled program. And GCC is certainly allowed to put some variables outside of the call stack (e.g. only in registers) and it is doing that. And some implementations (IA64 probably) have two call stacks.
So your changed question is completely meaniningless: a variable might not sit on the stack (e.g. only be in a register, or even disappear completely if the compiler can prove it is useless after some other optimizations), and the compiler is allowed to optimize and use the same call stack slot for two variables (and GCC is doing such an optimization quite often). So you cannot force any order on the call stack layout.
If you need to be sure that two local variables a & b have some well defined order on the call stack, make them into a struct e.g.
struct { int _a, _b; } _locals;
#define a _locals._a
#define b _locals._b
then, be sure to put the &_locals somewhere (e.g. in a volatile global or thread-local variable). Since some versions of GCC (IIRC 4.8 or 4.7) had some optimization passes to reorder the fields of non-escaping struct-s
BTW, you might customize GCC with your MELT extension to help about that (e.g. introduce your own builtin or pragma doing part of the work).
Apparently, you are inventing some new dialect of C (à la CPC); then you should say that!
below there is a way, using gcc attributes:
char foo (char, int) __attribute__ ((noinline));
and, as i said, you can try -fno-inline-functions option, but this is for all functions in the compilation process
It is still unclear for me why you want function not to be inline-d, but here is non-pro solution I am proposing:
You can make this function in separate object something.o file.
Since you will include header only, there will be no way for the compiler to inline the function.
However linker might decide to inline it later at linking time.
Background information (you do not need to repeat these steps to answer the question, this just gives some background):
I am trying to compile a rather large set of generated modules. These files are the output of a prototype Modelica to OCaml compiler and reflect the Modelica class structure of the Modelica Standard Library.
The main feature is the use of polymorphic, open recursion: Every method takes a this argument which contains the final superclass hierarchy. So for instance the model:
model A type T = Real type S = T end A;
is translated into
let m_A = object
method m_T this = m_Modelica_Real
method m_S this = this#m_T this
end
and has to be closed before usage:
let _ = m_A#m_T m_A
This seems to postpone a lot of typechecking until the superclass hierarchy is actually fixed, which in turn makes it impossible to compile the final linkage module (try ocamlbuild Linkage.cmo after editing the comments in the corresponding file to see what I mean).
Unfortunately, since the code base is rather large and uses a lot of objects, the type-structure might not be the root cause after all, it might as well be some optimization or a flaw in the code-generation (although I strongly suspect the typechecker). So my question is: Is there any way to profile the ocaml compiler in a way that signals when a certain phase (typechecking, intermediate code generation, optimization) is over and how long it took? Any further insights into my particular use case are also welcome.
As of right now, there isn't.
You can do it yourself though, the compiler source are open and you can get those and modify them to fit your needs.
Depending on whether you use ocamlc or ocamlopt, you'll need to modify either driver/compile.ml or driver/optcompile.ml to add timers to the compilation process.
Fortunately, this already has been done for you here. Just compile with the option -dtimings or environment variable OCAMLPARAM=timings=1,_.
Even more easily, you can download the opam Flambda switch:
opam switch install 4.03.0+pr132
ocamlopt -dtimings myfile.ml
Note: Flambda itself changes the compilation time (most what happens after typing) and its integration into the OCaml compiler is not confirmed yet.
OCaml compiler is an ordinary OCaml program in that regard. I would use poorman's profiler for a quick inspection, using e.g. pmp script.
I am using GCC 4.6 as part of the lpcxpresso ide for a Cortex embedded processor. I have very limited code size, especially when compiling in debug mode. Using attribute((always_inline)) has so far proven to be a good tool to inline trivial functions and this saves a lot of code bloat in debug mode while still maintaining readability. I expect it to be somewhat mainstream and supported in the future because it is mentioned here http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0348c/CIAJGAIH.html
Now to my question: Is this the correct Syntax for declaring a Lambda always inline?
#define ALWAYS_INLINE __attribute__((always_inline))
[](volatile int &i)ALWAYS_INLINE{i++;}
It does work, my question is will it continue to work in future and what can I do to ensure it works in the future. If I ever switch to another major compiler that supports c++11 will I find a similar keyword which I can replace the attribute((always_inline)) with?
If I were to meet my fairy godmother I would wish for a compiler directive which causes all lambdas which are constructed as temporaries with empty constructors and bound by reference to be automatically inlined even in debug mode. Any ideas?
Will it continue to work in future?
Likely but, always_inline is compiler specific and since there is no standard specifying its exact behavior with lambda, there is no guaranty that this will continue to work in the future.
What can I do to ensure it works?
This depends on the compiler not you. If a future version drops support for always_inline with lambda, you have to stick with a version that works or code your own preprocessor that inlines lambdas with an always_inline-like keyword.
If I ever switch to another major compiler that supports c++11 will I
find a similar keyword?
Likely but again, there is no guaranty. The only real standard is the C++ inline keyword and it is not applicable to lambdas. For non-lambda it only suggests inlining and tells the compiler that a function may be defined in different compile units.
Because of the layers of standards, the include files for c++ are a rats nest. I was trying to figure out what __isnan actually calls, and couldn't find anywhere with an actual definition.
So I just compiled with -S to see the assembly, and if I write:
#include <ieee754.h>
void f(double x) {
if (__isinf(x) ...
if (__isnan(x)) ...
}
Both of these routines are called. I would like to see the actual definition, and possibly refactor things like this to be inline, since it should be just a bit comparison, albeit one that is hard to achieve when the value is in a floating point register.
Anyway, whether or not it's a good idea, the question stands: WHERE is the source code for __isnan(x)?
Glibc has versions of the code in the sysdeps folder for each of the systems it supports. The one you’re looking for is in sysdeps/ieee754/dbl-64/s_isnan.c. I found this with git grep __isnan.
(While C++ headers include code for templates, functions from the C library will not, and you have to look inside glibc or whichever.)
Here, for the master head of glibc, for instance.