gcc - gdb - pretty print stl - debugging

I'm currently doing some research on the STL, especially for printing the STL content during debug. I know there are many different approaches.
Like:
http://sourceware.org/gdb/wiki/STLSupport
or using a shared library to print the content of a container
What I'm currently trying to understand is why g++ removes functions that are not used. For example, I have the following code and compile it with g++ -g main.cpp -o main.o.
#include <vector>
#include <iostream>
using namespace std;
int main() {
std::vector<int> vec;
vec.push_back(10);
vec.push_back(20);
vec.push_back(30);
return 0;
}
So when I debug this code I will see that I can't use print vec.front(). The message I receive is:
Cannot evaluate function -- may be inlined
So I tried the setting -fkeep-inline-functions, but nothing changed.
When I run nm main.o | grep front, I see that there is no entry for the method .front(). If I add an extra vec.front() call to my code, I can use print vec.front(), and nm main.o | grep front shows the entry
0000000000401834 W _ZNSt6vectorIiSaIiEE5frontEv
Can someone explain how I can keep all functions in my binary without losing them? I thought that dead functions do not get deleted as long as I don't enable optimization, or do something like the following:
How to tell compiler to NOT optimize certain code away?
Why I need it: current Python implementations use the internal STL implementation to print the content of a container, but it would be much more interesting to use the functions defined by ISO/IEC 14882. I know it's possible to write a shared library that is compiled into your actual code before you debug it, to ensure that all STL functions are available, but who wants to compile an extra library into their code before debugging? It would also be interesting to know the advantages and disadvantages of these two approaches (shared library and Python).

What exactly is a dead function? Isn't it a function that is available in my source code but isn't used?
There are two cases to consider:
int unused_function() { return 42; }
int main() { return 0; }
If you compile the above program, unused_function is dead: it is never called. However, it would still be present in the final executable (even with optimization [1]).
Now consider this:
template <typename T> int unused_function(T*) { return 42; }
int main() { return 0; }
In this case, unused_function will not be present, even when you turn off all optimizations.
Why? Because the template is not a "real" function. It's a prototype, from which the compiler can create "real" functions (called "template instantiation") -- one for each type T. Since you've never used unused_function, the compiler didn't create any "real" instances of it.
You can request that the compiler instantiate all member functions of a given class with an explicit instantiation request, like so:
#include <vector>
template class std::vector<int>;
int main() { return 0; }
Now, even though none of the vector functions are used, they are all instantiated into the final binary.
[1] If you are using the GNU ld (or gold), you could still get rid of unused_function in this case, by compiling with -ffunction-sections and linking with -Wl,--gc-sections.
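The same mechanism applies to user-defined templates. The sketch below uses a hypothetical Box class template (not from the question) next to std::vector<int>; after the explicit instantiation definitions, every member function should be emitted even if nothing calls it, which you could check with something like nm -C on the resulting binary.

```cpp
#include <vector>

// Hypothetical class template, used only for illustration.
template <typename T>
class Box {
public:
    explicit Box(T v) : value_(v) {}
    T get() const { return value_; }
    T doubled() const { return value_ + value_; }  // emitted even if never called
private:
    T value_;
};

// Explicit instantiation definitions: ask the compiler to generate
// code for all members of these specializations in this translation unit.
template class Box<int>;
template class std::vector<int>;
```

With these two lines in place, a debugger session on a -g build should be able to evaluate calls such as vec.front(), because the symbol exists in the binary.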

Thanks for your answer. Just to repeat: template functions don't get instantiated by gcc, because they are only prototypes. A function is available in my executable only when it is used or explicitly instantiated.
So what we have covered so far is:
function definition int unusedFunc() { return 10; }
function prototype int protypeFunc(); (just to break it down)
What happens when you inline functions? I always thought that the function body would be inserted at the call site, but now I read that compilers often decide on their own what to do, regardless of whether you use the keyword inline. (That sounds strange, because there must be a rule.) For example:
inline int inlineFunc() { return 10; }
A friend of mine also told me that he couldn't access the addresses of some functions, although he hadn't used inline. Are there any function types I forgot? He also told me that there should be differences in the object file format.
#edit - forgot:
nested functions
function pointers
overloaded functions

Related

Confusion in Bjarne's PPP 2nd edition Pg. 316

• The function will be inline; that is, the compiler will try to generate code for the function at each point of call rather than using function-call instructions to use common code. This can be a significant performance advantage for functions, such as month(), that hardly do anything but are used a lot.
• All uses of the class will have to be recompiled whenever we make a change to the body of an inlined function. If the function body is out of the class declaration, recompilation of users is needed only when the class declaration is itself changed. Not recompiling when the body is changed can be a huge advantage in large programs.
• The class definition gets larger. Consequently, it can be harder to find the members among the member function definitions.
All uses of the class will have to be recompiled whenever we make a change to the body of an inlined function. If the function body is out of the class declaration, recompilation of users is needed only when the class declaration is itself changed. Not recompiling when the body is changed can be a huge advantage in large programs.
I don't know exactly what the book is trying to say at this point. What do we mean by "have to be recompiled" and "recompilation is needed only when the class declaration is itself changed"?
I suppose, from the context, that the quoted part discusses the pros & cons of putting member definitions inside the class declaration.
Suppose you have class X. You have to declare it somewhere. In a typical scenario, it will be placed in a header file whose only role will be to hold this declaration. Let's call it x.h.
A class usually has member functions. Now you can choose to either put them inside the header file inside the class declaration or in a separate file (typically: x.cpp).
Solution 1:
// file x.h contains everything
#include <iostream>
class X
{
public:
X() { std::cout << "X() has been hit\n"; }
};
Solution 2:
// file x.h contains only the declaration(s)
class X
{
public:
X();
};
// file x.cpp contains the class member definitions
#include <iostream>
#include "x.h"
X::X() { std::cout << "X() has been hit\n"; }
Whichever solution you use, you surely have some code that uses your class, and typically it is located in a different source file(s), e.g.:
// main.cpp
#include "x.h"
int main()
{
X x;
}
The first thing to notice: the user (here: main.cpp) looks the same whether you choose Solution 1 or 2. This is great. Now, here comes the message Bjarne wants to tell you: consider how changes to the class code will impact the users.
In Solution 1 you've packed everything into the header file. Any change to the class, even one as apparently harmless as adding a new member function, changing the formatting (tabs, spaces, etc.), or adding a comment, will force main.cpp to be recompiled. Why? Professional C++ programs are composed of many, many source files, and their compilation is controlled and executed by special utility programs, like cmake, make, and many others. These tools typically look at the timestamps of the files that make up the program. Any change is a signal to recompile. Header files are never compiled on their own, but all source files (= *.cpp) that include them (even indirectly, via other header files) have to be recompiled. This explains the following quote:
All uses of the class will have to be recompiled whenever we make a change to the body of an inlined function.
(just to be sure: all class member functions declared inside the class declaration are considered inline by default). Here, main.cpp is an example of a "uses" mentioned above.
In Solution 2, file main.cpp will be recompiled only if x.h has been changed (in any way). If a programmer touches only x.cpp, then main.cpp will not be recompiled, because (a) C++ is designed in such a way to allow it and (b) professional C++ programs use other programs (I've mentioned above) that facilitate the efficient compilation of even large C++ programs. To be explicit: they are not compiled using commands like g++ *.cpp that can be found in some introductory C++ textbooks.
One final remark. The inline keyword was introduced essentially to allow Solution 1. Solution 2 is the original C language way. Solution 1 is sometimes used in C++ for better performance (but modern compilers can in many situations do the same job without it) and very often for templates (which are absent in C). Solution 1 is the most common way of programming templates, Solution 2 is typical for "ordinary" member functions. What Bjarne writes about is extremely important for library designers, I hope now you understand why.

Compile-time AVX detection when using multi-versioning

I have quite a big function compiled for two different architectures:
__attribute__ ((target ("arch=broadwell"))) void doStuff()
{
doStuffImpl();
}
__attribute__ ((target ("arch=nocona"))) void doStuff()
{
doStuffImpl();
}
__attribute__((always_inline)) void doStuffImpl()
{
(...)
}
I know this is the old way of doing multi-versioning, but I'm using gcc 4.9.3. Also, doStuffImpl() is actually not a single function but a bunch of functions with inlining, where doStuff() is the last actual function call, but I don't think that changes anything.
The function contains some code that is auto-vectorized by the compiler, but I also need to add some hand-crafted intrinsics there, obviously different in the two flavours.
The question is: how can I recognise at compile time which SIMD extensions are available?
I was trying something like:
#ifdef __AVX2__
AVX_intrinsics();
#elif defined __SSE4_2__
SSE_intrinsics();
#endif
But it seems that the defines come from the "global" -march flag, not from the multiversioning override.
Godbolt (intrinsics are garbage, but shows my point)
I could extract this part into a separate multiversioned function, but that would add the cost of dispatching and a function call.
Is there any way to do compile time differentiation of two multiversioning variants of function?
As answered in the comments:
I'd recommend moving each of the CPU targets to a separate translation unit, which is compiled with the corresponding compiler flags. The common doStuffImpl function can be implemented in a header, included in each of the TUs. In that header, you can use predefined macros like __AVX__ to test for available ISA extensions. The __attribute__((target)) attributes are no longer needed and can be removed in this case.
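A minimal sketch of what the shared header might look like under that layout. The function name simd_flavour and the file names in the comments are hypothetical; the macros __AVX2__ and __SSE4_2__ are the ones GCC and Clang predefine according to the flags of the translation unit being compiled.

```cpp
#include <string>

// Hypothetical shared header (e.g. do_stuff_impl.h), included by each
// per-target translation unit. Because every TU is compiled with its own
// -march flag, the predefined macros reflect that TU's target.
inline const char* simd_flavour() {
#if defined(__AVX2__)
    return "avx2";      // e.g. a TU built with -march=broadwell
#elif defined(__SSE4_2__)
    return "sse4.2";    // e.g. a TU built with -msse4.2
#else
    return "baseline";  // e.g. a TU built with -march=nocona
#endif
}
```

In a real build each TU would wrap the shared code in a distinctly named entry point (or namespace), so that the differently compiled copies of the inline functions do not collide at link time.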

gcc/clang: How to force ordering of items on the stack?

Consider the following code:
int a;
int b;
Is there a way to force that a precedes b on the stack?
One way to do the ordering would be to put b in a function:
void foo() {
int b;
}
...
int a;
foo();
However, that would generally work only if b isn't inlined.
Maybe there's a different way to do that? Putting inline assembly between the two declarations may do the trick, but I am not sure.
Your initial question was about forcing a function call to not be inlined.
To improve on Jordy Baylac's answer, you might try to declare the function within the block calling it, and perhaps use a statement expr:
#define FOO_WITHOUT_INLINING(c,i) ({ \
extern int foo (char, int) __attribute__((noinline)); \
int r = foo(c,i); \
r; })
(If the type of foo is unknown, you could use typeof)
However, I still think that your question is badly formulated (and is meaningless if one does not read your comments, which should really go inside the question, which should have mentioned your libmill). By definition of inlining, a compiler can inline any function as it wants without changing the semantics of the program.
For example, a user of your library might legitimately compile it with -flto -O2 (both at compiling and at linking stage). I don't know what would happen then.
I believe you might redesign your code, perhaps using -fsplit-stack; are you implementing some call/cc in C? Then look inside the numerous existing implementations of it, and inside Gabriel Kerneis CPC.... See also setcontext(3) & longjmp(3)
Perhaps you might need to use somewhere the returns_twice (and/or nothrow) function attribute of GCC, or some _Pragma like GCC optimize.
Then you edited your question to change it completely (asking about order of variables on the call stack), still without mentioning in the question your libmill and its go macro (as you should; comments are volatile so should not contain most of the question).
But a C implementation is not even required to have a call stack in the compiled program (a hypothetical C99 conforming compiler could do whole program optimization to avoid any call stack). And GCC is certainly allowed to put some variables outside of the call stack (e.g. only in registers), and it does so. And some implementations (IA64 probably) have two call stacks.
So your changed question is completely meaningless: a variable might not sit on the stack (e.g. it may live only in a register, or even disappear completely if the compiler can prove it is useless after some other optimizations), and the compiler is allowed to optimize and use the same call stack slot for two variables (GCC does this quite often). So you cannot force any order on the call stack layout.
If you need to be sure that two local variables a & b have some well defined order on the call stack, make them into a struct e.g.
struct { int _a, _b; } _locals;
#define a _locals._a
#define b _locals._b
Then be sure to put &_locals somewhere (e.g. in a volatile global or thread-local variable), since some versions of GCC (IIRC 4.8 or 4.7) had optimization passes that reorder the fields of non-escaping structs.
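The struct approach can be sketched in C++ as follows. The names Locals, g_locals_escape and a_precedes_b are hypothetical; the sketch relies only on the language rule that, within one struct, a later non-static data member has a higher address than an earlier one.

```cpp
#include <cstdint>

// Both "locals" live in one struct, so their relative order is fixed
// by the language rather than left to the compiler's stack layout.
struct Locals {
    int a;
    int b;
};

// Hypothetical escape hatch: storing the address in a volatile global
// makes the struct escape, so the optimizer cannot split or reorder it.
void* volatile g_locals_escape = nullptr;

bool a_precedes_b() {
    Locals locals{1, 2};
    g_locals_escape = &locals;
    auto addr_a = reinterpret_cast<std::uintptr_t>(&locals.a);
    auto addr_b = reinterpret_cast<std::uintptr_t>(&locals.b);
    g_locals_escape = nullptr;  // do not leave a dangling pointer behind
    return addr_a < addr_b;
}
```

This fixes the order of a and b relative to each other only; where the struct itself ends up relative to other locals remains the compiler's choice.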
BTW, you might customize GCC with your MELT extension to help about that (e.g. introduce your own builtin or pragma doing part of the work).
Apparently, you are inventing some new dialect of C (à la CPC); then you should say that!
Below is a way, using GCC attributes:
char foo (char, int) __attribute__ ((noinline));
And, as I said, you can try the -fno-inline-functions option, but it applies to all functions in the compilation.
It is still unclear to me why you want the function not to be inlined, but here is a non-pro solution I am proposing:
You can put this function in a separate object file, something.o.
Since you will include only the header, there will be no way for the compiler to inline the function.
However, the linker might decide to inline it later, at link time.

Storing pairs in a GCC rope with c++11

I'm using a GCC extension rope to store pairs of objects in my program and am running into some C++11 related trouble. The following compiles under C++98
#include <ext/rope>
typedef std::pair<int, int> std_pair;
int main()
{
__gnu_cxx::rope<std_pair> r;
}
but not with C++11 under G++ 4.8.2 or 4.8.3.
What happens is that the uninitialized_copy_n algorithm is pulled in from two places: ext/memory and the C++11 version of the memory header. The __gnu_cxx namespace is pulled in by rope and the std namespace is pulled in by pair, so there are now two identically defined functions in scope, leading to a compile error.
I assume this is a bug in a weird use case for a rarely used library, but what would be the correct fix? You can't remove the function from ext/memory without breaking existing code, and it is now required to be in std. I've worked around it using my own pair class, but how should this be fixed properly?
If changing the libstdc++ headers is an option (and I asked in the comments whether you were looking for a way to fix it in libstdc++, or work around it in your program), then the simple solution, to me, seems to be to make sure there is only one uninitialized_copy_n function. ext/memory already includes <memory>, which provides std::uninitialized_copy_n. So instead of defining __gnu_cxx::uninitialized_copy_n, it can have using std::uninitialized_copy_n; inside the __gnu_cxx namespace. It can even conditionalize this on C++11 support, so that pre-C++11 code gets the custom implementation of those functions, and C++11 code gets the std implementation of those functions.
This way, code that attempts to use __gnu_cxx::uninitialized_copy_n, whether directly or through ADL, will continue to work, but there is no ambiguity between std::uninitialized_copy_n and __gnu_cxx::uninitialized_copy_n, because they are the very same function.
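The effect of the using-declaration can be illustrated without touching libstdc++. In this sketch, the namespaces base and ext and the names pick, pair_like and rope_like are hypothetical stand-ins for std, __gnu_cxx, uninitialized_copy_n, pair and rope.

```cpp
namespace base {  // stands in for std
struct pair_like { int first; int second; };

template <typename A, typename B>
int pick(const A&, const B&) { return 1; }
}  // namespace base

namespace ext {  // stands in for __gnu_cxx
// Re-export the existing function instead of defining a second,
// identical template. Defining ext::pick separately would make the
// unqualified call below ambiguous.
using base::pick;

template <typename T>
struct rope_like { T value; };
}  // namespace ext

int resolve_via_adl() {
    ext::rope_like<base::pair_like> r{};
    base::pair_like p{};
    // Argument-dependent lookup searches both base and ext, but both
    // names denote the same function, so there is a single candidate.
    return pick(r, p);
}
```

Replacing the using-declaration with a second template of the same signature in ext should reproduce the kind of ambiguity error described in the question.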

OpenCV in Go without SWIG and third-parties lib

Main goal: make OpenCV work in Go without SWIG or third-party libraries (an application to compare images on Linux using Go).
I am new to all of these tools (OpenCV, Go, and Linux).
Can image detection (feature2d etc.) be done with the C API only? There seems to be no convenient way to call C++ code, and the C API is not updated(?)
I have followed How to use C++ in Go? but I failed.
When I run make, I get the following errors:
makefile:5: /usr/local/go/bin/src/Make.amd64: No such file or directory
makefile:6: /usr/local/go/bin/src/Make.pkg: No such file or directory
makefile:8: * missing separator. Stop.
The makefile is as follows:
GOROOT=/usr/local/go/bin
GOARCH=amd64
TARG=foo
CGOFILES=foo.go
include $(GOROOT)/src/Make.$(GOARCH)
include $(GOROOT)/src/Make.pkg
foo.o:foo.cpp
g++ $(_CGO_CFLAGS_$(GOARCH)) -fPIC -O2 -o $# -c $(CGO_CFLAGS) $<
cfoo.o:cfoo.cpp
g++ $(_CGO_CFLAGS_$(GOARCH)) -fPIC -O2 -o $# -c $(CGO_CFLAGS) $<
CGO_LDFLAGS+=-lstdc++
$(elem)_foo.so: foo.cgo4.o foo.o cfoo.o
gcc $(_CGO_CFLAGS_$(GOARCH)) $(_CGO_LDFLAGS_$(GOOS)) -o $# $^ $(CGO_LDFLAGS)
Thanks a lot
You can't call C++ code without either writing C wrappers (+ cgo) yourself or using SWIG, that's just the way it is sadly.
That post you linked is extremely outdated and can't be used anymore.
On the other hand, you can always start rewriting OpenCV in pure Go; the speed differences won't be that massive, especially if you learn how to use unsafe for the speed-critical parts.
Disclaimer: using unsafe is not advised since, well, it's unsafe.
You can do this, I've ported a very trivial subset of OpenCV into Go for my own purposes. In general, the process is to allocate everything on the heap and return it as a typedef'd void*. For example:
typedef void* gocv_matrix;
From there, a lot of your work is passthrough functions. One very important note is that your header files must be in pure C and must only (recursively) include headers that are pure C. This means your headers are going to be mostly prototypes/forward declarations.
So a few Matrix methods in your header mat.h may look like
gocv_matrix newMatrix();
void add(gocv_matrix m1, gocv_matrix m2, gocv_matrix dst);
void destroy(gocv_matrix m);
Then your implementation in mat.cxx will look something like
// include all relevant C++ OpenCV headers directly
extern "C" {
#include "mat.h" // gives the wrappers C linkage so cgo can call them
}
gocv_matrix newMatrix() {
cv::Mat *mat = new cv::Mat();
return (gocv_matrix)mat;
}
void add(gocv_matrix m1, gocv_matrix m2, gocv_matrix dst) {
cv::Mat *a = (cv::Mat *)m1;
cv::Mat *b = (cv::Mat *)m2;
cv::Mat *dstMat = (cv::Mat *)dst;
(*dstMat) = (*a)+(*b);
}
void destroy(gocv_matrix m) {
cv::Mat *a = (cv::Mat *)(m); // the parameter is m, not m1
delete a;
}
(Disclaimer: the exact code here isn't verified for correctness, this is just the gist).
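The opaque-handle pattern itself can be tried without OpenCV. The following sketch swaps in a hypothetical demo::Matrix class (a plain vector of doubles) and hypothetical demo_* wrapper names; only the shape of the C interface mirrors the code above.

```cpp
#include <cstddef>
#include <vector>

namespace demo {
// Hypothetical C++ class standing in for an OpenCV matrix.
struct Matrix {
    std::vector<double> data;
    explicit Matrix(std::size_t n) : data(n, 0.0) {}
};
}  // namespace demo

extern "C" {
// Opaque handle: the C (and therefore cgo) side only ever sees a void*.
typedef void* demo_matrix;

demo_matrix demo_new(unsigned long n) {
    return new demo::Matrix(n);
}
void demo_set(demo_matrix m, unsigned long i, double v) {
    static_cast<demo::Matrix*>(m)->data[i] = v;
}
double demo_get(demo_matrix m, unsigned long i) {
    return static_cast<demo::Matrix*>(m)->data[i];
}
// Element-wise sum; assumes all three matrices have the same size.
void demo_add(demo_matrix a, demo_matrix b, demo_matrix dst) {
    demo::Matrix* x = static_cast<demo::Matrix*>(a);
    demo::Matrix* y = static_cast<demo::Matrix*>(b);
    demo::Matrix* out = static_cast<demo::Matrix*>(dst);
    for (std::size_t i = 0; i < out->data.size(); ++i)
        out->data[i] = x->data[i] + y->data[i];
}
void demo_destroy(demo_matrix m) {
    delete static_cast<demo::Matrix*>(m);
}
}  // extern "C"
```

On the Go side, each demo_new would need a matching demo_destroy, since Go's garbage collector does not manage memory allocated by C++.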
A few special notes:
Make sure you have a destroy method that you actually call or you'll leak memory.
Since C and C++ constants aren't the same as Go constants, you'll have to declare them as var instead of const.
Some of OpenCV's constants are included in headers which aren't pure C, which makes it extremely difficult to define them within Go. I noticed this most with some image processing subpackages.
Note the lack of templated generics. In general you're either forgoing templates entirely, defining a different type for each possible instance, or picking one (probably double, maybe an int size for displaying images) and sticking with it.
Note that you can't use overloaded operators this way. So a+b*c is b.Mul(c).Add(a). In theory you could invent some expression parser that takes in a string like "a+(b*c)" and a list of matrices, and then does some call batching, but if you were at that point in development you wouldn't be asking this question.
This is normal with cgo in general, but you'll probably be using unsafe a lot, especially if you want to work directly with the raw backing data of the matrix. You can reduce this somewhat by making your Go-level Mytype type a simple struct that contains a C.mytype instead of actually converting it.
Honestly, you should probably just use SWIG, since this is basically already what it does for you anyway, in addition to extra niceties like generating actual Go constants for you in most cases instead of sketchy var magic.
