How exactly does gperftools CPU profiler start? - gcc

gperftools documentation says that libprofiler should be linked into a target program:
$ gcc myprogram.c -lprofiler
(without changing a code of the program).
And then program should be run with a specific environment variable:
CPUPROFILE=/tmp/profiler_output ./a.out
The question is: how does libprofile have a chance to start and finish a profiler when it is merely loaded, but its functions are not called?
There is no constructor function in that library (proof).
All occasions of "CPUPROFILE" in library code do not refer to any place where profiler is started.
I am out of ideas, where to look next?

As per the documentation the linked webpage, under Linking the library, it describes that the -lprofiler step is the same as linking against the shared object file with LD_PRELOAD option.
The shared object file isn't the same as just the header file. The header file contains function declarations which are looked up by when compiling a program , so the names of the functions resolve, but the names are just names, not implementations. The shared object file (.so) contains the implementations of the functions. For more information see the following StackOverflow answer.
Source file of /trunk/src/profiler.cc on Line 182, has a CPUProfiler constructor, that checks for whether profiling should be enabled or not based on the CPUPROFILE environment variable (Line 187 and Line 230).
It then calls the Start function on Line 237. As per the comments in this file, the destructor calls the Stop function on Line 273.
To answer your question I believe Line 132 CpuProfiler CpuProfiler::instance_; is the line where the CpuProfiler is instantiated.
This lack of clarity in the gperftools documentation is known issue see here.

I think the profiler gets initialized with the REGISTER_MODULE_INITIALIZER macro seen at the bottom of profile-handler.cc (as well as in heap-checker.cc, heap-profiler.cc, etc). This calls src/base/googleinit.h which defines a dummy static object whose constructor is called when the library is loaded. That dummy constructor then calls ProfileHandlerRegisterThread() which then uses the pthread_once variable to initialize the singleton object (ProfileHandler::instance_).
The function REGISTER_MODULE_INITIALIZER simulates the module_init()/module_exit() functions seen in Linux loadable kernel modules.
(my answer is based on the 2.0 version of gperftools)

Related

What's the differences from inline and block compilation of SBCL?

Several weeks ago, SBCL updated 2.0.2 and brought the Block compilation feature. I have read this article to understand what it is.
I have a question, what's the difference between (declaim (inline 'some-function)) and Block compilation? Block compilation is automatic by the compiler?
Thanks.
Inline compilation is a specific optimization technique. A function being called is directly integrated into the calling function - usually using its source code - and then compiled.
This means that the inlined function might not be inlined only in one function, but in multiple functions.
Advantage: the overhead of calling a function disappears.
Disadvantage: the code size increases and the calling function(s) needs to be recompiled, when the inlined function changed and we want this change to become visible. Macros have the same problem.
Block compilation means that a bunch of code gets compiled together with different semantic constraints and that this enables the compiler to do a bunch of new optimizations.
Common Lisp has in the standard support for block compilation of single files. It allows the file compiler to assume that a file is such a block of code.
Example from the Common Lisp standard:
3.2.2.3 Semantic Constraints
A call within a file to a named function that is defined in the same file refers to that function, unless that function has been declared notinline. The consequences are unspecified if functions are redefined individually at run time or multiply defined in the same file.
This allows the code to call a global function and not use the symbol's function cell for the call. Thus this disables late binding for global function calls - in this file and for functions in this file.
It's not said how this can be achieved, but the compiler might just allocate the code somewhere and the calls just jump there.
So this part of block compilation is defined in the standard and some compilers are doing that.
Block compilation for multiple files
If the file compiler can use block compilation for one file, then what about multiple files? A few compilers can also tell the file compiler that several files make a block for compilation. CMUCL does that. SBCL was derived and simplified from CMUCL and lacks it until now. I think Lucid Common Lisp (which is no longer actively sold) did support something like that, too.
Might be useful to add this to SBCL, too.

dlopen and dylib : main application and dylib address space

My main application statically links to a static library A with a function ABC and my dynamic library xyz.dylib also statically links to the same static library A which has the same function ABC. The function ABC uses a globally defined variable.
Now when the main application Loads xyz.dylib using dlopen on runtime. The initializer gets called where i have called ABC function. This function ABC and uses the global variable from main application address space.
On Osx, functions which are inline the dylib linker will use the first one that is used. So for example, if an inline function is used in your main executable first, and then used in the loaded dylib, it will use the one in the main executable.
This is normally fine, unless your inline makes reference to a global symbol, in which case you are now be using one if your globals for both the dylib, and your executable.
Again this is usually fine, since the same version is used consistently.
The problem happens when you have 2 inline functions that reference a global that is in both executable and dylib, and one function gets used first in the executable, and another one used first in the dylib. Then you have a mismatched pair. For example:
class MagicAlloc
{
void* Alloc() { return gAlloc.get(); }
void Free( void* v ) { gAlloc.free( v ); }
static RealAllocator gAlloc;
};
Suppose you call MagicAlloc::Alloc in the executable, then call it in the dylib, now for all allocations in both you will use the gAlloc in the executable. Then the first call to MagicAlloc::Free happens in the dylib. Then you will try to free something allocated in the binary on the globals from the dylib.
There are two solutions:
Don't use inlines to reference globals/statics. Move the global structure, and the function definitions into the same translation unit ( object file ). Mark the globals "static" so they aren't even visible outside the TLU. Now your functions will be resolved statically in the link step, and bound to the right global.
Hide all the symbols in the executable except the plugin api. Link as normal, but when linking the binary itself pass the following to the linker:
-Wl,-exported_symbols_list,export_file
Where export file is a list of link symbols that should be exported. E.g. you will need to at least have "_main" in that file. Now when your dylib runs it won't be able to dynamically link to the wrong inlines, because they won't be in the dynamic symbol table. The second solution is also more secure, since a malicious plugin won't be able to access globals as easily.

Xcode error: Command /Developer/usr/bin/clang++ failed with exit code 1 due to duplicate symbol

I'm trying to write a program in C++ which runs Conway's Game of Life. I think I have everything that I need, but I'm having some trouble with compiling.
The program is composed of four files: gameoflife.h, a header file which contains my global constants and function declarations, gameoflife.cpp, which defines the functions, main.cpp, which uses the functions, and seeds.cpp, which contains a list of predefined seeds to be used.
When I go to compile the application, I seem to have a clash of duplicate symbols between main.cpp and gameoflife.cpp over an array called currGen which is declared in gameoflife.h.
Both main.cpp and gameoflife.cpp include gameoflife.h, which of course is necessary so that they have access to the global constants and function declarations.
The exact error I receive is the following:
duplicate symbol _currGen in /(same_path)/ConwaysGameOfLife.build/Objects-normal/
x86_64/gameoflife.o and
/(same_path)/ConwaysGameOfLife.build/Objects-normal/x86_64/main.o
for architecture x86_64
Command /Developer/usr/bin/clang++ failed with exit code 1
I've looked around on Stack Overflow but haven't found anything which matches my problem. Any help would be greatly appreciated!
You are probably defining the variable currGen in your header file, not just declaring it.
There needs to be exactly one definition, in one .cpp file. The .h file should just declare it, using extern.
This answer goes into much more detail.

"irq_to_desc" undefined?

everybody.
I need to use $irq_to_desc in my project, but despite the fact I included all h files it needs, gcc still emits ""irq_to_desc" undefined!" messages. I found something on the topic here http://comments.gmane.org/gmane.linux.kernel.kernelnewbies/34403 but I still dont understand how to fix this prroblem.
I don't believe you can use irq_to_desc() in a module.
If CONFIG_GENERIC_HARDIRQS isn't defined, then irq_to_desc() is #defined as a macro in include/linux/irqnr.h. Since the variable it references, irq_desc, isn't in an EXPORT_SYMBOL or EXPORT_SYMBOL_GPL declaration, I don't think you could link a module using that variable into the kernel -- only statically compiled in-kernel code can use it.
If CONFIG_GENERIC_HARDIRQS is defined, then a function irq_to_desc() is declared in include/linux/irqnr.h and defined in kernel/irq/irqdesc.c. There are two definitions of irq_to_desc() in kernel/irq/irqdesc.c depending upon the value of CONFIG_SPARSE_IRQ. There is no corresponding EXPORT_SYMBOL or EXPORT_SYMBOL_GPL declaration for the function, so it can't be used in modules -- only statically compiled in-kernel code.

Size of a library and the executable

I have a static library *.lib created using MSVC on windows. The size of library is say 70KB. Then I have an application which links this library. But now the size of the final executable (*.exe) is 29KB, less than the library. What i want to know is :
Since the library is statically linked, I was thinking it should add directly to the executable size and the final exe size should be more than that? Does windows exe format also do some compression of the binary data?
How is it for linux systems, that is how do sizes of library on linux (*.a/*.la file) relate with size of linux executable (*.out) ?
-AD
A static library on both Windows and Unix is a collection of .obj/.o files. The linker looks at each of these object files and determines if it is needed for the program to link. If it isn't needed, then the object file won't get included in the final executable. This can lead to executables that are smaller then the library.
EDIT: As MSalters points out, on Windows the VC++ compiler now supports generating object files that enable function-level linking, e.g., see here. In fact, edit-and-continue requires this, since the edit-and-continue needs to be able to replace the smallest possible part of the executable.
There is additional bookkeeping information in the .lib file that is not needed for the final executable. This information helps the linker find the code to actually link. Also, debug information may be stored in the .lib file but not in the .exe file (I don't recall where debug info is stored for objs in a lib file, it might be somewhere else).
The static library probably contains several functions which are never used. When the linker links the library with the main executable, it sees that certain functions are never used (and that their addresses are never taken and stored in function pointers), it just throws away the code. It can also do this recursively: if function A() is never called, and A() calls B(), but B() is never otherwise called, it can remove the code for both A() and B(). On Linux, the same thing happens.
A static library has to contain every symbol defined in its source code, because it might get linked into an executable which needs just that specific symbol. But once it is linked into an executable, we know exactly which symbols end up being used, and which ones don't. So the linker can trivially remove unused code, trimming the file size by a lot. Similarly, any duplicate symbols (anything that's defined in both the static library and the executable it's linked into gets merged into a single instance.
Disclaimer: It's been a long time since I dealt with static linking, so take my answer with a grain of salt.
You wrote: I was thinking it should add directly to the executable size and final exe size should be more than that?
Naive linkers work exactly this way - back when I was doing hobby development for CP/M systems (a LONG time ago), this was a real problem.
Modern linkers are smarter, however - they only link in the functions referenced by the original code, or as required.
Additionally to the current answers, the linker is allowed to remove function definitions if they have identical object code - this is intended to help reduce the bloating effects of templated code.
#All: Thanks for the pointers.
#Greg Hewgill - Your answer was a good pointer. Thanks.
The answer i found out was as follows:
1.)During Library building what happens is if the option "Keep Program debug databse" in MSVC (or something alike ) is ON, then library will have this debug info bloating its size.
but when i statically include that library and create a executable, the linker strips all that debug info from the library before geenrating the exe and hence the exe size is less than that of the library.
2.) When i disabled the option "Keep Program debug databse", i got an library whose size was smaller than the final executable, which was what i thought is nromal in most situations.
-AD

Resources