GCC online doc - 3.10 Options That Control Optimization affirm that -fomit-frame-pointer gcc option can make debbuging impossible.
-fomit-frame-pointer
Don't keep the frame pointer in a register for functions that don't need one. This avoids the instructions to save, set up and restore frame pointers; it also makes an extra register available in many functions. It also makes debugging impossible on some machines.
I understand why local variables are harder to locate and stack traces are much harder to reconstruct without a frame pointer to help out.
But, In what circumstances is it make debugging impossible?
It may be impossible in the sense that existing tools for these platforms (which are often provided by platform vendor, not GNU) expect frame pointer to be present for successful unwinding. One could theoretically modify them to be more intelligent but in practice this is not possible.
Related
for University we need to implement our own va_start and va_arg (without using libraries) for variable argument lists.
This isn't really a problem, but gcc and clang are giving us a hard time.
They are optimizing the code so that the parameter are passed through registers and not on the stack, which makes our task impossible.
I already tried to use optimization -O0 but even then they seem to pass them in registers.
Is there a way to disable that feature?
best wishes
Leo
Edit:
We are using 64-bit machines only
Edit2:
I found this site:
https://gcc.gnu.org/onlinedocs/gcc-2.95.3/gcc_17.html
It describes macros which define if a parameter is passed on the stack or not.
Could I use these marcos somehow to tell gcc to pass all parameters on the stack?
I played around with them but sadly did not archive anything...
I want to investigate a way for my other question (gcc: Strip unused functions) by building the latest gcc 6.3.0.
There are some options from https://gcc.gnu.org/install/configure.html and https://gcc.gnu.org/onlinedocs/libstdc++/manual/configure.html that I would like to try, but don't understand what they meant.
Specifically, these are the flags I want to try:
--disable-libstdcxx-verbose: I rarely use exceptions so I am not very familiar with how it works. I have never seen the "verbose messages" it mentioned before.
--enable-linker-build-id and --enable-gnu-unique-object: Simply don't understand what the explanations are trying to say. What are the benefits exactly?
--enable-cxx-flags="-ffunction-sections -fstrict-aliasing -fno-exceptions": If I use -fno-exceptions in libstdc++, doesn't that means I get no exceptions if I use libstdc++? -ffunction-sections is used, but where to put -Wl,-gc-sections?
Although I always use --enable-lto, but with ld.bfd, it seems to be quite useless compared to the famous gold linker.
If you have more flags you think I should try, please let me know!
--disable-libstdcxx-verbose: I rarely use exceptions
so I am not very familiar with how it works. I have never seen the
"verbose messages" it mentioned before.
+1, you normally don't run into errors which trigger these friendly error messages you can avoid paying for them.
--enable-linker-build-id and --enable-gnu-unique-object:
Simply don't understand what the explanations are trying to say.
What are the benefits exactly?
There are none.
Unique objects is a badly designed feature that prevents shared libraries which contain references to globally used objects (usually vtables) from unloading on dlclose. AFAIR it's enabled by default (as it's needed to simulate C++ semantics in shared libs environment).
Build id is needed to support separate debuginfo.
--enable-cxx-flags="-ffunction-sections -fstrict-aliasing -fno-exceptions":
You won't benefit from -fstrict-aliasing as it's enabled at -O2 and higher by default.
-ffunction-sections is used, but where to put -Wl,-gc-sections?
To --enable-cxx-flags as well (note that it wants double-dash i.e. -Wl,--gc-sections).
Although I always use --enable-lto, but with ld.bfd,
it seems to be quite useless compared to the famous gold linker.
This flags simply enables LTO support in GCC (it's actually equivalent to adding lto to --enable-languages). It won't cause any difference unless you enable -flto in CXXFLAGS as well. Keep in mind that LTO will normally increase executable size (as compiler will have more opportunities for inlining).
If you have more flags you think I should try, please let me know!
Speaking of size reduction, I'd say -ffunction-sections is your best bet (be sure to verify that configure machinery passes all options correctly and libstdc++.a indeed has one section per function). You could also add -fdata-sections.
I'm trying to compile code on GCC that uses OpenACC to offload to an NVIDIA GPU but I haven't been able to find a similar compiler option to the one mentioned above. Is there a way to tell GCC to be more verbose on all operations related to offloading?
Unfortunately, GCC does not yet provide a user-friendly interface to such information (it's on the long TODO list...).
What you currently have to do is look at the dump files produced by -fdump-tree-[...] for the several compiler passes that are involved, and gather information that way, which requires understanding of GCC internals. Clearly not quite ideal :-/ -- and patches welcome probably is not the answer you've been hoping for.
Typically, for a compiler it is rather trivial to produce diagnostic messages for wrong syntax in source code ("expected [...] before/after/instead of [...]"), but what you're looking for is diagnostic messages for failed optimizations, and similar, which is much harder to produce in a form that's actually useful for a user, and so far we (that is, the GCC developers) have not been able to spend the required amount of time on this.
I'm using g++ to compile and link a project consisting of about 15 c++ source files and 4 shared object files. Recently the linking time more than doubled, but I don't have the history of the makefile available to me. Is there any way to profile g++ to see what part of the linking is taking a long time?
Edit: After I noticed that the makefile was using -O3 optimizations all the time, I managed to halve the linking time just by removing that switch. Is there any good way I could have found this without trial and error?
Edit: I'm not actually interested in profiling how ld works. I'm interested in knowing how I can match increases in linking time to specific command line switches or object files.
Profiling g++ will prove futile, because g++ doesn't perform linking, the linker ld does.
Profiling ld will also likely not show you anything interesting, because linking time is most often dominated by disk I/O, and if your link isn't, you wouldn't know what to make of the profiling data, unless you understand ld internals.
If your link time is noticeable with only 15 files in the link, there is likely something wrong with your development system [1]; either it has a disk that is on its last legs and is constantly retrying, or you do not have enough memory to perform the link (linking is often RAM-intensive), and your system swaps like crazy.
Assuming you are on an ELF based system, you may also wish to try the new gold linker (part of binutils), which is often several times faster than the GNU ld.
[1] My typical links involve 1000s of objects, produce 200+MB executables, and finish in less than 60s.
If you have just hit your RAM limit, you'll be probably able to hear the disk working, and a system activity monitor will tell you that. But if linking is still CPU-bound (i.e. if CPU usage is still high), that's not the issue. And if linking is IO-bound, the most common culprit can be runtime info. Have a look at the executable size anyway.
To answer your problem in a different way: are you doing heavy template usage? For each usage of a template with a different type parameter, a new instance of the whole template is generated, so you get more work for the linker. To make that actually noticeable, though, you'd need to use some library really heavy on templates. A lot of ones from the Boost project qualifies - I got template-based code bloat when using Boost::Spirit with a complex grammar. And ~4000 lines of code compiled to 7,7M of executable - changing one line doubled the number of specializations required and the size of the final executable. Inlining helped a lot, though, leading to 1,9M of output.
Shared libraries might be causing other problems, you might want to look at documentation for -fvisibility=hidden, and it will improve your code anyway. From GCC manual for -fvisibility:
Using this feature can very substantially
improve linking and load times of shared object libraries, produce
more optimized code, provide near-perfect API export and prevent
symbol clashes. It is *strongly* recommended that you use this in
any shared objects you distribute.
In fact, the linker normally must support the possibility for the application or for other libraries to override symbols defined into the library, while typically this is not the intended usage. Note that using that is not for free however, it does require (trivial) code changes.
The link suggested by the docs is: http://gcc.gnu.org/wiki/Visibility
Both gcc and g++ support the -v verbose flag, which makes them output details of the current task.
If you're interested in really profiling the tools, you may want to check out Sysprof or OProfile.
When writing C/C++ code, in order to debug the binary executable the debug option must be enabled on the compiler/linker. In the case of GCC, the option is -g. When the debug option is enabled, how does the affect the binary executable? What additional data is stored in the file that allows the debugger function as it does?
-g tells the compiler to store symbol table information in the executable. Among other things, this includes:
symbol names
type info for symbols
files and line numbers where the symbols came from
Debuggers use this information to output meaningful names for symbols and to associate instructions with particular lines in the source.
For some compilers, supplying -g will disable certain optimizations. For example, icc sets the default optimization level to -O0 with -g unless you explicitly indicate -O[123]. Also, even if you do supply -O[123], optimizations that prevent stack tracing will still be disabled (e.g. stripping frame pointers from stack frames. This has only a minor effect on performance).
With some compilers, -g will disable optimizations that can confuse where symbols came from (instruction reordering, loop unrolling, inlining etc). If you want to debug with optimization, you can use -g3 with gcc to get around some of this. Extra debug info will be included about macros, expansions, and functions that may have been inlined. This can allow debuggers and performance tools to map optimized code to the original source, but it's best effort. Some optimizations really mangle the code.
For more info, take a look at DWARF, the debugging format originally designed to go along with ELF (the binary format for Linux and other OS's).
A symbol table is added to the executable which maps function/variable names to data locations, so that debuggers can report back meaningful information, rather than just pointers. This doesn't effect the speed of your program, and you can remove the symbol table with the 'strip' command.
In addition to the debugging and symbol information
Google DWARF (A Developer joke on ELF)
By default most compiler optimizations are turned off when debugging is enabled.
So the code is the pure translation of the source into Machine Code rather than the result of many highly specialized transformations that are applied to release binaries.
But the most important difference (in my opinion)
Memory in Debug builds is usually initialized to some compiler specific values to facilitate debugging. In release builds memory is not initialized unless explicitly done so by the application code.
Check your compiler documentation for more information:
But an example for DevStudio is:
0xCDCDCDCD Allocated in heap, but not initialized
0xDDDDDDDD Released heap memory.
0xFDFDFDFD "NoMansLand" fences automatically placed at boundary of heap memory. Should never be overwritten. If you do overwrite one, you're probably walking off the end of an array.
0xCCCCCCCC Allocated on stack, but not initialized
-g adds debugging information in the executable, such as the names of variables, the names of functions, and line numbers. This allows a debugger, such as gdb to step through code line by line, set breakpoints, and inspect the values of variables. Because of this additional information using -g increases the size of the executable.
Also, gcc allows to use -g together with -O flags, which turn on optimization. Debugging an optimized executable can be very tricky, because variables may be optimized away, or instructions may be executed in a different order. Generally, it is a good idea to turn off optimization when using -g, even though it results in much slower code.
Just as a matter of interest, you can crack open a hexeditor and take a look at an executable produced with -g and one without. You can see the symbols and things that are added. It may change the assembly (-S) too, but I'm not sure.
There is some overlap with this question which covers the issue from the other side.
Some operating systems (like z/OS) produce a "side file" that contains the debug symbols. This helps avoid bloating the executable with extra information.