I need some help for compiling with GCC under MinGW.
Say I have two files:
File a.c contains two functions, a1 and a2
File b.c contains two functions, b1 and b2.
Then I link the two objects into a shared library. The commands used are:
gcc -c a.c
gcc -c b.c
gcc -shared -Wl,--version-script=v.ver -Wl,-Map=out.map -Wl,--strip-all -o mydll.dll a.o b.o
File v.ver looks like:
mylib {
    global: a1;
            a2;
    local: *;
};
which controls which functions are exported.
By checking the map file I can see that the two functions in b.c are also included in the .text section of the DLL file.
This DLL file only exports a1 and a2, and b1 and b2 are defined only in b.c and never used anywhere. Is there an option I could add in GCC or ld so that b1 and b2 are not built into the DLL file, to save some space?
Yes, this is possible. To do this, add the following two flags when compiling your C source code to objects:
-ffunction-sections -fdata-sections
This places each function and each data item in its own section. The object files get bigger, but the linker can then keep or discard each function individually.
When calling the linker, add the following flag (written as -Wl,--gc-sections when you link through the gcc driver):
--gc-sections
The linker will now throw away all functions and sections that are not used. Note that this might incur a performance penalty:
Only use these options when there are significant benefits from doing
so. When you specify these options, the assembler and linker create
larger object and executable files and are also slower. These options
affect code generation. They prevent
optimizations by the compiler and assembler using relative locations inside a translation unit since the locations are unknown
until link time. An example of such an optimization is relaxing calls
to short call instructions.
(man gcc)
See also this question: Query on -ffunction-section & -fdata-sections options of gcc for more information.
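To make the setup concrete, here is a minimal sketch of what the sources could look like. It is a hypothetical single-file stand-in for a.c and b.c (the function bodies are my invention, the question does not show them), and the build commands in the comment simply combine the flags named above. Whether b1 and b2 really disappear depends on the target and on which symbols end up exported, so check the map file rather than taking it on faith.

```c
/* sections_demo.c: hypothetical stand-in for a.c and b.c in one file.
 *
 * Assumed build commands, using only the flags discussed above:
 *   gcc -c -ffunction-sections -fdata-sections sections_demo.c
 *   gcc -shared -Wl,--gc-sections -Wl,-Map=out.map \
 *       -o mydll.dll sections_demo.o
 *
 * With -ffunction-sections each function is emitted into its own input
 * section (for example .text.a1 on ELF targets; the exact names are
 * target-specific), so the linker can drop one function without
 * touching the others.
 */

/* Intended to be exported. */
int a1(void) { return 1; }
int a2(void) { return 2; }

/* Defined but never referenced: candidates for --gc-sections. */
int b1(void) { return 10; }
int b2(void) { return 20; }
```

After linking, search out.map for b1 and b2 to see whether their sections were kept or discarded.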
Normally, one can get GCC's optimized assembler output from a source file using the -S flag in GCC and Clang, as in the following example.
gcc -O3 -S -c -o foo.s foo.c
But suppose I compile all of my source files using -O3 -flto to enable link-time whole-program optimizations and want to see the final compiler-generated optimized assembly for a function, and/or see where/how code gets inlined.
The result of compiling is a bunch of .o files which are really IR files disguised as object files, as expected. In linking an executable or shared library, these are then smushed together, optimized as a whole, and then compiled into the target binary.
But what if I want assembly output from this procedure? That is, the assembly source that results after link-time optimizations, during the compilation of IR to assembly, and before the actual assembly and linkage into the final executable.
I tried simply adding a -S flag to the link step, but that didn't really work.
I know disassembling the executable is possible, even interleaving with source, but sometimes it's nicer to look at actual compiler-generated assembly, especially with -fverbose-asm.
For GCC, just add -save-temps to the link command:
$ gcc -flto -save-temps ... *.o -o bin/libsortcheck.so
$ ls -1
...
libsortcheck.so.ltrans0.s
For Clang the situation is more complicated. If you use GNU ld (the default, or -fuse-ld=ld) or the Gold linker (enabled via -fuse-ld=gold), you need to link with -Wl,-plugin-opt=emit-asm:
$ clang tmp.c -flto -Wl,-plugin-opt=emit-asm -o tmp.s
For newer (11+) versions of the LLD linker (enabled via -fuse-ld=lld), you can generate assembly with -Wl,--lto-emit-asm.
I am encountering a problem when I include a Fortran
subroutine in a shared library. This subroutine has a
named common block.
I have a Fortran main program that uses this common block
and links with the shared library.
The behavior is that variables in the common block set in
either the subroutine or main program are not shared between
the two.
I am using gfortran 4.9.3 under MinGW on Windows. Here are the pieces of
my very simple example.
Main program:
program mainp
common/whgc/ivar
ivar = 23
call sharedf
end
Subroutine:
subroutine sharedf
common/whgc/ivar
print *, 'ivar=', ivar
end
Makefile:
FC = gfortran
FFLAGS = -g
all: shltest.dll mainp.exe
shltest.dll: sharedf.o
	$(FC) -shared -o shltest.dll sharedf.o
mainp.exe: mainp.o shltest.dll
	$(FC) -o mainp.exe mainp.o shltest.dll
clean:
	rm *.o mainp.exe shltest.dll
When mainp.exe is run, it produces ivar = 0 instead of the correct ivar = 23.
Here are the results of some experimentation I did with nm.
nm -g mainp.o shows:
...
00000004 C _whgc_
nm on sharedf.o shows the same.
nm -g shltest.dll shows:
...
71446410 B _whgc_
nm -g mainp.exe shows:
...
00406430 B _whgc_
This is the only _whgc_ symbol in mainp.exe.
However, when I run mainp.exe in gdb and set break points in both
mainp and sharedf, I can print the address of ivar at each break point. The addresses
are not the same.
From the behavior it seems clear that GNU ld is not correctly
matching the _whgc_ symbols, but I'm unclear about what options
to pass, either in the shared library build or in the final link, to
make it do so.
(Please don't suggest alternatives to common blocks. In my real
application I am dealing with legacy code that uses common blocks.)
EDIT:
I tried my example on Linux/x86 and there the behavior is correct.
Of course on Linux the shared library and executable are ELF format
objects and on Windows/MinGW the format is PE/COFF.
Let's say that you have prog.c that includes lib.h, whose functions are defined in lib.c, and you build the program with gcc -O3 lib.c prog.c.
Does GCC merge both source files before compiling them?
Is GCC able to inline short functions of lib.c into the resulting binary?
Summary of answers
This does the trick: gcc -flto -O3 lib.c prog.c.
Both source files are still compiled individually, but the linker is able to inline functions from one file into the other one.
Does GCC merge both source files before compiling them?
No, it doesn't.
Is GCC able to inline short functions of lib.c into the resulting binary?
Yes, but not with -O3 alone. Look at whole-program optimization (-fwhole-program), link-time optimization (-flto), and similar options.
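As an illustration, here is a minimal sketch of the kind of code where this matters. The names lib_add and compute are hypothetical, and the two "files" are shown in one block only so that it compiles on its own; in the real layout each part would live in its own file.

```c
/* ---- lib.c (hypothetical contents) ---- */
int lib_add(int a, int b)
{
    return a + b;
}

/* ---- prog.c (hypothetical contents) ---- */
int lib_add(int a, int b);      /* would normally come from lib.h */

/* Without -flto, the compiler sees only the declaration of lib_add
 * while compiling prog.c, so it has to emit a real call. With
 * "gcc -O3 -flto lib.c prog.c" the body is available at link time and
 * the call becomes a candidate for inlining. */
int compute(void)
{
    return lib_add(40, 2);
}
```

To check what actually happened in your build, disassemble the result (for example with objdump -d) and see whether compute still contains a call to lib_add.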
GCC can get pretty picky about the order in which it accepts its arguments:
# Works.
g++ Foo.cpp -L. -I. -lBar -o Foo
# Linker errors.
g++ -o Foo -I. -L. -lBar Foo.cpp
What, specifically, are the ordering requirements for command-line options?
Libraries are loaded on demand based on the symbols required from them, so the library which provides a symbol needed by something else must follow that something else.
This is historical; arguably a modern system should resolve symbols automatically, handling loops sensibly (that being the reason for the rule; you broke dependency cycles manually by specifying libraries in order and as many times as needed), but g++ follows the traditional rule so it will work with vendor lds. (GNU ld doesn't work everywhere, so it wouldn't be possible to rely on it to resolve symbol dependency loops. There are also bootstrapping concerns even on platforms where GNU ld does work.)
Similarly, other linker-oriented options must be specified in the correct order relative to the things they affect (for example, a -L option must precede a library which lives in the specified directory; this can be important if a library in one directory shadows a library of the same name in a standard directory).
I'm using gcc 4.3.4 and ld 2.20.51 in Cygwin under Windows 7. Here's a simplified version of my problem:
foo.o contains function foo_bar() which calls bar() in bar.o
bar.o contains function bar()
main.c calls functions in foo.o, but foo_bar() is not in the call chain
If I try to compile main.c and link it to foo.o, I get an undefined reference to _foo_bar error from ld. As you can see from my Makefile excerpt below, I've tried using flags for putting each function in its own section and having the linker discard unused sections.
COMPILE_CYGWIN = gcc -iquote$(INCDIR)
COMPILE = $(COMPILE_CYGWIN) -g -MMD -MP -Wall -ffunction-sections -Wl,-gc-sections $(DEFINE)
main_OBJECTS = main.o foo.o
main.exe : $(main_OBJECTS)
	$(COMPILE) -o main.exe $(main_OBJECTS)
The function foo_bar() is a short function that provides a connection between two networking layers in a protocol stack. Some programs don't need it, so they won't link in the other object files related to the upper layer of the stack. It's a small function, and seems inappropriate to put it into its own .o file.
I don't understand why ld throws the error -- nothing is calling foo_bar(), so there's no need to include bar() in the final executable. A coworker has just told me that ld is not a "smart linker", so maybe what I'm trying to do isn't possible?
Unless the linker is from Cyberdyne Systems it has no way to know exactly which functions will actually be called. It only knows which ones are referenced. Even Skynet's linker can't predict what run-time decisions will be made or what will happen if you load a module dynamically at run-time and it starts calling various global functions [1].
So, if you link in module m and it references function f, you will need to link with whatever module has f.
[1] This problem is related to the Halting Problem and has been proven undecidable.
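The difference between "referenced" and "called" can be sketched in a few lines of C. All the names here (upper_layer, no_upper_layer, pick_handler, deliver) are hypothetical and only loosely modelled on the protocol-stack situation in the question.

```c
typedef int (*handler_fn)(void);

/* Stands in for bar(): code from the upper layer of the stack. */
static int upper_layer(void)    { return 42; }

/* Fallback used when the upper layer is not wanted. */
static int no_upper_layer(void) { return -1; }

/* Both functions are referenced here, so both must be present at link
 * time, even though only one of them is returned for any given call. */
handler_fn pick_handler(int have_upper_layer)
{
    return have_upper_layer ? upper_layer : no_upper_layer;
}

/* The choice depends on a value known only at run time, so no linker
 * can decide ahead of time that upper_layer is never called. */
int deliver(int have_upper_layer)
{
    return pick_handler(have_upper_layer)();
}
```

If upper_layer lived in another object file, leaving that file out of the link would produce an undefined reference no matter what value deliver is ever called with.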
I hit a similar issue and found this page:
http://lists.gnu.org/archive/html/bug-gnu-utils/2004-09/msg00098.html
Highlights:
The GNU linker still works at .o file granularity.
GCC pulls in foo.o and then finds that bar() is undefined.
You'd better put foo_bar() into another .o file.