Is there a way to determine that a .a or .so library has been compiled as position independent code? - gcc

I am getting a linking error when compiling the numpy library against lapack indicating I need to compile lapack with -fPIC. I thought I had done just that. Is there a way to determine that the produced lapack library is position independent?

You may have some luck with this answer, although it's platform dependent and doesn't work for all object files (but if your code manipulates pointers in any way, it should work).
This is the result of objdump -r on a file compiled with -fPIC:
test.o: file format elf32-i386
RELOCATION RECORDS FOR [.text]:
OFFSET TYPE VALUE
00000007 R_386_PC32 __i686.get_pc_thunk.cx
0000000d R_386_GOTPC _GLOBAL_OFFSET_TABLE_
and this is for a file without PIC:
test.o: file format elf32-i386

In general, you have no way of knowing, because code that needs no relocations compiles to the same object file either way:
$ cat a.c
int foo(int x) { return x+1; }
$ gcc -fno-pic a.c -c -o nopic.o
$ gcc -fPIC a.c -c -o pic.o
$ cmp pic.o nopic.o
$ cmp pic.o nopic.o && echo Identical
Identical
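The two objects above are identical because foo touches nothing that needs a relocation. Here is a minimal sketch of a file where the flag should make a visible difference, assuming x86-64 GNU/Linux with GCC and binutils. The file name pic_demo.c and the names counter and bump are made up for illustration. A global variable with default visibility is typically reached through the GOT under -fPIC, so the relocation records of the two builds should differ.

```c
/* pic_demo.c: hypothetical example, not taken from the answers above.
 *
 * Build it both ways and compare the relocation records:
 *   gcc -fno-pic -c pic_demo.c -o nopic.o
 *   gcc -fPIC    -c pic_demo.c -o pic.o
 *   readelf --relocs nopic.o
 *   readelf --relocs pic.o
 * Assumption: on x86-64 the -fPIC object refers to `counter` through a
 * GOT-type relocation, while the -fno-pic object addresses it directly.
 */

/* A global with default visibility: under -fPIC the compiler has to allow
 * for another module overriding it, so it cannot hard-wire the address. */
int counter = 0;

/* Increments the global and returns the new value. */
int bump(void)
{
    counter += 1;
    return counter;
}
```

If both builds still show the same relocations on your platform, the object simply falls into the "no way of knowing" case again.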

Related

GCC differently treats an object and a static library regarding undefined symbols

Recently I discovered that the Linux linker does not fail due to undefined symbols from static libraries, but does fail due to the same undefined symbols if I link directly with the object files. Here is a simple example:
Source code:
$ cat main.c
int main() { return 0; }
$ cat src.c
int innerUndefinedFunc();
int outerUnusedFunc() {
    return innerUndefinedFunc();
}
Creating *.o and *.a from it, comparing using "nm":
$ gcc -c -o main.o main.c
$ gcc -c -o src.o src.c
$ ar r src.a src.o
ar: creating src.a
$ nm src.o
U innerUndefinedFunc
0000000000000000 T outerUnusedFunc
$ nm src.a
src.o:
U innerUndefinedFunc
0000000000000000 T outerUnusedFunc
(Here we clearly see that both *.o and *.a contain the same symbol list)
And now...
$ ld -o exe main.o src.o
src.o: In function `outerUnusedFunc':
src.c:(.text+0xa): undefined reference to `innerUndefinedFunc'
$ echo $?
1
$ ld -o exe main.o src.a
$ echo $?
0
What is the reason for GCC to treat them differently?
If you read the static-libraries tag wiki
it will explain why no object files from src.a are linked into your program, and therefore
why it doesn't matter what undefined symbols are referenced in them.
The difference between an object file foo.o and a static library libfoo.a, as linker inputs, is
that an object file is always linked into your program, unconditionally, whereas the same object file
in a static library, libfoo.a(foo.o), is extracted from libfoo.a and linked into
the program only if the linker needs it to carry on the linkage, as explained by the tag wiki.
Naturally, the linker will give errors only for undefined references in object files that are linked into the program.
The behaviour you are observing is behaviour of the linker, whether or not you invoke it via a GCC
front-end.
Giving the linker foo.o tells it: I want this in the program. Giving the linker
libfoo.a tells it: Here are some object files that you might or might not need.
In the second case (with the static library) the command line says "build exe from main.o and add whatever is required from src.a".
ld simply ignores the library because main.o requires no external symbols (outerUnusedFunc is not referenced from main.o).
But in the first case the command line says "build exe from main.o and src.o".
ld must place the content of src.o into the output file.
Hence it is obliged to analyze the src.o module, add outerUnusedFunc to the output file and resolve all symbols for outerUnusedFunc even though it is unused.
You can enable garbage collection for code sections:
gcc -ffunction-sections -Wl,--gc-sections -o exe main.c src.c
In this case outerUnusedFunc (like every other function) will be placed
in its own section. ld will see that this section is unused (none of its symbols are referenced) and remove the whole section from the output file, so innerUndefinedFunc is never referenced and the symbol does not need to be resolved: the same result as in the library case.
On the other hand, you can manually mark outerUnusedFunc as "undefined" so that ld has to find it in the library and add it to the output file:
ld -o exe main.o -u outerUnusedFunc src.a
In this case the same error (undefined reference to innerUndefinedFunc) will be produced.
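The rule described in these answers can be sketched as a toy model. This is only an illustration under simplifying assumptions, not how ld is implemented: each object defines at most one symbol and references at most one, and struct object, count_unresolved and MAX_LINKED are invented names. Objects given directly are always linked. Archive members are pulled in only when they define a symbol that is still undefined.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_LINKED 16

/* Toy object file: defines at most one symbol, references at most one. */
struct object {
    const char *name;       /* file name, for readability only */
    const char *defines;    /* symbol it defines, or NULL */
    const char *references; /* symbol it needs, or NULL */
};

/* True if some already-linked object defines sym. */
static bool is_defined(const struct object **linked, size_t n, const char *sym)
{
    for (size_t i = 0; i < n; i++)
        if (linked[i]->defines && strcmp(linked[i]->defines, sym) == 0)
            return true;
    return false;
}

/* Links every element of objs unconditionally, pulls in archive members only
 * when they define a symbol that is still undefined, and returns the number
 * of references left unresolved at the end. */
int count_unresolved(const struct object *objs, size_t nobjs,
                     const struct object *archive, size_t narch)
{
    const struct object *linked[MAX_LINKED];
    size_t nlinked = 0;

    for (size_t i = 0; i < nobjs && nlinked < MAX_LINKED; i++)
        linked[nlinked++] = &objs[i];

    bool progress = true;
    while (progress) {
        progress = false;
        for (size_t i = 0; i < nlinked; i++) {
            const char *need = linked[i]->references;
            if (need == NULL || is_defined(linked, nlinked, need))
                continue;
            /* Still undefined: look for an archive member that defines it. */
            for (size_t j = 0; j < narch && nlinked < MAX_LINKED; j++) {
                if (archive[j].defines &&
                    strcmp(archive[j].defines, need) == 0) {
                    linked[nlinked++] = &archive[j];
                    progress = true;
                    break;
                }
            }
        }
    }

    int unresolved = 0;
    for (size_t i = 0; i < nlinked; i++) {
        const char *need = linked[i]->references;
        if (need != NULL && !is_defined(linked, nlinked, need))
            unresolved++;
    }
    return unresolved;
}
```

With main.o and src.o both given as objects, the model reports one unresolved reference. With src.o moved into the archive it reports none. Adding a pseudo-object that references outerUnusedFunc (standing in for -u outerUnusedFunc) brings the error back.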

How to see the object file contents of a .so file

How can I see which .o files make up a .so file?
That is, given only the .so file, how can I find out which object files were used to build it?
You can't know, given just a shared library, what object files were
compiled into it. If you're lucky, you may be able to make a reasonable guess.
A shared library is made, by the linker, from object files and
possibly other shared libraries, but it does not contain the object files
or shared libraries from which it was made. A static library, on the other hand, which
is made by the archiver ar, does contain object
files: it is just an ar archive of object files.
If a shared library has not been stripped of debugging information, then
for debugging purposes its symbol table will contain the names of the source files
from which the object files were compiled that were linked in the shared library - at least those source files which were compiled with debugging information.
From the names of those source files you can infer the names of the object files
with reasonable confidence, but not with certainty.
For example, here we make a shared library from source files foo.c and bar.c.
Compile the source files to object files:
$ gcc -Wall -fPIC -c -o foo.o foo.c
$ gcc -Wall -fPIC -c -o bar.o bar.c
Link the object files to make a shared library:
$ gcc -shared -o libfoobar.so foo.o bar.o
Then:
$ readelf -s libfoobar.so | grep FILE
26: 0000000000000000 0 FILE LOCAL DEFAULT ABS crtstuff.c
35: 0000000000000000 0 FILE LOCAL DEFAULT ABS foo.c
37: 0000000000000000 0 FILE LOCAL DEFAULT ABS bar.c
39: 0000000000000000 0 FILE LOCAL DEFAULT ABS crtstuff.c
42: 0000000000000000 0 FILE LOCAL DEFAULT ABS
indicates that three source files have contributed debugging info to the
library, and we'd infer that the object files to which they were compiled
were likely to be:
crtstuff.o
foo.o
bar.o
Note that crtstuff.c is not one of the source files that we compiled. It
happens to contain program initialization and finalization code from the C runtime library, which has got into
our library from a C runtime object file that is linked by default.
This inference could be wrong about any of the files, since:
$ gcc -Wall -fPIC -c -o abc.o foo.c
$ gcc -Wall -fPIC -c -o xyz.o bar.c
$ gcc -shared -o libfoobar.so abc.o xyz.o
is also a perfectly possible way of compiling and linking the library.
If debugging information has been stripped from the library:
$ strip -g libfoobar.so
then we are out of luck:
$ readelf -s libfoobar.so | grep FILE
$
No more FILE symbols.

Can you pass your code directly into gcc? For example: gcc -? 'int main(){return 0;}'

Can you pass your code directly into gcc? If so what is the command line option for it?
For example:
g++ -? 'int main(){return 0;}'
I need to know because I am using a system command and I rather not make files:
system("g++ -C "+code_string+" -o run.out");
Basile Starynkevitch's solution worked, however I am getting compile errors when I use newlines:
echo '#include\nint main(){printf("Hello World"); return 0;}' | g++ -x c++ -Wall -o myprog /dev/stdin
Edit: fixed it
echo -e '#include\nint main(){printf("Hello World"); return 0;}' | g++ -x c++ -Wall -o myprog /dev/stdin
You could ask GCC to read from stdin. Read the Invoking GCC chapter of its documentation. Use its -x option with /dev/stdin or with -:
echo 'int main(){return 0;}' | g++ -x c++ -O -Wall -o myprog /dev/stdin
BTW, since int main(){return 0;} is a valid C program, you could use
echo 'int main(){return 0;}' | gcc -x c -O -Wall -o myprog -
Programmatically, you should consider using popen(3) to get a FILE* handle for a pipe(7) (so FILE* f = popen("g++ -x c++ -O -Wall -o myprog /dev/stdin", "w"); then check that f is not null), fprintf into it, and finally pclose it. Don't forget to test the status of pclose.
However, most of the time spent by GCC is not parsing (use -ftime-report developer option to find out). You often want to ask it to optimize (with -O2 -march=native or just -O for example), and you surely want to ask for all warnings (with at least -Wall and perhaps also -Wextra).
If you want to produce some plugin code in /tmp/someplugin.so from some emitted C++ code in /tmp/myemitted.cc to be dynamically loaded on Linux, compile it as position-independent code into a shared object dynamic library with e.g.
g++ -o /tmp/someplugin.so -fPIC -shared -Wall -O /tmp/myemitted.cc
etc.... then use dlopen(3) on /tmp/someplugin.so with dlsym(3) to fetch some loaded symbols. My GCC MELT is doing this.
Since parsing time is negligible, you could instead write C or C++ code in some temporary file (inside /tmp/ or /run which is often some fast tmpfs on most Linux systems, so writing into it does not require disk I/O).
Lastly, recent GCC (use at least GCC 6) also has GCCJIT (actually libgccjit). You could use it to build some representation of generated code then ask GCC to compile it.
See also this and that. Read the C++ dlopen mini howto and the Program Library HowTo, and Drepper's How To Write Shared Libraries
I rather not make files
Generating a temporary file (see mkstemp(3) etc... and you practically could also generate some random file name under /tmp/ ending with .c, then register its removal with atexit(3) passed some function doing unlink(2)...) is really quick (but you should build some kind of AST in memory before emitting C++ or C code from it). And using some Makefile to compile the generated code with some make command has the advantage (for the advanced user) of being able to change compilers or options (by editing that Makefile to configure make).
So you are IMHO wrong in avoiding temporary files (notice that gcc & g++ are also generating and deleting temporary files, e.g. containing some assembler code). I would suggest on the contrary generating a temporary file (matching /tmp/mytemp*.cc) using some random numbers (see random(3); don't forget to seed the PRNG with e.g. srandom(time(NULL)); early in your main). It could be as simple as
char tmpbuf[80];
bool exists;
do { // in practice, this loop is extremely likely to run once
    snprintf(tmpbuf, sizeof(tmpbuf), "/tmp/mytemp_%lx_p%d.cc",
             (unsigned long)random(), (int)getpid());
    exists = (access(tmpbuf, F_OK) == 0); // access() returns 0 when the file already exists
} while (exists);
// here tmpbuf contains a temporary file name that did not exist when it was checked
You coded:
system("g++ -C "+code_string+" -o run.out");
Beware, + is usually not string catenation. You might use snprintf(3) or asprintf(3) to build strings. Or use in C++ std::string. And if you use system(3) you should check its return code:
char cmdbuf[128];
snprintf(cmdbuf, sizeof(cmdbuf), "g++ -Wall -O %s -o run.out", tmpbuf);
fflush(NULL);
if (system(cmdbuf) != 0) {
    fprintf(stderr, "compilation %s failed\n", cmdbuf);
    exit(EXIT_FAILURE);
}
BTW, your example is wrong (missing <stdio.h>); it is C code, not C++ code. It should be
echo -e '#include <stdio.h>\nint main(){printf("Hello World"); return 0;}' \
| gcc -x c -Wall -O -o myprog -
PS. My answer is focused on Linux, but you could adapt it for your OS.

Creating shared object from static library whose object files were linked with -fPIC

For a project we are trying to create a shared object file that exports a set of functions specified in libname.exports. Of course we know that the object files from which the .so file gets linked have to be created using -fPIC, so that has been taken care of. We then combined the object files into an archive named libname.a. This should now be the basis for the .so file to be created - or so was the idea.
We're passing libname.exports to --retain-symbols-file, so the expected behavior was that the linker would pull in any of the .a members relevant to those symbols.
However, the output of nm libname.so is empty. On the other hand grepping in nm libname.a shows that the relevant symbols named in libname.exports exist in the .a members.
Now I stumbled over --whole-archive and thus adjusted the command line from:
gcc -o libname.so -shared -Wl,-z,defs,--retain-symbols-file,libname.exports,-L. libname.a -lc
to:
gcc -o libname.so -shared -Wl,-z,defs,--retain-symbols-file,libname.exports,-L.,--whole-archive,libname.a,--no-whole-archive -lc
which appears to have the intended effect of including all the object files from the .a (although the size difference is strange). However, nm libname.so still gives me no output.
How can I use the archive file to create a shared object with only the symbols named in libname.exports visible?
Unfortunately How to create a shared object file from static library doesn't quite answer my question.
Note: before you ask. The idea behind using the .a file as input is because it makes it easy to use a pattern rule in GNUmakefile and because the .a file with -fPIC is needed regardless. There shouldn't be any difference between linking the individual object files versus the archive file.
You could use the -u SYMBOL option to force objects to be read in from an archive.
% cc -c -fPIC a.c
% nm a.o
00000000 T a
% ar rv liba.a a.o
ar: creating liba.a
a - a.o
% gcc -o liba.so -shared -u a liba.a
% nm liba.so | awk '$3 == "a" { print }'
0000042c T a
One thing to check would be the spellings of the symbols being specified with --retain-symbols-file. For example, symbol names in objects compiled from C++ code are likely to be mangled:
% g++ -c -fPIC a.c
% nm a.o | awk '$2 == "T" { print }'
00000000 T _Z1av

How can I tell, with something like objdump, if an object file has been built with -fPIC?

How can I tell, with something like objdump, if an object file has been built with -fPIC?
The answer depends on the platform. On most platforms, if output from
readelf --relocs foo.o | egrep '(GOT|PLT|JU?MP_SLOT)'
is empty, then either foo.o was not compiled with -fPIC, or foo.o doesn't contain any code where -fPIC matters.
I just had to do this on a PowerPC target to find which shared object (.so) was being built without -fPIC. What I did was run readelf -d libMyLib1.so and look for TEXTREL. If you see TEXTREL, one or more source files that make up your .so were not built with -fPIC. You can substitute readelf with elfdump if necessary.
E.g.,
[user@host lib]$ readelf -d libMyLib1.so | grep TEXT # Bad, not -fPIC
0x00000016 (TEXTREL)
[user@host lib]$ readelf -d libMyLib2.so | grep TEXT # Good, -fPIC
[user@host lib]$
And to help people searching for solutions, the error I was getting when I ran my executable was this:
root@target:/# ./program: error while loading shared libraries: /usr/lib/libMyLib1.so: R_PPC_REL24 relocation at 0x0fc5987c for symbol 'memcpy' out of range
I don't know whether this info applies to all architectures.
Source: blogs.oracle.com/rie
I assume, what you really want to know is whether or not a shared library is composed from object files compiled with -fPIC.
As already mentioned, if there are TEXTRELs, then -fPIC was not used.
There is a great tool called scanelf which can show you the symbols that caused .text relocations.
More information can be found at HOWTO Locate and Fix .text Relocations TEXTRELs.
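For reference, what readelf -d | grep TEXTREL is looking for can be sketched in C. This is only an illustration, assuming Linux's <elf.h>, a 64-bit ELF file, and a dynamic section that has already been read into memory. The function name dynamic_has_textrel is made up. Besides the DT_TEXTREL entry shown above, the ELF format also allows a DF_TEXTREL bit inside DT_FLAGS, so the sketch checks both.

```c
#include <elf.h>
#include <stdbool.h>
#include <stddef.h>

/* Scans the entries of a 64-bit ELF dynamic section for either of the two
 * markers that announce text relocations. Stops at DT_NULL, which terminates
 * the section, or after `count` entries. */
bool dynamic_has_textrel(const Elf64_Dyn *dyn, size_t count)
{
    for (size_t i = 0; i < count && dyn[i].d_tag != DT_NULL; i++) {
        if (dyn[i].d_tag == DT_TEXTREL)
            return true;
        if (dyn[i].d_tag == DT_FLAGS && (dyn[i].d_un.d_val & DF_TEXTREL))
            return true;
    }
    return false;
}
```

In practice readelf or scanelf is the more convenient tool. The sketch only shows which entries they report.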
-fPIC means that the code is able to execute at addresses different from the address it was compiled for.
To do that, the disassembly will look something like this:
call get_offset_from_compilation_address
get_offset_from_compilation_address: pop ax
sub ax, ax, &get_offset_from_compilation_address
Now ax holds an offset that has to be added to every memory access:
load bx, [ax + var_address]
readelf -a *.so | grep Flags
Flags: 0x50001007, noreorder, pic, cpic, o32, mips32
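This should work most of the time on targets such as MIPS (as in the output above), where the ELF header flags record pic/cpic. On many other architectures the Flags field carries no such information.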
Another way to tell whether your program was built with the -fPIC option,
provided that the code was compiled with the -g3 -gdwarf-2 options
(other GCC debug formats may also contain the macro information).
Note that the $'..' syntax below assumes bash:
echo $' main() { printf("%d\\n", \n#ifdef __PIC__\n__PIC__\n#else\n0\n#endif\n); }' | gcc -fPIC -g3 -gdwarf-2 -o test -x c -
readelf --debug-dump=macro ./test | grep __PIC__
This method works because the GCC manual states that __PIC__ is defined to 1 if -fpic is used, and
to 2 if -fPIC is used.
Checking the GOT, as in the answers above, is the better way, because I guess the prerequisite of -g3 -gdwarf-2 is seldom met.
From The Linux Programming Interface:
On Linux/x86-32, it is possible to create a shared library using
modules compiled without the -fPIC option. However, doing so loses
some of the benefits of shared libraries, since pages of program text
containing position-dependent memory references are not shared across
processes. On some architectures, it is impossible to build shared
libraries without the -fPIC option.
In order to determine whether an existing object file has been
compiled with the -fPIC option, we can check for the presence of the
name _GLOBAL_OFFSET_TABLE_ in the object file’s symbol table, using
either of the following commands:
$ nm mod1.o | grep _GLOBAL_OFFSET_TABLE_
$ readelf -s mod1.o | grep _GLOBAL_OFFSET_TABLE_
Conversely, if either of the following equivalent commands yields any
output, then the specified shared library includes at least one object
module that was not compiled with -fPIC:
$ objdump --all-headers libfoo.so | grep TEXTREL
$ readelf -d libfoo.so | grep TEXTREL
However, neither the quotation above nor any other answer to this question works on x86_64.
What I've observed on my x86_64 Ubuntu machine is that GCC generates a PIC .o whether or not -fPIC is specified (probably because the distribution's GCC builds position-independent executables by default). That is, for
gcc -g -Wall -c -o my_so.o my_so.c # has _GLOBAL_OFFSET_TABLE_
gcc -g -Wall -fPIC -c -o my_so_fpic.o my_so.c # has _GLOBAL_OFFSET_TABLE_
readelf -s my_so.o > 1.txt && readelf -s my_so_fpic.o > 2.txt && diff 1.txt 2.txt
the diff shows no difference, and both my_so.o and my_so_fpic.o can be used to create a shared library.
In order to generate a non-PIC object file, I found a GCC flag called -fno-pic in the first comment of How to test whether a Linux binary was compiled as position independent code?.
This works,
gcc -g -Wall -fno-pic -c -o my_so_fnopic.o my_so.c # no _GLOBAL_OFFSET_TABLE_
and
gcc -g -Wall -shared -o libdemo.so my_so_fnopic.o
gives error:
/usr/bin/ld: my_so_fnopic.o: relocation R_X86_64_32 against `.rodata' can not be used when making a shared object; recompile with -fPIC
collect2: error: ld returned 1 exit status
so a shared library cannot be created from a non-PIC .o.

Resources