How to set the dynamic linker path for a shared library? - gcc

I want to compile a shared library with an .interp segment.
#include <stdio.h>
int foo(int argc, char** argv) {
    printf("Hello, world!\n");
    return 0;
}
I'm using the following commands.
gcc -c -o test.o test.c
ld --dynamic-linker=blah -shared -o test.so test.o
I end up without an INTERP segment, as if I never passed the --dynamic-linker=blah option (checked with readelf -l test.so). When building an executable, the linker processes the option correctly and puts an INTERP segment in the program header. How do I make it work for shared libraries too?

ld doesn't include a .interp section if -shared is used, as @MichaelDillon already said. You can, however, provide this section yourself.
const char interp_section[] __attribute__((section(".interp"))) = "/path/to/dynamic/linker";
The line above will save the string "/path/to/dynamic/linker" in the .interp section using GCC attributes.
If you're trying to build a shared object that's also executable by itself, check this question out. It has a more comprehensive description of the process.

The INTERP segment only goes into binaries which need to load the ELF interpreter (ld.so) in the first place. A shared library has no INTERP segment because the ELF interpreter is already loaded before the shared library is loaded.

On most Linux systems, ldconfig is run at every system boot. It reads /etc/ld.so.conf to find the directories that contain shared libraries. The file /etc/ld.so.cache holds the mapping from shared library sonames to full library paths. Consider reading this article: http://grahamwideman.wordpress.com/2009/02/09/the-linux-loader-and-how-it-finds-libraries/#comment-164

Related

cannot execute binary file: Exec format error 64bits

I'm using the Windows Subsystem for Linux, which works well on another computer.
I have a 64-bit file: ./ensembles.o: ELF 64-bit LSB relocatable, x86-64, version 1 (SYSV), not stripped
uname -m: x86_64
I tried both the gcc and the clang compilers; both fail.
Even this C code doesn't work:
#include <stdio.h>
#include <stdlib.h>
#include "sac.h"
#include "type_ensemble.h"
#include "operations_ens.h"
int main(int argc, char ** argv) {
}
The error: -bash: ./ensembles.o: cannot execute binary file: Exec format error
My Makefile:
ensembles.o : ensembles.c sac.h type_ensemble.h operations_ens.h
	gcc -c ensembles.c
operation_ens.o : operations_ens.c operations_ens.h
	gcc -c operations_ens.c
sac.o : sac.c sac.h
	gcc -c sac.c
main: ensembles.o operation_ens.o sac.o
	gcc -o main ensembles.o operation_ens.o sac.o
A file of type ELF 64-bit LSB relocatable is a file of ELF type ET_REL, which is not directly executable. It's commonly called an object file or .o file, and it is an input file for the link editor.
You need to link it (either with the gcc or the ld command) to produce an executable. If you are invoking gcc, you must not pass options like -r or -c, or otherwise GCC will not produce an executable.
In the makefile you quote, only the first target will be executed by make because it is the default target. Try moving the rule for main to the beginning of the file, or add a rule
all: main
at the beginning. You can also invoke make main to request building the main file explicitly.

extract debug symbol info from elf binary

Let's have a look at this basic C program:
#include <stdio.h>
int myadd(int a, int b);
int myadd(int a, int b)
{
    return a+b;
}
int main(int argc, char *argv[])
{
    int res = myadd(argc,3);
    printf("%d\n",res);
    return 0;
}
What I want is to understand how debug symbol files work.
If I compile this way:
gcc test.c
I can see debug symbols in gdb:
gdb ./a.out
(gdb) disassemble myadd
Dump of assembler code for function myadd:
0x00000000000006b0 <+0>: push %rbp
That's fine!
Now, if I run:
gcc -s test.c
Here is what I get in gdb:
(gdb) disassemble myadd
No symbol table is loaded. Use the "file" command.
That's fine too, because I have stripped the symbols with the -s gcc option.
Now, I want to "split" my ELF executable into 2 files:
- A stripped ELF executable
- An external debug symbol file
Here is what I read in some tutorials:
gcc test.c
objcopy --only-keep-debug a.out a.dbg
strip ./a.out
But now, if I want to run gdb, I tell gdb to look inside ./a.dbg for debug symbols:
gdb -s ./a.dbg a.out
And gdb cannot resolve the myadd function:
(gdb) disassemble myadd
No symbol table is loaded. Use the "file" command.
And this is what I do not understand: why does gdb not resolve the myadd function?
Thanks
If I compile this way: gcc test.c I can see debug symbols in gdb
You do not see debug symbols here, only the symbol table (which is distinct from debug symbols).
To see debug symbols, compile with gcc -g test.c.
gdb -s a.dbg a.out
The problem here is that when GDB sees the "unadorned" a.out, it throws away the previously specified symbol file (a.dbg) and replaces it with the (fully stripped) a.out. You want:
gdb -s a.dbg -e a.out
Update:
What does a "stripped" file mean: is it a file without a symbol table or without debugging information?
On ELF platforms, the state of the file with respect to "strip"-ness is not binary: you can remove individual sections of the file, and depending on exactly what you stripped, your debugging experience will be affected to varying degree.
This command: strip -g a.out removes all .debug_* sections, leaving you without instruction address to source file and line mapping, and without stack address to local variables mapping. However, the symbol table remains in the binary, and can be used to provide instruction address to function name mapping.
This command: strip a.out removes all .debug_* sections, as well as .symtab and .strtab (which together form the symbol table). Such a binary is often called "fully stripped".
One could also use objcopy to remove individual sections. It is possible to remove source file/line info (the .debug_line section) without removing variable info, and vice versa.
I have tried eu-unstrip ./a.out ./a.dbg but the resulting ./a.out file does not contain debug information.
You may be hitting a bug in eu-unstrip, perhaps this one.

ld fails to find the entry symbol main when linking

I am writing a simple hello world bootloader in C with inline assembly using this article. Nothing fancy, no kernel loading and other advanced topics. Just a plain old "hello world" message.
Here are my files:
boot.c
/* generate 16-bit code */
__asm__(".code16\n");
/* jump boot code entry */
__asm__("jmpl $0x0000, $main\n");
/* user defined function to print series of characters terminated by null
character */
void printString(const char* pStr) {
    while (*pStr) {
        __asm__ __volatile__ (
            "int $0x10" : : "a"(0x0e00 | *pStr), "b"(0x0007)
        );
        ++pStr;
    }
}
void main() {
    /* calling the printString function passing string as an argument */
    printString("Hello, world!");
}
boot.ld
ENTRY(main);
SECTIONS
{
    . = 0x7C00;
    .text : AT(0x7C00)
    {
        *(.text);
    }
    .sig : AT(0x7DFE)
    {
        SHORT(0xaa55);
    }
}
I then ran the following commands: (different from the first article; adapted from another StackOverflow article as the commands in the first article won't work for me)
gcc -std=c99 -c -g -Os -march=i686 -m32 -ffreestanding -Wall -Werror boot.c -o boot.o
ld -static -T boot.ld -m elf_i386 -nostdlib --nmagic -o boot.elf boot.o
The first line compiles successfully, but I get errors upon executing the second line:
ld: warning: cannot find entry symbol main; defaulting to 0000000000007c00
boot.o:boot.c:(.text+0x2): undefined reference to 'main'
boot.o: In function 'main':
C:(...)/boot.c:16: undefined reference to '__main'
C:(...)/boot.c:16:(.text.startup+0xe): relocation truncated to fit: DISP16 against undefined symbol '__main'
What's wrong? I use Windows 10 x64 with the gcc compiler that comes with Dev-C++.
I'd suggest an i686-elf cross compiler rather than a native Windows compiler and toolchain. I think part of your problem is peculiarities related to the Windows i386pe format.
The .sig section is likely not being written at all, since that unknown section probably isn't marked as allocatable data. As a result, the signature isn't written to the final binary file. It is also possible that the virtual memory address (VMA) is not being set in boot.ld, so the boot signature may not be advanced into the last 2 bytes of the 512-byte sector. Also, with the Windows format, read-only data is placed in sections whose names start with .rdata. You'll want to make sure those are included after the data section and before the boot signature. Otherwise the linker places those unprocessed input sections at the end, beyond the boot signature.
Assuming you have made the changes you mentioned in the comments about the extra underscores, your files may work this way:
boot.ld:
ENTRY(__main);
SECTIONS
{
    . = 0x7C00;
    .text : AT(0x7C00)
    {
        *(.text);
    }
    .data :
    {
        *(.data);
        *(.rdata*);
    }
    .sig 0x7DFE : AT(0x7DFE) SUBALIGN(0)
    {
        SHORT(0xaa55);
    }
}
The commands to compile/link and adjust the .sig section to be a regular readonly allocated data section would look like:
gcc.exe -std=c99 -c -g -Os -march=i686 -m32 -ffreestanding -Wall -Werror boot.c -o boot.o
ld.exe -mi386pe -static -T boot.ld -nostdlib --nmagic -o boot.elf boot.o
# This adjusts the .sig section attributes and updates boot.elf
objcopy --set-section-flags .sig=alloc,contents,load,data,readonly boot.elf boot.elf
# Convert to binary
objcopy -O binary boot.elf boot.bin
Other Observations
Your use of __asm__(".code16\n"); will not generate usable code for a bootloader. You'll want to use the experimental pseudo 16-bit code generation that forces the assembler to modify instructions to be compatible with 32-bit code but encoded to be usable in 16-bit real mode. You can do this by using __asm__(".code16gcc\n"); at the top of each C/C++ file.
This tutorial has some bad advice. The global level basic assembly statement that does the JMP to main may be relocated to somewhere other than the beginning of the bootloader (some optimization levels may cause this). The startup code doesn't set ES, DS, CS to 0x0000, nor does it set the SS:SP stack segment and pointer. This can cause problems.
If you are trying to run from a USB drive on real hardware, you may find you need a Boot Parameter Block. This Stackoverflow Answer I wrote discusses this issue and a possible workaround under Real Hardware / USB / Laptop Issues.
Note: The only useful code that GCC currently generates is 32-bit code that can run in 16-bit real mode. This means that you can't expect this code to run on a processor earlier than a 386 like the 80186/80286/8086 etc.
My general recommendation is to not create bootloaders with GCC unless you know what you are really doing and understand all the nuances involved. Writing it in assembly is probably a much better idea.
If you want a C/C++ compiler that generates true 16-bit code you may wish to look at OpenWatcom

how is ld-linux.so* itself linked and loaded?

Just out of curiosity, how is the Linux dynamic linker/loader ld-linux.so* itself linked and loaded?
The above screenshot shows that file and ldd seem to give contradictory results: one says statically linked, the other says dynamically linked.
Then how is the loader itself loaded?
ld-linux.so* doesn't depend on any other libraries. It is runnable by itself once loaded into memory.
ldd is a script: it loads the object file via the loader, and the loader checks whether the object is dynamically or statically linked. Try this:
LD_TRACE_LOADED_OBJECTS=1 /lib64/ld-linux-x86-64.so.2 /lib64/ld-linux-x86-64.so.2
file reads the magic number or ELF header to figure out whether the object is dynamically or statically linked, so it may report a different result from ldd.
IMO, ld-linux.so is statically linked, because it doesn't have an .interp section, which all dynamically linked objects must have.
@Zang MingJie
Your answer helped me a lot, but the following words might confuse some people:
IMO, ld-linux.so is static linked, because it doesn't have an .interp section which all dynamically linked object must have.
We should divide "all dynamically linked objects" into two kinds. One kind, called a 'shared object', is generated like this:
gcc -c -o test.o test.c -fPIC
ld -o test.so test.o -shared
The other kind is called a 'dynamically linked executable file':
gcc -c -o test.o test.c -fPIC
ld -o test.so test.o
Two points are important:
1. A shared object has no INTERP segment, while a dynamically linked executable file has one.
2. The Linux kernel doesn't care whether the ELF header marks a file as EXEC or DYN. It first searches for the INTERP segment; if there is none, it mmap()s every LOAD-type segment and passes control to eheader->e_entry, regardless of whether it is loading an executable file or a shared object.
Since ld-linux.so is an ordinary shared object, it's not strange that it has no INTERP segment. Nor is it strange that it can be run as an executable: every shared object can.
Write code like this:
void foobar(void){ while(1); }
Compile it into a shared object (using the command line above).
Run it:
gdb ./test.so
You will get a process stuck in an infinite loop.
Use Ctrl-C to interrupt it. You will see the following (this needs gcc's -g option):
Program received signal SIGINT, Interrupt.
foobar (void) at test.c:1
1 while(1);
(gdb)
You can go further:
(gdb) p $eip
$1 = (void (*)()) 0x80000183 <foobar+3>
(gdb)
If you are familiar with the Linux kernel, you should know that 0x80000000 is related to the value of the kernel variable 'mmap_min_addr'. Because test.so is a shared object, its load address is zero, so the kernel picked a default virtual address for it, namely 0x80000000, not 0x08048000.
I don't know how I got so off-topic ...

Linker and dependencies

For a Linux/g++ project, I have a helper library ("libcommon.a") that I wrote that is used in two different programs ("client" and "server"). One particular source file among several, oshelper.cpp, has a set of unrelated utility functions:
// header file
#ifndef OSHELPER_H
#define OSHELPER_H
size_t GetConsoleWidth();
uint32_t GetMillisecondCounter();
#endif
// -----------------------------------------
// Code file
#include "commonincludes.h"
#include "oshelper.h"
size_t GetConsoleWidth()
{
    struct winsize ws = {};
    ioctl(0, TIOCGWINSZ, &ws);
    return ws.ws_col;
}
uint32_t GetMillisecondCounter()
{
    timespec ts = {};
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint32_t)(ts.tv_nsec / 1000000 + ts.tv_sec * 1000);
}
Both programs link to the library that contains these functions (libcommon.a or -lcommon).
"client" program calls both the GetConsoleWidth and GetMillisecondCounter function. And since GetMillisecondCounter ultimately depends on a call to "clock_gettime", -lrt is a required parameter to the linker such that librt is linked in. This is expected.
"server" just calls GetConsoleWidth. It never calls GetMillisecondCounter. But without "-lrt" being passed, the linker complains about the unresolved reference to clock_gettime. Which is obviously fixed by passing -lrt to g++. And then "ldd server" shows that librt.so.1 is still a runtime dependency. So the linkage to clock_gettime clearly did not get optimized away.
But when I move the implementation of GetConsoleWidth into a separate source file (still part of libcommon.a), the linker stops complaining about the unresolved reference to clock_gettime and no longer insists that I pass in -lrt.
It's as if the g++ linker can only cull out unused object files, but not unused function calls.
What's going on here?
Update: the compiler and linker command lines are as basic as they can get:
g++ -c oshelper.cpp
g++ -c someotherfile.cpp
etc...
ar -rv libcommon.a oshelper.o someotherfile.o ...
g++ server.cpp -lcommon -lpthread -o server
g++ client.cpp -lcommon -lrt -o client
Without special options, a .o file is linked in its entirety, and thus all of its dependencies are required.
You need to build the compilation units in the library with compiler flags that put each function and data item in its own section, and then call the linker with an option that "garbage collects" sections, so that only code referenced directly or indirectly from main (and maybe ctors/dtors) is linked in.
I don't know the exact commands, but look at the GCC options -ffunction-sections and -fdata-sections and the linker option --gc-sections (passed through g++ as -Wl,--gc-sections). Very old GCC releases also offered -fvtable-gc.
