How is ld-linux.so* itself linked and loaded? - linux-kernel

Just out of curiosity, how is the Linux dynamic linker/loader ld-linux.so* itself linked and loaded?
file and ldd seem to give contradictory results: one says statically linked, the other says dynamically linked.
Then how is the loader itself loaded?

ld-linux.so* doesn't depend on any other libraries. It is runnable by itself once it is loaded into memory.
ldd is a script; it loads the object file via the loader, and the loader checks whether the object is dynamically or statically linked. Try this:
LD_TRACE_LOADED_OBJECTS=1 /lib64/ld-linux-x86-64.so.2 /lib64/ld-linux-x86-64.so.2
file reads the magic number and the ELF header to figure out whether the object is dynamically or statically linked, so it may report something different from ldd.
IMO, ld-linux.so is statically linked, because it doesn't have an .interp section, which all dynamically linked objects must have.
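To make the header inspection mentioned above concrete, here is a minimal Python sketch that reads the e_type field and looks for a PT_INTERP program header in a little-endian ELF64 image. It is not the actual logic of file or ldd; the names describe_elf64 and make_image are invented for this illustration, and make_image builds a tiny synthetic image so the example does not depend on any file on your system.

```python
import struct

ET_NAMES = {2: "EXEC", 3: "DYN"}           # e_type values from the ELF specification
PT_INTERP = 3                               # program header type naming the interpreter
EHDR = struct.Struct("<16sHHIQQQIHHHHHH")   # ELF64 file header, 64 bytes
PHDR = struct.Struct("<IIQQQQQQ")           # ELF64 program header, 56 bytes


def describe_elf64(data):
    """Return (e_type name, PT_INTERP path or None) for a little-endian ELF64 image."""
    fields = EHDR.unpack_from(data, 0)
    ident, e_type = fields[0], fields[1]
    if ident[:4] != b"\x7fELF" or ident[4] != 2 or ident[5] != 1:
        raise ValueError("not a little-endian ELF64 image")
    e_phoff, e_phentsize, e_phnum = fields[5], fields[9], fields[10]
    interp = None
    for i in range(e_phnum):
        p = PHDR.unpack_from(data, e_phoff + i * e_phentsize)
        p_type, p_offset, p_filesz = p[0], p[2], p[5]
        if p_type == PT_INTERP:
            # The segment contents are the NUL-terminated interpreter path.
            interp = data[p_offset:p_offset + p_filesz].rstrip(b"\0").decode()
    return ET_NAMES.get(e_type, str(e_type)), interp


def make_image(e_type, interp=None):
    """Build a synthetic ELF64 image (header plus optional PT_INTERP); demo only."""
    ident = b"\x7fELF" + bytes([2, 1, 1]) + bytes(9)
    path = (interp.encode() + b"\0") if interp else b""
    phnum = 1 if interp else 0
    ehdr = EHDR.pack(ident, e_type, 62, 1, 0, 64, 0, 0, 64, 56, phnum, 0, 0, 0)
    phdr = PHDR.pack(PT_INTERP, 4, 64 + 56, 0, 0, len(path), len(path), 1) if interp else b""
    return ehdr + phdr + path


print(describe_elf64(make_image(3)))
print(describe_elf64(make_image(2, "/lib64/ld-linux-x86-64.so.2")))
```

To try it on a real file, you could pass open(path, "rb").read() instead of the synthetic image, assuming the file is a little-endian ELF64 object.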

@Zang MingJie
Your answer helped me a lot, but the following words might confuse some people:
IMO, ld-linux.so is statically linked, because it doesn't have an .interp section, which all dynamically linked objects must have.
We should divide "all dynamically linked objects" into two kinds. One kind, called a 'shared object', is generated like this:
gcc -c -o test.o test.c -fPIC
ld -o test.so test.o -shared
The other kind is called a 'dynamically linked executable':
gcc -c -o test.o test.c -fPIC
ld -o test.so test.o
Two points are important:
1. A shared object has no INTERP segment, while a dynamically linked executable has one.
2. The Linux kernel doesn't care whether the ELF header marks the file as EXEC or DYN. It first looks for an INTERP segment; if there is none, it mmap()s every LOAD segment and passes control to eheader->e_entry, regardless of whether it is loading an executable or a shared object.
Since ld-linux.so is an ordinary shared object, it's not strange that it has no INTERP segment, nor that it can be run as an executable. Every shared object can.
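The two points above can be sketched as a small decision function. This is only a simplified Python model of the description in this answer, not kernel code; Segment and first_code_to_run are names invented for the illustration, and the addresses are made up.

```python
from dataclasses import dataclass
from typing import Optional

PT_LOAD, PT_INTERP = 1, 3  # program header types from the ELF specification


@dataclass
class Segment:
    p_type: int
    vaddr: int = 0
    interp_path: Optional[str] = None  # set only for PT_INTERP segments


def first_code_to_run(e_entry, segments):
    """Model of the decision described above: interpreter first, or the file's own entry."""
    for seg in segments:
        if seg.p_type == PT_INTERP:
            # The file names an interpreter: that program is mapped too and runs first.
            return ("interpreter", seg.interp_path)
    # No INTERP segment: the LOAD segments are mapped and control goes to e_entry.
    return ("entry", e_entry)


# Hypothetical program header tables for the two kinds of object.
shared_object = [Segment(PT_LOAD, vaddr=0)]
dynamic_exe = [
    Segment(PT_INTERP, interp_path="/lib64/ld-linux-x86-64.so.2"),
    Segment(PT_LOAD, vaddr=0x400000),
]

print(first_code_to_run(0x183, shared_object))
print(first_code_to_run(0x401000, dynamic_exe))
```

In this model ld-linux.so behaves like shared_object: nothing else has to be loaded before its own entry point runs.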
Write code like this:
void foobar(void){ while(1); }
Compile it into a shared object (using the command lines above).
Run it:
gdb ./test.so
You will get a process stuck in an endless loop.
Use Ctrl-C to interrupt it. You will see (this needs gcc's -g option):
Program received signal SIGINT, Interrupt.
foobar (void) at test.c:1
1 while(1);
(gdb)
You can go further:
(gdb) p $eip
$1 = (void (*)()) 0x80000183 <foobar+3>
(gdb)
If you are familiar with the Linux kernel, you should know that 0x80000000 is related to the value of the kernel variable 'mmap_min_addr'. Because test.so is a shared object, its load address is zero, so the kernel picked a default virtual address for it: 0x80000000, not the 0x08048000 usually seen for i386 executables.
I don't know how I got so off-topic ...

Related

What do link editor (LD) params mean?

I am writing a NASM (Netwide Assembler) program and, for various reasons, I need to use some functions written in C. So I tried to link the compiled C object files with the assembled objects using the ld link editor. I did it this way:
ld -m elf_x86_64 -lc --dynamic-linker=/lib64/ld-linux-x86-64.so.2 object_files -o program
It refused to link and run until I had picked out the necessary parameters. Now it works as expected with this parameter set, but I don't understand the meaning of -lc and --dynamic-linker=/lib64/ld-linux-x86-64.so.2. What do they do?
-lc: link against the C standard library.
--dynamic-linker=/lib64/ld-linux-x86-64.so.2: set the program loader (the ELF interpreter). Linux ELF binaries have a field for this.
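AFAIK the latter is needed for any dynamically linked binary: if the path stored in that field does not point to a working loader, the program won't execute. A fully static binary has no such field and does not need the option.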
man ld lists its parameters.

Dynamically load code on embedded target

I have an application which runs on a bare-metal target and has the following structure:
main.c
service.c/.h
It's compiled to an ELF executable (system.elf) using the standard gcc -c, ld sequence. I use the linker to generate a map file showing the addresses of all symbols.
Now, without re-flashing my system, I need to add extra functionality with a custom run-time loader. Remember, this is bare metal with no OS.
I'd like to:
compile extra.c, which uses APIs defined in service.h (and somehow link it against the existing service.o/system.elf)
copy the resulting executable to my SDRAM at runtime and jump to it
have the loaded code run and access the exported symbols from service.c as expected
I thought I'd be able to reuse the map file to link extra.o against system.elf, but this didn't work:
ld -o extraExe extra.o system.map
Does gcc or ld have some mode for this kind of late linking? If not, how can I achieve the dynamic code loading outlined above?
You can use the '-R filename' or '--just-symbols=filename' option of ld. It reads symbol names and their addresses from filename, but does not relocate it or include it in the output. This allows your output file to refer symbolically to absolute memory locations defined in your system.elf program
(see ftp://ftp.gnu.org/old-gnu/Manuals/ld-2.9.1/html_node/ld_3.html).
So here filename will be 'system.elf'. You can compile extra.c with GCC as usual, including service.h, but without linking, to generate 'extra.o'; then call ld as below:
ld -R"system.elf" -o"extra.out" extra.o
'extra.out' will then have your symbols resolved. You can use objdump to compare the contents of 'extra.out' and 'extra.o'.
Note that you can always pass the start address of your program to ld (e.g. -defsym _TEXT_START_ADDR=0xAAAA0123), as well as the start addresses of other memory sections such as bss and data (i.e. -Tbss, -Tdata).
Be careful to use valid addresses that do not conflict with your 'system.elf', as ld will not report an error for that. You can define new areas for the loaded code, data and bss in your original linker script, rebuild system.elf, and then point the start addresses at those areas when linking 'extra.o'.
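Since ld will not flag address conflicts here, a small check of your own before linking may help. The following Python sketch compares the regions planned for the loaded code against the regions already used by system.elf. All region names and addresses are made up for illustration; in practice they would come from your map file and linker script.

```python
def overlaps(a, b):
    """True if half-open address ranges a=(start, size) and b=(start, size) intersect."""
    return a[0] < b[0] + b[1] and b[0] < a[0] + a[1]


def find_conflicts(existing, new):
    """Return sorted (new_name, existing_name) pairs whose address ranges collide."""
    return sorted(
        (n, e)
        for n, n_range in new.items()
        for e, e_range in existing.items()
        if overlaps(n_range, e_range)
    )


# Hypothetical layout: name -> (start address, size in bytes).
system_regions = {".text": (0x08000000, 0x20000), ".data": (0x20000000, 0x4000)}
extra_ok = {"extra.text": (0x60000000, 0x1000), "extra.data": (0x60001000, 0x400)}
extra_bad = {"extra.text": (0x20003000, 0x2000)}

print(find_conflicts(system_regions, extra_ok))   # []
print(find_conflicts(system_regions, extra_bad))  # [('extra.text', '.data')]
```

An empty result only means the ranges you listed do not collide; it says nothing about stack, heap or peripherals unless you add them to the table as well.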

Named common block in a shared library

I am encountering a problem when I include a Fortran
subroutine in a shared library. This subroutine has a
named common block.
I have a Fortran main program that uses this common block
and links with the shared library.
The behavior is that variables in the common block set in
either the subroutine or main program are not shared between
the two.
I am using gfortran 4.9.3 under MinGW on Windows. Here are the pieces of
my very simple example.
Main program:
program mainp
common/whgc/ivar
ivar = 23
call sharedf
end
Subroutine:
subroutine sharedf
common/whgc/ivar
print *, 'ivar=', ivar
end
Makefile:
FC = gfortran
FFLAGS = -g
all: shltest.dll mainp.exe
shltest.dll: sharedf.o
	$(FC) -shared -o shltest.dll sharedf.o
mainp.exe: mainp.o shltest.dll
	$(FC) -o mainp.exe mainp.o shltest.dll
clean:
	rm *.o mainp.exe shltest.dll
When mainp.exe is run, it prints ivar = 0 instead of the correct ivar = 23.
Here are the results of some experimentation I did with nm.
nm -g mainp.o shows:
...
00000004 C _whgc_
nm on sharedf.o shows the same.
nm -g shltest.dll shows:
...
71446410 B _whgc_
nm -g mainp.exe shows:
...
00406430 B _whgc_
This is the only _whgc_ symbol in mainp.exe.
However, when I run mainp.exe in gdb and set break points in both
mainp and sharedf, I can print the address of ivar at each break point. The addresses
are not the same.
From the behavior it seems clear that GNU ld is not correctly
matching the _whgc_ symbols but I'm unclear about what options
to pass either in the shared library build or the final link to
make it do so?
(Please don't suggest alternatives to common blocks. In my real
application I am dealing with legacy code that uses common blocks.)
EDIT:
I tried my example on Linux/x86 and there the behavior is correct.
Of course on Linux the shared library and executable are ELF format
objects and on Windows/MinGW the format is PE/COFF.

gcc/ld: undefined reference to unused function

I'm using gcc 4.3.4 and ld 2.20.51 in Cygwin under Windows 7. Here's a simplified version of my problem:
foo.o contains function foo_bar() which calls bar() in bar.o
bar.o contains function bar()
main.c calls functions in foo.o, but foo_bar() is not in the call chain
If I try to compile main.c and link it to foo.o, I get an undefined reference to _foo_bar error from ld. As you can see from my Makefile excerpt below, I've tried using flags for putting each function in its own section and having the linker discard unused sections.
COMPILE_CYGWIN = gcc -iquote$(INCDIR)
COMPILE = $(COMPILE_CYGWIN) -g -MMD -MP -Wall -ffunction-sections -Wl,-gc-sections $(DEFINE)
main_OBJECTS = main.o foo.o
main.exe : $(main_OBJECTS)
	$(COMPILE) -o main.exe $(main_OBJECTS)
The function foo_bar() is a short function that provides a connection between two networking layers in a protocol stack. Some programs don't need it, so they won't link in the other object files related to the upper layer of the stack. It's a small function, and it seems inappropriate to put it into its own .o file.
I don't understand why ld throws the error -- nothing is calling foo_bar(), so there's no need to include bar() in the final executable. A coworker has just told me that ld is not a "smart linker", so maybe what I'm trying to do isn't possible?
Unless the linker is from Cyberdyne Systems it has no way to know exactly which functions will actually be called. It only knows which ones are referenced. Even Skynet's linker can't predict what run-time decisions will be made or what will happen if you load a module dynamically at run-time and it starts calling various global functions [1].
So, if you link in module m and it references function f, you will need to link with whatever module has f.
[1] This problem is related to the Halting Problem and has been proven undecidable.
I hit a similar issue and found this page:
http://lists.gnu.org/archive/html/bug-gnu-utils/2004-09/msg00098.html
Highlights:
The GNU linker still works at .o file granularity.
gcc pulls in foo.o and then finds that bar() is undefined.
You'd better put foo_bar() into another .o file.
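The object-file granularity described in these answers can be modelled in a few lines of Python. This is a toy model of symbol resolution, not how ld is implemented; the symbol sets below are assumptions reconstructed from the question, and undefined_after_link is a name made up for the sketch.

```python
def undefined_after_link(objects):
    """Link every listed object whole and return the symbols nobody defines.

    objects maps an object file name to {"defines": set, "references": set}.
    Whether a referencing function is ever called plays no role.
    """
    defined = set().union(*(o["defines"] for o in objects.values()))
    referenced = set().union(*(o["references"] for o in objects.values()))
    return sorted(referenced - defined)


# Assumed symbol tables for the files in the question.
objects = {
    "main.o": {"defines": {"main"}, "references": {"foo"}},
    "foo.o": {"defines": {"foo", "foo_bar"}, "references": {"bar"}},
}
print(undefined_after_link(objects))  # ['bar']

# Adding bar.o to the link satisfies the reference.
objects["bar.o"] = {"defines": {"bar"}, "references": set()}
print(undefined_after_link(objects))  # []
```

Moving foo_bar() into its own object file, as suggested above, corresponds to taking both foo_bar and its reference to bar out of the foo.o entry.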

How to set the dynamic linker path for a shared library?

I want to compile a shared library with an .interp segment.
#include <stdio.h>
int foo(int argc, char** argv) {
printf("Hello, world!\n");
return 0;
}
I'm using the following commands.
gcc -c -o test.o test.c
ld --dynamic-linker=blah -shared -o test.so test.o
I end up without an INTERP segment, as if I never passed the --dynamic-linker=blah option. Check with readelf -l test.so. When building an executable, the linker processes the option correctly and puts an INTERP segment in the program header. How do I make it work for shared libraries too?
ld doesn't include a .interp section if -shared is used, as @MichaelDillon already said. You can however provide this section yourself.
const char interp_section[] __attribute__((section(".interp"))) = "/path/to/dynamic/linker";
The line above will save the string "/path/to/dynamic/linker" in the .interp section using GCC attributes.
If you're trying to build a shared object that's also executable by itself, check this question out. It has a more comprehensive description of the process.
The INTERP segment only goes into binaries which need to load the ELF interpreter (ld.so) in the first place. A shared library has no INTERP segment because the ELF interpreter is already loaded before the shared library is loaded.
On most Linux systems ldconfig is run at every system boot; it reads /etc/ld.so.conf to find the directories that contain shared libraries. The file /etc/ld.so.cache maps shared library sonames to the libraries' full paths. Consider reading this article: http://grahamwideman.wordpress.com/2009/02/09/the-linux-loader-and-how-it-finds-libraries/#comment-164

Resources