Why does the system loader load strings into the read/execute segment?

I compiled this simple program on Ubuntu 15.10 x64:
char *glob = "hello strings";
int main(void) {
}
Using gdb, I found that "hello strings" is located in the read/execute segment, together with the .text section.
I already know that some strings belonging to the ELF headers are located in the code segment, but why are user-defined strings located in the same segment as the code?
I also tried enlarging the string to 0x1000 bytes, to check whether placing small strings next to the code is a compiler optimization, but the larger string still ends up in the same segment as the code.
This is interesting to me because, intuitively, strings should be readable but not executable.

By default, the Linux linker creates two PT_LOAD segments: a read-only one, containing .text, and a writable one containing .data (initialized data).
Your string literal resides in the .rodata section.
Which of the above two segments would you like this section to go into? If it is to be read-only, then it will have to go into the same segment that contains .text, and that segment must be executable. If the section is to go into the writable segment, it will not have execute permissions, but then you would be able to write to these strings, and they would not be shared when multiple instances of your binary run.
You can see the assignment of sections to segments in the output of readelf -l a.out.
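The choice described above can be sketched in a few lines of Python. This is only a simplified model of the classic two-segment layout, not what any particular linker does internally: the flag values are the standard ELF SHF_* bits, while the function name default_segment and the section table are made up for illustration.

```python
# ELF section flag bits (standard SHF_* values).
SHF_WRITE = 0x1
SHF_ALLOC = 0x2
SHF_EXECINSTR = 0x4

def default_segment(sh_flags):
    """Pick one of the two classic PT_LOAD segments for a section (simplified model)."""
    if not sh_flags & SHF_ALLOC:
        return None      # not part of the run-time image at all
    if sh_flags & SHF_WRITE:
        return "RW"      # writable, non-executable segment
    return "R E"         # read-only segment, shared with .text

# Illustrative section table; the flags are the usual ones for these sections.
sections = {
    ".text":   SHF_ALLOC | SHF_EXECINSTR,
    ".rodata": SHF_ALLOC,
    ".data":   SHF_ALLOC | SHF_WRITE,
    ".bss":    SHF_ALLOC | SHF_WRITE,
    ".symtab": 0,
}

layout = {name: default_segment(flags) for name, flags in sections.items()}
for name, segment in layout.items():
    print(f"{name:8} -> {segment}")
```

With only two segments to choose from, a section that is allocated but not writable has nowhere to go except next to .text, which is why the string literal shows up as executable.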
With older versions of GCC (before 4.0), you can see that adding -fwritable-strings moves the string into .data, and so into the non-executable segment.
The gold linker supports the --rosegment flag, which moves all read-only non-executable sections into a separate PT_LOAD segment. But that increases the number of mmap and mprotect calls that the dynamic loader has to perform, and so is not the default.

Related

Can gcc be configured to compile position-independent code for the code but position-dependent code for the data?

I'm trying to build bootable code for an ARM M7-based embedded system that is able to execute in place at two different locations in the QSPI, so that if one version gets corrupted, the backup version of the image can be executed in a different place.
Compiling with -fpic seems to produce a relocatable code image that is (nearly) able to execute in both places fine. However, the problem is that the data/bss the code refers to is also getting offset by the same amount - that is, the compiler is assuming that the .data and .bss segments live immediately after the .text segment, which isn't true for XIP embedded systems (where the RAM is separate).
As a result, if the original binary was linked to run at 0x60000000 (using a fixed RAM area at 0x20000000) but is then executed in place at 0x60100000 instead, the RAM addresses will be shifted by 0x100000 as well (i.e. to 0x20100000), which isn't what I want at all.
Clearly, what I'd like to do is to modify gcc's behaviour so that references to the code (executing in place in two different places in the QSPI) are position-independent, while references to the .data/bss segments (in a fixed position in RAM) are position-dependent (as per normal).
Is this something that gcc can be tweaked to achieve (e.g. by some obscure linker attribute flag)? Or is this just out of its reach? Thanks!

DOS inserting segment addresses at runtime

I noticed a potential bug in some code I'm writing.
I thought that if I used mov ax, seg segment_name, the program might be non-portable and only work on one machine in a specific configuration, since the load location can vary from machine to machine.
So I decided to disassemble a program containing just that one instruction on two different machines running DOS and I found that the problem was magically solved.
Output of debug on machine one: 0C7A:014C B8BB0C MOV AX,0CBB
Output of debug on machine two: 06CA:014C B80B07 MOV AX,070B
After hex dumping the program I found that the unaltered bytes are actually B84200.
Manually inserting those bytes back into the program results in mov ax, 0042
So does the PE format store references to those instructions and update them at runtime?
As Peter Cordes noted, MS-DOS doesn't use the PE/COFF executable format that Windows uses. It has its own "MZ" executable format, named after the first two bytes of the executable, which identify it as being in this format.
The MZ format supports the use of multiple segments through a relocation table containing relocations. These relocations are just simple segment:offset values that indicate the location of 16-bit segment values that need to be adjusted based on where the executable was loaded in memory. MS-DOS performs these adjustments by simply adding the actual load segment of the program to the value contained in the executable. This means that without relocations applied the executable would only work if loaded at segment 0, which happens to be impossible.
Note this isn't just necessary for a program to work on multiple machines, it's also necessary for the same program to work reliably on the same machine. The load address can change based on various configuration details, as well as on other programs and drivers that have already been loaded into memory, so the load address of an MS-DOS executable is essentially unpredictable.
Working backwards from your example, we can tell where your example program was loaded into memory on both machines. Since 0042h was relocated into 0CBBh on the first machine and into 070Bh on the second machine, we know MS-DOS loaded your program on the two machines at segments 0C79h and 06C9h respectively:
0CBB - 0042 = 0C79
070B - 0042 = 06C9
From that we can determine that your example executable has the entry 0001:014D, or an equivalent segment:offset value, in its relocation table:
0C7A:014D - 0C79:0000 = 0001:014D
06CA:014D - 06C9:0000 = 0001:014D
This entry indicates the unrelocated location of the 16-bit immediate operand of the mov ax, seg segname instruction that needs adjusting.
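To make the arithmetic concrete, here is a small Python sketch of the fixup step. It is a model of what the loader does, not MS-DOS code: the function name apply_mz_relocations and the zero-padded image are invented for the example, and only the three instruction bytes, the relocation entry and the two load segments come from the question and the reasoning above.

```python
import struct

def apply_mz_relocations(image, relocs, load_seg):
    """Add load_seg to every 16-bit word named by a (segment, offset) entry."""
    image = bytearray(image)
    for seg, off in relocs:
        pos = seg * 16 + off    # linear offset inside the load image
        (value,) = struct.unpack_from("<H", image, pos)
        struct.pack_into("<H", image, pos, (value + load_seg) & 0xFFFF)
    return bytes(image)

# Hypothetical load image: zero padding with MOV AX,0042h (B8 42 00) at 0001:014C.
insn_pos = 0x0001 * 16 + 0x014C
image = bytearray(insn_pos + 3)
image[insn_pos:insn_pos + 3] = bytes.fromhex("B84200")

# The relocation entry points at the immediate operand, one byte past the opcode.
relocs = [(0x0001, 0x014D)]

machine_one = apply_mz_relocations(image, relocs, 0x0C79)
machine_two = apply_mz_relocations(image, relocs, 0x06C9)
print(machine_one[insn_pos:insn_pos + 3].hex().upper())   # B8BB0C
print(machine_two[insn_pos:insn_pos + 3].hex().upper())   # B80B07
```

The two printed byte strings match what debug showed on the two machines.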

VirtualAddress, LoadAddress, and PhysicalAddress in ELF file?

According to the ld manual on Output Section Description:
section [address] [(type)] :
[AT(lma)]
[ALIGN(section_align) | ALIGN_WITH_INPUT]
[SUBALIGN(subsection_align)]
[constraint]
{
output-section-command
output-section-command
...
} [>region] [AT>lma_region] [:phdr :phdr ...] [=fillexp] [,]
The address or >region stand for the VMA, i.e. the Virtual Memory Address of the output section.
The AT() or AT>lma_region stand for the LMA, i.e. the Load Memory Address of the output section.
To get a closer look, I used readelf -e to dump the section headers and program headers of a hello-world ELF file. The result is below:
My questions are:
Why is there no LMA in the dumped headers? How is the LMA represented in an ELF file?
What does the Addr column in the red rectangle mean? VMA?
What does the PhysAddr in the green rectangle mean?
ADD 1
So far, it seems that PhysAddr is the LMA.
Why is there no LMA in the dumped headers? How is the LMA represented in an ELF file?
First, there is no LMA header within an ELF file. It is actually quite simple: multiple sections in an ELF file are mapped into segments. If a section mapped into a segment is of a loadable type (PROGBITS, for example), and the segment it is mapped into is also of a loadable type (INTERP and LOAD, for example), then every such section within that segment is loaded into memory. Where? Simply at the VMA it was given. So no, there is no LMA in an ELF file: the LMA is represented by the VMA, provided the section's type and flags say it should be loaded.
What does the Addr column in the red rectangle mean?
This relates directly to your previous question. Yes, it does mean VMA. To explain this properly, we need to understand that the ELF format was designed for architectures that support some form of memory protection or memory segmentation.
You might want to give some sections special permissions. Instead of giving every section its own memory protection, you map multiple sections into a segment and give that single segment its own memory protection.
This creates the need to map sections into segments. How would the OS loader know how to map each section into a segment, and thereby give it the appropriate memory protection? By its address.
Each section is also given an address, and by those addresses, offsets and sizes the sections are mapped into a segment, which as a whole is allocated in memory and given memory protection rules that apply to all of its sections.
The only way the OS could know how to map these is by address, so yes: if the section is of a loadable type, its Addr means VMA (at least on modern systems that use virtual memory and don't abuse the ELF file).
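The "mapped by address" idea can be shown with a short Python sketch that works much like the "Section to Segment mapping" table printed by readelf -l: a section belongs to a segment when its address range lies inside the segment's range. All addresses and sizes below are made up, and the names Section, Segment and sections_in are only for this illustration.

```python
from collections import namedtuple

Section = namedtuple("Section", "name addr size")
Segment = namedtuple("Segment", "perms vaddr memsz")

def sections_in(segment, sections):
    """Names of the sections whose address range lies inside the segment."""
    end = segment.vaddr + segment.memsz
    return [s.name for s in sections
            if s.size > 0 and segment.vaddr <= s.addr and s.addr + s.size <= end]

# Hypothetical layout of a small non-PIE executable.
sections = [
    Section(".text",   0x400400, 0x200),
    Section(".rodata", 0x400600, 0x040),
    Section(".data",   0x601000, 0x010),
    Section(".bss",    0x601010, 0x020),
]
segments = [
    Segment("R E", 0x400000, 0x700),
    Segment("RW",  0x601000, 0x030),
]

mapping = {seg.perms: sections_in(seg, sections) for seg in segments}
for perms, names in mapping.items():
    print(perms, names)
```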
What does the PhysAddr mean?
As far as I know, PhysAddr is only relevant to older architectures in which physical addressing matters to user-space programs. This field should hold the actual physical address at which the segment would sit, yet on most modern systems it is simply ignored...
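For anyone who wants to look at the raw fields, here is a minimal Python sketch that unpacks one 64-bit little-endian program header (Elf64_Phdr) and prints the two addresses that readelf shows as VirtAddr and PhysAddr. The header bytes are built by hand with hypothetical embedded-style values, assuming a linker script that used AT() so the two addresses differ. GNU ld stores the LMA in p_paddr in that case, which matches the observation in the question's update.

```python
import struct

PHDR64 = struct.Struct("<IIQQQQQQ")   # Elf64_Phdr, little-endian, 56 bytes
PT_LOAD = 1

def parse_phdr64(raw):
    """Unpack one Elf64_Phdr into a dict keyed by the standard field names."""
    names = ("p_type", "p_flags", "p_offset", "p_vaddr",
             "p_paddr", "p_filesz", "p_memsz", "p_align")
    return dict(zip(names, PHDR64.unpack(raw)))

# Hypothetical data segment: runs from RAM (VMA) but is stored in flash (LMA).
raw = PHDR64.pack(PT_LOAD, 0x6, 0x2000,
                  0x20000000,          # p_vaddr
                  0x60002000,          # p_paddr
                  0x100, 0x180, 0x1000)

phdr = parse_phdr64(raw)
print(f"VirtAddr={phdr['p_vaddr']:#x} PhysAddr={phdr['p_paddr']:#x}")
```

On a typical Linux user-space binary the two fields are simply equal, which is why PhysAddr looks redundant there.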
I suggest you read this: http://flint.cs.yale.edu/cs422/doc/ELF_Format.pdf
Personally, back in the day when I was learning this, it helped me a lot and taught me a great deal about ELF files.
Hopefully I've helped you somehow! :)

ELF, PIE ASLR and everything in between, specifically within Linux

Before asking my question, I would like to cover a few technical details that I want to make sure I've got right:
A Position Independent Executable (PIE) is a program that would be able to execute regardless of which memory address it is loaded into, right?
ASLR (Address Space Layout Randomization) pretty much states that in order to keep addresses static, we would randomize them in some manner,
I've read that specifically within Linux and Unix based systems, implementing ASLR is possible regardless of if our code is a PIE, if it is PIE, all jumps, calls and offsets are relative hence we have no problem.
If it's not, code somehow gets modified and addresses are edited regardless of whether the code is an executable or a shared object.
Now this leads me to ask a few questions
If ASLR can be implemented for code that isn't PIE and is an executable, NOT a shared or relocatable object (I know how relocation works within relocatable objects!), how is it done? The ELF format should hold no section that states where the functions are within the code sections so that the kernel loader could modify them, right? ASLR should be a kernel functionality, so how on earth could it work for an executable containing, for example, these instructions?
pseudo code:
inc_eax:
add eax, 5
ret
main:
mov eax, 5
mov ebx, 6
call ABSOLUTE_ADDRES{inc_eax}
How would the kernel executable loader know how to change the
addresses if they aren't stored in some relocatable table within the ELF
file and aren't relative in order to load the executable into some
random address?
Let's say I'm wrong, and in order to implement ASLR you must have a
PIE executable. All segments are relative. How would one compile a
C++ OOP code and make it work, for example, if I have some instance
of a class using a pointer to a virtual table within its struct,
and that virtual table should hold absolute addresses, hence I
wouldn't be able to compile a pure PIE for C++ programs that have
usage of run time virtual tables, and again ASLR isn't possible....
I doubt that virtual tables would contain relative addresses and
there would be a different virtual table for each call of some
virtual function...
My last and least significant question is regarding ELF and PIE — is there some special way to detect an ELF executable is PIE? I'm familiar with the ELF format so I doubt that there is a way, but I might be wrong. Anyway, if there isn't a way, how does the kernel loader know if our executable is PIE hence it could use ASLR on it.
I've got this all messed up in my head and I'd love it if someone could help me here.
Your question appears to be a mish-mash of confusion and misunderstanding.
A Position Independent Executable (PIE) is a program that would be able to execute regardless of which memory address it is loaded into, right?
Almost. A PIE binary usually cannot be loaded into memory at an arbitrary address, as its PT_LOAD segments will have some alignment requirements (e.g. 0x400, or 0x10000). But it can be loaded, and will run correctly, if loaded into memory at an address satisfying the alignment requirements.
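As a rough illustration of the alignment constraint, the following Python sketch picks a random load base that is a multiple of a given alignment. It is not how the kernel actually chooses addresses. The function name pick_load_base, the address range and the alignment value are all assumptions made for the example.

```python
import random

def pick_load_base(lo, hi, align, rng=random):
    """Return a random base in [lo, hi) that is a multiple of align."""
    first = -(-lo // align)      # ceil(lo / align)
    last = (hi - 1) // align     # largest multiple index below hi
    if first > last:
        raise ValueError("no aligned address in range")
    return rng.randrange(first, last + 1) * align

# Hypothetical range and alignment; a seeded generator keeps the run repeatable.
rng = random.Random(0)
base = pick_load_base(0x10000001, 0x20000000, 0x10000, rng)
print(hex(base))
```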
ASLR (Address Space Layout Randomization) pretty much states that in order to keep addresses static we would randomize them in some manner,
I can't parse the above statement in any meaningful way.
ASLR is a technique for randomizing various parts of address space, in order to make "known address" attacks more difficult.
Note that ASLR predates PIE binaries, and does not in any way require PIE. When ASLR was introduced, it randomized placement of stack, heap, and shared libraries. The placement of (non-PIE) main executable could not be randomized.
ASLR has been considered a success, and therefore extended to also support PIE main binary, which is really a specially crafted shared library (and has ET_DYN file type).
call ABSOLUTE_ADDRES{inc_eax}
how would the kernel executable loader know how to change the addresses if they aren't stored in some relocatable table
Simple: on x86, there is no direct near call instruction that takes an ABSOLUTE_ADDRESS -- all direct near calls are relative to the address of the next instruction.
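A quick Python sketch shows why a relative call needs no fixing up. It encodes the 5-byte call rel32 form (opcode E8) for a call site and a target, then encodes it again after sliding both by the same amount. The addresses and the slide are hypothetical, and encode_call_rel32 is just a helper name for this example.

```python
import struct

def encode_call_rel32(call_addr, target_addr):
    """Encode a 5-byte x86 `call rel32` (opcode E8) located at call_addr."""
    rel = target_addr - (call_addr + 5)   # displacement from the next instruction
    return b"\xE8" + struct.pack("<i", rel)

# Hypothetical addresses for inc_eax and for the call inside main.
inc_eax, call_site = 0x1000, 0x1010
at_link_address = encode_call_rel32(call_site, inc_eax)

# Slide the whole image by an arbitrary amount: the encoding does not change.
slide = 0x7F0000000000
after_sliding = encode_call_rel32(call_site + slide, inc_eax + slide)

print(at_link_address.hex(), after_sliding.hex())
```

Because only the distance between the call and its target is encoded, the same bytes are correct wherever the image ends up.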
2 ... I wouldn't be able to compile a pure PIE for C++ programs that have usage of run time virtual tables, and again ASLR isn't possible..
A PIE binary requires relocation, just like a shared library. Virtual tables in PIE binaries work exactly the same way they work in shared libraries: ld-linux.so.2 updates the GOT (global offset table) before transferring control to the PIE binary.
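The kind of fixup the dynamic loader performs can be sketched in Python as follows. This models a "base plus addend" relocation (the rule used by R_X86_64_RELATIVE) applied to two pointer slots. Whether a given slot lives in the GOT or in the virtual table itself depends on the toolchain, and the offsets, the base address and the helper name apply_relative_relocs are all invented for the example.

```python
import struct

def apply_relative_relocs(memory, base, relocs):
    """For each (r_offset, r_addend), store base + r_addend at memory[r_offset]."""
    for r_offset, r_addend in relocs:
        struct.pack_into("<Q", memory, r_offset, base + r_addend)

# Hypothetical image: a two-slot table at offset 0x40 whose entries should
# point at functions located at image offsets 0x10 and 0x20.
memory = bytearray(0x60)
relocs = [(0x40, 0x10), (0x48, 0x20)]

base = 0x555555554000      # load address chosen at run time (made up here)
apply_relative_relocs(memory, base, relocs)

slots = struct.unpack_from("<2Q", memory, 0x40)
print([hex(s) for s in slots])
```

The table ends up holding absolute addresses, but they are only written once the load address is known, so the file on disk stays position independent.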
3 ... is there some special way to detect an ELF executable is PIE
Simple: a PIE binary has ELF file type set to ET_DYN (a non-PIE binary will have type ET_EXEC). If you run file a.out on a PIE executable, you'll see that it's a "shared library".
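Checking the file type by hand takes only a few lines of Python. The sketch below reads e_type from the first 18 bytes of an ELF header. The constants are the standard ET_EXEC and ET_DYN values, the two headers are synthetic, and the helper names are made up. Keep in mind that ET_DYN is also the type of ordinary shared libraries, so this test alone cannot tell a PIE from a .so file.

```python
import struct

ET_EXEC, ET_DYN = 2, 3

def elf_type(header):
    """Return e_type from the first 18 bytes of an ELF file."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    endian = "<" if header[5] == 1 else ">"   # EI_DATA: 1 = little-endian
    (e_type,) = struct.unpack_from(endian + "H", header, 16)
    return e_type

def looks_like_pie(header):
    """True when the file type is ET_DYN (PIE executables and shared libraries)."""
    return elf_type(header) == ET_DYN

# Synthetic 64-bit little-endian headers, just enough bytes to reach e_type.
ident = b"\x7fELF" + bytes([2, 1, 1, 0]) + bytes(8)
pie_header = ident + struct.pack("<H", ET_DYN)
exec_header = ident + struct.pack("<H", ET_EXEC)

print(looks_like_pie(pie_header), looks_like_pie(exec_header))
```

For a real file, pass the first 18 bytes read from it in binary mode.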

What's the difference between .rdata and .idata segments?

I noticed in IDA that the PE file which I analyze has not only the .rdata section but also .idata. What's the difference?
.rdata is for const data. It is the read-only version of the .data segment.
.idata holds the import directory (.edata for exports). It is used by EXEs and DLLs to designate the imported and exported functions. See the PE format specification (http://msdn.microsoft.com/library/windows/hardware/gg463125) for details.
Summarizing typical segment names:
.text: Code
.data: Initialized data
.bss: Uninitialized data
.rdata: Const/read-only (and initialized) data
.edata: Export descriptors
.idata: Import descriptors
.reloc: Relocation table (for code instructions with absolute addressing when the module could not be loaded at its preferred base address)
.rsrc: Resources (icon, bitmap, dialog, ...)
.tls: __declspec(thread) data (Fails with dynamically loaded DLLs -> hard to find bugs)
As Martin Rosenau mentions, the segment names are only typical. The true segment type is specified in the segment header or is defined by usage of data stored in the segment.
In fact, the names of the segments are ignored by Windows.
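What the loader does look at is the Characteristics field of each section header. As a small Python sketch, the following decodes a few of those flag bits. The bit values are the documented IMAGE_SCN_* constants, while the three sample values are only typical ones a linker might emit and the function name describe is made up.

```python
# A few IMAGE_SCN_* bits from the PE section header Characteristics field.
FLAGS = [
    (0x00000020, "code"),
    (0x00000040, "initialized data"),
    (0x00000080, "uninitialized data"),
    (0x20000000, "execute"),
    (0x40000000, "read"),
    (0x80000000, "write"),
]

def describe(characteristics):
    """List the properties encoded in a section's Characteristics value."""
    return [label for bit, label in FLAGS if characteristics & bit]

# Typical values only; the names on the left play no part in the decoding.
typical = {".text": 0x60000020, ".rdata": 0x40000040, ".data": 0xC0000040}
for name, value in typical.items():
    print(name, describe(value))
```

Renaming a section changes nothing in this output, which is the point: the flags carry the meaning, not the name.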
There are linkers that use different segment names and it is even possible to store the Import Descriptors, Export descriptors, Resources etc. in the ".text" segment instead of using separate segments.
However it seems to be simpler to create separate sections for such metadata so most linkers will use separate sections.
This means: Sections ".idata", ".edata", ".rsrc", ... do not contain program data (although some of their names end with "data") but they contain meta information that is used by the operating system. The ".rsrc" section for example holds information about the icon that is shown when looking at the executable file in the Explorer.
".idata" contains information about all DLL files required by the program.

Resources