Linux's security measures against executing shellcode - linux-kernel

I'm learning the basics of computer security and I'm trying to execute some shellcode I've written. I followed the steps given here
http://dl.packetstormsecurity.net/papers/shellcode/own-shellcode.pdf
http://webcache.googleusercontent.com/search?q=cache:O3uJcNhsksAJ:dl.packetstormsecurity.net/papers/shellcode/own-shellcode.pdf+own+shellcode&cd=1&hl=nl&ct=clnk&gl=nl
$ cat pause.s
xor %eax,%eax
mov $29,%al
int $0x80
$ as -o pause.o pause.s
$ ld -o pause pause.o
ld: warning: cannot find entry symbol _start; defaulting to <<some address here>>
$ ./pause
^C
$ objdump -d ./pause
pause: file format elf64-x86_64
Disassembly of section .text:
08048054 <.text>:
8048054: 31 c0 xor %eax,%eax
8048056: b0 1d mov $0x1d,%al
8048058: cd 80 int $0x80
$
Since I got my pause program to work, I just copied the bytes from the objdump output into a C file.
test.c:
int main()
{
char s[] = "\x31\xc0\xb0\x1d\xcd\x80";
(*(void(*)())s)();
}
But this produces a segfault. Now, this can only be due to security measures of Arch Linux (?). So how can I get this to work?

The page s lives in isn't mapped with execute permissions. Since you're on x86_64 you definitely have NX support in hardware. By default these days code and data live in very separate pages, with data not having the execute permission.
You can work around this with either mmap() or mprotect() to allocate or alter pages to have the PROT_EXEC permission.
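A minimal sketch of the mprotect() route, assuming a POSIX system that lets an anonymous mapping be made executable (some hardened configurations refuse this). The helper name run_code is hypothetical. It copies the bytes into a fresh page, flips that page from writable to executable, and calls it:

```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS under strict -std= modes */
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Copy `len` bytes of machine code into a fresh mapping and call it as
 * int (*)(void). Returns -1 if the mapping or the permission change
 * fails. run_code is a hypothetical helper name, not an existing API. */
int run_code(const unsigned char *code, size_t len)
{
    /* Start with a writable, non-executable anonymous mapping. */
    void *page = mmap(NULL, len, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED)
        return -1;

    memcpy(page, code, len);

    /* Drop write permission and add execute permission. */
    if (mprotect(page, len, PROT_READ | PROT_EXEC) != 0) {
        munmap(page, len);
        return -1;
    }

    /* POSIX systems allow this data-to-function pointer conversion. */
    int (*fn)(void) = (int (*)(void))page;
    int result = fn();

    munmap(page, len);
    return result;
}
```

On x86 or x86_64, passing the six bytes b8 2a 00 00 00 c3 (mov $42,%eax; ret) should make run_code return 42. The pause bytes from the question contain no ret, so they would need one appended before being run this way.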

You can also use a #define to define your shellcode. The preprocessor then substitutes the string literal directly at its point of use in main, so the bytes live in the program's read-only data instead of on the stack:
#define SHELLCODE "\x31\xc0\xb0\x1d\xcd\x80"
int main()
{
(*(void(*)())SHELLCODE)();
}
The older style of writing shellcode doesn't work on newer systems because of security measures.
You will also probably have to compile with stack protection turned off:
gcc -z execstack -fno-stack-protector shellcode.c -o shellcode
Here is a fully working example that uses the exit system call, which I've tested on a 3.2.0.3 x86_64 kernel:
#include <stdio.h>
#define SHELLCODE "\x48\xc7\xc0\x3c\x00\x00\x00\x48\xc7\xc7\xe7\x03\x00\x00\x0f\x05"
int main(void)
{
int (*function)();
// cast shellcode as a function
function = (int(*)())SHELLCODE;
// execute shellcode function
(int)(*function)();
return 0;
}
The shellcode uses 64-bit registers, so it won't work on a 32-bit machine.
To verify that the code works, you can test it with strace:
strace shellcode
execve("./shellcode", ["shellcode"], [/* 38 vars */]) = 0
....
munmap(0x7ffff7fd5000, 144436) = 0
_exit(999) <---- we passed 999 to exit, our shellcode works!
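To see where the 60 and the 999 come from, the shellcode bytes can be decoded by hand. The sketch below, with hypothetical helper names, reads the two 32-bit little-endian immediates out of the byte string and also shows that the parent process only sees the low 8 bits of the status, even though strace reports the full argument:

```c
#include <stdint.h>

/* The 16 shellcode bytes from the example above, split by instruction. */
const unsigned char exit_shellcode[] = {
    0x48, 0xc7, 0xc0, 0x3c, 0x00, 0x00, 0x00, /* mov $60, %rax  (exit)  */
    0x48, 0xc7, 0xc7, 0xe7, 0x03, 0x00, 0x00, /* mov $999, %rdi (status) */
    0x0f, 0x05                                /* syscall                 */
};

/* Read a 32-bit little-endian immediate starting at p. */
uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Exit status as reported to the parent: only the low 8 bits survive. */
int visible_exit_status(uint32_t status)
{
    return (int)(status & 0xff);
}
```

read_le32(exit_shellcode + 3) yields 60 and read_le32(exit_shellcode + 10) yields 999. visible_exit_status(999) is 231, which is what echo $? would print after running the program.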

Related

Do dynamic libraries have the same virtual memory address in all programs?

When a library is dynamically linked to a program does it have the same address in that program as in any other program?
In my head I imagined that each process gets the whole address space and that everything in the process (including dynamic libraries that are already in memory) gets mapped to semi-random parts of it because of ASLR.
But I did a short experiment that seems to imply that the addresses of libraries that are in memory are fixed across different processes and thus reusable across programs. Is that correct?
I wrote two short C programs which used the "sleep" function. In one I printed out the address of the sleep function and in the second I assigned a function pointer to that address. I ran them both and the sleep function worked in both.
#include <stdio.h>
#include <unistd.h>
int main()
{
while(1)
{
printf("%s\n", "hi");
sleep(2);
printf("pointer to sleep: %p\n", (void *)sleep);
}
}
#include <stdio.h>
#include <unistd.h>
#define sleepagain ((void (*)(int))0x7fff7652e669) //addr of sleep from first program
int main()
{
while(1)
{
printf("%s\n", "test");
sleepagain(2);
}
}
I wasn't sure what this would show but what it actually showed was a) the address was the same every time I ran the first program and b) that sleep still functioned when I ran the second.
I think I understand how this works but I am curious if it has to work the way it does and what are the reasons behind it?
To follow up on the answer I already got: when I took a look with otool -IvV, I got:
a.out:
Indirect symbols for (__TEXT,__stubs) 2 entries
address index name
0x0000000100000f62 2 _printf
0x0000000100000f68 3 _sleep
Indirect symbols for (__DATA,__nl_symbol_ptr) 2 entries
address index name
0x0000000100001000 4 dyld_stub_binder
0x0000000100001008 ABSOLUTE
Indirect symbols for (__DATA,__got) 1 entries
address index name
0x0000000100001010 3 _sleep
Indirect symbols for (__DATA,__la_symbol_ptr) 2 entries
address index name
0x0000000100001018 2 _printf
0x0000000100001020 3 _sleep
Which is also what the indirect address was in lldb. The address was the address of sleep itself:
Process 11209 launched: 'stuff/a.out' (x86_64)
hi
Process 11209 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
frame #0: 0x00007fff7652e669 libsystem_c.dylib`sleep
libsystem_c.dylib`sleep:
-> 0x7fff7652e669 <+0>: push rbp
0x7fff7652e66a <+1>: mov rbp, rsp
0x7fff7652e66d <+4>: push rbx
0x7fff7652e66e <+5>: sub rsp, 0x28
Target 0: (a.out) stopped.
For some additional info:
$ otool -hv a.out
Mach header
magic cputype cpusubtype caps filetype ncmds sizeofcmds flags
MH_MAGIC_64 X86_64 ALL LIB64 EXECUTE 15 1296 NOUNDEFS DYLDLINK TWOLEVEL PIE
On macOS, many system libraries are part of the dyld shared cache. There's a machine-wide mapping. So, those libraries end up at the same address in all processes of the same architecture (32- or 64-bit).
The location of the dyld shared cache is randomized at system boot. So, library addresses will be the same from process to process until you reboot.
Not all system libraries are part of the cache, only the ones that Apple deems to be commonly loaded.
Libraries of your own or from third parties will be loaded at random locations each time they're loaded, assuming they are position-independent.
Try looking at the output from vmmap -v <pid>. Look for the line with "machine-wide VM submap" and those that follow.

How does the `asm()` function work in the C language?

I am learning operating system development and am, of course, a beginner. I would like to build my system in a real-mode environment, which is a 16-bit environment, using the C language.
In C, I used asm() to switch the generated code to 16 bit as follows:
asm(".code16")
which tells GCC's assembler to generate 16-bit executables (not exactly, though).
Question:
Suppose I have two header files head1.h and head2.h and a main.c file. The contents of main.c file are as follows:
asm(".code16");
#include<head1.h>
#include<head2.h>
int main(){
return 0;
}
Now, since I started my code with the directive to generate a 16-bit executable and then included head1.h and head2.h, do I need to do the same in every header file I create, or is it sufficient to add the line asm(".code16"); once?
OS: Ubuntu
Compiler: Gnu CC
To answer your question: It suffices for the asm block to be present at the beginning of the translation unit.
So putting it once at the beginning will do.
But you can do better: you can avoid it altogether and use the -m16 command line option (available from 5.2.0) instead.
The effect of -m16 and .code16 is to make 32-bit code executable in real mode; it is not to produce real-mode code.
Look
16.c
int main()
{
return 4;
}
Extracting the raw .text segment
>gcc -c -m16 16.c
>objcopy -j .text -O binary 16.o 16.bin
>ndisasm 16.bin
we get
00000000 6655 push ebp
00000002 6689E5 mov ebp,esp
00000005 6683E4F0 and esp,byte -0x10
00000009 66E800000000 call dword 0xf
0000000F 66B804000000 mov eax,0x4
00000015 66C9 o32 leave
00000017 66C3 o32 ret
Which is just 32-bit code filled with operand size prefixes.
On a real pre-386 machine this won't work, as the 66h prefix byte is an undefined opcode there.
There are old 16-bit compilers, like Turbo C [1], that handle real-mode applications properly.
Alternatively, switch to protected mode as soon as possible or consider using UEFI.
[1] It is available online. This compiler is as old as me!
There is no need to add asm(".code16") to either head1.h or head2.h.
The main reason is how the C preprocessor works: it inserts the contents of head1.h and head2.h into main.c in place of the #include lines.
Please check How `#include' Works for further information.
Hope it helps!
Best regards,
Miguel Ángel

How to generate assembly code with gcc that can be compiled with nasm [duplicate]

This question already has answers here:
How to generate a nasm compilable assembly code from c source code on Linux?
(3 answers)
Closed 2 years ago.
I am trying to learn assembly language as a hobby and I frequently use gcc -S to produce assembly output. This is pretty straightforward, but I fail to assemble the output. I was just curious whether this can be done at all. I tried both the standard assembly output and Intel syntax using -masm=intel. Neither can be assembled with nasm and linked with ld.
Therefore I would like to ask whether it is possible to generate assembly code, that can be then compiled.
To be more precise I used the following C code.
>> cat csimp.c
int main (void){
int i,j;
for(i=1;i<21;i++)
j= i + 100;
return 0;
}
Generated assembly with gcc -S -O0 -masm=intel csimp.c and tried to compile with nasm -f elf64 csimp.s and link with ld -m elf_x86_64 -s -o test csimp.o. The output I got from nasm reads:
csimp.s:1: error: attempt to define a local label before any non-local labels
csimp.s:1: error: parser: instruction expected
csimp.s:2: error: attempt to define a local label before any non-local labels
csimp.s:2: error: parser: instruction expected
This is most probably due to broken assembly syntax. My hope is that I will be able to fix this without having to manually correct the output of gcc -S.
Edit:
I was given a hint that my problem is solved in another question; unfortunately, after testing the method described there, I was not able to produce nasm assembly format. You can see the output of objconv below.
Therefore I still need your help.
>>cat csimp.asm
; Disassembly of file: csimp.o
; Sat Jan 30 20:17:39 2016
; Mode: 64 bits
; Syntax: YASM/NASM
; Instruction set: 8086, x64
global main: ; **the ':' should be removed !!!**
SECTION .text ; section number 1, code
main: ; Function begin
push rbp ; 0000 _ 55
mov rbp, rsp ; 0001 _ 48: 89. E5
mov dword [rbp-4H], 1 ; 0004 _ C7. 45, FC, 00000001
jmp ?_002 ; 000B _ EB, 0D
?_001: mov eax, dword [rbp-4H] ; 000D _ 8B. 45, FC
add eax, 100 ; 0010 _ 83. C0, 64
mov dword [rbp-8H], eax ; 0013 _ 89. 45, F8
add dword [rbp-4H], 1 ; 0016 _ 83. 45, FC, 01
?_002: cmp dword [rbp-4H], 20 ; 001A _ 83. 7D, FC, 14
jle ?_001 ; 001E _ 7E, ED
pop rbp ; 0020 _ 5D
ret ; 0021 _ C3
; main End of function
SECTION .data ; section number 2, data
SECTION .bss ; section number 3, bss
Apparent solution:
I made a mistake when cleaning up the output of objconv. I should have run:
sed -i "s/align=1//g ; s/[a-z]*execute//g ; s/: *function//g; /default *rel/d" csimp.asm
All steps can be condensed in a bash script
#! /bin/bash
a=$( echo $1 | sed "s/\.c//" ) # strip the file extension .c
# compile binary with minimal information
gcc -fno-asynchronous-unwind-tables -s -c ${a}.c
# convert the object file to nasm format
./objconv/objconv -fnasm ${a}.o
# remove unnecessary objconv information
sed -i "s/align=1//g ; s/[a-z]*execute//g ; s/: *function//g; /default *rel/d" ${a}.asm
# run nasm for 64-bit binary
nasm -f elf64 ${a}.asm
# link --> see comment of MichaelPetch below
ld -m elf_x86_64 -s ${a}.o
Running this code I get the ld warning:
ld: warning: cannot find entry symbol _start; defaulting to 0000000000400080
The executable produced in this manner crashes with a segmentation fault. I would appreciate your help.
The difficulty I think you hit with the entry point error was attempting to use ld on an object file containing the entry point named main while ld was looking for an entry point named _start.
There are a couple of considerations. First, if you are linking with the C library for the use of functions like printf, linking will expect main as the entry point, but if you are not linking with the C library, ld will expect _start. Your script is very close, but you will need some way to differentiate which entry point you need to fully automate the process for any source file.
For example, the following is a conversion using your approach of a source file including printf. It was converted to nasm using objconv as follows:
Generate the object file:
gcc -fno-asynchronous-unwind-tables -s -c struct_offsetof.c -o s3.obj
Convert with objconv to nasm format assembly file
objconv -fnasm s3.obj
(note: my version of objconv added DOS line endings -- probably an option missed, I just ran it through dos2unix)
Using a slightly modified version of your sed call, tweak the contents:
sed -i -e 's/align=1//g' -e 's/[a-z]*execute//g' -e \
's/: *function//g' -e '/default *rel/d' s3.asm
(note: if no standard library functions, and using ld, change main to _start by adding the following expressions to your sed call)
-e 's/^main/_start/' -e 's/[ ]main[ ]*.*$/ _start/'
(there are probably more elegant expressions for this, this was just for example)
Compile with nasm (replacing original object file):
nasm -felf64 -o s3.obj s3.asm
Using gcc for link:
gcc -o s3 s3.obj
Test
$ ./s3
sizeof test : 40
myint : 0 0
mychar : 4 4
myptr : 8 8
myarr : 16 16
myuint : 32 32
You basically can't, at least not directly. GCC can output assembly in Intel syntax, but NASM/MASM/TASM each have their own flavor of Intel syntax. They are largely compatible, but there are some differences that the assembler may not understand, so it fails to assemble the file.
The closest thing is probably having objdump show the assembly in Intel format:
objdump -d $file -M intel
Peter Cordes points out in the comments that assembler directives will still target GAS, so they won't be recognized by NASM, for example. They typically have the same name, but GAS directives start with a dot, as in .section .text (vs. section .text).
There are many different assembly languages. For each CPU there may be several syntaxes (e.g. "Intel syntax", "AT&T syntax"), and then completely different directives, preprocessors, etc. on top of that. It adds up to about 30 different dialects of assembly language for 32-bit 80x86 alone.
GCC is only able to generate one dialect of assembly language for 32-bit 80x86. This means it can't work with NASM, FASM, MASM, TASM, A86/A386, etc. It only works with GAS (and possibly YASM in its "AT&T mode").
Of course you can compile code with 3 different compilers into 3 different types of assembly, then write 3 more different pieces of code (in 3 more different types of assembly) yourself; then assemble all of that (each with their appropriate assembler) into object files and link all the object files together.

basic assembly not working on Mac (x86_64+Lion)?

Here is the code (exit.s):
.section .data,
.section .text,
.globl _start
_start:
movl $1, %eax
movl $32, %ebx
syscall
When I execute "as exit.s -o exit.o && ld exit.o -o exit -e _start && ./exit"
the result is "Bus error: 10" and the output of "echo $?" is 138.
I also tried the example from the correct answer to this question: Process command line in Linux 64 bit
but I still get "bus error"...
First, you are using the old 32-bit Linux kernel calling convention on Mac OS X, which absolutely doesn't work.
Second, syscalls in Mac OS X are structured in a different way - they all have a leading class identifier and a syscall number. The class can be Mach, BSD or something else (see here in the XNU source) and is shifted 24 bits to the left. Normal BSD syscalls have class 2 and thus begin from 0x2000000. Syscalls in class 0 are invalid.
As per §A.2.1 of the SysV AMD64 ABI, which Mac OS X also follows, the syscall id (together with its class on XNU!) goes in %rax (or in %eax, as the high 32 bits are unused on XNU). The first argument goes in %rdi, the next in %rsi, and so on. %rcx is used by the kernel and its value is destroyed, which is why all functions in libc.dylib save it into %r10 before making syscalls (similarly to the kernel_trap macro from syscall_sw.h).
Third, code sections in Mach-O binaries are called __text and not .text as in Linux ELF and also reside in the __TEXT segment, collectively referred as (__TEXT,__text) (nasm automatically translates .text as appropriate if Mach-O is selected as target object type) - see the Mac OS X ABI Mach-O File Format Reference. Even if you get the assembly instructions right, putting them in the wrong segment/section leads to bus error. You can either use the .section __TEXT,__text directive (see here for directive syntax) or you can also use the (simpler) .text directive, or you can drop it altogether since it is assumed if no -n option was supplied to as (see the manpage of as).
Fourth, the default entry point for the Mach-O ld is called start (although, as you've already figured out, it can be changed via the -e linker option).
Given all the above you should modify your assembler source to read as follows:
# You could also add one of the following directives for completeness
# .text
# or
# .section __TEXT,__text
.globl start
start:
movl $0x2000001, %eax
movl $32, %edi
syscall
Here it is, working as expected:
$ as -o exit.o exit.s; ld -o exit exit.o
$ ./exit; echo $?
32
Adding more explanation on the magic number: I made the same mistake by using the Linux syscall number in my NASM code.
From the xnu kernel sources in osfmk/mach/i386/syscall_sw.h (search SYSCALL_CLASS_SHIFT).
/*
* Syscall classes for 64-bit system call entry.
* For 64-bit users, the 32-bit syscall number is partitioned
* with the high-order bits representing the class and low-order
* bits being the syscall number within that class.
* The high-order 32-bits of the 64-bit syscall number are unused.
* All system classes enter the kernel via the syscall instruction.
*/
Syscalls are partitioned:
#define SYSCALL_CLASS_NONE 0 /* Invalid */
#define SYSCALL_CLASS_MACH 1 /* Mach */
#define SYSCALL_CLASS_UNIX 2 /* Unix/BSD */
#define SYSCALL_CLASS_MDEP 3 /* Machine-dependent */
#define SYSCALL_CLASS_DIAG 4 /* Diagnostics */
As we can see, the tag for BSD system calls is 2. So that magic number 0x2000000 is constructed as:
// 2 << 24
#define SYSCALL_CONSTRUCT_UNIX(syscall_number) \
((SYSCALL_CLASS_UNIX << SYSCALL_CLASS_SHIFT) | \
(SYSCALL_NUMBER_MASK & (syscall_number)))
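To make the arithmetic concrete, here is a small C sketch of the same construction. The constant and function names are my own (hypothetical). The shift of 24 and the class value 2 come from the text above, while the 24-bit number mask is an assumption that the class occupies the top 8 bits of the 32-bit value:

```c
#include <stdint.h>

enum {
    XNU_CLASS_SHIFT = 24, /* the class sits above bit 23 */
    XNU_CLASS_UNIX  = 2   /* Unix/BSD class */
};

/* Assumption: the low 24 bits hold the per-class syscall number. */
#define XNU_NUMBER_MASK 0x00FFFFFFu

/* Combine the BSD class tag with a syscall number. */
uint32_t xnu_unix_syscall(uint32_t number)
{
    return ((uint32_t)XNU_CLASS_UNIX << XNU_CLASS_SHIFT) |
           (number & XNU_NUMBER_MASK);
}

/* Recover the class tag from a combined value. */
uint32_t xnu_syscall_class(uint32_t value)
{
    return value >> XNU_CLASS_SHIFT;
}
```

xnu_unix_syscall(1) gives 0x2000001, the value loaded into %eax for exit in the corrected assembly earlier in this thread.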
As for why it uses the BSD class, probably Apple switched from a Mach kernel to a BSD kernel. Historical reasons.
Inspired by the original answer.

What is the -fPIE option for position-independent executables in gcc and ld?

How will it change the code, e.g. function calls?
PIE is to support address space layout randomization (ASLR) in executable files.
Before the PIE mode was created, the program's executable could not be placed at a random address in memory; only position-independent code (PIC) dynamic libraries could be relocated to a random offset. PIE works very much like PIC does for dynamic libraries; the difference is that a Procedure Linkage Table (PLT) is not created, and PC-relative relocation is used instead.
After enabling PIE support in gcc/linkers, the body of the program is compiled and linked as position-independent code. The dynamic linker does full relocation processing on the program module, just as it does for dynamic libraries. Any use of global data is converted to access via the Global Offset Table (GOT), and GOT relocations are added.
PIE is well described in this OpenBSD PIE presentation.
Changes to functions are shown in this slide (PIE vs PIC).
x86 pic vs pie
Local global variables and functions are optimized in pie
External global variables and functions are same as pic
and in this slide (PIE vs old-style linking)
x86 pie vs no-flags (fixed)
Local global variables and functions are similar to fixed
External global variables and functions are same as pic
Note that PIE may be incompatible with -static.
Minimal runnable example: GDB the executable twice
For those that want to see some action, let's see ASLR work on the PIE executable and change addresses across runs:
main.c
#include <stdio.h>
int main(void) {
puts("hello");
}
main.sh
#!/usr/bin/env bash
echo 2 | sudo tee /proc/sys/kernel/randomize_va_space
for pie in no-pie pie; do
exe="${pie}.out"
gcc -O0 -std=c99 "-${pie}" "-f${pie}" -ggdb3 -o "$exe" main.c
gdb -batch -nh \
-ex 'set disable-randomization off' \
-ex 'break main' \
-ex 'run' \
-ex 'printf "pc = 0x%llx\n", (long long unsigned)$pc' \
-ex 'run' \
-ex 'printf "pc = 0x%llx\n", (long long unsigned)$pc' \
"./$exe" \
;
echo
echo
done
For the one with -no-pie, everything is boring:
Breakpoint 1 at 0x401126: file main.c, line 4.
Breakpoint 1, main () at main.c:4
4 puts("hello");
pc = 0x401126
Breakpoint 1, main () at main.c:4
4 puts("hello");
pc = 0x401126
Before starting execution, break main sets a breakpoint at 0x401126.
Then, during both executions, run stops at address 0x401126.
The one with -pie however is much more interesting:
Breakpoint 1 at 0x1139: file main.c, line 4.
Breakpoint 1, main () at main.c:4
4 puts("hello");
pc = 0x5630df2d6139
Breakpoint 1, main () at main.c:4
4 puts("hello");
pc = 0x55763ab2e139
Before starting execution, GDB just takes a "dummy" address that is present in the executable: 0x1139.
After it starts however, GDB intelligently notices that the dynamic loader placed the program in a different location, and the first break stopped at 0x5630df2d6139.
Then, the second run also intelligently noticed that the executable moved again, and ended up breaking at 0x55763ab2e139.
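The relationship between the "dummy" address and the two runtime addresses can be checked with a little arithmetic. In the sketch below (helper names are hypothetical), subtracting the in-file address 0x1139 from each runtime pc recovers the base at which the loader placed the executable for that run:

```c
#include <stdint.h>

/* Load base = runtime address minus the address recorded in the ELF file. */
uint64_t pie_load_base(uint64_t runtime_addr, uint64_t file_addr)
{
    return runtime_addr - file_addr;
}

/* Load bases are page aligned; this assumes 4 KiB pages. */
int is_page_aligned_4k(uint64_t addr)
{
    return (addr & 0xfffULL) == 0;
}
```

With the numbers from the GDB session above, pie_load_base(0x5630df2d6139, 0x1139) is 0x5630df2d5000 and pie_load_base(0x55763ab2e139, 0x1139) is 0x55763ab2d000. Both are page aligned, which is what we would expect if only the base changes between runs.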
echo 2 | sudo tee /proc/sys/kernel/randomize_va_space ensures that ASLR is on (the default in Ubuntu 17.10): How can I temporarily disable ASLR (Address space layout randomization)? | Ask Ubuntu.
set disable-randomization off is needed otherwise GDB, as the name suggests, turns off ASLR for the process by default to give fixed addresses across runs to improve the debugging experience: Difference between gdb addresses and "real" addresses? | Stack Overflow.
readelf analysis
Furthermore, we can also observe that:
readelf -s ./no-pie.out | grep main
gives the actual runtime load address (pc pointed to the following instruction 4 bytes after):
64: 0000000000401122 21 FUNC GLOBAL DEFAULT 13 main
while:
readelf -s ./pie.out | grep main
gives just an offset:
65: 0000000000001135 23 FUNC GLOBAL DEFAULT 14 main
By turning ASLR off (with either randomize_va_space or set disable-randomization off), GDB always gives main the address 0x5555555547a9, so we deduce that the -pie address is composed from:
0x555555554000 + random offset + symbol offset (7a9)
TODO where is 0x555555554000 hard coded in the Linux kernel / glibc loader / wherever? How is the address of the text section of a PIE executable determined in Linux?
Minimal assembly example
Another cool thing we can do is to play around with some assembly code to understand more concretely what PIE means.
We can do that with a Linux x86_64 freestanding assembly hello world:
main.S
.text
.global _start
_start:
asm_main_after_prologue:
/* write */
mov $1, %rax /* syscall number */
mov $1, %rdi /* stdout */
mov $msg, %rsi /* buffer */
mov $len, %rdx /* len */
syscall
/* exit */
mov $60, %rax /* syscall number */
mov $0, %rdi /* exit status */
syscall
msg:
.ascii "hello\n"
len = . - msg
GitHub upstream
and it assembles and runs fine with:
as -o main.o main.S
ld -o main.out main.o
./main.out
However, if we try to link it as PIE with (--no-dynamic-linker is required as explained at: How to create a statically linked position independent executable ELF in Linux?):
ld --no-dynamic-linker -pie -o main.out main.o
then link will fail with:
ld: main.o: relocation R_X86_64_32S against `.text' can not be used when making a PIE object; recompile with -fPIC
ld: final link failed: nonrepresentable section on output
Because the line:
mov $msg, %rsi /* buffer */
hardcodes the message address in the mov operand, and is therefore not position independent.
If we instead write it in a position independent way:
lea msg(%rip), %rsi
then PIE link works fine, and GDB shows us that the executable does get loaded at a different location in memory every time.
The difference here is that lea encoded the address of msg relative to the current PC address due to the rip syntax, see also: How to use RIP Relative Addressing in a 64-bit assembly program?
We can also figure that out by disassembling both versions with:
objdump -S main.o
which gives, respectively:
e: 48 c7 c6 00 00 00 00 mov $0x0,%rsi
e: 48 8d 35 19 00 00 00 lea 0x19(%rip),%rsi # 2e <msg>
000000000000002e <msg>:
2e: 68 65 6c 6c 6f pushq $0x6f6c6c65
So we see clearly that lea already has the full correct address of msg encoded as an offset from the address of the next instruction: 0x15 + 0x19 = 0x2e.
The mov version however has set the address to 00 00 00 00, which means that a relocation will be performed there: What do linkers do? The cryptic R_X86_64_32S in the ld error message is the actual type of relocation that was required and which cannot happen in PIE executables.
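The RIP-relative arithmetic can be written out as a small C sketch (the function name is hypothetical). On x86-64 the displacement is added to the address of the next instruction, that is, the instruction's own address plus its length:

```c
#include <stdint.h>

/* Target of a RIP-relative operand: the CPU adds the signed 32-bit
 * displacement to the address of the *next* instruction. */
uint64_t rip_relative_target(uint64_t insn_addr, unsigned insn_len,
                             int32_t disp)
{
    return insn_addr + insn_len + (uint64_t)(int64_t)disp;
}
```

For the lea above, the instruction sits at 0xe, is 7 bytes long, and carries the displacement 0x19, so rip_relative_target(0xe, 7, 0x19) gives 0x2e, the address of msg in the objdump listing.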
Another fun thing that we can do is to put the msg in the data section instead of .text with:
.data
msg:
.ascii "hello\n"
len = . - msg
Now the .o assembles to:
e: 48 8d 35 00 00 00 00 lea 0x0(%rip),%rsi # 15 <_start+0x15>
so the RIP offset is now 0, and we guess that a relocation has been requested by the assembler. We confirm that with:
readelf -r main.o
which gives:
Relocation section '.rela.text' at offset 0x160 contains 1 entry:
Offset Info Type Sym. Value Sym. Name + Addend
000000000011 000200000002 R_X86_64_PC32 0000000000000000 .data - 4
so clearly R_X86_64_PC32 is a PC relative relocation that ld can handle for PIE executables.
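The value the linker writes for an R_X86_64_PC32 relocation is S + A - P in the x86-64 psABI notation: symbol address, plus addend, minus the address of the field being patched. The sketch below uses hypothetical function names and, in the usage note, made-up final addresses purely for illustration:

```c
#include <stdint.h>

/* R_X86_64_PC32: S + A - P, truncated to the 32-bit field. */
int32_t pc32_value(uint64_t sym_addr, int64_t addend, uint64_t field_addr)
{
    return (int32_t)(uint32_t)(sym_addr + (uint64_t)addend - field_addr);
}

/* Address the CPU computes at run time. For this lea the 4-byte field is
 * the last part of the instruction, so the next instruction starts at
 * field_addr + 4. */
uint64_t pc32_target(uint64_t field_addr, int32_t value)
{
    return field_addr + 4 + (uint64_t)(int64_t)value;
}
```

Assuming, hypothetically, that .text ends up at 0x401000 (so the field at offset 0x11 is at 0x401011) and .data at 0x402000, pc32_value(0x402000, -4, 0x401011) is 0xfeb and pc32_target(0x401011, 0xfeb) is 0x402000 again. This also shows why the addend is -4: it compensates for the 4 bytes between the field and the next instruction.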
This experiment taught us that the linker itself checks the program can be PIE and marks it as such.
Then, when compiling with GCC, -fpie tells GCC to generate position-independent assembly and -pie tells it to link the result as a PIE.
But if we write assembly ourselves, we must manually ensure that we have achieved position independence.
In ARMv8 aarch64, the position independent hello world can be achieved with the ADR instruction.
How to determine if an ELF is position independent?
Besides just running it through GDB, some static methods are mentioned at:
executable: https://unix.stackexchange.com/questions/89211/how-to-test-whether-a-linux-binary-was-compiled-as-position-independent-code/435038#435038
library: How can I tell, with something like objdump, if an object file has been built with -fPIC?
Tested in Ubuntu 18.10.

Resources