How is MacOS stack initialized at the start of the process? - macos

Curious about how macOS prepares the stack, I wrote an x86_64 assembly program that prints the top of the stack to stdout right when the process starts:
global start
start: ; entry point of the binary, called by the loader
push rsp ; push the stack pointer to the stack so that we'll see that too
mov rdi, 1 ; file to write to: file descriptor 1 (STDOUT)
lea rsi, [rsp] ; source of the write: stack
mov rdx, 64 ; number of bytes to write: 64 (8 x 64-bit integers)
mov rax, 0x02000004 ; MacOS syscall number for write
syscall
mov rsi, [rsp+16] ; smoke test: argv contents
mov rdx, 16 ; we expect the argv[0] ("./inspect_stack\0") to be 16 bytes long
mov rax, 0x02000004
syscall
mov rsi, [rsp+32] ; another smoke test: envp???
mov rdx, 11
mov rax, 0x02000004
syscall
mov rax, 0x02000001 ; MacOS syscall number for exit
syscall
Running this program and inspecting the output:
nasm -f macho64 inspect_stack.asm && ld inspect_stack.o -static -o inspect_stack && ./inspect_stack | xxd -e -g 8 -c 8
I see something like this: (added some comments of my own)
00000000: 00007ff7bfeff6b0 ........ # this is the stack pointer we pushed
00000008: 0000000000000001 ........ # argc
00000010: 00007ff7bfeff880 ........ # argv; see the smoke test result
00000018: 0000000000000000 ........ # a null pointer???
00000020: 00007ff7bfeff890 ........ # are these part of envp?
00000028: 00007ff7bfeff89f ........ # ...seems like an array of pointers stored inline?
00000030: 00007ff7bfeff8dc ........ # ...and they seem to point at a continuous buffer
00000038: 00007ff7bfeff8ed ........
00000040: 636570736e692f2e ./inspec # the result of the 1st smoke test. yes, argv[0]!
00000048: 006b636174735f74 t_stack.
00000050: 6573552f3d445750 PWD=/Use # the result of the 2nd smoke test... seems like envp?
00000058: 2f7372 rs/
I had understood that at program start the stack holds a 64-bit integer (argc) followed by two pointers (to argv and to envp). However, that doesn't seem to be true, or else the envp pointer is null for some reason. Yet the envp array, stored inline, seemingly starts right after the null. What's the actual layout of the stack when the process starts?

After inspecting a bit more and adding more arguments, I realized that my assumption of two pointers (to argv and to envp) at the top of the stack was mistaken. Instead, argv and envp are stored inline, as arrays of pointers to the associated strings. Both arrays are null-terminated, so the null value I was seeing was actually the terminator of argv. Adding more arguments makes this a lot clearer:
nasm -f macho64 inspect_stack.asm && ld inspect_stack.o -static -o inspect_stack && ./inspect_stack first second | xxd -e -g 8 -c 8
00000000: 00007ff7bfeff698 ........
00000008: 0000000000000003 ........ # argc
00000010: 00007ff7bfeff878 x....... # argv[0]
00000018: 00007ff7bfeff888 ........ # argv[1]
00000020: 00007ff7bfeff88e ........ # argv[2]
00000028: 0000000000000000 ........ # argv end
00000030: 00007ff7bfeff895 ........ # envp[0]
00000038: 00007ff7bfeff8a4 ........ # envp[1] and so on
00000040: 636570736e692f2e ./inspec
00000048: 006b636174735f74 t_stack.
00000050: 5000646e6f636573 second.P # the second smoke test now sees argv[2]!
00000058: 3d4457 WD= # seems that the envp strings are located right after the argv strings
TL;DR: I thought that the second and third 64-bit values on the stack were char **argv and char **envp. Instead, they were argv[0] and argv[1]. At process entry (before my extra push rsp), [rsp] holds argc. To get the char **argv that C main would expect, I could take the address rsp + 8 (8 bytes to skip argc), e.g. lea rsi, [rsp + 8]. To get char **envp, I could mov rax, [rsp] and then take the address rsp + 8 + rax*8 + 8 (8 bytes to skip argc, then argc pointers, and finally another 8 bytes to skip the null terminator), e.g. lea rdx, [rsp + rax*8 + 16].
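The offset arithmetic in the TL;DR can be double-checked with a small model. The sketch below is written in Python rather than assembly and treats the stack at process entry as a list of 64-bit words starting at argc. The function name locate_argv_envp is made up for illustration, and the sample values are the ones from the second dump, without the extra slot created by push rsp.

```python
WORD = 8  # bytes per 64-bit stack slot


def locate_argv_envp(words):
    """Return (argc, argv_offset, envp_offset), offsets in bytes from rsp.

    `words` models the stack at process entry: words[0] is argc, followed
    by argc argv pointers, a null terminator, then the envp pointers.
    """
    argc = words[0]
    argv_offset = WORD                        # skip argc
    envp_offset = WORD + argc * WORD + WORD   # skip argc, argv[], and its null
    # Sanity check: the slot just before envp must be argv's terminator.
    if words[envp_offset // WORD - 1] != 0:
        raise ValueError("argv is not null-terminated where expected")
    return argc, argv_offset, envp_offset


# Words from the second dump above, minus the extra `push rsp` slot.
sample = [
    3,
    0x00007FF7BFEFF878, 0x00007FF7BFEFF888, 0x00007FF7BFEFF88E,
    0,
    0x00007FF7BFEFF895, 0x00007FF7BFEFF8A4,
]
argc, argv_off, envp_off = locate_argv_envp(sample)
print(argc, argv_off, envp_off)
```

With three arguments, envp[0] therefore sits 40 bytes above the entry rsp, which matches the dump once the pushed rsp slot is accounted for.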

Related

Trying to execute a bash script in NASM

Hello I am quite a beginner in nasm. I am trying to write a program that executes a script, that takes one argument, with /bin/bash.
SECTION .data
command db '/bin/bash', 0
script db 'path/to/script', 0
script_arg db 'my_arg', 0
arguments dd command
dd script ; arguments to pass to commandline, in this case just the path to the script
dd script_arg
dd 0
SECTION .text
global _start
_start:
mov edx, 0 ; no environment variables are being used
mov ecx, arguments ; array of pointers has to be passed
mov ebx, command ; bash
mov eax, 11 ; invoke SYS_EXECVE
int 80h
The code above just executes the script with bash but does not add any arguments to the script itself. I tried to pass it as an additional argument but that does nothing. If I add the argument to the path to script string (path/to/script arg1) it breaks the terminal (color theme is set to just white text) and other than that does nothing.
Also, what would be the easiest way of changing the contents of the arguments pointer array? How would I define it in the .bss section and change its contents while the program is running? At least a pointer to the documentation about that would be nice...
When I put in run-bash.asm :
SECTION .data
command db '/bin/bash', 0
script db './test.sh', 0
script_arg db 'my_arg', 0
arguments dd command
dd script ; arguments to pass to commandline, in this case just the path to the script
dd script_arg
dd 0
SECTION .text
global _start
_start:
mov edx, 0 ; no environment variables are being used
mov ecx, arguments ; array of pointers has to be passed
mov ebx, command ; bash
mov eax, 11 ; invoke SYS_EXECVE
int 80h
And put in test.sh :
#!/usr/bin/env bash
echo "First argument is : $1"
Then run it with:
nasm -f elf run-bash.asm
ld -m elf_i386 run-bash.o -o run-bash
chmod +x run-bash
./run-bash
# Output :
# First argument is : my_arg

What is the meaning of: "Search tree file's format version number (0) is not supported"?

In macOS 10.13 High Sierra on Xcode 9 I get this log message:
2017-09-28 15:19:28.246511+0800 wr[5376:128702] MessageTracer:
load_domain_whitelist_search_tree:73: Search tree file's format
version number (0) is not supported
2017-09-28 15:19:28.246541+0800 wr[5376:128702] MessageTracer: Falling back to default whitelist
What is the meaning of this message?
This command removes the log messages:
xattr -w format_version 1 "/Library/Application Support/CrashReporter/SubmitDiagInfo.domains"
Those messages come from a function msgtracer_domain_new in /usr/lib/libDiagnosticMessagesClient.dylib.
Run your application on Xcode 9.
Stop it.
In the Debug navigator, click on NSApplicationMain just above main.
Set a breakpoint at the first line pushq %rbp
Run your app again.
When the breakpoint hits, set another breakpoint by typing breakpoint set -n msgtracer_domain_new
Continue program execution.
When the breakpoint hits, look at the assembly code. You will see:
libDiagnosticMessagesClient.dylib`msgtracer_domain_new:
-> 0x7fff667c7f08 <+0>: pushq %rbp
0x7fff667c7f09 <+1>: movq %rsp, %rbp
0x7fff667c7f0c <+4>: pushq %r15
(omit)
0x7fff667c7ff1 <+233>: leaq 0xc1d(%rip), %rdi ; "/Library/Application Support/CrashReporter/SubmitDiagInfo.domains"
0x7fff667c7ff8 <+240>: xorl %r13d, %r13d
0x7fff667c7ffb <+243>: movl $0x20, %esi
0x7fff667c8000 <+248>: xorl %eax, %eax
0x7fff667c8002 <+250>: callq 0x7fff667c8990 ; symbol stub for: open
(omit)
0x7fff667c801d <+277>: leaq 0xc33(%rip), %rsi ; "format_version"
0x7fff667c8024 <+284>: movl $0x4, %ecx
0x7fff667c8029 <+289>: xorl %r8d, %r8d
0x7fff667c802c <+292>: xorl %r9d, %r9d
0x7fff667c802f <+295>: movl %r15d, %edi
0x7fff667c8032 <+298>: movq %r12, %rdx
0x7fff667c8035 <+301>: callq 0x7fff667c895a ; symbol stub for: fgetxattr
0x7fff667c803a <+306>: cmpl %r13d, (%r12)
0x7fff667c803e <+310>: jne 0x7fff667c808b ; <+387>
0x7fff667c8040 <+312>: movl $0x0, (%rsp)
0x7fff667c8047 <+319>: leaq 0xc18(%rip), %rcx ; "MessageTracer: %s:%d: Search tree file's format version number (%u) is not supported"
0x7fff667c804e <+326>: leaq 0xb9e(%rip), %r8 ; "load_domain_whitelist_search_tree"
(omit)
0x7fff667c808f <+391>: leaq 0xc25(%rip), %rcx ; "MessageTracer: Falling back to default whitelist"
0x7fff667c8096 <+398>: xorl %edi, %edi
0x7fff667c8098 <+400>: xorl %esi, %esi
0x7fff667c809a <+402>: movl $0x6, %edx
0x7fff667c809f <+407>: xorl %eax, %eax
0x7fff667c80a1 <+409>: callq 0x7fff667c8924 ; symbol stub for: asl_log
In my case, MacBook Pro late 2011 running High Sierra 10.13:
$ ls -l@ "/Library/Application Support/CrashReporter/SubmitDiagInfo.domains"
-rw-rw-r--@ 1 root admin 12988 Sep 21 2014 /Library/Application Support/CrashReporter/SubmitDiagInfo.domains
com.apple.TextEncoding 15
os_version 12
That file does not have the format_version xattr that the function msgtracer_domain_new expects.
Does anyone know how to update it?
Appended:
Tips for investigating similar messages:
Find a process id of your app.
$ ps -ef | grep your_app_name | grep -v grep
999 86803 86804 0 1:34AM ?? 0:00.97 /Users/xxx/Library/Developer/Xcode/DerivedData/....
Obtain file paths that your app has loaded.
$ vmmap 86803 | perl -ne 'print "$1\n" if m{(/\S*)\Z}' | sort -u > z
Edit the temporary file as needed to remove irrelevant file paths.
Find the file which includes the message.
$ cat z | xargs grep -l -b 'Search tree file' 2> /dev/null
/usr/lib/libDiagnosticMessagesClient.dylib
Confirm if the message exists.
$ strings /usr/lib/libDiagnosticMessagesClient.dylib | grep 'Search tree file'
MessageTracer: %s:%d: Search tree file's format version number (%u) is not supported
Produce debugger commands, and then apply them.
$ nm /usr/lib/libDiagnosticMessagesClient.dylib | grep " T " | sort -u | perl -pe 's/.* _/breakpoint set -n /'
breakpoint set -n msgtracer_domain_new
breakpoint set -n msgtracer_domain_free
breakpoint set -n msgtracer_msg_new
breakpoint set -n msgtracer_set
breakpoint set -n msgtracer_msg_free
breakpoint set -n msgtracer_vlog
breakpoint set -n msgtracer_log
breakpoint set -n msgtracer_vlog_with_keys_skip_nulls
breakpoint set -n msgtracer_vlog_with_keys
breakpoint set -n msgtracer_log_with_keys
breakpoint set -n msgtracer_log_with_keys_skip_nulls
breakpoint set -n msgtracer_uuid_create
This approach is not perfect: it does not handle white space in file paths. As long as it works, that is fine.
I like to use perl for text manipulation; use whatever tool you prefer.
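For readers who prefer Python over perl, the last step (turning nm output into breakpoint commands) might be sketched as follows. The function name breakpoint_commands is hypothetical, and unlike the pipeline above it sorts the commands by text rather than by symbol address.

```python
import re


def breakpoint_commands(nm_output):
    """Turn `nm` output into lldb `breakpoint set -n` commands.

    Keeps only symbols of type T (global text symbols) and strips the
    leading underscore, like the grep/perl pipeline above.
    """
    commands = set()
    for line in nm_output.splitlines():
        match = re.match(r"^\S+ T _(\S+)$", line.strip())
        if match:
            commands.add("breakpoint set -n " + match.group(1))
    return sorted(commands)
```

The text to pass in is whatever nm /usr/lib/libDiagnosticMessagesClient.dylib prints; each returned string can be pasted into the debugger console.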
I was seeing this problem on a computer that had been updated to High Sierra.
I went to the security and privacy panel in system preferences. On the privacy tab, I unlocked and updated my privacy settings. I set sharing with Apple and 3rd party devs. The problem went away.

Why is "info register ebp" in gdb not displaying a decimal number?

I am debugging a very simple piece of code with gdb:
mov ebp,eax ; Save # of bytes read from file for later
Here is my output:
Breakpoint 2, Read () at hexdump1.asm:44
(gdb) info register eax
eax 0xd 13
(gdb) step
Read () at hexdump1.asm:45
(gdb) info register ebp
ebp 0xd 0xd
Why is gdb showing me 0xd 13 for eax but 0xd 0xd for ebp?
The info registers command prints out registers in both raw format (hex) and natural format. The natural format is based on the type of the register, declared in xml files in gdb's source code. For example, i386/32bit-core.xml contains:
<reg name="eax" bitsize="32" type="int32"/>
<reg name="ecx" bitsize="32" type="int32"/>
<reg name="edx" bitsize="32" type="int32"/>
<reg name="ebx" bitsize="32" type="int32"/>
<reg name="esp" bitsize="32" type="data_ptr"/>
<reg name="ebp" bitsize="32" type="data_ptr"/>
<reg name="esi" bitsize="32" type="int32"/>
<reg name="edi" bitsize="32" type="int32"/>
<reg name="eip" bitsize="32" type="code_ptr"/>
<reg name="eflags" bitsize="32" type="i386_eflags"/>
<reg name="cs" bitsize="32" type="int32"/>
<reg name="ss" bitsize="32" type="int32"/>
<reg name="ds" bitsize="32" type="int32"/>
<reg name="es" bitsize="32" type="int32"/>
<reg name="fs" bitsize="32" type="int32"/>
<reg name="gs" bitsize="32" type="int32"/>
From within gdb, you can view the type of a register:
(gdb) whatis $eax
type = int32_t
(gdb) whatis $ebp
type = void *
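The effect of the declared type on the second column can be illustrated with a simplified model. This is only a Python sketch of the behaviour described above, not gdb's actual code: the names REGISTER_TYPES, natural_format and info_register are invented here, eflags and the segment registers are left out, and real gdb may also append a symbol name to code pointers.

```python
# Register types as declared in the i386/32bit-core.xml excerpt above.
REGISTER_TYPES = {
    "eax": "int32", "ecx": "int32", "edx": "int32", "ebx": "int32",
    "esp": "data_ptr", "ebp": "data_ptr",
    "esi": "int32", "edi": "int32",
    "eip": "code_ptr",
}


def natural_format(name, raw):
    """Model the second column of `info registers` for a 32-bit value."""
    raw &= 0xFFFFFFFF
    if REGISTER_TYPES[name] == "int32":
        # Signed 32-bit integer: shown in decimal.
        return str(raw - (1 << 32) if raw & 0x80000000 else raw)
    # Pointer types are shown as addresses, i.e. in hex again.
    return hex(raw)


def info_register(name, raw):
    """Return a line shaped like gdb's output: name, raw hex, natural."""
    return f"{name} {hex(raw & 0xFFFFFFFF)} {natural_format(name, raw)}"


print(info_register("eax", 0xD))
print(info_register("ebp", 0xD))
```

The same raw value 0xd is rendered as 13 for eax and as 0xd for ebp purely because of the type lookup.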
Your question is why (gdb) info register eax displays the content of EAX as a hex and a decimal number, while (gdb) info register ebp only uses hex numbers for EBP, right?
That is the case not only for EBP, but also for ESP, EFLAGS and EIP. I don't think it has any special meaning; gdb just tries to display each register in a useful way. For example, for EFLAGS you want to see the status of the flags and not a decimal number (in the example below IF is set). EBP and ESP are registers that usually point to an address on the stack or in memory, so normally you do not want to know the decimal value. Admittedly, showing hex twice, as in this case, is rather useless.
Here is an example which displays the content of all registers with the info registers command (i r is the short form, I just found out :P).
(gdb) i r
eax 0x0 0
ecx 0x0 0
edx 0x0 0
ebx 0x0 0
esp 0xbffff234 0xbffff234
ebp 0x0 0x0
esi 0x0 0
edi 0x0 0
eip 0x804822d 0x804822d
eflags 0x202 [ IF ]
cs 0x73 115
ss 0x7b 123
ds 0x7b 123
es 0x7b 123
fs 0x0 0
gs 0x0 0
More info: https://sourceware.org/gdb/onlinedocs/gdb/Registers.html

Tiny PE file format program error when running on Windows 7 64-bit

I'm trying to run the following assembly code (assembled with Nasm) in Windows 7 Ultimate 64-bit.
; tiny.asm
BITS 32
;
; MZ header
;
; The only two fields that matter are e_magic and e_lfanew
mzhdr:
dw "MZ" ; e_magic
dw 0 ; e_cblp UNUSED
dw 0 ; e_cp UNUSED
dw 0 ; e_crlc UNUSED
dw 0 ; e_cparhdr UNUSED
dw 0 ; e_minalloc UNUSED
dw 0 ; e_maxalloc UNUSED
dw 0 ; e_ss UNUSED
dw 0 ; e_sp UNUSED
dw 0 ; e_csum UNUSED
dw 0 ; e_ip UNUSED
dw 0 ; e_cs UNUSED
dw 0 ; e_lsarlc UNUSED
dw 0 ; e_ovno UNUSED
times 4 dw 0 ; e_res UNUSED
dw 0 ; e_oemid UNUSED
dw 0 ; e_oeminfo UNUSED
times 10 dw 0 ; e_res2 UNUSED
dd pesig ; e_lfanew
;
; PE signature
;
pesig:
dd "PE"
;
; PE header
;
pehdr:
dw 0x014C ; Machine (Intel 386)
dw 1 ; NumberOfSections
dd 0x4545BE5D ; TimeDateStamp UNUSED
dd 0 ; PointerToSymbolTable UNUSED
dd 0 ; NumberOfSymbols UNUSED
dw opthdrsize ; SizeOfOptionalHeader
dw 0x103 ; Characteristics (no relocations, executable, 32 bit)
;
; PE optional header
;
filealign equ 1
sectalign equ 1
%define round(n, r) (((n+(r-1))/r)*r)
opthdr:
dw 0x10B ; Magic (PE32)
db 8 ; MajorLinkerVersion UNUSED
db 0 ; MinorLinkerVersion UNUSED
dd round(codesize, filealign) ; SizeOfCode UNUSED
dd 0 ; SizeOfInitializedData UNUSED
dd 0 ; SizeOfUninitializedData UNUSED
dd start ; AddressOfEntryPoint
dd code ; BaseOfCode UNUSED
dd round(filesize, sectalign) ; BaseOfData UNUSED
dd 0x400000 ; ImageBase
dd sectalign ; SectionAlignment
dd filealign ; FileAlignment
dw 4 ; MajorOperatingSystemVersion UNUSED
dw 0 ; MinorOperatingSystemVersion UNUSED
dw 0 ; MajorImageVersion UNUSED
dw 0 ; MinorImageVersion UNUSED
dw 4 ; MajorSubsystemVersion
dw 0 ; MinorSubsystemVersion UNUSED
dd 0 ; Win32VersionValue UNUSED
dd round(filesize, sectalign) ; SizeOfImage
dd round(hdrsize, filealign) ; SizeOfHeaders
dd 0 ; CheckSum UNUSED
dw 2 ; Subsystem (Win32 GUI)
dw 0x400 ; DllCharacteristics UNUSED
dd 0x100000 ; SizeOfStackReserve UNUSED
dd 0x1000 ; SizeOfStackCommit
dd 0x100000 ; SizeOfHeapReserve
dd 0x1000 ; SizeOfHeapCommit UNUSED
dd 0 ; LoaderFlags UNUSED
dd 16 ; NumberOfRvaAndSizes UNUSED
;
; Data directories
;
times 16 dd 0, 0
opthdrsize equ $ - opthdr
;
; PE code section
;
db ".text", 0, 0, 0 ; Name
dd codesize ; VirtualSize
dd round(hdrsize, sectalign) ; VirtualAddress
dd round(codesize, filealign) ; SizeOfRawData
dd code ; PointerToRawData
dd 0 ; PointerToRelocations UNUSED
dd 0 ; PointerToLinenumbers UNUSED
dw 0 ; NumberOfRelocations UNUSED
dw 0 ; NumberOfLinenumbers UNUSED
dd 0x60000020 ; Characteristics (code, execute, read) UNUSED
hdrsize equ $ - $$
;
; PE code section data
;
align filealign, db 0
code:
; Entry point
start:
push byte 42
pop eax
ret
codesize equ $ - code
filesize equ $ - $$
Code taken from: http://www.phreedom.org/solar/code/tinype/
I create the executable using: nasm -f bin -o tiny.exe tiny.asm
But when I try to run tiny.exe, I get an error: The application was unable to start correctly (0xc0000018).
On the other hand, it runs flawlessly on a Windows XP SP3 machine. Any idea what might be wrong?
filealign equ 1
sectalign equ 1
The Windows 7 loader does not accept a filealign less than 512 or a sectalign less than 4096.
Edit:
In light of counterexamples, it seems that the alignment limits are 4/4.
Windows XP is less strict about size than Vista or later:
it accepts a truncated OptionalHeader, while later Windows versions reject the file if the header is not complete.
Thus, you just need to add padding to make it work under Vista or later.
See my PE page on Corkami for more details and examples, with sources and binaries.
(This has absolutely nothing to do with alignments.)
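To see which of the two explanations applies to a given file, it can help to dump the relevant header fields. The sketch below is a rough Python helper (the name pe_alignment_info and the result keys are made up); it relies only on the standard PE32 layout, where e_lfanew is at offset 0x3C, SizeOfOptionalHeader is at offset 16 of the COFF header, and SectionAlignment and FileAlignment are at offsets 32 and 36 of the optional header. The last key merely reports whether the file is long enough to hold the declared optional header; what exactly each Windows version checks is not something this sketch claims to know.

```python
import struct


def pe_alignment_info(data):
    """Read header size and alignment fields from the bytes of a PE32 file."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    coff = e_lfanew + 4                       # 20-byte COFF file header
    (opt_size,) = struct.unpack_from("<H", data, coff + 16)
    opt = coff + 20                           # start of the optional header
    (magic,) = struct.unpack_from("<H", data, opt)
    if magic != 0x10B:
        raise ValueError("not a PE32 optional header")
    sect_align, file_align = struct.unpack_from("<II", data, opt + 32)
    return {
        "SizeOfOptionalHeader": opt_size,
        "SectionAlignment": sect_align,
        "FileAlignment": file_align,
        # True when the file is long enough for the declared optional header.
        "file_covers_optional_header": opt + opt_size <= len(data),
    }
```

Usage would be something like pe_alignment_info(open("tiny.exe", "rb").read()); for the listing above it should report alignments of 1 and a 224-byte optional header.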

printing new lines with printf assembly

Hi I'm trying to write some assembly code that uses printf to print a given string. I am declaring my strings before use in the .data section and a test example looks as follows:
extern printf
extern fflush
LINUX equ 80H ; interrupt number for entering the Linux kernel (32-bit ABI, unused here)
EXIT equ 60 ; x86-64 Linux system call 60, i.e. exit()
section .data
outputstringfmt: db "%s", 0
sentence0: db "Hello\nWorld\n", 0
segment .text
global main
main:
mov r8, sentence0
push r8
call print_sentence
add rsp, 8
call os_return
print_sentence:
push rbp
mov rbp, rsp
push r12
mov r12, [rbp + 16]
push rsi
push rdi
push r8
push r9
push r10
mov rsi, r12
mov rdi, outputstringfmt
xor rax, rax
call printf
xor rax, rax
call fflush
pop r10
pop r9
pop r8
pop rdi
pop rsi
pop r12
pop rbp
ret
os_return:
mov rax, EXIT ; x86-64 Linux system call 60, i.e. exit()
mov rdi, 0 ; exit status 0, i.e. no errors
syscall ; enter the Linux kernel (64-bit)
I'm then assembling and linking as follows:
nasm -f elf64 test.asm; gcc -m64 -o test test.o
And finally running:
./test
My output is as follows:
Hello\nWorld\n
I really don't want to split sentence0 up into the following:
sentence0: db "Hello", 10, 0
sentence1: db "World", 10, 0
and then call the print twice. Is there a better way to do it?
Thanks in advance!
NASM accepts strings in single quotes ('...') or double quotes ("..."), which are equivalent, and do not provide any escapes; or in backquotes (`...`), which provide support for C-style escapes, which is what you want.
(See section 3.4.2, "Character Strings", in the documentation.)
To get actual ASCII newlines in your data in memory, rather than literal backslash n:
sentence0: db `Hello\nWorld\n`, 0
Or do it manually:
sentence0: db 'Hello', 10, 'World', 10, 0
YASM (another NASM-syntax assembler) doesn't accept backticks, so the manual option is your only choice there.
And BTW, you can call puts instead of printf if you don't have any actual formatting in your format string (leave out the trailing newline).
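The difference between the quoting styles can be made concrete by looking at the bytes each one produces. The following Python sketch only models the rule described above; the helper name nasm_string_bytes is invented, and Python's escape handling is used as a stand-in that agrees with NASM for common escapes such as \n but not necessarily for every escape NASM supports.

```python
def nasm_string_bytes(literal):
    """Model the bytes NASM emits for a quoted string literal.

    Single- and double-quoted strings are taken verbatim; only
    backquoted strings have C-style escapes such as \\n processed.
    """
    quote, body = literal[0], literal[1:-1]
    if literal[-1] != quote or quote not in "'\"`":
        raise ValueError("not a quoted NASM string")
    if quote == "`":
        # Stand-in for NASM's escape processing (an approximation).
        body = body.encode("latin-1").decode("unicode_escape")
    return body.encode("latin-1")


print(nasm_string_bytes('"Hello\\nWorld\\n"'))  # backslash and n kept as two bytes
print(nasm_string_bytes('`Hello\\nWorld\\n`'))  # real newline bytes (10)
```

The double-quoted form is 14 bytes long because each \n stays as two characters, while the backquoted form is 12 bytes with actual newlines, which is why printf echoed the backslashes in the question.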
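You have the newlines (\n) in the string to be output. Alternatively, they can go in the format string. Note that printf does not interpret backslash escapes itself, so the format string still needs NASM backquotes (or explicit 10 bytes) to contain real newlines. This solves half of your problem:
outputstringfmt: db `%s\n%s\n`, 0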
sentence0: db "Hello", 0
sentence1: db "World", 0
And something like this should print newlines after each word:
outputstringfmt: db "%s", 0
sentence0: db "Hello", 10 , "World", 10 , 0
