Mac OS x86 Assembly: Why does the initialized memory amount change?

I just started learning assembly a week or so ago, and when debugging a program, I came across some strange memory usage. The following code (see end of post) is broken into two files for a reason.
If I compile and run with
gcc main.s
./a.out
with only code block 1 running (code block 2 commented out), then the program prints "8", meaning that right when my program starts, the Mac OS automatically puts 8 bytes worth of stuff on the stack, then leaves my program to do its thing.
However, if I compile and run with
gcc main.s print.s
./a.out
with only code block 2 running (code block 1 commented out), then the program prints "16", meaning that Mac OS is initially putting 16 bytes on the stack instead of 8. When this happens, the offsets applied to rsp to achieve 16-byte alignment remain the same, meaning that the start of the stack is being offset by 8 bytes whenever an outside function is called.
I also tried putting the _printNum function in the same file as main.s, but the discrepancy persisted. Another thing I tried was to add another format string and use it later on in the program to see if something to do with the format string was using memory, but it made no difference.
What I think is going on is that Mac OS is pushing the instruction pointer for the next instruction to execute when my program terminates onto the stack, then pushing the old base stack pointer onto the stack, both 32-bit, for a total of 8 bytes. When I include a function call (either local or external to the main file), it seems like the assembler decides to use 64-bit addresses instead of 32-bit addresses, doubling the memory used, and hence the 16 bytes used.
Why is this happening, and if I am wrong, what is Mac OS doing to the stack? Is any of the extra stack used of value to me? Is the computer doing something else instead of switching from 32-bit to 64-bit addressing? Thanks.
main program (main.s):
.cstring
_format: .asciz "%d\n"
.text
.globl _main
_main:
movq %rbp, %rax # Put stack base pointer in rax
subq %rsp, %rax # Subtract stack pointer to get total memory used
subq $8, %rsp # Get 16-byte alignment
#---------------------------------------------------------
# code block 1 - prints rax manually
#---------------------------------------------------------
movq %rax, %rsi # Value to print needs to be in rsi
lea _format(%rip), %rdi # Address of format string goes in rdi
# Don't know what the "_format(%rip)" does,
# but it works (any info would be handy)
call _printf
#---------------------------------------------------------
# code block 2 - prints rax via function call
#---------------------------------------------------------
call _printNum # Prints the value of rax
#---------------------------------------------------------
# stack cleanup and return
#---------------------------------------------------------
addq $8, %rsp # Account for the previous -8 to rsp
ret # end program
printing function (print.s):
.cstring
_format: .asciz "%d\n"
.text
.globl _printNum
# assumes 16-byte aligned when called
# prints the value of the rax register
_printNum:
push %rbp # save %rbp - previous stack base
movq %rsp, %rbp # update stack base
push %rsi # save %rsi - register
push %rdi # save %rdi - register
# print - already 16 byte aligned (rip and three values for 32 bytes)
movq %rax, %rsi # load the value to print
lea _format(%rip), %rdi # load the format string
call _printf
# restore registers
popq %rdi
popq %rsi
popq %rbp
# return
ret

Related

How to correctly use the "write" syscall on MacOS to print to stdout?

I have looked at similar questions, but cannot seem to find what is wrong with my code.
I am attempting to make the "write" syscall on MacOS to print a string to standard output.
I am able to do it with printf perfectly, and am familiar with calling other functions in x64 assembly.
This is, however, my first attempt at a syscall.
I am using GCC's GAS assembler.
This is my code:
.section __TEXT,__text
.globl _main
_main:
pushq %rbp
movq %rsp, %rbp
subq $32, %rsp
movq $0x20000004, %rax
movq $1, %rdi
leaq syscall_str(%rip), %rsi
movq $25, %rdx
syscall
jc error
xorq %rax, %rax
leave
ret
error:
movq $1, %rax
leave
ret
.section __DATA,__data
syscall_str:
.asciz "Printed with a syscall.\n"
There does not seem to be any error; there is simply nothing written to stdout.
I know that start is usually used as the starting point for an executable on MacOS, but it does not compile with GCC.
You are using the incorrect SYSCALL number for macOS. The base for the user system calls is 0x2000000, and write is call number 4, so the value in %rax should be $0x2000004. You have encoded the write SYSCALL as $0x20000004, which has one zero too many.
As a rule of thumb, make sure the SYSCALL number in the %rax register is correct and that the arguments are in the right registers. The write SYSCALL expects the following arguments:
%rdi: file descriptor to write to (e.g. 1 for standard output)
%rsi: pointer to the buffer containing the data to be written
%rdx: number of bytes to be written
On macOS, you need to use the syscall instruction to invoke a system call.
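The encoding described above can be checked with a little arithmetic. The sketch below is illustrative Python, not part of the original answer. It assumes the XNU convention that the syscall class sits in bits 24 and up (class 2 for the Unix/BSD calls), and the constant and function names are chosen here for readability.

```python
# Illustrative arithmetic only; the names are chosen for this sketch.
SYSCALL_CLASS_SHIFT = 24      # class field starts at bit 24 (assumed XNU layout)
SYSCALL_CLASS_UNIX = 2        # 2 << 24 == 0x2000000, the base quoted above
WRITE_NUMBER = 4              # write is call number 4

def encode(syscall_class, number):
    """Combine a class and a call number into the value loaded into %rax."""
    return (syscall_class << SYSCALL_CLASS_SHIFT) | number

def decode(value):
    """Split an %rax value back into (class, number)."""
    return value >> SYSCALL_CLASS_SHIFT, value & ((1 << SYSCALL_CLASS_SHIFT) - 1)

correct = encode(SYSCALL_CLASS_UNIX, WRITE_NUMBER)
print(hex(correct))            # 0x2000004
print(decode(0x20000004))      # (32, 4): class 32 is not the Unix class
```

Under these assumptions, the value from the question decodes to call number 4 in a class that does not correspond to the Unix calls, which is consistent with nothing being written.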

Using push/pop around function calls on Windows x64

I am fairly new to assembly and architectures, so I was playing around with GCC to assemble my assembly file.
I am running Windows 10, with AMD 3750H (IDK if this helps)
It is a fairly simple program, that does the following:
Creates a stack frame
Pushes two numbers to the stack
Pops them one at a time, calling printf once for each. (So the last pop is after the first call)
Exits
Here is the code I wrote:
.data
form:
.ascii "%d\n\0";
.text
.globl main
main:
pushq %rbp
movq %rsp, %rbp
subq $32, %rsp;
pushq $420;
pushq $69;
lea form(%rip) , %rcx;
xor %eax, %eax
popq %rdx
call printf
lea form(%rip) , %rcx;
xor %eax, %eax
popq %rdx
call printf
mov %rbp, %rsp
popq %rbp
ret
But the output I get is (rather strangely):
69
4199744
I read about shadow space in the Windows x64 calling convention but I couldn't find the proper way to work with it when using push/pop:
What is the 'shadow space' in x64 assembly?
gcc output on cygwin using stack space outside stack frame
This is what I tried (thanks to Jester), and it worked:
# subtract 32 when I push
pushq $420;
subq $32, %rsp
# Add 32 when I pop
addq $32, %rsp
popq %rdx
But for some reason I feel there may be a more elegant way to go about this.
Do I have to leave 32 bytes after every push? That seems like a lot of space is being wasted.
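One way to see why the second number comes out wrong is to track the stack offsets by hand. The following is a rough Python model, not an answer from the thread. It assumes the Windows x64 rule that a callee owns the 32 bytes directly above its return address, and it measures every address as an offset from %rbp. The function name is made up for this sketch.

```python
SHADOW = 32  # bytes of shadow space the callee may overwrite (Windows x64)

def shadow_range_at_call(rsp_before_call):
    """Offsets (relative to %rbp) that the callee may use as shadow space.

    The call pushes an 8-byte return address below the caller's %rsp; the
    shadow space is the 32 bytes starting at the caller's %rsp.
    """
    return range(rsp_before_call, rsp_before_call + SHADOW)

# Original code from the question.
rsp = 0            # after: movq %rsp, %rbp
rsp -= 32          # subq $32, %rsp
rsp -= 8           # pushq $420
slot_420 = rsp
rsp -= 8           # pushq $69
rsp += 8           # popq %rdx (69 is now in %rdx)
clobberable = shadow_range_at_call(rsp)
print(slot_420, slot_420 in clobberable)              # -40 True

# Workaround from the question: subq $32 after each push, addq $32 before each pop.
rsp = -32          # frame set up as before
rsp -= 8           # pushq $420
fixed_slot_420 = rsp
rsp -= 32          # subq $32, %rsp
rsp -= 8           # pushq $69
rsp -= 32          # subq $32, %rsp
rsp += 32          # addq $32, %rsp
rsp += 8           # popq %rdx
fixed_clobberable = shadow_range_at_call(rsp)
print(fixed_slot_420, fixed_slot_420 in fixed_clobberable)   # -40 False
```

In this model the slot holding 420 lies inside the area the first printf call is allowed to overwrite, which is consistent with the garbage second value. With the extra 32 bytes reserved after each push, it no longer does.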

Writing and debugging a min program in asm

I am trying to write a program to find the minimum value of a list of integers in asm. Here is what I have so far:
.section .data
data_items:
.long 2,3,4,5,1,9,10 # set 10 as the sentinel value
.section text
.globl _start
_start:
# %ebx holds min
# %edi holds index (destination index)
# %eax current data item
movl $255, %ebx # set the current min to 255
movl $0, %edi # the index is also zero
start_loop:
movl data_items(,%edi,4), %eax # set %eax equal to the current data item
cmpl $10, %eax # compare %eax with the sentinel (10) to see if we should exit
je exit_loop # if it's the sentinel value, exit
incl %edi # increment the index
cmpl %eax, %edi # compare the current value to the current min
jge start_loop # if it's not less than the current value, go to start
movl %eax, %ebx # move the current value if less than the current min
jmp start_loop # always go back to the start if we've gotten this far
exit_loop:
movl $1, %eax # push the linux system call to %eax (1=exit)
int $0x80 # give linux control (so it will exit)
When I run this, I get the following:
$ as min.s -o min.o && ld min.o -o min && ./min
Segmentation fault (core dumped)
How is one supposed to debug asm? For example, at least in C the compiler tells you what the error might be and the line number, whereas here I know just about nothing. (Note: the error is having .section text instead of .section .text but how would one figure that out?)
It's very possible in C to write a program that compiles with no warnings but crashes (e.g. NULL pointer deref), and you'll see exactly the same thing. It's much more likely in asm, though.
You debug asm with a debugger, GDB for example. See tips at the bottom of https://stackoverflow.com/tags/x86/info. And if you make any system calls, use strace to see what your program is actually doing.
To debug this, you'd run it under GDB and notice that it segfaulted on the first instruction, movl $255, %ebx. That instruction doesn't access memory, so the code fetch itself must have faulted. So something must be wrong with your sections, resulting in your code being linked into a non-executable segment of your executable.
objdump -d would also have given you a hint: it disassembles the .text section by default, and this program doesn't have one.
The reason text instead of .text causes this problem is that the defaults for sections with random names that aren't one of the few specially-recognized ones are read+write without exec.
In GAS, use .text or .data, special shortcut directives for .section .text or .data which avoid this problem for those sections. https://sourceware.org/binutils/docs/as/Text.html
But not all "standard" sections have special directives; you still need .section .rodata to switch to the read-only data section, which is where you should have put your array. (Read, no write. On newer toolchains, also no exec.) Instead of switching to the .bss section, though, you can use .comm or .lcomm (https://sourceware.org/binutils/docs/as/bss.html).
Another possible problem is that you're building this 32-bit code as a 64-bit executable (unless you're using a 32-bit-only install where as --32 is the default). Using 32-bit addressing modes works in 64-bit mode, truncating the address to 32 bits. That works when accessing static data in a position-dependent executable on Linux, because all code+data is linked into the low 2GiB of virtual address space.
But any access to (%esp) or -4(%ebp) or whatever would fault because the stack in a 64-bit process is mapped to a high address with non-zero bits outside the low 32.
You'd notice that problem in GDB because layout reg would show all 16 64-bit integer registers, RAX..R15.
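The truncation described above is easy to work through numerically. This Python sketch only illustrates the arithmetic: both addresses are hypothetical values picked to resemble a low static-data address and a high stack address, and the helper name is invented here.

```python
MASK32 = 0xFFFFFFFF

def effective_address_32(addr):
    """Address actually accessed when a 32-bit addressing mode is used."""
    return addr & MASK32

static_addr = 0x0000000000600000   # hypothetical low address of static data
stack_addr = 0x00007FFC12345678    # hypothetical high stack address

print(hex(effective_address_32(static_addr)))  # 0x600000, unchanged
print(hex(effective_address_32(stack_addr)))   # 0x12345678, a different address
```

The static address survives because its upper 32 bits are already zero, while the stack address loses its upper bits and ends up pointing somewhere else.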

What is the correct constant for the exit system call?

I am trying to learn x86_64 assembly, and am using GCC as my assembler. The exact command I'm using is:
gcc -nostdlib tapydn.S -D__ASSEMBLY__
I'm mainly using gcc for its preprocessor. Here is tapydn.S:
.global _start
#include <asm-generic/unistd.h>
syscall=0x80
.text
_start:
movl $__NR_exit, %eax
movl $0x00, %ebx
int $syscall
This results in a segmentation fault. I believe the problem is with the following line:
movl $__NR_exit, %eax
I used __NR_exit because it was more descriptive than some magic number. However, it appears that my usage of it is incorrect. I believe this to be the case because when I change the line in question to the following, it runs fine:
movl $0x01, %eax
Further backing up this trail of thought is the contents of usr/include/asm-generic/unistd.h:
#define __NR_exit 93
__SYSCALL(__NR_exit, sys_exit)
I expected the value of __NR_exit to be 1, not 93! Clearly I am misunderstanding its purpose and consequently its usage. For all I know, I'm getting lucky with the $0x01 case working (much like undefined behaviour in C++), so I kept digging...
Next, I looked for the definition of sys_exit. I couldn't find it. I tried using it anyway as follows (with and without the preceding $):
movl $sys_exit, %eax
This wouldn't link:
/tmp/cc7tEUtC.o: In function `_start':
(.text+0x1): undefined reference to `sys_exit'
collect2: error: ld returned 1 exit status
My guess is that it's a symbol in one of the system libraries and I'm not linking it due to my passing -nostdlib to GCC. I'd like to avoid linking such a large library for just one symbol if possible.
In response to Jester's comment about mixing 32 and 64 bit constants, I tried using the value 0x3C as suggested:
movq $0x3C, %eax
movq $0x00, %ebx
This also resulted in a segmentation fault. I also tried swapping out eax and ebx for rax and rbx:
movq $0x3C, %rax
movq $0x00, %rbx
The segmentation fault remained.
Jester then commented stating that I should be using syscall rather than int $0x80:
.global _start
#include <asm-generic/unistd.h>
.text
_start:
movq $0x3C, %rax
movq $0x00, %rbx
syscall
This works, but I was later informed that I should be using rdi instead of rbx as per the System V AMD64 ABI:
movq $0x00, %rdi
This also works fine, but still ends up using the magic number 0x3C for the system call number.
Wrapping up, my questions are as follows:
What is the correct usage of __NR_exit?
What should I be using instead of a magic number for the exit system call?
The correct header file to get the system call numbers is sys/syscall.h. The constants are called SYS_###, where ### is the name of the system call you are interested in. The __NR_### macros are implementation details and should not be used. As a rule of thumb, an identifier that begins with an underscore should not be used, and one that begins with two should definitely not be used. For the syscall instruction, the arguments go into rdi, rsi, rdx, r10, r8, and r9. Here is a sample program for Linux:
#include <sys/syscall.h>
.globl _start
_start:
mov $SYS_exit,%eax
xor %edi,%edi
syscall
These conventions are mostly portable to other UNIX-like operating systems.
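The numbers that came up in the question can be put side by side. This small Python table is only a summary of values mentioned in the thread (1 worked with int $0x80, 0x3C worked with syscall, and asm-generic/unistd.h defines 93). The dictionary keys are labels invented for this sketch.

```python
# Exit system call numbers quoted in the question, keyed by where each came from.
EXIT_NUMBER = {
    "int $0x80 (32-bit ABI)": 0x01,
    "syscall (x86-64 ABI)": 0x3C,
    "asm-generic/unistd.h": 93,
}

for entry, number in EXIT_NUMBER.items():
    print(f"{entry}: {number}")

# Number of different values among the three sources.
distinct = len(set(EXIT_NUMBER.values()))
```

All three values differ, which is why a number taken from one table fails when it is used with a different entry mechanism.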

x86 asm printf causes segfault when using intel syntax (gcc)

I'm just starting to learn x86 assembly, and I am a bit confused as to why this little example doesn't work. All I want to do is to print the content of the eax register as a decimal value. This is my code in AT&T Syntax:
.data
intout:
.string "%d\n"
.text
.globl main
main:
movl $666, %eax
pushl %eax
pushl $intout
call printf
movl $1, %eax
int $0x80
Which I compile and run as follows:
gcc -m32 -o hello helloworld.S
./hello
This works as expected (printing 666 to the console). On a side note, I don't understand what exactly "movl $1, %eax" and "int $0x80" are supposed to accomplish here. I'm also not sure what "pushl $intout" does. Why is my output composed of two separate stack entries? And what exactly does the .string macro do?
These are only side questions however, since my real problem is that I can't find a way to make this run using the much easier to read/write/comprehend Intel syntax.
Here is the code:
.intel_syntax noprefix
.data
intout:
.string "%d\n"
.text
.globl main
main:
mov eax, 666
push eax
push intout
call printf
mov eax, 1
int 0x80
Running this same as above, it just prints "Segmentation fault".
What am I doing wrong?
You need to use push OFFSET intout; otherwise the 32-bit value stored at intout will be pushed on the stack, rather than its address.
intout is just a label, which is basically a name assigned to an address in your program. The .string "%d\n" directive that follows it defines a sequence of bytes in your program, both allocating memory and initializing that memory. Specifically it allocates 4 bytes in the .data section and initializes them with the characters '%', 'd', '\n', and '\0'. Since the label intout is defined just before the .string line it has the address of the first byte in the string.
The line push intout results in an instruction that reads the 4 bytes starting at the address referred to by intout and pushes them onto the stack (specifically, it subtracts 4 from ESP and then copies them to the 4 bytes now pointed to by ESP). The line push $intout (or push OFFSET intout) pushes the 4 bytes that make up the 32-bit address of intout onto the stack.
This means that the line push intout pushes a meaningless value onto the stack. The function printf ends up interpreting it as a pointer, an address where the format string is supposed to be stored, but since it doesn't point to a valid location in memory, your program crashes.
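To make the "meaningless value" concrete, the 4 bytes of the string can be reinterpreted as a little-endian 32-bit integer, which is what a memory-operand push does on x86. The Python below is just that arithmetic; the variable names are chosen for this sketch.

```python
data = b"%d\n\x00"          # the 4 bytes emitted by .string "%d\n"

# push intout (Intel syntax, no OFFSET) loads these bytes as one
# little-endian 32-bit value and pushes that value.
pushed_value = int.from_bytes(data, byteorder="little")

print(hex(pushed_value))     # 0xa6425
```

printf then treats 0x000A6425 as the address of its format string, rather than the address where the bytes actually live.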

Resources