I have been trying to get a better idea of what happens under the hood by using the compiler to generate the assembly programs of various C programs at different optimization levels. There is something that has been bothering me for a while.
When I compile t.c as follows,
gcc -S t.c
I get the assembly in AT&T syntax as follows.
function:
pushl %ebp
movl %esp, %ebp
movl 12(%ebp), %eax
addl 8(%ebp), %eax
popl %ebp
ret
.size function, .-function
When I compile using the masm argument as follows:
gcc -S t.c -masm=intel
I get the following output.
function:
push %ebp
mov %ebp, %esp
mov %eax, DWORD PTR [%ebp+12]
add %eax, DWORD PTR [%ebp+8]
pop %ebp
ret
.size function, .-function
There is a change in syntax, but there are still "%"s in front of the register names (which is why I don't like AT&T syntax in the first place).
Can someone shed some light on why this is happening? How do I solve this issue?
The GNU assembler (gas) does have a separate way of controlling the % prefix: the -mnaked-reg option, or the noprefix argument of the .intel_syntax directive. The documentation seems to suggest GCC doesn't have such an option, but my GCC (version Debian 4.3.2-1.1) doesn't produce the % prefix.
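For reference, both listings in the question are consistent with a source file like the sketch below. The original t.c was not posted, so the signature and body are assumptions reconstructed from the assembly: two int arguments read from 8(%ebp) and 12(%ebp), added together, and returned in %eax.

```c
/* t.c: hypothetical reconstruction; the original source was not posted. */

/* In the 32-bit listing, a is read from 8(%ebp) and b from 12(%ebp);
 * the sum is returned in %eax. */
int function(int a, int b)
{
    return a + b;
}
```

Compiling a file like this once with gcc -S and once with gcc -S -masm=intel makes it easy to compare the two syntaxes side by side. The exact output depends on the GCC version and flags.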
I am trying to learn x86_64 assembly, and am using GCC as my assembler. The exact command I'm using is:
gcc -nostdlib tapydn.S -D__ASSEMBLY__
I'm mainly using gcc for its preprocessor. Here is tapydn.S:
.global _start
#include <asm-generic/unistd.h>
syscall=0x80
.text
_start:
movl $__NR_exit, %eax
movl $0x00, %ebx
int $syscall
This results in a segmentation fault. I believe the problem is with the following line:
movl $__NR_exit, %eax
I used __NR_exit because it was more descriptive than some magic number. However, it appears that my usage of it is incorrect. I believe this to be the case because when I change the line in question to the following, it runs fine:
movl $0x01, %eax
Further backing up this train of thought are the contents of /usr/include/asm-generic/unistd.h:
#define __NR_exit 93
__SYSCALL(__NR_exit, sys_exit)
I expected the value of __NR_exit to be 1, not 93! Clearly I am misunderstanding its purpose and consequently its usage. For all I know, I'm getting lucky with the $0x01 case working (much like undefined behaviour in C++), so I kept digging...
Next, I looked for the definition of sys_exit. I couldn't find it. I tried using it anyway as follows (with and without the preceding $):
movl $sys_exit, %eax
This wouldn't link:
/tmp/cc7tEUtC.o: In function `_start':
(.text+0x1): undefined reference to `sys_exit'
collect2: error: ld returned 1 exit status
My guess is that it's a symbol in one of the system libraries and I'm not linking it due to my passing -nostdlib to GCC. I'd like to avoid linking such a large library for just one symbol if possible.
In response to Jester's comment about mixing 32 and 64 bit constants, I tried using the value 0x3C as suggested:
movq $0x3C, %eax
movq $0x00, %ebx
This also resulted in a segmentation fault. I also tried swapping out eax and ebx for rax and rbx:
movq $0x3C, %rax
movq $0x00, %rbx
The segmentation fault remained.
Jester then commented stating that I should be using syscall rather than int $0x80:
.global _start
#include <asm-generic/unistd.h>
.text
_start:
movq $0x3C, %rax
movq $0x00, %rbx
syscall
This works, but I was later informed that I should be using rdi instead of rbx as per the System V AMD64 ABI:
movq $0x00, %rdi
This also works fine, but still ends up using the magic number 0x3C for the system call number.
Wrapping up, my questions are as follows:
What is the correct usage of __NR_exit?
What should I be using instead of a magic number for the exit system call?
The correct header file for the system call numbers is sys/syscall.h. The constants are named SYS_###, where ### is the name of the system call you are interested in. The __NR_### macros are implementation details and should not be used. The file asm-generic/unistd.h holds the generic table used by newer architectures, which is why it gives 93 rather than the x86 values (1 for int $0x80, 60 for syscall). As a rule of thumb, an identifier that begins with one underscore should not be used, and one that begins with two should definitely not be used. The system call number goes into rax, and the arguments go into rdi, rsi, rdx, r10, r8, and r9. Here is a sample program for Linux:
#include <sys/syscall.h>
.globl _start
_start:
mov $SYS_exit,%eax
xor %edi,%edi
syscall
These conventions are mostly portable to other UNIX-like operating systems.
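The same SYS_* constants are visible from C, which gives a quick way to check the header without writing any assembly. The following is a minimal sketch, assuming Linux with glibc; exit_via_raw_syscall is a hypothetical helper name, not something the headers provide. It forks a child that leaves through the raw exit system call and reports the status the parent observed.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE        /* exposes the syscall() declaration on glibc */
#endif
#include <sys/syscall.h>   /* SYS_exit and the other SYS_* constants */
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that exits through the raw exit
 * system call, then return the status the parent saw, or -1 on failure. */
int exit_via_raw_syscall(int status)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        syscall(SYS_exit, status);   /* does not return on success */
        _exit(127);                  /* only reached if the call failed */
    }

    int wstatus = 0;
    if (waitpid(pid, &wstatus, 0) != pid || !WIFEXITED(wstatus))
        return -1;
    return WEXITSTATUS(wstatus);
}
```

Because the number comes from SYS_exit rather than a literal, the same source picks up the right value for whichever architecture it is compiled for.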
BOOL32 doStuff() {
return TRUE;
}
gcc 2.95 for VxWorks 5.x, compiling the above code with -O0 for 32-bit x86, generated the following code:
doStuff:
0e9de190: push %ebp
0e9de191: mov %esp,%ebp
308 return TRUE;
0e9de193: mov $0x1,%eax
0e9de198: jmp 0xe9de1a0 <doStuff+16>
312 {
0e9de19a: lea 0x0(%esi),%esi
// The JMP jumps here
0e9de1a0: mov %ebp,%esp
0e9de1a2: pop %ebp
0e9de1a3: ret
Everything looks normal until the JMP and LEA instruction. What are they for?
My guess is that it is some kind of alignment, but I am not sure about this.
I would have done something like this:
doStuff:
0e9de190: push %ebp
0e9de191: mov %esp,%ebp
308 return TRUE;
0e9de193: mov $0x1,%eax
0e9de1XX: mov %ebp,%esp
0e9de1XX: pop %ebp
0e9de1XX: ret
0e9de1XX: fill with lea 0x0, %esi
lea 0x0(%esi),%esi is a long NOP, and the jmp is jumping over it. You probably have an ancient version of binutils (containing as) to go with your ancient gcc version.
So when gcc put a .p2align to align a label in the middle of the function that isn't otherwise a branch target (for some bizarre reason, but it's -O0 so it's not even supposed to be good code), the assembler made a long NOP and jumped over it.
Normally you'd only jump over a block of NOPs if there were a lot of them, especially if they were all single-byte NOPs. This is really dumb code, so stop using such crusty tools. You could try upgrading your assembler (but still using gcc2.95 if you need to). Or check that it doesn't happen at -O2 or -O3, in which case it doesn't matter.
If you have to keep using gcc2.95 for some reason, then just be aware that it's ancient, and this is part of the tradeoff you're making to keep using whatever it is that's forcing you to use it.
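The size of the gap can be checked with a little arithmetic. The sketch below computes how many bytes a .p2align directive has to fill at a given address; padding_for_p2align is a hypothetical helper name, and the 16-byte alignment (.p2align 4) is an assumption, since the listing does not show the directive. In the listing the lea sits at 0x0e9de19a and the next instruction at 0x0e9de1a0, so the lea is 6 bytes long.

```c
#include <stdint.h>

/* Hypothetical helper: bytes of padding needed to bring addr up to the
 * next 2^p2 boundary, which is what a .p2align p2 directive must fill. */
uint32_t padding_for_p2align(uint32_t addr, unsigned p2)
{
    uint32_t mask = (UINT32_C(1) << p2) - 1;
    return (0u - addr) & mask;   /* distance to the next boundary, 0 if aligned */
}
```

For the address of the lea, padding_for_p2align(0x0e9de19a, 4) gives 6, which matches the length of the long NOP. An 8-byte alignment would give the same result here, so the listing alone cannot tell the two apart.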
I've been trying to get familiar with assembly on mac, and from what I can tell, the documentation is really sparse, and most books on the subject are for windows or linux. I thought I would be able to translate from linux to mac pretty easily, however this (linux)
.file "simple.c"
.text
.globl simple
.type simple, @function
simple:
pushl %ebp
movl %esp, %ebp
movl 8(%ebp), %edx
movl 12(%ebp), %eax
addl (%edx), %eax
movl %eax, (%edx)
popl %ebp
ret
.size simple, .-simple
.ident "GCC: (Ubuntu 4.3.2-1ubuntu11) 4.3.2"
.section .note.GNU-stack,"",@progbits
seems pretty different from this (mac)
.section __TEXT,__text,regular,pure_instructions
.globl _simple
.align 4, 0x90
_simple: ## @simple
.cfi_startproc
## BB#0:
pushq %rbp
Ltmp2:
.cfi_def_cfa_offset 16
Ltmp3:
.cfi_offset %rbp, -16
movq %rsp, %rbp
Ltmp4:
.cfi_def_cfa_register %rbp
addl (%rdi), %esi
movl %esi, (%rdi)
movl %esi, %eax
popq %rbp
ret
.cfi_endproc
.subsections_via_symbols
The "normal" (for lack of a better word) instructions and registers such as pushq %rbp don't worry me. But the "weird" ones like .cfi_startproc and Ltmp2: which are smack dab in the middle of the machine instructions don't make any sense.
I have no idea where to go to find out what these are and what they mean. I'm about to pull my hair out as I've been trying to find a good resource for beginners for months. Any suggestions?
To begin with, you're comparing 32-bit x86 assembly with 64-bit x86-64. While the OS X Mach-O ABI supports 32-bit IA32, I suspect you want the x86-64 SysV ABI. (Thankfully, the x86-64.org site seems to be up again). The Mach-O x86-64 model is essentially a variant of the ELF / SysV ABI, so the differences are relatively minor for user-space code, even with different assemblers.
The .cfi directives are DWARF debugging directives that you don't strictly need for assembly - they are used for call frame information, etc. Here are some minimal examples:
ELF x86-64 assembler:
.text
.p2align 4
.globl my_function
.type my_function,@function
my_function:
...
.L__some_address:
.size my_function,[.-my_function]
Mach-O x86-64 assembler:
.text
.p2align 4
.globl _my_function
_my_function:
...
L__some_address:
Short of writing an asm tutorial, the main differences between the assemblers are leading underscores for Mach-O function names, and .L vs L for local labels (branch destinations). The assembler that ships with OS X understands the '.p2align' directive; .align 4, 0x90 essentially does the same thing.
Not all the directives in compiler-generated code are essential for the assembler to generate valid object code. They are required to generate stack frame (debugging) and exception handling data. Refer to the links for more information.
Obviously the Linux code is 32-bit Linux code. Note that 64-bit Linux can run both 32- and 64-bit code!
The Mac code is definitely 64-bit code.
This is the main difference.
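The ".cfi_xxx" lines are not instructions. They are assembler directives that emit call frame information for debugging and stack unwinding. They are not specific to the Mac file format; newer GCC versions on Linux emit them too.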
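Stripped of directives and labels, both listings in the question do the same work. They are consistent with a source like the sketch below. The original simple.c was not posted, so the parameter names and the exact body are assumptions reconstructed from the instructions.

```c
/* simple.c: hypothetical reconstruction; the original source was not posted. */
int simple(int *xp, int y)
{
    int t = *xp + y;   /* addl (%edx), %eax   /   addl (%rdi), %esi */
    *xp = t;           /* movl %eax, (%edx)   /   movl %esi, (%rdi) */
    return t;          /* the result ends up in %eax in both listings */
}
```

The 32-bit version loads both arguments from the stack (8(%ebp) and 12(%ebp)), while the 64-bit version receives them in %rdi and %esi. That difference comes from the calling convention, not from the operating system.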
I'm trying to convert a snippet of mine to a compiler that uses an inline asm syntax similar to gcc's. I read the documentation and all was fine until I encountered this line:
mov eax, dword ptr fs:[0x20]
I converted that to:
movl 0x20(%fs:), %eax
The compiler balked, telling me that fs is not a 32-bit register and that the operation is invalid. How should I access fs in AT&T syntax?
Found the answer; AT&T syntax is rather inconsistent here. The segment override goes in front of the displacement instead of inside the parentheses:
movl %fs:0x20, %eax
I'm trying to follow the book Professional Assembly Language on Mac OS X Mountain Lion.
On google I found a port for Mac OS X at the following url: Assembly on MacOS X
Created the file with Vim and compiled it with GAS:
as -g -arch i386 -o cpuid.o cpuid.s
Linked the code using gcc:
gcc -m32 -arch i386 -o cpuid cpuid.o
The resulting executable, cpuid, runs without errors, but if I debug it with gdb, at the end it says Program exited with code 044 instead of Program exited normally.
Trying to find a way to make it exit correctly, I created a hello world example in C and generated assembly code for it with:
gcc -Wall -03 -m32 -fno-PIC hello_pf.c -S -o hello_pf.s
The resulting assembly code is below:
.section __TEXT,__text,regular,pure_instructions
.globl _main
.align 4, 0x90
_main:
pushl %ebp
movl %esp, %ebp
subl $24, %esp
leal L_.str, %eax
movl %eax, (%esp)
call _puts
movl $0, -8(%ebp)
movl -8(%ebp), %eax
movl %eax, -4(%ebp)
movl -4(%ebp), %eax
addl $24, %esp
popl %ebp
ret
.section __TEXT,__cstring,cstring_literals
L_.str:
.asciz "Hello world!\n"
.subsections_via_symbols
Can someone provide any help regarding this issue?
How can I make a working version of cpuid from the link provided above using IA-32 Mac OS X assembly?
Where can I look for a detailed description of the stack alignment problem in Mac OS X? I've read what's on Apple's site, but it is not very helpful for a beginner.
What are the instructions after call _puts in the above sample code for?
How does calling libc functions from assembly really work? Are there any good detailed articles on this topic?
Thank you!
First you need to understand the register usage in the calling conventions, a good place for that is
http://www.agner.org/optimize/calling_conventions.pdf
You will find that on Mac OS X 64-bit the return value for a function returning an "int" - such as main() - is in %rax. You seem to want to use a 32-bit executable, in which case the return value is in %eax. One convenient way to zero out a register is to XOR it with itself, so you should add this to the end of your routine:
xorl %eax,%eax
That'll set %eax to zero, and that will be your exit code.
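As for the 044 in the question: gdb prints the exit code in octal, so 044 is decimal 36, which is whatever was left in the low byte of %eax when the routine returned. The sketch below reproduces that formatting. format_exit_code_octal is a hypothetical helper, and the "0%o" format is an assumption chosen to match the message in the question.

```c
#include <stdio.h>

/* Hypothetical helper: format the low 8 bits of an exit status in octal
 * with a leading 0, matching the style of the gdb message in the question.
 * Returns what snprintf returns. */
int format_exit_code_octal(int status, char *buf, size_t len)
{
    return snprintf(buf, len, "0%o", (unsigned)status & 0xffu);
}
```

Formatting 36 this way yields the string 044, which is why a leftover value in %eax shows up as that exit code.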