basic assembly not working on Mac (x86_64+Lion)? - macos

Here is the code (exit.s):
.section .data,
.section .text,
.globl _start
_start:
movl $1, %eax
movl $32, %ebx
syscall
When I execute "as exit.s -o exit.o && ld exit.o -o exit -e _start && ./exit",
the result is "Bus error: 10" and the output of "echo $?" is 138.
I also tried the example from the accepted answer to this question: Process command line in Linux 64 bit
but I still get "bus error"...

First, you are using the old 32-bit Linux kernel calling convention on Mac OS X - this absolutely doesn't work.
Second, syscalls in Mac OS X are structured in a different way - they all have a leading class identifier and a syscall number. The class can be Mach, BSD or something else (see here in the XNU source) and is shifted 24 bits to the left. Normal BSD syscalls have class 2 and thus begin from 0x2000000. Syscalls in class 0 are invalid.
As per §A.2.1 of the SysV AMD64 ABI, also followed by Mac OS X, the syscall id (together with its class on XNU!) goes in %rax (or in %eax, as the high 32 bits are unused on XNU). The first argument goes in %rdi, the next in %rsi, and so on. %rcx is used by the kernel and its value is destroyed, which is why all functions in libc.dylib save it into %r10 before making syscalls (similarly to the kernel_trap macro from syscall_sw.h).
Third, code sections in Mach-O binaries are called __text and not .text as in Linux ELF, and they reside in the __TEXT segment, collectively referred to as (__TEXT,__text) (nasm automatically translates .text as appropriate if Mach-O is selected as the target object type) - see the Mac OS X ABI Mach-O File Format Reference. Even if you get the assembly instructions right, putting them in the wrong segment/section leads to a bus error. You can either use the .section __TEXT,__text directive (see here for directive syntax) or the (simpler) .text directive, or you can drop it altogether since it is assumed if no -n option was supplied to as (see the manpage of as).
Fourth, the default entry point for the Mach-O ld is called start (although, as you've already figured out, it can be changed via the -e linker option).
Given all the above you should modify your assembler source to read as follows:
; You could also add one of the following directives for completeness
; .text
; or
; .section __TEXT,__text
.globl start
start:
movl $0x2000001, %eax
movl $32, %edi
syscall
Here it is, working as expected:
$ as -o exit.o exit.s; ld -o exit exit.o
$ ./exit; echo $?
32
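As a side note on the 138 from the question: a shell reports a process killed by a signal as 128 plus the signal number, and "Bus error: 10" names signal 10 (SIGBUS on macOS). A minimal sketch of that arithmetic follows; the constant and function names are made up for illustration, and the signal number is hard-coded because Linux numbers SIGBUS differently.

```python
# Assumption: SIGBUS is signal 10 on macOS/BSD (it is 7 on Linux),
# so the value is hard-coded instead of taken from the signal module.
SIGBUS_MACOS = 10


def shell_status_for_signal(signo: int) -> int:
    """Exit status a POSIX shell reports for a process killed by `signo`."""
    return 128 + signo


print(shell_status_for_signal(SIGBUS_MACOS))
```

So the 138 is just the shell's way of saying "died from signal 10", not a value the program itself returned.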

Adding more explanation on the magic number. I made the same mistake by applying the Linux syscall number to my NASM.
From the xnu kernel sources in osfmk/mach/i386/syscall_sw.h (search SYSCALL_CLASS_SHIFT).
/*
* Syscall classes for 64-bit system call entry.
* For 64-bit users, the 32-bit syscall number is partitioned
* with the high-order bits representing the class and low-order
* bits being the syscall number within that class.
* The high-order 32-bits of the 64-bit syscall number are unused.
* All system classes enter the kernel via the syscall instruction.
*/
Syscalls are partitioned:
#define SYSCALL_CLASS_NONE 0 /* Invalid */
#define SYSCALL_CLASS_MACH 1 /* Mach */
#define SYSCALL_CLASS_UNIX 2 /* Unix/BSD */
#define SYSCALL_CLASS_MDEP 3 /* Machine-dependent */
#define SYSCALL_CLASS_DIAG 4 /* Diagnostics */
As we can see, the tag for BSD system calls is 2. So that magic number 0x2000000 is constructed as:
// 2 << 24
#define SYSCALL_CONSTRUCT_UNIX(syscall_number) \
((SYSCALL_CLASS_UNIX << SYSCALL_CLASS_SHIFT) | \
(SYSCALL_NUMBER_MASK & (syscall_number)))
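To make the macro concrete, here is a rough Python rendering of it. The shift of 24 and the class value 2 come from the text above; the two mask constants are written from memory of syscall_sw.h and should be treated as assumptions.

```python
# Values mirroring osfmk/mach/i386/syscall_sw.h.
# Assumption: the mask definitions below match the header; only the
# shift (24) and the class value (2) are stated in the quoted source.
SYSCALL_CLASS_SHIFT = 24
SYSCALL_CLASS_MASK = 0xFF << SYSCALL_CLASS_SHIFT
SYSCALL_NUMBER_MASK = ~SYSCALL_CLASS_MASK & 0xFFFFFFFF  # keep to 32 bits

SYSCALL_CLASS_UNIX = 2


def syscall_construct_unix(syscall_number: int) -> int:
    """Python equivalent of the SYSCALL_CONSTRUCT_UNIX macro."""
    return (SYSCALL_CLASS_UNIX << SYSCALL_CLASS_SHIFT) | (
        SYSCALL_NUMBER_MASK & syscall_number
    )


for name, number in (("exit", 1), ("write", 4)):
    print(f"{name}: {syscall_construct_unix(number):#x}")
```

This gives 0x2000001 for exit and 0x2000004 for write, the same constants used in the assembly listings.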
As for why the BSD tag is used here: XNU combines a Mach layer and a BSD layer, and POSIX-style calls such as exit and write belong to the BSD layer, so they carry the Unix/BSD class.
Inspired by the original answer.

Related

macOS 64-bit System Call Table [duplicate]

I can find a Linux 64-bit system call table, but the call numbers do not work on macOS - I get a Bus Error: 10 whenever I try to use them.
What are the macOS call numbers for operations like sys_write?
You can get the list of system call numbers from user mode in (/usr/include/)sys/syscall.h. The numbers ARE NOT the same as in Linux. The file is autogenerated during XNU build from bsd/kern/syscalls/syscalls.master.
If you use the libsystem_kernel syscall export you can use the numbers as they are. If you use assembly you have to add 0x2000000 to mark them for the BSD layer (rather than 0x1000000, which would mean Mach traps, or 0x3000000, which would mean machine dependent).
To see examples of system call usage in assembly, you can easily disassemble the exported wrappers: x86_64's /usr/lib/system/libsystem_kernel.dylib (or ARM64's using jtool from the shared library cache).
You need to add 0x2000000 to the call number listed in a syscalls.master file. I'm using the XNU bsd/kern/syscalls.master file. Here's the entry in syscalls.master for the function that I'm going to call:
4 AUE_NULL ALL { user_ssize_t write(int fd, user_addr_t cbuf, user_size_t nbyte); }
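If you want to pull call numbers out of syscalls.master mechanically, something like the following sketch may help. The regular expression and the helper name are my own and only assume the "number, audit event, flags, { prototype }" layout of the entry shown above; real files contain other line shapes that it simply skips.

```python
import re

BSD_CLASS = 2 << 24  # 0x2000000, the Unix/BSD class tag

# Hypothetical pattern: "<number> <audit> <flags> { <return type> <name>(...". 
ENTRY = re.compile(r"^\s*(\d+)\s+\S+\s+\S+\s+\{\s*[\w\s\*]+?\b(\w+)\s*\(")


def parse_master_line(line: str):
    """Return (name, number, tagged number) or None if the line does not match."""
    match = ENTRY.match(line)
    if match is None:
        return None
    number, name = int(match.group(1)), match.group(2)
    return name, number, BSD_CLASS | number


line = "4 AUE_NULL ALL { user_ssize_t write(int fd, user_addr_t cbuf, user_size_t nbyte); }"
name, number, tagged = parse_master_line(line)
print(f"{name}: {number} -> {tagged:#x}")
```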
In terms of which registers to pass arguments to, it's the same as 64-bit Linux. Arguments are passed through the rdi, rsi, rdx, r10, r8 and r9 registers, respectively. The write function takes three arguments, which are described in the following assembly:
mov rax, 0x2000004 ; sys_write call identifier
mov rdi, 1 ; STDOUT file descriptor
mov rsi, myMessage ; buffer to print
mov rdx, myMessageLen ; length of buffer
syscall ; make the system call
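The register assignment in that listing can be written down as a small table. This is only an illustration of the ordering described above; register_plan is a hypothetical helper and the symbolic operands are placeholders, not real addresses.

```python
# Argument registers in the order given above; rax carries the call number.
ARG_REGISTERS = ("rdi", "rsi", "rdx", "r10", "r8", "r9")


def register_plan(call_number, *args):
    """Map a system call and its arguments onto registers."""
    if len(args) > len(ARG_REGISTERS):
        raise ValueError("at most six register arguments")
    plan = {"rax": call_number}
    plan.update(zip(ARG_REGISTERS, args))
    return plan


plan = register_plan(0x2000004, 1, "myMessage", "myMessageLen")
for register, value in plan.items():
    print(register, value)
```

Note that the fourth argument goes in r10 rather than rcx, which matters once a call takes more than three arguments.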
Error returns are different from Linux, though: on error, CF=1 and RAX=an errno code. (vs. Linux using rax=-4095..-1 as -errno in-band signalling.) See What is the relation between (carry flag) and syscall in assembly (x64 Intel syntax on Mac Os)?
RCX and R11 are overwritten by the syscall instruction itself, before any kernel code runs, so that part is necessarily the same as Linux.
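To spell out the two error conventions side by side, here is a small model of how a caller would interpret the raw register state after syscall returns. The function names and the dictionary shape are my own; only the rules (CF plus errno in RAX on macOS, -4095..-1 in RAX on Linux) come from the text above.

```python
MAX_ERRNO = 4095


def decode_linux(rax: int):
    """Interpret rax (raw unsigned 64-bit value) the Linux way."""
    signed = rax - (1 << 64) if rax >= (1 << 63) else rax
    if -MAX_ERRNO <= signed <= -1:
        return {"ok": False, "errno": -signed}
    return {"ok": True, "value": signed}


def decode_macos(rax: int, carry_flag: bool):
    """Interpret rax the macOS way: the carry flag signals failure."""
    if carry_flag:
        return {"ok": False, "errno": rax}
    return {"ok": True, "value": rax}


# The same failure (errno 2) as each kernel would report it.
print(decode_linux((1 << 64) - 2))
print(decode_macos(2, carry_flag=True))
```

In assembly this means checking the carry flag (for example with jc) on macOS instead of comparing rax against -4095.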
As was already pointed out, you need to add 0x2000000 to the call number. The explanation of that magic number comes from the xnu kernel sources in osfmk/mach/i386/syscall_sw.h (search SYSCALL_CLASS_SHIFT).
/*
* Syscall classes for 64-bit system call entry.
* For 64-bit users, the 32-bit syscall number is partitioned
* with the high-order bits representing the class and low-order
* bits being the syscall number within that class.
* The high-order 32-bits of the 64-bit syscall number are unused.
* All system classes enter the kernel via the syscall instruction.
*/
There are classes of system calls on OSX. All system calls enter the kernel via the syscall instruction. At that point there are Mach system calls, BSD system calls, NONE, diagnostic and machine-dependent.
#define SYSCALL_CLASS_NONE 0 /* Invalid */
#define SYSCALL_CLASS_MACH 1 /* Mach */
#define SYSCALL_CLASS_UNIX 2 /* Unix/BSD */
#define SYSCALL_CLASS_MDEP 3 /* Machine-dependent */
#define SYSCALL_CLASS_DIAG 4 /* Diagnostics */
Each system call is tagged with a class enumeration which is left-shifted 24 bits, SYSCALL_CLASS_SHIFT. The enumeration for BSD system calls is 2, SYSCALL_CLASS_UNIX. So that magic number 0x2000000 is constructed as:
// 2 << 24
#define SYSCALL_CONSTRUCT_UNIX(syscall_number) \
((SYSCALL_CLASS_UNIX << SYSCALL_CLASS_SHIFT) | \
(SYSCALL_NUMBER_MASK & (syscall_number)))
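Going the other way is a handy sanity check: split a value you are about to load into rax back into its class and number. A minimal sketch, assuming the 24-bit shift described above; the class names are paraphrased from the #defines and the helper name is hypothetical.

```python
SYSCALL_CLASS_SHIFT = 24
CLASS_NAMES = {
    0: "NONE (invalid)",
    1: "MACH",
    2: "UNIX/BSD",
    3: "MDEP",
    4: "DIAG",
}


def split_syscall(value: int):
    """Return (class, number) for a 32-bit syscall identifier."""
    value &= 0xFFFFFFFF  # the high 32 bits are unused
    syscall_class = value >> SYSCALL_CLASS_SHIFT
    number = value & ((1 << SYSCALL_CLASS_SHIFT) - 1)
    return syscall_class, number


for value in (0x2000004, 1):
    syscall_class, number = split_syscall(value)
    label = CLASS_NAMES.get(syscall_class, "unknown")
    print(f"{value:#x}: class {syscall_class} ({label}), number {number}")
```

The second value is what a Linux-style call number looks like to XNU: class 0, which is the invalid class.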
Apparently you can get that magic number from the kernel sources but not from the developer include files. I think this means that Apple really wants you to link against library object files that resolve your system call shim rather than use an inline routine: object compatibility rather than source compatibility.
On x86_64, the system call itself uses the System V ABI (section A.2.1) as Linux does, and it uses the syscall instruction (rather than int 0x80, the 32-bit Linux mechanism). Arguments are passed in rdi, rsi, rdx, r10, r8 and r9. The syscall number is in the rax register.

What are .seh_* assembly commands that gcc outputs?

I use gcc -S for a hello world program. What are the 5 .seh_ commands? I can't seem to find much info at all about them when I search.
.file "hi.c"
.def __main; .scl 2; .type 32; .endef
.section .rdata,"dr"
.LC0:
.ascii "Hello World\0"
.text
.globl main
.def main; .scl 2; .type 32; .endef
.seh_proc main
main:
pushq %rbp
.seh_pushreg %rbp
movq %rsp, %rbp
.seh_setframe %rbp, 0
subq $32, %rsp
.seh_stackalloc 32
.seh_endprologue
call __main
leaq .LC0(%rip), %rcx
call puts
movl $0, %eax
addq $32, %rsp
popq %rbp
ret
.seh_endproc
.ident "GCC: (rubenvb-4.8.0) 4.8.0"
.def puts; .scl 2; .type 32; .endef
These are gas's implementation of MASM's frame handling pseudos for generating an executable's .pdata and .xdata sections (structured exception handling stuff). Also check out Raw Pseudo Operations. Apparently if your code might be in the stack during an SEH unwind operation, you are expected to use these.
I found slightly more information at https://sourceware.org/ml/binutils/2009-08/msg00193.html. This thread seems to be the original checkin to gas to add support for all the .seh_* pseudo ops.
I would like to show the patch for .pdata and .xdata generation of
pe-coff targets via gas, and to get some feed-back. This patch
includes support for arm, ppc, arm, sh (3&4), mips, and x64. As for
x86 there is no OS support for runtime function information, I spared
this part. It would just increase executable size for x86 PE and there
is no real gain for this target.
Short overview:
There are at the moment three different function entry formats preset.
The first is the MIPS one. The second version is for ARM, PPC, SH3,
and SH4 mainly for Windows CE. The third is the IA64 and x64 version.
Note, the IA64 isn't implemented yet, but to find information about
it, please see specification about IA64 on
http://download.intel.com/design/Itanium/Downloads/245358.pdf file.
The first version has just entries in the pdata section: BeginAddress,
EndAddress, ExceptionHandler, HandlerData, and PrologueEndAddress.
Each value is a pointer to the corresponding data and has size of 4
bytes.
The second variant has the following entries in the pdata section.
BeginAddress, PrologueLength (8 bits), EndAddress (22 bits),
Use-32-bit-instruction (1 bit), and Exception-Handler-Exists (1 bit).
If the FunctionLength is zero, or the Exception-Handler-Exists bit is
true, a DATA_EH block is placed directly before function entry.
The third version has a function entry block of BeginAddress (RVA),
EndAddress (RVA), and UnwindData (RVA). The description of the
prologue, exception-handler, and additional SEH data is stored within
the UNWIND_DATA field in the xdata section.
.seh_proc <fct_name>
This specifies, that a SEH block begins for the function <fct_name>. This is valid for all
targets.
.seh_endprologue
By this pseudo the location of the prologue end-address (taken by the current code address of the appearance of
this pseudo). Valid for all targets.
.seh_handler <handler>[,<handler-data>]
This pseudo specifies the handler function to be used. For version 2 the
handler-data field specifies the user optional data block. For version
3 the handler-data field can be a rva to user-data (for FHANDLER), if
the name is #unwind the UHANDLER unwind block is generated, and if it
is #except (or not specified at all) EHANDLER exception block is
generated.
.seh_eh
This pseudo is used for version 2 to indicate the location of the function begin in assembly. The PDATA_EH data
may be stored here.
.seh_32/.seh_no32
This pseudos are just used for version 2 (see above for description). At the moment it defaults to no32, if not
specified.
.seh_endproc
By this pseudo the end of the SEH block is specified.
.seh_setframe <reg>,<offset>
By this pseudo the frame-register and the offset (value between 0-240 with 16-byte
alignment) can be specified. This is just used by version 3.
.seh_stackalloc <size>
By this stack allocation in code is described for version 3.
.seh_pushreg <reg>
By this a general register push in code is described for version 3.
.seh_savereg <reg>
By this a general register save to memory in code is described for version 3.
.seh_savemm <mm>
By this a mm register save to memory in code is described for version 3.
.seh_savexmm
By this a xmm register save to memory in code is described for version 3.
.seh_pushframe
By this information about entry kind can be described for version 3.
.seh_scope <begin>,<end>,<handler>,<jump>
By this SCOPED entries for unwind or exceptions can be specified for
version 3. This is just valid for UHANDLE and EHANDLER xdata
descriptor and a global handler has to be specified. For handler and
jump arguments, names of #1,#0, and #null can be used and they are
specifying that a constant instead of a rva has to be used.
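The "0-240 with 16-byte alignment" rule for .seh_setframe follows from how the x64 unwind data stores the frame offset: a 4-bit field scaled by 16. A minimal sketch of that check; the function name is hypothetical and this models only the range rule, not what gas actually does internally.

```python
def encode_frame_offset(offset: int) -> int:
    """Return the 4-bit scaled value for a .seh_setframe offset."""
    if offset % 16 != 0:
        raise ValueError("frame offset must be a multiple of 16")
    if not 0 <= offset <= 240:
        raise ValueError("frame offset must be in the range 0..240")
    return offset // 16  # 0..15, so it fits in 4 bits


print(encode_frame_offset(0))    # the ".seh_setframe %rbp, 0" case above
print(encode_frame_offset(240))  # largest offset that can be represented
```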
There is also some hard-core discussion of .xdata and .pdata (along with a bunch of links) at https://sourceware.org/ml/binutils/2009-04/msg00181.html.
I stopped them from being output by using:
gcc -S -fno-asynchronous-unwind-tables hi.c
so I can look that up. But I'm happy with just not having them output anymore.
They seem related to exception handling. That's all I could find.
http://ftp.netbsd.org/pub/NetBSD/NetBSD-current/src/external/gpl3/binutils/dist/gas/config/obj-coff-seh.h

Generating a pure (or flat) binary

How can you generate a flat binary that will run directly on the CPU?
That is, without an Operating System; also called free standing environment code (see What is the name for a program running directly without an OS?).
I've noticed that the assembler I'm using, as from the OS-X developer tools bundle, keeps generating Mach-O files, and not flat binaries.
This is the way I've done it. Using the linker that comes with the XCode Command Line Tools, you can combine object files using:
ld code1.o code2.o -o code.bin -r -U start
The -r asks ld to just combine object files together without making a library, -U tells ld to ignore the missing definition of _start (which would normally be provided by the C stdlib).
This creates a binary which still has some header bytes, but this is easily identified with
otool -l code.bin
Look for the __text section in the output:
Section
sectname __text
segname __TEXT
addr 0x00000000
size 0x0000003b
offset 240
align 2^4 (16)
reloff 300
nreloc 1
flags 0x80000400
reserved1 0
reserved2 0
Note the offset (which you can confirm by comparing the output of otool -l and hexdump). We don't want the headers so just use dd to copy out the bytes you need:
dd if=code.bin of=code_stripped.bin ibs=240 skip=1
where I've set the block size to the offset and skip one block.
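The same steps can be scripted. Below is a rough sketch that reads the offset and size of __text out of otool -l text and slices the bytes; the helper names are made up, and the parser assumes the "key value" layout shown in the listing above.

```python
def find_text_section(otool_output: str):
    """Return (offset, size) of the __text section from `otool -l` text."""
    current = None
    fields = {}
    for raw in otool_output.splitlines():
        parts = raw.split()
        if len(parts) < 2:
            continue
        key, value = parts[0], parts[1]
        if key == "sectname":
            current = value
            fields = {}
        elif current == "__text" and key in ("offset", "size"):
            fields[key] = int(value, 0)  # accepts both 240 and 0x3b
            if len(fields) == 2:
                return fields["offset"], fields["size"]
    return None


def extract(data: bytes, offset: int, size: int) -> bytes:
    """Copy exactly `size` bytes starting at `offset`."""
    return data[offset:offset + size]


sample = """Section
  sectname __text
   segname __TEXT
      addr 0x00000000
      size 0x0000003b
    offset 240
"""
offset, size = find_text_section(sample)
print(offset, size)
```

One difference from the dd command: dd copies from the offset to the end of the file, so it also picks up whatever follows the section (the listing shows relocation entries at reloff 300), whereas slicing by size stops at the end of the code.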
You don't. You get the linker to produce a flat (pure) binary. To do that, you have to write a linker script file with OUTPUT_FORMAT(binary). If memory serves, you also need to specify something about how the sections are merged, but I don't remember any of the details.
I don't think you necessarily need to do this. Some bootloaders can load more complex executable formats. For example, GRUB can load ELF right off the bat. I'm sure you can somehow get it or some other bootloader to load Mach-O files.
You may want to try using the nasm assembler -- it has an option to control the output binary format, including -f bin for flat binaries.
Note that you can't easily compile C code to flat binaries, since almost any C code will require binary features (like external symbols and relocations) which can't be represented in a flat binary.
There is no easy way I know of.
Once I needed to create a plain binary file to be loaded and executed by another program. However, the assembler (as) didn't allow me to do that. I tried to use gobjcopy to convert the object file to a raw binary, but it was not able to properly convert code such as this:
.quad LinkName2 - LinkName1
In binary file produced by gobjcopy it looked like
.quad 0
I ended up writing a special dumping program: an executable that saves part of its own memory to disk:
.set SYS_EXIT, 0x2000001
.set SYS_READ, 0x2000003
.set SYS_WRITE, 0x2000004
.set SYS_OPEN, 0x2000005
.set SYS_CLOSE, 0x2000006
.data
dumpfile: .ascii "./dump"
.byte 0
OutputFileDescriptor: .quad 0
.section __TEXT,__text,regular
.globl _main
_main:
movl $0644, %edx # file mode
movl $0x601, %esi # O_CREAT | O_TRUNC | O_WRONLY
leaq dumpfile(%rip), %rdi
movl $SYS_OPEN, %eax
syscall
movq %rax, OutputFileDescriptor(%rip)
movq $EndDump - BeginDump, %rdx
leaq BeginDump(%rip), %rsi
movq OutputFileDescriptor(%rip), %rdi
movl $SYS_WRITE, %eax
syscall
movq OutputFileDescriptor(%rip), %rdi
movl $SYS_CLOSE, %eax
syscall
Done:
movq %rax, %rdi
movl $SYS_EXIT, %eax
syscall
.align 3
BeginDump:
.include "dump.s"
EndDump:
.quad 0
The code that has to be saved as a raw binary file is included in dump.s.
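For reference, the two magic numbers passed to open in the listing break down as follows. The flag values are what I believe macOS <fcntl.h> defines and are hard-coded on purpose; treat them as assumptions and check the header on your system, since Linux uses different values for O_CREAT and O_TRUNC.

```python
# Assumption: flag values as in macOS <fcntl.h>. Python's os.O_* constants
# are not used because they reflect whatever platform runs this script.
O_WRONLY = 0x0001
O_CREAT = 0x0200
O_TRUNC = 0x0400

flags = O_CREAT | O_TRUNC | O_WRONLY
mode = 0o644  # the assembler reads $0644 as octal, as C does

print(hex(flags), oct(mode), mode)
```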

64 bit assembly on Mac OS X runtime errors: "dyld: no writable segment" and "Trace/BPT trap"

When attempting to run the following assembly program:
.globl start
start:
pushq $0x0
movq $0x1, %rax
subq $0x8, %rsp
int $0x80
I am receiving the following errors:
dyld: no writable segment
Trace/BPT trap
Any idea what could be causing this? The analogous program in 32 bit assembly runs fine.
OSX now requires your executable to have a writable data segment with content, so it can relocate and link your code dynamically. Dunno why, maybe security reasons, maybe due to the new RIP register. If you put a .data segment in there (with some bogus content), you'll avoid the "no writable segment" error. IMO this is an ld bug.
Regarding the 64-bit syscall, you can do it 2 ways. GCC-style, which uses the _syscall PROCEDURE from libSystem.dylib, or raw. Raw uses the syscall instruction, not the int 0x80 trap. int 0x80 is an illegal instruction in 64-bit.
The "GCC method" will take care of categorizing the syscall for you, so you can use the same 32-bit numbers found in sys/syscall.h. But if you go raw, you'll have to classify what kind of syscall it is by ORing it with a type id. Here is an example of both. Note that the calling convention is different! (this is NASM syntax because gas annoys me)
; assemble with
; nasm -f macho64 -o syscall64.o syscall64.asm && ld -lc -ldylib1.o -e start -o syscall64 syscall64.o
extern _syscall
global start
[section .text align=16]
start:
; do it gcc-style
mov rdi, 0x4 ; sys_write
mov rsi, 1 ; file descriptor
mov rdx, hello
mov rcx, size
call _syscall ; we're calling a procedure, not trapping.
;now let's do it raw
mov rax, 0x2000001 ; SYS_exit = 1 and is type 2 (bsd call)
mov rdi, 0 ; Exit success = 0
syscall ; faster than int 0x80, and legal!
[section .data align=16]
hello: db "hello 64-bit syscall!", 0x0a
size: equ $-hello
Check out http://www.opensource.apple.com/source/xnu/xnu-792.13.8/osfmk/mach/i386/syscall_sw.h for more info on how a syscall is typed.
The system call interface is different between 32 and 64 bits. Firstly, int $80 is replaced by syscall and the system call numbers are different. You will need to look up documentation for a 64-bit version of your system call. Here is an example of what a 64-bit program may look like.

assembly language in os x

I used Assembly Language Step by Step to learn assembly language programming on Linux. I recently got a Mac, on which int 0x80 doesn't seem to work (illegal instruction).
So I just wanted to know if there is a good reference (book/webpage) which gives the differences between standard Unix assembly and Darwin assembly.
For practical purposes, this answer shows how to compile a hello world application using nasm on OSX.
Note that this code is specific to OS X (and BSD): it passes system call arguments on the stack, whereas 32-bit Linux expects them in registers, so it will not run correctly on Linux as is:
section .text
global mystart ; make the main function externally visible
mystart:
; 1 print "hello, world"
; 1a prepare the arguments for the system call to write
push dword mylen ; message length
push dword mymsg ; message to write
push dword 1 ; file descriptor value
; 1b make the system call to write
mov eax, 0x4 ; system call number for write
sub esp, 4 ; OS X (and BSD) system calls need "extra space" on stack
int 0x80 ; make the actual system call
; 1c clean up the stack
add esp, 16 ; 3 args * 4 bytes/arg + 4 bytes extra space = 16 bytes
; 2 exit the program
; 2a prepare the argument for the sys call to exit
push dword 0 ; exit status returned to the operating system
; 2b make the call to sys call to exit
mov eax, 0x1 ; system call number for exit
sub esp, 4 ; OS X (and BSD) system calls need "extra space" on stack
int 0x80 ; make the system call
; 2c no need to clean up the stack because no code here would be executed: already exited
section .data
mymsg db "hello, world", 0xa ; string with a trailing newline (line feed)
mylen equ $-mymsg ; string length in bytes
Assemble the source (hello.nasm) to an object file:
nasm -f macho hello.nasm
Link to produce the executable:
ld -o hello -e mystart hello.o
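The "add esp, 16" in step 1c is easy to get wrong when a call takes a different number of arguments. A tiny sketch of the arithmetic from the comment in the listing; the helper name is made up for illustration.

```python
WORD = 4  # bytes per 32-bit stack slot


def bsd32_cleanup_bytes(nargs: int) -> int:
    """Bytes to add back to esp after an int 0x80 call with `nargs` arguments."""
    return nargs * WORD + WORD  # the arguments plus the one extra slot


print(bsd32_cleanup_bytes(3))  # write(fd, buf, len)
print(bsd32_cleanup_bytes(1))  # exit(status), if you ever returned from it
```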
This question will likely help: List of and documentation for system calls for XNU kernel in OSX.
Unfortunately, it looks like the book mentioned there is the only way to find out. As for int 0x80, I doubt it will work because it is a pretty Linux specific API that is built right into the kernel.
The compromise I make when working on an unfamiliar OS is to just use libc calls, but I can understand that even that may be too high level if you're just looking to learn.
Can you post your code and how you compiled it? (There are many ways to elicit illegal instruction errors.)
OS X picked up the BSD style of passing arguments, which is why you have to do things slightly differently.
I bookmarked this a while ago: http://www.freebsd.org/doc/en/books/developers-handbook/book.html#X86-SYSTEM-CALLS

Resources