How can you generate a flat binary that will run directly on the CPU?
That is, without an Operating System; also called free standing environment code (see What is the name for a program running directly without an OS?).
I've noticed that the assembler I'm using, as from the OS-X developer tools bundle, keeps generating Mach-O files, and not flat binaries.
This is the way I've done it. Using the linker that comes with the XCode Command Line Tools, you can combine object files using:
ld code1.o code2.o -o code.bin -r -U start
The -r asks ld to just combine object files together without making a library, -U tells ld to ignore the missing definition of _start (which would normally be provided by the C stdlib).
This creates a binary which still has some header bytes, but this is easily identified with
otool -l code.bin
Look for the __text section in the output:
Section
sectname __text
segname __TEXT
addr 0x00000000
size 0x0000003b
offset 240
align 2^4 (16)
reloff 300
nreloc 1
flags 0x80000400
reserved1 0
reserved2 0
Note the offset (which you can confirm by comparing the output of otool -l and hexdump). We don't want the headers so just use dd to copy out the bytes you need:
dd if=code.bin of=code_stripped.bin ibs=240 skip=1
where I've set the block size to the offset and skipping one block.
You don't. You get the linker to produce a flat (pure) binary. To do that, you have to write a linker script file with OUTPUT_FORMAT(binary). If memory serves, you also need to specify something about how the sections are merged, but I don't remember any of the details.
I don't think you necessarily need to do this. Some bootloaders can load more complex executable formats. For example, GRUB can load ELF right off the bat. I'm sure you can somehow get it or some other bootloader to load Mach-O files.
You may want to try using the nasm assembler -- it has an option to control the output binary format, including -f bin for flat binaries.
Note that you can't easily compile C code to flat binaries, since almost any C code will require binary features (like external symbols and relocations) which can't be represented in a flat binary.
There is no easy way I know of.
Once I needed to create plain binary file which will be loaded and executed by another program. However, as didn't allow me to do that. I tried to use gobjcopy to convert object file to raw binary, but it was not able to properly convert code such as this:
.quad LinkName2 - LinkName1
In binary file produced by gobjcopy it looked like
.quad 0
I've ended up writing special dumping program, which is executable that will save part of the memory on disk:
.set SYS_EXIT, 0x2000001
.set SYS_READ, 0x2000003
.set SYS_WRITE, 0x2000004
.set SYS_OPEN, 0x2000005
.set SYS_CLOSE, 0x2000006
.data
dumpfile: .ascii "./dump"
.byte 0
OutputFileDescriptor: .quad 0
.section __TEXT,__text,regular
.globl _main
_main:
movl $0644, %edx # file mode
movl $0x601, %esi # O_CREAT | O_TRUNC | O_WRONLY
leaq dumpfile(%rip), %rdi
movl $SYS_OPEN, %eax
syscall
movq %rax, OutputFileDescriptor(%rip)
movq $EndDump - BeginDump, %rdx
leaq BeginDump(%rip), %rsi
movq OutputFileDescriptor(%rip), %rdi
movl $SYS_WRITE, %eax
syscall
movq OutputFileDescriptor(%rip), %rdi
movl $SYS_CLOSE, %eax
syscall
Done:
movq %rax, %rdi
movl $SYS_EXIT, %eax
syscall
.align 3
BeginDump:
.include "dump.s"
EndDump:
.quad 0
The code that have to be saved as raw binary file is included in dump.s
Related
I have written a small piece of assembly with AT&T syntax and have currently declared three variables in the .data section. However, when I attempt to move any of those variables to a register, such as %eax, an error from gcc is raised. The code and error message is below:
.data
x:.int 14
y:.int 4
str: .string "some string\n"
.globl _main
_main:
pushq %rbp
movq %rsp, %rbp
subq $16, %rsp
movl x, %eax; #attempting to move the value of x to %eax;
leave
ret
The error raised is:
call_function.s:14:3: error: 32-bit absolute addressing is not supported in 64-bit mode
movl x, %eax;
^
I have also tried moving the value by first adding the $ character in front of x, however, a clang error is raised:
clang: error: linker command failed with exit code 1 (use -v to see invocation)
Does anyone know how the value stored in x can be successfully moved to %eax? I am using x86 assembly on Mac OSX and compiling with gcc.
A RIP-relative addressing mode is the only good option for addressing static data on MacOS; the image base address is above 2^32 so 32-bit absolute addresses aren't usable even in position-dependent code (unlike x86-64 Linux). RIP-relative addressing of static data is position-independent, so it works even in position-independent executables (ASLR) and libraries.
movl x(%rip), %eax is the AT&T syntax for RIP-relative.
mov eax, dword ptr [rip+x] in GAS .intel_syntax noprefix.
Or, to get the address of a symbol into a register, lea x(%rip), %rdi
NASM syntax: mov eax, [rel x], or use default rel so [x] is RIP-relative.
See Mach-O 64-bit format does not support 32-bit absolute addresses. NASM Accessing Array for more background on what you can do on OS X, e.g. movabs x, %eax would be possible because the destination register is AL/AX/EAX/RAX. (64-bit absolute address, but don't do that because it's larger and not faster than a RIP-relative load.)
See also http://felixcloutier.com/x86/MOV.html.
I am trying to write a program to find the minimum value of a list of integers in asm. Here is what I have so far:
.section .data
data_items:
.long 2,3,4,5,1,9,10 # set 10 as the sentinal value
.section text
.globl _start
_start:
# %ebx holds min
# %edi holds index (destination index)
# %eax current data item
movl $255, %ebx # set the current min to 255
movl $0, %edi # the index is also zero
start_loop:
movl data_items(,%edi,4), %eax # set %eax equal to the current data item
cmpl $10, %eax # compare %eax with zero to see if we should exit
je exit_loop # if it's the sentinel value, exit
incl %edi # increment the index
cmpl %eax, %edi # compare the current value to the current min
jge start_loop # if it's not less than the current value, go to start
movl %eax, %ebx # move the current value if less that the current min
jmp start_loop # always go back to the start if we've gotten this far
exit_loop:
movl $1, %eax # push the linux system call to %eax (1=exit)
int $0x80 # give linux control (so it will exit)
When I run this, I get the following:
$ as min.s -o min.o && ld min.o -o min && ./min
Segmentation fault (core dumped)
How is one supposed to debug asm? For example, at least in C the compiler tells you what the error might be and the line number, whereas here I know just about nothing. (Note: the error is having .section text instead of .section .text but how would one figure that out?)
It's very possible in C to write a program that compiles with no warnings but crashes (e.g. NULL pointer deref), and you'll see exactly the same thing. It's much more likely in asm, though.
You debug asm with a debugger, GDB for example. See tips at the bottom of https://stackoverflow.com/tags/x86/info. And if you make any system calls, use strace to see what your program is actually doing.
To debug this, you'd run it under GDB and notice that it segfaulted on the first instruction, movl $255, %ebx. It doesn't access memory so code-fetch must have faulted. So there must be something wrong with your sections that resulted in your code in section linked into a non-executable segment of your executable.
objdump -d would also have given you a hint: it disassembles the .text section by default, and this program doesn't have one.
The reason text instead of .text causes this problem is that the defaults for sections with random names that aren't one of the few specially-recognized ones are read+write without exec.
In GAS, use .text or .data, special shortcut directives for .section .text or .data which avoid this problem for those sections. https://sourceware.org/binutils/docs/as/Text.html
But not all "standard" sections have special directives, you do still need .section .rodata to switch to the read-only data section, where you should have put your array. (read, no write. On newer toolchains, also no exec). Instead of switching to the .bss section, though, you can use .comm or .lcomm (https://sourceware.org/binutils/docs/as/bss.html)
Another possible problem is that you're building this 32-bit code as a 64-bit executable (unless you're using a 32-bit-only install where as --32 is the default). Using 32-bit addressing modes works in 64-bit modes, truncating the address to 32 bits. That works when accessing static data in a position-dependent executable on Linux, because all code+data is linked into the low 2GiB of virtual address space.
But any access to (%esp) or -4(%ebp) or whatever would fault because the stack in a 64-bit process is mapped to a high address with non-zero bits outside the low 32.
You'd notice that problem in GDB because layout reg would show all 16 64-bit integer registers, RAX..R15.
I have written a small piece of assembly with AT&T syntax and have currently declared three variables in the .data section. However, when I attempt to move any of those variables to a register, such as %eax, an error from gcc is raised. The code and error message is below:
.data
x:.int 14
y:.int 4
str: .string "some string\n"
.globl _main
_main:
pushq %rbp
movq %rsp, %rbp
subq $16, %rsp
movl x, %eax; #attempting to move the value of x to %eax;
leave
ret
The error raised is:
call_function.s:14:3: error: 32-bit absolute addressing is not supported in 64-bit mode
movl x, %eax;
^
I have also tried moving the value by first adding the $ character in front of x, however, a clang error is raised:
clang: error: linker command failed with exit code 1 (use -v to see invocation)
Does anyone know how the value stored in x can be successfully moved to %eax? I am using x86 assembly on Mac OSX and compiling with gcc.
A RIP-relative addressing mode is the only good option for addressing static data on MacOS; the image base address is above 2^32 so 32-bit absolute addresses aren't usable even in position-dependent code (unlike x86-64 Linux). RIP-relative addressing of static data is position-independent, so it works even in position-independent executables (ASLR) and libraries.
movl x(%rip), %eax is the AT&T syntax for RIP-relative.
mov eax, dword ptr [rip+x] in GAS .intel_syntax noprefix.
Or, to get the address of a symbol into a register, lea x(%rip), %rdi
NASM syntax: mov eax, [rel x], or use default rel so [x] is RIP-relative.
See Mach-O 64-bit format does not support 32-bit absolute addresses. NASM Accessing Array for more background on what you can do on OS X, e.g. movabs x, %eax would be possible because the destination register is AL/AX/EAX/RAX. (64-bit absolute address, but don't do that because it's larger and not faster than a RIP-relative load.)
See also http://felixcloutier.com/x86/MOV.html.
I am trying to learn x86_64 assembly, and am using GCC as my assembler. The exact command I'm using is:
gcc -nostdlib tapydn.S -D__ASSEMBLY__
I'm mainly using gcc for its preprocessor. Here is tapydn.S:
.global _start
#include <asm-generic/unistd.h>
syscall=0x80
.text
_start:
movl $__NR_exit, %eax
movl $0x00, %ebx
int $syscall
This results in a segmentation fault. I believe the problem is with the following line:
movl $__NR_exit, %eax
I used __NR_exit because it was more descriptive than some magic number. However, it appears that my usage of it is incorrect. I believe this to be the case because when I change the line in question to the following, it runs fine:
movl $0x01, %eax
Further backing up this trail of thought is the contents of usr/include/asm-generic/unistd.h:
#define __NR_exit 93
__SYSCALL(__NR_exit, sys_exit)
I expected the value of __NR_exit to be 1, not 93! Clearly I am misunderstanding its purpose and consequently its usage. For all I know, I'm getting lucky with the $0x01 case working (much like undefined behaviour in C++), so I kept digging...
Next, I looked for the definition of sys_exit. I couldn't find it. I tried using it anyway as follows (with and without the preceeding $):
movl $sys_exit, %eax
This wouldn't link:
/tmp/cc7tEUtC.o: In function `_start':
(.text+0x1): undefined reference to `sys_exit'
collect2: error: ld returned 1 exit status
My guess is that it's a symbol in one of the system libraries and I'm not linking it due to my passing -nostdlib to GCC. I'd like to avoid linking such a large library for just one symbol if possible.
In response to Jester's comment about mixing 32 and 64 bit constants, I tried using the value 0x3C as suggested:
movq $0x3C, %eax
movq $0x00, %ebx
This also resulting a segmentation fault. I also tried swapping out eax and ebx for rax and rbx:
movq $0x3C, %rax
movq $0x00, %rbx
The segmentation fault remained.
Jester then commented stating that I should be using syscall rather than int $0x80:
.global _start
#include <asm-generic/unistd.h>
.text
_start:
movq $0x3C, %rax
movq $0x00, %rbx
syscall
This works, but I was later informed that I should be using rdi instead of rbx as per the System V AMD64 ABI:
movq $0x00, %rdi
This also works fine, but still ends up using the magic number 0x3C for the system call number.
Wrapping up, my questions are as follows:
What is the correct usage of __NR_exit?
What should I be using instead of a magic number for the exit system call?
The correct header file to get the system call numbers is sys/syscall.h. The constants are called SYS_### where ### is the name of the system call you are interested in. The __NR_### macros are implementation details and should not be used. As a rule of thumb, if an identifier begins with an underscore it should not be used, if it begins with two it should definitely not be used. The arguments go into rdi, rsi, rdx, r10, r8, and r9. Here is a sample program for Linux:
#include <sys/syscall.h>
.globl _start
_start:
mov $SYS_exit,%eax
xor %edi,%edi
syscall
These conventions are mostly portable to other UNIX-like operating systems.
When attempting to run the following assembly program:
.globl start
start:
pushq $0x0
movq $0x1, %rax
subq $0x8, %rsp
int $0x80
I am receiving the following errors:
dyld: no writable segment
Trace/BPT trap
Any idea what could be causing this? The analogous program in 32 bit assembly runs fine.
OSX now requires your executable to have a writable data segment with content, so it can relocate and link your code dynamically. Dunno why, maybe security reasons, maybe due to the new RIP register. If you put a .data segment in there (with some bogus content), you'll avoid the "no writable segment" error. IMO this is an ld bug.
Regarding the 64-bit syscall, you can do it 2 ways. GCC-style, which uses the _syscall PROCEDURE from libSystem.dylib, or raw. Raw uses the syscall instruction, not the int 0x80 trap. int 0x80 is an illegal instruction in 64-bit.
The "GCC method" will take care of categorizing the syscall for you, so you can use the same 32-bit numbers found in sys/syscall.h. But if you go raw, you'll have to classify what kind of syscall it is by ORing it with a type id. Here is an example of both. Note that the calling convention is different! (this is NASM syntax because gas annoys me)
; assemble with
; nasm -f macho64 -o syscall64.o syscall64.asm && ld -lc -ldylib1.o -e start -o syscall64 syscall64.o
extern _syscall
global start
[section .text align=16]
start:
; do it gcc-style
mov rdi, 0x4 ; sys_write
mov rsi, 1 ; file descriptor
mov rdx, hello
mov rcx, size
call _syscall ; we're calling a procedure, not trapping.
;now let's do it raw
mov rax, 0x2000001 ; SYS_exit = 1 and is type 2 (bsd call)
mov rdi, 0 ; Exit success = 0
syscall ; faster than int 0x80, and legal!
[section .data align=16]
hello: db "hello 64-bit syscall!", 0x0a
size: equ $-hello
check out http://www.opensource.apple.com/source/xnu/xnu-792.13.8/osfmk/mach/i386/syscall_sw.h for more info on how a syscall is typed.
The system call interface is different between 32 and 64 bits. Firstly, int $80 is replaced by syscall and the system call numbers are different. You will need to look up documentation for a 64-bit version of your system call. Here is an example of what a 64-bit program may look like.