I used Assembly Language Step by Step to learn assembly language programming on Linux. I recently got a Mac, on which int 0x80 doesn't seem to work (illegal instruction).
So I just wanted to know whether there is a good reference (book or web page) that covers the differences between standard Unix assembly and Darwin assembly.
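For practical purposes, this answer shows how to build a hello world application using nasm on OS X.
This code also assembles for Linux as is, although the command line would differ and, because Linux's int 0x80 takes its arguments in registers rather than on the stack, the system calls would need adjusting to behave the same way: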
section .text
global mystart ; make the main function externally visible
mystart:
; 1 print "hello, world"
; 1a prepare the arguments for the system call to write
push dword mylen ; message length
push dword mymsg ; message to write
push dword 1 ; file descriptor value
; 1b make the system call to write
mov eax, 0x4 ; system call number for write
sub esp, 4 ; OS X (and BSD) system calls need "extra space" on stack
int 0x80 ; make the actual system call
; 1c clean up the stack
add esp, 16 ; 3 args * 4 bytes/arg + 4 bytes extra space = 16 bytes
; 2 exit the program
; 2a prepare the argument for the sys call to exit
push dword 0 ; exit status returned to the operating system
; 2b make the call to sys call to exit
mov eax, 0x1 ; system call number for exit
sub esp, 4 ; OS X (and BSD) system calls need "extra space" on stack
int 0x80 ; make the system call
; 2c no need to clean up the stack because no code here would be executed: already exited
section .data
mymsg db "hello, world", 0xa ; string with a trailing line feed (newline)
mylen equ $-mymsg ; string length in bytes
Assemble the source (hello.nasm) to an object file:
nasm -f macho hello.nasm
Link to produce the executable:
ld -o hello -e mystart hello.o
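The stack bookkeeping in step 1c above can be checked with a few lines of arithmetic. The following is a minimal Python sketch, not part of the original answer; the helper name bsd_syscall_cleanup is made up for illustration, and it assumes 32-bit code in which every argument occupies one 4-byte stack slot.

```python
WORD = 4  # bytes per stack slot in 32-bit code


def bsd_syscall_cleanup(num_args: int) -> int:
    """Bytes to add back to esp after a BSD-style int 0x80 call.

    Hypothetical helper: counts the pushed arguments plus the one
    extra 4-byte slot reserved with `sub esp, 4`.
    """
    return num_args * WORD + WORD


if __name__ == "__main__":
    print(bsd_syscall_cleanup(3))  # write(fd, buf, len)
    print(bsd_syscall_cleanup(1))  # exit(status)
```

For the write call this gives the 16 bytes used in add esp, 16; the exit call would need 8 bytes if the program ever got the chance to clean up after it.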
This question will likely help: List of and documentation for system calls for XNU kernel in OSX.
Unfortunately, it looks like the book mentioned there is the only way to find out. As for int 0x80, I doubt it will work because it is a pretty Linux specific API that is built right into the kernel.
The compromise I make when working on an unfamiliar OS is to just use libc calls, but I can understand that even that may be too high level if you're just looking to learn.
Can you post your code and how you compiled it? (There are many ways to elicit illegal instruction errors.)
OS X picked up the BSD style of passing arguments, which is why you have to do things slightly differently.
I bookmarked this a while ago: http://www.freebsd.org/doc/en/books/developers-handbook/book.html#X86-SYSTEM-CALLS
I wrote the following code to check whether the first number, x, is greater than the second number, y. For x > y the output should be 1, and for x <= y the output should be 0.
section .text
global _start
global checkGreater
_start:
mov rdi,x
mov rsi,y
call checkGreater
mov rax,60
mov rdi,0
syscall
checkGreater:
mov r8,rdi
mov r9,rsi
cmp r8,r9
jg skip
mov [c],byte '0'
skip:
mov rax,1
mov rdi,1
mov rsi,c
mov rdx,1
syscall
ret
section .data
x db 7
y db 5
c db '1',0
But for some reason (a mistake on my end, of course), the code always gives 0 as the output when executed.
I am using the following commands to run the code on Ubuntu 20.04.1 LTS with nasm 2.14.02-1
nasm -f elf64 fileName.asm
ld -s -o fileName fileName.o
./fileName
Where did I make a mistake?
And how should one debug assembly code? I looked into printing the arguments received in checkGreater, but that turned out to be a headache in itself.
Note: In case anyone is wondering why I didn't use x and y directly in checkGreater, I want to extend the comparison to user input, so I wrote the code this way.
The instructions
mov rdi,x
mov rsi,y
write the address of x into rdi, and of y into rsi. The further code then goes on to compare the addresses, which are always x<y, since x is defined above y.
What you should have written instead is
mov rdi,[x]
mov rsi,[y]
But then you have another problem: x and y variables are 1 byte long, while the destination registers are 8 bytes long. So simply doing the above fix will read extraneous bytes, leading to useless results. The final correction is to either fix the size of the variables (writing dq instead of db), or read them as bytes:
movzx rdi,byte [x]
movzx rsi,byte [y]
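To make the three variants concrete, here is a small Python model of the .data section from the question. It is only a sketch: the load address BASE and the zero padding after c are assumptions made for illustration (the real neighbouring bytes depend on the linker), and the helper names load_byte and load_qword are hypothetical.

```python
import struct

# Hypothetical layout of the question's .data section: x, y, c, then
# zero padding (assumed) so that 8-byte reads stay inside the buffer.
BASE = 0x402000  # assumed load address, for illustration only
DATA = bytes([7, 5, ord("1"), 0]) + bytes(8)
ADDR = {"x": BASE, "y": BASE + 1}


def load_byte(name: str) -> int:
    """Model of `movzx reg, byte [name]`: one byte, zero-extended."""
    return DATA[ADDR[name] - BASE]


def load_qword(name: str) -> int:
    """Model of `mov reg, [name]` into a 64-bit register: 8 little-endian bytes."""
    return struct.unpack_from("<Q", DATA, ADDR[name] - BASE)[0]


if __name__ == "__main__":
    print(ADDR["x"] > ADDR["y"])            # mov rdi,x: compares addresses
    print(hex(load_qword("x")))             # mov rdi,[x]: drags in the neighbouring bytes
    print(load_byte("x") > load_byte("y"))  # movzx: compares the values 7 and 5
```

Under these assumptions the address comparison is false regardless of the stored values, the 8-byte load of x picks up y and c as well, and only the byte-sized load compares 7 with 5.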
As for
And how should one debug assembly codes
The main tool for you is an assembly-level debugger, like EDB on Linux or x64dbg on Windows. But in fact, most debuggers, even the ones intended for languages like C++, are capable of displaying disassembly for the program being debugged. So you can use e.g. GDB, or even a GUI wrapper for it like Qt Creator or Eclipse. Just be sure to switch to machine code mode, or use the appropriate commands such as GDB's disassemble, stepi, and info registers.
Note that you don't have to build EDB or GDB from source (as the links above might suggest): they are likely already packaged in the Linux distribution you use. E.g. on Ubuntu the packages are called edb-debugger and gdb.
Does anyone know of any good tools (I'm looking for IDEs) to write assembly on the Mac. Xcode is a little cumbersome to me.
Also, on the Intel Macs, can I use generic x86 asm? Or is there a modified instruction set? Any information about post-Intel Mac assembly helps.
Also: I know that on windows, asm can run in an emulated environment created by the OS to let the code think it's running on its own dedicated machine. Does OS X provide the same thing?
After installing any version of Xcode targeting Intel-based Macs, you should be able to write assembly code. Xcode is a suite of tools, only one of which is the IDE, so you don't have to use it if you don't want to. (That said, if there are specific things you find clunky, please file a bug at Apple's bug reporter - every bug goes to engineering.) Furthermore, installing Xcode will install both the Netwide Assembler (NASM) and the GNU Assembler (GAS); that will let you use whatever assembly syntax you're most comfortable with.
You'll also want to take a look at the Compiler & Debugging Guides, because those document the calling conventions used for the various architectures that Mac OS X runs on, as well as how the binary format and the loader work. The IA-32 (x86-32) calling conventions in particular may be slightly different from what you're used to.
Another thing to keep in mind is that the system call interface on Mac OS X is different from what you might be used to on DOS/Windows, Linux, or the other BSD flavors. System calls aren't considered a stable API on Mac OS X; instead, you always go through libSystem. That will ensure you're writing code that's portable from one release of the OS to the next.
Finally, keep in mind that Mac OS X runs across a pretty wide array of hardware - everything from the 32-bit Core Single through the high-end quad-core Xeon. By coding in assembly you might not be optimizing as much as you think; what's optimal on one machine may be pessimal on another. Apple regularly measures its compilers and tunes their output with the "-Os" optimization flag to be decent across its line, and there are extensive vector/matrix-processing libraries that you can use to get high performance with hand-tuned CPU-specific implementations.
Going to assembly for fun is great. Going to assembly for speed is not for the faint of heart these days.
As stated before, don't use syscall. You can use standard C library calls though, but be aware that the stack MUST be 16 byte aligned per Apple's IA32 function call ABI.
If you don't align the stack, your program will crash in __dyld_misaligned_stack_error when you make a call into any of the libraries or frameworks.
The following snippet assembles and runs on my system:
; File: hello.asm
; Build: nasm -f macho hello.asm && gcc -o hello hello.o
SECTION .rodata
hello.msg db 'Hello, World!',0x0a,0x00
SECTION .text
extern _printf ; could also use _puts...
GLOBAL _main
; aligns esp to 16 bytes in preparation for calling a C library function
; arg is number of bytes to pad for function arguments, this should be a multiple of 16
; unless you are using push/pop to load args
%macro clib_prolog 1
mov ebx, esp ; remember current esp
and esp, 0xFFFFFFF0 ; align to next 16 byte boundary (could be zero offset!)
sub esp, 12 ; skip ahead 12 so we can store original esp
push ebx ; store esp (16 bytes aligned again)
sub esp, %1 ; pad for arguments (make conditional?)
%endmacro
; arg must match most recent call to clib_prolog
%macro clib_epilog 1
add esp, %1 ; remove arg padding
pop ebx ; get original esp
mov esp, ebx ; restore
%endmacro
_main:
; set up stack frame
push ebp
mov ebp, esp
push ebx
clib_prolog 16
mov dword [esp], hello.msg
call _printf
; can make more clib calls here...
clib_epilog 16
; tear down stack frame
pop ebx
mov esp, ebp
pop ebp
mov eax, 0 ; set return code
ret
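The alignment arithmetic performed by clib_prolog above can be sanity-checked outside the assembler. The following Python sketch mirrors the macro one instruction at a time; the starting esp values are made up, and the function is an illustration rather than anything taken from Apple's ABI documentation.

```python
def clib_prolog(esp: int, pad: int) -> int:
    """Return esp after the clib_prolog macro, given esp on entry."""
    esp &= 0xFFFFFFF0  # and esp, 0xFFFFFFF0 : round down to a 16-byte boundary
    esp -= 12          # sub esp, 12
    esp -= 4           # push ebx : stores the original esp
    esp -= pad         # sub esp, %1
    return esp


if __name__ == "__main__":
    for start in (0xBFFFF7A4, 0xBFFFF7A8, 0xBFFFF7B0):  # made-up esp values
        print(hex(start), "->", hex(clib_prolog(start, 16)))
```

As long as the pad argument is a multiple of 16, the result stays 16-byte aligned whatever esp was on entry, which is why the macro's comment asks for a multiple of 16.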
Recently I wanted to learn how to compile Intel x86 on Mac OS X:
For nasm:
-o hello.tmp - outfile
-f macho - specify format
Linux - elf or elf64
Mac OSX - macho
For ld:
-arch i386 - specify architecture (32 bit assembly)
-macosx_version_min 10.6 (Mac OSX - complains about default specification)
-no_pie (Mac OSX - removes ld warning)
-e _main - specify main symbol name (Mac OSX - default is start)
-o hello.o - outfile
For Shell:
./hello.o - execution
One-liner:
nasm -o hello.tmp -f macho hello.s && ld -arch i386 -macosx_version_min 10.6 -no_pie -e _main -o hello.o hello.tmp && ./hello.o
Let me know if this helps!
I wrote how to do it on my blog here:
http://blog.burrowsapps.com/2013/07/how-to-compile-helloworld-in-intel-x86.html
For a more verbose explanation, I explained on my Github here:
https://github.com/jaredsburrows/Assembly
Running assembly code on a Mac takes just three steps. It can be done with Xcode, but the NASM command-line tool is the better choice.
I already had Xcode installed, and if you have it installed too, that is fine.
But you can do this without Xcode as well.
Just follow these steps:
First, install NASM using Homebrew: brew install nasm
Then assemble the .asm file into an object file: nasm -f macho64 myFile.asm
Finally, link the object file and run the result: ld -macosx_version_min 10.7.0 -lSystem -o OutPutFile myFile.o && ./OutPutFile
A simple source file named myFile.asm is given below for your convenience.
global start
section .text
start:
mov rax, 0x2000004 ; write
mov rdi, 1 ; stdout
mov rsi, msg
mov rdx, msg.len
syscall
mov rax, 0x2000001 ; exit
mov rdi, 0
syscall
section .data
msg: db "Assalam O Alaikum Dear", 10
.len: equ $ - msg
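The values 0x2000004 and 0x2000001 in the listing above are the BSD system call numbers 4 (write) and 1 (exit) with a class tag in the upper bits. A minimal Python sketch of that encoding follows; the constant names are chosen to resemble the kernel's convention as I understand it, and the helper unix_syscall is hypothetical, so treat the block as illustrative rather than authoritative.

```python
SYSCALL_CLASS_SHIFT = 24  # the class tag sits above the low 24 bits
SYSCALL_CLASS_UNIX = 2    # class used for BSD/UNIX system calls


def unix_syscall(number: int) -> int:
    """Combine a BSD system call number with the UNIX class tag."""
    return (SYSCALL_CLASS_UNIX << SYSCALL_CLASS_SHIFT) | number


if __name__ == "__main__":
    print(hex(unix_syscall(4)))  # write
    print(hex(unix_syscall(1)))  # exit
```

This is also why the 32-bit examples that use int 0x80 load the plain numbers 4 and 1 into eax, while the 64-bit syscall version loads the tagged values into rax.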
Also, on the Intel Macs, can I use generic x86 asm? or is there a modified instruction set? Any information about post Intel Mac assembly helps.
It's the same instruction set; it's the same chips.
The features available to use are dependent on your processor. Apple uses the same Intel stuff as everybody else. So yes, generic x86 should be fine (assuming you're not on a PPC :D).
As far as tools go, I think your best bet is a good text editor that 'understands' assembly.
Forget about finding an IDE to write, assemble, and run assembly on the Mac. But remember that the Mac is UNIX. See http://asm.sourceforge.net/articles/linasm.html, a decent (though short) guide to running assembler via GCC on Linux; you can mimic this. Macs use Intel chips, so you want to look at Intel syntax.
So I was wondering whether there is any such debugger for the Mac. I know of AFD on Windows, but I'm not sure about anything for the Mac.
This is how I am using nasm on the following code: nasm a.asm -o a.com -l a.lst
[org 0x100]
mov ax, 5
mov bx, 10
add ax, bx
mov bx, 15
add ax, bx
mov ax, 0x4c00
int 0x21
On Windows I know of a debugger named AFD that lets me step through each statement, but I'm not sure how to do this using gdb.
I am also unable to execute this .com file. Am I supposed to produce some other kind of file here?
Why are you writing 16-bit code that makes DOS syscalls? If you want to know how to write asm that's applicable to your OS, take a look at the code generated by "gcc -S" on some C code... (Note that code generated this way uses AT&T syntax, with the operands reversed, and is meant to be assembled with as instead of nasm.)
Further, are you aware what this code is doing? It reads to me like this:
ax = 5
bx = 10
ax += bx
bx = 15
ax += bx
ax = 0x4c00
int 21h
Seems like this code is equivalent to:
mov bx, 15
mov ax, 4c00h
int 21h
Which according to what I see here, is exit(0). You didn't need to change bx either...
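The claim that the arithmetic is dead code can be checked by tracing the registers. The following Python sketch is a toy interpreter for just the mov and add forms used in the question; the run function and the tuple encoding of instructions are made up for illustration and model nothing beyond 16-bit wraparound.

```python
def run(program):
    """Execute (op, dst, src) tuples; return final registers and a per-step trace."""
    regs = {"ax": 0, "bx": 0}
    trace = []
    for op, dst, src in program:
        value = regs[src] if src in regs else src  # register or immediate operand
        if op == "mov":
            regs[dst] = value
        elif op == "add":
            regs[dst] = (regs[dst] + value) & 0xFFFF  # 16-bit registers wrap
        else:
            raise ValueError(f"unsupported instruction: {op}")
        trace.append(dict(regs))
    return regs, trace


PROGRAM = [
    ("mov", "ax", 5),
    ("mov", "bx", 10),
    ("add", "ax", "bx"),
    ("mov", "bx", 15),
    ("add", "ax", "bx"),
    ("mov", "ax", 0x4C00),
]

if __name__ == "__main__":
    final, steps = run(PROGRAM)
    print(steps[4])  # state just before ax is overwritten
    print(final)     # state when int 21h would be reached
```

The sum held in ax after the second add is discarded by the final mov, so only bx = 15 and ax = 0x4c00 are visible when the interrupt is raised.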
But. This doesn't even apply to what you were trying to do, because Mac OS X is not MS-DOS, does not know about DOS APIs, cannot run .COM files, etc. I wasn't even aware that it can run 16 bit code. You will want to look at nasm's -f elf option, and you will want to use registers like eax rather than ax.
I've not done assembly programming on OS X, but you could theoretically do something like this:
extern exit
global main
main:
push dword 0
call exit
; This will never get called, but hey...
add esp, 4
xor eax, eax
ret
Then:
nasm -f elf foo.asm -o foo.o
ld -o foo foo.o -lc
Of course this is relying on the C library, which you might not want to do. I've omitted the "full" version because I don't know what the syscall interface looks like on Mac. On many platforms your entry point is the symbol _start and you do syscalls with int 80h or sysenter.
As for debugging... I would also suggest GDB. You can advance by a single instruction with stepi, and the info registers command will dump register state. The disassemble command is also helpful.
Update: Just remembered, I don't think Mac OS X uses ELF; it uses Mach-O, so the nasm option would be -f macho rather than -f elf. Well, much of what I wrote still applies. :-)
Xcode ships with GDB, the GNU Debugger.
Xcode 4 and newer ships with LLDB instead.
As others have said, use GDB, the GNU debugger. When debugging assembly source, I usually find it useful to load a command file that contains something like the following:
display/5i $pc
display/x $eax
display/x $ebx
...
display/5i will display 5 instructions starting with the next to be executed. You can use the stepi command to step execution one instruction at a time. display/x $eax displays the contents of the eax register in hex. You will also likely want to use the x command to examine the contents of memory: x/x $eax, for example, prints the contents of the memory whose address is stored in eax.
These are a few of many commands. Download the GDB manual and skim through it to find other commands you may be interested in using.
IDA Pro does work on the Mac after a fashion (UI still runs on Windows; see an example).