Currently I'm now learning assembly language, out of curiosity, I want to see how my program that was coded using assembly work under the hood, so I load up GDB (with -tui and layout asm ) and start debugging.
It came at my surprise when GDB asking for STDIN input, it only take 1 character and then stop taking others even if I'm not pressing enter.
I tried again in terminal, it's working fine and my program can take multiple STDIN input without any problem.
Here I provide partion from my testing assembly, this problem is reproducible with following assembly :
global main
segment .bss ; hold uninitialize buffer
szinput resb 10 ; declare 10 bytes uninitialize buffer
nszinput equ $ - szinput ; equ is meaning constant
segment .text
main:
pusha
mov eax, DWORD 3 ; read syscall
mov ebx, DWORD 0 ; 0 is stdin file descriptor
mov ecx, szinput ; void *buf
mov edx, nszinput ; size_t count
int 80h ; invoke syscall
popa
ret
Note that this assembly is for NASM assembler.
Running gdb --version, and this is the result : GNU gdb (Ubuntu 7.7.1-0ubuntu5~14.04.2) 7.7.1
Related
I wrote the following code to check if the 1st number- 'x' is greater than the 2nd number- 'y'. For x>y output should be 1 and for x<=y output should be 0.
section .txt
global _start
global checkGreater
_start:
mov rdi,x
mov rsi,y
call checkGreater
mov rax,60
mov rdi,0
syscall
checkGreater:
mov r8,rdi
mov r9,rsi
cmp r8,r9
jg skip
mov [c],byte '0'
skip:
mov rax,1
mov rdi,1
mov rsi,c
mov rdx,1
syscall
ret
section .data
x db 7
y db 5
c db '1',0
But due to some reasons(of course from my end), the code always gives 0 as the output when executed.
I am using the following commands to run the code on Ubuntu 20.04.1 LTS with nasm 2.14.02-1
nasm -f elf64 fileName.asm
ld -s -o fileName fileName.o
./fileName
Where did I make a mistake?
And how should one debug assembly codes, I looked for printing received arguments in checkGreater, but it turns out that's a disturbing headache itself.
Note: If someone wondering why I didn't directly use x and y in checkGreater, I want to extend the comparison to user inputs, and so wrote code in that way only.
The instructions
mov rdi,x
mov rsi,y
write the address of x into rdi, and of y into rsi. The further code then goes on to compare the addresses, which are always x<y, since x is defined above y.
What you should have written instead is
mov rdi,[x]
mov rsi,[y]
But then you have another problem: x and y variables are 1 byte long, while the destination registers are 8 bytes long. So simply doing the above fix will read extraneous bytes, leading to useless results. The final correction is to either fix the size of the variables (writing dq instead of db), or read them as bytes:
movzx rdi,byte [x]
movzx rsi,byte [y]
As for
And how should one debug assembly codes
The main tool for you is an assembly-level debugger, like EDB on Linux or x64dbg on Windows. But in fact, most debuggers, even the ones intended for languages like C++, are capable of displaying disassembly for the program being debugged. So you can use e.g. GDB, or even a GUI wrapper for it like Qt Creator or Eclipse. Just be sure to switch to machine code mode, or use the appropriate commands like GDB's disassemble, stepi, info registers etc..
Note that you don't have to build EDB or GDB from source (as the links above might suggest): they are likely already packaged in the Linux distribution you use. E.g. on Ubuntu the packages are called edb-debugger and gdb.
I'm reading a book where all the assembly code examples are written for 32-bit Linux environment, and I'm using a 64-bit Mac. I was able to compile the following program with NASM after changing _start to start. However, when I run the executable it doesn't print hello world as I would expect it to. Is there an option to pass to NASM to compile this in a way that will run on a 64-bit Mac?
I tried:
nasm -f macho32 helloworld.asm
and
nasm -f macho helloworld.asm
followed by:
ld helloworld.o -o helloworld
My code is:
section .data ; data segment
msg db "Hello, world!", 0x0a ; the string and newline char
section .text ; text segment
global start ; Default entry point for ELF linking
start:
; SYSCALL: write(1, msg, 14)
mov eax, 4 ; put 4 into eax, since write is syscall #4
mov ebx, 1 ; put 1 into ebx, since stdout is 1
mov ecx, msg ; put the address of the string into ecx
mov edx, 14 ; put 14 into edx, since our string is 14 bytes
int 0x80 ; Call the kernel to make the system call happen
; SYSCALL: exit(0)
mov eax, 1 ; put 1 into eax, since exit is syscall #1
mov ebx, 0 ; exit with success
int 0x80 ; do the syscall
It's simply not going to work like that. Get a VM and install 32 bit linux in the VM. The problem is not running x86_32 code on x64. The problem is trying the Linux syscall gate on MAC. There's no reason to believe that would work.
I am trying to learn assembler and am somewhat confused by the method used by osx with nasm macho32 for passing arguments to functions.
I am following the book 'Assembly Language Step By Step' by Jeff Duntemann and using the internet extensively have altered it to run on osx both 32 and 64 bit.
So to begin with the linux version from the book
section .data ; Section containing initialised data
EatMsg db "Eat at Joe's!",10
EatLen equ $-EatMsg
section .bss ; Section containing uninitialised data
section .text ; Section containing code
global start ; Linker needs this to find the entry point!
start:
nop
mov eax, 4 ; Specify sys_write syscall
mov ebx, 1 ; Specify File Descriptor 1: Standard Output
mov ecx, EatMsg ; Pass offset of the message
mov edx, EatLen ; Pass the length of the message
int 0x80 ; Make syscall to output the text to stdout
mov eax, 1 ; Specify Exit syscall
mov ebx, 0 ; Return a code of zero
int 0x80 ; Make syscall to terminate the program
section .data ; Section containing initialised data
EatMsg db "Eat at Joe's!", 0x0a
EatLen equ $-EatMsg
section .bss ; Section containing uninitialised data
section .text ; Section containing code
global start ; Linker needs this to find the entry point!
Then very similarly the 64 bit version for osx, other than changing the register names, replacing int 80H (which I understand is somewhat archaic) and adding 0x2000000 to the values moved to eax (don't understand this in the slightest) there isn't much to alter.
section .data ; Section containing initialised data
EatMsg db "Eat at Joe's!", 0x0a
EatLen equ $-EatMsg
section .bss ; Section containing uninitialised data
section .text ; Section containing code
global start ; Linker needs this to find the entry point!
start:
mov rax, 0x2000004 ; Specify sys_write syscall
mov rdi, 1 ; Specify File Descriptor 1: Standard Output
mov rsi, EatMsg ; Pass offset of the message
mov rdx, EatLen ; Pass the length of the message
syscall ; Make syscall to output the text to stdout
mov rax, 0x2000001 ; Specify Exit syscall
mov rdi, 0 ; Return a code of zero
syscall ; Make syscall to terminate the program
The 32 Bit mac version on the other hand is quite different. I can see we are pushing the arguments to the stack dword, so my question is (and sorry for the long preamble) what is the difference between the stack that eax is being pushed to and dword and why do we just use the registers and not the stack in the 64 bit version (and linux)?
section .data ; Section containing initialised data
EatMsg db "Eat at Joe's!", 0x0a
EatLen equ $-EatMsg
section .bss ; Section containing uninitialised data
section .text ; Section containing code
global start ; Linker needs this to find the entry point!
start:
mov eax, 0x4 ; Specify sys_write syscall
push dword EatLen ; Pass the length of the message
push dword EatMsg ; Pass offset of the message
push dword 1 ; Specify File Descriptor 1: Standard Output
push eax
int 0x80 ; Make syscall to output the text to stdout
add esp, 16 ; Move back the stack pointer
mov eax, 0x1 ; Specify Exit syscall
push dword 0 ; Return a code of zero
push eax
int 0x80 ; Make syscall to terminate the program
Well, you don't quite understand what is dword. Speaking HLL, it is not a variable, but rather a type. So push doword 1 means that you pushes a double word constant 1 into the stack. There only ONE stack, and both the one and the register eax are pushed in it.
The registers are used in linux because they are much faster, especially on old processors. Linux ABI (which is, as far as i know, a descent of System V ABI) was developed quite a long time ago and often used in systems where performance was critical, when the difference was very significant. OSX intel abi is much younger, afaik, and simplicity of using stack where more important in desktop OSX than the negligible slowdown. In 64-bit processors, more registers where added and hence the where more efficient to use them.
I'm trying to debug an assembly program using gdb and Emacs. My problem is that, when I try to debug step-by-step, it doesn't show a pointer arrow at the current executing line. The code I'm trying to debug is:
SECTION .data ; Section containing initialised data
EatMsg: db "Eat at Joe's!",10
EatLen: equ $-EatMsg
SECTION .bss ; Section containing uninitialized data
SECTION .text ; Section containing code
global _start ; Linker needs this to find the entry point!
_start:
nop ; This no-op keeps gdb happy...
mov eax,4 ; Specify sys_write call
mov ebx,1 ; Specify File Descriptor 1: Standard Output
mov ecx,EatMsg ; Pass offset of the message
mov edx,EatLen ; Pass the length of the message
int 80H ; Make kernel call
MOV eax,1 ; Code for Exit Syscall
mov ebx,0 ; Return a code of zero
int 80H ; Make kernel call
and I'm compiling with these lines:
nasm -f elf -g -F stabs eatsyscall.asm -l eatsyscall.lst
ld -melf_i386 -o eatsyscall eatsyscall.o
What I see in Emacs is that. In this screenshot I'm currently executing the line after the breakpoint and no pointer to that line appears. Is it possible to have one?
first of all, i hope you are still looking for the solution, it has been 2 years ! if you are, then try coaxing nasm to generate debugging information with DWARF instead of STAB i.e the following
nasm -f elf -g -F dwarf eatsyscall.asm ...
that seems to work for me (TM)
Try to download nasm2.5 or the latest available, it should work
When attempting to run the following assembly program:
.globl start
start:
pushq $0x0
movq $0x1, %rax
subq $0x8, %rsp
int $0x80
I am receiving the following errors:
dyld: no writable segment
Trace/BPT trap
Any idea what could be causing this? The analogous program in 32 bit assembly runs fine.
OSX now requires your executable to have a writable data segment with content, so it can relocate and link your code dynamically. Dunno why, maybe security reasons, maybe due to the new RIP register. If you put a .data segment in there (with some bogus content), you'll avoid the "no writable segment" error. IMO this is an ld bug.
Regarding the 64-bit syscall, you can do it 2 ways. GCC-style, which uses the _syscall PROCEDURE from libSystem.dylib, or raw. Raw uses the syscall instruction, not the int 0x80 trap. int 0x80 is an illegal instruction in 64-bit.
The "GCC method" will take care of categorizing the syscall for you, so you can use the same 32-bit numbers found in sys/syscall.h. But if you go raw, you'll have to classify what kind of syscall it is by ORing it with a type id. Here is an example of both. Note that the calling convention is different! (this is NASM syntax because gas annoys me)
; assemble with
; nasm -f macho64 -o syscall64.o syscall64.asm && ld -lc -ldylib1.o -e start -o syscall64 syscall64.o
extern _syscall
global start
[section .text align=16]
start:
; do it gcc-style
mov rdi, 0x4 ; sys_write
mov rsi, 1 ; file descriptor
mov rdx, hello
mov rcx, size
call _syscall ; we're calling a procedure, not trapping.
;now let's do it raw
mov rax, 0x2000001 ; SYS_exit = 1 and is type 2 (bsd call)
mov rdi, 0 ; Exit success = 0
syscall ; faster than int 0x80, and legal!
[section .data align=16]
hello: db "hello 64-bit syscall!", 0x0a
size: equ $-hello
check out http://www.opensource.apple.com/source/xnu/xnu-792.13.8/osfmk/mach/i386/syscall_sw.h for more info on how a syscall is typed.
The system call interface is different between 32 and 64 bits. Firstly, int $80 is replaced by syscall and the system call numbers are different. You will need to look up documentation for a 64-bit version of your system call. Here is an example of what a 64-bit program may look like.