NASM 64-bit OS X Inputted String Overwriting Bytes of Existing Value

I am trying to write a simple assembly program that adds two numbers entered by the user. The problem is that after I display a prompt string and read a value in, the next time the prompt is displayed its first few characters have been overwritten by the data the user entered.
My assumption is that this is related to using LEA to load the string's address into the register. I do this because the Mach-O 64-bit output format (macho64) complains if a regular MOV instruction is used here (something to do with the address space in 64-bit mode on the Mac).
My code is as follows:
section .data ;this is where constants go
input_message db 'Please enter your next number: '
length equ $-input_message
section .text ;declaring our .text segment
global _main ;telling where program execution should start
_main: ;this is where code starts getting executed
mov r8, 0
_loop_values:
call _get_value
call _write
inc r8 ;increment the loop counter
cmp r8, 2 ;compare loop counter to 2
jne _loop_values
call _exit
_get_value:
lea rcx, [rel input_message] ;move the input message into rcx for function call
mov rdx, length ;load the length of the message for function call
call _write
call _read
ret
_read:
mov rdx, 255 ;set buffer size for input
mov rdi, 0 ;stdin (file descriptor 0)
mov rax, SYSCALL_READ
syscall
mov rdx, rax ;move the length from rax to rdx
dec rdx ;remove new line character from input length
mov rcx, rsi ;move the value input from rsi to rcx
ret
_write:
mov rsi, rcx ;load the output message
;mov rdx, rax
mov rax, SYSCALL_WRITE
syscall
ret
_exit:
mov rax, SYSCALL_EXIT
mov rdi, 0
syscall
The program loops twice as it should. The first time I get the following prompt:
Please enter your next number:
I would then enter something like 5 (followed by the return key).
The next prompt would be:
5
ease enter your next number:
Any assistance would be much appreciated.

I think all 64-bit code on the Mac is required to be RIP-relative.
Absolute addresses are not supported. With this type of addressing, you address your symbol relative to RIP.
NASM documentation says:
default abs
mov eax,[foo] ; 32−bit absolute disp, sign−extended
mov eax,[a32 foo] ; 32−bit absolute disp, zero−extended
mov eax,[qword foo] ; 64−bit absolute disp
default rel
mov eax,[foo] ; 32−bit relative disp
mov eax,[a32 foo] ; d:o, address truncated to 32 bits(!)
mov eax,[qword foo] ; error
mov eax,[abs qword foo] ; 64−bit absolute disp
and you can also see this question.
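Separately from the addressing mode, it may be worth checking where the read call stores its data. In the code above, rsi still holds the address of input_message when _read runs, and the read system call writes the typed bytes to the buffer that rsi points to, which would explain the prompt starting with "5" and a newline the second time around. A minimal sketch of the idea, assuming a dedicated buffer (the name input_buffer is made up for illustration) and the usual macOS syscall numbers, since the question does not show how SYSCALL_READ and friends are defined:

```nasm
default rel                         ; make [label] operands RIP-relative

%define SYSCALL_EXIT   0x2000001    ; assumed values: macOS BSD syscall numbers
%define SYSCALL_READ   0x2000003
%define SYSCALL_WRITE  0x2000004

section .data
input_message: db 'Please enter your next number: '
length         equ $ - input_message

section .bss
input_buffer:  resb 255             ; hypothetical buffer reserved for user input

section .text
global _main

_main:
    mov     r12, 2                  ; loop counter; syscall does not touch r12
.next:
    mov     rax, SYSCALL_WRITE
    mov     rdi, 1                  ; fd 1 = stdout
    lea     rsi, [input_message]    ; address of the prompt
    mov     rdx, length
    syscall

    mov     rax, SYSCALL_READ
    mov     rdi, 0                  ; fd 0 = stdin
    lea     rsi, [input_buffer]     ; read stores here, not into the prompt
    mov     rdx, 255                ; size of input_buffer
    syscall

    dec     r12
    jnz     .next                   ; prompt twice, as in the question

    mov     rax, SYSCALL_EXIT
    xor     edi, edi                ; exit status 0
    syscall
```

Every argument register is reloaded before each syscall because the instruction itself overwrites rcx and r11, so values parked there do not survive the call.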

Related

Why does a function double dereference arguments stored on stack and how is that possible? [duplicate]

This question already has answers here:
Basic use of immediates vs. square brackets in YASM/NASM x86 assembly
(4 answers)
x86 Nasm assembly - push'ing db vars on stack - how is the size known?
(2 answers)
Referencing the contents of a memory location. (x86 addressing modes)
(2 answers)
Why do you have to dereference the label of data to store something in there: Assembly 8086 FASM
(1 answer)
Closed 7 months ago.
I am trying to understand how the stack arguments of "lfunc" are loaded into "flist" in the following assembly code, which I found in a book. (The book doesn't explain it. The code compiles and runs without errors and gives the intended output, "The string is: ABCDEFGHIJ".) I can't grasp the legality or logic of the code, though. What I don't understand is listed below.
In lfunc:
The non-volatile (per the Microsoft x64 calling convention) register RBX is not saved before it is XORed. (But that is not what bugs me most.)
In the portion marked ";arguments on stack":
mov rax, qword [rbp+8+8+32]
mov bl,[rax]
Here [rbp+8+8+32] dereferences the corresponding address stored on the stack, so RAX should
be loaded with the value represented by 'fourth', which is the char 'D' (0x44), as I understand it. (Why qword?) If so, what can dereferencing the char 'D' in the second line possibly mean? (There should be a memory address to dereference, but 'D' is a char.)
Original code is listed below:
%include "io64.inc"
; stack.asm
extern printf
section .data
first db "A"
second db "B"
third db "C"
fourth db "D"
fifth db "E"
sixth db "F"
seventh db "G"
eighth db "H"
ninth db "I"
tenth db "J"
fmt db "The string is: %s",10,0
section .bss
flist resb 14 ;length of string plus end 0
section .text
global main
main:
push rbp
mov rbp,rsp
sub rsp, 8
mov rcx, flist
mov rdx, first
mov r8, second
mov r9, third
push tenth ; now start pushing in
push ninth ; reverse order
push eighth
push seventh
push sixth
push fifth
push fourth
sub rsp,32 ; shadow
call lfunc
add rsp,32+8
; print the result
mov rcx, fmt
mov rdx, flist
sub rsp,32+8
call printf
add rsp,32+8
leave
ret
;––––––––––––––––––––––––-
lfunc:
push rbp
mov rbp,rsp
xor rax,rax ;clear rax (especially higher bits)
;arguments in registers
mov al,byte[rdx] ; move content argument to al
mov [rcx], al ; store al to memory (reserved in section .bss)
mov al, byte[r8]
mov [rcx+1], al
mov al, byte[r9]
mov [rcx+2], al
;arguments on stack
xor rbx,rbx
mov rax, qword [rbp+8+8+32] ; saved rbp + return address + shadow space
mov bl,[rax]
mov [rcx+3], bl
mov rax, qword [rbp+48+8]
mov bl,[rax]
mov [rcx+4], bl
mov rax, qword [rbp+48+16]
mov bl,[rax]
mov [rcx+5], bl
mov rax, qword [rbp+48+24]
mov bl,[rax]
mov [rcx+6], bl
mov rax, qword [rbp+48+32]
mov bl,[rax]
mov [rcx+7], bl
mov rax, qword [rbp+48+40]
mov bl,[rax]
mov [rcx+8], bl
mov rax, qword [rbp+48+48]
mov bl,[rax]
mov [rcx+9], bl
mov bl,0 ; terminating zero
mov [rcx+10], bl
leave
ret
Additional info:
I cannot look at the register values just after line 50, which
corresponds to "XOR RAX, RAX" in lfunc, because while single-stepping the debugger
skips ahead to line 37 of the main function, which corresponds to
"add RSP, 32+8". Even if I set breakpoints between the
aforementioned lines in lfunc, the debugger simply hangs, so I
have to abort debugging manually.
In portion ";arguments on stack"
mov rax, qword [rbp+8+8+32]
mov bl,[rax]
I am mentioning this again to be more precise about what I am asking, because the question was marked as a duplicate and
the linked answers don't address my specific issue. At this line,
[rbp+8+8+32] == 0x44, because mov with square brackets clearly dereferences the address rbp+30h (which I assume is 64 bits wide). So the size of 0x44 is a byte. That is why I ask "Why qword?": it implies "lea [rbp+8+8+32]", which yields a qword reference, rather than mov. So if [rbp+8+8+32] equals 0x44, then [rax] == [0x0000000000000044], which is a garbage address (not relevant to our code here).
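A small sketch may help with the "why qword?" part. In NASM, a bare label used as an operand (as in push fourth) stands for the label's address, not the byte stored there, so each stack slot holds an 8-byte pointer. The first mov fetches that pointer and the second mov fetches the byte it points to. The program below is only an illustration under those assumptions (Win64, with ExitProcess used purely to make the loaded byte visible as the exit code); it is not the book's code:

```nasm
default rel
extern ExitProcess                  ; from kernel32, used only to end the process

section .data
fourth: db "D"                      ; one byte, 0x44

section .text
global main
main:
    push    rbp
    mov     rbp, rsp

    lea     rax, [fourth]           ; rax = ADDRESS of the byte "D"
    push    rax                     ; stack slot [rbp-8] now holds that 8-byte address
                                    ; (the book's "push fourth" pushes the address too)

    mov     rax, qword [rbp-8]      ; first load: the qword pointer stored on the stack
    movzx   ecx, byte [rax]         ; second load: the byte it points to, 0x44

    sub     rsp, 40                 ; shadow space; rsp is 16-byte aligned at the call
    call    ExitProcess             ; process exit code = 0x44 ('D')
```

So [rbp+8+8+32] is a qword because the slot holds a pointer, and [rax] is a byte because that pointer leads to a single character.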

Segmentation fault when adding 2 digits - nasm MacOS x86_64

I am trying to write a program that accepts 2 digits as user input and then outputs their sum. I keep getting a segmentation fault when I run the program (I am able to input 2 digits, but then the program crashes). I have already checked answers to similar questions, and many of them suggested clearing the registers, which I did, but I am still getting a segmentation fault.
section .text
global _main ;must be declared for linker (ld)
default rel
_main: ;tells linker entry point
call _readData
call _readData1
call _addData
call _displayData
mov RAX, 0x02000001 ;system call number (sys_exit)
syscall
_addData:
mov byte [sum], 0 ; init sum with 0
lea EAX, [buffer] ; load value from buffer to register
lea EBX, [buffer1] ; load value from buffer1 to register
sub byte [EAX], '0' ; transform to digit
sub byte [EBX], '0' ; transform to digit
add [sum], EAX ; increment value of sum by value from register
add [sum], EBX ; increment value of sum by value from 2nd register
add byte [sum], '0' ; convert to ASCII
xor EAX, EAX ; clear registers
xor EBX, EBX ; clear registers
ret
_readData:
mov RAX, 0x02000003
mov RDI, 2
mov RSI, buffer
mov RDX, SIZE
syscall
ret
_readData1:
mov RAX, 0x02000003
mov RDI, 2
mov RSI, buffer1
mov RDX, SIZE
syscall
ret
_displayData:
mov RAX, 0x02000004
mov RDI, 1
mov RSI, sum
mov RDX, SIZE
syscall
ret
section .bss
SIZE equ 4
buffer: resb SIZE
buffer1: resb SIZE
sum: resb SIZE
Unlike other languages I have learned, it seems quite difficult to find a good source or tutorial on programming in assembly using NASM on the x86_64 architecture. Is there any kind of walkthrough for beginners (so I do not need to ask on SO every time I am stuck :D)?
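A likely cause, as far as I can tell: lea EAX, [buffer] keeps only the low 32 bits of a 64-bit address, so sub byte [EAX], '0' touches memory the process does not own, and add [sum], EAX would add an address rather than the digit anyway. The reads also use descriptor 2 (stderr) where 0 (stdin) was presumably intended. A minimal sketch of the same program with the bytes loaded through full 64-bit RIP-relative addresses, assuming single-digit input whose sum stays below 10:

```nasm
default rel
SIZE equ 4

section .bss
buffer:  resb SIZE
buffer1: resb SIZE
sum:     resb SIZE

section .text
global _main
_main:
    mov     rax, 0x2000003          ; read
    xor     edi, edi                ; fd 0 = stdin
    lea     rsi, [buffer]           ; full 64-bit address of the first buffer
    mov     rdx, SIZE
    syscall

    mov     rax, 0x2000003          ; read
    xor     edi, edi
    lea     rsi, [buffer1]
    mov     rdx, SIZE
    syscall

    movzx   eax, byte [buffer]      ; load the character itself, e.g. '3'
    sub     eax, '0'                ; character -> digit value
    movzx   ecx, byte [buffer1]
    sub     ecx, '0'
    add     eax, ecx                ; sum of the two digits
    add     eax, '0'                ; back to a character (valid only while sum <= 9)
    mov     byte [sum], al

    mov     rax, 0x2000004          ; write
    mov     edi, 1                  ; fd 1 = stdout
    lea     rsi, [sum]
    mov     rdx, 1                  ; one character of output
    syscall

    mov     rax, 0x2000001          ; exit
    xor     edi, edi
    syscall
```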

x86 Assembly; overwriting .bss values?

I'm currently trying to write a small program in ASM, and I have the following issue. I take input from the user as a string, which I store in a variable I've declared in the .bss section of my code; I then re-prompt and overwrite the previously stored answer, and do this multiple times. My issue is that if someone enters an answer that is shorter than the last one (e.g. "James" then "Jim"), I get the following output:
"Hi, James"
"What's your name?"
"Jim"
"Hi, Jimes"
What's happening here is that the characters that weren't overwritten remain and get printed, as expected. What I'm wondering is how I might go about wiping the data in the .bss variable between prompts.
Here is the code so far:
section .data
question: db "What's your name?", 10
answer: db "Hello, "
ln db 10
section .bss
name resb 16
section .text
global start
start:
call prompt
call getName
mov rsi, answer
mov rdx, 7
call print
mov rsi, name
mov rdx, 10
call print
mov rsi, ln
mov rdx, 1
call print
call loop_name
mov rax, 0x02000001
mov rdi, 0
syscall
reset_name:
loop_name:
mov cx, 3
startloop:
cmp cx, 0
jz endofloop
push cx
loopy:
call getName
mov rsi, answer
mov rdx, 7
call print
mov rsi, name
mov rdx, 10
call print
pop cx
dec cx
jmp startloop
endofloop:
; Loop ended
; Do what ever you have to do here
ret
prompt:
mov rax, 0x02000004
mov rdi, 1
mov rsi, question
mov rdx, 18
syscall
print:
mov rax, 0x02000004
mov rdi, 1
syscall
ret
getName:
mov rax, 0x02000003 ; read
mov rdi, 0
mov rsi, name
mov rdx, 37
syscall
ret
Any ideas? (Variable in question is name)
While I don't know the system calls you're using, you can do one of three things:
clear the entire variable before reusing it;
keep and pass along an explicit length value that says how many of its bytes are valid;
null-terminate the string right after it is read.
Using an explicit length value may still involve placing a null terminator at the right moment (e.g. just before printing).
The read operation should return a length that you can pass on to other code (e.g. as a pointer-and-length pair) or use immediately to null-terminate the string. If it doesn't, fall back on the first approach and clear the entire variable before reusing it.
Typically, syscalls have return values that indicate the length on success or a negative value on failure. Here, you are ignoring both.
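To make the second option concrete, here is a rough sketch that keeps the count returned by read and hands it to write, so stale bytes from an earlier, longer name are never printed. It assumes the macOS syscall numbers from the question, a _main entry point and RIP-relative addressing; the carry-flag check reflects how macOS reports syscall errors as far as I know:

```nasm
default rel

section .data
greeting:     db "Hi, "
greeting_len  equ $ - greeting

section .bss
name:         resb 16

section .text
global _main

_main:
    mov     rax, 0x2000003          ; read(fd, buf, nbytes)
    xor     edi, edi                ; fd 0 = stdin
    lea     rsi, [name]
    mov     rdx, 16                 ; never ask for more than the buffer holds
    syscall
    jc      .done                   ; assumed error convention: carry flag set on failure
    mov     r12, rax                ; keep the byte count returned by read

    mov     rax, 0x2000004          ; write(fd, buf, nbytes)
    mov     edi, 1                  ; fd 1 = stdout
    lea     rsi, [greeting]
    mov     rdx, greeting_len
    syscall

    mov     rax, 0x2000004
    mov     edi, 1
    lea     rsi, [name]
    mov     rdx, r12                ; print only the bytes just read (newline included)
    syscall
.done:
    mov     rax, 0x2000001          ; exit(0)
    xor     edi, edi
    syscall
```

With the length carried alongside the pointer, the buffer never needs wiping: whatever lies beyond the returned count is simply not written out.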

NASM ReadConsoleA or WriteConsoleA Buffer Debugging Issue

I am writing a NASM Assembly program on Windows to get the user to enter in two single digit numbers, add these together and then output the result. I am trying to use the Windows API for input and output.
Unfortunately, whilst I can get it to read in one number, as soon as the program loops round to get the second, the program ends rather than asking for the second value.
The output of the program shown below:
What is interesting is that if I input 1 then the value displayed is one larger so it is adding to something!
This holds for other single digits (2-9) entered as well.
I am pretty sure it is related to how I am using the ReadConsoleA function but I have hit a bit of a wall attempting to find a solution. I have installed gdb to debug the program and assembled it as follows:
nasm -f win64 -g -o task9.obj task9.asm
GoLink /console /entry _main task9.obj kernel32.dll
gdb task9
But I just get the following error:
"C:\Users\Administrator\Desktop/task9.exe": not in executable format: File format not recognized
I have since read that NASM doesn't output the debug information needed for the Win64 format but I am not 100% sure about that. I am fairly sure I have the 64-bit version of GDB installed:
My program is as follows:
extern ExitProcess ;windows API function to exit process
extern WriteConsoleA ;windows API function to write to the console window (ANSI version)
extern ReadConsoleA ;windows API function to read from the console window (ANSI version)
extern GetStdHandle ;windows API to get the for the console handle for input/output
section .data ;the .data section is where variables and constants are defined
STD_OUTPUT_HANDLE equ -11
STD_INPUT_HANDLE equ -10
digits db '0123456789' ;list of digits
input_message db 'Please enter your next number: '
length equ $-input_message
section .bss ;the .bss section is where space is reserved for additional variables
input_buffer: resb 2 ;reserve 2 bytes for user input
char_written: resb 4
chars: resb 1 ;receives the number of characters read
section .text ;the .text section is where the program code goes
global _main ;tells the machine which label to start program execution from
_num_to_str:
cmp rax, 0 ;compare value in rax to 0
jne .convert ;if not equal then jump to label
jmp .output
.convert:
;get next digit value
inc r15 ;increment the counter for next digit
mov rcx, 10
xor rdx, rdx ;clear previous remainder result
div rcx ;divide value in rax by value in rcx
;quotient (result) stored in rax
;remainder stored in rdx
push rdx ;store remainder on the stack
jmp _num_to_str
.output:
pop rdx ;get the last digit from the stack
;convert digit value to ascii character
mov r10, digits ;load the address of the digits into r10
add r10, rdx ;get the character of the digits string to display
mov rdx, r10 ;digit to print
mov r8, 1 ;one byte to be output
call _print
;decide whether to loop
dec r15 ;reduce remaining digits (having printed one)
cmp r15, 0 ;are there digits left to print?
jne .output ;if not equal then jump to label output
ret
_print:
;get the output handle
mov rcx, STD_OUTPUT_HANDLE ;specifies that the output handle is required
call GetStdHandle ;returns value for handle to rax
mov rcx, rax
mov r9, char_written
call WriteConsoleA
ret
_read:
;get the input handle
mov rcx, STD_INPUT_HANDLE ;specifies that the input handle is required
call GetStdHandle
;get value from keyboard
mov rcx, rax ;place the handle for operation
mov rdx, input_buffer ;set name to receive input from keyboard
mov r8, 2 ;max number of characters to read
mov r9, chars ;stores the number of characters actually read
call ReadConsoleA
movzx r12, byte[input_buffer]
ret
_get_value:
mov rdx, input_message ;move the input message into rdx for function call
mov r8, length ;load the length of the message for function call
call _print
xor r8, r8
xor r9, r9
call _read
.end:
ret
_main:
mov r13, 0 ;counter for values input
mov r14, 0 ;total for calculation
.loop:
xor r12, r12
call _get_value ;get value from user
sub r12, '0' ;convert char to integer
add r14, r12 ;add value to total
;decide whether to loop for another character or not
inc r13
cmp r13, 2
jne .loop
;convert total to ASCII value
mov rax, r14 ;num_to_str expects total in rax
mov r15, 0 ;num_to_str uses r15 as a counter - must be initialised
call _num_to_str
;exit the program
mov rcx, 0 ;exit code
call ExitProcess
I would really appreciate any assistance you can offer either with resolving the issue or how to resolve the issue with gdb.
I found the following issues with your code:
The Microsoft x64 calling convention requires rsp to be 16-byte aligned at each call.
You must reserve stack space (the 32-byte shadow space) for the register arguments, even though you pass them in registers.
Your chars variable needs 4 bytes, not 1.
ReadConsole expects 5 arguments.
You should read 3 bytes, because ReadConsole returns CR LF after the digit. Alternatively, you could just ignore leading whitespace.
Your _num_to_str is broken if the input is 0.
Based on Jester's suggestions this is the final program:
extern ExitProcess ;windows API function to exit process
extern WriteConsoleA ;windows API function to write to the console window (ANSI version)
extern ReadConsoleA ;windows API function to read from the console window (ANSI version)
extern GetStdHandle ;windows API to get the for the console handle for input/output
section .data ;the .data section is where variables and constants are defined
STD_OUTPUT_HANDLE equ -11
STD_INPUT_HANDLE equ -10
digits db '0123456789' ;list of digits
input_message db 'Please enter your next number: '
length equ $-input_message
NULL equ 0
section .bss ;the .bss section is where space is reserved for additional variables
input_buffer: resb 3 ;reserve 3 bytes for user input (digit plus CR LF)
char_written: resb 4
chars: resb 4 ;receives the number of characters read
section .text ;the .text section is where the program code goes
global _main ;tells the machine which label to start program execution from
_num_to_str:
sub rsp, 32
cmp rax, 0
jne .next_digit
push rax
inc r15
jmp .output
.next_digit:
cmp rax, 0 ;compare value in rax to 0
jne .convert ;if not equal then jump to label
jmp .output
.convert:
;get next digit value
inc r15 ;increment the counter for next digit
mov rcx, 10
xor rdx, rdx ;clear previous remainder result
div rcx ;divide value in rax by value in rcx
;quotient (result) stored in rax
;remainder stored in rdx
sub rsp, 8 ;add space on stack for value
push rdx ;store remainder on the stack
jmp .next_digit
.output:
pop rdx ;get the last digit from the stack
add rsp, 8 ;remove space from stack for popped value
;convert digit value to ascii character
mov r10, digits ;load the address of the digits into r10
add r10, rdx ;get the character of the digits string to display
mov rdx, r10 ;digit to print
mov r8, 1 ;one byte to be output
call _print
;decide whether to loop
dec r15 ;reduce remaining digits (having printed one)
cmp r15, 0 ;are there digits left to print?
jne .output ;if not equal then jump to label output
add rsp, 32
ret
_print:
sub rsp, 40
;get the output handle
mov rcx, STD_OUTPUT_HANDLE ;specifies that the output handle is required
call GetStdHandle ;returns value for handle to rax
mov rcx, rax
mov r9, char_written
mov rax, qword 0 ;fifth argument
mov qword [rsp+0x20], rax
call WriteConsoleA
add rsp, 40
ret
_read:
sub rsp, 40
;get the input handle
mov rcx, STD_INPUT_HANDLE ;specifies that the input handle is required
call GetStdHandle
;get value from keyboard
mov rcx, rax ;place the handle for operation
xor rdx, rdx
mov rdx, input_buffer ;set name to receive input from keyboard
mov r8, 3 ;max number of characters to read
mov r9, chars ;stores the number of characters actually read
mov rax, qword 0 ;fifth argument
mov qword [rsp+0x20], rax
call ReadConsoleA
movzx r12, byte[input_buffer]
add rsp, 40
ret
_get_value:
sub rsp, 40
mov rdx, input_message ;move the input message into rdx for function call
mov r8, length ;load the length of the message for function call
call _print
call _read
.end:
add rsp, 40
ret
_main:
sub rsp, 40
mov r13, 0 ;counter for values input
mov r14, 0 ;total for calculation
.loop:
call _get_value ;get value from user
sub r12, '0' ;convert char to integer
add r14, r12 ;add value to total
;decide whether to loop for another character or not
inc r13
cmp r13, 2
jne .loop
;convert total to ASCII value
mov rax, r14 ;num_to_str expects total in rax
mov r15, 0 ;num_to_str uses r15 as a counter - must be initialised
call _num_to_str
;exit the program
mov rcx, 0 ;exit code
call ExitProcess
add rsp, 40
ret
As it turned out, I was also missing a 5th argument in the WriteConsole call.

NASM Windows ReadConsole NumberOfCharsRead Buffer

I am attempting to convert an existing assembly program that I have so that it works on Windows. It should ask the user for their name and then output "Hello name". I pretty much have it working but there is a problem with getting the number of characters actually read from the ReadConsole call.
My program is:
extern ExitProcess ;windows API function to exit process
extern WriteConsoleA ;windows API function to write to the console window (ANSI version)
extern ReadConsoleA ;windows API function to read from the console window (ANSI version)
extern GetStdHandle ;windows API to get the for the console handle for input/output
section .data ;the .data section is where variables and constants are defined
hello db 'Hello ' ;db stands for 'define byte'
length equ $-hello ;get the length of the hello
;$ is the current memory location
;$-hello means the difference between the current location and
;where hello started
enter_name db 'Please enter your name: '
name_len equ $-enter_name
chars dd 0
section .bss ;the .bss section is where space is reserved for additional variables
name:
resb 30 ;reserve 30 bytes of space for the name
char_written:
resb 4
section .text ;the .text section is where the program code goes
global _main ;tells the machine which label to start program execution from
_main:
;get the output handle
mov rcx, -11 ;specifies that the output handle is required
call GetStdHandle ;returns value for handle to rax
;print message asking user for their name
mov rcx, rax
mov rdx, enter_name ;load the address of the enter-name message into rdx
mov r8, name_len ;move the length into the r8 register
mov r9, char_written
call WriteConsoleA
;get the input handle
mov rcx, -10 ;specifies that the input handle is required
call GetStdHandle
;get value from keyboard
mov rcx, rax ;place the handle for operation
mov rdx, name ;set name to receive input from keyboard
mov r8, 30 ;max number of characters to read
mov r9, chars ;stores the number of characters actually read
call ReadConsoleA
;print 'hello' + user's name message
;get the output handle
mov rcx, -11 ;specifies that the output handle is required
call GetStdHandle ;returns value for handle to rax
;print the 'Hello ' message
mov rcx, rax
mov rdx, hello ;load the address of the hello message into rdx
mov r8, length ;move the length into the r8 register
mov r9, char_written
call WriteConsoleA
mov rcx, -11 ;specifies that the output handle is required
call GetStdHandle ;returns value for handle to rax
;print name
mov rcx, rax
mov rdx, name ;load the address of the name into rdx
mov r8, chars ;move the length into the r8 register
mov r9, char_written
call WriteConsoleA
;exit the program
mov rcx, 0 ;exit code?
call ExitProcess
The way that I have read the Windows API documentation on ReadConsole is that it requires the following parameters:
Output Handle
Place to put the read data
A value for the number of characters to read
Place to put the actual number of characters read
I have defined a label name to receive the characters and chars to receive the actual number of characters.
The read call looks like this:
;get the input handle
mov rcx, -10 ;specifies that the input handle is required
call GetStdHandle
;get value from keyboard
mov rcx, rax ;place the handle for operation
mov rdx, name ;set name to receive input from keyboard
mov r8, 30 ;max number of characters to read
mov r9, chars ;stores the number of characters actually read
call ReadConsoleA
My assumption is that if the name entered is Anna then the value 4 or 5 (including new line character) should be stored in chars. This does not seem to be the case as when I attempt to output the name I get nothing with this code:
mov rcx, -11 ;specifies that the output handle is required
call GetStdHandle ;returns value for handle to rax
;print name
mov rcx, rax
mov rdx, name ;load the address of the name into rdx
mov r8, chars ;move the length into the r8 register
mov r9, char_written
call WriteConsoleA
But if I switch the chars label out for a value it works as expected i.e.
mov rcx, -11 ;specifies that the output handle is required
call GetStdHandle ;returns value for handle to rax
;print name
mov rcx, rax
mov rdx, name ;load the address of the name into rdx
mov r8, 4 ;move the length into the r8 register
mov r9, char_written
call WriteConsoleA
Obviously I am doing something wrong with this buffer - any assistance would be much appreciated.
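One thing that stands out, offered as a sketch rather than a verified fix: in NASM, mov r8, chars loads the address of chars, not the DWORD stored there, so WriteConsoleA is handed a very large length. Loading the value with square brackets should give the expected count. The version below also adds the shadow space and the fifth argument that both console calls take, which the code above omits:

```nasm
default rel
extern GetStdHandle
extern ReadConsoleA
extern WriteConsoleA
extern ExitProcess

section .bss
name:          resb 30              ; receives the typed characters
chars:         resd 1               ; DWORD: number of characters actually read
char_written:  resd 1               ; DWORD: number of characters actually written

section .text
global _main
_main:
    sub     rsp, 56                 ; 32 shadow + fifth-argument slot, rsp 16-byte aligned

    mov     ecx, -10                ; STD_INPUT_HANDLE
    call    GetStdHandle
    mov     rcx, rax                ; console input handle
    lea     rdx, [name]             ; buffer to fill
    mov     r8d, 30                 ; max number of characters to read
    lea     r9, [chars]             ; ADDRESS where the count is stored
    mov     qword [rsp+32], 0       ; fifth argument = NULL
    call    ReadConsoleA

    mov     ecx, -11                ; STD_OUTPUT_HANDLE
    call    GetStdHandle
    mov     rcx, rax                ; console output handle
    lea     rdx, [name]             ; buffer to print
    mov     r8d, dword [chars]      ; VALUE stored at chars (includes the CR LF)
    lea     r9, [char_written]
    mov     qword [rsp+32], 0       ; fifth argument = NULL
    call    WriteConsoleA

    xor     ecx, ecx                ; exit code 0
    call    ExitProcess
```

The distinction is the same one as for r9: the API wants the address of chars when it has to store the count, but the count itself when it has to use it.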

Resources