Segmentation fault when adding 2 digits - nasm MacOS x86_64 - macos

I am trying to write a program that accepts 2 digits as user input, and then outputs their sum. I keep getting segmentation error when trying to run program(I am able to input 2 digits, but then the program crashes). I already check answers to similar questions and many of them pointed out to clear the registers, which I did, but I am still getting a segmentation fault.
section .text
global _main ;must be declared for linker (ld)
default rel
_main: ;tells linker entry point
call _readData
call _readData1
call _addData
call _displayData
mov RAX, 0x02000001 ;system call number (sys_exit)
syscall
_addData:
mov byte [sum], 0 ; init sum with 0
lea EAX, [buffer] ; load value from buffer to register
lea EBX, [buffer1] ; load value from buffer1 to register
sub byte [EAX], '0' ; transfrom to digit
sub byte [EBX], '0' ; transform to digit
add [sum], EAX ; increment value of sum by value from register
add [sum], EBX ; increment value of sum by value from 2nd register
add byte [sum], '0' ; convert to ASCI
xor EAX, EAX ; clear registers
xor EBX, EBX ; clear registers
ret
_readData:
mov RAX, 0x02000003
mov RDI, 2
mov RSI, buffer
mov RDX, SIZE
syscall
ret
_readData1:
mov RAX, 0x02000003
mov RDI, 2
mov RSI, buffer1
mov RDX, SIZE
syscall
ret
_displayData:
mov RAX, 0x02000004
mov RDI, 1
mov RSI, sum
mov RDX, SIZE
syscall
ret
section .bss
SIZE equ 4
buffer: resb SIZE
buffer1: resb SIZE
sum: resb SIZE
I see that, unlike other languages I learned, it is quite difficult to find a good source /tutorial about programming assembly using nasm on x86_64 architecture. Is there any kind of walkthrough for beginners(so I do not need to ask on SO everytime I am stuck :D)

Related

Comparing input to character not working in x86_64 Mac assembly nasm

In nasm assembly on mac with the processor architecture x86_64, I am struggling to compare input to a string or character. When comparing input (stdin) to a character, it's not being true when it should be. I am new to assembly.
Here is my code.
global start
section .bss
input resb 10
section .text
start:
;getting the input
mov rax, 0x2000003 ;meaning read
mov rdi, 0
mov rsi, input
mov rdx, 10
syscall ;special
;here is where I do the comparing
mov rax, r
mov rbx, input
cmp rax, rbx
je right
;jumping to the return function
jmp ret
right:
mov rax, 0x2000004 ;meaning write
mov rdi, 1
mov rsi, right_way
mov rdx, right_len
syscall ;special
ret:
mov rax, 0x2000001 ;return 0
xor rdi, rdi ;which means to make rdi = 0 could be replaced with mov rdi, 0 but xor is faster
syscall
section .data
right_way: db "You are correct!", 10, 0
right_len: equ $-right_way
r: db "r", 10
At the "je right" line, it should jump to the right function but it does not. Do I need to convert the input to something else?
Help would be appreciated. Thanks!

Open file, delete zeros, sort it - NASM

I am currently working on some problems and this is the one I am having trouble with. To make it all clear, I am a beginner, so any help is more than welcome.
Problem:
Sort the content of a binary file in descending order. The name of the file is passed as a command line argument. File content is interpreted as four-byte positive integers, where value 0, when found, is not written into the file. The result must be written in the same file that has been read.
The way I understand is that I have to have a binary file. Open it. Get its content. Find all characters while keeping in mind those are positive, four-byte integers, find zeros, get rid of zeros, sort the rest of the numbers.
We are allowed to use glibc, so this was my attempt:
section .data
warning db 'File does not exist!', 10, 0
argument db 'Enter your argument.', 10, 0
mode dd 'r+'
opened db 'File is open. Time to read.', 10, 0
section .bss
content resd 10
counter resb 1
section .text
extern printf, fopen, fgets, fputc
global main
main:
push rbp
mov rbp, rsp
push rsi
push rdi
push rbx
;location of argument's address
push rsi
cmp rdi, 2
je .openfile
mov rdi, argument
mov rax, 0
call printf
jmp .end
.openfile:
pop rbx
;First real argument of command line
mov rdi, [rbx + 8]
mov rsi, mode
mov rax, 0
call fopen
cmp al, 0
je .end
push rax
mov rdi, opened
mov rax, 0
call printf
.readfromfile:
mov rdi, content
mov rsi, 12 ;I wrote 10 numbers in my file
pop rdx
mov rax, 0
call fgets
cmp al, 0
je .end
push rax
mov rsi, tekst
pop rdi
.loop:
lodsd
inc byte[counter]
cmp eax, '0'
jne .loop
;this is the part where I am not sure what to do.
;I am trying to delete the zero with backspace, then use space and
;backspace again - I saw it here somewhere as a solution
mov esi, 0x08
call fputc
mov esi, 0x20
call fputc
mov esi, 0x08
call fputc
cmp eax, 0
je .end
jmp .loop
.end:
pop rdi
pop rsi
pop rbx
mov rsp, rbp
pop rbp
ret
So, my idea was to open the file, find zero, delete it by using backspace and space, then backspace again; Continue until I get to the end of the file, then sort it. As it can be seen I did not attempt to sort the content because I cannot get program to do the first part for me. I have been trying this for couple of days now and everything is getting foggy.
If someone can help me out, I would be very grateful. If there is something similar to this problem, feel free to link it to me. Anything that could help, I am ready to read and learn.
I am also unsure about how much information do I have to give. If something is unclear, please point it out to me.
Thank you
For my own selfish fun, an example of memory area being "collapsed" when dword zero value is detected:
to build in linux with NASM for target ELF64 executable:
nasm -f elf64 so_64b_collapseZeroDword.asm -l so_64b_collapseZeroDword.lst -w+all
ld -b elf64-x86-64 -o so_64b_collapseZeroDword so_64b_collapseZeroDword.o
And for debugger I'm using edb (built from sources) (the executable doesn't do anything observable by user, when it works correctly, it's supposed to be run in debugger single-stepping over instructions and having memory view over the .data segment to see how the values are moved around in memory).
source file so_64b_collapseZeroDword.asm
segment .text
collapseZeroDwords:
; input (custom calling convention, suitable only for calls from assembly):
; rsi - address of first element
; rdx - address beyond last element ("vector::end()" pointer)
; return: rdi - new "beyond last element" address
; modifies: rax, rsi, rdi
; the memory after new end() is not cleared (the zeroes are just thrown away)!
; search for first zero (up till that point the memory content will remain same)
cmp rsi, rdx
jae .noZeroFound ; if the (rsi >= end()), no zero was in the memory
lodsd ; eax = [rsi], rsi += 4
test eax, eax ; check for zero
jne collapseZeroDwords
; first zero found, from here on, the non-zero values will be copied to earlier area
lea rdi, [rsi-4] ; address where the non-zero values should be written
.moveNonZeroValues:
cmp rsi, rdx
jae .wholeArrayCollapsed ; if (rsi >= end()), whole array is collapsed
lodsd ; eax = [rsi], rsi += 4
test eax, eax ; check for zero
jz .moveNonZeroValues ; zero detected, skip the "store" value part
stosd ; [rdi] = eax, rdi += 4 (pointing beyond last element)
jmp .moveNonZeroValues
.noZeroFound:
mov rdi, rdx ; just return the original "end()" pointer
.wholeArrayCollapsed: ; or just return when rdi is already set as new end()
ret
global _start
_start: ; run some hardcoded simple tests, verify in debugger
lea rsi, [test1]
lea rdx, [test1+4*4]
call collapseZeroDwords
cmp rdi, test1+4*4 ; no zero collapsed
lea rsi, [test2]
lea rdx, [test2+4*4]
call collapseZeroDwords
cmp rdi, test2+3*4 ; one zero
lea rsi, [test3]
lea rdx, [test3+4*4]
call collapseZeroDwords
cmp rdi, test3+3*4 ; one zero
lea rsi, [test4]
lea rdx, [test4+4*4]
call collapseZeroDwords
cmp rdi, test4+2*4 ; two zeros
lea rsi, [test5]
lea rdx, [test5+4*4]
call collapseZeroDwords
cmp rdi, test5+2*4 ; two zeros
lea rsi, [test6]
lea rdx, [test6+4*4]
call collapseZeroDwords
cmp rdi, test6+0*4 ; four zeros
; exit back to linux
mov eax, 60
xor edi, edi
syscall
segment .data
; all test arrays are 4 elements long for simplicity
dd 0xCCCCCCCC ; debug canary value to detect any over-read or over-write
test1 dd 71, 72, 73, 74, 0xCCCCCCCC
test2 dd 71, 72, 73, 0, 0xCCCCCCCC
test3 dd 0, 71, 72, 73, 0xCCCCCCCC
test4 dd 0, 71, 0, 72, 0xCCCCCCCC
test5 dd 71, 0, 72, 0, 0xCCCCCCCC
test6 dd 0, 0, 0, 0, 0xCCCCCCCC
I tried to comment it extensively to show what/why/how it is doing, but feel free to ask about any particular part. The code was written with simplicity on mind, so it doesn't use any aggressive performance optimizations (like vectorized search for first zero value, etc).

Assembly Programming using SASM on Windows, with an example using int 0x80 (Linux system calls)

I need help. I'm trying to run the program (NASM) below in SASM.
SYS_EXIT equ 1
SYS_READ equ 3
SYS_WRITE equ 4
STDIN equ 0
STDOUT equ 1
segment .data
msg1 db "Enter a digit ", 0xA,0xD
len1 equ $- msg1
msg2 db "Please enter a second digit", 0xA,0xD
len2 equ $- msg2
msg3 db "The sum is: "
len3 equ $- msg3
segment .bss
num1 resb 2
num2 resb 2
res resb 1
section .text
global _start ;must be declared for using gcc
_start: ;tell linker entry point
mov eax, SYS_WRITE
mov ebx, STDOUT
mov ecx, msg1
mov edx, len1
int 0x80
mov eax, SYS_READ
mov ebx, STDIN
mov ecx, num1
mov edx, 2
int 0x80
mov eax, SYS_WRITE
mov ebx, STDOUT
mov ecx, msg2
mov edx, len2
int 0x80
mov eax, SYS_READ
mov ebx, STDIN
mov ecx, num2
mov edx, 2
int 0x80
mov eax, SYS_WRITE
mov ebx, STDOUT
mov ecx, msg3
mov edx, len3
int 0x80
; moving the first number to eax register and second number to ebx
; and subtracting ascii '0' to convert it into a decimal number
mov eax, [num1]
sub eax, '0'
mov ebx, [num2]
sub ebx, '0'
; add eax and ebx
add eax, ebx
; add '0' to to convert the sum from decimal to ASCII
add eax, '0'
; storing the sum in memory location res
mov [res], eax
; print the sum
mov eax, SYS_WRITE
mov ebx, STDOUT
mov ecx, res
mov edx, 1
int 0x80
exit:
mov eax, SYS_EXIT
xor ebx, ebx
int 0x80
I had this error:
[20:53:11] Warning! Errors have occurred in the build:
c:/program files (x86)/sasm/mingw/bin/../lib/gcc/mingw32/4.6.2/../../../libmingw32.a(main.o): In function 'main':
C:\MinGW\msys\1.0\src\mingwrt/../mingw/main.c:73: undefined reference to `WinMain#16'
Also, how do I limit users input up to 4 digits only?
global _start should change to global main and Linux system calls should be replaced by Windows API function calls and declared as external. Modern versions of Windows doesn't approve use of system calls due to malware or badware risks, so deprecated (permanent) system call codes. Every modern version of Windows has different system call number codes, though you can find them on internet, you shouldn't rely on them unless you want to revise your assembly code for each version of Windows thus reducing portability and increasing workload. There are significant differences between Linux/Mac and Windows in the way of handling registers, stack and function names.

NASM ReadConsoleA or WriteConsoleA Buffer Debugging Issue

I am writing a NASM Assembly program on Windows to get the user to enter in two single digit numbers, add these together and then output the result. I am trying to use the Windows API for input and output.
Unfortunately, whilst I can get it to read in one number as soon as the program loops round to get the second the program ends rather than asking for the second value.
The output of the program shown below:
What is interesting is that if I input 1 then the value displayed is one larger so it is adding to something!
This holds for other single digits (2-9) entered as well.
I am pretty sure it is related to how I am using the ReadConsoleA function but I have hit a bit of a wall attempting to find a solution. I have installed gdb to debug the program and assembled it as follows:
nasm -f win64 -g -o task9.obj task9.asm
GoLink /console /entry _main task9.obj kernel32.dll
gdb task9
But I just get the following error:
"C:\Users\Administrator\Desktop/task9.exe": not in executable format: File format not recognized
I have since read that NASM doesn't output the debug information needed for the Win64 format but I am not 100% sure about that. I am fairly sure I have the 64-bit version of GDB installed:
My program is as follows:
extern ExitProcess ;windows API function to exit process
extern WriteConsoleA ;windows API function to write to the console window (ANSI version)
extern ReadConsoleA ;windows API function to read from the console window (ANSI version)
extern GetStdHandle ;windows API to get the for the console handle for input/output
section .data ;the .data section is where variables and constants are defined
STD_OUTPUT_HANDLE equ -11
STD_INPUT_HANDLE equ -10
digits db '0123456789' ;list of digits
input_message db 'Please enter your next number: '
length equ $-input_message
section .bss ;the .bss section is where space is reserved for additional variables
input_buffer: resb 2 ;reserve 64 bits for user input
char_written: resb 4
chars: resb 1 ;reversed for use with write operation
section .text ;the .text section is where the program code goes
global _main ;tells the machine which label to start program execution from
_num_to_str:
cmp rax, 0 ;compare value in rax to 0
jne .convert ;if not equal then jump to label
jmp .output
.convert:
;get next digit value
inc r15 ;increment the counter for next digit
mov rcx, 10
xor rdx, rdx ;clear previous remainder result
div rcx ;divide value in rax by value in rcx
;quotient (result) stored in rax
;remainder stored in rdx
push rdx ;store remainder on the stack
jmp _num_to_str
.output:
pop rdx ;get the last digit from the stack
;convert digit value to ascii character
mov r10, digits ;load the address of the digits into rsi
add r10, rdx ;get the character of the digits string to display
mov rdx, r10 ;digit to print
mov r8, 1 ;one byte to be output
call _print
;decide whether to loop
dec r15 ;reduce remaining digits (having printed one)
cmp r15, 0 ;are there digits left to print?
jne .output ;if not equal then jump to label output
ret
_print:
;get the output handle
mov rcx, STD_OUTPUT_HANDLE ;specifies that the output handle is required
call GetStdHandle ;returns value for handle to rax
mov rcx, rax
mov r9, char_written
call WriteConsoleA
ret
_read:
;get the input handle
mov rcx, STD_INPUT_HANDLE ;specifies that the input handle is required
call GetStdHandle
;get value from keyboard
mov rcx, rax ;place the handle for operation
mov rdx, input_buffer ;set name to receive input from keyboard
mov r8, 2 ;max number of characters to read
mov r9, chars ;stores the number of characters actually read
call ReadConsoleA
movzx r12, byte[input_buffer]
ret
_get_value:
mov rdx, input_message ;move the input message into rdx for function call
mov r8, length ;load the length of the message for function call
call _print
xor r8, r8
xor r9, r9
call _read
.end:
ret
_main:
mov r13, 0 ;counter for values input
mov r14, 0 ;total for calculation
.loop:
xor r12, r12
call _get_value ;get value from user
sub r12, '0' ;convert char to integer
add r14, r12 ;add value to total
;decide whether to loop for another character or not
inc r13
cmp r13, 2
jne .loop
;convert total to ASCII value
mov rax, r14 ;num_to_str expects total in rax
mov r15, 0 ;num_to_str uses r15 as a counter - must be initialised
call _num_to_str
;exit the program
mov rcx, 0 ;exit code
call ExitProcess
I would really appreciate any assistance you can offer either with resolving the issue or how to resolve the issue with gdb.
I found the following issues with your code:
Microsoft x86-64 convention mandates rsp be 16 byte aligned.
You must reserve space for the arguments on the stack, even if you pass them in registers.
Your chars variable needs 4 bytes not 1.
ReadConsole expects 5 arguments.
You should read 3 bytes because ReadConsole returns CR LF. Or you could just ignore leading whitespace.
Your _num_to_str is broken if the input is 0.
Based on Jester's suggestions this is the final program:
extern ExitProcess ;windows API function to exit process
extern WriteConsoleA ;windows API function to write to the console window (ANSI version)
extern ReadConsoleA ;windows API function to read from the console window (ANSI version)
extern GetStdHandle ;windows API to get the for the console handle for input/output
section .data ;the .data section is where variables and constants are defined
STD_OUTPUT_HANDLE equ -11
STD_INPUT_HANDLE equ -10
digits db '0123456789' ;list of digits
input_message db 'Please enter your next number: '
length equ $-input_message
NULL equ 0
section .bss ;the .bss section is where space is reserved for additional variables
input_buffer: resb 3 ;reserve 64 bits for user input
char_written: resb 4
chars: resb 4 ;reversed for use with write operation
section .text ;the .text section is where the program code goes
global _main ;tells the machine which label to start program execution from
_num_to_str:
sub rsp, 32
cmp rax, 0
jne .next_digit
push rax
inc r15
jmp .output
.next_digit:
cmp rax, 0 ;compare value in rax to 0
jne .convert ;if not equal then jump to label
jmp .output
.convert:
;get next digit value
inc r15 ;increment the counter for next digit
mov rcx, 10
xor rdx, rdx ;clear previous remainder result
div rcx ;divide value in rax by value in rcx
;quotient (result) stored in rax
;remainder stored in rdx
sub rsp, 8 ;add space on stack for value
push rdx ;store remainder on the stack
jmp .next_digit
.output:
pop rdx ;get the last digit from the stack
add rsp, 8 ;remove space from stack for popped value
;convert digit value to ascii character
mov r10, digits ;load the address of the digits into rsi
add r10, rdx ;get the character of the digits string to display
mov rdx, r10 ;digit to print
mov r8, 1 ;one byte to be output
call _print
;decide whether to loop
dec r15 ;reduce remaining digits (having printed one)
cmp r15, 0 ;are there digits left to print?
jne .output ;if not equal then jump to label output
add rsp, 32
ret
_print:
sub rsp, 40
;get the output handle
mov rcx, STD_OUTPUT_HANDLE ;specifies that the output handle is required
call GetStdHandle ;returns value for handle to rax
mov rcx, rax
mov r9, char_written
mov rax, qword 0 ;fifth argument
mov qword [rsp+0x20], rax
call WriteConsoleA
add rsp, 40
ret
_read:
sub rsp, 40
;get the input handle
mov rcx, STD_INPUT_HANDLE ;specifies that the input handle is required
call GetStdHandle
;get value from keyboard
mov rcx, rax ;place the handle for operation
xor rdx, rdx
mov rdx, input_buffer ;set name to receive input from keyboard
mov r8, 3 ;max number of characters to read
mov r9, chars ;stores the number of characters actually read
mov rax, qword 0 ;fifth argument
mov qword [rsp+0x20], rax
call ReadConsoleA
movzx r12, byte[input_buffer]
add rsp, 40
ret
_get_value:
sub rsp, 40
mov rdx, input_message ;move the input message into rdx for function call
mov r8, length ;load the length of the message for function call
call _print
call _read
.end:
add rsp, 40
ret
_main:
sub rsp, 40
mov r13, 0 ;counter for values input
mov r14, 0 ;total for calculation
.loop:
call _get_value ;get value from user
sub r12, '0' ;convert char to integer
add r14, r12 ;add value to total
;decide whether to loop for another character or not
inc r13
cmp r13, 2
jne .loop
;convert total to ASCII value
mov rax, r14 ;num_to_str expects total in rax
mov r15, 0 ;num_to_str uses r15 as a counter - must be initialised
call _num_to_str
;exit the program
mov rcx, 0 ;exit code
call ExitProcess
add rsp, 40
ret
As it turned out I was actually missing a 5th argument in the WriteConsole function as well.

NASM 64-bit OS X Inputted String Overwriting Bytes of Existing Value

I am trying to write a simple assembly program to add two numbers together. I want the user to be able to enter the values. The problem I am encountering is that when I display a string message and then read a value in, the next time the string is required the first x characters of the string have been overwritten by the data that was entered by the user.
My assumption is that this is related to the use of LEA to load the string into the register. I have been doing this because Macho64 complains if a regular MOV instruction is used in this situation (something to do with addressing space in 64-bits on the Mac).
My code is as follows:
section .data ;this is where constants go
input_message db 'Please enter your next number: '
length equ $-input_message
section .text ;declaring our .text segment
global _main ;telling where program execution should start
_main: ;this is where code starts getting executed
mov r8, 0
_loop_values:
call _get_value
call _write
inc r8 ;increment the loop counter
cmp r8, 2 ;compare loop counter to zero
jne _loop_values
call _exit
_get_value:
lea rcx, [rel input_message] ;move the input message into rcx for function call
mov rdx, length ;load the length of the message for function call
call _write
call _read
ret
_read:
mov rdx, 255 ;set buffer size for input
mov rdi, 0 ;stdout
mov rax, SYSCALL_READ
syscall
mov rdx, rax ;move the length from rax to rdx
dec rdx ;remove new line character from input length
mov rcx, rsi ;move the value input from rsi to rcx
ret
_write:
mov rsi, rcx ;load the output message
;mov rdx, rax
mov rax, SYSCALL_WRITE
syscall
ret
_exit:
mov rax, SYSCALL_EXIT
mov rdi, 0
syscall
The program loops twice as it should. The first time I get the following prompt:
Please enter your next number:
I would the enter something like 5 (followed by the return key)
The next prompt would be:
5
ease enter your next number:
Any assistance would be much appreciated.
I think all 64-bit code on Mac is required to be rip relative.
Absolute addresses are not supported. in this type of addressing you address your symbol relative to rip.
NASM documentation says:
default abs
mov eax,[foo] ; 32−bit absolute disp, sign−extended
mov eax,[a32 foo] ; 32−bit absolute disp, zero−extended
mov eax,[qword foo] ; 64−bit absolute disp
default rel
mov eax,[foo] ; 32−bit relative disp
mov eax,[a32 foo] ; d:o, address truncated to 32 bits(!)
mov eax,[qword foo] ; error
mov eax,[abs qword foo] ; 64−bit absolute disp
and you can also see this question.

Resources