x86 Assembly; overwriting .bss values?

x86 Assembly; overwriting .bss values? - macos

I'm currently trying to write a small program in ASM, and I have the following issue. I take input from the user as a string which I store in a variable I've declared in the .bss section of my code; I then re-prompt and overwrite the previously stored answer and do this multiple times. My issue is if someone has entered an answer that was shorter than the last (i.e. "James" then "Jim") I get the following output:
"Hi, James"
"What's your name?"
"Jim"
"Hi, Jimes"
What's happening here is the characters that weren't overwritten remain and get printed, as expected. What I'm wondering is how I may go about wiping the data in the .bss db between prompts?
Here is the code so far:
section .data
question: db "What's your name?", 10
answer: db "Hello, "
ln db 10
section .bss
name resb 16
section .text
global start
start:
call prompt
call getName
mov rsi, answer
mov rdx, 7
call print
mov rsi, name
mov rdx, 10
call print
mov rsi, ln
mov rdx, 1
call print
call loop_name
mov rax, 0x02000001
mov rdi, 0
syscall
reset_name:
loop_name:
mov cx, 3
startloop:
cmp cx, 0
jz endofloop
push cx
loopy:
call getName
mov rsi, answer
mov rdx, 7
call print
mov rsi, name
mov rdx, 10
call print
pop cx
dec cx
jmp startloop
endofloop:
; Loop ended
; Do what ever you have to do here
ret
prompt:
mov rax, 0x02000004
mov rdi, 1
mov rsi, question
mov rdx, 18
syscall
print:
mov rax, 0x02000004
mov rdi, 1
syscall
ret
getName:
mov rax, 0x02000003 ; read
mov rdi, 0
mov rsi, name
mov rdx, 37
syscall
ret
Any ideas? (Variable in question is name)

While I don't know the system calls you're using, we can do one of three things:
clear the entire variable before reusing it.
use and share an explicit length value to indicate how many bytes of it are valid
null terminate the string right after it is input
Using an explicit length value may involve someone placing a null terminator at the right point in time (e.g. just before printing).
The read operation should return to you a length that you can pass to someone else (e.g. as a pair pointer & length), or otherwise use immediately to null terminate the string.  If it doesn't, then use the first approach of clearing the entire variable before reusing it.
Typically, syscalls have return values, that indicate length on success or else negative values for failure.  In such case, you are ignoring both.

Related

Why does a function double dereference arguments stored on stack and how is that possible? [duplicate]

This question already has answers here:
Basic use of immediates vs. square brackets in YASM/NASM x86 assembly
(4 answers)
x86 Nasm assembly - push'ing db vars on stack - how is the size known?
(2 answers)
Referencing the contents of a memory location. (x86 addressing modes)
(2 answers)
Why do you have to dereference the label of data to store something in there: Assembly 8086 FASM
(1 answer)
Closed 7 months ago.
I tried to understand "lfunction" stack arguments loading to "flist" in following assembly code I found on a book (The book doesn't explain it. Code compiles and run without errors giving intended output displaying "The string is: ABCDEFGHIJ".) but I can't grasp the legality or logic of the code. What I don't understand is listed below.
In lfunction:
Non-volatile (as per Microsoft x64 calling convention) register RBX is not backed up before 'XOR'ing. (But it is not what bugs me most.)
In portion ";arguments on stack"
mov rax, qword [rbp+8+8+32]
mov bl,[rax]
Here [rbp+8+8+32] dereferences corresponding address stored in stack so RAX should
be loaded with value represented by'fourth' which is char 'D'(0x44) as per my understanding (Why qword?). And if so, what dereferencing char 'D' in second line can possibly mean (There should be a memory address to dereference but 'D' is a char.)?
Original code is listed below:
%include "io64.inc"
; stack.asm
extern printf
section .data
first db "A"
second db "B"
third db "C"
fourth db "D"
fifth db "E"
sixth db "F"
seventh db "G"
eighth db "H"
ninth db "I"
tenth db "J"
fmt db "The string is: %s",10,0
section .bss
flist resb 14 ;length of string plus end 0
section .text
global main
main:
push rbp
mov rbp,rsp
sub rsp, 8
mov rcx, flist
mov rdx, first
mov r8, second
mov r9, third
push tenth ; now start pushing in
push ninth ; reverse order
push eighth
push seventh
push sixth
push fifth
push fourth
sub rsp,32 ; shadow
call lfunc
add rsp,32+8
; print the result
mov rcx, fmt
mov rdx, flist
sub rsp,32+8
call printf
add rsp,32+8
leave
ret
;––––––––––––––––––––––––-
lfunc:
push rbp
mov rbp,rsp
xor rax,rax ;clear rax (especially higher bits)
;arguments in registers
mov al,byte[rdx] ; move content argument to al
mov [rcx], al ; store al to memory(resrved at section .bss)
mov al, byte[r8]
mov [rcx+1], al
mov al, byte[r9]
mov [rcx+2], al
;arguments on stack
xor rbx,rbx
mov rax, qword [rbp+8+8+32] ; rsp + rbp + return address + shadow
mov bl,[rax]
mov [rcx+3], bl
mov rax, qword [rbp+48+8]
mov bl,[rax]
mov [rcx+4], bl
mov rax, qword [rbp+48+16]
mov bl,[rax]
mov [rcx+5], bl
mov rax, qword [rbp+48+24]
mov bl,[rax]
mov [rcx+6], bl
mov rax, qword [rbp+48+32]
mov bl,[rax]
mov [rcx+7], bl
mov rax, qword [rbp+48+40]
mov bl,[rax]
mov [rcx+8], bl
mov rax, qword [rbp+48+48]
mov bl,[rax]
mov [rcx+9], bl
mov bl,0 ; terminating zero
mov [rcx+10], bl
leave
ret
Additional info:
I cannot look at register values just after line 50 which
corresponds to "XOR RAX, RAX" in lfunc because debugger auto skips
single stepping to line 37 of main function which corresponds to
"add RSP, 32+8". Even If I marked breakpoints in between
aforementioned lines in lfunc code the debugger simply hangs so I
have to manually abort debugging.
In portion ";arguments on stack"
mov rax, qword [rbp+8+8+32]
mov bl,[rax]
I am mentioning this again to be more precise of what am asking because question was marked as duplicate and
provided links with answers that doesn't address my specific issue. At line
[rbp+8+8+32] == 0x44 because clearly, mov with square brackets dereferences reference address (which I assume 64bit width) rbp+3h. So, the size of 0x44 is byte. That is why ask "Why qword?" because it implies "lea [rbp+8+8+32]" which is a qword reference, not mov. So if [rbp+8+8+32] equals 0x44, then [rax] == [0x0000000000000044], which a garbage ( not relevant to our code here) address.

Why doesn't this assembly code print the top of the stack?

After successfully making a "Hello, World!" program in x86-64, I wanted
to make a program that can peek at the top of the stack (without popping it, and using the esp register so I can learn how it works). This is the program in NASM:
extern GetStdHandle, WriteConsoleA, ExitProcess
section .bss
dummy resd 1
section .text
%macro print 3
mov rcx, %1
mov rdx, %2
mov r8, %3
mov r9, dummy
push NULL
call WriteConsoleA
%endmacro
_start:
mov rcx, STD_OUTPUT_HANDLE
call GetStdHandle
push 65
print rax, [x], 1
mov rcx, 0
call ExitProcess
NULL equ 0
STD_OUTPUT_HANDLE equ -11
At the print rax, [x], 1 line, x is replaced by something. I tried a variety of things, like rsp, esp, rsi, esi, rsp+1, rsp+4, etc. None of them worked. They either don't compile or don't print anything.
What is the correct way to do it? (note: this is solely for experimental purposes. I know I could use push/pop in this case, but I want to learn how to do it this way.)

mov rdx, [rsp] will load 65 into rdx. But WriteConsole expects the address of the string to print. So you want mov rdx, rsp.
One other thing that should be fixed: the stack should be aligned to 16 bytes before the call and there should be 32 bytes of empty space at the top of the stack. After the push, put sub rsp, 40. Then use rsp+40 as the address to print.

Open file, delete zeros, sort it - NASM

I am currently working on some problems and this is the one I am having trouble with. To make it all clear, I am a beginner, so any help is more than welcome.
Problem:
Sort the content of a binary file in descending order. The name of the file is passed as a command line argument. File content is interpreted as four-byte positive integers, where value 0, when found, is not written into the file. The result must be written in the same file that has been read.
The way I understand is that I have to have a binary file. Open it. Get its content. Find all characters while keeping in mind those are positive, four-byte integers, find zeros, get rid of zeros, sort the rest of the numbers.
We are allowed to use glibc, so this was my attempt:
section .data
warning db 'File does not exist!', 10, 0
argument db 'Enter your argument.', 10, 0
mode dd 'r+'
opened db 'File is open. Time to read.', 10, 0
section .bss
content resd 10
counter resb 1
section .text
extern printf, fopen, fgets, fputc
global main
main:
push rbp
mov rbp, rsp
push rsi
push rdi
push rbx
;location of argument's address
push rsi
cmp rdi, 2
je .openfile
mov rdi, argument
mov rax, 0
call printf
jmp .end
.openfile:
pop rbx
;First real argument of command line
mov rdi, [rbx + 8]
mov rsi, mode
mov rax, 0
call fopen
cmp al, 0
je .end
push rax
mov rdi, opened
mov rax, 0
call printf
.readfromfile:
mov rdi, content
mov rsi, 12 ;I wrote 10 numbers in my file
pop rdx
mov rax, 0
call fgets
cmp al, 0
je .end
push rax
mov rsi, tekst
pop rdi
.loop:
lodsd
inc byte[counter]
cmp eax, '0'
jne .loop
;this is the part where I am not sure what to do.
;I am trying to delete the zero with backspace, then use space and
;backspace again - I saw it here somewhere as a solution
mov esi, 0x08
call fputc
mov esi, 0x20
call fputc
mov esi, 0x08
call fputc
cmp eax, 0
je .end
jmp .loop
.end:
pop rdi
pop rsi
pop rbx
mov rsp, rbp
pop rbp
ret
So, my idea was to open the file, find zero, delete it by using backspace and space, then backspace again; Continue until I get to the end of the file, then sort it. As it can be seen I did not attempt to sort the content because I cannot get program to do the first part for me. I have been trying this for couple of days now and everything is getting foggy.
If someone can help me out, I would be very grateful. If there is something similar to this problem, feel free to link it to me. Anything that could help, I am ready to read and learn.
I am also unsure about how much information do I have to give. If something is unclear, please point it out to me.
Thank you

For my own selfish fun, an example of memory area being "collapsed" when dword zero value is detected:
to build in linux with NASM for target ELF64 executable:
nasm -f elf64 so_64b_collapseZeroDword.asm -l so_64b_collapseZeroDword.lst -w+all
ld -b elf64-x86-64 -o so_64b_collapseZeroDword so_64b_collapseZeroDword.o
And for debugger I'm using edb (built from sources) (the executable doesn't do anything observable by user, when it works correctly, it's supposed to be run in debugger single-stepping over instructions and having memory view over the .data segment to see how the values are moved around in memory).
source file so_64b_collapseZeroDword.asm
segment .text
collapseZeroDwords:
; input (custom calling convention, suitable only for calls from assembly):
; rsi - address of first element
; rdx - address beyond last element ("vector::end()" pointer)
; return: rdi - new "beyond last element" address
; modifies: rax, rsi, rdi
; the memory after new end() is not cleared (the zeroes are just thrown away)!
; search for first zero (up till that point the memory content will remain same)
cmp rsi, rdx
jae .noZeroFound ; if the (rsi >= end()), no zero was in the memory
lodsd ; eax = [rsi], rsi += 4
test eax, eax ; check for zero
jne collapseZeroDwords
; first zero found, from here on, the non-zero values will be copied to earlier area
lea rdi, [rsi-4] ; address where the non-zero values should be written
.moveNonZeroValues:
cmp rsi, rdx
jae .wholeArrayCollapsed ; if (rsi >= end()), whole array is collapsed
lodsd ; eax = [rsi], rsi += 4
test eax, eax ; check for zero
jz .moveNonZeroValues ; zero detected, skip the "store" value part
stosd ; [rdi] = eax, rdi += 4 (pointing beyond last element)
jmp .moveNonZeroValues
.noZeroFound:
mov rdi, rdx ; just return the original "end()" pointer
.wholeArrayCollapsed: ; or just return when rdi is already set as new end()
ret
global _start
_start: ; run some hardcoded simple tests, verify in debugger
lea rsi, [test1]
lea rdx, [test1+4*4]
call collapseZeroDwords
cmp rdi, test1+4*4 ; no zero collapsed
lea rsi, [test2]
lea rdx, [test2+4*4]
call collapseZeroDwords
cmp rdi, test2+3*4 ; one zero
lea rsi, [test3]
lea rdx, [test3+4*4]
call collapseZeroDwords
cmp rdi, test3+3*4 ; one zero
lea rsi, [test4]
lea rdx, [test4+4*4]
call collapseZeroDwords
cmp rdi, test4+2*4 ; two zeros
lea rsi, [test5]
lea rdx, [test5+4*4]
call collapseZeroDwords
cmp rdi, test5+2*4 ; two zeros
lea rsi, [test6]
lea rdx, [test6+4*4]
call collapseZeroDwords
cmp rdi, test6+0*4 ; four zeros
; exit back to linux
mov eax, 60
xor edi, edi
syscall
segment .data
; all test arrays are 4 elements long for simplicity
dd 0xCCCCCCCC ; debug canary value to detect any over-read or over-write
test1 dd 71, 72, 73, 74, 0xCCCCCCCC
test2 dd 71, 72, 73, 0, 0xCCCCCCCC
test3 dd 0, 71, 72, 73, 0xCCCCCCCC
test4 dd 0, 71, 0, 72, 0xCCCCCCCC
test5 dd 71, 0, 72, 0, 0xCCCCCCCC
test6 dd 0, 0, 0, 0, 0xCCCCCCCC
I tried to comment it extensively to show what/why/how it is doing, but feel free to ask about any particular part. The code was written with simplicity on mind, so it doesn't use any aggressive performance optimizations (like vectorized search for first zero value, etc).

NASM 64-bit OS X Inputted String Overwriting Bytes of Existing Value

I am trying to write a simple assembly program to add two numbers together. I want the user to be able to enter the values. The problem I am encountering is that when I display a string message and then read a value in, the next time the string is required the first x characters of the string have been overwritten by the data that was entered by the user.
My assumption is that this is related to the use of LEA to load the string into the register. I have been doing this because Macho64 complains if a regular MOV instruction is used in this situation (something to do with addressing space in 64-bits on the Mac).
My code is as follows:
section .data ;this is where constants go
input_message db 'Please enter your next number: '
length equ $-input_message
section .text ;declaring our .text segment
global _main ;telling where program execution should start
_main: ;this is where code starts getting executed
mov r8, 0
_loop_values:
call _get_value
call _write
inc r8 ;increment the loop counter
cmp r8, 2 ;compare loop counter to zero
jne _loop_values
call _exit
_get_value:
lea rcx, [rel input_message] ;move the input message into rcx for function call
mov rdx, length ;load the length of the message for function call
call _write
call _read
ret
_read:
mov rdx, 255 ;set buffer size for input
mov rdi, 0 ;stdout
mov rax, SYSCALL_READ
syscall
mov rdx, rax ;move the length from rax to rdx
dec rdx ;remove new line character from input length
mov rcx, rsi ;move the value input from rsi to rcx
ret
_write:
mov rsi, rcx ;load the output message
;mov rdx, rax
mov rax, SYSCALL_WRITE
syscall
ret
_exit:
mov rax, SYSCALL_EXIT
mov rdi, 0
syscall
The program loops twice as it should. The first time I get the following prompt:
Please enter your next number:
I would the enter something like 5 (followed by the return key)
The next prompt would be:
5
ease enter your next number:
Any assistance would be much appreciated.

I think all 64-bit code on Mac is required to be rip relative.
Absolute addresses are not supported. in this type of addressing you address your symbol relative to rip.
NASM documentation says:
default abs
mov eax,[foo] ; 32−bit absolute disp, sign−extended
mov eax,[a32 foo] ; 32−bit absolute disp, zero−extended
mov eax,[qword foo] ; 64−bit absolute disp
default rel
mov eax,[foo] ; 32−bit relative disp
mov eax,[a32 foo] ; d:o, address truncated to 32 bits(!)
mov eax,[qword foo] ; error
mov eax,[abs qword foo] ; 64−bit absolute disp
and you can also see this question.

NASM string from user input not comparing

It's my second day I'm learning NASM and assembly language at all, so I've decided to write a kind of a calculator. The problem is that when user enters the operation, the program doesn't compare it.I mean compares but it doesn't consider that strings are equal when they are. I've googled a lot, but no results. What might be my broblem? Here's source
operV resb 255 ; this is declaration of variable later used to store the input, in .bss of course
mov rax, 0x2000003 ;here user enters the operation, input is "+", or "-"
mov rdi, 0
mov rsi, operV
mov rdx, 255
syscall
mov rax, operV ; here is part where stuff is compared
mov rdi, "+"
cmp rax, rdi
je add
mov rdi, "-"
cmp rax, rdi
je substr
;etc...

You also press the enter key when you submit information like that. change rdi to "+", 10 on linux, or "+", 11 on windows
The answer came to me in a dream last night.
operV resd 4; Resd because registers are a dword in size, and we want the comparison to be as seamless ass possible, you can leave this as 255, but thats a little excessive as we only need 4
mov rax, 0x2000003
mov rdi, 0
mov rsi. operV
mov rdx, 4 ;You only need 4 bytes (1 dword) as this is all we are accepting (+ and line end, then empty space so it matches up with the register we are comparing too) you can leave this as 255 too, but again, we only need 4
syscall
mov rax, operV
mov rdi, "x", 10
cmp dword[rax], rdi ; You were comparing the pointer in rax with rdi, not the content pointed to by rax with rdi. now we are comparing the double word (4 bytes) at the location pointed too by rax (operV) and comparing them. Instead of comparing the location of operV with the thing you want.
je add
That's the changes done :P

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio

x86 Assembly; overwriting .bss values? - macos

Related

Why does a function double dereference arguments stored on stack and how is that possible? [duplicate]

Why doesn't this assembly code print the top of the stack?

Open file, delete zeros, sort it - NASM

NASM 64-bit OS X Inputted String Overwriting Bytes of Existing Value

NASM string from user input not comparing

Categories

Resources