Scanf on nasm assembly program - gcc

I am trying to write a simple program using scanf and printf, but it is not storing my values properly.
extern printf
extern scanf
SECTION .data
str1: db "Enter a number: ",0,10
str2: db "your value is %d, squared = %d",0,10
fmt1: db "%d",0
location: dw 0h
SECTION .bss
input1: resw 1
SECTION .text
global main
main:
push ebp
mov ebp, esp
push str1
call printf
add esp, 4
push location
push fmt1
call scanf
mov ebx, eax ;ebx holds input
mul eax ;eax holds input*input
push eax
push ebx
push dword str2
call printf
add esp, 12
mov esp, ebp
pop ebp
mov eax,0
ret
For some reason when I run the program, no matter what number I enter, the program prints 1 for both inputs.
I am using nasm, linked with gcc

You're making an incorrect assumption here:
call scanf
mov ebx, eax ;ebx holds input
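scanf actually returns "the number of items of the argument list successfully filled" (source), so after a successful read eax holds 1, which is why both values print as 1. Your integer is in location; load it from there, e.g. mov eax, [location].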
By the way, you should probably make location at least 4 bytes (i.e. use dd instead of dw).

Related

Why does a function double dereference arguments stored on stack and how is that possible? [duplicate]

This question already has answers here:
Basic use of immediates vs. square brackets in YASM/NASM x86 assembly
(4 answers)
x86 Nasm assembly - push'ing db vars on stack - how is the size known?
(2 answers)
Referencing the contents of a memory location. (x86 addressing modes)
(2 answers)
Why do you have to dereference the label of data to store something in there: Assembly 8086 FASM
(1 answer)
Closed 7 months ago.
I am trying to understand how the stack arguments of lfunc are loaded into flist in the following assembly code, which I found in a book. The book doesn't explain it. The code assembles and runs without errors and gives the intended output, "The string is: ABCDEFGHIJ", but I can't grasp its legality or logic. What I don't understand is listed below.
In lfunc:
Non-volatile (as per Microsoft x64 calling convention) register RBX is not backed up before 'XOR'ing. (But it is not what bugs me most.)
In portion ";arguments on stack"
mov rax, qword [rbp+8+8+32]
mov bl,[rax]
Here [rbp+8+8+32] dereferences the corresponding address on the stack, so as I understand it RAX should be loaded with the value of fourth, which is the char 'D' (0x44). (Why qword?) If so, what can dereferencing the char 'D' in the second line possibly mean? There should be a memory address to dereference, but 'D' is a char.
Original code is listed below:
%include "io64.inc"
; stack.asm
extern printf
section .data
first db "A"
second db "B"
third db "C"
fourth db "D"
fifth db "E"
sixth db "F"
seventh db "G"
eighth db "H"
ninth db "I"
tenth db "J"
fmt db "The string is: %s",10,0
section .bss
flist resb 14 ;length of string plus end 0
section .text
global main
main:
push rbp
mov rbp,rsp
sub rsp, 8
mov rcx, flist
mov rdx, first
mov r8, second
mov r9, third
push tenth ; now start pushing in
push ninth ; reverse order
push eighth
push seventh
push sixth
push fifth
push fourth
sub rsp,32 ; shadow
call lfunc
add rsp,32+8
; print the result
mov rcx, fmt
mov rdx, flist
sub rsp,32+8
call printf
add rsp,32+8
leave
ret
;------------------------
lfunc:
push rbp
mov rbp,rsp
xor rax,rax ;clear rax (especially higher bits)
;arguments in registers
mov al,byte[rdx] ; move content argument to al
mov [rcx], al ; store al to memory (reserved in section .bss)
mov al, byte[r8]
mov [rcx+1], al
mov al, byte[r9]
mov [rcx+2], al
;arguments on stack
xor rbx,rbx
mov rax, qword [rbp+8+8+32] ; rsp + rbp + return address + shadow
mov bl,[rax]
mov [rcx+3], bl
mov rax, qword [rbp+48+8]
mov bl,[rax]
mov [rcx+4], bl
mov rax, qword [rbp+48+16]
mov bl,[rax]
mov [rcx+5], bl
mov rax, qword [rbp+48+24]
mov bl,[rax]
mov [rcx+6], bl
mov rax, qword [rbp+48+32]
mov bl,[rax]
mov [rcx+7], bl
mov rax, qword [rbp+48+40]
mov bl,[rax]
mov [rcx+8], bl
mov rax, qword [rbp+48+48]
mov bl,[rax]
mov [rcx+9], bl
mov bl,0 ; terminating zero
mov [rcx+10], bl
leave
ret
Additional info:
I cannot look at the register values just after line 50, which corresponds to "XOR RAX, RAX" in lfunc, because the debugger skips single-stepping straight to line 37 of the main function, which corresponds to "add RSP, 32+8". Even if I set breakpoints between those lines in lfunc, the debugger simply hangs, so I have to abort debugging manually.
In portion ";arguments on stack"
mov rax, qword [rbp+8+8+32]
mov bl,[rax]
I am repeating this to be more precise about what I am asking, because the question was marked as a duplicate and the linked answers don't address my specific issue. As I see it, [rbp+8+8+32] == 0x44, because mov with square brackets clearly dereferences the address rbp+30h (which I assume is 64 bits wide). So the size of 0x44 is a byte. That is why I ask "Why qword?": it seems to imply "lea [rbp+8+8+32]", which would yield a qword address, rather than mov. So if [rbp+8+8+32] equals 0x44, then [rax] == [0x0000000000000044], which is a garbage address (not relevant to our code here).

What exactly are the variables in assembly?

I am new to x86 assembly and have been doing some experiments lately using nasm and running the program on a windows 10 machine.
I have this code:
global _start
extern _GetStdHandle@4
extern _WriteFile@20
extern _ExitProcess@4
section .data
message db "1234"
section .text
_start:
call print
call _ExitProcess@4
print:
; DWORD bytes;
mov ebp, esp
sub esp, 4
; hStdOut = GetstdHandle( STD_OUTPUT_HANDLE)
push -11
call _GetStdHandle@4
mov ebx, eax
; WriteFile( hstdOut, message, length(message), &bytes, 0);
push 0
lea eax, [ebp-4]
push eax
push 4
push message
push ebx
call _WriteFile@20
mov esp, ebp
ret
; ExitProcess(0)
I assemble and link it using the following commands:
nasm -f win32 out.asm
link out.obj /entry:start /subsystem:console "C:\Program Files (x86)\Windows Kits\10\Lib\10.0.18362.0\um\x86\kernel32.lib"
and when running it on cmd it outputs "1234" as expected
Now when assembling and running the following code, where instead of pushing message the program pushes "1234" directly
global _start
extern _GetStdHandle@4
extern _WriteFile@20
extern _ExitProcess@4
section .data
message db "1234"
section .text
_start:
call print
call _ExitProcess@4
print:
; DWORD bytes;
mov ebp, esp
sub esp, 4
; hStdOut = GetstdHandle( STD_OUTPUT_HANDLE)
push -11
call _GetStdHandle@4
mov ebx, eax
; WriteFile( hstdOut, message, length(message), &bytes, 0);
push 0
lea eax, [ebp-4]
push eax
push 4
push "1234"
push ebx
call _WriteFile@20
mov esp, ebp
ret
It outputs nothing
Why? What information does message have that "1234" doesn't? When pushing message, does the program just push the address of the memory that is storing "1234"? If so, can I store "1234" somewhere else, and then push its address without creating a variable?
A variable is a logical construct — variables have lifetimes, some short, some long.  They can come into being and disappear.
By contrast, registers and memory are physical constructs — in some sense, they are always there.
In assembly programming, by a human or generated by compiler, we make mappings from logical variables needed by our C code, algorithms, and pseudo code, to physical storage available in the processor.  When a variable's lifetime ends, we can reuse the physical storage that it was using for another purpose (another variable).
Assembly language supports global variables (full process lifetime), and local variables — which can be either in memory on the stack, or CPU registers.  CPU registers, of course, do not have addresses, so cannot be passed by (memory) reference.  CPU registers also cannot be indexed, so to index an array requires memory.
I would make a local variable on the stack, like this:
print:
; DWORD bytes;
mov ebp, esp
sub esp, 12
; hStdOut = GetstdHandle( STD_OUTPUT_HANDLE)
push -11
call _GetStdHandle@4
mov ebx, eax
lea ecx, [ebp-12]
mov dword [ecx], "1234"
; WriteFile( hstdOut, message, length(message), &bytes, 0);
push 0
lea eax, [ebp-4]
push eax
push 4
push ecx
push ebx
call _WriteFile@20
mov esp, ebp
ret
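Note: the syntax mov ..., "1234" may or may not do what you want, depending on the assembler. NASM assembles a character constant like this to 0x34333231, so the bytes land in memory in string order; I don't remember how the Microsoft assembler handles it. If your assembler doesn't translate it to 0x34333231, use that constant instead.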

ESI and EDI change values after function call

I'm trying to convert some strings representing binary numbers into their actual values, using a conversion function defined in a different file.
Here's my code:
main.asm
bits 32
global start
%include 'convert.asm'
extern exit, scanf, printf
import exit msvcrt.dll
import scanf msvcrt.dll
import printf msvcrt.dll
section data use32 class=data
s DB '10100111b', '01100011b', '110b', '101011b'
len EQU $ - s
res times len DB 0
segment code use32 class=code
start:
mov ESI, s ; move source string
mov EDI, res ; move destination string
mov ECX, len ; length of the string
mov EBX, 0
repeat:
lodsb ; load current byte into AL
inc BL
cmp AL, 'b' ; check if its equal to the character b
jne end ; if its not, we need to keep parsing
push dword ESI ; push the position of the current character in the source string to the stack
push dword EDI ; push the position of the current character in the destination string to the stack
push dword EBX ; push the current length to the stack
call func1 ; call the function
end:
loop repeat
push dword 0
call [exit]
convert.asm
func1:
mov ECX, [ESP] ; first parameter is the current parsed length
mov EDI, [ESP + 4] ; then EDI
mov ESI, [ESP + 8] ; and ESI
sub ESI, ECX
parse:
mov EDX, [ESI]
sub EDX, '0'
mov [EDI], EDX
shl dword [EDI], 1
inc ESI
loop parse
ret 4 * 3
I noticed that I keep getting access violation errors after the function call though. ESI has some random value after the call. Am I doing something wrong? I think the parameter pushing part should be alright. Inside the conversion function, the parameters should be accessed in the reverse order. But that's not happening for some reason.
I'm also pretty sure that I did the compiling/linking part alright using nasm and alink.
nasm -fobj main.asm
nasm -fobj convert.asm
alink main.obj convert.obj -oPE -subsys console -entry start

NASM 64-bit OS X Inputted String Overwriting Bytes of Existing Value

I am trying to write a simple assembly program to add two numbers together. I want the user to be able to enter the values. The problem I am encountering is that when I display a string message and then read a value in, the next time the string is required the first x characters of the string have been overwritten by the data that was entered by the user.
My assumption is that this is related to the use of LEA to load the string into the register. I have been doing this because Macho64 complains if a regular MOV instruction is used in this situation (something to do with addressing space in 64-bits on the Mac).
My code is as follows:
section .data ;this is where constants go
input_message db 'Please enter your next number: '
length equ $-input_message
section .text ;declaring our .text segment
global _main ;telling where program execution should start
_main: ;this is where code starts getting executed
mov r8, 0
_loop_values:
call _get_value
call _write
inc r8 ;increment the loop counter
cmp r8, 2 ;compare loop counter to 2
jne _loop_values
call _exit
_get_value:
lea rcx, [rel input_message] ;move the input message into rcx for function call
mov rdx, length ;load the length of the message for function call
call _write
call _read
ret
_read:
mov rdx, 255 ;set buffer size for input
mov rdi, 0 ;stdin
mov rax, SYSCALL_READ
syscall
mov rdx, rax ;move the length from rax to rdx
dec rdx ;remove new line character from input length
mov rcx, rsi ;move the value input from rsi to rcx
ret
_write:
mov rsi, rcx ;load the output message
;mov rdx, rax
mov rax, SYSCALL_WRITE
syscall
ret
_exit:
mov rax, SYSCALL_EXIT
mov rdi, 0
syscall
The program loops twice as it should. The first time I get the following prompt:
Please enter your next number:
I would then enter something like 5 (followed by the return key).
The next prompt would be:
5
ease enter your next number:
Any assistance would be much appreciated.
I think all 64-bit code on the Mac is required to be RIP-relative; absolute addresses are not supported. With this kind of addressing, you address your symbol relative to rip.
NASM documentation says:
default abs
mov eax,[foo] ; 32-bit absolute disp, sign-extended
mov eax,[a32 foo] ; 32-bit absolute disp, zero-extended
mov eax,[qword foo] ; 64-bit absolute disp
default rel
mov eax,[foo] ; 32-bit relative disp
mov eax,[a32 foo] ; d:o, address truncated to 32 bits(!)
mov eax,[qword foo] ; error
mov eax,[abs qword foo] ; 64-bit absolute disp
and you can also see this question.

NASM mov from register to memory

I know there are lots of references out there talking about NASM and mov but either I'm missing something fundamental or people need to write better help guides!
SECTION .data
fmtStart: db "Enter two numbers in format '# #'", 10, 0
fmtTest: db "sum: %d", 10, 0
input: db "%d %d", 0
SECTION .bss ; BSS, uninitialized variables
int1: resd 1
int2: resd 1
sum: resd 1
SECTION .text ; Code section.
global main ; the standard gcc entry point
main: ; the program label for the entry point
push ebp ; set up stack frame
mov ebp,esp
;; Get the data
push dword fmtStart
call printf
add esp, 4
push dword int2
push dword int1
push dword input
call scanf
add esp, 12
;; Do calculations
;; Add
xor eax, eax
mov eax, [int1]
add eax, [int2]
mov [sum], eax
push dword sum
push dword fmtTest
call printf
add esp, 24
mov esp, ebp ; take down stack frame
pop ebp ; same as "leave" op
mov eax,0 ; normal, no error, return value
ret ; return
I get:
Enter two numbers in format '# #'
2 3
sum: 4247592
which isn't what I get when I add 2 and 3 with my calculator, maybe that's just me though.
my understanding of the code is as follows: the data section declares variables that are initialized to stuff, in this case my formatted strings; the bss section is for uninitialized variables, in this case my input vars and the sum var; the text section is where the code goes; I declare main as the entry point for gcc; I prompt the user for two numbers; I zero out eax with the xor; move the value of int1 to eax; add the value of int2 to eax; move what's in eax to be the value of sum; push it onto the stack with the formatted string; call printf to display stuff; end the program.
--EDIT--
To be clear, either add isn't working or mov isn't working. It seems like add should be working so I'm assuming it's mov. I don't understand what about mov [var], register would be wrong but obviously something isn't right!
Here's the problem:
push dword sum
push dword fmtTest
call printf
printf, unlike scanf, takes its arguments (after the format) by value, while in your code sum is the address of the memory location. Just do:
push dword [sum]
push fmtTest
call printf
(Incidentally, the xor eax,eax before the mov eax,[int1] is useless, since you immediately overwrite the content of the register.)
