x64 asm ret lands in no mans land - winapi

I assumed I had pushed something without popping it, or vice versa, but I can't find anything wrong! I write to the console through a call into a DLL that links properly, and inexplicably I end up in no man's land (address 0x0000000000000000).
I've put some sleeps in, and I'm sure that the API call WriteConsoleA is returning. The crash happens on the last ret, in the print function.
Any ideas?
.exe:
extern FreeConsole
extern Sleep
extern ExitProcess
extern print
extern newconsole
extern strlen
section .data BITS 64
title: db 'Consolas!',0
message: db 'Hello, world',0,0
section .text bits 64
global Start
Start:
mov rcx, title
call newconsole
mov rcx, 1000
call Sleep
mov rcx, message
call print
mov rcx, 10000
call Sleep
call FreeConsole
xor rcx, rcx
call ExitProcess
.dll:
extern AllocConsole
extern SetConsoleTitleA
extern GetStdHandle
extern WriteConsoleA
extern Sleep
export newconsole
export strlen
export print
section .data BITS 64
console.writehandle: dq 0
console.readhandle: dq 0
console.write.result: dq 0
section .text BITS 64
global strlen
strlen:
push rax
push rdx
push rdi
mov rdi, rcx
xor rax, rax
mov rcx, dword -1
cld
repnz scasb
neg rcx
sub rcx, 2
pop rdi
pop rdx
pop rax
ret
global print
print:
mov rbp, rsp
push rcx
call strlen
mov r8, rcx
pop rdx
mov rcx, [console.writehandle]
mov r9, console.write.result
push qword 0
call WriteConsoleA
ret
global newconsole
newconsole:
push rax
push rcx
call AllocConsole
pop rcx
call SetConsoleTitleA
mov rcx, -11
call GetStdHandle
mov [console.writehandle], rax
pop rax
ret

I assume you're talking about this function:
global print
print:
mov rbp, rsp
push rcx
call strlen
mov r8, rcx
pop rdx
mov rcx, [console.writehandle]
mov r9, console.write.result
push qword 0
call WriteConsoleA
ret
The x64 ABI requires that stack space is reserved even for parameters passed in registers. WriteConsoleA is free to use those stack locations for whatever it wants - so you need to make sure that you've adjusted the stack appropriately. As it stands, you're pushing only the last reserved pointer parameter. I think something like the following will do the trick for you:
push qword 0
sub rsp, 4 * 8 ; reserve stack for register parameters
call WriteConsoleA
mov rsp, rbp ; restore rsp
ret
See http://msdn.microsoft.com/en-us/library/ms235286.aspx (emphasis added):
The x64 Application Binary Interface (ABI) is a 4 register fast-call calling convention, with stack-backing for those registers.
...
The caller is responsible for allocating space for parameters to the callee, and must always allocate sufficient space for the 4 register parameters, even if the callee doesn’t have that many parameters.

According to the calling convention, the caller has to clean up the arguments it put on the stack. In this case that applies to the 5th argument to WriteConsoleA. Since you have a copy of the original rsp in rbp, you can reload rsp from rbp, or just add 8 after the call.
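Putting both answers together, a minimal sketch of a single WriteConsoleA call with shadow space and the fifth argument in place could look like the following. It is a simplification, not the asker's program: it assumes a console-subsystem executable that already has a console (so no AllocConsole), and the names message_len and written are made up for the illustration.

```nasm
; Sketch only. Assemble with: nasm -f win64 hello.asm -o hello.obj
; then link against kernel32 with Start as the entry point.
bits 64
default rel

extern GetStdHandle
extern WriteConsoleA
extern ExitProcess

section .data
message:      db 'Hello, world', 13, 10
message_len   equ $ - message     ; hypothetical name: length of the text in bytes

section .bss
written:      resd 1              ; hypothetical name: receives the character count

section .text
global Start
Start:
    sub   rsp, 40                 ; 32 bytes of shadow space + 8 bytes for argument 5;
                                  ; rsp is 8 mod 16 on entry, so it is now 16-byte aligned
    mov   ecx, -11                ; STD_OUTPUT_HANDLE
    call  GetStdHandle

    mov   rcx, rax                ; hConsoleOutput
    lea   rdx, [message]          ; lpBuffer
    mov   r8d, message_len        ; nNumberOfCharsToWrite
    lea   r9, [written]           ; lpNumberOfCharsWritten
    mov   qword [rsp+32], 0       ; lpReserved, stored directly above the shadow space
    call  WriteConsoleA

    xor   ecx, ecx                ; exit code 0
    call  ExitProcess             ; does not return, so rsp is never restored here
```

Reserving the whole frame once with sub rsp and storing the stack argument with mov avoids the push/pop bookkeeping that caused the original crash. A function that does return would end with add rsp, 40 before its ret.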

Related

Repeated call of WriteConsole (NASM x64 on Win64)

I started to learn Assembly lately, and for practice I thought of making a small game.
To make the border graphic of the game I need to print a block character n times.
To test this, I wrote the following code:
bits 64
global main
extern ExitProcess
extern GetStdHandle
extern WriteConsoleA
section .text
main:
mov rcx, -11
call GetStdHandle
mov rbx, rax
drawFrame:
mov r12, [sze]
l:
mov rcx, rbx
mov rdx, msg
mov r8, 1
sub rsp, 48
mov r9, [rsp+40]
mov qword [rsp+32], 0
call WriteConsoleA
dec r12
jnz l
xor rcx, rcx
call ExitProcess
section .data
score dd 0
sze dq 20
msg db 0xdb
I wanted to do this with the WinAPI function for output.
Interestingly, this code stops after printing one char when using WriteConsoleA, but when I use C's putchar, it works correctly. I could also manage to make a C equivalent with the WriteConsoleA function, which also works fine. The disassembly of the C code didn't bring me further.
I suspect there's something wrong in my use of the stack that I don't see. Hopefully someone can explain or point out.
You don't want to keep subtracting 48 from RSP through each loop. You only need to allocate that space once before the loop and before you call a C library function or the WinAPI.
The primary problem is with your 4th parameter in R9. The WriteConsole function is defined as:
BOOL WINAPI WriteConsole(
_In_ HANDLE hConsoleOutput,
_In_ const VOID *lpBuffer,
_In_ DWORD nNumberOfCharsToWrite,
_Out_opt_ LPDWORD lpNumberOfCharsWritten,
_Reserved_ LPVOID lpReserved
);
R9 is supposed to be a pointer to a memory location that returns a DWORD with the number of characters written, but you do:
mov r9, [rsp+40]
This moves the 8 bytes starting at memory address RSP+40 to R9. What you want is the address of [rsp+40] which can be done using the LEA instruction:
lea r9, [rsp+40]
Your code could have looked like:
bits 64
global main
extern ExitProcess
extern GetStdHandle
extern WriteConsoleA
section .text
main:
sub rsp, 56 ; Allocate space for local variable(s)
; Allocate 32 bytes of space for shadow store
; Maintain 16 byte stack alignment for WinAPI/C library calls
; 56+8=64 . 64 is evenly divisible by 16.
mov rcx, -11
call GetStdHandle
mov rbx, rax
drawFrame:
mov r12, [sze]
l:
mov rcx, rbx
mov rdx, msg
mov r8, 1
lea r9, [rsp+40]
mov qword [rsp+32], 0
call WriteConsoleA
dec r12
jnz l
xor rcx, rcx
call ExitProcess
section .data
score dd 0
sze dq 20
msg db 0xdb
Important Note: In order to be compliant with the 64-bit Microsoft ABI you must maintain the 16 byte alignment of the stack pointer prior to calling a WinAPI or C library function. Upon calling the main function the stack pointer (RSP) was 16 byte aligned. At the point the main function starts executing the stack is misaligned by 8 because the 8 byte return address was pushed on the stack. 48+8=56 doesn't get you back on a 16 byte aligned stack address (56 is not evenly divisible by 16) but 56+8=64 does. 64 is evenly divisible by 16.

Why doesn't this assembly code print the top of the stack?

After successfully making a "Hello, World!" program in x86-64, I wanted to make a program that can peek at the top of the stack (without popping it, and using the esp register so I can learn how it works). This is the program in NASM:
extern GetStdHandle, WriteConsoleA, ExitProcess
section .bss
dummy resd 1
section .text
%macro print 3
mov rcx, %1
mov rdx, %2
mov r8, %3
mov r9, dummy
push NULL
call WriteConsoleA
%endmacro
_start:
mov rcx, STD_OUTPUT_HANDLE
call GetStdHandle
push 65
print rax, [x], 1
mov rcx, 0
call ExitProcess
NULL equ 0
STD_OUTPUT_HANDLE equ -11
At the print rax, [x], 1 line, x is replaced by something. I tried a variety of things, like rsp, esp, rsi, esi, rsp+1, rsp+4, etc. None of them worked. They either don't compile or don't print anything.
What is the correct way to do it? (note: this is solely for experimental purposes. I know I could use push/pop in this case, but I want to learn how to do it this way.)
mov rdx, [rsp] will load 65 into rdx. But WriteConsole expects the address of the string to print. So you want mov rdx, rsp.
One other thing that should be fixed: the stack should be aligned to 16 bytes before the call and there should be 32 bytes of empty space at the top of the stack. After the push, put sub rsp, 40. Then use rsp+40 as the address to print.
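A sketch of how that advice might be applied is shown below. It uses a slightly different layout from the one described above: 8 bytes of padding are reserved before the push so that rsp is 16-byte aligned at each call, and the fifth argument is stored with mov instead of push. The label written is an assumption made for this example.

```nasm
; Sketch only: prints the low byte of the value pushed on the stack ('A' = 65).
bits 64
default rel

extern GetStdHandle, WriteConsoleA, ExitProcess

STD_OUTPUT_HANDLE equ -11

section .bss
written resd 1                    ; hypothetical name: receives the character count

section .text
global _start
_start:
    sub   rsp, 8                  ; padding: entry rsp is 8 mod 16
    push  65                      ; the value to peek at (8 bytes, low byte is 'A')
    sub   rsp, 40                 ; 32 bytes of shadow space + 8 bytes for argument 5
                                  ; rsp is now 16-byte aligned; the 65 sits at [rsp+40]
    mov   ecx, STD_OUTPUT_HANDLE
    call  GetStdHandle

    mov   rcx, rax                ; hConsoleOutput
    lea   rdx, [rsp+40]           ; lpBuffer: the ADDRESS of the pushed value
    mov   r8d, 1                  ; write one character
    lea   r9, [written]           ; lpNumberOfCharsWritten
    mov   qword [rsp+32], 0       ; lpReserved
    call  WriteConsoleA

    xor   ecx, ecx
    call  ExitProcess
```

The key difference from the attempts in the question is lea rdx, [rsp+40]: it passes a pointer to the stack slot, whereas mov rdx, [rsp+40] would pass the value 65 itself as if it were an address.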

NASM ReadConsoleA or WriteConsoleA Buffer Debugging Issue

I am writing a NASM Assembly program on Windows to get the user to enter in two single digit numbers, add these together and then output the result. I am trying to use the Windows API for input and output.
Unfortunately, whilst I can get it to read in one number as soon as the program loops round to get the second the program ends rather than asking for the second value.
The output of the program shown below:
What is interesting is that if I input 1 then the value displayed is one larger so it is adding to something!
This holds for other single digits (2-9) entered as well.
I am pretty sure it is related to how I am using the ReadConsoleA function but I have hit a bit of a wall attempting to find a solution. I have installed gdb to debug the program and assembled it as follows:
nasm -f win64 -g -o task9.obj task9.asm
GoLink /console /entry _main task9.obj kernel32.dll
gdb task9
But I just get the following error:
"C:\Users\Administrator\Desktop/task9.exe": not in executable format: File format not recognized
I have since read that NASM doesn't output the debug information needed for the Win64 format but I am not 100% sure about that. I am fairly sure I have the 64-bit version of GDB installed:
My program is as follows:
extern ExitProcess ;windows API function to exit process
extern WriteConsoleA ;windows API function to write to the console window (ANSI version)
extern ReadConsoleA ;windows API function to read from the console window (ANSI version)
extern GetStdHandle ;windows API to get the for the console handle for input/output
section .data ;the .data section is where variables and constants are defined
STD_OUTPUT_HANDLE equ -11
STD_INPUT_HANDLE equ -10
digits db '0123456789' ;list of digits
input_message db 'Please enter your next number: '
length equ $-input_message
section .bss ;the .bss section is where space is reserved for additional variables
input_buffer: resb 2 ;reserve 64 bits for user input
char_written: resb 4
chars: resb 1 ;reversed for use with write operation
section .text ;the .text section is where the program code goes
global _main ;tells the machine which label to start program execution from
_num_to_str:
cmp rax, 0 ;compare value in rax to 0
jne .convert ;if not equal then jump to label
jmp .output
.convert:
;get next digit value
inc r15 ;increment the counter for next digit
mov rcx, 10
xor rdx, rdx ;clear previous remainder result
div rcx ;divide value in rax by value in rcx
;quotient (result) stored in rax
;remainder stored in rdx
push rdx ;store remainder on the stack
jmp _num_to_str
.output:
pop rdx ;get the last digit from the stack
;convert digit value to ascii character
mov r10, digits ;load the address of the digits into rsi
add r10, rdx ;get the character of the digits string to display
mov rdx, r10 ;digit to print
mov r8, 1 ;one byte to be output
call _print
;decide whether to loop
dec r15 ;reduce remaining digits (having printed one)
cmp r15, 0 ;are there digits left to print?
jne .output ;if not equal then jump to label output
ret
_print:
;get the output handle
mov rcx, STD_OUTPUT_HANDLE ;specifies that the output handle is required
call GetStdHandle ;returns value for handle to rax
mov rcx, rax
mov r9, char_written
call WriteConsoleA
ret
_read:
;get the input handle
mov rcx, STD_INPUT_HANDLE ;specifies that the input handle is required
call GetStdHandle
;get value from keyboard
mov rcx, rax ;place the handle for operation
mov rdx, input_buffer ;set name to receive input from keyboard
mov r8, 2 ;max number of characters to read
mov r9, chars ;stores the number of characters actually read
call ReadConsoleA
movzx r12, byte[input_buffer]
ret
_get_value:
mov rdx, input_message ;move the input message into rdx for function call
mov r8, length ;load the length of the message for function call
call _print
xor r8, r8
xor r9, r9
call _read
.end:
ret
_main:
mov r13, 0 ;counter for values input
mov r14, 0 ;total for calculation
.loop:
xor r12, r12
call _get_value ;get value from user
sub r12, '0' ;convert char to integer
add r14, r12 ;add value to total
;decide whether to loop for another character or not
inc r13
cmp r13, 2
jne .loop
;convert total to ASCII value
mov rax, r14 ;num_to_str expects total in rax
mov r15, 0 ;num_to_str uses r15 as a counter - must be initialised
call _num_to_str
;exit the program
mov rcx, 0 ;exit code
call ExitProcess
I would really appreciate any assistance you can offer either with resolving the issue or how to resolve the issue with gdb.
I found the following issues with your code:
The Microsoft x64 calling convention requires rsp to be 16-byte aligned at every call.
You must reserve space for the arguments on the stack, even if you pass them in registers.
Your chars variable needs 4 bytes not 1.
ReadConsole expects 5 arguments.
You should read 3 bytes because ReadConsole returns CR LF. Or you could just ignore leading whitespace.
Your _num_to_str is broken if the input is 0.
Based on Jester's suggestions this is the final program:
extern ExitProcess ;windows API function to exit process
extern WriteConsoleA ;windows API function to write to the console window (ANSI version)
extern ReadConsoleA ;windows API function to read from the console window (ANSI version)
extern GetStdHandle ;windows API function to get the console handle for input/output
section .data ;the .data section is where variables and constants are defined
STD_OUTPUT_HANDLE equ -11
STD_INPUT_HANDLE equ -10
digits db '0123456789' ;list of digits
input_message db 'Please enter your next number: '
length equ $-input_message
NULL equ 0
section .bss ;the .bss section is where space is reserved for additional variables
input_buffer: resb 3 ;reserve 3 bytes for user input (one digit plus CR LF)
char_written: resb 4
chars: resb 4 ;reserved for the number of characters actually read
section .text ;the .text section is where the program code goes
global _main ;tells the machine which label to start program execution from
_num_to_str:
sub rsp, 32
cmp rax, 0
jne .next_digit
sub rsp, 8 ;same 8 bytes of padding as in .convert, so .output unwinds correctly
push rax
inc r15
jmp .output
.next_digit:
cmp rax, 0 ;compare value in rax to 0
jne .convert ;if not equal then jump to label
jmp .output
.convert:
;get next digit value
inc r15 ;increment the counter for next digit
mov rcx, 10
xor rdx, rdx ;clear previous remainder result
div rcx ;divide value in rax by value in rcx
;quotient (result) stored in rax
;remainder stored in rdx
sub rsp, 8 ;add space on stack for value
push rdx ;store remainder on the stack
jmp .next_digit
.output:
pop rdx ;get the last digit from the stack
add rsp, 8 ;remove space from stack for popped value
;convert digit value to ascii character
mov r10, digits ;load the address of the digits into r10
add r10, rdx ;get the character of the digits string to display
mov rdx, r10 ;digit to print
mov r8, 1 ;one byte to be output
call _print
;decide whether to loop
dec r15 ;reduce remaining digits (having printed one)
cmp r15, 0 ;are there digits left to print?
jne .output ;if not equal then jump to label output
add rsp, 32
ret
_print:
sub rsp, 40
;get the output handle
mov rcx, STD_OUTPUT_HANDLE ;specifies that the output handle is required
call GetStdHandle ;returns value for handle to rax
mov rcx, rax
mov r9, char_written
mov rax, qword 0 ;fifth argument
mov qword [rsp+0x20], rax
call WriteConsoleA
add rsp, 40
ret
_read:
sub rsp, 40
;get the input handle
mov rcx, STD_INPUT_HANDLE ;specifies that the input handle is required
call GetStdHandle
;get value from keyboard
mov rcx, rax ;place the handle for operation
xor rdx, rdx
mov rdx, input_buffer ;set name to receive input from keyboard
mov r8, 3 ;max number of characters to read
mov r9, chars ;stores the number of characters actually read
mov rax, qword 0 ;fifth argument
mov qword [rsp+0x20], rax
call ReadConsoleA
movzx r12, byte[input_buffer]
add rsp, 40
ret
_get_value:
sub rsp, 40
mov rdx, input_message ;move the input message into rdx for function call
mov r8, length ;load the length of the message for function call
call _print
call _read
.end:
add rsp, 40
ret
_main:
sub rsp, 40
mov r13, 0 ;counter for values input
mov r14, 0 ;total for calculation
.loop:
call _get_value ;get value from user
sub r12, '0' ;convert char to integer
add r14, r12 ;add value to total
;decide whether to loop for another character or not
inc r13
cmp r13, 2
jne .loop
;convert total to ASCII value
mov rax, r14 ;num_to_str expects total in rax
mov r15, 0 ;num_to_str uses r15 as a counter - must be initialised
call _num_to_str
;exit the program
mov rcx, 0 ;exit code
call ExitProcess
add rsp, 40
ret
As it turned out I was actually missing a 5th argument in the WriteConsole function as well.

Why is a child window (button) not created, or not visible, after the WM_CREATE message?

extern GetModuleHandleA
extern LoadCursorA
extern RegisterClassA
extern CreateWindowExA
extern GetMessageA
extern DispatchMessageA
extern TranslateMessage
extern ExitProcess
extern PostQuitMessage
extern DefWindowProcA
section .data
MSG dq 0 ;+0 hWnd
dd 0 ;+8 message
dd 0 ;padding for next
wParam dq 0 ;+10 wParam
dq 0 ;+18 lParam
dd 0 ;+20 time
dd 0 ;+24 1st part of point structure
dd 0 ;+28 2nd part of point structure
dd 0 ;padding to bring total size to 48 bytes
WNDCLASS dd 1h+2h+40h ;+0 window class style (CS_VREDRAW+CS_HREDRAW+CS_CLASSDC)
dd 0 ;padding for next
dq WndProcTable ;+8 pointer to Window Procedure
dd 0 ;+10 no. of extra bytes to allocate after structure
dd 0 ;+14 no. of extra bytes to allocate after window instance
hInst dq 0 ;+18 handle to instance containing window procedure
dq 0 ;+20 handle to the class icon
hCursor dq 0 ;+28 handle to the class cursor
dq 6 ;+30 identifies the class background brush (6=COLOR_WINDOW+1)
dq 0 ;+38 pointer to resource name for class menu
dq win_class_name ;+40 pointer to string for window class name
win_class_name db 'simplewindow',0 ;string holding name of window class
win_id dq 0
but_id dq 0
kopf db '64 bit program', 0
class_button db 'button', 0
button_kopf db 'hjh', 0
mbt db 'this is only a test', 0
mbc db 'achtung', 0
section .text
global start
start:
sub rsp, 0x8
xor rcx, rcx
call GetModuleHandleA
mov [hInst], rax
mov rcx, 0
mov rdx, 32512
call LoadCursorA
mov [hCursor], rax
mov rcx, WNDCLASS
sub rsp, 0x20
call RegisterClassA
add rsp, 0x20
;creating main window
mov rcx, 0
mov rdx, win_class_name
mov r8, kopf
mov r9, 0x10000000+0x00080000+0x00020000
push 0
push qword[hInst]
push 0
push 0
push 512
push 512
push 256
push 256
sub rsp, 0x20
call CreateWindowExA
add rsp, 0x20
add rsp, 0x40
mov [win_id], rax
zyklus:
mov rcx, MSG
mov rdx, 0
mov r8, 0
mov r9, 0
sub rsp, 20h
call GetMessageA
add rsp, 20h
or rax, rax
jz fertig
mov rcx, MSG
call TranslateMessage
mov rcx, MSG
call DispatchMessageA
jmp zyklus
fertig:
mov rcx, [wParam]
call ExitProcess
WndProcTable:
sub rsp, 0x8
cmp edx, 0x01 ; see if it is wm_create message
jne quit
; creating button
mov rcx, 0
mov rdx, class_button
mov r8, button_kopf
mov r9, 0x40000000+0x10000000 ; child +visible
push 0
push qword[hInst]
push 0
push qword[win_id]
push 20
push 50
push 30
push 30
sub rsp, 0x20
call CreateWindowExA
add rsp, 0x20
add rsp, 0x40
mov [but_id], rax
jmp alles
quit:
cmp edx, 0x02
jne weiter
xor rcx, rcx
call PostQuitMessage
weiter:
sub rsp,20h
call DefWindowProcA
add rsp,20h
alles:
add rsp, 0x8
ret
However, if you place the button-creation code after the main-window creation code, everything works fine; it fails only while processing the WM_CREATE message.
nasm -f win64 first.nasm -o first.obj
golink first.obj user32.dll kernel32.dll gdi32.dll
The button doesn't appear, or may not be created at all.
What's wrong? Can anybody spot a mistake in this code?
Now it is solved. The main window handle stored in win_id is not yet valid while WM_CREATE is being processed, because the outer CreateWindowExA call has not returned at that point. The valid handle is in the rcx register (it is passed to WndProcTable as the first parameter), so the right line is push rcx instead of push qword[win_id]. Note that rcx has to be saved before it is reloaded with the first argument of CreateWindowExA.
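As an illustration of that fix, a reworked window procedure might look like the sketch below. It is meant as a drop-in for WndProcTable in the question's program and relies on the labels class_button, button_kopf, hInst and but_id defined there. Storing the stack arguments with mov into one pre-reserved frame is a choice made for this sketch, not something the original post does.

```nasm
; Sketch only: relies on the data labels from the question's program.
extern CreateWindowExA
extern DefWindowProcA
extern PostQuitMessage

WndProcTable:
    sub   rsp, 104                ; 32 shadow + 8 stack arguments (64) + 8 alignment
    cmp   edx, 0x0001             ; WM_CREATE?
    jne   .not_create
    mov   [rsp+64], rcx           ; hWndParent = hWnd passed in rcx (saved before reuse)
    xor   ecx, ecx                ; dwExStyle
    mov   rdx, class_button       ; lpClassName
    mov   r8, button_kopf         ; lpWindowName
    mov   r9d, 0x50000000         ; WS_CHILD + WS_VISIBLE
    mov   qword [rsp+32], 30      ; X
    mov   qword [rsp+40], 30      ; Y
    mov   qword [rsp+48], 50      ; nWidth
    mov   qword [rsp+56], 20      ; nHeight
    mov   qword [rsp+72], 0       ; hMenu
    mov   rax, [rel hInst]
    mov   [rsp+80], rax           ; hInstance
    mov   qword [rsp+88], 0       ; lpParam
    call  CreateWindowExA
    mov   [rel but_id], rax
    xor   eax, eax                ; returning 0 from WM_CREATE lets creation continue
    jmp   .done
.not_create:
    cmp   edx, 0x0002             ; WM_DESTROY?
    jne   .default
    xor   ecx, ecx                ; exit code 0
    call  PostQuitMessage
    xor   eax, eax
    jmp   .done
.default:
    call  DefWindowProcA          ; rcx, rdx, r8, r9 still hold the original arguments
.done:
    add   rsp, 104
    ret
```

Writing hWnd into its stack slot first means rcx can be reused for dwExStyle without losing the parent handle.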

How to set a value in the Windows registry in assembly language?

Ok. So I have this program that attempts to create a value in the Windows registry. Unfortunately, nothing happens. I have been trying to figure out if any of the parameters are wrong. Here is the code:
includelib \Masm64\Lib\Kernel32.lib
includelib \Masm64\Lib\Advapi32.lib
extern RegOpenKeyExA : proc
extern RegSetValueExA : proc
extern ExitProcess : proc
dseg segment para 'DATA'
vlnm db 'Startup', 0
sbky db 'Software\Microsoft\Windows\CurrentVersion\Run', 0
phkr dd 0
path db 'C:\Users\School\AppData\Roaming\Startups.exe', 0
dseg ends
cseg segment para 'CODE'
start proc
lea rdx, [phkr]
push rdx
sub rsp, 28h
mov r9d, 2
xor r8d, r8d
lea rdx, [sbky]
mov ecx, 80000001h
call RegOpenKeyExA
add rsp, 28h
push 45
lea rbx, [path]
push rbx
sub rsp, 28h
mov r9d, 1
xor r8d, r8d
lea rdx, [vlnm]
mov ecx, phkr
call RegSetValueExA
call ExitProcess
start endp
cseg ends
end
Any suggestions?
Allow me to answer my own question. The problem does not truly concern incorrect parameters, but a mistake that I made allocating stack space. Whereas I was expected to allocate 20h of stack space for rcx, rdx, r8, and r9, and align the return address on a 16-byte boundary, I had mistakenly created a template as follows:
*empty* (rsp-8)
param2 (rsp-16)
param1 (rsp-24)
*empty* (rsp-32... causes incorrect parameters and convention!)
space for r9 (rsp-40)
space for r8 (rsp-48)
space for rdx (rsp-56)
space for rcx (rsp-64)
return address (rsp-72... not on a 16-byte boundary!)
The correct template would be
*empty* (rsp-8)
param2 (rsp-16)
param1 (rsp-24)
space for r9 (rsp-32)
space for r8 (rsp-40)
space for rdx (rsp-48)
space for rcx (rsp-56)
return address (rsp-64)
I had unintentionally allocated an extra 8 bytes between the stack parameters and register parameters, before the RegSetValueEx call, thus supplying an incorrect parameter. Here is the correct code:
includelib \Masm64\Lib\Kernel32.lib
includelib \Masm64\Lib\Advapi32.lib
extern RegOpenKeyExA : proc
extern RegSetValueExA : proc
extern ExitProcess : proc
dseg segment para 'DATA'
vlnm db 'Startup', 0
sbky db 'Software\Microsoft\Windows\CurrentVersion\Run', 0
phkr dq 0 ; HKEY is pointer-sized: 8 bytes in a 64-bit process
path db 'C:\Users\Games\AppData\Roaming\Startups.exe', 0
dseg ends
cseg segment para 'CODE'
start proc
lea rdx, [phkr]
push rdx
sub rsp, 20h
mov r9d, 2
xor r8d, r8d
lea rdx, [sbky]
mov ecx, 80000001h
call RegOpenKeyExA
add rsp, 20h
push 44
lea rbx, [path]
push rbx
sub rsp, 20h
mov r9d, 1
xor r8, r8
lea rdx, [vlnm]
mov rcx, phkr ; load the full 64-bit key handle
call RegSetValueExA
call RegSetValueExA
fini: call ExitProcess
start endp
cseg ends
end
Cheers!
You're only allocating 2 bytes for your key (phkr dw 0). It seems to me like it should be at least 4 bytes; since HKEY is a pointer-sized handle, a 64-bit process actually needs 8 (dq).
Apart from that, I suggest that you add some error checks. Both RegOpenKeyEx and RegSetValueEx return non-zero error codes if they fail.

Resources