ESI and EDI change values after function call - windows

I'm trying to convert some strings representing binary numbers into their actual values, using a conversion function defined in a different file.
Here's my code:
main.asm
bits 32
global start
%include 'convert.asm'
extern exit, scanf, printf
import exit msvcrt.dll
import scanf msvcrt.dll
import printf msvcrt.dll
section data use32 class=data
s DB '10100111b', '01100011b', '110b', '101011b'
len EQU $ - s
res times len DB 0
segment code use32 class=code
start:
mov ESI, s ; move source string
mov EDI, res ; move destination string
mov ECX, len ; length of the string
mov EBX, 0
repeat:
lodsb ; load current byte into AL
inc BL
cmp AL, 'b' ; check if its equal to the character b
jne end ; if its not, we need to keep parsing
push dword ESI ; push the position of the current character in the source string to the stack
push dword EDI ; push the position of the current character in the destination string to the stack
push dword EBX ; push the current length to the stack
call func1 ; call the function
end:
loop repeat
push dword 0
call [exit]
convert.asm
func1:
mov ECX, [ESP] ; first parameter is the current parsed length
mov EDI, [ESP + 4] ; then EDI
mov ESI, [ESP + 8] ; and ESI
sub ESI, ECX
parse:
mov EDX, [ESI]
sub EDX, '0'
mov [EDI], EDX
shl dword [EDI], 1
inc ESI
loop parse
ret 4 * 3
I noticed that I keep getting access violation errors after the function call though. ESI has some random value after the call. Am I doing something wrong? I think the parameter pushing part should be alright. Inside the conversion function, the parameters should be accessed in the reverse order. But that's not happening for some reason.
I'm also pretty sure that I did the compiling/linking part alright using nasm and alink.
nasm -fobj main.asm
nasm -fobj convert.asm
alink main.obj convert.obj -oPE -subsys console -entry start

Related

I'm unsure what the problem with my assembly code it works until eax is popped and replaced by a register

; Input x and y, output min of the two numbers
.586
.MODEL FLAT
INCLUDE io.h
.STACK 4096
.DATA
number DWORD ?
array DWORD 20, 15, 62, 40, 18
nbrElts DWORD 5
prompt BYTE "Enter value:", 0
string BYTE 80 DUP (?)
resultLbl BYTE "Position", 0
result BYTE 11 DUP (?), 0
.CODE
_MainProc PROC
input prompt, string, 20 ; read ASCII characters
atod string ; convert to integer
mov number, eax ; store in memory
push nbrElts ; 3rd parameter (# of elements in array)
lea eax, array ; 2nd parameter (address of array)
push eax
push number ; 1st parameter (value from user)
call searchArray ; searchArray(number, array, 5)
add esp, 12
dtoa result, eax ; convert to ASCII characters
output resultLbl, result ; output label and result
mov eax, 0 ; exit with return code 0
ret
_MainProc ENDP
; searchArray(int x, array, int y)
;
searchArray PROC
push ebp ; save base pointer
mov ebp, esp ; establish stack frame
push eax ; save registers
push ebx
push esi
push ecx
push edx
mov ebx, [ebp+8] ; x, value from user
mov esi, [ebp+12] ; address of array
mov ecx, [ebp+16] ; y, number of elements
mov edx, 1
mov ecx, 5
forLoop:
mov eax, [esi] ; a[i]
cmp eax, ebx ; eax = ebx ?
je isEqual
;cmp eax, ebx
add esi, 4
inc edx
loop forLoop
;mov eax, 0
cmp edx, 6
je notEqual
isEqual:
mov eax, edx
jmp exitCode
notEqual:
mov eax, 0
jmp exitCode
exitCode:
mov eax, edx
pop edx ; restore EBP
pop ecx ; restore EAX
pop esi
pop ebx
pop ebp
ret ; return
searchArray ENDP
END ; end of source code
The pops at the end of the function need to match the pushes at the beginning of the function. If they don't match, the stack pointer ends up in the wrong place and the ret returns to the wrong place.
In your case, you have an extra push without a corresponding pop.
The reason to push registers at the beginning and pop them at the end is to preserve their values. But you don't want to preserve the value of eax. You want to return a different value, the result of the function. So there is absolutely no reason to push eax.

Passing string parameter to a PROC

I want to call a function that will perform upper to lower case conversion to a user typed string, preserving the especial characters. This part works, but only for the first 4 characters, everything after that just gets truncated. I believe it is because I have defined the parameters as DWORD:
I have tried using PAGE, PARA and BYTE. The first two don't work and with byte says type missmatch.
upperToLower proc, source:dword, auxtarget:dword
mov eax, source ;Point to string
mov ebx, auxtarget ; point to destination
L1:
mov dl, [eax] ; Get a character from buffer
cmp byte ptr [eax], 0 ; End of string? (not counters)
je printString ; if true, jump to printString
cmp dl, 65 ; 65 == 'A'
jl notUpper ; if less, it's not uppercase
cmp dl, 90 ; 90 == 'Z'
jg notUpper ; if greater, it's not uppercase
xor dl, 100000b ; XOR to change upper to lower
mov [ebx], dl ; add char to target
inc eax ; Move counter up
inc ebx ; move counter up
jmp L1 ; loop
notUpper: ; not uppercase
mov [ebx], dl ; copy the letter
inc eax ;next letter
inc ebx
jmp L1
printString:
invoke WriteConsoleA, consoleOutHandle, auxtarget, sizeof auxtarget, bytesWritten, 0
ret
upperToLower endp
The PROTO:
upperToLower PROTO,
source: dword,
auxtarget: dword
Invoke:
invoke upperToLower, offset buffer, offset target
The buffer parameter is: buffer db 128 DUP(?)
How can I get printed the whole string, and not just the first 4 characters?
Why are only 4 characters being printed? You write the string to the console with:
invoke WriteConsoleA, consoleOutHandle, auxtarget, sizeof auxtarget, bytesWritten, 0
The sizeof auxtarget parameter is the size of auxtarget which is a DWORD (4 bytes) thus you are asking to only print 4 bytes. You need to pass the length of the string. You can easily do so by taking the ending address in EAX and subtracting the source pointer from it. The result would be the length of the string you traversed.
Modify the code to be:
printString:
sub eax, source
invoke WriteConsoleA, consoleOutHandle, auxtarget, eax, bytesWritten, 0
A version of your code that follows the C call convention, uses both a source and destination buffer, tests for the pointers to make sure they aren't NULL, does the conversion using a similar method described by Peter Cordes is as follows:
upperToLower proc uses edi esi, source:dword, dest:dword
; uses ESI EDI is used to tell assembler we are clobbering two of
; the cdecl calling convetions non-volatile registers. See:
; https://en.wikipedia.org/wiki/X86_calling_conventions#cdecl
mov esi, source ; ESI = Pointer to string
test esi, esi ; Is source a NULL pointer?
jz done ; If it is then we are done
mov edi, dest ; EDI = Pointer to string
test edi, edi ; Is dest a NULL pointer?
jz done ; If it is then we are done
xor edx, edx ; EDX = 0 = current character index into the strings
jmp getnextchar ; Jump into loop at point of getting next character
charloop:
lea ecx, [eax - 'A'] ; cl = al-'A', and we do not care about the rest
; of the register
cmp cl, 25 ; if(c >= 'A' && c <= 'Z') c += 0x20;
lea ecx, [eax + 20h] ; without affecting flags
cmovna eax, ecx ; take the +0x20 version if it was in the
; uppercase range to start with
mov [edi + edx], al ; Update character in destination string
inc edx ; Go to next character
getnextchar:
movzx eax, byte ptr [esi + edx]
; mov al, [esi + edx] leaving high garbage in EAX is ok
; too, but this avoids a partial-register stall
; when doing the mov+sub
; in one instruction with LEA
test eax, eax ; Is the character NUL(0) terminator?
jnz charloop ; If not go back and process character
printString:
; EDI = source, EDX = length of string
invoke WriteConsoleA, consoleOutHandle, edi, edx, bytesWritten, 0
mov edx, sizeof buffer
done:
ret
upperToLower endp
A version that takes one parameter and changes the source string to upper case could be done this way:
upperToLower proc, source:dword
mov edx, source ; EDX = Pointer to string
test edx, edx ; Is it a NULL pointer?
jz done ; If it is then we are done
jmp getnextchar ; Jump into loop at point of getting next character
charloop:
lea ecx, [eax - 'A'] ; cl = al-'A', and we do not care about the rest
; of the register
cmp cl, 25 ; if(c >= 'A' && c <= 'Z') c += 0x20;
lea ecx, [eax + 20h] ; without affecting flags
cmovna eax, ecx ; take the +0x20 version if it was in the
; uppercase range to start with
mov [edx], al ; Update character in string
inc edx ; Go to next character
getnextchar:
movzx eax, byte ptr [edx] ; mov al, [edx] leaving high garbage in EAX is ok, too,
; but this avoids a partial-register stall
; when doing the mov+sub in one instruction with LEA
test eax, eax ; Is the character NUL(0) terminator?
jnz charloop ; If not go back and process character
printString:
sub edx, source ; EDX-source=length
invoke WriteConsoleA, consoleOutHandle, source, edx, bytesWritten, 0
done:
ret
upperToLower endp
Observations
A generic upperToLower function that does the string conversion would normally not do the printing itself. You'd normally call upperToLower to do the conversion only, then you'd output the string to the display in a separate call.

Access violation when trying to modify a byte in assembly procedure under NASM Windows

I'm totally new to assembly and I've been coding for about 2 weeks in Linux and able to output certain information on the terminal under Linux. However, today when I began coding under Windows, I ran into a problem, i'm unable to modify a byte in a string that I pass as a parameter to a function. I debugged the program in OllyDbg and it tells me that there's an access violation when writing to [address] - use Shift F7/F8/F9 to pass exception to program. I can't pass the exception to the program since there's no exception handler in the program. Is it illegal to modify a character in a string passed to a procedure? I have posted the code below.
section .bss
var resb 6
section .text
global _start
_start:
xor eax, eax ; empty all registers
xor ebx, ebx
xor ecx, ecx
xor edx, edx
jmp short get_library
get_lib_addr: ; get_lib_addr(char *name)
mov ecx, [esp+8] ; get the first parameter
xor edx, edx ; zero out register
mov byte [ecx+6], dl ; add null terminating character to string
jmp short fin ; go to end
get_library:
xor ecx, ecx
mov ebx, var ; start address of var
jmp start_loop ; start looping
start_loop:
cmp ecx, 5
jge end_loop
mov byte [ebx+ecx], 'h'
inc ecx
jmp start_loop
end_loop:
push dword var ; at this point - var = "hhhhh"
call get_lib_addr ; call get_lib_addr("hhhhh")
fin:
xor eax, eax
mov ebx, 0x750b4b80 ; ExitProcess
push eax ; eax = 0
call ebx ; ExitProcess(0)
And a screenshot of the debugging in OllyDbg.
https://imgur.com/a/48RAXOw
2nd case
Part of the code from Vivid Machines Shellcoding for Linux & Windows is copied below. When I try to use this syntax for passing parameters (assuming that's what is happening), I get an access violation trying to edit the string and adding a null terminating character in place of the N. I'm positive that the string is passed and ecx gets the correct string. Any reason this doesn't work?
LibraryReturn:
pop ecx ;get the library string
mov [ecx + 10], dl ;insert NULL - ***Access violation here***
;the N at the end of each string signifies the location of the NULL
;character that needs to be inserted
GetLibrary:
call LibraryReturn
db 'user32.dllN'
Screenshot of the debug information in the second scenario showing that N is indeed the value being edited.
https://imgur.com/a/W5vdR7d

Assembly Programming using SASM on Windows, with an example using int 0x80 (Linux system calls)

I need help. I'm trying to run the program (NASM) below in SASM.
SYS_EXIT equ 1
SYS_READ equ 3
SYS_WRITE equ 4
STDIN equ 0
STDOUT equ 1
segment .data
msg1 db "Enter a digit ", 0xA,0xD
len1 equ $- msg1
msg2 db "Please enter a second digit", 0xA,0xD
len2 equ $- msg2
msg3 db "The sum is: "
len3 equ $- msg3
segment .bss
num1 resb 2
num2 resb 2
res resb 1
section .text
global _start ;must be declared for using gcc
_start: ;tell linker entry point
mov eax, SYS_WRITE
mov ebx, STDOUT
mov ecx, msg1
mov edx, len1
int 0x80
mov eax, SYS_READ
mov ebx, STDIN
mov ecx, num1
mov edx, 2
int 0x80
mov eax, SYS_WRITE
mov ebx, STDOUT
mov ecx, msg2
mov edx, len2
int 0x80
mov eax, SYS_READ
mov ebx, STDIN
mov ecx, num2
mov edx, 2
int 0x80
mov eax, SYS_WRITE
mov ebx, STDOUT
mov ecx, msg3
mov edx, len3
int 0x80
; moving the first number to eax register and second number to ebx
; and subtracting ascii '0' to convert it into a decimal number
mov eax, [num1]
sub eax, '0'
mov ebx, [num2]
sub ebx, '0'
; add eax and ebx
add eax, ebx
; add '0' to to convert the sum from decimal to ASCII
add eax, '0'
; storing the sum in memory location res
mov [res], eax
; print the sum
mov eax, SYS_WRITE
mov ebx, STDOUT
mov ecx, res
mov edx, 1
int 0x80
exit:
mov eax, SYS_EXIT
xor ebx, ebx
int 0x80
I had this error:
[20:53:11] Warning! Errors have occurred in the build:
c:/program files (x86)/sasm/mingw/bin/../lib/gcc/mingw32/4.6.2/../../../libmingw32.a(main.o): In function 'main':
C:\MinGW\msys\1.0\src\mingwrt/../mingw/main.c:73: undefined reference to `WinMain#16'
Also, how do I limit users input up to 4 digits only?
global _start should change to global main and Linux system calls should be replaced by Windows API function calls and declared as external. Modern versions of Windows doesn't approve use of system calls due to malware or badware risks, so deprecated (permanent) system call codes. Every modern version of Windows has different system call number codes, though you can find them on internet, you shouldn't rely on them unless you want to revise your assembly code for each version of Windows thus reducing portability and increasing workload. There are significant differences between Linux/Mac and Windows in the way of handling registers, stack and function names.

How to echo memory location use NASM [duplicate]

I am looking for a way to print an integer in assembler (the compiler I am using is NASM on Linux), however, after doing some research, I have not been able to find a truly viable solution. I was able to find a description for a basic algorithm to serve this purpose, and based on that I developed this code:
global _start
section .bss
digit: resb 16
count: resb 16
i: resb 16
section .data
section .text
_start:
mov dword[i], 108eh ; i = 4238
mov dword[count], 1
L01:
mov eax, dword[i]
cdq
mov ecx, 0Ah
div ecx
mov dword[digit], edx
add dword[digit], 30h ; add 48 to digit to make it an ASCII char
call write_digit
inc dword[count]
mov eax, dword[i]
cdq
mov ecx, 0Ah
div ecx
mov dword[i], eax
cmp dword[i], 0Ah
jg L01
add dword[i], 48 ; add 48 to i to make it an ASCII char
mov eax, 4 ; system call #4 = sys_write
mov ebx, 1 ; file descriptor 1 = stdout
mov ecx, i ; store *address* of i into ecx
mov edx, 16 ; byte size of 16
int 80h
jmp exit
exit:
mov eax, 01h ; exit()
xor ebx, ebx ; errno
int 80h
write_digit:
mov eax, 4 ; system call #4 = sys_write
mov ebx, 1 ; file descriptor 1 = stdout
mov ecx, digit ; store *address* of digit into ecx
mov edx, 16 ; byte size of 16
int 80h
ret
C# version of what I want to achieve (for clarity):
static string int2string(int i)
{
Stack<char> stack = new Stack<char>();
string s = "";
do
{
stack.Push((char)((i % 10) + 48));
i = i / 10;
} while (i > 10);
stack.Push((char)(i + 48));
foreach (char c in stack)
{
s += c;
}
return s;
}
The issue is that it outputs the characters in reverse, so for 4238, the output is 8324. At first, I thought that I could use the x86 stack to solve this problem, push the digits in, and pop them out and print them at the end, however when I tried implementing that feature, it flopped and I could no longer get an output.
As a result, I am a little bit perplexed about how I can implement a stack in to this algorithm in order to accomplish my goal, aka printing an integer. I would also be interested in a simpler/better solution if one is available (as it's one of my first assembler programs).
One approach is to use recursion. In this case you divide the number by 10 (getting a quotient and a remainder) and then call yourself with the quotient as the number to display; and then display the digit corresponding to the remainder.
An example of this would be:
;Input
; eax = number to display
section .data
const10: dd 10
section .text
printNumber:
push eax
push edx
xor edx,edx ;edx:eax = number
div dword [const10] ;eax = quotient, edx = remainder
test eax,eax ;Is quotient zero?
je .l1 ; yes, don't display it
call printNumber ;Display the quotient
.l1:
lea eax,[edx+'0']
call printCharacter ;Display the remainder
pop edx
pop eax
ret
Another approach is to avoid recursion by changing the divisor. An example of this would be:
;Input
; eax = number to display
section .data
divisorTable:
dd 1000000000
dd 100000000
dd 10000000
dd 1000000
dd 100000
dd 10000
dd 1000
dd 100
dd 10
dd 1
dd 0
section .text
printNumber:
push eax
push ebx
push edx
mov ebx,divisorTable
.nextDigit:
xor edx,edx ;edx:eax = number
div dword [ebx] ;eax = quotient, edx = remainder
add eax,'0'
call printCharacter ;Display the quotient
mov eax,edx ;eax = remainder
add ebx,4 ;ebx = address of next divisor
cmp dword [ebx],0 ;Have all divisors been done?
jne .nextDigit
pop edx
pop ebx
pop eax
ret
This example doesn't suppress leading zeros, but that would be easy to add.
I think that maybe implementing a stack is not the best way to do this (and I really think you could figure out how to do that, saying as how pop is just a mov and a decrement of sp, so you can really set up a stack anywhere you like by just allocating memory for it and setting one of your registers as your new 'stack pointer').
I think this code could be made clearer and more modular if you actually allocated memory for a c-style null delimited string, then create a function to convert the int to string, by the same algorithm you use, then pass the result to another function capable of printing those strings. It will avoid some of the spaghetti code syndrome you are suffering from, and fix your problem to boot. If you want me to demonstrate, just ask, but if you wrote the thing above, I think you can figure out how with the more split up process.
; Input
; EAX = pointer to the int to convert
; EDI = address of the result
; Output:
; None
int_to_string:
xor ebx, ebx ; clear the ebx, I will use as counter for stack pushes
.push_chars:
xor edx, edx ; clear edx
mov ecx, 10 ; ecx is divisor, devide by 10
div ecx ; devide edx by ecx, result in eax remainder in edx
add edx, 0x30 ; add 0x30 to edx convert int => ascii
push edx ; push result to stack
inc ebx ; increment my stack push counter
test eax, eax ; is eax 0?
jnz .push_chars ; if eax not 0 repeat
.pop_chars:
pop eax ; pop result from stack into eax
stosb ; store contents of eax in at the address of num which is in EDI
dec ebx ; decrement my stack push counter
cmp ebx, 0 ; check if stack push counter is 0
jg .pop_chars ; not 0 repeat
mov eax, 0x0a
stosb ; add line feed
ret ; return to main
; eax = number to stringify/output
; edi = location of buffer
intToString:
push edx
push ecx
push edi
push ebp
mov ebp, esp
mov ecx, 10
.pushDigits:
xor edx, edx ; zero-extend eax
div ecx ; divide by 10; now edx = next digit
add edx, 30h ; decimal value + 30h => ascii digit
push edx ; push the whole dword, cause that's how x86 rolls
test eax, eax ; leading zeros suck
jnz .pushDigits
.popDigits:
pop eax
stosb ; don't write the whole dword, just the low byte
cmp esp, ebp ; if esp==ebp, we've popped all the digits
jne .popDigits
xor eax, eax ; add trailing nul
stosb
mov eax, edi
pop ebp
pop edi
pop ecx
pop edx
sub eax, edi ; return number of bytes written
ret

Resources