Example:
Dim x As Integer, y As Integer
Input "x=", x
y = x ^ 3 + 3 * x ^ 2 - 24 * x + 30
Print y
End
When I used the FreeBASIC compiler to generate assembly for this source code, I found
.globl _main
_main:
and
call ___main
in the assembly code. In addition, it looks like the Input statement is compiled as
call _fb_ConsoleInput#12
and
call _fb_InputInt#4
The "^" operator is compiled as
call _pow
(I am not sure whether the math function library of FreeBasic is integrated or external)
and the Print statement is compiled as
call _fb_PrintInt#12
and the End statement is compiled as
call _fb_End#4
The question is: How is FreeBASIC source code compiled? Why do _main and ___main appear in the assembly code? Are I/O statements compiled as function calls?
Reference: Assembly code generated by FreeBasic compiler
.intel_syntax noprefix
.section .text
.balign 16
.globl _main
_main:
push ebp
mov ebp, esp
and esp, 0xFFFFFFF0
sub esp, 20
mov dword ptr [ebp-4], 0
call ___main
push 0
push dword ptr [ebp+12]
push dword ptr [ebp+8]
call _fb_Init#12
.L_0002:
mov dword ptr [ebp-8], 0
mov dword ptr [ebp-12], 0
push -1
push 0
push 2
push offset _Lt_0004
call _fb_StrAllocTempDescZEx#8
push eax
call _fb_ConsoleInput#12
lea eax, [ebp-8]
push eax
call _fb_InputInt#4
push dword ptr [_Lt_0005+4]
push dword ptr [_Lt_0005]
fild dword ptr [ebp-8]
sub esp,8
fstp qword ptr [esp]
call _pow
add esp, 16
fild dword ptr [ebp-8]
fild dword ptr [ebp-8]
fxch st(1)
fmulp
fmul qword ptr [_Lt_0005]
fxch st(1)
faddp
mov eax, dword ptr [ebp-8]
imul eax, 24
push eax
fild dword ptr [esp]
add esp, 4
fxch st(1)
fsubrp
fadd qword ptr [_Lt_0006]
fistp dword ptr [ebp-12]
push 1
push dword ptr [ebp-12]
push 0
call _fb_PrintInt#12
push 0
call _fb_End#4
.L_0003:
push 0
call _fb_End#4
mov eax, dword ptr [ebp-4]
mov esp, ebp
pop ebp
ret
.section .data
.balign 4
_Lt_0004: .ascii "x=\0"
.balign 8
_Lt_0005: .quad 0x4008000000000000
.balign 8
_Lt_0006: .quad 0x403E000000000000
Yes, statements like PRINT are implemented as function calls into the FreeBASIC runtime library, though I am not sure why this matters to you unless you are currently learning assembly.
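The ^ operator in the listing follows the same pattern: it becomes call _pow, presumably the C runtime's pow, since the symbol has no byte-count suffix and the caller cleans the stack with add esp, 16. The exponent is passed as an 8-byte double stored at _Lt_0005. As a rough check, here is a Python sketch (not something FreeBASIC produces; the helper name decode_double is made up for illustration) that decodes the two .quad constants from the data section:

```python
import struct

def decode_double(bits: int) -> float:
    """Reinterpret a 64-bit integer as an IEEE 754 double."""
    return struct.unpack("<d", struct.pack("<Q", bits))[0]

if __name__ == "__main__":
    # The two .quad values copied from the listing's .data section.
    print("_Lt_0005 =", decode_double(0x4008000000000000))
    print("_Lt_0006 =", decode_double(0x403E000000000000))
```

The values decode to 3.0 and 30.0: the exponent in x ^ 3 (reused by the fmul as the factor 3 in 3 * x ^ 2) and the trailing + 30.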
As for _main, that is the ASM name for the main() C function used as the main program.
On x86, it is common for global/exported function names in C to be preceded by _ in the ASM output.
___main is the ASM name for the __main() C function called by the MinGW C runtime library startup code before anything in _main is executed.
Again, you'll see the extra _ preceding the C function name.
After that is a call to fb_Init(argc, argv, FB_LANG_FB) to initialize the FreeBASIC runtime library with the default "fb" FreeBASIC dialect and argc elements in the argument vector argv.
The #12 means the argument list is 12 bytes long (e.g., 4+4+4=12 as with fb_Init here); see __stdcall | Microsoft Docs for more information on that.
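To make the suffix arithmetic concrete, here is a small Python sketch. The helper name stdcall_arg_bytes is hypothetical, and the sketch assumes 32-bit x86, where every stack argument is rounded up to a multiple of 4 bytes:

```python
def stdcall_arg_bytes(arg_sizes):
    """Return the byte count that follows the name in a decorated stdcall symbol.

    arg_sizes lists the size in bytes of each argument. Each argument
    occupies a whole number of 4-byte stack slots on 32-bit x86.
    """
    return sum((size + 3) // 4 * 4 for size in arg_sizes)

if __name__ == "__main__":
    # fb_Init(argc, argv, lang): three 4-byte arguments
    print(stdcall_arg_bytes([4, 4, 4]))
```

The same count shows up as the amount of stack the callee removes on return, which is why the listing has no add esp after the fb_ calls but does have one after _pow.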
Related
I started to learn Assembly lately, and for practice I thought of making a small game.
To make the border graphic of the game I need to print a block character n times.
To test this, I wrote the following code:
bits 64
global main
extern ExitProcess
extern GetStdHandle
extern WriteConsoleA
section .text
main:
mov rcx, -11
call GetStdHandle
mov rbx, rax
drawFrame:
mov r12, [sze]
l:
mov rcx, rbx
mov rdx, msg
mov r8, 1
sub rsp, 48
mov r9, [rsp+40]
mov qword [rsp+32], 0
call WriteConsoleA
dec r12
jnz l
xor rcx, rcx
call ExitProcess
section .data
score dd 0
sze dq 20
msg db 0xdb
I wanted to do this with the WinAPI function for output.
Interestingly, this code stops after printing one char when using WriteConsoleA, but when I use C's putchar, it works correctly. I could also manage to make a C equivalent with the WriteConsoleA function, which also works fine. The disassembly of the C code didn't bring me further.
I suspect there's something wrong in my use of the stack that I don't see. Hopefully someone can explain or point out.
You don't want to keep subtracting 48 from RSP through each loop. You only need to allocate that space once before the loop and before you call a C library function or the WinAPI.
The primary problem is with your 4th parameter in R9. The WriteConsole function is defined as:
BOOL WINAPI WriteConsole(
_In_ HANDLE hConsoleOutput,
_In_ const VOID *lpBuffer,
_In_ DWORD nNumberOfCharsToWrite,
_Out_opt_ LPDWORD lpNumberOfCharsWritten,
_Reserved_ LPVOID lpReserved
);
R9 is supposed to be a pointer to a memory location that returns a DWORD with the number of characters written, but you do:
mov r9, [rsp+40]
This moves the 8 bytes stored at memory address RSP+40 into R9, so whatever happens to be there is passed as the pointer. What you want is the address RSP+40 itself, which the LEA instruction computes:
lea r9, [rsp+40]
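The difference between the two instructions can be modelled in a few lines of Python. This is only a toy model: memory is a dictionary from address to stored value, and the function names are made up for illustration.

```python
def mov_load(memory, address):
    """Model of mov r9, [address]: read the value stored at that address."""
    return memory.get(address, 0)

def lea(base, displacement):
    """Model of lea r9, [base+displacement]: compute the address, read nothing."""
    return base + displacement

if __name__ == "__main__":
    rsp = 0x1000                        # made-up stack pointer value
    memory = {rsp + 40: 0xDEADBEEF}     # leftover data on the stack
    print(hex(mov_load(memory, rsp + 40)))  # stale contents, not a usable pointer
    print(hex(lea(rsp, 40)))                # the address WriteConsoleA can write to
```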
Your code could have looked like:
bits 64
global main
extern ExitProcess
extern GetStdHandle
extern WriteConsoleA
section .text
main:
sub rsp, 56 ; Allocate space for local variable(s)
; Allocate 32 bytes of space for shadow store
; Maintain 16 byte stack alignment for WinAPI/C library calls
; 56+8=64 . 64 is evenly divisible by 16.
mov rcx, -11
call GetStdHandle
mov rbx, rax
drawFrame:
mov r12, [sze]
l:
mov rcx, rbx
mov rdx, msg
mov r8, 1
lea r9, [rsp+40]
mov qword [rsp+32], 0
call WriteConsoleA
dec r12
jnz l
xor rcx, rcx
call ExitProcess
section .data
score dd 0
sze dq 20
msg db 0xdb
Important Note: In order to be compliant with the 64-bit Microsoft ABI you must maintain the 16 byte alignment of the stack pointer prior to calling a WinAPI or C library function. Upon calling the main function the stack pointer (RSP) was 16 byte aligned. At the point the main function starts executing the stack is misaligned by 8 because the 8 byte return address was pushed on the stack. 48+8=56 doesn't get you back on a 16 byte aligned stack address (56 is not evenly divisible by 16) but 56+8=64 does. 64 is evenly divisible by 16.
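The alignment arithmetic in the note can be checked with a short Python sketch. The function names are hypothetical. The sketch assumes the situation described above: RSP is 8 bytes off a 16-byte boundary on entry because of the pushed return address.

```python
def is_aligned_for_call(allocated_bytes, entry_misalignment=8):
    """True if RSP is 16-byte aligned after sub rsp, allocated_bytes."""
    return (entry_misalignment + allocated_bytes) % 16 == 0

def smallest_aligned_allocation(needed_bytes, entry_misalignment=8):
    """Smallest allocation >= needed_bytes that leaves RSP 16-byte aligned."""
    total = needed_bytes + entry_misalignment
    rounded = (total + 15) // 16 * 16
    return rounded - entry_misalignment

if __name__ == "__main__":
    for size in (48, 56):
        print(size, is_aligned_for_call(size))
    print(smallest_aligned_allocation(48))
```

Here 48 is the space the code needs: 32 bytes of shadow store, 8 for the fifth argument, and 8 for the local that receives the character count.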
I am new to x86 assembly and have been doing some experiments lately using nasm and running the program on a windows 10 machine.
I have this code:
global _start
extern _GetStdHandle#4
extern _WriteFile#20
extern _ExitProcess#4
section .data
message db "1234"
section .text
_start:
call print
call _ExitProcess#4
print:
; DWORD bytes;
mov ebp, esp
sub esp, 4
; hStdOut = GetstdHandle( STD_OUTPUT_HANDLE)
push -11
call _GetStdHandle#4
mov ebx, eax
; WriteFile( hstdOut, message, length(message), &bytes, 0);
push 0
lea eax, [ebp-4]
push eax
push 4
push message
push ebx
call _WriteFile#20
mov esp, ebp
ret
; ExitProcess(0)
I assemble and link it using the following commands:
nasm -f win32 out.asm
link out.obj /entry:start /subsystem:console "C:\Program Files (x86)\Windows Kits\10\Lib\10.0.18362.0\um\x86\kernel32.lib"
When I run it in cmd, it outputs "1234" as expected.
Now I assemble and run the following code, where the program pushes "1234" directly instead of pushing message:
global _start
extern _GetStdHandle#4
extern _WriteFile#20
extern _ExitProcess#4
section .data
message db "1234"
section .text
_start:
call print
call _ExitProcess#4
print:
; DWORD bytes;
mov ebp, esp
sub esp, 4
; hStdOut = GetstdHandle( STD_OUTPUT_HANDLE)
push -11
call _GetStdHandle#4
mov ebx, eax
; WriteFile( hstdOut, message, length(message), &bytes, 0);
push 0
lea eax, [ebp-4]
push eax
push 4
push "1234"
push ebx
call _WriteFile#20
mov esp, ebp
ret
It outputs nothing.
Why? What information does message have that "1234" doesn't? When pushing message, does the program just push the address of the memory that is storing "1234"? If so, can I store "1234" somewhere else, and then push its address without creating a variable?
A variable is a logical construct — variables have lifetimes, some short, some long. They can come into being and disappear.
By contrast, registers and memory are physical constructs — in some sense, they are always there.
In assembly programming, by a human or generated by compiler, we make mappings from logical variables needed by our C code, algorithms, and pseudo code, to physical storage available in the processor. When a variable's lifetime ends, we can reuse the physical storage that it was using for another purpose (another variable).
Assembly language supports global variables (full process lifetime), and local variables — which can be either in memory on the stack, or CPU registers. CPU registers, of course, do not have addresses, so cannot be passed by (memory) reference. CPU registers also cannot be indexed, so to index an array requires memory.
I would make a local variable on the stack, like this:
print:
; DWORD bytes;
mov ebp, esp
sub esp, 12
; hStdOut = GetstdHandle( STD_OUTPUT_HANDLE)
push -11
call _GetStdHandle#4
mov ebx, eax
lea ecx, [ebp-12] ; ecx = address of a 4-byte local buffer
mov dword [ecx], "1234" ; store the four characters in that buffer
; WriteFile( hstdOut, message, length(message), &bytes, 0);
push 0
lea eax, [ebp-4]
push eax
push 4
push ecx
push ebx
call _WriteFile#20
mov esp, ebp
ret
Note: the syntax mov ..., "1234" may or may not do what you want, depending on the assembler. NASM packs a character constant in little-endian order, so there it should produce 0x34333231; I don't remember how the Microsoft assembler handles it. If your assembler doesn't translate it to 0x34333231, use that constant instead.
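The constant 0x34333231 comes from storing the characters in little-endian order: '1' (0x31) goes in the lowest byte so that it ends up first in memory. A quick Python check (the helper name pack_chars_le is made up for illustration):

```python
def pack_chars_le(text: str) -> int:
    """Pack up to four ASCII characters into a dword, first character in the low byte."""
    data = text.encode("ascii")
    if len(data) > 4:
        raise ValueError("at most 4 characters fit in a dword")
    return int.from_bytes(data, "little")

if __name__ == "__main__":
    print(hex(pack_chars_le("1234")))
```

This also shows why push "1234" in the question prints nothing: the value pushed is this number, which WriteFile then treats as a buffer address.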
I'm using C++builder for GUI application on Win32. Borland compiler optimization is very bad and does not know how to use SSE.
I have a function that is 5 times faster when compiled with mingw gcc 4.7.
I am thinking about asking gcc to generate assembler code and then using this code inside my C function, because the Borland compiler allows inline assembler.
The function in C looks like this :
void Test_Fn(double *x, size_t n,double *AV, size_t *mA, size_t NT)
{
double s = 77.777;
size_t m = mA[NT-3];
AV[2]=x[n-4]+m*s;
}
I made the function code very simple in order to simplify my question. My real function contains many loops.
The Borland C++ compiler generated this assembler code :
;
; void Test_Fn(double *x, size_t n,double *AV, size_t *mA, size_t NT)
;
#1:
push ebp
mov ebp,esp
add esp,-16
push ebx
;
; {
; double s = 77.777;
;
mov dword ptr [ebp-8],1580547965
mov dword ptr [ebp-4],1079210426
;
; size_t m = mA[NT-3];
;
mov edx,dword ptr [ebp+20]
mov ecx,dword ptr [ebp+24]
mov eax,dword ptr [edx+4*ecx-12]
;
; AV[2]=x[n-4]+m*s;
;
?live16385#48: ; EAX = m
xor edx,edx
mov dword ptr [ebp-16],eax
mov dword ptr [ebp-12],edx
fild qword ptr [ebp-16]
mov ecx,dword ptr [ebp+8]
mov ebx,dword ptr [ebp+12]
mov eax,dword ptr [ebp+16]
fmul qword ptr [ebp-8]
fadd qword ptr [ecx+8*ebx-32]
fstp qword ptr [eax+16]
;
; }
;
?live16385#64: ;
#2:
pop ebx
mov esp,ebp
pop ebp
ret
While the gcc generated assembler code is :
_Test_Fn:
mov edx, DWORD PTR [esp+20]
mov eax, DWORD PTR [esp+16]
mov eax, DWORD PTR [eax-12+edx*4]
mov edx, DWORD PTR [esp+8]
add eax, -2147483648
cvtsi2sd xmm0, eax
mov eax, DWORD PTR [esp+4]
addsd xmm0, QWORD PTR LC0
mulsd xmm0, QWORD PTR LC1
addsd xmm0, QWORD PTR [eax-32+edx*8]
mov eax, DWORD PTR [esp+12]
movsd QWORD PTR [eax+16], xmm0
ret
LC0:
.long 0
.long 1105199104
.align 8
LC1:
.long 1580547965
.long 1079210426
.align 8
I would like help understanding how function arguments are accessed in gcc and Borland C++.
My function in C++ for Borland would be something like :
void Test_Fn(double *x, size_t n,double *AV, size_t *mA, size_t NT)
{
__asm
{
put gcc generated assembler here
}
}
Borland accesses the arguments through the ebp register while gcc uses the esp register.
Can I force one of the compilers to generate compatible code for accessing the arguments by using a calling convention like cdecl or stdcall?
The arguments are passed similarly in both cases. The difference is that the code generated by Borland expresses the argument locations relative to EBP register and GCC relative to ESP, but both of them refer to the same addresses.
Borland sets EBP to point to the start of the function's stack frame and expresses locations relative to that, while GCC doesn't set up a new stack frame but expresses locations relative to ESP, which the caller has left pointing to the end of the caller's stack frame.
The code generated by Borland sets up a stack frame at the beginning of the function, causing EBP in the Borland code to be equal to ESP in the GCC code decreased by 4. This can be seen by looking at the first two Borland lines:
push ebp ; decrease esp by 4
mov ebp,esp ; ebp = the original esp decreased by 4
The GCC code doesn't alter ESP and the Borland code doesn't alter EBP until the end of the procedure, so the relationship holds whenever the arguments are accessed.
The calling convention seems to be cdecl in both of the cases, and there's no difference in how the functions are called. You can add keyword __cdecl to both in order to make that clear.
void __cdecl Test_Fn(double *x, size_t n,double *AV, size_t *mA, size_t NT)
However adding inline assembly compiled with GCC to the function compiled with Borland is not straightforward, because Borland might set up a stack frame even if the function body contains only inline assembly, causing the value of ESP register to differ from the one used in the GCC code. I see three possible workarounds:
Compile with Borland without the option "Standard stack frames". If the compiler figures out that a stack frame is not needed, this might work.
Compile with GCC without the option -fomit-frame-pointer. This should make sure that at least the value of EBP is the same in both. The option is enabled at levels -O, -O2, -O3 and -Os.
Manually edit the assembly produced by GCC, changing references to ESP to EBP and adding 4 to the offset.
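The third workaround is mechanical enough to sketch in Python. This is only an illustration under the assumptions stated above: the frame is set up with push ebp / mov ebp, esp, ESP does not change before the arguments are read, and operands have the plain [esp+N] form. The function name esp_to_ebp is hypothetical.

```python
import re

# Matches operands such as [esp+20]; other addressing forms are left alone.
_ESP_OPERAND = re.compile(r"\[esp\+(\d+)\]")

def esp_to_ebp(line: str, frame_adjust: int = 4) -> str:
    """Rewrite [esp+N] as [ebp+N+frame_adjust] in one line of assembly."""
    def shift(match):
        return "[ebp+%d]" % (int(match.group(1)) + frame_adjust)
    return _ESP_OPERAND.sub(shift, line)

if __name__ == "__main__":
    for line in ("mov edx, DWORD PTR [esp+20]", "mov eax, DWORD PTR [esp+16]"):
        print(esp_to_ebp(line))
```

The rewritten offsets match the Borland listing: NT is read from [ebp+24] and mA from [ebp+20].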
I would recommend you do some reading up on Application Binary Interfaces.
Here is a relevant link to help you figure out what compiler generates what sort of code:
https://en.wikipedia.org/wiki/X86_calling_conventions
I'd try either compiling everything with GCC, or see if compiling just the critical file with GCC and the rest with Borland and linking together works. What you explain can be made to work, but it will be a hard job that probably isn't worth your invested time (unless it will run very frequently on many, many machines).
I assumed I had pushed something without popping it, or vice versa, but I can't find anything wrong! I write to the console with a call to a DLL that links properly, and I inexplicably end up in no man's land (address 0x0000000000000000).
I've put some sleeps in, and I'm sure that the api call WriteConsoleA is returning. It's on my last ret under the print function.
Any ideas?
.exe:
extern FreeConsole
extern Sleep
extern ExitProcess
extern print
extern newconsole
extern strlen
section .data BITS 64
title: db 'Consolas!',0
message: db 'Hello, world',0,0
section .text bits 64
global Start
Start:
mov rcx, title
call newconsole
mov rcx, 1000
call Sleep
mov rcx, message
call print
mov rcx, 10000
call Sleep
call FreeConsole
xor rcx, rcx
call ExitProcess
.dll:
extern AllocConsole
extern SetConsoleTitleA
extern GetStdHandle
extern WriteConsoleA
extern Sleep
export newconsole
export strlen
export print
section .data BITS 64
console.writehandle: dq 0
console.readhandle: dq 0
console.write.result: dq 0
section .text BITS 64
global strlen
strlen:
push rax
push rdx
push rdi
mov rdi, rcx
xor rax, rax
mov rcx, dword -1
cld
repnz scasb
neg rcx
sub rcx, 2
pop rdi
pop rdx
pop rax
ret
global print
print:
mov rbp, rsp
push rcx
call strlen
mov r8, rcx
pop rdx
mov rcx, [console.writehandle]
mov r9, console.write.result
push qword 0
call WriteConsoleA
ret
global newconsole
newconsole:
push rax
push rcx
call AllocConsole
pop rcx
call SetConsoleTitleA
mov rcx, -11
call GetStdHandle
mov [console.writehandle], rax
pop rax
ret
I assume you're talking about this function:
global print
print:
mov rbp, rsp
push rcx
call strlen
mov r8, rcx
pop rdx
mov rcx, [console.writehandle]
mov r9, console.write.result
push qword 0
call WriteConsoleA
ret
The x64 ABI requires that stack space is reserved even for parameters passed in registers. WriteConsoleA is free to use those stack locations for whatever it wants - so you need to make sure that you've adjusted the stack appropriately. As it stands, you're pushing only the last reserved pointer parameter. I think something like the following will do the trick for you:
push qword 0
sub rsp, 4 * 8 ; reserve stack for register parameters
call WriteConsoleA
mov rsp, rbp ; restore rsp
ret
See http://msdn.microsoft.com/en-us/library/ms235286.aspx (emphasis added):
The x64 Application Binary Interface (ABI) is a 4 register fast-call calling convention, with stack-backing for those registers.
...
The caller is responsible for allocating space for parameters to the callee, and must always allocate sufficient space for the 4 register parameters, even if the callee doesn’t have that many parameters.
According to calling convention, you have to clean up arguments you put on the stack. In this case that applies to the 5th argument to WriteConsoleA. Since you have a copy of original rsp in rbp, you can reload rsp from rbp, or just add 8 after the call.
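The missing cleanup likely also explains the 0x0000000000000000 address from the question: when ret executes, the top of the stack is the zero that was pushed as the fifth argument. A small Python trace of the stack balance in print shows this. It is a sketch only: it models pushes and pops, not the shadow space, and RETURN_TO_CALLER is a made-up address.

```python
RETURN_TO_CALLER = 0x401000  # hypothetical return address pushed by "call print"

def run_print(cleanup: bool) -> int:
    """Trace the stack through print and return the address that ret jumps to."""
    stack = [RETURN_TO_CALLER]   # state on entry to print
    stack.append("rcx")          # push rcx
    stack.pop()                  # pop rdx (strlen leaves the stack balanced)
    stack.append(0)              # push qword 0, the fifth argument
    # call WriteConsoleA pushes a return address and its ret pops it again,
    # so the stack is unchanged when control comes back here.
    if cleanup:
        stack.pop()              # add rsp, 8 (or mov rsp, rbp)
    return stack.pop()           # ret

if __name__ == "__main__":
    print(hex(run_print(cleanup=False)))
    print(hex(run_print(cleanup=True)))
```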
I have been messing around with the PE file structure in Assembly Language. I'm pretty sure I have gotten to the Import Section correctly. I am using this as a reference, where each box is equal to 4 bytes:
+-------------------------+-------------------------+
| RVA to a list of | DATE/TIME |
| pointer to APIs names | | IMPORT DATA DIRECTORY
+-------------------------+-------------------------+ #1
| .DLL address (unused) | RVA to .DLL name |
+-------------------------+-------------------------+
|RVA to API address list |
+-------------------------+
Ollydbg screenshot: notice the value of eax on the right side (00402048), then look at the address the highlighted call instruction jumps to (00402000).
I attempted to call the first function from the "RVA to API address list", which is ExitProcess. However, when I tried issuing a call to the address, my program crashed. When I debugged it with Ollydbg, I found that the address used by call ExitProcess was different from the address I found in the list. In Ollydbg, the address I found pointed to <&KERNEL32.ExitProcess>, while call ExitProcess pointed to <JMP.&KERNEL32.ExitProcess>. I have read somewhere about some kind of jmp stub. Is that what this is? How am I supposed to call the functions in the "RVA to API address list"?
I know this may be confusing. If you need more clarification let me know.
Here is the code:
extern printf
extern ExitProcess
global _start
section .code
_start:
mov eax, [imagebase]
mov esi, eax
add eax, 3ch
mov eax, DWORD [eax]
add eax, esi; PE header pointer in eax
add eax, 128; 24 for PE Optional Header offset and then 104 for import RVA
mov ebx, DWORD [eax]
add ebx, DWORD [imagebase]; ebx now has import section offset
mov eax, DWORD [ebx+16]
add eax, DWORD [imagebase]; has array offset
mov ecx, ExitProcess
push 0
call ecx
;call eax
;jmp ecx
;call ExitProcess
imagebase: db 0,0,64,0; 0x00400000; This is right
It turns out I had found the array but never retrieved the value stored at that address. I was calling the address of the array itself rather than the function address stored in its first element.
extern printf
extern ExitProcess
global _start
section .code
_start:
mov eax, [imagebase]
mov esi, eax
add eax, 3ch
mov eax, DWORD [eax]
add eax, esi; PE header pointer in eax
add eax, 128; 24 for PE Optional Header offset and then 104 for import RVA
mov ebx, DWORD [eax]
add ebx, DWORD [imagebase]; ebx now has import section offset
mov eax, DWORD [ebx+16]
add eax, DWORD [imagebase]; has array offset
mov eax, [eax];This is what I needed to do
push 0
call eax
imagebase: db 0,0,64,0;
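The chain of offsets the assembly follows can be replayed on a byte buffer. This Python sketch uses a fabricated image, not a real PE file. It assumes a 32-bit (PE32) layout and that the buffer is indexed directly by RVA. The names and the function address 0x7C81CAFA are made up for illustration.

```python
import struct

def first_iat_entry(image: bytes) -> int:
    """Follow the same offsets as the assembly and return the first imported address."""
    (e_lfanew,) = struct.unpack_from("<I", image, 0x3C)            # PE header offset
    (import_rva,) = struct.unpack_from("<I", image, e_lfanew + 24 + 104)
    (iat_rva,) = struct.unpack_from("<I", image, import_rva + 16)  # RVA to API address list
    # This last read is the step the first version was missing:
    # load the address stored in the array instead of calling the array itself.
    (function_address,) = struct.unpack_from("<I", image, iat_rva)
    return function_address

def make_fake_image() -> bytes:
    """Build a zero-filled buffer with only the fields the walk reads."""
    image = bytearray(0x300)
    struct.pack_into("<I", image, 0x3C, 0x80)              # PE header at 0x80
    struct.pack_into("<I", image, 0x80 + 24 + 104, 0x200)  # import directory RVA
    struct.pack_into("<I", image, 0x200 + 16, 0x280)       # RVA to API address list
    struct.pack_into("<I", image, 0x280, 0x7C81CAFA)       # made-up ExitProcess address
    return bytes(image)

if __name__ == "__main__":
    print(hex(first_iat_entry(make_fake_image())))
```

In the real program each RVA must also have the image base added before it is dereferenced, as the add instructions in the assembly do.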