VS2010 auto-format for ASM files - visual-studio-2010

I have the following .asm file in my solution:
doTimerAsm PROC EXPORT
pushfd ; backup flags register (4 Bytes)
pushad ; bakcup general purpose registers (8*4 Bytes)
mov eax, [esp+24h] ; load interrupt handler pointer
push [esp+28h] ; push handler parameter
call eax ; invoke interrupt handler
popad ; after return from interrupt handler, restore general purpose registers
popfd ; restore flags register
add esp, 08h ; pop interrupt handler pointer and the parameter
ret ; resume
doTimerAsm ENDP
I ran the VS auto-format on it expecting it to just indent the middle lines. Insead, it produced the following:
doTimerAsm PROC EXPORT
pushfd ;#1 backup flags register (4 Bytes)
pushad ;#1 bakcup general purpose registers (8*4 Bytes)
mov eax, [esp+24h] ;#4 load interrupt handler pointer
push [esp+28h] ;#? push handler parameter
call eax ;#2 invoke interrupt handler
popad ;#1 after return from interrupt handler, restore general purpose registers
popfd ;#1 restore flags register
add esp, 08h ;#3 pop interrupt handler pointer and the parameter
ret ;#1 resume
doTimerAsm ENDP
Can anyone explain these numbers (and the ?) to me? what do they mean? How is this auto-formatted?

The number after the hash is the byte-size of the line of assembly (as it would be if the instruction was assembled by the assembler). the ? indicates that the size couldn't be calculated (in this case you are missing a size specifier on [esp+28h], thus the assembler cannot tell if its a byte, word, dword or qword).

Related

FASM assembly program waits before exiting

This is the code that I use with FASM:
format PE console
entry main
include '..\MACRO\import32.inc'
section '.data' data readable writeable
msg db "привіт!",0dh,0ah,0 ;hi
lcl_set db ?
section '.code' code readable executable
main:
;fail without set locale
push msg
call [printf]
pop ecx
;succeed with set locale
push msg
call _liapnuty
pop ecx
push 0
call [ExitProcess]
_liapnuty:
push ebp
mov ebp, esp
;sub esp, 0
mov ebx,[ebp+8] ; 1st arg addr
mov al, [lcl_set]
or al, al
jnz _liapnuty_rest
call __set_locale
_liapnuty_rest:
push ebx
call [printf]
pop ebx
mov esp, ebp
pop ebp
ret 0
__set_locale:
mov al, [lcl_set]
or al, al
jnz __set_locale_rest
push 1251
call SetConsoleCP
call SetConsoleOutputCP
pop ecx
mov [lcl_set], 1
;push lcl
;call [system]
; pop ecx
; mov [lcl_set], 1
;push cls
;call [printf]
; pop ecx
__set_locale_rest:
ret 0
section '.idata' import data readable
library kernel,'kernel32.dll',\
msvcrt,'msvcrt.dll'
import kernel,\
SetConsoleCP,'SetConsoleCP',\
SetConsoleOutputCP,'SetConsoleOutputCP',\
ExitProcess,'ExitProcess'
import msvcrt,\
printf,'printf'
It works almost perfectly, except that before exiting it waits for like a second for some reason. It outputs data almost instantly, yet it fails to shut down quickly. If the reason is using these libraries or not clearing the stack after calling ExitProcess (which obviously can't be done), then let me know and I will mostly gladly accept this answer, but I want to be 100% sure I'm doing everything correctly.
The reason for all of it was because kernel32 functions pop their parameters themselves on return. If I remove unnecessary pops it starts working fast again. Of course, the program still runs with damaged stack but it does a lot of damage control at the end. That's why it was slow, but still worked. For everyone facing this issue, make sure to be careful with the calling convention.
To debug the application and find the error I used OLLYDBG. It's free and it works. It helps you debug EXEs and DLLs, allowing to step one command at a time. Also it shows the memory, the stack and all of the registers and flags.
Using the stack I was able to find out that it gets corrupted.

Register ESI causes RunTime-Check Failure #0 error

I've spend lot of time trying to solve this problem and I don't understand, why it doesn't work. Problem's description is in comments below:
.386
.MODEL FLAT, STDCALL
OPTION CASEMAP:NONE
.NOLIST
.NOCREF
INCLUDE \masm32\include\windows.inc
.LIST
.CODE
DllEntry PROC hInstDLL:HINSTANCE, reason:DWORD, reserved1:DWORD
mov eax, TRUE
ret
DllEntry ENDP
caesarAsm proc string: DWORD, key: DWORD, stringLength : DWORD
mov esi, 1 ; I cannot use this register, mov esi, (anything) causes Crash:
; Run-Time Check Failure #0 - The value of ESP was not properly saved across a function call. This is usually a result of calling a function declared with one calling convention with a function pointer declared with a different calling convention
mov eax, string
ret
caesarAsm endp
END DllEntry
I searched "whole" Internet, I found that the problem is connected with stack, but no operation on stacks helped me to solve it.
I'm using Microsoft Visual Studio 2012
I assume the error does not occur in this function, rather, it is triggered elsewhere. The esi register is a callee-saved register. You must make sure its value is the same at function exit as it was at entry. You can use it in your function, but you must save and restore its value. Such as:
push esi
mov esi, 1
mov eax, string
pop esi
ret
This is all well documented. You may only use eax, ecx and edx without saving.
Side note: you are using high-level features of your assembler, you might want to check the actual generated code or refrain from using them until you are confident in what the result is going to be. Incidentally masm has a USES keyword which would do the save/restore for you.

Structured Exception Handler catches near-zero EIP trap differently on nearly identical machines?

I have a rather complex, but extremely well-tested assembly language x86-32 application running on variety of x86-32 and x86-64 boxes. This is a runtime system for a language compiler, so it supports the execution of another compiled binary program, the "object code".
It uses Windows SEH to catch various kinds of traps: division by zero, illegal access, ... and prints a register dump using the context information provided by Windows, that shows the state of the machine at the time of the trap. (It does lots of other stuff irrelevant to the question, such as printing a function backtrace or recovering from the division by zero as appropriate). This allows the writer of the "object code" to get some idea what went wrong with his program.
It behaves differently on two Windows 7-64 systems, that are more or less identical, on what I think is an illegal memory access. The specific problem is that the "object code" (not the well-tested runtime system) somewhere stupidly loads 0x82 into EIP; that is a nonexistent page in the address space AFAIK. I expect a Windows trap though the SEH, and expect to a register dump with EIP=00000082 etc.
On one system, I get exactly that register dump. I could show it here, but it doesn't add anything to my question. So, it is clear the SEH in my runtime system can catch this, and display the situation. This machine does not have any MS development tools on it.
On the other ("mystery") system, with the same exact binaries for runtime system and object code, all I get is the command prompt. No further output. FWIW, this machine has MS Visual Studio 2010 on it. The mystery machine is used heavily for other purposes, and shows no other funny behaviors in normal use.
I assume the behavior difference is caused by a Windows configuration somewhere, or something that Visual Studio controls. It isn't the DEP configuration the system menu; they are both configured (vanilla) as "DEP for standard system processes". And my runtime system executable has "No (/NXCOMPAT:NO)" configured.
Both machines are i7 but different chips, 4 cores, lots of memory, different motherboards. I don't think this is relevant; surely both of these CPUs take traps the same way.
The runtime system includes the following line on startup:
SetErrorMode(SEM_FAILCRITICALERRORS | SEM_NOGPFAULTERRORBOX); // stop Windows pop-up on crashes
This was recently added to prevent the "mystery" system from showing a pop-up window, "xxx.exe has stopped working" when the crash occurs. The pop-up box behaviour doesn't happen on the first system, so all this did was push the problem into a different corner on the "mystery" machine.
Any clue where I look to configure/control this?
I provide here the SEH code I am using. It has been edited
to remove a considerable amount of sanity-checking code
that I claim has no effect on the apparant state seen
in this code.
The top level of the runtime system generates a set of worker
threads (using CreateThread) and points to execute ASMGrabGranuleAndGo;
each thread sets up its own SEH, and branches off to a work-stealing scheduler, RunReadyGranule. To the best of my knowledge, the SEH is not changed
after that; at least, the runtime system and the "object code" do
not do this, but I have no idea what the underlying (e.g, standard "C")
libraries might do.
Further down I provide the trap handler, TopLevelEHFilter.
Yes, its possible the register printing machinery itself blows
up causing a second exception. I'll try to check into this again soon,
but IIRC my last attempt to catch this in the debugger on the
mystery machine, did not pass control to the debugger, just
got me the pop up window.
public ASMGrabGranuleAndGo
ASSUME FS:NOTHING ; cancel any assumptions made for this register
ASMGrabGranuleAndGo:
;Purpose: Entry for threads as workers in PARLANSE runtime system.
; Each thread initializes as necessary, just once,
; It then goes and hunts for work in the GranulesQ
; and start executing a granule whenever one becomes available
; install top level exception handler
; Install handler for hardware exceptions
cmp gCompilerBreakpointSet, 0
jne HardwareEHinstall_end ; if set, do not install handler
push offset TopLevelEHFilter ; push new exception handler on Windows thread stack
mov eax, [TIB_SEH] ; expected to be empty
test eax, eax
BREAKPOINTIF jne
push eax ; save link to old exception handler
mov fs:[TIB_SEH], esp ; tell Windows that our exception handler is active for this thread
HardwareEHinstall_end:
;Initialize FPU to "empty"... all integer grains are configured like this
finit
fldcw RTSFPUStandardMode
lock sub gUnreadyProcessorCount, 1 ; signal that this thread has completed its initialization
##: push 0 ; sleep for 0 ticks
call MySleep ; give up CPU (lets other threads run if we don't have enuf CPUs)
lea esp, [esp+4] ; pop arguments
mov eax, gUnreadyProcessorCount ; spin until all other threads have completed initialization
test eax, eax
jne #b
mov gThreadIsAlive[ecx], TRUE ; signal to scheduler that this thread now officially exists
jmp RunReadyGranule
ASMGrabGranuleAndGo_end:
;-------------------------------------------------------------------------------
TopLevelEHFilter: ; catch Windows Structured Exception Handling "trap"
; Invocation:
; call TopLevelEHFilter(&ReportRecord,&RegistrationRecord,&ContextRecord,&DispatcherRecord)
; The arguments are passed in the stack at an offset of 8 (<--NUMBER FROM MS DOCUMENT)
; ESP here "in the stack" being used by the code that caused the exception
; May be either grain stack or Windows thread stack
extern exit :proc
extern syscall #RTSC_PrintExceptionName#4:near ; FASTCALL
push ebp ; act as if this is a function entry
mov ebp, esp ; note: Context block is at offset ContextOffset[ebp]
IF_USING_WINDOWS_THREAD_STACK_GOTO unknown_exception, esp ; don't care what it is, we're dead
; *** otherwise, we must be using PARLANSE function grain stack space
; Compiler has ensured there's enough room, if the problem is a floating point trap
; If the problem is illegal memory reference, etc,
; there is no guarantee there is enough room, unless the application is compiled
; with -G ("large stacks to handle exception traps")
; check what kind of exception
mov eax, ExceptionRecordOffset[ebp]
mov eax, ExceptionRecord.ExceptionCode[eax]
cmp eax, _EXCEPTION_INTEGER_DIVIDE_BY_ZERO
je div_by_zero_exception
cmp eax, _EXCEPTION_FLOAT_DIVIDE_BY_ZERO
je float_div_by_zero_exception
jmp near ptr unknown_exception
float_div_by_zero_exception:
mov ebx, ContextOffset[ebp] ; ebx = context record
mov Context.FltStatusWord[ebx], CLEAR_FLOAT_EXCEPTIONS ; clear any floating point exceptions
mov Context.FltTagWord[ebx], -1 ; Marks all registers as empty
div_by_zero_exception: ; since RTS itself doesn't do division (that traps),
; if we get *here*, then we must be running a granule and EBX for granule points to GCB
mov ebx, ContextOffset[ebp] ; ebx = context record
mov ebx, Context.Rebx[ebx] ; grain EBX has to be set for AR Allocation routines
ALLOCATE_2TOK_BYTES 5 ; 5*4=20 bytes needed for the exception structure
mov ExceptionBufferT.cArgs[eax], 0
mov ExceptionBufferT.pException[eax], offset RTSDivideByZeroException ; copy ptr to exception
mov ebx, ContextOffset[ebp] ; ebx = context record
mov edx, Context.Reip[ebx]
mov Context.Redi[ebx], eax ; load exception into thread's edi
GET_GRANULE_TO ecx
; This is Windows SEH (Structured Exception Handler... see use of Context block below!
mov eax, edx
LOOKUP_EH_FROM_TABLE ; protected by DelayAbort
TRUST_JMP_INDIRECT_OK eax
mov Context.Reip[ebx], eax
mov eax, ExceptionContinueExecution ; signal to Windows: "return to caller" (we've revised the PC to go to Exception handler)
leave
ret
TopLevelEHFilter_end:
unknown_exception:
<print registers, etc. here>
"DEP for standard system processes" won't help you; it's internally known as "OptIn". What you need is the IMAGE_DLLCHARACTERISTICS_NX_COMPAT flag set in the PE header of your .exe file. Or call the SetProcessDEPPolicy function in kernel32.dll The SetProcessMitigationPolicy would be good also... but it isn't available until Windows 8.
There's some nice explanation on Ed Maurer's blog, which explains both how .NET uses DEP (which you won't care about) but also the system rules (which you do).
BIOS settings can also affect whether hardware NX is available.

Malloc and Free multiple arrays in assembly

I'm trying to experiment with malloc and free in assembly code (NASM, 64 bit).
I have tried to malloc two arrays, each with space for 2 64 bit numbers. Now I would like to be able to write to their values (not sure if/how accessing them will work exactly) and then at the end of the whole program or in the case of an error at any point, free the memory.
What I have now works fine if there is one array but as soon as I add another, it fails on the first attempt to deallocate any memory :(
My code is currently the following:
extern printf, malloc, free
LINUX equ 80H ; interupt number for entering Linux kernel
EXIT equ 60 ; Linux system call 1 i.e. exit ()
segment .text
global main
main:
push dword 16 ; allocate 2 64 bit numbers
call malloc
add rsp, 4 ; Undo the push
test rax, rax ; Check for malloc failure
jz malloc_fail
mov r11, rax ; Save base pointer for array
; DO SOME CODE/ACCESSES/OPERATIONS HERE
push dword 16 ; allocate 2 64 bit numbers
call malloc
add rsp, 4 ; Undo the push
test rax, rax ; Check for malloc failure
jz malloc_fail
mov r12, rax ; Save base pointer for array
; DO SOME CODE/ACCESSES/OPERATIONS HERE
malloc_fail:
jmp dealloc
; Finish Up, deallocate memory and exit
dealloc:
dealloc_1:
test r11, r11 ; Check that the memory was originally allocated
jz dealloc_2 ; If not, try the next block of memory
push r11 ; push the address of the base of the array
call free ; Free this memory
add rsp, 4
dealloc_2:
test r12, r12
jz dealloc_end
push r12
call free
add rsp, 4
dealloc_end:
call os_return ; Exit
os_return:
mov rax, EXIT
mov rdi, 0
syscall
I'm assuming the above code is calling the C functions malloc() and free()...
If 1st malloc() fails, you arrive at dealloc_1 with whatever garbage is in r11 and r12 after returning from the malloc().
If 2nd malloc() fails, you arrive at dealloc_1 with whatever garbage is in r12 after returning from the malloc().
Therefore, you have to zero out r11 and r12 before doing the first allocation.
Since this is 64-bit mode, all pointers/addresses and sizes are normally 64-bit. When you pass one of those to a function, it has to be 64-bit. So, push dword 16 isn't quite right. It should be push qword 16 instead. Likewise, when you are removing these parameters from the stack, you have to remove exactly as many bytes as you've put there, so add rsp, 4 must change to add rsp, 8.
Finally, I don't know which registers malloc() and free() preserve and which they don't. You may need to save and restore the so-called volatile registers (see your C compiler documentation). The same holds for the code not shown. It must preserve r11 and r12 so they can be used for deallocation. EDIT: And I'd check if it's the right way of passing parameters through the stack (again, see your compiler documentation).
EDIT: you're testing r11 for 0 right before 2nd free(). It should be r12. But free() doesn't really mind receiving NULL pointers. So, these checks can be removed.
Pay attention to your code.
You have to obey x86-64 calling conventions: arguments might be passed through registers, in the case of malloc that would be RDI for the size. And as already pointed out, you have to watch out which registers are preserved by the called functions. (afaik only RBP, RSP and R12-R15 are preserved across function calls)
There are at least two bugs, because you test r11 again (the line test r11,r11 after dealloc_2:, but you supposedly wanted to test r12 here. Additionally you want to push a qword, if you are in 64 bit mode.
The reason the deallocation doesn't work at all may be because you are changing the contents of r11 or r12.
Not that both tests are not needed, as it is perfectly safe to call free with an null pointer.

Endless loop with assember

OK, I'm new to PC Assembler. I"m trying to write an program, but it won't stop looping. I'm guessing the ECX register is being modified? How can I fix this? Thanks.
DATA SECTION
;
KEEP DD 0 ;temporary place to keep things
;
CODE SECTION
;
START:
MOV ECX,12
TOPOFLOOP:
PUSH -11 ;STD_OUTPUT_HANDLE
CALL GetStdHandle ;get, in eax, handle to active screen buffer
PUSH 0,ADDR KEEP ;KEEP receives output from API
PUSH 5,'bruce' ;5=length of string
PUSH EAX ;handle to active screen buffer
CALL WriteFile
XOR EAX,EAX ;return eax=0 as preferred by Windows
LOOP TOPOFLOOP
ENDLABEL:
RET
In most x86 calling convention, including the stdcall convention used by Windows API functions, ECX is a caller-save register -- the called function is not required to make sure the value of the register is the same when it returns as when it was called. You have to save it somewhere safe in your own code.

Resources