How does one display "Hello, world!" without using the benefits of a high-level assembler? - winapi

I'm attempting to display "Hello, world!" with FASM on a 64-bit Windows 7 machine without using the crutches that modern assemblers seem to provide in abundance.
This rather simple task proved to be surprisingly frustrating since every example and tutorial I could find insists on resorting to macros, including prewritten code, or importing libraries from high-level languages. I thought that the kind of people who want to learn assembly typically do so to develop a direct and intimate understanding of how computers work. All these abstractions and obfuscations seem to detract from that purpose.
Rant aside, I'm looking for code that can display "Hello, world!" on a console without reusing, including, and importing anything except to directly access the Windows API. Although I'm aware that many assemblers come packaged with files that provide access to the Windows API, I'd rather not rely on them.
Also, if you have any suggestions as to what assemblers or tutorials I can use to better facilitate my approach to learning, I'd greatly appreciate it.

The big problem with "pure" Windows programming is that Windows requires the program to contain an import section, the so-called import table, which lists the functions that the system DLLs have to provide to the program.
This table is not part of the program's code and has nothing to do with assembly programming itself. Besides, the import table has a complex structure that is not convenient to build by hand. That is why FASM provides a standard way for the user to build these import tables.
The proper approach for you, if your goal is to learn assembly, is to read the FASM manuals, where these macros are described, then read the example code provided in any FASM distribution, and then start using them and concentrate on the assembly programming.
Moderate use of macros does not make your program any less written in assembly!
The FASM message board is a good place to ask questions and get help, but you still have to do your homework.
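To make the "complex structure" a little more concrete, here is a minimal Python sketch of the byte layout of one import directory entry and one hint/name entry, following the documented PE format. The helper names and the RVA values are hypothetical, chosen only for illustration, and this is not a description of how FASM's macros are implemented.

```python
import struct

def pack_import_descriptor(name_rva: int, first_thunk_rva: int,
                           original_first_thunk_rva: int = 0) -> bytes:
    """Pack one import directory entry: five little-endian dwords, 20 bytes."""
    return struct.pack(
        "<5I",
        original_first_thunk_rva,  # RVA of the import lookup table (0 here)
        0,                         # TimeDateStamp
        0,                         # ForwarderChain
        name_rva,                  # RVA of the NUL-terminated DLL name
        first_thunk_rva,           # RVA of the import address table
    )

def pack_hint_name(function_name: str, hint: int = 0) -> bytes:
    """Pack a hint/name entry: 16-bit hint, then the NUL-terminated ASCII name.
    Alignment padding between entries is omitted in this sketch."""
    return struct.pack("<H", hint) + function_name.encode("ascii") + b"\0"

# Hypothetical RVAs, for illustration only.
descriptor = pack_import_descriptor(name_rva=0x3050, first_thunk_rva=0x3040)
terminator = bytes(20)  # an all-zero descriptor marks the end of the directory
entry = pack_hint_name("ExitProcess")
print(len(descriptor), len(terminator), entry)
```

Each descriptor corresponds to a line such as dd 0,0,0,RVA kernel_name,RVA kernel_table in a hand-written FASM import section, and each hint/name entry to a dw 0 followed by db 'ExitProcess',0.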

Every running process under Windows gets either kernel32 or kernelbase loaded into its address space. Using this fact and the PEB internals, you can easily access any Windows function (provided you have the right access privileges).
This blog entry details how to go about doing this to display a message with MessageBoxA.
In all honesty, unless you have some extreme reason for doing this, you are just going to end up wasting time. Rather, use the tools provided (in this case a linker, so you can access any Windows API without jumping through 10000 hurdles and hoops).

I managed to link to one library only (kernel32.dll) and make reference to 3 functions:
GetStdHandle
WriteConsole
ExitProcess
The code below is the result of my exhaustive Google search, and my own reference to MS documentation.
format PE console
entry start
include 'include\win32a.inc'
section '.data' data readable writable
msg db 'Hello World!',13,10
len = $-msg ; character count passed to WriteConsole, so no NUL terminator is needed
dummy dd ?
section '.code' readable writable executable
start:
push STD_OUTPUT_HANDLE
call [GetStdHandle] ;STD_OUTPUT_HANDLE (DWORD)-11
push 0 ;LPVOID lpReserved
push dummy ;LPDWORD lpNumberOfCharsWritten
push len ;DWORD nNumberOfCharsToWrite
push msg ;VOID *lpBuffer;
push eax ;HANDLE hConsoleOutput
call [WriteConsole]
push 0
call [ExitProcess]
section '.idata' data import readable writable
library kernel32,'KERNEL32.DLL'
include 'include\api\kernel32.inc'

Asking google for help: http://board.flatassembler.net/topic.php?t=14034
Trying it out yourself
; Example of 64-bit PE program
format PE64 GUI
entry start
section '.text' code readable executable
start:
sub rsp,8*5 ; reserve stack for API use and make stack dqword aligned
mov r9d,0
lea r8,[_caption]
lea rdx,[_message]
mov rcx,0
call [MessageBoxA]
mov ecx,eax
call [ExitProcess]
section '.data' data readable writeable
_caption db 'Win64 assembly program',0
_message db 'Hello World!',0
section '.idata' import data readable writeable
dd 0,0,0,RVA kernel_name,RVA kernel_table
dd 0,0,0,RVA user_name,RVA user_table
dd 0,0,0,0,0
kernel_table:
ExitProcess dq RVA _ExitProcess
dq 0
user_table:
MessageBoxA dq RVA _MessageBoxA
dq 0
kernel_name db 'KERNEL32.DLL',0
user_name db 'USER32.DLL',0
_ExitProcess dw 0
db 'ExitProcess',0
_MessageBoxA dw 0
db 'MessageBoxA',0
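Using nasm to compile this hello world (16 bit) code taken from here (note that the listing as posted uses MASM/TASM-style directives such as .model, proc and offset, so it would need adapting before NASM accepts it):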
.model tiny
.code
org 100h
main proc
mov ah,9 ; Display String Service
mov dx,offset hello_message ; Offset of message (Segment DS is the right segment in .COM files)
int 21h ; call DOS int 21h service to display message at ptr ds:dx
retn ; returns to address 0000 off the stack
; which points to bytes which make int 20h (exit program)
hello_message db 'Hello, world!$'
main endp
end main

Related

Explain to me how Windows allocates process virtual memory

I have a fairly complex question made up of several related questions. Let me start with a preamble.
I wrote a simple Win64 program in assembly language which prints "2 + 3 = 5" using printf and then "Hello World!" using puts:
format PE64
entry start
section '.text' code readable executable
start:
sub rsp,8*5 ; reserve stack for API use and make stack dqword aligned
mov edx, 3
mov ecx, 2
call print_sum
lea rcx,[_hw_message]
call [puts]
mov ecx,0
call [ExitProcess]
print_sum:
sub rsp, 20h
mov r9d, ecx
add r9d, edx
mov r8d, edx
mov edx, ecx
lea ecx, [_format_message]
call [printf]
add rsp, 20h
ret
section '.data' data readable writeable
_hw_message db 'Hello World!',0
_format_message db '%d + %d = %d',13,10,0
section '.idata' import data readable writeable
dd 0,0,0,RVA kernel_name,RVA kernel_table
dd 0,0,0,RVA msvcrt_name,RVA msvcrt_table
kernel_table:
ExitProcess dq RVA _ExitProcess
dq 0
msvcrt_table:
printf dq RVA _printf
puts dq RVA _puts
dq 0
kernel_name db 'KERNEL32.DLL',0
msvcrt_name db 'msvcrt.dll',0
_ExitProcess dw 0
db 'ExitProcess',0
_printf dw 0
db 'printf',0
_puts dw 0
db 'puts',0
and built it with fasm. Resulting binary size is 2048 bytes.
I've opened it with CFF Explorer to see PE header values.
Image base is 0x400000, entry point is 0x1000, .text section virtual address is 0x1000 too, so, as far as I understand, it should start in virtual memory at offset 0x401000 and it is also its entry point.
Then I've opened it in debugger (I use x64dbg) to confirm my guess:
Looks believable. Also note that stack is located at 0x8A000.
Fine, then I've tried the same with another program – notepad.exe from C:\Windows:
Wait, what? 0x140000000 + 0x24050 = 0x140024050, not 0x7FF75FD04050. And I can't find in PE headers such big values starting with 7FF.
In addition, the stack is again located somewhere at the beginning of the process's memory map, but now its address is already much larger:
I thought that perhaps this is because notepad.exe is a system program and is tightly tied to the Windows system APIs, and some parts of it (and maybe all the code) are always loaded into RAM while Windows is running. Therefore, I tried to do the same with x64dbg itself, and saw about the same picture:
image base: 0x140000000
entry point (in headers): 0x2440
entry point in VM: 0x7FF6B0E82440
location of stack in VM: 0xFDA07F8000
So the questions are:
Why are sections of some programs mapped to addresses greater than 0x7ff000000000, which doesn't match PE headers?
How are these processes different from others?
How does the OS decide where to place the stack in virtual memory?
Each thread has its own stack. As you can see from the screenshots, thread stacks are usually placed before code sections. If the program starts a dynamic number of threads, this memory may not be enough. Where, in this case, will stacks be allocated for new threads?
How can I programmatically, having an executable file, but not running it, statically determine at what addresses in the virtual memory of its process the sections, the stack will be located, and what address spaces will be available for allocation on the heap?
I understand that this can be difficult to explain in a nutshell, so I appreciate if, in addition to answering my questions, you can recommend me some reading material that will help me improve my understanding of the Windows virtual memory mapping.
What you're seeing is address space layout randomization (ASLR), which MSVC enables by default with the linker flag /DYNAMICBASE.
For an image to opt in, the flag IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE (0x40) must be set in the PE header, in the DllCharacteristics field of the optional header.
When it is set, the OS will select a random address for the image base, stack, and heap. The ImageBase specified in the PE header will be ignored.
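As a rough illustration of what can be learned statically, the following Python sketch reads ImageBase, the entry-point RVA and the DYNAMIC_BASE bit from raw PE bytes. The function names are hypothetical. The second helper only fabricates enough of a header to exercise the parser and is not a loadable file. The field offsets are those of the documented PE32 and PE32+ optional headers.

```python
import struct

IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE = 0x0040

def read_pe_layout(data: bytes) -> dict:
    """Return ImageBase, entry-point RVA and the DYNAMIC_BASE flag of a PE image."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)  # e_lfanew
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    opt = pe_offset + 4 + 20  # skip the signature and the 20-byte COFF file header
    (magic,) = struct.unpack_from("<H", data, opt)
    (entry_rva,) = struct.unpack_from("<I", data, opt + 16)
    if magic == 0x20B:    # PE32+ (64-bit): 8-byte ImageBase at offset 24
        (image_base,) = struct.unpack_from("<Q", data, opt + 24)
    elif magic == 0x10B:  # PE32 (32-bit): 4-byte ImageBase at offset 28
        (image_base,) = struct.unpack_from("<I", data, opt + 28)
    else:
        raise ValueError(f"unexpected optional header magic {magic:#x}")
    (dll_chars,) = struct.unpack_from("<H", data, opt + 70)
    return {
        "image_base": image_base,
        "entry_rva": entry_rva,
        "preferred_entry_va": image_base + entry_rva,
        "dynamic_base": bool(dll_chars & IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE),
    }

def fake_pe64_header(image_base: int, entry_rva: int, dll_chars: int) -> bytes:
    """Build a stub PE32+ header for demonstration (hypothetical helper)."""
    buf = bytearray(0x40 + 4 + 20 + 72)
    buf[0:2] = b"MZ"
    struct.pack_into("<I", buf, 0x3C, 0x40)
    buf[0x40:0x44] = b"PE\0\0"
    opt = 0x40 + 4 + 20
    struct.pack_into("<H", buf, opt, 0x20B)
    struct.pack_into("<I", buf, opt + 16, entry_rva)
    struct.pack_into("<Q", buf, opt + 24, image_base)
    struct.pack_into("<H", buf, opt + 70, dll_chars)
    return bytes(buf)

# Image base and entry RVA from the notepad.exe example; the 0x40 flag is assumed.
info = read_pe_layout(fake_pe64_header(0x140000000, 0x24050, 0x0040))
print(hex(info["preferred_entry_va"]), info["dynamic_base"])
```

On a real file, one would pass open(path, "rb").read() instead of the stub. When dynamic_base is true, only the RVAs are knowable ahead of time, because the load address is chosen at run time.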

Assembler calling conventions for Windows 10 API routines

Back in the 1970's I cut my teeth on the IBM 370 mainframe assembler, and in the early 1980's I had the original IBM PC, with the Microsoft Macro Assembler. At that time it was sold as a separate product, and came with a very useful manual. Now I'm retired, in quarantine, and looking to get back into assembler language.
I downloaded Visual Studio 2019 Community, which has MASM included in it, and for interactive debugging I'm using x64dbg. My PC is 64 bit, so I'm using the ML64 assembler as provided with VS.
My question is regarding the calling convention for the Windows API functions.
These days the Windows functions all seem to be geared toward C++ and, in my understanding, the calling convention reflects the machine code that is generated by C++ for calling those functions. I want to develop a template that I can use for all future calls, so it's coded for a nonexistent function called apifunc. This fictional function has five parameters.
; command to assemble is:
; ml64 samplecall.asm /link /subsystem:windows /defaultlib:kernel32.lib /entry:Start
extrn ExitProcess: PROC
extrn apifunc: PROC ; any hypothetical api function with five parameters
.data
;
parm1 dword ? ; these could be any required data type
parm2 dword ?
parm3 dword ?
parm4 dword ?
parm5 dword ?
;
.code
Start PROC
;
sub rsp, 32 ; room on the stack for first four parameters, 8 bytes each
;
lea rcx, parm1 ; pass the first four parameters in registers
lea rdx, parm2
lea r8, parm3
lea r9, parm4
lea rax, parm5 ; address of the fifth and last parameter
push rax ; put it on the stack
call apifunc ; call the hypothetical function
;
call ExitProcess
;
Start ENDP
End:
Does this code look even remotely correct? When control returns from apifunc, do I have any indication at all of whether it was successful and, if it was not, why not? Do I need to add 40 back to the stack pointer in order to leave it in the same condition in which it was passed to me?
Please be patient with me, because I now stand at the bottom of a very steep learning curve. I hope my questions make sense, and that I provided enough information.
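As background to the stack questions above: in the Microsoft x64 convention the caller reserves 32 bytes of shadow space below any stack arguments and keeps rsp 16-byte aligned at each call. Below is a minimal sketch of that arithmetic. It assumes the routine is entered like a function, with the return address just pushed, and does no other pushes. The function name is hypothetical.

```python
def win64_frame_size(max_args: int) -> int:
    """Bytes to subtract from rsp once, at entry, before making calls that
    take up to max_args integer or pointer arguments."""
    shadow = 32                            # home space for rcx, rdx, r8, r9
    stack_args = max(0, max_args - 4) * 8  # fifth argument onward, 8 bytes each
    size = shadow + stack_args
    # At entry rsp % 16 == 8 (the return address was just pushed), so
    # size + 8 must be a multiple of 16 for rsp to be aligned at the next call.
    if (size + 8) % 16 != 0:
        size += 8
    return size

for n in (0, 4, 5, 6, 7):
    print(n, win64_frame_size(n))
```

For five arguments this gives 40, the same figure as the sub rsp,8*5 used in the FASM examples on this page. With that layout the fifth argument is stored at [rsp+32] with a mov rather than pushed, and the same amount is added back to rsp before returning.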

NASM FindFirstFileA LPWIN32_FIND_DATAA

I've written a basic program in NASM trying to use FindFirstFileA with a view to eventually listing all files in a directory.
extern FindFirstFileA ; kernel32.dll
extern ExitProcess ; kernel32.dll
section .code
Start:
push dataStructPtr
push searchParameters
call [FindFirstFileA]
mov [fileHandle], eax
push 0
call [ExitProcess]
section .data
searchParameters: db "*.*",0
section .bss
dataStructPtr: resb 4
fileHandle: resb 4
As far as I can tell a 32-bit pointer to the WIN32_FIND_DATAA structure should be going into address 402008.
However it looks like more than 4 bytes are being written and also that gives me an address of 007334C8 which the program memory does not go up to.
Would you be able to shed some light on why this is happening and where the structure resides so I could look at it using OllyDbg?
Using OllyDbg to look at it:
Many Thanks

watch a directory for changes using masm assembly

I have only been programming in assembly for 2 weeks now, so I am fairly new to it and I need some help.
I need to watch a directory and all its subdirectories for changes. The only changes I need to be notified of are file creation and file edits, but if you include others that is fine.
I need the path of the file that changed to be shown in a message box. I do not need to know what the change was, just the file path. I tried to search the web but can't find anything on how to do this in assembly, particularly MASM. The only thing I could find was this code, which I think was written for MASM. I tried it, but it only shows message boxes containing "A" or other single letters, and it blocks me from renaming any file in that directory, which I do not want.
.data
FolderPath3 db "C:\users",0
.data?
hFile dd ?
FileBuffer DB 200 DUP(?)
ThreadProc PROC uses edi esi Param:DWORD
LOCAL lpBytesReturned:dword
invoke CreateFile,addr FolderPath3,GENERIC_READ,FILE_SHARE_DELETE or FILE_SHARE_READ,0,\
OPEN_EXISTING,FILE_FLAG_BACKUP_SEMANTICS,0
mov hFile,eax
invoke ReadDirectoryChangesW,hFile,addr FileBuffer,sizeof FileBuffer,TRUE,FILE_NOTIFY_CHANGE_LAST_ACCESS,\
addr lpBytesReturned,0,0
.if eax==0
invoke MessageBoxA,0,0,0,MB_OK
.else
xor ecx,ecx
@@:
add edi,ecx
lea edi,FileBuffer
mov esi,[edi].FILE_NOTIFY_INFORMATION.Action
.if esi==FILE_ACTION_MODIFIED
invoke MessageBoxA, NULL, addr [edi].FILE_NOTIFY_INFORMATION.FileName, offset BoxCaption, NULL
.elseif esi==0
invoke CloseHandle,hDir
ret
.endif
mov ecx,[edi].FILE_NOTIFY_INFORMATION.NextEntryOffset
.if ecx==0
invoke RtlZeroMemory,addr FileBuffer,sizeof FileBuffer
jmp ThreadProc
.endif
jmp @B
.endif
ret
ThreadProc ENDP
If anyone can fix the above code or show me different code that works, that would be great.
Thank you.
The essence of the task is the operating system specific services and handling the notifications.
If you are lost doing this in assembly, code it in a high level language (C, C++, Perl, etc.) and get that working. It should not be hard to find examples of doing just this from MSDN. Once you have learned how to do that, it will then be pretty clear what the assembly language has to do.

System Calls in Windows & Native API?

Recently I've been using a lot of assembly language on *NIX operating systems. I was wondering about the Windows domain.
Calling convention in Linux:
mov $SYS_Call_NUM, %eax
mov $param1 , %ebx
mov $param2 , %ecx
int $0x80
That's it. That is how we make a system call in Linux.
Reference of all system calls in Linux:
Regarding which $SYS_Call_NUM & which parameters we can use this reference : http://docs.cs.up.ac.za/programming/asm/derick_tut/syscalls.html
OFFICIAL Reference : http://kernel.org/doc/man-pages/online/dir_section_2.html
Calling convention in Windows:
???
Reference of all system calls in Windows:
???
Unofficial : http://www.metasploit.com/users/opcode/syscalls.html , but how do I use these in assembly unless I know the calling convention.
OFFICIAL : ???
If you say they didn't document it, then how is one going to write a libc for Windows without knowing the system calls? How is one going to do Windows assembly programming? At least in driver programming one needs to know these, right?
Now, what's up with the so-called Native API? Are "Native API" and "system calls" on Windows different terms for the same thing? To check, I compared these two UNOFFICIAL sources:
System Calls: http://www.metasploit.com/users/opcode/syscalls.html
Native API: http://undocumented.ntinternals.net/aindex.html
My observations:
All system calls begin with the letters Nt, whereas the Native API contains a lot of functions that do not begin with Nt.
The system calls of Windows are a subset of the Native API; system calls are just one part of the Native API.
Can anyone confirm this and explain?
EDIT:
There was another answer, the second one. I really liked it, but I don't know why the answerer deleted it. I request that they repost it.
If you're doing assembly programming under Windows you don't do manual syscalls. You use NTDLL and the Native API to do that for you.
The Native API is simply a wrapper around the kernelmode side of things. All it does is perform a syscall for the correct API.
You should NEVER need to manually syscall so your entire question is redundant.
Linux syscall codes do not change, but Windows's do; that's why you need to work through an extra abstraction layer (aka NTDLL).
EDIT:
Also, even if you're working at the assembly level, you still have full access to the Win32 API, there's no reason to be using the NT API to begin with! Imports, exports, etc all work just fine in assembly programs.
EDIT2:
If you REALLY want to do manual syscalls, you're going to need to reverse NTDLL for each relevant Windows version, add version detection (via the PEB), and perform a syscall lookup for each call.
However, that would be silly. NTDLL is there for a reason.
People have already done the reverse-engineering part: see https://j00ru.vexillium.org/syscalls/nt/64/ for a table of system-call numbers for each Windows kernel. (Note that the later rows do change even between versions of Windows 10.) Again, this is a bad idea outside of personal-use-only experiments on your own machine to learn more about asm and/or Windows internals. Don't inline system calls into code that you distribute to anyone else.
The other thing you need to know about the windows syscall convention is that as I understand it the syscall tables are generated as part of the build process. This means that they can simply change - no one tracks them. If someone adds a new one at the top of the list, it doesn't matter. NTDLL still works, so everyone else who calls NTDLL still works.
Even the mechanism used to perform syscalls (which int, or sysenter) is not fixed in stone and has changed in the past, and I think that once upon a time the same version of windows used different DLLs which used different entry mechanisms depending on the CPU in the machine.
I was interested in doing a windows API call in assembly with no imports (as an educational exercise), so I wrote the following FASM assembly to do what NtDll!NtCreateFile does. It's a rough demonstration on my 64-bit version of Windows (Win10 1803 Version 10.0.17134), and it crashes out after the call, but the return value of the syscall is zero so it is successful. Everything is set up per the Windows x64 calling convention, then the system call number is loaded into RAX, and then it's the syscall assembly instruction to run the call. My example creates the file c:\HelloWorldFile_FASM, so it has to be run "as administrator".
format PE64 GUI 4.0
entry start
section '.text' code readable executable
start:
;putting the first four parameters into the right registers
mov rcx, _Handle
mov rdx, [_access_mask]
mov r8, objectAttributes
mov r9, ioStatusBlock
;I think we need 1 stack word of padding:
push 0x0DF0AD8B
;pushing the other params in reverse order:
push [_eaLength]
push [_eaBuffer]
push [_createOptions]
push [_createDisposition]
push [_shareAcceses]
push [_fileAttributes]
push [_pLargeInterger]
;adding the shadow space (4x8)
; push 0x0
; push 0x0
; push 0x0
; push 0x0
;pushing the 4 register params into the shadow space for ease of debugging
push r9
push r8
push rdx
push rcx
;now pushing the return address to the stack:
push endOfProgram
mov r10, rcx ;copied from ntdll!NtCreateFile; the syscall instruction overwrites rcx with the return address, so the first argument is passed in r10 instead
mov eax, 0x55
syscall
endOfProgram:
retn
section '.data' data readable writeable
;parameters------------------------------------------------------------------------------------------------
_Handle dq 0x0
_access_mask dq 0x00000000c0100080
_pObjectAttributes dq objectAttributes ; at 00402058
_pIoStatusBlock dq ioStatusBlock
_pLargeInterger dq 0x0
_fileAttributes dq 0x0000000000000080
_shareAcceses dq 0x0000000000000002
_createDisposition dq 0x0000000000000005
_createOptions dq 0x0000000000000060
_eaBuffer dq 0x0000000000000000 ; "optional" param
_eaLength dq 0x0000000000000000
;----------------------------------------------------------------------------------------------------------
align 16
objectAttributes:
_oalength dq 0x30
_rootDirectory dq 0x0
_objectName dq unicodeString
_attributes dq 0x40
_pSecurityDescriptor dq 0x0
_pSecurityQualityOfService dq securityQualityOfService
unicodeString:
_unicodeStringLength dw 0x34
_unicodeStringMaxumiumLength dw 0x34, 0x0, 0x0
_pUnicodeStringBuffer dq _unicodeStringBuffer
_unicodeStringBuffer du '\??\c:\HelloWorldFile_FASM' ; may need to "run as administrator" for the file create to work.
ioStatusBlock:
_status_pointer dq 0x0
_information dq 0x0
securityQualityOfService:
_sqlength dd 0xC
_impersonationLevel dd 0x2
_contextTrackingMode db 0x1
_effectiveOnly db 0x1, 0x0, 0x0
I used the documentation for Ntdll!NtCreateFile, and I also used the kernel debugger to look at and copy a lot of the params.
__kernel_entry NTSTATUS NtCreateFile(
OUT PHANDLE FileHandle,
IN ACCESS_MASK DesiredAccess,
IN POBJECT_ATTRIBUTES ObjectAttributes,
OUT PIO_STATUS_BLOCK IoStatusBlock,
IN PLARGE_INTEGER AllocationSize OPTIONAL,
IN ULONG FileAttributes,
IN ULONG ShareAccess,
IN ULONG CreateDisposition,
IN ULONG CreateOptions,
IN PVOID EaBuffer OPTIONAL,
IN ULONG EaLength
);
Windows system calls are performed by calling into system DLLs such as kernel32.dll or gdi32.dll, which is done with ordinary subroutine calls. The mechanism for trapping into the OS privileged layer is undocumented, but that is okay because DLLs like kernel32.dll do this for you.
And by system calls, I'm referring to documented Windows API entry points like CreateProcess() or GetWindowText(). Device drivers will generally use a different API from the Windows DDK.
OFFICIAL Calling convention in Windows: http://msdn.microsoft.com/en-us/library/7kcdt6fy.aspx
(hope this link survives in the future; if it doesn't, just search for "x64 Software Conventions" on MSDN).
The function calling convention differs in Linux & Windows x86_64. In both ABIs, parameters are preferably passed via registers, but the registers used differ. More on the Linux ABI can be found at http://www.x86-64.org/documentation/abi.pdf
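The register difference can be summarized in a small lookup, sketched here in Python. It assumes every argument is an integer or pointer; floating-point arguments follow separate rules in both ABIs and are not modeled. The names are hypothetical.

```python
# Integer/pointer argument registers, in order, for the two x86-64 ABIs.
INTEGER_ARG_REGISTERS = {
    "win64": ("rcx", "rdx", "r8", "r9"),
    "sysv": ("rdi", "rsi", "rdx", "rcx", "r8", "r9"),
}

def integer_arg_location(abi: str, index: int) -> str:
    """Where the index-th (0-based) integer or pointer argument is passed."""
    regs = INTEGER_ARG_REGISTERS[abi]
    if index < len(regs):
        return regs[index]
    return "stack"

for abi in ("win64", "sysv"):
    print(abi, [integer_arg_location(abi, i) for i in range(7)])
```

So the fifth integer argument already goes on the stack under the Windows convention, while the System V convention used on Linux still has two registers left.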

Resources