How to call library functions in shellcode - windows

I want to generate shellcode using the following NASM code:
global _start
extern exit
section .text
_start:
xor rcx, rcx
or rcx, 10
call exit
The problem here is that I cannot use this because the address of exit function cannot be hard coded. So, how do I go about using library functions without having to re-implement them using system calls?
One way that I can think of, is to retrieve the address of exit function in a pre-processing program using GetProcAddress and substitute it in the shellcode at the appropriate place.
However, this method does not generate shellcode that can be run as it is. I'm sure there must be a better way to do it.

I am not an expert on writing shellcode, but you could try to find the import address table (IAT) of your target program and use the stored function pointers to call windows functions.
Note that you would be limited to the functions the target program uses.
Also you would have to let your shellcode calculate IAT's position relative to the process's base address due to relocations. Of course you could rely on Windows not relocating, but this might result in errors in a few cases.
Another issue is that you would have to find the target process's base address from outside.
A totally different attempt would be using syscalls, but they are really hard to use, not talking about the danger using them.
Information on PE file structure:
https://msdn.microsoft.com/en-us/library/ms809762.aspx

Related

writing a !address equivalent in WinAPI

Implementing !address feature of Windbg...
I am using VirtualQueryEx to query another Process memory and using getModuleFileName on the base addresses returned from VirtualQueryEx gives the module name.
What is left are the other non-module regions of a Process. How do I determine if a file is mapped to a region, or if the region represents the stack or the heap or PEB/TEB etc.
Basically, How do I figure out if a region represents Heap, the stack or PEB. How does Windbg do it?
One approach is to disassemble the code in the debugger extension DLL that implements !address. There is documentation within the Windbg help file on writing an extension. You could use that documentation to reverse engineer where the handler of !address is located. Then browsing through the disassembly you can see what functions it calls.
Windbg has support for debugging another instance of Windbg, specifically to debug an extension DLL. You can use this facility to better delve into the implementation of !address.
While the reverse engineering approach may be tedious, it will be more deterministic than theorizing how !address is implemented and trying out each theory.
To add to #Χpẘ answer, the reverse of the command shouldn't be really hard as debugger extensions DLLs come with symbols (I already reversed one to explain the internal flag of the !heap command).
Note that it is just a quick overview, I haven't perused inside it too much.
According to the !address documentation the command is located in exts.dll library. The command itself is located in Extension::address.
There are two commands handled there, a kernel mode (KmAnalyzeAddress) and a user mode one (UmAnalyzeAddress).
Inside UmAnalyzeAddress, the code:
Parse the command line: UmParseCommandLine(CmdArgs &,UmFilterData &)
Check if the process PEB is available IsTypeAvailable(char const *,ulong *) with "${$ntdllsym}!_PEB"
Allocate a std::list of user mode ranges: std::list<UmRange,std::allocator<UmRange>>::list<UmRange,std::allocator<UmRange>>(void)
Starts a loop to gather the required information:
UmRangeData::GetWowState(void)
UmMapBuild
UmMapFileMappings
UmMapModules
UmMapPebs
UmMapTebsAndStacks
UmMapHeaps
UmMapPageHeaps
UmMapCLR
UmMapOthers
Finally the results are finally output to screen using UmPrintResults.
Each of the above function can be simplfied to basic components, e.g. UmFileMappingshas the following central code:
.text:101119E0 push edi ; hFile
.text:101119E1 push offset LibFileName ; "psapi.dll"
.text:101119E6 call ds:LoadLibraryExW(x,x,x)
.text:101119EC mov [ebp+hLibModule], eax
.text:101119F2 test eax, eax
.text:101119F4 jz loc_10111BC3
.text:101119FA push offset ProcName ; "GetMappedFileNameW"
.text:101119FF push eax ; hModule
.text:10111A00 mov byte ptr [ebp+var_4], 1
.text:10111A04 call ds:GetProcAddress(x,x)
Another example, to find each stacks, the code just loops trhough all threads, get their TEB and call:
.text:1010F44C push offset aNttib_stackbas ; "NtTib.StackBase"
.text:1010F451 lea edx, [ebp+var_17C]
.text:1010F457 lea ecx, [ebp+var_CC]
.text:1010F45D call ExtRemoteTyped::Field(char const *)
There is a lot of fetching from _PEB, _TEB, _HEAP and other internal structures so it's not probably doable without going directly through those structures. So, I guess that some of the information returned by !address are not accessible through usual / common APIs.
You need to determine if the address you are interested in lies within a memory mapped file. Check out --> GetMappedFileName. Getting the heap and stack addresses of a process will be a little more problematic as the ranges are dynamic and don't always lie sequentially.
Lol, I don't know, I would start with a handle to the heap. If you can spawn/inherit a process then you more than likely can access the handle to the heap. This function looks promising: GetProcessHeap . That debug app runs as admin, it can walk the process chain and spy on any user level process. I don't think you will be able to access protected memory of kernel mode apps such as File System Filters, however, as they are dug down a little lower by policy.

Process injections. Windows

I know how to inject my code into other process using system calls - just get module handle using GetModuleHandle, and get proc address using GetProcAddress. By that address I can write jump instructions to my function.
But I need to inject into function of target executable. I have function's offset inside PE, know section. How to calculate needed addres in runtime to write jump instructions?
Thanks!
The module handle is the address of the start of the module, so if you know the offset you can just add them together.

Visual Studio 2010 x64 __setReg Equivalent Compiler Intrinsic

I have an application I have written in C where I really need to modify the value of one of the processor registers before calling a function. Normally I would do this with inline assembly, but as we all know that has been removed for 64 bit applications. I also cannot do this in a separate .asm file that is compiled with ml64 due to certain project constraints. So basically I need to execute the equivalent of the following code inline:
_asm mov r10d, 0xDEADBEEF
Does anyone know of a creative method or some other compiler intrinsic for x64 that will allow you to modify the value of a register inline?
Unfortunately, after looking at possible workarounds, it seems that Hans was right and it's simply not possible to modify the contents of a register inline. There is no compiler intrinsic that exists to do it and the only alternative is to either write the entire function in 64 bit assembly as a separate .asm file and compile it with ml64, or do as Alexey suggested and allocate an executable block of memory before hand and write the opcodes to it. You can then create a function pointer and just call this code directly. So for example, if I wanted to do the equivalent of:
mov r10d, ecx
ret
Just create an array to store the opcodes:
BYTE copyValueToR10[] = "\x44\x8B\xD1\xC3";
You can then VirtualAlloc memory for this tiny function with PAGE_EXECUTE protection. Next just create a function pointer and you're good to go. Definitely a dirty way to do it, but given the constraints of not having inline asm or wanting to compile using ml64, this seems to be the only other way to do it.

call immediate versus call dword near [dword addr]

So recently I've been wanting to call some win32 calls from assembly, and I've been using NASM as my external assembler. I was calling SendMessage in my code in the following way:
call __imp__SendMessageW#16
This was assembled into a relative jump (0xE8 opcode) and the result was an access violation. In the debugger, the computed jump offset seemed to be the correct one (in that __imp__SendMessageW#16 really did seem to reside there) but nonetheless it did not work. Examining the assembly produced by Visual Studio when I called the function from C++, I noticed that it wasn't a relative immediate jump it was using, but instead (in the language of MASM) a call dword ptr [__imp__SendMessageW#16], corresponding to an 0xFF15 opcode. After some futzing around I figured out that NASM syntax encodes this as call dword near [dword __imp__SendMessageW#16], and making the change my code suddenly worked.
My question is, why does one work and not the other? Is there some relocation of code going on that causes the relative immediate call to jump somewhere unfriendly? I've never been much of an assembly programmer but my impression was always that the two calls should do the same thing and the main difference is that one is position independent and the other is not (assuming that they move the IP to the same place). The relocation of code theory makes sense given that, but then how do you explain the debugger showing the right address?
Also: what's the logic behind the [] syntax in this call? The offset is still an immediate (just little endian encoded immediately after 0xFF15), there's no memory access going on here beyond the instruction fetch (I tend to think of [] as a dereference outside the context of lea).
call dword[__imp__SendMessageW#16]
_imp_SendMessageW#16 is an address to your imports section that contains the address of the API function. You use the square brackets to deference (call the address STORED by this address)

Simple "Hello-World", null-free shellcode for Windows needed

I would like to test a buffer-overflow by writing "Hello World" to console (using Windows XP 32-Bit). The shellcode needs to be null-free in order to be passed by "scanf" into the program I want to overflow. I've found plenty of assembly-tutorials for Linux, however none for Windows. Could someone please step me through this using NASM? Thxxx!
Assembly opcodes are the same, so the regular tricks to produce null-free shellcodes still apply, but the way to make system calls is different.
In Linux you make system calls with the "int 0x80" instruction, while on Windows you must use DLL libraries and do normal usermode calls to their exported functions.
For that reason, on Windows your shellcode must either:
Hardcode the Win32 API function addresses (most likely will only work on your machine)
Use a Win32 API resolver shellcode (works on every Windows version)
If you're just learning, for now it's probably easier to just hardcode the addresses you see in the debugger. To make the calls position independent you can load the addresses in registers. For example, a call to a function with 4 arguments:
PUSH 4 ; argument #4 to the function
PUSH 3 ; argument #3 to the function
PUSH 2 ; argument #2 to the function
PUSH 1 ; argument #1 to the function
MOV EAX, 0xDEADBEEF ; put the address of the function to call
CALL EAX
Note that the argument are pushed in reverse order. After the CALL instruction EAX contains the return value, and the stack will be just like it was before (i.e. the function pops its own arguments). The ECX and EDX registers may contain garbage, so don't rely on them keeping their values after the call.
A direct CALL instruction won't work, because those are position dependent.
To avoid zeros in the address itself try any of the null-free tricks for x86 shellcode, there are many out there but my favorite (albeit lengthy) is encoding the values using XOR instructions:
MOV EAX, 0xDEADBEEF ^ 0xFFFFFFFF ; your value xor'ed against an arbitrary mask
XOR EAX, 0xFFFFFFFF ; the arbitrary mask
You can also try NEG EAX or NOT EAX (sign inversion and bit flipping) to see if they work, it's much cheaper (two bytes each).
You can get help on the different API functions you can call here: http://msdn.microsoft.com
The most important ones you'll need are probably the following:
WinExec(): http://msdn.microsoft.com/en-us/library/ms687393(VS.85).aspx
LoadLibrary(): http://msdn.microsoft.com/en-us/library/windows/desktop/ms684175(v=vs.85).aspx
GetProcAddress(): http://msdn.microsoft.com/en-us/library/ms683212%28v=VS.85%29.aspx
The first launches a command, the next two are for loading DLL files and getting the addresses of its functions.
Here's a complete tutorial on writing Windows shellcodes: http://www.codeproject.com/Articles/325776/The-Art-of-Win32-Shellcoding
Assembly language is defined by your processor, and assembly syntax is defined by the assembler (hence, at&t, and intel syntax) The main difference (at least i think it used to be...) is that windows is real-mode (call the actual interrupts to do stuff, and you can use all the memory accessible to your computer, instead of just your program) and linux is protected mode (You only have access to memory in your program's little cubby of memory, and you have to call int 0x80 and make calls to the kernel, instead of making calls to the hardware and bios) Anyway, hello world type stuff would more-or-less be the same between linux and windows, as long as they are compatible processors.
To get the shellcode from your program you've made, just load it into your target system's
debugger (gdb for linux, and debug for windows) and in debug, type d (or was it u? Anyway, it should say if you type h (help)) and between instructions and memory will be the opcodes.
Just copy them all over to your text editor into one string, and maybe make a program that translates them all into their ascii values. Not sure how to do this in gdb tho...
Anyway, to make it into a bof exploit, enter aaaaa... and keep adding a's until it crashes
from a buffer overflow error. But find exactly how many a's it takes to crash it. Then, it should tell you what memory adress that was. Usually it should tell you in the error message. If it says '9797[rest of original return adress]' then you got it. Now u gotta use ur debugger to find out where this was. disassemble the program with your debugger and look for where scanf was called. Set a breakpoint there, run and examine the stack. Look for all those 97's (which i forgot to mention is the ascii number for 'a'.) and see where they end. Then remove breakpoint and type the amount of a's you found out it took (exactly the amount. If the error message was "buffer overflow at '97[rest of original return adress]" then remove that last a, put the adress you found examining the stack, and insert your shellcode. If all goes well, you should see your shellcode execute.
Happy hacking...

Resources