Identify and intercept function call - windows

I'm developing a launcher for a game.
Want to intercept game's call for a function that prints text.
I don't know whether the code that contains this function is dynamically linked or statically. So I dont even know the function name.
I did intercepted some windows-api calls of this game through microsoft Detours, Ninject and some others.
But this one is not in import table either.
What should I do to catch this function call? What profiler should be used? IDA? How this could be done?
EDIT:
Finally found function address. Thanks, Skino!
Tried to hook it with Detours, injected dll. Injected DllMain:
typedef int (WINAPI *PrintTextType)(char *, int, float , int);
static PrintTextType PrintText_Origin = NULL;
int WINAPI PrintText_Hooked(char * a, int b, float c, int d)
{
return PrintText_Origin(a, b, c , d);
}
HMODULE game_dll_base;
/* game_dll_base initialization goes here */
BOOL APIENTRY DllMain(HINSTANCE hinstDLL, DWORD fdwReason, LPVOID lpvReserved)
{
if(fdwReason==DLL_PROCESS_ATTACH)
{
DisableThreadLibraryCalls(hinstDLL);
DetourTransactionBegin();
DetourUpdateThread(GetCurrentThread());
PrintText_Origin = (PrintTextType)((DWORD)game_dll_base + 0x6049B0);
DetourAttach((PVOID *)&PrintText_Origin , PrintText_Hooked);
DetourTransactionCommit();
}
}
It hooks as expected. Parameter a has text that should be displayed. But when calling original function return PrintText_Origin (a, b, c , d); application crashes(http://i46.tinypic.com/ohabm.png, http://i46.tinypic.com/dfeh4.png)
Original function disassembly:
http://pastebin.com/1Ydg7NED
After Detours:
http://pastebin.com/eM3L8EJh
EDIT2:
After Detours:
http://pastebin.com/GuJXtyad
PrintText_Hooked disassembly http://pastebin.com/FPRMK5qt w3_loader.dll is the injected dll
Im bad at ASM, please tell what can be wrong ?

Want to intercept game's call for a function that prints text.
You can use a debugger for the investigative phase. Either IDA, or even Visual Studio (in combination with e.g. HxD), should do. It should be relatively easy to identify the function using the steps below:
Identify a particular fragment of text whose printing you want to trace (e.g. Hello World!)
Break the game execution at any point before the game normally prints the fragment you identified above
Search for that fragment of text† (look for either Unicode or ANSI) in the game's memory. IDA will allow you to do that IIRC, as will the free HxD (Extras > Open RAM...)
Once the address of the fragment has been identified, set a break-on-access/read data breakpoint so the debugger will give you control the moment the game attempts to read said fragment (while or immediately prior to displaying it)
Resume execution, wait for the data breakpoint to trigger
Inspect the stack trace and look for a suitable candidate for hooking
Step through from the moment the fragment is read from memory until it is printed if you want to explore additional potential hook points
†provided text is not kept compressed (or, for whatever reason, encrypted) until the very last moment
Once you are done with the investigative phase and you have identified where you'd like to inject your hook, you have two options when writing your launcher:
If, based on the above exercise, you were able to identify an export/import after all, then use any API hooking techniques
EDIT Use Microsoft Detours, making sure that you first correctly identify the calling convention (cdecl, fastcall, stdcall) of the function you are trying to detour, and use that calling convention for both the prototype of the original as well as for the implementation of the dummy. See examples.
If not, you will have to
use the Debugging API to programatically load the game
compute the hook address based on your investigative phase (either as a hard-coded offset from the module base, or by looking for the instruction bytes around the hook site‡)
set a breakpoint
resume the process
wait for the breakpoint to trigger, do whatever you have to do
resume execution, wait for the next trigger etc. again, all done programatically by your launcher via the Debugging API.
‡to be able to continue to work with eventual patch releases of the game

At this stage it sounds like you don't have a notion of what library function you're trying to hook, and you've stated it's not (obviously at least) an imported external function in the import table which probably means that the function responsible for generating the text is likely located inside the .text of the application you are disassembling directly or loaded dynamically, the text generation (especially in a game) is likely a part of the application.
In my experience, this simplest way to find code that is difficult to trace such as this is by stopping the application shortly during or before/after text is displayed and using IDA's fabulous call-graph functionality to establish what is responsible for writing it out (use watches and breakpoints liberally!)
Look carefully to calls to CreateRemoteThread or any other commonly used dynamic loading mechanism if you have reason to believe this functionality might be provided by an exported function that isn't showing up in the import table.
I strongly advice against it but for the sake of completeness, you could also hook NtSetInformationThread in the system service dispatch table. here's a good dump of the table for different Windows versions here. If you want to get the index in the table yourself you can just disassemble the NtSetInformationThread export from ntdll.dll.

Related

How can an ebpf program change kernel execution flow or call kernel functions?

I'm trying to figure out how an ebpf program can change the outcome of a function (not a syscall, in my case) in kernel space. I've found numerous articles and blog posts about how ebpf turns the kernel into a programmable kernel, but it seems like every example is just read-only tracing and collecting statistics.
I can think of a few ways of doing this: 1) make a kernel application read memory from an ebpf program, 2) make ebpf change the return value of a function, 3) allow an ebpf program to call kernel functions.
The first approach does not seem like a good idea.
The second would be enough, but as far as I understand it's not easy. This question says syscalls are read-only. This bcc document says it is possible but the function needs to be whitelisted in the kernel. This makes me think that the whitelist is fixed and can only be changed by recompiling the kernel, is this correct?
The third seems to be the most flexible one, and this blog post encouraged me to look into it. This is the one I'm going for.
I started with a brand new 5.15 kernel, which should have this functionality
As the blog post says, I did something no one should do (security is not an issue since I'm just toying with this) and opened every function to ebpf by adding this to net/core/filter.c (which I'm not sure is the correct place to do so):
static bool accept_the_world(int off, int size,
enum bpf_access_type type,
const struct bpf_prog *prog,
struct bpf_insn_access_aux *info)
{
return true;
}
bool export_the_world(u32 kfunc_id)
{
return true;
}
const struct bpf_verifier_ops all_verifier_ops = {
.check_kfunc_call = export_the_world,
.is_valid_access = accept_the_world,
};
How does the kernel know of the existence of this struct? I don't know. None of the other bpf_verifier_ops declared are used anywhere else, so it doesn't seem like there is a register_bpf_ops
Next I was able to install bcc (after a long fight due to many broken installation guides).
I had to checkout v0.24 of bcc. I read somewhere that pahole is required when compiling the kernel, so I updated mine to v1.19.
My python file is super simple, I just copied the vfs example from bcc and simplified it:
bpf_text_kfunc = """
extern void hello_test_kfunc(void) __attribute__((section(".ksyms")));
KFUNC_PROBE(vfs_open)
{
stats_increment(S_OPEN);
hello_test_kfunc();
return 0;
}
"""
b = BPF(text=bpf_text_kfunc)
Where hello_test_kfunc is just a function that does a printk, inserted as a module into the kernel (it is present in kallsyms).
When I try to run it, I get:
/virtual/main.c:25:5: error: cannot call non-static helper function
hello_test_kfunc();
^
And this is where I'm stuck. It seems like it's the JIT that is not allowing this, but who exactly is causing this issue? BCC, libbpf or something else? Do I need to manually write bpf code to call kernel functions?
Does anyone have an example with code of what the lwn blog post I linked talks about actually working?
eBPF is fundamentally made to extend kernel functionality in very specific limited ways. Essentially a very advanced plugin system. One of the main design principles of the eBPF is that a program is not allowed to break the kernel. Therefor it is not possible to change to outcome of arbitrary kernel functions.
The kernel has facilities to call a eBPF program at any time the kernel wants and then use the return value or side effects from helper calls to effect something. The key here is that the kernel always knows it is doing this.
One sort of exception is the BPF_PROG_TYPE_STRUCT_OPS program type which can be used to replace function pointers in whitelisted structures.
But again, explicitly allowed by the kernel.
make a kernel application read memory from an ebpf program
This is not possible since the memory of an eBPF program is ephemaral, but you could define your own custom eBPF program type and pass in some memory to be modified to the eBPF program via a custom context type.
make ebpf change the return value of a function
Not possible unless you explicitly call a eBPF program from that function.
allow an ebpf program to call kernel functions.
While possible for a number for purposes, this typically doesn't give you the ability to change return values of arbitrary functions.
You are correct, certain program types are allowed to call some kernel functions. But these are again whitelisted as you discovered.
How does the kernel know of the existence of this struct?
Macro magic. The verifier builds a list of these structs. But only if the program type exists in the list of program types.
/virtual/main.c:25:5: error: cannot call non-static helper function
This seems to be a limitation of BCC, so if you want to play with this stuff you will likely have to manually compile your eBPF program and load it with libbpf or cilium/ebpf.

MFC: Conflicting information on how to clean up after using COleDataSource::DoDragDrop()

Can someone clear up all the conflicting information.
The documentation for COleDataSource::CacheData() says:
After the call to CacheData the ptd member of lpFormatEtc and the
contents of lpStgMedium are owned by the data object, not by the
caller.
The documentation for COleDataSource::CacheGlobalData() doesn't say that.
You find code examples of how to use COleDataSource::DoDragDrop() on places like Code Project which call COleDataSource::CacheGlobalData() then check the DoDragDrop() results and free the memory if it didn't drop:
DROPEFFECT dweffect=datasrc->DoDragDrop(DROPEFFECT_COPY);
// They say if operation wasn't accepted, or was canceled, we
// should call GlobalFree() to clean up.
if (dweffect==DROPEFFECT_NONE) {
GlobalFree(hgdrop);
}
You also have Q182219 (which for some unknown reason MS has totally broken all the good old MSKB links and you can't find any information anymore (links all over the Internet are dead). Q182219 says something about after a successful move operation on NT based OS (Win10), it will return DROPEFFECT_NONE so you have to check if things really moved.
So the questions are:
1) Should you really free the memory if the drop operation returns DROPEFFECT_NONE or would MFC handle all that?
2) Does it still return DROPEFFFECT_NONE after a successful move operation or does MFC handle that internally (or fixed in some Windows version) ?
3) If it does return DROPEFFECT_NONE after a move, wouldn't logic like the above be a double free on memory if it was a move operation?
Extra:
You also find examples and usage of people doing COleDataSource mydatasource which is WRONG. You have to allocate it like COleDataSource *mydatasource=new ColeDataSource() and then at the end use mydatasource->ExternalRelease() to release it. (ExternalRelease() handles calling InternalRelease() if the object isn't using aggregation (that is a derived class that acts as a wrapper))
TIA!!

What exactly happened with the Lisp REPL on JPL's DS-1?

I've heard the Google talk (http://www.youtube.com/watch?v=_gZK0tW8EhQ) by Ron Garret and read the paper (http://www.flownet.com/gat/jpl-lisp.html), but I'm not understanding how it worked to "correct" supposedly running code with a REPL. Was the DS-1's Lisp code running is some sort of virtual machine? Or was it "live" in the REPL's actual world? Or was the Lisp code an executable that got replaced? What exactly happened/happens when you dynamically change running Lisp code through a REPL?
Whereas most programs are built and distributed as an executable that contains only the necessary components to run the program, Lisp can be distributed as an image that contains not just the components for the specific program, but also much or all of the Lisp runtime and development environment.
The REPL is the quintessential mechanism for providing interactive access to a running Lisp environment. The two key components of the REPL, Read, and Eval, expose much of the Lisp runtime system. For example, many Lisp systems today implement Eval by compiling the provided form (that is read by the Reader), compiling the form to machine code, and then executing the result. This is in contrast to interpreting the form. Some systems, especially in the past, contained both an interpreter that executes quickly and is suitable for interactive access, and a compiler that produces better code. But modern systems are fast enough that the compiler phase isn't noticeable and simply forgo the interpreter.
Of course, you can do very similar things today. A simple example is running SSH to your Linux box that's hosting PHP. Your PHP server is up and running and live, serving pages and requests. But you login through SSH, go over and fix a PHP file, and as soon as you save that file, all of your users see the new result in real time -- the system updated itself on the fly.
The fact that PHP is running on a Linux runtime vs Lisp running on a Lisp runtime, is a detail. The effect is the same. The fact that PHP isn't compiled is a detail also. For example, you can do the same thing on a Java server: modify a JSP, save it, and the JSP is converted in to a Servlet as Java source code, then compiled on the fly by the Java runtime, then loaded in to the executing container, replacing the old code.
Lisps capability to do this is very nice, and it was very interesting far back in the day. Today, it's less so, as there are different system providing similar capabilities.
Addenda:
No, Lisp is not a virtual machine, there's no need for it to be that complicated.
The key to the concept is dynamic dispatch. With dynamic dispatch there is some lookup involved before a function is invoked.
In a static language like C, locations of things are pretty much set in stone once the linker and loader have finished processing the executable in preparation to start executing.
So, in C if you have something simple like:
int add(int i) {
return i + 1;
}
void main() {
add(1);
}
After compiling and linking and loading of the program, the address of the add function will be set in stone, and thus thing referring to that function will know exactly where to find it.
So, in assembly language: (note this is a contrived assembly language)
add: pop r1 ; pop R1 from the stack, loading the i parameter
add r1, 1; Add 1 to the parameter.
push r1 ; push result of function call
rts ; return from subroutine
main: push 1 ; Push parameter to function
call add ; call function
pop r1 ; gather (and ignore) the result
So, you can see here that add is fixed in place.
In something like Lisp, function are referred to indirectly.
int add(int i) {
return i + 1;
}
int *add_ptr() = &add;
void main() {
*(add_ptr)(1);
}
In assembly you get:
add: pop r1 ; pop R1 from the stack, loading the i parameter
add r1, 1; Add 1 to the parameter.
push r1 ; push result of function call
rts ; return from subroutine
add_ptr: dw add ; put the address of the add routine in add_ptr
main: push 1 ; Push parameter to function
mov r1, add_ptr ; Put the contents of add_ptr into R1
call (r1) ; call function indirectly through R1
pop r1 ; gather (and ignore) the result
Now, you can see here that rather than calling add directly, it is called indirectly through the add_ptr. In a Lisp runtime, it has the capability of compiling new code, and when that happens, add_ptr would be overwritten to point to the newly compiled code. You can see how the code in main never has to change, it will call whatever function add_ptr is pointing to.
Since most all of the functions in Lisp are indirectly referenced through their symbols, a lot can change "behind the back" of a running system, and the system, will continue to run.
When a function is recompiled, the old function code (assuming no other references) become eligible for garbage collection, and will, typically, eventually go away.
You can also see that when the system is garbage collected, any code that is moved (such as the code for the add function) can be moved by the runtime, and it's new location updated in the add_ptr so the system will continue operating even after code and been relocated by the garbage collector.
So, the key to it all, is to have your functions invoked through some lookup mechanism. Having this gives you a lot of flexibility.
Note, you can also do this is a running C system, for example. You can put code in a dynamic library, load the library, execute the code, and if you want you can build a new dynamic library, close the old one, open the new one, and invoke the new code -- all in a "running" system. The dynamic library interface provides the lookup mechanism that isolates the code.

Some Windows API calls fail unless the string arguments are in the system memory rather than local stack

We have an older massive C++ application and we have been converting it to support Unicode as well as 64-bits. The following strange thing has been happening:
Calls to registry functions and windows creation functions, like the following, have been failing:
hWnd = CreateSysWindowExW( ExStyle, ClassNameW.StringW(), Label2.StringW(), Style,
Posn.X(), Posn.Y(),
Size.X(), Size.Y(),
hParentWnd, (HMENU)Id,
AppInstance(), NULL);
ClassNameW and Label2 are instances of our own Text class which essentially uses malloc to allocate the memory used to store the string.
Anyway, when the functions fail, and I call GetLastError it returns the error code for "invalid memory access" (though I can inspect and see the string arguments fine in the debugger). Yet if I change the code as follows then it works perfectly fine:
BSTR Label2S = SysAllocString(Label2.StringW());
BSTR ClassNameWS = SysAllocString(ClassNameW.StringW());
hWnd = CreateSysWindowExW( ExStyle, ClassNameWS, Label2S, Style,
Posn.X(), Posn.Y(),
Size.X(), Size.Y(),
hParentWnd, (HMENU)Id,
AppInstance(), NULL);
SysFreeString(ClassNameWS); ClassNameWS = 0;
SysFreeString(Label2S); Label2S = 0;
So what gives? Why would the original functions work fine with the arguments in local memory, but when used with Unicode, the registry function require SysAllocString, and when used in 64-bit, the Windows creation functions also require SysAllocString'd string arguments? Our Windows procedure functions have all been converted to be Unicode, always, and yes we use SetWindowLogW call the correct default Unicode DefWindowProcW etc. That all seems to work fine and handles and draws Unicode properly etc.
The documentation at http://msdn.microsoft.com/en-us/library/ms632679%28v=vs.85%29.aspx does not say anything about this. While our application is massive we do use debug heaps and tools like Purify to check for and clean up any memory corruption. Also at the time of this failure, there is still only one main system thread. So it is not a thread issue.
So what is going on? I have read that if string arguments are marshalled anywhere or passed across process boundaries, then you have to use SysAllocString/BSTR, yet we call lots of API functions and there is lots of code out there which calls these functions just using plain local strings?
What am I missing? I have tried Googling this, as someone else must have run into this, but with little luck.
Edit 1: Our StringW function does not create any temporary objects which might go out of scope before the actual API call. The function is as follows:
Class Text {
const wchar_t* StringW () const
{
return TextStartW;
}
wchar_t* TextStartW; // pointer to current start of text in DataArea
I have been running our application with the debug heap and memory checking and other diagnostic tools, and found no source of memory corruption, and looking at the assembly, there is no sign of temporary objects or invalid memory access.
BUT I finally figured it out:
We compile our code /Zp1, which means byte aligned memory allocations. SysAllocString (in 64-bits) always return a pointer that is aligned on a 8 byte boundary. Presumably a 32-bit ANSI C++ application goes through an API layer to the underlying Unicode windows DLLs, which would also align the pointer for you.
But if you use Unicode, you do not get that incidental pointer alignment that the conversion mapping layer gives you, and if you use 64-bits, of course the situation will get even worse.
I added a method to our Text class which shifts the string pointer so that it is aligned on an eight byte boundary, and viola, everything runs fine!!!
Of course the Microsoft people say it must be memory corruption and I am jumping the wrong conclusion, but there is evidence it is not the case.
Also, if you use /Zp1 and include windows.h in a 64-bit application, the debugger will tell you sizeof(BITMAP)==28, but calling GetObject on a bitmap will fail and tell you it needs a 32-byte structure. So I suspect that some of Microsoft's API is inherently dependent on aligned pointers, and I also know that some optimized assembly (I have seen some from Fortran compilers) takes advantage of that and crashes badly if you ever give it unaligned pointers.
So the moral of all of this is, dont use "funky" compiler arguments like /Zp1. In our case we have to for historical reasons, but the number of times this has bitten us...
Someone please give me a "this is useful" tick on my answer please?
Using a bit of psychic debugging, I'm going to guess that the strings in your application are pooled in a read-only section.
It's possible that the CreateSysWindowsEx is attempting to write to the memory passed in for the window class or title. That would explain why the calls work when allocated on the heap (SysAllocString) but not when used as constants.
The easiest way to investigate this is to use a low level debugger like windbg - it should break into the debugger at the point where the access violation occurs which should help figure out the problem. Don't use Visual Studio, it has a nasty habit of being helpful and hiding first chance exceptions.
Another thing to try is to enable appverifier on your application - it's possible that it may show something.
Calling a Windows API function does not cross the process boundary, since the various Windows DLLs are loaded into your process.
It sounds like whatever pointer that StringW() is returning isn't valid when Windows is trying to access it. I would look there - is it possible that the pointer returned it out of scope and deleted shortly after it is called?
If you share some more details about your string class, that could help diagnose the problem here.

How to find out caller info?

This will require some background. I am using Detours to intercept system calls. For those of who don't know what Detours is - it is a tool which redirects call to system functions to a detour function which allows us to do whatever we want to do before and after the actual system call is made. What I want to know is that if it is possible to find out somehow any info about the dll/module which has made this system call? Does any win32 api function help me do this?
Lets say traceapi.dll makes a system call to GetModuleFileNameW() inside kernel32.dll. Detour will intercept this call and redirect control to a detour function (say Mine_GetModuleFileNameW()). Now inside Mine_GetModuleFileNameW(), is it possible to find out that this call originated from traceapi?
call ZwQuerySystemInformation with first argument SystemProcessesAndThreadsInformation.
once you have the returned buf, typecast it to PSYTSTEM+PROCESS_INFORMATION and use its field to extract your info.
status = ZwQuerySystemInformation (
SystemProcessesAndThreadsInformation, buf, bufsize, NULL);
PSYSTEM_PROCESS_INFORMATION proc_info = (PSYSTEM_PROCESS_INFORMATION) buf;
proc_info->ProcessName, which is a UNICODE_STRING will give you the calling process name.
Please note that the structure and field I am talking about is not documented and might change in future release of windows. However, I am using it and it works fine on WIN XP and above.
I don't know how many stack frames will be on the stack that are owned by Detours code. Easy to find out in the debugger, the odds are good that there are none. That makes it easy, use the _ReturnAddress intrinsic to get the caller's address. VirtualQuery() to get the base address, cast it to HMODULE and use GetModuleFileName(). Well, the non-detoured one :)
If there are Detours stack frames then it gets a lot harder. StackWalk64() to skip them, perilous if there are FPO frames present.

Resources