How to find out caller info?

How to find out caller info? - windows

This will require some background. I am using Detours to intercept system calls. For those of who don't know what Detours is - it is a tool which redirects call to system functions to a detour function which allows us to do whatever we want to do before and after the actual system call is made. What I want to know is that if it is possible to find out somehow any info about the dll/module which has made this system call? Does any win32 api function help me do this?
Lets say traceapi.dll makes a system call to GetModuleFileNameW() inside kernel32.dll. Detour will intercept this call and redirect control to a detour function (say Mine_GetModuleFileNameW()). Now inside Mine_GetModuleFileNameW(), is it possible to find out that this call originated from traceapi?

call ZwQuerySystemInformation with first argument SystemProcessesAndThreadsInformation.
once you have the returned buf, typecast it to PSYTSTEM+PROCESS_INFORMATION and use its field to extract your info.
status = ZwQuerySystemInformation (
SystemProcessesAndThreadsInformation, buf, bufsize, NULL);
PSYSTEM_PROCESS_INFORMATION proc_info = (PSYSTEM_PROCESS_INFORMATION) buf;
proc_info->ProcessName, which is a UNICODE_STRING will give you the calling process name.
Please note that the structure and field I am talking about is not documented and might change in future release of windows. However, I am using it and it works fine on WIN XP and above.

I don't know how many stack frames will be on the stack that are owned by Detours code. Easy to find out in the debugger, the odds are good that there are none. That makes it easy, use the _ReturnAddress intrinsic to get the caller's address. VirtualQuery() to get the base address, cast it to HMODULE and use GetModuleFileName(). Well, the non-detoured one :)
If there are Detours stack frames then it gets a lot harder. StackWalk64() to skip them, perilous if there are FPO frames present.

Related

what is the purpose of the BeingDebugged flag in the PEB structure?

What is the purpose of this flag (from the OS side)?
Which functions use this flag except isDebuggerPresent?
thanks a lot

It's effectively the same, but reading the PEB doesn't require a trip through kernel mode.
More explicitly, the IsDebuggerPresent API is documented and stable; the PEB structure is not, and could, conceivably, change across versions.
Also, the IsDebuggerPresent API (or flag) only checks for user-mode debuggers; kernel debuggers aren't detected via this function.
Why put it in the PEB? It saves some time, which was more important in early versions of NT. (There are a bunch of user-mode functions that check this flag before doing some runtime validation, and will break to the debugger if set.)
If you change the PEB field to 0, then IsDebuggerPresent will also return 0, although I believe that CheckRemoteDebuggerPresent will not.

As you have found the IsDebuggerPresent flag reads this from the PEB. As far as I know the PEB structure is not an official API but IsDebuggerPresent is so you should stick to that layer.
The uses of this method are quite limited if you are after a copy protection to prevent debugging your app. As you have found it is only a flag in your process space. If somebody debugs your application all he needs to do is to zero out the flag in the PEB table and let your app run.
You can raise the level by using the method CheckRemoteDebuggerPresent where you pass in your own process handle to get an answer. This method goes into the kernel and checks for the existence of a special debug structure which is associated with your process if it is beeing debugged. A user mode process cannot fake this one but you know there are always ways around by simply removing your check ....

WINAPI: GetModuleHandle and increment refcount

How can I increment refcount of the HMODULE returned by the GetModuleHandle? Can I DuplicateHandle, or I need to go through hops, retrieve module's path and then LoarLibrary on that path? In short, I want to emulate GetModuleHandleEx without using this function (which is XP+).

You cannot use DuplicateHandle() on a HMODULE. The MSDN Library article lists the kind of handles that DH will accept in the Remarks section, a module handle is not one of them.
One reason for this is that a HMODULE is not actually a handle at all, it is a pseudo handle. There's history behind this, back in the 16-bit versions of Windows they actually were handles. But that disappeared in the 32-bit version, they are now simply the address of the module where it is loaded in memory. One pretty standard trick to convert a code address to a module handle is to use VirtualQuery() and cast the returned MEMORY_BASIC_INFORMATION.BaseAddress to (HMODULE). Very handy sometimes.
Yes, the only other way to increment the reference count is to use LoadLibrary().

IoGetDeviceObjectPointer() fails with no return status

This is my code:
UNICODE_STRING symbol;
WCHAR ntNameBuffer[128];
swprintf(ntNameBuffer, L"\\Device\\Harddisk1\\Partition1");
RtlInitUnicodeString(&symbol, ntNameBuffer);
KdPrint(("OSNVss:symbol is %ws\n",symbol.Buffer));
status = IoGetDeviceObjectPointer(&symbol,
FILE_READ_DATA,
&pDiskFileObject,
&pDiskDeviceObject);
My driver is next-lower-level of \\Device\\Harddisk1\\Partition1.
When I call IoGetDeviceObjectPointer it will fail and no status returns and it not continue do remaining code.
When I use windbg debug this, it will break with a intelpm.sys;
If I change the objectname to "\\Device\\Harddisk1\\Partition2" (the partition2 is really existing), it will success call
If I change objectname to "\\Device\\Harddisk1\\Partition3", (the partition3 is not existing), it failed and return status = 0xc0000034, mean objectname is not existing.
Does anybody know why when I use object "\\Device\\Harddisk1\\Partition1" it fails and no return status? thanks very much!

First and foremost: what are you trying to achieve and what driver model are you using? What bitness, what OS versions are targeted and on which OS version does it fail? Furthermore: you are at the correct IRQL for the call and is running inside a system thread, right? From which of your driver's entry points (IRP_MJ_*, DriverEntry ...) are you calling this code?
Anyway, was re-reading the docs on this function. Noting in particular the part:
The IoGetDeviceObjectPointer routine returns a pointer to the top object in the named device object's stack and a pointer to the
corresponding file object, if the requested access to the objects can
be granted.
and:
IoGetDeviceObjectPointer establishes a "connection" between the caller
and the next-lower-level driver. A successful caller can use the
returned device object pointer to initialize its own device objects.
It can also be used as as an argument to IoAttachDeviceToDeviceStack,
IoCallDriver, and any routine that creates IRPs for lower drivers. The
returned pointer is a required argument to IoCallDriver.
You don't say, but if you are doing this on a 32bit system, it may be worthwhile tracking down what's going on with IrpTracker. However, my guess is that said "connection" or rather the request for it gets somehow swallowed by the next-lower-level driver or so.
It is also hard to say what kind of driver you are writing here (and yes, this can be important).
Try not just breaking at a particular point before or after the fact but rather follow the stack that the IRP would travel downwards in the target device object's stack.
But thinking about it, you probably aren't attached to the stack at all (for whatever reason). Could it be that you actually should be using IoGetDiskDeviceObject instead, in order to get the actual underlying device object (at the bottom of the stack) and not a reference to the top-level object attached?
Last but not least: don't forget you can also ask this question over on the OSR mailing lists. There are plenty of seasoned professionals there who may have run into the exact same problem (assuming you are doing all of the things correct that I asked about).

thanks everyone , I solve this problem; what cause this problem is it becoming synchronous; when I
call IoGetDeviceObjectPointer , it will generate an new Irp IRP_MJ_WRITER which pass though from high level, when this irp reach my driver, my thread which handle IRP is the same thread whilch call IoGetDeviceObjectPointer ,so it become drop-dead halt;

COM memory management

I have some questions regarding COM memory management:
I have a COM method:
STDMETHODIMP CWhitelistPolicy::GetWebsitesStrings(SAFEARRAY** result)
result = SAFEARRAY(BSTR). If I receive another SAFEARRAY(BSTR) from another interface method(in order to set *result) do I have to make copies of the strings received in order to pass them to *result and outside client? Or considering I will not use the strings for myself I can just pass them to the client (and passing out the ownership)?
2.
STDMETHODIMP CWhitelistPolicy::SetWebsitesStrings(SAFEARRAY* input)
Here I receive a BSTR array as input. Again my method is responsible for the memory allocated in input?
3.
STDMETHOD(SetUsers)(SAFEARRAY* input);
Here I call a method on another interface (SetUsers) and I allocate memory for the input SAFEARRAY. After I call SetUsers I can dispose of the SAFEARRAY? Memory is always copied when marshaling takes place isn't it? (in my case SetUsers method is called on an interface that is hosted as a COM DLL inside my process)

The way I think about it to answer questions like this is to think about a COM call that crosses machines. Then it's obvious for an [out] param; I the caller own and have to free the memory because the remote marshaling layer can't do it. For [in] parameters, it's obvious the marshaling layer must copy my data and again the remote marshaling layer can't free what I passed in.
A core tenet in COM is location neutrality, the rules when calling in the same apartment are the rules when using DCOM across machines.
You're responsible to free - you don't pass ownership when you call the next fnc because it could be remote and getting a copy, not your original data.
No - as the callee, you don't have to free it. If it's intra-apartment, it's the memory the caller provided and the caller has to free it. If it's a remote call, the server stub allocates it and will free it when the method returns.
Yes, you free it - no, it's not always copied (it might be), which is why the answer to 2 is no. If it's copied, there's a stub that allocated and the stub will free it.
Note my answers to your questions didn't cover the case of [in,out] parameters - see the so question Who owns returned BSTR? for some more details on this case.
Com allocation rules are complicated but rational. Get the book "essential com" by Don Box if you want to understand/see examples of all the cases. Still you're going to make mistakes so you should have a strategy for detecting them. I use gflags (part of Windbg) and its heap checking flags to catch when a double free occurs (a debug message is displayed and execution halted at the call with an INT 3). Vstudio's debugger used to turn them on for you when it launched the executable (it likely still does) but you can force them on with gflags under the image options tab.
You should also know how to use UMDH (also part of windbg) to detect leaks. DebugDiag is the newer tool for this and seems easier to use, but sadly, you can only have the 32 bit or 64 bit version installed, but not both.
The problem then are BSTRs, which are cached, make detecting double frees and leaks tricky because interacting with the heap is delayed. You can shut off the ole string cache by setting the environment variable OANOCACHE to 1 or calling the function SetOaNoCache. The function's not defined in a header file so see this SO question Where is SetOaNoCache defined?. Note the accepted answer shows the hard way to call it through GetProcAddress(). The answer below the accepted one shows all you need is an extern "C" as it's in the oleaut32 export lib. Finally, see this Larry Osterman blog post for a more detailed description of the difficulties caused by the cache when hunting leaks.

Some Windows API calls fail unless the string arguments are in the system memory rather than local stack

We have an older massive C++ application and we have been converting it to support Unicode as well as 64-bits. The following strange thing has been happening:
Calls to registry functions and windows creation functions, like the following, have been failing:
hWnd = CreateSysWindowExW( ExStyle, ClassNameW.StringW(), Label2.StringW(), Style,
Posn.X(), Posn.Y(),
Size.X(), Size.Y(),
hParentWnd, (HMENU)Id,
AppInstance(), NULL);
ClassNameW and Label2 are instances of our own Text class which essentially uses malloc to allocate the memory used to store the string.
Anyway, when the functions fail, and I call GetLastError it returns the error code for "invalid memory access" (though I can inspect and see the string arguments fine in the debugger). Yet if I change the code as follows then it works perfectly fine:
BSTR Label2S = SysAllocString(Label2.StringW());
BSTR ClassNameWS = SysAllocString(ClassNameW.StringW());
hWnd = CreateSysWindowExW( ExStyle, ClassNameWS, Label2S, Style,
Posn.X(), Posn.Y(),
Size.X(), Size.Y(),
hParentWnd, (HMENU)Id,
AppInstance(), NULL);
SysFreeString(ClassNameWS); ClassNameWS = 0;
SysFreeString(Label2S); Label2S = 0;
So what gives? Why would the original functions work fine with the arguments in local memory, but when used with Unicode, the registry function require SysAllocString, and when used in 64-bit, the Windows creation functions also require SysAllocString'd string arguments? Our Windows procedure functions have all been converted to be Unicode, always, and yes we use SetWindowLogW call the correct default Unicode DefWindowProcW etc. That all seems to work fine and handles and draws Unicode properly etc.
The documentation at http://msdn.microsoft.com/en-us/library/ms632679%28v=vs.85%29.aspx does not say anything about this. While our application is massive we do use debug heaps and tools like Purify to check for and clean up any memory corruption. Also at the time of this failure, there is still only one main system thread. So it is not a thread issue.
So what is going on? I have read that if string arguments are marshalled anywhere or passed across process boundaries, then you have to use SysAllocString/BSTR, yet we call lots of API functions and there is lots of code out there which calls these functions just using plain local strings?
What am I missing? I have tried Googling this, as someone else must have run into this, but with little luck.
Edit 1: Our StringW function does not create any temporary objects which might go out of scope before the actual API call. The function is as follows:
Class Text {
const wchar_t* StringW () const
{
return TextStartW;
}
wchar_t* TextStartW; // pointer to current start of text in DataArea

I have been running our application with the debug heap and memory checking and other diagnostic tools, and found no source of memory corruption, and looking at the assembly, there is no sign of temporary objects or invalid memory access.
BUT I finally figured it out:
We compile our code /Zp1, which means byte aligned memory allocations. SysAllocString (in 64-bits) always return a pointer that is aligned on a 8 byte boundary. Presumably a 32-bit ANSI C++ application goes through an API layer to the underlying Unicode windows DLLs, which would also align the pointer for you.
But if you use Unicode, you do not get that incidental pointer alignment that the conversion mapping layer gives you, and if you use 64-bits, of course the situation will get even worse.
I added a method to our Text class which shifts the string pointer so that it is aligned on an eight byte boundary, and viola, everything runs fine!!!
Of course the Microsoft people say it must be memory corruption and I am jumping the wrong conclusion, but there is evidence it is not the case.
Also, if you use /Zp1 and include windows.h in a 64-bit application, the debugger will tell you sizeof(BITMAP)==28, but calling GetObject on a bitmap will fail and tell you it needs a 32-byte structure. So I suspect that some of Microsoft's API is inherently dependent on aligned pointers, and I also know that some optimized assembly (I have seen some from Fortran compilers) takes advantage of that and crashes badly if you ever give it unaligned pointers.
So the moral of all of this is, dont use "funky" compiler arguments like /Zp1. In our case we have to for historical reasons, but the number of times this has bitten us...
Someone please give me a "this is useful" tick on my answer please?

Using a bit of psychic debugging, I'm going to guess that the strings in your application are pooled in a read-only section.
It's possible that the CreateSysWindowsEx is attempting to write to the memory passed in for the window class or title. That would explain why the calls work when allocated on the heap (SysAllocString) but not when used as constants.
The easiest way to investigate this is to use a low level debugger like windbg - it should break into the debugger at the point where the access violation occurs which should help figure out the problem. Don't use Visual Studio, it has a nasty habit of being helpful and hiding first chance exceptions.
Another thing to try is to enable appverifier on your application - it's possible that it may show something.

Calling a Windows API function does not cross the process boundary, since the various Windows DLLs are loaded into your process.
It sounds like whatever pointer that StringW() is returning isn't valid when Windows is trying to access it. I would look there - is it possible that the pointer returned it out of scope and deleted shortly after it is called?
If you share some more details about your string class, that could help diagnose the problem here.

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio