Interop, overlapped I/O, handles: use SafeHandle or pin?

When you pass an unmanaged handle (stored in either an IntPtr or a SafeHandle on the managed side) from managed to unmanaged code to do overlapped I/O, what is the correct approach?
Use a SafeHandle to wrap the (IntPtr) OS handle,
or use GCHandle.Alloc(IntPtrHandle, GCHandleType.Pinned) to pin it?
I am currently using a SafeHandle in combination with a NativeOverlapped structure, but I'm beginning to suspect more and more that GC is moving either or both around in memory while unmanaged overlapped IO is in progress.
Would I be better off going back to an IntPtr instead of a SafeHandle, and using a GCHandle structure to pin it?
Or is the right way a combination of both, i.e. in the NativeOverlapped, use a pinned version of the IntPtr, which in turn comes from the SafeHandle?
-- EDIT
Reflecting on this during lunch, I got the idea that I'm being stupid.
It must be the overlapped structure that needs to be pinned, not the handle in it. Is that the right (best) answer?

Related

How does GetWindowText get the name of a window owned by another process without a syscall to read that process's memory?

I wanted to figure out what the syscalls behind GetWindowText are. I wrote a simple program to call GetWindowText with a handle to a window in a different process.
#include <windows.h>

int CALLBACK WinMain(HINSTANCE hInstance, HINSTANCE hPrevInstance, LPSTR lpCmdLine, int nCmdShow)
{
    // Pause here so a debugger can be attached before the call of interest.
    MessageBoxA(0, "Attach debugger and set bp", "on GetWindowTextA", 0);
    // Look up a top-level window that belongs to another process.
    HWND winmine = FindWindowA(NULL, "Minesweeper");
    if (winmine != NULL)
    {
        char buf[255] = "";
        GetWindowTextA(winmine, buf, 254);
        MessageBoxA(0, buf, "Found", 0);
    }
    else
    {
        MessageBoxA(0, "?", "Found nothing", 0);
    }
    return 0;
}
I attached a debugger and stepped through the GetWindowTextA call, manually stepping through everything except these API calls (in order):
GetWindowThreadProcessId (in GetWindowLong)
InterlockedIncrement
WCSToMBEx (which is basically WideCharToMultiByte)
InterlockedDecrement
None of these API calls seems able to read a string in memory not owned by the calling process. I used a user-mode debugger, so I certainly didn't end up in kernel mode while stepping without realizing it. This means that GetWindowText got the window name without performing a context switch. Which seems to imply that the text for every window that exists is accessible without a context switch, and that can't be right because there's no way Windows keeps a copy of the text for every single window/control on the system, in every single process.
I have read this article. It mentions that window names are stored in, quote, "a special place", but does not explain how this "special place" can be accessed from a different process without a syscall or context switch.
So I'm looking for any explanations as to how this is done. Any information you can provide is greatly appreciated.
"GetWindowText got the window name without performing a context switch. Which seems to imply that the text for every window that exists is accessible without a context switch."
This info is stored in memory that is shared between all the processes that use user32.dll. You can try searching the virtual address space of your process for the Unicode names of other processes' windows.
It gets mapped into the process address space during user32.dll loading. There are some kernel structures/sections involved: win32k!gSharedInfo, win32k!ghSectionShared, win32k!gpsi and others (which I don't know of).
Actually, the lower 16 bits of an HWND represent an index into the window info array, whose base address is *(&user32!gSharedInfo + 1). The first field of each window info entry is the kernel address of another structure that contains all the shared window information. Subtract the difference between the kernel address of the section and its user-space mapping (which is stored in TEB!Win32ClientInfo) and you get the user-mode address of that information.
user32!ValidateHwnd is the function that converts a window handle into this address, which inner user32 functions like user32!DefWindowProcWorker can then use.
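The lookup described above can be pictured roughly like this. It is only a sketch of the arithmetic: HandleEntry, SharedView and ResolveHandle are hypothetical names, and the real user32/win32k structures are undocumented and laid out differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins; the real user32/win32k layouts are undocumented.
struct HandleEntry {
    std::uintptr_t kernelAddress; // kernel-side address of the window object
};

struct SharedView {
    std::vector<HandleEntry> entries; // the "window info array"
    std::uintptr_t kernelBase;        // kernel address of the shared section
    std::uintptr_t userBase;          // where that section is mapped in this process
};

// Translate a window handle into a user-mode address: index the table with
// the low 16 bits, then subtract the kernel/user mapping delta.
std::uintptr_t ResolveHandle(const SharedView& view, std::uintptr_t hwnd) {
    const std::size_t index = hwnd & 0xFFFF;
    assert(index < view.entries.size());
    const std::uintptr_t delta = view.kernelBase - view.userBase;
    return view.entries[index].kernelAddress - delta;
}
```

No system call appears anywhere in this path, which matches what the debugger showed: everything is plain reads from memory already mapped into the calling process.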
Pseudocode of GetWindowTextW looks like (excluding error-handling):
GetWindowTextW(HWND hwnd, wchar_t* buf, int size)
{
    inner_hwnd = ValidateHwnd(hwnd);
    if (TestWindowProcess(inner_hwnd))
        SendMessageWorker(inner_hwnd, WM_GETTEXT, size, buf, FALSE);
    else
        DefWindowProcWorker(inner_hwnd, WM_GETTEXT, size, buf, FALSE);
}
DefWindowProcWorker, which is called in your case with WM_GETTEXT, will just parse the structure referenced by inner_hwnd and copy the window's name into buf.
"it seems that the text for EDITTEXT controls are not stored in this manner"
I never knew all the info that was stored in there, though it seems like a good choice not to pollute processes' virtual address space with all kinds of user/gdi params. Besides, lower-integrity processes should not be able to get higher-integrity processes' sensitive information.
"because there's no way Windows keeps a copy of the text for every single window"
The text most certainly exists, just not as a copy. The text for a window is stored in the virtual memory of the process that owns the window. Might be in RAM, not terribly likely if the process has been dormant for a while, definitely on disk in the paging file. Which doesn't stop GetWindowText() from making a copy. On-the-fly, when you call it.
GetWindowText() is limited: it is documented as only being capable of copying the caption text of a window, so it probably uses the desktop heap for the session to retrieve the text. That is not a restriction on a winapi function like SendMessage(); you can use WM_GETTEXT to obtain a gigabyte from an Edit control. That certainly crosses the process boundary.
As an operating system function, SendMessage can of course break all the rules that apply to normal processes. The OS has no trouble addressing the VM of an arbitrary process. Rules that are routinely broken, your debugger does it as well. With functions that you can use to also break the rules, ReadProcessMemory() and WriteProcessMemory().

When using CoTaskMemAlloc, should I always call CoTaskMemFree?

I'm writing some COM and ATL code, and for some reason all the code uses CoTaskMemAlloc to allocate memory instead of new or malloc. So I followed along this coding style and I also use CoTaskMemAlloc.
My teachers taught me to always delete or free when allocating memory. However I'm not sure if I should always be calling CoTaskMemFree if I use CoTaskMemAlloc?
Using the CRT's provided new/malloc and delete/free is a problem in COM interop. To make them work, it is very important that the same copy of the CRT both allocates and releases the memory. That's impossible to enforce in a COM interop scenario, your COM server and the client are practically guaranteed to use different versions of the CRT. Each using their own heap to allocate from. This causes undiagnosable memory leaks on Windows XP, a hard exception on Vista and up.
Which is why the COM heap exists, a single predefined heap in a process that's used both by the server and the client. IMalloc is the generic interface to access that shared heap, CoTaskMemAlloc() and CoTaskMemFree() are the system provided helper functions to use that interface.
That said, this is only necessary in a case where the server allocates memory and the client has to release it. Or the other way around. Which should always be rare in an interop scenario, the odds for accidents are just too large. In COM Automation there are just two such cases, a BSTR and a SAFEARRAY, types that are already wrapped. You avoid it in other cases by having the method caller provide the memory and the callee fill it in. Which also allows a strong optimization, the memory could come from the caller's stack.
Review the code and check who allocates the memory and who needs to release it. If both exist in the same module then using new/malloc is fine because there's now a hard guarantee that the same CRT instance takes care of it. If that's not the case then consider fixing it so the caller provides the memory and releases it.
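As a rough illustration of the "caller provides the memory, callee fills it in" pattern, here is a minimal sketch. GetDisplayName and the string it returns are made up for the example; a real COM method would return an HRESULT and take the size through a separate parameter.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical callee: copies a name into a buffer the caller owns.
// Returns the number of characters required (excluding the terminator),
// so the caller can detect truncation and retry with a bigger buffer.
std::size_t GetDisplayName(char* buffer, std::size_t capacity) {
    static const char kName[] = "whitelist-policy";
    const std::size_t needed = sizeof(kName) - 1;
    // A null buffer is only acceptable as a pure size query.
    assert(buffer != nullptr || capacity == 0);
    if (capacity > 0) {
        const std::size_t n = needed < capacity - 1 ? needed : capacity - 1;
        std::memcpy(buffer, kName, n);
        buffer[n] = '\0';
    }
    return needed;
}
```

The caller can pass a stack array (char name[32]; GetDisplayName(name, sizeof(name));), so no heap is shared between the two modules and nothing has to be freed across the boundary.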
The allocation and freeing of memory must always come from the same source. If you use CoTaskMemAlloc then you must use CoTaskMemFree to free the memory.
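Note that in C++, though, memory management and object construction/destruction (new/delete) are independent actions. It's possible to customize specific classes to use a different memory allocator and still allow the standard new/delete syntax, which is preferred. For example:
#include <objbase.h>   // CoTaskMemAlloc, CoTaskMemFree
#include <new>         // std::bad_alloc

class MyClass {
public:
    // operator new must not return null, so report failure by throwing.
    void* operator new(size_t size) {
        void* p = ::CoTaskMemAlloc(size);
        if (!p) throw std::bad_alloc();
        return p;
    }
    void* operator new[](size_t size) {
        void* p = ::CoTaskMemAlloc(size);
        if (!p) throw std::bad_alloc();
        return p;
    }
    // CoTaskMemFree accepts a null pointer, so no check is needed here.
    void operator delete(void* pMemory) {
        ::CoTaskMemFree(pMemory);
    }
    void operator delete[](void* pMemory) {
        ::CoTaskMemFree(pMemory);
    }
};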
Now I can use this type just like any other C++ type, and yet the memory will come from the COM heap:
// Normal object construction but memory comes from CoTaskMemAlloc
MyClass *pClass = new MyClass();
...
// Normal object destruction and memory freed from CoTaskMemFree
delete pClass;
The answer to the question is: Yes, you should use CoTaskMemFree to free memory allocated with CoTaskMemAlloc.
The other answers do a good job explaining why CoTaskMemAlloc and CoTaskMemFree are necessary for memory passed between COM servers and COM clients, but they didn't directly answer your question.
Your teacher was right: You should always use the corresponding release function for any resource. If you use new, use delete. If you use malloc, use free. If you use CreateFile, use CloseHandle. Etc.
Better yet, in C++, use RAII objects that allocate the resource in the constructor and release the resource in the destructor, and then use those RAII wrappers instead of the bare functions. This makes it easier and cleaner to write code that doesn't leak, even if you get something like an exception.
The standard template library provides containers that implement RAII, which is why you should learn to use a std::vector or std::string rather than allocating bare memory and trying to manage it yourself. There are also smart pointers like std::shared_ptr and std::unique_ptr that can be used to make sure the right release call is always made at the right time.
ATL provides some classes like ATL::CComPtr which are wrapper objects that handle the reference counting of COM objects for you. They are not foolproof to use correctly, and, in fact, have a few more gotchas than most of the modern STL classes, so read the documentation carefully. When used correctly, it's relatively easy to make sure the AddRef and Release calls all match up.
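To make the RAII advice concrete for CoTaskMemAlloc, here is a minimal sketch using std::unique_ptr with a custom deleter. CoTaskMemDeleter, CoTaskMemPtr and DuplicateToCoTaskMem are names invented for this example; the non-Windows branch only defines malloc-based stand-ins so the sketch compiles elsewhere, and is not the COM allocator.

```cpp
#include <cstddef>
#include <cstring>
#include <memory>
#ifdef _WIN32
#include <objbase.h>   // CoTaskMemAlloc, CoTaskMemFree (link with ole32.lib)
#else
#include <cstdlib>
// Stand-ins so the sketch builds off Windows; NOT the real COM allocator.
inline void* CoTaskMemAlloc(std::size_t cb) { return std::malloc(cb); }
inline void CoTaskMemFree(void* pv) { std::free(pv); }
#endif

// Deleter that hands memory back to the COM task allocator.
struct CoTaskMemDeleter {
    void operator()(void* p) const { CoTaskMemFree(p); }
};

// Owning pointer: CoTaskMemFree runs automatically when it goes out of scope.
template <typename T>
using CoTaskMemPtr = std::unique_ptr<T, CoTaskMemDeleter>;

// Hypothetical helper: duplicate a C string into COM task memory.
// Returns an empty pointer if the allocation fails.
CoTaskMemPtr<char> DuplicateToCoTaskMem(const char* text) {
    const std::size_t bytes = std::strlen(text) + 1;
    CoTaskMemPtr<char> copy(static_cast<char*>(CoTaskMemAlloc(bytes)));
    if (copy) std::memcpy(copy.get(), text, bytes);
    return copy;
}
```

When ownership really has to pass to a COM caller through an [out] parameter, call release() on the smart pointer at that point, so the memory is freed exactly once by whoever ends up owning it.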

COM memory management

I have some questions regarding COM memory management:
1.
I have a COM method:
STDMETHODIMP CWhitelistPolicy::GetWebsitesStrings(SAFEARRAY** result)
where result is a SAFEARRAY(BSTR). If I receive another SAFEARRAY(BSTR) from another interface method (in order to set *result), do I have to make copies of the strings received in order to pass them to *result and on to the outside client? Or, considering I will not use the strings myself, can I just pass them to the client (passing on the ownership)?
2.
STDMETHODIMP CWhitelistPolicy::SetWebsitesStrings(SAFEARRAY* input)
Here I receive a BSTR array as input. Again, is my method responsible for the memory allocated in input?
3.
STDMETHOD(SetUsers)(SAFEARRAY* input);
Here I call a method on another interface (SetUsers) and I allocate memory for the input SAFEARRAY. After I call SetUsers, can I dispose of the SAFEARRAY? Memory is always copied when marshaling takes place, isn't it? (In my case the SetUsers method is called on an interface that is hosted as a COM DLL inside my process.)
The way I think about it to answer questions like this is to think about a COM call that crosses machines. Then it's obvious for an [out] param: I, the caller, own and have to free the memory, because the remote marshaling layer can't do it. For [in] parameters, it's obvious the marshaling layer must copy my data, and again the remote marshaling layer can't free what I passed in.
A core tenet in COM is location neutrality: the rules when calling in the same apartment are the rules when using DCOM across machines.
1. You're responsible for freeing it. You don't pass ownership when you call the next function, because it could be remote and getting a copy, not your original data.
2. No, as the callee you don't have to free it. If it's intra-apartment, it's the memory the caller provided and the caller has to free it. If it's a remote call, the server stub allocates it and will free it when the method returns.
3. Yes, you free it. No, it's not always copied (it might be), which is why the answer to 2 is no. If it's copied, there's a stub that allocated the copy, and the stub will free it.
Note my answers to your questions didn't cover the case of [in,out] parameters. See the SO question Who owns returned BSTR? for some more details on this case.
COM allocation rules are complicated but rational. Get the book "Essential COM" by Don Box if you want to understand and see examples of all the cases. Still, you're going to make mistakes, so you should have a strategy for detecting them. I use gflags (part of WinDbg) and its heap checking flags to catch when a double free occurs (a debug message is displayed and execution halts at the call with an INT 3). Visual Studio's debugger used to turn them on for you when it launched the executable (it likely still does), but you can force them on with gflags under the image options tab.
You should also know how to use UMDH (also part of WinDbg) to detect leaks. DebugDiag is the newer tool for this and seems easier to use, but sadly you can only have either the 32-bit or the 64-bit version installed, not both.
The remaining problem is BSTRs: they are cached, which makes detecting double frees and leaks tricky because interaction with the heap is delayed. You can shut off the OLE string cache by setting the environment variable OANOCACHE to 1 or by calling the function SetOaNoCache. The function is not declared in a header file, so see the SO question Where is SetOaNoCache defined?. Note that the accepted answer shows the hard way to call it, through GetProcAddress(). The answer below the accepted one shows that all you need is an extern "C" declaration, as the function is in the oleaut32 export lib. Finally, see this Larry Osterman blog post for a more detailed description of the difficulties the cache causes when hunting leaks.

Some Windows API calls fail unless the string arguments are in the system memory rather than local stack

We have an older, massive C++ application and we have been converting it to support Unicode as well as 64-bit. The following strange thing has been happening:
Calls to registry functions and windows creation functions, like the following, have been failing:
hWnd = CreateSysWindowExW( ExStyle, ClassNameW.StringW(), Label2.StringW(), Style,
                           Posn.X(), Posn.Y(),
                           Size.X(), Size.Y(),
                           hParentWnd, (HMENU)Id,
                           AppInstance(), NULL);
ClassNameW and Label2 are instances of our own Text class which essentially uses malloc to allocate the memory used to store the string.
Anyway, when the functions fail, and I call GetLastError it returns the error code for "invalid memory access" (though I can inspect and see the string arguments fine in the debugger). Yet if I change the code as follows then it works perfectly fine:
BSTR Label2S = SysAllocString(Label2.StringW());
BSTR ClassNameWS = SysAllocString(ClassNameW.StringW());
hWnd = CreateSysWindowExW( ExStyle, ClassNameWS, Label2S, Style,
                           Posn.X(), Posn.Y(),
                           Size.X(), Size.Y(),
                           hParentWnd, (HMENU)Id,
                           AppInstance(), NULL);
SysFreeString(ClassNameWS); ClassNameWS = 0;
SysFreeString(Label2S); Label2S = 0;
So what gives? Why would the original functions work fine with the arguments in local memory, while with Unicode the registry functions require SysAllocString, and in 64-bit the window creation functions also require SysAllocString'd string arguments? Our window procedure functions have all been converted to Unicode, and yes, we use SetWindowLongW, call the correct default Unicode DefWindowProcW, etc. That all seems to work fine, and it handles and draws Unicode properly.
The documentation at http://msdn.microsoft.com/en-us/library/ms632679%28v=vs.85%29.aspx does not say anything about this. While our application is massive we do use debug heaps and tools like Purify to check for and clean up any memory corruption. Also at the time of this failure, there is still only one main system thread. So it is not a thread issue.
So what is going on? I have read that if string arguments are marshalled anywhere or passed across process boundaries, then you have to use SysAllocString/BSTR, yet we call lots of API functions and there is lots of code out there which calls these functions just using plain local strings?
What am I missing? I have tried Googling this, as someone else must have run into this, but with little luck.
Edit 1: Our StringW function does not create any temporary objects which might go out of scope before the actual API call. The function is as follows:
class Text {
public:
    // Returns the stored pointer directly; no temporary object is created.
    const wchar_t* StringW () const
    {
        return TextStartW;
    }
    wchar_t* TextStartW; // pointer to current start of text in DataArea
};
I have been running our application with the debug heap and memory checking and other diagnostic tools, and found no source of memory corruption, and looking at the assembly, there is no sign of temporary objects or invalid memory access.
BUT I finally figured it out:
We compile our code with /Zp1, which packs structure members on 1-byte boundaries. SysAllocString (in 64-bit) always returns a pointer that is aligned on an 8-byte boundary. Presumably a 32-bit ANSI C++ application goes through an API layer to the underlying Unicode Windows DLLs, which would also align the pointer for you.
But if you use Unicode, you do not get the incidental pointer alignment that the conversion layer gives you, and if you use 64-bit, of course the situation gets even worse.
I added a method to our Text class which shifts the string pointer so that it is aligned on an eight-byte boundary, and voilà, everything runs fine!
Of course the Microsoft people say it must be memory corruption and that I am jumping to the wrong conclusion, but there is evidence that this is not the case.
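For anyone wanting to check their own pointers, the test and the round-up are one-liners. This is a generic sketch, not the actual method from our Text class; AlignUp and IsAligned are names made up for the example, and both assume the alignment is a power of two.

```cpp
#include <cstddef>
#include <cstdint>

// Round an address up to the next multiple of `alignment` (a power of two).
inline std::uintptr_t AlignUp(std::uintptr_t address, std::size_t alignment) {
    return (address + (alignment - 1)) & ~static_cast<std::uintptr_t>(alignment - 1);
}

// True when `p` sits on an `alignment`-byte boundary.
inline bool IsAligned(const void* p, std::size_t alignment) {
    return (reinterpret_cast<std::uintptr_t>(p) & (alignment - 1)) == 0;
}
```

A debug-only check such as IsAligned(text.StringW(), 8) just before the failing API call is a cheap way to confirm or rule out alignment as the cause.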
Also, if you use /Zp1 and include windows.h in a 64-bit application, the debugger will tell you sizeof(BITMAP)==28, but calling GetObject on a bitmap will fail and tell you it needs a 32-byte structure. So I suspect that some of Microsoft's API is inherently dependent on aligned pointers, and I also know that some optimized assembly (I have seen some from Fortran compilers) takes advantage of that and crashes badly if you ever give it unaligned pointers.
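The 28 versus 32 discrepancy can be reproduced without windows.h. A minimal sketch, assuming a 64-bit target: BitmapDefault and BitmapPacked are hypothetical stand-ins that mirror the field sizes of BITMAP (the last member models the 8-byte bmBits pointer), and #pragma pack(1) is used to mimic what /Zp1 does to this struct.

```cpp
#include <cstddef>
#include <cstdint>

// Stand-in for BITMAP with default packing: the 8-byte member is placed on
// an 8-byte boundary, so padding is inserted after bmBitsPixel.
struct BitmapDefault {
    std::int32_t  bmType;
    std::int32_t  bmWidth;
    std::int32_t  bmHeight;
    std::int32_t  bmWidthBytes;
    std::uint16_t bmPlanes;
    std::uint16_t bmBitsPixel;
    std::uint64_t bmBits;
};

// Same fields packed on 1-byte boundaries: no padding at all.
#pragma pack(push, 1)
struct BitmapPacked {
    std::int32_t  bmType;
    std::int32_t  bmWidth;
    std::int32_t  bmHeight;
    std::int32_t  bmWidthBytes;
    std::uint16_t bmPlanes;
    std::uint16_t bmBitsPixel;
    std::uint64_t bmBits;
};
#pragma pack(pop)
```

On x86-64, sizeof(BitmapDefault) is 32 and sizeof(BitmapPacked) is 28, and the last member moves from offset 24 to offset 20. The system DLLs were built with the default layout, so a caller compiled with 1-byte packing disagrees with them about both the size and the field offsets.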
So the moral of all of this is: don't use "funky" compiler arguments like /Zp1. In our case we have to for historical reasons, but the number of times this has bitten us...
Using a bit of psychic debugging, I'm going to guess that the strings in your application are pooled in a read-only section.
It's possible that CreateSysWindowExW is attempting to write to the memory passed in for the window class or title. That would explain why the calls work when the strings are allocated on the heap (SysAllocString) but not when they are constants.
The easiest way to investigate this is to use a low-level debugger like WinDbg. It should break into the debugger at the point where the access violation occurs, which should help figure out the problem. Don't use Visual Studio; it has a nasty habit of being helpful and hiding first-chance exceptions.
Another thing to try is to enable appverifier on your application; it's possible that it will show something.
Calling a Windows API function does not cross the process boundary, since the various Windows DLLs are loaded into your process.
It sounds like whatever pointer StringW() is returning isn't valid when Windows tries to access it. I would look there: is it possible that the memory behind the returned pointer goes out of scope and is deleted shortly after the call?
If you share some more details about your string class, that could help diagnose the problem here.

Equivalent to pread/pwrite in MSVC?

What calls best emulate pread/pwrite in MSVC 10?
At the C runtime library level, look at fread, fwrite and fseek.
At the Win32 API level, have a look at ReadFile, WriteFile, and SetFilePointer. MSDN has extensive coverage of file I/O APIs.
Note that both ReadFile and WriteFile take an OVERLAPPED struct argument, which lets you specify a file offset. The offset is respected for all files that support byte offsets, even when opened for synchronous (i.e. non 'overlapped') I/O.
Depending on the problem you are trying to solve, file mapping may be a better design choice.
It looks like you just use the lpOverlapped parameter to ReadFile/WriteFile to pass a pointer to an OVERLAPPED structure with the offset specified in Offset and OffsetHigh.
(Note: You don't actually get overlapping IO unless the handle was opened with FILE_FLAG_OVERLAPPED.)
The answer provided by Oren seems correct but doesn't quite meet the need. I too came here searching for the answer but couldn't find it, so I will add a bit here.
As said,
At the C runtime library level, there are fread, fwrite and fseek.
Below that there are two more levels of abstraction: a lower-level CRT layer that works with file descriptors, and the Win32 API, which works with Windows-defined HANDLEs.
If you wish to work with HANDLEs, you have ReadFile, WriteFile, and SetFilePointer. But most of the time, C++ developers prefer working with file descriptors. For that, you have _read, _write and _lseek.
