COM memory management - memory-management

I have some questions regarding COM memory management:
I have a COM method:
STDMETHODIMP CWhitelistPolicy::GetWebsitesStrings(SAFEARRAY** result)
result = SAFEARRAY(BSTR). If I receive another SAFEARRAY(BSTR) from another interface method(in order to set *result) do I have to make copies of the strings received in order to pass them to *result and outside client? Or considering I will not use the strings for myself I can just pass them to the client (and passing out the ownership)?
2.
STDMETHODIMP CWhitelistPolicy::SetWebsitesStrings(SAFEARRAY* input)
Here I receive a BSTR array as input. Again my method is responsible for the memory allocated in input?
3.
STDMETHOD(SetUsers)(SAFEARRAY* input);
Here I call a method on another interface (SetUsers) and I allocate memory for the input SAFEARRAY. After I call SetUsers I can dispose of the SAFEARRAY? Memory is always copied when marshaling takes place isn't it? (in my case SetUsers method is called on an interface that is hosted as a COM DLL inside my process)

The way I think about it to answer questions like this is to think about a COM call that crosses machines. Then it's obvious for an [out] param; I the caller own and have to free the memory because the remote marshaling layer can't do it. For [in] parameters, it's obvious the marshaling layer must copy my data and again the remote marshaling layer can't free what I passed in.
A core tenet in COM is location neutrality, the rules when calling in the same apartment are the rules when using DCOM across machines.
You're responsible to free - you don't pass ownership when you call the next fnc because it could be remote and getting a copy, not your original data.
No - as the callee, you don't have to free it. If it's intra-apartment, it's the memory the caller provided and the caller has to free it. If it's a remote call, the server stub allocates it and will free it when the method returns.
Yes, you free it - no, it's not always copied (it might be), which is why the answer to 2 is no. If it's copied, there's a stub that allocated and the stub will free it.
Note my answers to your questions didn't cover the case of [in,out] parameters - see the so question Who owns returned BSTR? for some more details on this case.
Com allocation rules are complicated but rational. Get the book "essential com" by Don Box if you want to understand/see examples of all the cases. Still you're going to make mistakes so you should have a strategy for detecting them. I use gflags (part of Windbg) and its heap checking flags to catch when a double free occurs (a debug message is displayed and execution halted at the call with an INT 3). Vstudio's debugger used to turn them on for you when it launched the executable (it likely still does) but you can force them on with gflags under the image options tab.
You should also know how to use UMDH (also part of windbg) to detect leaks. DebugDiag is the newer tool for this and seems easier to use, but sadly, you can only have the 32 bit or 64 bit version installed, but not both.
The problem then are BSTRs, which are cached, make detecting double frees and leaks tricky because interacting with the heap is delayed. You can shut off the ole string cache by setting the environment variable OANOCACHE to 1 or calling the function SetOaNoCache. The function's not defined in a header file so see this SO question Where is SetOaNoCache defined?. Note the accepted answer shows the hard way to call it through GetProcAddress(). The answer below the accepted one shows all you need is an extern "C" as it's in the oleaut32 export lib. Finally, see this Larry Osterman blog post for a more detailed description of the difficulties caused by the cache when hunting leaks.

Related

MFC: Conflicting information on how to clean up after using COleDataSource::DoDragDrop()

Can someone clear up all the conflicting information.
The documentation for COleDataSource::CacheData() says:
After the call to CacheData the ptd member of lpFormatEtc and the
contents of lpStgMedium are owned by the data object, not by the
caller.
The documentation for COleDataSource::CacheGlobalData() doesn't say that.
You find code examples of how to use COleDataSource::DoDragDrop() on places like Code Project which call COleDataSource::CacheGlobalData() then check the DoDragDrop() results and free the memory if it didn't drop:
DROPEFFECT dweffect=datasrc->DoDragDrop(DROPEFFECT_COPY);
// They say if operation wasn't accepted, or was canceled, we
// should call GlobalFree() to clean up.
if (dweffect==DROPEFFECT_NONE) {
GlobalFree(hgdrop);
}
You also have Q182219 (which for some unknown reason MS has totally broken all the good old MSKB links and you can't find any information anymore (links all over the Internet are dead). Q182219 says something about after a successful move operation on NT based OS (Win10), it will return DROPEFFECT_NONE so you have to check if things really moved.
So the questions are:
1) Should you really free the memory if the drop operation returns DROPEFFECT_NONE or would MFC handle all that?
2) Does it still return DROPEFFFECT_NONE after a successful move operation or does MFC handle that internally (or fixed in some Windows version) ?
3) If it does return DROPEFFECT_NONE after a move, wouldn't logic like the above be a double free on memory if it was a move operation?
Extra:
You also find examples and usage of people doing COleDataSource mydatasource which is WRONG. You have to allocate it like COleDataSource *mydatasource=new ColeDataSource() and then at the end use mydatasource->ExternalRelease() to release it. (ExternalRelease() handles calling InternalRelease() if the object isn't using aggregation (that is a derived class that acts as a wrapper))
TIA!!

When using CoTaskMemAlloc, should I always call CoTaskMemFree?

I'm writing some COM and ATL code, and for some reason all the code uses CoTaskMemAlloc to allocate memory instead of new or malloc. So I followed along this coding style and I also use CoTaskMemAlloc.
My teachers taught me to always delete or free when allocating memory. However I'm not sure if I should always be calling CoTaskMemFree if I use CoTaskMemAlloc?
Using the CRT's provided new/malloc and delete/free is a problem in COM interop. To make them work, it is very important that the same copy of the CRT both allocates and releases the memory. That's impossible to enforce in a COM interop scenario, your COM server and the client are practically guaranteed to use different versions of the CRT. Each using their own heap to allocate from. This causes undiagnosable memory leaks on Windows XP, a hard exception on Vista and up.
Which is why the COM heap exists, a single predefined heap in a process that's used both by the server and the client. IMalloc is the generic interface to access that shared heap, CoTaskMemAlloc() and CoTaskMemFree() are the system provided helper functions to use that interface.
That said, this is only necessary in a case where the server allocates memory and the client has to release it. Or the other way around. Which should always be rare in an interop scenario, the odds for accidents are just too large. In COM Automation there are just two such cases, a BSTR and a SAFEARRAY, types that are already wrapped. You avoid it in other cases by having the method caller provide the memory and the callee fill it in. Which also allows a strong optimization, the memory could come from the caller's stack.
Review the code and check who allocates the memory and who needs to release it. If both exist in the same module then using new/malloc is fine because there's now a hard guarantee that the same CRT instance takes care of it. If that's not the case then consider fixing it so the caller provides the memory and releases it.
The allocation and freeing of memory must always come from the same source. If you use CoTaskMemAlloc then you must use CoTaskMemFree to free the memory.
Note in C++ though the act of managing memory and object construction / destruction (new / delete) are independent actions. It's possible to customize specific objects to use a different memory allocator and still allow for the standard new / delete syntax which is preferred. For example
class MyClass {
public:
void* operator new(size_t size) {
return ::CoTaskMemAlloc(size);
}
void* operator new[](size_t size) {
return ::CoTaskMemAlloc(size);
}
void operator delete(void* pMemory) {
::CoTaskMemFree(pMemory);
}
void operator delete[](void* pMemory) {
::CoTaskMemFree(pMemory);
}
};
Now I can use this type just like any other C++ type and yet the memory will come from the COM heap
// Normal object construction but memory comes from CoTaskMemAlloc
MyClass *pClass = new MyClass();
...
// Normal object destruction and memory freed from CoTaskMemFree
delete pClass;
The answer to the question is: Yes, you should use CoTaskMemFree to free memory allocated with CoTaskMemAlloc.
The other answers do a good job explaining why CoTaskMemAlloc and CoTaskMemFree are necessary for memory passed between COM servers and COM clients, but they didn't directly answer your question.
Your teacher was right: You should always use the corresponding release function for any resource. If you use new, use delete. If you use malloc, use free. If you use CreateFile, use CloseHandle. Etc.
Better yet, in C++, use RAII objects that allocate the resource in the constructor and release the resource in the destructor, and then use those RAII wrappers instead of the bare functions. This makes it easier and cleaner to write code that doesn't leak, even if you get something like an exception.
The standard template library provides containers that implement RAII, which is why you should learn to use a std::vector or std::string rather than allocating bare memory and trying to manage it yourself. There are also smart pointers like std::shared_ptr and std::unique_ptr that can be used to make sure the right release call is always made at the right time.
ATL provides some classes like ATL::CComPtr which are wrapper objects that handle the reference counting of COM objects for you. They are not foolproof to use correctly, and, in fact, have a few more gotchas than most of the modern STL classes, so read the documentation carefully. When used correctly, it's relatively easy to make sure the AddRef and Release calls all match up.

Some Windows API calls fail unless the string arguments are in the system memory rather than local stack

We have an older massive C++ application and we have been converting it to support Unicode as well as 64-bits. The following strange thing has been happening:
Calls to registry functions and windows creation functions, like the following, have been failing:
hWnd = CreateSysWindowExW( ExStyle, ClassNameW.StringW(), Label2.StringW(), Style,
Posn.X(), Posn.Y(),
Size.X(), Size.Y(),
hParentWnd, (HMENU)Id,
AppInstance(), NULL);
ClassNameW and Label2 are instances of our own Text class which essentially uses malloc to allocate the memory used to store the string.
Anyway, when the functions fail, and I call GetLastError it returns the error code for "invalid memory access" (though I can inspect and see the string arguments fine in the debugger). Yet if I change the code as follows then it works perfectly fine:
BSTR Label2S = SysAllocString(Label2.StringW());
BSTR ClassNameWS = SysAllocString(ClassNameW.StringW());
hWnd = CreateSysWindowExW( ExStyle, ClassNameWS, Label2S, Style,
Posn.X(), Posn.Y(),
Size.X(), Size.Y(),
hParentWnd, (HMENU)Id,
AppInstance(), NULL);
SysFreeString(ClassNameWS); ClassNameWS = 0;
SysFreeString(Label2S); Label2S = 0;
So what gives? Why would the original functions work fine with the arguments in local memory, but when used with Unicode, the registry function require SysAllocString, and when used in 64-bit, the Windows creation functions also require SysAllocString'd string arguments? Our Windows procedure functions have all been converted to be Unicode, always, and yes we use SetWindowLogW call the correct default Unicode DefWindowProcW etc. That all seems to work fine and handles and draws Unicode properly etc.
The documentation at http://msdn.microsoft.com/en-us/library/ms632679%28v=vs.85%29.aspx does not say anything about this. While our application is massive we do use debug heaps and tools like Purify to check for and clean up any memory corruption. Also at the time of this failure, there is still only one main system thread. So it is not a thread issue.
So what is going on? I have read that if string arguments are marshalled anywhere or passed across process boundaries, then you have to use SysAllocString/BSTR, yet we call lots of API functions and there is lots of code out there which calls these functions just using plain local strings?
What am I missing? I have tried Googling this, as someone else must have run into this, but with little luck.
Edit 1: Our StringW function does not create any temporary objects which might go out of scope before the actual API call. The function is as follows:
Class Text {
const wchar_t* StringW () const
{
return TextStartW;
}
wchar_t* TextStartW; // pointer to current start of text in DataArea
I have been running our application with the debug heap and memory checking and other diagnostic tools, and found no source of memory corruption, and looking at the assembly, there is no sign of temporary objects or invalid memory access.
BUT I finally figured it out:
We compile our code /Zp1, which means byte aligned memory allocations. SysAllocString (in 64-bits) always return a pointer that is aligned on a 8 byte boundary. Presumably a 32-bit ANSI C++ application goes through an API layer to the underlying Unicode windows DLLs, which would also align the pointer for you.
But if you use Unicode, you do not get that incidental pointer alignment that the conversion mapping layer gives you, and if you use 64-bits, of course the situation will get even worse.
I added a method to our Text class which shifts the string pointer so that it is aligned on an eight byte boundary, and viola, everything runs fine!!!
Of course the Microsoft people say it must be memory corruption and I am jumping the wrong conclusion, but there is evidence it is not the case.
Also, if you use /Zp1 and include windows.h in a 64-bit application, the debugger will tell you sizeof(BITMAP)==28, but calling GetObject on a bitmap will fail and tell you it needs a 32-byte structure. So I suspect that some of Microsoft's API is inherently dependent on aligned pointers, and I also know that some optimized assembly (I have seen some from Fortran compilers) takes advantage of that and crashes badly if you ever give it unaligned pointers.
So the moral of all of this is, dont use "funky" compiler arguments like /Zp1. In our case we have to for historical reasons, but the number of times this has bitten us...
Someone please give me a "this is useful" tick on my answer please?
Using a bit of psychic debugging, I'm going to guess that the strings in your application are pooled in a read-only section.
It's possible that the CreateSysWindowsEx is attempting to write to the memory passed in for the window class or title. That would explain why the calls work when allocated on the heap (SysAllocString) but not when used as constants.
The easiest way to investigate this is to use a low level debugger like windbg - it should break into the debugger at the point where the access violation occurs which should help figure out the problem. Don't use Visual Studio, it has a nasty habit of being helpful and hiding first chance exceptions.
Another thing to try is to enable appverifier on your application - it's possible that it may show something.
Calling a Windows API function does not cross the process boundary, since the various Windows DLLs are loaded into your process.
It sounds like whatever pointer that StringW() is returning isn't valid when Windows is trying to access it. I would look there - is it possible that the pointer returned it out of scope and deleted shortly after it is called?
If you share some more details about your string class, that could help diagnose the problem here.

Can address space be recycled for multiple calls to MapViewOfFileEx without chance of failure?

Consider a complex, memory hungry, multi threaded application running within a 32bit address space on windows XP.
Certain operations require n large buffers of fixed size, where only one buffer needs to be accessed at a time.
The application uses a pattern where some address space the size of one buffer is reserved early and is used to contain the currently needed buffer.
This follows the sequence:
(initial run) VirtualAlloc -> VirtualFree -> MapViewOfFileEx
(buffer changes) UnMapViewOfFile -> MapViewOfFileEx
Here the pointer to the buffer location is provided by the call to VirtualAlloc and then that same location is used on each call to MapViewOfFileEx.
The problem is that windows does not (as far as I know) provide any handshake type operation for passing the memory space between the different users.
Therefore there is a small opportunity (at each -> in my above sequence) where the memory is not locked and another thread can jump in and perform an allocation within the buffer.
The next call to MapViewOfFileEx is broken and the system can no longer guarantee that there will be a big enough space in the address space for a buffer.
Obviously refactoring to use smaller buffers reduces the rate of failures to reallocate space.
Some use of HeapLock has had some success but this still has issues - something still manages to steal some memory from within the address space.
(We tried Calling GetProcessHeaps then using HeapLock to lock all of the heaps)
What I'd like to know is there anyway to lock a specific block of address space that is compatible with MapViewOfFileEx?
Edit: I should add that ultimately this code lives in a library that gets called by an application outside of my control
You could brute force it; suspend every thread in the process that isn't the one performing the mapping, Unmap/Remap, unsuspend the suspended threads. It ain't elegant, but it's the only way I can think of off-hand to provide the kind of mutual exclusion you need.
Have you looked at creating your own private heap via HeapCreate? You could set the heap to your desired buffer size. The only remaining problem is then how to get MapViewOfFileto use your private heap instead of the default heap.
I'd assume that MapViewOfFile internally calls GetProcessHeap to get the default heap and then it requests a contiguous block of memory. You can surround the call to MapViewOfFile with a detour, i.e., you rewire the GetProcessHeap call by overwriting the method in memory effectively inserting a jump to your own code which can return your private heap.
Microsoft has published the Detour Library that I'm not directly familiar with however. I know that detouring is surprisingly common. Security software, virus scanners etc all use such frameworks. It's not pretty, but may work:
HANDLE g_hndPrivateHeap;
HANDLE WINAPI GetProcessHeapImpl() {
return g_hndPrivateHeap;
}
struct SDetourGetProcessHeap { // object for exception safety
SDetourGetProcessHeap() {
// put detour in place
}
~SDetourGetProcessHeap() {
// remove detour again
}
};
void MapFile() {
g_hndPrivateHeap = HeapCreate( ... );
{
SDetourGetProcessHeap d;
MapViewOfFile(...);
}
}
These may also help:
How to replace WinAPI functions calls in the MS VC++ project with my own implementation (name and parameters set are the same)?
How can I hook Windows functions in C/C++?
http://research.microsoft.com/pubs/68568/huntusenixnt99.pdf
Imagine if I came to you with a piece of code like this:
void *foo;
foo = malloc(n);
if (foo)
free(foo);
foo = malloc(n);
Then I came to you and said, help! foo does not have the same address on the second allocation!
I'd be crazy, right?
It seems to me like you've already demonstrated clear knowledge of why this doesn't work. There's a reason that the documention for any API that takes an explicit address to map into lets you know that the address is just a suggestion, and it can't be guaranteed. This also goes for mmap() on POSIX.
I would suggest you write the program in such a way that a change in address doesn't matter. That is, don't store too many pointers to quantities inside the buffer, or if you do, patch them up after reallocation. Similar to the way you'd treat a buffer that you were going to pass into realloc().
Even the documentation for MapViewOfFileEx() explicitly suggests this:
While it is possible to specify an address that is safe now (not used by the operating system), there is no guarantee that the address will remain safe over time. Therefore, it is better to let the operating system choose the address. In this case, you would not store pointers in the memory mapped file, you would store offsets from the base of the file mapping so that the mapping can be used at any address.
Update from your comments
In that case, I suppose you could:
Not map into contiguous blocks. Perhaps you could map in chunks and write some intermediate function to decide which to read from/write to?
Try porting to 64 bit.
As the earlier post suggests, you can suspend every thread in the process while you change the memory mappings. You can use SuspendThread()/ResumeThread() for that. This has the disadvantage that your code has to know about all the other threads and hold thread handles for them.
An alternative is to use the Windows debug API to suspend all threads. If a process has a debugger attached, then every time the process faults, Windows will suspend all of the process's threads until the debugger handles the fault and resumes the process.
Also see this question which is very similar, but phrased differently:
Replacing memory mappings atomically on Windows

Usage of CoTaskMemAlloc?

When is it appropriate to use CoTaskMemAlloc? Can someone give an example?
Use CoTaskMemAlloc when returning a char* from a native C++ library to .NET as a string.
C#
[DllImport("test.dll", CharSet=CharSet.Ansi)]
extern static string Foo();
C
char* Foo()
{
std::string response("response");
int len = response.length() + 1;
char* buff = (char*) CoTaskMemAlloc(len);
strcpy_s(buff, len, response.c_str());
return buff;
}
Since .NET uses CoTaskMemFree, you have to allocate the string like this, you can't allocate it on the stack or the heap using malloc / new.
Gosh, I had to think for a while for this one -- I've done a fair amount of small-scale COM programming with ATL and rarely have had to use it.
There is one situation though that comes to mind: Windows Shell extensions. If you are dealing with a set of filesystem objects you might have to deal with PIDLs (pointer to an ID list). These are bizarre little filesystem object abstractions and they need to be explicitly allocated/deallocated using a COM-aware allocator such as CoTaskMemAlloc. There is also an alternative, the IMalloc interface pointer obtained from SHGetMalloc (deprecated) or CoGetMalloc -- it's just an abstraction layer to use, so that your code isn't tied to a specific memory allocator and can use any appropriate one.
The point of using CoTaskMemAlloc or IMalloc rather than malloc() is that the memory allocation/deallocation needs to be something that is "COM-aware" so that its allocation and deallocation are performed consistently at run-time, even if the allocation and deallocation are done by completely unrelated code (e.g. Windows allocates memory, transfers it to your C++ code which later deallocates, or your C++ code allocates, transfers it to someone else's VB code which later deallocates). Neither malloc() nor new are capable of interoperating with the system's run-time heap so you can't use them to allocate memory to transfer to other COM objects, or to receive memory from other COM objects and deallocate.
This MSDN article compares a few of the various allocators exposed by Win32, including CoTaskMemAlloc. It's mainly used in COM programming--most specifically when the implementation of a COM server needs to allocate memory to return back to a client. If you aren't writing a COM server, then you probably don't need to use it.
(However, if you call code that allocates memory using CoTaskMemAlloc and returns it back to you, you'll need to free the returned allocation(s) using CoTaskMemFree.)
there is not really much which can go wrong as the following calls all end up with the same allocation:
CoTaskMemAlloc/SHAlloc -> IMalloc.Alloc -> GlobalAlloc(GMEM_FIXED)
only if you use non-windows (compiler-library) calls like malloc() things will go wrong.
Officially one should use CoTaskMemAlloc for COM calls (like allocating a FORMATETC.ptd field)
That CoTaskMemAlloc equals GlobalAlloc() will stay this way 'till eternity is seen at the clipboard api versus com STGMEDIUM. The STGMEDIUM uses the clipboard structures and method and while STGMEDIUM is com and thus CoTaskMemAlloc, the clipboard apis prescribe GlobalAlloc()
CoTaskMemAlloc is same as malloc except that former is used to allocate memory which is used across process boundaries.
i.e., if we have two processes, process1 and process2, assume that process1 is a COM server, and process2 is a COM Client which uses the interfaces exposed by process1.
If process1 has to send some data, then he can allocate memory using CoTaskMemAlloc to allocate the memory and copies the data.
That memory location can be accessed by process2.
COM library automatically does the marshalling and unmarshalling.

Resources