Usage of CoTaskMemAlloc? - windows

When is it appropriate to use CoTaskMemAlloc? Can someone give an example?

Use CoTaskMemAlloc when returning a char* from a native C++ library to .NET as a string.
C#
[DllImport("test.dll", CharSet=CharSet.Ansi)]
extern static string Foo();
C
char* Foo()
{
std::string response("response");
int len = response.length() + 1;
char* buff = (char*) CoTaskMemAlloc(len);
strcpy_s(buff, len, response.c_str());
return buff;
}
Since .NET uses CoTaskMemFree, you have to allocate the string like this, you can't allocate it on the stack or the heap using malloc / new.

Gosh, I had to think for a while for this one -- I've done a fair amount of small-scale COM programming with ATL and rarely have had to use it.
There is one situation though that comes to mind: Windows Shell extensions. If you are dealing with a set of filesystem objects you might have to deal with PIDLs (pointer to an ID list). These are bizarre little filesystem object abstractions and they need to be explicitly allocated/deallocated using a COM-aware allocator such as CoTaskMemAlloc. There is also an alternative, the IMalloc interface pointer obtained from SHGetMalloc (deprecated) or CoGetMalloc -- it's just an abstraction layer to use, so that your code isn't tied to a specific memory allocator and can use any appropriate one.
The point of using CoTaskMemAlloc or IMalloc rather than malloc() is that the memory allocation/deallocation needs to be something that is "COM-aware" so that its allocation and deallocation are performed consistently at run-time, even if the allocation and deallocation are done by completely unrelated code (e.g. Windows allocates memory, transfers it to your C++ code which later deallocates, or your C++ code allocates, transfers it to someone else's VB code which later deallocates). Neither malloc() nor new are capable of interoperating with the system's run-time heap so you can't use them to allocate memory to transfer to other COM objects, or to receive memory from other COM objects and deallocate.

This MSDN article compares a few of the various allocators exposed by Win32, including CoTaskMemAlloc. It's mainly used in COM programming--most specifically when the implementation of a COM server needs to allocate memory to return back to a client. If you aren't writing a COM server, then you probably don't need to use it.
(However, if you call code that allocates memory using CoTaskMemAlloc and returns it back to you, you'll need to free the returned allocation(s) using CoTaskMemFree.)

there is not really much which can go wrong as the following calls all end up with the same allocation:
CoTaskMemAlloc/SHAlloc -> IMalloc.Alloc -> GlobalAlloc(GMEM_FIXED)
only if you use non-windows (compiler-library) calls like malloc() things will go wrong.
Officially one should use CoTaskMemAlloc for COM calls (like allocating a FORMATETC.ptd field)
That CoTaskMemAlloc equals GlobalAlloc() will stay this way 'till eternity is seen at the clipboard api versus com STGMEDIUM. The STGMEDIUM uses the clipboard structures and method and while STGMEDIUM is com and thus CoTaskMemAlloc, the clipboard apis prescribe GlobalAlloc()

CoTaskMemAlloc is same as malloc except that former is used to allocate memory which is used across process boundaries.
i.e., if we have two processes, process1 and process2, assume that process1 is a COM server, and process2 is a COM Client which uses the interfaces exposed by process1.
If process1 has to send some data, then he can allocate memory using CoTaskMemAlloc to allocate the memory and copies the data.
That memory location can be accessed by process2.
COM library automatically does the marshalling and unmarshalling.

Related

What function has replaced GlobalAlloc in Win32?

Since GlobalAlloc has become local in the 32 bit model, it can't be used so that allocated memory to be shared between Win32 applications. What function is replacing GlobalAlloc and uses the same simplicity? (Like retrieving a handler to the memory block, which is converted to pointer later by the application that uses it.)
The way you share memory between processes in Win32 is with memory mapped files. Start with CreateFileMapping and MapViewOfFile.
If you are after general IPC rather than just shared memory, I would suggest reading the "Interprocess Communications" chapter. It seems you could use named memory mappings as the closest equivalent of GlobalAlloc().

When using CoTaskMemAlloc, should I always call CoTaskMemFree?

I'm writing some COM and ATL code, and for some reason all the code uses CoTaskMemAlloc to allocate memory instead of new or malloc. So I followed along this coding style and I also use CoTaskMemAlloc.
My teachers taught me to always delete or free when allocating memory. However I'm not sure if I should always be calling CoTaskMemFree if I use CoTaskMemAlloc?
Using the CRT's provided new/malloc and delete/free is a problem in COM interop. To make them work, it is very important that the same copy of the CRT both allocates and releases the memory. That's impossible to enforce in a COM interop scenario, your COM server and the client are practically guaranteed to use different versions of the CRT. Each using their own heap to allocate from. This causes undiagnosable memory leaks on Windows XP, a hard exception on Vista and up.
Which is why the COM heap exists, a single predefined heap in a process that's used both by the server and the client. IMalloc is the generic interface to access that shared heap, CoTaskMemAlloc() and CoTaskMemFree() are the system provided helper functions to use that interface.
That said, this is only necessary in a case where the server allocates memory and the client has to release it. Or the other way around. Which should always be rare in an interop scenario, the odds for accidents are just too large. In COM Automation there are just two such cases, a BSTR and a SAFEARRAY, types that are already wrapped. You avoid it in other cases by having the method caller provide the memory and the callee fill it in. Which also allows a strong optimization, the memory could come from the caller's stack.
Review the code and check who allocates the memory and who needs to release it. If both exist in the same module then using new/malloc is fine because there's now a hard guarantee that the same CRT instance takes care of it. If that's not the case then consider fixing it so the caller provides the memory and releases it.
The allocation and freeing of memory must always come from the same source. If you use CoTaskMemAlloc then you must use CoTaskMemFree to free the memory.
Note in C++ though the act of managing memory and object construction / destruction (new / delete) are independent actions. It's possible to customize specific objects to use a different memory allocator and still allow for the standard new / delete syntax which is preferred. For example
class MyClass {
public:
void* operator new(size_t size) {
return ::CoTaskMemAlloc(size);
}
void* operator new[](size_t size) {
return ::CoTaskMemAlloc(size);
}
void operator delete(void* pMemory) {
::CoTaskMemFree(pMemory);
}
void operator delete[](void* pMemory) {
::CoTaskMemFree(pMemory);
}
};
Now I can use this type just like any other C++ type and yet the memory will come from the COM heap
// Normal object construction but memory comes from CoTaskMemAlloc
MyClass *pClass = new MyClass();
...
// Normal object destruction and memory freed from CoTaskMemFree
delete pClass;
The answer to the question is: Yes, you should use CoTaskMemFree to free memory allocated with CoTaskMemAlloc.
The other answers do a good job explaining why CoTaskMemAlloc and CoTaskMemFree are necessary for memory passed between COM servers and COM clients, but they didn't directly answer your question.
Your teacher was right: You should always use the corresponding release function for any resource. If you use new, use delete. If you use malloc, use free. If you use CreateFile, use CloseHandle. Etc.
Better yet, in C++, use RAII objects that allocate the resource in the constructor and release the resource in the destructor, and then use those RAII wrappers instead of the bare functions. This makes it easier and cleaner to write code that doesn't leak, even if you get something like an exception.
The standard template library provides containers that implement RAII, which is why you should learn to use a std::vector or std::string rather than allocating bare memory and trying to manage it yourself. There are also smart pointers like std::shared_ptr and std::unique_ptr that can be used to make sure the right release call is always made at the right time.
ATL provides some classes like ATL::CComPtr which are wrapper objects that handle the reference counting of COM objects for you. They are not foolproof to use correctly, and, in fact, have a few more gotchas than most of the modern STL classes, so read the documentation carefully. When used correctly, it's relatively easy to make sure the AddRef and Release calls all match up.

Sharing GlobalAlloc() memory from DLL to multiple Win32 applications

I want to move my caching library to a DLL and allow multiple applications to share a single pointer allocated within the DLL using GlobalAlloc(). How could I accomplish this, and would it result in a significant performance decrease?
You could certainly do this and there won't be any performance implication for a single pointer.
Rather than use GlobalAlloc, a legacy API, you should opt for a different shared heap. For example the simplest to use is the COM allocator, CoTaskMemAlloc. Or you can use HeapAlloc passing the process heap obtained by GetProcessHeap.
For example, and neglecting to show error checking:
void *mem = HeapAlloc(GetProcessHeap(), HEAP_ZERO_MEMORY, size);
Note that you only need to worry about heap sharing if you expect the memory to be deallocated in a different module from where it was created. If your DLL both creates and destroys the memory then you can use plain old malloc. Because all modules live in the same process address space, memory allocated by any module in that process, can be used by any other module.
Update
I failed on first reading of the question to pick up on the possibility that you may be wanting multiple process to have access to the same memory. If that's what you need then it is only possible with memory mapped files, or perhaps with some form of IPC.

COM memory management

I have some questions regarding COM memory management:
I have a COM method:
STDMETHODIMP CWhitelistPolicy::GetWebsitesStrings(SAFEARRAY** result)
result = SAFEARRAY(BSTR). If I receive another SAFEARRAY(BSTR) from another interface method(in order to set *result) do I have to make copies of the strings received in order to pass them to *result and outside client? Or considering I will not use the strings for myself I can just pass them to the client (and passing out the ownership)?
2.
STDMETHODIMP CWhitelistPolicy::SetWebsitesStrings(SAFEARRAY* input)
Here I receive a BSTR array as input. Again my method is responsible for the memory allocated in input?
3.
STDMETHOD(SetUsers)(SAFEARRAY* input);
Here I call a method on another interface (SetUsers) and I allocate memory for the input SAFEARRAY. After I call SetUsers I can dispose of the SAFEARRAY? Memory is always copied when marshaling takes place isn't it? (in my case SetUsers method is called on an interface that is hosted as a COM DLL inside my process)
The way I think about it to answer questions like this is to think about a COM call that crosses machines. Then it's obvious for an [out] param; I the caller own and have to free the memory because the remote marshaling layer can't do it. For [in] parameters, it's obvious the marshaling layer must copy my data and again the remote marshaling layer can't free what I passed in.
A core tenet in COM is location neutrality, the rules when calling in the same apartment are the rules when using DCOM across machines.
You're responsible to free - you don't pass ownership when you call the next fnc because it could be remote and getting a copy, not your original data.
No - as the callee, you don't have to free it. If it's intra-apartment, it's the memory the caller provided and the caller has to free it. If it's a remote call, the server stub allocates it and will free it when the method returns.
Yes, you free it - no, it's not always copied (it might be), which is why the answer to 2 is no. If it's copied, there's a stub that allocated and the stub will free it.
Note my answers to your questions didn't cover the case of [in,out] parameters - see the so question Who owns returned BSTR? for some more details on this case.
Com allocation rules are complicated but rational. Get the book "essential com" by Don Box if you want to understand/see examples of all the cases. Still you're going to make mistakes so you should have a strategy for detecting them. I use gflags (part of Windbg) and its heap checking flags to catch when a double free occurs (a debug message is displayed and execution halted at the call with an INT 3). Vstudio's debugger used to turn them on for you when it launched the executable (it likely still does) but you can force them on with gflags under the image options tab.
You should also know how to use UMDH (also part of windbg) to detect leaks. DebugDiag is the newer tool for this and seems easier to use, but sadly, you can only have the 32 bit or 64 bit version installed, but not both.
The problem then are BSTRs, which are cached, make detecting double frees and leaks tricky because interacting with the heap is delayed. You can shut off the ole string cache by setting the environment variable OANOCACHE to 1 or calling the function SetOaNoCache. The function's not defined in a header file so see this SO question Where is SetOaNoCache defined?. Note the accepted answer shows the hard way to call it through GetProcAddress(). The answer below the accepted one shows all you need is an extern "C" as it's in the oleaut32 export lib. Finally, see this Larry Osterman blog post for a more detailed description of the difficulties caused by the cache when hunting leaks.

Can address space be recycled for multiple calls to MapViewOfFileEx without chance of failure?

Consider a complex, memory hungry, multi threaded application running within a 32bit address space on windows XP.
Certain operations require n large buffers of fixed size, where only one buffer needs to be accessed at a time.
The application uses a pattern where some address space the size of one buffer is reserved early and is used to contain the currently needed buffer.
This follows the sequence:
(initial run) VirtualAlloc -> VirtualFree -> MapViewOfFileEx
(buffer changes) UnMapViewOfFile -> MapViewOfFileEx
Here the pointer to the buffer location is provided by the call to VirtualAlloc and then that same location is used on each call to MapViewOfFileEx.
The problem is that windows does not (as far as I know) provide any handshake type operation for passing the memory space between the different users.
Therefore there is a small opportunity (at each -> in my above sequence) where the memory is not locked and another thread can jump in and perform an allocation within the buffer.
The next call to MapViewOfFileEx is broken and the system can no longer guarantee that there will be a big enough space in the address space for a buffer.
Obviously refactoring to use smaller buffers reduces the rate of failures to reallocate space.
Some use of HeapLock has had some success but this still has issues - something still manages to steal some memory from within the address space.
(We tried Calling GetProcessHeaps then using HeapLock to lock all of the heaps)
What I'd like to know is there anyway to lock a specific block of address space that is compatible with MapViewOfFileEx?
Edit: I should add that ultimately this code lives in a library that gets called by an application outside of my control
You could brute force it; suspend every thread in the process that isn't the one performing the mapping, Unmap/Remap, unsuspend the suspended threads. It ain't elegant, but it's the only way I can think of off-hand to provide the kind of mutual exclusion you need.
Have you looked at creating your own private heap via HeapCreate? You could set the heap to your desired buffer size. The only remaining problem is then how to get MapViewOfFileto use your private heap instead of the default heap.
I'd assume that MapViewOfFile internally calls GetProcessHeap to get the default heap and then it requests a contiguous block of memory. You can surround the call to MapViewOfFile with a detour, i.e., you rewire the GetProcessHeap call by overwriting the method in memory effectively inserting a jump to your own code which can return your private heap.
Microsoft has published the Detour Library that I'm not directly familiar with however. I know that detouring is surprisingly common. Security software, virus scanners etc all use such frameworks. It's not pretty, but may work:
HANDLE g_hndPrivateHeap;
HANDLE WINAPI GetProcessHeapImpl() {
return g_hndPrivateHeap;
}
struct SDetourGetProcessHeap { // object for exception safety
SDetourGetProcessHeap() {
// put detour in place
}
~SDetourGetProcessHeap() {
// remove detour again
}
};
void MapFile() {
g_hndPrivateHeap = HeapCreate( ... );
{
SDetourGetProcessHeap d;
MapViewOfFile(...);
}
}
These may also help:
How to replace WinAPI functions calls in the MS VC++ project with my own implementation (name and parameters set are the same)?
How can I hook Windows functions in C/C++?
http://research.microsoft.com/pubs/68568/huntusenixnt99.pdf
Imagine if I came to you with a piece of code like this:
void *foo;
foo = malloc(n);
if (foo)
free(foo);
foo = malloc(n);
Then I came to you and said, help! foo does not have the same address on the second allocation!
I'd be crazy, right?
It seems to me like you've already demonstrated clear knowledge of why this doesn't work. There's a reason that the documention for any API that takes an explicit address to map into lets you know that the address is just a suggestion, and it can't be guaranteed. This also goes for mmap() on POSIX.
I would suggest you write the program in such a way that a change in address doesn't matter. That is, don't store too many pointers to quantities inside the buffer, or if you do, patch them up after reallocation. Similar to the way you'd treat a buffer that you were going to pass into realloc().
Even the documentation for MapViewOfFileEx() explicitly suggests this:
While it is possible to specify an address that is safe now (not used by the operating system), there is no guarantee that the address will remain safe over time. Therefore, it is better to let the operating system choose the address. In this case, you would not store pointers in the memory mapped file, you would store offsets from the base of the file mapping so that the mapping can be used at any address.
Update from your comments
In that case, I suppose you could:
Not map into contiguous blocks. Perhaps you could map in chunks and write some intermediate function to decide which to read from/write to?
Try porting to 64 bit.
As the earlier post suggests, you can suspend every thread in the process while you change the memory mappings. You can use SuspendThread()/ResumeThread() for that. This has the disadvantage that your code has to know about all the other threads and hold thread handles for them.
An alternative is to use the Windows debug API to suspend all threads. If a process has a debugger attached, then every time the process faults, Windows will suspend all of the process's threads until the debugger handles the fault and resumes the process.
Also see this question which is very similar, but phrased differently:
Replacing memory mappings atomically on Windows

Resources