Atomic std::shared_ptr before C++20 (C++11)

Currently I'm using task-based scheduling to ensure "thread-safe" shared_ptr setting/getting, but that is insanely inefficient.
How would one implement atomic shared_ptrs using C++11? Can we just use the std::atomic<std::shared_ptr<T>> template?
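One possible C++11 approach, sketched here rather than offered as a definitive answer: in C++11, std::atomic<T> requires a trivially copyable T, so it cannot be instantiated with std::shared_ptr. The standard library does provide free-function overloads std::atomic_load and std::atomic_store for std::shared_ptr in <memory> (deprecated in C++20 in favor of std::atomic<std::shared_ptr<T>>). The Config type and the publish/snapshot/usage_example helpers below are hypothetical names used only for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <memory>

// Hypothetical payload type, for illustration only.
struct Config {
    int value;
};

// Shared slot. Every concurrent access must go through the atomic_* overloads.
std::shared_ptr<Config> g_config;

// Writer side: atomically replace the stored pointer.
void publish(int value) {
    std::shared_ptr<Config> next = std::make_shared<Config>();
    next->value = value;
    std::atomic_store(&g_config, next);
}

// Reader side: atomically take a reference-counted snapshot.
std::shared_ptr<Config> snapshot() {
    return std::atomic_load(&g_config);
}

// Usage: a reader's snapshot stays valid after a newer value is published.
void usage_example() {
    publish(1);
    std::shared_ptr<Config> old_snapshot = snapshot();
    publish(2);
    assert(old_snapshot->value == 1);
    assert(snapshot()->value == 2);
}
```

These overloads are not guaranteed to be lock-free, and an implementation may use locks internally, so it is worth measuring before assuming they beat a plain mutex around the pointer.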

Related

Is it necessary to free a mutex created by xSemaphoreCreateMutex()?

FreeRTOS and ESP-IDF provide xSemaphoreCreateMutex() which allocates and initializes a mutex. Their docs say:
If a mutex is created using xSemaphoreCreateMutex() then the required
memory is automatically dynamically allocated inside the
xSemaphoreCreateMutex() function. (see
http://www.freertos.org/a00111.html).
However, I can't find any info on whether it is necessary to free the memory allocated for the mutex. This would be important if using C++ with a mutex member variable, like:
class MyClass
{
public:
    MyClass()
    {
        mAccessMutex = xSemaphoreCreateMutex();
    }
    ~MyClass()
    {
        // What to do here??
    }
private:
    SemaphoreHandle_t mAccessMutex;
};
REFERENCE
https://docs.espressif.com/projects/esp-idf/en/latest/esp32/api-reference/system/freertos.html?highlight=xsemaphore#semaphore-api
According to the FreeRTOS API reference, the proper way to destroy/delete a mutex is vSemaphoreDelete():
Deletes a semaphore, including mutex type semaphores and recursive
semaphores.
Do not delete a semaphore that has tasks blocked on it.
If you're using heap_1, deleting is not possible. Also, make sure that you fully understand the perils of dynamic memory allocation in embedded systems before using it. If MyClass is going to be created and destroyed on a regular/periodic basis, this may cause problems.
So yes, it's necessary to call vSemaphoreDelete(mAccessMutex) in ~MyClass(). But it's probably best to make sure that MyClass instances never get destroyed. In my projects, I generally use one-time dynamic allocation during initialization and forget to implement a proper destructor (which is a bad habit that I need to fix).
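A minimal sketch of such a destructor, assuming MyClass owns the mutex exclusively and no task is blocked on it when the object is destroyed. The ESP_PLATFORM check assumes an ESP-IDF build. The #else branch holds host-side stand-ins (not the real FreeRTOS implementation) that exist only so the sketch compiles off-target, and g_liveMutexes is a hypothetical counter used to check the pairing.

```cpp
#include <cassert>

#if defined(ESP_PLATFORM)  // assumption: ESP-IDF build, which defines this macro
#include "freertos/FreeRTOS.h"
#include "freertos/semphr.h"
#else
// Host-side stand-ins so the sketch compiles off-target. NOT the real API.
typedef int* SemaphoreHandle_t;
static int g_liveMutexes = 0;  // hypothetical counter, illustration only
inline SemaphoreHandle_t xSemaphoreCreateMutex() { ++g_liveMutexes; return new int(0); }
inline void vSemaphoreDelete(SemaphoreHandle_t h) { --g_liveMutexes; delete h; }
#endif

class MyClass {
public:
    MyClass() : mAccessMutex(xSemaphoreCreateMutex()) {
        // xSemaphoreCreateMutex() returns NULL when allocation fails.
        assert(mAccessMutex != nullptr);
    }

    ~MyClass() {
        // Release the memory allocated by xSemaphoreCreateMutex().
        // Precondition: no task is blocked on the mutex.
        vSemaphoreDelete(mAccessMutex);
    }

    // Non-copyable: a copy would delete the same handle twice.
    MyClass(const MyClass&) = delete;
    MyClass& operator=(const MyClass&) = delete;

private:
    SemaphoreHandle_t mAccessMutex;
};
```

Deleting the copy operations is a design choice for this sketch. Without it, the compiler-generated copy would leave two objects deleting the same handle.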

Do WinAPI Slim Reader/Writer (SRW) Locks use memory barriers?

The WinAPI documentation says:
"The following synchronization functions use the appropriate barriers to ensure memory ordering:
Functions that enter or leave critical sections
Functions that signal synchronization objects
Wait functions
Interlocked functions"
Synchronization documentation
Now the question is: Do WinAPI Slim Reader/Writer (SRW) Locks also use memory barriers?
SRW Locks documentation
Note: WinAPI SRW Locks are neither critical sections nor synchronization objects.

Can I use a register as a loop counter?

Since the calling convention of a function states which registers are preserved, can a register be used as a loop counter?
I first thought that the ecx register is used as a loop counter, but after finding out that an stdcall function I have used has not preserved the value of ecx, I thought otherwise.
Is there a register that is guaranteed (by mostly used calling conventions at least) to be preserved?
Note: I don't have a problem in using a stack variable as a loop counter, I just want to make sure that it is the only way.
You can use any general-purpose register, and occasionally others, as the loop counter (just not the stack pointer of course ☺).
Either you use one to loop manually, i.e. replace…
loop label
… with…
dec ebp
jnz label
… which is faster anyway (AMD, and later Intel once they caught up MHz-wise, artificially slowed down the loop instruction, because otherwise Windows® and some software compiled with Turbo Pascal crashed).
Or you just save the counter in between:
label:
push ecx
call func
pop ecx
loop label
Both are standard strategies.
Is there a register that is guaranteed (by mostly used calling conventions at least) to be preserved?
You can choose any free register in your own code if your loop code will not call any external entity.
If your loop code will call an external entity where the only guaranteed contract is the ABI and calling convention then you must save/restore your registers and make the register choice case-by-case.
Quoting Agner Fog's excellent paper Calling conventions for different C++ compilers and operating systems:
6 Register usage
The rules for register usage depend on the operating system, as shown in table 4. Scratch registers are registers that can be used for temporary storage without restrictions (also called caller-save or volatile registers). Callee-save registers are registers that you have to save before using them and restore after using them (also called non-volatile registers). You can rely on these registers having the same value after a call as before the call...
...
See also:
Wikipedia: x86 calling conventions

When using CoTaskMemAlloc, should I always call CoTaskMemFree?

I'm writing some COM and ATL code, and for some reason all the code uses CoTaskMemAlloc to allocate memory instead of new or malloc. So I followed along this coding style and I also use CoTaskMemAlloc.
My teachers taught me to always delete or free when allocating memory. However I'm not sure if I should always be calling CoTaskMemFree if I use CoTaskMemAlloc?
Using the CRT's provided new/malloc and delete/free is a problem in COM interop. To make them work, it is very important that the same copy of the CRT both allocates and releases the memory. That's impossible to enforce in a COM interop scenario: your COM server and the client are practically guaranteed to use different versions of the CRT, each using its own heap to allocate from. This causes undiagnosable memory leaks on Windows XP, a hard exception on Vista and up.
Which is why the COM heap exists, a single predefined heap in a process that's used both by the server and the client. IMalloc is the generic interface to access that shared heap, CoTaskMemAlloc() and CoTaskMemFree() are the system provided helper functions to use that interface.
That said, this is only necessary in a case where the server allocates memory and the client has to release it. Or the other way around. Which should always be rare in an interop scenario, the odds for accidents are just too large. In COM Automation there are just two such cases, a BSTR and a SAFEARRAY, types that are already wrapped. You avoid it in other cases by having the method caller provide the memory and the callee fill it in. Which also allows a strong optimization, the memory could come from the caller's stack.
Review the code and check who allocates the memory and who needs to release it. If both exist in the same module then using new/malloc is fine because there's now a hard guarantee that the same CRT instance takes care of it. If that's not the case then consider fixing it so the caller provides the memory and releases it.
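The "caller provides the memory, callee fills it in" pattern described above might look like the following portable sketch. get_device_name and the string it returns are hypothetical and not part of any COM API. The point is only that no allocation crosses the module boundary, so nobody has to agree on a heap.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical callee: writes into caller-provided storage and returns the
// number of bytes required, including the terminating NUL. The buffer is
// left untouched when it is too small.
std::size_t get_device_name(char* buffer, std::size_t capacity) {
    static const char kName[] = "example-device";
    const std::size_t needed = sizeof(kName);

    // A null buffer is only acceptable for a pure size query.
    assert(buffer != nullptr || capacity == 0);

    if (buffer != nullptr && capacity >= needed) {
        std::memcpy(buffer, kName, needed);
    }
    return needed;
}
```

A caller can pass a stack array such as char name[32] together with sizeof name, or call get_device_name(nullptr, 0) first to learn the required size.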
The allocation and freeing of memory must always come from the same source. If you use CoTaskMemAlloc then you must use CoTaskMemFree to free the memory.
Note, though, that in C++ memory management and object construction/destruction (new/delete) are independent actions. It's possible to customize specific types to use a different memory allocator and still allow the standard new/delete syntax, which is preferred. For example:
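#include <new>        // std::bad_alloc
#include <objbase.h>  // CoTaskMemAlloc, CoTaskMemFree

class MyClass {
public:
    // A throwing operator new must not return null, so report failure
    // by throwing std::bad_alloc.
    void* operator new(size_t size) {
        void* p = ::CoTaskMemAlloc(size);
        if (!p) throw std::bad_alloc();
        return p;
    }
    void* operator new[](size_t size) {
        void* p = ::CoTaskMemAlloc(size);
        if (!p) throw std::bad_alloc();
        return p;
    }
    // CoTaskMemFree accepts a null pointer, so no check is needed here.
    void operator delete(void* pMemory) {
        ::CoTaskMemFree(pMemory);
    }
    void operator delete[](void* pMemory) {
        ::CoTaskMemFree(pMemory);
    }
};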
Now I can use this type just like any other C++ type, and yet the memory will come from the COM heap:
// Normal object construction, but memory comes from CoTaskMemAlloc
MyClass *pClass = new MyClass();
...
// Normal object destruction, and memory is freed by CoTaskMemFree
delete pClass;
The answer to the question is: Yes, you should use CoTaskMemFree to free memory allocated with CoTaskMemAlloc.
The other answers do a good job explaining why CoTaskMemAlloc and CoTaskMemFree are necessary for memory passed between COM servers and COM clients, but they didn't directly answer your question.
Your teacher was right: You should always use the corresponding release function for any resource. If you use new, use delete. If you use malloc, use free. If you use CreateFile, use CloseHandle. Etc.
Better yet, in C++, use RAII objects that allocate the resource in the constructor and release the resource in the destructor, and then use those RAII wrappers instead of the bare functions. This makes it easier and cleaner to write code that doesn't leak, even if you get something like an exception.
The standard template library provides containers that implement RAII, which is why you should learn to use a std::vector or std::string rather than allocating bare memory and trying to manage it yourself. There are also smart pointers like std::shared_ptr and std::unique_ptr that can be used to make sure the right release call is always made at the right time.
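As a hedged illustration of pairing an allocator with its matching release call through a smart pointer, here is a portable sketch using malloc/free. On Windows the same shape should work with CoTaskMemAlloc/CoTaskMemFree swapped in. MallocDeleter, MallocBuffer, make_buffer, and the g_releaseCount counter are hypothetical names introduced for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <memory>

// Hypothetical counter so the pairing can be checked. Illustration only.
static int g_releaseCount = 0;

// Deleter that calls the release function matching std::malloc.
struct MallocDeleter {
    void operator()(void* p) const noexcept {
        ++g_releaseCount;
        std::free(p);  // with the COM heap this would be ::CoTaskMemFree(p)
    }
};

// unique_ptr invokes the deleter only when it holds a non-null pointer.
typedef std::unique_ptr<char, MallocDeleter> MallocBuffer;

// Allocates `size` bytes. The result is empty if the allocation fails.
MallocBuffer make_buffer(std::size_t size) {
    return MallocBuffer(static_cast<char*>(std::malloc(size)));
}

// Usage: the buffer is released when it leaves scope, even on early return
// or when an exception propagates.
void usage_example() {
    const int before = g_releaseCount;
    {
        MallocBuffer buf = make_buffer(16);
        assert(buf);
        std::memset(buf.get(), 0, 16);
    }
    assert(g_releaseCount == before + 1);
}
```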
ATL provides some classes like ATL::CComPtr which are wrapper objects that handle the reference counting of COM objects for you. They are not foolproof to use correctly, and, in fact, have a few more gotchas than most of the modern STL classes, so read the documentation carefully. When used correctly, it's relatively easy to make sure the AddRef and Release calls all match up.

Sharing GlobalAlloc() memory from DLL to multiple Win32 applications

I want to move my caching library to a DLL and allow multiple applications to share a single pointer allocated within the DLL using GlobalAlloc(). How could I accomplish this, and would it result in a significant performance decrease?
You could certainly do this and there won't be any performance implication for a single pointer.
Rather than use GlobalAlloc, a legacy API, you should opt for a different shared heap. For example the simplest to use is the COM allocator, CoTaskMemAlloc. Or you can use HeapAlloc passing the process heap obtained by GetProcessHeap.
For example, and neglecting to show error checking:
void *mem = HeapAlloc(GetProcessHeap(), HEAP_ZERO_MEMORY, size);
Note that you only need to worry about heap sharing if you expect the memory to be deallocated in a different module from where it was created. If your DLL both creates and destroys the memory then you can use plain old malloc. Because all modules live in the same process address space, memory allocated by any module in that process can be used by any other module.
Update
On first reading of the question, I failed to pick up on the possibility that you may want multiple processes to have access to the same memory. If that's what you need, then it is only possible with memory-mapped files, or perhaps with some form of IPC.
