C++ memory management patterns for objects used in callback chains

A couple of codebases I use include classes that manually call new and delete in the following pattern:
class Worker {
public:
    static void DoWork(ArgT arg, std::function<void()> done) {
        // The Worker deletes itself in Finish().
        (new Worker(std::move(arg), std::move(done)))->Start();
    }
private:
    Worker(ArgT arg, std::function<void()> done)
        : arg_(std::move(arg)),
          done_(std::move(done)),
          latch_(2) {} // The error-prone Latch interface isn't the point of this question. :)
    void Start() {
        Async1(<args>, [=]() { this->Method1(); });
    }
    void Method1() {
        StartParallel(<args>, [=]() { this->latch_.count_down(); });
        StartParallel(<other_args>, [=]() { this->latch_.count_down(); });
        latch_.then([=]() { this->Finish(); });
    }
    void Finish() {
        done_();
        // Note manual memory management!
        delete this;
    }
    ArgT arg_;
    std::function<void()> done_;
    Latch latch_;
};
Now, in modern C++, explicit delete is a code smell, as, to some extent, is delete this. However, I think this pattern (creating an object to represent a chunk of work managed by a callback chain) is fundamentally a good, or at least not a bad, idea.
So my question is, how should I rewrite instances of this pattern to encapsulate the memory management?
One option that I don't think is a good idea is storing the Worker in a shared_ptr: fundamentally, ownership is not shared here, so the overhead of reference counting is unnecessary. Furthermore, in order to keep a copy of the shared_ptr alive across the callbacks, I'd need to inherit from enable_shared_from_this, and remember to call that outside the lambdas and capture the shared_ptr into the callbacks. If I ever wrote the simple code using this directly, or called shared_from_this() inside the callback lambda, the object could be deleted early.

I agree that delete this is a code smell, and to a lesser extent delete on its own. But I think that here it is a natural part of continuation-passing style, which (to me) is itself something of a code smell.
The root problem is that the design of this API assumes unbounded control-flow: it acknowledges that the caller is interested in what happens when the call completes, but signals that completion via an arbitrarily-complex callback rather than simply returning from a synchronous call. Better to structure it synchronously and let the caller determine an appropriate parallelization and memory-management regime:
class Worker {
public:
    void DoWork(ArgT arg) {
        // Async1 is a mistake; fix it later. For now, synchronize explicitly.
        Latch async_done(1);
        Async1(<args>, [&]() { async_done.count_down(); });
        async_done.await();
        Latch parallel_done(2);
        RunParallel([&]() { DoStuff(<args>); parallel_done.count_down(); });
        RunParallel([&]() { DoStuff(<other_args>); parallel_done.count_down(); });
        parallel_done.await();
    }
};
On the caller-side, it might look something like this:
Latch latch(tasks.size());
for (auto& task : tasks) {
    // Capture by reference: the latch must be shared with the tasks, not copied into them.
    RunParallel([&]() { DoWork(<args>); latch.count_down(); });
}
latch.await();
Here RunParallel can use std::thread or whatever other mechanism you like for dispatching parallel events.
The advantage of this approach is that object lifetimes are much simpler. The ArgT object lives for exactly the scope of the DoWork call. The arguments to DoWork live exactly as long as the closures containing them. This also makes it much easier to add return-values (such as error codes) to DoWork calls: the caller can just switch from a latch to a thread-safe queue and read the results as they complete.
The disadvantage of this approach is that it requires actual threading, not just boost::asio::io_service. (For example, the RunParallel calls within DoWork() can't block on waiting for the RunParallel calls from the caller side to return.) So you either have to structure your code into strictly-hierarchical thread pools, or you have to allow a potentially-unbounded number of threads.
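As a rough illustration of the caller-side structure described above, here is a minimal sketch that uses std::async in place of RunParallel and std::future in place of the latch, which also gives each call a return value. DoWork, RunAll, and the int argument and result types are placeholders invented for this example; they are not part of the original API.

```cpp
#include <cassert>
#include <future>
#include <vector>

// Hypothetical synchronous unit of work: it returns only when it is done,
// so everything it needs can live on its own stack.
int DoWork(int arg) {
    return arg * arg;
}

// The caller decides how to parallelize: one std::async task per input.
// The futures play the role of the latch and also carry the results back.
std::vector<int> RunAll(const std::vector<int>& tasks) {
    std::vector<std::future<int>> pending;
    pending.reserve(tasks.size());
    for (int task : tasks) {
        pending.push_back(std::async(std::launch::async, DoWork, task));
    }

    std::vector<int> results;
    results.reserve(pending.size());
    for (auto& f : pending) {
        results.push_back(f.get());  // blocks until that task has finished
    }
    assert(results.size() == tasks.size());
    return results;
}
```

Because every task is launched with std::launch::async, this sketch shares the drawback mentioned above: it relies on real threads, one per task in the worst case.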

Another view is that the delete this here is not a code smell at all. At most, it should be wrapped in a small library that detects when all the continuation callbacks have been destroyed without done_() ever being called.
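A first step toward such a wrapper might be to confine the deletion to one place by re-adopting this into a std::unique_ptr at the end of the chain. The sketch below assumes a simplified Worker with an int argument, a single synchronous stand-in for the asynchronous steps, and a live_count counter that exists only for the demonstration. It does not attempt to detect callbacks that are dropped without being called.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>

class Worker {
public:
    // Number of Worker objects currently alive; present only so the example
    // can show that nothing leaks.
    static int live_count;

    static void DoWork(int arg, std::function<void(int)> done) {
        // The raw pointer is handed to the callback chain, which owns the
        // object until Finish() takes ownership back.
        Worker* self = new Worker(arg, std::move(done));
        self->Start();
    }

    ~Worker() { --live_count; }

private:
    Worker(int arg, std::function<void(int)> done)
        : arg_(arg), done_(std::move(done)) {
        assert(done_ && "DoWork requires a completion callback");
        ++live_count;
    }

    void Start() {
        // Stand-in for an asynchronous step: a real chain would hand this
        // lambda to an executor instead of calling it directly.
        std::function<void()> next = [this] { Finish(); };
        next();
    }

    void Finish() {
        // Re-adopt ownership. The object is destroyed when `self` goes out
        // of scope, even if done_ throws.
        std::unique_ptr<Worker> self(this);
        done_(arg_ + 1);
    }

    int arg_;
    std::function<void(int)> done_;
};

int Worker::live_count = 0;
```

No member of the Worker is touched after Finish() returns, which is the same rule that applies to a plain delete this.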

Related

Removing a std::function<()> from a vector c++

I'm building a publish-subscribe class (called SystemInterface), which is responsible for receiving updates from its instances and publishing them to subscribers.
Adding a subscriber callback function is trivial and has no issues, but removing it yields an error, because std::function<()> is not comparable in C++.
std::vector<std::function<void()>> subs;
void subscribe(std::function<void()> f)
{
    subs.push_back(f);
}
void unsubscribe(std::function<void()> f)
{
    std::remove(subs.begin(), subs.end(), f); // Error
}
I've come up with five possible solutions to this error:
1. Register the function through a weak_ptr, where the subscriber must keep the returned shared_ptr alive. Solution example at this link.
2. Instead of registering in a vector, map each callback function by a custom key that is unique per callback function. Solution example at this link.
3. Use a vector of function pointers. Example.
4. Make the callback function comparable by utilizing its address.
5. Use an interface class (parent class) to call a virtual function. In my design, all intended classes inherit from a parent class called ServiceCore, so instead of registering a callback function, I would just register a ServiceCore reference in the vector. This works given that the SystemInterface class has a per-instance ID field (which is managed by ServiceCore, and supplied to SystemInterface by constructing a ServiceCore child instance).
From my perspective, the first solution is neat and would work, but it requires handling on the subscriber side, which I would rather avoid.
The second solution would make my implementation more complex. My implementation currently looks like this:
using namespace std;
enum INFO_SUB_IMPORTANCE : uint8_t
{
    INFO_SUB_PRIMARY,       // Only gets the important updates.
    INFO_SUB_COMPLEMENTARY, // Gets more.
    INFO_SUB_ALL            // Gets all updates
};
using CBF = function<void(string,string)>;
using INFO_SUBTREE = map<INFO_SUB_IMPORTANCE, vector<CBF>>;
using REQINF_SUBS = map<string, INFO_SUBTREE>; // It's keyed by an iterator; explaining that is beyond the scope of this question.
using INFSRC_SUBS = map<string, INFO_SUBTREE>;
using WILD_SUBS = INFO_SUBTREE;
REQINF_SUBS infoSubrs;
INFSRC_SUBS sourceSubrs;
WILD_SUBS wildSubrs;
void subscribeInfo(string info, INFO_SUB_IMPORTANCE imp, CBF f) {
    infoSubrs[info][imp].push_back(f);
}
void subscribeSource(string source, INFO_SUB_IMPORTANCE imp, CBF f) {
    sourceSubrs[source][imp].push_back(f);
}
void subscribeWild(INFO_SUB_IMPORTANCE imp, CBF f) {
    wildSubrs[imp].push_back(f);
}
The second solution would require INFO_SUBTREE to become a nested map, but it could then be keyed by an ID:
using KEY_T = uint32_t; // or string...
using INFO_SUBTREE = map<INFO_SUB_IMPORTANCE, map<KEY_T,CBF>>;
For the third solution, I'm not aware of the limitations of using function pointers, and I don't know the consequences of the fourth solution.
The fifth solution would eliminate the need to deal with CBFs, but it would be more complex on the subscriber side: a subscriber has to override the virtual function and therefore receives all updates in one place, which then requires filtering by message ID and directing the payload to the intended routines through multiple if/else blocks, and these grow with the number of subscriptions.
What I'm looking for is advice on the best available option.
Regarding your proposed solutions:
1. That would work. It can be made easy for the caller: have subscribe() create the shared_ptr and corresponding weak_ptr objects, and let it return the shared_ptr.
2. Then the caller must not lose the key. In a way this is similar to the above.
3. This of course is less generic, and then you can no longer have (the equivalent of) captures.
4. You can't: there is no general way to get the address of the function stored inside a std::function (target() only helps if you already know the exact type of the stored callable). You can do &f inside subscribe() but that will only give you the address of the local variable f, which will go out of scope as soon as you return.
5. That works, and is in a way similar to 1 and 2, although now the "key" is provided by the caller.
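To make the remark about option 1 concrete, here is a minimal sketch in which subscribe() creates the shared_ptr itself and hands it back as a token, while the publisher keeps only a weak_ptr. The Publisher class, the Token alias, and slot_count() are names made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

class Publisher {
public:
    using Callback = std::function<void()>;
    using Token = std::shared_ptr<Callback>;

    // The returned token is the subscription: dropping it unsubscribes.
    Token subscribe(Callback f) {
        assert(f && "empty callbacks cannot be subscribed");
        auto token = std::make_shared<Callback>(std::move(f));
        subs_.push_back(token);  // stored as a weak_ptr, so it does not extend the lifetime
        return token;
    }

    // Calls every live subscriber and prunes the expired ones.
    // Not reentrant: callbacks must not call subscribe() on this publisher.
    void publish() {
        for (auto it = subs_.begin(); it != subs_.end();) {
            if (Token live = it->lock()) {
                (*live)();
                ++it;
            } else {
                it = subs_.erase(it);
            }
        }
    }

    std::size_t slot_count() const { return subs_.size(); }

private:
    std::vector<std::weak_ptr<Callback>> subs_;
};
```

The subscriber still has to hold on to the token, which is the handling cost the question wanted to avoid, but no explicit unsubscribe call is needed.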
Options 1, 2 and 5 are similar in that there is some other data stored in subs that refers to the actual std::function: either a std::shared_ptr, a key or a pointer to a base class. I'll present option 6 here, which is kind of similar in spirit but avoids storing any extra data:
Store a std::function<void()> directly, and return the index in the vector where it was stored. When removing an item, don't std::remove() it, but just set it to nullptr. Next time subscribe() is called, it checks if there is an empty element in the vector and reuses it:
std::vector<std::function<void()>> subs;
std::size_t subscribe(std::function<void()> f) {
    if (auto it = std::find(subs.begin(), subs.end(), nullptr); it != subs.end()) {
        *it = f;
        return std::distance(subs.begin(), it);
    } else {
        subs.push_back(f);
        return subs.size() - 1;
    }
}
void unsubscribe(std::size_t index) {
    subs[index] = nullptr;
}
The code that actually calls the functions stored in subs must now of course first check for empty entries. The above works because nullptr is treated as the "empty" function, and there is an operator==() overload that can compare a std::function against nullptr, thus making std::find() work.
One drawback of option 6 as shown above is that a std::size_t is a rather generic type. To make it safer, you might wrap it in a class SubscriptionHandle or something like that.
As for the best solution: option 1 is quite heavy-weight. Options 2 and 5 are very reasonable, but 6 is, I think, the most efficient.
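A sketch of option 6 with the suggested handle wrapper might look like the following. The Publisher class, publish(), and slot_count() are illustrative names, and std::find_if with an emptiness test is used here as an equivalent of comparing each element against nullptr.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Thin wrapper so a subscription cannot be confused with an arbitrary integer.
class SubscriptionHandle {
public:
    explicit SubscriptionHandle(std::size_t index) : index_(index) {}
    std::size_t index() const { return index_; }
private:
    std::size_t index_;
};

class Publisher {
public:
    SubscriptionHandle subscribe(std::function<void()> f) {
        // Reuse the first empty slot, if there is one.
        auto it = std::find_if(subs_.begin(), subs_.end(),
                               [](const std::function<void()>& s) { return !s; });
        if (it != subs_.end()) {
            *it = std::move(f);
            return SubscriptionHandle(static_cast<std::size_t>(it - subs_.begin()));
        }
        subs_.push_back(std::move(f));
        return SubscriptionHandle(subs_.size() - 1);
    }

    void unsubscribe(SubscriptionHandle handle) {
        assert(handle.index() < subs_.size());
        subs_[handle.index()] = nullptr;  // leave a reusable hole
    }

    void publish() const {
        for (const auto& f : subs_) {
            if (f) f();  // skip empty slots
        }
    }

    std::size_t slot_count() const { return subs_.size(); }

private:
    std::vector<std::function<void()>> subs_;
};
```

One limitation of this sketch: a handle that is used again after its slot has been reused would unsubscribe the new occupant, so callers should discard a handle once they have unsubscribed with it.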

Why one may need a shared_from_this instead of directly using this pointer?

Look at the second answer here:
What is the need for enable_shared_from_this?
it says:
"Short answer: you need enable_shared_from_this when you need to use inside the object itself existing shared pointer guarding this object.
Out of the object you can simply assign and copy a shared_ptr because you deal with the shared_ptr variable as is."
and later down in the last line it says:
"And when and why one can need a shared pointer to this instead of just this it is quite other question. For example, it is widely used in asynchronous programming for callbacks binding."
Here in this post I want to ask exactly this other question. What is a use case for "enable_shared_from_this" and "shared_from_this"?
A simple use case would be to ensure that this survives until the end of some asynchronous or delayed operation:
class My_type : public std::enable_shared_from_this<My_type> {
public:
    void foo() {}
    std::future<void> perform_foo() {
        auto self = shared_from_this();
        // Return the future: if it were discarded here, its destructor
        // would block until foo() had finished.
        return std::async(std::launch::async, [self, this]{ foo(); });
    }
};
boost::asio uses this technique a lot in its examples:
https://www.boost.org/doc/libs/1_66_0/doc/html/boost_asio/example/cpp11/allocation/server.cpp
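The lifetime effect can be observed without any threads. In the sketch below, Session, make_callback(), and demo() are hypothetical names. The callback captures the result of shared_from_this(), so the object outlives the caller's own shared_ptr and is destroyed only when the callback is released.

```cpp
#include <cassert>
#include <functional>
#include <memory>

class Session : public std::enable_shared_from_this<Session> {
public:
    // Returns a callback that can safely run after the caller has dropped
    // its own shared_ptr. Must be called on an object owned by a shared_ptr.
    std::function<int()> make_callback() {
        auto self = shared_from_this();  // shares ownership with the existing shared_ptr
        return [self] { return self->value_; };
    }

private:
    int value_ = 42;
};

// Walks through the lifetime of one Session.
void demo() {
    auto session = std::make_shared<Session>();
    std::weak_ptr<Session> observer = session;

    std::function<int()> callback = session->make_callback();
    session.reset();              // the caller lets go of the object
    assert(!observer.expired());  // the pending callback keeps it alive

    int result = callback();
    assert(result == 42);

    callback = nullptr;           // the last owner is released
    assert(observer.expired());
}
```

Had the lambda captured only this, the call after session.reset() would have accessed a destroyed object.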

Avoiding deadlock in reentrant code C++11

I am working on refactoring some legacy code that suffers from deadlocks. There are two main root causes:
1) the same thread locking the same mutex multiple times, which should not be difficult to resolve, and
2) the code occasionally calls into user defined functions which can enter the same code at the top level. I need to lock the mutex before calling user defined functions, but I might end up executing the same code again which will result in a deadlock situation. So, I need some mechanism to tell me that the mutex has already been locked and I should not lock it again. Any suggestions?
Here is a (very) brief summary of what the code does:
class TreeNode {
public:
    // Assign a new value to this tree node
    void set(const boost::any& value, boost::function<void(const TreeNode&)> validator) {
        boost::upgrade_lock<boost::shared_mutex> lock(mutexToRoot_);
        // call validator here
        boost::upgrade_to_unique_lock<boost::shared_mutex> ulock(lock);
        // set this TreeNode to value
    }
    // Retrieve the value of this tree node
    boost::any get() {
        boost::shared_lock<boost::shared_mutex> lock(mutexToRoot_);
        // get value for this tree node
    }
private:
    static boost::shared_mutex mutexToRoot_;
};
The problem is that the validator function can call into get(), which locks mutexToRoot_ on the same thread. I could modify mutexToRoot_ to be a recursive mutex but that would prevent other threads from reading the tree during get() operation, which is unwanted behavior.
Since C++11 you can use std::recursive_mutex. It allows the owning thread to call lock or try_lock again without blocking or reporting failure, while other threads block on lock (or receive false from try_lock) until the owning thread has called unlock as many times as it called lock or try_lock.
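A minimal sketch of the reentrant case, assuming a simplified TreeNode that stores an int instead of boost::any, uses one mutex per node instead of a tree-wide static one, and takes a validator that returns bool:

```cpp
#include <cassert>
#include <functional>
#include <mutex>

class TreeNode {
public:
    using Validator = std::function<bool(const TreeNode&)>;

    // Assigns a new value if the validator accepts the current state.
    void set(int value, const Validator& validator) {
        std::lock_guard<std::recursive_mutex> lock(mutex_);
        assert(validator && "set() needs a validator");
        // The validator may call get() on this thread. The recursive mutex
        // lets the owning thread lock again instead of deadlocking.
        if (validator(*this)) {
            value_ = value;
        }
    }

    int get() const {
        std::lock_guard<std::recursive_mutex> lock(mutex_);
        return value_;
    }

private:
    mutable std::recursive_mutex mutex_;
    int value_ = 0;
};
```

As the question points out, this gives up concurrent readers: std::recursive_mutex is exclusive, so get() calls from other threads are serialized as well.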

std::unique_ptr<Object> and many viewers (Object*), is it good design?

Say I want to manage an Object with unique_ptr in a sort of master class. However, I'm in a situation where many other classes need to use this Object. I'm passing Object* to them. I don't think this is a good design, but I can't find a right solution.
class Gadget1 {
    Object* obj_;
public:
    Gadget1(Object* obj) : obj_(obj) {}
};
class Gadget2 {
    // .. similar
};
class Worker {
    std::unique_ptr<Object> obj_;
public:
    void init() {
        obj_ = std::make_unique<Object>(...);
        createGadget1(obj_.get());
        createGadget2(obj_.get());
        ...
    }
};
What would be a right and safe approach? Should Gadget have unique_ptr<Object>& instead of Object*?
Assume that the lifetime of Gadget1 is guaranteed to be shorter than that of Worker.
Your design is perfectly fine: smart pointers for the owner(s), and raw pointers for everyone else.
If you cannot guarantee that the object outlives its observers, either:
- Notify the observers when an object dies so they can update their raw pointer, or
- Give std::weak_ptrs instead of raw pointers to the observers so they can check (this requires the owner to hold a std::shared_ptr rather than a std::unique_ptr).
In any case, you should not use std::unique_ptr<Object> &: observers should not care about how the object's lifetime is ensured.
Plus, this adds nothing over a raw pointer: if the object dies, it's because its owner died, so the std::unique_ptr is dead too, and the reference is dangling -- back to square one.
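Both variants can be sketched side by side. Object here is a stand-in struct, and Gadget, CheckedGadget, and try_read() are hypothetical names. The weak_ptr variant assumes the owner has switched to std::shared_ptr.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct Object {
    int value = 7;
};

// Non-owning observer: valid only while the owner keeps the Object alive.
class Gadget {
public:
    explicit Gadget(const Object* obj) : obj_(obj) {
        assert(obj != nullptr);
    }
    int read() const { return obj_->value; }

private:
    const Object* obj_;
};

// Observer for the case where the lifetime guarantee does not hold.
// This requires the owner to hold a std::shared_ptr, not a std::unique_ptr.
class CheckedGadget {
public:
    explicit CheckedGadget(std::weak_ptr<Object> obj) : obj_(std::move(obj)) {}

    // Writes the value to `out` and returns true if the Object still exists.
    bool try_read(int& out) const {
        if (auto live = obj_.lock()) {
            out = live->value;
            return true;
        }
        return false;
    }

private:
    std::weak_ptr<Object> obj_;
};
```

Gadget takes the raw pointer obtained from the owner's get(), exactly as in the question, while CheckedGadget can notice that the object is gone instead of dereferencing a dangling pointer.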

Casting std::future or std::shared_future in c++11

This may sound stupid, but C++ and C++11 have surprised me before in terms of the magic they can achieve. Perhaps this is too far, but I prefer confirming my fears rather than assuming them.
Is it possible in any way to cast a std::future or std::shared_future object?
I find it usually helps if I describe the concrete problem I am having. I am essentially loading some audio and video asynchronously, and although I've started using std::async, which I find really useful, I hadn't used futures until now. It's essentially born out of me learning that futures seem to handle exceptions fairly well, and I want to make my async loading a bit more robust. My crummy program will occasionally run out of memory, but that won't occur to the program until the async call has been launched. Solving the memory issue is another issue entirely and not a viable solution currently.
Anyway - I have two separate objects that handle the loading of audio (AudioLibrary) and video (VideoLibrary), but since they share a number of commonalities they both inherit from the same base object (BaseLibrary).
The audio and video each of these respective libraries return come in their own containers for audio (AudioTrack) and video (VideoTrack), which also inherit from a common object (BaseTrack).
I'm sure you can see where this is going. I'd like some general exception handling to occur in the BaseLibrary, which will have some virtual functions like loadMedia. These will be overridden by the derived libraries. Thus the trouble begins. I read that pointer objects (like unique_ptr or shared_ptr) cannot be covariant, and so just creating a virtual method doesn't quite solve it.
However, I was hoping via virtual functions I could still somehow achieve what I wanted.
Something along the lines of BaseLibrary implementing the following:
std::shared_future<BaseTrack> BaseLibrary::loadMedia()
std::shared_future<BaseTrack> BaseLibrary::loadMediaHelper()
and then AudioLibrary would implement
std::shared_future<AudioTrack> AudioLibrary::loadAudio()
where this function makes use of the functions in the BaseLibrary yet returns its own specific type of AudioTrack, rather than a BaseTrack.
Is this at all possible?
Update 1:
Thanks to the comment and answer, I see how it's possible to achieve what I want, but I have a few more unresolved questions. I think it'll be much easier to address those by just being very explicit. I'm actually working with shared_ptrs since a number of objects are making use of the loaded audio and video, so I have the following type defs:
typedef std::shared_ptr<BaseTrack> BaseTrackPtr;
typedef std::shared_ptr<AudioTrack> AudioTrackPtr;
AudioTrack inherits from BaseTrack of course. Following the given advice I have a compile-able (abbreviated) code structure which is as follows for the BaseLibrary:
class BaseLibrary {
public:
    virtual std::shared_future<BaseTrackPtr> loadMedia();
protected:
    virtual std::shared_future<BaseTrackPtr> loadMediaHelper() = 0;
};
std::shared_future<BaseTrackPtr> BaseLibrary::loadMedia()
{
    // Place code to catch exceptions coming through the std::future here.
    // Call the loadMediaHelper via async - loadMediaHelper is overridden in the inherited libraries.
}
And the AudioLibrary:
class AudioLibrary : public BaseLibrary {
public:
    virtual std::shared_future<AudioTrackPtr> loadAudio();
protected:
    virtual std::shared_future<BaseTrackPtr> loadMediaHelper();
};
std::shared_future<AudioTrackPtr> AudioLibrary::loadAudio()
{
    std::shared_future<BaseTrackPtr> futureBaseTrackPtr = loadMedia();
    return std::async( std::launch::deferred, [=]() {
        return AudioTrackPtr( std::static_pointer_cast<AudioTrack>( futureBaseTrackPtr.get() ) );
    } );
}
std::shared_future<BaseTrackPtr> AudioLibrary::loadMediaHelper()
{
    // Place specific audio loading code here
}
This structure allows me to catch any video/audio loading exceptions in one place, and also return the proper Audio/Video Object rather than a base object that needs to be recast.
My two current questions are as follows:
Isn't it best to let the async call in loadMedia in the BaseLibrary be std::launch::deferred, and then let the async calls in either loadAudio (or loadVideo) be std::launch::async? I essentially want the loading to commence immediately, but it might as well wait until the outer async call is performed...? Does that make sense?
Finally, is this hideously ugly? A part of me feels like I'm properly leveraging all the goodness C++11 has to offer, shared_ptr's, futures and so forth. But I'm also quite new to futures so... I don't know if putting a shared pointer in a shared future is... Weird?
So you have something like:
class BaseLibrary
{
public:
    virtual ~BaseLibrary() {}
    virtual std::shared_future<std::unique_ptr<BaseTrack>> loadMedia() = 0;
};
class AudioLibrary : public BaseLibrary
{
public:
    std::shared_future<AudioTrack> loadAudio();
    std::shared_future<std::unique_ptr<BaseTrack>> loadMedia() override;
};
So you may implement loadMedia() like this:
std::shared_future<std::unique_ptr<BaseTrack>> AudioLibrary::loadMedia()
{
    auto futureAudioTrack = loadAudio();
    return std::async(std::launch::deferred,
        [=]{
            // std::make_unique requires C++14.
            std::unique_ptr<BaseTrack> res =
                std::make_unique<AudioTrack>(futureAudioTrack.get());
            return res;
        });
}
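The same wrapping idea also works in the direction the question asks about, turning a future of a base pointer into a future of a derived pointer. The sketch below is self-contained and uses stand-in types. The free functions loadMedia(bool) and asAudio() are hypothetical, and the fail flag exists only to show that an exception from the loader travels through the wrapper.

```cpp
#include <cassert>
#include <future>
#include <memory>
#include <stdexcept>

struct BaseTrack {
    virtual ~BaseTrack() = default;
};
struct AudioTrack : BaseTrack {
    int sample_rate = 44100;
};

using BaseTrackPtr = std::shared_ptr<BaseTrack>;
using AudioTrackPtr = std::shared_ptr<AudioTrack>;

// Stand-in for the base-class loader: the work starts immediately.
std::shared_future<BaseTrackPtr> loadMedia(bool fail) {
    return std::async(std::launch::async, [fail]() -> BaseTrackPtr {
        if (fail) throw std::runtime_error("out of memory while loading");
        return std::make_shared<AudioTrack>();
    }).share();
}

// "Casts" the future by wrapping it in a deferred task that converts the
// result when get() is called on the returned future.
std::shared_future<AudioTrackPtr> asAudio(std::shared_future<BaseTrackPtr> base) {
    return std::async(std::launch::deferred, [base] {
        BaseTrackPtr track = base.get();  // rethrows a loading exception, if any
        assert(std::dynamic_pointer_cast<AudioTrack>(track) != nullptr);
        return std::static_pointer_cast<AudioTrack>(track);
    }).share();
}
```

In this sketch the inner load is launched with std::launch::async so that it starts right away, and the outer wrapper is deferred, so the cast runs in whichever thread first calls get() on the result.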
