In C++/CLI how do you define thread-safe event accessors? - events

The code sample "How to: Define Event Accessor Methods" at
http://msdn.microsoft.com/en-us/library/dw1dtw0d.aspx
appears to mutate the internal pE without taking locks. (It doesn't look like Delegate::Combine does anything magical that would prevent issues.) It also does
void raise() {
if (pE != nullptr)
pE->Invoke();
}
which can be problematic if pE changes to null between the check and the Invoke(). I have two questions:
Am I right in that the existing code is not thread-safe?
Since I want a thread-safe version of the code, I was thinking of locking the add and remove functions. Is it premature optimization to use
void raise() {
MyDel^ handler = pE;
if (handler != nullptr)
handler->Invoke();
}
or should I just lock that function too?

With an auto-implemented (trivial) event, all three accessors are thread-safe by default (raise includes a null check and uses a local variable to avoid the race condition), unlike the example in the page you linked.
When it comes to custom event implementations, you're right about needing to synchronize the add and remove accessors. Just put a mutex around the implementation. But there's no need to throw away type safety by calling Delegate::Combine and then casting, since operator + and - are overloaded for delegate handles. Or you can go lockless, as follows:
void add(MyDel^ p)
{
    MyDel^ old;
    MyDel^ updated; // "new" is a keyword and cannot be used as a variable name
    do {
        old = pE;
        updated = old + p; // combine against the snapshot, not a second read of pE
    } while (old != Interlocked::CompareExchange(pE, updated, old));
}
Define remove mutatis mutandis (updated = old - p;). And the code you gave for raise will be perfectly fine for a custom event implementation.
In summary, that MSDN sample is total garbage. And the simplest way to achieve thread-safety is with an auto-implemented event.
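The "put a mutex around the implementation" advice, together with the copy-then-invoke raise, can be sketched in standard C++ rather than C++/CLI. This is only an analogue under stated assumptions: the Event class, its Handler alias and its member names are hypothetical, and std::function stands in for a delegate.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical standard C++ analogue of a custom event (not C++/CLI).
class Event {
public:
    using Handler = std::function<void()>;

    // add: the handler list is only mutated while the lock is held.
    void add(Handler h) {
        std::lock_guard<std::mutex> lock(mutex_);
        handlers_.push_back(std::move(h));
    }

    // raise: take a snapshot under the lock, then invoke outside it.
    // A handler may therefore call add() without deadlocking, and a
    // concurrent add() cannot invalidate the list being iterated.
    void raise() {
        std::vector<Handler> snapshot;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            snapshot = handlers_;
        }
        for (auto& h : snapshot) {
            h();
        }
    }

private:
    std::mutex mutex_;
    std::vector<Handler> handlers_;
};
```

The snapshot plays the same role as the local handler variable in the raise accessor from the question: the invocation works on a value that other threads can no longer change.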

Related

Removing a std::function<()> from a vector c++

I'm building a publish-subscribe class (called SystemInterface), which is responsible for receiving updates from its instances and publishing them to subscribers.
Adding a subscriber callback function is trivial and has no issues, but removing it yields an error, because std::function<()> is not comparable in C++.
std::vector<std::function<void()>> subs;
void subscribe(std::function<void()> f)
{
subs.push_back(f);
}
void unsubscribe(std::function<void()> f)
{
std::remove(subs.begin(), subs.end(), f); // Error
}
I've come up with five possible solutions to this error:
Registering the function using a weak_ptr, where the subscriber must keep the returned shared_ptr alive.
Solution example at this link.
Instead of registering at a vector, map the callback function by a custom key, unique per callback function.
Solution example at this link
Using vector of function pointers. Example
Make the callback function comparable by utilizing the address.
Use an interface class (parent class) to call a virtual function.
In my design, all the intended classes inherit from a parent class called ServiceCore, so instead of registering a callback function, I would just register a ServiceCore reference in the vector.
This relies on each SystemInterface instance having an ID field, which is managed by ServiceCore and supplied to SystemInterface when a ServiceCore child instance is constructed.
From my perspective, the first solution is neat and would work, but it requires handling on the subscriber side, which is something I'd rather avoid.
The second solution would make my implementation more complex. My implementation currently looks like this:
using namespace std;
enum INFO_SUB_IMPORTANCE : uint8_t
{
INFO_SUB_PRIMARY, // Only gets the important updates.
INFO_SUB_COMPLEMENTARY, // Gets more.
INFO_SUB_ALL // Gets all updates
};
using CBF = function<void(string,string)>;
using INFO_SUBTREE = map<INFO_SUB_IMPORTANCE, vector<CBF>>;
using REQINF_SUBS = map<string, INFO_SUBTREE>; // It's keyed by an iterator, explaining it goes out of the question scope.
using INFSRC_SUBS = map<string, INFO_SUBTREE>;
using WILD_SUBS = INFO_SUBTREE;
REQINF_SUBS infoSubrs;
INFSRC_SUBS sourceSubrs;
WILD_SUBS wildSubrs;
void subscribeInfo(string info, INFO_SUB_IMPORTANCE imp, CBF f) {
infoSubrs[info][imp].push_back(f);
}
void subscribeSource(string source, INFO_SUB_IMPORTANCE imp, CBF f) {
sourceSubrs[source][imp].push_back(f);
}
void subscribeWild(INFO_SUB_IMPORTANCE imp, CBF f) {
wildSubrs[imp].push_back(f);
}
The second solution would require INFO_SUBTREE to be an extended map, but can be keyed by an ID:
using KEY_T = uint32_t; // or string...
using INFO_SUBTREE = map<INFO_SUB_IMPORTANCE, map<KEY_T,CBF>>;
I'm not aware of the limitations of using function pointers (the third solution), nor of the consequences of the fourth solution.
The fifth solution would defeat the purpose of dealing with CBFs, and it would be more complex on the subscriber side: a subscriber has to override the virtual function and so receives all updates in one place. It then has to filter on the message ID and direct the payload to the intended routine using multiple if/else blocks, and the number of those blocks grows with the number of subscriptions.
What I'm looking for is advice on the best available option.
Regarding your proposed solutions:
1. That would work. It can be made easy for the caller: have subscribe() create the shared_ptr and corresponding weak_ptr objects, and let it return the shared_ptr.
2. Then the caller must not lose the key. In a way this is similar to the above.
3. This of course is less generic, and then you can no longer have (the equivalent of) captures.
4. You can't: there is no way to get the address of the function stored inside a std::function. You can do &f inside subscribe() but that will only give you the address of the local variable f, which will go out of scope as soon as you return.
5. That works, and is in a way similar to 1 and 2, although now the "key" is provided by the caller.
Options 1, 2 and 5 are similar in that there is some other data stored in subs that refers to the actual std::function: either a std::shared_ptr, a key or a pointer to a base class. I'll present option 6 here, which is kind of similar in spirit but avoids storing any extra data:
Store a std::function<void()> directly, and return the index in the vector where it was stored. When removing an item, don't std::remove() it, but just set it to nullptr. The next time subscribe() is called, it checks whether there is an empty element in the vector and reuses it:
std::vector<std::function<void()>> subs;
std::size_t subscribe(std::function<void()> f) {
    // Reuse the first empty slot if there is one (if-with-initializer needs C++17).
    if (auto it = std::find(subs.begin(), subs.end(), nullptr); it != subs.end()) {
        *it = f;
        return std::distance(subs.begin(), it);
    } else {
        subs.push_back(f);
        return subs.size() - 1;
    }
}
void unsubscribe(std::size_t index) {
    subs[index] = nullptr; // leaves an empty std::function in the slot
}
The code that actually calls the functions stored in subs must now of course first check for empty entries. The above works because assigning nullptr makes a std::function "empty", and there is an operator==() overload that can compare a std::function against nullptr, thus making std::find() work.
One drawback of option 6 as shown above is that a std::size_t is a rather generic type. To make it safer, you might wrap it in a class SubscriptionHandle or something like that.
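As a rough illustration of the wrapper idea, here is a minimal sketch. The names SubscriptionHandle, Publisher and publish() are hypothetical, chosen only for this example; the slot-reuse logic follows option 6 as described above.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical wrapper so that a subscription index cannot be confused
// with an arbitrary std::size_t.
class SubscriptionHandle {
public:
    explicit SubscriptionHandle(std::size_t index) : index_(index) {}
    std::size_t index() const { return index_; }

private:
    std::size_t index_;
};

class Publisher {
public:
    SubscriptionHandle subscribe(std::function<void()> f) {
        // Reuse the first empty slot, if any.
        for (std::size_t i = 0; i < subs_.size(); ++i) {
            if (!subs_[i]) {
                subs_[i] = std::move(f);
                return SubscriptionHandle(i);
            }
        }
        subs_.push_back(std::move(f));
        return SubscriptionHandle(subs_.size() - 1);
    }

    void unsubscribe(SubscriptionHandle h) {
        // Out-of-range handles are ignored; valid ones empty their slot.
        if (h.index() < subs_.size()) {
            subs_[h.index()] = nullptr;
        }
    }

    void publish() const {
        for (const auto& f : subs_) {
            if (f) f();  // skip slots emptied by unsubscribe()
        }
    }

private:
    std::vector<std::function<void()>> subs_;
};
```

Because the constructor is explicit, a caller cannot pass a bare integer to unsubscribe() by accident. The sketch does not guard against a stale handle whose slot has since been reused; a generation counter in the handle would be one way to add that.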
As for the best solution: option 1 is quite heavy-weight. Options 2 and 5 are very reasonable, but 6 is, I think, the most efficient.

Are constant references still best practice in c++11 and later?

I recently read an article about the new move semantics in C++. It was about the confusion how to best implement a return value for a large object. The conclusion was, just implement it like return by copy and let the compiler decide if a move works best.
Now I wonder whether the same is true for function parameters.
Currently I use const references like this:
void setLargeObject(const LargeObject &obj) {
_obj = obj;
}
Instead of the simple copy:
void setLargeObject(LargeObject obj) {
_obj = obj;
}
Is passing large objects by const reference, when they are going to be copied, still the best practice in C++11 and later?
If setting the property requires taking ownership of the value, then pass by value. It will be copied if necessary before the function call, when the parameter is initialized. Inside the function, move it into place.
void setLargeObject(LargeObject obj) {
_obj = std::move(obj);
}
If LargeObject doesn't support move semantics (so having std::move changes nothing), then you might use const& to limit the performance hit to one copy instead of two. However, the best solution is to add movability, not to stay with const&.
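To see what pass-by-value plus std::move costs, the following sketch counts copies and moves. The counters g_copies and g_moves and the Holder class are hypothetical instrumentation added for the example; LargeObject here is an empty stand-in for a type that really is expensive to copy.

```cpp
#include <cassert>
#include <utility>

int g_copies = 0;  // hypothetical counters, for illustration only
int g_moves = 0;

struct LargeObject {
    LargeObject() = default;
    LargeObject(const LargeObject&) { ++g_copies; }
    LargeObject(LargeObject&&) noexcept { ++g_moves; }
    LargeObject& operator=(const LargeObject&) { ++g_copies; return *this; }
    LargeObject& operator=(LargeObject&&) noexcept { ++g_moves; return *this; }
};

class Holder {
public:
    // Takes ownership: the argument is copied or moved into obj by the
    // caller's expression, then moved into place.
    void setLargeObject(LargeObject obj) { obj_ = std::move(obj); }

private:
    LargeObject obj_;
};
```

Calling setLargeObject(x) with an lvalue costs one copy and one move, while setLargeObject(std::move(x)) costs two moves and no copy. A const reference parameter would always cost one copy, even when the caller no longer needs its object.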

Implementing observer pattern without unsubscribe method

When I implemented the observer pattern before, I always held a reference to the owner inside the listener. In the listener's constructor I registered with the owner, and in its destructor I unregistered.
This time, however, I don't want to hold a reference, in order to keep the coupling between these classes weak.
I came up with an implementation based on weak_ptr. My question is: is it OK to implement the observer pattern with weak_ptr and without an unsubscribe method?
Is there any case that I can get into trouble?
Yes, using a weak_ptr to an observer is a natural fit.
However, your implementation has a data race: elem can expire during your loop. You probably want to do this instead:
for (auto elem : listenerList)
{
auto locked = elem.lock();
if (locked) { locked->update(val); }
else { anyExpired = true; }
}
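A fuller sketch of the same idea follows, including the pruning step that the anyExpired flag suggests. The Listener, CountingListener and Subject types are hypothetical, and the sketch assumes listeners do not subscribe from inside update().

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

struct Listener {
    virtual ~Listener() = default;
    virtual void update(int val) = 0;
};

// Hypothetical concrete listener that remembers the last value it saw.
struct CountingListener : Listener {
    int last = 0;
    void update(int val) override { last = val; }
};

class Subject {
public:
    void subscribe(const std::shared_ptr<Listener>& l) {
        listeners_.push_back(l);  // stored as weak_ptr: no ownership taken
    }

    void notify(int val) {
        bool anyExpired = false;
        for (const auto& weak : listeners_) {
            // lock() either yields a shared_ptr that keeps the listener
            // alive for the duration of the call, or an empty pointer.
            if (auto locked = weak.lock()) {
                locked->update(val);
            } else {
                anyExpired = true;
            }
        }
        if (anyExpired) {
            // Drop entries whose listeners have been destroyed.
            listeners_.erase(
                std::remove_if(listeners_.begin(), listeners_.end(),
                               [](const std::weak_ptr<Listener>& w) {
                                   return w.expired();
                               }),
                listeners_.end());
        }
    }

    std::size_t listenerCount() const { return listeners_.size(); }

private:
    std::vector<std::weak_ptr<Listener>> listeners_;
};
```

Expired entries are only removed on the next notify(), so a subject that rarely notifies can accumulate dead weak_ptr entries for a while.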

Why do we need exception handling?

I can check for the input and if it's an invalid input from the user, I can use a simple "if condition" which prints "input invalid, please re-enter" (in case there is an invalid input).
This approach of "if there is a potential for a failure, verify it using an if condition and then specify the right behavior when failure is encountered..." seems enough for me.
If I can basically cover any kind of failure (divide by zero, etc.) with this approach, why do I need this whole exception handling mechanism (exception class and objects, checked and unchecked, etc.)?
Suppose you have func1 calling func2 with some input.
Now, suppose func2 fails for some reason.
Your suggestion is to handle the failure within func2, and then return to func1.
How will func1 "know" what error (if any) has occurred in func2 and how to proceed from that point?
The first solution that comes to mind is an error-code that func2 will return, where typically, a zero value will represent "OK", and each of the other (non-zero) values will represent a specific error that has occurred.
The problem with this mechanism is that it limits your flexibility in adding / handling new error-codes.
With the exception mechanism, you have a generic Exception object, which can be extended to any specific type of exception. In a way, it is similar to an error-code, but it can contain more information (for example, an error-message string).
You can still argue of course, "well, what's the try/catch for then? why not simply return this object?".
Fortunately, this question has already been answered here in great detail:
In C++ what are the benefits of using exceptions and try / catch instead of just returning an error code?
In general, there are two main advantages for exceptions over error-codes, both of which are different aspects of correct coding:
With an exception, the programmer must either handle it or throw it "upwards", whereas with an error-code, the programmer can mistakenly ignore it.
With the exception mechanism you can write your code much "cleaner" and have everything "automatically handled", whereas with error-codes you are obliged to implement a "tedious" switch/case, possibly in every function "up the call-stack".
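The func1/func2 scenario can be sketched in C++ as follows. The function bodies and the run() helper are hypothetical; the point is only that func1 contains no error-handling code at all, yet the failure still reaches a caller that knows what to do with it.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical func2: reports failure by throwing instead of returning a code.
int func2(int a, int b) {
    if (b == 0) {
        throw std::invalid_argument("second argument cannot be zero");
    }
    return a / b;
}

// func1 has no error-handling code; a failure in func2 simply passes through.
int func1(int a, int b) {
    return func2(a, b) + 1;
}

// The caller that can actually react to the failure handles it in one place.
std::string run(int a, int b) {
    try {
        return std::to_string(func1(a, b));
    } catch (const std::invalid_argument& e) {
        return std::string("error: ") + e.what();
    }
}
```

With error codes, func1 would need its own check after calling func2 and would have to forward the code to its caller, and so would every other function in between.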
Exceptions are a more object-oriented approach to handling exceptional execution flows than return codes. The drawback of return codes is that you have to come up with 'special' values to indicate different types of exceptional results, for example:
public double calculatePercentage(int a, int b) {
if (b == 0) {
return -1;
}
else {
return 100.0 * a / b; // multiply first to avoid integer division
}
}
The above method uses a return code of -1 to indicate failure (because it cannot divide by zero). This would work, but your calling code needs to know about this convention, for example this could happen:
public double addPercentages(int a, int b, int c, int d) {
double percentage1 = calculatePercentage(a, b);
double percentage2 = calculatePercentage(c, d);
return percentage1 + percentage2;
}
The above code looks fine at first glance, but when b or d is zero the result will be unexpected: calculatePercentage will return -1, which is then added to the other percentage, and that is likely not correct. The programmer who wrote addPercentages is unaware that there is a bug in this code until they test it, and even then only if they really check the validity of the results.
With exceptions you could do this:
public double calculatePercentage(int a, int b) {
if (b == 0) {
throw new IllegalArgumentException("Second argument cannot be zero");
}
else {
return 100.0 * a / b; // multiply first to avoid integer division
}
}
Code calling this method will compile without exception handling, but it will stop when run with incorrect values. This is often the preferred way since it leaves it up to the programmer if and where to handle exceptions.
If you want to force the programmer to handle this exception you should use a checked exception, for example:
public double calculatePercentage(int a, int b) throws MyCheckedCalculationException {
if (b == 0) {
throw new MyCheckedCalculationException("Second argument cannot be zero");
}
else {
return 100.0 * a / b; // multiply first to avoid integer division
}
}
Notice that calculatePercentage has to declare the exception in its method signature. Checked exceptions have to be declared like that, and the calling code either has to catch them or declare them in their own method signature.
I think many Java developers currently agree that checked exceptions are a bit invasive, so most APIs lately gravitate towards the use of unchecked exceptions.
The checked exception above could be defined like this:
public class MyCheckedCalculationException extends Exception {
public MyCheckedCalculationException(String message) {
super(message);
}
}
Creating a custom exception type like that makes sense if you are developing a component with multiple classes and methods which are used by several other components and you want to make your API (including exception handling) very clear.
(see the Throwable class hierarchy)
Let's assume that you need to write some code for some object, which consists of n different resources (n > 3) to be allocated in the constructor and deallocated inside the destructor.
Let's even say, that some of these resources depend on each other.
E.g. in order to create a memory map of some file, one would first have to successfully open the file and then call the OS function for memory mapping.
Without exception handling you would not be able to use the constructor(s) to allocate these resources; you would likely use two-step initialization instead.
You would have to take care of the order of construction and destruction yourself, since you are no longer using the constructor.
Without exception handling you would not be able to return rich error information to the caller. This is why, in exception-free software, one usually needs a debugger and a debug executable to identify why some complex piece of software is suddenly failing.
This again assumes that not every library is able to simply dump its error information to stderr. In certain cases stderr is not available, which in turn makes all code that uses stderr for error reporting unusable.
Using C++ Exception Handling you would simply chain the classes wrapping the matching system calls into base or member class relationships AND the compiler would take care about order of construction and destruction and to only call destructors for not failed constructors.
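A minimal sketch of that guarantee, using the file and memory-mapping example from above. FileHandle, Mapping and MappedFile are hypothetical stand-ins that only log what they do (no real system calls are made), and g_log is instrumentation added for the example.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

std::vector<std::string> g_log;  // records construction/destruction order

struct FileHandle {  // stand-in for an opened file
    FileHandle() { g_log.push_back("open"); }
    ~FileHandle() { g_log.push_back("close"); }
};

struct Mapping {  // stand-in for a memory mapping that may fail
    explicit Mapping(bool fail) {
        if (fail) throw std::runtime_error("mapping failed");
        g_log.push_back("map");
    }
    ~Mapping() { g_log.push_back("unmap"); }
};

class MappedFile {
public:
    // Members are constructed in declaration order. If mapping_ throws,
    // file_ is already constructed and its destructor runs automatically;
    // ~Mapping() does not run because that constructor never completed.
    explicit MappedFile(bool failMapping) : file_(), mapping_(failMapping) {}

private:
    FileHandle file_;  // constructed first, destroyed last
    Mapping mapping_;
};

bool tryCreate(bool failMapping) {
    try {
        MappedFile m(failMapping);
        (void)m;
        return true;
    } catch (const std::runtime_error&) {
        return false;
    }
}
```

When the mapping fails, the log contains only "open" and "close"; when it succeeds, the log reads "open", "map", "unmap", "close". No two-step initialization or manual cleanup code is involved in either case.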
To start with, methods are generally blocks of code or statements in a program that let the user reuse the same code, which ultimately saves on excessive memory use. This means that memory on the computer is not wasted.

C++ memory management patterns for objects used in callback chains

A couple of codebases I use include classes that manually call new and delete in the following pattern:
class Worker {
public:
static void DoWork(ArgT arg, std::function<void()> done) {
(new Worker(std::move(arg), std::move(done)))->Start();
}
private:
Worker(ArgT arg, std::function<void()> done)
: arg_(std::move(arg)),
done_(std::move(done)),
latch_(2) {} // The error-prone Latch interface isn't the point of this question. :)
void Start() {
Async1(<args>, [=]() { this->Method1(); });
}
void Method1() {
StartParallel(<args>, [=]() { this->latch_.count_down(); });
StartParallel(<other_args>, [=]() { this->latch_.count_down(); });
latch_.then([=]() { this->Finish(); });
}
void Finish() {
done_();
// Note manual memory management!
delete this;
}
ArgT arg_;
std::function<void()> done_;
Latch latch_;
};
Now, in modern C++, explicit delete is a code smell, as is, to some extent, delete this. However, I think this pattern (creating an object to represent a chunk of work managed by a callback chain) is fundamentally a good, or at least not a bad, idea.
So my question is, how should I rewrite instances of this pattern to encapsulate the memory management?
One option that I don't think is a good idea is storing the Worker in a shared_ptr: fundamentally, ownership is not shared here, so the overhead of reference counting is unnecessary. Furthermore, in order to keep a copy of the shared_ptr alive across the callbacks, I'd need to inherit from enable_shared_from_this, and remember to call that outside the lambdas and capture the shared_ptr into the callbacks. If I ever wrote the simple code using this directly, or called shared_from_this() inside the callback lambda, the object could be deleted early.
I agree that delete this is a code smell, and to a lesser extent delete on its own. But I think that here it is a natural part of continuation-passing style, which (to me) is itself something of a code smell.
The root problem is that the design of this API assumes unbounded control-flow: it acknowledges that the caller is interested in what happens when the call completes, but signals that completion via an arbitrarily-complex callback rather than simply returning from a synchronous call. Better to structure it synchronously and let the caller determine an appropriate parallelization and memory-management regime:
class Worker {
public:
void DoWork(ArgT arg) {
// Async1 is a mistake; fix it later. For now, synchronize explicitly.
Latch async_done(1);
Async1(<args>, [&]() { async_done.count_down(); });
async_done.await();
Latch parallel_done(2);
RunParallel([&]() { DoStuff(<args>); parallel_done.count_down(); });
RunParallel([&]() { DoStuff(<other_args>); parallel_done.count_down(); });
parallel_done.await();
}
};
On the caller-side, it might look something like this:
Latch latch(tasks.size());
for (auto& task : tasks) {
RunParallel([&]() { DoWork(<args>); latch.count_down(); }); // capture latch by reference so the shared latch is counted down
}
latch.await();
Where RunParallel can use std::thread or whatever other mechanism you like for dispatching parallel events.
The advantage of this approach is that object lifetimes are much simpler. The ArgT object lives for exactly the scope of the DoWork call. The arguments to DoWork live exactly as long as the closures containing them. This also makes it much easier to add return-values (such as error codes) to DoWork calls: the caller can just switch from a latch to a thread-safe queue and read the results as they complete.
The disadvantage of this approach is that it requires actual threading, not just boost::asio::io_service. (For example, the RunParallel calls within DoWork() can't block on waiting for the RunParallel calls from the caller side to return.) So you either have to structure your code into strictly-hierarchical thread pools, or you have to allow a potentially-unbounded number of threads.
One option is that the delete this here is not a code smell. At most, it should be wrapped into a small library that would detect if all the continuation callbacks were destroyed without calling done_().
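One way such a small helper might look is sketched below. CompletionGuard and its onDropped callback are hypothetical names. The sketch assumes the guard is held in a std::shared_ptr that every continuation captures, so its destructor runs exactly when the last continuation is destroyed, and it assumes onDropped does not throw.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>

// Hypothetical helper: reports when every continuation holding it has been
// destroyed without the completion callback ever being invoked.
class CompletionGuard {
public:
    CompletionGuard(std::function<void()> done, std::function<void()> onDropped)
        : done_(std::move(done)), onDropped_(std::move(onDropped)) {}

    CompletionGuard(const CompletionGuard&) = delete;
    CompletionGuard& operator=(const CompletionGuard&) = delete;

    ~CompletionGuard() {
        // Runs when the last owner goes away; assumed not to throw.
        if (!completed_ && onDropped_) onDropped_();
    }

    void Complete() {
        completed_ = true;
        if (done_) done_();
    }

private:
    std::function<void()> done_;
    std::function<void()> onDropped_;
    bool completed_ = false;
};
```

A continuation would capture a std::shared_ptr<CompletionGuard> by value and call Complete() at the end of the chain. This uses shared ownership only for the detection helper and is independent of how the Worker itself is allocated and freed.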
