I'm building a publish-subscribe class (called SystemInterface), which is responsible for receiving updates from its instances and publishing them to subscribers.
Adding a subscriber callback function is trivial and causes no issues, but removing it yields an error, because std::function is not equality-comparable in C++.
std::vector<std::function<void()>> subs;
void subscribe(std::function<void()> f)
{
subs.push_back(f);
}
void unsubscribe(std::function<void()> f)
{
std::remove(subs.begin(), subs.end(), f); // Error
}
I've come up with five solutions to this error:
1. Register the function using a weak_ptr, where the subscriber must keep the returned shared_ptr alive. Solution example at this link.
2. Instead of registering in a vector, map each callback function to a custom key that is unique per callback. Solution example at this link.
3. Use a vector of function pointers. Example.
4. Make the callback function comparable by utilizing its address.
5. Use an interface class (parent class) to call a virtual function. In my design, all intended classes inherit from a parent class called ServiceCore, so instead of registering a callback function, I would just register a ServiceCore reference in the vector. Each SystemInterface instance has an ID field, which is managed by ServiceCore and supplied to SystemInterface when a ServiceCore child instance is constructed.
From my perspective, the first solution is neat and would work, but it requires handling on the subscriber side, which I would prefer to avoid.
The second solution would make my implementation more complex. My implementation currently looks like this:
using namespace std;
enum INFO_SUB_IMPORTANCE : uint8_t
{
INFO_SUB_PRIMARY, // Only gets the important updates.
INFO_SUB_COMPLEMENTARY, // Gets more.
INFO_SUB_ALL // Gets all updates
};
using CBF = function<void(string,string)>;
using INFO_SUBTREE = map<INFO_SUB_IMPORTANCE, vector<CBF>>;
using REQINF_SUBS = map<string, INFO_SUBTREE>; // Keyed by an iterator; explaining why is beyond the scope of this question.
using INFSRC_SUBS = map<string, INFO_SUBTREE>;
using WILD_SUBS = INFO_SUBTREE;
REQINF_SUBS infoSubrs;
INFSRC_SUBS sourceSubrs;
WILD_SUBS wildSubrs;
void subscribeInfo(string info, INFO_SUB_IMPORTANCE imp, CBF f) {
infoSubrs[info][imp].push_back(f);
}
void subscribeSource(string source, INFO_SUB_IMPORTANCE imp, CBF f) {
sourceSubrs[source][imp].push_back(f);
}
void subscribeWild(INFO_SUB_IMPORTANCE imp, CBF f) {
wildSubrs[imp].push_back(f);
}
The second solution would require INFO_SUBTREE to be an extended map, but can be keyed by an ID:
using KEY_T = uint32_t; // or string...
using INFO_SUBTREE = map<INFO_SUB_IMPORTANCE, map<KEY_T,CBF>>;
For the third solution, I'm not aware of the limitations imposed by using function pointers, nor of the consequences of the fourth solution.
The fifth solution would eliminate the need to deal with CBFs, but it would be more complex on the subscriber side: a subscriber would have to override the virtual function and so receive all updates in one place, which in turn requires filtering on the message ID and directing the payload to the intended routines using multiple if/else blocks, which will grow as subscriptions are added.
What I'm looking for is advice on the best available option.
Regarding your proposed solutions:
1. That would work. It can be made easy for the caller: have subscribe() create the shared_ptr and corresponding weak_ptr objects, and let it return the shared_ptr.
2. Then the caller must not lose the key. In a way this is similar to the above.
3. This of course is less generic, and then you can no longer have (the equivalent of) captures.
4. You can't: there is no way to get the address of the function stored inside a std::function. You can do &f inside subscribe() but that will only give you the address of the local variable f, which will go out of scope as soon as you return.
5. That works, and is in a way similar to 1 and 2, although now the "key" is provided by the caller.
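As a rough illustration of the first option, here is a minimal sketch in which subscribe() creates the shared_ptr and keeps only a weak_ptr. The names WeakPublisher, Callback and Token are made up for this example and do not come from the question; dropping the returned token is what unsubscribes.

```cpp
#include <algorithm>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical publisher for illustration; not the asker's SystemInterface.
class WeakPublisher {
public:
    using Callback = std::function<void()>;
    using Token = std::shared_ptr<Callback>;

    // Stores only a weak_ptr; the returned shared_ptr keeps the callback alive.
    Token subscribe(Callback f) {
        Token token = std::make_shared<Callback>(std::move(f));
        subs_.emplace_back(token);
        return token;
    }

    // Drops expired entries, then calls every callback that is still alive.
    // Note: callbacks must not call subscribe() on this object while it publishes.
    void publish() {
        subs_.erase(std::remove_if(subs_.begin(), subs_.end(),
                                   [](const std::weak_ptr<Callback>& w) { return w.expired(); }),
                    subs_.end());
        for (const auto& w : subs_) {
            if (Token cb = w.lock()) {
                (*cb)();
            }
        }
    }

private:
    std::vector<std::weak_ptr<Callback>> subs_;
};
```

A subscriber would hold on to the token for as long as it wants updates and call token.reset() (or simply let it go out of scope) to unsubscribe.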
Options 1, 2 and 5 are similar in that there is some other data stored in subs that refers to the actual std::function: either a std::shared_ptr, a key or a pointer to a base class. I'll present option 6 here, which is kind of similar in spirit but avoids storing any extra data:
Store a std::function<void()> directly, and return the index in the vector where it was stored. When removing an item, don't std::remove() it, but just set it to nullptr. The next time subscribe() is called, it checks whether there is an empty element in the vector and reuses it:
#include <algorithm>
#include <cstddef>
#include <functional>
#include <iterator>
#include <vector>

std::vector<std::function<void()>> subs;
std::size_t subscribe(std::function<void()> f) {
    if (auto it = std::find(subs.begin(), subs.end(), nullptr); it != subs.end()) {
        *it = f; // reuse an empty slot
        return std::distance(subs.begin(), it);
    } else {
        subs.push_back(f);
        return subs.size() - 1;
    }
}
void unsubscribe(std::size_t index) {
    subs[index] = nullptr; // mark the slot as empty
}
The code that actually calls the functions stored in subs must now of course first check for empty entries. The above works because assigning nullptr makes a std::function "empty", and there is an operator==() overload that compares a std::function against nullptr, thus making std::find() work.
One drawback of option 6 as shown above is that a std::size_t is a rather generic type. To make it safer, you might wrap it in a class SubscriptionHandle or something like that.
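A possible shape for such a wrapper is sketched below. SubscriptionHandle and SlotPublisher are hypothetical names chosen for this example, and the bounds check in unsubscribe() is an extra precaution that the code above does not have.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <iterator>
#include <utility>
#include <vector>

// Hypothetical strong type around the slot index, so that it cannot be
// confused with an arbitrary std::size_t.
class SubscriptionHandle {
public:
    explicit SubscriptionHandle(std::size_t index) : index_(index) {}
    std::size_t index() const { return index_; }

private:
    std::size_t index_;
};

class SlotPublisher {
public:
    SubscriptionHandle subscribe(std::function<void()> f) {
        // Reuse the first empty slot, if there is one.
        auto it = std::find(subs_.begin(), subs_.end(), nullptr);
        if (it != subs_.end()) {
            *it = std::move(f);
            return SubscriptionHandle(
                static_cast<std::size_t>(std::distance(subs_.begin(), it)));
        }
        subs_.push_back(std::move(f));
        return SubscriptionHandle(subs_.size() - 1);
    }

    void unsubscribe(SubscriptionHandle handle) {
        if (handle.index() < subs_.size()) {
            subs_[handle.index()] = nullptr; // leave an empty slot behind
        }
    }

    void publish() const {
        for (const auto& f : subs_) {
            if (f) { // skip empty slots
                f();
            }
        }
    }

private:
    std::vector<std::function<void()>> subs_;
};
```

The explicit constructor means a plain integer is never silently accepted where a handle is expected, which is the safety gain over a bare std::size_t.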
As for the best solution: option 1 is quite heavy-weight. Options 2 and 5 are very reasonable, but 6 is, I think, the most efficient.
With the struct definition given below...
struct A {
virtual void hello() = 0;
};
Approach #1:
struct B : public A {
virtual void hello() { ... }
};
Approach #2:
struct B : public A {
void hello() { ... }
};
Is there any difference between these two ways to override the hello function?
They are exactly the same. There is no difference between them other than that the first approach requires more typing and is potentially clearer.
The 'virtualness' of a function is propagated implicitly, however at least one compiler I use will generate a warning if the virtual keyword is not used explicitly, so you may want to use it if only to keep the compiler quiet.
From a purely stylistic point-of-view, including the virtual keyword clearly 'advertises' the fact to the user that the function is virtual. This will be important to anyone further sub-classing B without having to check A's definition. For deep class hierarchies, this becomes especially important.
The virtual keyword is not necessary in the derived class. Here's the supporting documentation, from the C++ Draft Standard (N3337) (emphasis mine):
10.3 Virtual functions
2 If a virtual member function vf is declared in a class Base and in a class Derived, derived directly or indirectly from Base, a member function vf with the same name, parameter-type-list (8.3.5), cv-qualification, and ref-qualifier (or absence of same) as Base::vf is declared, then Derived::vf is also virtual (whether or not it is so declared) and it overrides Base::vf.
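To see this rule in action, here is a small sketch. It reuses the names A and B from the question and adds a further-derived C; the std::string return type and the greet() helper are assumptions added only so that the dispatch can be observed.

```cpp
#include <string>

struct A {
    virtual ~A() = default;
    virtual std::string hello() const = 0;
};

// No `virtual` keyword here, yet B::hello overrides A::hello and is itself virtual.
struct B : A {
    std::string hello() const { return "B"; }
};

// C can still override it, because B::hello inherited its virtualness from A::hello.
struct C : B {
    std::string hello() const { return "C"; }
};

// Calls hello() through a reference to the intermediate class B;
// the call is dispatched dynamically to the most derived override.
std::string greet(const B& b) { return b.hello(); }
```

Calling greet() with a C object returns "C", even though neither B nor C repeats the keyword.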
No, the virtual keyword on derived classes' virtual function overrides is not required. But it is worth mentioning a related pitfall: a failure to override a virtual function.
The failure to override occurs if you intend to override a virtual function in a derived class, but make an error in the signature so that it declares a new and different virtual function. This function may be an overload of the base class function, or it might differ in name. Whether or not you use the virtual keyword in the derived class function declaration, the compiler would not be able to tell that you intended to override a function from a base class.
This pitfall is, however, thankfully addressed by the C++11 explicit override language feature, which allows the source code to clearly specify that a member function is intended to override a base class function:
struct Base {
virtual void some_func(float);
};
struct Derived : Base {
virtual void some_func(int) override; // ill-formed - doesn't override a base class method
};
The compiler will issue a compile-time error and the programming error will be immediately obvious (perhaps the function in Derived should have taken a float as the argument).
Refer to WP:C++11.
Adding the "virtual" keyword is good practice as it improves readability, but it is not necessary. Functions declared virtual in the base class, and having the same signature in the derived classes, are implicitly virtual.
For the compiler, there is no difference whether you write virtual in the derived class or omit it.
But without it, a reader needs to look at the base class to get this information. Therefore I would recommend adding the virtual keyword in the derived class as well, if you want to show human readers that this function is virtual.
The virtual keyword should be added to functions of a base class to make them overridable. In your example, struct A is the base class. Repeating virtual in a derived class changes nothing for the compiler, because the function is already virtual. However, if you want your derived class to also serve as a base class itself, and you want that function to be overridable, it is a good idea to put virtual there as well.
struct B : public A {
virtual void hello() { ... }
};
struct C : public B {
void hello() { ... }
};
Here C inherits from B, so B is not the base class (it is also a derived class), and C is the derived class.
The inheritance diagram looks like this:
A
^
|
B
^
|
C
So you should put the virtual in front of functions inside of potential base classes which may have children. virtual allows your children to override your functions. There is nothing wrong with putting the virtual in front of functions inside of the derived classes, but it is not required. It is recommended though, because anyone who wants to inherit from your derived class can then see at a glance that the function can be overridden.
So put virtual in front of functions in all classes involved in inheritance, unless you know for sure that the class will not have any children who would need to override the functions of the base class. It is good practice.
There's a considerable difference when you have templates and start taking base class(es) as template parameter(s):
struct None {};
template<typename... Interfaces>
struct B : public Interfaces... // expand the pack so every interface becomes a base
{
    void hello() const { /* ... */ }
};
struct A {
    virtual void hello() const = 0;
};
template<typename... Interfaces>
void t_hello(const B<Interfaces...>& b) // different code generated for each set of interfaces (a vtable-based clever compiler might reduce this to 2); both t_hello and b.hello() might be inlined properly
{
    b.hello(); // indirect, non-virtual call
}
void hello(const A& a)
{
    a.hello(); // Indirect virtual call, inlining is impossible in general
}
int main()
{
    B<None> b; // Ok, no vtable generated, empty base class optimization works, sizeof(b) == 1 usually
    B<None>* pb = &b;
    B<None>& rb = b;
    b.hello(); // direct call
    pb->hello(); // pb-relative non-virtual call (1 redirection)
    rb.hello(); // non-virtual call (1 redirection unless optimized out)
    t_hello(b); // works as expected, one redirection
    // hello(b); // compile-time error
    B<A> ba; // Ok, vtable generated, sizeof(ba) >= sizeof(void*)
    B<A>* pba = &ba;
    B<A>& rba = ba;
    ba.hello(); // still can be a direct call, exact type of ba is deducible
    pba->hello(); // pba-relative virtual call (usually 3 redirections)
    rba.hello(); // rba-relative virtual call (usually 3 redirections unless optimized out to 2)
    //t_hello(b); // compile-time error (unless you add support for const A& in t_hello as well)
    hello(ba);
}
The fun part of it is that you can now define interface and non-interface functions after defining the classes. That is useful for interworking interfaces between libraries (don't rely on this as a standard design process of a single library). It costs you nothing to allow this for all of your classes - you might even typedef B to something if you'd like.
Note that, if you do this, you might want to declare copy / move constructors as templates, too: allowing construction from different interfaces lets you 'cast' between different B<> types.
It's questionable whether you should add support for const A& in t_hello(). The usual reason for this rewrite is to move away from inheritance-based specialization to template-based one, mostly for performance reasons. If you continue to support the old interface, you can hardly detect (or deter from) old usage.
I will certainly include the virtual keyword in the child class, because:
i. Readability.
ii. This child class may be derived from further down, and you don't want the constructor of the further derived class to call this virtual function.
In C#, you can define a custom enumeration very trivially, eg:
public IEnumerable<Foo> GetNestedFoos()
{
foreach (var child in _SomeCollection)
{
foreach (var foo in child.FooCollection)
{
yield return foo;
}
foreach (var bar in child.BarCollection)
{
foreach (var foo in bar.MoreFoos)
{
yield return foo;
}
}
}
foreach (var baz in _SomeOtherCollection)
{
foreach (var foo in baz.GetNestedFoos())
{
yield return foo;
}
}
}
(This can be simplified using LINQ and better encapsulation but that's not the point of the question.)
In C++11, you can do similar enumerations but AFAIK it requires a visitor pattern instead:
template<typename Action>
void VisitAllFoos(const Action& action)
{
for (auto& child : m_SomeCollection)
{
for (auto& foo : child.FooCollection)
{
action(foo);
}
for (auto& bar : child.BarCollection)
{
for (auto& foo : bar.MoreFoos)
{
action(foo);
}
}
}
for (auto& baz : m_SomeOtherCollection)
{
baz.VisitAllFoos(action);
}
}
Is there a way to do something more like the first, where the function returns a range that can be iterated externally rather than calling a visitor internally?
(And I don't mean by constructing a std::vector<Foo> and returning it -- it should be an in-place enumeration.)
I am aware of the Boost.Range library, which I suspect would be involved in the solution, but I'm not particularly familiar with it.
I'm also aware that it's possible to define custom iterators to do this sort of thing (which I also suspect might be involved in the answer) but I'm looking for something that's easy to write, ideally no more complicated than the examples shown here, and composable (like with _SomeOtherCollection).
I would prefer something that does not require the caller to use lambdas or other functors (since that just makes it a visitor again), although I don't mind using lambdas internally if needed (but would still prefer to avoid them there too).
If I'm understanding your question correctly, you want to perform some action over all elements of a collection.
C++ has an extensive set of iterator operations, defined in the iterator header. Most collection structures, including the std::vector that you reference, have .begin and .end methods which take no arguments and return iterators to the beginning and the end of the structure. These iterators have some operations that can be performed on them manually, but their primary use comes in the form of the algorithm header, which defines several very useful iteration functions.
In your specific case, I believe you want the for_each function, which takes a range (as a beginning to end iterator) and a function to apply. So if you had a function (or function object) called action and you wanted to apply it to a vector called data, the following code would be correct (assuming all necessary headers are included appropriately):
std::for_each(data.begin(), data.end(), action);
Note that for_each is just one of many functions provided by the algorithm header. It also provides functions to search a collection, copy a set of data, sort a list, find a minimum/maximum, and much more, all generalized to work over any structure that has an iterator. And if even these aren't enough, you can write your own by reading up on the operations supported on iterators. Simply define a template function that takes iterators of varying types and document what kind of iterator you want.
template <typename BidirectionalIterator>
void function(BidirectionalIterator begin, BidirectionalIterator end) {
// Do something
}
One final note is that all of the operations mentioned so far also operate correctly on arrays, provided you know the size. Instead of writing .begin and .end, you write + 0 and + n, where n is the size of the array. The trivial zero addition is often necessary in order to decay the type of the array into a pointer to make it a valid iterator, but array pointers are indeed random access iterators just like any other container iterator.
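As a small illustration of the array case, the sketch below applies std::for_each to a built-in array through pointer iterators. The function name sumArray and the sample values are invented for the example.

```cpp
#include <algorithm>
#include <cstddef>

// Sums the elements of a built-in array, using data + 0 and data + n
// as the begin and end iterators.
int sumArray() {
    int data[] = {1, 2, 3, 4};
    const std::size_t n = sizeof(data) / sizeof(data[0]);
    int total = 0;
    std::for_each(data + 0, data + n, [&total](int x) { total += x; });
    return total;
}
```

Since C++11, std::begin(data) and std::end(data) from the iterator header produce the same pair of pointers without spelling out the size.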
What you can do is write your own adapter function and call it with different ranges of elements of the same type.
This is an untested solution that will probably need some tweaking to compile, but it will give you the idea. It uses variadic templates to move from one collection to the next.
// Base case: no collections left to visit.
template<typename Action>
void visitAllFoos(const Action&) {}

template<typename Action, typename Iterator, typename... Args>
void visitAllFoos(const Action& action, std::pair<Iterator, Iterator> collection, Args&&... args)
{
    std::for_each(collection.first, collection.second, action); // apply action
    visitAllFoos(action, std::forward<Args>(args)...); // move on to the next collection
}
// you can call it with a sequence of begin/end iterator pairs
visitAllFoos(action, std::make_pair(c1.begin(), c1.end()), std::make_pair(c2.begin(), c2.end()));
I believe, what you're trying to do can be done with Boost.Range, in particular with join and any_range (the latter would be needed if you want to hide the types of the containers and remove joined_range from the interface).
However, the resulting solution would not be very practical both in complexity and performance - mostly because of the nested joined_ranges and type erasure overhead incurred by any_range. Personally, I would just construct std::vector<Foo*> or use visitation.
You can do this with the help of boost::asio::coroutine; see examples at https://pubby8.wordpress.com/2014/03/16/multi-step-iterators-using-coroutines/ and http://www.boost.org/doc/libs/1_55_0/doc/html/boost_asio/overview/core/coroutine.html.
I have recently run into a problem which has had me thinking in circles. Assume that I have an object of type O with properties O.A and O.B. Also assume that I have a collection of instances of type O, where O.A and O.B are defined for each instance.
Now assume that I need to perform some operation (like sorting) on a collection of O instances using either O.A or O.B, but not both at any given time. My original solution is as follows.
Example -- just for demonstration, not production code:
public class O {
int A;
int B;
}
public static class Utils {
public static void SortByA (O[] collection) {
// Sort the objects in the collection using O.A as the key. Note: this is custom sorting logic, so it is not simply a one-line call to a built-in sort method.
}
public static void SortByB (O[] collection) {
// Sort the objects in the collection using O.B as the key. Same logic as above.
}
}
What I would love to do is this...
public static void SortAgnostic (O[] collection, FieldRepresentation x /* some non-bool, non-int variable representing whether to choose O.A or O.B as the sorting key */) {
// Sort by whatever "x" represents...
}
... but creating a new, highly-specific type that I will have to maintain just to avoid duplicating a few lines of code seems unnecessary to me. Perhaps I am incorrect on that (and I am sure someone will correct me if that statement is wrong :D), but that is my current thought nonetheless.
Question: What is the best way to implement this method? The logic that I have to implement is difficult to break down into smaller methods, as it is already fairly optimized. At the root of the issue is the fact that I need to perform the same operation using different properties of an object. I would like to stay away from using codes/flags/etc. in the method signature if possible so that the solution can be as robust as possible.
Note: When answering this question, please approach it from an algorithmic point of view. I am aware that some language-specific features may be suitable alternatives, but I have encountered this problem before and would like to understand it from a relatively language-agnostic viewpoint. Also, please do not constrain responses to sorting solutions only, as I have only chosen it as an example. The real question is how to avoid code duplication when performing an identical operation on two different properties of an object.
"The real question is how to avoid code duplication when performing an identical operation on two different properties of an object."
This is a very good question as this situation arises all the time. I think, one of the best ways to deal with this situation is to use the following pattern.
public class O {
int A;
int B;
}
public doOperationX1() {
doOperationX(something to indicate which property to use);
}
public doOperationX2() {
doOperationX(something to indicate which property to use);
}
private doOperationX(input ) {
// actual work is done here
}
In this pattern, the actual implementation is performed in a private method, which is called by public methods, with some extra information. For example, in this case, it can be
doOperationX(A), or doOperationX(B), or something like that.
My Reasoning: In my opinion this pattern is optimal as it achieves two main requirements:
It keeps the public interface descriptive and clear, as it keeps operations separate, and avoids flags etc that you also mentioned in your post. This is good for the client.
From the implementation perspective, it prevents duplication, as it is in one place. This is good for the development.
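One way to realize this pattern, sketched here in C++ rather than the Java-like pseudocode above, is to let the private method take a pointer to member as the "something to indicate which property to use". The names Utils, sortByA, sortByB and sortBy are illustrative, and std::sort merely stands in for the custom sorting logic mentioned in the question.

```cpp
#include <algorithm>
#include <vector>

struct O {
    int A;
    int B;
};

class Utils {
public:
    // The public interface stays descriptive: one method per property.
    static void sortByA(std::vector<O>& items) { sortBy(items, &O::A); }
    static void sortByB(std::vector<O>& items) { sortBy(items, &O::B); }

private:
    // The single implementation; `field` selects which member is the key.
    // std::sort is only a placeholder for the real, custom logic.
    static void sortBy(std::vector<O>& items, int O::*field) {
        std::sort(items.begin(), items.end(),
                  [field](const O& lhs, const O& rhs) { return lhs.*field < rhs.*field; });
    }
};
```

Callers only ever see sortByA() and sortByB(), so no flag or code leaks into the public signature.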
A simple way to approach this I think is to internalize the behavior of choosing the sort field to the class O itself. This way the solution can be language-agnostic.
The implementation in Java could be using an Abstract class for O, where the purpose of the abstract method getSortField() would be to return the field to sort by. All that the invocation logic would need to do is to implement the abstract method to return the desired field.
O o = new O() {
public int getSortField() {
return A;
}
};
The problem might be reduced to obtaining the value of the specified field from the given object so that it can be used for sorting purposes:
TField getValue(TEntity entity, string fieldName)
{
// Return value of field "A" from entity,
// implementation depends on language of choice, possibly with
// some sort of reflection support
}
This method can be used to substitute comparisons within the sorting algorithm,
if (getValue(o[i], "A") > getValue(o[j], "A"))
{
swap(i, j);
}
The field name can then be parametrized, as,
public static void SortAgnostic (O[] collection, string fieldName)
{
if (getValue(collection[i], fieldName) > getValue(collection[j], fieldName))
{
swap(i, j);
}
...
}
which you can use like SortAgnostic(collection, "A").
Some languages allow you to express the field in a more elegant way,
public static void SortAgnostic (O[] collection, Expression fieldExpression)
{
if (getValue(collection[i], fieldExpression) >
getValue(collection[j], fieldExpression))
{
swap(i, j);
}
...
}
which you can use like SortAgnostic(collection, entity => entity.A).
And yet another option can be passing a pointer to a function which will return the value of the field needed,
public static void SortAgnostic (O[] collection, Function getValue)
{
if (getValue(collection[i]) > getValue(collection[j]))
{
swap(i, j);
}
...
}
which given a function,
TField getValueOfA(TEntity entity)
{
return entity.A;
}
can be called as SortAgnostic(collection, getValueOfA).
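The function-passing variant could look roughly like this in C++. The insertion sort is only a stand-in for the custom sorting logic from the question, and sortAgnostic, getValueOfA and getValueOfB are names adapted from the pseudocode above.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

struct O {
    int A;
    int B;
};

// Sorts in ascending order of whatever key getValue extracts.
// KeyFn can be any callable that takes an O and returns a comparable value.
template <typename KeyFn>
void sortAgnostic(std::vector<O>& collection, KeyFn getValue) {
    for (std::size_t i = 1; i < collection.size(); ++i) {
        // Bubble element i down until the key order is restored.
        for (std::size_t j = i;
             j > 0 && getValue(collection[j - 1]) > getValue(collection[j]);
             --j) {
            std::swap(collection[j - 1], collection[j]);
        }
    }
}

int getValueOfA(const O& entity) { return entity.A; }
int getValueOfB(const O& entity) { return entity.B; }
```

Because the key extractor is just a callable, a lambda such as [](const O& e) { return -e.B; } can be passed as well, for instance to sort by B in descending order.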
"... but creating a new, highly-specific type that I will have to maintain just to avoid duplicating a few lines of code seems unnecessary to me"
That is why you should use available tools, such as frameworks or other types of code libraries, that already provide the solution you need.
When a mechanism is common, it can be moved to a higher level of abstraction. When you cannot find a suitable existing solution, try to create your own. Think of the result of the operation as something that is not part of the class's functionality. Sorting is only a feature, which is why it should not be part of your class to begin with. Try to keep the class as simple as possible.
Do not worry prematurely about whether something is worth having just because it is small. Focus on how it will ultimately be used. If you use one type of sorting very often, just create a definition of it so you can reuse it. You do not necessarily have to create a util class and then call it. Sometimes the basic functionality enclosed in a util class is good enough.
I assume that you use Java:
In your case the wheel has already been invented in the form of Collections#sort(List, Comparator).
To use it, you could create an enum type that implements the Comparator interface with predefined sorting types.
We have people who run code for simulations, testing etc. on some supercomputers that we have. What would be nice is if, as part of a build process, we could check not only that the code compiles but also that the output matches some pattern which indicates we are getting meaningful results.
I.e. the researcher may know that the value of x must be within some bounds. If not, then a logical error has been made in the code (assuming it compiles and there is no compile-time error).
Are there any pre-written packages for this kind of thing? The code is written in FORTRAN, C, C++ etc.
Any specific or general advice would be appreciated.
I expect most unit testing frameworks could do this; supply a toy test data set and see that the answer is sane in various different ways.
A good way to ensure that the resulting value of any computation (whether final or intermediate) meets certain constraints, is to use an object oriented programming language like C++, and define data-types that internally enforce the conditions that you are checking for. You can then use those data-types as the return value of any computation to ensure that said conditions are met for the value returned.
Let's look at a simple example. Assume that you have a member function inside of an Airplane class, as part of a flight control system, that estimates the mass of the airplane instance as a function of the number of passengers and the amount of fuel the plane has at that moment. One way to declare the Airplane class and an airplaneMass() member function is the following:
class Airplane {
public:
...
int airplaneMass() const; // note the plain int return type
...
private:
...
};
However, a better way to implement the above would be to define a type AirplaneMass that can be used as the function's return type instead of int. AirplaneMass can internally ensure (in its constructor and any overloaded operators) that the value it encapsulates meets certain constraints. An example implementation of the AirplaneMass datatype could be the following:
class AirplaneMass {
public:
// AirplaneMass constructor
AirplaneMass(int m) {
if (m < MIN || m > MAX) {
// throw exception or log constraint violation
}
// if the value of m meets the constraints,
// assign it to the internal value.
mass_ = m;
}
...
/* range checking should also be done in the implementation
of overloaded operators. For instance, you may want to
make sure that the resultant of the ++ operation for
any instance of AirplaneMass also lies within the
specified constraints. */
private:
int mass_;
};
Thereafter, you can redeclare class Airplane and its airplaneMass() member function as follows:
class Airplane {
public:
...
AirplaneMass airplaneMass() const;
// note the more specific AirplaneMass return type
...
private:
...
};
The above will ensure that the value returned by airplaneMass() is between MIN and MAX. Otherwise, an exception will be thrown, or the error condition will be logged.
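For reference, a compilable variant of the idea might look like the sketch below. The concrete bounds, the value() accessor, the operator+= and the choice to throw std::out_of_range are all assumptions made for the example; the original leaves MIN, MAX and the error handling open.

```cpp
#include <stdexcept>

class AirplaneMass {
public:
    // Hypothetical bounds in kilograms, chosen only for this example.
    static constexpr int MIN = 0;
    static constexpr int MAX = 600000;

    explicit AirplaneMass(int m) : mass_(checked(m)) {}

    int value() const { return mass_; }

    // Every mutating operator re-checks the constraint before storing the result.
    AirplaneMass& operator+=(int delta) {
        mass_ = checked(mass_ + delta);
        return *this;
    }

private:
    // Returns m unchanged if it lies within [MIN, MAX], otherwise throws.
    static int checked(int m) {
        if (m < MIN || m > MAX) {
            throw std::out_of_range("AirplaneMass out of range");
        }
        return m;
    }

    int mass_;
};
```

Because checked() throws before the assignment happens, a failed update leaves the previous, valid value in place.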
I had to do that for conversions this month. I don't know if that might help you, but it appeared quite simple a solution to me.
First, I defined a tolerance level. (Java-ish example code...)
private static final double TOLERANCE = 0.000000000001D;
Then I defined a new "areEqual" method which checks if the difference between both values is lower than the tolerance level or not.
private static boolean areEqual(double a, double b) {
return (Math.abs(a - b) < TOLERANCE);
}
If I get a false somewhere, it means the check has probably failed. I can adjust the tolerance to see if it's just a precision problem or really a bad result. Works quite well in my situation.
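Since the code in the question is written in C, C++ and FORTRAN, here is roughly the same check as a C++ sketch. The default tolerance mirrors the constant above; areEqualRelative is an addition that is not part of the original approach, included on the assumption that simulation values may span many orders of magnitude.

```cpp
#include <algorithm>
#include <cmath>

// Absolute-tolerance comparison, mirroring the areEqual method above.
bool areEqual(double a, double b, double tolerance = 1e-12) {
    return std::fabs(a - b) < tolerance;
}

// Relative variant (an addition for illustration): scales the tolerance
// by the larger magnitude, so large values are not held to an absolute
// precision they cannot represent.
bool areEqualRelative(double a, double b, double relTolerance = 1e-12) {
    const double scale = std::max(std::fabs(a), std::fabs(b));
    return std::fabs(a - b) <= relTolerance * scale;
}
```

Which of the two fits better depends on the expected magnitude of the results being checked.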