Idiomatic way of providing constructors that move their arguments - c++11

Lets say I have the following class:
#include <vector>
class Foo
{
public:
Foo(const std::vector<int> & a, const std::vector<int> & b)
: a{ a }, b{ b } {}
private:
std::vector<int> a, b;
};
But now I want to account for the situations in which the caller of the constructor might pass temporaries to it and I want to properly move those temporaries to a and b.
Now do I really have to add 3 more constructors, 1 of which has a as a rvalue reference, 1 of which has b as a rvalue reference and 1 that only has rvalue reference arguments?
Of course this question generalizes to any number of arguments which are worthwhile to move and the number of required constructors would be arguments^2 2^arguments.
This question also generalizes to all functions.
What is the idiomatic way of doing this? Or am I completely missing something important here?

The usual approach is to pass by value, then move-construct the members from the parameters:
Foo(std::vector<int> a, std::vector<int> b)
: a{ std::move(a) },
b{ std::move(b) }
{}
If a copy is needed, it will be created by the caller and then moved-from to construct the member. If the caller passes a temporary (or other rvalue), no copy is made, only a single move.
For arguments that don't have an efficient move constructor, then accepting a reference to const is slightly more efficient, and I'd retain that.
None of this applies if the function doesn't need a copy of the passed value - continue to use a const ref if you don't modify the value and don't need it to live beyond the end of the function execution. Personally, I use pass-by-value-and-move liberally in my constructors, but rarely in my other functions.

Really, you should take by value if move construction is very cheap.
This results in exactly 1 extra move over the ideal case in every case.
But if you really must avoid that, you can do this:
template<class T>
struct sink_of {
void const* ptr = 0;
T(*fn)(void const*) = 0;
sink_of(T&& t):
ptr( std::addressof(t) ),
fn([](void const*ptr)->T{
return std::move(*(T*)(ptr));
})
{}
sink_of(T const& t):
ptr( std::addressof(t) ),
fn([](void const*ptr)->T{
return *(T*)(ptr);
})
{}
operator T() const&& {
return fn(ptr);
}
};
which uses RVO/elision to avoid that extra move at the cost of a bunch of pointer-based overhead and type erasure.
Here is some test code that demonstrates that
test( noisy nin ):n(std::move(nin)) {}
test( sink_of<noisy> nin ):n(std::move(nin)) {}
differ by exactly 1 move-construct of a noisy.
The "perfect" version
test( noisy const& nin ):n(nin) {}
test( noisy && nin ):n(std::move(nin)) {}
or
template<class Noisy, std::enable_if_t<std::is_same<noisy, std::decay_t<Noisy>>{}, int> = 0 >
test( Noisy && nin ):n(std::forward<Noisy>(nin)) {}
has the same number of copy/moves as the sink_of version.
(noisy is a type that prints information about what moves/copies it engages in, so you can see what gets optimized away by elision)
This is only worth it when the extra move is important to eliminate. For a vector it is not.
Also, if you have a "true temporary" you are passing, the by-value one is as good as the sink_of or "perfect" ones.

Related

Why does initialization of int by parenthesis inside class give error? [duplicate]

For example, I cannot write this:
class A
{
vector<int> v(12, 1);
};
I can only write this:
class A
{
vector<int> v1{ 12, 1 };
vector<int> v2 = vector<int>(12, 1);
};
Why is there a difference between these two declaration syntaxes?
The rationale behind this choice is explicitly mentioned in the related proposal for non static data member initializers :
An issue raised in Kona regarding scope of identifiers:
During discussion in the Core Working Group at the September ’07 meeting in Kona, a question arose about the scope of identifiers in the initializer. Do we want to allow class scope with the possibility of forward lookup; or do we want to require that the initializers be well-defined at the point that they’re parsed?
What’s desired:
The motivation for class-scope lookup is that we’d like to be able to put anything in a non-static data member’s initializer that we could put in a mem-initializer without significantly changing the semantics (modulo direct initialization vs. copy initialization):
int x();
struct S {
int i;
S() : i(x()) {} // currently well-formed, uses S::x()
// ...
static int x();
};
struct T {
int i = x(); // should use T::x(), ::x() would be a surprise
// ...
static int x();
};
Problem 1:
Unfortunately, this makes initializers of the “( expression-list )” form ambiguous at the time that the declaration is being parsed:
struct S {
int i(x); // data member with initializer
// ...
static int x;
};
struct T {
int i(x); // member function declaration
// ...
typedef int x;
};
One possible solution is to rely on the existing rule that, if a declaration could be an object or a function, then it’s a function:
struct S {
int i(j); // ill-formed...parsed as a member function,
// type j looked up but not found
// ...
static int j;
};
A similar solution would be to apply another existing rule, currently used only in templates, that if T could be a type or something else, then it’s something else; and we can use “typename” if we really mean a type:
struct S {
int i(x); // unabmiguously a data member
int j(typename y); // unabmiguously a member function
};
Both of those solutions introduce subtleties that are likely to be misunderstood by many users (as evidenced by the many questions on comp.lang.c++ about why “int i();” at block scope doesn’t declare a default-initialized int).
The solution proposed in this paper is to allow only initializers of the “= initializer-clause” and “{ initializer-list }” forms. That solves the ambiguity problem in most cases, for example:
HashingFunction hash_algorithm{"MD5"};
Here, we could not use the = form because HasningFunction’s constructor is explicit.
In especially tricky cases, a type might have to be mentioned twice. Consider:
vector<int> x = 3; // error: the constructor taking an int is explicit
vector<int> x(3); // three elements default-initialized
vector<int> x{3}; // one element with the value 3
In that case, we have to chose between the two alternatives by using the appropriate notation:
vector<int> x = vector<int>(3); // rather than vector<int> x(3);
vector<int> x{3}; // one element with the value 3
Problem 2:
Another issue is that, because we propose no change to the rules for initializing static data members, adding the static keyword could make a well-formed initializer ill-formed:
struct S {
const int i = f(); // well-formed with forward lookup
static const int j = f(); // always ill-formed for statics
// ...
constexpr static int f() { return 0; }
};
Problem 3:
A third issue is that class-scope lookup could turn a compile-time error into a run-time error:
struct S {
int i = j; // ill-formed without forward lookup, undefined behavior with
int j = 3;
};
(Unless caught by the compiler, i might be intialized with the undefined value of j.)
The proposal:
CWG had a 6-to-3 straw poll in Kona in favor of class-scope lookup; and that is what this paper proposes, with initializers for non-static data members limited to the “= initializer-clause” and “{ initializer-list }” forms.
We believe:
Problem 1: This problem does not occur as we don’t propose the () notation. The = and {} initializer notations do not suffer from this problem.
Problem 2: adding the static keyword makes a number of differences, this being the least of them.
Problem 3: this is not a new problem, but is the same order-of-initialization problem that already exists with constructor initializers.
One possible reason is that allowing parentheses would lead us back to the most vexing parse in no time. Consider the two types below:
struct foo {};
struct bar
{
bar(foo const&) {}
};
Now, you have a data member of type bar that you want to initialize, so you define it as
struct A
{
bar B(foo());
};
But what you've done above is declare a function named B that returns a bar object by value, and takes a single argument that's a function having the signature foo() (returns a foo and doesn't take any arguments).
Judging by the number and frequency of questions asked on StackOverflow that deal with this issue, this is something most C++ programmers find surprising and unintuitive. Adding the new brace-or-equal-initializer syntax was a chance to avoid this ambiguity and start with a clean slate, which is likely the reason the C++ committee chose to do so.
bar B{foo{}};
bar B = foo();
Both lines above declare an object named B of type bar, as expected.
Aside from the guesswork above, I'd like to point out that you're doing two vastly different things in your example above.
vector<int> v1{ 12, 1 };
vector<int> v2 = vector<int>(12, 1);
The first line initializes v1 to a vector that contains two elements, 12 and 1. The second creates a vector v2 that contains 12 elements, each initialized to 1.
Be careful of this rule - if a type defines a constructor that takes an initializer_list<T>, then that constructor is always considered first when the initializer for the type is a braced-init-list. The other constructors will be considered only if the one taking the initializer_list is not viable.

why does std::for_each iterator need a copy constructable iterator

I noticed that std::for_each requires it's iterators to meet the requirement InputIterator, which in turn requires Iterator and then Copy{Contructable,Assignable}.
That's not the only thing, std::for_each actually uses the copy constructor (cc) (not assignment as far as my configuration goes). That is, deleting the cc from the iterator will result in:
error: use of deleted function ‘some_iterator::some_iterator(const some_iterator&)’
Why does std::for_each need a cc? I found this particularly inconvenient, since I created an iterator which recursively iterates through files in a folder, keeping track of the files and folders on a queue. This means that the iterator has a queue data member, which would also have to be copied if the cc is used: that is unnecessarily inefficient.
The strange thing is that the cc is not called in this simple example:
#include <iostream>
#include <iterator>
#include <algorithm>
class infinite_5_iterator
:
public std::iterator<std::input_iterator_tag, int>
{
public:
infinite_5_iterator() = default;
infinite_5_iterator(infinite_5_iterator const &) {std::cout << "copy constr "; }
infinite_5_iterator &operator=(infinite_5_iterator const &) = delete;
int operator*() { return 5; }
infinite_5_iterator &operator++() { return *this; }
bool operator==(infinite_5_iterator const &) const { return false; }
bool operator!=(infinite_5_iterator const &) const { return true; }
};
int main() {
std::for_each(infinite_5_iterator(), infinite_5_iterator(),
[](int v) {
std::cout << v << ' ';
}
);
}
source: http://ideone.com/YVHph8
It however is needed compile time. Why does std::for_each need to copy construct the iterator, and when is this done? Isn't this extremely inefficient?
NOTE: I'm talking about the cc of the iterator, not of it's elements, as is done here: unexpected copies with foreach over a map
EDIT: Note that the standard does not state the copy-constructor is called at all, it just expresses the amount of times f is called. May I then assume that the cc is not called at all? Why is the use of operator++ and operator* and cc not specified, but the use of f is?
You have simply fallen victim to a specification that has evolved in bits and pieces over decades. The concept of InputIterator was invented a long time before the notion of move-only types, or movable types was conceived.
In hindsight I would love to declare that InputIterator need not be copyable. This would mesh perfectly with its single-pass behavior. But I also fear that such a change would have overwhelming backwards compatibility problems.
In addition to the flawed iterator concepts as specified in the standard, about a decade ago, in an attempt to be helpful, the gcc std::lib (libstdc++) started imposing "concepts" on things like InputIterator in the std-algorithms. I.e. because the standard says:
Requires: InputIterator shall satisfy the requirements of an input iterator (24.2.3).
then "concept checks" were inserted into the std-algorithms that require InputIterator to meet all of the requirements of input iterator whether or not the algorithm actually used all of those requirements. And in this case, it is the concept check, not the actual algorithm, that is requiring your iterator to be CopyConstructible.
<sigh>
If you write your own for_each algorithm, it is trivial to do so without requiring your iterators to be CopyConstructible or CopyAssignable (if supplied with rvalue iterator arguments):
template <class InputIterator, class Function>
inline
Function
for_each(InputIterator first, InputIterator last, Function f)
{
for (; first != last; ++first)
f(*first);
return f;
}
And for your use case I recommend either doing that, or simply writing your own loop.

Dependency injection in C++11 without raw pointers

I often use the "dependency injection" pattern in my projects. In C++ it is easiest to implement by passing around raw pointers, but now with C++11, everything in high-level code should be doable with smart pointers. But what is the best practice for this case? Performance is not critical, a clean and understandable code matters more to me now.
Let me show a simplified example. We have an algorithm that uses distance calculations inside. We want to be able to replace this calculation with different distance metrics (Euclidean, Manhattan, etc.). Our goal is to be able to say something like:
SomeAlgorithm algorithmWithEuclidean(new EuclideanDistanceCalculator());
SomeAlgorithm algorithmWithManhattan(new ManhattanDistanceCalculator());
but with smart pointers to avoid manual new and delete.
This is a possible implementation with raw pointers:
class DistanceCalculator {
public:
virtual double distance(Point p1, Point p2) = 0;
};
class EuclideanDistanceCalculator {
public:
virtual double distance(Point p1, Point p2) {
return sqrt(...);
}
};
class ManhattanDistanceCalculator {
public:
virtual double distance(Point p1, Point p2) {
return ...;
}
};
class SomeAlgorithm {
DistanceCalculator* distanceCalculator;
public:
SomeAlgorithm(DistanceCalculator* distanceCalculator_)
: distanceCalculator(distanceCalculator_) {}
double calculateComplicated() {
...
double dist = distanceCalculator->distance(p1, p2);
...
}
~SomeAlgorithm(){
delete distanceCalculator;
}
};
Let's assume that copying is not really an issue, and if we didn't need polymorphism we would just pass the DistanceCalculator to the constructor of SomeAlgorithm by value (copying). But since we need to be able to pass in different derived instances (without slicing), the parameter must be either a raw pointer, a reference or a smart pointer.
One solution that comes to mind is to pass it in by reference-to-const and encapsulate it in a std::unique_ptr<DistanceCalculator> member variable. Then the call would be:
SomeAlgorithm algorithmWithEuclidean(EuclideanDistance());
But this stack-allocated temporary object (rvalue-reference?) will be destructed after this line. So we'd need some copying to make it more like a pass-by-value. But since we don't know the runtime type, we cannot construct our copy easily.
We could also use a smart pointer as the constructor parameter. Since there is no issue with ownership (the DistanceCalculator will be owned by SomeAlgorithm) we should use std::unique_ptr. Should I really replace all of such constructor parameters with unique_ptr? it seems to reduce readability. Also the user of SomeAlgorithm must construct it in an awkward way:
SomeAlgorithm algorithmWithEuclidean(std::unique_ptr<DistanceCalculator>(new EuclideanDistance()));
Or should I use the new move semantics (&&, std::move) in some way?
It seems to be a pretty standard problem, there must be some succinct way to implement it.
If I wanted to do this, the first thing I'd do is kill your interface, and instead use this:
SomeAlgorithm(std::function<double(Point,Point)> distanceCalculator_)
type erased invocation object.
I could do a drop-in replacement using your EuclideanDistanceCalculator like this:
std::function<double(Point,Point)> UseEuclidean() {
auto obj = std::make_shared<EuclideanDistance>();
return [obj](Point a, Point b)->double {
return obj->distance( a, b );
};
}
SomeAlgorithm foo( UseEuclidean() );
but as distance calculators rarely require state, we could do away with the object.
With C++1y support, this shortens to:
std::function<double(Point,Point>> UseEuclidean() {
return [obj = std::make_shared<EuclideanDistance>()](Point a, Point b)->double {
return obj->distance( a, b );
};
}
which as it no longer requires a local variable, can be used inline:
SomeAlgorithm foo( [obj = std::make_shared<EuclideanDistance>()](Point a, Point b)->double {
return obj->distance( a, b );
} );
but again, the EuclideanDistance doesn't have any real state, so instead we can just
std::function<double(Point,Point>> EuclideanDistance() {
return [](Point a, Point b)->double {
return sqrt( (b.x-a.x)*(b.x-a.x) + (b.y-a.y)*(b.y*a.y) );
};
}
If we really don't need movement but we do need state, we can write a unique_function< R(Args...) > type that does not support non-move based assignment, and store one of those instead.
The core of this is that the interface DistanceCalculator is noise. The name of the variable is usually enough. std::function< double(Point,Point) > m_DistanceCalculator is clear in what it does. The creator of the type-erasure object std::function handles any lifetime management issues, we just store the function object by value.
If your actual dependency injection is more complicated (say multiple different related callbacks), using an interface isn't bad. If you want to avoid copy requirements, I'd go with this:
struct InterfaceForDependencyStuff {
virtual void method1() = 0;
virtual void method2() = 0;
virtual int method3( double, char ) = 0;
virtual ~InterfaceForDependencyStuff() {}; // optional if you want to do more work later, but probably worth it
};
then, write up your own make_unique<T>(Args&&...) (a std one is coming in C++1y), and use it like this:
Interface:
SomeAlgorithm(std::unique_ptr<InterfaceForDependencyStuff> pDependencyStuff)
Use:
SomeAlgorithm foo(std::make_unique<ImplementationForDependencyStuff>( blah blah blah ));
If you don't want virtual ~InterfaceForDependencyStuff() and want to use unique_ptr, you have to use a unique_ptr that stores its deleter (by passing in a stateful deleter).
On the other hand, if std::shared_ptr already comes with a make_shared, and it stores its deleter statefully by default. So if you go with shared_ptr storage of your interface, you get:
SomeAlgorithm(std::shared_ptr<InterfaceForDependencyStuff> pDependencyStuff)
and
SomeAlgorithm foo(std::make_shared<ImplementationForDependencyStuff>( blah blah blah ));
and make_shared will store a pointer-to-function that deletes ImplementationForDependencyStuff that will not be lost when you convert it to a std::shared_ptr<InterfaceForDependencyStuff>, so you can safely lack a virtual destructor in InterfaceForDependencyStuff. I personally would not bother, and leave virtual ~InterfaceForDependencyStuff there.
In most cases you don't want or need ownership transfer, it makes code harder to understand and less flexible (moved-from objects can't be reused). The typical case would be to keep ownership with the caller:
class SomeAlgorithm {
DistanceCalculator* distanceCalculator;
public:
explicit SomeAlgorithm(DistanceCalculator* distanceCalculator_)
: distanceCalculator(distanceCalculator_) {
if (distanceCalculator == nullptr) { abort(); }
}
double calculateComplicated() {
...
double dist = distanceCalculator->distance(p1, p2);
...
}
// Default special members are fine.
};
int main() {
EuclideanDistanceCalculator distanceCalculator;
SomeAlgorithm algorithm(&distanceCalculator);
algorithm.calculateComplicated();
}
Raw pointers are fine to express non-ownership. If you prefer you can use a reference in the constructor argument, it makes no real difference. However, don't use a reference as data member, it makes the class unnecessarily unassignable.
The down side of just using any pointer (smart or raw), or even an ordinary C++ reference, is that they allow calling non-const methods from a const context.
For stateless classes with a single method that is a non-issue, and std::function is a good alternative, but for the general case of classes with state or multiple methods I propose a wrapper similar but not identical to std::reference_wrapper (which lacks the const safe accessor).
template<typename T>
struct NonOwningRef{
NonOwningRef() = delete;
NonOwningRef(T& other) noexcept : ptr(std::addressof(other)) { };
NonOwningRef(const NonOwningRef& other) noexcept = default;
const T& value() const noexcept{ return *ptr; };
T& value() noexcept{ return *ptr; };
private:
T* ptr;
};
usage:
class SomeAlgorithm {
NonOwningRef<DistanceCalculator> distanceCalculator;
public:
SomeAlgorithm(DistanceCalculator& distanceCalculator_)
: distanceCalculator(distanceCalculator_) {}
double calculateComplicated() {
double dist = distanceCalculator.value().distance(p1, p2);
return dist;
}
};
Replace T* with unique_ptr or shared_ptr to get owning versions. In this case, also add move construction, and construction from any unique_ptr<T2> or shared_ptr<T2> ).

push to list of boost::variant's

I have the boost::variant over set of non-default constructible (and maybe even non-moveable/non-copyable and non-copy/move constructible) classes with essentialy different non-default constructor prototypes, as shown below:
#include <boost/variant.hpp>
#include <string>
#include <list>
struct A { A(int) { ; } };
struct B { B(std::string) { ; } };
struct C { C(int, std::string) { ; } };
using V = boost::variant< A const, B const, C const >;
using L = std::list< V >;
int main()
{
L l;
l.push_back(A(1)); // an extra copy/move operation
l.push_back(B("2")); // an extra copy/move operation
l.push_back(C(3, "3")); // an extra copy/move operation
l.emplace_back(4);
l.emplace_back(std::string("5"));
// l.emplace_back(3, std::string("3")); // error here
return 0;
}
I expect, that std::list::emplace_back allows me to construct-and-insert (in single operation) new objects (of all the A, B, C types) into list, even if they have T & operator = (T const &) = delete;/T & operator = (T &&) = delete; and T(T const &) = delete;/T(T &&) = delete;. But what should I do, if constructor is a non-conversion one? I.e. have more, than one parameter. Or what I should to do if two different variant's underlying types have ambiguous constructor prototypes? In my opinion, this is the defect of implementation of the boost::variant library in the light of the new features of C++11 standard, if any at all can be applyed to solve the problem.
I specifically asked about std::list and boost::variant in superposition, because they are both internally implement the pimpl idiom in some form, as far as I know (say, boost::variant currently designed by means of temporary heap backup approach).
emplace can only call the constructors of the type in question. And boost::variant's constructors only take single objects which are unambiguously convertible to one of the variant's types.
variant doesn't forward parameters arbitrarily to one of its bounded types. It just takes a value. A single value that it will try to convert to one of the bounded types.
So you're going to have to construct an object and then copy that into the variant.
Assuming you can modify your "C" class, you could give it an additional constructor that takes a single tuple argument.

C++11 use-case for piecewise_construct of pair and tuple?

In N3059 I found the description of piecewise construction of pairs (and tuples) (and it is in the new Standard).
But I can not see when I should use it. I found discussions about emplace and non-copyable entities, but when I tried it out, I could not create a case where I need piecewiese_construct or could see a performance benefit.
Example. I thought I need a class which is non-copyable, but movebale (required for forwarding):
struct NoCopy {
NoCopy(int, int) {};
NoCopy(const NoCopy&) = delete; // no copy
NoCopy& operator=(const NoCopy&) = delete; // no assign
NoCopy(NoCopy&&) {}; // please move
NoCopy& operator=(NoCopy&&) {}; // please move-assign
};
I then sort-of expected that standard pair-construction would fail:
pair<NoCopy,NoCopy> x{ NoCopy{1,2}, NoCopy{2,3} }; // fine!
but it did not. Actually, this is what I'd expected anyway, because "moving stuff around" rather then copying it everywhere in the stdlib, is it should be.
Thus, I see no reason why I should have done this, or so:
pair<NoCopy,NoCopy> y(
piecewise_construct,
forward_as_tuple(1,2),
forward_as_tuple(2,3)
); // also fine
So, what's a the usecase?
How and when do I use piecewise_construct?
Not all types can be moved more efficiently than copied, and for some types it may make sense to even explicitly disable both copying and moving. Consider std::array<int, BIGNUM> as an an example of the former kind of a type.
The point with the emplace functions and piecewise_construct is that such a class can be constructed in place, without needing to create temporary instances to be moved or copied.
struct big {
int data[100];
big(int first, int second) : data{first, second} {
// the rest of the array is presumably filled somehow as well
}
};
std::pair<big, big> pair(piecewise_construct, {1,2}, {3,4});
Compare the above to pair(big(1,2), big(3,4)) where two temporary big objects would have to be created and then copied - and moving does not help here at all! Similarly:
std::vector<big> vec;
vec.emplace_back(1,2);
The main use case for piecewise constructing a pair is emplacing elements into a map or an unordered_map:
std::map<int, big> map;
map.emplace(std::piecewise_construct, /*key*/1, /*value*/{2,3});
One power piecewise_construct has is to avoid bad conversions when doing overload resolution to construct objects.
Consider a Foo that has a weird set of constructor overloads:
struct Foo {
Foo(std::tuple<float, float>) { /* ... */ }
Foo(int, double) { /* ... */ }
};
int main() {
std::map<std::string, Foo> m1;
std::pair<int, double> p1{1, 3.14};
m1.emplace("Will call Foo(std::tuple<float, float>)",
p1);
m1.emplace("Will still call Foo(std::tuple<float, float>)",
std::forward_as_tuple(2, 3.14));
m1.emplace(std::piecewise_construct,
std::forward_as_tuple("Will call Foo(int, double)"),
std::forward_as_tuple(3, 3.14));
// Some care is required, though...
m1.emplace(std::piecewise_construct,
std::forward_as_tuple("Will call Foo(std::tuple<float, float>)!"),
std::forward_as_tuple(p1));
}

Resources