C++11 use-case for piecewise_construct of pair and tuple? - c++11

In N3059 I found the description of piecewise construction of pairs (and tuples) (and it is in the new Standard).
But I can not see when I should use it. I found discussions about emplace and non-copyable entities, but when I tried it out, I could not create a case where I need piecewiese_construct or could see a performance benefit.
Example. I thought I need a class which is non-copyable, but movebale (required for forwarding):
struct NoCopy {
NoCopy(int, int) {};
NoCopy(const NoCopy&) = delete; // no copy
NoCopy& operator=(const NoCopy&) = delete; // no assign
NoCopy(NoCopy&&) {}; // please move
NoCopy& operator=(NoCopy&&) {}; // please move-assign
};
I then sort-of expected that standard pair-construction would fail:
pair<NoCopy,NoCopy> x{ NoCopy{1,2}, NoCopy{2,3} }; // fine!
but it did not. Actually, this is what I'd expected anyway, because "moving stuff around" rather then copying it everywhere in the stdlib, is it should be.
Thus, I see no reason why I should have done this, or so:
pair<NoCopy,NoCopy> y(
piecewise_construct,
forward_as_tuple(1,2),
forward_as_tuple(2,3)
); // also fine
So, what's a the usecase?
How and when do I use piecewise_construct?

Not all types can be moved more efficiently than copied, and for some types it may make sense to even explicitly disable both copying and moving. Consider std::array<int, BIGNUM> as an an example of the former kind of a type.
The point with the emplace functions and piecewise_construct is that such a class can be constructed in place, without needing to create temporary instances to be moved or copied.
struct big {
int data[100];
big(int first, int second) : data{first, second} {
// the rest of the array is presumably filled somehow as well
}
};
std::pair<big, big> pair(piecewise_construct, {1,2}, {3,4});
Compare the above to pair(big(1,2), big(3,4)) where two temporary big objects would have to be created and then copied - and moving does not help here at all! Similarly:
std::vector<big> vec;
vec.emplace_back(1,2);
The main use case for piecewise constructing a pair is emplacing elements into a map or an unordered_map:
std::map<int, big> map;
map.emplace(std::piecewise_construct, /*key*/1, /*value*/{2,3});

One power piecewise_construct has is to avoid bad conversions when doing overload resolution to construct objects.
Consider a Foo that has a weird set of constructor overloads:
struct Foo {
Foo(std::tuple<float, float>) { /* ... */ }
Foo(int, double) { /* ... */ }
};
int main() {
std::map<std::string, Foo> m1;
std::pair<int, double> p1{1, 3.14};
m1.emplace("Will call Foo(std::tuple<float, float>)",
p1);
m1.emplace("Will still call Foo(std::tuple<float, float>)",
std::forward_as_tuple(2, 3.14));
m1.emplace(std::piecewise_construct,
std::forward_as_tuple("Will call Foo(int, double)"),
std::forward_as_tuple(3, 3.14));
// Some care is required, though...
m1.emplace(std::piecewise_construct,
std::forward_as_tuple("Will call Foo(std::tuple<float, float>)!"),
std::forward_as_tuple(p1));
}

Related

Why does initialization of int by parenthesis inside class give error? [duplicate]

For example, I cannot write this:
class A
{
vector<int> v(12, 1);
};
I can only write this:
class A
{
vector<int> v1{ 12, 1 };
vector<int> v2 = vector<int>(12, 1);
};
Why is there a difference between these two declaration syntaxes?
The rationale behind this choice is explicitly mentioned in the related proposal for non static data member initializers :
An issue raised in Kona regarding scope of identifiers:
During discussion in the Core Working Group at the September ’07 meeting in Kona, a question arose about the scope of identifiers in the initializer. Do we want to allow class scope with the possibility of forward lookup; or do we want to require that the initializers be well-defined at the point that they’re parsed?
What’s desired:
The motivation for class-scope lookup is that we’d like to be able to put anything in a non-static data member’s initializer that we could put in a mem-initializer without significantly changing the semantics (modulo direct initialization vs. copy initialization):
int x();
struct S {
int i;
S() : i(x()) {} // currently well-formed, uses S::x()
// ...
static int x();
};
struct T {
int i = x(); // should use T::x(), ::x() would be a surprise
// ...
static int x();
};
Problem 1:
Unfortunately, this makes initializers of the “( expression-list )” form ambiguous at the time that the declaration is being parsed:
struct S {
int i(x); // data member with initializer
// ...
static int x;
};
struct T {
int i(x); // member function declaration
// ...
typedef int x;
};
One possible solution is to rely on the existing rule that, if a declaration could be an object or a function, then it’s a function:
struct S {
int i(j); // ill-formed...parsed as a member function,
// type j looked up but not found
// ...
static int j;
};
A similar solution would be to apply another existing rule, currently used only in templates, that if T could be a type or something else, then it’s something else; and we can use “typename” if we really mean a type:
struct S {
int i(x); // unabmiguously a data member
int j(typename y); // unabmiguously a member function
};
Both of those solutions introduce subtleties that are likely to be misunderstood by many users (as evidenced by the many questions on comp.lang.c++ about why “int i();” at block scope doesn’t declare a default-initialized int).
The solution proposed in this paper is to allow only initializers of the “= initializer-clause” and “{ initializer-list }” forms. That solves the ambiguity problem in most cases, for example:
HashingFunction hash_algorithm{"MD5"};
Here, we could not use the = form because HasningFunction’s constructor is explicit.
In especially tricky cases, a type might have to be mentioned twice. Consider:
vector<int> x = 3; // error: the constructor taking an int is explicit
vector<int> x(3); // three elements default-initialized
vector<int> x{3}; // one element with the value 3
In that case, we have to chose between the two alternatives by using the appropriate notation:
vector<int> x = vector<int>(3); // rather than vector<int> x(3);
vector<int> x{3}; // one element with the value 3
Problem 2:
Another issue is that, because we propose no change to the rules for initializing static data members, adding the static keyword could make a well-formed initializer ill-formed:
struct S {
const int i = f(); // well-formed with forward lookup
static const int j = f(); // always ill-formed for statics
// ...
constexpr static int f() { return 0; }
};
Problem 3:
A third issue is that class-scope lookup could turn a compile-time error into a run-time error:
struct S {
int i = j; // ill-formed without forward lookup, undefined behavior with
int j = 3;
};
(Unless caught by the compiler, i might be intialized with the undefined value of j.)
The proposal:
CWG had a 6-to-3 straw poll in Kona in favor of class-scope lookup; and that is what this paper proposes, with initializers for non-static data members limited to the “= initializer-clause” and “{ initializer-list }” forms.
We believe:
Problem 1: This problem does not occur as we don’t propose the () notation. The = and {} initializer notations do not suffer from this problem.
Problem 2: adding the static keyword makes a number of differences, this being the least of them.
Problem 3: this is not a new problem, but is the same order-of-initialization problem that already exists with constructor initializers.
One possible reason is that allowing parentheses would lead us back to the most vexing parse in no time. Consider the two types below:
struct foo {};
struct bar
{
bar(foo const&) {}
};
Now, you have a data member of type bar that you want to initialize, so you define it as
struct A
{
bar B(foo());
};
But what you've done above is declare a function named B that returns a bar object by value, and takes a single argument that's a function having the signature foo() (returns a foo and doesn't take any arguments).
Judging by the number and frequency of questions asked on StackOverflow that deal with this issue, this is something most C++ programmers find surprising and unintuitive. Adding the new brace-or-equal-initializer syntax was a chance to avoid this ambiguity and start with a clean slate, which is likely the reason the C++ committee chose to do so.
bar B{foo{}};
bar B = foo();
Both lines above declare an object named B of type bar, as expected.
Aside from the guesswork above, I'd like to point out that you're doing two vastly different things in your example above.
vector<int> v1{ 12, 1 };
vector<int> v2 = vector<int>(12, 1);
The first line initializes v1 to a vector that contains two elements, 12 and 1. The second creates a vector v2 that contains 12 elements, each initialized to 1.
Be careful of this rule - if a type defines a constructor that takes an initializer_list<T>, then that constructor is always considered first when the initializer for the type is a braced-init-list. The other constructors will be considered only if the one taking the initializer_list is not viable.

Most efficient way to assign values to a map of maps

Given a map of maps like:
std::map<unsigned int, std::map<std::string, MyBase*>> m_allMyObjects;
What would be the most efficient way to insert/add/"emplace" an element into m_allMyObjects given an unsigned int and a std::string taking optimization into account (on modern compilers)?
What would be the most efficient way to retrieve an element then?
m_allMyObjects may potentially contain up to 100'000 elements in the future.
Common knowledge and folklore about how to efficiently insert into maps (typically telling you to avoid operator[] and prefering the shiny new emplace) considers the costs of constructing and copying the values in the map. In your case, those values are plain pointers which can be copied at virtually no expense, and copying pointers can be aggressively optimized by the compiler.
On the other hand, you actually do have an object that is expensive to handle, namely the key of type std::string. You need to watch out for copies (moved or copied) of the key to determine performance. Obviously, for the tree lookup, you already need the string object, even if you have it as char*, as there is no insertion function that is templated over the type of key. This means for looking up the place to insert, you use one certain std::string object, but once the map node gets created, the new std::string object inside the map is copy-initialized from it (possibly moved). Avoiding everything in excess of that single copy/move should be your goal.
Example time!
#include <map>
#include <cstdio>
struct noisy {
noisy(int v) : val(v) {}
noisy(const noisy& src) : val(src.val) { std::puts("copy ctor"); }
noisy(noisy&& src) : val(src.val) { std::puts("move ctor"); }
noisy& operator=(const noisy& src)
{ val = src.val; std::puts("copy assign"); return *this; }
noisy& operator=(noisy&& src)
{ val = src.val; std::puts("move assign"); return *this; }
int val;
};
bool operator<(const noisy& a, const noisy& b)
{
return a.val < b.val;
}
int main(void)
{
std::map<noisy,int> m;
std::puts("Operator[]");
m[noisy(1)] = 3;
std::puts("insert/make_pair");
m.insert(std::make_pair(noisy(2), 3));
std::puts("insert/make_pair/ref");
m.insert(std::make_pair<noisy&&,int>(noisy(3), 3));
std::puts("insert/pair/ref");
m.insert(std::pair<noisy&&,int>(noisy(4), 3));
std::puts("emplace");
m.emplace(noisy(5), 3);
}
compiled with g++ 4.9.1, -std=c++11, -O2, the result is
Operator[]
move ctor
insert/make_pair
move ctor
move ctor
insert/make_pair/ref
move ctor
move ctor
insert/pair/ref
move ctor
emplace
move ctor
Which shows: avoid everything that creates an intermediate pair containing a copy of the key! Be aware that std::make_pair does never create a pair that contains references, even if it can take the parameters by reference! Whenever you pass a pair containing the copy of the key, the key gets copied into the pair and later into the map.
The expression suggested by MarkMB, namely m[int_k][str_k] = ptr, is quite good, and likely produces optimal code. There is no reason for the first index (int_k) to not use [], as you want a default constructed sub-map if the index is not used yet, so there is no unnecessary overhead. As we have seen, indexing with the string gets away with a single copy, so you are fine. If you can afford to lose your string, m[int_k][std::move(str_k)] = ptr might be a win, though. As discussed in the beginning, using emplace instead of [] is only about the values, which are virtually free to handle in your case.

C++11 - Moving fundamental data types in constructor?

I'm looking into move semantics from C++11 and I'm curious how to move fundamental types like boolean, integer float etc. in the constructor. Also the compound types like std::string.
Take the following class for example:
class Test
{
public:
// Default.
Test()
: m_Name("default"), m_Tested(true), m_Times(1), m_Grade('B')
{
// Starting up...
}
Test(const Test& other)
: m_Name(other.m_Name), m_Times(other.m_Times)
, m_Grade(other.m_Grade), m_Tested(other.m_Tested)
{
// Duplicating...
}
Test(Test&& other)
: m_Name(std::move(other.m_Name)) // Is this correct?
{
// Moving...
m_Tested = other.m_Tested; // I want to move not copy.
m_Times = other.m_Times; // I want to move not copy.
m_Grade = other.m_Grade; // I want to move not copy.
}
~Test()
{
// Shutting down....
}
private:
std::string m_Name;
bool m_Tested;
int m_Times;
char m_Grade;
};
How do I move (not copy) m_Tested, m_Times, m_Grade. And is m_Name moved correctly? Thank you for your time.
Initialization and assignment of a primitive from a prvalue or xvalue primitive has exactly the same effect as initialization or assignment from a lvalue primitive; the value is copied and the source object is unaffected.
In other words, you can use std::move but it won't make any difference.
If you want to change the value of the source object (to 0, say) you'll have to do that yourself.
Looks correct. Except simple data types like bool, int, char are only copied. The point of "moving" a string is that it has a buffer that it normally has to copy when constructing a new object, however when moving the old buffer is used (copying the pointer and not the contents of the buffer).
Test(Test&& other)
: m_Name(std::move(other.m_Name)), m_Times(other.m_Times)
, m_Grade(other.m_Grade), m_Tested(other.m_Tested)
{}

Dependency injection in C++11 without raw pointers

I often use the "dependency injection" pattern in my projects. In C++ it is easiest to implement by passing around raw pointers, but now with C++11, everything in high-level code should be doable with smart pointers. But what is the best practice for this case? Performance is not critical, a clean and understandable code matters more to me now.
Let me show a simplified example. We have an algorithm that uses distance calculations inside. We want to be able to replace this calculation with different distance metrics (Euclidean, Manhattan, etc.). Our goal is to be able to say something like:
SomeAlgorithm algorithmWithEuclidean(new EuclideanDistanceCalculator());
SomeAlgorithm algorithmWithManhattan(new ManhattanDistanceCalculator());
but with smart pointers to avoid manual new and delete.
This is a possible implementation with raw pointers:
class DistanceCalculator {
public:
virtual double distance(Point p1, Point p2) = 0;
};
class EuclideanDistanceCalculator {
public:
virtual double distance(Point p1, Point p2) {
return sqrt(...);
}
};
class ManhattanDistanceCalculator {
public:
virtual double distance(Point p1, Point p2) {
return ...;
}
};
class SomeAlgorithm {
DistanceCalculator* distanceCalculator;
public:
SomeAlgorithm(DistanceCalculator* distanceCalculator_)
: distanceCalculator(distanceCalculator_) {}
double calculateComplicated() {
...
double dist = distanceCalculator->distance(p1, p2);
...
}
~SomeAlgorithm(){
delete distanceCalculator;
}
};
Let's assume that copying is not really an issue, and if we didn't need polymorphism we would just pass the DistanceCalculator to the constructor of SomeAlgorithm by value (copying). But since we need to be able to pass in different derived instances (without slicing), the parameter must be either a raw pointer, a reference or a smart pointer.
One solution that comes to mind is to pass it in by reference-to-const and encapsulate it in a std::unique_ptr<DistanceCalculator> member variable. Then the call would be:
SomeAlgorithm algorithmWithEuclidean(EuclideanDistance());
But this stack-allocated temporary object (rvalue-reference?) will be destructed after this line. So we'd need some copying to make it more like a pass-by-value. But since we don't know the runtime type, we cannot construct our copy easily.
We could also use a smart pointer as the constructor parameter. Since there is no issue with ownership (the DistanceCalculator will be owned by SomeAlgorithm) we should use std::unique_ptr. Should I really replace all of such constructor parameters with unique_ptr? it seems to reduce readability. Also the user of SomeAlgorithm must construct it in an awkward way:
SomeAlgorithm algorithmWithEuclidean(std::unique_ptr<DistanceCalculator>(new EuclideanDistance()));
Or should I use the new move semantics (&&, std::move) in some way?
It seems to be a pretty standard problem, there must be some succinct way to implement it.
If I wanted to do this, the first thing I'd do is kill your interface, and instead use this:
SomeAlgorithm(std::function<double(Point,Point)> distanceCalculator_)
type erased invocation object.
I could do a drop-in replacement using your EuclideanDistanceCalculator like this:
std::function<double(Point,Point)> UseEuclidean() {
auto obj = std::make_shared<EuclideanDistance>();
return [obj](Point a, Point b)->double {
return obj->distance( a, b );
};
}
SomeAlgorithm foo( UseEuclidean() );
but as distance calculators rarely require state, we could do away with the object.
With C++1y support, this shortens to:
std::function<double(Point,Point>> UseEuclidean() {
return [obj = std::make_shared<EuclideanDistance>()](Point a, Point b)->double {
return obj->distance( a, b );
};
}
which as it no longer requires a local variable, can be used inline:
SomeAlgorithm foo( [obj = std::make_shared<EuclideanDistance>()](Point a, Point b)->double {
return obj->distance( a, b );
} );
but again, the EuclideanDistance doesn't have any real state, so instead we can just
std::function<double(Point,Point>> EuclideanDistance() {
return [](Point a, Point b)->double {
return sqrt( (b.x-a.x)*(b.x-a.x) + (b.y-a.y)*(b.y*a.y) );
};
}
If we really don't need movement but we do need state, we can write a unique_function< R(Args...) > type that does not support non-move based assignment, and store one of those instead.
The core of this is that the interface DistanceCalculator is noise. The name of the variable is usually enough. std::function< double(Point,Point) > m_DistanceCalculator is clear in what it does. The creator of the type-erasure object std::function handles any lifetime management issues, we just store the function object by value.
If your actual dependency injection is more complicated (say multiple different related callbacks), using an interface isn't bad. If you want to avoid copy requirements, I'd go with this:
struct InterfaceForDependencyStuff {
virtual void method1() = 0;
virtual void method2() = 0;
virtual int method3( double, char ) = 0;
virtual ~InterfaceForDependencyStuff() {}; // optional if you want to do more work later, but probably worth it
};
then, write up your own make_unique<T>(Args&&...) (a std one is coming in C++1y), and use it like this:
Interface:
SomeAlgorithm(std::unique_ptr<InterfaceForDependencyStuff> pDependencyStuff)
Use:
SomeAlgorithm foo(std::make_unique<ImplementationForDependencyStuff>( blah blah blah ));
If you don't want virtual ~InterfaceForDependencyStuff() and want to use unique_ptr, you have to use a unique_ptr that stores its deleter (by passing in a stateful deleter).
On the other hand, if std::shared_ptr already comes with a make_shared, and it stores its deleter statefully by default. So if you go with shared_ptr storage of your interface, you get:
SomeAlgorithm(std::shared_ptr<InterfaceForDependencyStuff> pDependencyStuff)
and
SomeAlgorithm foo(std::make_shared<ImplementationForDependencyStuff>( blah blah blah ));
and make_shared will store a pointer-to-function that deletes ImplementationForDependencyStuff that will not be lost when you convert it to a std::shared_ptr<InterfaceForDependencyStuff>, so you can safely lack a virtual destructor in InterfaceForDependencyStuff. I personally would not bother, and leave virtual ~InterfaceForDependencyStuff there.
In most cases you don't want or need ownership transfer, it makes code harder to understand and less flexible (moved-from objects can't be reused). The typical case would be to keep ownership with the caller:
class SomeAlgorithm {
DistanceCalculator* distanceCalculator;
public:
explicit SomeAlgorithm(DistanceCalculator* distanceCalculator_)
: distanceCalculator(distanceCalculator_) {
if (distanceCalculator == nullptr) { abort(); }
}
double calculateComplicated() {
...
double dist = distanceCalculator->distance(p1, p2);
...
}
// Default special members are fine.
};
int main() {
EuclideanDistanceCalculator distanceCalculator;
SomeAlgorithm algorithm(&distanceCalculator);
algorithm.calculateComplicated();
}
Raw pointers are fine to express non-ownership. If you prefer you can use a reference in the constructor argument, it makes no real difference. However, don't use a reference as data member, it makes the class unnecessarily unassignable.
The down side of just using any pointer (smart or raw), or even an ordinary C++ reference, is that they allow calling non-const methods from a const context.
For stateless classes with a single method that is a non-issue, and std::function is a good alternative, but for the general case of classes with state or multiple methods I propose a wrapper similar but not identical to std::reference_wrapper (which lacks the const safe accessor).
template<typename T>
struct NonOwningRef{
NonOwningRef() = delete;
NonOwningRef(T& other) noexcept : ptr(std::addressof(other)) { };
NonOwningRef(const NonOwningRef& other) noexcept = default;
const T& value() const noexcept{ return *ptr; };
T& value() noexcept{ return *ptr; };
private:
T* ptr;
};
usage:
class SomeAlgorithm {
NonOwningRef<DistanceCalculator> distanceCalculator;
public:
SomeAlgorithm(DistanceCalculator& distanceCalculator_)
: distanceCalculator(distanceCalculator_) {}
double calculateComplicated() {
double dist = distanceCalculator.value().distance(p1, p2);
return dist;
}
};
Replace T* with unique_ptr or shared_ptr to get owning versions. In this case, also add move construction, and construction from any unique_ptr<T2> or shared_ptr<T2> ).

push to list of boost::variant's

I have the boost::variant over set of non-default constructible (and maybe even non-moveable/non-copyable and non-copy/move constructible) classes with essentialy different non-default constructor prototypes, as shown below:
#include <boost/variant.hpp>
#include <string>
#include <list>
struct A { A(int) { ; } };
struct B { B(std::string) { ; } };
struct C { C(int, std::string) { ; } };
using V = boost::variant< A const, B const, C const >;
using L = std::list< V >;
int main()
{
L l;
l.push_back(A(1)); // an extra copy/move operation
l.push_back(B("2")); // an extra copy/move operation
l.push_back(C(3, "3")); // an extra copy/move operation
l.emplace_back(4);
l.emplace_back(std::string("5"));
// l.emplace_back(3, std::string("3")); // error here
return 0;
}
I expect, that std::list::emplace_back allows me to construct-and-insert (in single operation) new objects (of all the A, B, C types) into list, even if they have T & operator = (T const &) = delete;/T & operator = (T &&) = delete; and T(T const &) = delete;/T(T &&) = delete;. But what should I do, if constructor is a non-conversion one? I.e. have more, than one parameter. Or what I should to do if two different variant's underlying types have ambiguous constructor prototypes? In my opinion, this is the defect of implementation of the boost::variant library in the light of the new features of C++11 standard, if any at all can be applyed to solve the problem.
I specifically asked about std::list and boost::variant in superposition, because they are both internally implement the pimpl idiom in some form, as far as I know (say, boost::variant currently designed by means of temporary heap backup approach).
emplace can only call the constructors of the type in question. And boost::variant's constructors only take single objects which are unambiguously convertible to one of the variant's types.
variant doesn't forward parameters arbitrarily to one of its bounded types. It just takes a value. A single value that it will try to convert to one of the bounded types.
So you're going to have to construct an object and then copy that into the variant.
Assuming you can modify your "C" class, you could give it an additional constructor that takes a single tuple argument.

Resources