Indirect Member RAII: unique_ptr or optional? - c++11

Consider a class with a member that can't be stored directly, e.g., because it does not have a default constructor, and the enclosing class's constructor doesn't have enough information to create it:
class Foo
{
public:
Foo(){} // Default ctor
private:
/* Won't build: no default ctor or way to call its
non-default ctor at Foo's ctor. */
Bar m_bar;
};
Clearly, m_bar needs to be stored differently, e.g., through a pointer. A std::unique_ptr seems better, though, as it will destruct it automatically:
std::unique_ptr<Bar> m_bar;
It's also possible to use std::experimental::optional, though:
std::experimental::optional<Bar> m_bar;
My questions are: 1. What are the tradeoffs? and 2. Does it make sense to build a class automating the choice between them?
Specifically, looking at the exception guarantees for the ctor of std::unique_ptr and the exception guarantees for the ctor of std::experimental::optional, it seems clear that the former must perform dynamic allocation and deallocation - runtime speed disadvantages, and the latter stores things in some (aligned) memory buffer - size disadvantages. Are these the only tradeoffs?
If these are indeed the tradeoffs, and given that both types share enough of their interface (ctor, operator*), does it make sense to automate the choice between them with something like
template<typename T>
using indirect_raii = typename std::conditional<
// 20 - arbitrary constant
sizeof(std::experimental::optional<T>) >
20 + sizeof(std::unique_ptr<T>),
std::unique_ptr<T>,
std::experimental::optional<T>>::type;
(Note: there is a question discussing the tradeoffs between these two as return types, but the question and answers focus on what each conveys to the callers of the function, which is irrelevant for these private members.)

IMO there are other trade-offs at play here:
unique_ptr is not copyable or copy-assignable, while optional is.
I suppose one thing you could do is make indirect_RAII a class-type and conditionally add definitions to make it copyable by calling Bar's copy ctor, even when unique_ptr is selected. (Or conversely, disable copying when it's an optional.)
optional types can have a constexpr constructor -- you can't really do the equivalent thing with a unique_ptr at compile-time.
Bar can be incomplete at the point where a unique_ptr<Bar> member is declared; it only has to be complete where the pointee is created and destroyed. It cannot be incomplete at the point where optional<Bar> is instantiated. In your example I guess you assume that Bar is complete since you take its size, but potentially you might want to implement a class using indirect_RAII where this isn't the case.
Even in cases where Bar is large, you still may find that e.g. std::vector<Foo> will perform better when optional is selected than when unique_ptr is. I would expect this to happen in cases where the vector is populated once, and then iterated over many times.
It may be that as a general rule of thumb, your size rule is good for common use in your program, but I guess for "common use" it doesn't really matter which one you pick. An alternative to using your indirect_RAII type is, just pick one or the other in each case, and in places where you would have taken advantage of the "generic interface", pass the type as a template parameter when necessary. And in performance-critical areas, make the appropriate choice manually.
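The copyability point above can be sketched as follows. This is only an illustration, not the questioner's indirect_raii: deep_ptr is a hypothetical wrapper name, and Bar here is a stand-in type without a default constructor. The wrapper stores its value through std::unique_ptr but restores copying by calling the value's copy constructor.

```cpp
#include <memory>
#include <utility>

// Stand-in for the question's Bar: no default constructor.
struct Bar {
    explicit Bar(int v) : value(v) {}
    int value;
};

// Hypothetical wrapper: unique_ptr storage, made copyable via T's copy ctor.
template <typename T>
class deep_ptr {
public:
    deep_ptr() = default;                            // starts empty
    deep_ptr(const deep_ptr& other)
        : m_ptr(other.m_ptr ? new T(*other.m_ptr) : nullptr) {}
    deep_ptr(deep_ptr&&) noexcept = default;
    deep_ptr& operator=(deep_ptr other) noexcept {   // copy-and-swap
        m_ptr.swap(other.m_ptr);
        return *this;
    }

    // Late construction, once the arguments are finally known.
    template <typename... Args>
    void emplace(Args&&... args) {
        m_ptr.reset(new T(std::forward<Args>(args)...));
    }

    explicit operator bool() const noexcept { return static_cast<bool>(m_ptr); }
    T& operator*() const { return *m_ptr; }

private:
    std::unique_ptr<T> m_ptr;
};

class Foo {
public:
    Foo() {}                                  // m_bar is still empty here
    void init(int v) { m_bar.emplace(v); }    // Bar is created later
    bool has_bar() const { return static_cast<bool>(m_bar); }
    int bar_value() const { return (*m_bar).value; }
private:
    deep_ptr<Bar> m_bar;
};
```

With this, Foo stays copyable even though the storage is a unique_ptr: copying a Foo deep-copies its Bar, so the two objects can be changed independently. The price is one heap allocation per copy, which is exactly the cost optional avoids.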

Related

In C++, how can one predict if move or copy semantics would be invoked?

Given the latitude that a C++ compiler has in instantiating temporary objects, and in invoking mechanisms like return value optimization etc., it is not always clear by looking at some code if move or copy semantics will be invoked (or how many).
It almost feels as if these primitives exist for incidental optimizations. That is, you may or may not get them. It seems like it's difficult to design any kind of resource management strategy that leverages moves, when it is hard to control the invocation of moves themselves.
Is there a way to predict clearly (and simply) where and how many copies and moves might occur in some code? Ideally, one would not need to be an expert in compiler internals to be able to do this.
It seems like it's difficult to design any kind of resource management strategy that leverages moves, when it is hard to control the invocation of moves themselves.
I would contradict here. Leveraging move semantics when designing a resource handling class should be done independently of how or when copy- or move-construction occurs in the client code. Once move-ctor/assignment is there, client code can be designed to leverage the existence of these special member functions.
Is there a way to predict clearly (and simply) where and how many copies and moves might occur in some code?
A bit hard to tell what simply means here, but this is how I understand it:
Given that a class has no move ctor/assignment operator, you will always get a copy. This is trivial, but important to keep in mind when working with e.g. classes in a legacy code that have user defined destructors and/or copy-ctor/assignment, because the compiler doesn't generate move ctors/assignment in this case.
Return value optimization. The question is tagged C++11, so you don't have the guaranteed copy elision for initialization from prvalues that C++17 introduced. However, it is fair to assume that your compiler already implements an equivalent mechanism as an optimization. Hence,
struct A {};
A func() { return A{}; }
can be assumed to construct the instance of A to which the function return value is bound on the calling side in place. This causes neither move nor copy construction. The same behavior can optimistically be assumed if the returned object has a name, as long as func() has no branching that renders NRVO impossible.
As an exception from this guideline, function return values that are also function parameters do not qualify for return value optimization. Hence, move/forward them to prevent copy in case A is move-constructible:
A func(A& a) { return std::move(a); }
The object created by the return value of func(A&) will hence be move-constructed.
Function parameters do not reveal per se how they behave, it depends on the type and its special member functions. Given
void f1(A a1) { A a2{std::move(a1)}; }
void f2(A& a1) { /* Same as above. */ }
void f3(A&& a1) { /* Again, same. */ }
the instances a2 are move-constructed if A has a move ctor, otherwise, it's copy.
There is a lot to discover beyond the example cases above; I am neither capable of going into more detail, nor would this fit the desired simplicity of an answer. Also, the scenario is different when you don't know the types you are dealing with, e.g. in function or class templates. In this case, a good read on how to deal with the related uncertainty of whether copies or moves are made is Item 29 in Effective Modern C++ ("Assume that move operations are not present, not cheap, and not used").
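One way to check these rules rather than guess is to count the special member calls. The sketch below uses made-up types (Movable, Legacy) and a made-up helper (pass_through) purely for illustration. It deliberately avoids cases that depend on optional copy elision, so the counts do not vary with compiler settings.

```cpp
#include <utility>

// Counts how often each special member runs.
struct Counts { int copies = 0; int moves = 0; };

// Has both a copy and a move constructor.
struct Movable {
    static Counts counts;
    Movable() = default;
    Movable(const Movable&) { ++counts.copies; }
    Movable(Movable&&) noexcept { ++counts.moves; }
};
Counts Movable::counts;

// "Legacy" type: the user-declared copy ctor suppresses the implicit move ctor.
struct Legacy {
    static Counts counts;
    Legacy() = default;
    Legacy(const Legacy&) { ++counts.copies; }
};
Counts Legacy::counts;

// Returning a reference parameter: no RVO applies, so move explicitly.
template <typename T>
T pass_through(T& t) { return std::move(t); }

template <typename T>
Counts measure() {
    T::counts = Counts();
    T a;
    T b(a);                           // lvalue source: always a copy
    T c(std::move(a));                // rvalue source: move if available, else copy
    const T& d = pass_through(c);     // return value is built from std::move(t)
    (void)b; (void)d;
    return T::counts;
}
```

For Movable this yields one copy and two moves; for Legacy, all three constructions fall back to the copy constructor, which matches the first rule in the answer.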

What are the most common places that move semantics is used in C++11 STL?

I know that std::vector<T>::push_back() has move semantics support. So, when I add a named temporary instance to a vector, I can use std::move().
What are the other common places in the STL where I should get into the habit of adding std::move()?
I know that std::vector<T>::push_back() has move semantics support.
The support that push_back has is simply an additional overload that takes an rvalue reference, so that the new value T inside the vector can be constructed by invoking T(T&&) instead of T(const T&). The advantage is that the former can be implemented way more efficiently because it assumes that the passed rvalue reference is never going to be used afterwards.
Most Standard Library containers have added similar overloads to their push/enqueue/insert member functions. Additionally, the concept of emplacement has been added (e.g. std::vector<T>::emplace_back), where the values are constructed in place inside the container in order to avoid unnecessary temporaries. Emplacement should be preferred to insertion/pushing.
So, when I add a named temporary instance to a vector, I can use std::move().
"Named temporary" doesn't really make much sense. The idea is that you have an lvalue you don't care about anymore, and you want to turn it into a temporary by using std::move. Example:
Foo foo;
some_vector.emplace_back(std::move(foo));
// I'm sure `foo` won't be used from now on
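Just remember that std::move is not special: it is only a cast to an rvalue reference, essentially static_cast<T&&> (more precisely, static_cast<typename std::remove_reference<T>::type&&>). Nothing is moved until something consumes the result.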
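A small check of the "it is only a cast" claim; cast_only is a hypothetical function name used just for this sketch.

```cpp
#include <string>
#include <type_traits>
#include <utility>

// std::move only changes the value category of its argument.
inline std::string cast_only() {
    std::string s = "abc";
    static_assert(std::is_same<decltype(std::move(s)), std::string&&>::value,
                  "std::move(s) has type std::string&&");
    std::string&& r = std::move(s);   // binds a reference; nothing is moved
    (void)r;
    return s;                         // s still holds "abc"
}
```

The string is only emptied out if the rvalue is actually handed to a move constructor or move assignment operator.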
What are the other common places in the STL that I should grow the habit to add std::move?
This is a really broad question - you should add std::move everywhere it makes sense, not just in the context of the Standard Library. If you have an lvalue you know you're not going to use anymore in a particular code path, and you want to pass it/store it somewhere, then std::move it.
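A short sketch of the container overloads mentioned above, using std::unique_ptr because its moved-from state is guaranteed to be null. FillResult and fill_containers are hypothetical names for this example only.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct FillResult {
    bool source_is_null;   // state of the moved-from unique_ptr
    std::size_t total;     // number of elements stored overall
};

inline FillResult fill_containers() {
    std::vector<std::unique_ptr<int>> owners;
    std::unique_ptr<int> p(new int(42));
    owners.push_back(std::move(p));      // rvalue overload: ownership moves in

    std::vector<std::pair<int, std::string>> items;
    items.emplace_back(1, "one");        // constructed in place from the arguments
    std::pair<int, std::string> named(2, "two");
    items.push_back(std::move(named));   // do not rely on `named` afterwards

    FillResult r = { p == nullptr, owners.size() + items.size() };
    return r;
}
```

Without the std::move, the first push_back would not compile at all, since unique_ptr has no copy constructor; for copyable types it would silently copy instead.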

Replacing memset() on classes in a C++ codebase

I've inherited a C++98 codebase which has two major uses of memset() on C++ classes, with macros expanded for clarity:
// pattern #1:
Obj o;
memset(&o, 0, sizeof(o));
// pattern #2:
// (elsewhere: Obj *o;)
memset(something->o, 0, sizeof(*something->o));
As you may have guessed, this codebase does not use STL or otherwise non-POD classes. When I try to put as little as an std::string into one of its classes, bad things generally happen.
It was my understanding that these patterns could be rewritten as follows in C++11:
// pattern #1
Obj o = {};
// pattern #2
something->o = {};
Which is to say, assignment of {} would rewrite the contents of the object with the default-initialized values in both cases. Nice and clean, isn't it?
Well, yes, but it doesn't work. It works on *nix systems, but results in fairly inexplicable results (in essence, garbage values) when built with VS2013 with v120_xp toolset, which implies that my understanding of initializer lists is somehow lacking.
So, the questions:
Why didn't this work?
What's a better way to replace this use of memset that ensures that members with constructors are properly default-initialized, and which can preferably be reliably applied with as little as search-and-replace (there are unfortunately no tests)? Bonus points if it works on pre-VS2013.
The behavior of brace-initialization depends on what kind of object you try to initialize.
On aggregates (e.g. simple C-style structures) using an empty brace-initializer zero-initializes the aggregate, i.e. it makes all members zero.
On non-aggregates an empty brace-initializer calls the default constructor. And if the constructor doesn't explicitly initialize the members (which the compiler's auto-generated constructor doesn't) then the members will be constructed but otherwise uninitialized. Members with their own constructors that initialize themselves will be okay, but e.g. an int member will have an indeterminate value.
The best way to solve your problems, IMO, is to add a default constructor (if the classes don't have one already) with an initializer list that explicitly initializes the members.
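A minimal sketch of that advice, assuming made-up member names (count, ratio, name) since the real Obj is not shown in the question. It is written in C++98 style so it should also build with older compilers.

```cpp
#include <string>

// Aggregate: no constructors, so "= {}" zero-initializes every member.
struct PodObj {
    int count;
    double ratio;
};

// Non-aggregate: the default constructor initializes every member explicitly,
// so the result does not depend on the compiler's value-initialization rules.
struct Obj {
    int count;
    double ratio;
    std::string name;
    Obj() : count(0), ratio(0.0), name() {}
};

inline bool all_zero() {
    PodObj p = {};
    Obj o;
    return p.count == 0 && p.ratio == 0.0 &&
           o.count == 0 && o.ratio == 0.0 && o.name.empty();
}
```

Once every class has such a constructor, a plain declaration like "Obj o;" is already fully initialized and pattern #1 needs no memset replacement at all.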
It works on *nix systems, but results in fairly inexplicable results (in essence, garbage values) when built with VS2013 with v120_xp toolset, which implies that my understanding of initializer lists is somehow lacking.
The rules for 'default' initialization have changed from version to version of C++, but VC++ has stuck with the C++98 rules, ignoring even the updates from C++03 I think.
Other compilers have implemented new rules, with gcc at one point even implementing some defect resolutions that hadn't been accepted for future inclusion in the official spec.
So even though what you want is guaranteed by the standard, for the most part it's probably best not to try to rely on the behavior of initialization of members that don't have explicit initializers.
I think placement new is established enough that it works on VS, so you might try:
#include <new>
new(&o) T();
new(something->p) T();
Make sure not to do this on any object that hasn't been allocated and destructed/uninitialized first! (But it was pointed out below that this might fail if a constructor throws an exception.)
You might be able to just assign from a default object, that is, o = T(); or *(something->p) = T();. A good general strategy might be to give each of these POD classes a trivial default constructor with : o() in the initializer-list.
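Both suggestions can be wrapped in small helpers, which keeps the replacement close to a search-and-replace. The names Settings, reset_by_assignment and reset_by_placement_new are hypothetical and only serve this sketch.

```cpp
#include <new>
#include <string>

// Example class with a constructor that initializes every member.
struct Settings {
    int count;
    std::string name;
    Settings() : count(0), name() {}
};

// Replaces memset(&o, 0, sizeof(o)) by assigning a default-constructed
// temporary. Requires T to be default-constructible and assignable.
template <typename T>
void reset_by_assignment(T& o) {
    o = T();
}

// Alternative: destroy the object, then rebuild it in the same storage.
// If T's constructor throws, o is left destroyed, so the assignment
// version is the safer default.
template <typename T>
void reset_by_placement_new(T& o) {
    o.~T();
    ::new (static_cast<void*>(&o)) T();
}
```

Pattern #2 from the question would then read reset_by_assignment(*something->o), assuming the pointer is valid and points to an already constructed object.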

C++ why is noexcept required in the context of Move Constructors and Move Assignment Operators to enable optimizations?

Consider the following class, with a move constructor and move assignment operator:
#include <cstdint>
#include <utility>

class my_class
{
public:
    // Move constructor: steal the data and leave `other` empty.
    // Members are initialized in declaration order (my_data first).
    my_class(my_class&& other) noexcept : my_data{other.my_data}, my_data_length{other.my_data_length}
    {
        other.my_data = nullptr;
        other.my_data_length = 0;
    }
    // Move assignment: swap, so `other` ends up holding our old data.
    my_class& operator=(my_class&& other) noexcept
    {
        std::swap(my_data_length, other.my_data_length);
        std::swap(my_data, other.my_data);
        return *this;
    }
protected:
    double *my_data;
    uint64_t my_data_length;
};
What is the purpose of noexcept here? I know that it hints to the compiler that no exceptions should be thrown by the following function, but how does this enable compiler optimizations?
The special importance of noexcept on move constructors and assignment operators is explained in detail in https://vimeo.com/channels/ndc2014/97337253
Basically, it doesn't enable "optimisations" in the traditional sense of allowing the compiler to generate better code. Instead it allows other types, such as containers in the library, to take a different code path when they can detect that moving the element types will never throw. That can enable taking an alternate code path that would not be safe if they could throw (e.g. because it would prevent the container from meeting exception-safety guarantees).
For example, when you do push_back(t) on a vector, if the vector is full (size() == capacity()) then it needs to allocate a new block of memory and copy all the existing elements into the new memory. If copying any of the elements throws an exception then the library just destroys all the elements it created in the new storage and deallocates the new memory, leaving the original vector unchanged (thus meeting the strong exception-safety guarantee). It would be faster to move the existing elements to the new storage, but if moving could throw then any already-moved elements would have been altered already and meeting the strong guarantee would not be possible, so the library will only try to move them when it knows that can't throw, which it can only know if they are noexcept.
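The effect can be observed by counting constructor calls during a forced reallocation. This is a sketch with made-up types (SafeMove, RiskyMove, Tally); the exact counts are what mainstream standard library implementations are expected to produce, not something this answer's author measured.

```cpp
#include <cstddef>
#include <vector>

struct Tally { int copies; int moves; };

// Move constructor is noexcept: the vector may move during reallocation.
struct SafeMove {
    static Tally tally;
    SafeMove() {}
    SafeMove(const SafeMove&) { ++tally.copies; }
    SafeMove(SafeMove&&) noexcept { ++tally.moves; }
};
Tally SafeMove::tally = {0, 0};

// Same type, but the move constructor is not declared noexcept.
struct RiskyMove {
    static Tally tally;
    RiskyMove() {}
    RiskyMove(const RiskyMove&) { ++tally.copies; }
    RiskyMove(RiskyMove&&) { ++tally.moves; }
};
Tally RiskyMove::tally = {0, 0};

// Forces one reallocation of a 3-element vector and reports what it cost.
template <typename T>
Tally reallocate_once() {
    std::vector<T> v(3);                 // three default-constructed elements
    T::tally.copies = 0;
    T::tally.moves = 0;
    v.reserve(v.capacity() + 1);         // must allocate a larger block
    return T::tally;
}
```

With SafeMove the three elements are moved into the new block; with RiskyMove the vector falls back to three copies, even though a move constructor exists.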
IMHO using noexcept will not enable any compiler optimization on its own. There are traits in STL:
std::is_nothrow_move_constructible
std::is_nothrow_move_assignable
STL containers like vector use these traits to test the type T, and use move constructors and move assignment instead of copy constructors and copy assignment when the test passes.
Why does the STL use these traits instead of:
std::is_move_constructible
std::is_move_assignable
Answer: to provide strong exception guarantee.
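The difference between the two groups of traits can be checked at compile time. NoexceptMover and PlainMover are hypothetical types for this sketch; std::move_if_noexcept is the standard helper that makes the same decision a container makes.

```cpp
#include <type_traits>
#include <utility>

struct NoexceptMover {
    NoexceptMover() {}
    NoexceptMover(const NoexceptMover&) {}
    NoexceptMover(NoexceptMover&&) noexcept {}
    NoexceptMover& operator=(const NoexceptMover&) { return *this; }
    NoexceptMover& operator=(NoexceptMover&&) noexcept { return *this; }
};

struct PlainMover {
    PlainMover() {}
    PlainMover(const PlainMover&) {}
    PlainMover(PlainMover&&) {}          // movable, but may throw
};

// Both types are movable...
static_assert(std::is_move_constructible<NoexceptMover>::value, "movable");
static_assert(std::is_move_constructible<PlainMover>::value, "movable");

// ...but only one promises not to throw while doing so.
static_assert(std::is_nothrow_move_constructible<NoexceptMover>::value, "nothrow");
static_assert(std::is_nothrow_move_assignable<NoexceptMover>::value, "nothrow");
static_assert(!std::is_nothrow_move_constructible<PlainMover>::value, "may throw");

// move_if_noexcept yields an rvalue only when moving cannot throw
// (or when copying is not available at all).
static_assert(std::is_same<
                  decltype(std::move_if_noexcept(std::declval<NoexceptMover&>())),
                  NoexceptMover&&>::value, "moves");
static_assert(std::is_same<
                  decltype(std::move_if_noexcept(std::declval<PlainMover&>())),
                  const PlainMover&>::value, "copies");
```

If the code compiles, all of the assertions hold; nothing needs to run.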
First of all, I would remark that nothing in a move constructor or move assignment operator should throw, and there rarely seems to be any need for it. All these functions have to do is take over already-allocated memory and the pointers to it. Normally you should not call any other functions that can throw, and the moving itself has no need to do so. On the other hand, something as simple as printing a debug message breaks this rule.
Optimization can happen in different ways: automatically by the compiler, and through different implementations of the code that uses your constructors and assignment operators. Take a look at the STL: some code paths are selected via type traits depending on whether your operations can throw.
The compiler itself can optimize better when it has the guarantee that the code never throws. It then has a guaranteed call tree through your code, which can be inlined better, evaluated at compile time, and so on. The minimum optimization is not having to store the information about the current stack frame that is needed to handle a throw, such as which variables on the stack have to be destroyed.
There was also a question here: noexcept, stack unwinding and performance
Maybe your question is a duplicate of that?
A possibly helpful related question I found here: Are move constructors required to be noexcept?
It discusses whether move operations ever need to throw.
What is the purpose of noexcept here?
At minimum, it saves some program space, which is relevant not only to move operations but to all functions. And if your class is used with STL containers or algorithms, it can be handled differently, which can result in better optimization if your STL implementation uses this information. The compiler may also be able to optimize better in general because of the known call tree, if everything else is a compile-time constant.

C++11 is it possible to construct an std::initializer_list?

I have a class that's using an std::discrete_distribution which can take an std::initializer_list OR a couple of iterators. My class is in some ways wrapping the discrete_distribution so I really wanted to mimic the ability to take an std::initializer_list which would then be passed down.
This is simple.
However, the std::initializer_list will always be constructed through some unknown values. So, if it was just a std::discrete_distribution I would just construct from iterators of some container. However, for me to make that available via my class, I would need to templatize the class for the Iterator type.
I don't want to template my class because it's only occasionally that it would use the initializer_list, and the cases where it doesn't, it uses an std::uniform_int_distribution which would make this template argument, maybe confusing.
I know I can default the template argument, and I know that I could just define only vector::iterators if I wanted; I'd just rather not.
According to the documentation, std::initializer_list cannot be non-empty constructed in standard C++. BTW, it is the same for C stdarg(3) va_list (and probably for similar reasons, because variadic function argument passing is implementation specific and generally has its own ABI peculiarities; see however libffi).
In GCC, std::initializer_list is somehow known to the C++ compiler (likewise <stdarg.h> uses some builtin things from the C compiler), and has special support.
The C++11 standard (more exactly its n3337 draft, which is almost exactly the same) says in §18.9.1 that std::initializer_list has only an empty constructor and refers to §8.5.4 list-initialization
You probably should use std::vector and its iterators in your case.
As a rule of thumb and intuitively, std::initializer_list is useful for compile-time known argument lists, and if you want to handle run-time known arguments (with the "number" of "arguments" unknown at compile time) you should provide a constructor for that case (either taking some iterators, or some container, as arguments).
If your class has a constructor accepting std::initializer_list<int> it probably should have another constructor accepting std::vector<int> or std::list<int> (or perhaps std::set<int> if you have some commutativity), then you don't need some weird templates on iterators. BTW, if you want iterators, you would templatize the constructor, not the entire class.
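A sketch of that constructor set for a wrapper around std::discrete_distribution. WeightedPicker and its members are hypothetical names; the questioner's actual class is not shown. Only the iterator constructor is a template, so the class itself stays a plain class.

```cpp
#include <initializer_list>
#include <random>
#include <vector>

// Hypothetical wrapper; name and interface are illustrative only.
class WeightedPicker {
public:
    // Weights known at compile time: WeightedPicker p{1.0, 1.0, 2.0};
    WeightedPicker(std::initializer_list<double> weights)
        : m_dist(weights) {}

    // Weights known only at run time, held in a container.
    explicit WeightedPicker(const std::vector<double>& weights)
        : m_dist(weights.begin(), weights.end()) {}

    // Only this constructor is a template, not the whole class.
    template <typename InputIt>
    WeightedPicker(InputIt first, InputIt last)
        : m_dist(first, last) {}

    // Normalized weights as stored by the distribution.
    std::vector<double> probabilities() const { return m_dist.probabilities(); }

    // Draws an index using the caller's random engine.
    template <typename Generator>
    int operator()(Generator& gen) { return m_dist(gen); }

private:
    std::discrete_distribution<int> m_dist;
};
```

All three constructors end up building the same distribution, so callers can pick whichever form matches the data they have, and no std::initializer_list ever has to be assembled at run time.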
