I saw code somewhere in which someone copied an object and subsequently moved it into a data member of a class. This confused me, because I thought the whole point of moving was to avoid copying. Here is the example:
struct S
{
    std::string data; // the member that the constructor initializes
    S(std::string str) : data(std::move(str))
    {}
};
Here are my questions:
Why aren't we taking an rvalue-reference to str?
Won't a copy be expensive, especially given something like std::string?
What would be the reason for the author to decide to make a copy then a move?
When should I do this myself?
Before I answer your questions, one thing you seem to be getting wrong: taking by value in C++11 does not always mean copying. If an rvalue is passed, that will be moved (provided a viable move constructor exists) rather than being copied. And std::string does have a move constructor.
Unlike in C++03, in C++11 it is often idiomatic to take parameters by value, for the reasons I am going to explain below. Also see this Q&A on StackOverflow for a more general set of guidelines on how to accept parameters.
Why aren't we taking an rvalue-reference to str?
Because that would make it impossible to pass lvalues, such as in:
std::string s = "Hello";
S obj(s); // s is an lvalue, this won't compile!
If S only had a constructor that accepts rvalues, the above would not compile.
Won't a copy be expensive, especially given something like std::string?
If you pass an rvalue, that will be moved into str, and that will eventually be moved into data. No copying will be performed. If you pass an lvalue, on the other hand, that lvalue will be copied into str, and then moved into data.
So to sum it up, two moves for rvalues, one copy and one move for lvalues.
What would be the reason for the author to decide to make a copy then a move?
First of all, as I mentioned above, the first operation is not always a copy. That said, the answer is: "Because it is efficient (moves of std::string objects are cheap) and simple".
Under the assumption that moves are cheap (ignoring SSO here), they can be practically disregarded when considering the overall efficiency of this design. If we do so, we have one copy for lvalues (as we would have if we accepted an lvalue reference to const) and no copies for rvalues (while we would still have a copy if we accepted an lvalue reference to const).
This means that taking by value is as good as taking by lvalue reference to const when lvalues are provided, and better when rvalues are provided.
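The copy and move counts described above can be checked with a small sketch. Tracked and Sink are hypothetical names introduced only for this illustration; Tracked stands in for std::string and simply counts how it is constructed. The rvalue case uses std::move on a named object so that copy elision cannot hide one of the moves.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for std::string that counts its copies and moves.
struct Tracked {
    static int copies;
    static int moves;
    Tracked() = default;
    Tracked(const Tracked&) { ++copies; }
    Tracked(Tracked&&) noexcept { ++moves; }
};
int Tracked::copies = 0;
int Tracked::moves = 0;

// Same shape as S in the question: take by value, then move into the member.
struct Sink {
    Tracked data;
    explicit Sink(Tracked t) : data(std::move(t)) {}
};

// Returns {copies, moves} observed when constructing a Sink from an lvalue.
std::pair<int, int> count_for_lvalue() {
    Tracked::copies = Tracked::moves = 0;
    Tracked t;
    Sink s(t);  // t is copied into the parameter, then moved into data
    (void)s;
    return {Tracked::copies, Tracked::moves};
}

// Returns {copies, moves} observed when constructing a Sink from an rvalue.
std::pair<int, int> count_for_rvalue() {
    Tracked::copies = Tracked::moves = 0;
    Tracked t;
    Sink s(std::move(t));  // t is moved into the parameter, then moved into data
    (void)s;
    return {Tracked::copies, Tracked::moves};
}
```

With these definitions, the lvalue case reports one copy and one move, and the rvalue case reports no copies and two moves.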
P.S.: To provide some context, I believe this is the Q&A the OP is referring to.
To understand why this is a good pattern, we should examine the alternatives, both in C++03 and in C++11.
We have the C++03 method of taking a std::string const&:
struct S
{
std::string data;
S(std::string const& str) : data(str)
{}
};
in this case, there will always be a single copy performed. If you construct from a raw C string, a std::string will be constructed, then copied again: two allocations.
There is the C++03 method of taking a reference to a std::string, then swapping it into the std::string member:
struct S
{
std::string data;
S(std::string& str)
{
std::swap(data, str);
}
};
that is the C++03 version of "move semantics", and swap can often be optimized to be very cheap to do (much like a move). It also should be analyzed in context:
S tmp("foo"); // illegal
std::string s("foo");
S tmp2(s); // legal
and forces you to form a non-temporary std::string, then discard it. (A temporary std::string cannot bind to a non-const reference.) Only one allocation is done, however. The C++11 version would take a && and require you to call it with std::move or with a temporary: the caller must explicitly create a copy outside of the call and move that copy into the function or constructor.
struct S
{
std::string data;
S(std::string&& str): data(std::move(str))
{}
};
Use:
S tmp("foo"); // legal
std::string s("foo");
S tmp2(std::move(s)); // legal
Next, we can do the full C++11 version, that supports both copy and move:
struct S
{
std::string data;
S(std::string const& str) : data(str) {} // lvalue const, copy
S(std::string && str) : data(std::move(str)) {} // rvalue, move
};
We can then examine how this is used:
S tmp( "foo" ); // a temporary `std::string` is created, then moved into tmp.data
std::string bar("bar"); // bar is created
S tmp2( bar ); // bar is copied into tmp2.data
std::string bar2("bar2"); // bar2 is created
S tmp3( std::move(bar2) ); // bar2 is moved into tmp3.data
It is pretty clear that this two-overload technique is at least as efficient as, if not more efficient than, the above two C++03 styles. I'll dub this two-overload version the "most optimal" version.
Now, we'll examine the take-by-copy version:
struct S2 {
std::string data;
S2( std::string arg ):data(std::move(arg)) {}
};
in each of those scenarios:
S2 tmp( "foo" ); // a temporary `std::string` is created, moved into arg, then moved into S2::data
std::string bar("bar"); // bar is created
S2 tmp2( bar ); // bar is copied into arg, then moved into S2::data
std::string bar2("bar2"); // bar2 is created
S2 tmp3( std::move(bar2) ); // bar2 is moved into arg, then moved into S2::data
If you compare this side-by-side with the "most optimal" version, we do exactly one additional move! Not once do we do an extra copy.
So if we assume that move is cheap, this version gets us nearly the same performance as the most-optimal version, with half the code.
And if you are taking, say, 2 to 10 arguments, the reduction in code is exponential: 2x less with 1 argument, 4x with 2, 8x with 3, 16x with 4, 1024x with 10 arguments.
Now, we can get around this via perfect forwarding and SFINAE, allowing you to write a single constructor or function template that takes 10 arguments, does SFINAE to ensure that the arguments are of appropriate types, and then moves-or-copies them into the local state as required. While this prevents the thousand fold increase in program size problem, there can still be a whole pile of functions generated from this template. (template function instantiations generate functions)
And lots of generated functions means larger executable code size, which can itself reduce performance.
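As a rough illustration of the perfect-forwarding-plus-SFINAE approach mentioned above, here is a minimal two-argument sketch. Person and its members are hypothetical names chosen for the example, and std::enable_if with std::is_constructible is just one way to express the constraint.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical type with two string members, used only for illustration.
struct Person {
    std::string first;
    std::string last;

    // One constructor template covers every lvalue/rvalue combination.
    // The defaulted template parameter removes it from overload resolution
    // unless both arguments can construct a std::string.
    template <typename A, typename B,
              typename = typename std::enable_if<
                  std::is_constructible<std::string, A&&>::value &&
                  std::is_constructible<std::string, B&&>::value>::type>
    Person(A&& a, B&& b)
        : first(std::forward<A>(a)), last(std::forward<B>(b)) {}
};

// Arguments that cannot construct a std::string are rejected at compile time.
static_assert(!std::is_constructible<Person, int, int>::value,
              "ints must not select the forwarding constructor");

bool forwarding_demo() {
    std::string kept = "Ada";
    std::string given = "Lovelace";
    Person p(kept, std::move(given));  // kept is copied, given is moved
    return p.first == "Ada" && p.last == "Lovelace" && kept == "Ada";
}
```

Each distinct combination of argument types (std::string&, std::string&&, const char(&)[N], and so on) instantiates its own constructor, which is the code-size cost described above.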
For the cost of a few moves, we get shorter code and nearly the same performance, and often easier to understand code.
Now, this only works because we know, when the function (in this case, a constructor) is called, that we will be wanting a local copy of that argument. The idea is that if we know that we are going to be making a copy, we should let the caller know that we are making a copy by putting it in our argument list. They can then optimize around the fact that they are going to give us a copy (by moving into our argument, for example).
Another advantage of the "take by value" technique is that move constructors are often noexcept. That means the functions that take by value and move out of their argument can often be noexcept, moving any throws out of their body and into the calling scope (which can sometimes avoid them via direct construction, or can construct the items and move them into the argument, to control where throwing happens). Making methods noexcept is often worth it.
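To make the noexcept point concrete, here is a small sketch. Payload and Holder are hypothetical types: Payload has a copy constructor that always throws and a noexcept move constructor. Because a by-value parameter is initialized in the caller's context, the failing copy can be caught normally even though the constructor itself is marked noexcept.

```cpp
#include <cassert>
#include <stdexcept>
#include <utility>

// Hypothetical type: copying always fails, moving never throws.
struct Payload {
    Payload() = default;
    Payload(const Payload&) { throw std::runtime_error("copy failed"); }
    Payload(Payload&&) noexcept {}
};

// The constructor only moves, so it can honestly be declared noexcept.
struct Holder {
    Payload data;
    explicit Holder(Payload p) noexcept : data(std::move(p)) {}
};

bool copy_throws_in_caller() {
    Payload original;
    try {
        // The copy into the parameter p is made here, in the caller,
        // before the noexcept constructor body is entered.
        Holder h(original);
        (void)h;
    } catch (const std::runtime_error&) {
        return true;  // reached: the exception never crossed the noexcept boundary
    }
    return false;
}
```

Had the copy been performed inside a noexcept constructor taking const Payload&, the same exception would have called std::terminate instead.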
This is probably intentional and is similar to the copy and swap idiom. Basically since the string is copied before the constructor, the constructor itself is exception safe as it only swaps (moves) the temporary string str.
You don't want to repeat yourself by writing a constructor for the move and one for the copy:
S(std::string&& str) : data(std::move(str)) {}
S(const std::string& str) : data(str) {}
This is a lot of boilerplate code, especially if you have multiple arguments. Your solution avoids that duplication at the cost of an unnecessary move. (The move operation should be quite cheap, however.)
The competing idiom is to use perfect forwarding:
template <typename T>
S(T&& str) : data(std::forward<T>(str)) {}
The template magic will choose to move or copy depending on the parameter that you pass in. It basically expands to the first version, where both constructors were written by hand. For background information, see Scott Meyers's post on universal references.
From a performance aspect, the perfect forwarding version is superior to your version as it avoids the unnecessary moves. However, one can argue that your version is easier to read and write. The possible performance impact should not matter in most situations, anyway, so it seems to be a matter of style in the end.
I am using shared_ptrs extensively in my production code mainly to reduce complexity and maintenance and it generally is working fine. I have, however, written a parser for a complex meta-grammar that leaves shared objects upon exit. One of the culprits is caused by recursivity. Since the parsing code is complex in itself, I want to re-use it each time I descend to the next level. Consequently, I save off the current element in its parent's element while doing descendent parsing. But this causes the problem of too many remaining strong refs. I have experimented a good bit with weak vs strong storing and the TreeVect uses weak_ptrs, but the parent assignment problem persists. My question is, how can I get rid of the second strong ref that gets added on the assignment to the parent statement below? Here is code that illustrates my problem:
#include "stdafx.h"
#include <memory>
#include <vector>
struct Tree;
typedef std::weak_ptr< Tree > TreeWptr;
typedef std::shared_ptr< Tree > TreeSptr;
typedef std::vector< TreeWptr > TreeVect;
struct Tree
{
TreeVect treeVect;
//TreeSptr parent;
TreeWptr parent; // changed from strong to weak ptr
};
struct Element1 : public Tree
{
};
struct Element2 : public Tree
{
};
int main()
{
TreeSptr element1 = std::make_shared< Element1 >();
TreeSptr element2 = std::make_shared< Element2 >();
//element2->parent = element1; // illustrates recursive case. ERROR: Adds extra strong ref to element1
//element2->parent->treeVect.push_back( element2 );
element2->parent = element1; // no longer adds extra strong ref to element1
element2->parent.lock()->treeVect.push_back( element2 );
return 0;
}
Note: I solved this example program's problem by changing the parent member from a shared_ptr to a weak_ptr.
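The effect of the weak parent link can be seen in a reduced sketch. Node is a hypothetical type, and unlike the Tree above it stores its children as shared_ptrs (an assumption made here so that the parent owns its children); only the back-pointer to the parent is weak.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical tree node: strong links downward, weak link upward.
struct Node {
    static int alive;                             // number of live Node objects
    std::vector<std::shared_ptr<Node>> children;  // the parent owns its children
    std::weak_ptr<Node> parent;                   // the child only observes its parent
    Node() { ++alive; }
    ~Node() { --alive; }
};
int Node::alive = 0;

// Links a child to a parent and reports the parent's strong reference count.
long parent_use_count_after_linking() {
    auto parent = std::make_shared<Node>();
    auto child = std::make_shared<Node>();
    child->parent = parent;             // weak: adds no strong reference
    parent->children.push_back(child);  // strong: parent keeps the child alive
    return parent.use_count();
}
```

The parent ends up with a single strong reference (the local variable), so both nodes are destroyed when the function returns. With a shared_ptr parent member, the two nodes would keep each other alive.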
No one has substantively weighed in on what I am doing, so I applied what I know so far about smart pointers: never go out of scope with more than one strong ref to the object or it won't destruct. Seems obvious now.
RESULT FROM IMPLEMENTING MY RECURSIVE PARSER USING SHARED_PTR
Success! FYI, here are the practices I used in solving the problem:
1) Start out programming a project without considering weak ptrs. Make everything a shared ptr. You need your attention squarely fixed on solving the project at hand without distractions. That is, don't try to optimize until you are done. This keeps you from engineering weirdness into the code because of ptr logic. BTW, next time you will make a better estimate at the outset of the ownership properties each of your objects should have.
2) Use a memory leak detector program. I always write in C++ MFC which automatically reports leaks and their memory blocks on each run so it is easy to track down the culprits.
3) When you are satisfied with your program, to get rid of the leaks, re-think the ownership of your objects so that there is 1 strong reference to the top object you want to automatically be destroyed at the end of your execution path.
One side effect of using shared_ptr is that the code got simpler. This is what the C++11 designers promised us, and it's true. The error messages forced me to think through the ownership of each object at compile time. Gone are the days when I would, in the destructor, just test whether a pointer is non-null, do a delete, and hope the other threads are finished.
One Caveat: I have not implemented try/catch around weak_ptrs that have expired. I need to do this for production code. I'll report back here when I do.
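For reference, here is a minimal sketch of the two standard ways an expired weak_ptr can be detected. The function names are hypothetical. weak_ptr::lock() does not throw and returns an empty shared_ptr, while constructing a shared_ptr directly from an expired weak_ptr throws std::bad_weak_ptr, so try/catch is only needed for the second form.

```cpp
#include <cassert>
#include <memory>

// lock() never throws: an expired weak_ptr yields an empty shared_ptr.
bool lock_reports_expiry() {
    std::weak_ptr<int> weak;
    {
        auto strong = std::make_shared<int>(42);
        weak = strong;
    }  // the last strong reference is gone here
    std::shared_ptr<int> locked = weak.lock();
    return locked == nullptr && weak.expired();
}

// The shared_ptr constructor taking a weak_ptr throws std::bad_weak_ptr instead.
bool constructor_throws_on_expiry() {
    std::weak_ptr<int> weak;
    {
        auto strong = std::make_shared<int>(42);
        weak = strong;
    }
    try {
        std::shared_ptr<int> p(weak);
        (void)p;
    } catch (const std::bad_weak_ptr&) {
        return true;
    }
    return false;
}
```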
In both C++11 and Boost, smart pointers can be nullptr. I wonder why. It means that smart pointers must be checked for nullptr every time they are passed to an interface method from uncontrolled client code. Obviously, such a check is performed at run time.
What if there were smart pointers that can be created only via make_shared or make_unique and cannot be reset or reassigned to nullptr or a raw pointer? This approach would make it possible to ensure at compile time that the pointer is not nullptr.
For example, in Java we must always check that an object is not null (bad). But in Swift, we can explicitly make sure at compile time that an argument (or variable) is not null (good).
UPD:
Well, thank you very much for the answers and comments. I get the idea. But are there any popular libraries that support a compile-time non-nullity guarantee alongside ownership, maybe smart pointer wrappers?
std smart pointers exist for one reason—to implement the concept of ownership. Their responsibility is to clearly define who owns the pointee (i.e. who and how ensures its safe destruction).
Large parts of std are really composed of low-level basic building blocks. While they can be used straight away in client code, they are not supposed to be an all-encompassing solution. They give you single-purpose tools which you can mix & match to create something you need.
The std smart pointers are exactly "raw pointers + ownership." Raw pointers can be null and can be reseated, so std smart pointers can as well. Nothing prevents you from creating your own "std smart pointer + non-nullity" class(es) and using them in your code.
On the other hand, there are very valid use cases for a null smart pointer. If std smart pointers enforced non-nullity, and you needed null-supporting smart pointers, you'd have a much harder time implementing that. It's easier to add a validity constraint than to remove it when you can only do it by adding to the original class.
For std::unique_ptr it is impossible to forbid null. Consider this:
std::unique_ptr<int> p = std::make_unique<int>();
std::unique_ptr<int> q = std::move(p);
What values will p and q have? If we ban the null option, this becomes impossible to implement. Even if we consider a destructive move, the situation is even worse, because you would not be allowed to test p at all; any use would be UB.
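A short check of what the standard library actually does in this situation (the function name is hypothetical): after the move, p is guaranteed to be null and q owns the object.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// After a move, the source unique_ptr is guaranteed to compare equal to nullptr.
bool moved_from_is_null() {
    std::unique_ptr<int> p(new int(7));
    std::unique_ptr<int> q = std::move(p);  // ownership is transferred to q
    return p == nullptr && q != nullptr && *q == 7;
}
```

This is the null state that a never-null unique_ptr would have no way to represent.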
For std::shared_ptr it would be possible to require non-nullity, but this would heavily hinder every other use that needs nullable pointers. The standard library is too generic to allow that limitation.
Overall, the idea of having a compile-time guarantee that the pointed-to object exists is very valuable, but you are trying to use the wrong tool for it. Usually this is done by using references (&), not pointers.
To meet your needs, I suggest creating a wrapper around std::shared_ptr:
#include <memory>
#include <stdexcept>

template<typename T>
class always_ptr
{
    std::shared_ptr<T> _ptr;
public:
    always_ptr() = delete; // no default constructor
    always_ptr(const always_ptr& a) : _ptr{ a._ptr } { }
    explicit always_ptr(T* p)
    {
        // a run-time check is the only way to guarantee that a raw pointer is not null
        if (!p) throw std::invalid_argument("always_ptr requires a non-null pointer");
        _ptr = std::shared_ptr<T>(p);
    }
    T* get() { return _ptr.get(); }
    T& operator*() { return *_ptr; }
    T* operator->() { return _ptr.get(); }
    explicit operator bool() const { return true; } // always true
};
I am using std::shared_ptr in C++11 and I would like to understand if it's better to assign structures of type T in this way:
T a_data;
std::shared_ptr<T> my_pointer(new T);
*my_pointer = a_data;
or like:
memcpy(&my_pointer, data, sizeof(T));
or like:
my_pointer.reset(a_data);
Regards
Mike
They each do a different thing.
1.
T a_data;
std::shared_ptr<T> my_pointer(new T);
*my_pointer = a_data;
Here, a new object (call it n) of type T will be allocated, managed by my_pointer. Then, object a_data will be copy-assigned into n.
2.
memcpy(&my_pointer, a_data, sizeof(T)); // I assume you meant a_data here, not data
That's total nonsense - that's overwriting the shared_ptr itself with the contents of a_data. Undefined behaviour at its finest (expect a crash or memory corruption).
Perhaps you actually meant my_pointer.get() instead of &my_pointer (that is, you wanted to copy into the object being pointed to)? If that's the case, it can work, as long as T is trivially copyable - which means that it doesn't have non-trivial copy or move ctors, doesn't have non-trivial copy or move assignment operators, and has a trivial destructor. But why rely on that, when normal assignment (*my_pointer = a_data;) does exactly the same for that case, and also works for non-trivially-copyable classes?
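A minimal sketch of that corrected memcpy variant next to plain assignment. Pod is a hypothetical trivially copyable struct standing in for T, and the static_assert makes the trivially-copyable requirement explicit.

```cpp
#include <cassert>
#include <cstring>
#include <memory>
#include <type_traits>

// Hypothetical stand-in for T: a plain, trivially copyable struct.
struct Pod {
    int id;
    double value;
};
static_assert(std::is_trivially_copyable<Pod>::value,
              "memcpy is only valid for trivially copyable types");

bool memcpy_matches_assignment() {
    Pod a_data{7, 2.5};

    // Copy the bytes into the pointee via get(), not into the shared_ptr itself.
    std::shared_ptr<Pod> by_memcpy(new Pod{});
    std::memcpy(by_memcpy.get(), &a_data, sizeof(Pod));

    // Plain assignment has the same effect and also works for non-trivial types.
    std::shared_ptr<Pod> by_assign(new Pod{});
    *by_assign = a_data;

    return by_memcpy->id == by_assign->id && by_memcpy->value == by_assign->value;
}
```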
3.
my_pointer.reset(a_data);
This normally won't compile as-is; it would need to be my_pointer.reset(&a_data);. That's a disaster waiting to happen - you point my_pointer to the automatic (= local) variable a_data and give it ownership of that. Which means that when my_pointer goes out of scope (actually, when the last pointer sharing ownership with it does), it will call the deleter, which normally calls delete. On a_data, which was not allocated with new. Welcome to UB land again!
If you just need to manage a dynamically-allocated copy of a_data with a shared_ptr, do this:
T a_data;
std::shared_ptr<T> my_pointer(new T(a_data));
Or even better:
T a_data;
auto my_pointer = std::make_shared<T>(a_data);
Consider the following (not in any particular language):
for (i=0; i<list.length(); i++) { ... }
Some people prefer to rewrite it as:
int len = list.length()
for (i=0; i<len; i++) { ... }
This would make sense if getting the length via list.length() was anything other than O(1). But I don't see any reason why this would be the case. Regardless of the data type, it should be trivial to add a length field somewhere and update it whenever the size changes.
Is there a common data type where getting or updating the length is not O(1)? Or is there another reason why someone would want to do that?
In this case you are accessing a property directly, not using a getter (function call). That is probably always faster than a method call. Even if there was a method call, many languages are smart enough to optimize it.
This is a micro-optimization but a valid one (not implying it should be done, but that it can increase speed - unnoticeable speedup most likely). The reason this is valid is because of aliasing.
length can be modified inside the loop, and the compiler might not be able to prove whether it is modified or not. Ergo, it will have to read the value on every iteration, as opposed to reading it once, before the loop.
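A small C++ sketch of why the two forms are not interchangeable in general. The function names are hypothetical. When the loop body changes the container, a bound that is re-read on every iteration gives a different trip count from a cached one, which is why a compiler cannot hoist the read unless it can prove the body leaves the length alone.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// The bound is re-read on every iteration, so shrinking the vector
// inside the body changes how many times the loop runs.
std::size_t iterations_with_live_bound(std::vector<int> v) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < v.size(); ++i) {
        v.pop_back();
        ++count;
    }
    return count;
}

// The bound is read once; this is only equivalent to the live form
// because the body never changes the size of v.
long sum_with_cached_bound(const std::vector<int>& v) {
    long sum = 0;
    const std::size_t len = v.size();
    for (std::size_t i = 0; i < len; ++i) {
        sum += v[i];
    }
    return sum;
}
```

For a six-element vector the first loop runs only three times, because the size shrinks as the index grows.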
The difference can be even more noticeable if the length is retrieved via a method call - like you'd do in C++:
int len = vect.size();