Copy constructor called on *this - c++11

I defined a copy constructor for a class A. Due to an unfortunate macro expansion, I ended up compiling the following:
A a = a;
I (eventually) realized this results in a call to A::A(const A& rhs) with this==&rhs.
Why does the compiler allow this? Conceptually I would assume that since a is declared in this statement, it wouldn't yet be available for use on the RHS.
Should I defensively check this==&rhs whenever I define a copy constructor?
I am using gcc version 5.4.0 with -std=c++11.

In a declaration, the identifier being declared is in scope as soon as it appears. There are some valid uses of this, e.g. void *p = &p;
It's normal for the copy-constructor to assume no self-copy, leaving it up to the caller to not make this mistake. Preferably, follow the rule of zero.
It would be better to not write A a = a; in the first place. To avoid this you could get in the habit of using auto, e.g.
#define F(x) auto a = A(x)
#define G(x) A a = x
Now if you write G(a); you silently get the bug, but F(a); fails to compile.

Related

C++ return value and move rule exceptions

When we return a value from a C++ function copy-initialisation happens. Eg:
std::string hello() {
std::string x = "Hello world";
return x; // copy-init
}
Assume that RVO is disabled.
As per copy-init rule if x is a non-POD class type, then the copy constructor should be called. However for C++11 onward, I see move-constrtuctor being called. I could not find or understand the rules regarding this https://en.cppreference.com/w/cpp/language/copy_initialization. So my first question is -
What does the C++ standard say about move happening for copy-init when value is returned from function?
As an extension to the above question, I would also like to know in what cases move does not happen. I came up with the following case where copy-constructor is called instead of move:
std::string hello2(std::string& param) {
return param;
}
Finally, in some library code I saw that std::move was being explicitly used when returning (even if RVO or move should happen). Eg:
std::string hello3() {
std::string x = "Hello world";
return std::move(x);
}
What is the advantage and disadvantage of explicitly using std::move when returning?
You are confused by the fact that initialization via the move constructor is a special case of "copy initialization", and does not come as seperate concept. Check the notes on the cppreference page.
If other is an rvalue expression, move constructor will be selected by overload resolution and called during copy-initialization. There is no such term as move-initialization.
For returning a value from the function, check the description of returning on cppreference. It says in a box called "automatic move from local variables and parameters", where expression refers to what you return (warning: that quote is shortened! read the original for full details about other cases):
If expression is a (possibly parenthesized) id-expression that names a variable whose type is [...] a non-volatile object type [...] and that variable is declared [...] in the body or as a parameter of the [...] function, then overload resolution to select the constructor to use for initialization of the returned value is performed twice: first as if expression were an rvalue expression (thus it may select the move constructor), and if the first overload resolution failed [...] then overload resolution is performed as usual, with expression considered as an lvalue (so it may select the copy constructor).
So in the special case of returning a local variable, the variable can be treated as r-value, even if normal syntactic rules would make it a l-value. The spirit of the rule is that after the return, you can't find out whether the value of the local variable has been destroyed during the copy-initialization of the returned value, so moving it does not do any damage.
Regarding your second question: It is considered bad style to use std::move while returning, because moving will happen anyway, and it inhibits NRVO.
Quoting the C++ core guidelines linked above:
Never write return move(local_variable);, because the language already knows the variable is a move candidate. Writing move in this code won’t help, and can actually be detrimental because on some compilers it interferes with RVO (the return value optimization) by creating an additional reference alias to the local variable.
So that library code you quote is suboptimal.
Also, you can not implicitly move from anything that is not local to the function (that is local variables and value parameters), because implicit moving may move from something that is still visible after the function returned. In the quote from cppreference, the important point is "a non-volatile object type". When you return std::string& param, that is a variable with reference type.

Avoiding self assignment in std::shuffle

I stumbled upon the following problem when using the checked implementation of glibcxx:
/usr/include/c++/4.8.2/debug/vector:159:error: attempt to self move assign.
Objects involved in the operation:
sequence "this" # 0x0x1b3f088 {
type = NSt7__debug6vectorIiSaIiEEE;
}
Which I have reduced to this minimal example:
#include <vector>
#include <random>
#include <algorithm>
struct Type {
std::vector<int> ints;
};
int main() {
std::vector<Type> intVectors = {{{1}}, {{1, 2}}};
std::shuffle(intVectors.begin(), intVectors.end(), std::mt19937());
}
Tracing the problem I found that shuffle wants to std::swap an element with itself. As the Type is user defined and no specialization for std::swap has been given for it, the default one is used which creates a temporary and uses operator=(&&) to transfer the values:
_Tp __tmp = _GLIBCXX_MOVE(__a);
__a = _GLIBCXX_MOVE(__b);
__b = _GLIBCXX_MOVE(__tmp);
As Type does not explicitly give operator=(&&) it is default implemented by "recursively" applying the same operation on its members.
The problem occurs on line 2 of the swap code where __a and __b point to the same object which results in effect in the code __a.operator=(std::move(__a)) which then triggers the error in the checked implementation of vector::operator=(&&).
My question is: Who's fault is this?
Is it mine, because I should provide an implementation for swap that makes "self swap" a NOP?
Is it std::shuffle's, because it should not try to swap an element with itself?
Is it the checked implementation's, because self-move-assigment is perfectly fine?
Everything is correct, the checked implementation is just doing me a favor in doing this extra check (but then how to turn it off)?
I have read about shuffle requiring the iterators to be ValueSwappable. Does this extend to self-swap (which is a mere runtime problem and can not be enforced by compile-time concept checks)?
Addendum
To trigger the error more directly one could use:
#include <vector>
int main() {
std::vector<int> vectorOfInts;
vectorOfInts = std::move(vectorOfInts);
}
Of course this is quite obvious (why would you move a vector to itself?).
If you where swapping std::vectors directly the error would not occur because of the vector class having a custom implementation of the swap function that does not use operator=(&&).
The libstdc++ Debug Mode assertion is based on this rule in the standard, from [res.on.arguments]
If a function argument binds to an rvalue reference parameter, the implementation may assume that this parameter is a unique reference to this argument.
i.e. the implementation can assume that the object bound to the parameter of T::operator=(T&&) does not alias *this, and if the program violates that assumption the behaviour is undefined. So if the Debug Mode detects that in fact the rvalue reference is bound to *this it has detected undefined behaviour and so can abort.
The paragraph contains this note as well (emphasis mine):
[Note: If a program casts an lvalue to an xvalue while passing that lvalue to a library function (e.g., by calling the function with the argument
std::move(x)), the program is effectively asking that function to treat that lvalue as a temporary object. The implementation is free to optimize away aliasing checks which might be needed if the
argument was an lvalue. —end note]
i.e. if you say x = std::move(x) then the implementation can optimize away any check for aliasing such as:
X::operator=(X&& rval) { if (&rval != this) ...
Since the implementation can optimize that check away, the standard library types don't even bother doing such a check in the first place. They just assume self-move-assignment is undefined.
However, because self-move-assignment can arise in quite innocent code (possibly even outside the user's control, because the std::lib performs a self-swap) the standard was changed by Defect Report 2468. I don't think the resolution of that DR actually helps though. It doesn't change anything in [res.on.arguments], which means it is still undefined behaviour to perform a self-move-assignment, at least until issue 2839 gets resolved. It is clear that the C++ standard committee think self-move-assignment should not result in undefined behaviour (even if they've failed to actually say that in the standard so far) and so it's a libstdc++ bug that our Debug Mode still contains assertions to prevent self-move-assignment.
Until we remove the overeager checks from libstdc++ you can disable that individual assertion (but still keep all the other Debug Mode checks) by doing this before including any other headers:
#include <debug/macros.h>
#undef __glibcxx_check_self_move_assign
#define __glibcxx_check_self_move_assign(x)
Or equivalently, using just command-line flags (so no need to change the source code):
-D_GLIBCXX_DEBUG -include debug/macros.h -U__glibcxx_check_self_move_assign '-D__glibcxx_check_self_move_assign(x)='
This tells the compiler to include <debug/macros.h> at the start of the file, then undefines the macro that performs the self-move-assign assertion, and then redefines it to be empty.
(In general defining, undefining or redefining libstdc++'s internal macros is undefined and unsupported, but this will work, and has my blessing).
It is a bug in GCC's checked implementation. According to the C++11 standard, swappable requirements include (emphasis mine):
17.6.3.2 §4 An rvalue or lvalue t is swappable if and only if t is swappable with any rvalue or lvalue, respectively, of type T
Any rvalue or lvalue includes, by definition, t itself, therefore to be swappable swap(t,t) must be legal. At the same time the default swap implementation requires the following
20.2.2 §2 Requires: Type T shall be MoveConstructible (Table 20) and MoveAssignable (Table 22).
Therefore, to be swappable under the definition of the default swap operator self-move assignment must be valid and have the postcondition that after self assignment t is equivalent to it's old value (not necessarily a no-op though!) as per Table 22.
Although the object you are swapping is not a standard type, MoveAssignable has no precondition that rv and t refer to different objects, and as long as all members are MoveAssignable (as std::vector should be) the generate move assignment operator must be correct (as it performs memberwise move assignment as per 12.8 §29). Furthermore, although the note states that rv has valid but unspecified state, any state except being equivalent to it's original value would be incorrect for self assignment, as otherwise the postcondition would be violated.
I read a couple of tutorials about copy constructors and move assignments and stuff (for example this). They all say that the object must check for self assignment and do nothing in that case. So I would say it is the checked implementation's fault, because self-move-assigment is perfectly fine.

Which type to declare to avoid copy and make sure returned value is moved

Suppose that
I have a function A f();
I want to initialize a local variable a to the return value of f;
I don't want to rely on RVO;
What is the best option (and why) to avoid the return value of f being copied when
a may have to be modified
I know that a will not be modified
Options:
a) A a = f();
b) A&& a = f();
c) const A& = f();
d) const A&& = f();
EDIT:
I would say:
b)
c)
Because both use references and avoid an extra copy (which could be avoided by RVO as well, but that is not guaranteed). So how come I see option a) suggested most of the time?
I guess the heart of the question is: I get that a) most likely has the same effect as c), so why not use c) instead of a), to make things explicit and independent on the compiler?
So how come I see option a) suggested most of the time?
Because all 4 options return the value the exact same way. The only thing that changes is the way you bind a variable to the returned temporary.
a) Declares a variable a of type A and move-initializes it from the temporary. This is not an assignment, this is initialization. The move constructor will be elided by any popular compiler, provided that you don't explicitly disallow it, which means that the program will just ensure that the storage reserved for the return value is the storage for the a variable.
c) Declares a variable a which is a const-reference to the temporary, extending the lifetime of the temporary. Which means that the temporary return value gets storage, and the variable a will refer to it. The compiler, knowing statically that the reference points to the returned value, will effectively generate the same code as a).
b) and d), I am not really sure what they're good for. (and even if that works) I would not take an rvalue reference at this point. If I need one, I would explicitly std::move the variable later.
Now :
a may have to be modified
If a may be modified, use :
auto a = f();
I know that a will not be modified
If you know that a will not be modified, use :
const auto a = f();
The use of auto will prevent any type mismatch. (and as such, any implicit conversion to A if f happens not to return type A after all. Oh, those maintainers...)
If class A have move-constructor then just use A a = f();
If you know nothing about class A you can rely only on RVO or NRVO optimization.

How can I get more information from clang substitution-failure errors?

I have the following std::begin wrappers around Eigen3 matrices:
namespace std {
template<class T, int nd> auto begin(Eigen::Matrix<T,nd,1>& v)
-> decltype(v.data()) { return v.data(); }
}
Substitution fails, and I get a compiler error (error: no matching function for call to 'begin'). For this overload, clang outputs the following:
.../file:line:char note: candidate template ignored:
substitution failure [with T = double, nd = 4]
template<class T, int nd> auto begin(Eigen::Matrix<T,nd,1>& v)
^
I want this overload to be selected. I am expecting the types to be double and int, i.e. they are deduced as I want them to be deduced (and hopefully correctly). By looking at the function, I don't see anything that can actually fail.
Every now and then I get similar errors. Here, clang tells me: substitution failure, I'm not putting this function into the overload resolution set. However, this does not help me debugging at all. Why did substitution failed? What exactly couldn't be substituted where? The only thing obvious to me is that the compiler knows, but it is deliberately not telling me :(
Is it possible to force clang to tell me what did exactly fail here?
This function is trivial and I'm having problems. In more complex functions, I guess things can only get worse. How do you go about debugging these kind of errors?
You can debug substitution failures by doing the substitution yourself into a cut'n'paste of the original template and seeing what errors the compiler spews for the fully specialized code. In this case:
namespace std {
auto begin(Eigen::Matrix<double,4,1>& v)
-> decltype(v.data()) {
typedef double T; // Not necessary in this example,
const int nd = 4; // but define the parameters in general.
return v.data();
}
}
Well this has been reported as a bug in clang. Unfortunately, the clang devs still don't know the best way to fix it. Until then, you can use gcc which will report the backtrace, or you can apply this patch to clang 3.4. The patch is a quick hack that will turn substitution failures into errors.

I am trying use vlog inside static lamba giving me error: statement-expressions are not allowed outside funstions nor in template-argument lists

I wrote a global static lambda like the following:
static auto x = [] (const std::string& y){
VLOG(3) <<" y:" <<y;
};
it is giving me this error on VLOG statement.:
statement-expressions are not allowed outside functions nor in template-argument lists
This is the result of a decision made to leave room for optimizations. The relevant bug report is here in an attempt to reopen the discussion, and specifically mentions your type of use case. Compiling without optimizations should work, though I know that's a lame suggestion. The problem is with the lambda being at global scope, so any solution that brings it into a function should be good.

Resources