Avoiding self assignment in std::shuffle - c++11

I stumbled upon the following problem when using the checked implementation of glibcxx:
/usr/include/c++/4.8.2/debug/vector:159:error: attempt to self move assign.
Objects involved in the operation:
sequence "this" # 0x0x1b3f088 {
type = NSt7__debug6vectorIiSaIiEEE;
}
Which I have reduced to this minimal example:
#include <vector>
#include <random>
#include <algorithm>
struct Type {
std::vector<int> ints;
};
int main() {
std::vector<Type> intVectors = {{{1}}, {{1, 2}}};
std::shuffle(intVectors.begin(), intVectors.end(), std::mt19937());
}
Tracing the problem I found that shuffle wants to std::swap an element with itself. As the Type is user defined and no specialization for std::swap has been given for it, the default one is used which creates a temporary and uses operator=(&&) to transfer the values:
_Tp __tmp = _GLIBCXX_MOVE(__a);
__a = _GLIBCXX_MOVE(__b);
__b = _GLIBCXX_MOVE(__tmp);
As Type does not explicitly give operator=(&&) it is default implemented by "recursively" applying the same operation on its members.
The problem occurs on line 2 of the swap code where __a and __b point to the same object which results in effect in the code __a.operator=(std::move(__a)) which then triggers the error in the checked implementation of vector::operator=(&&).
My question is: Who's fault is this?
Is it mine, because I should provide an implementation for swap that makes "self swap" a NOP?
Is it std::shuffle's, because it should not try to swap an element with itself?
Is it the checked implementation's, because self-move-assigment is perfectly fine?
Everything is correct, the checked implementation is just doing me a favor in doing this extra check (but then how to turn it off)?
I have read about shuffle requiring the iterators to be ValueSwappable. Does this extend to self-swap (which is a mere runtime problem and can not be enforced by compile-time concept checks)?
Addendum
To trigger the error more directly one could use:
#include <vector>
int main() {
std::vector<int> vectorOfInts;
vectorOfInts = std::move(vectorOfInts);
}
Of course this is quite obvious (why would you move a vector to itself?).
If you where swapping std::vectors directly the error would not occur because of the vector class having a custom implementation of the swap function that does not use operator=(&&).

The libstdc++ Debug Mode assertion is based on this rule in the standard, from [res.on.arguments]
If a function argument binds to an rvalue reference parameter, the implementation may assume that this parameter is a unique reference to this argument.
i.e. the implementation can assume that the object bound to the parameter of T::operator=(T&&) does not alias *this, and if the program violates that assumption the behaviour is undefined. So if the Debug Mode detects that in fact the rvalue reference is bound to *this it has detected undefined behaviour and so can abort.
The paragraph contains this note as well (emphasis mine):
[Note: If a program casts an lvalue to an xvalue while passing that lvalue to a library function (e.g., by calling the function with the argument
std::move(x)), the program is effectively asking that function to treat that lvalue as a temporary object. The implementation is free to optimize away aliasing checks which might be needed if the
argument was an lvalue. —end note]
i.e. if you say x = std::move(x) then the implementation can optimize away any check for aliasing such as:
X::operator=(X&& rval) { if (&rval != this) ...
Since the implementation can optimize that check away, the standard library types don't even bother doing such a check in the first place. They just assume self-move-assignment is undefined.
However, because self-move-assignment can arise in quite innocent code (possibly even outside the user's control, because the std::lib performs a self-swap) the standard was changed by Defect Report 2468. I don't think the resolution of that DR actually helps though. It doesn't change anything in [res.on.arguments], which means it is still undefined behaviour to perform a self-move-assignment, at least until issue 2839 gets resolved. It is clear that the C++ standard committee think self-move-assignment should not result in undefined behaviour (even if they've failed to actually say that in the standard so far) and so it's a libstdc++ bug that our Debug Mode still contains assertions to prevent self-move-assignment.
Until we remove the overeager checks from libstdc++ you can disable that individual assertion (but still keep all the other Debug Mode checks) by doing this before including any other headers:
#include <debug/macros.h>
#undef __glibcxx_check_self_move_assign
#define __glibcxx_check_self_move_assign(x)
Or equivalently, using just command-line flags (so no need to change the source code):
-D_GLIBCXX_DEBUG -include debug/macros.h -U__glibcxx_check_self_move_assign '-D__glibcxx_check_self_move_assign(x)='
This tells the compiler to include <debug/macros.h> at the start of the file, then undefines the macro that performs the self-move-assign assertion, and then redefines it to be empty.
(In general defining, undefining or redefining libstdc++'s internal macros is undefined and unsupported, but this will work, and has my blessing).

It is a bug in GCC's checked implementation. According to the C++11 standard, swappable requirements include (emphasis mine):
17.6.3.2 §4 An rvalue or lvalue t is swappable if and only if t is swappable with any rvalue or lvalue, respectively, of type T
Any rvalue or lvalue includes, by definition, t itself, therefore to be swappable swap(t,t) must be legal. At the same time the default swap implementation requires the following
20.2.2 §2 Requires: Type T shall be MoveConstructible (Table 20) and MoveAssignable (Table 22).
Therefore, to be swappable under the definition of the default swap operator self-move assignment must be valid and have the postcondition that after self assignment t is equivalent to it's old value (not necessarily a no-op though!) as per Table 22.
Although the object you are swapping is not a standard type, MoveAssignable has no precondition that rv and t refer to different objects, and as long as all members are MoveAssignable (as std::vector should be) the generate move assignment operator must be correct (as it performs memberwise move assignment as per 12.8 §29). Furthermore, although the note states that rv has valid but unspecified state, any state except being equivalent to it's original value would be incorrect for self assignment, as otherwise the postcondition would be violated.

I read a couple of tutorials about copy constructors and move assignments and stuff (for example this). They all say that the object must check for self assignment and do nothing in that case. So I would say it is the checked implementation's fault, because self-move-assigment is perfectly fine.

Related

Is it still necessary to use std move even if auto && has been used

As we know, STL usually offered two kinds of functions to insert an element: insert/push and emplace.
Let's say I want to emplace all of elements from one container to another.
for (auto &&element : myMap)
{
anotherMap.emplace(element); // vs anotherMap.empalce(std::move(element));
}
In this case, if I want to call the emplace, instead of insert/push, must I still call std::move here or not?
If you indeed want to move all elements from myMap into anotherMap then yes you must call std::move(). The reason is that element here is still an lvalue. Its type is rvalue reference as declared, but the expression itself is still an lvalue, and thus the overload resolution will give back the lvalue reference constructor better known as the copy constructor.
This is a very common point of confusion. See for example this question.
Always keep in mind that std::move doesn't actually do anything itself, it just guarantees that the overload resolver will see an appropriately-typed rvalue instead of an lvalue associated with a given identifier.

C++ return value and move rule exceptions

When we return a value from a C++ function copy-initialisation happens. Eg:
std::string hello() {
std::string x = "Hello world";
return x; // copy-init
}
Assume that RVO is disabled.
As per copy-init rule if x is a non-POD class type, then the copy constructor should be called. However for C++11 onward, I see move-constrtuctor being called. I could not find or understand the rules regarding this https://en.cppreference.com/w/cpp/language/copy_initialization. So my first question is -
What does the C++ standard say about move happening for copy-init when value is returned from function?
As an extension to the above question, I would also like to know in what cases move does not happen. I came up with the following case where copy-constructor is called instead of move:
std::string hello2(std::string& param) {
return param;
}
Finally, in some library code I saw that std::move was being explicitly used when returning (even if RVO or move should happen). Eg:
std::string hello3() {
std::string x = "Hello world";
return std::move(x);
}
What is the advantage and disadvantage of explicitly using std::move when returning?
You are confused by the fact that initialization via the move constructor is a special case of "copy initialization", and does not come as seperate concept. Check the notes on the cppreference page.
If other is an rvalue expression, move constructor will be selected by overload resolution and called during copy-initialization. There is no such term as move-initialization.
For returning a value from the function, check the description of returning on cppreference. It says in a box called "automatic move from local variables and parameters", where expression refers to what you return (warning: that quote is shortened! read the original for full details about other cases):
If expression is a (possibly parenthesized) id-expression that names a variable whose type is [...] a non-volatile object type [...] and that variable is declared [...] in the body or as a parameter of the [...] function, then overload resolution to select the constructor to use for initialization of the returned value is performed twice: first as if expression were an rvalue expression (thus it may select the move constructor), and if the first overload resolution failed [...] then overload resolution is performed as usual, with expression considered as an lvalue (so it may select the copy constructor).
So in the special case of returning a local variable, the variable can be treated as r-value, even if normal syntactic rules would make it a l-value. The spirit of the rule is that after the return, you can't find out whether the value of the local variable has been destroyed during the copy-initialization of the returned value, so moving it does not do any damage.
Regarding your second question: It is considered bad style to use std::move while returning, because moving will happen anyway, and it inhibits NRVO.
Quoting the C++ core guidelines linked above:
Never write return move(local_variable);, because the language already knows the variable is a move candidate. Writing move in this code won’t help, and can actually be detrimental because on some compilers it interferes with RVO (the return value optimization) by creating an additional reference alias to the local variable.
So that library code you quote is suboptimal.
Also, you can not implicitly move from anything that is not local to the function (that is local variables and value parameters), because implicit moving may move from something that is still visible after the function returned. In the quote from cppreference, the important point is "a non-volatile object type". When you return std::string& param, that is a variable with reference type.

memcpy from one type to another type. How do we access the destination afterwards?

uint32_t u32 = 0;
uint16_t u16[2];
static_assert(sizeof(u32) == sizeof(u16), "");
memcpy(u16, &u32, sizeof(u32)); // defined?
// if defined, how to we access the data from here on?
Is this defined behaviour? And, if so, what type of pointer may we use to access the target data after the memcpy?
Must we use uint16_t*, because that suitable for the declared type of u16?
Or must we use uint32_t*, because the type of the source data (the source data copied from by memcpy) is uint_32?
(Personally interested in C++11/C++14. But a discussion of related languages like C would be interesting also.)
Is this defined behavio[u]r?
Yes. memcpying into a pod is well-defined and you ensured that the sizing is the correct.
Must we use uint16_t*, because that suitable for the declared type of u16?
Yes, of course. u16 is an array of two uint16_ts so it must be accessed as such. Accessing it via a uint32_t* would be undefined behavior by the strict-aliasing rule.
It doesn't matter what the source type was. What matters is that you have an object of type uint16_t[2].
On the other hand, this:
uint32_t p;
new (&p) uint16_t(42);
std::cout << p;
is undefined behavior, because now there is an object of a different type whose lifetime has begin at &p and we're accessing it through the wrong type.
The C++ standard delegates to C standard:
The contents and meaning of the header <cstring> are the same as the C standard library header <string.h>.
The C standard specifies:
7.24.1/3 For all functions in this subclause, each character shall be interpreted as if it had the type unsigned char (and therefore every possible object representation is valid and has a different value).
So, to answer your question: Yes, the behaviour is defined.
Yes, uint16_t* is appropriate because uint16_t is the type of the object.
No, the type of the source doesn't matter.
C++ standard doesn't specify such thing as object without declared type or how it would behave. I interpret that to mean that the effective type is implementation defined for objects with no declared type.
Even in C, the source doesn't matter in this case. A more complete version of quote from C standard (draft, N1570) that you are concerned about, emphasis mine:
6.5/6 [...] If a value is copied into an object having no declared type using memcpy or memmove, or is copied as an array of character type, then the effective type of the modified object for that access and for subsequent accesses that do not modify the value is the effective type of the object from which the value is copied, if it has one. [...]
This rule doesn't apply, because objects in u16 do have a declared type

In what way does this struct-field-aliasing code invoke Undefined Behavior

Given the code:
#include <stdlib.h>
#include <stdint.h>
typedef struct { int32_t x, y; } INTPAIR;
typedef struct { int32_t w; INTPAIR xy; } INTANDPAIR;
void foo(INTPAIR * s1, INTPAIR * s2)
{
s2->y++;
s1->x^=1;
s2->y--;
s1->x^=1;
}
int hey(int x)
{
static INTPAIR dummy;
void *p = calloc(sizeof (INTANDPAIR),1);
INTANDPAIR *p1 = p;
INTPAIR *p2a = p;
INTPAIR *p2b = &p1->xy;
p2b->x = x;
foo(p2b,p2a);
int result= p2b->x;
free(p);
return result;
}
#include <stdio.h>
int main(void)
{
for (int i=0; i<10; i++)
printf("%d.",hey(i));
}
Behavior depends upon gcc optimization level, which implies that gcc thinks
this code invokes Undefined Behavior (the definition of "foo" collapses to nothing, but interestingly the definition of "hey" increments the value passed in). I'm not quite sure what if anything it does that runs afoul of the Standard's rules, though.
The code very deliberately and evilly constructs two pointers such that
s2a->y and s2b->x will alias, but the pointers are deliberately constructed in such a way that both identify legitimate potential objects of type INTPAIR. Because code used calloc to get the memory, all field members have legitimate initial defined values of zero. All accesses to the allocated memory are done via an int32_t member of an INTPAIR*.
I can understand why it would make sense for the Standard to forbid aliasing structure fields in this fashion, but I couldn't find anything in the Standard which actually does so. Is gcc operating in Standard-compliant fashion here, or is it violating some clause in the Standard which isn't referenced by Annex J.2 and doesn't use any of the terms I searched for?
UPDATE:
I felt this answer was OK, but not still a little imprecise, and not cut and dry as to what the UB was. After a lot of very interesting discussion and comments I have tried again with a new answer
The right part of the C99 standard is quoted in this answer. I'm copying it here for convenience. The question and several of the answers are quite thorough.
(C99; ISO/IEC 9899:1999 6.5/7:
An object shall have its stored value accessed only by an lvalue
expression that has one of the following types 73) or 88):
a type compatible with the effective type of the object,
a qualified version of a type compatible with the effective type of
the object,
a type that is the signed or unsigned type corresponding to the
effective type of the object,
a type that is the signed or unsigned type corresponding to a
qualified version of the effective type of the object,
an aggregate or union type that includes one of the aforementioned
types among its members (including, recursively, a member of a
subaggregate or contained union), or
a character type.
73) or 88) The intent of this list is to specify those circumstances in which an object may or may not be aliased.
What is an effective type then? (C99; ISO/IEC 9899:1999 6.5/6:
The effective type of an object for an access to its stored value is the declared type of the object, if any. 87) If a value is stored into an object having no declared type through an lvalue having a type that is not a character type, then the type of the lvalue becomes the effective type of the object for that access and for subsequent accesses that do not modify the stored value. If a value is copied into an object having no declared type using memcpy or memmove, or is copied as an array of character type, then the effective type of the modified object for that access and for subsequent accesses that do not modify the value is the effective type of the object from which the value is copied, if it has one. For all other accesses to an object having no declared type, the effective type of the object is simply the type of the lvalue used for the access.
87) Allocated objects have no declared type.
So at the line p2b->x = x the object at p+4 becomes of effective type INTPAIR. Is it aligned correctly? If it isn't then Undefined Behavior (UB). But to keep it interesting, assume it is as it must be in this case because of the layout of INTANDPAIR.
By the same analysis there are two 8 byte objects, p2a (s2) at #(p+4) and p2b #p. As your example is demonstrating the 2nd element of p2a and the first of p2b end up being aliased.
In the foo(), the object p2b #p+4 is accessed by the normal method via s1->x. But then the "stored value" of object p2b is also accessed by a side effect of modifying a different object p2a #p. Since this falls under none of the bullets of 6.5/7, it is UB. Note that 6.5/7 says only, so objects shall not be accessed in any other ways.
I think the main distinction is that the "object" in question is the whole structure p2a/s2 and p2b/s1, not the integer members. If you change the argument of the function to take the integers and alias them it works "fine" because the function can't know s1 and s2 alias. For example:
void foo2(int *s1, int *s2)
{
(*s2)++;
(*s1)^=1;
(*s2)--;
(*s1)^=1;
}
...
/*foo(p2b,p2a);*/
foo2((int*)p, (int*)p); /* or p+4 or whatever you want */
This more or less confirms that this is the way GCC chose to interpret things: modifying a member is modifying the whole struct object and that since side effects of modifying one object are not on the listed legal ways to indirectly modify a different object, whee! we can do whatever silly thing we feel like doing.
So whether GCC interprets the ambiguities in standard to decide that by deriving s1 and s2 pointers through different typed pointers and then accessing them constitutes indirectly accessing the memory via different original types via p1 and p or whether it interprets the standard in the way I'm suggesting that "object" s2->y modifies is not just the integer but the s2 object, it is UB either way. Or is GCC just being especially snarky and pointing out that if the standard doesn't very clearly specify the semantics of dynamically allocated yet overlapping objects, it is free to do whatever it wants because by definition it is "undefined".
I don't think at this microscopic level anyone other than the standards body can definitively answer whether this should be UB or not because at this level it requires some "interpretation". The GCC's implementers opinion's seem to favor very aggressive interpretations.
I like Linus's reaction to this whole thing. And it is true, why not just be conservative and let the programmer tell the compiler when it is safe? Very Excellent Linus Rant
My previous answer was lacking, maybe not completely wrong, but the sample program is deliberately designed to sidestep each of the more obvious explicit Undefined Behaviors (UB) dictated by the C99 standard, like 6.5/7. But with both GCC (and Clang) this example demonstrates strict aliasing failure like symptoms under optimization. They appear to be assuming s1->y and s2-x can't alias. So, is the compiler wrong? Is this a loophole in the strict aliasing legalese?
Short answer: No. I wouldn't be surprised if there was a loophole of some kind in the standard, given its complexity. But in this example, creating overlapping objects on the heap is explicitly undefined behavior, and there are several other things happening that the standard does not define.
I think the point of the example is not that it fails - it is obvious that "playing fast and loose" with pointers is a bad idea and relying on corner cases and legalese to prove the compile "wrong" is of little help if the code doesn't work. The key questions are: is GCC wrong? and what in the standard says so.
First, lets look at the obvious strict aliasing rules and how this example is trying to avoid them.
C99 6.5/7:
An object shall have its stored value accessed only by an lvalue expression that has one of the following types: 76)
a type compatible with the effective type of the object,
a qualified version of a type compatible with the effective type of the object,
a type that is the signed or unsigned type corresponding to the effective type of the object,
a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
a character type.
This is the main strict aliasing section. It means that accessing the same memory via two different type pointers is UB. This example sidesteps it by accessing both using INTPAIR pointers in foo().
The key problem with this is that it is talking about accessing the stored value via two different effective types (e.g. pointers). It doesn't talk about accessing via two different objects.
What is being accessed? is it the integer member or the entire object s1 / s2? Is accessing s2->x via s1->y access via "a type compatible with the effective type of the object". I believe an argument can be made that a) the access as a side effect of modifying a different object does not fall under the permissible methods in 6.5/7 and that b) modifying one member of the aggregate transitively modifies the aggregate (*s1 or *s2) also.
Since this is not specified, it is UB, but it is a bit hand-wavy.
How did we get pointers to two overlapping objects? Are the pointer casts leading to them OK? Section 6.3.2.3 contains the rules for casting pointers and the example carefully does not violate any of them. In particular, because p2b is a pointer to INTANDPAIR member xy the alignment is guaranteed to be right, otherwise it would definitely run afoul of 6.3.2.3/7.
Furthermore, &p1->xy is not a problem - it can't be - it is a perfectly legitimate pointer to an INTPAIR. Simply casting pointers and/or taking addresses is safely outside the definition of "access" (3.1/1).
It is clear that the problem comes about by accessing two integer members that overlay each other as different parts of overlapping objects. Any attempt to do this via pointers of different types would clearly run afoul of 6.5/7. If accessed by the same type pointer at the same address, there would be no problem whatsoever. So the only way left that they could alias this way is that if two objects at different addresses overlapped in some fashion.
Obviously this could occur as part of a union, but that is not the case for this example. Type punning through unions may not be UB in C99, but it would be a different question whether a variant of this example could be made misbehave via unions.
The example uses dynamic allocation and casts the resultant void pointer to two different types. Going from from a pointer to an object to void * and back again is valid (6.3.2.3/1). Several other ways of obtaining pointers to objects that would overlap are explicitly UB by the pointer conversion rules of 6.3.2.3, the aliasing rules of 6.5/7, and/or the compatible type rules 6.2.7.
So what else is wrong?
6.2.4 Storage durations of objects
1 An object has a storage duration that determines its lifetime. There are three storage durations: static, automatic, and allocated. Allocated storage is described in 7.20.3
The storage for each of the objects is allocated by calloc() so the duration we want is "allocated". So we check 7.20.3: (emphasis added)
7.20.3 Memory management functions
1 The order and contiguity of storage allocated by successive calls to the calloc, malloc, and realloc functions is unspecified. The pointer returned if the allocation succeeds is suitably aligned so that it may be assigned to a pointer to any type of object and then used to access such an object or an array of such objects in the space allocated (until the space is explicitly deallocated). The lifetime of an allocated object extends from the allocation until the deallocation. Each such allocation shall yield a pointer to an object disjoint from any other object.
...
2 The lifetime of an object is the portion of program execution during which storage is guaranteed to be reserved for it. An object exists, has a constant address, 25) and retains its last-stored value throughout its lifetime. 26) If an object is referred to outside of its lifetime, the behavior is undefined.
To avoid UB, the accesses to the two different objects must be to a valid object within its lifetime. You can get a single valid object (or an array) with malloc()/calloc(), but these guarantee that you will receive a pointer disjoint from all other objects. So is the object returned from calloc() p or is it p1? It can't be both.
The UB is triggered by attempting to reuse the same dynamically allocated object to hold two objects that are not disjoint. While calloc() guarantees it will return a pointer to a disjoint object, there is nothing that says it will still work if you then start using parts of the buffer for a 2nd overlapping one. In fact, it even explicitly says it is UB if you access an object outside its lifetime and there is only a single allocation ergo a single lifetime.
Also note:
4. Conformance
In this International Standard, ‘‘shall’’ is to be interpreted as a requirement on an implementation or on a program; conversely, ‘‘shall not’’ is to be interpreted as a prohibition.
If a ‘‘shall’’ or ‘‘shall not’’ requirement that appears outside of a constraint is violated, the behavior is undefined. Undefined behavior is otherwise indicated in this International Standard by the words ‘‘undefined behavior’’ or by the omission of any explicit definition
of behavior. There is no difference in emphasis among these three; they all describe ‘‘behavior that is undefined’’.
For this to be a compiler error it must fail on a program that only uses constructs explicitly defined. Anything else is outside the safe-harbor and is still undefined, even if it the standard doesn't explicitly state that it is Undefined Behavior.

__verify_pcpu_ptr function in Linux Kernel - What does it do?

#define __verify_pcpu_ptr(ptr)
do {
const void __percpu *__vpp_verify = (typeof((ptr) + 0))NULL;
(void)__vpp_verify;
} while (0)
#define VERIFY_PERCPU_PTR(__p)
({
__verify_pcpu_ptr(__p);
(typeof(*(__p)) __kernel __force *)(__p);
})
What do these two functions do? What are they used for? How do they work?
Thanks.
This is part of the scheme used by per_cpu_ptr to support a pointer that gets a different value for each CPU. There are two motives here:
Ensure that accesses to the per-cpu data structure are only made via the per_cpu_ptr macro.
Ensure that the argument given to the macro is of the correct type.
Restating, this ensures that (a) you don't accidentally access a per-cpu pointer without the macro (which would only reference the first of N members), and (b) that you don't inadvertently use the macro to cast a pointer that is not of the correct declared type to one that is.
By using these macros, you get the support of the compiler in type-checking without any runtime overhead. The compiler is smart enough to eventually recognize that all of these complex machinations result in no observable state change, yet the type-checking will have been performed. So you get the benefit of the type-checking, but no actual executable code will have been emitted by the compiler.

Resources