Why is std::move() not stealing an int value? - c++11

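std::move() steals the value of a string but not the value of an int. Why is that? Please help me.
#include <iostream>
#include <string>
#include <utility>

int main()
{
    int i = 50;
    std::string str = "Mahesh";
    int j = std::move(i);               // initialize an int from a moved-from int
    std::string name = std::move(str);  // initialize a string from a moved-from string
    std::cout << "i: " << i << " J: " << j << std::endl;
    std::cout << "str: " << str << " name: " << name << std::endl;
    return 0;
}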
Output
i: 50 J: 50
str: name: Mahesh

std::move is a cast to an rvalue reference. This can change overload resolution, particularly with regard to constructors.
int is a fundamental type, so it doesn't have any constructors. The definition of int initialisation does not care whether the expression is const, volatile, lvalue or rvalue. Thus the behaviour is a copy.
One reason this is the case is that there is no benefit to a (destructive) move. Another reason is that there is no such thing as an "empty" int, in the sense that there are "empty" std::strings and "empty" std::unique_ptrs.
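As a minimal sketch of the difference described above, the two helper functions below (hypothetical names, not from the question) compare an int with a std::unique_ptr. The unique_ptr is used rather than std::string because its moved-from state is guaranteed to be null, whereas a moved-from std::string is only guaranteed to be in a valid but unspecified state (empty in practice, as the output above shows).

```cpp
#include <memory>
#include <utility>

// Hypothetical helper: returns the value left in an int after std::move.
int int_after_move() {
    int i = 50;
    int j = std::move(i);  // fundamental type: initialization is a plain copy
    (void)j;               // j is only here to mirror the question's code
    return i;              // i is unchanged
}

// Hypothetical helper: true if the source unique_ptr is null after the move
// and the destination now owns the value.
bool unique_ptr_empty_after_move() {
    std::unique_ptr<int> p(new int(50));
    std::unique_ptr<int> q = std::move(p);  // move constructor transfers ownership
    return p == nullptr && q != nullptr && *q == 50;
}
```

Calling int_after_move() should give back 50, while unique_ptr_empty_after_move() should report that the source pointer was emptied.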

std::move() itself doesn't actually do any moving. It is simply used to indicate that an object may be moved from. The actual moving must be implemented for the respective types by a move constructor/move assignment operator.
std::move(x) returns an unnamed rvalue reference to x. rvalue references are really just like normal references. Their only purpose is simply to carry along the information about the "rvalue-ness" of the thing they refer to. When you then use the result of std::move() to initialize/assign to another object, overload resolution will pick a move constructor/move assignment operator if one exists. And that's it. That is literally all that std::move() does. However, the implementation of a move constructor/move assignment operator knows that the only way it could have been called is when the value passed to it is about to expire (otherwise, the copy constructor/copy assignment operator would have been called instead). It, thus, can safely "steal" the value rather than make a copy, whatever that may mean in the context of the particular type.
There is no general answer to the question what exactly it means to "steal" a value from an object. Whoever defines a type has to define whether it makes sense to move objects of this type and what exactly it means to do so (by declaring/defining the respective member functions). Built-in types don't have any special behavior defined for moving their values. So in the case of an int you just get what you get when you initialize an int with a reference to another int, which is a copy…
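To illustrate what "whoever defines a type has to define what moving means", here is a small sketch of a hypothetical Buffer class (it does not appear in the question). Its move constructor chooses to take over the source's heap array and leave the source empty; that choice is an assumption of this example, not something std::move enforces.

```cpp
#include <cstddef>
#include <utility>

// Hypothetical type for illustration: owns a heap-allocated int array.
class Buffer {
public:
    explicit Buffer(std::size_t n) : data_(new int[n]()), size_(n) {}

    // Copy constructor: allocates a new array and copies every element.
    Buffer(const Buffer& other) : data_(new int[other.size_]), size_(other.size_) {
        for (std::size_t k = 0; k < size_; ++k) data_[k] = other.data_[k];
    }

    // Move constructor: "steals" the array and leaves the source empty.
    Buffer(Buffer&& other) noexcept : data_(other.data_), size_(other.size_) {
        other.data_ = nullptr;
        other.size_ = 0;
    }

    // Assignment is left out to keep the sketch short.
    Buffer& operator=(const Buffer&) = delete;

    ~Buffer() { delete[] data_; }

    std::size_t size() const { return size_; }
    const int* data() const { return data_; }

private:
    int* data_;
    std::size_t size_;
};
```

With this definition, Buffer b = std::move(a); selects the move constructor, so b ends up holding the array that a used to own, while Buffer c = b; selects the copy constructor and allocates a fresh array.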

Related

Pre and post increment behaviour in C++

(C++) Why
std::cout << ++(a++);
shows error: lvalue required as increment operand
but
std::cout << (++a)++;
shows output "1"
(Java) But in Java, both cases fail to compile, because the increment and decrement operators work on variables, not on values, and the result of a parenthesized expression is always a value.
Thanks in advance.
The return type of pre-increment is T&: the expression ++a is an lvalue that refers to a itself, so it can be modified again. Post-increment returns T by value: a++ is an unnamed temporary (a prvalue) holding the old value. The built-in ++ requires a modifiable lvalue as its operand, so ++(a++) is rejected, while (++a)++ compiles.
If you want more information, search for lvalues, (p)rvalues and value categories, although they can be hard to understand at first.
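The value categories described in this answer can be checked at compile time. The sketch below uses decltype, which yields T& for an lvalue expression and T for a prvalue expression; the function name pre_then_post is made up for this example.

```cpp
#include <type_traits>
#include <utility>

// Pre-increment on an int lvalue is itself an lvalue (int&).
static_assert(std::is_same<decltype(++std::declval<int&>()), int&>::value,
              "pre-increment yields an lvalue");

// Post-increment on an int lvalue is a prvalue (plain int).
static_assert(std::is_same<decltype(std::declval<int&>()++), int>::value,
              "post-increment yields a prvalue");

// Hypothetical helper mirroring the expression from the question.
// ++a increments a and yields a itself; the post-increment then returns
// the current value and increments a once more.
int pre_then_post(int& a) {
    return (++a)++;
}
```

Starting from a equal to 0, pre_then_post(a) returns 1 (the value printed in the question) and leaves a equal to 2. Writing ++(a++) inside the function instead would not compile, because a++ is not an lvalue.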

How to Define a Constant Value of a User-defined Type in Go?

I am implementing a bit-vector in Go:
// A bit vector uses a slice of unsigned integer values or “words,”
// each bit of which represents an element of the set.
// The set contains i if the ith bit is set.
// The following program demonstrates a simple bit vector type with these methods.
type IntSet struct {
words []uint64 //uint64 is important because we need control over number and value of bits
}
I have defined several methods (e.g. membership test, adding or removing elements, set operations like union, intersection etc.) on it which all have a pointer receiver. Here is one such method:
// Has returns true if the given integer is in the set, false otherwise
func (this *IntSet) Has(m int) bool {
// details omitted for brevity
}
Now, I need to return an empty set that is a true constant, so that I can use the same constant every time I need to refer to an IntSet that contains no elements. One way is to return something like &IntSet{}, but I see two disadvantages:
Every time an empty set is to be returned, a new value needs to be allocated.
The returned value is not really constant since it can be modified by the callers.
How do you define a null set that does not have these limitations?
If you read https://golang.org/ref/spec#Constants you see that constants are limited to basic types. A struct or a slice or array will not work as a constant.
I think that the best you can do is to make a function that returns a copy of an internal empty set. If callers modify it, that isn't something you can fix.
Actually, modifying it would be difficult for them, since the words field inside IntSet is lowercase and therefore unexported (private to the package). If you added a field next to words, such as mut bool, you could add an if mut check to every method that changes the IntSet. If the set isn't mutable, return an error or panic.
With that, you could keep users from modifying constant, non-mutable IntSet values.

Why does the STL Output Iterator allow only once assignment?

As mentioned here:
http://www.cplusplus.com/reference/iterator/OutputIterator/
Can be dereferenced as an lvalue (if in a dereferenceable state).
It shall only be dereferenced as the left-side of an assignment statement.
Once dereferenced, its iterator value may no longer be dereferenceable.
Next to it there is an example of a valid expression:
*a = t
After this expression (the dereference), I can't dereference again.
I don't understand why for example I can't do:
*a = t2
After the first expression.
One reason is that output iterators are used for output streams, such as terminals, pipes and sockets. Once data have been written into the stream, it is considered sent elsewhere and thus cannot be changed.
Other iterator types, including Trivial Iterator and Input Iterator, define the notion of a value type, the type returned when an iterator is dereferenced. This notion does not apply to Output Iterators, however, since the dereference operator (unary operator*) does not return a usable value for Output Iterators. The only context in which the dereference operator may be used is assignment through an output iterator: *x = t. Although Input Iterators and output iterators are roughly symmetrical concepts, there is an important sense in which accessing and storing values are not symmetrical: for an Input Iterator operator* must return a unique type, but, for an Output Iterator, in the expression *x = t, there is no reason why operator= must take a unique type. Consequently, there need not be any unique "value type" for Output Iterators.
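As a rough illustration of the stream case mentioned above, the sketch below writes through a std::ostream_iterator into a std::ostringstream. The function name write_two is an assumption made for this example. Each assignment through *a sends a value to the stream, and the iterator is incremented before the next assignment, which is the usage pattern the Output Iterator requirements expect.

```cpp
#include <iterator>
#include <sstream>
#include <string>

// Hypothetical helper: writes two values through an output iterator
// and returns everything that reached the stream.
std::string write_two(int t, int t2) {
    std::ostringstream out;
    std::ostream_iterator<int> a(out, " ");  // " " is written after each value
    *a = t;   // the value is sent to the stream and cannot be taken back
    ++a;      // advance before dereferencing again
    *a = t2;
    ++a;
    return out.str();
}
```

For example, write_two(1, 2) produces the text "1 2 ". Once a value has been written there is no stored element left to overwrite, which is why the concept only promises a single assignment per position.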

Is it safe to write to a std::strings buffer directly?

If I have the following code:
std::string hello = "hello world";
char* internalBuffer = &hello[0];
Is it then safe to write to internalBuffer up to hello.length()? Or is this UB/implementation-defined? Obviously I can write tests and see that this works, but that doesn't answer my question.
Yes, it's safe. No, it's not explicitly allowed by the standard.
According to my copy of the standard draft from like half a year ago, they do assure that data() points at a contiguous array, and that that array be the same as what you receive from operator[]:
21.4.7.1 basic_string accessors [string.accessors]
const charT* c_str() const noexcept;
const charT* data() const noexcept;
Returns: A pointer p such that p + i == &operator[](i) for each i in [0,size()].
From this one can conclude that operator[] returns a reference to some place within that contiguous array. They also allow the reference returned from (non-const) operator[] to be modified.
Having a non-const reference to one member of an array, I dare say that we can modify the entire array.
The relevant section in the standard is §21.4.5:
const_reference operator[](size_type pos) const noexcept;
reference operator[](size_type pos) noexcept;
[...]
Returns: *(begin() + pos) if pos < size(), otherwise a reference to an
object of type T with value charT(); the referenced value shall not be modified.
If I understand this correctly, it means that as long as the index given to operator[] is smaller than the string's size, one is allowed to modify the value. If however, the index is equal to size and thus we obtain the \0 terminating the string, we must not write to this value.
Cppreference uses a slightly different wording here:
If pos == size(), a reference to the character with value CharT() (the null character) is returned.
For the first (non-const) version, the behavior is undefined if this character is modified.
I read this such that 'this character' here only refers to the default constructed CharT, and not to the reference returned in the other case. But I admit that the wording is a bit confusing here.
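Under the reading given in these answers (indices below size() may be written, the terminator at size() may not), a minimal sketch of writing through the buffer could look like this. The function name to_upper_in_place is hypothetical, and the character arithmetic assumes an ASCII-compatible encoding.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: upper-cases a string by writing through &s[0].
// Only positions [0, size()) are written; the terminator is never touched.
std::string to_upper_in_place(std::string s) {
    if (s.empty()) return s;   // avoid taking &s[0] just to do nothing
    char* buf = &s[0];         // pointer to the first character
    for (std::size_t k = 0; k < s.size(); ++k) {
        // Assumes ASCII: lower-case letters are contiguous in the encoding.
        if (buf[k] >= 'a' && buf[k] <= 'z') {
            buf[k] = static_cast<char>(buf[k] - 'a' + 'A');
        }
    }
    return s;
}
```

With the string from the question, to_upper_in_place("hello world") gives "HELLO WORLD". The pointer arithmetic on buf relies on the characters being stored contiguously.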
In practice it is safe; in theory, no.
Before C++11, the C++ standard did not require string to be implemented as a contiguous character array, as it does for vector. I'm not aware of any implementation of string where it is not safe, but in theory there was no guarantee. Since C++11, the characters of a string are required to be stored contiguously.
http://herbsutter.com/2008/04/07/cringe-not-vectors-are-guaranteed-to-be-contiguous/

How to achieve "optimal" operator overload-resolution in arithmetic expressions with rvalues?

First of all, I apologize for the overly verbose question. I couldn't think of any other way to accurately summarize my problem... Now on to the actual question:
I'm currently experimenting with C++0x rvalue references... The following code produces unwanted behavior:
#include <iostream>
#include <utility>
struct Vector4
{
float x, y, z, w;
inline Vector4 operator + (const Vector4& other) const
{
Vector4 r;
std::cout << "constructing new temporary to store result"
<< std::endl;
r.x = x + other.x;
r.y = y + other.y;
r.z = z + other.z;
r.w = w + other.w;
return r;
}
Vector4&& operator + (Vector4&& other) const
{
std::cout << "reusing temporary 2nd operand to store result"
<< std::endl;
other.x += x;
other.y += y;
other.z += z;
other.w += w;
return std::move(other);
}
friend inline Vector4&& operator + (Vector4&& v1, const Vector4& v2)
{
std::cout << "reusing temporary 1st operand to store result"
<< std::endl;
v1.x += v2.x;
v1.y += v2.y;
v1.z += v2.z;
v1.w += v2.w;
return std::move(v1);
}
};
int main (void)
{
Vector4 r,
v1 = {1.0f, 1.0f, 1.0f, 1.0f},
v2 = {2.0f, 2.0f, 2.0f, 2.0f},
v3 = {3.0f, 3.0f, 3.0f, 3.0f},
v4 = {4.0f, 4.0f, 4.0f, 4.0f},
v5 = {5.0f, 5.0f, 5.0f, 5.0f};
///////////////////////////
// RELEVANT LINE HERE!!! //
///////////////////////////
r = v1 + v2 + (v3 + v4) + v5;
return 0;
}
results in the output
constructing new temporary to store result
constructing new temporary to store result
reusing temporary 1st operand to store result
reusing temporary 1st operand to store result
while I had hoped for something like
constructing new temporary to store result
reusing temporary 1st operand to store result
reusing temporary 2nd operand to store result
reusing temporary 2nd operand to store result
After trying to re-enact what the compiler was doing (I'm using MinGW G++ 4.5.2 with option -std=c++0x in case it matters), it actually seems quite logical. The standard says that arithmetic operations of equal precedence are evaluated/grouped left-to-right (why I assumed right-to-left I don't know, I guess it's more intuitive to me). So what happened here is that the compiler evaluated the sub-expression (v3 + v4) first (since it's in parentheses?), and then began matching the operations in the expression left-to-right against the operator overloads, resulting in a call to Vector4 operator + (const Vector4& other) for the sub-expression v1 + v2. If I want to avoid the unnecessary temporary, I'd have to make sure that no more than one lvalue operand appears to the immediate left of any parenthesized sub-expression, which is counter-intuitive to anyone using this "library" and innocently expecting optimal performance (as in minimizing the creation of temporaries).
(I'm aware that there's ambiguity in my code regarding operator + (Vector4&& v1, const Vector4& v2) and operator + (Vector4&& other) when (v3 + v4) is to be added to the result of v1 + v2, resulting in a warning. But it's harmless in my case and I don't want to add yet another overload for two rvalue reference operands - anyone know if there's a way to disable this warning in gcc?)
Long story short, my question boils down to: Is there any way or pattern (preferably compiler-independent) this vector class could be rewritten to enable arbitrary use of parentheses in expressions that still results in the "optimal" choice of operator overloads (optimal in terms of "performance", i.e. maximizing the binding to rvalue references)? Perhaps I'm asking for too much though and it's impossible... if so, then that's fine too. I just want to make sure I'm not missing anything.
Many thanks in advance
Addendum
First thanks to the quick responses I got, within minutes (!) - I really should have started posting here sooner...
It's becoming tedious replying in the comments, so I think a clarification of my intent with this class design is in order. Maybe you can point me to a fundamental conceptual flaw in my thought process if there is one.
You may notice that I don't hold any resources in the class like heap memory. Its members are only scalar types even. At first sight this makes it a suspect candidate for move-semantics based optimizations (see also this question that actually helped me a great deal grasping the concepts behind rvalue references).
However, since the classes this one is supposed to be a prototype for will be used in a performance-critical context (a 3D engine to be precise), I want to optimize every little thing possible. Low-complexity algorithms and maths-related techniques like look-up tables should of course make up the bulk of the optimizations as anything else would simply be addressing the symptoms and not eradicating the real reason for bad performance. I am well aware of that.
With that out of the way, my intent here is to optimize algebraic expressions with vectors and matrices that are essentially plain-old-data structs without pointers to data in them (mainly due to the performance drawbacks you get with data on the heap [having to dereference additional pointers, cache considerations etc.]).
I don't care about move-assignment or construction, I just don't want more temporaries being created during the evaluation of a complicated algebraic expression than absolutely necessary (usually just one or two, e.g. a matrix and a vector).
Those are my thoughts that might be erroneous. If they are, please correct me:
1. To achieve this without relying on RVO, return-by-reference is necessary (again: keep in mind I don't have remote resources, only scalar data members).
2. Returning by reference makes the function-call expression an lvalue, implying the returned object is not a temporary, which is bad, but returning by rvalue reference makes the function-call expression an xvalue (see 3.10.1), which is okay in the context of my approach (see 4)
3. Returning by reference is dangerous, because of the possibly short lifetime of objects, but:
4. temporaries are guaranteed to live until the end of the evaluation of the expression they were created in, therefore:
5. making it safe to return by reference from those operators that take at least one rvalue-reference as their argument, if the object referenced by this rvalue reference argument is the one being returned by reference. Therefore:
6. Any arbitrary expression that only employs binary operators can be evaluated by creating only one temporary when not more than one PoD-like type is involved, and the binary operations don't require a temporary by nature (like matrix multiplication)
(Another reason to return by rvalue-reference is because it behaves like returning by value in terms of rvalue-ness of the function-call expression; and it's required for the operator/function-call expression to be an rvalue in order to bind to subsequent calls to operators that take rvalue references. As stated in (2), calls to functions that return by reference are lvalues, and would therefore bind to operators with the signature T operator+(const T&, const T&), resulting in the creation of an unnecessary temporary)
I could achieve the desired performance by using a C-style approach of functions like add(Vector4 *result, Vector4 *v1, Vector4 *v2), but come on, we're living in the 21st century...
In summary, my goal is to create a vector class that achieves the same performance as the C approach using overloaded operators. If that in itself is impossible, then I guess it can't be helped. But I'd appreciate it if someone could explain to me why my approach is doomed to fail (the left-to-right operator evaluation issue that was the initial reason for my post aside, of course).
As a matter of fact, I've been using the "real" vector class this one is a simplification of for a while without any crashes or corrupted memory so far. And in fact, I never actually return local objects as references, so there shouldn't be any problems. I dare say what I'm doing is standard-compliant.
Any help on the original issue would of course be appreciated as well!
many thanks for all the patience again
You should not return an rvalue reference, you should return a value. In addition, you should not specify both a member and a free operator+. I'm amazed that even compiled.
Edit:
r = v1 + v2 + (v3 + v4) + v5;
How could you possibly only have one temporary value when you're performing two sub-computations? That's just impossible. You can't re-write the Standard and change this.
You will just have to trust your users to do something not completely stupid, like write the above line of code, and expect to have just one temporary.
I recommend modeling your code after the basic_string operator+() found in chapter 21 of N3225.
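A sketch of what that recommendation might look like for Vector4 follows. It mirrors the shape of the basic_string operator+ overload set: free functions only, one overload per combination of lvalue and rvalue operands, all returning by value. The operator+= member is an addition made for this sketch and is not part of the question's code.

```cpp
#include <utility>

struct Vector4 {
    float x, y, z, w;

    // Added for this sketch: in-place addition used by every overload below.
    Vector4& operator+=(const Vector4& o) {
        x += o.x; y += o.y; z += o.z; w += o.w;
        return *this;
    }
};

// Both operands are lvalues: a new object is needed for the result.
inline Vector4 operator+(const Vector4& a, const Vector4& b) {
    Vector4 r = a;
    r += b;
    return r;
}

// Left operand is a temporary: accumulate into it, then return by value.
inline Vector4 operator+(Vector4&& a, const Vector4& b) {
    return std::move(a += b);
}

// Right operand is a temporary: accumulate into it instead.
inline Vector4 operator+(const Vector4& a, Vector4&& b) {
    return std::move(b += a);
}

// Both operands are temporaries: this overload removes the ambiguity
// mentioned in the question; either operand could be reused.
inline Vector4 operator+(Vector4&& a, Vector4&& b) {
    return std::move(a += b);
}
```

With these overloads, r = v1 + v2 + (v3 + v4) + v5; compiles without ambiguity and never returns a reference to a temporary. For a trivially copyable struct like this one, moving is the same as copying, so whether the rvalue overloads save anything over the single const-reference overload is left to the optimizer; that part is an assumption to measure, not a guarantee.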
