C++ Return two large std::vector - c++11

I have a function that needs to return a pair of std vectors (different dimension). I could also return a pair or tuple. What I want to avoid is copying the whole vector just to return it.
Will something like this code :
return make_pair(vec1, vec2_diffDim);
Duplicate the vectors or will it use a reference?

Assuming vec1 and vec2_diffDim are variables local to your function, you should use make_pair(std::move(vec1), std::move(vec2_diffDim)). make_pair<T1,T2> accepts universal references of types T1 and T2. However, since only rvalues can bind to the rvalue reference overload, make_pair(vec1, vec2_diffDim) will bind to the version that makes a copy of both arguments. The copy of the returned pair can be elided. So, you are guaranteed that it will be moved at least.

Related

Is it still necessary to use std move even if auto && has been used

As we know, STL usually offered two kinds of functions to insert an element: insert/push and emplace.
Let's say I want to emplace all of elements from one container to another.
for (auto &&element : myMap)
{
anotherMap.emplace(element); // vs anotherMap.empalce(std::move(element));
}
In this case, if I want to call the emplace, instead of insert/push, must I still call std::move here or not?
If you indeed want to move all elements from myMap into anotherMap then yes you must call std::move(). The reason is that element here is still an lvalue. Its type is rvalue reference as declared, but the expression itself is still an lvalue, and thus the overload resolution will give back the lvalue reference constructor better known as the copy constructor.
This is a very common point of confusion. See for example this question.
Always keep in mind that std::move doesn't actually do anything itself, it just guarantees that the overload resolver will see an appropriately-typed rvalue instead of an lvalue associated with a given identifier.

Java-8 stream expression to 'OR' several enum values together

I am aggregating a bunch of enum values (different from the ordinal values) in a foreach loop.
int output = 0;
for (TestEnum testEnum: setOfEnums) {
output |= testEnum.getValue();
}
Is there a way to do this in streams API?
If I use a lambda like this in a Stream<TestEnum> :
setOfEnums.stream().forEach(testEnum -> (output |= testEnum.getValue());
I get a compile time error that says, 'variable used in lambda should be effectively final'.
Predicate represents a boolean valued function, you need to use reduce method of stream to aggregate bunch of enum values.
if we consider that you have HashSet as named SetOfEnums :
//int initialValue = 0; //this is effectively final for next stream pipeline if you wont modify this value in that stream
final int initialValue = 0;//final
int output = SetOfEnums.stream()
.map(TestEnum::getValue)
.reduce(initialValue, (e1,e2)-> e1|e2);
You nedd to reduce stream of enums like this:
int output = Arrays.stream(TestEnum.values()).mapToInt(TestEnum::getValue).reduce(0, (acc, value) -> acc | value);
I like the recommendations to use reduction, but perhaps a more complete answer would illustrate why it is a good idea.
In a lambda expression, you can reference variables like output that are in scope where the lambda expression is defined, but you cannot modify the values. The reason for that is that, internally, the compiler must be able to implement your lambda, if it chooses to do so, by creating a new function with your lambda as its body. The compiler may choose to add parameters as needed so that all of the values used in this generated function are available in the parameter list. In your case, such a function would definitely have the lambda's explicit parameter, testEnum, but because you also reference the local variable output in the lambda body, it could add that as a second parameter to the generated function. Effectively, the compiler might generate this function from your lambda:
private void generatedFunction1(TestEnum testEnum, int output) {
output |= testEnum.getValue();
}
As you can see, the output parameter is a copy of the output variable used by the caller, and the OR operation would only be applied to the copy. Since the original output variable wouldn't be modified, the language designers decided to prohibit modification of values passed implicitly to lambdas.
To get around the problem in the most direct way, setting aside for the moment that the use of reduction is a far better approach, you could wrap the output variable in a wrapper (e.g. an int[] array of size 1 or an AtomicInteger. The wrapper's reference would be passed by value to the generated function, and since you would now update the contents of output, not the value of output, output remains effectively final, so the compiler won't complain. For example:
AtomicInteger output = new AtomicInteger();
setOfEnums.stream().forEach(testEnum -> (output.set(output.get() | testEnum.getValue()));
or, since we're using AtomicInteger, we may as well make it thread-safe in case you later choose to use a parallel Stream,
AtomicInteger output = new AtomicInteger();
setOfEnums.stream().forEach(testEnum -> (output.getAndUpdate(prev -> prev | testEnum.getValue())));
Now that we've gone over an answer that most resembles what you asked about, we can talk about the superior solution of using reduction, that other answers have already recommended.
There are two kinds of reduction offered by Stream, stateless reduction (reduce(), and stateful reduction (collect()). To visualize the difference, consider a conveyer belt delivering hamburgers, and your goal is to collect all of the hamburger patties into one big hamburger. With stateful reduction, you would start with a new hamburger bun, and then collect the patty out of each hamburger as it arrives, and you add it to the stack of patties in the hamburger bun you set up to collect them. In stateless reduction, you start out with an empty hamburger bun (called the "identity", since that empty hamburger bun is what you end up with if the conveyer belt is empty), and as each hamburger arrives on the belt, you make a copy of the previous accumulated burger and add the patty from the new one that just arrived, discarding the previous accumulated burger.
The stateless reduction may seem like a huge waste, but there are cases when copying the accumulated value is very cheap. One such case is when accumulating primitive types -- primitive types are very cheap to copy, so stateless reduction is ideal when crunching primitives in applications such as summing, ORing, etc.
So, using stateless reduction, your example might become:
setOfEnums.stream()
.mapToInt(TestEnum::getValue) // or .mapToInt(testEnum -> testEnum.getValue())
.reduce(0, (resultSoFar, testEnum) -> resultSoFar | testEnum);
Some points to ponder:
Your original for loop is probably faster than using streams, except perhaps if your set is very large and you use parallel streams. Don't use streams for the sake of using streams. Use them if they make sense.
In my first example, I showed the use of Stream.forEach(). If you ever find yourself creating a Stream and just calling forEach(), it is more efficient just to call forEach() on the collection directly.
You didn't mention what kind of Set you are using, but I hope you are using EnumSet<TestEnum>. Because it is implemented as a bit field, It performs much better (O(1)) than any other kind of Set for all operations, even copying. EnumSet.noneOf(TestEnum.class) creates an empty Set, EnumSet.allOf(TestEnum.class) gives you a set of all enum values, etc.

Is c++11 operator[] equivalent to emplace on map insertion?

For C++11, is there still a performance difference between the following?
(for std::map<Foo, std::vector<Bar> > as an example)
map[key] = myVector and map.emplace(key, myVector)
The part I'm not figuring out is the exact internal of operator[]. My understanding so far has been (when key doesn't exist):
Create a new key and the associated empty default vector in place inside the map
Return the reference of the associated empty vector
Assign myVector to the reference???
The point 3 is the part I couldn't understand, how can you assign a new value to a reference in the first place?
Though I cannot sort through point 3 I think somehow there's just a copy/move required. Assuming C++11 will be smart enough to know it's gonna be a move operation, is this whole "[]" assignment then already cheaper than insert()? Is it almost equivalent to emplace()? ---- default construction and move content over, versus construct vector with content directly in place?
There are a lot of differences between the two.
If you use operator[], then the map will default construct the value. The return value from operator[] will be this default constructed object, which will then use operator= to assign to it.
If you use emplace, the map will directly construct the value with the parameters you provide.
So the operator[] method will always use two-stage construction. If the default constructor is slow, or if copy/move construction is faster than copy/move assignment, then it could be problematic.
However, emplace will not replace the value if the provided key already exists. Whereas operator[] followed by operator= will always replace the value, whether there was one there or not.
There are other differences too. If copying/moving throws, emplace guarantees that the map will not be changed. By contrast, operator[] will always insert a default constructed element. So if the later copy/move assignment fails, then the map has already been changed. That key will exist with a default constructed value_type.
Really, performance is not the first thing you should be thinking about when deciding which one to use. You need to focus first on whether it has the desired behavior.
C++17 will provide insert_or_assign, which has the effect of map[] = v;, but with the exception safety of insert/emplace.
how can you assign a new value to a reference in the first place?
It's fundamentally no different from assigning to any non-const reference:
int i = 5;
int &j = i;
j = 30;
i == 30; //This is true.

In what way does this struct-field-aliasing code invoke Undefined Behavior

Given the code:
#include <stdlib.h>
#include <stdint.h>
typedef struct { int32_t x, y; } INTPAIR;
typedef struct { int32_t w; INTPAIR xy; } INTANDPAIR;
void foo(INTPAIR * s1, INTPAIR * s2)
{
s2->y++;
s1->x^=1;
s2->y--;
s1->x^=1;
}
int hey(int x)
{
static INTPAIR dummy;
void *p = calloc(sizeof (INTANDPAIR),1);
INTANDPAIR *p1 = p;
INTPAIR *p2a = p;
INTPAIR *p2b = &p1->xy;
p2b->x = x;
foo(p2b,p2a);
int result= p2b->x;
free(p);
return result;
}
#include <stdio.h>
int main(void)
{
for (int i=0; i<10; i++)
printf("%d.",hey(i));
}
Behavior depends upon gcc optimization level, which implies that gcc thinks
this code invokes Undefined Behavior (the definition of "foo" collapses to nothing, but interestingly the definition of "hey" increments the value passed in). I'm not quite sure what if anything it does that runs afoul of the Standard's rules, though.
The code very deliberately and evilly constructs two pointers such that
s2a->y and s2b->x will alias, but the pointers are deliberately constructed in such a way that both identify legitimate potential objects of type INTPAIR. Because code used calloc to get the memory, all field members have legitimate initial defined values of zero. All accesses to the allocated memory are done via an int32_t member of an INTPAIR*.
I can understand why it would make sense for the Standard to forbid aliasing structure fields in this fashion, but I couldn't find anything in the Standard which actually does so. Is gcc operating in Standard-compliant fashion here, or is it violating some clause in the Standard which isn't referenced by Annex J.2 and doesn't use any of the terms I searched for?
UPDATE:
I felt this answer was OK, but not still a little imprecise, and not cut and dry as to what the UB was. After a lot of very interesting discussion and comments I have tried again with a new answer
The right part of the C99 standard is quoted in this answer. I'm copying it here for convenience. The question and several of the answers are quite thorough.
(C99; ISO/IEC 9899:1999 6.5/7:
An object shall have its stored value accessed only by an lvalue
expression that has one of the following types 73) or 88):
a type compatible with the effective type of the object,
a qualified version of a type compatible with the effective type of
the object,
a type that is the signed or unsigned type corresponding to the
effective type of the object,
a type that is the signed or unsigned type corresponding to a
qualified version of the effective type of the object,
an aggregate or union type that includes one of the aforementioned
types among its members (including, recursively, a member of a
subaggregate or contained union), or
a character type.
73) or 88) The intent of this list is to specify those circumstances in which an object may or may not be aliased.
What is an effective type then? (C99; ISO/IEC 9899:1999 6.5/6:
The effective type of an object for an access to its stored value is the declared type of the object, if any. 87) If a value is stored into an object having no declared type through an lvalue having a type that is not a character type, then the type of the lvalue becomes the effective type of the object for that access and for subsequent accesses that do not modify the stored value. If a value is copied into an object having no declared type using memcpy or memmove, or is copied as an array of character type, then the effective type of the modified object for that access and for subsequent accesses that do not modify the value is the effective type of the object from which the value is copied, if it has one. For all other accesses to an object having no declared type, the effective type of the object is simply the type of the lvalue used for the access.
87) Allocated objects have no declared type.
So at the line p2b->x = x the object at p+4 becomes of effective type INTPAIR. Is it aligned correctly? If it isn't then Undefined Behavior (UB). But to keep it interesting, assume it is as it must be in this case because of the layout of INTANDPAIR.
By the same analysis there are two 8 byte objects, p2a (s2) at #(p+4) and p2b #p. As your example is demonstrating the 2nd element of p2a and the first of p2b end up being aliased.
In the foo(), the object p2b #p+4 is accessed by the normal method via s1->x. But then the "stored value" of object p2b is also accessed by a side effect of modifying a different object p2a #p. Since this falls under none of the bullets of 6.5/7, it is UB. Note that 6.5/7 says only, so objects shall not be accessed in any other ways.
I think the main distinction is that the "object" in question is the whole structure p2a/s2 and p2b/s1, not the integer members. If you change the argument of the function to take the integers and alias them it works "fine" because the function can't know s1 and s2 alias. For example:
void foo2(int *s1, int *s2)
{
(*s2)++;
(*s1)^=1;
(*s2)--;
(*s1)^=1;
}
...
/*foo(p2b,p2a);*/
foo2((int*)p, (int*)p); /* or p+4 or whatever you want */
This more or less confirms that this is the way GCC chose to interpret things: modifying a member is modifying the whole struct object and that since side effects of modifying one object are not on the listed legal ways to indirectly modify a different object, whee! we can do whatever silly thing we feel like doing.
So whether GCC interprets the ambiguities in standard to decide that by deriving s1 and s2 pointers through different typed pointers and then accessing them constitutes indirectly accessing the memory via different original types via p1 and p or whether it interprets the standard in the way I'm suggesting that "object" s2->y modifies is not just the integer but the s2 object, it is UB either way. Or is GCC just being especially snarky and pointing out that if the standard doesn't very clearly specify the semantics of dynamically allocated yet overlapping objects, it is free to do whatever it wants because by definition it is "undefined".
I don't think at this microscopic level anyone other than the standards body can definitively answer whether this should be UB or not because at this level it requires some "interpretation". The GCC's implementers opinion's seem to favor very aggressive interpretations.
I like Linus's reaction to this whole thing. And it is true, why not just be conservative and let the programmer tell the compiler when it is safe? Very Excellent Linus Rant
My previous answer was lacking, maybe not completely wrong, but the sample program is deliberately designed to sidestep each of the more obvious explicit Undefined Behaviors (UB) dictated by the C99 standard, like 6.5/7. But with both GCC (and Clang) this example demonstrates strict aliasing failure like symptoms under optimization. They appear to be assuming s1->y and s2-x can't alias. So, is the compiler wrong? Is this a loophole in the strict aliasing legalese?
Short answer: No. I wouldn't be surprised if there was a loophole of some kind in the standard, given its complexity. But in this example, creating overlapping objects on the heap is explicitly undefined behavior, and there are several other things happening that the standard does not define.
I think the point of the example is not that it fails - it is obvious that "playing fast and loose" with pointers is a bad idea and relying on corner cases and legalese to prove the compile "wrong" is of little help if the code doesn't work. The key questions are: is GCC wrong? and what in the standard says so.
First, lets look at the obvious strict aliasing rules and how this example is trying to avoid them.
C99 6.5/7:
An object shall have its stored value accessed only by an lvalue expression that has one of the following types: 76)
a type compatible with the effective type of the object,
a qualified version of a type compatible with the effective type of the object,
a type that is the signed or unsigned type corresponding to the effective type of the object,
a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
a character type.
This is the main strict aliasing section. It means that accessing the same memory via two different type pointers is UB. This example sidesteps it by accessing both using INTPAIR pointers in foo().
The key problem with this is that it is talking about accessing the stored value via two different effective types (e.g. pointers). It doesn't talk about accessing via two different objects.
What is being accessed? is it the integer member or the entire object s1 / s2? Is accessing s2->x via s1->y access via "a type compatible with the effective type of the object". I believe an argument can be made that a) the access as a side effect of modifying a different object does not fall under the permissible methods in 6.5/7 and that b) modifying one member of the aggregate transitively modifies the aggregate (*s1 or *s2) also.
Since this is not specified, it is UB, but it is a bit hand-wavy.
How did we get pointers to two overlapping objects? Are the pointer casts leading to them OK? Section 6.3.2.3 contains the rules for casting pointers and the example carefully does not violate any of them. In particular, because p2b is a pointer to INTANDPAIR member xy the alignment is guaranteed to be right, otherwise it would definitely run afoul of 6.3.2.3/7.
Furthermore, &p1->xy is not a problem - it can't be - it is a perfectly legitimate pointer to an INTPAIR. Simply casting pointers and/or taking addresses is safely outside the definition of "access" (3.1/1).
It is clear that the problem comes about by accessing two integer members that overlay each other as different parts of overlapping objects. Any attempt to do this via pointers of different types would clearly run afoul of 6.5/7. If accessed by the same type pointer at the same address, there would be no problem whatsoever. So the only way left that they could alias this way is that if two objects at different addresses overlapped in some fashion.
Obviously this could occur as part of a union, but that is not the case for this example. Type punning through unions may not be UB in C99, but it would be a different question whether a variant of this example could be made misbehave via unions.
The example uses dynamic allocation and casts the resultant void pointer to two different types. Going from from a pointer to an object to void * and back again is valid (6.3.2.3/1). Several other ways of obtaining pointers to objects that would overlap are explicitly UB by the pointer conversion rules of 6.3.2.3, the aliasing rules of 6.5/7, and/or the compatible type rules 6.2.7.
So what else is wrong?
6.2.4 Storage durations of objects
1 An object has a storage duration that determines its lifetime. There are three storage durations: static, automatic, and allocated. Allocated storage is described in 7.20.3
The storage for each of the objects is allocated by calloc() so the duration we want is "allocated". So we check 7.20.3: (emphasis added)
7.20.3 Memory management functions
1 The order and contiguity of storage allocated by successive calls to the calloc, malloc, and realloc functions is unspecified. The pointer returned if the allocation succeeds is suitably aligned so that it may be assigned to a pointer to any type of object and then used to access such an object or an array of such objects in the space allocated (until the space is explicitly deallocated). The lifetime of an allocated object extends from the allocation until the deallocation. Each such allocation shall yield a pointer to an object disjoint from any other object.
...
2 The lifetime of an object is the portion of program execution during which storage is guaranteed to be reserved for it. An object exists, has a constant address, 25) and retains its last-stored value throughout its lifetime. 26) If an object is referred to outside of its lifetime, the behavior is undefined.
To avoid UB, the accesses to the two different objects must be to a valid object within its lifetime. You can get a single valid object (or an array) with malloc()/calloc(), but these guarantee that you will receive a pointer disjoint from all other objects. So is the object returned from calloc() p or is it p1? It can't be both.
The UB is triggered by attempting to reuse the same dynamically allocated object to hold two objects that are not disjoint. While calloc() guarantees it will return a pointer to a disjoint object, there is nothing that says it will still work if you then start using parts of the buffer for a 2nd overlapping one. In fact, it even explicitly says it is UB if you access an object outside its lifetime and there is only a single allocation ergo a single lifetime.
Also note:
4. Conformance
In this International Standard, ‘‘shall’’ is to be interpreted as a requirement on an implementation or on a program; conversely, ‘‘shall not’’ is to be interpreted as a prohibition.
If a ‘‘shall’’ or ‘‘shall not’’ requirement that appears outside of a constraint is violated, the behavior is undefined. Undefined behavior is otherwise indicated in this International Standard by the words ‘‘undefined behavior’’ or by the omission of any explicit definition
of behavior. There is no difference in emphasis among these three; they all describe ‘‘behavior that is undefined’’.
For this to be a compiler error it must fail on a program that only uses constructs explicitly defined. Anything else is outside the safe-harbor and is still undefined, even if it the standard doesn't explicitly state that it is Undefined Behavior.

Iterator nested typedefs

I'm creating a custom iterator type, and the only use case right now is std::for_each. But apparently, it's not enough to mimic the pointer interface (I'm only doing forward iteration), there are like, a bajillion nested typedefs. I managed to figure out what to put for iterator_category, but I'm having real trouble figuring out what value_type and pointer and reference should be, because, y'know, I'm not building a container here, it's an iterator. Why would for_each even want to know or care? All it's going to do is forward said on to another function.
If you want to use a type T as an iterator, you must ensure that std::iterator_traits can be specialized for that type. That means you either need to provide the five nested typedefs that it defers to by default, or you need to specialize std::iterator_traits yourself. The five nested typedefs it requires are
difference_type, which is some type that can represent the distance between two iterators (e.g., as would be returned by std::distance)
value_type, which is the type of the object pointed to by the iterator
pointer, which is the return type of the iterator type's operator->. This doesn't necessarily need to be a pointer type and it doesn't necessarily need to be value_type* or value_type const*. For example, if you have an iterator that generates elements, you may not have an object to which you can return a pointer. In that case, you might return an object that wraps the returned element and overloads operator-> itself.
reference, which is the return type of the iterator type's operator*. This doesn't necessarily need to be a reference type and it doesn't necessarily need to be value_type& (or value_type const&). For example, If you're iterating over an immutable range of integers, you might just return the element by value, for performance reasons.
iterator_category, which must be one of the iterator category tags or a type derived from one of those tags: input_iterator_tag, output_iterator_tag, forward_iterator_tag, bidirectional_iterator_tag, and random_access_iterator_tag (all in namespace std). Algorithms can use these to select an optimal algorithm based on the iterator category.
You can't omit any of these; they all have to be defined. That said, sometimes one or more of the typedefs may not make sense. For example, if you have an iterator that generates char elements on the fly, your iterator may not implement operator-> (because char is not a class type). In this case, you might consider just using void for the pointer type, since it should never be used anyway.
value_type is what your iterator iterates over. If iter is an iterator, it's the type of *iter. pointer is the pointer to that, and reference is the reference to that.

Resources