iterate over non-const std::unordered_set - c++11

I have a std::unordered_set that contains instances of class bar.
I'd like to iterate over all the bars in the set and call some void foo(bar& b) function on each one.
You'll probably notice from the function signature that I want foo to change the state of the bar& b parameter in some way.
Now, I do know that foo won't change bar in a way that affects hashing or equality comparisons, but I still have a problem.
However I iterate over the set, the best I can hope for is a const bar& which obviously won't work.
I can think of a couple of possible ways around this:
Use const_cast. Don't know if this will work (yet). It kind of smells bad to me, but I'm happy to be enlightened!!
Use a std::unordered_map instead of std::unordered_set, so that even if I can only get a const of the key, I can just use that key to lookup the bar object and safely call foo on it.
I'd really appreciate some advice!
Thanks in advance!

Some clean solutions have already been shown in this answer.
Another clean way would be to add a layer of indirection through a pointer. Even if the pointer itself is const, the data pointed to will not be:
struct Bar
{
int key;
std::unique_ptr<int> pValue;
};
std::unordered_set< Bar, BarHash, BarEqual > bars;
for( const auto& bar : bars )
{
// Works because only the pointer is constant, not the data pointed to.
*bar.pValue = 42;
}
This obviously has the overhead of an additional memory allocation, the space required to store the pointer and the indirection when accessing the value through the pointer.
You will also have to write a custom copy constructor and an assignment operator if you want to keep value semantics.
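To make the last point concrete, here is a minimal sketch of what those copy operations could look like. It reuses the Bar from the snippet above; the (key, value) constructor and the helper copies_are_independent are illustrative additions, not part of the original answer.

```cpp
#include <cassert>
#include <memory>

struct Bar {
    int key;
    std::unique_ptr<int> pValue;

    // Hypothetical convenience constructor, added for this sketch only.
    Bar(int k, int v) : key(k), pValue(new int(v)) {}

    // Deep copy: the new Bar owns its own int (or nothing, if the source
    // was left empty by an earlier move).
    Bar(const Bar& other)
        : key(other.key),
          pValue(other.pValue ? new int(*other.pValue) : nullptr) {}

    Bar& operator=(const Bar& other) {
        if (this != &other) {
            key = other.key;
            pValue.reset(other.pValue ? new int(*other.pValue) : nullptr);
        }
        return *this;
    }

    // Declaring the copy operations suppresses the implicit moves,
    // so they are requested explicitly.
    Bar(Bar&&) = default;
    Bar& operator=(Bar&&) = default;
};

// Returns true if changing a copy leaves the original untouched.
bool copies_are_independent() {
    Bar a(1, 10);
    Bar b(a);
    *b.pValue = 42;
    return *a.pValue == 10 && *b.pValue == 42;
}
```

With these in place, copying a Bar duplicates the pointed-to value instead of failing to compile.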

Use const_cast. Don't know if this will work (yet). It kind of smells bad to me, but I'm happy to be enlightened!!
Yes, it will work. You can easily have an overload foo(bar const&), const_cast the reference, and then call foo(bar&). I agree with you that it smells bad and points to a flaw in design. You might want to take a fresh look at the design and see if there is a clean solution.
Use a std::unordered_map instead of std::unordered_set, so that even if I can only get a const of the key, I can just use that key to lookup the bar object and safely call foo on it
That is not too different from the first approach. std::unordered_set<T> is essentially std::unordered_map<T, bool>.
Potential clean solutions:
Get a copy of the object from the set, remove the entry from the set, update the copy, and put the copy back in the set. If that proves too expensive ...
Use a std::vector<Bar>. You can get a Bar& from the vector and all is well.
Make the member variables of Bar that don't impact its hash value to be mutable. Then, you can just use foo(Bar const&) and be able to call it directly using a reference to the objects in the set.
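The mutable option could be sketched as follows. The layout of Bar, the BarHash and BarEqual functors, and the helper bump_all_and_sum are assumptions made for illustration; the only requirement is that the mutable member takes no part in hashing or equality.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <unordered_set>

// Hypothetical Bar: `key` defines identity, `value` is payload that is
// ignored by the hash and the equality predicate, so it may be mutable.
struct Bar {
    int key;
    mutable int value;
};

struct BarHash {
    std::size_t operator()(const Bar& b) const { return std::hash<int>()(b.key); }
};

struct BarEqual {
    bool operator()(const Bar& a, const Bar& b) const { return a.key == b.key; }
};

// Takes a const reference, yet is allowed to modify the mutable member.
void foo(const Bar& b) { b.value += 1; }

// Calls foo on every element, then returns the sum of all payloads.
int bump_all_and_sum() {
    std::unordered_set<Bar, BarHash, BarEqual> bars;
    bars.insert(Bar{1, 10});
    bars.insert(Bar{2, 20});

    for (const auto& bar : bars) {
        foo(bar);  // no const_cast needed
    }

    int sum = 0;
    for (const auto& bar : bars) {
        sum += bar.value;
    }
    return sum;
}
```

The trade-off is that value can now be changed through any const Bar, not only through the elements stored in the set.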


Is it still necessary to use std move even if auto && has been used

As we know, STL usually offered two kinds of functions to insert an element: insert/push and emplace.
Let's say I want to emplace all of elements from one container to another.
for (auto &&element : myMap)
{
anotherMap.emplace(element); // vs anotherMap.emplace(std::move(element));
}
In this case, if I want to call the emplace, instead of insert/push, must I still call std::move here or not?
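If you indeed want to move all elements from myMap into anotherMap then yes, you must call std::move(). The reason is that element here is still an lvalue. Because myMap is an lvalue, auto&& actually deduces to an lvalue reference in this loop; and even a variable declared as an rvalue reference is an lvalue when used as an expression, because it has a name. Overload resolution will therefore pick the lvalue reference constructor, better known as the copy constructor. Note also that a map's value_type is std::pair<const Key, T>, so even with std::move the key is copied and only the mapped value is moved.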
This is a very common point of confusion. See for example this question.
Always keep in mind that std::move doesn't actually do anything itself, it just guarantees that the overload resolver will see an appropriately-typed rvalue instead of an lvalue associated with a given identifier.
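The difference can be observed with a small experiment. The sketch below uses a hypothetical Tracker type that counts its own copy and move constructions; make_source, copies_without_move and moves_with_move are names invented for this example.

```cpp
#include <cassert>
#include <map>
#include <utility>

// Hypothetical mapped type that records how it gets constructed.
struct Tracker {
    static int copies;
    static int moves;
    Tracker() {}
    Tracker(const Tracker&) { ++copies; }
    Tracker(Tracker&&) { ++moves; }
};
int Tracker::copies = 0;
int Tracker::moves = 0;

std::map<int, Tracker> make_source() {
    std::map<int, Tracker> src;
    src.emplace(1, Tracker());
    src.emplace(2, Tracker());
    return src;
}

// `element` is an lvalue, so emplace copy-constructs each pair.
int copies_without_move() {
    std::map<int, Tracker> src = make_source();
    std::map<int, Tracker> dst;
    Tracker::copies = Tracker::moves = 0;
    for (auto&& element : src) {
        dst.emplace(element);
    }
    assert(Tracker::moves == 0);
    return Tracker::copies;
}

// std::move turns `element` into an rvalue, so the mapped value is moved
// (the const key inside the pair is still copied).
int moves_with_move() {
    std::map<int, Tracker> src = make_source();
    std::map<int, Tracker> dst;
    Tracker::copies = Tracker::moves = 0;
    for (auto&& element : src) {
        dst.emplace(std::move(element));
    }
    assert(Tracker::copies == 0);
    return Tracker::moves;
}
```

After the second loop the mapped values left in src are in a moved-from state and should not be relied upon.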

why use move constructors? clang-tidy modernize-pass-by-value [duplicate]

I saw code somewhere in which someone decided to copy an object and subsequently move it to a data member of a class. This left me in confusion in that I thought the whole point of moving was to avoid copying. Here is the example:
struct S
{
std::string data;
S(std::string str) : data(std::move(str))
{}
};
Here are my questions:
Why aren't we taking an rvalue-reference to str?
Won't a copy be expensive, especially given something like std::string?
What would be the reason for the author to decide to make a copy then a move?
When should I do this myself?
Before I answer your questions, one thing you seem to be getting wrong: taking by value in C++11 does not always mean copying. If an rvalue is passed, that will be moved (provided a viable move constructor exists) rather than being copied. And std::string does have a move constructor.
Unlike in C++03, in C++11 it is often idiomatic to take parameters by value, for the reasons I am going to explain below. Also see this Q&A on StackOverflow for a more general set of guidelines on how to accept parameters.
Why aren't we taking an rvalue-reference to str?
Because that would make it impossible to pass lvalues, such as in:
std::string s = "Hello";
S obj(s); // s is an lvalue, this won't compile!
If S only had a constructor that accepts rvalues, the above would not compile.
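This can be checked at compile time. The sketch below uses a hypothetical RvalueOnly struct (standing in for an S that only has the && constructor) and std::is_constructible to show which argument kinds are accepted; build_from_moved is an illustrative helper.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for an S that only accepts rvalues.
struct RvalueOnly {
    std::string data;
    explicit RvalueOnly(std::string&& str) : data(std::move(str)) {}
};

// An lvalue std::string cannot bind to std::string&&.
static_assert(!std::is_constructible<RvalueOnly, std::string&>::value,
              "lvalues are rejected");
// An rvalue std::string is accepted.
static_assert(std::is_constructible<RvalueOnly, std::string&&>::value,
              "rvalues are accepted");
// A C string works too: it creates a temporary std::string first.
static_assert(std::is_constructible<RvalueOnly, const char*>::value,
              "temporaries created by conversion are accepted");

// The caller has to opt in to giving up the lvalue explicitly.
std::string build_from_moved() {
    std::string s = "Hello";
    RvalueOnly obj(std::move(s));
    return obj.data;
}
```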
Won't a copy be expensive, especially given something like std::string?
If you pass an rvalue, that will be moved into str, and that will eventually be moved into data. No copying will be performed. If you pass an lvalue, on the other hand, that lvalue will be copied into str, and then moved into data.
So to sum it up, two moves for rvalues, one copy and one move for lvalues.
What would be the reason for the author to decide to make a copy then a move?
First of all, as I mentioned above, the first one is not always a copy; and this said, the answer is: "Because it is efficient (moves of std::string objects are cheap) and simple".
Under the assumption that moves are cheap (ignoring SSO here), they can be practically disregarded when considering the overall efficiency of this design. If we do so, we have one copy for lvalues (as we would have if we accepted an lvalue reference to const) and no copies for rvalues (while we would still have a copy if we accepted an lvalue reference to const).
This means that taking by value is as good as taking by lvalue reference to const when lvalues are provided, and better when rvalues are provided.
P.S.: To provide some context, I believe this is the Q&A the OP is referring to.
To understand why this is a good pattern, we should examine the alternatives, both in C++03 and in C++11.
We have the C++03 method of taking a std::string const&:
struct S
{
std::string data;
S(std::string const& str) : data(str)
{}
};
in this case, there will always be a single copy performed. If you construct from a raw C string, a std::string will be constructed, then copied again: two allocations.
There is the C++03 method of taking a reference to a std::string, then swapping it into a local std::string:
struct S
{
std::string data;
S(std::string& str)
{
std::swap(data, str);
}
};
that is the C++03 version of "move semantics", and swap can often be optimized to be very cheap to do (much like a move). It also should be analyzed in context:
S tmp("foo"); // illegal
std::string s("foo");
S tmp2(s); // legal
and forces you to form a non-temporary std::string, then discard it. (A temporary std::string cannot bind to a non-const reference.) Only one allocation is done, however. The C++11 version would take a && and require you to call it with std::move, or with a temporary: this requires that the caller explicitly create a copy outside of the call and move that copy into the function or constructor.
struct S
{
std::string data;
S(std::string&& str): data(std::move(str))
{}
};
Use:
S tmp("foo"); // legal
std::string s("foo");
S tmp2(std::move(s)); // legal
Next, we can do the full C++11 version, that supports both copy and move:
struct S
{
std::string data;
S(std::string const& str) : data(str) {} // lvalue const, copy
S(std::string && str) : data(std::move(str)) {} // rvalue, move
};
We can then examine how this is used:
S tmp( "foo" ); // a temporary `std::string` is created, then moved into tmp.data
std::string bar("bar"); // bar is created
S tmp2( bar ); // bar is copied into tmp2.data
std::string bar2("bar2"); // bar2 is created
S tmp3( std::move(bar2) ); // bar2 is moved into tmp3.data
It is pretty clear that this two-overload technique is at least as efficient as, if not more efficient than, the above two C++03 styles. I'll dub this two-overload version the "most optimal" version.
Now, we'll examine the take-by-copy version:
struct S2 {
std::string data;
S2( std::string arg ):data(std::move(arg)) {}
};
in each of those scenarios:
S2 tmp( "foo" ); // a temporary `std::string` is created, moved into arg, then moved into S2::data
std::string bar("bar"); // bar is created
S2 tmp2( bar ); // bar is copied into arg, then moved into S2::data
std::string bar2("bar2"); // bar2 is created
S2 tmp3( std::move(bar2) ); // bar2 is moved into arg, then moved into S2::data
If you compare this side-by-side with the "most optimal" version, we do exactly one additional move! Not once do we do an extra copy.
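The "one additional move" claim can be verified by counting. The sketch below replaces std::string with a hypothetical Counted type that tallies its copy and move constructions; TwoOverloads and ByValue mirror the two constructor styles discussed above, and from_lvalue and from_moved are helper names invented for this example. A named object passed through std::move is used instead of a temporary so that copy elision does not affect the counts.

```cpp
#include <cassert>
#include <utility>

// Hypothetical payload that counts how it gets constructed.
struct Counted {
    static int copies;
    static int moves;
    Counted() {}
    Counted(const Counted&) { ++copies; }
    Counted(Counted&&) { ++moves; }
};
int Counted::copies = 0;
int Counted::moves = 0;

// The "most optimal" style: one overload per value category.
struct TwoOverloads {
    Counted data;
    TwoOverloads(const Counted& c) : data(c) {}
    TwoOverloads(Counted&& c) : data(std::move(c)) {}
};

// The take-by-value style: a single constructor.
struct ByValue {
    Counted data;
    ByValue(Counted c) : data(std::move(c)) {}
};

struct Counts {
    int copies;
    int moves;
};

// Constructs a Sink from an lvalue and reports what happened.
template <typename Sink>
Counts from_lvalue() {
    Counted src;
    Counted::copies = Counted::moves = 0;
    Sink sink(src);
    (void)sink;
    return Counts{Counted::copies, Counted::moves};
}

// Constructs a Sink from std::move(lvalue) and reports what happened.
template <typename Sink>
Counts from_moved() {
    Counted src;
    Counted::copies = Counted::moves = 0;
    Sink sink(std::move(src));
    (void)sink;
    return Counts{Counted::copies, Counted::moves};
}
```

In both cases ByValue performs the same number of copies as TwoOverloads and exactly one more move.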
So if we assume that move is cheap, this version gets us nearly the same performance as the most-optimal version, with half the code.
And if you are taking, say, 2 to 10 arguments, the reduction in code is exponential: 2x less with 1 argument, 4x with 2, 8x with 3, 16x with 4, and 1024x with 10 arguments.
Now, we can get around this via perfect forwarding and SFINAE, allowing you to write a single constructor or function template that takes 10 arguments, does SFINAE to ensure that the arguments are of appropriate types, and then moves-or-copies them into the local state as required. While this prevents the thousand fold increase in program size problem, there can still be a whole pile of functions generated from this template. (template function instantiations generate functions)
And lots of generated functions means larger executable code size, which can itself reduce performance.
For the cost of a few moves, we get shorter code and nearly the same performance, and often easier to understand code.
Now, this only works because we know, when the function (in this case, a constructor) is called, that we will be wanting a local copy of that argument. The idea is that if we know that we are going to be making a copy, we should let the caller know that we are making a copy by putting it in our argument list. They can then optimize around the fact that they are going to give us a copy (by moving into our argument, for example).
Another advantage of the "take by value" technique is that move constructors are often noexcept. That means the functions that take by value and move out of their argument can often be noexcept, moving any throws out of their body and into the calling scope (which can sometimes avoid them via direct construction, or can construct the items and move them into the argument, to control where throwing happens). Making methods nothrow is often worth it.
This is probably intentional and is similar to the copy and swap idiom. Basically since the string is copied before the constructor, the constructor itself is exception safe as it only swaps (moves) the temporary string str.
You don't want to repeat yourself by writing a constructor for the move and one for the copy:
S(std::string&& str) : data(std::move(str)) {}
S(const std::string& str) : data(str) {}
This is a lot of boilerplate code, especially if you have multiple arguments. Your solution avoids that duplication at the cost of an unnecessary move. (The move operation should be quite cheap, however.)
The competing idiom is to use perfect forwarding:
template <typename T>
S(T&& str) : data(std::forward<T>(str)) {}
The template magic will choose to move or copy depending on the parameter that you pass in. It basically expands to the first version, where both constructors were written by hand. For background information, see Scott Meyers' post on universal references.
From a performance aspect, the perfect forwarding version is superior to your version as it avoids the unnecessary moves. However, one can argue that your version is easier to read and write. The possible performance impact should not matter in most situations, anyway, so it seems to be a matter of style in the end.
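A slightly more defensive sketch of the forwarding constructor is shown below. The enable_if constraint is an addition not present in the snippet above: it restricts the template to arguments a std::string can actually be built from, so that it does not get selected when copying a non-const S. The helper forwarding_demo is illustrative.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

struct S {
    std::string data;

    // Participates in overload resolution only if std::string can be
    // constructed from the forwarded argument.
    template <typename T,
              typename = typename std::enable_if<
                  std::is_constructible<std::string, T&&>::value>::type>
    explicit S(T&& str) : data(std::forward<T>(str)) {}
};

bool forwarding_demo() {
    std::string s = "hello";
    S a(s);                   // lvalue: data is copy-constructed, s is untouched
    S b("literal");           // data is built directly from the C string
    S c(std::string("tmp"));  // rvalue: data is move-constructed
    S d(a);                   // implicit copy constructor, not the template
    return s == "hello" && a.data == "hello" && b.data == "literal" &&
           c.data == "tmp" && d.data == "hello";
}
```

Without the constraint, S d(a) would pick the template with T = S& and fail to compile, which is one reason the by-value version is often considered easier to get right.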

what should be used New() or var in go?

How should an object be created for a struct?
object := new(struct)
or
var object struct
I could not understand when to use which. And if both are the same, which one should be preferred?
The new syntax you're showing returns a pointer, while the other one is a value. Check out this article: https://golang.org/doc/effective_go.html#allocation_new
There's actually one other option, which I prefer. It's called a composite literal and looks like this:
object := &struct{}
The example above is equivalent to your use of new. The cool thing about it is you can specify values for any property in struct within the brackets there.
When to use what is a decision you need to make on a case-by-case basis. In Go there are several reasons I would want one or the other: perhaps only the pointer *myType implements some interface while myType does not, or an instance of myType could contain about 1 GB of data and you want to ensure you're passing a pointer and not the value to other methods, etc. The choice of which to use depends on the use case. Although I will say, pointers are rarely worse, and because that's the case I almost always use them.
When you need a pointer, use new or a composite literal; otherwise use var.
Use var whenever possible, as the value is more likely to be allocated on the stack, with its memory freed as soon as the scope ends. With new, the memory is more likely to be allocated on the heap and will then need to be garbage collected, although the compiler's escape analysis makes the final decision in both cases.

using new vs. { } when initializing a struct in Go

So I know that in Go you can initialize a struct in two different ways. One of them is using the new keyword, which returns a pointer to the struct in memory. Or you can use { } to make a struct. My question is: when is it appropriate to use each?
Thanks
I prefer {} when the full value of the type is known and new() when the value is going to be populated incrementally.
In the former case, adding a new parameter may involve adding a new field initializer. In the latter it should probably be added to whatever code is composing the value.
Note that the &T{} syntax is only allowed when T is a struct, array, slice or map type.
Going off of what @Volker said, it's generally preferable to use &A{} for pointers (and this doesn't necessarily have to be zero values: if I have a struct with a single integer in it, I could do &A{1} to initialize the field). This is mostly a stylistic concern: new(A) and &A{} are equivalent, and neither necessarily allocates memory on the heap. If the Go compiler can prove that the pointer never escapes the function, it will simply allocate the struct as a local variable, which is much cheaper than a heap allocation.
Most people use A{} to create a zero value of type A, and &A{} to create a pointer to a zero value of type A. Using new is only necessary for int and the like, as int{} is a no-go.

std::shared_ptr assignment of data vs. memcpy

I am using std::shared_ptr in C++11 and I would like to understand if it's better to assign structures of type T in this way:
T a_data;
std::shared_ptr<T> my_pointer(new T);
*my_pointer = a_data;
or like:
memcpy(&my_pointer, data, sizeof(T));
or like:
my_pointer.reset(a_data);
Regards
Mike
They each do a different thing.
1.
T a_data;
std::shared_ptr<T> my_pointer(new T);
*my_pointer = a_data;
Here, a new object (call it n) of type T will be allocated, managed by my_pointer. Then, object a_data will be copy-assigned into n.
2.
memcpy(&my_pointer, a_data, sizeof(T)); // I assume you meant a_data here, not data
That's total nonsense - that's overwriting the shared_ptr itself with the contents of a_data. Undefined behaviour at its finest (expect a crash or memory corruption).
Perhaps you actually meant my_pointer.get() instead of &my_pointer (that is, you wanted to copy into the object being pointed to)? If that's the case, it can work, as long as T is trivially copyable - which means that it doesn't have non-trivial copy or move ctors, doesn't have non-trivial copy or move assignment operators, and has a trivial destructor. But why rely on that, when normal assignment (*my_pointer = a_data;) does exactly the same for that case, and also works for non-trivially-copyable classes?
3.
my_pointer.reset(a_data);
This normally won't compile as-is; it would need to be my_pointer.reset(&a_data);. That's a disaster waiting to happen - you point my_pointer to the automatic (= local) variable a_data and give it ownership of that. Which means that when my_pointer goes out of scope (actually, when the last pointer sharing ownership with it does), it will call the deleter, which normally calls delete. On a_data, which was not allocated with new. Welcome to UB land again!
If you just need to manage a dynamically-allocated copy of a_data with a shared_ptr, do this:
T a_data;
std::shared_ptr<T> my_pointer(new T(a_data));
Or even better:
T a_data;
auto my_pointer = std::make_shared<T>(a_data);
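As a small self-contained illustration of the recommended form, the sketch below uses a hypothetical Data struct in place of T and a helper named share_copy (both are assumptions for this example). It shows that the shared_ptr manages an independent heap copy, so later changes do not affect the original object.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for T.
struct Data {
    int x;
};

// Returns a shared_ptr that owns a heap-allocated copy of `source`.
std::shared_ptr<Data> share_copy(const Data& source) {
    return std::make_shared<Data>(source);
}

// Returns true if the managed copy and the original are independent.
bool copy_is_independent() {
    Data a_data{7};
    std::shared_ptr<Data> my_pointer = share_copy(a_data);
    my_pointer->x = 9;   // modifies only the managed copy

    Data other{1};
    *my_pointer = other; // plain assignment into the managed object

    return a_data.x == 7 && my_pointer->x == 1;
}
```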
