I've inherited a C++98 codebase which has two major uses of memset() on C++ classes, with macros expanded for clarity:
// pattern #1:
Obj o;
memset(&o, 0, sizeof(o));
// pattern #2:
// (elsewhere: Obj *o;)
memset(something->o, 0, sizeof(*something->o));
As you may have guessed, this codebase does not use STL or otherwise non-POD classes. When I try to put as little as an std::string into one of its classes, bad things generally happen.
It was my understanding that these patterns could be rewritten as follows in C++11:
// pattern #1
Obj o = {};
// pattern #2
something->o = {};
Which is to say, assignment of {} would rewrite the contents of the object with the default-initialized values in both cases. Nice and clean, isn't it?
Well, yes, but it doesn't work. It works on *nix systems, but results in fairly inexplicable results (in essence, garbage values) when built with VS2013 with v120_xp toolset, which implies that my understanding of initializer lists is somehow lacking.
So, the questions:
Why didn't this work?
What's a better way to replace this use of memset that ensures that members with constructors are properly default-initialized, and which can preferably be applied reliably with little more than search-and-replace (there are unfortunately no tests)? Bonus points if it works on pre-VS2013.
The behavior of brace-initialization depends on what kind of object you try to initialize.
On aggregates (e.g. simple C-style structures) using an empty brace-initializer zero-initializes the aggregate, i.e. it makes all members zero.
On non-aggregates an empty brace-initializer calls the default constructor. If that constructor is user-provided and doesn't explicitly initialize the members, then the members will be constructed but otherwise uninitialized. Members with their own constructors that initialize themselves will be okay, but e.g. an int member will have an indeterminate value. (With a compiler-generated default constructor the standard requires the object to be zero-initialized first, but older compilers do not all implement that rule.)
The best way to solve your problems, IMO, is to add a default constructor (if the classes don't have one already) with an initializer list that explicitly initializes the members.
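A minimal sketch of that suggestion, assuming a hypothetical Obj with two scalar members and one std::string (the real classes in the codebase will differ). It sticks to C++98 syntax, so it should not depend on how a particular compiler treats empty braces:

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for one of the codebase's classes.
struct Obj {
    int         id;
    double      weight;
    std::string name;  // non-POD member, so memset() on Obj is no longer safe

    // Explicitly initialize every member in the mem-initializer list.
    Obj() : id(0), weight(0.0), name() {}
};

// Both construction forms now run the same user-provided constructor.
inline void check_obj_defaults() {
    Obj a;          // default-initialization
    Obj b = Obj();  // value-initialization
    assert(a.id == 0 && a.weight == 0.0 && a.name.empty());
    assert(b.id == 0 && b.weight == 0.0 && b.name.empty());
}
```

Because the constructor is user-provided, the result no longer depends on the zero-initialization step that compilers have handled inconsistently.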
It works on *nix systems, but results in fairly inexplicable results (in essence, garbage values) when built with VS2013 with v120_xp toolset, which implies that my understanding of initializer lists is somehow lacking.
The rules for 'default' initialization have changed from version to version of C++, but VC++ has stuck with the C++98 rules, ignoring even the updates from C++03 I think.
Other compilers have implemented new rules, with gcc at one point even implementing some defect resolutions that hadn't been accepted for future inclusion in the official spec.
So even though what you want is guaranteed by the standard, for the most part it's probably best not to try to rely on the behavior of initialization of members that don't have explicit initializers.
I think placement new is established enough that it works on VS, so you might try:
#include <new>
new(&o) T();
new(something->p) T();
Make sure not to do this on any object that hasn't been allocated and destructed/uninitialized first! (But it was pointed out below that this might fail if a constructor throws an exception.)
You might be able to just assign from a default object, that is, o = T(); or *(something->p) = T();. A good general strategy might be to give each of these POD classes a trivial default constructor with : o() in the initializer-list.
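The two approaches above can be sketched as small helpers. Counter, reset_by_assignment and reset_in_place are hypothetical names used only for illustration, and the sketch assumes every class involved has a default constructor that initializes all of its members:

```cpp
#include <cassert>
#include <new>

// Hypothetical class with a user-provided default constructor.
struct Counter {
    int hits;
    int misses;
    Counter() : hits(0), misses(0) {}
};

// Replacement for memset(&obj, 0, sizeof(obj)): assign a fresh object.
template <typename T>
void reset_by_assignment(T& obj) {
    obj = T();
}

// Placement-new variant: end the old object's lifetime, then rebuild it
// in the same storage. If T() throws, obj is left destroyed.
template <typename T>
void reset_in_place(T& obj) {
    obj.~T();
    new (&obj) T();
}

inline void check_resets() {
    Counter c;
    c.hits = 5;
    reset_by_assignment(c);
    assert(c.hits == 0 && c.misses == 0);
}
```

Pattern #1 would then become reset_by_assignment(o); and pattern #2 reset_by_assignment(*something->p);. The assignment form is probably the safer default, since it never leaves the object in a destroyed state.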
I found a few questions on the site that approach that subject, but none of them seem to do this directly. An example is this, but the answer is not satisfying (it is untested, and doesn't explain why this is correct).
Consider this simple example:
class some_class
{
public:
Eigen::Matrix<double,3,4> M;
std::vector<Eigen::Matrix<double,4,2>,
Eigen::aligned_allocator<Eigen::Matrix<double,4,2>>> M2;
//other stuff
};
Now assume that I need to declare an std::vector of some_class objects. Then, is the declaration
std::vector<some_class,Eigen::aligned_allocator<some_class>>
//Note that it compiles and doesn't seem to cause noticeable run-time problems either
the correct way to do so, or do I have to reimplement an aligned_allocator for that class? I find the documentation a bit short and confusing, since it only states
Using STL containers on fixed-size vectorizable Eigen types, or classes having members of such types requires ...
but it doesn't explicitly say whether one should write an aligned_allocator in such situations.
Is the declaration above safe or not, and why?
Consider a class with a member that can't be stored directly, e.g., because it does not have a default constructor, and the enclosing class's constructor doesn't have enough information to create it:
class Foo
{
public:
Foo(){} // Default ctor
private:
/* Won't build: no default ctor or way to call its
non-default ctor at Foo's ctor. */
Bar m_bar;
};
Clearly, m_bar needs to be stored differently, e.g., through a pointer. A std::unique_ptr seems better, though, as it will destruct it automatically:
std::unique_ptr<Bar> m_bar;
It's also possible to use std::experimental::optional, though:
std::experimental::optional<Bar> m_bar;
My questions are: 1. What are the tradeoffs? and 2. Does it make sense to build a class automating the choice between them?
Specifically, looking at the exception guarantees for the ctor of std::unique_ptr and the exception guarantees for the ctor of std::experimental::optional, it seems clear that the former must perform dynamic allocation and deallocation - runtime speed disadvantages, and the latter stores things in some (aligned) memory buffer - size disadvantages. Are these the only tradeoffs?
If these are indeed the tradeoffs, and given that both types share enough of their interface (ctor, operator*), does it make sense to automate the choice between them with something like
template<typename T>
using indirect_raii = typename std::conditional<
// 20 - arbitrary constant
(sizeof(std::experimental::optional<T>) >
20 + sizeof(std::unique_ptr<T>)),
std::unique_ptr<T>,
std::experimental::optional<T>>::type;
(Note: there is a question discussing the tradeoffs between these two as return types, but the question and answers focus on what each conveys to the callers of the function, which is irrelevant for these private members.)
IMO there are other trade-offs at play here:
unique_ptr is not copyable or copy-assignable, while optional is.
I suppose one thing you could do is make indirect_RAII a class-type and conditionally add definitions to make it copyable by calling Bar's copy ctor, even when unique_ptr is selected. (Or conversely, disable copying when it's an optional.)
optional types can have a constexpr constructor -- you can't really do the equivalent thing with a unique_ptr at compile-time.
Bar can be incomplete at the time that unique_ptr<Bar> is constructed. It cannot be incomplete at the time that optional<Bar> is instantiated. In your example I guess you assume that Bar is complete since you take its size, but potentially you might want to implement a class using indirect_RAII where this isn't the case.
Even in cases where Bar is large, you still may find that e.g. std::vector<Foo> will perform better when optional is selected than when unique_ptr is. I would expect this to happen in cases where the vector is populated once, and then iterated over many times.
It may be that as a general rule of thumb, your size rule is good for common use in your program, but I guess for "common use" it doesn't really matter which one you pick. An alternative to using your indirect_RAII type is, just pick one or the other in each case, and in places where you would have taken advantage of the "generic interface", pass the type as a template parameter when necessary. And in performance-critical areas, make the appropriate choice manually.
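The copyability point can be sketched with a small wrapper that restores copying on top of unique_ptr by cloning the pointee. cloning_ptr is a hypothetical name, not a standard type, and the sketch assumes T itself is copy-constructible:

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

// Hypothetical owning pointer that deep-copies its pointee.
template <typename T>
class cloning_ptr {
public:
    cloning_ptr() = default;
    explicit cloning_ptr(T value) : ptr_(new T(std::move(value))) {}

    // Copying allocates a new T through T's copy constructor.
    cloning_ptr(const cloning_ptr& other)
        : ptr_(other.ptr_ ? new T(*other.ptr_) : nullptr) {}
    cloning_ptr(cloning_ptr&&) noexcept = default;

    // Taking the argument by value covers copy and move assignment.
    cloning_ptr& operator=(cloning_ptr other) noexcept {
        ptr_ = std::move(other.ptr_);
        return *this;
    }

    explicit operator bool() const noexcept { return static_cast<bool>(ptr_); }
    T& operator*() const { return *ptr_; }
    T* operator->() const noexcept { return ptr_.get(); }

private:
    std::unique_ptr<T> ptr_;
};

static_assert(!std::is_copy_constructible<std::unique_ptr<int>>::value,
              "unique_ptr itself cannot be copied");
static_assert(std::is_copy_constructible<cloning_ptr<int>>::value,
              "the wrapper can");

inline void check_cloning_ptr() {
    cloning_ptr<int> a(1);
    cloning_ptr<int> b = a;  // deep copy
    *b = 2;
    assert(*a == 1 && *b == 2);
}
```

Every copy costs an allocation here, which is one more reason the choice between the two storage strategies is hard to automate by size alone.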
Consider the following class, with a move constructor and move assignment operator:
#include <cstdint>
#include <utility>

class my_class
{
protected:
double *my_data;
uint64_t my_data_length;
public:
// Initializers are listed in declaration order, which is the order they run in
my_class(my_class&& other) noexcept : my_data{other.my_data}, my_data_length{other.my_data_length}
{
// Steal the data
other.my_data = nullptr;
other.my_data_length = 0;
}
my_class& operator=(my_class&& other) noexcept
{
// Steal the data; other's destructor releases what we held before
std::swap(my_data_length, other.my_data_length);
std::swap(my_data, other.my_data);
return *this;
}
};
What is the purpose of noexcept here? I know that it hints to the compiler that no exceptions should be thrown by the following function, but how does this enable compiler optimizations?
The special importance of noexcept on move constructors and assignment operators is explained in detail in https://vimeo.com/channels/ndc2014/97337253
Basically, it doesn't enable "optimisations" in the traditional sense of allowing the compiler to generate better code. Instead it allows other types, such as containers in the library, to take a different code path when they can detect that moving the element types will never throw. That can enable taking an alternate code path that would not be safe if they could throw (e.g. because it would prevent the container from meeting exception-safety guarantees).
For example, when you do push_back(t) on a vector, if the vector is full (size() == capacity()) then it needs to allocate a new block of memory and copy all the existing elements into the new memory. If copying any of the elements throws an exception then the library just destroys all the elements it created in the new storage and deallocates the new memory, leaving the original vector unchanged (thus meeting the strong exception-safety guarantee). It would be faster to move the existing elements to the new storage, but if moving could throw then any already-moved elements would have been altered already and meeting the strong guarantee would not be possible, so the library will only try to move them when it knows that can't throw, which it can only know if they are noexcept.
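The reallocation behavior described above can be observed with a small counting experiment. This is a sketch with hypothetical types (Tally, NothrowMove, MayThrowMove); the counts it checks are what mainstream standard library implementations produce, not something the sketch itself guarantees:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Tally { int copies; int moves; };

// Move constructor promises not to throw.
struct NothrowMove {
    static Tally tally;
    NothrowMove() {}
    NothrowMove(const NothrowMove&) { ++tally.copies; }
    NothrowMove(NothrowMove&&) noexcept { ++tally.moves; }
};
Tally NothrowMove::tally = {0, 0};

// Same type, but the move constructor is not declared noexcept.
struct MayThrowMove {
    static Tally tally;
    MayThrowMove() {}
    MayThrowMove(const MayThrowMove&) { ++tally.copies; }
    MayThrowMove(MayThrowMove&&) { ++tally.moves; }
};
Tally MayThrowMove::tally = {0, 0};

// Counts the copies and moves caused by one forced reallocation.
template <typename T>
Tally tally_after_reallocation(std::size_t n) {
    std::vector<T> v(n);          // n default-constructed elements
    T::tally = Tally{0, 0};
    v.reserve(v.capacity() + 1);  // forces a move to new storage
    return T::tally;
}

inline void check_reallocation() {
    Tally fast = tally_after_reallocation<NothrowMove>(3);
    Tally safe = tally_after_reallocation<MayThrowMove>(3);
    assert(fast.moves == 3 && fast.copies == 0);  // elements were moved
    assert(safe.copies == 3 && safe.moves == 0);  // elements were copied
}
```

The only difference between the two types is the noexcept keyword, yet the vector falls back to copying for the second one in order to keep the strong guarantee.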
IMHO using noexcept will not enable any compiler optimization on its own. There are traits in STL:
std::is_nothrow_move_constructible
std::is_nothrow_move_assignable
STL containers like vector use these traits to test the type T and, when they hold, use the move constructor and move assignment instead of the copy constructor and copy assignment.
Why does the STL use these traits instead of:
std::is_move_constructible
std::is_move_assignable
Answer: to provide strong exception guarantee.
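What these traits report can be checked at compile time. The sketch below uses two hypothetical types, and moves_on_reallocation is an illustrative helper that mirrors the condition behind std::move_if_noexcept (move if moving cannot throw, or if there is no copy constructor to fall back on):

```cpp
#include <type_traits>

struct NoexceptMover {
    NoexceptMover() {}
    NoexceptMover(const NoexceptMover&) {}
    NoexceptMover(NoexceptMover&&) noexcept {}
};

struct PlainMover {
    PlainMover() {}
    PlainMover(const PlainMover&) {}
    PlainMover(PlainMover&&) {}  // may throw as far as the compiler knows
};

// Hypothetical helper: would a container pick the move constructor?
template <typename T>
constexpr bool moves_on_reallocation() {
    return std::is_nothrow_move_constructible<T>::value
        || !std::is_copy_constructible<T>::value;
}

// Both types are move-constructible, so the weaker trait cannot tell them apart.
static_assert(std::is_move_constructible<NoexceptMover>::value, "");
static_assert(std::is_move_constructible<PlainMover>::value, "");

// Only the nothrow trait distinguishes them.
static_assert(std::is_nothrow_move_constructible<NoexceptMover>::value, "");
static_assert(!std::is_nothrow_move_constructible<PlainMover>::value, "");

static_assert(moves_on_reallocation<NoexceptMover>(), "");
static_assert(!moves_on_reallocation<PlainMover>(), "");
```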
First of all, I would remark that nothing in a move constructor or move assignment operator should throw, and there seems to be no need for it. All these functions have to do is take over already-allocated memory and the pointers to it. Normally you should not call any other functions that can throw, and the moving itself has no need to. On the other hand, something as simple as printing a debug message breaks this rule.
Optimization can happen in different ways: automatically by the compiler, and through different implementations of the code that uses your constructors and assignment operators. Take a look at the STL: it has specializations, selected via type traits, that differ depending on whether your operations can throw.
The compiler itself can optimize better when it has the guarantee that the code never throws. It then has a guaranteed call tree through your code, which can be inlined better, evaluated at compile time, and so on. The minimum optimization is that it no longer has to store the stack-frame information needed to handle a throw, such as which variables on the stack must be destroyed.
There was also a question here: noexcept, stack unwinding and performance
Maybe your question is a duplicate to that?
A maybe helpful question related to this I found here: Are move constructors required to be noexcept?
This discusses the need for throwing in move operations.
What is the purpose of noexcept here?
At minimum it saves some program space, which is relevant not only to move operations but to all functions. And if your class is used with STL containers or algorithms, it can be handled differently, which can result in better optimization if your STL implementation uses this information. The compiler may also be able to optimize better in general because of the known call tree, if everything else is a compile-time constant.
I have a class that's using an std::discrete_distribution which can take an std::initializer_list OR a couple of iterators. My class is in some ways wrapping the discrete_distribution so I really wanted to mimic the ability to take an std::initializer_list which would then be passed down.
This is simple.
However, the std::initializer_list will always be constructed through some unknown values. So, if it was just a std::discrete_distribution I would just construct from iterators of some container. However, for me to make that available via my class, I would need to templatize the class for the Iterator type.
I don't want to template my class because it's only occasionally that it would use the initializer_list, and the cases where it doesn't, it uses an std::uniform_int_distribution which would make this template argument, maybe confusing.
I know I can default the template argument, and I know that I could just define only vector::iterators if I wanted; I'd just rather not.
According to the documentation, a non-empty std::initializer_list cannot be constructed directly in standard C++. BTW, it is the same for C stdarg(3) va_list (and probably for similar reasons, because variadic function argument passing is implementation specific and generally has its own ABI peculiarities; see however libffi).
In GCC, std::initializer_list is somehow known to the C++ compiler (likewise <stdarg.h> uses some builtin things from the C compiler), and has special support.
The C++11 standard (more exactly its n3337 draft, which is almost exactly the same) says in §18.9.1 that std::initializer_list has only a default constructor, which creates an empty list, and refers to §8.5.4 (list-initialization).
You probably should use std::vector and its iterators in your case.
As a rule of thumb and intuitively, std::initializer_list is useful for compile-time known argument lists, and if you want to handle run-time known arguments (with the "number" of "arguments" unknown at compile time) you should provide a constructor for that case (either taking some iterators, or some container, as arguments).
If your class has a constructor accepting std::initializer_list<int> it probably should have another constructor accepting std::vector<int> or std::list<int> (or perhaps std::set<int> if you have some commutativity), then you don't need some weird templates on iterators. BTW, if you want iterators, you would templatize the constructor, not the entire class.
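A sketch of that constructor set, assuming a hypothetical wrapper class named WeightedPicker around std::discrete_distribution<int>. Only the iterator constructor is a template; the class itself is not:

```cpp
#include <cassert>
#include <initializer_list>
#include <random>
#include <vector>

// Hypothetical wrapper; the real class will have more to it.
class WeightedPicker {
public:
    // Compile-time known weights: WeightedPicker p{1.0, 3.0};
    WeightedPicker(std::initializer_list<double> weights) : dist_(weights) {}

    // Run-time known weights held in a container.
    explicit WeightedPicker(const std::vector<double>& weights)
        : dist_(weights.begin(), weights.end()) {}

    // Any iterator pair: the constructor is templated, not the class.
    template <typename InputIt>
    WeightedPicker(InputIt first, InputIt last) : dist_(first, last) {}

    std::vector<double> probabilities() const { return dist_.probabilities(); }

    template <typename Rng>
    int operator()(Rng& rng) { return dist_(rng); }

private:
    std::discrete_distribution<int> dist_;
};

inline void check_weighted_picker() {
    std::vector<double> w;
    w.push_back(1.0);
    w.push_back(3.0);
    WeightedPicker from_list{1.0, 3.0};
    WeightedPicker from_vector(w);
    WeightedPicker from_range(w.begin(), w.end());
    assert(from_list.probabilities() == from_vector.probabilities());
    assert(from_vector.probabilities() == from_range.probabilities());
}
```

All three constructors forward to constructors that std::discrete_distribution already provides, so the wrapper never needs to build an initializer_list itself.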
This is my attempt to start a collection of GCC special features that one does not usually encounter. It comes after @jlebedev mentioned, in another question, the "Effective C++" option for g++,
-Weffc++
This option warns about C++ code which breaks some of the programming guidelines given in the books "Effective C++" and "More Effective C++" by Scott Meyers. For example, a warning will be given if a class which uses dynamically allocated memory does not define a copy constructor and an assignment operator. Note that the standard library header files do not follow these guidelines, so you may wish to use this option as an occasional test for possible problems in your own code rather than compiling with it all the time.
What other cool features are there?
From time to time I go through the current GCC/G++ command line parameter documentation and update my compiler script to be even more paranoid about any kind of coding error. Here it is if you are interested.
Unfortunately I didn't document them so I forgot most, but -pedantic, -Wall, -Wextra, -Weffc++, -Wshadow, -Wnon-virtual-dtor, -Wold-style-cast, -Woverloaded-virtual, and a few others are always useful, warning me of potentially dangerous situations. I like this aspect of customizability, it forces me to write clean, correct code. It served me well.
However they are not without headaches, especially -Weffc++. Just a few examples:
It requires me to provide a custom copy constructor and assignment operator if there are pointer members in my class, which are useless since I use garbage collection. So I need to declare empty private versions of them.
My NonInstantiable class (which prevents instantiation of any subclass) had to implement a dummy private friend class so G++ didn't whine about "only private constructors and no friends"
My Final<T> class (which prevents subclassing of T if T derived from it virtually) had to wrap T in a private wrapper class to declare it as friend, since the standard flat out forbids befriending a template parameter.
G++ recognizes functions that never return a return value, and throw an exception instead, and whines about them not being declared with the noreturn attribute. Hiding behind always true instructions didn't work, G++ was too clever and recognized them. Took me a while to come up with declaring a variable volatile and comparing it against its value to be able to throw that exception unmolested.
Floating point comparison warnings. Oh god. I have to work around them by writing x <= y && x >= y instead of x == y where it is acceptable.
Shadowing virtuals. Okay, this is clearly useful to prevent stupid shadowing/overloading problems in subclasses but still annoying.
No previous declaration for functions. Kinda lost its importance as soon as I started copypasting the function declaration right above it.
It might sound a bit masochist, but as a whole, these are very cool features that increased my understanding of C++ and general programming.
What other cool features does G++ have? Well, it's free, open, it's one of the most widely used and modern compilers, consistently outperforms its competitors, can eat almost anything people throw at it, available on virtually every platform, customizable to hell, continuously improved, has a wide community - what's not to like?
A function that returns a value (for example an int) has undefined behavior if a code path is followed that ends the function without a 'return value' statement; in practice the caller typically receives a garbage value. Not paying attention to this can result in exceptions and out-of-range memory writes or reads.
For example if a function is used to obtain the index into an array, and the faulty code path is used (the one that doesn't end with a return 'value' statement) then a random value will be returned which might be too big as an index into the array, resulting in all sorts of headaches as you wrongly mess up the stack or heap.
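A sketch of the array-index case, using a hypothetical find_index function. The faulty version is kept as a comment because calling it on a miss is undefined behavior; GCC reports it through -Wreturn-type, which is part of -Wall:

```cpp
#include <cassert>

// Faulty version: falls off the end when key is absent.
//
//   int find_index(const int* data, int size, int key) {
//       for (int i = 0; i < size; ++i)
//           if (data[i] == key) return i;
//   }   // warning: control reaches end of non-void function

// Fixed version: every code path ends in a return statement.
int find_index(const int* data, int size, int key) {
    for (int i = 0; i < size; ++i) {
        if (data[i] == key) {
            return i;
        }
    }
    return -1;  // explicit "not found" value the caller can test for
}

inline void check_find_index() {
    const int values[] = {10, 20, 30};
    assert(find_index(values, 3, 20) == 1);
    assert(find_index(values, 3, 99) == -1);
}
```

The caller still has to check for -1 before indexing, but it now receives a defined value instead of whatever happened to be in the return register.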