If I have code like this:
int executeTypeErased(const std::function<int()>& f)
{
return f();
}
void test()
{
executeTypeErased([] {
return 42;
});
}
When I call test(), am I guaranteed that no heap allocations occur due to the std::function? I know that I am guaranteed that no heap allocations occur if I wrap this in a std::reference_wrapper, but I'm less certain with an undecorated lambda as an argument, promoted to std::function.
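As far as I know, the standard only recommends that implementations avoid dynamic allocation for small callables; the firm no-throw wording covers function pointers and std::reference_wrapper, not arbitrary lambdas. One way to check a given toolchain is to count calls to a replaced global operator new. The following is a minimal sketch of that idea; g_allocations and allocationsDuringCall are hypothetical names introduced here, and the result for the plain lambda is implementation-specific.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <functional>
#include <new>
#include <utility>

// Hypothetical counter: incremented by every call to the replaced operator new.
static std::size_t g_allocations = 0;

void* operator new(std::size_t size) {
    ++g_allocations;
    if (void* p = std::malloc(size ? size : 1)) return p;
    throw std::bad_alloc{};
}
void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

int executeTypeErased(const std::function<int()>& f) {
    return f();
}

// Returns how many heap allocations happened while the callable was
// converted to std::function and invoked.
template <class Callable>
std::size_t allocationsDuringCall(Callable&& c) {
    const std::size_t before = g_allocations;
    const int result = executeTypeErased(std::forward<Callable>(c));
    assert(result == 42);
    (void)result;
    return g_allocations - before;
}
```

Calling allocationsDuringCall([] { return 42; }) reports what your standard library does for a captureless lambda, and allocationsDuringCall(std::ref(lambda)) does the same for the reference_wrapper case. A count of zero on one implementation says nothing about another.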
The problem that I have is that I'm not allowed to use any heap allocation. And I've got a function that needs to return a pointer to an abstract class.
Example:
class Base
{
public:
virtual Base* func() = 0;
};
class Foo : public Base
{
Base* func() override
{
// return new Foo{}; // usual approach with heap allocation
// Foo result{}; // undefined behaviour.
// return &result;
}
};
This example might be a little over simplified, but it shows the problem. How could I implement Foo::func without heap allocation?
Maybe this would be an option for you: you could pass the address of the memory where the new Foo object is to be created to the function and then create the object using placement new.
This memory, then, could be on the stack, avoiding heap allocations. Of course you can also pass previously heap-allocated memory, which would still avoid an allocation at the point of creation in Foo::func.
See below:
class Base
{
public:
virtual Base* func(std::byte*) = 0;
virtual ~Base() = default;
};
class Foo : public Base
{
public:
Base* func(std::byte* pMem) override
{
return new (pMem) Foo{};
}
};
int main() {
// Buffer on stack
alignas(Foo) std::byte buf[sizeof(Foo)]; // storage must be suitably aligned for Foo
Foo f{};
auto pNew = f.func(buf);
// Manually call destructor before buffer goes out of scope
pNew->~Base();
return 0;
}
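If calling the destructor by hand feels error-prone, one option is to let a std::unique_ptr with a custom deleter do it. This is only a sketch: DestroyOnly, InPlacePtr, id() and the destroyed counter are hypothetical names added for illustration, and the deleter deliberately runs the destructor without freeing anything, because the memory belongs to the caller's buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>

class Base {
public:
    virtual Base* func(std::byte* mem) = 0;
    virtual int id() const = 0;  // hypothetical, only used by the demo
    virtual ~Base() = default;
};

class Foo : public Base {
public:
    Base* func(std::byte* mem) override { return new (mem) Foo{}; }
    int id() const override { return 1; }
    ~Foo() override { ++destroyed; }
    static inline int destroyed = 0;  // hypothetical counter for the demo
};

// Deleter that only runs the destructor; the storage is not freed.
struct DestroyOnly {
    void operator()(Base* p) const noexcept { p->~Base(); }
};
using InPlacePtr = std::unique_ptr<Base, DestroyOnly>;

// Returns how many Foo objects were destroyed when the smart pointer
// left its scope.
int destroyedAfterScope() {
    alignas(Foo) std::byte buf[sizeof(Foo)];  // buffer on the stack
    Foo prototype{};
    const int before = Foo::destroyed;
    {
        InPlacePtr p{prototype.func(buf)};
        assert(p->id() == 1);
    }  // p runs ~Foo() here, before buf goes out of scope
    return Foo::destroyed - before;
}
```

The buffer must be declared before the smart pointer so that it outlives the object placed in it.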
Say I'm making a general-purpose collection of some sort, and there are 4-5 points where a user might want to choose implementation A or B. For instance:
homogeneous or heterogeneous
do we maintain a count of the contained objects, which is slower
do we have it be thread-safe or not
I could just make 16 or 32 implementations, with each combination of features, but obviously this won't be easy to write or maintain.
I could pass in boolean flags to the constructor, that the class could check before doing certain operations. However, the compiler doesn't "know" what those arguments were so has to check them every time, and just checking enough boolean flags itself imposes a performance penalty.
So I'm wondering if template arguments can somehow be used so that at compile time the compiler sees if (false) or if (true) and therefore can completely optimize out the condition test, and if false, the conditional code. I've only found examples of templates as types, however, not as compile-time constants.
The main goal would be to utterly eliminate those calls to lock mutexes, increment and decrement counters, and so on, but additionally, if there's some way to actually remove the mutex or counters from the object structure as well, that would be truly optimal.
Before C++17, conditional compilation of this kind was mostly done through template specialization, either by specializing the function itself:
template<class T> void f(T &t); // primary template, specialized below
template<> void f<int>(int &) {
std::cout << "Locking an int...\n";
std::cout << "Unlocking an int...\n";
}
template<> void f<std::mutex>(std::mutex &m) {
m.lock();
m.unlock();
}
But this tends to scatter near-duplicate function bodies (which I suspect would happen in your case), so a sounder alternative is to extract all the type-specific parts into a static interface and define a static implementation of it for each concrete type:
template<class T> struct lock_traits; // interface
template<> struct lock_traits<int> {
static void lock(int &) { std::cout << "Locking an int...\n"; }
static void unlock(int &) { std::cout << "Unlocking an int...\n"; }
};
template<> struct lock_traits<std::mutex> {
static void lock(std::mutex &m) { m.lock(); }
static void unlock(std::mutex &m) { m.unlock(); }
};
template<class T> void f(T &t) {
lock_traits<T>::lock(t);
lock_traits<T>::unlock(t);
}
C++17 finally introduced if constexpr, so branches that are not taken no longer have to compile for every type:
template<class T> void f(T &t) {
if constexpr (std::is_same_v<T, std::mutex>) {
t.lock();
}
else if constexpr (std::is_same_v<T, int>) {
std::cout << "Locking an int...\n";
}
if constexpr (std::is_same_v<T, std::mutex>) {
t.unlock();
}
// forgot to unlock an int here :(
}
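Templates also accept compile-time constants such as bool, which maps directly onto the feature flags in the question. Below is a minimal C++17 sketch, assuming a hypothetical Bag collection with two flags; NoMutex and NoCounter are made-up placeholder types. When a flag is false, the corresponding member is an empty type and the if constexpr branch is discarded.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <type_traits>
#include <vector>

// Hypothetical stand-ins used when a feature is switched off.
struct NoMutex {
    void lock() noexcept {}
    void unlock() noexcept {}
};
struct NoCounter {};

template <bool ThreadSafe, bool Counted>
class Bag {
    using Mutex = std::conditional_t<ThreadSafe, std::mutex, NoMutex>;
    using Counter = std::conditional_t<Counted, std::size_t, NoCounter>;

public:
    void add(int value) {
        std::lock_guard<Mutex> guard(mutex_);  // no-op lock for NoMutex
        items_.push_back(value);
        if constexpr (Counted) {
            ++count_;  // only instantiated when Counted is true
        }
    }

    std::size_t size() {
        std::lock_guard<Mutex> guard(mutex_);
        if constexpr (Counted) {
            return count_;
        } else {
            return items_.size();
        }
    }

private:
    std::vector<int> items_;
    Mutex mutex_;
    Counter count_{};
};

// The stripped-down variant does not carry a real mutex or counter.
static_assert(sizeof(Bag<false, false>) <= sizeof(Bag<true, true>));

void bagDemo() {
    Bag<true, true> safe;
    safe.add(1);
    safe.add(2);
    assert(safe.size() == 2);
}
```

An empty member still occupies at least one byte; C++20's [[no_unique_address]] or an empty base class can remove even that.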
First of all, I want to point out that it is the first time I am using dynamic polymorphism and the composite design pattern.
I would like to use the composite design pattern to create a class Tree that can hold objects of type Tree, a composite type, or Leaf, an atomic type. Both Tree and Leaf inherit from a common class Nature. Tree stores Leaf or Tree objects in a std::vector<std::shared_ptr<Nature>> children. I would like to fill the vector children with syntax of the following kind (so I guess I have to use variadic templates to accept an arbitrary number of arguments):
Leaf l0(0);
Leaf l1(1);
Tree t0;
Tree t1;
t0.add(l0,l1);
t1.add(t0,l0,l1); // or in general t1.add(t_00,...,t_0n, l_00,...,l_0n,t10,...,t1n,l10,...,l1n,.... )
Then I would also like to access the elements of a Tree by means of operator[]. So for example t1[0] returns t0 and t1[0][0] returns l0, while t1[0][1] returns l1.
I would also like homogeneous behaviour: either -> or the dot should be used to access the methods on all levels (tree or leaf).
Is it possible to achieve this behaviour?
The implementation of such classes can be like the following:
class Nature
{
public:
virtual void nature_method() = 0;
virtual ~Nature() = default;
//virtual Nature& operator[] (int x);
};
class Leaf: public Nature
{
int value;
public:
Leaf(int val)
{
value = val;
}
void nature_method() override
{
std::cout << " Leaf=="<<value<<" ";
}
};
class Tree: public Nature
{
private:
std::vector <std::shared_ptr< Nature > > children;
int value;
public:
Tree(int val)
{
value = val;
}
void add(const Nature&);
void add(const Leaf& c)
{
children.push_back(std::make_shared<Leaf>(c));
}
void add(const Tree& c)
{
children.push_back(std::make_shared<Tree>(c));
}
void add(std::shared_ptr<Nature> c)
{
children.push_back(c);
}
template<typename...Args>
typename std::enable_if<0==sizeof...(Args), void>::type
add(const Leaf& t,Args...more)
{
children.push_back(std::make_shared<Leaf>(t));
};
template<typename...Args>
typename std::enable_if<0==sizeof...(Args), void>::type
add(const Tree& t,Args...more)
{
children.push_back(std::make_shared<Tree>(t));
};
template<typename...Args>
typename std::enable_if<0<sizeof...(Args), void>::type
add(const Leaf& t,Args...more)
{
children.push_back(std::make_shared<Leaf>(t));
add(more...);
};
template<typename...Args>
typename std::enable_if<0<sizeof...(Args), void>::type
add(const Tree& t,Args...more)
{
children.push_back(std::make_shared<Tree>(t));
add(more...);
};
void nature_method() override
{
std::cout << " Tree=="<< value;
for (int i = 0; i < children.size(); i++)
children[i]->nature_method();
}
};
I could implement the overload operator [] to return a pointer to Nature or a Nature object, like so:
Nature& operator[] (int x) {
return *children[x];
}
std::shared_ptr< Nature > operator[] (int x) {
return children[x];
}
In both cases, the return type is Nature related. This is because the element could be a Leaf or a Tree, which is not known in advance. But since the return type of the operator has to be known at compile time, I cannot do anything else.
However, if the returned type would be Tree related, I cannot use the operator [] anymore, because I have enforced it to be Nature.
How can I dynamically choose the return type, Tree or Leaf related, of []? Is there any workaround for this?
I could make operator [] a virtual method in the Nature class, but I still would not know what to make of this.
I have read about covariant types as well, but I do not know if they would be applicable here.
Thank you.
If you want to be type-safe, the return value of [] will have to be checked at each use site to determine if it is a Tree or a Leaf.
You could also choose not to be type-safe, and invoke undefined behaviour if you use a Leaf in a way that is supposed to be a Tree.
Regardless:
virtual Nature& operator[](std::ptrdiff_t i) {
throw std::invalid_argument("Not a Tree");
}
virtual Nature const& operator[](std::ptrdiff_t i) const {
throw std::invalid_argument("Not a Tree");
}
in Nature, followed by:
virtual Nature& operator[](std::ptrdiff_t i) final override {
auto r = children.at(static_cast<std::size_t>(i));
if (r) return *r;
throw std::out_of_range("no element there");
}
virtual Nature const& operator[](std::ptrdiff_t i) const final override {
auto r = children.at(static_cast<std::size_t>(i));
if (r) return *r;
throw std::out_of_range("no element there");
}
in Tree.
That'll spawn exceptions when you use [] on the wrong type.
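Putting the pieces together, here is a condensed sketch of how the virtual operator[] could combine with a variadic add. It assumes C++17 for the fold expression; the value() accessor and treeDemo are hypothetical additions made only so the result can be inspected, and the many enable_if overloads from the question are replaced by a single template.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <vector>

class Nature {
public:
    virtual ~Nature() = default;
    virtual int value() const = 0;  // hypothetical accessor for the demo
    // Default: indexing anything that is not a Tree is an error.
    virtual Nature& operator[](std::size_t) {
        throw std::invalid_argument("Not a Tree");
    }
};

class Leaf final : public Nature {
    int value_;
public:
    explicit Leaf(int v) : value_(v) {}
    int value() const override { return value_; }
};

class Tree final : public Nature {
    std::vector<std::shared_ptr<Nature>> children_;
    int value_;
public:
    explicit Tree(int v = 0) : value_(v) {}
    int value() const override { return value_; }

    // Accepts any mix of Leaf and Tree arguments; each one is copied
    // into a shared_ptr of its own static type.
    template <class... Ts>
    void add(const Ts&... items) {
        (children_.push_back(std::make_shared<Ts>(items)), ...);
    }

    Nature& operator[](std::size_t i) override { return *children_.at(i); }
};

void treeDemo() {
    Leaf l0(0), l1(1);
    Tree t0(10), t1(11);
    t0.add(l0, l1);
    t1.add(t0, l0, l1);
    assert(t1[0][1].value() == 1);  // t1[0] is a copy of t0, [1] is l1
}
```

With this layout the dot syntax works at every level, and indexing a Leaf throws instead of failing to compile.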
I have a use case where one thread reads a message into a large buffer and then distributes the processing to a bunch of threads. The buffer is shared by multiple threads after that. It's read-only, and when the last thread finishes, the buffer has to be freed. The buffer is allocated from a lock-free slab allocator.
My initial design was to use shared_ptr for the buffer. But the buffer can be of different sizes. My way of getting around that was to do something like this.
struct SharedBuffer {
SharedBuffer (uint16_t len, std::shared_ptr<void> ptr)
: _length(len), _buf(std::move(ptr))
{
}
uint8_t *data () { return static_cast<uint8_t *>(_buf.get()); }
uint16_t length () const { return _length; }
uint16_t _length;
std::shared_ptr<void> _buf; // type-erase the shared_ptr as the SharedBuffer
// needs to be stored in some other structs
};
Now the allocator will allocate the shared_ptr like this:
SharedBuffer allocate (size_t size)
{
auto buf = std::allocate_shared<std::array<uint8_t, 16_K>>(myallocator);
return SharedBuffer{16_K, buf}; // type erase the std::array
}
And the SharedBuffer is enqueued to each thread who wants it.
Now I think I am doing a lot of stuff unnecessarily, and I could make do with boost::intrusive_ptr using the scheme below. Things are a bit C-ish, as I am using a variable-size array. Here I have replaced the slab allocator with operator new() for the sake of simplicity. I wanted to run it by you to see if this implementation is okay.
template <typename T>
inline int atomicIncrement (T* t)
{
return __atomic_add_fetch(&t->_ref, 1, __ATOMIC_ACQUIRE);
}
template <typename T>
inline int atomicDecrement (T* t)
{
return __atomic_sub_fetch(&t->_ref, 1, __ATOMIC_RELEASE);
}
class SharedBuffer {
public:
friend int atomicIncrement<SharedBuffer>(SharedBuffer*);
friend int atomicDecrement<SharedBuffer>(SharedBuffer*);
SharedBuffer(uint16_t len) : _length(len) {}
uint8_t *data ()
{
return &_data[0];
}
uint16_t length () const
{
return _length;
}
private:
int _ref{0};
const uint16_t _length;
uint8_t _data[];
};
using SharedBufferPtr = boost::intrusive_ptr<SharedBuffer>;
SharedBufferPtr allocate (size_t size)
{
// dummy implementation
void *p = ::operator new (size + sizeof(SharedBuffer));
// I am not explicitly constructing the array of uint8_t
return new (p) SharedBuffer(size);
}
void deallocate (SharedBuffer* sbuf)
{
sbuf->~SharedBuffer();
// dummy implementation
::operator delete ((void *)sbuf);
}
void intrusive_ptr_add_ref(SharedBuffer* sbuf)
{
atomicIncrement(sbuf);
}
void intrusive_ptr_release (SharedBuffer* sbuf)
{
if (atomicDecrement(sbuf) == 0) {
deallocate(sbuf);
}
}
I'd use the simpler implementation (using shared_ptr) unless you are avoiding specific problems (i.e. profile first).
Side note: you can use boost::shared_ptr<> with boost::make_shared<T[]>(N); the equivalent std::make_shared<T[]>(N) was added to the standard library in C++20.
Note that allocate_shared already embeds the control block into the same allocation like you do with the intrusive approach.
Finally, I'd use std::atomic_int so you have a clear contract that cannot (accidentally) be used wrong. At the same time, it'll remove the remaining bit of complexity.
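To illustrate the std::atomic suggestion, here is a rough sketch of the reference count without the compiler builtins. It does not use Boost itself; it only defines the two hook functions that boost::intrusive_ptr would look up. Several things are my own assumptions rather than part of the question: the memory orders (relaxed increment, acquire-release decrement, a common choice for reference counts), the stored data pointer that replaces the flexible array member, and the g_freedBuffers counter, which exists only to make the demo observable.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

// Hypothetical counter so the demo can observe deallocation.
inline std::atomic<int> g_freedBuffers{0};

class SharedBuffer {
public:
    SharedBuffer(std::uint16_t len, std::uint8_t* data)
        : length_(len), data_(data) {}

    std::uint8_t* data() { return data_; }
    std::uint16_t length() const { return length_; }

    // Found through argument-dependent lookup, as boost::intrusive_ptr expects.
    friend void intrusive_ptr_add_ref(SharedBuffer* b) noexcept {
        b->ref_.fetch_add(1, std::memory_order_relaxed);
    }
    friend void intrusive_ptr_release(SharedBuffer* b) noexcept {
        // fetch_sub returns the previous value, so 1 means "last owner".
        if (b->ref_.fetch_sub(1, std::memory_order_acq_rel) == 1) {
            b->~SharedBuffer();
            ::operator delete(static_cast<void*>(b));
            g_freedBuffers.fetch_add(1, std::memory_order_relaxed);
        }
    }

private:
    std::atomic<int> ref_{0};
    const std::uint16_t length_;
    std::uint8_t* data_;  // points just past the header, same allocation
};

// Dummy allocator: header and payload share one allocation.
SharedBuffer* allocate(std::uint16_t size) {
    void* raw = ::operator new(sizeof(SharedBuffer) + size);
    auto* bytes = static_cast<std::uint8_t*>(raw) + sizeof(SharedBuffer);
    return new (raw) SharedBuffer(size, bytes);
}
```

The acquire half of the final decrement is what lets the thread that frees the buffer see everything the other owners did before they released it.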
I have a class with a couple of fields, assignment c-tor and move c-tor:
class A{
std::vector<int> numbers;
int k;
public:
A(std::vector<int> &&numbers, const int k):
numbers(std::move(numbers)), // fast: moves instead of copying
k(k)
{
// logic
}
A(const std::vector<int> &numbers, const int k):
A(std::move(std::vector<int>(numbers)), k) // copy-and-move vector
{
// empty
}
};
I want to keep logic in one c-tor and call it from others.
Also, I want to support fast move-semantics. And I have to explicitly copy-and-move arguments in the assignment c-tor.
Is there any way to avoid such nested construction and keep all advantages I've listed above?
You could delegate one constructor to the other:
struct A
{
A(const std::vector<int> & v) : A(std::vector<int>(v)) {}
A(std::vector<int> && v)
: v_(std::move(v))
{
// logic
}
// ...
};
The moving constructor is now as fast as it can be, and the copying constructor costs one more move than if you spell both constructors out. If you're willing to pay an extra move, though, you might as well just have a single constructor:
struct A
{
A(std::vector<int> v)
: v_(std::move(v))
{
// logic
}
};
The alternative is to put the common code into a function and call that from both constructors.
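That last alternative could look roughly like the sketch below. The init() helper, the sum_ member and the computation inside it are hypothetical placeholders for whatever "logic" stands for in the question.

```cpp
#include <cassert>
#include <numeric>
#include <utility>
#include <vector>

class A {
    std::vector<int> numbers_;
    int k_;
    int sum_ = 0;  // hypothetical result of the shared logic

    // Shared logic, called once the members are initialized.
    void init() {
        sum_ = std::accumulate(numbers_.begin(), numbers_.end(), 0) * k_;
    }

public:
    // Copying constructor: exactly one copy, no extra move.
    A(const std::vector<int>& numbers, int k) : numbers_(numbers), k_(k) {
        init();
    }

    // Moving constructor: exactly one move.
    A(std::vector<int>&& numbers, int k)
        : numbers_(std::move(numbers)), k_(k) {
        init();
    }

    int sum() const { return sum_; }
};

void constructorDemo() {
    std::vector<int> v{1, 2, 3};
    A copied(v, 2);
    assert(copied.sum() == 12);
    A moved(std::move(v), 3);
    assert(moved.sum() == 18);
}
```

The trade-off is that the helper runs after member initialization, so it cannot initialize const or reference members itself.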