Does unordered_map free its memory after calling clear? - c++11

I know that unordered_map removes the objects stored in it when clear() is called, but does it also relinquish (back to the OS) the memory it allocated for its own bookkeeping?
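#include <iostream>
#include <unordered_map>
struct A{
A(){std::cout << "constructor called\n";}
~A(){std::cout << "destructor called\n";}
};
// this will reserve at least 1000 buckets
std::unordered_map<int, A> my_map{1000};
// now insert into the map (operator[] default-constructs an A in place;
// A has no constructor taking a string, so nothing is assigned here)
my_map[1];
my_map[2];
....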
Then I call my_map.clear(), which calls the destructor of each A.
So my question is: will the 1000 buckets that were reserved also be freed? I checked size() after calling clear() and it reports zero, so is there something like capacity for unordered_map, similar to vector, that would let me view the reserved size?

will the 1000 buckets that were reserved also be freed?
This is not specified by the standard.
is there something like capacity for unordered_map that is similar to vector that would let me view the reserved size?
A simple "capacity" value doesn't make sense for a hash map. Rehashing happens when load_factor exceeds max_load_factor. You can check the number of buckets using bucket_count.
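A minimal sketch of how to inspect these values before and after clear(). The names BucketReport and bucket_report are made up for illustration, and what bucket_count() reports after clear() depends on the implementation, so no particular value should be relied on.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>

// Hypothetical helper type: snapshots of what the container reports.
struct BucketReport {
    std::size_t buckets_before_clear;
    std::size_t size_after_clear;
    std::size_t buckets_after_clear;
};

// Builds a map with at least `requested` buckets, fills it, clears it,
// and records bucket_count() and size() along the way.
inline BucketReport bucket_report(std::size_t requested, int elements) {
    assert(elements >= 0);
    std::unordered_map<int, int> m(requested);
    for (int i = 0; i < elements; ++i) m[i] = i;

    BucketReport r{};
    r.buckets_before_clear = m.bucket_count();  // at least `requested`
    m.clear();                                  // destroys the elements
    r.size_after_clear = m.size();              // 0 after clear()
    r.buckets_after_clear = m.bucket_count();   // implementation-specific
    return r;
}
```

Printing buckets_after_clear on your own toolchain shows whether your standard library keeps the bucket array around after clear().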

I know that unordered_map removes the objects stored in it when clear() is called, but does it also relinquish (back to the OS) the memory it allocated for its own bookkeeping?
There is no guarantee (in particular because the C++11 specification does not know what an OS is), and in practice it often doesn't release the memory to the OS (that is, your virtual address space might stay unchanged).
What happens is that (without a specific Allocator argument to the std::map or std::unordered_map template) the memory is released through the default std::allocator, which calls ::operator delete. If you really care a lot, implement your own Allocator and pass it appropriately to your container templates. I'm not sure that is worth the trouble.
Usually new calls malloc and delete calls free. In that case, most of the time (but not always) the memory is not released to the kernel (e.g. with munmap(2) on Linux...) but just marked as reusable by future calls to new.
Details are of course implementation-specific. Many free (or delete) implementations would, for example, release a large enough block (of several megabytes or more) with munmap. The exact threshold varies from one implementation to the next.
Notice that if you use free software (e.g. a Linux system with GCC, GNU libc, ....), you could look into the source code of your C++ standard library (the code of std::map and of ::operator delete etc... inside GCC) and of your C standard library (the code of free) and find out these gory details for yourself.
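One way to observe what clear() does without reading the library source is a counting allocator. The sketch below is only an illustration: CountingAllocator, live_bytes, CountedMap and clear_report are hypothetical names, and the numbers it reports describe what the container returns to its allocator, not what the process returns to the OS.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <new>
#include <unordered_map>
#include <utility>

// Bytes currently handed out by every CountingAllocator (hypothetical global counter).
inline std::size_t& live_bytes() {
    static std::size_t bytes = 0;
    return bytes;
}

// Minimal allocator that forwards to ::operator new/delete and counts bytes.
template <class T>
struct CountingAllocator {
    using value_type = T;
    CountingAllocator() = default;
    template <class U>
    CountingAllocator(const CountingAllocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        T* p = static_cast<T*>(::operator new(n * sizeof(T)));
        live_bytes() += n * sizeof(T);
        return p;
    }
    void deallocate(T* p, std::size_t n) noexcept {
        assert(live_bytes() >= n * sizeof(T));
        live_bytes() -= n * sizeof(T);
        ::operator delete(p);
    }
};
template <class T, class U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept { return true; }
template <class T, class U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept { return false; }

using CountedMap = std::unordered_map<int, int, std::hash<int>, std::equal_to<int>,
                                      CountingAllocator<std::pair<const int, int>>>;

struct ClearReport { std::size_t after_insert, after_clear, after_destroy; };

// Records how many bytes the map still holds at each stage.
inline ClearReport clear_report() {
    ClearReport r{};
    {
        CountedMap m(1000);
        for (int i = 0; i < 100; ++i) m[i] = i;
        r.after_insert = live_bytes();
        m.clear();
        r.after_clear = live_bytes();   // whatever remains is the map's own bookkeeping
    }
    r.after_destroy = live_bytes();     // everything is returned once the map is destroyed
    return r;
}
```

If after_clear is smaller than after_insert but not zero, the implementation freed the element nodes and kept the bucket array.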

Related

What is Layout?

I want to reimplement some stdlib smart pointers in Rust (mainly Box) to learn it better. I come from a C background, which has the simple malloc and free functions, but Rust's alloc and dealloc need some Layout. What is it? It's not really explained in the docs.
As its documentation says, Layout describes a block of memory that is to be allocated or deallocated. In addition to memory size, it specifies alignment and optionally the trailing padding of the block.
C's malloc isn't told what alignment to use, so it has to assume worst-case alignment, possibly wasting memory. And even in C one sometimes needs memory of non-standard alignment, for which APIs outside of standard C had to be used until C11.
One final difference between Rust's and C's allocator interface is that Rust requires the layout to be passed to dealloc as well as to alloc - equivalent to C's free requiring the size that was passed to malloc. At first this sounds like a disadvantage because the user of the API must track the allocated size in order to be able to deallocate. But it turns out not to be an issue in practice because:
when allocating a single value such as Box<T> or array Box<[T; n]>, the size is determined at compile-time and thus known when Box::drop is invoked.
when allocating a dynamic array, such as Vec<T>, the capacity of the vector is tracked by the vector itself and thus also available in Vec::drop.
when boxing a slice, such as Box<[T]>, the slice cannot be resized and its capacity is equal to its length and again known to Box::drop.
So it turns out that it is the C-style allocation API that results in storing redundant size information. Passing the size to dealloc eliminates the redundancy, providing opportunity to save space, especially for small blocks where the size information is a non-negligible percentage of the memory used.
When comparing Rust's allocation interface to that of C, keep in mind that in Rust raw allocation is much more of a specialized tool, used only to implement safe abstractions such as Box, Rc or Vec. Unlike C, where a programmer is routinely expected to invoke malloc and free, most Rust programmers never need to invoke the global allocator directly.
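For readers coming from C or C++, the same idea can be sketched in C++17, where std::pmr::memory_resource takes a size and an alignment on both allocate and deallocate. This is an analogy under stated assumptions, not Rust's implementation: the Layout struct, layout_of and Box below are hypothetical names written for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <memory_resource>
#include <new>
#include <utility>

// Hypothetical C++ stand-in for Rust's Layout: a size plus an alignment.
struct Layout {
    std::size_t size;
    std::size_t align;
};

template <class T>
constexpr Layout layout_of() { return Layout{sizeof(T), alignof(T)}; }

// Minimal Box-like owner of a single heap-allocated T.
template <class T>
class Box {
public:
    explicit Box(T value) {
        const Layout l = layout_of<T>();
        void* raw = std::pmr::new_delete_resource()->allocate(l.size, l.align);
        try {
            ptr_ = ::new (raw) T(std::move(value));
        } catch (...) {
            std::pmr::new_delete_resource()->deallocate(raw, l.size, l.align);
            throw;
        }
    }
    ~Box() {
        // The layout is recomputed from the static type; nothing is stored per block.
        const Layout l = layout_of<T>();
        ptr_->~T();
        std::pmr::new_delete_resource()->deallocate(ptr_, l.size, l.align);
    }
    Box(const Box&) = delete;
    Box& operator=(const Box&) = delete;

    T& operator*() { assert(ptr_ != nullptr); return *ptr_; }

private:
    T* ptr_ = nullptr;
};
```

Box<T> here is a single pointer wide: the size and alignment passed to deallocate come from the type, which mirrors the first case in the list above.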

Allocator that manages a single block of memory

Due to system limitations, suppose that I can only allocate memory from a heap once (for example with std::allocator or some other more general C++11 compliant allocator).
This single allocation will take a large memory block.
Then I want to use containers and dynamic memory but all restricted to the previously allocated block of memory.
I managed to write a very simple allocator that incrementally "gives" memory by shifting a pointer.
In this allocator deallocate is a no-op, and memory from the block is not returned to the block.
One can obviously do better than this.
In other words, I want a managed heap.
Reusing this block memory in a sequence is a hard problem because one needs to manage discontinuous free segments, defragmentation, (optional) thread-safety, etc.
What is the name of this pattern? For some time I thought that this was a pool allocator, but it seems that that refers to something else (reusing small objects).
What features or standard libraries of C++ can I use either to implement and administer such allocation or at least to build my own with little effort?
I expected to find something in Boost.
But Boost.Pool is something else, and while something like this seems to be implemented for a specific purpose in Boost.Interprocess, it doesn't seem easy to use and I have a hard time understanding it outside its prototypical use (such as interprocess shared memory).
Otherwise, the closest thing I found is this https://www.boost.org/doc/libs/1_41_0/libs/pool/doc/interfaces/pool_alloc.html , but it seems that ::new can be called several times.
Example code:
int main(){
UserBlockAllocator<double> a(new double[1000], 1000);
{
std::vector<double, UserBlockAllocator<double>> v0(600, a);
} // v0 returns memory to block managed by a
std::vector<double, UserBlockAllocator<double>> v1(600, a);
std::vector<double, UserBlockAllocator<double>> v2(600, a); //out of memory
}
This pattern is referred to as an arena allocator or stack allocator. If I understand the std::pmr facilities correctly, std::pmr::monotonic_buffer_resource is related to that, but I have never tried it.
With those keywords you should find something, but I have no experience with the tools.
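A minimal C++17 sketch of the std::pmr route, assuming a fixed 1000-double block as in the question. The Arena struct and try_allocate helper are hypothetical names; std::pmr::null_memory_resource() is used as the upstream so the resource can never fall back to the heap. Note that monotonic_buffer_resource behaves like the simple allocator described in the question: deallocate is a no-op, and memory only comes back on release().

```cpp
#include <cassert>
#include <cstddef>
#include <memory_resource>
#include <new>
#include <vector>

// One fixed block plus a monotonic resource that carves allocations out of it.
struct Arena {
    alignas(std::max_align_t) std::byte storage[1000 * sizeof(double)];
    std::pmr::monotonic_buffer_resource resource{
        storage, sizeof(storage), std::pmr::null_memory_resource()};
};

// Returns true if a vector of `count` doubles could be built from the arena.
// The vector is destroyed before returning, but a monotonic resource does
// not reuse that memory until release() is called.
inline bool try_allocate(std::pmr::memory_resource& arena, std::size_t count) {
    assert(count > 0);
    try {
        std::pmr::vector<double> v(count, &arena);
        return true;
    } catch (const std::bad_alloc&) {
        return false;  // block exhausted and the upstream refuses to allocate
    }
}
```

With this arena, a first request for 600 doubles succeeds, a second one fails because the first 4800 bytes were never handed back, and it succeeds again after arena.resource.release().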
Note that it is easy to successfully deallocate the most recent allocation.
A powerful pattern is the composition of allocators as described in an entertaining talk by Andrei Alexandrescu at CppCon 2015. If you want to build your own tool, you might consider the combination of a FreeListAllocator (43:18) on top of your StackAllocator (35:42). This way, you may solve the problem of how to manage discontinuous free segments (as you describe it).
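A minimal sketch of the stack-allocator half of that idea, assuming a caller-provided buffer. StackArena is a hypothetical name and this is not the code from the talk; it only shows why releasing the most recent allocation is cheap, and a free list for the other cases would have to be layered on top.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Bump allocator over a caller-provided block; only the most recent
// allocation can be handed back.
class StackArena {
public:
    StackArena(void* buffer, std::size_t size)
        : begin_(static_cast<std::byte*>(buffer)), top_(begin_), end_(begin_ + size) {}

    // Returns nullptr when the block is exhausted. `align` must be a power of two.
    void* allocate(std::size_t bytes, std::size_t align = alignof(std::max_align_t)) {
        assert(align != 0 && (align & (align - 1)) == 0);
        const auto addr = reinterpret_cast<std::uintptr_t>(top_);
        const std::size_t padding = (align - addr % align) % align;
        const std::size_t remaining = static_cast<std::size_t>(end_ - top_);
        if (padding > remaining || bytes > remaining - padding) return nullptr;
        std::byte* result = top_ + padding;
        top_ = result + bytes;
        return result;
    }

    // Rolls the top pointer back if (p, bytes) is the most recent allocation.
    // Returns false otherwise and leaves the arena unchanged.
    bool deallocate(void* p, std::size_t bytes) {
        std::byte* block = static_cast<std::byte*>(p);
        if (block + bytes != top_) return false;
        top_ = block;
        return true;
    }

    std::size_t used() const { return static_cast<std::size_t>(top_ - begin_); }

private:
    std::byte* begin_;
    std::byte* top_;
    std::byte* end_;
};
```

The blocks for which deallocate returns false are exactly the discontinuous free segments that a free list on top would need to track.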

Overwriting data in memory

I've written a password manager in OCaml. In order to make it as secure as possible, I'd like to store a string (an encryption key) in memory in such a way that it can be overwritten. Since OCaml is pass-by-value and there's a garbage collector, this has proven difficult. I encrypt all buffers and variables I can, but I still need a "session key" stored to do this. To prevent this from being detected by automated key searching programs or put into swap, it's assembled from a bunch of random data in a buffer using a random increment. So really, all I need is a single variable that can be overwritten for the assembled key for a few seconds before it's passed into the Nocrypto library... Would a reference work for this?
According to this Cornell "Refs and Arrays" page, refs are mutable and work similarly to pointers in C. That being said, I also found a Stack Overflow answer discussing OCaml refs, which mentions that "they act like pointers to new allocated memory". Does this mean that every assignment allocates a new value in memory instead of actually mutating what is already there? If so, you couldn't really "overwrite" a ref.
Other possible solutions I've come across are Bigarrays and "custom blocks". I'm not entirely sure whether "custom blocks" are actually allocated outside of the scope of garbage collection or not. They seem like they're used to access external C code. Are they copied around by the garbage collector? Could they be "overwritten"? There's also this idea of "opaque bytes" and opaque objects in memory. I'm having a pretty hard time wrapping my head around how this all fits together. A useful but confusing (to me) discussion of custom blocks in memory on Stack Overflow is here: Are custom blocks ever copied in memory? The answer says they can be moved around. Even so, could they be overwritten?
The last possible solution is to store it using a Cstruct like the Nocrypto library seems to do. They discuss it in this GitHub issue: Secret material erasure. The asker states:
"Granted, most key material is Cstruct.t, which is a Bigarray.Array1.t, which is allocated outside of the GC heap"
Is this even correct? If so, I can't seem to find a source file that actually does this. I'm pretty new to OCaml and functional programming in general. If you're curious, my program is located on GitHub here: ocaml-pass
TL;DR;
You shall not store any secret information in the OCaml heap. Thus you must never copy your secret into any OCaml heap-allocated value; consequently, neither Bytes nor Strings nor Arrays can be used, even temporarily.
Introduction to the OCaml Memory Model
OCaml values are uniformly represented as tagged machine words. The least significant bit of a word is used as a tag that distinguishes between pointers (tag=0) and immediate values (tag=1). Thus a value always has a fixed size, and is either a pointer or an immediate.
Immediate values store their data in the most significant part of the word, that is, 31 bits on 32-bit systems and 63 bits on 64-bit systems. Pointers store their data in blocks that are located in the so-called OCaml Heap. The OCaml Heap is a set of blocks managed by the Garbage Collector (GC). A block is a chunk of data prefixed with a header. The header specifies the size of the data and some other meta information used by the GC. A block can contain OCaml values (pointers or immediate values) or opaque data.
To summarize: all OCaml values are represented as machine words that either store data directly in the word or are pointers to heap-allocated blocks. Each pointer points to one and only one block. Several pointers may point to the same block. Such values are considered physically equal. Some blocks are not pointed to by any pointer. Such blocks are called dead and are reclaimed by the GC.
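The tagging scheme described above can be modelled in a few lines. The sketch is written in C++ purely as an illustration of the bit layout (it is not code from the OCaml runtime), and the names Word, is_immediate, tag_int and untag_int are made up for this example.

```cpp
#include <cassert>
#include <cstdint>

// A machine word, as in the description above.
using Word = std::intptr_t;

// Tag bit 1 marks an immediate value, tag bit 0 marks a pointer to a block.
constexpr bool is_immediate(Word w) { return (w & 1) != 0; }

// Stores n in the upper bits and sets the tag bit.
// Assumes n fits in one bit less than the word size.
constexpr Word tag_int(Word n) { return n * 2 + 1; }

// Recovers the integer from an immediate word.
constexpr Word untag_int(Word w) {
    assert(is_immediate(w));
    return (w - 1) / 2;
}
```

A pointer to any object aligned to at least two bytes has a zero low bit, which is why a single bit is enough to tell the two cases apart.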
Introduction to the OCaml Garbage Collector
The GC manages blocks by allocating, moving, and deallocating them. The GC itself uses an arena that is either obtained from the C memory allocator (malloc) or directly from the kernel via the mmap syscall (depending on the particular system and runtime).
The GC is generational, which means that values are first allocated in a special region of the heap called the minor heap. The minor heap is a contiguous region of memory of fixed size, represented in the runtime with three pointers: the pointer beg to the beginning of the minor heap, a pointer end to the end of the minor heap, and the pointer cur to the beginning of the free area of the minor heap. When a block is allocated, cur is increased by the size of the block. Then the block is initialized with data. When there is no more free space in the minor heap (i.e., when end - cur is less than the required block size), a minor GC cycle is triggered. The GC analyzes all blocks stored in the Minor Heap and copies all blocks that are referenced by at least one pointer to the Major Heap. After that, the cur pointer is set to beg.
In the Major Heap, a block may also be copied several times during a process called compaction. The compactor may try to rearrange blocks in its arena in order to achieve more compact representation of the heap.
Security Consequences
As the OCaml GC is a moving GC, it may copy the heap-allocated data arbitrarily. Although it is called moving, it is in fact just copying. I.e., when a block is moved from the minor heap to the major heap, it is bit-copied, and thus duplicated. The phantom of the block in the minor heap may live for an arbitrary amount of time until it is overwritten by some newly allocated value. When an object is moved during compaction, it is also copied, and may or may not be overwritten during the process. And, of course, it goes without saying that once a block becomes dead, it may still survive in the heap for an arbitrary amount of time until reused by the GC.
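The effect can be illustrated with a toy model. The sketch below is written in C++ and is not the OCaml runtime: ToyHeap, its members and the contains helper are hypothetical, the two byte arrays merely stand in for the minor and major heap, and the bump pointer follows the simplified description given above.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Toy model only: two flat byte regions standing in for the minor and major heap.
struct ToyHeap {
    std::array<char, 64> minor{};
    std::array<char, 64> major{};
    std::size_t minor_cur = 0;  // start of the free area in the minor region
    std::size_t major_cur = 0;  // start of the free area in the major region

    // Bump-allocates `data` in the minor region and returns its offset.
    std::size_t alloc_minor(const std::string& data) {
        assert(minor_cur + data.size() <= minor.size());
        const std::size_t at = minor_cur;
        std::memcpy(minor.data() + at, data.data(), data.size());
        minor_cur += data.size();
        return at;
    }

    // "Moves" a live block to the major region, which is a plain byte copy.
    std::size_t promote(std::size_t at, std::size_t len) {
        assert(major_cur + len <= major.size());
        const std::size_t to = major_cur;
        std::memcpy(major.data() + to, minor.data() + at, len);
        major_cur += len;
        return to;
    }

    // End of a minor collection: the free pointer is reset, nothing is wiped.
    void reset_minor() { minor_cur = 0; }
};

// True if `needle` occurs anywhere in the raw bytes of `region`.
template <std::size_t N>
bool contains(const std::array<char, N>& region, const std::string& needle) {
    return std::string(region.data(), region.size()).find(needle) != std::string::npos;
}
```

After promote and reset_minor, the secret is present in both regions, and a later small allocation in the minor region overwrites only part of the stale copy.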
That all means that if a secret ends up in the OCaml heap, it will go wild, as the GC can replicate it multiple times in arbitrary and rather unpredictable ways. Thus, we can only store secrets either in immediate values or in regions that are not controlled by the GC. As was said before, all OCaml values that are pointers always point to a block in the OCaml heap. A block may contain data directly, or it may itself contain a pointer that points outside the OCaml heap. The so-called custom blocks may or may not store their information in the OCaml heap; it depends on the particular representation of each custom block. For example, the Bigarray library provides custom blocks that store their payload outside of the OCaml heap. Thus a Bigarray is a custom block that has two fields: a pointer and a size. It is an opaque block, i.e., the GC will never treat these two values as OCaml values, and will follow neither the size nor the pointer. The data pointed to by the pointer is located outside of the OCaml heap, and is allocated either by malloc or by mmap (in fact, it could be an arbitrary integer, and even point to the stack or to static data; it doesn't really matter, as long as we treat bigarrays just as a ptr,len pair).
This all makes Bigarrays ideal for storing secrets. We can be sure that they are not moved by the GC, and we can overwrite them to prevent information leakage once they are freed.
Further considerations
We should be careful and never allow a secret to be copied into the OCaml heap from our safe place. That means that even if our main storage is a safe bigarray, the information will still leak if we copy its contents to an OCaml string. Likewise, if we first read the information into an OCaml string and then copy it into a bigarray, the information will still leak. Thus, any interface that uses OCaml heap-allocated values is unsafe and shall not be used. For example, we can't use OCaml channels to read or write secrets (we should rely on memory mapping or unbuffered IO provided by the Unix module). And again, whenever you get a string data type from a Bigarray, you get your data copied, with all the ramifications.
I would use a value of type bytes, essentially a mutable array of bytes:
# let buffer = Bytes.make 16 'x';;
val buffer : bytes = "xxxxxxxxxxxxxxxx"
# Bytes.set buffer 0 'T';;
- : unit = ()
# buffer;;
- : bytes = "Txxxxxxxxxxxxxxx"
# Bytes.fill buffer 0 16 ' ';;
- : unit = ()
# buffer;;
- : bytes = "                "
You can overwrite with Bytes.fill after you're done.

How to split one heap into several heaps inside one process?

Which ecosystems allow creating multiple heaps right now?
Is it possible to have multiple heaps in Java?
garbage collection and memory management in Erlang
Is there any benefit to use multiple heaps for memory management purposes?
AppDomains don't create new heaps (there is still one heap for all domains). So, what does one need to do to launch several different GCs inside a single process?
Which syntactic primitives does one need to create? How should a runtime support those primitives?
Which ecosystems allow creating multiple heaps right now?
One obvious answer would be "C++" (feel free to fill in surrounding pieces as you see fit, if you don't consider a language to be an "ecosystem" in itself).
C++ allows you to specify heaps along a few different axes. One is by the type of an object--you can specify allocation for a particular type by overloading operator new and operator delete for that type:
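#include <cstddef>
class Foo {
public: // must be accessible, or `new Foo` fails to compile outside the class
static void *operator new(std::size_t size);
static void operator delete(void *block, std::size_t size);
};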
It's then up to you to connect these heap management functions to an actual source of memory. You might allocate that via ::operator new, or you might (for example) go directly to the OS, such as with something like GlobalAlloc or VirtualAlloc on Windows, sbrk on UNIX-like systems, or just have pre-specified blocks of memory on a bare-metal embedded system.
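As a rough sketch of what "connecting to a source of memory" could look like, the following backs Foo with a small pre-specified block, in the spirit of the bare-metal case. FixedPool, foo_pool and the capacity of 4 are assumptions made for this illustration; the pool is not thread-safe.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical fixed-capacity pool: equally sized slots in a static block,
// with the free slots chained into a singly linked list.
template <std::size_t SlotSize, std::size_t SlotAlign, std::size_t Capacity>
class FixedPool {
public:
    FixedPool() {
        for (std::size_t i = 0; i < Capacity; ++i) give(&slots_[i]);
    }
    // Pops a free slot, or returns nullptr when the pool is exhausted.
    void* take() {
        if (free_ == nullptr) return nullptr;
        Slot* s = free_;
        free_ = s->next;
        return s;
    }
    // Pushes a slot back onto the free list.
    void give(void* p) {
        assert(p != nullptr);
        Slot* s = ::new (p) Slot;
        s->next = free_;
        free_ = s;
    }

private:
    union Slot {
        Slot* next;
        alignas(SlotAlign) unsigned char bytes[SlotSize];
    };
    Slot slots_[Capacity];
    Slot* free_ = nullptr;
};

class Foo {
public:
    int value = 0;
    static void* operator new(std::size_t size);
    static void operator delete(void* block, std::size_t size) noexcept;
};

// The heap that serves every Foo; built on first use.
inline FixedPool<sizeof(Foo), alignof(Foo), 4>& foo_pool() {
    static FixedPool<sizeof(Foo), alignof(Foo), 4> pool;
    return pool;
}

inline void* Foo::operator new(std::size_t size) {
    void* p = (size == sizeof(Foo)) ? foo_pool().take() : nullptr;
    if (p == nullptr) throw std::bad_alloc{};
    return p;
}

inline void Foo::operator delete(void* block, std::size_t) noexcept {
    if (block != nullptr) foo_pool().give(block);
}
```

Ordinary new Foo and delete expressions now go through the pool; a fifth live Foo throws std::bad_alloc because the block has only four slots.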
Along a somewhat different axis, all the containers in the C++ standard library allocate and free memory via Allocator classes. The Allocator for any particular collection is specified as a template parameter, so (for example) a declaration for std::vector looks something like this:
template <class T, class Alloc=std::allocator<T>>
class vector {
// ...
};
This lets you specify a heap that will be used to allocate objects in that collection. Much as with operator new and operator delete, this really only specifies the interface by which the collection will allocate and free memory--it's up to you to connect that to code that actually manages the heap.
Garbage Collection
As far as garbage collection goes: I personally find it annoying, and advise against its use as a general rule. The problem is that while it can (at least from one perspective) fix some types of problems with memory management, it does nothing to help with the management of other resources, and (unfortunately) I haven't seen anything like a tracing collector for file handles, network sockets, database connections, and so on. RAII provides a uniform method for dealing with resource management in general.
That said, if you really insist on using GC, C++ does support that as well. Prior to C++11, GC was entirely usable on a practical level, but led to what was technically undefined behavior under a few obscure circumstances, such as:
storing a pointer in a file, and reading it back in, or
modifying the bits of a pointer, later un-doing that modification
...and later taking the re-constituted pointer and dereferencing it. Obviously, while the pointer wasn't visible to the collector, the pointed-to block of memory became eligible for GC, so the later dereference caused problems. C++11 defined these circumstances, and added a few library calls (e.g., declare_reachable, undeclare_reachable) to deal with them (e.g., if you call declare_reachable(block);, that block is not eligible for collection, regardless of whether a pointer to it is visible). As such, if you want to use GC with C++ you can, and the bounds of defined behavior are thoroughly specified. The only problem is that essentially no code ever calls declare_reachable and/or undeclare_reachable, so in real use they're likely to be of little or no help (but pointer swizzling and/or storage in a file are sufficiently rare that this is unlikely to pose a real problem). Note that these library calls were removed again in C++23.
For a practical example, you might want to look at the Boehm-Demers-Weiser collector (if you haven't already).

Sized Deallocation Feature In Memory Management in C++1y

The sized deallocation feature has been proposed for inclusion in C++1y. However, I want to understand how it would affect or improve the current C++ low-level memory management.
The proposal is N3778, which states the following about its intent:
With C++11, programmers may define a static member function operator
delete that takes a size parameter indicating the size of the object
to be deleted. The equivalent global operator delete is not available.
This omission has unfortunate performance consequences.
Modern memory allocators often allocate in size categories, and, for
space efficiency reasons, do not store the size of the object near the
object. Deallocation then requires searching for the size category
store that contains the object. This search can be expensive,
particularly as the search data structures are often not in memory
caches. The solution is to permit implementations and programmers
to define sized versions of the global operator delete. The
compiler shall call the sized version in preference to the unsized
version when the sized version is available.
From the above paragraph, it looks like the size information that operator delete requires can be maintained, and hence passed in, by the user program. This would avoid any search for the size during deallocation. But as I understand it, the memory manager stores the size information in some sort of header at allocation time (as in the boundary-tag method described for dlmalloc), and that header is then used during deallocation.
T* p = new T();
// Now size information would be stored in the header
// *(char*)(p - 0x4) = size;
// This would be used when we delete the memory????.
delete p;
If the size information is stored in the header, why does deallocation require searching for it?
It looks like I am missing something obvious and do not understand this concept completely.
Additionally, how can this feature be used in a program that deals with low-level memory management in C++? I hope that somebody can help me understand these concepts.
As in your quote:
[Modern memory allocators] for space efficiency reasons, do not store the size of the object near the object.
Increasing the size of every allocation in order to add explicit size information is obviously going to use more memory than alternatives such as storing the size information once per allocation pool, or supplying the information upon deallocation.
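A minimal sketch of the last alternative, supplying the size upon deallocation. SizeClassAllocator and its four size classes are hypothetical and far simpler than a real allocator; the point is only that deallocate can find the right free list from the size argument, with no per-block header and no search.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Toy allocator with one free list per size class (16, 32, 64, 128 bytes).
// Blocks carry no header: the caller passes the size back on deallocation.
class SizeClassAllocator {
public:
    static constexpr std::size_t kClasses = 4;
    static constexpr std::size_t kSmallest = 16;

    SizeClassAllocator() = default;
    SizeClassAllocator(const SizeClassAllocator&) = delete;
    SizeClassAllocator& operator=(const SizeClassAllocator&) = delete;

    // Index of the smallest class that fits `size`, or kClasses if none does.
    static std::size_t class_of(std::size_t size) {
        std::size_t cls = 0;
        std::size_t capacity = kSmallest;
        while (cls < kClasses && capacity < size) { capacity *= 2; ++cls; }
        return cls;
    }

    void* allocate(std::size_t size) {
        const std::size_t cls = class_of(size);
        void* p = nullptr;
        if (cls == kClasses) {
            p = std::malloc(size);                 // too large for any class
        } else if (Node* n = free_[cls]) {
            free_[cls] = n->next;                  // reuse a cached block
            p = n;
        } else {
            p = std::malloc(kSmallest << cls);     // fresh block of the class size
        }
        if (p == nullptr) throw std::bad_alloc{};
        return p;
    }

    // The size argument selects the free list directly.
    void deallocate(void* p, std::size_t size) noexcept {
        assert(p != nullptr);
        const std::size_t cls = class_of(size);
        if (cls == kClasses) { std::free(p); return; }
        free_[cls] = ::new (p) Node{free_[cls]};
    }

    ~SizeClassAllocator() {
        for (Node* head : free_) {
            while (head != nullptr) {
                Node* next = head->next;
                std::free(head);
                head = next;
            }
        }
    }

private:
    struct Node { Node* next; };
    std::array<Node*, kClasses> free_{};
};
```

A sized global operator delete lets the compiler supply that size argument automatically, since it knows the static type of the object being deleted.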
