What is the purpose of creating a private heap? (Windows)

On Windows, one can use the HeapCreate function to create a private heap in the calling process, even though each process already has its own default heap.
My question is: what are some possible reasons for a programmer to use a private heap rather than the default one? In other words, in what scenarios does a private heap become really handy?

There is a whole long list of reasons for creating multiple heaps:
Better efficiency with threads (giving each thread its own private heap avoids contention on a shared one).
Debugging and error trapping.
Setting up heaps with different allocation properties (see the sketch after this list).
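As a rough illustration of that last point, here is a minimal sketch (sizes and names are arbitrary, not from the question) of a private heap created with a fixed maximum size, so oversized allocations fail predictably instead of the heap growing without bound:

#include <windows.h>
#include <cstdio>

int main() {
    // dwMaximumSize != 0 makes the heap fixed-size: it can never grow past 1 MiB.
    HANDLE heap = HeapCreate(0, 0, 1024 * 1024);
    if (!heap) return 1;

    // A modest allocation succeeds...
    void* small = HeapAlloc(heap, HEAP_ZERO_MEMORY, 64 * 1024);
    std::printf("64 KiB block: %p\n", small);

    // ...while a request the fixed heap cannot satisfy simply returns NULL.
    void* tooBig = HeapAlloc(heap, 0, 2 * 1024 * 1024);
    std::printf("2 MiB block:  %p\n", tooBig);

    HeapDestroy(heap);  // frees the heap and everything allocated from it
    return 0;
}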

Related

Why does Windows allow creating a private heap?

I am learning memory management in Windows. I know that a process in Windows has its own default heap, which can be extended later. A process can also create additional (private) heaps. Why does Windows allow creating private heaps? What is the benefit of such an approach? As I understand it, using the default heap (with possible reallocations) is enough. Or is it another way to optimize reallocations?
If you look at HeapCreate you will see that it has multiple options that change how the heap works. HEAP_NO_SERIALIZE will make it faster, but you have to handle thread synchronization on your own, etc.
Having multiple heaps can also be beneficial if you allocate objects of different sizes with different lifetimes. You might want to put large, long-lived objects on their own heap if you also have a high churn of small objects that are allocated and deallocated as part of your work, to reduce fragmentation (and lock contention if you are multithreaded).
As noted in a comment, you can call HeapDestroy to free every allocation and the heap itself in one call, but this only makes sense if you have full control over everything allocated there. You are not allowed to destroy the default heap, so you must create your own private heap to use this trick.
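A minimal sketch of those two points together (the function is hypothetical; only HeapCreate, HeapAlloc and HeapDestroy are real API): a non-serialized private heap used by a single thread as a scratch arena and then released in one call:

#include <windows.h>

// Sketch: a private, non-serialized heap used by one thread as a scratch arena.
// HEAP_NO_SERIALIZE skips the heap's internal lock, so only this thread may touch it.
void build_temporary_structures() {
    HANDLE scratch = HeapCreate(HEAP_NO_SERIALIZE, 0, 0);  // growable private heap
    if (!scratch) return;

    for (int i = 0; i < 10000; ++i) {
        void* node = HeapAlloc(scratch, 0, 128);  // many small, short-lived allocations
        (void)node;                               // ... build up temporary data here ...
    }

    // One call frees every allocation and the heap itself -- no per-object cleanup.
    HeapDestroy(scratch);
}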

How to measure the performance of the Erlang Garbage Collector?

I have started programming in Erlang recently and there are a few things I want to understand regarding garbage collection (GC). As far as I understand, there is a generational GC for the private heap of each process and a reference counting GC for the global shared heap.
What I would like to know is whether there is any way to get:
How many collection cycles have occurred?
How many bytes are allocated and deallocated, at a global or per-process level?
What are the private heap and shared heap sizes? And can we define these as GC parameters?
How long does it take to collect garbage? What percentage of time is spent on it?
Is there a way to run a program without GC?
Is there a way to get this kind of information, either with code or using some commands when I run an Erlang program?
Thanks.
To get information for a single process, you can call erlang:process_info(Pid). This will yield (as of Erlang 18.0) the following fields:
> erlang:process_info(self()).
[{current_function,{erl_eval,do_apply,6}},
{initial_call,{erlang,apply,2}},
{status,running},
{message_queue_len,0},
{messages,[]},
{links,[<0.27.0>]},
{dictionary,[]},
{trap_exit,false},
{error_handler,error_handler},
{priority,normal},
{group_leader,<0.26.0>},
{total_heap_size,4184},
{heap_size,2586},
{stack_size,24},
{reductions,3707},
{garbage_collection,[{min_bin_vheap_size,46422},
{min_heap_size,233},
{fullsweep_after,65535},
{minor_gcs,7}]},
{suspending,[]}]
The number of collection cycles for the process is available in the field minor_gcs under the section garbage_collection.
Per Process
The current heap size for the process is available in the field heap_size from the results above (in words; a word is 4 bytes on a 32-bit VM and 8 bytes on a 64-bit VM). The total memory consumption of the process can be obtained by calling erlang:process_info(Pid, memory), which returns for example {memory,34312} for the above process. This includes the call stack, heap and internal structures.
Deallocations (and allocations) can be traced using erlang:trace/3. If the trace flag is garbage_collection you will receive messages of the form {trace, Pid, gc_start, Info} and {trace, Pid, gc_end, Info}. The Info field of the gc_start message contains such things as heap_size and old_heap_size.
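For example, a minimal sketch (trace_gc/1 and gc_loop/0 are names made up here), run from the process that should receive the trace messages:

%% Sketch: turn on GC tracing for Pid and print the heap size reported
%% at the start and end of each collection (OTP 18-era message format).
trace_gc(Pid) ->
    erlang:trace(Pid, true, [garbage_collection]),
    gc_loop().

gc_loop() ->
    receive
        {trace, Pid, gc_start, Info} ->
            io:format("~p gc_start, heap_size = ~p~n",
                      [Pid, proplists:get_value(heap_size, Info)]),
            gc_loop();
        {trace, Pid, gc_end, Info} ->
            io:format("~p gc_end,   heap_size = ~p~n",
                      [Pid, proplists:get_value(heap_size, Info)]),
            gc_loop()
    end.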
Per System
Top level statistics of the system can be obtained by erlang:memory/0:
> erlang:memory().
[{total,15023008},
{processes,4215272},
{processes_used,4215048},
{system,10807736},
{atom,202481},
{atom_used,187597},
{binary,325816},
{code,4575293},
{ets,234816}]
Garbage collection statistics can be obtained via erlang:statistics(garbage_collection) which yields:
> statistics(garbage_collection).
{85,23961,0}
Where (as of Erlang 18.0) the first field is the total number of garbage collections performed by the VM and the second field is the total number of words reclaimed.
The heap sizes for a process are available under the fields total_heap_size (all heap fragments and stack) and heap_size (the size of the youngest heap generation) from the process info above.
They can be controlled via spawn options, specifically min_heap_size which sets the initial heap size for a process.
To set it for all processes, erlang:system_flag(min_heap_size, MinHeapSize) can be called.
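For example, a small sketch (start_worker/0 and worker/0 are placeholder names) that spawns a process with a larger initial heap; note that the sizes are given in words, not bytes:

%% Sketch: spawn a process with a larger initial heap and more aggressive full sweeps.
start_worker() ->
    spawn_opt(fun worker/0, [{min_heap_size, 10000},    %% initial heap of at least 10000 words
                             {fullsweep_after, 20}]).   %% full sweep after 20 minor collections

worker() ->
    receive stop -> ok end.

%% Raising the default for all processes spawned after this call:
%% erlang:system_flag(min_heap_size, 10000).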
You can also control global VM memory allocation via the +M... options to the Erlang VM. The flags are described in the erts_alloc documentation. However, this requires extensive knowledge about the internals of the Erlang VM and its allocators, and using them should not be taken lightly.
This can be obtained via the tracing described in answer 2. If you use the option timestamp when tracing, you will receive a timestamp with each trace message that can be used to calculate the total GC time.
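A rough sketch of that idea (time_gc/1 is a made-up name), pairing each gc_start with the following gc_end and printing the elapsed time:

%% Sketch: with the timestamp flag, trace messages arrive as trace_ts tuples whose
%% last element is an erlang:timestamp()-style time that timer:now_diff/2 understands.
time_gc(Pid) ->
    erlang:trace(Pid, true, [garbage_collection, timestamp]),
    time_gc_loop().

time_gc_loop() ->
    receive
        {trace_ts, Pid, gc_start, _Info, T0} ->
            receive
                {trace_ts, Pid, gc_end, _Info2, T1} ->
                    io:format("~p spent ~p microseconds in GC~n",
                              [Pid, timer:now_diff(T1, T0)])
            end,
            time_gc_loop()
    end.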
Short answer: no.
Long answer: Maybe. You can control the initial heap size (via min_heap_size), which affects when garbage collection first occurs. You can also control when a full sweep is performed with the fullsweep_after option.
More information can be found in the Academic and Historical Questions and Processes section of the Efficiency Guide.
The most practical way of introspecting Erlang memory usage at runtime is via the Recon library, as Steve Vinoski mentioned.

Garbage collection and memory management in Erlang

I want to know technical details about garbage collection (GC) and memory management in Erlang/OTP.
But I cannot find this on erlang.org or in its documentation.
I have found some articles online which talk about GC in a very general manner, such as what garbage collection algorithm is used.
To clarify things, let's define the memory layout and then talk about how GC works.
Memory Layout
In Erlang, each thread of execution is called a process. Each process has its own memory and that memory layout consists of three parts: Process Control Block, Stack and Heap.
PCB: Process Control Block holds information like process identifier (PID), current status (running, waiting), its registered name, and other such info.
Stack: It is a downward growing memory area which holds incoming and outgoing parameters, return addresses, local variables and temporary spaces for evaluating expressions.
Heap: It is an upward growing memory area which holds process mailbox messages and compound terms. Binary terms which are larger than 64 bytes are NOT stored in the process's private heap; they are stored in a large shared heap which is accessible by all processes.
Garbage Collection
Currently Erlang uses a generational garbage collector that runs inside each Erlang process's private heap independently, plus a reference-counting garbage collector for the global shared heap.
Private Heap GC: It is generational, so it divides the heap into two segments: the young and old generations. There are also two collection strategies: Generational (Minor) and Fullsweep (Major). The generational GC collects just the young heap, while the fullsweep collects both the young and old heaps.
Shared Heap GC: It is reference counting. Each object in shared heap (Refc) has a counter of references to it held by other objects (ProcBin) which are stored inside private heap of Erlang processes. If an object's reference counter reaches zero, the object has become inaccessible and will be destroyed.
To get more details and performance hints, just look at my article which is the source of the answer: Erlang Garbage Collection Details and Why It Matters
A reference paper for the algorithm: One Pass Real-Time Generational Mark-Sweep Garbage Collection by Joe Armstrong and Robert Virding (1995, available at CiteSeerX).
Abstract:
Traditional mark-sweep garbage collection algorithms do not allow reclamation of data until the mark phase of the algorithm has terminated. For the class of languages in which destructive operations are not allowed we can arrange that all pointers in the heap always point backwards towards "older" data. In this paper we present a simple scheme for reclaiming data for such language classes with a single pass mark-sweep collector. We also show how the simple scheme can be modified so that the collection can be done in an incremental manner (making it suitable for real-time collection). Following this we show how the collector can be modified for generational garbage collection, and finally how the scheme can be used for a language with concurrent processes.
Erlang has a few properties that make GC actually pretty easy.
1 - Every variable is immutable, so a variable can never point to a value that was created after it.
2 - Values are copied between Erlang processes, so the memory referenced in a process is almost always completely isolated.
Both of these (especially the latter) significantly limit the amount of the heap that the GC has to scan during a collection.
Erlang uses a copying GC. During a GC, the process is stopped, then the live data is copied from the from-space to the to-space. I forget the exact percentages, but the heap will be increased if something like only 25% of the heap can be collected during a collection, and it will be decreased if 75% of the process heap can be collected. A collection is triggered when a process's heap becomes full.
The only exception is when it comes to large values that are sent to another process. These will be copied into a shared space and are reference counted. When a reference to a shared object is collected, the count is decreased; when that count reaches 0, the object is freed. No attempt is made to handle fragmentation in the shared heap.
One interesting consequence of this is that, for a shared object, the size of the shared object does not contribute to the calculated size of a process's heap; only the size of the reference does. That means that if you have a lot of large shared objects, your VM could run out of memory before a GC is triggered.
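A common workaround (a standard trick, not something from the talk) is to explicitly collect the processes holding those references, or to spawn binary-heavy workers with fullsweep_after set to 0:

%% Sketch: nudge a process that holds many refc-binary references to collect now,
%% so shared binaries it no longer uses can be released.
erlang:garbage_collect(Pid).

%% Or make a binary-heavy worker always do full sweeps (HandlerFun is a placeholder):
spawn_opt(HandlerFun, [{fullsweep_after, 0}]).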
Most of this is taken from the talk Jesper Wilhelmsson gave at EUC2012.
I don't know your background, but apart from the paper already pointed out by jj1bdx you can also give Jesper Wilhelmsson's thesis a chance.
BTW, if you want to monitor memory usage in Erlang to compare it to e.g. C++ you can check out:
Erlang Instrument Module
Erlang OS_MON Application
Hope this helps!

Default heap for a process

I read this article about managing heap memory, written by Randy Kath.
I would like to ask about this segment:
Every process in Windows has one heap called the default heap.
Processes can also have as many other dynamic heaps as they wish,
simply by creating and destroying them on the fly. The system uses the
default heap for all global and local memory management functions, and
the C run-time library uses the default heap for supporting malloc
functions.
I did not understand what the function or benefit of the default heap is.
Also, I have another question: the author always refers to reserved and committed space. What is meant by committed space?
Applications need heaps to allocate dynamic memory. Windows automatically creates one heap for each process. This is the default heap. Most apps just use this single default heap.
Committing is the act of backing reserved virtual addresses with actual storage (physical memory or page file) so that they are available for use by the process. I suggest you read this article on MSDN: Managing Virtual Memory.
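As a small illustration of both points (the sizes and printed labels are arbitrary), the first half allocates from the default heap that every process gets for free, and the second half reserves a range of virtual addresses and then commits only part of it:

#include <windows.h>
#include <cstdio>

int main() {
    // The default heap: created automatically for the process; malloc draws from it.
    HANDLE defaultHeap = GetProcessHeap();
    void* p = HeapAlloc(defaultHeap, 0, 256);
    std::printf("from default heap: %p\n", p);
    HeapFree(defaultHeap, 0, p);

    // Reserve 1 MiB of address space: no physical storage is used yet,
    // and touching these pages would fault.
    void* reserved = VirtualAlloc(nullptr, 1024 * 1024, MEM_RESERVE, PAGE_NOACCESS);

    // Commit the first 64 KiB: now it is backed by RAM / page file and usable.
    void* committed = VirtualAlloc(reserved, 64 * 1024, MEM_COMMIT, PAGE_READWRITE);
    std::printf("reserved at %p, committed at %p\n", reserved, committed);

    VirtualFree(reserved, 0, MEM_RELEASE);
    return 0;
}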

HeapCreate vs GetProcessHeap

I'm new to using Heap Allocation in C++.
I'm trying to understand the scenarios that would force someone to create a private heap instead of using the process heap. Isn't the process heap generally enough for most cases?
Thanks
--Ashish
If you have a flurry of transient heap activity, using a private heap for that can be faster than churning on the process heap. If you start a thread and give it a private heap, it can be thread-safe in its heap operations without needing to deal with locking for them. There are other reasons, but these two are relatively common ones.
It is an easy way of using a memory pool, which is useful especially on deallocation: instead of tracking the lifetime of many small objects and deleting them one after another, create a separate heap for them and destroy the entire heap when you're done.
