Why does windows allow to create private heap? - windows

I am learning memory managment in Windows. I know that process in windows has by default its heap, that can be extended in future. Also process can create additional (private) heaps. Why does windows allow to create private heap? What is benefit of such approach? As I understand usage of default heap (with possible reallocations) is enough. Or maybe is it another way to optimize reallocations?

If you look at HeapCreate you will see that it has multiple options that changes how the heap works. HEAP_NO_SERIALIZE will make it faster but you have to handle thread synchronization on your own etc.
Having multiple heaps can also be beneficial if you allocate objects of different sizes with different lifetimes. You might want to put large long-living objects on their own heap if you also have a high churn of small objects that are allocated and de-allocated as part of your work to reduce fragmentation (and lock contention if you are multithreaded).
As noted in a comment, you can call HeapDestroy to free every allocation and the heap itself in one call but this only makes sense if you have full control over everything allocated there. You are not allowed to destroy the default heap so you must create your own private heap to use this trick.

Related

What is the main performance gain from garbage collection?

The llvm documentation says:
In practice, however, the locality and performance benefits of using aggressive garbage collection techniques dominates any low-level losses.
So what is it, exactly, that causes the performance gain when using garbage collection as opposed to manually managing memory? (besides the obvious decrease in code writing time) Is the benefit solely that performing heap compaction increases spatial locality and cache utilization? Or is there something else that helps more, like deleting everything at once?
On modern processors the memory caches are King. Suffering a cache miss can stall the processor for hundreds of cpu cycles, waiting for the slow bus to supply the data.
Making the caches effective requires locality of reference. In other words, if the next memory access is close to the previous one then the odds that the data is already in the cache are high.
A garbage collector can help a lot to make that work out well. The big win is not the collection, it is its ability to rebuild the object graph and reorganize the data structure while doing so. Compacting.
Imagine the typical data structure, an array of pointers to objects. Which is slowly being built up while, say, reading a bunch of strings from a file and turning them into field values of an object. Allocated objects will be scatter-shot in the address space doing so. Long lived objects pointed-to by the array separated by the worker objects, like strings. Iterating that array later is going to be pretty slow.
Until the garbage collector runs and rebuilds the data structure. Putting all of the pointed-to objects in order.
Now iterating the collection is very fast, since accessing element N makes it very likely that element N+1 is readily available. If not in the L1 cache then very good odds for L2 or L3 (if you have it).
Very big win, it is the one feature that made garbage collection competitive with explicit memory management. With the explicit kind having the problem of not supporting moving objects because it will invalidate a pointer.
I can only speak for the Oracle (ex-Sun) and IBM JVMs; their efficiency relies on the fact that newly-created objects are unlikely to live very long. So segregating them into their own area allows that area to be frequently compacted, since with few survivors that's a cheap operation. Frequent compaction means that free space can be kept contiguous, so object creation is also cheap because there's no free chain to traverse and no memory fragmentation.
Manual memory management schemes are rarely this efficient because this is a relatively complex way of doing things that is unlikely to be reinvented for each application. These garbage collectors have evolved and been optimised over a longer period and with more effort than individual applications ever receive. It would be surprising and disappointing if they weren't much more performant.
I doubt locality helps performance at all - admittedly small objects tend to be created at the same time in the same area of the heap (but this applies to C as well), over time, these small objects that remain will be compacted into a closely related area of the heap and it is supposedly this that give you an advantage over C-style allocations. However, show me a program that uses just these small objects and I'll show you a program that does sod all. Show me a program that passes all objects that are to be used on the stack and I'll show you one that screams with speed.
The de-allocation of memory is a performance benefit, short-term as they do not need to be de-allocated. However, when the garbage collector does kick in, this benefit disappears. Usually though, the collection occurs when nothing else is happening in the system (theoretically) so the cost is effectively nullified.
Compaction of the heap also helps allocation, all allocations can come from the beginning of the heap, and the memory manager doesn't have to walk the heap looking for the next free space block of the right size. However, traditional systems can gain the same amount of speed by using multiple fixed-block heaps (which mean you always allocate from a heap for the size of block you want, and you always allocate a fixed block, so walking the heap is just to find the first free block, and this can be removed using a bitmap)
So all in all, there isn't much of a benefit at all, except in benchmarks of course. In my experience the GC can and will jump in and slow you down dramatically at just the wrong time, usually when the system memory is getting filled because the user has done something like load a new page that required a lot of memory allocations.... which in turn required a collection.
It also has a tendency to use a lot of memory - 'memory is cheap' is the mantra of GC languages, so programs are written with this in mind, which means memory allocations are much more common, especially for temporaries and intermediate objects. Just look to StringBuilder classes for the evidence that this is well known. Strings may be 'solved' using this, but many other objects are still allocated with wild abandon. Any program that uses a lot of memory will find itself struggling with RAM IO - all that memory has to be brought into the CPU caches to be used, the more memory you use, the more IO your CPU MM will have to do and that can kill performance in the wrong circumstances.
In addition, when a GC occurs, you have to handle Finalised objects too, this isn't quite as bad as it used to be, but it can still halt your program while the finalisers are run.
Old Java GCs were dreadful for perf, though a lot of research has made them significantly better, they are still not perfect.
EDIT:
one more thing about localisation, imagine creating an array and adding a few items, then do a load of allocations, then you want to add another item to the array - with a GC system the added array element will not be localised, even after a compaction, each object in the array will be stored as an individual item on the heap. This is why I think the localisation issue is not as big a deal as it's made out to be. Now, compare that to an array that is allocated with a buffer and objects are allocated within the buffer space. That may require a re-alloc and copy to add a new item, but reading and modifying it is super fast.
One factor not yet mentioned is that, especially in multi-threaded systems, it can sometimes be difficult to predict with certainty what object will end up holding the last surviving reference to some other object. If one doesn't have to worry about object graphs that might contain cycles, it's possible to use reference counts for this purpose. Before copying a reference to an object, increment its reference count. Before destroying a reference to an object, decrement its reference count. It decrementing the reference count makes it hit zero, destroy the object as well as the reference. Such an approach works well on computers with only one CPU core; if only one thread can actually be running at any given time, one doesn't have to worry about what will happen if two threads try to adjust the same object's reference count simultaneously. Unfortunately, in systems with multiple CPU cores, any CPU that wants to adjust a reference count would have to coordinate that action with all the other CPUs to ensure that two CPUs never hit the counter at the exact same time. Such coordination is "free" with a single CPU, but is relatively expensive in multi-core systems.
When using a batch-mode garbage collector, object references may generally be freely assigned, copied, and destroyed, without inter-CPU coordination. It will periodically be necessary to have all the CPUs stop and run a garbage-collection cycle, but requiring all the CPUs to coordinate with each other once every few seconds or so is a lot cheaper than requiring them to coordinate with each other on every single object-reference assignment.

What's the difference between memory allocation and garbage collection, please?

I understand that 'Garbage Collection' is a form of memory management and that it's a way to automatically reclaim unused memory.
But what is 'memory allocation' and the conceptual difference from 'Garbage Collection'?
They are Polar opposites. So yeah, pretty big difference.
Allocating memory is the process of claiming a memory space to store things.
Garbage Collection (or freeing of memory) is the process of releasing that memory back to the pool of available memory.
Many newer languages perform both of these steps in the background for you when variables are declared/initialized, and fall out of scope.
Memory allocation is the act of asking for some memory to the system to use it for something.
Garbage collection is a process to check if some memory that was previously allocated is no longer really in use (i.e. is no longer accessible from the program) to free it automatically.
A subtle point is that the objective of garbage collection is not actually "freeing objects that are no longer used", but to emulate a machine with infinite memory, allowing you to continue to allocate memory and not caring about deallocating it; for this reason, it's not a substitute for the management of other kind resources (e.g. file handles, database connections, ...).
A simple pseudo-code example:
void myFoo()
{
LinkedList<int> myList = new LinkedList<int>();
return;
}
This will request enough new space on the heap to store the LinkedList object.
However, when the function body is over, myList dissapears and you do not have anymore anyway of knowing where this LinkedList is stored (the memory address). Hence, there is absolutely no way to tell to the system to free that memory, and make it available to you again later.
The Java Garbage Collector will do that for you automatically, in the cost of some performance, and with also introducing a little non-determinism (you cannot really tell when the GC will be called).
In C++ there is no native garbage collector (yet?). But the correct way of managing memory is by the use of smart_pointers (eg. std::auto_ptr (deprecated in C++11), std::shared_ptr) etc etc.
You want a book. You go to the library and request the book you want. The library checks to see if they have the book (in which case they do) and you gladly take it and know you must return it later.
You go home, sit down, read the book and finish it. You return the book back to the library the next day because you are finished with it.
That is a simple analogy for memory allocation and garbage collection. Computers have limited memory, just like libraries have limited copies of books. When you want to allocate memory you need to make a request and if the computer has sufficient memory (the library has enough copies for you) then what you receive is a chunk of memory. Computers need memory for storing data.
Since computers have limited memory, you need to return the memory otherwise you will run out (just like if no one returned the books to the library then the library would have nothing, the computer will explode and burn furiously before your very eyes if it runs out of memory... not really). Garbage collection is the term for checking whether memory that has been previously allocated is no longer in use so it can be returned and reused for other purposes.
Memory allocation asks the computer for some memory, in order to store data. For example, in C++:
int* myInts = new int[howManyIntsIWant];
tells the computer to allocate me enough memory to store some number of integers.
Another way of doing the same thing would be:
int myInts[6];
The difference here is that in the second example, we know when the code is written and compiled exactly how much space we need - it's 6 * the size of one int. This lets us do static memory allocation (which uses memory on what's called the "stack").
In the first example we don't know how much space is needed when the code is compiled, we only know it when the program is running and we have the value of howManyIntsIWant. This is dynamic memory allocation, which gets memory on the "heap".
Now, with static allocation we don't need to tell the computer when we're finished with the memory. This relates to how the stack works; the short version is that once we've left the function where we created that static array, the memory is swallowed up straight away.
With dynamic allocation, this doesn't happen so the memory has to be cleaned up some other way. In some languages, you have to write the code to deallocate this memory, in other it's done automatically. This is garbage collection - some automatic process built into the language that will sweep through all of the dynamically allocated memory on the heap, work out which bits aren't being used and deallocate them (i.e. free them up for other processes and programs).
So: memory allocation = asking for memory for your program. Garbage collection = where the programming language itself works out what memory isn't being used any more and deallocates it for you.

Default heap for a process

I read this article which is about Managing heap memory written by Randy Kath.
I would ask about this segment:
Every process in Windows has one heap called the default heap.
Processes can also have as many other dynamic heaps as they wish,
simply by creating and destroying them on the fly. The system uses the
default heap for all global and local memory management functions, and
the C run-time library uses the default heap for supporting malloc
functions.
I did not underestand what is the function or benefit of the default heap?
Also, I have another question, the author referred always to reserved and committed space, what is meant by committed space?
Applications need heaps to allocate dynamic memory. Windows automatically creates one heap for each process. This is the default heap. Most apps just use this single default heap.
Committing is the act of assigning reserved virtual addresses to specific memory so that it is available for use by the process. I suggest you read this article on MSDN: Managing Virtual Memory.

Why is free() not allowed in garbage-collected languages?

I was reading the C# entry on Wikipedia, and came across:
Managed memory cannot be explicitly freed; instead, it is automatically garbage collected.
Why is it that in languages with automatic memory management, manual management isn't even allowed? I can see that in most cases it wouldn't be necessary, but wouldn't it come in handy where you are tight on memory and don't want to rely on the GC being smart?
Languages with automatic memory management are designed to provide substantial memory safety guarantees that can't be offered in the presence of any manual memory management.
Among the problems prevented are
Double free()s
Calling free() on a pointer to memory that you do not own, leading to illegal access in other places
Calling free() on a pointer that was not the return value of an allocation function, such as taking the address of some object on the stack or in the middle of an array or other allocation.
Dereferencing a pointer to memory that has already been free()d
Additionally, automatic management can result in better performance when the GC moves live objects to a consolidated area. This improves locality of reference and hence cache performance.
Garbage collection enforces the type safety of a memory allocator by guaranteeing that memory allocations never alias. That is, if a piece of memory is currently being viewed as a type T, the memory allocator can guarantee (with garbage collection) that while that reference is alive, it will always refer to a T. More specifically, it means that the memory allocator will never return that memory as a different type.
Now, if a memory allocator allows for manual free() and uses garbage collection, it must ensure that the memory you free()'d is not referenced by anyone else; in other words, that the reference you pass in to free() is the only reference to that memory. Most of the time this is prohibitively expensive to do given an arbitrary call to free(), so most memory allocators that use garbage collection do not allow for it.
That isn't to say it is not possible; if you could express a single-referrent type, you could manage it manually. But at that point it would be easier to either stop using a GC language or simply not worry about it.
Calling GC.Collect is almost always the better than having an explicit free method. Calling free would make sense only for pointers/object refs that are referenced from nowhere. That is something that is error prone, since there is a chance that your call free for the wrong kind of pointer.
When the runtime environment does reference counting monitoring for you, it knows which pointers can be freed safely, and which not, so letting the GC decide which memory can be freed avoids a hole class of ugly bugs. One could think of a runtime implementation with both GC and free where explicitly calling free for a single memory block might be much faster than running a complete GC.Collect (but don't expect freeing every possible memory block "by hand" to be faster than the GC). But I think the designers of C#, CLI (and other languages with garbage collectors like Java) have decided to favor robustness and safety over speed here.
In systems that allow objects to be manually freed, the allocation routines have to search through a list of freed memory areas to find some free memory. In a garbage-collection-based system, any immediately-available free memory is going to be at the end of the heap. It's generally faster and easier for the system to ignore unused areas of memory in the middle of the heap than it would be to try to allocate them.
Interestingly enough, you do have access to the garbage collector through System.GC -- Though from everything I've read, it's highly recommended that you allow the GC manage itself.
I was advised once to use the following 2 lines by a 3rd party vendor to deal with a garbage collection issue with a DLL or COM object or some-such:
// Force garbage collection (cleanup event objects from previous run.)
GC.Collect(); // Force an immediate garbage collection of all generations
GC.GetTotalMemory(true);
That said, I wouldn't bother with System.GC unless I knew exactly what was going on under the hood. In this case, the 3rd party vendor's advice "fixed" the problem that I was dealing with regarding their code. But I can't help but wonder if this was actually a workaround for their broken code...
If you are in situation that you "don't want to rely on the GC being smart" then most probably you picked framework for your task incorrectly. In .net you can manipulate GC a bit (http://msdn.microsoft.com/library/system.gc.aspx), in Java not sure.
I think you can't call free because you start doing one task of GC. GC's efficiency can be somehow guaranteed overall when it does things the way it finds it best and it does them when it decides. If developers will interfere with GC it might decrease it's overall efficiency.
I can't say that it is the answer, but one that comes to mind is that if you can free, you can accidentally double free a pointer/reference or even worse, use one after free. Which defeats the main point of using languages like c#/java/etc.
Of course one possible solution to that, would be to have your free take it's argument by reference and set it to null after freeing. But then, what if they pass an r-value like this: free(whatever()). I suppose you could have an overload for r-value versions, but don't even know if c# supports such a thing :-P.
In the end, even that would be insufficient because as has been pointed out, you can have multiple references to the same object. Setting one to null would do nothing to prevent the others from accessing the now deallocated object.
Many of the other answers provide good explanations of how the GC work and how you should think when programming against a runtime system which provides a GC.
I would like to add a trick that I try to keep in mind when programming in GC'd languages. The rule is this "It is important to drop pointers as soon as possible." By dropping pointers I mean that I no longer point to objects that I no longer will use. For instance, this can be done in some languages by setting a variable to Null. This can be seen as a hint to the garbage collector that it is fine to collect this object, provided there are no other pointers to it.
Why would you want to use free()? Suppose you have a large chunk of memory you want to deallocate.
One way to do it is to call the garbage collector, or let it run when the system wants. In that case, if the large chunk of memory can't be accessed, it will be deallocated. (Modern garbage collectors are pretty smart.) That means that, if it wasn't deallocated, it could still be accessed.
Therefore, if you can get rid of it with free() but not the garbage collector, something still can access that chunk (and not through a weak pointer if the language has the concept), which means that you're left with the language's equivalent of a dangling pointer.
The language can defend itself against double-frees or trying to free unallocated memory, but the only way it can avoid dangling pointers is by abolishing free(), or modifying its meaning so it no longer has a use.
Why is it that in languages with automatic memory management, manual management isn't even allowed? I can see that in most cases it wouldn't be necessary, but wouldn't it come in handy where you are tight on memory and don't want to rely on the GC being smart?
In the vast majority of garbage collected languages and VMs it does not make sense to offer a free function although you can almost always use the FFI to allocate and free unmanaged memory outside the managed VM if you want to.
There are two main reasons why free is absent from garbage collected languages:
Memory safety.
No pointers.
Regarding memory safety, one of the main motivations behind automatic memory management is eliminating the class of bugs caused by incorrect manual memory management. For example, with manual memory management calling free with the same pointer twice or with an incorrect pointer can corrupt the memory manager's own data structures and cause non-deterministic crashes later in the program (when the memory manager next reaches its corrupted data). This cannot happen with automatic memory management but exposing free would open up this can of worms again.
Regarding pointers, the free function releases a block of allocated memory at a location specified by a pointer back to the memory manager. Garbage collected languages and VMs replace pointers with a more abstract concept called references. Most production GCs are moving which means the high-level code holds a reference to a value or object but the underlying location in memory can change as the VM is capable of moving allocated blocks of memory around without the high-level language knowing. This is used to compact the heap, preventing fragmentation and improving locality.
So there are good reasons not to have free when you have a GC.
Manual management is allowed. For example, in Ruby calling GC.start will free everything that can be freed, though you can't free things individually.

HeapCreate vs GetProcessHeap

I'm new to using Heap Allocation in C++.
I'm tryin to understand the scenario that will force someone to create a private heap instead of using the Process Heap. Isn't Process Heap generally enough for most of the cases?
Thanks
--Ashish
If you have a flurry of transient heap activity, using a private heap for that can be faster than churning on the process heap. If you start a thread and give it a private heap, it can be thread-safe in those heap operation without needing to deal with locking for them. There are other reasons, but these two are relatively common ones.
It is an easy way of using a memory pool, which is useful especially on deallocation: Instead of tracking the lifetime of many small objects and deleting them one after another, create a seperate heap for them and destroy the entire heap when you're done.

Resources