Where is the heap? - memory-management

I understand that in Linux the mm_struct describes the memory layout of a process. I also understand that the start_brk and brk mark the start and end of the heap section of a process respectively.
Now, this is my problem: I have a process, for which I wrote the source code, that allocates 5.25 GB of heap memory using malloc. However, when I examine the process's mm_struct using a kernel module, I find that brk - start_brk is equal to 135168. This is different from what I expected: I expected brk - start_brk to be slightly above 5.25 GB.
So, what is going on here?
Thanks.

I notice the following in the manpage for malloc(3):
Normally, malloc() allocates memory from the heap, and adjusts the size of the heap as required, using sbrk(2). When allocating blocks of memory larger than MMAP_THRESHOLD bytes, the glibc malloc() implementation allocates the memory as a private anonymous mapping using mmap(2). MMAP_THRESHOLD is 128 kB by default, but is adjustable using mallopt(3). Allocations performed using mmap(2) are unaffected by the RLIMIT_DATA resource limit (see getrlimit(2)).
So it sounds like mmap is used instead of the heap.
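The numbers in the question fit this explanation. Below is a minimal sketch in Python, used only for the arithmetic. `served_by` is a hypothetical helper, not a glibc function, and the comparison is assumed to be strictly "larger than", following the wording of the man page quoted above.

```python
# Default MMAP_THRESHOLD as quoted from malloc(3): 128 kB.
MMAP_THRESHOLD = 128 * 1024


def served_by(request_bytes, threshold=MMAP_THRESHOLD):
    """Hypothetical helper: which mechanism the quoted rule implies."""
    return "mmap" if request_bytes > threshold else "brk heap"


# brk - start_brk observed in the question, converted to KiB.
observed_heap_kib = 135168 // 1024

if __name__ == "__main__":
    print(served_by(64))           # a small request stays on the brk heap
    print(served_by(1024 * 1024))  # a 1 MiB request goes to mmap
    print(observed_heap_kib)       # 132
```

If the 5.25 GB was requested in chunks larger than the threshold, none of it would show up between start_brk and brk. The mappings would instead appear as separate anonymous regions of the process.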

Related

kmem_cache_* creates contiguous memory?

Am I right in assuming that a memory slab created and allocated with kmem_cache_create and kmem_cache_alloc is contiguous?
A kmem_cache consists of 1 or more slabs.
A slab consists of 1 or more contiguous pages.
So when you call kmem_cache_alloc, it returns you a piece of memory in a slab which consists of 1 or more contiguous pages.
But if you call kmem_cache_alloc twice, the 2 pieces of memory you get may not be contiguous.
And kmem_cache_create only creates and initializes the data structure for a kmem_cache; it does not allocate any memory.
AFAIK, the kmalloc() and kmem_cache_*() APIs return contiguous memory, which is handled by the slab allocator.
vmalloc() can be used to request a big chunk of memory, and it returns "virtually contiguous" memory (meaning a contiguous virtual address region).

Why does the memset function make the virtual memory so large?

I have a process that does a lot of lithography calculation, so I use mmap to allocate memory for a memory pool. When the process needs a large chunk of memory, I mmap a chunk, and after using it I put it in the memory pool. If a chunk of the same size is needed again, it is taken from the pool directly instead of being mapped again. (The process does not allocate all the memory it needs and put it in the pool at startup.) Between the mmap calls there are other allocations that do not go through the pool, made with malloc() or new.
Now the question is:
If I use memset() to set the whole chunk to zero before putting it in the memory pool, the process uses too much virtual memory, as shown below. The format is "mmap(size)=virtual address":
mmap(4198400)=0x2aaab4007000
mmap(4198400)=0x2aaab940c000
mmap(8392704)=0x2aaabd80f000
mmap(8392704)=0x2aaad6883000
mmap(67112960)=0x2aaad7084000
mmap(8392704)=0x2aaadb085000
mmap(2101248)=0x2aaadb886000
mmap(8392704)=0x2aaadba89000
mmap(67112960)=0x2aaadc28a000
mmap(2101248)=0x2aaae028b000
mmap(2101248)=0x2aaae0c8d000
mmap(2101248)=0x2aaae0e8e000
mmap(8392704)=0x2aaae108f000
mmap(8392704)=0x2aaae1890000
mmap(4198400)=0x2aaae2091000
mmap(4198400)=0x2aaae6494000
mmap(8392704)=0x2aaaea897000
mmap(8392704)=0x2aaaeb098000
mmap(2101248)=0x2aaaeb899000
mmap(8392704)=0x2aaaeba9a000
mmap(2101248)=0x2aaaeca9c000
mmap(8392704)=0x2aaaec29b000
mmap(8392704)=0x2aaaecc9d000
mmap(2101248)=0x2aaaed49e000
mmap(8392704)=0x2aaafd6a7000
mmap(2101248)=0x2aacc5f8c000
Last mmap address - first mmap address = 0x2aacc5f8c000 - 0x2aaab4007000 = 8.28G
But if I don't call memset before putting the chunk in the memory pool:
mmap(4198400)=0x2aaab4007000
mmap(8392704)=0x2aaab940c000
mmap(8392704)=0x2aaad2480000
mmap(67112960)=0x2aaad2c81000
mmap(2101248)=0x2aaad6c82000
mmap(4198400)=0x2aaad6e83000
mmap(8392704)=0x2aaadb288000
mmap(8392704)=0x2aaadba89000
mmap(67112960)=0x2aaadc28a000
mmap(2101248)=0x2aaae0a8c000
mmap(2101248)=0x2aaae0c8d000
mmap(2101248)=0x2aaae0e8e000
mmap(8392704)=0x2aaae1890000
mmap(8392704)=0x2aaae108f000
mmap(4198400)=0x2aaae2091000
mmap(4198400)=0x2aaae6494000
mmap(8392704)=0x2aaaea897000
mmap(8392704)=0x2aaaeb098000
mmap(2101248)=0x2aaaeb899000
mmap(8392704)=0x2aaaeba9a000
mmap(2101248)=0x2aaaec29b000
mmap(8392704)=0x2aaaec49c000
mmap(8392704)=0x2aaaecc9d000
mmap(2101248)=0x2aaaed49e000
Last mmap address - first mmap address = 0x2aaaed49e000 - 0x2aaab4007000 = 916M
So the first process runs out of memory and is killed.
In the process, an mmapped chunk is often not fully used, or not used at all, even though it has been allocated. For example, before calibration the process mmaps 67112960 bytes (64M) and then either never reads or writes that region or uses only the first 2M bytes before putting it in the memory pool.
I know that mmap just returns a virtual address and that physical memory is allocated lazily, when those addresses are first read or written.
What confuses me is why the virtual addresses increase so much. I am using CentOS 5.3 with kernel version 2.6.18, and I tried this process with both libhoard and glibc (ptmalloc), with the same behavior.
Has anyone met the same issue before? What is the possible root cause?
Thanks.
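VMAs (virtual memory areas, AKA memory mappings) do not need to be contiguous, so the distance between the first and last returned address does not measure how much memory is mapped. Your first example maps ~256 MB, the second ~246 MB.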
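These figures can be checked from the two listings in the question. A minimal sketch in Python: the size counts are tallied by hand from the listings, and `total_mapped` and `address_span` are hypothetical helper names introduced for illustration.

```python
MIB = 1024 * 1024
GIB = 1024 * MIB

# Mapping size in bytes -> number of mmap() calls of that size,
# counted from the two listings in the question.
WITH_MEMSET = {4198400: 4, 8392704: 12, 67112960: 2, 2101248: 8}
WITHOUT_MEMSET = {4198400: 4, 8392704: 11, 67112960: 2, 2101248: 7}


def total_mapped(size_counts):
    """Sum of all requested mapping sizes, in bytes."""
    return sum(size * count for size, count in size_counts.items())


def address_span(first_addr, last_addr):
    """Distance between the first and last returned addresses, in bytes."""
    return last_addr - first_addr


if __name__ == "__main__":
    first = 0x2AAAB4007000
    print(total_mapped(WITH_MEMSET) / MIB)             # about 256 MiB mapped
    print(address_span(first, 0x2AACC5F8C000) / GIB)   # about 8.28 GiB span
    print(total_mapped(WITHOUT_MEMSET) / MIB)          # about 246 MiB mapped
    print(address_span(first, 0x2AAAED49E000) / MIB)   # about 916 MiB span
```

Both runs map roughly the same amount of memory. Only the spread of the addresses differs, and the gaps between mappings may be unmapped or belong to other allocations.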
Common malloc() implementations use mmap() automatically for large allocations (glibc's default threshold is 128 kB) and free the corresponding chunks with munmap(). So you do not need to mmap() manually for large allocations; your malloc() library will take care of that.
When mmap()ing, the kernel returns a COW copy of a special zero page, so it doesn't allocate memory until it's written to. Your zeroing causes memory to be really allocated. It is better to just return the chunk to the allocator and request a new one when you need it.
Conclusion: don't write your own memory management unless the system one has proven inadequate for your needs, and then use your own memory management only when you have proved it noticeably better for your needs under real-life load.

Allocating a large DMA buffer

I want to allocate a large DMA buffer, about 40 MB in size. When I use dma_alloc_coherent(), it fails and what I see is:
------------[ cut here ]------------
WARNING: at mm/page_alloc.c:2106 __alloc_pages_nodemask+0x1dc/0x788()
Modules linked in:
[<8004799c>] (unwind_backtrace+0x0/0xf8) from [<80078ae4>] (warn_slowpath_common+0x4c/0x64)
[<80078ae4>] (warn_slowpath_common+0x4c/0x64) from [<80078b18>] (warn_slowpath_null+0x1c/0x24)
[<80078b18>] (warn_slowpath_null+0x1c/0x24) from [<800dfbd0>] (__alloc_pages_nodemask+0x1dc/0x788)
[<800dfbd0>] (__alloc_pages_nodemask+0x1dc/0x788) from [<8004a880>] (__dma_alloc+0xa4/0x2fc)
[<8004a880>] (__dma_alloc+0xa4/0x2fc) from [<8004b0b4>] (dma_alloc_coherent+0x54/0x60)
[<8004b0b4>] (dma_alloc_coherent+0x54/0x60) from [<803ced70>] (mxc_ipu_ioctl+0x270/0x3ec)
[<803ced70>] (mxc_ipu_ioctl+0x270/0x3ec) from [<80123b78>] (do_vfs_ioctl+0x80/0x54c)
[<80123b78>] (do_vfs_ioctl+0x80/0x54c) from [<8012407c>] (sys_ioctl+0x38/0x5c)
[<8012407c>] (sys_ioctl+0x38/0x5c) from [<80041f80>] (ret_fast_syscall+0x0/0x30)
---[ end trace 4e0c10ffc7ffc0d8 ]---
I've tried different values and it looks like dma_alloc_coherent() can't allocate more than 2^25 bytes (32 MB).
How can such a large DMA buffer be allocated?
After the system has booted up, dma_alloc_coherent() is not necessarily reliable for large allocations. This is simply because non-moveable pages quickly fill up your physical memory, making large contiguous ranges rare. This has been a problem for a long time.
Conveniently, a recent patch set may help you out: the Contiguous Memory Allocator (CMA), which appeared in kernel 3.5. If you're using a kernel with this, you should be able to pass cma=64M on your kernel command line, and that much memory will be reserved (only moveable pages will be placed there). When you subsequently ask for your 40M allocation it should reliably succeed. Simples!
For more information check out this LWN article:
https://lwn.net/Articles/486301/
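One possible reading of the 32 MB limit reported in the question is power-of-two rounding in the page allocator. The sketch below (Python, arithmetic only) assumes 4 KiB pages and assumes that this dma_alloc_coherent() path rounds the request up to a power-of-two number of pages. Neither is confirmed by the question, and `order_for` is a hypothetical helper written for illustration.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages
MIB = 1024 * 1024


def order_for(size_bytes, page_size=PAGE_SIZE):
    """Smallest order such that (page_size << order) >= size_bytes."""
    pages = -(-size_bytes // page_size)  # ceiling division
    order = 0
    while (1 << order) < pages:
        order += 1
    return order


def rounded_size(size_bytes, page_size=PAGE_SIZE):
    """Size of the power-of-two block that would back the request."""
    return page_size << order_for(size_bytes, page_size)


if __name__ == "__main__":
    for request in (32 * MIB, 40 * MIB):
        print(request // MIB, order_for(request), rounded_size(request) // MIB)
```

Under these assumptions a 32 MiB request fits an order-13 block exactly, while a 40 MiB request needs an order-14 block of 64 MiB, which is above the largest size the question reports as working.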

What do the different columns in the "!heap -flt -s xxxx" windbg command represent

I've been doing some work on high memory issues, and I've been doing a lot of heap analysis in windbg, and I was curious what the different columns really mean in "!heap -flt -s xxxx" command.
I read "What do the 'size' numbers mean in the windbg !heap output?", and I looked in my "Windows Internals" book, but I still had a bunch of questions. So the columns and my questions are below.
**HEAP_ENTRY** - What does this pointer really point to? How is it different than UserPtr?
**Size** - What does this size mean? How is it different than UserSize?
**Prev** - This just appears to be the negative offset to get to the previous heap entry. Still not sure exactly how it's used.
**Flags** - Is there any documentation on these flags?
**UserPtr** - What is the user pointer? In all cases I've seen it's always 8 bytes higher than the HEAP_ENTRY, but I don't really know what it points to.
**UserSize** - This appears to be the size of the actual allocation.
**state** - This just tells you what state this heap entry is in (free, busy, etc.)
Example:
HEAP_ENTRY Size Prev Flags UserPtr UserSize - state
0015eeb0 0044 0000 [07] 0015eeb8 00204 - (busy)
HEAP_ENTRY
Heaps store allocated blocks in contiguous segments of memory. Each allocated block starts with an 8-byte header followed by the actual allocated data. The HEAP_ENTRY column is the address of the beginning of the header of the allocated block.
Size
The heap manager handles blocks in multiples of 8 bytes. This column is the number of 8-byte chunks allocated. In your sample, 0044 means that the block takes 0x220 bytes (0x44*8).
Prev
Multiply by 8 to get the negative offset in bytes to the previous heap block.
Flags
This is a bitmask that encodes the following information:
0x01 - HEAP_ENTRY_BUSY
0x02 - HEAP_ENTRY_EXTRA_PRESENT
0x04 - HEAP_ENTRY_FILL_PATTERN
0x08 - HEAP_ENTRY_VIRTUAL_ALLOC
0x10 - HEAP_ENTRY_LAST_ENTRY
UserPtr
This is the pointer returned to the application by the HeapAlloc function (called by malloc/new). Since the header is always 8 bytes long, it is always HEAP_ENTRY + 8.
UserSize
This is the size passed to the HeapAlloc function.
state
This is a decoding of the Flags column, telling if the entry is busy, freed, last of its segment, …
Be aware that in Windows 7/2008 R2, heaps by default use a front end named LFH (Low-Fragmentation Heap), which uses the default heap manager to allocate chunks in which it places user-allocated data. For these heaps, UserPtr and UserSize will not point to real user data.
The output of !heap -s displays which heaps are LFH enabled.
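The column rules above can be applied to the sample line from the question. A minimal sketch in Python: `decode_entry` is a hypothetical helper, the flag table is copied from the answer, and the parsing assumes the exact column layout of that one sample line (32-bit, non-LFH heap).

```python
GRANULARITY = 8  # bytes per heap unit, and header size, as described above

# Flag bits as listed in the answer.
FLAG_NAMES = {
    0x01: "HEAP_ENTRY_BUSY",
    0x02: "HEAP_ENTRY_EXTRA_PRESENT",
    0x04: "HEAP_ENTRY_FILL_PATTERN",
    0x08: "HEAP_ENTRY_VIRTUAL_ALLOC",
    0x10: "HEAP_ENTRY_LAST_ENTRY",
}

SAMPLE = "0015eeb0 0044 0000 [07] 0015eeb8 00204 - (busy)"


def decode_entry(line):
    """Hypothetical helper: decode one '!heap -flt -s' output line."""
    fields = line.split()
    flags = int(fields[3].strip("[]"), 16)
    return {
        "entry": int(fields[0], 16),
        "block_bytes": int(fields[1], 16) * GRANULARITY,
        "prev_offset_bytes": int(fields[2], 16) * GRANULARITY,
        "flag_names": [n for bit, n in FLAG_NAMES.items() if flags & bit],
        "user_ptr": int(fields[4], 16),
        "user_size": int(fields[5], 16),
    }


if __name__ == "__main__":
    decoded = decode_entry(SAMPLE)
    print(hex(decoded["block_bytes"]))            # 0x220
    print(decoded["flag_names"])                  # the three bits set in 0x07
    print(decoded["user_ptr"] - decoded["entry"])  # 8
```

For the sample, the block occupies 0x220 bytes in the heap while the application asked for 0x204 bytes. The difference covers the 8-byte header plus whatever padding and extra data the heap manager added.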
From looking at the !heap documentation in the Debugging Tools for Windows help file and the heap docs on MSDN and a great excerpt from Advanced Windows Debugging, here's what I've been able to put together:
HEAP_ENTRY: pointer to entry within the heap. As you found, there is an 8 byte header which contains the data for the HEAP_ENTRY structure. The size of the HEAP_ENTRY structure is 8 bytes which defines the "heap granularity" size. This is used for determining the...
SIZE: size of the entry in terms of the granularity (i.e. the allocation size / 8)
FLAGS: these are defined in winbase.h with explanations found in the MSDN link.
USERPTR: the actual pointer to the allocated (or freed) object
Well, the main difference between HEAP_ENTRY and UserPtr is due to the fact that heaps have to be indexed, allocated, and filled with metadata (like the allocated length made available to the user). Otherwise, how could you free(p) something without saying how many bytes were allocated? The same goes for the two size fields: one is how big the structure indexing the heap is, the other is how big the memory region made available to the user is.
The FLAGS, in turn, basically specify properties of the allocated memory block, such as whether it is committed or just reserved, and, I guess, are used by the kernel to rearrange or share memory regions if needed (but as nithins specifies, they are documented in MSDN).
The PREV ptr is used to keep track of all the allocated regions and the first pointer is stored in the PEB structure so both user-space and kernel-space code is aware of the allocated heap pools.

managed heap fragmentation

I am trying to understand how heap fragmentation works. What does the following output tell me?
Is this heap overly fragmented?
I have 243010 "free objects" with a total of 53304764 bytes. Are those "free objects" spaces in the heap that once contained objects but have now been garbage collected?
How can I force a fragmented heap to clean up?
!dumpheap -type Free -stat
total 243233 objects
Statistics:
MT Count TotalSize Class Name
0017d8b0 243010 53304764 Free
It depends on how your heap is organized. You should have a look at how much memory in Gen 0,1,2 is allocated and how much free memory you have there compared to the total used memory.
If you have 500 MB of managed heap in use and 50 MB is free, then you are doing pretty well. If you do memory-intensive operations like creating many WPF controls and releasing them, you need a lot more memory for a short time, but .NET does not give the memory back to the OS once you have allocated it. The GC tries to recognize allocation patterns and tends to keep your memory footprint high, even when your current heap size is far bigger than needed, until your machine is running low on physical memory.
I found it much easier to use psscor2 for .NET 3.5, which has some cool commands like ListNearObj where you can find out which objects are around your memory holes (pinned objects?). With the commands from psscor2 you have much better chances to find out what is really going on in your heaps. Most commands are also available in SOS.dll in .NET 4.
To answer your original question: yes, free objects are gaps on the managed heap, which can simply be the free memory block after your last allocated object on a GC segment. Or, if you do !DumpHeap with the start address of a GC segment, you see the objects allocated in that managed heap segment along with your free objects, which are GC-collected objects.
These memory holes normally occur in Gen2. The object addresses before and after the free object tell you what potentially pinned objects are around your hole. From this you should be able to determine your allocation history and optimize it if you need to.
You can find the addresses of the GC Heaps with
0:021> !EEHeap -gc
Number of GC Heaps: 1
generation 0 starts at 0x101da9cc
generation 1 starts at 0x10061000
generation 2 starts at 0x02aa1000
ephemeral segment allocation context: none
segment begin allocated size
02aa0000 02aa1000** 03836a30 0xd95a30(14244400)
10060000 10061000** 103b8ff4 0x357ff4(3506164)
Large object heap starts at 0x03aa1000
segment begin allocated size
03aa0000 03aa1000 03b096f8 0x686f8(427768)
Total Size: Size: 0x115611c (18178332) bytes.
------------------------------
GC Heap Size: Size: 0x115611c (18178332) bytes.
There you see that you have heaps at 02aa1000 and 10061000.
With !DumpHeap 02aa1000 03836a30 you can dump the GC Heap segment.
!DumpHeap 02aa1000 03836a30
Address MT Size
...
037b7b88 5b408350 56
037b7bc0 60876d60 32
037b7be0 5b40838c 20
037b7bf4 5b408350 56
037b7c2c 5b408728 20
037b7c40 5fe4506c 16
037b7c50 60876d60 32
037b7c70 5b408728 20
037b7c84 5fe4506c 16
037b7c94 00135de8 519112 Free
0383685c 5b408728 20
03836870 5fe4506c 16
03836880 608c55b4 96
....
There you find your free memory blocks, which were objects that have already been GCed. You can dump the surrounding objects (the output is sorted by address) to find out if they are pinned or have other unusual properties.
You have 50MB of RAM as free space. This is not good.
Given that .NET allocates blocks of 16MB from the process, we do have a fragmentation issue.
There are plenty of reasons for fragmentation to occur in .NET.
Have a look here and here.
In your case it is possibly pinning, as 53304764 / 243010 makes 219.35 bytes per object, much lower than LOH objects.
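The arithmetic behind that estimate, as a small Python sketch. The counts come from the !dumpheap output in the question. The 85,000-byte large object heap threshold is not stated in the thread and is added here as a known .NET default.

```python
MIB = 1024 * 1024

# From "!dumpheap -type Free -stat" in the question.
FREE_COUNT = 243010
FREE_BYTES = 53304764

# Assumption (not in the thread): objects of 85,000 bytes or more
# are allocated on the large object heap by default.
LOH_THRESHOLD = 85000

avg_free_block = FREE_BYTES / FREE_COUNT
free_mib = FREE_BYTES / MIB

if __name__ == "__main__":
    print(round(avg_free_block, 2))        # 219.35 bytes per free block
    print(round(free_mib, 1))              # 50.8 MiB of free space in total
    print(avg_free_block < LOH_THRESHOLD)  # True: these are small-object gaps
```

Many small gaps like these point at the small object heap rather than the LOH, which is consistent with the pinning hypothesis but does not prove it.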

Resources