How to keep cache a second class citizen - caching

OK in a comment to this question:
How to clean caches used by the Linux kernel
ypnos claims that:
"Applications will always be first citizens for memory and don't have to fight with cache for it."
Well, I think my cache is rebelious and does not want to accept its social class. I ran the experiment here:
http://www.linuxatemyram.com/play.html
step 1:
$ free -m
total used free shared buffers cached
Mem: 3015 2901 113 0 15 2282
-/+ buffers/cache: 603 2411
Swap: 2406 2406 0
So 2282MB is used by cache and 113MB is free.
Now:
$ ./munch
Allocated 1 MB
Allocated 2 MB
Allocated 3 MB
Allocated 4 MB
.
.
.
Allocated 265 MB
Allocated 266 MB
Allocated 267 MB
Allocated 268 MB
Allocated 269 MB
Killed
OK, Linux gave me, generously another 156MB and that's it! So, how can I tell Linux that my programs are more important than that 2282MB cache?
Extra info: my /home is encrypted.
More people with the same problem (These make the encryption hypothesis not very plausible):
https://serverfault.com/questions/171164/can-you-set-a-minimum-linux-disk-buffer-size
and
https://askubuntu.com/questions/41778/computer-freezing-on-almost-full-ram-possibly-disk-cache-problem

The thing to know about caching in the kernel is that it's designed to be efficient as possible. This often means things put into cache are left there when there's nothing else asking for memory.
This is the kernel preparing to be lucky in case the thing in cache is asked for again. If no-one else needs the memory, there's little benefit in freeing it up.

I am not sure about Linux specific stuff, but a good OS will keep track of how many times a memory page was accessed, and how long ago. If it wasn't accessed much lately, it can swap it out, and use the RAM for caching.
Also, allocated but unused memory can be sent to swap as well, because sometimes programs allocate more than they actually need, so many memory pages would just sit there filling your RAM.

I found out if I turn off the swap by
#swapoff -a
The problem is going away. If I have swap, when I ask for more memory, then linux tries to move the cache to the swap and then swap get full, then linux halts the whole operation instead of dropping the cache. This results in "out of memory". But without swap Linux knows that it has no hope but dropping the cache in the first place.
I think it's a bug in linux kernel.
From the one of the link that added to the question suggests that:
sysctl -w vm.min_free_kbytes=65536
helps, for me with 64MG still I can easily get into trouble. I'm working with 128MG margin and when the greedy cache reach there, the machine becomes very slow but unlike before it doesn't freeze. I'll check with 256MG margin and see if there will be an improvement or not.

Related

discrepancy between htop and golang readmemstats

My program loads a lot of data at start up and then calls debug.FreeOSMemory() so that any extra space is given back immediately.
loadDataIntoMem()
debug.FreeOSMemory()
after loading into memory , htop shows me the following for the process
VIRT RES SHR
11.6G 7629M 8000
But a call to runtime.ReadMemStats shows me the following
Alloc 5593336608 5.3G
BuckHashSys 1574016 1.6M
HeapAlloc 5593336610 5.3G
HeapIdle 2607980544 2.5G
HeapInuse 7062446080 6.6G
HeapReleased 2607980544 2.5G
HeapSys 9670426624 9.1G
MCacheInuse 9600 9.4K
MCacheSys 16384 16K
MSpanInuse 106776176 102M
MSpanSys 115785728 111M
OtherSys 25638523 25M
StackInuse 589824 576K
StackSys 589824 576K
Sys 10426738360 9.8G
TotalAlloc 50754542056 48G
Alloc is the amount obtained from system and not yet freed ( This is
resident memory right ?) But there is a big difference between the two.
I rely on HeapIdle to kill my program i.e if HeapIdle is more than 2 GB, restart - in this case it is 2.5, and isn't going down even after a while. Golang should use from heap idle when allocating more in the future, thus reducing heap idle right ?
If assumption 1 is wrong, which stat can accurately tell me what the RES value in htop is.
What can I do to reduce the value of HeapIdle ?
This was tried on go 1.4.2, 1.5.2 and 1.6.beta1
The effective memory consumption of your program will be Sys-HeapReleased. This still won't be exactly what the OS reports, because the OS can choose to allocate memory how it sees fit based on the requests of the program.
If your program runs for any appreciable amount of time, the excess memory will be offered back to the OS so there's no need to call debug.FreeOSMemory(). It's also not the job of the garbage collector to keep memory as low as possible; the goal is to use memory as efficiently as possible. This requires some overhead, and room for future allocations.
If you're having trouble with memory usage, it would be a lot more productive to profile your program and see why you're allocating more than expected, instead of killing your process based on incorrect assumptions about memory.

Linux memory overcommit details

I am developing SW for embedded Linux and i am suffering system hangs because OOM Killer appears from time to time. Before going beyond i would like to solve some confusing issues about how Linux Kernel allocate dynamic memory assuming /proc/sys/vm/overcommit_memory has 0 and /proc/sys/vm/min_free_kbytes has 712, and no swap.
Supposing embedded Linux currently physical memory available is 5MB (5MB of free memory and there is not usable cached or buffered memory available) if i write this piece of code:
.....
#define MEGABYTE 1024*1024
.....
.....
void *ptr = NULL;
ptr = (void *) malloc(6*MEGABYTE); //Preserving 6MB
if (!prt)
exit(1);
memset(ptr, 1, MEGABYTE);
.....
I would like to know if when memset call is committed the kernel will try to allocate ~6MB or ~1MB (or min_free_kbytes multiple) in the physical memory space.
Right now there is about 9MB in my embedded device which has 32MB RAM. I check it by doing
# echo 3 > /proc/sys/vm/drop_caches
# free
total used free shared buffers
Mem: 23732 14184 9548 0 220
Swap: 0 0 0
Total: 23732 14184 9548
Forgetting last piece of C code, i would like to know if its possible that oom killer appears when for instance free memory is about >6MB.
I want to know if the system is out of memory when oom appears, so i think i have two options:
See VmRSS entries in /proc/pid/status of suspicious process.
Set /proc/sys/vm/overcommit_memory = 2 and /proc/sys/vm/overcommit_memory = 75 and see if there is any process requiring more of physical memory available.
I think you can read this document. Is provides you three small C programs that you can use to understand what happens with the different possible values of /proc/sys/vm/overcommit_memory .

Fortran array memory management

I am working to optimize a fluid flow and heat transfer analysis program written in Fortran. As I try to run larger and larger mesh simulations, I'm running into memory limitation problems. The mesh, though, is not all that big. Only 500,000 cells and small-peanuts for a typical CFD code to run. Even when I request 80 GB of memory for my problem, it's crashing due to insufficient virtual memory.
I have a few guesses at what arrays are hogging up all that memory. One in particular is being allocated to (28801,345600). Correct me if I'm wrong in my calculations, but a double precision array is 8 bits per value. So the size of this array would be 28801*345600*8=79.6 GB?
Now, I think that most of this array ends up being zeros throughout the calculation so we don't need to store them. I think I can change the solution algorithm to only store the non-zero values to work on in a much smaller array. However, I want to be sure that I'm looking at the right arrays to reduce in size. So first, did I correctly calculate the array size above? And second, is there a way I can have Fortran show array sizes in MB or GB during runtime? In addition to printing out the most memory intensive arrays, I'd be interested in seeing how the memory requirements of the code are changing during runtime.
Memory usage is a quite vaguely defined concept on systems with virtual memory. You can have large amounts of memory allocated (large virtual memory size) but only a small part of it actually being actively used (small resident set size - RSS).
Unix systems provide the getrusage(2) system call that returns information about the amount of system resources in use by the calling thread/process/process children. In particular it provides the maxmimum value of the RSS ever reached since the process was started. You can write a simple Fortran callable helper C function that would call getrusage(2) and return the value of the ru_maxrss field of the rusage structure.
If you are running on Linux and don't care about portability, then you may just open and read from /proc/self/status. It is a simple text pseudofile that among other things contains several lines with statistics about the process virtual memory usage:
...
VmPeak: 9136 kB
VmSize: 7896 kB
VmLck: 0 kB
VmHWM: 7572 kB
VmRSS: 6316 kB
VmData: 5224 kB
VmStk: 88 kB
VmExe: 572 kB
VmLib: 1708 kB
VmPTE: 20 kB
...
Explanation of the various fields - here. You are mostly interested in VmData, VmRSS, VmHWM and VmSize. You can open /proc/self/status as a regular file with OPEN() and process it entirely in your Fortran code.
See also what memory limitations are set with ulimit -a and ulimit -aH. You may be exceeding the hard virtual memory size limit. If you are submitting jobs through a distributed resource manager (e.g. SGE/OGE, Torque/PBS, LSF, etc.) check that you request enough memory for the job.

Examining Erlang crash dumps - how to account for all memory?

I've been poring over this Erlang crash dump where the VM has run out of heap memory. The problem is that there is no obvious culprit allocating all that memory.
Using some serious black awk magic I've summed up the fields Stack+heap, OldHeap, Heap unused and OldHeap unused for each process and ranked them by memory usage. The problem is that this number doesn't come even close to the number that is representing the total memory for all the processes processes_used according to the Erlang crash dump guide.
I've already tried the Crashdump Viewer and either I'm missing something or there isn't much help there for my kind of problem.
The number I get is 525 MB whereas the processes_used value is at 1348 MB. Where can I find the rest of the memory?
Edit: The Heap unused and OldHeap unused shouldn't have been included since they are a sub-part of Stack+Heap and OldHeap, that plus the fact that the number displayed for Stack+Heap and OldHeap are listed as number of words, not bytes, was the problem.
There is an module called crashdump_viewer which is great for these kinds of analysis.
Another thing to keep in mind is that Heap+Stack is afaik in words, not bytes which would mean that you have to multiply Heap+Stack with 4 on 32 and 8 on 64 bit. Can't find a reference in the manual for this but Processes talks about it a bit.

managed heap fragmentation

I am trying to understand how heap fragmenation works. What does the following output tell me?
Is this heap overly fragmented?
I have 243010 "free objects" with a total of 53304764 bytes. Are those "free object" spaces in the heap that once contained object but that are now garabage collected?
How can I force a fragmented heap to clean up?
!dumpheap -type Free -stat
total 243233 objects
Statistics:
MT Count TotalSize Class Name
0017d8b0 243010 53304764 Free
It depends on how your heap is organized. You should have a look at how much memory in Gen 0,1,2 is allocated and how much free memory you have there compared to the total used memory.
If you have 500 MB managed heap used but and 50 MB is free then you are doing pretty well. If you do memory intensive operations like creating many WPF controls and releasing them you need a lot more memory for a short time but .NET does not give the memory back to the OS once you allocated it. The GC tries to recognize allocation patterns and tends to keep your memory footprint high although your current heap size is way too big until your machine is running low on physical memory.
I found it much easier to use psscor2 for .NET 3.5 which has some cool commands like ListNearObj where you can find out which objects are around your memory holes (pinned objects?). With the commands from psscor2 you have much better chances to find out what is really going on in your heaps. Most commands are also available in SOS.dll in .NET 4 as well.
To answer your original question: Yes free objects are gaps on the managed heap which can simply be the free memory block after your last allocated object on a GC segement. Or if you do !DumpHeap with the start address of a GC segment you see the objects allocated in that managed heap segment along with your free objects which are GC collected objects.
This memory holes do normally happen in Gen2. The object addresses before and after the free object do tell you what potentially pinned objects are around your hole. From this you should be able to determine your allocation history and optimize it if you need to.
You can find the addresses of the GC Heaps with
0:021> !EEHeap -gc
Number of GC Heaps: 1
generation 0 starts at 0x101da9cc
generation 1 starts at 0x10061000
generation 2 starts at 0x02aa1000
ephemeral segment allocation context: none
segment begin allocated size
02aa0000 02aa1000** 03836a30 0xd95a30(14244400)
10060000 10061000** 103b8ff4 0x357ff4(3506164)
Large object heap starts at 0x03aa1000
segment begin allocated size
03aa0000 03aa1000 03b096f8 0x686f8(427768)
Total Size: Size: 0x115611c (18178332) bytes.
------------------------------
GC Heap Size: Size: 0x115611c (18178332) bytes.
There you see that you have heaps at 02aa1000 and 10061000.
With !DumpHeap 02aa1000 03836a30 you can dump the GC Heap segment.
!DumpHeap 02aa1000 03836a30
Address MT Size
...
037b7b88 5b408350 56
037b7bc0 60876d60 32
037b7be0 5b40838c 20
037b7bf4 5b408350 56
037b7c2c 5b408728 20
037b7c40 5fe4506c 16
037b7c50 60876d60 32
037b7c70 5b408728 20
037b7c84 5fe4506c 16
037b7c94 00135de8 519112 Free
0383685c 5b408728 20
03836870 5fe4506c 16
03836880 608c55b4 96
....
There you find your free memory blocks which was an object which was already GCed. You can dump the surrounding objects (the output is sorted address wise) to find out if they are pinned or have other unusual properties.
You have 50MB of RAM as Free space. This is not good.
Having .NET allocating blocks of 16MB from process, we have a fragmentation issue indeed.
There are plenty of reasons to fragmentation to occure in .NET.
Have a look here and here.
In your case it is possibly a pinning. As 53304764 / 243010 makes 219.35 bytes per object - much lower then LOH objects.

Resources