Linux OOM killer does not work - linux-kernel

I would like to test whether the kernel OOM killer works correctly on my embedded Linux system. I wrote a test application that fills all memory, to see whether the OOM killer will kill my application when the system runs out of memory.
The test program I used:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>   /* for memset */

#define MEGABYTE 1024*1024

int main(int argc, char *argv[])
{
    void *myblock = NULL;
    int count = 0;
    while (1)
    {
        myblock = (void *) malloc(MEGABYTE);
        if (!myblock) break;
        memset(myblock, 1, MEGABYTE);
        printf("Currently allocating %d MB\n", ++count);
    }
    exit(0);
}
Results:
I always get:
MyApplication triggered out of memory condition (oom killer not called): gfp_mask=0x1200d2, order=0, oomkilladj=0
I tried to change /etc/sysctl.conf by adding:
vm.oom_kill_allocating_task=1
vm.panic_on_oom=0
vm.overcommit_memory=0
How can I make the OOM killer work correctly on my system?
Kernel version: 2.6.30 #7 SMP PREEMPT

The Linux “OOM killer” is a solution to the overcommit problem.
If you just “fill all memory”, overcommit will not show up: the malloc call will eventually return a null pointer, the conventional way to indicate that the memory request cannot be fulfilled.
In order to cause an overcommit-related problem, you must allocate too much memory without writing to it, and then decide to write to all of it, so that the system finds itself forced to honor promises it made without having the capacity to fulfill them.
EDIT after source code was provided:
To be completely precise, in order to trigger an overcommit problem and force the Linux OOM killer to take action, you should have several processes that in a first phase all reserve memory with malloc() (but do not write to it yet), and then have all of them write to the memory they reserved at the same time. This forces Linux to honor its memory promises outside of any allocation call, and it will have no choice but to kill a process that wasn't allocating (since none of them will be allocating at that moment).

Also, if you still want to see how and when the OOM killer works, I would suggest adding fork() before the while loop, as in the sketch below. That will create many processes, and eventually the OOM killer will kill one of them.

Related

How do I disable ASLR for heap addresses for a program compiled and linked with mingw-w64 GCC?

For debugging purposes, I would like malloc to return the same addresses every time the program is executed; however, with MSVC this is not the case.
For example:
#include <stdlib.h>
#include <stdio.h>

int main() {
    int test = 5;
    printf("Stack: %p\n", &test);
    printf("Heap: %p\n", malloc(4));
    return 0;
}
Compiling with cygwin's gcc, I get the same Stack address and Heap address every time, while compiling with MSVC with ASLR off...
cl t.c /link /DYNAMICBASE:NO /NXCOMPAT:NO
...I get the same Stack address every time, but the Heap address changes.
I have already tried adding the registry value HKLM\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management\MoveImages but it does not work.
Both the stack address and the pointer returned by malloc() may be different on every run. As a matter of fact, both differ when the program is compiled and run on macOS multiple times.
The compiler and/or the OS may cause this behavior to try and make it more difficult to exploit software flaws. There might be a way to prevent this in some cases, but if your goal is to replay the same series of malloc() addresses, other factors may change the addresses, such as time sensitive behaviors, file system side effects, not to mention non-deterministic thread behavior. You should try and avoid relying on this for your tests.
Note also that &test should be cast to (void *), as %p expects a void pointer, which is not guaranteed to have the same representation as int *.
It turns out that you may not be able to obtain deterministic behaviour from the MSVC runtime libraries. Both the debug and the production versions of the C/C++ runtime libraries end up calling a function named _malloc_base(), which in turn calls the Win32 API function HeapAlloc(). Unfortunately, neither HeapAlloc() nor the function that provides its heap, HeapCreate(), document a flag or other way to obtain deterministic behaviour.
You could roll up your own allocation scheme on top of VirtualAlloc(), as suggested by @Enosh_Cohen, but then you'd lose the debug functionality offered by the MSVC allocation functions.
Diomidis' answer suggests making a new malloc on top of VirtualAlloc, so I did that. It turned out to be somewhat challenging because VirtualAlloc itself is not deterministic, so I'm documenting the procedure I used.
First, grab Doug Lea's malloc. (The ftp link to the source is broken; use this http alternative.)
Then, replace the win32mmap function with this (hereby placed into the public domain, just like Doug Lea's malloc itself):
static void* win32mmap(size_t size) {
    /* Where to ask for the next address from VirtualAlloc. */
    static char *next_address = (char*)(0x1000000);
    /* Return value from VirtualAlloc. */
    void *ptr = 0;
    /* Number of calls to VirtualAlloc we have made. */
    int tries = 0;
    while (!ptr && tries < 100) {
        ptr = VirtualAlloc(next_address, size,
                           MEM_RESERVE|MEM_COMMIT, PAGE_READWRITE);
        if (!ptr) {
            /* Perhaps the requested address is already in use. Try again
             * after moving the pointer. */
            next_address += 0x1000000;
            tries++;
        }
        else {
            /* Advance the request boundary. */
            next_address += size;
        }
    }
    /* Either we got a non-NULL result, or we exceeded the retry limit
     * and are going to return MFAIL. */
    return (ptr != 0) ? ptr : MFAIL;
}
Now compile and link the resulting malloc.c with your program, thereby overriding the MSVCRT allocator.
With this, I now get consistent malloc addresses.
But beware:
The exact address I used, 0x1000000, was chosen by enumerating my address space using VirtualQuery to look for a large, consistently available hole. The address space layout appears to have some unavoidable non-determinism even with ASLR disabled. You may have to adjust the value.
I confirmed this works, in my particular circumstances, to get the same addresses during 100 sequential runs. That's good enough for the debugging I want to do, but the values might change after enough iterations, or after rebooting, etc.
This modification should not be used in production code, only for debugging. The retry limit is a hack, and I've done nothing to track when the heap shrinks.

How can I force MacOS to release MADV_FREE'd pages?

My program has a custom allocator which gets memory from the OS using mmap(MAP_ANON | MAP_PRIVATE). When it no longer needs memory, the allocator calls either munmap or madvise(MADV_FREE). MADV_FREE keeps the mapping around, but tells the OS that it can throw away the physical pages associated with the mapping.
Calling MADV_FREE on pages you're going to need again eventually is much faster than calling munmap and later calling mmap again.
This almost works perfectly for me. The only problem is that, on MacOS, MADV_FREE is very lazy about getting rid of the pages I've asked it to free. In fact, it only gets rid of them when there's memory pressure from another application. Until it gets rid of the pages I've freed, MacOS reports that my program is still using that memory; in the Activity Monitor, its "Real Memory" column doesn't reflect the freed memory.
This makes it difficult for me to measure how much memory my program is actually using. (This difficulty in measuring RSS is keeping us from landing the custom allocator on 10.5.)
I could allocate a whole bunch of memory to force the OS to free up these pages, but in addition to taking a long time, that could have other side-effects, such as causing parts of my program to be paged out to disk.
On a lark, I tried the purge command, but that has no effect.
How can I force MacOS to clean out these MADV_FREE'd pages? Or, how can I ask MacOS how many MADV_FREE'd pages my process has in memory?
Here's a test program, if it helps. The Activity Monitor's "Real Memory" column shows 512MB after the program goes to sleep. On my Linux box, top shows 256MB of RSS, as desired.
#include <sys/mman.h>
#include <stdio.h>
#include <unistd.h>

#define SIZE (512 * 1024 * 1024)

// We use MADV_FREE on Mac and MADV_DONTNEED on Linux.
#ifndef MADV_FREE
#define MADV_FREE MADV_DONTNEED
#endif

int main()
{
    char *x = mmap(0, SIZE, PROT_READ | PROT_WRITE, MAP_ANON | MAP_PRIVATE, -1, 0);
    if (x == MAP_FAILED)
        return 1;

    // Touch each page we mmap'ed so it gets a physical page.
    int i;
    for (i = 0; i < SIZE; i += 1024) {
        x[i] = i;
    }

    madvise(x, SIZE / 2, MADV_FREE);
    fprintf(stderr, "Sleeping. Now check my RSS. Hopefully it's %dMB.\n",
            SIZE / (2 * 1024 * 1024));
    sleep(1024);
    return 0;
}
/* Workaround: cycle the protection on the region; remapping it like
 * this makes the kernel drop the MADV_FREE'd pages immediately. */
mprotect(addr, length, PROT_NONE);
mprotect(addr, length, PROT_READ | PROT_WRITE);
Note that, as you say, madvise is lazier, which is probably better for performance (just in case anyone is tempted to use this trick for performance rather than for measurement).
Use MADV_FREE_REUSABLE on macOS. According to Apple's magazine_malloc implementation:
On OS X we use MADV_FREE_REUSABLE, which signals the kernel to remove the given pages from the memory statistics for our process. However, on returning that memory to use we have to signal that it has been reused.
https://opensource.apple.com/source/libmalloc/libmalloc-53.1.1/src/magazine_malloc.c.auto.html
Chromium, for example, also uses it:
MADV_FREE_REUSABLE is similar to MADV_FREE, but also marks the pages with the reusable bit, which allows both Activity Monitor and memory-infra to correctly track the pages.
https://github.com/chromium/chromium/blob/master/base/memory/discardable_shared_memory.cc#L377
I've looked and looked, and I don't think this is possible. :\
We're solving the problem by adding code to the allocator which explicitly decommits MADV_FREE'd pages when we ask it to.

GetIpAddrTable() leaks memory. How to resolve that?

On my Windows 7 box, this simple program causes the memory use of the application to creep up continuously, with no upper bound. I've stripped out everything non-essential, and it seems clear that the culprit is the Microsoft Iphlpapi function "GetIpAddrTable()". On each call, it leaks some memory. In a loop (e.g. checking for changes to the network interface list), it is unsustainable. There seems to be no async notification API which could do this job, so now I'm faced with possibly having to isolate this logic into a separate process and recycle the process periodically -- an ugly solution.
Any ideas?
// IphlpLeak.cpp - demonstrates that GetIpAddrTable leaks memory internally: run this and watch
// the memory use of the app climb up continuously with no upper bound.
#include <stdio.h>
#include <windows.h>
#include <assert.h>
#include <Iphlpapi.h>
#pragma comment(lib,"Iphlpapi.lib")
void testLeak() {
    static unsigned char buf[16384];
    DWORD dwSize(sizeof(buf));
    if (GetIpAddrTable((PMIB_IPADDRTABLE)buf, &dwSize, false) == ERROR_INSUFFICIENT_BUFFER)
    {
        assert(0); // we never hit this branch.
        return;
    }
}
int main(int argc, char* argv[]) {
    for (int i = 0; true; i++) {
        testLeak();
        printf("i=%d\n", i);
        Sleep(1000);
    }
    return 0;
}
@Stabledog:
I ran your example, unmodified, for 24 hours but did not observe the program's Commit Size increasing indefinitely. It always stayed below 1024 kilobytes. This was on Windows 7 (32-bit, and without Service Pack 1).
Just for the sake of completeness, what happens to memory usage if you comment out the entire if block and the sleep? If there's no leak there, then I would suggest you're correct as to what's causing it.
Worst case, report it to MS and see if they can fix it - you have a nice simple test case to work from which is more than what I see in most bug reports.
Another thing you may want to try is to check the error code against NO_ERROR rather than a specific error condition. If you get back a different error than ERROR_INSUFFICIENT_BUFFER, there may be a leak for that:
DWORD dwRetVal = GetIpAddrTable((PMIB_IPADDRTABLE)buf, &dwSize, false);
if (dwRetVal != NO_ERROR) {
    printf("ERROR: %d\n", dwRetVal);
}
I've been all over this issue now: it appears that there is no acknowledgment from Microsoft on the matter, but even a trivial application grows without bounds on Windows 7 (not XP, though) when calling any of the APIs which retrieve the local IP addresses.
So the way I solved it -- for now -- was to launch a separate instance of my app with a special command-line switch that tells it "retrieve the IP addresses and print them to stdout". I scrape stdout in the parent app, the child exits and the leak problem is resolved.
But it wins "dang ugly solution to an annoying problem", at best.

barriers in SMP linux kernel

Is there something like pthread_barrier in the SMP Linux kernel?
When the kernel works simultaneously on two or more CPUs on the same structure, a barrier (like pthread_barrier) can be useful: it blocks every CPU that enters it until the last CPU reaches the barrier, and from that moment all CPUs proceed again.
You can probably get equivalent behavior using a completion:
struct fake_barrier_t {
    atomic_t count;
    struct completion comp;
};

/* run before each pass */
void initialize_fake_barrier(struct fake_barrier_t* b)
{
    atomic_set(&b->count, 0);
    init_completion(&b->comp);
}

/* make all tasks sleep until the nth arrives, then wake all. */
void fake_barrier(struct fake_barrier_t* b, int n)
{
    if (atomic_inc_return(&b->count) < n)
        wait_for_completion(&b->comp);
    else
        complete_all(&b->comp);
}
I'm not familiar with the pthread_barrier() construct, but the kernel has a large number of options for memory barriers.
See Documentation/memory-barriers.txt in the kernel source for the documentation.
If you're trying to force a set of threads to wait for each other, you can probably hack something together with mutexes and/or waitqueues - though I'm not sure when you'd want to do that. When do you ever want threads to wait on each other? I am very curious now...

Do memory deallocation routines touch the block being freed?

Windows HeapFree, msvcrt free: do they cause the memory being freed to be paged in? I am trying to estimate whether not freeing memory at exit would speed up application shutdown significantly.
NOTE: This is a very specific technical question. It's not about whether applications should or should not call free at exit.
If you don't cleanly deallocate all your resources at application shutdown, it becomes nigh on impossible to detect whether you have any really serious problems - like memory leaks - which would be more of a problem than a slow shutdown. If the UI disappears quickly, the user will think the application has shut down quickly even if it still has a lot of work to do. With UI, perception of speed is more important than actual speed. When the user selects the 'Exit Application' option, the main application window should disappear immediately; it doesn't matter if the application takes a few seconds after that to free up everything and exit gracefully, because the user won't notice.
I ran a test for HeapFree. The following program triggers an access violation inside HeapFree at i = 31999:
#include <windows.h>

int main() {
    HANDLE heap = GetProcessHeap();
    void * bufs[64000];

    // populate heap
    for (unsigned i = 0; i < _countof(bufs); ++i) {
        bufs[i] = HeapAlloc(heap, 0, 4000);
    }

    // protect a block in the "middle"
    DWORD dwOldProtect;
    VirtualProtect(
        bufs[_countof(bufs) / 2], 4000, PAGE_NOACCESS,
        &dwOldProtect);

    // free blocks
    for (unsigned i = 0; i < _countof(bufs); ++i) {
        HeapFree(heap, 0, bufs[i]);
    }
}
The stack is:
ntdll.dll!_RtlpCoalesceFreeBlocks@16() + 0x12b9 bytes
ntdll.dll!_RtlFreeHeap@12() + 0x91f bytes
shutfree.exe!main() Line 19 C++
So it looks like the answer is "Yes" (this applies to free as well, since it uses HeapFree internally).
I'm almost certain the answer to the speed improvement question would be "yes". Freeing a block may or may not touch the actual block in question, but it will certainly have to update other bookkeeping information in any case. If you have zillions of small objects allocated (it happens), then the effort required to free them all could have a significant impact.
If you can arrange it, you might try setting up your application so that, when it knows it's going to quit, it saves any pending work (configuration, documents, whatever) and exits ungracefully, as sketched below.
