kzalloc() - Maxmum size at a single call? - memory-management

What is the maximum size that we can allocate using kzalloc() in a single call?
This is a very frequently asked question. Also please let me know if i can verify that value.

The upper limit (number of bytes that can be allocated in a single kmalloc / kzalloc request), is a function of:
the processor – really, the page size – and
the number of buddy system freelists (MAX_ORDER).
On both x86 and ARM, with a standard page size of 4 Kb and MAX_ORDER of 11, the kmalloc upper limit on a single call is 4 MB!
Details, including explanations and code to test this, here:
http://kaiwantech.wordpress.com/2011/08/17/kmalloc-and-vmalloc-linux-kernel-memory-allocation-api-limits/

No different to kmalloc(). That's the question you should ask (or search), because kzalloc is just a thin wrapper that sets GFP_ZERO.
Up to about PAGE_SIZE (at least 4k) is no problem :p. Beyond that... you're right to say lots of people people have asked, it's definitely something you have to think about. Apparently it depends on the kernel version - there used to be a hard 128k limit, but it's been increased (or maybe dropped altogether) now. That's just the hard limit though, what you can actually get depends on a given system. (And very definitely on the kernel version).
Maybe read What is the difference between vmalloc and kmalloc?
You can always "verify" the allocation by checking the return value from kzalloc(), but by then you've probably already logged an allocation failure backtrace. Other than that, no - I don't think there's a good way to check in advance.

However, it depends on your kernel version and config. These limits normally locate in linux/slab.h, usually descripted as below(this example is under linux 2.6.32):
#define KMALLOC_SHIFT_HIGH ((MAX_ORDER + PAGE_SHIFT - 1) <= 25 ? \
(MAX_ORDER + PAGE_SHIFT - 1) : 25)
#define KMALLOC_MAX_SIZE (1UL << KMALLOC_SHIFT_HIGH)
#define KMALLOC_MAX_ORDER (KMALLOC_SHIFT_HIGH - PAGE_SHIFT)
And you can test them with code below:
#include <linux/module.h>
#include <linux/slab.h>
int init_module()
{
printk(KERN_INFO "KMALLOC_SHILFT_LOW:%d, KMALLOC_SHILFT_HIGH:%d, KMALLOC_MIN_SIZE:%d, KMALLOC_MAX_SIZE:%lu\n",
KMALLOC_SHIFT_LOW, KMALLOC_SHIFT_HIGH, KMALLOC_MIN_SIZE, KMALLOC_MAX_SIZE);
return 0;
}
void cleanup_module()
{
return;
}
Finally, the results under linux 2.6.32 32bits are: 3, 22, 8, 4194304, it means the min size is 8 bytes, and the max size is 4MB.
PS.
you can also check the actual size of memory allocated by kmalloc, just use ksize(), i.e.
void *p = kmalloc(15, GFP_KERNEL);
printk(KERN_INFO "%u\n", ksize(p)); /* this will print "16" under my kernel */

Related

How do I disable ASLR for heap addresses for a program compiled and linked with mingw-w64 GCC? [duplicate]

For debugging purposes, I would like malloc to return the same addresses every time the program is executed, however in MSVC this is not the case.
For example:
#include <stdlib.h>
#include <stdio.h>
int main() {
int test = 5;
printf("Stack: %p\n", &test);
printf("Heap: %p\n", malloc(4));
return 0;
}
Compiling with cygwin's gcc, I get the same Stack address and Heap address everytime, while compiling with MSVC with aslr off...
cl t.c /link /DYNAMICBASE:NO /NXCOMPAT:NO
...I get the same Stack address every time, but the Heap address changes.
I have already tried adding the registry value HKLM\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management\MoveImages but it does not work.
Both the stack address and the pointer returned by malloc() may be different every time. As a matter of fact both differ when the program is compiled and run on Mac/OS multiple times.
The compiler and/or the OS may cause this behavior to try and make it more difficult to exploit software flaws. There might be a way to prevent this in some cases, but if your goal is to replay the same series of malloc() addresses, other factors may change the addresses, such as time sensitive behaviors, file system side effects, not to mention non-deterministic thread behavior. You should try and avoid relying on this for your tests.
Note also that &test should be cast as (void *) as %p expects a void pointer, which is not guaranteed to have the same representation as int *.
It turns out that you may not be able to obtain deterministic behaviour from the MSVC runtime libraries. Both the debug and the production versions of the C/C++ runtime libraries end up calling a function named _malloc_base(), which in turn calls the Win32 API function HeapAlloc(). Unfortunately, neither HeapAlloc() nor the function that provides its heap, HeapCreate(), document a flag or other way to obtain deterministic behaviour.
You could roll up your own allocation scheme on top of VirtualAlloc(), as suggested by #Enosh_Cohen, but then you'd loose the debug functionality offered by the MSVC allocation functions.
Diomidis' answer suggests making a new malloc on top of VirtualAlloc, so I did that. It turned out to be somewhat challenging because VirtualAlloc itself is not deterministic, so I'm documenting the procedure I used.
First, grab Doug Lea's malloc. (The ftp link to the source is broken; use this http alternative.)
Then, replace the win32mmap function with this (hereby placed into the public domain, just like Doug Lea's malloc itself):
static void* win32mmap(size_t size) {
/* Where to ask for the next address from VirtualAlloc. */
static char *next_address = (char*)(0x1000000);
/* Return value from VirtualAlloc. */
void *ptr = 0;
/* Number of calls to VirtualAlloc we have made. */
int tries = 0;
while (!ptr && tries < 100) {
ptr = VirtualAlloc(next_address, size,
MEM_RESERVE|MEM_COMMIT, PAGE_READWRITE);
if (!ptr) {
/* Perhaps the requested address is already in use. Try again
* after moving the pointer. */
next_address += 0x1000000;
tries++;
}
else {
/* Advance the request boundary. */
next_address += size;
}
}
/* Either we got a non-NULL result, or we exceeded the retry limit
* and are going to return MFAIL. */
return (ptr != 0)? ptr: MFAIL;
}
Now compile and link the resulting malloc.c with your program, thereby overriding the MSVCRT allocator.
With this, I now get consistent malloc addresses.
But beware:
The exact address I used, 0x1000000, was chosen by enumerating my address space using VirtualQuery to look for a large, consistently available hole. The address space layout appears to have some unavoidable non-determinism even with ASLR disabled. You may have to adjust the value.
I confirmed this works, in my particular circumstances, to get the same addresses during 100 sequential runs. That's good enough for the debugging I want to do, but the values might change after enough iterations, or after rebooting, etc.
This modification should not be used in production code, only for debugging. The retry limit is a hack, and I've done nothing to track when the heap shrinks.

Why ioremap allocated areas have such large aligment?

I have a code that fails calling ioremap() for 4M region. Trying to debug the reason, I've found out that if you call ioremap it will try to allocate continuous addresses with a very large alignment (depending on the size of the area you want to allocate). The code that computes this alignment is in __get_vm_area_node() function (mm/vmalloc.c) and it looks like this:
if (flags & VM_IOREMAP) {
int bit = fls(size);
if (bit > IOREMAP_MAX_ORDER)
bit = IOREMAP_MAX_ORDER;
else if (bit < PAGE_SHIFT)
bit = PAGE_SHIFT;
align = 1ul << bit;
}
On ARM, IOREMAP_MAX_ORDER is defined as 23. This means that in my case, ioremap needs not only 4M of continues addressing in vmalloc area but it also has to be aligned to 4M.
I wasn't able to find any information on why this alignment is needed. I even tried using git blame to see the commit that introduces this change but it seems the code is older than git history so I couldn't find anything.

invalid number specified with option "/HEAP:1[,10]"

My Malloc is failing in my project.
Malloc runs several times via a one of the functions but fails due to lack of memory.
I am trying to increase the heap size in my VC++ but it gives me the error as above in the subject.
Can someone please tell me what is wrong in this ?
Windows server 2003 R2 Enterprise edition
And i am using VC++ 98 edition.
I tried some search but could not get anything conclusive on how to use /HEAP OPTION.
should the numbers be in MB ?
message_t* Allocate_momory(MsgType_t msgType, UInt16 dataLength)
{
// TO DO: Allocate memenory and return the pointer
message_t* mes_t;
mes_t = (message_t*) malloc(sizeof (message_t));
mes_t->msgType = msgType;
mes_t->dataLength = 0;
mes_t->clientID = 0;
mes_t->usageCount = 0;
mes_t->dataBuf = malloc(sizeof (dataLength));
return mes_t;
}
Yes it worked... But it unfortunatly did not solve my problem with malloc :( !!
This is a huge project with too many files.
I can't post the code but can someone guide me how should i try to debug a problem where malloc is failing ?
/HEAP sets the heap size in bytes. Also the square brackets in the documentation denote an optional parameter - you don't actually type these in. So it would be e.g.
/HEAP:1073741824
for a 1 GB heap, or
/HEAP:1073741824,16777216
if you really do want to specify the "commit" parameter in addition to the heap size (you probably don't).
Unfortunately I don't think this will solve your real problem, which is that you are running out of memory. You may have memory leaks, which you can track down with a tool such as valgrind. If that's not the case then you have a bad design, which will be a lot harder to fix than memory leaks.

How can I force MacOS to release MADV_FREE'd pages?

My program has a custom allocator which gets memory from the OS using mmap(MAP_ANON | MAP_PRIVATE). When it no longer needs memory, the allocator calls either munmap or madvise(MADV_FREE). MADV_FREE keeps the mapping around, but tells the OS that it can throw away the physical pages associated with the mapping.
Calling MADV_FREE on pages you're going to need again eventually is much faster than calling munmap and later calling mmap again.
This almost works perfectly for me. The only problem is that, on MacOS, MADV_FREE is very lazy about getting rid of the pages I've asked it to free. In fact, it only gets rid of them when there's memory pressure from another application. Until it gets rid of the pages I've freed, MacOS reports that my program is still using that memory; in the Activity Monitor, its "Real Memory" column doesn't reflect the freed memory.
This makes it difficult for me to measure how much memory my program is actually using. (This difficulty in measuring RSS is keeping us from landing the custom allocator on 10.5.)
I could allocate a whole bunch of memory to force the OS to free up these pages, but in addition to taking a long time, that could have other side-effects, such as causing parts of my program to be paged out to disk.
On a lark, I tried the purge command, but that has no effect.
How can I force MacOS to clean out these MADV_FREE'd pages? Or, how can I ask MacOS how many MADV_FREE'd pages my process has in memory?
Here's a test program, if it helps. The Activity Monitor's "Real Memory" column shows 512MB after the program goes to sleep. On my Linux box, top shows 256MB of RSS, as desired.
#include <sys/mman.h>
#include <stdio.h>
#include <unistd.h>
#define SIZE (512 * 1024 * 1024)
// We use MADV_FREE on Mac and MADV_DONTNEED on Linux.
#ifndef MADV_FREE
#define MADV_FREE MADV_DONTNEED
#endif
int main()
{
char *x = mmap(0, SIZE, PROT_READ | PROT_WRITE, MAP_ANON | MAP_PRIVATE, -1, 0);
// Touch each page we mmap'ed so it gets a physical page.
int i;
for (i = 0; i < SIZE; i += 1024) {
x[i] = i;
}
madvise(x, SIZE / 2, MADV_FREE);
fprintf(stderr, "Sleeping. Now check my RSS. Hopefully it's %dMB.\n", SIZE / (2 * 1024 * 1024));
sleep(1024);
return 0;
}
mprotect(addr, length, PROT_NONE);
mprotect(addr, length, PROT_READ | PROT_WRITE);
Note as you say, madvise is lazier, and that is probably better for performance (just in case anyone is tempted to use this for performance rather than measurement).
Use MADV_FREE_REUSABLE on macOS. According to Apple's magazine_malloc implementation:
On OS X we use MADV_FREE_REUSABLE, which signals the kernel to remove the given pages from the memory statistics for our process. However, on returning that memory to use we have to signal that it has been reused.
https://opensource.apple.com/source/libmalloc/libmalloc-53.1.1/src/magazine_malloc.c.auto.html
Chromium, for example, also uses it:
MADV_FREE_REUSABLE is similar to MADV_FREE, but also marks the pages with the reusable bit, which allows both Activity Monitor and memory-infra to correctly track the pages.
https://github.com/chromium/chromium/blob/master/base/memory/discardable_shared_memory.cc#L377
I've looked and looked, and I don't think this is possible. :\
We're solving the problem by adding code to the allocator which explicitly decommits MADV_FREE'd pages when we ask it to.

Determine physical mem size programmatically on OSX

We're trying to find out how much physical memory is installed in a machine running Mac OS X. We've found the BSD function sysctl(). The problem is this function wants to return a 32 bit value but some Macs are able to address up to 32 GB which will not fit in a 32 bit value. (Actually even 4 GB won't fit in a 32 bit value.) Is there another API available on OS X (10.4 or later) that will give us this info?
The answer is to use sysctl to get hw.memsize as was suggested in a previous answer. Here's the actual code for doing that.
#include <sys/types.h>
#include <sys/sysctl.h>
...
int mib[2];
int64_t physical_memory;
size_t length;
// Get the Physical memory size
mib[0] = CTL_HW;
mib[1] = HW_MEMSIZE;
length = sizeof(int64_t);
sysctl(mib, 2, &physical_memory, &length, NULL, 0);
Did you try googling?
This seems to be the answer:
http://lists.apple.com/archives/scitech/2005/Aug/msg00004.html
sysctl() does work, you just need to fetch hw.memsize instead of hw.physmem. hw.memsize will give you a uint64_t, so no 32 bit problem.
From Obtaining a Mac’s System Profiler data from shell:
Use system_profiler.
Alternatively you can add the data from vm_statistics_data_t to get the total memory
vm_statistics_data_t vm_stat;
int count = HOST_VM_INFO_COUNT;
kern_return_t kernReturn = host_statistics(mach_host_self(), HOST_VM_INFO, (integer_t*)&vm_stat, (mach_msg_type_number_t*)&count);

Resources