How does GFP_ATOMIC prevent sleep - linux-kernel

How does GFP_ATOMIC (in kzalloc) prevent sleeping?
I also found
#define GFP_ATOMIC (__GFP_HIGH)
but did not understand it any further.

GFP_ATOMIC prevents sleeping by telling the memory allocation code that it is not allowed to sleep in order to satisfy the allocation; that's all. If the allocation cannot be satisfied without sleeping and GFP_ATOMIC has been passed, the allocator fails the request instead, and kzalloc returns NULL to the caller.

Memory in the Linux kernel is allocated with kmalloc(size, flags).
The flags parameter tells the kernel how the memory should be allocated. There are three categories of flags: action modifiers, zone modifiers and types.
If you pass GFP_ATOMIC, the allocation is high-priority and does not sleep. This is the flag to use in interrupt handlers, bottom halves and other situations where you cannot sleep.
It works by instructing the kernel to use only memory that is available immediately, for example free objects in the pre-allocated slab caches: if such memory is available, it is handed out atomically; otherwise the allocation fails.
For more information, see http://www.linuxjournal.com/article/6930

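The alternative to GFP_ATOMIC is GFP_KERNEL:
#define GFP_KERNEL (__GFP_WAIT | __GFP_IO | __GFP_FS)
GFP_KERNEL includes __GFP_WAIT, which is tested wherever the allocator considers sleeping. Without this flag, there is no sleep. (In kernels from 4.4 onward __GFP_WAIT has been replaced, and the ability to sleep is governed by __GFP_DIRECT_RECLAIM; the principle is the same.)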

Related

systemd MemoryLimit not enforced

I am running systemd version 219.
root@EVOvPTX1_RE0-re0:/var/log# systemctl --version
systemd 219
+PAM -AUDIT -SELINUX +IMA -APPARMOR +SMACK +SYSVINIT +UTMP -LIBCRYPTSETUP -GCRYPT +GNUTLS +ACL +XZ -LZ4 -SECCOMP +BLKID -ELFUTILS +KMOD -IDN
I have a service, let's call it foo.service which has the following.
[Service]
MemoryLimit=1G
I have deliberately added code that allocates 1M of memory 4096 times,
i.e. 4G in total, when a certain event is received. The idea is that after
the process consumes 1G of address space, memory allocation would start failing.
However, this does not seem to be the case. I am able to allocate 4G of memory
without any issues. This tells me that the memory limit specified in the
service file is not enforced.
Can anyone tell me what I am missing?
I looked at the file named limits in the proc file system. It shows that
Max address space is unlimited, which also suggests that the memory limit
is not being enforced.
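The distinction is that you have allocated memory, but you haven't actually used it. In the output of top, this is the difference between the "VIRT" memory column (allocated) and the "RES" column (actually used). MemoryLimit= is enforced through the memory cgroup, which accounts for memory that is actually in use; it does not set the address-space limit (that is a separate setting, LimitAS=), which is why limits still shows Max address space as unlimited.
Try modifying your experiment to assign values to elements of a large array instead of just allocating memory and see if you hit the memory limit that way.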
Reference: Resident and Virtual memory on Linux: A short example

What is the use of flag PF_MEMALLOC

While browsing the code of a Linux device driver, I found that the flag PF_MEMALLOC is set in the thread (process). I found the definition of this flag in a header file, where the comment says "Allocating memory":
#define PF_MEMALLOC 0x00000800 /* Allocating memory */
So my question is: what exactly is this flag for when it is set on a process/thread with code like current->flags |= PF_MEMALLOC;
This flag is used within the kernel to mark a thread that is currently executing within a memory-allocation path and is therefore allowed to recursively allocate any memory it requires, ignoring watermarks and without being forced to write out dirty pages.
This is to ensure that if the code that is attempting to free pages in order to satisfy an original allocation request itself has to allocate a small amount of memory to proceed, that code won't then recursively try to free pages.
Most drivers should not require this flag.

How to set memory region's protection in kernel mode under Windows 7

Essentially I am looking for a function that could do for kernel mode what VirtualProtect does for user mode.
I am allocating memory using a logic exemplified by the following simplified code.
PMDL mdl = MmAllocatePagesForMdl
(
    LowAddress,
    HighAddress,
    SkipAddress,
    size
);
ULONG flags = NormalPagePriority | MdlMappingNoExecute | MdlMappingNoWrite;
PVOID ptr = MmGetSystemAddressForMdlSafe
(
    mdl,
    flags
);
The MdlMappingNoExecute and MdlMappingNoWrite flags will have effect only on Win8+.
Moreover, using only MmGetSystemAddressForMdlSafe I cannot assign for example NoAccess protection for the memory region.
Are there any additional or alternative APIs I could use to modify the page protection of the allocated memory?
A hack would do too, since this functionality would currently not be used in production code.
C:\Windows\System32>dumpbin /exports ntdll.dll | find "Protect"
391 17E 0004C030 NtProtectVirtualMemory
1077 42C 000CE8F0 RtlProtectHeap
1638 65D 0004C030 ZwProtectVirtualMemory
I think you can call Zw functions from kernel mode, and the args are generally the same as for the corresponding Nt functions. And while ZwProtectVirtualMemory is undocumented, there is a documented ZwAllocateVirtualMemory that accepts protection flags.
Another approach might be to allocate and protect virtual memory in user-mode, pass the buffer down to your driver, then create the corresponding MDL there.
The code I ended up using is below.
All APIs used are official.
Here I create another MDL for a subrange of the allocated memory and change the protection of that subrange.
If you touch memory protected with this method, then:
at IRQL < DISPATCH_LEVEL you will get a PAGE_FAULT_IN_NONPAGED_AREA fault (Invalid system memory was referenced. This cannot be protected by try-except, it must be protected by a Probe. Typically the address is just plain bad or it is pointing at freed memory.)
at IRQL == DISPATCH_LEVEL you will get a DRIVER_IRQL_NOT_LESS_OR_EQUAL fault (An attempt was made to access a pageable (or completely invalid) address at an interrupt request level (IRQL) that is too high. This is usually caused by drivers using improper addresses.)
Note that changing the protection might fail if the subrange is part of a large page allocation. The status will then likely be STATUS_NOT_SUPPORTED.
Large page allocations can happen if the size and alignment of the originally allocated memory region (which depend on the SkipAddress variable in the question) are suitable and some additional preconditions, with which I am not familiar, are fulfilled (perhaps starting from a certain OS version).
PMDL guard_mdl = IoAllocateMdl
(
    NULL,
    PAGE_SIZE * guardPageCount,
    FALSE,
    FALSE,
    NULL
);
if (guard_mdl)
{
    IoBuildPartialMdl
    (
        mdl,
        guard_mdl,
        (PVOID)(0), // offset from the beginning of allocated memory ptr
        PAGE_SIZE * guardPageCount
    );
    status = MmProtectMdlSystemAddress
    (
        guard_mdl,
        PAGE_NOACCESS
    );
}

why recursive filesystem calls would be a bad idea when the GFP_NOFS is masked

From LDD3 page 214:
GFP_NOIO
GFP_NOFS
These flags function like GFP_KERNEL, but they add restrictions on what the kernel can do to satisfy the request. A GFP_NOFS allocation is not allowed to perform any filesystem calls, while GFP_NOIO disallows the initiation of any I/O at all. They are used primarily in the filesystem and virtual memory code where an allocation may be allowed to sleep, but recursive filesystem calls would be a bad idea.
I want to know why recursive filesystem calls are a bad idea when GFP_NOFS is masked.
Thanks!
"I want to know why recursive filesystem calls are a bad idea when GFP_NOFS is masked."
It's the other way around: you use GFP_NOFS to signal that the allocation may sleep but must not call into the filesystem (for example, to write some memory out to disk in order to free memory). It is used in critical sections of filesystem code.
For example: you enter a filesystem call, lock a global mutex for that filesystem, and call kmalloc. If kmalloc then tries to call another filesystem function that locks the same mutex, you have a deadlock. Passing GFP_NOFS prevents that.

Memory management in Contiki-OS

I am trying to port Contiki-OS to the LPC1347, and I have a question about how exactly memory is handled in Contiki. Protothreads are stackless and no "real threads" are used, so everything runs on the same stack and memory allocation is essentially static. I understand how protothreads work, but when a new process is initialized, how is memory allocated for it? And when an event carries data, how is the memory for the event data managed?
All required memory is statically allocated at compile/link time. This is done by the PROCESS macro [1], which allocates a structure containing the necessary information [2]. Events must likewise provide their own memory [3].
It is therefore not possible to run the same thread* or schedule the same event twice.
* Actually it is, but not by using the PROCESS macro.
[1] https://github.com/contiki-os/contiki/blob/5bede26b/core/sys/process.h#L301-311
[2] https://github.com/contiki-os/contiki/blob/5bede26b/core/sys/process.h#L315-326
[3] https://github.com/contiki-os/contiki/blob/5bede26b/core/sys/process.c#L62-66

Resources