Default memory block for Unix/Linux threads? - windows

Does anybody know how much memory is allocated by default to a thread created on a Unix/Linux operating system?
For Windows XP I found that it allocates a memory block of 1 MB; is that correct?
Thanks in advance.

There's not going to be a single answer to that question.
In fact there's not even a single answer on Windows. Different executables specify different stack limits. And even within a single process, individual threads can have different stack limits.
And it gets even more complicated when you factor in the differences between .NET and native executables. Rather strangely, .NET executables commit the entire stack allocation for each thread as soon as the thread starts. Native executables, on the other hand, reserve the stack allocation and then commit memory on demand using guard pages.

You can see how much space is allocated for thread stacks (measured in kbytes) with ulimit -s.
Quoting from the pthread_create(3) manpage:
On Linux/x86-32, the default stack size for a new thread is 2 megabytes. Under the NPTL threading implementation, if the RLIMIT_STACK soft resource limit at the time the program started has any value other than "unlimited", then it determines the default stack size of new threads. Using pthread_attr_setstacksize(3), the stack size attribute can be explicitly set in the attr argument used to create a thread, in order to obtain a stack size other than the default.
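As a rough illustration of the two knobs mentioned above, here is a minimal Python sketch (Python is used only for brevity; it is not part of the original answer). It reads the RLIMIT_STACK soft limit that `ulimit -s` reports and uses `threading.stack_size()`, which plays a role similar to pthread_attr_setstacksize(3) for threads created from Python. The helper names `format_limit` and `run_with_stack` are made up for this example, and the 512 KiB value is an arbitrary assumption.

```python
import resource
import threading


def format_limit(value):
    """Render an rlimit value in KiB, the unit that `ulimit -s` prints."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value // 1024} KiB"


def run_with_stack(size_bytes, target):
    """Run `target` in a thread created with an explicit stack size."""
    # stack_size(n) applies to threads created afterwards and returns the old setting.
    previous = threading.stack_size(size_bytes)
    try:
        worker = threading.Thread(target=target)
        worker.start()
        worker.join()
    finally:
        threading.stack_size(previous)  # restore the previous setting


# getrlimit reports bytes; the soft limit is the one the man page refers to.
soft, hard = resource.getrlimit(resource.RLIMIT_STACK)
print("RLIMIT_STACK soft:", format_limit(soft))
print("RLIMIT_STACK hard:", format_limit(hard))

# 0 means "use the platform default" (no explicit override is active).
print("explicit Python override:", threading.stack_size())

run_with_stack(512 * 1024, lambda: print("running on a 512 KiB stack"))
```

The printed soft limit should match `ulimit -s` in the same shell, since both read the same resource limit.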

Related

How to calculate total RSS of all thread stacks under Linux?

I have a heavily multi-threaded application under Linux that consumes a lot of memory, and I am trying to categorize its RSS. I found it particularly challenging to estimate the total RSS of all thread stacks in the program. I had the following ideas:
Idea 1: look into /proc/<pid>/smaps and consider the mappings for stacks. The resident size of each mapping is reported, but only the main thread's mapping is annotated as [stack]; the rest are indistinguishable from regular 8 MiB mappings (with the default stack size). Also, reading /proc/<pid>/smaps is pretty expensive, as it produces contention on the kernel's internal VMA data structures.
Idea 2: look into /proc/<tid>/status. There is a VmStk field which should describe the stack resident size, but it always shows the stack size of the main thread. The reason looks pretty clear: the main thread is the only one for which the kernel allocates the stack itself, while the other threads get their stacks from pthreads code, which allocates them as regular memory mappings.
Idea 3: traverse the threads from user space using some pthreads facility, retrieve the stack mapping address and stack size for each thread, and then find out how many pages are resident using mincore(2). As a possible optimization, we may skip calling mincore for sleeping threads and use a cached value for them. Unfortunately, I did not find any suitable way to iterate over pthread_t structures. Note that some of the threads come from libraries which I cannot control, so maintaining any kind of thread registry by registering threads on startup is not possible.
Idea 4: use ptrace(2) to retrieve the thread registers, retrieve the stack pointers from them, then proceed with Idea 1. This way looks excessively hard and intrusive.
Can anybody suggest a more or less intended way to do this? Being non-portable is OK.
Two more ideas I got after some extra research:
Idea 5: from man 5 proc on /proc/<pid>/maps:
There are additional helpful pseudo-paths:
[stack]
The initial process's (also known as the main thread's) stack.
[stack:<tid>] (since Linux 3.4)
A thread's stack (where the <tid> is a thread ID). It corresponds to the /proc/[pid]/task/[tid]/ path.
It looks intriguing, but it seems that this logic has been reverted because it was implemented inefficiently: https://lore.kernel.org/patchwork/patch/716239/. The man page seems obsolete (at least on my Ubuntu Disco 19.04).
Idea 6: This one may actually work. There is a /proc/<tid>/syscall file which may expose the stack pointer register of a blocked thread. Since most of my threads are sleeping on I/O, this allows me to track their rsp value, which I can project onto /proc/<pid>/maps to find the correspondence between a thread and its stack mapping. After that I can implement Idea 3.
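The first half of Idea 6 can be sketched as follows. This is a minimal Python illustration under stated assumptions, not a tested tool: it assumes the /proc/<tid>/syscall layout described in proc(5) (syscall number, six arguments, stack pointer, program counter; "-1 sp pc" when blocked outside a system call; "running" otherwise), and the function names and sample values are invented for the example. The mincore(2) step from Idea 3 is not shown.

```python
def parse_stack_pointer(syscall_line):
    """Return the stack pointer from a /proc/<tid>/syscall line, or None."""
    fields = syscall_line.split()
    # A thread that is not blocked reports just "running": no registers available.
    if len(fields) < 3 or fields[0] == "running":
        return None
    # In both blocked layouts the stack pointer is the second-to-last field.
    return int(fields[-2], 16)


def parse_maps(maps_text):
    """Yield (start, end, name) for each line of /proc/<pid>/maps."""
    for line in maps_text.splitlines():
        parts = line.split(None, 5)
        if not parts:
            continue
        start, end = (int(x, 16) for x in parts[0].split("-"))
        name = parts[5].strip() if len(parts) > 5 else ""
        yield start, end, name


def find_mapping(maps_text, address):
    """Return the (start, end, name) mapping that contains `address`, or None."""
    for start, end, name in parse_maps(maps_text):
        if start <= address < end:
            return start, end, name
    return None


# Made-up sample data in the two file formats (values are illustrative only).
SAMPLE_MAPS = (
    "7f5c10000000-7f5c10800000 rw-p 00000000 00:00 0\n"
    "7ffc3a1e0000-7ffc3a201000 rw-p 00000000 00:00 0                  [stack]\n"
)
SAMPLE_SYSCALL = (
    "202 0x7f5c107fe9d0 0x80 0x0 0x0 0x0 0x0 0x7f5c107fe8a0 0x7f5c11a3c4f7"
)

sp = parse_stack_pointer(SAMPLE_SYSCALL)
start, end, name = find_mapping(SAMPLE_MAPS, sp)
print(f"sp={sp:#x} lies in {start:#x}-{end:#x} ({(end - start) >> 20} MiB) {name!r}")
```

On a live system the same functions would be fed the contents of /proc/<pid>/task/<tid>/syscall and /proc/<pid>/maps. One caveat to keep in mind: the mapping that contains the stack pointer may cover more than the stack itself if the kernel has merged adjacent mappings.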

How does macOS allocate stack and heap for a process?

I want to know how macOS allocate stack and heap memory for a process, i.e. the memory layout of a process in macOS. I only know that the segments of a mach-o executable are loaded into pages, but I can't find a segment that correspond to stack or heap area of a process. Is there any document about that?
Stacks and heaps are just memory. The only thing that makes a stack a stack or a heap a heap is the way it is accessed. Stacks and heaps are allocated the same way all memory is: by mapping pages into the logical address space.
Let's take a step back: the Mach-O format describes how the binary's segments are mapped into virtual memory. Importantly, the memory pages you mentioned have read, write and execute permissions. If it's an executable (i.e. not a dylib) it must contain the __PAGEZERO segment, which has no permissions at all. This is the safeguard area that prevents accidental access to low addresses of virtual memory (this is where the infamous null pointer exception and the like come from when code attempts to access address zero).
The __TEXT segment follows; it is readable and executable (typically without write) and in virtual memory contains the file representation itself. This implies that all the executable code lives here, as well as immutable data such as string constants.
The order may vary, but usually next you will encounter __LINKEDIT read only segment. This is the segment dyld uses to setup externally loaded functions, this is too broad to cover here, but there are numerous answers on the topic.
Finally we have the readable writable __DATA segment the first place a process can actually write to. This is used for global/static variables, external addresses to calls populated by dyld.
We have roughly covered the initial setup of the process, up to the point where it launches through either LC_UNIXTHREAD or, in modern macOS (10.7+), LC_MAIN. This starts the process's main thread. Each thread must have its own stack. Its creation (including its allocation) is handled by the operating system. Notice that so far the process has no awareness of the heap at all (it's the operating system that does the heavy lifting to prepare the stack).
So, to sum up, so far we have 2 independent sources of memory: the process memory representing the Mach-O structure (its size is fixed and determined by the executable's structure) and the main thread's stack (also with a predefined size). The process is about to run a C-like main function. Any local variables declared there move the thread's stack pointer, and so do any calls to functions (local and external), at least to set up the stack frame for the return address. Accessing a global/static variable references the __DATA segment's virtual memory directly.
Reserving stack space in x86-64 assembly would look like this:
sub rsp,16
There are some great SO answers on the System V / AMD64 ABI (which macOS follows) requirements for stack alignment, like this one
Any new thread created will have its own stack to allow setting up stack frames for local variables and calling functions.
Now we can cover heap allocation, which is mediated by libSystem (aka the macOS C standard library) providing malloc/free. Internally this is handled by the mmap and munmap system calls, the kernel API for managing memory pages.
Using those system calls directly is possible, but it might turn out to be inefficient, so malloc/free use an internal memory pool to limit the number of system calls (which are costly to make).
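To make the pooling idea concrete, here is a toy Python sketch: one anonymous mmap call up front, after which allocations are just offset bumps with no further system calls. This is only an illustration of the principle; it is not how the libSystem allocator is actually implemented, and the `BumpPool` class and its 16-byte alignment default are assumptions made for the example.

```python
import mmap


class BumpPool:
    """Toy pool: map pages once, then hand out aligned offsets from them."""

    def __init__(self, pages=4):
        # fileno -1 requests an anonymous mapping (not backed by a file).
        self.region = mmap.mmap(-1, pages * mmap.PAGESIZE)
        self.offset = 0

    def alloc(self, size, align=16):
        """Return the offset of a fresh `size`-byte block inside the region."""
        start = (self.offset + align - 1) // align * align  # round up to alignment
        if start + size > len(self.region):
            raise MemoryError("pool exhausted")
        self.offset = start + size
        return start

    def close(self):
        # Releasing the mapping is the counterpart of munmap.
        self.region.close()


pool = BumpPool(pages=1)
a = pool.alloc(10)
b = pool.alloc(10)
pool.region[b:b + 5] = b"hello"
print(a, b, pool.region[b:b + 5])
pool.close()
```

A real allocator also has to support freeing and reusing blocks, which is where most of the complexity lies; the sketch deliberately leaves that out.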
The changing addresses you mentioned in the comment are caused by:
ASLR (address space layout randomization, which applies to PIE, i.e. position-independent executables) for process memory, a security measure that randomizes the start of virtual memory
Thread local stacks being prepared by the operating system

Checking a process' stack usage in Linux

I am using version 3.12.10 of Linux. I am writing a simple module that loops through the task list and checks the stack usage of each process to see if any are in danger of overflowing the stack. To get the stack limit of the process I use:
tsk->signal->rlim[ RLIMIT_STACK ].rlim_cur
To get the memory address for the start of the stack I use:
tsk->mm->start_stack
I then subtract from it the result of this macro:
KSTK_ESP( tsk )
Most of the time this seems to work just fine, but on occasion I hit a situation where a process appears to use more than its stack limit (usually 8 MB), yet the process continues to run and Linux itself does not report any sort of issue.
My question is, am I using the right variables to check this stack usage?
After doing more research I think I have realized that this is not a good way of determining how much stack was used. The problem arises when the kernel allocates more pages of memory to the stack for that process. Those pages may not be contiguous to the other pages. Thus the current stack pointer may be some value that would result in an invalid calculation.
The value in task->mm->stack_vm can be used to determine how much space was actually allocated to a process' stack. This is not as accurate as how much is actually used, but for my use, good enough.
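For comparison, a similar check can be approximated from user space. The Python sketch below reads the VmStk line of /proc/<pid>/status and compares it with a soft RLIMIT_STACK value. It rests on the assumption that VmStk is derived from the same stack_vm counter mentioned above (that is my understanding of the kernel's /proc code, not something verified against 3.12.10), and the function names are invented for the example.

```python
import resource


def vm_stk_bytes(status_text):
    """Return the VmStk value from /proc/<pid>/status in bytes, or None."""
    for line in status_text.splitlines():
        if line.startswith("VmStk:"):
            # The line looks like "VmStk:       132 kB".
            return int(line.split()[1]) * 1024
    return None  # e.g. kernel threads have no Vm* lines


def stack_headroom(status_text, soft_limit):
    """Bytes left between the allocated stack size and the soft limit."""
    allocated = vm_stk_bytes(status_text)
    if allocated is None or soft_limit == resource.RLIM_INFINITY:
        return None
    return soft_limit - allocated


if __name__ == "__main__":
    try:
        with open("/proc/self/status") as f:
            text = f.read()
    except OSError:
        text = ""  # not on Linux, or /proc is unavailable
    soft, _hard = resource.getrlimit(resource.RLIMIT_STACK)
    print("allocated:", vm_stk_bytes(text), "headroom:", stack_headroom(text, soft))
```

Like stack_vm, this measures how much stack has been allocated rather than how much is in use, so it shares the same accuracy caveat.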

Memory management and Process

Is the heap local to a process? In other words, we have the stack, which is always local to a process and is separate for each process. Does the same apply to the heap? Also, if the heap is local, I believe the heap size should change during runtime as we request more and more memory, so who puts an upper limit on how much memory can be requested?
Heaps are indeed local to the process. Limits are placed by the operating system. Memory can also be limited by the number of bits used for addressing (i.e. 32 bits can only address 4G of memory at a time).
Yes, on modern operating systems there exists a separate heap for each process. There is by the way not just a separate stack for every process, but there is a separate stack for every thread in the process. Thus a process can have quite a number of independent stacks.
But not all operating systems and not all hardware platforms offer this feature. You need a memory management unit (in hardware) for that to work. But desktop computers have that feature since... well... a while back... The 386-CPU? (leave a comment if you know better). You may though find yourself on some kind of micro processor that does not have that feature.
Anyway: the heap size is limited mainly by the operating system and the hardware. The hardware limits it especially through the limited amount of address space that it allows. For example, a 32-bit CPU will not address more than 4GB (2^32). A CPU that features Physical Address Extension (PAE), which current CPUs do support, can address up to 64GB of physical memory, but that is done through wider page-table entries, and a single process cannot make use of it directly. It will always see 4GB max.
Additionally the operating system can limit the memory as it sees fit. On Linux you can see and set limits using the ulimit command. If you are running some code not natively, but for example in an interpreter/virtual machine (such as Java, or PHP), then that environment can additionally limit the heap size.
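The address-space arithmetic and the OS-imposed limit can both be checked quickly. The following Python sketch is only an illustration: `addressable_gib` is a helper invented here, 36 bits is the physical address width PAE provides, and RLIMIT_AS is the resource limit behind `ulimit -v` (other limits exist as well).

```python
import resource

GIB = 1024 ** 3


def addressable_gib(bits):
    """Size in GiB of an address space that uses `bits` address bits."""
    return 2 ** bits // GIB


print("32-bit virtual address space:", addressable_gib(32), "GiB")
print("36-bit PAE physical address space:", addressable_gib(36), "GiB")

# The limit the operating system places on this process's address space.
soft, hard = resource.getrlimit(resource.RLIMIT_AS)
if soft == resource.RLIM_INFINITY:
    print("RLIMIT_AS soft limit: unlimited")
else:
    print("RLIMIT_AS soft limit:", soft // GIB, "GiB")
```
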
The heap is local to a process, but it is shared among the process's threads, while the stack is not: it is per-thread.
Regarding the limit, on Linux for example it is set by ulimit (see the manpage).
On a modern, preemptively multi-tasking OS, each process gets its own address space. The set of memory pages that it can see are separate from the pages that other processes can see. As a result, yes, each process sees its own stack and heap, because the stack and heap are just areas of memory.
On an older, cooperatively multi-tasking OS, each process shared the same address space, so the heap was effectively shared among all processes.
The heap is defined by the collection of things in it, so the heap size only changes as memory is allocated and freed. This is true regardless to how the OS is managing memory.
The top limit of how much memory can be requested is determined by the memory manager. In a machine without virtual memory, the top limit is simply how much memory is installed in the computer. With virtual memory, the top limit is defined by physical memory plus the size of the swap file on the disk.

confusion regarding thread in linux

I know that there is no special difference between a thread and a process in Linux, except that the cr3 register is left untouched on a thread switch, whereas a process switch changes it and flushes the TLB.
Since the threads in a group share the same address space, and the pgd (page table) is not changed, the whole memory layout is shared, and hence the stack space is shared too. But by the general definition each thread owns its own stack; how is this achieved in Linux?
If, say, threadA has its stack in the range x-y, then on the first page fault the page table is updated; similarly threadB, which uses the range u-v, would update the same page table. Hence it is possible to mess up the stack of threadB from threadA.
I just want to get a clear picture of this; please help me out. Is this a safe implementation of threads?
That's correct, there is no OS-enforced protection of the stack memory between threads. One thread A can corrupt the stack of another thread B (if thread A knows where in memory to look).
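The lack of protection can be demonstrated even from a high-level language. In the Python sketch below, one thread allocates a buffer and publishes only its raw address; a second thread then overwrites the memory through that address with ctypes. Python cannot address native thread stacks directly, so the buffer merely stands in for "memory that belongs to another thread", and all names here are invented for the example.

```python
import ctypes
import threading


def demo():
    """Let one thread overwrite memory allocated by another via its raw address."""
    published = {}
    seen = {}
    ready = threading.Event()
    done = threading.Event()

    def owner():
        buf = ctypes.create_string_buffer(b"AAAA")  # memory "owned" by this thread
        published["address"] = ctypes.addressof(buf)
        ready.set()
        done.wait()  # keep the buffer alive until the other thread has finished
        seen["value"] = buf.value

    def intruder():
        ready.wait()
        # Nothing stops this thread from writing through the raw address:
        # both threads run in the same address space.
        ctypes.memset(published["address"], ord("B"), 4)
        done.set()

    threads = [threading.Thread(target=owner), threading.Thread(target=intruder)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return seen["value"]


print(demo())
```

The owner thread observes the modified contents even though it never wrote them itself, which is the same situation as thread A scribbling over thread B's stack.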

Resources