What I have read from MSDN,
Each new thread or fiber receives its own stack space consisting of both reserved and initially committed memory.
Does the word 'stack' here really mean a 'call stack' or does it mean that it gets piece of memory that is called a stack?
The call stack lives on the stack. Each thread or fiber has its own private stack and that's what the topic you link to is discussing.
That is referring to the call stack - each thread/fiber needs its own to function. Is there a reason that you think it wouldn't be the call stack?
Related
I want to learn and fill gaps in my knowledge with the help of this question.
So, a user is running a thread (kernel-level) and it now calls yield (a system call I presume).
The scheduler must now save the context of the current thread in the TCB (which is stored in the kernel somewhere) and choose another thread to run and loads its context and jump to its CS:EIP.
To narrow things down, I am working on Linux running on top of x86 architecture. Now, I want to get into the details:
So, first we have a system call:
1) The wrapper function for yield will push the system call arguments onto the stack. Push the return address and raise an interrupt with the system call number pushed onto some register (say EAX).
2) The interrupt changes the CPU mode from user to kernel and jumps to the interrupt vector table and from there to the actual system call in the kernel.
3) I guess the scheduler gets called now and now it must save the current state in the TCB. Here is my dilemma. Since, the scheduler will use the kernel stack and not the user stack for performing its operation (which means the SS and SP have to be changed) how does it store the state of the user without modifying any registers in the process. I have read on forums that there are special hardware instructions for saving state but then how does the scheduler get access to them and who runs these instructions and when?
4) The scheduler now stores the state into the TCB and loads another TCB.
5) When the scheduler runs the original thread, the control gets back to the wrapper function which clears the stack and the thread resumes.
Side questions: Does the scheduler run as a kernel-only thread (i.e. a thread which can run only kernel code)? Is there a separate kernel stack for each kernel-thread or each process?
At a high level, there are two separate mechanisms to understand. The first is the kernel entry/exit mechanism: this switches a single running thread from running usermode code to running kernel code in the context of that thread, and back again. The second is the context switch mechanism itself, which switches in kernel mode from running in the context of one thread to another.
So, when Thread A calls sched_yield() and is replaced by Thread B, what happens is:
Thread A enters the kernel, changing from user mode to kernel mode;
Thread A in the kernel context-switches to Thread B in the kernel;
Thread B exits the kernel, changing from kernel mode back to user mode.
Each user thread has both a user-mode stack and a kernel-mode stack. When a thread enters the kernel, the current value of the user-mode stack (SS:ESP) and instruction pointer (CS:EIP) are saved to the thread's kernel-mode stack, and the CPU switches to the kernel-mode stack - with the int $80 syscall mechanism, this is done by the CPU itself. The remaining register values and flags are then also saved to the kernel stack.
When a thread returns from the kernel to user-mode, the register values and flags are popped from the kernel-mode stack, then the user-mode stack and instruction pointer values are restored from the saved values on the kernel-mode stack.
When a thread context-switches, it calls into the scheduler (the scheduler does not run as a separate thread - it always runs in the context of the current thread). The scheduler code selects a process to run next, and calls the switch_to() function. This function essentially just switches the kernel stacks - it saves the current value of the stack pointer into the TCB for the current thread (called struct task_struct in Linux), and loads a previously-saved stack pointer from the TCB for the next thread. At this point it also saves and restores some other thread state that isn't usually used by the kernel - things like floating point/SSE registers. If the threads being switched don't share the same virtual memory space (ie. they're in different processes), the page tables are also switched.
So you can see that the core user-mode state of a thread isn't saved and restored at context-switch time - it's saved and restored to the thread's kernel stack when you enter and leave the kernel. The context-switch code doesn't have to worry about clobbering the user-mode register values - those are already safely saved away in the kernel stack by that point.
What you missed during step 2 is that the stack gets switched from a thread's user-level stack (where you pushed args) to a thread's protected-level stack. The current context of the thread interrupted by the syscall is actually saved on this protected stack. Inside the ISR and just before entering the kernel, this protected-stack is again switched to the kernel stack you are talking about. Once inside the kernel, kernel functions such as scheduler's functions eventually use the kernel-stack. Later on, a thread gets elected by the scheduler and the system returns to the ISR, it switchs back from the kernel stack to the newly elected (or the former if no higher priority thread is active) thread's protected-level stack, wich eventually contains the new thread context. Therefore the context is restored from this stack by code automatically (depending on the underlying architecture). Finally, a special instruction restores the latest touchy resgisters such as the stack pointer and the instruction pointer. Back in the userland...
To sum-up, a thread has (generally) two stacks, and the kernel itself has one. The kernel stack gets wiped at the end of each kernel entering. It's interesting to point out that since 2.6, the kernel itself gets threaded for some processing, therefore a kernel-thread has its own protected-level stack beside the general kernel-stack.
Some ressources:
3.3.3 Performing the Process Switch of Understanding the Linux Kernel, O'Reilly
5.12.1 Exception- or Interrupt-Handler Procedures of the Intel's manual 3A (sysprogramming). Chapter number may vary from edition to other, thus a lookup on "Stack Usage on Transfers to Interrupt and Exception-Handling Routines" should get you to the good one.
Hope this help!
Kernel itself have no stack at all. The same is true for the process. It also have no stack. Threads are only system citizens which are considered as execution units. Due to this only threads can be scheduled and only threads have stacks. But there is one point which kernel mode code exploits heavily - every moment of time system works in the context of the currently active thread. Due to this kernel itself can reuse the stack of the currently active stack. Note that only one of them can execute at the same moment of time either kernel code or user code. Due to this when kernel is invoked it just reuse thread stack and perform a cleanup before returning control back to the interrupted activities in the thread. The same mechanism works for interrupt handlers. The same mechanism is exploited by signal handlers.
In its turn thread stack is divided into two isolated parts, one of which called user stack (because it is used when thread executes in user mode), and second one is called kernel stack (because it is used when thread executes in kernel mode). Once thread crosses the border between user and kernel mode, CPU automatically switches it from one stack to another. Both stack are tracked by kernel and CPU differently. For the kernel stack, CPU permanently keeps in mind pointer to the top of the kernel stack of the thread. It is easy, because this address is constant for the thread. Each time when thread enters the kernel it found empty kernel stack and each time when it returns to the user mode it cleans kernel stack. In the same time CPU doesn't keep in mind pointer to the top of the user stack, when thread runs in the kernel mode. Instead during entering to the kernel, CPU creates special "interrupt" stack frame on the top of the kernel stack and stores the value of the user mode stack pointer in that frame. When thread exits the kernel, CPU restores the value of ESP from previously created "interrupt" stack frame, immediately before its cleanup. (on legacy x86 the pair of instructions int/iret handle enter and exit from kernel mode)
During entering to the kernel mode, immediately after CPU will have created "interrupt" stack frame, kernel pushes content of the rest of CPU registers to the kernel stack. Note that is saves values only for those registers, which can be used by kernel code. For example kernel doesn't save content of SSE registers just because it will never touch them. Similarly just before asking CPU to return control back to the user mode, kernel pops previously saved content back to the registers.
Note that in such systems as Windows and Linux there is a notion of system thread (frequently called kernel thread, I know it is confusing). System threads a kind of special threads, because they execute only in kernel mode and due to this have no user part of the stack. Kernel employs them for auxiliary housekeeping tasks.
Thread switch is performed only in kernel mode. That mean that both threads outgoing and incoming run in kernel mode, both uses their own kernel stacks, and both have kernel stacks have "interrupt" frames with pointers to the top of the user stacks. Key point of the thread switch is a switch between kernel stacks of threads, as simple as:
pushad; // save context of outgoing thread on the top of the kernel stack of outgoing thread
; here kernel uses kernel stack of outgoing thread
mov [TCB_of_outgoing_thread], ESP;
mov ESP , [TCB_of_incoming_thread]
; here kernel uses kernel stack of incoming thread
popad; // save context of incoming thread from the top of the kernel stack of incoming thread
Note that there is only one function in the kernel that performs thread switch. Due to this each time when kernel has stacks switched it can find a context of incoming thread on the top of the stack. Just because every time before stack switch kernel pushes context of outgoing thread to its stack.
Note also that every time after stack switch and before returning back to the user mode, kernel reloads the mind of CPU by new value of the top of kernel stack. Making this it assures that when new active thread will try to enter kernel in future it will be switched by CPU to its own kernel stack.
Note also that not all registers are saved on the stack during thread switch, some registers like FPU/MMX/SSE are saved in specially dedicated area in TCB of outgoing thread. Kernel employs different strategy here for two reasons. First of all not every thread in the system uses them. Pushing their content to and and popping it from the stack for every thread is inefficient. And second one there are special instructions for "fast" saving and loading of their content. And these instructions doesn't use stack.
Note also that in fact kernel part of the thread stack has fixed size and is allocated as part of TCB. (true for Linux and I believe for Windows too)
I have been reading up on the call stack lately. However all examples and articles I have been reading has been single threaded. I am interested in how the call stack looks like in memory and how we can analyse it.
Sorry for including so many questions in one post. But it seems messy to create one post for each question when they are all related.
My questions here are for Windows x86.
So the questions I am having difficulty with is:
Is there always one call stack for each thread in a process? Ie, threads do not share call stacks?
Is the size of each call stack fixed? Or can it be different for each thread?
Let's pretend that we are doing everything ourselves and write our program in assembly. Is the call stack magically given to us? Or do we have to implement it ourselves?
If we make our program in assembly, do we then reserve some memory and set the call stack memory start address to ESP in order to set it up?
-Michael
1) Each thread has its own stack - almost by definition.
2) Maximum stack size is a process limit, specified in header. Initial thread stack size is a thread creation parameter - see CreateThread() API.
3) The OS manages all memory. The stack for new threads is dynamically allocated by the kernel upon thread creation and the top of the stack filled in with a stack frame that, amongst other stuff, allows the thread to begin execution by popping the frame in a similar manner to an interrupt-return. Don't try to do this at home.
4) NO! Import and call the CreateThread() API.
I have written a function in C, which, when called, immediately results in a stack overflow.
Prototype:
void dumpOutput( Settings *, char **, FILE * );
Calling line:
dumpOutput( stSettings, sInput, fpOut );
At the time of calling it, stSettings is already a pointer to Settings structure, sInput is a dynamically allocated 2D array and fpOut is a FILE *. It reaches all the way to the calling line without any errors, no memory leaks etc.
The actual function is rather lengthy and i think its not worth sharing it here as the overflow occurs just as the code enters the function (called the prologue part, i think)
I have tried calling the same function directly from main() with dummy variables for checking if there are any problems with passed arguments but it still throws the stack overflow condition.
The error arises from the chkstk.asm when the function is called. This asm file (according to the comments present in it) tries to probe the stack to check / allocate the memory for the called function. It just keeps jumping to Find next lower page and probe part till the stack overflow occurs.
The local variables in dumpOutput are not memory beasts either, just 6 integers and 2 pointers.
The memory used by code at the point of entering this function is 60,936K, which increases to 61,940K at the point when the stack overflow occurs. Most of this memory goes into the sInput. Is this the cause of error? I don't think so, because only its pointer is being passed. Secondly, i fail to understand why dumpOutput is trying to allocate 1004K of memory on stack?
I am totally at a loss here. Any help will be highly appreciated.
Thanks in advance.
By design, it is _chkstk()'s job to generate a stack overflow exception. You can diagnose it by looking at the generated machine code. After you step into the function, right-click the edit window and click Go To Disassembly. You ought to see something similar to this:
003013B0 push ebp
003013B1 mov ebp,esp
003013B3 mov eax,1000D4h ; <== here
003013B8 call #ILT+70(__chkstk) (30104Bh)
The value passed through the EAX register is the important one, that's the amount of stack space your function needs. Chkstk then verifies it is actually available by probing the pages of stack. If you see it repeatedly looping then the value for EAX in your code is high. Like mine, it is guaranteed to consume all bytes of the stack. And more. Which is what it protects against, you normally get an access violation exception. But there's no guarantee, your code may accidentally write to a mapped page that belongs to, say, the heap. Which would produce an incredibly difficult to diagnose bug. Chkstk() helps you find these bugs before you blow your brains out in frustration.
I simply did it with this little test function:
void test()
{
char kaboom[1024*1024];
}
We can't see yours, but the exception says that you either have a large array as a local variable or you are passing a large value to _alloca(). Fix by allocating that array from the heap instead.
Most likely a stack corruption or recursion error but it's hard to answer without seeing any code
I've been using SetUnhandledExceptionFilter for a long time, and my handler walks the stack and uses dbghelp.dll to convert the addresses into File/Line references. It then writes that to a log file and puts up a dialog with the same information for the user. This USED to work just fine. These days however I'm getting a completely useless stack:
1004bbaa: Lgid.dll, C:\Data\Code\Lgi\trunk\src\win32\Lgi\LgiException.cpp:175
10057de0: Lgid.dll, C:\Data\Code\Lgi\trunk\src\win32\Lgi\GApp.cpp:107
7c864191: kernel32.dll, UnhandledExceptionFilter+0x1c7
102158ed: MSVCRTD.dll, winxfltr.c:228
006dc1a7: Scribe.exe, crtexe.c:345
7c817077: kernel32.dll, RegisterWaitForInputIdle+0x49
00000000: Scribe.exe
Where 'Scribe.exe' is my application. Now if I walk the debugger from the exception handler back up the stack several frames I eventually get to a completely different temporary stack that actually includes all the calls that led up to the crash. Which is the information I actually want to log for the user. It's as if the exception handler is executing on a separate stack from the main application.
What I need is the stack information for the actual application stack, that includes all the calls leading up to the crash. Is there some easy way to get that from inside the exception handler?
According to http://www.eptacom.net/pubblicazioni/pub_eng/except.html I can get the exception's EIP and EBP out of the EXCEPTION_POINTERS 'Context' member. So I tried passing that EBP to my stack walker as it's initial point and it could then walk the application stack correctly. As long as I put the EIP as the first point in the stack walk I get the whole thing.
Are you using x64? Could you be hitting http://blog.paulbetts.org/index.php/2010/07/20/the-case-of-the-disappearing-onload-exception-user-mode-callback-exceptions-in-x64/ ?
When a thread is executing in kernel mode, will the stack pointer points to its kernel mode stack? Similarly, will it points to the user mode stack when the thread is running in user mode?
Thanks.
It depends on what kind of process is executing. Recent linux kernels allow user processes in kernel mode There are also multiple stacks in each process and mode so your reference to "the stack pointer" is a little vague imho.