context switching and kernel modes - linux-kernel

There's a point of terminology that I would like to clarify.
The Linux kernel (probably all UNIX kernels?) executes in supervisor mode (aka kernel mode),
whereas user applications run in user mode; each mode also has its own
memory space.
Unix transfers execution from user mode (user space) to kernel mode (kernel space)
when an application issues a syscall or is interrupted by a hardware interrupt.
However, most of the technical literature talks about context switching, which is when
the kernel switches execution from one task (process) to another.
What is the transfer of execution from user mode to kernel mode called, and how is it related to
context switching?

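The transition from user to kernel mode and back is simply called a 'mode switch'. A mode switch by itself does not change the running task; a context switch happens only when the kernel, already running in kernel mode, decides to schedule a different task. The most common literature, to my knowledge:
William Stallings: Operating Systems: Internals and Design Principles 6/E, Prentice Hall, 2009
Silberschatz, Galvin, Gagne: Operating System Concepts, Wiley & Sons, 2005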
Andrew S. Tanenbaum: Modern Operating Systems 3/E, Prentice Hall, 2008
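To make the distinction concrete, here is a minimal sketch (Unix only, and only an illustration, not a benchmark). It issues many cheap system calls and asks the kernel how many context switches it charged to the process in the meantime. The function name `count_switches_during_syscalls` is made up for this example; `resource.getrusage` and its `ru_nvcsw` / `ru_nivcsw` fields are standard. As far as I know, `os.getppid()` enters the kernel on every call, so each iteration is one round trip of mode switches.

```python
import os
import resource


def count_switches_during_syscalls(n_calls: int) -> dict:
    """Issue n_calls cheap system calls and report how many context
    switches the kernel charged to this process while doing so."""
    before = resource.getrusage(resource.RUSAGE_SELF)
    for _ in range(n_calls):
        # Each call enters the kernel and returns: user -> kernel -> user.
        os.getppid()
    after = resource.getrusage(resource.RUSAGE_SELF)
    return {
        "syscalls": n_calls,
        # Switches because the process blocked or yielded.
        "voluntary_ctx_switches": after.ru_nvcsw - before.ru_nvcsw,
        # Switches because the scheduler took the CPU away.
        "involuntary_ctx_switches": after.ru_nivcsw - before.ru_nivcsw,
    }


if __name__ == "__main__":
    print(count_switches_during_syscalls(100_000))
```

On an otherwise idle machine the context switch counts typically stay far below the number of system calls, which is the point: every syscall is a mode switch, but very few of them lead to a context switch.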

Related

How to disable software SMI (System Management Interrupt) in Windows

Starting with Windows 10 1809, the OS generates lots of software SMIs.
We run our real-time application on a separate processor core, and each SMI introduces an unpredictable delay. Before 1809 it was always possible to disable SMIs in the BIOS.
The call stack in Windows looks like:
hal!HalEfiGetEnvironmentVariable+0x56
hal!HalGetEnvironmentVariableEx+0xb572
nt!IopGetEnvironmentVariableHal+0x2a
nt!IoGetEnvironmentVariableEx+0x85
nt!ExpGetFirmwareEnvironmentVariable+0x91
nt!ExGetFirmwareEnvironmentVariable+0x110ce3
nt!NtQuerySystemEnvironmentValueEx+0x6e
The SMI is generated by an OUT instruction to port 0xb2. It is required to read UEFI variables from NVRAM. When the BIOS is in legacy mode, there are no SMIs.
Is it possible to configure Windows so that it does not access UEFI variables using SMIs?
The short answer is NO, it is not possible to configure Windows to not generate SW SMIs on UEFI Variable accesses, because those SMIs are not generated by Windows. The SMIs are generated inside the firmware.
All UEFI-aware OSes read/write UEFI variables via GetVariable() and SetVariable() services, which are part of Runtime Services exposed by a UEFI firmware to the OS via System Table - see UEFI Spec, section 8. The current implementation of Variable Services in most firmware is to process the actual Get/Set variable requests inside SMM, for security reasons.
So it is the device's firmware that's responsible for generating SW SMIs, not the OS. However, the OS and some system services/applications absolutely need to work with UEFI variables as it is how a UEFI-aware OS is supposed to run on a UEFI firmware.
On processors that support AMD-V (e.g. AMD processors, Hygon processors), the answer is yes, but only from kernel mode. There are two instructions called stgi and clgi, where stgi sets the GIF (Global Interrupt Flag) and clgi clears it. The GIF controls interrupt delivery so that code can run fully atomic sequences. As defined in AMD-V, internal SMIs (e.g. I/O trapping) are discarded and external SMIs (e.g. from external hardware, or IPIs sent by the APIC) are held pending while the GIF is cleared. Make sure the SVME bit in the EFER MSR is enabled before you execute these instructions.
If you would like to do this in a more generic way that does not rely on AMD-V, you may try to get your code into the SMI handler, since SMIs that occur later are latched while the processor is in SMM.
Reference:
Chapter 10.3.3 "Exceptions and Interrupts", Volume 2 "System Programming", AMD64 Architecture Programmer's Manual.
Chapter 15.17 "Global Interrupt Flag, STGI and CLGI Instructions", Volume 2 "System Programming", AMD64 Architecture Programmer's Manual.
https://www.amd.com/system/files/TechDocs/24593.pdf

Is the only way to raise a software interrupt on an x86 CPU through the BIOS?

I was actually under the impression that this was kind of old fashioned, and that modern operating systems (Windows, Linux) were calling the CPU directly.
EDIT:
I read it in the BIOS interrupt call article on Wikipedia.
For example, to print a character to the screen using BIOS interrupt 0x10 [...]
No, a modern application does not and cannot use BIOS interrupts (as they must be called from 16-bit mode).
Software interrupts are used for system calls, which let an application pass control to the operating system to do something (e.g. file access) which the application is not allowed to do (like writing to video memory or accessing the hardware).
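As a small illustration of what a system call looks like from the application's side, here is a sketch for Linux or macOS (it does not apply to Windows). The program calls a thin libc wrapper, and the wrapper executes the trap into the kernel: historically a software interrupt such as int 0x80 on 32-bit x86 Linux, and the dedicated syscall instruction on x86-64. The helper name `pid_via_libc` is made up for this example.

```python
import ctypes
import os

# Load the C library that is already linked into this process.
# CDLL(None) works on Linux and macOS; it is not meant for Windows.
libc = ctypes.CDLL(None)


def pid_via_libc() -> int:
    """Ask the kernel for our process ID through the libc wrapper.

    The wrapper performs the user -> kernel transition for us; no BIOS
    code is involved at any point.
    """
    return libc.getpid()


if __name__ == "__main__":
    # Both paths end up asking the kernel for the same value.
    print(pid_via_libc(), os.getpid())
```

The application never touches the hardware itself here; it only asks the operating system to do the privileged part on its behalf.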

What is a Windows Kernel Driver?

What is a Windows kernel driver written with the WDK?
How is it different from a normal app or service?
Kernel drivers are programs written against Windows NT's native API (rather than the Win32 Subsystem's API) and which execute in kernel mode on the underlying hardware. This means that a driver needs to be able to deal with switching virtual memory contexts between processes, and needs to be written to be incredibly stable -- because kernel drivers run in kernel mode, if one crashes, it brings down the entire system. Kernel drivers are unsuitable for anything but hardware devices because they require administrative access to install or start, and because they remove the security the kernel normally provides to programs that crash -- namely, that they crash themselves and not the entire system.
Long story short:
Drivers use the native API rather than the Win32 API. This means that drivers generally cannot display any UI.
Drivers need to manage memory and how memory is paged explicitly -- using things like paged pool and nonpaged pool.
Drivers need to deal with process context switching and not depend on which process happens to have the page table while they're running.
Drivers cannot be installed into the kernel by limited users.
Drivers run with privileged rights at the processor level.
A fault in a user-level program results in termination of that program's process. A fault in a driver brings down the system with a Blue Screen of Death.
Drivers need to deal with low level hardware bits like Interrupts and Interrupt Request Levels (IRQLs).
It is code that runs in kernel mode rather than user mode. Kernel mode code has direct access to the internals of the OS, hardware etc.
Invariably you write kernel mode modules to implement device drivers.
A kernel driver is a low-level implementation of an "application".
Because it runs in the kernel context, it has the ability to access the kernel API and memory directly.
For example, a kernel driver should be used to:
Control access to files (password protection, hiding)
Allow access to non-standard filesystems (like ext, reiserfs, zfs, etc.) and devices
True API hooks
...and for many other reasons
If you'd like to know more, you can search for the keyword "ring0" with your favorite search engine.
Others have explained the difference from the system-level perspective.
If you are developing in C++, these are the differences between user-mode and kernel-mode development:
An unhandled exception crashes only the process in user mode, but in kernel mode it crashes the whole system (BSOD).
When a user-mode process terminates without freeing its private memory, the system implicitly frees it. In kernel mode, leaked memory is not reclaimed until the system reboots.
User-mode code always executes at PASSIVE_LEVEL. Kernel-mode code can run at higher IRQLs as well.
Kernel code is usually debugged from a separate machine, whereas user-mode code can be debugged on the same machine.
You can't use all C++ functionality in kernel mode; for example, C++ exception handling and the STL are not available.
Entry points differ: in user mode you use main as the entry point, but in kernel mode you need to use DriverEntry.
You can't use the new operator in kernel mode as-is; you need to overload it explicitly.

Windows XP: Have my program run in kernel mode?

I'm currently learning about the different modes the Windows operating system runs in (kernel mode vs. user mode), device drivers, their respective advantages and disadvantages and computer security in general.
I would like to create a practical example of what a faulty device driver that runs in kernel mode can do to the system, by for example corrupting memory used for critical OS-processes.
How can I execute my code in kernel mode instead of user mode, directly?
Do I have to write a dummy device driver and install it to do this?
Where can I read more about kernel and user mode in Windows?
I know the dangers of this and will do all of the experiments only on a virtual machine running Windows XP.
The "Windows Internals" book is rather shallow on the topic at question.
First I should note that any program also runs in kernel mode (KM). This is due to the fact that - not unlike in unixoid systems - for system calls the calling thread transitions into KM where the kernel itself or one of the drivers services the request and then returns to user mode (UM).
A first step to get started would be to download the latest Windows Driver Kit (WDK) and start reading the documentation. If you want a more digestible book, go for one of these:
Windows NT Device Driver Development - though an old title, many of the basics still apply.
Programming the Windows Driver Model (by Oney) - WDM programming in particular, also covers basics, has some errors (as most books).
Undocumented Windows 2000 Secrets (by Schreiber) - contains plenty of information about all kinds of internals at a more technical level than the book mentioned before.
Undocumented Windows NT - contains a more generic part about internals on a technical level followed by a reference of some native API functions.
Windows NT/2000 Native API - the classic, but it's more of a reference. Nevertheless there are several gems (and examples) in it.
Since you want to use Windows XP, many of the techniques described over at rootkit.com (even from some years ago) should work. They also have plenty of samples.
And as you notice by the name of the referenced website, you are in fact in what I'd call a gray area with that question ;)
It's a simple answer, and as you suspect, you do need to write a device driver in order to run in kernel mode. I'm afraid I don't know of a particularly good reference for kernel mode programming but a quick websearch reveals:
en.wikibooks.org/wiki/Windows_Programming/User_Mode_vs_Kernel_Mode
http://www.netomatix.com/Development/Kernelmode.aspx
http://technet.microsoft.com/en-us/library/cc750820.aspx
http://msdn.microsoft.com/en-us/library/ff553208(VS.85).aspx
You will need a good understanding of Windows Internals:
http://technet.microsoft.com/en-us/sysinternals
and yes they have a book: Windows Internals
http://technet.microsoft.com/en-us/sysinternals/bb963901
http://www.amazon.com/Windows%C2%AE-Internals-Including-Windows-PRO-Developer/dp/0735625301
Basically your questions are all answered in this book (and it even comes with samples and hands-on labs).

How can a kernel be non preemptive and still have multiple control paths

In an operating systems course I took a while ago we were working on an old, non-preemptive kernel of Linux (2.4.X). However, we were told that there could be multiple control paths in the kernel simultaneously. Doesn't that contradict the non-preemptive nature of the kernel?
EDIT: I mean, there is no context switch inside the kernel. Last time I tried asking this question I got the response "well, the Linux kernel is preemptive, so there's no problem".
Within the 2.4 kernel, although kernel code could not be arbitrarily pre-empted by other kernel code, kernel code could still voluntarily give up the CPU by sleeping (this is obviously quite a common case).
In addition, kernel code could always be pre-empted by interrupt handlers (unless it specifically disabled interrupts), and the 2.4 kernel also supported SMP, allowing multiple CPUs to be executing within the kernel simultaneously.
The Linux kernel can run in interrupt context or in process (user) context. Process context means it is running on behalf of a process, which has called a syscall. Interrupt context means it is running on behalf of some kind of interrupt (hardware interrupt, softirq, ...).
When you talk about preemptive multitasking, it means the kernel can decide to preempt some task to run another task. When you talk about preemptive kernels, it means the kernel can preempt its own running code to run some other kernel code.
Now, before Linux was a preemptive kernel, you could run kernel code on several CPUs, and kernel code could be interrupted by hardware interrupts (which could end up running softirqs before returning,...). A preemptive kernel means that kernel code can also be preempted by process-context kernel code, to avoid long latencies (preemptive Linux came from the Linux realtime tree).
Of course, all of this is better explained in Rusty Russell's Unreliable Guide to Kernel Hacking and Unreliable Guide to Kernel Locking.
EDIT:
Or, trying to explain it better: when a task calls a syscall on a non-preemptive kernel, that task cannot be preempted until the syscall ends (maybe with EINTR, but this could take a long time). A preemptive kernel allows that task to be preempted, leading to lower average-case and worst-case latencies for other tasks waiting to run.
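The latency argument can be put into numbers with a toy single-CPU model. This is only a sketch under stated assumptions: the function `wakeup_latency` and all the times are hypothetical, and the preemptible case is idealised (a real preemptible kernel still has short non-preemptible regions, for example while holding spinlocks).

```python
def wakeup_latency(syscall_start, syscall_end, wakeup_time, preemptible):
    """Time a newly runnable task waits for the CPU in a toy one-CPU model.

    Another task is inside a syscall from syscall_start to syscall_end,
    and our task becomes runnable at wakeup_time (all in milliseconds).
    """
    if not (syscall_start <= wakeup_time < syscall_end):
        return 0  # the CPU is not inside the syscall; run at once
    if preemptible:
        return 0  # idealised: the kernel code is preempted immediately
    return syscall_end - wakeup_time  # must wait for the syscall to end


# Hypothetical numbers: a 50 ms syscall, wakeups every 10 ms during it.
wakeups = [0, 10, 20, 30, 40]
non_preemptive = [wakeup_latency(0, 50, t, preemptible=False) for t in wakeups]
preemptive = [wakeup_latency(0, 50, t, preemptible=True) for t in wakeups]

print(max(non_preemptive), sum(non_preemptive) / len(non_preemptive))
print(max(preemptive), sum(preemptive) / len(preemptive))
```

In this model the waiting task loses up to the full length of the syscall on the non-preemptive kernel, and nothing on the idealised preemptive one.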
A non-preemptive kernel means that the kernel does not perform context switching on behalf of another process, or interrupt another running process. It can still multitask by implementing cooperative multitasking, where the running processes themselves yield control to the kernel or to other processes. So yes, you can have multitasking and a non-preemptive kernel.
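Cooperative multitasking is easy to sketch with generators. This is a toy model, not how any real kernel is written; `task` and `run_cooperative` are names made up for the example. Several tasks make progress, yet a switch happens only at the points where the running task gives up the CPU by itself.

```python
from collections import deque


def task(name, steps, log):
    """A toy task: do one unit of work, then yield the CPU voluntarily."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield  # nothing can take the CPU away before this point


def run_cooperative(tasks):
    """Round-robin over the tasks; switch only when a task yields."""
    ready = deque(tasks)
    while ready:
        current = ready.popleft()
        try:
            next(current)  # run the task until its next yield
        except StopIteration:
            continue  # the task finished; do not requeue it
        ready.append(current)


log = []
run_cooperative([task("A", 2, log), task("B", 3, log)])
print(log)  # ['A:0', 'B:0', 'A:1', 'B:1', 'B:2']
```

If a task never reached its yield, the others would never run again. That is the trade-off of this design: it is simple, but it depends on every task behaving well.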
There is no context switching within the kernel for MONOLITHIC kernels, but of course the kernel still performs multitasking. Therefore you still have multitasking and non-preemptiveness.
The Linux kernel offloads a lot of work to kernel threads, which may be scheduled in and out alongside userspace tasks, independent of kernel preemption. Even your old 2.4 kernel has these kernel threads, albeit fewer of them than a modern 2.6 kernel. The 2.6 kernel now has several levels of preemption that can be chosen at compile time, but full preemption is not the default.

Resources