Why should kernel drivers reside in non-paged memory?

Why should kernel drivers reside in non-paged memory? (I know this is true for Windows; I would be curious whether it is also true for other operating systems, and why.)

Non-paged memory is required for any code or data the driver may touch at IRQL >= DISPATCH_LEVEL, such as the ISR or a DPC, because at those levels the kernel cannot service a page fault, and touching paged-out memory crashes the system. Code that does not interface with the device and never runs at DISPATCH_LEVEL or above can safely live in paged memory.
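As a concrete illustration (a minimal sketch of the usual WDM convention; MyDispatchCreate is a made-up routine name), a driver marks routines that only run at PASSIVE_LEVEL as pageable, while the ISR stays in the default non-paged text section:

    #include <ntddk.h>

    DRIVER_DISPATCH MyDispatchCreate;           /* runs at PASSIVE_LEVEL only */

    #ifdef ALLOC_PRAGMA
    #pragma alloc_text(PAGE, MyDispatchCreate)  /* place this routine in a pageable section */
    #endif

    NTSTATUS MyDispatchCreate(PDEVICE_OBJECT DeviceObject, PIRP Irp)
    {
        PAGED_CODE();                           /* asserts IRQL <= APC_LEVEL in checked builds */
        UNREFERENCED_PARAMETER(DeviceObject);
        Irp->IoStatus.Status = STATUS_SUCCESS;
        IoCompleteRequest(Irp, IO_NO_INCREMENT);
        return STATUS_SUCCESS;
    }

The ISR and anything it touches, by contrast, must stay in non-paged code and data sections.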

Related

How does Windows distinguish between disks?

I wonder how Windows distinguishes between different drives and memory modules; I mean, how can Windows write something specifically to disk C or disk D?
In every programming language, when you declare a variable it gets saved into memory, and when you need to store something on the HDD, you have to use some library.
So, how does Windows handle it?
Does it treat all disks and memory modules as a single line of data, and does it only save each medium's beginning address? Like 0x00000 is where disk C begins and 0x15616 is where disk D begins.
As MSalters said,
C: is a symlink to something like \Device\HarddiskVolume1.
What this means is that disk drivers on Windows are implemented as virtual filesystems, a bit like on Linux. I'll explain for Linux, since there's much more documentation, but the answer is quite similar for Windows, even though the two OSes do things differently.
Basically, on Linux everything is a file. Linux ships with disk drivers, as these are at the basis of every computer. Like every OS, Linux exposes a driver model. The Linux driver model for files (including hard disks) exposes functions that the kernel will call to read from and write to the disk: there are open, read and write functions that the kernel expects a file driver to provide.
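As a rough illustration (a minimal sketch, not a complete driver; the mydrv_* names are invented for the example), a Linux character driver advertises those entry points through a struct file_operations that the kernel calls into:

    #include <linux/fs.h>
    #include <linux/module.h>

    static int mydrv_open(struct inode *inode, struct file *filp)
    {
            return 0;                    /* nothing to set up in this sketch */
    }

    static ssize_t mydrv_read(struct file *filp, char __user *buf,
                              size_t count, loff_t *ppos)
    {
            return 0;                    /* 0 = end of file; a real driver copies data to buf */
    }

    static const struct file_operations mydrv_fops = {
            .owner = THIS_MODULE,
            .open  = mydrv_open,
            .read  = mydrv_read,         /* .write, .unlocked_ioctl, ... as needed */
    };

Registering mydrv_fops (for example with register_chrdev() or cdev_add()) is what ties a device node like /dev/sda to this set of callbacks.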
If you wanted, you could write a disk driver and replace the existing one. You write drivers as modules that you can then load into the kernel using utilities that ship with Linux. I won't go into more detail, as I'm not that familiar with it. Once your code is loaded into the kernel, it has access to all kernel code and all hardware, since it runs in kernel mode.
Today, disk drivers generally rely on DMA: the disk controller is a bus-master device on the PCI(e) bus that can perform disk operations without involving the CPU, loading disk data into RAM directly. The PCI specification says that compliant devices must expose a certain interface to the computer; this interface is mostly a set of memory-mapped registers that can be used to send commands to the controller. The OS writes to these registers to tell the controller to perform a disk operation, and the controller triggers an interrupt once it is done. The OS then knows the data has been loaded into RAM and is ready for use. The same applies for writing.
The OS learns the location of these registers at boot by enumerating PCI configuration space (the device's BARs), guided by firmware tables such as ACPI.
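For a sense of what driving such memory-mapped registers looks like from a Linux driver, here is a hedged sketch (the register offsets and the "start" command value are entirely hypothetical):

    #include <linux/io.h>
    #include <linux/pci.h>

    #define REG_COMMAND  0x00        /* hypothetical command register */
    #define REG_STATUS   0x04        /* hypothetical status register */

    static void __iomem *regs;

    static int start_transfer(struct pci_dev *pdev)
    {
            /* Map BAR 0 (the device's memory-mapped registers) into kernel space. */
            regs = pci_iomap(pdev, 0, 0);
            if (!regs)
                    return -ENOMEM;

            writel(0x1, regs + REG_COMMAND);   /* tell the device to start */
            return readl(regs + REG_STATUS);   /* completion is normally reported via an interrupt */
    }

In a real driver the "done" notification arrives as an interrupt registered with request_irq(), rather than by polling the status register.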
In modern Windows (2000 or later) C: is a symlink to something like \Device\HarddiskVolume1. The number there can vary. Typically, \Device\Bootpartition is also a symlink to the same HarddiskVolume.
Windows doesn't use libraries to write to disk. Instead, it uses drivers. The chief difference is that drivers run as part of the OS kernel, while libraries run as part of applications.
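If you're curious, you can see what C: resolves to from user mode with the Win32 QueryDosDevice call; a minimal sketch:

    #include <windows.h>
    #include <stdio.h>

    int main(void)
    {
        char target[MAX_PATH];

        /* Resolve the DOS device name "C:" to its NT object path,
           e.g. \Device\HarddiskVolume1 on a typical system. */
        if (QueryDosDeviceA("C:", target, sizeof(target)))
            printf("C: -> %s\n", target);
        else
            printf("QueryDosDeviceA failed: %lu\n", GetLastError());

        return 0;
    }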

Difference between Character Device, Platform Driver and Kernel Module

I am a newbie to Linux kernel device driver code.
One question on top of the other: what is the difference between:
Character Device
Platform Driver
Kernel Module
I am writing this question because, within the same code I am examining, there are three sections: one for each.
Platform Device Driver:
A platform device driver is generally written for on-chip components/devices and for on-chip/off-chip devices that are not discoverable by the bus.
If a device, on-chip or off-chip, doesn't have a self-identifying capability (say I2C devices, GPIO-line-based devices, or in-circuit/on-chip timers), then it needs to be described to the driver, because the device has no self-ID or capability to identify itself. This generally happens with simple bus lines and on-chip components.
Here is a more detailed explanation.
Example platform devices: I2C devices; kernel/Documentation/i2c/instantiating-devices describes how such devices are declared to the kernel.
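As a rough sketch of what the "platform driver" part of such code looks like (the mydev names are invented; a real driver would also handle resources and device-tree matching), a platform driver registers probe/remove callbacks and a name for the kernel to match against declared platform devices:

    #include <linux/module.h>
    #include <linux/platform_device.h>

    static int mydev_probe(struct platform_device *pdev)
    {
            /* Called when a matching platform device is found:
               map registers, request IRQs, etc. */
            return 0;
    }

    static int mydev_remove(struct platform_device *pdev)
    {
            return 0;
    }

    static struct platform_driver mydev_driver = {
            .probe  = mydev_probe,
            .remove = mydev_remove,
            .driver = {
                    .name = "mydev",   /* matched against the platform device's name */
            },
    };

    module_platform_driver(mydev_driver);  /* register on load, unregister on unload */
    MODULE_LICENSE("GPL");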
Character Device Driver:
Basically, all device drivers can be categorized as character or block drivers, based on the size of their data transactions.
Though there are many sub-classifications, like network device drivers and X device drivers, they too can be grouped with devices whose data transactions (operations) are carried out a few bytes at a time.
Typically, a platform device driver fits into the character device driver category, as it generally involves on-chip operations for initialization and for transferring a few bytes whenever needed, but not blocks (KB, MB, GB) of data.
Kernel Module?
Now, a driver can either be compiled (to be integrated) into the kernel image (zImage/bzImage/...), or it can be compiled outside the kernel image as an optional modular driver. Such a module is not part of the kernel image but lives in the filesystem as a .ko (kernel object) file (find /lib/modules/`uname -r`/ -name "*.ko"); it stays off the kernel image, but can be inserted (using modprobe/insmod) or removed (using rmmod/modprobe -r) as necessary.
On the other hand, a built-in driver can't be removed dynamically, even if we don't need it for the moment. A built-in driver remains in the kernel, and hence in RAM, for as long as the system is running, even if the respective device is "not found", "not necessary" or "shut down", just wasting memory (RAM).
The module (or modular driver) only steps in when necessary, being loaded from secondary storage into RAM, and it can be removed if the device is removed or not in action. This saves RAM and allows dynamic allocation of resources.
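To make the "kernel module" part concrete, here is a minimal sketch of a loadable module (the hello name is made up); built against the kernel headers, it can be inserted with insmod/modprobe and removed with rmmod:

    #include <linux/init.h>
    #include <linux/module.h>

    static int __init hello_init(void)
    {
            pr_info("hello: loaded\n");     /* runs at insmod/modprobe time */
            return 0;
    }

    static void __exit hello_exit(void)
    {
            pr_info("hello: unloaded\n");   /* runs at rmmod time */
    }

    module_init(hello_init);
    module_exit(hello_exit);
    MODULE_LICENSE("GPL");

A driver built this way and a driver built into the kernel image share the same source; the difference is only whether it is configured as "m" (module) or "y" (built-in) at kernel build time.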

How does U-Boot communicate with Linux kernel?

I'm reading a book and it says:
After U-Boot loads Linux kernel, the kernel will claim all the resources of U-Boot
What does this mean? Does it mean that all data structures allocated in U-Boot will be discarded?
For example: during U-Boot, PCIe and the network device will be initialized.
After booting the Linux kernel, will the PCIe and network device data structures be discarded? Will the Linux kernel initialize PCIe and the network device again? Or will U-Boot transfer some data to the kernel?
How the communication happens depends on your CPU architecture, but it is usually via a special place in RAM, flash or the filesystem. No data structures are transferred; they would be meaningless to the kernel, and the memory layout will be different between the two. U-Boot generally passes boot parameters, such as what type of hardware is present, what memory to use for something, or which mode to use for a specific driver. So yes, the kernel will re-initialize the hardware. The exception may be some of the low-level CPU specifics which the kernel may expect U-Boot or a BIOS to have set up already.
Depending on your architecture, there may be different mechanisms for U-Boot to communicate with the Linux kernel.
Actually, there are some structures defined by U-Boot which are transferred to and used by the kernel via ATAGs. On ARM, the address at which these structures are passed is stored in the r2 register. They convey information such as the available RAM size and location, the kernel command line, ...
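As a simplified sketch of what such a record looks like (loosely following the ARM kernel's arch/arm/include/uapi/asm/setup.h; the definitions there are authoritative):

    #include <stdint.h>

    #define ATAG_MEM     0x54410002      /* describes one bank of physical RAM */
    #define ATAG_CMDLINE 0x54410009      /* kernel command line string */
    #define ATAG_NONE    0x00000000      /* terminates the tag list */

    struct tag_header {
        uint32_t size;                   /* size of this tag, in 32-bit words */
        uint32_t tag;                    /* which ATAG_* this is */
    };

    struct tag_mem32 {
        uint32_t size;                   /* size of the memory bank in bytes */
        uint32_t start;                  /* physical start address */
    };

    /* U-Boot builds a chain of such tags in RAM and passes its address to the
       kernel in r2 (unless a device tree is passed instead, see below). */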
Note that on some architectures (like ARM again) we have support for device tree, which is intended to describe the hardware on which the kernel is going to run, as well as the kernel command line, memory and other things. Such a description is usually created at kernel compile time, loaded into memory by U-Boot and, in the case of the ARM architecture, its address is also passed through the r2 register.
The interesting thing about this (regarding your question) is that U-Boot can change this device tree structure before passing it to the kernel, through the device tree overlay mechanism. So this is a (relatively) new way of U-Boot/kernel communication. Note that device tree is not supported on some architectures.
And in the end, yes, the hardware is re-initialized by the kernel even if it has already been initialized by U-Boot, except for the memory controller and some other very low-level initialization, AFAIK.

How can a 4GB process run on only 2 GB RAM?

Given a 32-bit/64-bit processor, can a 4GB process run on 2GB of RAM? Will it use virtual memory, or will it not run at all?
This is HIGHLY platform dependent. On many 32-bit OSes, no single process can ever use more than 2GB of memory, regardless of the physical memory installed or virtual memory allocated.
For example, my work computers use 32-bit Linux with PAE (Physical Address Extensions) to allow them to have 16GB of RAM installed. The 2GB-per-process limit still applies, however; having the extra RAM simply allows me to run more individual processes. 32-bit Windows is the same way.
64-bit OSes are more of a mixed bag. 64-bit Linux will allow individual processes to map memory well in excess of 32GB (but again, it varies from kernel to kernel); you will be limited only by the amount of swap (Linux virtual memory) you have. 64-bit Windows is a complete crap shoot. Certain versions will only allow 2GB per process, but most will allow >32GB, limited only by the amount of page file the user has allocated.
Microsoft provides a useful table breaking down the various memory limits by OS version/edition. Unfortunately, I can't find a comparable table for Linux with a cursory search, since the landscape there is so fragmented.
Short answer: Depends on the system.
Most 32-bit systems have a limitation of 2GB per process. If your system allows >2GB per process, then we can move on to the next part of your question.
Most modern systems use virtual memory. Yet there are some constrained (and various old) systems that would just run out of space and make you cry. I believe uClinux supports both MMU and MMU-less architectures. Most 32-bit processors have an MMU (a few don't, see the ARM Cortex-M0), and a handful of 16-bit or 8-bit processors have one as well (see the Atmel ATtiny13A-MMU and the Atari MMU).
Any process that needs more memory than is physically available will require a form of Memory Swap (e.g., a partition or file).
Virtual memory is divided into pages. At any given time, a page resides either in RAM or in swap. Any attempt to access a memory page that's not loaded in RAM triggers an exception called a page fault, which is handled by the kernel.
A 64-bit process needing 4GB on a 64-bit OS can generally run in 2GB of physical RAM, by using virtual memory, assuming disk swap space is available, but performance will be severely impacted if all of that memory is frequently accessed.
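To make that concrete, here is a toy experiment (assuming a 64-bit OS with enough swap configured to back the allocation): the program reserves 4 GiB and touches every page, so on a 2 GB machine the kernel ends up constantly paging between RAM and swap.

    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        size_t total = 4ULL * 1024 * 1024 * 1024;   /* 4 GiB of virtual memory */
        size_t page  = 4096;

        unsigned char *buf = malloc(total);          /* on Linux this typically only reserves address space */
        if (!buf) {
            perror("malloc");
            return 1;
        }

        /* Touch one byte per page: each first touch causes a page fault, and once
           physical RAM is exhausted the kernel starts evicting pages to swap. */
        for (size_t off = 0; off < total; off += page)
            buf[off] = 1;

        puts("touched 4 GiB");
        free(buf);
        return 0;
    }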
A 32-bit process can't address a full 4GB of memory in practice (some address space is reserved by the operating system), so a process that needs the full 4GB won't run. Depending on the OS, a process that needs more than 2GB but less than 3-4GB can probably run.

Writing a Windows 64-bit device driver for a 32-bit PCI device

I'm evaluating porting a device driver I wrote several years ago from 32 to 64 bits. The physical device is a 32-bit PCI card. That is, the device is 32-bit, but I need to access it from Win7 x64. The device presents some registers to the Windows world and then performs heavy bus-master data transfers into a chunk of driver-allocated memory.
I've read in the Microsoft documentation that you can signal whether the driver supports 64-bit DMA or not. If it doesn't, the DMA is double-buffered. However, I'm not sure that this applies here. My driver would/could be a full 64-bit one, so it could support 64-bit addresses in the processor address space, but the actual physical device WON'T support them. In fact, the device BARs must be mapped under 4 GB, and the device must get a PC RAM address below 4 GB to perform bus mastering. Does this mean that my driver will always go through double buffering? This is a very performance-sensitive process, and the double buffering could prevent the whole system from working.
Of course, designing a new 64-bit PCI (or PCI-E) board is out of question.
Could anybody give me some resources for this process (apart from the MS pages)?
Thanks a lot!
This is an old post, I hope the answer is still relevant...
There are two parts here, PCI target and PCI master access.
PCI target access: The driver maps the PCI BARs into 64-bit virtual address space and then just reads/writes through a pointer.
PCI master access: You need to create a DmaAdapter object by calling IoGetDmaAdapter(). When creating it, you also describe your device as 32-bit (see the DEVICE_DESCRIPTION parameter). Then you call the DmaAdapter's AllocateCommonBuffer() method to allocate a contiguous DMA buffer in PC RAM.
I am not sure about double-buffering, though. In my experience, double-buffering is not used; instead, DmaAdapter::AllocateCommonBuffer() simply fails if it cannot allocate a buffer that satisfies the DEVICE_DESCRIPTION (in your case, 32-bit DMA addressing).
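A rough WDM-style sketch of that sequence (simplified; the SetupDma32 name, the lengths and the error handling are invented for illustration, and a production driver would keep the adapter and buffer around for later teardown):

    #include <wdm.h>

    static NTSTATUS SetupDma32(PDEVICE_OBJECT Pdo)   /* Pdo = physical device object of the PCI function */
    {
        DEVICE_DESCRIPTION dd;
        ULONG nMapRegs;
        PDMA_ADAPTER adapter;
        PHYSICAL_ADDRESS logical;
        PVOID common;

        RtlZeroMemory(&dd, sizeof(dd));
        dd.Version           = DEVICE_DESCRIPTION_VERSION;
        dd.Master            = TRUE;               /* bus-master device */
        dd.ScatterGather     = TRUE;
        dd.Dma32BitAddresses = TRUE;               /* the card can only address below 4 GB */
        dd.Dma64BitAddresses = FALSE;
        dd.InterfaceType     = PCIBus;
        dd.MaximumLength     = 1024 * 1024;        /* hypothetical largest transfer */

        adapter = IoGetDmaAdapter(Pdo, &dd, &nMapRegs);
        if (!adapter)
            return STATUS_INSUFFICIENT_RESOURCES;

        /* Common buffer: physically contiguous, visible to both CPU and device,
           and guaranteed to respect the addressing limits declared above. */
        common = adapter->DmaOperations->AllocateCommonBuffer(
                     adapter, 64 * 1024, &logical, TRUE /* cache-enabled */);
        if (!common)
            return STATUS_INSUFFICIENT_RESOURCES;

        /* logical.LowPart is the bus address to program into the card's DMA registers. */
        return STATUS_SUCCESS;
    }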
There's no problem writing a 64-bit driver for a device only capable of 32-bit PCI addressing. As Alexey pointed out, the DMA adapter object you create specifies the HW addressing capabilities of your device. As you allocate DMA buffers, the OS takes this into account and will make sure to allocate them within your HW's accessible region. Linux drivers behave similarly: your driver must supply a DMA address mask to associate with your device, which the DMA functions used later will refer to.
The performance hit you could run into is if your application allocates a buffer that you need to DMA to/from. This buffer could be scattered all throughout memory, with pages in memory above 4G. If your driver plans to DMA to these, it will need to lock the buffer pages in RAM during the DMA and build an SGL for your DMA engine based on the page locations. The problem is, for those pages above 4G, the OS would then have to copy/move them to pages under 4G so that your DMA engine is able to access them. That is where the potential performance hit is.
