How can I access memory at known physical address inside a kernel module? - memory-management

I am trying to work on this problem: A user spaces program keeps polling a buffer to get requests from a kernel module and serves it and then responses to the kernel.
I want to make the solution much faster, so instead of creating a device file and communicating via it, I allocate a memory buffer from the user space and mark it as pinned, so the memory pages never get swapped out. Then the user space invokes a special syscall to tell the kernel about the memory buffer so that the kernel module can get the physical address of that buffer. (because the user space program may be context-switched out and hence the virtual address means nothing if the kernel module accesses the buffer at that time.)
When the module wants to send request, it needs put the request to the buffer via physical address. The question is: How can I access the buffer inside the kernel module via its physical address.
I noticed there is get_user_pages, but don't know how to use it, or maybe there are other better methods?
Thanks.

You are better off doing this the other way around - have the kernel allocate the buffer, then allow the userspace program to map it into its address space using mmap().

Finally I figured out how to handle this problem...
Quite simple but may be not safe.
use phys_to_virt, which calls __va(pa), to get the virtual address in kernel and I can access that buffer. And because the buffer is pinned, that can guarantee the physical location is available.
What's more, I don't need s special syscall to tell the kernel the buffer's info. Instead, a proc file is enough because I just need tell the kernel once.

Related

Difference between copying user space memory and mapping userspace memory

What is the difference between copying from user space buffer to kernel space buffer and, mapping user space buffer to kernel space buffer and then copying kernel space buffer to another kernel data structure?
What I meant to say is:
The first method is copy_from_user() function.
The second method is say, a user space buffer is mapped to kernel space and the kernel is passed with physical address(say using /proc/self/pagemap), then kernel space calls phys_to_virt() on the passed physical address to get it's corresponding kernel virtual address. Then kernel copies the data from one of its data structures say skb_buff to the kernel virtual address it got from the call to phys_to_virt() call.
Note: phys_to_virt() adds an offset of 0xc0000000 to the passed physical address to get kernel virtual address, right?
The second method describes the functionality in DPDK for KNI module and they say in documentation that it eliminates the overhead of copying from user space to kernel space. Please explain me how.
It really depends on what you're trying to accomplish, but still some differences I can think about?
To begin with, copy_from_user has some built-in security checks that should be considered.
While mapping your data "manually" to kernel space enables you to read from it continuously, and maybe monitor something that the user process is doing to the data in that page, while using the copy_to_user method will require constantly calling it to be aware of changes.
Can you elaborate on what you are trying to do?

Releasing device memory mapped to user space if program terminates

I am very new to Windows. I want to access device memory from user space and after doing some googling, kind of got an understanding on how to do this.
MmMapIoSpaceEx to map the physical address to system VA
IoAllocateMdl to allocate MDL for the same VA.
MmBuildMdlForNonPagedPool to update the MDL with underlying physical addresses.
MmMapLockedPagesSpecifyCache with UserMode flag.
My question is after mapping this memory to userspace, if program crashes for some reason, do I need to unmap using MmUnmapLockedpages.
I was thinking since the page table for the process will get destroyed as part of it's exit, unmapping might not be required. In that case I need to call only IoFreeMdl.
Is my understanding Right?. It would be great if you could point to any resources.
I really appreciate your help.

Physical Memory Allocation in Kernel

I am writting a Kernel Module that is going to trigger and external PCIe device to read a block of data from my internel memory. To do this I need to send the PCIe device a pointer to the physical memory address of the data that I would like to send. Ultimately this data is going to be written from Userspace to the kernel with the write() function (userspace) and copy_from_user() (kernel space). As I understand it, the address that my kernel module will see is still a virtual memory address. I need a way to get the physical address of it so that the PCIe device can find it.
1) Can I just use mmap() from userspace and place my data in a known location in DDR memory, instead of using copy_from_user()? I do not want to accidently overwrite another processes data in memory though.
2) My kernel module reserves PCIe data space at initialization using ioremap_nocache(), can I do the same from my kernel module or is it a bad idea to treat this memory as io memory? If I can, what would happen if the memory that I try to reserve is already in use? I do not want to hard code a static memory location and then find out that it is in use.
Thanks in advance for you help.
You don't choose a memory location and put your data there. Instead, you ask the kernel to tell you the location of your data in physical memory, and tell the board to read that location. Each page of memory (4KB) will be at a different physical location, so if you are sending more data than that, your device likely supports "scatter gather" DMA, so it can read a sequence of pages at different locations in memory.
The API is this: dma_map_page() to return a value of type dma_addr_t, which you can give to the board. Then dma_unmap_page() when the transfer is finished. If you're doing scatter-gather, you'll put that value instead in the list of descriptors that you feed to the board. Again if scatter-gather is supported, dma_map_sg() and friends will help with this mapping of a large buffer into a set of pages. It's still your responsibility to set up the page descriptors in the format expected by your device.
This is all very well written up in Linux Device Drivers (Chapter 15), which is required reading. http://lwn.net/images/pdf/LDD3/ch15.pdf. Some of the APIs have changed from when the book was written, but the concepts remain the same.
Finally, mmap(): Sure, you can allocate a kernel buffer, mmap() it out to user space and fill it there, then dma_map that buffer for transmission to the device. This is in fact probably the cleanest way to avoid copy_from_user().

Modifying Linux process page table for physical memory access without system call

I am developing a real-time application for Linux 3.5.7. The application needs to manage a PCI-E device.
In order to access the PCI-E card spaces, I have been using mmap in combination with /dev/mem. However (please correct me if I am wrong) each time I read or write the mapped memory, a system call is required for the /dev/mem pseudo-driver to handle the memory access.
To avoid the overhead of this system call, I think it should be possible to write a kernel module so that, within e.g. a ioctl call I can modify the process page table, in order to map the physical device pages to userspace pages and avoid the system call.
Can you give me some orientation on this?
Thanks and regards
However (please correct me if I am wrong) each time I read or write the mapped memory, a system call is required
You are wrong.
it should be possible to write a kernel module so that, within e.g. a ioctl call I can modify the process page table
This is precisely what mmap() does.

User to kernel mode big picture?

I've to implement a char device, a LKM.
I know some basics about OS, but I feel I don't have the big picture.
In a C programm, when I call a syscall what I think it happens is that the CPU is changed to ring0, then goes to the syscall vector and jumps to a kernel memmory space function that handle it. (I think that it does int 0x80 and in eax is the offset of the syscall vector, not sure).
Then, I'm in the syscall itself, but I guess that for the kernel is the same process that was before, only that it is in kernel mode, I mean the current PCB is the process that called the syscall.
So far... so good?, correct me if something is wrong.
Others questions... how can I write/read in process memory?.
If in the syscall handler I refer to address, say, 0xbfffffff. What it means that address? physical one? Some virtual kernel one?
To read/write memory from the kernel, you need to use function calls such as get_user or __copy_to_user.
See the User Space Memory Access API of the Linux Kernel.
You can never get to ring0 from a regular process.
You'll have to write a kernel module to get to ring0.
And you never have to deal with any physical addresses, 0xbfffffff represents an address in a virtual address space of your process.
Big picture:
Everything happens in assembly. So in Intel assembly, there is a set of privilege instruction which can only be executed in Ring0 mode (http://en.wikipedia.org/wiki/Privilege_level). To make the transition into Ring0 mode, you can use the "Int" or "Sysenter" instruction:
what all happens in sysenter instruction is used in linux?
And then inside the Ring0 mode (which is your kernel mode), accessing the memory will require the privilege level to be matched via DPL/CPL/RPL attributes bits tagged in the segment register:
http://duartes.org/gustavo/blog/post/cpu-rings-privilege-and-protection/
You may asked, how the CPU initialize the memory and register in the first place: it is because when bootup, x86 CPU is running in realmode, unprotected (no Ring concept), and so everything is possible and lots of setup work is done.
As for virtual vs non-virtual memory address (or physical address): just remember that anything in the register used for memory addressing, is always via virtual address (if the MMU is setup, protected mode enabled). Look at the picture here (noticed that anything from the CPU is virtual address, only the memory bus will see physical address):
http://en.wikipedia.org/wiki/Memory_management_unit
As for memory separation between userspace and kernel, you can read here:
http://www.inf.fu-berlin.de/lehre/SS01/OS/Lectures/Lecture14.pdf

Resources