Mapping IO space to UserMode via CreateFileMapping - windows

I am writing some proof of concept code for KVM for communication between Windows 10 and the Host Linux system.
What I have is a virtual RAM device that is actually connected to a shared memory segment on the Host. The PCIe BAR 2 is a direct mapping to this RAM.
My intent is to provide a high-bandwidth, low-latency means of transferring data that doesn't rely on the usual mechanisms (sockets, etc.). Zero-copy would be ideal.
So far I have pretty much everything working. I have written a driver that calls MmAllocateMdlForIoSpace and then maps the memory into user mode with MmMapLockedPagesSpecifyCache in response to a DeviceIoControl call. This works perfectly: the user-mode application is able to address the shared memory and write to it.
What I am missing is the ability to use CreateFileMapping in user mode to obtain a HANDLE to a mapping of this memory. I am fairly new to Windows driver programming, so I am unsure whether this is even possible. Any pointers on the best way to achieve this would be very helpful.

Related

Linux PCIe DMA driver

I'm currently writing a driver for a PCIe device that should send data to a Linux system using DMA. As far as I can understand my PCIe device needs a DMA controller (DMA master) and my Linux system too (DMA slave). Currently the PCIe device has no DMA controller and should not get one. That confuses me.
A. Is the following possible?
1) The PCIe device sends an interrupt.
2) The Linux driver waits for the interrupt.
3) The driver starts a DMA transfer from the memory-mapped PCIe registers to the Linux system DMA.
4) Userspace reads the data from memory.
I have everything set up for this; the only thing I am missing is how to transfer the data from the PCIe registers to memory.
B. Which system call (or series of) do I need to call to do a DMA transfer?
C. I probably need to set up the DMA on the Linux system, but everything I find points to code that assumes there is a slave, e.g. struct dma_slave_config.
The use case is collecting data from the PCIe device and making it available in memory to userspace.
Any help is much appreciated. Thanks in advance!
DMA, by definition, is completely independent of the CPU and any software (i.e. OS kernel) running on it. DMA is a way for devices to perform memory reads and writes against host memory without the involvement of the host CPU.
The way DMA usually works is something like this: software will allocate a DMA accessible region in memory and share the physical address with the device, say, by performing memory writes against the address space associated with one of the device's BARs. Then, the device will perform a DMA read or write against that block of memory. When that operation is complete, the device will issue an interrupt to the device driver so it can handle the data and/or free the memory.
If your device does not have the capability of issuing a DMA read or write against host memory, then you'll have to interact with it using the CPU only. Discrete DMA controllers have not been a thing for a very long time.
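To illustrate the CPU-only path, here is a minimal sketch of a programmed I/O copy: the CPU reads each word out of the register window itself. The function name pio_read_words is made up for illustration, and regs is assumed to point at a BAR region that has already been mapped. In a real Linux driver you would normally go through the kernel's accessors such as ioread32() or memcpy_fromio() rather than dereferencing a raw pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Copy `count` 32-bit words out of a memory-mapped register window using
// CPU reads only (programmed I/O, no DMA involved).
// `regs` is assumed to point at an already mapped BAR region (hypothetical);
// `volatile` forces one real load per word instead of a cached value.
inline void pio_read_words(const volatile std::uint32_t* regs,
                           std::uint32_t* dst,
                           std::size_t count) {
    assert(regs != nullptr && dst != nullptr);
    for (std::size_t i = 0; i < count; ++i) {
        dst[i] = regs[i];
    }
}
```

Every word costs a CPU read across the bus, which is why this is much slower than letting the device write into host memory on its own.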

Is it possible to allow a particular user-level application to run in kernel-mode?

This is a hypothetical question. Suppose there is an application (which typically executes in user mode) that wants to access kernel data structures, read register values, and perform some kernel-level functions.
Is there a way for kernel and/or CPU to allow this application to perform its functions while maintaining the normal user-level/kernel-level isolation for other applications except this one?
To either put your app in kernel space (kernel memory) or run it in ring 0 CPU mode, you need to do that from kernel code. In normal operation you can't run an app from the kernel with those privileges (at least there is no existing API to do that). It's probably possible to implement some kernel code that is capable of this, but it would be tricky and would break the whole concept of kernel-space/user-space separation, and if any advanced user-space API was used, it wouldn't work anyway.
If you are thinking about just giving your app ring 0 privileges, that won't work either, because the kernel has its own stack and because of the kernel-space/user-space memory separation, so you won't be able to call the internal kernel API.
Basically, you can achieve the same thing by writing a kernel module instead. And for running some kernel code on behalf of a user-space app, you can use the system call interface.
So, answering your question: no, it's not possible to run a user-space app in kernel mode so that it can use the internal kernel API.

Is it possible to set the dma buffer address for a network card?

My understanding of network cards is that when receiving data, that data is DMA'd into main memory through the network card driver. The kernel then copies this memory into user space and sends any necessary messages.
My question is, in Windows, is it possible to set the address that the DMA is writing to? My goal is to eliminate the extra memory copy, similar to the way NVIDIA's GPUDirect pipeline works.
Yes, this is possible. I believe this is called "common buffer DMA". It is used for intelligent network adapters. Taking advantage of it would require writing your own network driver. Here is some Microsoft documentation on it: http://msdn.microsoft.com/en-us/library/windows/hardware/ff565359%28v=vs.85%29.aspx

Accessing Platform Device from Userspace

From a general standpoint, I am trying to figure out how to access a platform device from userspace. To be more specific, I have an EMIF controller on an SoC, which I have added to my device tree, and I believe it is correctly bound to a pre-written EMIF platform device driver. Now I am trying to figure out how I can access this EMIF device from a userspace application. I have come across a couple of different topics that seem to have some connection to this issue, but I cannot quite work out how they relate.
1) From what I have read, most I/O seems to be done through device nodes, which are created by mknod(). Do I need to create a device node in order to access this device?
2) I have read a couple of threads that talk about writing a kernel module (character? block?) that can interface with both userspace and the platform device driver and act as an intermediary.
3) I have read about the possibility of using mmap() to map the memory of my platform device into my virtual memory space. Is this possible?
4) It seems that when the EMIF driver is instantiated, it calls the probe() function. What functions would a userspace application call in the driver?
It's not completely clear what you're needing to do (and I should caveat that I have no experience with EMIF or with "platform devices" specifically), but here's some overview to help you get started:
1) Yes, the usual way of providing access to a device is via a device node. Usually this access is provided by a character device driver unless there's some more specific way of providing it. Most of the time if an application is talking "directly" to your driver, it's a character device. Most other types of devices are used in interfacing with other kernel subsystems: for example, a block device is typically used to provide access from a file system driver (say) to an underlying disk drive; a network driver provides access to the network from the in-kernel TCP/IP stack, etc.
There are several char device methods or entry points that can be supported by your driver, but the most common are "read" (i.e. if a user-space program opens your device and does a read(2) from it), "write" (analogous for write(2)) and "ioctl" (often used for configuration/administrative tasks that don't fall naturally into either a read or write). Note that mknod(2) only creates the user-space side of the device. There needs to be a corresponding device driver in the kernel (the "major device number" given in the mknod call links the user-space node with the driver).
For actually creating the device node in the file system, this can be automated (i.e. the node will automatically show up in /dev) if you call the right kernel functions while setting up your device. There's a special daemon that gets notifications from the kernel and responds by executing the mknod(2) system call.
2) A kernel module is merely a dynamically loadable way of creating a driver or other kernel extension. It can create a character, block or network device (et al.), but then so can a statically linked module. There are some differences in capability, mostly because not all kernel functions you might want to use are "exported" to (i.e. visible to) dynamically loaded modules.
3) It's possible to support mapping of the device memory into user virtual memory space. This would be implemented by yet another driver entry point (mmap). See struct file_operations for all the entry points a char driver can support.
4) This is pretty much up to you: it depends on what the application needs to be able to do. There are many drivers in the kernel that provide no direct function to user-space, only to other kernel code. As to "probe", there are many probe functions defined in various interfaces. In most cases, these are called by the kernel (or perhaps by a higher-level "class" driver) to allow the specific driver to discover, identify and "claim" individual devices. They (probe functions) don't usually have anything directly to do with providing access from user-space, but I might well be missing something in a particular interface.
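For the mmap route, the user-space side might look roughly like the sketch below. It is only a sketch: map_node and unmap_node are names made up here, and it assumes the driver behind the node actually implements the mmap entry point and accepts a shared read/write mapping at offset 0. It uses plain POSIX calls (open, mmap, munmap).

```cpp
#include <cassert>
#include <cstddef>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

// Map `length` bytes of the file or device node at `path` into this process.
// Returns nullptr on failure. Whether a device node can be mapped at all
// depends on its driver providing an mmap entry point (assumed here).
inline void* map_node(const char* path, std::size_t length) {
    assert(path != nullptr && length > 0);
    int fd = open(path, O_RDWR);
    if (fd < 0) {
        return nullptr;
    }
    void* p = mmap(nullptr, length, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);  // the mapping stays valid after the descriptor is closed
    return p == MAP_FAILED ? nullptr : p;
}

// Release a mapping obtained from map_node().
inline void unmap_node(void* p, std::size_t length) {
    if (p != nullptr) {
        munmap(p, length);
    }
}
```

An application would call something like map_node("/dev/emif0", 4096), where /dev/emif0 is a hypothetical node name, and then read or write the returned pointer like ordinary memory.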
You need to create a device node in order to access the device.
The probe function is called when the driver finds a matching device.
For information on platform device API, the following articles could be useful.
The platform device API
Platform devices and device trees

Memory Alignment for a DMA transaction (Windows Driver Foundation)

We are writing a DMA-based driver for a custom made PCI-Express device using WDF for Windows 7.
As you may know, PCI-Express bus transactions are not allowed to cross a 4k memory boundary. The custom device does not check this, and therefore we need to ensure that the driver only requests DMA transfers which are aligned to 4k memory boundaries.
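To make the constraint concrete, here is a small sketch of the arithmetic involved: checking whether a transfer crosses a 4 KiB boundary, and splitting one that does into pieces that do not. The names Chunk, crosses_4k and split_at_4k are made up for illustration and are not part of WDF; this only shows the address math, not how the framework builds its scatter/gather list.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Chunk {
    std::uint64_t addr;  // start address of this piece
    std::size_t len;     // length in bytes
};

constexpr std::uint64_t kBoundary = 4096;

// True if [addr, addr + len) touches more than one 4 KiB-aligned block.
inline bool crosses_4k(std::uint64_t addr, std::size_t len) {
    if (len == 0) {
        return false;
    }
    return (addr / kBoundary) != ((addr + len - 1) / kBoundary);
}

// Split a transfer into pieces, none of which crosses a 4 KiB boundary.
inline std::vector<Chunk> split_at_4k(std::uint64_t addr, std::size_t len) {
    std::vector<Chunk> chunks;
    while (len > 0) {
        // Bytes left before the next 4 KiB boundary.
        std::uint64_t room = kBoundary - (addr % kBoundary);
        std::size_t n = len < room ? len : static_cast<std::size_t>(room);
        assert(!crosses_4k(addr, n));
        chunks.push_back({addr, n});
        addr += n;
        len -= n;
    }
    return chunks;
}
```

For example, a 0x300-byte transfer starting at 0x1F00 would be split into 0x100 bytes at 0x1F00 and 0x200 bytes at 0x2000.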
The profile for the device is WdfDmaProfilePacket64.
We tried using WdfDeviceSetAlignmentRequirement(DevExt->Device, 4095), but this does not result in the DMA start address being properly aligned.
How can we configure the WDF framework so that it only requests properly aligned addresses?
You can handle this in the user-space application: allocate an aligned buffer in user space and then pass it to the driver. It is not easy for a driver to align memory that has already been allocated and initialized. Even in a user-space application you have to allocate extra space and then use only the aligned part. (I know, it's not pretty, which is why I recommend solving this problem on the device side.)
For example, if you use C++ for your user-space application, you can do something like this:
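A minimal sketch of the over-allocate-and-align approach, assuming a power-of-two alignment such as 4096. The names align_up and AlignedBuffer are made up for illustration. On Windows you could also get aligned memory directly from _aligned_malloc, or page-aligned memory from VirtualAlloc, instead of rolling your own.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>

// Round a pointer up to the next multiple of `alignment` (a power of two).
inline unsigned char* align_up(unsigned char* p, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    const std::uintptr_t mask = static_cast<std::uintptr_t>(alignment) - 1;
    const std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(p);
    const std::uintptr_t aligned = (addr + mask) & ~mask;
    return p + (aligned - addr);
}

// Owns an over-allocated block and exposes the aligned part inside it.
class AlignedBuffer {
public:
    AlignedBuffer(std::size_t size, std::size_t alignment)
        : raw_(new unsigned char[size + alignment]),  // extra room to shift
          data_(align_up(raw_.get(), alignment)),
          size_(size) {}

    unsigned char* data() const { return data_; }  // aligned start
    std::size_t size() const { return size_; }     // usable bytes

private:
    std::unique_ptr<unsigned char[]> raw_;  // original allocation, freed here
    unsigned char* data_;                   // aligned pointer inside raw_
    std::size_t size_;
};
```

You would construct it as AlignedBuffer buf(8192, 4096) and hand buf.data() to the driver. Note that this only aligns the start address; a buffer longer than 4 KiB still spans several pages.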

Resources