Looking for an explanation of kernel driver I/O interface capability - winapi

I am looking at ways of interfacing to specific hardware I/O addresses from various Windows versions, from 32-bit XP up to 64-bit Win7 and beyond. There seem to be various published solutions with varying degrees of capability under different Windows versions, and I am trying to understand the possibilities for creating my own kernel driver. The most basic kernel I/O read/write capability seems to be the direct I/O operations such as READ_PORT_UCHAR and WRITE_PORT_UCHAR (and their word and long derivatives). I have also seen the technique below, which I don't understand. It appears to be some memory-mapping capability of which I have no experience and for which I can find little readable documentation. Could someone comment on the suitability and compatibility of READ_PORT_UCHAR / WRITE_PORT_UCHAR versus the mapping technique that I reproduce below?
Thanks in advance.
case IOCTL_PHYMEM_MAP:
    if (dwInBufLen==sizeof(PHYMEM_MEM) && dwOutBufLen==sizeof(PVOID))
    {
        PHYSICAL_ADDRESS phyAddr;
        PVOID pvk, pvu;
        phyAddr.QuadPart=(ULONGLONG)pMem->pvAddr;
        //get mapped kernel address
        pvk=MmMapIoSpace(phyAddr, pMem->dwSize, MmNonCached);
        if (pvk)
        {
            //allocate mdl for the mapped kernel address
            PMDL pMdl=IoAllocateMdl(pvk, pMem->dwSize, FALSE, FALSE, NULL);
            if (pMdl)
            {
                PMAPINFO pMapInfo;
                //build mdl and map to user space
                MmBuildMdlForNonPagedPool(pMdl);
                pvu=MmMapLockedPages(pMdl, UserMode);
                //insert mapped information to list
                pMapInfo=(PMAPINFO)ExAllocatePool(\
                    NonPagedPool, sizeof(MAPINFO));
                pMapInfo->pMdl=pMdl;
                pMapInfo->pvk=pvk;
                pMapInfo->pvu=pvu;
                pMapInfo->memSize=pMem->dwSize;
                PushEntryList(&lstMapInfo, &pMapInfo->link);
                DebugPrint("Map physical 0x%x to virtual 0x%x, size %u", \
                    pMem->pvAddr, pvu, pMem->dwSize);
                RtlCopyMemory(pSysBuf, &pvu, sizeof(PVOID));
                irp->IoStatus.Information=sizeof(PVOID);
            }
            else
            {
                //allocate mdl error, unmap the mapped physical memory
                MmUnmapIoSpace(pvk, pMem->dwSize);
                irp->IoStatus.Status=STATUS_INSUFFICIENT_RESOURCES;
            }
        }
        else
            irp->IoStatus.Status=STATUS_INSUFFICIENT_RESOURCES;
    }
    else
        irp->IoStatus.Status=STATUS_INVALID_PARAMETER;
    break;

What are these I/O ports that you're trying to access? It's generally a Really Bad Idea to go partying on ports that you don't own because you have no way of synchronizing access to those ports with the driver that owns them, the O/S, or the BIOS (it's possible to take an SMI and have the BIOS start talking to ports that it thinks it owns).
The code snippet provided is also a horribly bad idea and should be burned. Basically, all it's doing is mapping a kernel virtual address to a device register (MmMapIoSpace) and then doing the work to then map that device register into user mode (MmMapLockedPages). There are two obvious problems with it:
1) You don't know the caching attributes of the memory, so randomly specifying MmNonCached can hang the system
2) Same as with I/O ports, you can't just arbitrarily access a device's registers. You can't properly synchronize yourself with the driver that owns them, so you're doomed to eventually borking your system.
-scott
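One gap in the quoted IOCTL is that it maps whatever physical address the caller passes in. As a minimal sketch only, and assuming a driver that legitimately owns a single register window, the request could at least be range-checked before any mapping call. The struct and function names below are hypothetical and not part of the WDK. This check does nothing about the caching-attribute and synchronization problems described above; it only illustrates refusing addresses the driver does not own.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of the one register window this driver owns. */
struct owned_window {
    uint64_t base;  /* first physical address of the window */
    uint64_t size;  /* length of the window in bytes */
};

/*
 * Returns true only if [addr, addr + len) lies entirely inside the window.
 * The comparison uses subtraction so that addr + len can never wrap around.
 */
static bool request_in_window(const struct owned_window *w,
                              uint64_t addr, uint64_t len)
{
    if (len == 0 || len > w->size)
        return false;               /* empty or larger than the window */
    if (addr < w->base)
        return false;               /* starts below the window */
    return addr - w->base <= w->size - len;
}
```

In a real driver the window would come from the resources the PnP manager assigned to the device, not from a hard-coded constant, and a request that fails the check would be completed with STATUS_INVALID_PARAMETER.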

Related

Flushing cache when DMAing directly to user memory

I have written a driver whose purpose is to allow a userspace program to pin its pages and get the physical addresses for them.
Specifically, I do this with a call to get_user_pages_fast in my kernel module.
For reference, the source code to this module can be found here: https://github.com/UofT-HPRC/mpsoc_drivers/tree/master/pinner
Using /dev/mem (and yes, my kernel does allow unsafe /dev/mem accesses) I have confirmed that the physical addresses are correct.
However, I have some external hardware (an AXI DMA in an FPGA, to be precise) which is not working, and it looks like it might be a cache coherency problem. On lines 329-337 of the above linked code, I do this (in this code, cmd.usr_buf is a user virtual address):
//Find the VMA containing the user's buffer
struct vm_area_struct *vma = find_vma(current->mm, (unsigned long)cmd.usr_buf);
if (!vma) {
    printk(KERN_ALERT "pinner: unrecognized user virtual address\n");
    return -EINVAL;
}
flush_cache_range(vma, (unsigned long) cmd.usr_buf, (unsigned long) cmd.usr_buf + cmd.usr_buf_sz);
This doesn't appear to help. I have also tried the more general flush_cache_mm function.
Is there a correct way to flush the cache of user pages?
I tried a different API for flushing the cache. Laurent Pinchart gave a talk called "Mastering the DMA and IOMMU APIs", and in it he explains that the functions in <asm/cacheflush.h> shouldn't be used. Instead, you can use things like dma_map_sg and dma_unmap_sg when pinning user memory. I took a quick look in the kernel sources, and these functions eventually call assembly routines specific to each architecture, which are possibly responsible for disabling the cache in certain memory regions.
Also, dma_sync_sg_for_cpu and dma_sync_sg_for_device can be used to force cache flushes if you try to access the memory between DMA transfers.
I rewrote my kernel driver to use these functions, and it works.

Where to find device-tree?

Coming from this question yesterday, I decided to port this library to my board. I was aware that I would need to change something, so I compiled the library, called it from a small program, and watched what happened. The first problem is here:
// Check for GPIO and peripheral addresses from device tree.
// Adapted from code in the RPi.GPIO library at:
// http://sourceforge.net/p/raspberry-gpio-python/
FILE *fp = fopen("/proc/device-tree/soc/ranges", "rb");
if (fp == NULL) {
    return MMIO_ERROR_OFFSET;
}
This library is aimed at the RPi, so the structure of the system on my board is not the same. I was wondering if somebody could tell me where I could find this file, or what it looks like, so that I can find it myself and carry on with the port.
Thanks.
You don't necessarily want that "file" (or more precisely /proc node).
The code this is found in is setting up to do direct memory mapped I/O using what appears to be a pi-specific gpio-flavored version of the /dev/mem type of device driver for exposing hardware special function registers to userspace.
To port this to your board, you would need to first determine if there is a /dev/mem or similar capability in your kernel which you can activate. Then you would need to determine the appropriate I/O registers for GPIO pins. The pi-specific code is reading the Device Tree to figure this out, but there are other ways, for example you can manually read the programmer's manual of the SoC on which you are running.
Another approach you can consider is adding some small microcontroller (or yes, barebones ***duino) to the system, and using that to collect information from various sensors and peripherals. This can then be forwarded to the SoC over a UART link, or queried via I2C or similar. This adds a small amount of cost and some degree of bottleneck, but it also means that the software on the SoC becomes very portable: to a different comparable chip, or perhaps even to a desktop PC during development.
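For readers wondering what the /proc/device-tree/soc/ranges node actually contains: Device Tree properties are raw big-endian 32-bit cells, and a "ranges" entry is a (child address, parent address, size) tuple. The sketch below decodes the first entry from a buffer. It assumes one cell for each of the three fields; the real cell counts are set by the #address-cells and #size-cells properties and may differ on your SoC. The struct and function names are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* One (child address, parent address, size) entry of a "ranges" property. */
struct dt_range {
    uint32_t child_addr;   /* address as seen on the peripheral bus */
    uint32_t parent_addr;  /* address as seen by the CPU */
    uint32_t size;         /* length of the window in bytes */
};

/* Device Tree cells are stored big-endian regardless of the CPU. */
static uint32_t be32_cell(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/*
 * Decode the first entry of a raw "ranges" blob.
 * ASSUMPTION: one cell each for child address, parent address and size.
 * Returns 0 on success, -1 if the arguments are invalid or the blob is
 * shorter than one entry.
 */
static int dt_first_range(const unsigned char *buf, size_t len,
                          struct dt_range *out)
{
    if (buf == NULL || out == NULL || len < 12)
        return -1;
    out->child_addr  = be32_cell(buf);
    out->parent_addr = be32_cell(buf + 4);
    out->size        = be32_cell(buf + 8);
    return 0;
}
```

The buffer would typically be filled with fread() from the /proc node. If your board has no such node, the same base address and size can be taken from the SoC programmer's manual, as the answer suggests.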

check if the mapped memory supports write combining

I am writing a kernel driver which exposes my I/O device to user space.
Using mmap, the application gets a virtual address through which it writes to the device.
Since I want the application's writes to go out as large PCIe transactions, the driver maps this memory as write-combining.
Depending on the memory type (write-combining or non-cached), the application uses the optimal method to work with the device.
However, some architectures do not support write-combining, or support it only for part of the memory space.
Hence, it is important that the kernel driver tells the application whether or not it succeeded in mapping the memory as write-combining.
I need a generic way to check in the kernel driver whether the memory it mapped (or is going to map) is write-combining or not.
How can I do it?
Here is part of my code:
vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot);
io_remap_pfn_range(vma, vma->vm_start, pfn, PAGE_SIZE, vma->vm_page_prot);
First, you can find out if an architecture supports write-combining at compile time, with the macro ARCH_HAS_IOREMAP_WC. See, for instance, here.
At run time, you can check the return values of ioremap_wc, set_memory_wc and friends to see whether they succeeded.

Unknown symbol flush_cache_range in linux device driver

I am just writing my very first Linux device driver, and I have run into a problem. I want to prevent one memory region from being cached, so I have been trying to use flush_cache_range() and flush_tlb_range() to flush the cache for this memory region. Everything compiles well, but when I try to load the kernel module I get the following errors:
Unknown symbol flush_cache_range (err 0)
Unknown symbol flush_tlb_range (err 0)
I find this very strange. Shouldn't they be defined in the kernel?
I know that I could alternatively use dma_alloc_coherent() to allocate a non-cached memory region. But I don't have a device structure, and although passing NULL for this parameter didn't cause any errors, I also couldn't see any of the data that was supposed to be there.
Some information about my system: I'm trying to get this running on an ARM microcontroller with an integrated FPGA (the Xilinx Zynq). The FPGA copies some data to a memory location specified by the CPU. Now I want to access this memory without getting old data from the caches.
Any help is much appreciated.
You cannot use functions such as flush_cache_range() because they are not intended to be used by modules.
To allocate memory that can be accessed by a DMA device, you must use dma_alloc_coherent().
This requires a valid device structure so that it can do proper mapping between memory addresses and bus addresses.
If your device is not on a bus that is handled by an existing framework (such as PCI), you have to create a platform device.
A few notes:
1- flush_cache_range doesn't "prevent one memory region from being cached". It simply flushes (cleans and invalidates) the caches. Any future writes or reads to this memory region through the same virtual range will go through the cache again.
2- If the FPGA is writing to memory and the CPU is then going to read from this memory, flushing the cache probably isn't the correct thing to do anyway. Usually what you need to do is invalidate the memory region and then tell the FPGA to write.
3- Please take a look at "${kernel-src}/Documentation/DMA-API.txt" in the kernel sources. It has plenty of information about how you can safely (cache maintenance + phys_to_dma translation) use a specific region of memory for DMA.

Convert DMA mapping to virtual address

I have a somewhat unusual situation where I'm developing a simulation module for an Ethernet device. Ideally, the simulation layer would just be identical to the real hardware with regard to the register set. The issue I've run into is that the DMA registers in the hardware are loaded with the DMA mapping (physical) address of the data. I need to use those physical addresses to copy the data from the Tx buffer on the source device to the Rx buffer on the destination device. To do that in module code, I need pointers to virtual memory. I looked at phys_to_virt() and I didn't understand this comment in the man page:
This function does not handle bus mappings for DMA transfers.
Does this mean that a physical address that is retrieved via dma_map_single cannot be converted back to a virtual address using phys_to_virt()? Is there another way to accomplish this conversion?
There is not any general way to map a DMA address to a virtual address. The dma_map_single() function might be programming an IOMMU (eg VT-d on an Intel x86 system), which results in a DMA address that is completely unrelated to the original physical or virtual address. However this presentation and the linked slides gives one approach to hooking an emulated hardware model up to a real driver (basically, use virtualization).
I am not too clear about this question, but if you are using phys_to_virt(), the reason may be that an address as seen on the bus cannot be converted to a virtual address by this function. I am not sure, but try the bus_to_virt(bus_addr) function.
Try dma_virt = virt_to_phys(bus_to_virt(dma_handle))
It worked for me. It gives the same virtual address that was mapped by dma_alloc_coherent().
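Since there is no general way to turn a DMA address back into a virtual address, one option for a simulation layer is to remember the pairing itself at the moment each buffer is mapped. The sketch below is plain user-space C that only illustrates the bookkeeping. It assumes the simulation code can observe every mapping as it is created, and all names (sim_record_mapping, sim_dma_to_cpu, the fixed table size) are hypothetical. A real module would also need locking and a way to remove entries on unmap.

```c
#include <stddef.h>
#include <stdint.h>

#define SIM_MAX_MAPPINGS 16

/* One remembered pairing of a DMA handle with the CPU pointer behind it. */
struct sim_mapping {
    uint64_t dma_addr;  /* value the driver writes into the DMA register */
    size_t   len;       /* length of the mapped buffer in bytes */
    void    *cpu_addr;  /* pointer the simulation can dereference */
    int      in_use;
};

static struct sim_mapping sim_table[SIM_MAX_MAPPINGS];

/* Remember a mapping. Returns 0 on success, -1 if invalid or table full. */
static int sim_record_mapping(uint64_t dma_addr, void *cpu_addr, size_t len)
{
    if (cpu_addr == NULL || len == 0)
        return -1;
    for (size_t i = 0; i < SIM_MAX_MAPPINGS; i++) {
        if (!sim_table[i].in_use) {
            sim_table[i].dma_addr = dma_addr;
            sim_table[i].len      = len;
            sim_table[i].cpu_addr = cpu_addr;
            sim_table[i].in_use   = 1;
            return 0;
        }
    }
    return -1;
}

/* Translate a DMA address that falls inside a recorded buffer, else NULL. */
static void *sim_dma_to_cpu(uint64_t dma_addr)
{
    for (size_t i = 0; i < SIM_MAX_MAPPINGS; i++) {
        const struct sim_mapping *m = &sim_table[i];
        if (m->in_use && dma_addr >= m->dma_addr &&
            dma_addr - m->dma_addr < m->len)
            return (char *)m->cpu_addr + (dma_addr - m->dma_addr);
    }
    return NULL;
}
```

The simulated Tx path would then call sim_dma_to_cpu() on the value found in the descriptor or register and copy from the returned pointer, without relying on phys_to_virt() or bus_to_virt().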

Resources