Flushing cache when DMAing directly to user memory

I have written a driver whose purpose is to allow a userspace program to pin its pages and get the physical addresses for them.
Specifically, I do this with a call to get_user_pages_fast in my kernel module.
For reference, the source code to this module can be found here: https://github.com/UofT-HPRC/mpsoc_drivers/tree/master/pinner
Using /dev/mem (and yes, my kernel does allow unsafe /dev/mem accesses) I have confirmed that the physical addresses are correct.
However, I have some external hardware (an AXI DMA in an FPGA, to be precise) which is not working, and it looks like it might be a cache coherency problem. On lines 329-337 of the above linked code, I do this (in this code, cmd.usr_buf is a user virtual address):
//Find the VMA containing the user's buffer
struct vm_area_struct *vma = find_vma(current->mm, (unsigned long)cmd.usr_buf);
if (!vma) {
    printk(KERN_ALERT "pinner: unrecognized user virtual address\n");
    return -EINVAL;
}
flush_cache_range(vma, (unsigned long) cmd.usr_buf, (unsigned long) cmd.usr_buf + cmd.usr_buf_sz);
This doesn't appear to help. I have also tried the more general flush_cache_mm function.
Is there a correct way to flush the cache of user pages?

I tried a different API for flushing the cache. Laurent Pinchart gave a talk called "Mastering the DMA and IOMMU APIs", and in it he explains that the functions in <asm/cacheflush.h> shouldn't be used. Instead, you can use things like dma_map_sg and dma_unmap_sg when pinning user memory. I took a quick look in the kernel sources, and these functions eventually call assembly routines specific to each architecture, which appear to perform the required cache maintenance (cleaning and invalidating the affected lines) for the mapped region.
Also, dma_sync_sg_for_cpu and dma_sync_sg_for_device can be used to force cache maintenance if you need to access the memory between DMA transfers.
I rewrote my kernel driver to use these functions, and it works.

Related

check if the mapped memory supports write combining

I am writing a kernel driver which exposes my I/O device to user space.
Using mmap, the application gets a virtual address through which it writes to the device.
Since I want the application's writes to use large PCIe transactions, the driver maps this memory as write-combining.
Depending on the memory type (write-combining or non-cached), the application applies the optimal method to work with the device.
But some architectures do not support write-combining, or support it for only part of the memory space.
Hence, it is important that the kernel driver tells the application whether it succeeded in mapping the memory as write-combining.
I need a generic way to check in the kernel driver whether the memory it mapped (or is going to map) is write-combining.
How can I do it?
Here is part of my code:
vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot);
io_remap_pfn_range(vma, vma->vm_start, pfn, PAGE_SIZE, vma->vm_page_prot);
First, you can find out at compile time whether an architecture supports write-combining by testing for the macro ARCH_HAS_IOREMAP_WC.
At run time, you can check the return values of ioremap_wc, set_memory_wc, and friends to see whether they succeeded.

Unknown symbol flush_cache_range in linux device driver

I am just writing my very first Linux device driver, and I have run into a problem. I want to prevent one memory region from being cached, so I have been trying to use flush_cache_range() and flush_tlb_range() to flush the cache for this memory region. Everything compiles well, but when I try to load the kernel module I get the following errors:
Unknown symbol flush_cache_range (err 0)
Unknown symbol flush_tlb_range (err 0)
I find this very strange. Shouldn't they be defined in kernel?
I know that alternatively I could use dma_alloc_coherent() to allocate a non-cached memory region. But I don't have a device structure, and although passing NULL for this parameter didn't cause any errors, I also couldn't see any of the data that was supposed to be there.
Some information about my system: I'm trying to get this running on an ARM microcontroller with an integrated FPGA (the Xilinx Zynq). The FPGA copies some data to a memory location specified by the CPU. Now I want to access this memory without getting old data from the caches.
Any help is much appreciated.
You cannot use functions such as flush_cache_range() because they are not intended to be used by modules.
To allocate memory that can be accessed by a DMA device, you must use dma_alloc_coherent().
This requires a valid device structure so that it can do proper mapping between memory addresses and bus addresses.
If your device is not on a bus that is handled by an existing framework (such as PCI), you have to create a platform device.
A few notes:
1- flush_cache_range doesn't "prevent one memory region from being cached". It simply flushes (cleans and invalidates) the caches. Any future writes/reads to this memory region through the same virtual range will go through the cache again.
2- If the FPGA is writing to memory and the CPU is then going to read from this memory, flushing the cache probably isn't the correct thing to do anyway. Usually what you need to do is invalidate the memory region and then tell the FPGA to write.
3- Please take a look at "${kernel-src}/Documentation/DMA-API.txt" in the kernel sources. It has plenty of information about how you can safely (cache maintenance + phys_to_dma translation) use a specific region of memory for DMA.
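To make the clean versus invalidate distinction in notes 1 and 2 concrete, here is a toy userspace model of a write-back cache sitting between a CPU and RAM, with a device that reads and writes RAM directly. It is only a sketch under simplifying assumptions (one cache line per word, no evictions), and every name in it (ex_system, ex_clean, ex_invalidate, and so on) is made up for illustration. In a real driver the DMA API performs this maintenance for you.

```c
#include <stdbool.h>
#include <stdint.h>

#define EX_WORDS 4  /* size of the toy memory, chosen arbitrarily */

/* Hypothetical model: RAM plus a write-back cache with one line per word. */
struct ex_system {
    uint32_t ram[EX_WORDS];
    uint32_t cached[EX_WORDS];
    bool valid[EX_WORDS];   /* line holds a copy of the word */
    bool dirty[EX_WORDS];   /* copy is newer than RAM */
};

/* CPU accesses go through the cache. */
static uint32_t ex_cpu_read(struct ex_system *s, int i)
{
    if (!s->valid[i]) {             /* miss: fill the line from RAM */
        s->cached[i] = s->ram[i];
        s->valid[i] = true;
        s->dirty[i] = false;
    }
    return s->cached[i];
}

static void ex_cpu_write(struct ex_system *s, int i, uint32_t v)
{
    s->cached[i] = v;               /* write-back: RAM is not updated yet */
    s->valid[i] = true;
    s->dirty[i] = true;
}

/* Device (DMA) accesses bypass the cache entirely. */
static void ex_device_write(struct ex_system *s, int i, uint32_t v)
{
    s->ram[i] = v;
}

static uint32_t ex_device_read(const struct ex_system *s, int i)
{
    return s->ram[i];
}

/* Clean: push dirty lines out to RAM so the device can see CPU writes. */
static void ex_clean(struct ex_system *s)
{
    for (int i = 0; i < EX_WORDS; i++) {
        if (s->valid[i] && s->dirty[i]) {
            s->ram[i] = s->cached[i];
            s->dirty[i] = false;
        }
    }
}

/* Invalidate: drop cached copies so the next CPU read refetches from RAM. */
static void ex_invalidate(struct ex_system *s)
{
    for (int i = 0; i < EX_WORDS; i++) {
        s->valid[i] = false;
        s->dirty[i] = false;
    }
}
```

In this model the CPU keeps reading the old value after ex_device_write() until ex_invalidate() is called, which is the FPGA-to-CPU case from note 2. In the other direction, ex_device_read() only sees a CPU write after ex_clean().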

Physical Memory Allocation in Kernel

I am writing a kernel module that is going to trigger an external PCIe device to read a block of data from my internal memory. To do this I need to send the PCIe device a pointer to the physical memory address of the data that I would like to send. Ultimately this data is going to be written from userspace to the kernel with the write() function (userspace) and copy_from_user() (kernel space). As I understand it, the address that my kernel module will see is still a virtual memory address. I need a way to get its physical address so that the PCIe device can find it.
1) Can I just use mmap() from userspace and place my data in a known location in DDR memory, instead of using copy_from_user()? I do not want to accidentally overwrite another process's data in memory though.
2) My kernel module reserves PCIe data space at initialization using ioremap_nocache(). Can I do the same for this buffer, or is it a bad idea to treat this memory as I/O memory? If I can, what would happen if the memory that I try to reserve is already in use? I do not want to hard-code a static memory location and then find out that it is in use.
Thanks in advance for your help.
You don't choose a memory location and put your data there. Instead, you ask the kernel to tell you the location of your data in physical memory, and tell the board to read that location. Each page of memory (4KB) will be at a different physical location, so if you are sending more data than that, your device likely supports "scatter gather" DMA, so it can read a sequence of pages at different locations in memory.
The API is this: call dma_map_page() to obtain a value of type dma_addr_t, which you can give to the board, then call dma_unmap_page() when the transfer is finished. If you're doing scatter-gather, you'll instead put that value in the list of descriptors that you feed to the board. Again, if scatter-gather is supported, dma_map_sg() and friends will help with this mapping of a large buffer into a set of pages. It's still your responsibility to set up the page descriptors in the format expected by your device.
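To illustrate why a large buffer turns into a list of per-page pieces, here is a small userspace sketch of the arithmetic involved. It assumes 4 KB pages as in the text above. The names ex_segment, ex_pages_spanned and ex_split_buffer are hypothetical and are not part of the kernel API; in a driver, the scatterlist helpers do this bookkeeping for you.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EX_PAGE_SIZE 4096u  /* assumption: 4 KB pages */

/* One piece of the buffer that lies within a single page (hypothetical type). */
struct ex_segment {
    size_t page_index;  /* index of the page, counted from the buffer's first page */
    size_t offset;      /* byte offset of the piece inside that page */
    size_t length;      /* number of buffer bytes inside that page */
};

/* Number of pages touched by the byte range [addr, addr + len). */
static size_t ex_pages_spanned(uintptr_t addr, size_t len)
{
    if (len == 0)
        return 0;
    uintptr_t first = addr / EX_PAGE_SIZE;
    uintptr_t last = (addr + len - 1) / EX_PAGE_SIZE;
    return (size_t)(last - first + 1);
}

/*
 * Fill out[] with one segment per page and return the segment count.
 * Returns 0 if the buffer is empty or out[] has fewer than the needed entries.
 */
static size_t ex_split_buffer(uintptr_t addr, size_t len,
                              struct ex_segment *out, size_t max_out)
{
    size_t n = ex_pages_spanned(addr, len);
    if (n == 0 || n > max_out)
        return 0;
    assert(out != NULL);

    size_t offset = addr % EX_PAGE_SIZE;  /* only the first page starts mid-page */
    for (size_t i = 0; i < n; i++) {
        size_t room = EX_PAGE_SIZE - offset;
        size_t chunk = len < room ? len : room;
        out[i].page_index = i;
        out[i].offset = offset;
        out[i].length = chunk;
        len -= chunk;
        offset = 0;
    }
    return n;
}
```

Each resulting segment would correspond to one entry in the descriptor list, with the page's DMA address plus the offset as its start. Note that a buffer of exactly one page in length still spans two pages when it does not start on a page boundary.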
This is all very well written up in Linux Device Drivers (Chapter 15), which is required reading. http://lwn.net/images/pdf/LDD3/ch15.pdf. Some of the APIs have changed from when the book was written, but the concepts remain the same.
Finally, mmap(): Sure, you can allocate a kernel buffer, mmap() it out to user space and fill it there, then dma_map that buffer for transmission to the device. This is in fact probably the cleanest way to avoid copy_from_user().

The correct way to access uncachable memory in Linux

My scenario is as follows:
One board acts as the PCIe root port (RP) and another board acts as the PCIe endpoint (EP).
The PCIe endpoint exports a memory region, which is shared between the RP and the EP. Whenever we need to access this shared memory region (actually we only access the control data structure in its first few bytes), we have to invalidate the cache before reading and flush the cache after writing.
I tried declaring the structure as below, but without the cache invalidate/flush, the reads and writes do not take effect.
typedef volatile struct {
    u32 front;
    u32 rear;
    u32 n_msg;
    u32 offset;
} queue_ctl_t;
Could anyone please tell me the correct way to access this shared memory region? I just wonder how some network drivers (for network cards on the PCIe bus) can access the data consistently without invalidating or flushing the cache.
Any suggestions are appreciated, thanks a lot!
You should be using ioremap_nocache() to map the memory regions in question, and reading/writing them using the ioread*() and iowrite*() functions.

What happens when I printk a char * that was initialized in userspace?

I implemented a new system call as an intro exercise. All it does is take in a buffer and printk that buffer. I later learned that the correct practice would be to use copy_from_user.
Is this just a precautionary measure to validate the address, or is my system call causing some error (page fault?) that I cannot see?
If it is just a precautionary measure, what is it protecting against?
Thanks!
There are several reasons.
Some architectures employ segmented memory, where there is a separate segment for the user memory. In that case, copy_from_user is essential to actually get the right memory address.
The kernel has access to everything, including (almost by definition) a lot of privileged information. Not using copy_from_user could allow information disclosure if a user passes in a kernel address. Worse, if you are writing to a user-supplied buffer without copy_to_user, the user could overwrite kernel memory.
You'd like to prevent the user from crashing the kernel module just by passing in a bad pointer; using copy_from_user protects against faults so e.g. a system call handler can return EFAULT in response to a bad user pointer.
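As a rough illustration of the second and third points, here is a userspace toy that checks a user-supplied address range before touching it. This is only a sketch: the flat ex_memory array, the EX_USER_LIMIT boundary and the function names are all invented for the example. Note also that the real copy_from_user returns the number of bytes it could not copy, which callers conventionally turn into -EFAULT, whereas this toy returns -EFAULT directly.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical split: addresses below EX_USER_LIMIT are "user", the rest "kernel". */
#define EX_USER_LIMIT 0x8000u

static uint8_t ex_memory[0x10000];  /* toy flat address space */

/* True if [addr, addr + size) lies entirely in the user part. */
static bool ex_user_range_ok(uint32_t addr, uint32_t size)
{
    /* Written this way so that addr + size can never wrap around. */
    return size <= EX_USER_LIMIT && addr <= EX_USER_LIMIT - size;
}

/*
 * Copy size bytes from a "user" address into dst.
 * Returns 0 on success, or -EFAULT if the range is not valid user memory.
 * The caller must make sure dst has room for size bytes.
 */
static int ex_fetch_user_bytes(void *dst, uint32_t user_addr, uint32_t size)
{
    if (!ex_user_range_ok(user_addr, size))
        return -EFAULT;
    memcpy(dst, &ex_memory[user_addr], size);
    return 0;
}
```

With this check in place, a pointer into the "kernel" half, a range that straddles the boundary, or a range that would wrap around the address space is rejected with an error instead of being read.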
