Get PFN from DMA address (dma_addr_t)? - linux-kernel

I would like to get the PFN associated with a memory block allocated with dma_alloc_coherent for use with a PCIe device as shown below:
unsigned long pfn;
buffer = dma_alloc_coherent(&pcie->dev, size, &bus_addr, GFP_KERNEL);
// Get PFN?
pfn = virt_to_phys(buffer) >> PAGE_SHIFT;
I'm aware that this is probably not the correct method, but it seems to work... I'm just looking for the right solution to translate the potential bus address (since I do not know if there is an IOMMU) to a PFN. Thanks in advance.
Note: There seems to be an ARM function in the kernel called dma_to_pfn, which seems to be exactly what I need, but for x86.

What you're doing is indeed wrong. From the man page for virt_to_phys():
This function does not give bus mappings for DMA transfers. In almost all conceivable cases a device driver should not be using this function.
The equivalent function for DMA addresses is dma_to_phys(), defined in include/linux/dma-direct.h as follows:
phys_addr_t dma_to_phys(struct device *dev, dma_addr_t daddr);
Therefore you can do:
dma_to_phys(&pcie->dev, bus_addr) >> PAGE_SHIFT;
Notice that I am using the bus_addr returned by dma_alloc_coherent(), not buffer, since you obviously need to pass a DMA address (dma_addr_t) to this function, not a virtual address.
There also seems to be a macro PHYS_PFN() defined in include/linux/pfn.h to get the PFN for a given physical address, if you prefer to use that:
PHYS_PFN(dma_to_phys(&pcie->dev, bus_addr));
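To make the arithmetic concrete, here is a minimal userspace sketch of what the shift does once you have a physical address. It is not kernel code: the 4 KiB page size and the helper name toy_phys_pfn() are assumptions made for illustration, standing in for the kernel's PAGE_SHIFT and PHYS_PFN().

```c
#include <stdint.h>

/* Assumption: 4 KiB pages, as on typical x86 configurations. */
#define TOY_PAGE_SHIFT 12

/*
 * Hypothetical stand-in for PHYS_PFN(): the page frame number is the
 * physical address with the in-page offset bits shifted out.
 */
static uint64_t toy_phys_pfn(uint64_t phys)
{
    return phys >> TOY_PAGE_SHIFT;
}
```

For example, a made-up physical address of 0x12345678 lies in page frame 0x12345, at offset 0x678 within that page.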

Related

How can my PCI device driver remap PCI memory to userspace?

I am trying to implement a PCI device driver for a virtual PCI device on QEMU. The device defines a BAR region as RAM, and the driver can do ioremap() this region and access it without any issues. The next step is to assign this region (or a fraction of it) to a user application.
To do this, I have also implemented an .mmap function as part of my driver file operations. This mmap is simply using remap_pfn_range, but it also passes the pfn of the memory pointer returned by the ioremap() earlier.
However, upon running the user space application, the mmap is successful, but when the app tries to access the memory, it gets killed and I get the following dmesg errors.
"
a.out: Corrupted page table at address 7f66248b8000
..Some page table info..
Bad pagetable: 000f [#2] SMP NOPTI
..and the core dump..
"
Does anyone know what I have done wrong? Did I miss a step? Or could it be an error specific to QEMU?
I am running x86_softmmu as my QEMU configuration, and my kernel is 4.14.
I've solved this issue and managed to map PCI memory to user space via the driver. As @IanAbbott implied, I've changed the pfn argument of the remap_pfn_range() function I was using in my custom ->mmap().
The original was:
io_remap_pfn_range(vma, vma->vm_start, pfn, vma->vm_end - vma->vm_start, vma->vm_page_prot);
where the pfn was derived from the buffer pointer returned by ioremap(). I changed the pfn to:
pfn = pci_resource_start(pdev, BAR) >> PAGE_SHIFT;
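That is the physical start address of the BAR, shifted down to a page frame number. The pointer returned by ioremap() is a kernel virtual address, so a pfn derived from it does not describe the device memory. My working remap_pfn_range() call is now:
io_remap_pfn_range(vma, vma->vm_start, pci_resource_start(pdev, BAR) >> PAGE_SHIFT, vma->vm_end - vma->vm_start, vma->vm_page_prot);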
I confirmed that it works by doing some dummy writes to the buffer pointer in my driver, then picking up the reads and doing some writes in my user space application.
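As a rough illustration of the pfn computation, here is a userspace sketch of the arithmetic an ->mmap() handler might do before calling the remap function. It is only a model, not driver code: the helper name bar_mmap_pfn(), the bounds check, and the use of the page offset (vma->vm_pgoff) and BAR length (pci_resource_len()) are my assumptions and are not part of the answer above. It also assumes 4 KiB pages and a mapping length that is a multiple of the page size.

```c
#include <errno.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/*
 * Hypothetical helper: compute the first PFN to pass to the remap call.
 *   bar_start: physical start of the BAR (pci_resource_start() in a driver)
 *   bar_len:   BAR length in bytes (pci_resource_len() in a driver)
 *   pgoff:     page offset requested by userspace (vma->vm_pgoff)
 *   map_len:   requested length (vma->vm_end - vma->vm_start), page multiple
 * Returns 0 and stores the PFN, or -EINVAL if the request leaves the BAR.
 */
static int bar_mmap_pfn(uint64_t bar_start, uint64_t bar_len,
                        uint64_t pgoff, uint64_t map_len, uint64_t *pfn)
{
    uint64_t bar_pages = bar_len >> TOY_PAGE_SHIFT;
    uint64_t map_pages = map_len >> TOY_PAGE_SHIFT;

    /* Reject offsets and lengths that would map memory past the BAR. */
    if (pgoff >= bar_pages || map_pages > bar_pages - pgoff)
        return -EINVAL;

    /* The PFN comes from the physical BAR address, never from ioremap(). */
    *pfn = (bar_start >> TOY_PAGE_SHIFT) + pgoff;
    return 0;
}
```

With a made-up BAR at 0xfe000000 of length 1 MiB, a one-page request at offset 0 yields pfn 0xfe000, and a request starting at page 256 is rejected.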

Flushing cache when DMAing directly to user memory

I have written a driver whose purpose is to allow a userspace program to pin its pages and get the physical addresses for them.
Specifically, I do this with a call to get_user_pages_fast in my kernel module.
For reference, the source code to this module can be found here: https://github.com/UofT-HPRC/mpsoc_drivers/tree/master/pinner
Using /dev/mem (and yes, my kernel does allow unsafe /dev/mem accesses) I have confirmed that the physical addresses are correct.
However, I have some external hardware (an AXI DMA in an FPGA, to be precise) which is not working, and it looks like it might be a cache coherency problem. On lines 329-337 of the above linked code, I do this (in this code, cmd.usr_buf is a user virtual address):
// Find the VMA containing the user's buffer
struct vm_area_struct *vma = find_vma(current->mm, (unsigned long)cmd.usr_buf);
if (!vma) {
    printk(KERN_ALERT "pinner: unrecognized user virtual address\n");
    return -EINVAL;
}
flush_cache_range(vma, (unsigned long)cmd.usr_buf, (unsigned long)cmd.usr_buf + cmd.usr_buf_sz);
This doesn't appear to help. I have also tried the more general flush_cache_mm function.
Is there a correct way to flush the cache of user pages?
I tried a different API for flushing the cache. Laurent Pinchart gave a talk called "Mastering the DMA and IOMMU APIs", and in it he explains that the functions in <asm/cacheflush.h> shouldn't be used. Instead, you can use things like dma_map_sg and dma_unmap_sg when pinning user memory. I took a quick look in the kernel sources, and these functions eventually call assembly routines specific to each architecture, which appear to perform the cache maintenance (clean and invalidate) for the mapped memory regions.
Also, dma_sync_sg_for_cpu and dma_sync_sg_for_device can be used to force cache flushes if you try to access the memory between DMA transfers.
I rewrote my kernel driver to use these functions, and it works.
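To show why the sync calls matter, here is a deliberately simplified userspace model of a non-coherent system. Everything in it is hypothetical: the toy_* names are made up, the "cache" is just a second copy of the buffer, and the real dma_sync_sg_for_device() and dma_sync_sg_for_cpu() depend on the DMA direction and the architecture. It only demonstrates the ownership hand-off idea.

```c
#include <string.h>

#define TOY_BUF_SIZE 16

/* Toy model: "ram" is what the device sees, "cache" is what the CPU sees. */
struct toy_buf {
    unsigned char ram[TOY_BUF_SIZE];
    unsigned char cache[TOY_BUF_SIZE];
};

/* CPU accesses hit the (write-back) cache only. */
static void toy_cpu_write(struct toy_buf *b, int i, unsigned char v)
{
    b->cache[i] = v;
}

static unsigned char toy_cpu_read(const struct toy_buf *b, int i)
{
    return b->cache[i];
}

/* Device accesses bypass the CPU cache and go straight to RAM. */
static void toy_dev_write(struct toy_buf *b, int i, unsigned char v)
{
    b->ram[i] = v;
}

static unsigned char toy_dev_read(const struct toy_buf *b, int i)
{
    return b->ram[i];
}

/* Models handing the buffer to the device: write cached data back to RAM. */
static void toy_sync_for_device(struct toy_buf *b)
{
    memcpy(b->ram, b->cache, TOY_BUF_SIZE);
}

/* Models handing the buffer to the CPU: drop stale lines, refetch from RAM. */
static void toy_sync_for_cpu(struct toy_buf *b)
{
    memcpy(b->cache, b->ram, TOY_BUF_SIZE);
}
```

In this model, a byte written with toy_cpu_write() is invisible to toy_dev_read() until toy_sync_for_device() runs, and a byte written by the device is invisible to the CPU until toy_sync_for_cpu() runs, which matches the symptom of correct physical addresses but wrong data.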

How to get physical address from struct page in linux kernel

Suppose I get a pointer to a struct page from the allocator. How can I get the corresponding physical address? Does the kernel provide a function for this?
Thanks.
The answer is page_to_phys(), but please make sure that this is really what you need. For example, if you want the physical address in order to hand it off to some device for DMA, it is very likely that what you need is the bus address for the page, which may or may not be the physical address.
http://lxr.free-electrons.com/source/include/asm-generic/page.h#L90

Convert DMA mapping to virtual address

I have a somewhat unusual situation where I'm developing a simulation module for an Ethernet device. Ideally, the simulation layer would just be identical to the real hardware with regard to the register set. The issue I've run into is that the DMA registers in the hardware are loaded with the DMA mapping (physical) address of the data. I need to use those physical addresses to copy the data from the Tx buffer on the source device to the Rx buffer on the destination device. To do that in module code, I need pointers to virtual memory. I looked at phys_to_virt() and I didn't understand this comment in the man page:
This function does not handle bus mappings for DMA transfers.
Does this mean that a physical address that is retrieved via dma_map_single cannot be converted back to a virtual address using phys_to_virt()? Is there another way to accomplish this conversion?
There is no general way to map a DMA address to a virtual address. The dma_map_single() function might be programming an IOMMU (e.g. VT-d on an Intel x86 system), which results in a DMA address that is completely unrelated to the original physical or virtual address. However, this presentation and the linked slides give one approach to hooking an emulated hardware model up to a real driver (basically, use virtualization).
I'm not entirely clear on the question, but if you are using phys_to_virt(), the problem may be that a bus address cannot be converted to a virtual address by that function. I'm not sure, but try the bus_to_virt(bus_addr) function.
Try dma_virt = virt_to_phys(bus_to_virt(dma_handle))
It worked for me. It gives the same virtual address that was mapped by dma_alloc_coherent().

How to get a struct page from any address in the Linux kernel

I have existing code that takes a list of struct page * and builds a descriptor table to share memory with a device. The upper layer of that code currently expects a buffer allocated with vmalloc or from user space, and uses vmalloc_to_page to obtain the corresponding struct page *.
Now the upper layer needs to cope with all kinds of memory, not just memory obtained through vmalloc. This could be a buffer obtained with kmalloc, a pointer inside the stack of a kernel thread, or other cases that I'm not aware of. The only guarantee I have is that the caller of this upper layer must ensure that the memory buffer in question is mapped in kernel space at that point (i.e. it is valid to access buffer[i] for all 0<=i<size at this point). How do I obtain a struct page* corresponding to an arbitrary pointer?
Putting it in pseudo-code, I have this:
lower_layer(struct page*);
upper_layer(void *buffer, size_t size) {
    for (addr = buffer & PAGE_MASK; addr < buffer + size; addr += PAGE_SIZE) {
        struct page *pg = vmalloc_to_page(addr);
        lower_layer(pg);
    }
}
and I now need to change upper_layer to cope with any valid buffer (without changing lower_layer).
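The page-by-page walk in that pseudo-code can be modeled in userspace. The sketch below is only an illustration of the loop bounds: the names toy_for_each_page() and toy_page_fn are hypothetical, the callback stands in for the lookup plus lower_layer() call, 4 KiB pages are assumed, and the range is assumed not to wrap around the top of the address space. Note the exclusive end bound, so a buffer that ends exactly on a page boundary does not touch the following page.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12 /* assumption: 4 KiB pages */
#define TOY_PAGE_SIZE  ((uintptr_t)1 << TOY_PAGE_SHIFT)
#define TOY_PAGE_MASK  (~(TOY_PAGE_SIZE - 1))

/* Hypothetical callback: receives the page-aligned address of each page. */
typedef void (*toy_page_fn)(uintptr_t page_addr, void *ctx);

/*
 * Visit every page overlapped by [start, start + size) and return how
 * many pages were visited. fn may be NULL to only count.
 */
static size_t toy_for_each_page(uintptr_t start, size_t size,
                                toy_page_fn fn, void *ctx)
{
    size_t count = 0;
    uintptr_t end;
    uintptr_t addr;

    if (size == 0)
        return 0;

    end = start + size; /* exclusive end of the buffer */
    for (addr = start & TOY_PAGE_MASK; addr < end; addr += TOY_PAGE_SIZE) {
        if (fn)
            fn(addr, ctx);
        count++;
    }
    return count;
}
```

A 0x20-byte buffer starting at 0x1ff0 straddles a page boundary and therefore visits two pages, while a page-aligned buffer of exactly one page visits one.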
I've found virt_to_page, which Linux Device Drivers indicates operates on “a logical address, [not] memory from vmalloc or high memory”. Furthermore, is_vmalloc_addr tests whether an address comes from vmalloc, and virt_addr_valid tests if an address is a valid virtual address (fodder for virt_to_page; this includes kmalloc(GFP_KERNEL) and kernel stacks). What about other cases: global buffers, high memory (it'll come one day, though I can ignore it for now), possibly other kinds that I'm not aware of? So I could reformulate my question as:
What are all the kinds of memory zones in the kernel?
How do I tell them apart?
How do I obtain page mapping information for each of them?
If it matters, the code is running on ARM (with an MMU), and the kernel version is at least 2.6.26.
I guess what you want is a page table walk, something like (warning, not actual code, locking missing etc):
struct mm_struct *mm = current->mm;
pgd = pgd_offset(mm, address);
pmd = pmd_offset(pgd, address);
pte = *pte_offset_map(pmd, address);
page = pte_page(pte);
But you should be very, very careful with this. The kmalloc address you got might very well not be page aligned, for example. This sounds like a very dangerous API to me.
Mapping Addresses to a struct page
There is a requirement for Linux to have a fast method of mapping virtual addresses to physical addresses and for mapping struct pages to their physical address. Linux achieves this by knowing where, in both virtual and physical memory, the global mem_map array is because the global array has pointers to all struct pages representing physical memory in the system. All architectures achieve this with very similar mechanisms, but, for illustration purposes, we will only examine the x86 carefully.
Mapping Physical to Virtual Kernel Addresses
Any kernel virtual address in the direct-mapped region can be translated to the physical address by simply subtracting PAGE_OFFSET, which is essentially what the function virt_to_phys() with the macro __pa() does:
/* from <asm-i386/page.h> */
#define __pa(x) ((unsigned long)(x)-PAGE_OFFSET)
/* from <asm-i386/io.h> */
static inline unsigned long virt_to_phys(volatile void * address)
{
    return __pa(address);
}
Obviously, the reverse operation involves simply adding PAGE_OFFSET, which is carried out by the function phys_to_virt() with the macro __va(). Next we see how this helps the mapping of struct pages to physical addresses.
There is one exception where virt_to_phys() cannot be used to convert virtual addresses to physical ones. Specifically, on the PPC and ARM architectures, virt_to_phys() cannot be used to convert addresses that have been returned by the function consistent_alloc(). consistent_alloc() is used on PPC and ARM architectures to return non-cached memory for use with DMA.
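The direct-mapping arithmetic and the mem_map lookup described above can be sketched as a small userspace model. This is not kernel code: the toy_* names, the eight-page "machine", and the one-field struct are assumptions for illustration, and the PAGE_OFFSET value is the classic 3 GiB split of 32-bit x86 with 4 KiB pages.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT  12          /* assumption: 4 KiB pages */
#define TOY_PAGE_OFFSET 0xC0000000U /* assumption: 3 GiB split on 32-bit x86 */
#define TOY_NR_PAGES    8           /* tiny pretend machine */

/* Stand-in for the kernel's struct page, one per physical page frame. */
struct toy_page {
    int flags;
};

static struct toy_page toy_mem_map[TOY_NR_PAGES];

/* Models __pa(): direct-mapped virtual address to physical address. */
static uint32_t toy_pa(uint32_t vaddr)
{
    return vaddr - TOY_PAGE_OFFSET;
}

/* Models __va(): physical address back to direct-mapped virtual address. */
static uint32_t toy_va(uint32_t paddr)
{
    return paddr + TOY_PAGE_OFFSET;
}

/* Models virt_to_page(): the PFN is the index into the mem_map array. */
static struct toy_page *toy_virt_to_page(uint32_t vaddr)
{
    return &toy_mem_map[toy_pa(vaddr) >> TOY_PAGE_SHIFT];
}

/* Models page_to_phys(): array index (PFN) shifted back up to an address. */
static uint32_t toy_page_to_phys(const struct toy_page *pg)
{
    return (uint32_t)(pg - toy_mem_map) << TOY_PAGE_SHIFT;
}
```

In this model the virtual address 0xC0003010 maps to physical 0x3010, which falls in page frame 3, so its struct page is the fourth entry of the array and its page-aligned physical address is 0x3000.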
What are all the kinds of memory zones in the kernel? <---see here
For user-space allocated memory, you want to use get_user_pages, which will give you the list of pages associated with the malloc'd memory, and also increment their reference counter (you'll need to call page_cache_release on each page once done with them.)
For vmalloc'd pages, vmalloc_to_page is your friend, and I don't think you need to do anything.
For 64 bit architectures, the answer of gby should be adapted to:
pgd_t * pgd;
pmd_t * pmd;
pte_t * pte;
struct page *page = NULL;
pud_t * pud;
void * kernel_address;
pgd = pgd_offset(mm, address);
pud = pud_offset(pgd, address);
pmd = pmd_offset(pud, address);
pte = pte_offset_map(pmd, address);
page = pte_page(*pte);
// mapping in kernel memory:
kernel_address = kmap(page);
// work with kernel_address....
kunmap(page);
You could try virt_to_page. I am not sure it is what you want, but at least it is somewhere to start looking.

Resources