I'm writing a character driver that acts as a bridge to the mailbox framework and exposes file operations (fops) to userspace applications.
The user applications transfer data to remote processors through this char driver. However, the remote processor requires the data to be in a fixed memory range. The range is provided to the char driver in its probe() function.
If there were a single user application, I could memremap() the provided address range and use it directly, but in my case multiple applications are trying to send data buffers to the remote core.
So I need to allocate kernel buffers within a specific memory range, do copy_from_user() on each data buffer received, and then pass the physical address to the remote core (an ARM Cortex-M0+).
Are there any APIs in the Linux kernel that do this kind of allocation?
Any pointers on this would be very much appreciated. Thanks in advance for any suggestions or questions.
PS: The memory address range provided by the remote core is a RAM address within the lower 2 GB range (0x8000_0000 - 0xFFFF_FFFF).
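One way to picture the requirement, as a rough sketch rather than a definitive answer: carve the fixed range into equal slots, give each pending message its own slot, and derive the physical address from the slot offset. In the kernel, the genalloc API in <linux/genalloc.h> (gen_pool_create(), gen_pool_add_virt(), gen_pool_alloc(), gen_pool_virt_to_phys(), gen_pool_free()) offers this kind of allocation over a region that has already been mapped, for example with memremap(). The userspace C below only models the bookkeeping; the names range_pool, pool_stage, SLOT_SIZE and SLOT_COUNT are hypothetical, and memcpy() stands in for copy_from_user().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SLOT_SIZE  256u  /* assumed maximum message size, in bytes */
#define SLOT_COUNT 8u    /* assumed number of slots in the fixed range */

/* Hypothetical bookkeeping for a fixed range shared with the remote core. */
struct range_pool {
    uint64_t phys_base;          /* physical start of the range */
    uint8_t *virt_base;          /* CPU mapping of the same range */
    uint8_t  used[SLOT_COUNT];   /* 1 if the slot is currently owned */
};

static void pool_init(struct range_pool *p, uint64_t phys_base, uint8_t *virt_base)
{
    p->phys_base = phys_base;
    p->virt_base = virt_base;
    memset(p->used, 0, sizeof(p->used));
}

/* First-fit search; returns a slot index, or -1 when the range is full. */
static int pool_alloc(struct range_pool *p)
{
    for (unsigned i = 0; i < SLOT_COUNT; i++) {
        if (!p->used[i]) {
            p->used[i] = 1;
            return (int)i;
        }
    }
    return -1;
}

static void pool_free(struct range_pool *p, int slot)
{
    assert(slot >= 0 && (unsigned)slot < SLOT_COUNT);
    p->used[slot] = 0;
}

static uint8_t *pool_virt(const struct range_pool *p, int slot)
{
    return p->virt_base + (size_t)slot * SLOT_SIZE;
}

static uint64_t pool_phys(const struct range_pool *p, int slot)
{
    return p->phys_base + (uint64_t)slot * SLOT_SIZE;
}

/*
 * Copy one message into a free slot and return the physical address to
 * hand to the remote core. Returns 0 if the message is too large or no
 * slot is free (0 is outside the range assumed here).
 */
static uint64_t pool_stage(struct range_pool *p, const void *msg, size_t len,
                           int *slot_out)
{
    int slot;

    if (len > SLOT_SIZE)
        return 0;
    slot = pool_alloc(p);
    if (slot < 0)
        return 0;
    memcpy(pool_virt(p, slot), msg, len);
    *slot_out = slot;
    return pool_phys(p, slot);
}
```

In a real driver the slot would be released only after the remote core signals that it has consumed the message, and the used[] array would need a lock because several applications can call write() concurrently.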
Related
I have device memory mapped to a kernel virtual address via ioremap. Userspace needs to access a page at offset x within this device memory.
The way I can achieve this right now is by calling mmap in userspace and implementing a small memory mapping handler on the driver side.
Is there any way to use the offset (let's assume the kernel passes the offset to userspace) and achieve the same thing without creating any mapping on the driver side?
Can ioremapped kernel virtual addresses be used here?
I am working on a userspace PCI driver which uses SPDK/VFIO APIs to do DMA access.
Currently, for each DMA allocation request, I need to fill in a spdk_vfio_dma_map structure and then call ioctl(fd, VFIO_IOMMU_MAP_DMA, &dma_map) to map the DMA region through the IOMMU. Later I call ioctl(fd, VFIO_IOMMU_UNMAP_DMA, &dma_map) to remove the IOMMU mapping.
This works fine so far and appears to be what the SPDK examples do. However, I am wondering whether there is a way to pre-allocate all memory buffers in userspace and then, for each DMA allocation request, use the pre-allocated memory instead of making an ioctl call every time.
Any ideas are much appreciated.
I'm not sure I understand the issue, but the whole idea of DPDK and SPDK is to allocate all the memory you use at application start or driver probe.
If the memory you use is under application control all the time, you don't need to call VFIO_IOMMU_MAP_DMA and VFIO_IOMMU_UNMAP_DMA for every DMA transaction. If this is not the case, you have two options:
1) Do the VFIO_IOMMU_MAP_DMA and VFIO_IOMMU_UNMAP_DMA for every IO.
2) Copy the payload into memory that is already registered with VFIO_IOMMU_MAP_DMA.
The first option is better for huge memory blocks, while the second is better for small IO chunks.
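The second option can be sketched roughly as follows, assuming one large region has been registered once at startup with a single VFIO_IOMMU_MAP_DMA call (that call is not shown). After that, carving out a buffer and finding its device address is plain arithmetic. The names dma_region, region_alloc, region_iova, io_should_copy and COPY_THRESHOLD are hypothetical, and the threshold value is a placeholder rather than a measured number.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed cut-over point between "copy" and "map per IO"; tune by measurement. */
#define COPY_THRESHOLD (64u * 1024u)

/* Hypothetical description of a region registered once with the IOMMU. */
struct dma_region {
    uint8_t *vaddr;  /* start of the region in the process address space */
    uint64_t iova;   /* device-visible address the region was mapped at */
    size_t   size;   /* length of the region in bytes */
    size_t   next;   /* bump pointer: offset of the first unused byte */
};

/*
 * Carve len bytes out of the region. align must be a power of two and is
 * applied to the offset within the region, so the device address is
 * aligned as long as the region's IOVA is. Returns NULL when the region
 * is exhausted.
 */
static void *region_alloc(struct dma_region *r, size_t len, size_t align)
{
    size_t off;

    assert(align != 0 && (align & (align - 1)) == 0);
    off = (r->next + align - 1) & ~(align - 1);
    if (off > r->size || len > r->size - off)
        return NULL;
    r->next = off + len;
    return r->vaddr + off;
}

/*
 * Translate a pointer inside the region to its device address without
 * any ioctl. Returns 0 on success, -1 if the pointer is outside the region.
 */
static int region_iova(const struct dma_region *r, const void *ptr,
                       uint64_t *iova_out)
{
    uintptr_t p = (uintptr_t)ptr;
    uintptr_t base = (uintptr_t)r->vaddr;

    if (p < base || p - base >= r->size)
        return -1;
    *iova_out = r->iova + (uint64_t)(p - base);
    return 0;
}

/* 1 if a payload is small enough to copy into the registered region. */
static int io_should_copy(size_t len)
{
    return len <= COPY_THRESHOLD;
}
```

A bump allocator like this never frees individual buffers; an application that recycles buffers would more likely keep a free list of fixed-size chunks inside the same registered region.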
Instead of using dma_map_single(), or kmalloc() plus dma_map_sg(), to allocate a CPU-accessible buffer and then obtain an IOMMU-mapped DMA address, can I specify a particular DMA address of type dma_addr_t and pass it to the kernel to use?
The reason I have to do this is that my custom hardware provides a way to calculate an IOMMU-mapped DMA address that is available for the device driver to use, but I am not sure whether I can correlate this with CPU virtual memory. dma_map_sg() and dma_map_single() work in my case, but I have no control over the DMA address they return (I would like to check a specific bit in the DMA address and only use the address when that bit is set).
I have checked several APIs, and it looks like dma_map_sg() might be able to do this. Any ideas are much appreciated.
I am porting Windows 7 network driver code to WEC7. I got stuck on the MmGetPhysicalAddress API: I can't find an equivalent in WEC7. Can anyone help me proceed?
Thanks.
MmGetPhysicalAddress is not available in Windows CE, but you probably don't need it anyway.
Somewhere in the InitializeHandlerEx callback, the driver should be calling NdisMAllocateSharedMemory to allocate RX/TX buffers.
NdisMAllocateSharedMemory returns both the virtual and physical address of the allocated buffer, so you can keep the physical address around, and then there won't be any need to request it from the OS.
Normally the physical address would be kept in a driver-specific, per-buffer structure along with the virtual buffer address.
You can find a sample implementation of this in C:\WINCE700\public\COMMON\oak\drivers\netcard\e100bex\60. In mp_init.c, notice how NICAllocAdapterMemory calls NdisMAllocateSharedMemory and stores the physical address of each buffer in pMpTxbuf->BufferPa.
You may have a look at LockPages:
https://msdn.microsoft.com/en-us/library/ee482989.aspx
But if the buffer was not allocated using NDIS functions it may not be fully contiguous in physical memory, so you may need to check that.
I am writing a kernel module that is going to trigger an external PCIe device to read a block of data from my internal memory. To do this I need to send the PCIe device a pointer to the physical memory address of the data that I would like to send. Ultimately this data is going to be written from userspace to the kernel with the write() function (userspace) and copy_from_user() (kernel space). As I understand it, the address that my kernel module will see is still a virtual memory address. I need a way to get its physical address so that the PCIe device can find it.
1) Can I just use mmap() from userspace and place my data in a known location in DDR memory, instead of using copy_from_user()? I do not want to accidentally overwrite another process's data in memory, though.
2) My kernel module reserves PCIe data space at initialization using ioremap_nocache(). Can I reserve the data buffer the same way from my kernel module, or is it a bad idea to treat this memory as I/O memory? If I can, what would happen if the memory that I try to reserve is already in use? I do not want to hard-code a static memory location and then find out that it is in use.
Thanks in advance for your help.
You don't choose a memory location and put your data there. Instead, you ask the kernel to tell you the location of your data in physical memory, and tell the board to read that location. Each page of memory (4KB) will be at a different physical location, so if you are sending more data than that, your device likely supports "scatter gather" DMA, so it can read a sequence of pages at different locations in memory.
The API is this: dma_map_page() to return a value of type dma_addr_t, which you can give to the board. Then dma_unmap_page() when the transfer is finished. If you're doing scatter-gather, you'll put that value instead in the list of descriptors that you feed to the board. Again if scatter-gather is supported, dma_map_sg() and friends will help with this mapping of a large buffer into a set of pages. It's still your responsibility to set up the page descriptors in the format expected by your device.
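To make the page-by-page picture concrete, here is a small userspace sketch of how a buffer that crosses page boundaries breaks into per-page segments, which is the shape of the list a scatter-gather descriptor table has to describe. It does not call any kernel API: in a driver each segment's address would come from dma_map_page() or dma_map_sg(), not from the arithmetic below. The seg struct, split_by_page and PAGE_SZ are hypothetical names, and a 4 KB page size is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SZ 4096u  /* assumed page size */

/* Hypothetical descriptor: one contiguous piece that stays within a page. */
struct seg {
    uint64_t addr;  /* start address of the piece */
    uint32_t len;   /* length of the piece in bytes */
};

/*
 * Split [addr, addr + len) into pieces that never cross a page boundary.
 * Returns the number of pieces written to out, or 0 if len is 0 or the
 * out array (max entries) is too small.
 */
static size_t split_by_page(uint64_t addr, size_t len, struct seg *out,
                            size_t max)
{
    size_t n = 0;

    while (len > 0) {
        /* Bytes left in the current page, starting at addr. */
        size_t chunk = PAGE_SZ - (size_t)(addr % PAGE_SZ);

        if (chunk > len)
            chunk = len;
        if (n == max)
            return 0;
        out[n].addr = addr;
        out[n].len = (uint32_t)chunk;
        n++;
        addr += chunk;
        len -= chunk;
    }
    return n;
}
```

For example, a 0x1200-byte buffer starting at 0x1F00 yields three pieces: 0x100 bytes to finish the first page, one full page, and 0x100 bytes in the third page.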
This is all very well written up in Linux Device Drivers (Chapter 15), which is required reading. http://lwn.net/images/pdf/LDD3/ch15.pdf. Some of the APIs have changed from when the book was written, but the concepts remain the same.
Finally, mmap(): Sure, you can allocate a kernel buffer, mmap() it out to user space and fill it there, then dma_map that buffer for transmission to the device. This is in fact probably the cleanest way to avoid copy_from_user().