Mapping external memory device - gcc

I am using the GCC toolchain and an ARM Cortex-M0 uC. Is it possible to define a region in the linker script so that read and write operations on it call the external device driver's functions for reading and writing its address space (e.g. SPI memory)? Can anyone give some hints on how to do it?
Regards, Rafal
EDIT:
Thank you for your comments and replies. My setup is:
The random-access SPI memory is connected via an SPI controller, and I use a "standard" driver to access the memory space and store/read data from it.
What I wanted to do is to avoid calling the driver's functions explicitly and instead hide them behind some fixed RAM address, so that any read of that address would call the SPI memory read driver function and any write would call the SPI memory write function (the offset from the initial address would be the address of the data in the external memory). I doubt that this is possible at all on a uC without an MMU, but I think it is always worth asking someone else who might have had a similar idea.

No, this is not how it works. The Cortex-M0 has no Memory Management Unit (MMU), and is therefore unable to intercept accesses to specific memory regions.
It's not really clear what you are trying to achieve. If you have connected SPI memory external to the chip, you have to perform all the accesses using a driver; it is not possible to memory map the SPI port abstraction.
If this is an on-device SPI memory controller, it will have two regions in the memory map. One will be the 'memory' region, which will probably behave as read-only; the other will be the control registers for the memory controller hardware, and it is these registers that the device driver talks to. Specifically, to write to the SPI memory, you need to go through the driver.
In the extreme case (for example Cortex-M1 for Xilinx), there will be an eXecute In Place (XIP) peripheral for the memory map behaviour, and an SPI Master device for the read/write functionality. A GPIO pin is used to multiplex the SPI EEPROM pins between 'memory mode' and 'configuration mode'.
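Since the accesses cannot be intercepted in hardware, the usual workaround is to hide the driver calls behind small accessor functions, so the rest of the code only deals with offsets into the external memory. The following is a minimal sketch under stated assumptions, not a real driver: spi_mem_read() and spi_mem_write() are hypothetical names, and a RAM array stands in for the external chip so the example can run on a host. On target, those two functions would be replaced by the actual SPI driver calls.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for the external SPI memory (assumption: 256 bytes). */
#define EXT_MEM_SIZE 256u
static uint8_t fake_spi_chip[EXT_MEM_SIZE];

/* Hypothetical driver read: copies len bytes starting at addr. */
int spi_mem_read(uint32_t addr, void *dst, size_t len)
{
    if (addr > EXT_MEM_SIZE || len > EXT_MEM_SIZE - addr)
        return -1;                       /* out of range */
    memcpy(dst, &fake_spi_chip[addr], len);
    return 0;
}

/* Hypothetical driver write: copies len bytes to addr. */
int spi_mem_write(uint32_t addr, const void *src, size_t len)
{
    if (addr > EXT_MEM_SIZE || len > EXT_MEM_SIZE - addr)
        return -1;                       /* out of range */
    memcpy(&fake_spi_chip[addr], src, len);
    return 0;
}

/* Typed accessors: offset is the address inside the external memory. */
int ext_read_u32(uint32_t offset, uint32_t *value)
{
    return spi_mem_read(offset, value, sizeof *value);
}

int ext_write_u32(uint32_t offset, uint32_t value)
{
    return spi_mem_write(offset, &value, sizeof value);
}
```

Callers then write ext_write_u32(16, x) instead of dereferencing a pointer. The access is still explicit, but the driver details stay in one place.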

Related

How can I improve SPDK performance on userspace DMA access?

I am working on a userspace PCI driver which uses SPDK/VFIO APIs to do DMA access.
Currently, for each DMA allocation request I need to fill in a spdk_vfio_dma_map structure and then call ioctl(fd, VFIO_IOMMU_MAP_DMA, &dma_map) to map the DMA region through the IOMMU. Later I call ioctl(fd, VFIO_IOMMU_UNMAP_DMA, &dma_map) to remove the IOMMU mapping.
This is working fine so far and looks like it's what the SPDK examples are doing. However, I am wondering if there is a way to pre-allocate all memory buffers in userspace and then, for each DMA allocation request, just use the pre-allocated memory instead of making an ioctl call each time.
Any ideas are much appreciated.
I'm not sure I understand the issue, but the whole idea (of DPDK and SPDK) is to allocate all the memory you are using at application start or driver probe.
If you are using memory that is under application control all the time, then you don't need to do VFIO_IOMMU_MAP_DMA and VFIO_IOMMU_UNMAP_DMA for every DMA transaction. If this is not the case, you have two options:
1) Do the VFIO_IOMMU_MAP_DMA and VFIO_IOMMU_UNMAP_DMA for every IO.
2) Copy the payload to memory that is already registered with VFIO_IOMMU_MAP_DMA.
The first option is better for huge memory blocks, while the second is better for small IO chunks.
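The "map once, reuse many times" idea can be sketched as a small fixed-size chunk pool. This is only an illustration under stated assumptions: the dma_pool_* names are hypothetical and are not SPDK or VFIO APIs, the single VFIO_IOMMU_MAP_DMA call that would register the region is left out, and iova_base is simply whatever device-visible address that one mapping was given.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define CHUNK_SIZE  4096u
#define CHUNK_COUNT 8u

struct dma_pool {
    uint8_t  *base;                    /* start of the pre-allocated region */
    uint64_t  iova_base;               /* device address of base (assumed) */
    unsigned  free_top;                /* number of entries in free_list */
    uint16_t  free_list[CHUNK_COUNT];  /* stack of free chunk indices */
};

/* Allocate the whole region up front. In a real driver, the one-time
 * IOMMU mapping of [base, base + size) would be done here. */
int dma_pool_init(struct dma_pool *p, uint64_t iova_base)
{
    p->base = aligned_alloc(CHUNK_SIZE, (size_t)CHUNK_SIZE * CHUNK_COUNT);
    if (p->base == NULL)
        return -1;
    p->iova_base = iova_base;
    p->free_top = 0;
    for (unsigned i = 0; i < CHUNK_COUNT; i++)
        p->free_list[p->free_top++] = (uint16_t)i;
    return 0;
}

/* Hand out one chunk and report its device address; no ioctl needed. */
void *dma_pool_get(struct dma_pool *p, uint64_t *iova)
{
    if (p->free_top == 0)
        return NULL;                   /* pool exhausted */
    uint16_t idx = p->free_list[--p->free_top];
    *iova = p->iova_base + (uint64_t)idx * CHUNK_SIZE;
    return p->base + (size_t)idx * CHUNK_SIZE;
}

/* Return a chunk previously obtained from dma_pool_get(). */
void dma_pool_put(struct dma_pool *p, void *buf)
{
    if (p->free_top >= CHUNK_COUNT)
        return;                        /* ignore an unbalanced put */
    size_t idx = (size_t)((uint8_t *)buf - p->base) / CHUNK_SIZE;
    p->free_list[p->free_top++] = (uint16_t)idx;
}

/* Release the region; the one-time unmap would also happen here. */
void dma_pool_destroy(struct dma_pool *p)
{
    free(p->base);
    p->base = NULL;
}
```

With a pool like this, the per-request cost is an index pop and some pointer arithmetic. Payloads that live outside the pool still need either their own mapping or a copy into a pool chunk, as described in the two options above.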

Check if the mapped memory supports write combining

I am writing a kernel driver which exposes my I/O device to user space.
Using mmap, the application gets a virtual address through which it can write to the device.
Since I want the application's writes to go out as large PCIe transactions, the driver maps this memory as write-combining.
Depending on the memory type (write-combining or non-cached), the application chooses the optimal method for working with the device.
However, some architectures do not support write-combining, or support it only for part of the memory space.
Hence, it is important that the kernel driver tells the application whether or not it succeeded in mapping the memory as write-combining.
I need a generic way to check in the kernel driver whether the memory it mapped (or is going to map) is write-combining or not.
How can I do it?
Here is part of my code:
vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot);
io_remap_pfn_range(vma, vma->vm_start, pfn, PAGE_SIZE, vma->vm_page_prot);
First, you can find out if an architecture supports write-combining at compile time, with the macro ARCH_HAS_IOREMAP_WC. See, for instance, here.
At run time, you can check the return values of ioremap_wc(), set_memory_wc() and friends to see whether they succeeded.

Physical Memory Allocation in Kernel

I am writing a kernel module that is going to trigger an external PCIe device to read a block of data from my internal memory. To do this I need to send the PCIe device a pointer to the physical memory address of the data that I would like to send. Ultimately this data is going to be written from userspace to the kernel with the write() function (userspace) and copy_from_user() (kernel space). As I understand it, the address that my kernel module will see is still a virtual memory address. I need a way to get its physical address so that the PCIe device can find it.
1) Can I just use mmap() from userspace and place my data in a known location in DDR memory, instead of using copy_from_user()? I do not want to accidentally overwrite another process's data in memory, though.
2) My kernel module reserves PCIe data space at initialization using ioremap_nocache(). Can I do the same for this buffer from my kernel module, or is it a bad idea to treat this memory as I/O memory? If I can, what would happen if the memory that I try to reserve is already in use? I do not want to hard-code a static memory location and then find out that it is in use.
Thanks in advance for your help.
You don't choose a memory location and put your data there. Instead, you ask the kernel to tell you the location of your data in physical memory, and tell the board to read that location. Each page of memory (4KB) will be at a different physical location, so if you are sending more data than that, your device likely supports "scatter gather" DMA, so it can read a sequence of pages at different locations in memory.
The API is this: dma_map_page() returns a value of type dma_addr_t, which you can give to the board. Call dma_unmap_page() when the transfer is finished. If you're doing scatter-gather, you'll instead put that value in the list of descriptors that you feed to the board. Again, if scatter-gather is supported, dma_map_sg() and friends will help with this mapping of a large buffer into a set of pages. It's still your responsibility to set up the page descriptors in the format expected by your device.
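To illustrate why one large buffer turns into several descriptors, here is a small host-side sketch that splits an address range at 4 KB page boundaries. It is not kernel code and does not perform any DMA mapping: struct sg_desc is a hypothetical descriptor layout, split_into_pages() is a made-up helper, and the addresses are plain numbers. In a real driver each entry's address would come from the DMA mapping API (for example dma_map_sg()), and the descriptor format is dictated by the device.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SZ 4096u

/* Hypothetical descriptor: one contiguous piece within a single page. */
struct sg_desc {
    uint64_t addr;
    uint32_t len;
};

/* Split [start, start + len) into pieces that never cross a page
 * boundary. Returns the number of descriptors written, or 0 if the
 * range does not fit into max descriptors (or len is 0). */
size_t split_into_pages(uint64_t start, size_t len,
                        struct sg_desc *out, size_t max)
{
    size_t n = 0;

    while (len > 0 && n < max) {
        /* Bytes left before the next page boundary. */
        uint32_t room = PAGE_SZ - (uint32_t)(start % PAGE_SZ);
        uint32_t chunk = len < room ? (uint32_t)len : room;

        out[n].addr = start;
        out[n].len = chunk;
        n++;

        start += chunk;
        len -= chunk;
    }
    return len == 0 ? n : 0;
}
```

A 5000-byte buffer that starts 256 bytes before a page boundary, for instance, ends up as three pieces of 256, 4096 and 648 bytes.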
This is all very well written up in Linux Device Drivers (Chapter 15), which is required reading. http://lwn.net/images/pdf/LDD3/ch15.pdf. Some of the APIs have changed from when the book was written, but the concepts remain the same.
Finally, mmap(): Sure, you can allocate a kernel buffer, mmap() it out to user space and fill it there, then dma_map that buffer for transmission to the device. This is in fact probably the cleanest way to avoid copy_from_user().

Bypassing tty layer and copy to user

I would like to copy data to user space from a kernel module which receives data from a serial port and transfers it to DMA, which in turn forwards the data to the tty layer and finally to user space.
The current flow is:
serial driver FIFO --> DMA --> TTY layer --> user space (the data is emptied from DMA to the tty layer upon expiration of a timer)
What I want to achieve is:
serial driver FIFO --> DMA --> user space (I am OK with using a timer to send the data to user space; if there is a better way, let me know)
Also, the kernel module handling the serial FIFO --> DMA path is not a character device.
I would like to bypass the tty layer completely. What is the best way to achieve this?
Any pointers or code snippets would be appreciated.
In kernels >= 3.10.5 the "serial FIFO" that you refer to is called a uart_port. These are defined in drivers/tty/serial.
I assume that what you want to do is to copy the driver for your UART to a new file, then instead of using uart_insert_char to insert characters from the UART RX FIFO, you want to insert the characters into a buffer that you can access from user space.
The way to do this is to create a second driver, a misc class device driver that has file operations, including mmap, and that allocates kernel memory that the driver's mmap file operation function associates with the userspace mapped memory. There is a good example of code for this written by Maxime Ripard. This example was written for a FIQ handled device, but you can use just the probe routine's dma_zalloc_coherent call and the mmap routine, with its call to remap_pfn_range, to do the trick, that is, to associate a user space mmap on the misc device file with the alloc'ed memory.
You need to connect the memory that you allocated in your misc driver to the buffer that you write to in your UART driver using either a global void pointer, or else by using an exported symbol, if your misc driver is a module. Initialize the pointer to a known invalid value in the UART driver and test it to make sure the misc driver has assigned it before you try to insert characters to the address to which it points.
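The pointer handshake between the two drivers can be sketched in plain C as follows. This is a host-side illustration under stated assumptions, not kernel code: struct rx_ring, shared_rx and rx_insert_char() are hypothetical names, NULL is used as the "known invalid value", and the kernel specifics (exporting the symbol, memory barriers, locking against the reader) are deliberately left out.

```c
#include <stddef.h>
#include <stdint.h>

#define RX_BUF_SIZE 16u

/* Layout of the buffer that user space would see through mmap. */
struct rx_ring {
    volatile uint32_t head;        /* total number of bytes written */
    uint8_t data[RX_BUF_SIZE];     /* circular storage */
};

/* Owned by the "UART driver"; assigned by the "misc driver" once its
 * buffer has been allocated. NULL means "not ready yet". */
struct rx_ring *shared_rx = NULL;

/* Called for each received byte instead of pushing it to the tty layer.
 * Returns 0 on success, -1 if the buffer has not been attached yet. */
int rx_insert_char(uint8_t c)
{
    struct rx_ring *r = shared_rx;

    if (r == NULL)
        return -1;                 /* misc driver has not set it up */

    r->data[r->head % RX_BUF_SIZE] = c;
    r->head++;                     /* publish after the byte is stored */
    return 0;
}
```

The reader compares its own count against head to find new bytes. With a ring this small, it also has to keep up, since old data is overwritten once head wraps around.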
Note that you can't add an mmap function to the UART driver directly because the UART driver class does not support an mmap file operation. It only supports the operations defined in the include/linux/serial_core.h struct uart_ops.
Admittedly this is a cumbersome solution - two device drivers, but the alternative is to write a new device class, a UART device that has an mmap operation, and that would be a lot of work compared with the above solution, although it would be elegant. No one has done this to date because, as Jonathan Corbet says, "...not every device lends itself to the mmap abstraction; it makes no sense, for instance, for serial ports and other stream-oriented devices", though this is exactly what you are asking for.
I implemented this solution for a polling mode UART driver based on the mxs-auart.c code and Maxime's example. It was non-trivial effort but mostly because I am using a FIQ handler for the polling timer. You should allow two to three weeks to get the whole thing up and running.
The DMA aspect of your question depends on whether the UART supports DMA transfer mode. If so, then you should be able to set it using the serial flags. The i.MX28's PrimeCell auarts support DMA transfer but for my application there was no advantage over simply reading bytes directly from the UART RX FIFO.

Convert DMA mapping to virtual address

I have a somewhat unusual situation where I'm developing a simulation module for an Ethernet device. Ideally, the simulation layer would just be identical to the real hardware with regard to the register set. The issue I've run into is that the DMA registers in the hardware are loaded with the DMA mapping (physical) address of the data. I need to use those physical addresses to copy the data from the Tx buffer on the source device to the Rx buffer on the destination device. To do that in module code, I need pointers to virtual memory. I looked at phys_to_virt() and I didn't understand this comment in the man page:
This function does not handle bus mappings for DMA transfers.
Does this mean that a physical address that is retrieved via dma_map_single cannot be converted back to a virtual address using phys_to_virt()? Is there another way to accomplish this conversion?
There is no general way to map a DMA address to a virtual address. The dma_map_single() function might be programming an IOMMU (e.g. VT-d on an Intel x86 system), which results in a DMA address that is completely unrelated to the original physical or virtual address. However, this presentation and the linked slides give one approach to hooking an emulated hardware model up to a real driver (basically, use virtualization).
I am not too clear about this question, but if you are using phys_to_virt(), the reason may be that an address that is valid on the bus cannot be converted to a virtual address by this function. I am not sure, but try the bus_to_virt(bus_addr) function.
Try dma_virt = virt_to_phys(bus_to_virt(dma_handle))
It worked for me. It gives the same virtual address that was mapped by dma_alloc_coherent().
