Multiple entries of PCIe address range in /proc/iomem log

I use dma_alloc_coherent() in my custom driver to get both virtual and bus addresses.
res->KernelAddress = (u64)dma_alloc_coherent( &DevExt->pdev->dev, size, &res->BusAddress, GFP_ATOMIC );
When I print the bus address (res->BusAddress) with %llx, I get 80009000.
I checked the log of /proc/iomem to verify the range, but there are multiple entries.
The log of /proc/iomem is shown below:
10000000-10000fff : /pcie-controller#10003000/pci#1,0
10003000-100037ff : pcie-pads
10003800-10003fff : pcie-afi
10004000-10004fff : /pcie-controller#10003000/pci#3,0
40000000-4fffffff : pcie-config-space
50100000-57ffffff : pcie-non-prefetchable
  50800000-52ffffff : PCI Bus 0000:01
    50800000-5087ffff : 0000:01:00.0
    51000000-51ffffff : 0000:01:00.0
    52000000-52ffffff : 0000:01:00.0
58000000-7fffffff : pcie-prefetchable
  58000000-58ffffff : PCI Bus 0000:01
    58000000-58ffffff : 0000:01:00.0
80000000-d82fffff : System RAM
  80080000-810fafff : Kernel code
  8123f000-814b3fff : Kernel data
d9300000-efffffff : System RAM
f0200000-275ffffff : System RAM
276600000-2767fffff : System RAM
Is 80009000 valid? Which section does it belong to?
Is it necessary to use dma_mmap_coherent() after dma_alloc_coherent() for proper mapping?
Thanks in advance !!

From https://www.kernel.org/doc/Documentation/bus-virt-phys-mapping.txt (some of the details of this file are now obsolete but it's the best overview of the issue):
Essentially, the three ways of addressing memory are (this is "real memory",
that is, normal RAM--see later about other details):
 - CPU untranslated. This is the "physical" address. Physical address
   0 is what the CPU sees when it drives zeroes on the memory bus.
 - CPU translated address. This is the "virtual" address, and is
   completely internal to the CPU itself with the CPU doing the appropriate
   translations into "CPU untranslated".
 - bus address. This is the address of memory as seen by OTHER devices,
   not the CPU. Now, in theory there could be many different bus
   addresses, with each device seeing memory in some device-specific way, but
   happily most hardware designers aren't actually actively trying to make
   things any more complex than necessary, so you can assume that all
   external hardware sees the memory the same way.
Now, on normal PCs the bus address is exactly the same as the physical
address, and things are very simple indeed. However, they are that simple
because the memory and the devices share the same address space, and that is
not generally necessarily true on other PCI/ISA setups.
The bottom line is that the answer to your question is architecture-dependent.
In your /proc/iomem snippet, note that the listing is nested: Kernel code and Kernel data are sub-ranges of the first System RAM range. 80009000 lies inside 80000000-d82fffff (System RAM) but below the start of Kernel code at 80080000, so if it is a physical address it is ordinary RAM rather than kernel code. That by itself does not tell you whether a bus address is the same as a physical address on your architecture.
dma_alloc_coherent also maps the memory in kernel virtual address space so you shouldn't need to do anything else to access it from your code. (dma_mmap_coherent is used to map the memory into user virtual address space.)
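If it helps, the containment check can be done mechanically instead of by eye. The following is only a user-space sketch: the table holds a few entries copied from the listing in the question, and the names iomem_range, range_contains and print_enclosing_ranges are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* One /proc/iomem entry: inclusive start and end, plus its label. */
struct iomem_range {
    uint64_t start;
    uint64_t end;
    const char *name;
};

/* A few entries copied from the listing in the question. */
static const struct iomem_range iomem_ranges[] = {
    { 0x50100000ULL, 0x57ffffffULL, "pcie-non-prefetchable" },
    { 0x80000000ULL, 0xd82fffffULL, "System RAM" },
    { 0x80080000ULL, 0x810fafffULL, "Kernel code" },
    { 0x8123f000ULL, 0x814b3fffULL, "Kernel data" },
};

/* Return 1 if addr lies inside the inclusive range r, else 0. */
static int range_contains(const struct iomem_range *r, uint64_t addr)
{
    assert(r->start <= r->end);
    return addr >= r->start && addr <= r->end;
}

/* Print every listed range that contains addr; return how many matched. */
static size_t print_enclosing_ranges(uint64_t addr)
{
    size_t hits = 0;
    size_t count = sizeof(iomem_ranges) / sizeof(iomem_ranges[0]);

    for (size_t i = 0; i < count; i++) {
        if (range_contains(&iomem_ranges[i], addr)) {
            printf("%llx is inside %llx-%llx : %s\n",
                   (unsigned long long)addr,
                   (unsigned long long)iomem_ranges[i].start,
                   (unsigned long long)iomem_ranges[i].end,
                   iomem_ranges[i].name);
            hits++;
        }
    }
    return hits;
}
```

Calling print_enclosing_ranges(0x80009000ULL) lists every range from the table that encloses the address returned by dma_alloc_coherent.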

Related

How is PCIe endpoint device memory mapped into the system's memory map (MMIO)?

How does the Linux kernel or the BIOS map PCIe endpoint device memory into the system's MMIO space? Is there an API for this?
Assume I am writing a Linux device driver for a PCIe endpoint device. How can I map the PCIe device memory into MMIO space? Or is the device already mapped into MMIO by the BIOS during enumeration, so that all I need to do is remap the device MMIO into the kernel virtual address space using ioremap()?
Platform : Linux on x86
There are two parts to this answer.
Role of the BIOS
The BIOS (typically UEFI-based) does some sort of depth-first search (DFS) and enumerates all the children, as PCIe is a self-enumerating bus. Since it has a view of the whole system (devices, buses, processors), it writes an address to the BAR registers (BAR0 and/or several others). This is the address the system will use, and requests to it are routed from the Host Agent (HA on x86/Intel platforms) to the Root Port, through any PCIe switch, all the way to the endpoint.
Each of these elements tracks which address ranges belong to itself or to one of its child devices (for example, a switch may be the child of a Root Port).
Role of the Device Driver
The OS/kernel provides a toolkit of helper routines that driver authors use to access the device registers. Typically a driver follows a sequence like the following.
This is sample driver pseudo-code, just to help illustrate the idea:
1. pci_resource_flags(pdev, 0) & IORESOURCE_MEM
Check if a resource region is valid, here check for BAR 0
2. pci_request_regions(pdev, "region")
Take ownership of the resource/region
3. drv->registers = pci_iomap(pdev, 0, SIZE_YOU_WANT_TO_MAP)
This will give you kernel virtual address to device register mapping
Note: if the BIOS does not enumerate the device, you can rescan the PCIe tree from Linux to see whether the device shows up.
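To look at what was assigned to each BAR without writing a driver, you can read the device's resource file under /sys/bus/pci/devices/ in sysfs; each line holds start, end and flags in hex. Below is a small user-space sketch of parsing one such line. The helper names are invented for illustration, and the sample line in the usage note reuses the first BAR range from the /proc/iomem listing earlier on this page with an illustrative flags value.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Same numeric value as IORESOURCE_MEM in the kernel's linux/ioport.h. */
#define BAR_FLAG_MEM 0x00000200ULL

/* Hypothetical holder for one line of the sysfs "resource" file. */
struct bar_info {
    uint64_t start;
    uint64_t end;
    uint64_t flags;
};

/* Parse one "start end flags" line. Returns 0 on success, -1 on error. */
static int parse_bar_line(const char *line, struct bar_info *bar)
{
    unsigned long long start, end, flags;

    assert(line != NULL && bar != NULL);
    if (sscanf(line, "%llx %llx %llx", &start, &end, &flags) != 3)
        return -1;
    bar->start = start;
    bar->end = end;
    bar->flags = flags;
    return 0;
}

/* Size of the BAR in bytes; an unused BAR is reported as all zeroes. */
static uint64_t bar_size(const struct bar_info *bar)
{
    if (bar->start == 0 && bar->end == 0)
        return 0;
    return bar->end - bar->start + 1;
}

/* Return 1 if the BAR is a memory (MMIO) resource, else 0. */
static int bar_is_memory(const struct bar_info *bar)
{
    return (bar->flags & BAR_FLAG_MEM) != 0;
}
```

For a line such as "0x0000000050800000 0x000000005087ffff 0x0000000000040200", parse_bar_line fills in the range, bar_size gives the length a driver would pass to pci_iomap, and bar_is_memory corresponds to the IORESOURCE_MEM check in step 1.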

x86 debug register : how to break on a specific physical address

The x86 debug registers (dr0 to dr3) let you break on a specific address when software attempts to read or write it (see the Intel documentation). The configured address must be a linear address.
The problem is that when paging is enabled, the same physical address may be mapped by several different linear addresses in different tasks.
So, how to break on access to a specific physical address when we do not know all its possible linear alias addresses?
Without modifying the underlying kernel, you can only do that with virtualization extensions. When you are a guest OS, what you think of as physical addresses are virtual to the next layer down; so it can use the debug registers to achieve the desired effect.
If you are willing to modify the kernel, one avenue is to refuse to map a ‘page of interest’, then redirect all faults on this page to the appropriate debuggers. It is trickier than I am stating, you may have to emulate/singlestep some code and keep some intricate state.

Is a device address a virtual address? What does mmap do in this case?

Is a device address a virtual address, or is it mapped to a physical address? What does mmap do in this case?
Usually, device addresses are allocated by the specific system/host bus. They identify devices on the bus.
Virtual addresses and physical addresses are used in the memory system.
For mmap, the system allocates an I/O address for the specific device in the physical address space, so an application can access the device as if it were memory.
Usually devices come with resources such as registers, internal memory, etc. that can be accessed from the CPU.
To access a specific device register from the CPU, for example, you need to know the physical address of that register and then map this physical address into either kernel or user address space, depending on your use case.
mmap maps resources so they can be accessed from user space. The result of mmap is a user-space CPU address that is mapped to the resource.
This resource can be anything. It can be:
a file
anonymous memory
some external device resource ( memory, registers, etc )
mmap can't directly map device registers, for example, simply because it doesn't know how to do that. In this case you will probably need to add some kernel-space support for your mmap operation.
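As a minimal user-space sketch of the simplest case in that list, here is mmap used for anonymous memory. The wrapper names map_anonymous and unmap_region are made up for illustration. Mapping a device resource uses the same call from the application's side, but with a file descriptor for a device node whose driver implements the mmap operation.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS with strict -std= modes */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Map `length` bytes of zero-filled anonymous memory, readable and
 * writable. Returns a user-space virtual address, or NULL on failure.
 */
static void *map_anonymous(size_t length)
{
    void *p;

    assert(length > 0);
    p = mmap(NULL, length, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Remove the mapping again. Returns 0 on success, -1 on error. */
static int unmap_region(void *addr, size_t length)
{
    return munmap(addr, length);
}
```

The pointer returned by map_anonymous is an ordinary user-space virtual address; the kernel decides which physical pages back it.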

Convert DMA mapping to virtual address

I have a somewhat unusual situation where I'm developing a simulation module for an Ethernet device. Ideally, the simulation layer would just be identical to the real hardware with regard to the register set. The issue I've run into is that the DMA registers in the hardware are loaded with the DMA mapping (physical) address of the data. I need to use those physical addresses to copy the data from the Tx buffer on the source device to the Rx buffer on the destination device. To do that in module code, I need pointers to virtual memory. I looked at phys_to_virt() and I didn't understand this comment in the man page:
This function does not handle bus mappings for DMA transfers.
Does this mean that a physical address that is retrieved via dma_map_single cannot be converted back to a virtual address using phys_to_virt()? Is there another way to accomplish this conversion?
There is no general way to map a DMA address to a virtual address. The dma_map_single() function might be programming an IOMMU (e.g. VT-d on an Intel x86 system), which results in a DMA address that is completely unrelated to the original physical or virtual address. However, this presentation and the linked slides give one approach to hooking an emulated hardware model up to a real driver (basically, use virtualization).
I am not too clear about this question, but if you are using phys_to_virt(), the reason may be that an address on the bus cannot be converted to a virtual address by this function. I am not sure, but try the bus_to_virt(bus_addr) function.
Try dma_virt = virt_to_phys(bus_to_virt(dma_handle))
It worked for me. It gives the same virtual address that was mapped by dma_alloc_coherent().
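One approach that does not depend on any reverse translation, and which is not taken from the answers above, is for the simulation layer to remember each mapping when it is created and look the DMA address up later. The sketch below is plain C with uint64_t standing in for dma_addr_t; every name in it (sim_mapping, sim_record_mapping, sim_dma_to_virt, sim_forget_mapping) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SIM_MAX_MAPPINGS 64

/* One remembered mapping: DMA handle, CPU pointer and length. */
struct sim_mapping {
    uint64_t dma_addr;   /* stand-in for dma_addr_t */
    void *cpu_addr;
    size_t size;
    int in_use;
};

static struct sim_mapping sim_table[SIM_MAX_MAPPINGS];

/* Record a mapping at map time. Returns 0, or -1 if the table is full. */
static int sim_record_mapping(uint64_t dma_addr, void *cpu_addr, size_t size)
{
    assert(cpu_addr != NULL && size > 0);
    for (size_t i = 0; i < SIM_MAX_MAPPINGS; i++) {
        if (!sim_table[i].in_use) {
            sim_table[i].dma_addr = dma_addr;
            sim_table[i].cpu_addr = cpu_addr;
            sim_table[i].size = size;
            sim_table[i].in_use = 1;
            return 0;
        }
    }
    return -1;
}

/* Translate a DMA address inside a recorded buffer; NULL if unknown. */
static void *sim_dma_to_virt(uint64_t dma_addr)
{
    for (size_t i = 0; i < SIM_MAX_MAPPINGS; i++) {
        const struct sim_mapping *m = &sim_table[i];

        if (m->in_use && dma_addr >= m->dma_addr &&
            dma_addr - m->dma_addr < m->size)
            return (char *)m->cpu_addr + (dma_addr - m->dma_addr);
    }
    return NULL;
}

/* Drop the mapping that starts at dma_addr (call at unmap time). */
static void sim_forget_mapping(uint64_t dma_addr)
{
    for (size_t i = 0; i < SIM_MAX_MAPPINGS; i++) {
        if (sim_table[i].in_use && sim_table[i].dma_addr == dma_addr)
            sim_table[i].in_use = 0;
    }
}
```

In a real module the table would need locking and would be filled wherever the driver calls dma_map_single() or dma_alloc_coherent(); the point is only that the virtual address is recorded up front rather than recovered from the DMA address.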

How are base registers, limit registers and relocation registers used?

My understanding of the address translation process in the MMU (memory management unit):
-> logical address : generated by the CPU; this is the address the programmer is concerned with.
-> virtual address : reside in the hard disk , as a pages.
-> physical address : resides in RAM. It is the actual address.
1: The CPU generates the logical address and sends it to the MMU.
2: The MMU translates the logical address into the virtual address, then translates that into the physical address and sends the physical address to RAM.
3: Whenever the RAM is full, a page that is not used frequently is returned to the hard disk, to free memory for other pages (processes).
My questions are:
1) Where is the value of the relocation register added?
2) Who decides the value of the relocation register?
3) What are the base register and limit register for, and how are they used?
4) Where does the logical address go?
I would be grateful if anybody could answer this.
Please let me know if I have misunderstood anything in this topic.
-thanks
I can tell you how this works on x86.
All programs in non-64-bit modes operate with addresses combined of two items: segment selector (for brevity "selector" is often omitted in text and that may be confusing) and offset. This selector:offset pair is called the logical address.
The selector portion isn't always explicitly specified or manipulated in code, since the CPU has "default" associations of segment registers containing selectors with specific instructions or specific instruction encodings. It's also uncommon to manipulate selectors in 32-bit mode, but it is very often necessary in 16-bit code.
The virtual address is formed from the logical address either "directly" (in real or 8086 virtual mode) or "indirectly" (in protected mode).
"Direct" virtual address = selector * 16 + offset.
"Indirect" virtual address = SegmentDescriptorTable[selector].Base + offset.
SegmentDescriptorTable is either the Global Descriptor Table (AKA GDT) or the Local Descriptor Table (AKA LDT). It's set up by the OS and describes the location and size of various segments of memory. selector is used to select a segment in the table. The Base entry of the table tells the segment's beginning (virtual address). The Limit entry tells the segment size (generally; the details are a little more complex).
When a program tries to access memory with an offset that results in an access beyond the end of the segment (the CPU compares offset and Limit), the CPU generates an exception and the OS handles it, usually by terminating the program.
Btw, in real/v86 mode, even though the virtual address is formed directly from selector:offset, there's still a 16-bit Limit imposed on offsets, which is why you need to use a different selector to access more than 64KB of memory.
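The two formulas and the limit check can be sketched in plain C as follows. This is a simplified model under stated assumptions: the descriptor table is indexed directly (a real selector also carries table-indicator and privilege bits), Limit is taken to be the highest valid offset, and all names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* "Direct" formation (real / v86 mode): selector * 16 + offset. */
static uint32_t real_mode_address(uint16_t selector, uint16_t offset)
{
    return ((uint32_t)selector << 4) + offset;
}

/* Simplified descriptor: where the segment starts and its last offset. */
struct seg_descriptor {
    uint32_t base;
    uint32_t limit;   /* assumption: highest valid offset */
};

/*
 * "Indirect" formation (protected mode): table[index].base + offset.
 * Returns 0 and stores the address in *out, or -1 where the CPU would
 * raise an exception (bad index, or offset beyond the limit).
 */
static int protected_mode_address(const struct seg_descriptor *table,
                                  size_t entries, size_t index,
                                  uint32_t offset, uint32_t *out)
{
    assert(table != NULL && out != NULL);
    if (index >= entries)
        return -1;
    if (offset > table[index].limit)
        return -1;
    *out = table[index].base + offset;
    return 0;
}
```

For example, real_mode_address(0xB800, 0x0010) gives 0xB8010, and a descriptor with base 0x00400000 and limit 0xFFFF turns offset 0x1234 into 0x00401234 while rejecting offset 0x10000.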
The Base entry in a segment descriptor can be used to either isolate the segment from the rest of the memory (Limit helps here) or to place or move the entire segment to an arbitrary virtual address without having to modify anything (or much) in the program it belongs to (if we're moving a segment, the data has to be moved in the memory, obviously). Basically, it can be used for relocation purposes. In real/v86 mode for relocation purposes the selector is changed.
The virtual address can be further translated to the physical address if the CPU is running in protected mode and has set up page tables. If there're no page tables, the physical address is the same as the virtual address. The translation is done in blocks of physical memory and address ranges that are called pages (often 4KB).
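With 4KB pages, the page-level arithmetic is just a split of the address into a page number and an offset within the page. A minimal sketch, assuming 32-bit addresses and a single page size (the names are illustrative only):

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12u                        /* 4KB pages */
#define SKETCH_PAGE_SIZE  (1u << SKETCH_PAGE_SHIFT)
#define SKETCH_PAGE_MASK  (SKETCH_PAGE_SIZE - 1u)

/* Upper bits: which page the virtual address falls in. */
static uint32_t page_number(uint32_t vaddr)
{
    return vaddr >> SKETCH_PAGE_SHIFT;
}

/* Lower 12 bits: position inside the page, unchanged by translation. */
static uint32_t page_offset(uint32_t vaddr)
{
    return vaddr & SKETCH_PAGE_MASK;
}

/* Physical address = frame chosen by the page tables plus the offset. */
static uint32_t compose_physical(uint32_t frame_number, uint32_t offset)
{
    assert(offset < SKETCH_PAGE_SIZE);
    return (frame_number << SKETCH_PAGE_SHIFT) | offset;
}
```

So virtual address 0x00401234 is page 0x401 at offset 0x234; if the page tables map that page to frame 0x80009, the physical address is 0x80009234.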
There's no dedicated relocation register on x86 CPUs. Relocation can be achieved by adjusting:
segment selectors in CPU registers or program's code
segment base addresses in GDT/LDT
offsets in program's code
physical addresses in page tables
As for "virtual address : reside in the hard disk , as a pages", I'm not sure exactly what you mean by this, but just because there's virtual-to-physical address translation, it doesn't mean there's also virtual on-disk memory. There are other uses for the translation besides virtual on-disk memory. And the addresses reside in the CPU and wherever your (and the OS's) code writes them to, not necessarily on the disk.
Your description has a number of mistakes, many of which may be the result of imprecise documentation and common usage.
First of all, there really is no such thing as a virtual address. There are physical and logical addresses. Sadly, the term virtual address is frequently (even in hardware documentation) used when logical address is what is meant.
The CPU instruction stream always operates on logical addresses (values may refer to physical addresses).
When the CPU needs to access a logical address, the MMU attempts to translate it to a physical address. It does that by looking up the address in a page table.
Several things can happen at that point:
There may not be a page table entry for the address => Access violation.
The page table entry is marked invalid => Access violation.
The page table entry indicates that no physical memory is mapped to it => Page fault.
(I omit mode access checks).
It is this last step where virtual memory comes into play. At that point the page fault handler of the operating system needs to find where the corresponding page has been stored on disk, load it, update the page table, and restart the instruction.
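The three outcomes listed above can be modelled with a toy single-level page table. This is only a sketch: the entry layout (valid, present, frame) and the names are invented for illustration, real page table formats differ per architecture, and the access-mode checks are omitted just as in the list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum lookup_result {
    LOOKUP_OK,
    LOOKUP_ACCESS_VIOLATION,
    LOOKUP_PAGE_FAULT
};

/* Toy page table entry; real formats pack these into bit fields. */
struct toy_pte {
    int valid;        /* entry may be used at all */
    int present;      /* a physical frame is currently mapped */
    uint32_t frame;   /* physical frame number when present */
};

/* Translate a logical address using 4KB pages and one table level. */
static enum lookup_result lookup_translate(const struct toy_pte *table,
                                           size_t entries,
                                           uint32_t laddr,
                                           uint32_t *paddr)
{
    uint32_t page = laddr >> 12;

    assert(table != NULL && paddr != NULL);
    if (page >= entries)
        return LOOKUP_ACCESS_VIOLATION;   /* no entry for the address */
    if (!table[page].valid)
        return LOOKUP_ACCESS_VIOLATION;   /* entry marked invalid */
    if (!table[page].present)
        return LOOKUP_PAGE_FAULT;         /* OS must bring the page in */
    *paddr = (table[page].frame << 12) | (laddr & 0xfffu);
    return LOOKUP_OK;
}
```

On LOOKUP_PAGE_FAULT, an operating system would load the page, set present and frame in the entry, and retry the access, which then takes the LOOKUP_OK path.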
The operating system manages the available physical memory by paging writeable memory (that has changed) to disk (read only data does not have to be written back) when there is high demand for physical memory.
I have never heard of a "relocation register" before. But doing a Google search, I can see that some academic material uses it as a confusing pedagogical concept (i.e., with no relation to reality).
Some systems define the page table using base and limit registers. The base register indicates where the page table starts in memory (this can be either a physical or a logical address) and the limit register indicates the size of the table.
The registers are usually not loaded directly. Their values are usually written to the hardware Process Context Block (PCB). When the process context is loaded, the page table base and limit are loaded automatically.
On some systems there are multiple page tables. If there are system and user page tables, the user page tables can refer to logical addresses in the system space and the system page tables refer to physical addresses.

Resources