I'm trying to write a kernel module for an Intel FPGA design supporting PCIe SR-IOV and placed in the x16 PCIe slot of an IBase M991 Mainboard (Q170 PCH, VT-d activated in BIOS, Integrated graphics only mode enabled).
The CPU is an Intel Core i7-6700TE, which also supports virtualization.
Furthermore I'm using a Yocto - Morty Distribution (Linux Kernel 4.19) with the following Kconfigs enabled:
CONFIG_PCI_IOV=y
CONFIG_PCI_DEBUG=y
CONFIG_INTEL_IOMMU_SVM=y
CONFIG_PCI_REALLOC_ENABLE_AUTO=y
CONFIG_INTEL_IOMMU_DEFAULT_ON=y
CONFIG_IRQ_REMAP=y
CONFIG_IOMMU_DEFAULT_PASSTHROUGH=y
CONFIG_DYNAMIC_DEBUG=y
With all of this in place I see my driver loading (the probe function gets called), but after calling pci_enable_sriov with the number of VFs I want to activate, I get the kernel message
not enough MMIO resources for SR-IOV
What am I doing wrong here? Is there an init function I need to call?
Many thanks for your help.
Edit: More information about the PCIe device:
1 PF, 8 VF
2 BARs (BAR0 and BAR2)
non prefetchable, 32 bit BARs
each BAR size is 4 kB (12bit)
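For scale, here is a rough sketch of how much MMIO space the VF BARs of such a device would ask for. It assumes that each VF BAR aperture covers (number of VFs x per-VF BAR size), which is how the SR-IOV capability lays out VF BARs. The function name vf_mmio_bytes is made up for this illustration and is not a kernel API.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel function): total MMIO space needed for
 * the VF BARs of one PF, assuming every VF implements the same BARs and each
 * VF BAR aperture spans num_vfs * bar_size bytes.
 */
static uint64_t vf_mmio_bytes(unsigned num_vfs, unsigned num_bars,
                              uint64_t bar_size)
{
    /* BAR sizes are always powers of two. */
    assert(bar_size != 0 && (bar_size & (bar_size - 1)) == 0);
    return (uint64_t)num_vfs * num_bars * bar_size;
}
```

With the figures above, vf_mmio_bytes(8, 2, 4096) gives 65536 bytes, i.e. 64 KiB of VF BAR space in total.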
First of all, I should say that I am asking this question from scratch, because I tried to find good documentation but nothing stood out...
What happens when we press a key?
I think this is complex, but I hope you can help me.
What I want to know: everything (but especially how the program starts on the host machine and how the key's electrical signal is encoded and sent...)
The eXtensible Host Controller (xHC) has a Periodic Transfer Ring. Windows programs this ring to trigger a transfer every time an interval in milliseconds has passed. The right interval is specified in the USB descriptor returned by the USB device. When the transfer occurs, the xHC puts a Transfer Event TRB on the event ring and triggers an MSI-X interrupt, which bypasses the IOAPIC much like an inter-processor interrupt. If Windows detects a change in the keys pressed, it sends a message to the application which currently has focus (calling the window's procedure) with the key pressed in one of the arguments.
I don't know about electrical signals, but I know the eXtensible Host Controller is the USB controller responsible for interacting with USB devices on modern Windows systems. Since Windows nowadays requires an x64 processor, the xHC must be present on your motherboard. The xHC is a PCI-Express device which is compliant with the PCI-Express specification.
To find an xHC, you:
Find the RSDP ACPI table in RAM;
This table will be found by the UEFI firmware, which acts as a kind of small operating system (OS) while the computer boots. The OS developers write a small UEFI application named bootx64.efi and place it on a FAT32 partition of the hard disk (the EFI System Partition), in the EFI/BOOT directory. The UEFI firmware launches that application directly at boot, which allows an OS to start without user input (similarly to how the legacy BIOS fetched the first sector of the hard disk and executed the instructions found there).
In practice the UEFI application is built with either EDK2 or gnu-efi. These toolkits are aware of the UEFI environment and specification, so the compiled code can call the services that the firmware provides during boot. A pointer to the System Table (which gives access to the configuration tables, including the ACPI tables) is passed as an argument to the "main" function (often called UefiMain) that the UEFI firmware calls in the application. The application can thus simply use this argument to find the RSDP and pass it to the OS.
Find the MCFG ACPI table using the RSDP;
The chain of tables is RSDP -> XSDT -> MCFG. Once the OS has found the MCFG, this table specifies the base address of the PCI configuration space. To interact with PCI devices you use memory-mapped IO (MMIO): you write to a physical address that is not backed by RAM, and the write goes to the registers of the PCI device instead. The MCFG thus specifies the base address at which you will start finding the MMIO configuration registers of the different PCI devices that are plugged into the computer.
Iterate on the PCI devices and look at their IDs until you find an xHC.
To iterate on the PCI devices, the PCI convention specifies a formula which is the following:
UINT64 physical_address = base_address + ((bus - first_bus) << 20 | device << 15 | function << 12);
The base_address is for a specific segment group. There can be up to 65536 segment groups and each can have up to 256 PCI buses (suitable for large servers or large computers with lots of components). Each PCI bus can have up to 32 devices plugged onto it and each device can have up to 8 functions. Each function can also be a PCI bridge. This is quite straightforward to understand because the terminology is clear. The bus here is an actual serial bus that the PCI devices (like a network card, a graphics card, an xHC, an AHCI controller, etc.) use to communicate with RAM. The function is a functionality of the PCI device like controlling USB devices, hard disks, HDMI screens (for graphics cards), etc. The PCI bridge bridges a PCI bus to another PCI bus. It means you can have a very large number of devices with the PCI specification, because bridges extend the tree of devices by adding further buses behind them.
Meanwhile, the bus is simply a number between 0 and 255. The first bus is specified in the MCFG ACPI table for a specific segment group. The device is a number between 0 and 31 and the function is a number between 0 and 7. This formula returns a physical address which points to a configuration space with a conventional layout (it is the same for all functions) containing specific registers. These registers are used to determine the type of device and to load a proper driver for it. Each function of each device thus gets its own configuration space.
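As a small illustration, the formula above can be wrapped in a C function like this. It is only a sketch: the function name ecam_address is my own, and the base address 0xE0000000 used in the example is an assumed value; the real one comes from the MCFG table of your machine.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Physical address of the 4 KiB configuration space of one PCI function.
 * base_address and first_bus come from the MCFG entry of the segment group.
 */
static uint64_t ecam_address(uint64_t base_address, uint8_t first_bus,
                             uint8_t bus, uint8_t device, uint8_t function)
{
    assert(bus >= first_bus); /* bus must belong to this MCFG entry */
    assert(device < 32);      /* 5 bits of device number */
    assert(function < 8);     /* 3 bits of function number */

    return base_address + (((uint64_t)(bus - first_bus) << 20) |
                           ((uint64_t)device << 15) |
                           ((uint64_t)function << 12));
}
```

For example, with the assumed base 0xE0000000 and first bus 0, ecam_address(0xE0000000, 0, 0, 20, 0) gives 0xE00A0000 for bus 0, device 20, function 0. Iterating over all bus, device and function values and reading the IDs at each address is how you scan for the xHC.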
For the xHC, there will be only one function and the IDs returned by its configuration space will be 0x0C for the class ID and 0x03 for the subclass ID (https://wiki.osdev.org/EXtensible_Host_Controller_Interface).
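To make the ID check concrete, here is a sketch of how the class information could be tested once you have read the 32-bit register at offset 0x08 of the configuration space (revision ID in the low byte, then programming interface, subclass and class). The programming interface value 0x30 for xHCI is not mentioned above; it is what distinguishes xHCI from the older USB controllers that share the same class and subclass. The function names are made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* class_reg is the dword read at configuration space offset 0x08. */
static bool is_xhci(uint32_t class_reg)
{
    uint8_t base_class = (class_reg >> 24) & 0xFF; /* 0x0C: serial bus controller */
    uint8_t sub_class  = (class_reg >> 16) & 0xFF; /* 0x03: USB controller */
    uint8_t prog_if    = (class_reg >> 8) & 0xFF;  /* 0x30: xHCI */

    return base_class == 0x0C && sub_class == 0x03 && prog_if == 0x30;
}

/* Example register values (made up, revision ID 0x00). */
static void xhci_examples(void)
{
    assert(is_xhci(0x0C033000u));  /* USB controller with xHCI interface */
    assert(!is_xhci(0x0C032000u)); /* same class, but EHCI interface */
}
```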
Once you have found an xHC, it gets rather complex. You need to initialize it and get the USB devices which are currently plugged into the computer. You need to take several steps to get the xHC operational. For this part, I'll leave you to read the xHCI specification which (in chapter 4) specifies exactly the steps which need to be taken (https://www.intel.com/content/dam/www/public/us/en/documents/technical-specifications/extensible-host-controler-interface-usb-xhci.pdf).
For the keyboard portion I'll leave you to read one of my answers on the Computer Science Stack Exchange: https://cs.stackexchange.com/questions/141870/when-are-a-controllers-registers-loaded-and-ready-to-inform-an-i-o-operation/141918#141918.
Some good links:
https://wiki.osdev.org/Universal_Serial_Bus
https://wiki.osdev.org/PCI
How does the Linux kernel or the BIOS map PCIe endpoint device memory into the system's MMIO space? Is there any API to achieve it?
Let's assume I am writing a Linux device driver for a PCIe endpoint device. How can I map the PCIe device memory into MMIO space? Or is it true that the device is already mapped into MMIO by the BIOS during enumeration, and all I need to do is remap the device MMIO into the kernel virtual address space using ioremap()?
Platform : Linux on x86
There are two parts to this answer
Role of the BIOS
The BIOS (typically UEFI based) will do some sort of Depth-First Search (DFS) and enumerate all the children, as PCIe is a self-enumerating bus. Since it has a view of the whole system (devices, buses, processors) it will write an address to the BAR registers (BAR0 and/or several of them). This is the address the system will use, and requests to it are routed from the Host Agent (HA on x86/Intel platforms) to the Root Port, through any PCIe switch, all the way to the endpoint.
Each of these elements tracks which address ranges belong to itself or to one of its child devices (for example, a Switch may be the child of a Root Port).
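To illustrate what enumeration software looks at in a BAR, here is a user-space sketch of the usual sizing step: write all ones to the BAR, read it back, and derive the size from the bits that stayed zero. It covers 32-bit memory BARs only and works on plain integers rather than touching hardware; the function names are my own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Size in bytes of a 32-bit memory BAR, given the value read back
 * after writing 0xFFFFFFFF to it. */
static uint32_t mem_bar_size(uint32_t readback)
{
    assert((readback & 0x1) == 0);    /* bit 0 clear: memory BAR, not I/O */
    uint32_t mask = readback & ~0xFu; /* drop the type and prefetch bits */
    return ~mask + 1;                 /* lowest writable address bit = size */
}

/* Bit 3 of a memory BAR marks the region as prefetchable. */
static bool mem_bar_is_prefetchable(uint32_t bar)
{
    return (bar & 0x8) != 0;
}

/* Bits 2:1 equal to 10b mark a 64-bit BAR (it then uses two BAR slots). */
static bool mem_bar_is_64bit(uint32_t bar)
{
    return ((bar >> 1) & 0x3) == 0x2;
}
```

For example, a readback of 0xFFFFF000 means a 4 KiB, 32-bit, non-prefetchable region. Once the size is known, the firmware picks a naturally aligned address of that size and writes it into the BAR.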
Role of the Device Driver
The OS/Kernel provides a toolkit of helper routines that driver authors use to access the device registers. Typically a driver may call the following routines
This is some sample driver pseudo-code, just to help illustrate the idea
1. pci_resource_flags(pdev, 0) & IORESOURCE_MEM
Check if a resource region is valid, here check for BAR 0
2. pci_request_regions(pdev, "region")
Take ownership of the resource/region
3. drv->registers = pci_iomap(pdev, 0, SIZE_YOU_WANT_TO_MAP)
This will give you kernel virtual address to device register mapping
Note : In case the BIOS does not enumerate, through Linux one can rescan the PCIe tree to see if a device can be seen or not.
I'm using PCIe bus on Freescale MPC8308 (as root complex) and the endpoint device is an ASIC with just one 256 MB memory region and just one BAR register. The device configuration space registers are readily accessible through "pciutils" package. At first I tried to access memory region by using mmap() but it didn't work. So at the next level, I prepared a device driver for the PCIe endpoint device which is a kernel module that I load into kernel after Linux booting.
In my driver the endpoint device is identified from device ID table but when I want to enable the device by pci_enable_device(), I see this error:
driver-pci 0000:00:00.0: device not available because of BAR 0 [0x000000-0xfffffff] collisions
Also when I want to allocate memory region for PCIe device by using pci_request_region(), it is not possible.
Here is the part of driver code which is not working:
pci_enable_result = pci_enable_device(pdev);
if (pci_enable_result)
{
    /* Enabling failed: report it and pass the error code back to the PCI core */
    printk(KERN_INFO "PCI enable encountered a problem\n");
    return pci_enable_result;
}
else
{
    printk(KERN_INFO "PCI enable was successful\n");
}
And here is the result in "dmesg" :
driver-pci 0000:00:00.0: device not available because of BAR 0 [0x000000-0xfffffff] collisions
PCI enable encountered a problem
driver-pci: probe of 0000:00:00.0 failed with error -22
It is worth noting that in the driver I can read and write configuration registers correctly by using functions like pci_read_config_dword() and pci_write_config_dword().
What do you think the problem is? Is it possible that the problem appears because the kernel initializes the device before the kernel module is loaded? What should I do to prevent this from happening?
BAR registers are generally used for small regions. Your BAR0 size seems to be too large. Try with less memory (less than 1 MB); it should work.
I am trying to understand how an MLO is loaded into the on-chip RAM of an SoC and does the minimal configuration. I am using the TI DM8168 SoC.
I have gone through the following link to understand the MLO or x-loader:
http://omappedia.org/wiki/Bootloader_Project
I got to know that the ROM code loads the MLO (x-loader) into the on-chip RAM of the SoC, which does the minimal configuration and finally loads U-Boot (universal bootloader), which in turn starts the Linux kernel.
My doubt here is that my on-chip RAM size is 64 KB and the MLO size is 116 KB, so how does the ROM code load the MLO into the on-chip RAM?
It seems that the DM8168 has more than 64KiB of internal RAM: as explained in the DM816x AM389x PSP 04.00.01.13 Feature Performance Guide, it has at least two more blocks of internal RAM, referred to as OCMC0 and OCMC1, both being 256KiB in size.
Those two banks can be used by u-boot according to this document:
OCMC0 0x40300000 - 0x4033FFFF OCMC 0 will be used by ROM Code and U-boot. Once Linux kernel boots, OCMC0 is free and kernel can use it. If OCMC0 should not be used to load u-boot if loaded using CCS.
OCMC1 0x40400000 - 0x4043FFFF OCMC 1 will be used by ROM Code and U-boot. Once Linux kernel boots, OCMC0 is free and kernel can use it.
From u-boot-omap3/board/ti/ti8168/config.mk, it seems u-boot is using OCMC1:
TI_LOAD_ADDR = 0x40400000
This would explain why your 116KiB u-boot image can fit in the DM8168 internal RAM.
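The arithmetic can be checked with a few lines of C. This is only a sketch using the address range and load address quoted above; the helper names are made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Size in bytes of an inclusive address range [start, end]. */
static uint64_t region_size(uint32_t start, uint32_t end)
{
    assert(end >= start);
    return (uint64_t)end - start + 1;
}

/* True if an image of image_bytes loaded at load_addr stays inside the
 * inclusive region [region_start, region_end]. */
static bool image_fits(uint32_t load_addr, uint64_t image_bytes,
                       uint32_t region_start, uint32_t region_end)
{
    if (load_addr < region_start || load_addr > region_end)
        return false;
    return image_bytes <= (uint64_t)region_end - load_addr + 1;
}
```

region_size(0x40400000, 0x4043FFFF) is 262144 bytes (256 KiB), so image_fits(0x40400000, 116 * 1024, 0x40400000, 0x4043FFFF) is true, while the same image would not fit in a 64 KiB region.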
I am writing a device driver for a PCIe card in Linux. I am trying to use interrupts in my driver.
Reading the "IRQ Line" section of the PCI configuration register (offset 0x3C) reports that the assigned IRQ line for the device is 11. lspci -b -vv also reports that my device's interrupt number is 11.
Here's where it gets weird... cat /sys/bus/pci/devices/<my_device>/irq reports that the interrupt number is 19. lspci -vv also reports that the interrupt number is 19.
Requesting 11 in my driver does not work. If I request 19 in the driver, I catch interrupts just fine.
What gives?
Thanks!!!
I believe that it has to do with the difference between "physical" and "virtual" IRQ lines. Because the processor has a limited number of physical IRQ lines it assigns virtual IRQ lines to allow the total number of PCI devices to exceed the number of physical lines.
In this instance, 19 is your virtual IRQ line (as recognized by the processor) while 11 is the physical line (as recognized by the PCI device).
By the way, you should probably really get the IRQ number from the struct pci_dev for that device since they're dynamically generated.
Sean's answer is easy to understand. However here I would try to make it more complete.
The CPU's IRQ pin is almost never connected directly to a peripheral device, but via a programmable interrupt controller (PIC, e.g. Intel 8259A). This helps with handling a large device fan-out and also heterogeneous interrupt formats (pin based vs. message based as in PCIe).
If you run a recent version of lspci, it would print information like
Interrupt: pin A routed to IRQ 26
Here pin A, like 11 in the OP, is the physical side. This is something stored by the PCI device and used by the hardware to communicate with the interrupt controller. From LDP:
The PCI set up code writes the pin number of the interrupt controller
into the PCI configuration header for each device. It determines the
interrupt pin (or IRQ) number using its knowledge of the PCI interrupt
routing topology together with the devices PCI slot number and which
PCI interrupt pin that it is using. The interrupt pin that a device
uses is fixed and is kept in a field in the PCI configuration header
for this device. It writes this information into the interrupt line
field that is reserved for this purpose. When the device driver runs,
it reads this information and uses it to request control of the
interrupt from the Linux kernel.
IRQ 26, like 19 in the OP, is something that the kernel code and the CPU deal with. According to Linux Documentation/IRQ.txt:
An IRQ number is a kernel identifier used to talk about a hardware
interrupt source. Typically this is an index into the global irq_desc
array, but except for what linux/interrupt.h implements the details
are architecture specific.
So the PIC first receives the interrupt from the device, translates the interrupt source to an IRQ number and informs the CPU. The CPU uses the IRQ number to look into the Interrupt Descriptor Table (IDT) and find the correct software handler.
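To see what the OP actually read at offset 0x3C, here is a sketch that splits that configuration register into its Interrupt Line and Interrupt Pin fields. The register value in the example is made up to match the OP's numbers, and the function names are my own. Note that this is only what was written into the device's configuration header; the number to pass to request_irq() is the one the kernel reports (the irq field of struct pci_dev, or the sysfs irq file), which can differ.

```c
#include <assert.h>
#include <stdint.h>

/* reg_3c is the dword read at configuration space offset 0x3C. */

/* Bits 7:0: Interrupt Line, a value written by the PCI setup code. */
static uint8_t interrupt_line(uint32_t reg_3c)
{
    return reg_3c & 0xFF;
}

/* Bits 15:8: Interrupt Pin, fixed by the device (1 = INTA ... 4 = INTD).
 * Returns 'A'..'D', or '-' if the function uses no legacy interrupt pin. */
static char interrupt_pin(uint32_t reg_3c)
{
    uint8_t pin = (reg_3c >> 8) & 0xFF;
    return (pin >= 1 && pin <= 4) ? (char)('A' + pin - 1) : '-';
}

/* Example with a made-up register value: line 11, pin A. */
static void irq_examples(void)
{
    uint32_t reg = 0x0000010Bu;
    assert(interrupt_line(reg) == 11);
    assert(interrupt_pin(reg) == 'A');
}
```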
Ref:
http://www.tldp.org/LDP/tlk/dd/interrupts.html
http://www.brokenthorn.com/Resources/OSDevPic.html