Linux PCI Device Driver - Bus v. Kernel IRQ - linux-kernel

I am writing a device driver for a PCIe card in Linux. I am trying to use interrupts in my driver.
Reading the "IRQ Line" section of the PCI configuration register (offset 0x3C) reports that the assigned IRQ line for the device is 11. lspci -b -vv also reports that my device's interrupt number is 11.
Heres where it gets weird... cat /sys/bus/pci/devices/<my_device>/irq reports that the interrupt number is 19. lspci -vv also reports that the interrupt number is 19.
Requesting 11 in my driver does not work. If I request 19 in the driver, I catch interrupts just fine.
What gives?
Thanks!!!

I believe that it has to do with the difference between "physical" and "virtual" IRQ lines. Because the processor has a limited number of physical IRQ lines it assigns virtual IRQ lines to allow the total number of PCI devices to exceed the number of physical lines.
In this instance, 19 is your virtual IRQ line (as recognized by the processor) while 11 is the physical line (as recognized by the PCI device).
By the way, you should probably really get the IRQ number from the struct pci_dev for that device since they're dynamically generated.

Sean's answer is easy to understand. However here I would try to make it more complete.
CPU's IRQ pin, almost always, isn't connected directly to a peripheral device, but via an programmable interrupt controller(PIC, e.g. Intel 8259A). This helps handling large device fan-out and also heterogeneous interrupt format (pin based v.s. message based as in PCIe).
If you run a recent version of lspci, it would print information like
Interrupt: pin A routed to IRQ 26
Here, pin A as 11 in OP, is the physical pin. This is something saved by the PCI device and used by the hardware to exchange between interrupts controller. From LDP:
The PCI set up code writes the pin number of the interrupt controller
into the PCI configuration header for each device. It determines the
interrupt pin (or IRQ) number using its knowledge of the PCI interrupt
routing topology together with the devices PCI slot number and which
PCI interrupt pin that it is using. The interrupt pin that a device
uses is fixed and is kept in a field in the PCI configuration header
for this device. It writes this information into the interrupt line
field that is reserved for this purpose. When the device driver runs,
it reads this information and uses it to request control of the
interrupt from the Linux kernel.
IRQ 26 as 19 in OP is something that kernel code and CPU deal with. According to Linux Documentation/IRQ.txt:
An IRQ number is a kernel identifier used to talk about a hardware
interrupt source. Typically this is an index into the global irq_desc
array, but except for what linux/interrupt.h implements the details
are architecture specific.
So the PCI first receives interrupts from device, translate interrupt source to a IRQ number and informs the CPU. CPU use IRQ number to look into Interrupt Descriptor Table(IDT) and find the correct software handler.
Ref:
http://www.tldp.org/LDP/tlk/dd/interrupts.html
http://www.brokenthorn.com/Resources/OSDevPic.html

Related

What happens when we press a key on Windows?

First of all, I would say to you that I write this question from nothing because I have attempt to find good documentation but nothing stand out...
What happens when we squeeze a key?
I think this is complex but I hope you can help me.
What I search to know : all (but especially the program start on the host machine and how the key electric signal is encoded and send...)
The eXtensible Host Controller (xHC) has a Periodic Transfer Ring. Windows programs this ring to trigger a transfer every time an interval in milliseconds has passed. The right interval is specified in the USB descriptor returned by the USB device. When the transfer occurs, the xHC puts a Transfer Event TRB on the event ring and triggers an MSI-X interrupt which bypasses the IOAPIC as some kind of inter-processor interrupt. If Windows detects some change in the keys pressed, it will send a message to the application which currently has focus (calling the window's procedure) with the key pressed in one of the argument.
I don't know about electrical signals but I know the eXtensible Host Controller is the USB controller responsible to interact with USB on modern Windows systems. Since Windows nowadays requires an x64 processor, the xHC must be present on your motherboard. The xHC is a PCI-Express device which is compliant with the PCI-Express specification.
To find an xHC, you:
Find the RSDP ACPI table in RAM;
This table will be found by the UEFI firmware which acts as some kind of small operating-system (OS) during boot of the computer. Then, the OS developers will write a small UEFI application named bootx64.efi that they will place on a FAT32 partition on the hard-disk. They will place this app in the /boot/efi directory. The UEFI firmware will directly launch that application on boot of the computer which allows to have an OS which doesn't require user input to be launched (similarly to how it used to work with the legacy BIOS fetching the first sector of the hard-disk and executing the instructions found there).
The UEFI application is compiled in practice with either EDK2 or gnu-efi. These compilers are aware of the UEFI environment and specification. They thus compile the code to system calls that are present during boot and available for the UEFI application written by the OS developers. The System Tables (often the ACPI tables) are given as an argument to the "main" function (often called UefiMain) called by the UEFI firmware in the UEFI application. The code of the application can thus simply use these arguments to find the RSDP table and pass it to the OS.
Find the MCFG ACPI table using the RSDP;
The chain of table is RSDP -> XSDT -> MCFG. Once the OS found the MCFG, this table specifies the base address of the PCI configuration space. To interact with PCI devices you use memory mapped IO (MMIO). You write to some position in RAM and it will instead write to the registers of the PCI devices. The MCFG thus specifies the base address at which you will start finding MMIO registers for the different PCI devices that are plugged into the computer.
Iterate on the PCI devices and look at their IDs until you find an xHC.
To iterate on the PCI devices, the PCI convention specifies a formula which is the following:
UINT64 physical_address = base_address + ((bus - first_bus) << 20 | device << 15 | function << 12);
The base_address is for a specific segment group. Each segment group can have 256 buses (suitable for large servers or large computers with lots of components). There can be up to 65536 segment groups and each can have up to 256 PCI buses. Each PCI bus can have up to 32 devices plugged onto it and each device can have up to 8 functions. Each function can also be a PCI bridge. This is quite straightforward to understand because the terminology is clear. The bus here is an actual serial bus that the PCI devices (like a network card, a graphics card, an xHC, an AHCI, etc.) use to communicate with RAM. The function is a functionality of the PCI device like controlling USB devices, hard-disks, HDMI screens (for graphics cards), etc. The PCI bridge bridges a PCI bus to another PCI bus. It means you can have almost an infinite amount of devices with the PCI specification because the bridges allow to extend the tree of devices by adding other PCI host controllers.
Meanwhile, the bus is simply a number between 0 and 255. The first bus is specified in the MCFG ACPI table for a specific segment group. The device is a number between 0 and 31 and the function is a number between 0 and 7. This formula returns a physical address which points to a conventional configuration space (it is the same for all functions) which has specific registers. These registers are used to determine what is the type of device and to load a proper driver for it. Each function of each device thus gets a configuration space.
For the xHC, there will be only one function and the IDs returned by its configuration space will be 0x0C for the class ID and 0x03 for the subclass ID (https://wiki.osdev.org/EXtensible_Host_Controller_Interface).
Once you found an xHC, it gets rather complex. You need to initialize it and get the USB devices which are plugged in the computer at the current moment. You need to take several steps to get the xHC operational. For this part, I'll leave you to read the xHCI specification which (on chapter 4) specifies exactly the steps which need to be taken (https://www.intel.com/content/dam/www/public/us/en/documents/technical-specifications/extensible-host-controler-interface-usb-xhci.pdf).
For the keyboard portion I'll leave you to read one of my answer on the stackexchange for computer science: https://cs.stackexchange.com/questions/141870/when-are-a-controllers-registers-loaded-and-ready-to-inform-an-i-o-operation/141918#141918.
Some good links:
https://wiki.osdev.org/Universal_Serial_Bus
https://wiki.osdev.org/PCI

Failed to request_irq for kernel module

I am trying to port drivers from old kernel to new one on ARM based platform. While porting one of the drivers I have noticed that request_irq fails on new kernel. Actually, what this driver have is a number of hard coded irq numbers, and it tries to request_irq for this HW lines. I started to search what is the reason of request_irq failure, - the reason is that the appropriate IRQ descriptor (irq_desc) have IRQ_NOREQUEST flag set.
I started to search where this flag is cleared, and found that it happens here:
of_platform_populate
|
...
|
of_device_alloc
|
...
|
irq_of_parse_and_map
(some levels below this flag is dropped)
So that code is invoked from mach init code and parse DTB, all interrupt numbers that are mentioned in DTB would be mapped to irq virtual space and will be accessible through appropriate devices structures, for example:
irq = platform_get_irq(pdev, 0);
As I already said, this old - fashion drivers just have hard-coded irq numbers in include files, and they don't have no appropriate dtb entries.
The question is what is the right way, from the kernel perspective of view, to port this? Do I need to make this old driver a platform device (now it is just a char device) and create appropriate dtb description for it? Or I can just use some kernel API to mark this interrupts as available? Is it a common/normal style?

Why are MSI interrupts not shared?

Can any body tell why MSI interrupts are not shareable in linux.
PIN based interrupts can be shared by devices, but MSI interrupts are not shared by devices, each device gets its own MSI IRQ number. Why can't MSI interrupts be shared ?
The old INTx interrupts have two problematic properties:
Each INTx signal requires a separate signal line in hardware; and
the interrupt signal is independent of the other data signals, and this is sent asynchronously.
The consequences are that
multiple devices and drivers need to be able to share interrupts (the interrupt handler needs to check if its device actually raised the interrupt); and
when a driver receives an interrupt, it needs to do a read of some device register to ensure that any previous DMA writes made by the device are visible on the CPU.
Typically, both cases are handled by the driver reading its device's interrupt status register.
Message-Signaled Interrupts do not require a separate signal line but are sent as a message over the data bus. This means that the same hardware can support many more interrupts (so sharing it not necessary), and that the interrupt message is automatically synchronized with any DMA accesses. As a consequence, the interrupt handler does not need to do anything; the interrupt is guaranteed to come from its device, and DMA'd data is guaranteed to have already arrived.
If some drivers were written to share some MSI, the interrupt handler would again have to check whether the interrupt actually came from its own device, and there would be no advantage over INTx interrupts.
MSIs are not shared because it would not be possible, but because it is not necessary.
Please note that sharing an MSI is actually possible: as seen in this excerpt from /proc/interrupts, the Advanced Error Reporting, Power Management Events, and hotplugging drivers share one interrupt:
64: 0 0 PCI-MSI-edge aerdrv, PCIe PME, pciehp
These drivers are actually attached to the same device, but they still behave similar to INTx drivers, i.e., they register their interrupt with IRQF_SHARED, and the interrupt handlers check whether it was their own function that raised the interrupt.
Interrupt sharing is a hack due to resource constraints, like not having enough physical IRQ lines for each device that wants attention. If interrupts are represented by messages that have a large ID space, why would you do that?
"That" meaning: giving them the same identity so that devices then have to be probed to figure out which of the ones clashing to the same ID actually interrupted.
In fact, we would sometimes like to have multiple interrupts for one device. For instance, it's useful if the interrupt ID tells us not only which device interrupted by also why: like is it due to the arrival of input, or the draining of an output buffer? If interrupt lines are "cheap" because they are just software ID's with lots of bits, we can have that.

Interrupt handling on an SMP ARM system with a GIC

I wanted to know how interrupt handling works from the point any device is interrupted.I know of interrupt handling in bits and pieces and would like to have clear end to end picture of interrupt handing.Let me put across what little I know about interrupt handling.
Suppose an FPGA device is interrupted through electrical lines and get some data .Device driver for this FPGA device already had code (Interrupt handler) registered using request_irq function.
So now FPGA device have an IRQ line which it get after to call request_irq ,using this IRQ line device send data to the General Interrupt controller and GIC will do many to one translation of IRQ lines and send the signal to CPU core which then call below minimal code
IRQ_handler
SUB lr, lr, #4 ; modify LR
SRSFD #0x12! ; store SPSR and LR to IRQ mode stack
PUSH {r0-r3, r12} ; store AAPCS registers on to the IRQ mode stack
BL IRQ_handler_to_specific_device
POP {r0-r3, r12} ; restore registers
RFEFD sp! ; and return from the exception using pre-modified LR
IRQ_handler_to_specific_device is nothing is what we registered in Device driver using request_irq() call.
I still don't how CPU core comes to know about the interrupt source?(from which device interrupt is coming)
Also what is role of call like do_irq and shared interrupts works?
Need some help in understanding end to end picture on how interrupts are handled on ARM architecture?
The GIC is divided into two sections. The first is called the distributor. This is global to the system. It has several interrupt sources physically routed to it; although it maybe within an SOC package. The second section is replicated per-CPU and it called the cpu interface. The distributor has logic on how to distribute the shared peripheral interrupts or SPI. These are the type of interrupt your question is asking about. They are global hardware interrupts.
In the context of Linux, this is implemented in irq-gic.c. There is some documentation in gic.txt. Of specific interest,
reg : Specifies base physical address(s) and size of the GIC registers. The
first region is the GIC distributor register base and size. The 2nd region is
the GIC cpu interface register base and size.
The distributor must be accessed globally, so care must be taken to manage it's registers. The CPU interface has the same physical address for each CPU, but each CPU has a separate implementation. The distributor can be set up to route interrupts to specific CPUs (including multiples). See: gic_set_affinity() for example. It is also possible for any CPU to handle the interrupt. The ACK register will allocate IRQ; the first CPU to read it, gets the interrupt. If multiple IRQs pend and there are two ACK reads from different CPUs, then each will get a different interrupt. A third CPU reading would get a spurious IRQ.
As well, each CPU interface has some private interrupt sources, that are used for CPU-to-CPU interrupts as well as private timers and the like. But I believe the focus of the question is how a physical peripheral (unique to a system) gets routed to a CPU in an SMP system.

Inter processor Interrrupts in ARM cortex A9 ( How To write an handler for Software generated Interrupt ( ARM) in Linux? )

I read that the Software generated interrupts in ARM are used as Inter-processor interrupts. I can also see that 5 of those interrupts are already in use. I also know that ARM provides 16 Software generated interrupts.
In my application i am running a bare metal application on of the ARM-cortex cores and Linux on the other. I want to communicate some data from the core running bare metal application to the core which is running Linux. I plan to copy the data to the on chip memory ( which is shared) and I will trigger a SGI on the Core ( running linux) to indicate some data is available for it to process. Now I am able to generate the SGI from the core ( running bare-metal application ). But for handling the interrupt in the linux side, I am not sure of the SGI IRQ numbers which are free and I am also not sure whether i can use the IRQ number directly ( in general SGI are from 0-15). Does any one have an idea how to write a handler for SGI in Linux?
Edit: This is a re-wording of the above text, because the question was closed for SSCE reasons. The Cortex-A CPUs are used in multi-CPU systems. An ARM generic interrupt controller (GIC) monitors all global interrupts and dispatches them to a particular CPU. In order for individual CPUs to signal each other, a software generated interrupt (SGI) is sent from one core to the other; this uses peripheral private interrupts (PPI). This question is,
How to implement a Linux kernel driver that can receive an SGI as a PPI?
Does any one have an idea how to write a handler for SGI in Linux?
As you didn't give the Linux version, I will assume you work with the latest (or at least recent). The ARM GIC has device tree bindings. Typically, you need to specify the SGI interrupt number in a device tree node,
ipc: ipc#address {
compatible = "company,board-ipc"; /* Your driver */
reg = <address range>;
interrupts = <1 SGI 0x02>; /* SGI is your CPU interrupt. */
status = "enabled";
};
The first number in the interrupt stanza denotes a PPI. The SGI will probably be between 0-15 as this is where the SGI interrupts are routed (at least on a Cortex-A5).
Then you can just use the platform_get_irq() in your driver to get the PPI (peripheral private interrupt). I guess that address is the shared memory (physical) where you wish to do the communications; maybe reg is not appropriate, but I think it will work. This area will be remapped by the Linux MMU and you can use it with,
res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
mem = devm_ioremap_resource(dev, res);
The address in the device tree above is a hex value of the physical address. The platform_get_irq() should return an irq number which you can use with the request_irq() family of functions. Just connect this to your routine.
Edit: Unfortunately, interrupts below 16 are forbidden by the Linux irq-gic.c. For example, gic_handle_irq(), limits handler to interrupts between 16 and 1020. If SMP is enabled, then handle_IPI() is called for the interrupts of interest. gic_raise_softirq() can be used to signal an interrupt. To handle the SGI with the current Linux, smp.c needs additional enum ipi_msg_type values and code to handle these in handle_IPI(). It looks like newer kernels (3.14+ perhaps?) may add a set_ipi_handler() to smp.c to make such a modification unneeded.
I would like to add that an example of such inter-core communication can be found in TI multicore SoC's (i.e. OMAP3530). Some time ago when I was using such a mechanism, means were provided by TI. Specifically, it was the DSPLink Linux device driver which was providing such a functionality. At that time, unfortunately, it wasn't an open source solution, but maybe there is some technical paper from TI describing how it works ... Just a direction what you could investigate further :)
EDIT: In the meantime, it seems that they've made it open source. So, if that's what you are looking for, you can have a look: DSPLink and SysLink (successor of DSPLink)

Resources