I am writing a device driver that services interrupts from a device. The device has only one MSI interrupt vector, so I look up the irq with pci_irq_vector(dev, 0) and register a handler for it. This is shown in the following code snippet (equivalent to what I have, minus error handling):
retval = pci_alloc_irq_vectors(dev, 1, 1, PCI_IRQ_MSI);
irq = pci_irq_vector(dev, 0);
retval = request_irq(irq, irq_fnc, 0, "name", dev);
This all completes successfully and without warning (at least none in dmesg). Yet when the interrupt comes in, I get this error:
kernel:do_IRQ: 0.xxx No irq handler for this vector (irq -1)
The xxx appears to be an arbitrary number that changes every time the driver is loaded, but does not match the irq number. Instead, it matches the last two hex digits of the message data sent with the MSI interrupt as read from the MSI capability structure. Trying to request an irq of this number returns EINVAL which I think means that it's not associated with any PCI device. What does this number mean anyway?
Something that may be important to note, I am actually manually triggering this interrupt from the host side due to limitations with the device. I am reading the interrupt address and data from the capability structure then instructing the device to write the data to that address.
How would I go about further debugging this? Does anything from my description stand out as suspicious? Any help would be appreciated.
Does this particular irq show when you type cat /proc/interrupts? Maybe you can get the correct irq number from there, as well as other info like where it is attached and what driver is associated with this interrupt line!
So the problem ended up being in the order of things. To manually create the interrupt, I had read the config space for the interrupt address and data before allocating interrupts. While obvious in retrospect, allocating the irq vectors for the device writes the appropriate data to the config space. Hence, using the preexisting value in the message data field would point to an irq vector that does not exist.
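The observation that xxx matches the low byte of the message data fits the x86 MSI message layout, where the low 8 bits of the data register carry the interrupt vector the CPU will take. Here is a minimal sketch of decoding the two registers, assuming the standard x86 layout without interrupt remapping; the struct and function names are made up for illustration:

```c
#include <stdint.h>

/* Selected fields of an x86 MSI message (layout per the Intel SDM).
 * The names here are hypothetical, not kernel API. */
struct msi_fields {
    uint8_t vector;        /* data bits 7:0: vector delivered to the CPU */
    uint8_t delivery_mode; /* data bits 10:8 */
    uint8_t dest_id;       /* address bits 19:12: destination APIC ID */
};

static struct msi_fields msi_decode(uint32_t addr_lo, uint16_t data)
{
    struct msi_fields f;

    f.vector = (uint8_t)(data & 0xFF);
    f.delivery_mode = (uint8_t)((data >> 8) & 0x7);
    f.dest_id = (uint8_t)((addr_lo >> 12) & 0xFF);
    return f;
}
```

Decoding a value read before pci_alloc_irq_vectors() would give whatever vector was left in the register, which is why the read has to come after the allocation.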
I'm currently working on a little game that can run from the boot sector of a hard drive, just for something fun to do. This means my program runs in 16-bit real mode, and I have my compiler flags set up to emit pure i386 code. I'm writing the game in C++, but I do need a lot of inline assembly to talk to the BIOS via interrupt calls. Some of these calls return a 32-bit integer, but stored in two 16-bit registers. Currently I'm doing the following to get my number out of the assembly:
auto getTicks = [](){
uint16_t ticksL{ 0 }, ticksH{ 0 };
asm volatile("int $0x1a" : "=c"(ticksH), "=d"(ticksL) : "a"(0x0));
return static_cast<uint32_t>( (ticksH << 16) | ticksL );
};
This is a lambda function I use to call this interrupt function which returns a tick count. I'm aware that there are better methods to get time data, and that I haven't implemented a check for AL to see if midnight has passed, but that's another topic.
As you can see, I have to use two 16-bit values, get the register values separately, then combine them into a 32-bit number the way you see at the return statement.
Is there any way I could retrieve that data into a single 32-bit number in my code right away and avoid the shift and bitwise-or? I know that those 16-bit registers I'm accessing are really just the higher and lower 16-bits of a 32-bit register in reality, but I have no idea how to access the original 32-bit register as a whole.
I know that those 16-bit registers I'm accessing are really just the higher and lower 16-bits of a 32-bit register in reality, but I have no idea how to access the original 32-bit register as a whole.
As Jester has already pointed out, these are in fact 2 separate registers, so there is no way to retrieve "the original 32-bit register."
One other point: That interrupt modifies the ax register (returning the 'past midnight' flag), however your asm doesn't inform gcc that you are changing ax. Might I suggest something like this:
asm volatile("int $0x1a" : "=c"(ticksH), "=d"(ticksL), "=a"(midnight) : "a"(0x0));
Note that midnight is also a uint16_t.
As other answers suggest you can't load DX and CX directly into a 32-bit register. You'd have to combine them as you suggest.
In this case there is an alternative. Rather than using INT 1Ah/AH=0h you can read the BIOS Data Area (BDA) in low memory for the 32-bit DWORD value and load it into a 32-bit register. This is allowed in real mode on i386 processors. Two memory addresses of interest:
40:6C dword Daily timer counter, equal to zero at midnight;
incremented by INT 8; read/set by INT 1A
40:70 byte Clock rollover flag, set when 40:6C exceeds 24hrs
These two memory addresses are in segment:offset format, and are equivalent to physical addresses 0x0046C and 0x00470.
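For reference, the segment:offset to physical conversion in real mode is just segment * 16 + offset. A small sketch (the function name is mine, not from any library):

```c
#include <stdint.h>

/* Real-mode address translation: physical = segment * 16 + offset. */
static uint32_t real_mode_phys(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}
```

With this, 40:6C works out to 0x46C and 40:70 to 0x470, and so does 0000:046C, which is why setting DS to 0 and using the physical address as the offset reaches the same bytes.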
All you'd have to do is temporarily set the DS register to 0 (saving the previous value), turn off interrupts with CLI, retrieve the values from low memory using C/C++ pointers, re-enable interrupts with STI, and restore DS to the previously saved value. This of course is added overhead in the boot sector compared to using INT 1Ah/AH=0h, but it gives you direct access to the memory addresses the BIOS is reading/writing on your behalf.
Note: If DS is set to zero already no need to save/set/restore it. Since we don't see the code that sets up the environment before calling into the C++ code I don't know what your default segment values are. If you don't need to retrieve both the roll over and timer values and only wish to get them individually you can eliminate the CLI/STI.
You're looking for the 'A' constraint, which refers to the dx:ax register pair as a double-wide value. You can see the full set of defined constraints for x86 in the gcc documentation. Unfortunately there are no constraints for any other register pairs, so you have to get them as two values and reassemble them with shift and or, like you describe.
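Since the two halves have to be reassembled anyway, here is one way to write the combine step. This is only a sketch with a made-up helper name; the one point it illustrates is widening the high half before shifting, so the shift happens in unsigned 32-bit arithmetic regardless of how wide int is for the target:

```c
#include <stdint.h>

/* Combine the CX (high) and DX (low) halves returned by INT 1Ah/AH=0h. */
static uint32_t combine_cx_dx(uint16_t cx, uint16_t dx)
{
    /* Widen first: the shift is then done on a uint32_t, never on int. */
    return ((uint32_t)cx << 16) | dx;
}
```

A compiler will typically turn this into a couple of register moves and a shift, so there is little to gain from trying to avoid it.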
Assume we have a system whose CPU is fully compatible with the Intel 8259 Programmable Interrupt Controller. So this CPU uses vectored interrupts, of course.
When one of the eight interrupts occurs, the PIC just asserts the INTR wire that is connected to the CPU. The PIC then waits until the CPU asserts INTA. When it does, the PIC selects the pending interrupt with the highest priority (which depends on the pin number) and sends its interrupt vector on the data bus. I have omitted some timing details, but I don't think they matter here.
Here are my questions:
How does the device that caused the interrupt know that its interrupt request was accepted and that it can deassert the request? I read about the 8259, but I didn't find this.
Is the acknowledgement of the device whose interrupt was accepted performed in the ISR?
Sorry for my English.
The best reference is the original Intel datasheet, available here: https://pdos.csail.mit.edu/6.828/2012/readings/hardware/8259A.pdf It has full details of these modes, how the device operates, and how to program the device.
Caveat: I'm a bit rusty as I haven't programmed the 8259 in many years, but I'll take a shot at explaining things, per your request.
After an interrupting device, connected to an IRR ["interrupt request register"] pin, has asserted an interrupt request, the 8259 will convey this to the CPU by asserting INTR and then placing the vector on the bus during the INTA cycles generated by the CPU [two in 8086 mode, three in 8080/8085 mode].
After a given device has asserted IRR, the 8259's IS ["in-service"] register is or'ed with a mask of the IRR pin number. The IS is a priority select. While the IS bit is set, other interrupting devices of lower priority [or the original one] will not cause an INTR/INTA cycle to the CPU. The IS bit must be cleared first. These interrupts remain "pending".
The IS can be cleared by an EOI (end-of-interrupt) operation. There are multiple EOI modes that can be programmed. The EOI can be generated by the 8259 in AEOI mode. In other modes, the EOI is generated manually by the ISR by sending a command to the 8259.
The EOI action is all about allowing other devices to cause interrupts while the ISR is processing the current one. The EOI does not clear the interrupting device.
Clearing the interrupting device must be done by the ISR using whatever device specific register the device has for that purpose. Usually, this is a "pending interrupt" register [can be 1 bit wide]. Most H/W uses two interrupt related registers and the other one is an "interrupt enable" register.
With level triggered interrupts, if the ISR does not clear the device, when the ISR does issue the EOI command to the 8259, the 8259 will [try to] reinterrupt the CPU using the vector for the same device for the same condition. The CPU will probably be reinterrupted as soon as it issues an sti or iret instruction. Thus, an ISR routine must take care to process things in proper sequence.
Consider an example. We have a video controller that has four sources for interrupts:
HSTART -- start of horizontal line
HEND -- end of horizontal line [start of horizontal blanking interval]
VSTART -- start of new video field/frame
VEND -- end of video field/frame [start of vertical blanking interval]
The controller presents these as a bit mask in its own special interrupt source register, which we'll call vidintr_pend. We'll call the interrupt enable register vidintr_enable.
The video controller will use only one 8259 IRR pin. It is the responsibility of the CPU's video ISR to interrogate the vidintr_pend register and decide what to do.
The video controller will assert its IRR pin as long as vidintr_pend is non-zero. Since we're level triggered, the CPU may be re-interrupted.
Here is a sample ISR routine to go with this:
// video_init -- initialize controller
void
video_init(void)
{
write_port(...);
write_port(...);
write_port(...);
...
// we only care about the vertical interrupts, not the horizontal ones
write_port(vidintr_enable,VSTART | VEND);
}
// video_stop -- stop controller
void
video_stop(void)
{
// stop all interrupt sources
write_port(vidintr_enable,0);
write_port(...);
write_port(...);
write_port(...);
...
}
// vidisr_process -- process video interrupts
void
vidisr_process(void)
{
u32 pendmsk;
// NOTE: we loop because controller may assert a new, different interrupt
// while we're processing a given one -- we don't want to exit if we _know_
// we'll be [almost] immediately re-entered
while (1) {
pendmsk = port_read(vidintr_pend);
if (pendmsk == 0)
break;
// the normal way to clear on most H/W is a writeback
// writing a 1 to a given bit clears the interrupt source
// writing a 0 does nothing
// NOTE: with this method, we can _never_ have a race condition where
// we lose an interrupt
port_write(vidintr_pend,pendmsk);
if (pendmsk & HSTART)
...
if (pendmsk & HEND)
...
if (pendmsk & VSTART)
...
if (pendmsk & VEND)
...
}
}
// vidisr_simple -- simple video ISR routine
void
vidisr_simple(void)
{
// NOTE: interrupt state has been pre-saved for us ...
// process our interrupt sources
vidisr_process();
// allow other devices to cause interrupts
port_write(8259,SEND_NON_SPECIFIC_EOI);
// return from interrupt by popping interrupt state
iret();
}
// vidisr_nested -- video ISR routine that allows nested interrupts
void
vidisr_nested(void)
{
// NOTE: interrupt state has been pre-saved for us ...
// allow other devices to cause interrupts
port_write(8259,SEND_NON_SPECIFIC_EOI);
// allow us to receive them
sti();
// process our interrupt sources
// this can be interrupted by another source or another device
vidisr_process();
// return from interrupt by popping interrupt state
iret();
}
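To make the "we can never lose an interrupt" claim about the writeback concrete, here is a small software model of a write-1-to-clear pending register. It is a simulation only; the struct and function names are hypothetical and do not correspond to any real controller:

```c
#include <stdint.h>

/* Simulated write-1-to-clear (W1C) pending register. */
struct w1c_reg {
    uint32_t pending;
};

/* Hardware side: an interrupt source raises its bit. */
static void w1c_raise(struct w1c_reg *r, uint32_t bits)
{
    r->pending |= bits;
}

/* Software side: read the current pending mask. */
static uint32_t w1c_read(const struct w1c_reg *r)
{
    return r->pending;
}

/* Software side: a 1 clears that bit, a 0 leaves the bit untouched. */
static void w1c_write(struct w1c_reg *r, uint32_t value)
{
    r->pending &= ~value;
}
```

If a new source raises its bit between the ISR's read and its writeback, the writeback only contains the bits the ISR actually saw, so the new bit stays set and is picked up on the next pass of the loop. A register that cleared everything on read or on any write would not have that property.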
UPDATE:
Your followup questions:
(1) Why do you use interrupt disable on video controller register instead of mask 8259's interrupt enable bit?
(2) When you execute vidisr_nested(void) function, it will enable nesting the same interrupt. Is it true? And is that what you want?
To answer (1), we should do both but not necessarily in the same place. They seem similar, but work in slightly different ways.
We change the video controller registers in the video controller driver [as it's the only place that "understands" the video controller's registers].
The video controller actually asserts the 8259's IRR pin from: IRR = ((vidintr_enable & vidintr_pend) != 0). If we never set vidintr_enable (i.e. it's all zeroes), then we can operate the device in a "polled" [non-interrupt] mode.
The 8259 interrupt enable register works similarly, but it masks against which IRRs [asserted or not] may interrupt the CPU. The device vidintr_enable controls whether it will assert IRR or not.
In the example video driver, the init routine enables the vertical interrupts, but not the horizontal. Only the vertical interrupts will generate a call to the ISR, but the ISR can/will also process the horizontal ones [as polled bits].
Changing the 8259 interrupt enable mask should be done in a place that understands the interrupt topology of the entire system. This is usually done by the containing OS. That's because the OS knows about the other devices and can make the best choice.
Herein, "containing OS" could be a full OS like Linux [with which I'm most familiar]. Or, it could just be an R/T executive [or boot rom--I've written a few] that has some common device handling framework with "helper" functions for the device drivers.
For example, it's usual that all devices get their own IRR pin. But, it is possible, with level triggering, for two different devices to share an IRR. (e.g.) IRR[0] = devA_IRROUT | devB_IRROUT. Either through an OR gate [or wired OR(?)].
It's also possible that the device is attached to a "nested" or "cascaded" interrupt controller. IIRC [consult document], it is possible to have a "master" 8259 and [up to] 8 "slave" 8259s. Each slave 8259 connects to an IRR pin of the master. Then, connect devices to the slave IRR pins. For a fully loaded system, you can have 64 interrupting devices [8 slaves with 8 pins each]. And, the master can have slave 8259s on some IRR pins and real devices on others [a "hybrid" topology].
Usually, only the OS knows enough to deal with this. In a real system, a device driver probably wouldn't touch the 8259 at all. The non-specific EOI would probably have been sent to the 8259 before entering the device's ISR. And, the OS would handle the full "save state" and "restore state" and the driver just handles device specific actions.
Also, under an OS, the OS will call the "init" and "stop" routines. The general OS routines for this will handle the 8259 and call the device specific ones.
For example, under Linux [or almost any other OS or R/T executive], the interrupt sequence goes something like this:
- CPU hardware actions [atomic]:
  - push flags register [has CPU interrupt enable flag] and return address (%cs:%eip) to stack
  - clear CPU interrupt enable flag (e.g. implied cli)
  - jump via interrupt vector table
- OS general ISR (preset within IVT):
  - push all remaining registers to stack
  - send non-specific EOI to 8259(s)
  - call device-specific ISR (NOTE: CPU interrupt flag still clear)
  - pop regs
  - iret
To answer (2), yes, you are correct. It would probably interrupt immediately, and might nest (infinitely :-).
The simple ISR version is more efficient and preferable if the actions taken in the ISR are short, quick, and simple (e.g. just output to a few data ports).
If the required actions take a relatively long time (e.g. do intensive calculations, or write to a large number of ports or memory locations), the nested version is preferred to prevent other devices from having entry to their ISRs delayed excessively.
However, some time critical devices [like a video controller] need to use the simple model, preventing interruption by other devices, to guarantee that they can complete in a finite, deterministic time.
For example, the video ISR handling of VEND might program the device for the next/upcoming field/frame and must complete this within the vertical blanking interval. It has to do this, even if it means "excessive" delay of other ISRs.
Note that the ISR was "racing" to complete before the end of the blanking interval. Not the best design. I've had to program such a controller/device. For rev 2, we changed the design so the device registers were double-buffered.
That meant that we could set up the registers for frame 1 anytime during the [much longer] frame 0 display period. At VSTART for frame 1, the video hardware would instantly clock-in/save the double-buffered values, and the CPU could then setup for frame 2 anytime during the display of frame 1. And so on ...
With the modified design, the video driver removed the device setup from the ISR entirely. It was now handled from OS task level.
In the driver example, I've adjusted the sequencing a bit to prevent infinite stacking, and added some additional information based upon my question (1) answer. That is, it shows [crudely] what to do with or without an OS.
// video controller driver
//
// for illustration purposes, STANDALONE means a very simple software system
//
// if it's _not_ defined, we assume the ISR is called from an OS general ISR
// that handles 8259 interactions
//
// if it's _defined_, we're showing [crudely] what needs to be done
//
// NOTE: although this is largely C code, it's also pseudo-code in places
// video_init -- initialize controller
void
video_init(void)
{
write_port(...);
write_port(...);
write_port(...);
...
#ifdef STANDALONE
write_port(8259_interrupt_enable |= VIDEO_IRR_PIN);
#endif
// we only care about the vertical interrupts, not the horizontal ones
write_port(vidintr_enable,VSTART | VEND);
}
// video_stop -- stop controller
void
video_stop(void)
{
// stop all interrupt sources
write_port(vidintr_enable,0);
#ifdef STANDALONE
write_port(8259_interrupt_enable &= ~VIDEO_IRR_PIN);
#endif
write_port(...);
write_port(...);
write_port(...);
...
}
// vidisr_pendmsk -- get video controller pending mask (and clear it)
u32
vidisr_pendmsk(void)
{
u32 pendmsk;
pendmsk = port_read(vidintr_pend);
// the normal way to clear on most H/W is a writeback
// writing a 1 to a given bit clears the interrupt source
// writing a 0 does nothing
// NOTE: with this method, we can _never_ have a race condition where
// we lose an interrupt
port_write(vidintr_pend,pendmsk);
return pendmsk;
}
// vidisr_process -- process video interrupts
void
vidisr_process(u32 pendmsk)
{
// NOTE: we loop because controller may assert a new, different interrupt
// while we're processing a given one -- we don't want to exit if we _know_
// we'll be [almost] immediately re-entered
while (1) {
if (pendmsk == 0)
break;
if (pendmsk & HSTART)
...
if (pendmsk & HEND)
...
if (pendmsk & VSTART)
...
if (pendmsk & VEND)
...
pendmsk = port_read(vidintr_pend);
}
}
// vidisr_simple -- simple video ISR routine
void
vidisr_simple(void)
{
u32 pendmsk;
// NOTE: interrupt state has been pre-saved for us ...
pendmsk = vidisr_pendmsk();
// process our interrupt sources
vidisr_process(pendmsk);
// allow other devices to cause interrupts
#ifdef STANDALONE
port_write(8259,SEND_NON_SPECIFIC_EOI);
#endif
// return from interrupt by popping interrupt state
#ifdef STANDALONE
pop_regs();
iret();
#endif
}
// vidisr_nested -- video ISR routine that allows nested interrupts
void
vidisr_nested(void)
{
u32 pendmsk;
// NOTE: interrupt state has been pre-saved for us ...
// get device pending mask -- do this _before_ [optional] EOI and the sti
// to prevent immediate stacked interrupts
pendmsk = vidisr_pendmsk();
// allow other devices to cause interrupts
#ifdef STANDALONE
port_write(8259,SEND_NON_SPECIFIC_EOI);
#endif
// allow us to receive them
// NOTE: with or without OS, we can't stack until _after_ this
sti();
// process our interrupt sources
// this can be interrupted by another source or another device
vidisr_process(pendmsk);
// return from interrupt by popping interrupt state
#ifdef STANDALONE
pop_regs();
iret();
#endif
}
BTW, I'm the author of the Linux irqtune program.
I wrote it back in the mid 90's. It's of lesser use now, and probably doesn't work on modern systems, but the FAQ I wrote has a great deal of information about interrupt device priorities. The program itself did a simple 8259 manipulation.
An online copy is available here: http://archive.debian.org/debian/dists/Debian-1.1/main/disks-i386/SpecialKernels/irqtune/README.html There's probably source code somewhere in this archive.
That's the version 0.2 doc. I haven't found an online copy of version 0.6 which has better explanation, so I've put up a text version here: http://pastebin.com/Ut6nCgL6
Side note: The "where to get" information in the FAQ [and email address] are no longer valid. And, I didn't understand the full impact of "spam" until I posted the FAQ and started getting [tons of] it ;-)
And, irqtune even drew Linus' ire. Not because it didn't work but because it did: https://lkml.org/lkml/1996/8/23/19 IMO, if he had read the FAQ, he would have understood why [as what irqtune did is standard stuff to R/T guys].
UPDATE #2
Your new questions:
(3) I think that you are missing a destination address in write_port(8259_interrupt_enable &= ~VIDEO_IRR_PIN). Isn't it so?
(4) Is the IRR register read-only or r/w? If the latter, what is the purpose of writing into it?
(5) Are interrupt vectors stored as logical addresses or physical addresses?
To answer question (3): No, not really [even if it seemed so]. The code snippet was "pseudo code" [not pure C code], as I mentioned in a code comment at the top, so technically speaking, I'm covered. However, to make it more clear, here is what the [closer to] real C code would look like:
// the system must know _which_ IRR H/W pin the video controller is connected to
// so we _hardwire_ it here
#define VIDEO_IRR_PIN_NUMBER 3 // just an example
#define VIDEO_IMR_MASK (1 << VIDEO_IRR_PIN_NUMBER)
// video_enable -- enable/disable video controller in 8259
void
video_enable(int enable)
{
u32 val;
// NOTE: we're reading/writing the _enable_ register [the IMR], not the IRR
// [which software can read via OCW3 but can _not_ modify]
val = read_port(8259_interrupt_enable);
if (enable)
val |= VIDEO_IMR_MASK;
else
val &= ~VIDEO_IMR_MASK;
write_port(8259_interrupt_enable,val);
}
Now, in video_init, replace the code inside STANDALONE with video_enable(1), and, in video_stop with video_enable(0)
As to question (4): We weren't really writing to the IRR, even though the symbol had _IRR_ in it. As mentioned in the code comments above, we were writing to the 8259 interrupt enable register which is really the "interrupt mask register" or IMR in the documentation. The IMR can be read from and written to by using OCW1 (see doc).
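Software cannot write the IRR. There is no port in the 8259 for that, because the IRR just latches the state of the request pins. It can be read, though: after an OCW3 "read register" command, the next read of the command port returns the IRR [or the ISR, depending on the command] (see doc).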
There is a one-to-one correspondence between IRR pin numbers [0-7] and IMR bit numbers (e.g. to enable for IRR(0), set IMR bit 0), but the software has to know which bit to set.
Because the video controller is physically connected to a given IRR pin, it is always the same for a given PC board. The software [on older non-PnP systems] can't probe for this. Even on newer systems, the 8259 knows nothing of PnP, so it's still hardwired. The video controller driver programmer must just "know" what IRR pin is being used [by consulting the "spec sheet" or controller "architecture reference manual"].
To answer question (5): First consider what the 8259 does.
When the 8259 is initialized, the ICW2 ("initialization command word 2") gets set by the OS driver. This defines a portion of the interrupt vector number the 8259 will present during the INTR/INTA cycle. In ICW2, the most significant 5 bits are marked T7-T3.
When an interrupt occurs, these bits are combined with the IRR pin number of the interrupting device [which is 3 bits wide] to form an 8 bit interrupt vector number: T7,T6,T5,T4,T3|I2,I1,I0
For example, if we put 0xD0 into ICW2, with our video controller using IRR pin 3, we'd have 1,1,0,1,0|0,1,1 or 0xD3 as the interrupt vector number that the 8259 will send to the CPU.
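The bit arithmetic in that example can be written out as a tiny helper. This is just a sketch of the combination described above (8086 mode); the function name is made up:

```c
#include <stdint.h>

/* Vector number the 8259 presents in 8086 mode:
 * top 5 bits from ICW2 (T7-T3), low 3 bits from the IRR pin number. */
static uint8_t pic_vector(uint8_t icw2, uint8_t pin)
{
    return (uint8_t)((icw2 & 0xF8) | (pin & 0x07));
}
```

So pic_vector(0xD0, 3) gives 0xD3, matching the worked example. The low three bits of ICW2 are masked off because the 8259 replaces them with the pin number.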
This is just a vector number [0x00-0xFF] as the 8259 knows nothing of memory addresses. It is the CPU that takes this vector number and, using the CPU's "interrupt vector table" [IVT], uses the vector number as an index into the IVT to properly vector the interrupt to an ISR routine.
On 80386 and later architectures, the IVT is actually called an IDT ("interrupt descriptor table"). For details, see the "System Programming Guide", chapter 6: http://download.intel.com/design/processor/manuals/253668.pdf
As to whether the resulting ISR address from the IVT/IDT is physical or logical, that depends on the processor mode (e.g. real mode, protected mode, protected with virtual addressing enabled).
In a sense, all such addresses are always logical. And, all logical addresses undergo a translation to physical on each CPU instruction. Whether the translation is one-to-one [MMU not enabled or page tables have one-to-one mapping] is a question for "How has the OS set things up?"
Strictly speaking, there is no such thing as "acknowledge an interrupt to device". The thing that an ISR should do is handle the interrupt condition. For example, if the UART requested an interrupt because it has incoming data, then you should read that incoming data. After that read operation, the UART no longer has incoming data, so naturally it stops asserting the IRQ line. Alternatively, if your program no longer needs to read the data and wants to stop the communication, it would just mask the receiver interrupt via the UART registers, and, once again, the UART will stop asserting the IRQ line. If the device just wanted to signal some state change, then you should read the new state, and the device will know that you have an up-to-date state and will release the IRQ line.
So, in short: there is usually no device-specific acknowledge procedure. All you need to do is service the interrupt condition, after which that condition will disappear, voiding the interrupt request.
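As a rough illustration of the UART case, here is a toy software model (not a real UART register map; all names are invented) in which the IRQ line is simply a function of the device state, so servicing or masking the condition is what makes the request go away:

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy receiver model: the IRQ line follows the device state. */
struct sim_uart {
    uint8_t rx_data;       /* last received byte */
    bool    rx_ready;      /* set when a byte has arrived */
    bool    rx_irq_enable; /* receiver interrupt enable bit */
};

/* Level of the IRQ line as the device would drive it. */
static bool sim_uart_irq(const struct sim_uart *u)
{
    return u->rx_ready && u->rx_irq_enable;
}

/* Reading the data register consumes the byte and removes the condition. */
static uint8_t sim_uart_read_rx(struct sim_uart *u)
{
    u->rx_ready = false;
    return u->rx_data;
}
```

Real devices differ in the details (some need an explicit write to a status register instead), so the datasheet for the specific part has the final word.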
I have written a simple character driver, requested an IRQ on a GPIO pin, and written a handler for it.
err = request_irq( irq, irq_handler,IRQF_SHARED | IRQF_TRIGGER_RISING, INTERRUPT_DEVICE_NAME, raspi_gpio_devp);
static irqreturn_t irq_handler(int irq, void *arg);
Now, from theory I know that upon an interrupt, the interrupt controller will tell the processor to call do_IRQ(), which will check the IDT and call my interrupt handler for this line.
How does the kernel know that the interrupt handler was for this particular device file?
Also, I know that interrupt handlers do not run in any process context. But let's say I am accessing a variable declared outside the scope of the handler, a static global flag = 0. In the handler I set flag = 1 to indicate that an interrupt has occurred. That variable is in process context. So I am confused about how this handler, which is not in any process context, can modify a variable in process context.
Thanks
The kernel does not know that this particular interrupt is for a particular device.
The only thing it knows is that it must call irq_handler with raspi_gpio_devp as a parameter. (like this: irq_handler(irq, raspi_gpio_devp)).
If your irq line is shared, you should check if your device generated an IRQ or not. Code:
static irqreturn_t irq_handler(int irq, void *dev_id) {
struct raspi_gpio_dev *raspi_gpio_devp = (struct raspi_gpio_dev *) dev_id;
if (!my_gpio_irq_occured(raspi_gpio_devp))
return IRQ_NONE;
/* do stuff here */
return IRQ_HANDLED;
}
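To show why the handler has to make that check, here is a simplified user-space model of what happens on a shared line: every handler registered for the line is called with its own dev_id, and each one reports whether its device was the source. This is a sketch only; the sim_ names are hypothetical and this is not the kernel's actual implementation:

```c
#include <stddef.h>

enum sim_irqreturn { SIM_IRQ_NONE = 0, SIM_IRQ_HANDLED = 1 };

typedef enum sim_irqreturn (*sim_handler_t)(int irq, void *dev_id);

/* One registration on a line: a handler plus the cookie passed back to it. */
struct sim_irqaction {
    sim_handler_t handler;
    void *dev_id;
};

/* Call every handler on the line; return how many claimed the interrupt. */
static int sim_dispatch(int irq, const struct sim_irqaction *actions, size_t n)
{
    int handled = 0;

    for (size_t i = 0; i < n; i++)
        if (actions[i].handler(irq, actions[i].dev_id) == SIM_IRQ_HANDLED)
            handled++;
    return handled;
}

/* Hypothetical device state; "pending" stands in for a status register. */
struct sim_dev {
    int pending;
    int serviced;
};

static enum sim_irqreturn sim_dev_handler(int irq, void *dev_id)
{
    struct sim_dev *d = dev_id;

    (void)irq;
    if (!d->pending)
        return SIM_IRQ_NONE;   /* not ours, let the others look */
    d->pending = 0;            /* service the condition */
    d->serviced++;
    return SIM_IRQ_HANDLED;
}
```

The dispatcher never needs to know which device fired. The dev_id cookie is the only link between the line and the driver's own data.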
The interrupt handler runs in interrupt context. But you can access static variables declared outside the scope of the interrupt.
Usually, what an interrupt handler does is:
check interrupt status
retrieve information from the hardware and store it somewhere (a buffer/fifo for example)
wake_up() a kernel process waiting for that information
If you want to be really confident with the dos and don'ts of interrupt handling, the best thing to read about is what a process is for the kernel.
An excellent book dealing with this is Linux Kernel Development by Robert Love.
The kernel doesn't know which device the interrupt pertains to. It is possible for a single interrupt to be shared among multiple devices. Previously this was quite common. It is becoming less so due to improved interrupt support in interrupt controllers and introduction of message-signaled interrupts. Your driver must determine whether the interrupt was from your device (i.e. whether your device needs "service").
You can provide context to your interrupt handler via the "void *arg" provided. This should never be process-specific context, because a process might exit leaving pointers dangling (i.e. referencing memory which has been freed and/or possibly reallocated for other purposes).
A global variable is not "in process context". It is in every context -- or no context if you prefer. When you hear "not in process context", that means a few things: (1) you cannot block/sleep (because what process would you be putting to sleep?), (2) you cannot make any references to user-space virtual addresses (because what would those references be pointing to?), (3) you cannot make references to "current task" (since there isn't one or it's unknown).
Typically, a driver's interrupt handler pushes or pulls data into "driver global" data areas from which/to which the process context end of the driver can transfer data.
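A common shape for such a "driver global" area is a small ring buffer: the handler puts bytes in, the process-context side takes them out. Below is a minimal single-producer, single-consumer sketch in plain C. The names are hypothetical, and it leaves out the locking and memory barriers a real driver needs (the kernel's kfifo is the usual ready-made choice):

```c
#include <stdbool.h>
#include <stdint.h>

#define RB_SIZE 8u /* power of two, so the index mask below works */

struct ring {
    uint8_t  buf[RB_SIZE];
    unsigned head; /* advanced by the producer (interrupt handler side) */
    unsigned tail; /* advanced by the consumer (process context side) */
};

/* Producer: returns false and drops the byte when the buffer is full. */
static bool ring_put(struct ring *r, uint8_t v)
{
    if (r->head - r->tail == RB_SIZE)
        return false;
    r->buf[r->head++ & (RB_SIZE - 1)] = v;
    return true;
}

/* Consumer: returns false when there is nothing to read. */
static bool ring_get(struct ring *r, uint8_t *out)
{
    if (r->head == r->tail)
        return false;
    *out = r->buf[r->tail++ & (RB_SIZE - 1)];
    return true;
}
```

Nothing in the buffer belongs to any particular process, which is the point made above: the handler only touches driver-owned memory, and whichever process calls read() later drains it.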
This is in reply to your question:
how does the kernel know that the interrupt handler was for this particular device file?
Each System-on-Chip's documentation lists the interrupt numbers for the different devices connected to the different interrupt lines.
The same interrupt number has to be given in the Device Tree entry that instantiates the device driver.
The device driver's probe function typically parses the Device Tree data structure, reads the IRQ number, and registers the handler using the request_irq function.
If multiple devices share a single IRQ number/line, then the IRQ status registers (of the different devices, if mapped under the same VM space) can be used inside the IRQ handler to tell them apart.
I'm working on an x86 system with Linux 3.6.0. For some experiments, I need to know how IRQs are mapped to vectors. I learned from many books that vectors 0x00 to 0x1F are for traps and exceptions, and that vectors from 0x20 onward are for external device interrupts. This is also defined in the source code: Linux/arch/x86/include/asm/irq_vectors.h
However, what I'm puzzled is that when I check the do_IRQ function,
http://lxr.linux.no/linux+v3.6/arch/x86/kernel/irq.c#L181
I found the IRQ is fetched by looking up the "vector_irq" array:
unsigned int __irq_entry do_IRQ(struct pt_regs *regs)
{
struct pt_regs *old_regs = set_irq_regs(regs);
/* high bit used in ret_from_ code */
unsigned vector = ~regs->orig_ax;
unsigned irq;
...
irq = __this_cpu_read(vector_irq[vector]); // get the IRQ from the vector_irq
// print out the vector_irq
printk("CPU-ID:%d, vector: 0x%x - irq: %u\n", smp_processor_id(), vector, irq);
}
By instrumenting the code with printk, I got the vector-irq mapping below, and I don't have any clue why this is the mapping. I thought the mapping should be (irq + 0x20 = vector), but that does not seem to be the case.
from: Linux/arch/x86/include/asm/irq_vectors.h
* Vectors 0 ... 31 : system traps and exceptions - hardcoded events
* Vectors 32 ... 127 : device interrupts = 0x20 – 0x7F
But my output is:
CPU-ID=0.Vector=0x56 (irq=58)
CPU-ID=0.Vector=0x66 (irq=59)
CPU-ID=0.Vector=0x76 (irq=60)
CPU-ID=0.Vector=0x86 (irq=61)
CPU-ID=0.Vector=0x96 (irq=62)
CPU-ID=0.Vector=0xa6 (irq=63)
CPU-ID=0.Vector=0xb6 (irq=64)
BTW, these irqs belong to my 10 Gb Ethernet cards with MSI-X enabled. Could anyone give me some idea why this is the mapping, and what the rules for making it are?
Thanks.
William
The irq number (which is what you use in software) is not the same as the vector number (which is what the interrupt controller actually uses).
The x86 I/O APIC interrupt controller assigns interrupt priorities in groups of 16, so the vector numbers are spaced out to prevent them from interfering with each other (see the function __assign_irq_vector in arch/x86/kernel/apic/io_apic.c).
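The "groups of 16" correspond to the APIC priority class, which is the upper four bits of the vector. A one-line sketch (the helper name is mine) makes the pattern in the question's output visible:

```c
#include <stdint.h>

/* APIC priority class: the upper 4 bits of the 8-bit vector. */
static unsigned apic_priority_class(uint8_t vector)
{
    return vector >> 4;
}
```

Applied to the vectors in the question, 0x56, 0x66, 0x76 and so on land in classes 5, 6, 7, ..., so consecutive irqs have each been given a different priority class rather than consecutive vector numbers.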
I guess my question is how the vectors are assigned for a particular IRQ number and what the rules behind that are.
The IOAPIC supports a register called IOREDTBL for each IRQ input. Software assigns the desired vector number for the IRQ input using bits 7:0 of this register. It is this vector number that serves as an index into the processor's Interrupt Descriptor Table. Quoting the IOAPIC manual (82093AA):
7:0 Interrupt Vector (INTVEC)—R/W: The vector field is an 8 bit field
containing the interrupt vector for this interrupt. Vector values
range from 10h to FEh.
Note that these registers are not directly accessible to software. To access IOAPIC registers (not to be confused with Local APIC registers) software must use the IOREGSEL and IOWIN registers to indirectly interact with the IOAPIC. These registers are also described in the IOAPIC manual.
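The indirect access works roughly like this: write the register index to IOREGSEL, then read or write the value through IOWIN. The sketch below assumes the 82093AA layout (IOREGSEL at byte offset 0x00, IOWIN at 0x10, redirection table entries starting at index 0x10 with two 32-bit registers per entry). The function names are mine, and base would be the mapped IOAPIC MMIO region in real code:

```c
#include <stdint.h>

#define IOAPIC_IOREGSEL 0x00u /* byte offset of the register-select window */
#define IOAPIC_IOWIN    0x10u /* byte offset of the data window */

/* Index of the low dword (bits 31:0) of redirection entry n. */
static uint32_t ioredtbl_lo_index(unsigned n)
{
    return 0x10u + 2u * n;
}

/* Select a register through IOREGSEL, then read it through IOWIN. */
static uint32_t ioapic_read(volatile uint32_t *base, uint32_t reg)
{
    base[IOAPIC_IOREGSEL / 4] = reg;
    return base[IOAPIC_IOWIN / 4];
}

/* Vector field of a redirection entry: bits 7:0 of the low dword. */
static uint8_t ioredtbl_vector(uint32_t lo)
{
    return (uint8_t)(lo & 0xFF);
}
```

Note that this only covers pin-based interrupts routed through the IOAPIC. MSI/MSI-X interrupts like the ones in the question bypass it and carry their vector in the message data instead.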
The source information for the IOAPIC can be a little tricky to dig up. Here's a link to the example I used:
IOAPIC data sheet link