What steps are needed to detect a GPU interrupt on a Raspberry Pi3? - raspberry-pi3

I am writing a bare-metal kernel, but an interrupt doesn't seem to be triggered when the EMMC INTERRUPT register becomes non-zero.
I have two cores idling, and one at EL3 with no data caches enabled continually displaying a page of memory, in order to see what the code I'm testing is up to. (The test code regularly flushes its cache, on the working QA7 millisecond interrupt.)
The code being tested is running at Secure EL0 on core 0, with interrupts enabled. Interrupts are routed to core 0:
QA7.GPU_interrupts_routing = 0; // IRQ and FIQ to Core 0
The EMMC interface is initialised, and a reset command sent to the card, at which point the INTERRUPT register becomes 1 (bit 0 set: command has finished), but no GPU interrupt seems to be signalled to the QA7 hardware (bit 8 in the Core 0 interrupt source register stays zero).
The EMMC registers IRPT_MASK and IRPT_EN are both set to 0x017f7137, which I think should enable all known interrupts from that peripheral, and certainly bit 0.
The BCM2835 interrupt registers have been written as follows:
Enable_IRQs_1 = 0x20000000; // (1 << 29);
Enable_IRQs_2 = 0x02ff6800; // 0b00000010111111110110100000000000;
Enable_Basic_IRQs = 0x303;
But they read back as:
Enable_IRQs_1: 0x20000200 // (1 << 9) also set
Enable_IRQs_2: 0x02ff6800 // unchanged
Enable_Basic_IRQs: 0x3 // No interrupts from IRQs 1 or 2.
What have I missed?
(Tagged raspberry-pi2, since it also has the QA7 component.)

The simple answer is nothing more. The Arasan SD interface interrupt is number 62, bit 20 in the IRQ basic pending register. Set bit 30 in Enable IRQs 2 (interrupt 62 = 32 + 30), and the interrupt comes through.
Enable_IRQs_2 = 0x42ff6800;
I just had to ignore the advice: "The table above has many empty entries. These should not be enabled as they will interfere with the GPU operation." in the documentation.
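As a rough sketch of the arithmetic behind that value, the helper below maps a GPU interrupt number to the enable register and bit it lives in. The type and function names are hypothetical; the only assumption about the hardware is the layout already used above, where Enable_IRQs_1 covers GPU interrupts 0-31 and Enable_IRQs_2 covers 32-63.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, for illustration only. Assumes Enable_IRQs_1 holds
 * GPU IRQs 0-31 and Enable_IRQs_2 holds GPU IRQs 32-63. */
typedef struct {
    unsigned bank;   /* 1 -> Enable_IRQs_1, 2 -> Enable_IRQs_2 */
    uint32_t mask;   /* bit to set in that register */
} gpu_irq_enable;

static gpu_irq_enable gpu_irq_to_enable(unsigned irq)
{
    assert(irq < 64);
    gpu_irq_enable e;
    e.bank = (irq < 32) ? 1u : 2u;
    e.mask = UINT32_C(1) << (irq % 32);
    return e;
}
```

For interrupt 62 this gives bank 2 and mask 0x40000000, which OR-ed into the original 0x02ff6800 produces the 0x42ff6800 written above.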

Related

Can I disable the watchdog timer in windows 7?

I'm trying to disable all interrupts, including NMIs, on a single core of a processor and put that core into an infinite loop with a JMP instruction targeting itself (opcode bytes 0xEBFE). I tried this with the following machine code:
cli
in al, 0x70
mov bl, 0x80
or al, bl
out 0x70, al
jmp self (0xEBFE)
I assumed that disabling NMIs would also disable the watchdog, since according to this link the watchdog timer is an NMI, but when I ran this code my computer bugchecked after around 5 seconds with code 0x101 CLOCK_WATCHDOG_TIMEOUT. I'm wondering if Windows notices that I've disabled NMIs and re-enables them before initiating the kernel panic. Does anyone know how to disable the watchdog timer in Windows 7?
I don't think the NMIs are at fault here.
External NMIs are obsolete; they are hard to route in an SMP system. That watchdog timer is also obsolete. It was either a secondary PIT or a limited fourth channel of the primary PIT:
----------P00440047--------------------------
PORT 0044-0047 - Microchannel - PROGRAMMABLE INTERVAL TIMER 2
SeeAlso: PORT 0040h,PORT 0048h
0044 RW PIT counter 3 (PS/2)
used as fail-safe timer. generates an NMI on time out.
for user generated NMI see at 0462.
0047 -W PIT control word register counter 3 (PS/2, EISA)
bit 7-6 = 00 counter 3 select
= 01 reserved
= 10 reserved
= 11 reserved
bit 5-4 = 00 counter latch command counter 3
= 01 read/write counter bits 0-7 only
= 1x reserved
bit 3-0 = 00
----------P0048004B--------------------------
PORT 0048-004B - EISA - PROGRAMMABLE INTERVAL TIMER 2
Note: this second timer is also supported by many Intel chipsets
SeeAlso: PORT 0040h,PORT 0044h
0048 RW EISA PIT2 counter 3 (Watchdog Timer)
0049 ?? EISA 8254 timer 2, not used (counter 4)
004A RW EISA PIT2 counter 5 (CPU speed control)
004B -W EISA PIT2 control word
This hardware is gone; it's not present on modern systems. I've tested my machine and I don't have it.
Intel chipsets don't have it either: there is only the primary PIT.
Modern timers are the LAPIC timer and the HPET (Linux even resorted to using the PMC registers).
Windows does support a hardware WDT; in fact, Microsoft went as far as defining an ACPI extension: the WDAT table.
This WDT, however, can only reboot or shut down the system, in hardware, without any intervention from software.
// Configures the watchdog hardware to perform a reboot
// when it is fired.
//
#define WATCHDOG_ACTION_SET_REBOOT 0x11
//
// Determines if the watchdog hardware is configured to perform
// a system shutdown when fired.
//
#define WATCHDOG_ACTION_QUERY_SHUTDOWN 0x12
//
// Configures the watchdog hardware to perform a system shutdown
// when fired.
//
#define WATCHDOG_ACTION_SET_SHUTDOWN 0x13
Microsoft sets quite a few requirements for this WDT, since it must be set up as early as possible in the boot process, before PnP enumeration (i.e. PCI(e) enumeration).
This is not the timer that bugchecked your system.
By the way, I don't have this timer (my system is missing the WDAT table) and I don't expect it to be found on client hardware.
The bugcheck 0x101 is due to a software WDT, it is raised inside a function in ntoskrnl.exe.
This function is called by KeUpdateRunTime and by another chain of calls starting in DriverEntry.
According to Windows Internals, KeUpdateRunTime is used to update Windows' internal tick count.
I'd expect only a single logical processor to be put in charge of that, though I'm not sure exactly how Windows keeps time.
I'd also expect this software WDT to be implemented in a master-slave fashion: each CPU increments its own counter and a designated CPU checks the counters periodically (or any equivalent implementation).
This seems to be suggested by the wording of the documentation of the 0x101 bugcheck:
The CLOCK_WATCHDOG_TIMEOUT bug check has a value of 0x00000101. This indicates that an expected clock interrupt on a secondary processor, in a multi-processor system, was not received within the allocated interval.
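To make the guess above concrete, here is a minimal single-threaded model of such a master-slave software watchdog. It is only a sketch of the general idea and makes no claim about how ntoskrnl.exe actually does it; NCPUS, clock_tick and watchdog_check are made-up names.

```c
#include <assert.h>
#include <stdint.h>

#define NCPUS 4                      /* assumed CPU count for the sketch */

static uint32_t tick_count[NCPUS];   /* bumped by each CPU's clock interrupt */
static uint32_t last_seen[NCPUS];    /* snapshot kept by the checking CPU */

/* Called from each CPU's (simulated) clock interrupt. */
static void clock_tick(int cpu)
{
    assert(cpu >= 0 && cpu < NCPUS);
    tick_count[cpu]++;
}

/* Called periodically on the designated CPU. Returns the index of the first
 * CPU whose counter has not moved since the previous check, or -1 if every
 * CPU has ticked. */
static int watchdog_check(void)
{
    int stalled = -1;
    for (int cpu = 0; cpu < NCPUS; cpu++) {
        if (tick_count[cpu] == last_seen[cpu] && stalled < 0)
            stalled = cpu;
        last_seen[cpu] = tick_count[cpu];
    }
    return stalled;
}
```

A core spinning with interrupts disabled never calls clock_tick, so in this model the next watchdog_check reports it, which would match the bugcheck appearing a few seconds after the core is taken away.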
Again, I'm not an expert on this part of Windows (the user MdRm probably is) and this may be utterly wrong, but if it isn't, you are probably better off following Alex's advice and booting with one less logical CPU.
You can then execute code on that CPU with an INIT-SIPI-SIPI sequence as described in Intel's manual, but you must be careful because the issuing processor is using paging while the sleeping one is not yet (the processor will start up in real mode).
Initialising a CPU may be a little cumbersome but not too much after all.
Stealing it may result in other problems besides the WDT, for example if Windows has routed an interrupt to that processor only.
I don't know if there is a driver API to unregister a logical processor; I found nothing looking at the exports of hal.dll and ntoskrnl.exe.

Intel 8259 PIC - Acknowledge interrupt

Assume we have a system with a CPU that is fully compatible with the Intel 8259 Programmable Interrupt Controller. This CPU therefore uses vectored interrupts.
When one of the eight interrupts occurs, the PIC asserts the INTR wire that is connected to the CPU. The PIC then waits until the CPU asserts INTA. When it does, the PIC selects the pending interrupt with the highest priority (which depends on the pin number) and sends its interrupt vector on the data bus. I have omitted some timing details, but I don't think they matter here.
Here are my questions:
How does the device that caused the interrupt know that its interrupt request was accepted and that it can withdraw the request? I read about the 8259, but I didn't find this.
Is the device whose interrupt was accepted acknowledged in the ISR?
Sorry for my English.
The best reference is the original intel doc and is available here: https://pdos.csail.mit.edu/6.828/2012/readings/hardware/8259A.pdf It has full details of these modes, how the device operates, and how to program the device.
Caveat: I'm a bit rusty as I haven't programmed the 8259 in many years, but I'll take a shot at explaining things, per your request.
After an interrupting device, connected to an IRR ["interrupt request register"] pin, has asserted an interrupt request, the 8259 will convey this to the CPU by asserting INTR and then placing the vector on the bus during the INTA cycles generated by the CPU (two in 8086 mode, three in 8080/8085 mode).
After a given device has asserted IRR, the 8259's IS ["in-service"] register is or'ed with a mask of the IRR pin number. The IS is a priority select. While the IS bit is set, other interrupting devices of lower priority [or the original one] will not cause an INTR/INTA cycle to the CPU. The IS bit must be cleared first. These interrupts remain "pending".
The IS can be cleared by an EOI (end-of-interrupt) operation. There are multiple EOI modes that can be programmed. The EOI can be generated by the 8259 in AEOI mode. In other modes, the EOI is generated manually by the ISR by sending a command to the 8259.
The EOI action is all about allowing other devices to cause interrupts while the ISR is processing the current one. The EOI does not clear the interrupting device.
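The interplay between the request bits, the IS bits and the EOI can be modelled in a few lines. This is a deliberately simplified sketch of fully nested mode (pin 0 has the highest priority, and the request bit is dropped at acknowledge as in edge-triggered operation); pic_model and the function names are invented for the illustration and are not 8259 register names.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint8_t irr;  /* requests latched from the input pins */
    uint8_t isr;  /* in-service bits */
    uint8_t imr;  /* 1 = input masked */
} pic_model;

/* Index of the lowest set bit (pin 0 is the highest priority), or 8 if none. */
static int highest_priority(uint8_t bits)
{
    for (int i = 0; i < 8; i++)
        if (bits & (1u << i))
            return i;
    return 8;
}

/* Models the INTA cycle: returns the pin delivered to the CPU, or -1 if
 * nothing may interrupt at the moment. */
static int pic_acknowledge(pic_model *p)
{
    assert(p);
    int req = highest_priority(p->irr & (uint8_t)~p->imr);
    if (req >= highest_priority(p->isr))
        return -1;   /* no request, or blocked by an equal/higher in-service pin */
    p->irr &= (uint8_t)~(1u << req);
    p->isr |= (uint8_t)(1u << req);
    return req;
}

/* Non-specific EOI: clears the highest-priority in-service bit. */
static void pic_eoi(pic_model *p)
{
    int s = highest_priority(p->isr);
    if (s < 8)
        p->isr &= (uint8_t)~(1u << s);
}
```

With pins 3 and 5 both requesting, the first acknowledge delivers pin 3, a second acknowledge is refused while pin 3 is in service, and pin 5 is delivered only after pic_eoi. Note that nothing here touches the device itself, which is the point made above.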
Clearing the interrupting device must be done by the ISR using whatever device specific register the device has for that purpose. Usually, this is a "pending interrupt" register [can be 1 bit wide]. Most H/W uses two interrupt related registers and the other one is an "interrupt enable" register.
With level triggered interrupts, if the ISR does not clear the device, when the ISR does issue the EOI command to the 8259, the 8259 will [try to] reinterrupt the CPU using the vector for the same device for the same condition. The CPU will probably be reinterrupted as soon as it issues an sti or iret instruction. Thus, an ISR routine must take care to process things in proper sequence.
Consider an example. We have a video controller that has four sources for interrupts:
HSTART -- start of horizontal line
HEND -- end of horizontal line [start of horizontal blanking interval]
VSTART -- start of new video field/frame
VEND -- end of video field/frame [start of vertical blanking interval]
The controller presents these as a bit mask in its own special interrupt source register, which we'll call vidintr_pend. We'll call the interrupt enable register vidintr_enable.
The video controller will use only one 8259 IRR pin. It is the responsibility of the CPU's video ISR to interrogate the vidintr_pend register and decide what to do.
The video controller will assert its IRR pin as long as vidintr_pend is non-zero. Since we're level triggered, the CPU may be re-interrupted.
Here is a sample ISR routine to go with this:
// video_init -- initialize controller
void
video_init(void)
{
write_port(...);
write_port(...);
write_port(...);
...
// we only care about the vertical interrupts, not the horizontal ones
write_port(vidintr_enable,VSTART | VEND);
}
// video_stop -- stop controller
void
video_stop(void)
{
// stop all interrupt sources
write_port(vidintr_enable,0);
write_port(...);
write_port(...);
write_port(...);
...
}
// vidisr_process -- process video interrupts
void
vidisr_process(void)
{
u32 pendmsk;
// NOTE: we loop because controller may assert a new, different interrupt
// while we're processing a given one -- we don't want to exit if we _know_
// we'll be [almost] immediately re-entered
while (1) {
pendmsk = port_read(vidintr_pend);
if (pendmsk == 0)
break;
// the normal way to clear on most H/W is a writeback
// writing a 1 to a given bit clears the interrupt source
// writing a 0 does nothing
// NOTE: with this method, we can _never_ have a race condition where
// we lose an interrupt
port_write(vidintr_pend,pendmsk);
if (pendmsk & HSTART)
...
if (pendmsk & HEND)
...
if (pendmsk & VSTART)
...
if (pendmsk & VEND)
...
}
}
// vidisr_simple -- simple video ISR routine
void
vidisr_simple(void)
{
// NOTE: interrupt state has been pre-saved for us ...
// process our interrupt sources
vidisr_process();
// allow other devices to cause interrupts
port_write(8259,SEND_NON_SPECIFIC_EOI);
// return from interrupt by popping interrupt state
iret();
}
// vidisr_nested -- video ISR routine that allows nested interrupts
void
vidisr_nested(void)
{
// NOTE: interrupt state has been pre-saved for us ...
// allow other devices to cause interrupts
port_write(8259,SEND_NON_SPECIFIC_EOI);
// allow us to receive them
sti();
// process our interrupt sources
// this can be interrupted by another source or another device
vidisr_process();
// return from interrupt by popping interrupt state
iret();
}
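The claim in the comments that a write-1-to-clear pending register cannot lose an interrupt is easy to check with a small simulation. The sketch below models the register as a plain variable; the bit values for VEND and VSTART and the helper names are assumptions made for the example, and pend_write_plain models a hypothetical alternative design where a write simply stores the value.

```c
#include <assert.h>
#include <stdint.h>

#define VEND   0x8u   /* assumed bit assignments, for illustration only */
#define VSTART 0x4u

static uint32_t vidintr_pend;   /* simulated device register */

/* Device side: a new interrupt source fires. */
static void device_raise(uint32_t bits) { vidintr_pend |= bits; }

/* Write-1-to-clear: only the bits written as 1 are cleared. */
static void pend_write_w1c(uint32_t value) { vidintr_pend &= ~value; }

/* Hypothetical alternative: the write stores the value, so the ISR
 * acknowledges by writing 0. */
static void pend_write_plain(uint32_t value) { vidintr_pend = value; }
```

If VSTART fires between the ISR's read and its write-back, pend_write_w1c(seen) clears only VEND and leaves VSTART pending for the next loop iteration, whereas pend_write_plain(0) would wipe out the VSTART that the ISR never saw.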
UPDATE:
Your followup questions:
Why do you use interrupt disable on video controller register instead of mask 8259's interrupt enable bit?
When you execute vidisr_nested(void) function, it will enable nesting the same interrupt. Is it true? And is that what you want?
To answer (1), we should do both but not necessarily in the same place. They seem similar, but work in slightly different ways.
We change the video controller registers in the video controller driver [as it's the only place that "understands" the video controller's registers].
The video controller actually asserts the 8259's IRR pin from: IRR = ((vidintr_enable & vidintr_pend) != 0). If we never set vidintr_enable (i.e. it's all zeroes), then we can operate the device in a "polled" [non-interrupt] mode.
The 8259 interrupt enable register works similarly, but it masks against which IRRs [asserted or not] may interrupt the CPU. The device vidintr_enable controls whether it will assert IRR or not.
In the example video driver, the init routine enables the vertical interrupts, but not the horizontal. Only the vertical interrupts will generate a call to the ISR, but the ISR can/will also process the horizontal ones [as polled bits].
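The two levels of gating can be written out as two small predicates. This is only a sketch with made-up function names; the one hardware detail it relies on is that in the 8259's mask register a 1 bit masks (disables) the corresponding input.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Device side: the request line is driven only by sources that are both
 * enabled and pending. With enable == 0 the device can still be polled. */
static bool device_irq_line(uint32_t enable, uint32_t pend)
{
    return (enable & pend) != 0;
}

/* 8259 side: a request on `pin` can reach the CPU only if its mask bit is
 * clear (1 = masked). */
static bool pic_passes(bool line, uint8_t imr, unsigned pin)
{
    assert(pin < 8);
    return line && !(imr & (1u << pin));
}
```

So a horizontal event that is pending but not enabled never raises the line, and even an asserted line goes nowhere while the controller has that pin masked.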
Changing the 8259 interrupt enable mask should be done in a place that understands the interrupt topology of the entire system. This is usually done by the containing OS. That's because the OS knows about the other devices and can make the best choice.
Herein, "containing OS" could be a full OS like Linux [with which I'm most familiar]. Or, it could just be an R/T executive [or boot rom--I've written a few] that has some common device handling framework with "helper" functions for the device drivers.
For example, although it's usual for every device to get its own IRR pin, it is possible, with level triggering, for two different devices to share an IRR, e.g. IRR[0] = devA_IRROUT | devB_IRROUT, either through an OR gate [or wired OR(?)].
It's also possible that the device is attached to a "nested" or "cascaded" interrupt controller. IIRC [consult document], it is possible to have a "master" 8259 and [up to] 8 "slave" 8259s. Each slave 8259 connects to an IRR pin of the master. Then, connect devices to the slave IRR pins. For a fully loaded system, you can have 64 interrupting devices (8 slaves with 8 inputs each). And, the master can have slave 8259s on some IRR pins and real devices on others [a "hybrid" topology].
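The number of usable device inputs in such a topology follows from simple counting: each slave takes up one master input and brings eight of its own. A small sketch (cascade_inputs is a made-up name):

```c
#include <assert.h>

/* Device inputs available with one master 8259 and `slaves` slave 8259s:
 * each slave occupies one master input and contributes eight of its own. */
static unsigned cascade_inputs(unsigned slaves)
{
    assert(slaves <= 8);
    return (8 - slaves) + 8 * slaves;
}
```

With no slaves this gives 8, with the single slave of the classic PC/AT arrangement it gives 15, and with all eight master inputs cascaded it gives 64.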
Usually, only the OS knows enough to deal with this. In a real system, a device driver probably wouldn't touch the 8259 at all. The non-specific EOI would probably have been sent to the 8259 before entering the device's ISR. And, the OS would handle the full "save state" and "restore state" and the driver just handles device specific actions.
Also, under an OS, the OS will call the "init" and "stop" routines. The general OS routines for this will handle the 8259 and call the device specific ones.
For example, under Linux [or almost any other OS or R/T executive], the interrupt sequence goes something like this:
- CPU hardware actions [atomic]:
  - push flags register [has CPU interrupt enable flag], %cs and %eip to stack
  - clear CPU interrupt enable flag (e.g. implied cli)
  - jump through interrupt vector table
- OS general ISR (preset within IVT):
  - push all remaining registers to stack
  - send non-specific EOI to 8259(s)
  - call device-specific ISR (NOTE: CPU interrupt flag still clear)
  - pop regs
  - iret
To answer (2), yes, you are correct. It would probably interrupt immediately, and might nest (infinitely :-).
The simple ISR version is more efficient and preferable if the actions taken in the ISR are short, quick, and simple (e.g. just output to a few data ports).
If the required actions take a relatively long time (e.g. do intensive calculations, or write to a large number of ports or memory locations), the nested version is preferred to prevent other devices from having entry to their ISRs delayed excessively.
However, some time-critical devices [like a video controller] need to use the simple model, preventing interruption by other devices, to guarantee that they can complete in a finite, deterministic time.
For example, the video ISR handling of VEND might program the device for the next/upcoming field/frame and must complete this within the vertical blanking interval. It has to do this, even if it means "excessive" delay of other ISRs.
Note that the ISR was "racing" to complete before the end of the blanking interval. Not the best design. I've had to program such a controller/device. For rev 2, we changed the design so the device registers were double-buffered.
That meant that we could set up the registers for frame 1 anytime during the [much longer] frame 0 display period. At VSTART for frame 1, the video hardware would instantly clock-in/save the double-buffered values, and the CPU could then setup for frame 2 anytime during the display of frame 1. And so on ...
With the modified design, the video driver removed the device setup from the ISR entirely. It was now handled from OS task level.
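The double-buffering idea can be sketched as a shadow/active register pair. The structure below is purely illustrative: the field names are placeholders and do not describe the actual controller mentioned above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register set; the field names are placeholders. */
typedef struct {
    uint32_t frame_base;
    uint32_t line_stride;
} video_regs;

typedef struct {
    video_regs shadow;  /* written by the CPU at any time */
    video_regs active;  /* used by the hardware for the current frame */
} video_ctrl;

/* CPU side: program the next frame whenever convenient. */
static void video_program_next(video_ctrl *v, uint32_t base, uint32_t stride)
{
    assert(v);
    v->shadow.frame_base = base;
    v->shadow.line_stride = stride;
}

/* Hardware side: at VSTART the shadow values are latched in one step. */
static void video_on_vstart(video_ctrl *v)
{
    assert(v);
    v->active = v->shadow;
}
```

Writes through video_program_next never disturb the frame being displayed, so the CPU has the whole display period, not just the blanking interval, to get the next frame's values in place.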
In the driver example, I've adjusted the sequencing a bit to prevent infinite stacking, and added some additional information based upon my question (1) answer. That is, it shows [crudely] what to do with or without an OS.
// video controller driver
//
// for illustration purposes, STANDALONE means a very simple software system
//
// if it's _not_ defined, we assume the ISR is called from an OS general ISR
// that handles 8259 interactions
//
// if it's _defined_, we're showing [crudely] what needs to be done
//
// NOTE: although this is largely C code, it's also pseudo-code in places
// video_init -- initialize controller
void
video_init(void)
{
write_port(...);
write_port(...);
write_port(...);
...
#ifdef STANDALONE
write_port(8259_interrupt_enable |= VIDEO_IRR_PIN);
#endif
// we only care about the vertical interrupts, not the horizontal ones
write_port(vidintr_enable,VSTART | VEND);
}
// video_stop -- stop controller
void
video_stop(void)
{
// stop all interrupt sources
write_port(vidintr_enable,0);
#ifdef STANDALONE
write_port(8259_interrupt_enable &= ~VIDEO_IRR_PIN);
#endif
write_port(...);
write_port(...);
write_port(...);
...
}
// vidisr_pendmsk -- get video controller pending mask (and clear it)
u32
vidisr_pendmsk(void)
{
u32 pendmsk;
pendmsk = port_read(vidintr_pend);
// the normal way to clear on most H/W is a writeback
// writing a 1 to a given bit clears the interrupt source
// writing a 0 does nothing
// NOTE: with this method, we can _never_ have a race condition where
// we lose an interrupt
port_write(vidintr_pend,pendmsk);
return pendmsk;
}
// vidisr_process -- process video interrupts
void
vidisr_process(u32 pendmsk)
{
// NOTE: we loop because controller may assert a new, different interrupt
// while we're processing a given one -- we don't want to exit if we _know_
// we'll be [almost] immediately re-entered
while (1) {
if (pendmsk == 0)
break;
if (pendmsk & HSTART)
...
if (pendmsk & HEND)
...
if (pendmsk & VSTART)
...
if (pendmsk & VEND)
...
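// re-read _and_ clear, otherwise a newly pending bit would never be cleared
// and this loop would never exit
pendmsk = vidisr_pendmsk();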
}
}
// vidisr_simple -- simple video ISR routine
void
vidisr_simple(void)
{
u32 pendmsk;
// NOTE: interrupt state has been pre-saved for us ...
pendmsk = vidisr_pendmsk();
// process our interrupt sources
vidisr_process(pendmsk);
// allow other devices to cause interrupts
#ifdef STANDALONE
port_write(8259,SEND_NON_SPECIFIC_EOI);
#endif
// return from interrupt by popping interrupt state
#ifdef STANDALONE
pop_regs();
iret();
#endif
}
// vidisr_nested -- video ISR routine that allows nested interrupts
void
vidisr_nested(void)
{
u32 pendmsk;
// NOTE: interrupt state has been pre-saved for us ...
// get device pending mask -- do this _before_ [optional] EOI and the sti
// to prevent immediate stacked interrupts
pendmsk = vidisr_pendmsk();
// allow other devices to cause interrupts
#ifdef STANDALONE
port_write(8259,SEND_NON_SPECIFIC_EOI);
#endif
// allow us to receive them
// NOTE: with or without OS, we can't stack until _after_ this
sti();
// process our interrupt sources
// this can be interrupted by another source or another device
vidisr_process(pendmsk);
// return from interrupt by popping interrupt state
#ifdef STANDALONE
pop_regs();
iret();
#endif
}
BTW, I'm the author of the Linux irqtune program.
I wrote it back in the mid 90's. It's of lesser use now, and probably doesn't work on modern systems, but the FAQ I wrote has a great deal of information about interrupt device priorities. The program itself did a simple 8259 manipulation.
An online copy is available here: http://archive.debian.org/debian/dists/Debian-1.1/main/disks-i386/SpecialKernels/irqtune/README.html There's probably source code somewhere in this archive.
That's the version 0.2 doc. I haven't found an online copy of version 0.6, which has a better explanation, so I've put up a text version here: http://pastebin.com/Ut6nCgL6
Side note: The "where to get" information in the FAQ [and email address] are no longer valid. And, I didn't understand the full impact of "spam" until I posted the FAQ and started getting [tons of] it ;-)
And, irqtune even drew Linus' ire. Not because it didn't work but because it did: https://lkml.org/lkml/1996/8/23/19 IMO, if he had read the FAQ, he would have understood why [as what irqtune did is standard stuff to R/T guys].
UPDATE #2
Your new questions:
I think that you are missing a destination address in write_port(8259_interrupt_enable &= ~VIDEO_IRR_PIN). Isn't it so?
IRR register is read-only or r/w? If the second case, what is the purpose of writing into it?
Interrupt vectors are stored as logical addresses or physical address?
To answer question (3): No, not really [even if it seemed so]. The code snippet was "pseudo code" [not pure C code], as I mentioned in a code comment at the top, so technically speaking, I'm covered. However, to make it more clear, here is what the [closer to] real C code would look like:
// the system must know _which_ IRR H/W pin the video controller is connected to
// so we _hardwire_ it here
#define VIDEO_IRR_PIN_NUMBER 3 // just an example
#define VIDEO_IMR_MASK (1 << VIDEO_IRR_PIN_NUMBER)
// video_enable -- enable/disable video controller in 8259
void
video_enable(int enable)
{
u32 val;
// NOTE: we're reading/writing the interrupt _mask_ register (IMR), not the
// IRR [which software can _not_ modify]
// NOTE: in the IMR a 1 bit _masks_ (disables) the input, so enabling the
// video interrupt means clearing its bit
val = read_port(8259_interrupt_enable);
if (enable)
val &= ~VIDEO_IMR_MASK;
else
val |= VIDEO_IMR_MASK;
write_port(8259_interrupt_enable,val);
}
Now, in video_init, replace the code inside STANDALONE with video_enable(1), and, in video_stop with video_enable(0)
As to question (4): We weren't really writing to the IRR, even though the symbol had _IRR_ in it. As mentioned in the code comments above, we were writing to the 8259 interrupt enable register which is really the "interrupt mask register" or IMR in the documentation. The IMR can be read from and written to by using OCW1 (see doc).
There is no way for software to write the IRR. It can be read back, after first issuing a read-register command through OCW3 (see doc), but its bits simply follow the input pins and cannot be set or cleared by software.
There is a one-to-one correspondence between IRR pin numbers [0-7] and IMR bit numbers, but note the polarity: a set IMR bit masks the input (e.g. to enable IRR(0), clear IMR bit 0). The software has to know which bit to change.
Because the video controller is physically connected to a given IRR pin, it is always the same for a given PC board. The software [on older non-PnP systems] can't probe for this. Even on newer systems, the 8259 knows nothing of PnP, so it's still hardwired. The video controller driver programmer must just "know" what IRR pin is being used [by consulting the "spec sheet" or controller "architecture reference manual"].
To answer question (5): First consider what the 8259 does.
When the 8259 is initialized, the ICW2 ("initialization command word 2") gets set by the OS driver. This defines a portion of the interrupt vector number the 8259 will present during the INTR/INTA cycle. In ICW2, the most significant 5 bits are marked T7-T3.
When an interrupt occurs, these bits are combined with the IRR pin number of the interrupting device [which is 3 bits wide] to form an 8 bit interrupt vector number: T7,T6,T5,T4,T3|I2,I1,I0
For example, if we put 0xD0 into ICW2, with our video controller using IRR pin 3, we'd have 1,1,0,1,0|0,1,1 or 0xD3 as the interrupt vector number that the 8259 will send to the CPU.
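The same combination written as code, as a small sketch (pic_vector is a made-up name):

```c
#include <assert.h>
#include <stdint.h>

/* Combine the ICW2 base (bits T7-T3) with the 3-bit input pin number to get
 * the vector number the 8259 presents to the CPU. */
static uint8_t pic_vector(uint8_t icw2, unsigned pin)
{
    assert(pin < 8);
    return (uint8_t)((icw2 & 0xF8u) | pin);
}
```

pic_vector(0xD0, 3) yields 0xD3, matching the worked example, and the low three bits of whatever is written to ICW2 are ignored.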
This is just a vector number [0x00-0xFF] as the 8259 knows nothing of memory addresses. It is the CPU that takes this vector number and, using the CPU's "interrupt vector table" [IVT], uses the vector number as an index into the IVT to properly vector the interrupt to an ISR routine.
On 80386 and later architectures, the IVT is actually called an IDT ("interrupt descriptor table"). For details, see the "System Programming Guide", chapter 6: http://download.intel.com/design/processor/manuals/253668.pdf
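How the CPU turns the vector number into a table lookup can be sketched as an offset calculation. The entry sizes used here are the standard x86 ones (4 bytes per real-mode IVT entry, 8 bytes per 32-bit protected-mode IDT gate); the function names are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

enum { IVT_ENTRY_SIZE = 4, IDT32_ENTRY_SIZE = 8 };

/* Byte offset of a vector's entry in the real-mode IVT (segment:offset pair). */
static uint32_t ivt_entry_offset(uint8_t vector)
{
    return (uint32_t)vector * IVT_ENTRY_SIZE;
}

/* Byte offset of a vector's gate descriptor in a 32-bit protected-mode IDT,
 * relative to the IDT base. */
static uint32_t idt32_entry_offset(uint8_t vector)
{
    return (uint32_t)vector * IDT32_ENTRY_SIZE;
}
```

For the 0xD3 vector from the example, the real-mode entry sits at offset 0x34C and the protected-mode gate at offset 0x698 from the table base.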
As to whether the resulting ISR address from the IVT/IDT is physical or logical, that depends on the processor mode (e.g. real mode, protected mode, protected with virtual addressing enabled).
In a sense, all such addresses are always logical. And, all logical addresses undergo a translation to physical on each CPU instruction. Whether the translation is one-to-one [MMU not enabled or page tables have one-to-one mapping] is a question for "How has the OS set things up?"
Strictly speaking, there is no such thing as "acknowledging an interrupt to the device". What an ISR should do is handle the interrupt condition. For example, if the UART requested an interrupt because it has incoming data, then you should read that incoming data. After that read operation the UART no longer has incoming data, so naturally it stops asserting the IRQ line. Alternatively, if your program no longer needs to read the data and wants to stop the communication, it would just mask the receiver interrupt via the UART registers and, once again, the UART will stop asserting the IRQ line. If the device just wanted to signal some state change, then you should read the new state; the device will then know that you have the up-to-date state and will release the IRQ line.
So, in short: there is usually no device-specific acknowledge procedure. All you need to do is service the interrupt condition, after which that condition disappears, voiding the interrupt request.
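A toy model of the UART case shows both ways the request goes away. It is a deliberately simplified sketch with a one-byte receive buffer; uart_model and its fields are invented for the illustration and do not correspond to any particular UART's registers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy UART with a one-byte receive buffer (a simplified model). */
typedef struct {
    uint8_t rx_data;
    bool    rx_ready;      /* set when a byte has arrived */
    bool    rx_int_enable; /* receiver interrupt enable */
} uart_model;

/* The IRQ line is asserted only while data is waiting and the receiver
 * interrupt is enabled. */
static bool uart_irq_line(const uart_model *u)
{
    return u->rx_ready && u->rx_int_enable;
}

/* Device side: a byte arrives. */
static void uart_receive(uart_model *u, uint8_t byte)
{
    u->rx_data = byte;
    u->rx_ready = true;
}

/* Reading the data register is what removes the interrupt condition. */
static uint8_t uart_read_data(uart_model *u)
{
    assert(u);
    u->rx_ready = false;
    return u->rx_data;
}
```

After uart_receive the line is asserted; it drops either when the ISR calls uart_read_data or when the program clears rx_int_enable, with no separate acknowledge step in sight.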

Interrupt handling on an SMP ARM system with a GIC

I want to know how interrupt handling works from the point at which a device raises an interrupt. I know interrupt handling only in bits and pieces and would like a clear end-to-end picture. Let me lay out what little I know.
Suppose an FPGA device is interrupted through electrical lines and gets some data. The device driver for this FPGA device has already registered code (an interrupt handler) using the request_irq function.
So now the FPGA device has an IRQ line, which it got from the call to request_irq. Using this IRQ line, the device signals the Generic Interrupt Controller (GIC); the GIC does a many-to-one translation of IRQ lines and sends the signal to a CPU core, which then calls the minimal code below:
IRQ_handler
SUB lr, lr, #4 ; modify LR
SRSFD #0x12! ; store SPSR and LR to IRQ mode stack
PUSH {r0-r3, r12} ; store AAPCS registers on to the IRQ mode stack
BL IRQ_handler_to_specific_device
POP {r0-r3, r12} ; restore registers
RFEFD sp! ; and return from the exception using pre-modified LR
IRQ_handler_to_specific_device is nothing but what we registered in the device driver using the request_irq() call.
I still don't know how the CPU core comes to know the interrupt source (i.e. which device the interrupt is coming from).
Also, what is the role of calls like do_irq, and how do shared interrupts work?
I need some help in understanding the end-to-end picture of how interrupts are handled on the ARM architecture.
The GIC is divided into two sections. The first is called the distributor. This is global to the system. It has several interrupt sources physically routed to it, although it may be within an SoC package. The second section is replicated per CPU and is called the CPU interface. The distributor has logic on how to distribute the shared peripheral interrupts, or SPIs. These are the type of interrupt your question is asking about. They are global hardware interrupts.
In the context of Linux, this is implemented in irq-gic.c. There is some documentation in gic.txt. Of specific interest,
reg : Specifies base physical address(s) and size of the GIC registers. The
first region is the GIC distributor register base and size. The 2nd region is
the GIC cpu interface register base and size.
The distributor must be accessed globally, so care must be taken to manage its registers. The CPU interface has the same physical address for each CPU, but each CPU has a separate implementation. The distributor can be set up to route interrupts to specific CPUs (including multiples). See gic_set_affinity(), for example. It is also possible for any CPU to handle the interrupt. The ACK register will allocate the IRQ; the first CPU to read it gets the interrupt. If multiple IRQs are pending and there are two ACK reads from different CPUs, then each will get a different interrupt. A third CPU reading would get a spurious IRQ.
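The "first reader wins" behaviour of the ACK register can be illustrated with a small model. This is only a sketch: gic_model and gic_ack are made-up names, real priority handling is reduced to an ordered list, and the value 1023 is the spurious interrupt ID as I understand the GIC specification.

```c
#include <assert.h>
#include <stdint.h>

#define GIC_SPURIOUS 1023u   /* ID returned when nothing is pending */
#define MAX_PENDING  8

typedef struct {
    uint32_t ids[MAX_PENDING];  /* pending interrupt IDs, highest priority first */
    int      count;
} gic_model;

/* Models a CPU reading its interrupt acknowledge register: the reader takes
 * the highest-priority pending interrupt, so a later reader cannot see it. */
static uint32_t gic_ack(gic_model *g)
{
    assert(g);
    if (g->count == 0)
        return GIC_SPURIOUS;
    uint32_t id = g->ids[0];
    for (int i = 1; i < g->count; i++)
        g->ids[i - 1] = g->ids[i];
    g->count--;
    return id;
}
```

With two SPIs pending, the first two reads (from whichever CPUs get there first) each return a different ID and the third returns the spurious ID, which the handler should simply ignore.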
As well, each CPU interface has some private interrupt sources, that are used for CPU-to-CPU interrupts as well as private timers and the like. But I believe the focus of the question is how a physical peripheral (unique to a system) gets routed to a CPU in an SMP system.

Replace HW interrupt in flat memory mode with DOS32/A

I have a question about how to replace HW interrupt in flat memory mode...
about my application...
created by combining Watcom C and DOS32/A.
written for running on DOS mode( not on OS mode )
with DOS32/A now I can access >1M memory and allocate large memory to use...(running in flat memory mode !!!)
current issue...
I want to write an ISR(interrupt service routine) for one PCI card. Thus I need to "replace" the HW interrupt.
Ex. the PCI card's interrupt line = 0xE in DOS. That means this device will issue interrupt via 8259's IRQ 14.
But I do not know how to replace this interrupt in flat mode.
# resource I found...
- in watcom C's library, there is one sample using _dos_getvect, _dos_setvect, and _chain_intr to hook INT 0x1C...
I tested this code and it works. But when I apply it to my case, INT 0x76 (IRQ 14 maps to INT 0x76, i.e. (14-8) + 0x70), nothing happens...
I checked that the HW interrupt is generated, but my own ISR is not invoked...
Am I missing something? Or are there any functions I can use to achieve my goal?
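For reference, the IRQ-to-vector arithmetic used above can be written as a small helper. It assumes the conventional PC programming of the two 8259s under DOS (master at vectors 0x08-0x0F, slave at 0x70-0x77); irq_to_dos_vector is a made-up name.

```c
#include <assert.h>

/* Conventional PC mapping under DOS: the master 8259 (IRQ 0-7) uses vectors
 * 0x08-0x0F and the slave (IRQ 8-15) uses vectors 0x70-0x77. */
static unsigned irq_to_dos_vector(unsigned irq)
{
    assert(irq < 16);
    return (irq < 8) ? 0x08u + irq : 0x70u + (irq - 8);
}
```

irq_to_dos_vector(14) gives 0x76, so the vector number itself is the right one for IRQ 14.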
===============================================================
[20120809]
I tried to use DPMI calls 0x204 and 0x205 and found MyISR() is still not invoked. I described what I did as below and maybe you all can give me some suggestions !
1) Use inline assembly to implement DPMI calls 0x204 and 0x205 and test OK...
Ex. Use DPMI 0x204 to show the interrupt vectors of 16 IRQs and I get(selector:offset) following results: 8:1540(INT8),8:1544(INT9),.....,8:1560(INT70),8:1564(INT71),...,8:157C(INT77)
Ex. Use DPMI 0x205 to set the interrupt vector for IRQ14(INT76) and returned CF=0, indicating successful
2) Create my own ISR MyISR() as follows:
volatile int tick=0; // global and volatile...
void MyISR(void)
{
tick = 5; // simple code to change the value of tick...
}
3) Set new interrupt vector by DPMI call 0x205:
selector = FP_SEG(MyISR); // selector = 0x838 here
offset = FP_OFF(MyISR); // offset = 0x30100963 here
sts = DPMI_SetIntVector(0x76, selector, offset, &out_ax);
Then sts = 0 (CF=0), indicating success.
One strange thing here: my app runs in the flat memory model, so I thought the selector should be 0 for MyISR(). But if I pass selector = 0 to DPMI call 0x205, I get CF=1 and AX = 0x8022, indicating "invalid selector".
4) Let HW interrupt be generated and the evidences are:
PCI device config register 0x5 bit2(Interrupt Disabled) = 0
PCI device config register 0x6 bit3(Interrupt status) = 1
PCI device config register 0x3C/0x3D (Interrupt line) = 0xE/0x2
In DOS the interrupt mode is PIC mode(8259 mode) and Pin-based(MSIE=0)
5) Display the value of tick and found it is still "0"...
Thus I think MyISR() is not invoked correctly...
Try using DPMI Function 0204h and 0205h instead of '_dos_getvect' and '_dos_setvect', respectively.
The runtime environment of your program is DOS32A, a DPMI server/host, so use the API it provides instead of the DOS INT 21h facilities. DOS32A does intercept INT 21h, so your code should work as far as real mode is concerned.
What you actually did with '_dos_getvect' and '_dos_setvect' is install only a real-mode interrupt handler for IRQ14.
By using the DPMI functions instead, you install a protected-mode interrupt handler for IRQ14, and DOS32A will automatically pass the IRQ14 interrupt up to this protected-mode handler.
Recall: a DOS extender/DPMI server can be in either protected mode or real mode when an IRQ is asserted.
This is because your application uses DOS or BIOS APIs, so the extender has to switch to real mode to execute them and then return to protected mode to transfer control back to your protected-mode application.
DOS32A handles this by allocating a real-mode callback (at least for hardware interrupts), which calls your protected-mode handler if IRQ14 is asserted while the extender is in real mode.
If the extender is in protected mode when IRQ14 is asserted, it will automatically transfer control to your IRQ14 handler.
But if you didn't install a protected-mode handler for your IRQ, DOS32A will not allocate a real-mode callback, and your real-mode IRQ handler may not get control.
AFAIK it should still receive control, though.
Anyway, give the above two functions a try. And do chain to the previous INT 76h handler, as Sean said.
In short:
With DOS32A you need not use the '_dos_getvect' and '_dos_setvect' functions. Instead, use DPMI functions 0204h and 0205h to install your protected-mode IRQ handler.
A piece of advice: in your interrupt handler, the first step should be to check whether your device actually generated the interrupt or whether it came from some other device sharing this IRQ (IRQ14 in your case). You can do this by checking an 'interrupt pending' bit in your device. If it is set, service your device and chain to the next handler. If it is not set, simply chain to the next handler.
EDITED:
Use the latest version of DOS32a, instead of one that comes with OW.
Update on 2012-08-14:
Yes, you can use the FP_SEG and FP_OFF macros to obtain the selector and offset respectively, just as you would use them in real mode to get the segment and offset.
You can also use the MK_FP macro to create far pointers from a selector and offset, e.g.
MK_FP(selector, offset).
You should declare your interrupt handler with the '__interrupt' keyword when writing handlers in C.
Here is a snippet:
#include <i86.h> /* for FP_OFF, FP_SEG, and MK_FP in OW */
/* C Prototype for your IRQ handler */
void __interrupt __far irqHandler(void);
.
.
.
irq_selector = (unsigned short)FP_SEG( &irqHandler );
irq_offset = (unsigned long)FP_OFF( &irqHandler );
__dpmi_SetVect( intNum, irq_selector, irq_offset );
.
.
.
or, try this:
extern void sendEOItoMaster(void);
# pragma aux sendEOItoMaster = \
"mov al, 0x20" \
"out 0x20, al" \
modify [eax] ;
extern void sendEOItoSlave(void);
# pragma aux sendEOItoSlave = \
"mov al, 0x20" \
"out 0xA0, al" \
modify [eax] ;
unsigned int old76_selector, new76_selector;
unsigned long old76_offset, new76_offset;
volatile int chain = 1; /* Chain to the old handler */
volatile int tick=0; // global and volatile...
void (__interrupt __far *old76Handler)(void) = NULL; // function pointer declaration
void __interrupt __far new76Handler(void) {
tick = 5; // simple code to change the value of tick...
.
.
.
if( chain ){
// disable irqs if enabled above.
_chain_intr( old76Handler ); // 'jumping' to the old handler
// ( *old76Handler )(); // 'calling' the old handler
}else{
sendEOItoMaster();
sendEOItoSlave();
}
}
__dpmi_GetVect( 0x76, &old76_selector, &old76_offset );
old76Handler = ( void (__interrupt __far *)(void) ) MK_FP (old76_selector, old76_offset);
new76_selector = (unsigned int)FP_SEG( &new76Handler );
new76_offset = (unsigned long)FP_OFF( &new76Handler );
__dpmi_SetVect( 0x76, new76_selector, new76_offset );
.
.
NOTE:
You should first double-check that the IRQ# you are hooking is really assigned/mapped to the interrupt pin of your PCI device. In other words, first read the 'Interrupt Line register' (NOT the Interrupt Pin register) from PCI configuration space, and hook only that IRQ#. The valid values for this register in your case are 0x00 through 0x0F inclusive, where 0x00 means IRQ0, 0x01 means IRQ1, and so on.
POST/BIOS code writes a value to the 'Interrupt Line register' while booting, and you MUST NOT modify this register at any cost (unless, of course, you are dealing with interrupt routing issues, which is an OS writer's job).
You should also get and save the selector and offset of the old handler using DPMI call 0204h if you are chaining to the old handler. If you are not chaining, don't forget to send EOI (End-of-Interrupt) to BOTH the master and slave PICs if you hooked an IRQ belonging to the slave PIC (i.e. INT 70h through 77h, including INT 0Ah), and ONLY to the master PIC if you hooked an IRQ belonging to the master PIC.
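The vector arithmetic and the EOI rule above can be written down as two small pure functions. This is only a sketch: it assumes the standard DOS/BIOS PIC setup (IRQ0-7 on INT 08h-0Fh, IRQ8-15 on INT 70h-77h), and the names irq_to_vector and eoi_ports are made up for illustration. It lists the slave port first, which is a common convention rather than a requirement.

```c
#include <assert.h>
#include <stdint.h>

#define PIC_MASTER_CMD 0x20  /* master 8259 command port */
#define PIC_SLAVE_CMD  0xA0  /* slave 8259 command port */
#define PIC_EOI        0x20  /* non-specific End-of-Interrupt command byte */

/* Map an IRQ number (0..15), e.g. the value read from the PCI
 * Interrupt Line register, to its interrupt vector.
 * Assumes the standard DOS/BIOS mapping. */
uint8_t irq_to_vector(uint8_t irq)
{
    assert(irq < 16);
    return (uint8_t)(irq < 8 ? 0x08 + irq : 0x70 + (irq - 8));
}

/* Fill ports[] with the PIC command ports that must receive PIC_EOI
 * for this IRQ, and return how many entries were written.
 * Slave IRQs (8..15) need an EOI on both PICs. */
int eoi_ports(uint8_t irq, uint16_t ports[2])
{
    int n = 0;
    assert(irq < 16);
    if (irq >= 8)
        ports[n++] = PIC_SLAVE_CMD;
    ports[n++] = PIC_MASTER_CMD;
    return n;
}
```

In a real handler you would write PIC_EOI to each returned port with an OUT instruction, as sendEOItoMaster() and sendEOItoSlave() do above.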
In flat model, the BASE address is 0 and Limit is 0xFFFFF, with G bit(ie Granularity bit) = 1.
The base and limit(along with attribute bits(e.g G bit) of a segment) reside in the descriptor corresponding to a particular segment. The descriptor itself, sits in the descriptor table.
Descriptor tables are an array with each entry being 8bytes.
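To make the base/limit/G-bit relationship concrete, here is a minimal sketch that decodes an 8-byte segment descriptor and computes the last valid offset. The struct and function names are hypothetical, and expand-down segments are ignored for simplicity.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t base;      /* 32-bit linear base address */
    uint32_t limit;     /* raw 20-bit limit field */
    int      granular;  /* G bit: 1 = limit is counted in 4 KiB pages */
} segment_info;

/* Decode the base, limit and G bit from a raw 8-byte descriptor. */
segment_info decode_descriptor(const uint8_t d[8])
{
    segment_info s;
    assert(d != 0);
    s.base  = (uint32_t)d[2] | ((uint32_t)d[3] << 8) |
              ((uint32_t)d[4] << 16) | ((uint32_t)d[7] << 24);
    s.limit = (uint32_t)d[0] | ((uint32_t)d[1] << 8) |
              ((uint32_t)(d[6] & 0x0F) << 16);
    s.granular = (d[6] >> 7) & 1;
    return s;
}

/* Highest offset that may be accessed through the segment
 * (expand-up segments only). */
uint32_t last_valid_offset(segment_info s)
{
    return s.granular ? ((s.limit << 12) | 0xFFFu) : s.limit;
}
```

For a flat code segment such as FF FF 00 00 00 9A CF 00, this gives base 0, limit 0xFFFFF with G=1, and a last valid offset of 0xFFFFFFFF, i.e. the whole 4 GB address space.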
The selector is merely a pointer(or an index) to the 8-byte descriptor entry, in the Descriptor table(either GDT or LDT). So a selector CAN'T be 0.
Note that lowest 3 bits of 16-bit selector have special meaning, and only the upper 13-bits are used to index a descriptor entry from a descriptor table.
GDT = Global Descriptor Table
LDT = Local Descriptor Table
A system can have only one GDT, but many LDTs.
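As an illustration of how a selector splits into its fields, here is a small sketch (the type and function names are made up). The low two bits are the requested privilege level, bit 2 chooses GDT or LDT, and the upper 13 bits index the table.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    unsigned index;  /* entry number in the descriptor table */
    unsigned ti;     /* table indicator: 0 = GDT, 1 = LDT */
    unsigned rpl;    /* requested privilege level, 0..3 */
} selector_fields;

/* Split a 16-bit selector into index, table indicator and RPL. */
selector_fields decode_selector(uint16_t sel)
{
    selector_fields f;
    f.index = (unsigned)sel >> 3;
    f.ti    = ((unsigned)sel >> 2) & 1u;
    f.rpl   = (unsigned)sel & 3u;
    assert(f.index <= 0x1FFF);  /* the index field is 13 bits wide */
    return f;
}

/* A selector is "null" when it refers to GDT entry 0, whatever its RPL. */
int is_null_selector(uint16_t sel)
{
    return (sel & 0xFFFCu) == 0;
}
```

Applied to the values in the question, selector 0x838 decodes to GDT entry 0x107 with RPL 0, and selector 8 (seen in the vector dump) is GDT entry 1.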
Entry number 0 in the GDT is reserved and can't be used. AFAIK, DOS32A does not create any LDT for its applications; instead it simply allocates and initializes the descriptor entries corresponding to the application in the GDT itself.
The selector MUST NOT be 0: the x86 architecture treats a 0 selector as invalid when you try to access memory through it. You can successfully load 0 into a data segment register (DS, ES, FS, GS), but the CPU generates an exception as soon as you try to access (read/write) memory through that segment.
For interrupt handlers, the base address need not be 0, even in flat mode.
The DPMI environment must have valid reasons for doing so.
After all, you still need to tackle segmentation at some level in x86 architecture.
PCI device config register 0x5 bit2(Interrupt Disabled) = 0
PCI device config register 0x6 bit3(Interrupt status) = 1
I think, you mean Bus master command and status registers respectively. They actually reside in either I/O space or memory space, but NOT in PCI configuration space.
So you can read/write them directly via IN/OUT or MOV, instructions.
For reading/writing PCI configuration registers, you must use the configuration read/write methods or the PCI BIOS routines.
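One of those configuration read/write methods is the legacy port-based mechanism: write an address dword to port 0xCF8, then read or write the data at port 0xCFC. The sketch below only builds the address value; the function names are hypothetical, and the actual port I/O (a 32-bit OUT to 0xCF8 and IN from 0xCFC) is left out because it depends on your compiler's port intrinsics.

```c
#include <assert.h>
#include <stdint.h>

#define PCI_CONFIG_ADDRESS 0xCF8  /* address port */
#define PCI_CONFIG_DATA    0xCFC  /* data port */

/* Build the dword written to PCI_CONFIG_ADDRESS to select a register.
 * Bit 31 is the enable bit; the register offset is dword-aligned. */
uint32_t pci_config_address(uint8_t bus, uint8_t dev, uint8_t func, uint8_t reg)
{
    assert(dev < 32 && func < 8);
    return 0x80000000u |
           ((uint32_t)bus  << 16) |
           ((uint32_t)dev  << 11) |
           ((uint32_t)func << 8)  |
           (reg & 0xFCu);
}

/* Bit position of a byte register inside the dword read from
 * PCI_CONFIG_DATA, e.g. 0x3D (Interrupt Pin) is bits 8..15 of dword 0x3C. */
unsigned pci_config_byte_shift(uint8_t reg)
{
    return (reg & 3u) * 8u;
}
```

For example, the Interrupt Line register (0x3C) of bus 0, device 31, function 2 is selected by writing 0x8000FA3C to port 0xCF8.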
NOTE:
Many PCI disk controllers, have a bit called 'Interrupt enable/disable' bit. The register
that contains this bit is usually in the PCI configuration space, and can be found from the datasheet.
Actually, this setting controls "forwarding" of the interrupt generated by the device attached to the PCI controller to the PCI bus.
If interrupts are disabled via this bit, then even if your device (attached to the PCI controller) generates an interrupt, the interrupt will NOT be forwarded to the PCI bus (and hence the CPU will never know that it occurred). The interrupt bit in the PCI controller (which is different from the 'Interrupt enable/disable' bit) is still set, however, to record that the device (e.g. a hard disk) generated an interrupt, so that the program can read this bit and take appropriate action. From a programming perspective it is similar to polling.
This usually applies only to non-bus-master transfers.
But it seems that you are using bus-master transfers (i.e. DMA), so it should not apply in your case.
Anyway, I would suggest you read the datasheet of the PCI controller carefully, especially looking for bits/registers related to interrupt handling.
EDITED:
Well, as far as application-level programming is concerned, you need not encounter or use _far pointers, as your program will not access anything outside your own code.
But this is not completely true for system-level programming, where you need to access memory-mapped device registers or external ROM, implement interrupt handlers, etc.
The story changes here. Creating a segment, i.e. allocating a descriptor and getting its associated selector, ensures that even if there is a bug in the code, it cannot silently change anything outside the particular segment it is working in. If it tries to do so, the CPU generates a fault. So when accessing external devices (especially memory-mapped device registers) or ROM data such as the BIOS, it is a good idea to allocate a descriptor and set the base and limit to cover only the area you need to execute/read/write. But you are not bound to do so.
Some external code, residing for example in ROM, assumes that it will be invoked with a far call.
As I said earlier, in the x86 architecture you need to deal with segmentation at some level (the lower you go), as there is no way to disable it completely.
But in the flat model, segmentation is there as an aid to the programmer, as I said above, when accessing things external to your program. You need not use it if you don't want to.
When an interrupt handler is invoked, it doesn't know the base, limits, or segment attributes of the program that was interrupted; except for CS and EIP, all registers are in an undefined state with respect to the interrupt handler. So it needs to be declared as a far function to indicate that it resides somewhere external to the currently executing program.
It's been a while since I fiddled with interrupts, but the vector table holds pointers that tell the processor where to go to process each interrupt. I can give you the process, but not code, as I only ever used 8086 code.
Pseudo code:
Initialize:
    Get current vector - store value
    Set vector to point to the entry point of your routine
next:
Process Interrupt:
    Your code decides what to do with data
    If it's your data:
        process it, and return
    If not:
        jump to the stored vector that we got during initialize,
        and let the chain of interrupts continue as they normally would
finally:
Program End:
    check to see if interrupt still points to your code
    if yes, set vector back to the saved value
    if no, set beginning of your code to long jump to vector address you saved,
    or set a flag that lets your program not process anything
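The hook / chain / unhook flow above can be sketched in portable C by simulating the vector table with an array of function pointers. Everything here is a stand-in: vector_table, device_pending, old_handler and the other names are invented for the illustration, and a real handler would use the DPMI calls and __interrupt functions discussed earlier.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*isr_fn)(void);

isr_fn vector_table[256];  /* stand-in for the real interrupt vector table */
isr_fn saved_vector;       /* previous handler, stored during initialize */
int device_pending;        /* stand-in for the device's interrupt-pending bit */
int serviced_count;        /* interrupts handled by our routine */
int chained_count;         /* interrupts passed on to the old handler */

/* Stand-in for whatever handler was installed before ours. */
void old_handler(void) { chained_count++; }

/* Process Interrupt: service our own device, otherwise chain. */
void my_handler(void)
{
    if (device_pending) {
        device_pending = 0;
        serviced_count++;
        return;
    }
    if (saved_vector != NULL)
        saved_vector();
}

/* Initialize: save the current vector, then point it at our routine. */
void hook(unsigned vec)
{
    assert(vec < 256);
    saved_vector = vector_table[vec];
    vector_table[vec] = my_handler;
}

/* Program End: restore the vector only if it still points at our code.
 * Returns 1 if restored, 0 if someone else hooked it after us. */
int unhook(unsigned vec)
{
    assert(vec < 256);
    if (vector_table[vec] != my_handler)
        return 0;
    vector_table[vec] = saved_vector;
    return 1;
}
```

When unhook() returns 0, another program has hooked the vector in the meantime, which is the case where the pseudo code falls back to jumping to the saved vector or setting a "do nothing" flag.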

How to send nmi on same system

I need to send an NMI on the system I am working on, to test a few things I have implemented. Is there any Windows driver routine that allows us to do that? I think I can write to a port using __outword. Is there any other way to do it?
I have one more question. Are there any specific scenarios which cause an NMI? (However, I don't want the system to BSOD or triple fault.)
Thanks
From Intel's Software Development Manual: System Programming Guide:
The nonmaskable interrupt (NMI) can be generated in either of two ways:
External hardware asserts the NMI pin.
The processor receives a message on the system bus (Pentium 4, Intel Core Duo, Intel Core 2, Intel Atom, and Intel Xeon processors) or the APIC serial bus (P6 family and Pentium processors) with a delivery mode NMI.
and
It is possible to issue a maskable hardware interrupt (through the INTR pin) to vector 2 to invoke the NMI interrupt handler; however, this interrupt will not truly be an NMI interrupt. A true NMI interrupt that activates the processors NMI-handling hardware can only be delivered through one of the mechanisms listed above.
So, if all you want to do is trigger the NMI handler, you can simply use int $2 (int 02h in Intel syntax). But, if you need to ensure that it is not masked, you will either need external hardware to trigger it, or to use the APIC.
If you choose to use the APIC to send an NMI, the easiest way to do it is to send an inter-processor interrupt. To do this, you will need access to the local APIC's registers, which are mapped into physical memory, by default at the address 0xFEE00000, although that can be changed. You will need to find the physical page containing the APIC's registers and map it into virtual memory so that you can access them.
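One way to find out whether the base has been moved is the IA32_APIC_BASE model-specific register (MSR 0x1B), which holds the physical base address of the local APIC along with a few flag bits. The sketch below only decodes a value that has already been read; reading the MSR needs RDMSR at ring 0, and the function names and the phys_bits parameter (the CPU's physical address width) are assumptions made for the illustration.

```c
#include <assert.h>
#include <stdint.h>

#define IA32_APIC_BASE_MSR    0x1B
#define APIC_BASE_BSP         (1ull << 8)   /* this CPU is the bootstrap processor */
#define APIC_BASE_GLOBAL_EN   (1ull << 11)  /* local APIC globally enabled */

/* Extract the 4 KiB-aligned physical base address of the local APIC
 * registers from an IA32_APIC_BASE value. */
uint64_t apic_base_from_msr(uint64_t msr, unsigned phys_bits)
{
    uint64_t mask;
    assert(phys_bits >= 32 && phys_bits <= 52);
    mask = ((1ull << phys_bits) - 1) & ~0xFFFull;
    return msr & mask;
}

/* Non-zero if the local APIC is globally enabled. */
int apic_globally_enabled(uint64_t msr)
{
    return (msr & APIC_BASE_GLOBAL_EN) != 0;
}
```

A typical value on the bootstrap processor is 0xFEE00900, which decodes to the default base 0xFEE00000 with the APIC enabled.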
In order to send an IPI, you need to write into the interrupt command register (ICR). The ICR's low 32 bits are located at 0x300 within the APIC's page, and the upper 32 bits are at 0x310. To send the NMI, you need to:
Get the APIC ID of the processor you want to send the NMI to. If you want to send it to the processor you are running on, this is simple since you can read it from the APIC at 0x20 in bits 24-31.
Write the APIC ID into the destination field, bits 24-31 of the high ICR register.
Write the value 0x4400 into the low ICR register. Bits 8-10 of this value (delivery mode 100b) indicate that you are sending an NMI, and bit 14 sets the level field to assert.
When writing to an APIC register, you must write a full 32 bit value. Also, bits 13, 16-17, and 20-55 in the ICR are reserved, so you should not change their values. You also must write to the high bits of the ICR before the low bits, since the IPI is triggered by the write to the low bits.
Here is an example of sending an NMI to the current processor in C.
#define APIC_ID_OFFSET 0x20
#define ICR_LOW_OFFSET 0x300
#define ICR_HIGH_OFFSET 0x310
// Convenience macro used to access APIC registers
#define APIC_REG(offset) (*(volatile unsigned int *)((char *)apicAddress + (offset)))
void *apicAddress; // This should contain the virtual address that the APIC registers are mapped to
// Get the current APIC ID. Leave it in the high 8 bits since that is where it needs to be written anyway
unsigned int apicID = APIC_REG(APIC_ID_OFFSET) & 0xFF000000;
unsigned int high = APIC_REG(ICR_HIGH_OFFSET) & 0x00FFFFFF;
high |= apicID;
unsigned int low = APIC_REG(ICR_LOW_OFFSET) & 0xFFF32000;
low |= 0x4400;
APIC_REG(ICR_HIGH_OFFSET) = high;
APIC_REG(ICR_LOW_OFFSET) = low;
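The magic numbers in the example (0x4400, 0xFFF32000 and 0x00FFFFFF) follow directly from the ICR field layout described above. Here is a sketch that derives them from named fields; the macro and function names are invented, and the functions only compute the values to write, without touching any hardware.

```c
#include <assert.h>
#include <stdint.h>

#define ICR_DELIVERY_NMI  (4u << 8)   /* bits 8-10 = 100b: delivery mode NMI */
#define ICR_LEVEL_ASSERT  (1u << 14)  /* bit 14: level = assert */

/* Reserved bits of the low dword: 13, 16-17 and 20-31. */
#define ICR_LOW_RESERVED  ((1u << 13) | (3u << 16) | 0xFFF00000u)
/* Reserved bits of the high dword: 0-23 (ICR bits 32-55). */
#define ICR_HIGH_RESERVED 0x00FFFFFFu

/* New low dword: keep the reserved bits, request an asserted NMI. */
uint32_t icr_low_for_nmi(uint32_t current)
{
    uint32_t fields = ICR_DELIVERY_NMI | ICR_LEVEL_ASSERT;
    assert((fields & ICR_LOW_RESERVED) == 0);  /* fields never touch reserved bits */
    return (current & ICR_LOW_RESERVED) | fields;
}

/* New high dword: keep the reserved bits, put the APIC ID in bits 24-31. */
uint32_t icr_high_for_dest(uint32_t current, uint8_t apic_id)
{
    return (current & ICR_HIGH_RESERVED) | ((uint32_t)apic_id << 24);
}
```

As in the example above, the high dword has to be written first, because the write to the low dword is what sends the IPI.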

Resources