I am writing a bare-metal kernel, but an interrupt doesn't seem to be triggered when the EMMC INTERRUPT register becomes non-zero.
I have two cores idling, and one core at EL3, with no data caches enabled, continually displaying a page of memory so that I can see what the code I'm testing is up to. (The test code regularly flushes its cache, on the working QA7 millisecond interrupt.)
The code being tested is running at Secure EL0 on core 0, with interrupts enabled. Interrupts are routed to core 0:
QA7.GPU_interrupts_routing = 0; // IRQ and FIQ to Core 0
The EMMC interface is initialised, and a reset command sent to the card, at which point the INTERRUPT register becomes 1 (bit 0 set: command has finished), but no GPU interrupt seems to be signalled to the QA7 hardware (bit 8 in the Core 0 interrupt source register stays zero).
The EMMC registers IRPT_MASK and IRPT_EN are both set to 0x017f7137, which I think should enable all known interrupts from that peripheral, and certainly bit 0.
The BCM2835 interrupt registers have been written as follows:
Enable_IRQs_1 = 0x20000000; // (1 << 29);
Enable_IRQs_2 = 0x02ff6800; // 0b00000010111111110110100000000000;
Enable_Basic_IRQs = 0x303;
But they read back as:
Enable_IRQs_1: 0x20000200 // (1 << 9) also set
Enable_IRQs_2: 0x02ff6800 // unchanged
Enable_Basic_IRQs: 0x3 // No interrupts from IRQs 1 or 2.
What have I missed?
(Tagged raspberry-pi2, since it also has the QA7 component.)
The simple answer is that I missed very little. The Arasan SD interface interrupt is number 62, which shows up as bit 20 in the IRQ basic pending register. Set bit 30 in the Enable IRQs 2 register, and the interrupt comes through:
Enable_IRQs_2 = 0x42ff6800;
I just had to ignore the advice: "The table above has many empty entries. These should not be enabled as they will interfere with the GPU operation." in the documentation.
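For reference, a minimal sketch of the fix as a plain MMIO write, assuming the Pi 2 peripheral base of 0x3F000000 and the usual BCM2835 interrupt-controller offset of 0xB214 for Enable IRQs 2; the function name is just illustrative:

    // Assumed addresses: peripheral base 0x3F000000 (Pi 2/3), Enable IRQs 2 at +0xB214.
    #define PERIPHERAL_BASE 0x3F000000u
    #define ENABLE_IRQS_2   (*(volatile unsigned int *)(PERIPHERAL_BASE + 0xB214u))

    static void enable_emmc_irq(void)
    {
        // Arasan SD/EMMC is GPU IRQ 62, i.e. bit (62 - 32) = 30 of Enable IRQs 2.
        ENABLE_IRQS_2 = 0x02FF6800u | (1u << 30);   // previous enables plus the EMMC bit
    }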
I'm in a strange situation with an ATMega2560.
I want to save power by going into Power-down mode. In this mode only a few events can wake the device up.
On USART1 I have an external controller which sends messages to the AVR.
But while USART1 is in use I cannot use INT2 and INT3 as external interrupts (so the CPU will not wake up).
So my idea was to disable USART1 right before going into Power-down mode and enable INT2 as an external interrupt instead.
Pseudo code for this:
UCSR1B &= ~(1<<RXEN1); //Disable the receiver so the AVR releases the pin
DDRD &= ~(1<<PD2); //Make sure PortD2 is an input - we need it for waking up
EIMSK &= ~(1<<INT2); //Disable INT2 - this needs to be done before changing ISC20 and ISC21
EICRA |= (1<<ISC20)|(1<<ISC21); //Rising edge on PortD2 will generate an interrupt and wake the AVR from Power-down
EIMSK |= (1<<INT2); //Now enable INT2
//Sleep routine
cli();
sleep_enable();
sei();
sleep_cpu();
sleep_disable();
In the ISR of INT2, I change everything back to USART1.
Pseudo:
ISR(INT2_vect) {
    EIMSK &= ~(1<<INT2); //Disable INT2 to be able to use it as USART1 again
    UCSR1B = (1<<RXEN1)|(1<<TXEN1)|(1<<RXCIE1);
}
However, it seems to take a long time until USART1 is working correctly again.
There are too many corrupted bits at the beginning (right after waking up from Power-down).
How hackish is this?
Is there any reasonable way to make the change faster?
The main idea was to configure the 'RX' pin as an interrupt source that can wake the CPU up, then immediately switch it back to USART and process the data as soon as possible.
PS: I really have to use the same pin for this purpose, there is no other available option. So guiding toward using some other pins won't be accepted as an answer.
The Power-down mode disables the oscillator, so you have to wait for a stable oscillator after wakeup.
Please take a look at the datasheet on page 51:
When waking up from Power-down mode, there is a delay from the wake-up condition occurs until the wake-up becomes effective. This allows the clock to restart and become stable after having been stopped. The wake-up period is defined by the same CKSEL Fuses that define the Reset Time-out period, as described in "Clock Sources" on page 40.
You have to wait up to 258 clock cycles, assuming you use a high-speed ceramic oscillator (see table 10-4 on page 42).
You can use Standby mode instead. If you use an external oscillator, the CPU can enter Standby mode, which is identical to Power-down mode except that the oscillator isn't stopped. Furthermore, you can set the Power Reduction Register for additional power-saving options.
Another option is Extended Standby mode, which is identical to Power-save mode except that the oscillator is kept running, so the device wakes up in six clock cycles.
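A minimal sketch of the sleep entry with avr-libc, assuming an external crystal so that Standby (or Extended Standby) is available; go_to_sleep() is just an illustrative name:

    #include <avr/interrupt.h>
    #include <avr/sleep.h>

    static void go_to_sleep(void)
    {
        // Standby keeps the main oscillator running, so wake-up does not pay
        // the long CKSEL start-up time that Power-down does.
        set_sleep_mode(SLEEP_MODE_STANDBY);   // or SLEEP_MODE_EXT_STANDBY

        cli();
        sleep_enable();
        sei();
        sleep_cpu();                          // woken e.g. by the INT2 edge on PD2
        sleep_disable();
    }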
Assume we have a system whose CPU is fully compatible with the Intel 8259 Programmable Interrupt Controller. So, this CPU uses vectored interrupts, of course.
When one of the eight interrupts occurs, the PIC just asserts the INTR wire connected to the CPU. The PIC then waits for the CPU to assert INTA. When that happens, the PIC selects the pending interrupt with the highest priority (which depends on the pin number) and puts its interrupt vector on the data bus. I have omitted some timing, but I think it doesn't matter for now.
Here are questions:
How does the device that caused the interrupt know that its interrupt request was accepted and that it can deassert the request? I read about the 8259, but I didn't find this.
Is acknowledging the device whose interrupt was accepted done in the ISR?
Sorry for my English.
The best reference is the original Intel doc, available here: https://pdos.csail.mit.edu/6.828/2012/readings/hardware/8259A.pdf It has full details of these modes, how the device operates, and how to program the device.
Caveat: I'm a bit rusty as I haven't programmed the 8259 in many years, but I'll take a shot at explaining things, per your request.
After an interrupting device, connected to an IRR ["interrupt request register"] pin, has asserted an interrupt request, the 8259 will convey this to the CPU by asserting INTR and then placing the vector on the bus during the three INTA cycles generated by the CPU.
After a given device has asserted IRR, the 8259's IS ["in-service"] register is or'ed with a mask of the IRR pin number. The IS is a priority select. While the IS bit is set, other interrupting devices of lower priority [or the original one] will not cause an INTR/INTA cycle to the CPU. The IS bit must be cleared first. These interrupts remain "pending".
The IS can be cleared by an EOI (end-of-interrupt) operation. There are multiple EOI modes that can be programmed. The EOI can be generated by the 8259 in AEOI mode. In other modes, the EOI is generated manually by the ISR by sending a command to the 8259.
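As a concrete illustration, here is a minimal sketch of the manual (non-AEOI) EOI, assuming the standard PC command ports 0x20 (master) and 0xA0 (slave) and an outb() port-I/O helper; the helper and function name are assumptions, not part of the answer:

    #include <stdint.h>

    extern void outb(uint16_t port, uint8_t value);   // assumed port-I/O helper

    #define PIC1_CMD 0x20    // master 8259 command port
    #define PIC2_CMD 0xA0    // slave 8259 command port
    #define PIC_EOI  0x20    // OCW2: non-specific EOI

    static void pic_send_eoi(uint8_t irq)
    {
        if (irq >= 8)                  // came in through the slave 8259
            outb(PIC2_CMD, PIC_EOI);
        outb(PIC1_CMD, PIC_EOI);       // master always gets an EOI
    }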
The EOI action is all about allowing other devices to cause interrupts while the ISR is processing the current one. The EOI does not clear the interrupting device.
Clearing the interrupting device must be done by the ISR using whatever device specific register the device has for that purpose. Usually, this a "pending interrupt" register [can be 1 bit wide]. Most H/W uses two interrupt related registers and the other one is an "interrupt enable" register.
With level triggered interrupts, if the ISR does not clear the device, when the ISR does issue the EOI command to the 8259, the 8259 will [try to] reinterrupt the CPU using the vector for the same device for the same condition. The CPU will probably be reinterrupted as soon as it issues an sti or iret instruction. Thus, an ISR routine must take care to process things in proper sequence.
Consider an example. We have a video controller that has four sources for interrupts:
HSTART -- start of horizontal line
HEND -- end of horizontal line [start of horizontal blanking interval]
VSTART -- start of new video field/frame
VEND -- end of video field/frame [start of vertical blanking interval]
The controller presents these as a bit mask in its own special interrupt source register, which we'll call vidintr_pend. We'll call the interrupt enable register vidintr_enable.
The video controller will use only one 8259 IRR pin. It is the responsibility of the CPU's video ISR to interrogate the vidintr_pend register and decide what to do.
The video controller will assert its IRR pin as long as vidintr_pend is non-zero. Since we're level triggered, the CPU may be re-interrupted.
Here is a sample ISR routine to go with this:
// video_init -- initialize controller
void
video_init(void)
{

    write_port(...);
    write_port(...);
    write_port(...);
    ...

    // we only care about the vertical interrupts, not the horizontal ones
    write_port(vidintr_enable,VSTART | VEND);
}

// video_stop -- stop controller
void
video_stop(void)
{

    // stop all interrupt sources
    write_port(vidintr_enable,0);

    write_port(...);
    write_port(...);
    write_port(...);
    ...
}
// vidisr_process -- process video interrupts
void
vidisr_process(void)
{
    u32 pendmsk;

    // NOTE: we loop because controller may assert a new, different interrupt
    // while we're processing a given one -- we don't want to exit if we _know_
    // we'll be [almost] immediately re-entered
    while (1) {
        pendmsk = port_read(vidintr_pend);
        if (pendmsk == 0)
            break;

        // the normal way to clear on most H/W is a writeback
        // writing a 1 to a given bit clears the interrupt source
        // writing a 0 does nothing
        // NOTE: with this method, we can _never_ have a race condition where
        // we lose an interrupt
        port_write(vidintr_pend,pendmsk);

        if (pendmsk & HSTART)
            ...
        if (pendmsk & HEND)
            ...
        if (pendmsk & VSTART)
            ...
        if (pendmsk & VEND)
            ...
    }
}
// vidisr_simple -- simple video ISR routine
void
vidisr_simple(void)
{
    // NOTE: interrupt state has been pre-saved for us ...

    // process our interrupt sources
    vidisr_process();

    // allow other devices to cause interrupts
    port_write(8259,SEND_NON_SPECIFIC_EOI);

    // return from interrupt by popping interrupt state
    iret();
}
// vidisr_nested -- video ISR routine that allows nested interrupts
void
vidisr_nested(void)
{
    // NOTE: interrupt state has been pre-saved for us ...

    // allow other devices to cause interrupts
    port_write(8259,SEND_NON_SPECIFIC_EOI);

    // allow us to receive them
    sti();

    // process our interrupt sources
    // this can be interrupted by another source or another device
    vidisr_process();

    // return from interrupt by popping interrupt state
    iret();
}
UPDATE:
Your followup questions:
Why do you use an interrupt disable in the video controller register instead of masking the 8259's interrupt enable bit?
When you execute the vidisr_nested(void) function, it will enable nesting of the same interrupt. Is that true? And is that what you want?
To answer (1), we should do both but not necessarily in the same place. They seem similar, but work in slightly different ways.
We change the video controller registers in the video controller driver [as it's the only place that "understands" the video controller's registers].
The video controller actually asserts the 8259's IRR pin from: IRR = ((vidintr_enable & vidintr_pend) != 0). If we never set vidintr_enable (i.e. it's all zeroes), then we can operate the device in a "polled" [non-interrupt] mode.
The 8259 interrupt enable register works similarly, but it masks against which IRRs [asserted or not] may interrupt the CPU. The device vidintr_enable controls whether it will assert IRR or not.
In the example video driver, the init routine enables the vertical interrupts, but not the horizontal. Only the vertical interrupts will generate a call to the ISR, but the ISR can/will also process the horizontal ones [as polled bits].
Changing the 8259 interrupt enable mask should be done in a place that understands the interrupt topology of the entire system. This is usually done by the containing OS. That's because the OS knows about the other devices and can make the best choice.
Herein, "containing OS" could be a full OS like Linux [of which I'm most familiar]. Or, it could just be an R/T executive [or boot rom--I've written a few] that has some common device handling framework with "helper" functions for the device drivers.
For example, it is usual for every device to get its own IRR pin, but it is possible, with level triggering, for two different devices to share an IRR: (e.g.) IRR[0] = devA_IRROUT | devB_IRROUT, either through an OR gate [or a wired OR(?)].
It's also possible that the device is attached to a "nested" or "cascaded" interrupt controller. IIRC [consult the document], it is possible to have a "master" 8259 and [up to] 8 "slave" 8259s. Each slave 8259 connects to an IRR pin of the master. Then, connect devices to the slave IRR pins. For a fully loaded system, you can have 64 interrupting devices. And, the master can have slave 8259s on some IRR pins and real devices on others [a "hybrid" topology].
Usually, only the OS knows enough to deal with this. In a real system, a device driver probably wouldn't touch the 8259 at all. The non-specific EOI would probably have been sent to the 8259 before entering the device's ISR. And, the OS would handle the full "save state" and "restore state" and the driver just handles device specific actions.
Also, under an OS, the OS will call the "init" and "stop" routines. The general OS routines for this will handle the 8259 and call the device specific ones.
For example, under Linux [or almost any other OS or R/T executive], the interrupt sequence goes something like this:
- CPU hardware actions [atomic]:
  - push %esp and flags register [has CPU interrupt enable flag] to stack
  - clear CPU interrupt enable flag (e.g. implied cli)
  - jump within interrupt vector table
- OS general ISR (preset within IVT):
  - push all remaining registers to stack
  - send non-specific EOI to 8259(s)
  - call device-specific ISR (NOTE: CPU interrupt flag still clear)
  - pop regs
  - iret
To answer (2), yes, you are correct. It would probably interrupt immediately, and might nest (infinitely :-).
The simple ISR version is more efficient and preferable if the actions taken in the ISR are short, quick, and simple (e.g. just output to a few data ports).
If the required actions take a relatively long time (e.g. do intensive calculations, or write to a large number of ports or memory locations), the nested version is preferred to prevent other devices from having entry to their ISRs delayed excessively.
However, some time critical devices [like a video controller] need to use the simple model, preventing interruption by other devices, to guaranteed that they can complete in a finite, deterministic time.
For example, the video ISR handling of VEND might program the device for the next/upcoming field/frame and must complete this within the vertical blanking interval. It has to do this even if it means "excessive" delay of other ISRs.
Note that the ISR was "racing" to complete before the end of the blanking interval. Not the best design. I've had to program such a controller/device. For rev 2, we changed the design so the device registers were double-buffered.
That meant that we could set up the registers for frame 1 anytime during the [much longer] frame 0 display period. At VSTART for frame 1, the video hardware would instantly clock-in/save the double-buffered values, and the CPU could then setup for frame 2 anytime during the display of frame 1. And so on ...
With the modified design, the video driver removed the device setup from the ISR entirely. It was now handled at OS task level.
In the driver example, I've adjusted the sequencing a bit to prevent infinite stacking, and added some additional information based upon my question (1) answer. That is, it shows [crudely] what to do with or without an OS.
// video controller driver
//
// for illustration purposes, STANDALONE means a very simple software system
//
// if it's _not_ defined, we assume the ISR is called from an OS general ISR
// that handles 8259 interactions
//
// if it's _defined_, we're showing [crudely] what needs to be done
//
// NOTE: although this is largely C code, it's also pseudo-code in places

// video_init -- initialize controller
void
video_init(void)
{

    write_port(...);
    write_port(...);
    write_port(...);
    ...

#ifdef STANDALONE
    write_port(8259_interrupt_enable |= VIDEO_IRR_PIN);
#endif

    // we only care about the vertical interrupts, not the horizontal ones
    write_port(vidintr_enable,VSTART | VEND);
}

// video_stop -- stop controller
void
video_stop(void)
{

    // stop all interrupt sources
    write_port(vidintr_enable,0);

#ifdef STANDALONE
    write_port(8259_interrupt_enable &= ~VIDEO_IRR_PIN);
#endif

    write_port(...);
    write_port(...);
    write_port(...);
    ...
}
// vidisr_pendmsk -- get video controller pending mask (and clear it)
u32
vidisr_pendmsk(void)
{
    u32 pendmsk;

    pendmsk = port_read(vidintr_pend);

    // the normal way to clear on most H/W is a writeback
    // writing a 1 to a given bit clears the interrupt source
    // writing a 0 does nothing
    // NOTE: with this method, we can _never_ have a race condition where
    // we lose an interrupt
    port_write(vidintr_pend,pendmsk);

    return pendmsk;
}
// vidisr_process -- process video interrupts
void
vidisr_process(u32 pendmsk)
{

    // NOTE: we loop because controller may assert a new, different interrupt
    // while we're processing a given one -- we don't want to exit if we _know_
    // we'll be [almost] immediately re-entered
    while (1) {
        if (pendmsk == 0)
            break;

        if (pendmsk & HSTART)
            ...
        if (pendmsk & HEND)
            ...
        if (pendmsk & VSTART)
            ...
        if (pendmsk & VEND)
            ...

        // re-read _and_ clear any sources that arrived while we were processing
        pendmsk = vidisr_pendmsk();
    }
}
// vidisr_simple -- simple video ISR routine
void
vidisr_simple(void)
{
    u32 pendmsk;

    // NOTE: interrupt state has been pre-saved for us ...

    pendmsk = vidisr_pendmsk();

    // process our interrupt sources
    vidisr_process(pendmsk);

    // allow other devices to cause interrupts
#ifdef STANDALONE
    port_write(8259,SEND_NON_SPECIFIC_EOI);
#endif

    // return from interrupt by popping interrupt state
#ifdef STANDALONE
    pop_regs();
    iret();
#endif
}
// vidisr_nested -- video ISR routine that allows nested interrupts
void
vidisr_nested(void)
{
    u32 pendmsk;

    // NOTE: interrupt state has been pre-saved for us ...

    // get device pending mask -- do this _before_ [optional] EOI and the sti
    // to prevent immediate stacked interrupts
    pendmsk = vidisr_pendmsk();

    // allow other devices to cause interrupts
#ifdef STANDALONE
    port_write(8259,SEND_NON_SPECIFIC_EOI);
#endif

    // allow us to receive them
    // NOTE: with or without OS, we can't stack until _after_ this
    sti();

    // process our interrupt sources
    // this can be interrupted by another source or another device
    vidisr_process(pendmsk);

    // return from interrupt by popping interrupt state
#ifdef STANDALONE
    pop_regs();
    iret();
#endif
}
BTW, I'm the author of the Linux irqtune program.
I wrote it back in the mid 90's. It's of lesser use now, and probably doesn't work on modern systems, but the FAQ I wrote has a great deal of information about interrupt device priorities. The program itself did a simple 8259 manipulation.
An online copy is available here: http://archive.debian.org/debian/dists/Debian-1.1/main/disks-i386/SpecialKernels/irqtune/README.html There's probably source code somewhere in this archive.
That's the version 0.2 doc. I haven't found an online copy of version 0.6 which has better explanation, so I've put up a text version here: http://pastebin.com/Ut6nCgL6
Side note: The "where to get" information in the FAQ [and email address] are no longer valid. And, I didn't understand the full impact of "spam" until I posted the FAQ and starting getting [tons of] it ;-)
And, irqtune even drew Linus' ire. Not because it didn't work but because it did: https://lkml.org/lkml/1996/8/23/19 IMO, if he had read the FAQ, he would have understood why [as what irqtune did is standard stuff to R/T guys].
UPDATE #2
Your new questions:
I think that you are missing a destination address in write_port(8259_interrupt_enable &= ~VIDEO_IRR_PIN). Isn't that so?
Is the IRR register read-only or read/write? If the latter, what is the purpose of writing into it?
Are interrupt vectors stored as logical addresses or physical addresses?
To answer question (3): No, not really [even if it seemed so]. The code snippet was "pseudo code" [not pure C code], as I mentioned in a code comment at the top, so technically speaking, I'm covered. However, to make it more clear, here is what the [closer to] real C code would look like:
// the system must know _which_ IRR H/W pin the video controller is connected to
// so we _hardwire_ it here
#define VIDEO_IRR_PIN_NUMBER 3 // just an example
#define VIDEO_IMR_MASK (1 << VIDEO_IRR_PIN_NUMBER)
// video_enable -- enable/disable video controller in 8259
void
video_enable(int enable)
{
    u32 val;

    // NOTE: we're reading/writing the _enable_ register, not the IRR [which
    // software can _not_ modify or read]
    val = read_port(8259_interrupt_enable);

    if (enable)
        val |= VIDEO_IMR_MASK;
    else
        val &= ~VIDEO_IMR_MASK;

    write_port(8259_interrupt_enable,val);
}
Now, in video_init, replace the code inside STANDALONE with video_enable(1), and in video_stop replace it with video_enable(0).
As to question (4): We weren't really writing to the IRR, even though the symbol had _IRR_ in it. As mentioned in the code comments above, we were writing to the 8259 interrupt enable register which is really the "interrupt mask register" or IMR in the documentation. The IMR can be read from and written to by using OCW1 (see doc).
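To make that concrete, here is a minimal sketch of driving OCW1, assuming the standard PC data ports 0x21 (master IMR) and 0xA1 (slave IMR) and inb()/outb() port-I/O helpers; the helpers and the function name are assumptions for illustration:

    #include <stdint.h>

    extern uint8_t inb(uint16_t port);                 // assumed port-I/O helpers
    extern void    outb(uint16_t port, uint8_t value);

    #define PIC1_DATA 0x21   // master IMR (OCW1)
    #define PIC2_DATA 0xA1   // slave IMR (OCW1)

    static void pic_set_mask(uint8_t irq, int masked)
    {
        uint16_t port = (irq < 8) ? PIC1_DATA : PIC2_DATA;
        uint8_t  bit  = (uint8_t)(1u << (irq & 7));
        uint8_t  imr  = inb(port);                     // read current IMR via OCW1

        outb(port, masked ? (uint8_t)(imr | bit)       // 1 = masked (disabled)
                          : (uint8_t)(imr & ~bit));    // 0 = unmasked (enabled)
    }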
There is no way for software to access the IRR at all. (i.e.) There is no port in the 8259 to read or write the IRR value. The IRR is completely internal to the 8259.
There is a one-to-one correspondence between IRR pin numbers [0-7] and IMR bit numbers (e.g. to enable for IRR(0), set IMR bit 0), but the software has to know which bit to set.
Because the video controller is physically connected to a given IRR pin, it is always the same for a given PC board. The software [on older non-PnP systems] can't probe for this. Even on newer systems, the 8259 knows nothing of PnP, so it's still hardwired. The video controller driver programmer must just "know" what IRR pin is being used [by consulting the "spec sheet" or controller "architecture reference manual"].
To answer question (5): First consider what the 8259 does.
When the 8259 is initialized, the ICW2 ("initialization command word 2") gets set by the OS driver. This defines a portion of the interrupt vector number the 8259 will present during the INTR/INTA cycle. In ICW2, the most significant 5 bits are marked T7-T3.
When an interrupt occurs, these bits are combined with the IRR pin number of the interrupting device [which is 3 bits wide] to form an 8 bit interrupt vector number: T7,T6,T5,T4,T3|I2,I1,I0
For example, if we put 0xD0 into ICW2, with our video controller using IRR pin 3, we'd have 1,1,0,1,0|0,1,1 or 0xD3 as the interrupt vector number that the 8259 will send to the CPU.
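A tiny sketch of that calculation (the helper name is just illustrative):

    // The 8259 forms the vector from ICW2's top five bits (T7-T3) and the
    // 3-bit IRR pin number of the interrupting device.
    unsigned int make_vector(unsigned int icw2, unsigned int irr_pin)
    {
        return (icw2 & 0xF8u) | (irr_pin & 0x07u);
    }
    // make_vector(0xD0, 3) == 0xD3, matching the example above.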
This is just a vector number [0x00-0xFF] as the 8259 knows nothing of memory addresses. It is the CPU that takes this vector number and, using the CPU's "interrupt vector table" [IVT], uses the vector number as an index into the IVT to properly vector the interrupt to an ISR routine.
On 80386 and later architectures, the IVT is actually called an IDT ("interrupt descriptor table"). For details, see the "System Programming Guide", chapter 6: http://download.intel.com/design/processor/manuals/253668.pdf
As to whether the resulting ISR address from the IVT/IDT is physical or logical: that depends on the processor mode (e.g. real mode, protected mode, protected mode with virtual addressing enabled).
In a sense, all such addresses are always logical. And, all logical addresses undergo a translation to physical on each CPU instruction. Whether the translation is one-to-one [MMU not enabled or page tables have one-to-one mapping] is a question for "How has the OS set things up?"
Strictly speaking, there is no such thing as "acknowledging an interrupt to the device".
The thing an ISR should do is handle the interrupt condition. For example, if the UART requested an interrupt because it has incoming data, then you should read that incoming data. After that read operation, the UART no longer has incoming data, so it naturally stops asserting the IRQ line. Alternatively, if your program no longer needs to read the data and wants to stop the communication, it would just mask the receiver interrupt via the UART registers, and, once again, the UART will stop asserting the IRQ line. If the device just wanted to signal you some state change, then you should read the new state, and the device will know that you have the up-to-date state and will release the IRQ line.
So, in short: there is usually no device-specific acknowledge procedure. All you need to do is service the interrupt condition, after which that condition will disappear, voiding the interrupt request.
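As an illustration of "servicing the condition is the acknowledgement", here is a minimal sketch with purely hypothetical memory-mapped UART register names and addresses:

    // Hypothetical registers -- names, addresses and the helper are illustrative only.
    #define UART_STATUS   (*(volatile unsigned char *)0x4000F000u)
    #define UART_RX_DATA  (*(volatile unsigned char *)0x4000F004u)
    #define UART_RX_READY 0x01u

    extern void buffer_push(char c);   // assumed application helper

    void uart_rx_isr(void)
    {
        // Reading the data register removes the "data available" condition,
        // so once the loop drains the FIFO the UART deasserts its IRQ line.
        while (UART_STATUS & UART_RX_READY)
            buffer_push((char)UART_RX_DATA);
    }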
I want to know how interrupt handling works from the point where a device raises an interrupt. I know interrupt handling in bits and pieces and would like a clear end-to-end picture of it. Let me put across what little I know about interrupt handling.
Suppose an FPGA device raises an interrupt over an electrical line when it receives some data. The device driver for this FPGA device has already registered an interrupt handler using the request_irq function.
So the FPGA device now has an IRQ line (the one passed to request_irq). Using this IRQ line the device signals the Generic Interrupt Controller (GIC); the GIC does a many-to-one translation of the IRQ lines and signals the CPU core, which then runs minimal code like the following:
IRQ_handler
    SUB   lr, lr, #4        ; modify LR
    SRSFD #0x12!            ; store SPSR and LR to IRQ mode stack
    PUSH  {r0-r3, r12}      ; store AAPCS registers on to the IRQ mode stack
    BL    IRQ_handler_to_specific_device
    POP   {r0-r3, r12}      ; restore registers
    RFEFD sp!               ; and return from the exception using pre-modified LR
IRQ_handler_to_specific_device is nothing but what we registered in the device driver using the request_irq() call.
I still don't understand how the CPU core comes to know the interrupt source (i.e. which device the interrupt is coming from).
Also, what is the role of calls like do_irq, and how do shared interrupts work?
I need some help understanding the end-to-end picture of how interrupts are handled on the ARM architecture.
The GIC is divided into two sections. The first is called the distributor. This is global to the system. It has several interrupt sources physically routed to it, although it may be within an SoC package. The second section is replicated per CPU and is called the CPU interface. The distributor has logic on how to distribute the shared peripheral interrupts, or SPIs. These are the type of interrupt your question is asking about. They are global hardware interrupts.
In the context of Linux, this is implemented in irq-gic.c. There is some documentation in gic.txt. Of specific interest,
reg : Specifies base physical address(s) and size of the GIC registers. The
first region is the GIC distributor register base and size. The 2nd region is
the GIC cpu interface register base and size.
The distributor must be accessed globally, so care must be taken to manage its registers. The CPU interface has the same physical address for each CPU, but each CPU has a separate implementation. The distributor can be set up to route interrupts to specific CPUs (including multiple CPUs); see gic_set_affinity() for example. It is also possible for any CPU to handle the interrupt. The ACK register will allocate the IRQ; the first CPU to read it gets the interrupt. If multiple IRQs are pending and there are two ACK reads from different CPUs, each will get a different interrupt. A third CPU reading would get a spurious IRQ.
As well, each CPU interface has some private interrupt sources, that are used for CPU-to-CPU interrupts as well as private timers and the like. But I believe the focus of the question is how a physical peripheral (unique to a system) gets routed to a CPU in an SMP system.
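To illustrate the per-CPU acknowledge/EOI handshake mentioned above, here is a minimal sketch, assuming a GICv1/v2 CPU interface with the standard GICC_IAR (0x0C) and GICC_EOIR (0x10) offsets; the base address and handler name are illustrative only:

    #define GIC_CPU_BASE 0x2C002000u                               // example address only
    #define GICC_IAR     (*(volatile unsigned int *)(GIC_CPU_BASE + 0x0Cu))
    #define GICC_EOIR    (*(volatile unsigned int *)(GIC_CPU_BASE + 0x10u))

    extern void handle_irq(unsigned int irq);   // e.g. dispatch to the handler from request_irq()

    void gic_dispatch(void)
    {
        unsigned int iar = GICC_IAR;            // acknowledge: returns the interrupt ID
        unsigned int irq = iar & 0x3FFu;        // the ID lives in the low 10 bits

        if (irq < 1020)                         // 1020-1023 are spurious/special IDs
            handle_irq(irq);

        GICC_EOIR = iar;                        // signal end of interrupt for this ID
    }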
I need to send an NMI on the system I am working on. I want to test a few things which I have implemented. Is there any Windows driver routine which allows us to do that? I think I can write to a port using __outword. Is there any other way to do it?
I have one more question. Are there any specific scenarios which cause an NMI? (However, I don't want the system to BSOD or triple fault.)
Thanks
From Intel's Software Development Manual: System Programming Guide:
The nonmaskable interrupt (NMI) can be generated in either of two ways:
External hardware asserts the NMI pin.
The processor receives a message on the system bus (Pentium 4, Intel Core Duo, Intel Core 2, Intel Atom, and Intel Xeon processors) or the APIC serial bus (P6 family and Pentium processors) with a delivery mode NMI.
and
It is possible to issue a maskable hardware interrupt (through the INTR pin) to vector 2 to invoke the NMI interrupt handler; however, this interrupt will not truly be an NMI interrupt. A true NMI interrupt that activates the processor's NMI-handling hardware can only be delivered through one of the mechanisms listed above.
So, if all you want to do is trigger the NMI handler, you can simply use int $2 (int 02h in Intel syntax). But, if you need to ensure that it is not masked, you will either need external hardware to trigger it, or to use the APIC.
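For the simple case, a sketch of invoking vector 2 from C with GCC-style inline assembly (remember that, per the quote above, this runs the handler but is not a true NMI):

    static inline void raise_vector2(void)
    {
        __asm__ __volatile__("int $2");   // software interrupt to the NMI vector
    }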
If you choose to use the APIC to send an NMI, the easiest way to do it is to send an inter-processor interrupt. To do this, you will need access to the local APIC's registers, which are mapped into physical memory, by default at the address 0xFEE00000, although that can be changed. You will need to find the physical page containing the APIC's registers and map it into virtual memory so that you can access them.
In order to send an IPI, you need to write into the interrupt command register (ICR). The ICR's low 32 bits are located at offset 0x300 within the APIC's page, and the upper 32 bits are at 0x310. To send the NMI, you need to:
Get the APIC ID of the processor you want to send the NMI to. If you want to send it to the processor you are running on, this is simple since you can read it from the APIC at 0x20 in bits 24-31.
Write the APIC ID into the destination field, bits 24-31 of the high ICR register.
Write the value 0x4400 into the low ICR register. Bits 8-10 of this value indicate that you are sending an NMI, and bit 14 indicates that you are using the assert trigger mode.
When writing to an APIC register, you must write a full 32 bit value. Also, bits 13, 16-17, and 20-55 in the ICR are reserved, so you should not change their values. You also must write to the high bits of the ICR before the low bits, since the IPI is triggered by the write to the low bits.
Here is an example of sending an NMI to the current processor in C.
#define APIC_ID_OFFSET  0x20
#define ICR_LOW_OFFSET  0x300
#define ICR_HIGH_OFFSET 0x310

// Convenience macro used to access APIC registers (volatile, since they are memory-mapped I/O)
#define APIC_REG(offset) (*(volatile unsigned int *)((char *)apicAddress + (offset)))

void *apicAddress; // This should contain the virtual address that the APIC registers are mapped to

void send_nmi_to_self(void) // illustrative wrapper
{
    // Get the current APIC ID. Leave it in the high 8 bits since that is where it needs to be written anyway
    unsigned int apicID = APIC_REG(APIC_ID_OFFSET) & 0xFF000000;

    unsigned int high = APIC_REG(ICR_HIGH_OFFSET) & 0x00FFFFFF;
    high |= apicID;

    // Preserve the reserved bits, then set delivery mode = NMI (bits 8-10) and assert (bit 14)
    unsigned int low = APIC_REG(ICR_LOW_OFFSET) & 0xFFF32000;
    low |= 0x4400;

    // Write the high half first -- the write to the low half is what sends the IPI
    APIC_REG(ICR_HIGH_OFFSET) = high;
    APIC_REG(ICR_LOW_OFFSET)  = low;
}