Why does ARM Linux switch to SVC mode to handle exceptions?

The ARM documentation reads:
This is because a fresh interrupt could occur at any time, which would cause
the core to store the return address of the new interrupt and overwrite the original interrupt.
When the original interrupt attempts to return to the main program, it will cause the system to
fail. The nested handler must change into an alternative kernel mode before re-enabling
interrupts in order to prevent this.
According to the context, the main reason is that a newly arriving IRQ would overwrite R14 (LR_irq), so the first IRQ could not return to the main program.
As I understand it, to solve this problem I only need to push R14 (LR_irq) and SPSR_irq onto the IRQ stack (R13, SP_irq) before the next IRQ is raised.
The nested handler would then have no need to switch to an alternative mode before re-enabling interrupts.
Thank you!

Consider this situation:
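When you are in an IRQ handler and make a C function call, the call instruction (BL) writes its return address into LR, which in IRQ mode is LR_irq. If a nested IRQ arrives at that point, the core overwrites LR_irq with its own return address, and the function's return address is lost. Pushing LR_irq at handler entry does not help, because the value at risk is the one written later by the call.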
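To make the situation concrete, here is a rough sketch in plain C that models only the two banked link registers. Everything in it (struct toy_cpu, toy_take_irq, toy_bl, the addresses, the simplified "pc + 4" return offset) is made up for illustration and is not real ARM or kernel code.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model, not real ARM state: only the registers this argument needs. */
struct toy_cpu {
    uint32_t pc;
    uint32_t lr_irq;  /* banked R14 of IRQ mode */
    uint32_t lr_svc;  /* banked R14 of SVC mode */
    int in_svc;       /* 1 = SVC mode, 0 = IRQ mode */
};

/* Exception entry: the core always writes the return address to LR_irq. */
static void toy_take_irq(struct toy_cpu *cpu)
{
    assert(cpu);
    cpu->lr_irq = cpu->pc + 4;  /* simplified return-address offset */
    cpu->pc = 0x18;             /* IRQ vector */
    cpu->in_svc = 0;            /* core enters IRQ mode */
}

/* BL: writes the return address to the LR of the *current* mode. */
static void toy_bl(struct toy_cpu *cpu, uint32_t target)
{
    uint32_t ret = cpu->pc + 4;

    if (cpu->in_svc)
        cpu->lr_svc = ret;
    else
        cpu->lr_irq = ret;
    cpu->pc = target;
}

static uint32_t toy_current_lr(const struct toy_cpu *cpu)
{
    return cpu->in_svc ? cpu->lr_svc : cpu->lr_irq;
}

/* Returns 1 if the C function's return address survives a nested IRQ. */
static int toy_call_survives_nested_irq(int switch_to_svc)
{
    struct toy_cpu cpu = { .pc = 0x1000 };
    uint32_t expected;

    toy_take_irq(&cpu);          /* first IRQ; LR_irq/SPSR_irq pushed (not modelled) */
    cpu.in_svc = switch_to_svc;  /* optionally change mode before re-enabling IRQs */
    toy_bl(&cpu, 0x8000);        /* call the C handler */
    expected = toy_current_lr(&cpu);

    toy_take_irq(&cpu);          /* nested IRQ arrives inside the C handler */
    cpu.in_svc = switch_to_svc;  /* nested handler returns to the interrupted mode */

    return toy_current_lr(&cpu) == expected;
}
```

In this model, toy_call_survives_nested_irq(0) reports that the return address is lost when the handler stays in IRQ mode, while toy_call_survives_nested_irq(1) reports that it survives, because the call then uses LR_svc and the nested exception entry only touches LR_irq.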

Related

Nested Interrupt Handling in ARM

Below is the flow mentioned in the Cortex A Prog Guide, I have a few questions on the text.
A reentrant interrupt handler must therefore take the following steps after an IRQ exception is raised and control is transferred to the interrupt handler in the way previously described.
• The interrupt handler saves the context of the interrupted program (that is, it pushes onto the alternative kernel mode stack any registers which will be corrupted by the handler, including the return address and SPSR_IRQ).
Q> What is the alternative kernel mode stack here ?
• It determines which interrupt source needs to be processed and clears the source in the external hardware (preventing it from immediately triggering another interrupt).
• The interrupt handler changes the processor to the other kernel mode, leaving the CPSR I bit set (interrupts are still disabled).
Q> From IRQ to SVC mode with CPSR.I =1 . Right ?
• The interrupt handler saves the exception return address on the stack (a stack for the new mode, located in kernel memory) and re-enables interrupts.
Q> Are there 2 stacks here ?
• It calls the appropriate C handler for the original interrupt (interrupts are still disabled).
• Upon completion, the interrupt handler disables IRQ and pops the exception return address from the stack.
• It restores the context of the interrupted program directly from the alternative kernel mode stack. This includes restoring the PC, and the CPSR which switches back to the previous execution mode.
Q> How is the nesting done here ? I am bit confused here...
1) Up to you, really. The requirement is that it is one that cannot be asynchronously invoked. So you can use System mode stack, which is shared with User mode - with some interesting implications. Or you can use the Supervisor mode stack, as long as you always properly store all context before executing an SVC instruction.
2) Yes.
3) Yes, you store the context on a stack for whichever mode picked in (1).
4) While executing in the alternative mode, you re-enable interrupts (as your text states). At this point, the processor will now react to new interrupts signalled to the core - generally ones of a higher priority as configured in your interrupt controller.

How does not disabling local interrupts in interrupt handler(which acquire lock) could lead to double-acquire deadlock?

In the book Linux Kernel Development (Robert Love), it is mentioned that:
we must disable local interrupts before obtaining spinlock in
interrupt handler. Otherwise it is possible for an interrupt handler
to interrupt kernel code while the lock is held and attempt to
re-acquire the lock. Which finally can lead to double-acquire
deadlock.
Now my doubt is:
In general, doesn't do_IRQ() disable local interrupts?
And if the lock is acquired, the preempt_count variable is not zero, which means no other handler should get a chance to run, as the kernel is not preempt-safe at that point. So how can another interrupt handler run in this situation?
First, do_IRQ() itself doesn't disable local interrupts; the assembly-language interrupt entry code does. Later, before running the handler registered with request_irq(), handle_IRQ_event() checks the flags passed to request_irq() against IRQF_DISABLED to decide whether to enable local interrupts while the handler runs. So the answer to your first question depends on the flags you pass to request_irq().
Second, preempt_count only governs kernel preemption in process context; it does not stop interrupts. On a UP system, the only way to keep interrupt handlers from running is to disable local interrupts, for example with local_irq_disable(). A preempt_count of zero means the kernel can safely perform a process switch; a non-zero value means it cannot.
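As a rough illustration of the double-acquire case from the book, here is a small sketch in plain C. It is not kernel code: struct toy_state and the toy_ functions are invented names, and "would spin forever" is simply reported as a return value instead of actually spinning.

```c
#include <assert.h>

/* Toy model of one CPU: an interrupt-enable flag and one spinlock. */
struct toy_state {
    int irqs_enabled;
    int lock_held;
};

/* The interrupt handler wants the same lock. Returns 1 if it would spin
 * forever, i.e. the lock is held by the very code it interrupted. */
static int toy_handler_would_deadlock(const struct toy_state *s)
{
    assert(s);
    return s->lock_held;
}

/* Process-context code takes the lock, then the device raises an interrupt
 * in the middle of the critical section. Returns 1 on deadlock. */
static int toy_interrupt_during_critical_section(int disable_irqs_first)
{
    struct toy_state s = { .irqs_enabled = 1, .lock_held = 0 };
    int deadlock = 0;

    if (disable_irqs_first)
        s.irqs_enabled = 0;  /* what the _irq/_irqsave lock variants do */
    s.lock_held = 1;

    /* Interrupt is signalled here; it is only delivered if enabled. */
    if (s.irqs_enabled)
        deadlock = toy_handler_would_deadlock(&s);

    s.lock_held = 0;         /* never reached on real hardware if deadlocked */
    return deadlock;
}
```

With interrupts left enabled the handler runs on top of the lock holder and can never make progress; with them disabled the interrupt stays pending until the lock is released. Note that preempt_count plays no part in either outcome.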

For a shared interrupt line how do I find which interrupt handler to use?

For a shared interrupt line, I can have several interrupt handlers. The kernel will sequentially invoke all the handlers for that particular shared line.
As far as I know, each handler, when invoked, informs the kernel whether it was the correct handler to be invoked or not.
My question is how this is determined. Does the handler check a memory-mapped register that reports the status of its device, or is there some other hardware mechanism? How does the handler know whether its device is indeed the one that issued the interrupt?
Is this information relayed through the interrupt controller that is between the devices and the processor interrupt line ??
The kernel will sequentially invoke all the handlers for that particular shared line.
Exactly. Say Dev1 and Dev2 share IRQ10. When an interrupt is generated on IRQ10, all ISRs registered on this line are invoked one by one.
In our scenario, say Dev2 generated the interrupt. If Dev1's ISR was registered first, Dev1's ISR is called first. That ISR checks its device's interrupt status register. No interrupt bit is set (because Dev2 raised the interrupt), so the ISR can conclude that the interrupt did not come from Dev1 and should return IRQ_NONE to the kernel, which means "I did not handle that interrupt". The kernel then continues to the next ISR (Dev2's), which verifies that its device did generate the interrupt, handles it, and returns IRQ_HANDLED, which means "I handled this one".
See the return values IRQ_NONE/IRQ_HANDLED for more information.
How does the handler know that the corresponding device issued the interrupt or not ?
By reading the Interrupt status register only.
Is this information relayed through the interrupt controller that is between the devices and the processor interrupt line ??
I'm not sure about this. But the OS will take care of calling ISRs based on the return values from ISR.
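The Dev1/Dev2 walk-through above can be sketched in plain C. This is only a userspace model: the TOY_ names, struct toy_dev and the status bit are made up, standing in for the kernel's IRQ_NONE/IRQ_HANDLED return values and for a real device's memory-mapped status register.

```c
#include <assert.h>

/* Stand-ins for the kernel's IRQ_NONE / IRQ_HANDLED return values. */
enum toy_irqreturn { TOY_IRQ_NONE = 0, TOY_IRQ_HANDLED = 1 };

#define TOY_STATUS_IRQ_PENDING 0x1u

/* Hypothetical device: 'status' models its interrupt status register. */
struct toy_dev {
    const char *name;
    unsigned status;
    int handled_count;
};

/* One ISR per device: check our own status register before claiming. */
static enum toy_irqreturn toy_isr(struct toy_dev *dev)
{
    assert(dev);
    if (!(dev->status & TOY_STATUS_IRQ_PENDING))
        return TOY_IRQ_NONE;                 /* not ours */

    dev->status &= ~TOY_STATUS_IRQ_PENDING;  /* acknowledge in the device */
    dev->handled_count++;
    return TOY_IRQ_HANDLED;
}

/* Call every ISR registered on the shared line, in registration order.
 * Returns how many of them claimed the interrupt. */
static int toy_dispatch_shared(struct toy_dev **devs, int n)
{
    int handled = 0;

    for (int i = 0; i < n; i++)
        if (toy_isr(devs[i]) == TOY_IRQ_HANDLED)
            handled++;
    return handled;
}
```

If Dev1 has a clear status register and Dev2 has its pending bit set, dispatching over {Dev1, Dev2} calls both ISRs; only Dev2's returns TOY_IRQ_HANDLED and clears its bit. The dispatcher never needs to know which device fired, it only looks at the return values.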

Trap Dispatching on Windows

I am reading Windows Internals, 5th edition, and enjoying it, although it isn't an easy book to read and understand.
I am confused about IRQLs and the IDT.
I read that Windows implements its own prioritization levels with IRQLs, and that the Plug and Play Manager maps IRQs from devices to IRQLs.
So IRQLs are used for software and hardware interrupts, while exceptions go through the exception dispatcher.
When a device generates an interrupt, the interrupt controller passes this information to the CPU as an IRQ.
Windows then takes this IRQ and translates it to an IRQL to decide when to execute the routine (the routine that IDT[IRQ_VALUE] points to?).
Is that what is happening?
Yes, on a very high level.
Everything starts with a kernel trap. The kernel trap handler deals with interrupts, exceptions, system service calls and the virtual memory pager.
When an interrupt happens (line-based, using a dedicated pin, or message-based, writing to an address), Windows uses the IRQL to determine the priority of the interrupt and to decide whether the interrupt can be serviced at that time. The HAL does the job of translating the IRQ to an IRQL.
It then uses the IRQ to get an index into the IDT to find the appropriate ISR routine to invoke. Note that there can be multiple ISRs associated with a given IRQ; all of them execute in order.
Each processor has its own IDT so you could potentially have multiple ISR's running at the same time.
Exception dispatch, as I mentioned before, is also handled by the kernel trap but the procedure for it is different. It usually starts by checking for any exception handlers by stack unwinding, then checking for debugger port etc.
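To illustrate the "can it be served now" decision, here is a small sketch in plain C. It assumes the usual rule that an interrupt is serviced only when its IRQL is higher than the IRQL the processor is currently running at, and is otherwise left pending. The struct, the function name and the level numbers are invented for the example and are not Windows APIs.

```c
#include <assert.h>

/* Toy model of one processor's interrupt priority state. */
struct toy_processor {
    unsigned current_irql;  /* IRQL the processor is running at */
    unsigned pending;       /* bit n set: an interrupt at IRQL n is waiting */
    unsigned serviced;      /* number of ISRs that have run */
};

/* Returns 1 if the interrupt was serviced now, 0 if it was left pending. */
static int toy_deliver(struct toy_processor *p, unsigned irql)
{
    unsigned saved;

    assert(irql < 32);                /* keeps the shift below well defined */
    if (irql <= p->current_irql) {
        p->pending |= 1u << irql;     /* masked: wait until the IRQL drops */
        return 0;
    }

    saved = p->current_irql;
    p->current_irql = irql;           /* raise to the interrupt's IRQL */
    p->serviced++;                    /* the ISR body would run here */
    p->current_irql = saved;          /* lower back when the ISR returns */
    return 1;
}
```

For a processor running at level 5, an interrupt mapped to level 3 stays pending while one mapped to level 8 is serviced immediately and the processor then drops back to level 5.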

spin_lock_irqsave vs spin_lock_irq

On an SMP machine we must use spin_lock_irqsave and not spin_lock_irq from interrupt context.
Why would we want to save the flags (which contain the IF)?
Is there another interrupt routine that could interrupt us?
spin_lock_irqsave saves the interrupt state before taking the spin lock. This matters because the lock operation disables local interrupts and the matching unlock re-enables them. Saving the state lets the unlock put interrupts back exactly as they were before the lock was taken.
Example:
1) Say interrupts were already disabled before the spin lock was acquired.
2) spin_lock_irq disables interrupts (a no-op here) and takes the lock.
3) spin_unlock_irq releases the lock and enables interrupts.
So after the 3rd step, interrupts are enabled even though they were disabled before the lock was acquired.
Use spin_lock_irq only when you are sure that interrupts are not already disabled; otherwise always use spin_lock_irqsave.
If interrupts are already disabled before your code starts locking, calling spin_unlock_irq will forcibly re-enable them, which may not be what the caller wanted. If you instead save the current interrupt-enable state in flags through spin_lock_irqsave and pass the same flags to spin_unlock_irqrestore after releasing the lock, the previous state is simply restored (so interrupts are not necessarily enabled).
Example with spin_lock_irqsave:
DEFINE_SPINLOCK(mLock);
unsigned long flags;
spin_lock_irqsave(&mLock, flags); // Save the state of interrupt enable in flags and then disable interrupts
// Critical section
spin_unlock_irqrestore(&mLock, flags); // Return to the previous state saved in flags
Example with spin_lock_irq (without irqsave):
DEFINE_SPINLOCK(mLock);
spin_lock_irq(&mLock); // Does not know if interrupts are already disabled
// Critical section
spin_unlock_irq(&mLock); // Could result in an unwanted interrupt re-enable...
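The difference between the two examples can be checked with a tiny userspace model. This is only a sketch: toy_irqs_enabled stands in for the CPU's interrupt-enable flag, and the toy_ functions are made-up stand-ins for local_irq_disable/local_irq_enable and local_irq_save/local_irq_restore, with the lock itself left out.

```c
#include <assert.h>

/* Stand-in for the CPU's interrupt-enable flag (1 = enabled). */
static int toy_irqs_enabled = 1;

static void toy_local_irq_disable(void) { toy_irqs_enabled = 0; }
static void toy_local_irq_enable(void)  { toy_irqs_enabled = 1; }

/* Remember the current state, then disable. */
static unsigned long toy_local_irq_save(void)
{
    unsigned long flags = (unsigned long)toy_irqs_enabled;

    toy_irqs_enabled = 0;
    return flags;
}

/* Put back whatever state was saved. */
static void toy_local_irq_restore(unsigned long flags)
{
    assert(flags <= 1);
    toy_irqs_enabled = (int)flags;
}

/* Interrupt state after a critical section guarded the spin_lock_irq way. */
static int toy_state_after_irq_variant(int enabled_before)
{
    toy_irqs_enabled = enabled_before;
    toy_local_irq_disable();
    /* critical section */
    toy_local_irq_enable();   /* unconditional: forgets the earlier state */
    return toy_irqs_enabled;
}

/* Interrupt state after a critical section guarded the irqsave way. */
static int toy_state_after_irqsave_variant(int enabled_before)
{
    unsigned long flags;

    toy_irqs_enabled = enabled_before;
    flags = toy_local_irq_save();
    /* critical section */
    toy_local_irq_restore(flags);
    return toy_irqs_enabled;
}
```

When interrupts start out disabled, the first variant leaves them enabled afterwards and the second leaves them disabled. When they start out enabled, both variants behave the same, which is why spin_lock_irq is fine only when the caller knows interrupts are on.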
The need for spin_lock_irqsave besides spin_lock_irq is quite similar to the reason local_irq_save(flags) is needed besides local_irq_disable. Here is a good explanation of this requirement taken from Linux Kernel Development Second Edition by Robert Love.
The local_irq_disable() routine is dangerous if interrupts were
already disabled prior to its invocation. The corresponding call to
local_irq_enable() unconditionally enables interrupts, despite the
fact that they were off to begin with. Instead, a mechanism is needed
to restore interrupts to a previous state. This is a common concern
because a given code path in the kernel can be reached both with and
without interrupts enabled, depending on the call chain. For example,
imagine the previous code snippet is part of a larger function.
Imagine that this function is called by two other functions, one which
disables interrupts and one which does not. Because it is becoming
harder as the kernel grows in size and complexity to know all the code
paths leading up to a function, it is much safer to save the state of
the interrupt system before disabling it. Then, when you are ready to
reenable interrupts, you simply restore them to their original state:
unsigned long flags;

local_irq_save(flags);    /* interrupts are now disabled */
/* ... */
local_irq_restore(flags); /* interrupts are restored to their previous state */
Note that these methods are implemented at least in part as macros, so
the flags parameter (which must be defined as an unsigned long) is
seemingly passed by value. This parameter contains
architecture-specific data containing the state of the interrupt
systems. Because at least one supported architecture incorporates
stack information into the value (ahem, SPARC), flags cannot be passed
to another function (specifically, it must remain on the same stack
frame). For this reason, the call to save and the call to restore
interrupts must occur in the same function.
All the previous functions can be called from both interrupt and
process context.
Reading "Why kernel code/thread executing in interrupt context cannot sleep?", which links to Robert Love's article, I found this:
some interrupt handlers (known in
Linux as fast interrupt handlers) run
with all interrupts on the local
processor disabled. This is done to
ensure that the interrupt handler runs
without interruption, as quickly as
possible. More so, all interrupt
handlers run with their current
interrupt line disabled on all
processors. This ensures that two
interrupt handlers for the same
interrupt line do not run
concurrently. It also prevents device
driver writers from having to handle
recursive interrupts, which complicate
programming.
Below is part of the code from Linux kernel 4.15.18, which shows that spin_lock_irq() calls __raw_spin_lock_irq(). As you can see, it does not save any flags; it only disables interrupts.
static inline void __raw_spin_lock_irq(raw_spinlock_t *lock)
{
	local_irq_disable();
	preempt_disable();
	spin_acquire(&lock->dep_map, 0, 0, _RET_IP_);
	LOCK_CONTENDED(lock, do_raw_spin_trylock, do_raw_spin_lock);
}
The code below shows the implementation behind spin_lock_irqsave(), which saves the current state of the flags and then disables preemption.
static inline unsigned long __raw_spin_lock_irqsave(raw_spinlock_t *lock)
{
	unsigned long flags;

	local_irq_save(flags);
	preempt_disable();
	spin_acquire(&lock->dep_map, 0, 0, _RET_IP_);
	/*
	 * On lockdep we dont want the hand-coded irq-enable of
	 * do_raw_spin_lock_flags() code, because lockdep assumes
	 * that interrupts are not re-enabled during lock-acquire:
	 */
#ifdef CONFIG_LOCKDEP
	LOCK_CONTENDED(lock, do_raw_spin_trylock, do_raw_spin_lock);
#else
	do_raw_spin_lock_flags(lock, &flags);
#endif
	return flags;
}
This question starts from the false assertion:
On an SMP machine we must use spin_lock_irqsave and not spin_lock_irq from interrupt context.
Neither of these should be used from interrupt
context, on SMP or on UP. That said, spin_lock_irqsave()
may be used from interrupt context, as being more universal
(it can be used in both interrupt and normal contexts), but
you are supposed to use spin_lock() from interrupt context,
and spin_lock_irq() or spin_lock_irqsave() from normal context.
The use of spin_lock_irq() is almost always the wrong thing
to do in interrupt context, whether on SMP or UP. It may work
because most interrupt handlers run with IRQs locally enabled,
but you shouldn't try that.
UPDATE: as some people misread this answer, let me clarify that
it only explains which locking calls are and are not meant for
interrupt context. There is no claim here that spin_lock() should
only be used in interrupt context. It can be used in process
context too, for example if there is no need to lock in interrupt
context.
