Restriction on interrupt routines in Linux kernel drivers

Every device driver book talks about not using functions that sleep in interrupt routines.
What issues arise when these functions are called from ISRs?

The issue is a total lockup of the kernel. The kernel is in interrupt context while executing an interrupt handler; that is, the handler is not associated with any process (the current macro cannot be used).
If the handler were allowed to sleep, it could never get back to the interrupted code, because the scheduler would not know how to return to it.
Worse, consider holding a lock in the interrupt handler and then sleeping: another process runs, the interrupt fires again, and the handler tries to re-acquire the lock it already holds. That deadlocks the kernel.
If you read more about how scheduling in the kernel works, you will soon see why sleeping is not allowed in certain contexts.
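
A minimal sketch of the usual workaround, assuming a hypothetical device (the mydev_* names are illustrative, not a real driver's API): the ISR does only non-sleeping work under a spinlock and defers anything that might sleep to a workqueue, which runs in process context.

    #include <linux/interrupt.h>
    #include <linux/spinlock.h>
    #include <linux/workqueue.h>
    #include <linux/mutex.h>

    static DEFINE_SPINLOCK(mydev_lock);
    static DEFINE_MUTEX(mydev_mutex);
    static struct work_struct mydev_work;  /* INIT_WORK(&mydev_work, mydev_work_fn) in probe/init */

    static irqreturn_t mydev_isr(int irq, void *dev_id)
    {
        unsigned long flags;

        /* Fine: spinlocks never sleep, so they are safe in interrupt context. */
        spin_lock_irqsave(&mydev_lock, flags);
        /* ... acknowledge the device, pull data out of its registers ... */
        spin_unlock_irqrestore(&mydev_lock, flags);

        /* Wrong here: mutex_lock() may sleep ("scheduling while atomic"):
         *     mutex_lock(&mydev_mutex);
         * Instead, defer anything that may sleep to process context: */
        schedule_work(&mydev_work);
        return IRQ_HANDLED;
    }

    static void mydev_work_fn(struct work_struct *work)
    {
        /* Process context: taking a sleeping lock is fine here. */
        mutex_lock(&mydev_mutex);
        /* ... slow processing ... */
        mutex_unlock(&mydev_mutex);
    }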

Related

How does the OS kernel keep track of the locks that user threads are waiting on?

I know that when a user thread waits on a lock (an event, a semaphore, and so on), the kernel changes the thread's state to waiting, so the thread will not be scheduled to run until the kernel finds that the lock is available.
My question is: how does the kernel track the state of these locks? By polling, or by notification?
By notifying. Before the thread goes to sleep, it adds itself to the wakeup list for whatever kernel object corresponds to the thing it's waiting for.
This works precisely the same way all other waits work. Say, for example, the process does a blocking read on a file and the process has to sleep until the read completes. Or say the process accesses some code that hasn't been read in from disk yet. In all of these cases, the process is added to the appropriate wakeup notification scheme when it puts itself to sleep.
What you are asking is highly system specific and lock specific. For example, quality operating systems have lock management facilities that will detect deadlocks.
Some locks might be implemented as spin locks where there is no process hibernation and no operating system notification at all.
In the case where waiting suspends a process, all the operating system needs to keep track of is the lock itself. If a process releases the lock, the operating system can send a notification to all the waiting processes; no polling is necessary.
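
As a concrete illustration of the notification scheme on Linux, here is a minimal sketch using the kernel's wait-queue API (the my_* names and the condition flag are illustrative):

    #include <linux/wait.h>

    static DECLARE_WAIT_QUEUE_HEAD(my_wq);
    static int my_cond;     /* illustrative "lock is available" flag */

    /* Sleeping side: the thread adds itself to my_wq and is taken off the
     * run queue until my_cond becomes true; nothing polls in the meantime. */
    static int wait_for_it(void)
    {
        return wait_event_interruptible(my_wq, my_cond != 0);
    }

    /* Notifying side: whoever makes the condition true wakes the waiters. */
    static void make_it_so(void)
    {
        my_cond = 1;
        wake_up_interruptible(&my_wq);
    }

The waiter is removed from the run queue until wake_up_interruptible() fires; the kernel never polls the condition on the sleeping thread's behalf.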

Module context of execution

I am working on an IPsec module for Linux. Consider two different situations in which code from my module is executed.
Executing from process context: an application generates some traffic to transmit via the network. The application calls a syscall to transfer the data, the process switches to kernel space, and the packet goes through the Linux network subsystem; somewhere along the way my module is executed, and everything finishes once the task is handed to the network card. All these steps are performed from process context, and at any moment the scheduler can switch from one process to another. This is the first case of using my module: from process context.
Executing from softirq context: when the network card receives a packet, it generates a hardware interrupt, which "prepares" the appropriate softirq to run. The packet then goes through the Linux network subsystem (including my module) until some application receives it. These steps are performed from softirq context and can be interrupted only by a hardware interrupt, not by the scheduler.
The question is: how can I programmatically determine, inside the module, from which context it is executing? It might be some field of struct task_struct, some syscall, or something else; I could not find it by myself.
It is considered bad practice to make a function's control flow depend on whether it is executed in interrupt context or not.
Quoting Linux kernel developer Andrew Morton:
The consistent pattern we use in the kernel is that callers keep track of whether they are running in a schedulable context and, if necessary, they will inform callees about that. Callees don't work it out for themselves.
However, there are several functions (macros) defined in linux/preempt.h for detecting the current scheduling context: in_atomic(), in_interrupt(). But see the LWN article on their usage before relying on them.
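
A short sketch of the pattern Morton describes, using the kernel's common gfp_t convention: the caller, which knows whether it can sleep, passes that information down instead of the callee probing its own context (the my_* function names are illustrative):

    #include <linux/slab.h>

    /* Callee: does not guess its context; the caller says, via gfp,
     * whether the allocation is allowed to sleep. */
    static void *my_buf_alloc(size_t len, gfp_t gfp)
    {
        return kmalloc(len, gfp);
    }

    /* Transmit path, entered via a syscall: process context, may sleep. */
    static void my_tx_path(void)
    {
        void *buf = my_buf_alloc(128, GFP_KERNEL);
        kfree(buf);
    }

    /* Receive path, entered via a softirq: must not sleep. */
    static void my_rx_path(void)
    {
        void *buf = my_buf_alloc(128, GFP_ATOMIC);
        kfree(buf);
    }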

Interrupt handler and virtual memory

Does an interrupt handler run like user programs with respect to virtual memory (TLB miss, then load the page descriptor), or do CPUs use a different solution for it?
The interrupt service routine (ISR) executes in kernel mode. The jump table that the processor uses to figure out which routine to run on an interrupt cannot itself be swapped out, because the page-fault handler would also be found there. I don't know for sure what would happen if a handler address pointed to an unmapped region of memory. Virtual memory can be supported in kernel mode, at least on x86. Perhaps some architectures could handle an access fault on an ISR address, but an OS would never implement that, because the latency for entering the ISR would be totally unacceptable.

How are Linux work queues working?

I'm new to writing Linux device drivers, and I'm trying to make a device driver that handles a UART chip. For this I decided to use work queues as my bottom-half processing, because I have to take some semaphores when handling the data that I get from the UART chip.
Suppose a work queue handler that was scheduled earlier from an interrupt is now executing, and during its execution it sleeps on a semaphore. During this time the interrupt handler is called again and schedules the same work queue handler. Will the work queue handler be executed again before its first execution finishes?
Thanks.
The default behavior of work queues is to allow concurrent execution on different CPUs. There is a flag, WQ_NON_REENTRANT, that changes this behavior. More information can be found in this LWN article: http://lwn.net/Articles/403891/
In recent kernels, however, work queues are non-reentrant by default; see
http://lwn.net/Articles/511190
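
A minimal sketch of this pattern for a hypothetical UART driver (the uart_* names are illustrative): the ISR queues the work, and the semaphore is taken only in the work function, which runs in process context. Note also that schedule_work() simply does nothing if the same work item is still queued and has not yet started running, so a burst of interrupts does not pile up duplicate executions:

    #include <linux/interrupt.h>
    #include <linux/semaphore.h>
    #include <linux/workqueue.h>

    static struct work_struct uart_work;  /* INIT_WORK(&uart_work, uart_work_fn) in probe */
    static struct semaphore uart_sem;     /* sema_init(&uart_sem, 1) in probe */

    static void uart_work_fn(struct work_struct *work)
    {
        /* Process context: sleeping on the semaphore is allowed here. */
        down(&uart_sem);
        /* ... consume the data received from the UART ... */
        up(&uart_sem);
    }

    static irqreturn_t uart_isr(int irq, void *dev_id)
    {
        /* Returns false (and queues nothing) if uart_work is already
         * pending and has not yet started to run. */
        schedule_work(&uart_work);
        return IRQ_HANDLED;
    }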

How to keep thread on processor till an event happens?

I am spawning a few threads inside an ioctl call to my driver. I am also assigning kernel affinity in my driver. I want to ensure that one of the threads does not get scheduled out until a particular event is flagged by the other thread. Is there any way to keep the Windows scheduler from switching my thread out? Using _disable() may hang the system, as the event may take a couple of seconds.
The environment is Windows 7, 64-bit.
Thanks,
What you are probably after is a spin lock. However, this is probably a bad idea unless you can guarantee that your driver/application always runs on a multiprocessor system, and even then it is still very bad practice. On a single-processor system, if a thread spins on a lock, the other thread that would signal it will never be scheduled, and so it can never set your event. Spin locks are meant to be used sparingly and only when the lock is held for a very short time, never a couple of seconds.
It sounds like you should use an event or another signalling mechanism to synchronise your threads and let the Windows scheduler do its job. If you need to respond to an event very quickly, an interrupt or a Deferred Procedure Call (DPC) could be used instead.
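
A minimal sketch of the event-based approach in a Windows kernel-mode driver, using a KEVENT and the documented KeInitializeEvent/KeWaitForSingleObject/KeSetEvent routines (the g_event and function names are illustrative):

    #include <ntddk.h>

    static KEVENT g_event;

    VOID init_event(VOID)
    {
        /* NotificationEvent stays signalled until explicitly reset. */
        KeInitializeEvent(&g_event, NotificationEvent, FALSE);
    }

    /* Waiting thread: blocks until the event is set; the scheduler is free
     * to run other threads, including the one that will set it. */
    VOID waiter(VOID)
    {
        KeWaitForSingleObject(&g_event, Executive, KernelMode,
                              FALSE /* not alertable */,
                              NULL  /* no timeout: wait forever */);
        /* ... the other thread has flagged the event; proceed ... */
    }

    /* Signalling thread: */
    VOID signaller(VOID)
    {
        KeSetEvent(&g_event, IO_NO_INCREMENT, FALSE);
    }

Blocking this way requires the waiter to be at IRQL PASSIVE_LEVEL (or APC_LEVEL). It does not keep the thread on the processor, but it guarantees the thread resumes promptly once the event is set, without spinning for seconds.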
