In Linux kernel development, I read that when the kernel receives an interrupt, it invokes each registered handler on that line sequentially.
My question is: why does the kernel invoke the other handlers too?
This is because several devices can share the same interrupt line. The kernel cannot tell on its own which device raised the interrupt, so it invokes every handler registered on that line, passing each one the dev_id it was registered with. The handler whose device actually raised the interrupt gets a match and continues to run.
Remember, the handler was registered as:
static irqreturn_t intr_handler(int irq, void *dev_id, struct pt_regs *regs)
The handler was registered by passing a dev_id, and that is the only thing that distinguishes devices on the same IRQ line.
A well-written handler for a shared IRQ line first checks whether the interrupt was actually raised by its device, typically by reading one of the device's status registers. If it was, the handler services the interrupt and returns IRQ_HANDLED; otherwise it returns IRQ_NONE to indicate that the interrupt did not come from the device this handler services.
So the kernel invokes each registered handler on the line in turn; handlers whose devices did not raise the interrupt return IRQ_NONE almost immediately, while the handler whose device did raise it processes the interrupt properly and returns IRQ_HANDLED.
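To make this concrete, here is a minimal sketch of such a shared-line handler, using the current two-argument handler signature (the device structure and register names are placeholders, not from the original question):

static irqreturn_t intr_handler(int irq, void *dev_id)
{
        struct my_device *dev = dev_id;  /* the dev_id we registered with */
        u32 status = readl(dev->regs + MY_DEV_IRQ_STATUS);

        /* Not our device: tell the kernel to try the other handlers on the line. */
        if (!(status & MY_DEV_IRQ_PENDING))
                return IRQ_NONE;

        /* Acknowledge the interrupt on the device and service it. */
        writel(status, dev->regs + MY_DEV_IRQ_ACK);
        /* ... actual device-specific handling ... */

        return IRQ_HANDLED;
}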
In a DriverKit extension, I would like to block a call from a user client until a specific hardware interrupt fires. Since there are no semaphores available (Does the DriverKit SDK support semaphores?), I've reached for a very basic spinlock using an _Atomic(bool) member and busy waiting:
struct IVars
{
volatile _Atomic(bool) InterruptOccurred = false;
};
// In the user client method handler
{
// Clear the flag
atomic_store(&ivars->InterruptOccurred, false);
// Set up the interrupt on the device
...
// Wait for the interrupt
while (!atomic_load(&ivars->InterruptOccurred))
{
IOSleep(10);
}
}
// In the interrupt handler
{
bool expected = false;
if (atomic_compare_exchange_strong(&ivars->InterruptOccurred, &expected, true))
{
return;
}
// Proceed with normal handling if the user client method is not waiting
}
The user client method is called infrequently and the interrupt is guaranteed to fire within 100ms, so in principle busy waiting should be acceptable, but I am not very happy with the solution. I haven't worked with spinlocks before and they make me feel rather uneasy.
I would like to avoid taking an IOLock in the interrupt handler. Is there any other synchronization primitive in DriverKit I could reach for? I guess a cleaner way to handle this would be for the user client method to accept a callback that fires on the interrupt, but that would still require synchronization with the interrupt handler and would complicate the client application code.
Preliminaries
I would like to avoid taking an IOLock in the interrupt handler.
I assume you're aware that, this being DriverKit, none of this runs in the context of a primary interrupt controller; you're already behind a layer of Mach messaging, kernel/user context switches, and IODispatchQueue serialisation.
Possible solutions:
Since there are no semaphores available[…]
OSAction
The OSAction class contains a set of methods (WillWait/Wait/EndWait) for sleeping in a thread until the action is invoked. This might be a feasible way of implementing what you're trying to do. As usual, the documentation is in the header/iig file but hasn't made it into the web-based API docs.
IODispatchQueue
As of DriverKit 21 (macOS 12), you also get Apple's simpler Sleep/Wakeup event system baked into IODispatchQueue, which you might be familiar with from the kernel. (It is also similar to pthreads condition variables.) Note you need to create the queue with the kIODispatchQueueReentrant option in this case.
From DriverKit 22 (macOS 13/iPadOS) on, there's also a version of the sleep that takes a deadline, SleepWithDeadline.
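A rough sketch of how Sleep/Wakeup could replace the busy-wait (the exact method signatures live in the IODispatchQueue .iig header, so treat the calls below as assumptions for illustration, not verified API):

// In the user client external method, running on a queue created with
// kIODispatchQueueReentrant so other work can proceed while we sleep:
ivars->InterruptOccurred = false;
// ... arm the interrupt on the device ...
while (!ivars->InterruptOccurred)
{
    // Assumed to sleep on an arbitrary event address, like the kernel's
    // sleep/wakeup channels.
    ivars->queue->Sleep(&ivars->InterruptOccurred);
}

// In the interrupt handler, scheduled on the same queue:
ivars->InterruptOccurred = true;
ivars->queue->Wakeup(&ivars->InterruptOccurred);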
Async callbacks
I guess a cleaner way to handle this would be for the user client method to accept a callback that fires on the interrupt, but that would still require synchronization with the interrupt handler and would complicate the client application code.
If you're happy calling the async callback in the app on every interrupt, there's not really any synchronisation needed: you can just invoke the same OSAction repeatedly. Even if you want to invoke the async call only on the "next" interrupt, an atomic compare-and-swap should be sufficient for the interrupt handler to claim the OSAction* pointer.
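A minimal sketch of that claim, reusing the _Atomic style from the question (the pendingAction member and the completion call are illustrative assumptions, not names from your code):

struct IVars
{
    _Atomic(OSAction*) pendingAction = nullptr;
};

// In the external method: park the retained OSAction for the next interrupt.
//     atomic_store(&ivars->pendingAction, action);

// In the interrupt handler: claim the action with an atomic exchange so that
// exactly one interrupt fires the callback, even if two interrupts race.
OSAction* action = atomic_exchange(&ivars->pendingAction, nullptr);
if (action)
{
    // Fire the async completion for the waiting client (illustrative call),
    // then drop the reference we were holding.
    AsyncCompletion(action, kIOReturnSuccess, nullptr, 0);
    OSSafeReleaseNULL(action);
}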
Important note:
With all of these potential solutions except IODispatchQueue::Sleep and the async callback: bear in mind that sleeping in the context of a user client external method will block the dispatch queue and thus any other calls to external methods in that user client will fail to make progress. (As well as any other methods scheduled to that queue.)
I've been using code that I found in the following post:
How to get thread state (e.g. suspended), memory + CPU usage, start time, priority, etc
I'm examining thread state, and there's the following enum that describes the reasons for a thread's 'waiting' status:
enum KWAIT_REASON
{
Executive,
FreePage,
PageIn,
PoolAllocation,
DelayExecution,
Suspended,
UserRequest,
WrExecutive,
WrFreePage,
WrPageIn,
WrPoolAllocation,
WrDelayExecution,
WrSuspended,
WrUserRequest,
WrEventPair,
WrQueue,
WrLpcReceive,
WrLpcReply,
WrVirtualMemory,
WrPageOut,
WrRendezvous,
Spare2,
Spare3,
Spare4,
Spare5,
Spare6,
WrKernel,
MaximumWaitReason
};
Can anyone explain what WrQueue is, and perhaps what the difference between WrUserRequest and UserRequest is?
The information is obtained using NtQuerySystemInformation() with SystemProcessInformation.
WrQueue: the thread is waiting on a KQUEUE object (look up its definition in wdm.h) in the kernel. This can be a call to ZwRemoveIoCompletion or the Win32 GetQueuedCompletionStatus (an IOCP is exactly a KQUEUE object), or (beginning with Vista) a call to ZwWaitForWorkViaWorkerFactory (worker factories internally use a KQUEUE). It is also possible that a thread in the kernel called KeRemoveQueue, which is what system worker threads usually do.
WrUserRequest is used by the win32k.sys subsystem, usually when the thread calls GetMessage. So if we see WrUserRequest, we can be sure the thread is waiting for window messages.
UserRequest means the thread is waiting on some object[s] via WaitForSingleObject[Ex], WaitForMultipleObjects[Ex] or MsgWaitForMultipleObjects[Ex] (or their equivalents).
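If you're dumping these values from SYSTEM_THREAD_INFORMATION, a small lookup table built from the enum in the question is handy; a sketch (the array order simply mirrors the enum declaration):

static const char *wait_reason_name(unsigned long reason)
{
    /* Names indexed by KWAIT_REASON value, in declaration order. */
    static const char *names[] = {
        "Executive", "FreePage", "PageIn", "PoolAllocation",
        "DelayExecution", "Suspended", "UserRequest",
        "WrExecutive", "WrFreePage", "WrPageIn", "WrPoolAllocation",
        "WrDelayExecution", "WrSuspended", "WrUserRequest",
        "WrEventPair", "WrQueue", "WrLpcReceive", "WrLpcReply",
        "WrVirtualMemory", "WrPageOut", "WrRendezvous",
        "Spare2", "Spare3", "Spare4", "Spare5", "Spare6", "WrKernel"
    };
    return reason < sizeof(names) / sizeof(names[0]) ? names[reason]
                                                     : "Unknown";
}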
The boost docs say that cancelled async connect, send and receive finish immediately, and the handlers for cancelled operations will be passed the boost::asio::error::operation_aborted error.
I would like to find out if the cancelled handler gets to run (and see the operation_aborted error) before other (non-cancelled, and newly scheduled) completion handlers run.
Here is the timeline that concerns me:
acceptHandler and readHandler are running on the same event loop and the same thread.
time t0 - readHandler is running on oldConnectionSocket
time t1 - acceptHandler runs
time t2 - acceptHandler calls oldConnectionSocket.cancel
time t3 - acceptHandler closes oldConnectionSocket
time t4 - acceptHandler calls newConnectionSocket.async_read(...readHandler...)
time t5 - readHandler is called (from which context?)
Is it possible at t5 for readHandler to be called in the newConnectionSocket context before it is called with the operation_aborted error in the oldConnectionSocket context?
Cancelled operations will immediately post their handlers for deferred invocation. However, the io_service makes no guarantees on the invocation order of handlers. Thus, the io_service could choose to invoke the ReadHandlers in either order. Currently, only a strand specifies guaranteed ordering under certain conditions.
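Because both the cancelled and the new operation run the same ReadHandler, the handler has to distinguish cancellation from a genuine completion via the error code; a minimal sketch:

void readHandler(const boost::system::error_code& error,
                 std::size_t bytes_transferred)
{
    if (error == boost::asio::error::operation_aborted)
    {
        // The operation was cancelled (socket.cancel() or close());
        // the socket may already be closed, so do not touch it here.
        return;
    }
    // Handle a genuine completion (or a different error) here.
}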
Within a completion handler, if the goal is to know which I/O object was associated with the operation, then consider constructing the completion handler so that it has an explicit handle to the I/O object. This is often accomplished using any of the following:
a custom functor
std::bind() or boost::bind()
a C++11 lambda
One common idiom is to have the I/O object be managed by a single class that inherits from boost::enable_shared_from_this<>. When a class inherits from boost::enable_shared_from_this, it provides a shared_from_this() member function that returns a valid shared_ptr instance to this. A copy of the shared_ptr is passed to completion handlers, for example via a lambda's capture list or as the instance handle to boost::bind(). This allows the handlers to know the I/O object on which the operation was performed, and extends the lifetime of the I/O object to at least as long as that of the handler. See the Boost.Asio asynchronous TCP daytime server tutorial for an example using this approach.
class tcp_connection
: public boost::enable_shared_from_this<tcp_connection>
{
public:
// ...
void start()
{
boost::asio::async_write(socket_, ...,
boost::bind(&tcp_connection::handle_write, shared_from_this(),
boost::asio::placeholders::error,
boost::asio::placeholders::bytes_transferred));
}
void handle_write(
const boost::system::error_code& error,
std::size_t bytes_transferred)
{
// I/O object is this->socket_.
}
tcp::socket socket_;
};
On the other hand, if the goal is to determine if one handler has executed before the other, then:
the application will need to explicitly manage the state
trying to manage multiple dependent call chains may be introducing unnecessary complexity and often indicates a need to reexamine the design
custom handlers could be used to prioritize the order in which handlers are executed. The Boost.Asio Invocation example uses custom handlers that are added to a priority queue, which are then executed at a later point in time.
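For completeness, a minimal sketch of the strand approach mentioned above (names are illustrative): wrapping handlers in the same strand guarantees they will not run concurrently, though it does not by itself dictate the relative completion order of two different operations.

boost::asio::io_service io_service;
boost::asio::io_service::strand strand(io_service);

// Both handlers are serialised through the same strand.
oldConnectionSocket.async_read_some(boost::asio::buffer(oldBuffer),
    strand.wrap(readHandler));
newConnectionSocket.async_read_some(boost::asio::buffer(newBuffer),
    strand.wrap(readHandler));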
As per the kernel docs, request_threaded_irq() is used to split interrupt handling into two parts: the primary handler (irq_handler) checks whether the interrupt originates from the device; if so, it disables the interrupt on the device and returns IRQ_WAKE_THREAD, which wakes the handler thread and runs thread_fn.
But I found some code that registers an interrupt using request_threaded_irq() while passing NULL as the primary handler, keeping the complete functionality in thread_fn.
So my doubt is: why use request_threaded_irq() in that case, when we could just as easily use request_irq(), which seems to behave the same in this scenario?
The documentation says:
If NULL and thread_fn != NULL the default primary handler is installed
The default primary handler is undocumented, but its source should be self-explaining:
static irqreturn_t irq_default_primary_handler(int irq, void *dev_id)
{
return IRQ_WAKE_THREAD;
}
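In other words, the default primary handler does nothing but wake the thread, so all of the driver's work runs in sleepable process context, which plain request_irq() cannot give you. A minimal registration sketch (the device names are placeholders); note that request_threaded_irq() insists on IRQF_ONESHOT when the primary handler is NULL, so the line stays masked until thread_fn finishes:

static irqreturn_t mydev_thread_fn(int irq, void *dev_id)
{
        struct mydev *dev = dev_id;

        /* Process context: may sleep, take mutexes, do I2C/SPI transfers... */
        mydev_handle_event(dev);    /* placeholder for the real work */

        return IRQ_HANDLED;
}

ret = request_threaded_irq(irq, NULL /* use the default primary handler */,
                           mydev_thread_fn, IRQF_ONESHOT, "mydev", dev);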
I'm getting an error I really don't understand when reading or writing files using a PCIe block device driver. I seem to be hitting an issue in swiotlb_unmap_sg_attrs(), which appears to be doing a NULL dereference of the sg pointer, but I don't know where this is coming from, as the only scatterlist I use myself is allocated as part of the device info structure and persists as long as the driver does.
There is a stack trace to go with the problem. It tends to vary a bit in its exact details, but it always crashes in swiotlb_unmap_sg_attrs().
I think it's likely I have a locking issue, as I am not sure how to handle the locks around the IO functions. The lock is already held when the request function is called; I release it before the IO functions themselves are called, as they need an (MSI) IRQ to complete. The IRQ handler updates a "status" value, which the IO function is waiting for. When the IO function returns, I take the lock back up and return to request-queue handling.
The crash happens in blk_fetch_request() during the following:
if (!__blk_end_request(req, res, bytes)){
printk(KERN_ERR "%s next request\n", DRIVER_NAME);
req = blk_fetch_request(q);
} else {
printk(KERN_ERR "%s same request\n", DRIVER_NAME);
}
where bytes is updated by the request handler to be the total length of IO (summed length of each scatter-gather segment).
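(For reference, a sketch of computing that total against the old request-queue API this driver uses; blk_rq_bytes(req) returns the same total directly. On newer kernels, rq_for_each_segment iterates struct bio_vec by value, so it would be bvec.bv_len.)

unsigned int bytes = 0;
struct req_iterator iter;
struct bio_vec *bvec;

/* Sum the length of every scatter-gather segment in the request. */
rq_for_each_segment(bvec, req, iter)
        bytes += bvec->bv_len;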
It turned out this was due to re-entrancy of the request function. Because I was unlocking in the middle to allow IRQs to come in, the request function could be called again; it would take the lock (while the original request handler was waiting on IO), and then the wrong handler would get the IRQ and everything went south, with stacks of failed IO.
The way I solved this was to set a "busy" flag at the start of the request function, clear it at the end, and return immediately at the start of the function if the flag is already set:
static void mydev_submit_req(struct request_queue *q){
struct mydevice *dev = q->queuedata;
// We are already processing a request
// so reentrant calls can take a hike
// They'll be back
if (dev->has_request)
return;
// We own the IO now, new requests need to wait
// Queue lock is held when this function is called
// so no need for an atomic set
dev->has_request = 1;
// Access request queue here, while queue lock is held
spin_unlock_irq(q->queue_lock);
// Perform IO here, with IRQs enabled
// You can't access the queue or request here, make sure
// you got the info you need out before you release the lock
spin_lock_irq(q->queue_lock);
// you can end the requests as needed here, with the lock held
// allow new requests to be processed after we return
dev->has_request = 0;
// lock is held when the function returns
}
I am still not sure why I consistently got the stack trace from swiotlb_unmap_sg_attrs(), however.