I came across an interesting topic, threaded interrupt handlers, at https://lwn.net/Articles/302043/.
It is not a brand-new idea, and it is intended to replace tasklets.
But when I studied how it works, I could not find any difference between a threaded interrupt handler and a tasklet. Both deal with the urgent, hardware-related part as quickly as possible in the hard IRQ and defer the time-consuming part.
Could anyone shed some light on this?
I’ve been trying to refresh my understanding of sleeping in the kernel with regard to wait queues, so I started browsing the source code for bcmgenet.c (kernel version 4.4), the Ethernet driver for Broadcom's 7xxx series of set-top box SoCs.
As part of the probe callback, this driver initializes a work queue which is part of the driver’s private structure and adds itself to the Q. But I do not see any blocking of any kind anywhere. Then it goes on to initialize a work queue with a function to call when woken up.
Now coming to the ISR0 for the driver, within that is an explicit call to the scheduler as part of the ISR (bcmgenet_isr0) if certain conditions are met. Now AFAIK, this call is used to defer work to a later time, much like a tasklet does.
Post this we check some MDIO status flags and if the conditions are met, we wake up the process which was blocked in process context. But where exactly is the process blocked?
Also, most of the time, wait queues seem to be used in conjunction with work queues. Is that the typical way to use them?
"As part of the probe callback, this driver initializes a work queue which is part of the driver’s private structure and adds itself to the Q. But I do not see any blocking of any kind anywhere."
I think you meant the wait queue head, not the work queue. I do not see any evidence of the probe adding itself to the queue; it is merely initializing the queue.
The queue is used by the calls to the wait_event_timeout() macro in the bcmgenet_mii_read() and bcmgenet_mii_write() functions in bcmmii.c. These calls will block until either the condition they are waiting for becomes true or the timeout period elapses. They are woken up by the wake_up(&priv->wq); call in the ISR0 interrupt handler.
"Then it goes on to initialize a work queue with a function to call when woken up."
It is initializing a work item, not a work queue. The function will be called from a kernel thread as a result of the work item being added to the system work queue.
"Now coming to the ISR0 for the driver, within that is an explicit call to the scheduler as part of the ISR (bcmgenet_isr0) if certain conditions are met. Now AFAIK, this call is used to defer work to a later time, much like a tasklet does."
You are referring to the schedule_work(&priv->bcmgenet_irq_work); call in the ISR0 interrupt handler. This adds the previously mentioned work item to the system work queue. It is similar to a tasklet, but tasklets run in softirq context whereas work items run in process context.
"Post this we check some MDIO status flags and if the conditions are met, we wake up the process which was blocked in process context. But where exactly is the process blocked?"
As mentioned above, the process is blocked in the bcmgenet_mii_read() and bcmgenet_mii_write() functions, although they use a timeout to avoid blocking for long periods. (This timeout is especially important for those versions of GENET that do not support MDIO-related interrupts!)
"Also, most of the time, wait queues seem to be used in conjunction with work queues. Is that the typical way to use them?"
Not especially. This particular driver uses both a wait queue and a work item, but I wouldn't describe them as being used "in conjunction" since they are being used to handle different interrupt conditions.
I have a Win32 MFC app that creates a thread which listens on the RS232 port. When new data is received, that listener thread allocates memory using new and posts a message to a window using PostMessage. This works just fine: the window handles the incoming data and deletes the memory as necessary using delete.
I'm noticing some small memory leaks right as my program closes. My suspicion is that one or two final messages are posted and are still sitting in the message queue when the user shuts the program down, so the thread exits before that memory gets properly deleted. Is there a way I can ensure certain things happen before the program closes? Can I make sure the message queue is empty, or at least that it has processed these important messages? I have tried calling WaitForInputIdle or PeekMessage in destructors and things like that. Any ideas on a good way to solve this?
I 100% agree that all allocated memory should be explicitly freed (just as you should fix all compiler warnings). This eliminates the diagnostic noise, allowing you to quickly spot real issues.
Building on Harry Johnston's suggestion, I would push all new data into some kind of a queue and simply post a command "check the queue", removing and freeing data in the message handler. That way you can easily free everything left in the queue before exiting.
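The queue idea above could be sketched roughly as follows. This is a portable C++ illustration rather than MFC code: Packet, PacketQueue and the WM_APP_CHECK_QUEUE message mentioned in the comments are hypothetical names, and std::unique_ptr is assumed in place of raw new/delete so that anything still queued is freed automatically.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <memory>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical payload standing in for one chunk of serial data.
struct Packet {
    std::string bytes;
};

// Thread-safe queue that owns every packet until the consumer takes it.
class PacketQueue {
public:
    // Called by the listener thread. In the real app this would be followed
    // by PostMessage(hwnd, WM_APP_CHECK_QUEUE, 0, 0) (hypothetical message id).
    void push(std::unique_ptr<Packet> p) {
        assert(p != nullptr);
        std::lock_guard<std::mutex> lock(mutex_);
        items_.push_back(std::move(p));
    }

    // Called by the "check the queue" handler, and once more at shutdown:
    // hands over everything queued so far and leaves the queue empty.
    std::deque<std::unique_ptr<Packet>> drain() {
        std::lock_guard<std::mutex> lock(mutex_);
        std::deque<std::unique_ptr<Packet>> out;
        out.swap(items_);
        return out;
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return items_.size();
    }

private:
    mutable std::mutex mutex_;
    std::deque<std::unique_ptr<Packet>> items_;
};
```

Because the posted message carries no pointer, a message that is lost or never processed cannot leak anything; the data stays in the queue until someone drains it.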
For a small utility, that leak might be acceptable - but it might cover other causes that are less benign.
PostMessage does not guarantee delivery, so other options are:
using a blocking SendMessage
adding the data to a deque and using PostMessage only to notify the receiver that new data is available
(Remote code review: if PostMessage returns false, do you delete the memory right away?)
The folks arguing to not worry about it have a valid point. The process is about to end, and the OS will release all the memory, so there's not much point in spending time cleaning up first.
However, this does create noise that might obscure ongoing memory leaks that could become real problems before your application exits. It also means your program would be harder to turn into a library that could be incorporated into another app later.
I'm a fan of writing clean shutdown code and then, in optimized builds, adding an early out to skip the unnecessary work. That way your debug builds will tell you about real leaks, and your users will get a responsive exit.
To do this cleanly:
You'll need a way for the main thread to tell the listener thread to quit (or at least to stop listening). Otherwise you'll always have a small window of opportunity where the main thread is about to quit just as the listener does another allocation. The main thread will also need to know that the listener thread has received and complied with this request. Only then can the main thread go through the queue to free all the memory associated with the last messages and know that nothing more will arrive.
Don't use TerminateThread, or you'll end up with additional problems! If the listener thread is waiting on a handle that represents the serial port, then you can make it wait on two handles instead (for example, with WaitForMultipleObjects): the serial port handle and the handle of an event. The main thread can raise the event when it wants the listener to quit. The listener thread can raise a different event to signal that it has stopped listening.
When the main thread gets the WM_QUIT, it should raise the event to tell the listener to quit, then wait on the event that says the listener thread is done, then use PeekMessage to pull any messages that the listener posted before it stopped and free the memory associated with them.
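The shutdown order described above (ask the listener to stop, wait until it has stopped, then free whatever it posted) might look like this in portable C++. This is only a sketch under stated assumptions: an atomic flag stands in for the "quit" event, thread::join stands in for the "listener is done" event, and a mutex-protected deque stands in for the window's message queue. Message, Shared, listener, shutdown_and_drain and run_demo are all hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <deque>
#include <functional>
#include <memory>
#include <mutex>
#include <thread>
#include <utility>

struct Message { int value; };  // hypothetical payload

struct Shared {
    std::mutex mutex;
    std::deque<std::unique_ptr<Message>> posted;  // stands in for the message queue
    std::atomic<bool> stop{false};                // stands in for the "quit" event
};

// Listener: keeps allocating and "posting" until it is asked to stop.
void listener(Shared& s) {
    int next = 0;
    while (!s.stop.load()) {
        auto msg = std::make_unique<Message>(Message{next++});
        {
            std::lock_guard<std::mutex> lock(s.mutex);
            s.posted.push_back(std::move(msg));
        }
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }
}

// Returns how many undelivered messages were freed during shutdown.
std::size_t shutdown_and_drain(Shared& s, std::thread& t) {
    s.stop.store(true);  // 1. tell the listener to quit
    t.join();            // 2. wait until it has really stopped
    assert(!t.joinable());
    std::lock_guard<std::mutex> lock(s.mutex);
    std::size_t leftover = s.posted.size();
    s.posted.clear();    // 3. unique_ptr frees each leftover message
    return leftover;
}

std::size_t run_demo() {
    Shared s;
    std::thread t(listener, std::ref(s));
    // Let the listener post at least one message that nobody has handled.
    while (true) {
        {
            std::lock_guard<std::mutex> lock(s.mutex);
            if (!s.posted.empty()) break;
        }
        std::this_thread::yield();
    }
    return shutdown_and_drain(s, t);
}
```

The point of step 2 is that the final drain only runs once no further allocation can happen, which closes the window described earlier.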
What are the advantages and disadvantages of using RegisterWaitForSingleObject() instead of WaitForSingleObject()?
What I know so far:
RegisterWaitForSingleObject() uses the thread pool already available in the OS
With WaitForSingleObject(), a thread of my own has to wait for the event.
Is the only difference polling versus automatic notification, or is there a considerable performance advantage to either one?
It's pretty straightforward: WaitForSingleObject() blocks a thread. That thread is consuming a megabyte of virtual memory (the default stack reservation) and is not doing anything useful with it while it is blocked. It won't wake up and resume doing useful work until the handle is signaled.
RegisterWaitForSingleObject() does not block a thread. The thread can continue doing useful work. When the handle is signaled, Windows grabs a thread-pool thread to run the code you specified as the callback, which is the same code you would have written after a WaitForSingleObject() call. There is still a thread involved in getting that callback to run, the wait thread, but it can handle many RegisterWaitForSingleObject() requests.
So the big advantage is that your program can use far fewer threads while still handling many service requests. A disadvantage is that it can take a bit longer for the completion code to start running. It is also harder to program correctly, since that code runs on another thread. Also note that you don't need RegisterWaitForSingleObject() when you already use overlapped I/O.
They serve two different code models. With RegisterWaitForSingleObject you get an asynchronous notification callback on an arbitrary thread from the thread pool managed by the OS. If you can structure your code like this, it might be more efficient. On the other hand, WaitForSingleObject is a synchronous wait call that blocks (and thus "occupies") the calling thread. In most cases, such code is easier to write and would probably be less prone to deadlocks and race conditions.
Background: I'm writing a network traffic processing kernel module.
I'm getting packets using netfilter hooks. All filtering is done inside the hook function, but I don't want to do the packet processing there. So the solution is tasklets or workqueues. I know the difference between them and I can use either, but I have some problems and I need some advice.
Tasklets solution. Preferable. I can create and start a tasklet for each packet, but who will delete this tasklet? The tasklet function itself? I don't think it's a good idea to deallocate a tasklet while it is executing. Create a global pool of tasklets? Well, since there can't be two tasklets executing on one processor, the pool size would be the number of processors. But how do I find out when a tasklet is available for reuse? There are only two states, sched and run, but there is no "done" state. OK, I could probably wrap the tasklet in a struct with a flag. But wouldn't that all be overkill?
Workqueue solution. Same problem: who will delete work? Same "solution" as for tasklets?
Workqueue solution 2. Just create a permanent work item during module loading, save packets to some queue and process them inside the work function. Maybe two work items and two queues: incoming and outgoing. But I'm afraid that with this solution I will use only one (or two) processors, since it looks like a single work item can't run on several processors simultaneously.
Any other solutions?
One can use a high-priority (WQ_HIGHPRI), unbound (WQ_UNBOUND) workqueue and stick with the third option listed in the question ("Workqueue solution 2").
WQ_HIGHPRI queues the work to a high-priority worker pool so that processing starts as soon as possible. WQ_UNBOUND removes the binding to the submitting CPU, so the scheduler can run the work on any available CPU.
I was going through some legacy code and found that it uses the SuspendThread function to suspend the execution of a worker thread. Whenever the worker thread needs to process a request, the calling thread resumes it. Once the task is done, the thread suspends itself.
I don’t know why it was done this way. In my opinion it could have been done more elegantly using an Event object with the WaitForSingleObject API.
My question is, what are the benefits (if any) of suspending a thread as compared to making a thread wait on a synchronization object? In which scenarios would you prefer SuspendThread, ResumeThread APIs?
No.
Suspending a thread is discouraged in every environment I've ever worked in. The main concern is that a thread may be suspended while holding a lock on some resource, potentially causing a deadlock. Any resources saved in terms of synchronization objects aren't worth the deadlock risk.
This is not a concern when a thread is made to wait, as the thread inherently controls its own "suspension" and can be sure to release any locks it is holding.
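As a rough illustration of the waiting approach, here is a portable C++ sketch in which the worker sleeps on a condition variable instead of suspending itself. It assumes std::condition_variable as a stand-in for a Win32 event plus WaitForSingleObject; the Worker class and the doubling "work" are hypothetical placeholders, not the legacy code in question.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

class Worker {
public:
    Worker() : thread_([this] { run(); }) {}
    ~Worker() { finish(); }

    // Hand a request to the worker and wake it (takes the place of ResumeThread).
    void submit(int request) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            pending_.push(request);
        }
        wake_.notify_one();
    }

    // Ask the worker to exit once the queue is empty, then return the results.
    std::vector<int> finish() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            quit_ = true;
        }
        wake_.notify_one();
        if (thread_.joinable()) thread_.join();
        return results_;
    }

private:
    void run() {
        std::unique_lock<std::mutex> lock(mutex_);
        for (;;) {
            assert(lock.owns_lock());
            // The worker chooses where it sleeps; wait() releases the mutex
            // while blocked, so it never sleeps holding the lock.
            wake_.wait(lock, [this] { return quit_ || !pending_.empty(); });
            if (pending_.empty()) return;  // only reachable once quit_ is set
            int request = pending_.front();
            pending_.pop();
            lock.unlock();
            int result = request * 2;      // placeholder for the real work
            lock.lock();
            results_.push_back(result);
        }
    }

    std::mutex mutex_;
    std::condition_variable wake_;
    std::queue<int> pending_;
    std::vector<int> results_;
    bool quit_ = false;
    std::thread thread_;  // declared last so the other members exist before it starts
};
```

Unlike an external SuspendThread call, nothing here can stop the worker at an arbitrary instruction, so it cannot be frozen in the middle of a critical section.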
If you read the documentation on SuspendThread, you'll see that it is meant for use by debuggers. Tear it out of any application code if you can.
To illustrate my point, a list of the "do not use" suspension methods I've come across:
SuspendThread (Win32)
Thread.Suspend (.NET)
Thread.suspend (Java)
As an aside, I'm really surprised that Thread.Suspend in .NET was "supported" in 1.0/1.1; it really should have been warning-worthy from the start.
You'll need a separate event object for each thread if you want to be able to wake up a specific thread. That would lead to higher kernel object consumption which is not good by itself and could possibly cause problems on early versions of Windows. With manual resume you don't need any new kernel objects.