dynamic tasklets or work queues - linux-kernel

Background: I'm writing network traffic processing kernel module.
I'm getting packets using netfilter hooks. All filtering is done inside hook function, but I don't want to do packet processing here. So solution is tasklets or workqueues. I know the difference between them, I can use both, but I have some problems and I need an advice.
Tasklets solution. Preferrable. I can create and start tasklet for
each packet, but who will delete this tasklet? Tasklet function? I
don't think its a good idea - to dealloc tasklet while it is
executing. Create global pool of tasklets? Well, since there can't
be 2 executing tasklets on one processor, the pool size will be the
number of processors. But how to find out when tasklet is available
for new use? There are only two states: shed and run, but there is
no "done" state. Ok, I probably can wrap tasklet with some struct
with flag. But wouldn't that all be too much overkill?
Workqueue solution. Same problem: who will delete work? Same "solution" as for tasklets?
Workqueue solution 2. Just create permanent work due module loading, save packets to some queue and process them inside the work. May be two works and two queues: incoming and outgoing. But I'm afraid that with that solution I will use only one (or two) processors since looks like work can't be performed on few processors simultaneously.
Any other solutions?

One can use high-priority(WQ_HIGH_PRI), unbound(WQ_UNBOUND) workqueues and stick with option3 listed in the question.
WQ_HIGH_PRI guarantees that the processing is initiated ASAP. WQ_UNBOUND eliminates the single-CPU bottleneck as the scheduler assigns the work to any available CPU immediately.

Related

When to use Unconfined in Kotlin

When would I choose to use Dispatchers.Unconfined? Is when it doesn't really matter where the coroutine should run? So you let the coroutine to choose the thread pool as it better suits?
And how does it differ from Dispatchers.Default? Is it that when running the Default dispatcher is always within a specific thread pool defined as the default one?
So you let the coroutine to choose the thread pool as it better suits?
That's not really how Unconfined works. The best way to understand it is that it is a "no-op" dispatcher that doesn't actually do any dispatch at all. Wherever you call continuation.resume(), that's where the coroutine resumes execution — within that very call. When the resume() call returns, it means the coroutine has either suspended again or completed.
In normal programming, you usually call continuation.resume() from a callback and it is not your code that runs the callback, so you don't actually have any control over the thread where your coroutine will resume. It is not advisable to use the Unconfined dispatcher when resuming from a callback provided by a library that is not under your control.
Unconfined is really a special-cased tool you can use when building a coroutine execution environment yourself, or in other custom scenarios. Basically, you should use it only when you are actively looking for a way to disable the normal dispatching mechanism.
The unconfined dispatcher is appropriate for coroutines which neither consume CPU time nor update any shared data (like UI) confined to a specific thread.
So, I'd use it in non-IO, UI or computation heavy situations basically :D.
I think the nunmber of use-cases for this is pretty low, but I'd think of an operation which isn't heavy, but still for some reason you'd like it to run on a different thread.
Here's a link for how it actually works.
Dispatchers.Default is really different, and it's mostly used for heavy CPU operations.
This is because, it actually dispatches works to a thread pool with a number of threads equal to the number of CPU cores, and it's at least 2. This way developers can leverage the full capacity of the cpu when doing heavy computational work.

Scheduling tasks/messages for later processing/delivery

I'm creating a new service, and for that I have database entries (Mongo) that have a state field, which I need to update based on a current time, so, for instance, the start time was set to two hours from now, I need to change state from CREATED -> STARTED in database, and there can be multiple such states.
Approaches I've thought of:
Keep querying database entries that are <= current time and then change their states accordingly. This causes extra reads for no reason and half the time empty reads, and it will get complicated fast with more states coming in.
I write a job scheduler (I am using go, so that'd be not so hard), and schedule all the jobs, but I might lose queue data in case of a panic/crash.
I use some products like celery, have found a go implementation for it https://github.com/gocelery/gocelery
Another task scheduler I've found is on Google Cloud https://cloud.google.com/solutions/reliable-task-scheduling-compute-engine, but I don't want to get stuck in proprietary technologies.
I wanted to use some PubSub service for this, but I couldn't find one that has delayed messages (if that's a thing). My problem is mainly not being able to find an actual name for this problem, to be able to search for it properly, I've even tried searching Microsoft docs. If someone can point me in the right direction or if any of the approaches I've written are the ones I should use, please let me know, that would be a great help!
UPDATE:
Found one more solution by Netflix, for the same problem
https://medium.com/netflix-techblog/distributed-delay-queues-based-on-dynomite-6b31eca37fbc
I think you are right in that the problem you are trying to solve is the job or task scheduling problem.
One approach that many companies use is the system you are proposing: jobs are inserted into a datastore with a time to execute at and then that datastore can be polled for jobs to be run. There are optimizations that prevent extra reads like polling the database at a regular interval and using exponential back-off. The advantage of this system is that it is tolerant to node failure and the disadvantage is added complexity to the system.
Looking around, in addition to the one you linked (https://github.com/gocelery/gocelery) there are other implementations of this model (https://github.com/ajvb/kala or https://github.com/rakanalh/scheduler were ones I found after a quick search).
The other approach you described "schedule jobs in process" is very simple in go because goroutines which are parked are extremely cheap. It's simple to just spawn a goroutine for your work cheaply. This is simple but the downside is that if the process dies, the job is lost.
go func() {
<-time.After(expirationTime.Sub(time.Now()))
// do work here.
}()
A final approach that I have seen but wouldn't recommend is the callback model (something like https://gitlab.com/andreynech/dsched). This is where your service calls to another service (over http, grpc, etc.) and schedules a callback for a specific time. The advantage is that if you have multiple services in different languages, they can use the same scheduler.
Overall, before you decide on a solution, I would consider some trade-offs:
How acceptable is job loss? If it's ok that some jobs are lost a small percentage of the time, maybe an in-process solution is acceptable.
How long will jobs be waiting? If it's longer than the shutdown period of your host, maybe a datastore based solution is better.
Will you need to distribute job load across multiple machines? If you need to distribute the load, sharding and scheduling are tricky things and you might want to consider using a more off-the-shelf solution.
Good luck! Hope that helps.

How would you implement a save thread cooperation with signals in ruby 2.0?

I just began to work with threads. I know the theory and understand the main aspects of it, but I've got only a little practice on this topic.
I am looking for a good solution (or pattern, if available) for the following problem.
Assume there should be a transaction component which holds a pool of threads processing tasks from a queue, which is also part of this transaction component.
Each thread of this pool waits until there's a task to do, pops it from the queue, processes it and then waits for the next turn.
Assume also, there are multiple threads adding tasks to this queue. Then I want these threads to suspend until their tasks are processed.
If a task is processed, the thread, which enqueued the processed task, should be made runnable again.
The ruby class Thread provides the methods Thread#stop and Thread#run. However, I read, that you should not use these methods, if you want a stable implementation. And to use some kind of signalling mechanism.
In ruby, there are some classes which deal with synchronization and thread cooperation in general like Thread, Mutex, Monitor, ConditionVariable, etc.
Maybe ConditionVariable could be my friend, because it allows to emit signals, but I'm just not sure.
How would you implement this?
Ruby provides a threadsafe Queue class that will handles some of this for you:
queue.pop
Will block until a value is pushed to the queue. You can have as many threads as you want waiting on the queue in this fashion. If one of the things you push onto the queue is another queue or a condition variable then you could use that to signal task completion.
Threads are notoriously hard to reason about effectively. You may find that an alternative higher level approach such as celluloid easier to work with.

MPI: master-slave with the master also doing work

I'm implementing a standard MPI master/slave system: there is a master that distributes work, and there are slaves who ask for chunks and process data.
However... if implemented in a naive way (rank==0 is master, the rest are slaves), the master ends up doing no real work, but still takes one core for what needs practically no real computing power. So I tried to implement a separate "scheduler" thread in the master, but that involved sending MPI messages to itself, and didn't really work...
Do you have any ideas how to solve this?
As I realized after some googling: you can send messages to yourself using tags. Tags are a kind of filter: if you do a recv for only tag==1, then you'll receive only those, with later messages being able to overtake eariler ones.
So, as for the solution:
tag the "scheduler to worker" and "worker to scheduler" messages with a different id
if rank==0: start a scheduler thread
afterwards, regardless of the rank, request work.
This way, the rank 0 worker won't receive its own "let's give me work" messages, because they will have a "to be received by the scheduler only" tag.
Edit: this thing doesn't really seem to be thread-safe though... (= it sometimes crashes in "free()" even though it's written in Python...) so I'd be still interested in the real & proven solution :)

Inter-thread communication (worker threads)

I've created two threads A & B using CreateThread windows API. I'm trying to send the data from thread A to B.
I know I can use Event object and wait for the Event object in another using "WaitForSingleObject" method. What this event does all is just signal the thread. That's it! But how I can send a data. Also I don't want thread B to wait till thread A signals. It has it own job to do. I can't make it wait.
I can't find a Windows function that will allow me to send data to / from the worker thread and main thread referencing the worker thread either by thread ID or by the returned HANDLE. I do not want to introduce the MFC dependency in my project and would like to hear any suggestions as to how others would or have done in this situation. Thanks in advance for any help!
First of all, you should keep in mind that Windows provides a number of mechanisms to deal with threading for you: I/O Completion Ports, old thread pools and new thread pools. Depending on what you're doing any of them might be useful for your purposes.
As to "sending" data from one thread to another, you have a couple of choices. Windows message queues are thread-safe, and a a thread (even if it doesn't have a window) can have a message queue, which you can post messages to using PostThreadMessage.
I've also posted code for a thread-safe queue in another answer.
As far as having the thread continue executing, but take note when a change has happened, the typical method is to have it call WaitForSingleObject with a timeout value of 0, then check the return value -- if it's WAIT_OBJECT_0, the Event (or whatever) has been set, so it needs to take note of the change. If it's WAIT_TIMEOUT, there's been no change, and it can continue executing. Either way, WaitForSingleObject returns immediately.
Since the two threads are in the same process (at least that's what it sounds like), then it is not necessary to "send" data. They can share it (e.g., a simple global variable). You do need to synchronize access to it via either an event, semaphore, mutex, etc.
Depending on what you are doing, it can be very simple.
Thread1Func() {
Set some global data
Signal semaphore to indicate it is available
}
Thread2Func() {
WaitForSingleObject to check/wait if data is available
use the data
}
If you are concerned with minimizing Windows dependencies, and assuming you are coding in C++, then I recommend using Boost.Threads, which is a pretty nice, Posix-like C++ threading interface. This will give you easy portability between Windows and Linux.
If you go this route, then use a mutex to protect any data shared across threads, and a condition variable (combined with the mutex) to signal one thread from the other.
Don´t use a mutexes when only working in one single process, beacuse it has more overhead (since it is a system-wide defined object)... Place a critical section around Your data and try to enter it (as Jerry Coffin did in his code around for the thread safe queue).

Resources