Rufus Scheduler blocks after how many threads? - ruby

I'm writing a scheduler with rufus where the scheduled tasks will overlap. This is expected behavior, but I'm curious how rufus handles the overlap. Will it overlap up to n threads and then block from there? Or does it continue to overlap without regard for how many concurrent tasks run at a time?
Ideally I would like to take advantage of rufus's concurrency and not have to manage my own pool of threads. I would like it to block once I've reached the max pool count.
scheduler = Rufus::Scheduler.new
# Syncs one tenant on every call. Overlapping calls allow multiple syncs to
# occur until all threads are expended, then block until a thread is available.
scheduler.every '30s', SingleTenantSyncHandler
Edit
I see from the README that rufus does use a thread pool in version 3.x.
You can set the max thread count like so:
scheduler = Rufus::Scheduler.new(:max_work_threads => 77)
I'm assuming this answers my question, but would still like confirmation from others.

Yes, I confirm, https://github.com/jmettraux/rufus-scheduler/#max_work_threads answers your question. Note that this thread pool is shared among all scheduled jobs in the rufus-scheduler instance.
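For illustration, here is a minimal sketch of that setup, assuming rufus-scheduler 3.x; the pool size of 4, the sleep, and the handler body are illustrative, not part of the original question:

require 'rufus-scheduler'

# Cap the shared work-thread pool at 4 threads (arbitrary size for the sketch).
scheduler = Rufus::Scheduler.new(max_work_threads: 4)

class SingleTenantSyncHandler
  def call(job, time)
    # Long-running sync work goes here; overlapping triggers are served by the
    # shared pool, so at most max_work_threads syncs run at once.
    sleep 120
  end
end

scheduler.every '30s', SingleTenantSyncHandler

scheduler.join # keep the main thread alive so the jobs can run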

Related

Persistent threads on Windows Thread Pool

I copied this code from the Windows samples:
auto workItemHandler = ref new WorkItemHandler([this](IAsyncAction ^ action)
{
    while (action->Status == AsyncStatus::Started)
    {
        // Main Loop
    }
});
m_renderLoopWorker = ThreadPool::RunAsync(workItemHandler, WorkItemPriority::High, WorkItemOptions::TimeSliced);
but I have experienced some unreproducible lag at times (although maybe it's from the GPU).
On the other hand
WorkItemOptions::TimeSliced: The work items should be run simultaneously with other work items sharing a processor.
doesn't sound like a high performance option.
WorkItemOptions::None: The work item should be run when the thread pool has an available worker thread.
Where would you want to use WorkItemOptions::TimeSliced vs WorkItemOptions::None?
Is it ever advisable to use CreateThread over running a task on the thread pool for persistent work?
WorkItemOptions::TimeSliced => preemptive multitasking
WorkItemOptions::None => cooperative multitasking
When you would want to use each one is difficult to say.
If you use None and all the threads in the thread pool are currently in use, your task won't run until a thread finishes its job.
With TimeSliced, each task is allowed a time slice; when the time is up, your task is paused and the thread switches to another task. This way, if you have 100 work items but only 10 threads, all work items will progress little by little, just 10x slower.
If you need to update something regularly, let's say a progress bar, you would rather use TimeSliced.
It is perfectly acceptable to use CreateThread for a long task, and a render loop fits that description. This way you have your own thread to yourself to do whatever you want. At the OS level there is preemptive multitasking anyway; otherwise, if your processor had only 2 cores and you ran 3 threads, the 3rd thread would hang.
The main point of thread pools is to avoid creating a new thread for every little task you want to do, because that incurs overhead.

Spring Boot: How to schedule and execute tasks concurrently?

What I want to do is as follows:
In a Spring Boot application,
Schedule tasks (functions, or a method of a class) with cron expressions (cron expressions can be different for each task).
When it's time to run a task, run it, concurrently with other tasks if necessary (start time is the same, running periods overlap, etc.) - and without any limitation on the concurrency.
The tasks can take several minutes.
The number of tasks (and their options) and the cron expressions cannot be determined at development time. They are end-user configurable.
The scheduler must satisfy the following requirements.
It must not have a wait queue. If a scheduled time arrives, the task must be executed immediately (don't worry about the number of threads).
When the tasks are not running, the number of idle threads should be minimal - or the number should be controllable.
I've looked at ThreadPoolTaskScheduler, but it seems that it fails to satisfy the above requirements.
Thank you in advance.

Spring Task Executor thread count keeps increasing

The following are the properties I have set:
spring.task.execution.pool.core-size=50
spring.task.execution.pool.max-size=200
spring.task.execution.pool.queue-capacity=100
spring.task.execution.shutdown.await-termination=true
spring.task.execution.shutdown.await-termination-period=10s
spring.task.execution.thread-name-prefix=async-task-exec-
I still see thread names as - "async-task-exec-7200"
Does it mean it is creating 7200 threads?
Also, another issue I observed is that @Async would wait for more than 10 minutes to get a thread and relieve the parent thread.
You specified core size of 50 and max size of 200. So your pool will normally run with 50 threads, and when there is extra work, it will spawn additional threads, you'll see "async-task-exec-51", "async-task-exec-52" created and so on. Later, if there is not enough work for all the threads, the pool will kill some threads to get back to just 50. So it may kill thread "async-task-exec-52". The next time it has too much work for 50 threads, it will create a new thread "async-task-exec-53".
So the fact that you see "async-task-exec-7200" means that over the life time of the thread pool it has created 7200 threads, but it will still never have more than the max of 200 running at the same time.
If an @Async method is waiting 10 minutes for a thread, it means that you have put so much work into the pool that it has already spawned all 200 threads and they are processing, and you have filled up the queue capacity of 100, so now the parent thread has to block (wait) until there is at least a spot in the queue to put the task.
If you need to consistently handle more tasks, you will need a powerful enough machine and enough max threads in the pool. But if your workload is just very spiky, you don't want to spend on a bigger machine, and you are OK with tasks sometimes waiting longer, you might be able to get away with just raising your queue-capacity, so the work will queue up and eventually your threads might catch up (if task creation slows down).
Keep trying combinations of these settings to see what will be right for your workload.

How can I make resque worker process other jobs while current job is sleeping?

Each task I have works in short bursts, then sleeps for about an hour, then works again, and so on until the job is done. Some jobs may take about 10 hours to complete and there is nothing I can do about it.
What bothers me is that while a job is sleeping, the resque worker stays busy, so if I have 4 workers and 5 jobs, the last job has to wait up to 10 hours until it can be processed, which is grossly suboptimal since it could run while any other worker is sleeping. Is there any way to make a resque worker process other jobs while the current job is sleeping?
Currently I have a worker similar to this:
class ImportSongs
  def self.perform(api_token, songs)
    api = API.new api_token

    songs.each_with_index do |song, i|
      # make current worker proceed with another job while it's sleeping
      sleep 60 * 60 if i != 0 && i % 100 == 0

      api.import_song song
    end
  end
end
It looks like the problem you're trying to solve is API rate limiting with batch processing of the import process.
You should have one job that runs as soon as it's enqueued to enumerate all the songs to be imported. You can then break those down into groups of 100 (or whatever size you have to limit it to) and schedule a deferred job using resque-scheduler in one hour intervals.
However, if you have a hard API rate limit and you execute several of these distributed imports concurrently, you may not be able to control how much API traffic is going at once. If your rate limit is that strict, you may want to build a specialized process as a single point of control to enforce the rate limiting, with its own work queue.
With resque-scheduler, you'll be able to repeat discrete jobs at scheduled or delayed times as an alternative to a single, long running job that loops with sleep statements.
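For illustration, a rough sketch of that batching approach, assuming resque-scheduler is available; EnqueueSongImports, ImportSongBatch, the batch size of 100, and the one-hour spacing are placeholders, not part of the original answer:

require 'resque'
require 'resque-scheduler'

# Enqueued once; splits the songs into batches and schedules each batch.
class EnqueueSongImports
  @queue = :imports

  def self.perform(api_token, songs)
    songs.each_slice(100).with_index do |batch, i|
      # Space the batches an hour apart instead of sleeping inside a worker.
      Resque.enqueue_in(i * 60 * 60, ImportSongBatch, api_token, batch)
    end
  end
end

# Imports a single batch; each run is a short, discrete job.
class ImportSongBatch
  @queue = :imports

  def self.perform(api_token, batch)
    api = API.new api_token
    batch.each { |song| api.import_song song }
  end
end

With this layout a worker is only occupied while it is actually importing, so the 4 workers / 5 jobs situation from the question no longer leaves a job stuck behind a sleeping one.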

How to launch multiple worker processes in eventmachine?

I'm using rails 3, eventmachine and rabbitmq.
When I publish message(s) to a queue, I need to launch multiple worker processes.
I understand that eventmachine is a solution for my scenario.
Some tasks will take longer than others.
Using eventmachine, from most code samples it looks like only a single thread/process will be run at any given time.
How can I launch 2-4 worker processes at a single time?
If you use the EM.defer method, every proc you pass to it will be put in the thread pool (which defaults to 20 threads). You can have as many workers as you want by changing EM.threadpool_size.
worker = Proc.new do
  # log running job
end

EM.defer(worker)
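For illustration, a slightly fuller sketch, assuming a plain EventMachine reactor; the pool size of 4, the ten jobs, and the fake work are placeholders:

require 'eventmachine'

EM.threadpool_size = 4 # cap the number of concurrent deferred workers

EM.run do
  10.times do |i|
    work = proc do
      sleep rand(1..3) # simulate a task of varying length
      "job #{i} done"
    end

    # The callback runs back on the reactor thread with the proc's return value.
    callback = proc { |result| puts result }

    EM.defer(work, callback)
  end

  # Stop the reactor after a while so the example exits.
  EM.add_timer(15) { EM.stop }
end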
