We need a method of determining the load and then scaling the number of dyno workers accordingly.
I am using the workless gem with DelayedJob, and it works like a charm!
Basically, you just need to install it and scale the worker dynos to 0. When a new job is added to the DJ queue, it picks the job up within a few seconds, adds a worker, and scales back down once the task has been performed. There are options for multiple workers, but I never had that many jobs, so I can't share any experience with them.
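The scale-up and scale-down behaviour described above boils down to a small decision function. Here is a minimal sketch of that idea, not workless's actual code: the method name `workers_needed` and the `jobs_per_worker` and `max_workers` parameters (and their default values) are assumptions made up for illustration.

```ruby
# Hypothetical helper: decide how many worker dynos to run for a given
# number of pending jobs. Not part of workless or DelayedJob.
def workers_needed(pending_jobs, jobs_per_worker: 25, max_workers: 3)
  return 0 if pending_jobs <= 0                 # empty queue: scale down to zero
  needed = (pending_jobs.to_f / jobs_per_worker).ceil
  [needed, max_workers].min                     # never exceed the configured cap
end

puts workers_needed(0)
puts workers_needed(26)
```

The result would then be handed to whatever actually scales the dynos; that part depends on your platform and is left out here.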
Found this article that shows how to scale workers via a Ruby script.
I'm building an application in Laravel (v9) where users upload videos, which then get converted to MP4 (showing a progress percentage), get a thumbnail created, and so on.
Once the video is uploaded, I dispatch a new job in the background that runs all my FFMPEG commands, and marks the video as ready on the database once FFMPEG has finished.
However, if there are multiple users uploading multiple videos, this leaves them waiting, as Laravel’s queue executes each job one by one.
How can I make it so that videos get converted immediately without waiting for the previous job to finish?
You'll probably always want to use a queue, but you could look into increasing the number of queue workers that run at any given time. Take a look at the Laravel docs on running your queue via Supervisor, and consider setting the numprocs value high enough to support the concurrent load you need to handle.
The caveat is that each queue worker will need CPU/memory, so if you set the number of concurrent workers too high, it may exceed your server's capacity.
You can use this article on php-fpm tuning to help figure out your server capacity needs. The article is focused on tuning web servers, but you can use the same technique to determine how much memory your queue workers are using, and from there determine how many workers you can reasonably run at once.
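The capacity estimate described above is simple arithmetic once you have measured how much memory one worker uses. A rough sketch, assuming made-up figures: `max_queue_workers` and its parameters are hypothetical names, and the numbers in the usage line are placeholders, not measurements.

```ruby
# Hypothetical helper: how many queue workers fit in memory once the OS,
# web server, and database have taken their share.
def max_queue_workers(total_mb:, reserved_mb:, per_worker_mb:)
  raise ArgumentError, "per_worker_mb must be positive" unless per_worker_mb.positive?

  available = total_mb - reserved_mb
  return 0 if available <= 0                    # nothing left for workers
  (available / per_worker_mb).floor             # round down to stay within memory
end

# Placeholder figures: a 4 GB server, 1 GB reserved, about 150 MB per worker.
puts max_queue_workers(total_mb: 4096, reserved_mb: 1024, per_worker_mb: 150)
```

A number like this is only an upper bound for numprocs; ffmpeg itself is CPU-heavy, so the CPU core count may turn out to be the tighter limit.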
One other option would be to look at Sidecar to run your ffmpeg processes in AWS Lambdas rather than relying on a queue at all. This project may help you get started…
In my Laravel 5.1 project, I want to start my second job when the first one has finished.
Here is my logic.
\Queue::push(new MyJob());
and when this job finishes, I want to start this job:
\Queue::push(new ClearJob());
How can I achieve this?
If you want this, you should define just one queue.
A queue is just a list/line of things waiting to be handled in order,
starting from the beginning. When I say things, I mean jobs. - https://toniperic.com/2015/12/01/laravel-queues-demystified
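To make the "handled in order" point concrete, here is a small language-agnostic illustration written in Ruby, where the standard library's thread-safe `Queue` stands in for the Laravel queue and a single thread stands in for the single worker. The job names mirror the question; everything else is an assumption for the sake of the sketch.

```ruby
# Illustration only: one queue plus one worker means jobs run strictly in
# the order they were pushed.
jobs = Queue.new
log  = []

jobs << -> { log << "MyJob finished" }
jobs << -> { log << "ClearJob finished" }
jobs.close                       # signal that no more jobs will be pushed

worker = Thread.new do
  while (job = jobs.pop)         # pop returns nil once a closed queue is empty
    job.call
  end
end
worker.join

p log
```

With a single worker, the second job cannot start before the first has returned; that guarantee goes away as soon as a second worker reads from the same queue.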
To get the opposite of what you want (jobs executed asynchronously), you would define a separate queue for each job.
Multiple Queues and Workers
You can have different queues/lists for
storing the jobs. You can name them however you want, such as “images”
for pushing image processing tasks, or “emails” for queue that holds
jobs specific to sending emails. You can also have multiple workers,
each working on a different queue if you want. You can even have
multiple workers per queue, thus having more than one job being worked
on simultaneously. Bear in mind having multiple workers comes with a
CPU and memory cost. Look it up in the official docs, it’s pretty
straightforward.
I am using the Sinatra gem for my API. When a request is received, I want to process it, return the response, and then start a new long-running task.
I am a newbie to Ruby. I have read about threading, but I'm not sure what the best way to accomplish my task is.
Here is my Sinatra endpoint:
post '/items' do
  # Processing data
  # Return response (body ...)
  # Start long running task
end
I would be grateful for any advice or example.
I believe the better way to do it is to use background jobs. While your web worker executes a long-running task, it is unavailable for new requests. With background jobs, a separate worker does the heavy work while your web worker stays free to handle new requests.
You can have a look at the most popular background job gems for Ruby as a starting point: resque, delayed_job, sidekiq
UPD: The implementation depends on the chosen gem, but the general scheme will look like this:
# Controller
post '/items' do
  # Processing data
  MyAwesomeJob.enqueue # here you put your job into the queue
  status 200 # or whatever response fits
end
In MyAwesomeJob you implement your long-running task.
Next, about Mongoid and background jobs. You should never use complex objects as job arguments. I don't know what kind of task you are implementing, but there is a general answer: use simple objects.
For example, instead of passing your User as an argument, pass user_id and then look the user up inside your job. If you do it like that, you can use any DB without problems.
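Here is a minimal sketch of the "pass the id, not the object" advice. It does not use any real job gem: `WelcomeJob`, `payload_for`, and the in-memory `USERS` hash (standing in for the database) are all hypothetical names for illustration.

```ruby
require "json"

User  = Struct.new(:id, :name)
USERS = { 42 => User.new(42, "alice") }   # stand-in for the database

class WelcomeJob
  # Only the id is serialized; this is what would be stored in the queue.
  def self.payload_for(user)
    JSON.generate("class" => name, "args" => [user.id])
  end

  # The job re-loads the record itself, so it always sees fresh data.
  def self.perform(user_id)
    user = USERS.fetch(user_id)
    "welcome #{user.name}"
  end
end

payload = WelcomeJob.payload_for(USERS[42])
args    = JSON.parse(payload)["args"]
result  = WelcomeJob.perform(*args)
puts payload
puts result
```

Because the payload is just a class name and an integer, it survives serialization to JSON regardless of which database or ODM holds the actual record.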
Agree with unkmas.
There are two ways to do this.
Threads or a background job gem like sidekiq.
Threads are perfectly fine if the processing times aren't that high and if you don't want to write code for a worker. But there is a strong possibility that you will spin up too many threads if you don't use a thread pool or if you're expecting bursty HTTP traffic.
The best way to do it is to use sidekiq or something similar. You could even put a job queue like beanstalkd in between, enqueue the job to it, and return the response. You can then have a worker reading from the queue and processing the job later on.
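For the thread route, a bounded pool avoids the "too many threads" problem mentioned above. This is a hand-rolled sketch using only Ruby's standard `Thread` and `Queue`; the pool size and the squaring tasks are arbitrary placeholders, and a production app would more likely reach for an existing thread pool library.

```ruby
POOL_SIZE = 4                    # arbitrary cap on concurrent threads
tasks     = Queue.new
results   = Queue.new

# Start a fixed number of threads; each one pulls tasks until the queue
# is closed and drained.
workers = Array.new(POOL_SIZE) do
  Thread.new do
    while (task = tasks.pop)     # pop returns nil once a closed queue is empty
      results << task.call
    end
  end
end

20.times { |i| tasks << -> { i * i } }   # a burst of work, still only 4 threads
tasks.close
workers.each(&:join)

squares = []
squares << results.pop until results.empty?
p squares.sort
```

However many tasks arrive, the number of threads stays at `POOL_SIZE`; extra tasks simply wait in the queue.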
I have set up queues in Laravel for my processing scripts.
I am using beanstalkd and supervisord.
There are 6 different tubes for different types of processing.
The issue is that for each tube, artisan is constantly spawning workers every second.
The worker code seems to sleep for 1 second, and then the worker thread uses 7-15% CPU. Multiply this by 6 tubes (and I would like to have multiple workers per tube) and my CPU is being eaten up.
I tried changing the 1 second sleep to 10 seconds.
This helps but there is still a huge cpu spike every 10 seconds when the workers wake back up.
I am not even processing anything at this time because the queues are completely empty, it is simply the workers looking for something to do.
I also tested the CPU usage of Laravel when I refreshed the page in a browser, and that was hovering around 10%. I am on a low-end Rackspace instance right now, so that could explain it, but still... it seems like the workers spin up a Laravel instance every time they wake up.
Is there no way to solve this? Do I just have to put a lot of money into a more expensive server just to be able to listen to see if a job is ready?
EDIT:
Found a solution... it was to NOT use artisan queue:listen or queue:work.
I looked into the queue code and there doesn't seem to be a way around this issue; it requires Laravel to load every time a worker checks for more work to do.
Instead I wrote my own listener using pheanstalk.
I am still using laravel to push things into the queue, then my custom listener is parsing the queue data and then triggering an artisan command to run.
Now the CPU usage of my listeners is close to 0%. The only time my CPU shoots up is when a listener actually finds work to do and triggers the command, and I am fine with that.
The high CPU usage is caused by the worker loading the complete framework every time it checks for a job in the queue. In Laravel 4.2, you can use php artisan queue:work --daemon. This loads the framework once, and the checking/processing of jobs happens inside a while loop, which lets the CPU breathe easy. You can find more about the daemon worker in the official documentation: http://laravel.com/docs/queues#daemon-queue-worker.
However, this benefit comes with a drawback - you need special care when deploying the code and you have to take care of the database connections. Usually, long running database connections are disconnected.
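The difference the daemon mode makes can be shown with a toy worker: boot once, then poll in a loop and sleep only when the queue is empty. This is a language-agnostic sketch written in Ruby, not Laravel's implementation; `PollingWorker`, `boot_framework`, and `max_idle_polls` are hypothetical names, and `max_idle_polls` exists only so the example terminates (a real daemon loops until it is stopped).

```ruby
class PollingWorker
  attr_reader :boots, :results

  def initialize(jobs, sleep_seconds: 3)
    @jobs          = jobs
    @sleep_seconds = sleep_seconds
    @boots         = 0
    @results       = []
  end

  def run(max_idle_polls: 2)
    boot_framework                  # expensive step, done once up front
    idle_polls = 0
    while idle_polls < max_idle_polls
      job = @jobs.shift
      if job
        @results << job.call        # work available: process it immediately
        idle_polls = 0
      else
        idle_polls += 1
        sleep @sleep_seconds        # queue empty: back off before polling again
      end
    end
  end

  private

  # Stand-in for loading the whole framework.
  def boot_framework
    @boots += 1
  end
end

worker = PollingWorker.new([-> { :resized }, -> { :emailed }], sleep_seconds: 0)
worker.run
puts worker.boots
```

The deployment caveat follows directly from this shape: since the boot step runs only once, the process keeps running old code until it is restarted.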
I had the same issue.
But I found another solution. I used the artisan worker as is, but I modified the 'watch' time. By default (in Laravel) this time is hardcoded to zero; I changed this value to 600 (seconds). See the file:
'vendor/laravel/framework/src/Illuminate/Queue/BeanstalkdQueue.php'
and in function
'public function pop($queue = null)'
So now the worker listens to the queue for up to 10 minutes. When it does not get a job, it exits, and supervisor restarts it. When it receives a job, it executes it, then exits, and supervisor restarts it.
==> No polling anymore!
Notes:
It does not work for iron.io queues or other drivers.
It might not work when you want one worker to accept jobs from more than one queue.
According to this commit, you can now set a new option to queue:listen "--sleep={int}" which will let you fine tune how much time to wait before polling for new jobs.
Also, default has been set to 3 instead of 1.
We want to use resque to queue a bunch of jobs, and process them by workers. While the jobs are waiting to be processed, we want to know what is their position in the queue (as an indicator of how long they would have to wait). How do we find the position of a job in a queue?
Thanks in advance.
I'm assuming you are using the resque queue system (you have not mentioned the technology stack that you are using).
You can use resque-status, an extension to the resque queue system that provides simple trackable jobs.
resque-status provides a set of simple classes that extend resque’s default functionality (with 0% monkey patching) to give apps a way to track specific job instances and their status. It achieves this by giving job instances UUIDs and allowing the job instances to report their status from within their iterations.
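Once each job carries a UUID, its position can be found by scanning the pending payloads of the queue. The sketch below is a pure function over an array of payload hashes in the `{"class" => ..., "args" => [...]}` shape that resque stores; it assumes the UUID is the first element of `args`, which you should verify against your resque-status version. `queue_position` is a hypothetical helper, and the class names and ids in the sample data are made up.

```ruby
# Hypothetical helper: 1-based position of a job in a list of pending
# payloads, or nil if the job is no longer waiting.
# Assumption: the job's UUID is the first element of the payload's "args".
def queue_position(payloads, job_id)
  index = payloads.index { |payload| Array(payload["args"]).first == job_id }
  index && index + 1
end

# Sample data in the shape of decoded resque payloads (ids are made up).
pending = [
  { "class" => "EncodeJob", "args" => ["aaa-111", { "file" => "a.mov" }] },
  { "class" => "EncodeJob", "args" => ["bbb-222", { "file" => "b.mov" }] },
  { "class" => "EncodeJob", "args" => ["ccc-333", { "file" => "c.mov" }] }
]

p queue_position(pending, "bbb-222")
p queue_position(pending, "zzz-999")
```

In a real app the `pending` array would come from the queue itself, for example by peeking at the whole list in Redis. Note that the scan is linear in the queue length, and the position is only a snapshot, since workers keep popping jobs while you look.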