I have set up queues in Laravel for my processing scripts.
I am using beanstalkd and supervisord.
There are 6 different tubes for different types of processing.
The issue is that for each tube, artisan is constantly spawning workers every second.
Each worker seems to sleep for 1 second, and every time it wakes up it uses 7-15% CPU. Multiply that by 6 tubes, and I would like to have multiple workers per tube, so my CPU is being eaten up.
I tried changing the 1 second sleep to 10 seconds.
This helps but there is still a huge cpu spike every 10 seconds when the workers wake back up.
I am not even processing anything at this time because the queues are completely empty; it is simply the workers looking for something to do.
I also measured Laravel's CPU usage when I refreshed a page in a browser, and it hovered around 10%. I am on a low-end Rackspace instance right now, so that could explain it, but it still seems like the workers spin up a Laravel instance every time they wake up.
Is there no way to solve this? Do I just have to put a lot of money into a more expensive server just to be able to listen to see if a job is ready?
EDIT:
Found a solution: do NOT use the artisan queue:listen or queue:work commands.
I looked into the queue code and there doesn't seem to be a way around this issue; it requires Laravel to load every time a worker checks for more work to do.
Instead I wrote my own listener using pheanstalk.
I am still using laravel to push things into the queue, then my custom listener is parsing the queue data and then triggering an artisan command to run.
Now the CPU usage of my listeners is close to 0%. The only time my CPU shoots up is when a listener actually finds work to do and triggers the command, and I am fine with that.
The high CPU usage is caused by the worker loading the complete framework every time it checks the queue for a job. In Laravel 4.2, you can use php artisan queue:work --daemon. This loads the framework once, and the checking and processing of jobs happen inside a while loop, which lets the CPU breathe easy. You can find more about the daemon worker in the official documentation: http://laravel.com/docs/queues#daemon-queue-worker.
However, this benefit comes with a drawback: you need to take special care when deploying code, and you have to look after the database connections, because long-running database connections are usually disconnected.
I had the same issue.
But I found another solution. I used the artisan worker as is, but I modified the 'watch' time. By default, Laravel hardcodes this time to zero; I changed the value to 600 (seconds). See the file:
'vendor/laravel/framework/src/Illuminate/Queue/BeanstalkdQueue.php'
and in function
'public function pop($queue = null)'
So now the worker waits on the queue for up to 10 minutes. If it does not get a job, it exits and supervisor restarts it. If it receives a job, it executes it, then exits, and supervisor restarts it.
==> No polling anymore!
notes:
It does not work for iron.io queues or other drivers.
It might not work when you want one worker to accept jobs from more than one queue.
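To make the "no polling" point concrete, here is a minimal sketch of the wire-level difference, assuming the 'watch' time above ends up as the timeout of beanstalkd's reserve-with-timeout command (that mapping is my reading of the answer, not something I have checked in the Laravel source). The function names are made up for illustration. With a timeout of 0 the server replies immediately, so the worker has to poll; with 600 the server holds the connection open until a job arrives or the timeout expires.

```python
def reserve_command(timeout_seconds: int) -> bytes:
    """Build the beanstalkd command that waits up to timeout_seconds for a job."""
    if timeout_seconds < 0:
        raise ValueError("timeout must be non-negative")
    return f"reserve-with-timeout {timeout_seconds}\r\n".encode("ascii")


def parse_reserve_reply(header: bytes):
    """Interpret the first line of the server's reply to a reserve command.

    Returns a (status, job_id, body_length) tuple; job_id and body_length
    are None unless a job was actually reserved.
    """
    words = header.strip().split()
    if not words:
        raise ValueError("empty reply")
    if words[0] == b"RESERVED" and len(words) == 3:
        return ("reserved", int(words[1]), int(words[2]))
    if words[0] == b"TIMED_OUT":
        return ("timed_out", None, None)
    if words[0] == b"DEADLINE_SOON":
        return ("deadline_soon", None, None)
    raise ValueError(f"unexpected reply: {header!r}")


if __name__ == "__main__":
    print(reserve_command(0))    # returns at once when the tube is empty
    print(reserve_command(600))  # blocks on the server for up to 10 minutes
    print(parse_reserve_reply(b"RESERVED 42 11\r\n"))
    print(parse_reserve_reply(b"TIMED_OUT\r\n"))
```

A TIMED_OUT reply is the case where the worker exits without a job and supervisor starts it again.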
According to this commit, you can now pass a new option to queue:listen, "--sleep={int}", which lets you fine-tune how long to wait before polling for new jobs.
Also, the default has been set to 3 instead of 1.
My project consumes several 3rd-party APIs that enforce request limits. It calls these APIs through Laravel jobs. I am using spatie/laravel-rate-limited-job-middleware for rate limiting.
Once a project is submitted, around 60 jobs are dispatched on average. These jobs need to be executed at a rate of 1 job per minute.
There is one supervisord program running 2 processes on the default queue with --tries=3.
Also, in config/queue.php, I am using 'retry_after' => (60 * 15) for redis to avoid retrying a job while it is still executing.
My current Rate Limiter middleware is coded this way
return (new RateLimited())
->allow(1)
->everySeconds(60)
->releaseAfterBackoff($this->attempts());
What happens is that 3 jobs get processed in 3 minutes, but after that all remaining jobs fail.
As far as I can tell, all jobs are requeued every minute, and once they cross the tries threshold (3) they are moved to failed_jobs.
I tried removing the --tries flag, but that didn't work. I also tried increasing it to --tries=20, but then the jobs fail after 20 minutes.
I don't want to hardcode the --tries flag, as in some situations more than 100 jobs can be dispatched.
I also want to increase the number of queue worker processes in supervisor so that a few jobs can execute in parallel.
I understand this is an issue with how the retry and timeout flags are configured, but I don't understand how to fix it. Any help would be appreciated.
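The behaviour described above (every release counting as an attempt until the tries threshold is crossed) can be reproduced with a small model. This is only a sketch under simplifying assumptions, not the middleware's actual code: one "round" stands for one 60-second rate-limit window, every waiting job is picked up once per round, and the function name and parameters are made up for illustration.

```python
from collections import deque


def simulate(job_count: int, tries: int, allowed_per_round: int = 1):
    """Return (succeeded, failed, rounds) for a simplified rate-limited queue.

    Each entry in `pending` is the number of attempts a job has used so far.
    """
    pending = deque([0] * job_count)
    succeeded = failed = rounds = 0
    while pending:
        rounds += 1
        slots = allowed_per_round      # jobs the limiter lets through this round
        released = deque()
        while pending:
            attempts = pending.popleft() + 1   # every pick-up counts as an attempt
            if attempts > tries:
                failed += 1            # picked up once too often: job is failed
            elif slots > 0:
                slots -= 1
                succeeded += 1         # the limiter lets this job run
            else:
                released.append(attempts)      # rate limited: back onto the queue
        pending = released
    return succeeded, failed, rounds


if __name__ == "__main__":
    for tries in (3, 20, 60):
        print(tries, simulate(job_count=60, tries=tries))
```

In this model only as many jobs succeed as there are tries, which matches the 3 jobs in 3 minutes with --tries=3 and the failures after 20 minutes with --tries=20. The attempt budget has to cover the time the last job spends waiting in line, not just the retries of a single job.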
We have a job, MyPrettyJob, that is queued through redis from a controller. When we run this job directly from the command line, it succeeds. When we run the job with little data the queue worker stays online, but when we run it with a lot of data the queue worker crashes with an exit code of 12, which suggests an "Out of Memory" error.
The large job processes about 300,000 items, which mostly depend on each other. Because of that, we cannot really split up this job without a severe performance impact. In some extreme cases it could take hours instead of the few minutes it currently takes.
For the large job, the queue outputs the following:
$ php artisan queue:work --queue=myqueue
Processing: App\Jobs\MyPrettyJob
Processed: App\Jobs\MyPrettyJob
$ echo $?
12
The queue worker crashes regardless of whether anything is queued behind that job. That seems to suggest that it crashes during cleanup of the large job, but it does not give any indication of what goes wrong. The queue worker also crashes regardless of whether any database interactions are done, which rules out anything related to the database.
What is the queue doing in-between jobs? Can I debug in any way why it is getting out of memory after completing the job? Does the queue write something to a log maybe, or is it doing something in redis in between jobs? It seems like a really weird time for that process to crash.
Exit code 12 happens when the queue worker determines that it has used more memory than is allowed (see https://github.com/laravel/framework/blob/5.8/src/Illuminate/Queue/Worker.php#L199-L210 for the specific section of code). If you run php artisan queue:work --memory=<digit>, where the value is a number of megabytes large enough to fully run your job (for example, 1024 for 1GB), your job should be able to complete and the worker should keep running afterwards.
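To show why "Processed" is printed before the worker dies, here is a simplified model of that check. It is a sketch, not the code from Worker.php: the function name is made up, the 700 MB figure is a hypothetical measurement, and a real worker keeps waiting for more jobs instead of returning once the list is empty.

```python
def run_worker(jobs, memory_limit_mb):
    """Process jobs in order and return (log_lines, exit_code).

    `jobs` is a list of (name, memory_in_use_after_job_mb) pairs, standing in
    for what a real worker would read from the runtime after each job.
    """
    log = []
    for name, usage_mb in jobs:
        log.append(f"Processing: {name}")
        log.append(f"Processed:  {name}")
        # The limit is checked only after the job has finished, so the job
        # itself succeeds and the worker stops afterwards.
        if usage_mb >= memory_limit_mb:
            return log, 12
    return log, 0


if __name__ == "__main__":
    big_job = [("App\\Jobs\\MyPrettyJob", 700)]  # 700 MB is a made-up figure
    print(run_worker(big_job, memory_limit_mb=128))   # 128 is the --memory default
    print(run_worker(big_job, memory_limit_mb=1024))
```

With the 128 MB limit the job is logged as processed and the worker then stops with code 12. With 1024 the same job leaves the worker running.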
We're running into an issue with the SOS-Berlin JobScheduler running on Windows that is difficult to diagnose* and I would appreciate any guidance.
*Difficult because I don't know Scala (though I do know C++ and Java). It's difficult to navigate this code-base (some of it's in German).
We have a process-class called Foo, that will sometimes burst up outside the limit of how many processes can be run. So, for example, we limit the process-class to 30 processes and 60 want to run. This leaves 30 running and 30 "waiting for process."
The problem is that JobScheduler doesn't seem to prioritize the 30 that are waiting for a process. Instead, any new job that gets fired after the burst receives processes, leaving some jobs waiting indefinitely. Once the number of jobs "waiting for process" hits zero, the jobs clear out immediately.
Further, it seems that when there are a large number of jobs "waiting for process," the run time for tasks doubles or triples. A job that normally takes 20 seconds to run, will spike to 1-2 minutes, further amplifying the issue as processes are not released back to the pool.
Admittedly, we're running an older version of JS, which we're planning to upgrade this week or next. However, I'm wondering if there is something fundamental we're missing. We've turned down the logging, looked for DB locks, added memory to the heap, and shut down some other processes on the server. We've also increased the process pool, but we don't want to push it too far, lest we crush the server. Nothing seems to be alleviating the issue.
Any tuning help would be appreciated!
As a follow-up, we determined the cause of the issue.
Another user had been using the temp directory to store intermediate generated files. The user was not clearing out these files, resulting in hundreds of thousands of files in the directory. They were not very large, so we didn't notice. For some reason JobScheduler started to choke on this. I'm not clear on the reasons.
Clearing the temp directory, scolding the user, and fixing his script fixed the issue.
I have a beanstalkd instance with two workers picking jobs from one tube.
I've noticed that occasionally one of the workers will reserve a job that has already been reserved (and is being worked on) by the other worker.
I know there aren't duplicate jobs in the queue.
Why does beanstalkd allow the same job to be reserved twice?
It sounds to me like you haven't implemented the protocol properly. You need to handle DEADLINE_SOON and issue TOUCH.
What does DEADLINE_SOON mean?
DEADLINE_SOON is a response to a reserve command indicating that you have a job reserved whose deadline is real soon (current safety margin is approximately 1 second).
If you are frequently receiving DEADLINE_SOON errors on reserve, you should probably consider increasing the TTR on your jobs as it generally indicates you aren’t completing them in time. It may also be that you are failing to delete tasks when you have completed them.
See the mailing list discussion for more information.
How does TTR work?
TTR only applies to a job at the moment it becomes reserved. At that event, a timer (called “time-left” in the job stats) starts counting down from the job’s TTR.
If the timer reaches zero, the job gets put back in the ready queue.
If the job is buried, deleted, or released before the timer runs out, the timer ceases to exist.
If the job is touched before the timer reaches zero, the timer starts over, counting down from TTR.
The "touch" command
Allows a worker to request more time to work on a job.
This is useful for jobs that potentially take a long time, but you still want
the benefits of a TTR pulling a job away from an unresponsive worker. A worker
may periodically tell the server that it's still alive and processing a job
(e.g. it may do this on DEADLINE_SOON). The command postpones the auto
release of a reserved job until TTR seconds from when the command is issued.
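The TTR and touch rules quoted above can be modelled in a few lines. This is a toy simulation in whole seconds, not beanstalkd's implementation: the class and function names are made up, and touching when one second is left is an assumption based on the roughly 1 second safety margin mentioned for DEADLINE_SOON.

```python
class Reservation:
    """A reserved job with a countdown that starts at its TTR."""

    def __init__(self, ttr: int):
        self.ttr = ttr
        self.time_left = ttr
        self.state = "reserved"

    def tick(self, seconds: int = 1) -> None:
        """Advance the clock; at zero the job goes back to the ready queue."""
        if self.state != "reserved":
            return
        self.time_left = max(0, self.time_left - seconds)
        if self.time_left == 0:
            self.state = "ready"   # another worker can now reserve the same job

    def touch(self) -> None:
        """Restart the countdown from the full TTR."""
        if self.state != "reserved":
            raise RuntimeError("only a reserved job can be touched")
        self.time_left = self.ttr


def stays_reserved(ttr: int, work_seconds: int, touch_on_deadline: bool) -> bool:
    """True if one worker keeps the job for the whole duration of its work."""
    job = Reservation(ttr)
    for _ in range(work_seconds):
        if touch_on_deadline and job.time_left <= 1:
            job.touch()
        job.tick()
        if job.state != "reserved":
            return False
    return True


if __name__ == "__main__":
    print(stays_reserved(ttr=5, work_seconds=8, touch_on_deadline=False))
    print(stays_reserved(ttr=5, work_seconds=8, touch_on_deadline=True))
    print(stays_reserved(ttr=10, work_seconds=8, touch_on_deadline=False))
```

The first case is the "reserved twice" situation: the work outlives the TTR, so the job returns to the ready queue while the first worker is still busy with it. Touching the job or setting a larger TTR both keep it with one worker.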
The jobs take longer to run than the TTR, so they were being returned to the queue and picked up by the other worker.
I now set a larger TTR on the job.
I need asynchronous, quick processing of everything in the queue. The jobs consist of CURL requests, so it takes forever doing them one by one (they're basically the same as sleep(3)). I'd like all messages in the queue to run at the same time, or at least up to a limit like 50. The reason I'm using a queue for this and not just running them instantly is that I need to make sure that if anything fails, it is tried again.
Use the queue with iron.io's IronMQ push queues. The queue shouldn't fail, but in the unlikely event it does, there is a log.
See this link for reference http://blog.iron.io/2013/05/laravel-4-ironmq-push-queues-insane.html
From memory, you get 10 million requests free per month with IronMQ.