TL;DR Laravel Horizon queue workers go to sleep for 60 seconds after each job they process
I have a big backlog in my Laravel Horizon queue. There are a lot of workers (maxProcesses is set to 30), but when I monitor the log file, the output suggests it processes exactly 30 jobs over the course of 2-3 seconds and then pauses for a full minute (more or less exactly 60 seconds).
Any ideas why this could be happening? Am I hitting some resource limit that is causing Horizon or Supervisor to hit the brakes?
Here's the relevant section from my horizon.php config file:
'environments' => [
    'production' => [
        'supervisor-1' => [
            'connection' => 'redis',
            'queue' => ['high', 'default', 'low'],
            'balance' => 'false',
            'minProcesses' => 3,
            'maxProcesses' => 30,
            'timeout' => 1800,
            'tries' => 3,
        ],
    ],
],
I have the exact same configuration in my local environment, and my throughput locally is ~600 jobs/minute. In production it hovers right around ~30 jobs/minute.
Update per @Qumber's request
For the most part these aren't actually jobs. They're events being handled by one or more listeners, most of which are super simple. For example:
public function handle(TransactionDeleted $event)
{
    // Remove the file records tied to the deleted transaction.
    TransactionFile::where("transaction_id", $event->subject->id)->delete();
}
Here's some queue config:
'redis' => [
    'driver' => 'redis',
    'connection' => 'default',
    'queue' => env('REDIS_QUEUE', 'default'),
    'retry_after' => 1900,
    'block_for' => null,
],
Update per @sykez's request
Here's the supervisor config in local:
[program:laravelqueue]
process_name=%(program_name)s_%(process_num)02d
command=php /path/to/artisan queue:work redis --once --sleep=1 --tries=1
autostart=true
autorestart=true
user=adam
numprocs=3
redirect_stderr=true
stdout_logfile=/path/to/worker.log
stopwaitsecs=3600
Here's the supervisor config in production:
[program:daemon-XXXXXX]
directory=/home/forge/SITE_URL/current/
command=php artisan horizon
process_name=%(program_name)s_%(process_num)02d
autostart=true
autorestart=true
user=forge
redirect_stderr=true
stdout_logfile=/home/forge/.forge/daemon-XXXXXX.log
stopwaitsecs=3600
The local Supervisor is running the queue directly with the --once flag, which bootstraps the entire framework for each job rather than running as a daemon. This, of course, should make it slower, not 20 times faster...
Another update
Thanks to some help from one of the core Laravel devs, we were able to determine that all of the "hanging" jobs were broadcast jobs, from events that were configured to broadcast after firing. We use Pusher as our broadcast engine. When Pusher is disabled (as it is in our local environment), then the jobs finish immediately with no pause.
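For anyone hitting the same wall, this is roughly what such an event looks like (a minimal sketch, not our actual class; the channel name is invented for illustration):
use Illuminate\Broadcasting\PrivateChannel;
use Illuminate\Contracts\Broadcasting\ShouldBroadcast;
use Illuminate\Queue\SerializesModels;

class TransactionDeleted implements ShouldBroadcast
{
    use SerializesModels;

    public $subject;

    public function __construct($subject)
    {
        $this->subject = $subject;
    }

    // Because the event implements ShouldBroadcast, firing it queues a
    // BroadcastEvent job; handling that job makes a blocking HTTP call
    // to Pusher, which is where our workers were stalling.
    public function broadcastOn()
    {
        return new PrivateChannel('transactions.' . $this->subject->id);
    }
}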
Related
I'm just moving our Laravel v8 queue driver from db to redis, using Horizon for management.
No matter what I configured in config/horizon.php, I was only getting either one worker process across all my queues or one worker per queue - but with no auto-balancing.
I modified the supervisor scheduler.conf to run 2 (or more) processes:
[program:horizon]
process_name=%(program_name)s_%(process_num)02d
command=php /www/E3/artisan horizon
autostart=true
autorestart=true
user=web
numprocs=2
redirect_stderr=true
stdout_logfile=/var/log/supervisor/horizon.log
stopwaitsecs=3600
but this seems to spawn multiple supervisors (in Horizon parlance) with one worker each, rather than multiple workers per supervisor.
I think Horizon is configured correctly:
'defaults' => [
    'supervisor-1' => [
        'connection' => 'redis',
        'queue' => ['high', 'updatestock', 'priceapi', 'pubsub', 'klaviyo', 'default', 'low'],
        'balance' => 'auto',
        'processes' => 2,
        'minProcesses' => 2,
        'maxProcesses' => 10,
        'maxTime' => 3600, // how long the process can run before restarting (to avoid memory leaks)
        'maxJobs' => 0,
        'balanceMaxShift' => 1,
        'balanceCooldown' => 3,
        'memory' => 128,
        'tries' => 3,
        'timeout' => 60,
        'nice' => 0,
    ],
],

'environments' => [
    'staging' => [
        'supervisor-1' => [
            'maxProcesses' => 3,
        ],
    ],
],
Also, at some point while attempting various changes, I stopped getting any data shown in pending/completed. The JSON responses show counts but not the job data; for instance, /horizon/api/jobs/completed?starting_at=-1&limit=50 returns:
{
    "jobs": [],
    "total": 13157
}
I think in this case you don't need to worry about supervisor.conf, since based on horizon.php, Laravel will auto-scale the number of workers.
Try changing maxProcesses and test it out.
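Concretely, that means keeping a single horizon master process and letting horizon.php drive the scaling. A sketch of the adjusted Supervisor block, reusing the paths from the question (dropping numprocs to 1 is the assumption here):
[program:horizon]
process_name=%(program_name)s
command=php /www/E3/artisan horizon
autostart=true
autorestart=true
user=web
numprocs=1
redirect_stderr=true
stdout_logfile=/var/log/supervisor/horizon.log
stopwaitsecs=3600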
I have a script which runs for about 5-8 minutes and at the end produces an XLS file. On localhost it works fine, but on the server it executes 3 times, and I cannot understand why.
Supervisor runs 8 queue worker processes.
The queue connection is set to redis.
Laravel 5.7.
Has anyone had the same problem and solved it?
.env
BROADCAST_DRIVER=redis
QUEUE_CONNECTION=redis
queue.php
'redis' => [
    'driver' => 'redis',
    'connection' => 'default',
    'queue' => 'default',
    'retry_after' => 90,
    'block_for' => null,
],
Update:
Changing retry_after => 900 doesn't help.
The worker starts with this command:
artisan queue:work redis --timeout=900 --sleep=3 --tries=3
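For context, this is the interaction that retry_after governs, as a minimal sketch (the job name is invented; this is not the actual export code):
use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Queue\InteractsWithQueue;

class ExportXls implements ShouldQueue
{
    use InteractsWithQueue, Queueable;

    public function handle()
    {
        // Stands in for the 5-8 minute export. With 'retry_after' => 90,
        // Redis releases the reserved job back onto the queue after 90
        // seconds while this copy is still running, so with --tries=3 a
        // second and third worker can pick it up and run it again.
        sleep(360);
    }
}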
Using Laravel 5.5, we need to use both Redis and SQS queues: Redis for our internal messaging and SQS for messages coming from a 3rd party.
config/queue.php has various connection information. The first key sets the default connection, and that default is the one used by the queue:work artisan command.
'default' => 'redis',

'connections' => [
    'sqs' => [
        'driver' => 'sqs',
        'key' => env('ACCESS_KEY_ID', ''),
        'secret' => env('SECRET_ACCESS_KEY', ''),
        'prefix' => 'https://sqs.us-west-1.amazonaws.com/account-id/',
        'queue' => 'my-sqs-que',
        'region' => 'us-west-1',
    ],
    'redis' => [
        'driver' => 'redis',
        'connection' => 'default',
        'queue' => env('REDIS_QUE', 'default'),
        'retry_after' => 90,
    ],
],
The question is: how can we use a different queue connection for queue:work?
If --queue=my-sqs-que is supplied with the default connection set to redis, Laravel looks under redis and obviously does not find my-sqs-que.
Setting the default to sqs would disable processing of our internal messages.
You can specify the connection when running queue:work, see Specifying the Connection and Queue:
You may also specify which queue connection the worker should utilize. The connection name passed to the work command should correspond to one of the connections defined in your config/queue.php configuration file:
php artisan queue:work redis
You will need to set up the corresponding connections per queue as well.
However, any given queue connection may have multiple "queues" which may be thought of as different stacks or piles of queued jobs.
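Putting that together for the config above, one worker per connection would look like this (the Redis queue name 'default' is assumed from the config):
php artisan queue:work redis --queue=default
php artisan queue:work sqs --queue=my-sqs-que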
So, I configured my QUEUE_DRIVER to redis.
The queue.php:
'redis' => [
    'driver' => 'redis',
    'connection' => 'default',
    'queue' => 'default',
    'expire' => 90,
    'retry_after' => 550,
],
Supervisor is configured like this:
command=php /home/xxxxx/domains/xxxxx/public_html/artisan queue:work redis --sleep=3 --tries=5 --timeout=500
The job is being dispatched like this:
// Dispatch the job with a one-minute delay.
$job = (new CreateOrder($orderHeaderToPush, $order->order_id))
    ->delay(Carbon::now()->addMinutes(1));
dispatch($job);
I need the --tries argument to be higher because multiple users perform this operation at the same time.
PROBLEM
Inside the job I have a Log::debug() call. One minute after the job is dispatched, the order comes in, but no debug logging is present. After a long time (around the 500-second mark) the job is dispatched again, this time logging via Log::debug().
What exactly is happening? The job has not failed. How can it run its other methods without ever reaching the Log::debug() call?
I am using the vladimir-yuldashev/laravel-queue-rabbitmq library to use RabbitMQ queues in a Lumen project.
The queue functionality is working fine, but I see tons of the errors below in my log file.
lumen.ERROR: PhpAmqpLib\Exception\AMQPRuntimeException: Channel connection is closed. in /var/www/html/vendor/php-amqplib/php-amqplib/PhpAmqpLib/Channel/AbstractChannel.php:227
From the error stack trace, it seems that the queue name is being taken as NULL. Here is my rabbitmq connection configuration from queue.php:
'rabbitmq' => [
    'driver' => 'rabbitmq',
    'host' => env('RABBITMQ_HOST', 'rabbitmq'),
    'port' => env('RABBITMQ_PORT', 5672),
    'vhost' => env('RABBITMQ_VHOST', '/'),
    'login' => env('RABBITMQ_LOGIN', 'guest'),
    'password' => env('RABBITMQ_PASSWORD', 'guest'),
    'queue' => env('RABBITMQ_QUEUE'), // name of the default queue
    'exchange_declare' => env('RABBITMQ_EXCHANGE_DECLARE', true), // create the exchange if not exists
    'queue_declare_bind' => env('RABBITMQ_QUEUE_DECLARE_BIND', true), // create the queue if not exists and bind to the exchange
    'queue_params' => [
        'passive' => env('RABBITMQ_QUEUE_PASSIVE', false),
        'durable' => env('RABBITMQ_QUEUE_DURABLE', true),
        'exclusive' => env('RABBITMQ_QUEUE_EXCLUSIVE', false),
        'auto_delete' => env('RABBITMQ_QUEUE_AUTODELETE', false),
    ],
    'exchange_params' => [
        'name' => env('RABBITMQ_EXCHANGE_NAME', null),
        'type' => env('RABBITMQ_EXCHANGE_TYPE', 'direct'), // more info at http://www.rabbitmq.com/tutorials/amqp-concepts.html
        'passive' => env('RABBITMQ_EXCHANGE_PASSIVE', false),
        'durable' => env('RABBITMQ_EXCHANGE_DURABLE', true), // the exchange will survive server restarts
        'auto_delete' => env('RABBITMQ_EXCHANGE_AUTODELETE', false),
    ],
    'sleep_on_error' => env('RABBITMQ_ERROR_SLEEP', 5), // the number of seconds to sleep if there's an error communicating with rabbitmq
],
I am not using the default queue. Instead, each of my event listeners declares a queue for itself (see the sketch below the commands). Here is how I am using the queue commands to start the worker and the listeners:
Worker
php artisan queue:work rabbitmq
Listeners
php artisan queue:listen --queue=my-queue-1 --timeout=0
php artisan queue:listen --queue=my-queue-2 --timeout=0
php artisan queue:listen --queue=my-queue-3 --timeout=0
Each of these queues is working fine.
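For reference, this is roughly how each listener declares its own queue, via the standard $queue property on a queued listener (the class name is invented for illustration):
use Illuminate\Contracts\Queue\ShouldQueue;

class SendOrderNotification implements ShouldQueue
{
    // Laravel pushes this listener's queued jobs onto this queue
    // instead of the connection's default queue.
    public $queue = 'my-queue-1';

    public function handle($event)
    {
        // ...
    }
}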
My questions are:
Is it ok to start just one worker for multiple listeners?
Why are my logs filled up with these errors? And how can I fix this?
One more note, in case it matters: my events are chained, i.e., I am firing event 2 from event 1's listener, and so on.
OK, I finally had a breakthrough on this one. Apparently, the error was happening because of the command php artisan queue:work rabbitmq, as I was not passing the --queue option and I don't have a default queue declared in my .env file.
As per this question on SO, my understanding of how these queue commands work was incorrect.
As mentioned in the above URL, I have completely removed queue:listen and used multiple queue:work commands, passing the queue name to each work command. After the changes, this is what my commands look like:
php artisan queue:work --queue=my-queue-1 --timeout=0
php artisan queue:work --queue=my-queue-2 --timeout=0
php artisan queue:work --queue=my-queue-3 --timeout=0
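If a bare queue:work without --queue should also keep working, the complementary fix implied by the config above is to declare a default queue in .env (the value here is just an illustration):
RABBITMQ_QUEUE=default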