Laravel Horizon inactive and still processing - laravel

I run my Application on Kubernetes.
I have one Service for requests and one service for the worker processes.
If I access the Horizon UI, it often shows the Inactive status, but jobs are still being processed by the worker. I know this because the JOBS PAST HOUR count keeps increasing.
If I scale up my worker service, jobs constantly "fail" with the exception Illuminate\Queue\MaxAttemptsExceededException.
If I connect directly to the pods and run ps aux, I can see that Horizon instances are running.
If I connect to a pod on which the worker is running and execute the horizon:list command, it tells me that one (or multiple) masters are running.
How can I further debug this?
Laravel version: 5.7.15
Horizon version: 2.0.0
Redis version: 3.2.4

The issue was that the server time was out of sync, so the "old" ones got restarted all the time.
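To illustrate why unsynchronized clocks can cause this kind of symptom, here is a minimal Python sketch of a timestamp-based liveness check. It is not Horizon's actual code: the function is_stale, the 60-second timeout, and the clock values are all hypothetical, chosen only to show how a heartbeat written by a node with a lagging clock can look expired to another node the moment it is written.

```python
def is_stale(heartbeat_ts, observer_now, timeout_seconds):
    """Return True if the heartbeat looks older than the allowed timeout."""
    return observer_now - heartbeat_ts > timeout_seconds


TIMEOUT = 60          # hypothetical liveness timeout, in seconds
true_time = 1_000     # the "real" time at which the heartbeat is written

# Node A's clock lags two minutes behind; node B's clock is correct.
node_a_clock = true_time - 120
node_b_clock = true_time

heartbeat = node_a_clock  # node A stamps its heartbeat with its own clock

# Node B reads the heartbeat immediately, yet it already looks expired.
looks_dead_with_skew = is_stale(heartbeat, node_b_clock, TIMEOUT)

# With synchronized clocks the same heartbeat is considered fresh.
looks_dead_in_sync = is_stale(true_time, node_b_clock, TIMEOUT)
```

In this sketch a process that is perfectly healthy is reported as dead purely because of the two-minute skew, which is consistent with processes being treated as "old" and restarted.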

Related

Queued jobs are somehow being cached with Laravel Horizon using Supervisor

I have a really strange thing happening with my application that I am really struggling to debug and was wondering if anyone had any ideas or similar experiences.
I have an application running on Laravel v5.8 which is using Horizon to run the queued jobs on a Ubuntu 16.04 server. I have a feature that archives an account which is passed off to the queue.
I noticed that it didn't seem to be working, despite working locally and having had the tests passing for the feature.
My last attempt to debug was to comment out the entire handle method and add Log::info('wtf?!'); to see if even that would work. It didn't; in fact, it was still trying to run the commented-out code. I decided to restart supervisor and tried again. At last, I managed to get 'wtf?!' written to my logs.
I have since been unable to deploy my code without having to restart supervisor in order for it to recognise the 'new' code.
Does Horizon cache the jobs in any way? I can't see anything in the documentation.
Has anyone experienced anything like this?
Any ideas on how I can stop having to restart supervisor every time?
Thanks
As stated in the documentation:
Remember, queue workers are long-lived processes and store the booted application state in memory. As a result, they will not notice changes in your code base after they have been started. So, during your deployment process, be sure to restart your queue workers.
Alternatively, you may run the queue:listen command. When using the queue:listen command, you don't have to manually restart the worker after your code is changed; however, this command is not as efficient as queue:work.
And as stated in the Horizon documentation:
If you are deploying Horizon to a live server, you should configure a process monitor to monitor the php artisan horizon command and restart it if it quits unexpectedly. When deploying fresh code to your server, you will need to instruct the master Horizon process to terminate so it can be restarted by your process monitor and receive your code changes.
When you restart supervisor, you are restarting the command and loading the new code, so the behaviour you are seeing is exactly what is expected.
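The same effect can be reproduced outside Laravel. The following Python sketch is only an analogy, not Horizon code: the file name archive_job.py and the helper load_job are made up for the demonstration. It loads a job from disk once, "deploys" a new version of the file, and shows that the already-loaded copy keeps running the old code until the job is loaded again, which is what a restarted worker does.

```python
import importlib.util
import sys
import tempfile
from pathlib import Path

# Do not write .pyc files, so the second load always reads the source file.
sys.dont_write_bytecode = True


def load_job(path):
    """Load the job module from disk, as a freshly started worker would."""
    spec = importlib.util.spec_from_file_location("archive_job", str(path))
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


job_file = Path(tempfile.mkdtemp()) / "archive_job.py"

# Initial release: the worker boots and keeps this code in memory.
job_file.write_text("def handle():\n    return 'old code'\n")
worker = load_job(job_file)

# Deployment: the file on disk changes, but the running worker is untouched.
job_file.write_text("def handle():\n    return 'new code after deploy'\n")

before_restart = worker.handle()             # still the code loaded at boot
after_restart = load_job(job_file).handle()  # a restarted worker sees the change
```

For Horizon itself, the documentation quoted above describes the equivalent step: terminate the master process during deployment and let the process monitor start it again.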

How to deploy laravel into a docker container while there are jobs running

We are trying to migrate our Laravel setup to Docker. Dockerizing the Laravel app was straightforward; however, we ran into an issue: if we do a deployment while scheduled jobs are running, they are killed because the container is destroyed. What's the best practice here? Having a separate container to run the Laravel scheduler doesn't seem like it would solve the problem.
Run the scheduled job in a different container so you can scale it independently of the laravel app.
Run multiple containers of the scheduled job so you can stop some to upgrade them while the old ones will continue processing jobs.
Docker will send a SIGTERM signal to the container and wait for the container to exit cleanly before issuing SIGKILL (the time between the two signals is configurable, 10 seconds by default). This gives your current job a chance to finish cleanly (or to save a checkpoint to continue later).
The plan is to stop old containers and start new containers gradually so there aren't lost jobs or downtime. If you use an orchestrator like Docker Swarm or Kubernetes, they will handle most of these logistics for you.
Note: the Laravel scheduler is based on cron and will fire processes that will be killed by Docker. To prevent this, have the scheduler add a job to a Laravel queue. The queue worker is a foreground process, so the SIGTERM it receives gives it the chance to stop or save cleanly before being killed.
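The SIGTERM handling described above can be sketched in Python as follows. This is a generic illustration of the pattern, not Laravel's worker implementation: the GracefulWorker class and the job functions are hypothetical, and the example assumes a Unix-like system. The handler only sets a flag; the loop finishes the job in progress and then stops picking up new ones. To keep the example self-contained, the second job sends SIGTERM to its own process to simulate a container stop arriving mid-job.

```python
import os
import signal


class GracefulWorker:
    """Processes jobs one at a time and stops cleanly after SIGTERM."""

    def __init__(self):
        self.should_stop = False
        self.completed = []
        signal.signal(signal.SIGTERM, self._on_sigterm)

    def _on_sigterm(self, signum, frame):
        # Only record the request; the current job is allowed to finish.
        self.should_stop = True

    def run(self, jobs):
        for job in jobs:
            if self.should_stop:
                break  # do not start new work after a stop request
            job()
            self.completed.append(job.__name__)
        return self.completed


def job_a():
    pass


def job_b():
    # Simulate the container runtime sending SIGTERM while this job runs.
    os.kill(os.getpid(), signal.SIGTERM)


def job_c():
    pass


worker = GracefulWorker()
done = worker.run([job_a, job_b, job_c])
```

In this run job_b completes even though the signal arrives while it is executing, and job_c is never started. A real job must still finish within the grace period before SIGKILL is sent.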

Does Apache Twill relaunch containers that are killed by Yarn?

YARN kills containers when there is heavy load in the cluster. How does Apache Twill react when one of its runnables running in a container gets killed? Does it run with a reduced number of instances of the runnable, or does it relaunch it?
By default, Twill will keep trying to relaunch the instances indefinitely. As of version 0.10.0, you are able to specify a maximum number of retries.

DC/OS (Mesos/Marathon): how to set the time to restart a killed instance of an application

I have installed DC/OS (3 master and 7 slave servers, all CentOS 7).
I noticed a problem: when one of the slave servers shuts down, Mesos/Marathon starts the killed application instances again only after 5 minutes.
For example, I run 8 instances of a simple web application in Mesos/Marathon. When I shut down one slave server or deactivate its network interface, Marathon shows that some instances are killed. From this moment, Mesos/Marathon waits 5 minutes and then starts the killed instances on another online slave server.
My question is: how can I change this time? 5 minutes is too long. I read the DC/OS documentation, but I can't find the variable responsible for this.
I would be very thankful for your help.
You can have a look at the Marathon command-line flags. Based on your description, I guess the default for either task_launch_timeout or scale_apps_interval could be responsible for this.
I'm unsure, though, whether this can be configured on the fly or only during installation in DC/OS. I saw that there's a quite recent enhancement request to make Marathon flags passable via environment variables.

Spark workers and master not communicating (both start without error) in standalone cluster

I have the same question as "Spark Clusters: worker info doesn't show on web UI", but I can't seem to figure out what the problem is.
In addition to what's written there, there are two extra interesting/useful points:
The workers will show up on their respective worker webuis --> http://xxx.xx.xx.xx:8081/ (but not the master's (http://yyy.yy.yy.yy:8080/))
The workers will exit after a minute or so (presumably because they can't connect to the master), without an error message.
If I run sbin/start-slaves.sh, I see:
root#198.23.89.40: starting org.apache.spark.deploy.worker.Worker, logging to /usr/local/spark/sbin/../logs/spark-root-org.apache.spark.deploy.worker.Worker-1-alanbaresj5.hi.com.out
and if I run it again (without waiting for a minute or so):
root#198.23.89.40: org.apache.spark.deploy.worker.Worker running as process 30542. Stop it first.
Any thoughts?
Thanks!

Resources