Problems with a long-lived Oracle connection in a Kubernetes cluster - Go

I am trying to deploy a Go application inside the Kubernetes cluster. My application uses the goracle.v2 library to connect to the Oracle database.
The problem only happens when my application is running inside the Kubernetes cluster. I have a process that executes a stored procedure and returns a cursor, and it often takes more than 10 minutes to execute.
When this happens, the active session that was open in the database ends, the pod that was running the application stops, and nothing else happens. This scenario only occurs when the app runs inside the cluster; if I run it locally, the problem does not appear even when the process takes more than 10 minutes.
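For reference, a minimal sketch of the kind of call involved - the connection string and procedure name below are placeholders, not the real ones:

    // Sketch only: long-running PL/SQL call through goracle.v2.
    package main

    import (
        "context"
        "database/sql"
        "log"
        "time"

        _ "gopkg.in/goracle.v2" // registers the "goracle" driver
    )

    func main() {
        // Placeholder DSN: user/password@host:port/service
        db, err := sql.Open("goracle", "scott/tiger@oracle-host:1521/ORCLPDB1")
        if err != nil {
            log.Fatal(err)
        }
        defer db.Close()

        // Give the call well over 10 minutes so the driver itself never cancels it.
        ctx, cancel := context.WithTimeout(context.Background(), 30*time.Minute)
        defer cancel()

        // my_pkg.my_proc is a placeholder for the long-running stored procedure.
        if _, err := db.ExecContext(ctx, "BEGIN my_pkg.my_proc; END;"); err != nil {
            log.Fatal(err)
        }
    }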
Anyone have any idea what might be happening?

Related

Queued jobs are somehow being cached with Laravel Horizon using Supervisor

I have a really strange thing happening with my application that I am really struggling to debug and was wondering if anyone had any ideas or similar experiences.
I have an application running on Laravel v5.8 which uses Horizon to run the queued jobs on an Ubuntu 16.04 server. I have a feature that archives an account, which is passed off to the queue.
I noticed that it didn't seem to be working, despite working locally and having had the tests passing for the feature.
My last attempt at debugging was to comment out the entire handle method and add Log::info('wtf?!'); to see if even that would work. It didn't; in fact, it was still trying to run the commented-out code. I decided to restart Supervisor and tried again. At last, I managed to get 'wtf?!' written to my logs.
I have since been unable to deploy my code without having to restart supervisor in order for it to recognise the 'new' code.
Does Horizon cache the jobs in any way? I can't see anything in the documentation.
Has anyone experienced anything like this?
Any ideas on how I can stop having to restart supervisor every time?
Thanks
As stated in the documentation here:
Remember, queue workers are long-lived processes and store the booted application state in memory. As a result, they will not notice changes in your code base after they have been started. So, during your deployment process, be sure to restart your queue workers.
Alternatively, you may run the queue:listen command. When using the queue:listen command, you don't have to manually restart the worker after your code is changed; however, this command is not as efficient as queue:work:
And as stated here in the Horizon documentation:
If you are deploying Horizon to a live server, you should configure a process monitor to monitor the php artisan horizon command and restart it if it quits unexpectedly. When deploying fresh code to your server, you will need to instruct the master Horizon process to terminate so it can be restarted by your process monitor and receive your code changes
When you restart Supervisor, you are essentially restarting the horizon command and loading the new code, so the behaviour you are seeing is exactly what is expected.
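In practice this means adding a restart step to every deployment. A minimal sketch (the artisan command is the one the Horizon documentation refers to; the Supervisor program name is a placeholder):

    # Ask the Horizon master process to terminate after finishing its current
    # jobs; Supervisor then starts it again with the freshly deployed code.
    php artisan horizon:terminate

    # Or restart the Supervisor program directly ("horizon" is a placeholder
    # for whatever the program is called in your supervisor config):
    sudo supervisorctl restart horizon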

Laravel Horizon inactive and still processing

I run my Application on Kubernetes.
I have one Service for requests and one service for the worker processes.
If I access the Horizon UI, it often shows the Inactive status, but jobs are still being processed by the worker. I know this because the JOBS PAST HOUR counter keeps increasing.
If I scale up my worker service, jobs constantly "fail" with the exception Illuminate\Queue\MaxAttemptsExceededException.
If I connect directly to the pods and run ps aux, I can see that horizon instances are running.
If I connect to a pod on which the worker is running and execute the horizon:list command, it tells me that one (or multiple) masters are running.
How can I further debug this?
Laravel version: 5.7.15
Horizon version: 2.0.0
Redis version: 3.2.4
The issue was that the server time was out of sync, so the "old" masters got restarted all the time.
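For anyone debugging the same symptom, one quick way to spot such drift is to compare the clocks of the worker pods against the Redis instance Horizon uses (pod and host names below are placeholders):

    # Clock of a worker pod (replace <worker-pod> with an actual pod name)
    kubectl exec <worker-pod> -- date -u

    # Clock of the Redis server (TIME returns unix seconds + microseconds)
    redis-cli -h <redis-host> TIME

    # Local reference clock
    date -u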

DC/OS (Mesos/Marathon): how to set the time before killed application instances are restarted

I have installed DC/OS (3 master and 7 slave servers - all CentOS 7).
I have noticed a problem: when one of the slave servers shuts down, Mesos/Marathon only restarts the killed application instances after 5 minutes.
For example, I run 8 instances of a simple web application in Mesos/Marathon. When I shut down one slave server, or deactivate its network interface, Marathon shows that some instances are killed. From that moment, Mesos/Marathon waits 5 minutes and then starts the killed instances on another online slave server.
My question is: how can I change this time? 5 minutes is too long. I have read the DC/OS documentation but I can't find the variable responsible for this.
I will be very thankful for your help.
You can have a look at the Marathon command-line flags. Based on your description, I guess the default for either task_launch_timeout or scale_apps_interval could be responsible for this.
I'm unsure, though, whether this can be configured on the fly or only during installation in DC/OS. I saw that there is a fairly recent enhancement request to make Marathon flags passable via environment variables.
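For illustration, on a standalone Marathon (outside DC/OS) these flags are passed on the command line when the service is started; both values are in milliseconds. Whether and where DC/OS exposes them is exactly the open question above:

    # Sketch only: standalone Marathon started from its distribution's start script
    ./bin/start --master zk://zk1:2181,zk2:2181,zk3:2181/mesos \
                --task_launch_timeout 60000 \
                --scale_apps_interval 60000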

Manually start HDFS every time I boot?

In short: should I start HDFS every time I come back to the cluster after a power-off?
I have successfully created a Hadoop cluster (after losing some battles) and now I want to be very careful about how I proceed with it.
Should I execute start-dfs.sh every time I power on the cluster, or is it ready to execute my application's code? The same goes for start-yarn.sh.
I am afraid that if I run it without everything being fine, it might leave garbage directories after execution.
Just from playing around with the Hortonworks and Cloudera sandboxes, I can say turning them on and off doesn't seem to demonstrate any "side-effects".
However, it is necessary to start the needed services every time the cluster starts.
As far as power cycling goes in a real cluster, it is recommended to stop the services running on the respective nodes before powering them down (stop-dfs.sh and stop-yarn.sh). That way there are no weird problems and any errors on the way to stopping the services will be properly logged on each node.
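A typical boot/shutdown sequence then looks like this (paths assume a standard install with the scripts in $HADOOP_HOME/sbin):

    # after powering the nodes on
    $HADOOP_HOME/sbin/start-dfs.sh    # NameNode, DataNodes, SecondaryNameNode
    $HADOOP_HOME/sbin/start-yarn.sh   # ResourceManager, NodeManagers

    # ... run your jobs ...

    # before powering the nodes off
    $HADOOP_HOME/sbin/stop-yarn.sh
    $HADOOP_HOME/sbin/stop-dfs.sh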

How much time does Tomcat take to replicate sessions on redeploy

We have a Tomcat cluster with two instances (version 5.5.25) running on a single machine. We use this to make sure our web page is available and that all sessions survive during redeployment.
We were wondering if sessions could be lost during this procedure. Here is what we do:
a) Application is running on tomcatA and tomcatB and all sessions are replicated
b) Application on tomcatA is undeployed, all requests are redirected to tomcatB
c) Application on tomcatA is deployed
* Now the cluster needs to make sure that all sessions are replicated properly on tomcatA
d) Application on tomcatB is undeployed, all requests are redirected to tomcatA
e) Application on tomcatB is deployed
We use an ant script and the tomcat manager tasks to control the deployment procedure. Note that we are not actually bringing down any tomcat instances, but rather we're just redeploying a particular application while the rest are still running.
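For illustration, the redeploy steps above might look roughly like this in Ant, using the Tomcat manager tasks from catalina-ant.jar (URLs, credentials, paths and the pause length are placeholders; the sleep is only a crude safety margin for session transfer, not something taken from our actual script):

    <!-- sketch only: requires catalina-ant.jar on the Ant classpath -->
    <taskdef name="deploy"   classname="org.apache.catalina.ant.DeployTask"/>
    <taskdef name="undeploy" classname="org.apache.catalina.ant.UndeployTask"/>

    <target name="rolling-redeploy">
      <!-- (b)+(c): redeploy on tomcatA while tomcatB keeps serving -->
      <undeploy url="http://tomcatA:8080/manager" username="admin" password="secret" path="/myapp"/>
      <deploy   url="http://tomcatA:8080/manager" username="admin" password="secret"
                path="/myapp" war="file:${basedir}/build/myapp.war"/>

      <!-- give the cluster time to transfer sessions back to tomcatA -->
      <sleep seconds="60"/>

      <!-- (d)+(e): same on tomcatB -->
      <undeploy url="http://tomcatB:8080/manager" username="admin" password="secret" path="/myapp"/>
      <deploy   url="http://tomcatB:8080/manager" username="admin" password="secret"
                path="/myapp" war="file:${basedir}/build/myapp.war"/>
    </target>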
Could sessions be lost in this procedure when the undeployment of the webapp on tomcatB happens right after the deployment on tomcatA?
Does the tomcat manager deploy task in step (c) only return once all sessions are successfully replicated?
If not is there a way to ensure all sessions are replicated before undeploying the webapp on tomcatB?
Thanks,
Mitko
