How to handle SIGTERM signal in a .NET 6 app deployed in Azure Kubernetes Service - .net-6.0

I have a .NET 6 application. The application contains a BackgroundService that is used to handle background jobs; these jobs fire at certain times based on different schedules.
I need to implement graceful shutdown when the application is stopping, because there might be running jobs and I want them to complete before the application stops.
The app is deployed in Azure Kubernetes Service. When I run kubectl delete pod {podname} to delete a certain pod, I am not able to handle the SIGTERM and run my graceful-shutdown logic.
I am using IHostApplicationLifetime and registering a callback on the ApplicationStopping event:
private readonly IHostApplicationLifetime _applicationLifetime;
// registered during startup, e.g. in StartAsync:
_applicationLifetime.ApplicationStopping.Register(OnAppStopping);
In the Dockerfile I added the line below:
STOPSIGNAL SIGTERM
The OnAppStopping callback is never fired on AKS.
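One frequent cause of SIGTERM never reaching a .NET process in a container is a shell-form ENTRYPOINT: the signal is delivered to /bin/sh running as PID 1, which does not forward it to dotnet. A minimal Dockerfile sketch using the exec form (MyApp.dll is a placeholder for the entry assembly):

```dockerfile
# Exec form: dotnet runs as PID 1 and receives SIGTERM directly.
# The shell form ("ENTRYPOINT dotnet MyApp.dll") would wrap the process
# in /bin/sh, which swallows the signal.
ENTRYPOINT ["dotnet", "MyApp.dll"]
# SIGTERM is already the default stop signal; kept here for clarity.
STOPSIGNAL SIGTERM
```

It may also help to raise HostOptions.ShutdownTimeout, since the generic host's default graceful-shutdown window in .NET 6 is only a few seconds.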

Related

When exactly SIGTERM is sent to Spring Boot App process in a container when preStop hook is used?

I'm trying to understand how traffic can be routed to a pod that has begun the shutdown process.
In the Spring Boot docs it is mentioned that:
Once the pre-stop hook has completed, SIGTERM will be sent to the container and graceful shutdown will begin, allowing any remaining in-flight requests to complete.
Kubernetes Container Lifecycle
But in Kubernetes docs we have
The Pod's termination grace period countdown begins before the PreStop hook is executed, so regardless of the outcome of the handler, the container will eventually terminate within the Pod's termination grace period. No parameters are passed to the handler.
Container hooks
The Kubernetes docs say "The Pod's termination grace period countdown begins before the PreStop hook is executed", which implies SIGTERM is sent before the hook is called. Isn't this in contradiction with Spring Boot, which says "Once the pre-stop hook has completed, SIGTERM will be sent to the container"?
It happens in the following order:
1. The countdown for the termination grace period starts.
2. The preStop hook starts executing.
3. The preStop hook finishes.
4. SIGTERM is issued to the container, and Spring Boot starts shutting down (possibly waiting, if graceful shutdown is configured).
If at any point the grace period is exceeded, SIGKILL is issued and all processes are terminated.
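The ordering above corresponds to these pod-spec fields; a minimal sketch (the image name, sleep duration, and 60-second grace period are placeholder values):

```yaml
spec:
  # the grace-period countdown covers BOTH the preStop hook and the
  # graceful shutdown that follows SIGTERM
  terminationGracePeriodSeconds: 60
  containers:
    - name: app
      image: example/app:latest
      lifecycle:
        preStop:
          exec:
            # give load balancers time to drain before SIGTERM arrives
            command: ["sh", "-c", "sleep 10"]
```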

Handle long running tasks gracefully

I am working in a microservices architecture. Every service has a few long-running tasks (data processing, report generation) which can take up to 1-2 hours. We are using Kafka for the queue.
How do I handle cases where a pod restarts or a deployment happens just before the completion of a task? The task will start again and take that much time again. Is there any way to run these tasks independently of the application pod?
You can use Kubernetes Jobs for these types of tasks, so as soon as the task is done Kubernetes will also auto-delete the pod.
Jobs are also configurable and run standalone, so if you deploy the job again it fetches the data from Kafka and a new job starts.
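A minimal Kubernetes Job manifest along those lines (the name and image are placeholders):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: report-generation
spec:
  backoffLimit: 3               # retry the task up to 3 times on failure
  ttlSecondsAfterFinished: 300  # clean up the finished Job automatically
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: worker
          image: example/report-worker:latest
```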

How to handle pod shutdown gracefully in microservices architecture

I am exploring different strategies for handling shutdown gracefully in case of deployment/crash. I am using the Spring Boot framework and Kubernetes. In a few of the services, we have tasks that can take around 10-20 minutes (data processing, large report generation). How do I handle pod termination in these cases when the task is taking more time? For queuing I am using Kafka.
we have tasks that can take around 10-20 minutes (data processing, large report generation)
First, this is more of a Job/Task rather than a microservice. But similar "rules" apply: the node where this job is executing might terminate for an upgrade or other reasons, so your Job/Task must be idempotent and able to be re-run if it crashes or is terminated.
How to handle pod termination in these cases when the task is taking more time. For queuing I am using Kafka.
Kafka is a good technology for this, because it lets the client Job/Task be idempotent. The job receives the data to process, and after processing it can "commit" that it has processed the data. If the Task/Job is terminated before it has committed, a new Task/Job will spawn and continue processing from the offset that has not yet been committed.
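This commit-after-processing pattern gives at-least-once delivery: a record may be processed twice after a crash, but is never lost. A minimal in-memory simulation of that behavior (it does not use the real Kafka client; the "crash" is simulated by returning before the commit):

```java
import java.util.ArrayList;
import java.util.List;

public class AtLeastOnceDemo {
    // shared state standing in for the broker-side committed offset
    static long committedOffset = 0;
    static List<String> processed = new ArrayList<>();

    // Process records starting from the committed offset; "crash" (return)
    // after `crashAfter` records, BEFORE committing the last one.
    static void run(List<String> log, int crashAfter) {
        int handled = 0;
        for (long off = committedOffset; off < log.size(); off++) {
            processed.add(log.get((int) off)); // do the work first...
            handled++;
            if (handled == crashAfter) return; // crash before the commit
            committedOffset = off + 1;         // ...then commit the offset
        }
    }

    public static void main(String[] args) {
        List<String> log = List.of("a", "b", "c");
        run(log, 2);                 // processes "a","b"; only "a" committed
        run(log, Integer.MAX_VALUE); // restart: reprocesses "b", then "c"
        System.out.println(processed); // prints [a, b, b, c]
    }
}
```

The duplicate "b" is exactly why the answer stresses idempotency: the restarted Task/Job must tolerate reprocessing the uncommitted record.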

How to build spring based micro services state syncing after any node failure service crash

I have a few microservices; they accept data from customers and process the requests asynchronously, and the customer can later come back and check the status. To make my platform more robust, I am planning to provide HA by running the same services in duplicate (at least 2), registering them with Eureka, and putting all my services behind a load balancer.
Now I am stuck on a solution for when a node fails or a service goes down after accepting a request.
Let's say I have Service-A1 and Service-A2, both with the same capability.
Service-A1 accepts a request, gives an "accepted" response to the customer, then starts processing the job and updating its intermediate results in the DB; now, due to a node failure or service crash, it cannot complete the job.
In this case I want the other service to auto-detect (get notified of) the failure, so it can read the job status and continue the job to completion.
Is there any feature in Spring Eureka or ZooKeeper to watch and notify the other service to continue?
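One common application-level approach, independent of Eureka or ZooKeeper, is a heartbeat lease on the job record: the owner periodically updates a timestamp, and a peer takes over any job whose heartbeat has gone stale. A minimal in-memory sketch of the lease check (the 5-second lease and job name are illustrative placeholders; in practice the map would be a shared DB table):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class JobLeaseDemo {
    // a job's owner is presumed dead if its heartbeat is older than this
    static final long LEASE_MS = 5_000;
    // shared "table": job id -> last heartbeat time (ms)
    static final Map<String, Long> heartbeats = new HashMap<>();

    // called periodically by the service that owns the job
    static void heartbeat(String job, long now) {
        heartbeats.put(job, now);
    }

    // called by peers polling for stalled jobs they should take over
    static List<String> expired(long now) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Long> e : heartbeats.entrySet())
            if (now - e.getValue() > LEASE_MS) out.add(e.getKey());
        return out;
    }

    public static void main(String[] args) {
        heartbeat("job-42", 0);              // Service-A1 starts the job at t=0...
        // ...A1 crashes, so no further heartbeats arrive
        System.out.println(expired(10_000)); // A2 polls at t=10s and sees job-42
    }
}
```

Takeover should also be guarded (e.g. a conditional UPDATE on the owner column) so two peers cannot claim the same stalled job.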

Alerts for apps failing Marathon healthchecks

I've been configuring HTTP health checks for all my apps in Marathon, which are working nicely. The trouble is that Marathon will keep stepping in and restarting a container that fails its health check, and I won't know unless I happen to be looking at the Marathon UI.
Is there a way to retrieve all apps that have a failed healthcheck so I can send an email alert or similar?
Marathon exposes information about failing health checks via its event bus, so you can write a simple service that consumes Marathon's health-check events ("eventType": "instance_health_changed_event") and translates them into a metric, an alert, you name it.
For a reference implementation I can recommend allegro/appcop. This is a service that scales down unhealthy applications. Its code could easily be altered to do what you want.
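Such a consumer boils down to filtering the event stream by eventType. A minimal sketch of that filtering step over sample payloads (the sample events are fabricated for illustration; a real consumer would read the server-sent-event stream from Marathon's /v2/events endpoint):

```java
import java.util.List;
import java.util.stream.Collectors;

public class HealthEventFilter {
    // keep only Marathon health-change events from a stream of event payloads
    static List<String> healthEvents(List<String> payloads) {
        return payloads.stream()
            .filter(p -> p.contains("\"eventType\":\"instance_health_changed_event\""))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> sample = List.of(
            "{\"eventType\":\"deployment_success\"}",
            "{\"eventType\":\"instance_health_changed_event\",\"healthy\":false}");
        System.out.println(healthEvents(sample).size()); // prints 1
    }
}
```

Each matching event can then be forwarded to whatever alerting channel you use (email, Slack, a metrics backend).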
