I'm running a Spring Boot task inside a k8s pod. This is the k8s specification:
apiVersion: batch/v1
kind: CronJob
metadata:
  name: data-transmission
spec:
  schedule: "*/2 * * * *"
  concurrencyPolicy: Forbid
  successfulJobsHistoryLimit: 1
  failedJobsHistoryLimit: 3
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: print-date
            image: fredde:latest
            imagePullPolicy: IfNotPresent
            livenessProbe:
              httpGet:
                path: /actuator/health/liveness
                port: 8080
              initialDelaySeconds: 2
              failureThreshold: 2
            readinessProbe: null
          restartPolicy: OnFailure
The pod starts as it should every 2 minutes. The task in the Spring Boot application runs and shuts itself down when it's done. But my issue is that the pod is still running even after the Spring Boot application has exited; it just changes status to NotReady. I was expecting it to be Completed or Terminated.
A Job creates one or more Pods and will continue to retry execution of the Pods until a specified number of them successfully terminate. As pods successfully complete, the Job tracks the successful completions. When a specified number of successful completions is reached, the task (ie, Job) is complete. Deleting a Job will clean up the Pods it created. Suspending a Job will delete its active Pods until the Job is resumed again.
A simple case is to create one Job object in order to reliably run one Pod to completion. The Job object will start a new Pod if the first Pod fails or is deleted (for example due to a node hardware failure or a node reboot). (Source: Kubernetes)
In general, Jobs won't get deleted from the API as soon as they complete successfully; they are kept around for some time so you can still see whether a Job succeeded or failed. Kubernetes' TTL-after-finished controller takes care of cleaning them up: you can set an expiration time on Jobs after which they are deleted from the API.
Note: This is taken from the official Kubernetes documentation; go through the links for the exact point of reference.
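For the CronJob above, that would mean adding ttlSecondsAfterFinished under jobTemplate.spec, assuming the TTL-after-finished controller is enabled in your cluster (the 120-second value is just an example):
spec:
  schedule: "*/2 * * * *"
  jobTemplate:
    spec:
      ttlSecondsAfterFinished: 120  # delete finished Jobs and their pods after two minutes
      template:
        ...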
I found a solution that works.
It seems that when my Spring application is done and shuts itself down, the k8s pod is still up and running because of the Istio sidecar.
There is an open issue about this:
https://github.com/istio/istio/issues/11659
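If you run into this, two workarounds are commonly used. Both snippets below are sketches rather than quotes from the issue; the injection annotation and the pilot-agent endpoint on port 15020 are standard Istio features, but verify them against your Istio version, and /app.jar is just a placeholder for your own entrypoint.
# 1. Skip sidecar injection for the CronJob's pods entirely:
jobTemplate:
  spec:
    template:
      metadata:
        annotations:
          sidecar.istio.io/inject: "false"
# 2. Or ask the sidecar to quit once the application has finished:
command: ["/bin/sh", "-c"]
args:
- java -jar /app.jar; curl -fsS -X POST http://localhost:15020/quitquitquit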
What I'd like to ask is whether it is possible to create a Kubernetes Job that runs a bash command inside another Pod.
apiVersion: batch/v1
kind: Job
metadata:
  namespace: dev
  name: run-cmd
spec:
  ttlSecondsAfterFinished: 180
  template:
    spec:
      containers:
      - name: run-cmd
        image: <IMG>
        command: ["/bin/bash", "-c"]
        args:
        - <CMD> $POD_NAME
      restartPolicy: Never
  backoffLimit: 4
I considered using:
- an environment variable to define the pod name
- the Kubernetes SDK to automate this
But if you have better ideas I am open to them, please!
The Job manifest you shared looks like a valid approach.
Yet, you need to take the following points into consideration:
- Running a command inside another pod (i.e. in one of that pod's containers) requires interacting with the Kubernetes API server, so you'd need a Kubernetes client (e.g. kubectl) to do it. This, in turn, requires the client to be installed in the Job container's image.
- The Job's pod's service account has to have permissions for the pods/exec resource. See the docs and this answer.
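A minimal RBAC sketch for that second point (the service account name run-cmd-sa and the Role name pod-exec are made up for illustration; the Job's pod spec would reference the account via serviceAccountName):
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: dev
  name: pod-exec
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list"]
- apiGroups: [""]
  resources: ["pods/exec"]
  verbs: ["create"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  namespace: dev
  name: pod-exec
subjects:
- kind: ServiceAccount
  name: run-cmd-sa
  namespace: dev
roleRef:
  kind: Role
  name: pod-exec
  apiGroup: rbac.authorization.k8s.io
The Job's args would then run something like kubectl exec $POD_NAME -- <CMD> instead of invoking <CMD> directly.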
I want to mark the pod ready only when there are enough connections created and the pod is ready to handle requests. The connections are created at the startup of my Spring Boot application. How can I make sure that the pod is marked ready only after the connections are created?
You can write a small Python (or other) script which checks whether the connections are created and ready to receive requests.
Then, add it to the pod's deployment YAML as an initContainers entry:
https://kubernetes.io/docs/concepts/workloads/pods/init-containers/
initContainers:
- name: my-connection-validator
  image: path-to-your-image
  env:
  - name: POD_HOST
    value: "localhost-or-ip"
  - name: POD_PORT
    value: "12345"
I deployed an Elasticsearch cluster to EKS; below is the spec:
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: elk
spec:
  version: 7.15.2
  serviceAccountName: docker-sa
  http:
    tls:
      selfSignedCertificate:
        disabled: true
  nodeSets:
  - name: node
    count: 3
    config:
      ...
I can see it has been deployed correctly and all pods are running.
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
elk-es-node-0 1/1 Running 0 19h
elk-es-node-1 1/1 Running 0 19h
elk-es-node-2 1/1 Running 0 11h
But I can't restart the Elasticsearch deployment:
$ kubectl rollout restart Elasticsearch elk-es-node
Error from server (NotFound): elasticsearches.elasticsearch.k8s.elastic.co "elk-es-node" not found
Elasticsearch uses a StatefulSet underneath, so I tried to restart the StatefulSet:
$ kubectl rollout restart statefulset elk-es-node
statefulset.apps/elk-es-node restarted
The above command says restarted, but the actual pods are not restarting.
What is the right way to restart a custom kind in K8s?
Use kubectl get all to identify whether the resource created is a Deployment or a StatefulSet.
Add -n <namespace> to the above command if you are working in a specific namespace.
Assuming you are using a StatefulSet, issue the command below to understand how it is configured:
kubectl get statefulset <statefulset-name> -o yaml > statefulsetContent.yaml
This will create a YAML file named statefulsetContent.yaml in the same directory.
You can use it to explore the different options configured in the StatefulSet.
Check for .spec.updateStrategy in that file. Based on this we can identify its update strategy.
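If you only need the strategy type, you can also read it directly without saving the whole file:
kubectl get statefulset <statefulset-name> -o jsonpath='{.spec.updateStrategy.type}'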
Below is from the official documentation
There are two possible values:
OnDelete
When a StatefulSet's .spec.updateStrategy.type is set to OnDelete, the StatefulSet controller will not automatically update the Pods in a StatefulSet. Users must manually delete Pods to cause the controller to create new Pods that reflect modifications made to a StatefulSet's .spec.template.
RollingUpdate
The RollingUpdate update strategy implements automated, rolling update for the Pods in a StatefulSet. This is the default update strategy.
As a workaround, you can try to scale the StatefulSet down and back up:
kubectl scale sts <statefulset-name> --replicas=<count>
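For example, with the StatefulSet from the question:
kubectl scale statefulset elk-es-node --replicas=0
kubectl scale statefulset elk-es-node --replicas=3
Keep in mind that an operator-managed StatefulSet (as with ECK here) may immediately reconcile the replica count back to its spec, so this workaround is most reliable for StatefulSets you manage yourself.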
With ECK as the operator, you do not need to use rollout restart. Apply your updated Elasticsearch spec and the operator will perform a rolling update for you. If for any reason you need to restart a pod, use kubectl delete pod <es pod> -n <your es namespace> to remove it, and the operator will spin up a new one for you.
I deploy an Elasticsearch instance to minikube with the configuration file below:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: elasticsearch
spec:
  replicas: 1
  selector:
    matchLabels:
      name: elasticsearch
  template:
    metadata:
      labels:
        name: elasticsearch
    spec:
      containers:
      - name: elasticsearch
        image: elasticsearch:7.10.1
        ports:
        - containerPort: 9200
        - containerPort: 9300
I run the command kubectl apply -f es.yml to deploy the elasticsearch cluster.
$ kubectl get pod
NAME READY STATUS RESTARTS AGE
elasticsearch-fb9b44948-bchh2 1/1 Running 5 6m23s
The elasticsearch pod keeps restarting every few minutes. When I run the kubectl describe pod command, I can see these events:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 7m11s default-scheduler Successfully assigned default/elasticsearch-fb9b44948-bchh2 to minikube
Normal Pulled 3m18s (x5 over 7m11s) kubelet Container image "elasticsearch:7.10.1" already present on machine
Normal Created 3m18s (x5 over 7m11s) kubelet Created container elasticsearch
Normal Started 3m18s (x5 over 7m10s) kubelet Started container elasticsearch
Warning BackOff 103s (x11 over 5m56s) kubelet Back-off restarting failed container
The last event is Back-off restarting failed container, but I don't know why it restarts the pod. Is there any way I can check why it keeps restarting?
The first step (kubectl describe pod) you've already done. As a next step I suggest checking the container logs: kubectl logs <pod_name>. 99% of the time you'll get the reason from the logs in this case (I bet on a bootstrap check failure).
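Since the container is crash-looping, the freshly restarted instance may have no logs yet; the --previous flag shows the logs of the last failed instance:
kubectl logs <pod_name> --previous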
When neither describe pod nor logs has anything about the error, I get into the container with exec: kubectl exec -it <pod_name> -c <container_name> sh. With this you'll get a shell inside the container (of course, only if there IS a shell binary in it), and you can use it to investigate the problem manually. Note that to keep the failing container alive you may need to change command and args to something like this:
command:
- /bin/sh
- -c
args:
- cat /dev/stdout
Be sure to disable probes when doing this. A container may restart if its liveness probe fails; you will see that in kubectl describe pod if it happens. Since your snippet doesn't have any probes specified, you can skip this.
Checking the logs of the pod using kubectl logs podname gives a clue about what went wrong:
ERROR: [2] bootstrap checks failed
[1]: max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]
[2]: the default discovery settings are unsuitable for production use; at least one of [discovery.seed_hosts, discovery.seed_providers, cluster.initial_master_nodes] must be configured
ERROR: Elasticsearch did not exit normally - check the logs at /usr/share/elasticsearch/logs/docker-cluster.log
Check this post for a solution
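In short, for a single-node development cluster the usual fixes are raising vm.max_map_count on the node and telling Elasticsearch it is a single-node cluster. A sketch to merge into the Deployment's pod template (the privileged init container is just one common way to set the sysctl; the image tag is arbitrary):
spec:
  initContainers:
  - name: sysctl
    image: busybox:1.36
    securityContext:
      privileged: true
    command: ["sysctl", "-w", "vm.max_map_count=262144"]   # addresses error [1]
  containers:
  - name: elasticsearch
    image: elasticsearch:7.10.1
    env:
    - name: discovery.type   # addresses error [2]: no discovery settings configured
      value: single-node
    ports:
    - containerPort: 9200
    - containerPort: 9300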
I have a distributed scheduler. Now I have to pick the records and process them.
In my case, I will create a transaction and process the record.
Now suppose that during the processing itself the Kubernetes container goes down. In that case, what will happen? Does it release the lock and roll back the transaction?
When a pod should be terminated:
A SIGTERM signal is sent to the main process (PID 1) in each container, and a “grace period” countdown starts (defaults to 30 seconds - see below to change it).
Upon receipt of the SIGTERM, each container should start a graceful shutdown of the running application and exit.
If a container doesn’t terminate within the grace period, a SIGKILL signal will be sent and the container violently terminated.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: test
spec:
  replicas: 1
  selector:
    matchLabels:
      app: test
  template:
    metadata:
      labels:
        app: test
    spec:
      containers:
      - name: test
        image: ...
      terminationGracePeriodSeconds: 60
So in your Spring Boot app you need to handle the SIGTERM signal and roll back any open transaction, or persist its state externally so it can be retried later.
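If you are on Spring Boot 2.3 or later, the request-handling part of this can be switched on in configuration; a sketch of the application.yml side (keep the timeout below the terminationGracePeriodSeconds above), while the transaction rollback or hand-off logic still lives in your own shutdown hooks:
# application.yml
server:
  shutdown: graceful                       # finish in-flight work on SIGTERM instead of stopping immediately
spring:
  lifecycle:
    timeout-per-shutdown-phase: 45s        # must complete before the 60s grace period above expires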