How can Fluentd capture logs when a Kubernetes Pod terminates with 'CrashLoopBackOff'? - debugging

I'm running a pod that writes a simple message to the 'terminationMessagePath', then the pod exits with "CrashLoopBackOff". I would like to be able to debug through Kibana instead of having to log in to each Kubernetes node. I queried Kibana for the container's last-state value "CrashLoopBackOff" from the reason and message properties and could not locate an entry.
I can see the fields for the pod in Kibana, but the fields I'm looking for (marked in bold in the YAML below) are empty.
What configuration is needed in Fluentd to get this log from the Kubernetes pod? Or does some configuration need to be set on the Kubernetes side?
$ kubectl get pod pod_name_1 -o=yaml
terminationMessagePath: /var/log/containers/dt.log
volumeMounts:
- mountPath: /var/log/containers
name: data
- mountPath: /var/run/secrets/kubernetes.io/serviceaccount
name: default-token-s0w2n
readOnly: true
dnsPolicy: ClusterFirst
nodeName: dev-master-01
restartPolicy: Always
securityContext: {}
serviceAccount: default
serviceAccountName: default
terminationGracePeriodSeconds: 30
volumes:
- hostPath:
path: /var/log/containers
name: data
- name: default-token-s0w2n
secret:
defaultMode: 420
secretName: default-token-s0w2n
status:
conditions:
- lastProbeTime: null
lastTransitionTime: 2017-07-05T14:45:11Z
status: "True"
type: Initialized
- lastProbeTime: null
lastTransitionTime: 2017-07-05T17:00:22Z
message: 'containers with unready status: [dt-termination-demo]'
reason: ContainersNotReady
status: "False"
type: Ready
- lastProbeTime: null
lastTransitionTime: 2017-07-05T14:45:11Z
status: "True"
type: PodScheduled
containerStatuses:
- containerID: docker://9649c26527cf0e1cd3bd67ba9c606c0b78e6b4f08bacf96175627ddc7d250772
image: debian
imageID: docker-pullable://docker.io/debian@sha256:7d067f77d2ae5a23fe6920f8fbc2936c4b0d417e9d01b26372561860750815f0
lastState:
terminated:
containerID: docker://9649c26527cf0e1cd3bd67ba9c606c0b78e6b4f08bacf96175627ddc7d250772
exitCode: 0
finishedAt: 2017-07-05T17:00:22Z
**message: |
Sleep expired**
reason: Completed
startedAt: 2017-07-05T17:00:12Z
name: dt-termination-demo
ready: false
restartCount: 30
state:
waiting:
message: Back-off 5m0s restarting failed container=dt-termination-demo pod=dt-termination-demo-2814930607-8kshj_default(8c247b15-6190-11e7-acb7-00505691210d)
**reason: CrashLoopBackOff**
hostIP: 192.21.19.128
phase: Running
podIP: 10.0.0.8
startTime: 2017-07-05T14:45:11Z

When Fluentd is deployed as a DaemonSet, it collects all logs from the node and its Pods. As a guide to accomplish this, please check the following YAML file and the associated repository:
https://github.com/fluent/fluentd-kubernetes-daemonset/blob/master/fluentd-daemonset-elasticsearch.yaml
https://github.com/fluent/fluentd-kubernetes-daemonset
If you need additional assistance you can also join our Slack channel:
http://slack.fluentd.org
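For orientation, the part of that DaemonSet that matters here is the Fluentd container reading the node's container log files and shipping them to Elasticsearch. The sketch below is a trimmed excerpt in that spirit, not the full manifest; the image tag, the FLUENT_ELASTICSEARCH_* variable names, and the elasticsearch-logging host are taken from the linked repository but may differ between image versions, so check the linked file for the exact values:

containers:
- name: fluentd
  image: fluent/fluentd-kubernetes-daemonset:elasticsearch   # illustrative tag; see the linked manifest
  env:
  - name: FLUENT_ELASTICSEARCH_HOST      # where Fluentd sends logs
    value: "elasticsearch-logging"
  - name: FLUENT_ELASTICSEARCH_PORT
    value: "9200"
  - name: FLUENT_ELASTICSEARCH_SCHEME
    value: "http"
  volumeMounts:
  - name: varlog
    mountPath: /var/log                   # includes the /var/log/containers/*.log symlinks
  - name: varlibdockercontainers
    mountPath: /var/lib/docker/containers
    readOnly: true
volumes:
- name: varlog
  hostPath:
    path: /var/log
- name: varlibdockercontainers
  hostPath:
    path: /var/lib/docker/containers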

Related

container "sonarqube" in pod "sonar-574d99bfb5-dr8nx" is waiting to start: CreateContainerConfigError

I am facing a problem with my SonarQube setup. I've been trying to set it up, but I get this error from kubectl logs sonar-574d99bfb5-dr8nx -n sonar: container "sonarqube" in pod "sonar-574d99bfb5-dr8nx" is waiting to start: CreateContainerConfigError.
And when I describe the pod with kubectl describe pod sonar-574d99bfb5-dr8nx -n sonar,
I get this:
Name: sonar-574d99bfb5-dr8nx
Namespace: sonar
Priority: 0
Node: master01/192.168.137.136
Start Time: Tue, 22 Mar 2022 20:30:16 +0000
Labels: app=sonar
pod-template-hash=574d99bfb5
Annotations: cni.projectcalico.org/containerID: 734ba33acb9e2c007861112ffe7c1fce84fa3a434494a0df6951a7b4b6b8dacb
cni.projectcalico.org/podIP: 10.42.241.105/32
cni.projectcalico.org/podIPs: 10.42.241.105/32
Status: Pending
IP: 10.42.241.105
IPs:
IP: 10.42.241.105
Controlled By: ReplicaSet/sonar-574d99bfb5
Containers:
sonarqube:
Container ID:
Image: sonarqube:latest
Image ID:
Port: 9000/TCP
Host Port: 0/TCP
State: Waiting
Reason: CreateContainerConfigError
Ready: False
Restart Count: 0
Limits:
memory: 2Gi
Requests:
memory: 1Gi
Environment Variables from:
sonar-config ConfigMap Optional: false
Environment: <none>
Mounts:
/opt/sonarqube/data/ from app-pvc (rw,path="data")
/opt/sonarqube/extensions/ from app-pvc (rw,path="extensions")
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-q22lb (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
app-pvc:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: sonar-pvc
ReadOnly: false
kube-api-access-q22lb:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 12m default-scheduler Successfully assigned sonar/sonar-574d99bfb5-dr8nx to master01
Warning Failed 10m (x12 over 12m) kubelet Error: stat /home/mtst/data-sonar-pvc: no such file or directory
Normal Pulled 2m24s (x50 over 12m) kubelet Container image "sonarqube:latest" already present on machine
Here's my PVC YAML:
apiVersion: v1
kind: PersistentVolume
metadata:
name: sonar-pv
namespace: sonar
labels:
type: local
spec:
storageClassName: manual
capacity:
storage: 3Gi
accessModes:
- ReadWriteOnce
hostPath:
path: "/home/mtst/data-sonar-pvc"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: sonar-pvc
namespace: sonar
labels:
type: local
spec:
storageClassName: manual
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 1Gi
If there's anything that can help me resolve my issue, I would appreciate it.
Thank you.
I had the same issue with the awx postgres deployment.
Credit for resolving the issue goes to the question "Kubernetes 1.17.2 Rancher 2.3.5 CreateContainerConfigError: stat no such file or directory but the directory IS there".
Running a Rancher Kubernetes cluster and wanting a custom PV, I needed to add the type field below to my PersistentVolume definition:
hostPath:
path: "/home/mtst/data-sonar-pvc"
type: "DirectoryOrCreate"

Kubernetes giving CrashLoopBackOff error while running the packetbeat in kubernetes cluster

I'm trying to deploy Packetbeat as a DaemonSet on a Kubernetes cluster, but Kubernetes gives a CrashLoopBackOff error while running Packetbeat. I have checked the Packetbeat pod logs; below are the logs.
2020-08-23T14:28:00.054Z INFO instance/beat.go:475 Beat UUID: 69d32e5f-c8f2-41bf-9242-48435688c540
2020-08-23T14:28:00.054Z INFO instance/beat.go:213 Setup Beat: packetbeat; Version: 6.2.4
2020-08-23T14:28:00.061Z INFO add_cloud_metadata/add_cloud_metadata.go:301 add_cloud_metadata: hosting provider type detected as ec2, metadata={"availability_zone":"us-east-1f","instance_id":"i-05b8121af85c94236","machine_type":"t2.medium","provider":"ec2","region":"us-east-1"}
2020-08-23T14:28:00.061Z INFO kubernetes/watcher.go:77 kubernetes: Performing a pod sync
2020-08-23T14:28:00.074Z INFO kubernetes/watcher.go:108 kubernetes: Pod sync done
2020-08-23T14:28:00.074Z INFO elasticsearch/client.go:145 Elasticsearch url: http://elasticsearch:9200
2020-08-23T14:28:00.074Z INFO kubernetes/watcher.go:140 kubernetes: Watching API for pod events
2020-08-23T14:28:00.074Z INFO pipeline/module.go:76 Beat name: ip-172-31-72-117
2020-08-23T14:28:00.075Z INFO procs/procs.go:78 Process matching disabled
2020-08-23T14:28:00.076Z INFO [monitoring] log/log.go:97 Starting metrics logging every 30s
2020-08-23T14:28:00.076Z INFO elasticsearch/client.go:145 Elasticsearch url: http://elasticsearch:9200
2020-08-23T14:28:00.083Z WARN transport/tcp.go:36 DNS lookup failure "elasticsearch": lookup elasticsearch on 172.31.0.2:53: no such host
2020-08-23T14:28:00.083Z ERROR elasticsearch/elasticsearch.go:165 Error connecting to Elasticsearch at http://elasticsearch:9200: Get http://elasticsearch:9200: lookup elasticsearch on 172.31.0.2:53: no such host
2020-08-23T14:28:00.085Z INFO [monitoring] log/log.go:132 Total non-zero metrics {"monitoring": {"metrics": {"beat":{"cpu":{"system":{"ticks":20,"time":28},"total":{"ticks":160,"time":176,"value":160},"user":{"ticks":140,"time":148}},"info":{"ephemeral_id":"70e07383-3aae-4bc1-a6e1-540a6cfa8ad8","uptime":{"ms":35}},"memstats":{"gc_next":26511344,"memory_alloc":21723000,"memory_total":23319008,"rss":51834880}},"libbeat":{"config":{"module":{"running":0}},"output":{"type":"elasticsearch"},"pipeline":{"clients":5,"events":{"active":0}}},"system":{"cpu":{"cores":2},"load":{"1":0.11,"15":0.1,"5":0.14,"norm":{"1":0.055,"15":0.05,"5":0.07}}}}}}
2020-08-23T14:28:00.085Z INFO [monitoring] log/log.go:133 Uptime: 37.596889ms
2020-08-23T14:28:00.085Z INFO [monitoring] log/log.go:110 Stopping metrics logging.
2020-08-23T14:28:00.085Z ERROR instance/beat.go:667 Exiting: Error importing Kibana dashboards: fail to create the Elasticsearch loader: Error creating Elasticsearch client: Couldn't connect to any of the configured Elasticsearch hosts. Errors: [Error connection to Elasticsearch http://elasticsearch:9200: Get http://elasticsearch:9200: lookup elasticsearch on 172.31.0.2:53: no such host]
Exiting: Error importing Kibana dashboards: fail to create the Elasticsearch loader: Error creating Elasticsearch client: Couldn't connect to any of the configured Elasticsearch hosts. Errors: [Error connection to Elasticsearch http://elasticsearch:9200: Get http://elasticsearch:9200: lookup elastic search on 172.31.0.2:53: no such host]
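The failing step in the log above is DNS resolution of the host name elasticsearch, which comes from the ELASTICSEARCH_HOST value in the manifest below. A quick, read-only way to check whether a Service with that name actually exists and which namespace it lives in (the placeholders are illustrative, not names from this cluster):

kubectl get svc --all-namespaces | grep -i elasticsearch
kubectl get endpoints <elasticsearch-service-name> -n <its-namespace>   # substitute what the first command shows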
Here is Packetbeat.yaml
apiVersion: v1
kind: ConfigMap
metadata:
name: packetbeat-dynamic-config
namespace: kube-system
labels:
k8s-app: packetbeat-dynamic
kubernetes.io/cluster-service: "true"
data:
packetbeat.yml: |-
setup.dashboards.enabled: true
setup.template.enabled: true
setup.template.settings:
index.number_of_shards: 2
packetbeat.interfaces.device: any
packetbeat.protocols:
- type: dns
ports: [53]
include_authorities: true
include_additionals: true
- type: http
ports: [80, 8000, 8080, 9200]
- type: mysql
ports: [3306]
- type: redis
ports: [6379]
packetbeat.flows:
timeout: 30s
period: 10s
processors:
- add_cloud_metadata:
- add_kubernetes_metadata:
host: ${HOSTNAME}
indexers:
- ip_port:
matchers:
- field_format:
format: '%{[ip]}:%{[port]}'
cloud.id: ${ELASTIC_CLOUD_ID}
cloud.auth: ${ELASTIC_CLOUD_AUTH}
#setup.kibana.host: kibana:5601
setup.ilm.overwrite: true
output.elasticsearch:
hosts: ['${ELASTICSEARCH_HOST:elasticsearch}:${ELASTICSEARCH_PORT:9200}']
username: ${ELASTICSEARCH_USERNAME}
password: ${ELASTICSEARCH_PASSWORD}
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: packetbeat-dynamic
namespace: kube-system
labels:
k8s-app: packetbeat-dynamic
kubernetes.io/cluster-service: "true"
spec:
selector:
matchLabels:
k8s-app: packetbeat-dynamic
kubernetes.io/cluster-service: "true"
template:
metadata:
labels:
k8s-app: packetbeat-dynamic
kubernetes.io/cluster-service: "true"
spec:
serviceAccountName: packetbeat-dynamic
terminationGracePeriodSeconds: 30
hostNetwork: true
containers:
- name: packetbeat-dynamic
image: docker.elastic.co/beats/packetbeat:6.2.4
imagePullPolicy: Always
args: [
"-c", "/etc/packetbeat.yml",
"-e",
]
securityContext:
runAsUser: 0
capabilities:
add:
- NET_ADMIN
env:
- name: ELASTICSEARCH_HOST
value: elasticsearch
- name: ELASTICSEARCH_PORT
value: "9200"
- name: ELASTICSEARCH_USERNAME
value: elastic
- name: ELASTICSEARCH_PASSWORD
value: changeme
- name: CLOUD_ID
value:
- name: ELASTIC_CLOUD_AUTH
value:
- name: KIBANA_HOST
value: kibana
- name: KIBANA_PORT
value: "5601"
volumeMounts:
- name: config
mountPath: /etc/packetbeat.yml
readOnly: true
subPath: packetbeat.yml
- name: data
mountPath: /usr/share/packetbeat/data
volumes:
- name: config
configMap:
defaultMode: 0600
name: packetbeat-dynamic-config
- name: data
emptyDir: {}
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
name: packetbeat-dynamic
subjects:
- kind: ServiceAccount
name: packetbeat-dynamic
namespace: kube-system
roleRef:
kind: ClusterRole
name: packetbeat-dynamic
apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
name: packetbeat-dynamic
labels:
k8s-app: packetbeat-dynamic
rules:
- apiGroups: [""] # "" indicates the core API group
resources:
- namespaces
- pods
verbs:
- get
- watch
- list
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: packetbeat-dynamic
namespace: kube-system
labels:
k8s-app: packetbeat-dynamic
Could anyone suggest how to resolve this issue? Any helpful link would also be appreciated.
kubectl describe daemonset packetbeat-dynamic -n kube-system
Name: packetbeat-dynamic
Selector: k8s-app=packetbeat-dynamic,kubernetes.io/cluster-service=true
Node-Selector: <none>
Labels: k8s-app=packetbeat-dynamic
kubernetes.io/cluster-service=true
Annotations: deprecated.daemonset.template.generation: 1
Desired Number of Nodes Scheduled: 1
Current Number of Nodes Scheduled: 1
Number of Nodes Scheduled with Up-to-date Pods: 1
Number of Nodes Scheduled with Available Pods: 0
Number of Nodes Misscheduled: 1
Pods Status: 2 Running / 0 Waiting / 0 Succeeded / 0 Failed
Pod Template:
Labels: k8s-app=packetbeat-dynamic
kubernetes.io/cluster-service=true
Service Account: packetbeat-dynamic
Containers:
packetbeat-dynamic:
Image: docker.elastic.co/beats/packetbeat:6.2.4
Port: <none>
Host Port: <none>
Args:
-c
/etc/packetbeat.yml
-e
Environment:
ELASTICSEARCH_HOST: elasticsearch
ELASTICSEARCH_PORT: 9200
ELASTICSEARCH_USERNAME: elastic
ELASTICSEARCH_PASSWORD: changeme
CLOUD_ID:
ELASTIC_CLOUD_AUTH:
KIBANA_HOST: kibana
KIBANA_PORT: 5601
Mounts:
/etc/packetbeat.yml from config (ro,path="packetbeat.yml")
/usr/share/packetbeat/data from data (rw)
Volumes:
config:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: packetbeat-dynamic-config
Optional: false
data:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
Events: <none>

Elasticsearch enable security issues

I have an Elasticsearch 7.6 cluster installed based on
https://github.com/openstack/openstack-helm-infra/tree/master/elasticsearch
Following is what I did to enable security:
a. Generate certificate
./bin/elasticsearch-certutil ca
File location: /usr/share/elasticsearch/elastic-stack-ca.p12
./bin/elasticsearch-certutil cert --ca elastic-stack-ca.p12
File location: /usr/share/elasticsearch/elastic-certificates.p12
kubectl create secret generic elastic-certificates --from-file=elastic-certificates.p12
b. Enable Security on statefulset for master pod
kubectl edit statefulset elasticsearch-master
----
- name: xpack.security.enabled
value: "true"
- name: xpack.security.transport.ssl.enabled
value: "true"
- name: xpack.security.transport.ssl.verification_mode
value: certificate
- name: xpack.security.transport.ssl.keystore.path
value: /usr/share/elasticsearch/config/certs/elastic-certificates.p12
- name: xpack.security.transport.ssl.truststore.path
value: /usr/share/elasticsearch/config/certs/elastic-certificates.p12
----
- mountPath: /usr/share/elasticsearch/config/certs
name: elastic-certificates
readOnly: true
----
- name: elastic-certificates
secret:
defaultMode: 444
secretName: elastic-certificates
c. Enable security on statefulset for data pod
kubectl edit statefulset elasticsearch-data
----
- name: xpack.security.enabled
value: "true"
- name: xpack.security.transport.ssl.enabled
value: "true"
- name: xpack.security.transport.ssl.verification_mode
value: certificate
----
- mountPath: /usr/share/elasticsearch/config/certs
name: elastic-certificates
----
- name: elastic-certificates
secret:
defaultMode: 444
secretName: elastic-certificates
d. Enable security on deployment for client
kubectl edit deployment elasticsearch-client
----
- name: xpack.security.enabled
value: "true"
- name: xpack.security.transport.ssl.enabled
value: "true"
- name: xpack.security.transport.ssl.verification_mode
value: certificate
- name: xpack.security.transport.ssl.keystore.path
value: /usr/share/elasticsearch/config/certs/elastic-certificates.p12
- name: xpack.security.transport.ssl.truststore.path
value: /usr/share/elasticsearch/config/certs/elastic-certificates.p12
----
- mountPath: /usr/share/elasticsearch/config/certs
name: elastic-certificates
----
- name: elastic-certificates
secret:
defaultMode: 444
secretName: elastic-certificates
After the pods restarted, I got the following issues:
a. Data pods are stuck in the init stage
kubectl get pod |grep data
elasticsearch-data-0 1/1 Running 0 42m
elasticsearch-data-1 0/1 Init:0/3 0 10m
kubectl logs elasticsearch-data-1 -c init |tail -1
Entrypoint WARNING: <date/time> entrypoint.go:72: Resolving dependency Service elasticsearch-logging in namespace osh-infra failed: Service elasticsearch-logging has no endpoints .
b. Client pod errors regarding connection refused
Warning Unhealthy 18m (x4 over 19m) kubelet, s1-worker-2 Readiness probe failed: Get http://192.180.71.82:9200/_cluster/health: dial tcp 192.180.71.82:9200: connect: connection refused
Warning Unhealthy 4m17s (x86 over 18m) kubelet, s1-worker-2 Readiness probe failed: HTTP probe failed with statuscode: 401
c. Service "elasticsearch-logging" endpoints is empty
Any suggestions how to fix or what is wrong?
Thanks.

Readiness and Liveness probes for elasticsearch 6.3.0 on Kubernetes failing

I am trying to set up the EFK stack on Kubernetes. The Elasticsearch version being used is 6.3.2. Everything works fine until I place the probes configuration in the deployment YAML file. I am getting the error below. This causes the pod to be declared unhealthy and eventually restarted, which appears to be a false restart.
Warning Unhealthy 15s kubelet, aks-agentpool-23337112-0 Liveness probe failed: Get http://10.XXX.Y.ZZZ:9200/_cluster/health: dial tcp 10.XXX.Y.ZZZ:9200: connect: connection refused
I did try using telnet from a different container to the Elasticsearch pod's IP and port and was successful, but only the kubelet on the node is unable to reach the pod's IP, causing the probes to fail.
Below is the snippet from the pod spec of the Kubernetes StatefulSet YAML. Any assistance would be really helpful. I have spent quite a lot of time on this without any clue :(
PS: The stack is being setup on AKS cluster
- name: es-data
image: quay.io/pires/docker-elasticsearch-kubernetes:6.3.2
env:
- name: NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
- name: NODE_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: CLUSTER_NAME
value: myesdb
- name: NODE_MASTER
value: "false"
- name: NODE_INGEST
value: "false"
- name: HTTP_ENABLE
value: "true"
- name: NODE_DATA
value: "true"
- name: DISCOVERY_SERVICE
value: "elasticsearch-discovery"
- name: NETWORK_HOST
value: "_eth0:ipv4_"
- name: ES_JAVA_OPTS
value: -Xms512m -Xmx512m
- name: PROCESSORS
valueFrom:
resourceFieldRef:
resource: limits.cpu
resources:
requests:
cpu: 0.25
limits:
cpu: 1
ports:
- containerPort: 9200
name: http
- containerPort: 9300
name: transport
livenessProbe:
httpGet:
port: http
path: /_cluster/health
initialDelaySeconds: 40
periodSeconds: 10
readinessProbe:
httpGet:
path: /_cluster/health
port: http
initialDelaySeconds: 30
timeoutSeconds: 10
The pods/containers run just fine without the probes in place. The expectation is that the probes should work when set in the deployment YAMLs and the pod should not get restarted.
The thing is that Elasticsearch itself has its own health statuses (red, yellow, green) and you need to take that into account in your configuration.
Here is what I found in my own ES configuration, based on the official ES Helm chart:
readinessProbe:
failureThreshold: 3
initialDelaySeconds: 40
periodSeconds: 10
successThreshold: 3
timeoutSeconds: 5
exec:
command:
- sh
- -c
- |
#!/usr/bin/env bash -e
# If the node is starting up wait for the cluster to be green
# Once it has started only check that the node itself is responding
START_FILE=/tmp/.es_start_file
http () {
local path="${1}"
if [ -n "${ELASTIC_USERNAME}" ] && [ -n "${ELASTIC_PASSWORD}" ]; then
BASIC_AUTH="-u ${ELASTIC_USERNAME}:${ELASTIC_PASSWORD}"
else
BASIC_AUTH=''
fi
curl -XGET -s -k --fail ${BASIC_AUTH} http://127.0.0.1:9200${path}
}
if [ -f "${START_FILE}" ]; then
echo 'Elasticsearch is already running, lets check the node is healthy'
http "/"
else
echo 'Waiting for elasticsearch cluster to become green'
if http "/_cluster/health?wait_for_status=green&timeout=1s" ; then
touch ${START_FILE}
exit 0
else
echo 'Cluster is not yet green'
exit 1
fi
fi
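The script relies on the standard cluster health API, so you can run the same check by hand against a pod to see what the probe sees. A minimal sketch, assuming an unauthenticated cluster and a pod named elasticsearch-0 (a placeholder; use your own pod name):

# terminal 1: forward the ES HTTP port from a pod
kubectl port-forward pod/elasticsearch-0 9200:9200
# terminal 2: the same health check the probe script performs
curl -s 'http://127.0.0.1:9200/_cluster/health?wait_for_status=green&timeout=1s'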
First, please check the logs using
kubectl logs <pod name> -n <namespacename>
You first have to run an init container to change the volume permissions.
You have to run the whole config as user 1000, and before the Elasticsearch container starts you have to change the volume permissions using the init container, as in the config below.
apiVersion: apps/v1
kind: StatefulSet
metadata:
labels:
app : elasticsearch
component: elasticsearch
release: elasticsearch
name: elasticsearch
spec:
podManagementPolicy: Parallel
replicas: 1
revisionHistoryLimit: 10
selector:
matchLabels:
app : elasticsearch
component: elasticsearch
release: elasticsearch
serviceName: elasticsearch
template:
metadata:
creationTimestamp: null
labels:
app : elasticsearch
component: elasticsearch
release: elasticsearch
spec:
containers:
- env:
- name: cluster.name
value: <SET THIS>
- name: discovery.type
value: single-node
- name: ES_JAVA_OPTS
value: -Xms512m -Xmx512m
- name: bootstrap.memory_lock
value: "false"
image: elasticsearch:6.5.0
imagePullPolicy: IfNotPresent
name: elasticsearch
ports:
- containerPort: 9200
name: http
protocol: TCP
- containerPort: 9300
name: transport
protocol: TCP
resources:
limits:
cpu: 250m
memory: 1Gi
requests:
cpu: 150m
memory: 512Mi
securityContext:
privileged: true
runAsUser: 1000
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /usr/share/elasticsearch/data
name: elasticsearch-data
dnsPolicy: ClusterFirst
initContainers:
- command:
- sh
- -c
- |
  chown -R 1000:1000 /usr/share/elasticsearch/data
  sysctl -w vm.max_map_count=262144
  chmod 777 /usr/share/elasticsearch/data
  chmod 777 /usr/share/elasticsearch/data/node
  chmod g+rwx /usr/share/elasticsearch/data
  chgrp 1000 /usr/share/elasticsearch/data
image: busybox:1.29.2
imagePullPolicy: IfNotPresent
name: set-dir-owner
resources: {}
securityContext:
privileged: true
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /usr/share/elasticsearch/data
name: elasticsearch-data
restartPolicy: Always
schedulerName: default-scheduler
securityContext: {}
terminationGracePeriodSeconds: 10
updateStrategy:
type: OnDelete
volumeClaimTemplates:
- metadata:
creationTimestamp: null
name: elasticsearch-data
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 10Gi
Check out my YAML config above; you can use it. It's for a single-node Elasticsearch.
The probe outlined in my answer works with 3-node discovery when Istio is present. If the livenessProbe is bad, then k8s will restart the container without even allowing it to start properly. I use the internal Elastic port (for node-to-node communication) to test liveness. This port speaks TCP.
livenessProbe:
tcpSocket:
port: 9300
initialDelaySeconds: 60 # it takes time from jvm process to start start up to point when discovery process starts
timeoutSeconds: 10
- name: discovery.zen.minimum_master_nodes
value: "2"
- name: discovery.zen.ping.unicast.hosts
value: elastic

Elasticsearch fails to start on AWS kubernetes cluster

I am running my Kubernetes cluster on AWS EKS, which runs Kubernetes 1.10.
I am following this guide to deploy Elasticsearch in my cluster:
elasticsearch Kubernetes
The first time I deployed it, everything worked fine. Now, when I redeploy, it gives me the following error.
ERROR: [2] bootstrap checks failed
[1]: max file descriptors [4096] for elasticsearch process is too low, increase to at least [65536]
[2018-08-24T18:07:28,448][INFO ][o.e.n.Node ] [es-master-6987757898-5pzz9] stopping ...
[2018-08-24T18:07:28,534][INFO ][o.e.n.Node ] [es-master-6987757898-5pzz9] stopped
[2018-08-24T18:07:28,534][INFO ][o.e.n.Node ] [es-master-6987757898-5pzz9] closing ...
[2018-08-24T18:07:28,555][INFO ][o.e.n.Node ] [es-master-6987757898-5pzz9] closed
Here is my deployment file.
apiVersion: apps/v1beta1
kind: Deployment
metadata:
name: es-master
labels:
component: elasticsearch
role: master
spec:
replicas: 3
template:
metadata:
labels:
component: elasticsearch
role: master
spec:
initContainers:
- name: init-sysctl
image: busybox:1.27.2
command:
- sysctl
- -w
- vm.max_map_count=262144
securityContext:
privileged: true
containers:
- name: es-master
image: quay.io/pires/docker-elasticsearch-kubernetes:6.3.2
env:
- name: NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
- name: NODE_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: CLUSTER_NAME
value: myesdb
- name: NUMBER_OF_MASTERS
value: "2"
- name: NODE_MASTER
value: "true"
- name: NODE_INGEST
value: "false"
- name: NODE_DATA
value: "false"
- name: HTTP_ENABLE
value: "false"
- name: ES_JAVA_OPTS
value: -Xms512m -Xmx512m
- name: NETWORK_HOST
value: "0.0.0.0"
- name: PROCESSORS
valueFrom:
resourceFieldRef:
resource: limits.cpu
resources:
requests:
cpu: 0.25
limits:
cpu: 1
ports:
- containerPort: 9300
name: transport
livenessProbe:
tcpSocket:
port: transport
initialDelaySeconds: 20
periodSeconds: 10
volumeMounts:
- name: storage
mountPath: /data
volumes:
- emptyDir:
medium: ""
name: "storage"
I have seen a lot of posts talking about increasing the value but I am not sure how to do it. Any help would be appreciated.
Just want to add to this issue:
If you create the EKS cluster with eksctl, you can append the following to the node group creation YAML:
preBootstrapCommands:
- "sed -i -e 's/1024:4096/65536:65536/g' /etc/sysconfig/docker"
- "systemctl restart docker"
This will solve the problem for a newly created cluster by fixing the Docker daemon config.
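For context, here is where that snippet sits in a full eksctl config file. This is a minimal sketch; the cluster name, region, and node group fields are placeholders, and the field name is preBootstrapCommands in current eksctl schemas:

apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: my-cluster        # placeholder
  region: us-east-1       # placeholder
nodeGroups:
  - name: es-nodes        # placeholder
    instanceType: t2.medium
    desiredCapacity: 2
    preBootstrapCommands:
      - "sed -i -e 's/1024:4096/65536:65536/g' /etc/sysconfig/docker"
      - "systemctl restart docker"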
Update the default-ulimits parameter in the file '/etc/docker/daemon.json':
"default-ulimits": {
"nofile": {
"Name": "nofile",
"Soft": 65536,
"Hard": 65536
}
}
and restart the Docker daemon.
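For clarity, a complete /etc/docker/daemon.json containing only this setting looks like the following (merge it with any keys already present in your file), and the daemon restart can then be done with systemctl:

{
  "default-ulimits": {
    "nofile": {
      "Name": "nofile",
      "Soft": 65536,
      "Hard": 65536
    }
  }
}

sudo systemctl restart docker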
This is the only thing that worked for me when setting up an EFK stack on EKS. Add this to your node group creation YAML file under nodeGroups:, then create your node group and schedule your ES pods on it.
preBootstrapCommands:
- "sysctl -w vm.max_map_count=262144"
- "systemctl restart docker"
