I am running my kubernetes cluster on AWS EKS which runs kubernetes 1.10.
I am following this guide to deploy elasticsearch in my Cluster
elasticsearch Kubernetes
The first time I deployed it everything worked fine. Now, When I redeploy it gives me the following error.
ERROR: [2] bootstrap checks failed
[1]: max file descriptors [4096] for elasticsearch process is too low, increase to at least [65536]
[2018-08-24T18:07:28,448][INFO ][o.e.n.Node ] [es-master-6987757898-5pzz9] stopping ...
[2018-08-24T18:07:28,534][INFO ][o.e.n.Node ] [es-master-6987757898-5pzz9] stopped
[2018-08-24T18:07:28,534][INFO ][o.e.n.Node ] [es-master-6987757898-5pzz9] closing ...
[2018-08-24T18:07:28,555][INFO ][o.e.n.Node ] [es-master-6987757898-5pzz9] closed
Here is my deployment file.
apiVersion: apps/v1beta1
kind: Deployment
metadata:
name: es-master
labels:
component: elasticsearch
role: master
spec:
replicas: 3
template:
metadata:
labels:
component: elasticsearch
role: master
spec:
initContainers:
- name: init-sysctl
image: busybox:1.27.2
command:
- sysctl
- -w
- vm.max_map_count=262144
securityContext:
privileged: true
containers:
- name: es-master
image: quay.io/pires/docker-elasticsearch-kubernetes:6.3.2
env:
- name: NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
- name: NODE_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: CLUSTER_NAME
value: myesdb
- name: NUMBER_OF_MASTERS
value: "2"
- name: NODE_MASTER
value: "true"
- name: NODE_INGEST
value: "false"
- name: NODE_DATA
value: "false"
- name: HTTP_ENABLE
value: "false"
- name: ES_JAVA_OPTS
value: -Xms512m -Xmx512m
- name: NETWORK_HOST
value: "0.0.0.0"
- name: PROCESSORS
valueFrom:
resourceFieldRef:
resource: limits.cpu
resources:
requests:
cpu: 0.25
limits:
cpu: 1
ports:
- containerPort: 9300
name: transport
livenessProbe:
tcpSocket:
port: transport
initialDelaySeconds: 20
periodSeconds: 10
volumeMounts:
- name: storage
mountPath: /data
volumes:
- emptyDir:
medium: ""
name: "storage"
I have seen a lot of posts talking about increasing the value but I am not sure how to do it. Any help would be appreciated.
Just want to append to this issue:
If you create EKS cluster by eksctl then you can append to NodeGroup creation yaml:
preBootstrapCommand:
- "sed -i -e 's/1024:4096/65536:65536/g' /etc/sysconfig/docker"
- "systemctl restart docker"
This will solve the problem for newly created cluster by fixing docker daemon config.
Update default-ulimit parameter in the file '/etc/docker/daemon.json'
"default-ulimits": {
"nofile": {
"Name": "nofile",
"Soft": 65536,
"Hard": 65536
}
}
and restart docker daemon.
This is the only thing that worked for me using EKS setting up an EFK stack. Add this to your nodegroup creation YAML file under nodeGroups:. Then create your nodegroup and apply your ES pods on it.
preBootstrapCommands:
- "sysctl -w vm.max_map_count=262144"
- "systemctl restart docker"
Related
Trying to setup elasticsearch cluster on kube, the problem i am having is that each pod isn't able to talk to the others by the respective hostnames, but the ip address works.
So for example i'm trying to currently setup 3 master nodes, es-master-0, es-master-1 and es-master-2 , if i log into one of the containers and ping another based on the pod ip it's fine, but i i try to ping say es-master-1 from es-master-0 based on the hostname it can't find it.
Clearly missing something here. Currently launching this config to try get it working:
apiVersion: v1
kind: Service
metadata:
name: ed
labels:
component: elasticsearch
role: master
spec:
selector:
component: elasticsearch
role: master
ports:
- name: transport1
port: 9300
protocol: TCP
clusterIP: None
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: es-master
labels:
component: elasticsearch
role: master
spec:
selector:
matchLabels:
component: elasticsearch
role: master
serviceName: ed
replicas: 3
template:
metadata:
labels:
component: elasticsearch
role: master
spec:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- { key: es-master, operator: In, values: [ "true" ] }
initContainers:
- name: init-sysctl
image: busybox:1.27.2
command:
- sysctl
- -w
- vm.max_map_count=262144
securityContext:
privileged: true
dnsPolicy: "None"
dnsConfig:
options:
- name: ndots
value: "6"
nameservers:
- 10.85.0.10
searches:
- ed.es.svc.cluster.local
- es.svc.cluster.local
- svc.cluster.local
- cluster.local
- home
- node1
containers:
- name: es-master
image: docker.elastic.co/elasticsearch/elasticsearch:7.17.5
imagePullPolicy: Always
securityContext:
privileged: true
env:
- name: ES_JAVA_OPTS
value: -Xms2048m -Xmx2048m
resources:
requests:
cpu: "0.25"
limits:
cpu: "2"
ports:
- containerPort: 9300
name: transport1
livenessProbe:
tcpSocket:
port: transport1
initialDelaySeconds: 60
periodSeconds: 10
volumeMounts:
- name: storage
mountPath: /data
- name: config
mountPath: /usr/share/elasticsearch/config/elasticsearch.yml
subPath: elasticsearch.yml
volumes:
- name: config
configMap:
name: es-master-config
volumeClaimTemplates:
- metadata:
name: storage
spec:
storageClassName: "local-path"
accessModes: [ ReadWriteOnce ]
resources:
requests:
storage: 2Gi
It's clearly somehow not resolving the hostnames
For pod to pod communication you can use k8s service which you had defined.
I'm trying to create an ElasticSearch stateful set (STS) with init containers to increase the worker nodes vm.max_map_count=262144 and also the ulimit -n 65536.
However some PodSecurityPolicy (PSP) is denying the escalation of privilaged containers from what I can tell.
Warning FailedCreate 1s (x12 over 11s) statefulset-controller
create Pod elasticsearch-node-0 in StatefulSet elasticsearch-node
failed error: pods "elasticsearch-node-0" is forbidden: unable to
validate against any pod security policy:
[spec.initContainers[0].securityContext.privileged: Invalid value:
true: Privileged containers are not allowed
spec.initContainers[1].securityContext.privileged: Invalid value:
true: Privileged containers are not allowed]
And there are in fact 2x PSP in the cluster, privilaged and unprivilaged. Do I need to specify the privilaged PSP in the STS somehow? Or a svc-acc?
The k8s server version is 1.9.8 - if it matters.
This is the STS (with some helm elements)
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: elasticsearch-node
namespace: {{ .Release.Namespace }}
labels:
component: elasticsearch
role: node
spec:
replicas: {{ .Values.replicas }}
serviceName: elasticsearch-discovery
selector:
matchLabels:
component: elasticsearch
role: node
template:
metadata:
namespace: {{ .Release.Namespace }}
labels:
component: elasticsearch
role: node
spec:
affinity:
podAntiAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchExpressions:
- key: component
operator: In
values:
- elasticsearch
- key: role
operator: In
values:
- node
topologyKey: kubernetes.io/hostname
terminationGracePeriodSeconds: 100
securityContext:
fsGroup: 1000
initContainers:
# To increase the default vm.max_map_count to 262144
- name: increase-vm-max-map-count
image: busybox
command:
- sysctl
- -w
- vm.max_map_count=262144
securityContext:
privileged: true
# To increase the ulimit to 65536
- name: increase-ulimit
image: busybox
command:
- sh
- -c
- ulimit -n 65536
securityContext:
privileged: true
containers:
- name: elasticsearch
image: docker.elastic.co/elasticsearch/elasticsearch:{{ .Values.global.version }}
imagePullPolicy: Always
ports:
- name: http
containerPort: 9200
- name: transport
containerPort: 9300
volumeMounts:
# - name: storage
# mountPath: /data
- name: config
mountPath: /usr/share/elasticsearch/config/elasticsearch.yml
subPath: elasticsearch.yml
resources:
{{ toYaml .Values.resources | indent 12 }}
env:
- name: ES_JAVA_OPTS
value: {{ .Values.java.options }}
volumes:
- name: config
configMap:
name: elasticsearch-node
$ kubectl describe sts elasticsearch-node
Name: elasticsearch-node
Namespace: default
CreationTimestamp: Tue, 12 Nov 2019 17:09:50 +0100
Selector: component=elasticsearch,role=node
Labels: component=elasticsearch
role=node
Annotations: <none>
Replicas: 2 desired | 0 total
Update Strategy: RollingUpdate
Partition: 824638159384
Pods Status: 0 Running / 0 Waiting / 0 Succeeded / 0 Failed
Pod Template:
Labels: component=elasticsearch
role=node
Init Containers:
increase-vm-max-map-count:
Image: busybox
Port: <none>
Host Port: <none>
Command:
sysctl
-w
vm.max_map_count=262144
Environment: <none>
Mounts: <none>
increase-ulimit:
Image: busybox
Port: <none>
Host Port: <none>
Command:
sh
-c
ulimit -n 65536
Environment: <none>
Mounts: <none>
Containers:
elasticsearch:
Image: docker.elastic.co/elasticsearch/elasticsearch:7.3.2
Ports: 9200/TCP, 9300/TCP
Host Ports: 0/TCP, 0/TCP
Limits:
cpu: 1
memory: 3Gi
Requests:
cpu: 250m
memory: 2Gi
Environment:
ES_JAVA_OPTS: -Xms2G -Xmx2G
Mounts:
/usr/share/elasticsearch/config/elasticsearch.yml from config (rw,path="elasticsearch.yml")
Volumes:
config:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: elasticsearch-node
Optional: false
Volume Claims: <none>
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedCreate 1s (x17 over 78s) statefulset-controller create Pod elasticsearch-node-0 in StatefulSet elasticsearch-node failed error: pods "elasticsearch-node-0" is forbidden: unable to validate against any pod security policy: [spec.initContainers[0].securityContext.privileged: Invalid value: true: Privileged containers are not allowed spec.initContainers[1].securityContext.privileged: Invalid value: true: Privileged containers are not allowed]
Been staring at the PSP docs for some time now: https://kubernetes.io/docs/concepts/policy/pod-security-policy/
I am trying to setup EFK stack on Kubernetes . The Elasticsearch version being used is 6.3.2. Everything works fine until I place the probes configuration in the deployment YAML file. I am getting error as below. This is causing the pod to be declared unhealthy and eventually gets restarted which appears to be a false restart.
Warning Unhealthy 15s kubelet, aks-agentpool-23337112-0 Liveness probe failed: Get http://10.XXX.Y.ZZZ:9200/_cluster/health: dial tcp 10.XXX.Y.ZZZ:9200: connect: connection refused
I did try using telnet from a different container to the elasticsearch pod with IP and port and I was successful but only kubelet on the node is unable to resolve the IP of the pod causing the probes to fail.
Below is the snippet from the pod spec of the Kubernetes Statefulset YAML. Any assistance on the resolution would be really helpful. Spent quite a lot of time on this without any clue :(
PS: The stack is being setup on AKS cluster
- name: es-data
image: quay.io/pires/docker-elasticsearch-kubernetes:6.3.2
env:
- name: NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
- name: NODE_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: CLUSTER_NAME
value: myesdb
- name: NODE_MASTER
value: "false"
- name: NODE_INGEST
value: "false"
- name: HTTP_ENABLE
value: "true"
- name: NODE_DATA
value: "true"
- name: DISCOVERY_SERVICE
value: "elasticsearch-discovery"
- name: NETWORK_HOST
value: "_eth0:ipv4_"
- name: ES_JAVA_OPTS
value: -Xms512m -Xmx512m
- name: PROCESSORS
valueFrom:
resourceFieldRef:
resource: limits.cpu
resources:
requests:
cpu: 0.25
limits:
cpu: 1
ports:
- containerPort: 9200
name: http
- containerPort: 9300
name: transport
livenessProbe:
httpGet:
port: http
path: /_cluster/health
initialDelaySeconds: 40
periodSeconds: 10
readinessProbe:
httpGet:
path: /_cluster/health
port: http
initialDelaySeconds: 30
timeoutSeconds: 10
The pods/containers runs just fine without the probes in place . Expectation is that the probes should work fine when set on the deployment YAMLs and the POD should not get restarted.
The thing is that ElasticSearch itself has own health statuses (red, yellow, green) and you need to consider that in your configuration.
Here what I found in my own ES configuration, based on the official ES helm chart:
readinessProbe:
failureThreshold: 3
initialDelaySeconds: 40
periodSeconds: 10
successThreshold: 3
timeoutSeconds: 5
exec:
command:
- sh
- -c
- |
#!/usr/bin/env bash -e
# If the node is starting up wait for the cluster to be green
# Once it has started only check that the node itself is responding
START_FILE=/tmp/.es_start_file
http () {
local path="${1}"
if [ -n "${ELASTIC_USERNAME}" ] && [ -n "${ELASTIC_PASSWORD}" ]; then
BASIC_AUTH="-u ${ELASTIC_USERNAME}:${ELASTIC_PASSWORD}"
else
BASIC_AUTH=''
fi
curl -XGET -s -k --fail ${BASIC_AUTH} http://127.0.0.1:9200${path}
}
if [ -f "${START_FILE}" ]; then
echo 'Elasticsearch is already running, lets check the node is healthy'
http "/"
else
echo 'Waiting for elasticsearch cluster to become green'
if http "/_cluster/health?wait_for_status=green&timeout=1s" ; then
touch ${START_FILE}
exit 0
else
echo 'Cluster is not yet green'
exit 1
fi
fi
First Please check the logs using
kubectl logs <pod name> -n <namespacename>
You have to first run the init container and change the volume permissions.
you have to run the whole config as the user : 1000 also before the container of elasticsearch start you have to change the volume permission using init container.
apiVersion: apps/v1
kind: StatefulSet
metadata:
labels:
app : elasticsearch
component: elasticsearch
release: elasticsearch
name: elasticsearch
spec:
podManagementPolicy: Parallel
replicas: 1
revisionHistoryLimit: 10
selector:
matchLabels:
app : elasticsearch
component: elasticsearch
release: elasticsearch
serviceName: elasticsearch
template:
metadata:
creationTimestamp: null
labels:
app : elasticsearch
component: elasticsearch
release: elasticsearch
spec:
containers:
- env:
- name: cluster.name
value: <SET THIS>
- name: discovery.type
value: single-node
- name: ES_JAVA_OPTS
value: -Xms512m -Xmx512m
- name: bootstrap.memory_lock
value: "false"
image: elasticsearch:6.5.0
imagePullPolicy: IfNotPresent
name: elasticsearch
ports:
- containerPort: 9200
name: http
protocol: TCP
- containerPort: 9300
name: transport
protocol: TCP
resources:
limits:
cpu: 250m
memory: 1Gi
requests:
cpu: 150m
memory: 512Mi
securityContext:
privileged: true
runAsUser: 1000
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /usr/share/elasticsearch/data
name: elasticsearch-data
dnsPolicy: ClusterFirst
initContainers:
- command:
- sh
- -c
- chown -R 1000:1000 /usr/share/elasticsearch/data
- sysctl -w vm.max_map_count=262144
- chmod 777 /usr/share/elasticsearch/data
- chomod 777 /usr/share/elasticsearch/data/node
- chmod g+rwx /usr/share/elasticsearch/data
- chgrp 1000 /usr/share/elasticsearch/data
image: busybox:1.29.2
imagePullPolicy: IfNotPresent
name: set-dir-owner
resources: {}
securityContext:
privileged: true
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /usr/share/elasticsearch/data
name: elasticsearch-data
restartPolicy: Always
schedulerName: default-scheduler
securityContext: {}
terminationGracePeriodSeconds: 10
updateStrategy:
type: OnDelete
volumeClaimTemplates:
- metadata:
creationTimestamp: null
name: elasticsearch-data
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 10Gi
Check out the my yaml config and you can use. It's for single node of elasticsearch
Probe outlined in my answer works in 3 nodes discovery when Istio presented. If livenessProbe is bad, than k8s will restart container even not allowing to start properly. I use internal Elastic ports (for node to node communication) to test liveness. These ports speak TCP.
livenessProbe:
tcpSocket:
port: 9300
initialDelaySeconds: 60 # it takes time from jvm process to start start up to point when discovery process starts
timeoutSeconds: 10
- name: discovery.zen.minimum_master_nodes
value: "2"
- name: discovery.zen.ping.unicast.hosts
value: elastic
I'm upgrading my elasticsearch on kubernetes cluster from 5.6.10 to elasticsearch 6.1.4. However, I can't even get es 6.1.4 to launch.
I keep getting the error unknown setting [xpack.license.self_generated.type].
Per the docs, I tried setting the value to basic, xpack.license.self_generated.type=basic and I've also omitted the value all together.
I've seen a few others run into this error but none of their fixes have worked for me.
Help is much appreciated!
My stateful set yaml
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
name: elastic-data
labels:
app: elastic-data
area: devs
role: nosql
version: "6.1.4"
environment: elastic
spec:
serviceName: elastic-data
replicas: 1
updateStrategy:
type: RollingUpdate
template:
metadata:
labels:
app: elastic-data
area: devs
role: nosql
version: "6.1.4"
environment: elastic
annotations:
pod.beta.kubernetes.io/init-containers: '[
{
"name": "sysctl",
"image": "busybox",
"imagePullPolicy": "IfNotPresent",
"command": ["sysctl", "-w", "vm.max_map_count=262144"],
"securityContext": {
"privileged": true
}
}
]'
spec:
terminationGracePeriodSeconds: 10
securityContext:
runAsUser: 1000
fsGroup: 1000
containers:
- name: elastic-data
image: docker.elastic.co/elasticsearch/elasticsearch:6.1.4
resources:
requests:
memory: "512Mi"
limits:
memory: "1024Mi"
env:
- name: ES_JAVA_OPTS
value: -Xms512m -Xmx512m
command: ["/bin/bash", "-c", "~/bin/elasticsearch-plugin remove x-pack; elasticsearch"]
args:
- -Ecluster.name=elastic-devs
- -Enode.name=${HOSTNAME}
- -Ediscovery.zen.ping.unicast.hosts=elastic-master.default.svc.cluster.local
- -Enode.master=false
- -Enode.data=true
- -Enode.ingest=false
- -Enetwork.host=0.0.0.0
- -Expack.license.self_generated.type=basic
ports:
- containerPort: 9300
name: transport
- containerPort: 9200
name: http
volumeMounts:
- name: data-volume
mountPath: /usr/share/elasticsearch/data
readinessProbe:
tcpSocket:
port: 9300
initialDelaySeconds: 30
periodSeconds: 30
timeoutSeconds: 3
livenessProbe:
tcpSocket:
port: 9300
initialDelaySeconds: 30
periodSeconds: 30
timeoutSeconds: 3
volumeClaimTemplates:
- metadata:
name: data-volume
spec:
accessModes: [ "ReadWriteOnce" ]
resources:
requests:
storage: 2Gi
As they are trying to communicate, you forgot to remove the configuration property from config/elasticsearch.yml. So, the full revised command: would be
~/bin/elasticsearch-plugin remove x-pack
sed -i.bak -e /xpack.license.self_generated.type/d config/elasticsearch.yml
elasticsearch
Don't get me wrong, it's very silly of them to bomb over a config property for something that doesn't exist, but that's apparently the way it is.
p.s. you may be happier with the --purge option, since when I tried that command locally elasticsearch-plugin cheerfully advised:
-> preserving plugin config files [/usr/share/elasticsearch/config/x-pack] in case of upgrade; use --purge if not needed
thus: ./bin/elasticsearch-plugin remove x-pack --purge
I am trying to deploy elastic-search in kubernetes with local drive volume but I get the following error, can you please correct me.
using ubuntu 16.04
kubernetes v1.11.0
Docker version 17.03.2-ce
Getting error 'unknown field hostPath' Kubernetes Elasticsearch using with local volume
error: error validating "es-d.yaml": error validating data: ValidationError(StatefulSet.spec.template.spec.containers[1]): unknown field "hostPath" in io.k8s.api.core.v1.Container; if you choose to ignore these errors, turn validation off with --validate=false
This is the yaml file of the statefulSet:
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
name: es-data
labels:
component: elasticsearch
role: data
spec:
serviceName: elasticsearch-data
replicas: 1
template:
metadata:
labels:
component: elasticsearch
role: data
spec:
initContainers:
- name: init-sysctl
image: alpine:3.6
command:
- sysctl
- -w
- vm.max_map_count=262144
securityContext:
privileged: true
containers:
- name: es-data
image: quay.io/pires/docker-elasticsearch-kubernetes:6.3.0
env:
- name: NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
- name: NODE_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: CLUSTER_NAME
value: myesdb
- name: NODE_MASTER
value: "false"
- name: NODE_INGEST
value: "false"
- name: HTTP_ENABLE
value: "true"
- name: ES_JAVA_OPTS
value: -Xms256m -Xmx256m
- name: PROCESSORS
valueFrom:
resourceFieldRef:
resource: limits.cpu
resources:
requests:
cpu: 0.25
limits:
cpu: 1
ports:
- containerPort: 9200
name: http
- containerPort: 9300
name: transport
livenessProbe:
tcpSocket:
port: transport
initialDelaySeconds: 20
periodSeconds: 10
readinessProbe:
httpGet:
path: /_cluster/health
port: http
initialDelaySeconds: 20
timeoutSeconds: 5
volumeMounts:
- name: storage
mountPath: /es
volumes:
- name: storage
You have the wrong structure. volumes must be on the same level as containers, initContainers.
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
name: es-data
labels:
component: elasticsearch
role: data
spec:
serviceName: elasticsearch-data
replicas: 1
template:
metadata:
labels:
component: elasticsearch
role: data
spec:
initContainers:
- name: init-sysctl
image: alpine:3.6
command:
- sysctl
- -w
- vm.max_map_count=262144
securityContext:
privileged: true
containers:
- name: es-data
image: quay.io/pires/docker-elasticsearch-kubernetes:6.3.0
env:
- name: NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
- name: NODE_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: CLUSTER_NAME
value: myesdb
- name: NODE_MASTER
value: "false"
- name: NODE_INGEST
value: "false"
- name: HTTP_ENABLE
value: "true"
- name: ES_JAVA_OPTS
value: -Xms256m -Xmx256m
- name: PROCESSORS
valueFrom:
resourceFieldRef:
resource: limits.cpu
resources:
requests:
cpu: 0.25
limits:
cpu: 1
ports:
- containerPort: 9200
name: http
- containerPort: 9300
name: transport
livenessProbe:
tcpSocket:
port: transport
initialDelaySeconds: 20
periodSeconds: 10
readinessProbe:
httpGet:
path: /_cluster/health
port: http
initialDelaySeconds: 20
timeoutSeconds: 5
volumeMounts:
- name: storage
mountPath: /es
volumes:
- name: storage
You can find example here.
Check your format, hostPath is not supposed to be under container part, 'volume' is not in it's position.