How to disable swapping in Elasticsearch on Kubernetes? - elasticsearch

As per the official Elasticsearch docs, disabling swapping is one of the best performance boosts available to Elasticsearch.
However, it's proving difficult to configure. I've spent a number of hours researching and attempting different methods to disable swapping using the official ES Docker image on Kubernetes.
When setting bootstrap.memory_lock: true as an env variable, the image fails to boot with the error: Unable to lock JVM Memory: error=12, reason=Cannot allocate memory. This can result in part of the JVM being swapped out. Increase RLIMIT_MEMLOCK, soft limit: 65536, hard limit: 65536. As the docs point out, this is somewhat expected. I've even mounted a custom /etc/security/limits.conf with the required settings, but that failed as well.
What is the recommended way to disable swapping when using the official es image on k8s?
Here are the relevant sections of my YAML:
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: elastic-data
spec:
  serviceName: elastic-data
  replicas: 1
  template:
    spec:
      securityContext:
        runAsUser: 0
        fsGroup: 0
      containers:
      - name: elastic-data
        image: docker.elastic.co/elasticsearch/elasticsearch-oss:6.4.0
        env:
        - name: ES_JAVA_OPTS
          value: "-Xms2g -Xmx2g"
        - name: cluster.name
          value: "elastic-devs"
        - name: node.name
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: discovery.zen.ping.unicast.hosts
          value: "elastic-master.default.svc.cluster.local"
        - name: node.master
          value: "false"
        - name: node.ingest
          value: "false"
        - name: node.data
          value: "true"
        - name: network.host
          value: "0.0.0.0"
        - name: path.data
          value: /usr/share/elasticsearch/data
        - name: indices.memory.index_buffer_size
          value: "512MB"
        - name: bootstrap.memory_lock
          value: "true"
        resources:
          requests:
            memory: "3Gi"
          limits:
            memory: "3Gi"
        ports:
        - containerPort: 9300
          name: transport
        - containerPort: 9200
          name: http
        volumeMounts:
        - name: data-volume
          mountPath: /usr/share/elasticsearch/data
        - name: swappiness-config
          mountPath: /etc/security/limits.conf
          subPath: limits.conf
      volumes:
      - name: data-volume
        persistentVolumeClaim:
          claimName: pvc-es
      - name: swappiness-config
        configMap:
          name: swappiness-config
          items:
          - key: limits.conf
            path: limits.conf
limits.conf
elasticsearch soft memlock unlimited
elasticsearch hard memlock unlimited
elasticsearch hard nofile 65536
elasticsearch soft nofile 65536

I think the ulimits in my YAML weren't being recognized, so I followed this post and created an image with a custom entrypoint that sets them (a sketch of the same idea, applied directly in the pod spec, is below).
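For reference, here is a minimal sketch of that idea expressed in the pod spec instead of a custom image. It assumes the container runs as root (so raising the memlock limit is permitted) and that the image's entrypoint is /usr/local/bin/docker-entrypoint.sh invoked with the eswrapper argument, as in the 6.x images; both details are assumptions and should be verified against the image you actually run.

containers:
- name: elastic-data
  image: docker.elastic.co/elasticsearch/elasticsearch-oss:6.4.0
  securityContext:
    runAsUser: 0
  command: ["/bin/sh", "-c"]
  args:
    # raise the memlock limit for this process tree, then hand off to the
    # image's normal entrypoint so bootstrap.memory_lock: true can succeed
    - ulimit -l unlimited && exec /usr/local/bin/docker-entrypoint.sh eswrapper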

Which image type are you using?
If it's Container-Optimized OS (COS), try switching to Ubuntu-based node images.

Related

Elastic Search Kubernetes - Disable memory swapping

I am using Elasticsearch (v7.6.1) on a Kubernetes (v1.19) cluster.
The docs suggest disabling swapping:
https://www.elastic.co/guide/en/elasticsearch/reference/current/setup-configuration-memory.html
My yaml:
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: elastic-cluster-1
spec:
  version: 7.6.1
  image: docker.elastic.co/elasticsearch/elasticsearch:7.6.1
  nodeSets:
  - name: default
    count: 3
    config:
      node.master: true
      node.data: true
      node.ingest: true
    podTemplate:
      metadata:
        labels:
          # additional labels for pods
          type: elastic-master-node
      spec:
        nodeSelector:
          node-pool: <NODE_POOL>
        initContainers:
        # Increase linux map count to allow elastic to store large memory maps
        - name: sysctl
          securityContext:
            privileged: true
          command: ['sh', '-c', 'sysctl -w vm.max_map_count=262144']
        containers:
        - name: elasticsearch
          # specify resource limits and requests
          resources:
            limits:
              memory: 11.2Gi
            requests:
              cpu: 3200m
          env:
          - name: ES_JAVA_OPTS
            value: "-Xms6g -Xmx6g"
    # Request persistent data storage for pods
    volumeClaimTemplates:
    - metadata:
        name: elasticsearch-data
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 50Gi
        storageClassName: ssd
  - name: data
    count: 2
    config:
      node.master: false
      node.data: true
      node.ingest: true
    podTemplate:
      metadata:
        labels:
          # additional labels for pods
          type: elastic-data-node
      spec:
        nodeSelector:
          node-pool: <NODE_POOL>
        initContainers:
        # Increase linux map count to allow elastic to store large memory maps
        - name: sysctl
          securityContext:
            privileged: true
          command: ['sh', '-c', 'sysctl -w vm.max_map_count=262144']
        containers:
        - name: elasticsearch
          # specify resource limits and requests
          resources:
            limits:
              memory: 11.2Gi
            requests:
              cpu: 3200m
          env:
          - name: ES_JAVA_OPTS
            value: "-Xms6g -Xmx6g"
    # Request persistent data storage for pods
    volumeClaimTemplates:
    - metadata:
        name: elasticsearch-data
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 50Gi
        storageClassName: ssd
  # Google cloud storage credentials
  secureSettings:
  - secretName: "gcs-credentials"
  http:
    service:
      spec:
        # expose this cluster Service with a LoadBalancer
        type: LoadBalancer
    tls:
      certificate:
        secretName: elasticsearch-certificate
It's not clear to me how to change this YAML in order to disable swapping correctly. Changing each node manually is not an option, because the configuration would be lost on every restart.
How can I do this?
First of all, a Kubernetes cluster has swap disabled by default; this is actually a mandatory requirement. In most cases, especially on managed cloud clusters that follow this requirement, you do not need to worry about swapping at all. Even in 1.22, enabling swap is only an alpha feature.
If for whatever reason you need to deal with this, you can consider setting bootstrap.memory_lock to true.
...
containers:
- name: elasticsearch
  env:
  - name: bootstrap.memory_lock
    value: "true"
...
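In the Elasticsearch resource from the question, that snippet belongs inside the podTemplate of each nodeSet. A minimal sketch of the nesting, showing only the relevant fields (the nodeSet name is taken from the question's manifest):

spec:
  nodeSets:
  - name: default
    podTemplate:
      spec:
        containers:
        - name: elasticsearch
          env:
          - name: bootstrap.memory_lock
            value: "true"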
Up until recently, Kubernetes had no control over swapping.
As of 1.22, there's a new alpha feature for this. The CRI spec does allow for swap allocations. I didn't find anything new in the Pod specification in that regard: as far as I understand, you can currently either allow your containers to use as much swap as they can (UnlimitedSwap), or limit swap-plus-memory usage to whatever memory limit you set on your container (LimitedSwap).
Since you're running 1.19, this shouldn't concern you right now. A good practice while deploying your cluster would have been to make sure there is no swap at all on your nodes, or to set swappiness to 0 or 1. Checking the Kubespray playbooks, we can see they would still unconditionally disable swap.
You can connect to your nodes (ssh) and make sure there's no swap, or disable it otherwise, as sketched below. There's nothing you can do in that Elasticsearch object directly.
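A minimal sketch of what that looks like on a node, assuming SSH and sudo access (the exact commands vary with your distribution, and on managed node pools this is usually already done for you):

# turn off any active swap devices
sudo swapoff -a
# comment out swap entries so the change survives a reboot
sudo sed -i '/ swap / s/^/#/' /etc/fstab
# alternatively, if swap must stay enabled, make the kernel avoid it
sudo sysctl -w vm.swappiness=1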

How can I configure different storage mount for different pod in Elasticsearch cluster in K8S?

I am deploying an Elasticsearch cluster to Kubernetes on EKS with a node group. I claimed an EBS volume for the cluster's storage. When I launch the cluster, only one pod runs successfully and I get this error for the other pods:
Warning FailedAttachVolume 3m33s attachdetach-controller Multi-Attach error for volume "pvc-4870bd46-2f1e-402a-acf7-005de83e4588" Volume is already used by pod(s) es-0
Warning FailedMount 90s kubelet Unable to attach or mount volumes: unmounted volumes=[persistent-storage], unattached volumes=[es-config persistent-storage default-token-pqzkp]: timed out waiting for the condition
It means the storage is already in use. I understand that this volume is used by the first pod, so the other pods can't use it. But I don't know how to use a different mount path for each pod when they are using the same EBS volume.
Below is the full spec for the cluster.
apiVersion: v1
kind: ConfigMap
metadata:
  name: es-config
data:
  elasticsearch.yml: |
    cluster.name: elk-cluster
    network.host: "0.0.0.0"
    bootstrap.memory_lock: false
    # discovery.zen.minimum_master_nodes: 2
    node.max_local_storage_nodes: 9
    discovery.seed_hosts:
      - es-0.es-entrypoint.default.svc.cluster.local
      - es-1.es-entrypoint.default.svc.cluster.local
      - es-2.es-entrypoint.default.svc.cluster.local
  ES_JAVA_OPTS: -Xms4g -Xmx8g
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: es
  namespace: default
spec:
  serviceName: es-entrypoint
  replicas: 3
  selector:
    matchLabels:
      name: es
  template:
    metadata:
      labels:
        name: es
    spec:
      volumes:
      - name: es-config
        configMap:
          name: es-config
          items:
          - key: elasticsearch.yml
            path: elasticsearch.yml
      - name: persistent-storage
        persistentVolumeClaim:
          claimName: ebs-claim
      initContainers:
      - name: permissions-fix
        image: busybox
        volumeMounts:
        - name: persistent-storage
          mountPath: /usr/share/elasticsearch/data
        command: [ 'chown' ]
        args: [ '1000:1000', '/usr/share/elasticsearch/data' ]
      containers:
      - name: es
        image: elasticsearch:7.10.1
        resources:
          requests:
            cpu: 2
            memory: 8Gi
        ports:
        - name: http
          containerPort: 9200
        - containerPort: 9300
          name: inter-node
        volumeMounts:
        - name: es-config
          mountPath: /usr/share/elasticsearch/config/elasticsearch.yml
          subPath: elasticsearch.yml
        - name: persistent-storage
          mountPath: /usr/share/elasticsearch/data
---
apiVersion: v1
kind: Service
metadata:
  name: es-entrypoint
spec:
  selector:
    name: es
  ports:
  - port: 9200
    targetPort: 9200
    protocol: TCP
  clusterIP: None
You should be using volumeClaimTemplates with the StatefulSet so that each pod gets its own volume. Details:
volumeClaimTemplates:
- metadata:
    name: es
  spec:
    accessModes:
    - ReadWriteOnce
    resources:
      requests:
        storage: 5Gi
    # storageClassName: <omit to use default StorageClass or specify>
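For completeness, a sketch of how the container then mounts it: the volumeMounts name must match the volumeClaimTemplates metadata.name (here "es", from the snippet above), and the persistentVolumeClaim reference to the shared ebs-claim can be dropped from the pod spec:

containers:
- name: es
  volumeMounts:
  - name: es                                  # matches volumeClaimTemplates metadata.name
    mountPath: /usr/share/elasticsearch/data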

Data is lost after changes are applied to the image

I am trying to add Elasticsearch to the EKS cluster. But whenever I apply changes, my data is lost. It looks like the volume I attached has been changed and reclaimed.
metadata:
  name: elasticsearch-uat
  labels:
    component: elasticsearch-uat
spec:
  replicas: 1
  serviceName: elasticsearch-uat
  template:
    metadata:
      ...
    spec:
      initContainers:
      - name: init-sysctl
        ...
      containers:
      - name: es
        securityContext:
          capabilities:
            add:
            - IPC_LOCK
        image: 559076975273.dkr.ecr.us-west-2.amazonaws.com/elasticsearch-s3:v2
        env:
          ...
        ports:
        - containerPort: 9200
          name: http
          protocol: TCP
        - containerPort: 9300
          name: transport
          protocol: TCP
        volumeMounts:
        - mountPath: /data
          name: es-storage-uat
  updateStrategy:
    type: RollingUpdate
  volumeClaimTemplates:
  - metadata:
      namespace: k8
      name: es-storage-uat
    spec:
      storageClassName: gp2
      accessModes: [ ReadWriteOnce ]
      resources:
        requests:
          storage: 2Gi
This is a StatefulSet.
Please help me understand this concept. I do not want to lose my data under any circumstances.
Thanks in advance.
From the pv output it is clear that the PersistentVolume reclaim policy is set to Delete, which means that when the PVC is deleted, the PersistentVolume gets deleted automatically and you lose the data.
In your case, it is appropriate to use the "Retain" policy. With the "Retain" policy, if a user deletes a PersistentVolumeClaim, the corresponding PersistentVolume is not deleted.
Further to the above, if you are using dynamic provisioning, set the reclaimPolicy field of the StorageClass to an appropriate value. If no reclaimPolicy is specified when a StorageClass object is created, it defaults to Delete.
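Two sketches of what that can look like. The first patches an already-provisioned PV in place; the second is a hypothetical StorageClass (the name gp2-retain is made up) that dynamically provisions gp2 volumes with a Retain policy:

# patch an existing PV (substitute your PV name):
# kubectl patch pv <pv-name> -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp2-retain            # hypothetical name
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
reclaimPolicy: Retain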

Kubernetes persistent volume claim overriding existing directory's owner and permissions

In Kubernetes, I am having a directory permission problem. I am testing with a pod to create a bare-bones Elasticsearch instance, built from an Elasticsearch-provided Docker image.
If I use a basic .yaml file to define the container, everything starts up. The problem happens when I attempt to replace a directory created from the Docker image with a directory created by mounting the persistent volume.
The original directory was
drwxrwxr-x 1 elasticsearch root 4096 Aug 30 19:25 data
and if I mount the persistent volume, it changes the owner and permissions to
drwxr-xr-x 2 root root 4096 Aug 30 19:53 data
Now, with the elasticsearch process running as the elasticsearch user, this directory can no longer be accessed.
I have set the pod security context's fsGroup to 1000 to match the elasticsearch group. I have set the container security context's runAsUser to 0. I have tried various other combinations of users and groups, to no avail.
Here is my pod, persistent volume claim, and persistent volume definitions.
Any suggestions are welcome.
apiVersion: v1
kind: Pod
metadata:
  name: elasticfirst
  labels:
    app: elasticsearch
spec:
  securityContext:
    fsGroup: 1000
  containers:
  - name: es01
    image: docker.elastic.co/elasticsearch/elasticsearch:7.3.1
    securityContext:
      runAsUser: 0
    resources:
      limits:
        memory: 2Gi
        cpu: 200m
      requests:
        memory: 1Gi
        cpu: 100m
    env:
    - name: node.name
      value: es01
    - name: discovery.seed_hosts
      value: es01
    - name: cluster.initial_master_nodes
      value: es01
    - name: cluster.name
      value: elasticsearch-cluster
    - name: bootstrap.memory_lock
      value: "true"
    - name: ES_JAVA_OPTS
      value: "-Xms1g -Xmx2g"
    ports:
    - containerPort: 9200
    volumeMounts:
    - mountPath: "/usr/share/elasticsearch/data"
      name: elastic-storage2
  nodeSelector:
    type: compute
  volumes:
  - name: elastic-storage2
    persistentVolumeClaim:
      claimName: elastic-storage2-pvc
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: elastic-storage2-pvc
spec:
  storageClassName: local-storage
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 512Mi
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: elastic-storage2-pv
spec:
  storageClassName: local-storage
  capacity:
    storage: 512Mi
  accessModes:
  - ReadWriteOnce
  hostPath:
    path: /var/tmp/pv
Your question is a tiny bit confusing about what is happening versus what you want to be happening, but in general that problem is a common one; that's why many setups use an initContainer: to change the ownership of freshly provisioned PersistentVolumes (as in this example).
In such a setup, the initContainer: would run as root, but would also presumably be a very thin container whose job is only to chown and then exit, leaving your application container -- elasticsearch in your example -- free to run as an unprivileged user:
spec:
  initContainers:
  - name: chown
    image: busybox
    command:
    - chown
    - -R
    - "1000:1000"
    - /the/data
    volumeMounts:
    - name: es-data
      mountPath: /the/data
  containers:
  - name: es
    # etc etc
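(In the official Elasticsearch images the elasticsearch user runs with UID 1000, which is why 1000:1000 is a common choice here; adjust the IDs if your image differs, and point /the/data at the directory you actually mount.)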

Kubernetes Persistent Volume is not working on GCE

I am trying to make my Elasticsearch pods persistent so that data is preserved when the deployment or pods are recreated. Elasticsearch is part of a Graylog2 setup.
After I set everything up, I sent a few logs to Graylog and I could see them appear on the dashboard. However, when I deleted the Elasticsearch pod and it was recreated, all the data was gone from the Graylog dashboard.
I am using GCE.
Here is my persistent volume config:
kind: PersistentVolume
apiVersion: v1
metadata:
  name: elastic-pv
  labels:
    type: gcePD
spec:
  capacity:
    storage: 200Gi
  accessModes:
  - ReadWriteOnce
  gcePersistentDisk:
    fsType: ext4
    pdName: elastic-pv-disk
Persistent volume claim config:
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: elastic-pvc
  labels:
    type: gcePD
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 200Gi
and here is my elasticsearch deployment:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: elastic-deployment
spec:
  replicas: 1
  template:
    metadata:
      labels:
        type: elasticsearch
    spec:
      containers:
      - name: elastic-container
        image: gcr.io/project/myelasticsearch:v1
        imagePullPolicy: Always
        ports:
        - containerPort: 9300
          name: first-port
          protocol: TCP
        - containerPort: 9200
          name: second-port
          protocol: TCP
        volumeMounts:
        - name: elastic-pd
          mountPath: /data/db
      volumes:
      - name: elastic-pd
        persistentVolumeClaim:
          claimName: elastic-pvc
Output of kubectl describe pod:
Name:           elastic-deployment-1423685295-jt6x5
Namespace:      default
Node:           gke-sd-logger-default-pool-2b3affc0-299k/10.128.0.6
Start Time:     Tue, 09 May 2017 22:59:59 +0500
Labels:         pod-template-hash=1423685295
                type=elasticsearch
Status:         Running
IP:             10.12.0.11
Controllers:    ReplicaSet/elastic-deployment-1423685295
Containers:
  elastic-container:
    Container ID:   docker://8774c747e2a56363f657a583bf5c2234ed2cff64dc21b6319fc53fdc5c1a6b2b
    Image:          gcr.io/thematic-flash-786/myelasticsearch:v1
    Image ID:       docker://sha256:7c25be62dbad39c07c413888e275ae419a66070d37e0d98bf5008e15d7720eec
    Ports:          9300/TCP, 9200/TCP
    Requests:
      cpu:          100m
    State:          Running
      Started:      Tue, 09 May 2017 23:02:11 +0500
    Ready:          True
    Restart Count:  0
    Volume Mounts:
      /data/db from elastic-pd (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-qtdbb (ro)
    Environment Variables:  <none>
Conditions:
  Type           Status
  Initialized    True
  Ready          True
  PodScheduled   True
Volumes:
  elastic-pd:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  elastic-pvc
    ReadOnly:   false
  default-token-qtdbb:
    Type:       Secret (a volume populated by a Secret)
    SecretName: default-token-qtdbb
QoS Class:      Burstable
Tolerations:    <none>
No events.
Output of kubectl describe pv:
Name:           elastic-pv
Labels:         type=gcePD
StorageClass:
Status:         Bound
Claim:          default/elastic-pvc
Reclaim Policy: Retain
Access Modes:   RWO
Capacity:       200Gi
Message:
Source:
    Type:       GCEPersistentDisk (a Persistent Disk resource in Google Compute Engine)
    PDName:     elastic-pv-disk
    FSType:     ext4
    Partition:  0
    ReadOnly:   false
No events.
Output of kubectl describe pvc:
Name:           elastic-pvc
Namespace:      default
StorageClass:
Status:         Bound
Volume:         elastic-pv
Labels:         type=gcePD
Capacity:       200Gi
Access Modes:   RWO
No events.
Confirmation that the real disk exists (screenshot omitted).
What could be the reason Persistent Volume is not persistent?
In the official images, Elasticsearch data is stored at /usr/share/elasticsearch/data, not /data/db. You need to update the mount to /usr/share/elasticsearch/data so the data actually lands on the persistent volume.
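A sketch of the corrected mount in the Deployment from the question (only the mountPath changes):

volumeMounts:
- name: elastic-pd
  mountPath: /usr/share/elasticsearch/data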
