Has anyone come across this error? Elasticsearch fails with bootstrap errors. How do we fix them?
{"#timestamp":"2022-11-14T08:57:21.047Z", "log.level":"ERROR", "message":"uncaught exception in thread [process reaper (pid 86)]", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"process reaper (pid 86)","log.logger":"org.elasticsearch.bootstrap.ElasticsearchUncaughtExceptionHandler","elasticsearch.node.name":"es-cluster-1","error.type":"java.security.AccessControlException","error.message":"access denied (\"java.lang.RuntimePermission\" \"modifyThread\")","error.stack_trace":"java.security.AccessControlException: access denied (\"java.lang.RuntimePermission\" \"modifyThread\")\n\tat java.base/java.security.AccessControlContext.checkPermission(AccessControlContext.java:485)\n\tat java.base/java.security.AccessController.checkPermission(AccessController.java:1068)\n\tat java.base/java.lang.SecurityManager.checkPermission(SecurityManager.java:411)\n\tat org.elasticsearch.securesm#8.5.0/org.elasticsearch.secure_sm.SecureSM.checkThreadAccess(SecureSM.java:166)\n\tat org.elasticsearch.securesm#8.5.0/org.elasticsearch.secure_sm.SecureSM.checkAccess(SecureSM.java:120)\n\tat java.base/java.lang.Thread.checkAccess(Thread.java:2360)\n\tat java.base/java.lang.Thread.setDaemon(Thread.java:2308)\n\tat java.base/java.lang.ProcessHandleImpl.lambda$static$0(ProcessHandleImpl.java:103)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor$Worker.<init>(ThreadPoolExecutor.java:637)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:928)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor.processWorkerExit(ThreadPoolExecutor.java:1021)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1158)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)\n\tat java.base/java.lang.Thread.run(Thread.java:1589)\n\tat java.base/jdk.internal.misc.InnocuousThread.run(InnocuousThread.java:186)\n"}
The deployment files have the relevant settings set:
env:
- name: cluster.name
  value: k8s-logs
- name: node.name
  valueFrom:
    fieldRef:
      fieldPath: metadata.name
- name: discovery.seed_hosts
  value: "es-cluster-0.elasticsearch,es-cluster-1.elasticsearch,es-cluster-2.elasticsearch"
- name: cluster.initial_master_nodes
  value: "es-cluster-0,es-cluster-1,es-cluster-2"
- name: ES_JAVA_OPTS
  value: "-Xms1024m -Xmx2048m"
nodeSelector:
  app: aks1
initContainers:
- name: fix-permissions
  image: busybox
  command: ["sh", "-c", "chown -R 1000:1000 /usr/share/elasticsearch/data"]
  securityContext:
    privileged: true
  volumeMounts:
  - name: data
    mountPath: /usr/share/elasticsearch/data
- name: increase-vm-max-map
  image: busybox
  command: ["sysctl", "-w", "vm.max_map_count=262144"]
  securityContext:
    privileged: true
- name: increase-fd-ulimit
  image: busybox
  command: ["sh", "-c", "ulimit -n 65536"]
  securityContext:
    privileged: true
When starting an Elasticsearch 7.17 cluster in OpenShift, the cluster writes an error:
chroot: cannot change root directory to '/': Operation not permitted
Kibana started OK.
Code:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: elasticsearch
  namespace: elasticsearch-pp
spec:
  serviceName: elasticsearch
  replicas: 3
  selector:
    matchLabels:
      app: elasticsearch
  template:
    metadata:
      labels:
        app: elasticsearch
    spec:
      containers:
      - name: elasticsearch
        image: NEXUS/elasticsearch:7.17.7
        env:
        - name: cluster.name
          value: k8s-logs
        - name: node.name
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: discovery.seed_hosts
          value: "es-cluster-0.elasticsearch,es-cluster-1.elasticsearch,es-cluster-2.elasticsearch"
        - name: cluster.initial_master_nodes
          value: "es-cluster-0,es-cluster-1,es-cluster-2"
        - name: ES_JAVA_OPTS
          value: "-Xms4012m -Xmx4012m"
        ports:
        - containerPort: 9200
          name: client
        - containerPort: 9300
          name: nodes
        volumeMounts:
        - name: data-cephfs
          mountPath: /usr/share/elasticsearch/data
      initContainers:
      - name: fix-permissions
        image: NEXUS/busybox
        command: ["sh", "-c", "chown -R 1000:1000 /usr/share/elasticsearch/data"]
        volumeMounts:
        - name: data-cephfs
          mountPath: /usr/share/elasticsearch/data
      - name: increase-vm-max-map
        image: NEXUS/busybox
        imagePullPolicy: IfNotPresent
        command: ["/bin/sh"]
        args: ["-c", "sysctl -w vm.max_map_count=262144; echo vm.max_map_count=262144 >> /etc/sysctl.conf ; sysctl -p"]
        # image: NEXUS/busybox
        # command: ["sysctl", "-w", "vm.max_map_count=262144"]
      - name: increase-fd-ulimit
        image: NEXUS/busybox
        command: ["sh", "-c", "ulimit -n 65536"]
      serviceAccount: elk-anyuid
      serviceAccountName: elk-anyuid
      restartPolicy: Always
      volumes:
      - name: data-cephfs
        persistentVolumeClaim:
          claimName: data-cephfs
I tried changing the cluster settings and disabling the initContainers, but the error persists.
Below is my YAML for volume mounting:
initContainers:
- name: fix-permissions
  image: busybox
  command: ["sh", "-c", "chown -R 1000:1000 /usr/share/elasticsearch/data"]
  securityContext:
    privileged: true
  volumeMounts:
  - name: data
    mountPath: /usr/share/elasticsearch/data
- name: increase-vm-max-map
  image: busybox
  command: ["sysctl", "-w", "vm.max_map_count=262144"]
  securityContext:
    privileged: true
- name: increase-fd-ulimit
  image: busybox
  command: ["sh", "-c", "ulimit -n 65536"]
  securityContext:
    privileged: true
volumes:
- name: data
  hostPath:
    path: /usr/share/elasticsearch/data
    type: DirectoryOrCreate
Even after changing the type to DirectoryOrCreate, it shows the error:
MountVolume.SetUp failed for volume "data" : hostPath type check failed: /usr/share/elasticsearch/data is not a directory
How can I fix this?
You can add a volumeMounts entry inside your container:
containers:
- name: increase-fd-ulimit
  volumeMounts:
  - name: data
    mountPath: /usr/share/elasticsearch/data
and add a volumeClaimTemplates section inside your StatefulSet:
spec:
  volumeClaimTemplates:
  - kind: PersistentVolumeClaim
    apiVersion: v1
    metadata:
      name: data
      namespace: your_namespace
      annotations:
        volume.beta.kubernetes.io/storage-class: default
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 200Gi
      volumeMode: Filesystem
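As a quick check after applying this, the generated claims should come up Bound. The claim name below assumes the StatefulSet is named elasticsearch (claims follow the <template>-<statefulset>-<ordinal> pattern):
kubectl get pvc -n your_namespace
# NAME                   STATUS   VOLUME    CAPACITY   ACCESS MODES   STORAGECLASS
# data-elasticsearch-0   Bound    pvc-...   200Gi      RWO            default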
I am backing up my PostgreSQL database using this CronJob:
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: postgres-backup
spec:
  schedule: "0 2 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: postgres-backup
            image: postgres:10.4
            command: ["/bin/sh"]
            args: ["-c", 'echo "$PGPASS" > /root/.pgpass && chmod 600 /root/.pgpass && pg_dump -Fc -h <host> -U <user> <db> > /var/backups/backup.dump']
            env:
            - name: PGPASS
              valueFrom:
                secretKeyRef:
                  name: pgpass
                  key: pgpass
            volumeMounts:
            - mountPath: /var/backups
              name: postgres-backup-storage
          restartPolicy: Never
          volumes:
          - name: postgres-backup-storage
            hostPath:
              path: /var/volumes/postgres-backups
              type: DirectoryOrCreate
The CronJob executes successfully; the backup is made and saved in the Job's container, but that container is stopped after the script completes.
Of course I want to access the backup files in the container, but I can't because it is stopped/terminated.
Is there a way to execute shell commands in a container after it has terminated, so I can access the backup files saved in it?
I know that I could do that on the node, but I don't have permission to access it.
#confused genius gave me a great idea: create another identical container to access the dump files. This is the solution that works:
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: postgres-backup
spec:
  schedule: "0 2 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: postgres-backup
            image: postgres:10.4
            command: ["/bin/sh"]
            args: ["-c", 'echo "$PGPASS" > /root/.pgpass && chmod 600 /root/.pgpass && pg_dump -Fc -h <host> -U <user> <db> > /var/backups/backup.dump']
            env:
            - name: PGPASS
              valueFrom:
                secretKeyRef:
                  name: dev-pgpass
                  key: pgpass
            volumeMounts:
            - mountPath: /var/backups
              name: postgres-backup-storage
          - name: postgres-restore
            image: postgres:10.4
            volumeMounts:
            - mountPath: /var/backups
              name: postgres-backup-storage
          restartPolicy: Never
          volumes:
          - name: postgres-backup-storage
            hostPath:
              # Ensure the file directory is created.
              path: /var/volumes/postgres-backups
              type: DirectoryOrCreate
After that, one just needs to sh into the "postgres-restore" container and access the dump files.
Thanks!
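For reference, a minimal sketch of that last step; the pod name placeholder is hypothetical and will vary per CronJob run:
# List pods created by the CronJob (the Job pod name varies per run)
kubectl get pods | grep postgres-backup
# Open a shell in the restore container, which mounts the same backup volume
kubectl exec -it <postgres-backup-pod-name> -c postgres-restore -- sh
# Inside the container:
ls -lh /var/backups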
I am trying to deploy an Elasticsearch StatefulSet and have the storage provisioned by the rook-ceph StorageClass.
The pod is in pending state because of:
Warning FailedScheduling 87s default-scheduler 0/4 nodes are available: 4 pod has unbound immediate PersistentVolumeClaims.
The StatefulSet looks like this:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: elasticsearch
  namespace: rcc
  labels:
    app: elasticsearch
    tier: backend
    type: db
spec:
  serviceName: es
  replicas: 3
  selector:
    matchLabels:
      app: elasticsearch
      tier: backend
      type: db
  template:
    metadata:
      labels:
        app: elasticsearch
        tier: backend
        type: db
    spec:
      terminationGracePeriodSeconds: 300
      initContainers:
      - name: fix-the-volume-permission
        image: busybox
        command:
        - sh
        - -c
        - chown -R 1000:1000 /usr/share/elasticsearch/data
        securityContext:
          privileged: true
        volumeMounts:
        - name: data
          mountPath: /usr/share/elasticsearch/data
      - name: increase-the-vm-max-map-count
        image: busybox
        command:
        - sysctl
        - -w
        - vm.max_map_count=262144
        securityContext:
          privileged: true
      - name: increase-the-ulimit
        image: busybox
        command:
        - sh
        - -c
        - ulimit -n 65536
        securityContext:
          privileged: true
      containers:
      - name: elasticsearch
        image: docker.elastic.co/elasticsearch/elasticsearch-oss:7.7.1
        ports:
        - containerPort: 9200
          name: http
        - containerPort: 9300
          name: tcp
        resources:
          requests:
            memory: 4Gi
          limits:
            memory: 6Gi
        env:
        - name: cluster.name
          value: elasticsearch
        - name: node.name
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: discovery.zen.ping.unicast.hosts
          value: "elasticsearch-0.es.default.svc.cluster.local,elasticsearch-1.es.default.svc.cluster.local,elasticsearch-2.es.default.svc.cluster.local"
        - name: ES_JAVA_OPTS
          value: -Xms4g -Xmx4g
        volumeMounts:
        - name: data
          mountPath: /usr/share/elasticsearch/data
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes:
      - ReadWriteOnce
      storageClassName: rook-cephfs
      resources:
        requests:
          storage: 5Gi
and the reason the PVC is not getting created is:
Normal FailedBinding 47s (x62 over 15m) persistentvolume-controller no persistent volumes available for this claim and no storage class is set
Any idea what I am doing wrong?
After adding the rook-ceph-block StorageClass and changing storageClassName to it, everything worked without any issues.
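For illustration, a minimal sketch of the change against the volumeClaimTemplates above, assuming a rook-ceph-block StorageClass exists in the cluster:
volumeClaimTemplates:
- metadata:
    name: data
  spec:
    accessModes:
    - ReadWriteOnce
    # rook-cephfs provisions shared-filesystem volumes; for a per-replica
    # ReadWriteOnce volume, the Ceph RBD based rook-ceph-block class is the usual fit
    storageClassName: rook-ceph-block
    resources:
      requests:
        storage: 5Gi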
I am running my Kubernetes cluster on AWS EKS, which runs Kubernetes 1.10.
I am following this guide to deploy Elasticsearch in my cluster:
elasticsearch Kubernetes
The first time I deployed it, everything worked fine. Now, when I redeploy, it gives me the following error.
ERROR: [2] bootstrap checks failed
[1]: max file descriptors [4096] for elasticsearch process is too low, increase to at least [65536]
[2018-08-24T18:07:28,448][INFO ][o.e.n.Node ] [es-master-6987757898-5pzz9] stopping ...
[2018-08-24T18:07:28,534][INFO ][o.e.n.Node ] [es-master-6987757898-5pzz9] stopped
[2018-08-24T18:07:28,534][INFO ][o.e.n.Node ] [es-master-6987757898-5pzz9] closing ...
[2018-08-24T18:07:28,555][INFO ][o.e.n.Node ] [es-master-6987757898-5pzz9] closed
Here is my deployment file:
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: es-master
  labels:
    component: elasticsearch
    role: master
spec:
  replicas: 3
  template:
    metadata:
      labels:
        component: elasticsearch
        role: master
    spec:
      initContainers:
      - name: init-sysctl
        image: busybox:1.27.2
        command:
        - sysctl
        - -w
        - vm.max_map_count=262144
        securityContext:
          privileged: true
      containers:
      - name: es-master
        image: quay.io/pires/docker-elasticsearch-kubernetes:6.3.2
        env:
        - name: NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: CLUSTER_NAME
          value: myesdb
        - name: NUMBER_OF_MASTERS
          value: "2"
        - name: NODE_MASTER
          value: "true"
        - name: NODE_INGEST
          value: "false"
        - name: NODE_DATA
          value: "false"
        - name: HTTP_ENABLE
          value: "false"
        - name: ES_JAVA_OPTS
          value: -Xms512m -Xmx512m
        - name: NETWORK_HOST
          value: "0.0.0.0"
        - name: PROCESSORS
          valueFrom:
            resourceFieldRef:
              resource: limits.cpu
        resources:
          requests:
            cpu: 0.25
          limits:
            cpu: 1
        ports:
        - containerPort: 9300
          name: transport
        livenessProbe:
          tcpSocket:
            port: transport
          initialDelaySeconds: 20
          periodSeconds: 10
        volumeMounts:
        - name: storage
          mountPath: /data
      volumes:
      - emptyDir:
          medium: ""
        name: "storage"
I have seen a lot of posts talking about increasing the value, but I am not sure how to do it. Any help would be appreciated.
Just want to append to this issue:
If you create the EKS cluster with eksctl, you can append this to the NodeGroup creation YAML:
preBootstrapCommands:
- "sed -i -e 's/1024:4096/65536:65536/g' /etc/sysconfig/docker"
- "systemctl restart docker"
This will solve the problem for newly created clusters by fixing the Docker daemon config.
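For context, a sketch of where that snippet sits in an eksctl ClusterConfig; the cluster and nodegroup names here are hypothetical:
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: my-cluster      # hypothetical name
  region: us-east-1     # hypothetical region
nodeGroups:
- name: es-nodes        # hypothetical nodegroup name
  instanceType: m5.xlarge
  desiredCapacity: 3
  preBootstrapCommands:
  # Raise Docker's default nofile ulimit on the node before it joins the cluster
  - "sed -i -e 's/1024:4096/65536:65536/g' /etc/sysconfig/docker"
  - "systemctl restart docker"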
Update the default-ulimits parameter in the file '/etc/docker/daemon.json':
"default-ulimits": {
"nofile": {
"Name": "nofile",
"Soft": 65536,
"Hard": 65536
}
}
and restart the Docker daemon.
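A minimal sketch of that step, assuming systemd; note that existing containers keep their old limits until they are re-created:
# Restart Docker so daemon.json is re-read
sudo systemctl restart docker
# Verify: a fresh container should see the raised file-descriptor limit
docker run --rm busybox sh -c 'ulimit -n'   # expect 65536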
This is the only thing that worked for me on EKS when setting up an EFK stack. Add this to your nodegroup creation YAML file under nodeGroups:, then create your nodegroup and apply your ES pods on it.
preBootstrapCommands:
- "sysctl -w vm.max_map_count=262144"
- "systemctl restart docker"