Fix elasticsearch broken cluster within kubernetes - elasticsearch

I deployed an elasticsearch cluster with official Helm chart (https://github.com/elastic/helm-charts/tree/master/elasticsearch).
There are 3 Helm releases:
master (3 nodes)
client (1 node)
data (2 nodes)
Cluster was running fine, I did a crash test by removing master release, and re-create it.
After that, master nodes are ok, but data nodes complain:
Caused by: org.elasticsearch.cluster.coordination.CoordinationStateRejectedException: join validation on cluster state with a different cluster uuid xeQ6IVkDQ2es1CO2yZ_7rw than local cluster uuid 9P9ZGqSuQmy7iRDGcit5fg, rejecting
which is normal because master nodes are new.
How can I fix data nodes cluster state without removing data folder?
Edit:
I know the reason why is broken, I know a basic solution is to remove data folder and restart node (as I can see on elastic forum, lot of similar questions without answers). But I am looking for a production aware solution, maybe with https://www.elastic.co/guide/en/elasticsearch/reference/current/node-tool.html tool?

Using elasticsearch-node utility, it's possible to reset cluster state, then the fresh node can join another cluster.
The tricky thing is to use this utility bin with Docker, because elasticsearch server must be stopped!
Solution with kubernetes:
Stop pods by scaling to 0 the sts: kubectl scale data-nodes --replicas=0
Create a k8s job that reset the cluster state, with data volume attached
Apply the job for each PVC
Rescale sts and enjoy!
job.yaml:
apiVersion: batch/v1
kind: Job
metadata:
name: test-fix-cluster-m[0-3]
spec:
template:
spec:
containers:
- args:
- -c
- yes | elasticsearch-node detach-cluster; yes | elasticsearch-node remove-customs '*'
# uncomment for at least 1 PVC
#- yes | elasticsearch-node unsafe-bootstrap -v
command:
- /bin/sh
image: docker.elastic.co/elasticsearch/elasticsearch:7.10.1
name: elasticsearch
volumeMounts:
- mountPath: /usr/share/elasticsearch/data
name: es-data
restartPolicy: Never
volumes:
- name: es-data
persistentVolumeClaim:
claimName: es-test-master-es-test-master-[0-3]
If you are interested, here the code behind unsafe-bootstrap: https://github.com/elastic/elasticsearch/blob/master/server/src/main/java/org/elasticsearch/cluster/coordination/UnsafeBootstrapMasterCommand.java#L83
I have written a small story at https://medium.com/#thomasdecaux/fix-broken-elasticsearch-cluster-405ad67ee17c.

Related

ElasticSearch CrashLoopBackoff when deploying with ECK in Kubernetes OKD 4.11

I am running Kubernetes using OKD 4.11 (running on vSphere) and have validated the basic functionality (including dyn. volume provisioning) using applications (like nginx).
I also applied
oc adm policy add-scc-to-group anyuid system:authenticated
to allow authenticated users to use anyuid (which seems to have been required to deploy the nginx example I was testing with).
Then I installed ECK using this quickstart with kubectl to install the CRD and RBAC manifests. This seems to have worked.
Then I deployed the most basic ElasticSearch quickstart example with kubectl apply -f quickstart.yaml using this manifest:
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
name: quickstart
spec:
version: 8.4.2
nodeSets:
- name: default
count: 1
config:
node.store.allow_mmap: false
The deployment proceeds as expected, pulling image and starting container, but ends in a CrashLoopBackoff with the following error from ElasticSearch at the end of the log:
"elasticsearch.cluster.name":"quickstart",
"error.type":"java.lang.IllegalStateException",
"error.message":"failed to obtain node locks, tried
[/usr/share/elasticsearch/data]; maybe these locations
are not writable or multiple nodes were started on the same data path?"
Looking into the storage, the PV and PVC are created successfully, the output of kubectl get pv,pvc,sc -A -n my-namespace is:
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
persistentvolume/pvc-9d7b57db-8afd-40f7-8b3d-6334bdc07241 1Gi RWO Delete Bound my-namespace/elasticsearch-data-quickstart-es-default-0 thin 41m
NAMESPACE NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
my-namespace persistentvolumeclaim/elasticsearch-data-quickstart-es-default-0 Bound pvc-9d7b57db-8afd-40f7-8b3d-6334bdc07241 1Gi RWO thin 41m
NAMESPACE NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
storageclass.storage.k8s.io/thin (default) kubernetes.io/vsphere-volume Delete Immediate false 19d
storageclass.storage.k8s.io/thin-csi csi.vsphere.vmware.com Delete WaitForFirstConsumer true 19d
Looking at the pod yaml, it appears that the volume is correctly attached :
volumes:
- name: elasticsearch-data
persistentVolumeClaim:
claimName: elasticsearch-data-quickstart-es-default-0
- name: downward-api
downwardAPI:
items:
- path: labels
fieldRef:
apiVersion: v1
fieldPath: metadata.labels
defaultMode: 420
....
volumeMounts:
...
- name: elasticsearch-data
mountPath: /usr/share/elasticsearch/data
I cannot understand why the volume would be read-only or rather why ES cannot create the lock.
I did find this similar issue, but I am not sure how to apply the UID permissions (in general I am fairly naive about the way permissions work in OKD) when when working with ECK.
Does anyone with deeper K8s / OKD or ECK/ElasticSearch knowledge have an idea how to better isolate and/or resolve this issue?
Update: I believe this has something to do with this issue and am researching the optionas related to OKD.
For posterity, the ECK starts an init container that should take care of the chown on the data volume, but can only do so if it is running as root.
The resolution for me was documented here:
https://repo1.dso.mil/dsop/elastic/elasticsearch/elasticsearch/-/issues/7
The manifest now looks like this:
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
name: quickstart
spec:
version: 8.4.2
nodeSets:
- name: default
count: 1
config:
node.store.allow_mmap: false
# run init container as root to chown the volume to uid 1000
podTemplate:
spec:
securityContext:
runAsUser: 1000
runAsGroup: 0
initContainers:
- name: elastic-internal-init-filesystem
securityContext:
runAsUser: 0
runAsGroup: 0
And the pod starts up and can write to the volume as uid 1000.

Schedule Filebeat daemonset on particular ip nodes in kubernetes

I am trying to gather logs of kubernetes cluster into elk cluster via filebeat.
As of now, my filebeat runs on all ec2 nodes as daemon and its working fine but I want to schedule the filebeat on the ip address of the node i choose.
My kubernetes version is 1.15 and helm chart version is 2.17 due to which i am using helm chart of elasticsearch with version 7.17.3.
As per documentation of kubernetes,this can achieved and i have tried to modify helm chart with following entry in filebeat but then no node comes up:
daemonset:
# Annotations to apply to the daemonset
annotations: {}
# additionals labels
labels: {}
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchFields:
- key: metadata.name
operator: In
values:
- 10.17.7.7
Kindly help.
Kindly follow the steps as mentioned:
Get the lables of all nodes by running the following command:
kubectl get nodes --show-labels
ip-xx-xx-x-xx.xx.compute.internal Ready worker 511d v1.15.11 KubernetesCluster=XXdev.com,beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=m5.xlarge,beta.kubernetes.io/os=linux,cattle.io/creator=norman,failure-domain.beta.kubernetes.io/region=xx,failure-domain.beta.kubernetes.io/zone=xx,kubernetes.io/arch=amd64,kubernetes.io/hostname=XXdev-aworker1,kubernetes.io/os=linux,node-role.kubernetes.io/worker=true
ip-xx-xx-x-xx.xx.compute.internal Ready controlplane,etcd,worker 688d v1.15.11 KubernetesCluster=XXdev.com,beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=m5.2xlarge,beta.kubernetes.io/os=linux,cattle.io/creator=norman,failure-domain.beta.kubernetes.io/region=xx,failure-domain.beta.kubernetes.io/zone=xx,kubernetes.io/arch=amd64,kubernetes.io/hostname=XXdev-all3,kubernetes.io/os=linux,node-role.kubernetes.io/controlplane=true,node-role.kubernetes.io/etcd=true,node-role.kubernetes.io/worker=true
now in the labels column, kindly see the key value pair kubernetes.io/etcd=true and mention the same in above code
daemonset:
# Annotations to apply to the daemonset
annotations: {}
# additionals labels
labels: {}
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchFields:
- key: kubernetes.io/etcd
operator: In
values:
- true
Another way to achieve same by nodeSelector is as follows:
nodeSelector:
kubernetes.io/etcd: true

Packetbeat does not add Kubernetes metadata

I've started a minikube (using Kubernetes 1.18.3) to test out ECK and specifically packetbeat. The minikube profile is called "packetbeat" (important, as that's the hostname for the Virtualbox VM as well) and I followed the ECK quickstart to get it up and running. ElasticSearch (single node) and Kibana are running fine and packetbeat is gathering flows as well, however, I'm unable to make it add the Kubernetes metadata to the fields.
I'm working in the default namespace and created a ClusterRoleBinding to view for the default ServiceAccount in the namespace. This is working well, if I do not do that, packetbeat will report it is unable to list the Pods on the API server.
This is the Beat config I'm using to make ECK deploy packetbeat:
apiVersion: beat.k8s.elastic.co/v1beta1
kind: Beat
metadata:
name: packetbeat
spec:
type: packetbeat
version: 7.9.0
elasticsearchRef:
name: quickstart
kibanaRef:
name: kibana
config:
packetbeat.interfaces.device: any
packetbeat.protocols:
- type: http
ports: [80, 8000, 8080, 9200]
- type: tls
ports: [443]
packetbeat.flows:
timeout: 30s
period: 10s
processors:
- add_kubernetes_metadata: {}
daemonSet:
podTemplate:
spec:
terminationGracePeriodSeconds: 30
hostNetwork: true
automountServiceAccountToken: true # some older Beat versions are depending on this settings presence in k8s context
dnsPolicy: ClusterFirstWithHostNet
containers:
- name: packetbeat
securityContext:
runAsUser: 0
capabilities:
add:
- NET_ADMIN
(This is mostly a slightly modified example from the ECK example page.) However, this is not working at all. I tried it with "add_kubernetes_metadata: {}" first, but that will error with the message:
2020-08-19T14:23:38.550Z ERROR [kubernetes] kubernetes/util.go:117
kubernetes: Querying for pod failed with error: pods "packetbeat" not
found {"libbeat.processor": "add_kubernetes_metadata"}
This message goes away when I add the "host: packetbeat". I'm no longer getting an error now, but I'm not getting the Kubernetes metadata either. I'm mostly interested in the namespace tag, but I'm not getting any. I do not see any additional errors in the log and it just reports monitoring details every 30 seconds at the moment.
What am I doing wrong? Any more information I can provide to help me debug this?
So the docs are just unclear. Although they do not explicitely state it, you do need to add indexers and matchers. My understanding was that there are "default" ones (as you can disable those), but that does not seem to be the case. Adding the indexers and matchers as per the example in the docs makes the Kubernetes metadata part of the data.

Digital Ocean managed Kubernetes volume in pending state

It's not so digital ocean specific, would be really nice to verify if this is an expected behavior or not.
I'm trying to setup ElasticSearch cluster on DO managed Kubernetes cluster with helm chart from ElasticSearch itself
And they say that I need to specify a storageClassName in a volumeClaimTemplate in order to use volume which is provided by managed kubernetes service. For DO it's do-block-storages according to their docs. Also seems to be it's not necessary to define PVC, helm chart should do it itself.
Here's config I'm using
# Specify node pool
nodeSelector:
doks.digitalocean.com/node-pool: elasticsearch
# Shrink default JVM heap.
esJavaOpts: "-Xmx128m -Xms128m"
# Allocate smaller chunks of memory per pod.
resources:
requests:
cpu: "100m"
memory: "512M"
limits:
cpu: "1000m"
memory: "512M"
# Specify Digital Ocean storage
# Request smaller persistent volumes.
volumeClaimTemplate:
accessModes: [ "ReadWriteOnce" ]
storageClassName: do-block-storage
resources:
requests:
storage: 10Gi
extraInitContainers: |
- name: create
image: busybox:1.28
command: ['mkdir', '/usr/share/elasticsearch/data/nodes/']
volumeMounts:
- mountPath: /usr/share/elasticsearch/data
name: elasticsearch-master
- name: file-permissions
image: busybox:1.28
command: ['chown', '-R', '1000:1000', '/usr/share/elasticsearch/']
volumeMounts:
- mountPath: /usr/share/elasticsearch/data
name: elasticsearch-master
Helm chart i'm setting with terraform, but it doesn't matter anyway, which way you'll do it:
resource "helm_release" "elasticsearch" {
name = "elasticsearch"
chart = "elastic/elasticsearch"
namespace = "elasticsearch"
values = [
file("charts/elasticsearch.yaml")
]
}
Here's what I've got when checking pod logs:
51s Normal Provisioning persistentvolumeclaim/elasticsearch-master-elasticsearch-master-2 External provisioner is provisioning volume for claim "elasticsearch/elasticsearch-master-elasticsearch-master-2"
2m28s Normal ExternalProvisioning persistentvolumeclaim/elasticsearch-master-elasticsearch-master-2 waiting for a volume to be created, either by external provisioner "dobs.csi.digitalocean.com" or manually created by system administrator
I'm pretty sure the problem is a volume. it should've been automagically provided by kubernetes. Describing persistent storage gives this:
holms#debian ~/D/c/s/b/t/s/post-infra> kubectl describe pvc elasticsearch-master-elasticsearch-master-0 --namespace elasticsearch
Name: elasticsearch-master-elasticsearch-master-0
Namespace: elasticsearch
StorageClass: do-block-storage
Status: Pending
Volume:
Labels: app=elasticsearch-master
Annotations: volume.beta.kubernetes.io/storage-provisioner: dobs.csi.digitalocean.com
Finalizers: [kubernetes.io/pvc-protection]
Capacity:
Access Modes:
VolumeMode: Filesystem
Mounted By: elasticsearch-master-0
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Provisioning 4m57s (x176 over 14h) dobs.csi.digitalocean.com_master-setupad-eu_04e43747-fafb-11e9-b7dd-e6fd8fbff586 External provisioner is provisioning volume for claim "elasticsearch/elasticsearch-master-elasticsearch-master-0"
Normal ExternalProvisioning 93s (x441 over 111m) persistentvolume-controller waiting for a volume to be created, either by external provisioner "dobs.csi.digitalocean.com" or manually created by system administrator
I've google everything already, it seems to be everything is correct, and volume should be up withing DO side with no problems, but it hangs in pending state. Is this expected behavior or should I ask DO support to check what's going on their side?
Yes, this is expected behavior. This chart might not be compatible with Digital Ocean Kubernetes service.
Digital Ocean documentation has the following information in Known Issues section:
Support for resizing DigitalOcean Block Storage Volumes in Kubernetes has not yet been implemented.
In the DigitalOcean Control Panel, cluster resources (worker nodes, load balancers, and block storage volumes) are listed outside of the Kubernetes page. If you rename or otherwise modify these resources in the control panel, you may render them unusable to the cluster or cause the reconciler to provision replacement resources. To avoid this, manage your cluster resources exclusively with kubectl or from the control panel’s Kubernetes page.
In the charts/stable/elasticsearch there are specific requirements mentioned:
Prerequisites Details
Kubernetes 1.10+
PV dynamic provisioning support on the underlying infrastructure
You can ask Digital Ocean support for help or try to deploy ElasticSearch without helm chart.
It is even mentioned on github that:
Automated testing of this chart is currently only run against GKE (Google Kubernetes Engine).
Update:
The same issue is present on my kubeadm ha cluster.
However I managed to get it working by manually creating PersistentVolumes's for my storageclass.
My storageclass definition: storageclass.yaml:
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
name: ssd
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
parameters:
type: pd-ssd
$ kubectl apply -f storageclass.yaml
$ kubectl get sc
NAME PROVISIONER AGE
ssd local 50m
My PersistentVolume definition: pv.yaml:
apiVersion: v1
kind: PersistentVolume
metadata:
name: task-pv-volume
labels:
type: local
spec:
storageClassName: ssd
capacity:
storage: 30Gi
accessModes:
- ReadWriteOnce
hostPath:
path: "/mnt/data"
nodeAffinity:
required:
nodeSelectorTerms:
- matchExpressions:
- key: kubernetes.io/hostname
operator: In
values:
- <name of the node>
kubectl apply -f pv.yaml
After that I ran helm chart:
helm install stable/elasticsearch --name my-release --set data.persistence.storageClass=ssd,data.storage=30Gi --set data.persistence.storageClass=ssd,master.storage=30Gi
PVC finally got bound.
$ kubectl get pvc -A
NAMESPACE NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
default data-my-release-elasticsearch-data-0 Bound task-pv-volume2 30Gi RWO ssd 17m
default data-my-release-elasticsearch-master-0 Pending 17m
Note that I only manually satisfied only single pvc and ElasticSearch manual volume provisioning might be very inefficient.
I suggest contacting DO support for automated volume provisioning solution.
What a strange situation, after I've changed 10Gi to 10G it started to work. Maybe it has to do something with a storage class it's self, but it started to work.

Service discovery does not discover multiple nodes

I have elasticsearch cluster on kubernetes on aws. I have used upmc operator and readonlyrest security plugin.
I have started my cluster passing yaml to upmc operator with 3 data / master and ingest nodes.
However when I do /localhsot:9200/_nodes all I see only 1 node is being assigned. Service discovery did not attach other nodes to the clusters. Essentially I have a single node cluster. Any settings I am missing or after creating cluster I need to run some settings so that all nodes become part of the cluster ?
Here is my yml file:
The following yaml file is used to create the cluster.
This yaml config creates 3 master/data/ingest nodes and upmcoperator uses pod afinity to allocate pods in different zones.
All 9 nodes are getting created just fine, but they are unable to become part of the cluster.
====================================
apiVersion: enterprises.upmc.com/v1
kind: ElasticsearchCluster
metadata:
name: es-cluster
spec:
kibana:
image: docker.elastic.co/kibana/kibana-oss:6.1.3
image-pull-policy: Always
cerebro:
image: upmcenterprises/cerebro:0.7.2
image-pull-policy: Always
elastic-search-image: myelasticimage:elasticsearch-mod-v0.1
elasticsearchEnv:
- name: PUBLIC_KEY
value: "default"
- name: NETWORK_HOST
value: "_eth0:ipv4_"
image-pull-secrets:
- name: egistrykey
image-pull-policy: Always
client-node-replicas: 3
master-node-replicas: 3
data-node-replicas: 3
network-host: 0.0.0.0
zones: []
data-volume-size: 1Gi
java-options: "-Xms512m -Xmx512m"
snapshot:
scheduler-enabled: false
bucket-name: somebucket
cron-schedule: "#every 1m"
image: upmcenterprises/elasticsearch-cron:0.0.4
storage:
type: standard
storage-class-provisioner: volume.alpha.kubernetes.io/storage-class
volume-reclaim-policy: Delete

Resources