Elasticsearch on Kubernetes - discovery of nodes

We are attempting to run Elasticsearch on top of a kubernetes / flannel / coreos cluster.
As flannel does not support multicast, we cannot use Zen multicast discovery to allow the nodes to find each other, form a cluster and communicate.
Short of hard-coding the IP addresses of all the kubernetes nodes into the ES-config-file, is there another method we can utilise to assist in discovery? Possibly using etcd2 or some other kubernetes-compatible discovery service?

Version 6.2.0 supports Kubernetes auto-discovery.
Update your elasticsearch.yml as follows:
discovery.zen.ping.unicast.hosts: "kubernetes service name"

There is a discovery plugin that uses the kubernetes API for cluster discovery:
https://github.com/fabric8io/elasticsearch-cloud-kubernetes
Install the plugin:
/usr/share/elasticsearch/bin/plugin -i io.fabric8/elasticsearch-cloud-kubernetes/1.3.0 --verbose
Create a Kubernetes service for discovery:
apiVersion: v1
kind: Service
metadata:
  name: elasticsearch-cluster
spec:
  ports:
  - port: 9300
  selector:
    app: elasticsearch
And an elasticsearch.yml:
cloud.k8s.servicedns: elasticsearch-cluster
discovery.type: io.fabric8.elasticsearch.discovery.k8s.K8sDiscoveryModule

Place the containers into a Kubernetes Service. The Kubernetes API makes an 'endpoints' API available that lists the IP addresses of all of the members of a service. This endpoint set will dynamically shrink and grow as you scale the number of pods.
You can access endpoints with:
kubectl get endpoints <service-name>
or directly via the Kubernetes API, see:
https://github.com/kubernetes/kubernetes/blob/master/examples/cassandra/java/src/io/k8s/cassandra/KubernetesSeedProvider.java#L106
for an example of how this was done for Cassandra.
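As a rough illustration of the endpoints approach, here is a minimal Python sketch that turns an Endpoints object (as returned by `kubectl get endpoints <service-name> -o json`) into a list of `ip:port` entries suitable for unicast discovery. The sample object, its IP addresses and the function name are made up for illustration; only the `subsets` / `addresses` / `ports` layout comes from the Kubernetes v1 Endpoints API.

```python
import json


def seed_hosts_from_endpoints(endpoints: dict) -> list[str]:
    """Collect "ip:port" strings from a Kubernetes v1 Endpoints object."""
    hosts = []
    for subset in endpoints.get("subsets") or []:
        ports = [p["port"] for p in subset.get("ports") or []]
        for address in subset.get("addresses") or []:
            for port in ports:
                hosts.append(f"{address['ip']}:{port}")
    return sorted(hosts)


# Hypothetical sample data, shaped like `kubectl get endpoints ... -o json`.
SAMPLE_ENDPOINTS = json.loads("""
{
  "kind": "Endpoints",
  "apiVersion": "v1",
  "metadata": {"name": "elasticsearch-cluster"},
  "subsets": [
    {
      "addresses": [{"ip": "10.2.1.4"}, {"ip": "10.2.2.7"}],
      "ports": [{"port": 9300, "protocol": "TCP"}]
    }
  ]
}
""")

hosts = seed_hosts_from_endpoints(SAMPLE_ENDPOINTS)
print(hosts)
```

Pods that are not ready are listed under `notReadyAddresses` rather than `addresses`, so this sketch only returns ready members.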

It worked for me only in this configuration.
Important! flannel must be enabled with vxlan.
cluster.yaml
network:
  plugin: flannel
  options:
    flannel_backend_type: vxlan
elasticsearch.yaml
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: elastic-cluster
spec:
  version: 7.0.1
  nodeSets:
  - name: node
    count: 3
    config:
      node.master: true
      node.data: true
      node.ingest: true
      xpack.ml.enabled: true
      node.store.allow_mmap: true
      indices.query.bool.max_clause_count: 100000
      # Fixed flannel kubernetes network plugin
      discovery.seed_hosts:
      {{ range $i, $e := until (3 | int) }}
      - elastic-cluster-es-node-{{ $i }}
      {{ end }}
    podTemplate:
      spec:
        containers:
        - name: elasticsearch
          env:
          - name: ES_JAVA_OPTS
            value: "-Xms4g -Xmx4g"
          - name: READINESS_PROBE_TIMEOUT
            value: "60"
          resources:
            requests:
              memory: 5Gi
              # cpu: 1
            limits:
              memory: 6Gi
    volumeClaimTemplates:
    - metadata:
        name: elasticsearch-data
      spec:
        storageClassName: local-elasticsearch-storage
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 5G
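The template loop in the manifest above expands to one seed host per pod. As a small sketch of what it produces, assuming the pod names follow the `<cluster>-es-<nodeSet>-<ordinal>` pattern used in that manifest (the function name is hypothetical):

```python
def seed_hosts(cluster: str, node_set: str, count: int) -> list[str]:
    """Mirror the template loop: one host name per pod ordinal, starting at 0."""
    return [f"{cluster}-es-{node_set}-{i}" for i in range(count)]


hosts = seed_hosts("elastic-cluster", "node", 3)
for host in hosts:
    print(f"- {host}")
```

If the node count changes, the list has to be regenerated, which is why the manifest derives it from the same number as `count`.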

Related

Elastic Search Kubernetes - Disable memory swapping

I am using Elasticsearch (v7.6.1) on a Kubernetes (v1.19) cluster.
The docs suggest disabling swapping:
https://www.elastic.co/guide/en/elasticsearch/reference/current/setup-configuration-memory.html
My yaml:
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: elastic-cluster-1
spec:
  version: 7.6.1
  image: docker.elastic.co/elasticsearch/elasticsearch:7.6.1
  nodeSets:
  - name: default
    count: 3
    config:
      node.master: true
      node.data: true
      node.ingest: true
    podTemplate:
      metadata:
        labels:
          # additional labels for pods
          type: elastic-master-node
      spec:
        nodeSelector:
          node-pool: <NODE_POOL>
        initContainers:
        # Increase linux map count to allow elastic to store large memory maps
        - name: sysctl
          securityContext:
            privileged: true
          command: ['sh', '-c', 'sysctl -w vm.max_map_count=262144']
        containers:
        - name: elasticsearch
          # specify resource limits and requests
          resources:
            limits:
              memory: 11.2Gi
            requests:
              cpu: 3200m
          env:
          - name: ES_JAVA_OPTS
            value: "-Xms6g -Xmx6g"
    # Request persistent data storage for pods
    volumeClaimTemplates:
    - metadata:
        name: elasticsearch-data
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 50Gi
        storageClassName: ssd
  - name: data
    count: 2
    config:
      node.master: false
      node.data: true
      node.ingest: true
    podTemplate:
      metadata:
        labels:
          # additional labels for pods
          type: elastic-data-node
      spec:
        nodeSelector:
          node-pool: <NODE_POOL>
        initContainers:
        # Increase linux map count to allow elastic to store large memory maps
        - name: sysctl
          securityContext:
            privileged: true
          command: ['sh', '-c', 'sysctl -w vm.max_map_count=262144']
        containers:
        - name: elasticsearch
          # specify resource limits and requests
          resources:
            limits:
              memory: 11.2Gi
            requests:
              cpu: 3200m
          env:
          - name: ES_JAVA_OPTS
            value: "-Xms6g -Xmx6g"
    # Request persistent data storage for pods
    volumeClaimTemplates:
    - metadata:
        name: elasticsearch-data
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 50Gi
        storageClassName: ssd
  # Google cloud storage credentials
  secureSettings:
  - secretName: "gcs-credentials"
  http:
    service:
      spec:
        # expose this cluster Service with a LoadBalancer
        type: LoadBalancer
    tls:
      certificate:
        secretName: elasticsearch-certificate
It's not clear to me how to change this yaml in order to disable swapping correctly. Changing each node manually is not an option, because the configuration would be lost on every restart.
How can I do this?
First of all, a Kubernetes cluster has swap disabled by default; this is actually a mandatory requirement. In most cases, especially on cloud-managed clusters that follow this requirement, you do not need to worry about swapping. Even in 1.22, enabling swap is only an alpha feature.
If for whatever reason you need to deal with this, you can consider setting bootstrap.memory_lock to true.
...
containers:
- name: elasticsearch
  env:
  - name: bootstrap.memory_lock
    value: "true"
...
Up until recently, Kubernetes had no control over swapping.
As of 1.22, there's a new alpha feature to do this. The CRI spec does allow for swap allocations. I didn't find anything new in that regard, in the Pod specification: as far as I understand, currently, you could either allow your containers to use as much swap as they can (UnlimitedSwap), or limit swap+memory usage to whatever memory limit you set on your container (LimitedSwap).
Since you're running 1.19, this shouldn't concern you right now. A good practice while deploying your cluster would have been to make sure there is no swap at all on your nodes, or to set swappiness to 0 or 1. Checking the Kubespray playbooks, we can see they would still unconditionally disable swap.
You can connect to your nodes (ssh) and make sure there's no swap, or disable it otherwise. There's nothing you can do in that Elasticsearch object directly.
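To check a node for active swap without relying on extra tooling, one option is to read `/proc/swaps`, which lists active swap areas after a header line. The following is only a sketch: the function name and the sample contents are made up for illustration, and it assumes a Linux node.

```python
def active_swap_devices(proc_swaps: str) -> list[str]:
    """Return the swap areas listed in the text of /proc/swaps."""
    lines = [line for line in proc_swaps.splitlines() if line.strip()]
    # The first non-empty line is the column header (Filename, Type, Size, ...).
    return [line.split()[0] for line in lines[1:]]


# Hypothetical sample contents of /proc/swaps.
NO_SWAP = "Filename\t\t\t\tType\t\tSize\t\tUsed\t\tPriority\n"
WITH_SWAP = NO_SWAP + "/dev/sda2                               partition\t2097148\t\t0\t\t-2\n"

print(active_swap_devices(NO_SWAP))
print(active_swap_devices(WITH_SWAP))
```

On a node, the real input would come from `open("/proc/swaps").read()`; an empty result means no swap is active.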

How can I disable Elasticsearch authentication when launching it in K8S?

I am launching an Elasticsearch cluster in K8S using the spec file below. The pod failed to launch with the error below. I am trying to disable authentication so that I can connect to the cluster without any credentials, but the operator stops me from doing that: it says the configuration setting is reserved for internal use. What is the correct way to set these settings?
Warning ReconciliationError 84s elasticsearch-controller Failed to apply spec change: adjust resources: adjust discovery config: Operation cannot be fulfilled on elasticsearches.elasticsearch.k8s.elastic.co "datasource": the object has been modified; please apply your changes to the latest version and try again
Normal AssociationStatusChange 1s (x16 over 86s) es-monitoring-association-controller Association status changed from [] to []
Warning Validation 1s (x20 over 84s) elasticsearch-controller [spec.nodeSets[0].config.xpack.security.enabled: Forbidden: Configuration setting is reserved for internal use. User-configured use is unsupported, spec.nodeSets[0].config.xpack.security.http.ssl.enabled: Forbidden: Configuration setting is reserved for internal use. User-configured use is unsupported, spec.nodeSets[0].config.xpack.security.transport.ssl.enabled: Forbidden: Configuration setting is reserved for internal use. User-configured use is unsupported]
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: datasource
spec:
  version: 7.14.0
  nodeSets:
  - name: node
    count: 2
    config:
      node.store.allow_mmap: false
      xpack.security.http.ssl.enabled: false
      xpack.security.transport.ssl.enabled: false
      xpack.security.enabled: false
    volumeClaimTemplates:
    - metadata:
        name: elasticsearch-data
      spec:
        accessModes:
        - ReadWriteOnce
        storageClassName: ebs-sc
        resources:
          requests:
            storage: 1024Gi
You can try this:
https://discuss.elastic.co/t/cannot-disable-tls-and-security-in-eks/222335/2
I have tested it and it's working fine for me without any issues:
cat <<EOF | kubectl apply -f -
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: quickstart
spec:
  version: 7.15.0
  nodeSets:
  - name: default
    count: 1
    config:
      node.master: true
      node.data: true
      node.ingest: true
      node.store.allow_mmap: false
      xpack.security.authc:
        anonymous:
          username: anonymous
          roles: superuser
          authz_exception: false
EOF
To disable basic authentication:
https://www.elastic.co/guide/en/elasticsearch/reference/7.14/anonymous-access.html
To disable the self-signed SSL certificate:
https://www.elastic.co/guide/en/cloud-on-k8s/0.9/k8s-accessing-elastic-services.html#k8s-disable-tls

How to configure Filebeat on ECK for kafka input?

I have Elasticsearch and Kibana running on Kubernetes. Both created by ECK. Now I try to add Filebeat to it and configure it to index data coming from a Kafka topic. This is my current configuration:
apiVersion: beat.k8s.elastic.co/v1beta1
kind: Beat
metadata:
  name: my-filebeat
  namespace: my-namespace
spec:
  type: filebeat
  version: 7.10.2
  elasticsearchRef:
    name: my-elastic
  kibanaRef:
    name: my-kibana
  config:
    filebeat.inputs:
    - type: kafka
      hosts:
      - host1:9092
      - host2:9092
      - host3:9092
      topics: ["my.topic"]
      group_id: "my_group_id"
      index: "my_index"
  deployment:
    podTemplate:
      spec:
        dnsPolicy: ClusterFirstWithHostNet
        hostNetwork: true
        securityContext:
          runAsUser: 0
        containers:
        - name: filebeat
In the logs of the pod I can see entries like following
log/log.go:145 Non-zero metrics in the last 30s {"monitoring": {"metrics": {"beat":{"cpu":{"system":{"ticks":2470,"time":{"ms":192}},"total":{"ticks":7760,"time":{"ms":367},"value":7760},"user":{"ticks":5290,"time":{"ms":175}}},"handles":{"limit":{"hard":1048576,"soft":1048576},"open":13},"info":{"ephemeral_id":"5ce8521c-f237-4994-a02e-dd11dfd31b09","uptime":{"ms":181997}},"memstats":{"gc_next":23678528,"memory_alloc":15320760,"memory_total":459895768},"runtime":{"goroutines":106}},"filebeat":{"harvester":{"open_files":0,"running":0},"inputs":{"kafka":{"bytes_read":46510,"bytes_write":37226}}},"libbeat":{"config":{"module":{"running":0}},"pipeline":{"clients":1,"events":{"active":0}}},"registrar":{"states":{"current":0}},"system":{"load":{"1":1.18,"15":0.77,"5":0.97,"norm":{"1":0.0738,"15":0.0481,"5":0.0606}}}}}}
And there are no error entries, so I assume that the connection to Kafka works. Unfortunately, there is no data in the my_index specified above. What am I doing wrong?
I guess you are not able to connect to the Elasticsearch cluster configured as the output.
As per the docs, ECK secures the Elasticsearch cluster it deploys and stores the credentials in Kubernetes Secrets.
https://www.elastic.co/guide/en/cloud-on-k8s/current/k8s-beat-configuration.html
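To see what such a Secret holds, its `data` values have to be base64-decoded. The sketch below does that for a Secret already fetched as JSON (for example with `kubectl get secret <name> -o json`). The sample Secret is made up; its name assumes ECK's `<cluster>-es-elastic-user` convention with a single `elastic` key, which is worth verifying against your own cluster.

```python
import base64


def decode_secret_data(secret: dict) -> dict[str, str]:
    """Base64-decode every value under the `data` field of a Secret object."""
    data = secret.get("data") or {}
    return {key: base64.b64decode(value).decode("utf-8") for key, value in data.items()}


# Hypothetical Secret; the password is encoded here so the sample stays consistent.
SAMPLE_SECRET = {
    "apiVersion": "v1",
    "kind": "Secret",
    "metadata": {"name": "my-elastic-es-elastic-user", "namespace": "my-namespace"},
    "data": {"elastic": base64.b64encode(b"s3cr3t").decode("ascii")},
}

credentials = decode_secret_data(SAMPLE_SECRET)
print(sorted(credentials))
```

The decoded password can then be used to query the cluster directly and confirm whether documents are arriving in the expected index.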

How to resize an ECK cluster

I have an Elasticsearch cluster whose storage field is set to 10Gi, and I want to resize it (to 15Gi, for testing purposes). However, after changing the storage value from 10Gi to 15Gi, the cluster was not resized and the generated PVC is still set to 10Gi.
From what I can tell, the aws-ebs storage class (https://kubernetes.io/docs/concepts/storage/storage-classes/) allows volume expansion when the field allowVolumeExpansion is true. But even with this set, the volume is never expanded when I change the storage value:
---
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: elasticsearch-storage
  namespace: test
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
reclaimPolicy: Delete
allowVolumeExpansion: true
---
apiVersion: elasticsearch.k8s.elastic.co/v1beta1
kind: Elasticsearch
metadata:
  name: elasticsearch
  namespace: test
spec:
  version: 7.4.2
  spec:
    http:
      tls:
        certificate:
          secretName: es-cert
  nodeSets:
  - name: default
    count: 3
    volumeClaimTemplates:
    - metadata:
        name: elasticsearch-data
        annotations:
          volume.beta.kubernetes.io/storage-class: elasticsearch-storage
      spec:
        accessModes:
        - ReadWriteOnce
        storageClassName: elasticsearch-storage
        resources:
          requests:
            storage: 15Gi
    config:
      node.master: true
      node.data: true
      node.ingest: true
      node.store.allow_mmap: false
      xpack.security.authc.realms:
        native:
          native1:
            order: 1
---
Technically it should work, but your Kubernetes cluster might not be able to connect to the AWS API to expand the volume. Did you check the actual EBS volume in the EC2 console or with the AWS CLI? You can debug this issue by looking at the kube-controller-manager and cloud-controller-manager logs.
My guess is that there is some type of permission issue that prevents your K8s cluster from talking to the AWS/EC2 API.
If you are running EKS, make sure that the IAM cluster role that you are using has permissions for EC2/EBS. You can check the control plane logs (kube-controller-manager, kube-apiserver, cloud-controller-manager, etc) on CloudWatch.
EDIT:
The Elasticsearch operator uses StatefulSets and as of this date Volume expansion is not supported on StatefulSets.
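When debugging this, it can help to compare the size requested in the manifest with the size reported by the existing PVC. The sketch below parses Kubernetes-style quantities such as `10Gi` or `5G` into bytes and compares them. It handles only the common binary and decimal suffixes, and the function names are hypothetical.

```python
import math
from decimal import Decimal

# Common Kubernetes quantity suffixes: binary (Ki, Mi, ...) and decimal (k, M, ...).
_SUFFIXES = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
}


def parse_quantity(quantity: str) -> int:
    """Convert a quantity such as "15Gi" to a whole number of bytes (rounded up)."""
    for suffix in sorted(_SUFFIXES, key=len, reverse=True):
        if quantity.endswith(suffix):
            number = Decimal(quantity[: -len(suffix)])
            return math.ceil(number * _SUFFIXES[suffix])
    return math.ceil(Decimal(quantity))


def needs_expansion(current: str, desired: str) -> bool:
    """True when the desired size is strictly larger than the current one."""
    return parse_quantity(desired) > parse_quantity(current)


print(parse_quantity("10Gi"), parse_quantity("15Gi"))
print(needs_expansion("10Gi", "15Gi"))
```

Here `current` would come from the PVC (for example its `spec.resources.requests.storage`) and `desired` from the volumeClaimTemplates entry in the Elasticsearch manifest.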

Access Elasticsearch from minikube/kubernetes

I have a Spring Boot application deployed in Kubernetes on a local Windows machine using minikube. I also have Elasticsearch running on my local machine (http://localhost:9200).
I want to call Elasticsearch REST endpoints from this Spring Boot app.
I tried solving this by creating a service without a selector, but I am not sure what I am missing.
When accessing the Spring Boot app using http://#minikube_ip#:#Node_Port#, I get the error "No route to host".
I tried running minikube ssh and executing a curl command; from there I also get the same error. Clearly I am missing something here.
application.yaml
elasticsearch:
  hosts:
  - http://my-es:80
  connectTimeout: 10000
  connectionRequestTimeout: 10000
  socketTimeout: 10000
  maxRetryTimeoutMillis: 60000
deployment.yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: kube-es-app
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      run: kube-es-app
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        run: kube-es-app
    spec:
      containers:
      - image: elastic-search-app:latest
        imagePullPolicy: Never
        name: kube-es-app
        ports:
        - containerPort: 8080
          protocol: TCP
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
---
kind: Service
apiVersion: v1
metadata:
  name: my-es
spec:
  ports:
  - protocol: TCP
    port: 80
    targetPort: 9200
---
kind: Endpoints
apiVersion: v1
metadata:
  name: my-es
subsets:
- addresses:
  - ip: <MY_LOCAL_MACHINE_IP>
  ports:
  - port: 9200
Commands I executed
docker build -t elastic-search-app .
kubectl create -f deployment.yaml
kubectl expose deployment/kube-es-app --type="NodePort" --port 8080
Can anyone help please? I am stuck
If I've got the description right, the Windows machine should have a vbox network adapter connected to the host-only network that the Minikube VM is connected to.
Minikube can access the host machine directly because both are in the same network.
Minikube is in charge of NAT-ing packets from Pods to the outside. What you need is to allow Elasticsearch to listen on the vbox interface or on all interfaces, and to open its port in the Windows firewall. Elasticsearch should then be reachable via the Windows machine's IP address in the host-only network.
Apart from that, you might create a service (if you need to go by name instead of IP) as discussed here:
Connect to local database from inside minikube cluster,
Minikube:Exposing mysql as a service on localhost.
