I have an Elasticsearch cluster that has the storage field set to 10Gi, and I want to resize it (for testing purposes) to 15Gi. However, after changing the storage value from 10Gi to 15Gi, I can see that the cluster still has not resized and the generated PVC is still set to 10Gi.
From what I can tell, the aws-ebs storage class (https://kubernetes.io/docs/concepts/storage/storage-classes/) allows volume expansion when the field allowVolumeExpansion is true. But even with this set, the volume is never expanded when I change the storage value:
---
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: elasticsearch-storage
  namespace: test
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
reclaimPolicy: Delete
allowVolumeExpansion: true
---
apiVersion: elasticsearch.k8s.elastic.co/v1beta1
kind: Elasticsearch
metadata:
  name: elasticsearch
  namespace: test
spec:
  version: 7.4.2
  http:
    tls:
      certificate:
        secretName: es-cert
  nodeSets:
    - name: default
      count: 3
      volumeClaimTemplates:
        - metadata:
            name: elasticsearch-data
            annotations:
              volume.beta.kubernetes.io/storage-class: elasticsearch-storage
          spec:
            accessModes:
              - ReadWriteOnce
            storageClassName: elasticsearch-storage
            resources:
              requests:
                storage: 15Gi
      config:
        node.master: true
        node.data: true
        node.ingest: true
        node.store.allow_mmap: false
        xpack.security.authc.realms:
          native:
            native1:
              order: 1
---
Technically it should work, but your Kubernetes cluster might not be able to reach the AWS API to expand the volume. Did you check the actual EBS volume in the EC2 console or with the AWS CLI? You can debug this by looking at the kube-controller-manager and cloud-controller-manager logs.
My guess is that there is some kind of permission issue preventing your K8s cluster from talking to the AWS/EC2 API.
If you are running EKS, make sure the IAM cluster role you are using has permissions for EC2/EBS. You can check the control plane logs (kube-controller-manager, kube-apiserver, cloud-controller-manager, etc.) in CloudWatch.
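A quick way to surface resize errors is to look at the events on the claim itself; a minimal sketch, assuming the operator named the PVC elasticsearch-data-elasticsearch-es-default-0 (ECK's default convention):

# events on the PVC will show resize failures reported by the controller
kubectl -n test describe pvc elasticsearch-data-elasticsearch-es-default-0

# on a self-managed control plane, the controller logs can be read directly
kubectl -n kube-system logs -l component=kube-controller-manager | grep -i resize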
EDIT:
The Elasticsearch operator uses StatefulSets, and as of this date volume expansion is not supported on StatefulSets: the storage request in a StatefulSet's volumeClaimTemplates is immutable once created.
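A common manual workaround is to resize the claims directly and recreate the StatefulSet object without touching its pods; a sketch, assuming ECK's default names (StatefulSet <cluster>-es-<nodeSet>, PVC elasticsearch-data-<pod>) and kubectl 1.20+ (older versions used --cascade=false):

# grow each PVC in place; this works because allowVolumeExpansion is true
kubectl -n test patch pvc elasticsearch-data-elasticsearch-es-default-0 \
  -p '{"spec":{"resources":{"requests":{"storage":"15Gi"}}}}'

# delete only the StatefulSet object, leaving pods and PVCs in place,
# so the operator can recreate it with the new storage request
kubectl -n test delete statefulset elasticsearch-es-default --cascade=orphan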
I am running Kubernetes using OKD 4.11 (running on vSphere) and have validated the basic functionality (including dynamic volume provisioning) using applications (like nginx).
I also applied
oc adm policy add-scc-to-group anyuid system:authenticated
to allow authenticated users to use anyuid (which seems to have been required to deploy the nginx example I was testing with).
Then I installed ECK using this quickstart with kubectl to install the CRD and RBAC manifests. This seems to have worked.
Then I deployed the most basic ElasticSearch quickstart example with kubectl apply -f quickstart.yaml using this manifest:
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: quickstart
spec:
  version: 8.4.2
  nodeSets:
    - name: default
      count: 1
      config:
        node.store.allow_mmap: false
The deployment proceeds as expected, pulling the image and starting the container, but ends in a CrashLoopBackOff with the following error from Elasticsearch at the end of the log:
"elasticsearch.cluster.name":"quickstart",
"error.type":"java.lang.IllegalStateException",
"error.message":"failed to obtain node locks, tried
[/usr/share/elasticsearch/data]; maybe these locations
are not writable or multiple nodes were started on the same data path?"
Looking into the storage, the PV and PVC are created successfully, the output of kubectl get pv,pvc,sc -A -n my-namespace is:
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
persistentvolume/pvc-9d7b57db-8afd-40f7-8b3d-6334bdc07241 1Gi RWO Delete Bound my-namespace/elasticsearch-data-quickstart-es-default-0 thin 41m
NAMESPACE NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
my-namespace persistentvolumeclaim/elasticsearch-data-quickstart-es-default-0 Bound pvc-9d7b57db-8afd-40f7-8b3d-6334bdc07241 1Gi RWO thin 41m
NAMESPACE NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
storageclass.storage.k8s.io/thin (default) kubernetes.io/vsphere-volume Delete Immediate false 19d
storageclass.storage.k8s.io/thin-csi csi.vsphere.vmware.com Delete WaitForFirstConsumer true 19d
Looking at the pod yaml, it appears that the volume is correctly attached:
volumes:
  - name: elasticsearch-data
    persistentVolumeClaim:
      claimName: elasticsearch-data-quickstart-es-default-0
  - name: downward-api
    downwardAPI:
      items:
        - path: labels
          fieldRef:
            apiVersion: v1
            fieldPath: metadata.labels
      defaultMode: 420
....
volumeMounts:
...
  - name: elasticsearch-data
    mountPath: /usr/share/elasticsearch/data
I cannot understand why the volume would be read-only, or rather why ES cannot create the lock.
I did find this similar issue, but I am not sure how to apply the UID permissions (in general I am fairly naive about how permissions work in OKD) when working with ECK.
Does anyone with deeper K8s/OKD or ECK/Elasticsearch knowledge have an idea how to better isolate and/or resolve this issue?
Update: I believe this has something to do with this issue and am researching the options related to OKD.
For posterity: ECK starts an init container that should take care of the chown on the data volume, but it can only do so if it is running as root.
The resolution for me was documented here:
https://repo1.dso.mil/dsop/elastic/elasticsearch/elasticsearch/-/issues/7
The manifest now looks like this:
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: quickstart
spec:
  version: 8.4.2
  nodeSets:
    - name: default
      count: 1
      config:
        node.store.allow_mmap: false
      # run the init container as root so it can chown the volume to uid 1000
      podTemplate:
        spec:
          securityContext:
            runAsUser: 1000
            runAsGroup: 0
          initContainers:
            - name: elastic-internal-init-filesystem
              securityContext:
                runAsUser: 0
                runAsGroup: 0
And the pod starts up and can write to the volume as uid 1000.
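To confirm the fix took effect, you can check the init container's log and the ownership of the data directory; a quick sketch, assuming the pod is named quickstart-es-default-0 (ECK's convention):

# the init container should now complete without permission errors
kubectl logs quickstart-es-default-0 -c elastic-internal-init-filesystem

# the data path should be owned by uid 1000
kubectl exec quickstart-es-default-0 -- ls -ld /usr/share/elasticsearch/data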
I am using Elasticsearch (v7.6.1) on a Kubernetes (v1.19) cluster.
The docs suggest disabling swapping:
https://www.elastic.co/guide/en/elasticsearch/reference/current/setup-configuration-memory.html
My yaml:
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: elastic-cluster-1
spec:
  version: 7.6.1
  image: docker.elastic.co/elasticsearch/elasticsearch:7.6.1
  nodeSets:
    - name: default
      count: 3
      config:
        node.master: true
        node.data: true
        node.ingest: true
      podTemplate:
        metadata:
          labels:
            # additional labels for pods
            type: elastic-master-node
        spec:
          nodeSelector:
            node-pool: <NODE_POOL>
          initContainers:
            # Increase linux map count to allow elastic to store large memory maps
            - name: sysctl
              securityContext:
                privileged: true
              command: ['sh', '-c', 'sysctl -w vm.max_map_count=262144']
          containers:
            - name: elasticsearch
              # specify resource limits and requests
              resources:
                limits:
                  memory: 11.2Gi
                requests:
                  cpu: 3200m
              env:
                - name: ES_JAVA_OPTS
                  value: "-Xms6g -Xmx6g"
      # Request persistent data storage for pods
      volumeClaimTemplates:
        - metadata:
            name: elasticsearch-data
          spec:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 50Gi
            storageClassName: ssd
    - name: data
      count: 2
      config:
        node.master: false
        node.data: true
        node.ingest: true
      podTemplate:
        metadata:
          labels:
            # additional labels for pods
            type: elastic-data-node
        spec:
          nodeSelector:
            node-pool: <NODE_POOL>
          initContainers:
            # Increase linux map count to allow elastic to store large memory maps
            - name: sysctl
              securityContext:
                privileged: true
              command: ['sh', '-c', 'sysctl -w vm.max_map_count=262144']
          containers:
            - name: elasticsearch
              # specify resource limits and requests
              resources:
                limits:
                  memory: 11.2Gi
                requests:
                  cpu: 3200m
              env:
                - name: ES_JAVA_OPTS
                  value: "-Xms6g -Xmx6g"
      # Request persistent data storage for pods
      volumeClaimTemplates:
        - metadata:
            name: elasticsearch-data
          spec:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 50Gi
            storageClassName: ssd
  # Google cloud storage credentials
  secureSettings:
    - secretName: "gcs-credentials"
  http:
    service:
      spec:
        # expose this cluster Service with a LoadBalancer
        type: LoadBalancer
    tls:
      certificate:
        secretName: elasticsearch-certificate
It's not clear to me how to change this YAML in order to disable swapping correctly. Changing it manually on each node is not an option, because the configuration would be lost on every restart.
How can I do this?
First of all, a k8s cluster will have swap disabled by default; this is actually a mandatory requirement. In most cases, especially on cloud-managed clusters that follow this requirement, you do not need to worry about swapping. Even in 1.22, enabling swap is only an alpha feature.
If for whatever reason you need to deal with this, you can consider setting bootstrap.memory_lock to true.
...
containers:
  - name: elasticsearch
    env:
      - name: bootstrap.memory_lock
        value: "true"
...
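To verify the setting took effect, you can ask each node whether mlockall succeeded; a quick check, assuming the cluster's HTTP service is reachable on localhost:9200 (e.g. via kubectl port-forward) and $PASSWORD holds the elastic user's password:

curl -sk -u "elastic:$PASSWORD" \
  "https://localhost:9200/_nodes?filter_path=**.mlockall&pretty"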
Up until recently, Kubernetes had no control over swapping.
As of 1.22, there's a new alpha feature for this. The CRI spec does allow for swap allocations. I didn't find anything new in that regard in the Pod specification: as far as I understand, currently you can either allow your containers to use as much swap as they can (UnlimitedSwap), or limit swap+memory usage to whatever memory limit you set on your container (LimitedSwap).
Since you're running 1.19, this shouldn't concern you right now. A good practice while deploying your cluster would have been to make sure there is no swap at all on your nodes, or to set swappiness to 0 or 1. Checking the Kubespray playbooks, we can see they still unconditionally disable swap.
You can connect to your nodes (ssh) and make sure there's no swap -- or disable it otherwise. There's nothing you can do in that Elasticsearch object directly.
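For reference, a minimal sketch of checking and disabling swap on a node over ssh:

# check whether any swap is active; no output from swapon means none
swapon --show
free -h

# disable swap now, and remove or comment out the swap entry in /etc/fstab
# so it stays off across reboots
sudo swapoff -a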
I am launching an Elasticsearch cluster in K8s and below is the spec file. It failed to launch the pod with the error below. I am trying to disable authentication and want to connect to the cluster without any credentials, but the operator rejects this, saying the configuration is reserved for internal use. What is the correct way to set these settings?
Warning ReconciliationError 84s elasticsearch-controller Failed to apply spec change: adjust resources: adjust discovery config: Operation cannot be fulfilled on elasticsearches.elasticsearch.k8s.elastic.co "datasource": the object has been modified; please apply your changes to the latest version and try again
Normal AssociationStatusChange 1s (x16 over 86s) es-monitoring-association-controller Association status changed from [] to []
Warning Validation 1s (x20 over 84s) elasticsearch-controller [spec.nodeSets[0].config.xpack.security.enabled: Forbidden: Configuration setting is reserved for internal use. User-configured use is unsupported, spec.nodeSets[0].config.xpack.security.http.ssl.enabled: Forbidden: Configuration setting is reserved for internal use. User-configured use is unsupported, spec.nodeSets[0].config.xpack.security.transport.ssl.enabled: Forbidden: Configuration setting is reserved for internal use. User-configured use is unsupported]
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: datasource
spec:
  version: 7.14.0
  nodeSets:
    - name: node
      count: 2
      config:
        node.store.allow_mmap: false
        xpack.security.http.ssl.enabled: false
        xpack.security.transport.ssl.enabled: false
        xpack.security.enabled: false
      volumeClaimTemplates:
        - metadata:
            name: elasticsearch-data
          spec:
            accessModes:
              - ReadWriteOnce
            storageClassName: ebs-sc
            resources:
              requests:
                storage: 1024Gi
You can try this:
https://discuss.elastic.co/t/cannot-disable-tls-and-security-in-eks/222335/2
I have tested it and it's working fine for me without any issues:
cat <<EOF | kubectl apply -f -
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: quickstart
spec:
  version: 7.15.0
  nodeSets:
    - name: default
      count: 1
      config:
        node.master: true
        node.data: true
        node.ingest: true
        node.store.allow_mmap: false
        xpack.security.authc:
          anonymous:
            username: anonymous
            roles: superuser
            authz_exception: false
EOF
To disable basic authentication:
https://www.elastic.co/guide/en/elasticsearch/reference/7.14/anonymous-access.html
To disable the SSL self-signed certificate:
https://www.elastic.co/guide/en/cloud-on-k8s/0.9/k8s-accessing-elastic-services.html#k8s-disable-tls
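With anonymous access mapped to superuser as above, you can sanity-check that no credentials are required; a sketch, assuming ECK's default quickstart-es-http service name and a local port-forward:

kubectl port-forward service/quickstart-es-http 9200 &

# -k skips verification of the self-signed certificate;
# no -u flag is needed because the request falls through to the anonymous user
curl -k "https://localhost:9200/_cluster/health?pretty"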
I have 3 Kubernetes nodes; one of them is the master and the others are workers. I deployed a Laravel application to the master node and created volumes and a storage class which point to that folder.
These are my YAML files to create volumes and the persistent volume claim
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: qsinav-pv-www-claim
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
storage class
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: manual
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-ssd
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: qsinav-pv-www
  labels:
    type: local
spec:
  storageClassName: manual
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  claimRef:
    namespace: default
    name: qsinav-pv-www-claim
  hostPath:
    path: "/var/www/html/Test/qSinav-starter"
The problem is that the pods on every node try to mount the folder from their own node.
So, as I am running a web application with load balancing between these nodes, if I log in on node one and the next request goes to node 2, it redirects me to the login page, as I don't have a session there.
So I need to share one folder from the master node with all worker nodes.
I don't know what I should do to achieve my goal, so please help me solve it.
Thanks in advance
Yeah, that's expected and is clearly mentioned in the docs for the hostPath volume:
Pods with identical configuration (such as created from a PodTemplate) may behave differently on different nodes due to different files on the nodes
You need to use something like NFS, which can be shared between nodes. Your PV definition would end up looking something like this (change the IP & path as per your setup):
apiVersion: v1
kind: PersistentVolume
metadata:
  name: qsinav-pv-www
  labels:
    type: local
spec:
  storageClassName: manual
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  claimRef:
    namespace: default
    name: qsinav-pv-www-claim
  nfs:
    path: /tmp
    server: 172.17.0.2
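One note on the access mode: since the goal here is to have pods on several nodes read and write the same share, and NFS supports it, you will likely also want accessModes: [ReadWriteMany] on both the PV and the PVC; ReadWriteOnce limits the volume to pods on a single node.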
Don't use the hostPath type, because this type of volume is only for single-node clusters. It works well only in a single-node environment, because if the pod is assigned to another node, the pod can't get the data it needs.
So use the local type instead. It remembers which node was used for provisioning the volume (via the required nodeAffinity stanza), thus making sure that a restarting pod will always find the data storage in the state it had left it before the reboot.
PS 1: Once a node has died, the data of both hostPath and local persistent volumes of that node is lost.
PS 2: The local and hostPath types don't work with dynamic provisioning.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: qsinav-pv-www
  labels:
    type: local
spec:
  storageClassName: manual
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  claimRef:
    namespace: default
    name: qsinav-pv-www-claim
  local:
    path: "/var/www/html/Test/qSinav-starter"
  # local PVs require node affinity so the scheduler knows which node holds the data
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - <NODE_NAME>  # replace with the node that hosts the folder
We are attempting to run Elasticsearch on top of a Kubernetes / flannel / CoreOS cluster.
As flannel does not support multicast, we cannot use Zen multicast discovery to allow the nodes to find each other, form a cluster, and communicate.
Short of hard-coding the IP addresses of all the Kubernetes nodes into the ES config file, is there another method we can utilise to assist in discovery? Possibly using etcd2 or some other Kubernetes-compatible discovery service?
Version 6.2.0 supports Kubernetes auto-discovery.
Update your elasticsearch.yml as follows:
discovery.zen.ping.unicast.hosts: "kubernetes service name"
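For that setting to resolve to the individual pod IPs, the Service it names is typically headless (clusterIP: None); a minimal sketch, assuming the ES pods carry an app: elasticsearch label:

apiVersion: v1
kind: Service
metadata:
  name: elasticsearch-discovery
spec:
  clusterIP: None
  selector:
    app: elasticsearch
  ports:
    - port: 9300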
There is a discovery plugin that uses the kubernetes API for cluster discovery:
https://github.com/fabric8io/elasticsearch-cloud-kubernetes
Install the plugin:
/usr/share/elasticsearch/bin/plugin -i io.fabric8/elasticsearch-cloud-kubernetes/1.3.0 --verbose
Create a Kubernetes service for discovery:
apiVersion: v1
kind: Service
metadata:
  name: elasticsearch-cluster
spec:
  ports:
    - port: 9300
  selector:
    app: elasticsearch
And an elasticsearch.yml:
cloud.k8s.servicedns: elasticsearch-cluster
discovery.type: io.fabric8.elasticsearch.discovery.k8s.K8sDiscoveryModule
Place the containers into a Kubernetes Service. The Kubernetes API makes an 'endpoints' API available that lists the IP addresses of all of the members of a service. This endpoint set will dynamically shrink and grow as you scale the number of pods.
You can access endpoints with:
kubectl get endpoints <service-name>
or directly via the Kubernetes API, see:
https://github.com/kubernetes/kubernetes/blob/master/examples/cassandra/java/src/io/k8s/cassandra/KubernetesSeedProvider.java#L106
for an example of how this was done for Cassandra.
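If you'd rather query that endpoints API directly from inside a pod, the default in-cluster credentials are enough, provided the pod's service account is allowed to read endpoints; a sketch, assuming the elasticsearch-cluster service above lives in the default namespace:

# the token and CA certificate are mounted into every pod by default
TOKEN=$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)
curl -s --cacert /var/run/secrets/kubernetes.io/serviceaccount/ca.crt \
  -H "Authorization: Bearer $TOKEN" \
  https://kubernetes.default.svc/api/v1/namespaces/default/endpoints/elasticsearch-cluster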
It worked for me only with this configuration.
Important: flannel must be configured with the vxlan backend.
cluster.yaml
network:
  plugin: flannel
  options:
    flannel_backend_type: vxlan
elasticsearch.yaml
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: elastic-cluster
spec:
  version: 7.0.1
  nodeSets:
    - name: node
      count: 3
      config:
        node.master: true
        node.data: true
        node.ingest: true
        xpack.ml.enabled: true
        node.store.allow_mmap: true
        indices.query.bool.max_clause_count: 100000
        # Fixed flannel kubernetes network plugin
        discovery.seed_hosts:
        {{ range $i, $e := until (3 | int) }}
          - elastic-cluster-es-node-{{ $i }}
        {{ end }}
      podTemplate:
        spec:
          containers:
            - name: elasticsearch
              env:
                - name: ES_JAVA_OPTS
                  value: "-Xms4g -Xmx4g"
                - name: READINESS_PROBE_TIMEOUT
                  value: "60"
              resources:
                requests:
                  memory: 5Gi
                  # cpu: 1
                limits:
                  memory: 6Gi
      volumeClaimTemplates:
        - metadata:
            name: elasticsearch-data
          spec:
            storageClassName: local-elasticsearch-storage
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 5G
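One caveat: the {{ range }} block above is Go/Helm templating, not valid YAML, so it only works when the manifest is rendered by a templater such as Helm before being applied. If you apply the file directly with kubectl, list the hosts explicitly; since ECK names the pods <cluster>-es-<nodeSet>-<ordinal>, the rendered config would be:

discovery.seed_hosts:
  - elastic-cluster-es-node-0
  - elastic-cluster-es-node-1
  - elastic-cluster-es-node-2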