It's not really DigitalOcean specific; it would be nice to verify whether this is expected behavior or not.
I'm trying to set up an Elasticsearch cluster on a DO managed Kubernetes cluster with the Helm chart from Elastic itself.
The docs say that I need to specify a storageClassName in a volumeClaimTemplate in order to use a volume provided by the managed Kubernetes service. For DO it's do-block-storage according to their docs. It also seems that it's not necessary to define a PVC; the Helm chart should do that itself.
Here's the config I'm using:
# Specify node pool
nodeSelector:
  doks.digitalocean.com/node-pool: elasticsearch

# Shrink default JVM heap.
esJavaOpts: "-Xmx128m -Xms128m"

# Allocate smaller chunks of memory per pod.
resources:
  requests:
    cpu: "100m"
    memory: "512M"
  limits:
    cpu: "1000m"
    memory: "512M"

# Specify Digital Ocean storage
# Request smaller persistent volumes.
volumeClaimTemplate:
  accessModes: [ "ReadWriteOnce" ]
  storageClassName: do-block-storage
  resources:
    requests:
      storage: 10Gi

extraInitContainers: |
  - name: create
    image: busybox:1.28
    command: ['mkdir', '/usr/share/elasticsearch/data/nodes/']
    volumeMounts:
      - mountPath: /usr/share/elasticsearch/data
        name: elasticsearch-master
  - name: file-permissions
    image: busybox:1.28
    command: ['chown', '-R', '1000:1000', '/usr/share/elasticsearch/']
    volumeMounts:
      - mountPath: /usr/share/elasticsearch/data
        name: elasticsearch-master
I'm installing the Helm chart with Terraform, but it doesn't really matter which way you do it:
resource "helm_release" "elasticsearch" {
name = "elasticsearch"
chart = "elastic/elasticsearch"
namespace = "elasticsearch"
values = [
file("charts/elasticsearch.yaml")
]
}
Here's what I got when checking the events:
51s Normal Provisioning persistentvolumeclaim/elasticsearch-master-elasticsearch-master-2 External provisioner is provisioning volume for claim "elasticsearch/elasticsearch-master-elasticsearch-master-2"
2m28s Normal ExternalProvisioning persistentvolumeclaim/elasticsearch-master-elasticsearch-master-2 waiting for a volume to be created, either by external provisioner "dobs.csi.digitalocean.com" or manually created by system administrator
I'm pretty sure the problem is the volume: it should have been automatically provisioned by Kubernetes. Describing the PVC gives this:
holms#debian ~/D/c/s/b/t/s/post-infra> kubectl describe pvc elasticsearch-master-elasticsearch-master-0 --namespace elasticsearch
Name: elasticsearch-master-elasticsearch-master-0
Namespace: elasticsearch
StorageClass: do-block-storage
Status: Pending
Volume:
Labels: app=elasticsearch-master
Annotations: volume.beta.kubernetes.io/storage-provisioner: dobs.csi.digitalocean.com
Finalizers: [kubernetes.io/pvc-protection]
Capacity:
Access Modes:
VolumeMode: Filesystem
Mounted By: elasticsearch-master-0
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Provisioning 4m57s (x176 over 14h) dobs.csi.digitalocean.com_master-setupad-eu_04e43747-fafb-11e9-b7dd-e6fd8fbff586 External provisioner is provisioning volume for claim "elasticsearch/elasticsearch-master-elasticsearch-master-0"
Normal ExternalProvisioning 93s (x441 over 111m) persistentvolume-controller waiting for a volume to be created, either by external provisioner "dobs.csi.digitalocean.com" or manually created by system administrator
I've googled everything already; everything seems to be correct, and the volume should come up on the DO side with no problems, but it hangs in the Pending state. Is this expected behavior, or should I ask DO support to check what's going on on their side?
Yes, this is expected behavior. This chart might not be compatible with Digital Ocean Kubernetes service.
Digital Ocean documentation has the following information in Known Issues section:
Support for resizing DigitalOcean Block Storage Volumes in Kubernetes has not yet been implemented.
In the DigitalOcean Control Panel, cluster resources (worker nodes, load balancers, and block storage volumes) are listed outside of the Kubernetes page. If you rename or otherwise modify these resources in the control panel, you may render them unusable to the cluster or cause the reconciler to provision replacement resources. To avoid this, manage your cluster resources exclusively with kubectl or from the control panel’s Kubernetes page.
In charts/stable/elasticsearch there are specific requirements mentioned:
Prerequisites Details
Kubernetes 1.10+
PV dynamic provisioning support on the underlying infrastructure
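A quick way to check whether dynamic provisioning works at all on the cluster, independent of the chart, is to create a standalone test claim (a minimal sketch; the claim name is arbitrary):
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: provisioning-test
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: do-block-storage
  resources:
    requests:
      storage: 1Gi
If this claim also hangs in Pending, the problem is on the provisioner side rather than in the chart.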
You can ask Digital Ocean support for help or try to deploy Elasticsearch without the Helm chart.
It is even mentioned on github that:
Automated testing of this chart is currently only run against GKE (Google Kubernetes Engine).
Update:
The same issue is present on my kubeadm HA cluster.
However, I managed to get it working by manually creating PersistentVolumes for my StorageClass.
My storageclass definition: storageclass.yaml:
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: ssd
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
parameters:
  type: pd-ssd
$ kubectl apply -f storageclass.yaml
$ kubectl get sc
NAME PROVISIONER AGE
ssd local 50m
My PersistentVolume definition: pv.yaml:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: task-pv-volume
  labels:
    type: local
spec:
  storageClassName: ssd
  capacity:
    storage: 30Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/mnt/data"
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - <name of the node>
$ kubectl apply -f pv.yaml
After that I ran the Helm chart:
helm install stable/elasticsearch --name my-release --set data.persistence.storageClass=ssd,data.storage=30Gi --set master.persistence.storageClass=ssd,master.storage=30Gi
PVC finally got bound.
$ kubectl get pvc -A
NAMESPACE NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
default data-my-release-elasticsearch-data-0 Bound task-pv-volume2 30Gi RWO ssd 17m
default data-my-release-elasticsearch-master-0 Pending 17m
Note that I only manually satisfied a single PVC, and manual volume provisioning for Elasticsearch might be very inefficient.
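Each remaining Pending claim (like the master claim above) would need its own PV; a sketch of a second one, identical except for the name and host path (both placeholders):
apiVersion: v1
kind: PersistentVolume
metadata:
  name: task-pv-volume-master
  labels:
    type: local
spec:
  storageClassName: ssd
  capacity:
    storage: 30Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/mnt/data-master"
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - <name of the node>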
I suggest contacting DO support for an automated volume provisioning solution.
What a strange situation: after I changed 10Gi to 10G it started to work. Maybe it has something to do with the storage class itself, but it started to work.
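For reference, the only change was the storage request in the values file above, with the unit swapped from binary to decimal:
volumeClaimTemplate:
  accessModes: [ "ReadWriteOnce" ]
  storageClassName: do-block-storage
  resources:
    requests:
      storage: 10G  # was 10Gi; the decimal unit provisioned, the binary one hung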
Related
I am running Kubernetes using OKD 4.11 (running on vSphere) and have validated the basic functionality (including dynamic volume provisioning) using applications like nginx.
I also applied
oc adm policy add-scc-to-group anyuid system:authenticated
to allow authenticated users to use anyuid (which seems to have been required to deploy the nginx example I was testing with).
Then I installed ECK using this quickstart with kubectl to install the CRD and RBAC manifests. This seems to have worked.
Then I deployed the most basic ElasticSearch quickstart example with kubectl apply -f quickstart.yaml using this manifest:
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: quickstart
spec:
  version: 8.4.2
  nodeSets:
    - name: default
      count: 1
      config:
        node.store.allow_mmap: false
The deployment proceeds as expected, pulling the image and starting the container, but ends in a CrashLoopBackOff with the following error from Elasticsearch at the end of the log:
"elasticsearch.cluster.name":"quickstart",
"error.type":"java.lang.IllegalStateException",
"error.message":"failed to obtain node locks, tried
[/usr/share/elasticsearch/data]; maybe these locations
are not writable or multiple nodes were started on the same data path?"
Looking into the storage, the PV and PVC are created successfully, the output of kubectl get pv,pvc,sc -A -n my-namespace is:
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
persistentvolume/pvc-9d7b57db-8afd-40f7-8b3d-6334bdc07241 1Gi RWO Delete Bound my-namespace/elasticsearch-data-quickstart-es-default-0 thin 41m
NAMESPACE NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
my-namespace persistentvolumeclaim/elasticsearch-data-quickstart-es-default-0 Bound pvc-9d7b57db-8afd-40f7-8b3d-6334bdc07241 1Gi RWO thin 41m
NAMESPACE NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
storageclass.storage.k8s.io/thin (default) kubernetes.io/vsphere-volume Delete Immediate false 19d
storageclass.storage.k8s.io/thin-csi csi.vsphere.vmware.com Delete WaitForFirstConsumer true 19d
Looking at the pod YAML, it appears that the volume is correctly attached:
volumes:
  - name: elasticsearch-data
    persistentVolumeClaim:
      claimName: elasticsearch-data-quickstart-es-default-0
  - name: downward-api
    downwardAPI:
      items:
        - path: labels
          fieldRef:
            apiVersion: v1
            fieldPath: metadata.labels
      defaultMode: 420
....
volumeMounts:
  ...
  - name: elasticsearch-data
    mountPath: /usr/share/elasticsearch/data
I cannot understand why the volume would be read-only, or rather why ES cannot create the lock.
I did find this similar issue, but I am not sure how to apply the UID permissions (in general I am fairly naive about the way permissions work in OKD) when working with ECK.
Does anyone with deeper K8s/OKD or ECK/Elasticsearch knowledge have an idea how to better isolate and/or resolve this issue?
Update: I believe this has something to do with this issue and am researching the options related to OKD.
For posterity: ECK starts an init container that should take care of the chown on the data volume, but it can only do so if it is running as root.
The resolution for me was documented here:
https://repo1.dso.mil/dsop/elastic/elasticsearch/elasticsearch/-/issues/7
The manifest now looks like this:
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: quickstart
spec:
  version: 8.4.2
  nodeSets:
    - name: default
      count: 1
      config:
        node.store.allow_mmap: false
      # run init container as root to chown the volume to uid 1000
      podTemplate:
        spec:
          securityContext:
            runAsUser: 1000
            runAsGroup: 0
          initContainers:
            - name: elastic-internal-init-filesystem
              securityContext:
                runAsUser: 0
                runAsGroup: 0
And the pod starts up and can write to the volume as uid 1000.
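To verify, you can exec into the running pod and confirm the effective UID and that the data path is writable (the pod name follows ECK's <cluster>-es-<nodeSet>-<ordinal> convention, matching the PVC name above):
$ kubectl exec quickstart-es-default-0 -- id -u
$ kubectl exec quickstart-es-default-0 -- touch /usr/share/elasticsearch/data/.write-test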
I am deploying an Elasticsearch cluster on Kubernetes in AWS EKS. The spec I have is:
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: es
spec:
  version: 7.14.0
  nodeSets:
    - name: node
      count: 2
      config:
        node.store.allow_mmap: false
      volumeClaimTemplates:
        - metadata:
            name: elasticsearch-data
          spec:
            accessModes:
              - ReadWriteOnce
            storageClassName: ebs-sc
            resources:
              requests:
                storage: 1024Gi
When I deploy, I get this error:
for: "es.yml": admission webhook "elastic-es-validation-v1.k8s.elastic.co" denied the request: Elasticsearch.elasticsearch.k8s.elastic.co "es" is invalid: spec.nodeSet[0].volumeClaimTemplates: Invalid value: []v1.PersistentVolumeClaim{v1.PersistentVolumeClaim{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"elasticsearch-data", GenerateName:"", Namespace:"", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), DeletionTimestamp:<nil>, DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ClusterName:"", ManagedFields:[]v1.ManagedFieldsEntry(nil)}, Spec:v1.PersistentVolumeClaimSpec{AccessModes:[]v1.PersistentVolumeAccessMode{"ReadWriteOnce"}, Selector:(*v1.LabelSelector)(nil), Resources:v1.ResourceRequirements{Limits:v1.ResourceList(nil), Requests:v1.ResourceList{"storage":resource.Quantity{i:resource.int64Amount{value:1099511627776, scale:0}, d:resource.infDecAmount{Dec:(*inf.Dec)(nil)}, s:"", Format:"BinarySI"}}}, VolumeName:"", StorageClassName:(*string)(0xc000cd2600), VolumeMode:(*v1.PersistentVolumeMode)(nil), DataSource:(*v1.TypedLocalObjectReference)(nil)}, Status:v1.PersistentVolumeClaimStatus{Phase:"", AccessModes:[]v1.PersistentVolumeAccessMode(nil), Capacity:v1.ResourceList(nil), Conditions:[]v1.PersistentVolumeClaimCondition(nil)}}}: volume claim templates can only have their storage requests increased, if the storage class allows volume expansion. Any other change is forbidden
The spec includes volumeClaimTemplates, which is used to claim the persistent storage. I don't understand why it says "volume claim templates can only have their storage requests increased, if the storage class allows volume expansion".
I have checked that there are no existing PVCs:
$ kubectl get pvc
No resources found in default namespace.
And I have the below spec for the storage class:
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: ebs-sc
provisioner: ebs.csi.aws.com
allowVolumeExpansion: true
volumeBindingMode: WaitForFirstConsumer
The other answer is correct: K8s has no idea how to change the storage class while retaining the data, since the volumes themselves would have to change as well.
By changing the nodeSet name, the old nodeSet will disappear while the new nodeSet can be spawned from a clean state (assuming you're okay with not retaining the data in the old pods).
You can first rename the nodeSet to a new name, let K8s flush and spawn new pods, then rename the nodeSet back to the original name and repeat.
...I don't understand why it says volume claim templates can only have their storage requests increased, if the storage class allows volume expansion.
With ECK, when you have applied a spec that refers to a StorageClass that allows volume expansion, you can increase the storage size and re-apply the spec. Besides this, no other change is allowed in the volumeClaimTemplates section. If you must change the volumeClaimTemplates (e.g. refer to a different StorageClass, or reduce the storage size), you need to also rename the name under nodeSets and re-apply the spec. Example:
...
spec:
  ...
  nodeSets:
    - name: <CHANGEME!>  # <-- MUST change if not a simple size increase
      ...
      volumeClaimTemplates:
        - metadata:
            name: elasticsearch-data
          spec:
            resources:
              requests:
                storage: <xxxGi>  # <-- can be increased if the StorageClass supports expansion; to decrease, the nodeSets name MUST be changed
            storageClassName: <changeable only if the nodeSets name is changed>
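For the simple size-increase path against the spec from the question, a sketch of the only legal in-place change (allowed here because ebs-sc sets allowVolumeExpansion: true):
volumeClaimTemplates:
  - metadata:
      name: elasticsearch-data
    spec:
      accessModes:
        - ReadWriteOnce
      storageClassName: ebs-sc
      resources:
        requests:
          storage: 2048Gi  # increased from 1024Gi; no nodeSet rename needed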
I am trying to deploy an Elasticsearch cluster (replicas: 3) using a StatefulSet in Kubernetes and need to store the Elasticsearch data in a PersistentVolume (PV). Since each Elasticsearch instance has its own data folder, I need a separate data folder for each replica in the PV. I am trying to use volumeClaimTemplates and mountPath: /usr/share/elasticsearch/data, but this results in the error "pod has unbound immediate PersistentVolumeClaims" on the second pod. How can I achieve this using a StatefulSet?
Thanks in advance.
There is no information on how you are trying to install Elasticsearch; however:
As an example please follow:
this tutorial,
helm-charts,
As per documentation for StatefulSet - limitations:
The storage for a given Pod must either be provisioned by a PersistentVolume Provisioner based on the requested storage class, or pre-provisioned by an admin.
This looks like your case: a problem with dynamic storage provisioning.
Please verify the storage class, check whether the PV and PVC were created and bound together, and check the storage class in volumeClaimTemplates:
volumeMounts:
  - name: "elasticsearch-master"
    mountPath: /usr/share/elasticsearch/data
volumeClaimTemplates:
  - metadata:
      name: elasticsearch-master
    spec:
      accessModes:
        - ReadWriteOnce
      storageClassName: name  # check whether you are using the default storage class; otherwise specify this parameter manually
      resources:
        requests:
          storage: 30Gi
Hope this helps.
If you are using dynamic provisioning, the volume gets created automatically on the backend (for example, a disk backs PVs in Azure for ReadWriteOnce kinds of operations); otherwise you need to create it manually.
Once you create the volume, just create a PVC in the appropriate namespace with a size matching the PV; then you just pass the volume name in the PVC definition and it will get bound automatically.
You can try something like this:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: claimName
  namespace: namespace
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: default
  volumeName: pv-volumeName
status:
  accessModes:
    - ReadWriteOnce
  capacity:
    storage: 1Gi
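For completeness, a minimal sketch of a matching PersistentVolume the claim above could bind to, using the in-tree Azure disk volume mentioned earlier (the disk name and URI are placeholders):
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-volumeName
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  storageClassName: default
  azureDisk:
    kind: Managed
    diskName: myDisk  # placeholder
    diskURI: /subscriptions/<subscription-id>/resourceGroups/<rg>/providers/Microsoft.Compute/disks/myDisk  # placeholder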
Please share if you still face issues.
I have set up Elasticsearch on a single node in GKE. How do I add storage for my Elasticsearch data?
You need to add a PersistentVolumeClaim (PVC) to the pod. A PersistentVolumeClaim is a request for and claim to a PersistentVolume resource, and pods use claims as volumes. The following example manifest describes a request for a 30 GiB disk whose access mode allows it to be mounted by one pod at a time in read/write mode:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: helloweb-disk
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 30Gi
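Pods then use the claim as a volume. A minimal sketch of mounting this claim into a pod (the pod name and image are placeholders for illustration):
apiVersion: v1
kind: Pod
metadata:
  name: helloweb
spec:
  containers:
    - name: app
      image: busybox:1.28
      command: ['sh', '-c', 'sleep 3600']
      volumeMounts:
        - name: data
          mountPath: /data  # the claimed disk appears here
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: helloweb-disk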
You can find more information about PersistentVolumes and PersistentVolumeClaims in Kubernetes here.
I installed a Rancher server and 2 Rancher agents in Vagrant, then switched to the K8s environment from the Rancher server.
On the Rancher server host I installed kubectl and helm, then installed Prometheus with Helm:
helm install stable/prometheus
Now, checking the status from the Kubernetes dashboard, there are 2 pods pending:
The dashboard reports that the PersistentVolumeClaim is not bound; weren't the K8s components installed by default with the Rancher server?
(another name, same issue)
Edit
> kubectl get pvc
NAME                                   STATUS    VOLUME   CAPACITY   ACCESSMODES   STORAGECLASS   AGE
voting-prawn-prometheus-alertmanager   Pending                                                    6h
voting-prawn-prometheus-server         Pending                                                    6h
> kubectl get pv
No resources found.
Edit 2
$ kubectl describe pvc voting-prawn-prometheus-alertmanager
Name: voting-prawn-prometheus-alertmanager
Namespace: default
StorageClass:
Status: Pending
Volume:
Labels: app=prometheus
chart=prometheus-4.6.9
component=alertmanager
heritage=Tiller
release=voting-prawn
Annotations: <none>
Capacity:
Access Modes:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal FailedBinding 12s (x10 over 2m) persistentvolume-controller no persistent volumes available for this claim and no storage class is set
$ kubectl describe pvc voting-prawn-prometheus-server
Name: voting-prawn-prometheus-server
Namespace: default
StorageClass:
Status: Pending
Volume:
Labels: app=prometheus
chart=prometheus-4.6.9
component=server
heritage=Tiller
release=voting-prawn
Annotations: <none>
Capacity:
Access Modes:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal FailedBinding 12s (x14 over 3m) persistentvolume-controller no persistent volumes available for this claim and no storage class is set
I had the same issues as you. I found two ways to solve this:
Edit values.yaml and set persistentVolume.enabled=false; this will allow you to use an emptyDir (this applies to Prometheus Server and Alertmanager). See the sketch after this list.
If you can't change values.yaml, you will have to create the PV before deploying the chart so that the pod can bind to the volume; otherwise it will stay in the Pending state forever.
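A minimal sketch of that values.yaml change, assuming the chart's usual layout where the server and Alertmanager each have their own persistentVolume block:
server:
  persistentVolume:
    enabled: false  # Prometheus server falls back to emptyDir
alertmanager:
  persistentVolume:
    enabled: false  # Alertmanager falls back to emptyDir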
PVs are cluster-scoped and PVCs are namespace-scoped.
If your application is running in one namespace and the PVC is in a different namespace, that can be the issue.
If so, use RBAC to give the proper permissions, or put the app and the PVC in the same namespace.
Can you make sure the StorageClass that the PV is being created from is the default SC of the cluster? (See the commands below.)
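To check which StorageClass is the cluster default, and to mark one as default if needed (this uses the standard is-default-class annotation):
$ kubectl get storageclass
$ kubectl patch storageclass <name> -p '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "true"}}}'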
I found that I was missing a storage class and storage volumes, and fixed similar problems on my cluster by first creating a storage class.
kubectl apply -f storageclass.yaml
storageclass.yaml:
{
  "kind": "StorageClass",
  "apiVersion": "storage.k8s.io/v1",
  "metadata": {
    "name": "local-storage",
    "annotations": {
      "storageclass.kubernetes.io/is-default-class": "true"
    }
  },
  "provisioner": "kubernetes.io/no-provisioner",
  "reclaimPolicy": "Delete"
}
and then using the storage class when installing Prometheus with Helm:
helm install stable/prometheus --set server.storageClass=local-storage
and I was also forced to create a volume for Prometheus to bind to:
kubectl apply -f prometheusVolume.yaml
prometheusVolume.yaml:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: prometheus-volume
spec:
  storageClassName: local-storage
  capacity:
    storage: 2Gi  # size of the volume
  accessModes:
    - ReadWriteOnce  # type of access
  hostPath:
    path: "/mnt/data"  # host location
You could use other storage classes; I found that there are a lot to choose between, but there might be other steps involved to get them working.