Data is lost after changes are applied to the image - elasticsearch

I am trying to add Elasticsearch to an EKS cluster, but whenever I apply changes my data is lost. It looks like the volume I attached has been changed and reclaimed.
metadata:
  name: elasticsearch-uat
  labels:
    component: elasticsearch-uat
spec:
  replicas: 1
  serviceName: elasticsearch-uat
  template:
    metadata:
      ...
    spec:
      initContainers:
        - name: init-sysctl
          ...
      containers:
        - name: es
          securityContext:
            capabilities:
              add:
                - IPC_LOCK
          image: 559076975273.dkr.ecr.us-west-2.amazonaws.com/elasticsearch-s3:v2
          env:
            ...
          ports:
            - containerPort: 9200
              name: http
              protocol: TCP
            - containerPort: 9300
              name: transport
              protocol: TCP
          volumeMounts:
            - mountPath: /data
              name: es-storage-uat
  updateStrategy:
    type: RollingUpdate
  volumeClaimTemplates:
    - metadata:
        namespace: k8
        name: es-storage-uat
      spec:
        storageClassName: gp2
        accessModes: [ ReadWriteOnce ]
        resources:
          requests:
            storage: 2Gi
This is a StatefulSet.
Please help me understand this concept. I do not want to lose my data under any condition.
Thanks in advance.

From the pv output it is clear that the PersistentVolume reclaim policy is set to Delete, which means that if the PVC is deleted, the PersistentVolume is deleted automatically. You lose the data if the PVC is deleted.
In your case, it is appropriate to use the “Retain” policy. With the “Retain” policy, if a user deletes a PersistentVolumeClaim, the corresponding PersistentVolume is not deleted.
Further to the above, if you are using dynamic provisioning, set the reclaimPolicy field of the StorageClass to an appropriate value. If no reclaimPolicy is specified when a StorageClass object is created, it defaults to Delete.
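As a sketch of the dynamic-provisioning route (the class name gp2-retain is an assumption, not something from your manifest), you can create a StorageClass whose volumes are retained and reference it from volumeClaimTemplates; for a PV that is already bound you can instead run kubectl patch pv <pv-name> -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'.
# Hypothetical StorageClass: dynamically provisioned gp2 volumes are kept
# (and their data preserved) even if the PVC is deleted.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp2-retain            # assumed name; use it as storageClassName in the claim template
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
reclaimPolicy: Retain
allowVolumeExpansion: true
Note that a retained volume becomes Released after its claim is deleted and has to be re-bound or cleaned up manually, but the underlying EBS volume and its data stay intact.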

Related

How can I configure different storage mount for different pod in Elasticsearch cluster in K8S?

I am deploying an Elasticsearch cluster to K8S on EKS with a nodegroup. I claimed an EBS volume for the cluster's storage. When I launch the cluster, only one pod runs successfully and I get this error for the other pods:
Warning FailedAttachVolume 3m33s attachdetach-controller Multi-Attach error for volume "pvc-4870bd46-2f1e-402a-acf7-005de83e4588" Volume is already used by pod(s) es-0
Warning FailedMount 90s kubelet Unable to attach or mount volumes: unmounted volumes=[persistent-storage], unattached volumes=[es-config persistent-storage default-token-pqzkp]: timed out waiting for the condition
It means the storage is already in use. I understand that this volume is used by the first pod, so the other pods can't use it. But I don't know how to use a different mount path for each pod when they are all using the same EBS volume.
Below is the full spec for the cluster.
apiVersion: v1
kind: ConfigMap
metadata:
  name: es-config
data:
  elasticsearch.yml: |
    cluster.name: elk-cluster
    network.host: "0.0.0.0"
    bootstrap.memory_lock: false
    # discovery.zen.minimum_master_nodes: 2
    node.max_local_storage_nodes: 9
    discovery.seed_hosts:
      - es-0.es-entrypoint.default.svc.cluster.local
      - es-1.es-entrypoint.default.svc.cluster.local
      - es-2.es-entrypoint.default.svc.cluster.local
  ES_JAVA_OPTS: -Xms4g -Xmx8g
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: es
  namespace: default
spec:
  serviceName: es-entrypoint
  replicas: 3
  selector:
    matchLabels:
      name: es
  template:
    metadata:
      labels:
        name: es
    spec:
      volumes:
        - name: es-config
          configMap:
            name: es-config
            items:
              - key: elasticsearch.yml
                path: elasticsearch.yml
        - name: persistent-storage
          persistentVolumeClaim:
            claimName: ebs-claim
      initContainers:
        - name: permissions-fix
          image: busybox
          volumeMounts:
            - name: persistent-storage
              mountPath: /usr/share/elasticsearch/data
          command: [ 'chown' ]
          args: [ '1000:1000', '/usr/share/elasticsearch/data' ]
      containers:
        - name: es
          image: elasticsearch:7.10.1
          resources:
            requests:
              cpu: 2
              memory: 8Gi
          ports:
            - name: http
              containerPort: 9200
            - containerPort: 9300
              name: inter-node
          volumeMounts:
            - name: es-config
              mountPath: /usr/share/elasticsearch/config/elasticsearch.yml
              subPath: elasticsearch.yml
            - name: persistent-storage
              mountPath: /usr/share/elasticsearch/data
---
apiVersion: v1
kind: Service
metadata:
  name: es-entrypoint
spec:
  selector:
    name: es
  ports:
    - port: 9200
      targetPort: 9200
      protocol: TCP
  clusterIP: None
You should be using volumeClaimTemplates with the StatefulSet so that each pod gets its own volume. Details:
volumeClaimTemplates:
  - metadata:
      name: es
    spec:
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 5Gi
      # storageClassName: <omit to use default StorageClass or specify>
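As a minimal follow-up sketch (reusing the names from the spec above), the container then mounts the volume by the claim template's name, and the shared persistentVolumeClaim volume pointing at ebs-claim is removed, so each of es-0, es-1 and es-2 gets its own dynamically provisioned EBS volume:
containers:
  - name: es
    image: elasticsearch:7.10.1
    volumeMounts:
      - name: es                                 # must match volumeClaimTemplates metadata.name
        mountPath: /usr/share/elasticsearch/data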

Kubernetes persistent volume claim overriding existing directory's owner and permissions

In Kubernetes, I am having a directory permission problem. I am testing with a pod to create a bare-bones Elasticsearch instance, built off an Elasticsearch-provided Docker image.
If I use a basic .yaml file to define the container, everything starts up. The problem happens when I attempt to replace a directory created by the Docker image with a directory created by mounting the persistent volume.
The original directory was
drwxrwxr-x 1 elasticsearch root 4096 Aug 30 19:25 data
and if I mount the persistent volume, it changes the owner and permissions to
drwxr-xr-x 2 root root 4096 Aug 30 19:53 data
Now, with the elasticsearch process running as the elasticsearch user, this directory can no longer be accessed.
I have set the pod's security context's fsGroup to 1000 to match the elasticsearch group. I have set the container's security context's runAsUser to 0. I have tried various other combinations of users and groups, but to no avail.
Here is my pod, persistent volume claim, and persistent volume definitions.
Any suggestions are welcome.
apiVersion: v1
kind: Pod
metadata:
  name: elasticfirst
  labels:
    app: elasticsearch
spec:
  securityContext:
    fsGroup: 1000
  containers:
    - name: es01
      image: docker.elastic.co/elasticsearch/elasticsearch:7.3.1
      securityContext:
        runAsUser: 0
      resources:
        limits:
          memory: 2Gi
          cpu: 200m
        requests:
          memory: 1Gi
          cpu: 100m
      env:
        - name: node.name
          value: es01
        - name: discovery.seed_hosts
          value: es01
        - name: cluster.initial_master_nodes
          value: es01
        - name: cluster.name
          value: elasticsearch-cluster
        - name: bootstrap.memory_lock
          value: "true"
        - name: ES_JAVA_OPTS
          value: "-Xms1g -Xmx2g"
      ports:
        - containerPort: 9200
      volumeMounts:
        - mountPath: "/usr/share/elasticsearch/data"
          name: elastic-storage2
  nodeSelector:
    type: compute
  volumes:
    - name: elastic-storage2
      persistentVolumeClaim:
        claimName: elastic-storage2-pvc
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: elastic-storage2-pvc
spec:
  storageClassName: local-storage
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 512Mi
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: elastic-storage2-pv
spec:
  storageClassName: local-storage
  capacity:
    storage: 512Mi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: /var/tmp/pv
Your question is a tiny bit confusing about what is happening versus what you want to be happening, but in general that problem is a common one; that's why many setups use an initContainer: to change the ownership of freshly provisioned PersistentVolumes (as in this example).
In such a setup, the initContainer: would run as root, but would also presumably be a very thin container whose job is only to chown and then exit, leaving your application container -- elasticsearch in your example -- free to run as an unprivileged user.
spec:
  initContainers:
    - name: chown
      image: busybox
      command:
        - chown
        - -R
        - "1000:1000"
        - /the/data
      volumeMounts:
        - name: es-data
          mountPath: /the/data
  containers:
    - name: es
      # etc etc

facing issue with filestore persistent volume in gke

I am using Google Filestore for a persistent volume in Kubernetes, but it is mounting only the root folder, not its contents.
I am using the GKE service and performed the following tasks:
volume-create:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: volume1
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany
  nfs:
    server: x.x.x.x
    path: /share
Persistent volume claim:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: volume1-claim
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""
  volumeName: volume1
  resources:
    requests:
      storage: 3Gi
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: image1
spec:
  template:
    metadata:
      labels:
        name: image1
    spec:
      restartPolicy: Always
      volumes:
        - name: volume
          persistentVolumeClaim:
            claimName: volume1-claim
      containers:
        - name: image1
          image: gcr.io/project-gcr/image:1.0
          imagePullPolicy: Always
          volumeMounts:
            - name: volume
              mountPath: "/app/data"
But it mounts an empty folder at /app/data, not its contents. I also referred to the below URL:
https://cloud.google.com/filestore/docs/accessing-fileshares
Any help is appreciated.
Are you trying to mount an already existing persistent disk?
If so, you will need to define the disk in your configuration file.
You can find more information here.
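For reference, a minimal sketch of mounting an already existing GCE persistent disk directly in a pod spec (the disk name my-existing-disk is hypothetical, and the disk must live in the same zone as the node):
volumes:
  - name: data
    gcePersistentDisk:
      pdName: my-existing-disk   # hypothetical; must be an existing GCE disk
      fsType: ext4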
What kind of data is located in /app/data? Is it on a VM? Can you give me some more information on your deployment? How are you testing and viewing your data?
The more details I have, the more specific we can be with the help we can provide.

Elasticsearch path.data

I have two webservers, each with their own installation of Elasticsearch.
Both these webservers have a shared folder on their D: drive.
I want to use the same data folder so that I have one set of indexes and each elasticsearch install uses those same indexes, rather than having 2 sets, one on each server.
Therefore I have changed the 'path.data' location in both elasticsearch.yml files to point to the same shared folder.
The problem is that only one webserver is able to retrieve data for queries; the other server just returns nothing when running a search query.
Am I missing a config setting?
Are the two Elasticsearch nodes in the same cluster?
Each node writes to its own folder even though they share the same base directory.
As it is, it seems you have two distinct Elasticsearch instances holding separate data.
Define a cluster and add the two nodes to it; that is the proper way to have the same data managed by multiple nodes.
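As a minimal sketch (host and node names are assumptions, and this assumes Elasticsearch 7+; older versions use discovery.zen.ping.unicast.hosts instead), both elasticsearch.yml files would share the same cluster name and list each other as seed hosts, while each node keeps its own path.data:
# elasticsearch.yml on the first webserver (mirror it on the second with node-2)
cluster.name: my-shared-cluster
node.name: node-1
network.host: 0.0.0.0
discovery.seed_hosts: ["webserver1", "webserver2"]   # hypothetical host names
cluster.initial_master_nodes: ["node-1", "node-2"]
path.data: D:\elasticsearch\node-1\data              # per-node data directory, not a shared one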
The below config works fine:
apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    name: production
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: elasticsearch-data-prod
  namespace: production
  labels:
    type: local
spec:
  storageClassName: standard
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/mnt/data"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: elasticsearch-data-prod
  namespace: production
spec:
  storageClassName: standard
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
---
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: data-es
  namespace: production
spec:
  version: 8.4.3
  nodeSets:
    - name: default
      count: 1
      config:
        node.store.allow_mmap: false
      podTemplate:
        spec:
          containers:
            - name: elasticsearch
              # resources:
              #   limits:
              #     memory: 2Gi
              #     cpu: 2
              # env:
              #   - name: ES_JAVA_OPTS
              #     value: "-Xms2g -Xmx4g"
              volumeMounts:
                - name: elasticsearch-data-prod
                  mountPath: /usr/share/production/elasticsearch/data
      volumeClaimTemplates:
        - metadata:
            name: elasticsearch-data-prod
          spec:
            accessModes:
              - ReadWriteOnce
            storageClassName: standard
            resources:
              requests:
                storage: 10Gi
---
apiVersion: kibana.k8s.elastic.co/v1
kind: Kibana
metadata:
  name: data-kibana
  namespace: production
spec:
  version: 8.4.3
  count: 1
  elasticsearchRef:
    name: data-es

Kubernetes Persistent Volume is not working on GCE

I am trying to make my Elasticsearch pods persistent so that data is preserved when the deployment or pods are recreated. Elasticsearch is part of a Graylog2 setup.
After I set everything up, I sent a few logs to Graylog and I could see them appear on the dashboard. However, I deleted the elasticsearch pod, and after it was recreated all the data was lost from the Graylog dashboard.
I am using GCE.
Here is my persistent volume config:
kind: PersistentVolume
apiVersion: v1
metadata:
  name: elastic-pv
  labels:
    type: gcePD
spec:
  capacity:
    storage: 200Gi
  accessModes:
    - ReadWriteOnce
  gcePersistentDisk:
    fsType: ext4
    pdName: elastic-pv-disk
Persistent volume claim config:
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: elastic-pvc
  labels:
    type: gcePD
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 200Gi
and here is my elasticsearch deployment:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: elastic-deployment
spec:
  replicas: 1
  template:
    metadata:
      labels:
        type: elasticsearch
    spec:
      containers:
        - name: elastic-container
          image: gcr.io/project/myelasticsearch:v1
          imagePullPolicy: Always
          ports:
            - containerPort: 9300
              name: first-port
              protocol: TCP
            - containerPort: 9200
              name: second-port
              protocol: TCP
          volumeMounts:
            - name: elastic-pd
              mountPath: /data/db
      volumes:
        - name: elastic-pd
          persistentVolumeClaim:
            claimName: elastic-pvc
Output of kubectl describe pod:
Name: elastic-deployment-1423685295-jt6x5
Namespace: default
Node: gke-sd-logger-default-pool-2b3affc0-299k/10.128.0.6
Start Time: Tue, 09 May 2017 22:59:59 +0500
Labels: pod-template-hash=1423685295
type=elasticsearch
Status: Running
IP: 10.12.0.11
Controllers: ReplicaSet/elastic-deployment-1423685295
Containers:
  elastic-container:
    Container ID:   docker://8774c747e2a56363f657a583bf5c2234ed2cff64dc21b6319fc53fdc5c1a6b2b
    Image:          gcr.io/thematic-flash-786/myelasticsearch:v1
    Image ID:       docker://sha256:7c25be62dbad39c07c413888e275ae419a66070d37e0d98bf5008e15d7720eec
    Ports:          9300/TCP, 9200/TCP
    Requests:
      cpu:          100m
    State:          Running
      Started:      Tue, 09 May 2017 23:02:11 +0500
    Ready:          True
    Restart Count:  0
    Volume Mounts:
      /data/db from elastic-pd (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-qtdbb (ro)
    Environment Variables:  <none>
Conditions:
  Type           Status
  Initialized    True
  Ready          True
  PodScheduled   True
Volumes:
  elastic-pd:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  elastic-pvc
    ReadOnly:   false
  default-token-qtdbb:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-qtdbb
QoS Class:     Burstable
Tolerations:   <none>
No events.
Output of kubectl describe pv:
Name: elastic-pv
Labels: type=gcePD
StorageClass:
Status: Bound
Claim: default/elastic-pvc
Reclaim Policy: Retain
Access Modes: RWO
Capacity: 200Gi
Message:
Source:
Type: GCEPersistentDisk (a Persistent Disk resource in Google Compute Engine)
PDName: elastic-pv-disk
FSType: ext4
Partition: 0
ReadOnly: false
No events.
Output of kubectl describe pvc:
Name: elastic-pvc
Namespace: default
StorageClass:
Status: Bound
Volume: elastic-pv
Labels: type=gcePD
Capacity: 200Gi
Access Modes: RWO
No events.
Confirmation that the real disk exists:
What could be the reason the PersistentVolume is not persistent?
In the official images, the Elasticsearch data is stored at /usr/share/elasticsearch/data, not /data/db. It would appear that you need to update the mount to /usr/share/elasticsearch/data to get the data stored on the persistent volume.
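A minimal sketch of the corrected mount, reusing the volume name from the deployment above (alternatively, path.data can be pointed at /data/db in elasticsearch.yml):
volumeMounts:
  - name: elastic-pd
    mountPath: /usr/share/elasticsearch/data   # default data path in the official images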
