I have a k8s CronJob which exports metrics periodically. There is also a k8s secret.yaml, and from it I execute a script called run.sh.
Within run.sh I would like to refer to another script, but I can't seem to find the right way to access it or specify it in cronjob.yaml.
cronjob.yaml
apiVersion: batch/v1
kind: CronJob
metadata:
name: exporter
labels:
app: metrics-exporter
spec:
schedule: "* * * * *"
jobTemplate:
spec:
template:
metadata:
labels:
app: exporter
spec:
volumes:
- name: db-dir
emptyDir: { }
- name: home-dir
emptyDir: { }
- name: run-sh
secret:
secretName: exporter-run-sh
- name: clusters
emptyDir: { }
containers:
- name: stats-exporter
image: XXXX
imagePullPolicy: IfNotPresent
command:
- /bin/bash
- /home/scripts/run.sh
resources: { }
volumeMounts:
- name: db-dir
mountPath: /.db
- name: home-dir
mountPath: /home
- name: run-sh
mountPath: /home/scripts
- name: clusters
mountPath: /db-clusters
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
securityContext:
capabilities:
drop:
- ALL
privileged: false
runAsUser: 1000
runAsNonRoot: true
readOnlyRootFilesystem: false
allowPrivilegeEscalation: false
terminationGracePeriodSeconds: 30
restartPolicy: OnFailure
Here's how, in secret.yaml, I run run.sh and refer to another script inside /db-clusters.
apiVersion: v1
kind: Secret
metadata:
name: exporter-run-sh
type: Opaque
stringData:
run.sh: |
#!/bin/sh
source $(dirname $0)/db-clusters/cluster1.sh
# further work here
Here's the error message:
/home/scripts/run.sh: line 57: /home/scripts/db-clusters/cluster1.sh: No such file or directory
As per the Secret YAML, in stringData you need to use "run-sh", since you are including the Secret through the run-sh volume mount.
Have a try using run-sh in stringData as below:
apiVersion: v1
kind: Secret
metadata:
name: exporter-run-sh
type: Opaque
stringData:
run-sh: |
#!/bin/sh
source $(dirname $0)/db-clusters/cluster1.sh
# further work here
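For what it's worth, each key under stringData ends up as a file under the volume's mountPath. So another option, purely a sketch on my part (assuming cluster1.sh is small and non-sensitive enough to live in the same Secret; its contents below are illustrative), is to keep the original run.sh key and ship the second script alongside it:
apiVersion: v1
kind: Secret
metadata:
  name: exporter-run-sh
type: Opaque
stringData:
  run.sh: |
    #!/bin/sh
    # cluster1.sh is mounted alongside run.sh under the same mountPath
    source "$(dirname "$0")/cluster1.sh"
    # further work here
  cluster1.sh: |
    #!/bin/sh
    # placeholder for cluster-specific settings (illustrative)
    CLUSTER_NAME=cluster1
Both files then land in /home/scripts via the existing run-sh volume mount, so the relative source resolves without relying on the /db-clusters emptyDir.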
I'm trying to set up an Elasticsearch StatefulSet. I realise there are some similar questions that have been asked, but none help in my circumstance.
The first version of setting up an elasticsearch stateful set worked fine with the following config:
apiVersion: v1
kind: PersistentVolume
metadata:
name: elasticsearch-volume
labels:
type: local
spec:
storageClassName: do-block-storage
capacity:
storage: 100M
accessModes:
- ReadWriteOnce
hostPath:
path: "/data/elasticsearch"
---
apiVersion: v1
kind: PersistentVolumeClaim # Create PVC
metadata:
name: elasticsearch-volume-claim # Sets PVC's name
labels:
app: elasticsearch # Defines app to create PVC for
spec:
storageClassName: do-block-storage
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 100M # Sets PVC's size
---
apiVersion: v1
kind: Service
metadata:
name: elasticsearch
spec:
type: ClusterIP
clusterIP: None
selector:
app: elasticsearch
ports:
- port: 9200 # To get at the elasticsearch container, just hit the service on 9200
targetPort: 9200 # routes to the exposed port on elasticsearch
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: elasticsearch # name of stateful
namespace: default
spec:
serviceName: elasticsearch
replicas: 1
selector:
matchLabels:
app: elasticsearch # should match service > spec.selector.app
template:
metadata:
labels:
app: elasticsearch
spec:
volumes:
- name: elasticsearch-pvc
persistentVolumeClaim:
claimName: elasticsearch-volume-claim
containers:
- name: elasticsearch
image: docker.elastic.co/elasticsearch/elasticsearch:8.2.3
resources:
limits:
cpu: 100m
requests:
cpu: 100m
ports:
- containerPort: 9200
name: rest
protocol: TCP
- containerPort: 9300
name: inter-node
protocol: TCP
volumeMounts:
- name: elasticsearch-pvc
mountPath: /usr/share/elasticsearch/data
env:
- name: cluster.name
value: search
- name: node.name
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: discovery.type
value: single-node
- name: ES_JAVA_OPTS
value: "-Xms512m -Xmx512m"
- name: xpack.security.enabled
value: "false"
initContainers:
- name: fix-permissions
image: busybox
command:
["sh", "-c", "chown -R 1000:1000 /usr/share/elasticsearch/data"]
securityContext:
privileged: true
volumeMounts:
- name: elasticsearch-pvc
mountPath: /usr/share/elasticsearch/data
- name: increase-vm-max-map
image: busybox
command: ["sysctl", "-w", "vm.max_map_count=262144"]
securityContext:
privileged: true
- name: increase-fd-ulimit
image: busybox
command: ["sh", "-c", "ulimit -n 65536"]
securityContext:
privileged: true
I then tried to implement a version of this with multiple replicas:
apiVersion: v1
kind: Service
metadata:
name: elasticsearch
spec:
type: ClusterIP
clusterIP: None
selector:
app: elasticsearch
ports:
- port: 9200 # To get at the elasticsearch container, just hit the service on 9200
targetPort: 9200 # routes to the exposed port on elasticsearch
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: es-cluster # name of stateful
spec:
serviceName: elasticsearch
replicas: 2
selector:
matchLabels:
app: elasticsearch # should match service > spec.selector.app
volumeClaimTemplates:
- metadata:
name: elasticsearch-pvc
labels:
app: elasticsearch
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 100M
storageClassName: do-block-storage
template:
metadata:
labels:
app: elasticsearch
spec:
# volumes:
# - name: elasticsearch-pvc
# persistentVolumeClaim:
# claimName: elasticsearch-volume-claim
containers:
- name: elasticsearch
image: docker.elastic.co/elasticsearch/elasticsearch:8.2.3
resources:
limits:
cpu: 100m
requests:
cpu: 100m
ports:
- containerPort: 9200
name: rest
protocol: TCP
- containerPort: 9300
name: inter-node
protocol: TCP
volumeMounts:
- name: elasticsearch-pvc
mountPath: /usr/share/elasticsearch/data
env:
- name: cluster.name
value: search
- name: node.name
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: discovery.seed_hosts
value: "es-cluster-0.elasticsearch,es-cluster-1.elasticsearch,es-cluster-2.elasticsearch"
- name: cluster.initial_master_nodes
value: "es-cluster-0,es-cluster-1,es-cluster-2"
- name: ES_JAVA_OPTS
value: "-Xms512m -Xmx512m"
initContainers:
- name: fix-permissions
image: busybox
command:
["sh", "-c", "chown -R 1000:1000 /usr/share/elasticsearch/data"]
securityContext:
privileged: true
volumeMounts:
- name: elasticsearch-pvc
mountPath: /usr/share/elasticsearch/data
- name: increase-vm-max-map
image: busybox
command: ["sysctl", "-w", "vm.max_map_count=262144"]
securityContext:
privileged: true
- name: increase-fd-ulimit
image: busybox
command: ["sh", "-c", "ulimit -n 65536"]
securityContext:
privileged: true
However I ran into the error: 0/2 nodes are available: 2 pod has unbound immediate PersistentVolumeClaims.
I subsequently reduced the replicas to just 1 and manually created the PV, in case DO was having an issue creating the PVC without a PV (even though DO should dynamically create the PVC and PV, since that works with the multi-replica Postgres StatefulSet which I set up in exactly the same way):
apiVersion: v1
kind: PersistentVolume
metadata:
name: es-volume-1
spec:
capacity:
storage: 100M
accessModes:
- ReadWriteOnce
persistentVolumeReclaimPolicy: Retain
storageClassName: do-block-storage
hostPath:
path: "/data/elasticsearch"
nodeAffinity:
required:
nodeSelectorTerms:
- matchExpressions:
- key: kubernetes.io/hostname
operator: In
values:
- es-cluster-0
This again yielded the error: 0/2 nodes are available: 2 pod has unbound immediate PersistentVolumeClaims.
After spending a while debugging, I gave up and decided to revert to my single-replica Elasticsearch StatefulSet using the method I had originally used.
But once again I got the error: 0/2 nodes are available: 2 pod has unbound immediate PersistentVolumeClaims!
I don't have a clue what's going on here. Why am I getting this error even though I'm only trying to create a single replica, and I have manually defined the PV and PVC, which worked fine before?
It turns out the issue was indeed DigitalOcean-specific. In the second attempt, when I tried to create multiple replicas, I had to use dynamic volume provisioning via volumeClaimTemplates and set the storage class to do-block-storage, which as it turns out has a minimum size of 1Gi!
Once I updated the request to 1Gi, it all started working.
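For reference, a minimal sketch of the corrected volumeClaimTemplates block (everything except the storage request mirrors the manifest above):
volumeClaimTemplates:
  - metadata:
      name: elasticsearch-pvc
      labels:
        app: elasticsearch
    spec:
      accessModes:
        - ReadWriteOnce
      storageClassName: do-block-storage
      resources:
        requests:
          storage: 1Gi # do-block-storage cannot provision volumes smaller than 1Gi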
I love Elasticsearch, so on my new project I have been trying to make it work on Kubernetes with Skaffold.
This is the YAML file I wrote:
apiVersion: apps/v1
kind: Deployment
metadata:
name: eks-depl
spec:
replicas: 1
selector:
matchLabels:
app: eks
template:
metadata:
labels:
app: eks
spec:
containers:
- name: eks
image: elasticsearch:7.17.0
---
apiVersion: v1
kind: Service
metadata:
name: eks-srv
spec:
selector:
app: eks
ports:
- name: db
protocol: TCP
port: 9200
targetPort: 9200
- name: monitoring
protocol: TCP
port: 9300
targetPort: 9300
After I run skaffold dev, Kubernetes shows it as running, but after a few seconds it crashes and goes down.
I can't understand what I am doing wrong.
After I updated my config files as Mr. Harsh Manvar suggested, it worked like a charm, but currently I am facing another issue. The client side says the following....
Btw, I am using Elasticsearch version 7.11.1 and the client-side module "@elastic/elasticsearch" ^7.11.1.
Here is an example YAML file to consider if you are planning to run a single-node Elasticsearch cluster on Kubernetes:
apiVersion: apps/v1
kind: StatefulSet
metadata:
labels:
app: elasticsearch
component: elasticsearch
release: elasticsearch
name: elasticsearch
namespace: default
spec:
podManagementPolicy: OrderedReady
replicas: 1
revisionHistoryLimit: 10
selector:
matchLabels:
app: elasticsearch
component: elasticsearch
release: elasticsearch
serviceName: elasticsearch
template:
metadata:
labels:
app: elasticsearch
component: elasticsearch
release: elasticsearch
spec:
containers:
- env:
- name: cluster.name
value: es_cluster
- name: ELASTIC_PASSWORD
value: xyz-xyz
- name: discovery.type
value: single-node
- name: path.repo
value: backup/es-backup
- name: ES_JAVA_OPTS
value: -Xms512m -Xmx512m
- name: bootstrap.memory_lock
value: "false"
- name: xpack.security.enabled
value: "true"
image: elasticsearch:7.3.2
imagePullPolicy: IfNotPresent
name: elasticsearch
ports:
- containerPort: 9200
name: http
protocol: TCP
- containerPort: 9300
name: transport
protocol: TCP
resources:
limits:
cpu: 451m
memory: 1250Mi
requests:
cpu: 250m
memory: 1000Mi
securityContext:
privileged: true
runAsUser: 1000
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /usr/share/elasticsearch/data
name: elasticsearch-data
dnsPolicy: ClusterFirst
initContainers:
- command:
- sh
- -c
- chown -R 1000:1000 /usr/share/elasticsearch/data
- sysctl -w vm.max_map_count=262144
- chmod 777 /usr/share/elasticsearch/data
- chmod 777 /usr/share/elasticsearch/data/node
- chmod g+rwx /usr/share/elasticsearch/data
- chgrp 1000 /usr/share/elasticsearch/data
image: busybox:1.29.2
imagePullPolicy: IfNotPresent
name: set-dir-owner
resources: {}
securityContext:
privileged: true
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /usr/share/elasticsearch/data
name: elasticsearch-data
restartPolicy: Always
schedulerName: default-scheduler
securityContext: {}
terminationGracePeriodSeconds: 10
updateStrategy:
type: OnDelete
volumeClaimTemplates:
- metadata:
creationTimestamp: null
name: elasticsearch-data
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 10Gi
volumeMode: Filesystem
I would also recommend checking out the Elasticsearch Helm charts:
1. https://github.com/elastic/helm-charts/tree/master/elasticsearch
2. https://github.com/helm/charts/tree/master/stable/elasticsearch
You can expose the above StatefulSet using a Service and use it further with your application.
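For completeness, a sketch of such a Service, keyed on the app: elasticsearch label used in the StatefulSet above (adjust names and namespace to your setup):
apiVersion: v1
kind: Service
metadata:
  name: elasticsearch
  namespace: default
spec:
  clusterIP: None # headless, so pods get stable DNS names matching serviceName above
  selector:
    app: elasticsearch
  ports:
    - name: http
      port: 9200
      targetPort: 9200
    - name: transport
      port: 9300
      targetPort: 9300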
I am backing up my PostgreSQL database using this CronJob:
apiVersion: batch/v1beta1
kind: CronJob
metadata:
name: postgres-backup
spec:
schedule: "0 2 * * *"
jobTemplate:
spec:
template:
spec:
containers:
- name: postgres-backup
image: postgres:10.4
command: ["/bin/sh"]
args: ["-c", 'echo "$PGPASS" > /root/.pgpass && chmod 600 /root/.pgpass && pg_dump -Fc -h <host> -U <user> <db> > /var/backups/backup.dump']
env:
- name: PGPASS
valueFrom:
secretKeyRef:
name: pgpass
key: pgpass
volumeMounts:
- mountPath: /var/backups
name: postgres-backup-storage
restartPolicy: Never
volumes:
- name: postgres-backup-storage
hostPath:
path: /var/volumes/postgres-backups
type: DirectoryOrCreate
The CronJob gets executed successfully and the backup is made and saved inside the Job's container, but that container is stopped after the script finishes.
Of course I want to access the backup files in the container, but I can't because it is stopped/terminated.
Is there a way to execute shell commands in a container after it has terminated, so I can access the backup files saved in it?
I know that I could do that on the node, but I don't have permission to access it.
#confused genius gave me the great idea of creating a second, identical container to access the dump files, so this is the solution that works:
apiVersion: batch/v1beta1
kind: CronJob
metadata:
name: postgres-backup
spec:
schedule: "0 2 * * *"
jobTemplate:
spec:
template:
spec:
containers:
- name: postgres-backup
image: postgres:10.4
command: ["/bin/sh"]
args: ["-c", 'echo "$PGPASS" > /root/.pgpass && chmod 600 /root/.pgpass && pg_dump -Fc -h <host> -U <user> <db> > /var/backups/backup.dump']
env:
- name: PGPASS
valueFrom:
secretKeyRef:
name: dev-pgpass
key: pgpass
volumeMounts:
- mountPath: /var/backups
name: postgres-backup-storage
- name: postgres-restore
image: postgres:10.4
volumeMounts:
- mountPath: /var/backups
name: postgres-backup-storage
restartPolicy: Never
volumes:
- name: postgres-backup-storage
hostPath:
# Ensure the file directory is created.
path: /var/volumes/postgres-backups
type: DirectoryOrCreate
After that, one just needs to sh into the "postgres-restore" container and access the dump files.
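If the extra container exits before you can get into it (the stock postgres image tries to start a database server by default), one option, purely my assumption rather than part of the original answer, is to give it a long-running no-op command so the pod stays around for kubectl exec; note the Job will then show as Running until you clean it up:
- name: postgres-restore
  image: postgres:10.4
  # keep the container alive so the dump files stay reachable via kubectl exec
  command: ["/bin/sh", "-c", "sleep infinity"]
  volumeMounts:
    - mountPath: /var/backups
      name: postgres-backup-storage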
Thanks!
I am trying to deploy an Elasticsearch StatefulSet with the storage provisioned by a rook-ceph storage class.
The pod is stuck in Pending because of:
Warning FailedScheduling 87s default-scheduler 0/4 nodes are available: 4 pod has unbound immediate PersistentVolumeClaims.
The StatefulSet looks like this:
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: elasticsearch
namespace: rcc
labels:
app: elasticsearch
tier: backend
type: db
spec:
serviceName: es
replicas: 3
selector:
matchLabels:
app: elasticsearch
tier: backend
type: db
template:
metadata:
labels:
app: elasticsearch
tier: backend
type: db
spec:
terminationGracePeriodSeconds: 300
initContainers:
- name: fix-the-volume-permission
image: busybox
command:
- sh
- -c
- chown -R 1000:1000 /usr/share/elasticsearch/data
securityContext:
privileged: true
volumeMounts:
- name: data
mountPath: /usr/share/elasticsearch/data
- name: increase-the-vm-max-map-count
image: busybox
command:
- sysctl
- -w
- vm.max_map_count=262144
securityContext:
privileged: true
- name: increase-the-ulimit
image: busybox
command:
- sh
- -c
- ulimit -n 65536
securityContext:
privileged: true
containers:
- name: elasticsearch
image: docker.elastic.co/elasticsearch/elasticsearch-oss:7.7.1
ports:
- containerPort: 9200
name: http
- containerPort: 9300
name: tcp
resources:
requests:
memory: 4Gi
limits:
memory: 6Gi
env:
- name: cluster.name
value: elasticsearch
- name: node.name
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: discovery.zen.ping.unicast.hosts
value: "elasticsearch-0.es.default.svc.cluster.local,elasticsearch-1.es.default.svc.cluster.local,elasticsearch-2.es.default.svc.cluster.local"
- name: ES_JAVA_OPTS
value: -Xms4g -Xmx4g
volumeMounts:
- name: data
mountPath: /usr/share/elasticsearch/data
volumeClaimTemplates:
- metadata:
name: data
spec:
accessModes:
- ReadWriteOnce
storageClassName: rook-cephfs
resources:
requests:
storage: 5Gi
and the reason the PVC is not getting created is:
Normal FailedBinding 47s (x62 over 15m) persistentvolume-controller no persistent volumes available for this claim and no storage class is set
Any idea what I am doing wrong?
After adding the rook-ceph-block storage class and changing storageClassName to it, everything worked without any issues.
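For reference, the adjusted volumeClaimTemplates would look roughly like this (a sketch; only storageClassName changes relative to the manifest above):
volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes:
        - ReadWriteOnce
      storageClassName: rook-ceph-block
      resources:
        requests:
          storage: 5Gi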
I am using the .yaml file below to create a Katib Experiment in Kubeflow. However, I am getting
Failed to reconcile: cannot restore struct from: string
errors. Does anyone have a solution for this? Most of the Katib Experiment examples don't include a volume, but I'm trying to mount a volume after downloading the data from my S3 bucket.
apiVersion: "kubeflow.org/v1alpha3"
kind: Experiment
metadata:
namespace: apple
labels:
controller-tools.k8s.io: "1.0"
name: transformer-experiment
spec:
objective:
type: maximize
goal: 0.8
objectiveMetricName: Train-accuracy
additionalMetricNames:
- Train-loss
algorithm:
algorithmName: random
parallelTrialCount: 3
maxTrialCount: 12
maxFailedTrialCount: 3
metricsCollectorSpec:
collector:
kind: StdOut
parameters:
- name: --lr
parameterType: double
feasibleSpace:
min: "0.01"
max: "0.03"
- name: --dropout_rate
parameterType: double
feasibleSpace:
min: "0.005"
max: "0.020"
- name: --layer_count
parameterType: int
feasibleSpace:
min: "2"
max: "5"
- name: --d_model_count
parameterType: categorical
feasibleSpace:
list:
- "64"
- "128"
- "256"
trialTemplate:
goTemplate:
rawTemplate: |-
apiVersion: batch/v1
kind: Job
metadata:
name: {{.Trial}}
namespace: {{.NameSpace}}
spec:
template:
spec:
volumes:
- name: train-data
emptyDir: {}
containers:
- name: data-download
image: amazon/aws-cli
command:
- "aws s3 sync s3://kubeflow/kubeflowdata.tar.gz /train-data"
volumeMounts:
- name: train-data
mountPath: /train-data
- name: {{.Trial}}
image: <Our Image>
command:
- "cd /train-data"
- "ls"
- "python"
- "/opt/ml/src/main.py"
- "--train_batch=64"
- "--test_batch=64"
- "--num_workers=4"
volumeMounts:
- name: train-data
mountPath: /train-data
{{- with .HyperParameters}}
{{- range .}}
- "{{.Name}}={{.Value}}"
{{- end}}
{{- end}}
restartPolicy: Never
As answered here, the following works for me:
apiVersion: batch/v1
kind: Job
spec:
template:
spec:
containers:
- name: training-container
image: docker.io/romeokienzler/claimed-train-mobilenet_v2:0.4
command:
- "ipython"
- "/train-mobilenet_v2.ipynb"
- "optimizer=${trialParameters.optimizer}"
volumeMounts:
- mountPath: /data/
name: data-volume
restartPolicy: Never
volumes:
- name: data-volume
persistentVolumeClaim:
claimName: data-pvc
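For context, the ${trialParameters.optimizer} placeholder suggests the newer kubeflow.org/v1beta1 Experiment API, where the Job above is embedded under trialTemplate.trialSpec and each placeholder is declared in trialParameters. A rough sketch of that wiring, with illustrative names:
trialTemplate:
  primaryContainerName: training-container
  trialParameters:
    - name: optimizer
      description: Optimizer to tune (illustrative)
      reference: optimizer # should match a parameter name under spec.parameters
  trialSpec:
    apiVersion: batch/v1
    kind: Job
    spec:
      template:
        spec:
          # ...containers, volumes and restartPolicy exactly as in the Job above...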