Nifi cluster does not spin up using NifiKop - apache-nifi

I have followed this link https://orange-opensource.github.io/nifikop/docs/next/2_setup/1_getting_started to install NiFiKop and spin up a cluster, however the cluster does not seem to spin up.
Below is the series of commands I executed.
Create namespace:
kubectl create namespace nifi
kubectl create namespace zookeeper
kubectl create namespace cert-manager
Create a custom StorageClass
cat <<EOF | kubectl apply -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage
parameters:
  type: pd-standard
provisioner: kubernetes.io/gce-pd
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
EOF
Create service account
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ServiceAccount
metadata:
  name: nifikop
EOF
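One thing worth double-checking: the ServiceAccount manifest above does not set a namespace, so it lands in whatever namespace the current kubectl context points to, while the NifiCluster below references serviceAccountName: nifikop in the nifi namespace. A rough check (names taken from the commands above):
kubectl get serviceaccount nifikop -n nifi
# if it is missing there, recreate it in the nifi namespace
cat <<EOF | kubectl apply -n nifi -f -
apiVersion: v1
kind: ServiceAccount
metadata:
  name: nifikop
EOF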
Install zookeeper
helm install nifikop-zk bitnami/zookeeper \
--namespace=nifi \
--set resources.requests.memory=256Mi \
--set resources.requests.cpu=250m \
--set resources.limits.memory=256Mi \
--set resources.limits.cpu=250m \
--set networkPolicy.enabled=true \
--set replicaCount=3 \
--set namespaces={"nifi"}
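To confirm ZooKeeper is up before creating the cluster, I check the pods and the client service; the service name should match the zkAddress used further down (nifikop-zk-zookeeper:2181). Roughly:
kubectl get pods -n nifi
kubectl get svc -n nifi
# expect a nifikop-zk-zookeeper ClusterIP service exposing port 2181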
Install CRDs
kubectl apply -f https://github.com/jetstack/cert-manager/releases/download/v1.1.0/cert-manager.yaml
kubectl apply -f https://raw.githubusercontent.com/Orange-OpenSource/nifikop/master/config/crd/bases/nifi.orange.com_nificlusters.yaml
kubectl apply -f https://raw.githubusercontent.com/Orange-OpenSource/nifikop/master/config/crd/bases/nifi.orange.com_nifiusers.yaml
kubectl apply -f https://raw.githubusercontent.com/Orange-OpenSource/nifikop/master/config/crd/bases/nifi.orange.com_nifiusergroups.yaml
kubectl apply -f https://raw.githubusercontent.com/Orange-OpenSource/nifikop/master/config/crd/bases/nifi.orange.com_nifidataflows.yaml
kubectl apply -f https://raw.githubusercontent.com/Orange-OpenSource/nifikop/master/config/crd/bases/nifi.orange.com_nifiparametercontexts.yaml
kubectl apply -f https://raw.githubusercontent.com/Orange-OpenSource/nifikop/master/config/crd/bases/nifi.orange.com_nifiregistryclients.yaml
Install NifiKOP
helm install nifikop orange-incubator/nifikop \
--namespace=nifi \
--version="0.4.2-alpha" \
--set resources.requests.memory=256Mi \
--set resources.requests.cpu=250m \
--set resources.limits.memory=256Mi \
--set resources.limits.cpu=250m \
--set namespaces={"nifi"} --skip-crds
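Because the operator is installed with --skip-crds, I also verify that the CRDs applied above are registered and that the operator deployment came up cleanly (deployment name assumed to be nifikop, matching the release name):
kubectl get crds | grep nifi
kubectl get deployment nifikop -n nifi
kubectl logs deployment/nifikop -n nifi --tail=50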
Create cluster
cat <<EOF | kubectl create -n nifi -f -
apiVersion: nifi.konpyutaika.com/v1
kind: NifiCluster
metadata:
  name: simplenifi
spec:
  service:
    headlessEnabled: true
  zkAddress: "nifikop-zk-zookeeper:2181"
  zkPath: /simplenifi
  clusterImage: "apache/nifi:1.17.0"
  oneNifiNodePerNode: false
  nodeConfigGroups:
    default_group:
      isNode: true
      serviceAccountName: nifikop
      storageConfigs:
        - mountPath: "/opt/nifi/nifi-current/logs"
          name: logs
          pvcSpec:
            accessModes:
              - ReadWriteOnce
            storageClassName: "local-storage"
            resources:
              requests:
                storage: 10Gi
      resourcesRequirements:
        limits:
          cpu: "2"
          memory: 3Gi
        requests:
          cpu: "1"
          memory: 1Gi
  nodes:
    - id: 1
      nodeConfigGroup: "default_group"
    - id: 2
      nodeConfigGroup: "default_group"
  propagateLabels: true
  nifiClusterTaskSpec:
    retryDurationMinutes: 10
  listenersConfig:
    internalListeners:
      - containerPort: 8080
        type: http
        name: http
      - containerPort: 6007
        type: cluster
        name: cluster
      - containerPort: 10000
        type: s2s
        name: s2s
      - containerPort: 9090
        type: prometheus
        name: prometheus
      - containerPort: 6342
        type: load-balance
        name: load-balance
EOF
But I can see only these pods under the nifi namespace:
root@bh-gsn-57-asca-dev-01:~/nifikop# k get pods -n nifi
NAME READY STATUS RESTARTS AGE
nifikop-5d6f94854-fjx4q 1/1 Running 0 32m
nifikop-zk-zookeeper-0 1/1 Running 0 39m
nifikop-zk-zookeeper-1 1/1 Running 0 39m
nifikop-zk-zookeeper-2 1/1 Running 0 39m
root@bh-gsn-57-asca-dev-01:~/nifikop# k get NifiCluster -n nifi
NAME AGE
simplenifi 19m
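Since the NifiCluster object exists but no NiFi node pods are created, the next things I look at are the resource status, the operator logs, and the namespace events; something along these lines (pod name taken from the listing above):
kubectl describe nificluster simplenifi -n nifi
kubectl logs -n nifi nifikop-5d6f94854-fjx4q --tail=100
kubectl get pvc -n nifi
kubectl get events -n nifi --sort-by=.lastTimestamp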

Related

Update istio-ingressgateway with yaml instead of kubectl edit

I test tcp-based service from book...
To complete this task, I need to expose port 31400...
I found that I can do this using this command : KUBE_EDITOR="nano" kubectl edit svc istio-ingressgateway -n istio-system
and enter manually this :
- name: tcp
  nodePort: 30851
  port: 31400
  protocol: TCP
  targetPort: 31400
It works as expected, but how do I do the same task using YAML and kubectl apply?
Thanks for your help,
WCDR
1 - Get current configuration :
$ kubectl get -n istio-system service istio-ingressgateway -o yaml
The output looks like:
apiVersion: v1
kind: Service
metadata:
annotations:
kubectl.kubernetes.io/last-applied-configuration: |
{...,"kind":"Service",..."app":"istio-ingressgateway"...
...
labels:
app: istio-ingressgateway
...
spec:
...
ports:
...
>>>> insert block here <<<<
selector:
...
...
2 - Patch it with yq or manually...
https://github.com/mikefarah/yq
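For example, with mikefarah's yq v4 you could append the port block in place (assuming the output of step 1 was saved to istio-ingressgateway.yaml), or skip the export entirely and JSON-patch the live Service; both are sketches using the values from the question:
yq eval -i '.spec.ports += [{"name": "tcp", "port": 31400, "protocol": "TCP", "targetPort": 31400}]' istio-ingressgateway.yaml
# or, patching the live Service directly:
kubectl -n istio-system patch svc istio-ingressgateway --type=json \
  -p='[{"op": "add", "path": "/spec/ports/-", "value": {"name": "tcp", "port": 31400, "protocol": "TCP", "targetPort": 31400}}]'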
3 - Apply change :
$ kubectl apply -n istio-system -f - <<EOF
apiVersion: v1
kind: Service
...
EOF
The output should be:
service/istio-ingressgateway configured
Enjoy...

PostStart hook exited with 126

I need to copy some configuration files already present in a location B to a location A where I have mounted a persistent volume, in the same container.
For that I tried to configure a postStart hook as follows:
lifecycle:
  postStart:
    exec:
      command:
        - "sh"
        - "-c"
        - >
          if [! -d "/opt/A/data" ] ; then
            cp -rp /opt/B/. /opt/A;
          fi;
          rm -rf /opt/B
but it exited with 126.
Any tips, please?
You should give a space after the first bracket [. The following Deployment works:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx
          ports:
            - containerPort: 80
          lifecycle:
            postStart:
              exec:
                command:
                  - "sh"
                  - "-c"
                  - >
                    if [ ! -d "/suren" ] ; then
                      cp -rp /docker-entrypoint.sh /home/;
                    fi;
                    rm -rf /docker-entrypoint.sh
So, this nginx container ships with a docker-entrypoint.sh script by default. After the container has started, it won't find the directory /suren, so the if condition evaluates to true: it copies the script into the /home directory and then removes the script from the root.
# kubectl exec nginx-8d7cc6747-5nvwk 2> /dev/null -- ls /home/
docker-entrypoint.sh
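As a side note, when a postStart hook fails like this, the hook's output and exit code are reported in the pod events, which is usually the fastest way to see why sh returned 126; for example:
kubectl describe pod <pod-name> | grep -A5 FailedPostStartHook
kubectl get events --field-selector reason=FailedPostStartHook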
Here is the yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: oracledb
  labels:
    app: oracledb
spec:
  selector:
    matchLabels:
      app: oracledb
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: oracledb
    spec:
      containers:
        - env:
            - name: DB_SID
              value: ORCLCDB
            - name: DB_PDB
              value: pdb
            - name: DB_PASSWD
              value: pwd
          image: oracledb
          imagePullPolicy: IfNotPresent
          name: oracledb
          lifecycle:
            postStart:
              exec:
                command:
                  - "sh"
                  - "-c"
                  - >
                    if [ ! -d "/opt/oracle/oradata/ORCLCDB" ] ; then
                      cp -rp /opt/oracle/files/* /opt/oracle/oradata;
                    fi;
                    rm -rf /opt/oracle/files/
          volumeMounts:
            - mountPath: /opt/oracle/oradata
              name: oradata
      securityContext:
        fsGroup: 54321
      terminationGracePeriodSeconds: 30
      volumes:
        - name: oradata
          persistentVolumeClaim:
            claimName: oradata

Kubernetes Helm Elastic Stack CrashLoopBackOff with Java errors in log

I'm trying to deploy the ELK stack to my development Kubernetes cluster. It seems that I did everything as described in the tutorials; however, the pods keep failing with Java errors (see below). I will describe the whole process from installing the cluster until the error happens.
Step 1: Installing the cluster
# Apply sysctl params without reboot
cat <<EOF | sudo tee /etc/modules-load.d/containerd.conf
overlay
br_netfilter
EOF
sudo modprobe overlay
sudo modprobe br_netfilter
# Setup required sysctl params, these persist across reboots.
cat <<EOF | sudo tee /etc/sysctl.d/99-kubernetes-cri.conf
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-ip6tables = 1
EOF
sudo sysctl --system
#update and install apt https stuff
sudo apt-get update
sudo apt-get install apt-transport-https ca-certificates curl gnupg lsb-release
# add docker repo for containerd and install it
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
echo \
"deb [arch=amd64 signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu \
$(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update
sudo apt-get install -y containerd.io
# copy config
sudo mkdir -p /etc/containerd
containerd config default | sudo tee /etc/containerd/config.toml
sudo systemctl restart containerd
cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
br_netfilter
EOF
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
# these two are somewhat redundant with the settings in 99-kubernetes-cri.conf above
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
sudo sysctl --system
#install kubernetes binaries
sudo curl -fsSLo /usr/share/keyrings/kubernetes-archive-keyring.gpg https://packages.cloud.google.com/apt/doc/apt-key.gpg
echo "deb [signed-by=/usr/share/keyrings/kubernetes-archive-keyring.gpg] https://apt.kubernetes.io/ kubernetes-xenial main" | sudo tee /etc/apt/sources.list.d/kubernetes.list
sudo apt-get update
sudo apt-get install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl
#disable swap and comment swap in fstab
sudo swapoff -v /dev/mapper/main-swap
sudo nano /etc/fstab
#init cluster
sudo kubeadm init --pod-network-cidr=192.168.0.0/16
#make user to kubectl admin
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
#install calico
kubectl apply -f
kubectl create -f https://docs.projectcalico.org/manifests/tigera-operator.yaml
kubectl create -f https://docs.projectcalico.org/manifests/custom-resources.yaml
#untaint master node that pods can run on it
kubectl taint nodes --all node-role.kubernetes.io/master-
#install helm
curl https://baltocdn.com/helm/signing.asc | sudo apt-key add -
sudo apt-get install apt-transport-https --yes
echo "deb https://baltocdn.com/helm/stable/debian/ all main" | sudo tee /etc/apt/sources.list.d/helm-stable-debian.list
sudo apt-get update
sudo apt-get install helm
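Before moving on I do a quick sanity check that the node is Ready, the system pods are running, and helm is usable:
kubectl get nodes -o wide
kubectl get pods -n kube-system
helm version --short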
Step 2: Install ECK (https://www.elastic.co/guide/en/cloud-on-k8s/current/k8s-install-helm.html) and elasticsearch (https://github.com/elastic/helm-charts/blob/master/elasticsearch/README.md#installing)
# add helm repo
helm repo add elastic https://helm.elastic.co
helm repo update
# install eck
#### omitted as suggested in the comment section: helm install elastic-operator elastic/eck-operator -n elastic-system --create-namespace
helm install elasticsearch elastic/elasticsearch
Step 3: Add PersistentVolume
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: elk-data1
  labels:
    type: local
spec:
  capacity:
    storage: 30Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/mnt/data1"
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: elk-data2
  labels:
    type: local
spec:
  capacity:
    storage: 30Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/mnt/data2"
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: elk-data3
  labels:
    type: local
spec:
  capacity:
    storage: 30Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/mnt/data3"
Create the mount directories and apply it:
sudo mkdir /mnt/data1
sudo mkdir /mnt/data2
sudo mkdir /mnt/data3
kubectl apply -f storage.yaml
Now the pods (or at least one) should run. But I keep getting STATUS CrashLoopBackOff with Java errors in the log.
kubectl get pv,pvc,pods
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
persistentvolume/elk-data1 30Gi RWO Retain Bound default/elasticsearch-master-elasticsearch-master-1 140m
persistentvolume/elk-data2 30Gi RWO Retain Bound default/elasticsearch-master-elasticsearch-master-2 140m
persistentvolume/elk-data3 30Gi RWO Retain Bound default/elasticsearch-master-elasticsearch-master-0 140m
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
persistentvolumeclaim/elasticsearch-master-elasticsearch-master-0 Bound elk-data3 30Gi RWO 141m
persistentvolumeclaim/elasticsearch-master-elasticsearch-master-1 Bound elk-data1 30Gi RWO 141m
persistentvolumeclaim/elasticsearch-master-elasticsearch-master-2 Bound elk-data2 30Gi RWO 141m
NAME READY STATUS RESTARTS AGE
pod/elasticsearch-master-0 0/1 CrashLoopBackOff 32 141m
pod/elasticsearch-master-1 0/1 Pending 0 141m
pod/elasticsearch-master-2 0/1 Pending 0 141m
Logs and Error:
kubectl logs pod/elasticsearch-master-2
Exception in thread "main" java.lang.InternalError: java.lang.reflect.InvocationTargetException
at java.base/jdk.internal.platform.Metrics.systemMetrics(Metrics.java:65)
at java.base/jdk.internal.platform.Container.metrics(Container.java:43)
at jdk.management/com.sun.management.internal.OperatingSystemImpl.<init>(OperatingSystemImpl.java:48)
at jdk.management/com.sun.management.internal.PlatformMBeanProviderImpl.getOperatingSystemMXBean(PlatformMBeanProviderImpl.java:279)
at jdk.management/com.sun.management.internal.PlatformMBeanProviderImpl$3.nameToMBeanMap(PlatformMBeanProviderImpl.java:198)
at java.management/java.lang.management.ManagementFactory.lambda$getPlatformMBeanServer$0(ManagementFactory.java:487)
at java.base/java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:273)
at java.base/java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:179)
at java.base/java.util.HashMap$ValueSpliterator.forEachRemaining(HashMap.java:1766)
at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
at java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:596)
at java.management/java.lang.management.ManagementFactory.getPlatformMBeanServer(ManagementFactory.java:488)
at org.apache.logging.log4j.core.jmx.Server.reregisterMBeansAfterReconfigure(Server.java:140)
at org.apache.logging.log4j.core.LoggerContext.setConfiguration(LoggerContext.java:558)
at org.apache.logging.log4j.core.LoggerContext.start(LoggerContext.java:263)
at org.apache.logging.log4j.core.impl.Log4jContextFactory.getContext(Log4jContextFactory.java:207)
at org.apache.logging.log4j.core.config.Configurator.initialize(Configurator.java:220)
at org.apache.logging.log4j.core.config.Configurator.initialize(Configurator.java:197)
at org.elasticsearch.common.logging.LogConfigurator.configureStatusLogger(LogConfigurator.java:248)
at org.elasticsearch.common.logging.LogConfigurator.configureWithoutConfig(LogConfigurator.java:95)
at org.elasticsearch.cli.CommandLoggingConfigurator.configureLoggingWithoutConfig(CommandLoggingConfigurator.java:29)
at org.elasticsearch.cli.Command.main(Command.java:76)
at org.elasticsearch.common.settings.KeyStoreCli.main(KeyStoreCli.java:32)
Caused by: java.lang.reflect.InvocationTargetException
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:78)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:567)
at java.base/jdk.internal.platform.Metrics.systemMetrics(Metrics.java:61)
... 26 more
Caused by: java.lang.ExceptionInInitializerError
at java.base/jdk.internal.platform.CgroupSubsystemFactory.create(CgroupSubsystemFactory.java:107)
at java.base/jdk.internal.platform.CgroupMetrics.getInstance(CgroupMetrics.java:167)
... 31 more
Caused by: java.lang.NullPointerException
at java.base/java.util.Objects.requireNonNull(Objects.java:208)
at java.base/sun.nio.fs.UnixFileSystem.getPath(UnixFileSystem.java:260)
at java.base/java.nio.file.Path.of(Path.java:147)
at java.base/java.nio.file.Paths.get(Paths.java:69)
at java.base/jdk.internal.platform.CgroupUtil.lambda$readStringValue$1(CgroupUtil.java:66)
at java.base/java.security.AccessController.doPrivileged(AccessController.java:554)
at java.base/jdk.internal.platform.CgroupUtil.readStringValue(CgroupUtil.java:68)
at java.base/jdk.internal.platform.CgroupSubsystemController.getStringValue(CgroupSubsystemController.java:65)
at java.base/jdk.internal.platform.CgroupSubsystemController.getLongValue(CgroupSubsystemController.java:124)
at java.base/jdk.internal.platform.cgroupv1.CgroupV1Subsystem.getLongValue(CgroupV1Subsystem.java:272)
at java.base/jdk.internal.platform.cgroupv1.CgroupV1Subsystem.getHierarchical(CgroupV1Subsystem.java:218)
at java.base/jdk.internal.platform.cgroupv1.CgroupV1Subsystem.setPath(CgroupV1Subsystem.java:201)
at java.base/jdk.internal.platform.cgroupv1.CgroupV1Subsystem.setSubSystemControllerPath(CgroupV1Subsystem.java:173)
at java.base/jdk.internal.platform.cgroupv1.CgroupV1Subsystem.lambda$initSubSystem$5(CgroupV1Subsystem.java:113)
at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
at java.base/java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:179)
at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:197)
at java.base/java.util.Iterator.forEachRemaining(Iterator.java:133)
at java.base/java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
at java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:596)
at java.base/jdk.internal.platform.cgroupv1.CgroupV1Subsystem.initSubSystem(CgroupV1Subsystem.java:113)
at java.base/jdk.internal.platform.cgroupv1.CgroupV1Subsystem.<clinit>(CgroupV1Subsystem.java:47)
... 33 more
Exception in thread "main" java.lang.InternalError: java.lang.reflect.InvocationTargetException
at java.base/jdk.internal.platform.Metrics.systemMetrics(Metrics.java:65)
at java.base/jdk.internal.platform.Container.metrics(Container.java:43)
at jdk.management/com.sun.management.internal.OperatingSystemImpl.<init>(OperatingSystemImpl.java:48)
at jdk.management/com.sun.management.internal.PlatformMBeanProviderImpl.getOperatingSystemMXBean(PlatformMBeanProviderImpl.java:279)
at jdk.management/com.sun.management.internal.PlatformMBeanProviderImpl$3.nameToMBeanMap(PlatformMBeanProviderImpl.java:198)
at java.management/sun.management.spi.PlatformMBeanProvider$PlatformComponent.getMBeans(PlatformMBeanProvider.java:195)
at java.management/java.lang.management.ManagementFactory.getPlatformMXBean(ManagementFactory.java:686)
at java.management/java.lang.management.ManagementFactory.getOperatingSystemMXBean(ManagementFactory.java:388)
at org.elasticsearch.tools.launchers.DefaultSystemMemoryInfo.<init>(DefaultSystemMemoryInfo.java:28)
at org.elasticsearch.tools.launchers.JvmOptionsParser.jvmOptions(JvmOptionsParser.java:125)
at org.elasticsearch.tools.launchers.JvmOptionsParser.main(JvmOptionsParser.java:86)
Caused by: java.lang.reflect.InvocationTargetException
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:78)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:567)
at java.base/jdk.internal.platform.Metrics.systemMetrics(Metrics.java:61)
... 10 more
Caused by: java.lang.ExceptionInInitializerError
at java.base/jdk.internal.platform.CgroupSubsystemFactory.create(CgroupSubsystemFactory.java:107)
at java.base/jdk.internal.platform.CgroupMetrics.getInstance(CgroupMetrics.java:167)
... 15 more
Caused by: java.lang.NullPointerException
at java.base/java.util.Objects.requireNonNull(Objects.java:208)
at java.base/sun.nio.fs.UnixFileSystem.getPath(UnixFileSystem.java:260)
at java.base/java.nio.file.Path.of(Path.java:147)
at java.base/java.nio.file.Paths.get(Paths.java:69)
at java.base/jdk.internal.platform.CgroupUtil.lambda$readStringValue$1(CgroupUtil.java:66)
at java.base/java.security.AccessController.doPrivileged(AccessController.java:554)
at java.base/jdk.internal.platform.CgroupUtil.readStringValue(CgroupUtil.java:68)
at java.base/jdk.internal.platform.CgroupSubsystemController.getStringValue(CgroupSubsystemController.java:65)
at java.base/jdk.internal.platform.CgroupSubsystemController.getLongValue(CgroupSubsystemController.java:124)
at java.base/jdk.internal.platform.cgroupv1.CgroupV1Subsystem.getLongValue(CgroupV1Subsystem.java:272)
at java.base/jdk.internal.platform.cgroupv1.CgroupV1Subsystem.getHierarchical(CgroupV1Subsystem.java:218)
at java.base/jdk.internal.platform.cgroupv1.CgroupV1Subsystem.setPath(CgroupV1Subsystem.java:201)
at java.base/jdk.internal.platform.cgroupv1.CgroupV1Subsystem.setSubSystemControllerPath(CgroupV1Subsystem.java:173)
at java.base/jdk.internal.platform.cgroupv1.CgroupV1Subsystem.lambda$initSubSystem$5(CgroupV1Subsystem.java:113)
at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
at java.base/java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:179)
at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:197)
at java.base/java.util.Iterator.forEachRemaining(Iterator.java:133)
at java.base/java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
at java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:596)
at java.base/jdk.internal.platform.cgroupv1.CgroupV1Subsystem.initSubSystem(CgroupV1Subsystem.java:113)
at java.base/jdk.internal.platform.cgroupv1.CgroupV1Subsystem.<clinit>(CgroupV1Subsystem.java:47)
... 17 more
values.yaml from helm chart
---
clusterName: "elasticsearch"
nodeGroup: "master"
# The service that non master groups will try to connect to when joining the cluster
# This should be set to clusterName + "-" + nodeGroup for your master group
masterService: ""
# Elasticsearch roles that will be applied to this nodeGroup
# These will be set as environment variables. E.g. node.master=true
roles:
master: "true"
ingest: "true"
data: "true"
remote_cluster_client: "true"
ml: "true"
replicas: 3
minimumMasterNodes: 2
esMajorVersion: ""
# Allows you to add any config files in /usr/share/elasticsearch/config/
# such as elasticsearch.yml and log4j2.properties
esConfig: {}
# elasticsearch.yml: |
# key:
# nestedkey: value
# log4j2.properties: |
# key = value
# Extra environment variables to append to this nodeGroup
# This will be appended to the current 'env:' key. You can use any of the kubernetes env
# syntax here
extraEnvs: []
# - name: MY_ENVIRONMENT_VAR
# value: the_value_goes_here
# Allows you to load environment variables from kubernetes secret or config map
envFrom: []
# - secretRef:
# name: env-secret
# - configMapRef:
# name: config-map
# A list of secrets and their paths to mount inside the pod
# This is useful for mounting certificates for security and for mounting
# the X-Pack license
secretMounts: []
# - name: elastic-certificates
# secretName: elastic-certificates
# path: /usr/share/elasticsearch/config/certs
# defaultMode: 0755
hostAliases: []
#- ip: "127.0.0.1"
# hostnames:
# - "foo.local"
# - "bar.local"
image: "docker.elastic.co/elasticsearch/elasticsearch"
imageTag: "7.12.1"
imagePullPolicy: "IfNotPresent"
podAnnotations: {}
# iam.amazonaws.com/role: es-cluster
# additionals labels
labels: {}
esJavaOpts: "-Xmx1g -Xms1g"
resources:
requests:
cpu: "1000m"
memory: "2Gi"
limits:
cpu: "1000m"
memory: "2Gi"
initResources: {}
# limits:
# cpu: "25m"
# # memory: "128Mi"
# requests:
# cpu: "25m"
# memory: "128Mi"
sidecarResources: {}
# limits:
# cpu: "25m"
# # memory: "128Mi"
# requests:
# cpu: "25m"
# memory: "128Mi"
networkHost: "0.0.0.0"
volumeClaimTemplate:
accessModes: [ "ReadWriteOnce" ]
resources:
requests:
storage: 30Gi
rbac:
create: false
serviceAccountAnnotations: {}
serviceAccountName: ""
podSecurityPolicy:
create: false
name: ""
spec:
privileged: true
fsGroup:
rule: RunAsAny
runAsUser:
rule: RunAsAny
seLinux:
rule: RunAsAny
supplementalGroups:
rule: RunAsAny
volumes:
- secret
- configMap
- persistentVolumeClaim
- emptyDir
persistence:
enabled: true
labels:
# Add default labels for the volumeClaimTemplate of the StatefulSet
enabled: false
annotations: {}
extraVolumes: []
# - name: extras
# emptyDir: {}
extraVolumeMounts: []
# - name: extras
# mountPath: /usr/share/extras
# readOnly: true
extraContainers: []
# - name: do-something
# image: busybox
# command: ['do', 'something']
extraInitContainers: []
# - name: do-something
# image: busybox
# command: ['do', 'something']
# This is the PriorityClass settings as defined in
# https://kubernetes.io/docs/concepts/configuration/pod-priority-preemption/#priorityclass
priorityClassName: ""
# By default this will make sure two pods don't end up on the same node
# Changing this to a region would allow you to spread pods across regions
antiAffinityTopologyKey: "kubernetes.io/hostname"
# Hard means that by default pods will only be scheduled if there are enough nodes for them
# and that they will never end up on the same node. Setting this to soft will do this "best effort"
antiAffinity: "hard"
# This is the node affinity settings as defined in
# https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#node-affinity-beta-feature
nodeAffinity: {}
# The default is to deploy all pods serially. By setting this to parallel all pods are started at
# the same time when bootstrapping the cluster
podManagementPolicy: "Parallel"
# The environment variables injected by service links are not used, but can lead to slow Elasticsearch boot times when
# there are many services in the current namespace.
# If you experience slow pod startups you probably want to set this to `false`.
enableServiceLinks: true
protocol: http
httpPort: 9200
transportPort: 9300
service:
labels: {}
labelsHeadless: {}
type: ClusterIP
nodePort: ""
annotations: {}
httpPortName: http
transportPortName: transport
loadBalancerIP: ""
loadBalancerSourceRanges: []
externalTrafficPolicy: ""
updateStrategy: RollingUpdate
# This is the max unavailable setting for the pod disruption budget
# The default value of 1 will make sure that kubernetes won't allow more than 1
# of your pods to be unavailable during maintenance
maxUnavailable: 1
podSecurityContext:
fsGroup: 1000
runAsUser: 1000
securityContext:
capabilities:
drop:
- ALL
# readOnlyRootFilesystem: true
runAsNonRoot: true
runAsUser: 1000
# How long to wait for elasticsearch to stop gracefully
terminationGracePeriod: 120
sysctlVmMaxMapCount: 262144
readinessProbe:
failureThreshold: 3
initialDelaySeconds: 10
periodSeconds: 10
successThreshold: 3
timeoutSeconds: 5
# https://www.elastic.co/guide/en/elasticsearch/reference/7.12/cluster-health.html#request-params wait_for_status
clusterHealthCheckParams: "wait_for_status=green&timeout=1s"
## Use an alternate scheduler.
## ref: https://kubernetes.io/docs/tasks/administer-cluster/configure-multiple-schedulers/
##
schedulerName: ""
imagePullSecrets: []
nodeSelector: {}
tolerations: []
# Enabling this will publically expose your Elasticsearch instance.
# Only enable this if you have security enabled on your cluster
ingress:
enabled: false
annotations: {}
# kubernetes.io/ingress.class: nginx
# kubernetes.io/tls-acme: "true"
hosts:
- host: chart-example.local
paths:
- path: /
tls: []
# - secretName: chart-example-tls
# hosts:
# - chart-example.local
nameOverride: ""
fullnameOverride: ""
# https://github.com/elastic/helm-charts/issues/63
masterTerminationFix: false
lifecycle: {}
# preStop:
# exec:
# command: ["/bin/sh", "-c", "echo Hello from the postStart handler > /usr/share/message"]
# postStart:
# exec:
# command:
# - bash
# - -c
# - |
# #!/bin/bash
# # Add a template to adjust number of shards/replicas
# TEMPLATE_NAME=my_template
# INDEX_PATTERN="logstash-*"
# SHARD_COUNT=8
# REPLICA_COUNT=1
# ES_URL=http://localhost:9200
# while [[ "$(curl -s -o /dev/null -w '%{http_code}\n' $ES_URL)" != "200" ]]; do sleep 1; done
# curl -XPUT "$ES_URL/_template/$TEMPLATE_NAME" -H 'Content-Type: application/json' -d'{"index_patterns":['\""$INDEX_PATTERN"\"'],"settings":{"number_of_shards":'$SHARD_COUNT',"number_of_replicas":'$REPLICA_COUNT'}}'
sysctlInitContainer:
enabled: true
keystore: []
networkPolicy:
## Enable creation of NetworkPolicy resources. Only Ingress traffic is filtered for now.
## In order for a Pod to access Elasticsearch, it needs to have the following label:
## {{ template "uname" . }}-client: "true"
## Example for default configuration to access HTTP port:
## elasticsearch-master-http-client: "true"
## Example for default configuration to access transport port:
## elasticsearch-master-transport-client: "true"
http:
enabled: false
## if explicitNamespacesSelector is not set or set to {}, only client Pods being in the networkPolicy's namespace
## and matching all criteria can reach the DB.
## But sometimes, we want the Pods to be accessible to clients from other namespaces, in this case, we can use this
## parameter to select these namespaces
##
# explicitNamespacesSelector:
# # Accept from namespaces with all those different rules (only from whitelisted Pods)
# matchLabels:
# role: frontend
# matchExpressions:
# - {key: role, operator: In, values: [frontend]}
## Additional NetworkPolicy Ingress "from" rules to set. Note that all rules are OR-ed.
##
# additionalRules:
# - podSelector:
# matchLabels:
# role: frontend
# - podSelector:
# matchExpressions:
# - key: role
# operator: In
# values:
# - frontend
transport:
## Note that all Elasticsearch Pods can talks to themselves using transport port even if enabled.
enabled: false
# explicitNamespacesSelector:
# matchLabels:
# role: frontend
# matchExpressions:
# - {key: role, operator: In, values: [frontend]}
# additionalRules:
# - podSelector:
# matchLabels:
# role: frontend
# - podSelector:
# matchExpressions:
# - key: role
# operator: In
# values:
# - frontend
# Deprecated
# please use the above podSecurityContext.fsGroup instead
fsGroup: ""
What you are experiencing is not an issue related to Elasticsearch. It is a problem resulting from the cgroup configuration for the version of containerd you are using. I haven't unpacked the specifics, but the exception in the Elasticsearch logs relates to the JDK failing when attempting to retrieve the required cgroup information.
I had the same issue and resolved it by executing the following steps, before installing Kubernetes, to install a later version of containerd and configure it to use cgroups with systemd:
Add the GPG key for the official Docker repository.
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
Add the Docker repository to APT sources.
sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu focal stable"
Install the latest containerd.io package instead of the containerd package from Ubuntu.
apt-get -y install containerd.io
Generate the default containerd configuration.
containerd config default > /etc/containerd/config.toml
Configure containerd to use systemd to manage the cgroups.
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
runtime_type = "io.containerd.runc.v2"
runtime_engine = ""
runtime_root = ""
privileged_without_host_devices = false
base_runtime_spec = ""
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
SystemdCgroup = true
Restart the containerd service.
systemctl restart containerd
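After the restart it is worth confirming the setting actually took effect, for example:
grep SystemdCgroup /etc/containerd/config.toml
# expect: SystemdCgroup = true
systemctl status containerd --no-pager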
For the ELK stack to work you need all three PersistentVolumeClaims to be bound, as I recall. Instead of creating one 30 GB PV, create three of the same size so each claim can bind, and then re-install. The other nodes have unmet dependencies.
Also, please do not handle the volumes by hand. There are guidelines for deploying dynamic volumes; use OpenEBS, for example. That way you won't need to worry about the PVCs. After providing the PVs, if anything still fails, write again with your cluster installation process.
I was obviously wrong: in this particular problem, filesystems and cgroups play a role, and the main cause is an old problem, affecting versions from 5.2.1 to 8.0.0.
Reinstall the chart by pulling it, edit the values file, and definitely change the container version. It should then be fine, or at least produce a different error stack.
I got it solved by going down to a 7.11.x version.
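If you go the route of pinning an older image, something along these lines should work (the exact tag here is just an example; imageTag is the value shown in the values.yaml above):
helm pull elastic/elasticsearch --untar
# edit elasticsearch/values.yaml as needed, then:
helm upgrade --install elasticsearch ./elasticsearch --set imageTag=7.11.2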
My Kubernetes version is v1.13.x.

Filename of configMap shows up as env in the Pod

I have a file named config.txt, which I used to create a ConfigMap named myconfig inside my minikube cluster.
However, when I use myconfig in a Pod, the name of the file, config.txt, also shows up as part of the environment variables.
How can I correct it?
> cat config.txt
var3=val3
var4=val4
> kubectl create cm myconfig --from-file=config.txt
configmap/myconfig created
> kubectl describe cm myconfig
Name: myconfig
Namespace: default
Labels: <none>
Annotations: <none>
Data
====
config.txt:
----
var3=val3
var4=val4
Events: <none>
Pod definition
> cat nginx.yml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: nginx
  name: nginx
spec:
  containers:
    - image: nginx
      name: nginx
      envFrom:
        - configMapRef:
            name: myconfig
      resources: {}
  dnsPolicy: ClusterFirst
  restartPolicy: Never
status: {}
> kubectl create -f nginx.yml
pod/nginx created
Pod ENV inspection; notice the line config.txt=var3=val3.
I expected it to be just var3=val3.
> kubectl exec -it nginx -- env
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
HOSTNAME=nginx
TERM=xterm
config.txt=var3=val3
var4=val4
KUBERNETES_PORT_443_TCP=tcp://10.96.0.1:443
KUBERNETES_PORT_443_TCP_PROTO=tcp
KUBERNETES_PORT_443_TCP_PORT=443
KUBERNETES_PORT_443_TCP_ADDR=10.96.0.1
KUBERNETES_SERVICE_HOST=10.96.0.1
KUBERNETES_SERVICE_PORT=443
KUBERNETES_SERVICE_PORT_HTTPS=443
KUBERNETES_PORT=tcp://10.96.0.1:443
NGINX_VERSION=1.19.4
NJS_VERSION=0.4.4
PKG_RELEASE=1~buster
HOME=/root
Creating the ConfigMap like this will do the job:
kubectl create cm myconfig --from-env-file=config.txt
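The difference is that --from-file stores the whole file under a single key named config.txt, while --from-env-file turns each line into its own key. A quick way to verify, using the same names as above:
kubectl delete cm myconfig
kubectl create cm myconfig --from-env-file=config.txt
kubectl get cm myconfig -o yaml
# data should now contain var3: val3 and var4: val4 instead of a single config.txt key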

Readiness and Liveness probes for elasticsearch 6.3.0 on Kubernetes failing

I am trying to set up the EFK stack on Kubernetes. The Elasticsearch version being used is 6.3.2. Everything works fine until I place the probes configuration in the deployment YAML file. I am getting the error below. This causes the pod to be declared unhealthy and eventually restarted, which appears to be a false restart.
Warning Unhealthy 15s kubelet, aks-agentpool-23337112-0 Liveness probe failed: Get http://10.XXX.Y.ZZZ:9200/_cluster/health: dial tcp 10.XXX.Y.ZZZ:9200: connect: connection refused
I did try using telnet from a different container to the Elasticsearch pod's IP and port, and it was successful; only the kubelet on the node is unable to reach the pod IP, causing the probes to fail.
Below is the snippet from the pod spec of the Kubernetes StatefulSet YAML. Any assistance on the resolution would be really helpful; I have spent quite a lot of time on this without any clue :(
PS: The stack is being set up on an AKS cluster.
- name: es-data
  image: quay.io/pires/docker-elasticsearch-kubernetes:6.3.2
  env:
    - name: NAMESPACE
      valueFrom:
        fieldRef:
          fieldPath: metadata.namespace
    - name: NODE_NAME
      valueFrom:
        fieldRef:
          fieldPath: metadata.name
    - name: CLUSTER_NAME
      value: myesdb
    - name: NODE_MASTER
      value: "false"
    - name: NODE_INGEST
      value: "false"
    - name: HTTP_ENABLE
      value: "true"
    - name: NODE_DATA
      value: "true"
    - name: DISCOVERY_SERVICE
      value: "elasticsearch-discovery"
    - name: NETWORK_HOST
      value: "_eth0:ipv4_"
    - name: ES_JAVA_OPTS
      value: -Xms512m -Xmx512m
    - name: PROCESSORS
      valueFrom:
        resourceFieldRef:
          resource: limits.cpu
  resources:
    requests:
      cpu: 0.25
    limits:
      cpu: 1
  ports:
    - containerPort: 9200
      name: http
    - containerPort: 9300
      name: transport
  livenessProbe:
    httpGet:
      port: http
      path: /_cluster/health
    initialDelaySeconds: 40
    periodSeconds: 10
  readinessProbe:
    httpGet:
      path: /_cluster/health
      port: http
    initialDelaySeconds: 30
    timeoutSeconds: 10
The pods/containers run just fine without the probes in place. The expectation is that the probes should work when set in the deployment YAML and that the pod should not get restarted.
The thing is that Elasticsearch itself has its own health statuses (red, yellow, green) and you need to take that into account in your configuration.
Here is what I found in my own ES configuration, based on the official ES Helm chart:
readinessProbe:
  failureThreshold: 3
  initialDelaySeconds: 40
  periodSeconds: 10
  successThreshold: 3
  timeoutSeconds: 5
  exec:
    command:
      - sh
      - -c
      - |
        #!/usr/bin/env bash -e
        # If the node is starting up wait for the cluster to be green
        # Once it has started only check that the node itself is responding
        START_FILE=/tmp/.es_start_file
        http () {
          local path="${1}"
          if [ -n "${ELASTIC_USERNAME}" ] && [ -n "${ELASTIC_PASSWORD}" ]; then
            BASIC_AUTH="-u ${ELASTIC_USERNAME}:${ELASTIC_PASSWORD}"
          else
            BASIC_AUTH=''
          fi
          curl -XGET -s -k --fail ${BASIC_AUTH} http://127.0.0.1:9200${path}
        }
        if [ -f "${START_FILE}" ]; then
          echo 'Elasticsearch is already running, lets check the node is healthy'
          http "/"
        else
          echo 'Waiting for elasticsearch cluster to become green'
          if http "/_cluster/health?wait_for_status=green&timeout=1s" ; then
            touch ${START_FILE}
            exit 0
          else
            echo 'Cluster is not yet green'
            exit 1
          fi
        fi
First, please check the logs using
kubectl logs <pod name> -n <namespacename>
You first have to run an init container to change the volume permissions.
You have to run the whole config as user 1000, and before the Elasticsearch container starts, you have to fix the volume permissions using an init container.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  labels:
    app: elasticsearch
    component: elasticsearch
    release: elasticsearch
  name: elasticsearch
spec:
  podManagementPolicy: Parallel
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: elasticsearch
      component: elasticsearch
      release: elasticsearch
  serviceName: elasticsearch
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: elasticsearch
        component: elasticsearch
        release: elasticsearch
    spec:
      containers:
        - env:
            - name: cluster.name
              value: <SET THIS>
            - name: discovery.type
              value: single-node
            - name: ES_JAVA_OPTS
              value: -Xms512m -Xmx512m
            - name: bootstrap.memory_lock
              value: "false"
          image: elasticsearch:6.5.0
          imagePullPolicy: IfNotPresent
          name: elasticsearch
          ports:
            - containerPort: 9200
              name: http
              protocol: TCP
            - containerPort: 9300
              name: transport
              protocol: TCP
          resources:
            limits:
              cpu: 250m
              memory: 1Gi
            requests:
              cpu: 150m
              memory: 512Mi
          securityContext:
            privileged: true
            runAsUser: 1000
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          volumeMounts:
            - mountPath: /usr/share/elasticsearch/data
              name: elasticsearch-data
      dnsPolicy: ClusterFirst
      initContainers:
        - command:
            - sh
            - -c
            # run all the permission fixes in a single shell invocation
            - |
              chown -R 1000:1000 /usr/share/elasticsearch/data
              sysctl -w vm.max_map_count=262144
              chmod 777 /usr/share/elasticsearch/data
              chmod 777 /usr/share/elasticsearch/data/node
              chmod g+rwx /usr/share/elasticsearch/data
              chgrp 1000 /usr/share/elasticsearch/data
          image: busybox:1.29.2
          imagePullPolicy: IfNotPresent
          name: set-dir-owner
          resources: {}
          securityContext:
            privileged: true
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          volumeMounts:
            - mountPath: /usr/share/elasticsearch/data
              name: elasticsearch-data
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 10
  updateStrategy:
    type: OnDelete
  volumeClaimTemplates:
    - metadata:
        creationTimestamp: null
        name: elasticsearch-data
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 10Gi
Check out my YAML config above; you can use it. It's for a single-node Elasticsearch.
The probe outlined in my answer works with 3-node discovery when Istio is present. If the livenessProbe is bad, k8s will restart the container without even allowing it to start properly. I use the internal Elasticsearch port (used for node-to-node communication) to test liveness. This port speaks TCP.
livenessProbe:
  tcpSocket:
    port: 9300
  initialDelaySeconds: 60 # it takes time from the jvm process starting up to the point when the discovery process starts
  timeoutSeconds: 10
- name: discovery.zen.minimum_master_nodes
  value: "2"
- name: discovery.zen.ping.unicast.hosts
  value: elastic
