I have a Kubernetes 1.20 cluster with kubectl 1.20 and the EphemeralContainers feature gate enabled.
I'm trying to run the commands in the kubectl debug documentation, but they don't seem to be working correctly. I can start a pod with:
$ kubectl run ephemeral-demo --image=k8s.gcr.io/pause:3.1 --restart=Never
pod/ephemeral-demo created
And when I try to attach a debug container to it:
$ kubectl debug -it ephemeral-demo --image=busybox --target=ephemeral-demo
Defaulting debug container name to debugger-g6pj6.
I never get a command line, no matter how long I wait or how many times I hit <enter>. If I examine the pod I can see the debug container is present:
$ kubectl describe pod ephemeral-demo
Name: ephemeral-demo
Namespace: nextcloud
Priority: 0
Node: k8s-htz-worker-02/78.47.15.149
Start Time: Tue, 15 Dec 2020 06:36:30 -0600
Labels: run=ephemeral-demo
Annotations: cni.projectcalico.org/podIP: 10.244.2.186/32
cni.projectcalico.org/podIPs: 10.244.2.186/32
Status: Running
IP: 10.244.2.186
IPs:
IP: 10.244.2.186
Containers:
ephemeral-demo:
Container ID: docker://b6d3ffa3d2ee8eb6a51a3b5ba823392cf57ed836833830510a2625788f8789d6
Image: k8s.gcr.io/pause:3.1
Image ID: docker-pullable://k8s.gcr.io/pause@sha256:f78411e19d84a252e53bff71a4407a5686c46983a2c2eeed83929b888179acea
Port: <none>
Host Port: <none>
State: Running
Started: Tue, 15 Dec 2020 06:36:32 -0600
Ready: True
Restart Count: 0
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-btnzm (ro)
Ephemeral Containers:
debugger-g6pj6:
Image: busybox
Port: <none>
Host Port: <none>
Environment: <none>
Mounts: <none>
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
default-token-btnzm:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-btnzm
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 55s default-scheduler Successfully assigned nextcloud/ephemeral-demo to k8s-htz-worker-02
Normal Pulled 53s kubelet Container image "k8s.gcr.io/pause:3.1" already present on machine
Normal Created 53s kubelet Created container ephemeral-demo
Normal Started 53s kubelet Started container ephemeral-demo
But if I try to exec into it, I get a failure:
$ kc exec -it ephemeral-demo -c debugger-g6pj6 -- bash
error: unable to upgrade connection: container not found ("debugger-g6pj6")
Am I missing something?
The solution turned out to be that while I had enabled the feature gate on the master node (/etc/kubernetes/manifests/kube-apiserver.yaml), the change didn't propagate to the worker nodes in the cluster (/var/lib/kubelet/config.yaml). Manually applying the change on each worker node and restarting the kubelet (systemctl restart kubelet.service) resolved the issue.
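For reference, the worker-node change amounts to adding the feature gate to the kubelet configuration. A minimal sketch of the relevant excerpt, assuming a standard kubeadm layout (the rest of /var/lib/kubelet/config.yaml stays untouched):
# /var/lib/kubelet/config.yaml (excerpt)
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
featureGates:
  EphemeralContainers: true
followed by systemctl restart kubelet.service on that node.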
Related
I have configured a Spring Boot pod with liveness and readiness probes.
When I start the pod, the describe command shows the output below.
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 92s default-scheduler Successfully assigned pradeep-ns/order-microservice-rs-8tqrv to pool-h4jq5h014-ukl3l
Normal Pulled 43s (x2 over 91s) kubelet Container image "classpathio/order-microservice:latest" already present on machine
Normal Created 43s (x2 over 91s) kubelet Created container order-microservice
Normal Started 43s (x2 over 91s) kubelet Started container order-microservice
Warning Unhealthy 12s (x6 over 72s) kubelet Liveness probe failed: Get "http://10.244.0.206:8222/actuator/health/liveness": dial tcp 10.244.0.206:8222: connect: connection refused
Normal Killing 12s (x2 over 52s) kubelet Container order-microservice failed liveness probe, will be restarted
Warning Unhealthy 2s (x8 over 72s) kubelet Readiness probe failed: Get "http://10.244.0.206:8222/actuator/health/readiness": dial tcp 10.244.0.206:8222: connect: connection refused
The pod definition is like below
apiVersion: apps/v1
kind: ReplicaSet
metadata:
name: order-microservice-rs
labels:
app: order-microservice
spec:
replicas: 1
selector:
matchLabels:
app: order-microservice
template:
metadata:
name: order-microservice
labels:
app: order-microservice
spec:
containers:
- name: order-microservice
image: classpathio/order-microservice:latest
imagePullPolicy: IfNotPresent
env:
- name: SPRING_PROFILES_ACTIVE
value: dev
- name: SPRING_DATASOURCE_USERNAME
valueFrom:
secretKeyRef:
key: username
name: db-credentials
- name: SPRING_DATASOURCE_PASSWORD
valueFrom:
secretKeyRef:
key: password
name: db-credentials
volumeMounts:
- name: app-config
mountPath: /app/config
- name: app-logs
mountPath: /var/log
livenessProbe:
httpGet:
port: 8222
path: /actuator/health/liveness
initialDelaySeconds: 10
periodSeconds: 10
readinessProbe:
httpGet:
port: 8222
path: /actuator/health/readiness
initialDelaySeconds: 10
periodSeconds: 10
resources:
requests:
memory: "550Mi"
cpu: "500m"
limits:
memory: "550Mi"
cpu: "750m"
volumes:
- name: app-config
configMap:
name: order-microservice-config
- name: app-logs
emptyDir: {}
restartPolicy: Always
If I disable the liveness and readiness probes in the ReplicaSet manifest and exec into the pod, I get a valid response when invoking the http://localhost:8222/actuator/health/liveness and http://localhost:8222/actuator/health/readiness endpoints.
Why is my pod restarting and the probes failing when Kubernetes invokes the readiness and liveness endpoints? Where am I going wrong?
Update
If I remove the resources section, the pods run fine, but when I add the resource parameters back, the probes fail.
When you limit the container / Spring application to 0.5 cores (500 millicores), startup probably takes longer than the configured liveness probe thresholds allow.
You can either increase those thresholds, or use a startupProbe with more relaxed settings (e.g. failureThreshold: 10). In that case you can keep a short period on the liveness probe and still get fast feedback once a successful container start has been detected.
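For example, a startup probe along these lines gives the application up to roughly 100 seconds to come up before the liveness probe takes over. This is only a sketch: the port and path are taken from your manifest, and the thresholds are assumptions you will need to tune:
startupProbe:
  httpGet:
    port: 8222
    path: /actuator/health/liveness
  periodSeconds: 10
  failureThreshold: 10   # up to 10 x 10s = ~100s allowed for startup
livenessProbe:
  httpGet:
    port: 8222
    path: /actuator/health/liveness
  periodSeconds: 10
  failureThreshold: 3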
Your pod config only gives the container 0.5 cores of CPU, and your probe timings are too short. Spring Boot startup can easily take more than 10 seconds, depending on your server's CPU performance. This is the probe config of one of my Spring Boot pods; it may give you a starting point.
"livenessProbe": {
"httpGet": {
"path": "/actuator/liveness",
"port": 11032,
"scheme": "HTTP"
},
"initialDelaySeconds": 90,
"timeoutSeconds": 30,
"periodSeconds": 30,
"successThreshold": 1,
"failureThreshold": 3
},
"readinessProbe": {
"httpGet": {
"path": "/actuator/health",
"port": 11032,
"scheme": "HTTP"
},
"initialDelaySeconds": 60,
"timeoutSeconds": 30,
"periodSeconds": 30,
"successThreshold": 1,
"failureThreshold": 3
},
I did not limit the CPU and memory resources; if you limit the CPU, startup will take even longer. Hope this helps.
When a request against localhost works, that is no guarantee it will work on other network interfaces. The kubelet is a node agent, so its probe requests arrive on your pod's eth0 (or equivalent) interface, not on localhost.
You can check this by making the request from another pod to your pod's IP address, or to the Service backing it.
Most likely your application is serving on localhost only, while it needs to serve on 0.0.0.0 (or the eth0 address).
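A quick way to run that check is a throwaway busybox pod (the pod IP below is the one from your events; a Service name would work just as well):
kubectl run probe-test --rm -it --restart=Never --image=busybox -- \
  wget -qO- http://10.244.0.206:8222/actuator/health/liveness
If this fails while the same URL works from inside the pod via localhost, the application is bound to the wrong interface.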
I'm new to Kubernetes. I successfully created a deployment with 2 replicas of my Angular frontend application, but when I expose it with a service and try to access the service with 'minikube service service-name', the browser can't reach the application.
This is my docker file
FROM registry.gitlab.informatica.aci.it/ccsc/images/nodejs/10_15
LABEL maintainer="d.vaccaro@informatica.aci.it" name="assistenza-fo" version="v1.0.0" license=""
WORKDIR /usr/src/app
ARG PRODUCTION_MODE="false"
ENV NODE_ENV='development'
ENV HTTP_PORT=4200
COPY package*.json ./
RUN if [ "${PRODUCTION_MODE}" = "true" ] || [ "${PRODUCTION_MODE}" = "1" ]; then \
echo "Build di produzione"; \
npm ci --production ; \
else \
echo "Build di sviluppo"; \
npm ci ; \
fi
RUN npm audit fix
RUN npm install -g @angular/cli
COPY dockerize /usr/local/bin
RUN chmod +x /usr/local/bin/dockerize
COPY . .
EXPOSE 4200
CMD ng serve --host 0.0.0.0
pod description
Name: assistenza-fo-674f85c547-bzf8g
Namespace: default
Priority: 0
Node: minikube/172.17.0.2
Start Time: Sun, 19 Apr 2020 12:41:06 +0200
Labels: pod-template-hash=674f85c547
run=assistenza-fo
Annotations: <none>
Status: Running
IP: 172.18.0.6
Controlled By: ReplicaSet/assistenza-fo-674f85c547
Containers:
assistenza-fo:
Container ID: docker://ef2bfb66d22dea56b2dc0e49e875376bf1edff369274015445806451582703a0
Image: registry.gitlab.informatica.aci.it/apra/sta-r/assistenza/assistenza-fo:latest
Image ID: docker-pullable://registry.gitlab.informatica.aci.it/apra/sta-r/assistenza/assistenza-fo@sha256:8d02a3e69d6798c1ac88815ef785e05aba6e394eb21f806bbc25fb761cca5a98
Port: 4200/TCP
Host Port: 0/TCP
State: Running
Started: Sun, 19 Apr 2020 12:41:08 +0200
Ready: True
Restart Count: 0
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-zdrwg (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
default-token-zdrwg:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-zdrwg
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events: <none>
my deployment description
Name: assistenza-fo
Namespace: default
CreationTimestamp: Sun, 19 Apr 2020 12:41:06 +0200
Labels: run=assistenza-fo
Annotations: deployment.kubernetes.io/revision: 1
Selector: run=assistenza-fo
Replicas: 2 desired | 2 updated | 2 total | 2 available | 0 unavailable
StrategyType: RollingUpdate
MinReadySeconds: 0
RollingUpdateStrategy: 25% max unavailable, 25% max surge
Pod Template:
Labels: run=assistenza-fo
Containers:
assistenza-fo:
Image: registry.gitlab.informatica.aci.it/apra/sta-r/assistenza/assistenza-fo:latest
Port: 4200/TCP
Host Port: 0/TCP
Environment: <none>
Mounts: <none>
Volumes: <none>
Conditions:
Type Status Reason
---- ------ ------
Available True MinimumReplicasAvailable
Progressing True NewReplicaSetAvailable
OldReplicaSets: <none>
NewReplicaSet: assistenza-fo-674f85c547 (2/2 replicas created)
Events: <none>
and my service description
Name: assistenza-fo
Namespace: default
Labels: run=assistenza-fo
Annotations: <none>
Selector: run=assistenza-fo
Type: LoadBalancer
IP: 10.97.3.206
Port: <unset> 4200/TCP
TargetPort: 4200/TCP
NodePort: <unset> 30375/TCP
Endpoints: 172.18.0.6:4200,172.18.0.7:4200
Session Affinity: None
External Traffic Policy: Cluster
Events: <none>
When I run the command
minikube service assistenza-fo
I get the following output:
|-----------|---------------|-------------|-------------------------|
| NAMESPACE | NAME | TARGET PORT | URL |
|-----------|---------------|-------------|-------------------------|
| default | assistenza-fo | 4200 | http://172.17.0.2:30375 |
|-----------|---------------|-------------|-------------------------|
* Opening service default/assistenza-fo in default browser...
but Chrome eventually prints "unable to reach the site" after timing out.
Thank you
EDIT
I created the service again, this time as a NodePort service. It is still not working. This is the service description:
Name: assistenza-fo
Namespace: default
Labels: run=assistenza-fo
Annotations: <none>
Selector: run=assistenza-fo
Type: NodePort
IP: 10.107.46.43
Port: <unset> 4200/TCP
TargetPort: 4200/TCP
NodePort: <unset> 30649/TCP
Endpoints: 172.18.0.7:4200,172.18.0.8:4200
Session Affinity: None
External Traffic Policy: Cluster
Events: <none>
I was able to reproduce your issue.
It's actually a bug in the latest version of Minikube for Windows when running the Docker driver: --driver=docker
You can see it here: Issue - minikube service not working with Docker driver on Windows 10 Pro #7644
It was patched with the merge: Pull - docker driver: Add Service & Tunnel features to windows
The fix is now available in Minikube v1.10.0-beta.0
In order to make it work, download the beta version from the website:
https://github.com/kubernetes/minikube/releases/download/v1.10.0-beta.0/minikube-windows-amd64.exe
move it to your working folder and rename it to minikube.exe
C:\Kubernetes>rename minikube-windows-amd64.exe minikube.exe
C:\Kubernetes>dir
22/04/2020 21:10 <DIR> .
22/04/2020 21:10 <DIR> ..
22/04/2020 21:04 55.480.832 minikube.exe
22/04/2020 20:05 489 nginx.yaml
2 File(s) 55.481.321 bytes
If you haven't yet, stop and uninstall the older version, then start Minikube with the new binary:
C:\Kubernetes>minikube.exe start --driver=docker
* minikube v1.10.0-beta.0 on Microsoft Windows 10 Pro 10.0.18363 Build 18363
* Using the docker driver based on existing profile
* Starting control plane node minikube in cluster minikube
* Pulling base image ...
* Restarting existing docker container for "minikube" ...
* Preparing Kubernetes v1.18.0 on Docker 19.03.2 ...
- kubeadm.pod-network-cidr=10.244.0.0/16
* Enabled addons: dashboard, default-storageclass, storage-provisioner
* Done! kubectl is now configured to use "minikube"
C:\Kubernetes>kubectl get all
NAME READY STATUS RESTARTS AGE
pod/nginx-76df748b9-t6q59 1/1 Running 1 78m
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 85m
service/nginx-svc NodePort 10.100.212.15 <none> 80:31027/TCP 78m
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/nginx 1/1 1 1 78m
NAME DESIRED CURRENT READY AGE
replicaset.apps/nginx-76df748b9 1 1 1 78m
Minikube is now running on version v1.10.0-beta.0, and you can run the service as intended (note that the command will stay in the foreground, because it is tunneling the connection).
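For example, with the service name from your question (in my reproduction the service was nginx-svc):
C:\Kubernetes>minikube.exe service assistenza-fo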
The browser will open automatically and your service will be available.
If you have any doubts let me know in the comments.
I have the following deployment yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
name: gofirst
labels:
app: gofirst
spec:
selector:
matchLabels:
app: gofirst
template:
metadata:
labels:
app: gofirst
spec:
restartPolicy: Always
containers:
- name: gofirst
image: lbvenkatesh/gofirst:0.0.5
resources:
limits:
memory: "128Mi"
cpu: "500m"
ports:
- name: http
containerPort: 8080
livenessProbe:
httpGet:
path: /health
port: http
httpHeaders:
- name: "X-Health-Check"
value: "1"
initialDelaySeconds: 30
periodSeconds: 10
readinessProbe:
httpGet:
path: /health
port: http
httpHeaders:
- name: "X-Health-Check"
value: "1"
initialDelaySeconds: 30
periodSeconds: 10
and my service yaml is this:
apiVersion: v1
kind: Service
metadata:
name: gofirst
labels:
app: gofirst
spec:
publishNotReadyAddresses: true
type: NodePort
selector:
app: gofirst
ports:
- port: 8080
targetPort: http
name: http
"gofirst" is a simple web application written in Golang Gin.
Here is its Dockerfile:
FROM golang:latest
LABEL MAINTAINER='Venkatesh Laguduva <lbvenkatesh@gmail.com>'
RUN mkdir /app
ADD . /app/
RUN apt -y update && apt -y install git
RUN go get github.com/gin-gonic/gin
RUN go get -u github.com/RaMin0/gin-health-check
WORKDIR /app
RUN go build -o main .
ARG verArg="0.0.1"
ENV VERSION=$verArg
ENV PORT=8080
ENV GIN_MODE=release
EXPOSE 8080
CMD ["/app/main"]
I have deployed this application in Minikube, and when I describe the pod I see these events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 10m (x2 over 10m) default-scheduler 0/1 nodes are available: 1 Insufficient cpu.
Normal Scheduled 10m default-scheduler Successfully assigned default/gofirst-95fc8668c-6r4qc to m01
Normal Pulling 10m kubelet, m01 Pulling image "lbvenkatesh/gofirst:0.0.5"
Normal Pulled 10m kubelet, m01 Successfully pulled image "lbvenkatesh/gofirst:0.0.5"
Normal Killing 8m13s (x2 over 9m13s) kubelet, m01 Container gofirst failed liveness probe, will be restarted
Normal Pulled 8m13s (x2 over 9m12s) kubelet, m01 Container image "lbvenkatesh/gofirst:0.0.5" already present on machine
Normal Created 8m12s (x3 over 10m) kubelet, m01 Created container gofirst
Normal Started 8m12s (x3 over 10m) kubelet, m01 Started container gofirst
Warning Unhealthy 7m33s (x7 over 9m33s) kubelet, m01 Liveness probe failed: Get http://172.17.0.4:8080/health: dial tcp 172.17.0.4:8080: connect: connection refused
Warning Unhealthy 5m35s (x12 over 9m25s) kubelet, m01 Readiness probe failed: Get http://172.17.0.4:8080/health: dial tcp 172.17.0.4:8080: connect: connection refused
Warning BackOff 31s (x17 over 4m13s) kubelet, m01 Back-off restarting failed container
I tried the sample container "hello-world" and it worked well when I ran "minikube service hello-world", but when I tried the same with "minikube service gofirst", I got a connection error in the browser.
I must be doing something relatively simple wrong, but I am unable to locate the error. Please go through my yaml and Dockerfile and let me know if I am making any mistake.
I've reproduced your scenario and faced the same issues you have. So I decided to remove the liveness and readiness probes in order to log in to the pod and investigate it.
Here is the yaml I used:
apiVersion: apps/v1
kind: Deployment
metadata:
name: gofirst
labels:
app: gofirst
spec:
selector:
matchLabels:
app: gofirst
template:
metadata:
labels:
app: gofirst
spec:
restartPolicy: Always
containers:
- name: gofirst
image: lbvenkatesh/gofirst:0.0.5
resources:
limits:
memory: "128Mi"
cpu: "500m"
ports:
- name: http
containerPort: 8080
I logged in to the pod to check whether the application is listening on the port you are trying to probe:
kubectl exec -ti gofirst-65cfc7556-bbdcg -- bash
Then I installed netstat:
# apt update
# apt install net-tools
Checked if the application is running:
# ps -ef
UID PID PPID C STIME TTY TIME CMD
root 1 0 0 10:06 ? 00:00:00 /app/main
root 9 0 0 10:06 pts/0 00:00:00 sh
root 15 9 0 10:07 pts/0 00:00:00 ps -ef
And finally checked if port 8080 is listening:
# netstat -an
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 127.0.0.1:8080 0.0.0.0:* LISTEN
tcp 0 0 10.28.0.9:56106 151.101.184.204:80 TIME_WAIT
tcp 0 0 10.28.0.9:56130 151.101.184.204:80 TIME_WAIT
tcp 0 0 10.28.0.9:56104 151.101.184.204:80 TIME_WAIT
Active UNIX domain sockets (servers and established)
Proto RefCnt Flags Type State I-Node Path
As we can see, the application is listening for localhost connections only, not on all interfaces. The expected output would be 0.0.0.0:8080.
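You can see the same effect from inside the pod (a sketch; the golang base image normally ships curl, and hostname -i prints the pod IP):
# connects: the app is bound to the loopback interface
curl -s -H "X-Health-Check: 1" http://127.0.0.1:8080/health
# connection refused: nothing is listening on the pod IP,
# which is where the kubelet's probe traffic arrives
curl -s -H "X-Health-Check: 1" http://$(hostname -i):8080/health
The fix is on the application side: bind the Gin server to all interfaces (0.0.0.0:8080) instead of localhost only.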
Hope it helps you to solve the problem.
I am trying to set up log monitoring of my Kubernetes cluster using Elasticsearch, Fluentd, and Kibana. Here is the link I followed for this task. I labeled the nodes with beta.kubernetes.io/fluentd-ds-ready: "true". Initially, I created the StatefulSet for Elasticsearch.
After that, I created fluentd-es-configmap.yaml and fluentd-es-ds.yaml and checked the pod status using kubectl get pods -n kube-system. The Fluentd pods are stuck in ContainerCreating. When I try to read the logs of the Fluentd container, I get this error:
Error from server (BadRequest): container "fluentd-es" in pod "fluentd-es-v2.0.1-csx96" is waiting to start: ContainerCreating
Here is fluentd pod description:
Name: fluentd-es-v2.0.1-csx96
Namespace: kube-system
Priority: 0
PriorityClassName: <none>
Node: ldap/192.168.1.191
Start Time: Wed, 10 Oct 2018 15:08:17 -0400
Labels: controller-revision-hash=5754d85c97
k8s-app=fluentd-es
kubernetes.io/cluster-service=true
pod-template-generation=1
version=v2.0.1
Annotations: scheduler.alpha.kubernetes.io/critical-pod:
Status: Pending
IP:
Controlled By: DaemonSet/fluentd-es-v2.0.1
Containers:
fluentd-es:
Container ID:
Image: gcr.io/google-containers/fluentd-elasticsearch:v2.0.1
Image ID:
Port: <none>
Host Port: <none>
State: Waiting
Reason: ContainerCreating
Ready: False
Restart Count: 0
Limits:
memory: 500Mi
Requests:
cpu: 100m
memory: 200Mi
Environment:
FLUENTD_ARGS: --no-supervisor -q
Mounts:
/etc/fluent/config.d from config-volume (rw)
/host/lib from libsystemddir (ro)
/var/lib/docker/containers from varlibdockercontainers (ro)
/var/log from varlog (rw)
/var/run/secrets/kubernetes.io/serviceaccount from fluentd-es-token-l2b2m (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
varlog:
Type: HostPath (bare host directory volume)
Path: /var/log
HostPathType:
varlibdockercontainers:
Type: HostPath (bare host directory volume)
Path: /var/lib/docker/containers
HostPathType:
libsystemddir:
Type: HostPath (bare host directory volume)
Path: /usr/lib64
HostPathType:
config-volume:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: fluentd-es-config-v0.1.0
Optional: false
fluentd-es-token-l2b2m:
Type: Secret (a volume populated by a Secret)
SecretName: fluentd-es-token-l2b2m
Optional: false
QoS Class: Burstable
Node-Selectors: beta.kubernetes.io/fluentd-ds-ready=true
Tolerations: node.kubernetes.io/disk-pressure:NoSchedule
node.kubernetes.io/memory-pressure:NoSchedule
node.kubernetes.io/not-ready:NoExecute
node.kubernetes.io/unreachable:NoExecute
node.kubernetes.io/unschedulable:NoSchedule
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedMount 14m (x42 over 107m) kubelet, ldap Unable to mount volumes for pod "fluentd-es-v2.0.1-csx96_kube-system(d80d9c78-ccbf-11e8-b7b5-525400e4ff36)": timeout expired waiting for volumes to attach or mount for pod "kube-system"/"fluentd-es-v2.0.1-csx96". list of unmounted volumes=[config-volume]. list of unattached volumes=[varlog varlibdockercontainers libsystemddir config-volume fluentd-es-token-l2b2m]
Warning FailedMount 3m23s (x60 over 109m) kubelet, ldap MountVolume.SetUp failed for volume "config-volume" : configmap "fluentd-es-config-v0.1.0" not found
Could anybody suggest how to resolve this issue?
Thanks in advance.
The problem seems to be a mismatch in the name of the ConfigMap. The DaemonSet is looking for a ConfigMap named fluentd-es-config-v0.1.0, but it is not found.
In the repository the ConfigMap is named fluentd-es-config-v0.1.5 in both fluentd-es-ds.yaml and fluentd-es-configmap.yaml, so it should work if you simply use those files as they are.
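To confirm the mismatch on your cluster, and to fix it in place if you prefer not to re-apply the files, something along these lines should do (the DaemonSet name is taken from your pod description):
# list the fluentd ConfigMaps that actually exist
kubectl -n kube-system get configmaps | grep fluentd
# point the config-volume at the existing ConfigMap name
# (or recreate the ConfigMap under the name the DaemonSet expects)
kubectl -n kube-system edit daemonset fluentd-es-v2.0.1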
I'm trying to run the elasticsearch6 container on a Google Cloud instance. Unfortunately the container always ends up in CrashLoopBackOff.
This is what I did:
install gcloud and kubectl
curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -
echo "deb http://packages.cloud.google.com/apt cloud-sdk-$(lsb_release -c -s) main" | sudo tee -a /etc/apt/sources.list.d/google-cloud-sdk.list
sudo apt-get update && sudo apt-get install google-cloud-sdk kubectl
configure gcloud
gcloud init
gcloud config set compute/zone europe-west3-a # For Frankfurt
create kubernetes cluster
gcloud container clusters create elasticsearch-cluster --machine-type=f1-micro --num-nodes=3
Activate pod
kubectl create -f pod.yml
apiVersion: v1
kind: Pod
metadata:
name: test-elasticsearch
labels:
name: test-elasticsearch
spec:
containers:
- image: launcher.gcr.io/google/elasticsearch6
name: elasticsearch
After this I get the status:
kubectl get pods
NAME READY STATUS RESTARTS AGE
test-elasticsearch 0/1 CrashLoopBackOff 10 31m
A kubectl logs test-elasticsearch does not show any output.
And here is the output of kubectl describe po test-elasticsearch, with some info XXXed out.
Name: test-elasticsearch
Namespace: default
Node: gke-elasticsearch-cluste-default-pool-XXXXXXXX-wtbv/XX.XXX.X.X
Start Time: Sat, 12 May 2018 14:54:36 +0200
Labels: name=test-elasticsearch
Annotations: kubernetes.io/limit-ranger=LimitRanger plugin set: cpu request for container elasticsearch
Status: Running
IP: XX.XX.X.X
Containers:
elasticsearch:
Container ID: docker://bb9d093df792df072a762973066d504a4e7d73b0e87d0236a94c3e8b972d9c41
Image: launcher.gcr.io/google/elasticsearch6
Image ID: docker-pullable://launcher.gcr.io/google/elasticsearch6@sha256:1ddafd5293dbec8fb73eabffa29614916e4933bb057db50231084d89f4a0b3fa
Port: <none>
Host Port: <none>
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 137
Started: Sat, 12 May 2018 14:55:06 +0200
Finished: Sat, 12 May 2018 14:55:09 +0200
Ready: False
Restart Count: 2
Requests:
cpu: 100m
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-XXXXX (ro)
Conditions:
Type Status
Initialized True
Ready False
PodScheduled True
Volumes:
default-token-XXXXX:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-XXXXX
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.alpha.kubernetes.io/notReady:NoExecute for 300s
node.alpha.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 51s default-scheduler Successfully assigned test-elasticsearch to gke-elasticsearch-cluste-def
Normal SuccessfulMountVolume 51s kubelet, gke-elasticsearch-cluste-default-pool-XXXXXXXX-wtbv MountVolume.SetUp succeeded for volume "default-token-XXXXX"
Normal Pulling 22s (x3 over 49s) kubelet, gke-elasticsearch-cluste-default-pool-XXXXXXXX-wtbv pulling image "launcher.gcr.io/google/elasticsearch6"
Normal Pulled 22s (x3 over 49s) kubelet, gke-elasticsearch-cluste-default-pool-XXXXXXXX-wtbv Successfully pulled image "launcher.gcr.io/google/elasticsearch6"
Normal Created 22s (x3 over 48s) kubelet, gke-elasticsearch-cluste-default-pool-XXXXXXXX-wtbv Created container
Normal Started 21s (x3 over 48s) kubelet, gke-elasticsearch-cluste-default-pool-XXXXXXXX-wtbv Started container
Warning BackOff 4s (x3 over 36s) kubelet, gke-elasticsearch-cluste-default-pool-XXXXXXXX-wtbv Back-off restarting failed container
Warning FailedSync 4s (x3 over 36s) kubelet, gke-elasticsearch-cluste-default-pool-XXXXXXXX-wtbv Error syncing pod
The problem was the f1-micro instance: it doesn't have enough memory to run Elasticsearch. It only worked after upgrading to an instance with 4 GB of memory. Unfortunately that is way too expensive for me, so I have to look for something else.
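For anyone hitting the same wall, recreating the cluster with a larger machine type looks roughly like this (the machine type and node count are only examples, not a recommendation):
gcloud container clusters delete elasticsearch-cluster
gcloud container clusters create elasticsearch-cluster \
    --machine-type=n1-standard-2 \
    --num-nodes=1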