microk8s image pull keeps breaking

I am running microk8s v1.18.8 rev 1609 from 1.18/stable.
Several times I have got my deployments up and running perfectly (as far as I can tell). The images pull from localhost:32000. I have gone through many rounds of updating the deployments and the pods get automatically replaced, with the new images being pulled successfully from the repository.
Then I move on to another project for a few days (having nothing to do with microk8s). I leave microk8s running and untouched. Several times when I've returned to the microk8s project, all the pods have gone away and show an error state (ErrImagePull). If I delete a pod, a new pod tries to replace it, but hangs initially in the ContainerCreating state (last log entry is 'Pulling image "localhost:32000/..."'). Eventually it times out and goes through the ImagePullBackOff and ErrImagePull states. However, the last time I had anything to do with the project, these images were pulling perfectly fine.
I can push the image to localhost:32000 without error. I can pull the image without error. I can pull the image using microk8s.ctr:
microk8s ctr --debug images pull --plain-http localhost:32000/imagename
It works fine. I've tried changing the ufw default to allow routed traffic (no effect) and iptables -P FORWARD ACCEPT (no effect). microk8s inspect does not report any issues. I've tried microk8s stop followed by microk8s start (no effect) and rebooting my machine (no effect). Everything else about the system appears fine; only the image pulls made by the pods fail.
Previously, something in the above made it work again, but not this time. So my main question is "What else can I try?"
My secondary question is: Is this a stable platform for anyone? Can you leave a service/deployment (e.g. an nginx server) running for months without issue? I am tired of leaving a working environment and coming back a little while later to a badly broken system that takes hours/days to fix. I'm having serious doubts about microk8s in particular and k8s in general as a useful platform.

If you pull the image from an external private registry and see ErrImagePull or ImagePullBackOff errors, try creating a registry secret and referencing it in the pod spec through imagePullSecrets:
kubectl create secret docker-registry regprivate --docker-server=https://privateregistry.com/ --docker-username=user --docker-password=mypassword
spec:
  imagePullSecrets:
  - name: regprivate
  containers:
  - name: miapp
    image: privateregistry.com/miapp:v2
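For reference, a secret created with kubectl create secret docker-registry has the type kubernetes.io/dockerconfigjson, and its payload is a small JSON document keyed by registry server. The Python sketch below rebuilds that payload for the placeholder values from the command above, which can help when checking whether the stored credentials match what you expect. The function name docker_config_json is hypothetical and only for illustration.

```python
import base64
import json


def docker_config_json(server: str, username: str, password: str) -> str:
    """Build the JSON stored under the `.dockerconfigjson` key of a docker-registry secret."""
    # The "auth" field is base64("username:password").
    auth = base64.b64encode(f"{username}:{password}".encode()).decode()
    config = {
        "auths": {
            server: {"username": username, "password": password, "auth": auth}
        }
    }
    return json.dumps(config)


# Placeholder values taken from the kubectl command above.
payload = docker_config_json("https://privateregistry.com/", "user", "mypassword")
print(payload)
```

To compare against the cluster, the stored value can be decoded with kubectl get secret regprivate -o jsonpath='{.data.\.dockerconfigjson}' | base64 -d. Key order and an optional email field may differ from this sketch.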

Related

How to troubleshoot DDEV DB container healthcheck timeout

When I try to start my DDEV project, a container gets stuck at creating:
Container ddev-oszimt-lf12a-v2-db Started
Error Message:
Failed waiting for web/db containers to become ready: db container failed: log=, err=health check timed out after 2m0s: labels map[com.ddev.site-name:oszimt-lf12a-v2 com.docker.compose.service:db] timed out without becoming healthy, status=
It's an error I have also had with some other projects.
The error log contains no information about it.
What could the problem be, and how do I fix it?
This isn't a very good place to study problems with specific projects; our Discord channel or the DDEV issue queue is much better.
But I'll try to give you some ideas about how to study and debug this.
Go to the Troubleshooting section of the docs. Work through it step-by-step.
As it says there, try the simplest possible project and see what the results are.
If the problem is specific to one project, see if you can remove customizations like .ddev/docker-compose.*.yaml files and config.*.yaml and non-standard things in the config.yaml file.
To find out what causes the healthcheck timeout, see the docs on this exact problem; in your case the db container is timing out. So first, run ddev logs -s db to see if something happened, and second, run docker inspect --format "{{json .State.Health }}" ddev-<projectname>-db.
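The JSON printed by that docker inspect command can be hard to read on one line. Here is a minimal Python sketch that condenses it, assuming the usual Status, FailingStreak and Log fields of Docker's health state. The function name summarize_health is hypothetical, and the sample input is made up for illustration rather than taken from a real DDEV project.

```python
import json


def summarize_health(health_json: str) -> str:
    """Condense the JSON printed by docker inspect for .State.Health."""
    health = json.loads(health_json)
    if health is None:
        # Handles an input of "null", i.e. no health state was recorded.
        return "no healthcheck state recorded"
    lines = [
        f"status={health.get('Status')} failing_streak={health.get('FailingStreak')}"
    ]
    # Each log entry records one healthcheck run with its exit code and output.
    for entry in health.get("Log") or []:
        output = (entry.get("Output") or "").strip()
        lines.append(f"exit={entry.get('ExitCode')} output={output}")
    return "\n".join(lines)


# Illustrative input only, not output from a real container.
sample = '{"Status": "unhealthy", "FailingStreak": 3, "Log": [{"ExitCode": 1, "Output": "db not ready\\n"}]}'
print(summarize_health(sample))
```

In practice you would pass in the text captured from the docker inspect command; the Output of the last few log entries is usually the most useful part.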
For more help, you'll need to provide more information with things like your OS, Docker Provider, etc, and the easiest way to do that is to run ddev debug test and capture the output and put it in a gist on gist.github.com, then come over to discord with a link to that.

How to edit internal files without running container

I installed MariaDB 10.3 as a Docker container on a Mac and modified the collation-server value in the /etc/mysql/my.cnf file.
After the modification, I tried to restart the container, but when I ran 'docker ps -a' the status was displayed as Exited(1).
So I entered docker logs [container name] and the following error was displayed.
The setting parameter had been incorrectly written as 'collection-server=utf8_unicode_ci'.
So the container did not run.
I've looked at several approaches, but I can't find a way to modify the internal files without running the container.
I know that you shouldn't tamper with files inside a Docker container.
My question may sound like 'How do I edit a file inside the computer without turning on the computer?', but I don't think the answer is to delete the container and create a new one.
Of course, deleting the container and creating a new one would save time and may be the simplest method. But I looked at it differently.
If a company that actually operates this Docker container made the same mistake as me and could not start the container, it would be a very serious mistake.
Because of that, I think there must be a way.
I would appreciate advice on how to do this.

Can't pull some images in Docker

When I try to pull images, the pull gets stuck.
docker pull php:7-fpm
7-fpm: Pulling from library/php
f17d81b4b692: Already exists
376d99d019dc: Already exists
80b3573727f0: Already exists
2c492579cd1f: Waiting
I'm using Windows 10 Home with docker-toolbox running on VirtualBox.
How can I get past this endless Waiting state?
I don't know why, but after restarting the system everything works fine.

How can I allow a private insecure registry to be used inside a minikube node?

I know there are about a thousand answers to various permutations of this question but none of the fifteen or so that I've tried have worked.
I'm running on Mac OS Sierra and using Minikube 0.17.1 as well as kubectl 1.5.3.
We run our own private Docker registry that is insecure as it is not open to the internet. (This is not my choice or in my control so it's not open for discussion). This is my first foray into Kubernetes and actually container orchestration altogether. I also have a very intermediate level of knowledge about Docker in general so I'm drowning in terminology/platform soup here.
When I execute
kubectl run perf-ui --image=X.X.X.X/performance/perf-ui:master
I see
image pull failed for X.X.X.X/performance/perf-ui:master, this may be because there are no credentials on this request. details: (Error response from daemon: Get https://X.X.X.X/v1/_ping: dial tcp X.X.X.X:443: getsockopt: connection refused)
We have an Ubuntu box that accesses the same registry (not using Kubernetes, just Docker) that works just fine. This is likely due to
DOCKER_OPTS="--insecure-registry X.X.X.X"
being in /etc/default/docker.
I made a similar change using the UI of Docker for Mac. I don't know where this change persisted in a config file. After this change a docker pull worked on my laptop!!! Again, this is just using Docker not Kubernetes. The interesting part is I got the same "Connection refused error" (as it tries to access via HTTPS) on my Mac as I get in the Minikube VM and after the change the pull worked. I feel like I'm on to something there.
After sshing into minikube (the VM created by minikube start) using
minikube ssh
I added the following content to /var/lib/boot2docker/profile
export EXTRA_ARGS="$EXTRA_ARGS --insecure-registry 10.129.100.3
export DOCKER_OPTS="$DOCKER_OPTS --insecure-registry 10.129.100.3
As you can infer, nothing has worked. I know I've tried other things but they too have failed.
I know this isn't the most comprehensive explanation but I've been digging into this for the past 4 hours.
The bottom line is docker pulls work from our Ubuntu box with the config file setup correctly and from my Mac with the setting configured properly.
How can I enable the same setting in my "Linux 2.6" VM that was created by Minikube?
If someone knows the answer I would be forever grateful.
Thank you in advance!
Thank you to Janos for your alternative solution. I'm confident that is the right choice for some use cases.
It turns out that what I needed was a good night's sleep and the following command to start Minikube in the first place:
minikube start --insecure-registry="X.X.X.X"
@intelfx says that adding a port won't be necessary. I'm inclined to believe them but if your registry is on a non-standard port just keep it in mind in case things still aren't working.
In the end it was, in fact, a matter of telling Docker to use an insecure registry but it was not clear how to tell this to Docker when I was not controlling it directly.
I know that seems simple but after you've tried a hundred things you're almost hallucinating so you're not in a great state to make rational decisions. I'm sorry for the dumb post but I'm willing to bet this will help at least one person one day, which makes it worth it.
Thanks SO!
The --insecure-registry flag has no effect on an existing cluster on macOS. You need to run minikube delete first; just stopping the cluster with minikube stop is not enough.
I spent plenty of time figuring this out, and then I found this comment at https://github.com/kubernetes/minikube/issues/604:
"the --insecure-registry flag is ignored if the machine already existed (even if it is stopped). You must first minikube delete if you want new flags to be respected."
You can use kube-registry-proxy from (needs some configuration):
https://github.com/kubernetes/kubernetes/blob/master/cluster/saltbase/salt/kube-registry-proxy/kube-registry-proxy.yaml
Then you can refer to localhost:5050 as your registry. The trick is that localhost is allowed as an insecure registry by default.

Ruby Stack failed to deploy on Google Developers Console

I tried to deploy the Ruby stack using Google Developers Console, without success. I tried several times in other projects, and the error was always the same (below).
Do you have any idea why it keeps failing?
2014/10/23 15:59:44
rubyStackBox: PENDING
2014/10/23 15:59:55~2014/10/23 16:06:01
rubyStackBox: DEPLOYING
2014/10/23 16:06:11
rubyStackBox: DEPLOYMENT_FAILED
Replica rubystackbox-eaeo failed with status PERMANENTLY_FAILING: Replica State changed to PERMANENTLY_FAILING. Replica was unhealthy 2 consecutive times.
I reproduced the issue you experienced several times, and the deployment failed for me too. What finally worked was trying different zones/regions when deploying the Ruby stack:
Developers console > Click-to-deploy > Set MySQL password > Advanced Options, choose a different zone and click Deploy.
Another useful tool when investigating this is Console Output. Even if the deployment fails, you can go to the VM instance and check View Output towards the bottom of the page. It will list all the packages and any errors encountered. The following command will achieve the same thing:
$ gcloud compute instances get-serial-port-output <INSTANCE_NAME> --project <PROJECT_ID> --zone <ZONE_NAME>
Please let me know if you are still seeing issues.
