Kubernetes - how to mount existing persistent volume locally - debugging

When I work with AWS, I can create storage volumes and bind them in kubernetes.
I would like to mount the persistent volume locally, in order to inspect the volume content and manipulate files with a local script.
Is there a handy way to mount a persistent volume on the client host, with something like kubectl niceMountCommand my-pvc /data/local/my-pvc?
I already know about kubectl cp and the possibility of adding a dummy pod to access the data, but then I would have to adapt every script that manipulates data to run its commands through kubectl exec.
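For reference, the dummy-pod workaround I mean looks roughly like this (the pod name pvc-inspector and the mount path /pvc are arbitrary; my-pvc is the claim to inspect):
kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: pvc-inspector
spec:
  containers:
  - name: inspector
    image: busybox
    command: ["sleep", "3600"]
    volumeMounts:
    - name: data
      mountPath: /pvc
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: my-pvc
EOF
kubectl wait --for=condition=Ready pod/pvc-inspector
kubectl cp pvc-inspector:/pvc /data/local/my-pvc    # copy the volume contents to the local machine
kubectl delete pod pvc-inspector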

Related

Where do the images get stored in an Azure Kubernetes cluster after pulling from Azure Container Registry?

I am using Kubernetes, provided by Azure Kubernetes Services, to deploy all my microservices.
Whenever I release an update of a microservice, which has been happening frequently over the last month, the cluster pulls the new image from the Azure Container Registry.
I was trying to figure out where these images reside in the cluster.
Docker stores pulled images in /var/lib/docker, and since Kubernetes uses Docker under the hood, maybe it stores the images somewhere similar.
But if this is the case, how can I delete the old images from the cluster that are not in use anymore?
Clusters with Linux node pools created on Kubernetes v1.19 or greater default to containerd as their container runtime (see Container runtime configuration).
To manually remove unused images on a node running containerd:
Identify node names:
kubectl get nodes
Start an interactive debugging container on a node (Connect with SSH to Azure Kubernetes Service):
kubectl debug node/aks-agentpool-11045208-vmss000003 -it --image=mcr.microsoft.com/aks/fundamental/base-ubuntu:v0.0.11
Set up crictl in the debugging container (check for newer releases of crictl):
The host node's filesystem is available at /host, so configure crictl to use the host node's containerd.sock.
curl -sL https://github.com/kubernetes-sigs/cri-tools/releases/download/v1.23.0/crictl-v1.23.0-linux-amd64.tar.gz | tar xzf - -C /usr/local/bin \
&& export CONTAINER_RUNTIME_ENDPOINT=unix:///host/run/containerd/containerd.sock IMAGE_SERVICE_ENDPOINT=unix:///host/run/containerd/containerd.sock
Remove unused images on the node:
crictl rmi --prune
You are correct in guessing that it's mostly up to Docker, or rather to whatever the active CRI plugin is. The kubelet automatically cleans up old images when disk space runs low, so it's rare that you ever need to touch it directly, but if you did (and are using Docker as your runtime), it would be the usual docker image commands.
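For context, the disk-usage thresholds that trigger this cleanup are tunable kubelet flags. A sketch showing the upstream defaults; on a managed cluster like AKS the kubelet is configured for you, so this is illustrative only:
kubelet --image-gc-high-threshold=85 --image-gc-low-threshold=80   # once disk usage passes 85%, prune images until it drops to 80%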
I was trying to figure out where do these images reside in the cluster?
Testing confirms that each node in the AKS cluster runs the Docker server, and images are stored just as with plain Docker: the image layers live in the directory /var/lib/docker/.
how can I delete the old images from the cluster that are not in use anymore?
You can do this with the Docker CLI inside the node. Follow the steps in Connect with SSH to Azure Kubernetes Service (AKS) cluster nodes to connect to the node, then delete the image with docker rmi image_name:tag. Be careful with this: make sure the image really is no longer needed.
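For example, once connected to the node (a sketch; the image name is made up, and docker image prune -a removes every image not referenced by a container, so double-check first):
docker images                                     # list images present on the node
docker rmi myregistry.azurecr.io/my-service:v1    # remove one specific (hypothetical) image
docker image prune -a                             # or remove all images not used by any container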

How to store Kubernetes pod logs directly on a local machine?

Since logs go away as soon as a pod crashes, I would like to store them directly on my local machine. I don't want to use GCE. Also, a service may run on multiple nodes, so hostPath will not be of any use.
kubectl logs <pod-name> > log.txt will just capture a snapshot. I want the complete logs to be persisted on my local machine.
Logs are already on nodes in /var/lib/docker/containers/CONTAINER_ID/CONTAINER_ID-json.log
For collecting them you can use Fluent Bit. The easiest approach is to let Kubernetes manage Fluent Bit for you: run it as a DaemonSet with hostPath volumes.
Here is a Helm chart that can do it for you: https://github.com/helm/charts/tree/master/stable/fluent-bit
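A minimal sketch of installing it, assuming Helm 2-era syntax and the chart's forward backend shipping to a collector you run yourself (the host address is made up; check the chart's values.yaml for the exact parameter names):
helm install --name fluent-bit stable/fluent-bit \
  --set backend.type=forward \
  --set backend.forward.host=192.168.1.100 \
  --set backend.forward.port=24224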

Mount container volume on (Windows) host without copying

I have a container that I start like
docker run -it --mount type=bind,source=/path/to/my/data,target=/staging -v myvol:/myvol buildandoid bash -l
It has two mounts: one bind mount that I use to get data into the container, and one named volume that I use to persist data. The container is used as a reproducible Android (AOSP) build environment, so not your typical web service.
I would like to access the files on myvol from the Windows host. If I use an absolute path for the mount, e.g. -v /c/some/path:/myvol, I can do that, but I believe Docker creates copies of all the files and keeps them in sync. I really want to avoid creating these files on the Windows side (for space reasons, as it is several GB, and for performance reasons, since NTFS doesn't seem to handle many little files well).
Can I somehow "mount" a container directory or a named volume on the host? So the exact reverse of a bind mount. Alternatively, I think I could install Samba or sshd in the container and use that, but maybe there is something built into Docker / VirtualBox to achieve this.
Use bind mounts.
https://docs.docker.com/engine/admin/volumes/bind-mounts/
By contrast, when you use a volume, a new directory is created within Docker’s storage directory on the host machine, and Docker manages that directory’s contents.
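For example, reusing the command from the question (a sketch; the Windows-side path is made up and must exist and be shared with Docker):
docker run -it --mount type=bind,source=/c/Users/me/aosp-out,target=/myvol buildandoid bash -l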

Does Docker store all its files as a "memory image", as part of the image, not as disk files?

I was trying to add some files inside a Docker container, e.g. with touch. I found that after I shut down this container and bring it up again, all my files are lost. Also, I'm using the ubuntu image; after shutting down and restarting from the same image, all the software I installed with apt-get is gone, just like running a fresh image. So how can I save the files that I created?
My question is: does Docker "store" filesystems like /tmp as in-memory filesystems, so that nothing is actually saved to disk?
Thanks.
This is normal behavior for Docker. You have to define a volume to save your data; those volumes will continue to exist even after you shut down your container.
For example with a simple apache webserver:
$ docker run -dit --name my-apache-app -v "$PWD":/usr/local/apache2/htdocs/ httpd:2.4
This will mount your current directory to /usr/local/apache2/htdocs in the container, so those files will be available there.
Another approach is to use named volumes, which are not tied to a directory on your disk. Please refer to the docs:
Docker Manage - Data
When you start a container using the docker run command, e.g. docker run ubuntu, Docker starts a new container based on the image you specified. Any changes you made to a previous container will not be available, as this is a new instance spawned from the base image.
There are multiple ways to persist your data/changes across containers.
Use Volumes.
Data volumes are designed to persist data, independent of the container's lifecycle. You could attach a data volume or mount a host directory as a volume (see the sketch after this list).
Use Docker commit to create a new image with your changes and start future containers based on that image.
docker commit <container-id> new_image_name
docker run new_image_name
Use docker ps -a to list all the containers. It will list all containers including the ones that have exited. Find the docker id of the container that you were working on and start it using docker start <id>.
docker ps -a #find the id
docker start 1aef34df8ddf #start the container in background
References
Docker Volumes
Docker Commit

Mount docker host volume but overwrite with container's contents

Several articles have been extremely helpful in understanding Docker's volume and data management. These two in particular are excellent:
http://container-solutions.com/understanding-volumes-docker/
http://www.alexecollins.com/docker-persistence/
However, I am not sure if what I am looking for is discussed. Here is my understanding:
When running docker run -v /host/something:/container/something the host files will overlay (but not overwrite) the container files at the specified location. The container will no longer have access to the location's previous files, but instead only have access to the host files at that location.
When defining a VOLUME in a Dockerfile, other containers may share the contents created by the image/container.
The host may also view/modify a Dockerfile volume, but only after discovering the true mountpoint using docker inspect (usually somewhere like /var/lib/docker/vfs/dir/cde167197ccc3e138a14f1a4f7c....). However, this is hairy when Docker has to run inside a VirtualBox VM.
How can I reverse the overlay so that when mounting a volume, the container files take precedence over my host files?
I want to specify a mountpoint where I can easily access the container filesystem. I understand I can use a data container for this, or I can use docker inspect to find the mountpoint, but neither solution is a good solution in this case.
The docker 1.10+ way of sharing files would be through a volume, as in docker volume create.
That means that you can use a data volume directly (you don't need a container dedicated to a data volume).
That way, you can share and mount that volume in a container which will then keep its content in said volume.
That is more in line with how a container works: isolating memory, CPU, and filesystem from the host. That is why you cannot "mount a volume and have the container's files take precedence over the host files": it would break container isolation and expose the container's content to the host.
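Note that a named volume gets you close to what you asked for: when an empty named volume is mounted over a path that already contains files in the image, Docker copies those files into the volume on first use, so the container's files "win" initially. A sketch (my-image is a placeholder):
docker volume create appdata
docker run --rm -v appdata:/container/something my-image ls /container/something
# the image's files appear, and are now also persisted in the appdata volume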
Begin your container's script by copying files from a read-only bind mount reflecting the host files to a work location in the container. End the script by copying the necessary results from the container's work location back to the host, using either the same or a different mount point (a sketch follows these notes).
Alternatively to the end-of-script copy, run the container without automatically removing it at the end, then run docker cp CONTAINER_NAME:CONTAINER_DIR HOST_DIR, then docker rm CONTAINER_NAME.
Alternatively to copying results back to the host, keep them in a separate "named" volume, provided that the container had it mounted (type=volume,src=datavol,dst=CONTAINER_DIR/work). Use the named volume with other docker run commands to retrieve or use the results.
The input files may be modified on the host during development between repeated runs of the container. Avoid shadowing them with the frozen copies in the named volume. Beginning the container script by copying the input files from the host may help.
Using a named volume helps running the container read-only. (One may still need --tmpfs /tmp for temporary files or --tmpfs /tmp:exec if some container commands create and run executable code in the temporary location).
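A sketch of the copy-in/copy-out pattern described above (the paths, the image name, and the build command are placeholders; the named volume datavol holds the work location, as suggested earlier):
docker run --rm \
  --mount type=bind,source="$PWD"/input,target=/input,readonly \
  --mount type=bind,source="$PWD"/output,target=/output \
  --mount type=volume,src=datavol,dst=/work \
  my-image sh -c 'cp -a /input/. /work/ && make -C /work && cp -a /work/result /output/'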
