Save docker state (edited postgresql.conf file for example) - bash

I have downloaded a PostgreSQL Docker image and am currently editing some config files. The problem is that whenever I edit the config files and commit the container (saving it as a new image), the changes are never saved. The image is still the same as the one I downloaded.
Image I am using:
https://hub.docker.com/_/postgres/
I believe this is the latest Dockerfile.
https://github.com/docker-library/postgres/blob/a00e979002aaa80840d58a5f8cc541342e06788f/9.6/Dockerfile
This is what I did:
1. Run the postgresql docker container
2. Enter the terminal of the container. docker exec -i -t {id of container} /bin/bash
3. Edit some config files.
4. Exit the container.
5. Commit the changes by using docker commit {containerid} {new name}
6. Stop the old container and start the new one.
The new image is created. If I start a container from the new image and check the config files I edited, my changes are not there. No changes were committed.
What am I doing wrong here?

The Dockerfile contains a volume declaration
https://github.com/docker-library/postgres/blob/a00e979002aaa80840d58a5f8cc541342e06788f/9.6/Dockerfile#L52
VOLUME /var/lib/postgresql/data
Any file edits under this path will not be saved by a Docker image commit. These data files are deliberately excluded because they define your container's state. Images, on the other hand, are designed to create new containers, so VOLUMEs are the mechanism that keeps state separate from the image.
It would appear that you're attempting to use Docker images as a mechanism for DB backup and recovery. This is ill-advised, as the Docker filesystem is less performant than the native filesystem typically exposed to a volume.
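If the goal is backup and recovery, a more conventional route is to dump the database from the running container and restore it later; a minimal sketch, assuming hypothetical container names my_postgres / another_postgres and a database named mydb:
# Dump the database to a file on the host
docker exec my_postgres pg_dump -U postgres mydb > mydb_backup.sql
# Restore it into another container later
docker exec -i another_postgres psql -U postgres -d mydb < mydb_backup.sql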

As Mark rightfully points out, your data is left behind because of the volume definition, and it should not be altered for general production use.
If you have a legitimate reason to keep the data within the produced image, you can move the Postgres data out of the volume by adding the following to your Dockerfile:
ENV PGDATA /var/lib/postgresql/my_data
RUN mkdir -p $PGDATA
I've been using this technique to produce DB images for testing, to speed up the feedback loop.
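A rough sketch of that workflow, assuming a Dockerfile based on the official postgres image with the ENV PGDATA override above, plus hypothetical image/container names and a seed.sql file:
# Build an image whose data directory is outside the declared VOLUME
docker build -t postgres-baked .
# Start it, load the test data, then snapshot the container as a new image
docker run -d --name pg-seed postgres-baked
# (wait until the server accepts connections before loading data)
docker exec -i pg-seed psql -U postgres < seed.sql
docker commit pg-seed postgres-with-data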

Related

Is there a way to automate the creation of Docker Image?

I needed to create a Docker image of a Spring Boot application, and I achieved that by creating a Dockerfile and building it into an image. Then I used docker run to bring up a container. This container is used for all the activities for which my application was written.
My problem, however, is that the JAR file I use needs constant changes, and that requires me to rebuild the Docker image every time. Furthermore, I need to take the contents of the earlier running Docker container and transfer them into a container created from the newly built image.
I know this whole process can be written as a shell script and executed every time I have changes to my JAR file. But is there any tool I can use to somehow automate it in a simple manner?
Here is my Dockerfile:
FROM java:8
WORKDIR /app
ADD ./SuperApi ./SuperApi
ADD ./config ./config
ADD ./Resources ./Resources
EXPOSE 8000
CMD java -jar SuperApi/SomeName.jar --spring.config.location=SuperApi/application.properties
If you have a JAR file that you need to copy into an otherwise static Docker image, you can use a bind mount to save needing to rebuild repeatedly. This allows for directories to be shared from the host into the container.
Say your project directory (the build location where the JAR file is located) on the host machine is /home/vishwas/projects/my_project, and you need to have the contents placed at /opt/my_project inside the container. When starting the container from the command line, use the -v flag:
docker run -v /home/vishwas/projects/my_project:/opt/my_project [...]
Changes made to files under /home/vishwas/projects/my_project locally will be visible immediately inside the container[1], so there is no need to rebuild (and probably no need to restart) the container.
If using docker-compose, this can be expressed using a volumes stanza under the services listing for that container:
volumes:
- type: bind
source: /home/vishwas/projects/my_project
target: /opt/my_project
This works for development, but later on you'll likely want to bundle the JAR file into the image instead of sharing it from the host (so it can be shipped to production). When that time comes, re-build the image with a COPY directive added to the Dockerfile; note that the source path is relative to the build context:
COPY ./my_project /opt/my_project
[1]: Worth noting that the mount defaults to read/write, so the container will also be able to modify your project files. To mount it read-only, use: docker run -v /home/vishwas/projects/my_project:/opt/my_project:ro
You are looking for Docker Compose.
You can build and start containers with a single command using Compose.
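For example, assuming a docker-compose.yml that builds the Dockerfile above, a single command rebuilds the image (picking up the new JAR) and recreates the container:
# Rebuild the image and recreate the container in one step
docker-compose up -d --build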

Does Docker store all its files as a "memory image", as part of the image, not as disk files?

I was trying to add some files inside a Docker container, e.g. with touch. I found that after I shut down this container and brought it up again, all my files were lost. Also, I'm using the ubuntu image, and after shutting down and restarting from the same image, all the software I had installed with apt-get is gone! It's just like running a new image. So how can I save any file that I created?
My question is: does Docker store all its file systems, like /tmp, as in-memory file systems, so that nothing is actually saved to disk?
Thanks.
This is normal behaviour for Docker. You have to define a volume to save your data; those volumes will still exist even after you shut down your container.
For example with a simple apache webserver:
$ docker run -dit --name my-apache-app -v "$PWD":/usr/local/apache2/htdocs/ httpd:2.4
This will mount your current directory to /usr/local/apache2/htdocs in the container, so those files will be available there.
Another approach is to use named volumes, which are not tied to a directory you choose on your disk. Please refer to the docs:
Docker Manage - Data
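A small sketch of the named-volume variant, using a hypothetical volume name mydata:
# Create a named volume managed by Docker (stored under /var/lib/docker/volumes)
docker volume create mydata
# Mount it into the container; its contents survive container removal
docker run -dit --name my-apache-app -v mydata:/usr/local/apache2/htdocs/ httpd:2.4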
When you start a container using the docker run command, e.g. docker run ubuntu, Docker starts a new container based on the image you specified. Any changes you made in a previous container will not be available, as this is a new instance spawned from the base image.
There are multiple ways to persist your data/changes for your container.
Use Volumes.
Data volumes are designed to persist data, independent of the container’s lifecycle. You could attach a data volume or mount a host directory as a volume.
Use Docker commit to create a new image with your changes and start future containers based on that image.
docker commit <container-id> new_image_name
docker run new_image_name
Use docker ps -a to list all the containers. It will list all containers including the ones that have exited. Find the docker id of the container that you were working on and start it using docker start <id>.
docker ps -a #find the id
docker start 1aef34df8ddf #start the container in background
References
Docker Volumes
Docker Commit

How to specify docker image path on command line without editing configuration setting?

I have my Docker container images in different directories, and I would like to specify the path of the directory in the docker run command. There is a way to change this path by editing the '-g' option in the configuration file, but it requires restarting the Docker daemon. Is there any way to specify the Docker image path in the docker run command itself?
Docker must have knowledge of not just your image's physical location, but its complete layer tree, because a Docker image is made up of layers, where each layer is built by one Dockerfile command.
Hence, you should let Docker register / know all the images from the directory where the images are present. Moreover, if you have physically copied these images from another machine, they will not work unless they are registered / tagged within the Docker engine.
The short answer to your question is NO, it is not possible.
The Docker engine itself should manage the images. You could do everything the engine does by editing the configuration files it maintains internally, since they are all plain text, but it is definitely not worth your time, and you are better off letting Docker manage the images itself.
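If you do need to move images between machines or directories, the supported route is to export and re-import them through the engine; a minimal sketch, assuming an image named myimage:latest:
# Export an image, with all its layers and metadata, to a tar archive
docker save -o myimage.tar myimage:latest
# On the target machine, register it with the local Docker engine
docker load -i myimage.tar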

How to edit files in stopped/not starting docker container

Trying to fix errors and debug problems with my application that is split over several containers, I frequently edit files in containers:
either I am totally lazy and install nano and edit directly in the container, or
I docker cp the file out of the container, edit it, copy it back and restart the container.
Those are intermediate steps before arriving at new content for the container build, which takes a lot longer than doing the above (which of course is only intermediate fiddling around).
Now I frequently break the starting program of the container, which in the breaking cases is either a Node script or a Python web server script; both typically fail from syntax errors.
Is there any way to save those containers? Since they do not start, I cannot docker exec into them, and thus they are lost to me. I then go the rm/rmi/build/run route after fixing the offending file in the build input.
How can I either edit files in a stopped container, or cp them in or start a shell in a stopped container - anything that allows me to fix this container?
(It seems a bit like working on a remote computer and breaking the networking configuration - connection is lost "forever" this way and one has to use a fallback, if that exists.)
How to edit Docker container files from the host? looks relevant but is outdated.
I had a problem with a container which wouldn't start due to a bad config change I made.
I was able to copy the file out of the stopped container and edit it. Something like:
docker cp docker_web_1:/etc/apache2/sites-enabled/apache2.conf .
(correct the file)
docker cp apache.conf docker_web_1:/etc/apache2/sites-enabled/apache2.conf
Answering my own question.. still hoping for a better answer from a more knowledgeable person!!
There are 2 possibilities.
1) Editing the file system on the host directly. This is somewhat dangerous and has a chance of completely breaking the container, and possibly other data, depending on what goes wrong.
2) Changing the startup script to something that never fails, like starting a bash shell, doing the fixes/edits, and then changing the startup program back to the desired one (like node or whatever it was before).
More details:
1) Using
docker ps
to find the running containers or
docker ps -a
to find all containers (including stopped ones). Then run
docker inspect (containername)
and look for the "Id", one of the first values.
This is the part that contains implementation detail and might change, be aware that you may lose your container this way.
Go to
/var/lib/docker/aufs/diff/9bc343a9..(long container id)/
and there you will find all the files that have changed relative to the image the container is based upon. You can overwrite, add, or edit files.
Again, I would not recommend this.
2) As described at https://stackoverflow.com/a/32353134/586754, you can find the container's JSON configuration, config.json, at a path like
/var/lib/docker/containers/9bc343a99..(long container id)/config.json
There you can change the args from e.g. "nodejs app.js" to "/bin/bash". Now restart the Docker service and start the container (you should see that it now starts up correctly). You should use
docker start -i (containername)
to make sure it does not quit straight away. You can now work with the container and/or later attach with
docker exec -ti (containername) /bin/bash
Also, docker cp is rather useful for copying files that were edited outside of the container.
Also, one should only fall back to those measures if the container is more or less "lost" anyway, so any change would be an improvement.
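Sketching possibility 2 end to end, assuming a systemd host and a hypothetical container name broken_app; the exact config file layout under /var/lib/docker may differ between Docker versions:
# Find the full container id
docker ps -a --no-trunc
# Edit the startup command in the container's config under /var/lib/docker/containers/<id>/,
# e.g. change "node app.js" to "/bin/bash", then restart the daemon so it re-reads the file
sudo systemctl restart docker
# Start the container interactively so it does not quit straight away
docker start -i broken_app
# From another terminal, open a shell and fix the offending file
docker exec -ti broken_app /bin/bash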
You can edit the container's file system directly, but I don't know if it is a good idea.
First you need to find the path of the directory that is used as the runtime root for the container.
Run docker container inspect id/name.
Look for the key UpperDir in JSON output.
That is your directory.
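On the overlay2 storage driver you can print that directory in one step with a Go template; a small sketch, assuming a hypothetical container name broken_app:
# Print the container's writable-layer directory on the host (overlay2 driver)
docker container inspect --format '{{ .GraphDriver.Data.UpperDir }}' broken_app
# Files edited under that path are what the container sees at runtime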
If you are trying to restart a stopped container and need to alter it because of a misconfiguration, but the container isn't starting, you can do the following, which works using the docker cp command (similar to the previous suggestion). This procedure lets you remove files and make any other changes needed. With luck you can skip a lot of the steps below (a condensed command sketch follows after the list).
1. Use docker inspect to find the entrypoint (named Path in some versions).
2. Create a clone of the image using docker run.
3. Enter the clone using docker exec -ti <clone> bash (if it is a *nix container).
4. Locate the entrypoint file by looking through the clone.
5. Copy the old entrypoint script out using docker cp <container>:<path-to-entrypoint> ./
6. Modify it, or create a new entrypoint script, for instance:
#!/bin/bash
tail -f /etc/hosts
7. Ensure the script has execution rights.
8. Replace the old entrypoint using docker cp ./<entrypoint> <container>:<path-to-entrypoint>
9. Start the old container using docker start.
10. Redo steps 6-9 until the container starts.
11. Fix the issues in the container.
12. Restore the entrypoint if needed, redoing steps 6-9 as required.
13. Remove the clone if needed.
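A condensed sketch of the procedure above, using hypothetical names: broken_app for the stopped container, my_image for its image, and /app/entrypoint.sh for the entrypoint path (none of these come from the question):
# 1. Find the entrypoint/command of the broken container
docker inspect --format '{{ .Path }} {{ .Args }}' broken_app
# 2. Start a throwaway clone from the same image to look around in (requires bash in the image)
docker run -dit --name clone my_image /bin/bash
docker exec -ti clone bash
# 3. Copy the offending entrypoint out of the stopped container, fix it, copy it back
docker cp broken_app:/app/entrypoint.sh ./entrypoint.sh
#    (edit entrypoint.sh locally, keep it executable)
docker cp ./entrypoint.sh broken_app:/app/entrypoint.sh
# 4. Start the original container again
docker start broken_app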

Oracle on Docker with pre-pumped Data

I am using the following base docker file:
https://github.com/wnameless/docker-oracle-xe-11g/blob/master/Dockerfile
I read a bit on how to set up a data volume from this SO question and this blog, but I'm not sure how to fit the pieces together.
In short, I would like to manage the Oracle data in a data-only Docker image. How do I do that?
I've implemented volume mounts for the DB data.
Here is my fork:
Reduced the image size from 3.8 GB to 825 MB.
Database initialization moved out of the image build phase. The database now initializes at container startup when no database files are mounted.
Media reuse is supported outside of the container. Added graceful shutdown on container stop.
Removed sshd.
You may check here:
https://registry.hub.docker.com/u/sath89/oracle-xe-11g
https://github.com/MaksymBilenko/docker-oracle-xe-11g
I tried mapping the datafiles and fast-recovery directories in my Oracle XE container. However, I changed my mind after losing the files... so you should be very careful with this approach and understand how Docker manages those spaces under all operations.
I found, for example, that if you clean out old containers, the contents of the mapped directories will be deleted, even if they are mapped to something outside the Docker system area (/var/lib/docker). You can avoid this by keeping containers and starting them up again. But if you want to version and make a new image... you have to back up those files.
Oracle also IDs the files themselves (by checksum or inode number or something) and complains about them on startup... I did not investigate the extent of that issue, or even whether there is indeed any issue there.
I've opted not to map any of those files/dirs and plan to use Data Pump or whatever to get the data out, until I get a better handle on all that can happen.
So I update the data and version the image... pushing it to the repo for safe-keeping.
In general:
# Start data container
docker run -d -v /dbdata --name dbdata -it ubuntu
# Put Oracle data in /dbdata somehow
# Start a container with the database and look for data at /dbdata
docker run -d --volumes-from dbdata --name db -it ubuntu
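To version the data without baking it into the image, the volume can also be archived from a throwaway container that shares it; a minimal sketch following the same pattern:
# Back up the contents of the dbdata volume to a tar archive on the host
docker run --rm --volumes-from dbdata -v "$(pwd)":/backup ubuntu tar cvf /backup/dbdata.tar /dbdata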
