How to make a docker workflow for multiple processes - bash

I am doing a lot of media-related tasks using content that is processed in different Docker containers, and I would like some guidance on how to create a simple, streamlined workflow. Specifically, I am not sure when to use shell scripts and when to use Dockerfiles.
The common sequence of events is:
1. Convert an input video to a sequence of images
2. Process that sequence of images
3. Convert the processed images back to a video
4. Save this video on the host
Each of these is run with a different docker run command. I want to script these processes so that I do not have to sit at the command line and type docker run image_name etc. etc. every single time I want to do this.
Step 1 is run as follows:
docker run -v $PWD:/temp/ jrottenberg/ffmpeg -i /temp/$SRC_VIDEO_DIR/$FILENAME /temp/$OUTPUT_IMAGE_DIR/$OUTPUT_IMAGE_BASENAME%06d.bmp
Step 2 is run as follows:
docker run -v $PWD/$OUTPUT_IMAGE_DIR:/notebook/raisr/test -v $PWD/$OUTPUT_LARGER_IMAGE_DIR:/notebook/raisr/results $DOCKER_IMAGE /bin/bash -c "source activate superres; cd /notebook/raisr; python test.py"
Step 3 is run as follows:
docker run -v $PWD:/temp/ jrottenberg/ffmpeg -framerate $FPS -i /temp/$OUTPUT_IMAGE_DIR/$OUTPUT_IMAGE_BASENAME%06d.bmp -c:v qtrle -pix_fmt rgb24 /temp/output_video/$OUTPUT_FILENAME
The easiest way to do this seems to be to create a shell script with those three commands in it and just run that script. The annoying thing with that is that I have to edit the shell script and change the input path of the video file each time. Also, chaining three commands together in Step 2 does not seem like common practice.
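For reference, something like the wrapper below is roughly what I have in mind (a rough sketch only: the output settings are sample values I would set at the top of the script, the script name is made up, and the input video is passed as an argument instead of being edited by hand):
#!/bin/bash
# usage: ./process_video.sh path/to/input_video.mp4   (script name is hypothetical)
set -euo pipefail

SRC_VIDEO_DIR=$(dirname "$1")
FILENAME=$(basename "$1")

# sample values; in practice these stay hard-coded or exported elsewhere
OUTPUT_IMAGE_DIR=frames
OUTPUT_LARGER_IMAGE_DIR=frames_large
OUTPUT_IMAGE_BASENAME=frame_
OUTPUT_FILENAME="${FILENAME%.*}_out.mov"
FPS=30
DOCKER_IMAGE=my_superres_image   # assumed to already exist locally

# make sure the mounted output directories exist on the host
mkdir -p "$OUTPUT_IMAGE_DIR" "$OUTPUT_LARGER_IMAGE_DIR" output_video

# Step 1: video -> image sequence
docker run -v "$PWD":/temp/ jrottenberg/ffmpeg \
  -i "/temp/$SRC_VIDEO_DIR/$FILENAME" \
  "/temp/$OUTPUT_IMAGE_DIR/$OUTPUT_IMAGE_BASENAME%06d.bmp"

# Step 2: process the image sequence
docker run -v "$PWD/$OUTPUT_IMAGE_DIR":/notebook/raisr/test \
  -v "$PWD/$OUTPUT_LARGER_IMAGE_DIR":/notebook/raisr/results \
  "$DOCKER_IMAGE" /bin/bash -c "source activate superres; cd /notebook/raisr; python test.py"

# Step 3: image sequence -> video, written back to the host via the $PWD mount
docker run -v "$PWD":/temp/ jrottenberg/ffmpeg \
  -framerate "$FPS" \
  -i "/temp/$OUTPUT_IMAGE_DIR/$OUTPUT_IMAGE_BASENAME%06d.bmp" \
  -c:v qtrle -pix_fmt rgb24 \
  "/temp/output_video/$OUTPUT_FILENAME"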
I realize that I could use the RUN instruction in a Dockerfile to specify that those commands are run automatically when the image is built, but mounting volumes from a Dockerfile and getting data out of the running container and back to the host has been a large pain. And some of these commands depend on a Docker image being installed locally with a specific name (the variable $DOCKER_IMAGE in Step 2 above).
So:
When you have multiple docker run commands, each relying on the others' data, what is the correct, scalable way to run them?
When do people use shell scripts and when do people use Dockerfiles to manage multiple jobs?
Is it good practice to have a Dockerfile run a job that processes media or should this be done from a shell script as a docker run command?

Related

file descriptor redirection in docker

I want to be able to pipe some content into a docker process without clobbering its stdin.
I thought I could do this by opening a new file descriptor in bash before spawning the docker process, then consuming that descriptor within the docker process. However, it doesn't work.
outside docker:
exec 4<>somefile.txt
docker run --rm -i image cmd args > output.txt
inside docker:
exec 4>file.txt # also tried without the exec
do something with file.txt
The docker container stops when it reaches the 4>file.txt line.
It must be an atomic action, so I can't use docker cp or anything like that.
Also, the docker image does not expose any network ports, so netcat cannot be used.
I would prefer to not use any complex docker mounts.
STDIN is required for other purposes, so I can't clobber that
Are there any other options for getting the file content into a transient container for the use of a single command?
The usual approach here is to mount the current directory into the container. You can choose any directory name inside the container, and should try to avoid hiding the script itself with the mount.
docker run --rm -i -v $PWD:/data image \
cmd -i /data/file.txt -o /data/output.txt --other-args
Filesystem permissions can be tricky, on both sides of this: you can name any directory in the first half of the -v option, even system directories like /etc; and if the process inside the container runs as a non-root user it might have trouble reading the files in the directory you mount in.
You can bind-mount either files or directories, but with the one caveat that they must exist on the host first, or else Docker will create a directory for you (even if you wanted a file; and likely owned by root and not your local user).
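For example, a minimal sketch of those two caveats in practice, reusing the image and cmd placeholders from above:
# create the file on the host first; otherwise Docker creates a directory
# named file.txt (likely owned by root) at that path
touch file.txt
docker run --rm -i -v "$PWD/file.txt":/data/file.txt image cmd -i /data/file.txt

# if permissions are a problem, run the container as your host UID/GID
docker run --rm -u "$(id -u):$(id -g)" -v "$PWD":/data image cmd -i /data/file.txt -o /data/output.txt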

copy bash command history (recursive search commands) into Docker container

I have a container which I am using interactively (docker run -it). In it, I have to run a pretty common set of commands, though not always in a set order, so I cannot just run a script.
Thus, I would like a way to have my command history (for reverse search, Ctrl+R) available in the Docker container.
Any idea how I can do this?
Let's mount the history file into the container from the host, so its contents are preserved when the container dies.
# In some directory
touch bash_history
docker run -v "$(pwd)/bash_history":/root/.bash_history:Z -it fedora /bin/bash
I would recommend keeping a bash history separate from the one you use on the host, for safety reasons.
I found helpful info in these questions:
Docker and .bash_history
Docker: preserve command history
https://superuser.com/questions/1158739/prompt-command-to-reload-from-bash-history
They use Docker volume mounts, however, which means that commands run in the container affect the local (host PC) history, which I do not want.
It seems I will have to copy ~/.bash_history from the host into the container, which will make the history work 'one-way'.
UPDATE: Working:
COPY your_command_script.sh /some_folder/my_history
ENV HISTFILE=/some_folder/my_history
ENV PROMPT_COMMAND="history -a; history -r"
Explanation:
copy the command script into a file in the container
tell the shell to use that file for its history (instead of the default ~/.bash_history)
append to and reload the history file at each prompt
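For a one-way snapshot of the host history instead of a prepared command script, the build could be seeded like this (a sketch; my-image is a made-up tag and the file name matches the COPY source above):
cp ~/.bash_history ./your_command_script.sh   # one-way snapshot of host history
docker build -t my-image .
docker run --rm -it my-image /bin/bash        # Ctrl+R now searches the baked-in history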

Running one script in different environment

I am just wondering whether it is possible to run one script (e.g. a shell script, Python script, etc.) across different environments?
For example, I want a script that starts in the Linux host shell and then continues inside a Docker container shell (where the container is created by the script). In other words, the script should keep executing the rest of its commands inside the container after entering it.
run.sh (#shell script)
sudo docker exec -it some_containers bash #this command will lead me to docker container environment
apt-get install curl # I want to also execute this command inside the docker container after I enter the docker container environment
# this is just one script
Your question is not very clear, but it sounds like this is a job requiring two scripts: the first script runs in your "Linux shell" and needs to get the second script into the container (perhaps by way of the Dockerfile), at which point the first script can run it with docker exec.
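One way to sketch that two-script split (inner.sh is made up here; the container name is the one from your example):
# inner.sh -- runs inside the container
#!/bin/bash
apt-get update && apt-get install -y curl

# run.sh -- runs on the host
docker cp inner.sh some_containers:/tmp/inner.sh   # or COPY it in via the Dockerfile
docker exec some_containers bash /tmp/inner.sh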
Please see the answers on this question for more information.

Reuse inherited image's CMD or ENTRYPOINT

How can I include my own shell script CMD on container start/restart/attach, without removing the CMD used by an inherited image?
I am using this, which does execute my script fine, but appears to overwrite the PHP CMD:
FROM php
COPY start.sh /usr/local/bin
CMD ["/usr/local/bin/start.sh"]
What should I do differently? I am avoiding the prospect of copy/pasting the ENTRYPOINT or CMD of the parent image, and maybe that's not a good approach.
As mentioned in the comments, there's no built-in solution to this. From the Dockerfile, you can't see the value of the current CMD or ENTRYPOINT. A run-parts solution is nice if you control the upstream base image and include that code there, allowing downstream images to add their own changes. But Docker has one inherent issue that will cause problems with this approach: a container should run a single command, and that command needs to stay in the foreground. So if the upstream image's command kicks off first, it stays running without ever giving your later steps a chance to run, and you're left with the complexity of ordering the commands so that exactly one long-running command eventually runs without exiting.
My personal preference is a much simpler and hardcoded option, to add my own command or entrypoint, and make the last step of my command to exec the upstream command. You will still need to manually identify the script name to call from the upstream Dockerfile. But now in your start.sh, you would have:
#!/bin/sh
# run various pieces of initialization code here
# ...
# kick off the upstream command:
exec /upstream-entrypoint.sh "$@"
By using an exec call, you transfer pid 1 to the upstream entrypoint so that signals get handled correctly. And the trailing "$@" passes through any command-line arguments. You can use set to adjust the value of "$@" if there are some args you want to process and extract in your own start.sh script.
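Applied to the php image from the question, that might look roughly like this (a sketch; run docker inspect php to confirm the parent image's actual entrypoint and default command before hard-coding them):
# Dockerfile
FROM php
COPY start.sh /usr/local/bin/start.sh
ENTRYPOINT ["/usr/local/bin/start.sh"]
CMD ["php", "-a"]

# start.sh
#!/bin/sh
# ...your own initialization here...
# then hand off to the parent image's entrypoint, passing the CMD along
exec docker-php-entrypoint "$@"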
If the base image is not yours, you unfortunately have to call the parent command manually.
If you own the parent image, you can try what the people at camptocamp suggest here.
They basically use a generic script as an entry point that calls run-parts on a directory. What that does is run all scripts in that directory in lexicographic order. So when you extend an image, you just have to put your new scripts in that same folder.
However, that means you'll have to maintain order by prefixing your scripts which could potentially get out of hand. (Imagine the parent image decides to add a new script later...).
Anyway, that could work.
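A minimal sketch of that kind of generic entry point (the directory name is arbitrary; run-parts is available on Debian/Ubuntu-based images):
#!/bin/sh
# run every executable script dropped into this directory, in lexicographic order
run-parts /docker-entrypoint.d
# then start the actual service passed as CMD
exec "$@"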
Update #1
There is a long discussion on this docker compose issue about provisioning after a container starts. One suggestion is to wrap your docker run or docker-compose command in a shell script and then run docker exec for your other commands.
If you'd like to use that approach, you basically keep the parent CMD as the run command and you place yours as a docker exec after your docker run.
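A rough sketch of that wrap-and-exec pattern (image and container names are placeholders):
docker run -d --name myapp parent-image                        # parent ENTRYPOINT/CMD run untouched
# give the service a moment to come up, then layer your own steps on top
docker exec myapp sh -c 'echo "extra provisioning goes here"'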
Using mysql image as an example
Do docker inspect mysql/mysql-server:5.7 and see that:
Config.Cmd="mysqld"
Config.Entrypoint="/entrypoint.sh"
which we put in bootstrap.sh (remember to chmod a+x):
#!/bin/bash
echo $HOSTNAME
echo "Start my initialization script..."
# docker inspect results used here
/entrypoint.sh mysqld
Dockerfile is now:
FROM mysql/mysql-server:5.7
# put our script inside the image
ADD bootstrap.sh /etc/bootstrap.sh
# set to run our script
ENTRYPOINT ["/bin/sh","-c"]
CMD ["/etc/bootstrap.sh"]
Build and run our new image:
docker build --rm -t sidazhou/tmp-mysql:5.7 .
docker run -it --rm sidazhou/tmp-mysql:5.7
Outputs:
6f5be7c6d587
Start my initialization script...
[Entrypoint] MySQL Docker Image 5.7.28-1.1.13
[Entrypoint] No password option specified for new database.
...
...
You'll see this has the same output as the original image:
docker run -it --rm mysql/mysql-server:5.7
[Entrypoint] MySQL Docker Image 5.7.28-1.1.13
[Entrypoint] No password option specified for new database.
...
...

convert Dockerfile to Bash script

Is there any easy way to convert a Dockerfile to a Bash script in order to install all the software on a real OS? The reason is that I cannot modify the Docker container, and I would like to change a few things afterwards if they do not work out.
In short - no.
By parsing the Dockerfile with a tool such as dockerfile-parse you could run the individual RUN commands, but this would not replicate the Dockerfile's output.
You would have to be running the same version of the same OS.
The ADD and COPY commands affect the filesystem, which is in its own namespace. Running these outside of the container could potentially break your host system. Your host will also have files in places that the container image would not.
VOLUME mounts will also affect the filesystem.
The FROM image (which may in turn be descended from other images) may have other applications installed.
Writing Dockerfiles can be a slow process if there is a large installation or download step. To mitigate that, try adding new packages as a new RUN command (to take advantage of the cache) and add features incrementally, only optimising/compressing the layers when the functionality is complete.
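For example, while iterating (the base image and packages here are only placeholders):
FROM ubuntu:18.04
# stable, slow step stays in its own cached layer
RUN apt-get update && apt-get install -y ffmpeg
# new packages go in a fresh RUN while experimenting, so the layer above is reused
RUN apt-get install -y imagemagick
# once things work, fold these into a single RUN to reduce layers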
You may also want to use something like ServerSpec to get a TDD approach to your container images and prevent regressions during development.
Best practice docs here, gotchas and the original article.
Basically you can make a copy of a Docker container's file system using “docker export”, which you can then write to a loop device:
docker build -t <YOUR-IMAGE> ...
docker create --name=<YOUR-CONTAINER> <YOUR-IMAGE>
dd if=/dev/zero of=disk.img bs=1 count=0 seek=1G
mkfs.ext2 -F disk.img
sudo mount -o loop disk.img /mnt
docker export <YOUR-CONTAINER> | sudo tar x -C /mnt
sudo umount /mnt
Convert a Docker container to a raw file system image.
More info here:
http://mr.gy/blog/build-vm-image-with-docker.html
You can of course convert a Dockerfile to Bash script commands. It's just a matter of determining what the translation means. Everything Docker installs is applied as changes to a filesystem layer, which means the same changes can be made on a real OS.
An example of this process is here:
https://github.com/thatkevin/dockerfile-to-shell-script
It is an example of how you would do the translation.
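As a rough illustration of what such a translation amounts to (hypothetical Dockerfile lines, with approximate shell equivalents run on a matching base OS):
# Dockerfile: RUN apt-get update && apt-get install -y curl
apt-get update && apt-get install -y curl
# Dockerfile: COPY app/ /opt/app/
cp -r app/ /opt/app/
# Dockerfile: ENV PATH=/opt/app/bin:$PATH
export PATH=/opt/app/bin:$PATH    # only persists for the current shell session
# Dockerfile: WORKDIR /opt/app
cd /opt/app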
You can install applications inside a Dockerfile like this:
FROM <base>
RUN apt-get update -y
RUN apt-get install <some application> -y
