file descriptor redirection in docker - bash

I want to be able to pipe some content into a docker process without clobbering its stdin.
I thought I could do this by opening a new file descriptor in bash before spawning the docker process, then consuming this descriptor within the docker process. However, it doesn't work.
outside docker:
exec 4<>somefile.txt
docker run --rm -i image cmd args > output.txt
inside docker:
exec 4>file.txt # also tried without the exec
do something with file.txt
The docker container stops when it reaches the 4>file.txt line.
It must be an atomic action, so I can't use docker cp or anything like that.
Also, the docker image does not expose any network ports, so netcat cannot be used.
I would prefer to not use any complex docker mounts.
STDIN is required for other purposes, so I can't clobber that.
Are there any other options for getting the file content into a transient container for the use of a single command?

The usual approach here is to mount the current directory into the container. You can choose any directory name inside the container, and should try to avoid hiding the script itself with the mount.
docker run --rm -i -v "$PWD":/data image \
cmd -i /data/file.txt -o /data/output.txt --other-args
Filesystem permissions can be tricky, on both sides of this: you can name any directory in the first half of the -v option, even system directories like /etc; and if the process inside the container runs as a non-root user it might have trouble reading the files in the directory you mount in.
You can bind-mount either files or directories, but with the one caveat that they must exist on the host first, or else Docker will create a directory for you (even if you wanted a file; and likely owned by root and not your local user).
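One way to guard against the missing-path caveat is to check the host side before handing the path to Docker. The following is only a minimal sketch: mount_arg is a hypothetical helper name, and the paths in the usage line are examples rather than anything from the question.

```shell
#!/bin/sh
# mount_arg (hypothetical helper): print a HOST:CONTAINER value for -v,
# or fail if the host path does not exist yet, so that Docker never
# gets the chance to auto-create a root-owned directory in its place.
mount_arg() {
    host_path=$1
    container_path=$2
    if [ ! -e "$host_path" ]; then
        echo "refusing to mount missing path: $host_path" >&2
        return 1
    fi
    # Bind mounts want an absolute host path; resolve relative ones
    # against the current directory.
    case $host_path in
        /*) abs=$host_path ;;
        *)  abs=$PWD/$host_path ;;
    esac
    printf '%s:%s\n' "$abs" "$container_path"
}
```

It could be used as spec=$(mount_arg file.txt /data/file.txt) && docker run --rm -i -v "$spec" image cmd -i /data/file.txt, so that docker run is skipped entirely when file.txt is missing.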

Related

Run a shell script with arguments on any given file with docker run

I am a docker beginner. I have used this SO post to run a shell script with docker run and this works fine. However, what I am trying to do is to apply my shell script to a file that lives in my current working directory, where Dockerfile and script are.
My shell script - given a file as an argument, return its name and the number of lines:
#!/bin/bash
echo $1
wc -l $1
Dockerfile:
FROM ubuntu
COPY ./file.sh /
CMD /bin/bash file.sh
then build and run:
docker build -t test .
docker run -ti test /file.sh text_file
This is what I get:
text_file
wc: text_file: No such file or directory
I'm left clueless why the second line doesn't work, why the file can't be found. I don't want to copy my text_file to the container. Ideally, I'd like to run my script from docker container on any file in my current working directory.
Any help will be much appreciated.
Thanks!!
You're building a Docker image that contains the script /file.sh. However, your Docker container does not contain (or know about) the file text_file which you're passing as an argument.
In order to make it known to your Docker container, you have to mount it when running the container.
docker run --rm -it -v "$PWD"/text_file:/text_file test /file.sh /text_file
In order to check for other files, you just have to swap text_file in both the mount and the argument.
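Swapping the file name in two places by hand is easy to get wrong, so it can be wrapped in a small function. This is a sketch under assumptions: run_on_file is a made-up name, the DOCKER variable is a made-up dry-run switch, and the read-only (:ro) mount is an extra precaution that the command above does not use. The image name test and the /file.sh path come from the question.

```shell
#!/bin/sh
# run_on_file (hypothetical helper): run /file.sh from the `test` image
# on one host file. The file is bind-mounted read-only at the container
# root under its own base name, and that same path is passed to the script.
# Setting DOCKER=echo prints the arguments instead of running Docker.
run_on_file() {
    file=$1
    name=$(basename "$file")
    # Resolve the directory to an absolute path for the bind mount.
    dir=$(cd "$(dirname "$file")" && pwd) || return 1
    "${DOCKER:-docker}" run --rm \
        -v "$dir/$name:/$name:ro" \
        test /file.sh "/$name"
}
```

With this in place, run_on_file text_file and run_on_file other_file each mount and pass the matching path, and DOCKER=echo run_on_file text_file shows what would be executed.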
Notes
In addition to Docker volume mounts, I might suggest some more improvements to spice up your image.
In order to run a script, you don't have to use ubuntu as your base image. You might be fine with alpine or the even more focused bash image. Keep in mind that alpine does not ship bash, so on alpine the script's shebang has to be #!/bin/sh (the script above uses nothing bash-specific). And don't forget to use tags in order to enforce the exact same behavior over time.
You can set your script as the ENTRYPOINT in your Dockerfile. Then you only specify the file name (text_file in this case) as your command.
When mounting files, you can change the name of the file inside the container. Therefore, you can simplify things by mounting the file to test at the exact same place every time you run the container.
FROM alpine:3.10
WORKDIR /tmp
COPY file.sh /usr/local/bin/wordcount
# exec (JSON) form is needed here: a shell-form ENTRYPOINT ignores CMD
ENTRYPOINT ["/usr/local/bin/wordcount"]
CMD ["file"]
Then,
docker run --rm -it -v "$PWD"/text_file:/tmp/file test
will do the job.

Problem in executing a shell script present on host using docker exec

I'm trying to execute a script on the master node of an AWS EMR cluster. The intention is to create a new conda env and link it to jupyter. I'm following this doc from AWS. The problem is that, whatever the content of the script, I get the same error: bash: /home/hadoop/scripts/bootstrap.sh: No such file or directory while executing sudo docker exec jupyterhub bash /home/hadoop/scripts/bootstrap.sh. I've made sure the sh file is in the correct location.
But if I copy the bootstrap.sh file inside the container, and then run the same docker exec cmd, it's working fine. What am I missing here? I've tried with a simple script with the following entries, but it throws the same error:
#!/bin/bash
echo "Hello"
The doc clearly says:
Kernels are installed within the Docker container. The easiest way to
accomplish this is to create a bash script with installation commands,
save it to the master node, and then use the sudo docker exec
jupyterhub script_name command to run the script within the jupyterhub
container.
The docker exec command runs a command within the container's namespaces. One of those namespaces is the filesystem. So unless the command is part of the image, written into the container directly, or you have mounted a host volume to map a host directory into the container, you won't be able to execute it. A host volume could look like:
docker run -v /host/scripts:/container/scripts --name your_container $your_image
docker exec -it your_container /container/scripts/test.sh
That host volume could be the same path on both the host and the container.
If it is a shell script, you could use I/O redirection, e.g.:
docker exec -i $container_id /bin/bash <local_script.sh
but be aware that you cannot do interactive stuff this way since the script content has replaced your terminal as stdin. This works because the shell inside the container is just processing commands from stdin.
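The stdin behaviour can be tried without Docker, since bash treats its standard input the same way inside and outside a container. A small sketch follows; the script contents are made up for illustration, and only the name local_script.sh is taken from the command above.

```shell
# Throwaway script with illustrative contents.
workdir=$(mktemp -d)
cat > "$workdir/local_script.sh" <<'EOF'
echo "Hello"
if [ -t 0 ]; then
    echo "stdin is a terminal"
else
    echo "stdin is the script, not a terminal"
fi
EOF

# bash takes its commands from stdin, the same way it does under
# `docker exec -i $container_id /bin/bash <local_script.sh`.
output=$(bash < "$workdir/local_script.sh")
printf '%s\n' "$output"
# prints:
#   Hello
#   stdin is the script, not a terminal
```

Because stdin is occupied by the script itself, anything in the script that expects keyboard input has nothing sensible to read from.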
Other than those scenarios, I don't know what to tell you other than the documentation from AWS appears to be wrong.

copy bash command history (recursive search commands) into Docker container

I have a container which I am using interactively (docker run -it). In it, I have to run a pretty common set of commands, though not always in a set order, hence I cannot just run a script.
Thus, I would like a way to have my commands available in recursive search (Ctrl+R) in the Docker container.
Any idea how I can do this?
Let's mount the history file into the container from the host, so that its contents are preserved after the container dies.
# In some directory
touch bash_history
docker run -v "$PWD"/bash_history:/root/.bash_history:Z -it fedora /bin/bash
I would recommend keeping this history file separate from the one you use on the host, for safety reasons.
I found helpful info in these questions:
Docker and .bash_history
Docker: preserve command history
https://superuser.com/questions/1158739/prompt-command-to-reload-from-bash-history
They use Docker volume mounts, however, which means that the commands run in the container end up in the local (host PC) history, which I do not want.
It seems I will have to copy ~/.bash_history from local into container which will make the history work 'one-way'.
UPDATE: Working:
COPY your_command_script.sh some_folder/my_history
ENV HISTFILE myroot/my_history
ENV PROMPT_COMMAND="history -a; history -r"
Explanation:
copy command script into a file in container
tell the shell to look at a different file for history
reload the history file
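If rebuilding the image for every history change is too heavy, the same one-way behaviour can be approximated at run time by mounting a copy of the host history instead of the original. This is a sketch only: seed_history and the default file name container_history are made-up names, and the mount target /root/.bash_history assumes the container runs bash as root.

```shell
#!/bin/sh
# seed_history (hypothetical helper): copy the host history into a
# separate file and print that file's path. The container can then read
# and extend the copy without ever touching the host's own history.
seed_history() {
    src=${1:-$HOME/.bash_history}
    dest=${2:-$PWD/container_history}
    if [ -f "$src" ]; then
        cp "$src" "$dest"
    else
        : > "$dest"     # no host history yet: start from an empty file
    fi
    printf '%s\n' "$dest"
}
```

It could then be used as docker run -v "$(seed_history)":/root/.bash_history:Z -it fedora /bin/bash, where each call refreshes the copy from the host before the container starts.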

Reuse inherited image's CMD or ENTRYPOINT

How can I include my own shell script CMD on container start/restart/attach, without removing the CMD used by an inherited image?
I am using this, which does execute my script fine, but appears to overwrite the PHP CMD:
FROM php
COPY start.sh /usr/local/bin
CMD ["/usr/local/bin/start.sh"]
What should I do differently? I am avoiding the prospect of copy/pasting the ENTRYPOINT or CMD of the parent image, and maybe that's not a good approach.
As mentioned in the comments, there's no built-in solution to this. From the Dockerfile, you can't see the value of the current CMD or ENTRYPOINT. Having a run-parts solution is nice if you control the upstream base image and include this code there, allowing downstream components to make their changes. But with Docker there's one inherent issue that will cause problems with this: a container should run a single command, and that command needs to run in the foreground. So if the upstream command kicks off first, it stays running without giving your later steps a chance to run, and you're left with the complexity of ordering the commands so that a single command does eventually run without exiting.
My personal preference is a much simpler and hardcoded option, to add my own command or entrypoint, and make the last step of my command to exec the upstream command. You will still need to manually identify the script name to call from the upstream Dockerfile. But now in your start.sh, you would have:
#!/bin/sh
# run various pieces of initialization code here
# ...
# kick off the upstream command:
exec /upstream-entrypoint.sh "$@"
By using an exec call, you transfer pid 1 to the upstream entrypoint so that signals get handled correctly. And the trailing "$@" passes through any command line arguments. You can use set to adjust the value of $@ if there are some args you want to process and extract in your own start.sh script.
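The argument handling can be tried outside a container. In the sketch below, upstream_entrypoint is a stand-in function for the real upstream script, and the --skip-init flag and the default-cmd fallback are invented for illustration; a real start.sh would end with exec on the upstream entrypoint instead of a function call.

```shell
#!/bin/sh
# Stand-in for the upstream entrypoint: it only reports what it received.
upstream_entrypoint() {
    echo "upstream got $# args:"
    for arg in "$@"; do
        echo "  [$arg]"
    done
}

start() {
    # Consume a leading option meant for this wrapper only
    # (--skip-init is a made-up flag).
    skip_init=no
    if [ "${1:-}" = "--skip-init" ]; then
        skip_init=yes
        shift               # remove it from the argument list
    fi

    if [ "$skip_init" = no ]; then
        echo "running initialization"
    fi

    # Nothing left to pass on? Use `set --` to install a default
    # (default-cmd is a placeholder).
    if [ "$#" -eq 0 ]; then
        set -- default-cmd
    fi

    # Quoted, the argument list keeps every argument intact,
    # including ones with embedded spaces.
    upstream_entrypoint "$@"
}

start --skip-init mysqld --user "my user"
# prints:
#   upstream got 3 args:
#     [mysqld]
#     [--user]
#     [my user]
```

The wrapper's own option is stripped with shift, and everything else reaches the upstream command unchanged.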
If the base image is not yours, you unfortunately have to call the parent command manually.
If you own the parent image, you can try what the people at camptocamp suggest here.
They basically use a generic script as an entry point that calls run-parts on a directory. What that does is run all scripts in that directory in lexicographic order. So when you extend an image, you just have to put your new scripts in that same folder.
However, that means you'll have to maintain order by prefixing your scripts which could potentially get out of hand. (Imagine the parent image decides to add a new script later...).
Anyway, that could work.
Update #1
There is a long discussion on this docker compose issue about provisioning after container run. One suggestion is to wrap your docker run or compose command in a shell script and then run docker exec on your other commands.
If you'd like to use that approach, you basically keep the parent CMD as the run command and you place yours as a docker exec after your docker run.
Using mysql image as an example
Do docker inspect mysql/mysql-server:5.7 and see that:
Config.Cmd="mysqld"
Config.Entrypoint="/entrypoint.sh"
which we put in bootstrap.sh (remember to chmod a+x):
#!/bin/bash
echo $HOSTNAME
echo "Start my initialization script..."
# docker inspect results used here
/entrypoint.sh mysqld
Dockerfile is now:
FROM mysql/mysql-server:5.7
# put our script inside the image
ADD bootstrap.sh /etc/bootstrap.sh
# set to run our script
ENTRYPOINT ["/bin/sh","-c"]
CMD ["/etc/bootstrap.sh"]
Build and run our new image:
docker build --rm -t sidazhou/tmp-mysql:5.7 .
docker run -it --rm sidazhou/tmp-mysql:5.7
Outputs:
6f5be7c6d587
Start my initialization script...
[Entrypoint] MySQL Docker Image 5.7.28-1.1.13
[Entrypoint] No password option specified for new database.
...
...
You'll see this has the same output as the original image:
docker run -it --rm mysql/mysql-server:5.7
[Entrypoint] MySQL Docker Image 5.7.28-1.1.13
[Entrypoint] No password option specified for new database.
...
...

Dockerize ruby script that takes directories as input/output

I am very new to docker, and I need help to dockerize a ruby script that takes an input directory and an output directory.
i.e. generate_rr_pair.rb BuildRR -n /data/ -o /output
The script takes the -n option (input) and checks whether the directory exists; if it does, it uses the files inside as input. The script then writes its data to the -o option (output). If the output directory doesn't exist, the script creates the directory and writes the files there.
How can I create a Dockerfile to handle this? Should I pass these in as environment variables? Or should I use mounted volumes? But since the script handles file IO, I am not sure if I want volumes. The input directory should already exist on the host, and the output directory will get created. Both directories should remain after the docker container stops.
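Use the official ruby image in your Dockerfile. Setting the script as the ENTRYPOINT lets the arguments given to docker run pass through to it; with CMD, they would replace the command instead:
FROM ruby:2.1-onbuild
ENTRYPOINT ["ruby", "generate_rr_pair.rb"]
Build the image as normal: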
docker build -t myruby .
Which can then be run as follows:
docker run --rm -it -v /data:/data -v /output:/output myruby BuildRR -n /data -o /output
Note that volume mappings are required if you want the ruby script within the container to operate on directories that live on the host machine.

Resources