What purpose does using exec in docker entrypoint scripts serve? - bash

For example in the redis official image:
https://github.com/docker-library/redis/blob/master/2.8/docker-entrypoint.sh
#!/bin/bash
set -e
if [ "$1" = 'redis-server' ]; then
    chown -R redis .
    exec gosu redis "$@"
fi
exec "$@"
Why not just run the commands as usual without exec preceding them?

As @Peter Lyons says, using exec replaces the parent process rather than leaving two processes running.
This is important in Docker for signals to be proxied correctly. For example, if Redis were started without exec, it would not receive a SIGTERM upon docker stop and would not get a chance to shut down cleanly. In some cases, this can lead to data loss or zombie processes.
If you do start child processes (i.e. don't use exec), the parent process becomes responsible for handling and forwarding signals as appropriate. This is one of the reasons it's best to use supervisord or similar when running multiple processes in a container, as it will forward signals appropriately.
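As a minimal sketch of the difference (a hypothetical entrypoint, not taken from the Redis image):

#!/bin/bash
# Without exec, this shell stays as PID 1. `docker stop` sends SIGTERM
# to PID 1, i.e. the shell, and the server never sees the signal:
# redis-server "$@"

# With exec, redis-server replaces the shell as PID 1 and receives
# SIGTERM directly, so it can shut down cleanly:
exec redis-server "$@"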

Without exec, the parent shell process survives and waits for the child to exit. With exec, the child process replaces the parent process entirely, so when there's nothing for the parent to do after forking the child, I would consider exec slightly more precise/correct/efficient. In the grand scheme of things, it's probably safe to classify it as a minor optimization.
without exec
parent shell starts
parent shell forks child
child runs
child exits
parent shell exits
with exec
parent shell starts
parent shell execs the child program, replacing itself (no fork)
child program runs in the shell's original process
child exits

Think of it as an optimization like tail recursion.
If running another program is the final act of the shell script, there's not much of a need to have the shell run the program in a new process and wait for it. Using exec, the shell process replaces itself with the program.
In either case, the exit value of the shell script will be identical[1]. Whatever program originally called the shell script will see an exit value equal to the exit value of the exec'ed program (or 127 if the program cannot be found).
[1] Modulo corner cases such as a program doing something different depending on the name of its parent.
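A quick way to see the exit-status behaviour (hypothetical one-liners, any recent bash):

$ bash -c 'exec false'; echo $?          # prints 1: false's exit status
$ bash -c 'exec no-such-cmd'; echo $?    # prints 127: program not found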

Related

How to write to a coprocess from a child process of the parent that opened the coprocess

I am using a coprocess inside my main parent process to send commands to a shell that I cannot drive any other way (the shell I open in the coprocess is not maintained by me and executes the "newgrp" and "exec" commands that prevent me from sending commands to it directly from my script, so I need the coprocess to execute commands in that shell on my behalf). So far I have been using one thread, the parent process, to push commands to the coprocess, but now I need to send commands from several child processes too, because of an optimization step. The bash documentation says the coprocess file descriptors are not inherited by child processes, and this is in fact true; when I opened a subshell I got the following error message from bash:
[...]/automated_integration/clif_ai_common.sh: line 396: ${!clifAi_sendCmdToCoproc_varName}: Bad file descriptor
The code that makes this message appear is as follows:
if [[ ${PARAM_NO_MOVING_VERIF_TB_TAGS} != true ]]; then
    (
        clifAi_log ${CLIFAI_LOGLEVEL_INFO} "" "clifAi_sanityRegression_callbackRunning" "Populating moving VERIF and TB tags in the background..."
        clifAi_popVerifTags "${clifAi_sanityRegression_callbackRunning_coproc}" "${clifAi_sanityRegression_callbackRunning_wslogfile}" "${PARAM_OPTLEVEL}" "${CONST_EXCLUDE_FILTER}" "${CONST_DIR_TO_OPT}" ${clifAi_sanityRegression_callbackRunning_excludeList}
        clifAi_popTbTags "${clifAi_sanityRegression_callbackRunning_coproc}" "${clifAi_sanityRegression_callbackRunning_wslogfile}"
        rm -rf ${VAR_VERIFTBTAG_SEMAPHORE_FILE}
    ) &
fi
Bash reports the same error if I move this piece of code into a function and call it with & but without the ( ), i.e. with no explicit subshell. This is understandable: it still spawns a child process, whether or not it runs in a subshell.
My question is, how can I write to the coprocess owned by the parent process from child processes, too? What is the best practice?
Many thanks in advance,
Geza Balazs
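One commonly suggested workaround, sketched here under the assumption of bash >= 4 and a hypothetical WORKER coprocess: duplicate the coprocess descriptors onto fixed file descriptor numbers with exec. Bash closes ${WORKER[0]} and ${WORKER[1]} in subshells, but the duplicates remain open and are inherited by child processes:

coproc WORKER { while read -r line; do echo "processed: $line"; done; }
exec 3>&"${WORKER[1]}" 4<&"${WORKER[0]}"    # 3 = write end, 4 = read end
(
    echo "command from a child process" >&3    # works: FD 3 is inherited
) &
wait $!                     # wait for the subshell only, not the coprocess
read -r reply <&4
echo "$reply"               # prints: processed: command from a child process

If several children write concurrently, keep each write below PIPE_BUF so lines are not interleaved.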

Bash: wait for the child process spawned before exec'ing bash

Bash can use wait to wait for the processes it started directly. However, if the process forks a child, and then execs bash (that is, parent turns into bash), the newly exec'd Bash process cannot wait for the "inherited" child. Here is the minimal reproduction:
#!/bin/bash
sleep inf &
pid=$!
exec bash -c "wait $pid;"'echo This shell is $$; sleep inf'
which gives this output:
$ bash test.sh
bash: wait: pid 53984 is not a child of this shell
This shell is 53983
The pstree, however, shows that the child pid is indeed the child of the shell:
$ pstree -p 53983
bash(53983)─┬─sleep(53984)
└─sleep(53985)
It seems that Bash tracks the spawned processes internally, and consults this list rather than calling waitpid(2) directly (zsh has the same problem, but ksh works as expected).
Is there any way to workaround this behavior, and have Bash add the "inherited" child to its internal structures?
I could not reproduce it, but I wrote a script showing that this behaviour is consistent across at least 6 well-maintained shells (including the mentioned ksh).
As you can see in the report, none of the shells list the first sleep job in the replaced shell, only the one created after the exec call.
When exec is invoked, the new shell does not inherit the job list managed by the replaced one. This seems to be intended behaviour, but I could not find it anywhere in the POSIX specification.
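Not from this thread, but one workaround sketch: since the exec'd shell cannot wait(2) on the inherited child, poll for the pid instead. This detects termination but cannot retrieve the exit status, and it assumes the new shell reaps the dead child (bash does reap children it did not start):

sleep inf &
pid=$!
# kill -0 sends no signal; it only tests whether the process still exists.
exec bash -c 'while kill -0 "$1" 2>/dev/null; do sleep 1; done; echo "child $1 gone; this shell is $$"' _ "$pid"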

Linux - Child process to survive parent process tree kill

Motivation:
In a Java program, I'm setting a bash script to be executed on -XX:OnOutOfMemoryError. This script is responsible for uploading the heap-dump to HDFS. However, quite often only a part of the file gets uploaded.
I'm suspecting the JVM gets killed by cluster manager before the upload script completes. My guess is the JVM receives a process group kill signal and takes the bash script, i.e. its child process, down too.
The Question:
Is there a way in Unix to run a sub-process such that it does not die when its parent receives a group kill signal?
You can use disown. Start the process in the background and then disown it, and the shell will no longer pass signals such as SIGHUP on to the child. Note that disown does not change the process group, so a kill aimed at the whole group can still reach the child; combining it with setsid (which gives the child its own session and process group) lets it survive a group kill.
Script would look something like:
./handler_script &
disown
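A hedged variant of the same idea (handler_script stands in for the upload script): setsid puts the child in its own session and process group, so a group kill aimed at the JVM cannot reach it, and disown drops it from the shell's job table:

# Run the handler in its own session, detached from terminal and job table.
setsid ./handler_script < /dev/null > handler.log 2>&1 &
disown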

How do I record all child processes spawned by an Ant script over time?

I inherited a legacy Ant-based build system and I'm trying to get a sense of its scope. I observed multiple jvm and junit tasks with fork=yes. It calls subant and similar tasks wildly. Occasionally, it just execs other processes.
I really don't want to search through 100s of scripts and reference documentation for every task to find possible-forking-behavior. I'd like to capture the child-process list while the build runs.
I managed to create a clean Vagrant + Puppet environment for builds and I can run the full build like so
$ cd /vagrant && $ANT_HOME/bin/ant
If I had to brute force something... I'd have a script kick off the build and capture child processes until the build is completed?
#!/bin/bash
"$ANT_HOME/bin/ant" &
ant_pid=$!
# While the build is still running, record its direct children once a second.
while ps -p "$ant_pid" > /dev/null
do
    sleep 1
    ps --ppid "$ant_pid" --no-headers >> build_processes
done
User Jayan recommended strace, specifically:
$ strace -f -e trace=fork ant
The -f option does not limit tracing to fork system calls (that is what -e trace=fork does); it follows child processes as they are created. From the man page:
Trace child processes as they are created by currently traced processes as a result of the fork(2) system call. The new process is attached to as soon as its pid is known (through the return value of fork(2) in the parent process). This means that such children may run uncontrolled for a while (especially in the case of a vfork(2)), until the parent is scheduled again to complete its (v)fork(2) call. If the parent process decides to wait(2) for a child that is currently being traced, it is suspended until an appropriate child process either terminates or incurs a signal that would cause it to terminate (as determined from the child's current signal disposition).
I can't find the trace=fork expression, but trace=process seems useful.
-e trace=process
Trace all system calls which involve process management. This is useful for watching the fork, wait, and exec steps of a process.
http://linuxcommand.org/man_pages/strace1.html
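Putting the suggestions together, a sketch (assuming a reasonably recent strace; on modern kernels process creation usually appears as clone(2) rather than fork(2), which trace=process covers):

$ strace -f -e trace=process -o ant_trace.log "$ANT_HOME/bin/ant"
$ grep execve ant_trace.log    # every program the build launched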
As Ant is a Java process, you can try byteman. In a byteman script you define rules that are triggered when the exec methods of java.lang.Runtime are executed.
You attach byteman to Ant using the ANT_OPTS environment variable.

How to kill all children of the current shell on interrupt?

My scripts cdist-deploy-to and cdist-mass-deploy (from cdist configuration management) run interactively (i.e. are called by a user).
These scripts call a lot of scripts, which again call some scripts:
cdist-mass-deploy ...
  cdist-deploy-to ...
    cdist-explorer-run-global ...
      cdist-dir ...
What I want is to exit / kill all scripts, as soon as cdist-mass-deploy is either stopped by control C (SIGINT) or killed with SIGTERM.
cdist-deploy-to can also be called interactively and should exhibit the same behaviour.
Using ps -ef and its variants to find all processes with a given ppid looks like it could be quite unportable. Using $! does not work either, as at the deeper levels the children are not background processes.
I tried using the following code:
__cdist_kill_on_interrupt()
{
    __cdist_tmp_removal
    kill 0
    exit 1
}
trap __cdist_kill_on_interrupt INT TERM
But this leads to ugly Terminated messages as well as to a segfault in the shells (dash, bash, zsh) and seems not to stop everything instantly anyway:
# cdist-mass-deploy -p ikq04.ethz.ch ikq05.ethz.ch
core: Waiting for cdist-deploy-to jobs to finish
^CTerminated
Terminated
Terminated
Terminated
Segmentation fault
So the question is, how to cleanly exit including all (sub-)children in a portable manner (bourne shell, no csh support needed)?
You don't need to handle ^C: pressing it sends SIGINT to the whole foreground process group, which kills all the processes that are not in the background, so there is no need to catch INT at all.
The only reason you get a Terminated message when you kill them is that kill sends TERM by default, which is reasonable if you are handling TERM in the first place. You could use kill -INT 0 if you want to avoid the messages.
(responding with extra info)
If the child processes are run in the background, you can get their process ids just after you start them, using the $! special shell variable. Gather these together in a variable and just kill them all when you need to terminate.
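A sketch of that suggestion (deploy_to is a hypothetical stand-in for cdist-deploy-to; the code is Bourne-compatible, as the question requires):

pids=""
for host in "$@"; do
    deploy_to "$host" &        # each child runs in the background
    pids="$pids $!"            # remember its pid
done
# On INT or TERM, kill every recorded child, then exit.
trap 'kill $pids 2>/dev/null; exit 1' INT TERM
wait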
