Kill last program in pipe if any previous command fails - bash

I have a backup script that is essentially:
acquire_data | gzip -9 | gpg --batch -e -r me@example.com | upload-to-cloud
The problem is that if acquire_data or gpg fails, then upload-to-cloud will see the EOF and happily upload an incomplete backup. As an example, gpg will fail if the filesystem with the user's home directory is full.
I want to pipe it, not store to a temporary file, because it's a lot of data that may not fit in the local server's free space.
I might be able to do something like:
set -o pipefail
mkfifo fifo
upload-to-cloud < fifo &
UPLOADER=$!
((acquire_data | gzip -9 | gpg […]) || kill $UPLOADER) > fifo
wait $UPLOADER # since I need the exit status
But I think that has a race condition. It's not guaranteed that the upload-to-cloud program will receive the signal before it reads the EOF. And adding a sleep seems wrong. Really, the stdin of upload-to-cloud need never be closed.
I want upload-to-cloud to die before it handles the EOF because then it won't finalize the upload, and the partial upload will be correctly discarded.
There's this similar question, except it talks about killing an earlier part if a later part fails, which is safer since it doesn't have a problem of the race condition.
What's the best way to do this?

Instead of running this all as one pipeline, split off upload-to-cloud into a separate process substitution which can be independently signaled, and for which your parent shell script holds a descriptor (and thus can control the timing of reaching EOF on its stdin).
Note that upload-to-cloud needs to be written to delete content it already uploaded in the event of an unclean exit for this to work as you intend.
Assuming you have a suitably recent version of bash:
#!/usr/bin/env bash
# dynamically allocate a file descriptor; assign it to a process substitution
# store the PID of that process substitution in upload_pid
exec {upload_fd}> >(exec upload-to-cloud); upload_pid=$!
# make sure we recorded an upload_pid that refers to a process that is actually running
if ! kill -0 "$upload_pid"; then
    # if this happens without any other obvious error message, check that we're running bash 4.4 or newer
    echo "ERROR: upload-to-cloud not started, or PID not stored" >&2
    exit 1
fi

set -o pipefail
if acquire_data | gzip -9 | gpg --batch -e -r me@example.com >&"$upload_fd"; then
    exec {upload_fd}>&-  # close the pipe writing to upload-to-cloud gracefully...
    wait "$upload_pid"   # ...and wait for it to exit
    exit                 # ...then exit ourselves with the exit status of upload-to-cloud
                         # (which was returned by wait, became $?, and is thus exit's default)
else
    retval=$?            # store the exit status of the failed pipeline component
    kill "$upload_pid"   # kill the backgrounded upload-to-cloud process
    wait "$upload_pid"   # let it handle that SIGTERM...
    exit "$retval"       # ...and exit the script with the exit status we stored earlier
fi
Without a new enough bash to be able to store the PID for a process substitution, the lines establishing the process substitution might change to:
mkfifo upload_to_cloud.fifo
upload-to-cloud <upload_to_cloud.fifo & upload_pid=$!
exec {upload_fd}>upload_to_cloud.fifo
rm -f upload_to_cloud.fifo
...after which the rest of the script should work unmodified.

Kill next command in pipeline on failure

I have a streaming backup script which I'm running as follows:
./backup_script.sh | aws s3 cp - s3://bucket/path/to/backup
The aws command streams stdin to cloud storage in an atomic way. If the process is interrupted without an EOF, the upload is aborted.
I want the aws process to be killed if ./backup_script.sh exits with a non-zero exit code.
Any bash trick for doing this?
EDIT:
You can test your solution with this script:
#!/usr/bin/env python
import signal
import sys
import functools

def signal_handler(signame, signum, frame):
    print("Got {}".format(signame))
    sys.exit(0)

signal.signal(signal.SIGTERM, functools.partial(signal_handler, 'TERM'))
signal.signal(signal.SIGINT, functools.partial(signal_handler, 'INT'))

for i in sys.stdin:
    pass
print("Got EOF")
Example:
$ grep --bla | ./sigoreof.py
grep: unrecognized option `--bla'
usage: grep [-abcDEFGHhIiJLlmnOoqRSsUVvwxZ] [-A num] [-B num] [-C[num]]
[-e pattern] [-f file] [--binary-files=value] [--color=when]
[--context[=num]] [--directories=action] [--label] [--line-buffered]
[--null] [pattern] [file ...]
Got EOF
I want ./sigoreof.py to be terminated with a signal.
Adopting/correcting a solution originally given by @Dummy00001:
mkfifo aws.fifo
exec 3<>aws.fifo # open the FIFO read/write *in the shell itself*
aws s3 cp - s3://bucket/path/to/backup <aws.fifo 3>&- & aws_pid=$!
rm aws.fifo # everyone who needs a handle already has one; we can remove the directory entry

if ./backup_script.sh >&3 3>&-; then
    exec 3>&-       # success: close the FIFO and let aws exit successfully
    wait "$aws_pid"
else
    kill "$aws_pid" # send a SIGTERM...
    wait "$aws_pid" # ...wait for the process to die...
    exec 3>&-       # ...and only close the write end *after* the process is dead
fi
Important points:
The shell opens the FIFO read/write to avoid blocking (opening for write only would block until a reader appeared; this could also be avoided by invoking the reader [that is, the s3 command] in the background prior to the exec opening the write side).
The write end of the FIFO is held by the script itself, so the read end never hits end-of-file until after the script intentionally closes it.
The aws command's handle on the write end of the FIFO is explicitly closed (3>&-), so it doesn't hold itself open (in which case the exec 3>&- done in the parent would not successfully allow it to finish reading and exit).
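The first point can be seen in isolation. A minimal sketch (the fifo name is made up) of why the read/write open doesn't block:

```shell
# Demonstration of FIFO open modes; demo.fifo is a made-up name.
mkfifo demo.fifo

# Opening read/write in the shell itself never blocks, even with no
# other process attached to the FIFO:
exec 3<>demo.fifo

# By contrast, a write-only open would block until a reader appeared:
#   exec 3>demo.fifo   # hangs if nothing has the FIFO open for reading

echo hello >&3     # data sits buffered in the FIFO...
read -r line <&3   # ...and can be read back through the same descriptor
exec 3>&-          # close the descriptor when done
rm -f demo.fifo
echo "$line"
```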
backup_script.sh should have a non-zero exit status if there is an error, so your script should look something like:
if ./backup_script.sh > output.txt; then
    aws s3 cp output.txt s3://bucket/path/to/backup
fi
rm -f output.txt
A pipe isn't really appropriate here.
If you really need to conserve disk space locally, you'll have to "reverse" the upload; either remove the uploaded file in the event of an error in backup_script.sh, or upload to a temporary location, then move that to the final path once you've determined that the backup has succeeded.
(For simplicity, I'm ignoring the fact that by letting aws exit on its own in the event of an error, you may be uploading more of the partial backup than you need to. See Charles Duffy's answer for a more bandwidth-efficient approach.)
After starting the backup process with
mkfifo data
./backup_script.sh > data & writer_pid=$!
use one of the following to upload the data.
# Upload and remove if there was an error
aws s3 cp - s3://bucket/path/to/backup < data &
if ! wait $writer_pid; then
    aws s3 rm s3://bucket/path/to/backup
fi
or
# Upload to a temporary file and move it into place
# once you know the backup succeeded.
aws s3 cp - s3://bucket/path/to/backup.tmp < data &
if wait $writer_pid; then
    aws s3 mv s3://bucket/path/to/backup.tmp s3://bucket/path/to/backup
else
    aws s3 rm s3://bucket/path/to/backup.tmp
fi
A short script which uses process substitution instead of named pipes would be:
#!/bin/bash
exec 4> >( ./second-process.sh )
./first-process.sh >&4 &
if ! wait $! ; then echo "error in first process" >&2; kill 0; wait; fi
It works much like with a fifo, basically using the fd as the information carrier for the IPC instead of a file name.
Two remarks: I wasn't sure whether it's necessary to close fd 4; I would assume that upon script exit the shell closes all open files.
And I couldn't figure out how to obtain the PID of the process in the process substitution (anybody? At least on my Cygwin the usual $! didn't work). Therefore I resorted to killing all processes in the group, which may not be desirable (but I'm not entirely sure about the semantics).
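For the record, bash 4.4 and newer do expose that PID: $! immediately after establishing a process substitution refers to it, and such a process can also be waited on. A minimal sketch, where 'cat >/dev/null' is a made-up stand-in for second-process.sh:

```shell
#!/usr/bin/env bash
# Requires bash >= 4.4.  'cat >/dev/null' stands in for second-process.sh.
exec 4> >(exec cat >/dev/null); subst_pid=$!

kill -0 "$subst_pid" && echo "substitution is running"

exec 4>&-          # closing the fd delivers EOF to the substitution...
wait "$subst_pid"  # ...and bash >= 4.4 can wait on it and reap it
```

With the PID in hand, an error path can kill just that one process instead of the whole process group.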
I think you need to spawn both processes from a third one and either use the named-pipe approach from Lynch in the post mentioned by @tourism (further below in the answers), or keep piping directly but rewrite backup_script.sh so that it stays alive in the error case, keeping stdout open. backup_script.sh would then have to signal the error condition to the calling process (e.g. by sending a SIGUSR signal to the parent process ID), which in turn first kills the aws process (leading to an atomic abort) and only then kills backup_script.sh, unless it has already exited because of the broken pipe.
I had a similar situation: a shell script contained a pipeline that used one of its own functions and that function wanted to be able to effect termination. A simple contrived example that finds and displays a file:
#!/bin/sh
a() { find . -maxdepth 1 -name "$1" -print -quit | grep . || exit 101; }
a "$1" | cat
echo done
Here, the function a needs to be able to effect termination which it tries to do by calling exit. However, when invoked through a pipeline (line 3), it only terminates its own (subshell) process. In the example, the done message still appears.
One way to work around this is to detect when in a subshell and send a signal to the parent:
#!/bin/sh
die() { [ "$$" = "$(exec sh -c 'echo $PPID')" ] && exit "$1" || kill "$$"; }
a() { find . -maxdepth 1 -name "$1" -print -quit | grep . || die 101; }
a "$1" | cat
echo done
When in a subshell, $$ is the pid of the parent, and the construct $(exec sh -c 'echo $PPID') is a shell-agnostic way to obtain the pid of the current subprocess. If using bash, this can be replaced by $BASHPID.
If the subprocess pid and $$ differ then it sends a SIGTERM signal to the parent (kill $$) instead of calling exit.
The given exit status (101) isn't propagated by kill, so the script exits with a status of 143 (which is 128+15, where 15 is the signal number of SIGTERM).

Quit less when pipe closes

As part of a bash script, I want to run a program repeatedly, and redirect the output to less. The program has an interactive element, so the goal is that when you exit the program via the window's X button, it is restarted via the script. This part works great, but when I use a pipe to less, the program does not automatically restart until I go to the console and press q. The relevant part of the script:
while :
do
program | less
done
I want to make less quit itself when the pipe closes, so that the program restarts without any user intervention. (That way it behaves just as if the pipe was not there, except while the program is running you can consult the console to view the output of the current run.)
Alternative solutions to this problem are also welcome.
Instead of exiting less, could you simply aggregate the output of each run of program?
while :
do
program
done | less
Having less exit when program does would be at odds with one useful feature of less: it can buffer the output of a program that exits before you finish reading its output.
UPDATE: Here's an attempt at using a background process to kill less when it is time. It assumes that the only program reading the output file is the less to kill.
while :
do
    ( program > /tmp/$$-program-output; kill $(lsof -Fp /tmp/$$-program-output | cut -c2-) ) &
    less /tmp/$$-program-output
done
program writes its output to a file. Once it exits, the kill command uses lsof to find out what process is reading the file, then kills it. Note that there is a race condition: less needs to start before program exits. If that's a problem, it can probably be worked around, but I'll avoid cluttering the answer otherwise.
You may try to kill the process group that program and less belong to, instead of using kill and lsof.
#!/bin/bash
trap 'kill 0' EXIT
while :
do
    # script command gives sh -c its own process group id (only sh -c cmd gets killed, not the entire script!)
    # FreeBSD script command
    script -q /dev/null sh -c '(trap "kill -HUP -- -$$" EXIT; echo hello; sleep 5; echo world) | less -E -c'
    # GNU script command
    #script -q -c 'sh -c "(trap \"kill -HUP -- -$$\" EXIT; echo hello; sleep 5; echo world) | less -E -c"' /dev/null
    printf '\n%s\n\n' "you now may ctrl-c the program: $0" 1>&2
    sleep 3
done
While I agree with chepner's suggestion, if you really want individual less instances, I think this item from the man page will help you:
-e or --quit-at-eof
    Causes less to automatically exit the second time it reaches end-of-file. By default, the only way to exit less is via the "q" command.
-E or --QUIT-AT-EOF
    Causes less to automatically exit the first time it reaches end-of-file.
You would make this option visible to less via the LESS environment variable:
export LESS="-E"
while :; do
    program | less
done
IHTH

executing bash loop while command is running

I want to build a bash script that executes a command and meanwhile performs other stuff, with the possibility of killing the command if the script is killed. Say it executes a cp of a large file and meanwhile prints the elapsed time since the copy started, but if the script is killed it also kills the copy.
I don't want to use rsync, for two reasons: 1) it is slow, and 2) I want to learn how to do it; it could be useful.
I tried this:
until cp SOURCE DEST
do
    # evaluates time, stuff, commands, file dimensions, not important now
    # and echoes something
done
but it doesn't execute the do...done block, as it is waiting for the copy to end. Could you please suggest something?
until is the opposite of while; it has nothing to do with doing stuff while another command runs. For that you need to run your task in the background with &.
cp SOURCE DEST &
pid=$!
# If this script is killed, kill the `cp'.
trap "kill $pid 2> /dev/null" EXIT
# While copy is running...
while kill -0 $pid 2> /dev/null; do
    # Do stuff
    ...
    sleep 1
done
# Disable the trap on a normal exit.
trap - EXIT
kill -0 checks whether a process is running. Note that, despite the name, it doesn't actually signal and kill the process; with signal 0, no signal is sent at all.
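A quick illustration of that probing behaviour, with sleep standing in for the background job:

```shell
sleep 2 & pid=$!

if kill -0 "$pid" 2>/dev/null; then
    echo "process $pid is still running"
fi

wait "$pid"   # let the background job finish

if ! kill -0 "$pid" 2>/dev/null; then
    echo "process $pid is gone"
fi
```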
There are three steps involved in solving your problem:
Execute a command in the background, so it will keep running while your script does something else. You can do this by following the command with &. See the section on Job Control in the Bash Reference Manual for more details.
Keep track of that command's status, so you'll know if it is still running. You can do this with the special variable $!, which is set to the PID (process identifier) of the last command you ran in the background, or empty if no background command was started. Linux creates a directory /proc/$PID for every process that is running and deletes it when the process exits, so you can check for the existence of that directory to find out if the background command is still running. You can learn more than you ever wanted to know about /proc from the Linux Documentation Project's File System Hierarchy page or Advanced Bash-Scripting Guide.
Kill the background command if your script is killed. You can do this with the trap command, which is a bash builtin command.
Putting the pieces together:
# Look for the 4 common signals that indicate this script was killed.
# If the background command was started, kill it, too.
trap '[ -z "$!" ] || kill $!' SIGHUP SIGINT SIGQUIT SIGTERM
cp "$SOURCE" "$DEST" & # Copy the file in the background.
# The /proc directory exists while the command runs.
while [ -e /proc/$! ]; do
    echo -n "." # Do something while the background command runs.
    sleep 1     # Optional: slow the loop so we don't use up all the dots.
done
Note that we check the /proc directory to find out if the background command is still running, because kill -0 will generate an error if it's called when the process no longer exists.
Update to explain the use of trap:
The syntax is trap [arg] [sigspec …], where sigspec … is a list of signals to catch, and arg is a command to execute when any of those signals is raised. In this case, the command is a list:
'[ -z "$!" ] || kill $!'
This is a common bash idiom that takes advantage of the way || is processed. An expression of the form cmd1 || cmd2 will evaluate as successful if either cmd1 OR cmd2 succeeds. But bash is clever: if cmd1 succeeds, bash knows that the complete expression must also succeed, so it doesn't bother to evaluate cmd2. On the other hand, if cmd1 fails, the result of cmd2 determines the overall result of the expression. So an important feature of || is that it will execute cmd2 only if cmd1 fails. That means it's a shortcut for the (invalid) sequence:
if cmd1; then
    # do nothing
else
    cmd2
fi
With that in mind, we can see that
trap '[ -z "$!" ] || kill $!' SIGHUP SIGINT SIGQUIT SIGTERM
will test whether $! is empty (which means the background task was never started). If that test fails, meaning the task was started, it kills the task.
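The short-circuit behaviour itself is easy to verify (the echo messages are made up):

```shell
false || echo "runs, because the first command failed"
true  || echo "never printed, because the first command succeeded"
echo "done"
```

Only the first and third lines are printed.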
Here is the simplest way to do that, using ps -p:
[command_1_to_execute] &
pid=$!

while ps -p $pid &>/dev/null; do
    [command_2_to_be_executed meanwhile command_1 is running]
    sleep 10
done
This will run command_2 every 10 seconds, as long as command_1 is still running in the background.
hope this will help you :)
What you want is to do two things at once in shell. The usual way to do that is with a job. You can start a background job by ending the command with an ampersand.
cp "$SOURCE" "$DEST" &
You can then use the jobs command to check its status.
Read more:
GNU Bash Job Control

Close pipe even if subprocesses of first command is still running in background

Suppose I have test.sh as below. The intent is to run some background task(s) by this script, that continuously updates some file. If the background task is terminated for some reason, it should be started again.
#!/bin/sh
if [ -f pidfile ] && kill -0 $(cat pidfile); then
    cat somewhere
    exit
fi

while true; do
    echo "something" >> somewhere
    sleep 1
done &
echo $! > pidfile
and want to call it like ./test.sh | otherprogram, e. g. ./test.sh | cat.
The pipe is not being closed as the background process still exists and might produce some output. How can I tell the pipe to close at the end of test.sh? Is there a better way than checking for existence of pidfile before calling the pipe command?
As a variant I tried using #!/bin/bash and disown at the end of test.sh, but it is still waiting for the pipe to be closed.
What I actually try to achieve: I have a "status" script which collects the output of various scripts (uptime, free, date, get-xy-from-dbus, etc.), similar to this test.sh here. The output of the script is passed to my window manager, which displays it. It's also used in my GNU screen bottom line.
Since some of the scripts that are used might take some time to create output, I want to detach them from output collection. So I put them in a while true; do script; sleep 1; done loop, which is started if it is not running yet.
The problem here is now that I don't know how to tell the calling script to "really" detach the daemon process.
See if this serves your purpose:
(I am assuming that you are not interested in any stderr output from the commands in the while loop. Adjust the code if you are. :-))
#!/bin/bash
if [ -f pidfile ] && kill -0 $(cat pidfile); then
    cat somewhere
    exit
fi

while true; do
    echo "something" >> somewhere
    sleep 1
done >/dev/null 2>&1 &
echo $! > pidfile
If you want to explicitly close a file descriptor, for example fd 1 (standard output), you can do it with:
exec 1>&-
This is valid for POSIX shells.
When you put the while loop in an explicit subshell and run the subshell in the background it will give the desired behaviour.
(while true; do
    echo "something" >> somewhere
    sleep 1
done) &

Bash process substitution and exit codes

I'd like to turn the following:
git status --short && (git status --short | xargs -Istr test -z str)
which gets me the desired result of mirroring the output to stdout and doing a zero length check on the result into something closer to:
git status --short | tee >(xargs -Istr test -z str)
which unfortunately returns the exit code of tee (always zero).
Is there any way to get at the exit code of the substituted process elegantly?
[EDIT]
I'm going with the following for now, it prevents running the same command twice but seems to beg for something better:
OUT=$(git status --short) && echo "${OUT}" && test -z "${OUT}"
Look here:
$ echo xxx | tee >(xargs test -n); echo $?
xxx
0
$ echo xxx | tee >(xargs test -z); echo $?
xxx
0
and look here:
$ echo xxx | tee >(xargs test -z; echo "${PIPESTATUS[*]}")
xxx
123
$ echo xxx | tee >(xargs test -n; echo "${PIPESTATUS[*]}")
xxx
0
Is that what you need?
See also Pipe status after command substitution
I've been working on this for a while, and it seems there is no way to do that with process substitution, except by resorting to inline signalling, which can really only be used for input pipes, so I'm not going to expand on it.
However, bash-4.0 provides coprocesses which can be used to replace process substitution in this context and provide clean reaping.
The following snippet provided by you:
git status --short | tee >(xargs -Istr test -z str)
can be replaced by something alike:
coproc GIT_XARGS { xargs -Istr test -z str; }
{ git status --short | tee; } >&${GIT_XARGS[1]}
exec {GIT_XARGS[1]}>&-
wait ${GIT_XARGS_PID}
Now, for some explanation:
The coproc call creates a new coprocess, naming it GIT_XARGS (you can use any name you like), and running the command in braces. A pair of pipes is created for the coprocess, redirecting its stdin and stdout.
The coproc call sets two variables:
${GIT_XARGS[@]} containing pipes to the process' stdin and stdout, appropriately ([0] to read from its stdout, [1] to write to its stdin),
${GIT_XARGS_PID} containing the coprocess' PID.
Afterwards, your command is run and its output is directed to the second pipe (i.e. the coprocess' stdin). The cryptic-looking >&${GIT_XARGS[1]} part expands to something like >&60, which is a regular output-to-fd redirection.
Please note that I needed to put your command in braces. This is because a pipeline causes subprocesses to be spawned, and they don't inherit file descriptors from the parent process. In other words, the following:
git status --short | tee >&${GIT_XARGS[1]}
would fail with an invalid file descriptor error, since the relevant fd exists in the parent process and not in the spawned tee process. Putting it in braces causes bash to apply the redirection to the whole pipeline.
The exec call is used to close the pipe to your coprocess. When you used process substitution, the process was spawned as part of output redirection and the pipe to it was closed immediately after the redirection no longer had effect. Since coprocess' pipe's lifetime extends beyond a single redirection, we need to close it explicitly.
Closing the output pipe should cause the process to get EOF condition on stdin and terminate gracefully. We use wait to wait for its termination and reap it. wait returns the coprocess' exit status.
As a last note, you can't use kill to terminate the coprocess in this case, since that would alter its exit status.
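The same pattern in a self-contained form, with tr standing in for the xargs test so the effect is visible. One extra precaution in this sketch: the PID is saved and the read end duplicated early, because bash tears down the COPROC fds on its own once the coprocess terminates, and private copies survive that.

```shell
#!/usr/bin/env bash
# tr stands in for the real consumer; any filter command would do.
coproc UPPER { tr a-z A-Z; }
upper_pid=$UPPER_PID
exec {from_upper}<&"${UPPER[0]}"   # keep a private copy of the read end

printf 'hello\n' >&"${UPPER[1]}"   # write to the coprocess' stdin
exec {UPPER[1]}>&-                 # close the pipe: tr sees EOF, flushes, exits

wait "$upper_pid"                  # reap the coprocess; $? is its exit status
read -r line <&"$from_upper"       # the pipe buffer is still readable via our dup
echo "$line"
exec {from_upper}<&-
```

Running it prints HELLO, showing that output written before the coprocess exited is still retrievable after reaping it.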
#!/bin/bash
if read q < <(git status -s)
then
    echo "$q"
    exit
fi
