How to redirect output to file and STDOUT and exit on error - bash

I can exit a program on error like so
ls /fake/folder || exit 1
I can redirect the output to a file and STDOUT like so
ls /usr/bin | tee foo.txt
But I can't do
ls /fake/folder | tee foo.txt || exit 1
because I get the exit status of tee and not ls
How do I redirect the output to both a file and STDOUT and exit on error

This is exactly what the pipefail runtime option is meant for:
# Make a pipeline successful only if **all** components are successful
set -o pipefail
ls /fake/folder | tee foo.txt || exit 1
If you want to be explicit about precedence, by the way, consider:
set -o pipefail
{ ls /fake/folder | tee foo.txt; } || exit 1 # same thing, but maybe more clear
...or, if you want to avoid making runtime configuration changes, you can use PIPESTATUS to check the exit status of any individual element of the most recent pipeline:
ls /fake/folder | tee foo.txt
(( ${PIPESTATUS[0]} == 0 )) || exit
If you don't want to take any of the approaches above, but are willing to use ksh extensions adopted by bash, putting it in a process substitution rather than a pipeline will prevent tee from impacting exit status:
ls /fake/folder > >(tee foo.txt) || exit 1
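As a rough sketch, the PIPESTATUS approach can be wrapped in a small function so the caller can keep writing "|| exit 1". The name log_and_check is a hypothetical helper (not something from the answer above), and foo.txt is just the file name from the question:

```shell
#!/usr/bin/env bash
# log_and_check is a hypothetical helper name.
# Usage: log_and_check LOGFILE COMMAND [ARGS...]
# Copies COMMAND's stdout to both the terminal and LOGFILE, then returns
# COMMAND's own exit status instead of tee's.
log_and_check() {
    local log_file=$1
    shift
    "$@" | tee "$log_file"
    return "${PIPESTATUS[0]}"
}

# In a real script this would be "|| exit 1"; echo is used here so the
# demonstration does not terminate the shell it is pasted into.
log_and_check foo.txt ls /fake/folder || echo "ls failed with status $?" >&2
```

Because the status is read inside the function, no shell option has to be changed for the rest of the script.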

Related

What is the difference between using process substitution vs. a pipe?

I came across an example for the using tee utility in the tee info page:
wget -O - http://example.com/dvd.iso | tee >(sha1sum > dvd.sha1) > dvd.iso
I looked up the >(...) syntax and found something called "process substitution". From what I understand, it makes a process look like a file that another process could write/append its output to. (Please correct me if I'm wrong on that point.)
How is this different from a pipe? (|) I see a pipe is being used in the above example—is it just a precedence issue? or is there some other difference?
There's no benefit here, as the line could equally well have been written like this:
wget -O - http://example.com/dvd.iso | tee dvd.iso | sha1sum > dvd.sha1
The differences start to appear when you need to pipe to/from multiple programs, because these can't be expressed purely with |. Feel free to try:
# Calculate 2+ checksums while also writing the file
wget -O - http://example.com/dvd.iso | tee >(sha1sum > dvd.sha1) >(md5sum > dvd.md5) > dvd.iso
# Accept input from two 'sort' processes at the same time
comm -12 <(sort file1) <(sort file2)
They're also useful in certain cases where you for any reason can't or don't want to use pipelines:
# Start logging all error messages to a file as well as the terminal
# Pipes don't work because bash doesn't support it in this context
exec 2> >(tee log.txt)
ls doesntexist
# Sum a column of numbers
# Pipes don't work because they create a subshell
sum=0
while IFS= read -r num; do (( sum+=num )); done < <(curl http://example.com/list.txt)
echo "$sum"
# apt-get something with a generated config file
# Pipes don't work because we want stdin available for user input
apt-get install -c <(sed -e "s/%USER%/$USER/g" template.conf) mysql-server
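To make the subshell point concrete, here is a small sketch with made-up input: the numbers function is a stand-in for the curl download, and the script assumes bash's default settings (in particular, that the lastpipe option is off):

```shell
#!/usr/bin/env bash
numbers() { printf '1\n2\n3\n'; }   # stand-in for: curl http://example.com/list.txt

# Pipe: the while loop runs in a subshell, so its changes to sum are lost.
sum=0
numbers | while IFS= read -r num; do (( sum += num )); done
piped_sum=$sum

# Process substitution: the loop runs in the current shell, so sum survives.
sum=0
while IFS= read -r num; do (( sum += num )); done < <(numbers)
procsub_sum=$sum

echo "pipe: $piped_sum, process substitution: $procsub_sum"
# prints: pipe: 0, process substitution: 6
```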
Another major difference is the propagation of return values / exit codes (I'll use simpler commands to illustrate):
Pipe:
$ ls -l /notthere | tee listing.txt
ls: cannot access '/notthere': No such file or directory
$ echo $?
0
-> exit code of tee is propagated
Process substitution:
$ ls -l /notthere > >(tee listing.txt)
ls: cannot access '/notthere': No such file or directory
$ echo $?
2
-> exit code of ls is propagated
There are of course several methods to work around this (e.g. set -o pipefail, variable PIPESTATUS), but I think it's worth mentioning since this is the default behavior.
Yet another rather subtle, yet potentially annoying difference lies in subprocess termination (best illustrated using commands that produce lots of output):
Pipe:
#!/usr/bin/env bash
tar --create --file /tmp/etc-backup.tar --verbose --directory /etc . 2>&1 | tee /tmp/etc-backup.log
retval=${PIPESTATUS[0]}
(( ${retval} == 0 )) && echo -e "\n*** SUCCESS ***\n" || echo -e "\n*** FAILURE (EXIT CODE: ${retval}) ***\n"
-> after the line containing the pipe construct, all commands of the pipe have already terminated (otherwise PIPESTATUS could not contain their respective exit codes)
Process substitution:
#!/usr/bin/env bash
tar --create --file /tmp/etc-backup.tar --verbose --directory /etc . &> >(tee /tmp/etc-backup.log)
retval=$?
(( ${retval} == 0 )) && echo -e "\n*** SUCCESS ***\n" || echo -e "\n*** FAILURE (EXIT CODE: ${retval}) ***\n"
-> after the line containing the process substitution, the command within >(...), i.e. tee in this example, may still be running, potentially causing desynchronized console output (SUCCESS / FAILURE message gets mixed in with still flowing tar output) [*]
[*] Can be reproduced on the framebuffer console, but does not seem to affect GUI terminals like KDE's Konsole (likely due to different buffering strategies).
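One way to avoid the interleaving, sketched under the assumption of bash 4.4 or later (where $! is set to the PID of the most recent process substitution and wait accepts it), is to wait for tee before printing anything else. noisy_command is a made-up stand-in for the tar invocation, and the log goes to a temporary file:

```shell
#!/usr/bin/env bash
log_file=$(mktemp)    # stand-in for /tmp/etc-backup.log

# Made-up stand-in for the tar command: prints two lines, fails with status 3.
noisy_command() { printf 'line 1\nline 2\n'; return 3; }

retval=0
noisy_command &> >(tee "$log_file") || retval=$?   # status of noisy_command, not tee
tee_pid=$!            # bash 4.4+: PID of the process substitution
wait "$tee_pid"       # block until tee has written everything

echo "exit code: $retval"
```

On older bash versions the wait call is expected to fail, so this is not a portable fix.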

Incorrect results with bash process substitution and tail?

Using bash process substitution, I want to run two different commands on a file simultaneously. In this example it is not necessary but imagine that "cat /usr/share/dict/words" was a very expensive operation such as uncompressing a 50gb file.
cat /usr/share/dict/words | tee >(head -1 > h.txt) >(tail -1 > t.txt) > /dev/null
After this command I would expect h.txt to contain the first line of the words file "A", and t.txt to contain the last line of the file "Zyzzogeton".
However what actually happens is that h.txt contains "A" but t.txt contains "argillaceo" which is about 5% into the file.
Why does this happen? It seems like either the "tail" process is terminating early or the streams are getting mixed up.
Running another similar command like this behaves as expected:
cat /usr/share/dict/words | tee >(grep ^a > a.txt) >(grep ^z > z.txt) > /dev/null
After this command I'd expect a.txt to contain all the words that begin with "a", while z.txt contains all of the words that begin with "z", which is exactly what happened.
So why doesn't this work with "tail", and with what other commands will this not work?
OK, what seems to happen is this: once the head -1 command finishes, it exits. The next time tee tries to write to the named pipe that the process substitution set up, the write fails with EPIPE and, according to man 2 write, also raises SIGPIPE in the writing process. That causes tee to exit, which forces tail -1 to exit immediately, and the cat on the left gets a SIGPIPE as well.
We can see this a little better if we add a bit more to the process with head and make the output both more predictable and also written to stderr without relying on the tee:
for i in {1..30}; do echo "$i"; echo "$i" >&2; sleep 1; done | tee >(head -1 > h.txt; echo "Head done") >(tail -1 > t.txt) >/dev/null
which when I run it gave me the output:
1
Head done
2
so it got just 1 more iteration of the loop before everything exited (though t.txt still only has 1 in it). If we then did
echo "${PIPESTATUS[@]}"
we see
141 141
which this question ties to SIGPIPE in a very similar fashion to what we're seeing here.
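For reference, 141 is 128 plus the signal number of SIGPIPE (13 on Linux). A minimal sketch that reproduces the same status with a simpler pipeline, assuming SIGPIPE is not being ignored by the environment the shell was started from:

```shell
#!/usr/bin/env bash
# yes keeps writing until head exits; its next write then hits a closed pipe.
yes | head -n 1 > /dev/null
statuses=("${PIPESTATUS[@]}")    # copy immediately; the next command resets PIPESTATUS

sigpipe_number=$(kill -l PIPE)   # bash builtin: signal name to number
echo "yes: ${statuses[0]}, head: ${statuses[1]}"
echo "128 + $sigpipe_number = $((128 + sigpipe_number))"
```

If SIGPIPE is ignored, yes sees a write error instead and exits with an ordinary non-zero status rather than 141.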
The coreutils maintainers have added this as an example to their tee "gotchas" for posterity.
For a discussion with the devs about how this fits into POSIX compliance you can see the (closed notabug) report at http://debbugs.gnu.org/cgi/bugreport.cgi?bug=22195
If you have access to GNU coreutils 8.24 or later, tee has some options (not in POSIX) that can help, like -p or --output-error=warn. Without that you can take a bit of a risk but get the desired functionality in the question by trapping and ignoring SIGPIPE:
trap '' PIPE
for i in {1..30}; do echo "$i"; echo "$i" >&2; sleep 1; done | tee >(head -1 > h.txt; echo "Head done") >(tail -1 > t.txt) >/dev/null
trap - PIPE
will have the expected results in both h.txt and t.txt, but if something else happened that wanted SIGPIPE to be handled correctly you'd be out of luck with this approach.
Another hacky option would be to zero out t.txt before starting then not let the head process list finish until it is non-zero length:
> t.txt; for i in {1..10}; do echo "$i"; echo "$i" >&2; sleep 1; done | tee >(head -1 > h.txt; echo "Head done"; while [ ! -s t.txt ]; do sleep 1; done) >(tail -1 > t.txt; date) >/dev/null

Exit when one process in pipe fails

The goal was to make a simple unintrusive wrapper that traces stdin and stdout to stderr:
#!/bin/bash
tee /dev/stderr | ./script.sh | tee /dev/stderr
exit ${PIPESTATUS[1]}
Test script script.sh:
#!/bin/bash
echo asd
sleep 1
exit 4
But when the script exits, it doesn't terminate the wrapper. A possible solution is to end the first tee from the second command of the pipe:
#!/bin/bash
# Second subshell will get the PID of the first one through the pipe.
# It will be able to kill the whole script by killing the first subshell.
# Create a temporary named pipe (it's safe, conflicts will throw an error).
pipe=$(mktemp -u)
if ! mkfifo $pipe; then
echo "ERROR: debug tracing pipe creation failed." >&2
exit 1
fi
# Attach it to file descriptor 3.
exec 3<>$pipe
# Unlink the named pipe.
rm $pipe
(echo $BASHPID >&3; tee /dev/stderr) | (./script.sh; r=$?; kill $(head -n1 <&3); exit $r) | tee /dev/stderr
exit ${PIPESTATUS[1]}
That's a lot of code. Is there another way?
I think that you're looking for the pipefail option. From the bash man page:
pipefail
If set, the return value of a pipeline is the value of the last (rightmost)
command to exit with a non-zero status, or zero if all commands in the
pipeline exit successfully. This option is disabled by default.
So if you start your wrapper script with
#!/bin/bash
set -e
set -o pipefail
Then the wrapper will exit when any error occurs (set -e) and will set the status of the pipeline in the way that you want.
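A minimal sketch of that behaviour, run in a child bash so the options do not leak into the calling shell. The inline (echo asd; exit 4) is a stand-in for script.sh from the question, and cat stands in for tee /dev/stderr:

```shell
#!/usr/bin/env bash
status=0
output=$(bash -c '
    set -e
    set -o pipefail
    (echo asd; exit 4) | cat     # pipeline status is 4 because of pipefail
    echo "not reached"           # set -e has already ended the script
') || status=$?

echo "output: $output, status: $status"
# prints: output: asd, status: 4
```

Without set -o pipefail the pipeline would report the status of cat (0), and the second echo would run.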
The main issue at hand here is clearly the pipe. In bash, when executing a command of the form
command1 | command2
and command2 dies or terminates, the pipe which receives the output (/dev/stdout) from command1 becomes broken. The broken pipe, however, does not terminate command1 by itself. command1 is only terminated, by SIGPIPE, the next time it tries to write to the broken pipe. A simple demonstration of this can be seen in this question.
If you want to avoid this problem, you should make use of process substitution in combination with input redirection. This way, you avoid pipes. The above pipeline is then written as:
command2 < <(command1)
In the case of the OP, this would become:
./script.sh < <(tee /dev/stderr) | tee /dev/stderr
which can also be written as:
./script.sh < <(tee /dev/stderr) > >(tee /dev/stderr)

Access $? Variable with a piped statement?

I have some code that I would like to have the $? variable of.
VARIABLE=`grep "searched_string" test.log | sed 's/searched/found/'`
Is there any way to test if this entire line (rather than just the sed command) was completed successfully? If I try the following code right after it:
if [ "$?" -ne 0 ]
then
echo 1
exit
fi
it doesn't run even if the grep part of the statement fails.
Could someone show how to resolve this issue?
Use the PIPESTATUS array:
echo ${PIPESTATUS[@]}
This prints the exit statuses of all commands in the most recent pipeline.
$ ls | grep . | wc -l
28
$ echo ${PIPESTATUS[@]}
0 0 0
but
$ ls | grep nonexistentfilename | wc -l
0
$ echo ${PIPESTATUS[@]}
0 1 0 #the grep returns 1 - pattern not found
or
$ ls nonexistentfilename | grep somegibberish | wc -l
ls: nonexistentfilename: No such file or directory
0
$ echo ${PIPESTATUS[@]}
1 1 0 #ls and grep fail
For the status of one specific command:
echo ${PIPESTATUS[1]} #for the grep
There is also
set -o pipefail
From the docs:
pipefail
If set, the return value of a pipeline is the value of the
last (rightmost) command to exit with a non-zero status, or zero if
all commands in the pipeline exit successfully. This option is
disabled by default.
$ ls nonexistentfile | wc -c
ls: nonexistentfile: No such file or directory
0
$ echo $?
0
$ set -o pipefail
$ ls nonexistentfile | wc -c
ls: nonexistentfile: No such file or directory
0
$ echo $?
1
EDIT based on the comment
You probably tried the following:
VARIABLE=$(grep "searched_string" test.log | sed 's/searched/found/')
echo "${PIPESTATUS[@]}"
Of course, this can't work, because the whole $(...) part runs in a subshell (another process), so any variable created there, PIPESTATUS included, is lost when the subshell exits (at the closing parenthesis).
You should put the whole PIPESTATUS mechanism inside the $(...), like this:
variable=$(
grep "searched_string" test.log | sed 's/searched/found/'
# do something with PIPESTATUS
# you should not echo anything to stdout (because it will be captured into $variable)
# you can echo to stderr, e.g.
echo "=${PIPESTATUS[@]}=" >&2
)
Also, the second line of the comment is a solution, e.g.:
var_with_status=$(command | command2 ; echo ":DELIMITER:${PIPESTATUS[@]}")
Now $var_with_status will contain not only the output of command | command2 but the PIPESTATUS too, separated by some unique delimiter, so you can extract it...
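A possible way to do that extraction, as a sketch only: capture_pipeline, pipeline_output and pipeline_statuses are hypothetical names, and the delimiter is assumed never to occur in the real output.

```shell
#!/usr/bin/env bash
# capture_pipeline is a hypothetical helper; it fills two global variables:
#   pipeline_output   - stdout of the grep | sed pipeline
#   pipeline_statuses - array with one exit status per pipeline stage
capture_pipeline() {
    local log_file=$1 delim=':DELIMITER:' captured statuses
    captured=$(
        grep 'searched_string' "$log_file" | sed 's/searched/found/'
        printf '%s%s' "$delim" "${PIPESTATUS[*]}"
    )
    pipeline_output=${captured%"$delim"*}       # everything before the delimiter
    pipeline_output=${pipeline_output%$'\n'}    # drop the newline left before it
    statuses=${captured##*"$delim"}             # everything after the delimiter
    read -r -a pipeline_statuses <<<"$statuses"
}

capture_pipeline test.log
echo "grep: ${pipeline_statuses[0]}, sed: ${pipeline_statuses[1]}"
```

The statuses are joined with spaces inside the subshell and split back into an array in the parent, so the parent can test each stage separately.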
Also, set -o pipefail will tell you whether anything failed, if you don't need to know exactly where.
You can also write the PIPESTATUS to a temp file (in the subshell), and the parent can read it and delete the temp file...
It is also possible to print the PIPESTATUS to a different file descriptor in the subshell and read that descriptor in the parent shell, but...
... beware of falling into the XY problem, where you end up with an extremely complicated script only because you don't want to change the logic of the processing.
E.g. you can always break your script into safe parts, like:
var1=$(grep 'str' test.log)
#check the `$var1` and do something with the error indicated with `$?`
var2=$(sed '....' <<<"$var1")
#check the `$var2` and do something with the error indicated with `$?`
#and so on
simple enough?
So, ask yourself: do you really need to fiddle with getting the PIPESTATUS out of a subshell?
PS: don't use uppercase variable names. They can clash with environment variables and cause hard-to-debug problems.

Pipe command output, but keep the error code [duplicate]

How do I get the correct return code from a unix command line application after I've piped it through another command that succeeded?
In detail, here's the situation :
$ tar -cEvhf - -I ${sh_tar_inputlist} | gzip -5 -c > ${sh_tar_file}   # when only the tar command fails, $? is still 0
$ echo $?
0
And, what I'd like to see is:
$ tar -cEvhf - -I ${sh_tar_inputlist} 2>${sh_tar_error_file} | gzip -5 -c > ${sh_tar_file}
$ echo $?
1
Does anyone know how to accomplish this?
Use ${PIPESTATUS[0]} to get the exit status of the first command in the pipe.
For details, see http://tldp.org/LDP/abs/html/internalvariables.html#PIPESTATUSREF
See also http://cfajohnson.com/shell/cus-faq-2.html for other approaches if your shell does not support $PIPESTATUS.
Look at $PIPESTATUS which is an array variable holding exit statuses. So ${PIPESTATUS[0]} holds the exit status of the first command in the pipe, ${PIPESTATUS[1]} the exit status of the second command, and so on.
For example:
$ tar -cEvhf - -I ${sh_tar_inputlist} | gzip -5 -c > ${sh_tar_file}
$ echo ${PIPESTATUS[0]}
To print out all statuses use:
$ echo ${PIPESTATUS[@]}
Here is a general solution using only POSIX shell and no temporary files:
Starting from the pipeline:
foo | bar | baz
rewrite it as:
exec 4>&1
error_statuses=`( (foo || echo "0:$?" >&3) |
(bar || echo "1:$?" >&3) |
(baz || echo "2:$?" >&3)) 3>&1 >&4`
exec 4>&-
$error_statuses contains the status codes of any failed processes, in random order, with indexes to tell which command emitted each status.
# if "bar" failed, output its status:
echo $error_statuses | grep '1:' | cut -d: -f2
# test if all commands succeeded:
test -z "$error_statuses"
# test if the last command succeeded:
echo $error_statuses | grep '2:' >/dev/null
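As a worked instance of the same pattern, here is a sketch with concrete stand-ins for foo, bar and baz (printf, grep and cat are my own choices, picked so that only the middle stage fails). It uses $( ... ) instead of backticks, with a space after "$(" so the shell does not read it as arithmetic expansion:

```shell
#!/bin/sh
# Stand-ins for foo | bar | baz: grep finds no match and exits with status 1.
exec 4>&1
error_statuses=$( ( (printf 'x\n'  || echo "0:$?" >&3) |
                    (grep y        || echo "1:$?" >&3) |
                    (cat           || echo "2:$?" >&3) ) 3>&1 >&4 )
exec 4>&-

echo "failed stages: ${error_statuses:-none}"
# prints: failed stages: 1:1
```

Here fd 3 carries the status reports into the command substitution, while fd 4 keeps the pipeline's real output on the original stdout.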
As others have pointed out, some modern shells provide PIPESTATUS to get this info. In classic sh, it's a bit more difficult, and you need to use a fifo:
#!/bin/sh
trap 'rm -rf $TMPDIR' 0
TMPDIR=$( mktemp -d )
mkfifo ${FIFO=$TMPDIR/fifo}
cmd1 > $FIFO &
cmd2 < $FIFO
wait $!
echo The return value of cmd1 is $?
(Well, you don't need to use a fifo. You can have the commands early in the pipe echo a status variable and eval that in the main shell, redirecting file descriptors all over the place and basically bending over backwards to check things, but using a fifo is much, much easier.)
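A runnable variant of the fifo approach, as a sketch: the two commands are made-up stand-ins for cmd1 and cmd2, and the variable names are mine.

```shell
#!/bin/sh
tmpdir=$(mktemp -d)
fifo=$tmpdir/fifo
mkfifo "$fifo"

# "cmd1" (stand-in): writes one line, then fails with status 3.
(printf 'hello\n'; exit 3) > "$fifo" &
# "cmd2" (stand-in): reads from the fifo; its output is captured for later use.
cmd2_output=$(tr 'a-z' 'A-Z' < "$fifo")

# $! still refers to the background job, so wait reports cmd1's status.
cmd1_status=0
wait $! || cmd1_status=$?

rm -rf "$tmpdir"
echo "cmd2 printed $cmd2_output, cmd1 exited with $cmd1_status"
# prints: cmd2 printed HELLO, cmd1 exited with 3
```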

Resources