Pipe command output, but keep the error code [duplicate] - shell

This question already has answers here:
Pipe output and capture exit status in Bash
How do I get the correct return code from a unix command line application after I've piped it through another command that succeeded?
In detail, here's the situation: when only the tar command fails, $? is still 0.
$ tar -cEvhf - -I ${sh_tar_inputlist} | gzip -5 -c > ${sh_tar_file}
$ echo $?
0
What I'd like to see instead is:
$ tar -cEvhf - -I ${sh_tar_inputlist} 2>${sh_tar_error_file} | gzip -5 -c > ${sh_tar_file}
$ echo $?
1
Does anyone know how to accomplish this?

Use ${PIPESTATUS[0]} to get the exit status of the first command in the pipe.
For details, see http://tldp.org/LDP/abs/html/internalvariables.html#PIPESTATUSREF
See also http://cfajohnson.com/shell/cus-faq-2.html for other approaches if your shell does not support $PIPESTATUS.

Look at $PIPESTATUS which is an array variable holding exit statuses. So ${PIPESTATUS[0]} holds the exit status of the first command in the pipe, ${PIPESTATUS[1]} the exit status of the second command, and so on.
For example:
$ tar -cEvhf - -I ${sh_tar_inputlist} | gzip -5 -c > ${sh_tar_file}
$ echo ${PIPESTATUS[0]}
To print out all statuses use:
$ echo ${PIPESTATUS[@]}

Here is a general solution using only POSIX shell and no temporary files:
Starting from the pipeline:
foo | bar | baz
exec 4>&1
error_statuses=`((foo || echo "0:$?" >&3) |
                 (bar || echo "1:$?" >&3) |
                 (baz || echo "2:$?" >&3)) 3>&1 >&4`
exec 4>&-
$error_statuses contains the status codes of any failed processes, in random order, with indexes to tell which command emitted each status.
# if "bar" failed, output its status:
echo $error_statuses | grep '1:' | cut -d: -f2
# test if all commands succeeded:
test -z "$error_statuses"
# test if the last command succeeded:
! echo $error_statuses | grep '2:' >/dev/null

As others have pointed out, some modern shells provide PIPESTATUS to get this info. In classic sh, it's a bit more difficult, and you need to use a fifo:
#!/bin/sh
trap 'rm -rf $TMPDIR' 0
TMPDIR=$( mktemp -d )
mkfifo ${FIFO=$TMPDIR/fifo}
cmd1 > $FIFO &
cmd2 < $FIFO
wait $!
echo The return value of cmd1 is $?
(Well, you don't need to use a fifo. You can have the commands early in the pipe echo a status variable and eval that in the main shell, redirecting file descriptors all over the place and basically bending over backwards to check things, but using a fifo is much, much easier.)

Related

What is the difference between using process substitution vs. a pipe?

I came across an example of using the tee utility in the tee info page:
wget -O - http://example.com/dvd.iso | tee >(sha1sum > dvd.sha1) > dvd.iso
I looked up the >(...) syntax and found something called "process substitution". From what I understand, it makes a process look like a file that another process could write/append its output to. (Please correct me if I'm wrong on that point.)
How is this different from a pipe (|)? I see a pipe is being used in the above example. Is it just a precedence issue, or is there some other difference?
There's no benefit here, as the line could equally well have been written like this:
wget -O - http://example.com/dvd.iso | tee dvd.iso | sha1sum > dvd.sha1
The differences start to appear when you need to pipe to/from multiple programs, because these can't be expressed purely with |. Feel free to try:
# Calculate 2+ checksums while also writing the file
wget -O - http://example.com/dvd.iso | tee >(sha1sum > dvd.sha1) >(md5sum > dvd.md5) > dvd.iso
# Accept input from two 'sort' processes at the same time
comm -12 <(sort file1) <(sort file2)
They're also useful in cases where, for whatever reason, you can't or don't want to use pipelines:
# Start logging all error messages to a file as well as the terminal
# Pipes don't work because bash doesn't support it in this context
exec 2> >(tee log.txt)
ls doesntexist
# Sum a column of numbers
# Pipes don't work because they create a subshell
sum=0
while IFS= read -r num; do (( sum+=num )); done < <(curl http://example.com/list.txt)
echo "$sum"
# apt-get something with a generated config file
# Pipes don't work because we want stdin available for user input
apt-get install -c <(sed -e "s/%USER%/$USER/g" template.conf) mysql-server
Another major difference is the propagation of return values / exit codes (I'll use simpler commands to illustrate):
Pipe:
$ ls -l /notthere | tee listing.txt
ls: cannot access '/notthere': No such file or directory
$ echo $?
0
-> exit code of tee is propagated
Process substitution:
$ ls -l /notthere > >(tee listing.txt)
ls: cannot access '/notthere': No such file or directory
$ echo $?
2
-> exit code of ls is propagated
There are of course several methods to work around this (e.g. set -o pipefail, variable PIPESTATUS), but I think it's worth mentioning since this is the default behavior.
Yet another rather subtle, yet potentially annoying difference lies in subprocess termination (best illustrated using commands that produce lots of output):
Pipe:
#!/usr/bin/env bash
tar --create --file /tmp/etc-backup.tar --verbose --directory /etc . 2>&1 | tee /tmp/etc-backup.log
retval=${PIPESTATUS[0]}
(( ${retval} == 0 )) && echo -e "\n*** SUCCESS ***\n" || echo -e "\n*** FAILURE (EXIT CODE: ${retval}) ***\n"
-> after the line containing the pipe construct, all commands of the pipe have already terminated (otherwise PIPESTATUS could not contain their respective exit codes)
Process substitution:
#!/usr/bin/env bash
tar --create --file /tmp/etc-backup.tar --verbose --directory /etc . &> >(tee /tmp/etc-backup.log)
retval=$?
(( ${retval} == 0 )) && echo -e "\n*** SUCCESS ***\n" || echo -e "\n*** FAILURE (EXIT CODE: ${retval}) ***\n"
-> after the line containing the process substitution, the command within >(...), i.e. tee in this example, may still be running, potentially causing desynchronized console output (SUCCESS / FAILURE message gets mixed in with still flowing tar output) [*]
[*] Can be reproduced on the framebuffer console, but does not seem to affect GUI terminals like KDE's Konsole (likely due to different buffering strategies).

How to redirect output to file and STDOUT and exit on error

I can exit a program on error like so
ls /fake/folder || exit 1
I can redirect the output to a file and STDOUT like so
ls /usr/bin | tee foo.txt
But I can't do
ls /fake/folder | tee foo.txt || exit 1
because then $? gives me the exit status of tee and not ls.
How do I redirect the output to both a file and STDOUT, and exit on error?
This is exactly what the pipefail runtime option is meant for:
# Make a pipeline successful only if **all** components are successful
set -o pipefail
ls /fake/folder | tee foo.txt || exit 1
If you want to be explicit about precedence, by the way, consider:
set -o pipefail
{ ls /fake/folder | tee foo.txt; } || exit 1 # same thing, but maybe more clear
...or, if you want to avoid making runtime configuration changes, you can use PIPESTATUS to check the exit status of any individual element of the most recent pipeline:
ls /fake/folder | tee foo.txt
(( ${PIPESTATUS[0]} == 0 )) || exit
If you don't want to take any of the approaches above, but are willing to use ksh extensions adopted by bash, putting it in a process substitution rather than a pipeline will prevent tee from impacting exit status:
ls /fake/folder > >(tee foo.txt) || exit 1

false | true; echo $? [duplicate]

I currently have a script that does something like
./a | ./b | ./c
I want to modify it so that if any of a, b, or c exit with an error code I print an error message and stop instead of piping bad output forward.
What would be the simplest/cleanest way to do so?
In bash you can use set -e and set -o pipefail at the beginning of your file. A subsequent command ./a | ./b | ./c will fail when any of the three scripts fails. The return code will be that of the rightmost command that failed.
Note that pipefail isn't available in standard sh.
You can also check the PIPESTATUS array after the pipeline finishes. For example, if you run:
./a | ./b | ./c
Then ${PIPESTATUS[@]} will hold the exit codes of each command in the pipe, so if the middle command failed, echo ${PIPESTATUS[@]} would output something like:
0 1 0
and something like this run after the command:
test ${PIPESTATUS[0]} -eq 0 -a ${PIPESTATUS[1]} -eq 0 -a ${PIPESTATUS[2]} -eq 0
will allow you to check that all commands in the pipe succeeded.
If you really don't want the second command to proceed until the first is known to be successful, then you probably need to use temporary files. The simple version of that is:
tmp=${TMPDIR:-/tmp}/mine.$$
if ./a > $tmp.1
then
    if ./b <$tmp.1 >$tmp.2
    then
        if ./c <$tmp.2
        then : OK
        else echo "./c failed" 1>&2
        fi
    else echo "./b failed" 1>&2
    fi
else echo "./a failed" 1>&2
fi
rm -f $tmp.[12]
The '1>&2' redirection can also be abbreviated '>&2'; however, an old version of the MKS shell mishandled the error redirection without the preceding '1' so I've used that unambiguous notation for reliability for ages.
This leaks files if you interrupt something. Bomb-proof (more or less) shell programming uses:
tmp=${TMPDIR:-/tmp}/mine.$$
trap 'rm -f $tmp.[12]; exit 1' 0 1 2 3 13 15
...if statement as before...
rm -f $tmp.[12]
trap 0 1 2 3 13 15
The first trap line says: run the commands rm -f $tmp.[12]; exit 1 when any of the signals 1 SIGHUP, 2 SIGINT, 3 SIGQUIT, 13 SIGPIPE, or 15 SIGTERM occurs, or on 0 (which fires when the shell exits for any reason).
If you're writing a shell script, the final trap only needs to remove the trap on 0, which is the shell exit trap (you can leave the other signals in place since the process is about to terminate anyway).
In the original pipeline, it is feasible for 'c' to be reading data from 'b' before 'a' has finished - this is usually desirable (it gives multiple cores work to do, for example). If 'b' is a 'sort' phase, then this won't apply - 'b' has to see all its input before it can generate any of its output.
If you want to detect which command(s) fail, you can use:
(./a || echo "./a exited with $?" 1>&2) |
(./b || echo "./b exited with $?" 1>&2) |
(./c || echo "./c exited with $?" 1>&2)
This is simple and symmetric - it is trivial to extend to a 4-part or N-part pipeline.
Simple experimentation with 'set -e' didn't help.
Unfortunately, the answer by Johnathan requires temporary files and the answers by Michel and Imron require bash (even though this question is tagged shell). As pointed out by others already, it is not possible to abort the pipe before later processes are started. All processes are started at once and will thus all run before any errors can be communicated. But the title of the question also asks about error codes. These can be retrieved and investigated after the pipe has finished to figure out whether any of the involved processes failed.
Here is a solution that catches all errors in the pipe and not only errors of the last component. So this is like bash's pipefail, just more powerful in the sense that you can retrieve all the error codes.
res=$( ( (./a 2>&1 || echo "1st failed with $?" >&2) |
         (./b 2>&1 || echo "2nd failed with $?" >&2) |
         (./c 2>&1 || echo "3rd failed with $?" >&2) > /dev/null ) 2>&1 )
if [ -n "$res" ]; then
echo pipe failed
fi
To detect whether anything failed, an echo command prints on standard error in case any command fails. Then the combined standard error output is saved in $res and investigated later. This is also why the standard error of all processes is redirected into their standard output: it flows down the pipe instead of ending up in $res. You can also send that output to /dev/null, or leave it as yet another indicator that something went wrong. You can replace the last redirect to /dev/null with a file if you need to store the output of the last command anywhere.
To play more with this construct and to convince yourself that this really does what it should, I replaced ./a, ./b and ./c by subshells which execute echo, cat and exit. You can use this to check that this construct really forwards all the output from one process to another and that the error codes get recorded correctly.
res=$( ( (sh -c "echo 1st out; exit 0" 2>&1 || echo "1st failed with $?" >&2) |
         (sh -c "cat; echo 2nd out; exit 0" 2>&1 || echo "2nd failed with $?" >&2) |
         (sh -c "echo start; cat; echo end; exit 0" 2>&1 || echo "3rd failed with $?" >&2) > /dev/null ) 2>&1 )
if [ -n "$res" ]; then
echo pipe failed
fi
This answer is in the spirit of the accepted answer, but using shell variables instead of temporary files.
if TMP_A="$(./a)"
then
    if TMP_B="$(echo "$TMP_A" | ./b)"
    then
        if TMP_C="$(echo "$TMP_B" | ./c)"
        then
            echo "$TMP_C"
        else
            echo "./c failed"
        fi
    else
        echo "./b failed"
    fi
else
    echo "./a failed"
fi

Incorrect results with bash process substitution and tail?

Using bash process substitution, I want to run two different commands on a file simultaneously. In this example it is not necessary but imagine that "cat /usr/share/dict/words" was a very expensive operation such as uncompressing a 50gb file.
cat /usr/share/dict/words | tee >(head -1 > h.txt) >(tail -1 > t.txt) > /dev/null
After this command I would expect h.txt to contain the first line of the words file "A", and t.txt to contain the last line of the file "Zyzzogeton".
However what actually happens is that h.txt contains "A" but t.txt contains "argillaceo" which is about 5% into the file.
Why does this happen? It seems like either the "tail" process is terminating early or the streams are getting mixed up.
Running another similar command like this behaves as expected:
cat /usr/share/dict/words | tee >(grep ^a > a.txt) >(grep ^z > z.txt) > /dev/null
After this command I'd expect a.txt to contain all the words that begin with "a", while z.txt contains all of the words that begin with "z", which is exactly what happened.
So why doesn't this work with "tail", and with what other commands will this not work?
OK, what seems to happen is this: once the head -1 command finishes, it exits, and the next time tee tries to write to the named pipe that the process substitution set up, the write fails with EPIPE and, according to man 2 write, also generates SIGPIPE in the writing process. That causes tee to exit, which forces the tail -1 to exit immediately, and the cat on the left gets a SIGPIPE as well.
We can see this a little better if we add a bit more to the process with head and make the output both more predictable and also written to stderr without relying on the tee:
for i in {1..30}; do echo "$i"; echo "$i" >&2; sleep 1; done | tee >(head -1 > h.txt; echo "Head done") >(tail -1 > t.txt) >/dev/null
which when I run it gave me the output:
1
Head done
2
so it got just 1 more iteration of the loop before everything exited (though t.txt still only has 1 in it). If we then did
echo "${PIPESTATUS[#]}"
we see
141 141
which this question ties to SIGPIPE in a very similar fashion to what we're seeing here.
The coreutils maintainers have added this as an example to their tee "gotchas" for future posterity.
For a discussion with the devs about how this fits into POSIX compliance you can see the (closed notabug) report at http://debbugs.gnu.org/cgi/bugreport.cgi?bug=22195
If you have access to GNU coreutils version 8.24, it adds some options (not in POSIX) that can help, like -p or --output-error=warn. Without that you can take a bit of a risk, but get the desired functionality from the question, by trapping and ignoring SIGPIPE:
trap '' PIPE
for i in {1..30}; do echo "$i"; echo "$i" >&2; sleep 1; done | tee >(head -1 > h.txt; echo "Head done") >(tail -1 > t.txt) >/dev/null
trap - PIPE
will have the expected results in both h.txt and t.txt, but if something else happened that wanted SIGPIPE to be handled correctly you'd be out of luck with this approach.
Another hacky option would be to zero out t.txt before starting, then not let the command list containing head finish until t.txt is non-empty:
> t.txt; for i in {1..10}; do echo "$i"; echo "$i" >&2; sleep 1; done | tee >(head -1 > h.txt; echo "Head done"; while [ ! -s t.txt ]; do sleep 1; done) >(tail -1 > t.txt; date) >/dev/null

Get exit code from subshell through the pipes

How can I get the exit code of wget from the subshell process?
The main problem is that $? equals 0. Where can $?=8 be found?
$> OUT=$( wget -q "http://budueba.com/net" | tee -a "file.txt" ); echo "$?"
0
It works without tee, actually.
$> OUT=$( wget -q "http://budueba.com/net" ); echo "$?"
8
But the ${PIPESTATUS} array (I'm not sure it's relevant to this case) also does not contain that value.
$> OUT=$( wget -q "http://budueba.com/net" | tee -a "file.txt" ); echo "${PIPESTATUS[1]}"
$> OUT=$( wget -q "http://budueba.com/net" | tee -a "file.txt" ); echo "${PIPESTATUS[0]}"
0
$> OUT=$( wget -q "http://budueba.com/net" | tee -a "file.txt" ); echo "${PIPESTATUS[-1]}"
0
So, my question is - how can I get wget's exit code through tee and subshell?
If it could be helpful, my bash version is 4.2.20.
By using $() you are (effectively) creating a subshell. Thus the PIPESTATUS instance you need to look at is only available inside your subshell (i.e. inside the $()), since environment variables do not propagate from child to parent processes.
You could do something like this:
OUT=$( wget -q "http://budueba.com/net" | tee -a "file.txt"; exit ${PIPESTATUS[0]} );
echo $? # prints exit code of wget.
You can achieve a similar behavior by using the following:
OUT=$(wget -q "http://budueba.com/net")
rc=$? # save exit code for later
echo "$OUT" | tee -a "file.txt"
Beware of this when using local variables:
local OUT=$(command; exit 1)
echo $? # 0
OUT=$(command; exit 1)
echo $? # 1
Copy the PIPESTATUS array first; it is overwritten as soon as you execute another command.
declare -a PSA
cmd1 | cmd2 | cmd3
PSA=( "${PIPESTATUS[@]}" )
I used fifos to solve the sub-shell/PIPESTATUS problem. See
bash pipestatus in backticked command?
I also found these useful:
bash script: how to save return value of first command in a pipeline?
and https://unix.stackexchange.com/questions/14270/get-exit-status-of-process-thats-piped-to-another/70675#70675
