Capture output of piped command while still knowing if first command wrote to stderr - bash

Is it possible to capture the output of cmd2 from cmd1 | cmd2 while still knowing if cmd1 wrote to stderr?
I am using exiftool to strip exif data from files:
exiftool "/path/to/file.ext" -all= -o -
This writes the output to stdout, and it works for most files. If the file is corrupt or is not a video/image file, it writes nothing to stdout and instead writes an error to stderr. For example:
Error: Writing of this type of file is not supported - /path/to/file.ext
I ultimately need to capture the md5 of files that don't result in an error. Right now I am doing this:
md5=$(exiftool "/path/to/file.ext" -all= -o - | md5sum | awk '{print $1}')
Regardless of whether the file is an image/video, this calculates an md5.
If the file is an image/video, it'll capture the file's md5 as expected.
If the file is not an image/video, exiftool doesn't write anything to stdout and so md5sum calculates the md5 of the null input. But that line will also write an error to stderr.
I need to be able to check if something was written to stderr so I know to scrap the calculated md5.
I know one alternative is to run exiftool twice: once without the md5sum and without capturing, just to see whether anything was written to stderr, and then a second time with the md5sum and capturing. But that means running exiftool twice, which I want to avoid because it can take a long time on big files. I'd rather run it only once.
Update
Also, I can't capture the output of just exiftool because it yields this warning:
bash: warning: command substitution: ignored null byte in input
And I cannot ignore this warning, because the md5 result is then not the same. That is to say:
file=$(exiftool "/path/to/file.ext" -all= -o -)
echo "$file" | md5sum
Will print the above null-byte warning and will not produce the same md5 result as:
exiftool "/path/to/file.ext" -all= -o - | md5sum

There is a special array variable for this: PIPESTATUS. A simple example, where file and file2 exist:
$ ls file &> /dev/null | ls file2 &> /dev/null; echo ${PIPESTATUS[@]}
0 0
And here file3 not exist
$ ls file3 &> /dev/null | ls file2 &> /dev/null; echo ${PIPESTATUS[@]}
2 0
$ ls file3; echo $?
ls: cannot access 'file3': No such file or directory
2
Triple pipe
$ ls file 2> /dev/null | ls file3 &> /dev/null | ls file2 &> /dev/null; echo ${PIPESTATUS[@]}
0 2 0
A pipeline inside a command substitution, tested with grep:
$ test=$(ls file | grep .; ((${PIPESTATUS[1]} > 0)) && echo error)
$ echo $test
file
$ test=$(ls file3 | grep .; ((${PIPESTATUS[1]} > 0)) && echo error)
ls: cannot access 'file3': No such file or directory
$ echo $test
error
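Applied to the question, a minimal sketch (assuming exiftool exits nonzero when it writes that error): PIPESTATUS from a pipeline inside $( ) is not visible to the parent shell, so export it as the substitution's exit status and branch on that:
if out=$(exiftool "/path/to/file.ext" -all= -o - | md5sum; exit "${PIPESTATUS[0]}"); then
    md5=${out%% *}   # first field of md5sum's "hash  -" output
else
    md5=""           # exiftool failed; scrap the checksum
fi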
Another approach is to check first whether the file type is an image or video:
type=$(file "/path/to/file.ext")
case $type in
    *image*|*Media*) echo "is an image or video";;
esac
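Combined with the md5 computation, that could look like the following sketch (file -b merely omits the leading filename from file's output):
case $(file -b "/path/to/file.ext") in
    *image*|*Media*)
        md5=$(exiftool "/path/to/file.ext" -all= -o - | md5sum | awk '{print $1}')
        ;;
    *)
        echo "not an image or video, skipping" >&2
        ;;
esac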

A coprocess can be used for this:
#!/usr/bin/env bash
case $BASH_VERSION in [0-3].*) echo "ERROR: Bash 4+ required" >&2; exit 1;; esac

# Start a coprocess that reads lines until its input closes, then reports
# whether it ever saw any input at all.
coproc STDERR_CHECK { seen=0; while IFS= read -r; do seen=1; done; echo "$seen"; }

# Run the pipeline with its stderr pointed at the coprocess's stdin.
{
    md5=$(exiftool "/path/to/file.ext" -all= -o - | md5sum | awk '{print $1}')
} 2>&${STDERR_CHECK[1]}

# Close the write end so the coprocess's read loop terminates, then collect
# its verdict.
exec {STDERR_CHECK[1]}>&-
read stderr_seen <&"${STDERR_CHECK[0]}"

if (( stderr_seen )); then
    echo "exiftool emitted stdout with md5 $md5, and had content on stderr"
else
    echo "exiftool emitted stdout with md5 $md5, and did not emit any content on stderr"
fi

md5=$(exec 3>&1; (exiftool "/path/to/file.ext" -all= -o - 2>&1 1>&3) 3> >(md5sum | awk '{print $1}' >&3) | grep -q .)
This opens file descriptor 3 and redirects it to file descriptor 1 (a.k.a. stdout).
The trick is to redirect exiftool's outputs:
exiftool ... 2>&1 redirects file descriptor 2 (a.k.a. stderr) to stdout
exiftool ... 1>&3 redirects stdout to file descriptor 3
Then fd 3 is redirected to another chain of commands using process substitution, i.e. 3> >(md5sum | awk '{print $1}' >&3), where 3> redirects fd 3 and >(...) is the process substitution itself.
At the same time, exiftool's standard error is written to standard output, which is piped into grep -q .; grep returns 0 if there is at least one character.
Because grep -q . is the last command executed in the main chain of commands, you can simply check $?:
md5=$(exec 3>&1; (exiftool "/path/to/file.ext" -all= -o - 2>&1 1>&3) 3> >(md5sum | awk '{print $1}' >&3) | grep -q .)
if [ $? -eq 0 ]
then
    # something was written to exiftool's stderr
fi
With grep -q ., the error itself is not displayed. If you want to see the error but not capture it into md5, replace grep -q . with grep . >&2:
md5=$(exec 3>&1; (exiftool "/path/to/file.ext" -all= -o - 2>&1 1>&3) 3> >(md5sum | awk '{print $1}' >&3) | grep . >&2)
It is very important that you redirect exiftool outputs in this very order. If you redirected like this:
exiftool "/path/to/file.ext" -all= -o - 1>&3 2>&1
Then stdout would be redirected to fd 3 first, and stderr to stdout afterwards. Because 1>&3 occurs before 2>&1, stderr ends up pointing at fd 3 as well, so the error message would be fed into md5sum. You definitely don't want that.
The end of the process substitution chain writes to fd 3 with >&3 because you want to keep the result on fd 3. Without >&3, the output of awk would end up on fd 1, which is piped into grep -q . or grep . >&2, and, again, you definitely don't want that.
PS: you don't need to close fd 3, because it was opened in a subshell while assigning md5. Should you need to close a file descriptor, call exec 3>&-

Just capture the output, and then use it conditionally. E.g.:
if out="$(exiftool "/path/to/file.ext" -all= -o -)"; then
    md5=$(echo "$out" | md5sum | awk '{print $1}')
fi
The assignment to out returns the exit status of exiftool, which the if then checks. Note that this construction assumes that exiftool returns a reasonable exit value. Note also the question's update: command substitution drops null bytes, so for binary output this md5 will not match the one from piping exiftool directly into md5sum.
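One way around that null-byte caveat, for what it's worth, is to let exiftool write to a temporary file instead of a variable. A sketch:
tmp=$(mktemp)
if exiftool "/path/to/file.ext" -all= -o - > "$tmp"; then
    md5=$(md5sum < "$tmp" | awk '{print $1}')   # bytes are preserved on disk, unlike in a variable
fi
rm -f "$tmp"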

Related

Testing whether stdin is a file vs. a pipe vs. a tty

I know that for bash and zsh, one can use e.g. [ -t 0 ] to determine whether stdin is an interactive tty session.
However, there doesn't seem to be a way to test whether stdin is being redirected from a file, versus being piped in from a command:
foo < ./file
bar | foo
Is there any way to detect the difference between these two? Separately, is there any way to get the path of the file being redirected from (outside of /proc/self, which is unavailable on macOS)?
You can check if /dev/stdin is a regular file or a pipe:
$ cat tmp.sh
#!/bin/bash
if [ -f /dev/stdin ]; then
    echo "file"
elif [ -p /dev/stdin ]; then
    echo "pipe"
fi
$ bash tmp.sh < foo.txt
file
$ echo foo | bash tmp.sh
pipe
This relies on /dev/stdin being in your file system, though.
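A sketch extending the script with the [ -t 0 ] test from the question, so all three cases are distinguished:
#!/bin/bash
if [ -t 0 ]; then            # stdin is a terminal
    echo "tty"
elif [ -f /dev/stdin ]; then # stdin redirected from a regular file
    echo "file"
elif [ -p /dev/stdin ]; then # stdin fed by a pipe
    echo "pipe"
fi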
You can also use the stat command, which will return information about standard input given no file name argument. As you mentioned you are using macOS, you can use the %HT format:
$ stat -f %HT
Character Device
$ stat -f %HT < foo.txt
Regular File
$ echo foo | stat -f %HT
Fifo File
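On Linux, where stat lacks the BSD -f flag, GNU coreutils stat has an analogous format; a sketch (-L dereferences the /dev/stdin symlink):
$ echo foo | stat -L -c %F /dev/stdin
fifo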

Unix pipeline command for redirecting output from stdin to stderr

I have a shell script that outputs information about successes to stdout, and also does a grep looking for errors in logs
inner.sh:
# do some things
echo success
# do other things
echo success
grep 'error' logs/*
I have another shell script that calls this one, counts up the successes and compares them to an expected number of successes:
outer.sh:
bash ./inner.sh | grep success | wc -l # I compare this number to the expected number
What I can't figure out how to do is have the output of grep go to stderr, so it's not counted by the wc -l in outer.sh, but instead makes it past the wc to the terminal for the operator to see.
So I want a command like stdin_to_stderr that I can pipe the grep output to, which would make any results it finds leave inner.sh on stderr.
Is there already such a thing? Or do I just need to write the tiny script that would do this? Or am I thinking about this wrong?
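For what it's worth, the helper command the question describes is a one-liner; a sketch (the name stdin_to_stderr is the asker's):
stdin_to_stderr() { cat >&2; }   # cat copies stdin to stdout; >&2 sends that out on stderr
Then, inside inner.sh:
grep 'error' logs/* | stdin_to_stderr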
bash ./inner.sh &> output_log_file
grep -c success output_log_file    # count of "success" lines
grep -v -c success output_log_file # count of lines without "success"
Example:
echo -e "success.\nerror.\nsuccess.\nError.\nsuccess.\nerror" | grep -c success
echo -e "success.\nerror.\nsuccess.\nError.\nsuccess." | grep -v -c success
Output:
3
2

bash command to grep something on stderr and save the result in a file

I am running a program called stm. I want to save only those stderr messages that contain the text "ERROR" in a text file. I also want the messages on the console.
How do I do that in bash?
Use the following pipeline if only messages containing ERROR should be displayed on the console (stderr):
stm |& grep ERROR | tee -a /path/to/logfile
Use the following command if all messages should be displayed on the console (stderr):
stm |& tee /dev/stderr | grep ERROR >> /path/to/logfile
Edit: Versions without connecting standard output and standard error:
stm 2> >( grep --line-buffered ERROR | tee -a /path/to/logfile >&2 )
stm 2> >( tee /dev/stderr | grep --line-buffered ERROR >> /path/to/logfile )
This looks like a duplicate of How to pipe stderr, and not stdout?
Redirect stderr to "&1", which means "the same place where stdout is going".
Then redirect stdout to /dev/null. Then use a normal pipe.
$ date -g
date: invalid option -- 'g'
Try `date --help' for more information.
$
$ (echo invent ; date -g)
invent (stdout)
date: invalid option -- 'g' (stderr)
Try `date --help' for more information. (stderr)
$
$ (echo invent ; date -g) 2>&1 >/dev/null | grep inv
date: invalid option -- 'g'
$
To copy the output from the above command to a file, you can use a > redirection or tee. The tee command prints one copy of the output to the console and a second copy to the file.
$ stm 2>&1 >/dev/null | grep ERROR > errors.txt
or
$ stm 2>&1 >/dev/null | grep ERROR | tee errors.txt
Are you saying that you want both stderr and stdout to appear in the console, but only stderr (not stdout) that contains "ERROR" to be logged to a file? It is that last condition that makes it difficult to find an elegant solution. If that is what you are looking for, here is my very ugly solution:
touch stm.out stm.err
stm 1>stm.out 2>stm.err & tail -f stm.out & tail -f stm.err & \
wait `pgrep stm`; pkill tail; grep ERROR stm.err > error.log; rm stm.err stm.out
I warned you about it being ugly. You could hide it in a function, use mktemp to create the temporary filenames, etc. If you don't want to wait for stm to exit before logging the ERROR text to a file, you could add tail -f stm.err | grep ERROR > error.log & after the other tail commands, and remove the grep command from the last line.

Redirect stderr to console and file

How can I redirect stderr of a bash script to both the console and a file?
I am using:
exec 2>> myfile
to log it to myfile. How can I extend it to log to the console as well?
For example:
exec 2> >(tee -a myfile >&2)
or you can use tail -f
$ touch myfile
$ tail -f myfile &
$ command 2>myfile
You can create a fifo
$ mknod mypipe p
let tee read from the fifo. It writes to stdout and the file you specified
$ tee myfile <mypipe &
[1] 17121
now run the command and pipe stderr to the fifo
$ ls kkk 2>mypipe
ls: cannot access kkk: No such file or directory
[1]+ Done tee myfile < mypipe
Try following that file with another command (such as tail -f) in the background:
exec 2>> myfile
tail -f myfile &   # prints new stderr lines to the console via its stdout
TAIL_PID=$!
... # your script
kill $TAIL_PID
Pure Bash solution which builds upon @mpapis's answer:
exec 2> >( while read -r line; do printf '%s\n' "${line}" >&2; printf '%s\n' "${line}" >> err.log; done )
and expanded:
exec 2> >(
    while read -r line; do
        printf '%s\n' "${line}" >&2
        printf '%s\n' "${line}" >> err.log
    done
)
You can redirect output to a process and use tee in that process:
#!/usr/bin/env bash
exec 2> >( tee -a err.log )
echo bla >&2

Pipe command output, but keep the error code [duplicate]

How do I get the correct return code from a unix command line application after I've piped it through another command that succeeded?
In detail, here's the situation :
$ tar -cEvhf - -I ${sh_tar_inputlist} | gzip -5 -c > ${sh_tar_file}   # when only the tar command fails, $? is 0
$ echo $?
0
And, what I'd like to see is:
$ tar -cEvhf - -I ${sh_tar_inputlist} 2>${sh_tar_error_file} | gzip -5 -c > ${sh_tar_file}
$ echo $?
1
Does anyone know how to accomplish this?
Use ${PIPESTATUS[0]} to get the exit status of the first command in the pipe.
For details, see http://tldp.org/LDP/abs/html/internalvariables.html#PIPESTATUSREF
See also http://cfajohnson.com/shell/cus-faq-2.html for other approaches if your shell does not support $PIPESTATUS.
Look at $PIPESTATUS which is an array variable holding exit statuses. So ${PIPESTATUS[0]} holds the exit status of the first command in the pipe, ${PIPESTATUS[1]} the exit status of the second command, and so on.
For example:
$ tar -cEvhf - -I ${sh_tar_inputlist} | gzip -5 -c > ${sh_tar_file}
$ echo ${PIPESTATUS[0]}
To print out all statuses use:
$ echo ${PIPESTATUS[@]}
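For instance, applied to the question's pipeline (a sketch; the variables are the asker's placeholders):
tar -cEvhf - -I ${sh_tar_inputlist} | gzip -5 -c > ${sh_tar_file}
tar_status=${PIPESTATUS[0]}   # grab it immediately; the next command overwrites PIPESTATUS
if [ "$tar_status" -ne 0 ]; then
    echo "tar exited with status $tar_status" >&2
fi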
Here is a general solution using only POSIX shell and no temporary files:
Starting from the pipeline:
foo | bar | baz
exec 4>&1
error_statuses=`((foo || echo "0:$?" >&3) |
                 (bar || echo "1:$?" >&3) |
                 (baz || echo "2:$?" >&3)) 3>&1 >&4`
exec 4>&-
$error_statuses contains the status codes of any failed processes, in random order, with indexes to tell which command emitted each status.
# if "bar" failed, output its status:
echo $error_statuses | grep '1:' | cut -d: -f2
# test if all commands succeeded:
test -z "$error_statuses"
# test if the last command failed:
echo $error_statuses | grep '2:' >/dev/null
As others have pointed out, some modern shells provide PIPESTATUS to get this info. In classic sh, it's a bit more difficult, and you need to use a fifo:
#!/bin/sh
trap 'rm -rf $TMPDIR' 0
TMPDIR=$( mktemp -d )
mkfifo ${FIFO=$TMPDIR/fifo}
cmd1 > $FIFO &
cmd2 < $FIFO
wait $!
echo The return value of cmd1 is $?
(Well, you don't need to use a fifo. You can have the commands early in the pipe echo a status variable and eval that in the main shell, redirecting file descriptors all over the place and basically bending over backwards to check things, but using a fifo is much, much easier.)
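Applied to the question's tar/gzip pipeline, the sketch becomes (variables are the asker's placeholders):
#!/bin/sh
trap 'rm -rf $TMPDIR' 0
TMPDIR=$( mktemp -d )
mkfifo "$TMPDIR/fifo"
tar -cEvhf - -I ${sh_tar_inputlist} > "$TMPDIR/fifo" &
gzip -5 -c < "$TMPDIR/fifo" > ${sh_tar_file}
wait $!
echo "tar exited with status $?"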
