bash: exit if grep pipeline selects any lines

I want a grep pipeline to succeed if no lines are selected:
set -o errexit
echo foo | grep -v foo
# no output, pipeline returns 1, shell exits
! echo foo | grep -v foo
# no output, pipeline returns 1 reversed by !, shell continues
Similarly, I want the pipeline to fail (and the enclosing shell to exit, when errexit is set) if any lines come out the end. The above approach does not work:
echo foo | grep -v bar
# output, pipeline returns 0, shell continues
! echo foo | grep -v bar
# output, ???, shell continues
! (echo foo | grep -v bar)
# output, ???, shell continues
I finally found a method, but it seems ugly, and I'm looking for a better way.
echo foo | grep -v bar | (! grep '.*')
# output, pipeline returns 1, shell exits
Any explanation of the behavior above would be appreciated as well.

set -e, aka set -o errexit, only handles unchecked commands; its intent is to prevent errors from going unnoticed, not to be an explicit flow control mechanism. It makes sense to check explicitly here, since it's a case you actively care about as opposed to an error that could happen as a surprise.
echo foo | grep -v bar && exit
More to the point, several of your commands make it explicit that you care about (and thus are presumably already manually handling) exit status -- thus setting the "checked" flag, making errexit have no effect.
Running ! pipeline, in particular, sets the checked flag for the pipeline -- it means that you're explicitly doing something with its exit status, thus implying that automatic failure-case handling need not apply.
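A minimal sketch (toy commands, not part of the question) of how errexit treats checked versus unchecked failures:
#!/usr/bin/env bash
set -o errexit
! echo foo | grep -v foo     # pipeline fails, but ! marks it checked: the shell continues
false && echo never          # left-hand side of && is likewise checked: the shell continues
echo "still here"
echo foo | grep -v foo       # unchecked failure: errexit terminates the shell here
echo "never reached"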

This is not a one-liner, but it works:
if echo foo | grep foo ; then
exit 1
fi
Meaning the shell exits with 1 if grep finds something. Also, I don't think set -e (or set -o errexit) should be used for logic; it should be used for what it is meant for: stopping your script when an unexpected error occurs.

Pipe output and capture exit status in Bash

I want to execute a long running command in Bash, and both capture its exit status, and tee its output.
So I do this:
command | tee out.txt
ST=$?
The problem is that the variable ST captures the exit status of tee and not of command. How can I solve this?
Note that command is long running and redirecting the output to a file to view it later is not a good solution for me.
There is an internal Bash variable called $PIPESTATUS; it’s an array that holds the exit status of each command in your last foreground pipeline of commands.
<command> | tee out.txt ; test ${PIPESTATUS[0]} -eq 0
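Applied to the snippet in the question, that might look like:
command | tee out.txt
ST=${PIPESTATUS[0]}   # exit status of command, not of tee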
Or another alternative which also works with other shells (like zsh) would be to enable pipefail:
set -o pipefail
...
The first option does not work with zsh due to slightly different syntax.
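A minimal sketch of the pipefail variant for the same snippet, assuming bash:
set -o pipefail
command | tee out.txt
ST=$?   # non-zero if command failed, even though tee itself succeeded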
Dumb solution: Connecting them through a named pipe (mkfifo). Then the command can be run second.
mkfifo pipe
tee out.txt < pipe &
command > pipe
echo $?
using bash's set -o pipefail is helpful
pipefail: the return value of a pipeline is the status of
the last command to exit with a non-zero status,
or zero if no command exited with a non-zero status
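For instance, a toy demonstration (assuming bash):
set -o pipefail
false | tee /dev/null
echo $?   # prints 1; without pipefail it would print 0, tee's status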
There's an array that gives you the exit status of each command in a pipe.
$ cat x| sed 's///'
cat: x: No such file or directory
$ echo $?
0
$ cat x| sed 's///'
cat: x: No such file or directory
$ echo ${PIPESTATUS[*]}
1 0
$ touch x
$ cat x| sed 's'
sed: 1: "s": substitute pattern can not be delimited by newline or backslash
$ echo ${PIPESTATUS[*]}
0 1
This solution works without using bash specific features or temporary files. Bonus: in the end the exit status is actually an exit status and not some string in a file.
Situation:
someprog | filter
you want the exit status from someprog and the output from filter.
Here is my solution:
((((someprog; echo $? >&3) | filter >&4) 3>&1) | (read xs; exit $xs)) 4>&1
echo $?
See my answer for the same question on unix.stackexchange.com for a detailed explanation and an alternative without subshells and some caveats.
By combining PIPESTATUS[0] and the result of executing the exit command in a subshell, you can directly access the return value of your initial command:
command | tee ; ( exit ${PIPESTATUS[0]} )
Here's an example:
# the "false" shell built-in command returns 1
false | tee ; ( exit ${PIPESTATUS[0]} )
echo "return value: $?"
will give you:
return value: 1
So I wanted to contribute an answer like lesmana's, but I think mine is perhaps a little simpler and a slightly more advantageous pure-Bourne-shell solution:
# You want to pipe command1 through command2:
exec 4>&1
exitstatus=`{ { command1; printf $? 1>&3; } | command2 1>&4; } 3>&1`
# $exitstatus now has command1's exit status.
I think this is best explained from the inside out - command1 will execute and print its regular output on stdout (file descriptor 1), then once it's done, printf will execute and print command1's exit code on its stdout, but that stdout is redirected to file descriptor 3.
While command1 is running, its stdout is piped to command2 (printf's output never reaches command2 because we send it to file descriptor 3 instead of 1, which is what the pipe reads). We then redirect command2's output to file descriptor 4 so that it also stays out of file descriptor 1; we want file descriptor 1 free for a moment, because we will bring the printf output on file descriptor 3 back down into file descriptor 1, which is what the command substitution (the backticks) will capture and place into the variable.
The final bit of magic is that first exec 4>&1 we did as a separate command - it opens file descriptor 4 as a copy of the external shell's stdout. Command substitution will capture whatever is written on standard out from the perspective of the commands inside it - but since command2's output is going to file descriptor 4 as far as the command substitution is concerned, the command substitution doesn't capture it - however once it gets "out" of the command substitution it is effectively still going to the script's overall file descriptor 1.
(The exec 4>&1 has to be a separate command because many common shells don't like it when you try to write to a file descriptor inside a command substitution, that is opened in the "external" command that is using the substitution. So this is the simplest portable way to do it.)
You can look at it in a less technical and more playful way, as if the outputs of the commands are leapfrogging each other: command1 pipes to command2, then the printf's output jumps over command 2 so that command2 doesn't catch it, and then command 2's output jumps over and out of the command substitution just as printf lands just in time to get captured by the substitution so that it ends up in the variable, and command2's output goes on its merry way being written to the standard output, just as in a normal pipe.
Also, as I understand it, $? will still contain the return code of the second command in the pipe, because variable assignments, command substitutions, and compound commands are all effectively transparent to the return code of the command inside them, so the return status of command2 should get propagated out - this, and not having to define an additional function, is why I think this might be a somewhat better solution than the one proposed by lesmana.
Per the caveats lesmana mentions, it's possible that command1 will at some point end up using file descriptors 3 or 4, so to be more robust, you would do:
exec 4>&1
exitstatus=`{ { command1 3>&-; printf $? 1>&3; } 4>&- | command2 1>&4; } 3>&1`
exec 4>&-
Note that I use compound commands in my example, but subshells (using ( ) instead of { }) will also work, though they may be less efficient.
Commands inherit file descriptors from the process that launches them, so the entire second line will inherit file descriptor four, and the compound command followed by 3>&1 will inherit file descriptor three. So the 4>&- makes sure that the inner compound command will not inherit file descriptor four, and the 3>&- makes sure that command1 will not inherit file descriptor three, so command1 gets a 'cleaner', more standard environment. You could also move the inner 4>&- next to the 3>&-, but I figure why not just limit its scope as much as possible.
I'm not sure how often things use file descriptor three and four directly - I think most of the time programs use syscalls that return not-used-at-the-moment file descriptors, but sometimes code writes to file descriptor 3 directly, I guess (I could imagine a program checking a file descriptor to see if it's open, and using it if it is, or behaving differently accordingly if it's not). So the latter is probably best to keep in mind and use for general-purpose cases.
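For concreteness, here is the same pattern with placeholder commands; the exact status ls returns for a missing path varies by platform, so treat the number as illustrative:
exec 4>&1
# ls /nonexistent and wc -l are placeholder commands standing in for command1 and command2
exitstatus=`{ { ls /nonexistent 3>&-; printf $? 1>&3; } 4>&- | wc -l 1>&4; } 3>&1`
exec 4>&-
echo "ls exited with $exitstatus"   # wc's line count went straight to the terminal via fd 4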
(command | tee out.txt; exit ${PIPESTATUS[0]})
Unlike @cODAR's answer, this returns the original exit code of the first command and not only 0 for success and 127 for failure. But as @Chaoran pointed out, you can just use ${PIPESTATUS[0]}. It is important, however, that everything is put inside the parentheses.
In Ubuntu and Debian, you can apt-get install moreutils. This contains a utility called mispipe that returns the exit status of the first command in the pipe.
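A hedged usage sketch (mispipe takes its two commands as shell strings and exits with the status of the first one):
mispipe "command" "tee out.txt"
ST=$?   # exit status of command, not of tee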
Outside of bash, you can do:
bash -o pipefail -c "command1 | tee output"
This is useful for example in ninja scripts where the shell is expected to be /bin/sh.
The simplest way to do this in plain bash is to use process substitution instead of a pipeline. There are several differences, but they probably don't matter very much for your use case:
When running a pipeline, bash waits until all processes complete.
Sending Ctrl-C to bash makes it kill all the processes of a pipeline, not just the main one.
The pipefail option and the PIPESTATUS variable are irrelevant to process substitution.
Possibly more
With process substitution, bash just starts the process and forgets about it, it's not even visible in jobs.
Mentioned differences aside, consumer < <(producer) and producer | consumer are essentially equivalent.
If you want to flip which one is the "main" process, you just flip the commands and the direction of the substitution to producer > >(consumer). In your case:
command > >(tee out.txt)
Example:
$ { echo "hello world"; false; } > >(tee out.txt)
hello world
$ echo $?
1
$ cat out.txt
hello world
$ echo "hello world" > >(tee out.txt)
hello world
$ echo $?
0
$ cat out.txt
hello world
As I said, there are differences from the pipe expression. The process may never stop running, unless it is sensitive to the pipe closing. In particular, it may keep writing things to your stdout, which may be confusing.
PIPESTATUS[@] must be copied to an array immediately after the pipe command returns.
Any reads of PIPESTATUS[@] will erase the contents.
Copy it to another array if you plan on checking the status of all pipe commands.
"$?" is the same value as the last element of "${PIPESTATUS[@]}",
and reading it seems to destroy "${PIPESTATUS[@]}", but I haven't absolutely verified this.
declare -a PSA
cmd1 | cmd2 | cmd3
PSA=( "${PIPESTATUS[#]}" )
This will not work if the pipe is in a sub-shell. For a solution to that problem,
see bash pipestatus in backticked command?
Based on @brian-s-wilson's answer, this bash helper function:
pipestatus() {
local S=("${PIPESTATUS[@]}")
if test -n "$*"
then test "$*" = "${S[*]}"
else ! [[ "${S[*]}" =~ [^0\ ] ]]
fi
}
used thus:
1: get_bad_things must succeed and should produce no output, but we do want to see any output it does produce
get_bad_things | grep '^'
pipestatus 0 1 || return
2: every stage of the pipeline must succeed
thing | something -q | thingy
pipestatus || return
Pure shell solution:
% rm -f error.flag; echo hello world \
| (cat || echo "First command failed: $?" >> error.flag) \
| (cat || echo "Second command failed: $?" >> error.flag) \
| (cat || echo "Third command failed: $?" >> error.flag) \
; test -s error.flag && (echo Some command failed: ; cat error.flag)
hello world
And now with the second cat replaced by false:
% rm -f error.flag; echo hello world \
| (cat || echo "First command failed: $?" >> error.flag) \
| (false || echo "Second command failed: $?" >> error.flag) \
| (cat || echo "Third command failed: $?" >> error.flag) \
; test -s error.flag && (echo Some command failed: ; cat error.flag)
Some command failed:
Second command failed: 1
First command failed: 141
Please note that the first cat fails as well, because its stdout gets closed on it. The order of the failed commands in the log is correct in this example, but don't rely on it.
This method allows for capturing stdout and stderr for the individual commands so you can then dump that as well into a log file if an error occurs, or just delete it if no error (like the output of dd).
It may sometimes be simpler and clearer to use an external command, rather than digging into the details of bash. pipeline, from the minimal process scripting language execline, exits with the return code of the second command*, just like a sh pipeline does, but unlike sh, it allows reversing the direction of the pipe, so that we can capture the return code of the producer process (the below is all on the sh command line, but with execline installed):
$ # using the full execline grammar with the execlineb parser:
$ execlineb -c 'pipeline { echo "hello world" } tee out.txt'
hello world
$ cat out.txt
hello world
$ # for these simple examples, one can forego the parser and just use "" as a separator
$ # traditional order
$ pipeline echo "hello world" "" tee out.txt
hello world
$ # "write" order (second command writes rather than reads)
$ pipeline -w tee out.txt "" echo "hello world"
hello world
$ # pipeline execs into the second command, so that's the RC we get
$ pipeline -w tee out.txt "" false; echo $?
1
$ pipeline -w tee out.txt "" true; echo $?
0
$ # output and exit status
$ pipeline -w tee out.txt "" sh -c "echo 'hello world'; exit 42"; echo "RC: $?"
hello world
RC: 42
$ cat out.txt
hello world
Using pipeline has the same differences to native bash pipelines as the bash process substitution used in answer #43972501.
* Actually pipeline doesn't exit at all unless there is an error. It executes into the second command, so it's the second command that does the returning.
Why not use stderr? Like so:
(
# Our long-running process that exits abnormally
( for i in {1..100} ; do echo ploop ; sleep 0.5 ; done ; exit 5 )
echo $? 1>&2 # We pass the exit status of our long-running process to stderr (fd 2).
) | tee ploop.out
So ploop.out receives the stdout. stderr receives the exit status of the long running process. This has the benefit of being completely POSIX-compatible.
(Well, with the exception of the range expression in the example long-running process, but that's not really relevant.)
Here's what this looks like:
...
ploop
ploop
ploop
ploop
ploop
ploop
ploop
ploop
ploop
ploop
5
Note that the return code 5 does not get output to the file ploop.out.

Can I create a "wildcard" bash alias that alters any command?

The inspiration here is a prank idea, so try to look past the fact that it's not really useful...
Let's say I wanted to set up an alias in bash that would subtly change any command entered at the prompt into the same command, but ultimately piped through tac to reverse the final output. A few examples of what I'd try to do:
ls ---> ls | tac
ls -la ---> ls -la | tac
tail ./foo | grep 'bar' ---> tail ./foo | grep 'bar' | tac
Is there a way to set up an alias, or some other means, that will append | tac to the end of each/every command entered without further intervention? Extra consideration given to ideas that are easy to hide in a bashrc. ;)
This isn't guaranteed to be side-effect-free, but it's probably a sane first cut:
reverse_command() {
# Check the number of entries in the `BASH_SOURCE` array to ensure that it's empty
# ...(meaning an interactive command).
if (( ${#BASH_SOURCE[@]} <= 1 )); then
# For an interactive command, take its text, tack on `| tac`, and evaluate
eval "${BASH_COMMAND} | tac"
# ...then return false to suppress the non-reversed version.
false
else
# for a noninteractive command, return true to run the original unmodified
true
fi
}
# turn on extended DEBUG hook behavior (necessary to suppress original commands).
shopt -s extdebug
# install our trap
trap reverse_command DEBUG
bash doesn't support modifying commands in this fashion. It does, however, let you redirect standard output for the shell itself, which every command will then inherit. Add this to .bashrc:
exec > >( tac )

Bash: Checking for exit status of multi-pipe command chain

I have a problem checking whether a certain command in a multi-pipe command chain did throw an error. Usually this is not hard to check but neither set -o pipefail nor checking ${PIPESTATUS[@]} works in my case. The setup is like this:
cmd="$snmpcmd $snmpargs $agent $oid | grep <grepoptions> for_stuff | cut -d',' f$fields | sed 's/ubstitute/some_other_stuff/g'"
Note-1: The command was tested thoroughly and works perfectly.
Now, I want to store the output of that command in an array called procdata. Thus, I did:
declare -a procdata
procdata=( $(eval $cmd) )
Note-2: eval is necessary because otherwise $snmpcmd throws up with an invalid option -- <grepoption> error which makes no sense because <grepoption> is not an $snmpcmd option obviously. At this stage I consider this a bug with $snmpcmd but that's another show...
If an error occurs, procdata will be empty. However, it might be empty for two different reasons: either because an error occurred while executing the $snmpcmd (e.g. timeout) or because grep couldn't find what it was looking for. The problem is, I need to be able to distinguish between these two cases and handle them separately.
Thus, set -o pipefail is not an option since it will propagate any error and I can't distinguish which part of the pipe failed. On the other hand echo ${PIPESTATUS[@]} is always 0 after procdata=( $(eval $cmd) ) even though I have many pipes!?. Yet if I execute the whole command directly at the prompt and call echo ${PIPESTATUS[@]} immediately after, it returns the exit status of all the pipes correctly.
I know I could redirect the error stream to stdout, but I would have to use heuristic methods to check whether the elements in procdata are valid data or error messages, and I run the risk of getting false positives. I could also pipe stdout to /dev/null, capture only the error stream, and check whether ${#procdata[@]} -eq 0. But then I'd have to repeat the call to get the actual data, and the whole command is time-costly (ca. 3-5s); I wouldn't want to call it twice. Or I could use a temporary file to write errors to, but I'd rather do it without the overhead of creating/deleting files.
Any ideas how I can make this work in bash?
Thanks
P.S.:
$ echo $BASH_VERSION
4.2.37(1)-release
A number of things here:
(1) When you say eval $cmd and attempt to get the exit values of the processes in the pipeline contained in the command $cmd, echo "${PIPESTATUS[@]}" would contain only the exit status for eval. Instead of eval, you'd need to supply the complete command line.
(2) You need to get the PIPESTATUS while assigning the output of the pipeline to the variable. Attempting to do that later wouldn't work.
As an example, you can say:
foo=$(command | grep something | command2; echo "${PIPESTATUS[@]}")
This captures the output of the pipeline and the PIPESTATUS array into the variable foo.
You could get the command output into an array by saying:
result=($(head -n -1 <<< "$foo"))
and the PIPESTATUS array by saying
tail -1 <<< "$foo"

piping output through sed but retain exit status [duplicate]

I pipe the output of a long-running build process through sed for syntax-highlighting, implemented as a wrapper around "mvn".
Further I have a "monitor" script that notifies me on the desktop when the build is finished. The monitor script checks the exit state of its argument and reports "Success" or "Failure".
By piping the maven output through sed, the exit status is always "ok", even when the build fails.
How can I pipe the correct exit status through sed as well?
Are there alternatives ?
Maybe the PIPESTATUS variable can help.
If you are using Bash, there's an option to use the set -o pipefail option, but since it's bash dependent, it's not portable, and won't work from a crontab, unless you wrap the whole thing in a bash env (bad solution).
http://bclary.com/blog/2006/07/20/pipefail-testing-pipeline-exit-codes/
This is a well known pain in the rear. If you are using bash (and probably many other modern sh variants), you can access the PIPESTATUS array to get the return value of a program earlier in the pipe. (Typically, the return value of the pipe is the return value of the last program in the pipe.) If you are using a shell that doesn't have PIPESTATUS (or if you want portability), you can do something like this:
#!/bin/sh
# run 'echo foo | false | sed s/f/t/', recording the status
# of false in RV
eval $( { { echo foo | false; printf RV=$? >&4; } |
sed s/f/t/ >&3; } 4>&1; ) 3>&1
echo RV=$RV
# run 'echo foo | cat | sed s/f/t/', recording the status
# of cat in RV
eval $( { { echo foo | cat; printf RV=$? >&4; } |
sed s/f/t/ >&3; } 4>&1; ) 3>&1
echo RV=$RV
In each case, RV will contain the return value of the false and the cat, respectively.
Bastil, because the pipe doesn't care about the exit status, you can only know whether sed exits sanely or not. I would enhance the sed script (or perhaps consider using a 3-liner Perl script) to exit with a failure status if the expected text isn't found, something like in pseudocode:
read($stdin)
if blank
exit(1) // output was blank, or on $stderr
else
regular expression substitution here
end
// natural exit success here
You could do it as a Perl one-liner, and the same can be done in a sed script (but not in a sed one-liner, as far as I know).
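A hedged sketch of that pseudocode using awk in place of sed (ERROR here is a stand-in for the real highlighting regex): it performs the substitution, and exits non-zero only if no input arrived at all, so the last command in the pipe carries a meaningful status for the monitor script.
# ERROR is a placeholder pattern; substitute the real highlighting regex
mvn blah | awk '{ gsub(/ERROR/, "**ERROR**"); print; n++ } END { exit (n ? 0 : 1) }'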
Perhaps you could use a named pipe? Here's an example:
FIFODIR=`mktemp -d`
FIFO=$FIFODIR/fifo
mkfifo $FIFO
cat $FIFO & # An arbitrary pipeline
if false > $FIFO
then
echo "Build succeeded"
else
echo "Build failed" # This line WILL execute
fi
rm -r $FIFODIR
A week later I got a solution:
Originally I wanted to do
monitor "mvn blah | sed -e SomeHighlightRegEx"
where monitor would react to the exit status of sed (instead of mvn).
It's easier to do
monitor "mvn blah" | sed -e SomeHiglightRegEx
Note that this pipes the output of monitor through sed, while the monitor script reacts on status of mvn.
Thanks anyway for the other ideas.

Passing multiple arguments to a UNIX shell script

I have the following (bash) shell script, that I would ideally use to kill multiple processes by name.
#!/bin/bash
kill `ps -A | grep $* | awk '{ print $1 }'`
However, while this script works if one argument is passed:
end chrome
(the name of the script is end)
it does not work if more than one argument is passed:
$end chrome firefox
grep: firefox: No such file or directory
What is going on here?
I thought the $* passes multiple arguments to the shell script in sequence. I'm not mistyping anything in my input - and the programs I want to kill (chrome and firefox) are open.
Any help is appreciated.
Remember what grep does with multiple arguments - the first is the word to search for, and the remainder are the files to scan.
Also remember that $*, "$*", and $@ all lose track of white space in arguments, whereas the magical "$@" notation does not.
So, to deal with your case, you're going to need to modify the way you invoke grep. You either need to use grep -F (aka fgrep) with options for each argument, or you need to use grep -E (aka egrep) with alternation. In part, it depends on whether you might have to deal with arguments that themselves contain pipe symbols.
It is surprisingly tricky to do this reliably with a single invocation of grep; you might well be best off tolerating the overhead of running the pipeline multiple times:
for process in "$#"
do
kill $(ps -A | grep -w "$process" | awk '{print $1}')
done
If the overhead of running ps multiple times like that is too painful (it hurts me to write it - but I've not measured the cost), then you probably do something like:
case $# in
(0) echo "Usage: $(basename $0 .sh) procname [...]" >&2; exit 1;;
(1) kill $(ps -A | grep -w "$1" | awk '{print $1}');;
(*) tmp=${TMPDIR:-/tmp}/end.$$
trap "rm -f $tmp.?; exit 1" 0 1 2 3 13 15
ps -A > $tmp.1
for process in "$#"
do
grep "$process" $tmp.1
done |
awk '{print $1}' |
sort -u |
xargs kill
rm -f $tmp.1
trap 0
;;
esac
The use of plain xargs is OK because it is dealing with a list of process IDs, and process IDs do not contain spaces or newlines. This keeps the simple code for the simple case; the complex case uses a temporary file to hold the output of ps and then scans it once per process name in the command line. The sort -u ensures that if some process happens to match all your keywords (for example, grep -E '(firefox|chrome)' would match both), only one signal is sent.
The trap lines etc ensure that the temporary file is cleaned up unless someone is excessively brutal to the command (the signals caught are HUP, INT, QUIT, PIPE and TERM, aka 1, 2, 3, 13 and 15; the zero catches the shell exiting for any reason). Any time a script creates a temporary file, you should have similar trapping around the use of the file so that it will be cleaned up if the process is terminated.
If you're feeling cautious and you have GNU Grep, you might add the -w option so that the names provided on the command line only match whole words.
All the above will work with almost any shell in the Bourne/Korn/POSIX/Bash family (you'd need to use backticks with strict Bourne shell in place of $(...), and the leading parenthesis on the conditions in the case are also not allowed with Bourne shell). However, you can use an array to get things handled right.
n=0
unset args # Force args to be an empty array (it could be an env var on entry)
for i in "$#"
do
args[$((n++))]="-e"
args[$((n++))]="$i"
done
kill $(ps -A | fgrep "${args[@]}" | awk '{print $1}')
This carefully preserves spacing in the arguments and uses exact matches for the process names. It avoids temporary files. The code shown doesn't validate for zero arguments; that would have to be done beforehand. Or you could add a line args[0]='/collywobbles/' or something similar to provide a default - non-existent - command to search for.
To answer your question, what's going on is that $* expands to a parameter list, and so the second and later words look like files to grep(1).
To process them in sequence, you have to do something like:
for i in $*; do
echo $i
done
Usually, "$#" (with the quotes) is used in place of $* in cases like this.
See man sh, and check out killall(1), pkill(1), and pgrep(1) as well.
Look into pkill(1) instead, or killall(1) as #khachik comments.
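For instance, with the procps pkill (flags differ slightly on BSD-style systems, so check your man page):
pkill chrome
pkill firefox
# or match several names in one call, since the pattern is an extended regex:
pkill 'chrome|firefox'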
$* should rarely be used. I would generally recommend "$@". Shell argument parsing is relatively complex and easy to get wrong. Usually the way you get it wrong is to end up having things evaluated that shouldn't be.
For example, if you typed this:
end '`rm foo`'
you would discover that if you had a file named 'foo' you don't anymore.
Here is a script that will do what you are asking to have done. It fails if any of the arguments contain '\n' or '\0' characters:
#!/bin/sh
kill $(ps -A | fgrep -e "$(for arg in "$@"; do echo "$arg"; done)" | awk '{ print $1; }')
I vastly prefer $(...) syntax for doing what backtick does. It's much clearer, and it's also less ambiguous when you nest things.
