send EXIT code as variable to next command in a pipeline [bash]

I have something like this:
while read line
do command1 $line | awk -v l="$line" '
...
... awk program here doing something ...
...
'
done < inputfile.txt
Now, command1 will have three possible EXIT code statuses (0, 1, 2), and depending
on which one occurs, the processing in the awk program will be different.
So the EXIT code drives the logic inside the awk program.
My problem is that I don't know how to pass that EXIT code from command1
to my awk program.
I thought maybe there is a way of passing that EXIT code as a variable, along with that
line, something like:
-v l="$line" -v c="$EXITCODE", but I did not find a way of doing it.
After exiting the while/done loop, I still need access to
${PIPESTATUS[0]} (the original exit status of command1), as I am doing some more stuff afterwards.
ADDITION (LOGIC CHANGED)
while read line
do
    stdout=$(command $line)
    case $? in
    0)
        # need to call awk program
        awk -v l="$line" -v stdout="$stdout" -f horsepower
        ;;
    1)
        # no need to call awk program here..
        ;;
    2)
        # no need to call awk program here..
        ;;
    *)
        # something unimportant
        ;;
    esac
done < inputfile.txt
So as you can see, I changed the logic a bit: I now do my EXIT status code logic outside of the awk
program (which I made a separate program in its own file called
horsepower), and I only need to call it in case 0), to process the stdout output generated by the previous command.
horsepower has a few lines like:
#! /bin/awk -f
/some regex/ { some action }
and based on what it finds, it acts appropriately. But how do I make it act on that stdout now?

Don't Use PIPESTATUS Array
You can use the Bash PIPESTATUS array to capture the exit status of the last foreground pipeline. The trick is knowing which array element you need. However, you can't capture the exit status of the current pipeline this way. For example:
$ false | echo "${PIPESTATUS[0]}"
0
The first time you run this, the exit status will be 0. However, subsequent runs will return 1, because they're showing the status of the previous command.
Use Separate Commands
You'd be much better off using $? to display the exit status of the previous command. However, that will preclude using a pipeline. You will need to split your commands, perhaps storing your standard output in a variable that you can then feed to the next command. You can also feed the exit status of the last command into awk in the same way.
Since you only posted pseudo-code, consider this example:
$ stdout=$(grep root /etc/passwd); \
awk -v status=$? \
-v stdout="$stdout" \
'BEGIN {printf "Status: %s\nVariable: %s\n", status, stdout}'
Status: 0
Variable: root:x:0:0:root,,,:/root:/bin/bash
Note the semicolon separating the commands. You wouldn't necessarily need the semicolon inside a script, but it allows you to cut and paste the example as a one-liner with separate commands.
The output of your command is stored in stdout, and the exit status is stored in $?. Both are declared as variables to awk, and available within your awk script.
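If the exit status still needs to drive different branches inside the awk program, as in the original question, it can be tested like any other awk variable. A minimal sketch along the same lines (grep's conventional statuses 0/1/2 stand in for command1's three exit codes):
stdout=$(grep root /etc/passwd)
awk -v status=$? -v stdout="$stdout" '
BEGIN {
    if (status == 0)      print "match found: " stdout
    else if (status == 1) print "no match"
    else                  print "grep reported an error"
}'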
Updated to Take Standard Input from Variable
The OP updated the original question, and now wants standard input to come from the data stored in the stdout variable that is storing the results of the command. This can be done with a here-string. The case statement in the updated question can be modified like so:
while read line; do
stdout=$(command $line)
case $? in
0) awk -v l="$line" -f horsepower <<< "$stdout" ;;
# other cases, as needed
esac
done
In this example, the stdout variable captures standard output from the command run against line, and then reuses that variable as standard input to the awk script named "horsepower." There are a number of ways this sort of command evaluation can break, but this will work for the OP's use case as currently posted.
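As a quick way to convince yourself that the here-string really does become the awk script's standard input, here is a self-contained sketch; the inline awk program stands in for the horsepower file, and df -h stands in for the OP's command:
line="/"
stdout=$(df -h "$line")
if [ $? -eq 0 ]; then
    awk -v l="$line" '/%/ { print "processing " l ": " $0 }' <<< "$stdout"
fi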

Related

How can I create timestamped logs and error handle in BASH at the same time?

I am writing a BASH script and two of the things I need it to do is:
Provide a timestamped log file.
Handle errors.
I am finding that these two objectives are clashing.
First of all, I am using the ts command to timestamp log entries, e.g. <a command/subscript> 2>&1 | ts '%H:%M:%S ' >> log. Note that I need all the lines output of any subscripts to be timestamped too. This works great... until I try to handle errors using exit codes.
Any command that fails (exits with a code of 1) is immediately followed with the ts command which executes successfully (exits with a code of 0). This means that I am unable to use the exit codes of the commands to handle errors with the $? environment variable because ts is always the last command to run and always has an exit code of 0.
Here is the case statement I am using:
<command> 2>&1 | ts '%H:%M:%S ' >> log
case $? in
    0)
        echo "Success"
        ;;
    *)
        echo "Failure"
esac
When a foreground pipeline returns, bash saves exit status values of its components to an array variable named PIPESTATUS. In this case, you can use ${PIPESTATUS[0]} (or just $PIPESTATUS; as you're interested in the first component) instead of $? to get <command>'s exit status value.
Proof of concept:
$ false | true | false | true
$ declare -p PIPESTATUS
declare -a PIPESTATUS=([0]="1" [1]="0" [2]="1" [3]="0")
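Applied to the logging example above, the case statement only needs to switch on the first pipeline component instead of $?; a minimal sketch, reusing the question's <command> placeholder:
<command> 2>&1 | ts '%H:%M:%S ' >> log
case ${PIPESTATUS[0]} in
    0)
        echo "Success"
        ;;
    *)
        echo "Failure"
esac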

Capture stdout to variable and get the exit statuses of foreground pipe

I want to execute a command (say ls) and sed its output, then save the stdout to a variable, like this,
OUT=$(ls | sed -n -e 's/regexp/replacement/p')
After this, if I try to access the $PIPESTATUS array, I get only 0 (which is the same as $?). So, how can I get both $PIPESTATUS as well as capture the entire piped command's stdout?
Note:
If I only execute those piped commands and don't capture the stdout (like ls | sed -n -e 's/regexp/replacement/p'), I get the expected exit statuses in $PIPESTATUS (like 0 0)
If I only execute a single command (without piping multiple commands) using Command Substitution and capture the stdout (like OUT=$(ls)), I get the expected single exit status in $PIPESTATUS (which is the same as $?)
P.S. I know I could run the command twice (first to capture the stdout, second to access $PIPESTATUS without using Command Substitution), but is there a way to get both in a single execution?
You can:
Use a temporary file to pass PIPESTATUS.
tmp=$(mktemp)
out=$(pipeline; echo "${PIPESTATUS[@]}" > "$tmp")
PIPESTATUS=($(<"$tmp")) # Note: PIPESTATUS is overwritten by the next command...
rm "$tmp"
Use a temporary file to pass out.
tmp=$(mktemp)
pipeline > "$tmp"
out=$(<"$tmp")
rm "$tmp"
Interleave the output with PIPESTATUS. For example, reserve the part from the last newline character to the end for PIPESTATUS. To preserve the original return status, I think some temporary variables are needed:
out=$(pipeline; tmp=("${PIPESTATUS[@]}") ret=$?; echo $'\n' "${tmp[@]}"; exit "$ret")
pipestatus=(${out##*$'\n'})
out="${out%$'\n'*}"
out="${out%%$'\n'}" # remove trailing newlines like command substitution does
tested with:
out=$(false | true | false | echo 123; echo $'\n' "${PIPESTATUS[@]}");
pipestatus=(${out##*$'\n'});
out="${out%$'\n'*}"; out="${out%%$'\n'}";
echo out="$out" PIPESTATUS="${pipestatus[@]}"
# out=123 PIPESTATUS=1 0 1 0
Notes:
Upper case variable names should, by convention, be reserved for exported variables.
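Applied to the ls | sed pipeline from the question, the first option might look like this (a sketch; copying into a differently named array sidesteps PIPESTATUS being overwritten by the next command):
tmp=$(mktemp)
OUT=$(ls | sed -n -e 's/regexp/replacement/p'; echo "${PIPESTATUS[@]}" > "$tmp")
pipestatus=($(<"$tmp"))
rm "$tmp"
echo "ls exited with ${pipestatus[0]}, sed exited with ${pipestatus[1]}"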

Interacting with awk while processing a pipe

Is there a way to make awk interactive when it is processing /dev/stdin via a pipe?
Imagine I have a program which continuously generates data. Example :
$ od -vAn -tu2 -w2 < /dev/urandom
2357
60431
19223
...
This data is being processed by a very advanced awk script by means of a pipe:
$ od -vAn -tu2 -w2 < /dev/urandom | awk '{print}'
Question: Is it possible to make this awk program interactive such that :
The program continuously prints output
When a single key is pressed (eg. z), it starts to output only 0 for each line it reads from the pipe.
When the key is pressed again, it continues to output the original data, obviously skipping the already processed records it printed as 0.
Problems:
/dev/stdin (also referenced as -) is already in use, so the keyboard interaction needs to be picked up with /dev/tty, or is there another way?
getline key < "/dev/tty" waits until RS is encountered, so in the default case you need to press two keys (z and Enter):
$ awk 'BEGIN{ getline key < "/dev/tty"; print key}'
This is acceptable, but I would prefer a single key-press.
So, is it possible to set RS locally such that getline reads a single character? This way we could locally modify RS and reset it after the getline. Another way might be using the shell builtin read, but it is incompatible between bash and zsh.
getline waits for input until the end of time, so it essentially stops the processing of the pipe. There is a gawk extension which allows you to set a timeout, but it is only available since gawk 4.2. So I believe this could potentially work:
awk '{print p ? 0 : $0 }
{ PROCINFO["/dev/tty", "READ_TIMEOUT"]=1;
while (getline key < "/dev/tty") p=key=="z"?!p:p
}
However, I do not have access to gawk 4.2 (update: this does not work)
Requests:
I would prefer a fully POSIX-compliant version, which is either entirely awk or uses POSIX-compliant system calls.
If this is not possible, gawk extensions available prior to 3.1.7 and shell-independent system calls can be used.
As a last resort, I would accept any shell-awk construct which would make this possible, under the single condition that the data is only read continuously by awk (so I'm thinking multiple pipes here).
After some searching, I came up with a Bash script that allows doing this. The idea is to inject a unique identifiable string into the pipe that awk is processing. Both the original program od and the bash script write to the pipe. In order not to mangle that data, I used stdbuf to run the program od line-buffered. Furthermore, since it is the bash-script that handles the key-press, both the original program and the awk script have to run in the background. Therefore a clean exit strategy needs to be in place. Awk will exit when the key q is pressed, while od will terminate automatically when awk is terminated.
In the end, it looks like this:
#!/usr/bin/env bash
# make a fifo which we use to inject the output of data-stream
# and the key press
mkfifo foo
# start the program in line-buffer mode, writing to FIFO
# and run it in the background
stdbuf -o L od -vAn -tu2 -w2 < /dev/urandom > foo &
# run the awk program that processes the identified key-press
# also run it in the background and insert a clear EXIT strategy
awk '/key/{if ($2=="q") exit; else p=!p}
!p{print}
p{print 0}' foo &
# handle the key pressing
# if a key is pressed inject the string "key <key>" into the FIFO
# use "q" to exit
while true; do
read -rsn1 key
echo "key $key" > foo
[[ $key == "q" ]] && exit
done
note: I ignored the requirement that the key has to be z
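If the toggle really must be restricted to z, the marker-handling rule only needs a small change; a sketch of the modified awk program (the added next keeps the injected "key ..." marker lines out of the output):
awk '/^key /{ if ($2=="q") exit; if ($2=="z") p=!p; next }
!p{print}
p{print 0}' foo &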
Some useful posts :
Unix/Linux pipe behavior when reading process terminates before writing process
shell script respond to keypress

Bash: Checking for exit status of multi-pipe command chain

I have a problem checking whether a certain command in a multi-pipe command chain did throw an error. Usually this is not hard to check but neither set -o pipefail nor checking ${PIPESTATUS[@]} works in my case. The setup is like this:
cmd="$snmpcmd $snmpargs $agent $oid | grep <grepoptions> for_stuff | cut -d',' -f$fields | sed 's/ubstitute/some_other_stuff/g'"
Note-1: The command was tested thoroughly and works perfectly.
Now, I want to store the output of that command in an array called procdata. Thus, I did:
declare -a procdata
procdata=( $(eval $cmd) )
Note-2: eval is necessary because otherwise $snmpcmd throws up with an invalid option -- <grepoption> error which makes no sense because <grepoption> is not an $snmpcmd option obviously. At this stage I consider this a bug with $snmpcmd but that's another show...
If an error occurs, procdata will be empty. However, it might be empty for two different reasons: either because an error occurred while executing the $snmpcmd (e.g. timeout) or because grep couldn't find what it was looking for. The problem is, I need to be able to distinguish between these two cases and handle them separately.
Thus, set -o pipefail is not an option since it will propagate any error and I can't distinguish which part of the pipe failed. On the other hand echo ${PIPESTATUS[@]} is always 0 after procdata=( $(eval $cmd) ) even though I have many pipes!? Yet if I execute the whole command directly at the prompt and call echo ${PIPESTATUS[@]} immediately after, it returns the exit status of all the pipes correctly.
I know I could bind the err stream to stdout but I would have to use heuristic methods to check whether the elements in procdata are valid or error messages, and I run the risk of getting false positives. I could also pipe stdout to /dev/null, capture only the error stream and check whether ${#procdata[@]} -eq 0. But I'd have to repeat the call to get the actual data, and the whole command is time costly (ca. 3-5s); I wouldn't want to call it twice. Or I could use a temporary file to write errors to, but I'd rather do it without the overhead of creating/deleting files.
Any ideas how I can make this work in bash?
Thanks
P.S.:
$ echo $BASH_VERSION
4.2.37(1)-release
A number of things here:
(1) When you say eval $cmd and attempt to get the exit values of the processes in the pipeline contained in the command $cmd, echo "${PIPESTATUS[@]}" would contain only the exit status for eval. Instead of eval, you'd need to supply the complete command line.
(2) You need to get the PIPESTATUS while assigning the output of the pipeline to the variable. Attempting to do that later wouldn't work.
As an example, you can say:
foo=$(command | grep something | command2; echo "${PIPESTATUS[@]}")
This captures the output of the pipeline and the PIPESTATUS array into the variable foo.
You could get the command output into an array by saying:
result=($(head -n -1 <<< "$foo"))
and the PIPESTATUS array by saying
tail -1 <<< "$foo"
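Putting those two points together for the snmp pipeline from the question, a sketch without eval might look like this (pattern, fields and sedexpr are placeholder variables standing in for the question's grep/cut/sed arguments):
# capture data plus a trailing line holding the per-stage exit statuses
foo=$($snmpcmd $snmpargs $agent $oid | grep "$pattern" | cut -d',' -f"$fields" \
      | sed "$sedexpr"; echo "${PIPESTATUS[@]}")
pipestatus=($(tail -n 1 <<< "$foo"))   # exit status of each pipeline stage
procdata=($(head -n -1 <<< "$foo"))    # everything except the trailing status line
if [ "${pipestatus[0]}" -ne 0 ]; then
    echo "snmp command failed (e.g. timeout)" >&2
elif [ "${pipestatus[1]}" -ne 0 ]; then
    echo "grep found no match" >&2
fi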

Passing multiple arguments to a UNIX shell script

I have the following (bash) shell script, that I would ideally use to kill multiple processes by name.
#!/bin/bash
kill `ps -A | grep $* | awk '{ print $1 }'`
However, while this script works if one argument is passed:
end chrome
(the name of the script is end)
it does not work if more than one argument is passed:
$ end chrome firefox
grep: firefox: No such file or directory
What is going on here?
I thought the $* passes multiple arguments to the shell script in sequence. I'm not mistyping anything in my input - and the programs I want to kill (chrome and firefox) are open.
Any help is appreciated.
Remember what grep does with multiple arguments - the first is the word to search for, and the remainder are the files to scan.
Also remember that $*, "$*", and $@ all lose track of white space in arguments, whereas the magical "$@" notation does not.
So, to deal with your case, you're going to need to modify the way you invoke grep. You either need to use grep -F (aka fgrep) with options for each argument, or you need to use grep -E (aka egrep) with alternation. In part, it depends on whether you might have to deal with arguments that themselves contain pipe symbols.
It is surprisingly tricky to do this reliably with a single invocation of grep; you might well be best off tolerating the overhead of running the pipeline multiple times:
for process in "$@"
do
kill $(ps -A | grep -w "$process" | awk '{print $1}')
done
If the overhead of running ps multiple times like that is too painful (it hurts me to write it - but I've not measured the cost), then you probably do something like:
case $# in
(0) echo "Usage: $(basename $0 .sh) procname [...]" >&2; exit 1;;
(1) kill $(ps -A | grep -w "$1" | awk '{print $1}');;
(*) tmp=${TMPDIR:-/tmp}/end.$$
    trap "rm -f $tmp.?; exit 1" 0 1 2 3 13 15
    ps -A > $tmp.1
    for process in "$@"
    do
        grep "$process" $tmp.1
    done |
    awk '{print $1}' |
    sort -u |
    xargs kill
    rm -f $tmp.1
    trap 0
    ;;
esac
The use of plain xargs is OK because it is dealing with a list of process IDs, and process IDs do not contain spaces or newlines. This keeps the simple code for the simple case; the complex case uses a temporary file to hold the output of ps and then scans it once per process name in the command line. The sort -u ensures that if some process happens to match all your keywords (for example, grep -E '(firefox|chrome)' would match both), only one signal is sent.
The trap lines etc ensure that the temporary file is cleaned up unless someone is excessively brutal to the command (the signals caught are HUP, INT, QUIT, PIPE and TERM, aka 1, 2, 3, 13 and 15; the zero catches the shell exiting for any reason). Any time a script creates a temporary file, you should have similar trapping around the use of the file so that it will be cleaned up if the process is terminated.
If you're feeling cautious and you have GNU Grep, you might add the -w option so that the names provided on the command line only match whole words.
All the above will work with almost any shell in the Bourne/Korn/POSIX/Bash family (you'd need to use backticks with strict Bourne shell in place of $(...), and the leading parenthesis on the conditions in the case are also not allowed with Bourne shell). However, in bash you can use an array to get things handled right.
n=0
unset args # Force args to be an empty array (it could be an env var on entry)
for i in "$@"
do
args[$((n++))]="-e"
args[$((n++))]="$i"
done
kill $(ps -A | fgrep "${args[@]}" | awk '{print $1}')
This carefully preserves spacing in the arguments and uses exact matches for the process names. It avoids temporary files. The code shown doesn't validate for zero arguments; that would have to be done beforehand. Or you could add a line args[0]='/collywobbles/' or something similar to provide a default - non-existent - command to search for.
To answer your question, what's going on is that $* expands to a parameter list, and so the second and later words look like files to grep(1).
To process them in sequence, you have to do something like:
for i in $*; do
echo $i
done
Usually, "$@" (with the quotes) is used in place of $* in cases like this.
See man sh, and check out killall(1), pkill(1), and pgrep(1) as well.
Look into pkill(1) instead, or killall(1) as @khachik comments.
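For completeness, with pkill the whole script collapses to a loop over the arguments; a minimal sketch (-x asks pkill for an exact process-name match):
#!/bin/bash
for name in "$@"
do
    pkill -x "$name"
done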
$* should be rarely used. I would generally recommend "$#". Shell argument parsing is relatively complex and easy to get wrong. Usually the way you get it wrong is to end up having things evaluated that shouldn't be.
For example, if you typed this:
end '`rm foo`'
you would discover that if you had a file named 'foo' you don't anymore.
Here is a script that will do what you are asking to have done. It fails if any of the arguments contain '\n' or '\0' characters:
#!/bin/sh
kill $(ps -A | fgrep -e "$(for arg in "$@"; do echo "$arg"; done)" | awk '{ print $1; }')
I vastly prefer $(...) syntax for doing what backtick does. It's much clearer, and it's also less ambiguous when you nest things.
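For instance, nesting reads naturally with $(...), while backticks would need escaping; a small sketch:
# with $(...) the nesting is plain to read:
scriptdir=$(dirname "$(readlink -f "$0")")
# the backtick equivalent needs escaped inner backticks:
# scriptdir=`dirname \`readlink -f "$0"\``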
