I want to break from my nested while loops.
Below is what I have tried in my code:
while [iterative condition]
do
...
...
cat test1.json | while read line
do
...
...
cat test2.json | while read line
do
...
...
if [ "$taskstatus" = "RUNNING" ]
then
# When the task status reaches RUNNING, I want to stop the script and end it.
break 3 # neither break 3 nor exit works for me
fi
done
done
done
Please suggest how I can achieve this.
Whenever you use a pipe, you create a subshell. Neither break nor exit works across subshell boundaries.
Even trap does not work across subshell boundaries, but the option set -E tells Bash to let subshells inherit the ERR trap of the parent shell. This lets you reserve a special exit code to implement the break logic.
The following code is divided into an outer shell and a subshell, which runs your code using pipes and so creates further subshells.
The error handler of the outer shell is not inherited by the subshells, because trap inheritance is disabled by default. The error handler of the outer shell just checks for the reserved exit code (42 in the example) and treats it as no error.
In the first subshell, error handler inheritance is enabled by set -E, so all subshells below it share the same error handler. That handler just passes the error code through and terminates its shell, which terminates all the subshells in turn.
#! /bin/bash
err()
{
local err=$?
if (( err == 42 )); then
exit
else
exit $err
fi
}
trap err ERR
(
set -E
err()
{
local err=$?
exit $err
}
trap err ERR
printf "%s\n" a b c | while read i; do
echo $i
printf "%s\n" x y z | while read j; do
echo $j
exit 42
done
done
)
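To see the boundary described above in isolation, here is a minimal sketch (the loop input and the variable name status are made up for illustration): exit inside a piped loop ends only the subshell running that loop, and the script carries on with the pipeline's exit status.

```shell
#!/usr/bin/env bash
# exit inside a piped loop leaves only the subshell that runs the loop.
printf '%s\n' a b c | while read -r item; do
    [ "$item" = b ] && exit 42   # ends the subshell, not the script
done
status=$?                        # exit status of the pipeline's last element
echo "still running, pipeline status: $status"
```

The script keeps going after the loop, which is why the answer above needs an ERR trap to act on the reserved code.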
As mentioned in the comments, the break statement does not work across a subshell created by a pipe. For the simple case where the pipe is just cat file | while, you can 'de-pipe' the expression with while read ... done <somefile (see Gordon Davisson's comment above).
For the more general case, where the pipeline can contain arbitrary commands (grep x file1 and grep y file2 in the examples below), it is possible to de-pipe the command (and allow break to work across multiple levels) using either (1) here strings or (2) process substitution.
With here strings: <<<"$(commands)"
while read x ; do
while read y ; do
echo "$x/$y" ;
[[ "$x" = d* ]] && echo BREAK && break 2
done <<< "$(grep y file2)";
done <<< "$(grep x file1)"
With process substitution < <(commands)
while read x ; do
while read y ; do
echo "$x/$y" ;
[[ "$x" = d* ]] && echo FOO && break 2 ;
done < <(grep y file2);
done < <(grep x file1)
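Applied to the shape of the original question, the process substitution form might look like the sketch below. The task names and status values fed in by printf are placeholders standing in for test1.json and test2.json, and found is a hypothetical variable added to show that assignments survive the loops.

```shell
#!/usr/bin/env bash
# Both loops run in the main shell, so break 2 leaves both of them
# and the assignment to found is still visible afterwards.
found=""
while read -r outer; do
    while read -r inner; do
        if [ "$inner" = "RUNNING" ]; then
            found="$outer/$inner"
            break 2
        fi
    done < <(printf '%s\n' PENDING RUNNING DONE)
done < <(printf '%s\n' task1 task2)
echo "stopped at: $found"
```

Because nothing here runs in a pipeline, exit in place of break 2 would also end the whole script.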
I have a bash script that produces some text from a pipe of commands. Based on a command line option I want to do some validation on the output. For a contrived example...
CHECK_OUTPUT=$1
...
check_output()
{
if [[ "$CHECK_OUTPUT" != "--check" ]]; then
# Don't check the output. Passthrough and return.
cat
return 0
fi
# Check each line exists in the fs root
while read line; do
if [[ ! -e "/$line" ]]; then
echo "Error: /$line does not exist"
return 1
fi
echo "$line"
done
return 0
}
ls /usr | grep '^b' | check_output
[EDIT] better example: https://stackoverflow.com/a/52539364/1888983
This is really useful, particularly if I have multiple functions that can become passthroughs. Yes, I could move the CHECK_OUTPUT conditional and create a pipe with or without check_output, but I'd need to write lines for each combination as more functions are added. If there are better ways to dynamically build a pipe I'd like to know.
The problem is the "useless use of cat". Can this be avoided, making check_output behave as if it weren't in the pipe at all?
Yes, you can do this -- by making your function a wrapper that conditionally injects a pipeline element, instead of being an unconditional pipeline element itself. For example:
maybe_checked() {
  if [[ $CHECK_OUTPUT != "--check" ]]; then
    "$@" # just run our arguments as a command, as if we weren't here
  else
    # run our arguments in a process substitution, reading from stdout of same.
    # ...some changes from the original code:
    #   IFS= stops leading or trailing whitespace from being stripped
    #   read -r prevents backslashes from being processed
    local line # avoid modifying $line outside our function
    while IFS= read -r line; do
      [[ -e "/$line" ]] || { echo "Error: /$line does not exist" >&2; return 1; }
      printf '%s\n' "$line" # see https://unix.stackexchange.com/questions/65803
    done < <("$@")
  fi
}
ls /usr | maybe_checked grep '^b'
Caveat of the above code: if the pipefail option is set, you'll want to check the exit status of the process substitution to have complete parity with the behavior that would otherwise be the case. In bash version 4.3 or later (IIRC), $! is modified by process substitutions to have the relevant PID, which can be waited for to retrieve exit status.
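A minimal sketch of that caveat, assuming a bash recent enough that wait accepts the PID of a process substitution (as far as I know that means 4.4 or later; treat the exact version as an assumption). The producer command and the name producer_status are made up for illustration.

```shell
#!/usr/bin/env bash
# The producer deliberately exits with status 3 after writing two lines.
while IFS= read -r line; do
    printf 'got: %s\n' "$line"
done < <(printf '%s\n' one two; exit 3)
# $! holds the PID of the process substitution; waiting on it
# retrieves the producer's exit status.
wait "$!"
producer_status=$?
echo "producer exited with $producer_status"
```

Inside a function like maybe_checked, that status could then be returned to mimic what pipefail would have reported.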
That said, this is also a use case wherein using cat is acceptable, and I'm saying this as a card-carrying member of the UUOC crowd. :)
Adopting the examples from John Kugelman's answers on the linked question:
maybe_sort() {
  if (( sort )); then
    "$@" | sort
  else
    "$@"
  fi
}
maybe_limit() {
  if [[ -n $limit ]]; then
    "$@" | head -n "$limit"
  else
    "$@"
  fi
}
printf '%s\n' "${haikus[@]}" | maybe_limit maybe_sort sed -e 's/^[ \t]*//'
In the following program, if I set the variable $foo to the value 1 inside the first if statement, it works in the sense that its value is remembered after the if statement. However, when I set the same variable to the value 2 inside an if which is inside a while statement, it's forgotten after the while loop. It's behaving like I'm using some sort of copy of the variable $foo inside the while loop and I am modifying only that particular copy. Here's a complete test program:
#!/bin/bash
set -e
set -u
foo=0
bar="hello"
if [[ "$bar" == "hello" ]]
then
foo=1
echo "Setting \$foo to 1: $foo"
fi
echo "Variable \$foo after if statement: $foo"
lines="first line\nsecond line\nthird line"
echo -e $lines | while read line
do
if [[ "$line" == "second line" ]]
then
foo=2
echo "Variable \$foo updated to $foo inside if inside while loop"
fi
echo "Value of \$foo in while loop body: $foo"
done
echo "Variable \$foo after while loop: $foo"
# Output:
# $ ./testbash.sh
# Setting $foo to 1: 1
# Variable $foo after if statement: 1
# Value of $foo in while loop body: 1
# Variable $foo updated to 2 inside if inside while loop
# Value of $foo in while loop body: 2
# Value of $foo in while loop body: 2
# Variable $foo after while loop: 1
# bash --version
# GNU bash, version 4.1.10(4)-release (i686-pc-cygwin)
echo -e $lines | while read line
...
done
The while loop is executed in a subshell, so any changes you make to the variable are not available once the subshell exits.
Instead you can use a here string to re-write the while loop to be in the main shell process; only echo -e $lines will run in a subshell:
while read line
do
if [[ "$line" == "second line" ]]
then
foo=2
echo "Variable \$foo updated to $foo inside if inside while loop"
fi
echo "Value of \$foo in while loop body: $foo"
done <<< "$(echo -e "$lines")"
You can get rid of the rather ugly echo in the here-string above by expanding the backslash sequences immediately when assigning lines. The $'...' form of quoting can be used there:
lines=$'first line\nsecond line\nthird line'
while read line; do
...
done <<< "$lines"
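Put together as a complete script, the here-string version could look like this sketch (same variable names as the question, with the loop body trimmed to the assignment):

```shell
#!/usr/bin/env bash
foo=1
lines=$'first line\nsecond line\nthird line'
# The loop reads from a here string, so it runs in the main shell.
while IFS= read -r line; do
    if [[ $line == "second line" ]]; then
        foo=2
    fi
done <<< "$lines"
echo "Variable \$foo after while loop: $foo"
```

Since no pipe is involved, the value assigned inside the loop is still there afterwards.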
UPDATED#2
Explanation is in Blue Moons's answer.
Alternative solutions:
Eliminate echo
while read line; do
...
done <<EOT
first line
second line
third line
EOT
Add the echo inside the here document
while read line; do
...
done <<EOT
$(echo -e $lines)
EOT
Run echo in background:
coproc echo -e $lines
while read -u ${COPROC[0]} line; do
...
done
Redirect to a file handle explicitly (Mind the space in < <!):
exec 3< <(echo -e $lines)
while read -u 3 line; do
...
done
Or just redirect to the stdin:
while read line; do
...
done < <(echo -e $lines)
And one for chepner (eliminating echo):
arr=("first line" "second line" "third line");
for((i=0;i<${#arr[*]};++i)) { line=${arr[i]};
...
}
The variable $lines can be converted to an array without starting a new subshell. The two-character sequence \ and n has to be converted to some single character (e.g. a real newline character), and the IFS (Internal Field Separator) variable is then used to split the string into array elements. This can be done like this:
lines="first line\nsecond line\nthird line"
echo "$lines"
OIFS="$IFS"
IFS=$'\n' arr=(${lines//\\n/$'\n'}) # Conversion
IFS="$OIFS"
echo "${arr[@]}", Length: ${#arr[*]}
set|grep ^arr
Result is
first line\nsecond line\nthird line
first line second line third line, Length: 3
arr=([0]="first line" [1]="second line" [2]="third line")
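As a possible variation that avoids touching IFS, the same conversion can be sketched with the mapfile builtin. This assumes bash 4 or later; nl and expanded are helper names introduced here for illustration.

```shell
#!/usr/bin/env bash
lines="first line\nsecond line\nthird line"
nl=$'\n'
# Replace every literal backslash-n pair with a real newline.
expanded=${lines//\\n/$nl}
# mapfile -t reads one array element per line and strips the newline.
mapfile -t arr <<< "$expanded"
echo "Length: ${#arr[@]}"
printf '%s\n' "${arr[@]}"
```

Unlike the unquoted array assignment, this does not perform glob expansion on the elements.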
You are asking this bash FAQ. The answer also describes the general case of variables set in subshells created by pipes:
E4) If I pipe the output of a command into read variable, why
doesn't the output show up in $variable when the read command finishes?
This has to do with the parent-child relationship between Unix
processes. It affects all commands run in pipelines, not just
simple calls to read. For example, piping a command's output
into a while loop that repeatedly calls read will result in
the same behavior.
Each element of a pipeline, even a builtin or shell function,
runs in a separate process, a child of the shell running the
pipeline. A subprocess cannot affect its parent's environment.
When the read command sets the variable to the input, that
variable is set only in the subshell, not the parent shell. When
the subshell exits, the value of the variable is lost.
Many pipelines that end with read variable can be converted
into command substitutions, which will capture the output of
a specified command. The output can then be assigned to a
variable:
grep ^gnu /usr/lib/news/active | wc -l | read ngroup
can be converted into
ngroup=$(grep ^gnu /usr/lib/news/active | wc -l)
This does not, unfortunately, work to split the text among
multiple variables, as read does when given multiple variable
arguments. If you need to do this, you can either use the
command substitution above to read the output into a variable
and chop up the variable using the bash pattern removal
expansion operators or use some variant of the following
approach.
Say /usr/local/bin/ipaddr is the following shell script:
#! /bin/sh
host `hostname` | awk '/address/ {print $NF}'
Instead of using
/usr/local/bin/ipaddr | read A B C D
to break the local machine's IP address into separate octets, use
OIFS="$IFS"
IFS=.
set -- $(/usr/local/bin/ipaddr)
IFS="$OIFS"
A="$1" B="$2" C="$3" D="$4"
Beware, however, that this will change the shell's positional
parameters. If you need them, you should save them before doing
this.
This is the general approach -- in most cases you will not need to
set $IFS to a different value.
Some other user-supplied alternatives include:
read A B C D << HERE
$(IFS=.; echo $(/usr/local/bin/ipaddr))
HERE
and, where process substitution is available,
read A B C D < <(IFS=.; echo $(/usr/local/bin/ipaddr))
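In bash, one more variant of the same idea is a here string with IFS set only for the read command. The sketch below uses a fixed documentation address in place of the output of /usr/local/bin/ipaddr; the variable ip is an assumption made for the example.

```shell
#!/usr/bin/env bash
ip="192.0.2.10"   # placeholder; in practice this would be command output
# IFS=. applies to this read only, and read runs in the current shell,
# so A, B, C and D remain set afterwards.
IFS=. read -r A B C D <<< "$ip"
echo "$A $B $C $D"
```

The positional parameters are left alone, which sidesteps the caveat about set -- mentioned in the FAQ text.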
Hmmm... I would almost swear that this worked for the original Bourne shell, but don't have access to a running copy just now to check.
There is, however, a very trivial workaround to the problem.
Change the first line of the script from:
#!/bin/bash
to
#!/bin/ksh
Et voila! A read at the end of a pipeline works just fine, assuming you have the Korn shell installed.
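If switching shells is not an option, bash has a similar behaviour behind the lastpipe option. The sketch below assumes bash 4.2 or later and a non-interactive script (the option only takes effect when job control is off); count is a made-up variable.

```shell
#!/usr/bin/env bash
# With lastpipe, the last element of a pipeline runs in the current shell.
shopt -s lastpipe
count=0
printf '%s\n' a b c | while read -r _; do
    count=$((count + 1))
done
echo "lines seen: $count"
```

Without the shopt line the same script would report 0, because the loop would run in a subshell.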
This is an interesting question that touches on a very basic concept in the Bourne shell: the subshell. Here I provide a solution that differs from the previous ones by doing some filtering on the loop's output. I will give an example that may be useful in real life. This is a fragment for checking that downloaded files conform to a known checksum. The checksum file looks like the following (showing just 3 lines):
49174 36326 dna_align_feature.txt.gz
54757 1 dna.txt.gz
55409 9971 exon_transcript.txt.gz
The shell script:
#!/bin/sh
.....
failcnt=0 # this variable is only valid in the parent shell
#variable xx captures all the outputs from the while loop
xx=$(cat ${checkfile} | while read -r line; do
num1=$(echo $line | awk '{print $1}')
num2=$(echo $line | awk '{print $2}')
fname=$(echo $line | awk '{print $3}')
if [ -f "$fname" ]; then
res=$(sum $fname)
filegood=$(sum $fname | awk -v na=$num1 -v nb=$num2 -v fn=$fname '{ if (na == $1 && nb == $2) { print "TRUE"; } else { print "FALSE"; }}')
if [ "$filegood" = "FALSE" ]; then
failcnt=$(expr $failcnt + 1) # only in subshell
echo "$fname BAD $failcnt"
fi
fi
done | tail -1) # I am only interested in the final result
# you can capture a whole bunch of texts and do further filtering
failcnt=${xx#* BAD } # I am only interested in the number
failcnt=${failcnt:-0} # if nothing failed, xx is empty, so default to 0
# this variable is in the parent shell
echo failcnt $failcnt
if [ $failcnt -gt 0 ]; then
echo $failcnt files failed
else
echo download successful
fi
The parent and subshell communicate through the echo command. You can pick some easy to parse text for the parent shell. This method does not break your normal way of thinking, just that you have to do some post processing. You can use grep, sed, awk, and more for doing so.
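Stripped down to the mechanism, the idea can be sketched as follows. The input words are invented for the example, and n is a counter that lives only inside the pipeline.

```shell
#!/usr/bin/env bash
# The loop counts in a subshell and prints the total once at the end;
# command substitution carries that text back to the parent shell.
failcnt=$(printf '%s\n' ok bad ok bad | {
    n=0
    while read -r status; do
        if [ "$status" = bad ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
})
echo "failcnt $failcnt"
```

Printing a single, easy-to-parse value at the end keeps the post-processing in the parent trivial.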
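I use a scratch file to store the value inside the loop and read it back outside. (Note that > 2 below writes to a file literally named 2 in the current directory, not to stderr; that would be >&2.)
Here var i is initially set to 1 and written out inside the loop.
# reading lines of content from 2 files concatenated
# inside loop: write value of var i to the file named 2 (before incrementing)
# outside: read var i back from that file; it holds the value from the last iteration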
f=/tmp/file1
g=/tmp/file2
i=1
cat $f $g | \
while read -r s;
do
echo $s > /dev/null; # some work
echo $i > 2
let i++
done;
read -r i < 2
echo $i
Or use the heredoc method to reduce the amount of code in a subshell.
Note the iterative i value can be read outside the while loop.
i=1
while read -r s;
do
echo $s > /dev/null
let i++
done <<EOT
$(cat $f $g)
EOT
let i--
echo $i
How about a very simple method
+call your while loop in a function
- set your value inside (nonsense, but shows the example)
- return your value inside
+capture your value outside
+set outside
+display outside
#!/bin/bash
# set -e
# set -u
# No idea why you need this, not using here
foo=0
bar="hello"
if [[ "$bar" == "hello" ]]
then
foo=1
echo "Setting \$foo to $foo"
fi
echo "Variable \$foo after if statement: $foo"
lines="first line\nsecond line\nthird line"
function my_while_loop
{
  echo -e $lines | while read line
  do
    if [[ "$line" == "second line" ]]
    then
      foo=2
      echo "Variable \$foo updated to $foo inside if inside while loop"
      # return inside the piped loop ends the loop's subshell with status 2,
      # which then becomes the function's exit status
      return 2
    fi
    # The code below is not executed for the matching line, because we have already returned
    # We already reported $foo being set to 2 anyway
    echo "Value of \$foo in while loop body: $foo"
  done
}
my_while_loop; foo="$?"
echo "Variable \$foo after while loop: $foo"
Output:
Setting $foo to 1
Variable $foo after if statement: 1
Value of $foo in while loop body: 1
Variable $foo updated to 2 inside if inside while loop
Variable $foo after while loop: 2
bash --version
GNU bash, version 3.2.51(1)-release (x86_64-apple-darwin13)
Copyright (C) 2007 Free Software Foundation, Inc.
Though this is an old question that has been asked several times, here is what I ended up doing after hours of fiddling with here strings: the only option that worked for me was to store the value in a file inside the while loop's subshell and then read it back afterwards. Simple.
Use an echo statement to store the value and a cat statement to retrieve it. The user running the script must have read and write permission on the directory.
#write to file
echo "1" > foo.txt
while condition; do
if (condition); then
#write again to file
echo "2" > foo.txt
fi
done
#read from file
echo "Value of \$foo in while loop body: $(cat foo.txt)"
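A runnable sketch of the same approach, using mktemp instead of a fixed foo.txt so that the script does not depend on the current directory being writable. The name state_file and the sample input lines are assumptions made for this example.

```shell
#!/usr/bin/env bash
state_file=$(mktemp) || exit 1
trap 'rm -f "$state_file"' EXIT     # clean up when the script ends
echo 1 > "$state_file"
printf '%s\n' "first line" "second line" | while read -r line; do
    if [ "$line" = "second line" ]; then
        echo 2 > "$state_file"      # written in the subshell, kept on disk
    fi
done
foo=$(cat "$state_file")
echo "Value of \$foo after the loop: $foo"
```

The file outlives the subshell, which is what makes the value visible to the parent.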
The code
In the example function a, I capture the input from a pipe as follows:
function a() {
if [ -t 1 ]; then
read test
echo "$test"
fi
if [[ -n "$1" ]]; then
echo "$1"
fi
}
called as follows:
echo "hey " | a "hello"
produces the output:
hey
hello
The issue
I was inspired by this answer, however a quote after the snippet has me concerned:
But there's no point - your variable assignments may not last! A pipeline may spawn a subshell, where the environment is inherited by value, not by reference. This is why read doesn't bother with input from a pipe - it's undefined.
I'm not sure I understand this - attempting to create subshells yielded the output I expected:
function a() {
(
if [ -t 1 ]; then
read test
echo "$test"
fi
if [[ -n "$1" ]]; then
echo "$1"
fi
)
}
And in the method call:
(echo "hey") | (a "hello")
still yields:
hey
hello
So what is meant by your variable assignments may not last! A pipeline may spawn a subshell, where the environment is inherited by value, not by reference.? Is there something that I've misunderstood?
The quoted note is incorrect. read doesn't care where its input comes from.
However, you must remember that the variable assigned to by the invocation of the read command is part of the (sub-)shell which executes the command.
By default, each command executed in a pipeline (a series of commands separated by |) is executed in a separate subshell. So after you execute echo foo | read foo, you will find that the value of $foo has not changed: not because read ignored its input but rather because the shell read executed in no longer exists.
Try this:
echo test | read myvar
echo $myvar
You might expect that it will print test, but it doesn't, it prints nothing. The reason is that bash will execute the read myvar in a subshell process. The variable will be read, but only in that subshell. So in the original shell the variable will never be set.
On the other hand, if you do this:
echo test | { read myvar; echo $myvar; }
or this
echo test | (read myvar; echo $myvar)
you will get the expected output. This is what happens with your code.
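The two cases can be put side by side in one script. This is a sketch that assumes bash defaults (the lastpipe option is not enabled); outside and inside are names introduced only to record the results.

```shell
#!/usr/bin/env bash
unset myvar
# read runs in a subshell here, so the assignment is lost.
echo test | read -r myvar
outside=${myvar:-unset}
# Here read and echo share one subshell, so echo sees the value.
inside=$(echo test | { read -r myvar; echo "$myvar"; })
echo "outside the pipeline: $outside"
echo "inside the pipeline:  $inside"
```

The variable is only ever set in the process that ran read, which is the whole point of the quoted warning.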
How can I optimize following Bash code?
if grep --quiet $pattern $fname; then
echo "==> "$fname" <=="
grep -n $pattern $fname
fi
First it scans the file for occurrences of $pattern. If any are found, it prints the file name and then all occurrences.
You can see that it runs the same grep twice. If I could store the results from the first call and then reuse them, it would be perfect.
An assignment won't change the value of $?, so you can add one without otherwise modifying your logic:
if content=$(grep -n "$pattern" "$fname"); then
echo "==> $fname <=="
printf '%s\n' "$content"
fi
Note, here, that all the variable expansions are inside double quotes. For some reason your original was explicitly performing them only unquoted -- this causes both string-splitting and glob expansion to take place; you almost certainly don't want either.
By the way -- there are things you could do that would make an assignment modify the exit status of a command run in the subshell generating its value! Using declare, export, local, or the like to perform an assignment will cause that command's own exit status to replace that of the subshell being assigned.
# here, the "local" will replace $? with 0
$ f() {
> local foo=$(echo "bar"; exit 1)
> echo "$?"
> }
$ f
0
...whereas...
# here, the "local" is separate, so the subshell's exit status survives
$ f() {
> local foo
> foo=$(echo "bar"; exit 1)
> echo "$?"
> }
$ f
1
In my bash script I use while read loop and a helper function fv():
fv() {
case "$1" in
out) echo $VAR
;;
* ) VAR="$VAR $1"
;;
esac
}
cat "$1" | while read line
do
...some processings...
fv some-str-value
done
echo "`fv out`"
in the hope that I can collect values from the while read loop in a variable accessible in the rest of the script.
But the above snippet is no good, as I get no output.
Is there an easy way to solve this (output a string from this loop into a variable that is accessible in the rest of the script) without restructuring my script?
As no one has explained to you why your code didn't work, I will.
When you use cat "$1" |, you are making that loop execute in a subshell. The VAR variable used in that subshell starts as a copy of VAR from the main script, and any changes to it are limited to that copy (the subshell's scope), they don't affect the script's original VAR. By removing the useless use of cat, you remove the pipeline and so the loop is executed in the main shell, so it can (and does) alter the correct copy of VAR.
Replace your while loop with while read line ; do ... ; done < "$1":
#!/bin/bash
function fv
{
case "$1" in
out) echo $VAR
;;
* ) VAR="$VAR $1"
;;
esac
}
while read line
do
fv "$line\n"
done < "$1"
echo "$(fv out)"
Stop piping to read.
done < "$1"