Let's consider these two simple pieces of bash code.
Code A (without eval mgrep):
while read line; do
echo $line
done <<EOF
1234567
2345678
3456789
EOF
Code B (with eval mgrep):
while read line; do
echo $line
eval "mgrep $line"
done <<EOF
1234567
2345678
3456789
EOF
After running Code A, I get the expected output, i.e.:
1234567
2345678
3456789
While running Code B will produce something like:
1234567
[...mgrep results here...]
So it means, the while loop is only executed for the first input line and not for the remaining ones. What am I doing wrong?
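A likely cause, assuming mgrep reads from standard input (the question does not show what mgrep actually does): the loop body shares the here-document with read, so the first command that reads stdin swallows the remaining lines. A minimal sketch, where the function greedy is a hypothetical stand-in for mgrep:

```bash
# Hypothetical stand-in for mgrep: any command that reads standard input.
greedy() { cat >/dev/null; }

count_broken=0
while read -r line; do
    count_broken=$((count_broken + 1))
    greedy "$line"              # consumes the rest of the here-document
done <<EOF
1234567
2345678
3456789
EOF

count_fixed=0
while read -r line; do
    count_fixed=$((count_fixed + 1))
    greedy "$line" </dev/null   # give the command its own, empty stdin
done <<EOF
1234567
2345678
3456789
EOF

echo "broken: $count_broken iteration(s), fixed: $count_fixed iteration(s)"
# prints: broken: 1 iteration(s), fixed: 3 iteration(s)
```

If that is what happens with mgrep, redirecting its stdin from /dev/null (or feeding the loop through a separate file descriptor) should let the loop see every line.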
Related
In the following program, if I set the variable $foo to the value 1 inside the first if statement, it works in the sense that its value is remembered after the if statement. However, when I set the same variable to the value 2 inside an if which is inside a while statement, it's forgotten after the while loop. It's behaving like I'm using some sort of copy of the variable $foo inside the while loop and I am modifying only that particular copy. Here's a complete test program:
#!/bin/bash
set -e
set -u
foo=0
bar="hello"
if [[ "$bar" == "hello" ]]
then
foo=1
echo "Setting \$foo to 1: $foo"
fi
echo "Variable \$foo after if statement: $foo"
lines="first line\nsecond line\nthird line"
echo -e $lines | while read line
do
if [[ "$line" == "second line" ]]
then
foo=2
echo "Variable \$foo updated to $foo inside if inside while loop"
fi
echo "Value of \$foo in while loop body: $foo"
done
echo "Variable \$foo after while loop: $foo"
# Output:
# $ ./testbash.sh
# Setting $foo to 1: 1
# Variable $foo after if statement: 1
# Value of $foo in while loop body: 1
# Variable $foo updated to 2 inside if inside while loop
# Value of $foo in while loop body: 2
# Value of $foo in while loop body: 2
# Variable $foo after while loop: 1
# bash --version
# GNU bash, version 4.1.10(4)-release (i686-pc-cygwin)
echo -e $lines | while read line
...
done
The while loop is executed in a subshell, so any changes you make to the variable are lost once the subshell exits.
Instead you can use a here string to re-write the while loop to be in the main shell process; only echo -e $lines will run in a subshell:
while read line
do
if [[ "$line" == "second line" ]]
then
foo=2
echo "Variable \$foo updated to $foo inside if inside while loop"
fi
echo "Value of \$foo in while loop body: $foo"
done <<< "$(echo -e "$lines")"
You can get rid of the rather ugly echo in the here-string above by expanding the backslash sequences immediately when assigning lines. The $'...' form of quoting can be used there:
lines=$'first line\nsecond line\nthird line'
while read line; do
...
done <<< "$lines"
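If keeping the pipe is important, another option (not part of the answer above) is bash's lastpipe shell option, which runs the last element of a pipeline in the current shell. This is only a sketch: it assumes bash 4.2 or newer and a non-interactive script, because lastpipe only takes effect when job control is off.

```bash
# Assumption: bash >= 4.2, run as a script (job control disabled).
shopt -s lastpipe

foo=1
printf 'first line\nsecond line\nthird line\n' | while read -r line; do
    if [[ $line == "second line" ]]; then
        foo=2                   # set in the current shell, not a subshell
    fi
done
echo "Variable \$foo after while loop: $foo"
```

With lastpipe in effect, the final echo should report 2 rather than 1.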
Explanation is in Blue Moons's answer.
Alternative solutions:
Eliminate echo
while read line; do
...
done <<EOT
first line
second line
third line
EOT
Add the echo inside the here-document
while read line; do
...
done <<EOT
$(echo -e $lines)
EOT
Run echo in background:
coproc echo -e $lines
while read -u ${COPROC[0]} line; do
...
done
Redirect to a file handle explicitly (Mind the space in < <!):
exec 3< <(echo -e $lines)
while read -u 3 line; do
...
done
Or just redirect to the stdin:
while read line; do
...
done < <(echo -e $lines)
And one for chepner (eliminating echo):
arr=("first line" "second line" "third line");
for((i=0;i<${#arr[*]};++i)) { line=${arr[i]};
...
}
Variable $lines can be converted to an array without starting a new sub-shell. The characters \ and n have to be converted to some character (e.g. a real newline character), and the IFS (Internal Field Separator) variable is then used to split the string into array elements. This can be done like:
lines="first line\nsecond line\nthird line"
echo "$lines"
OIFS="$IFS"
IFS=$'\n' arr=(${lines//\\n/$'\n'}) # Conversion
IFS="$OIFS"
echo "${arr[@]}", Length: ${#arr[*]}
set|grep ^arr
Result is
first line\nsecond line\nthird line
first line second line third line, Length: 3
arr=([0]="first line" [1]="second line" [2]="third line")
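The same conversion can probably be written without touching IFS at all by letting mapfile do the splitting. A minimal sketch, assuming bash 4 or newer (mapfile is not available in older versions); the helper variable nl is introduced here only for readability:

```bash
lines="first line\nsecond line\nthird line"
nl=$'\n'                                # a real newline character

# Replace each literal backslash-n pair with a newline, then split on
# newlines; -t strips the trailing newline from every element.
mapfile -t arr <<< "${lines//\\n/$nl}"

echo "Length: ${#arr[@]}"               # prints: Length: 3
echo "Second element: ${arr[1]}"        # prints: Second element: second line
```

Because IFS is never modified, there is nothing to save and restore afterwards.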
You are asking this bash FAQ. The answer also describes the general case of variables set in subshells created by pipes:
E4) If I pipe the output of a command into read variable, why
doesn't the output show up in $variable when the read command finishes?
This has to do with the parent-child relationship between Unix
processes. It affects all commands run in pipelines, not just
simple calls to read. For example, piping a command's output
into a while loop that repeatedly calls read will result in
the same behavior.
Each element of a pipeline, even a builtin or shell function,
runs in a separate process, a child of the shell running the
pipeline. A subprocess cannot affect its parent's environment.
When the read command sets the variable to the input, that
variable is set only in the subshell, not the parent shell. When
the subshell exits, the value of the variable is lost.
Many pipelines that end with read variable can be converted
into command substitutions, which will capture the output of
a specified command. The output can then be assigned to a
variable:
grep ^gnu /usr/lib/news/active | wc -l | read ngroup
can be converted into
ngroup=$(grep ^gnu /usr/lib/news/active | wc -l)
This does not, unfortunately, work to split the text among
multiple variables, as read does when given multiple variable
arguments. If you need to do this, you can either use the
command substitution above to read the output into a variable
and chop up the variable using the bash pattern removal
expansion operators or use some variant of the following
approach.
Say /usr/local/bin/ipaddr is the following shell script:
#! /bin/sh
host `hostname` | awk '/address/ {print $NF}'
Instead of using
/usr/local/bin/ipaddr | read A B C D
to break the local machine's IP address into separate octets, use
OIFS="$IFS"
IFS=.
set -- $(/usr/local/bin/ipaddr)
IFS="$OIFS"
A="$1" B="$2" C="$3" D="$4"
Beware, however, that this will change the shell's positional
parameters. If you need them, you should save them before doing
this.
This is the general approach -- in most cases you will not need to
set $IFS to a different value.
Some other user-supplied alternatives include:
read A B C D << HERE
$(IFS=.; echo $(/usr/local/bin/ipaddr))
HERE
and, where process substitution is available,
read A B C D < <(IFS=.; echo $(/usr/local/bin/ipaddr))
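The FAQ mentions chopping up a variable with the pattern removal expansion operators but does not show it. A minimal sketch of that approach, using a made-up documentation address in place of the real output of /usr/local/bin/ipaddr:

```bash
addr=192.0.2.10          # hypothetical stand-in for $(/usr/local/bin/ipaddr)

A=${addr%%.*}            # keep everything before the first dot
rest=${addr#*.}          # drop the first octet and its dot
B=${rest%%.*}
rest=${rest#*.}
C=${rest%%.*}
D=${rest#*.}             # whatever remains after the third dot

echo "$A $B $C $D"       # prints: 192 0 2 10
```

Unlike the set -- variant, this leaves both IFS and the positional parameters alone.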
Hmmm... I would almost swear that this worked for the original Bourne shell, but don't have access to a running copy just now to check.
There is, however, a very trivial workaround to the problem.
Change the first line of the script from:
#!/bin/bash
to
#!/bin/ksh
Et voila! A read at the end of a pipeline works just fine, assuming you have the Korn shell installed.
This is an interesting question that touches on a very basic concept in the Bourne shell: the subshell. Here I provide a solution that differs from the previous ones by filtering the loop's output. I will give an example that may be useful in real life: a fragment for checking that downloaded files conform to a known checksum. The checksum file looks like the following (showing just 3 lines):
49174 36326 dna_align_feature.txt.gz
54757 1 dna.txt.gz
55409 9971 exon_transcript.txt.gz
The shell script:
#!/bin/sh
.....
failcnt=0 # this variable is only valid in the parent shell
#variable xx captures all the outputs from the while loop
xx=$(cat ${checkfile} | while read -r line; do
num1=$(echo $line | awk '{print $1}')
num2=$(echo $line | awk '{print $2}')
fname=$(echo $line | awk '{print $3}')
if [ -f "$fname" ]; then
res=$(sum $fname)
filegood=$(sum $fname | awk -v na=$num1 -v nb=$num2 -v fn=$fname '{ if (na == $1 && nb == $2) { print "TRUE"; } else { print "FALSE"; }}')
if [ "$filegood" = "FALSE" ]; then
failcnt=$(expr $failcnt + 1) # only in subshell
echo "$fname BAD $failcnt"
fi
fi
done | tail -1) # I am only interested in the final result
# you can capture a whole bunch of texts and do further filtering
failcnt=${xx#* BAD } # I am only interested in the number
# this variable is in the parent shell
echo failcnt $failcnt
if [ "${failcnt:-0}" -gt 0 ]; then   # failcnt is empty when no file was BAD
echo $failcnt files failed
else
echo download successful
fi
The parent and subshell communicate through the echo command. You can pick some easy to parse text for the parent shell. This method does not break your normal way of thinking, just that you have to do some post processing. You can use grep, sed, awk, and more for doing so.
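The same idea in a stripped-down form, as a sketch: the file names and statuses below are invented sample data, and the braces keep the loop and the final echo in one subshell so the counter is still visible when it is printed.

```bash
failcnt=$(
    printf '%s\n' 'a.gz OK' 'b.gz BAD' 'c.gz OK' 'd.gz BAD' | {
        n=0
        while read -r fname status; do
            if [ "$status" = BAD ]; then
                n=$((n + 1))
            fi
        done
        echo "$n"        # the only line the parent shell captures
    }
)
echo "failcnt $failcnt"  # prints: failcnt 2
```

Printing the value once, after the loop, avoids the tail -1 and the string trimming, at the cost of not seeing per-file messages.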
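I use a scratch file to store a value inside the loop and read it back outside. (Note that "> 2" below writes to a file named 2 in the current directory; it is not stderr, which would be ">&2".)
Here var i is initially set to 1 and updated inside the loop.
# reading lines of content from 2 files concatenated
# inside loop: write value of var i to the file named 2 (before incrementing)
# outside: read var i back from that file; it holds the value from the last iteration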
f=/tmp/file1
g=/tmp/file2
i=1
cat $f $g | \
while read -r s;
do
echo $s > /dev/null; # some work
echo $i > 2
let i++
done;
read -r i < 2
echo $i
Or use the heredoc method to reduce the amount of code in a subshell.
Note the iterative i value can be read outside the while loop.
i=1
while read -r s;
do
echo $s > /dev/null
let i++
done <<EOT
$(cat $f $g)
EOT
let i--
echo $i
How about a very simple method
+call your while loop in a function
- set your value inside (nonsense, but shows the example)
- return your value inside
+capture your value outside
+set outside
+display outside
#!/bin/bash
# set -e
# set -u
# No idea why you need this, not using here
foo=0
bar="hello"
if [[ "$bar" == "hello" ]]
then
foo=1
echo "Setting \$foo to $foo"
fi
echo "Variable \$foo after if statement: $foo"
lines="first line\nsecond line\nthird line"
function my_while_loop
{
echo -e $lines | while read line
do
if [[ "$line" == "second line" ]]
then
foo=2;
echo "Variable \$foo updated to $foo inside if inside while loop"
return 2;
fi
# Code below won't be executed since we returned from function in 'if' statement
# We already reported the $foo var being set to 2 anyway
echo "Value of \$foo in while loop body: $foo"
done
}
my_while_loop; foo="$?"
echo "Variable \$foo after while loop: $foo"
Output:
Setting $foo to 1
Variable $foo after if statement: 1
Value of $foo in while loop body: 1
Variable $foo updated to 2 inside if inside while loop
Variable $foo after while loop: 2
bash --version
GNU bash, version 3.2.51(1)-release (x86_64-apple-darwin13)
Copyright (C) 2007 Free Software Foundation, Inc.
Though this is an old question and has been asked several times, here is what I ended up doing after hours of fiddling with here-strings: the only option that worked for me was to store the value in a file inside the while loop subshell and retrieve it afterwards. Simple.
Use echo to store the value and cat to retrieve it. The user running the script needs read and write permission on the directory.
#write to file
echo "1" > foo.txt
while condition; do
if (condition); then
#write again to file
echo "2" > foo.txt
fi
done
#read from file
echo "Value of \$foo in while loop body: $(cat foo.txt)"
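A slightly more defensive sketch of the same file-based approach, assuming mktemp is available (it is not required by POSIX but is present on most systems). The input lines are borrowed from the original question:

```bash
tmp=$(mktemp) || exit 1            # private scratch file instead of foo.txt
echo 1 > "$tmp"

printf 'first line\nsecond line\nthird line\n' | while read -r line; do
    if [ "$line" = "second line" ]; then
        echo 2 > "$tmp"            # survives the subshell, unlike a variable
    fi
done

foo=$(cat "$tmp")                  # read the value back in the parent shell
rm -f "$tmp"
echo "Variable \$foo after while loop: $foo"
```

Using mktemp avoids clobbering an existing foo.txt and sidesteps most permission problems in the current directory.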
In bash (works on v4 at least), the following commands let you assign multiple variables from a string:
IFS=',' read a b <<< "s1,s2"
IFS=',' read a b < <(echo "s1,s2") # an equivalent
After one of these commands :
$ echo $a
s1
$ echo $b
s2
But the provided commands are not POSIX-compliant; if run in sh (dash):
sh: 1: Syntax error: redirection unexpected
What is a POSIX-compliant alternative to those commands? I tried :
IFS=',' echo "s1,s2" | read a b
The command succeeds (return code 0), but echo $a and echo $b then prints nothing.
a and b are set, but because of the pipeline, the read command runs in a subshell. When the subshell exits, the variables disappear.
You can read from a here-doc
IFS=, read a b <<END
s1,s2
END
To replace any arbitrary pipeline (or process substitution), you can capture the pipeline's output and put that variable in a heredoc:
output=$( seq 10 20 | paste -sd, )
IFS=, read a b <<END
$output
END
echo "$a"
echo "$b"
outputs
10
11,12,13,14,15,16,17,18,19,20
In addition to glenn jackman's answer, there is also the option of an explicit named pipe (which is, essentially, what a process substitution replaces):
mkfifo p
echo "s1,s2" > p &
IFS=, read a b < p
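A sketch of the named-pipe variant with cleanup added, assuming mktemp -d is available to provide a private directory for the FIFO:

```bash
dir=$(mktemp -d) || exit 1     # private directory so the FIFO name cannot clash
fifo=$dir/p
mkfifo "$fifo"

echo "s1,s2" > "$fifo" &       # the writer blocks until a reader opens the FIFO
IFS=, read -r a b < "$fifo"    # runs in the current shell, so a and b persist
wait                           # reap the background writer

rm -f "$fifo"
rmdir "$dir"
echo "$a $b"                   # prints: s1 s2
```

The background writer and the reader rendezvous on the FIFO, so no pipeline (and therefore no subshell around read) is involved.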
I have a file /tmp/a.txt whose contents I want to read into a variable many times. If EOF is reached, reading should start again from the beginning.
i.e. If the contents of the file is "abc" and I want to get 10 chars, it should be "abcabcabca".
For this I wrote an obvious script:
while [ 1 ];
do cat /tmp/a.txt;
done |
for i in {1..3};
do read -N 10 A;
echo "For $i: $A";
done
The only problem is that it hangs! I have no idea why it does so!
I am also open to other solutions in bash.
To repeat a line over and over, you can use yes:
yes "abc" | for i in {1..3}; do read -N 10 A; echo "for $i: $A"; done
yes will output 'forever', but the for i in {1..3} loop will only execute the "do ... done;" part 3 times.
yes adds a "\n" after the string. If you don't want it, do:
yes "abc" | tr -d '\n' | for i in {1..3}; do read -N 10 A; echo "for $i: $A"; done
In all of the above, note that because the read comes after a pipe, bash runs it in a subshell, so "$A" is only available inside the "do ... done;" block and is lost afterwards!
To loop and read from a file without doing that in a subshell:
for i in {1..3}; do read -N 10 A ; echo "for $i: $A"; done < <(cat /the/file)
To be sure there is enough data in /the/file, repeat at will:
for i in {1..3}; do read -N 10 A ; echo "for $i: $A"; done < <(cat /the/file /the/file /the/file)
To test the last one: echo -n "abc" > /the/file (-n, so there is no trailing newline)
The script hangs because of the first loop. After the three iterations of the second loop (for) are done, the first loop repeatedly starts new cat instances which read the file and then write the content abc to the pipe. The write to the pipe doesn't work any more in the later iterations. Yes, there is a SIGPIPE kill, but to the cat command and not to the loop itself. So the solution is to catch the error in the right place:
while [ 1 ];
do cat /tmp/a.txt || break
done |
for i in {1..3};
do read -N 10 A;
echo "For $i: $A";
done
Incidentally, the output is as follows:
For 1: abcabcabca
For 2: bcabcabcab
For 3: cabcabcabc
<-- (Here the shell hangs no more)
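Since the question was open to other solutions: one way to avoid the endless producer altogether is to build a buffer that is long enough and slice it with bash substring expansion. This is a sketch only; the literal abc stands in for the contents of /tmp/a.txt, and chunk and rounds are names introduced here for illustration. It assumes the content is not empty, otherwise the first loop would never finish.

```bash
content=abc                      # stands in for: content=$(cat /tmp/a.txt)
chunk=10                         # characters wanted per read
rounds=3                         # number of reads

buf=
while [ "${#buf}" -lt $((chunk * rounds)) ]; do
    buf=$buf$content             # repeat the content until there is enough
done

for ((i = 1; i <= rounds; i++)); do
    off=$(( (i - 1) * chunk ))
    A=${buf:off:chunk}           # bash substring expansion: offset, length
    echo "For $i: $A"
done
# prints:
# For 1: abcabcabca
# For 2: bcabcabcab
# For 3: cabcabcabc
```

No pipe is involved, so nothing can block on a full pipe and $A remains set in the current shell after the loop.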
I need to read some configuration data into environment variables in a bash script.
The "obvious" (but incorrect) pattern is:
egrep "pattern" config-file.cfg | read VAR1 VAR2 VAR3 etc...
This fails because the read is run in a subshell and therefore cannot set variables in the invoking shell. So I came up with this as an alternative
coproc egrep "pattern" config-file.cfg
read -u ${COPROC[0]} VAR1 VAR2 VAR3 etc...
which works fine.
To test what happens if the coprocess returns more than one line, I tried this:
coproc cat config-file.cfg
read -u ${COPROC[0]} VAR1 VAR2 VAR3 etc...
where config-file.cfg contains three lines.
$ cat config-file.cfg
LINE1 A1 B1 C1
LINE2 A2 B2 C2
LINE3 A3 B3 C3
I expected this to process the first line in the file, followed by some kind of "broken pipe" error message. While it did process the first line, there was no error message and no coprocess was left running.
So I then tried the following in a script:
$ cat test.sh
coproc cat config-file.cfg
read -u ${COPROC[0]} VAR1 VAR2 VAR3 VAR4
echo $VAR1 $VAR2 $VAR3 $VAR4
wait
echo $?
Running it:
$ bash -x test.sh
+ read -u 63 VAR1 VAR2 VAR3 VAR4
+ cat config-file.cfg
LINE1 A1 B1 C1
+ wait
+ echo 0
0
Where did the remaining two lines go? I would have expected either "broken pipe", or the wait to hang since there was nothing to read the remaining lines, but as you can see the return code was zero.
As per comments above, you can use process substitution to achieve just that. This way, read is not run in a subshell and the captured vars will be available within the current shell.
read VAR1 VAR2 VAR3 < <(egrep "pattern" config-file.cfg)
"If the <(list) form is used, the file passed as an argument should be read to obtain the output of list" -- what "file passed as an argument" are they talking about?
That is rather cryptic to me too. The chapter on process substitution in Advanced Bash-scripting Guide has a more comprehensive explanation.
The way I see it, when the <(cmd) syntax is used, the output of cmd is made available via a named pipe (or temp file) and the syntax is replaced by the filename of the pipe/file. So for the example above, it would end up being equivalent to:
read VAR1 VAR2 VAR3 < /dev/fd/63
where /dev/fd/63 is the named pipe connected to the stdout of cmd.
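One way to see the substituted file name for yourself is to echo the process substitution instead of reading from it. A small sketch, assuming bash; printf stands in for the egrep call, and the exact path printed depends on the system:

```bash
path=$(echo <(true))     # the shell replaces <(true) with a file name
echo "$path"             # e.g. /dev/fd/63; the exact name varies by system

# printf is a stand-in for: egrep "pattern" config-file.cfg
read -r VAR1 VAR2 VAR3 < <(printf 'LINE1 A1 B1\n')
echo "$VAR1 $VAR2 $VAR3" # prints: LINE1 A1 B1
```

The second command shows that read runs in the current shell: the variables are still set on the following line.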
If I understand your question correctly (and I hope I'm not stating the obvious),
read reads one line at a time, as in:
$ read a b c < config-file.cfg && echo $?
0
or:
$ printf '%s\n%s\n' one two | { read; echo "$REPLY";}
one
$ echo ${PIPESTATUS[@]}
0 0
To read all the input you'll need a loop:
$ coproc cat config-file.cfg
[1] 3460
$ while read -u ${COPROC[0]} VAR1 VAR2 VAR3; do echo $VAR1 $VAR2 $VAR3; done
LINE1 A1 B1 C1
LINE2 A2 B2 C2
LINE3 A3 B3 C3
[1]+ Done coproc COPROC cat config-file.cfg
Just to add that this is explained in the FAQ.
What happens is, as soon as the subshell finishes, the parent shell cleans up and closes the FDs. You're lucky you even got to read the first line!
Try this in an interactive shell:
$ coproc ECHO { echo foo; echo bar; }
[2] 16472
[2]+ Done coproc ECHO { echo foo; echo bar; }
$ read -u ${ECHO[0]}; echo $REPLY
bash: read: -u: option requires an argument
read: usage: read [-ers] [-a array] [-d delim] [-i text] [-n nchars] [-N nchars] [-p prompt] [-t timeout] [-u fd] [name ...]
It even mops up the environment variable.
Now try this:
$ coproc ECHO { echo foo; echo bar; sleep 30; }
[2] 16485
$ read -u ${ECHO[0]}; echo $REPLY
foo
$ read -u ${ECHO[0]}; echo $REPLY
bar
$ read -u ${ECHO[0]}; echo $REPLY # blocks until the 30 seconds are up
[2]+ Done coproc ECHO { echo foo; echo bar; sleep 30; }
As for solving the problem behind the question: yes, redirection and process substitution are the better choice for the particular example given.
I have a problem updating a value of a variable in a shell script from inside of a while loop. It can be simulated with the following piece of code:
printf "aaa\nbbb\n" | \
while read x ; do
y=$x
echo "INSIDE: $y"
done
echo "OUTSIDE: $y"
Output:
INSIDE: aaa
INSIDE: bbb
OUTSIDE:
Here the printf command just prints two lines, and the while-read loop reads them line by line, updating a variable; but as soon as control leaves the loop, the value of the variable is lost.
I guess the problem is related to the fact that 'pipe-while-read' statement causes shell to execute the body of the loop in a subprocess, which cannot update the shell variables in the main loop.
If I rewrite the first two lines of code as
for x in `printf "aaa\nbbb\n" ` ; do
Output:
INSIDE: aaa
INSIDE: bbb
OUTSIDE: bbb
It could be a workaround, but not for my case because in reality I have not 'aaa' and 'bbb' but more complex strings including whitespaces etc.
Any idea how to tackle the problem, namely: read a command output line by line in a loop and be able to update shell variables?
Thanks.
An excerpt from man bash:
Each command in a pipeline is executed as a separate process (i.e., in a subshell).
And a subshell cannot change variables in its parent.
One possible solution is:
while IFS= read -r x ; do
y=${x}
echo "INSIDE: ${y}"
done <<EOT
aaa
bbb
EOT
echo "OUTSIDE: ${y}"
Or if the input is a file:
while IFS= read -r x ; do
y=${x}
echo "INSIDE: ${y}"
done < /path/to/file
echo "OUTSIDE: ${y}"
This reads one line at a time, and doesn't have any issue with spaces.
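To check the claim about spaces, here is a small sketch with made-up input lines. It clears IFS for the read and passes -r, which together keep leading whitespace and backslashes intact; the variable first is introduced only to hold the first line for inspection.

```bash
first=
while IFS= read -r x; do
    [ -n "$first" ] || first=$x      # remember the first line as read
    y=$x
    printf 'INSIDE: [%s]\n' "$y"
done <<'EOT'
  indented line
back\slash kept
EOT
printf 'OUTSIDE: [%s]\n' "$y"        # prints: OUTSIDE: [back\slash kept]
```

The quoted 'EOT' delimiter stops the shell from expanding anything inside the here-document, so the loop sees the lines exactly as written.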
You can get rid of the pipe-into-while by using process substitution instead:
while read x ; do
y=$x
echo "INSIDE: $y"
done < <(printf "aaa\nbbb\n")
echo "OUTSIDE: $y"
Alternatively, if your input is in a file, you can redirect it into while:
while read x ; do
y=$x
echo "INSIDE: $y"
done < file
echo "OUTSIDE: $y"