Looping file contents in bash - bash

I have a file /tmp/a.txt whose contents I want to read into a variable repeatedly. When EOF is reached, reading should start again from the beginning.
For example, if the file contains "abc" and I want 10 characters, the result should be "abcabcabca".
For this I wrote the obvious script:
while [ 1 ];
do cat /tmp/a.txt;
done |
for i in {1..3};
do read -N 10 A;
echo "For $i: $A";
done
The only problem is that it hangs! I have no idea why it does so!
I am also open to other solutions in bash.

To repeat a line over and over, you can use yes:
yes "abc" | for i in {1..3}; do read -N 10 A; echo "for $i: $A"; done
yes outputs its argument forever, but the for loop over {1..3} runs the "do ... done" body only 3 times.
yes appends a "\n" after the string. If you don't want it, strip it with tr:
yes "abc" | tr -d '\n' | for i in {1..3}; do read -N 10 A; echo "for $i: $A"; done
In all of the above, note that because the read comes after a pipe, bash runs it in a subshell, so "$A" is only available inside the "do ... done" body and is lost afterwards!
To loop and read from a file without running the loop in a subshell, use process substitution:
for i in {1..3}; do read -N 10 A ; echo "for $i: $A"; done < <(cat /the/file)
To be sure there is enough data in /the/file, repeat it as often as needed:
for i in {1..3}; do read -N 10 A ; echo "for $i: $A"; done < <(cat /the/file /the/file /the/file)
To test the last one: echo -n "abc" > /the/file (-n, so there is no trailing newline)
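That the redirected loop really runs in the current shell can be checked with a small sketch. This is not part of the original answer: the function name collect_chunks and the array chunks are made up for illustration, and printf stands in for cat /the/file so that the example is self-contained.

```bash
#!/usr/bin/env bash
# The loop is fed through process substitution, so it runs in the current
# shell and whatever it stores is still there after "done".
collect_chunks() {
    local i chunk
    chunks=()                              # global array filled by the loop
    for i in 1 2 3; do
        IFS= read -r -N 10 chunk || break  # stop early if the data runs out
        chunks+=("$chunk")
    done < <(printf 'abc%.0s' {1..10})     # 30 characters: "abc" ten times
}

collect_chunks
printf '%s\n' "${chunks[@]}"               # the values survived the loop
```

With a pipe in front of the for loop instead, chunks would still be empty at this point.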

The script hangs because of the first loop. After the three iterations of the second loop (for) are done, nothing reads from the pipe any more, but the first loop keeps starting new cat instances that read the file and try to write its content abc to the pipe. Each of those writes fails: cat is killed by SIGPIPE, but the signal goes to the cat command and not to the while loop itself, so the loop never ends. The solution is to catch the error in the right place:
while [ 1 ];
do cat /tmp/a.txt || break
done |
for i in {1..3};
do read -N 10 A;
echo "For $i: $A";
done
The output is as follows:
For 1: abcabcabca
For 2: bcabcabcab
For 3: cabcabcabc
<-- (Here the shell hangs no more)
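If the file is small, another option is to avoid the pipe altogether and build the repeated string in memory. The following is only a sketch under stated assumptions: the function name repeat_chars is hypothetical, the file must be non-empty and free of NUL bytes, and $(<file) strips trailing newlines from the content.

```bash
#!/usr/bin/env bash
# repeat_chars FILE COUNT
# Print the first COUNT characters of FILE's content repeated cyclically.
repeat_chars() {
    local file=$1 count=$2 data out=""
    data=$(<"$file")               # whole file; trailing newlines are dropped
    [ -n "$data" ] || return 1     # an empty file would loop forever
    while (( ${#out} < count )); do
        out+=$data                 # append one more copy of the content
    done
    printf '%s\n' "${out:0:count}" # cut to exactly COUNT characters
}
```

Called as repeat_chars /tmp/a.txt 10 on a file containing abc, this prints abcabcabca. Unlike the piped version it does not continue where the previous read stopped, so each call starts from the beginning of the file.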

Related

Current Count vs Total Count output in a single line using Bash

I need an output of current count vs total count in single line. I would like to know if this could be done Via Bash using 'for' 'while' loop.
Expecting an output that should only update the count and should not display multiple lines
File Content
$ cat ~/test.rtf
hostname1
hostname2
hostname3
hostname4
#!/bin/sh
j=1
k=$(cat ~/test.rtf | wc -l)
for i in $(cat ~/test.rtf);
do
echo "Working on line ($j/$k)"
echo "$i"
#or any other command for i
j=$((j+1))
done
EX:
Working on line (2/4)
Not like,
Working on line (2/4)
Working on line (3/4)
Assumptions:
OP wants to generate n lines of output that overwrite each other on successive passes through the loop
in OP's sample code there are two echo calls so in this case n=2
General approaches:
issue a clear at the beginning of each pass through the loop so as to clear the current output and reposition the cursor at the 'top' of the console/window
use tput to manage movement of the cursor (and clearing/overwriting of previous output)
Sample input:
$ cat test.rtf
this is line1
then line2
and line3
and last but not least line4
real last line5
clear approach:
j=1
k=$(wc -l < test.rtf)
while read -r line
do
clear
echo "Working on line ($j/$k)"
echo "${line}"
((j++))
done < test.rtf
tput approach:
j=1
k=$(wc -l < test.rtf)
EraseToEOL=$(tput el) # grab terminal specific code for clearing from cursor to EOL
clear # optional: start with a new screen otherwise use current position in console/window for next command ...
tput sc # grab current cursor position
while read -r line
do
tput rc # go (back) to cursor position stored via 'tput sc'
echo "Working on line ($j/$k)"
echo "${line}${EraseToEOL}" # ${EraseToEOL} forces clearing rest of line, effectively erasing a previous line that may have been longer then the current line of output
((j++))
done < test.rtf
Both of these generate the same result.
Something along these lines:
file=~/test.rtf
nl=$(wc -l "$file")
nl=${nl%%[[:blank:]]*}
i=0
while IFS= read -r line; do
i=$((i+1))
echo "Working on line ($i/$nl)"
done < "$file"
Your main question is how to avoid writing each counter value on a new line. The newline is the \n character, which echo appends. You want \r (carriage return) instead, like
for ((i=0; i<10; i++)); do
printf "Counter $i\r"
sleep 1
done
echo
When you echo something from the line you are working on, you will emit \n again. I will use cut as an example of processing the input line. Put the output string in the same printf command, like
j=1
k=$(cat ~/test.rtf | wc -l)
while IFS= read -r line; do
printf "Working on line (%s): %s\r" "$j/$k" "$(cut -c1-10 <<< "${line}")"
sleep 1
((j++))
done < ~/test.rtf
The problem with the above solution is that you will see output from previous lines when your last output is shorter than the previous one. When you know the maximum length that your processing of the line will show, you can force additional spaces:
j=1
k=$(cat ~/test.rtf | wc -l)
while IFS= read -r line; do
printf "Working on line (%5.5s): %-20s\r" "$j/$k" "$(cut -c1-20 <<< "${line}")"
sleep 1
((j++))
done < ~/test.rtf
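Padding with spaces requires knowing the maximum width in advance. A possible alternative, sketched here under the assumption that the terminal understands the ANSI "erase to end of line" sequence (ESC [ K), is to clear the line before each update. The function name show_progress and the sample host names are made up for illustration.

```bash
#!/usr/bin/env bash
# show_progress CURRENT TOTAL TEXT
# Go back to column 0 (\r), erase the old line (ESC [ K), then print the new
# status without a trailing newline so that the next call overwrites it.
show_progress() {
    printf '\r\033[KWorking on line (%d/%d): %s' "$1" "$2" "$3"
}

# Usage sketch: report progress over the lines of a here-document.
total=3
n=0
while IFS= read -r line; do
    n=$((n + 1))
    show_progress "$n" "$total" "$line"
done <<'EOF'
hostname1
hostname2
hostname3
EOF
echo    # final newline so the prompt starts on a fresh line
```

On terminals without ANSI support, the tput el approach is the more portable way to obtain the same escape sequence.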

Can't add a new element to an array in bash [duplicate]

In the following program, if I set the variable $foo to the value 1 inside the first if statement, it works in the sense that its value is remembered after the if statement. However, when I set the same variable to the value 2 inside an if which is inside a while statement, it's forgotten after the while loop. It's behaving like I'm using some sort of copy of the variable $foo inside the while loop and I am modifying only that particular copy. Here's a complete test program:
#!/bin/bash
set -e
set -u
foo=0
bar="hello"
if [[ "$bar" == "hello" ]]
then
foo=1
echo "Setting \$foo to 1: $foo"
fi
echo "Variable \$foo after if statement: $foo"
lines="first line\nsecond line\nthird line"
echo -e $lines | while read line
do
if [[ "$line" == "second line" ]]
then
foo=2
echo "Variable \$foo updated to $foo inside if inside while loop"
fi
echo "Value of \$foo in while loop body: $foo"
done
echo "Variable \$foo after while loop: $foo"
# Output:
# $ ./testbash.sh
# Setting $foo to 1: 1
# Variable $foo after if statement: 1
# Value of $foo in while loop body: 1
# Variable $foo updated to 2 inside if inside while loop
# Value of $foo in while loop body: 2
# Value of $foo in while loop body: 2
# Variable $foo after while loop: 1
# bash --version
# GNU bash, version 4.1.10(4)-release (i686-pc-cygwin)
echo -e $lines | while read line
...
done
The while loop is executed in a subshell. So any changes you do to the variable will not be available once the subshell exits.
Instead you can use a here string to re-write the while loop to be in the main shell process; only echo -e $lines will run in a subshell:
while read line
do
if [[ "$line" == "second line" ]]
then
foo=2
echo "Variable \$foo updated to $foo inside if inside while loop"
fi
echo "Value of \$foo in while loop body: $foo"
done <<< "$(echo -e "$lines")"
You can get rid of the rather ugly echo in the here-string above by expanding the backslash sequences immediately when assigning lines. The $'...' form of quoting can be used there:
lines=$'first line\nsecond line\nthird line'
while read line; do
...
done <<< "$lines"
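The effect can be reproduced in a few lines. This is a minimal sketch with made-up variable names (piped, direct), assuming bash with default options, that is, without shopt -s lastpipe:

```bash
#!/usr/bin/env bash
lines=$'first line\nsecond line\nthird line'

# Loop on the right-hand side of a pipe: runs in a subshell,
# so the increments are lost when the pipeline finishes.
piped=0
printf '%s\n' "$lines" | while IFS= read -r line; do
    piped=$((piped + 1))
done

# Same loop fed by a here-string: runs in the current shell.
direct=0
while IFS= read -r line; do
    direct=$((direct + 1))
done <<< "$lines"

echo "piped=$piped direct=$direct"
```

Only the here-string version keeps the count after the loop.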
Explanation is in Blue Moons's answer.
Alternative solutions:
Eliminate echo
while read line; do
...
done <<EOT
first line
second line
third line
EOT
Add the echo inside the here-document
while read line; do
...
done <<EOT
$(echo -e $lines)
EOT
Run echo in background:
coproc echo -e $lines
while read -u ${COPROC[0]} line; do
...
done
Redirect to a file handle explicitly (Mind the space in < <!):
exec 3< <(echo -e $lines)
while read -u 3 line; do
...
done
Or just redirect to the stdin:
while read line; do
...
done < <(echo -e $lines)
And one for chepner (eliminating echo):
arr=("first line" "second line" "third line");
for((i=0;i<${#arr[*]};++i)) { line=${arr[i]};
...
}
Variable $lines can be converted to an array without starting a new subshell. The two-character sequence \ and n has to be converted to some single character (e.g. a real newline character), and the IFS (Internal Field Separator) variable is then used to split the string into array elements. This can be done like:
lines="first line\nsecond line\nthird line"
echo "$lines"
OIFS="$IFS"
IFS=$'\n' arr=(${lines//\\n/$'\n'}) # Conversion
IFS="$OIFS"
echo "${arr[@]}", Length: ${#arr[*]}
set|grep ^arr
Result is
first line\nsecond line\nthird line
first line second line third line, Length: 3
arr=([0]="first line" [1]="second line" [2]="third line")
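On bash 4 or later, mapfile offers a shorter route to the same array. This is a sketch, not the answer's method: it assumes the backslash sequences in $lines should be interpreted the way echo -e does, for which printf '%b' is used here. The printf itself runs in a process substitution, but mapfile and therefore arr stay in the current shell.

```bash
#!/usr/bin/env bash
lines="first line\nsecond line\nthird line"

# printf '%b' expands the literal \n sequences into real newlines;
# mapfile -t stores one array element per line, without the newline.
mapfile -t arr < <(printf '%b\n' "$lines")

echo "Length: ${#arr[@]}"
printf '[%s]\n' "${arr[@]}"
```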
You are asking this bash FAQ. The answer also describes the general case of variables set in subshells created by pipes:
E4) If I pipe the output of a command into read variable, why
doesn't the output show up in $variable when the read command finishes?
This has to do with the parent-child relationship between Unix
processes. It affects all commands run in pipelines, not just
simple calls to read. For example, piping a command's output
into a while loop that repeatedly calls read will result in
the same behavior.
Each element of a pipeline, even a builtin or shell function,
runs in a separate process, a child of the shell running the
pipeline. A subprocess cannot affect its parent's environment.
When the read command sets the variable to the input, that
variable is set only in the subshell, not the parent shell. When
the subshell exits, the value of the variable is lost.
Many pipelines that end with read variable can be converted
into command substitutions, which will capture the output of
a specified command. The output can then be assigned to a
variable:
grep ^gnu /usr/lib/news/active | wc -l | read ngroup
can be converted into
ngroup=$(grep ^gnu /usr/lib/news/active | wc -l)
This does not, unfortunately, work to split the text among
multiple variables, as read does when given multiple variable
arguments. If you need to do this, you can either use the
command substitution above to read the output into a variable
and chop up the variable using the bash pattern removal
expansion operators or use some variant of the following
approach.
Say /usr/local/bin/ipaddr is the following shell script:
#! /bin/sh
host `hostname` | awk '/address/ {print $NF}'
Instead of using
/usr/local/bin/ipaddr | read A B C D
to break the local machine's IP address into separate octets, use
OIFS="$IFS"
IFS=.
set -- $(/usr/local/bin/ipaddr)
IFS="$OIFS"
A="$1" B="$2" C="$3" D="$4"
Beware, however, that this will change the shell's positional
parameters. If you need them, you should save them before doing
this.
This is the general approach -- in most cases you will not need to
set $IFS to a different value.
Some other user-supplied alternatives include:
read A B C D << HERE
$(IFS=.; echo $(/usr/local/bin/ipaddr))
HERE
and, where process substitution is available,
read A B C D < <(IFS=.; echo $(/usr/local/bin/ipaddr))
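In bash, a here-string gives a compact variant of the same idea. This sketch is not part of the FAQ text; the address is an example value that stands in for the output of /usr/local/bin/ipaddr.

```bash
#!/usr/bin/env bash
ip="192.0.2.10"    # example value, assumed to be what ipaddr would print

# IFS=. applies to this read only; read runs in the current shell,
# so A, B, C and D remain set afterwards.
IFS=. read -r A B C D <<< "$ip"

echo "A=$A B=$B C=$C D=$D"
```

Because the IFS assignment is a prefix to read, the shell's own IFS does not need to be saved and restored.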
Hmmm... I would almost swear that this worked for the original Bourne shell, but don't have access to a running copy just now to check.
There is, however, a very trivial workaround to the problem.
Change the first line of the script from:
#!/bin/bash
to
#!/bin/ksh
Et voila! A read at the end of a pipeline works just fine, assuming you have the Korn shell installed.
This is an interesting question that touches on a very basic concept in the Bourne shell: the subshell. Here I provide a solution that differs from the previous ones by doing some filtering. I will give an example that may be useful in real life. This is a fragment for checking that downloaded files conform to a known checksum. The checksum file looks like the following (showing just 3 lines):
49174 36326 dna_align_feature.txt.gz
54757 1 dna.txt.gz
55409 9971 exon_transcript.txt.gz
The shell script:
#!/bin/sh
.....
failcnt=0 # this variable is only valid in the parent shell
#variable xx captures all the outputs from the while loop
xx=$(cat ${checkfile} | while read -r line; do
num1=$(echo $line | awk '{print $1}')
num2=$(echo $line | awk '{print $2}')
fname=$(echo $line | awk '{print $3}')
if [ -f "$fname" ]; then
res=$(sum $fname)
filegood=$(sum $fname | awk -v na=$num1 -v nb=$num2 -v fn=$fname '{ if (na == $1 && nb == $2) { print "TRUE"; } else { print "FALSE"; }}')
if [ "$filegood" = "FALSE" ]; then
failcnt=$(expr $failcnt + 1) # only in subshell
echo "$fname BAD $failcnt"
fi
fi
done | tail -1) # I am only interested in the final result
# you can capture a whole bunch of texts and do further filtering
failcnt=${xx#* BAD } # I am only interested in the number
# this variable is in the parent shell
echo failcnt $failcnt
if [ $failcnt -gt 0 ]; then
echo $failcnt files failed
else
echo download successful
fi
The parent and subshell communicate through the echo command. You can pick some easy to parse text for the parent shell. This method does not break your normal way of thinking, just that you have to do some post processing. You can use grep, sed, awk, and more for doing so.
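For comparison, the same check can be written so that the counter never leaves the current shell, by redirecting the check file into the loop instead of piping it. This is a sketch in bash rather than plain sh, with a hypothetical function name count_bad; it assumes the check file has the three-column layout shown above.

```bash
#!/usr/bin/env bash
# count_bad CHECKFILE
# CHECKFILE lines look like: "<checksum> <blocks> <filename>".
# Prints how many of the existing files do not match their recorded sum.
count_bad() {
    local checkfile=$1 failcnt=0 num1 num2 fname got1 got2 rest
    while read -r num1 num2 fname; do
        [ -f "$fname" ] || continue            # skip missing files and blank lines
        read -r got1 got2 rest < <(sum < "$fname")
        # 10# forces base 10 so leading zeros are not treated as octal
        if (( 10#$got1 != 10#$num1 || 10#$got2 != 10#$num2 )); then
            failcnt=$((failcnt + 1))           # no pipe, so no subshell here
        fi
    done < "$checkfile"
    echo "$failcnt"
}
```

A caller would then use failcnt=$(count_bad "$checkfile") and test the number directly, with no text parsing needed.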
I use a file to store the value within the loop, and read it back outside. (Note that "> 2" writes to a file named 2 in the current directory, not to stderr.)
Here var i is initially set to 1 and read inside the loop.
# reading lines of content from 2 files concatenated
# inside loop: write value of var i to the file named 2 (before incrementing)
# outside: read var i from that file; it has the last iterative value
f=/tmp/file1
g=/tmp/file2
i=1
cat $f $g | \
while read -r s;
do
echo $s > /dev/null; # some work
echo $i > 2
let i++
done;
read -r i < 2
echo $i
Or use the heredoc method to reduce the amount of code in a subshell.
Note the iterative i value can be read outside the while loop.
i=1
while read -r s;
do
echo $s > /dev/null
let i++
done <<EOT
$(cat $f $g)
EOT
let i--
echo $i
How about a very simple method
+call your while loop in a function
- set your value inside (nonsense, but shows the example)
- return your value inside
+capture your value outside
+set outside
+display outside
#!/bin/bash
# set -e
# set -u
# No idea why you need this, not using here
foo=0
bar="hello"
if [[ "$bar" == "hello" ]]
then
foo=1
echo "Setting \$foo to $foo"
fi
echo "Variable \$foo after if statement: $foo"
lines="first line\nsecond line\nthird line"
function my_while_loop
{
echo -e $lines | while read line
do
if [[ "$line" == "second line" ]]
then
foo=2;
echo "Variable \$foo updated to $foo inside if inside while loop"
return 2;
fi
# Code below won't be executed since we returned from function in 'if' statement
# We already reported the $foo var being set to 2 anyway
echo "Value of \$foo in while loop body: $foo"
done
}
my_while_loop; foo="$?"
echo "Variable \$foo after while loop: $foo"
Output:
Setting $foo to 1
Variable $foo after if statement: 1
Value of $foo in while loop body: 1
Variable $foo updated to 2 inside if inside while loop
Variable $foo after while loop: 2
bash --version
GNU bash, version 3.2.51(1)-release (x86_64-apple-darwin13)
Copyright (C) 2007 Free Software Foundation, Inc.
Though this is an old question that has been asked several times, here is what I ended up doing after hours of fiddling with here-strings: the only option that worked for me was to store the value in a file inside the while loop's subshell and then retrieve it afterwards. Simple.
Use an echo statement to store and a cat statement to retrieve. The user running the script must own the directory or have read and write permission on it.
#write to file
echo "1" > foo.txt
while condition; do
if (condition); then
#write again to file
echo "2" > foo.txt
fi
done
#read from file
echo "Value of \$foo in while loop body: $(cat foo.txt)"

Stuck in an infinite while loop

I am trying to write this code so that when the process reads "map finished" from the pipe, it increments a variable by 1, so that it eventually breaks out of the while loop. Otherwise it adds unique parameters to a keys file. However, it goes into an infinite loop and never breaks out.
while [ $a -le 5 ]; do
read input < map_pipe;
if [ $input = "map finished" ]; then
((a++))
echo $a
else
sort -u map_pipe >> keys.txt;
fi
done
I decided to fix it for you, not sure if this is what you wanted, but I think I am close:
#!/bin/bash
a=0 #Initialize your variable to something
while [ $a -le 5 ]; do
read input < map_pipe;
if [ "$input" = "map finished" ]; then #Put double quotes around variables to allow values with spaces
a=$(($a + 1)) #Your syntax was off, use spaces and do something with the output
else
echo $input >> keys.txt #Don't re-read the pipe, it's empty by now and sort will wait for the next input
sort -u keys.txt > tmpfile #Instead sort your file, don't save directly into the same file it will break
mv tmpfile keys.txt
#sort -u keys.txt | sponge keys.txt #Will also work instead of the other sort and mv, but sponge is not installed on most machines
fi
done
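A possible variation opens the pipe once for the whole loop instead of reopening it for every read. This is only a sketch: the function name count_finished and its parameters are hypothetical, and it assumes that at least one writer keeps the pipe open until the expected number of "map finished" markers has arrived. If all writers close the pipe earlier, read sees EOF and the loop ends with a smaller count.

```bash
#!/usr/bin/env bash
# count_finished PIPE KEYS_FILE EXPECTED
# Read lines from PIPE until EXPECTED "map finished" markers were seen.
# All other lines are collected in KEYS_FILE, which is de-duplicated at the end.
count_finished() {
    local pipe=$1 keys=$2 expected=$3 finished=0 input
    : >> "$keys"                         # make sure the keys file exists
    while (( finished < expected )) && IFS= read -r input; do
        if [ "$input" = "map finished" ]; then
            finished=$((finished + 1))
        else
            printf '%s\n' "$input" >> "$keys"
        fi
    done < "$pipe"                       # opened once for the whole loop
    sort -u -o "$keys" "$keys"           # -o may safely name the input file
    echo "$finished"
}
```

Sorting once after the loop also avoids rewriting keys.txt for every line that arrives.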

Parsing .csv file in bash, not reading final line

I'm trying to parse a csv file I made with Google Spreadsheet. It's very simple for testing purposes, and is basically:
1,2
3,4
5,6
The problem is that the csv doesn't end in a newline character so when I cat the file in BASH, I get
MacBook-Pro:Desktop kkSlider$ cat test.csv
1,2
3,4
5,6MacBook-Pro:Desktop kkSlider$
I just want to read line by line in a BASH script using a while loop that every guide suggests, and my script looks like this:
while IFS=',' read -r last first
do
echo "$last $first"
done < test.csv
The output is:
MacBook-Pro:Desktop kkSlider$ ./test.sh
1 2
3 4
Any ideas on how I could have it read that last line and echo it?
Thanks in advance.
You can force the input to your loop to end with a newline thus:
#!/bin/bash
(cat test.csv ; echo) | while IFS=',' read -r last first
do
echo "$last $first"
done
Unfortunately, this may result in an empty line at the end of your output if the input already has a newline at the end. You can fix that with a little addition:
#!/bin/bash
(cat test.csv ; echo) | while IFS=',' read -r last first
do
if [[ $last != "" ]] ; then
echo "$last $first"
fi
done
Another method relies on the fact that the values are being placed into the variables by the read but they're just not being output because of the while statement:
#!/bin/bash
while IFS=',' read -r last first
do
echo "$last $first"
done <test.csv
if [[ $last != "" ]] ; then
echo "$last $first"
fi
That one works without creating another subshell to modify the input to the while statement.
Of course, I'm assuming here that you want to do more inside the loop than just output the values with a space rather than a comma. If that's all you wanted to do, there are other tools better suited than a bash read loop, such as:
tr "," " " <test.csv
cat file |sed -e '${/^$/!s/$/\n/;}'| while IFS=',' read -r last first; do echo "$last $first"; done
If the last (unterminated) line needs to be processed differently from the rest, @paxdiablo's version with the extra if statement is the way to go; but if it's going to be handled like all the others, it's cleaner to process it in the main loop.
You can roll the "if there was an unterminated last line" into the main loop condition like this:
while IFS=',' read -r last first || [ -n "$last" ]
do
echo "$last $first"
done < test.csv
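Wrapped in a function, the same condition can be tried on a file with and without a final newline. The function name print_pairs is made up for this sketch; the loop condition is the one shown above.

```bash
#!/usr/bin/env bash
# print_pairs FILE
# Print both comma-separated fields of every line, including a last line
# that is not terminated by a newline.
print_pairs() {
    local last first
    # read fails at EOF, but still fills $last when the final line is
    # unterminated; the -n test lets the body run once more in that case.
    while IFS=',' read -r last first || [ -n "$last" ]; do
        echo "$last $first"
    done < "$1"
}
```

Both variants of the file produce the same three output lines. Note that an unterminated last line whose first field is empty would still be skipped by this test.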

Bash loop, print current iteration?

Say you have a simple loop
while read line
do
printf "${line#*//}\n"
done < text.txt
Is there an elegant way of printing the current iteration with the output? Something like
0 The
1 quick
2 brown
3 fox
I am hoping to avoid setting a variable and incrementing it on each loop.
To do this, you would need to increment a counter on each iteration (like you are trying to avoid).
count=0
while read -r line; do
printf '%d %s\n' "$count" "${line#*//}"
(( count++ ))
done < test.txt
EDIT: After some more thought, you can do it without a counter if you have bash version 4 or higher:
mapfile -t arr < test.txt
for i in "${!arr[@]}"; do
printf '%d %s\n' "$i" "${arr[i]}"
done
The mapfile builtin reads the entire contents of the file into the array. You can then iterate over the indices of the array, which will be the line numbers and access that element.
You don't often see it, but you can have multiple commands in the condition clause of a while loop. The following still requires an explicit counter variable, but the arrangement may be more suitable or appealing for some uses.
while ((i++)); read -r line
do
echo "$i $line"
done < inputfile
The while condition is satisfied by whatever the last command returns (read in this case).
Some people prefer to include the do on the same line. This is what that would look like:
while ((i++)); read -r line; do
echo "$i $line"
done < inputfile
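For zero-based numbering as in the question, the increment can also be folded into the printf argument. This is a sketch with a hypothetical function name, number_lines; a counter variable still exists, it is just incremented inside the arithmetic expansion.

```bash
#!/usr/bin/env bash
# number_lines: prefix every line read from stdin with a zero-based index.
number_lines() {
    local i=0 line
    while IFS= read -r line; do
        # $((i++)) expands to the current value of i, then increments it
        printf '%d %s\n' "$((i++))" "$line"
    done
}

printf '%s\n' The quick brown fox | number_lines
```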
You can loop over a range, which can be an array, a string, an input line or a list.
In this example, a list of numbers from 0 to 10 is used, with an increment of 2.
#!/bin/bash
for i in {0..10..2}; do
echo " $i times"
done
The output is:
0 times
2 times
4 times
6 times
8 times
10 times
To print the index regardless of the loop range, you have to use a variable (COUNTER=0) and increase it in each iteration (COUNTER+1).
My solution prints each iteration: the for loop traverses an input line, increments the counter by one on each iteration, and also shows each word of the input line:
#!/bin/bash
COUNTER=0
line="this is a sample input line"
for word in $line; do
echo "This is word number $COUNTER: $word"
COUNTER=$((COUNTER+1))
done
The output is:
This is word number 0: this
This is word number 1: is
This is word number 2: a
This is word number 3: sample
This is word number 4: input
This is word number 5: line
n=0
cat test.txt | while read line; do
printf "%7s %s\n" "$n" "${line#*//}"
n=$((n+1))
done
This will work in Bourne shell as well, of course.
If you really want to avoid incrementing a variable, you can pipe the output through grep or awk:
cat test.txt | while read line; do
printf " %s\n" "${line#*//}"
done | grep -n .
or
awk '{sub(/.*\/\//, ""); print NR,$0}' test.txt
Update: Other answers posted here are better, especially those of @Graham and @DennisWilliamson.
Something very like this should suit:
tr -s ' ' '\n' <test.txt | nl -ba
You can add a -v0 flag to the nl command if you want indexing from 0.

Resources