Manually iterating a line of a file | bash

I could do this in any other language, but with Bash I've looked far and wide and could not find the answer.
I need to manually increase $line in a script. Example:
for line in `cat file`
do
foo()
foo_loop(condition)
{
do_something_to_line($line)
}
done
If you notice, every time the foo_loop iterates, $line stays the same. I need to iterate $line there, and make sure the original for loop only runs the number of lines in file.
I have thought about finding the number of lines in file using a different loop and iterating the line variable inside the inner loop of foo().
Any ideas?
EDIT:
Sorry for being so vague.
Here we go:
I'm trying to make a section of my code execute multiple times (parallel execution)
Function foo() # Does something
for line in `cat $temp_file`;
foo($line)
That code works just fine, because foo is just taking in the value of line; but if I wanted to do this:
Function foo() # Does something
for line in `cat $temp_file`;
while (some condition)
foo($line)
end
$line will equal the same value throughout the while loop. I need it to change with the while loop, then continue when it goes back to the for. Example:
line = Hi
foo{ echo "$line" };
for line in `cat file`;
while ( number_of_processes_running -lt max_number_allowed)
foo($line)
end
If the contents of file were
Hi \n Bye \n Yellow \n Green \n
The output of the example program would be (if max number allowed was 3)
Hi Hi Hi Bye Bye Bye Yellow Yellow Yellow Green Green Green.
Where I want it to be
Hi Bye Yellow Green
I hope this is better. I'm doing my best to explain my problem.

Instead of using a for loop, read through the file like so:
#!/bin/bash
while read -r line
do
    do_something_to_line "$line"
done < "your.file"

Long story short: while read line; do _____ ; done
Then make sure you have double quotes around "$line" so the value isn't split on spaces.
Example:
$ cat /proc/cpuinfo | md5sum
c2eb5696e59948852f66a82993016e5a *-
$ cat /proc/cpuinfo | while read line; do echo "$line"; done | md5sum
c2eb5696e59948852f66a82993016e5a *-
Second example:
# Add .gz to every file in the current directory.
# Quoting "$line" keeps filenames containing spaces intact.
$ find . -maxdepth 1 -type f | while read -r line; do mv "$line" "$line.gz"; done
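If filenames might also contain leading whitespace or even newlines, a more defensive variant (a sketch using NUL delimiters, which GNU find and bash's read support) is:
find . -maxdepth 1 -type f -print0 |
while IFS= read -r -d '' file; do
    mv "$file" "$file.gz"
done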

You should post follow-ups as edits to your question or in comments rather than as an answer.
This structure:
while read -r line
do
    for (( i=1; i<=$max_number_allowed; i++ ))
    do
        foo "$line"
    done
done < file
Yields:
Hi
Hi
Hi
Bye
Bye
Bye
...etc.
While this one:
for (( i=1; i<=$max_number_allowed; i++ ))
do
    while read -r line
    do
        foo "$line"
    done < file
done
Yields:
Hi
Bye
Yellow
Green
Hi
Bye
Yellow
Green
...etc.
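The edited question also mentions capping the number of parallel processes, which neither structure above addresses. One common pattern (a sketch only — foo's body here is a placeholder, and max_number_allowed comes from the question) launches each line's work in the background and throttles on the count of running jobs:
#!/bin/bash
max_number_allowed=3

foo() { echo "$1"; sleep 1; }   # placeholder body

while read -r line; do
    # block until a job slot frees up
    while (( $(jobs -pr | wc -l) >= max_number_allowed )); do
        sleep 1
    done
    foo "$line" &               # process this line in the background
done < file
wait                            # let the remaining jobs finish
This prints each line once (Hi Bye Yellow Green), as desired. On bash 4.3+, the inner polling loop can be replaced with wait -n, which blocks until any one background job exits.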

Related

Current Count vs Total Count output in a single line using Bash

I need an output of current count vs total count on a single line. I would like to know if this can be done via Bash using a 'for' or 'while' loop.
I expect output that updates the count in place and does not display multiple lines.
File Content
$ cat ~/test.rtf
hostname1
hostname2
hostname3
hostname4
#!/bin/sh
j=1
k=$(cat ~/test.rtf | wc -l)
for i in $(cat ~/test.rtf);
do
echo "Working on line ($j/$k)"
echo "$i"
#or any other command for i
j=$((j+1))
done
EX:
Working on line (2/4)
Not like,
Working on line (2/4)
Working on line (3/4)
Assumptions:
- OP wants to generate n lines of output that overwrite each other on successive passes through the loop
- in OP's sample code there are two echo calls, so in this case n=2
General approaches:
- issue a clear at the beginning of each pass through the loop to wipe the current output and reposition the cursor at the 'top' of the console/window
- use tput to manage movement of the cursor (and clearing/overwriting of previous output)
Sample input:
$ cat test.rtf
this is line1
then line2
and line3
and last but not least line4
real last line5
clear approach:
j=1
k=$(wc -l < test.rtf)
while read -r line
do
clear
echo "Working on line ($j/$k)"
echo "${line}"
((j++))
done < test.rtf
tput approach:
j=1
k=$(wc -l < test.rtf)
EraseToEOL=$(tput el) # grab terminal specific code for clearing from cursor to EOL
clear # optional: start with a new screen otherwise use current position in console/window for next command ...
tput sc # grab current cursor position
while read -r line
do
tput rc # go (back) to cursor position stored via 'tput sc'
echo "Working on line ($j/$k)"
echo "${line}${EraseToEOL}" # ${EraseToEOL} forces clearing rest of line, effectively erasing a previous line that may have been longer then the current line of output
((j++))
done < test.rtf
Both of these generate the same result: a two-line status display that overwrites itself on each pass through the loop.
Something along these lines:
file=~/test.rtf
nl=$(wc -l "$file")
nl=${nl%%[[:blank:]]*}
i=0
while IFS= read -r line; do
i=$((i+1))
echo "Working on line ($i/$nl)"
done < "$file"
Your main question is how to keep the counter from being written on a new line each time. The newline is a \n character, which echo appends; you want a carriage return \r instead, like
for ((i=0; i<10; i++)); do
    printf 'Counter %d\r' "$i"
    sleep 1
done
echo
When you echo something from the line you are working on, a \n creeps in again. I will use cut as an example of processing the input line; put its output in the same printf command, like
j=1
k=$(wc -l < ~/test.rtf)
while IFS= read -r line; do
    printf "Working on line (%s): %s\r" "$j/$k" "$(cut -c1-10 <<< "${line}")"
    sleep 1
    ((j++))
done < ~/test.rtf
The problem with the above solution is that leftover output from a previous line shows through whenever the current output is shorter. When you know the maximum length your processing will produce, you can pad with fixed field widths:
j=1
k=$(wc -l < ~/test.rtf)
while IFS= read -r line; do
    printf "Working on line (%5.5s): %-20s\r" "$j/$k" "$(cut -c1-20 <<< "${line}")"
    sleep 1
    ((j++))
done < ~/test.rtf

Parsing a config file in bash

Here's my config file (dansguardian-config):
banned-phrase duck
banned-site allaboutbirds.org
I want to write a bash script that will read this config file and create some other files for me. Here's what I have so far, it's mostly pseudo-code:
while read line
do
# if line starts with "banned-phrase"
# add rest of line to file bannedphraselist
# fi
# if line starts with "banned-site"
# add rest of line to file bannedsitelist
# fi
done < dansguardian-config
I'm not sure if I need to use grep, sed, awk, or what.
Hope that makes sense. I just really hate DansGuardian lists.
With awk:
$ cat config
banned-phrase duck frog bird
banned-phrase horse
banned-site allaboutbirds.org duckduckgoose.net
banned-site froggingbirds.gov
$ awk '$1=="banned-phrase"{for(i=2;i<=NF;i++)print $i >"bannedphraselist"}
$1=="banned-site"{for(i=2;i<=NF;i++)print $i >"bannedsitelist"}' config
$ cat bannedphraselist
duck
frog
bird
horse
$ cat bannedsitelist
allaboutbirds.org
duckduckgoose.net
froggingbirds.gov
Explanation:
In awk, each line is split into fields (by default on whitespace) and each field is accessed as $i, where i is the field number: the first field on a line is $1, the second is $2, and so on up to $NF, where NF is a built-in variable holding the number of fields on the given line.
So the script is simple:
Check the first field against the required string: $1=="banned-phrase".
If the first field matched, loop over all the remaining fields, for(i=2;i<=NF;i++), print each one with print $i, and redirect the output to the file: >"bannedphraselist".
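As a quick illustration of fields and NF:
$ echo "one two three" | awk '{print NF, $1, $NF}'
3 one three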
You could do
sed -n 's/^banned-phrase *//p' dansguardian-config > bannedphraselist
sed -n 's/^banned-site *//p' dansguardian-config > bannedsitelist
That does mean reading the file twice, although I doubt the possible performance loss matters.
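If reading twice ever did matter, sed can produce both files in a single pass via the w flag on the s command (a sketch using the same file names):
sed -n -e 's/^banned-phrase *//w bannedphraselist' \
       -e 's/^banned-site *//w bannedsitelist' dansguardian-config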
You can read multiple variables at once; by default they're split on whitespace.
while read command target; do
case "$command" in
banned-phrase) echo "$target" >>bannedphraselist;;
banned-site) echo "$target" >>bannedsitelist;;
"") ;; # blank line
*) echo >&2 "$0: unrecognized config directive '$command'";;
esac
done < dansguardian-config
Just as an example. A smarter implementation would read the list files first, make sure things weren't already banned, etc.
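That enhancement might look something like this sketch (assumes bash 4 associative arrays; same file names as above):
declare -A seen
for f in bannedphraselist bannedsitelist; do
    # remember every entry already present in the output files
    [ -f "$f" ] && while read -r entry; do seen[$entry]=1; done < "$f"
done

while read -r command target; do
    [ -z "$target" ] && continue             # skip blank or malformed lines
    [ -n "${seen[$target]}" ] && continue    # already banned, skip
    case "$command" in
        banned-phrase) echo "$target" >>bannedphraselist; seen[$target]=1 ;;
        banned-site)   echo "$target" >>bannedsitelist;   seen[$target]=1 ;;
    esac
done < dansguardian-config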
What is the problem with all the solutions that use echo text >> file? You can check with strace that on every such step the file is opened, positioned to the end, written, and closed. So if echo text >> file runs 1000 times, there will be 1000 calls each to open, lseek, write, and close. The number of open, lseek, and close calls can be reduced a lot the following way:
while read -r key val; do
    case $key in
        banned-phrase) echo "$val" >&2 ;;
        banned-site)   echo "$val" ;;
    esac
done >bannedsitelist 2>bannedphraselist <dansguardian-config
Stdout and stderr are redirected to the files and stay open while the loop is alive. So each file is opened once and closed once, and no lseek is needed. The file cache is also used more effectively this way, since there are no repeated close calls flushing the buffers each time.
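The same idea scales beyond two output files by opening dedicated file descriptors with exec; a sketch:
exec 3>bannedphraselist 4>bannedsitelist   # open each output file exactly once
while read -r key val; do
    case $key in
        banned-phrase) echo "$val" >&3 ;;
        banned-site)   echo "$val" >&4 ;;
    esac
done < dansguardian-config
exec 3>&- 4>&-                             # close the descriptors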
while read -r name value
do
    if [ "$name" = banned-phrase ]
    then
        echo "$value" >> bannedphraselist
    elif [ "$name" = banned-site ]
    then
        echo "$value" >> bannedsitelist
    fi
done < dansguardian-config
Better to use awk:
awk '$1 ~ /^banned-phrase/{print $2 >> "bannedphraselist"}
$1 ~ /^banned-site/{print $2 >> "bannedsitelist"}' dansguardian-config

bash loop skip commented lines

I'm looping over lines in a file. I just need to skip lines that start with "#".
How do I do that?
#!/bin/sh
while read line; do
if ["$line doesn't start with #"];then
echo "line";
fi
done < /tmp/myfile
Thanks for any help!
while read line; do
case "$line" in \#*) continue ;; esac
...
done < /tmp/my/input
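Spelled out as a complete script (the echo is just a stand-in for whatever per-line processing you do):
#!/bin/sh
while read -r line; do
    case "$line" in
        \#*) continue ;;   # skip lines starting with #
    esac
    echo "$line"
done < /tmp/myfile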
Frankly, however, it is often clearer to turn to grep:
grep -v '^#' < /tmp/myfile | { while read line; ...; done; }
This is an old question but I stumbled upon this problem recently, so I wanted to share my solution as well.
If you are not against using some python trickery, here it is:
Let this be our file called "my_file.txt":
this line will print
this will also print # but this will not
# this wont print either
# this may or may not be printed, depending on the script used, see below
Let this be our bash script called "my_script.sh":
#!/bin/sh
line_sanitizer="""import sys
with open(sys.argv[1], 'r') as f:
    for l in f.read().splitlines():
        line = l.split('#')[0].strip()
        if line:
            print(line)
"""
python -c "$line_sanitizer" ./my_file.txt
Calling the script will produce something similar to:
$ ./my_script.sh
this line will print
this will also print
Note: the blank line was not printed
If you want blank lines you can change the script to:
#!/bin/sh
line_sanitizer="""import sys
with open(sys.argv[1], 'r') as f:
    for l in f.read().splitlines():
        line = l.split('#')[0]
        if line:
            print(line)
"""
python -c "$line_sanitizer" ./my_file.txt
Calling this script will produce something similar to:
$ ./my_script.sh
this line will print
this will also print

Parsing .csv file in bash, not reading final line

I'm trying to parse a csv file I made with Google Spreadsheet. It's very simple for testing purposes, and is basically:
1,2
3,4
5,6
The problem is that the csv doesn't end in a newline character so when I cat the file in BASH, I get
MacBook-Pro:Desktop kkSlider$ cat test.csv
1,2
3,4
5,6MacBook-Pro:Desktop kkSlider$
I just want to read line by line in a BASH script using a while loop that every guide suggests, and my script looks like this:
while IFS=',' read -r last first
do
echo "$last $first"
done < test.csv
The output is:
MacBook-Pro:Desktop kkSlider$ ./test.sh
1 2
3 4
Any ideas on how I could have it read that last line and echo it?
Thanks in advance.
You can force the input to your loop to end with a newline thus:
#!/bin/bash
(cat test.csv ; echo) | while IFS=',' read -r last first
do
echo "$last $first"
done
Unfortunately, this may result in an empty line at the end of your output if the input already has a newline at the end. You can fix that with a little addition:
#!/bin/bash
(cat test.csv ; echo) | while IFS=',' read -r last first
do
if [[ $last != "" ]] ; then
echo "$last $first"
fi
done
Another method relies on the fact that the values are being placed into the variables by the read but they're just not being output because of the while statement:
#!/bin/bash
while IFS=',' read -r last first
do
echo "$last $first"
done <test.csv
if [[ $last != "" ]] ; then
echo "$last $first"
fi
That one works without creating another subshell to modify the input to the while statement.
Of course, I'm assuming here that you want to do more inside the loop than just output the values with a space rather than a comma. If that's all you want, there are other tools better suited than a bash read loop, such as:
tr "," " " <test.csv
cat file | sed -e '${/^$/!s/$/\n/;}' | while IFS=',' read -r last first; do echo "$last $first"; done
If the last (unterminated) line needs to be processed differently from the rest, @paxdiablo's version with the extra if statement is the way to go; but if it's going to be handled like all the others, it's cleaner to process it in the main loop.
You can roll the "if there was an unterminated last line" into the main loop condition like this:
while IFS=',' read -r last first || [ -n "$last" ]
do
echo "$last $first"
done < test.csv

Bash loop, print current iteration?

Say you have a simple loop
while read line
do
printf "${line#*//}\n"
done < text.txt
Is there an elegant way of printing the current iteration with the output? Something like
0 The
1 quick
2 brown
3 fox
I am hoping to avoid setting a variable and incrementing it on each loop.
To do this, you would need to increment a counter on each iteration (like you are trying to avoid).
count=0
while read -r line; do
printf '%d %s\n' "$count" "${line#*//}"
(( count++ ))
done < test.txt
EDIT: After some more thought, you can do it without a counter if you have bash version 4 or higher:
mapfile -t arr < test.txt
for i in "${!arr[@]}"; do
printf '%d %s\n' "$i" "${arr[i]}"
done
The mapfile builtin reads the entire contents of the file into the array. You can then iterate over the indices of the array, which will be the line numbers and access that element.
You don't often see it, but you can have multiple commands in the condition clause of a while loop. The following still requires an explicit counter variable, but the arrangement may be more suitable or appealing for some uses.
while ((i++)); read -r line
do
echo "$i $line"
done < inputfile
The while condition is satisfied by whatever the last command returns (read in this case).
Some people prefer to include the do on the same line. This is what that would look like:
while ((i++)); read -r line; do
echo "$i $line"
done < inputfile
You can loop over a range; it can be an array, a string, an input line, or a list.
In this example, a list of numbers {0..10} with an increment of 2 is used.
#!/bin/bash
for i in {0..10..2}; do
echo " $i times"
done
The output is:
0 times
2 times
4 times
6 times
8 times
10 times
To print the index regardless of the loop range, you have to use a counter variable, initialized with COUNTER=0 and increased on each iteration with COUNTER=$((COUNTER+1)).
The solution below iterates over the words of an input line with for, printing the counter and the word on each iteration:
#!/bin/bash
COUNTER=0
line="this is a sample input line"
for word in $line; do
echo "This i a word number $COUNTER: $word"
COUNTER=$((COUNTER+1))
done
The output is:
This is word number 0: this
This is word number 1: is
This is word number 2: a
This is word number 3: sample
This is word number 4: input
This is word number 5: line
n=0
cat test.txt | while read line; do
printf "%7s %s\n" "$n" "${line#*//}"
n=$((n+1))
done
This will work in Bourne shell as well, of course.
If you really want to avoid incrementing a variable, you can pipe the output through grep or awk:
cat test.txt | while read line; do
printf " %s\n" "${line#*//}"
done | grep -n .
or
awk '{sub(/.*\/\//, ""); print NR,$0}' test.txt
Update: Other answers posted here are better, especially those of @Graham and @DennisWilliamson.
Something very like this should suit:
tr -s ' ' '\n' <test.txt | nl -ba
You can add a -v0 flag to the nl command if you want indexing from 0.
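For example, numbering from 0:
tr -s ' ' '\n' <test.txt | nl -ba -v0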
