I have a script variable which is multi-line.
How do I traverse this variable to read it line by line and process each line the way I want?
Consider the following multi-line variable
x=$(echo -e "a\nb\nc d e")
and a simple per-line process: just echo the line with the prefix LINE: and with single quotes around it. Either of the following loops will satisfy that requirement:
while read line; do echo "LINE: '${line}'"; done <<< "$x"
or
while read line; do echo "LINE: '${line}'"; done < <(echo "$x")
Neither creates a subshell (so you can, e.g., set variables in the loop and access them outside of it), and both output
LINE: 'a'
LINE: 'b'
LINE: 'c d e'
But suppose instead you have
x=$(echo -e "a \n b\nc d e")
# note--------^--^
and that leading and trailing whitespace matter for your application (e.g., parsing Git porcelain). Both of the above loops will give exactly the same output for this latter variable as for the former, which is not what you want. To preserve leading and trailing whitespace, replace while read line with while IFS= read -r line. That is, either of the following loops
while IFS= read -r line; do echo "LINE: '${line}'"; done <<< "$x"
or
while IFS= read -r line; do echo "LINE: '${line}'"; done < <(echo "$x")
will produce
LINE: 'a '
LINE: ' b'
LINE: 'c d e'
See Greg Wooledge's excellent Bash FAQ for details.
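As an aside on the no-subshell point above, here is a quick sketch (not part of the original examples) showing that a variable set inside the loop is still visible afterwards, because the here-string feeds the loop in the current shell:
x=$(echo -e "a\nb\nc d e")
count=0
while IFS= read -r line; do count=$((count + 1)); done <<< "$x"
echo "$count"    # prints 3 -- the loop body ran in the current shell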
Although I typically use "while read" for processing multi-line variables, I recently had an instance where it removed the leading space from each line in a file. Using this instead fixed my issue:
printf '%s\n' "$var" | while IFS= read -r line
do
echo "$line"
done
Code taken from this Unix Stack Exchange answer.
Edit: updated to fix last line issue as suggested by Nicolae Iotu
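For what it's worth, the usual symptom of that last-line issue is input that does not end with a newline; a common guard (a sketch, not from the original answer) is to also test whether read left anything in the variable:
printf '%s' "$var" | while IFS= read -r line || [[ -n "$line" ]]
do
# the extra test keeps a final line that lacks a trailing newline
echo "$line"
done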
Related
I have a bash script with following variable:
operators_list=$'andrii,bogdan,eios,fre,kuz,pvm,sebastian,tester,tester2,vincent,ykosogon'
while IFS=, read -r tech_login; do
echo "... $tech_login ..."
done <<< "$operators_list"
I need to read the arguments from the variable and work with them in a loop. But the echo runs only once, with all the items at once:
+ IFS=,
+ read -r tech_login
+ echo '... andrii,bogdan,eios,fre,kuz,pvm,sebastian,tester,tester2,vincent,ykosogon ...'
... andrii,bogdan,eios,fre,kuz,pvm,sebastian,tester,tester2,vincent,ykosogon ...
+ IFS=,
+ read -r tech_login
What am I doing wrong? How can I rework the script so that it works with one item at a time?
operators_list=$'andrii,bogdan,eios,fre,kuz,pvm,sebastian,tester,tester2,vincent,ykosogon'
So you have strings separated by ,. You can handle that in multiple ways:
Using bash arrays:
IFS=, read -r -a operators <<< "$operators_list"
for op in "${operators[@]}"; do
echo "$op"
done
Using a while loop, like you wanted:
# the || test keeps the last item, which has no trailing comma
while IFS= read -d, -r op || [[ -n $op ]]; do
echo "$op"
done <<< "$operators_list"
Using xargs, because why not:
<<< "$operators_list" xargs -d, -n1 echo
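Another option, if your bash is 4.4 or newer, is readarray with a custom delimiter; a sketch, not part of the original answer:
readarray -t -d, operators < <(printf '%s' "$operators_list")
for op in "${operators[@]}"; do
echo "$op"
done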
The thing with IFS and the read delimiter is: read reads until the delimiter specified with -d. Then, after read has read a full string (usually a whole line, as the default delimiter is a newline), the string is split into parts using IFS as the delimiter. So you can:
while IFS=: read -d, -r op1 op2; do
echo "$op1" "$op2"
done <<<"op11:op12,op12:op22"
I am trying to parse a huge text file, say 200 MB.
The text file contains some strings:
123
1234
12345
12345
so my script looked like
while read line ; do
echo "$line"
done <textfile
however, using the above method, my string " 12345" loses its leading space and becomes "12345"
I tried using
sed -n "$i"p textfile
but then the throughput is reduced from 27 to 0.2 lines per second, which is unacceptable ;-)
Any idea how to solve this?
You want to read the lines without a field separator:
while IFS="" read line; do
echo "$line"
done <<< " 12345"
When you also want to skip interpretation of special characters, use
while IFS="" read -r line; do
echo "$line"
done <<< " 12345"
You can write the IFS without double quotes:
while IFS= read -r line; do
echo "$line"
done <<< " 12345"
This seems to be what you're looking for:
while IFS= read line; do
echo "$line"
done < textfile
The safest method is to use read -r rather than plain read, so that special characters such as backslashes are not interpreted (thanks Walter A):
while IFS= read -r line; do
echo "$line"
done < textfile
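To make the -r difference concrete, here is a small illustration with a line that contains backslashes (my own example, not from the question's data):
printf 'C:\\temp\\new\n' | while read line; do echo "$line"; done           # prints: C:tempnew
printf 'C:\\temp\\new\n' | while IFS= read -r line; do echo "$line"; done   # prints: C:\temp\new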
OPTION 1:
#!/bin/bash
# read whole file into array
readarray -t aMyArray < <(cat textfile)
# echo each line of the array
# this will preserve spaces
for i in "${aMyArray[@]}"; do echo "$i"; done
readarray -- read lines from standard input
-t -- omit trailing newline character
aMyArray -- name of array to store file in
< <() -- execute command; redirect stdout into array
cat textfile -- file you want to store in variable
for i in "${aMyArray[@]}" -- for every element in aMyArray
"" -- needed to maintain spaces in elements
${ [@]} -- reference all elements in array
do echo "$i"; -- for every iteration of "$i" echo it
"" -- to maintain variable spaces
$i -- equals each element of the array aMyArray as it cycles through
done -- close for loop
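As a side note, readarray can also read the file directly, without the cat and the process substitution; a minimal variant of the same idea:
readarray -t aMyArray < textfile
for i in "${aMyArray[@]}"; do echo "$i"; done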
OPTION 2:
To accommodate your larger file, you could do this to reduce the work and speed up the processing.
#!/bin/bash
sSearchFile=textfile
sSearchStrings="1|2|3|space"
while IFS= read -r line; do
echo "${line}"
done < <(egrep "${sSearchStrings}" "${sSearchFile}")
This greps the file first (which is faster) before cycling the result through the while loop. Let me know how this works for you. Notice you can add multiple search strings to the $sSearchStrings variable.
OPTION 3:
And an all-in-one solution: keep your search criteria in a text file and combine everything...
#!/bin/bash
# build the search pattern "string1|string2|..." from the file of search strings
sSearchStrings=""
while IFS= read -r string; do
# if $sSearchStrings is empty, start it with $string
[[ -z $sSearchStrings ]] && sSearchStrings="${string}" && continue
# otherwise append "|" and $string
sSearchStrings="${sSearchStrings}|${string}"
# read search criteria in from file
done <"searchstrings.file"
# identify file to be searched
sSearchFile="text.file"
while IFS= read -r line; do
echo "${line}"
done < <(egrep "${sSearchStrings}" "${sSearchFile}")
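To try OPTION 3 end to end, the files could look something like this (hypothetical contents, loosely based on the sample data in the question):
# hypothetical search strings, one per line
printf '%s\n' '123' '12345' > searchstrings.file
# hypothetical file to be searched
printf '%s\n' '123' ' 12345' 'no match here' > text.file
# the script then builds the pattern "123|12345" and prints only the matching
# lines, with their leading whitespace preserved:
#   123
#    12345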
It seems everybody else wants to remove any additional whitespace; however, I have the opposite problem.
I have a file, call it some_file.txt that looks like
a b c d
and some more
and I'm reading it line-by-line with sed,
num_lines=$(cat some_file.txt | wc -l)
for i in $(seq 1 $num_lines); do
echo $(sed "${i}q;d" $file)
string=$(sed "${i}q;d" $file)
echo $string
done
I would expect the number of whitespace characters to stay the same, however the output I get is
a b c d
a b c d
and some more
and some more
So it seems that the problem is sed removing the extra whitespace between characters. Any way to fix this?
Have a look at this example:
$ echo Hello World
Hello World
$ echo "Hello World"
Hello World
sed is not your problem; your problem is that bash word-splits the unquoted output of sed before passing it to echo, collapsing the whitespace.
You just need to surround whatever echo is supposed to print with double quotation marks. So instead of
echo $(sed "${i}q;d" $file)
echo $string
You write
echo "$(sed "${i}q;d" $file)"
echo "$string"
The new script should look like this:
#!/usr/bin/env bash
file=some_file.txt
num_lines=$(cat some_file.txt | wc -l)
for i in $(seq 1 $num_lines); do
echo "$(sed "${i}q;d" $file)"
string=$(sed "${i}q;d" $file)
echo "$string"
done
prints the correct output:
a b c d
a b c d
and some more
and some more
However, if you just want to go through your file line by line, I strongly recommend something like this:
while IFS= read -r line; do
echo "$line"
done < some_file.txt
Question from the comments: What to do if you only want 33 lines starting from line x. One possible solution is this:
#!/usr/bin/env bash
declare -i s=$1
declare -i e=${s}+32
sed -n "${s},${e}p" $file | while IFS= read -r line; do
echo "$line"
done
(Note that I would probably include some validation of $1 in there as well.)
I declare s and e as integer variables; then bash can do some simple arithmetic on them and calculate the actual last line to print.
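A tiny illustration of that declare -i behavior (my own example); once a variable is declared integer, assignments to it are evaluated arithmetically without $(( )):
declare -i s=5
declare -i e=${s}+32
echo "$s $e"    # prints: 5 37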
I have this
while read -r line
do
echo -e "$line\r"
done <<< $keys
Where $keys, if printed as is, displays 30k rows.
When doing the echo as above, I get only one line as output.
I need to filter every line and then output it.
Why is that happening (the output looks overwritten)?
How can I prevent it from happening?
Use More Quotes
while read -r line
do
echo -e "$line\r"
done <<< "$keys"
observe the quotes in <<< "$keys"
Look:
$ printf -v keys '%s\n' "key one" "key two"
$ echo $keys
key one key two
$ # Oh dear
$ # Now with quotes:
$ echo "$keys"
key one
key two
$ # Yeah :)
$ # Same with a here string:
$ while read line; do echo "$line"; done <<< $keys
key one key two
$ # Oh dear :(
$ # Now with quotes:
$ while read line; do echo "Read: $line"; done <<< "$keys"
Read: key one
Read: key two
$ # Done \o/
In this post, you have learned:
to use more quotes,
to use more quotes,
to use more quotes,
to use more quotes,
to use more quotes,
and to use more quotes.
Now remember,
Each time you forget quotes, God kills a kitten
This is happening because of the \r in your echo command.
It can be fixed by replacing it with:
echo "$line"
It is also important to note that when referencing the variable you should use "$keys" to keep its format.
All together:
while read -r line
do
echo -e "$line"
done <<< "$keys"
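As background on the overwriting effect: a carriage return moves the cursor back to column 1 without starting a new line, so later output can draw over earlier output in the terminal. A small demonstration (my own example):
printf 'hello world\rHI\n'    # the terminal shows: HIllo world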
I'm trying to read file line by line in bash.
Every line has the following format: text|number.
I want to produce a file in the format text,text,text etc., so the new file would contain just the text parts from the previous file, separated by commas.
Here is what I've tried but couldn't get to work:
FILENAME=$1
OLD_IFS=$IFS
IFS=$'\n'
i=0
for line in $(cat "$FILENAME"); do
array=(`echo $line | sed -e 's/|/,/g'`)
echo ${array[0]}
i=i+1;
done
IFS=$OLD_IFS
But this prints both the text and the number, just in a different format: text number
here is sample input :
dsadadq-2321dsad-dasdas|4212
dsadadq-2321dsad-d22as|4322
here is sample output:
dsadadq-2321dsad-dasdas,dsadadq-2321dsad-d22as
What did I do wrong?
Not pure bash, but you could do this in awk:
awk -F'|' 'NR>1{printf(",")} {printf("%s",$1)}'
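Applied to the sample input from the question, the usage would look like this, assuming the input is in a file named file as in the sed answer below (note that the one-liner does not print a trailing newline):
awk -F'|' 'NR>1{printf(",")} {printf("%s",$1)}' file
# dsadadq-2321dsad-dasdas,dsadadq-2321dsad-d22as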
Alternately, in pure bash and without having to strip the final comma:
#!/bin/bash
# You can get your input from somewhere else if you like. Even stdin to the script.
input=$'dsadadq-2321dsad-dasdas|4212\ndsadadq-2321dsad-d22as|4322\n'
# Output should be reset to empty, for safety.
output=""
# Step through our input. (I don't know your column names.)
while IFS='|' read left right; do
# Only add a field if it exists. Salt to taste.
if [[ -n "$left" ]]; then
# Append data to output string
output="${output:+$output,}$left"
fi
done <<< "$input"
echo "$output"
No need for arrays and sed:
while IFS='' read line ; do
echo -n "${line%|*}",
done < "$FILENAME"
You just have to remove the last comma :-)
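If you would rather not strip the comma afterwards, one variation on the same idea (a sketch, not the original answer) is to collect the fields in a variable and drop the final comma with parameter expansion:
output=""
while IFS='' read -r line ; do
output+="${line%|*},"
done < "$FILENAME"
echo "${output%,}"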
Using sed:
$ sed ':a;N;$!ba;s/|[0-9]*\n*/,/g;s/,$//' file
dsadadq-2321dsad-dasdas,dsadadq-2321dsad-d22as
Alternatively, here is a bit more readable sed with tr:
$ sed 's/|.*$/,/g' file | tr -d '\n' | sed 's/,$//'
dsadadq-2321dsad-dasdas,dsadadq-2321dsad-d22as
Choroba has the best answer (imho) except that it does not handle blank lines and it adds a trailing comma. Also, mucking with IFS is unnecessary.
This is a modification of his answer that solves those problems:
while read line ; do
    if [ -n "$line" ]; then
        if [ -n "$afterfirst" ]; then echo -n ,; fi
        afterfirst=1
        echo -n "${line%|*}"
    fi
done < "$FILENAME"
The first if is just to filter out blank lines. The second if and the $afterfirst stuff is just to prevent the extra comma: it echoes a comma before every entry except the first one. ${line%|*} is bash parameter expansion that deletes the end of a parameter if it matches some expression. line is the parameter, % is the symbol that indicates a trailing pattern should be deleted, and |* is the pattern to delete.
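A tiny demonstration of that expansion, using a sample line from the question:
line='dsadadq-2321dsad-dasdas|4212'
echo "${line%|*}"     # shortest |* match removed from the end: dsadadq-2321dsad-dasdas
echo "${line%%|*}"    # %% would remove the longest match; same result here, since there is only one |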