Read lines from a file and output with specific formatting with Bash - bash

In A.csv, there are
1
2
3
4
How should I read this file and create variables $B and $C so that:
echo $B
echo $C
returns:
1 2 3 4
1,2,3,4
So far I am trying:
cat A.csv | while read A;
do
echo $A
done
It only returns
1
2
3
4

Assuming bash 4.x, the following is efficient, robust, and native:
# Read each line of A.csv into a separate element of the array lines
readarray -t lines <A.csv
# Generate a string B with a comma after each item in the array
printf -v B '%s,' "${lines[#]}"
# Prune the last comma from that string
B=${B%,}
# Generate a string C with a space after each item in the array
printf -v B '%s ' "${lines[#]}"

As #Cyrus said
B=$(cat A.csv)
echo $B
Will output:
1 2 3 4
Because bash will not carry the newlines if the variable is not wrapped in quotes. This is dangerous if A.csv contains any characters which might be affected by bash glob expansion, but should be fine if you are just reading simple strings.
If you are reading simple strings with no spaces in any of the elements, you can also get your desired result for $C by using:
echo $B | tr ' ' ','
This will output:
1,2,3,4
If lines in A.csv may contain bash special characters or spaces then we return to the loop.
For why I've formatted the file reading loop as I have, refer to: Looping through the content of a file in Bash?
B=''
C=''
while read -u 7 curr_line; do
if [ "$B$C" == "" ]; then
B="$curr_line"
C="$curr_line"
else
B="$B $curr_line"
C="$C,$curr_line"
fi
done 7<A.csv
echo "$B"
echo "$C"
Will construct the two variables as you desire using a loop through the file contents and should prevent against unwanted globbing and splitting.

B=$(cat A.csv)
echo $B
Output:
1 2 3 4
With quotes:
echo "$B"
Output:
1
2
3
4

I would read the file into a bash array:
mapfile -t array < A.csv
Then, with various join characters
b="${array[*]}" # space is the default
echo "$b"
c=$( IFS=","; echo "${array[*]}" )
echo "$c"
Or, you can use paste to join all the lines with a specified separator:
b=$( paste -d" " -s < A.csv )
c=$( paste -d"," -s < A.csv )

Try this :
cat A.csv | while read A;
do
printf "$A"
done
Regards!

Try This(Simpler One):
b=$(tr '\n' ' ' < file)
c=$(tr '\n' ',' < file)
You don't have to read File for that. Make sure you ran dos2unix file command. If you are running in windows(to remove \r).
Note: It will modify the file. So, make sure you copied from original file.

Related

Output a file in two columns in BASH

I'd like to rearrange a file in two columns after the nth line.
For example, say I have a file like this here:
This is a bunch
of text
that I'd like to print
as two
columns starting
at line number 7
and separated by four spaces.
Here are some
more lines so I can
demonstrate
what I'm talking about.
And I'd like to print it out like this:
This is a bunch and separated by four spaces.
of text Here are some
that I'd like to print more lines so I can
as two demonstrate
columns starting what I'm talking about.
at line number 7
How could I do that with a bash command or function?
Actually, pr can do almost exactly this:
pr --output-tabs=' 1' -2 -t tmp1
↓
This is a bunch and separated by four spaces.
of text Here are some
that I'd like to print more lines so I can
as two demonstrate
columns starting what I'm talking about.
at line number 7
-2 for two columns; -t to omit page headers; and without the --output-tabs=' 1', it'll insert a tab for every 8 spaces it added. You can also set the page width and length (if your actual files are much longer than 100 lines); check out man pr for some options.
If you're fixed upon “four spaces more than the longest line on the left,” then perhaps you might have to use something a bit more complex;
The following works with your test input, but is getting to the point where the correct answer would be, “just use Perl, already;”
#!/bin/sh
infile=${1:-tmp1}
longest=$(longest=0;
head -n $(( $( wc -l $infile | cut -d ' ' -f 1 ) / 2 )) $infile | \
while read line
do
current="$( echo $line | wc -c | cut -d ' ' -f 1 )"
if [ $current -gt $longest ]
then
echo $current
longest=$current
fi
done | tail -n 1 )
pr -t -2 -w$(( $longest * 2 + 6 )) --output-tabs=' 1' $infile
↓
This is a bunch and separated by four spa
of text Here are some
that I'd like to print more lines so I can
as two demonstrate
columns starting what I'm talking about.
at line number 7
… re-reading your question, I wonder if you meant that you were going to literally specify the nth line to the program, in which case, neither of the above will work unless that line happens to be halfway down.
Thank you chatraed and BRPocock (and your colleague). Your answers helped me think up this solution, which answers my need.
function make_cols
{
file=$1 # input file
line=$2 # line to break at
pad=$(($3-1)) # spaces between cols - 1
len=$( wc -l < $file )
max=$(( $( wc -L < <(head -$(( line - 1 )) $file ) ) + $pad ))
SAVEIFS=$IFS;IFS=$(echo -en "\n\b")
paste -d" " <( for l in $( cat <(head -$(( line - 1 )) $file ) )
do
printf "%-""$max""s\n" $l
done ) \
<(tail -$(( len - line + 1 )) $file )
IFS=$SAVEIFS
}
make_cols tmp1 7 4
Could be optimized in many ways, but does its job as requested.
Input data (configurable):
file
num of rows borrowed from file for the first column
num of spaces between columns
format.sh:
#!/bin/bash
file=$1
if [[ ! -f $file ]]; then
echo "File not found!"
exit 1
fi
spaces_col1_col2=4
rows_col1=6
rows_col2=$(($(cat $file | wc -l) - $rows_col1))
IFS=$'\n'
ar1=($(head -$rows_col1 $file))
ar2=($(tail -$rows_col2 $file))
maxlen_col1=0
for i in "${ar1[#]}"; do
if [[ $maxlen_col1 -lt ${#i} ]]; then
maxlen_col1=${#i}
fi
done
maxlen_col1=$(($maxlen_col1+$spaces_col1_col2))
if [[ $rows_col1 -lt $rows_col2 ]]; then
rows=$rows_col2
else
rows=$rows_col1
fi
ar=()
for i in $(seq 0 $(($rows-1))); do
line=$(printf "%-${maxlen_col1}s\n" ${ar1[$i]})
line="$line${ar2[$i]}"
ar+=("$line")
done
printf '%s\n' "${ar[#]}"
Output:
$ > bash format.sh myfile
This is a bunch and separated by four spaces.
of text Here are some
that I'd like to print more lines so I can
as two demonstrate
columns starting what I'm talking about.
at line number 7
$ >

Correctly count number of lines a bash variable

I need to count the number of lines of a given variable. For example I need to find how many lines VAR has, where VAR=$(git log -n 10 --format="%s").
I tried with echo "$VAR" | wc -l), which indeed works, but if VAR is empty, is prints 1, which is wrong. Is there a workaround for this? Something better than using an if clause to check whether the variable is empty...(maybe add a line and subtract 1 from the returned value?).
The wc counts the number of newline chars. You can use grep -c '^' for counting lines.
You can see the difference with:
#!/bin/bash
count_it() {
echo "Variablie contains $2: ==>$1<=="
echo -n 'grep:'; echo -n "$1" | grep -c '^'
echo -n 'wc :'; echo -n "$1" | wc -l
echo
}
VAR=''
count_it "$VAR" "empty variable"
VAR='one line'
count_it "$VAR" "one line without \n at the end"
VAR='line1
'
count_it "$VAR" "one line with \n at the end"
VAR='line1
line2'
count_it "$VAR" "two lines without \n at the end"
VAR='line1
line2
'
count_it "$VAR" "two lines with \n at the end"
what produces:
Variablie contains empty variable: ==><==
grep:0
wc : 0
Variablie contains one line without \n at the end: ==>one line<==
grep:1
wc : 0
Variablie contains one line with \n at the end: ==>line1
<==
grep:1
wc : 1
Variablie contains two lines without \n at the end: ==>line1
line2<==
grep:2
wc : 1
Variablie contains two lines with \n at the end: ==>line1
line2
<==
grep:2
wc : 2
You can always write it conditionally:
[ -n "$VAR" ] && echo "$VAR" | wc -l || echo 0
This will check whether $VAR has contents and act accordingly.
For a pure bash solution: instead of putting the output of the git command into a variable (which, arguably, is ugly), put it in an array, one line per field:
mapfile -t ary < <(git log -n 10 --format="%s")
Then you only need to count the number of fields in the array ary:
echo "${#ary[#]}"
This design will also make your life simpler if, e.g., you need to retrieve the 5th commit message:
echo "${ary[4]}"
try:
echo "$VAR" | grep ^ | wc -l

omit commas from echo output in bash

Hi I am reading in a line from a .csv file and using
echo $line
to print the cell contents of that record to the screen, however the commas are also printed i.e.
1,2,3,a,b,c
where I actually want
1 2 3 a b c
checking the echo man page there isn't an option to omit commas, so does anyone have a nifty bash trick to do this?
Use bash replacement:
$ echo "${line//,/ }"
1 2 3 a b c
Note the importance of double slash:
$ echo "${line/,/ }"
1 2,3,a,b,c
That is, single one would just replace the first occurrence.
For completeness, check other ways to do it:
$ sed 's/,/ /g' <<< "$line"
1 2 3 a b c
$ tr ',' ' ' <<< "$line"
1 2 3 a b c
$ awk '{gsub(",", " ")}1' <<< "$line"
1 2 3 a b c
If you need something more POSIX-compliant due to portability concerns, echo "$line" | tr ',' ' ' works too.
If you have to use the field values as separated values, can be useful to use the IFS built-in bash variable.
You can set it with "," value in order to specify the field separator for read command used to read from .csv file.
ORIG_IFS="$IFS"
IFS=","
while read f1 f2 f3 f4 f5 f6
do
echo "Follow fields of record as separated variables"
echo "f1: $f1"
echo "f2: $f2"
echo "f3: $f3"
echo "f4: $f4"
echo "f5: $f5"
echo "f6: $f6"
done < test.csv
IFS="$OLDIFS"
On this way you have one variable for each field of the line/record and you can use it as you prefer.
NOTE: to avoid unexpected behaviour, remember to set the original value to IFS variable

BASH - Reading Multiple Lines from Text File

i am trying to read a text file, say file.txt and it contains multiple lines.
say the output of file.txt is
$ cat file.txt
this is line 1
this is line 2
this is line 3
I want to store the entire output as a variable say, $text.
When the variable $text is echoed, the expected output is:
this is line 1 this is line 2 this is line 3
my code is as follows
while read line
do
test="${LINE}"
done < file.txt
echo $test
the output i get is always only the last line. Is there a way to concatenate the multiple lines in file.txt as one long string?
You can translate the \n(newline) to (space):
$ text=$(tr '\n' ' ' <file.txt)
$ echo $text
this is line 1 this is line 2 this is line 3
If lines ends with \r\n, you can do this:
$ text=$(tr -d '\r' <file.txt | tr '\n' ' ')
Another one:
line=$(< file.txt)
line=${line//$'\n'/ }
test=$(cat file.txt | xargs)
echo $test
You have to append the content of the next line to your variable:
while read line
do
test="${test} ${LINE}"
done < file.txt
echo $test
Resp. even simpler you could simply read the full file at once into the variable:
test=$(cat file.txt)
resp.
test=$(tr "\n" " " < file.txt)
If you would want to keep the newlines it would be as simple as:
test=<file.txt
I believe it's the simplest method:
text=$(echo $(cat FILE))
But it doesn't preserve multiple spaces/tabs between words.
Use arrays
#!/bin/bash
while read line
do
a=( "${a[#]}" "$line" )
done < file.txt
echo -n "${a[#]}"
output:
this is line 1 this is line 2 this is line 3
See e.g. tldp section on arrays

File content into unix variable with newlines

I have a text file test.txt with the following content:
text1
text2
And I want to assign the content of the file to a UNIX variable, but when I do this:
testvar=$(cat test.txt)
echo $testvar
the result is:
text1 text2
instead of
text1
text2
Can someone suggest me a solution for this?
The assignment does not remove the newline characters, it's actually the echo doing this. You need simply put quotes around the string to maintain those newlines:
echo "$testvar"
This will give the result you want. See the following transcript for a demo:
pax> cat num1.txt ; x=$(cat num1.txt)
line 1
line 2
pax> echo $x ; echo '===' ; echo "$x"
line 1 line 2
===
line 1
line 2
The reason why newlines are replaced with spaces is not entirely to do with the echo command, rather it's a combination of things.
When given a command line, bash splits it into words according to the documentation for the IFS variable:
IFS: The Internal Field Separator that is used for word splitting after expansion ... the default value is <space><tab><newline>.
That specifies that, by default, any of those three characters can be used to split your command into individual words. After that, the word separators are gone, all you have left is a list of words.
Combine that with the echo documentation (a bash internal command), and you'll see why the spaces are output:
echo [-neE] [arg ...]: Output the args, separated by spaces, followed by a newline.
When you use echo "$x", it forces the entire x variable to be a single word according to bash, hence it's not split. You can see that with:
pax> function count {
...> echo $#
...> }
pax> count 1 2 3
3
pax> count a b c d
4
pax> count $x
4
pax> count "$x"
1
Here, the count function simply prints out the number of arguments given. The 1 2 3 and a b c d variants show it in action.
Then we try it with the two variations on the x variable. The one without quotes shows that there are four words, "test", "1", "test" and "2". Adding the quotes makes it one single word "test 1\ntest 2".
This is due to IFS (Internal Field Separator) variable which contains newline.
$ cat xx1
1
2
$ A=`cat xx1`
$ echo $A
1 2
$ echo "|$IFS|"
|
|
A workaround is to reset IFS to not contain the newline, temporarily:
$ IFSBAK=$IFS
$ IFS=" "
$ A=`cat xx1` # Can use $() as well
$ echo $A
1
2
$ IFS=$IFSBAK
To REVERT this horrible change for IFS:
IFS=$IFSBAK
Bash -ge 4 has the mapfile builtin to read lines from the standard input into an array variable.
help mapfile
mapfile < file.txt lines
printf "%s" "${lines[#]}"
mapfile -t < file.txt lines # strip trailing newlines
printf "%s\n" "${lines[#]}"
See also:
http://bash-hackers.org/wiki/doku.php/commands/builtin/mapfile
Your variable is set correctly by testvar=$(cat test.txt). To display this variable which consist new line characters, simply add double quotes, e.g.
echo "$testvar"
Here is the full example:
$ printf "test1\ntest2" > test.txt
$ testvar=$(<test.txt)
$ grep testvar <(set)
testvar=$'test1\ntest2'
$ echo "$testvar"
text1
text2
$ printf "%b" "$testvar"
text1
text2
Just if someone is interested in another option:
content=( $(cat test.txt) )
a=0
while [ $a -le ${#content[#]} ]
do
echo ${content[$a]}
a=$[a+1]
done
The envdir utility provides an easy way to do this. envdir uses files to represent environment variables, with file names mapping to env var names, and file contents mapping to env var values. If the file contents contain newlines, so will the env var.
See https://pypi.python.org/pypi/envdir

Resources