BASH - Reading Multiple Lines from Text File - bash

I am trying to read a text file, say file.txt, that contains multiple lines.
Say the content of file.txt is:
$ cat file.txt
this is line 1
this is line 2
this is line 3
I want to store the entire output as a variable say, $text.
When the variable $text is echoed, the expected output is:
this is line 1 this is line 2 this is line 3
My code is as follows:
while read line
do
test="${line}"
done < file.txt
echo $test
The output I get is always only the last line. Is there a way to concatenate the multiple lines in file.txt into one long string?

You can translate each \n (newline) to a space:
$ text=$(tr '\n' ' ' <file.txt)
$ echo $text
this is line 1 this is line 2 this is line 3
If lines end with \r\n, you can do this:
$ text=$(tr -d '\r' <file.txt | tr '\n' ' ')

Another one:
line=$(< file.txt)
line=${line//$'\n'/ }

test=$(cat file.txt | xargs)
echo $test

You have to append the content of the next line to your variable:
while read line
do
test="${test} ${line}"
done < file.txt
echo $test
Or, even simpler, you could read the full file at once into the variable:
test=$(cat file.txt)
or
test=$(tr "\n" " " < file.txt)
If you want to keep the newlines, it is as simple as:
test=$(<file.txt)
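The append loop above can be sketched a little more defensively. This is only an illustration: the temporary file stands in for file.txt, and the names tmpfile and joined are made up for the example.

```bash
#!/usr/bin/env bash
# Hypothetical sample file standing in for file.txt.
tmpfile=$(mktemp)
printf 'this is line 1\nthis is line 2\nthis is line 3\n' > "$tmpfile"

joined=""
# IFS= and -r keep leading whitespace and backslashes intact.
while IFS= read -r line; do
    # ${joined:+$joined } expands to "$joined " only when joined is non-empty,
    # so the result has no leading space.
    joined="${joined:+$joined }$line"
done < "$tmpfile"

printf '%s\n' "$joined"
rm -f "$tmpfile"
```

Because the loop is fed by a redirection rather than a pipe, it runs in the current shell and joined is still set after done.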

I believe it's the simplest method:
text=$(echo $(cat FILE))
But it doesn't preserve multiple spaces/tabs between words.
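To illustrate the caveat, here is a small sketch comparing the unquoted echo with a tr-based join. The sample text and the variable names raw, collapsed and preserved are assumptions made for the example.

```bash
#!/usr/bin/env bash
# Hypothetical input: three spaces, then a tab, then a newline between words.
raw=$(printf 'a   b\tc\nd')

# Unquoted expansion: word splitting breaks the text on any whitespace,
# and echo rejoins the words with single spaces.
collapsed=$(echo $raw)

# tr only replaces the newline; spaces and tabs inside a line survive.
preserved=$(printf '%s' "$raw" | tr '\n' ' ')

printf '[%s]\n' "$collapsed"
printf '[%s]\n' "$preserved"
```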

Use arrays
#!/bin/bash
while read line
do
a=( "${a[@]}" "$line" )
done < file.txt
echo -n "${a[@]}"
output:
this is line 1 this is line 2 this is line 3
See e.g. tldp section on arrays

Related

Read lines from a file and output with specific formatting with Bash

In A.csv, there are
1
2
3
4
How should I read this file and create variables $B and $C so that:
echo $B
echo $C
returns:
1 2 3 4
1,2,3,4
So far I am trying:
cat A.csv | while read A;
do
echo $A
done
It only returns
1
2
3
4
Assuming bash 4.x, the following is efficient, robust, and native:
# Read each line of A.csv into a separate element of the array lines
readarray -t lines <A.csv
# Generate a string B with a comma after each item in the array
printf -v B '%s,' "${lines[@]}"
# Prune the last comma from that string
B=${B%,}
# Generate a string C with a space after each item in the array
printf -v C '%s ' "${lines[@]}"
As @Cyrus said:
B=$(cat A.csv)
echo $B
Will output:
1 2 3 4
This works because the unquoted $B undergoes word splitting: bash splits the value on whitespace, including newlines, and echo joins the resulting words with single spaces. This is dangerous if A.csv contains any characters which might be affected by bash glob expansion, but should be fine if you are just reading simple strings.
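The globbing risk can be seen in a small sketch. The scratch directory, the two file names and the value of B are all invented for the demonstration.

```bash
#!/usr/bin/env bash
# Hypothetical scratch directory containing two files.
workdir=$(mktemp -d)
touch "$workdir/one.txt" "$workdir/two.txt"

B='*'   # pretend this value was read from A.csv

# Unquoted: the * is expanded to the file names in the current directory.
unquoted=$(cd "$workdir" && echo $B)
# Quoted: the value is printed as-is.
quoted=$(cd "$workdir" && echo "$B")

printf 'unquoted: %s\n' "$unquoted"
printf 'quoted:   %s\n' "$quoted"
rm -rf "$workdir"
```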
If you are reading simple strings with no spaces in any of the elements, you can also get your desired result for $C by using:
echo $B | tr ' ' ','
This will output:
1,2,3,4
If lines in A.csv may contain bash special characters or spaces then we return to the loop.
For why I've formatted the file reading loop as I have, refer to: Looping through the content of a file in Bash?
B=''
C=''
while read -u 7 curr_line; do
if [ "$B$C" == "" ]; then
B="$curr_line"
C="$curr_line"
else
B="$B $curr_line"
C="$C,$curr_line"
fi
done 7<A.csv
echo "$B"
echo "$C"
This constructs the two variables as you desire by looping through the file contents, and should protect against unwanted globbing and splitting.
B=$(cat A.csv)
echo $B
Output:
1 2 3 4
With quotes:
echo "$B"
Output:
1
2
3
4
I would read the file into a bash array:
mapfile -t array < A.csv
Then, with various join characters
b="${array[*]}" # space is the default
echo "$b"
c=$( IFS=","; echo "${array[*]}" )
echo "$c"
Or, you can use paste to join all the lines with a specified separator:
b=$( paste -d" " -s < A.csv )
c=$( paste -d"," -s < A.csv )
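The IFS trick above can be wrapped in a small function so that the change to IFS stays local. This is a sketch only: join_by is a hypothetical helper name, and the array is filled by hand with the values that mapfile -t array < A.csv would produce for the sample file.

```bash
#!/usr/bin/env bash
# join_by SEP ITEM...  (hypothetical helper)
# Joins the items using the first character of SEP.
join_by() {
    local IFS="$1"   # "$*" joins the arguments with the first character of IFS
    shift
    printf '%s\n' "$*"
}

# Stand-in for: mapfile -t array < A.csv
array=(1 2 3 4)

b=$(join_by ' ' "${array[@]}")
c=$(join_by ',' "${array[@]}")
echo "$b"
echo "$c"
```

Declaring IFS with local means the caller's IFS is untouched, which avoids the save-and-restore dance.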
Try this:
cat A.csv | while read A;
do
printf '%s ' "$A"
done
Try this (a simpler one):
b=$(tr '\n' ' ' < file)
c=$(tr '\n' ',' < file)
You don't need a read loop for that. If the file came from Windows, run dos2unix file first to remove the \r characters.
Note: dos2unix modifies the file in place, so work on a copy of the original.

How to separate a line into an array with white space in shell scripting

I can't figure out why my script is not displaying the string separated by white space.
This is my code:
While read -r row
do
line = ($row)
for word in $line
do
echo ${word[0]}
done
done < $1
Say the line is "add $s0 $s0 $t1".
I want the output to be "add".
While read -r row
This will try to run a command called While, you'll probably get an error for that. The shell keyword is while.
do
line = ($row)
This will try to run a command called line (some systems ship a line utility that reads one line), which is probably not what you want. Assignments in the shell must not have whitespace around the equal sign.
If that assignment worked, it would make an array called line.
for word in $line
Referencing the array just by name expands to the first item of it, so the loop is useless here.
do
echo ${word[0]}
And here, indexing is not very useful since word is going to be a single value, not an array.
I suspect what you want is this:
while read -r row ; do
words=($row);
echo "${words[0]}"
done
Though if $row contains glob characters like *, they'll be expanded to matching filenames.
This would be better:
read -r -a words
echo "${words[0]}"
or simply
read -r line
echo "${line%% *}" # remove everything after the first space
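Put together with a loop, the read -r -a variant might look like the sketch below. The here document stands in for the input file, and the second sample line (with a bare *) and the name first_words are made up to show that no globbing takes place.

```bash
#!/usr/bin/env bash
first_words=()

# read -a splits each line into the array words without pathname expansion.
while read -r -a words; do
    # A blank line yields an empty array; skip it.
    [ "${#words[@]}" -gt 0 ] || continue
    first_words+=("${words[0]}")
done <<'EOF'
add $s0 $s0 $t1

sub $t2 * $t3
EOF

printf '%s\n' "${first_words[@]}"
```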
This works fine:
while read -r row
do
echo "$row" | awk '{print $1}'
done
while read -r row reads a line from standard input (here, user input) and stores it in the row variable; awk '{print $1}' displays only the first word of that line.
Do you want each token on a separate line? Why not just use sed?
$ echo "1 2 3 hi" | sed -r 's/[ \t]+/\n/g'
1
2
3
hi
If you want the first word of each line, then:
$ echo "1 2 3 hi" | sed -r 's/^([^ \t]+).+/\1/'
1
If it's a file, then remove "echo ... | " and just give the filename as a parameter to sed:
$ sed -r 's/^([^ \t]+).+/\1/' file.txt

bash: what is the difference between "done < foo", "done << foo" and "done <<< foo" when closing a loop?

In a bash script, I see several while statements with those redirect signs when closing the loop.
I know that if I end it with "done < file", I am redirecting the file to the stdin of the command in the while statement. But what do the others mean?
I would appreciate it if someone could give an explanation with examples.
With the file text.txt
1aa
2bb
3cc
Redirection:
$ cat < text.txt
1aa
2bb
3cc
Here document:
$ cat << EOF
> 1AA
> 2BB
> EOF
1AA
2BB
Here string:
$ cat <<< 1aaa
1aaa
The first form, <, is an input redirection. It is somewhat different from << and <<<, which are two variants of a here document.
The first form, <, is primarily used to redirect the contents of a file to a command or process. The shell opens the file and connects it to the command's standard input before the command runs.
cmd < file
will open the file named file for reading on the standard input of cmd. The difference between cmd file and cmd < file is that in the first case cmd receives the file name as an argument and opens the file itself, while in the second case the shell opens the file and cmd only sees its contents on stdin, without being given the name.
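One way to observe the difference between cmd file and cmd < file is with wc -l, which reports the file name only when it opened the file itself. A minimal sketch, using a temporary file with the same three lines as text.txt; the variable names are made up for the example.

```bash
#!/usr/bin/env bash
tmpfile=$(mktemp)
printf '1aa\n2bb\n3cc\n' > "$tmpfile"

# wc receives the name as an argument, opens the file itself, and prints the name.
with_arg=$(wc -l "$tmpfile")
# The shell opens the file; wc reads anonymous stdin and prints only the count.
with_stdin=$(wc -l < "$tmpfile")

printf 'argument: %s\n' "$with_arg"
printf 'stdin:    %s\n' "$with_stdin"
rm -f "$tmpfile"
```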
You can also do process substitution:
cmd <(process)
An example use would be comparing two directories:
diff <(ls dir1) <(ls dir2)
In this case, the output of the commands ls dir1 and ls dir2 is redirected to file-like streams that diff then reads as if they were two files.
You can see the name of the file device by passing to echo a process substitution:
$ echo <(ls)
/dev/fd/63
Since echo does not open files, it just prints the name it was given.
Here documents are easier to demonstrate. The << form has a 'limit string' that is not included in the output:
$ cat <<HERE
> line 1
> line 2
> line 3
> HERE
line 1
line 2
line 3
The HERE is a unique string that must be on its own line.
The 'here string' or <<< form does not require the delimiting string of the << form and is on a single line:
$ cat <<< 'line 1'
line 1
You can also expand parameters:
$ v="some text"
$ cat <<< "$v"
some text
But not every other form of shell expansion. Brace expansion, for example, is not performed:
$ echo a{b,c,d}e
abe ace ade
$ cat <<< a{b,c,d}e
a{b,c,d}e
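As a quick check of which expansions a here string performs, the sketch below captures the result of a few cases. The variable names are invented for the example.

```bash
#!/usr/bin/env bash
v="some text"

param=$(cat <<< "$v")                              # parameter expansion
cmdsub=$(cat <<< "$(printf 'from a command')")     # command substitution
arith=$(cat <<< "$((2 + 3))")                      # arithmetic expansion
brace=$(cat <<< a{b,c,d}e)                         # brace expansion is NOT performed

printf '%s\n' "$param" "$cmdsub" "$arith" "$brace"
```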
Given a 'generic' Bash while loop that reads input line by line:
while IFS= read -r line || [[ -n $line ]]; do printf "'%s'\n" "$line"; done
There are several ways that you can feed input into that loop.
First example, you can redirect a file. For demo, create a 6 line file:
$ seq 6 > /tmp/6.txt
Redirect the input of the file into the loop:
while IFS= read -r line || [[ -n $line ]]; do printf "'%s'\n" "$line"; done </tmp/6.txt
'1'
'2'
'3'
'4'
'5'
'6'
Or, second example, you can read directly from the output of seq using process substitution and redirection:
$ while IFS= read -r line || [[ -n $line ]]; do printf "'%s'\n" "$line"; done < <(seq 3)
'1'
'2'
'3'
(Please note the extra < with a space for this form)
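Or, third example, you can use a 'HERE' doc, with lines separated by newlines. In this input the first line ends with a space and the third line starts with one; IFS= preserves both:
while IFS= read -r line || [[ -n $line ]]; do printf "'%s'\n" "$line"; done <<HERE
1 
2 3
 4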
HERE
'1 '
'2 3'
' 4'
Going back to diff, which only works on files: you can use process substitution together with a HERE doc or with redirection to run diff on free text or on the output of a program.
Given:
$ cat /tmp/f1.txt
line 1
line 2
line 3
Normally you would need to have a second file to compare free text with that file. You can use a HERE doc and process substitution to skip creating a separate file:
$ diff /tmp/f1.txt <(cat <<HERE
line 1
line 2
line 5
HERE
)
3c3
< line 3
---
> line 5
command < foo
Redirect the file foo to the standard input of command.
command << foo
blah 1
blah 2
foo
Here document: send the following lines up to foo to the standard input of command.
command <<< foo
Here-string. The string foo is sent to the standard input of command.
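The three forms can be compared by feeding the same loop from each of them. In this sketch, count_lines is a hypothetical helper that counts the lines arriving on its standard input and leaves the result in the global variable count.

```bash
#!/usr/bin/env bash
# count_lines (hypothetical): counts lines on stdin, stores the result in $count.
count_lines() {
    local line
    count=0
    while IFS= read -r line || [[ -n $line ]]; do
        count=$((count + 1))
    done
}

tmpfile=$(mktemp)
printf 'a\nb\nc\n' > "$tmpfile"

# 1. Input redirection: stdin is the file.
count_lines < "$tmpfile"
from_file=$count

# 2. Here document: stdin is the text up to the delimiter.
count_lines <<EOF
x
y
EOF
from_heredoc=$count

# 3. Here string: stdin is the single string plus a trailing newline.
count_lines <<< "single line"
from_herestring=$count

echo "$from_file $from_heredoc $from_herestring"
rm -f "$tmpfile"
```

Since none of the three forms uses a pipe, the function runs in the current shell each time and count is visible afterwards.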

Bash : How to check in a file if there are any word duplicates

I have a file with 6 character words in every line and I want to check if there are any duplicate words. I did the following but something isn't right:
#!/bin/bash
while read line
do
name=$line
d=$( grep '$name' chain.txt | wc -w )
if [ $d -gt '1' ]; then
echo $d $name
fi
done <$1
Assuming each word is on a new line, you can achieve this without looping:
$ cat chain.txt | sort | uniq -c | grep -v " 1 " | cut -c9-
You can use awk for that:
awk -F'\n' 'found[$1] {print}; {found[$1]++}' chain.txt
Set the field separator to newline, so that we look at the whole line. Then, if the line already exists in the array found, print the line. Finally, add the line to the found array.
Note: only the first occurrence of a line is suppressed, so if the same line appears, say, 6 times, it will be printed 5 times.
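If each duplicated word should be reported only once, two possible variations are sketched below. The sample words and the variable names are made up; the temporary file stands in for chain.txt.

```bash
#!/usr/bin/env bash
tmpfile=$(mktemp)
printf 'aaaaaa\nbbbbbb\naaaaaa\ncccccc\naaaaaa\nbbbbbb\n' > "$tmpfile"

# seen[$0]++ == 1 is true only on the second occurrence of a line,
# so every duplicated line is printed exactly once.
dups_awk=$(awk 'seen[$0]++ == 1' "$tmpfile")

# uniq -d prints one copy of each repeated line; it needs sorted input.
dups_sort=$(sort "$tmpfile" | uniq -d)

printf '%s\n' "$dups_awk"
rm -f "$tmpfile"
```

The awk version reports duplicates in order of their second appearance, while the sort version reports them in sorted order.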

Bash script get item from array

I'm trying to read a file line by line in bash.
Every line has the format text|number.
I want to produce a file with the format text,text,text etc., so the new file would contain just the text from the previous file, separated by commas.
Here is what I've tried and couldn't get to work:
FILENAME=$1
OLD_IFS=$IFS
IFS=$'\n'
i=0
for line in $(cat "$FILENAME"); do
array=(`echo $line | sed -e 's/|/,/g'`)
echo ${array[0]}
i=i+1;
done
IFS=$OLD_IFS
But this prints both the text and the number, just in a different format: text number
Here is sample input:
dsadadq-2321dsad-dasdas|4212
dsadadq-2321dsad-d22as|4322
Here is sample output:
dsadadq-2321dsad-dasdas,dsadadq-2321dsad-d22as
What did I do wrong?
Not pure bash, but you could do this in awk:
awk -F'|' 'NR>1{printf(",")} {printf("%s",$1)}'
Alternately, in pure bash and without having to strip the final comma:
#!/bin/bash
# You can get your input from somewhere else if you like. Even stdin to the script.
input=$'dsadadq-2321dsad-dasdas|4212\ndsadadq-2321dsad-d22as|4322\n'
# Output should be reset to empty, for safety.
output=""
# Step through our input. (I don't know your column names.)
while IFS='|' read left right; do
# Only add a field if it exists. Salt to taste.
if [[ -n "$left" ]]; then
# Append data to output string
output="${output:+$output,}$left"
fi
done <<< "$input"
echo "$output"
No need for arrays and sed:
while IFS='' read line ; do
echo -n "${line%|*}",
done < "$FILENAME"
You just have to remove the last comma :-)
Using sed:
$ sed ':a;N;$!ba;s/|[0-9]*\n*/,/g;s/,$//' file
dsadadq-2321dsad-dasdas,dsadadq-2321dsad-d22as
Alternatively, here is a bit more readable sed with tr:
$ sed 's/|.*$/,/g' file | tr -d '\n' | sed 's/,$//'
dsadadq-2321dsad-dasdas,dsadadq-2321dsad-d22as
Choroba has the best answer (imho) except that it does not handle blank lines and it adds a trailing comma. Also, mucking with IFS is unnecessary.
This is a modification of his answer that solves those problems:
while read line ; do
if [ -n "$line" ]; then
if [ -n "$afterfirst" ]; then echo -n ,; fi
afterfirst=1
echo -n "${line%|*}"
fi
done < "$FILENAME"
The first if just filters out blank lines. The second if and the $afterfirst variable just prevent the extra comma: the loop echoes a comma before every entry except the first one. ${line%|*} is a bash parameter expansion that deletes the end of a parameter's value if it matches a pattern. line is the parameter, % is the symbol indicating that a trailing match should be deleted, and |* is the pattern to delete.
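A few related expansions side by side may make the notation clearer. This is just a sketch: the first value is a line from the sample input, and the multi value is invented to show the difference between % and %%.

```bash
#!/usr/bin/env bash
line='dsadadq-2321dsad-dasdas|4212'

text=${line%|*}      # remove the shortest trailing match of "|*"
number=${line##*|}   # remove the longest leading match of "*|"

multi='a|b|c'        # hypothetical value with two separators
short=${multi%|*}    # shortest trailing match removed: a|b
long=${multi%%|*}    # longest trailing match removed:  a

printf '%s\n' "$text" "$number" "$short" "$long"
```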

Resources