How to display a file with multiple lines as a single string with escape chars (\n) - bash

In bash, how can I display the content of a multi-line file as a single string in which newlines appear as \n?
Example:
$ echo "line 1
line 2" >> file.txt
I need to get the content as "line 1\nline 2" using bash commands.
I tried combinations of cat/printf/echo with no success.

You can use bash's printf to get something close:
$ printf "%q" "$(< file.txt)"
$'line 1\nline 2'
and since bash 4.4 there is a parameter expansion operator that produces the same:
$ foo=$(<file.txt)
$ echo "${foo@Q}"
$'line 1\nline 2'
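If the $'...' wrapper is not wanted, the same idea can be sketched in pure bash with a pattern substitution. This is only an illustration: the function name escape_newlines and the temporary file are made up for the example, and $(< file) drops any trailing newlines, so a final \n is not shown.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a file's content on one line, with each
# newline shown as the two characters backslash + n.
escape_newlines() {
    local content
    content=$(< "$1")                      # trailing newlines are dropped here
    printf '%s\n' "${content//$'\n'/\\n}"  # replace every newline with a literal \n
}

tmp=$(mktemp)
printf 'line 1\nline 2\n' > "$tmp"         # recreate the sample file
escape_newlines "$tmp"                     # prints: line 1\nline 2
rm -f "$tmp"
```

If the trailing newline must be visible too, the sed -z variant further down is the better fit.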

$ cat file.txt
line 1
line 2
$ paste -s -d '~' file.txt | sed 's/~/\\n/g'
line 1\nline 2
The paste command joins all the lines of the file serially with a delimiter, say ~, and sed then replaces every ~ with \n. This assumes that ~ does not otherwise occur in the file.

Without '\n' after line 2 (you need echo -n to create the file):
echo -n "line 1
line 2" > file.txt
od -cv file.txt
0000000 l i n e 1 \n l i n e 2
sed -z 's/\n/\\n/g' file.txt
line 1\nline 2
With '\n' after line 2
echo "line 1
line 2" > file.txt
od -cv file.txt
0000000 l i n e 1 \n l i n e 2 \n
sed -z 's/\n/\\n/g' file.txt
line 1\nline 2\n

These tools can also display character codes:
$ hexdump -v -e '/1 "%_c"' file.txt ; echo
line 1\nline 2\n
$ od -vAn -tc file.txt
l i n e 1 \n l i n e 2 \n

You could try piping the text from stdin or a file through tr and replacing the newlines.
Try this:
cat file|tr '\n' ' '
where file is the name of the file containing the newlines. This returns all the text on a single line, with each newline replaced by a space.
If you want to write the result to a file, just redirect the output of the command, like this:
cat file|tr '\n' ' ' >> file2
Here is another example:
How to remove carriage return from a string in Bash

Related

How to add a space after special characters in bash script?

I have a text file with something like,
!aa
#bb
#cc
$dd
%ee
expected output is,
! aa
# bb
# cc
$ dd
% ee
What I have tried: echo "${foo//#/# }".
This works fine with one string, but it does not work for all the lines in the file. I have tried this while loop to read all the lines of the file and do the same using echo, but it does not work.
while IFS= read -r line; do
foo=$line
sep="!@#$%"
echo "${foo//$sep/$sep }"
done < $1
I have tried awk's split, but it does not give the expected output. Is there a workaround using awk or sed?
The following assumes you want to add a space after every character in the !@#$% set (even if it is the last character in a line). Test file:
$ cat file.txt
a!a
#bb
c#c
$dd
ee%
foo
%b%r
$ sep='!@#$%'
With sed:
$ sed 's/['"$sep"']/& /g' file.txt
a! a
# bb
c# c
$ dd
ee%
foo
% b% r
With awk:
$ awk '{gsub(/['"$sep"']/,"& "); print}' file.txt
a! a
# bb
c# c
$ dd
ee%
foo
% b% r
With plain bash (not recommended, it is too slow):
$ while IFS= read -r line; do
str=""
for (( i=0; i<${#line}; i++ )); do
char="${line:i:1}"
str="$str$char"
[[ "$char" =~ [$sep] ]] && str="$str "
done
printf '%s\n' "$str"
done < file.txt
a! a
# bb
c# c
$ dd
ee%
foo
% b% r
Or (not sure which is the worst):
$ while IFS= read -r line; do
for (( i=0; i<${#sep}; i++ )); do
char="${sep:i:1}"
line="${line//$char/$char }"
done
printf '%s\n' "$line"
done < file.txt
a! a
# bb
c# c
$ dd
ee%
foo
% b% r
The characters you call special in your example seem to be a subset of the characters GNU sed knows as [[:punct:]], so I propose the following solution:
sed 's/\([[:punct:]]\)/\1 /g' file.txt
which, with file.txt containing
!aa
#bb
#cc
$dd
%ee
outputs
! aa
# bb
# cc
$ dd
% ee
Explanation: I use a capturing group \(...\) that matches any character belonging to [:punct:], then I replace what was captured with the content of that capture followed by a space. I use g to apply it to all occurrences in each line, though this has no visible impact on the data above. You might elect to drop g if you are sure there will be at most one character to replace in every line.
If you want to know more about [:punct:] or other similar character sets read about Character Classes on Regular-Expressions.info
If the file always contains a symbol at the start of each line like that, then use this:
sed -Ei 's/^(.)/\1 /g' yourfile.txt
The -E option tells sed to use extended regular expressions. -i modifies the file in place; remove it if you want to output to the console or to another file. The ^(.) regex captures the first character on the line, and the replacement (\1 followed by a space) puts a space after it. Since ^ can match only once per line, the g flag has no effect here.
Assuming that special characters are non-numeric and non-alphabetic characters, and special characters can appear anywhere in the line, use the following regular expression to replace them.
sed 's/[^a-zA-Z0-9]/& /g' urfile
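The answers above use three different notions of "special character", and they do not behave the same on every input. A small sketch that runs all three on one made-up sample line (the sample text and variable names are only for illustration):

```shell
#!/usr/bin/env bash
# Compare the three character sets on one sample line.
sample='a!a b_c'

explicit=$(printf '%s\n' "$sample" | sed 's/[!#$%]/& /g')         # explicit list
punct=$(printf '%s\n' "$sample" | sed 's/[[:punct:]]/& /g')       # all punctuation
nonalnum=$(printf '%s\n' "$sample" | sed 's/[^a-zA-Z0-9]/& /g')   # anything not alphanumeric

printf '[%s]\n' "$explicit" "$punct" "$nonalnum"
# [a! a b_c]
# [a! a b_ c]
# [a! a  b_ c]
```

[[:punct:]] also matches the underscore, and the negated class additionally matches the space itself, which is why the last line contains a doubled space.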

How does grep handle DOS end of line?

I have a Windows text file which contains a line (with ending CRLF)
aline
The following is several commands' output:
[root@panel ~]# grep aline file.txt
aline
[root@panel ~]# grep aline$'\r' file.txt
[root@panel ~]# grep aline$'\r'$'\n' file.txt
[root@panel ~]# grep aline$'\n' file.txt
aline
The first command's output is normal. I'm curious about the second and the third output. Why is it an empty line? And the last output, I think it can not find the string but it actually finds it, why? The commands are run on CentOS/bash.
In this case grep really matches the string "aline\r" but you just don't see it because it was overwritten by the ANSI sequence that prints color. Pass the output to od -c and you'll see
$ grep aline file.txt
aline
$ grep aline$'\r' file.txt
$ grep aline$'\r' --color=never file.txt
aline
$ grep aline$'\r' --color=never file.txt | od -c
0000000 a l i n e \r \n
0000007
$ grep aline$'\r' --color=always file.txt | od -c
0000000 033 [ 0 1 ; 3 1 m 033 [ K a l i n e
0000020 \r 033 [ m 033 [ K \n
0000030
With --color=never you can see the output string because grep doesn't print the color sequences. \r simply moves the cursor back to the start of the line, and then a newline is printed; nothing is overwritten. By default, however, grep checks whether it is writing to a terminal or to a pipe and, if color is supported, prints the matched string in color. The color reset that follows \r includes \033[K (erase from the cursor to the end of the line), and that seems to be what clears the text before \n is printed.
To match \n you can use the -z option to make null bytes the line separator
$ grep -z aline$'\r'$'\n' --color=never file.txt
aline
$ grep -z aline$'\r'$'\n' --color=never file.txt | od -c
0000000 a l i n e \r \n \0
0000010
$ grep -z aline$'\r'$'\n' --color=always file.txt | od -c
0000000 033 [ 0 1 ; 3 1 m 033 [ K a l i n e
0000020 \r 033 [ m 033 [ K \n \0
0000031
Your last command grep aline$'\n' file.txt works because \n is simply a word separator in bash, so the command is just the same as grep aline file.txt. Exactly the same thing happened in the 3rd line: grep aline$'\r'$'\n' file.txt To pass a newline you must quote the argument to prevent word splitting
$ echo "aline" | grep -z "aline$(echo $'\n')"
aline
To demonstrate the effect of the quote with the 3rd line I added another line to the file
$ cat file.txt
aline
another line
$ grep -z "aline$(echo $'\n')" file.txt | od -c
0000000 a l i n e \r \n a n o t h e r l
0000020 i n e \n \0
0000025
$ grep -z "aline$(echo $'\n')" file.txt
aline
another line
$
If the input is not well-formed, the behavior is undefined.
In practice, some versions of GNU grep use CR for internal purposes, so attempting to match it does not work at all, or produces really bizarre results.
For not entirely different reasons, passing in a literal newline as part of the regular expression could have some odd interpretations, including, but not limited to, interpreting the argument as two separate patterns. (Look at how grep -F reads from a file, and imagine that at least some implementations use the same logic to parse the command line.)
In the grand scheme of things, the sane solution is to fix the input so it's a valid text file before attempting to run Unix line-oriented tools on it.
For quick and dirty solutions, some tools have well-defined semantics for random binary input. Perl is a model citizen in this respect.
bash$ perl -ne 'print if /aline\r$/' <<<$'aline\r'
aline
Awk also tends to work amicably, though there are several implementations, so the risk that somebody somewhere has a version which doesn't behave identically to AT&T Awk is higher.
Maybe notice also how \r is the last character before the end of the line (the DOS line ending is the sequence CR LF, where LF is the standard Unix line terminator for text files).
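As a sketch of the "fix the input first" approach recommended above, the carriage returns can be deleted with tr before grep ever sees the data. The helper name strip_cr and the sample data are made up for this example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: delete every carriage return from stdin.
# Note that this removes ALL CR bytes, not only those before a newline.
strip_cr() {
    tr -d '\r'
}

tmp=$(mktemp)
printf 'aline\r\nanother line\r\n' > "$tmp"   # sample file with DOS line endings

# -x requires the whole line to match, which only works once the CR is gone.
strip_cr < "$tmp" | grep -x 'aline'           # prints: aline
rm -f "$tmp"
```

If CR bytes in the middle of a line must be preserved, a substitution anchored to the end of the line would be needed instead of tr.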
At least for me phuclv's answer doesn't completely cover the last case, i.e. grep aline$'\n' file.txt.
Your mileage may vary depending on which shell and which version and implementation of grep you are using, but for me grep -z "aline$(echo $'\n')" and grep -z aline$'\n' both just match the same pattern as grep -z aline.
This becomes more apparent if the -o switch is used, so that grep outputs only the matched string and not the entire line (which is the entire file for most text files when the -z option is used).
If you use the same file.txt as in phuclv's second example:
$ cat file.txt
aline
another line
$ grep -z "aline$(echo $'\n')" file.txt | od -c
0000000 a l i n e \r \n a n o t h e r l
0000020 i n e \n \0
0000025
$ grep -z -o "aline$(echo $'\n')" file.txt | od -c
0000000 a l i n e \0
0000006
$ grep -z -o aline$'\n' file.txt | od -c
0000000 a l i n e \0
0000006
$ grep -z -o aline file.txt | od -c
0000000 a l i n e \0
0000006
To actually match a \n as part of the pattern I had to use the -P switch to turn on Perl-compatible regular expressions:
$ grep -z -o -P 'aline\r\n' file.txt | od -c
0000000 a l i n e \r \n \0
0000010
$ grep -z -o -P 'aline\r\nanother' file.txt | od -c
0000000 a l i n e \r \n a n o t h e r \0
0000017
For reference:
grep --version|head -n1
grep (GNU grep) 3.1
bash --version|head -n1
GNU bash, version 4.4.20(1)-release (x86_64-pc-linux-gnu)
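One reason "aline$(echo $'\n')" behaves like plain aline can be checked in bash itself: command substitution removes trailing newlines, so the newline printed by echo never reaches the pattern. A short check (variable names are only for illustration):

```shell
#!/usr/bin/env bash
# Command substitution strips trailing newlines from the captured output.
pattern="aline$(echo $'\n')"
printf '%s' "$pattern" | od -c        # shows only the five characters of "aline"

if [ "$pattern" = "aline" ]; then
    echo "pattern is just 'aline'"
fi

# Assigning the ANSI-C quoted string directly keeps the newline.
with_newline=$'aline\n'
echo "${#pattern} ${#with_newline}"   # prints: 5 6
```

This only explains the command-substitution variant; how grep treats a pattern that really contains a newline is a separate matter.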

Bash: Strip trailing linebreak from output

When I execute commands in Bash (or to be specific, wc -l < log.txt), the output contains a linebreak after it. How do I get rid of it?
If your expected output is a single line, you can simply remove all newline characters from the output. It would not be uncommon to pipe to the tr utility, or to Perl if preferred:
wc -l < log.txt | tr -d '\n'
wc -l < log.txt | perl -pe 'chomp'
You can also use command substitution to remove the trailing newline:
echo -n "$(wc -l < log.txt)"
printf "%s" "$(wc -l < log.txt)"
If your expected output may contain multiple lines, you have another decision to make:
If you want to remove MULTIPLE newline characters from the end of the file, again use command substitution:
printf "%s" "$(< log.txt)"
If you want to strictly remove THE LAST newline character from a file, use Perl:
perl -pe 'chomp if eof' log.txt
Note that if you are certain you have a trailing newline character you want to remove, you can use head from GNU coreutils to select everything except the last byte. This should be quite quick:
head -c -1 log.txt
Also, for completeness, you can quickly check where your newline (or other special) characters are in your file using cat and the 'show-all' flag -A. The dollar sign character will indicate the end of each line:
cat -A log.txt
One way:
wc -l < log.txt | xargs echo -n
If you want to remove only the last newline, pipe through:
sed -z '$ s/\n$//'
sed won't add a \0 to the end of the stream if the delimiter is set to NUL via -z, whereas to create a POSIX text file (defined to end in a \n), it will always output a final \n without -z.
Eg:
$ { echo foo; echo bar; } | sed -z '$ s/\n$//'; echo tender
foo
bartender
And to prove no NUL added:
$ { echo foo; echo bar; } | sed -z '$ s/\n$//' | xxd
00000000: 666f 6f0a 6261 72 foo.bar
To remove multiple trailing newlines, pipe through:
sed -Ez '$ s/\n+$//'
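Bash parameter expansion can also remove whitespace directly. Note that the pattern [[:space:]] matches exactly one character, so each of these removes at most a single whitespace character: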
testvar=$(wc -l < log.txt)
trailing_space_removed=${testvar%%[[:space:]]}
leading_space_removed=${testvar##[[:space:]]}
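To strip a whole run of whitespace rather than one character, one option is bash's extglob pattern +(...). A minimal sketch, assuming bash with extglob available; the sample value is made up:

```shell
#!/usr/bin/env bash
shopt -s extglob                      # enables the +(...) pattern

testvar=$'  42 \t\n'                  # made-up value with whitespace on both sides
trimmed=${testvar%%+([[:space:]])}    # strip all trailing whitespace
trimmed=${trimmed##+([[:space:]])}    # strip all leading whitespace
printf '[%s]\n' "$trimmed"            # prints: [42]
```

The %% and ## forms pick the longest match, which is what makes the whole run disappear.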
If you want to print output of anything in Bash without end of line, you echo it with the -n switch.
If you have it in a variable already, then echo it with the trailing newline cropped:
$ testvar=$(wc -l < log.txt)
$ echo -n $testvar
Or you can do it in one line, instead:
$ echo -n $(wc -l < log.txt)
If you assign its output to a variable, the command substitution automatically strips trailing newlines:
linecount=`wc -l < log.txt`
printf does not add a trailing newline of its own, and the command substitution has already stripped the one from wc:
$ printf '%s' $(wc -l < log.txt)
Detail:
printf will print your content in place of the %s string place holder.
If you do not tell it to print a newline (%s\n), it won't.
Adding this for my reference more than anything else ^_^
You can also strip a new line from the output using the bash expansion magic
VAR=$'helloworld\n'
CLEANED="${VAR%$'\n'}"
echo "${CLEANED}"
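The difference between removing one trailing newline and removing all of them can be seen by comparing string lengths. A small sketch with a made-up value:

```shell
#!/usr/bin/env bash
s=$'count\n\n'                 # made-up value ending in two newlines

one=${s%$'\n'}                 # % removes a single trailing newline
all=$(printf '%s' "$s")        # command substitution removes every trailing newline

echo "${#s} ${#one} ${#all}"   # prints: 7 6 5
```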
Using Awk:
awk -v ORS="" '1' log.txt
Explanation:
-v assignment for ORS
ORS - output record separator set to blank. This will replace new line (Input record separator) with ""

BASH - Reading Multiple Lines from Text File

I am trying to read a text file, say file.txt, which contains multiple lines.
Say the content of file.txt is
$ cat file.txt
this is line 1
this is line 2
this is line 3
I want to store the entire output as a variable say, $text.
When the variable $text is echoed, the expected output is:
this is line 1 this is line 2 this is line 3
my code is as follows
while read line
do
test="${LINE}"
done < file.txt
echo $test
The output I get is always only the last line. Is there a way to concatenate the multiple lines in file.txt as one long string?
You can translate the \n (newline) to a space:
$ text=$(tr '\n' ' ' <file.txt)
$ echo $text
this is line 1 this is line 2 this is line 3
If the lines end with \r\n, you can do this:
$ text=$(tr -d '\r' <file.txt | tr '\n' ' ')
Another one:
line=$(< file.txt)
line=${line//$'\n'/ }
test=$(cat file.txt | xargs)
echo $test
You have to append the content of the next line to your variable:
while read line
do
test="${test} ${line}"
done < file.txt
echo $test
Or, even simpler, you could read the full file at once into the variable:
test=$(cat file.txt)
or
test=$(tr "\n" " " < file.txt)
If you want to keep the newlines, it is as simple as:
test=$(<file.txt)
I believe it's the simplest method:
text=$(echo $(cat FILE))
But it doesn't preserve multiple spaces/tabs between words.
Use arrays
#!/bin/bash
while read line
do
a=( "${a[@]}" "$line" )
done < file.txt
echo -n "${a[@]}"
output:
this is line 1 this is line 2 this is line 3
See e.g. tldp section on arrays
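With bash 4 or later, the array variant can be sketched without an explicit loop by using the mapfile builtin. This is only an illustration; the temporary file stands in for file.txt.

```shell
#!/usr/bin/env bash
tmp=$(mktemp)
printf 'this is line %s\n' 1 2 3 > "$tmp"   # recreate the sample file

mapfile -t lines < "$tmp"   # one array element per line, newline removed
text="${lines[*]}"          # [*] joins with the first character of IFS (a space by default)
printf '%s\n' "$text"       # prints: this is line 1 this is line 2 this is line 3
rm -f "$tmp"
```

Unlike echo $(cat FILE), this keeps multiple spaces or tabs inside each line intact.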

Extract words from files

How can I extract all the words from a file, every word on a single line?
Example:
test.txt
This is my sample text
Output:
This
is
my
sample
text
The tr command can do this:
tr '[:blank:]' '\n' < test.txt
This asks the tr program to replace each blank (space or tab) with a newline. The character class is quoted so that the shell does not treat it as a filename pattern.
The output goes to stdout, but it could be redirected to another file, result.txt:
tr '[:blank:]' '\n' < test.txt > result.txt
And here the obvious bash line:
for i in $(< test.txt)
do
printf '%s\n' "$i"
done
EDIT Still shorter:
printf '%s\n' $(< test.txt)
That's all there is to it, no special (pathetic) cases included (And handling multiple subsequent word separators / leading / trailing separators is by Doing The Right Thing (TM)). You can adjust the notion of a word separator using the $IFS variable, see bash manual.
The above answer doesn't handle multiple spaces and such very well. An alternative would be
perl -p -e '$_ = join("\n",split);' test.txt
which would. E.g.
esben@mosegris:~/ange/linova/build master $ echo "test  test" | tr [:blank:] '\n'
test

test
But
esben@mosegris:~/ange/linova/build master $ echo "test  test" | perl -p -e '$_ = join("\n",split);'
test
test
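The tr approach can also cope with repeated separators if the -s (squeeze) option is added. A minimal sketch; the function name words and the sample text are made up for the example:

```shell
#!/usr/bin/env bash
# Hypothetical helper: one word per line, treating any run of
# whitespace (spaces, tabs, newlines) as a single separator.
words() {
    tr -s '[:space:]' '\n'
}

printf 'This  is\tmy\nsample   text\n' | words
# This
# is
# my
# sample
# text
```

Leading whitespace in the input would still produce one empty first line, so this is not a complete replacement for the Perl split.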
This might work for you:
# echo -e "this is\tmy\nsample text" | sed 's/\s\+/\n/g'
this
is
my
sample
text
A Perl answer would be:
pearl.214> cat file1
a b c d e f
pearl.215> perl -p -e 's/ /\n/g' file1
a
b
c
d
e
f
pearl.216>

Resources