How can I print a newline as \n in Bash? - bash

Basically, I want to achieve something like the inverse of echo -e.
I have a variable which stores a command output, but I want to print newlines as \n.

Here's my solution:
sed 's/$/\\n/' | tr -d '\n'

If your input is already in a (Bash) shell variable, say $varWithNewlines:
echo "${varWithNewlines//$'\n'/\\n}"
It simply uses Bash parameter expansion to replace all newline ($'\n') instances with literal '\n' each.
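As a minimal sketch of the parameter-expansion approach, assuming Bash (neither `$'...'` quoting nor `${var//pattern/replacement}` is POSIX sh), the hypothetical helper `encode_nl` below wraps the expansion and uses `printf %s` rather than `echo`, so nothing is appended to the result:

```shell
#!/usr/bin/env bash
# encode_nl is a hypothetical helper name, not part of the original answer.
# It prints its first argument with every actual newline replaced by the
# two-character sequence backslash + n, and appends nothing.
encode_nl() {
  printf '%s' "${1//$'\n'/\\n}"
}

encode_nl $'line 1\nline 2\n'   # the trailing newline is encoded as well
echo                            # finish the terminal line
```

Because the expansion works on the variable's exact value, a trailing `\n` appears in the output only if the value really ends with a newline.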
If your input comes from a file, use AWK:
awk -v ORS='\\n' 1
In action, with sample input:
# Sample input with actual newlines created with ANSI C quoting ($'...'),
# which turns `\n` literals into actual newlines.
varWithNewlines=$'line 1\nline 2\nline 3'
# Translate newlines to '\n' literals.
# Note the use of `printf %s` to avoid adding an additional newline.
# By contrast, a here-string - <<<"$varWithNewlines" _always appends a newline_.
printf %s "$varWithNewlines" | awk -v ORS='\\n' 1
awk reads input line by line.
By setting ORS, the output record separator, to a literal '\n' (escaped with an additional \ so that awk doesn't interpret it as an escape sequence), awk outputs each input line followed by that separator.
1 is just shorthand for {print}, i.e., all input lines are printed, terminated by ORS.
Note: The output will always end in literal '\n', even if your input does not end in a newline.
This is because AWK terminates every output record with ORS, whether the input line ended with a newline (the record separator specified in RS) or not.
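The trailing-separator behavior can be checked with a short sketch. `encode_awk` is a hypothetical wrapper name, and the inputs are made up for illustration:

```shell
# encode_awk (hypothetical name) replaces each newline on stdin with a
# literal \n by printing every record followed by ORS.
encode_awk() {
  awk -v ORS='\\n' 1
}

# Input ends with a newline: two records, two literal \n sequences.
printf 'a\nb\n' | encode_awk; echo
# Input lacks a final newline: awk still terminates the last record with ORS.
printf 'a\nb' | encode_awk; echo
```

Both commands print the same text, which is why the input's final newline has to be inspected separately if it matters.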
Here's how to unconditionally strip the terminating literal '\n' from your output.
# Translate newlines to '\n' literals and capture in variable.
varEncoded=$(printf %s "$varWithNewlines" | awk -v ORS='\\n' 1)
# Strip terminating '\n' literal from the variable value
# using Bash parameter expansion.
echo "${varEncoded%\\n}"
By contrast, more work is needed if you want to make the presence of a terminating literal '\n' dependent on whether the input ends with a newline or not.
# Translate newlines to '\n' literals and capture in variable.
varEncoded=$(printf %s "$varWithNewlines" | awk -v ORS='\\n' 1)
# If the input does not end with a newline, strip the terminating '\n' literal.
if [[ $varWithNewlines != *$'\n' ]]; then
# Strip terminating '\n' literal from the variable value
# using Bash parameter expansion.
echo "${varEncoded%\\n}"
else
echo "$varEncoded"
fi

You can use printf "%q":
eol=$'\n'
printf "%q\n" "$eol"
$'\n'

A Bash solution
x=$'abcd\ne fg\nghi'
printf "%s\n" "$x"
abcd
e fg
ghi
y=$(IFS=$'\n'; set -f; printf '%s\\n' $x)
y=${y%??}
printf "%s\n" "$y"
abcd\ne fg\nghi

Related

replacing newlines with the string '\n' with POSIX tools

Yes I know there are a number of questions (e.g. (0) or (1)) which seem to ask the same, but AFAICS none really answers what I want.
What I want is to replace every newline (LF) with the string \n, with no implicitly assumed newlines, using only POSIX utilities (no GNU extensions or Bashisms), and reading the input from stdin without buffering it.
So for example:
printf 'foo' | magic
should give foo
printf 'foo\n' | magic
should give foo\n
printf 'foo\n\n' | magic
should give foo\n\n
The usual answers don't do this, e.g.:
awk
printf 'foo' | awk 1 ORS='\\n' gives foo\n, whereas it should give just foo
so it adds a \n when there was no newline.
sed
would work for just foo but in all other cases, like:
printf 'foo\n' | sed ':a;N;$!ba;s/\n/\\n/g' gives foo, whereas it should give foo\n
misses one final newline.
Since I do not want any sort of buffering, I cannot just check whether the input ended in a newline and then add the missing one manually.
And anyway... it would use GNU extensions.
sed -z 's/\n/\\n/g'
does work (even retains the NULs correctly), but again, GNU extension.
tr
can only replace with one character, whereas I need two.
The only working solution I'd have so far is with perl:
perl -p -e 's/\n/\\n/'
which works just as desired in all cases, but as I've said, I'd like to have a solution for environments where just the basic POSIX utilities are there (so no Perl or using any GNU extensions).
Thanks in advance.
The following will work with all POSIX versions of the tools being used and with any POSIX text permissible characters as input whether a terminating newline is present or not:
$ magic() { { cat -u; printf '\n'; } | awk -v ORS= '{print sep $0; sep="\\n"}'; }
$ printf 'foo' | magic
foo$
$ printf 'foo\n' | magic
foo\n$
$ printf 'foo\n\n' | magic
foo\n\n$
The function first adds a newline to the incoming piped data to ensure that what awk is reading is a valid POSIX text file (which must end in a newline) so it's guaranteed to work in all POSIX compliant awks and then the awk command discards that terminating newline that we added and replaces all others with "\n" as required.
The only utility above that has to process input without a terminating newline is cat, but POSIX just talks about "files" as input to cat, not "text files" as in the awk and sed specs, and so every POSIX-compliant version of cat can handle input without a terminating newline.
You can (I think) do this with pure POSIX shell. I am assuming you are working with text, not arbitrary binary data that can include null bytes.
magic () {
while IFS= read -r x; do
printf '%s\\n' "$x"
done
printf '%s' "$x"
}
read assumes POSIX text lines (terminated with a newline), but it still populates x with anything it reads until the end of its input when no linefeed is seen. So as long as read succeeds, you have a proper line (minus the linefeed) in x that you can write back, but with a literal \n instead of a linefeed.
Once the loop breaks, output whatever (if anything) in x after the failed read, but without a trailing literal \n.
$ [ "$(printf foo | magic)" = foo ] && echo passed
passed
$ [ "$(printf 'foo\n' | magic)" = 'foo\n' ] && echo passed
passed
$ [ "$(printf 'foo\n\n' | magic)" = 'foo\n\n' ] && echo passed
passed
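To illustrate how the options given to `read` affect a loop of this kind, the sketch below compares a plain `read x` with `IFS= read -r x` (both forms are specified by POSIX). The function names `magic_plain` and `magic_raw` are hypothetical, and the sample input is invented:

```shell
# Hypothetical helpers for comparison; neither name comes from the answer.
magic_plain() {
  while read x; do printf '%s\\n' "$x"; done
  printf '%s' "$x"
}
magic_raw() {
  while IFS= read -r x; do printf '%s\\n' "$x"; done
  printf '%s' "$x"
}

# Input: two leading spaces, a backslash, and no final newline.
printf '  a\\b\nc' | magic_plain; echo   # backslash and indentation are lost
printf '  a\\b\nc' | magic_raw; echo     # line content is preserved
```

Without `-r`, `read` treats a backslash as an escape character, and with the default `IFS` it trims leading and trailing whitespace, so only the second form passes each line through unchanged.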
Here is a tr + sed solution. It avoids GNU-only options, although the \x7 escape used in the sed expression is itself not specified by POSIX, so check that your sed accepts it:
printf 'foo' | tr '\n' '\7' | sed 's/\x7/\\n/g'
foo
printf 'foo\n' | tr '\n' '\7' | sed 's/\x7/\\n/g'
foo\n
printf 'foo\n\n' | tr '\n' '\7' | sed 's/\x7/\\n/g'
foo\n\n
Details:
The tr command replaces each line break with \x07 (the BEL character).
The sed command replaces each \x07 with a literal \n.
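A sketch of the same idea that avoids hex escapes in sed by embedding the BEL byte itself, generated with `printf '\007'`. The names `magic_tr` and `bel` are hypothetical. It still assumes that the input contains no BEL bytes and that your sed accepts a final line without a terminating newline (POSIX leaves that case unspecified, and some implementations append a newline to the output):

```shell
# bel holds one literal BEL (0x07) byte, produced portably by printf.
bel=$(printf '\007')

magic_tr() {
  # tr maps every newline to BEL, so sed sees a single line;
  # sed then turns each BEL into backslash + n.
  tr '\n' '\007' | sed "s/$bel/\\\\n/g"
}

printf 'foo\n\n' | magic_tr; echo
```

Inside the double quotes, the shell reduces `\\\\n` to `\\n`, which sed in turn reads as a literal backslash followed by `n`.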

How to properly expand a Bash variable that contains newlines on sed replacement (insertion) side

Bear with me at first, thank you. Suppose I have
$ echo $'foo\nbar'
foo
bar
Now when I assign the string to a Bash variable, Bash does not give the same vertical output anymore:
$ str='foo\nbar'
$
$ echo $str
foo\nbar
$
$ echo $'str'
str
Try printf:
$ printf "$str\n"
foo
bar
Those examples are for illustration purposes because I am looking for a way to expand the newline(s) inside the $str variable such that I can substitute the $str variable on sed replacement (insertion) side.
# this does not work:
sed -i.bak $'/<!-- insert here -->/i\\\n'$'str'$'\\\n' index.html
# this works as expected though:
sed -i.bak $'/<!-- insert here -->/i\\\n'foo$'\\\n'bar$'\\\n' index.html
I did several ways to hack this but none worked; here is one example:
# this does not work:
sed -i.bak $'/<!-- insert here -->/i\\\n'`printf 'foo\\x0Abar'`$'\\\n' index.html
On further testing, I realized that as long as the variable does not contain newlines, things work as expected:
# This works as long as str2 does not contain any newline.
str2='foo_bar'
sed -i.bak $'/<!-- insert here -->/i\\\n'$str2$'\\\n' index.html
The expected result is that sed inserts two lines before <!-- insert here --> in the index.html file.
foo
bar
<!-- insert here -->
I try to achieve this as one liner. I know I can break sed into the vertical, multi-line form, which will be easier for me; however, I want to explore if there is a one liner style.
Is this doable or not?
My system is macOS High Sierra 10.13.6
Bash version: 3.2.57(1)-release
BSD sed was last updated on May 10, 2005
Your examples have a few subtle errors, so here are a few examples regarding quoting and newlines in strings in bash and sed.
How quoting works in general:
# bash converts escape-sequence '\n' to real newline (0x0a) before passing it to echo
$ echo $'foo\nbar'
foo
bar
# bash passes literal 8 characters 'foo\nbar' to echo and echo simply prints them
$ echo 'foo\nbar'
foo\nbar
# bash passes literal 8 characters 'foo\nbar' to echo and echo converts escape-sequence
$ echo -e 'foo\nbar'
foo
bar
# bash passes literal string 'foo\nbar' to echo (twice)
# then echo recombines both arguments using a single space
$ str='foo\nbar'
$ echo $str "$str"
foo\nbar foo\nbar
# bash interprets escape-sequences and stores result 'foo<0x0a>bar' in str,
# then passes two arguments 'foo' and 'bar' to echo, due to "word splitting"
# then echo recombines both arguments using a single space
$ str=$'foo\nbar'
$ echo $str
foo bar
# bash interprets escape-sequences and stores result 'foo<0x0a>bar' in str,
# then passes it as a single argument to echo, without "word splitting"
$ str=$'foo\nbar'
$ echo "$str"
foo
bar
How to apply shell quoting, when dealing with newlines in sed
# replace a character with newline, using newline's escape-sequence
# GNU sed will convert '\n' to a literal newline (0x0a); BSD sed does not
$ sed 's/-/foo\nbar/' <<< 'blah-blah'
# replace a character with newline, using newline's escape-sequence in a variable
# GNU sed will convert '\n' to a literal newline (0x0a); BSD sed does not
$ str='foo\nbar' # str contains the escape-sequence '\n' and not a literal newline
$ sed 's/-/'"$str"'/' <<< 'blah-blah'
# replace a character with newline, using a literal newline.
# note the line-continuation-mark \ after 'foo' before the literal newline,
# which is part of the sed script, since everything in-between '' is literal
$ sed 's/-/foo\
bar/' <<< 'blah-blah' # end-of-command
# replace a character with newline, using a newline in shell-escape-mode
# note the same line-continuation-mark \ before $'\n', which is part of the sed script
# note: the sed script is a single string composed of three parts '…\', $'\n' and '…',
$ sed 's/-/foo\'$'\n''bar/' <<< 'blah-blah'
# the same as above, but with a single shell-escape-mode string instead of 3 parts.
# note the required quoting of the line-continuation-mark with an additional \ escape
# i.e. after shell-escaping the sed script contains a single \ and a literal newline
$ sed $'s/-/foo\\\nbar/' <<< 'blah-blah'
# replace a character with newline, using a shell-escaped string in a variable
$ str=$'\n' # str contains a literal newline (0x0a) due to shell escaping
$ sed 's/-/foo\'"$str"'bar/' <<< 'blah-blah'
# same as above with the required (quoted) line-continuation inside the variable
# note, how the single \ from '…foo\' (previous example) became \\ inside $'\\…'
$ str=$'\\\n' # str contains \ and a literal newline (0x0a) due to shell escaping
$ sed 's/-/foo'"$str"'bar/' <<< 'blah-blah'
All the sed examples will print the same (the first two only with a sed that converts '\n' in the replacement, as GNU sed does):
blahfoo
barblah
So, a newline in sed's replacement string must either be
(1) newline's escape-sequence (i.e. '\n'), so sed can replace it with a literal newline, or
(2) a literal newline preceded by a line-continuation-mark (i.e. $'\\\n' or '\'$'\n', which is NOT the same as '\\\n' or '\\n' or $'\\n').
This means you need to replace each literal newline <0x0a> with newline's escape-sequence \n or insert a line-continuation-mark before each literal newline inside your replacement string before double-quote-expanding it into sed's substitute replacement string.
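Option (2) can be automated. As a sketch under stated assumptions (the function name `sub_multiline` is hypothetical, and the replacement text is assumed to contain no `/`, `&`, or backslash characters, which would need separate escaping), the following adds a line-continuation backslash before every embedded newline using only POSIX sed features:

```shell
# sub_multiline REPLACEMENT: replace the first '-' on each stdin line with
# REPLACEMENT, which may contain literal newlines.
sub_multiline() {
  # '$!s/$/\\/' appends a backslash to every line except the last, so each
  # embedded newline ends up preceded by a line-continuation mark.
  esc=$(printf '%s\n' "$1" | sed '$!s/$/\\/')
  sed "s/-/$esc/"
}

str='foo
bar'
printf 'blah-blah\n' | sub_multiline "$str"
```

The escaped value is expanded inside double quotes, so the shell passes the backslash and the newline to sed untouched.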
Since there are many more caveats regarding escaping in sed, I recommend you use awk's gsub function instead, passing your replacement string as a variable via -v, e.g.
$ str=$'foo\nbar'
$ awk -v REP="$str" -- '{gsub(/-/, REP); print}' <<< 'blah-blah'
blahfoo
barblah
PS: I don't know, if this answer is entirely true in your case, because your operating system uses an outdated version of bash.
echo -e $str
where -e is
enable interpretation of backslash escapes
Use sed command r to insert arbitrary text
str="abc\ndef"
tmp=$(mktemp)
(
echo
printf -- "$str"
echo
) > "$tmp"
sed -i.bak '/<!-- insert here -->/r '"$tmp" index.html
rm -r "$tmp"
sed interprets a newline as the command delimiter. The ; does not terminate every sed command; for commands that take a filename, only a newline does. Don't append ;, }, or spaces to the w command: they will be interpreted as part of the filename (yes, spaces too). sed commands like w or r are terminated by a newline.
If you want more flexibility, rather move to awk.

BASH: unescape string

Suppose I have the following string:
"some\nstring\n..."
And it displays as one line when catted in bash. Further,
string_from_pipe | sed 's/\\\\/\\/g' # does not work
| awk '{print $0}'
| awk '{s = $0; print s}'
| awk '{s = $0; printf "%s",s}'
| echo $0
| sed 's/\\(.)/\1/g'
# all have not worked.
How do I unescape this string such that it prints as:
some
string
Or even displays that way inside a file?
POSIX sh provides printf %b for just this purpose:
s='some\nstring\n...'
printf '%b\n' "$s"
...will emit:
some
string
...
More to the point, the APPLICATION USAGE section of the POSIX spec for echo explicitly suggests using printf %b for this purpose rather than relying on optional XSI extensions.
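A small sketch of the difference between `%b` and `%s` (the helper name `unescape` and the sample string are hypothetical): `%b` expands backslash escapes found in the argument while the format string stays fixed, so a `%` in the data is harmless.

```shell
# unescape (hypothetical name) expands backslash escapes in its argument.
unescape() {
  printf '%b' "$1"
}

s='100%\tdone\nnext'
printf '%s\n' "$s"    # prints the string verbatim, escapes untouched
unescape "$s"; echo   # tab and newline are expanded; the % is left alone
```

Passing the data as an argument rather than as the format string is the design point here: `printf "$s"` would also interpret `%` sequences in the data.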
As you observed, echo does not solve the problem:
$ s="some\nstring\n..."
$ echo "$s"
some\nstring\n...
You haven't mentioned where you got that string or which escapes are in it.
Using a POSIX-compliant shell's printf
If the escapes are ones supported by printf, then try:
$ printf '%b\n' "$s"
some
string
...
Using sed (this relies on sed converting \n in the replacement to a newline, as GNU sed does)
$ echo "$s" | sed 's/\\n/\n/g'
some
string
...
Using awk
$ echo "$s" | awk '{gsub(/\\n/, "\n")} 1'
some
string
...
If you have the string in a variable (say myvar), you can use:
${myvar//\\n/$'\n'}
For example:
$ myvar='hello\nworld\nfoo'
$ echo "${myvar//\\n/$'\n'}"
hello
world
foo
$
(Note: it's usually safer to use printf %s <string> than echo <string>, if you don't have full control over the contents of <string>.)
How about using the -e option of echo?
$ s="some\nstring\n..." && echo -e "$s"
some
string
...
From the echo man-page
-e enable interpretation of the following backslash escapes
[...]
\a alert (bell)
\b backspace
\c suppress further output
\e escape character
\f form feed
\n new line
\r carriage return
\t horizontal tab
\v vertical tab
\\ backslash
\0nnn the character whose ASCII code is NNN (octal). NNN can be 0 to 3 octal digits
\xHH the eight-bit character whose value is HH (hexadecimal). HH can be one or two hex digits

shell script for reading file and replacing new file with | symbol

I have a text file like the one below.
abc
def
ghi

123
456
789
expected output is
abc|def|ghi
123|456|789
I want to replace each newline with a pipe symbol (|), so that I can use the result in egrep. After an empty line, a new output line should start.
you can try with awk
awk -v RS= -v OFS="|" '{$1=$1}1' file
you get,
abc|def|ghi
123|456|789
Explanation
Setting RS to the empty string puts awk in paragraph mode, in which records are separated by blank lines.
From the POSIX specification for awk:
RS
The first character of the string value of RS shall be the input record separator; a <newline> by default. If RS contains more than one character, the results are unspecified. If RS is null, then records are separated by sequences consisting of a <newline> plus one or more blank lines, leading or trailing blank lines shall not result in empty records at the beginning or end of the input, and a <newline> shall always be a field separator, no matter what the value of FS is.
The assignment $1=$1 forces awk to rebuild the record with OFS as the separator; the trailing 1 is an always-true pattern, so every record is printed.
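One side effect worth knowing: with the default FS, paragraph mode also splits fields on spaces and tabs, so a line such as `a b` would become `a|b`. A sketch that joins on newlines only, assuming your awk accepts `-F '\n'` (the function name `join_paragraphs` and the sample input are invented for illustration):

```shell
# join_paragraphs (hypothetical name): join the lines of each
# blank-line-separated block with '|', leaving spaces inside lines alone.
join_paragraphs() {
  awk -v RS= -F '\n' -v OFS='|' '{ $1 = $1 } 1'
}

printf 'a b\nc\n\nd\ne f\n' | join_paragraphs
```

For input whose lines contain no whitespace, as in the question, this behaves the same as the command above.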
Here's one using GNU sed:
cat file | sed ':a; N; $!ba; s/\n/|/g; s/||/\n/g'
If you're using BSD sed (the flavor packaged with Mac OS X), you will need to pass in each expression separately, and use a literal newline instead of \n (more info):
cat file | sed -e ':a' -e 'N' -e '$!ba' -e 's/\n/|/g' -e 's/||/\
/g'
If file is:
abc
def
ghi

123
456
789
You get:
abc|def|ghi
123|456|789
This replaces each newline with a | (credit to this answer), and then || (i.e. what was a pair of newlines in the original input) with a newline.
The caveat here is that | can't appear at the beginning or end of a line in your input; otherwise, the second substitution will add newlines in the wrong places. To work around that, you can use another character that won't be in your input as an intermediate value, and then replace singletons of that character with | and pairs with \n.
EDIT
Here's an example that implements the workaround above, using the NUL character \x00 (which should be highly unlikely to appear in your input) as the intermediate character:
cat file | sed ':a;N;$!ba; s/\n/\x00/g; s/\x00\x00/\n/g; s/\x00/|/g'
Explanation:
:a;N;$!ba; puts the entire file in the pattern space, including newlines
s/\n/\x00/g; replaces all newlines with the NUL character
s/\x00\x00/\n/g; replaces all pairs of NULs with a newline
s/\x00/|/g replaces the remaining singletons of NULs with a |
BSD version:
sed -e ':a' -e 'N' -e '$!ba' -e 's/\n/\x00/g' -e 's/\x00\x00/\
/g' -e 's/\x00/|/g'
EDIT 2
For a more direct approach (GNU sed only), provided by #ClaudiuGeorgiu:
sed -z 's/\([^\n]\)\n\([^\n]\)/\1|\2/g; s/\n\n/\n/g'
Explanation:
-z uses NUL characters as line-endings (so newlines are not given special treatment and can be matched in the regular expression)
s/\([^\n]\)\n\([^\n]\)/\1|\2/g; replaces every 3-character sequence of <non-newline><newline><non-newline> with <non-newline>|<non-newline>
s/\n\n/\n/g replaces all pairs of newlines with a single newline
In native bash:
#!/usr/bin/env bash
curr=
while IFS= read -r line; do
if [[ $line ]]; then
curr+="|$line"
else
printf '%s\n' "${curr#|}"
curr=
fi
done
[[ $curr ]] && printf '%s\n' "${curr#|}"
Tested:
$ f() { local curr= line; while IFS= read -r line; do if [[ $line ]]; then curr+="|$line"; else printf '%s\n' "${curr#|}"; curr=; fi; done; [[ $curr ]] && printf '%s\n' "${curr#|}"; }
$ f < <(printf '%s\n' 'abc' 'def' 'ghi' '' 123 456 789)
abc|def|ghi
123|456|789
Use rs. For example:
rs -C'|' 2 3 < file
rs = reshape data array. Here I'm specifying that I want 2 rows, 3 columns, and the output separator to be pipe.

How to safely handle "\n" with sed on automated scripts where "\n" can be in the input to these scripts?

I have a shell script that is called with parameters (it's called by an external binary program which I cannot change), like this:
myscript.sh "param1" "param2"
Now, in this script there's a sed "s/param1/param2/"-like command involved, and param2 can literally contain the newline escape sequence \n (like line1\nline2):
VAL=$(echo "$2" | sed -e 's/[\/&]/\\&/g')
sed -i "s/$1/$VAL/" /a/path/to/file
I already did this: Escape a string for a sed replace pattern to escape backslashes and ampersands that may occur, but this does not help handling the newline \n (sed ignores it).
I know how to do it manually in a script (by entering a real newline, pressing Return, at the appropriate place in the shell script file, or by doing something like $(echo)), but I have no control over the parameters that are passed.
How can I safely handle the newline sequence so that sed does its job and inserts a newline when \n occurs in the parameter?
In this case, I would very strongly recommend replacing sed with perl. If you are able to do that, then your script becomes:
perl -pi -e 'BEGIN {$a=shift;$b=shift} s/$a/$b/' "$1" "$2" /a/path/to/file
You no longer need the VAL variable at all!
If for some bizarre reason you're absolutely restricted to sed, change the VAL= statement to:
VAL=$(echo "$2" | sed -ne '1h;2,$H;$x;$s/[\/&]/\\&/g;$s/\n/\\n/g;$p;')
But don't do that. Use the perl version instead.
Replace \n with real newlines:
VAL=${VAL//\\n/$'\n'}
From BashFAQ #21, a generic string substitution tool that works with arbitrary literals (neither newlines nor regexp characters being special) using awk:
# usage: gsub_literal STR REP
# replaces all instances of STR with REP. reads from stdin and writes to stdout.
gsub_literal() {
# STR cannot be empty
[[ $1 ]] || return
# string manip needed to escape '\'s, so awk doesn't expand '\n' and such
awk -v str="${1//\\/\\\\}" -v rep="${2//\\/\\\\}" '
# get the length of the search string
BEGIN {
len = length(str);
}
{
# empty the output string
out = "";
# continue looping while the search string is in the line
while (i = index($0, str)) {
# append everything up to the search string, and the replacement string
out = out substr($0, 1, i-1) rep;
# remove everything up to and including the first instance of the
# search string from the line
$0 = substr($0, i + len);
}
# append whatever is left
out = out $0;
print out;
}
'
}
Granted, that's a mouthful, but it's trivial to use:
gsub_literal "$1" "$val" <infile >outfile
VAL=$(echo "$2" | sed -e 's/[\/&]/\\&/g')
How can I safely handle the newline sequence so that sed does its job and inserts a newline
when \n occurs in the parameter?
You can just let sed undo the escaping of \n by adding s/\\n/\n/g, i. e.
VAL=$(echo "$2" | sed -e 's/[\/&]/\\&/g;s/\\n/\n/g')
Test:
# set a 'line1\nline2'
# VAL=$(echo "$2" | sed -e 's/[\/&]/\\&/g;s/\\n/\n/g')
# sed "s/$1/$VAL/" <<<qay
qline1
line2y
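Putting the pieces together as a hedged sketch: the hypothetical function `replace_in_stream` below escapes the replacement in the same way, but writes the newline in the second substitution as a backslash followed by a literal newline instead of `\n`, since `\n` in a replacement is not guaranteed to work outside GNU sed. The search pattern `$1` is still used as an unescaped regular expression, as in the question.

```shell
# replace_in_stream PATTERN REPLACEMENT: filter stdin, replacing PATTERN
# with REPLACEMENT, where a literal \n in REPLACEMENT means a newline.
replace_in_stream() {
  # 1st expression: escape backslash, slash and ampersand (\n becomes \\n).
  # 2nd expression: turn the trailing "\n" of each "\\n" into a real newline,
  # leaving a backslash in front of it as the line-continuation mark.
  val=$(printf '%s\n' "$2" | sed -e 's/[\/&]/\\&/g' -e 's/\\n/\
/g')
  sed "s/$1/$val/"
}

printf 'qay\n' | replace_in_stream a 'line1\nline2'
```

This reads stdin and writes stdout rather than editing a file in place, which keeps the sketch easy to test; the question's `sed -i` call could be substituted once the escaping is confirmed to behave as wanted.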
