BASH: unescape string - bash

Suppose I have the following string:
"some\nstring\n..."
And it displays as one line when catted in bash. Further,
string_from_pipe | sed 's/\\\\/\\/g' # does not work
| awk '{print $0}'
| awk '{s = $0; print s}'
| awk '{s = $0; printf "%s",s}'
| echo $0
| sed 's/\\(.)/\1/g'
# all have not worked.
How do I unescape this string such that it prints as:
some
string
Or even displays that way inside a file?

POSIX sh provides printf %b for just this purpose:
s='some\nstring\n...'
printf '%b\n' "$s"
...will emit:
some
string
...
More to the point, the APPLICATION USAGE section of the POSIX spec for echo explicitly suggests using printf %b for this purpose rather than relying on optional XSI extensions.
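A minimal sketch of that suggestion, wrapped in a function so it can be reused; the name unescape is made up for illustration. Note that %b expands every escape it knows (\t, \\, \c and so on), not only \n, so it is only appropriate when that is what the producer of the string intended.

```sh
# unescape: hypothetical helper name, not part of POSIX.
# Prints its first argument with backslash escapes expanded by printf %b.
unescape() {
    printf '%b' "$1"
}

s='some\nstring\n...'
printf '%s\n' "$s"          # %s leaves the backslash sequences untouched
unescape "$s"; printf '\n'  # %b turns each \n into a real newline
```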

As you observed, echo does not solve the problem:
$ s="some\nstring\n..."
$ echo "$s"
some\nstring\n...
You haven't mentioned where you got that string or which escapes are in it.
Using a POSIX-compliant shell's printf
If the escapes are ones supported by printf, then try:
$ printf '%b\n' "$s"
some
string
...
Using sed
$ echo "$s" | sed 's/\\n/\n/g'
some
string
...
Using awk
$ echo "$s" | awk '{gsub(/\\n/, "\n")} 1'
some
string
...

If you have the string in a variable (say myvar), you can use:
${myvar//\\n/$'\n'}
For example:
$ myvar='hello\nworld\nfoo'
$ echo "${myvar//\\n/$'\n'}"
hello
world
foo
$
(Note: it's usually safer to use printf %s <string> than echo <string>, if you don't have full control over the contents of <string>.)
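Putting the expansion and the printf advice together, here is a sketch for Bash (the function name unescape_newlines is hypothetical). Unlike printf %b, it converts only the two-character sequence \n and leaves any other backslash sequence alone. Keeping the newline in a helper variable is an illustrative choice that avoids $'...' inside the double quotes.

```bash
# unescape_newlines: hypothetical helper, Bash only.
# Replaces each literal backslash-n pair with a real newline.
unescape_newlines() {
    local s=$1
    local nl=$'\n'
    printf '%s\n' "${s//\\n/$nl}"
}

# \n becomes a line break; \t is printed as-is.
unescape_newlines 'hello\nworld\tfoo'
```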

How about using the -e option of echo?
$ s="some\nstring\n..." && echo -e "$s"
some
string
...
From the echo man-page
-e enable interpretation of the following backslash escapes
[...]
\a alert (bell)
\b backspace
\c suppress further output
\e escape character
\f form feed
\n new line
\r carriage return
\t horizontal tab
\v vertical tab
\\ backslash
\0nnn the character whose ASCII code is NNN (octal). NNN can be 0 to 3 octal digits
\xHH the eight-bit character whose value is HH (hexadecimal). HH can be one or two hex digits

Related

Replace one character by the other (and vice-versa) in shell

Say I have strings that look like this:
$ a='/o\\'
$ echo $a
/o\
$ b='\//\\\\/'
$ echo $b
\//\\/
I'd like a shell script (ideally a one-liner) to replace / occurrences by \ and vice-versa.
Suppose the command is called invert, it would yield (in a shell prompt):
$ invert $a
\o/
$ invert $b
/\\//\
For example using sed, it seems unavoidable to use a temporary character, which is not great, like so:
$ echo $a | sed 's#/#%#g' | sed 's#\\#/#g' | sed 's#%#\\#g'
\o/
$ echo $b | sed 's#/#%#g' | sed 's#\\#/#g' | sed 's#%#\\#g'
/\\//\
For some context, this is useful for proper printing of git log --graph --all | tac (I like to see newer commits at the bottom).
tr is your friend:
% echo 'abc' | tr ab ba
bac
% echo '/o\' | tr '\\/' '/\\'
\o/
(escaping the backslashes in the output might require a separate step)
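As a sketch, the tr call can be wrapped to give the invert command the question asks for. The function body is an illustration, and it assumes the string is passed as a single quoted argument.

```sh
# invert: swaps every / with \ and every \ with / in its first argument.
# Inside tr's sets, \\ stands for one literal backslash.
invert() {
    printf '%s\n' "$1" | tr '/\\' '\\/'
}

invert '/o\'       # \o/
invert '\//\\/'    # /\\//\
```

Because tr maps both characters in a single pass, no temporary placeholder character is needed.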
I think this can be done with (g)awk:
$ echo a/\\b\\/c | gawk -F "/" 'BEGIN{ OFS="\\" } { for(i=1;i<=NF;i++) gsub(/\\/,"/",$i); print $0; }'
a\/b/\c
$ echo a\\/b/\\c | gawk -F "/" 'BEGIN{ OFS="\\" } { for(i=1;i<=NF;i++) gsub(/\\/,"/",$i); print $0; }'
a/\b\/c
$
-F "/" defines the field separator: the input is split on "/", so the resulting fields no longer contain a "/" character.
for(i=1;i<=NF;i++) gsub(/\\/,"/",$i); then replaces every backslash (\) in each field with a slash (/). Because OFS is set to "\\", the fields are joined back together with backslashes when the line is printed.
If you want to replace every instance of / with \ (and vice versa), you can use the y command of sed, which is quite similar to what tr does:
$ a='/o\'
$ echo "$a"
/o\
$ echo "$a" | sed 'y|/\\|\\/|'
\o/
$ b='\//\\/'
$ echo "$b"
\//\\/
$ echo "$b" | sed 'y|/\\|\\/|'
/\\//\
If you are limited to GNU AWK, you can get the desired result the following way. Let the content of file.txt be
\//\\\\/
then
awk 'BEGIN{FPAT=".";OFS="";arr["/"]="\\";arr["\\"]="/"}{for(i=1;i<=NF;i+=1){if($i in arr){$i=arr[$i]}};print}' file.txt
gives output
/\\////\
Explanation: I tell GNU AWK that a field is any single character using the FPAT built-in variable, set the output field separator (OFS) to the empty string, and create an array whose key-value pairs map each character to be replaced to its replacement; \ needs to be escaped, hence \\ denotes a literal \. Then, for each line, I iterate over all fields with a for loop, and if a field holds a character present among the keys of arr, I exchange it for the corresponding value. After the loop I print the line.
(tested in gawk 4.2.1)

replacing newlines with the string '\n' with POSIX tools

Yes I know there are a number of questions (e.g. (0) or (1)) which seem to ask the same, but AFAICS none really answers what I want.
What I want is to replace every occurrence of a newline (LF) with the string \n, with no implicitly assumed newlines, using only POSIX utilities (no GNU extensions or Bashisms), with input read from stdin and without buffering the whole input.
So for example:
printf 'foo' | magic
should give foo
printf 'foo\n' | magic
should give foo\n
printf 'foo\n\n' | magic
should give foo\n\n
The usually given answers, don't do this, e.g.:
awk
printf 'foo' | awk 1 ORS='\\n' gives foo\n, whereas it should give just foo
so adds an \n when there was no newline.
sed
would work for just foo but in all other cases, like:
printf 'foo\n' | sed ':a;N;$!ba;s/\n/\\n/g' gives foo, whereas it should give foo\n
misses one final newline.
Since I do not want any sort of buffering, I cannot just look whether the input ended in a newline and then add the missing one manually.
And anyway... it would use GNU extensions.
sed -z 's/\n/\\n/g'
does work (even retains the NULs correctly), but again, GNU extension.
tr
can only replace with one character, whereas I need two.
The only working solution I'd have so far is with perl:
perl -p -e 's/\n/\\n/'
which works just as desired in all cases, but as I've said, I'd like to have a solution for environments where just the basic POSIX utilities are there (so no Perl or using any GNU extensions).
Thanks in advance.
The following will work with all POSIX versions of the tools being used and with any POSIX text permissible characters as input whether a terminating newline is present or not:
$ magic() { { cat -u; printf '\n'; } | awk -v ORS= '{print sep $0; sep="\\n"}'; }
$ printf 'foo' | magic
foo$
$ printf 'foo\n' | magic
foo\n$
$ printf 'foo\n\n' | magic
foo\n\n$
The function first adds a newline to the incoming piped data to ensure that what awk is reading is a valid POSIX text file (which must end in a newline) so it's guaranteed to work in all POSIX compliant awks and then the awk command discards that terminating newline that we added and replaces all others with "\n" as required.
The only utility above that has to process input without a terminating newline is cat, but POSIX just talks about "files" as input to cat, not "text files" as in the awk and sed specs, and so every POSIX-compliant version of cat can handle input without a terminating newline.
You can (I think) do this with pure POSIX shell. I am assuming you are working with text, not arbitrary binary data that can include null bytes.
magic () {
    # IFS= keeps leading/trailing blanks; -r keeps backslashes literal.
    while IFS= read -r x; do
        printf '%s\\n' "$x"
    done
    printf '%s' "$x"
}
read assumes POSIX text lines (terminated with a newline), but it still populates x with anything it reads until the end of its input when no linefeed is seen. So as long as read succeeds, you have a proper line (minus the linefeed) in x that you can write back, but with a literal \n instead of a linefeed.
Once the loop breaks, output whatever (if anything) in x after the failed read, but without a trailing literal \n.
$ [ "$(printf foo | magic)" = foo ] && echo passed
passed
$ [ "$(printf 'foo\n' | magic)" = 'foo\n' ] && echo passed
passed
$ [ "$(printf 'foo\n\n' | magic)" = 'foo\n\n' ] && echo passed
passed
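Here is a tr + sed solution. Both utilities are in POSIX, but note that the \x7 escape in the sed pattern is a GNU sed extension, and that the input sed receives has no trailing newline, which other sed implementations may handle differently: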
printf 'foo' | tr '\n' '\7' | sed 's/\x7/\\n/g'
foo
printf 'foo\n' | tr '\n' '\7' | sed 's/\x7/\\n/g'
foo\n
printf 'foo\n\n' | tr '\n' '\7' | sed 's/\x7/\\n/g'
foo\n\n
Details:
The tr command replaces each line break with the BEL character (\x07).
The sed command replaces each \x07 with a literal \n.

How can I print a newline as \n in Bash?

Basically, I want to achieve something like the inverse of echo -e.
I have a variable which stores a command output, but I want to print newlines as \n.
Here's my solution:
sed 's/$/\\n/' | tr -d '\n'
If your input is already in a (Bash) shell variable, say $varWithNewlines:
echo "${varWithNewlines//$'\n'/\\n}"
It simply uses Bash parameter expansion to replace all newline ($'\n') instances with literal '\n' each.
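As a sketch of how this pairs with printf %b for the reverse direction, here is a Bash function (encode_newlines is a hypothetical name). The round trip is only exact if the original text contains no backslashes of its own, since %b would expand those too.

```bash
# encode_newlines: hypothetical helper, Bash only.
# Replaces each real newline with the two characters \ and n.
encode_newlines() {
    local s=$1
    local nl=$'\n'
    printf '%s' "${s//$nl/\\n}"
}

varWithNewlines=$'line 1\nline 2\nline 3'
encoded=$(encode_newlines "$varWithNewlines")
printf '%s\n' "$encoded"   # line 1\nline 2\nline 3
printf '%b\n' "$encoded"   # back to three separate lines
```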
If your input comes from a file, use AWK:
awk -v ORS='\\n' 1
In action, with sample input:
# Sample input with actual newlines created with ANSI C quoting ($'...'),
# which turns `\n` literals into actual newlines.
varWithNewlines=$'line 1\nline 2\nline 3'
# Translate newlines to '\n' literals.
# Note the use of `printf %s` to avoid adding an additional newline.
# By contrast, a here-string - <<<"$varWithNewlines" _always appends a newline_.
printf %s "$varWithNewlines" | awk -v ORS='\\n' 1
awk reads input line by line
By setting ORS, the output record separator, to a literal '\n' (escaped with an additional \ so that awk doesn't interpret it as an escape sequence), the input lines are output with that separator.
1 is just shorthand for {print}, i.e., all input lines are printed, terminated by ORS.
Note: The output will always end in literal '\n', even if your input does not end in a newline.
This is because AWK terminates every output line with ORS, whether the input line ended with a newline (the separator specified in RS) or not.
Here's how to unconditionally strip the terminating literal '\n' from your output.
# Translate newlines to '\n' literals and capture in variable.
varEncoded=$(printf %s "$varWithNewlines" | awk -v ORS='\\n' 1)
# Strip terminating '\n' literal from the variable value
# using Bash parameter expansion.
echo "${varEncoded%\\n}"
By contrast, more work is needed if you want to make the presence of a terminating literal '\n' dependent on whether the input ends with a newline or not.
# Translate newlines to '\n' literals and capture in variable.
varEncoded=$(printf %s "$varWithNewlines" | awk -v ORS='\\n' 1)
# If the input does not end with a newline, strip the terminating '\n' literal.
if [[ $varWithNewlines != *$'\n' ]]; then
# Strip terminating '\n' literal from the variable value
# using Bash parameter expansion.
echo "${varEncoded%\\n}"
else
echo "$varEncoded"
fi
You can use printf "%q":
eol=$'\n'
printf "%q\n" "$eol"
$'\n'
A Bash solution
x=$'abcd\ne fg\nghi'
printf "%s\n" "$x"
abcd
e fg
ghi
y=$(IFS=$'\n'; set -f; printf '%s\\n' $x)
y=${y%??}
printf "%s\n" "$y"
abcd\ne fg\nghi

Remove non printing chars from bash variable

I have a variable $a that contains a non-printing character (carriage return, ^M).
>echo $a
some words for compgen
>a+="END"
>echo $a
ENDe words for compgen
How can I remove that char?
I know that echo "$a" displays it correctly, but that is not a solution in my case.
You could use tr:
tr -dc '[[:print:]]' <<< "$var"
would remove non-printable character from $var.
$ foo=$'abc\rdef'
$ echo "$foo"
def
$ tr -dc '[[:print:]]' <<< "$foo"
abcdef
$ foo=$(tr -dc '[[:print:]]' <<< "$foo")
$ echo "$foo"
abcdef
To remove just the trailing carriage return from a, use
a=${a%$'\r'}
I was trying to send a notification via libnotify, with content that may contain unprintable characters. The existing solutions did not quite work for me (using a whitelist of characters using tr works, but strips any multi-byte characters).
Here is what worked, while passing the 💩 test:
message=$(iconv --from-code=UTF-8 -c <<< "$message")
As an equivalent to the tr approach using only shell builtins:
cleanVar=${var//[![:print:]]/}
...substituting :print: with the character class you want to keep, if appropriate.
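A sketch of both builtin-only variants as Bash functions (the names keep_printable and strip_cr are hypothetical). Which characters count as [:print:] depends on the current locale, so behavior with multi-byte text should be checked in the locale you actually run under.

```bash
# keep_printable: drops every character outside [:print:] (locale-dependent).
keep_printable() {
    local s=$1
    printf '%s\n' "${s//[![:print:]]/}"
}

# strip_cr: drops carriage returns only, wherever they occur.
strip_cr() {
    local s=$1
    local cr=$'\r'
    printf '%s\n' "${s//$cr/}"
}

keep_printable $'abc\rdef\tghi'   # abcdefghi
strip_cr $'some words\r'          # some words
```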
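tr -dc '[:alpha:]'
will reduce your string to alphabetic characters only (if that is what you need). Note the single pair of brackets: with '[[:alpha:]]', tr would also keep literal [ and ] characters.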

Remove blank spaces with comma in a string in bash shell

I would like to replace blank spaces/white spaces in a string with commas.
STR1=This is a string
to
STR1=This,is,a,string
Without using external tools:
echo ${STR1// /,}
Demo:
$ STR1="This is a string"
$ echo ${STR1// /,}
This,is,a,string
See bash: Manipulating strings.
Just use sed:
echo $STR1 | sed 's/ /,/g'
or the pure Bash way:
echo ${STR1// /,}
kent$ echo "STR1=This is a string"|awk -v OFS="," '$1=$1'
STR1=This,is,a,string
Note:
If there are consecutive blanks, they are replaced with a single comma.
This might work for you:
echo 'STR1=This is a string' | sed 'y/ /,/'
STR1=This,is,a,string
or:
echo 'STR1=This is a string' | tr ' ' ','
STR1=This,is,a,string
How about
STR1="This is a string"
StrFix="$( echo "$STR1" | sed 's/[[:space:]]/,/g')"
echo "$StrFix"
Output:
This,is,a,string
If you have multiple adjacent spaces in your string and want to reduce them to just 1 comma, then change the sed to
STR1="This is a string"
StrFix="$( echo "$STR1" | sed 's/[[:space:]][[:space:]]*/,/g')"
echo "$StrFix"
Output:
This,is,a,string
I'm using a non-standard sed, and so have used [[:space:]][[:space:]]* to indicate one or more "white-space" characters (including tabs, VT, maybe a few others). In a modern sed, I would expect [[:space:]]+ to work as well.
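For completeness, here is a Bash-only sketch of the same "one or more whitespace characters" idea without sed. It assumes the extglob shell option is acceptable in your script, and spaces_to_commas is a hypothetical name.

```bash
# extglob enables the +(pattern) syntax, meaning "one or more matches".
shopt -s extglob

# spaces_to_commas: replaces each run of whitespace with a single comma.
spaces_to_commas() {
    local s=$1
    printf '%s\n' "${s//+([[:space:]])/,}"
}

spaces_to_commas 'This  is   a string'   # This,is,a,string
```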
STR1=`echo $STR1 | sed 's/ /,/g'`
