How to properly expand a Bash variable that contains newlines on sed replacement (insertion) side - bash

Bear with me at first, thank you. Suppose I have
$ echo $'foo\nbar'
foo
bar
Now when I assign the string to a Bash variable, Bash does not give the same vertical output anymore:
$ str='foo\nbar'
$
$ echo $str
foo\nbar
$
$ echo $'str'
str
Try printf:
$ printf "$str\n"
foo
bar
Those examples are for illustration purposes because I am looking for a way to expand the newline(s) inside the $str variable such that I can substitute the $str variable on sed replacement (insertion) side.
# this does not work:
sed -i.bak $'/<!-- insert here -->/i\\\n'$'str'$'\\\n' index.html
# this works as expected though:
sed -i.bak $'/<!-- insert here -->/i\\\n'foo$'\\\n'bar$'\\\n' index.html
I tried several hacks, but none worked; here is one example:
# this does not work:
sed -i.bak $'/<!-- insert here -->/i\\\n'`printf 'foo\\x0Abar'`$'\\\n' index.html
After further tests, I realized that as long as the variable does not contain newlines, things work as expected:
# This works as long as str2 does not contain any newline.
str2='foo_bar'
sed -i.bak $'/<!-- insert here -->/i\\\n'$str2$'\\\n' index.html
The expected result is that sed inserts two lines before <!-- insert here --> in the index.html file:
foo
bar
<!-- insert here -->
I am trying to achieve this as a one-liner. I know I can break the sed command into the vertical, multi-line form, which would be easier for me; however, I want to find out whether a one-liner style exists.
Is this doable or not?
My system is macOS High Sierra 10.13.6
Bash version: 3.2.57(1)-release
BSD sed was last updated on May 10, 2005

Your examples have a few subtle errors, so here are a few examples regarding quoting and newlines in strings in bash and sed.
How quoting works in general:
# bash converts escape-sequence '\n' to real newline (0x0a) before passing it to echo
$ echo $'foo\nbar'
foo
bar
# bash passes literal 8 characters 'foo\nbar' to echo and echo simply prints them
$ echo 'foo\nbar'
foo\nbar
# bash passes literal 8 characters 'foo\nbar' to echo and echo -e converts the escape-sequence to a newline
$ echo -e 'foo\nbar'
foo
bar
# bash passes literal string 'foo\nbar' to echo (twice)
# then echo recombines both arguments using a single space
$ str='foo\nbar'
$ echo $str "$str"
foo\nbar foo\nbar
# bash interprets escape-sequences and stores result 'foo<0x0a>bar' in str,
# then passes two arguments 'foo' and 'bar' to echo, due to "word splitting"
# then echo recombines both arguments using a single space
$ str=$'foo\nbar'
$ echo $str
foo bar
# bash interprets escape-sequences and stores result 'foo<0x0a>bar' in str,
# then passes it as a single argument to echo, without "word splitting"
$ str=$'foo\nbar'
$ echo "$str"
foo
bar
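A quick way to confirm which of the two forms a variable holds is to count its characters. The following is a minimal sketch; the variable names lit and real are illustrative and not part of the original examples.

```shell
# Compare an escape sequence stored literally with one expanded by $'...'.
lit='foo\nbar'      # backslash and 'n' are kept as two separate characters
real=$'foo\nbar'    # bash replaces \n with a single newline character

# ${#var} expands to the number of characters stored in var.
echo "literal: ${#lit} characters, expanded: ${#real} characters"
```

The literal form is one character longer because the backslash and the n are stored separately.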
How to apply shell quoting, when dealing with newlines in sed
# replace a character with newline, using newline's escape-sequence
# sed will convert '\n' to a literal newline (0x0a)
$ sed 's/-/foo\nbar/' <<< 'blah-blah'
# replace a character with newline, using newline's escape-sequence in a variable
# sed will convert '\n' to a literal newline (0x0a)
$ str='foo\nbar' # str contains the escape-sequence '\n' and not a literal newline
$ sed 's/-/'"$str"'/' <<< 'blah-blah'
# replace a character with newline, using a literal newline.
# note the line-continuation-mark \ after 'foo' before the literal newline,
# which is part of the sed script, since everything in-between '' is literal
$ sed 's/-/foo\
bar/' <<< 'blah-blah' # end-of-command
# replace a character with newline, using a newline in shell-escape-mode
# note the same line-continuation-mark \ before $'\n', which is part of the sed script
# note: the sed script is a single string composed of three parts '…\', $'\n' and '…',
$ sed 's/-/foo\'$'\n''bar/' <<< 'blah-blah'
# the same as above, but with a single shell-escape-mode string instead of 3 parts.
# note the required quoting of the line-continuation-mark with an additional \ escape
# i.e. after shell-escaping the sed script contains a single \ and a literal newline
$ sed $'s/-/foo\\\nbar/' <<< 'blah-blah'
# replace a character with newline, using a shell-escaped string in a variable
$ str=$'\n' # str contains a literal newline (0x0a) due to shell escaping
$ sed 's/-/foo\'"$str"'bar/' <<< 'blah-blah'
# same as above with the required (quoted) line-continuation inside the variable
# note, how the single \ from '…foo\' (previous example) became \\ inside $'\\…'
$ str=$'\\\n' # str contains \ and a literal newline (0x0a) due to shell escaping
$ sed 's/-/foo'"$str"'bar/' <<< 'blah-blah'
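With GNU sed, all the sed examples print the same (the first two depend on sed itself translating \n in the replacement, which older BSD sed does not do):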
blahfoo
barblah
So, a newline in sed's replacement string must either be
(1) newline's escape-sequence (i.e. '\n'), so sed can replace it with a literal newline, or
(2) a literal newline preceded by a line-continuation-mark (i.e. $'\\\n' or '\'$'\n', which is NOT the same as '\\\n' or '\\n' or $'\\n').
This means you need to replace each literal newline <0x0a> with newline's escape-sequence \n or insert a line-continuation-mark before each literal newline inside your replacement string before double-quote-expanding it into sed's substitute replacement string.
Since there are many more caveats regarding escaping in sed, I recommend using awk's gsub function instead, passing your replacement string as a variable via -v, e.g.
$ str=$'foo\nbar'
$ awk -v REP="$str" -- '{gsub(/-/, REP); print}' <<< 'blah-blah'
blahfoo
barblah
PS: I don't know whether this answer applies fully in your case, because your operating system ships an outdated version of bash.
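Applied to the insertion case from the question, rule (2) suggests putting a backslash in front of every newline in the variable before handing it to sed's i command. The following is a sketch under assumptions: the sample page content and the names esc, page and result are illustrative, the value is assumed to contain no backslashes of its own, and the handling of i\ text differs slightly between sed implementations.

```shell
str=$'foo\nbar'

# Put a backslash in front of every newline inside the value.
# (Backslashes already present in str would need doubling first; not handled here.)
esc=${str//$'\n'/$'\\\n'}

# Sample input standing in for index.html (hypothetical content).
page=$'<body>\n<!-- insert here -->\n</body>'

# The sed script is: address, i\, newline, then the escaped text.
result=$(printf '%s\n' "$page" | sed $'/<!-- insert here -->/i\\\n'"$esc")
printf '%s\n' "$result"
```

The command stays on one line because the newlines that sed needs come from the $'...' string and from the variable rather than from the script source.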

echo -e "$str"
where -e is
enable interpretation of backslash escapes

Use sed command r to insert arbitrary text
str="abc\ndef"
tmp=$(mktemp)
(
echo
printf -- "$str"
echo
) > "$tmp"
sed -i.bak '/<!-- insert here -->/r '"$tmp" index.html
rm -r "$tmp"
sed treats a newline as the command delimiter. The ; is not really a sed command delimiter; only the newline is. Don't append ; or } or spaces after the filename of the w command: they will be interpreted as part of the filename (yes, spaces too). sed commands like w or r are terminated by a newline.
If you want more flexibility, switch to awk.
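Note that r queues the file's contents for output after the matching line, not before it. A small sketch that makes the placement visible; the temporary files and sample page content are illustrative, and output goes to a variable instead of editing in place.

```shell
# Temporary files standing in for the text to insert and for index.html.
tmp=$(mktemp)
page=$(mktemp)
printf 'foo\nbar\n' > "$tmp"
printf '%s\n' '<body>' '<!-- insert here -->' '</body>' > "$page"

# r reads the file named by the rest of the line and emits it after the match.
result=$(sed '/<!-- insert here -->/r '"$tmp" "$page")
printf '%s\n' "$result"

rm -f "$tmp" "$page"
```

Here foo and bar appear below the marker line, so this approach fits when insertion after the marker is acceptable.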

Related

Bash seems to convert LF into LFCR

It seems that Bash converts LF to LFCR. Here is an example below that demonstrates this:
text=$(echo -e "foo\nbar\ntir")
When setting IFS to the LF end of line:
IFS=$(echo -e "\n")
the \n in the string text is not interpreted, as shown below:
for w in $text ; do echo ${w}';' ; done
Output:
foo
bar
tir;
Here the character ";" used as a marker shows that $text contains only one element and so \n is not interpreted although it was set as the IFS.
But now, while setting IFS to the LFCR end of line:
IFS=$(echo -e "\n\r")
the output of the previous command turns into:
foo;
bar;
tir;
The marker ";" shows that $text contains three elements and thereby \n in $text has been interpreted as a \n\r (LFCR) set as the IFS.
So, why does Bash seem to convert LF to LFCR? If it does not, what is the underlying explanation, please?
$IFS is actually being set to an empty string.
$ IFS=$(echo -e "\n")
$ echo "[$IFS]"
[]
When $(...) captures output it removes trailing newlines. You can set $IFS to a newline by using $'...', shell syntax that interprets escape sequences while avoiding the trimming problem.
$ IFS=$'\n'
$ echo "[$IFS]"
[
]
$ text=$'foo\nbar\nbaz\n'
$ printf '%s;\n' $text
foo;
bar;
baz;
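The trimming also explains the LFCR observation: only the trailing newline is removed, so a value that ends in LF, CR, LF keeps its first two characters. A minimal sketch that measures this; the helper count_args and the variable names are illustrative.

```shell
from_subst=$(printf '\n')       # trailing newline stripped: empty string
from_lfcr=$(printf '\n\r\n')    # only the final newline is stripped: LF and CR remain
from_quote=$'\n'                # exactly one newline, nothing stripped

# Hypothetical helper: prints how many arguments it received.
count_args() { echo "$#"; }

text=$'foo\nbar\ntir'
# set -f disables globbing so that only word splitting is observed.
n_empty=$(IFS=$from_subst; set -f; count_args $text)
n_lf=$(IFS=$from_quote; set -f; count_args $text)

echo "lengths: ${#from_subst} ${#from_lfcr} ${#from_quote}; words: $n_empty $n_lf"
```

With an empty IFS no splitting happens at all, which is why the loop in the question saw a single element.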

How to remove white spaces (\t, \n, \r, space) from the beginning and the end of a string in shell?

I want to remove white spaces (\t, \n, \r, space) from the beginning and the end of a string if they exist.
How to do that?
Is it possible to do that only with expressions like ${str#*}?
If you're using bash (which your idea of ${str#} seems to suggest), then you can use this:
echo "${str#"${str%%[![:space:]]*}"}" # trim all leading whitespace characters
echo "${str%"${str##*[![:space:]]}"}" # trim all trailing whitespace characters
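A bare [[:space:]] pattern matches exactly one character, so trimming a whole run of whitespace needs a pattern that can repeat. The following is a minimal sketch using bash's extglob option; the function name trim is hypothetical.

```shell
shopt -s extglob   # enables +(...) patterns, which are off by default in scripts

# trim: hypothetical helper that prints its argument without surrounding whitespace
trim() {
    local s=$1
    s=${s##+([[:space:]])}   # remove the longest leading run of whitespace
    s=${s%%+([[:space:]])}   # remove the longest trailing run of whitespace
    printf '%s' "$s"
}

trimmed=$(trim $' \t hello world \r\n')
printf '[%s]\n' "$trimmed"
```

Whitespace inside the string is left alone; only the two ends are affected.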
You can say
sed -e 's/^[ \t\r\n]*//' -e 's/[ \t\r\n]*$//' <<< "string"
# the first expression trims the beginning of the string, the second trims the end
Or use \s to match tab and space if it is supported by your sed version.
If you can use sed then:
echo "${str}" | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//'

How to safely handle "\n" with sed on automated scripts where "\n" can be in the input to these scripts?

I have a shell script that is called via parameters (it's called by an external binary program which I cannot change), like this:
myscript.sh "param1" "param2"
Now, in this script there's a sed "s/param1/param2/"-like command involved, and param2 can literally contain the newline escape sequence \n (like line1\nline2):
VAL=$(echo "$2" | sed -e 's/[\/&]/\\&/g')
sed -i "s/$1/$VAL/" /a/path/to/file
I already did this: Escape a string for a sed replace pattern to escape backslashes and ampersands that may occur, but this does not help handling the newline \n (sed ignores it).
I know how to do it manually in a script (by entering a real newline, pressing Return, in the shell script file at the appropriate place, or by doing something like $(echo)), but I have no influence over the parameters that are passed.
How can I safely handle the newline sequence so that sed does its job and inserts a newline when \n occurs in the parameter?
In this case, I would very strongly recommend replacing sed with perl. If you are able to do that, then your script becomes:
perl -pi -e 'BEGIN {$a=shift;$b=shift} s/$a/$b/' "$1" "$2" /a/path/to/file
You no longer need the VAL variable at all!
If for some bizarre reason you're absolutely restricted to sed, change the VAL= statement to:
VAL=$(echo "$2" | sed -ne '1h;2,$H;$x;$s/[\/&]/\\&/g;$s/\n/\\n/g;$p;')
But don't do that. Use the perl version instead.
Replace \n with real newlines:
VAL=${VAL//\\n/$'\n'}
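A real newline inside a sed replacement must itself be preceded by a backslash, so the substitution needs to produce backslash plus newline rather than a bare newline. The following sketch combines that with the escaping step from the question. The sample values are illustrative, and it assumes that \n is the only backslash sequence in the parameter and that the first parameter is already a valid sed pattern.

```shell
pattern='a'                      # stands in for "$1" (illustrative value)
param='line1\nline2 & a/b'       # stands in for "$2"; \n is two literal characters

# Escape the s/// delimiter and '&' so that sed takes them literally.
# Using | as the delimiter here keeps the bracket expression unambiguous.
val=$(printf '%s\n' "$param" | sed -e 's|[/&]|\\&|g')

# Turn each \n sequence into backslash + real newline, the form sed accepts.
val=${val//\\n/$'\\\n'}

result=$(printf '%s\n' 'qay' | sed "s/$pattern/$val/")
printf '%s\n' "$result"
```

Because the newline is passed in its escaped literal form, this does not rely on sed translating \n in the replacement.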
From BashFAQ #21, a generic string substitution tool that works with arbitrary literals (neither newlines nor regexp characters being special) using awk:
# usage: gsub_literal STR REP
# replaces all instances of STR with REP. reads from stdin and writes to stdout.
gsub_literal() {
  # STR cannot be empty
  [[ $1 ]] || return
  # string manip needed to escape '\'s, so awk doesn't expand '\n' and such
  awk -v str="${1//\\/\\\\}" -v rep="${2//\\/\\\\}" '
    # get the length of the search string
    BEGIN {
      len = length(str);
    }
    {
      # empty the output string
      out = "";
      # continue looping while the search string is in the line
      while (i = index($0, str)) {
        # append everything up to the search string, and the replacement string
        out = out substr($0, 1, i-1) rep;
        # remove everything up to and including the first instance of the
        # search string from the line
        $0 = substr($0, i + len);
      }
      # append whatever is left
      out = out $0;
      print out;
    }
  '
}
Granted, that's a mouthful, but it's trivial to use:
gsub_literal "$1" "$val" <infile >outfile
VAL=$(echo "$2" | sed -e 's/[\/&]/\\&/g')
How can I safely handle the newline sequence so that sed does its job and inserts a newline
when \n occurs in the parameter?
You can just let sed undo the escaping of \n by adding s/\\n/\n/g, i.e.
VAL=$(echo "$2" | sed -e 's/[\/&]/\\&/g;s/\\n/\n/g')
Test:
# set a 'line1\nline2'
# VAL=$(echo "$2" | sed -e 's/[\/&]/\\&/g;s/\\n/\n/g')
# sed "s/$1/$VAL/" <<<qay
qline1
line2y

How can I print a newline as \n in Bash?

Basically, I want to achieve something like the inverse of echo -e.
I have a variable which stores a command output, but I want to print newlines as \n.
Here's my solution:
sed 's/$/\\n/' | tr -d '\n'
If your input is already in a (Bash) shell variable, say $varWithNewlines:
echo "${varWithNewlines//$'\n'/\\n}"
It simply uses Bash parameter expansion to replace all newline ($'\n') instances with literal '\n' each.
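As a sanity check, the encoding can be reversed with printf's %b format, which interprets backslash escapes. A minimal round-trip sketch; the names encoded and decoded are illustrative, and %b would also interpret any other backslash sequences present in the data.

```shell
varWithNewlines=$'line 1\nline 2\nline 3'

# Encode: replace every real newline with the two characters \ and n.
encoded=${varWithNewlines//$'\n'/\\n}

# Decode: %b expands backslash escapes in its argument.
decoded=$(printf '%b' "$encoded")

printf '%s\n' "$encoded"
```

The decoded value matches the original here because the sample has no trailing newline, which command substitution would otherwise strip.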
If your input comes from a file, use AWK:
awk -v ORS='\\n' 1
In action, with sample input:
# Sample input with actual newlines created with ANSI C quoting ($'...'),
# which turns `\n` literals into actual newlines.
varWithNewlines=$'line 1\nline 2\nline 3'
# Translate newlines to '\n' literals.
# Note the use of `printf %s` to avoid adding an additional newline.
# By contrast, a here-string - <<<"$varWithNewlines" _always appends a newline_.
printf %s "$varWithNewlines" | awk -v ORS='\\n' 1
awk reads input line by line
By setting ORS, the output record separator, to literal '\n' (escaped with an additional \ so that awk doesn't interpret it as an escape sequence), the input lines are output with that separator
1 is just shorthand for {print}, i.e., all input lines are printed, terminated by ORS.
Note: The output will always end in literal '\n', even if your input does not end in a newline.
This is because AWK terminates every output line with ORS, whether the input line ended with a newline (the separator specified in RS) or not.
Here's how to unconditionally strip the terminating literal '\n' from your output.
# Translate newlines to '\n' literals and capture in variable.
varEncoded=$(printf %s "$varWithNewlines" | awk -v ORS='\\n' 1)
# Strip terminating '\n' literal from the variable value
# using Bash parameter expansion.
echo "${varEncoded%\\n}"
By contrast, more work is needed if you want to make the presence of a terminating literal '\n' dependent on whether the input ends with a newline or not.
# Translate newlines to '\n' literals and capture in variable.
varEncoded=$(printf %s "$varWithNewlines" | awk -v ORS='\\n' 1)
# If the input does not end with a newline, strip the terminating '\n' literal.
if [[ $varWithNewlines != *$'\n' ]]; then
# Strip terminating '\n' literal from the variable value
# using Bash parameter expansion.
echo "${varEncoded%\\n}"
else
echo "$varEncoded"
fi
You can use printf "%q":
eol=$'\n'
printf "%q\n" "$eol"
$'\n'
A Bash solution
x=$'abcd\ne fg\nghi'
printf "%s\n" "$x"
abcd
e fg
ghi
y=$(IFS=$'\n'; set -f; printf '%s\\n' $x)
y=${y%??}
printf "%s\n" "$y"
abcd\ne fg\nghi

How can I capture the text between specific delimiters into a shell variable?

I have little problem with specifying my variable. I have a file with normal text and somewhere in it there are brackets [ ] (only 1 pair of brackets in whole file), and some text between them. I need to capture the text within these brackets in a shell (bash) variable. How can I do that, please?
Bash/sed:
VARIABLE=$(tr -d '\n' < filename | sed -n -e '/\[[^]]/s/^[^[]*\[\([^]]*\)].*$/\1/p')
If that is unreadable, here's a bit of an explanation:
VARIABLE=$(subexpression) Assigns the output of the subexpression to the variable VARIABLE.
tr -d '\n' < filename Reads filename, deletes newline characters, and prints the result to sed's input
sed -n -e 'command' Executes the sed command without printing any lines
/\[[^]]/ Execute the command only on lines which contain [some text]
s/ Substitute
^[^[]* Match any non-[ text
\[ Match [
\([^]]*\) Match any non-] text into group 1
] Match ]
.*$ Match any text
/\1/ Replaces the line with group 1
p Prints the line
May I point out that while most of the suggested solutions might work, there is absolutely no reason why you should fork another shell, and spawn several processes to do such a simple task.
The shell provides you with all the tools you need:
$ var='foo[bar] pinch'
$ var=${var#*[}; var=${var%%]*}
$ echo "$var"
bar
See: http://mywiki.wooledge.org/BashFAQ/073
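The same two expansions also cover the case from the question where the text lives in a file, including brackets that span several lines. A sketch assuming bash; the temporary file and its sample content are illustrative.

```shell
# Sample file with the bracketed text spread over two lines (hypothetical content).
tmp=$(mktemp)
printf '%s\n' 'some text' 'before [captured' 'text] after' > "$tmp"

content=$(< "$tmp")        # bash: read the whole file into a variable
inner=${content#*\[}       # drop everything up to and including the first [
inner=${inner%%\]*}        # drop everything from the first ] onwards

rm -f "$tmp"
printf '%s\n' "$inner"
```

No external process is involved in the extraction itself; mktemp, printf and rm are only used to set up and clean up the sample.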
Sed is not necessary:
var=`egrep -o '\[.*\]' FILENAME | tr -d ][`
But it only works with single-line matches.
Using Bash builtin regex matching seems like yet another way of doing it:
var='foo[bar] pinch'
[[ "$var" =~ [^\]\[]*\[([^\[]*)\].* ]] # Bash 3.0
var="${BASH_REMATCH[1]}"
echo "$var"
Assuming you are asking about bash variable:
$ export YOUR_VAR=$(perl -ne'print $1 if /\[(.*?)\]/' your_file.txt)
The above works if brackets are on the same line.
What about:
shell_variable=$(sed -ne '/\[/,/\]/{s/^.*\[//;s/\].*//;p;}' $file)
Worked for me on Solaris 10 under Korn shell; should work with Bash too. Replace '$(...)' with back-ticks in Bourne shell.
Edit: worked when given [ on one line and ] on another. For the single line case as well, use:
shell_variable=$(sed -n -e '/\[[^]]*$/,/\]/{s/^.*\[//;s/\].*//;p;}' \
-e '/\[.*\]/s/^.*\[\([^]]*\)\].*$/\1/p' $file)
The first '-e' deals with the multi-line spread; the second '-e' deals with the single-line case. The first '-e' says:
From the line containing an open bracket [ not followed by a close bracket ] on the same line
Until the line containing close bracket ],
substitute anything up to and including the open bracket with an empty string,
substitute anything from the close bracket onwards with an empty string, and
print the result
The second '-e' says:
For any line containing both open bracket and close bracket
Substitute the pattern consisting of 'characters up to and including open bracket', 'characters up to but excluding close bracket' (and remember this), 'stuff from close bracket onwards' with the remembered characters in the middle, and
print the result
For the multi-line case:
$ file=xxx
$ cat xxx
sdsajdlajsdl
asdajsdkjsaldjsal
sdasdsad [aaaa
bbbbbbb
cccc] asdjsalkdjsaldjlsaj
asdjsalkdjlksjdlaj
asdasjdlkjsaldja
$ shell_variable=$(sed -n -e '/\[[^]]*$/,/\]/{s/^.*\[//;s/\].*//;p;}' \
-e '/\[.*\]/s/^.*\[\([^]]*\)\].*$/\1/p' $file)
$ echo $shell_variable
aaaa bbbbbbb cccc
$
And for the single-line case:
$ cat xxx
sdsajdlajsdl
asdajsdkjsaldjsal
sdasdsad [aaaa bbbbbbb cccc] asdjsalkdjsaldjlsaj
asdjsalkdjlksjdlaj
asdasjdlkjsaldja
$
$ shell_variable=$(sed -n -e '/\[[^]]*$/,/\]/{s/^.*\[//;s/\].*//;p;}' \
-e '/\[.*\]/s/^.*\[\([^]]*\)\].*$/\1/p' $file)
$ echo $shell_variable
aaaa bbbbbbb cccc
$
Somewhere about here, it becomes simpler to do the whole job in Perl, slurping the file and editing the result string in two multi-line substitute operations.
var=`grep -e '\[.*\]' test.txt | sed -e 's/.*\[\(.*\)\].*/\1/'`
Thanks to everyone. I used Strager's version and it works perfectly. Thanks a lot once again.
Backslashes (BSL) got munched up ... :
var='foo[bar] pinch'
[[ "$var" =~ [^\]\[]*\[([^\[]*)\].* ]] # Bash 3.0
# Just in case ...:
[[ "$var" =~ [^BSL]BSL[]*BSL[([^BSL[]*)BSL].* ]] # Bash 3.0
var="${BASH_REMATCH[1]}"
echo "$var"
2 simple steps to extract the text.
split var at [ and get the right part
split var at ] and get the left part
cb0$ var='foo[bar] pinch'
cb0$ var=${var#*[}
cb0$ var=${var%]*} && echo $var
bar

Resources