Bash seems to convert LF into LFCR - bash

It seems that Bash converts LF to LFCR. Indeed, here is an example bellow that describes that:
text=$(echo -e "foo\nbar\ntir")
When setting IFS to the LF end of line:
IFS=$(echo -e "\n")
the \n in the string text is not interpreted such as bellow:
for w in $text ; do echo ${w}';' ; done
Output:
foo
bar
tir;
Here the character ";" used as a marker shows that $text contains only one element and so \n is not interpreted although it was set as the IFS.
But now, while setting IFS to the LFCR end of line:
IFS=$(echo -e "\n\r")
the output of the previous command turns into:
foo;
bar;
tir;
The marker ";" shows that $text contains three elements and thereby \n in $text has been interpreted as a \n\r (LFCR) set as the IFS.
So, why does Bash seems convert LF to LFCR? If it does not, what is the underlying explanation please?

$IFS is actually being set to an empty string.
$ IFS=$(echo -e "\n")
$ echo "[$IFS]"
[]
When $(...) captures output it removes trailing newlines. You can set $IFS to a newline by using $'...', shell syntax that interprets escape sequences while avoiding the trimming problem.
$ IFS=$'\n'
$ echo "[$IFS]"
[
]
$ text=$'foo\nbar\nbaz\n'
$ printf '%s;\n' $text
foo;
bar;
baz;

Related

How to properly expand a Bash variable that contains newlines on sed replacement (insertion) side

Bear with me at first, thank you. Suppose I have
$ echo $'foo\nbar'
foo
bar
Now when I assign the string to a Bash variable, Bash does not give the same vertical output anymore:
$ str='foo\nbar'
$
$ echo $str
foo\nbar
$
$ echo $'str'
str
Try printf:
$ printf "$str\n"
foo
bar
Those examples are for illustration purposes because I am looking for a way to expand the newline(s) inside the $str variable such that I can substitute the $str variable on sed replacement (insertion) side.
# this does not work:
sed -i.bak $'/<!-- insert here -->/i\\\n'$'str'$'\\\n' index.html
# this works as expected though:
sed -i.bak $'/<!-- insert here -->/i\\\n'foo$'\\\n'bar$'\\\n' index.html
I did several ways to hack this but none worked; here is one example:
# this does not work:
sed -i.bak $'/<!-- insert here -->/i\\\n'`printf 'foo\\x0Abar'`$'\\\n' index.html
Further tests, I realized that as long as the variable does not contain newlines, things work as expected:
# This works as long as str2 does not contain any newline.
str2='foo_bar'
sed -i.bak $'/<!-- insert here -->/i\\\n'$str2$'\\\n' index.html
The expected result is that sed will insert 2 liners in place before <!-- insert here --> of the index.html file.
foo
bar
<!-- insert here -->
I try to achieve this as one liner. I know I can break sed into the vertical, multi-line form, which will be easier for me; however, I want to explore if there is a one liner style.
Is this doable or not?
My system is macOS High Sierra 10.13.6
Bash version: 3.2.57(1)-release
BSD sed was last updated on May 10, 2005
Your examples have a few subtle error, so here are a few examples regarding quoting and newlines in strings in bash and sed.
How quoting works in general:
# bash converts escape-sequence '\n' to real newline (0x0a) before passing it to echo
$ echo $'foo\nbar'
foo
bar
# bash passes literal 8 characters 'foo\nbar' to echo and echo simply prints them
$ echo 'foo\nbar'
foo\nbar
# bash passes literal 8 characters 'foo\nbar' to echo and echo converts escape-sequence
$ echo -e 'foo\nbar'
foo
bar
# bash passes literal string 'foo\nbar' to echo (twice)
# then echo recombines both arguments using a single space
$ str='foo\nbar'
$ echo $str "$str"
foo\nbar foo\nbar
# bash interprets escape-sequences and stores result 'foo<0x0a>bar' in str,
# then passes two arguments 'foo' and 'bar' to echo, due to "word splitting"
# then echo recombines both arguments using a single space
$ str=$'foo\nbar'
$ echo $str
foo bar
# bash interprets escape-sequences and stores result 'foo<0x0a>bar' in str,
# then passes it as a single argument to echo, without "word splitting"
$ str=$'foo\nbar'
$ echo "$str"
foo
bar
How to apply shell quoting, when dealing with newlines in sed
# replace a character with newline, using newline's escape-sequence
# sed will convert '\n' to a literal newline (0x0a)
$ sed 's/-/foo\nbar/' <<< 'blah-blah'
# replace a character with newline, using newline's escape-sequence in a variable
# sed will convert '\n' to a literal newline (0x0a)
$ str='foo\nbar' # str contains the escape-sequence '\n' and not a literal newline
$ sed 's/-/'"$str"'/' <<< 'blah-blah'
# replace a character with newline, using a literal newline.
# note the line-continuation-mark \ after 'foo' before the literal newline,
# which is part of the sed script, since everything in-between '' is literal
$ sed 's/-/foo\
bar/' <<< 'blah-blah' # end-of-command
# replace a character with newline, using a newline in shell-escape-mode
# note the same line-continuation-mark \ before $'\n', which is part of the sed script
# note: the sed script is a single string composed of three parts '…\', $'\n' and '…',
$ sed 's/-/foo\'$'\n''bar/' <<< 'blah-blah'
# the same as above, but with a single shell-escape-mode string instead of 3 parts.
# note the required quoting of the line-continuation-mark with an additional \ escape
# i.e. after shell-escaping the sed script contains a single \ and a literal newline
$ sed $'s/-/foo\\\nbar/' <<< 'blah-blah'
# replace a character with newline, using a shell-escaped string in a variable
$ str=$'\n' # str contains a literal newline (0x0a) due to shell escaping
$ sed 's/-/foo\'"$str"'bar/' <<< 'blah-blah'
# same as above with the required (quoted) line-continuation inside the variable
# note, how the single \ from '…foo\' (previous example) became \\ inside $'\\…'
$ str=$'\\\n' # str contains \ and a literal newline (0x0a) due to shell escaping
$ sed 's/-/foo'"$str"'bar/' <<< 'blah-blah'
All the sed examples will print the same:
blahfoo
barblah
So, a newline in sed's replacement string must either be
(1) newline's escape-sequence (i.e. '\n'), so sed can replace it with a literal newline, or
(2) a literal newline preceded by a line-continuation-mark (i.e. $'\\\n' or '\'$'\n', which is NOT the same as '\\\n' or '\\n' or $'\\n').
This means you need to replace each literal newline <0x0a> with newline's escape-sequence \n or insert a line-continuation-mark before each literal newline inside your replacement string before double-quote-expanding it into sed's substitute replacement string.
Since there are many more caveats regarding escaping in sed, I recommend you use awk's gsub function instead passing your replacement string as a variable via -v, e.g.
$ str=$'foo\nbar'
$ awk -v REP="$str" -- '{gsub(/-/, REP); print}' <<< 'blah-blah'
blahfoo
barblah
PS: I don't know, if this answer is entirely true in your case, because your operating system uses an outdated version of bash.
echo -e $str
where -e is
enable interpretation of backslash escapes
Use sed command r to insert arbitrary text
str="abc\ndef"
tmp=$(mktemp)
(
echo
printf -- "$str"
echo
) > "$tmp"
sed -i.bak '/<!-- insert here -->/r '"$tmp" index.html
rm -r "$tmp"
sed interprets newline as command delimiter. The ; doesn't really is a seds command delimeter, only newline is. Don't append/suffix ; or } or spaces in the w command - it will be interpreted as part of the filename (yes, spaces also). sed commands like w or r are escaped by a newline.
If you want more flexibility, rather move to awk.

How to remove trailing spaces from. a variable in a shell script? [duplicate]

I have a shell script with this code:
var=`hg st -R "$path"`
if [ -n "$var" ]; then
echo $var
fi
But the conditional code always executes, because hg st always prints at least one newline character.
Is there a simple way to strip whitespace from $var (like trim() in PHP)?
or
Is there a standard way of dealing with this issue?
I could use sed or AWK, but I'd like to think there is a more elegant solution to this problem.
A simple answer is:
echo " lol " | xargs
Xargs will do the trimming for you. It's one command/program, no parameters, returns the trimmed string, easy as that!
Note: this doesn't remove all internal spaces so "foo bar" stays the same; it does NOT become "foobar". However, multiple spaces will be condensed to single spaces, so "foo bar" will become "foo bar". In addition it doesn't remove end of lines characters.
Let's define a variable containing leading, trailing, and intermediate whitespace:
FOO=' test test test '
echo -e "FOO='${FOO}'"
# > FOO=' test test test '
echo -e "length(FOO)==${#FOO}"
# > length(FOO)==16
How to remove all whitespace (denoted by [:space:] in tr):
FOO=' test test test '
FOO_NO_WHITESPACE="$(echo -e "${FOO}" | tr -d '[:space:]')"
echo -e "FOO_NO_WHITESPACE='${FOO_NO_WHITESPACE}'"
# > FOO_NO_WHITESPACE='testtesttest'
echo -e "length(FOO_NO_WHITESPACE)==${#FOO_NO_WHITESPACE}"
# > length(FOO_NO_WHITESPACE)==12
How to remove leading whitespace only:
FOO=' test test test '
FOO_NO_LEAD_SPACE="$(echo -e "${FOO}" | sed -e 's/^[[:space:]]*//')"
echo -e "FOO_NO_LEAD_SPACE='${FOO_NO_LEAD_SPACE}'"
# > FOO_NO_LEAD_SPACE='test test test '
echo -e "length(FOO_NO_LEAD_SPACE)==${#FOO_NO_LEAD_SPACE}"
# > length(FOO_NO_LEAD_SPACE)==15
How to remove trailing whitespace only:
FOO=' test test test '
FOO_NO_TRAIL_SPACE="$(echo -e "${FOO}" | sed -e 's/[[:space:]]*$//')"
echo -e "FOO_NO_TRAIL_SPACE='${FOO_NO_TRAIL_SPACE}'"
# > FOO_NO_TRAIL_SPACE=' test test test'
echo -e "length(FOO_NO_TRAIL_SPACE)==${#FOO_NO_TRAIL_SPACE}"
# > length(FOO_NO_TRAIL_SPACE)==15
How to remove both leading and trailing spaces--chain the seds:
FOO=' test test test '
FOO_NO_EXTERNAL_SPACE="$(echo -e "${FOO}" | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//')"
echo -e "FOO_NO_EXTERNAL_SPACE='${FOO_NO_EXTERNAL_SPACE}'"
# > FOO_NO_EXTERNAL_SPACE='test test test'
echo -e "length(FOO_NO_EXTERNAL_SPACE)==${#FOO_NO_EXTERNAL_SPACE}"
# > length(FOO_NO_EXTERNAL_SPACE)==14
Alternatively, if your bash supports it, you can replace echo -e "${FOO}" | sed ... with sed ... <<<${FOO}, like so (for trailing whitespace):
FOO_NO_TRAIL_SPACE="$(sed -e 's/[[:space:]]*$//' <<<${FOO})"
There is a solution which only uses Bash built-ins called wildcards:
var=" abc "
# remove leading whitespace characters
var="${var#"${var%%[![:space:]]*}"}"
# remove trailing whitespace characters
var="${var%"${var##*[![:space:]]}"}"
printf '%s' "===$var==="
Here's the same wrapped in a function:
trim() {
local var="$*"
# remove leading whitespace characters
var="${var#"${var%%[![:space:]]*}"}"
# remove trailing whitespace characters
var="${var%"${var##*[![:space:]]}"}"
printf '%s' "$var"
}
You pass the string to be trimmed in quoted form, e.g.:
trim " abc "
A nice thing about this solution is that it will work with any POSIX-compliant shell.
Reference
Remove leading & trailing whitespace from a Bash variable (original source)
In order to remove all the spaces from the beginning and the end of a string (including end of line characters):
echo $variable | xargs echo -n
This will remove duplicate spaces also:
echo " this string has a lot of spaces " | xargs echo -n
Produces: 'this string has a lot of spaces'
Bash has a feature called parameter expansion, which, among other things, allows string replacement based on so-called patterns (patterns resemble regular expressions, but there are fundamental differences and limitations).
[flussence's original line: Bash has regular expressions, but they're well-hidden:]
The following demonstrates how to remove all white space (even from the interior) from a variable value.
$ var='abc def'
$ echo "$var"
abc def
# Note: flussence's original expression was "${var/ /}", which only replaced the *first* space char., wherever it appeared.
$ echo -n "${var//[[:space:]]/}"
abcdef
Strip one leading and one trailing space
trim()
{
local trimmed="$1"
# Strip leading space.
trimmed="${trimmed## }"
# Strip trailing space.
trimmed="${trimmed%% }"
echo "$trimmed"
}
For example:
test1="$(trim " one leading")"
test2="$(trim "one trailing ")"
test3="$(trim " one leading and one trailing ")"
echo "'$test1', '$test2', '$test3'"
Output:
'one leading', 'one trailing', 'one leading and one trailing'
Strip all leading and trailing spaces
trim()
{
local trimmed="$1"
# Strip leading spaces.
while [[ $trimmed == ' '* ]]; do
trimmed="${trimmed## }"
done
# Strip trailing spaces.
while [[ $trimmed == *' ' ]]; do
trimmed="${trimmed%% }"
done
echo "$trimmed"
}
For example:
test4="$(trim " two leading")"
test5="$(trim "two trailing ")"
test6="$(trim " two leading and two trailing ")"
echo "'$test4', '$test5', '$test6'"
Output:
'two leading', 'two trailing', 'two leading and two trailing'
From Bash Guide section on globbing
To use an extglob in a parameter expansion
#Turn on extended globbing
shopt -s extglob
#Trim leading and trailing whitespace from a variable
x=${x##+([[:space:]])}; x=${x%%+([[:space:]])}
#Turn off extended globbing
shopt -u extglob
Here's the same functionality wrapped in a function (NOTE: Need to quote input string passed to function):
trim() {
# Determine if 'extglob' is currently on.
local extglobWasOff=1
shopt extglob >/dev/null && extglobWasOff=0
(( extglobWasOff )) && shopt -s extglob # Turn 'extglob' on, if currently turned off.
# Trim leading and trailing whitespace
local var=$1
var=${var##+([[:space:]])}
var=${var%%+([[:space:]])}
(( extglobWasOff )) && shopt -u extglob # If 'extglob' was off before, turn it back off.
echo -n "$var" # Output trimmed string.
}
Usage:
string=" abc def ghi ";
#need to quote input-string to preserve internal white-space if any
trimmed=$(trim "$string");
echo "$trimmed";
If we alter the function to execute in a subshell, we don't have to worry about examining the current shell option for extglob, we can just set it without affecting the current shell. This simplifies the function tremendously. I also update the positional parameters "in place" so I don't even need a local variable
trim() {
shopt -s extglob
set -- "${1##+([[:space:]])}"
printf "%s" "${1%%+([[:space:]])}"
}
so:
$ s=$'\t\n \r\tfoo '
$ shopt -u extglob
$ shopt extglob
extglob off
$ printf ">%q<\n" "$s" "$(trim "$s")"
>$'\t\n \r\tfoo '<
>foo<
$ shopt extglob
extglob off
You can trim simply with echo:
foo=" qsdqsd qsdqs q qs "
# Not trimmed
echo \'$foo\'
# Trim
foo=`echo $foo`
# Trimmed
echo \'$foo\'
I've always done it with sed
var=`hg st -R "$path" | sed -e 's/ *$//'`
If there is a more elegant solution, I hope somebody posts it.
With Bash's extended pattern matching features enabled (shopt -s extglob), you can use this:
{trimmed##*( )}
to remove an arbitrary amount of leading spaces.
You can delete newlines with tr:
var=`hg st -R "$path" | tr -d '\n'`
if [ -n $var ]; then
echo $var
done
# Trim whitespace from both ends of specified parameter
trim () {
read -rd '' $1 <<<"${!1}"
}
# Unit test for trim()
test_trim () {
local foo="$1"
trim foo
test "$foo" = "$2"
}
test_trim hey hey &&
test_trim ' hey' hey &&
test_trim 'ho ' ho &&
test_trim 'hey ho' 'hey ho' &&
test_trim ' hey ho ' 'hey ho' &&
test_trim $'\n\n\t hey\n\t ho \t\n' $'hey\n\t ho' &&
test_trim $'\n' '' &&
test_trim '\n' '\n' &&
echo passed
There are a lot of answers, but I still believe my just-written script is worth being mentioned because:
it was successfully tested in the shells bash/dash/busybox shell
it is extremely small
it doesn't depend on external commands and doesn't need to fork (->fast and low resource usage)
it works as expected:
it strips all spaces and tabs from beginning and end, but not more
important: it doesn't remove anything from the middle of the string (many other answers do), even newlines will remain
special: the "$*" joins multiple arguments using one space. if you want to trim & output only the first argument, use "$1" instead
if doesn't have any problems with matching file name patterns etc
The script:
trim() {
local s2 s="$*"
until s2="${s#[[:space:]]}"; [ "$s2" = "$s" ]; do s="$s2"; done
until s2="${s%[[:space:]]}"; [ "$s2" = "$s" ]; do s="$s2"; done
echo "$s"
}
Usage:
mystring=" here is
something "
mystring=$(trim "$mystring")
echo ">$mystring<"
Output:
>here is
something<
This is what I did and worked out perfect and so simple:
the_string=" test"
the_string=`echo $the_string`
echo "$the_string"
Output:
test
If you have shopt -s extglob enabled, then the following is a neat solution.
This worked for me:
text=" trim my edges "
trimmed=$text
trimmed=${trimmed##+( )} #Remove longest matching series of spaces from the front
trimmed=${trimmed%%+( )} #Remove longest matching series of spaces from the back
echo "<$trimmed>" #Adding angle braces just to make it easier to confirm that all spaces are removed
#Result
<trim my edges>
To put that on fewer lines for the same result:
text=" trim my edges "
trimmed=${${text##+( )}%%+( )}
# Strip leading and trailing white space (new line inclusive).
trim(){
[[ "$1" =~ [^[:space:]](.*[^[:space:]])? ]]
printf "%s" "$BASH_REMATCH"
}
OR
# Strip leading white space (new line inclusive).
ltrim(){
[[ "$1" =~ [^[:space:]].* ]]
printf "%s" "$BASH_REMATCH"
}
# Strip trailing white space (new line inclusive).
rtrim(){
[[ "$1" =~ .*[^[:space:]] ]]
printf "%s" "$BASH_REMATCH"
}
# Strip leading and trailing white space (new line inclusive).
trim(){
printf "%s" "$(rtrim "$(ltrim "$1")")"
}
OR
# Strip leading and trailing specified characters. ex: str=$(trim "$str" $'\n a')
trim(){
if [ "$2" ]; then
trim_chrs="$2"
else
trim_chrs="[:space:]"
fi
[[ "$1" =~ ^["$trim_chrs"]*(.*[^"$trim_chrs"])["$trim_chrs"]*$ ]]
printf "%s" "${BASH_REMATCH[1]}"
}
OR
# Strip leading specified characters. ex: str=$(ltrim "$str" $'\n a')
ltrim(){
if [ "$2" ]; then
trim_chrs="$2"
else
trim_chrs="[:space:]"
fi
[[ "$1" =~ ^["$trim_chrs"]*(.*[^"$trim_chrs"]) ]]
printf "%s" "${BASH_REMATCH[1]}"
}
# Strip trailing specified characters. ex: str=$(rtrim "$str" $'\n a')
rtrim(){
if [ "$2" ]; then
trim_chrs="$2"
else
trim_chrs="[:space:]"
fi
[[ "$1" =~ ^(.*[^"$trim_chrs"])["$trim_chrs"]*$ ]]
printf "%s" "${BASH_REMATCH[1]}"
}
# Strip leading and trailing specified characters. ex: str=$(trim "$str" $'\n a')
trim(){
printf "%s" "$(rtrim "$(ltrim "$1" "$2")" "$2")"
}
OR
Building upon moskit's expr soulution...
# Strip leading and trailing white space (new line inclusive).
trim(){
printf "%s" "`expr "$1" : "^[[:space:]]*\(.*[^[:space:]]\)[[:space:]]*$"`"
}
OR
# Strip leading white space (new line inclusive).
ltrim(){
printf "%s" "`expr "$1" : "^[[:space:]]*\(.*[^[:space:]]\)"`"
}
# Strip trailing white space (new line inclusive).
rtrim(){
printf "%s" "`expr "$1" : "^\(.*[^[:space:]]\)[[:space:]]*$"`"
}
# Strip leading and trailing white space (new line inclusive).
trim(){
printf "%s" "$(rtrim "$(ltrim "$1")")"
}
Use AWK:
echo $var | awk '{gsub(/^ +| +$/,"")}1'
You can use old-school tr. For example, this returns the number of modified files in a git repository, whitespaces stripped.
MYVAR=`git ls-files -m|wc -l|tr -d ' '`
This will remove all the whitespaces from your String,
VAR2="${VAR2//[[:space:]]/}"
/ replaces the first occurrence and // all occurrences of whitespaces in the string. I.e. all white spaces get replaced by – nothing
I would simply use sed:
function trim
{
echo "$1" | sed -n '1h;1!H;${;g;s/^[ \t]*//g;s/[ \t]*$//g;p;}'
}
a) Example of usage on single-line string
string=' wordA wordB wordC wordD '
trimmed=$( trim "$string" )
echo "GIVEN STRING: |$string|"
echo "TRIMMED STRING: |$trimmed|"
Output:
GIVEN STRING: | wordA wordB wordC wordD |
TRIMMED STRING: |wordA wordB wordC wordD|
b) Example of usage on multi-line string
string=' wordA
>wordB<
wordC '
trimmed=$( trim "$string" )
echo -e "GIVEN STRING: |$string|\n"
echo "TRIMMED STRING: |$trimmed|"
Output:
GIVEN STRING: | wordAA
>wordB<
wordC |
TRIMMED STRING: |wordAA
>wordB<
wordC|
c) Final note:
If you don't like to use a function, for single-line string you can simply use a "easier to remember" command like:
echo "$string" | sed -e 's/^[ \t]*//' | sed -e 's/[ \t]*$//'
Example:
echo " wordA wordB wordC " | sed -e 's/^[ \t]*//' | sed -e 's/[ \t]*$//'
Output:
wordA wordB wordC
Using the above on multi-line strings will work as well, but please note that it will cut any trailing/leading internal multiple space as well, as GuruM noticed in the comments
string=' wordAA
>four spaces before<
>one space before< '
echo "$string" | sed -e 's/^[ \t]*//' | sed -e 's/[ \t]*$//'
Output:
wordAA
>four spaces before<
>one space before<
So if you do mind to keep those spaces, please use the function at the beginning of my answer!
d) EXPLANATION of the sed syntax "find and replace" on multi-line strings used inside the function trim:
sed -n '
# If the first line, copy the pattern to the hold buffer
1h
# If not the first line, then append the pattern to the hold buffer
1!H
# If the last line then ...
$ {
# Copy from the hold to the pattern buffer
g
# Do the search and replace
s/^[ \t]*//g
s/[ \t]*$//g
# print
p
}'
There are a few different options purely in BASH:
line=${line##+([[:space:]])} # strip leading whitespace; no quote expansion!
line=${line%%+([[:space:]])} # strip trailing whitespace; no quote expansion!
line=${line//[[:space:]]/} # strip all whitespace
line=${line//[[:space:]]/} # strip all whitespace
line=${line//[[:blank:]]/} # strip all blank space
The former two require extglob be set/enabled a priori:
shopt -s extglob # bash only
NOTE: variable expansion inside quotation marks breaks the top two examples!
The pattern matching behaviour of POSIX bracket expressions are detailed here. If you are using a more modern/hackable shell such as Fish, there are built-in functions for string trimming.
I've seen scripts just use variable assignment to do the job:
$ xyz=`echo -e 'foo \n bar'`
$ echo $xyz
foo bar
Whitespace is automatically coalesced and trimmed. One has to be careful of shell metacharacters (potential injection risk).
I would also recommend always double-quoting variable substitutions in shell conditionals:
if [ -n "$var" ]; then
since something like a -o or other content in the variable could amend your test arguments.
Here's a trim() function that trims and normalizes whitespace
#!/bin/bash
function trim {
echo $*
}
echo "'$(trim " one two three ")'"
# 'one two three'
And another variant that uses regular expressions.
#!/bin/bash
function trim {
local trimmed="$#"
if [[ "$trimmed" =~ " *([^ ].*[^ ]) *" ]]
then
trimmed=${BASH_REMATCH[1]}
fi
echo "$trimmed"
}
echo "'$(trim " one two three ")'"
# 'one two three'
This does not have the problem with unwanted globbing, also, interior white-space is unmodified (assuming that $IFS is set to the default, which is ' \t\n').
It reads up to the first newline (and doesn't include it) or the end of string, whichever comes first, and strips away any mix of leading and trailing space and \t characters. If you want to preserve multiple lines (and also strip leading and trailing newlines), use read -r -d '' var << eof instead; note, however, that if your input happens to contain \neof, it will be cut off just before. (Other forms of white space, namely \r, \f, and \v, are not stripped, even if you add them to $IFS.)
read -r var << eof
$var
eof
To remove spaces and tabs from left to first word, enter:
echo " This is a test" | sed "s/^[ \t]*//"
cyberciti.biz/tips/delete-leading-spaces-from-front-of-each-word.html
var=' a b c '
trimmed=$(echo $var)
This is the simplest method I've seen. It only uses Bash, it's only a few lines, the regexp is simple, and it matches all forms of whitespace:
if [[ "$test" =~ ^[[:space:]]*([^[:space:]].*[^[:space:]])[[:space:]]*$ ]]
then
test=${BASH_REMATCH[1]}
fi
Here is a sample script to test it with:
test=$(echo -e "\n \t Spaces and tabs and newlines be gone! \t \n ")
echo "Let's see if this works:"
echo
echo "----------"
echo -e "Testing:${test} :Tested" # Ugh!
echo "----------"
echo
echo "Ugh! Let's fix that..."
if [[ "$test" =~ ^[[:space:]]*([^[:space:]].*[^[:space:]])[[:space:]]*$ ]]
then
test=${BASH_REMATCH[1]}
fi
echo
echo "----------"
echo -e "Testing:${test}:Tested" # "Testing:Spaces and tabs and newlines be gone!"
echo "----------"
echo
echo "Ah, much better."
Removing spaces to one space:
(text) | fmt -su
Assignments ignore leading and trailing whitespace and as such can be used to trim:
$ var=`echo ' hello'`; echo $var
hello
Python has a function strip() that works identically to PHP's trim(), so we can just do a little inline Python to make an easily understandable utility for this:
alias trim='python -c "import sys; sys.stdout.write(sys.stdin.read().strip())"'
This will trim leading and trailing whitespace (including newlines).
$ x=`echo -e "\n\t \n" | trim`
$ if [ -z "$x" ]; then echo hi; fi
hi

Trailing newlines get truncated from bash subshell

Bash seems to remove trailing newlines from the output of subshells. For instance:
$ echo "Newline: '$(echo $'\n')'"
will produce the output
Newline: ''
Does anyone know a workaround or a way to prevent this truncation from happening?
If all you need is just a newline in a variable:
nl=$'\n'
If you need to retain the newline, you can do this (which you show in your own answer):
f () { echo "hello"; }
output=$(f; echo "x")
output=${output%x}
echo "'$output'"
Resulting in:
'hello
'
After some more experimentation I found a workaround using shell variables. Basically, I make sure that the output does not end in a newline, then I strip off the added text later
output="$(echo $'\n'x )"
output="${output%x}"
echo "Newline: '$output'"
This gives the proper output
Newline: '
'
You can use the -e option to enable interpretation of backslash escapes and do it all with one echo.
$ echo -e "Newline: '\n'"
will produce the output
Newline: '
'

Printf example in bash does not create a newline

Working with printf in a bash script, adding no spaces after "\n" does not create a newline, whereas adding a space creates a newline, e. g.:
No space after "\n"
NewLine=`printf "\n"`
echo -e "Firstline${NewLine}Lastline"
Result:
FirstlineLastline
Space after "\n "
NewLine=`printf "\n "`
echo -e "Firstline${NewLine}Lastline"
Result:
Firstline
Lastline
Question: Why doesn't 1. create the following result:
Firstline
Lastline
I know that this specific issue could have been worked around using other techniques, but I want to focus on why 1. does not work.
Edited:
When using echo instead of printf, I get the expected result, but why does printf work differently?
NewLine=`echo "\n"`
echo -e "Firstline${NewLine}Lastline"
Result:
Firstline
Lastline
The backtick operator removes trailing new lines. See 3.4.5. Command substitution at http://tldp.org/LDP/Bash-Beginners-Guide/html/sect_03_04.html
Note on edited question
Compare:
[alvaro#localhost ~]$ printf "\n"
[alvaro#localhost ~]$ echo "\n"
\n
[alvaro#localhost ~]$ echo -e "\n"
[alvaro#localhost ~]$
The echo command doesn't treat \n as a newline unless you tell him to do so:
NAME
echo - display a line of text
[...]
-e enable interpretation of backslash escapes
POSIX 7 specifies this behaviour here:
[...] with the standard output of the command, removing sequences of one or more characters at the end of the substitution
Maybe people will come here with the same problem I had:
echoing \n inside a code wrapped in backsticks. A little tip:
printf "astring\n"
# and
printf "%s\n" "astring"
# both have the same effect.
# So... I prefer the less typing one
The short answer is:
# Escape \n correctly !
# Using just: printf "$myvar\n" causes this effect inside the backsticks:
printf "banana
"
# So... you must try \\n that will give you the desired
printf "banana\n"
# Or even \\\\n if this string is being send to another place
# before echoing,
buffer="${buffer}\\\\n printf \"$othervar\\\\n\""
One common problem is that if you do inside the code:
echo 'Tomato is nice'
when surrounded with backsticks will produce the error
command Tomato not found.
The workaround is to add another echo -e or printf
printed=0
function mecho(){
#First time you need an "echo" in order bash relaxes.
if [[ $printed == 0 ]]; then
printf "echo -e $1\\\\n"
printed=1
else
echo -e "\r\n\r$1\\\\n"
fi
}
Now you can debug your code doing in prompt just:
(prompt)$ `mySuperFunction "arg1" "etc"`
The output will be nicely
mydebug: a value
otherdebug: whathever appended using myecho
a third string
and debuging internally with
mecho "a string to be hacktyped"
$ printf -v NewLine "\n"
$ echo -e "Firstline${NewLine}Lastline"
Firstline
Lastline
$ echo "Firstline${NewLine}Lastline"
Firstline
Lastline
It looks like BASH is removing trailing newlines.
e.g.
NewLine=`printf " \n\n\n"`
echo -e "Firstline${NewLine}Lastline"
Firstline Lastline
NewLine=`printf " \n\n\n "`
echo -e "Firstline${NewLine}Lastline"
Firstline
Lastline
Your edited echo version is putting a literal backslash-n into the variable $NewLine which then gets interpreted by your echo -e. If you did this instead:
NewLine=$(echo -e "\n")
echo -e "Firstline${NewLine}Lastline"
your result would be the same as in case #1. To make that one work that way, you'd have to escape the backslash and put the whole thing in single quotes:
NewLine=$(printf '\\n')
echo -e "Firstline${NewLine}Lastline"
or double escape it:
NewLine=$(printf "\\\n")
Of course, you could just use printf directly or you can set your NewLine value like this:
printf "Firstline\nLastline\n"
or
NewLine=$'\n'
echo "Firstline${NewLine}Lastline" # no need for -e
For people coming here wondering how to use newlines in arguments to printf, use %b instead of %s:
$> printf "a%sa" "\n"
a\na
$> printf "a%ba" "\n"
a
a
From the manual:
%b expand backslash escape sequences in the corresponding argument
We do not need "echo" or "printf" for creating the NewLine variable:
NewLine="
"
printf "%q\n" "${NewLine}"
echo "Firstline${NewLine}Lastline"
Bash delete all trailing newlines in commands substitution.
To save trailing newlines, assign printf output to the variable with printf -v VAR
instead of
NewLine=`printf "\n"`
echo -e "Firstline${NewLine}Lastline"
#FirstlineLastline
use
printf -v NewLine '\n'
echo -e "Firstline${NewLine}Lastline"
#Firstline
#Lastline
Explanation
According to bash man
3.5.4 Command Substitution
$(command)
or
`command`
Bash performs the expansion by executing command and replacing the command substitution with the standard output of the command, with any trailing newlines deleted. Embedded newlines are not deleted, but they may be removed during word splitting.
So, after adding any trailing newlines, bash will delete them.
var=$(printf '%s\n%s\n\n\n' 'foo' 'bar')
echo "$var"
output:
foo
bar
According to help printf
printf [-v var] format [arguments]
If the -v option is supplied, the output is placed into the value of the shell variable VAR rather than being sent to the standard output.
In this case, for safe copying of formatted text to the variable, use the [-v var] option:
printf -v var '%s\n%s\n\n\n' 'foo' 'bar'
echo "$var"
output:
foo
bar
Works ok if you add "\r"
$ nl=`printf "\n\r"` && echo "1${nl}2"
1
2

How can I capture the text between specific delimiters into a shell variable?

I have little problem with specifying my variable. I have a file with normal text and somewhere in it there are brackets [ ] (only 1 pair of brackets in whole file), and some text between them. I need to capture the text within these brackets in a shell (bash) variable. How can I do that, please?
Bash/sed:
VARIABLE=$(tr -d '\n' filename | sed -n -e '/\[[^]]/s/^[^[]*\[\([^]]*\)].*$/\1/p')
If that is unreadable, here's a bit of an explanation:
VARIABLE=`subexpression` Assigns the variable VARIABLE to the output of the subexpression.
tr -d '\n' filename Reads filename, deletes newline characters, and prints the result to sed's input
sed -n -e 'command' Executes the sed command without printing any lines
/\[[^]]/ Execute the command only on lines which contain [some text]
s/ Substitute
^[^[]* Match any non-[ text
\[ Match [
\([^]]*\) Match any non-] text into group 1
] Match ]
.*$ Match any text
/\1/ Replaces the line with group 1
p Prints the line
May I point out that while most of the suggested solutions might work, there is absolutely no reason why you should fork another shell, and spawn several processes to do such a simple task.
The shell provides you with all the tools you need:
$ var='foo[bar] pinch'
$ var=${var#*[}; var=${var%%]*}
$ echo "$var"
bar
See: http://mywiki.wooledge.org/BashFAQ/073
Sed is not necessary:
var=`egrep -o '\[.*\]' FILENAME | tr -d ][`
But it's only works with single line matches.
Using Bash builtin regex matching seems like yet another way of doing it:
var='foo[bar] pinch'
[[ "$var" =~ [^\]\[]*\[([^\[]*)\].* ]] # Bash 3.0
var="${BASH_REMATCH[1]}"
echo "$var"
Assuming you are asking about bash variable:
$ export YOUR_VAR=$(perl -ne'print $1 if /\[(.*?)\]/' your_file.txt)
The above works if brackets are on the same line.
What about:
shell_variable=$(sed -ne '/\[/,/\]/{s/^.*\[//;s/\].*//;p;}' $file)
Worked for me on Solaris 10 under Korn shell; should work with Bash too. Replace '$(...)' with back-ticks in Bourne shell.
Edit: worked when given [ on one line and ] on another. For the single line case as well, use:
shell_variable=$(sed -n -e '/\[[^]]*$/,/\]/{s/^.*\[//;s/\].*//;p;}' \
-e '/\[.*\]/s/^.*\[\([^]]*\)\].*$/\1/p' $file)
The first '-e' deals with the multi-line spread; the second '-e' deals with the single-line case. The first '-e' says:
From the line containing an open bracket [ not followed by a close bracket ] on the same line
Until the line containing close bracket ],
substitute anything up to and including the open bracket with an empty string,
substitute anything from the close bracket onwards with an empty string, and
print the result
The second '-e' says:
For any line containing both open bracket and close bracket
Substitute the pattern consisting of 'characters up to and including open bracket', 'characters up to but excluding close bracket' (and remember this), 'stuff from close bracket onwards' with the remembered characters in the middle, and
print the result
For the multi-line case:
$ file=xxx
$ cat xxx
sdsajdlajsdl
asdajsdkjsaldjsal
sdasdsad [aaaa
bbbbbbb
cccc] asdjsalkdjsaldjlsaj
asdjsalkdjlksjdlaj
asdasjdlkjsaldja
$ shell_variable=$(sed -n -e '/\[[^]]*$/,/\]/{s/^.*\[//;s/\].*//;p;}' \
-e '/\[.*\]/s/^.*\[\([^]]*\)\].*$/\1/p' $file)
$ echo $shell_variable
aaaa bbbbbbb cccc
$
And for the single-line case:
$ cat xxx
sdsajdlajsdl
asdajsdkjsaldjsal
sdasdsad [aaaa bbbbbbb cccc] asdjsalkdjsaldjlsaj
asdjsalkdjlksjdlaj
asdasjdlkjsaldja
$
$ shell_variable=$(sed -n -e '/\[[^]]*$/,/\]/{s/^.*\[//;s/\].*//;p;}' \
-e '/\[.*\]/s/^.*\[\([^]]*\)\].*$/\1/p' $file)
$ echo $shell_variable
aaaa bbbbbbb cccc
$
Somewhere about here, it becomes simpler to do the whole job in Perl, slurping the file and editing the result string in two multi-line substitute operations.
var=`grep -e '\[.*\]' test.txt | sed -e 's/.*\[\(.*\)\].*/\1/' infile.txt`
Thanks to everyone, i used Strager's version and works perfectly, thanks alot once again...
var=`grep -e '\[.*\]' test.txt | sed -e 's/.*\[\(.*\)\].*/\1/' infile.txt`
Backslashes (BSL) got munched up ... :
var='foo[bar] pinch'
[[ "$var" =~ [^\]\[]*\[([^\[]*)\].* ]] # Bash 3.0
# Just in case ...:
[[ "$var" =~ [^BSL]BSL[]*BSL[([^BSL[]*)BSL].* ]] # Bash 3.0
var="${BASH_REMATCH[1]}"
echo "$var"
2 simple steps to extract the text.
split var at [ and get the right part
split var at ] and get the left part
cb0$ var='foo[bar] pinch'
cb0$ var=${var#*[}
cb0$ var=${var%]*} && echo $var
bar

Resources