Dealing with variables and new lines, and quoting in a bash script - bash

I would like to automate the following svn command. Note this command produces the desired results on my system - Ubuntu 10.04, svn 1.6.6, bash shell, when issued from the command line:
svn ci -m $'Added new File: newFile.txt\nOrig loc: /etc/networking/newFile.txt' /home/user/svnDir/newFile.txt
I would like to run that command in a bash script, assuming that the original full path to the file is contained in the variable $oFileFull, and the filename is in $oFileName. The script is executed from the svn directory. I need to allow for the possibility that the file name and or path contain spaces.
so the line inside my shel script might look like:
svn ci -m$'Added new file: ${oFileName}\nOrig loc: ${oFileFull}' ${oFileName}
But I want the variables (which may contain spaces) expanded before the command is executed, and I cannot figure out how to do this while enclosing the svn comment in single quotes which is necessary in order to get the new line in the subversion comment log. I am pulling my hair out trying to figure out how to properly quote and assemble this command. Any help appreciated.
The following was posted as an answer, but should have been an edit.
#Dennis:
Wow, this is hard to wrap my head around, I'd love it if someone could point me towards an online resource that explains this clearly and concisely. The ones I have found have not cleared it up for me.
My latest source of confusion is the following:
#!/bin/bash
nl=$'\n'
msg="Line 1${nl}Line 2"
echo $msg # ouput = Line 1 Line 2
echo -e $msg # ouput = Line 1 Line 2
echo "$msg" # output = Line 1
# Line 2
What I'm attempting to illustrate is that having the double quotes around the variable $msg splits the output into two lines, without the double quotes, even with the -e switch, there is no new line.
I don't get why the double quotes are necessary - why isn't the $nl variable expanded when it is assigned as part of the msg variable?
I hope I'm not committing a StackOverflow faux pas by answering my own question. And in fact, I am not really providing an answer, but merely a response to your comment. I cannot format a comment as necessary.

Put your newline in a variable, use the variable wherever you need a newline and change the quotes around your larger string to double quotes. You should also always quote any variable that contains a filename.
nl=$'\n'
svn ci -m"Added new file: ${oFileName}${nl}Orig loc: ${oFileFull}" "${oFileName}"

#Scott, I know this has already been answered, but here are some comments on your additional question about the difference between echo $msg and echo "$msg". Your example:
nl=$'\n'
msg="Line 1${nl}Line 2"
echo $msg # ouput = Line 1 Line 2
echo -e $msg # ouput = Line 1 Line 2
echo "$msg" # output = Line 1
# Line 2
It really helps to know how many arguments the shell is passing to the echo command in each case. Echo takes any number of arguments. It writes all of them to standard output, in order, with a single space after each one except the last one.
In the first example, echo $msg, the shell replaces $msg with the characters that you've set it to (one of which is a newline). Then before invoking echo it splits that string into multiple arguments using the Internal Field Separator (IFS), and it passes that resulting list of arguments to echo. By default, the IFS is set to whitespace (spaces, tabs, and newlines), so the shell chops your $msg string into four arguments: "Line", "1", "Line", and "2". Echo never even sees the newline, because the shell considers it a separator, just like a space.
In the second example, echo -e $msg, the echo command receives five arguments: "-e", "Line", "1", "Line", "2". The -e argument tells the command to expand any backslash characters that it sees in its arguments, but there are none, so echo leaves the arguments unchanged. The result is the same as the first example.
In the last example, echo "$msg", you're telling the shell to expand $msg and pass its contents to echo but to treat everything between the double quotation marks as one argument, so echo receives one argument, a string that contains letters, numbers, spaces and newlines. It echoes that string exactly as it receives it, so "Line 1" and "Line 2" appear on separate lines.
Try this experiment: set the IFS to something arbitrary, like the letter i, and watch how the shell splits your string differently:
nl=$'\n'
msg="Line 1${nl}Line 2"
IFS=i
echo $msg # ouput = L ne 1
# L ne 2
There's nothing especially magical about whitespace. Here, because of the weird IFS, the shell treats the letter i as a separator but not the spaces or newlines. So echo receives three arguments: "L", "ne \nL", and "ne 2". (I'm showing the newline character as \n, but it's really just a single character, like X or p or 5.)
Once you realize that single and double quotation marks on the shell's command line are all about constructing arguments for the commands, the logic begins to make sense.

Bash will concatenate adjacent quoted sections. So for example
$ r=rooty
$ t=tooty
$ echo $r$'\n'$t
rooty
tooty
$ echo "$r $r"$'\n'$t
rooty rooty
tooty
For an online guide, the advanced bash scripting guide may be useful, though it may be no better than just reading bash's long and detailed manual. I recommend using vim's Man command (standard under version 7.2, if not earlier versions) to bring up the bash manual, and then using indent-based folding so that you can browse its heirarchy. Well, assuming that you don't have the time and patience to read all 3300 lines from start to finish.

Related

ls -1 is not working as intended in csh

I want to list all the files in a directory one line after another, for which I am using a sample shell script as follows
#!/bin/sh
MY_VAR="$(ls -1)"
echo "$MY_VAR"
This works out as expected, however if the same is executed using csh as follows:
#!/bin/csh
set MY_VAR = `ls -1`
echo $MY_VAR
it outputs all files in one single line, than printing one file per line.
Can any one explain why in csh ls -1 is not working as expected.
From man csh, emphasis mine:
Command substitution
Command substitution is indicated by a command enclosed in ``'. The
output from such a command is broken into separate words at blanks,
tabs and newlines, and null words are discarded. The output is vari‐
able and command substituted and put in place of the original string.
Command substitutions inside double quotes (`"') retain blanks and
tabs; only newlines force new words. The single final newline does not
force a new word in any case. It is thus possible for a command sub‐
stitution to yield only part of a word, even if the command outputs a
complete line.
By default, the shell since version 6.12 replaces all newline and car‐
riage return characters in the command by spaces. If this is switched
off by unsetting csubstnonl, newlines separate commands as usual.
The entries are assigned in a list; you can access a single entry with e.g. echo $MY_VAR[2].
This is different from the Bourne shell, which doesn't have the concept of a "list" and variables are always strings.
To print all the entries on a single line, use a foreach loop:
#!/bin/csh
set my_var = "`ls -1`"
foreach e ($my_var)
echo "$e"
end
Adding double quotes around the `ls -1` is recommended, as this will make sure it works correctly when you have filenames with a space (otherwise such files would show up as two words/list entries).

Why does echo "$out" split output onto multiple lines, if quotes suppress word-splitting?

I have very simple directory with "directory1" and "file2" in it.
After
out=`ls`
I want to print my variable: echo $out gives:
directory1 file2
but echo "$out" gives:
directory1
file2
so using quotes gives me output with each record on separate line. As we know ls command prints output using single line for all files/dirs (if line is big enough to contain output) so I expected that using double quotes prevents my shell from splitting words to separate lines while ommitting quotes would split them.
Pls tell me: why using quotes (used for prevent word-splitting) suddenly splits output ?
On Behavior Of ls
ls only prints multiple filenames on a single line by default when output is to a TTY. When output is to a pipeline, a file, or similar, then the default is to print one line to a file.
Quoting from the POSIX standard for ls, with emphasis added:
The default format shall be to list one entry per line to standard output; the exceptions are to terminals or when one of the -C, -m, or -x options is specified. If the output is to a terminal, the format is implementation-defined.
Literal Question (Re: Quoting)
It's the very act of splitting your command into separate arguments that causes it to be put on one line! Natively, your value spans multiple lines, so echoing it unmodified (without any splitting) prints it precisely that manner.
The result of your command is something like:
out='directory1
file2'
When you run echo "$out", that exact content is printed. When you run echo $out, by contrast, the behavior is akin to:
echo "directory1" "file2"
...in that the string is split into two elements, each passed as completely different argument to echo, for echo to deal with as it sees fit -- in this case, printing both those arguments on the same line.
On Side Effects Of Word Splitting
Word-splitting may look like it does what you want here, but that's often not the case! Consider some particular issues:
Word-splitting expands glob expressions: If a filename contains a * surrounded by whitespace, that * will be replaced with a list of files in the current directory, leading to duplicate results.
Word-splitting doesn't honor quotes or escaping: If a filename contains whitespace, that internal whitespace can't be distinguished from whitespace separating multiple names. This is closely related to the issues described in BashFAQ #50.
On Reading Directories
See Why you shouldn't parse the output of ls. In short -- in your example of out=`ls`, the out variable (being a string) isn't able to store all possible filenames in a useful, parsable manner.
Consider, for instance, a file created as such:
touch $'hello\nworld"three words here"'
...that filename contains spaces and newlines, and word-splitting won't correctly detect it as a single name in the output from ls. However, you can store and process it in an array:
# create an array of filenames
names=( * )
if ! [[ -e $names || -L $names ]]; then # this tests only the FIRST name
echo "No names matched" >&2 # ...but that's good enough.
else
echo "Found ${#files[#]} files" # print number of filenames
printf '- %q\n' "${names[#]}"
fi

Newlines at the end "$( .... )" get removed in shell scripts. Why?

Can someone explain why the output of these two commands is different?
$ echo "NewLine1\nNewLine2\n"
NewLine1
NewLine2
<-- Note 2nd newline here
$ echo "$(echo "NewLine1\nNewLine2\n")"
NewLine1
NewLine2
$ <-- No second newline
Is there any good way that I can keep the new lines at the end of the output in "$( .... )" ? I've thought about just adding a dummy letter and removing it, but I'd quite like to understand why those new lines are going away.
Because that's what POSIX specifies and has always been like that in Bourne shells:
2.6.3 Command Substitution
Command substitution allows the output of a command to be substituted
in place of the command name itself. Command substitution shall occur
when the command is enclosed as follows:
$(command)
or (backquoted version):
`command`
The shell shall expand the command substitution by executing command
in a subshell environment (see Shell Execution Environment) and
replacing the command substitution (the text of command plus the
enclosing "$()" or backquotes) with the standard output of the
command, removing sequences of one or more <newline> characters at the
end of the substitution. Embedded <newline> characters before the end
of the output shall not be removed; however, they may be treated as
field delimiters and eliminated during field splitting, depending on
the value of IFS and quoting that is in effect. If the output contains
any null bytes, the behavior is unspecified.
One way to keep the final newline(s) would be
VAR="$(command; echo x)" # Append x to keep newline(s).
VAR=${VAR%x} # Chop x.
Vis.:
$ x="$(whoami; echo x)" ; printf '<%s>\n' "$x" "${x%x}"
<user
x>
<user
>
But why remove trailing newlines? Because more often than not you want it that way. I'm also programming in perl and I can't count the number of times where I read a line or variable and then need to chop the newline:
while (defined ($string = <>)) {
chop $string;
frobnitz($string);
}
command substitution removes every trailing newline.
It makes sense to remove one. For instance:
basename foo/bar
outputs bar\n. In:
var=$(basename foo/bar)
you want $var to contain bar, not bar\n.
However in
var=$(basename $'foo/bar\n')
You would like $var to contain bar\n (after all, newline is as valid a character as any in a file name on Unix). But all shells remove every trailing newline character. That misfeature was in the original Bourne shell and even rc which has fixed most of Bourne's flaws has not fixed that one. (though rc has the ``(){cmd} syntax to not strip any newline character).
In POSIX shells, to work around the issue, you can do:
var=$(basename -- "$file"; echo .)
var=${var%??}
Though you're then losing the exit status of basename. Which you can fix with:
var=$(basename -- "$file" && echo .) && var=${var%??}
${var%??} is to remove the last two characters. The first one is the . that we added above, the second is the one newline character added by basename, we're not removing any more as command substitution would do as the other newline characters, if any, would be part of the filename we want to get the base of, so we do want them.
In the Bourne shell which doesn't have the ${var%x} operator, you had to go a long and convoluted way to work around it.
If the newlines were not removed, then constructs like:
x="$(pwd)/filename"
would not work usefully, but the people who wrote Unix preferred useful behaviour.
Once, briefly, a very long time ago (like 1983, maybe 1984), I suffered from a shell update on a particular variant of Unix that didn't remove the trailing newline. It broke scripts all over the place. It was fixed very quickly.

place a multi-line output inside a variable

I'm writing a script in bash and I want it to execute a command and to handle each line separately. for example:
LINES=$(df)
echo $LINES
it will return all the output converting new lines with spaces.
example:
if the output was supposed to be:
1
2
3
then I would get
1 2 3
how can I place the output of a command into a variable allowing new lines to still be new lines so when I print the variable i will get proper output?
Generally in bash $v is asking for trouble in most cases. Almost always what you really mean is "$v" in double quotes:
LINES="$(df)"
echo "$LINES"
No, it will not. The $(something) only strips trailing newlines.
The expansion in argument to echo splits on whitespace and than echo concatenates separate arguments with space. To preserve the whitespace, you need to quote again:
echo "$LINES"
Note, that the assignment does not need to be quoted; result of expansion is not word-split in assignment to variable and in argument to case. But it can be quoted and it's easier to just learn to just always put the quotes in.

How to split strings over multiple lines in Bash?

How can i split my long string constant over multiple lines?
I realize that you can do this:
echo "continuation \
lines"
>continuation lines
However, if you have indented code, it doesn't work out so well:
echo "continuation \
lines"
>continuation lines
This is what you may want
$ echo "continuation"\
> "lines"
continuation lines
If this creates two arguments to echo and you only want one, then let's look at string concatenation. In bash, placing two strings next to each other concatenate:
$ echo "continuation""lines"
continuationlines
So a continuation line without an indent is one way to break up a string:
$ echo "continuation"\
> "lines"
continuationlines
But when an indent is used:
$ echo "continuation"\
> "lines"
continuation lines
You get two arguments because this is no longer a concatenation.
If you would like a single string which crosses lines, while indenting but not getting all those spaces, one approach you can try is to ditch the continuation line and use variables:
$ a="continuation"
$ b="lines"
$ echo $a$b
continuationlines
This will allow you to have cleanly indented code at the expense of additional variables. If you make the variables local it should not be too bad.
Here documents with the <<-HERE terminator work well for indented multi-line text strings. It will remove any leading tabs from the here document. (Line terminators will still remain, though.)
cat <<-____HERE
continuation
lines
____HERE
See also http://ss64.com/bash/syntax-here.html
If you need to preserve some, but not all, leading whitespace, you might use something like
sed 's/^ //' <<____HERE
This has four leading spaces.
Two of them will be removed by sed.
____HERE
or maybe use tr to get rid of newlines:
tr -d '\012' <<-____
continuation
lines
____
(The second line has a tab and a space up front; the tab will be removed by the dash operator before the heredoc terminator, whereas the space will be preserved.)
For wrapping long complex strings over many lines, I like printf:
printf '%s' \
"This will all be printed on a " \
"single line (because the format string " \
"doesn't specify any newline)"
It also works well in contexts where you want to embed nontrivial pieces of shell script in another language where the host language's syntax won't let you use a here document, such as in a Makefile or Dockerfile.
printf '%s\n' >./myscript \
'#!/bin/sh` \
"echo \"G'day, World\"" \
'date +%F\ %T' && \
chmod a+x ./myscript && \
./myscript
You can use bash arrays
$ str_array=("continuation"
"lines")
then
$ echo "${str_array[*]}"
continuation lines
there is an extra space, because (after bash manual):
If the word is double-quoted, ${name[*]} expands to a single word with
the value of each array member separated by the first character of the
IFS variable
So set IFS='' to get rid of extra space
$ IFS=''
$ echo "${str_array[*]}"
continuationlines
In certain scenarios utilizing Bash's concatenation ability might be appropriate.
Example:
temp='this string is very long '
temp+='so I will separate it onto multiple lines'
echo $temp
this string is very long so I will separate it onto multiple lines
From the PARAMETERS section of the Bash Man page:
name=[value]...
...In the context where an assignment statement is assigning a value to a shell variable or array index, the += operator can be used to append to or add to the variable's previous value. When += is applied to a variable for which the integer attribute has been set, value is evaluated as an arithmetic expression and added to the variable's current value, which is also evaluated. When += is applied to an array variable using compound assignment (see Arrays below), the variable's value is not unset (as it is when using =), and new values are appended to the array beginning at one greater than the array's maximum index (for indexed arrays) or added as additional key-value pairs in an associative array. When applied to a string-valued variable, value is expanded and appended to the variable's value.
You could simply separate it with newlines (without using backslash) as required within the indentation as follows and just strip of new lines.
Example:
echo "continuation
of
lines" | tr '\n' ' '
Or if it is a variable definition newlines gets automatically converted to spaces. So, strip of extra spaces only if applicable.
x="continuation
of multiple
lines"
y="red|blue|
green|yellow"
echo $x # This will do as the converted space actually is meaningful
echo $y | tr -d ' ' # Stripping of space may be preferable in this case
This isn't exactly what the user asked, but another way to create a long string that spans multiple lines is by incrementally building it up, like so:
$ greeting="Hello"
$ greeting="$greeting, World"
$ echo $greeting
Hello, World
Obviously in this case it would have been simpler to build it one go, but this style can be very lightweight and understandable when dealing with longer strings.
Line continuations also can be achieved through clever use of syntax.
In the case of echo:
# echo '-n' flag prevents trailing <CR>
echo -n "This is my one-line statement" ;
echo -n " that I would like to make."
This is my one-line statement that I would like to make.
In the case of vars:
outp="This is my one-line statement" ;
outp+=" that I would like to make." ;
echo -n "${outp}"
This is my one-line statement that I would like to make.
Another approach in the case of vars:
outp="This is my one-line statement" ;
outp="${outp} that I would like to make." ;
echo -n "${outp}"
This is my one-line statement that I would like to make.
Voila!
I came across a situation in which I had to send a long message as part of a command argument and had to adhere to the line length limitation. The commands looks something like this:
somecommand --message="I am a long message" args
The way I solved this is to move the message out as a here document (like #tripleee suggested). But a here document becomes a stdin, so it needs to be read back in, I went with the below approach:
message=$(
tr "\n" " " <<-END
This is a
long message
END
)
somecommand --message="$message" args
This has the advantage that $message can be used exactly as the string constant with no extra whitespace or line breaks.
Note that the actual message lines above are prefixed with a tab character each, which is stripped by here document itself (because of the use of <<-). There are still line breaks at the end, which are then replaced by tr with spaces.
Note also that if you don't remove newlines, they will appear as is when "$message" is expanded. In some cases, you may be able to workaround by removing the double-quotes around $message, but the message will no longer be a single argument.
Depending on what sort of risks you will accept and how well you know and trust the data, you can use simplistic variable interpolation.
$: x="
this
is
variably indented
stuff
"
$: echo "$x" # preserves the newlines and spacing
this
is
variably indented
stuff
$: echo $x # no quotes, stacks it "neatly" with minimal spacing
this is variably indented stuff
Following #tripleee 's printf example (+1):
LONG_STRING=$( printf '%s' \
'This is the string that never ends.' \
' Yes, it goes on and on, my friends.' \
' My brother started typing it not knowing what it was;' \
" and he'll continue typing it forever just because..." \
' (REPEAT)' )
echo $LONG_STRING
This is the string that never ends. Yes, it goes on and on, my friends. My brother started typing it not knowing what it was; and he'll continue typing it forever just because... (REPEAT)
And we have included explicit spaces between the sentences, e.g. "' Yes...". Also, if we can do without the variable:
echo "$( printf '%s' \
'This is the string that never ends.' \
' Yes, it goes on and on, my friends.' \
' My brother started typing it not knowing what it was;' \
" and he'll continue typing it forever just because..." \
' (REPEAT)' )"
This is the string that never ends. Yes, it goes on and on, my friends. My brother started typing it not knowing what it was; and he'll continue typing it forever just because... (REPEAT)
Acknowledgement for the song that never ends
However, if you have indented code, it doesn't work out so well:
echo "continuation \
lines"
>continuation lines
Try with single quotes and concatenating the strings:
echo 'continuation' \
'lines'
>continuation lines
Note: the concatenation includes a whitespace.
This probably doesn't really answer your question but you might find it useful anyway.
The first command creates the script that's displayed by the second command.
The third command makes that script executable.
The fourth command provides a usage example.
john#malkovich:~/tmp/so$ echo $'#!/usr/bin/env python\nimport textwrap, sys\n\ndef bash_dedent(text):\n """Dedent all but the first line in the passed `text`."""\n try:\n first, rest = text.split("\\n", 1)\n return "\\n".join([first, textwrap.dedent(rest)])\n except ValueError:\n return text # single-line string\n\nprint bash_dedent(sys.argv[1])' > bash_dedent
john#malkovich:~/tmp/so$ cat bash_dedent
#!/usr/bin/env python
import textwrap, sys
def bash_dedent(text):
"""Dedent all but the first line in the passed `text`."""
try:
first, rest = text.split("\n", 1)
return "\n".join([first, textwrap.dedent(rest)])
except ValueError:
return text # single-line string
print bash_dedent(sys.argv[1])
john#malkovich:~/tmp/so$ chmod a+x bash_dedent
john#malkovich:~/tmp/so$ echo "$(./bash_dedent "first line
> second line
> third line")"
first line
second line
third line
Note that if you really want to use this script, it makes more sense to move the executable script into ~/bin so that it will be in your path.
Check the python reference for details on how textwrap.dedent works.
If the usage of $'...' or "$(...)" is confusing to you, ask another question (one per construct) if there's not already one up. It might be nice to provide a link to the question you find/ask so that other people will have a linked reference.

Resources