In bash, how do I force a variable never to be interpreted as a list?

In my bash scripts, I regularly use file paths which may contain spaces:
FOO=/path\ with\ spaces/
Later, if I want to use FOO, I have to wrap it in quotes ("$FOO") or it will be interpreted as a list (/path, with, spaces/). Is there a better way to force a variable never to be interpreted as a list? It is cumbersome to have to constantly quote-wrap.

No. You must always use quotes or bash will word-split (except in [[, but that is a special case).
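To make the splitting visible, a small helper that reports how many arguments it received is enough. This is only a sketch; `count_args` is a hypothetical name introduced for illustration.

```shell
# count_args is a hypothetical helper: it prints the number of arguments it received.
count_args() { echo "$#"; }

FOO=/path\ with\ spaces/

count_args $FOO      # unquoted: the value is split on whitespace into 3 words
count_args "$FOO"    # quoted: the value is passed as 1 word
```

The variable holds the same string in both calls; only the expansion site decides whether it is split.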

You can also change the internal field separator, IFS, as in:
ORIGIFS="$IFS"
IFS=$(echo -en "\n\b")
# do stuff...
IFS="$ORIGIFS"
However, this affects all situations where bash looks to do field splitting, which might be more broad than you'd like.
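One way to limit how far the IFS change reaches is to make it inside a subshell, so there is nothing to restore afterwards. The following is a minimal sketch under that assumption; `count_args` is a hypothetical helper, and IFS is set to a newline only (a simpler value than the one used above).

```shell
# count_args is a hypothetical helper: it prints the number of arguments it received.
count_args() { echo "$#"; }

FOO='/path with spaces/'

# Command substitution runs in a subshell, so this IFS change vanishes when it ends.
inside=$(
    IFS='
'
    count_args $FOO
)

# Back in the parent shell, IFS still has its default value.
outside=$(count_args $FOO)

echo "inside: $inside, outside: $outside"
```

Note that changing IFS only affects where unquoted expansions are split; they are still subject to pathname expansion, so quoting remains the more robust habit.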

Related

How to use double quotes when assigning variables?

There's a bash file with something like this:
FOO=${BAR:-"/some/path/with/$VAR/in/it"}
Are those double quotes necessary? Based on the following test, I'd say no, and that no quote at all is needed in the above assignment. In fact, it's the user of that variable that needs to expand it within double quotes to avoid wrong splitting.
touch 'some file' # create a file
VAR='some file' # create a variable for that file name
FOO=${BAR:-$VAR} # use it with the syntax above, but no quotes
ls -l "$FOO" # the file does exist (here we do need double quotes)
ls -l $FOO # without quotes it fails searching for files `some` and `file`
rm 'some file' # remove temporary file
Am I correct? Or there's something more?
Are those double quotes necessary?
Not in this case, no.
Am I correct?
Yes. And it's always the user of the variable that has to quote it - field splitting is run when expanding the variable, so when using it it has to be quoted.
There are exceptions, such as case $var in and somevar1=$somevar2. These contexts do not perform field splitting, so they do not require quoting. Quotes do no harm there, though, and can be used anyway.
Or there's something more?
From POSIX shell:
2.6.2 Parameter Expansion
In addition, a parameter expansion can be modified by using one of the following formats. In each case that a value of word is needed (based on the state of parameter, as described below), word shall be subjected to tilde expansion, parameter expansion, command substitution, and arithmetic expansion.
${parameter:-word}
Because field splitting expansion is not run over word inside ${parameter:-word}, indeed, quoting doesn't do much.
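A quick check of that claim, as a sketch: the value below deliberately contains a run of spaces, so any splitting and rejoining would change it. The name `QUX` is made up for the comparison.

```shell
VAR='some   file'        # three spaces, so any field splitting would be visible
unset BAR                # make sure the default branch is taken

FOO=${BAR:-$VAR}         # no quotes around the default word
QUX=${BAR:-"$VAR"}       # quotes around the default word

if [ "$FOO" = "$QUX" ]; then
    echo "identical: [$FOO]"
else
    echo "different: [$FOO] vs [$QUX]"
fi
```

Both assignments store the same string; the quoting only starts to matter when `$FOO` is later expanded as a command argument.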

zip exclude subfolder passed as argument or variable [duplicate]

I want to run a command from a bash script which has single quotes and some other commands inside the single quotes and a variable.
e.g. repo forall -c '....$variable'
In this format, $ is escaped and the variable is not expanded.
I tried the following variations but they were rejected:
repo forall -c '...."$variable" '
repo forall -c " '....$variable' "
" repo forall -c '....$variable' "
repo forall -c "'" ....$variable "'"
If I substitute the value in place of the variable the command is executed just fine.
Please tell me where am I going wrong.
Inside single quotes everything is preserved literally, without exception.
That means you have to close the quotes, insert something, and then re-enter again.
'before'"$variable"'after'
'before'"'"'after'
'before'\''after'
Word concatenation is simply done by juxtaposition. As you can verify, each of the above lines is a single word to the shell. Quotes (single or double quotes, depending on the situation) don't isolate words. They are only used to disable interpretation of various special characters, like whitespace, $, ;... For a good tutorial on quoting see Mark Reed's answer. Also relevant: Which characters need to be escaped in bash?
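The "single word" claim can be verified with an argument counter. This is a sketch only; `count_args` is a hypothetical helper and the sample value is invented.

```shell
# count_args is a hypothetical helper: it prints the number of arguments it received.
count_args() { echo "$#"; }

variable='two words'

# Three quoted pieces written back to back form one word, spaces included.
count_args 'before '"$variable"' after'

# The same juxtaposition works in an assignment.
word='before '"$variable"' after'
echo "$word"
```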
Do not concatenate strings interpreted by a shell
You should absolutely avoid building shell commands by concatenating variables. This is a bad idea similar to concatenation of SQL fragments (SQL injection!).
Usually it is possible to have placeholders in the command, and to supply the command together with variables so that the callee can receive them from the invocation arguments list.
For example, the following is very unsafe. DON'T DO THIS
script="echo \"Argument 1 is: $myvar\""
/bin/sh -c "$script"
If the contents of $myvar is untrusted, here is an exploit:
myvar='foo"; echo "you were hacked'
Instead of the above invocation, use positional arguments. The following invocation is better -- it's not exploitable:
script='echo "arg 1 is: $1"'
/bin/sh -c "$script" -- "$myvar"
Note the use of single ticks in the assignment to script, which means that it's taken literally, without variable expansion or any other form of interpretation.
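Putting the two variants side by side shows the difference. This sketch reuses the exploit string from above; the variable names `unsafe` and `safe` are introduced here only to capture the output.

```shell
# Attacker-controlled value taken from the example above.
myvar='foo"; echo "you were hacked'

# Unsafe: the value is pasted into the script text, so its quote and
# semicolon are parsed as shell syntax by the inner sh.
unsafe=$(/bin/sh -c "echo \"Argument 1 is: $myvar\"")

# Safe: the script text is fixed; the value arrives as positional parameter $1.
script='echo "arg 1 is: $1"'
safe=$(/bin/sh -c "$script" -- "$myvar")

printf 'unsafe output:\n%s\n\n' "$unsafe"
printf 'safe output:\n%s\n' "$safe"
```

In the unsafe case the inner shell runs two echo commands; in the safe case it runs one and prints the value verbatim.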
The repo command can't care what kind of quotes it gets. If you need parameter expansion, use double quotes. If that means you wind up having to backslash a lot of stuff, use single quotes for most of it, and then break out of them and go into doubles for the part where you need the expansion to happen.
repo forall -c 'literal stuff goes here; '"stuff with $parameters here"' more literal stuff'
Explanation follows, if you're interested.
When you run a command from the shell, what that command receives as arguments is an array of null-terminated strings. Those strings may contain absolutely any non-null character.
But when the shell is building that array of strings from a command line, it interprets some characters specially; this is designed to make commands easier (indeed, possible) to type. For instance, spaces normally indicate the boundary between strings in the array; for that reason, the individual arguments are sometimes called "words". But an argument may nonetheless have spaces in it; you just need some way to tell the shell that's what you want.
You can use a backslash in front of any character (including space, or another backslash) to tell the shell to treat that character literally. But while you can do something like this:
reply=\"That\'ll\ be\ \$4.96,\ please,\"\ said\ the\ cashier
...it can get tiresome. So the shell offers an alternative: quotation marks. These come in two main varieties.
Double-quotation marks are called "grouping quotes". They prevent wildcards and aliases from being expanded, but mostly they're for including spaces in a word. Other things like parameter and command expansion (the sorts of thing signaled by a $) still happen. And of course if you want a literal double-quote inside double-quotes, you have to backslash it:
reply="\"That'll be \$4.96, please,\" said the cashier"
Single-quotation marks are more draconian. Everything between them is taken completely literally, including backslashes. There is absolutely no way to get a literal single quote inside single quotes.
Fortunately, quotation marks in the shell are not word delimiters; by themselves, they don't terminate a word. You can go in and out of quotes, including between different types of quotes, within the same word to get the desired result:
reply='"That'\''ll be $4.96, please," said the cashier'
So that's easier - a lot fewer backslashes, although the close-single-quote, backslashed-literal-single-quote, open-single-quote sequence takes some getting used to.
Modern shells have added another quoting style not specified by the POSIX standard, in which the leading single quotation mark is prefixed with a dollar sign. Strings so quoted follow similar conventions to string literals in the ANSI standard version of the C programming language, and are therefore sometimes called "ANSI strings" and the $'...' pair "ANSI quotes". Within such strings, the above advice about backslashes being taken literally no longer applies. Instead, they become special again - not only can you include a literal single quotation mark or backslash by prepending a backslash to it, but the shell also expands the ANSI C character escapes (like \n for a newline, \t for tab, and \xHH for the character with hexadecimal code HH). Otherwise, however, they behave as single-quoted strings: no parameter or command substitution takes place:
reply=$'"That\'ll be $4.96, please," said the cashier'
The important thing to note is that the single string that gets stored in the reply variable is exactly the same in all of these examples. Similarly, after the shell is done parsing a command line, there is no way for the command being run to tell exactly how each argument string was actually typed – or even if it was typed, rather than being created programmatically somehow.
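The equivalence of the four spellings can be checked directly. A sketch, assuming bash (the `$'...'` form is not available in every POSIX shell); the variable names `a` to `d` are chosen here for brevity.

```shell
# The same string written with each quoting style discussed above.
a=\"That\'ll\ be\ \$4.96,\ please,\"\ said\ the\ cashier      # backslashes only
b="\"That'll be \$4.96, please,\" said the cashier"            # double quotes
c='"That'\''ll be $4.96, please," said the cashier'            # single quotes, spliced
d=$'"That\'ll be $4.96, please," said the cashier'             # ANSI-C quotes (bash)

if [ "$a" = "$b" ] && [ "$b" = "$c" ] && [ "$c" = "$d" ]; then
    echo "all four are identical:"
    printf '%s\n' "$a"
fi
```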
Below is what worked for me -
QUOTE="'"
hive -e "alter table TBL_NAME set location $QUOTE$TBL_HDFS_DIR_PATH$QUOTE"
EDIT: (As per the comments in question:)
I've been looking into this since then. I was lucky enough to have repo lying around. It's still not clear to me whether you are forced to enclose your commands in single quotes. I looked into the repo syntax and I don't think you are. You could use double quotes around your command, and then use whatever single and double quotes you need inside, provided you escape the double ones.
just use printf
instead of
repo forall -c '....$variable'
use printf to replace the variable token with the expanded variable.
For example:
template='.... %s'
repo forall -c "$(printf "${template}" "${variable}")"
Variables can contain single quotes.
myvar=\'....$variable\'
repo forall -c $myvar
I was wondering why I could never get my awk statement to print from an ssh session so I found this forum. Nothing here helped me directly but if anyone is having an issue similar to below, then give me an up vote. It seems any sort of single or double quotes were just not helping, but then I didn't try everything.
check_var="df -h / | awk 'FNR==2{print $3}'"
getckvar=$(ssh user@host "$check_var")
echo "$getckvar"
What do you get? A load of nothing.
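Fix: escape the dollar sign in the awk print statement, writing \$3, so that the local shell does not expand it before the command is sent.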
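The effect is easy to reproduce without a remote host. In this sketch `sh -c` stands in for ssh (an assumption made for local testing: both hand the string to another shell for parsing), and a fixed `printf` line replaces `df -h /`.

```shell
# Unescaped: the local shell expands $3 (empty here) while building the string,
# so awk never sees it.
broken="printf 'a b c\n' | awk '{print $3}'"

# Escaped: \$3 stays a literal $3 inside the string until awk interprets it.
fixed="printf 'a b c\n' | awk '{print \$3}'"

echo "broken command text: $broken"
echo "fixed command text:  $fixed"

# sh -c is a stand-in for the remote shell that ssh would start.
sh -c "$fixed"
```

Printing the command text before sending it is a cheap way to see what the other shell will actually receive.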
Does this work for you?
eval repo forall -c '....$variable'

how to escape paths to be executed with $( )?

I have a program whose textual output I want to execute directly in a shell. How should I format the output of this program so that paths with spaces are accepted by the shell?
$(echo ls /folderA/folder\ with\ spaces/)
Some more info: the program that generates the output is coded in Haskell (source). It's a simple program that keeps a list of my favorite commands. It prints the commands with 'cmdl -l'. I can then choose one command to execute with 'cmdl -g12' for command number 12. Thanks for pointing out that instead of $( ) use 'cmdl -g12 | bash', I wasn't aware of that...
How shall I format the output of this program such that the paths with spaces are accepted by the shell ?
The shell cannot distinguish between spaces that are part of a path and spaces that separate arguments unless the former are properly quoted. Moreover, you actually need proper quoting using single quotes ('...') in order to "shield" all the character combinations that might otherwise have special meaning for the shell (\, &, |, ||, ...).
Depending on the language used for your external tool, there might be a library available for that purpose. As an example, Python has pipes.quote (shlex.quote on Python 3) and Perl has String::ShellQuote::shell_quote.
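If the output is produced by, or can be post-processed in, bash itself, the `printf` builtin offers a comparable facility: its `%q` format prints an argument in a form that can be reused as shell input. A minimal sketch, assuming bash; the sample path comes from the question and the `eval "set -- ..."` line is only there to demonstrate the round trip.

```shell
path='/folderA/folder with spaces/'

# %q asks bash's printf to emit the argument quoted for reuse as shell input.
quoted=$(printf '%q' "$path")
echo "ls $quoted"

# Round trip: re-parsing the quoted text yields exactly one word, the original path.
eval "set -- $quoted"
echo "$#"
echo "$1"
```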
I'm not quite sure I understand, but don't you just want to pipe through the shell?
For a program called foo
$ foo | sh
To format the output of your program so that Bash won't split it into separate arguments at the spaces, it is probably easiest to double-quote each argument, e.g.
mkdir "/tmp/Joey \"The Lips\" Fagan"
As you saw, you can alternatively backslash the spaces, but I usually find that less readable.
EDIT:
If you may have special shell characters (&, |, `, (), [], $, etc.), you'll have to do it the hard/proper way, with a specific escaper for your language and target, as others have mentioned.
It's not just spaces you need to worry about, but other characters such as [ and ] (glob a.k.a pathname-expansion characters) and metacharacters such as ;, &, (, ...
You can use the following approach:
Enclose the string in single quotes.
Replace existing single quotes in the string with '\'' (which effectively breaks the string into multiple parts with spliced in \-escaped single quotes; the shell then reassembles the parts into a single string).
Example:
I'm good (& well[1];) would encode to 'I'\''m good (& well[1];)'
Note how single-quoting allows literal use of the glob characters and metacharacters.
Since single quotes themselves can never be used within single-quoted strings (there's not even an escape), the splicing-in approach described above is needed.
As described by @mklement0, a safe algorithm is to wrap every argument in a pair of single quotes, and quote single quotes inside arguments as '\''. Here is a shell function that does it:
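function quote {
    typeset cmd="" escaped arg
    for arg; do
        # replace every ' in the argument with '\''
        escaped=${arg//\'/\'\\\'\'}
        # separate arguments with a space, without a leading one
        cmd="${cmd:+$cmd }'$escaped'"
    done
    printf %s "$cmd"
}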
$ quote foo "bar baz" "don't do it"
'foo' 'bar baz' 'don'\''t do it'
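One way to gain confidence in such a function is a round trip: quote some awkward arguments, let the shell re-parse the result, and compare. The sketch below restates the function so that it runs on its own (assuming bash); the fourth sample argument is invented to exercise metacharacters.

```shell
# Same technique as the function above, restated so this snippet is self-contained.
quote() {
    local cmd="" escaped arg
    for arg; do
        escaped=${arg//\'/\'\\\'\'}       # each ' becomes '\''
        cmd="${cmd:+$cmd }'$escaped'"     # wrap in single quotes, space-separated
    done
    printf '%s' "$cmd"
}

quoted=$(quote foo "bar baz" "don't do it" 'a;b & $HOME *')
printf '%s\n' "$quoted"

# eval re-parses the quoted text; set -- stores the resulting words as $1, $2, ...
eval "set -- $quoted"
echo "words recovered: $#"
printf '[%s]\n' "$@"
```

If the quoting is correct, the number of words and each word's content match the original arguments exactly, and nothing inside them is executed or expanded.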

Tricky brace expansion in shell

When using a POSIX shell, the following
touch {quick,man,strong}ly
expands to
touch quickly manly strongly
Which will touch the files quickly, manly, and strongly, but is it possible to dynamically create the expansion? For example, the following illustrates what I want to do, but does not work because of the order of expansion:
TEST=quick,man,strong #possibly output from a program
echo {$TEST}ly
Is there any way to achieve this? I do not mind constricting myself to Bash if need be. I would also like to avoid loops. The expansion should be given as complete arguments to any arbitrary program (i.e. the program cannot be called once for each file, it can only be called once for all files). I know about xargs but I'm hoping it can all be done from the shell somehow.
... There is so much wrong with using eval. What you're asking is only possible with eval, BUT what you might want is easily possible without having to resort to bash bug-central.
Use arrays! Whenever you need to keep multiple items in one datatype, you need (or, should use) an array.
TEST=(quick man strong)
touch "${TEST[@]/%/ly}"
That does exactly what you want without the thousand bugs and security issues introduced and concealed in the other suggestions here.
The way it works is:
"${foo[@]}": Expands the array named foo by expanding each of its elements, properly quoted. Don't forget the quotes!
${foo/a/b}: This is a type of parameter expansion that replaces the first a in foo's expansion by a b. In this type of expansion you can use % to signify the end of the expanded value, sort of like $ in regular expressions.
Put all that together and "${foo[@]/%/ly}" will expand each element of foo, properly quote it as a separate argument, and replace each element's end by ly.
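If the list really does arrive as a comma-separated string, as in the question, it can be turned into an array first. A sketch assuming bash; the names `words` and `args` are chosen here for illustration, and `printf` stands in for `touch` so that nothing is created on disk.

```shell
TEST=quick,man,strong        # e.g. output from a program

# Split on commas into an array; the IFS change applies only to this read command.
IFS=, read -r -a words <<< "$TEST"

# Append "ly" to every element; each result stays a separate word.
args=("${words[@]/%/ly}")

# Pass all of them to a single command invocation (printf here instead of touch).
printf '%s\n' "${args[@]}"
```

This keeps the "one call, all arguments" requirement without eval, and element values containing spaces would survive intact.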
In bash, you can do this:
#!/bin/bash
TEST=quick,man,strong
eval echo $(echo {$TEST}ly)
#eval touch $(echo {$TEST}ly)
That last line is commented out but will touch the specified files.
Zsh can easily do that:
TEST=quick,man,strong
print ${(s:,:)^TEST}ly
The variable's content is split at commas, then each element is distributed to the string around the braces:
quickly manly strongly
Taking inspiration from the answers above:
$ TEST=quick,man,strong
$ touch $(eval echo {$TEST}ly)
