Given a variable VAR, that may contain an empty string, is it possible to make bash always interpret both
ls $VAR
ls "$VAR"
as "command that is given one argument"?
For bash, the IFS variable allows determining how the unquoted variable expansion in the first command is interpreted, defaulting to "Split the string at newlines, tabs and spaces, and treat each one as separate argument."
Things can be made a bit more convenient by setting IFS=''. But even then, if $VAR expands to an empty string, it will not be treated as the empty string but as the absence of an argument.
This behavior actually makes sense for other values of IFS: By default unquoted expansion is treated as "list of whitespace-separated tokens", with the empty string being the empty list.
But is there a feature to make $VAR and "$VAR" truly equivalent, at least for IFS=''?
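The behaviour described above can be checked with a small sketch. count_args is a hypothetical helper (not a bash builtin) that reports how many arguments it received. The sketch only demonstrates the difference between the two forms; it does not show a setting that removes it.

```bash
# Hypothetical helper: prints the number of arguments it was called with.
count_args() { printf '%s\n' "$#"; }

VAR=''
n_unquoted=$(count_args $VAR)        # empty unquoted expansion is dropped
n_quoted=$(count_args "$VAR")        # quoted empty string is kept as one argument

saved_ifs=$IFS
IFS=''
VAR='two words'
n_nosplit=$(count_args $VAR)         # empty IFS: no splitting, so one argument
VAR=''
n_nosplit_empty=$(count_args $VAR)   # but the empty value still vanishes
IFS=$saved_ifs

printf 'unquoted=%s quoted=%s nosplit=%s nosplit_empty=%s\n' \
    "$n_unquoted" "$n_quoted" "$n_nosplit" "$n_nosplit_empty"
# prints: unquoted=0 quoted=1 nosplit=1 nosplit_empty=0
```

Note that an unquoted expansion is still subject to pathname expansion even with IFS='', so quoting remains the more predictable choice.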
Consider:
# double quotes make empty variables count as args
emptyvar=""
printf %s%s%sEND $emptyvar a b c
echo ""
printf %s%s%sEND "$emptyvar" a b c
echo ""
echo ""
# but an empty array does not count, even with double quotes
empty=()
printf %s%s%sEND ${empty[@]} a c b
echo ""
printf %s%s%sEND "${empty[@]}" a b c
which outputs:
abcEND
abENDcEND
acbEND
abcEND
I understand the first example.
I understand that in the 2nd example the double quotes are somehow forcing the empty string to be considered an arg, because of how word splitting works -- but I'm hazy on the details.
I think the 3rd example is working similar to the first: it just gets processed as whitespace during word splitting.
And I'm unclear why arrays are treated specially in the 4th case.
I'd like an explanation of what is happening under the hood to understand it better, along with any relevant quotes from man bash (I wasn't able to find anything explaining this behavior, but probably missed it).
man bash:
Any element of an array may be referenced using ${name[subscript]}. The braces are required to avoid conflicts with pathname expansion. If subscript is @ or *, the word expands to all members of name. These subscripts differ only when the word appears within double quotes. If the word is double-quoted, ${name[*]} expands to a single word with the value of each array member separated by the first character of the IFS special variable, and ${name[@]} expands each element of name to a separate word. When there are no array members, ${name[@]} expands to nothing.
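As a rough illustration of the quoted passage, the following sketch counts the words produced by each form. count_args and the array names are hypothetical, chosen only for this example. The last line also shows why the second case in the question prints END twice: printf reuses its format string when it is given more arguments than the format consumes.

```bash
# Hypothetical helper: prints the number of arguments it was called with.
count_args() { printf '%s\n' "$#"; }

empty=()
one_empty_string=('')
three=(a b c)

n_empty=$(count_args "${empty[@]}")            # no members: expands to nothing
n_one=$(count_args "${one_empty_string[@]}")   # one member, even though it is ''
n_three_at=$(count_args "${three[@]}")         # each member is a separate word
n_three_star=$(count_args "${three[*]}")       # all members joined into one word
joined=$(printf '%s' "${three[*]}")            # joined with the first char of IFS

# Four arguments against a three-slot format: the format is applied twice.
reused=$(printf '%s%s%sEND' "" a b c)

printf '%s %s %s %s [%s] %s\n' \
    "$n_empty" "$n_one" "$n_three_at" "$n_three_star" "$joined" "$reused"
# prints: 0 1 3 1 [a b c] abENDcEND
```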
How can I match environment variables which include the case-insensitive segment "proxy" that is not a prefix? I'm on bash:
root@PDPINTDEV9:~# echo ${SHELL}
/bin/bash
I want to unset a bunch of proxy variables simultaneously. They all have "proxy" or "PROXY" in the name, such as http_proxy or NO_PROXY. I would like to use glob expansion, which this answer & comment says is what bash uses.
Also based on that answer, I see that I can find environment vars which start with "PROXY":
root@PDPINTDEV9:~# echo "${!PROXY*}"
PROXY_IP PROXY_PORT
But that doesn't make sense with what I've read about glob expansion. Based on those, "${!PROXY*}" should match anything that doesn't start with proxy... I think.
Furthermore, I can't get anything that does make sense with glob syntax to actually work:
root@PDPINTDEV9:~# echo ${*proxy}
-bash: ${*proxy}: bad substitution
root@PDPINTDEV9:~# echo "${!*[pP][rR][oO][xX][yY]}"
-bash: ${!*[pP][rR][oO][xX][yY]}: bad substitution
SOLVED below: Turns out you can't. Crazy, but thanks everyone.
Variable name expansion, as a special case of shell parameter expansion, does not support globbing. But it has two flavors:
${!PREFIX*}
${!PREFIX@}
In both, the * and @ characters are hard-coded.
The first form will expand to variable names prefixed with PREFIX and joined by the first character of the IFS (which is a space, by default):
$ printf "%s\n" "${!BASH*}"
BASH BASHOPTS BASHPID BASH_ALIASES BASH_ARGC BASH_ARGV BASH_CMDS BASH_COMMAND ...
The second form will expand to variable names (prefixed with PREFIX), but as separate words:
$ printf "%s\n" "${!BASH@}"
BASH
BASHOPTS
BASHPID
BASH_ALIASES
BASH_ARGC
...
Both of these forms are case-sensitive, so to get the variable names in a case-insensitive manner, you can use set, in combination with some cut and grep:
$ (set -o posix; set) | cut -d= -f1 | grep -i ^proxy
PROXY_IP
proxy_port
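To match "proxy" anywhere in the name rather than only as a prefix, the anchor can be dropped from the grep pattern. The sketch below swaps in compgen -v (a bash builtin that lists variable names one per line) in place of the set/cut pipeline, which avoids parsing values. The function name proxy_var_names and the two demo variables are assumptions made for this example.

```bash
# Demo variables (hypothetical values) so the sketch has something to match.
http_proxy='http://proxy.example.invalid:3128'
NO_PROXY='localhost'

# Print the names of all shell variables whose name contains "proxy",
# in any letter case and at any position.
proxy_var_names() {
    compgen -v | grep -i 'proxy'
}

matches=$(proxy_var_names)

# Unset each match; variable names never contain whitespace,
# so reading them line by line is safe.
while IFS= read -r name; do
    [[ -n $name ]] && unset -v "$name"
done <<< "$matches"
```

Because compgen -v lists all shell variables and not only exported ones, it may be worth inspecting $matches before unsetting anything.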
But that doesn't make sense with what I've read about glob expansion.
Based on those, "${!PROXY*}" should match anything that doesn't start
with proxy... I think.
No and no.
In the first place, the ! character is not significant to pathname expansion, except when it appears at the beginning of a character class in a pattern, in which case the sense of the class is inverted. For example, fo[!o] is a pattern that matches any three-character string whose first two characters are "fo" and whose third is not another 'o'. But there is no character class in your expression.
But more importantly, pathname expansion isn't relevant to your expression ${!PROXY*} at all. There is no globbing there. The '!' and '*' are fixed parts of the syntax for one of the forms of parameter expansion. That particular expansion produces, by definition, the names of all shell variables whose names start with "PROXY", separated by the first character of the value of the IFS variable. Where it appears outside of double quotes, it is equivalent to ${!PROXY@}, which is less susceptible to globbing-related confusion.
Furthermore, I can't get anything that does make sense with glob syntax to actually work: [...]
No, because, again, there is no globbing going on. You need exactly ${! followed by the name prefix of interest, followed by *} or @} to form the particular kind of parameter expansion you're asking about.
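A minimal sketch of how the two forms differ inside and outside double quotes, assuming two demo variables with the made-up prefix GRDEMO_ and a hypothetical count_args helper:

```bash
# Hypothetical helper: prints the number of arguments it was called with.
count_args() { printf '%s\n' "$#"; }

# Hypothetical demo variables sharing the prefix GRDEMO_.
GRDEMO_ONE=1
GRDEMO_TWO=2

n_star=$(count_args "${!GRDEMO_*}")     # quoted *: names joined into one word
n_at=$(count_args "${!GRDEMO_@}")       # quoted @: one word per name
n_unquoted=$(count_args ${!GRDEMO_*})   # unquoted: separate words, like @

printf 'star=%s at=%s unquoted=%s\n' "$n_star" "$n_at" "$n_unquoted"
# prints: star=1 at=2 unquoted=2
```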
How can I match environment variables which include the case-insensitive segment "proxy"?
You need to explicitly express the case variations of interest to you. For example:
${!PROXY*} ${!proxy*} ${!Proxy*}
Suppose you have something like:
$ a=(fooa foob foox)
Then you can do:
$ b=(${(M)a:#*(a|b)})
To select a's elements matching the pattern.
So you have:
$ print ${(qq)b}
'fooa' 'foob'
Then you expect to build the pattern in some dynamic way, so you have it in another variable, say:
$ p="*(a|b)"
And you expect this:
$ b=(${(M)a:#$p})
Would work the same as before, as the documentation says, but it doesn't:
$ print ${(qq)b}
''
Why is that?
Because in this case zsh matches $p's value literally, as a plain string rather than as a pattern:
a=('*(a|b)' fooa foob)
p="*(a|b)"
b=(${(M)a:#$p})
print ${(qq)b}
;#⇒'*(a|b)'
We can explicitly tell zsh to treat $p's expansion as a pattern rather than a literal value by using the ${~spec} form.
${~spec}
Turn on the GLOB_SUBST option for the evaluation of spec; if the ‘~’ is doubled, turn it off. When this option is set, the string resulting from the expansion will be interpreted as a pattern anywhere that is possible, such as in filename expansion and filename generation and pattern-matching contexts like the right hand side of the ‘=’ and ‘!=’ operators in conditions.
-- zshexpn(1): Expansion, Parameter Expansion
In this case, we could use it like this:
a=(fooa foob foox)
p="*(a|b)"
b=(${(M)a:#${~p}}) ;# tell zsh to treat the value of `$p` as a pattern
print ${(qq)b}
;#⇒'fooa' 'foob'
Note: The documentation for the parameter expansion flag b gives some hints about storing patterns in variable values:
b
Quote with backslashes only characters that are special to pattern matching. This is useful when the contents of the variable are to be tested using GLOB_SUBST, including the ${~...} switch.
Quoting using one of the q family of flags does not work for this purpose since quotes are not stripped from non-pattern characters by GLOB_SUBST. In other words,
pattern=${(q)str}
[[ $str = ${~pattern} ]]
works if $str is ‘a*b’ but not if it is ‘a b’, whereas
pattern=${(b)str}
[[ $str = ${~pattern} ]]
is always true for any possible value of $str.
-- zshexpn(1): Expansion, Parameter Expansion, Parameter Expansion Flags
This is driving me absolutely crazy:
$ a="/"
$ echo $a # note empty output line below
$ var="/home/vivek/foo/bar"
$ echo $var
home vivek foo bar
What's going on in my bash shell on OS X?
I've tried this on my other Mac.. and it works perfectly!
tl;dr:
Reset the special $IFS variable to its default - IFS=$' \t\n' - or, preferably, double-quote your variable reference (echo "$var") to print the value as-is.
You're referencing $var unquoted, which means that its value is subject to word splitting (one of the many expansions that Bash applies to unquoted tokens).
Word splitting happens by any of the characters defined in the built-in $IFS variable (the Internal Field Separator), which defaults to $' \t\n' (space, tab, newline).
In your case, $IFS contains / (possibly among other chars.), which means that /home/vivek/foo/bar is split into separate arguments home, vivek, foo, bar, which are then passed to echo.
echo, when given multiple arguments, prints them separated with a space, which is what you're seeing.
(Similarly, / as the value of $var is interpreted as just a separator, with no fields, which means that no arguments are passed to echo, which just prints a newline).
There are 2 lessons here:
Only temporarily change $IFS; restore the previous value once you're done with the custom value.
Generally, double-quote all variable references to ensure that their values are preserved as-is; only use unquoted variable references if you explicitly want shell expansions applied to their values.
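Both lessons can be seen in a short sketch. It uses a relative path (home/vivek/foo/bar, without the leading slash from the question) purely to keep the expected output simple, and the variable names are chosen for this example only.

```bash
var='home/vivek/foo/bar'

saved_ifs=$IFS          # remember the current value
IFS=/                   # custom separator, as in the question

unquoted=$(echo $var)   # split on "/"; echo joins the pieces with spaces
quoted=$(echo "$var")   # quoted: the value is passed through as-is

IFS=$saved_ifs          # restore the previous value once done

printf '%s\n%s\n' "$unquoted" "$quoted"
# prints:
# home vivek foo bar
# home/vivek/foo/bar
```

Inside a function, declaring local IFS=/ is another way to confine the change, since the previous value comes back when the function returns.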
Say I have this command:
printf $text | perl program.pl
How do I guarantee that everything in the $text variable is treated literally? For example, if $text contains hello"\n, how do I make sure that's exactly what gets passed to program.pl, without the newline or quotation mark (or any conceivable character) being interpreted as a special character?
Quotes!
printf '%s' "$text" | ...
Don't ever expand variables unquoted if you care about preserving their contents precisely. Also, don't ever pass a dynamic string as a format variable when you want it to be treated as literal data.
If you want backslash sequences to be interpreted -- for instance, the two-character sequence \n to be changed to a single newline -- and your shell is bash, use printf '%b' "$text" instead. If you want byte-for-byte accuracy, %s is the Right Thing (and works on any POSIX-compliant shell). If you want escaping for interpretation by another shell (which would be appropriate if, say, you were passing content as part of a ssh command line), then the appropriate format string (for bash only) is %q.
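The difference between these choices can be sketched with the example value from the question. The variable names are made up for this illustration, and the %q round trip through eval is only there to show that the quoted form reads back as the original string in bash.

```bash
text='hello"\n'

# %s: the value is data, copied byte for byte.
literal=$(printf '%s' "$text")

# Misuse: the value is the format string, so \n turns into a newline
# (which the command substitution then strips from the end).
as_format=$(printf "$text")

# %q: bash-only quoting that is safe to re-read as shell input.
quoted_for_shell=$(printf '%q' "$text")
eval "roundtrip=$quoted_for_shell"

printf '%s\n' "$literal" "$as_format" "$roundtrip"
# prints:
# hello"\n
# hello"
# hello"\n
```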