Shell variable with spaces, quoting for single command line option - bash

Autoconf scripts have trouble with a filename or pathname with spaces. For example,
./configure CPPFLAGS="-I\"/path with space\""
results in (config.log):
configure:3012: gcc -I"/path with space" conftest.c >&5
gcc: with: No such file or directory
gcc: space": No such file or directory
The compile command from ./configure is ac_compile='$CC -c $CFLAGS $CPPFLAGS conftest.$ac_ext >&5' and I am not able to modify this (I could perhaps, but working around autoconf in this way is not a general solution).
I think it comes down to getting a shell variable that contains spaces to be parsed as a single command line argument rather than split at spaces. The simplest shell example I can come up with is to create a file with spaces in its name and attempt to list it with ls, using a shell variable as the argument to ls:
$ touch "a b"
$ file="a b"
$ ls $file
ls: a: No such file or directory
ls: b: No such file or directory
This works, but is illegal since in autoconf I can't modify the shell code:
$ ls "$file"
a b
None of the following attempts at quoting things work:
$ file="\"a b\""; ls $file
ls: "a: No such file or directory
ls: b": No such file or directory
$ file="a\ b"
$ file="a\\ b"
$ file="`echo \\"a b\\"`"
and so on.
Is this impossible to accomplish in shell scripts? Is there a magical quoting that will expand a shell variable with spaces into a single command line argument?

You should try to set the $IFS environment variable.
from man bash(1):
IFS - The Internal Field Separator that is used for word splitting
after expansion and to split lines into words with the read builtin
command. The default value is ''space tab newline''.
For example
IFS=<C-v C-m> # newline
file="a b"
touch $file
ls $file
Don't forget to set $IFS back or strange things will happen.
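One way to avoid having to restore $IFS is to change it only inside a subshell. Here is a minimal sketch of that idea; count_args is a hypothetical helper I made up for illustration, and the newline is written literally between the quotes so the snippet does not depend on bash-only syntax:

```bash
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() { echo "$#"; }

file="a b"

# Default IFS: the unquoted expansion is split at the space.
default_count=$(count_args $file)

# A command substitution runs in a subshell, so this IFS change cannot leak out.
newline_count=$(
  IFS='
'
  count_args $file   # IFS is newline only, so the space no longer splits
)

echo "default IFS: $default_count, newline IFS: $newline_count"
```

Note that this only helps when you control the code that expands the variable, which is not the case for the generated ./configure script.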

If you give the command
gcc -I"x y z"
in a shell then certainly the single command line parameter "-Ix y z" will be passed to gcc. There is no question to that. That's the whole meaning of double quotes: things inside double quotes are NOT subject to field splitting, and so not subject to $IFS either, for instance.
But you need to be careful about the number of quotes you need. For instance, if you say
file="a b" # 1
and then you say
ls $file # 2
what happens is that the file variable's contents are 'a b', not '"a b"', because the double quotes were "eaten" when line 1 was parsed. The replaced value is then field-separated and you get ls on two files 'a' and 'b'. The correct way to get what you want is
file="a b"; ls "$file"
Now the problem in your original case is that when you set a variable to a string that CONTAINS double quotes, the double quotes are later not interpreted as shell quote symbols but just as normal letters. Which is why when you do something like
file="\"a b\""; ls $file
the shell actually tokenizes the contents of the file variable into '"a' and 'b"' when the ls command is analyzed; the double quote is no longer a shell quote character but just part of the variable's contents. It is analogous to what happens if you set
file="\$HOME"; ls $file
you get an error that '$HOME' directory does not exist---no environment variable lookup takes place.
So your best options are
Hack autoconf
Do not use path names with spaces (best solution)
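The splitting behaviour described in this answer can be checked by counting arguments instead of calling ls. A minimal sketch, where count_args is a hypothetical helper introduced only for this illustration:

```bash
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() { echo "$#"; }

file="a b"
unquoted=$(count_args $file)     # split into 'a' and 'b'
quoted=$(count_args "$file")     # stays one argument

file="\"a b\""
embedded=$(count_args $file)     # quotes are plain characters: '"a' and 'b"'

echo "unquoted=$unquoted quoted=$quoted embedded=$embedded"
```

The third case is the one that bites with CPPFLAGS: the quote characters stored in the value never get a second chance to act as shell quoting.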

Using spaces in directory names in the Unix world is simply asking for trouble. It's not just the problem of quoting in shell scripts (which needs to be done right anyway): some tools simply cannot cope with spaces in filenames. For instance, you can't (portably) write a Makefile rule that says build baz.o from foo bar/baz.c.
In the case of CPPFLAGS above, I would try one of the following (in order of preference):
Fix the system so that it does not use any spaces in directory names.
Write a small wrapper around the compiler and call ./configure CC=mygcc. In that case mygcc might be:
#!/bin/sh
gcc "-I/foo bar/include" "$@"
Create a symbolic link (e.g., /tmp/mypath) to the dreaded path and use CPPFLAGS=-I/tmp/mypath.

You want to quote the entire argument, in either of these ways:
./configure "CPPFLAGS=-I/path with space"
./configure CPPFLAGS="-I/path with space"
The ./configure command then sees a single argument
"CPPFLAGS=-I/path with space"
which is parsed as a parameter named «CPPFLAGS» having the value «-I/path with space» (brackets added for clarity).

Using quotes is interesting. From (lightly) reading the bash man page I thought you had to escape the space with \, so that "/path with space" becomes /path\ with\ space. I've never tried the quotes, but it seems that they don't work in general (your ls example). Escaping works with ls without quoting and without changing IFS.
What happens if you use the "escaping spaces" format of the command?

$ file="\"a b\""
$ eval ls $file
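As far as I can tell, the reason eval helps is that it makes the shell parse the expanded text a second time, so the quote characters stored in the variable act as real quoting on that second pass. A small sketch, with count_args being a hypothetical helper used only for illustration:

```bash
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() { echo "$#"; }

file="\"a b\""

plain=$(count_args $file)         # no second parse: '"a' and 'b"'
evaled=$(eval count_args $file)   # eval re-parses: count_args "a b"

echo "plain=$plain evaled=$evaled"
```

Bear in mind that eval executes whatever the value contains, so this is only reasonable when the contents of the variable are trusted.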

Everything depends on how the variable is used. First, note that if you are using Autoconf, this probably means that make will be used eventually, so that the rules are dictated by make, and in particular, the default make rules. Even though you may want to use your own rules exclusively, things must remain consistent between tools, and some variables have standard meanings, so that you do not want to deviate from them. This is not the case of CPPFLAGS, but this should remain similar to CFLAGS, which is standard. See the POSIX make utility, where variables are simply expanded with standard sh word splitting, which does not provide any quoting mechanism (the field separator is controlled by $IFS, but do not change the IFS variable to accept spaces as normal characters since this will break other things, like being able to provide several -I and/or -L options in such variables with the standard way).
Since there is such a limitation with make, I suppose that it would be useless to try to avoid this limitation in Autoconf.
Now, since a space is necessarily a field separator, the only possibility is to provide pathnames without space characters. If spaces in pathnames were to be supported in the future, this would probably be done via pathname encoding, with decoding at the high-level UI (a bit like with URLs). Alternatively, if you have the choice and really want to use spaces in pathnames, you may use some non-ASCII space (BTW, this is how RISC OS supports spaces in pathnames, by forcing them to be the no-break space).

Related

Why are quotes preserved when using bash $() syntax, but not if executed manually?

I have the following bash script:
$ echo $(dotnet run --project Updater)
UPDATE_NEEDED='0' MD5_SUM="7e3ad68397421276a205ac5810063e0a"
$ export UPDATE_NEEDED='0' MD5_SUM="7e3ad68397421276a205ac5810063e0a"
$ echo $UPDATE_NEEDED
0
$ export $(dotnet run --project Updater)
$ echo $UPDATE_NEEDED
'0'
Why is $UPDATE_NEEDED 0 after the 3rd command, but '0' after the 5th command?
What would I need to do to get it to simply set 0? Using UPDATE_NEEDED=0 instead is not an option, as some of the other variables may contain a space (And I'd like to optimistically quote them to have it properly parse spaces).
Also, this is a bit of a XY problem. If anyone knows an easier way to export multiple variables from an executable that can be used later on in the bash script, that could also be useful.
To expand on the answer by Glenn:
1. When you write something like export UPDATE_NEEDED='0' in Bash code, this is 100% identical to export UPDATE_NEEDED=0. The quotes are used by Bash to parse the command expression, but they are then discarded immediately. Their only purpose is to prevent word splitting and to avoid having to escape special characters. In the same vein, the code fragment 'foo bar' is exactly identical to foo\ bar as far as Bash is concerned: both lead to the space being treated as a literal character rather than as a word separator.
2. Conversely, parameter expansion and command substitution follow different rules, and preserve literal quotes.
3. When you use eval, the command line arguments passed to eval are treated as if they were Bash code, and thus follow the same rules of expansion as regular Bash code, which leads to the same result as (1).
Apparently that Updater project is doing the equivalent of
echo "UPDATE_NEEDED='0' MD5_SUM=\"7e3ad68397421276a205ac5810063e0a\""
It's explicitly outputting the quotes.
When you do export UPDATE_NEEDED='0' MD5_SUM="7e3ad68397421276a205ac5810063e0a",
bash will eventually remove the quotes before actually setting the variables.
I agree with @pynexj, eval is warranted here, although additional quoting is recommended:
eval export "$(dotnet ...)"
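To see the difference without the real project, here is a small sketch. fake_updater is a hypothetical stand-in that I am assuming prints the same line as the Updater does in the question:

```bash
# fake_updater is a hypothetical stand-in for "dotnet run --project Updater".
# It prints the assignments with literal quote characters, as in the question.
fake_updater() {
  printf '%s\n' "UPDATE_NEEDED='0' MD5_SUM=\"7e3ad68397421276a205ac5810063e0a\""
}

# Without eval: the output is only word-split, so the quotes stay in the value.
export $(fake_updater)
without_eval=$UPDATE_NEEDED

# With eval: the output is parsed as shell code, so the quotes are removed.
eval export "$(fake_updater)"
with_eval=$UPDATE_NEEDED

printf 'without eval: %s\nwith eval: %s\n' "$without_eval" "$with_eval"
```

As with any eval, this runs whatever the program prints, so it should only be used with output you trust.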

What does `$*`(a dollar sign followed by a star) mean in the default setting of `grepprg` in vim?

In a vanilla vim on Mac, when I type :set grepprg?, it returns the following:
grepprg=grep -n $* /dev/null.
I understand what -n and /dev/null mean,
thanks to an old question here.
I also understand what $ and * mean individually.
However, I am not sure what to make of $*.
I tried to look it up in the vim doc,
but all that I could find was
The placeholder "$*" is allowed to specify where the arguments will be included.
I sense that I am missing some important connection here.
I would really appreciate if someone could explain to me
how $* works as a placeholder.
Update:
Thanks to the detailed explanation from @romainl,
I realized that I was misinterpreting $* as a regex,
whereas it is part of a shell scripting convention.
In fact, there is already an old post
about this particular convention.
Silly me!
I'm not sure what kind of explanation is needed beyond what you have already quoted:
The placeholder "$*" is allowed to specify where the arguments will be included.
$* is just a placeholder and it works like all placeholders: before being actually sent to the shell, the command is built out of &grepprg and $*, if present, is replaced by any pattern, filename, flags, etc. provided by the user.
Say you want to search for foo\ bar in all JavaScript files under the current directory. The command would be:
:grep 'foo\ bar' *.js
After you press <CR>, Vim grabs any argument you gave to :grep, in this case:
'foo\ bar' *.js
then, if there is a $* in &grepprg, it is replaced with the given argument:
grep -n 'foo\ bar' *.js /dev/null
or, if there is no $* in &grepprg, the given argument is appended to it. Only then does Vim send the whole command to a shell.
$* means "in this command, I specifically want the user-provided arguments to appear here".
As for the meaning of $*… $ and * have no intrinsic meaning and $* could have been $$$PLACEHOLDER$$$ or anything. $* may have been chosen because it is used in shell script to represent all the arguments given to a function or script, which is somewhat close in meaning to what is happening in &grepprg with $*.
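For comparison, this is roughly what $* does in an actual shell script. It is only a sketch of the shell convention, not of what Vim does internally (Vim performs a plain text replacement before the shell ever sees the command); show and demo are hypothetical helper names:

```bash
# show is a hypothetical helper: it prints each argument wrapped in <>.
show() { printf '<%s>' "$@"; echo; }

# demo forwards its own arguments in three different ways.
demo() {
  show $*      # unquoted: arguments are re-split at whitespace
  show "$*"    # quoted: all arguments joined into a single word
  show "$@"    # quoted $@: each original argument preserved as-is
}

demo 'foo bar' baz
```

With the default IFS this should print <foo><bar><baz>, then <foo bar baz>, then <foo bar><baz>.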

how to pass > symbol as an argument?

I have a Python script that takes, as an argument, a command string containing the redirection characters > and >>, for example
python script.py -o 'ls -la /tmp > /tmp/test'
When I execute this kind of command, the > character is interpreted by my terminal. How can I pass > as a literal symbol so that my console doesn't treat it as a redirection?
Single quotes already work exactly as you would wish. Probably you are doing something wrong with the string inside of your Python script.
In some more detail, the arguments when Bash is done parsing this command are (one per line)
python
script.py
-o
ls -la /tmp > /tmp/test
(The quotes were useful while Bash was parsing this command line, but they are gone now.)
You should easily be able to verify within Python that sys.argv[0] is script.py, sys.argv[1] is -o, and sys.argv[2] is the single-quoted string, sans quotes; all of these are strings (the shell really doesn't have any other primitive data type).
Backslash escaping all shell metacharacters individually would work as well here;
python script.py -o ls\ -la\ /tmp\ \>\ /tmp/test
or double quoting the string; albeit in the shell, double quotes have slightly different semantics (variables will be interpolated and backticks evaluated, and a backslash can be used to escape backslashes, double quotes, dollar signs, and backticks, unlike in single quotes, where every character is preserved verbatim, and there is no way to escape a character.)
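If you want to convince yourself that the three quoting styles hand the program the same argument, here is a quick sketch that does not involve Python at all. last_arg is a hypothetical helper that stands in for the script and prints only its final argument:

```bash
# last_arg is a hypothetical helper: it prints only its final argument.
last_arg() {
  while [ "$#" -gt 1 ]; do shift; done
  printf '%s\n' "$1"
}

single=$(last_arg -o 'ls -la /tmp > /tmp/test')      # single quotes
escaped=$(last_arg -o ls\ -la\ /tmp\ \>\ /tmp/test)  # backslash escapes
double=$(last_arg -o "ls -la /tmp > /tmp/test")      # double quotes

printf '%s\n' "$single" "$escaped" "$double"
```

In all three cases the > is an ordinary character inside one argument, and no redirection takes place.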
You don't reveal how exactly your code doesn't work, but I'd speculate that your problem is actually the opposite; you end up running something like ls '>' and get an error message that this file does not exist.
For the record, > is a redirection operator, not a command. The shell parses this into
ls
-la
/tmp
with standard output redirected to the file /tmp/test. In terms of Python code,
import subprocess
with open('/tmp/test', 'w') as redirect:
    # Run ls directly (no shell) and send its standard output to the file.
    subprocess.run(['ls', '-la', '/tmp'], stdout=redirect)
Of course, if you want to support arbitrary shell features, you need a shell. The simplest fix here is probably
subprocess.run(sys.argv[2], shell=True)
See Actual meaning of shell=True in subprocess, but here I don't see any easy way to avoid the shell, short of writing your own reimplementation.
In general, when you want to use special characters literally, i.e., without their special meaning, you need to surround the string with single quotation marks (quotes), which strip all characters within them of any special meaning they might have. It seems that you've already applied quoting when passing your command line argument to the Python script, so it should work as expected. For example, let's assume that your Python script looks like this:
import argparse
parser = argparse.ArgumentParser(description='Your App')
parser.add_argument('-o', action="store", dest="command")
options = parser.parse_args()
print(options.command)
Invoking this script with python ./script.py -o 'ls -la /tmp > /tmp/test' will produce the below results as expected.
ls -la /tmp > /tmp/test
But you've already mentioned that this is not the behavior you're seeing. The problem is therefore most likely in your Python script, so the next step is to inspect what the script actually does with the input value.
Try escaping the character by putting a backslash \ before the greater-than > symbol.

bash shell access to $ProgramFiles(x86) environment variable

In the bash shell, I'm using git-bash.exe, how do I access the Windows 10 ProgramFiles(x86) environment variable?
If I execute printenv I see it in the output with the casing noted but attempts to access it using echo $ProgramFiles(x86), echo $ProgramFiles\(x86\) and echo $"ProgramFiles(x86)" do not work.
I am able to access the non-x86 version of that environment variable without any issue using echo $PROGRAMFILES and do relevant colon removal and backslash to forward replacements necessary to use it in PATH environment variable updates, e.g. PATH=$PATH:"/${PROGRAMFILES//:\\//} (x86)/Some App Path With Spaces/" followed by echo $PATH and printenv PATH that confirms the desired result. The issue is that I'd rather not have to compose the ProgramFiles(x86) environment variable versus being able to use it directly in updates to the PATH environment variable.
Along these same lines, when trying to use the Windows APPDATA [ = C:\Users\<username>\AppData\Roaming ] environment variable in updates to the PATH environment variable, I need to be able to replace not only the initial colon & backslash but also the subsequent backslashes with forward slashes. Using echo ${APPDATA//:\\//} produces C/Users\<username>\AppData\Roaming, and I'm not aware of how to get the bash pattern matching and substitution syntax to cover both cases in order to produce the required C/Users/<username>/AppData/Roaming for use in updates to the PATH environment variable.
Note: there's a flaw in the process described below. In particular, if some environment variable is set to a multi-line value where one of the value lines matches the sed expression, you'll capture that line as well. To avoid this, if you have a Python available, you could use:
python -c 'import os; print(os.getenv("FOO(BAR)"))'
for instance. This will print None if the variable is not set, so you might want to make it fancier: e.g., supply a default value, or use sys.exit(1) if the variable is not set, for instance. (But if you have a Python interpreter available, you might consider writing in Python rather than directly in bash.)
Unix shell (sh, bash, etc) variable names, including environment variables, are constrained to character sets that exclude parentheses. In particular, "$FOO(BAR)" always parses as a reference to variable $FOO, followed by (BAR) as a separate word. This holds even with braced expansion forms, where the separate word (BAR) is syntactically invalid:
bash$ echo "${FOO(BAR)}"
bash: ${FOO(BAR)}: bad substitution
Nonetheless, it is possible to set such variables, and access them, using other programs. For instance, using Python I set FOO(BAR) to hello:
>>> import os
>>> os.environ["FOO(BAR)"] = "hello"
>>> import subprocess
>>> subprocess.call("bash")
bash$
This bash instance cannot directly access the variable, but env prints all the variables:
bash$ env | grep FOO
FOO(BAR)=hello
If you have env (you probably do) and sed, you can combine them to extract arbitrary variables:
bash$ setting="$(env | sed -n 's/^FOO(BAR)=//p')"
bash$ echo "$setting"
hello
So assuming that Windows Bash doesn't have any special case to work around this particular clumsiness better, this same trick should work for "ProgramFiles(x86)".
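If you have awk but no Python, a similar lookup might be done through awk's ENVIRON array, which would also sidestep the multi-line flaw mentioned at the top. This is only a sketch: I am assuming that your awk fills ENVIRON with every name=value pair in its environment, including names the shell itself refuses, and FOO(BAR) is the same made-up name as above:

```bash
# Start awk with a variable whose name bash cannot reference directly,
# then look the value up by its exact name in awk's ENVIRON array.
value=$(env 'FOO(BAR)=hello' awk 'BEGIN { print ENVIRON["FOO(BAR)"] }')

echo "$value"
```

Under that assumption, running awk 'BEGIN { print ENVIRON["ProgramFiles(x86)"] }' inside the Windows bash would be the equivalent for the original question.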
Substituting multiple backslashes with forward slashes
You're mostly there: the problem is that your pattern looks specifically for :\ but the strings have multiple \s without colons. Your best bet is probably to have a program or function that actually understands Windows paths, as they don't necessarily have drive letters at the front (see https://learn.microsoft.com/en-us/dotnet/standard/io/file-path-formats). But this pattern works for all-backslash:
bash$ v='a\b\c'
bash$ echo ${v//\\/\/}
a/b/c
The double slash means "substitute all occurrences". The pattern is then \\, which matches one backslash. The next slash introduces the replacement string, which is \/, which means one forward slash. (This can also be written as just / but I find that harder to read, oddly enough.)
Of course this does not replace the colon in C:, so we need one more substitution. You can't do that in one ${...} expansion, so the trick is to add another one:
bash$ v='C:\a\b\c'
bash$ echo ${v//\\/\/}
C:/a/b/c
bash$ v1="${v//\\//}"; echo ${v1/:/}
C/a/b/c
Put this inside a shell function, which you can eventually make smart enough to handle all valid paths, and that way you can use local to keep the variable name v1 from leaking.
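A minimal sketch of such a function, using only the two substitutions shown above. The name win_to_path is made up for this example, and the function handles just the simple drive-letter case (it needs bash for the ${...//...} syntax):

```bash
# win_to_path is a hypothetical helper: it converts C:\a\b\c into C/a/b/c.
win_to_path() {
  local v1="${1//\\//}"       # replace every backslash with a forward slash
  printf '%s\n' "${v1/:/}"    # then drop the first colon
}

converted=$(win_to_path 'C:\a\b\c')
echo "$converted"
```

Because v1 is declared with local, it does not leak into the calling script.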
Regarding APPDATA: The cygpath program can convert pathnames between Windows, Unix and "Mixed" conventions. Both Cygwin and Git for Windows come with this tool. Example:
$ echo "$APPDATA"
C:\Users\me\AppData\Roaming
$ cygpath -u "$APPDATA"
/c/Users/me/AppData/Roaming
$ cygpath -m "$APPDATA"
C:/Users/me/AppData/Roaming
$ cygpath -w "$APPDATA"
C:\Users\me\AppData\Roaming
The "mixed" format is quite useful because most Windows programs, as well as Git for Windows, can handle that format directly.
Assigning the output of cygpath to a variable works like this (note the quotes!):
$ XAPP=$(cygpath "$APPDATA")
$ echo "$XAPP"
$ cd "$XAPP"

cd to an unknown directory name with spaces in a bash script

I've looked at some of the posts that have similar issues, but I can't extrapolate some of the solutions to fit my own needs, so here I am.
I have a simple shell script and I need it to cd into a directory with a space in the name. The directory is in the same place every time (/home/user/topleveldir) but the directory name itself is unique to the machine, and has a space in it (/home/user/topleveldir/{machine1}\ dir/, /home/user/topleveldir/{machine2}\ dir/). I'm confused as to the best method to cd into that unique directory in a script.
I don't see why something like the following would not work:
baseDir=/home/user/topleveldir
machine=<whatever machine name>
cd "$baseDir/$machine dir"
You need to quote that space character, so that the shell knows that it's part of the argument and not a separator between arguments.
If you have the directory directly on that command line in the script, use single quotes. Between single quotes, every character is interpreted literally except a single quote.
cd '/home/user/topleveldir/darkstar dir/'
If the directory name comes from a variable, use double quotes around the variable substitution. Always use double quotes around variable substitutions, e.g. "$foo". If you leave out the quotes, the value of the variable is split into separate words which are interpreted as glob patterns; this is very rarely desirable.
directory_name='darkstar dir'
…
cd "/home/user/topleveldir/$directory_name"
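If the machine-specific part of the name is not known in advance, one option might be to let a glob find it, quoting only the part that contains the space. The sketch below uses a temporary directory as a stand-in for /home/user/topleveldir and assumes there is exactly one directory whose name ends in " dir"; darkstar is just the example name from above:

```bash
# Stand-in for /home/user/topleveldir so the sketch can run anywhere.
base=$(mktemp -d)
mkdir "$base/darkstar dir"

# The glob * stays unquoted so it expands; the " dir" part is quoted.
set -- "$base"/*" dir"/

# Only cd if the glob matched exactly one existing directory.
if [ "$#" -eq 1 ] && [ -d "$1" ]; then
  cd "${1%/}" || echo "cd failed" >&2
  found=${PWD##*/}
fi

echo "$found"
```

The check on the number of matches matters: if several directories match, or none, the script should stop rather than guess.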
