options or arguments passed to executables are quoted by " " - shell

There is an executable called app that takes some options and command-line arguments, -l and -v to name a few. I'm writing a bash script that invokes app with some options, and I did it this way:
opt_string="-l -v" # this string might change according to different conditions used in if-else
# HERE is my problem
./app ${opt_string}
Look at how I invoked app. Typically I just invoke it at the shell prompt like this:
./app -l -v
But now in this script, would it be actually this:
./app "-l -v"
because ${opt_string} is a string quoted by ""? If so, I doubt app will run normally.
I know I could work around this with eval "./app ${opt_string}", but is there any way to strip the ""?

BASH FAQ entry #50: "I'm trying to put a command in a variable, but the complex cases always fail!"
opt_string=(-l -v)
./app "${opt_string[@]}"

The way you're invoking (./app ${opt_string}) it is fine. You define opt_string as a single string because you quote the value in the assignment. However, when you dereference it, you have not used quotes, so the shell will substitute its value and then split it into individual words.
When you say
./app "$opt_string"
you are passing one single argument to the app. When you say
./app $opt_string
you are passing multiple arguments to the app.
See word splitting in the bash manual.
Note that braces are not quotes. Curly braces (in this context) merely serve to disambiguate the variable name from the surrounding text, i.e. echo "$opt_string_blah" versus echo "${opt_string}_blah"
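A quick way to see the difference is to count the arguments a command actually receives. The sketch below assumes bash with the default IFS; count_args is a hypothetical stand-in for ./app, and the extra --name option is made up for illustration.

```shell
#!/usr/bin/env bash
# count_args is a hypothetical stand-in for ./app: it prints how many
# arguments it received.
count_args() { echo "$#"; }

opt_string="-l -v"

unquoted=$(count_args $opt_string)    # word splitting applies: -l and -v
quoted=$(count_args "$opt_string")    # a single argument: "-l -v"

# An array keeps each option as its own element, even one containing spaces.
opts=(-l -v)
opts+=("--name=two words")            # hypothetical option, for illustration
from_array=$(count_args "${opts[@]}")

echo "unquoted=$unquoted quoted=$quoted array=$from_array"
# prints: unquoted=2 quoted=1 array=3
```

The unquoted string works as long as no option ever contains whitespace or glob characters; the array form does not have that restriction.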

Related

Bash command works when I run it myself but fails in the script

My company has a tool that dynamically generates commands to run based on an input json. It works very well when all arguments to the compiled command are single words, but is failing when we attempt multi word args. Here is the minimal example of how it fails.
# Print and execute the command.
print_and_run() {
    local command=("$@")
    if [[ ${command[0]} == "time" ]]; then
        echo "Your command: time ${command[@]:1}"
        time ${command[@]:1}
    fi
}
# How print_and_run is called in the script
print_and_run time docker run our-container:latest $generated_flags
# Output
Your command: time docker run our-container:latest subcommand --arg1=val1 --arg2="val2 val3"
Usage: our-program [OPTIONS] COMMAND1 [ARGS]... [COMMAND2 [ARGS]...]...
Try 'our-program --help' for help.
Error: No such command 'val3"'.
But if I copy the printed command and run it myself it works fine (I've omitted docker flags). Shelling into the container and running the program directly with these arguments works as well, so the parsing logic there is solid (It's a python program that uses click to parse the args).
Now, I have a working solution that uses eval, but my entire team jumped down my throat at that suggestion. I've also proposed a solution using delineating characters for multi-word arguments, but that was shot down as well.
No other solutions proposed by other engineers have worked either. So can I ask someone to perhaps explain why val3 is being treated as a separate command, or to help me find a solution to get bash to properly evaluate the dynamically determined command without using eval?
Your command after expanding $generated_flags is:
print_and_run time docker run our-container:latest subcommand --arg1=val1 --arg2="val2 val3"
Your specific problem is that in --arg2="val2 val3" the quotes are literal, not syntactical, because quotes are processed before variables are expanded. This means --arg2="val2 and val3" are being split into two separate arguments. Then, I assume, docker is trying to interpret val3" as some kind of docker command because it's not part of any argument, and it's throwing out an error because it doesn't know what that means.
Normally you'd fix this via an array to properly maintain the string boundary.
generated_flags=( "subcommand" "--arg1=val1" "--arg2=val2 val3" )
print_and_run time docker run our-container:latest "${generated_flags[@]}"
This will maintain --arg2=val2 val3 as a single argument as it gets passed into print_and_run, then you just have to expand your command array correctly inside the function (make sure to quote the expansion).
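The quoted expansion inside the function could look roughly like this. It is a sketch, not the asker's real script: show_args is a hypothetical stand-in for the docker invocation so that it runs anywhere, and printing with printf %q is one possible way to display the command so that argument boundaries stay visible.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for `docker run our-container:latest`:
# prints each argument it receives on its own line.
show_args() { printf '<%s>\n' "$@"; }

# Print and execute the command, preserving argument boundaries.
print_and_run() {
    local command=("$@")
    # %q prints each word in a form that could be pasted back into a shell.
    printf 'Your command:'
    printf ' %q' "${command[@]}"
    printf '\n'
    if [[ ${command[0]} == "time" ]]; then
        time "${command[@]:1}"   # quoted: one word per array element
    else
        "${command[@]}"
    fi
}

generated_flags=("subcommand" "--arg1=val1" "--arg2=val2 val3")
print_and_run time show_args "${generated_flags[@]}"
```

With the quotes in place, show_args receives three arguments, and the last one is --arg2=val2 val3 as a single word.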
The question is:
why val3 is being treated as a separate command
Unquoted variable expansions undergo word splitting and filename expansion. Word splitting breaks the result of the expansion into separate "words" on spaces, tabs and newlines (the default value of IFS).
a="something else"
$a # results in two "words"; 'something' and 'else'
It is irrelevant what you put inside the variable value or how many quotes or escape sequences it contains. Every run of consecutive whitespace splits it into words. Quotes " ' and escapes \ are parsed when they are part of the input line, not when they are part of the result of an unquoted expansion.
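To make this concrete, here is a small sketch (assuming bash and the default IFS) that prints each word a command receives. show_args is a hypothetical helper, and the flag value mirrors the one from the question.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints each argument it receives on its own line.
show_args() { printf '<%s>\n' "$@"; }

# The double quotes here are ordinary characters stored in the value.
flags='--arg2="val2 val3"'

show_args $flags
# <--arg2="val2>
# <val3">

# With an array, no quotes need to be stored in the value at all.
flags_array=('--arg2=val2 val3')
show_args "${flags_array[@]}"
# <--arg2=val2 val3>
```

The stored quotes survive as literal characters and the split still happens at the space, which is where the stray val3" in the error message comes from.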
help me find a solution to
Write a parser that will actually parse the commands and split it according to the rules that you want to use and then execute the command split into separate words. For example, a very crude such parser is included in xargs:
$ echo " 'quotes quotes' not quotes" | xargs printf "'%s'\n"
'quotes quotes'
'not'
'quotes'
For example, python has shlex.split which you can just use, and at the same time introduce python which is waaaaay easier to manage than badly written Bash scripts.
tool that dynamically generates commands to run based on an input json
Overall, the proper way forward is to upgrade the tool to generate a JSON array that represents the words of the command to be executed. Then you can just execute that array of words, which is, again, trivial to do properly in python with json and subprocess.run, and will require some gymnastics with jq and read and Bash arrays in shell.
Check your scripts with shellcheck.

How to use a pure string as an argument for python program through bash terminal

I am trying to give an argument to my python program through the terminal.
For this I am using the lines:
import sys
something = sys.argv[1]
I now try to put in a string like this through the bash terminal:
python my_script.py 2m+{N7HiwH3[>!"4y?t9*y#;/$Ar3wF9+k$[3hK/WA=aMzF°L0PaZTM]t*P|I_AKAqIb0O4# cm=sl)WWYwEg10DDv%k/"c{LrS)oVd§4>8bs:;9u$ *W_SGk3CXe7hZMm$nXyhAuHDi-q+ug5+%ioou.,IhC]-_O§V]^,2q:VBVyTTD6'aNw9:oan(s2SzV
This returns a bash error because some of the characters in the string are bash special characters.
How can I use the string exactly as it is?
You can put the raw string into a file, for example like this, with cat and a here document.
cat <<'EOF' > file.txt
2m+{N7HiwH3[>!"4y?t9*y#;/$Ar3wF9+k$[3hK/WA=aMzF°L0PaZTM]t*P|I_AKAqIb0O4# cm=sl)WWYwEg10DDv%k/"c{LrS)oVd§4>8bs:;9u$ *W_SGk3CXe7hZMm$nXyhAuHDi-q+ug5+%ioou.,IhC]-_O§V]^,2q:VBVyTTD6'aNw9:oan(s2SzV
EOF
and then run
python my_script.py "$(< file.txt)"
You can also use the text editor of your choice for the first step if you prefer that.
If this is a reoccurring task, which you have to perform from time to time, you can make your life easier with a little alias in your shell:
alias escape='read -r string ; printf "Copy this:\n%q\n" "${string}"'
It is using printf "%q" to escape your input string.
Run it like this:
escape
2m+{N7HiwH3[>!"4y?t9*y#;/$Ar3wF9+k$[3hK/WA=aMzF°L0PaZTM]t*P|I_AKAqIb0O4# cm=sl)WWYwEg10DDv%k/"c{LrS)oVd§4>8bs:;9u$ *W_SGk3CXe7hZMm$nXyhAuHDi-q+ug5+%ioou.,IhC]-_O§V]^,2q:VBVyTTD6'aNw9:oan(s2SzV
Copy this:
2m+\{N7HiwH3\[\>\!\"4y\?t9\*y#\;/\$Ar3wF9+k\$\[3hK/WA=aMzF°L0PaZTM\]t\*P\|I_AKAqIb0O4#\ cm=sl\)WWYwEg10DDv%k/\"c\{LrS\)oVd§4\>8bs:\;9u\$\ \*W_SGk3CXe7hZMm\$nXyhAuHDi-q+ug5+%ioou.\,IhC\]-_O§V\]\^\,2q:VBVyTTD6\'aNw9:oan\(s2SzV
You can use the escaped string directly in your shell, without additional quotes, like this:
python my_script.py 2m+\{N7HiwH3\[\>\!\"4y\?t9\*y#\;/\$Ar3wF9+k\$\[3hK/WA=aMzF°L0PaZTM\]t\*P\|I_AKAqIb0O4#\ cm=sl\)WWYwEg10DDv%k/\"c\{LrS\)oVd§4\>8bs:\;9u\$\ \*W_SGk3CXe7hZMm\$nXyhAuHDi-q+ug5+%ioou.\,IhC\]-_O§V\]\^\,2q:VBVyTTD6\'aNw9:oan\(s2SzV
In order to make life easier, shells like bash do a little bit of extra work to help users pass the correct arguments to the programs they instruct it to execute. This extra work usually results in predictable argument arrays getting passed to programs.
Oftentimes, though, this extra help results in unexpected arguments getting passed to programs; and sometimes results in the execution of undesired additional commands. In this case, though, it ended up causing Bash to emit an error.
In order to turn off this extra work, Bash allows users to indicate where arguments should begin and end by surrounding them by quotation marks. Bash supports both single quotes (') and double quotes (") to delimit arguments. As a last resort, if a string may contain single and double quotes (or double quotes are required but aren't aggressive enough), Bash allows you to indicate that a special- or whitespace-character should be part of the adjacent argument by preceding it with a backslash (\).
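The three quoting forms can be compared side by side. This is a minimal sketch for a non-interactive bash script; show_args is a hypothetical helper and the sample strings are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints each argument it receives on its own line.
show_args() { printf '<%s>\n' "$@"; }

name=world

# Single quotes: every character is literal, including $, " and *.
show_args 'a $b "c" *'
# <a $b "c" *>

# Double quotes: $name is expanded, but the result is neither split nor globbed.
show_args "hello $name *"
# <hello world *>

# Backslashes: each escaped character becomes part of the adjacent word.
show_args it\'s\ one\ word
# <it's one word>

# A single quote inside a single-quoted string: close, escape, reopen.
show_args 'it'\''s'
# <it's>
```

Single quotes are usually the simplest choice for a string that should be passed exactly as written, as long as the string itself contains no single quote.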
If this method of escaping arguments is too cumbersome, it may be worth simplifying your program's interface by having it consume this data from a file instead of a command line argument. Another option is to create a program that loads the arguments from a more controlled location (like a file) and directly execs the target program with the desired argument array.

Script takes only first part of double quotes

Yesterday I asked a similar question about escaping double quotes in env variables, but it didn't solve my problem (probably because I didn't explain it well enough), so I would like to be more specific.
I'm trying to run a script (which I know is written in Perl), but I have to use it as a black box because of a permissions issue (so I don't know how the script works). Let's call this script script_A.
I'm trying to run a basic command in the shell: script_A -arg "date time".
If I run it from the command line, it works fine, but if I try to use it from a bash script or Perl script (for example using the system operator), it takes only the first part of the string that was sent as an argument. In other words, it fails with the following error: '"date' is not valid.
Example to specify a little bit more:
If I run from the command line (works fine):
> script_A -arg "date time"
If I run from (for example) a Perl script (fails):
my $args = $ENV{SOME_ENV}; # Assume that SOME_ENV has '-arg "date time"'
my $cmd = "script_A $args";
system($cmd);
I think that the problem comes from the environment variable, but I can't use the one quote while defining the env variable. For example, I can't use the following method:
setenv SOME_ENV '-arg "date time"'
Because it fails with the following error: '"date' is not valid.
Also, I tried to use the following method:
setenv SOME_ENV "-arg '"'date time'"'"
Although now the env variable will contain:
echo $SOME_ENV
> -arg 'date time' # should be -arg "date time"
Another note, using \" fails on Shell (tried it).
Any suggestions on how to locate the reason for the error and how to solve it?
The $args, obtained from %ENV as you show, is a string.
The problem is in what happens to that string as it is manipulated before arguments are passed to the program, which needs to receive the strings -arg and date time.
If the program is executed in a way that bypasses the shell, as your example is, then the whole -arg "date time" is passed to it as its first argument. This is clearly wrong, as the program expects -arg and then another string for its value (date time).
If the program were executed via the shell, which happens when there are shell metacharacters in the command line (not in your example), then the shell would break the string into words, except for the quoted part; this is how it works from the command line. That can be enforced with
system('/bin/tcsh', '-c', $cmd);
This is the most straightforward fix, but I can't honestly recommend involving the shell just for argument parsing. Also, you are then in the game of layered quoting and escaping, which can get rather involved and tricky. For one, if things aren't right the shell may end up breaking the command into the words -arg, "date, time"
How you set the environment variable works
> setenv SOME_ENV '-arg "date time"'
> perl -wE'say $ENV{SOME_ENV}' #--> -arg "date time" (so it works)
which I believe has always worked this way in [t]csh.
Then, in a Perl script, parse this string into the -arg and date time strings, and have the program executed in a way that bypasses the shell (if the shell isn't needed by the command)
my @args = $ENV{SOME_ENV} =~ /(\S+)\s+"([^"]+)"/; #"
my @cmd = ('script_A', @args);
system(@cmd) == 0 or die "Error with system(@cmd): $?";
This assumes that SOME_ENV's first word is always the option's name (-arg) and that all the rest is always the option's value, under quotes. The regex extracts the first word, as consecutive non-space characters, and after spaces everything in quotes.† These are program's arguments.
In the system LIST form the program that is the first element of the list is executed without using a shell, and the remaining elements are passed to it as arguments. Please see system for more on this, and also for basics of how to investigate failure by looking into $? variable.
It is in principle advisable to run external commands without the shell. However, if your command needs the shell then make sure that the string is escaped just right to preserve quotes.
Note that there are modules that make it easier to use external commands. A few, from simple to complex: IPC::System::Simple, Capture::Tiny, IPC::Run3, and IPC::Run.
I must note that that's an awkward environment variable; any way to organize things otherwise?
† To make this work for non-quoted arguments as well (-arg date) make the quote optional
my @args = $ENV{SOME_ENV} =~ /(\S+)\s+"?([^"]+)/;
where I now left out the closing (unnecessary) quote for simplicity

bash script pass a variable to a ./configure command containing quotes and expansion

I am having difficulty understanding how to pass a variable to a ./configure command that includes variable expansion and quotes.
myvars.cfg
myFolderA="/home/myPrefix"
myFolderB="/home/stuffB"
myFolderC="/home/stuffC"
optsA="--prefix=${myFolderA}"
optsB="CPPFLAGS=\"-I${myFolderB} -I${myFolderC}\""
cmd="/home/prog/"
myScript.sh
#!/bin/bash
. /home/myvars.cfg
doCmd=("$cmd/configure" "${optsA}" "${optsB}")
${doCmd[@]}
The doCmd should look like this
/home/prog/configure --prefix=/home/myPrefix CPPFLAGS="-I/home/stuffB -I/home/stuffC"
however it seems when running bash it is adding single quotes
/home/prog/configure --prefix=/home/myPrefix 'CPPFLAGS="-I/home/stuffB' '-I/home/stuffC"'
causing an error of
configure: error: unrecognized option: `-I/home/stuffC"'
Is there a way to pass a variable that needs to be expanded and contains double quotes?
As your script is written, there is no point to using the doCmd array. You could simply write the command:
"$cmd/configure" "${optsA}" "${optsB}"
Or, more simply:
"$cmd/configure" "$optsA" "$optsB"
However, it is possible that you've simplified the script in a way which hides the need for the array. In any case, if you use the array, you need to ensure that its elements are not word-split and filepath expanded, so you must quote its expansion:
"${doCmd[@]}"
Also, you need to get rid of the quotes in optsB. You don't want to pass
CPPFLAGS="-I/home/stuffB -I/home/stuffC"
to the configure script. You want to pass what the shell would pass if you typed the above string. And what the shell would pass would be a single command-line argument with a space in it, looking like this:
CPPFLAGS=-I/home/stuffB -I/home/stuffC
In order to get that into optsB, you just write:
optsB="CPPFLAGS=-I${myFolderB} -I${myFolderC}"
Finally, the shell is not "adding single quotes" into the command line. It is showing you a form of the command which you could type at the command line. Since the argument (incorrectly) contains a quote symbol, the shell shows you the command with its arguments single-quoted, so that you can see that optsB has been (incorrectly) split into two arguments, each of which contains (incorrectly) one double quote.
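Putting those corrections together, the script could be sketched as follows. show_args is a hypothetical stand-in for /home/prog/configure so that the example runs without the real tree, and the variables are defined inline instead of being sourced from myvars.cfg.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for /home/prog/configure:
# prints each argument it receives on its own line.
show_args() { printf '<%s>\n' "$@"; }

myFolderA="/home/myPrefix"
myFolderB="/home/stuffB"
myFolderC="/home/stuffC"

optsA="--prefix=${myFolderA}"
# No embedded quotes: the space is simply part of the value.
optsB="CPPFLAGS=-I${myFolderB} -I${myFolderC}"

doCmd=(show_args "$optsA" "$optsB")
"${doCmd[@]}"    # quoted expansion: one word per array element
# <--prefix=/home/myPrefix>
# <CPPFLAGS=-I/home/stuffB -I/home/stuffC>
```

The command receives exactly two arguments, and the second one carries both -I flags with the space between them.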
You could have found much of the above and more by pasting your script into https://shellcheck.net. As the bash tag summary suggests, you should always try that before asking a shell question here because a lot of the time, it will solve your problem instantly.

Is Python3 shlex.quote() safe?

I execute some code in shell using
subprocess.Popen('echo '+user_string+' | pipe to some string manipulation tools',
shell=True)
where user_string is from an untrusted source.
Is it safe enough to use shlex.quote() for escaping the input?
I'm necromancing this because it's the top Google hit for "is shlex.quote() safe", and while the accepted answer seems correct, there's a lot of pitfalls to point out.
shlex.quote() escapes the shell's parsing, but it does not escape the argument parser of the command you're calling, and some additional tool-specific escaping needs to be done manually, especially if your string starts with a dash (-).
Most (but not all) tools accept -- as an argument, and anything afterward is interpreted as verbatim. You can prepend "-- " if the string starts with "-". Example: rm -- --help removes the file called --help.
When dealing with file names, you can prepend "./" if the string starts with "-": rm ./--help.
In the case of your example with echo, neither escape is sufficient: when attempting to echo the string -e, echo -- -e gives the wrong result (it prints -- -e), and you'll need something like echo -e '\x2de'. This demonstrates that there's no universal bulletproof way to escape program arguments.
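On the shell side, one common way to sidestep echo's option parsing is printf with a fixed format string, so the data is never interpreted as an option. The sketch below is bash-only and uses a made-up user_string value; it says nothing about how the value reaches the shell in the first place, which is the part shlex.quote() covers.

```shell
#!/usr/bin/env bash
# Hypothetical untrusted value that happens to look like an echo option.
user_string='-e'

# The format string is fixed; the data is passed as a separate argument,
# so printf never treats it as an option or as format directives.
printed=$(printf '%s\n' "$user_string")

echo "printf gave: $printed"
# prints: printf gave: -e
```

In bash, echo "$user_string" would print an empty line here, because the builtin consumes -e as an option.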
The safest route is to bypass the shell by avoiding shell=True or os.system() if the string involves any user-supplied data.
In your case, set stdin=subprocess.PIPE and pass user_string as an argument to communicate(). Then you can even leave your original invocation as-is with shell=True:
import subprocess
subprocess.Popen(
    'pipe to some string manipulation tools',
    shell=True,
    stdin=subprocess.PIPE
).communicate(user_string.encode())
According to the official Python documentation for shlex.quote, the answer is yes.
Of course that also depends on what you mean by "safe enough". Under the assumption that you mean "Will using shlex.quote on user_string guard me against a typical scenario of malicious shell code passed as string input to my script?" the answer is yes.

Resources