Is Python3 shlex.quote() safe? - bash

I execute some code in shell using
subprocess.Popen('echo ' + user_string + ' | pipe to some string manipulation tools',
                 shell=True)
where user_string is from an untrusted source.
Is it safe enough to use shlex.quote() for escaping the input?

I'm necromancing this because it's the top Google hit for "is shlex.quote() safe", and while the accepted answer seems correct, there are a lot of pitfalls to point out.
shlex.quote() protects against the shell's parsing, but it does not protect against the argument parser of the command you're calling, and some additional tool-specific escaping needs to be done manually, especially if your string starts with a dash (-).
Most (but not all) tools accept -- as an argument, and anything after it is treated as a literal argument rather than an option. You can prepend "-- " if the string starts with "-". Example: rm -- --help removes the file called --help.
When dealing with file names, you can instead prepend "./" if the string starts with "-": rm ./--help.
In the case of your example with echo, neither escape is sufficient: when attempting to echo the string -e, echo -- -e gives the wrong result, and you would need something like echo -e '\x2de' instead. This demonstrates that there is no universal, bulletproof way to escape program arguments.
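For illustration, here's a minimal sketch of combining shlex.quote() with the -- guard for a tool that honors -- (grep and the file name are just placeholders):
import shlex
import subprocess

user_string = "-dashed; pattern"  # hypothetical untrusted input starting with a dash

# shlex.quote() keeps the shell from splitting or interpreting the string;
# "--" keeps grep from treating the leading "-" as an option.
cmd = "grep -- " + shlex.quote(user_string) + " logfile.txt"
subprocess.Popen(cmd, shell=True)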
The safest route is to bypass the shell entirely by avoiding shell=True and os.system() whenever the command involves any user-supplied data.
In your case, set stdin=subprocess.PIPE and pass user_string to communicate(). Then you can even leave your original invocation as-is with shell=True:
subprocess.Popen(
    'pipe to some string manipulation tools',
    shell=True,
    stdin=subprocess.PIPE
).communicate(user_string.encode())
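If the downstream processing is a single program, a minimal shell-free sketch of the same idea looks like this (tr is only a stand-in for your real string manipulation tool):
import subprocess

# Passing an argument list instead of a command string means no shell is
# involved, so user_string is never parsed as shell syntax.
result = subprocess.run(
    ["tr", "[:lower:]", "[:upper:]"],
    input=user_string.encode(),
    stdout=subprocess.PIPE,
)
print(result.stdout.decode())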

According to the official Python documentation for shlex.quote, the answer is yes.
Of course, that also depends on what you mean by "safe enough". Under the assumption that you mean "Will using shlex.quote on user_string guard me against the typical scenario of malicious shell code passed as string input to my script?", the answer is yes.
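For a quick feel of what the quoting actually produces, an interactive check:
>>> import shlex
>>> print(shlex.quote("foo; rm -rf ~"))
'foo; rm -rf ~'
>>> print(shlex.quote("it's here"))
'it'"'"'s here'
The whole string becomes a single-quoted shell word, with any embedded single quotes escaped.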

Related

What does `$*` (a dollar sign followed by a star) mean in the default setting of `grepprg` in vim?

In a vanilla vim on Mac, when I type :set grepprg?, it returns the following:
grepprg=grep -n $* /dev/null.
I understand what -n and /dev/null mean,
thanks to an old question here.
I also understand what $ and * mean individually.
However, I am not sure what to make of $*.
I tried to look it up in the vim doc,
but all that I could find was
The placeholder "$*" is allowed to specify where the arguments will be included.
I sense that I am missing some important connection here.
I would really appreciate if someone could explain to me
how $* works as a placeholder.
Update:
Thanks to the detailed explanation from @romainl,
I realized that I was misinterpreting $* as regex,
whereas it is actually a shell-scripting convention.
In fact, there is already an old post
about this particular convention.
Silly me!
I'm not sure what kind of explanation is needed beyond what you have already quoted:
The placeholder "$*" is allowed to specify where the arguments will be included.
$* is just a placeholder, and it works like all placeholders: before the command is actually sent to the shell, it is built from &grepprg, and $*, if present, is replaced by whatever pattern, filename, flags, etc. the user provided.
Say you want to search for foo\ bar in all JavaScript files under the current directory. The command would be:
:grep 'foo\ bar' *.js
After you press <CR>, Vim grabs any argument you gave to :grep, in this case:
'foo\ bar' *.js
then, if there is a $* in &grepprg, it is replaced with the given argument:
grep -n 'foo\ bar' *.js /dev/null
or, if there is no $* in &grepprg, the given argument is appended to it; only then is the whole command sent to the shell.
$* means "in this command, I specifically want the user-provided arguments to appear here".
As for the meaning of $*… $ and * have no intrinsic meaning here, and $* could just as well have been $$$PLACEHOLDER$$$ or anything else. $* may have been chosen because it is used in shell scripts to represent all the arguments given to a function or script, which is somewhat close in meaning to what happens with $* in &grepprg.
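To make the substitution rule concrete, here is a small Python sketch of the logic described above (the variable names are illustrative, not Vim internals):
# Illustrative only: how the :grep command line is assembled from &grepprg.
grepprg = "grep -n $* /dev/null"
user_args = "'foo\\ bar' *.js"

if "$*" in grepprg:
    cmd = grepprg.replace("$*", user_args)
else:
    cmd = grepprg + " " + user_args

print(cmd)  # grep -n 'foo\ bar' *.js /dev/null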

How to use a pure string as an argument for python program through bash terminal

I am trying to give an argument to my python program through the terminal.
For this I am using the lines:
import sys
something = sys.argv[1]
I now try to put in a string like this through the bash terminal:
python my_script.py 2m+{N7HiwH3[>!"4y?t9*y#;/$Ar3wF9+k$[3hK/WA=aMzF°L0PaZTM]t*P|I_AKAqIb0O4# cm=sl)WWYwEg10DDv%k/"c{LrS)oVd§4>8bs:;9u$ *W_SGk3CXe7hZMm$nXyhAuHDi-q+ug5+%ioou.,IhC]-_O§V]^,2q:VBVyTTD6'aNw9:oan(s2SzV
This returns a bash error because some of the characters in the string are bash special characters.
How can I use the string exactly as it is?
You can put the raw string into a file, for example like this, with cat and a here document.
cat <<'EOF' > file.txt
2m+{N7HiwH3[>!"4y?t9*y#;/$Ar3wF9+k$[3hK/WA=aMzF°L0PaZTM]t*P|I_AKAqIb0O4# cm=sl)WWYwEg10DDv%k/"c{LrS)oVd§4>8bs:;9u$ *W_SGk3CXe7hZMm$nXyhAuHDi-q+ug5+%ioou.,IhC]-_O§V]^,2q:VBVyTTD6'aNw9:oan(s2SzV
EOF
and then run
python my_script.py "$(< file.txt)"
You can also use the text editor of your choice for the first step if you prefer that.
If this is a recurring task which you have to perform from time to time, you can make your life easier with a little alias in your shell:
alias escape='read -r string ; printf "Copy this:\n%q\n" "${string}"'
It uses printf "%q" to escape your input string.
Run it like this:
escape
2m+{N7HiwH3[>!"4y?t9*y#;/$Ar3wF9+k$[3hK/WA=aMzF°L0PaZTM]t*P|I_AKAqIb0O4# cm=sl)WWYwEg10DDv%k/"c{LrS)oVd§4>8bs:;9u$ *W_SGk3CXe7hZMm$nXyhAuHDi-q+ug5+%ioou.,IhC]-_O§V]^,2q:VBVyTTD6'aNw9:oan(s2SzV
Copy this:
2m+\{N7HiwH3\[\>\!\"4y\?t9\*y#\;/\$Ar3wF9+k\$\[3hK/WA=aMzF°L0PaZTM\]t\*P\|I_AKAqIb0O4#\ cm=sl\)WWYwEg10DDv%k/\"c\{LrS\)oVd§4\>8bs:\;9u\$\ \*W_SGk3CXe7hZMm\$nXyhAuHDi-q+ug5+%ioou.\,IhC\]-_O§V\]\^\,2q:VBVyTTD6\'aNw9:oan\(s2SzV
You can use the escaped string directly in your shell, without additional quotes, like this:
python my_script.py 2m+\{N7HiwH3\[\>\!\"4y\?t9\*y#\;/\$Ar3wF9+k\$\[3hK/WA=aMzF°L0PaZTM\]t\*P\|I_AKAqIb0O4#\ cm=sl\)WWYwEg10DDv%k/\"c\{LrS\)oVd§4\>8bs:\;9u\$\ \*W_SGk3CXe7hZMm\$nXyhAuHDi-q+ug5+%ioou.\,IhC\]-_O§V\]\^\,2q:VBVyTTD6\'aNw9:oan\(s2SzV
In order to make life easier, shells like bash do a little bit of extra work to help users pass the correct arguments to the programs they instruct it to execute. This extra work usually results in predictable argument arrays getting passed to programs.
Oftentimes, though, this extra help results in unexpected arguments getting passed to programs, and sometimes in the execution of undesired additional commands. In this case it simply caused bash to emit an error.
To turn off this extra work, bash allows users to indicate where arguments should begin and end by surrounding them with quotation marks. Bash supports both single quotes (') and double quotes (") to delimit arguments. As a last resort, if a string may contain both single and double quotes (or double quotes are required but aren't strict enough), bash lets you mark a special or whitespace character as part of the adjacent argument by preceding it with a backslash (\).
If this method of escaping arguments is too cumbersome, it may be worth simplifying your program's interface by having it consume this data from a file instead of a command-line argument. Another option is to create a small program that loads the arguments from a more controlled location (like a file) and directly execs the target program with the desired argument array.
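If you go the file route, a minimal sketch of what my_script.py could look like is shown below (assuming the file name is passed as the first argument):
import sys

# Read the raw string verbatim from a file, so the shell never parses it.
with open(sys.argv[1], encoding="utf-8") as f:
    something = f.read().rstrip("\n")

print(something)
It would then be invoked as python my_script.py file.txt.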

Script takes only first part of double quotes

Yesterday I asked a similar question about escaping double quotes in env variables, but it didn't solve my problem (probably because I didn't explain it well enough), so I would like to be more specific.
I'm trying to run a script (which I know is written in Perl), although I have to use it as a black box because of a permissions issue (so I don't know how the script works). Let's call this script script_A.
I'm trying to run a basic command in the shell: script_A -arg "date time".
If I run it from the command line, it works fine, but if I try to use it from a bash or Perl script (for example using the system operator), it will take only the first part of the string which was sent as an argument. In other words, it will fail with the following error: '"date' is not valid.
Example to specify a little bit more:
If I run from the command line (works fine):
> script_A -arg "date time"
If I run from (for example) a Perl script (fails):
my $args = $ENV{SOME_ENV}; # Assume that SOME_ENV has '-arg "date time"'
my $cmd = "script_A $args";
system($cmd);
I think that the problem comes from the environment variable, but I can't simply use single quotes when defining the env variable. For example, I can't use the following method:
setenv SOME_ENV '-arg "date time"'
because it fails with the following error: '"date' is not valid.
Also, I tried to use the following method:
setenv SOME_ENV "-arg '"'date time'"'"
Although now the env variable will contain:
echo $SOME_ENV
> -arg 'date time' # should be -arg "date time"
Another note: using \" fails in the shell as well (I tried it).
Any suggestions on how to locate the reason for the error and how to solve it?
The $args, obtained from %ENV as you show, is a string.
The problem is in what happens to that string as it is manipulated before arguments are passed to the program, which needs to receive the strings -arg and date time.
If the program is executed in a way that bypasses the shell, as in your example, then the whole -arg "date time" is passed to it as its first argument. This is clearly wrong, as the program expects -arg and then another string for its value (date time).
If the program were executed via the shell, which happens when there are shell metacharacters in the command line (not in your example), then the shell would break the string into words, except for the quoted part; this is how it works from the command line. That can be enforced with
system('/bin/tcsh', '-c', $cmd);
This is the most straightforward fix, but I can't honestly recommend involving the shell just for argument parsing. You are then also in the game of layered quoting and escaping, which can get rather involved and tricky. For one, if things aren't quite right, the shell may end up breaking the command into the words -arg, "date, time".
The way you set the environment variable works:
> setenv SOME_ENV '-arg "date time"'
> perl -wE'say $ENV{SOME_ENV}' #--> -arg "date time" (so it works)
which I believe has always worked this way in [t]csh.
Then, in the Perl script: parse this string into the -arg and date time strings, and have the program executed in a way that bypasses the shell (if the command doesn't need the shell):
my @args = $ENV{SOME_ENV} =~ /(\S+)\s+"([^"]+)"/;  #"
my @cmd = ('script_A', @args);
system(@cmd) == 0 or die "Error with system(@cmd): $?";
This assumes that SOME_ENV's first word is always the option's name (-arg) and that all the rest is always the option's value, under quotes. The regex extracts the first word, as consecutive non-space characters, and, after spaces, everything inside the quotes.† These are the program's arguments.
In the system LIST form, the program that is the first element of the list is executed without using a shell, and the remaining elements are passed to it as arguments. Please see system for more on this, and also for the basics of how to investigate failure by looking at the $? variable.
It is in principle advisable to run external commands without the shell. However, if your command does need the shell, then make sure that the string is escaped just right so that the quotes are preserved.
Note that there are modules that make it easier to use external commands. A few, from simple to complex: IPC::System::Simple, Capture::Tiny, IPC::Run3, and IPC::Run.
I must note that that's an awkward environment variable; is there any way to organize things otherwise?
† To make this work for non-quoted arguments as well (-arg date), make the quote optional:
my @args = $ENV{SOME_ENV} =~ /(\S+)\s+"?([^"]+)/;
where I've now left out the (unnecessary) closing quote for simplicity.
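For comparison with the Python questions above, the same pattern (split the env string into words once, then bypass the shell) might look like the following sketch, using shlex.split as the parser; SOME_ENV and script_A are the names from the question:
import os
import shlex
import subprocess

# Hypothetically, SOME_ENV holds: -arg "date time"
args = shlex.split(os.environ.get("SOME_ENV", ""))  # ['-arg', 'date time']

# Passing a list keeps the shell out of it, so "date time" stays one argument.
subprocess.run(["script_A", *args], check=True)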

Calling a shell command from Applescript with quotes

This seems like it should be simple, but I'm pulling out my remaining hair trying to get it to work. In a shell script I want to run some Applescript code that defines a string, then pass that string (containing a single quote) to a shell command that calls PHP's addslashes function, to return a string with that single quote escaped properly.
Here's the code I have so far - it's returning a syntax error.
STRING=$(osascript -- - <<'EOF'
set s to "It's me"
return "['test'=>'" & (do shell script "php -r 'echo addslashes(\"" & s & "\");") & "']"
EOF)
echo -e $STRING
It's supposed to return this:
['test'=>'It\'s me']
First, when asking a question like this, please include what's happening, not just what you're trying to do. When I try this, I get:
42:99: execution error: sh: -c: line 0: unexpected EOF while looking for matchin
sh: -c: line 1: syntax error: unexpected end of file (2)
(which is actually two error messages, with one partly overwriting the other). Is that what you're getting?
If it is, the problem is that the inner shell command you're creating has quoting issues. Take a look at the AppleScript snippet that tries to run a shell command:
do shell script "php -r 'echo addslashes(\"" & s & "\");"
Since s is set to It's me, this runs the shell command:
php -r 'echo addslashes("It's me");
This has the problem that the apostrophe in It's me acts as a close quote for the string that starts with 'echo .... After that, the double quote in me"); is seen as opening a new quoted string, which never gets closed before the end of the "file", causing the unexpected-EOF problem.
The underlying problem is that you're trying to pass a string from AppleScript to the shell to PHP... but each of those has its own rules for parsing strings (with different ideas about how quoting and escaping work). Worse, it looks like you're doing this so you can get an escaped string (following which set of escaping rules?) to pass to something else... This way lies madness.
I'm not sure what the real goal is here, but there has to be a better way; something that doesn't involve a game of telephone with players that all speak different languages. If not, you're pretty much doomed.
BTW, there are a few other dubious shell-scripting practices in the script:
Don't use all-caps variable names in shell scripts. There are a bunch of all-caps variables that have special meanings, and if you accidentally use one of those for something else, weird results can happen.
Put double quotes around all variable references in scripts, to avoid them getting split into multiple "words" and/or expanded as shell wildcards. For example, if the variable string were set to "['test'=>'It\'s-me']" and you happened to have files named "t" and "m" in the current directory, echo -e $string would print "m t", because those are the files that match the [] pattern.
Don't use echo with options and/or to print strings that might contain escapes, since different versions treat these things differently. Some versions, for example, will print the "-e" as part of the output string. Use printf instead: the first argument to printf is a format string that tells it how to format all of the rest of the arguments. To emulate echo -e "$string" in a more reliable form, use printf '%b\n' "$string".
To complement Gordon Davisson's helpful answer with a pragmatic solution:
Shell strings cannot contain \0 (NUL) characters, but the following sed command emulates all other escaping that PHP's (oddly named) addslashes function performs (\-escaping instances of ', ", and \):
string=$(osascript <<'EOF'
set s to "It's me\\you and we got 3\" of rain."
return "['test'=>'" & (do shell script "sed 's/[\"\\\\'\\'']/\\\\&/g' <<<" & quoted form of s) & "']"
EOF
)
printf '%s\n' "$string"
yields
['test'=>'It\'s me\\you and we got 3\" of rain.']
Note the use of quoted form of, which is crucial for passing a string from AppleScript to a do shell script shell command with proper quoting.
Also note how the closing here-doc delimiter, EOF, is on its own line to ensure that it is properly recognized. In Bash 3.2.57, as used on macOS 10.12 (also when called as /bin/sh, which is what do shell script does), this isn't strictly necessary, but Bash 4.x would rightfully complain about EOF) with: warning: here-document at line <n> delimited by end-of-file (wanted 'EOF')
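If you only need the addslashes behavior itself, here's a minimal Python sketch of the same escaping rule (ignoring NUL bytes, as discussed above):
import re

def addslashes(s):
    # Backslash-escape every single quote, double quote, and backslash.
    return re.sub(r"""(['"\\])""", r"\\\1", s)

print(addslashes('It\'s me\\you and we got 3" of rain.'))
# It\'s me\\you and we got 3\" of rain.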

Bash - Special characters on command line [duplicate]

I'm looking for a way (other than ".", '.', \.) to use bash (or any other Linux shell) while preventing it from parsing parts of the command line. The problem seems to be unsolvable:
How to interpret special characters in command line argument in C?
In theory, a simple switch would suffice (e.g. -x ..., telling the shell that the
string ... should not be interpreted), but it apparently doesn't exist. I wonder whether there is a workaround, hack, or idea for solving this problem. The original problem is a script|alias for a program taking YouTube URLs (which may contain special characters (&, etc.)) as arguments. This problem is even more difficult: expanding "$1" while preventing the shell from interpreting the expanded string -- essentially, expanding "$1" without interpreting its result.
Use a here-document:
myprogramm <<'EOF'
https://www.youtube.com/watch?v=oT3mCybbhf0
EOF
If you wrap the starting EOF in single quotes, bash won't interpret any special chars in the here-doc.
Short answer: you can't do it, because the shell parses the command line (and interprets things like "&") before it even gets to the point of deciding your script/alias/whatever is what will be run, let alone the point where your script has any control at all. By the time your script has any influence in the process, it's far too late.
Within a script, though, it's easy to avoid most problems: wrap all variable references in double-quotes. For example, rather than curl -o $outputfile $url you should use curl -o "$outputfile" "$url". This will prevent the shell from applying any parsing to the contents of the variable(s) before they're passed to the command (/other script/whatever).
But when you run the script, you'll always have to quote or escape anything passed on the command line.
Your spec still isn't very clear. As far as I can tell, the problem is that you want to completely reinvent how the shell handles arguments. So… you'll have to write your own shell. The basics aren't even that difficult. Here's pseudo-code:
while true:
    print prompt
    read input
    command = (first input)
    args = (argparse (rest input))
    child_pid = fork()
    if child_pid == 0: // We are inside child process
        exec(command, args) // See variety of `exec` family functions in posix
    else: // We are inside parent process and child_pid is actual child pid
        wait(child_pid) // See variety of `wait` family functions in posix
Your question basically boils down to how that "argparse" function is implemented. If it's just an identity function, then you get no expansion at all. Is that what you want?
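For the curious, here's a toy version of that loop in Python, where the "argparse" step is shlex.split: word splitting and quote removal, but no expansion of &, $, *, and so on:
import shlex
import subprocess

# A toy "shell": read a line, split it into words, run the program.
# No globbing, no variable expansion, no redirection -- arguments are
# passed through exactly as typed (modulo quote removal).
while True:
    try:
        line = input("toysh> ")
    except EOFError:
        break
    words = shlex.split(line)  # the "argparse" step
    if not words:
        continue
    try:
        subprocess.run(words)  # fork + exec + wait, via the standard library
    except FileNotFoundError:
        print(f"{words[0]}: command not found")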
