Script takes only the first part of a double-quoted argument - shell

Yesterday I asked a similar question about escaping double quotes in environment variables, but it didn't solve my problem (probably because I didn't explain it well enough), so I would like to be more specific.
I'm trying to run a script (which I know is written in Perl), but I have to treat it as a black box because of a permissions issue (so I don't know how the script works). Let's call this script script_A.
I'm trying to run a basic command in the shell: script_A -arg "date time".
If I run it from the command line, it works fine, but if I try to use it from a bash or Perl script (for example using the system function), it takes only the first part of the string that was sent as an argument. In other words, it fails with the following error: '"date' is not valid.
A more specific example:
If I run from the command line (works fine):
> script_A -arg "date time"
If I run from (for example) a Perl script (fails):
my $args = $ENV{SOME_ENV}; # Assume that SOME_ENV has '-arg "date time"'
my $cmd = "script_A $args";
system($cmd");
I think that the problem comes from the environment variable, but I can't use single quotes while defining the env variable. For example, I can't use the following method:
setenv SOME_ENV '-arg "date time"'
because it fails with the following error: '"date' is not valid.
Also, I tried to use the following method:
setenv SOME_ENV "-arg '"'date time'"'"
Although now the env variable will contain:
echo $SOME_ENV
> -arg 'date time' # should be -arg "date time"
Another note: using \" fails in the shell as well (I tried it).
Any suggestions on how to locate the reason for the error and how to solve it?

The $args, obtained from %ENV as you show, is a string.
The problem is in what happens to that string as it is manipulated before arguments are passed to the program, which needs to receive the strings -arg and date time.
If the program is executed in a way that bypasses the shell, as in your example, then the whole -arg "date time" is passed to it as its first argument. This is clearly wrong, as the program expects -arg and then another string for its value (date time).
If the program were executed via the shell, which is what happens when there are shell metacharacters in the command line (not the case in your example), then the shell would break the string into words, except for the quoted part; this is how it works from the command line. That can be enforced with
system('/bin/tcsh', '-c', $cmd);
This is the most straightforward fix, but I can't honestly recommend involving the shell just for argument parsing. Also, you are then in the game of layered quoting and escaping, which can get rather involved and tricky. For one, if things aren't right, the shell may end up breaking the command into the words -arg, "date, time".
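To see that failure mode concretely, here is a small demonstration (POSIX shell syntax, not the code from the question) of how plain word splitting treats embedded quotes as ordinary data:
args='-arg "date time"'
printf '[%s]\n' $args    # unquoted expansion: split on whitespace, quotes stay literal
# [-arg]
# ["date]
# [time"]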
How you set the environment variable works:
> setenv SOME_ENV '-arg "date time"'
> perl -wE'say $ENV{SOME_ENV}' #--> -arg "date time" (so it works)
which I believe has always worked this way in [t]csh.
Then, in the Perl script, parse this string into the -arg and date time strings, and have the program executed in a way that bypasses the shell (if the shell isn't needed by the command):
my @args = $ENV{SOME_ENV} =~ /(\S+)\s+"([^"]+)"/; #"
my @cmd = ('script_A', @args);
system(@cmd) == 0 or die "Error with system(@cmd): $?";
This assumes that SOME_ENV's first word is always the option's name (-arg) and that all the rest is always the option's value, under quotes. The regex extracts the first word, as consecutive non-space characters, and after spaces everything in quotes.† These are the program's arguments.
In the LIST form of system, the program that is the first element of the list is executed without using a shell, and the remaining elements are passed to it as arguments. Please see system for more on this, and also for the basics of how to investigate failure by looking into the $? variable.
It is in principle advisable to run external commands without the shell. However, if your command needs the shell, then make sure that the string is escaped just right to preserve quotes.
Note that there are modules that make it easier to use external commands. A few, from simple to complex: IPC::System::Simple, Capture::Tiny, IPC::Run3, and IPC::Run.
I must note that that's an awkward environment variable; any way to organize things otherwise?
† To make this work for non-quoted arguments as well (-arg date), make the quote optional:
my @args = $ENV{SOME_ENV} =~ /(\S+)\s+"?([^"]+)/;
where I now left out the closing (unnecessary) quote for simplicity

Bash command works when I run it myself but fails in the script

My company has a tool that dynamically generates commands to run based on an input json. It works very well when all arguments to the compiled command are single words, but it fails when we attempt multi-word args. Here is a minimal example of how it fails.
# Print and execute the command.
print_and_run() {
    local command=("$@")
    if [[ ${command[0]} == "time" ]]; then
        echo "Your command: time ${command[@]:1}"
        time ${command[@]:1}
    fi
}
# How print_and_run is called in the script
print_and_run time docker run our-container:latest $generated_flags
# Output
Your command: time docker run our-container:latest subcommand --arg1=val1 --arg2="val2 val3"
Usage: our-program [OPTIONS] COMMAND1 [ARGS]... [COMMAND2 [ARGS]...]...
Try 'our-program --help' for help.
Error: No such command 'val3"'.
But if I copy the printed command and run it myself it works fine (I've omitted docker flags). Shelling into the container and running the program directly with these arguments works as well, so the parsing logic there is solid (it's a Python program that uses click to parse the args).
Now, I have a working solution that uses eval, but my entire team jumped down my throat at that suggestion. I've also proposed a solution using delineating characters for multi-word arguments, but that was shot down as well.
No other solutions proposed by other engineers have worked either. So can I ask someone to perhaps explain why val3 is being treated as a separate command, or to help me find a solution to get bash to properly evaluate the dynamically determined command without using eval?
Your command after expanding $generated_flags is:
print_and_run time docker run our-container:latest subcommand --arg1=val1 --arg2="val2 val3"
Your specific problem is that in --arg2="val2 val3" the quotes are literal, not syntactic, because quotes are processed before variables are expanded. This means --arg2="val2 and val3" are split into two separate arguments. Then, I assume, docker is trying to interpret val3" as some kind of docker command, because it's not part of any argument, and it's throwing an error because it doesn't know what that means.
Normally you'd fix this via an array to properly maintain the string boundary.
generated_flags=( "subcommand" "--arg1=val1" "--arg2=val2 val3" )
print_and_run time docker run our-container:latest "${generated_flags[@]}"
This will maintain --arg2=val2 val3 as a single argument as it gets passed into print_and_run, then you just have to expand your command array correctly inside the function (make sure to quote the expansion).
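For illustration, a sketch of the function with the expansions quoted (the [*] form in the echo is only for display; the quoted [@] form is what preserves the argument boundaries):
print_and_run() {
    local command=("$@")
    if [[ ${command[0]} == "time" ]]; then
        echo "Your command: time ${command[*]:1}"   # joined with spaces, display only
        time "${command[@]:1}"                      # quoted [@]: one word per element
    fi
}
print_and_run time docker run our-container:latest "${generated_flags[@]}"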
The question is:
why val3 is being treated as a separate command
Unquoted variable expansion undergoes word splitting and filename expansion. Word splitting splits the result of the variable expansion on spaces, tabs, and newlines, into separate "words".
a="something else"
$a # results in two "words"; 'something' and 'else'
It is irrelevant what you put inside the variable value or how many quotes or escape sequences you put inside. Every run of consecutive spaces splits it into words. Quotes " ' and escapes \ are parsed when they are part of the input line, not when they are part of the result of an unquoted expansion.
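A quick demonstration of both points:
a='say "hello world"'
printf '[%s]\n' $a      # unquoted: the embedded quotes do not group anything
# [say]
# ["hello]
# [world"]
printf '[%s]\n' "$a"    # quoted expansion: one word, quotes kept as data
# [say "hello world"]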
help me find a solution to
Write a parser that will actually parse the command and split it according to the rules that you want to use, and then execute the command split into separate words. For example, a very crude parser of this kind is included in xargs:
$ echo " 'quotes quotes' not quotes" | xargs printf "'%s'\n"
'quotes quotes'
'not'
'quotes'
For example, Python has shlex.split, which you can just use, and at the same time you introduce Python, which is waaaaay easier to manage than badly written Bash scripts.
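If the rest of the pipeline has to stay in bash, one way to borrow shlex.split is to call out to Python (a sketch; assumes python3 is on PATH, and the newline-separated hand-off breaks if an argument itself contains a newline):
cmd='docker run our-container:latest subcommand --arg2="val2 val3"'
readarray -t words < <(python3 -c 'import shlex, sys; print("\n".join(shlex.split(sys.argv[1])))' "$cmd")
printf '[%s]\n' "${words[@]}"   # each element is one properly parsed word
# [docker]
# [run]
# [our-container:latest]
# [subcommand]
# [--arg2=val2 val3]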
tool that dynamically generates commands to run based on an input json
Overall, the proper way forward would be to upgrade the tool to generate a JSON array that represents the words of the command to be executed. Then you can just execute that array of words, which is, again, trivial to do properly in Python with json and subprocess.run, but will require some gymnastics with jq, read, and Bash arrays in shell.
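A sketch of those gymnastics, assuming a hypothetical cmd.json holding something like ["docker", "run", "our-container:latest", "--arg2=val2 val3"] (same caveat about embedded newlines as above):
readarray -t cmd < <(jq -r '.[]' cmd.json)   # one array element per JSON string
"${cmd[@]}"                                  # execute; each word stays intact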
Check your scripts with shellcheck.

Is Python3 shlex.quote() safe?

I execute some code in shell using
subprocess.Popen('echo ' + user_string + ' | pipe to some string manipulation tools',
                 shell=True)
where user_string is from an untrusted source.
Is it safe enough to use shlex.quote() for escaping the input?
I'm necromancing this because it's the top Google hit for "is shlex.quote() safe", and while the accepted answer seems correct, there are a lot of pitfalls to point out.
shlex.quote() escapes the shell's parsing, but it does not escape the argument parser of the command you're calling, and some additional tool-specific escaping needs to be done manually, especially if your string starts with a dash (-).
Most (but not all) tools accept -- as an argument, and anything afterward is interpreted verbatim. You can prepend "-- " if the string starts with "-". Example: rm -- --help removes the file called --help.
When dealing with file names, you can prepend "./" if the string starts with "-": rm ./--help.
In the case of your example with echo, neither escape is sufficient: when attempting to echo the string -e, echo -- -e gives the wrong result; you'll need something like echo -e '\x2de'. This demonstrates that there's no universal bulletproof way to escape program arguments.
The safest route is to bypass the shell by avoiding shell=True or os.system() if the string involves any user-supplied data.
In your case, set stdin=subprocess.PIPE and pass the user_string as an argument to communicate(). Then you can even leave your original invocation as-is with shell=True:
subprocess.Popen(
    'pipe to some string manipulation tools',
    shell=True,
    stdin=subprocess.PIPE
).communicate(user_input_string.encode())
According to the official Python documentation for shlex.quote, the answer is yes.
Of course that also depends on what you mean by "safe enough". Under the assumption that you mean "Will using shlex.quote on user_string guard me against a typical scenario of malicious shell code passed as string input to my script?" the answer is yes.
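To see what shlex.quote actually produces, a small sketch run from the shell (the malicious string is hypothetical):
user_string='$(rm -rf ~)'
quoted=$(python3 -c 'import shlex, sys; print(shlex.quote(sys.argv[1]))' "$user_string")
echo "$quoted"   # prints '$(rm -rf ~)' -- single-quoted, so the shell treats it as data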

Bash - Special characters on command line [duplicate]

I'm looking for a way (other than ".", '.', \.) to use bash (or any other Linux shell) while preventing it from parsing parts of the command line. The problem seems to be unsolvable:
How to interpret special characters in command line argument in C?
In theory, a simple switch would suffice (e.g. -x ... telling that the string ... won't be interpreted), but it apparently doesn't exist. I wonder whether there is a workaround, hack, or idea for solving this problem. The original problem is a script|alias for a program taking youtube URLs (which may contain special characters (&, etc.)) as arguments. This problem is even more difficult: expanding "$1" while preventing the shell from interpreting the expanded string -- essentially, expanding "$1" without interpreting its result.
Use a here-document:
myprogramm <<'EOF'
https://www.youtube.com/watch?v=oT3mCybbhf0
EOF
If you wrap the starting EOF in single quotes, bash won't interpret any special chars in the here-doc.
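For contrast, with the delimiter left unquoted the shell still processes $, `, and \ inside the body (a sketch; $id is a hypothetical variable):
myprogramm <<EOF
https://www.youtube.com/watch?v=$id
EOF
# here $id is expanded by the shell before myprogramm ever sees the line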
Short answer: you can't do it, because the shell parses the command line (and interprets things like "&") before it even gets to the point of deciding your script/alias/whatever is what will be run, let alone the point where your script has any control at all. By the time your script has any influence in the process, it's far too late.
Within a script, though, it's easy to avoid most problems: wrap all variable references in double-quotes. For example, rather than curl -o $outputfile $url you should use curl -o "$outputfile" "$url". This will prevent the shell from applying any parsing to the contents of the variable(s) before they're passed to the command (/other script/whatever).
But when you run the script, you'll always have to quote or escape anything passed on the command line.
Your spec still isn't very clear. As far as I know the problem is you want to completely reinvent how the shell handles arguments. So… you'll have to write your own shell. The basics aren't even that difficult. Here's pseudo-code:
while true:
    print prompt
    read input
    command = (first input)
    args = (argparse (rest input))
    child_pid = fork()
    if child_pid == 0: // We are inside child process
        exec(command, args) // See variety of `exec` family functions in posix
    else: // We are inside parent process and child_pid is actual child pid
        wait(child_pid) // See variety of `wait` family functions in posix
Your question basically boils down to how that "argparse" function is implemented. If it's just an identity function, then you get no expansion at all. Is that what you want?
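As a toy illustration of that skeleton, bash itself can play the parent process; here "argparse" is nothing but whitespace splitting, with globbing switched off (a sketch, no quote handling at all):
while IFS= read -rp '> ' input; do
    set -f                 # no filename expansion on the split words
    set -- $input          # "argparse" = plain whitespace splitting, nothing else
    [ $# -gt 0 ] && "$@"   # fork/exec/wait are handled by the shell
done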

Command substitution in shell script without globbing

Consider this little shell script.
# Save the first command line argument
cmd="$1"
# Execute the command specified in the first command line argument
out=$($cmd)
# Do something with the output of the specified command
# Here we do a silly thing, like make the output all uppercase
echo "$out" | tr -s "a-z" "A-Z"
The script executes the command specified as the first argument, transforms the output obtained from that command and prints it to standard output. This script may be executed in this manner.
sh foo.sh "echo select * from table"
This does not do what I want. It may print something like the following,
$ sh foo.sh "echo select * from table"
SELECT FILEA FILEB FILEC FROM TABLE
if fileA, fileB and fileC are present in the current directory.
From a user perspective, this command is reasonable. The user has quoted the * in the command line argument, so the user doesn't expect the * to be globbed. But my script astonishes the user by using this argument in a command substitution which causes globbing of * as seen in the above output.
I want the output to be the following instead.
SELECT * FROM TABLE
The entire text in cmd actually comes from command line arguments to the script so I would like to preserve any * symbol present in the argument without globbing them.
I am looking for a solution that works for any POSIX shell.
One solution I have come up with is to disable globbing with set -o noglob just before the command substitution. Here is the complete code.
# Save the first command line argument
cmd="$1"
# Execute the command specified in the first command line argument
set -o noglob
out=$($cmd)
# Do something with the output of the specified command
# Here we do a silly thing, like make the output all uppercase
echo "$out" | tr -s "a-z" "A-Z"
This does what I expect.
$ sh foo.sh "echo select * from table"
SELECT * FROM TABLE
Apart from this, is there any other concept or trick (such as a quoting mechanism) I need to be aware of to disable globbing only within a command substitution, without having to use set -o noglob?
I am not against set -o noglob. I just want to know if there is another way. You know, globbing can be disabled for normal command line arguments just by quoting them, so I was wondering if there is anything similar for command substitution.
If I understand correctly, you want the user to provide a shell command as a command-line argument, which will be executed by the script, and is expected to produce an SQL string, which will be processed (upper-cased) and echoed to stdout.
The first thing to say is that there is no point in having the user provide a shell command that the script just blindly executes. If the script applied some kind of modification/preprocessing of the command before it executed it then perhaps it could make sense, but if not, then the user might as well execute the command himself and pass the output to the script as a command-line argument, or via stdin.
But that being said, if you really want to do it this way, then there are two things that need to be said. Firstly, this is the proper form to use:
out=$(eval "$cmd");
A fairly advanced understanding of the shell grammar and expansion rules would be required to fully understand the rationale for using the above syntax, but basically executing $cmd and executing eval "$cmd" have subtle differences that render the $cmd form inappropriate for executing a given shell command string.
Just to give some detail that will hopefully clarify the above point, there are seven kinds of expansion that are performed by the shell in the following order when processing input: (1) brace expansion, (2) tilde expansion, (3) parameter and variable expansion, (4) arithmetic expansion, (5) command substitution, (6) word splitting, and (7) pathname expansion. Notice that variable expansion happens somewhat in the middle of that sequence, and thus the variable-expanded shell command (which was provided by the user) will not receive the benefit of the prior expansion types. Other issues are that leading variable assignments, pipelines, and command list tokens will not be executed correctly under the $cmd form, because they are parsed and processed prior to variable expansion (actually prior to all expansions) as well.
By running the command through eval, properly double-quoted, you ensure that the full shell parsing/processing/execution algorithm will be applied to the shell command string that was given by the user of your script.
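A tiny demonstration of the difference between the two forms:
cmd='x=5; echo "$x"'
$cmd          # fails: the literal word 'x=5;' is taken as the command name
eval "$cmd"   # prints 5: assignment, semicolon, and quoting are all honored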
The second thing to say is this: If you try the above proper form in your script, you will find that it has not solved your problem. You will still get SELECT FILEA FILEB FILEC FROM TABLE as output.
The reason is this: Since you've decided you want to accept an arbitrary shell command from the user of your script, it is now the user's responsibility to properly quote all metacharacters that may be embedded in that piece of code. It does not make sense for you to accept a shell command as a command-line argument, but somehow change the processing rules for shell commands so that certain metacharacters will no longer be metacharacters when the given shell command is executed. Actually, you could do something like that, perhaps using set -o noglob as you discovered, but then that must become a contract between the script and the user of the script; the user must be made aware of exactly what the precise processing rules will be when the command is executed so that he can properly use the script.
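For what it's worth, if you do go the noglob route, note that a command substitution runs in a subshell, so the setting can be scoped to it without affecting the rest of the script (a sketch; set -f is the POSIX spelling of set -o noglob):
out=$(set -f; $cmd)   # noglob dies with the subshell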
Under this design, the user could call the script as follows (notice the extra layer of quoting for the shell command string evaluation; could alternatively backslash-escape just the asterisk):
$ sh foo.sh "echo 'select * from table'";
I'd like to return to my earlier comment about the overall design; it doesn't really make sense to do it this way. It makes more sense to take the text-to-process itself, not a shell command that is expected to produce the text-to-process.
Here is how that could be done:
## take the text-to-process via a command-line argument
sql="$1";
## process and echo it
echo "$sql"| tr a-z A-Z;
(I also removed the -s option of tr, which really doesn't make sense here.)
Notice that the script is simpler now, and usage is also simpler:
$ sh foo.sh 'select * from table';

Bash - passing input without the shell interpreting parameter expansion chars

So I have a script where I type script.sh followed by input for a set of if-else statements, like this:
script.sh fnSw38h$?2
The output echoes out the input in the end.
But I noticed that $? is interpreted as 0/1 so the output would echo:
fnSw38h12
How can I stop the shell from expanding the characters and take them at face value?
I looked at something like opt noglob or something similar, but they didn't work.
When I put it like this:
script.sh 'fnSw38h$?2'
it works. But how do I capture that within single quotes ('') when I can't state variables inside it like Var='$1'?
Please help!
How to pass a password to a script
I gather from the comments that the true purpose of this script is to validate a password. If this is an important or sensitive application, you really should be using professional security tools. If this application is not sensitive or this is just a learning exercise, then read on for a first introduction to the issues.
First, do not do this:
script.sh fnSw38h$?2
This password will appear in ps and be visible to any user on the system in plain text.
Instead, have the user type the password as input to the script, such as:
#!/bin/sh
IFS= read -r var
Here, read will gather input from the keyboard free from shell interference and it will not appear in ps output.
var will have the password for you to verify but you really shouldn't have plain text passwords saved anywhere for you to verify against. It is much better to put the password through a one-way hash and then compare the hash with something that you have saved in a file. For example:
var=$(head -n1 | md5sum)
Here, head will read one line (the password) and pass it to md5sum which will convert it to a hash. This hash can be compared with the known correct hash for this user's password. The text returned by head will be exactly what the user typed, unmangled by the shell.
Actually, for a known hash algorithm, it is possible to make a reverse look-up table for common passwords. So, the solution is to create a variable, called salt, that has some user-dependent information:
var=$( { head -n1; echo "$salt"; } | md5sum)
The salt does not have to be kept secret. It is just there to make look-up tables more difficult to compute.
The md5sum algorithm, however, has been found to have some weaknesses. So, it should be replaced with more recent hash algorithms. As I write, that would probably be a sha-2 variant.
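For example, the md5sum in the snippet above could be swapped for sha-256 (a sketch; sha256sum ships with GNU coreutils):
var=$( { head -n1; echo "$salt"; } | sha256sum )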
Again, if this is a sensitive application, do not use home-made tools.
Answer to original question
how do I capture that within single quotes ('') when I can't state variables inside it like Var='$1'
The answer is that you don't need to. Consider, for example, this script:
#!/bin/sh
var=$1
echo $var
First, note that $$ and $? are both shell variables:
$ echo $$ $?
28712 0
Now, let's try our script:
$ bash ./script.sh '$$ $?'
$$ $?
These variables were not expanded because (1) when they appeared on the command line, they were in single-quotes, and (2) in the script, they were assigned to variables and bash does not expand variables recursively. In other words, on the line echo $var, bash will expand $var to get $$ $? but there it stops. It does not expand what was in var.
You can escape any dollar signs in a double-quoted string that are not meant to introduce a parameter expansion.
var=foo
# Pass the literal string fnSw38h$?2foo to script.sh
script.sh "fnSw38h\$?2$var"
You cannot do what you are trying to do. What is entered on the command line (such as the arguments to your script) must be in shell syntax, and will be interpreted by the shell (according to the shell's rules) before being handed to your script.
When someone runs the command script.sh fnSw38h$?2, the shell parses the argument as the text "fnSw38h", followed by $? which means "substitute the exit status of the last command here", followed by "2". So the shell does as it's been told, it substitutes the exit status of the last command, then hands the result of that to your script.
Your script never receives "fnSw38h$?2", and cannot recover the argument in that form. It receives something like "fnSw38h02" or "fnSw38h12", because that's what the user asked the shell to pass it. That might not be what the user wanted to pass it, but as I said, the command must be in shell syntax, and in shell syntax an unescaped and unquoted $? means "substitute the last exit status here".
If the user wants to pass "$?" as part of the argument, they must escape or single-quote it on the command line. Period.
