I'm looking for a way (other than ".", '.', \.) to use bash (or any other linux shell) while preventing it from parsing parts of command line. The problem seems to be unsolvable
How to interpret special characters in command line argument in C?
In theory, a simple switch would suffice (e.g. -x ... telling that the
string ... won't be interpreted) but it apparently doesn't exist. I wonder whether there is a workaround, hack or idea for solving this problem. The original problem is a script|alias for a program taking youtube URLs (which may contain special characters (&, etc.)) as arguments. This problem is even more difficult: expanding "$1" while preventing shell from interpreting the expanded string -- essentially, expanding "$1" without interpreting its result
Use a here-document:
myprogramm <<'EOF'
https://www.youtube.com/watch?v=oT3mCybbhf0
EOF
If you wrap the starting EOF in single quotes, bash won't interpret any special chars in the here-doc.
Short answer: you can't do it, because the shell parses the command line (and interprets things like "&") before it even gets to the point of deciding your script/alias/whatever is what will be run, let alone the point where your script has any control at all. By the time your script has any influence in the process, it's far too late.
Within a script, though, it's easy to avoid most problems: wrap all variable references in double-quotes. For example, rather than curl -o $outputfile $url you should use curl -o "$outputfile" "$url". This will prevent the shell from applying any parsing to the contents of the variable(s) before they're passed to the command (/other script/whatever).
But when you run the script, you'll always have to quote or escape anything passed on the command line.
Your spec still isn't very clear. As far as I know the problem is you want to completely reinvent how the shell handles arguments. So… you'll have to write your own shell. The basics aren't even that difficult. Here's pseudo-code:
while true:
print prompt
read input
command = (first input)
args = (argparse (rest input))
child_pid = fork()
if child_pid == 0: // We are inside child process
exec(command, args) // See variety of `exec` family functions in posix
else: // We are inside parent process and child_pid is actual child pid
wait(child_pid) // See variety of `wait` family functions in posix
Your question basically boils down to how that "argparse" function is implemented. If it's just an identity function, then you get no expansion at all. Is that what you want?
Related
I am trying to give an argument to my python program through the terminal.
For this I am using the lines:
import sys
something = sys.argv[1]
I now try to put in a string like this through the bash terminal:
python my_script.py 2m+{N7HiwH3[>!"4y?t9*y#;/$Ar3wF9+k$[3hK/WA=aMzF°L0PaZTM]t*P|I_AKAqIb0O4# cm=sl)WWYwEg10DDv%k/"c{LrS)oVd§4>8bs:;9u$ *W_SGk3CXe7hZMm$nXyhAuHDi-q+ug5+%ioou.,IhC]-_O§V]^,2q:VBVyTTD6'aNw9:oan(s2SzV
This returns a bash error because some of the characters in the string are bash special characters.
How can I use the string exactly as it is?
You can put the raw string into a file, for example like this, with cat and a here document.
cat <<'EOF' > file.txt
2m+{N7HiwH3[>!"4y?t9*y#;/$Ar3wF9+k$[3hK/WA=aMzF°L0PaZTM]t*P|I_AKAqIb0O4# cm=sl)WWYwEg10DDv%k/"c{LrS)oVd§4>8bs:;9u$ *W_SGk3CXe7hZMm$nXyhAuHDi-q+ug5+%ioou.,IhC]-_O§V]^,2q:VBVyTTD6'aNw9:oan(s2SzV
EOF
and then run
python my_script.py "$(< file.txt)"
You can also use the text editor of your choice for the first step if you prefer that.
If this is a reoccurring task, which you have to perform from time to time, you can make your life easier with a little alias in your shell:
alias escape='read -r string ; printf "Copy this:\n%q\n" "${string}"'
It is using printf "%q" to escape your input string.
Run it like this:
escape
2m+{N7HiwH3[>!"4y?t9*y#;/$Ar3wF9+k$[3hK/WA=aMzF°L0PaZTM]t*P|I_AKAqIb0O4# cm=sl)WWYwEg10DDv%k/"c{LrS)oVd§4>8bs:;9u$ *W_SGk3CXe7hZMm$nXyhAuHDi-q+ug5+%ioou.,IhC]-_O§V]^,2q:VBVyTTD6'aNw9:oan(s2SzV
Copy this:
2m+\{N7HiwH3\[\>\!\"4y\?t9\*y#\;/\$Ar3wF9+k\$\[3hK/WA=aMzF°L0PaZTM\]t\*P\|I_AKAqIb0O4#\ cm=sl\)WWYwEg10DDv%k/\"c\{LrS\)oVd§4\>8bs:\;9u\$\ \*W_SGk3CXe7hZMm\$nXyhAuHDi-q+ug5+%ioou.\,IhC\]-_O§V\]\^\,2q:VBVyTTD6\'aNw9:oan\(s2SzV
You can use the escaped string directly in your shell, without additional quotes, like this:
python my_script.py 2m+\{N7HiwH3\[\>\!\"4y\?t9\*y#\;/\$Ar3wF9+k\$\[3hK/WA=aMzF°L0PaZTM\]t\*P\|I_AKAqIb0O4#\ cm=sl\)WWYwEg10DDv%k/\"c\{LrS\)oVd§4\>8bs:\;9u\$\ \*W_SGk3CXe7hZMm\$nXyhAuHDi-q+ug5+%ioou.\,IhC\]-_O§V\]\^\,2q:VBVyTTD6\'aNw9:oan\(s2SzV
In order to make life easier, shells like bash do a little bit of extra work to help users pass the correct arguments to the programs they instruct it to execute. This extra work usually results in predictable argument arrays getting passed to programs.
Oftentimes, though, this extra help results in unexpected arguments getting passed to programs; and sometimes results in the execution of undesired additional commands. In this case, though, it ended up causing Bash to emit an error.
In order to turn off this extra work, Bash allows users to indicate where arguments should begin and end by surrounding them by quotation marks. Bash supports both single quotes (') and double quotes (") to delimit arguments. As a last resort, if a string may contain single and double quotes (or double quotes are required but aren't aggressive enough), Bash allows you to indicate that a special- or whitespace-character should be part of the adjacent argument by preceding it with a backslash (\\).
If this method of escaping arguments is too cumbersome, it may be worth simplifying your program's interface by having it consume this data from a file instead of a command line argument. Another option is to create a program that loads the arguments from a more controlled location (like a file) and directly execs the target program with the desired argument array.
I am trying to study both bash and ruby and I am trying to pass a variable that contains a JSON object from ruby to be printed using a bash script.
I tried storing the JSON object in a variable then used that variable as an argument for my bash script but I am not getting the result that I wanted.
so in my ruby code I have this:
param = request.body
cmd = "ruby_to_bash #{param}"
exec cmd
and in my bash script, I am simply outputting the value of the argument:
echo $1
For example I have this JSON:
{"first":"one","second":"two","third":"three"}
my code only gives me this output:
{first:
I want to display the whole JSON object. How can I do this? Any kind of help will be much appreciated.
Both bash and ruby treat double quotes in a kinda special way: double-quoted strings are interpolated before passed to the receiver. I strongly encourage you to start with learning somewhat one in the first place instead of doing zero progress in both due to induced errors.
Answering the question stated, one must escape double quotes when passing back and forth since both ruby and bash treat them as content delimiters and discard when the content is already parsed in the resulting string.
This would work:
exec %|ruby_to_bash #{param.gsub('"', '\"')}|
I use %| and | as string delimiters to avoid clashes against inner quotes.
Sidenote: the above won’t work for the input containing spaces, but I purposely avoid to show how to deal with spaces since it leads to the dead end; once handled spaces with surrounding the interpolated param with single quotes, we are prone to be screwed up with inner single quotes in JSON object and so forth.
The code above is not someone should ever produce in production.
I think your best bet is, as usual in such cases, not to involve a shell at all.
When you say this:
cmd = "ruby_to_bash #{param}"
exec cmd
Your exec call is invoking a shell which parses cmd and then runs ruby_to_bash so you're doing this:
Build a command line.
Replace the current process with /bin/sh.
/bin/sh parses the command line that you should mashed together. This is where you need to worry about quoting and escaping to get past the shell.
/bin/sh exec's ruby_to_bash.
You could bypass the whole problem by using the multi-argument form of Kernel#exec:
exec(cmdname, arg1, ...)
command name and one or more arguments (no shell)
which doesn't bother with a shell at all. If you say:
exec 'ruby_to_bash', param
You won't involve a shell and the quotes or spacing in param won't matter to anyone other than ruby_to_bash.
Yesterday I asked a similar question about escaping double quotes in env variables, although It didn't solve my problem (Probably because I didn't explain good enough) so I would like to specify more.
I'm trying to run a script (Which I know is written in Perl), although I have to use it as a black box because of permissions issue (so I don't know how the script works). Lets call this script script_A.
I'm trying to run a basic command in Shell: script_A -arg "date time".
If I run from the command line, it's works fine, but If I try to use it from a bash script or perl scrip (for example using the system operator), it will take only the first part of the string which was sent as an argument. In other words, it will fail with the following error: '"date' is not valid..
Example to specify a little bit more:
If I run from the command line (works fine):
> script_A -arg "date time"
If I run from (for example) a Perl script (fails):
my $args = $ENV{SOME_ENV}; # Assume that SOME_ENV has '-arg "date time"'
my $cmd = "script_A $args";
system($cmd");
I think that the problem comes from the environment variable, but I can't use the one quote while defining the env variable. For example, I can't use the following method:
setenv SOME_ENV '-arg "date time"'
Because it fails with the following error: '"date' is not valid.".
Also, I tried to use the following method:
setenv SOME_ENV "-arg '"'date time'"'"
Although now the env variable will containe:
echo $SOME_ENV
> -arg 'date time' # should be -arg "date time"
Another note, using \" fails on Shell (tried it).
Any suggestions on how to locate the reason for the error and how to solve it?
The $args, obtained from %ENV as you show, is a string.
The problem is in what happens to that string as it is manipulated before arguments are passed to the program, which needs to receive strings -arg and date time
If the program is executed in a way that bypasses the shell, as your example is, then the whole -arg "date time" is passed to it as its first argument. This is clearly wrong as the program expects -arg and then another string for its value (date time)
If the program were executed via the shell, what happens when there are shell metacharacters in the command line (not in your example), then the shell would break the string into words, except for the quoted part; this is how it works from the command line. That can be enforced with
system('/bin/tcsh', '-c', $cmd);
This is the most straightforward fix but I can't honestly recommend to involve the shell just for arguments parsing. Also, you are then in the game of layered quoting and escaping, what can get rather involved and tricky. For one, if things aren't right the shell may end up breaking the command into words -arg, "date, time"
How you set the environment variable works
> setenv SOME_ENV '-arg "date time"'
> perl -wE'say $ENV{SOME_ENV}' #--> -arg "date time" (so it works)
what I believe has always worked this way in [t]csh.
Then, in a Perl script: parse this string into -arg and date time strings, and have the program is executed in a way that bypasses the shell (if shell isn't used by the command)
my #args = $ENV{SOME_ENV} =~ /(\S+)\s+"([^"]+)"/; #"
my #cmd = ('script_A', #args);
system(#cmd) == 0 or die "Error with system(#cmd): $?";
This assumes that SOME_ENV's first word is always the option's name (-arg) and that all the rest is always the option's value, under quotes. The regex extracts the first word, as consecutive non-space characters, and after spaces everything in quotes.† These are program's arguments.
In the system LIST form the program that is the first element of the list is executed without using a shell, and the remaining elements are passed to it as arguments. Please see system for more on this, and also for basics of how to investigate failure by looking into $? variable.
It is in principle advisable to run external commands without the shell. However, if your command needs the shell then make sure that the string is escaped just right to to preserve quotes.
Note that there are modules that make it easier to use external commands. A few, from simple to complex: IPC::System::Simple, Capture::Tiny, IPC::Run3, and IPC::Run.
I must note that that's an awkward environment variable; any way to ogranize things otherwise?
† To make this work for non-quoted arguments as well (-arg date) make the quote optional
my #args = $ENV{SOME_ENV} =~ /(\S+)\s+"?([^"]+)/;
where I now left out the closing (unnecessary) quote for simplicity
I execute some code in shell using
subprocess.Popen('echo '+user_string+' | pipe to some string manipulation tools',
shell=True)
where user_string is from an untrusted source.
Is it safe enough to use shlex.quote() for escaping the input?
I'm necromancing this because it's the top Google hit for "is shlex.quote() safe", and while the accepted answer seems correct, there's a lot of pitfalls to point out.
shlex.quote() escapes the shell's parsing, but it does not escape the argument parser of the command you're calling, and some additional tool-specific escaping needs to be done manually, especially if your string starts with a dash (-).
Most (but not all) tools accept -- as an argument, and anything afterward is interpreted as verbatim. You can prepend "-- " if the string starts with "-". Example: rm -- --help removes the file called --help.
When dealing with file names, you can prepend "./" if the string starts with "-": rm ./--help.
In the case of your example with echo, neither escape is sufficient: When attempting to echo the string -e, echo -- -e gives the wrong result, you'll need something like echo -e \x2de. This demonstrates that there's no universal bulletproof way to escape program arguments.
The safest route is to bypass the shell by avoiding shell=True or os.system() if the string involves any user-supplied data.
In your case, set stdin=subprocess.PIPE and pass the user_string as an argument to communicate(). Then you can even leave your original invocation as-is with shell=True:
subprocess.Popen(
'pipe to some string manipulation tools',
shell=True,
stdin=subprocess.PIPE
).communicate(user_input_string.encode())
According to the official pyton documentation for shlex.quote the answer is yes.
Of course that also depends on what you mean by "safe enough". Under the assumption that you mean "Will using shlex.quote on user_string guard me against a typical scenario of malicious shell code passed as string input to my script?" the answer is yes.
I have two bash scripts. The first listens to a pipe "myfifo" for input and executes the input as a command:
fifo_name="myfifo"
[ -p $fifo_name ] || mkfifo $fifo_name;
while true
do
if read line; then
$line
fi
done <"$fifo_name"
The second passes a command 'echo $SET_VAR' to the "myfifo" pipe:
command='echo $SET_VAR'
command_to_pass="echo $command"
$command_to_pass > myfifo
As you can see, I want to pass 'echo $SET_VAR' through the pipe. In the listener process, I've set a $SET_VAR environment variable. I expect the output of the command 'echo $SET_VAR' to be 'var_value,' which is the value of the environment variable SET_VAR.
Running the first (the listener) script in one bash process and then passing a command via the second in another process gives the following result:
$SET_VAR
I expected to "var_value" to be printed. Instead, the string literal $SET_VAR is printed. Why is this the case?
Before I get to the problem you're reporting, I have to point out that your loop won't work. The while true part (without a break somewhere in the loop) will run forever. It'll read the first line from the file, loop, try to read a second line (which fails), loop again, try to read a third line (also fails), loop again, try to read a fourth line, etc... You want the loop to exit as soon as the read command fails, so use this:
while read line
do
# something I'll get to
done <"$fifo_name"
The other problem you're having is that the shell expands variables (i.e. replaces $var with the value of the variable var) partway through the process of parsing a command line, and when it's done that it doesn't go back and re-do the earlier parsing steps. In particular, if the variable's value included something like $SET_VAR it doesn't go back and expand that, since it's just finished the bit where it expands variables. In fact, the only thing it does with the expanded value is split it into "words" (based on whitespace), and expand any filename wildcards it finds -- no variable expansions happen, no quote or escape interpretation, etc.
One possible solution is to tell the shell to run the parsing process twice, with the eval command:
while read line
do
eval "$line"
done <"$fifo_name"
(Note that I used double-quotes around "$line" -- this prevents the word splitting and wildcard expansion I mentioned from happening before eval goes through the normal parsing process. If you think of your original code half-parsing the command in $line, without double-quotes it gets one and a half-parsed, which is weird. Double-quotes suppress that half-parsing stage, so the contents of the variable get parsed exactly once.)
However, this solution comes with a big warning, because eval has a well-deserved reputation as a bug magnet. eval makes it easy to do complex things without quite understanding what's going on, which means you tend to get scripts that work great in testing, then fail incomprehensibly later. And in my experience, when eval looks like the best solution, it probably means you're trying to solve the wrong problem.
So, what're you actually trying to do? If you're just trying to execute the lines coming from the fifo as shell commands, then you can use bash "$fifo_name" to run them in a subshell, or source "$fifo_name" to run them in the current shell.
BTW, the script that feeds the fifo:
command='echo $SET_VAR'
command_to_pass="echo $command"
$command_to_pass > myfifo
Is also a disaster waiting to happen. Putting commands in variables doesn't work very well in the shell (I second chepner's recommendation of BashFAQ #50: I'm trying to put a command in a variable, but the complex cases always fail!), and putting a command to print another command in a variable is just begging for trouble.
bash, by it's nature, reads commands from stdin. You can simply run:
bash < myfifo