I'm in the unfortunate position to be forced to invoke a program via echo <input> | program.exe. Of course, I wondered how to escape <input> and found:
How does the Windows Command Interpreter (CMD.EXE) parse scripts?
Escape angle brackets in a Windows command prompt
In essence, it seems sufficient to escape all special chars with ^. Out of curiosity I still would like to know, why echo ingores double-quote escaping in the first place:
C:\>echo "foo"
"foo"
C:\>
Is there any normative reference?
Bonus question: How to echo the strings on and off with the echo command?
Edit: I just found this. I states that only |<> need to be escaped. However, expansion like %FOO% still work.
Special characters like ^, &, (, ), <, >, %, ! and " may cause problems with echo, also when trying to echo a string into a pipe; odd numbers of " are particularly difficult to handle.
Building escape sequences can be very complicated particularly with pipes, because such initiates new cmd instances for either side, so multi-escaping might become necessary.
The only reliable way to pipe the output of echo into a program is to use a variable holding the string to return and to apply delayed expansion, but within the left side of the pipe, like this:
cmd /V /C echo(^^!VARIABLE^^!| program.exe
Note the double-escaping of ! like ^^!, which makes this code even work when delayed expansion is also enabled in the parent cmd instance. There must not be a SPACE in front of the |, because this was echoed too otherwise. Note that echo terminates the output by a line-break.
Related
I want to run a command from a bash script which has single quotes and some other commands inside the single quotes and a variable.
e.g. repo forall -c '....$variable'
In this format, $ is escaped and the variable is not expanded.
I tried the following variations but they were rejected:
repo forall -c '...."$variable" '
repo forall -c " '....$variable' "
" repo forall -c '....$variable' "
repo forall -c "'" ....$variable "'"
If I substitute the value in place of the variable the command is executed just fine.
Please tell me where am I going wrong.
Inside single quotes everything is preserved literally, without exception.
That means you have to close the quotes, insert something, and then re-enter again.
'before'"$variable"'after'
'before'"'"'after'
'before'\''after'
Word concatenation is simply done by juxtaposition. As you can verify, each of the above lines is a single word to the shell. Quotes (single or double quotes, depending on the situation) don't isolate words. They are only used to disable interpretation of various special characters, like whitespace, $, ;... For a good tutorial on quoting see Mark Reed's answer. Also relevant: Which characters need to be escaped in bash?
Do not concatenate strings interpreted by a shell
You should absolutely avoid building shell commands by concatenating variables. This is a bad idea similar to concatenation of SQL fragments (SQL injection!).
Usually it is possible to have placeholders in the command, and to supply the command together with variables so that the callee can receive them from the invocation arguments list.
For example, the following is very unsafe. DON'T DO THIS
script="echo \"Argument 1 is: $myvar\""
/bin/sh -c "$script"
If the contents of $myvar is untrusted, here is an exploit:
myvar='foo"; echo "you were hacked'
Instead of the above invocation, use positional arguments. The following invocation is better -- it's not exploitable:
script='echo "arg 1 is: $1"'
/bin/sh -c "$script" -- "$myvar"
Note the use of single ticks in the assignment to script, which means that it's taken literally, without variable expansion or any other form of interpretation.
The repo command can't care what kind of quotes it gets. If you need parameter expansion, use double quotes. If that means you wind up having to backslash a lot of stuff, use single quotes for most of it, and then break out of them and go into doubles for the part where you need the expansion to happen.
repo forall -c 'literal stuff goes here; '"stuff with $parameters here"' more literal stuff'
Explanation follows, if you're interested.
When you run a command from the shell, what that command receives as arguments is an array of null-terminated strings. Those strings may contain absolutely any non-null character.
But when the shell is building that array of strings from a command line, it interprets some characters specially; this is designed to make commands easier (indeed, possible) to type. For instance, spaces normally indicate the boundary between strings in the array; for that reason, the individual arguments are sometimes called "words". But an argument may nonetheless have spaces in it; you just need some way to tell the shell that's what you want.
You can use a backslash in front of any character (including space, or another backslash) to tell the shell to treat that character literally. But while you can do something like this:
reply=\”That\'ll\ be\ \$4.96,\ please,\"\ said\ the\ cashier
...it can get tiresome. So the shell offers an alternative: quotation marks. These come in two main varieties.
Double-quotation marks are called "grouping quotes". They prevent wildcards and aliases from being expanded, but mostly they're for including spaces in a word. Other things like parameter and command expansion (the sorts of thing signaled by a $) still happen. And of course if you want a literal double-quote inside double-quotes, you have to backslash it:
reply="\"That'll be \$4.96, please,\" said the cashier"
Single-quotation marks are more draconian. Everything between them is taken completely literally, including backslashes. There is absolutely no way to get a literal single quote inside single quotes.
Fortunately, quotation marks in the shell are not word delimiters; by themselves, they don't terminate a word. You can go in and out of quotes, including between different types of quotes, within the same word to get the desired result:
reply='"That'\''ll be $4.96, please," said the cashier'
So that's easier - a lot fewer backslashes, although the close-single-quote, backslashed-literal-single-quote, open-single-quote sequence takes some getting used to.
Modern shells have added another quoting style not specified by the POSIX standard, in which the leading single quotation mark is prefixed with a dollar sign. Strings so quoted follow similar conventions to string literals in the ANSI standard version of the C programming language, and are therefore sometimes called "ANSI strings" and the $'...' pair "ANSI quotes". Within such strings, the above advice about backslashes being taken literally no longer applies. Instead, they become special again - not only can you include a literal single quotation mark or backslash by prepending a backslash to it, but the shell also expands the ANSI C character escapes (like \n for a newline, \t for tab, and \xHH for the character with hexadecimal code HH). Otherwise, however, they behave as single-quoted strings: no parameter or command substitution takes place:
reply=$'"That\'ll be $4.96, please," said the cashier'
The important thing to note is that the single string that gets stored in the reply variable is exactly the same in all of these examples. Similarly, after the shell is done parsing a command line, there is no way for the command being run to tell exactly how each argument string was actually typed – or even if it was typed, rather than being created programmatically somehow.
Below is what worked for me -
QUOTE="'"
hive -e "alter table TBL_NAME set location $QUOTE$TBL_HDFS_DIR_PATH$QUOTE"
EDIT: (As per the comments in question:)
I've been looking into this since then. I was lucky enough that I had repo laying around. Still it's not clear to me whether you need to enclose your commands between single quotes by force. I looked into the repo syntax and I don't think you need to. You could used double quotes around your command, and then use whatever single and double quotes you need inside provided you escape double ones.
just use printf
instead of
repo forall -c '....$variable'
use printf to replace the variable token with the expanded variable.
For example:
template='.... %s'
repo forall -c $(printf "${template}" "${variable}")
Variables can contain single quotes.
myvar=\'....$variable\'
repo forall -c $myvar
I was wondering why I could never get my awk statement to print from an ssh session so I found this forum. Nothing here helped me directly but if anyone is having an issue similar to below, then give me an up vote. It seems any sort of single or double quotes were just not helping, but then I didn't try everything.
check_var="df -h / | awk 'FNR==2{print $3}'"
getckvar=$(ssh user#host "$check_var")
echo $getckvar
What do you get? A load of nothing.
Fix: escape \$3 in your print function.
Does this work for you?
eval repo forall -c '....$variable'
Here's a simple .bat file that shows the first three arguments with which the bat file is executed:
#echo 1: %1
#echo 2: %2
#echo 3: %3
When I execute the bat file like so
c:\> x:\show_parameters.bat "foo bar" baz "one two three"
the output is
1: "foo bar"
2: baz
3: "one two three"
I was surprised, because I didn't expect the double quotes to be passed as part of the arguments.
When I use a Perl script to show the values of the parameters
my $arg_cnt = 1;
for my $arg (#ARGV) {
printf "%2d: %s\n", $arg_cnt, $arg;
$arg_cnt++;
}
and execute the script like so
c:\> x:\show_parameter.pl "foo bar" baz "one two three"
it prints
1: foo bar
2: baz
3: one two three
that is, without any double quotes. This is the, imho, expected behaviour for the bat variant.
So, Why are the arguments passed differently to the bat file?
TL;DR: It depends on the shell implementation. On windows, the cmd console quoting uses different rules from the bash shell. source
I believe you're looking for:
#echo 1: %~1
#echo 2: %~2
#echo 3: %~3
See the documentation.
The Tilde character has special "modifier" meaning with batch parameters. If you think about it, Perl and batch are two different languages, and when a program is sent parameters, think of it like it's passed a query string, except it's upto the language to decide how to parse it. Really what you pass to the program is one-long parameter but the program splits on spaces while keeping in mind quotes and escaping.
You can also see #echo 0: %0 will show you the program in quotes while #echo 0: %~0 removes the double quotes.
To see the full line of "arguments" passed to the script, without being parsed, you'd do:
#echo *: %*
As you can see, the script is really passed one long argument and has to be parsed first, keeping spaces and quotes, and escape characters such as ^ in mind, in order to create the concept of multiple "arguments".
In terms of Perl's behavior, my guess is that Perl does this automagically for you when populating ARGV. Could check its source code if you're interested in the logic Perl is using.
Edit:
After playing with it for a while, I'm starting to think this is beyond Perl's control. I'm noticing the same behavior with PHP as well when testing print_r($argv); from commandline and it also loses its quotes.
You can see Perl is getting sent the parameters with quotes by running:
use Win32::API;
my $GetCommandLine = Win32::API->new('kernel32',
'GetCommandLine', [ ] , 'P' );
$cmdline = $GetCommandLine->Call();
print $cmdline;
But then you'd need to parse that if you wanted just the parameters and not the full command line command.
There's a question posted here exactly like yours:
http://www.nntp.perl.org/group/perl.beginners/2002/07/msg29597.html
To paraphrasing Peter Scott's answer there on the 2nd page: the shell does the whitespace splitting and dequoting before the program sees the arguments and it makes no difference what language the program is written in. So you'll have to find a workaround.
The answer is pretty consistent that it's a shell issue the more I research it.
For example even in Python, it's the same issue.
So why does batch give different results? It depends on the shell implementation. On windows, the cmd console quoting uses different rules from the bash shell.
consider
call :somesubroutine %*
Without the quotes, somesubroutine would see 6 arguments. With its sees 3.
Really a matter of definition, but batch seems to tell the truth here.
Consider also what happens with
x:\show_parameters.bat "foo bar" "" "one two three"
Whats's going on?
helper.bat
#echo off
echo %1
call:foo %1
goto:eof
:foo
echo %1
goto:eof
Run our script like the following
helper "^^^^"
Output
"^^^^"
"^^^^^^^^"
Why? I know that '^' symbol is smth special in case of cmd.exe, but what's going on here? How the function call affect on it?
CALL is very special in this case!
The batch parser has different phases, in the special character phase unquoted carets are used to escape the next character, the caret itself is removed.
In your case, the carets are quoted, so they will not be affected.
Then the carets can be affected again in the delayed expansion phase, but quotes havn't special meaning there, the carets are used only to escape exclamation marks.
Normally after the delayed phase all is done, BUT if you use CALL all carets are doubled.
Normally this is invisible, as the CALL also restarts the parser and carets are removed in the special character phase again.
But in your case they are quoted, therefore they stay doubled.
Try this
call call call call echo a^^ "b^"
Output
a^ "b^^^^^^^^^^^^^^^^"
The parser is explained at How does the Windows Command Interpreter (CMD.EXE) parse scripts?
I'm trying to populate a shell variable called $recipient which should contain a value followed by a new-line.
$ set -x # force bash to show commands as it executes them
I start by populating $user, which is the value that I want to be followed by the newline.
$ user=user#xxx.com
+ user=user#xxx.com
I then call echo $user inside a double-quoted command substitution. The echo statement should create a newline after $user, and the double-quotes should preserve the newline.
$ recipient="$(echo $user)"
++ echo user#xxx.com
+ recipient=user#xxx.com
However when I print $recipient, I can see that the newline has been discarded.
$ echo "'recipient'"
+ echo ''\''recipient'\'''
'recipient'
I've found the same behaviour under bash versions 4.1.5 and 3.1.17, and also replicated the issue under dash.
I tried using "printf" rather than echo; this didn't change anything.
Is this expected behaviour?
Command substitution removes trailing newlines. From the standard:
The shell shall expand the command substitution by executing command in a subshell environment (see Shell Execution Environment ) and replacing the command substitution (the text of command plus the enclosing "$()" or backquotes) with the standard output of the command, removing sequences of one or more characters at the end of the substitution. Embedded characters before the end of the output shall not be removed; however, they may be treated as field delimiters and eliminated during field splitting, depending on the value of IFS and quoting that is in effect. If the output contains any null bytes, the behavior is unspecified.
You will have to explicitly add a newline. Perhaps:
recipient="$user
"
There's really no reason to use a command substitution here. (Which is to say that $(echo ...) is almost always a silly thing to do.)
All shell versions will react the same way, this is nothing new in scripting.
The new-line at the end of your original assignment is not included in the variable's value. It only "terminates" the current cmd and signals the shell to process.
Maybe user="user#xxx.com\n" will work, but without context about why you want this, just know that people usually keep variables values separate from the formatting "tools" like the newline.
IHTH.
The Windows command prompt (cmd.exe) has an optional /s parameter, which modifies the behavior of /c (run a particular command and then exit) or /k (run a particular command and then show a shell prompt). This /s parameter evidently has something to do with some arcane quote handling.
The docs are confusing, but as far as I can tell, when you do cmd /csomething, and the something contains quotation marks, then by default cmd will sometimes strip off those quotes, and /s tells it to leave them alone.
What I don't understand is when the quote removal would break anything, because that's the only time /s ("suppress the default quote-removal behavior") would be necessary. It only removes quotes under a certain arcane set of conditions, and one of those conditions is that the first character after the /c must be a quotation mark. So it's not removing quotes around arguments; it's either removing quotes around the path to the EXE you're running, or around the entire command line (or possibly around the first half of the command line, which would be bizarre).
If the path to the EXE is quoted, e.g. cmd /c "c:\tools\foo.exe" arg1 arg2, then quotes are unnecessary, and if cmd wants to remove them, fine. (It won't remove them if the path has a space in the name -- that's another of the arcane rules.) I can't imagine any reason to suppress the quote removal, so /s seems unnecessary.
If the entire command line is quoted, e.g. cmd /c "foo.exe arg1 arg2", then it seems like quote removal would be a necessity, since there's no EXE named foo.exe arg1 arg2 on the system; so it seems like opting out of quote removal using /s would actually break things. (In actual fact, however, it does not break things: cmd /s /c "foo.exe arg1 arg2" works just fine.)
Is there some subtlety to /s that's eluding me? When would it ever be necessary? When would it even make any difference?
Cmd /S is very useful as it saves you having to worry about "quoting quotes". Recall that the /C argument means "execute this command as if I had typed it at the prompt, then quit".
So if you have a complicated command which you want to pass to CMD.exe you either have to remember CMD's argument quoting rules, and properly escape all of the quotes, or use /S, which triggers a special non-parsing rule of "Strip first and last " and treat all other characters as the command to execute unchanged".
You would use it where you want to take advantage of the capabilities of the CMD shell, rather than directly calling another program. For example environment variable expansion, output or input redirection, or using CMD.exe built-ins.
Example:
Use a shell built-in: This executes as-if you had typed DEL /Q/S "%TMP%\TestFile" at the prompt:
CMD.exe /S /C " DEL /Q/S "%TMP%\TestFile" "
This executes SomeCommand.exe redirecting standard output to a temp file and standard error to the same place:
CMD.exe /S /C " "%UserProfile%\SomeCommand.exe" > "%TMP%\TestOutput.txt" 2>&1 "
So what does /S give you extra? Mainly it saves you from having to worry about quoting the quotes. It also helps where you are unsure whether for example an environtment variable contains quote characters. Just say /S and put an extra quote at the beginning and end.
Vaguely Related: $* in Bourne Shell.
Some background
Recall that the list of arguments to main() is a C-ism and Unix-ism. The Unix/Linux shell (e.g. Bourne Shell etc) interprets the command line, un-quotes the arguments, expands wildcards like * to lists of files, and passes a list of arguments to the called program.
So if you say:
$ vi *.txt
The vi command sees for example these arguments:
vi
a.txt
b.txt
c.txt
d.txt
This is because unix/linux operates internally on the basis of "list of arguments".
Windows, which derives ultimately from CP/M and VAX, does not use this system internally. To the operating system, the command line is just a single string of characters. It is the responsibility of the called program to interpret the command line, expand file globs (* etc) and deal with unquoting quoted arguments.
So the arguments expected by C, have to be hacked up by the C runtime library. The operating system only supplies a single string with the arguments in, and if your language is not C (or even if it is) it may not be interpreted as space-separated arguments quoted according to shell rules, but as something completely different.
Here's an example of how it can make a difference.
Suppose you have two executables: c:\Program.exe and c:\Program Files\foo.exe.
If you say
cmd /c "c:\Program Files\foo"
you'll run foo.exe (with no arguments) whereas if you say
cmd /s /c "c:\Program Files\foo"
you'll run Program.exe with Files\foo as the argument.
(Oddly enough, in the first example, if foo.exe didn't exist, Program.exe would run instead.)
Addendum: if you were to type
c:\Program Files\foo
at the command prompt, you would run Program.exe (as happens with cmd /s /c) rather than foo.exe (as happens with just cmd /c). So one reason for using /s would be if you want to make sure a command is parsed in exactly the same way as if it were being typed at the command prompt. This is probably more likely to be desirable in the scenario in the question Michael Burr linked to, where cmd.exe is being launched by CreateProcess rather than from a batch file or the command line itself..
That is, if you say
CreateProcess("cmd.exe", "cmd /s /c \"" MY_COMMAND "\"", ...)
then the string MY_COMMAND will be parsed exactly as if it were typed at the command prompt. If you're taking command-line input from the user, or if you're a library processing a command line provided by an application, that's probably a good idea. For example, the C runtime library system() function might be implemented in this way.
In all but one specific case, the /S won't actually make any difference.
The help for cmd.exe is accurate, if a bit complicated:
If /C or /K is specified, then the remainder of the command line after
the switch is processed as a command line, where the following logic is
used to process quote (") characters:
If all of the following conditions are met, then quote characters
on the command line are preserved:
no /S switch
exactly two quote characters
no special characters between the two quote characters,
where special is one of: &<>()#^|
there are one or more whitespace characters between the
two quote characters
the string between the two quote characters is the name
of an executable file.
Otherwise, old behavior is to see if the first character is
a quote character and if so, strip the leading character and
remove the last quote character on the command line, preserving
any text after the last quote character.
I'd summarize as follows:
Normal behavior:
If the rest of the command line after /K or /C starts with a quote, both that quote and the final quote are removed. (See exception below.) Other than that, no quotes are removed.
Exception:
If the rest of the command line after /K or /C starts with a quote, followed by the name of an executable file, followed by another quote, AND if those are the only two quotes, AND if the file name contains spaces but contains no special characters, then the quotes are not removed (even though they normally would have been removed according to the rule above).
The only effect of /S is to override this one exception, so that the two quote characters are still removed in that case.
If you always use /S, you can forget about the exception and just remember the "normal" case. The downside is that cmd.exe /S /C "file name with spaces.exe" argument1 won't work without adding an extra set of quotes, whereas without /S it would have worked... until you decide to replace argument1 with "argument1".