How to properly escape filenames in Windows cmd.exe?

If you have a filename containing spaces, you typically double quote it on the Windows command shell (cmd.exe).
dir "\Program Files"
This also works for other special characters like ^&;,=. But it doesn't work for percent signs as they may be part of variable substitution. For example,
mkdir "%os%"
will create a directory named Windows_NT. To escape the percent sign, a caret can be used:
mkdir ^%os^%
But unfortunately, the caret sign loses its meaning in double quotes:
mkdir "^%os^%"
creates a directory named ^%os^%.
This is what I found out so far (Windows 7 command shell):
The characters ^ and & can be escaped with either a caret or double quotes.
The characters ;, ,, =, and space can only be escaped with double quotes.
The character % can only be escaped with a caret.
The characters '`+-~_.!#$#()[]{} apparently don't have to be escaped in filenames.
The characters <>:"/\|?* are illegal in filenames anyway.
This seems to make a general algorithm to quote filenames rather complicated. For example, to create a directory named My favorite %OS%, you have to write:
mkdir "My favorite "^%OS^%
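The rules listed above can be sketched as a small quoting routine. This is only an illustration of those rules for the interactive prompt, not a complete cmd.exe quoting algorithm; the function name quote_for_cmd and its character sets are assumptions made for this sketch.

```python
# Hypothetical helper: applies only the rules listed in the question
# (interactive cmd.exe prompt, bare filename, no path separators).
ILLEGAL = set('<>:"/\\|?*')      # not allowed in filenames at all
NEEDS_QUOTES = set(' ;,=^&')     # protected by double quotes


def quote_for_cmd(name: str) -> str:
    if any(ch in ILLEGAL for ch in name):
        raise ValueError("illegal filename character in %r" % name)
    pieces = []
    run = []

    def flush() -> None:
        # Emit the pending run, quoted only if it contains a special character.
        if run:
            text = "".join(run)
            if any(ch in NEEDS_QUOTES for ch in text):
                text = '"' + text + '"'
            pieces.append(text)
            run.clear()

    for ch in name:
        if ch == "%":
            flush()
            pieces.append("^%")  # the caret only works outside double quotes
        else:
            run.append(ch)
    flush()
    return "".join(pieces)


print("mkdir " + quote_for_cmd("My favorite %OS%"))
```

For the directory name used above, the sketch produces the same mixed form: a quoted run followed by caret-escaped percent signs.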
Question 1: Is there an easier way to safely quote space and percent characters?
Question 2: Are the characters '`+-~_.!#$#()[]{} really safe to use without escaping?

The characters ^ and & can be escaped with either a caret or double quotes.
There is an additional restriction when piping.
When a pipe is used, the expressions are parsed twice. First when the
expression before the pipe is executed and a second time when the
expression after the pipe is executed. So to escape any characters in
the second expression double escaping is needed:
The line below will echo a single & character:
break| echo ^^^&
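The double escaping can be modelled as applying caret escaping once per parsing pass. The sketch below is an assumption-laden illustration: the name caret_escape and the set of characters it treats as special are chosen for this example and do not cover every cmd.exe metacharacter.

```python
# Hypothetical helper: caret-escape a string once per parsing pass.
SPECIAL = set("^&|<>")  # assumed set of characters that need a caret


def caret_escape(text: str, passes: int = 1) -> str:
    for _ in range(passes):
        # Each pass prefixes every special character, including carets
        # added by the previous pass, with one more caret.
        text = "".join("^" + ch if ch in SPECIAL else ch for ch in text)
    return text


print(caret_escape("&", passes=1))  # escaping for a single parse
print(caret_escape("&", passes=2))  # escaping for the right side of a pipe
```

With two passes a single & becomes the ^^^& used in the pipe example above.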
The character % can only be escaped with a caret.
In a batch file, % can also be escaped by doubling it.
The % character has a special meaning for command line parameters
and FOR parameters.
To treat a percent as a regular character, double it:
%%
For example, to create a directory named My favorite %OS%, you have to write:
mkdir "My favorite "^%OS^%
Question 1: Is there an easier way to safely quote space and percent
characters?
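In a batch file, use %% instead of % and put the closing " at the end, where you would normally expect it to be. At the interactive prompt %% is not collapsed into a single %, so %OS% is still expanded there, as this session shows: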
C:\test\sub>dir
...
Directory of C:\test\sub
03/06/2015 14:40 <DIR> .
03/06/2015 14:40 <DIR> ..
0 File(s) 0 bytes
2 Dir(s) 82,207,772,672 bytes free
C:\test\sub>mkdir "My favorite %%OS%%"
C:\test\sub>dir
...
Directory of C:\test\sub
03/06/2015 14:40 <DIR> .
03/06/2015 14:40 <DIR> ..
03/06/2015 14:40 <DIR> My favorite %Windows_NT%
0 File(s) 0 bytes
3 Dir(s) 82,207,772,672 bytes free
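As a minimal sketch of the doubling rule, the snippet below builds a mkdir line intended to be written into a batch file, where %% is read as a literal percent sign. The function name batch_mkdir_line is hypothetical, and the sketch assumes the name contains no double quotes and that delayed expansion is not enabled.

```python
# Hypothetical helper: build a mkdir line for use inside a .bat/.cmd file.
def batch_mkdir_line(name: str) -> str:
    # Double every percent sign; the surrounding quotes protect spaces
    # and the other special characters.
    return 'mkdir "' + name.replace("%", "%%") + '"'


print(batch_mkdir_line("My favorite %OS%"))
```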
Question 2: Are the characters '`+-~_.!#$#()[]{} really safe to use without escaping?
No, again there are some additional circumstances when some of these must be escaped, for example when using delayed variable expansion or when using for /f.
See Escape Characters for all the details ... in particular the Summary table.
Sources Syntax : Escape Characters, Delimiters and Quotes and Escape Characters.
Further Reading
How does the Windows Command Interpreter (CMD.EXE) parse scripts?

Related

bash script - why backslash did not escape "d" here in "\dirname" [duplicate]

In the first part of my question I will provide some background info as a
service to the community. The second part contains the actual question.
Part I
Assume I've created the following alias:
alias ls='ls -r'
I know how to temporarily unalias (i.e., override this alias) in the following
ways, using:
1) the full pathname of the command: /bin/ls
2) command substitution: $(which ls)
3) the command builtin: command ls
4) double quotation marks: "ls"
5) single quotation marks: 'ls'
6) a backslash character: \ls
Case 1 is obvious and case 2 is simply a variation. The command builtin in case 3 was designed to ignore shell functions, but apparently it also works for circumventing aliases. Finally, cases 4 and 5 are consistent with both the POSIX standard (2.3.1):
"a resulting word that is identified
to be the command name word of a
simple command shall be examined to
determine whether it is an unquoted,
valid alias name."
and the Bash Reference Manual (6.6):
"The first word of each simple
command, if unquoted, is checked to
see if it has an alias."
Part II
Here's the question: why is case 6 (overriding the alias by saying \ls)
considered quoting the word? In keeping with the style of this question, I am looking for references to the "official" documentation.
The documentation says that a backslash only escapes the following
character, as opposed to single and double quotation marks, which quote a
sequence of characters. POSIX standard (2.2.1):
"A backslash that is not quoted shall
preserve the literal value of the
following character, with the
exception of a < newline >"
Bash Reference Manual (3.1.2.1):
"A non-quoted backslash ‘\’ is the
Bash escape character. It preserves
the literal value of the next
character that follows, with the
exception of newline."
(BTW, isn't "the next character that follows" a bit of overkill?)
A possible answer might be that this situation isn't that special: it is
similar to a few cases in ANSI-C quoting, e.g. \nnn. However, that is still
escaping a single character (the eight-bit character whose value is the octal
value nnn), not a sequence of characters.
Historically, and maintained by POSIX, quoting any part of the word causes the entire word to be considered quoted for the purposes of functions and alias expansion. It also applies to quoting the end token for a here document:
cat << \EOF
this $text is fully quoted
EOF
Just for completeness, here's yet another way to suppress alias & function lookups (by clearing the entire shell environment for a single command):
# cf. http://bashcurescancer.com/temporarily-clearing-environment-variables.html
env -i ls
\ls only quotes the first character rather than the whole word. It's equivalent to writing 'l's.
You can verify it like this:
$ touch \?l
$ \??
bash: ?l: command not found
If \?? quoted the whole word it would say ?? not found rather than ?l not found.
I.e. it has the same effect as:
$ '?'?
bash: ?l: command not found
rather than:
$ '??'
bash: ??: command not found

Change directory to subfolder with single quotes and exclamation mark

I'm trying to navigate down to a subfolder in a bash shell. The name of the subfolder is:
Let's Go Play!
I cannot figure out how to escape the single quote (apostrophe) or the exclamation point.
I have tried
cd "Let's Go Play\!"
cd "Let\'s Go Play\!"
Thanks.
The correct form is
cd "Let's Go Play!"
Inside double quotes, backslashes are not special unless they come before a newline, a quote, a backslash, a dollar sign or a backtick. Backslash-newline is removed altogether; a backslash followed by one of the other four characters in that list is removed and the character loses its special significance.
Inside single quotes, backslashes are never removed and have no special significance. Consequently, it is impossible to insert a single quote into a single-quoted string and so there is no single-quoted form of the above cd command. However, you can concatenate words, so you could write:
cd 'Let'"'"'s Go Play!'
Outside of quoted words, backslashes are more general. A backslash followed by any character other than a newline character is removed from the input and the following character becomes an ordinary character (even if it were ordinary already). Backslash-newline is removed entirely from the input, so that there is no way to insert a newline character into an unquoted string.
So you could have written:
cd Let\'s\ Go\ Play\!
But the double-quoted version seems simpler.
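The three spellings above can be checked against each other with Python's shlex module, which approximates POSIX word splitting. This is only a sketch: shlex knows nothing about bash history expansion, so it says nothing about how the exclamation point behaves in an interactive shell.

```python
import shlex

# The three spellings of the same command discussed above.
DOUBLE_QUOTED = '''cd "Let's Go Play!"'''
CONCATENATED = """cd 'Let'"'"'s Go Play!'"""
BACKSLASHED = r"cd Let\'s\ Go\ Play\!"

# Split each command line into words the way a POSIX shell roughly would.
parsed = [shlex.split(s) for s in (DOUBLE_QUOTED, CONCATENATED, BACKSLASHED)]
for words in parsed:
    print(words)

# shlex.quote produces the concatenated single-quote form.
quoted = shlex.quote("Let's Go Play!")
print(quoted)
```

All three lines split into the same two words, and shlex.quote yields the concatenated single-quote spelling.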
Exclamation points are an extension to the Posix standard (the above rules come directly from the Posix standard), and the bash implementation is a bit quirky and sometimes really annoying. Exclamation points introduce history expansion, unless they are inside single quotes, are preceded by a backslash, or are followed by whitespace or either an equals sign or (if shell option extglob is enabled) an open parenthesis. Inside double quotes, an exclamation point is also not special just before the closing quote. (You can change the history expansion character to something other than an exclamation point so technically I should write "the history expansion character".)
Even though a backslash makes an exclamation mark unspecial, the backslash is not removed from the input stream unless it would have been removed by the Posix rules. So the exclamation point in
echo "a\!b"
is an ordinary character (it is preceded by a backslash), but the backslash is also an ordinary character (it is not followed by one of the characters in the double-quote list), so the result is
a\!b
(Although I copied those rules from the bash manual, I know there are some other cases where history expansion is suppressed, such as when the exclamation point is part of a parameter expansion such as $! or ${!name}. And I think there are more of these exceptions that I can't remember off-hand.)
I find all that so annoying, and I rely so little on history expansion, that I simply turn it off by adding set +H to my bash startup file ~/.bashrc. If you turn history expansion off, then exclamation points lose all special significance. However, there are people who seem to really like history expansion, and if you're one of them, more power to you.

"invalid path 0 files copied" Error while using xcopy command

Hi I have this little command to copy files in a batch, which will help because I do this specific copy multiple times a day. The problem occurs while using the xcopy command. Everything is in order, but I am receiving this error: "Invalid path 0 files copied". Here is the code:
C:\Windows\System32\xcopy /Y "C:\Users\Ryan\Desktop\mmars_pub\" "C:\Users\Ryan\Desktop\Dropbox\MMARS\mmars_pub\"
I'm using the full path to the xcopy executable because I was having trouble configuring the path environment variable to function properly. I would imagine that it shouldn't affect the result though. I read somewhere about the "Prevent MS-DOS-based programs from detecting Windows" checkbox that should fix the issue, but I just can't seem to find that. Any help appreciated.
Original answer
Remove the ending backslash from the source folder path
C:\Windows\System32\xcopy.exe /Y "C:\Users\Ryan\Desktop\mmars_pub" "C:\Users\Ryan\Desktop\Dropbox\MMARS\mmars_pub\"
edited 2015/10/01
While the original question used a literal path, and the indicated solution will solve the problem, there is another option. For literal paths, and for cases where the path is in a variable and may or may not end in a backslash, it is enough to make sure that the ending backslash (if present) is separated from the closing quote by appending a dot.
xcopy /y "x:\source\." "x:\target"
xcopy /y "%myVariable%." "x:\target"
This ending dot will not interfere with file or folder names. If there is an ending backslash, the additional dot simply refers to the same folder. If there is no ending backslash, the dot is discarded, because Windows file and folder names cannot end with a dot.
BUT if the output of the xcopy command will be processed, remember that this additional dot will be included in the paths shown.
Note: the solutions are above. Keep reading if you are interested in why and where the problem occurs.
Why does xcopy "c:\source\" "d:\target\" fail but xcopy "c:\source" "d:\target\" work?
Both commands seem to have valid path references, and yes, both are valid path references, but two elements work together to make the command fail:
the folder reference is quoted (note: it should be quoted, it is a good habit to quote paths as you never know when they will contain spaces or special characters)
xcopy is not an internal command handled by cmd but an executable file
As xcopy is an external command, its arguments are not handled following the cmd parser command line logic. They are handled by the Microsoft C startup code.
This parser follows two sets of rules. The official rules:
Arguments are delimited by white space, which is either a space or a tab.
A string surrounded by double quotation marks is interpreted as a single argument, regardless of white space contained within. A quoted string can be embedded in an argument. Note that the caret (^) is not recognized as an escape character or delimiter.
A double quotation mark preceded by a backslash, \", is interpreted as a literal double quotation mark (").
Backslashes are interpreted literally, unless they immediately precede a double quotation mark.
If an even number of backslashes is followed by a double quotation mark, then one backslash (\) is placed in the argv array for every pair of backslashes (\\), and the double quotation mark (") is interpreted as a string delimiter.
If an odd number of backslashes is followed by a double quotation mark, then one backslash (\) is placed in the argv array for every pair of backslashes (\\) and the double quotation mark is interpreted as an escape sequence by the remaining backslash, causing a literal double quotation mark (") to be placed in argv.
And the undocumented/unofficial rules (How Command Line Parameters Are Parsed):
Outside a double quoted block a " starts a double quoted block.
Inside a double quoted block a " followed by a different character (not another ") ends the double quoted block.
Inside a double quoted block a " followed immediately by another " (i.e. "") causes a single " to be added to the output, and the double quoted block continues.
This parser sees the sequence \" at the end of the "first" argument as an escaped quote that does not end the argument; it is treated as part of the argument. The "starting" quote of the "second" argument then merely ends the double quoted block, BUT it does not end the argument; remember that arguments are delimited by white space.
So while it seems that the command line arguments are
     v            v            v......argument delimiters
      v..........v v..........v ......quoted blocks
xcopy "x:\source\" "x:\target\"
       ^........^   ^........^  ......argument data
       arg #1       arg #2
arg #1 = x:\source\
arg #2 = x:\target\
the actual argument handled by xcopy is
     v                         v .....argument delimiters
      v.......................v  .....quoted block
xcopy "x:\source\" "x:\target\"
       ^......................^  .....argument data
       arg #1
arg #1 = x:\source" x:\target"
When the ending backslash is removed or the additional dot is included, the closing quote of the argument is no longer escaped, so it closes the quoted block and the space between the arguments is seen as a delimiter.
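The quoted rules can be turned into a small splitter to reproduce the behaviour described above. This is a sketch written from the rules as listed here, not the actual Microsoft C startup code; the name split_msvc_args is hypothetical, and the sketch only handles the argument string, not the program name.

```python
def split_msvc_args(cmdline: str) -> list:
    """Split an argument string following the rules quoted above (sketch)."""
    args, cur = [], []
    in_quotes = False
    have_arg = False
    i, n = 0, len(cmdline)
    while i < n:
        ch = cmdline[i]
        if ch == "\\":
            # Count the run of backslashes and look at what follows it.
            j = i
            while j < n and cmdline[j] == "\\":
                j += 1
            count = j - i
            if j < n and cmdline[j] == '"':
                cur.append("\\" * (count // 2))  # one per pair
                if count % 2:
                    cur.append('"')              # odd: escaped, literal quote
                    j += 1
                # even: the quote is handled as a delimiter on the next pass
            else:
                cur.append("\\" * count)         # literal backslashes
            i = j
            have_arg = True
        elif ch == '"':
            if in_quotes and i + 1 < n and cmdline[i + 1] == '"':
                cur.append('"')                  # "" inside a quoted block
                i += 2
            else:
                in_quotes = not in_quotes        # start or end a quoted block
                i += 1
            have_arg = True
        elif ch in " \t" and not in_quotes:
            if have_arg:                         # white space ends the argument
                args.append("".join(cur))
                cur, have_arg = [], False
            i += 1
        else:
            cur.append(ch)
            have_arg = True
            i += 1
    if have_arg:
        args.append("".join(cur))
    return args


print(split_msvc_args(r'/Y "x:\source\" "x:\target\"'))
print(split_msvc_args(r'/Y "x:\source\." "x:\target"'))
```

The first call yields a single merged path argument after /Y, while the version with the ending dot yields two separate paths.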

How to input special character in cmd?

I have written a C program that retrieves arguments from the command line under Windows. One of the arguments is a regular expression, so I need to pass special characters such as "(", "," and ".", but cmd.exe treats "(" as a special character.
How can I input these special characters?
thanks.
You can put the arguments in quotes:
myprogram.exe "(this is some text, with special characters.)"
Though I wouldn't assume that parentheses cause problems unless you are using blocks for conditional statements or loops in a batch file. The usual array of characters that are treated specially by the shell and need quoting or escaping are:
& | > < ^
If you use those in your regular expression, then you need quotes, or escape those characters:
myprogram "(.*)|[a-f]+"
myprogram (.*)^|[a-f]+
(^ is the escape character which causes the following character to be not interpreted by the shell but instead used literally)
You can generally prefix any character with ^ to turn off its special nature. For example:
Microsoft Windows XP [Version 5.1.2600]
(C) Copyright 1985-2001 Microsoft Corp.
C:\Documents and Settings\Pax>echo No ^<redirection^> here and can also do ^
More? multi-line, ^(parentheses^) and ^^ itself
No <redirection> here and can also do multi-line, (parentheses) and ^ itself
C:\Documents and Settings\Pax>
That's a caret followed by an ENTER after the word do.
