How to input special character in cmd? - shell

I have written a C program that retrieves arguments from the command line under Windows. One of the arguments is a regular expression, so I need to pass special characters such as "(", ",", and ".", but cmd.exe treats "(" as a special character.
How can I input these special characters?
Thanks.

You can put the arguments in quotes:
myprogram.exe "(this is some text, with special characters.)"
Though I wouldn't assume that parentheses cause problems unless you are using blocks for conditional statements or loops in a batch file. The usual array of characters that are treated specially by the shell and need quoting or escaping are:
& | > < ^
If you use those in your regular expression, then you need quotes, or escape those characters:
myprogram "(.*)|[a-f]+"
myprogram (.*)^|[a-f]+
(^ is the escape character: it causes the character that follows to be used literally instead of being interpreted by the shell.)
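If the command line is built by another program, the caret rule above can be applied mechanically. The following is only a sketch in Python: the helper name caret_escape is made up for illustration, and the set of characters is just the five listed in this answer, so it does not cover quotes, percent signs, or parentheses inside batch blocks.

```python
# Characters this answer lists as special to cmd.exe (assumption: no others matter here).
CMD_SPECIAL = set("&|<>^")


def caret_escape(text):
    """Prefix each listed cmd.exe metacharacter with a caret so it is taken literally."""
    return "".join("^" + ch if ch in CMD_SPECIAL else ch for ch in text)


# Reproduces the unquoted form shown above.
print(caret_escape("(.*)|[a-f]+"))  # (.*)^|[a-f]+
```

The quoted form is usually easier to read; the caret form is mainly useful when the argument itself has to contain double quotes.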

You can generally prefix any character with ^ to turn off its special nature. For example:
Microsoft Windows XP [Version 5.1.2600]
(C) Copyright 1985-2001 Microsoft Corp.
C:\Documents and Settings\Pax>echo No ^<redirection^> here and can also do ^
More? multi-line, ^(parentheses^) and ^^ itself
No <redirection> here and can also do multi-line, (parentheses) and ^ itself
C:\Documents and Settings\Pax>
That's a caret followed by an ENTER after the word do.

Related

Why does backtick placement matter?

I'm trying to better understand how backticks work in PowerShell. This works and executes the ipconfig command:
$a = "ipc"
$b = "onf`ig"
iex $a$b
However, if the backtick is moved one character to the left, before the "f", the command breaks...
$a = "ipc"
$b = "on`fig"
iex $a$b
Another example of this:
who`ami
If the backtick is anywhere else, the whoami command will work just fine. With a backtick in the middle, it breaks.
What's happening here? Why does the placement of the backtick matter so much?
This happens because of special characters in PowerShell.
PowerShell has some special characters that are not part of the standard character set. They start with a backtick to signal their special meaning. They are:
`0 Null
`a Alert
`b Backspace
`e Escape
`f Form feed
`n New line
`r Carriage return
`t Horizontal tab
`u{x} Unicode escape sequence
`v Vertical tab
Here, when you escape "a" with a backtick, PowerShell reads it as the alert character (whoami), and when you escape "f" with a backtick, it reads it as a form feed (ipconfig), so both commands break.
When you escape a character that is not in this list, the commands don't break, because the backtick gives that character no special meaning.
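To see what the two strings from the question turn into, here is a rough Python model of the table above. It is a simplification, not PowerShell's real tokenizer: the function name expand_backticks is invented, the `u{x} form is not handled, and a trailing backtick is left as-is.

```python
# Backtick sequences from the table above, mapped to the characters they stand for.
PS_ESCAPES = {
    "0": "\0", "a": "\a", "b": "\b", "e": "\x1b", "f": "\f",
    "n": "\n", "r": "\r", "t": "\t", "v": "\v",
}


def expand_backticks(text):
    """Replace `x with its special character, or with plain x if x is not special."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "`" and i + 1 < len(text):
            nxt = text[i + 1]
            out.append(PS_ESCAPES.get(nxt, nxt))
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


print(repr(expand_backticks("onf`ig")))   # 'onfig'
print(repr(expand_backticks("on`fig")))   # 'on\x0cig'
print(repr(expand_backticks("who`ami")))  # 'who\x07mi'
```

In the first case the backtick just disappears, so the command name is intact; in the other two a letter is swallowed and replaced by a control character, so the name no longer matches any command.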
I don't agree with everything the author of this article says, but most of it is valid when it comes to use of the grave accent (backtick).
It does have its use cases, but not for what you are showing.
Bye Bye Backtick: Natural Line Continuations in PowerShell
See also:
about_Special_Characters - PowerShell | Microsoft Docs
PowerShell - Special Characters And Tokens
Grave_accent
Use in programming: Programmers use the grave accent symbol as a
separate character (i.e., not combined with any letter) for a number
of tasks. In this role, it is known as a backquote, or backtick.
Many of the Unix shells and the programming languages Perl, PHP, and
Ruby use pairs of this character to indicate command substitution,
that is, substitution of the standard output from one command into a
line of text defining another command. For example, using $ as the
symbol representing a terminal prompt, the code line...
How-to: Escape characters, Delimiters and Quotes

How to properly escape filenames in Windows cmd.exe?

If you have a filename containing spaces, you typically double quote it on the Windows command shell (cmd.exe).
dir "\Program Files"
This also works for other special characters like ^&;,=. But it doesn't work for percent signs as they may be part of variable substitution. For example,
mkdir "%os%"
will create a directory named Windows_NT. To escape the percent sign, a caret can be used:
mkdir ^%os^%
But unfortunately, the caret sign loses its meaning in double quotes:
mkdir "^%os^%"
creates a directory named ^%os^%.
This is what I found out so far (Windows 7 command shell):
The characters ^ and & can be escaped with either a caret or double quotes.
The characters ;, ,, =, and space can only be escaped with double quotes.
The character % can only be escaped with a caret.
The characters '`+-~_.!#$#()[]{} apparently don't have to be escaped in filenames.
The characters <>:"/\|?* are illegal in filenames anyway.
This seems to make a general algorithm to quote filenames rather complicated. For example, to create a directory named My favorite %OS%, you have to write:
mkdir "My favorite "^%OS^%
Question 1: Is there an easier way to safely quote space and percent characters?
Question 2: Are the characters '`+-~_.!#$#()[]{} really safe to use without escaping?
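The rules I found so far can be written down as a small routine. This is only a sketch in Python of my own observations for the interactive prompt: the name quote_filename is made up, and it assumes the three rules above are complete, which is exactly what I am unsure about.

```python
import re

# Characters that, per the observations above, are protected by double quotes.
NEEDS_QUOTES = set(" ;,=^&")


def quote_filename(name):
    """Quote runs that need it and caret-escape every percent sign outside the quotes."""
    pieces = []
    for part in re.split("(%)", name):
        if part == "%":
            pieces.append("^%")            # % only responds to a caret
        elif any(ch in NEEDS_QUOTES for ch in part):
            pieces.append('"' + part + '"')
        else:
            pieces.append(part)            # nothing special, leave as-is
    return "".join(pieces)


print(quote_filename("My favorite %OS%"))  # "My favorite "^%OS^%
```

It reproduces the mkdir argument from the example, but having to switch in and out of quotes for every percent sign is what makes this feel too complicated.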
The characters ^ and & can be escaped with either a caret or double quotes.
There is an additional restriction when piping.
When a pipe is used, the expressions are parsed twice. First when the
expression before the pipe is executed and a second time when the
expression after the pipe is executed. So to escape any characters in
the second expression double escaping is needed:
The line below will echo a single & character:
break| echo ^^^&
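The double escaping can be pictured as applying the caret rule once per parsing pass. A minimal sketch in Python, assuming the same five metacharacters as elsewhere on this page (the helper name and the passes parameter are hypothetical):

```python
CMD_SPECIAL = set("&|<>^")


def caret_escape(text, passes=1):
    """Apply caret escaping once for every time cmd.exe will parse the text."""
    for _ in range(passes):
        text = "".join("^" + ch if ch in CMD_SPECIAL else ch for ch in text)
    return text


print(caret_escape("&", passes=1))  # ^&
print(caret_escape("&", passes=2))  # ^^^&
```

The second pass escapes both the caret and the ampersand produced by the first pass, which is where the three carets in the echo line come from.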
The character % can only be escaped with a caret.
% can also be escaped by doubling it.
The % character has a special meaning for command line parameters
and FOR parameters.
To treat a percent as a regular character, double it:
%%
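When a batch file is generated by a script, the doubling can be done in one step. A small sketch in Python, assuming the text ends up inside a batch file (the function name is invented for illustration):

```python
def double_percents(text):
    """Double every percent sign so a batch file treats it as a literal %."""
    return text.replace("%", "%%")


print('mkdir "' + double_percents("My favorite %OS%") + '"')
# mkdir "My favorite %%OS%%"
```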
For example, to create a directory named My favorite %OS%, you have to write:
mkdir "My favorite "^%OS^%
Question 1: Is there an easier way to safely quote space and percent
characters?
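Inside a batch file, use %% instead of % and put the closing " at the end, where you would normally expect it. At the interactive prompt %% is not collapsed and %OS% is still expanded, which is why the session below produces My favorite %Windows_NT%: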
C:\test\sub>dir
...
Directory of C:\test\sub
03/06/2015 14:40 <DIR> .
03/06/2015 14:40 <DIR> ..
0 File(s) 0 bytes
2 Dir(s) 82,207,772,672 bytes free
C:\test\sub>mkdir "My favorite %%OS%%"
C:\test\sub>dir
...
Directory of C:\test\sub
03/06/2015 14:40 <DIR> .
03/06/2015 14:40 <DIR> ..
03/06/2015 14:40 <DIR> My favorite %Windows_NT%
0 File(s) 0 bytes
3 Dir(s) 82,207,772,672 bytes free
Question 2: Are the characters '`+-~_.!#$#()[]{} really safe to use without escaping?
No, again there are some additional circumstances when some of these must be escaped, for example when using delayed variable expansion or when using for /f.
See Escape Characters for all the details ... in particular the Summary table.
Sources: Syntax: Escape Characters, Delimiters and Quotes, and Escape Characters.
Further Reading
How does the Windows Command Interpreter (CMD.EXE) parse scripts?

Escape double quotes in parameter

In Unix I could run myscript '"test"' and I would get "test".
In Windows cmd I get 'test'.
How can I pass double-quotes as a parameter? I would like to know how to do this manually from a cmd window so I don't have to write a program to test my program.
Another way to escape quotes (though probably not preferable), which I've found used in certain places is to use multiple double-quotes. For the purpose of making other people's code legible, I'll explain.
Here's a set of basic rules:
When not wrapped in double-quoted groups, spaces separate parameters: program param1 param2 param 3 will pass four parameters to program.exe: param1, param2, param, and 3.
A double-quoted group ignores spaces as value separators when passing parameters to programs: program one two "three and more" will pass three parameters to program.exe: one, two, and three and more.
Now to explain some of the confusion:
Double-quoted groups that appear directly adjacent to text not wrapped with double-quotes join into one parameter: hello"to the entire"world acts as one parameter: helloto the entireworld.
Note: The previous rule does NOT imply that two double-quoted groups can appear directly adjacent to one another.
Any double-quote directly following a closing quote is treated as (or as part of) plain unwrapped text that is adjacent to the double-quoted group, but only one double-quote: "Tim says, ""Hi!""" will act as one parameter: Tim says, "Hi!"
Thus there are three different types of double-quotes: quotes that open, quotes that close, and quotes that act as plain-text.
Here's the breakdown of that last confusing line:
" open double-quote group
T inside ""s
i inside ""s
m inside ""s
(space) inside ""s - space doesn't separate
s inside ""s
a inside ""s
y inside ""s
s inside ""s
, inside ""s
(space) inside ""s - space doesn't separate
" close double-quoted group
" quote directly follows closer - acts as plain unwrapped text: "
H outside ""s - gets joined to previous adjacent group
i outside ""s - ...
! outside ""s - ...
" open double-quote group
" close double-quote group
" quote directly follows closer - acts as plain unwrapped text: "
Thus, the text effectively joins four groups of characters (one with nothing, however):
Tim says, is the first, wrapped to escape the spaces
"Hi! is the second, not wrapped (there are no spaces)
(empty) is the third, a double-quote group wrapping nothing
" is the fourth, the unwrapped close quote.
As you can see, the double-quote group wrapping nothing is still necessary since, without it, the following double-quote would open up a double-quoted group instead of acting as plain-text.
From this, it follows that, both inside and outside quotes, three double-quotes act as one plain-text double-quote:
"Tim said to him, """What's been happening lately?""""
will print Tim said to him, "What's been happening lately?" as expected. Therefore, three quotes can always be reliably used as an escape. However, once you understand why, you may note that the four quotes at the end can be reduced to two, since the extra pair only adds another unnecessary empty double-quoted group.
Here are a few examples to close it off:
program a b REM sends (a) and (b)
program """a""" REM sends ("a")
program """a b""" REM sends ("a) and (b")
program """"Hello,""" Mike said." REM sends ("Hello," Mike said.)
program ""a""b""c""d"" REM sends (abcd) since the "" groups wrap nothing
program "hello to """quotes"" REM sends (hello to "quotes")
program """"hello world"" REM sends ("hello world")
program """hello" world"" REM sends ("hello world")
program """hello "world"" REM sends ("hello) and (world")
program "hello ""world""" REM sends (hello "world")
program "hello """world"" REM sends (hello "world")
Final note: I did not read any of this from any tutorial - I came up with all of it by experimenting. Therefore, my explanation may not be true internally. Nonetheless all the examples above evaluate as given, thus validating (but not proving) my theory.
I tested this on Windows 7, 64bit using only *.exe calls with parameter passing (not *.bat, but I would suppose it works the same).
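For anyone who wants to play with the theory without a Windows box, here is the three-kinds-of-quotes model written as a small state machine in Python. It is only my model of the rules described above, not the real Windows parser: split_args is a made-up name, it takes just the text after the program name, it ignores backslashes completely, and other C runtimes may well split some of these lines differently.

```python
def split_args(tail):
    """Split the text after the program name using the open/close/plain-text quote model."""
    args = []
    current = []
    started = False      # True once the current parameter has begun (even if empty)
    in_quotes = False    # inside a double-quoted group
    just_closed = False  # previous character was a closing quote
    for ch in tail:
        if ch == '"':
            if in_quotes:
                in_quotes = False      # quote that closes
                just_closed = True
            elif just_closed:
                current.append('"')    # quote that acts as plain text
                just_closed = False
            else:
                in_quotes = True       # quote that opens
            started = True
        elif ch in " \t" and not in_quotes:
            if started:
                args.append("".join(current))
            current, started, just_closed = [], False, False
        else:
            current.append(ch)
            started = True
            just_closed = False
    if started:
        args.append("".join(current))
    return args


print(split_args('"Tim says, ""Hi!"""'))  # ['Tim says, "Hi!"']
print(split_args('"""a b"""'))            # ['"a', 'b"']
print(split_args('""a""b""c""d""'))       # ['abcd']
```

The three printed results agree with the corresponding lines in the example list above.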
I cannot quickly reproduce the symptoms: if I try myscript '"test"' with a batch file myscript.bat containing just @echo.%1 or even @echo.%~1, I get all quotes: '"test"'
Perhaps you can try the escape character ^ like this: myscript '^"test^"'?
Try this:
myscript """test"""
"" escape to a single " in the parameter.
The 2nd document quoted by Peter Mortensen in his comment on the answer of Codesmith made things much clearer for me. That document was written by windowsinspired.com. The link repeated: A Better Way To Understand Quoting and Escaping of Windows Command Line Arguments.
Some further trial and error leads to the following guideline:
Escape every double quote " with a caret ^. If you want other characters with special meaning to the Windows command shell (e.g., <, >, |, &) to be interpreted as regular characters instead, then escape them with a caret, too.
If you want your program foo to receive the command line text "a\"b c" > d and redirect its output to file out.txt, then start your program as follows from the Windows command shell:
foo ^"a\^"b c^" ^> d > out.txt
If foo interprets \" as a literal double quote and expects unescaped double quotes to delimit arguments that include whitespace, then foo interprets the command as specifying one argument a"b c, one argument >, and one argument d.
If instead foo interprets a doubled double quote "" as a literal double quote, then start your program as
foo ^"a^"^"b c^" ^> d > out.txt
The key insight from the quoted document is that, to the Windows command shell, an unescaped double quote triggers switching between two possible states.
Some further trial and error implies that in the initial state, redirection (to a file or pipe) is recognized and a caret ^ escapes a double quote and the caret is removed from the input. In the other state, redirection is not recognized and a caret does not escape a double quote and isn't removed. Let's refer to these states as 'outside' and 'inside', respectively.
If you want to redirect the output of your command, then the command shell must be in the outside state when it reaches the redirection, so there must be an even number of unescaped (by caret) double quotes preceding the redirection. foo "a\"b " > out.txt won't work -- the command shell passes the entire "a\"b " > out.txt to foo as its combined command line arguments, instead of passing only "a\"b " and redirecting the output to out.txt.
foo "a\^"b " > out.txt won't work, either, because the caret ^ is encountered in the inside state where it is an ordinary character and not an escape character, so "a\^"b " > out.txt gets passed to foo.
The only way that (hopefully) always works is to keep the command shell always in the outside state, because then redirection works.
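The two-state idea is easy to check with a few lines of Python. This is a sketch of the model as I understand it from trial and error, not of cmd.exe itself: the function name active_redirections is made up, and it only looks at the > character.

```python
def active_redirections(line):
    """Return the positions of '>' that the two-state model treats as redirection."""
    found = []
    inside = False  # False = 'outside' state, True = 'inside' state
    i = 0
    while i < len(line):
        ch = line[i]
        if not inside and ch == "^":
            i += 2              # outside: the caret escapes the next character
            continue
        if ch == '"':
            inside = not inside  # an unescaped double quote switches state
        elif ch == ">" and not inside:
            found.append(i)
        i += 1
    return found


print(active_redirections(r'foo "a\"b " > out.txt'))           # []
print(active_redirections(r'foo "a\^"b " > out.txt'))          # []
print(active_redirections(r'foo ^"a\^"b c^" ^> d > out.txt'))  # [21]
```

Only the fully caret-escaped form leaves a > that is seen in the outside state, which matches what I observed.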
If you don't need redirection (or other characters with special meaning to the command shell), then you can do without the carets. If foo interprets \" as a literal double quote, then you can call it as
foo "a\"b c"
Then foo receives "a\"b c" as its combined arguments text and can interpret it as a single argument equal to a"b c.
Now -- finally -- to the original question. myscript '"test"' called from the Windows command shell passes '"test"' to myscript. Apparently myscript interprets the single and double quotes as argument delimiters and removes them. You need to figure out what myscript accepts as a literal double quote and then specify that in your command, using ^ to escape any characters that have special meaning to the Windows command shell. Given that myscript is also available on Unix, perhaps \" does the trick. Try
myscript \^"test\^"
or, if you don't need redirection,
myscript \"test\"
I'm calling PowerShell from cmd and passing quotes, and none of the escapes here worked. The grave accent worked to escape double quotes on this Windows 10 Surface Pro.
>powershell.exe "echo la`"" >> test
>type test
la"
Below are outputs I got for other characters to escape a double quote:
la\
la^
la
la~
Using another quote to escape a quote resulted in no quotes.
As you can see, the characters themselves got typed, but didn't escape the double quotes.
Maybe you came here, because you wonder how to escape quotes that you need in the command that you pass to /c on cmd.exe? Well you don't:
CMD /c "MKDIR "foo bar""
will execute
MKDIR "foo bar"
which is behavior that I really did not expect at first glance.

Why do backslashes prevent alias expansion?

In the first part of my question I will provide some background info as a
service to the community. The second part contains the actual question.
Part I
Assume I've created the following alias:
alias ls='ls -r'
I know how to temporarily unalias (i.e., override this alias) in the following
ways, using:
1) the full pathname of the command: /bin/ls
2) command substitution: $(which ls)
3) the command builtin: command ls
4) double quotation marks: "ls"
5) single quotation marks: 'ls'
6) a backslash character: \ls
Case 1 is obvious and case 2 is simply a variation. The command builtin in case 3 was designed to ignore shell functions, but apparently it also works for circumventing aliases. Finally, cases 4 and 5 are consistent with both the POSIX standard (2.3.1):
"a resulting word that is identified
to be the command name word of a
simple command shall be examined to
determine whether it is an unquoted,
valid alias name."
and the Bash Reference Manual (6.6):
"The first word of each simple
command, if unquoted, is checked to
see if it has an alias."
Part II
Here's the question: why is case 6 (overriding the alias by saying \ls)
considered quoting the word? In keeping with the style of this question, I am looking for references to the "official" documentation.
The documentation says that a backslash only escapes the following
character, as opposed to single and double quotation marks, which quote a
sequence of characters. POSIX standard (2.2.1):
"A backslash that is not quoted shall
preserve the literal value of the
following character, with the
exception of a <newline>"
Bash Reference Manual (3.1.2.1):
"A non-quoted backslash ‘\’ is the
Bash escape character. It preserves
the literal value of the next
character that follows, with the
exception of newline."
(BTW, isn't "the next character that follows" a bit of overkill?)
A possible answer might be that this situation isn't that special: it is
similar to a few cases in ANSI-C quoting, e.g. \nnn. However, that is still
escaping a single character (the eight-bit character whose value is the octal
value nnn), not a sequence of characters.
Historically, and maintained by POSIX, quoting any part of the word causes the entire word to be considered quoted for the purposes of functions and alias expansion. It also applies to quoting the end token for a here document:
cat << \EOF
this $text is fully quoted
EOF
For completeness, here's yet another way to suppress alias & function lookups (by clearing the entire shell environment for a single command):
# cf. http://bashcurescancer.com/temporarily-clearing-environment-variables.html
env -i ls
\ls only quotes the first character rather than the whole word. It's equivalent to writing 'l's.
You can verify it like this:
$ touch \?l
$ \??
bash: ?l: command not found
If \?? quoted the whole word it would say ?? not found rather than ?l not found.
I.e. it has the same effect as:
$ '?'?
bash: ?l: command not found
rather than:
$ '??'
bash: ??: command not found
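One way to see that all of these spellings name the same command once quoting is removed is to run them through Python's shlex module. This is only an approximation of POSIX word splitting, and it knows nothing about aliases; the helper name command_word is made up for this sketch.

```python
import shlex


def command_word(spelling):
    """Return the first word of a command line after shlex's POSIX-style quote removal."""
    return shlex.split(spelling)[0]


for spelling in ["ls", r"\ls", "'ls'", '"ls"', "'l's"]:
    print(repr(spelling), "->", command_word(spelling))
```

Every spelling comes out as ls, so the only difference the shell can act on is whether any part of the word was quoted before quote removal, which is the condition the alias check looks at.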
