Can a bash function be used to fully escape any string (including nested single quotes)?

I know that there are a number of ways to manually escape nested quotes of the same kind.
Example
echo 'this single quote '"'"' is escaped.'
There are several more ways to do this that are already well documented on StackOverflow, but I'm trying to take that concept and apply it as a function that escapes everything. I have tried several times, without success, to find an all-in-one escaping solution for Bash (e.g. I've looked for a string library that includes escaping, with no luck).
Here's one of my attempts:
function quote() {
printf '%s' "'${1//\'/\'"\'"\'}'"
}
The idea is that all single quotes will be replaced with the aforementioned escape style '"'"'. I've also tried doing the same thing using the escape style like so: \'. No luck.
Every response on this topic in my research that I've come across has made it sound like this isn't possible, but I'm not the type to accept that for an answer. To me, if you can echo a string like this: 'test \' test', then it should be expressible in a function too where the backslashes or otherwise are automatically added to escape the characters.
echo '~!@#$%^&*()_+`1234567890-=qwertyuiop[]\QWERTYUIOP{}|ASDFGHJKL:"ZXCVBNM<>?zxcvbnm,./'
As shown above, none of the basic keys on a traditional English keyboard need to be escaped, but nesting the same type of quotes within itself requires it.
Can anyone shed some light on this? Am I missing something obvious or is this really that difficult?

You don't need to mimic the shell's quoting; properly quoting the variables should be enough.
#! /bin/bash
tag () {
echo Setting tags to "$2".
}
while IFS= read -r tags ; do
tag --set "$tags"
done <<EOF
tag1,tag2,tag3
Tom's_Shoes
The_"best"
EOF

If the tags are in variables or an array and you use proper quoting, you shouldn't need to do any extra escaping. See @choroba's answer.
But there are times when you need it, and this will do the trick:
printf -v my_var '%q' "$my_var"
which replaces the contents of my_var with a shell-escaped version.
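For the cases where an escaped form really is needed, the two approaches discussed here can be sketched roughly as follows. This is a minimal sketch rather than a definitive implementation: the helper name quote_single and the sample string are made up for illustration, and it assumes bash (for ${var//pattern/replacement} and printf -v). Each embedded single quote is replaced with '\'' (close quote, escaped quote, reopen quote), and both results are fed back through eval to check that they round-trip.

```bash
#!/usr/bin/env bash

# Hypothetical helper: wrap $1 in single quotes, turning every embedded
# single quote into '\'' (close quote, escaped quote, reopen quote).
quote_single() {
    local s=$1
    s=${s//\'/\'\\\'\'}
    printf "'%s'" "$s"
}

raw="Tom's \"best\" \$HOME"

quoted=$(quote_single "$raw")     # form produced by the helper
printf -v escaped '%q' "$raw"     # form produced by bash's own %q

# Both forms should evaluate back to the original string.
eval "from_quoted=$quoted"
eval "from_escaped=$escaped"
printf '%s\n' "$from_quoted" "$from_escaped"
```

The two escaped forms look different, but that does not matter: all that counts is that the shell parses each of them back into the same string.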

Related

How to create new names for files with problematic characters for use in an existing bash scripted environment?

The goal is to get rid of (by changing) filenames that give headaches for scripting by translating them to something else. The reason is that in this nearly 30 year Unix / Linux environment, with a lot of existing scripts that may not be "written correctly", a new, large and important cache of files arrived that have to be managed, and so, a colleague has asked me to write a script to help with "problematic filenames" and translate them. They've got a list of chars to turn into dots, such as the comma, and another list to turn into underscores, such as whitespace, as but two examples and ran into problems which I asked about over here.
I was using tr to do it, but commenters to it said I should perhaps ask just about this instead of how to get tr to work. So, I have!
Parameter expansion can do this for you.
Note that unlike when using tr (as requested on your other question), when using parameter expansion you don't need to use backslashes inside your character class definitions: put the expansion in double quotes and bash will treat the results of that expansion as literal.
#!/usr/bin/env bash
toDots='\,;:|+#@$%^&*~'
toUnderscores='}{]['"'"'="()`!'
# requires bash 5+: if debug=1, then print what we would do instead of doing it
runOrDebug() {
if (( debug )); then
printf '%s\n' "${*@Q}"
else
"$@"
fi
}
renameFiles() {
local name subDots subBoth
for name; do
subDots=${name//["$toDots"]/.}
subBoth=${subDots//["$toUnderscores"]/_}
if [[ $subBoth != "$name" ]]; then
runOrDebug mv -- "$name" "$subBoth"
fi
done
}
debug=1 renameFiles '[/a],/;[p:r|o\b+lem#a#t$i%c]/#(%$^!/(e^n&t*ry)~='
Note that toUnderscores is (except for the single quote in the middle) in single quotes, so all the backslashes in it are part of the variable's data rather than being syntax; because globs use character class syntax from REs, they're parsed as POSIX regular expression character class syntax.
See a demonstration of the technique running at https://ideone.com/kKE7IJ
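To see the substitution step in isolation, here is a small sketch that only computes the new name and does not rename anything. The function name sanitize_name and the shortened character lists are assumptions made for this example; it relies on bash's ${var//pattern/replacement} expansion.

```bash
#!/usr/bin/env bash

# Hypothetical, shortened character lists for illustration only.
to_dots=',;:'
to_underscores=' ()'

# Print the translated name for $1 without touching the filesystem.
sanitize_name() {
    local name=$1
    name=${name//["$to_dots"]/.}          # every listed char becomes a dot
    name=${name//["$to_underscores"]/_}   # every listed char becomes an underscore
    printf '%s\n' "$name"
}

sanitize_name 'my file (v1),final:ok'   # prints my_file__v1_.final.ok
```

Because the expansion inside the brackets is double-quoted, the characters in the lists are matched literally and need no backslashes.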

zip exclude subfolder passed as argument or variable [duplicate]

I want to run a command from a bash script which has single quotes and some other commands inside the single quotes and a variable.
e.g. repo forall -c '....$variable'
In this format, $ is escaped and the variable is not expanded.
I tried the following variations but they were rejected:
repo forall -c '...."$variable" '
repo forall -c " '....$variable' "
" repo forall -c '....$variable' "
repo forall -c "'" ....$variable "'"
If I substitute the value in place of the variable the command is executed just fine.
Please tell me where I am going wrong.
Inside single quotes everything is preserved literally, without exception.
That means you have to close the quotes, insert something, and then re-enter again.
'before'"$variable"'after'
'before'"'"'after'
'before'\''after'
Word concatenation is simply done by juxtaposition. As you can verify, each of the above lines is a single word to the shell. Quotes (single or double quotes, depending on the situation) don't isolate words. They are only used to disable interpretation of various special characters, like whitespace, $, ;... For a good tutorial on quoting see Mark Reed's answer. Also relevant: Which characters need to be escaped in bash?
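One way to verify the "single word" claim is to count the arguments a function receives. This is only a sketch; count_args is a made-up helper and the default IFS is assumed.

```bash
#!/usr/bin/env bash

# Hypothetical helper: print how many arguments it was given.
count_args() { printf '%s\n' "$#"; }

variable='two words'

count_args 'before'"$variable"'after'            # prints 1: one word despite the space
count_args 'before'$variable'after'              # prints 2: the unquoted expansion is split
count_args 'before'"'"'after' 'before'\''after'  # prints 2: each form is a single word
```

The second call shows why the expansion itself has to sit inside double quotes: without them the value is split on whitespace before the command sees it.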
Do not concatenate strings interpreted by a shell
You should absolutely avoid building shell commands by concatenating variables. This is a bad idea similar to concatenation of SQL fragments (SQL injection!).
Usually it is possible to have placeholders in the command, and to supply the command together with variables so that the callee can receive them from the invocation arguments list.
For example, the following is very unsafe. DON'T DO THIS
script="echo \"Argument 1 is: $myvar\""
/bin/sh -c "$script"
If the contents of $myvar is untrusted, here is an exploit:
myvar='foo"; echo "you were hacked'
Instead of the above invocation, use positional arguments. The following invocation is better -- it's not exploitable:
script='echo "arg 1 is: $1"'
/bin/sh -c "$script" -- "$myvar"
Note the use of single ticks in the assignment to script, which means that it's taken literally, without variable expansion or any other form of interpretation.
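The difference between the two invocations can be observed directly. The sketch below uses printf instead of echo and the variable names unsafe_out and safe_out, which are assumptions for this example; it assumes an sh binary is on the PATH.

```bash
#!/usr/bin/env bash

myvar='foo"; echo "you were hacked'

# Unsafe: the value is pasted into the script text, so its quote and
# semicolon are parsed as shell syntax by the inner shell.
unsafe="printf '%s\n' \"Argument 1 is: $myvar\""
unsafe_out=$(sh -c "$unsafe")

# Safe: the script text is constant and the value arrives as $1.
safe='printf "%s\n" "arg 1 is: $1"'
safe_out=$(sh -c "$safe" -- "$myvar")

printf 'unsafe:\n%s\n\nsafe:\n%s\n' "$unsafe_out" "$safe_out"
```

In the unsafe case the injected echo runs as a second command; in the safe case the whole value is printed as data.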
The repo command can't care what kind of quotes it gets. If you need parameter expansion, use double quotes. If that means you wind up having to backslash a lot of stuff, use single quotes for most of it, and then break out of them and go into doubles for the part where you need the expansion to happen.
repo forall -c 'literal stuff goes here; '"stuff with $parameters here"' more literal stuff'
Explanation follows, if you're interested.
When you run a command from the shell, what that command receives as arguments is an array of null-terminated strings. Those strings may contain absolutely any non-null character.
But when the shell is building that array of strings from a command line, it interprets some characters specially; this is designed to make commands easier (indeed, possible) to type. For instance, spaces normally indicate the boundary between strings in the array; for that reason, the individual arguments are sometimes called "words". But an argument may nonetheless have spaces in it; you just need some way to tell the shell that's what you want.
You can use a backslash in front of any character (including space, or another backslash) to tell the shell to treat that character literally. But while you can do something like this:
reply=\"That\'ll\ be\ \$4.96,\ please,\"\ said\ the\ cashier
...it can get tiresome. So the shell offers an alternative: quotation marks. These come in two main varieties.
Double-quotation marks are called "grouping quotes". They prevent wildcards and aliases from being expanded, but mostly they're for including spaces in a word. Other things like parameter and command expansion (the sorts of thing signaled by a $) still happen. And of course if you want a literal double-quote inside double-quotes, you have to backslash it:
reply="\"That'll be \$4.96, please,\" said the cashier"
Single-quotation marks are more draconian. Everything between them is taken completely literally, including backslashes. There is absolutely no way to get a literal single quote inside single quotes.
Fortunately, quotation marks in the shell are not word delimiters; by themselves, they don't terminate a word. You can go in and out of quotes, including between different types of quotes, within the same word to get the desired result:
reply='"That'\''ll be $4.96, please," said the cashier'
So that's easier - a lot fewer backslashes, although the close-single-quote, backslashed-literal-single-quote, open-single-quote sequence takes some getting used to.
Modern shells have added another quoting style not specified by the POSIX standard, in which the leading single quotation mark is prefixed with a dollar sign. Strings so quoted follow similar conventions to string literals in the ANSI standard version of the C programming language, and are therefore sometimes called "ANSI strings" and the $'...' pair "ANSI quotes". Within such strings, the above advice about backslashes being taken literally no longer applies. Instead, they become special again - not only can you include a literal single quotation mark or backslash by prepending a backslash to it, but the shell also expands the ANSI C character escapes (like \n for a newline, \t for tab, and \xHH for the character with hexadecimal code HH). Otherwise, however, they behave as single-quoted strings: no parameter or command substitution takes place:
reply=$'"That\'ll be $4.96, please," said the cashier'
The important thing to note is that the single string that gets stored in the reply variable is exactly the same in all of these examples. Similarly, after the shell is done parsing a command line, there is no way for the command being run to tell exactly how each argument string was actually typed – or even if it was typed, rather than being created programmatically somehow.
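The claim that all four spellings store the same string can be checked with a short script. This is a sketch assuming bash (the $'...' form is not POSIX); the variable names are arbitrary.

```bash
#!/usr/bin/env bash

# The same value written with four different quoting styles.
backslashes=\"That\'ll\ be\ \$4.96,\ please,\"\ said\ the\ cashier
double="\"That'll be \$4.96, please,\" said the cashier"
single='"That'\''ll be $4.96, please," said the cashier'
ansi=$'"That\'ll be $4.96, please," said the cashier'

printf '%s\n' "$backslashes" "$double" "$single" "$ansi"

# Report whether every variant matches the first one.
if [[ $backslashes == "$double" && $double == "$single" && $single == "$ansi" ]]; then
    echo "all four are identical"
fi
```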
Below is what worked for me -
QUOTE="'"
hive -e "alter table TBL_NAME set location $QUOTE$TBL_HDFS_DIR_PATH$QUOTE"
EDIT: (As per the comments in question:)
I've been looking into this since then. I was lucky enough that I had repo lying around. Still, it's not clear to me whether you are forced to enclose your commands in single quotes. I looked into the repo syntax and I don't think you need to. You could use double quotes around your command, and then use whatever single and double quotes you need inside, provided you escape the double ones.
Just use printf.
Instead of
repo forall -c '....$variable'
use printf to replace the placeholder with the expanded variable.
For example:
template='.... %s'
repo forall -c "$(printf "${template}" "${variable}")"
Variables can contain single quotes.
myvar=\'....$variable\'
repo forall -c $myvar
I was wondering why I could never get my awk statement to print from an ssh session so I found this forum. Nothing here helped me directly but if anyone is having an issue similar to below, then give me an up vote. It seems any sort of single or double quotes were just not helping, but then I didn't try everything.
check_var="df -h / | awk 'FNR==2{print $3}'"
getckvar=$(ssh user@host "$check_var")
echo $getckvar
What do you get? A load of nothing.
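Fix: escape the dollar sign (\$3) in the awk print statement, so that the local shell does not expand $3 inside the double quotes before the command is sent.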
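The effect can be reproduced without ssh by looking at the string that would be sent. In this sketch a local sh -c stands in for the remote shell and the input lines are made up; build_commands is a hypothetical wrapper used only so that $3 is guaranteed to be empty.

```bash
#!/usr/bin/env bash

build_commands() {
    # Called with no arguments, so $3 is empty inside this function.
    broken="awk 'FNR==2{print $3}'"    # $3 is expanded locally and vanishes
    fixed="awk 'FNR==2{print \$3}'"    # \$3 reaches awk untouched
}
build_commands

printf '%s\n' "$broken" "$fixed"

# Stand-in for the remote side: run the fixed command on two sample lines.
result=$(printf 'h1 h2 h3\nv1 v2 v3\n' | sh -c "$fixed")
printf '%s\n' "$result"
```

The single quotes inside check_var do not protect $3, because they are ordinary characters within the surrounding double quotes.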
Does this work for you?
eval repo forall -c '....$variable'

variable substitution removing quotes

I seem to have some difficulty getting what I want to work. Basically, I have a series of variables that are assigned strings with some quotes and \ characters. I want to remove the quotes to embed them inside a json doc, since json hates quotes using python dump methods.
I figured it would be easy. Just determine how to remove the characters easy and then write a simple for loop for the variable substitution, well it didn't work that way.
Here is what I want to do.
There is a variable called "MESSAGE23", it contains the following "com.centrify.tokend.cac", I want to strip out the quotes, which to me is easy, a simple echo $opt | sed "s/\"//g". When I do this from the command line:
$> MESSAGE23="com."apple".cacng.tokend is present"
$> MESSAGE23=`echo $MESSAGE23 | sed "s/\"//g"`
$> com.apple.cacng.tokend is present
This works. I get the properly formatted string.
When I then try to throw this into a loop, all hell breaks loose.
for i to {1..25}; do
MESSAGE$i=`echo $MESSAGE$i | sed "s/\"//g"`
done
This doesn't work (either it throws a bunch of indexes out or nothing), and I'm pretty sure I just don't know enough about arg or eval or other bash substitution variables.
But basically I want to do this for another set of variables with the same problems, where I strip out the quotes and incidentally the "\" too.
Any help would be greatly appreciated.
You can't do that. You could make it work using eval, but that introduces another level of quoting you have to worry about. Is there some reason you can't use an array?
MESSAGE=("this is MESSAGE[0]" "this is MESSAGE[1]")
MESSAGE[2]="I can add more, too!"
for (( i=0; i<${#MESSAGE[@]}; ++i )); do
echo "${MESSAGE[i]}"
done
Otherwise you need something like this:
eval 'echo "$MESSAGE'"$i"'"'
and it just gets worse from there.
First, a couple of preliminary problems: MESSAGE23="com."apple".cacng.tokend is present" will not embed double-quotes in the variable value; use MESSAGE23="com.\"apple\".cacng.tokend is present" or MESSAGE23='com."apple".cacng.tokend is present' instead. Second, you should almost always put double-quotes around variable expansions (e.g. echo "$MESSAGE23") to prevent parsing oddities.
Now, the real problems: the shell doesn't allow variable substitution on the left side of an assignment (i.e. MESSAGE$i=something won't work). Fortunately, it does allow this in a declare statement, so you can use that instead. Also, when the shell sees $MESSAGE$i it replaces it with the value of $MESSAGE followed by the value of $i; to get the value of the variable whose name is built that way, you need to use indirect expansion (${!metavariable}).
for i in {1..25}; do
varname="MESSAGE$i"
declare $varname="$(echo "${!varname}" | tr -d '"')"
done
(Note that I also used tr instead of sed, but that's just my personal preference.)
(Also, note that @Mark Reed's suggestion of an array is really the better way to do this sort of thing.)
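As a variation on the loop above, the stripping can also be done without a pipeline. This is a sketch under a few assumptions: it uses bash's printf -v for the assignment instead of declare, parameter expansion instead of tr, and two made-up sample values; it removes both double quotes and backslashes, as the question asks.

```bash
#!/usr/bin/env bash

# Sample values (invented for illustration).
MESSAGE1='com."apple".cacng.tokend is present'
MESSAGE2='path\to\"thing"'

for i in 1 2; do
    varname="MESSAGE$i"
    value=${!varname}               # indirect expansion: value of MESSAGE1, MESSAGE2, ...
    value=${value//\"/}             # drop every double quote
    value=${value//\\/}             # drop every backslash
    printf -v "$varname" '%s' "$value"   # assign back to the named variable
done

printf '%s\n' "$MESSAGE1" "$MESSAGE2"
```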

Which quotes does a programmer need?

My keyboard only has normal quotes, not the smart ones.
I have observed that I need normal ones in CGI development and the backward ones in AWK/SED.
Is there any rule when I should use smart quotes, normal ones and backward ones?
Obviously, I need to edit my keyboard layout to get the smart quotes.
If you mean ` by smart quotes, then that is actually called "backquote". Smart quotes are when you type ' and ", but get ‘ and ’ or “ and ” automatically depending on the context. I'm not sure how you would use smart quotes in awk or sed.
In the shell, backquotes, such as `command`, are used to evaluate a command and substitute the result of the command within them into the shell expression being evaluated; this can be used to compute an argument to another command, or to set a variable. For less ambiguity, you can instead use $(command), which makes a lot of quoting rules easier to work out.
In the shell, ' and " are also different. " is used for strings in which you want variable substitution and escape sequences. ' represents a string containing just the characters within the quotes, with no variable interpolation or escape sequences.
So, for example:
$ name=world
$ echo "Hello, $name"
Hello, world
$ echo 'Hello, $name'
Hello, $name
$ echo "Testing \\ escapes"
Testing \ escapes
$ echo 'Testing \\ escapes'
Testing \\ escapes
$ echo `ls`
example-file another-example
$ echo 'ls'
ls
$ echo "ls"
ls
Other scripting languages, such as Perl and Ruby, have similar rules, though there may be slight differences.
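The remark that $(command) makes the quoting rules easier shows up most clearly when substitutions are nested. A minimal sketch, with arbitrary variable names:

```bash
#!/usr/bin/env bash

# With $(...), the inner substitution needs no extra escaping.
nested_new=$(echo "outer $(echo inner)")

# With backquotes, the inner pair has to be backslash-escaped.
nested_old=`echo "outer \`echo inner\`"`

printf '%s\n' "$nested_new" "$nested_old"
```

Both assignments produce the same string; only the amount of escaping differs.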
Smart quotes are for beautiful typesetting. They have nothing to do with programming.
Edit: the quotes you do need.
Double quotes: " " they are used for literal strings in many languages
Single quotes: ' ' used for literal characters in some languages like C and for strings in languages like javascript and php. (For example if you need to print a string "foo", you could use '"foo"')
Back quotes: ` ` are used in UNIX shells to indicate substitution of the standard output from one command into a line of text defining another command. For example, echo `date` might execute echo Sat Mar 1 09:43:00 GMT 2008 and print Sat Mar 1 09:43:00 GMT 2008.
Backquotes are used a lot in shell/awk/perl programming, and when doing documents in TeX. Other than that, you probably won't use them much.
Smart quotes are the devil.
As far as I know, no language requires (or necessarily even supports) "smart quotes" unless you are calling the backtick character ` a smart quote. If that's the case, many languages support the backtick. For example, both bash and ruby use the backtick for command substitution.
To answer the question Is there any rule when I should use smart quotes and normal ones?, yes, there is a rule (again, assuming you mean the backtick when you say "smart quotes"). In most languages, different types of quoting give you different types of behavior. The rule is, learn what the behavior is for that particular language then pick the quote that gives you that behavior.
Smart quotes are a word processor feature. When you type "quote" it gets automatically replaced with “quote” or „quote”. I think you got your nomenclature wrong.
Just an FYI: another term for what you call 'smart quotes' (a usage I had never heard before) is grave accent.
I think the rules have been laid out pretty clearly in previous answers.
$ /usr/games/fortune