How to paste literal words in Tcl - syntax

Is there any syntax trick / feature which would allow me to paste two literal words in Tcl, e.g. to concatenate a braced ({...}) word and a double-quoted ("...") word into a single one?
I'm not asking about set a {foo}; set b "bar\nquux"; set c $a$b or append a $b (I know about those), but about something without intermediate variables or commands, analogous to {*}word (which expands a list into separate words).
I guess that the answer is "no way", but my shallow knowledge of Tcl doesn't allow me to draw such a conclusion.

If you are using a recent Tcl version (8.6.2 or newer) you can use
set c [string cat {foo} "bar\nquux"]
For older versions, you can resort to
set c [format %s%s {foo} "bar\nquux"]

There's no way to do what you're asking for without a command. The syntax of braced words doesn't permit anything directly before or after the braces, and once you have several words you need a command to join them (from the perspective of Tcl's language core, that is what commands do: take some values and produce a value result). Braces in the middle of a word are not a syntax error, but there they stop acting as quote characters. To be clear:
puts a{b} prints a{b} because { is not special in that case and instead becomes part of the value.
puts {a}b is a syntax error. (The only exception to this is {*}, which started as {expand} but that was waaaay too wordy.)
Approaches that work:
Use string cat.
Use a concatenation procedure (e.g., proc strcat {a b} {return $a$b}).
Put both values inside the braces so it is a combined literal, which of course only works if both parts are literals.
Convert the braced part to non-braced (and non-double-quoted) form. This is always possible as every braced string has a non-braced equivalent, but can involve a lot of backslashes.

If your word is a valid list, you can do:
set orig {abc def}
set new [join $orig {}]

Related

Get the same results from string.start_with? and string[ ]

Basically, I want to check if a string (main) starts with another string (sub), using both of the above methods. For example, following is my code:
main = gets.chomp
sub = gets.chomp
p main.start_with? sub
p main[/^#{sub}/]
And, here is an example with I/O - Try it online!
If I enter simple strings, then both of them work exactly the same, but when I enter strings like "1\2" in stdin, I get errors in the Regexp variant, as seen in the TIO example.
I guess this is because the string passed into the second one isn't raw. So, I tried passing sub.dump into the second one - Try it online!
which gives me nil result. How to do this correctly?
As a general rule, you should never ever blindly execute inputs from untrusted sources.
Interpolating untrusted input into a Regexp is not quite as bad as interpolating it into, say, Kernel#eval, because the worst thing an attacker can do with a Regexp is to construct an Evil Regex to conduct a Regular expression Denial of Service (ReDoS) attack (see also the section on Performance in the Regexp documentation), whereas with eval, they could execute arbitrary code, including but not limited to deleting the entire file system, scanning memory for unencrypted passwords / credit card information / PII and exfiltrating that via the network, etc.
However, it is still a bad idea. For example, when I say "the worst thing that can happen is a ReDoS", that assumes that there are no bugs in the Regexp implementation (Onigmo in the case of YARV, Joni in the case of JRuby and TruffleRuby, etc.). Ruby's Regexps are quite powerful, and thus Onigmo, Joni and co. are large and complex pieces of code that may very well have their own security holes which could be exploited by a specially crafted Regexp.
You should properly sanitize and escape the user input before constructing the Regexp. Thankfully, the Ruby core library already contains a method which does exactly that: Regexp::escape. So, you could do something like this:
p main[/^#{Regexp.escape(sub)}/]
The reason why your attempt at using String#dump didn't work, is that String#dump is for representing a String the same way you would have to write it as a String literal, i.e. it is escaping String metacharacters, not Regexp metacharacters and it is including the quote characters around the String that you need to have it recognized as a String literal. You can easily see that when you simply try it out:
sub.dump
#=> "\"1\\\\2\""
# i.e. the six characters "1\\2", including the surrounding quotes
So, that means that String#dump
includes the quotes (which you don't want),
escapes characters that don't need escaping in a Regexp just because they need escaping in Strings (e.g. # or "), and
doesn't escape characters that need escaping in a Regexp but not in Strings (e.g. [, ., ?, *, +, ^, -).
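A small sketch of the difference, using hypothetical hard-coded strings in place of the values read from stdin. The helper name starts_with_regexp? is made up for illustration. It also uses \A rather than ^, since ^ in Ruby matches at the start of every line rather than only at the start of the string.

```ruby
# Hypothetical helper (not from the question): prefix test via an escaped Regexp.
def starts_with_regexp?(main, sub)
  # Regexp.escape neutralises regexp metacharacters in sub;
  # \A anchors the match at the start of the string.
  main.match?(/\A#{Regexp.escape(sub)}/)
end

main = '1\2 and more'  # single quotes keep the backslash as a literal character
sub  = '1\2'

p main.start_with?(sub)           # plain prefix test
p starts_with_regexp?(main, sub)  # same answer via an escaped Regexp
p main[/\A#{sub.dump}/]           # nil here: dump adds quotes that main lacks
p Regexp.escape(sub)              # the backslash is doubled for the regexp engine
```

If the goal is only a yes/no prefix test, start_with? already does the job without building a Regexp at all.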

Escape characters in bash & expect script [duplicate]

I am using Tcl_StringCaseMatch function in C++ code for string pattern matching. Everything works fine until input pattern or string has [] bracket. For example, like:
str1 = pq[0]
pattern = pq[*]
Tcl_StringCaseMatch is not working, i.e. it returns false for the above inputs.
How to avoid [] in pattern matching?
The problem is that [] are special characters in pattern matching. You need to escape them with a backslash to have them treated like plain characters:
pattern= "pq\\[*\\]"
I don't think the string itself needs this treatment. The reason for the double backslash is that you want the C++ compiler to pass a single backslash through to the Tcl engine.
For the casual reader:
[] also have a special meaning in Tcl scripts in general, beyond the pattern-matching role they take here: command substitution (like `` or $() in shells). That does not come into play for a string handed to Tcl_StringCaseMatch from C++, and only the pattern argument is interpreted as a pattern, so the string str1 does not need escaping here.
For extra confusion:
Tcl will interpret ] with no preceding [ as a normal character by default. I feel that's getting too confusing, and would rather Tcl complained about unbalanced brackets. As the OP mentions though, this allows you to forgo the final two backslashes and use "pq\\[*]". I dislike this and would rather make it obvious that both brackets are treated as plain characters and not in the usual Tcl way, but to each their own.

Semantic differences between percent literals and heredocs in Ruby?

Looking at some documentation, I saw a multiline string defined using a percent literal:
command %Q{
do this;
do that;
}
In the past, I've always used heredocs when I needed multiline strings:
command <<-heredoc
echo "stuff" | do stuff;
heredoc
What are the semantic differences between them? Is there any reason why I would want to use %Q and not a heredoc?
I tend to evaluate how much text is being used when deciding which to use.
I use %Q when there's not a lot of text (for example, a single line), e.g. %Q|foobar|. The value that %Q provides is that it allows you to easily mix quotes, e.g.
%Q|"Get a Job" ~Mom's words|
I use "heredoc"s when there is a lot of text that spans multiple lines.
For example, suppose you're pasting a lot of text into a REPL (like the content of a YAML file). Unless you traverse the whole file, you can't be certain whether or not you will have a conflict with whatever %Q separator you have chosen. With a "heredoc" you just use some really obscure piece of text that you're fairly certain will not have a conflict, e.g.
<<-BatMobilePrettyObscure
... Lots of text ...
BatMobilePrettyObscure
As far as I know, semantically, there are just a few small differences:
%Q can only use one character to delimit strings
%Q can be multi-line or single-line
"heredoc"s must be multi-line, with the closing "heredoc" delimiter standing alone on its own line
%Q delimiters can be "mashed" up against their strings, e.g. %Q|foobar|
There's a funky trick that you can use with heredocs: the first line can be used as if it was a complete string. For example, all of the following examples are valid Ruby code:
puts(<<-EOS)
Hello, world!
EOS
<<-EOS.upcase
Hello, world!
EOS
puts(<<-EOS.upcase)
Hello, world!
EOS
However, you will not find that very often in the wild. Other than that, they are the same as double quoted strings or %Q{} and %{} literals, except that you can choose multi-character delimiters. This comes in handy when all of the possible percent literal delimiters may occur in the string. This especially applies to long strings.
There isn't really a semantic difference, and it doesn't have to do with multiline strings either. All strings can be multiline in Ruby. These are all the same string:
'a
b
'
"a
b
"
%Q{a
b
}
<<-heredoc
a
b
heredoc
The question of which to use is decided by whether you need interpolation and the convenience of escaping characters. For example:
Do you need interpolation? If not then '' or %q()
Will there be lots of quote characters to escape? Then use %Q()
Do you want to write a lot of text without thinking about escaping characters? Use heredocs.
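The choices above can be checked with a short sketch. The variable names and the sample text are made up for illustration; the point is only which forms interpolate and that the interpolating forms produce the same string.

```ruby
name = "world"

# Non-interpolating forms: #{name} stays in the string as literal text.
single        = 'Hello, #{name}!'
percent_lower = %q(Hello, #{name}!)

# Interpolating forms: #{name} is replaced by the value of name.
double        = "Hello, #{name}!"
percent_upper = %Q(Hello, #{name}!)
heredoc       = <<-EOS.chomp
Hello, #{name}!
EOS

# %Q with a delimiter that does not occur in the text avoids escaping quotes.
quoted = %Q|"Get a Job" ~Mom's words|

p single == percent_lower                         # both kept the placeholder text
p double == percent_upper && double == heredoc    # all three interpolated
puts quoted
```

The heredoc needs .chomp here only because a heredoc body always ends with a newline, which the one-line literals do not have.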

Bash variable concatenation

Which form is most efficient?
1)
v=''
v+='a'
v+='b'
v+='c'
2)
v2='a'` `'b'` `'c'
Assuming readability were exactly the same to you, and that's a stretch, would 1) mean creating and throwing away a few string immutables (like in Python) or act as a Java "StringBuffer" with periodical expansion of the buffer capacity? How are string concatenations handled internally in Bash?
If 2) were just as readable to you as 1), would the backticks spawn subshells and would that be more costly, even as a potential 'no-op' than what is done in 1) ?
Well, the simplest and most efficient mechanism would be option 0:
v="abc"
The first mechanism involves four assignments.
The second mechanism is bizarre (and is definitely not readable). It (nominally) runs an empty command in two sub-shells (the two ` ` parts) and concatenates the outputs (an empty string) with the three constants. If the shell simply executes the back-tick commands without noting that they're empty (and it's not unreasonable that it won't notice; it is a weird thing to try — I don't recall seeing it done in my previous 30 years of shell scripting), this is definitely vastly slower.
So, given only options (1) and (2), use option (1), but in general, use option (0) shown above.
Why would you be building up the string piecemeal like that? What's missing from your example that makes the original code sensible but the reduced code shown less sensible?
v=""
x=$(...)
v="$v$x"
y=$(...)
v="$v$y"
z=$(...)
v="$v$z"
This would make more sense, especially if you use each of $x, $y and $z later, and/or use intermediate values of $v (perhaps in the commands represented by triple dots). The concatenation notation used will work with any Bourne-shell derivative; the alternative += notation will work with fewer shells, but is probably slightly more efficient (with the emphasis on 'slightly').
The portable and straightforward method would be to use double quotes and curly brackets for variables:
VARA="beginning text ${VARB} middle text ${VARC}..."
You can even set default values for unset or empty variables this way (note that the ${VARC:0:3} substring form is a Bash extension rather than POSIX):
VARA="${VARB:-default text} substring manipulation 1st 3 characters ${VARC:0:3}"
Using the curly brackets prevents situations where there is a $VARa and you want to write ${VAR}a but end up getting the contents of ${VARa}.

VBScript SQL select query clarification and variable substitution

I have read this entry (http://stackoverflow.com/questions/8513185/vbscript-to-correctly-re-format-a-delimited-text-file) many times and still do not understand the .Execute section.
WScript.Echo oTDb.Execute(Replace("SELECT * FROM [#T]", "#T", sTbl1)) _
.GetString( adClipString, , "|", vbCrLf, "" )
The pieces I am having trouble with are the [#T] and "#T".
I know that "#T" is what gets replaced with the filename from the schema file, and that [#T] must be using "#T" as a placeholder. What I cannot find out is where this is documented.
Some additional questions I have are:
1. If the filename can be substituted with a variable then what else can?
2. What are the rules for maintaining variables
Do they have to start with the # symbol
Are there any reserved words
If they have to start with the # symbol, does the next character have to be a letter
As I am responsible for #Milton's worry/puzzlement:
There is no variable interpolation/substitution in VBScript. Other languages - e.g. Perl - will splice variables or even expression results into string literals when you mark the replacements with special symbols. No such funny letters in VBScript.
SQL dialects allow parameterized commands in which the parts to be replaced are marked by ? and/or names prefixed by symbols like #. But here ADO never sees the #T: VBScript's Replace() function has spliced in the table name before the resulting string is sent to .Execute().
Building complex strings from parts (SQL statements, commandlines for .Run or .Exec, ...) by concatenation is cumbersome. The most important drawback is that you can't (proof) read the string anymore for all those " and &.
A simple workaround is to use Replace(), as in
[sResult = ] Replace("SELECT * FROM [#T]", "#T", sTbl1)
I used the # just for letting the placeholder stand out. As you would have to stack/nest the Replace() calls when you need more substitutions on the template, other strategies are worth considering:
writing a function that takes a template string and a dictionary of replacements to apply Regexp.Replace() to the string
using .NET's System.Text.StringBuilder and its .AppendFormat to do the splicing in a sprintf-like style
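The first strategy (a template plus a dictionary of replacements) might look roughly like the following. This is a sketch in Ruby rather than VBScript, and the helper name fill_template, the #Name placeholder convention, and the sample values are all assumptions made for illustration.

```ruby
# Hypothetical helper: replaces every #Name placeholder in the template
# with the value stored under "Name" in the replacements Hash.
def fill_template(template, replacements)
  template.gsub(/#(\w+)/) do
    key = Regexp.last_match(1)
    # Fail loudly instead of leaving an unknown placeholder in the result.
    replacements.fetch(key) { raise KeyError, "no replacement for placeholder #{key}" }
  end
end

# One pass handles any number of placeholders, with no nested Replace() calls.
sql = fill_template("SELECT #F FROM [#T]", "F" => "*", "T" => "data.txt")
puts sql
```

As with Replace(), the values are spliced into the string as-is, so this is only suitable for replacements that come from a trusted source.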

Resources