Ruby string sub without regex back references - ruby

I'm trying to do a simple string sub in Ruby.
The second argument to sub() is a long piece of minified JavaScript which has regular expressions contained in it. Back references in the regex in this string seem to be affecting the result of sub, because the replaced string (i.e., the first argument) is appearing in the output string.
Example:
input = "string <!--tooreplace--> is here"
output = input.sub("<!--tooreplace-->", "\&")
I want the output to be:
"string \& is here"
Not:
"string & is here"
or if escaping the regex
"string <!--tooreplace--> is here"
Basically, I want some way of doing a string sub that has no regex consequences at all - just a simple string replace.

To avoid having to figure out how to escape the replacement string, use Regexp.escape. It's handy when replacements are complicated, or when escaping them by hand is an unnecessary pain. A little helper on String is nice too.
input.sub("<!--tooreplace-->", Regexp.escape('\&'))

You can also use block notation to make it simpler (as opposed to Regexp.escape):
>> puts input.sub("<!--tooreplace-->") {'\&'}
string \& is here
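The block form works because sub inserts a block's return value as-is, without expanding \& or \1. A minimal sketch of the "little helper" idea, where literal_sub is a hypothetical name and not part of Ruby:

```ruby
# Hypothetical helper: replace the first occurrence of `search` with
# `replacement`, treating both as plain text.
def literal_sub(str, search, replacement)
  # A String pattern is matched literally, and the block's return value
  # is inserted verbatim (no \& or \1 expansion).
  str.sub(search) { replacement }
end

input = "string <!--tooreplace--> is here"
result = literal_sub(input, "<!--tooreplace-->", '\&')
puts result  # prints: string \& is here
```

The same shape should work with gsub if every occurrence has to be replaced.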

Use single quotes and escape the backslash:
output = input.sub("<!--tooreplace-->", '\\\&') #=> "string \\& is here"

Well, since '\\&' (that is, \ followed by &) is being interpreted as a special regex statement, it stands to reason that you need to escape the backslash. In fact, this works:
>> puts 'abc'.sub 'b', '\\\\&'
a\&c
Note that \\\\& represents the literal string \\&.
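To see how the backslashes are counted, it can help to look at the characters sub actually receives. A small sketch (the variable names are only for illustration):

```ruby
# Single-quoted: each \\ in the source is one backslash in the string.
replacement = '\\\\&'
chars = replacement.chars          # the characters sub receives

# In a replacement string, \\ stands for one literal backslash,
# so the & that follows is no longer a backreference.
result = 'abc'.sub('b', replacement)

p chars      # prints: ["\\", "\\", "&"]
puts result  # prints: a\&c
```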

Related

Regexp.escape adds weird escapes to a plain space

I stumbled over this problem using the following simplified example:
line = searchstring.dup
line.gsub!(Regexp.escape(searchstring)) { '' }
My understanding was that, for any String stored in searchstring, the gsub! would leave line empty afterwards. Indeed, this is the case for many strings, but not for this one:
searchstring = "D "
line = searchstring.dup
line.gsub!(Regexp.escape(searchstring)) { '' }
p line
It turns out that line is printed as "D " afterwards, i.e. no replacement was performed.
This happens to any searchstring containing a space. Indeed, if I do a
p(Regexp.escape(searchstring))
for my example, I see "D\\ " being printed, while I would expect to get "D " instead. Is this a bug in the Ruby core library, or did I misuse the escape function?
Some background: In my concrete application, where this simplified example is derived from, I just want to do a literal string replacement inside a long string, in the following way:
REPLACEMENTS.each do |from, to|
  line.chomp!
  line.gsub!(Regexp.escape(from)) { to }
end
I'm using Regexp.escape just as a safety measure in case the string being replaced contains a regex metacharacter.
I'm using the Cygwin port of MRI Ruby 2.6.4.
line.gsub!(Regexp.escape(searchstring)) { '' }
My understanding was that, for any String stored in searchstring, the gsub! would leave line empty afterwards.
Your understanding is incorrect. The guarantee in the docs is
For any string, Regexp.new(Regexp.escape(str))=~str will be true.
This does hold for your example
Regexp.new(Regexp.escape("D "))=~"D " # => 0
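Regexp.escape returns a String, and gsub! matches a String pattern literally instead of compiling it as a regex, so the escaped text "D\ " is searched for verbatim and never found. Wrap it in Regexp.new, so your code should look like this: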
line.gsub!(Regexp.new(Regexp.escape(searchstring))) { '' }
As for why this is the case, there used to be a bug where Regexp.escape would incorrectly handle space characters:
# in Ruby 1.8.4
Regexp.escape("D ") # => "D\\s"
My guess is they tried to keep the fix as simple as possible by replacing 's' with ' '. Technically this does add an unnecessary escape character but, again, that does not break the intended use of the method.
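The difference between passing the escaped String and passing a compiled Regexp can be sketched like this (variable names are illustrative):

```ruby
searchstring = "D "
escaped = Regexp.escape(searchstring)   # a String: D, backslash, space

# String pattern: matched literally, so the backslash itself would have
# to appear in the target. gsub! finds nothing and leaves it unchanged.
as_string = searchstring.dup
as_string.gsub!(escaped) { '' }

# Regexp pattern: the escaped text is compiled, and "\ " matches a space.
as_regexp = searchstring.dup
as_regexp.gsub!(Regexp.new(escaped)) { '' }

p as_string  # prints: "D "
p as_regexp  # prints: ""
```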
This happens to any searchstring containing a space. Indeed, if I do a
p(Regexp.escape(searchstring))
for my example, I see "D\\ " being printed, while I would expect to get "D " instead. Is this a bug in the Ruby core library, or did I misuse the escape function?
This looks to be a bug. In my opinion, whitespace is not a Regexp metacharacter, so there is no need to escape it.
Some background: In my concrete application, where this simplified example is derived from, I just want to do a literal string replacement inside a long string […]
If you want to do literal string replacement, then don't use a Regexp. Just use a literal string:
line.gsub!(from, to)
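One thing to watch: in the two-argument form, the replacement string is still scanned for backslash sequences such as \0 or \1, even when the pattern is a plain String. A sketch that keeps the block from the question for that reason; the replacements hash and the sample line are made up for illustration and stand in for REPLACEMENTS:

```ruby
# Hypothetical replacement table standing in for REPLACEMENTS.
replacements = {
  "a.b" => "x",   # "." is matched literally, not as "any character"
  "D "  => '\0'   # replacement text that looks like a backreference
}

line = "a.b aXb D end\n"
line.chomp!
replacements.each do |from, to|
  # String pattern: no regex metacharacters are interpreted in `from`.
  # Block form: `to` is inserted verbatim, so \0 is not expanded.
  line.gsub!(from) { to }
end

puts line  # prints: x aXb \0end
```

If the replacement texts are known to contain no backslashes, line.gsub!(from, to) behaves the same.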

How can I get a non-interpolation bash escape in Ruby?

In Ruby, backticks run a system command, and their contents are interpolated like a double-quoted string. This is nice, as I can do this:
a = Math.sqrt(20)
`cat #{a}`
But it is also annoying: I sometimes want a \ in my command, and I need to write \\ inside `` because escape sequences are processed as well. How can I avoid this?
Try this
Kernel.`('echo "#{a}"')
Which prints verbatim
#{a}
Fun fact: ` is actually a method on Kernel, and you can call it just like any other method, which means you can pass it a single-quoted string as the argument.
Form the string in a non-escaping context and interpolate it into the backticks:
s = %q{echo "he\ny"}
puts `#{s}`
If you need to juggle with escape characters, you could instead use %q :
system(%q{echo '#&%$][/\'})
#=> #&%$][/\
If you want string interpolation, you can use %Q :
a = 20
system(%Q{echo '#{a}&%$][/\'})
#=> 20&%$][/
Here's a thread about it. Note that you can use any delimiter after %q and %Q : pick one that isn't in your string!
I used system here instead of %x{} or backticks. They're not equivalent, but I just wanted to show the string definition without making the line even more complex than it already is.
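To check what the shell would receive without running anything, the two literals can be compared directly. A sketch with made-up command text, using [ ] as the delimiter since it does not occur in the string:

```ruby
a = 20

# %q: like single quotes. No interpolation, and the backslash is kept.
raw = %q[echo '#{a} \ done']

# %Q: like double quotes. #{a} is interpolated and \\ is one backslash.
cooked = %Q[echo '#{a} \\ done']

puts raw     # prints: echo '#{a} \ done'
puts cooked  # prints: echo '20 \ done'
```

Either string can then be handed to system, or interpolated into backticks as `#{raw}`.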

String literal without need to escape backslash

In C#, I can write backslashes and other special characters without escaping by using @ before a string, but I have to escape double-quotes.
C#
string foo = "The integer division operator in VisualBASIC is written \"a \\ b\"";
string bar = @"The integer division operator in VisualBASIC is written ""a \ b""";
In Ruby, I could use the single-quote string literal, but I'd like to use this in conjunction with string interpolation like "text #{value}". Is there an equivalent in Ruby to @ in C#?
There is a somewhat similar thing available in Ruby, e.g.:
foo = %Q(The integer division operator in VisualBASIC is written "a \\ b" and #{'interpolation' + ' works'})
You can also interpolate strings in it. The only caveat is that you still need to escape the \ character.
HTH
You can use heredoc with single quotes.
foo = <<'_'
The integer division operator in VisualBASIC is written "a \ b";
_
If you want to get rid of the newline character at the end, then chomp it.
Note that this does not work with string interpolation. If you want to insert evaluated expressions into the string, you can apply the % operator (String#%) after you create the string.
foo = <<'_'
text %{value}
_
foo.chomp % {value: "foo"}
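Putting the two pieces together for the sentence from the question might look like the sketch below. The names template and lang are chosen only for illustration:

```ruby
# Single-quoted heredoc: backslashes and double quotes are taken literally.
# .chomp drops the trailing newline the heredoc adds.
template = <<'TEXT'.chomp
The %{lang} integer division operator is written "a \ b"
TEXT

# String#% fills in the named %{...} placeholder afterwards.
foo = template % { lang: "VisualBASIC" }
puts foo  # prints: The VisualBASIC integer division operator is written "a \ b"
```

With this approach a literal percent sign in the text has to be written as %%, since String#% treats % as the start of a format directive.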

Wrap parentheses

I'm trying to wrap parentheses around a string in Ruby, but only if it isn't wrapped yet:
"my string (to_wrap)" => "my string (to_wrap)"
"my string to_wrap" => "my string (to_wrap)"
I've tried something like:
to_wrap = 'to_wrap'
regexp = Regexp.new "(?!\()#{to_wrap}(?!\))"
string.sub(regexp, "(#{to_wrap})")
but it does not work.
Thanks in advance!
You are very close. Your first negative lookaround is a lookahead, though, so it looks at the first character of to_wrap. Make it a lookbehind (and, because the pattern is built in a double-quoted string, double the backslashes so that they reach the regex):
"(?<!\\()#{to_wrap}(?!\\))"
And just to present you with an alternative way to escape parentheses (it's really a matter of taste which one to use, but I find this more easily readable):
"(?<![(])#{to_wrap}(?![)])"

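The lookbehind/lookahead pattern from the answer above can be tried out with a small helper. This is a sketch: wrap_in_parens is a hypothetical name, and the Regexp.escape call is an addition that the answer does not include.

```ruby
# Hypothetical helper: wrap the first occurrence of `to_wrap` that is not
# already preceded by "(" and followed by ")".
def wrap_in_parens(string, to_wrap)
  # Regexp.escape guards against metacharacters inside to_wrap.
  regexp = /(?<![(])#{Regexp.escape(to_wrap)}(?![)])/
  # Block form, so the replacement text is inserted verbatim.
  string.sub(regexp) { "(#{to_wrap})" }
end

wrapped   = wrap_in_parens("my string to_wrap", "to_wrap")
unchanged = wrap_in_parens("my string (to_wrap)", "to_wrap")
puts wrapped    # prints: my string (to_wrap)
puts unchanged  # prints: my string (to_wrap)
```
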
Escaping single and double quotes in a string in ruby?

How can I escape single and double quotes in a string?
I want to escape single and double quotes together. I know how to pass them separately but don't know how to pass both of them.
e.g: str = "ruby 'on rails" " = ruby 'on rails"
My preferred way is to not worry about escaping and instead use %q, which behaves like a single-quote string (no interpolation or character escaping), or %Q for double quoted string behavior:
str = %q[ruby 'on rails" ] # like single-quoting
str2 = %Q[quoting with #{str}] # like double-quoting: will insert variable
See https://docs.ruby-lang.org/en/trunk/syntax/literals_rdoc.html#label-Strings and search for % strings.
Use backslash to escape characters
str = "ruby \'on rails\" "
Here is a complete list:
From http://learnrubythehardway.org/book/ex10.html
You can use Q strings which allow you to use any delimiter you like:
str = %Q|ruby 'on rails" " = ruby 'on rails|
>> str = "ruby 'on rails\" \" = ruby 'on rails"
=> "ruby 'on rails" " = ruby 'on rails"
I would go with a heredoc if I'm starting to have to worry about escaping. It will take care of it for you:
string = <<MARKER
I don't have to "worry" about escaping!!'"!!
MARKER
MARKER delineates the start and end of the string. Start the string on the line after the opening of the heredoc, then end it by putting the delimiter again on its own line.
This does all the escaping needed and converts to a double quoted string:
string
=> "I don't have to \"worry\" about escaping!!'\"!!\n"
I would use just:
str = %(ruby 'on rails ")
Because a bare % behaves like double quotes (or %Q) and allows interpolation of variables in the string.
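A quick way to convince yourself which forms are equivalent is to build the same text four ways. A sketch, with name as an illustrative variable:

```ruby
name = "rails"

a = "ruby 'on #{name}\" "    # double quotes: the " must be escaped
b = %Q[ruby 'on #{name}" ]   # %Q: same behavior, nothing to escape
c = %(ruby 'on #{name}" )    # bare %: same as %Q
d = %q[ruby 'on #{name}" ]   # %q: like single quotes, no interpolation

p a == b && b == c           # prints: true
puts d                       # prints: ruby 'on #{name}" 
```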
Here is an example of how to use %Q[] in a more complex scenario:
%Q[
<meta property="og:title" content="#{@title}" />
<meta property="og:description" content="#{@fullname}'s profile. #{@fullname}'s location, ranking, outcomes, and more." />
].html_safe
One caveat:
Using %Q[] and %q[] for string comparisons is not intuitively safe.
For example, if you load something meant to signify an empty value, like "" or '', you need to compare against the actual escape sequences. Say qvar holds the two characters "" rather than an empty string.
This will evaluate to false
if qvar == "%Q[]"
As will this,
if qvar == %Q[]
While this will evaluate to true
if qvar == "\"\""
I ran into this issue when sending command-line vars from a different stack to my ruby script. Only Gabriel Augusto's answer worked for me.
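The three comparisons can be run side by side. A sketch, assuming qvar arrived as the two quote characters (the result variable names are illustrative):

```ruby
qvar = '""'   # two quote characters, e.g. passed in from a command line

vs_text    = (qvar == "%Q[]")   # compares against the literal text %Q[]
vs_empty   = (qvar == %Q[])     # %Q[] is just the empty string
vs_escaped = (qvar == "\"\"")   # two escaped quote characters
vs_percent = (qvar == %q[""])   # same two characters, written with %q

p vs_text, vs_empty, vs_escaped, vs_percent
# prints false, false, true, true (one per line)
```

So %q and %Q are safe to compare against as long as the quote characters are placed inside the delimiters.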
