Another way instead of escaping regex patterns? - ruby

Usually when my regex patterns look like this:
http://www.microsoft.com/
Then i have to escape it like this:
string.match(/http:\/\/www\.microsoft\.com\//)
Is there another way instead of escaping it like that?
I want to be able to just use it like this http://www.microsoft.com, cause I don't want to escape all the special characters in all my patterns.

Regexp.new(Regexp.quote('http://www.microsoft.com/'))
Regexp.quote simply escapes any characters that have special regexp meaning; it takes and returns a string. Note that . is also special. After quoting, you can append to the regexp as needed before passing to the constructor. A simple example:
Regexp.new(Regexp.quote('http://www.microsoft.com/') + '(.*)')
This adds a capturing group for the rest of the path.

You can also use arbitrary delimiters in Ruby for regular expressions by using %r and defining a character before the regular expression, for example:
%r!http://www.microsoft.com/!

Regexp.quote or Regexp.escape can be used to automatically escape things for you:
https://ruby-doc.org/core/Regexp.html#method-c-escape
The result can be passed to Regexp.new to create a Regexp object, and then you can call the object's .match method and pass it the string to match against (the opposite order from string.match(/regex/)).

You can simply use single quotes for escaping.
string.match('http://www.microsoft.com/')
you can also use %q{} if you need single quotes in the text itself. If you need to have variables extrapolated inside the string, then use %Q{}. That's equivalent to double quotes ".
If the string contains regex expressions (eg: .*?()[]^$) that you want extrapolated, use // or %r{}

For convenience I just define
def regexcape(s)
Regexp.new(Regexp.escape(s))
end

Related

How to use escaped colon in thymeleaf th:text?

This is what happens when I try to use a colon in th:text:
and a backslash doesn't seem to fix it:
How can I use the colon symbol in th:text?
If you want to place a literal into th:text, you have to use single quotes: th:text="'7:00AM'". See documentation here.
(By contrast, something like this th:text="7_00AM" is valid - because it is a literal token. Such strings can only use a subset of characters, but do not need enclosing 's.)

Remove quotation marks from an object

I want to call User.first, but I get it like "User.first". How can I strip the quotation marks so I can call User? Using a regex like this: gsub!(/\A"|"\Z/, "") returns nil instead of the expression.
First I would say that it's dangerous to do this based on where your input is coming from, but if you absolutely need to run arbitrary ruby code contained in a string, you would use eval:
http://ruby-doc.org/core-2.2.2/Kernel.html#method-i-eval
Again, I would avoid evaluating strings if at all possible.

Ruby - Simple way to escape regex characters

I have an expression like this:
s.gsub! /[\?\/\\]/, ''
In fact, the list of forbidden characters is longer than that. Some require escaping, some don't. Is there a way I could just put the characters in some literal structure (?/\) and say "remove all these from the string". I'm aware of Regexp.quote but not sure how to use it in this context.
You can interpolate a statement inside a Ruby regular expression, exactly like you do for strings.
/#{...}/
In your case
s.gsub! /[#{Regexp.escape('?/\')}]/, ''

Backslash + captured group within Ruby regular expression

How do I excape a backslash before a captured group?
Example:
"foo+bar".gsub(/(\+)/, '\\\1')
What I expect (and want):
foo\+bar
what I unfortunately get:
foo\\1bar
How do I escape here correctly?
As others have said, you need to escape everything in that string twice. So in your case the solution is to use '\\\\\1' or '\\\\\\1'. But since you asked why, I'll try to explain that part.
The reason is that replacement sequence is being parsed twice--once by Ruby and once by the underlying regular expression engine, for whom \1 is its own escape sequence. (It's probably easier to understand with double-quoted strings, since single quotes introduce an ambiguity where '\\1' and '\1' are equivalent but '\' and '\\' are not.)
So for example, a simple replacement here with a captured group and a double quoted string would be:
"foo+bar".gsub(/(\+)/, "\\1") #=> "foo+bar"
This passes the string \1 to the regexp engine, which it understands as a reference to a capture group. In Ruby string literals, "\1" means something else entirely (ASCII character 1).
What we actually want in this case is for the regexp engine to receive \\\1. It also understands \ as an escape character, so \\1 is not sufficient and will simply evaluate to the literal output \1. So, we need \\\1 in the regexp engine, but to get to that point we need to also make it past Ruby's string literal parser.
To do that, we take our desired regexp input and double every backslash again to get through Ruby's string literal parser. \\\1 therefore requires "\\\\\\1". In the case of single quotes one slash can be omitted as \1 is not a valid escape sequence in single quotes and is treated literally.
Addendum
One of the reasons this problem is usually hidden is thanks to the use of /.+/ style regexp quotes, which Ruby treats in a special way to avoid the need to double escape everything. (Of course, this doesn't apply to gsub replacement strings.) But you can still see it in action if you use a string literal instead of a regexp literal in Regexp.new:
Regexp.new("\.").match("a") #=> #<MatchData "a">
Regexp.new("\\.").match("a") #=> nil
As you can see, we had to double-escape the . for it to be understood as a literal . by the regexp engine, since "." and "\." both evaluate to . in double-quoted strings, but we need the engine itself to receive \..
This happens due to a double string escaping. You should use 5 slashes in this case.
"foo+bar".gsub(/([+])/, '\\\\\1')
Adding \ two more times escapes this properly.
irb(main):011:0> puts "foo+bar".gsub(/(\+)/, '\\\\\1')
foo\+bar
=> nil

How can I remove the string "\n" from within a Ruby string?

I have this string:
"some text\nandsomemore"
I need to remove the "\n" from it. I've tried
"some text\nandsomemore".gsub('\n','')
but it doesn't work. How do I do it? Thanks for reading.
You need to use "\n" not '\n' in your gsub. The different quote marks behave differently.
Double quotes " allow character expansion and expression interpolation ie. they let you use escaped control chars like \n to represent their true value, in this case, newline, and allow the use of #{expression} so you can weave variables and, well, pretty much any ruby expression you like into the text.
While on the other hand, single quotes ' treat the string literally, so there's no expansion, replacement, interpolation or what have you.
In this particular case, it's better to use either the .delete or .tr String method to delete the newlines.
See here for more info
If you want or don't mind having all the leading and trailing whitespace from your string removed you can use the strip method.
" hello ".strip #=> "hello"
"\tgoodbye\r\n".strip #=> "goodbye"
as mentioned here.
edit The original title for this question was different. My answer is for the original question.
When you want to remove a string, rather than replace it you can use String#delete (or its mutator equivalent String#delete!), e.g.:
x = "foo\nfoo"
x.delete!("\n")
x now equals "foofoo"
In this specific case String#delete is more readable than gsub since you are not actually replacing the string with anything.
You don't need a regex for this. Use tr:
"some text\nandsomemore".tr("\n","")
use chomp or strip functions from Ruby:
"abcd\n".chomp => "abcd"
"abcd\n".strip => "abcd"

Resources