Converting String to Regex string - ruby

How can I transform a string into a regex string, properly escaping all regex-specific characters? I am using interpolation to build the regex string to allow users to customize the regex without having to touch the code (or expecting them to know regex)
Example
custom_text = "Hello"
my_regex = /#{custom_text}:\s*(\d+)/i
Which results in the following regex when my code uses it
/Hello:\s*(\d+)/i
This allows users to perhaps provide language localizations without having to worry about figuring out where my regex is used, how it's used, or whether they will break the script if they changed something.
However if they wanted to include things like periods or question marks like Hello?, I would probably need to escape them first.

Use Regexp.escape:
my_regex = /#{Regexp.escape(custom_text)}:\s*(\d+)/i
For example:
>> puts /#{Regexp.escape('Hello?')}/.inspect
/Hello\?/
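To see why the escaping matters, here is a minimal sketch comparing the escaped and unescaped versions. The input strings ("hello?: 42", "hell: 7") are made up for illustration and are not from the question.

```ruby
# Hypothetical user-supplied label containing a regex metacharacter.
custom_text = "Hello?"

# Escaped: the "?" is matched literally.
escaped = Regexp.escape(custom_text)
my_regex = /#{escaped}:\s*(\d+)/i
match = my_regex.match("hello?: 42")
number = match && match[1]

# Unescaped: "o?" means "an optional o", so the literal "?" is never matched.
unescaped_regex = /#{custom_text}:\s*(\d+)/i
unescaped_hits_literal = unescaped_regex.match?("hello?: 42")
unescaped_hits_other = unescaped_regex.match?("hell: 7")

puts escaped
puts number
puts unescaped_hits_literal
puts unescaped_hits_other
```

Without the escape, the pattern silently matches different text than the user intended instead of raising an error, which is why escaping user input by default is the safer choice.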

Related

Using a ruby regular expression

I'm completely new to Ruby so I was just wondering if someone could help me out.
I have the following String:
"<planKey><key>OR-J8U</key></planKey>"
What is the regex I have to write to get the center part OR-J8U?
Use the following:
str = "<planKey><key>OR-J8U</key></planKey>"
str[/(?<=\<key\>).*(?=\<\/key\>)/]
#=> "OR-J8U"
This captures anything in between opening and closing 'key' tags using lookahead and lookbehinds
If you want to get the string OR-J8U then you could simply use that string in the regular expression; the - character does not need to be escaped outside a character class, although escaping it is harmless:
/OR\-J8U/
Though, I believe you want any string that is enclosed within <planKey><key> and </key></planKey>. In that case ice's answer is useful if you allow for an empty string:
/(?<=\<key\>).*(?=\<\/key\>)/
If you don't allow for an empty string, replace the * with +:
/(?<=\<key\>).+(?=\<\/key\>)/
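The difference between the * and + quantifiers only shows up when the tag is empty. The sketch below illustrates it; the `empty` string is a made-up input, and the capture-group variant at the end is an alternative not taken from the answers above.

```ruby
str   = "<planKey><key>OR-J8U</key></planKey>"
empty = "<planKey><key></key></planKey>"  # hypothetical empty tag

star = /(?<=<key>).*(?=<\/key>)/  # allows an empty match
plus = /(?<=<key>).+(?=<\/key>)/  # requires at least one character

from_star       = str[star]
empty_from_star = empty[star]
empty_from_plus = empty[plus]

# Alternative without lookarounds: a non-greedy capture group,
# selected with the second argument of String#[].
from_group = str[/<key>(.*?)<\/key>/, 1]

p from_star, empty_from_star, empty_from_plus, from_group
```

With `*` the empty tag yields an empty string, while with `+` there is no match and String#[] returns nil.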
If you prefer a more general approach (any string enclosed within any tags), then I believe the common opinion is not to use a regular expression. Instead consider using an HTML parser. On SO you can find some questions and answers in that regard.

How can I check for repeated strings with check-tail plugin in Sensu?

I am using sensu and the check-tail.rb plugin to alert if any errors appear in my app logs. The problem is that I want the check to be successful if it finds 3 or more error messages.
The solution that I came up with is using a regex like:
\^.*"status":503,.*$.*^.*"status":503,.*$.*^.*"status":503,.*$\im
But it seems to not work because of the match function: instead of passing the variable as a ruby regex it passes it as a string (this can be seen here).
You need to pass the pattern as a string literal, not as a Regexp object.
Thus, you need to remove the regex delimiters and change the modifiers to their inline option variants, that is, prepend the pattern with (?im).
(?im)\A.*"status":503,.*$.*^.*"status":503,.*$.*^.*"status":503,.*\z
Note that to match the start of string in Ruby, you need to use \A and to match the end of string, you need to use \z anchors.
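As a rough check outside Sensu, the string form of the pattern can be compiled with Regexp.new to confirm that the inline options behave like the /im modifiers. The log lines below are invented for illustration, and the String#scan count at the end is only an alternative way to express "3 or more" in plain Ruby; it is not something the check-tail plugin is known to support.

```ruby
# The pattern as a plain string, with inline options instead of /im.
pattern_string = '(?im)\A.*"status":503,.*$.*^.*"status":503,.*$.*^.*"status":503,.*\z'
pattern = Regexp.new(pattern_string)

# Hypothetical log excerpts.
three_errors = [
  '{"status":503,"path":"/a"}',
  '{"status":503,"path":"/b"}',
  '{"status":503,"path":"/c"}'
].join("\n")

two_errors = [
  '{"status":503,"path":"/a"}',
  '{"status":200,"path":"/b"}',
  '{"status":503,"path":"/c"}'
].join("\n")

matches_three = pattern.match?(three_errors)
matches_two   = pattern.match?(two_errors)

# In plain Ruby, counting occurrences is simpler than one large pattern.
error_count = three_errors.scan(/"status":503,/).size

p matches_three, matches_two, error_count
```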

Regular expression to clean string

I'm struggling to figure out even where to start with this. I believe there is a regular expression to make this a fairly straight forward task. I want to trim off the extra asterisks in a string.
Example string:
test="AM*BE*3***LAST****~"
I would like it to trim asterisks off only the end that don't have repeating symbols. So the resulting value in the variable would be:
test="AM*BE*3***LAST~"
In Perl I was able to use this:
s/\*+~+/~/;
Is there something similar I can do in Ruby? I'm sure there is, just struggling to find it for some reason. Any help would be greatly appreciated.
You could use this regex:
/\*+~$/
Then use the gsub method to replace all matches with a tilde ~:
test = "AM*BE*3***LAST****~"
test.gsub!(/\*+~$/, '~')
# => "AM*BE*3***LAST~"
Or you could use this more flexible regex, which matches any number of non-asterisk characters after the asterisks, up to the end of the line:
/\*+([^*]+)$/
Then use the first capture group ($1) as the replacement:
test.gsub(/\*+([^*]+)$/) { $1 }
Ruby's String class has the [] method, which lets us use regexp as a parameter. We can also assign to that, allowing us to do things like:
foo = "AM*BE*3***LAST****~"
foo[/\*+~+$/] = '~'
foo # => "AM*BE*3***LAST~"
That reuses the match pattern from your Perl search/replace. (I'm assuming you only want to match at the end of the line because of your examples. If it needs to be anywhere in the string remove the trailing $ from the pattern.)
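One practical difference between the approaches above is what happens when the pattern does not match. The sketch below compares String#sub with the []= assignment; the "NO-TRAILING-STARS~" input is made up for illustration.

```ruby
pattern = /\*+~+$/

# sub returns a new string and leaves the original untouched.
test = "AM*BE*3***LAST****~"
cleaned = test.sub(pattern, '~')

# When nothing matches, sub simply returns an unchanged copy.
no_stars = "NO-TRAILING-STARS~"
unchanged = no_stars.sub(pattern, '~')

# Assigning through []= raises IndexError when the pattern does not match.
raised = false
begin
  copy = no_stars.dup
  copy[pattern] = '~'
rescue IndexError
  raised = true
end

p cleaned, unchanged, raised
```

If the input may or may not contain the trailing asterisks, sub (or sub! for in-place modification) avoids having to rescue the exception.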
You can use Rubular and try to test the regex and achieve what you need based on the references down the page.
http://rubular.com/

Changing "word" to "Word" using a RegEx like [A-Z]([a-z]*)\b

The title sums up my conundrum pretty well. I've been searching around the net for a while, and being new to Ruby and Regular Expressions as a whole, I'm stuck trying to figure out how to alter the case of a single word string using a RegEx "filter" such as [A-Z]([a-z]*)\b.
Basically I want the flow to be
input: woRD
filter: [A-Z]([a-z]*)\b
output: Word
I already have the words filtered into a list, so I don't need to match words; I only need to filter the case of the word using a RegEx filter.
I do not want to use standard capitalization methods, I want this to be done using Regular Expressions.
You can use
"woRD".downcase.capitalize
Ruby provides predefined methods for this kind of functionality. Try to use them instead of a regex; it saves coding time.
Well, for some reason you want to use regexps. Here you go:
# prepare hashes for gsub
to_down = (to_upper = Hash[('a'..'z').zip('A'..'Z')]).invert
# convert to downcase
downcased = 'woRD'.gsub(/[A-Z]/, to_down)
# ⇒ 'word'
titlecased = downcased.gsub(/^\w/, to_upper)
# ⇒ 'Word'
Hope it helps. Note the usage of String#gsub(re, hash) method.
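Another way to keep a regex in charge of the matching is the block form of gsub, sketched below. The method name titlecase_with_regex is hypothetical, and the case change itself is still done by upcase and downcase inside the block, since a Ruby regex can only select text, not transform it.

```ruby
# Hypothetical helper: the regex splits the word into its first letter
# and the remaining letters; the block rebuilds it in title case.
def titlecase_with_regex(word)
  word.gsub(/\A([[:alpha:]])([[:alpha:]]*)\z/) do
    "#{$1.upcase}#{$2.downcase}"
  end
end

result = titlecase_with_regex("woRD")
already_ok = titlecase_with_regex("Word")
not_a_word = titlecase_with_regex("w0rd!")  # does not match, returned as-is

p result, already_ok, not_a_word
```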
A regex alone cannot change the case of letters; it can only find the text to change.
Please read this topic carefully: How to change case of letters in string using regex in Ruby.
The best way to solve your problem is to use:
"woRD".downcase.capitalize
or
name_of_your_variable.capitalize!
if you want to modify the string in place without assigning the result to another variable. (Avoid chaining downcase!.capitalize!: downcase! returns nil when the string is already lowercase, so the chained call would raise NoMethodError.)

Using regexes in ruby with a need to match lots of * and /

I need to find strings containing * and / using regexes, and I am writing in Ruby. The reason is that I am building a tokenizer for a language whose multi-line comments use the C style (/* */). I already have single-line comments handled.
Is there a way to write a regex without using the two forward slashes as delimiters? I am finding it impossible to spot my mistakes because of the insane amount of escaping. Or can someone give me advice on how to handle the escaping in a sane manner? I already tried writing the sequence first and then escaping it.
Thank you for your time and advice.
One trick that might help is the %r literal:
%r{http://www\.google\.com}
I like to use pipes myself, when they're not in the regex.
%r|http://www\.google\.com|
You can also create new instances of Regexp via Regexp.new and pass a string.
Finally, you might also look at Regexp.quote (an alias of Regexp.escape):
Escapes any characters that would have special meaning in a regular expression. Returns a new escaped string, or self if no characters are escaped. For any string, Regexp.new(Regexp.escape(str))=~str will be true.
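Applied to the C-style comments from the question, the two suggestions might look like the sketch below. The sample source string is invented, and the non-greedy .*? with the m option is one common way to match such comments, not necessarily what a full tokenizer needs (it ignores comment markers inside string literals, for example).

```ruby
# Hypothetical input for the tokenizer.
source = "int a; /* one\n two */ int b; /* three */"

# With %r{...} the forward slashes need no escaping; only * does.
# The m option lets . match newlines so comments can span lines.
comment = %r{/\*.*?\*/}m
comments = source.scan(comment)

# The same pattern built from plain strings via Regexp.escape.
opener = Regexp.escape("/*")
closer = Regexp.escape("*/")
built = Regexp.new("#{opener}.*?#{closer}", Regexp::MULTILINE)
comments_from_built = source.scan(built)

p comments, opener, closer
```

Building the pattern from escaped pieces keeps the delimiters readable as ordinary strings, at the cost of a slightly longer definition.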
