Refactoring a regular expression check in Ruby

I'm thinking there has got to be a cleaner way to check if a regular expression is not nil / is true. This is what I have been using:
hold = (h4.text =~ /Blah/)
if !hold.nil?
...
end
I tried: !(h4.text =~ /Blah/).nil? but it did not seem to work.

You can use unless here:
unless h4.text =~ /Blah/
#...
end

if h4.text !~ /Blah/
# ...
end

#!/usr/bin/ruby1.8
text = 'Blah blah blah'
puts "blah" if text =~ /Blah/ # => blah
text = 'Foo bar baz'
puts "blah" if text =~ /Blah/ # (nothing printed)
In a Ruby conditional statement, anything that is neither nil nor false is considered to be true.
=~ returns nil for no match, or an integer character position if there is a match.
nil is as good as false; an integer is as good as true.
Therefore, you can use the result of =~ directly in an if, while, etc.
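To make those return values concrete, here is a small sketch (the sample strings are made up for illustration) of what =~ hands back and how each value behaves as a condition. Note that a match at position 0 is still truthy in Ruby:

```ruby
text = 'Blah blah blah'

start = (text =~ /Blah/)   # match at the very beginning: 0, which is truthy in Ruby
hit   = (text =~ /blah/)   # first lowercase "blah" starts at index 5
miss  = (text =~ /Quux/)   # no match: nil

# Each result can be used directly as a condition.
start_label = start ? 'matched' : 'no match'
hit_label   = hit   ? 'matched' : 'no match'
miss_label  = miss  ? 'matched' : 'no match'

puts start.inspect   # => 0
puts hit.inspect     # => 5
puts miss.inspect    # => nil
puts start_label     # => matched
puts hit_label       # => matched
puts miss_label      # => no match
```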

Neither of the above seemed to work, this is what I ended up with:
unless (h4.text =~ /Blah/) == nil
...
end

Related

How can I create a method that checks if a string starts with a capitalized letter?

So far I have:
def capitalized?(str)
str[0] == str[0].upcase
end
The problem with this is that it returns true for strings like "12345", "£$%^&" and "9ball" etc. I would like it to only return true if the first character is a capital letter.
You can use match? to return true only if the first character is an uppercase letter in the range A to Z:
def capitalized?(str)
str.match?(/\A[A-Z]/)
end
p capitalized?("12345") # false
p capitalized?("fooo") # false
p capitalized?("Fooo") # true
Also you can pass a regular expression to start_with?:
p 'Foo'.start_with?(/[A-Z]/) # true
p 'foo'.start_with?(/[A-Z]/) # false
There's probably a nicer way to do it with regex, but keeping this Ruby-based, you can make a range of capital letters:
capital_letters = ("A".."Z")
Then you can check if your first letter is in that range:
def capitalized?(str)
capital_letters = ("A".."Z")
capital_letters.include?(str[0])
end
Or a bit shorter:
def capitalized?(str)
("A".."Z").include?(str[0])
end
I would avoid character ranges if possible, because without knowing the encoding, you can never be sure what is in a range. In your case, it is unnecessary. A simple
/\A[[:upper:]]/ =~ str
would do. See here for the definition of POSIX character classes.
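As a rough illustration of why the POSIX class can be preferable to a literal A-Z range, here is a sketch using a hypothetical helper name (starts_with_upper?) and made-up sample strings. It assumes Ruby 2.4 or later for match? and a UTF-8 string for the accented example:

```ruby
# Hypothetical helper; \A anchors at the start of the string.
def starts_with_upper?(str)
  str.match?(/\A[[:upper:]]/)
end

accented = "\u00C9mile"   # "Émile", written with an escape so the string is UTF-8

p starts_with_upper?('Hello')    # => true
p starts_with_upper?('hello')    # => false
p starts_with_upper?('9ball')    # => false
p starts_with_upper?(accented)   # => true
p accented.match?(/\A[A-Z]/)     # => false, the ASCII range misses the accented capital
```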
def capitalized?(str)
str[0] != str[0].downcase
end
capitalized? "Hello" #=> true
capitalized? "hello" #=> false
capitalized? "007, I presume" #=> false
capitalized? "$100 for that?" #=> false
Simple solution
def capitalized?(str)
str == str.capitalize
end
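It is worth running this version against the inputs from the question before relying on it. String#capitalize leaves non-letters unchanged and lowercases everything after the first character, so a quick sketch of the behaviour looks like this:

```ruby
def capitalized?(str)
  str == str.capitalize
end

p capitalized?('Hello')         # => true
p capitalized?('hello')         # => false
p capitalized?('12345')         # => true, digits are unchanged by capitalize
p capitalized?('Hello World')   # => false, capitalize yields "Hello world"
```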

ruby multiple regexp gsub with array.inject

I have to work with a long text and make some substitution with regexp inside it.
Now I wrote the following code:
text = File.read(file)
replacements = [
[/```([\w]+)/, "\n\1>"],
[/```\n/, "\n\n"],
[/pattern not found/, 'sub'],
[/pattern that should be found/, 'no sub'],
]
replacements.inject(text) do |text, (k,v)|
if text =~ k
text.gsub!(k,v)
end
end
File.write('name', text)
If every regexp is found in my document everything works fine, but if a replacements pattern is not found, all subsequent replacements are not carried out.
I added the if text =~ k check, but it still does not work.
Any idea?
The reason is that String#gsub! returns nil if there were no substitutions made, and the result if there were. Another glitch is that you call matching twice, the check for text =~ k is redundant.
I would go with the non-in-place version, gsub:
result = replacements.inject(text) do |text, (k, v)|
text.gsub(k, v)
end
The above should do the trick.
If you still want to use in-place substitution, you can just return text itself when gsub! makes no substitution:
result = replacements.inject(text) do |text, (k, v)|
text.gsub!(k, v) || text
end
Each inject iteration should return memo (in your case text) to the next iteration. Try this code:
replacements.inject(text) do |text, (k,v)|
if text =~ k
text.gsub!(k,v)
end
text
end
The block of inject must return memo value. So, you may have to change your code to do this:
replacements.inject(text) do |text, (k,v)|
text.gsub(k,v)
end
When the if text =~ k check failed in your case, the block's output was nil - hence the issue.
Alternatively, you can use with_object:
replacements.each.with_object(text) do |(k,v), text|
text.gsub!(k,v)
end
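To see the behaviour the answers above describe without touching a file, here is a self-contained sketch. The sample text and patterns are invented for illustration; the string stands in for the result of File.read:

```ruby
text = 'alpha beta gamma'   # stands in for File.read(file)
replacements = [
  [/alpha/, 'A'],
  [/missing/, 'X'],   # deliberately never matches
  [/gamma/, 'G']
]

# gsub! returns nil when nothing was replaced.
no_change = 'abc'.dup.gsub!(/z/, 'y')
p no_change   # => nil

# gsub always returns a string, so every iteration passes a valid memo on.
result = replacements.inject(text) do |memo, (pattern, replacement)|
  memo.gsub(pattern, replacement)
end

p result   # => "A beta G"
p text     # => "alpha beta gamma" (the original is left untouched)
```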

Overriding the =~ operator of Regexp in a subclass results in weird behaviour when executing "example" =~ subregex

Given the following example in Ruby 2.0.0:
class Regexp
def self.build
NumRegexp.new("-?[\d_]+")
end
end
class NumRegexp < Regexp
def match(value)
'hi two'
end
def =~(value)
'hi there'
end
end
var_ex = Regexp.build
var_ex =~ '12' # => "hi there" , as expected
'12' =~ var_ex # => nil , why? It was expected "hi there" or "hi two"
According to the documentation of Ruby of the =~ operator for the class String:
str =~ obj → fixnum or nil
"If obj is a Regexp, use it as a pattern to match against str,and returns the position the match starts, or nil if there is no match. Otherwise, invokes obj.=~, passing str as an argument. The default =~ in Object returns nil."
http://www.ruby-doc.org/core-2.0.0/String.html#method-i-3D-7E
It is a fact that the variable var_ex is an object of class NumRegexp, hence, it is not a Regexp object. Therefore, it should invoke the method obj.=~ passing the string as an argument, as indicated in the documentation and returning "hi there".
In another case, maybe as NumRegexp is a subclass of Regexp it could be considered a Regexp type. Then, "If obj is a Regexp use it as a pattern to match against str". It should return "hi two" in that case.
What is wrong in my reasoning? What do I have to do to achieve the desired functionality?
I've found that the expression:
var_ex =~ '12'
isn't the same as:
'12' =~ var_ex
It seems that there is no String method that calls back to the Regexp class's #=~ method. This is a bug that has already been reported and is expected to be fixed in 2.2.0. So you have to declare it explicitly:
class String
alias :__system_match :=~
def =~ regex
regex.is_a?( Regexp ) && ( regex =~ self ) || __system_match( regex )
end
end
'12' =~ /-?[\d_]+/
# => 0
This is a possible and acceptable solution using monkey patching but it presents some problems to take into account:
The problem with this is that we have now polluted the namespace with a superfluous __system_match method. This method will show up in our documentation, it will show up in code completion in our IDEs, it will show up during reflection. Also, it still can be called, but presumably we monkey patched it, because we didn't like its behavior in the first place, so we might not want other people to call it.
The reason is that you are calling =~ method on a string, not on your NumRegexp object. You need to tell String how to behave:
class String
def =~(reg)
return reg=~self if reg.is_a? NumRegexp
super
end
end
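One caveat with reopening String directly: the new definition replaces String#=~, so super resolves to the next ancestor rather than to the original String method. A possible way around that, sketched here with a hypothetical module name (NumRegexpDispatch), is Module#prepend, which places the override in front of the built-in method so that super still reaches it. This is still a global monkey patch, so treat it as an illustration only:

```ruby
# NumRegexp mirrors the subclass from the question.
class NumRegexp < Regexp
  def =~(value)
    'hi there'
  end
end

# Hypothetical module name. Because it is prepended, this =~ runs first
# and `super` falls through to the built-in String#=~.
module NumRegexpDispatch
  def =~(pattern)
    return pattern =~ self if pattern.is_a?(NumRegexp)
    super
  end
end

String.prepend(NumRegexpDispatch)

num = NumRegexp.new('-?[\d_]+')   # single quotes keep \d intact

custom  = ('12' =~ num)     # dispatched to NumRegexp#=~
builtin = ('12' =~ /\d+/)   # ordinary regexps behave as before
nomatch = ('ab' =~ /\d+/)

p custom    # => "hi there"
p builtin   # => 0
p nomatch   # => nil
```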

Ruby: Use condition result in condition block

I have such code
reg = /(.+)_path/
if reg.match('home_path')
puts reg.match('home_path')[0]
end
This will eval regex twice :(
So...
reg = /(.+)_path/
result = reg.match('home_path')
if result
puts result[0]
end
But then the variable result stays in memory after the check.
I have one functional-programming idea
/(.+)_path/.match('home_path').compact.each do |match|
puts match[0]
end
But it seems there should be a better solution, shouldn't there?
There are special global variables (their names start with $) that contain results of the last regexp match:
r = /(.+)_path/
# $1 - the first group of the last successful match ($2, $3, ... for further groups)
puts $1 if r.match('home_path')
# => home
# $& - the string matched by the last successful match
puts $& if r.match('home_path')
# => home_path
You can find full list of predefined global variables here.
Note that in the examples above puts won't be executed at all if you pass a string that doesn't match the regexp.
In the general case, you can always put the assignment into the condition itself:
if m = /(.+)_path/.match('home_path')
puts m[0]
end
Though, many people don't like that as it makes code less readable and gives a good opportunity for confusing = and ==.
My personal favorite (w/ 1.9+) is some variation of:
if /(?<prefix>.+)_path/ =~ "home_path"
puts prefix
end
If you really want a one-liner: puts /(?<prefix>.+)_path/ =~ 'home_path' ? prefix : false
See the Ruby Docs for a few limitations of named captures and #=~.
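One of those limitations, as far as I know, is that the local variables are only assigned when the regexp is a literal on the left-hand side of =~. When the pattern is stored in a variable, the named group can still be read from the MatchData. A small sketch (variable names are illustrative):

```ruby
pattern = /(?<prefix>.+)_path/

# The pattern is not a literal here, so read the group from the MatchData.
if (m = pattern.match('home_path'))
  puts m[:prefix]   # => home
end

# No match: match returns nil, so the body is skipped.
if (m = pattern.match('homepath'))
  puts m[:prefix]
end
```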
From the docs: If a block is given, invoke the block with MatchData if match succeed.
So:
/(.+)_path/.match('home_path') { |m| puts m[1] } # => home
/(.+)_path/.match('homepath') { |m| puts m[1] } # prints nothing
How about...
if m=/regex here/.match(string) then puts m[0] end
A neat one-line solution, I guess :)
How about this?
puts $~ if /regex/.match("string")
$~ is a special variable that stores the last regexp match. More info: http://www.regular-expressions.info/ruby.html
Actually, this can be done with no conditionals at all. (The expression evaluates to "" if there is no match.)
puts /(.+)_path/.match('home_xath').to_a[0].to_s
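Another option with no explicit conditional, offered as an alternative sketch rather than something from the answers above: String#[] accepts a regexp plus an optional capture index, and returns nil when nothing matches.

```ruby
group    = 'home_path'[/(.+)_path/, 1]   # first capture group
whole    = 'home_path'[/(.+)_path/]      # entire matched text
no_match = 'home_xath'[/(.+)_path/, 1]   # nothing matches

p group      # => "home"
p whole      # => "home_path"
p no_match   # => nil
```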

How could I check to see if a word exists in a string, and return false if it doesn't, in ruby?

Say I have a string str = "Things to do: eat and sleep."
How could I check if "do: " exists in str, case insensitive?
Like this:
puts "yes" if str =~ /do:/i
To return a boolean value (from a method, presumably), compare the result of the match to nil:
def has_do(str)
(str =~ /do:/i) != nil
end
Or, if you don’t like the != nil then you can use !~ instead of =~ and negate the result:
def has_do(str)
not str !~ /do:/i
end
But I don’t really like double negations …
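If the Ruby version is 2.4 or later, String#match? returns a plain true or false, which avoids both the nil comparison and the double negation. A minimal sketch, with has_do? as a hypothetical method name modelled on the answer above:

```ruby
# Hypothetical predicate; /i makes the match case insensitive.
def has_do?(str)
  str.match?(/do: /i)
end

p has_do?('Things to do: eat and sleep.')   # => true
p has_do?('THINGS TO DO: EAT')              # => true
p has_do?('Nothing here')                   # => false
```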
In Ruby 1.9 you can do it like this:
str.downcase.match("do: ") do
puts "yes"
end
It's not exactly what you asked for, but I noticed a comment to another answer. If you don't mind using regular expressions when matching the string, perhaps there is a way to skip the downcase part to get case insensitivity.
For more info, see String#match
You could also do this:
str.downcase.include? "Some string".downcase
If all I'm looking for is a case-insensitive substring match I usually use:
str.downcase['do: ']
9 times out of 10 I don't care where in the string the match is, so this is nice and concise.
Here's what it looks like in IRB:
>> str = "Things to do: eat and sleep." #=> "Things to do: eat and sleep."
>> str.downcase['do: '] #=> "do: "
>> str.downcase['foobar'] #=> nil
Because it returns nil if there is no hit it works in conditionals too.
"Things to do: eat and sleep.".index(/do: /i)
index returns the position where the match starts, or nil if not found
You can learn more about index method here:
http://ruby-doc.org/core/classes/String.html
Or about regex here:
http://www.regular-expressions.info/ruby.html
