Ruby Regex match method - ruby

I was giving this regex /\A[\w+\-.]+#[a-z\d\-]+(\.[a-z\d\-]+)*\.[a-z]{2,}\z/ and I'm to use it in order to verify the validity of an email address.
This is my first Ruby code and I'm not sure how to do it. I was told to use the .match method, which returns MatchData object. But how do I go on verifying that the MatchData object confirms the validity?
I attempted using the following, but it seems that it's accepting any string, even not an email address.
#Register new handler
def register_handler
email_regex = /\A[\w+\-.]+#[a-z\d\-]+(\.[a-z\d\-]+)*\.[a-z]{2,}\z/
email = "invalid"
unless email =~ email_regex then
puts("Insert your e-mail address:")
email = gets
end
puts("email received")
end
What is the correct way to do this? Either using .match or the method I attempted above.

The regex matches email addresses:
'invalid' =~ email_regex
=> nil # which boolean value is false
'email#example.com' =~ email_regex
=> 0 # which boolean value is true
however:
"email#example.com\n" =~ email_regex
=> nil
The newline character \n is appended by gets to every input.
That is why, the until loop will run forever regardless of what you will type in the terminal. The matching result will always be nil because of the newline character.
Try using gets.chomp, which will trim the newline character and your code should work.

Try this:
Edited the regex and put () to capture group and the beginning and the end
re = /^([\w+\-.]+#[a-z\d\-]+(\.[a-z\d\-]+)*\.[a-z]{2,})$/m
str = 'a#b.com
p#qasdf.com
adbadf#bwdsfqaf.com
....
a#bdotcom
aasdf.com
www.yahoo.com'
# Print the match result
str.scan(re) do |match|
puts match.to_s
end
Running sample code

Related

create comma separated string in the First element of an array Ruby

So this may seem odd, and I have done quite a bit of googling, however, I am not really a programmer, (sysops) and trying to figure out how to pass data to the AWS API in the required format, which does seem a little odd.
So, working with resources in AWS, I need to pass tags which are keys and values. The key is a string. The value is a comma separated string, in the first element of an array. So in Ruby terms, looks like this.
{env => ["stage,qa,dev"]}
and not
{env => ["stage","qa","dev"]}
I'm created an admittedly. not a very pretty little app that will allow me to run ssm documents on targeted instances in aws.
I can get the string into an array element using this class I created
class Tags
attr_accessor :tags
def initialize
#tags = {"env" => nil ,"os" => nil ,"group" => nil }
end
def set_values()
puts "please enter value/s for the following keys, using spaces or commas for multiple values"
#tags.each { |key,value|
print "enter #{key} value/s: "
#tags[key] = [gets.strip.chomp]
#tags[key] = Validate.multi_value(tags[key])
}
end
end
I then call this Validate.multi_value passing in the created Array, but it spits an array of my string value back.
class Validate
def self.multi_value(value)
if value.any?{ |sub_string| sub_string.include?(",") || sub_string.include?(" ") }
value = value[0].split(/[,\s]+/)
return value
else
return value
end
end
end
Using pry, I've seen it gets for example ["stage dev qa"] then the if statement does work, then it spits out ["stage","dev","qa"].
and I need it to output ["stage,dev,qa"] but for the life of me, I can't make it work.
I hope that's clear.
If you have any suggestions, I'd be most grateful.
I'm not hugely experienced at ruby and the may be class methods that I've missed.
If your arrays are always coming through in the format ["stage dev qa"] then first we need to split the one string into the parts we want:
arr = ["stage dev qa"]
arr.split(' ')
=> ["stage", "dev", "qa"]
Then we need to join them with the comma:
arr.split(' ').join(',')
=> "stage,dev,qa"
And finally we need to wrap it in an array:
[arr.first.split(' ').join(',')]
=> ["stage,dev,qa"]
All together:
def transform_array(arr)
[arr.first.split(' ').join(',')]
end
transform_array(['stage dev qa'])
=> ['stage,dev,qa']
More info: How do I convert an array of strings into a comma-separated string?
I see no point in creating a class here when a simple method would do.
def set_values
["env", "os", "group"].map do |tag|
puts "Please enter values for #{tag}, using spaces or commas"
print "to separate multiple values: "
gets.strip.gsub(/[ ,]+/, ',')
end
end
Suppose, when asked, the user enters, "stage dev,qa" (for"env"), "OS X" (for"OS") and "Hell's Angels" for "group". Then:
set_values
#=> ["stage,dev,qa", "OS,X", "Hell's,Angels"]
If, as I suspect, you only wish to convert spaces to commas for "env" and not for "os" or "group", write:
def set_values
puts "Please enter values for env, using spaces or commas"
print "to separate multiple values: "
[gets.strip.gsub(/[ ,]+/, ',')] +
["os", "group"].map do |tag|
print "Please enter value for #{tag}: "
gets.strip
end
end
set_values
#=> ["stage,dev,ga", "OS X", "Hell's Angels"]
See Array#map, String#gsub and Array#+.
gets.strip.gsub(/[ ,]+/, ',') merely chains the two operations s = gets.strip and s.gsub(/[ ,]+/, ','). Chaining is commonplace in Ruby.
The regular expression used by gsub reads, "match one or more spaces or commas", [ ,] being a character class, requiring one of the characters in the class be matched, + meaning that one or more of those spaces or commas are to be matched. If the string were "a , b,, c" there would be two matches, " , " and ",, "; gsub would convert both to a single comma.
Using print rather than puts displays the user's entry on the same line as the prompt, immediately after ": ", rather than on the next line. That is of course purely stylistic.
Often one would write gets.chomp rather than gets.strip. Both remove newlines and other whitespace at the end of the string, strip also removes any whitespace at the beginning of the string. strip is probably best in this case.
What do you think about this?, everything gets treated in the Validates method. I don't know if you wanted to remove repeated values, but, just in case I did, so a
"this string,, has too many,,, , spaces"
will become
"this,string,has,too,many,spaces"
and not
"this,,,,string,,,has,too,,many,,,,,,spaces"
Here's the code
class Tags
attr_accessor :tags
# initializes the class (no change)
#
def initialize
#tags = {"env" => nil ,"os" => nil ,"group" => nil }
end
# request and assign the values <- SOME CHANGES
#
def set_values
puts "please enter value/s for the following keys, using spaces or commas for multiple values"
#tags.each do |key,value|
print "enter #{key} value/s: "
#tags[key] = Validate.multi_value( gets )
end
end
end
class Validate
# Sets the array
#
def self.multi_value(value)
# Remove leading spaces, then remove special chars,
# replace all spaces with commas, then remove repetitions
#
[ value.strip.delete("\n","\r","\t","\rn").gsub(" ", ",").squeeze(",") ]
end
end
EDITED, thanks lacostenycoder

What does 'no implicit conversion of Regexp into String (TypeError)' mean?

class String
def digit?
self.include?(/[0-9]/)
end
end
Test.assert_equals "".digit?, false
Test.assert_equals "7".digit?, true
Test.assert_equals " ".digit?, false
I have been playing around with regular expressions. Can you tell me how I have made an error? I have tried explicitly converting it to a string but it doesn't work nor do I see why I should have to. Could anyone enlighten me? Thank you!
include? expects a string as documented here. It seems like you were looking for match.
Matt's answer is good, but it's worth noting that there's some nuance here. Firstly, with the Regexp you have, you're going to match any string that contains any digit:
class String
def digit?
match(/[0-9]/)
end
end
"foo1bar".digit? # => #<MatchData "1">
If you want to match a single digit you'll want to use anchors—\A and \z for the start and end of the string, respectively—and while we're at it, you might as well use \d instead of [0-9], ergo /\A\d\z/. If you want to match one or more digits, use the + quantifier: /\A\d+\z/.
Finally, match might be overkill here. I suggest using the =~ operator:
class String
def digit?
self =~ /\A\d\z/
end
def digits?
self =~ /\A\d+\z/
end
end
"foo123bar".digit? # => nil
"1".digit? # => 0
"123".digit? # => nil
"foo123bar".digits? # => nil
"1".digits? # => 0
"123".digits? # => 0

Swap part of a string in Ruby

What's the easiest way in Ruby to interchange a part of a string with another value. Let's say that I have an email, and I want to check it on two domains, but I don't know which one I'll get as an input. The app I'm building should work with #gmail.com and #googlemail.com domains.
Example:
swap_string 'user#gmail.com' # >>user#googlemail.com
swap_string 'user#googlemail.com' # >>user#gmail.com
If you're looking to substitute a part of a string with something else, gsub works quite well.
Link to Gsub docs
It lets you match a part of a string with regex, and then substitute just that part with another string. Naturally, in place of regex, you can just use a specific string.
Example:
"user#gmail.com".gsub(/#gmail/, '#googlemail')
is equal to
user#googlemail.com
In my example I used #gmail and #googlemail instead of just gmail and googlemail. The reason for this is to make sure it's not an account with gmail in the name. It's unlikely, but could happen.
Don't match the .com either, as that can change depending on where the user's email is.
Assuming googlemail.com and gmail.com are the only two possibilities, you can use sub to replace a pattern with given replacement:
def swap_string(str)
if str =~ /gmail.com$/
str.sub("gmail.com","googlemail.com")
else
str.sub("googlemail.com","gmail.com")
end
end
swap_string 'user#gmail.com'
# => "user#googlemail.com"
swap_string 'user#googlemail.com'
# => "user#gmail.com"
You can try with Ruby gsub :
eg:
"user#gmail.com".gsub("gmail.com","googlemail.com");
As per your need of passing a string parameter in a function this should do:
def swap_mails(str)
if str =~ /gmail.com$/
str.sub('gmail.com','googlemail.com');
else
str.sub('googlemail.com','gmail.com');
end
end
swap_mails "vgmail#gmail.com" //vgmail#googlemail.com
swap_mails "vgmail#googlemail.com" ////vgmail#gmail.com
My addition :
def swap_domain str
str[/.+#/] + [ 'gmail.com', 'googlemail.com' ].detect do |d|
d != str.split('#')[1]
end
end
swap_domain 'user#gmail.com'
#=> user#googlemail.com
swap_domain 'user#googlemail.com'
#=> user#gmail.com
And this is bad code, imo.
String has a neat trick up it's sleeve in the form of String#[]:
def swap_string(string, lookups = {})
string.tap do |s|
lookups.each { |find, replace| s[find] = replace and break if s[find] }
end
end
# Example Usage
lookups = {"googlemail.com"=>"gmail.com", "gmail.com"=>"googlemail.com"}
swap_string("user#gmail.com", lookups) # => user#googlemail.com
swap_string("user#googlemail.com", lookups) # => user#gmail.com
Allowing lookups to be passed to your method makes it more reusable but you could just as easily have that hash inside of the method itself.

Overriding the =~ operator of Regexp in a subclass Subregex, result in a weird behaviour when executing "example" =~ subregexex

Given the following example in Ruby 2.0.0:
class Regexp
def self.build
NumRegexp.new("-?[\d_]+")
end
end
class NumRegexp < Regexp
def match(value)
'hi two'
end
def =~(value)
'hi there'
end
end
var_ex = Regexp.build
var_ex =~ '12' # => "hi there" , as expected
'12' =~ var_ex # => nil , why? It was expected "hi there" or "hi two"
According to the documentation of Ruby of the =~ operator for the class String:
str =~ obj → fixnum or nil
"If obj is a Regexp, use it as a pattern to match against str,and returns the position the match starts, or nil if there is no match. Otherwise, invokes obj.=~, passing str as an argument. The default =~ in Object returns nil."
http://www.ruby-doc.org/core-2.0.0/String.html#method-i-3D-7E
It is a fact that the variable var_ex is an object of class NumRegexp, hence, it is not a Regexp object. Therefore, it should invoke the method obj.=~ passing the string as an argument, as indicated in the documentation and returning "hi there".
In another case, maybe as NumRegexp is a subclass of Regexp it could be considered a Regexp type. Then, "If obj is a Regexp use it as a pattern to match against str". It should return "hi two" in that case.
What is wrong in my reasoning? What do I have to do to achieve the desired functionality?
I've found that the record:
var_ex =~ '12'
isn't the same of:
'12' =~ var_ex
It seems that there are no string method that calls to regexp class #~= method back, this is a bug already reported, and expected to be solved in 2.2.0. So you have to declare it explicitly:
class String
alias :__system_match :=~
def =~ regex
regex.is_a?( Regexp ) && ( regex =~ self ) || __system_match( regex )
end
end
'12' =~ /-?[\d_]+/
# => 0
This is a possible and acceptable solution using monkey patching but it presents some problems to take into account:
The problem with this is that we have now polluted the namespace with a superfluous __system_match method. This method will show up in our documentation, it will show up in code completion in our IDEs, it will show up during reflection. Also, it still can be called, but presumably we monkey patched it, because we didn't like its behavior in the first place, so we might not want other people to call it.
The reason is that you are calling =~ method on a string, not on your NumRegexp object. You need to tell String how to behave:
class String
def =~(reg)
return reg=~self if reg.is_a? NumRegexp
super
end
end

Ruby Regex not matching

I'm writing a short class to extract email addresses from documents. Here is my code so far:
# Class to scrape documents for email addresses
class EmailScraper
EmailRegex = /\A[\w+\-.]+#[a-z\d\-.]+\.[a-z]+\z/i
def EmailScraper.scrape(doc)
email_addresses = []
File.open(doc) do |file|
while line = file.gets
temp = line.scan(EmailRegex)
temp.each do |email_address|
puts email_address
emails_addresses << email_address
end
end
end
return email_addresses
end
end
if EmailScraper.scrape("email_tests.txt").empty?
puts "Empty array"
else
puts EmailScraper.scrape("email_tests.txt")
end
My "email_tests.txt" file looks like so:
example#live.com
another_example90#hotmail.com
example3#diginet.ie
When I run this script, all I get is the "Empty array" printout. However, when I fire up irb and type in the regex above, strings of email addresses match it, and the String.scan function returns an array of all the email addresses in each string. Why is this working in irb and not in my script?
Several things (some already mentioned and expanded upon below):
\z matches to the end of the string, which with IO#gets will typically include a \n character. \Z (upper case 'z') matches the end of the string unless the string ends with a \n, in which case it matches just before.
the typo of emails_addresses
using \A and \Z is fine while the entire line is or is not an email address. You say you're seeking to extract addresses from documents, however, so I'd consider using \b at each end to extract emails delimited by word boundaries.
you could use File.foreach()... rather than the clumsy-looking File.open...while...gets thing
I'm not convinced by the Regex - there's a substantial body of work already around:
There's a smarter one here: http://www.regular-expressions.info/email.html (clicking on that odd little in-line icon takes you to a piece-by-piece explanation). It's worth reading the discussion, which points out several potential pitfalls.
Even more mind-bogglingly complex ones may be found here.
class EmailScraper
EmailRegex = /\A[\w+\-.]+#[a-z\d\-.]+\.[a-z]+\Z/i # changed \z to \Z
def EmailScraper.scrape(doc)
email_addresses = []
File.foreach(doc) do |line| # less code, same effect
temp = line.scan(EmailRegex)
temp.each do |email_address|
email_addresses << email_address
end
end
email_addresses # "return" isn't needed
end
end
result = EmailScraper.scrape("email_tests.txt") # store it so we don't print them twice if successful
if result.empty?
puts "Empty array"
else
puts result
end
Looks like you're putting the results into emails_addresses, but are returning email_addresses. This would mean that you're always returning the empty array you defined for email_addresses, making the "Empty array" response correct.
You have a typo, try with:
class EmailScraper
EmailRegex = /\A[\w+\-.]+#[a-z\d\-.]+\.[a-z]+\z/i
def EmailScraper.scrape(doc)
email_addresses = []
File.open(doc) do |file|
while line = file.gets
temp = line.scan(EmailRegex)
temp.each do |email_address|
puts email_address
email_addresses << email_address
end
end
end
return email_addresses
end
end
if EmailScraper.scrape("email_tests.txt").empty?
puts "Empty array"
else
puts EmailScraper.scrape("email_tests.txt")
end
You used at the end \z try to use \Z according to http://www.regular-expressions.info/ruby.html it has to be a uppercase Z to match the end of the string.
Otherwise try to use ^ and $ (matching the start and the end of a row) this worked for me here on Regexr
When you read the file, the end of line is making the regex fail. In irb, there probably is no end of line. If that is the case, chomp the lines first.
regex=/\A[\w+\-.]+#[a-z\d\-.]+\.[a-z]+\z/i
line_from_irb = "example#live.com"
line_from_file = line_from_irb +"/n"
p line_from_irb.scan(regex) # => ["example#live.com"]
p line_from_file.scan(regex) # => []

Resources