Regex email validation in Ruby

This is my regex for email validation, but I want to restrict consecutive punctuation: I don't want ., _ or - to be repeated consecutively. Can anyone help me?
/^((?:[a-z]+[0-9_\.-]*)+[a-z0-9_\.-]*[a-z0-9])#((?:[a-z0-9]+[\.-]*)+\.[a-z]{2,4})$/
For example:
test..test#example.com should be rejected; instead I want test.test#example.com, test_test#example.com or test-test#example.com.

You can use the following regex to reject consecutive periods:
^(?!.*\.{2})\A\S+#.+\.\S+\z
You can add
^(?!.*\.{2})
in front of any email regex to reject consecutive dots. This lookahead only checks periods.
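A minimal Ruby sketch of that prefix, keeping this thread's # separator. The constant names and sample addresses are made up for illustration, and the second pattern, which widens the lookahead to cover _ and - as well, is an assumption about what the question wants rather than part of the answer above.

```ruby
# Hypothetical constants; addresses use "#" as the separator, as in this thread.
NO_DOUBLE_DOT = /\A(?!.*\.{2})\S+#.+\.\S+\z/
# Assumption: also reject any two of ". _ -" in a row, as the question asks.
NO_DOUBLE_PUNCT = /\A(?!.*[._\-]{2})\S+#.+\.\S+\z/

samples = %w[
  test.test#example.com
  test..test#example.com
  test__test#example.com
  test._test#example.com
]

samples.each do |addr|
  dots  = NO_DOUBLE_DOT.match?(addr)
  punct = NO_DOUBLE_PUNCT.match?(addr)
  puts "#{addr}: dots-only check=#{dots}, widened check=#{punct}"
end
```

Only the first sample passes both checks; the last two pass the dots-only check because it never looks at _ or -.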

How does this regular expression limit email addresses to ".com" instead of "...com"

The regex below:
EMAIL_REGEX = /\A[\w+\-.]+#[a-z\d\-.]+\.[a-z]+\z/i
is what I initially used to validate email format. After finding that the format "name#email...com" was passing my tests, I copy/pasted a different regex that limits the number of periods. It looks like this:
EMAIL_REGEX = /\A[\w+\-.]+#[a-z\d\-]+(?:\.[a-z\d\-]+)*\.[a-z]+\z/i
The main difference is the piece of regex below:
(?:\.[a-z\d\-]+)
I can't quite figure out how this bit works. Can someone break it down for me?
Notice that in this subexpression:
(?:\.[a-z\d\-]+)
The character class [a-z\d-] does not contain a period. The expression requires there to be at least one (+) of those characters after the period (\.) in order to match. Therefore, a series of periods with no letters or digits or hyphens between them won't match the repetition of the subexpression.
The problem with your regular expression here is that you're allowing for multiple dots:
/[a-z\.]+\.[a-z]+\z/
To fix this you need to make your repeating pattern more specific in terms of structure:
/(?:[a-z]+\.)+[a-z]+\z/
That means you can have one or more repeating groups of letters plus dot. That will exclude multiple dots in a row.
Do keep in mind that email addresses are getting increasingly insane with the introduction of new gTLDs, which are often used without any sort of prefix. That is, example#google may be a valid address in the future, so you can't expect there to be a dot in the domain.
You have [a-z\d\-]+(?:\.[a-z\d\-]+)*. The leading [a-z\d\-]+ ensures that this part of the string starts with at least one non-period character. Each (?:\.[a-z\d\-]+) group allows exactly one period, and that period \. must be followed by [a-z\d\-]+, i.e. at least one non-period character. So whenever a period appears, it has at least one non-period character on its left and on its right. In other words, consecutive periods are not allowed.
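To see the difference the answers describe, here is a short Ruby sketch that runs both patterns from the question over a few made-up addresses (the constant names and samples are hypothetical; # is the separator as written in this thread).

```ruby
# The two patterns from the question, under hypothetical names.
LOOSE_EMAIL  = /\A[\w+\-.]+#[a-z\d\-.]+\.[a-z]+\z/i
STRICT_EMAIL = /\A[\w+\-.]+#[a-z\d\-]+(?:\.[a-z\d\-]+)*\.[a-z]+\z/i

%w[name#email.com name#mail.email.com name#email...com].each do |addr|
  loose  = LOOSE_EMAIL.match?(addr)
  strict = STRICT_EMAIL.match?(addr)
  puts "#{addr}: loose=#{loose}, strict=#{strict}"
end
```

Both patterns accept the first two addresses; only the loose one accepts name#email...com. The part before the # is still [\w+\-.]+ in both, so consecutive periods there (for example na..me#email.com) are not caught by either pattern.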

Phone regex is not completely working

In my country, phone numbers follow a format like (XX)XXXX-XXXX. But not everyone types phone numbers into text inputs according to that pattern: some people follow it, some don't. I'd like to make a regex that catches all the possible cases. So far it looks like this:
/^[\(]?\d{2}?[\)]?\d{4}[. -]?\d{4}$/
I prepared some test cases to check the regex:
# GOOD PHONES #
8432115262
843211 5262
843211.5262
843211-5262
32115262
3211.5262
3211 5262
3211-5262
(84)32115262
(84)3211.5262
(84)3211 5262
(84)3211-5262
# BAD PHONES #
!##$%*()
()32115262
()1231 3213
()1231.3213
()1231-3213
().3213
()-3213
()3213.
()3213-
3211-5a62
sakdiihbnmwlzi
Unfortunately, the bad case ()32115262 gets past the regex, although it is clear why: the part [\(]?\d{2}?[\)]? is responsible for the mistake. From left to right, you can enter zero or one (, then zero or two digits, then zero or one ).
I'd like that part to work like this: if you type (, you must then enter two digits and ); otherwise you can enter zero or two digits. Is something like this, or something with similar semantics, possible in the regex world?
Thanks in advance.
Something like this perhaps:
/^(?:\(\d{2}\)|\d{2}?)\d{4}[. -]?\d{4}$/
I used a non-capturing group (?: ... ) and alternation to provide two possible options for the first part of the phone number.
Either it is \(\d{2}\), which means parentheses around exactly two digits, or it is \d{2}?, which is meant as two digits or the empty string. (That reading holds in Ruby's regex engine; in engines such as PCRE or JavaScript, {2}? is a lazy quantifier that still requires both digits.)
Combine these two options with | (which means OR) and you get the first part of the regex above: (?:\(\d{2}\)|\d{2}?)
It seemed to work for all your test cases!
Try this: ^(?:\(\d\d\)|\d\d)?\d{4}[. -]?\d{4}$
If the parentheses are present, they must enclose exactly two digits; otherwise the two leading digits are optional.
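A quick way to check a pattern like this is to run it over the test cases from the question. The sketch below does that in Ruby with the pattern from the second answer and a subset of the listed numbers; the constant and variable names are hypothetical.

```ruby
# Pattern from the second answer: the whole area-code alternative is optional.
PHONE = /^(?:\(\d\d\)|\d\d)?\d{4}[. -]?\d{4}$/

# A subset of the question's test cases.
good = ['8432115262', '843211 5262', '3211-5262', '32115262', '(84)3211.5262']
bad  = ['()32115262', '()1231 3213', '()3213-', '3211-5a62', 'sakdiihbnmwlzi']

good.each { |n| puts "#{n}: #{PHONE.match?(n) ? 'accepted' : 'REJECTED (unexpected)'}" }
bad.each  { |n| puts "#{n}: #{PHONE.match?(n) ? 'ACCEPTED (unexpected)' : 'rejected'}" }
```

Because an opening parenthesis can only be consumed by the \(\d\d\) alternative, the empty () cases no longer slip through.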

How do I regex a name and an email out of the 3 major email clients in ruby?

I thought I had it figured out, but it appears that my regex still has quirks in it. Basically I would like to use the same regex pattern to match the following major email clients (Gmail, Yahoo, and regular email):
"Brian Mang" <brian.mang#email.com> -- Case1
Brian Mang (brian.mang#email.com) -- Case2
<brian.mang#email.com> -- Case3
brian.mang#email.com -- Case4
I had the following regex pattern:
/[\W"]*(?<name>.*?)[\"]*?\s*[<(](?<email>\w.*)[>)]/.match(contact)
and it works for Cases 1-3, but I can't get it to pick up Case 4. I tried messing around with it, but every change breaks the other cases. Any idea what I need to modify to make my regex pick up all four cases? Thank you.
Try this
[\W"]*(?<name>.*?)[\"]*?\s*[<(]?(?<email>\S+#\S+)[>)]?
I made the character classes surrounding the address optional and changed the part that matches the email to \S+#\S+, which means at least one non-whitespace character, followed by a #, then at least one more non-whitespace character.
Since the version above also captures the closing character, you can restrict the part after the # a bit more:
[\W"]*(?<name>.*?)[\"]*?\s*[<(]?(?<email>\S+#[^\s>)]+)[>)]?
Edit: This one works for all four:
[\W"]*(?<name>.*?)[\"]*?\s*[<(]?(?<email>\S+#[^)>]+)[>)]?
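As a rough check, the final pattern can be wrapped in a small Ruby helper and run over the four cases from the question. The helper name parse_contact and the constant name are hypothetical, and # is the separator as written in this thread.

```ruby
# Final pattern from the answer above, under a hypothetical name.
CONTACT = /[\W"]*(?<name>.*?)[\"]*?\s*[<(]?(?<email>\S+#[^)>]+)[>)]?/

# Hypothetical helper: returns [name, email], or nil when nothing matches.
def parse_contact(contact)
  m = CONTACT.match(contact)
  m && [m[:name], m[:email]]
end

[
  '"Brian Mang" <brian.mang#email.com>',
  'Brian Mang (brian.mang#email.com)',
  '<brian.mang#email.com>',
  'brian.mang#email.com'
].each { |c| p parse_contact(c) }
```

For Cases 3 and 4 the name group matches the empty string, so callers may want to treat an empty name as "no name given".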

How can I sort an array of emails by the email provider?

So I dumped all the emails from a DB into a txt file and I'm looking to sort them by email provider, basically anything that comes after the # sign.
I know I can use regex to validate each email.
However, how do I indicate that I want to sort them by anything that comes after the # sign?
"I know I can use regex to validate each email."
Careful! The range of valid e-mail addresses is much wider than most people think. The only correct regexes for e-mail validation are on the order of a page in length. If you must use a regex, just check for the # and one period (.).
"However how do I indicate that I want to sort them by anything that comes after the # sign?"
email_addresses.sort_by {|addr| addr.split('#').last }
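A small Ruby sketch of the same sort_by idea on made-up data. The sample addresses are hypothetical, and adding the full address as a second sort key is an assumption, included only so that addresses at the same provider come out in a predictable order.

```ruby
# Hypothetical sample data; "#" is the separator as written in this thread.
email_addresses = %w[bob#yahoo.com alice#gmail.com carol#aol.com dave#gmail.com]

# Sort by provider first, then by the whole address to break ties.
sorted = email_addresses.sort_by { |addr| [addr.split('#').last.downcase, addr] }

puts sorted
```

If the goal is to process one provider at a time rather than produce a flat sorted list, group_by with the same split would give a hash keyed by provider.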

Using Regex to grab multiple values from a string and drop them into an array?

Trying to grab the two $ values and the X value from this string in Ruby/watir:
16.67%: $xxx.xx down, includes the Policy Fee, and x installments of $xxx.xx
So far I've got:
16.67%:\s+\$(\d+.\d{2})
which grabs the first xxx.xx fine. What do I need to add to grab the other two values and load them all into an array?
You can use the following, but regex may be unnecessary if the surrounding text is always the same:
\$(\d+\.\d{2}).*?(\d+) installments.*?\$(\d+\.\d{2})
http://www.rubular.com/r/sk5wO3fyZF
If you know that the text in between will always be the same, you could just use:
16\.67%:\s+\$(\d+\.\d{2}) down, includes the Policy Fee, and (\d+) installments of \$(\d+\.\d{2})
You'd be better off using scan:
sub(/.*%/, '').scan(/\$?([\d\.]+)/)
Have you considered just splitting the string on the $ character, then manipulating what you get with a regex or basic string methods?
/\$(\d+\.\d{2}).+\$(\d+\.\d{2})/ should do it for the two $ values. It won't matter what text is there, only that there are two "$" in the sentence.
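To load all three values into an array, one option is MatchData#captures. The sketch below assumes a sample line with concrete numbers in place of the x placeholders (the numbers and names are made up) and uses the three-group pattern from the first answer with the decimal points escaped.

```ruby
# Hypothetical sample line with concrete numbers in place of the x placeholders.
line = '16.67%: $123.45 down, includes the Policy Fee, and 6 installments of $67.89'

# Three capture groups: down payment, installment count, installment amount.
PAYMENT = /\$(\d+\.\d{2}).*?(\d+) installments.*?\$(\d+\.\d{2})/

m = PAYMENT.match(line)
values = m ? m.captures : []   # empty array when the line does not match
p values
```

The captures come back as strings, so convert them with to_f or to_i if you need numbers.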
