Is it standard practice to block or allow email addresses with a ‘+’ in? - validation

I want each user to register with a unique email address. However, some email providers, such as Gmail, let you add a + suffix, which could be used to register multiple accounts on a website while all the mail still goes to a single address, e.g.
bob@gmail.com goes to bob@gmail.com
bob+1@gmail.com goes to bob@gmail.com
bob+2@gmail.com goes to bob@gmail.com
bob+3@gmail.com goes to bob@gmail.com
bob+4@gmail.com goes to bob@gmail.com
Effectively they can have as many email addresses as they want. This is a problem because my website sees these as 5 separate email addresses but Gmail sees them as one.
I was thinking of blocking any email address with a '+' in it, but I don't want to block valid email addresses. What is the standard practice?

I don't think there is a standard practice for handling this, other than not allowing + altogether. On the other hand, preventing it doesn't seem that useful. It takes only a few minutes to create an entirely new e-mail address on some free service, if whoever you intend to block really needs one.
It should also be noted that many other e-mail providers offer subaddressing too, not with the plus sign but with a hyphen (Yahoo, Runbox, etc.), and attempting to block that will only cause trouble for anybody who simply has a hyphen in their e-mail address. It's a war that you've already lost.
Besides, if you filter out plus signs, you are rejecting addresses that RFC 3696 describes as valid:
The exact rule is that any ASCII character, including control
characters, may appear quoted, or in a quoted string. [...]
Without quotes, local-parts may consist of any combination of
alphabetic characters, digits, or any of the special characters
! # $ % & ' * + - / = ? ^ _ ` . { | } ~
But you could just strip out the plus part if you insist.
$emails = array('bob@gmail.com', 'bob+1@gmail.com', 'bob+hello@gmail.com');
foreach ($emails as &$email)
{
    // Split into local part and domain at the "@".
    list($identifier, $domain) = explode('@', $email);
    // Keep only what precedes the first "+".
    list($name) = explode('+', $identifier);
    $email = $name . "@" . $domain;
}
unset($email); // break the reference left over from the foreach
print_r($emails);
The above will give you
Array
(
    [0] => bob@gmail.com
    [1] => bob@gmail.com
    [2] => bob@gmail.com
)

Email addresses can contain many characters that look incorrect to us. I found a good thread which might answer your query: What characters are allowed in an email address?
Also, to detect duplicates, take the local part of the address (everything before the @), drop the + suffix and the . characters, and compare the result. Note that ignoring dots is Gmail behaviour; other providers may treat dotted addresses as distinct mailboxes.
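A minimal Ruby sketch of that normalisation, for illustration only. The helper name `canonical_email` and the list of Gmail domains are assumptions, as is the choice to strip the + suffix for every provider. The result is meant as a comparison key for spotting duplicates; the address the user actually typed should still be the one stored and mailed.

```ruby
# Hypothetical helper: builds a comparison key for duplicate detection.
def canonical_email(address)
  # Assumption: these are the domains where dots in the local part are ignored.
  gmail_domains = %w[gmail.com googlemail.com]

  # rpartition splits on the LAST "@" in the string.
  local, at, domain = address.strip.rpartition('@')
  return nil if at.empty? || local.empty? || domain.empty?

  domain = domain.downcase
  # Drop the "+tag" suffix (everything from the first "+" onward).
  local = local.split('+', 2).first.to_s
  # Only for Gmail: ignore dots and case in the local part.
  local = local.delete('.').downcase if gmail_domains.include?(domain)
  return nil if local.empty?

  "#{local}@#{domain}"
end

emails = %w[bob@gmail.com bob+1@gmail.com b.o.b+hello@GMAIL.com bob+1@example.org]
keys = emails.map { |e| canonical_email(e) }
p keys
```

The first three addresses collapse to the same key, while the `example.org` address only loses its + suffix. A uniqueness check could then be run against the key rather than the raw input.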

Related

Add space in Twilio alphanumeric sender ID using Ruby

I am using Twilio to send SMS to my users. I recently learned that I can change from_number to my app name, "Top Expert", using an alphanumeric sender ID.
The name I am sending is "Top Expert", but the SMS arrives as "TopExpert". Why is the space being dropped? Please help me resolve the problem.
require 'twilio-ruby'

@client = Twilio::REST::Client.new ENV["TWILLIO_ACCOUNT_SID"],
                                   ENV["TWILLIO_AUTH_TOKEN"]
@client.account.messages.create(
  from: "Top Expert",
  to: "+610412345678",
  body: "Sample testing"
)
Twilio developer evangelist here. As per comments above and documentation here, I can confirm that you can only use characters that are A-Za-z or 0-9, so a space won't work.
The docs read:
What characters can I use as the sender ID?
You may use any combination of 1 to 11 letters, A-Z and numbers, 0-9.
Both lowercase and uppercase characters are supported as well as
spaces. 1 letter and no more than 11 alphanumeric characters may be
used.
That is because carriers won't accept anything that is outside of that scope.
Hope this helps you
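As a rough illustration, the rules in the quoted documentation text can be checked before calling the API. The helper `valid_sender_id?` below is hypothetical and not part of the twilio-ruby gem. Because the answer's first sentence and the quoted text differ on whether a space is accepted, that rule is left as a switch rather than decided here.

```ruby
# Hypothetical helper, not part of the twilio-ruby gem.
# Rules taken from the quoted text: 1 to 11 characters, letters and digits,
# at least one letter. Spaces are optional via allow_spaces (an assumption).
def valid_sender_id?(id, allow_spaces: true)
  allowed = allow_spaces ? /\A[A-Za-z0-9 ]{1,11}\z/ : /\A[A-Za-z0-9]{1,11}\z/
  id.match?(allowed) && id.match?(/[A-Za-z]/)
end

puts valid_sender_id?("Top Expert")                       # true
puts valid_sender_id?("Top Expert", allow_spaces: false)  # false
puts valid_sender_id?("TopExpert", allow_spaces: false)   # true
puts valid_sender_id?("12345")                            # false (no letter)
puts valid_sender_id?("TopExpertsOnline")                 # false (over 11 characters)
```

Passing this check says nothing about what a given carrier will actually deliver; it only catches IDs that break the documented format.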

Clean string to get Email with Regex [closed]

I have some Ruby code that extracts email addresses from a page. My code outputs the email address, but it also captures other text.
I would like to pull the actual email out of this string. Sometimes the string will include a mailto, sometimes it will not. I was trying to get the single word that occurs before the @, and anything that comes after the @, by using a split, but I'm having trouble. Any ideas? Thanks!
href="mailto:someonesname@domain.rr.com"> | Email</a></td>
Use something prebuilt:
require 'uri'
addresses = URI.extract(<<EOT, :mailto)
this is some text. mailto:foo@bar.com and more text
and some more http://foo@bar.com text
href="mailto:someonesname@domain.rr.com"> | Email</a></td>
EOT
addresses # => ["mailto:foo@bar.com", "mailto:someonesname@domain.rr.com"]
URI comes with Ruby, and the pattern used to parse out URIs is well tested. It's not bullet-proof, but it works pretty well. If you're getting false-positives, you can use a select, reject or grep block to filter out the unwanted entries returned.
If you can't count on having mailto:, the problem becomes harder, because email addresses aren't simple to parse; there's too much variation in them. The problem is akin to validating an email address using a pattern, because, again, the format for addresses varies too much. "Using a regular expression to validate an email address" and "JavaScript Email Validation when there are (soon to be) 1000's of TLD's?" are good reads for more information.
This should also work nicely though won't account for invalid email formats - it will simply extract the email address based on your two use cases.
string[/[^\"\:](\w+@.*)(?=\")/]
This should work
inputstring[/href="[^"]+"/][6 .. -2].gsub("mailto:", "")
Explanation:
Grab the href attribute and its contents
Remove the href= and quotes
Remove the mailto: if it's there
Example:
irb(main):021:0> test = "href=\"mailto:francesco@hawaii.rr.com\"> | Email DuVin</a></td>"
=> "href=\"mailto:francesco@hawaii.rr.com\"> | Email DuVin</a></td>"
irb(main):022:0> test[/href="[^"]+"/][6 .. -2].gsub("mailto:", "")
=> "francesco@hawaii.rr.com"
irb(main):023:0> test = "href=\"francesco@hawaii.rr.com\"> | Email DuVin</a></td>"
=> "href=\"francesco@hawaii.rr.com\"> | Email DuVin</a></td>"
irb(main):024:0> test[/href="[^"]+"/][6 .. -2].gsub("mailto:", "")
=> "francesco@hawaii.rr.com"

How can I sort an array of emails by the email provider?

So I dumped all the emails from a DB into a txt file and I'm looking to sort them by email provider, basically anything that comes after the @ sign.
I know I can use regex to validate each email.
However, how do I indicate that I want to sort them by anything that comes after the @ sign?
"I know I can use regex to validate each email."
Careful! The range of valid e-mail addresses is much wider than most people think. The only correct regexes for e-mail validation are on the order of a page in length. If you must use a regex, just check for the @ and one dot.
"However how do I indicate that I want to sort them by anything that comes after the @ sign"
email_addresses.sort_by {|addr| addr.split('@').last }
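A slightly fuller sketch of the same idea, assuming the addresses are already loaded into an array (the sample data is made up). It lower-cases the domain so that `GMAIL.com` and `gmail.com` sort together, uses the full address as a tie-breaker, and then groups the result by provider.

```ruby
# Made-up sample data standing in for the dumped text file.
email_addresses = %w[
  carol@yahoo.com
  alice@gmail.com
  dave@GMAIL.com
  bob@example.org
]

# Sort by the lower-cased domain first, then by the full address as a tie-breaker.
sorted = email_addresses.sort_by do |addr|
  [addr.rpartition('@').last.downcase, addr.downcase]
end

# Collect the sorted addresses under each provider.
by_provider = sorted.group_by { |addr| addr.rpartition('@').last.downcase }

puts sorted
p by_provider.keys
```

`rpartition('@')` splits on the last at-sign, which behaves the same as `split('@').last` for ordinary addresses.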

What is the actual minimum length of an email address as defined by the IETF?

I'm specifically looking for the minimum length of the prefix and domain.
I've seen conflicting information and nothing that looks authoritative.
For reference, I found this page which claims that a one character email address is functional:
http://www.cjvandyk.com/blog/Lists/Posts/Post.aspx?ID=176
I tried validating email addresses at Gmail, and it expects a prefix of at least 6 characters.
These are obviously way off.
My web framework expects a prefix of at least 2 characters.
The shortest valid email address may consist of only two parts: name and domain.
name@domain
Since both the name and domain may have the length of 1 character, the minimal total length resolves to 3 characters.
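The structural minimum described above can be sketched in Ruby as follows. The helper name `plausible_address?` is made up for illustration, and the check is deliberately loose: it only confirms a non-empty local part and a non-empty domain around an at-sign. The upper bounds of 64 and 255 octets are the maximum local-part and domain lengths given in RFC 3696, section 3.

```ruby
# Hypothetical helper: a structural check only. It does not prove that the
# address exists or that any mail server will accept it.
def plausible_address?(address)
  local, at, domain = address.rpartition('@')
  return false if at.empty?

  # RFC 3696, section 3: at most 64 octets before the "@", 255 after it.
  (1..64).cover?(local.bytesize) && (1..255).cover?(domain.bytesize)
end

puts plausible_address?("i@y")     # true: 3 characters, the structural minimum
puts plausible_address?("i@g.cn")  # true
puts plausible_address?("@y")      # false: empty local part
puts plausible_address?("i@")      # false: empty domain
puts plausible_address?("iy")      # false: no at-sign
```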
Well, the problem is really the question. It depends on whether the email is sent over the internet or within a closed system (e.g. an intranet). Over the internet, I believe x@y.zz is the shortest address possible (e.g. Google's g.cn domain for China would allow i@g.cn, which is 6 characters long). On an intranet, however, it is an entirely different matter: i@y would be possible, which is just 3 characters long.
I believe the standard you are looking for is RFC 2822 - Internet Message Format
More specific info on email address restrictions in RFC 3696 - Section 3
To quote the spec:
Contemporary email addresses consist of a "local part" separated from a "domain part" (a fully-qualified domain name) by an at-sign ("@").
So three characters is the shortest.
I originally got this info from Phil Haack's blog post.
Many mail servers will not accept an email address unless there are at least 2 characters before the @.
That doesn't make it an invalid address, but if the servers don't know that, it sure can lead to a lot of problems.

Extract email addresses from a block of text

How can I create an array of email addresses contained within a block of text?
I've tried
addrs = text.scan(/ .+?@.+? /).map{|e| e[1...-1]}
but (not surprisingly) it doesn't work reliably.
How about this for a (slightly) better regular expression? Note that it lists only upper-case letters, so it needs to be used case-insensitively.
\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b
You can find this here:
Email Regex
Just an FYI, the problem with your regex is that it allows only one type of separator (a space) before or after an email address. You would match "@" alone, if separated by spaces.
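For illustration, here is how the suggested pattern might be used with `String#scan` in Ruby. The `/i` flag is added because the pattern lists only upper-case letters, and the sample text is made up. The `{2,4}` limit is kept from the pattern as given, so top-level domains longer than four letters will be missed; this is a sketch, not a complete address parser.

```ruby
# The pattern from the answer above, with /i so lower-case addresses match.
# Assumption: {2,4} is kept as given, so longer TLDs are not matched.
email_pattern = /\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b/i

# Made-up sample text with two addresses and a stray at-sign.
text = "Contact alice@example.com, or Bob <bob.smith+news@mail.example.org>; not @ alone."

# scan returns every non-overlapping match; uniq drops repeated addresses.
addrs = text.scan(email_pattern).uniq
p addrs
```

Because the pattern relies on word boundaries rather than spaces, punctuation such as commas and angle brackets around an address does not stop it from matching, and a bare at-sign is ignored.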

Resources