Mail gem. Extract recipient display name and address as separate values - ruby

Using the Mail gem (i.e. Rails + ActionMailer), is there a clean way to get the display name of the recipient?
I can get the address with:
mail.to.first
And I can get the formatted display name + address with:
mail.header_fields.select{ |f| f.name == "To" }.first.to_s
But how can I get just the display name part (i.e. the part before the < and >)? I know somebody is going to suggest a regex, but that's not what I'm looking for, since I'd then have to parse out any encoding, which is something the Mail gem probably already does. I'm the author of a popular mailer library in PHP and am aware of the pitfalls of just assuming that the bit before < and > in the headers is human-readable when 8-bit characters come into play.
I can do this:
mail.header_fields.select{ |f| f.name == "To" }.first.parse.individual_recipients.first.display_name.text_value
But there must be a better way? :)

Figured it out, sorry. For anyone else who hits this thread looking for the solution:
mail[:to].display_names.first

The gotcha is that bracket access and dotted access are different for this gem.
From the doc:
mail = Mail.new
mail.to = 'Mikel Lindsaar <mikel@test.lindsaar.net>, ada@test.lindsaar.net'
mail.to #=> ['mikel@test.lindsaar.net', 'ada@test.lindsaar.net']
mail[:to] #=> '#<Mail::Field:0x180e5e8 @field=#<Mail::ToField:0x180e1c4
mail['to'] #=> '#<Mail::Field:0x180e5e8 @field=#<Mail::ToField:0x180e1c4
mail['To'] #=> '#<Mail::Field:0x180e5e8 @field=#<Mail::ToField:0x180e1c4
mail[:to].encoded #=> 'To: Mikel Lindsaar <mikel@test.lindsaar.net>, ada@test.lindsaar.net\r\n'
mail[:to].decoded #=> 'Mikel Lindsaar <mikel@test.lindsaar.net>, ada@test.lindsaar.net'
mail[:to].addresses #=> ['mikel@test.lindsaar.net', 'ada@test.lindsaar.net']
mail[:to].formatted #=> ['Mikel Lindsaar <mikel@test.lindsaar.net>', 'ada@test.lindsaar.net']
So to get the display name, you can use #display_name:
mail[:to].addrs.first.display_name #=> Mikel Lindsaar
Use #address to get the email address:
mail[:from].addrs.first.address #=> mikel@test.lindsaar.net

Related

Mail::Address not parsing correctly emails with comma

I'm trying to use Mail::Address to parse an email address, however the output is not as expected:
Mail::Address.new('Arnold, Roa <aroa@so.com>').address
=> "Arnold"
What is the problem and what alternatives do I have?
This works, not sure why the comma is there:
Mail::Address.new('Arnold, Roa <aroa@so.com>'.gsub(',','')).address
I've created an issue on the github project: https://github.com/mikel/mail/issues/1219
In the meantime I created this monkey patch (which is not good practice and should be avoided):
class Mail::Address
  class << self
    # Replace commas with spaces before the string reaches the parser
    def new(value = nil)
      if value.is_a? String
        value = value.gsub(',', ' ')
      end
      super(value)
    end
  end
end
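For what it's worth, the comma is the likely culprit because RFC 5322 treats an unquoted comma as the separator between addresses, so a display name containing one has to be wrapped in double quotes. Rather than patching Mail::Address, you could quote the name before parsing. This is only a sketch: quote_display_name is a hypothetical helper (not part of the mail gem), and it only handles the simple "Name <address>" form.

```ruby
# Hypothetical helper: wrap the display name in double quotes when it
# contains characters that RFC 5322 does not allow unquoted (a few of
# the "specials", such as the comma).
def quote_display_name(raw)
  match = raw.match(/\A\s*(?<name>[^<]*?)\s*<(?<addr>[^>]+)>\s*\z/)
  return raw unless match                       # not "Name <addr>"; leave as-is

  name = match[:name]
  return raw if name.empty? || name.start_with?('"')   # nothing to do / already quoted
  return raw unless name.match?(/[,;:()<>\[\]@\\"]/)   # no special characters

  escaped = name.gsub(/["\\]/) { |char| "\\#{char}" }  # escape quotes and backslashes
  %("#{escaped}" <#{match[:addr]}>)
end

puts quote_display_name('Arnold, Roa <aroa@so.com>')
```

The result can then be handed to Mail::Address.new; I have not checked this against every version of the gem.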

How to remove characters in string after email

I'm using this code to list email addresses from an HTML page.
require 'nokogiri'
selector = "//a[starts-with(@href, \"mailto:\")]/@href"
doc = Nokogiri::HTML.parse File.read 'in.rb'
nodes = doc.xpath selector
addresses = nodes.collect {|n| n.value[7..-1]}
puts addresses
This is sample code I'm parsing:
<a href="mailto:joe@example.com?subject=My Business Is Dying">
But I'm getting more than just the email address. I'm getting this in my results:
joe@example.com?subject=My Business Is Dying
How do I drop off everything after the question mark so it's only the email address?
You could always chop off anything after the ? character:
addresses.map! do |address|
  address.sub(/\?.*/, '')
end
I'd probably use one of these two:
str = 'joe@example.com?subject=My Business Is Dying'
str.split('?').first # => "joe@example.com"
str[/^[^?]+/] # => "joe@example.com"
The second is a simple regular expression embedded in String's [] (slice) method. The pattern basically says "start at the beginning and grab everything up until a question mark."
They're equivalent as far as speed goes. I'd probably use the first because it's easier to read.
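Both steps (dropping the "mailto:" prefix and dropping the query) can also be folded into one small method, which avoids the hard-coded [7..-1] offset. A sketch only: address_from_mailto is a made-up name, and the second href is invented sample data.

```ruby
# Strip the "mailto:" scheme and any "?subject=..." query from an href value.
def address_from_mailto(href)
  href.delete_prefix('mailto:').partition('?').first
end

hrefs = [
  'mailto:joe@example.com?subject=My Business Is Dying',
  'mailto:ann@example.com' # invented example without a query string
]
addresses = hrefs.map { |href| address_from_mailto(href) }
puts addresses
```

String#delete_prefix needs Ruby 2.5 or later; on older versions sub(/\Amailto:/, '') does the same job.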

Reading a Gmail Message with ruby-gmail

I am looking for an instance method from the ruby-gmail gem that would allow me to read either the body or the subject of a Gmail message.
After reviewing the documentation, found here, I couldn't find anything!?
There is a .message instance method found in the Gmail::Message class section; but it only returns, for lack of a better term, email "mumbo-jumbo," for the body.
My attempt:
#!/usr/local/bin/ruby
require 'gmail'
gmail = Gmail.connect('username', 'password')
emails = gmail.inbox.emails(:from => 'someone@mail.com')
emails.each do |email|
  email.read
  email.message
end
Now:
email.read does not work
email.message returns that, "mumbo-jumbo," mentioned above
Somebody else asked this question on SO but didn't get an answer.
This probably isn't exactly the answer to your question, but I will tell you what I have done in the past. I tried using the ruby-gmail gem but it didn't do what I wanted it to do in terms of reading a message. Or, at least, I couldn't get it to work. Instead I use the built-in Net::IMAP class to log in and get a message.
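require 'net/imap'
require 'mail' # needed for Mail.read_from_string below
imap = Net::IMAP.new('imap.gmail.com', port: 993, ssl: true)
imap.login('<username>', '<password>')
imap.select('INBOX')
# search_mail is a helper that is not shown here; it must return the ID of the message to fetch
subject_id = search_mail(imap, 'SUBJECT', '<mail_subject>')
subject_message = imap.fetch(subject_id, 'RFC822')[0].attr['RFC822']
mail = Mail.read_from_string subject_message
body_message = mail.html_part.body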
From here your message is stored in body_message and is HTML. If you want the entire email body you will probably need to learn how to use Nokogiri to parse it. If you just want a small bit of the message where you know some of the surrounding characters you can use a regex to find the part you are interested in.
I did find one page associated with the ruby-gmail gem that talks about using ruby-gmail to read a Gmail message. I made a cursory attempt at testing it tonight but apparently Google upped the security on my account and I couldn't get in using irb without tinkering with my Gmail configuration (according to the warning email I received). So I was unable to verify what is stated on that page, but as I mentioned my past attempts were unfruitful whereas Net::IMAP works for me.
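The search_mail helper used in the snippet is not defined anywhere in this answer. Purely as an assumption about what it might look like, here is one plausible shape built on Net::IMAP#search, which takes an array of search keys and returns the matching message sequence numbers:

```ruby
# Hypothetical stand-in for the search_mail helper (not the original code).
# `imap` is expected to respond to #search the way Net::IMAP does.
def search_mail(imap, key, value)
  ids = imap.search([key, value]) # e.g. ['SUBJECT', 'Weekly report']
  raise ArgumentError, "no message matches #{key} #{value.inspect}" if ids.empty?

  ids.first # sequence number of the first match
end
```

Raising on an empty result is a design choice of this sketch; returning nil and checking it at the call site would work just as well.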
EDIT:
I found this, which is pretty cool. You will need to add in
require 'cgi'
to your class.
I was able to implement it in this way. After I have my body_message, call the html2text method from that linked page (which I modified slightly and included below since you have to convert body_message to a string):
plain_text = html2text(body_message)
puts plain_text #Prints nicely formatted plain text to the terminal
Here is the slightly modified method:
def html2text(html)
  text = html.to_s.
    gsub(/(&nbsp;|\n|\s)+/im, ' ').squeeze(' ').strip.
    gsub(/<([^\s]+)[^>]*(src|href)=\s*(.?)([^>\s]*)\3[^>]*>\4<\/\1>/i,
         '\4')
  links = []
  linkregex = /<[^>]*(src|href)=\s*(.?)([^>\s]*)\2[^>]*>\s*/i
  while linkregex.match(text)
    links << $~[3]
    text.sub!(linkregex, "[#{links.size}]")
  end
  text = CGI.unescapeHTML(
    text.
      gsub(/<(script|style)[^>]*>.*<\/\1>/im, '').
      gsub(/<!--.*-->/m, '').
      gsub(/<hr(| [^>]*)>/i, "___\n").
      gsub(/<li(| [^>]*)>/i, "\n* ").
      gsub(/<blockquote(| [^>]*)>/i, '> ').
      gsub(/<(br)(| [^>]*)>/i, "\n").
      gsub(/<(\/h[\d]+|p)(| [^>]*)>/i, "\n\n").
      gsub(/<[^>]*>/, '')
  ).lstrip.gsub(/\n[ ]+/, "\n") + "\n"
  for i in (0...links.size).to_a
    text = text + "\n [#{i+1}] <#{CGI.unescapeHTML(links[i])}>" unless
      links[i].nil?
  end
  links = nil
  text
end
You also mentioned in your original question that you got mumbo-jumbo with this step:
email.message *returns mumbo-jumbo*
If the mumbo-jumbo is HTML, you can probably just use your existing code with this html2text method instead of switching over to Net::IMAP as I had discussed when I posted my original answer.
Nevermind, it's:
email.subject
email.body
silly me
ok, so how do I get the body in "readable" text? without all the encoding stuff and html?
Subject, text body and HTML body:
email.subject
if email.message.multipart?
  text_body = email.message.text_part.body.decoded
  html_body = email.message.html_part.body.decoded
else
  # Only multipart messages contain an HTML body
  text_body = email.message.body.decoded
  html_body = text_body
end
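One caveat with the branch above: to my knowledge text_part and html_part return nil when a multipart message has no part of that type, so chaining .body.decoded can blow up. A defensive sketch; extract_bodies is a hypothetical helper name, and message is anything that behaves like email.message:

```ruby
# Hypothetical helper: collect the decoded text and HTML bodies,
# tolerating multipart messages that lack one of the two parts.
def extract_bodies(message)
  if message.multipart?
    {
      text: message.text_part&.body&.decoded, # nil if there is no text/plain part
      html: message.html_part&.body&.decoded  # nil if there is no text/html part
    }
  else
    { text: message.body.decoded, html: nil }
  end
end
```

Usage would be bodies = extract_bodies(email.message), then checking bodies[:text] and bodies[:html] for nil before printing.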
Attachments:
email.message.attachments.each do |attachment|
  path = "/tmp/#{attachment.filename}"
  File.write(path, attachment.decoded)
  # The MIME type might be useful
  content_type = attachment.mime_type
end
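Since the filename comes from whoever sent the email, it may be worth stripping any directory components before writing to disk. A small sketch using only File.basename and File.join; safe_attachment_path and the 'attachment' fallback name are my own inventions:

```ruby
# Hypothetical helper: build a path inside `dir` from an untrusted filename,
# discarding directory components such as "../../".
def safe_attachment_path(dir, filename)
  name = File.basename(filename.to_s)
  # Fall back to a fixed name when nothing usable is left
  name = 'attachment' if name.empty? || ['.', '..', '/'].include?(name)
  File.join(dir, name)
end

puts safe_attachment_path('/tmp', '../../etc/passwd')
puts safe_attachment_path('/tmp', 'report.pdf')
```

In the loop above this would replace the string interpolation: path = safe_attachment_path('/tmp', attachment.filename).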
require 'gmail'
gmail = Gmail.connect('username', 'password')
emails = gmail.inbox.emails(:from => 'someone@mail.com')
emails.each do |email|
  puts email.subject
  puts email.text_part.body.decoded
end

How to detach an attachment for POP3 using ruby net/pop?

pop = Net::POP3.new mailhost
pop.start mailuser, mailpass
if pop.mails.empty?
  puts "Mailbox empty."
else
  pop.mails.each do |mail|
    if mail.pop.has_attachments?
      mail.pop.attachments.each do |attachment|
        puts attachment.original_filename
      end
    end
  end
end
gives undefined method 'has_attachments?' for #<String:0xb7cc4f7c>.
Is this example no longer working?
mail.pop returns a string representation of the email (see the corresponding docs). If you want to parse it and work with a mail object, you can do it like this:
email = Mail.new(mail.pop)
I really recommend taking a look at the docs: if you have big attachments you can run into memory issues, and the docs explain how to deal with that.
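On the memory point: as far as I know, Net::POPMail#pop also accepts a block and then yields the message in chunks instead of returning one big String. A rough sketch of spooling a message to disk that way; spool_message is a made-up helper name, and popmail is any object whose #pop yields chunks:

```ruby
require 'tempfile'

# Hypothetical helper: write a POP3 message to a temp file chunk by chunk
# instead of holding the whole message in one String.
def spool_message(popmail)
  file = Tempfile.new(['message', '.eml'])
  file.binmode
  popmail.pop { |chunk| file.write(chunk) } # block form yields pieces of the message
  file.flush
  file.rewind
  file # caller is responsible for close/unlink
end
```

Inside the pop.mails.each loop you would call spool_message(mail) and then work with the file rather than with the full string.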

Feedzirra cannot parse atom feeds

The idea of having a single parser for any kind of feed is great, and I was hoping that it would work for me.
I have been trying to get feedzirra to parse Atom feeds.
Specifically:
http://pindancing.blogspot.com/feeds/posts/default
http://adam.heroku.com/feed
Those are just two that I tried. The problem is that feedzirra cannot parse the entry URL; it always comes out nil:
feed = Feedzirra::Feed.fetch_and_parse(search.rss_feed_url)
p feed.entries.first.title
p feed.entries.first.url #=> returns nil
Is there anything I need to do to get it working?
thanks for your help
Hate to say "works for me", but, well, works for me:
require 'feedzirra'
urls = %w{
  http://adam.heroku.com/feed
  http://pindancing.blogspot.com/feeds/posts/default
}
urls.each do |url|
  feed = Feedzirra::Feed.fetch_and_parse(url)
  puts feed.entries.first.title
  puts feed.entries.first.url
end
# => Memcached, a Database?
# => http://adam.heroku.com/past/2010/7/19/memcached_a_database/
# => The answer to "Will you mentor me?" is
# => http://pindancing.blogspot.com/2010/12/answer-to-will-you-mentor-me-is.html
It'd help to see the rest of your code, particularly the actual parameter you're using in the fetch_and_parse method.
