Email archiving with Ruby

I'm looking for information on any libraries or methods that would help me to build an email archiving system using Ruby (I'm open to other languages if suggested).
The application would need to do the following:
1) Sit on an incoming mail server, receiving and storing all incoming email.
2) After storing an email, push it out to our actual email server.
3) The email archive should be searchable.
Any thoughts on this are appreciated; I can't seem to find an existing project that does this.

Even though I'm a big Ruby fan, I'll point out that Zed Shaw has written a very interesting and configurable SMTP server in Python, called Lamson:
http://lamsonproject.org/
I've never used Lamson, but I think that with minimal tweaking you could make it store e-mails in almost any DB you choose, and forward them easily wherever you like.
Once you have all your emails in a DB, it should be a relatively easy task to build a front-end to the DB with Ruby (and/or Rails) if you wish.
Since processing e-mails can be fairly tricky stuff, using something purpose-built like Lamson as your intermediate processor might be worth a shot.

The Lamson project looks pretty awesome. If you're looking to actually implement something yourself, I posted a blog post a while back on some of the best methods to receive email in Ruby. There are also plenty of ways to push the mail back out again fairly easily, though it's probably better to rely on a system that already has all of this functionality.
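For what it's worth, the receive-store-forward flow is small enough to sketch. Below is a rough Ruby sketch, not a production archiver: it assumes your MTA pipes each raw message to the script on stdin (e.g. via an aliases or .forward entry), that the mail and sqlite3 gems are installed, and that an emails table already exists; the relay hostname is a placeholder.

    require 'mail'       # gem install mail
    require 'sqlite3'    # gem install sqlite3
    require 'net/smtp'   # stdlib

    raw = STDIN.read                  # full raw RFC 2822 message from the MTA
    msg = Mail.read_from_string(raw)  # parsed copy, used for headers only

    # 1) Archive first: store the complete raw message so nothing is lost.
    db = SQLite3::Database.new('archive.db')
    db.execute(
      'INSERT INTO emails (message_id, from_addr, subject, raw) VALUES (?, ?, ?, ?)',
      [msg.message_id, (msg.from || []).join(','), msg.subject, raw]
    )

    # 2) Relay the untouched message on to the actual mail server.
    Net::SMTP.start('real-mail-server.internal', 25) do |smtp|
      smtp.send_message(raw, msg.from.first, *msg.to)
    end

    # 3) Searchability is then just a query (or a full-text index such as
    #    SQLite FTS, Ferret, or Sphinx) over the emails table.

The order matters: archive first, then relay, and let the MTA's own retry/bounce handling deal with delivery failures.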

Related

Calendar integration with Domino (Lotus Notes)?

How do I integrate with a Lotus Notes Domino server? I know there are several versions and the answer would be different for each one, but advice on any version would be great at the moment as I haven't gotten the info on what server it is I'm supposed to integrate with yet. Assume version 6+.
I'm assuming I need to do the integration with the server and not the local Lotus Notes client, but that might not be correct?
I need to both read and write to the calendar appointments of a select number of users.
For instance, I should be able to create/update/delete an appointment for a certain user.
The appointments are the only thing I need access to, at the moment I have no need for the mails.
From what I have read on the internet, there is no standard interface to do this. Is that correct?
Should I develop a Domino app that does what I want?
Maybe there is a server API that I can use to connect and retrieve information?
Hopefully this can be done in C#? If not, what is the preferred way? I read something about Java, and that is doable also.
If you don't have any concrete answers but you have useful links, please post those as comments.
I have used the Java and C++ APIs to read a Domino calendar. Depending on the scenario, a server-side solution can run into trouble if you want to do more than read -- the workflow sometimes needs the Notes client. We'd need to understand more about what you intend to do.
API documentation:
http://www.ibm.com/developerworks/lotus/downloads/toolkits.html
I'd use Java.
Here's Domino Designer help section on Java:
http://publib.boulder.ibm.com/infocenter/domhelp/v8r0/topic/com.ibm.designer.domino.main.doc/H_9_CODING_GUIDELINES_JAVA.html?resultof=%22%6a%61%76%61%22%20
First, read the "Running a Java program" section.
Then you'll be interested in the "Accessing databases" link.
Here's an example of how to access a user's mail DB (calendar items live inside the mail DB in Lotus):
http://publib.boulder.ibm.com/infocenter/domhelp/v8r0/topic/com.ibm.designer.domino.main.doc/H_EXAMPLES_OPENMAIL_METHOD_JAVA.html
GooCalSync (on OpenNTF) and LotusNotes-Google Calendar Synchronizer (on SourceForge) are great examples of how to do this in Java.
The best way to do this without the pain of having to write code is to use iCal. You will run into all sorts of issues with access, reading appointments, etc. that are best left to Domino to handle.
There are some good documents on the web on iCal support in Domino.
I've done this before for a CRM product (clearc2.com). iCal is easy, but if you want to do more than insert items and actually do a bi-directional sync to the calendars (which are mail databases on a Domino server), then I would look at the appendix of the Lotus Notes C API first. There is a section that explains the C&S (calendaring and scheduling) piece fairly well. You do not need to use the C API to do the work, but it will explain what the many C&S items (fields) are for.
My advice is to keep it simple; e.g., do not try to tackle repeating items (appointments/tasks) on the first attempt. And try not to re-use any custom product objects you find in the mail template. These are undocumented Notes classes and can go away at any time. Furthermore, they may not work the same from one point release, or even incremental release, to the next. The mail template code can be evil.

How to set up a computer network with Ruby

I would like to set up a network with some computers I have, where they all connect to one main source and send messages to and receive messages from it. I have never done any network programming before, so I'm just wondering what the best tutorials using Ruby are.
Thanks in advance.
There are about a billion ways you could do this. Could you post more about what the problem is you're trying to solve, or what the content/purpose/size/format/etc. of the messages is to be? Are you building something "for real" or just trying to learn network programming?
Also, do you already have the lower-layer stuff figured out? Do you have networking infrastructure set up, IP addresses assigned, etc.? If not, you'll need to get through that first.
Once you have that, you could start with a tutorial on basic socket programming in Ruby, but -- depending on the answers to the questions above -- you might not want to "roll your own" solution at that level. The answer might be to use an XMPP (Jabber) server and an XMPP client library, or you might want to deploy something like ActiveMQ or HornetQ and use a library for interfacing with that. Or maybe you want to use HTTP and pass messages around in JSON, or XML, or $WHATEVER. In short, there are a LOT of options in this area.
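If the goal is partly to learn, the lowest-level option is small enough to show. Here is a toy sketch of the "one main source, many clients" setup using only Ruby's stdlib sockets; the port and hostname are made up, and there is no error handling or message framing:

    # server.rb -- the "main source": accept clients, echo messages back.
    require 'socket'

    server = TCPServer.new(4000)
    loop do
      Thread.new(server.accept) do |client|
        while (line = client.gets)            # one newline-terminated message
          client.puts "got: #{line.chomp}"    # reply to the sender
        end
        client.close
      end
    end

    # client.rb -- run on each of the other computers.
    require 'socket'

    sock = TCPSocket.new('main-source.local', 4000)  # placeholder hostname
    sock.puts 'hello from node 1'
    puts sock.gets                                   # => "got: hello from node 1"
    sock.close

Anything real should add proper framing (length prefixes, or one of the protocols above) rather than trusting newline-delimited text.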

Streaming, Daemons, Cronjobs, how do you use them? (in Ruby)

I've finally had a second to look into streaming, daemons, and cron tasks and all the neat gems built around them! But I'm not clear on how/when to use these things.
I have a few questions:
1) If I wanted to have a website that stayed constantly updated, in real time, with my Facebook friends' activity feeds, up-to-the-minute Amazon book reviews of my favorite books, and my Twitter feed, would I just create some custom streaming implementation using the Daemons gem, the ruby-yali gem for streaming the content, and the Whenever gem, which could, say, check those sites every 3-10 seconds to see if the content I'm looking for has changed? Is that how it would work? Or is it typically/preferably done differently?
2) Is (1) too processor-intensive? Is there a better way to do it, a better way for live content streaming, given that the website you want realtime updates from doesn't have a streaming API? I'm thinking about just sending a request every few seconds from a separate small Ruby app (with daemons and cron jobs), getting the JSON/XML result, using Nokogiri to remove the stuff I don't need, then going through the small list of comments/books/posts/etc., building a feed of what's changed, and using Juggernaut or something to push those changes to some Rails app. Would that work?
I guess it all boils down to the question:
How does real-time streaming of the latest content of some website work? How do YOU do it?
...so if someone is on my site, they can see in real time the new message or new book that just came out?
Looking forward to your answers,
Lance
Well, first: if a website doesn't provide an API, that's a strong indication that parsing and extracting their data may not be legal; in any case, you'd better check their terms of use and privacy policy.
Personally I'm not aware of something called a "streaming API", but supposing they have an API, you still need to pull the results it provides (XML, JSON, ...), parse them, and present them back to the user. The strategy will vary depending on your app type:
Desktop app: you can just pull the data directly, parse it, and present it to the user; many apps work this way, Twhirl for example.
Web app: then you need to cut down the time spent extracting the data. Typically you will pull the data from the API and store it. However, storing the data is a bit tricky! You don't want your database to become a bottleneck for the app under the heavy pull queries it is going to get to retrieve the data back. One way around this is to use a push methodology: follow option 2 in this case to get the data, and then push it to the user. If you want instant updates, like chat for example, have a look at Orbited. If it's OK to save the data to some kind of user and followers' "inboxes", then the simplest way, as far as I can tell, is to use IMAP to send the updates to the user's inbox.
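As a concrete illustration of the polling half of option 2 in the question, here is a stdlib-only Ruby sketch; the endpoint URL and the "id" field are invented, and a real version would need error handling and backoff:

    require 'net/http'
    require 'json'
    require 'set'

    FEED_URI = URI('http://example.com/activity.json')  # placeholder endpoint
    seen = Set.new                                      # ids already pushed

    loop do
      items = JSON.parse(Net::HTTP.get(FEED_URI))       # assume a JSON array
      items.reject { |i| seen.include?(i['id']) }.each do |item|
        seen << item['id']
        # Hand each new item to your push layer (Juggernaut, Orbited, ...) here.
        puts "new item: #{item['id']}"
      end
      sleep 5                                           # poll every few seconds
    end

Run it under the daemons gem (or any process supervisor) and it becomes the "separate small ruby app" from question 2.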

Is there a super-high-load (Ajax) chat script out there?

For a pet project, I have been looking for a web chat script capable of handling potentially tens of thousands of simultaneous users. I don't want to use any kind of applet or browser extension, so on the client side it should be simple Ajax. On the server side I'm pretty much open to anything.
I'm not looking for bells and whistles, a simple text-only chat is more than enough, as long as it supports a number of 'channels' or 'rooms' simultaneously, and a very large number of users.
When I first started researching the chat scripts out there, it seemed like the only viable option was to run an IRC server and just build a web interface on top of that. I know I could get good performance and stability with that setup, but could I get better performance by using something else?
Any ideas?
You might want to check out CometD.
I believe there are some chat scripts already using CometD.
I have no idea regarding stability, though.
You can have a look at Jabbify.
Not sure about the rooms and channels part, but it is built on the AJAX and MVC model.
I am going with Twitch.me, which is based on Node.js.
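Since none of the answers show what the Ajax side talks to, here is a minimal long-polling sketch in Ruby using Sinatra -- an assumption on my part, not how Jabbify or Twitch.me work. State is in memory in one process, so it would not survive tens of thousands of users as written, and it needs a threaded server:

    require 'sinatra'
    require 'json'

    MUTEX = Mutex.new
    ROOMS = Hash.new { |h, k| h[k] = [] }                    # room => messages
    CONDS = Hash.new { |h, k| h[k] = ConditionVariable.new } # room => waiters

    # Post a message to a room and wake everyone long-polling that room.
    post '/rooms/:room/messages' do
      MUTEX.synchronize do
        ROOMS[params[:room]] << { 'user' => params[:user], 'text' => params[:text] }
        CONDS[params[:room]].broadcast
      end
      204
    end

    # Ajax clients long-poll with ?since=<number of messages they already have>.
    get '/rooms/:room/messages' do
      since = params[:since].to_i
      content_type :json
      MUTEX.synchronize do
        # Hold the request open (up to 25s) until something newer arrives.
        CONDS[params[:room]].wait(MUTEX, 25) if ROOMS[params[:room]].size <= since
        (ROOMS[params[:room]][since..-1] || []).to_json
      end
    end

The IRC-plus-web-interface route hands exactly this fan-out problem to software built for it, which is why it holds up well under load.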

How would you make an RSS feed's entries available longer than they're accessible from the source?

My computer at home is set up to automatically download some stuff from RSS feeds (mostly torrents and podcasts). However, I don't always keep this computer on. The sites I subscribe to have a relatively large throughput, so when I turn the computer back on it has no idea what it missed between the time it was turned off and the latest update.
How would you go about storing the feeds' entries for a longer period of time than they're available on the actual sites?
I've checked out Yahoo Pipes and found no such functionality. Google Reader can sort of do it, but it requires manually marking each item. MagpieRSS for PHP can do caching, but that's only to avoid retrieving the feed too often, not really to store more entries.
I have access to a web server (LAMP) that's on 24/7, so a solution using PHP/MySQL would be excellent; any existing web service would be great too.
I could write my own code to do this, but surely this is an issue someone has encountered before?
What I did:
I wasn't aware you could share an entire tag using Google Reader; thanks to Mike Wills for pointing this out.
Once I knew I could do this, it was simply a matter of adding the feeds to a separate Google account (so as not to clog up my personal reading list). I also did some selective matching using Yahoo Pipes, just to get the specific entries I was interested in; this too was to minimize the risk that anything would be missed.
It sounds like Google Reader does everything you're wanting. Not sure what you mean by marking individual items--you'd have to do that with any RSS aggregator.
I use Google Reader for my podiobooks.com subscriptions. I add all of the feeds to a tag, in this case podiobooks.com, that I share (but I don't share the URL). I then add the RSS feed to iTunes.
Sounds like you want some sort of service that checks the RSS feed every X minutes, so you can download every single article/item published to the feed while you are "watching" it, rather than only seeing the items displayed on the feed when you go to view it. Do I have that correct?
Instead of coming up with a full-blown software solution, can you just use cron or some other sort of job scheduling on the webserver with whatever solution you are already using to read the feeds and download their content?
Otherwise it sounds like you'll end up coming close to re-writing a full-blown service like Google Reader.
Writing an aggregator for keeping longer history shouldn't be too hard with a good RSS library.
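For a sense of scale, here is a stdlib-only sketch (rss, open-uri, pstore) that keeps every item a feed ever publishes; run it from cron on the always-on LAMP box. The feed URL is a placeholder, and it assumes an RSS 2.0 feed:

    require 'rss'
    require 'open-uri'
    require 'pstore'

    FEED_URL = 'http://example.com/feed.xml'   # placeholder
    store = PStore.new('feed-archive.pstore')  # archive survives reboots

    feed = RSS::Parser.parse(URI.open(FEED_URL).read, false)  # false = skip validation

    store.transaction do
      store[:items] ||= {}
      feed.items.each do |item|
        key = item.guid ? item.guid.content : item.link   # stable identity
        next if store[:items].key?(key)                   # already archived
        store[:items][key] = { title: item.title, link: item.link,
                               date: item.pubDate, description: item.description }
      end
    end

From there, re-serving the archive as a fresh, longer RSS feed is mostly templating.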
