Embed vs link on large MongoDB sets (Ruby)

Embedded vs link
I'm looking for the fastest way to search a Newsletter document for a connected Email. So far I have used MongoMapper with one document for Newsletter and another for Email. This is getting really slow with 100k+ emails.
I was thinking it might be faster to embed the emails in an array inside Newsletter, since I'm really only interested in the email address ('someemail@email.com') and not any logic around it.
1) Is it possible at all to embed as many as 100k-500k emails in one document?
2) Is Mongoid better/faster for this?
I add an email only if it is not already in the collection, by checking:
email = newsletter.emails.first(:email => 'someemail@email.com')
unless email
  email = Email.new(:email => 'someemail@email.com', :newsletter_id => self.id)
  email.save
end
And I think this is where it all starts to hurt.
Here is how they are connected:
class Newsletter
  include MongoMapper::Document
  many :emails
  ...
end
class Email
  include MongoMapper::Document
  key :email, String
  key :newsletter_id, ObjectId
  belongs_to :newsletter
end
Would love any help on this :)

MongoDB currently has a maximum document size of 16 MB; using MongoMapper or Mongoid makes no difference to this.
See http://www.mongodb.org/display/DOCS/Documents
Embedded documents should be considerably quicker, though, if you can fit all the emails within the limit. It could be a squeeze.
If storing the whole email is too much, why not store just an array of addresses, or embed only the email address within the newsletter along with a reference to the full email?
You can then get the speed advantage you want, and keep the emails accessible outside of the newsletter.
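To gauge whether 100k-500k addresses would fit under the 16 MB limit, here is a rough back-of-the-envelope sketch. It assumes the emails are stored as a plain array of strings and that an average address is 25 bytes (an assumption, not a measured figure); the helper name is made up for illustration. The per-element overhead follows the BSON layout, in which every array element carries a type byte, its index as a string key, and a length-prefixed string.

```ruby
# 16 MB document limit, as stated in the answer above.
MAX_BSON_DOCUMENT_BYTES = 16 * 1024 * 1024

# Hypothetical helper: estimates the BSON size of an array of `count`
# strings that are each `avg_string_bytes` long.
def bson_string_array_bytes(count, avg_string_bytes)
  total = 5 # int32 document length + trailing null byte
  count.times do |i|
    total += 1                    # element type byte (string)
    total += i.to_s.bytesize + 1  # array index stored as a cstring key
    total += 4                    # int32 string length
    total += avg_string_bytes + 1 # string bytes + trailing null byte
  end
  total
end

AVG_EMAIL_BYTES = 25 # assumption; measure your own data

[100_000, 500_000].each do |count|
  bytes = bson_string_array_bytes(count, AVG_EMAIL_BYTES)
  verdict = bytes < MAX_BSON_DOCUMENT_BYTES ? "fits" : "too large"
  puts "#{count} emails: about #{bytes} bytes (#{verdict})"
end
```

Under these assumptions, 100k addresses take about 3.5 MB, while 500k would be about 18 MB and would not fit in a single document. The rest of the Newsletter document also counts toward the limit.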

Related

Searching a twitter list for certain tags or words

I am learning ruby at the moment and using twitter as a platform to help me build my first prototype in Sinatra. I'm using the Twitter gem and have managed to get a private list of mine and display all the tweets related to the users in that list.
However, I now want to search through the list for a certain set of keywords and, if any are found, display the tweet.
Does anyone know if there is any way to do this within the Twitter gem? Or how I would go about doing this in Rails in an efficient way?
The only way I can figure out is to iterate through each tweet returned, get the text of that tweet and search it for the keywords, and display the tweet if a keyword is found. This seems stupidly inefficient to me, and wouldn't it use up unnecessary API requests?
This is what I have so far if this is of any help to anyone.
require 'sinatra'
require 'rubygems'
require 'twitter'
client = Twitter::REST::Client.new do |config|
  config.consumer_key = 'xxxx'
  config.consumer_secret = 'xxx'
  config.access_token = 'xx'
  config.access_token_secret = 'xx'
end
get '/' do
  @tweet = client.list_timeline(1231123123123,{:include_rts => 0})
  erb :index
end
Many thanks in advance
Matt
You are correct about this: iterate through each tweet returned, get the text related to that tweet and search for the keywords, if found display that tweet.
You wrote: "this to me is stupidly inefficient". You are correct. It's inefficient because you have to retrieve all the tweets, rather than just the tweets that contain the keywords that you want.
The Twitter gem does not do what you want, because Twitter search is slightly unpredictable. This is because the Twitter search is optimizing for relevancy, not thoroughness.
What you're looking for, I think, is Twitter "Streams". When you ask for a Twitter Stream, you get all the tweets from the user (or site, or globally), delivered in real time. This is more involved to set up.
https://dev.twitter.com/streaming/overview
Then you search the tweets within Rails.
If you want a simple search, you may want to look at using Ruby's select method and Regexp class.
If you want powerful search capabilities, you may want to look at various search gems and search engines such as sunspot_solr and Lucene.
If you're building a real-world business application with more advanced needs for scaling and searching, you may want to read about Twitter Firehose partners and text engines such as discovertext. These partners and engines provide real-time search APIs, caching, and many more features.
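As a minimal sketch of the simple select and Regexp approach mentioned above: the Tweet struct here is only a stand-in for the objects the Twitter gem returns (it assumes each tweet responds to text), and matching_tweets is a hypothetical helper name.

```ruby
# Stand-in for the gem's tweet objects; only #text is assumed.
Tweet = Struct.new(:id, :text)

# Hypothetical helper: returns the tweets whose text contains any of the
# keywords as a whole word, ignoring case.
def matching_tweets(tweets, keywords)
  patterns = keywords.map { |keyword| /\b#{Regexp.escape(keyword)}\b/i }
  pattern = Regexp.union(patterns)
  tweets.select { |tweet| tweet.text =~ pattern }
end

tweets = [
  Tweet.new(1, "Learning Ruby today"),
  Tweet.new(2, "Lunch time"),
  Tweet.new(3, "sinatra is neat")
]

matching_tweets(tweets, %w[ruby sinatra]).each do |tweet|
  puts "#{tweet.id}: #{tweet.text}"
end
```

The filtering runs locally on tweets that have already been fetched, so it costs no extra API requests beyond the one that retrieved the list timeline.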
Consider using the search method, as shown in the example here.

How can I efficiently pull specific information from JSON

I currently have a public Google calendar from which I am successfully pulling JSON data using Google's API.
I am using HTTParty to convert the JSON to a ruby object.
response = HTTParty.get('http://www.google.com/calendar/feeds/colorado.edu_mdpltf14q21hhg50qb3e139fjg@group.calendar.google.com/public/full?alt=json&orderby=starttime&max-results=15&singleevents=true&sortorder=ascending&futureevents=true')
I want to retrieve many titles, event names, start times, end times, etc. I can get these with commands like
response["feed"]["title"]["$t"]
for the calendar's title, and
response["feed"]["entry"][0]["title"]["$t"]
for the event's title.
My question is two-fold. One, is there a simpler way to pull this data? Two, how can I go about pulling multiple events' information? I tried:
response.each do |x| response["feed"]["title"]["$t"]
but that raises a "no implicit conversion of String into Integer" error.
Based on your examples, this should do it:
response["feed"]["entry"].map {|entry| entry["title"]["$t"] }
response['feed']['entry'] is a simple array of hashes. It is probably best to extract that array to a temporary variable with
entries = response['feed']['entry']
Thereafter, your code depends entirely on what you need to achieve. For instance, using the URL that you have provided
puts entries.length
shows
2
And
entries.each do |entry|
  puts entry['title']['$t']
end
gives
NEW EVENT
Future EVENT
If we can help you to achieve something specific, then please edit your question or ask for clarification in a comment.
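Putting the two answers together, here is a small self-contained sketch. The hash below is a hand-built stand-in for the parsed feed (with HTTParty you would presumably pass response.parsed_response); it only mirrors the feed, entry, title, and $t keys shown in the question, and the calendar title and helper name are made up for illustration.

```ruby
# Hand-built sample with the same shape as the keys used above.
sample_feed = {
  "feed" => {
    "title" => { "$t" => "Sample calendar" }, # hypothetical title
    "entry" => [
      { "title" => { "$t" => "NEW EVENT" } },
      { "title" => { "$t" => "Future EVENT" } }
    ]
  }
}

# Hypothetical helper: collects every event title. Hash#dig (Ruby 2.3+)
# returns nil instead of raising when a key is missing.
def event_titles(feed_hash)
  entries = feed_hash.dig("feed", "entry") || []
  entries.map { |entry| entry.dig("title", "$t") }
end

puts sample_feed.dig("feed", "title", "$t")
puts event_titles(sample_feed)
```

Other per-event fields can be collected the same way inside the map block once you know which keys the feed uses for them.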

DRY way to handle users with multiple accounts in Rails app

This is a Rails app for a school. I'm using Devise for user accounts. So far each user has a .role of admin, teacher or student which restricts what the user can access and contribute in the app. I'm using user.email to log in. All is working great so far.
Now I've realized that I have a problem with sibling accounts. The unique user.email used for logging in is actually the parent's email address (since the students are minors) and now I have to account for a parent with two or more children who are students. I obviously can't have a student account for each child and use the same email address because it has to be unique (and I do want to keep the requirement in the validation).
Given that I already have a large chunk of the app done, what would be a nice DRY way to account for this situation?
I've searched this site and others but not really found anything that accounts for this situation.
One approach I considered is to change the role in the User model to parent and add a separate Student model. A user with the role of parent could then has_many :students, but this doesn't sit well with me.
Any general ideas/concepts would be appreciated, as would any pointers to articles or gems that would help me with this.
I don't know if it's the best way, but one approach would be to append a sibling-unique identifier to the name portion of the parent's email address for all student email addresses. You could do it in such a way that it could be stripped off when you wanted the "real" email address. There's actually some precedent for this kind of approach, per http://www.plankdesign.com/blog/2013/06/testing-infinite-unique-email-addresses-with-gmail/.
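A minimal sketch of that idea, assuming the "+tag" convention described in the linked article (a plus sign followed by an identifier in the local part of the address). Both helper names are hypothetical, and whether a given mail provider delivers plus-tagged addresses is something to verify.

```ruby
# Hypothetical helper: builds a unique login email for one child from the
# parent's address, e.g. parent@example.com -> parent+alice@example.com.
def student_email(parent_email, tag)
  local, domain = parent_email.split("@", 2)
  "#{local}+#{tag}@#{domain}"
end

# Hypothetical helper: strips the "+tag" part to recover the real address.
# Addresses without a tag are returned unchanged.
def real_email(address)
  local, domain = address.split("@", 2)
  "#{local.sub(/\+.*\z/, '')}@#{domain}"
end

tagged = student_email("parent@example.com", "alice")
puts tagged
puts real_email(tagged)
```

Each sibling's User record keeps a unique email, so the existing uniqueness validation can stay, while real_email gives the address to use when actually sending mail.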

Twitter Ruby Gem - get friends names and ids, nothing more

Is it possible to simply get the people you are following with just an id and full name? I do not need any of the additional data, it's a waste of bandwidth.
Currently the only solution I have is:
twitter_client = Twitter::Client.new
friend_ids = twitter_client.friend_ids['ids']
friends = twitter_client.users(friend_ids).map { |f| {:twitter_id => f.id, :name => f.name} }
Is there any way to have the users returned as just an array of ids and full names? Is there a better way of doing it than the one shown above, preferably one that does not filter on the client side?
The users method uses the users/lookup API call. As you can see on the page, the only param available is include_entities. The only other method which helps you find users has the same limitation. So you cannot download only the needed attributes.
The only other thing I'd like to say is that you could use the friends variable directly; I don't see any benefit in running the map on it.

Facebook posting help, adding to facebook whenever a user creates a post

I think this question is a matter of writing nice Ruby code; let me see what you guys think. I've already set up all the auth/access token stuff with omniauth and fbgraph. What I can't seem to work out is how to integrate it when a user creates a post.
My app revolves around users making posts (made up of 'title' and 'content'). I'd like each post to be automatically shared on Facebook or Twitter or both, depending on the particular authentications the user has set up, and not shared anywhere if the user has signed up conventionally without Facebook/Twitter.
How would I integrate a dynamic way to automatically share the title and content of a user's post whenever they post? I was thinking of some type of after_save on the Post model, but I can't get it working right. Thank you for any help; it is very much appreciated. It would also be great if the method allowed for future expansion, in case I want to share links and pictures later on.
This is the only post I found while searching that sheds some light on sharing to both, but I'm still confused :(
Easy way of posting on Facebook page (not a profile but a fanpage)
In your Post model have:
after_commit :share_content
def share_content
  user.share_content title, content
end
Then in User model have:
def share_content title, content
  # some conditionals with whatever stuff you have to determine whether
  # it's a twitter and/or facebook update...
  if go_for_twitter
    twitter.update title
  end
  if go_for_facebook
    facebook.feed! :message => title
    # ...etc.
  end
end
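One possible way to keep this open for future expansion, sketched in plain Ruby rather than as Rails models: the user holds a list of "sharer" objects, one per connected service, and share_content simply calls each of them. Everything here is hypothetical scaffolding (RecordingSharer stands in for a thin wrapper around the real Twitter or Facebook client, and this User is not an ActiveRecord model).

```ruby
# Hypothetical stand-in for a wrapper around a real Twitter/Facebook client.
# It only records what it was asked to share.
class RecordingSharer
  attr_reader :name, :shared

  def initialize(name)
    @name = name
    @shared = []
  end

  def share(title, content)
    @shared << [title, content]
  end
end

# Plain Ruby stand-in for the User model.
class User
  def initialize(sharers = [])
    @sharers = sharers # one entry per service the user has authenticated
  end

  # Shares to every connected service; returns the names of those services.
  # A user with no connected services shares nowhere.
  def share_content(title, content)
    @sharers.each { |sharer| sharer.share(title, content) }
    @sharers.map(&:name)
  end
end

twitter = RecordingSharer.new(:twitter)
user = User.new([twitter])
p user.share_content("My title", "My content")
```

Supporting links or pictures later would then mean adding a new sharer class or extending share, without touching the Post callback.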

Resources