MongoDB and Ruby gem - Check if record exists

I've a simple Ruby script (no rails, sinatra etc.) that uses the Mongo gem to insert records into my DB as part of a Redis/Resque worker.
On occasion, instead of doing a fresh insert, I'd like to update a counter field on an existing record. I can do this easily enough with Rails/MySQL. What's the quickest way of doing this in pure Ruby with MongoDB?
Thanks,
Ed

The Ruby client library for MongoDB is very convenient and easy to use. So, to update a document in MongoDB, use something similar to this:
#!/usr/bin/ruby
require 'mongo'
# legacy (1.x) driver API
database = Mongo::Connection.new.db("yourdatabasename")
collection = database["collection"]
# get the document (find_one returns a Hash, or nil if nothing matches)
x = collection.find_one("_id" => "12312132")
# change the document
x["count"] = (x["count"] || 0) + 1
# update it in mongodb, selecting by the same _id
collection.update({"_id" => x["_id"]}, x)
You might want to check out the manual for updating documents in MongoDB as well.

Thanks to envu's direction I went with upsert in the end. Here is an example snippet of how to use it with the Ruby client:
link_id = @globallinks.update(
{
":url" => "http://somevalue.com"
},
{
'$inc' => {":totalcount" => 1},
'$set' => {":timelastseen" => Time.now}
},
{
:upsert=>true
}
)
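To make the semantics of that call concrete, here is a minimal plain-Ruby sketch of what an upsert with `$inc` and `$set` does. It models the collection as an in-memory array of hashes and does not talk to MongoDB; `STORE` and `upsert!` are hypothetical names used only for illustration and are not part of the Mongo gem.

```ruby
# In-memory model of an upsert; NOT the Mongo driver, only its semantics.
STORE = []

# Finds the first document matching every key in `selector`. If none exists,
# a new document seeded from the selector is inserted. Then $set and $inc
# are applied to whichever document was found or created.
def upsert!(store, selector, changes)
  doc = store.find { |d| selector.all? { |k, v| d[k] == v } }
  if doc.nil?
    doc = selector.dup
    store << doc
  end
  (changes['$set'] || {}).each { |k, v| doc[k] = v }
  (changes['$inc'] || {}).each { |k, v| doc[k] = (doc[k] || 0) + v }
  doc
end

# Run the same upsert twice: the first call inserts, the second increments.
2.times do
  upsert!(STORE, { 'url' => 'http://somevalue.com' },
          '$inc' => { 'totalcount' => 1 },
          '$set' => { 'timelastseen' => Time.now })
end

puts STORE.length               # one document, not two
puts STORE.first['totalcount']  # incremented once per call
```

The point of doing this server side with `$inc` is that the counter never has to be read into the Ruby process first, so two workers cannot overwrite each other's increments.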

Related

Mongo find by id in Phoenix

When trying to find by id I don't get a result using the mongodb driver that comes with Phoenix.
The readme in the mongodb package has the following examples
Mongo.find(MongoPool, "test-collection", %{}, limit: 20)
Mongo.find(MongoPool, "test-collection", %{"field" => %{"$gt" => 0}}, limit: 20, sort: %{"field" => 1})
but when I try like the following I don't get any results.
cursor = Mongo.find(AppName.Repo.Pool, "test-collection", %{"_id" => "1df66b12302b812298308dba"})
Enum.to_list(cursor)
I get an empty list ([]).
Do I need to convert the id to something first?
I would like to not have to use Ecto all the time.
I figured out the following code that works to convert a string mongo document id to what can be plugged into a mongodb _id parameter
def objectid(id) do
{_, idbin} = Base.decode16(id, case: :mixed)
%BSON.ObjectId{value: idbin}
end
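For comparison, here is the same conversion in Ruby, the language used elsewhere on this page. A 24-character hex id decodes to the 12 raw bytes that an ObjectId actually stores, which is why matching `_id` against the plain string finds nothing. `hex_to_object_id_bytes` is a hypothetical helper name; real drivers ship their own ObjectId type, and this sketch only shows the decoding step.

```ruby
# Decode a 24-character hex string into the 12 raw bytes of an ObjectId.
def hex_to_object_id_bytes(hex)
  raise ArgumentError, 'expected 24 hex characters' unless hex =~ /\A\h{24}\z/
  [hex].pack('H*')
end

ID_BYTES = hex_to_object_id_bytes('1df66b12302b812298308dba')

puts ID_BYTES.bytesize            # number of raw bytes
puts ID_BYTES.unpack('H*').first  # round-trips back to the hex string
```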

How to use Tire for "MoreLikeThis" query of ElasticSearch

I would like to execute this example:
$ curl -XGET 'http://localhost:9200/twitter/tweet/1/_mlt?mlt_fields=tag,content&min_doc_freq=1'
with the Tire gem. Is it possible?
My goal is to search for documents related to another document.
It is not implemented directly in tire. Karmi, however, has implemented it as a tire extension in the tire-contrib repository.
Source Code: more_like_this.rb
Install it by adding gem 'tire-contrib' to your Gemfile.
more_like_this_field(:tag, like_text, options = {min_doc_freq: 1})
Okay the internet forgot to include a single example of this call (including the source project), so here is one style of it.
related_articles = Article.search {
query {
more_like_this("#{current_article.title} #{current_article.body}",
fields: [:title, :description],
percent_terms_to_match: 0.1,
min_term_freq: 1,
min_doc_freq: 1
)
}
}
puts related_articles.results.count
puts related_articles.results.first.title if related_articles.present?
The gotchas here are the min_term_freq and min_doc_freq params above. They default to 2 and 5 respectively in ElasticSearch, which makes it easy to get confused while testing this.

Bulk Insert into Mongo - Ruby

I am new to Ruby and Mongo and am working with twitter data. I'm using Ruby 1.9.3 and Mongo gems.
I am querying bulk data out of Mongo, filtering out some documents, processing the remaining documents (inserting new fields) and then writing new documents into Mongo.
The code below is working but runs relatively slow as I loop through using .each and then insert new documents into Mongo one at a time.
My Question: How can this be structured to process and insert in bulk?
cursor = raw.find({'user.screen_name' => users[cur], 'entities.urls' => []},{:fields => params})
cursor.each do |r|
if r['lang'] == "en"
score = r['retweet_count'] + r['favorite_count']
timestamp = Time.now.strftime("%d/%m/%Y %H:%M")
#Commit to Mongo
@document = {:id => r['id'],
:id_str => r['id_str'],
:retweet_count => r['retweet_count'],
:favorite_count => r['favorite_count'],
:score => score,
:created_at => r['created_at'],
:timestamp => timestamp,
:user => [{:id => r['user']['id'],
:id_str => r['user']['id_str'],
:screen_name => r['user']['screen_name'],
}
]
}
@collection.save(@document)
end #end.if
end #end.each
Any help is greatly appreciated.
In your case there is no way to make this much faster. One thing you could do is retrieve the documents in batches, process them, and then reinsert them in batches, but it would still be slow.
To speed this up you need to do all the processing server side, where the data already exists.
You should either use the aggregation framework of MongoDB, if the result document does not exceed 16 MB, or, for more flexibility but slower execution (still much faster than your current approach can be), the MapReduce framework of MongoDB.
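The client-side batching idea mentioned in this answer can be sketched as follows. The 1.x Ruby driver's `Collection#insert` accepts an array of documents, so each batch becomes a single call. `FakeCollection`, `BATCH_SIZE`, `TWEETS`, and `TARGET` are stand-ins invented for this sketch so that it runs without a database; with the real driver you would pass the target collection instead.

```ruby
# Stand-in for a Mongo collection: records how many insert calls were made.
class FakeCollection
  attr_reader :calls, :docs

  def initialize
    @calls = 0
    @docs = []
  end

  # Accepts one document (Hash) or an array of documents.
  def insert(doc_or_docs)
    @calls += 1
    @docs.concat(doc_or_docs.is_a?(Array) ? doc_or_docs : [doc_or_docs])
  end
end

BATCH_SIZE = 100 # assumed value; tune for your document size

# Sample input standing in for the cursor: 250 tweets, every second one English.
TWEETS = (1..250).map do |i|
  { 'id' => i, 'lang' => (i.even? ? 'en' : 'de'),
    'retweet_count' => i, 'favorite_count' => 1 }
end

TARGET = FakeCollection.new

# Filter, build the new documents, then insert them one batch at a time.
TWEETS.select { |r| r['lang'] == 'en' }
      .map { |r| { :id => r['id'], :score => r['retweet_count'] + r['favorite_count'] } }
      .each_slice(BATCH_SIZE) { |batch| TARGET.insert(batch) }

puts TARGET.docs.length # documents written
puts TARGET.calls       # insert calls needed
```

This cuts the number of round trips but, as the answer says, the data still travels to the client and back, which server-side processing avoids.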
What exactly are you doing? Why not go pure Ruby or pure Mongo (well, that's Ruby too)? And why do you really need to load every single attribute?
From what I understand of your code, you actually create a completely new document, and I think that's wrong.
You can do this on the Ruby side:
cursor = YourModel.find(params)
cursor.each do |r|
if r.lang == "en"
r.score = r.retweet_count + r.favorite_count
r.timestamp = Time.now.strftime("%d/%m/%Y %H:%M")
r.save
end #end.if
end #end.each
And of course you can include Mongoid::Timestamps in your model, and it handles your created_at and updated_at attributes (it creates them itself).
In MongoDB itself (the mongo shell) it's a little harder.
First select your database with use my_db; then the following code will generate what you want:
db.models.find({something: your_param}).forEach(function(doc){
doc.score = doc.retweet_count + doc.favorite_count
doc.timestamp = new Date()
db.models.save(doc)
}
);
I don't know what your parameters were, but they are easy to create. Also, Mongoid really does lazy loading, so if you don't try to use an attribute, it won't load it. You can actually save a lot of time by not using every attribute.
These methods change the existing document and won't create another one.

How to update or insert on Sequel dataset?

I just started using Sequel in a really small Sinatra app. Since I've got only one DB table, I don't need to use models.
I want to update a record if it exists or insert a new record if it does not. I came up with the following solution:
rec = $nums.where(:number => n, :type => t)
if $nums.select(1).where(rec.exists)
rec.update(:counter => :counter + 1)
else
$nums.insert(:number => n, :counter => 1, :type => t)
end
Where $nums is DB[:numbers] dataset.
I believe that this way isn't the most elegant implementation of "update or insert" behavior.
How should it be done?
You should probably not check before updating/inserting, because:
This is an extra db call.
This could introduce a race condition.
What you should do instead is to test the return value of update:
rec = $nums.where(:number => n, :type => t)
if 1 != rec.update(:counter => :counter + 1)
$nums.insert(:number => n, :counter => 1, :type => t)
end
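The "update first, insert only if nothing was updated" flow can be sketched in plain Ruby. Sequel's `Dataset#update` returns the number of rows it changed, and the stand-in below mirrors that. `FakeTable`, `increment_counter`, `update_or_insert`, and `NUMS` are hypothetical names for this sketch and are not Sequel API.

```ruby
# In-memory stand-in for a table. increment_counter returns the number of
# rows changed, the same contract as an UPDATE statement's affected-row count.
class FakeTable
  attr_reader :rows

  def initialize
    @rows = []
  end

  def increment_counter(number, type)
    matches = @rows.select { |r| r[:number] == number && r[:type] == type }
    matches.each { |r| r[:counter] += 1 }
    matches.length
  end

  def insert(row)
    @rows << row
  end
end

# Try the update; fall back to an insert only when no row was touched.
def update_or_insert(table, number, type)
  return unless table.increment_counter(number, type).zero?
  table.insert(:number => number, :type => type, :counter => 1)
end

NUMS = FakeTable.new
3.times { update_or_insert(NUMS, 42, 'a') }
update_or_insert(NUMS, 42, 'b')

puts NUMS.rows.inspect
```

Against a real database two writers can still both see zero updated rows and both insert, so a unique index on (number, type) is what turns the second insert into an error instead of a duplicate row.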
Sequel 4.25.0 (released July 31st, 2015) added insert_conflict for Postgres v9.5+
Sequel 4.30.0 (released January 4th, 2016) added insert_conflict for SQLite
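This can be used to either insert or update a row. On Postgres, assuming a unique index on (number, type), it looks like this:
DB[:table_name].insert_conflict(target: [:number, :type], update: {counter: Sequel.qualify(:table_name, :counter) + 1}).insert(number: n, type: t, counter: 1)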
I believe you can't have it much cleaner than that (although some databases have specific upsert syntax, which might be supported by Sequel). You can just wrap what you have in a separate method and pretend that it doesn't exist. :)
Just a couple of suggestions:
Enclose everything within a transaction.
Create unique index on (number, type) fields.
Don't use global variables.
You could use upsert, except it doesn't currently work for updating counters. Hopefully a future version will - ideas welcome!

Find documents including element in Array field with mongomapper?

I am new to mongodb/mongomapper and can't find an answer to this.
I have a mongomapper class with the following fields
key :author_id, Integer
key :partecipant_ids, Array
Let's say I have a "record" with the following attributes:
{ :author_id => 10, :partecipant_ids => [10,15,201] }
I want to retrieve all the objects where the participant with id 15 is involved.
I did not find any mention in the documentation.
The strange thing is that previously I was doing this query
MessageThread.where :partecipant_ids => [15]
which worked, but after (maybe) some change in the gem/mongodb version it stopped working.
Unfortunately I don't know which version of mongodb and mongomapper I was using before.
In the current versions of MongoMapper, this will work:
MessageThread.where(:partecipant_ids => 15)
And this should work as well...
MessageThread.where(:partecipant_ids => [15])
...because plucky autoexpands that to:
MessageThread.where(:partecipant_ids => { :$in => [15] })
(see https://github.com/jnunemaker/plucky/blob/master/lib/plucky/criteria_hash.rb#L121)
I'd say take a look at your data and try out queries in the Mongo console to make sure you have a working query. MongoDB queries translate directly to MM queries except for the above (and a few other minor) caveats. See http://www.mongodb.org/display/DOCS/Querying
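The difference between the three query forms above can be modelled in plain Ruby. This is a simplified sketch of how MongoDB compares a condition against an array field, not the server's actual implementation; `field_matches?`, `THREAD`, and `IDS` are hypothetical names used only here.

```ruby
# Simplified model of how MongoDB matches one field value against a condition.
def field_matches?(value, condition)
  if condition.is_a?(Hash) && condition.key?(:$in)
    candidates = condition[:$in]
    # $in: any candidate equals the value, or equals any element of an array value
    value.is_a?(Array) ? !(value & candidates).empty? : candidates.include?(value)
  elsif value.is_a?(Array)
    # Scalar condition: some element must equal it.
    # Array condition: the whole array must equal it (or contain it as an element).
    value == condition || value.include?(condition)
  else
    value == condition
  end
end

THREAD = { :author_id => 10, :partecipant_ids => [10, 15, 201] }
IDS = THREAD[:partecipant_ids]

puts field_matches?(IDS, 15)                # element match
puts field_matches?(IDS, [15])              # whole-array equality, no expansion
puts field_matches?(IDS, { :$in => [15] })  # what plucky expands [15] to
```

In this model the bare `[15]` form only matches when the stored array is exactly `[15]`, which is consistent with the query breaking whenever the automatic `$in` expansion is not applied.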
