What is an efficient way to count documents in a 2-level nested model (using Mongoid or MongoDB in general)? - ruby

I have the following data models, in which a Project embeds many ComponentDescriptors, and a ComponentDescriptor embeds many Statistics.
class Project
  include Mongoid::Document
  embeds_many :component_descriptors
  field :status, :type => Integer
  backgrounded :publish
end
class ComponentDescriptor
  include Mongoid::Document
  include Mongoid::Acts::Tree
  embeds_many :statistics
  embedded_in :project, :inverse_of => :component_descriptors
end
class Statistic
  include Mongoid::Document
  field :statistics_type, :type => String
  field :data, :type => String
  field :playhead_time, :type => String
  field :remote_ip, :type => String
  field :user_agent, :type => String
  embedded_in :component_descriptor, :inverse_of => :statistics
end
The question is: what is the best way to count the total number of Statistic objects in a Project?
One way I can think of is to loop through each ComponentDescriptor, count its Statistic objects, and sum the counts. But I don't think this is efficient.
Thanks in advance.

If you can store a count field at the Project level and update it whenever a Statistic is added or removed, reading that field will be the fastest way to obtain the total.
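A minimal sketch of the cached-counter idea, using plain Ruby objects as stand-ins for the embedded documents (these are not Mongoid classes). The `statistics_count` attribute and the `add_statistic` method are hypothetical names chosen for illustration; in a real Mongoid model the counter would be an ordinary Integer field kept in sync on every write.

```ruby
# Plain-Ruby stand-ins for the embedded documents; not Mongoid classes.
Statistic = Struct.new(:statistics_type, :data)

class ComponentDescriptor
  attr_reader :statistics

  def initialize
    @statistics = []
  end
end

class Project
  attr_reader :component_descriptors, :statistics_count

  def initialize
    @component_descriptors = []
    @statistics_count = 0 # hypothetical cached counter field
  end

  # Single write path: append the statistic and bump the cached counter.
  def add_statistic(descriptor, statistic)
    descriptor.statistics << statistic
    @statistics_count += 1
  end

  # The loop-and-sum approach from the question, kept as a cross-check.
  def recount_statistics
    component_descriptors.sum { |d| d.statistics.size }
  end
end

project = Project.new
2.times { project.component_descriptors << ComponentDescriptor.new }
first, second = project.component_descriptors
project.add_statistic(first, Statistic.new("play", "a"))
project.add_statistic(first, Statistic.new("pause", "b"))
project.add_statistic(second, Statistic.new("play", "c"))

puts project.statistics_count   # cached value, no traversal
puts project.recount_statistics # traverses every descriptor
```

The trade-off is that every code path that adds or removes a Statistic has to go through the method that maintains the counter; otherwise the cached value drifts from the real total.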


Mongoid model with hardcoded data

I have a mongoid model
class MyMongoidModel
  include Mongoid::Document
  include Mongoid::Timestamps
  field :name, :type => String
  field :data_id, :type => Integer
  has_and_belongs_to_many :the_other_model, :class_name => 'class_name_model'
  has_many :model2
  def self.all
    #.... the hardcoded data that will never be changed
  end
end
It is used by the other models, and it uses them as well. However, it contains data that won't change for a very long time, if ever. So I don't want to retrieve it from the database; I want it hardcoded while still behaving like a normal Mongoid model. Caching is not what I'm looking for.
I hope you understand what I mean.
How do I accomplish this?
There's a great gem called active_hash that provides this functionality for ActiveRecord: defining a fixed set of data as models you can reference/relate to normal models, but have it defined in code and loaded in memory (not stored/retrieved from DB).
Interestingly, since Mongoid and ActiveRecord share a common ActiveModel basis, you may be able to use active_hash with a Mongoid document as well.
For example:
class Country < ActiveHash::Base
  self.data = [
    {:id => 1, :name => "US"},
    {:id => 2, :name => "Canada"}
  ]
end
class Order
  include Mongoid::Document
  include Mongoid::Timestamps
  has_one :country
end
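To make the underlying idea concrete without depending on any gem, here is a rough plain-Ruby sketch of a model whose records live in code rather than in the database. It is not how active_hash is implemented; the `DATA` constant, the `find` and `where` helpers, and the `Order` struct are all hypothetical names for illustration.

```ruby
# Hypothetical plain-Ruby model backed by a frozen, hardcoded data set.
class Country
  attr_reader :id, :name

  def initialize(id:, name:)
    @id = id
    @name = name
  end

  # The fixed records, built once when the class is loaded.
  DATA = [
    new(id: 1, name: "US"),
    new(id: 2, name: "Canada")
  ].freeze

  def self.all
    DATA
  end

  def self.find(id)
    DATA.find { |country| country.id == id }
  end

  # Returns every record whose attributes match all given key/value pairs.
  def self.where(attrs)
    DATA.select do |country|
      attrs.all? { |key, value| country.public_send(key) == value }
    end
  end
end

# An owning record only stores the id and resolves it in memory.
Order = Struct.new(:country_id) do
  def country
    Country.find(country_id)
  end
end

puts Order.new(2).country.name
```

The owning side stores just `country_id`, so nothing about the fixed data ever needs to be written to or read from the database.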

Querying and sorting embedded document in mongoid

I have three classes
class Post
  include Mongoid::Document
  include Mongoid::Timestamps
  belongs_to :user, :inverse_of => nil
  embeds_many :comments, :as => :commentable
  field :content, :type => String
end
class Comment
  include Mongoid::Document
  include Mongoid::Timestamps
  belongs_to :user, :inverse_of => nil
  embedded_in :commentable, :polymorphic => true
end
class User
  include Mongoid::Document
  has_many :posts, :dependent => :destroy
  field :name, :type => String
end
Whenever the user creates a new comment, I want to compare the contents of it with the latest comment that the user has made. Here is my code that fetches the latest comment by the user:
comments_user = []
Post.where("comments.user_id" => user.id).
  each { |p| comments_user += p.comments.where(:user_id => user.id).to_a }
latest_comment = comments_user.sort_by { |comment| comment[:updated_at] }.reverse.first
The above code gives me the result, but the approach is not efficient, as I have to traverse all the posts the user has commented on to find the latest comment. Can anyone suggest a more efficient solution?
Simply put, isn't there a way to get all the comments made by this user?
This should fetch the user's latest comment:
Post.where("comments.user_id" => user.id).order_by(:'comments.updated_at'.desc).limit(1).only(:comments).first
This is a standard problem with embedding. It greatly improves some queries ("load a post with all its comments"), but makes others inefficient or impractical ("find the latest comment of a user").
I see two options here:
Keep embedding and duplicate data. That is, when a user makes a comment, embed it both in the post document and in the user document. This data duplication has its drawbacks, of course (what if you need to edit comments?);
Stop embedding and start referencing. This means that a comment is now a top-level entity. You can't quickly load a post with its comments, because there are no joins. But other queries are faster now, and there's no data duplication.
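As a rough illustration of the second option, the sketch below models comments as top-level records in plain Ruby (a Struct standing in for a referenced collection; nothing here is Mongoid API). The `Comment` struct layout and the `latest_comment_for` helper are assumptions made for the example. With real top-level documents, the same lookup would be a single filtered, sorted query on the comments collection instead of a walk over posts.

```ruby
# Hypothetical top-level comment records (the "referencing" option).
Comment = Struct.new(:user_id, :post_id, :content, :updated_at)

# Returns the most recently updated comment by the given user, or nil
# when the user has not commented at all.
def latest_comment_for(comments, user_id)
  comments.select { |c| c.user_id == user_id }.max_by(&:updated_at)
end

comments = [
  Comment.new(1, 10, "first",  Time.utc(2012, 1, 1)),
  Comment.new(2, 10, "other",  Time.utc(2012, 1, 3)),
  Comment.new(1, 11, "second", Time.utc(2012, 1, 2))
]

latest = latest_comment_for(comments, 1)
puts latest.content
```

Because each comment carries its own `user_id` and `post_id`, no post has to be loaded to answer a per-user question.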

DataMapper filter records by association count

With the following model, I'm looking for an efficient and straightforward way to return all of the Tasks that have 0 parent tasks (the top-level tasks, essentially). I'll eventually want to return things like 0 child tasks as well, so a general solution would be great. Is this possible using existing DataMapper functionality, or will I need to define a method to filter the results manually?
class Task
  include DataMapper::Resource
  property :id, Serial
  property :name, String, :required => true
  # Any link of type parent where this task is the target represents a parent of this task
  has n, :links_to_parents, 'Task::Link', :child_key => [ :target_id ], :type => 'Parent'
  # Any link of type parent where this task is the source represents a child of this task
  has n, :links_to_children, 'Task::Link', :child_key => [ :source_id ], :type => 'Parent'
  has n, :parents, self,
    :through => :links_to_parents,
    :via => :source
  has n, :children, self,
    :through => :links_to_children,
    :via => :target
  def add_parent(parent)
    # ...
  end
  def add_child(child)
    # ...
  end
  class Link
    include DataMapper::Resource
    storage_names[:default] = 'task_links'
    belongs_to :source, 'Task', :key => true
    belongs_to :target, 'Task', :key => true
    property :type, String
  end
end
I would like to be able to define a shared method on the Task class like:
def self.without_parents
  # Code to return collection here
end
DataMapper falls down in these scenarios, since effectively what you're looking for is a LEFT JOIN query where everything on the right is NULL.
SELECT tasks.* FROM tasks LEFT JOIN parents_tasks ON parents_tasks.task_id = tasks.id WHERE parents_tasks.task_id IS NULL
Your parents/children situation makes no difference here, since they are both n:n mappings.
The most efficient you'll get with DataMapper alone (at least in version 1.x) is:
Task.all(:parents => nil)
This will execute two queries: first a relatively simple SELECT from the n:n pivot table (WHERE task_id IS NOT NULL), and then a gigantic NOT IN for all of the ids returned by the first query, which is ultimately not what you're looking for.
I think you're going to have to write the SQL yourself, unfortunately ;)
EDIT | https://github.com/datamapper/dm-ar-finders and its find_by_sql method may be of interest. If field name abstraction is important to you, you can reference things like Model.storage_name and Model.some_property.field in your SQL.
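A sketch of what the hand-written SQL might look like for the model in the question, built as a plain Ruby string so it can be inspected without a database. The table name `tasks` (assumed to be the default storage name for Task), the `task_links` columns, and the helper name `tasks_without_sql` are assumptions for illustration; the resulting string would still need to be run through something like the `find_by_sql` method mentioned above.

```ruby
# Assumed names: the `tasks` table (default storage name for Task) and the
# `task_links` table with source_id/target_id/type from the question's model.
# A parent link points at the task via target_id; a child link via source_id.
LINK_COLUMNS = { parents: "target_id", children: "source_id" }.freeze

# Builds the LEFT JOIN ... IS NULL query for tasks with no parents/children.
def tasks_without_sql(kind)
  column = LINK_COLUMNS.fetch(kind) # raises KeyError for unknown kinds
  <<~SQL.strip
    SELECT tasks.*
    FROM tasks
    LEFT JOIN task_links
      ON task_links.#{column} = tasks.id
     AND task_links.type = 'Parent'
    WHERE task_links.#{column} IS NULL
  SQL
end

puts tasks_without_sql(:parents)
```

Only the column name differs between the two cases, so the same helper covers both "no parents" and "no children"; the lookup table keeps arbitrary input out of the interpolated SQL.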

Mongoid Relations 1..*

Consider the following:
class Picture
  include Mongoid::Document
  field :data, :type => String
end
class Cat
  include Mongoid::Document
  has_one :picture, :autosave => true
  field :name, :type => String
end
class Dog
  include Mongoid::Document
  has_one :picture, :autosave => true
  field :name, :type => String
end
Now, is it possible to do the following:
dog = Dog.new
dog.picture = Picture.new
Without having to edit the Picture class to the following:
class Picture
  include Mongoid::Document
  belongs_to :cat
  belongs_to :dog
  field :data, :type => String
end
I don't need pictures to know about their Dog or Cat. Is this possible?
I believe you could do this if you put the belongs_to :picture in your Dog and Cat classes. The side of the relation that has belongs_to is the side that will store the foreign key. That would put a picture_id field in each of Dog and Cat, instead of having to store a whatever_id for each type of thing you want to link on your Picture class.
No, it is not. You need a cat_id, a dog_id, or some polymorphic obj_id on the picture to record which object it belongs to.
Otherwise, how would you know which Picture belongs to the current Dog or Cat?
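The first answer's point about which side stores the foreign key can be pictured with a small plain-Ruby sketch. Nothing here is Mongoid API: `PictureStore` and the struct layouts are hypothetical stand-ins, used only to show that when the owner holds `picture_id`, the picture itself needs no `dog_id` or `cat_id`.

```ruby
# Stand-in for a picture document: no reference back to any owner.
Picture = Struct.new(:id, :data)

# Hypothetical in-memory replacement for the pictures collection.
class PictureStore
  def initialize
    @rows = {}
  end

  def save(picture)
    @rows[picture.id] = picture
  end

  def find(id)
    @rows[id]
  end
end

# The owners keep the key, so Picture stays unaware of Dog and Cat.
Dog = Struct.new(:name, :picture_id)
Cat = Struct.new(:name, :picture_id)

store = PictureStore.new
store.save(Picture.new(1, "dog.png"))
store.save(Picture.new(2, "cat.png"))

dog = Dog.new("Rex", 1)
cat = Cat.new("Tom", 2)

puts store.find(dog.picture_id).data
puts store.find(cat.picture_id).data
```

The cost of this layout is the one raised in the second answer: starting from a picture, there is no direct way to tell which Dog or Cat it belongs to.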

datamapper multi-field unique index

In DataMapper, how would one specify that the combination of two fields must be unique? For example, categories must have unique names within a domain:
class Category
  include DataMapper::Resource
  property :name, String, :index => true # must be unique for a given domain
  belongs_to :domain
end
You have to create a unique index for the two properties:
class Category
  include DataMapper::Resource
  property :name, String, :unique_index => :u
  property :domain_id, Integer, :unique_index => :u
  belongs_to :domain
end
Actually, John, Joschi's answer is correct: the use of named :unique_index values does create a multiple-column index; it's important to read the right-hand side of those hash-rockets (i.e., if it had just been true, you would be right).
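The distinction between a shared index name and `true` can be illustrated with a small plain-Ruby sketch. This is not DataMapper code: `unique_index_columns` is a hypothetical helper that merely groups property declarations by their `:unique_index` value, to show which columns would end up in the same index under the behaviour described above.

```ruby
# Hypothetical helper, not DataMapper code: groups property declarations by
# their :unique_index value to show which columns share one index.
def unique_index_columns(properties)
  groups = Hash.new { |hash, key| hash[key] = [] }
  properties.each do |name, opts|
    index = opts[:unique_index]
    next unless index
    # `true` means a single-column index; a symbol names a shared index.
    key = index == true ? name : index
    groups[key] << name
  end
  groups
end

named = {
  name:      { unique_index: :u },
  domain_id: { unique_index: :u }
}
separate = {
  name:      { unique_index: true },
  domain_id: { unique_index: true }
}

p unique_index_columns(named)    # one group holding both columns
p unique_index_columns(separate) # one group per column
```

With the shared name, uniqueness applies to the pair of columns; with `true` on each property, every column has to be unique on its own, which is a stricter and different constraint.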
Did you try defining both properties as keys? I'm not sure I have tried it, but that way they should become a composite key.
property :name, String, :key => true
property :category, Integer, :key => true
