Mongoid Capped Collection - ruby

I'm trying to create a capped collection with Mongoid. I have a definition as follows:
class Customer
include Mongoid::Document
store_in(collection: 'customers')
field :n, type: String, as: :name
field :a, type: String, as: :address
field :z, type: String, as: :zip
end
I've been referencing the documentation but can't figure out how to make a capped collection in this portion of the code. I've tried removing the store_in line and replacing it with session.command(create: "customers", capped: true, size: 10000000, max: 1000) to no avail. Is session supposed to be replaced with something? Or am I going about this incorrectly?

Mongoid does not provide a mechanism for creating capped collections on the fly - you will need to create these yourself via the Mongo console.
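If you would rather issue the same create command from Ruby than from the console, something along these lines may work. This is only a minimal sketch, not Mongoid's own API: create_capped_collection is a hypothetical helper, and it assumes nothing more than an object that responds to command the way the driver's Mongo::Database does. FakeDatabase is a stand-in so the snippet runs without a server.

```ruby
# Hypothetical helper: sends MongoDB's `create` command with the capped
# options from the question. `database` is assumed to respond to #command
# (as Mongo::Database does in the 2.x Ruby driver).
def create_capped_collection(database, name, size:, max: nil)
  spec = { create: name, capped: true, size: size }
  spec[:max] = max if max # `max` (document count limit) is optional
  database.command(spec)
end

# Stand-in used only to make this sketch runnable without a server;
# it records the command instead of sending it anywhere.
class FakeDatabase
  attr_reader :commands

  def initialize
    @commands = []
  end

  def command(spec)
    @commands << spec
    { 'ok' => 1.0 }
  end
end

db = FakeDatabase.new
create_capped_collection(db, 'customers', size: 10_000_000, max: 1000)
p db.commands.first
```

In an application the first argument would be the real database object, for example Mongoid.default_client.database on Mongoid 5 or later (an assumption about your setup). The store_in(collection: 'customers') line in the model can stay as it is, since it only names the collection.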

Related

RSpec validation fails with little explanation

I've created an RSpec test to simply test if my model is valid with the given info, it should be, yet my test is still failing. I'm hoping someone can see why since I've stared at this all day yesterday.
I'm also using MongoDB (not sure if that matters).
models/stock.rb
class Stock
include Mongoid::Document
field :symbol, type: String
field :last_trade_price, type: Integer
field :ask, type: Integer
field :change, type: Integer
field :change_percent, type: String
field :market_cap, type: String
field :avg_volume, type: Integer
field :change_from_year_high, type: Integer
field :change_from_year_low, type: Integer
field :change_from_year_high_percent, type: Integer
field :change_from_year_low_percent, type: Integer
field :year_high, type: Integer
field :year_low, type: Integer
field :day_high, type: Integer
field :day_low, type: Integer
field :day_range, type: String
field :ebitda, type: String
field :eps_estimate_current_year, type: Integer
field :eps_estimate_next_year, type: Integer
field :eps_estimate_next_quarter, type: Integer
validates :symbol, :last_trade_price, :ask, :change, :change_percent, :market_cap,
:avg_volume, :change_from_year_high, :change_from_year_low, :change_from_year_high_percent,
:change_from_year_low_percent, :year_high, :year_low, :day_high, :day_low, :day_range,
:ebitda, :eps_estimate_current_year, :eps_estimate_next_year, :eps_estimate_next_quarter, presence: true
validates :last_trade_price, :ask, :change, :avg_volume,
:change_from_year_high, :change_from_year_low, :change_from_year_high_percent,
:change_from_year_low_percent, :year_high, :year_low, :day_high, :day_low,
:eps_estimate_current_year, :eps_estimate_next_year, :eps_estimate_next_quarter, numericality: true
validates_uniqueness_of :symbol
end
spec/factories.rb
FactoryGirl.define do
factory :stock do
symbol "AAPL"
last_trade_price 92.51
ask 92.78
change -0.91
change_percent "-0.91 - -0.97"
market_cap "512.93B"
avg_volume 37776500
change_from_year_high -40.46
change_from_year_low 0.66
change_from_year_high_percent -30.43
change_from_year_low_percent 0.72
year_high 132.97
year_low 91.85
day_high 93.57
day_low 92.46
day_range "92.46 - 93.57"
ebitda "82.79B"
eps_estimate_current_year 8.29
eps_estimate_next_year 9.15
eps_estimate_next_quarter 1.67
end
end
spec/models/stock_spec.rb
describe Stock do
let(:stock) { build(:stock) }
it "should be valid if all information is provided" do
expect(stock).to be_valid
end
end
My output from running the rspec test is:
Failures:
1) Stock should be valid if all information is provided
Failure/Error: expect(stock).to be_valid
expected `#<Stock _id: 5734dd60b8066872f6000000, symbol: "AAPL", last_trade_price: 92, ask: 92, change: 0, change_percent: "-0.91 - -0.97", market_cap: "512.93B", avg_volume: 37776500, change_from_year_high: -40, change_from_year_low: 0, change_from_year_high_percent: -30, change_from_year_low_percent: 0, year_high: 132, year_low: 91, day_high: 93, day_low: 92, day_range: "92.46 - 93.57", ebitda: "82.79B", eps_estimate_current_year: 8, eps_estimate_next_year: 9, eps_estimate_next_quarter: 1>.valid?` to return true, got false
# ./spec/models/stock_spec.rb:5:in `block (2 levels) in <top (required)>'
Finished in 0.02311 seconds (files took 1.72 seconds to load)
1 examples, 1 failure
Failed examples:
rspec ./spec/models/stock_spec.rb:4 # Stock should be valid if all information is provided
Randomized with seed 36574
From looking at the error, it seems that all of the information was built into the factory test object, so I'm unsure why the test is getting false instead of the true it's expecting.
Thanks for any help!
You can test which fields are giving an error by modifying the spec:
describe Stock do
let(:stock) { build(:stock) }
it "should be valid if all information is provided" do
#expect(stock).to be_valid
stock.valid?
expect(stock.errors.full_messages).to eq []
end
end
However, even in this form the spec has very little actual value: you're just testing that your factory has all the required fields. If it didn't, you would get failures in other specs anyway.
Also if you are grouping a bunch of similar validations by type you might want to use the longhand methods instead as it is much easier to read:
validates_presence_of :symbol, :last_trade_price, :ask, :change, :change_percent, :market_cap,
:avg_volume, :change_from_year_high, :change_from_year_low, :change_from_year_high_percent,
:change_from_year_low_percent, :year_high, :year_low, :day_high, :day_low, :day_range,
:ebitda, :eps_estimate_current_year, :eps_estimate_next_year, :eps_estimate_next_quarter
Added
When defining factories you should use sequences or computed properties to ensure that unique fields are unique - otherwise your validations will fail if you create more than one record from your factory!
FactoryGirl.define do
factory :stock do
sequence :symbol do |n|
"TEST-#{n}"
end
last_trade_price 92.51
ask 92.78
change -0.91
change_percent "-0.91 - -0.97"
market_cap "512.93B"
avg_volume 37776500
change_from_year_high -40.46
change_from_year_low 0.66
change_from_year_high_percent -30.43
change_from_year_low_percent 0.72
year_high 132.97
year_low 91.85
day_high 93.57
day_low 92.46
day_range "92.46 - 93.57"
ebitda "82.79B"
eps_estimate_current_year 8.29
eps_estimate_next_year 9.15
eps_estimate_next_quarter 1.67
end
end
Gems like FFaker are really helpful here. See the FactoryGirl docs for more info.
Also you should use a gem like database_cleaner (yes, it works for Mongoid) to clean out your database between specs. The reason your validation is currently failing is that you have residual test state from some other test which is affecting the result.

Mongoid push with upsert

I've got model User:
class User
include Mongoid::Document
field :username, type: String
embeds_many :products
end
class Product
include Mongoid::Document
field :name, type: String
embedded_in :user
end
I would like to have a single operation that would:
insert the user
update the user in case the user exists already (this I can easily do with upsert)
push the products
This works for upserting:
User.new(username: 'Hello').upsert
The problem is that this will delete the embedded products (the products attribute is not specified).
Can I ask mongoid to skip setting array to empty?
Can I ask mongoid to push new products at the end of products array?
Something like this:
User.new(username: 'Hello').push(products: [Product.new(name: 'Screen')]).upsert
In the end I wrote the following query manually:
User.mongo_client[:users].update_one({username: 'Hello'},
{"$set" => {first_name: 'Jim', last_name: 'Jones'},
"$pushAll" => {products: [{name: 'Screen'}, {name: 'Keyboard'}]}
},
upsert: true)
Where:
$set - are the params that we want to set for a given document
$pushAll - when you use $push you can specify only one element, $pushAll allows you to append multiple elements (when you specify only one it will behave like $push)
upsert - will do the insert/update magic in the mongodb
In the second hash you can also specify other update operators such as $inc (with a negative value to decrement), $pop, $unset etc., which is quite useful.
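One caveat for newer servers: $pushAll was deprecated long ago and removed in MongoDB 3.6, where $push combined with $each does the same job. The sketch below only builds the update document, so it runs without a database. upsert_with_products is a hypothetical helper name, not part of Mongoid or the driver.

```ruby
# Hypothetical helper: builds the update document for an upsert that sets
# top-level attributes and appends embedded products in one operation.
def upsert_with_products(attributes, products)
  {
    '$set'  => attributes,
    # `$push` with `$each` appends every element of the array
    '$push' => { products: { '$each' => products } }
  }
end

update = upsert_with_products(
  { first_name: 'Jim', last_name: 'Jones' },
  [{ name: 'Screen' }, { name: 'Keyboard' }]
)
p update
```

The resulting hash would take the place of the second argument to update_one in the query above, still together with upsert: true.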

Mongoid $project aggregation doesn't return anything

I'm trying to perform the following aggregation with Mongoid:
Award.collection.aggregate( [ {"$project" => {:"value.amount"=> 1}} ] )
This returns:
#<Mongo::Collection::View::Aggregation:0x0055cc6e8658b8
@options={},
@pipeline=[{"$project"=>{:"value.amount"=>1}}],
@view=#<Mongo::Collection::View:0x47168257993960
namespace='elvis_development.awards @selector={} @options={}>>
so no results but no errors either. This version has the same syntax as the example they give in the docs but I've tried different syntax too, with no success. In the mongo shell this:
db.awards.aggregate( [ { $project : { "value.amount" : 1 } } ] )
returns the desired results.
I use MongoDB v3.0.7 and Mongoid 5.0.1 and this is my model:
class Award
include Mongoid::Document
include Mongoid::Elasticsearch
# Associations
belongs_to :document
embeds_one :date, class_name: "AwardDate", inverse_of: :award
embeds_one :value, class_name: "Value", inverse_of: :award
accepts_nested_attributes_for :value, :date
# Fields
field :title, type: String
field :description, type: String
elasticsearch!({
prefix_name: false,
index_name: 'awards',
wrapper: :load
})
end
Am I doing something wrong? I noticed in this example on mongo_ruby_driver Github that the $project aggregation is supported, but I've tried with both nested and not nested attributes with the same result. I realize I could do this with normal retrieval but I would prefer aggregations since they are faster and I have a large data set. Any thoughts would be very much appreciated.
Modern releases of Mongoid (v5 and greater) now use a modern mongodb ruby driver rather than the older "moped" driver of Mongoid v3 and v4.
This means that .aggregate() returns a "cursor", specifically a Mongo::Collection::View::Aggregation object (as the output in your question shows), instead of a plain array of documents, which is consistent with other modern driver releases.
So iterate the "cursor" instead, in the standard ways, e.g.:
require "pp"
Award.collection.aggregate( [ {"$project" => { "value.amount"=> 1}} ] ).each do | doc |
pp doc
end
Which will give you output like this for each document in the response:
{"_id"=>BSON::ObjectId('564c4836023fb886145f8063'), "value"=>{"amount"=>1.0}}
Just like you asked for.
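If you want an Array back, as the older driver gave you, the usual Enumerable methods should do it, assuming the driver's aggregation view is Enumerable (I believe it is, but check your driver version). The sketch below uses a stand-in class, not the real Mongo class, so it runs without a server. It only illustrates that documents are produced on iteration and that to_a and map are built on each.

```ruby
# Stand-in for a driver aggregation view (NOT the real Mongo class): it holds
# a pipeline and some canned documents, and yields them only when iterated.
class FakeAggregation
  include Enumerable

  def initialize(pipeline, docs)
    @pipeline = pipeline
    @docs = docs
  end

  # Enumerable builds to_a, map, first, etc. on top of #each.
  def each(&block)
    @docs.each(&block)
  end
end

view = FakeAggregation.new(
  [{ '$project' => { 'value.amount' => 1 } }],
  [{ '_id' => 1, 'value' => { 'amount' => 1.0 } }]
)

docs = view.to_a                                    # materialize as an Array
amounts = view.map { |doc| doc['value']['amount'] } # or transform while iterating
p docs
p amounts
```

With the real driver the equivalent would be calling to_a or map on the result of Award.collection.aggregate(...).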

Import from one index to a new index with a persistence model

I have an application that has a Nutch crawler sending results directly to an ElasticSearch index created by a Tire Persistence model.
I am looking for the best way to change the index without deleting, recreating and re-populating it, since the index is the master data source. I've been trying the approach where the index name is an alias with one or more indexes behind it, and data is imported from the master index into a new index.
I have been trying to get the rake environment tire:import CLASS='Applicant' INDEX='index_new' command to do this, but without success. The import first fails with an undefined method 'paginate'; after I defined a 'paginate' method in my model, it fails with an undefined method 'count', which it hits at tire-0.60.0/lib/tire/model/import.rb:102.
I've been scouring for days looking for the right approach, and I'm not convinced that I'm on the right path at all. I have included my model below for reference. I am using WillPaginate for pagination.
class Applicant
include Tire::Model::Persistence
include Tire::Model::Search
include Tire::Model::Callbacks
require 'will_paginate'
require 'will_paginate-bootstrap'
require 'will_paginate/array'
index_name 'index'
document_type 'doc'
mapping do
indexes :boost, type: 'string'
indexes :content, type: 'string'
indexes :digest, type: 'string'
indexes :id, type: 'string'
indexes :skill, type: 'string'
indexes :title, type: 'string'
indexes :tstamp, type: 'date', format: 'dateOptionalTime'
indexes :url, type: 'string'
indexes :domain, type: 'string'
end
property :boost
property :content
property :digest
property :id
property :skill
property :title
property :tstamp
property :url
property :domain
def self.search(params)
tire.search(page: params[:page], per_page: 20) do
query { string params[:query], default_operator: "AND" } if params[:query].present?
filter :term, domain: params[:domain_selected] if params[:domain_selected].present?
filter :term, skill: params[:skill_selected] if params[:skill_selected].present?
facet "domains" do
terms :domain
end
facet "skills" do
terms :skill
end
end
end
def self.paginate(params)
@page_results = WillPaginate::Collection.create(params[:page], per_page, total_entries) do |pager|
pager.replace(@self.to_array)
end
@page_results = @self.paginate(params[:current_page], params[:per_page])
end
end
On a side note, and lower priority to me: I've been digging through the code trying to understand why the import needs pagination, and it's not clear to me.
Thanks in advance.
Well, my guess is that you're getting that error because your view refers to the paginate gem.
The first thing to do is either to check your view and strip paginate out of the view and the controller, or, if you need paginate, to do this simple test:
Your application should load the will_paginate gem. To see if the library has been loaded, open the console for your app and try the following lines:
defined? WillPaginate
ActiveRecord::Base.respond_to? :paginate
If any of these lines return nil/false, will_paginate has not properly loaded in your app.
(from https://github.com/mislav/will_paginate/wiki/Troubleshooting)
If it fails out, make sure you have the following two lines in your Gemfile:
gem 'will_paginate', '~> 3.0.3'
gem 'bootstrap-will_paginate', '~> 0.0.6'
If that doesn't work for you, let me know, and we'll dig deeper.
So after 2 weeks of searching, I found the solution I was looking for. I get the result I wanted by calling Article.create_elasticsearch_index followed by Tire.index('original-index-name').reindex 'new-index-name'. Karmi's tweet here is what led me to the right solution.
https://twitter.com/karmiq/status/185811361069142016
I'm also working on adapting jarosan's work here into working for my situation and will post soon.
https://gist.github.com/3124884
Thanks Michel and Karel.

Elasticsearch, Tire, and Nested queries / associations with ActiveRecord

I'm using ElasticSearch with Tire to index and search some ActiveRecord models, and I've been searching for the "right" way to index and search associations. I haven't found what seems like a best practice for this, so I wanted to ask if anyone has an approach that they think works really well.
As an example setup (this is made up but illustrates the problem), let's say we have a book, with chapters. Each book has a title and author, and a bunch of chapters. Each chapter has text. We want to index the book's fields and the chapters' text so you can search for a book by author, or for any book with certain words in it.
class Book < ActiveRecord::Base
include Tire::Model::Search
include Tire::Model::Callbacks
has_many :chapters
mapping do
indexes :title, :analyzer => 'snowball', :boost => 100
indexes :author, :analyzer => 'snowball'
indexes :chapters, type: 'object', properties: {
chapter_text: { type: 'string', analyzer: 'snowball' }
}
end
end
class Chapter < ActiveRecord::Base
belongs_to :book
end
So then I do the search with:
s = Book.search do
query { string query_string }
end
That doesn't work, even though it seems like that indexing should do it. If instead I index:
indexes :chapters, :as => 'chapters.map{|c| c.chapter_text}.join("|")', :analyzer => 'snowball'
That makes the text searchable, but obviously it's not a nice hack and it loses the actual associated object. I've tried variations of the searching, like:
s = Book.search do
query do
boolean do
should { string query_string }
should { string "chapters.chapter_text:#{query_string}" }
end
end
end
With no luck there, either. If anyone has a good, clear example of indexing and searching associated ActiveRecord objects using Tire, it seems like that would be a really good addition to the knowledge base here.
Thanks for any ideas and contributions.
The support for ActiveRecord associations in Tire is working, but requires a couple of tweaks inside your application. There's no question the library should do a better job here, and in the future it certainly will.
That said, here is a full-fledged example of Tire configuration to work with Rails' associations in elasticsearch: active_record_associations.rb
Let me highlight a couple of things here.
Touching the parent
First, you have to ensure you notify the parent model of the association about changes in the association.
Given we have a Chapter model, which “belongs to” a Book, we need to do:
class Chapter < ActiveRecord::Base
belongs_to :book, touch: true
end
In this way, when we do something like:
book.chapters.create text: "Lorem ipsum...."
The book instance is notified about the added chapter.
Responding to touches
With this part sorted, we need to notify Tire about the change, and update the elasticsearch index accordingly:
class Book < ActiveRecord::Base
has_many :chapters
after_touch() { tire.update_index }
end
(There's no question Tire should intercept after_touch notifications by itself, and not force you to do this. It is, on the other hand, a testament to how easy it is to work your way around the library's limitations in a manner which does not hurt your eyes.)
Proper JSON serialization in Rails < 3.1
Although the README mentions that you have to disable automatic "adding root key in JSON" in Rails < 3.1, many people forget it, so you have to include it in the class definition as well:
self.include_root_in_json = false
Proper mapping for elasticsearch
Now comes the meat of our work -- defining proper mapping for our documents (models):
mapping do
indexes :title, type: 'string', boost: 10, analyzer: 'snowball'
indexes :created_at, type: 'date'
indexes :chapters do
indexes :text, analyzer: 'snowball'
end
end
Notice we index title with boosting, created_at as "date", and chapter text from the associated model. All the data are effectively “de-normalized” as a single document in elasticsearch (if such a term would make slight sense).
Proper document JSON serialization
As the last step, we have to properly serialize the document in the elasticsearch index. Notice how we can leverage the convenient to_json method from ActiveRecord:
def to_indexed_json
to_json( include: { chapters: { only: [:text] } } )
end
With all this setup in place, we can search in properties in both the Book and the Chapter parts of our document.
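To make the de-normalized document concrete, here is a small plain-Ruby sketch of the JSON shape that the to_indexed_json above is meant to produce. It does not use Rails or Tire: Book and Chapter are hypothetical Struct stand-ins for the ActiveRecord models, and the field values are made up for illustration.

```ruby
require 'json'

# Plain-Ruby stand-ins for the ActiveRecord models (hypothetical; no Rails here).
Chapter = Struct.new(:id, :text)
Book = Struct.new(:id, :title, :chapters) do
  # Mirrors the intent of to_json(include: { chapters: { only: [:text] } }):
  # the book's own fields plus only the `text` of each chapter.
  def to_indexed_json
    {
      id: id,
      title: title,
      chapters: chapters.map { |chapter| { text: chapter.text } }
    }.to_json
  end
end

book = Book.new(1, 'Moby-Dick', [Chapter.new(10, 'Call me Ishmael.')])
puts book.to_indexed_json
# prints {"id":1,"title":"Moby-Dick","chapters":[{"text":"Call me Ishmael."}]}
```

Each chapter ends up nested inside the book's document, which is why the mapping declares indexes :text inside the indexes :chapters block.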
Please run the active_record_associations.rb Ruby file linked at the beginning to see the full picture.
For further information, please refer to these resources:
https://github.com/karmi/railscasts-episodes/commit/ee1f6f3
https://github.com/karmi/railscasts-episodes/commit/03c45c3
https://github.com/karmi/tire/blob/master/test/models/active_record_models.rb#L10-20
See this StackOverflow answer: ElasticSearch & Tire: Using Mapping and to_indexed_json for more information about mapping / to_indexed_json interplay.
See this StackOverflow answer: Index the results of a method in ElasticSearch (Tire + ActiveRecord) to see how to fight n+1 queries when indexing models with associations.
I created this as a solution in one of my applications, which indexes a deeply nested set of models:
https://gist.github.com/paulnsorensen/4744475
UPDATE: I have now released a gem that does this:
https://github.com/paulnsorensen/lifesaver
