Batch insert multiple records with Mongoid? - ruby

I am reading through this Stack Overflow answer about how to insert multiple documents with Mongoid in one query. From the answer:
batch = [{:name => "mongodb"}, {:name => "mongoid"}]
Article.collection.insert(batch)
I need an example to understand how this works. Say we have the Article class:
class Article
  include Mongoid::Document
  include Mongoid::Timestamps

  field :subject, type: String
  field :body, type: String
  field :remote_id, type: String

  validates_uniqueness_of :remote_id

  belongs_to :news_paper, :inverse_of => :articles
end
And then I create, for example, an array of articles:
[ {subject: "Mongoid rocks", body: "It really does", remote_id: "1234", news_paper_id: "abc"},
{subject: "Ruby rocks", body: "It really does", remote_id: "1234", news_paper_id: "abc"},
{subject: "Rails rocks", body: "It really does", remote_id: "5678", news_paper_id: "abc"} ]
How do I create them, and at the same time make sure the validation catches that two of the remote_ids are the same?

If you add a unique index on the remote_id field, MongoDB will take care of the uniqueness of this field:
index({ remote_id: 1 }, { unique: true })
Don't forget to run create_indexes: rake db:mongoid:create_indexes
After that, you are free to use Article.collection.insert(batch).
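One caveat worth knowing: Article.collection.insert talks to the driver directly and bypasses Mongoid validations, so validates_uniqueness_of will not run; the unique index is what actually guards the database. If you also want to drop duplicates client-side before the raw insert, the idea can be sketched in plain Ruby (dedupe_by_remote_id is a hypothetical helper, not part of Mongoid):

```ruby
# Deduplicate a batch of article attribute hashes by :remote_id before a raw
# insert. Keeps the first occurrence of each remote_id and drops the rest.
# This is only a client-side sketch; the unique index remains the real guard.
def dedupe_by_remote_id(batch)
  seen = {}
  batch.each_with_object([]) do |doc, kept|
    key = doc[:remote_id]
    next if seen[key]   # skip later duplicates of the same remote_id
    seen[key] = true
    kept << doc
  end
end

batch = [
  { subject: "Mongoid rocks", body: "It really does", remote_id: "1234" },
  { subject: "Ruby rocks",    body: "It really does", remote_id: "1234" },
  { subject: "Rails rocks",   body: "It really does", remote_id: "5678" }
]

unique_batch = dedupe_by_remote_id(batch)
# unique_batch keeps only the first "1234" article plus the "5678" one
```

With the batch deduplicated, the unique index only has to catch duplicates against documents already in the collection.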

Related

How to add a field to ElasticSearch in order to order by it in ruby gem SearchKick

I have a project where I use the Searchkick gem to fetch records from Elasticsearch first and then do further processing on them. The developers who initially implemented it added a sort by two fields in the base options passed to Searchkick:
def base_options
  {
    fields: %i[catalogue_number oe_numbers],
    match: :word_start,
    misspellings: false,
    where: @conditions,
    load: false,
    order: [{ sale_rotation_group: { order: :desc } }, { lauber_id_integer: { order: :desc } }],
    highlight: @params[:highlight]
  }
end
lauber_id_integer does not exist in the application database, so it must be a field in Elasticsearch that they added.
Now I want to change this: instead of the current order, I want to tell Searchkick to tell Elasticsearch to order the records by the fields I added to the application database, images_count and parameters_count. So I probably need to add those new fields to Elasticsearch so it knows how to order the records. I changed the order option to
order: { images_count: { order: :desc } }
but now I am getting the error:
Searchkick::InvalidQueryError ([400] {"error":{"root_cause":[{"type":"query_shard_exception","reason":"No mapping found for [images_count] in order to sort on","index_uuid":"WhC4XK8IRnmmfkPJMNmV1g","index":"parts_development_20191124205133405"}]
This will probably also involve some extra work on the Elasticsearch side to add data to those new fields; however, I know very little about Elasticsearch. Could you give me some pointers or hints on how to solve my problem?
You need to add Elasticsearch indexes for each new field you want to be orderable/searchable.
Here is a decent guide to setting one up from scratch which explains all the basic concepts; Elasticsearch has its own schema, so you need to get that to match your new one.
The example they give is:
class Post < ApplicationRecord
  include Elasticsearch::Model
  include Elasticsearch::Model::Callbacks

  settings do
    mappings dynamic: false do
      indexes :author, type: :text
      indexes :title, type: :text, analyzer: :english
      indexes :body, type: :text, analyzer: :english
      indexes :tags, type: :text, analyzer: :english
      indexes :published, type: :boolean
    end
  end
end
You should see a similar section on whatever model you're searching for.
Just add the fields you want to have indexed to the model file, and then from console call (replacing Post with your model name):
Post.__elasticsearch__.delete_index!
Post.import force: true
I've found a solution in the Searchkick documentation.
Update search_data definition
def search_data
  {
    ...,
    images_count: images_count,
    parameters_count: parameters_count
  }
end
Update the Searchkick options:
def base_options
  {
    fields: %i[catalogue_number oe_numbers],
    match: :word_start,
    misspellings: false,
    where: @conditions,
    load: false,
    order: [{ images_count: { order: :desc } }, { parameters_count: { order: :desc } }],
    highlight: @params[:highlight]
  }
end
Reindex the model
Part.reindex
It will automatically add indexes.
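The multi-field sort above is just an array of single-key hashes, applied in order: images_count first, then parameters_count as a tie-breaker. A plain-Ruby sketch of the same ordering on an in-memory stand-in (the document hashes here are made-up sample data):

```ruby
# The shape Searchkick expects for a multi-field sort: an array of
# single-key hashes, evaluated left to right.
def order_option
  [
    { images_count:     { order: :desc } },
    { parameters_count: { order: :desc } }
  ]
end

# Sorting an in-memory stand-in the way Elasticsearch would apply it:
# descending by images_count, ties broken by descending parameters_count.
docs = [
  { images_count: 2, parameters_count: 9 },
  { images_count: 5, parameters_count: 1 },
  { images_count: 2, parameters_count: 3 }
]
sorted = docs.sort_by { |d| [-d[:images_count], -d[:parameters_count]] }
```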

How can I update a mongoid document's schema and remove "stale" fields from the old schema?

I'm trying to figure out how, using Mongoid, I can update my document schema and have any old fields no longer in the schema be automatically purged.
For example, let's say I have something like:
class Car
  include Mongoid::Document
  field :kind, type: String
  field :model, type: String
end
And I do:
Car.create(kind: "Toyota", model: "Sequoia")
In the DB it looks like:
{
  "_id": {
    "$oid": "55df818533aa6848de000000"
  },
  "kind": "Toyota",
  "model": "Sequoia"
}
I then later, in a separate session, redefine car as follows:
class Car
  include Mongoid::Document
  field :make, type: String
  field :model, type: String
end
And I grab the existing record (there's only one for now):
car = Car.first
And update it:
car.make = "Honda"
car.model = "Accord"
car.save
If I then go look in the DB I have:
{
  "_id": {
    "$oid": "55df818533aa6848de000000"
  },
  "kind": "Toyota",
  "model": "Accord",
  "make": "Honda"
}
In other words, even though my "schema" changed to be only "make" and "model", the attribute "kind" stayed in the document.
Is there a way to tell Mongoid "save and clear out unreferenced fields/objects"?
I'm not really sure what this is called, otherwise I'd probably find an answer to it... something about removing orphaned field definitions?
I welcome any help. Thanks!
Note: Mongoid "hides" this because when I go access the document based on the latest definition of Car, it only returns the relevant fields:
Car.last.inspect
"#<Car _id: 55df818533aa6848de000000, make: \"Honda\", model: \"Accord\">"
However, the "kind" field is still there in the DB. And in fact, if I redefine my schema again, to be something like:
class Car
  include Mongoid::Document
  field :kind, type: String
  field :make, type: String
  field :model, type: String
end
and fetch the record again, it still has the "kind" field's data:
Car.last.inspect
"#<Car _id: 55df818533aa6848de000000, kind: \"Toyota\", make: \"Honda\", model: \"Accord\">"
I'm guessing this is intentional, but in my case, I'd rather have an option to explicitly clear out "stale" or "orphaned" fields.
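The purge the question asks about boils down to "drop every stored key the current schema no longer defines" (on the MongoDB side this would typically be done with the $unset update operator; check the details against your Mongoid/driver version). The selection logic itself can be sketched in plain Ruby; prune_stale_fields is a hypothetical helper, not Mongoid API:

```ruby
# Given a stored document (as a Hash) and the fields the current schema
# defines, keep only the schema fields plus _id and drop everything else.
# A hypothetical sketch of what a $unset-based cleanup would remove.
def prune_stale_fields(doc, schema_fields)
  keep = schema_fields.map(&:to_s) + ["_id"]   # _id must always survive
  doc.select { |key, _| keep.include?(key) }
end

stored = {
  "_id"   => "55df818533aa6848de000000",
  "kind"  => "Toyota",    # stale: no longer in the schema
  "make"  => "Honda",
  "model" => "Accord"
}

pruned = prune_stale_fields(stored, [:make, :model])
# pruned no longer contains the stale "kind" field
```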

How to stop DataMapper from double query when limiting columns/fields?

I'm not sure if I'm at fault here or if my approach is wrong with this.
I want to fetch a user (limiting columns/fields only to name, email, id):
@user = User.first(:api_key => request.env["HTTP_API_KEY"], :fields => [:id, :name, :email])
The output in the command line is correct as follows:
SELECT "id", "name", "email" FROM "users" WHERE "api_key" = '90e20c4838ba3e1772ace705c2f51d4146656cc5' ORDER BY "id" LIMIT 1
Directly after the above query, I have this code:
render_json({
  :success => true,
  :code => 200,
  :user => @user
})
render_json() looks like this, nothing special:
def render_json(p)
  status p[:code] if p.has_key?(:code)
  p.to_json
end
The problem at this point is that the @user variable contains the full user object (all the other fields included), and DataMapper has made an additional query to the database to fetch the fields not included in the :fields constraint. From the logs:
SELECT "id", "password", "api_key", "premium", "timezone", "verified", "notify_me", "company", "updated_at" FROM "users" WHERE "id" = 1 ORDER BY "id"
My question is this: how do I stop DM from performing the additional query? I know it has to do with its lazy-loading architecture, and that returning the @user variable as JSON assumes that I want the whole user object. I particularly don't want the password field to be visible in any output representation of the user object.
The same behaviour can be seen when using DM's own serialisation module.
I think you should use an intermediate object for JSON rendering.
First, query the user from the database:
db_user = User.first(:api_key => request.env["HTTP_API_KEY"], :fields => [:id, :name, :email])
Then, create a "JSON object" to expose only this user's public fields:
@user = { id: db_user.id, name: db_user.name, email: db_user.email }
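This intermediate-object idea generalizes to a small whitelist helper: read only the attributes you name, so lazily-loaded fields are never touched and sensitive ones like password cannot leak. A sketch in plain Ruby (public_attributes is a hypothetical helper, and the Struct stands in for the DataMapper model):

```ruby
require 'json'

# A hypothetical helper that builds a safe JSON payload from any record by
# reading only a whitelist of attributes, so fields like password never leak.
def public_attributes(record, keys)
  keys.each_with_object({}) { |k, h| h[k] = record.public_send(k) }
end

# Stand-in for the DataMapper model, just for the sketch.
User = Struct.new(:id, :name, :email, :password)
db_user = User.new(1, "Alice", "alice@example.com", "secret")

payload = public_attributes(db_user, [:id, :name, :email])
json = payload.to_json
```

Because only the whitelisted readers are called, a lazy ORM has no reason to fire a second query for the remaining columns.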

Tire (Elasticsearch) indexing on associations

I'm a newbie to Elasticsearch and Tire. I have two classes: PersonRecord and Bookmark. PersonRecord contains attributes like name, birthday, location, etc., while Bookmark references the PersonRecord that is bookmarked and the user who bookmarked it. The PersonRecord class has_many Bookmarks.
Now I want to index PersonRecord and Bookmark for Elasticsearch; the goal is to boost PersonRecords that have been bookmarked by the user so they rank on top. To do that I need to index the user_id as well. The Bookmark class looks like:
class Bookmark
  include Mongoid::Document
  belongs_to :person_record, index: true, foreign_key: :pr_id, touch: true
  belongs_to :user, index: true
end
I tried to find documentation but didn't find a clear solution. So I'm planning to do something like this in the mapping:
indexes :bookmarks, type: 'object',
  properties: {
    user: {
      type: 'multi_field',
      fields: {
        id: { type: 'string', index: 'not_analyzed' }
      }
    }
  }
I'm not sure whether that is the right way to index on user_id. Or can I simplify it to something like
indexes :user_ids, as: bookmarks.map {|b| b.user.id }
And do I need to put it in *to_indexed_json* as well? Thanks!
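Whichever mapping is chosen, the value that ends up indexed for :user_ids is just the list of ids of the users who bookmarked the record, i.e. the result of the bookmarks.map expression above. With stand-in objects (the Structs below are made up for illustration), that mapping looks like:

```ruby
# Stand-ins for the Mongoid documents, to show what the :user_ids field
# would contain for a bookmarked PersonRecord.
Bookmark = Struct.new(:user)
User     = Struct.new(:id)

bookmarks = [
  Bookmark.new(User.new("u1")),
  Bookmark.new(User.new("u2"))
]

# The value to index for :user_ids, as in the `as:` option above.
user_ids = bookmarks.map { |b| b.user.id }
```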

Mongoid: filtering an embedded collection by sub-sub-documents multiple fields

I am a beginner with Mongo and Mongoid.
Is it possible to filter a sub-document collection by multiple fields of its sub-sub-documents ($elemMatch)? I'm trying to make a parametrized scope for an embedded collection.
Set up:
class Product
  include Mongoid::Document
  include Mongoid::Timestamps

  field :name, type: String, default: ''

  embeds_many :versions, class_name: self.name, validate: false, cyclic: true
  embeds_many :flags
end

class Flag
  include Mongoid::Document
  include Mongoid::Timestamps

  field :text, type: String
  field :state, type: Boolean
end
Now I want to filter my versions within a single product by flag state and text:
Product.first.versions.where('$elemMatch' => {'flags.text' => 'normalized', 'flags.state' => true})
doesn't work. Neither do these:
Product.first.versions.elem_match(flags: {text: 'normalized', state: true})
Product.first.versions.where(:flags.elem_match => {text: 'normalized', state: true})
Product.first.versions.where(flags: {'$elemMatch' => {text: 'normalized', state: true}})
Is there a way to do this? Thanks.
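Worth keeping in mind while debugging the $elemMatch variants: once the parent Product is loaded, its embedded versions are already in memory, so the same filter can always be expressed in plain Ruby as a fallback. With stand-in objects (the Structs are made up for illustration):

```ruby
# Stand-ins for the embedded documents, to sketch in-memory filtering.
Flag    = Struct.new(:text, :state)
Version = Struct.new(:name, :flags)

versions = [
  Version.new("v1", [Flag.new("normalized", true)]),
  Version.new("v2", [Flag.new("normalized", false)]),
  Version.new("v3", [Flag.new("other",      true)])
]

# Keep versions with at least one flag matching BOTH conditions, which is
# exactly what $elemMatch expresses on the server side.
matching = versions.select do |v|
  v.flags.any? { |f| f.text == 'normalized' && f.state == true }
end
```

This trades query pushdown for clarity, which is usually acceptable for an embedded collection that is loaded as a whole anyway.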