tire terms filter not working - elasticsearch

I'm trying to achieve a "scope-like" function with Tire/Elasticsearch. Why is this not working, even when I have entries with status "Test1" or "Test2"? The results are always empty.
collection = @model.search(:page => page, :per_page => per_page) do |s|
  s.query { all }
  s.filter :terms, :status => ["Test1", "Test2"]
  s.sort { by :"#{column}", "#{direction}" }
end
The method works fine without the filter. Is something wrong with the filter method? I've checked the Tire documentation; it should work.
Thanks! :)

Your issue is most probably caused by the default mapping for the status field, which tokenizes it -- downcasing it, splitting it into words, etc.
Compare these two:
http://localhost:9200/myindex/_analyze?text=Test1&analyzer=standard
http://localhost:9200/myindex/_analyze?text=Test1&analyzer=keyword
The solution in your case is to use the keyword analyzer (or set the field to not_analyzed) in your mapping. When the field would not be an “enum” type of data, you could use the multi-field feature.
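As a toy illustration (plain Ruby, not Elasticsearch itself), the standard analyzer roughly lowercases the input and splits it into word tokens, while the keyword analyzer keeps the whole value as a single token -- which is why a terms filter for "Test1" finds nothing in a standard-analyzed field:

```ruby
# Toy approximations of the two analyzers; real Elasticsearch analysis is
# more involved, but this captures why the terms filter comes back empty.
def standard_analyze(text)
  text.downcase.scan(/[[:alnum:]]+/) # lowercase, split into word tokens
end

def keyword_analyze(text)
  [text] # single token, unchanged
end

standard_analyze("Test1") # => ["test1"] -- an exact "Test1" no longer matches
keyword_analyze("Test1")  # => ["Test1"] -- exact match still works
```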
A working Ruby version would look like this:
require 'tire'

Tire.index('myindex') do
  delete
  create mappings: {
    document: {
      properties: {
        status: { type: 'string', analyzer: 'keyword' }
      }
    }
  }
  store status: 'Test1'
  store status: 'Test2'
  refresh
end

search = Tire.search 'myindex' do
  query do
    filtered do
      query { all }
      filter :terms, status: ['Test1']
    end
  end
end

puts search.results.to_a.inspect
Note: It's rarely possible -- this case being an exception -- to offer reasonable advice when no index mappings, example data, etc. are provided.
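For completeness, the multi-field option mentioned above could look roughly like this in a Tire mapping. This is a sketch for the pre-1.0 multi_field type; the sub-field name `raw` is illustrative:

```ruby
mapping do
  indexes :status, type: 'multi_field', fields: {
    # analyzed view, used for full-text queries
    status: { type: 'string', analyzer: 'standard' },
    # raw view, used for exact matching: filter :terms, 'status.raw' => [...]
    raw:    { type: 'string', index: 'not_analyzed' }
  }
end
```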

How to add a field to ElasticSearch in order to order by it in ruby gem SearchKick

I have a project where I use the Searchkick gem to fetch records from Elasticsearch first and then do further processing on them. The developers who initially implemented it added a sort option on two fields in the base options for Searchkick:
def base_options
  {
    fields: %i[catalogue_number oe_numbers],
    match: :word_start,
    misspellings: false,
    where: @conditions,
    load: false,
    order: [{ sale_rotation_group: { order: :desc } }, { lauber_id_integer: { order: :desc } }],
    highlight: @params[:highlight]
  }
end
lauber_id_integer does not exist in the application database, so it must be a field in Elasticsearch that they added.
Now I want to change this: instead of the current order, I want to tell Searchkick to tell Elasticsearch to order the records by the fields I added to the application database: images_count and parameters_count. So I probably need to add those new fields to Elasticsearch so it knows how to order the records. I changed the order option to
order: { images_count: { order: :desc } }
but now I am getting the error:
Searchkick::InvalidQueryError ([400] {"error":{"root_cause":[{"type":"query_shard_exception","reason":"No mapping found for [images_count] in order to sort on","index_uuid":"WhC4XK8IRnmmfkPJMNmV1g","index":"parts_development_20191124205133405"}]
This will probably also involve some extra work on the Elasticsearch side to populate those new fields; however, I know very little about Elasticsearch. Could you give me some hints on how to solve my problem?
You need to add Elasticsearch indexes for each new field you want to be orderable/searchable.
Here is a decent guide to setting one up from scratch which explains all the basic concepts; Elasticsearch has its own schema, so you need to get it to match your new one.
The example they give is:
class Post < ApplicationRecord
  include Elasticsearch::Model
  include Elasticsearch::Model::Callbacks

  settings do
    mappings dynamic: false do
      indexes :author, type: :text
      indexes :title, type: :text, analyzer: :english
      indexes :body, type: :text, analyzer: :english
      indexes :tags, type: :text, analyzer: :english
      indexes :published, type: :boolean
    end
  end
end
You should see a similar section on whatever model you're searching for.
Just add the fields you want indexed to the model file, and then from the console call (replacing Post with your model name):
Post.__elasticsearch__.delete_index!
Post.import force: true
I've found a solution in the Searchkick documentation.
Update the search_data definition:
def search_data
  {
    ...,
    images_count: images_count,
    parameters_count: parameters_count
  }
end
Update the Searchkick options:
def base_options
  {
    fields: %i[catalogue_number oe_numbers],
    match: :word_start,
    misspellings: false,
    where: @conditions,
    load: false,
    order: [{ images_count: { order: :desc } }, { parameters_count: { order: :desc } }],
    highlight: @params[:highlight]
  }
end
Reindex the model:
Part.reindex
Reindexing will add the new fields to the index automatically.

Ruby finding duplicates in MongoDB

I am struggling to get this working efficiently. I think map/reduce is the answer, but I can't get anything working. I know it is probably a simple answer; hopefully someone can help.
The Entry model looks like this:
field :var_name, type: String
field :var_data, type: String
field :var_date, type: DateTime
field :external_id, type: Integer
If the external data source malfunctions, we get duplicate data. One way to stop this was, when consuming the results, to check whether a record with the same external_id already exists as one we have already consumed. However, this slows down the process a lot. The plan now is to check for duplicates once a day, so we want a list of Entries sharing the same external_id, which we can then sort and delete those no longer needed.
I have tried adapting the snippet from https://coderwall.com/p/96dp8g/find-duplicate-documents-in-mongoid-with-map-reduce as shown below, but I get:
failed with error 0: "exception: assertion src/mongo/db/commands/mr.cpp:480"
def find_duplicates
  map = %Q{
    function() {
      emit(this.external_id, 1);
    }
  }
  reduce = %Q{
    function(key, values) {
      return Array.sum(values);
    }
  }
  Entry.all.map_reduce(map, reduce).out(inline: true).each do |entry|
    puts entry["_id"] if entry["value"] != 1
  end
end
Am I way off? Could anyone suggest a solution? I am using Mongoid, Rails 4.1.6 and Ruby 2.1.
I got it working using Stennie's suggestion in the comments on the question to use the Aggregation framework. It looks like this:
results = Entry.collection.aggregate([
  { "$group" => {
      _id: { "external_id" => "$external_id" },
      recordIds: { "$addToSet" => "$_id" },
      count: { "$sum" => 1 }
  }},
  { "$match" => {
      count: { "$gt" => 1 }
  }}
])
I then loop through the results and delete any unnecessary entries.
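The cleanup loop itself isn't shown above; here is a plain-Ruby sketch of how it could work on the aggregation output. The sample hashes mimic the `$group` result, and the commented-out `Entry.in(...).delete_all` call is an assumed Mongoid query:

```ruby
# Each group from the $group stage carries the _ids of every duplicate.
# Sample data shaped like the aggregation output above (ids are made up).
results = [
  { "_id" => { "external_id" => 42 }, "recordIds" => ["a1", "a2", "a3"], "count" => 3 },
  { "_id" => { "external_id" => 7 },  "recordIds" => ["b1", "b2"],       "count" => 2 }
]

ids_to_delete = results.flat_map do |group|
  group["recordIds"].sort.drop(1) # keep one record per group, delete the rest
end
# Entry.in(id: ids_to_delete).delete_all  # assumed Mongoid call

ids_to_delete # => ["a2", "a3", "b2"]
```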

Mongoid failing to find document by nested ID

I have a collection with documents that look something like this:
{
  _id: ObjectId("521d11014903728f8d000006"),
  association_chain: [
    {
      name: "Foobar",
      id: ObjectId("521d11014903728f8d000005")
    }
  ],
  // etc...
}
I can search by the name attribute with this query:
@results = Model.where 'association_chain.name' => 'Foobar'
This returns the results as expected. However, when I try to search using the id attribute:
@results = Model.where 'association_chain.id' => '521d11014903728f8d000005'
There are no results. As far as I can tell, the query that Mongoid generates looks correct:
MOPED: 127.0.0.1:27017 QUERY database=x collection=x selector={"$query"=>{"association_chain.id"=>"521d11014903728f8d000005"}, "$orderby"=>{"created_at"=>-1}} flags=[] limit=25 skip=0 batch_size=nil fields=nil (244.7259ms)
What am I doing wrong?
You are searching for the string. Try searching for an ObjectId instead, like:
@results = Model.where 'association_chain.id' => BSON::ObjectId('521d11014903728f8d000005')
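If the id string comes from user input (params), you might validate it before converting; a plain-Ruby sketch (the 24-hex-character check matches the ObjectId string format, while the actual conversion call depends on your driver version):

```ruby
# A valid ObjectId string is exactly 24 hexadecimal characters.
def looks_like_object_id?(str)
  !!(str =~ /\A\h{24}\z/)
end

looks_like_object_id?("521d11014903728f8d000005") # => true
looks_like_object_id?("foobar")                   # => false
```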

Filter on associations' ids with ElasticSearch and Tire

I've been banging my head against what should be a simple query for quite a while now. I've looked at all the documentation and examples, as well as most questions regarding Tire here on Stack Overflow, with no success.
Basically, I'm trying to filter my search results based on the IDs of some associated models.
Here's the model (note that I still use dynamic mapping at the moment):
class Location < ActiveRecord::Base
  belongs_to :city
  has_and_belongs_to_many :tags
  # also has a string attribute named 'kind'
end
What I'm trying to do is to filter my search query by the city_id, by one tag_id and by kind.
I've tried building the query, but I only get errors because I can't seem to build it correctly. Here's what I have so far (not working):
Location.search do
  query { string params[:query] } if params[:query].present?
  filter :term, { city_id: params[:city_id] } if params[:city_id].present? # I'd like to use the ids filter, but have no idea of the syntax I'm supposed to use
  filter :ids, { 'tag.id', values: [params[:tag_id]] } if params[:tag_id].present? # does not compile
  filter :match, { kind: params[:kind] } if params[:kind].present? # does not compile either
end
Turns out the dynamic mapping doesn't cut it for this kind of scenario. I also had to define how my data was indexed.
Here's my mapping:
mapping do
  indexes :id, index: :not_analyzed
  indexes :kind, index: :not_analyzed
  indexes :city_id, index: :not_analyzed
  indexes :tags do
    indexes :id, index: :not_analyzed
  end
end
and my custom to_indexed_json:
def to_indexed_json
  {
    kind: kind,
    city_id: city_id,
    tags: tags.map do |t|
      {
        id: t.id
      }
    end
  }.to_json
end
Finally, I can filter like so:
Location.search do
  query { string params[:query] } if params[:query].present?
  filter :term, { city_id: params[:city_id] } if params[:city_id].present?
  filter :term, { "tags.id" => params[:tag_id] } if params[:tag_id].present?
  filter :term, { kind: params[:kind] } if params[:kind].present?
end
The important part is the tags indexing which allows me to use "tags.id" in a filter.
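To make the indexed document concrete: with that to_indexed_json, a location would be stored roughly as follows (a plain-Ruby sketch with made-up values; the nested tags array is what makes the "tags.id" path filterable):

```ruby
require 'json'

# Simulating the to_indexed_json output for a hypothetical location
# in city 3 with tags 1 and 2.
doc = {
  kind: "restaurant",
  city_id: 3,
  tags: [{ id: 1 }, { id: 2 }]
}.to_json

doc # => {"kind":"restaurant","city_id":3,"tags":[{"id":1},{"id":2}]}
```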

Mongoid Complex Query Including Embedded Docs

I have a model with several embedded models. I need to query for a record to see if it exists. The issue is that I will have to include references to multiple embedded documents; my query would have to include the following params:
{
  "first_name"=>"Steve",
  "last_name"=>"Grove",
  "email_addresses"=>[
    {"type"=>"other", "value"=>"steve@stevegrove.com", "primary"=>"true"}
  ],
  "phone_numbers"=>[
    {"type"=>"work_fax", "value"=>"(720) 555-0631"},
    {"type"=>"home", "value"=>"(303) 555-1978"}
  ],
  "addresses"=>[
    {"type"=>"work", "street_address"=>"6390 N Main Street", "city"=>"Elbert", "state"=>"CO"}
  ]
}
How can I query for all the embedded docs even though some fields are missing such as _id and associations?
A few things to think about.
Are you sure the query HAS to contain all these parameters? Isn't there a subset of this information that uniquely identifies the record, say (first_name, last_name, and an email_addresses.value)? It would be silly to query on all the conditions if you could accomplish the same thing with less work.
In Mongoid the where criteria allows you to use straight JavaScript, so if you know how to write the JavaScript criteria you can just pass a string of JavaScript to where.
Otherwise you're left writing a really awkward where criteria statement; thankfully you can use dot notation.
Something like:
UserProfile.where(first_name: "Steve",
                  last_name: "Grove",
                  :email_addresses.matches => { type: "other",
                                                value: "steve@stevegrove.com",
                                                primary: "true" },
                  ..., ...)
In response to the request for embedded JS:
query = %{
  function () {
    var email_match = false;
    for (var i = 0; i < this.email_addresses.length && !email_match; i++) {
      email_match = this.email_addresses[i].value === "steve@stevegrove.com";
    }
    return this.first_name === "Steve" &&
           this.last_name === "Grove" &&
           email_match;
  }
}
UserProfile.where(query).first
It's not pretty, but it works.
With Mongoid 3 you could use elem_match: http://mongoid.org/en/origin/docs/selection.html#symbol
UserProfile.where(:email_addresses.elem_match => { value: 'steve@stevegrove.com', primary: true })
This assumes
class UserProfile
  include Mongoid::Document
  embeds_many :email_addresses
end
Now if you needed to include every one of these fields, I would recommend using the UserProfile.collection.aggregate(query). In this case you could build a giant hash with all the fields.
query = { '$match' => {
  '$or' => [
    { :email_addresses.elem_match => { value: 'steve@stevegrove.com', primary: true } }
  ]
} }
It starts to get a little crazy, but hopefully that gives you some insight into what your options might be. See https://coderwall.com/p/dtvvha for another example.
