Elasticsearch Nested Query matching and/or - elasticsearch

For example, I have data containing the following:
{
author: "test",
books: [
{
name: "first book",
cost: 50
},
{
name: "second book",
cost: 100
}
]
}
I want to find authors for whom ALL books have cost > 40. What would the query for that look like? The books field is mapped as a nested property.

To get author names where at least one book costs more than 40 (in hits), a query like the following would work:
POST http://192.168.0.68:9200/library/Book/_search
{
"fields": ["author"],
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"nested": {
"path": "books",
"filter": {
"range": {
"books.cost": {
"gt": 40
}
}
}
}
}
}
}
}
To require that all books cost more than 40, I had to filter the nested collection manually on the client side after getting the response.
I am not sure whether a script could be used here to apply the filter to all nested objects.
Reference
document not in nested documents elasticsearch
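One commonly suggested way to express "every nested object matches" without client-side filtering is to negate the condition: require at least one book above the threshold and exclude any author that has a book at or below it. The sketch below is only an illustration under assumptions: it uses the bool query with nested queries keyed by "query" (the form used in Elasticsearch 2.x and later, not the filtered form shown above), and the helper names all_books_cost_above and matches are hypothetical. The matches function is a plain-Python mirror of the intended logic, not something Elasticsearch runs.

```python
import json

def all_books_cost_above(threshold):
    """Build a search body: authors whose every nested book costs more than threshold."""
    return {
        "query": {
            "bool": {
                # At least one book above the threshold (also rules out authors with no books).
                "must": [
                    {"nested": {"path": "books",
                                "query": {"range": {"books.cost": {"gt": threshold}}}}}
                ],
                # No book at or below the threshold.
                "must_not": [
                    {"nested": {"path": "books",
                                "query": {"range": {"books.cost": {"lte": threshold}}}}}
                ],
            }
        }
    }

def matches(doc, threshold):
    """Plain-Python mirror of the query's logic, for checking the idea locally."""
    costs = [book["cost"] for book in doc.get("books", [])]
    has_expensive = any(cost > threshold for cost in costs)
    has_cheap = any(cost <= threshold for cost in costs)
    return has_expensive and not has_cheap

doc = {"author": "test",
       "books": [{"name": "first book", "cost": 50},
                 {"name": "second book", "cost": 100}]}

print(json.dumps(all_books_cost_above(40), indent=2))
print(matches(doc, 40))  # both books cost more than 40
print(matches(doc, 60))  # the first book costs 50, so not all books qualify
```

Note that a book without a cost value would not be caught by the must_not clause, so this assumes every nested book has a cost.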

Related

Boolean AND with exact matches in Elasticsearch

In our Elasticsearch collection of products, we have an array of hashes called "nutrients". A partial example of the data would be:
"_source": {
"quantity": "150.0",
"id": 1001,
"barcode": "7610809001066",
"nutrients": [
{
"per_hundred": "1010.0",
"name_fr": "Énergie",
"per_portion": "758.0",
"name_de": "Energie",
"per_day": "9.0",
"name_it": "Energia",
"name_en": "Energy"
},
{
"per_hundred": "242.0",
"name_fr": "Énergie (kCal)",
"per_portion": "181.0",
"name_de": "Energie (kCal)",
"per_day": "9.0",
"name_it": "Energia (kCal)",
"name_en": "Energy (kCal)"
},
{
"per_hundred": "18.0",
"name_fr": "Matières grasses",
"per_portion": "13.5",
"name_de": "Fett",
"per_day": "19.0",
"name_it": "Grassi",
"name_en": "Fat"
},
In the search, we are trying to bring back the products based on an exact match of two of the fields contained in the nutrients array. What I am finding is that the conditions seem to be combined with OR and not AND.
The two attempts have been:
"query": {
"bool": {
"must": [
{ "match": { "nutrients.name_fr": "Énergie" } },
{ "match": { "nutrients.per_hundred": "242.0" } }
]
}
}
}
and
"query": {
"filtered": {
"filter": {
"and": [
{ "term": { "nutrients.name_fr": "Énergie" } },
{ "term": { "nutrients.per_hundred": "242.0" } }
]
}
}
}
Both of these do bring back entries with Énergie and 242.0, but they also match on a different name_fr, e.g.:
{
"per_hundred": "242.0",
"name_fr": "Acide folique",
"per_portion": "96.0",
"name_de": "Folsäure",
"per_day": "48.0",
"name_it": "Acido folico",
"name_en": "Folic acid"
},
They also match inexactly, i.e. they match "Énergie (kCal)" as well, when we want to match only "Énergie".
On your first problem:
You have to map the nutrients field as nested, so that each object inside it is queried on its own (see Elasticsearch Nested Objects).
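To see why the conditions appear to behave like OR across objects, it can help to model how a non-nested array of objects is indexed: each field becomes one multi-valued list, and the link between values of the same object is lost. The following is a simplified plain-Python model of that behaviour, not Elasticsearch code. The function names and the two-entry sample are made up for illustration, and the comparison is an exact value match (it does not model text analysis).

```python
def flatten(objects):
    """Mimic a non-nested array of objects: one multi-valued list per field."""
    flat = {}
    for obj in objects:
        for key, value in obj.items():
            flat.setdefault(key, []).append(value)
    return flat

def match_flattened(objects, conditions):
    """Each condition only needs to match some value of its field, in any object."""
    flat = flatten(objects)
    return all(value in flat.get(field, []) for field, value in conditions.items())

def match_nested(objects, conditions):
    """All conditions must hold within one and the same object."""
    return any(all(obj.get(field) == value for field, value in conditions.items())
               for obj in objects)

# Hypothetical product: "Énergie" and "242.0" occur, but in different objects.
nutrients = [
    {"name_fr": "Énergie", "per_hundred": "1010.0"},
    {"name_fr": "Acide folique", "per_hundred": "242.0"},
]
wanted = {"name_fr": "Énergie", "per_hundred": "242.0"}

print(flatten(nutrients))
print(match_flattened(nutrients, wanted))  # matches across objects
print(match_nested(nutrients, wanted))     # no single object satisfies both
```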

Elasticsearch custom function score

I am trying to do a search with custom functions to modify the document score.
I have a mapping with specialities stored inside a hospital, and every speciality has a priority:
Something like:
hospital:{
name: 'Fortis',
specialities: [
{
name: 'Cardiology',
priority: 10
},
{
name: 'Oncology',
priority: 15
}
]
}
Now I have a function score:
functions: [{
filter: {terms: {'specialities.name' => params[:search_text]}},
script_score: {script: "_score * doc['specialities.priority'].value"}
},
I have a filter query to match the search text against any speciality.
For example, if I search for Oncology, it will match, and I have specified a script_score to take the priority of that speciality and add it to the final score of the document.
But it takes the priority of the first speciality it encounters, which is 10, plus a score of 1 for the matched filter, so the end score is 11, not 16 (Oncology's priority of 15 + 1 for the filter match).
I solved it using nested mapping in elasticsearch.
Lucene has no concept of inner objects, so by default Elasticsearch flattens object fields. To keep a priority tied to each speciality, I need a mapping like this:
hospital: {
properties: {
specialities: {
type: 'nested',
properties: {
name: {
type: 'string'
},
priority: {
type: 'long'
}
}
}
}
}
Reference: https://www.elastic.co/guide/en/elasticsearch/reference/2.0/nested.html
After that, I was able to define a function score inside a nested query, and my query looks like this:
"query": {
"filtered": {
"query": {
"bool": {
"must": [
{
"nested": {
"path": "specialities",
"query": {
"function_score": {
"score_mode": "sum",
"boost_mode": "sum",
"filter": {
"terms": {
"specialities.name.raw": ["Oncology"]
}
},
"functions": [
{
"field_value_factor": {
"field": "specialities.priority"
}
}
]
}
}
}
}
]
}
}
}
}
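The difference between the two mappings can be sketched with a small calculation. This is a deliberately simplified model of the arithmetic described in this post (1 for the filter match plus a priority), not Elasticsearch's actual scoring code. The function names are hypothetical, and reading the first value of the flattened list stands in for the behaviour the question observed.

```python
hospital = {
    "name": "Fortis",
    "specialities": [
        {"name": "Cardiology", "priority": 10},
        {"name": "Oncology", "priority": 15},
    ],
}

def flattened_score(doc, search_text, filter_score=1):
    """Object mapping: names and priorities are separate lists with no pairing."""
    names = [s["name"] for s in doc["specialities"]]
    priorities = [s["priority"] for s in doc["specialities"]]
    if search_text not in names:
        return 0
    # The priority read here is not tied to the speciality that matched.
    return filter_score + priorities[0]

def nested_score(doc, search_text, filter_score=1):
    """Nested mapping: each speciality is scored on its own, then the scores are summed."""
    return sum(filter_score + s["priority"]
               for s in doc["specialities"]
               if s["name"] == search_text)

print(flattened_score(hospital, "Oncology"))  # uses Cardiology's priority
print(nested_score(hospital, "Oncology"))     # uses Oncology's priority
```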

Can an Elasticsearch nested query return only the matched nested documents for nested fields?

I'm new to Elasticsearch, and my question is whether an Elasticsearch nested query can return only the matched nested documents for nested fields.
For example, I have a type named blog with a nested field named comments:
{
"id": 1,
...
"comments":[
{"content":"Michael is a basketball player"},
{"content":"David is a soccer player"}
]
}
{
"id": 2,
...
"comments":[
{"content":"Wayne is a soccer player"},
{"content":"Steven is also a soccer player"},
]
}
and the nested query
{"query":{
"nested":{
"path":"comments",
"query":{"match":{"comments.content":"soccer"}}
}
}
}
What I need is to search for blog posts whose comments mention "soccer", together with the count of comments that matched "soccer" for each blog post (for the first post the count is 1, since its other comment only mentions "basketball").
{"hits":[
{
"id":1,
...
"count_for_comments_that_matches_query":1,
},
{
"id":2,
...
"count_for_comments_that_matches_query":2,
}
]}
However, it seems Elasticsearch always returns the full document. How could I achieve this, or is it not possible?
The answer is here:
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-inner-hits.html#nested-inner-hits
You need to use Elasticsearch's nested inner hits feature.
{
"_source": [
"id"
],
"query": {
"bool": {
"must": [
{
"match": {
"id": "1"
}
},
{
"nested": {
"path": "comments",
"query": {
"match": {
"comments.content": "soccer"
}
},
"inner_hits": {}
}
}
]
}
}
}
I think this will solve the problem.
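Once inner_hits is enabled, the per-post count asked for in the question can be read from the response on the client side. The sketch below assumes the usual response layout, where each hit carries an inner_hits entry named after the nested path ("comments" by default) with its own hits.total. The sample response is hand-written and trimmed for illustration, and count_matched_comments is a hypothetical helper. Since hits.total is a plain number in older versions and an object with a "value" key in newer ones, both forms are handled.

```python
def count_matched_comments(response, path="comments"):
    """Map each parent document _id to the number of nested docs that matched."""
    counts = {}
    for hit in response["hits"]["hits"]:
        total = hit["inner_hits"][path]["hits"]["total"]
        if isinstance(total, dict):
            # Newer versions report {"value": n, "relation": "eq"}.
            total = total["value"]
        counts[hit["_id"]] = total
    return counts

# Hand-written, trimmed sample response (not real server output).
sample_response = {
    "hits": {
        "hits": [
            {"_id": "1",
             "inner_hits": {"comments": {"hits": {
                 "total": 1,
                 "hits": [{"_source": {"content": "David is a soccer player"}}]}}}},
            {"_id": "2",
             "inner_hits": {"comments": {"hits": {
                 "total": {"value": 2, "relation": "eq"},
                 "hits": []}}}},
        ]
    }
}

print(count_matched_comments(sample_response))
```

Note that the query above also restricts the search to the blog post with id 1; dropping that match clause would return counts for every post with a matching comment.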

Elasticsearch Nested Filters being inclusive vs. exclusive

I have an object mapping that uses nested objects (props in our example) in a tag-like fashion.
Each tag can belong to a client/user, and we want to allow our users to run query_string style searches against props.name.
The issue is that when we run our query, an object with multiple props is returned if any one of its props matches the filter, even when the others don't. We want the opposite: if one prop fails the filter, don't return the object, rather than returning it as soon as one prop matches.
I have posted a comprehensive example here: https://gist.github.com/d2kagw/1c9d4ef486b7a2450d95
Thanks in advance.
I believe what you need here is a flattened list of values, like an array of values. The major difference between an array and nested objects is that the latter "knows" which value of one property corresponds to which value of another property in the same nested object. An array of values, on the other hand, flattens the values of each property, and you lose the "association" between a client_id and a name. That means that with arrays you have props.client_id = [null, 2] and props.name = ["petlover", "premiumshopper"].
With your nested filter you want to match that string against all values of props.name, meaning ALL nested props.name values of one parent doc need to match. This doesn't happen with nested objects, because the nested documents are separate and are queried separately. If at least one nested document matches, the parent is considered a match.
In other words, for a query like "query": "props.name:(carlover NOT petlover)" you basically need to run it against a flattened list of values, just like an array. You need that query run against ["carlover", "petlover"].
My suggestion is to set "include_in_parent": true on your nested documents (meaning, keep a flattened, array-like list of values in the parent) and change the queries a bit:
- For the query_string part, use the flattened properties so that your query is matched against the combined list of elements, not element by element.
- For the match (or term, see below) and missing parts, use the nested properties, because you can have nulls in there. A missing filter on an array matches only if the whole array is missing, not if one value in it is missing, so here you cannot use the same approach as for the query_string, where the values were flattened into an array.
- Optionally, for the client_id match I would use term instead of match, since the field is an integer rather than a string and is not_analyzed by default.
That being said, here are the changes:
{
"mappings" : {
...
"props": {
"type": "nested",
"include_in_parent": true,
...
should (and does) return zero results
GET /nesting-test/_search?pretty=true
{
"query": {
"filtered": {
"filter": {
"and": [
{
"query": {
"query_string": { "query": "props.name:((carlover AND premiumshopper) NOT petlover)" }
}
},
{
"nested": {
"path": "props",
"filter": {
"or": [ { "query": { "match": { "props.client_id": 1 } } }, { "missing": { "field": "props.client_id" } } ]
}
}
}
]
}
}
}
}
should (and does) return just 1
GET /nesting-test/_search?pretty=true
{
"query": {
"filtered": {
"filter": {
"and": [
{"query": {"query_string": { "query": "props.name:(carlover NOT petlover)" } } },
{
"nested": {
"path": "props",
"filter": {
"or": [{ "query": { "match": { "props.client_id": 1 } } },{ "missing": { "field": "props.client_id" } } ]
}
}
}
]
}
}
}
}
should (and does) return just 2
GET /nesting-test/_search?pretty=true
{
"query": {
"filtered": {
"filter": {
"and": [
{ "query": {"query_string": { "query": "props.name:(* NOT carlover)" } } },
{
"nested": {
"path": "props",
"filter": {
"or": [{ "query": { "term": { "props.client_id": 1 } } },{ "missing": { "field": "props.client_id" } }
]
}
}
}
]
}
}
}
}
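The reason the query_string part has to run against the flattened values can be illustrated with a small plain-Python model. This is not Elasticsearch code: the function names are hypothetical, the sample document is made up (the real data lives in the linked gist), and the model covers only the "carlover NOT petlover" case.

```python
def flattened_match(doc, required, forbidden):
    """include_in_parent view: evaluate against the combined list of names."""
    names = {prop["name"] for prop in doc["props"]}
    return required in names and forbidden not in names

def nested_match(doc, required, forbidden):
    """Nested view: each prop is evaluated alone, and one matching prop is enough."""
    return any(prop["name"] == required and prop["name"] != forbidden
               for prop in doc["props"])

# Hypothetical parent document holding both tags.
doc = {"props": [{"client_id": 1, "name": "carlover"},
                 {"client_id": None, "name": "petlover"}]}

print(flattened_match(doc, "carlover", "petlover"))  # excluded: petlover is in the list
print(nested_match(doc, "carlover", "petlover"))     # included: the carlover prop matches alone
```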

Elastic search multiple terms in a dictionary

I have a mapping like:
"profile": {
"properties": {
"educations": {
"properties": {
"university": {
"type": "string"
},
"graduation_year": {
"type": "string"
}
}
}
}
}
which obviously holds the education history of people. Each person can have multiple educations. What I want to do is search for people who graduated from "SFU" in "2012". To do that I am using a filtered search:
"filtered": {
"filter": {
"and": [
{
"term": {
"educations.university": "SFU"
}
},
{
"term": {
"educations.graduation_year": "2012"
}
}
]
}
}
But this query finds documents that have "SFU" and "2012" anywhere in their educations, so this document would match, which is wrong:
educations[0] = {"university": "SFU", "graduation_year": 2000}
educations[1] = {"university": "UBC", "graduation_year": 2012}
Is there any way I could apply both terms to each education entry?
You need to define educations as a nested type and use a nested filter on it. Otherwise Elasticsearch internally flattens the inner objects into a single object and returns the wrong results.
See these for detailed explanations and samples:
http://www.elasticsearch.org/blog/managing-relations-inside-elasticsearch/
http://www.spacevatican.org/2012/6/3/fun-with-elasticsearch-s-children-and-nested-documents/
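As a rough sketch of what the nested filter could look like, the helper below builds the request fragment as a Python dict. It assumes the same pre-2.0 syntax the question uses (filtered, and, and a nested filter with a "filter" key) and that educations has been remapped with "type": "nested". On later versions the equivalent would be a bool query with a nested query instead. The helper name graduated_from is hypothetical.

```python
import json

def graduated_from(university, year):
    """Build a filtered query whose two term filters must hold in the same education."""
    return {
        "filtered": {
            "filter": {
                "nested": {
                    "path": "educations",
                    # Both terms are evaluated inside one nested education object.
                    "filter": {
                        "and": [
                            {"term": {"educations.university": university}},
                            {"term": {"educations.graduation_year": year}},
                        ]
                    },
                }
            }
        }
    }

print(json.dumps(graduated_from("SFU", "2012"), indent=2))
```

One thing to watch for: term filters are not analyzed, while a "string" field is analyzed by default (the standard analyzer lowercases tokens), so "SFU" may need to be sent as "sfu" or the field mapped as not_analyzed.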

Resources