Elasticsearch function_score with nested_objects - elasticsearch

I have a problem with a query. I used this kind of query to boost documents before I had nested objects. Now that I use nested objects, I changed the query to use a nested filter, but nothing is boosted.
I get the documents I expected, but their _score does not change.
Am I doing something wrong?
GET index/type/_search
{
"query": {
"function_score": {
"filter": {
"bool": {
"must": [
{
"term": {
"parent.child": "test"
}
}
]
}
},
"functions": [
{
"boost_factor": "100",
"filter": {
"nested": {
"path": "parent",
"filter": {
"bool": {
"must": [
{
"term": {
"child": "test"
}
}
]
}
}
}
}
}
],
"score_mode": "sum"
}
},
"sort": "_score",
"from": 0,
"size": 320
}
EDIT:
Could it be caused by this, from the documentation?
nested filter
A nested filter behaves much like a nested query, except that it
doesn’t accept the score_mode parameter. It can only be used in
“filter context” — such as inside a filtered query —  and it behaves
like any other filter: it includes or excludes, but it doesn’t score.
While the results of the nested filter itself are not cached, the
usual caching rules apply to the filter inside the nested filter.
http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/nested-query.html

As the documentation passage quoted in the question says, a nested filter behaves like any other filter: it includes or excludes, but it doesn't score.

I get the feeling your JSON should look closer to this (maybe not exactly, though; I haven't tested it).
{
"query": {
"function_score": {
"query": {
"nested": {
"path": "parent",
"query": {
"bool": {
"must": [
{
"term": {
"parent.child": "test"
}
}
]
}
}
}
},
"functions": [
{
"boost_factor": "100"
}
],
"score_mode": "sum"
}
},
"sort": "_score",
"from": 0,
"size": 320
}
Specifically, you don't want to put a filter on the boost_factor function; you just want to boost this function_score by 100. The actual nested query goes in the query section of the function_score, and the functions section contains only the boost_factor. I think.
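A minimal sketch of the same idea in Python, in case the request body is built in client code. It only assembles the dict shown in the answer (nested query in the query section, a bare boost_factor in functions) and does not talk to a cluster. The helper name boosted_nested_query is made up for illustration, and like the answer above it is untested against a real index. Newer Elasticsearch releases are understood to use weight instead of boost_factor, so treat the key name as tied to the 1.x syntax used in this thread.

```python
import json


def boosted_nested_query(path, field, value, boost=100):
    """Build a function_score body whose main query is a nested term query.

    Hypothetical helper: it mirrors the JSON in the answer above and
    makes no call to Elasticsearch.
    """
    nested_query = {
        "nested": {
            "path": path,
            # Nested fields are addressed by their full path, e.g. parent.child
            "query": {"term": {f"{path}.{field}": value}},
        }
    }
    return {
        "query": {
            "function_score": {
                "query": nested_query,
                # No filter on the function: every hit of the query is boosted
                "functions": [{"boost_factor": boost}],
                "score_mode": "sum",
            }
        },
        "sort": "_score",
    }


body = boosted_nested_query("parent", "child", "test")
print(json.dumps(body, indent=2))
```

Because the nested clause is now a query rather than a filter, it contributes a score that the function can multiply.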

Related

Elastic query combining should (boolean OR) with retrieval of nested documents

I have an Elastic index with nested documents. I am trying to retrieve multiple documents by ids along with their nested documents. Retrieving the documents themselves is simple enough with a Should query, but where and how would I include the nested doc query in this?
Boolean "Should" query:
GET myIndex/_search
{
"query": {
"bool": {
"should": [
{
"term": {
"id": "123"
}
},
{
"term": {
"id": "456"
}
},
{
"term": {
"id": "789"
}
}
]
}
}
}
Query to retrieve nested docs:
"nested": {
"path": "myNestedDocs",
"query": {
"match_all": {}
}
}
It is not possible to add the nested query to each term query, because this gives a parsing exception: "[term] malformed query, expected [END_OBJECT] but found [FIELD_NAME]"
Also, it is not possible to add the nested doc query on the same level as the term queries, because then it would be treated as just another OR clause and simply retrieve all docs from the index.
Any ideas? Thanks!
As I understand it, you want to match any one of the ids in a list and retrieve the nested documents. If that is correct, you need to combine both queries in a must clause, like below:
{
"query": {
"bool": {
"must": [
{
"terms": {
"id": [
"123",
"456",
"789"
]
}
},
{
"nested": {
"path": "myNestedDocs",
"query": {
"match_all": {}
}
}
}
]
}
}
}
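If the list of ids varies per request, the body above could be generated rather than written by hand. The following is a sketch under that assumption; ids_with_nested is a hypothetical helper name, and the function only builds the dict from the answer without contacting Elasticsearch.

```python
def ids_with_nested(ids, nested_path, id_field="id"):
    """Combine a terms clause (OR over the ids) with a nested match_all.

    Hypothetical helper mirroring the answer's JSON. Both clauses sit
    under must, so a document needs one of the ids AND at least one
    nested document on the given path.
    """
    return {
        "query": {
            "bool": {
                "must": [
                    {"terms": {id_field: list(ids)}},
                    {
                        "nested": {
                            "path": nested_path,
                            "query": {"match_all": {}},
                        }
                    },
                ]
            }
        }
    }


body = ids_with_nested(["123", "456", "789"], "myNestedDocs")
```

The terms clause replaces the three separate term clauses under should, which is why the nested clause can sit next to it without turning into just another OR branch.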

How to return multiple inner hits in multiple nested sub-queries for the same path?

When I have multiple nested sub-queries for the same path, it seems the result will only include the inner hits result of the last nested sub-query. Is there a way to return all of the inner hits results for the multiple nested sub-queries?
e.g.
{
"query": {
"bool": {
"must": [{
"nested": {
"query": {...},
"path": "path_a",
"inner_hits": {}
}
},{
"nested": {
"query": {...},
"path": "path_a",
"inner_hits": {}
}
}]
}
}
}
If you add a unique name to your inner_hits, then the result will basically contain a map of your inner hits as you're expecting.
Note: It seems that sometimes the inner hits contains extra query names (from the other nested queries) in the matched_queries, so it may need some post-processing
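A sketch of what naming the inner hits could look like when the body is built in Python. The name option of inner_hits is what the answer above relies on; the helper name nested_with_named_hits, the hit names and the two term clauses are invented for illustration, since the original question elides its sub-queries.

```python
def nested_with_named_hits(path, named_queries):
    """Build one nested clause per entry, each with a uniquely named inner_hits.

    named_queries maps an inner_hits name to the query clause to run on
    the nested path. Dict keys are unique, so the names cannot clash.
    """
    clauses = [
        {
            "nested": {
                "path": path,
                "query": query,
                "inner_hits": {"name": name},
            }
        }
        for name, query in named_queries.items()
    ]
    return {"query": {"bool": {"must": clauses}}}


# Hypothetical sub-queries; the fields are placeholders, not from the question.
body = nested_with_named_hits(
    "path_a",
    {
        "first_hits": {"term": {"path_a.field_one": "x"}},
        "second_hits": {"term": {"path_a.field_two": "y"}},
    },
)
```

Each hit in the response should then carry its inner hits keyed by these names instead of a single entry keyed by the path.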
For the same path one needs to specify the nested path before its sub-queries.
Below is an example of searching, either via the match or the range, in the same nested path. You could modify the search according to your needs.
GET index/_search
{
"query": {
"nested": {
"path": "path.subpath",
"query": {
"bool": {
"must": [
{ "match": { "path.subpath.match1": "valueMatch" }},
{ "range" : { "path.subpath.range1" : {"gte": "rangeMatch" } }
}
]
}
}
}
}
}
I hope this helps!

How to add properties from a root object in a nested object for sorting?

A simplified example of the kind of document in our index:
{
"organisation" : {
"code" : "01310"
},
"publications" : [
{
"dateEnd" : 1393801200000,
"dateStart" : 1391986800000,
"code" : "PUB.02"
},
{
"dateEnd" : 1401055200000,
"dateStart" : 1397512800000,
"code" : "PUB.06"
}
]
}
Note that publications are mapped as nested objects because we need to filter based on a combination of the dateEnd, dateStart and publicationStatus properties.
The PUB.02 status code is special. It states: 'this publication period is valid if the current user is a member of the organisation'.
I have a problem when I want to sort on 'most recent':
{
"sort": {
"publications.dateStart" : {
"mode" : "min",
"order" : "desc",
"nested_filter" : {
"or" : [
{
"and" : [
{ "term" : { "organisation.code" : "01310" } },
{ "term" : { "publications.code" : "PUB.02" } }
]
},
{ "term" : { "publications.code" : "PUB.06" } }
]
}
}
}
}
No error is given, but the PUB.02 entry is ignored. I tried to use copy_to in my mapping to copy the value of organisation.code to the nested object, but that did not help.
Is there a way to reach for the parent document inside a nested sort?
Alternatively, is there a way to copy data from parent to the nested document?
I am currently using version 1.7 of Elasticsearch without the ability to use scripts. Upgrading to a newer version could be done if that would help the situation.
This gist shows that the sort is performed on the PUB.06 publications: https://gist.github.com/EECOLOR/2db9a1ec9d6d5c791ea6
Although the documentation does not explicitly mention it, it does look like we cannot access a parent field in a nested filter context.
Also, I wasn't able to use copy_to to add data from a root/parent field to the nested document. I would suggest asking on the Elasticsearch discuss forum; you may have more luck finding out the reasons for this there.
Before some trigger-happy bloke downvotes this answer, I would like to add that the results the OP wanted from the sort could be achieved with a function_score workaround.
One implementation to achieve this is as follows:
1) Start off with a should query
2) In the first should clause
a) use a filtered query to select documents with `organisation.code : 01310`
b) then score these documents based on the max value of the reciprocal of the nested document **dateStart**, for the codes **PUB.02** and **PUB.06**
3) In the second should clause
a) use a filtered query to select documents with `organisation.code` not equal to `01310`
b) like before, score these documents based on the max value of the reciprocal of the nested document **dateStart**, for the code **PUB.06** only
Example Query:
POST /testindex/testtype/_search
{
"query": {
"bool": {
"should": [
{
"filtered": {
"filter": {
"term": {
"organisation.code": "01310"
}
},
"query": {
"nested": {
"path": "publications",
"query": {
"filtered": {
"query": {
"function_score": {
"functions": [
{
"field_value_factor": {
"field": "publications.dateStart",
"modifier": "reciprocal"
}
}
],
"boost_mode": "replace",
"score_mode": "max"
}
},
"filter": {
"terms": {
"publications.code": [
"PUB.02",
"PUB.06"
]
}
}
}
},
"score_mode": "max"
}
}
}
},
{
"filtered": {
"filter": {
"not": {
"term": {
"organisation.code": "01310"
}
}
},
"query": {
"nested": {
"path": "publications",
"query": {
"filtered": {
"query": {
"function_score": {
"functions": [
{
"field_value_factor": {
"field": "publications.dateStart",
"modifier": "reciprocal"
}
}
],
"boost_mode": "replace",
"score_mode": "max"
}
},
"filter": {
"terms": {
"publications.code": [
"PUB.06"
]
}
}
}
},
"score_mode": "max"
}
}
}
}
]
}
}
}
I'm the first to admit it is not the most readable, and if there were a way to copy_to into a nested document it would be much better.
If not, simulating copy_to by having the client inject the data into the source before indexing would be simpler and more flexible.
But the above is an example of how it could be done using function scores.
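The client-side alternative mentioned above (simulating copy_to before indexing) might look roughly like this in Python. It is only a sketch: the function name inject_organisation_code and the target field name organisationCode are assumptions, not something the question or Elasticsearch prescribes, and the nested mapping would need a matching field.

```python
import copy


def inject_organisation_code(doc, target_field="organisationCode"):
    """Return a copy of doc with the root organisation code on every publication.

    Hypothetical pre-indexing step: the nested documents then carry the
    value themselves, so a nested filter no longer has to reach for the
    parent document.
    """
    enriched = copy.deepcopy(doc)  # leave the caller's document untouched
    code = enriched["organisation"]["code"]
    for publication in enriched.get("publications", []):
        publication[target_field] = code
    return enriched


document = {
    "organisation": {"code": "01310"},
    "publications": [
        {"dateEnd": 1393801200000, "dateStart": 1391986800000, "code": "PUB.02"},
        {"dateEnd": 1401055200000, "dateStart": 1397512800000, "code": "PUB.06"},
    ],
}
enriched = inject_organisation_code(document)
```

With documents indexed this way, the nested_filter in the sort could presumably test publications.organisationCode in place of organisation.code, keeping every condition inside the nested scope.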

Minimum should match on filtered query

Is it possible to have a query like this
"query": {
"filtered": {
"filter": {
"terms": {
"names": [
"Anna",
"Mark",
"Joe"
],
"execution" : "and"
}
}
}
}
With the "minimum_should_match": "2" statement?
I know that I can use a simple query (I've tried, it works), but I don't need the score to be computed. My goal is just to filter documents which contain 2 of the values.
Does scoring generally have a heavy impact on the time needed to retrieve documents?
Using this query:
"query": {
"filtered": {
"filter": {
"terms": {
"names": [
"Anna",
"Mark",
"Joe"
],
"execution" : "and",
"minimum_should_match": "2"
}
}
}
}
I got this error:
QueryParsingException[[my_db] [terms] filter does not support [minimum_should_match]]
Minimum should match is not a parameter for the terms filter. If that is the functionality you are looking for, I might rewrite your query like this, to use the bool query wrapped in a query filter:
{
"filter": {
"query": {
"bool": {
"should": [
{
"term": {
"names": "Anna"
}
},
{
"term": {
"names": "Mark"
}
},
{
"term": {
"names": "Joe"
}
}
],
"minimum_should_match": 2
}
}
}
}
This matches documents that contain at least two of the three terms, so documents with all three match as well. (In a bool query the must clauses act as an implicit AND, whereas should clauses combined with minimum_should_match require only that many clauses to match.) No score is computed, because the query is executed as a filter.
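For a longer or variable list of names, the should clauses could be generated. Below is a sketch in Python: at_least_filter (a made-up name) builds the same structure as the answer's JSON, and matches_locally is a plain-Python stand-in that shows which documents the "at least N of these values" rule is expected to keep. Neither function contacts Elasticsearch.

```python
def at_least_filter(field, values, minimum):
    """Build the query filter from the answer: bool/should + minimum_should_match."""
    should = [{"term": {field: value}} for value in values]
    return {
        "filter": {
            "query": {
                "bool": {
                    "should": should,
                    "minimum_should_match": minimum,
                }
            }
        }
    }


def matches_locally(doc_values, values, minimum):
    """Local illustration of the rule: at least `minimum` distinct values present."""
    return len(set(doc_values) & set(values)) >= minimum


names = ["Anna", "Mark", "Joe"]
body = at_least_filter("names", names, 2)

two_of_three = matches_locally(["Anna", "Joe", "Zed"], names, 2)
one_of_three = matches_locally(["Anna", "Zed"], names, 2)
```

Here two_of_three is True and one_of_three is False, which is the behaviour the terms filter with "execution": "and" cannot express on its own.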

Elasticsearch: inner_hits does not work for nested queries (nested twice)

I'm using ES v1.5.2 (so inner_hits is present and does work correctly for filters) and I have deeply nested documents in elasticsearch with nested paths :
members
members.members
This means that in the mapping (https://gist.github.com/frickm/834a4ff8f952cb86ec02) these two fields are declared as nested. I also have a working query (which essentially is a doubly nested filter on un-analyzed strings) which is shown below and which does work perfectly fine (my actual use-case has an even deeper nesting).
I have two inner_hits clauses in the query: the outer one works.
The problem is that the "inner" inner_hits does not work: for the first inner_hits clause we obtain the "real" inner hits for the members field, but for the second inner_hits clause I get the following result for the members.members field, which is just wrong (the hits cannot be empty, since then the entire document wouldn't be a hit):
"members.members": {
"hits": {
"total": 0,
"max_score": null,
"hits": []
}
}
Query:
POST nia/condition/_search
{
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"nested": {
"path": "members",
"filter": {
"bool": {
"must": [
{
"nested": {
"path": "members.members",
"filter": {
"bool": {
"must": [
{
"term": {
"members.members.findingRef.name": "ImpairedSenseOfTouch"
}
}
]
}
},
"inner_hits": {}
}
}
]
}
},
"inner_hits": {}
}
}
}
}
}
Remark: Replacing the bool filter with a direct term filter does not help either (and it shouldn't matter, since entire "sub-documents" are supposed to be given back).

Resources