ElasticSearch Failing to Sort Nested Object in order

ElasticSearch 6.5.2. Given the mapping and query below, the document order is not affected by changing 'desc' to 'asc' or vice versa. I'm not seeing any errors, just sort: [Infinity] in the results.
Mapping:
{
"mappings": {
"_doc": {
"properties": {
"tags": {
"type": "keyword"
},
"metrics": {
"type": "nested",
"dynamic": true
}
}
}
}
}
Query
{
"query": {
"match_all": {
}
},
"sort": [
{
"metrics.http.test.value": {
"order": "desc"
}
}
]
}
Document structure:
{
"tags": ["My Tag"],
"metrics": {
"http.test": {
"updated_at": "2018-12-08T23:22:07.056Z",
"value": 0.034
}
}
}

When sorting by a nested field, you need to specify the path of the nested object using the nested parameter.
One more thing missing from your query is the field on which to sort. Assuming you want to sort on updated_at, the query will be:
{
"query": {
"match_all": {}
},
"sort": [
{
"metrics.http.test.updated_at": {
"order": "desc",
"nested": {
"path": "metrics"
}
}
}
]
}
One more thing to keep in mind when sorting on a nested field is the filter clause inside the nested sort option. Read more about it in the Elasticsearch documentation on sorting within nested objects.
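As a minimal sketch, the same sort clause can be built for the field from the question (value) in Python. The helper name nested_sort_clause is hypothetical; the body it produces uses only the order and nested.path options shown in the query above.

```python
import json


def nested_sort_clause(field, path, order="desc"):
    """Build one sort entry for a field that lives inside a nested object."""
    return {field: {"order": order, "nested": {"path": path}}}


# Request body sorting on the numeric field from the question.
body = {
    "query": {"match_all": {}},
    "sort": [nested_sort_clause("metrics.http.test.value", "metrics")],
}

print(json.dumps(body, indent=2))
```

Switching the third argument to "asc" should then reverse the order of the results.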

Apparently changing the mapping to this:
"metrics": {
"dynamic": true,
"properties": {}
}
fixed it and allowed sorting to happen in the correct order. (Without "type": "nested", metrics is mapped as a plain object, so the sort needs no nested path.)

Related

How can I retrieve a document with a defined amount of sorted nested fields?

Here is my problem: let's assume I have a Facebook post indexed on ElasticSearch. This post has many comments as nested fields, which, themselves, have a "likes" count. So, the mapping would be something like this:
"mappings": {
"post": {
"properties": {
"id": {
"type": "integer"
},
"comments": {
"type": "nested",
"properties": {
"like_count": {
"type": "integer"
}
}
}
}
}
}
A post could have thousands of comments, but what if I want to retrieve only the 10 most liked comments from a certain post (so I'd have to specify the post's id, limit the size of the field array and define a sort rule)? Is it possible? I've tried many ways using the "nested" query, but with no success.
Any ideas?
Edit: one of the queries I tried, in case it is still unclear what I want:
{
"query": {
"match": {
"id": 81500
}
},
"sort": [
{
"comments.like_count":
{
"order": "desc"
}
}
],
"nested": {
"path": "comments",
"inner_hits": {
"size": 10
}
}
}
Try this query:
{
"query": {
"match": {
"id": 81500
}
},
"sort": [
{
"comments.like_count":
{
"order": "desc"
}
}
],
"size": 10
}
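Note that a top-level size limits the number of posts returned, not the number of comments inside a post. An alternative sketch, assuming an Elasticsearch version that supports inner_hits inside a nested query, asks for the comments as sorted inner hits. The helper name top_comments_query is hypothetical.

```python
import json


def top_comments_query(post_id, limit=10):
    """Match one post and request its `limit` most liked comments as inner hits."""
    return {
        "query": {
            "bool": {
                "must": [
                    {"term": {"id": post_id}},
                    {
                        "nested": {
                            "path": "comments",
                            # Every comment qualifies; inner_hits does the ranking.
                            "query": {"match_all": {}},
                            "inner_hits": {
                                "size": limit,
                                "sort": [{"comments.like_count": {"order": "desc"}}],
                            },
                        }
                    },
                ]
            }
        }
    }


print(json.dumps(top_comments_query(81500), indent=2))
```

The sorted comments would then appear under each hit's inner_hits section rather than in _source.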

Elasticsearch nested significant terms aggregation with background filter

I am having a hard time applying a background filter to a nested significant terms aggregation: the bg_count is always 0.
I'm indexing article views that have ids and timestamps, and I have multiple applications on a single index. I want the foreground and background sets to relate to the same application, so I'm trying to apply a term filter on the app_id field both in the bool query and in the background filter. article_views is a nested object since I also want to be able to query views with a range filter on timestamp, but I haven't got to that yet.
Mapping:
{
"article_views": {
"type": "nested",
"properties": {
"id": {
"type": "string",
"index": "not_analyzed"
},
"timestamp": {
"type": "date",
"format": "strict_date_optional_time||epoch_millis"
}
}
},
"app_id": {
"type": "string",
"index": "not_analyzed"
}
}
Query:
{
"aggregations": {
"articles": {
"nested": {
"path": "article_views"
},
"aggs": {
"articles": {
"significant_terms": {
"field": "article_views.id",
"size": 5,
"background_filter": {
"term": {
"app_id": "17"
}
}
}
}
}
}
},
"query": {
"bool": {
"must": [
{
"term": {
"app_id": "17"
}
},
{
"nested": {
"path": "article_views",
"query": {
"terms": {
"article_views.id": [
"1",
"2"
]
}
}
}
}
]
}
}
}
As I said, in my result, the bg_count is always 0, which had me worried. If the significant terms is on other fields which are not nested the background_filter works fine.
Elasticsearch version is 2.2.
Thanks
You seem to be hitting a known issue: in your background filter you'd need to "go back" to the parent context in order to filter on a field of the parent document.
You'd need a reverse_nested query at that point, but that doesn't exist.
One way to circumvent this is to add the app_id field to your nested documents so that you can simply use it in the background filter context.
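A rough sketch of that workaround in Python follows. It assumes the mapping gains an article_views.app_id field (not in the original mapping), and the helper names copy_app_id_into_views and significant_articles_agg are hypothetical; the sample document is made up for illustration.

```python
import copy


def copy_app_id_into_views(doc):
    """Return a copy of doc with the parent's app_id repeated on every nested view."""
    out = copy.deepcopy(doc)
    for view in out.get("article_views", []):
        view["app_id"] = out["app_id"]
    return out


def significant_articles_agg(app_id, size=5):
    """Aggregation whose background filter targets the nested copy of app_id."""
    return {
        "articles": {
            "nested": {"path": "article_views"},
            "aggs": {
                "articles": {
                    "significant_terms": {
                        "field": "article_views.id",
                        "size": size,
                        # Assumed field: app_id duplicated inside each nested document.
                        "background_filter": {"term": {"article_views.app_id": app_id}},
                    }
                }
            },
        }
    }


# Made-up sample document, prepared before indexing.
doc = {"app_id": "17", "article_views": [{"id": "1", "timestamp": 1}, {"id": "2", "timestamp": 2}]}
indexed = copy_app_id_into_views(doc)
aggs = significant_articles_agg("17")
```

The cost is some duplication: every nested view carries its own copy of app_id, which has to be kept in sync with the parent.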

Elasticsearch sorting by matching array item

I have the following structure in indexed documents:
document1: "customLists":[{"id":8,"position":8},{"id":26,"position":2}]
document2: "customLists":[{"id":26,"position":1}]
document3: "customLists":[{"id":8,"position":1},{"id":26,"position":3}]
I am able to find documents that belong to a given list with a match query on "customLists.id = 26". But I need to sort the documents based on the position value within that list and ignore the positions in the other lists.
So the expected results would be in the order document2, document1, document3.
Is the data structure suitable for this kind of sorting, and how should I handle it?
One way to achieve this would be to map customLists as nested and then sort by the nested field with a filter.
Example:
1) Create Index & Mapping
put test
put test/test/_mapping
{
"properties": {
"customLists": {
"type": "nested",
"properties": {
"id": {
"type": "integer"
},
"position": {
"type": "integer"
}
}
}
}
}
2) Index Documents :
put test/test/1
{
"customLists":[{"id":8,"position":8},{"id":26,"position":2}]
}
put test/test/2
{
"customLists":[{"id":26,"position":1}]
}
put test/test/3
{
"customLists":[{"id":8,"position":1},{"id":26,"position":3}]
}
3) Query to sort by position for a given id
post test/_search
{
"filter": {
"nested": {
"path": "customLists",
"query": {
"term": {
"customLists.id": {
"value": "26"
}
}
}
}
},
"sort": [
{
"customLists.position": {
"order": "asc",
"mode": "min",
"nested_filter": {
"term": {
"customLists.id": {
"value": "26"
}
}
}
}
}
]
}
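To see why this gives the expected order, here is a small Python sketch that mimics the sort on the client side: for each document it keeps only the entries with id 26 (the nested filter) and takes the smallest position (mode min). It does not talk to Elasticsearch; sort_key is a hypothetical helper used only for illustration.

```python
def sort_key(doc, list_id):
    """Smallest position among the customLists entries that pass the filter."""
    positions = [e["position"] for e in doc["customLists"] if e["id"] == list_id]
    return min(positions)


docs = {
    "document1": {"customLists": [{"id": 8, "position": 8}, {"id": 26, "position": 2}]},
    "document2": {"customLists": [{"id": 26, "position": 1}]},
    "document3": {"customLists": [{"id": 8, "position": 1}, {"id": 26, "position": 3}]},
}

ordered = sorted(docs, key=lambda name: sort_key(docs[name], 26))
print(ordered)  # ['document2', 'document1', 'document3']
```

Without the filter, document3 would sort on position 1 from list 8, which is exactly what the question wants to avoid.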

Elasticsearch getting the last nested or most recent nested element

We have this mapping:
{
"product_achievement": {
"type": "nested",
"properties": {
"id": {
"type": "long"
},
"last_purchase": {
"type": "long"
},
"products": {
"type": "long"
}
}
}
}
As you can see, this is nested, and the last_purchase field is a Unix timestamp value. Of all the nested elements, we would like to find the most recent entry as defined by the last_purchase field AND check whether a given product id is in its products.
You can achieve this using a nested query with inner_hits. In the query part, you specify the product id you want to match; then, in inner_hits, you sort by decreasing last_purchase timestamp and take only the first hit using size: 1.
{
"query": {
"nested": {
"path": "product_achievement",
"query": {
"term": {
"product_achievement.products": 1
}
},
"inner_hits": {
"size": 1,
"sort": {
"product_achievement.last_purchase": "desc"
}
}
}
}
}
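One point worth checking, as far as I can tell: the term query selects nested entries first, so the inner hit is the most recent entry among those that contain the product, not necessarily the most recent entry overall. If the requirement is that the overall latest entry contains the product, a client-side check on the returned document might look like this Python sketch. The helper name latest_entry_has_product and the sample data are made up for illustration.

```python
def latest_entry_has_product(achievements, product_id):
    """True if the entry with the highest last_purchase lists product_id."""
    if not achievements:
        return False
    latest = max(achievements, key=lambda e: e["last_purchase"])
    return product_id in latest["products"]


# Made-up nested entries of one document.
sample = [
    {"id": 1, "last_purchase": 1500000000, "products": [1, 2]},
    {"id": 2, "last_purchase": 1600000000, "products": [3]},
]

print(latest_entry_has_product(sample, 1))  # False
print(latest_entry_has_product(sample, 3))  # True
```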

Searching objects having all nested children matching a given query in Elasticsearch

Given an object with the following mapping:
{
"a": {
"properties": {
"id": {"type": "string"},
"b": {
"type": "nested",
"properties": {
"key": {"type": "string"}
}
}
}
}
}
I want to retrieve all the instances of this object having all nested children matching a given query.
For example, suppose I want to retrieve all the instances having all children with "key" = "yes".
Given the following instances:
{
"id": "1",
"b": [
{
"key": "yes"
},
{
"key": "yes"
}
]
},
{
"id": "2",
"b": [
{
"key": "yes"
},
{
"key": "yes"
},
{
"key": "no"
}
]
},
I want to retrieve only the first one (the one with "id" = "1").
Both using filters or queries is fine to me.
I already tried to use the "not filter" and the "must_not bool filter". The idea was to use a double negation to extract only objects that don't have nested children whose "key" differs from the given value.
However, I was not able to write down this query correctly.
I realize that this is not a common query for a search engine, but, in my case, it can be useful.
Is it possible to write this query ("forall nested query") using nested objects?
In case it is not, would it be possible to write this query using parent-child?
Update
Andrei Stefan gave a good answer in case we know all the values of "key" that we want to avoid, ("no", in the example).
I am interested also in the case you don't know the values you want to avoid, and you just want to match nested object with "key"="yes".
You need a flattened data structure for this: an array of values. The simplest way, without changing the current mapping too much, is to use the include_in_parent property and, for this particular requirement, query the field that is included in the parent:
{
"mappings": {
"a": {
"properties": {
"id": {
"type": "string"
},
"b": {
"type": "nested",
"include_in_parent": true,
"properties": {
"key": {
"type": "string"
}
}
}
}
}
}
}
And then your query would look like this:
{
"query": {
"filtered": {
"filter": {
"and": [
{
"query": {
"query_string": { "query": "b.key:(yes NOT no)"}
}
}
]
}
}
}
}
The alternative is to change the type of the field from nested to object, but that way you'll lose the advantages of using nested fields:
{
"mappings": {
"a": {
"properties": {
"id": {
"type": "string"
},
"b": {
"type": "object",
"properties": {
"key": {
"type": "string"
}
}
}
}
}
}
}
The query remains the same.
I encountered the same problem, though my data didn't have just yes/no variants.
As per Clinton Gormley's answer in https://github.com/elastic/elasticsearch/issues/19166:
"You can't do it any efficient way. You have to count all children and compare that to how many children match. The following will return all parents where all children match but it is a horrible inefficient solution and I would never recommend using it in practice":
{
"query": {
"bool": {
"must": [
{
"nested": {
"path": "b",
"score_mode": "sum",
"query": {
"function_score": {
"query": {
"match_all": {}
},
"functions": [
{
"weight": -1
},
{
"filter": {
"match": {
"b.key": "yes"
}
},
"weight": 1
}
],
"score_mode": "sum",
"boost_mode": "replace"
}
}
}
}
]
}
}
}
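The arithmetic behind this query, as I understand it, can be mirrored in a few lines of Python: every child starts at -1, gets +1 if it matches, and the parent's score is the sum over its children. This is only a simulation of the scoring for the two example instances from the question, not something run against Elasticsearch; child_score and parent_score are hypothetical helpers.

```python
def child_score(child, wanted="yes"):
    """-1 for every child, plus 1 when the child matches: 0 if it matches, else -1."""
    score = -1
    if child["key"] == wanted:
        score += 1
    return score


def parent_score(children):
    """Sum of the child scores, like score_mode "sum" on the nested query."""
    return sum(child_score(c) for c in children)


parents = {
    "1": [{"key": "yes"}, {"key": "yes"}],
    "2": [{"key": "yes"}, {"key": "yes"}, {"key": "no"}],
}

scores = {pid: parent_score(children) for pid, children in parents.items()}
print(scores)  # {'1': 0, '2': -1}
```

So a parent scores 0 only when all of its children match, and anything below 0 means at least one child did not. The query as shown only computes that score; presumably a cutoff at 0 is still needed to drop the other parents.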
