Elasticsearch sorting by matching array item

I have the following structure in my indexed documents:
document1: "customLists":[{"id":8,"position":8},{"id":26,"position":2}]
document2: "customLists":[{"id":26,"position":1}]
document3: "customLists":[{"id":8,"position":1},{"id":26,"position":3}]
I can find the documents that belong to a given list with a match query such as "customLists.id = 26". However, I need to sort those documents by the position value within that list, ignoring their positions in other lists.
So the expected order of results would be document2, document1, document3.
Is this data structure suitable for this kind of sorting, and how should I handle it?

One way to achieve this is to map customLists as a nested type and then sort by the nested field.
Example:
1) Create the index and mapping
put test
put test/test/_mapping
{
"properties": {
"customLists": {
"type": "nested",
"properties": {
"id": {
"type": "integer"
},
"position": {
"type": "integer"
}
}
}
}
}
2) Index the documents:
put test/test/1
{
"customLists":[{"id":8,"position":8},{"id":26,"position":2}]
}
put test/test/2
{
"customLists":[{"id":26,"position":1}]
}
put test/test/3
{
"customLists":[{"id":8,"position":1},{"id":26,"position":3}]
}
3) Query to sort by position for the given id
post test/_search
{
"filter": {
"nested": {
"path": "customLists",
"query": {
"term": {
"customLists.id": {
"value": "26"
}
}
}
}
},
"sort": [
{
"customLists.position": {
"order": "asc",
"mode": "min",
"nested_filter": {
"term": {
"customLists.id": {
"value": "26"
}
}
}
}
}
]
}
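To make the intent of the sort clause above easier to check, here is a minimal pure-Python model of it. It is only a sketch of the semantics (filter the nested entries by list id, then take the minimum position as the sort key) and does not talk to Elasticsearch; the function name sort_by_list_position is hypothetical.

```python
# Pure-Python model of "sort by nested field with a nested filter".
# The documents mirror the three examples from the question.
docs = {
    "document1": [{"id": 8, "position": 8}, {"id": 26, "position": 2}],
    "document2": [{"id": 26, "position": 1}],
    "document3": [{"id": 8, "position": 1}, {"id": 26, "position": 3}],
}


def sort_by_list_position(docs, list_id):
    """Return document names ordered by their position in the given list."""
    keyed = []
    for name, custom_lists in docs.items():
        # Counterpart of the nested filter: keep only entries of the wanted list.
        positions = [e["position"] for e in custom_lists if e["id"] == list_id]
        if positions:  # documents without the list are filtered out entirely
            # Counterpart of "mode": "min": smallest remaining value is the key.
            keyed.append((min(positions), name))
    return [name for _, name in sorted(keyed)]


order_26 = sort_by_list_position(docs, 26)
order_8 = sort_by_list_position(docs, 8)
print(order_26)
print(order_8)
```

Without the filter inside the sort, document3 would be keyed on position 1 from list 8, which is exactly the mix-up the question wants to avoid.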

Related

ElasticSearch Failing to Sort Nested Object in order

ElasticSearch 6.5.2. Given the mapping and query below, the document order is not affected by changing 'desc' to 'asc' or vice versa. I'm not seeing any errors, just sort: [Infinity] in the results.
Mapping:
{
"mappings": {
"_doc": {
"properties": {
"tags": {
"type": "keyword"
},
"metrics": {
"type": "nested",
"dynamic": true
}
}
}
}
}
Query:
{
"query": {
"match_all": {
}
},
"sort": [
{
"metrics.http.test.value": {
"order": "desc"
}
}
]
}
Document structure:
{
"tags": ["My Tag"],
"metrics": {
"http.test": {
"updated_at": "2018-12-08T23:22:07.056Z",
"value": 0.034
}
}
}
When sorting by a nested field, you need to specify the path of the nested field using the nested parameter.
One more thing the query was missing is the field on which to sort. Assuming you want to sort on updated_at, the query would be:
{
"query": {
"match_all": {}
},
"sort": [
{
"metrics.http.test.updated_at": {
"order": "desc",
"nested": {
"path": "metrics"
}
}
}
]
}
One more thing to keep in mind when sorting on a nested field is the filter clause in the sort; the Elasticsearch documentation on sorting within nested objects covers it.
Apparently changing the mapping to this:
"metrics": {
"dynamic": true,
"properties": {}
}
fixed it and allowed sorting to happen in the correct order.

Multiple (AND) queries for a nested index structure in Elasticsearch

I have an index with the mapping below:
{
"mappings": {
"xxxxx": {
"properties": {
"ID": {
"type": "text"
},
"pairs": {
"type": "nested"
},
"xxxxx": {
"type": "text"
}
}
}
}
}
The pairs field is essentially an array of objects, each with a unique ID.
What I'm trying to do is retrieve only one object from the pairs field so that I can update it. To that end, I've tried this:
GET /sample/_search/?size=1000
{
"query": {
"bool": {
"must": [
{
"match": {
"ID": "2rXdCf5OM9g1ebPNFdZNqW"
}
},
{
"match": {
"pairs.id": "c1vNGnnQLuk"
}
}
]
}
},
"_source": "pairs"
}
But this returns an empty result even though both IDs are valid. If I remove the pairs.id rule, I get the entire array of objects.
What do I need to add or change so that I can query by both IDs (the original and the nested one)?
Since pairs is of nested type, you need to use a nested query. You will probably also want to use nested inner hits:
GET /sample/_search/?size=1000
{
"query": {
"bool": {
"must": [
{
"match": {
"ID": "2rXdCf5OM9g1ebPNFdZNqW"
}
},
{
"nested": {
"path": "pairs",
"query": {
"match": {
"pairs.id": "c1vNGnnQLuk"
}
},
"inner_hits": {}
}
}
]
}
},
"_source": false
}
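With "_source": false and inner_hits, the matching nested object is returned under each hit's inner_hits section, keyed by the nested path unless a name is given. The following sketch shows one way to pull it out of a response that has already been parsed into a Python dict. The sample response is a trimmed, hypothetical one (the "doc-1" id and the "value" field are made up for illustration), and matching_pairs is a hypothetical helper.

```python
def matching_pairs(response, path="pairs"):
    """Collect (parent id, nested object) tuples from a search response."""
    results = []
    for hit in response["hits"]["hits"]:
        # Inner hits are grouped under the nested path name by default.
        inner = hit.get("inner_hits", {}).get(path, {})
        for nested_hit in inner.get("hits", {}).get("hits", []):
            results.append((hit["_id"], nested_hit["_source"]))
    return results


# Hypothetical, heavily trimmed response for illustration only.
sample_response = {
    "hits": {
        "hits": [
            {
                "_id": "doc-1",
                "inner_hits": {
                    "pairs": {
                        "hits": {
                            "hits": [
                                {"_source": {"id": "c1vNGnnQLuk", "value": "example"}}
                            ]
                        }
                    }
                },
            }
        ]
    }
}

pairs_found = matching_pairs(sample_response)
print(pairs_found)
```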

Elastic Search: filter query results by entry its field into another query results

I found this question about an equivalent of the IN operator:
ElasticSearch : IN equivalent operator in ElasticSearch
But I would like to find the equivalent of a more complicated request:
SELECT * FROM table WHERE id IN (SELECT id FROM anotherTable WHERE something > 0);
Mapping:
First index:
{
"mappings": {
"products": {
"properties": {
"id": { "type": "integer" },
"name": { "type": "text" }
}
}
}
}
Second index:
{
"mappings": {
"reserved": {
"properties": {
"id": { "type": "integer" },
"type": { "type": "text" }
}
}
}
}
I want to get the products whose ids are contained in the reserved index with a specific reserve type.
First step - get all relevant ids from the reserved index:
{
"size": 0,
"query": {
"bool": {
"must": [
{
"term": {
"type": "TYPE_HERE"
}
}
]
}
},
"aggregations": {
"ids": {
"terms": {
"field": "id"
}
}
}
}
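--> see: Terms Aggregations, Bool Query and Term Query.
--> "size": 0 suppresses the hits themselves; the relevant ids come back as the keys of the terms aggregation buckets. Note that a terms aggregation returns only the top 10 buckets by default, so set its size parameter if you expect more ids.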
Second step - get all relevant documents from products index:
{
"query": {
"bool": {
"must": [
{
"terms": {
"id": [
"ID_1",
"ID_2",
"AND_SO_ON..."
]
}
}
]
}
}
}
--> take all the ids from the first step and put them in the list under terms: id [...]
--> see Terms Query.
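The glue between the two steps can be sketched in a few lines of Python: read the bucket keys from the first response and build the body of the second request. This is only an illustration working on plain dicts; the helper names and the sample aggregation response (ids 3 and 7) are hypothetical, and sending the requests is left to whatever client you use.

```python
def ids_from_aggregation(response, agg_name="ids"):
    """Extract the bucket keys of a terms aggregation from a parsed response."""
    return [bucket["key"] for bucket in response["aggregations"][agg_name]["buckets"]]


def products_query(ids):
    """Build the second-step request body with a terms query on id."""
    return {"query": {"bool": {"must": [{"terms": {"id": ids}}]}}}


# Hypothetical first-step response, trimmed to the part that matters here.
reserved_response = {
    "aggregations": {
        "ids": {
            "buckets": [
                {"key": 3, "doc_count": 2},
                {"key": 7, "doc_count": 1},
            ]
        }
    }
}

ids = ids_from_aggregation(reserved_response)
second_step_body = products_query(ids)
print(second_step_body)
```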

Elasticsearch getting the last nested or most recent nested element

We have this mapping:
{
"product_achievement": {
"type": "nested",
"properties": {
"id": {
"type": "long"
},
"last_purchase": {
"type": "long"
},
"products": {
"type": "long"
}
}
}
}
As you can see, this is nested, and the last_purchase field is a Unix timestamp. From all nested elements, we would like to find the most recent entry as defined by the last_purchase field AND check whether a given product id is in that entry's products.
You can achieve this using a nested query with inner_hits. In the query part, you specify the product id you want to match; then, using inner_hits, you sort by decreasing last_purchase timestamp and take only the first one using size: 1.
{
"query": {
"nested": {
"path": "product_achievement",
"query": {
"term": {
"product_achievement.products": 1
}
},
"inner_hits": {
"size": 1,
"sort": {
"product_achievement.last_purchase": "desc"
}
}
}
}
}
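It may help to spell out what this query returns. Inner hits are drawn from the nested entries that matched the nested query, so the single inner hit is the most recent entry among those containing the product, which is not necessarily the most recent entry overall. The pure-Python sketch below models both readings on hypothetical sample data; the two helper functions are illustrative names, and the second one shows a client-side check for the stricter reading of the question.

```python
def latest_matching_entry(entries, product_id):
    """Model of the query above: newest entry among those containing the product."""
    matching = [e for e in entries if product_id in e["products"]]
    return max(matching, key=lambda e: e["last_purchase"]) if matching else None


def latest_entry_contains(entries, product_id):
    """Stricter reading: does the newest entry overall contain the product?"""
    if not entries:
        return False
    latest = max(entries, key=lambda e: e["last_purchase"])
    return product_id in latest["products"]


# Hypothetical nested entries of one document.
entries = [
    {"id": 1, "last_purchase": 100, "products": [1, 2]},
    {"id": 2, "last_purchase": 200, "products": [3]},
]

newest_with_product_1 = latest_matching_entry(entries, 1)
print(newest_with_product_1["id"])
print(latest_entry_contains(entries, 1))
print(latest_entry_contains(entries, 3))
```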

Searching objects having all nested children matching a given query in Elasticsearch

Given an object with the following mapping:
{
"a": {
"properties": {
"id": {"type": "string"},
"b": {
"type": "nested",
"properties": {
"key": {"type": "string"}
}
}
}
}
}
I want to retrieve all the instances of this object having all nested children matching a given query.
For example, suppose I want to retrieve all the instances having all children with "key" = "yes".
Given the following instances:
{
"id": "1",
"b": [
{
"key": "yes"
},
{
"key": "yes"
}
]
},
{
"id": "2",
"b": [
{
"key": "yes"
},
{
"key": "yes"
},
{
"key": "no"
}
]
},
I want to retrieve only the first one (the one with "id" = "1").
Using either filters or queries is fine with me.
I already tried the "not filter" and the "must_not bool filter". The idea was to use a double negation to extract only objects that don't have nested fields that differ from the given value.
However, I was not able to write down this query correctly.
I realize that this is not a common query for a search engine, but, in my case, it can be useful.
Is it possible to write this query ("forall nested query") using nested objects?
In case it is not, would it be possible to write this query using parent-child?
Update
Andrei Stefan gave a good answer for the case where we know all the values of "key" that we want to avoid ("no" in the example).
I am also interested in the case where you don't know the values you want to avoid and just want to match nested objects with "key"="yes".
You need a flattened data structure for this: an array of values. The simplest way, without changing the current mapping too much, is to use the include_in_parent property and to query the field that is included in the parent for this particular requirement:
{
"mappings": {
"a": {
"properties": {
"id": {
"type": "string"
},
"b": {
"type": "nested",
"include_in_parent": true,
"properties": {
"key": {
"type": "string"
}
}
}
}
}
}
}
And then your query would look like this:
{
"query": {
"filtered": {
"filter": {
"and": [
{
"query": {
"query_string": { "query": "b.key:(yes NOT no)"}
}
}
]
}
}
}
}
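As a rough model of why this works: with include_in_parent, the parent document also sees b.key as a flat array of all its children's values, and the query string asks for "yes" to be present and "no" to be absent in that array. The Python sketch below mirrors that check on the two instances from the question. It treats values as exact strings, which is a simplification of how an analyzed string field behaves, and the function names are hypothetical.

```python
def flatten_keys(doc):
    """Model of include_in_parent: all children's key values as one flat list."""
    return [child["key"] for child in doc["b"]]


def matches_yes_not_no(doc):
    """Model of the query string b.key:(yes NOT no) on the flattened values."""
    keys = flatten_keys(doc)
    return "yes" in keys and "no" not in keys


docs = [
    {"id": "1", "b": [{"key": "yes"}, {"key": "yes"}]},
    {"id": "2", "b": [{"key": "yes"}, {"key": "yes"}, {"key": "no"}]},
]

matching_ids = [d["id"] for d in docs if matches_yes_not_no(d)]
print(matching_ids)
```

As the question's update points out, this only helps when the values to exclude are known in advance.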
The alternative is to change the type of the field from nested to object, but this way you'll lose the advantages of using nested fields:
{
"mappings": {
"a": {
"properties": {
"id": {
"type": "string"
},
"b": {
"type": "object",
"properties": {
"key": {
"type": "string"
}
}
}
}
}
}
}
The query remains the same.
I encountered the same problem, though I didn't have just yes/no variants.
As per Clinton Gormley's answer in https://github.com/elastic/elasticsearch/issues/19166:
"You can't do it any efficient way. You have to count all children and compare that to how many children match. The following will return all parents where all children match but it is a horrible inefficient solution and I would never recommend using it in practice":
{
"query": {
"bool": {
"must": [
{
"nested": {
"path": "b",
"score_mode": "sum",
"query": {
"function_score": {
"query": {
"match_all": {}
},
"functions": [
{
"weight": -1
},
{
"filter": {
"match": {
"b.key": "yes"
}
},
"weight": 1
}
],
"score_mode": "sum",
"boost_mode": "replace"
}
}
}
}
]
}
}
}
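The arithmetic behind this query can be checked with a small pure-Python model. Every child gets -1 from the unfiltered weight function and +1 more if it matches, so a matching child contributes 0 and a non-matching child contributes -1; the nested query then sums the children. A parent whose children all match therefore scores exactly 0, and any other parent scores below 0. This is only a sketch of the scoring, with exact string comparison standing in for the match query and a hypothetical function name. As far as I know, Elasticsearch 7 and later reject negative scores from function_score, so the quoted query would need adjusting there.

```python
def parent_score(children, wanted="yes"):
    """Model of the nested function_score sum: matched count minus total count."""
    total = 0
    for child in children:
        score = -1  # unfiltered "weight": -1 applies to every child
        if child["key"] == wanted:
            score += 1  # filtered "weight": 1 applies to matching children only
        total += score  # nested "score_mode": "sum" adds up the child scores
    return total


doc1_children = [{"key": "yes"}, {"key": "yes"}]
doc2_children = [{"key": "yes"}, {"key": "yes"}, {"key": "no"}]

score_doc1 = parent_score(doc1_children)
score_doc2 = parent_score(doc2_children)
print(score_doc1, score_doc2)
```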

Resources