Elasticsearch Nested 2 Step Sorting - sorting

Given the following data with nested objects (members within teams), I need to do a 2 step sort:
Return the youngest member of each team.
Sort the teams by the name of that youngest member.
I have a query below that is close: it does get the youngest member of each team, but then it sorts the teams using the names of all the members, not just the one selected per team.
What would the query be to do this?
And would such a query be performant assuming there was a lot of data? (Probably a few million objects each having 1-3 nested objects.)
Note: Although it's not clear in this simple example, I cannot simply store the youngest member, since in my real world case, the sorting of the nested objects is determined by a formula that includes an external parameter. This is just a very simplified example of the many sorts like this I would have to do on a larger data set, where I need to get the single best matching nested document for each outer document sorted in one way, but then sort the outer objects based on some other property of that selected nested object.
Data
PUT nested_test
{
"mappings": {
"dynamic": "strict",
"properties": {
"team": { "type": "keyword", "index": true, "doc_values": true },
"members": {
"type": "nested",
"properties": {
"name": { "type": "keyword", "index": true, "doc_values": true },
"age": { "type": "integer", "index": true, "doc_values": true}
}
}
}
}
}
PUT nested_test/_doc/1
{
"team" : "A" ,
"members" :
[
{ "name" : "Curt" , "age" : "34" } ,
{ "name" : "Dave" , "age" : "33" }
]
}
PUT nested_test/_doc/2
{
"team" : "B" ,
"members" :
[
{ "name" : "Alex" , "age" : "36" } ,
{ "name" : "Earl" , "age" : "32" }
]
}
PUT nested_test/_doc/3
{
"team" : "C" ,
"members" :
[
{ "name" : "Brad" , "age" : "35" } ,
{ "name" : "Gary" , "age" : "31" }
]
}
Attempted Query
GET nested_test/_search?filter_path=hits.hits._source.team,hits.hits.sort.*,hits.hits.inner_hits.members.hits.hits._source.*,hits.hits.inner_hits.members.hits.hits.sort.*
{
"query": {
"bool": {
"filter": [
{
"nested": {
"path": "members",
"query": {
"match_all" : { }
} ,
"inner_hits": {
"size": 1,
"sort": {
"members.age": { "order": "asc" }
}
}
}
}
]
}
}
,
"sort": [
{ "members.name": {
"order": "asc" ,
"nested": {
"path": "members",
"filter": { "match_all" : { } }
}
} }
]
}
Results (If the query was correct, the teams would be in A, B, C order, but they are B, C, A)
{
"hits" : {
"hits" : [
{
"_source" : {
"team" : "B"
},
"inner_hits" : {
"members" : {
"hits" : {
"hits" : [
{
"_source" : {
"name" : "Earl",
"age" : "32"
}
}
]
}
}
}
},
{
"_source" : {
"team" : "C"
},
"inner_hits" : {
"members" : {
"hits" : {
"hits" : [
{
"_source" : {
"name" : "Gary",
"age" : "31"
}
}
]
}
}
}
},
{
"_source" : {
"team" : "A"
},
"inner_hits" : {
"members" : {
"hits" : {
"hits" : [
{
"_source" : {
"name" : "Dave",
"age" : "33"
}
}
]
}
}
}
}
]
}
}

It is not feasible with a nested sort, and you can't use the result of inner_hits to sort your documents.
You could maybe use a runtime field with a complex script to extract the name of the youngest member at search time, but it will certainly be ugly and it will perform poorly at scale.
Since you use a nested model, you have all the data needed at indexing time to store the youngest member's name in a dedicated field at the root of the document.
Then you will be able to use a standard sort for this use case.
It's the right way to do it in Elasticsearch if you want to keep the performance.
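As a rough illustration of the denormalization suggested above, the Python sketch below computes the youngest member's name before indexing and stores it in a root-level field. The field name youngest_member_name and the helper function are assumptions made for this example; they are not part of the mapping in the question, and because that mapping uses "dynamic": "strict" the field would have to be added to it first. The approach only applies when the ordering of the members is known at index time.

```python
def add_youngest_member_name(doc):
    """Return a copy of a team document with a root-level sort key.

    'youngest_member_name' is a hypothetical field name; it is not part of
    the mapping shown in the question.
    """
    # Ages are strings in the sample data, so compare them as integers.
    youngest = min(doc["members"], key=lambda m: int(m["age"]))
    enriched = dict(doc)
    enriched["youngest_member_name"] = youngest["name"]
    return enriched


teams = [
    {"team": "A", "members": [{"name": "Curt", "age": "34"}, {"name": "Dave", "age": "33"}]},
    {"team": "B", "members": [{"name": "Alex", "age": "36"}, {"name": "Earl", "age": "32"}]},
    {"team": "C", "members": [{"name": "Brad", "age": "35"}, {"name": "Gary", "age": "31"}]},
]

enriched = [add_youngest_member_name(t) for t in teams]

# Sorting on the root field mimics a plain (non-nested) sort in Elasticsearch.
ordered = sorted(enriched, key=lambda t: t["youngest_member_name"])
print([t["team"] for t in ordered])
```

With the enriched documents indexed, the search itself would only need a standard sort on the new root field.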

Related

Include parent _source fields in nested top hits aggregation

I am trying to aggregate on a field and get the top records using top_hits, but I want to include other fields in the response which are not part of the nested property mapping. Currently, if I specify _source:{"include":[]}, I am only able to get the fields which are in the current nested property.
Here is my mapping
{
"my_cart":{
"mappings":{
"properties":{
"store":{
"properties":{
"name":{
"type":"keyword"
}
}
},
"sales":{
"type":"nested",
"properties":{
"Price":{
"type":"float"
},
"Time":{
"type":"date"
},
"product":{
"properties":{
"name":{
"type":"text",
"fields":{
"keyword":{
"type":"keyword"
}
}
}
}
}
}
}
}
}
}
}
UPDATE
Joe's answer solved my above issue.
My current issue with the response is that, although I am getting the product name as "key" along with other details, I am also getting the other product names in the hits that were part of the same transaction in the billing receipt. I want to aggregate on the product's name and find the last sold date of each product along with other details such as price, quantity, etc.
Current Response
"aggregations" : {
"aggregate_by_most_sold_product" : {
"doc_count" : 2878592,
"all_products" : {
"buckets" : [
{
"key" : "shampoo",
"doc_count" : 1,
"lastSold" : {
"value" : 1.602569793E12,
"value_as_string" : "2018-10-13T06:16:33.000Z"
},
"using_reverse_nested" : {
"doc_count" : 1,
"latest product" : {
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 0.0,
"hits" : [
{
"_index" : "my_cart",
"_type" : "_doc",
"_id" : "36303258-9r7w-4b3e-ba3d-fhds7cfec7aa",
"_source" : {
"cashier" : {
"firstname" : "romeo",
"uuid" : "2828dhd-0911-7229-a4f8-8ab80dde86a6"
},
"product_price": {
"price":20,
"discount_offered":10
},
"sales" : [
{
"product" : {
"name" : "shampoo",
"time":"2018-10-13T04:44:26+00:00"
},
"product" : {
"name" : "noodles",
"time":"2018-10-13T04:42:26+00:00"
},
"product" : {
"name" : "biscuits",
"time":"2018-10-13T04:41:26+00:00"
}
}
]
}
}
]
}
}
]
Expected Response
It gives me all the product names in that transaction, which increases the bucket size. I only want a single product name with the last sold date, along with other details, for each product.
My aggregation is the same as the aggregation in Joe's answer.
I would also like to know whether I can add scripts to perform calculations on the fields I get back in _source.
Ex:- price-discount_offered = Final amount.
The nested context does not have access to the parent unless you use reverse_nested. In that case, however, you've lost the ability to only retrieve the applicable nested subdocument. But there is luckily a way to sort a terms aggregation by the result of a different, numeric one:
GET my_cart/_search
{
"size": 0,
"aggs": {
"aggregate": {
"nested": {
"path": "sales"
},
"aggs": {
"all_products": {
"terms": {
"field": "sales.product.name.keyword",
"size": 6500,
"order": { <--
"lowest_date": "asc"
}
},
"aggs": {
"lowest_date": { <--
"min": {
"field": "sales.Time"
}
},
"using_reverse_nested": {
"reverse_nested": {}, <--
"aggs": {
"latest product": {
"top_hits": {
"_source": {
"includes": [
"store.name"
]
},
"size": 1
}
}
}
}
}
}
}
}
}
}
The caveat is that you won't be getting the store.name inside of the top_hits -- though I suspect you're probably already doing some post-processing on the client side where you could combine those entries:
"aggregate" : {
...
"all_products" : {
...
"buckets" : [
{
"key" : "myproduct", <--
...
"using_reverse_nested" : {
...
"latest product" : {
"hits" : {
...
"hits" : [
{
...
"_source" : {
"store" : {
"name" : "mystore" <--
}
}
}
]
}
}
},
"lowest_date" : {
"value" : 1.4200704E12,
"value_as_string" : "2015/01/01" <--
}
}
]
}
}
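The client-side post-processing mentioned above could look roughly like the following Python sketch. It assumes the response has the shape shown in the answer (aggregation names aggregate, all_products, using_reverse_nested, "latest product" and lowest_date); the function name and the sample response are made up for illustration.

```python
def summarize_buckets(response):
    """Flatten each terms bucket into one row per product."""
    buckets = response["aggregations"]["aggregate"]["all_products"]["buckets"]
    rows = []
    for bucket in buckets:
        top = bucket["using_reverse_nested"]["latest product"]["hits"]["hits"]
        # top_hits may be empty, so guard before reading the store name.
        store = top[0]["_source"]["store"]["name"] if top else None
        rows.append({
            "product": bucket["key"],
            "lowest_date": bucket["lowest_date"]["value_as_string"],
            "store": store,
        })
    return rows


# Hypothetical response, trimmed to the fields used above.
sample_response = {
    "aggregations": {
        "aggregate": {
            "all_products": {
                "buckets": [
                    {
                        "key": "myproduct",
                        "using_reverse_nested": {
                            "latest product": {
                                "hits": {"hits": [{"_source": {"store": {"name": "mystore"}}}]}
                            }
                        },
                        "lowest_date": {"value": 1.4200704e12, "value_as_string": "2015/01/01"},
                    }
                ]
            }
        }
    }
}

rows = summarize_buckets(sample_response)
print(rows)
```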

Elasticsearch filter by multiple fields in an object which is in an array field

The goal is to filter products with multiple prices.
The data looks like this:
{
"name":"a",
"price":[
{
"membershipLevel":"Gold",
"price":"5"
},
{
"membershipLevel":"Silver",
"price":"50"
},
{
"membershipLevel":"Bronze",
"price":"100"
}
]
}
I would like to filter by membershipLevel and price. For example, if I am a silver member and query price range 0-10, the product should not appear, but if I am a gold member, the product "a" should appear. Is this kind of query supported by Elasticsearch?
You need to use the nested datatype for price and a nested query for your use case.
Please see the below mapping, sample document, query and response:
Mapping:
PUT my_price_index
{
"mappings": {
"properties": {
"name":{
"type":"text"
},
"price":{
"type":"nested",
"properties": {
"membershipLevel":{
"type":"keyword"
},
"price":{
"type":"double"
}
}
}
}
}
}
Sample Document:
POST my_price_index/_doc/1
{
"name":"a",
"price":[
{
"membershipLevel":"Gold",
"price":"5"
},
{
"membershipLevel":"Silver",
"price":"50"
},
{
"membershipLevel":"Bronze",
"price":"100"
}
]
}
Query:
POST my_price_index/_search
{
"query": {
"nested": {
"path": "price",
"query": {
"bool": {
"must": [
{
"term": {
"price.membershipLevel": "Gold"
}
},
{
"range": {
"price.price": {
"gte": 0,
"lte": 10
}
}
}
]
}
},
"inner_hits": {} <---- Do note this.
}
}
}
The above query returns all documents that have a nested price object with price.price in the range 0 to 10 and price.membershipLevel equal to Gold.
Notice that I've made use of inner_hits. The reason is that, even though the match happens on a nested document, ES returns the entire parent document in the response rather than only the nested document to which the query clause applies.
To find the exact nested document that matched, you need to make use of inner_hits.
Below is how the response would return.
Response:
{
"took" : 128,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 1.9808291,
"hits" : [
{
"_index" : "my_price_index",
"_type" : "_doc",
"_id" : "1",
"_score" : 1.9808291,
"_source" : {
"name" : "a",
"price" : [
{
"membershipLevel" : "Gold",
"price" : "5"
},
{
"membershipLevel" : "Silver",
"price" : "50"
},
{
"membershipLevel" : "Bronze",
"price" : "100"
}
]
},
"inner_hits" : {
"price" : {
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 1.9808291,
"hits" : [
{
"_index" : "my_price_index",
"_type" : "_doc",
"_id" : "1",
"_nested" : {
"field" : "price",
"offset" : 0
},
"_score" : 1.9808291,
"_source" : {
"membershipLevel" : "Gold",
"price" : "5"
}
}
]
}
}
}
}
]
}
}
Hope this helps!
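To read only the matched nested objects out of a response shaped like the one above, a small client-side helper is enough. The Python sketch below is an illustration only: the function name is made up, and the sample response is a trimmed copy of the one shown in this answer.

```python
def matched_nested_docs(response, inner_hits_name="price"):
    """Collect the nested objects reported under inner_hits for each hit."""
    results = []
    for hit in response["hits"]["hits"]:
        inner = hit.get("inner_hits", {}).get(inner_hits_name, {})
        for nested_hit in inner.get("hits", {}).get("hits", []):
            results.append({
                "name": hit["_source"]["name"],
                # Position of the matched object inside the nested array.
                "offset": nested_hit["_nested"]["offset"],
                "matched": nested_hit["_source"],
            })
    return results


# Trimmed copy of the response above.
sample_response = {
    "hits": {
        "hits": [
            {
                "_id": "1",
                "_source": {"name": "a"},
                "inner_hits": {
                    "price": {
                        "hits": {
                            "hits": [
                                {
                                    "_nested": {"field": "price", "offset": 0},
                                    "_source": {"membershipLevel": "Gold", "price": "5"},
                                }
                            ]
                        }
                    }
                },
            }
        ]
    }
}

matches = matched_nested_docs(sample_response)
print(matches)
```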
Let me show you how to do it using nested fields with query and filter context. I will take your example to show you how to define the index mapping, index sample documents, and write the search query.
It's important to note the include_in_parent param in the Elasticsearch mapping, which allows us to query these nested fields without using the nested query.
Please refer to Elasticsearch documentation about it.
If true, all fields in the nested object are also added to the parent
document as standard (flat) fields. Defaults to false.
Index Def
{
"mappings": {
"properties": {
"product": {
"type": "nested",
"include_in_parent": true
}
}
}
}
Index sample docs
{
"product": {
"price" : 5,
"membershipLevel" : "Gold"
}
}
{
"product": {
"price" : 50,
"membershipLevel" : "Silver"
}
}
{
"product": {
"price" : 100,
"membershipLevel" : "Bronze"
}
}
Search query to show Gold with price range 0-10
{
"query": {
"bool": {
"must": [
{
"match": {
"product.membershipLevel": "Gold"
}
}
],
"filter": [
{
"range": {
"product.price": {
"gte": 0,
"lte" : 10
}
}
}
]
}
}
}
Result
"hits": [
{
"_index": "so-60620921-nested",
"_type": "_doc",
"_id": "1",
"_score": 1.0296195,
"_source": {
"product": {
"price": 5,
"membershipLevel": "Gold"
}
}
}
]
Search query to exclude Silver, with same price range
{
"query": {
"bool": {
"must": [
{
"match": {
"product.membershipLevel": "Silver"
}
}
],
"filter": [
{
"range": {
"product.price": {
"gte": 0,
"lte" : 10
}
}
}
]
}
}
}
Above query doesn't return any result as there isn't any matching result.
P.S :- this SO answer might help you to understand nested fields and query on them in detail.
You have to use nested fields and a nested query to achieve this: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-nested-query.html
Define your price property with type "nested" and then you will be able to filter by every property of the nested object.
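To see why the nested type matters for this question, the Python sketch below mimics the two matching behaviours in plain code. It is a simplified model rather than Elasticsearch code: an array of plain objects behaves as if each field's values were pooled across the array, while a nested field evaluates both conditions against one object at a time. Both function names are assumptions for illustration.

```python
def matches_flattened(doc, level, low, high):
    """Object-array semantics: each field is checked against pooled values."""
    levels = [p["membershipLevel"] for p in doc["price"]]
    prices = [float(p["price"]) for p in doc["price"]]
    return level in levels and any(low <= x <= high for x in prices)


def matches_nested(doc, level, low, high):
    """Nested semantics: both conditions must hold within the same object."""
    return any(
        p["membershipLevel"] == level and low <= float(p["price"]) <= high
        for p in doc["price"]
    )


product = {
    "name": "a",
    "price": [
        {"membershipLevel": "Gold", "price": "5"},
        {"membershipLevel": "Silver", "price": "50"},
        {"membershipLevel": "Bronze", "price": "100"},
    ],
}

# Silver with price 0-10: the pooled view matches by accident, nested does not.
print(matches_flattened(product, "Silver", 0, 10))
print(matches_nested(product, "Silver", 0, 10))
print(matches_nested(product, "Gold", 0, 10))
```

The accidental match in the pooled view is exactly the case the question wants to exclude, which is why the answers recommend the nested mapping.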

Elasticsearch Match Date Range or Number in Array

My goal is to filter my records by date and by day of the week (Mon = 1, Tue = 2, Wed = 3, ..., Sun = 7). Either the date should fall within the range or the weekday should match any of the days in the array, or both, of course. I am new to Elasticsearch and seem to have a number of mistakes in my query. I have documented everything here as far as I got, and I hope for a couple of helpful insights. Thanks in advance.
Current Mapping
{
"index":{
"mappings":{
"entity":{
"_meta":{
"model":"AppBundle\\Entity\\Entity"
},
"properties":{
"subEntity":{
"properties":{
"date":{
"type":"date",
"format":"strict_date_optional_time||epoch_millis"
},
"days":{
"properties":{
"day":{
"type":"string"
}
}
}
}
}
}
}
}
}
}
Current Records
curl -XGET 'localhost:9200/index/_search?pretty=1'
{
"took" : 2,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 4,
"max_score" : 1.0,
"hits" : [ {
"_index" : "index",
"_type" : "entity",
"_id" : "1",
"_score" : 1.0,
"_source" : {
"subEntity" : [ {
"date" : "2016-09-20T00:00:00+02:00",
"days" : [ ]
}, {
"date" : "2016-09-21T00:00:00+02:00",
"days" : [ ]
}, {
"date" : "2016-09-22T00:00:00+02:00",
"days" : [ {
"day" : 4
}, {
"day" : 5
}, {
"day" : 6
} ]
}, {
"date" : "2016-09-20T00:00:00+02:00",
"days" : [ ]
} ]
}
},
[...]
}
}
Current Request
{
"query":{
"should":{
"filter":[ {
"range":{
"entity.subEntity.date":{
"gte":"2016-09-20",
"lte":"2016-09-21"
}
}
}, {
"term":{
"entity.subEntity.days.day": 2
}
} ]
}
}
}
MySQL Equivalent
SELECT entity
FROM entity
LEFT JOIN subEntity ON (subEntity.entity_id = entity.id)
LEFT JOIN day ON (day.subEntity_id = subEntity.id)
WHERE subEntity.date BETWEEN '2016-09-20' AND '2016-09-21'
OR day = 2
If you want to query across properties of a sub-object within a document (where a document may have a collection of such sub-objects), you need to map subEntity as a nested type. In your example, since you are only looking for documents that are within the date range or match the day value, you can use an object mapping as you have now, but if you ever need to combine the conditions with an AND operation, you would need a nested type mapping. If that is likely, it makes sense to map subEntity as a nested type from the start. Additionally, since day is a numeric value, you should map it as a byte.
{
"index":{
"mappings":{
"entity":{
"_meta":{
"model":"AppBundle\\Entity\\Entity"
},
"properties":{
"subEntity":{
"type": "nested",
"properties":{
"date":{
"type":"date",
"format":"strict_date_optional_time||epoch_millis"
},
"days":{
"properties":{
"day":{
"type":"byte"
}
}
}
}
}
}
}
}
}
}
Now that subEntity is mapped as a nested type, a nested query needs to be used to query against it, so the query becomes
{
"query": {
"nested": {
"query": {
"bool": {
"should": [
{
"bool": {
"filter": [
{
"range": {
"subEntity.date": {
"gte": "2016-09-20",
"lte": "2016-09-21"
}
}
}
]
}
},
{
"bool": {
"filter": [
{
"terms": {
"subEntity.days.day": [
2
]
}
}
]
}
}
]
}
},
"path": "subEntity"
}
}
}
Both queries are issued as bool filter queries because we don't need to calculate a relevancy score for either; we simply need to know whether a document matches or not, i.e. a simple yes/no answer. Wrapping a query in a bool filter means that the query runs in a filter context.
Next, either query can match, so we add both as should clauses to an outer bool query.
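If the request body is assembled in application code, the same structure can be produced by a small helper. The Python sketch below is one possible way to do it; the function name and its parameters are assumptions for illustration, and the generated body mirrors the query shown above.

```python
import json


def date_or_day_query(path, gte, lte, days):
    """Build a nested query that matches on a date range OR a list of days."""
    return {
        "query": {
            "nested": {
                "path": path,
                "query": {
                    "bool": {
                        # Each clause runs in filter context; either may match.
                        "should": [
                            {"bool": {"filter": [
                                {"range": {f"{path}.date": {"gte": gte, "lte": lte}}}
                            ]}},
                            {"bool": {"filter": [
                                {"terms": {f"{path}.days.day": days}}
                            ]}},
                        ]
                    }
                },
            }
        }
    }


body = date_or_day_query("subEntity", "2016-09-20", "2016-09-21", [2])
print(json.dumps(body, indent=2))
```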
As a complete example:
Create index and mapping
PUT http://localhost:9200/entities?pretty=true
{
"settings": {
"index.number_of_replicas": 0,
"index.number_of_shards": 1
},
"mappings": {
"entity": {
"properties": {
"id": {
"type": "integer"
},
"subEntity": {
"type": "nested",
"properties": {
"date": {
"type": "date"
},
"days": {
"properties": {
"day": {
"type": "short"
}
},
"type": "object"
}
}
}
}
}
}
}
Bulk index four entities
POST http://localhost:9200/_bulk?pretty=true
{"index":{"_index":"entities","_type":"entity","_id":"1"}}
{"subEntity":{"date":"2016-09-19T05:00:00+00:00"}}
{"index":{"_index":"entities","_type":"entity","_id":"2"}}
{"subEntity":{"date":"2016-09-20T05:00:00+00:00"}}
{"index":{"_index":"entities","_type":"entity","_id":"3"}}
{"subEntity":{"date":"2016-09-18T18:00:00+00:00","days":[{"day":2},{"day":5}]}}
{"index":{"_index":"entities","_type":"entity","_id":"4"}}
{"subEntity":{"date":"2016-09-18T18:00:00+00:00","days":[{"day":3},{"day":4}]}}
Issue the search query above
POST http://localhost:9200/entities/entity/_search?pretty=true
{
"query": {
"nested": {
"query": {
"bool": {
"should": [
{
"bool": {
"filter": [
{
"range": {
"subEntity.date": {
"gte": "2016-09-20",
"lte": "2016-09-21"
}
}
}
]
}
},
{
"bool": {
"filter": [
{
"terms": {
"subEntity.days.day": [
2
]
}
}
]
}
}
]
}
},
"path": "subEntity"
}
}
}
We should only get back entities with ids 2 and 3; id 2 matches on date and id 3 matches on day
{
"took" : 4,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"failed" : 0
},
"hits" : {
"total" : 2,
"max_score" : 0.0,
"hits" : [ {
"_index" : "entities",
"_type" : "entity",
"_id" : "2",
"_score" : 0.0,
"_source" : {
"subEntity" : {
"date" : "2016-09-20T05:00:00+00:00"
}
}
}, {
"_index" : "entities",
"_type" : "entity",
"_id" : "3",
"_score" : 0.0,
"_source" : {
"subEntity" : {
"date" : "2016-09-18T18:00:00+00:00",
"days" : [ {
"day" : 2
}, {
"day" : 5
} ]
}
}
} ]
}
}
Your solution could easily be achieved with the "or" query, but from ES 2.0.0 onwards the "or" query is deprecated. Instead of the "or" query we can now use the "bool" query. A sample query is given below:
{
"query": {
"bool" : {
"should" : [
{
"term" : { "CREAT_DT": "2015-11-03T07:49:07.000Z" }
},
{
"term" : { "TableName": "dwd" }
}
],
"minimum_should_match" : 1,
"boost" : 1.0
}
}
}
More details about its use can be found at the link below:
https://www.elastic.co/guide/en/elasticsearch/reference/2.0/query-dsl-bool-query.html
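To make the role of minimum_should_match concrete, here is a small Python sketch, not Elasticsearch code, that mimics how a bool query containing only should clauses decides whether a document matches. The helper name and the simplified exact-equality matching are assumptions for illustration.

```python
def bool_should_matches(doc, should_terms, minimum_should_match=1):
    """Count satisfied should clauses and compare against the minimum."""
    matched = sum(1 for field, value in should_terms if doc.get(field) == value)
    return matched >= minimum_should_match


should_terms = [
    ("CREAT_DT", "2015-11-03T07:49:07.000Z"),
    ("TableName", "dwd"),
]

# Hypothetical documents: one matches a single clause, one matches none.
date_only = {"CREAT_DT": "2015-11-03T07:49:07.000Z", "TableName": "other"}
neither = {"CREAT_DT": "2016-01-01T00:00:00.000Z", "TableName": "other"}

print(bool_should_matches(date_only, should_terms))
print(bool_should_matches(neither, should_terms))
print(bool_should_matches(date_only, should_terms, minimum_should_match=2))
```

With minimum_should_match set to 1 the query behaves like an OR; raising it to the number of clauses makes it behave like an AND.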

Search query for elastic search

I have documents in elastic search in the following format
{
"stringindex" : {
"mappings" : {
"files" : {
"properties" : {
"BaseOfCode" : {
"type" : "long"
},
"BaseOfData" : {
"type" : "long"
},
"Characteristics" : {
"type" : "long"
},
"FileType" : {
"type" : "long"
},
"Id" : {
"type" : "string"
},
"Strings" : {
"properties" : {
"FileOffset" : {
"type" : "long"
},
"RO_BaseOfCode" : {
"type" : "long"
},
"SectionName" : {
"type" : "string"
},
"SectionOffset" : {
"type" : "long"
},
"String" : {
"type" : "string"
}
}
},
"SubSystem" : {
"type" : "long"
}
}
}
}
}
}
My requirement is that when I search for a particular string (Strings.String), I want to get only the FileOffset (Strings.FileOffset) for that string.
How do I do this?
Thanks
I suppose that you want to perform a nested query and retrieve only one field as the result, but I see problems in your mapping, hence I will split my answer in 3 sections:
What is the problem I see:
How to query nested fields (this is more ES background):
How to find a solution:
1) What is the problem I see:
You want to query a nested field, but you don't have a nested field.
The nested field part:
The field "Strings" is not nested in the type "files" (nested data without a nested field may bring future problems), otherwise your mapping for the field "Strings" would be something like this:
{
"stringindex" : {
"mappings" : {
"files" : {
"properties" : {
"Strings" : {
"type" : "nested",
"properties" : {
"String" : {
"type" : "string"
}
}
}
}
}
}
}
}
Note: yes, I cut most of the fields, but I did this to easily show that you didn't create a nested field.
With a nested field "in hands", we need a nested query.
The specific field result part:
To retrieve only one field as result, you have to include the property "_source" in your query.
2) How to query nested fields:
This is more for ES background, if you have never worked with nested fields.
Small example:
You define a type with a nested field:
{
"nesttype" : {
"properties" : {
"name" : { "type" : "string" },
"parents" : {
"type" : "nested" ,
"properties" : {
"sex" : { "type" : "string" },
"name" : { "type" : "string" }
}
}
}
}
}
You create some inputs:
{ "name" : "Dan", "parents" : [{ "name" : "John" , "sex" : "m" },
{ "name" : "Anna" , "sex" : "f" }] }
{ "name" : "Lana", "parents" : [{ "name" : "Maria" , "sex" : "f" }] }
Then you query, but only fetch the nested field "parents.name":
{
"query": {
"nested": {
"path": "parents",
"query": {
"bool": {
"must": [
{
"term": {
"parents.sex": "m"
}
}
]
}
}
}
},
"_source" : [ "parents.name" ]
}
The output of this query is "the names of the parents of all people who have a parent of the sex 'm'". One entry (Dan) has a father, whereas the other (Lana) doesn't, so it will only retrieve the names of Dan's parents.
3) How to find a solution:
To fix your mapping:
You only need to include the type "nested" in the field "Strings":
{
"files" : {
"properties" : {
...
"Strings" : {
"type" : "nested" ,
"properties" : {
"FileOffset" : { "type" : "long" },
"RO_BaseOfCode" : { "type" : "long" },
...
}
}
...
}
}
}
To query your data:
{
"query": {
"nested": {
"path": "Strings",
"query": {
"bool": {
"must": [
{
"term": {
"Strings.String": "my string"
}
}
]
}
}
}
},
"_source" : [ "Strings.FileOffset" ]
}
Great answer by dan, but I think he didn't cover everything.
His solution doesn't quite work for your question, though you may not have noticed that yet.
Consider a scenario where the data looks like this:
doc_1
{
"Id": 1,
"Strings": [
{
"string": "x",
"fileoffset": "f1"
},
{
"string": "y",
"fileoffset": "f2"
}
]
}
doc_2
{
"Id": 2,
"Strings": {
"string": "z",
"fileoffset": "f3"
}
}
When you run the query as dan suggested, say with a filter on Strings.string=x, the response looks like this:
{
"hits": [
{
"_index": "stringindex",
"_type": "files",
"_id": "11961",
"_score": 1,
"_source": {
"Strings": [
{
"fileoffset": "f1"
},
{
"fileoffset": "f2"
}
]
}
}
]
}
This is because Elasticsearch returns a document as a hit when any of the objects inside the nested field (here Strings) passes the filter criteria. (In this case, in doc_1, Strings.string=x passed the filter, so doc_1 is returned.) But we don't know which nested object passed the criteria.
So you have to use a nested aggregation.
Here is a solution for you:
POST index/type/_search
{
"size": 0,
"aggs": {
"StringsNested": {
"nested": {
"path": "Strings"
},
"aggs": {
"StringFilter": {
"filter": {
"term": {
"Strings.string": "x"
}
},
"aggs": {
"FileOffsets": {
"terms": {
"field": "Strings.fileoffset"
}
}
}
}
}
}
}
}
So, response is like,
"aggregations": {
"StringsNested": {
"doc_count": 2,
"StringFilter": {
"doc_count": 1,
"FileOffsets": {
"buckets": [
{
"key": "f1",
"doc_count": 1
}
]
}
}
}
}
Remember to have mapping of Strings as nested, as dan said.
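Reading the offsets out of that aggregation response takes only a few lines on the client. The Python sketch below assumes the aggregation names used above (StringsNested, StringFilter, FileOffsets); the function name and the sample response are for illustration only.

```python
def file_offsets(response):
    """Return the file offsets found by the filtered nested aggregation."""
    agg = response["aggregations"]["StringsNested"]["StringFilter"]["FileOffsets"]
    return [bucket["key"] for bucket in agg["buckets"]]


# Copy of the aggregation response shown above.
sample_response = {
    "aggregations": {
        "StringsNested": {
            "doc_count": 2,
            "StringFilter": {
                "doc_count": 1,
                "FileOffsets": {"buckets": [{"key": "f1", "doc_count": 1}]},
            },
        }
    }
}

offsets = file_offsets(sample_response)
print(offsets)
```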

ElasticSearch : search and return nested type

I am pretty new to ElasticSearch and I am having trouble using nested mapping / query.
I have the following data structure added to my index :
{
"_id": "3",
"_rev": "6-e9e1bc15b39e333bb4186de05ec1b167",
"skuCode": "test",
"name": "Dragon vol. 1",
"pages": [
{
"id": "1",
"tags": [
{
"name": "dragon"
},
{
"name": "japonese"
}
]
},
{
"id": "2",
"tags": [
{
"name": "tagforanotherpage"
}
]
}
]
}
This index mapping is defined as below:
{
"metabook" : {
"metabook" : {
"properties" : {
"_rev" : {
"type" : "string"
},
"name" : {
"type" : "string"
},
"pages" : {
"type" : "nested",
"properties" : {
"tags" : {
"properties" : {
"name" : {
"type" : "string"
}
}
}
}
},
"skuCode" : {
"type" : "string"
}
}
}
}
}
My goal is to search all pages containing a specific tag, and return the book object with the filtered page list (I would like ES to return only pages that match the given tag). Something like (ignoring the second page) :
{
"_id": "3",
"_rev": "6-e9e1bc15b39e333bb4186de05ec1b167",
"skuCode": "test",
"name": "Dragon vol. 1",
"pages": [
{
"id": "1",
"tags": [
{
"name": "dragon"
},
{
"name": "japonese"
}
]
}
]
}
Here is the query I actually use :
{
"from": 0,
"size": 10,
"query" : {
"nested" : {
"path" : "pages",
"score_mode" : "avg",
"query" : {
"term" : { "tags.name" : "japonese" }
}
}
}
}
But it actually returns an empty result. What am I doing wrong ? Maybe I should index my "pages" directly instead of books ? What am I missing ?
Thank you in advance !
Sadly, you can't get back only parts of a document. If the document matches a query, you will get the whole thing back: the root and all nested docs. If you want to get only parts back, then you could look at using parent/child docs.
Also, you aren't seeing any hits because you have a small error in the nested query. Look closely at the field name:
{
"from": 0,
"size": 10,
"query" : {
"nested" : {
"path" : "pages",
"score_mode" : "avg",
"query" : {
"term" : { "pages.tags.name" : "japonese" }
}
}
}
}
If you need help with parent child docs feel free to ask! (There should be examples if you do a google search)
Good luck!
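If restructuring the index is not an option, one workaround is to trim the returned document on the client. The Python sketch below is only that, a client-side filter applied to the _source that Elasticsearch returns; the function name is an assumption, and the sample book is the one from the question.

```python
def keep_pages_with_tag(book, tag):
    """Return a copy of the book that keeps only pages carrying the tag."""
    filtered = dict(book)
    filtered["pages"] = [
        page for page in book.get("pages", [])
        if any(t.get("name") == tag for t in page.get("tags", []))
    ]
    return filtered


book = {
    "_id": "3",
    "skuCode": "test",
    "name": "Dragon vol. 1",
    "pages": [
        {"id": "1", "tags": [{"name": "dragon"}, {"name": "japonese"}]},
        {"id": "2", "tags": [{"name": "tagforanotherpage"}]},
    ],
}

trimmed = keep_pages_with_tag(book, "japonese")
print([page["id"] for page in trimmed["pages"]])
```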
