inner_hits does not work for nested filters when searching multiple indices - elasticsearch

Elasticsearch version
Version: 6.2.2
I'm trying to search multiple indices which both have nested objects. To keep it simple I will use this example. In the query I use "inner_hits" property.
PUT /sales
{
"mappings": {
"_doc" : {
"properties" : {
"tags" : { "type" : "keyword" },
"comments" : {
"type" : "nested",
"properties" : {
"username" : { "type" : "keyword" },
"comment" : { "type" : "text" }
}
}
}
}
}
}
PUT /sales/_doc/1?refresh
{
"tags": ["car", "auto"],
"comments": [
{"username": "baddriver007", "comment": "This car could have better brakes"},
{"username": "dr_who", "comment": "Where's the autopilot? Can't find it"},
{"username": "ilovemotorbikes", "comment": "This car has two extra wheels"}
]
}
PUT /markets
{
"mappings": {
"_doc" : {
"properties" : {
"name" : { "type" : "keyword" },
"products" : {
"type" : "nested",
"properties" : {
"name" : { "type" : "keyword" },
"sku" : { "type" : "text" }
}
}
}
}
}
}
POST /sales,markets/_search
{
"query": {
"bool": {
"should": [
{
"bool": {
"should": [
{
"nested": {
"path": "comments",
"inner_hits": {
},
"ignore_unmapped": true,
"query": {
"match_all": {}
}
}
}]
}
},
{
"bool": {
"should": [
{
"nested": {
"path": "products",
"inner_hits": {
},
"ignore_unmapped": true,
"query": {
"match_all": {}
}
}
}
]
}
}
]
}
}
}
So this query gives an error
{
"error": {
"root_cause": [
{
"type": "illegal_state_exception",
"reason": "[match_all] no mapping found for type [comments]"
},
{
"type": "illegal_state_exception",
"reason": "[match_all] no mapping found for type [products]"
}
],
"type": "search_phase_execution_exception",
"reason": "all shards failed",
"phase": "query",
"grouped": true,
"failed_shards": [
{
"shard": 0,
"index": "markets",
"node": "-22psoQNRLa8_Y9GeHBXaw",
"reason": {
"type": "illegal_state_exception",
"reason": "[match_all] no mapping found for type [comments]"
}
},
{
"shard": 0,
"index": "sales",
"node": "-22psoQNRLa8_Y9GeHBXaw",
"reason": {
"type": "illegal_state_exception",
"reason": "[match_all] no mapping found for type [products]"
}
}
]
},
"status": 500
}
But when I add "ignore_unmapped": true inside each "inner_hits": { "ignore_unmapped": true } everything works fine. This is not implemented in NEST .net library.
Is it right to use "ignore_unmapped" inside "inner_hits", because I didn't find this as "inner_hits" property in the documentation ?
Is there any other solution in NEST for this to be done.
UPDATE:
I tried using operator overloading queries and I got
Func<SearchDescriptor<object>, ISearchRequest> search = s => s.Type<object>()
.Index(Indices.Index("sales", "markets"))
.AllTypes()
.Explain(false)
.Query(q => (q
.Nested(n => n
.IgnoreUnmapped(true)
.Path(Infer.Field<SaleDocument>(f => f.Comments))
.InnerHits(ih => ih
.Size(1)
)
.Query(q1 => q1
.MatchAll()
)
) && +q.Terms(t => t
.Field("_index")
.Terms(new[] { "sales" })
)
) || (q
.Nested(n => n
.IgnoreUnmapped(true)
.Path(Infer.Field<MarketDocument>(f => f.Products))
.InnerHits(ih => ih
.Size(1)
)
.Query(q1 => q1
.MatchAll()
)
) && +q.Terms(t => t
.Field("_index")
.Terms(new[] { "markets" })
)
)
);
This code created query
POST /sales,markets/_search
{
"from": 0,
"size": 10,
"explain": false,
"query": {
"bool": {
"should": [
{
"bool": {
"must": [
{
"nested": {
"query": {
"match_all": {}
},
"path": "comments",
"inner_hits": {
"size": 1
},
"ignore_unmapped": true
}
}
],
"filter": [
{
"terms": {
"_index": [
"sales"
]
}
}
]
}
},
{
"bool": {
"must": [
{
"nested": {
"query": {
"match_all": {}
},
"path": "products",
"inner_hits": {
"size": 1
},
"ignore_unmapped": true
}
}
],
"filter": [
{
"terms": {
"_index": [
"markets"
]
}
}
]
}
}
]
}
}
}
Which again gives error
{
"error": {
"root_cause": [
{
"type": "illegal_state_exception",
"reason": "[match_all] no mapping found for type [comments]"
},
{
"type": "illegal_state_exception",
"reason": "[match_all] no mapping found for type [products]"
}
],
"type": "search_phase_execution_exception",
"reason": "all shards failed",
"phase": "query",
"grouped": true,
"failed_shards": [
{
"shard": 0,
"index": "markets",
"node": "-22psoQNRLa8_Y9GeHBXaw",
"reason": {
"type": "illegal_state_exception",
"reason": "[match_all] no mapping found for type [comments]"
}
},
{
"shard": 0,
"index": "sales",
"node": "-22psoQNRLa8_Y9GeHBXaw",
"reason": {
"type": "illegal_state_exception",
"reason": "[match_all] no mapping found for type [products]"
}
}
]
},
"status": 500
}

As you say, it looks like inner_hits property is missing within NEST; I'll open an issue to add this for the next release now.
ignore_unmapped is the way to handle this when needing inner_hits. If you didn't need inner_hits, you could combine each nested query with a term query on the "_index" metadata field that targets the respective index name in each case, such that the nested query is run only against the indices that contain the target field.

This issue has been fixed in both places and will be available in some of the future releases of NEST or ElasticSearch.
NEST .net library by adding IgnoreUnmapped() method in InnerHits.
https://github.com/elastic/elasticsearch-net/issues/3132
Elastic Search by inherit ignore_unmapped in inner_hits from nested query ignore_unmapped property
https://github.com/elastic/elasticsearch/issues/29071

Related

Elastic search array of objects nested range aggregation

I'm trying to make range aggregation on the following data set:
{
"ProductType": 1,
"ProductDefinition": "fc588f8e-14f2-4871-891f-c73a4e3d17ca",
"ParentProduct": null,
"Sku": "074617",
"VariantSku": null,
"Name": "Paraboot Avoriaz/Jannu Marron Brut Marron Brown Hiking Boot Shoes",
"AllowOrdering": true,
"Rating": null,
"ThumbnailImageUrl": "/media/1106/074617.jpg",
"PrimaryImageUrl": "/media/1106/074617.jpg",
"Categories": [
"399d7b20-18cc-46c0-b63e-79eadb9390c7"
],
"RelatedProducts": [],
"Variants": [
"84a7ff9f-edf0-4aab-87f9-ba4efd44db74",
"e2eb2c50-6abc-4fbe-8fc8-89e6644b23ef",
"a7e16ccc-c14f-42f5-afb2-9b7d9aefbc5c"
],
"PriceGroups": [
"86182755-519f-4e05-96ef-5f93a59bbaec"
],
"DisplayName": "Paraboot Avoriaz/Jannu Marron Brut Marron Brown Hiking Boot Shoes",
"ShortDescription": "",
"LongDescription": "<ul><li>Paraboot Avoriaz Mountaineering Boots</li><li>Marron Brut Marron (Brown)</li><li>Full leather inners and uppers</li><li>Norwegien Welted Commando Sole</li><li>Hand made in France</li><li>Style number : 074617</li></ul><p>As featured on Pritchards.co.uk</p>",
"UnitPrices": {
"EUR 15 pct": 343.85
},
"Taxes": {
"EUR 15 pct": 51.5775
},
"PricesInclTax": {
"EUR 15 pct": 395.4275
},
"Slug": "paraboot-avoriazjannu-marron-brut-marron-brown-hiking-boot-shoes",
"VariantsProperties": [
{
"Key": "ShoeSize",
"Value": "8"
},
{
"Key": "ShoeSize",
"Value": "10"
},
{
"Key": "ShoeSize",
"Value": "6"
}
],
"Guid": "0d4f6899-c66a-4416-8f5d-26822c3b57ae",
"Id": 178,
"ShowOnHomepage": true
}
I'm aggregating on VariantsProperties which have the following mapping
"VariantsProperties": {
"type": "nested",
"properties": {
"Key": {
"type": "keyword"
},
"Value": {
"type": "keyword"
}
}
}
Terms aggregations are working fine with following code:
{
"aggs": {
"Nest": {
"nested": {
"path": "VariantsProperties"
},
"aggs": {
"fieldIds": {
"terms": {
"field": "VariantsProperties.Key"
},
"aggs": {
"values": {
"terms": {
"field": "VariantsProperties.Value"
}
}
}
}
}
}
}
}
However when I try to do a range aggregation to get shoes in size between 8 - 12 such as:
{
"aggs": {
"Nest": {
"nested": {
"path": "VariantsProperties"
},
"aggs": {
"fieldIds": {
"range": {
"field": "VariantsProperties.Value",
"ranges": [ { "from": 8, "to": 12 }]
}
}
}
}
}
}
I get the following error:
{
"error": {
"root_cause": [
{
"type": "illegal_argument_exception",
"reason": "Field [VariantsProperties.Value] of type [keyword] is not supported for aggregation [range]"
}
],
"type": "search_phase_execution_exception",
"reason": "all shards failed",
"phase": "query",
"grouped": true,
"failed_shards": [
{
"shard": 0,
"index": "product-avenueproductindexdefinition-24476f82-en-us",
"node": "ejgN4XecT1SUfgrhzP8uZg",
"reason": {
"type": "illegal_argument_exception",
"reason": "Field [VariantsProperties.Value] of type [keyword] is not supported for aggregation [range]"
}
}
],
"caused_by": {
"type": "illegal_argument_exception",
"reason": "Field [VariantsProperties.Value] of type [keyword] is not supported for aggregation [range]",
"caused_by": {
"type": "illegal_argument_exception",
"reason": "Field [VariantsProperties.Value] of type [keyword] is not supported for aggregation [range]"
}
}
},
"status": 400
}
Is there a way to "transform" the terms aggregation into a range aggregation, without the need of changing the schema? I know I could build the ranges myself by extracting the data from the terms aggregation and building the ranges out of it, however, I would prefer a solution within the elastic itself.
There are two ways to solve this:
Option A: Use a script instead of a field. This option will work without having to reindex your data, but depending on your volume of data, the performance might suffer.
POST test/_search
{
"aggs": {
"Nest": {
"nested": {
"path": "VariantsProperties"
},
"aggs": {
"fieldIds": {
"range": {
"script": "Integer.parseInt(doc['VariantsProperties.Value'].value)",
"ranges": [
{
"from": 8,
"to": 12
}
]
}
}
}
}
}
}
Option B: Add an integer sub-field in your mapping.
PUT my-index/_mapping
{
"properties": {
"VariantsProperties": {
"type": "nested",
"properties": {
"Key": {
"type": "keyword"
},
"Value": {
"type": "keyword",
"fields": {
"numeric": {
"type": "integer",
"ignore_malformed": true
}
}
}
}
}
}
}
Once your mapping is modified, you can run _update_by_query on your index in order to reindex the VariantsProperties.Value data
PUT my-index/_update_by_query
Finally, when this last command is done, you can run the range aggregation on the VariantsProperties.Value.numeric field.
Also note that this second but will be more performant on the long term.

Nested Query in Elastic Search

I have a schema in elastic search of this form:
{
"index1" : {
"mappings" : {
"properties" : {
"key1" : {
"type" : "keyword"
},
"key2" : {
"type" : "keyword"
},
"key3" : {
"properties" : {
"components" : {
"type" : "nested",
"properties" : {
"sub1" : {
"type" : "keyword"
},
"sub2" : {
"type" : "keyword"
},
"sub3" : {
"type" : "keyword"
}
}
}
}
}
}
}
}
}
and then the data stored in elastic search would be of the format:
{
"_index" : "index1",
"_type" : "_doc",
"_id" : "1",
"_score" : 1.0,
"_source" : {
"key1" : "val1",
"key2" : "val2",
"key3" : {
components : [
{
"sub1" : "subval11",
"sub3" : "subval13"
},
{
"sub1" : "subval21",
"sub2" : "subval22",
"sub3" : "subval23"
},
{
"sub1" : "subval31",
"sub2" : "subval32",
"sub3" : "subval33"
}
]
}
}
}
As you can see that the sub1, sub2 and sub3 might not be present in few of the objects under key3.
Now if I try to write a query to fetch the result based on key3.sub2 as subval22 using this query
GET index1/_search
{
"query": {
"nested": {
"path": "components",
"query": {
"bool": {
"must": [
{
"match": {"key3.sub2": "subval22"}
}
]
}
}
}
}
}
I always get the error as
{
"error": {
"root_cause": [
{
"type": "query_shard_exception",
"reason": "failed to create query: {...}",
"index_uuid": "1",
"index": "index1"
}
],
"type": "search_phase_execution_exception",
"reason": "all shards failed",
"phase": "query",
"grouped": true,
"failed_shards": [
{
"shard": 0,
"index": "index1",
"node": "1aK..",
"reason": {
"type": "query_shard_exception",
"reason": "failed to create query: {...}",
"index_uuid": "1",
"index": "index1",
"caused_by": {
"type": "illegal_state_exception",
"reason": "[nested] failed to find nested object under path [components]"
}
}
}
]
},
"status": 400
}
I understand that since sub2 is not present in all the objects under components, this error is being thrown. I am looking for a way to search such scenarios such that it matches and finds all the objects in the array. If a value is matched, then this doc should get returned.
Can someone help me to get this working.
You made mistake while defining your schema, below schema works fine, Note I just defined key3 as nested. and changed the nested path to key3
Index def
{
"mappings": {
"properties": {
"key1": {
"type": "keyword"
},
"key2": {
"type": "keyword"
},
"key3": {
"type": "nested"
}
}
}
}
Index you sample doc without any change
{
"key1": "val1",
"key2": "val2",
"key3": {
"components": [ --> this was a diff
{
"sub1": "subval11",
"sub3": "subval13"
},
{
"sub1": "subval21",
"sub2": "subval22",
"sub3": "subval23"
},
{
"sub1": "subval31",
"sub2": "subval32",
"sub3": "subval33"
}
]
}
}
Searching with your criteria
{
"query": {
"nested": {
"path": "key3", --> note this
"query": {
"bool": {
"must": [
{
"match": {
"key3.components.sub2": "subval22" --> note this
}
}
]
}
}
}
}
}
This brings the proper search result
"hits": [
{
"_index": "so_nested_61200509",
"_type": "_doc",
"_id": "2",
"_score": 0.2876821,
"_source": {
"key1": "val1",
"key2": "val2",
"key3": {
"components": [ --> note this
{
"sub1": "subval11",
"sub3": "subval13"
},
{
"sub1": "subval21",
"sub2": "subval22",
"sub3": "subval23"
},
{
"sub1": "subval31",
"sub2": "subval32",
"sub3": "subval33"
}
]
Edit:- Based on the comment from OP, updated sample doc, search query and result.

Elasticsearch 6.1 multi index search with nested fields issue

I run a multi index search (elasticsearch 6.1.1) in 2 indexes with nested field,
"uniqueID" is a nested field that exist only in person index.
"pobox" is a nested field that exist only in adress index
I am getting error:
"index": "adress", "[nested] failed to find nested object under path [uniqueID]"
"index": "person", "[nested] nested object under path [pobox] is not of nested type"
In my query I search in person index for field uniqueID, why I am getting error for pobox field that exist only in adress index. Same for search in adress index it look for uniqueId field that exist only in person index
POST http://locahost:9200/person,adress/_search
{
"query": {
"bool": {
"should": [
{
"bool": {
"must": [
{
"terms": {
"_index": [
"person"
]
}
},
{
"nested": {
"path": "uniqueID",
"query": {
"span_near": {
"clauses": [
{
"span_term": {
"uniqueID.uniqueID.auto": "1"
}
}
],
"slop": 3,
"in_order": true
}
}
}
}
]
}
},
{
"bool": {
"must": [
{
"terms": {
"_index": [
"adress"
]
}
},
{
"nested": {
"path": "pobox",
"query": {
"span_near": {
"clauses": [
{
"span_term": {
"pobox.pobox.auto": "1"
}
}
],
"slop": 3,
"in_order": true
}
}
}
}
]
}
}
]
}
}
}
Error
{
"error": {
"root_cause": [
{
"type": "query_shard_exception",
"reason": "failed to create query: { my_query }",
"index_uuid": "9Z0W-P9ZS02kJ7WmOKHPVQ",
"index": "adress"
},
{
"type": "query_shard_exception",
"reason": "failed to create query: { my_query }",
"index_uuid": "EHoxKGhdSmKoYdNgsylotw",
"index": "person"
}
],
"type": "search_phase_execution_exception",
"reason": "all shards failed",
"phase": "query",
"grouped": true,
"failed_shards": [
{
"shard": 0,
"index": "adress",
"node": "AEhiq0wvQTGh468sSmDN5g",
"reason": {
"type": "query_shard_exception",
"reason": "failed to create query: { my_query }",
"index_uuid": "9Z0W-P9ZS02kJ7WmOKHPVQ",
"index": "adress",
"caused_by": {
"type": "illegal_state_exception",
"reason": "[nested] failed to find nested object under path [uniqueID]"
}
}
},
{
"shard": 0,
"index": "person",
"node": "AEhiq0wvQTGh468sSmDN5g",
"reason": {
"type": "query_shard_exception",
"reason": "failed to create query: { my_query }",
"index_uuid": "EHoxKGhdSmKoYdNgsylotw",
"index": "person",
"caused_by": {
"type": "illegal_state_exception",
"reason": "[nested] nested object under path [pobox] is not of nested type"
}
}
}
]
},
"status": 400
}
Well, the nested structure you try to query on one index doesn't exist on the other, so what do you expect?
IMO you should split the queries up and use _msearch if you always need those together.

update a particular field of elasticsearch document

Hi I am trying to update documents a elasticsearch which meets specific criteria. I am using google sense(chrome extension) for making request. The request that I am making is as shown below:
GET styling_rules2/product_line_filters/_update
{
"query": {
"filtered": {
"query": {
"bool": {
"should": [
{"term":{"product_line_attribute": "brand"}}
],
"minimum_should_match": 1
}
},
"filter": {
"term": {
"product_line_name": "women_skirts"
}
}
}
},
"script" : "ctx._source.brand=brands"
}
sample document is as shown below:
{
"product_line_attribute_db_path": "product_filter.brand",
"product_line_attribute": "brand",
"product_line_name": "women_skirts",
"product_line_attribute_value_list": [
"vero moda",
"faballey",
"only",
"rider republic",
"dorothy perkins"
]
}
desired result: update all the document which has product_line_attribute="brand" and product_line_name="women_skirts" to product_line_attribute="brands".
problem: I am getting the error as follows:
{
"error": {
"root_cause": [
{
"type": "search_parse_exception",
"reason": "failed to parse search source. unknown search element [script]",
"line": 18,
"col": 4
}
],
"type": "search_phase_execution_exception",
"reason": "all shards failed",
"phase": "query",
"grouped": true,
"failed_shards": [
{
"shard": 0,
"index": "styling_rules2",
"node": "2ijp1pXwT46FN4on4-JPlg",
"reason": {
"type": "search_parse_exception",
"reason": "failed to parse search source. unknown search element [script]",
"line": 18,
"col": 4
}
}
]
},
"status": 400
}
thanks in advance!
You should use the _update_by_query endpoint and not _update. Also the script section is not correct, which is probably why you're getting a class_cast_exception.
Try this instead:
POST styling_rules2/product_line_filters/_update_by_query
{
"query": {
"filtered": {
"query": {
"bool": {
"should": [
{
"term": {
"product_line_attribute": "brand"
}
}
],
"minimum_should_match": 1
}
},
"filter": {
"term": {
"product_line_name": "women_skirts"
}
}
}
},
"script": {
"inline": "ctx._source.brand=brands"
}
}

ElasticSearch - failed to parse search source. expected field name but got [START_OBJECT]

I can't figure out what's wrong in my ES query.
I want to filter on a specific field, and also sort by other field.
Request:
GET /_search
{
"query" : {
"term": {
"_type" : "monitor"
},
"filtered" : {
"filter" : { "term" : { "ProcessName" : "myProc" }}
}
},
"sort": { "TraceDateTime": { "order": "desc", "ignore_unmapped": "true" }}
}
Response:
{
"error": {
"root_cause": [
{
"type": "parse_exception",
"reason": "failed to parse search source. expected field name but got [START_OBJECT]"
}
],
"type": "search_phase_execution_exception",
"reason": "all shards failed",
"phase": "query",
"grouped": true,
"failed_shards": [
{
"shard": 0,
"index": ".kibana",
"node": "94RPDCjhQh6eoTe6XoRmSg",
"reason": {
"type": "parse_exception",
"reason": "failed to parse search source. expected field name but got [START_OBJECT]"
}
}
]
},
"status": 400
}
You have a syntax error in your query, you need to enclose both of your term queries inside a bool/must compound query, it needs to be like this:
POST /_search
{
"query": {
"filtered": {
"filter": {
"bool": {
"must": [
{
"term": {
"ProcessName": "myProc"
}
},
{
"term": {
"_type": "monitor"
}
}
]
}
}
}
},
"sort": {
"TraceDateTime": {
"order": "desc",
"ignore_unmapped": "true"
}
}
}
PS: Always use POST when sending a payload in your query.

Resources