Incorrect month in Elasticsearch date_histogram - elasticsearch

My document looks like this:
{
"_index": "rep_cdr",
"_type": "doc",
"_id": "TaArd2YBDRXNehCp7GmW",
"_score": 1,
"_source": {
"level": "info",
"@version": "1",
"thirdPartyTime": 139,
"date": "15-10-2018",
"time": "15:00:59",
"reqId": "25718d6e-b8ef-438d-8218-1a8726c6c816",
"TAT": 1574,
"message": "",
"thirdPartyErrorDescription": "",
"@timestamp": "2018-10-15T10:00:59.146Z"
}
}
And I am running the following query:
GET rep_cdr/doc/_search
{
"size": 0,
"aggs": {
"datewise": {
"date_histogram": {
"field": "date",
"interval": "day"
}
}
}
}
I am getting the result below:
{
"aggregations": {
"datewise": {
"buckets": [
{
"key_as_string": "15-01-2018",
"key": 1515974400000,
"doc_count": 8
}
]
}
}
}
The index mapping is as follows:
{
"rep_cdr": {
"aliases": {},
"mappings": {
"doc": {
"dynamic_date_formats": [
"DD-MM-YYYY",
"HH:mm:ss",
"yyyy-MM-dd'T'HH:mm:ss.SSS'Z'"
],
"properties": {
"@timestamp": {
"type": "date",
"format": "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'"
},
"@version": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"TAT": {
"type": "integer"
},
"date": {
"type": "date",
"format": "DD-MM-YYYY"
},
"level": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"message": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 400
}
}
},
"reqId": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"response": {
"type": "keyword"
},
"thirdPartyErrorDescription": {
"type": "text"
},
"thirdPartyTime": {
"type": "integer"
},
"time": {
"type": "date",
"format": "HH:mm:ss"
}
}
}
},
"settings": {
"index": {
"creation_date": "1539236694553",
"number_of_shards": "3",
"number_of_replicas": "1",
"uuid": "BYDQOhY_TbWhuqMAOA3iNw",
"version": {
"created": "6040099"
},
"provided_name": "rep_cdr"
}
}
}
}
The "key_as_string" gives me the wrong month. In the document, the date field has the value "15-10-2018", but "key_as_string" gives me "15-01-2018". I am using Elasticsearch version 6.4. What could be wrong?

Your date field format is set to DD-MM-YYYY, where D means day of year, as described at https://www.joda.org/joda-time/apidocs/org/joda/time/format/DateTimeFormat.html. Change the format to dd-MM-yyyy instead (d is day of month) and it should work as expected.
What you are seeing in the response is the 15th day of the year, i.e. 15-01-2018.
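As a quick sanity check of that explanation, here is a minimal Python sketch. It uses only the bucket key and the date string from the question; the variable names are made up for illustration. It decodes the bucket key and compares it with the "day 15 of the year" reading and with the intended day-of-month reading.

```python
from datetime import date, datetime, timedelta, timezone

# Bucket key from the aggregation response in the question (epoch milliseconds).
bucket_key_ms = 1515974400000
bucket_day = datetime.fromtimestamp(bucket_key_ms / 1000, tz=timezone.utc).date()

# Day-of-year reading: "15" taken as the 15th day of 2018 (the D pattern).
day_of_year_15 = date(2018, 1, 1) + timedelta(days=15 - 1)

# Day-of-month reading: what a dd-MM-yyyy style pattern is meant to produce.
intended = datetime.strptime("15-10-2018", "%d-%m-%Y").date()

print("bucket key decodes to:", bucket_day)
print("15th day of 2018:     ", day_of_year_15)
print("intended date:        ", intended)
```

The bucket key matches the day-of-year reading rather than the intended date, which is consistent with the format being the cause.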

Related

How to perform nested aggregation in child parent relationship

I am using Elasticsearch 7.11 and have implemented a parent-child relation. One of the main reasons was that my updates are very frequent and a new child can be added under a parent at any time.
My project manages all the computers in a network; all activity related to the endpoints should be logged for analytics purposes.
My mapping is something like this:
PcInformation -> User
A PC has its own information (the main thing to note is the activationTime), and the user has its department, username, role, etc.
Now I want to get the top departments with respect to PCs and their time.
Say I want to know which departments had the most PCs in 2020.
What I am currently doing is first getting all the PCs through the user relationship, using the has_child query below.
{
"query": {
"bool": {
"filter": [
{
"has_child": {
"type": "user",
"query": {
"nested": {
"path": "user",
"query": {
"match_all": {}
}
}
}
}
},
{
"range": {
"regDate": {
"gte": "2020-04-11",
"lte": "2022-04-31"
}
}
}
]
}
}
}
This returns all the PCs in a specific time range.
Then I perform an aggregation on user first, followed by a sub-aggregation on the pcConnection data for the time-based aggregation. Now I want to know the name of the department, but this is not in the PC information.
One option is to put the user information in the PC document, but then I would lose what I am using the parent-child model for.
Is there any way to do this?
Updated
The Sample Mapping
{
"pcinformation": {
"mappings": {
"properties": {
"_class": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"user": {
"type": "nested",
"properties": {
"userGroup": {
"type": "keyword"
},
"userTeam": {
"type": "keyword"
},
"userCode": {
"type": "long"
},
"userName": {
"type": "keyword"
}
}
},
"antivirus": {
"type": "nested",
"properties": {
"datetime": {
"type": "date"
},
"name": {
"type": "keyword"
}
}
},
"cpuId": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"domainName": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"firewall": {
"type": "nested",
"properties": {
"datetime": {
"type": "date"
},
"status": {
"type": "keyword"
}
}
},
"friendlyName": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"activationDate": {
"type": "date"
},
"macId": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"osArch": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"osType": {
"type": "keyword"
},
"osVersion": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"pcSignature": {
"type": "text"
},
"pcSignatureHash": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"relation": {
"type": "join",
"eager_global_ordinals": true,
"relations": {
"infection": [
"user"
]
}
},
"userName": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"vm": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
}
}
}
Since this is a parent-child model, I have two records. The first one is:
{
"_index": "pcInformation",
"_type": "_doc",
"_id": "abcd",
"_version": 1,
"_score": 1,
"_source": {
"_class": "stor.doc.pcInformation",
"pcSignatureHash": "abcd",
"pcSignature": "dddd",
"name": "DESKTOP8JGBPB9",
"userName": "Win1064",
"osType": "Windows.10.Enterprise",
"domainName": "DESKTOP8JGBPB9",
"cpuId": "NOCPUID",
"osVersion": "10.0.19042",
"osArch": "32",
"macId": "0800278A763D",
"activationDate": "2021-05-25T08:46:30.510Z",
"vm": "No VM",
"friendlyName": "Windows Defender",
"relation": {
"name": "pcInformation"
}
}
}
The other one is user information.
{
"_index": "pcInformation",
"_type": "_doc",
"_id": "Qw60onkBDTnt1BMJOeq0",
"_version": 1,
"_score": 1,
"_routing": "abcd",
"_source": {
"_class": "stor.doc.pcInformation",
"agent": {
"userCode": 1,
"userGroup":"admin",
"userRole":"manager"
},
"relation": {
"name": "user",
"parent": "abcd"
}
}
}

ElasticSearch painless, how can I access an array in _source

I am trying to execute a search request against the Elasticsearch (6.4.0) API that includes a custom script function. In this function I try to access an array which should be part of the response data, but I always get a 'null_pointer_exception':
{
"error": {
"root_cause": [
{
"type": "script_exception",
"reason": "runtime error",
"script_stack": [
"i = 0; i < params['_source']['userStats'].length; i++) { } ",
" ^---- HERE"
],
"script": "double scoreBoost = 1; for (int i = 0; i < params['_source']['userStats'].length; i++) { } return _score * Math.log1p(scoreBoost);",
"lang": "painless"
}
],
"type": "search_phase_execution_exception",
"reason": "all shards failed",
"phase": "query",
"grouped": true,
"failed_shards": [
{
"shard": 0,
"index": "search--project",
"node": "...",
"reason": {
"type": "script_exception",
"reason": "runtime error",
"script_stack": [
"i = 0; i < params['_source']['userStats'].length; i++) { } ",
" ^---- HERE"
],
"script": "double scoreBoost = 1; for (int i = 0; i < params['_source']['userStats'].length; i++) { } return _score * Math.log1p(scoreBoost);",
"lang": "painless",
"caused_by": {
"type": "null_pointer_exception",
"reason": null
}
}
}
]
},
"status": 500
}
doc['userStats'] doesn't work either.
Here is the complete request body:
{
"size": 10,
"query": {
"function_score": {
"query": {
"bool": {
"filter": {
"term": {
"_routing": "00000000-0000-0000-0000-000000000000"
}
},
"should": {
"query_string": {
"query": "123*",
"default_operator": "and",
"fuzziness": 1,
"analyze_wildcard": true,
"fields": [
"name^4",
"number^2",
"description",
"projectTypeId",
"projectStatusId",
"tags^1.5",
"company.name^2",
"company.number^2",
"company.industry",
"company.tags^1.5",
"company.companyTypes.name",
"company.companyContactInfos.value",
"company.companyContactInfos.addressLine1^1.25",
"company.companyContactInfos.addressLine2",
"company.companyContactInfos.zipCode^0.5",
"company.companyContactInfos.city",
"company.companyContactInfos.state",
"company.companyContactInfos.country",
"projectType.id",
"projectType.name",
"projectType.description",
"projectStatus.id",
"projectStatus.name",
"members.name",
"members.projectRoleName"
]
}
},
"minimum_should_match": 1
}
},
"functions": [
{
"script_score": {
"script": {
"source": "double scoreBoost = 1; for (int i = 0; i < params['_source']['userStats'].length; i++) { } return _score * Math.log1p(scoreBoost);",
"lang": "painless",
"params": {
"dtNow": 1543589276,
"uId": "00000000-0000-0000-0000-000000000000"
}
}
}
}
],
"score_mode": "multiply"
}
}
}
Without the script_score part, the response looks like this:
{
"took": 0,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 4,
"hits": [
{
"_index": "search--project",
"_type": "projectentity",
"_id": "00000000-0000-0000-0000-000000000000",
"_score": 4,
"_routing": "00000000-0000-0000-0000-000000000000",
"_source": {
"name": "123",
"description": "123...",
"projectTypeId": "00000000-0000-0000-0000-000000000000",
"projectStatusId": "00000000-0000-0000-0000-000000000000",
"tags": [
"232",
"2331",
"343"
],
"plannedDuration": 0,
"startDate": "2018-07-09T22:00:00Z",
"projectType": {
"id": "00000000-0000-0000-0000-000000000000",
"name": "test 1",
"icon": "poll"
},
"projectStatus": {
"id": "00000000-0000-0000-0000-000000000000",
"name": "In Progress",
"type": "progress"
},
"members": [
{
"userId": "00000000-0000-0000-0000-000000000000",
"name": "dummy",
"projectRoleName": "test",
"hasImage": false
},
{
"userId": "00000000-0000-0000-0000-000000000000",
"name": "dummy ",
"projectRoleName": "Manager",
"hasImage": false
}
],
"id": "00000000-0000-0000-0000-000000000000",
"userStats": [
{
"userId": "00000000-0000-0000-0000-000000000000",
"openCount": 55,
"lastOpened": 1543851773
},
{
"userId": "00000000-0000-0000-0000-000000000000",
"openCount": 9,
"lastOpened": 1542372179
}
],
"indexTime": "2018-12-03T15:42:53.157649Z"
}
}
]
}
}
The mapping looks like this:
{
"search--project": {
"aliases": {},
"mappings": {
"projectentity": {
"_routing": {
"required": true
},
"properties": {
"company": {
"properties": {
"companyTypes": {
"properties": {
"icon": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"id": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"indexTime": {
"type": "date"
},
"name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
},
"id": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"indexTime": {
"type": "date"
},
"name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"number": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
},
"id": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"members": {
"properties": {
"hasImage": {
"type": "boolean"
},
"name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"projectRoleName": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"userId": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
},
"name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
...
"plannedDuration": {
"type": "long"
},
"startDate": {
"type": "date"
},
"userStats": {
"properties": {
"lastOpened": {
"type": "long"
},
"openCount": {
"type": "long"
},
"userId": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
}
}
}
},
"settings": {
"index": {
"creation_date": "1539619646426",
"number_of_shards": "5",
"number_of_replicas": "1",
"uuid": "G5ohN1FvQBGkYFh_803Ifw",
"version": {
"created": "6040299"
},
"provided_name": "search--project"
}
}
}
}
Does anyone have a suggestion about what I'm doing wrong?
Thanks.
In the query, your params are named dtNow and uId. If you want to use those in your script, reference them as params.dtNow and params.uId.
If you want to use a property from your _source (i.e. userStats), you should be using ctx._source.userStats.length (note that ctx._source belongs to update-style script contexts; for function_score, see the edit below). There are more examples in the documentation: https://www.elastic.co/guide/en/elasticsearch/painless/current/painless-examples.html.
EDIT
With a function_score query you'll be using either the doc map or params['_source']. The difference is that values accessed through doc are cached in memory, whereas params['_source'] is loaded each time (see https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-script-fields.html). The other caveat is that doc only works with a non-analyzed field, or with an analyzed text field if fielddata is enabled: https://www.elastic.co/guide/en/elasticsearch/reference/master/modules-scripting-fields.html#modules-scripting-doc-vals
To achieve what you're trying to do, you should be able to just use a non-analyzed field in your userStats object. Something like doc['userStats.openCount'].length (here I'm assuming that openCount is required on your userStats object).
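A minimal Python sketch of how the script_score entry could look with the doc-values approach, assuming the request is assembled as a plain dict before being sent. The helper name build_script_score and the loop body that sums openCount are hypothetical (the original script left the loop empty), and the script has not been run against a cluster. Keep in mind that userStats is a plain object array in the mapping, so doc values for userStats.openCount are not tied to a particular userId.

```python
import json

# Painless source that reads doc values instead of params['_source'].
# The loop body (adding up openCount) is a hypothetical illustration only.
SCRIPT_SOURCE = (
    "double scoreBoost = 1; "
    "for (int i = 0; i < doc['userStats.openCount'].length; i++) { "
    "scoreBoost += doc['userStats.openCount'][i]; "
    "} "
    "return _score * Math.log1p(scoreBoost);"
)


def build_script_score(dt_now: int, user_id: str) -> dict:
    """Return one entry for the "functions" array of a function_score query."""
    return {
        "script_score": {
            "script": {
                "source": SCRIPT_SOURCE,
                "lang": "painless",
                # Same parameter names as in the question's request body.
                "params": {"dtNow": dt_now, "uId": user_id},
            }
        }
    }


function_entry = build_script_score(
    1543589276, "00000000-0000-0000-0000-000000000000"
)
print(json.dumps(function_entry, indent=2))
```

The printed JSON can replace the script_score element of the request body shown in the question.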

Elasticsearch Sorting fields anomaly

I am trying to sort a list on certain fields, firstName and lastName, but I have noticed some inconsistent results.
I am running a simple query:
//Return all the employees from a specific company ordering by lastName asc | desc
GET employee-index-sorting
{
"query": {
"bool": {
"filter": {
"term": {
"companyId": 3179
}
}
}
},
"sort": [
{
"lastName.keyword": { <-- Should this be keyword? or not_analyzed
"order": "desc"
}
}
]
}
In the result, why would van der Mescht and van Breda come before Zwane and Zwezwe?
I suspect there is something wrong with my mappings.
{
"_index": "employee-index",
"_type": "_doc",
"_id": "637467",
"_score": null,
"_source": {
"companyId": 3179,
"firstName": "Name",
"lastName": "van der Mescht",
},
"sort": [
"van der Mescht"
]
},
{
"_index": "employee-index",
"_type": "_doc",
"_id": "678335",
"_score": null,
"_source": {
"companyId": 3179,
"firstName": "Name3",
"lastName": "van Breda",
},
"sort": [
"van Breda"
]
},
{
"_index": "employee-index",
"_type": "_doc",
"_id": "113896",
"_score": null,
"_source": {
"companyId": 3179,
"firstName": "Name2",
"lastName": "Zwezwe",
},
"sort": [
"Zwezwe"
]
},
{
"_index": "employee-index",
"_type": "_doc",
"_id": "639639",
"_score": null,
"_source": {
"companyId": 3179,
"firstName": "Name1",
"lastName": "Zwane",
},
"sort": [
"Zwane"
]
}
Mappings
I am posting the entire mapping because I am not sure whether something else might be wrong with it.
How should I change the lastName and firstName properties to allow sorting on them?
PUT employee-index-sorting
{
"settings": {
"index": {
"analysis": {
"filter": {},
"analyzer": {
"keyword_analyzer": {
"filter": [
"lowercase",
"asciifolding",
"trim"
],
"char_filter": [],
"type": "custom",
"tokenizer": "keyword"
},
"edge_ngram_analyzer": {
"filter": [
"lowercase"
],
"tokenizer": "edge_ngram_tokenizer"
},
"edge_ngram_search_analyzer": {
"tokenizer": "lowercase"
}
},
"tokenizer": {
"edge_ngram_tokenizer": {
"type": "edge_ngram",
"min_gram": 2,
"max_gram": 5,
"token_chars": [
"letter"
]
}
}
}
}
},
"mappings": {
"_doc": {
"properties": {
"employeeId": {
"type": "keyword"
},
"companyGroupId": {
"type": "keyword"
},
"companyId": {
"type": "keyword"
},
"number": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"preferredName": {
"type": "text",
"index": false
},
"firstName": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"middleName": {
"type": "text",
"index": false
},
"lastName": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"fullName": {
"type": "text",
"fields": {
"keywordstring": {
"type": "text",
"analyzer": "keyword_analyzer"
},
"edgengram": {
"type": "text",
"analyzer": "edge_ngram_analyzer",
"search_analyzer": "edge_ngram_search_analyzer"
}
},
"analyzer": "standard"
},
"terminationDate": {
"type": "date"
},
"companyName": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"email": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"idNumber": {
"type": "text"
},
"description": {
"type": "text",
"index": false
},
"jobNumber": {
"type": "keyword"
},
"frequencyId": {
"type": "long"
},
"frequencyCode": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"frequencyAccess": {
"type": "boolean"
}
}
}
}
}
For sorting you need to use lastName.keyword. That part is correct, so there is no need to change anything there.
The reason van der Mescht and van Breda come before Zwane and Zwezwe is that strings are sorted lexicographically, i.e. essentially by position in the ASCII table, where uppercase characters come before lowercase ones. In ascending order, names starting with an uppercase letter therefore come first. Since you're sorting in desc mode, you get exactly the opposite:
z...
...
van der Mescht
...
van Breda
...
a...
...
Zwezwe
...
Zwane
...
Z...
...
A...
To fix this, add a lowercase normalizer to your lastName.keyword field, i.e. change your settings and mapping as follows and it will work:
{
"settings": {
"index": {
"analysis": {
"filter": {},
"analyzer": {
...
},
"tokenizer": {
...
},
"normalizer": { <-- add this
"lowersort": {
"type": "custom",
"filter": [
"lowercase"
]
}
}
}
}
},
"mappings": {
"_doc": {
"properties": {
...
"lastName": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"normalizer": "lowersort", <-- add this
"ignore_above": 256
}
}
},
...
}
}
}
}
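The effect can be reproduced outside Elasticsearch with a small Python sketch. It assumes that plain string comparison in Python (by code point) is a fair stand-in for the keyword sort, and that a lowercased sort key is a fair stand-in for the lowersort normalizer; only the four last names from the question are used.

```python
# Last names taken from the search hits in the question.
last_names = ["Zwane", "van Breda", "Zwezwe", "van der Mescht"]

# Plain sort: compares code points, so every uppercase letter sorts
# before every lowercase one ('Z' is 90, 'v' is 118).
raw_desc = sorted(last_names, reverse=True)

# Lowercased sort key: stand-in for what a lowercase normalizer does
# to the value that is used for sorting.
normalized_desc = sorted(last_names, key=str.lower, reverse=True)

print("raw desc:       ", raw_desc)
print("normalized desc:", normalized_desc)
```

The raw descending order reproduces the order seen in the question, while the lowercased key puts Zwezwe and Zwane ahead of the "van ..." names.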

Negative values in Elasticsearch range queries

I found this problem while creating a watch in Elasticsearch. This is my query:
"body": {
"query": {
"bool": {
"must": [
{
"range": {
"percent": {
"lt": 100
}
It successfully returns every document with a percent between 0 and 99, but it ignores those with a negative value. The "percent" field is mapped as a long in the index.
Can you help me?
Thanks
Edit: Output of running "curl -XGET localhost:9200/monthly-tickets-2018-06":
{
"monthly-tickets-2018-06": {
"aliases": {},
"mappings": {
"monthly_tickets": {
"properties": {
"percent": {
"type": "long"
},
"priority": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"project": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"ref": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"timestamp": {
"type": "date"
}
}
}
},
"settings": {
"index": {
"creation_date": "1528946562231",
"number_of_shards": "5",
"number_of_replicas": "1",
"uuid": "aIfLjFwqS_aCzQFvZm0L5Q",
"version": {
"created": "6020399"
},
"provided_name": "monthly-tickets-2018-06"
}
}
}
}

Elastic search top_hits aggregation on nested

I have an index that contains CustomerProfile documents. In each document, the CustomerInsightTargets property (with the properties Source and Value) can be an array with any number of items. What I am trying to achieve is an autocomplete (top 5) on CustomerInsightTargets.Value, grouped by CustomerInsightTargets.Source.
It would be helpful if anyone could give me a hint about how to select only a subset of nested objects from each document and use those nested objects in aggregations.
{
"customerinsights": {
"aliases": {},
"mappings": {
"customerprofile": {
"properties": {
"CreatedById": {
"type": "long"
},
"CreatedDateTime": {
"type": "date"
},
"CustomerInsightTargets": {
"type": "nested",
"properties": {
"CustomerInsightSource": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"CustomerInsightValue": {
"type": "text",
"term_vector": "yes",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
},
"analyzer": "ngram_tokenizer_analyzer"
},
"CustomerProfileId": {
"type": "long"
},
"Guid": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"Id": {
"type": "long"
}
}
},
"DisplayName": {
"type": "text",
"term_vector": "yes",
"analyzer": "ngram_tokenizer_analyzer"
},
"Email": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"Id": {
"type": "long"
},
"ImageUrl": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
}
},
"settings": {
"index": {
"number_of_shards": "1",
"provided_name": "customerinsights",
"creation_date": "1484860145041",
"analysis": {
"analyzer": {
"ngram_tokenizer_analyzer": {
"type": "custom",
"tokenizer": "ngram_tokenizer"
}
},
"tokenizer": {
"ngram_tokenizer": {
"type": "nGram",
"min_gram": "1",
"max_gram": "10"
}
}
},
"number_of_replicas": "2",
"uuid": "nOyI0O2cTO2JOFvqIoE8JQ",
"version": {
"created": "5010199"
}
}
}
}
}
Take this document as an example:
{
{
"Id": 9072856,
"CreatedDateTime": "2017-01-12T11:26:58.413Z",
"CreatedById": 9108469,
"DisplayName": "valentinos",
"Email": "valentinos@mail.com",
"CustomerInsightTargets": [
{
"Id": 160,
"CustomerProfileId": 9072856,
"CustomerInsightSource": "Tags",
"CustomerInsightValue": "Tag1",
"Guid": "00000000-0000-0000-0000-000000000000"
},
{
"Id": 160,
"CustomerProfileId": 9072856,
"CustomerInsightSource": "ProfileName",
"CustomerInsightValue": "valentinos",
"Guid": "00000000-0000-0000-0000-000000000000"
},
{
"Id": 160,
"CustomerProfileId": 9072856,
"CustomerInsightSource": "Playground",
"CustomerInsightValue": "Wiki",
"Guid": "00000000-0000-0000-0000-000000000000"
}
]
}
}
If I run a top_hits aggregation, the result includes all targets from a document as soon as one of them matches my search text.
Example:
GET customerinsights/_search
{
"query": {
"bool": {
"must": [
{
"nested": {
"path": "CustomerInsightTargets",
"query": {
"bool": {
"must": [
{
"match": {
"CustomerInsightTargets.CustomerInsightValue": {
"query": "2017",
"operator": "AND",
"fuzziness": 2
}
}
}
]
}
}
}
}
]
}
} ,
"aggs": {
"root": {
"nested": {
"path": "CustomerInsightTargets"
},
"aggs": {
"top_tags": {
"terms": {
"field": "CustomerInsightTargets.CustomerInsightSource.keyword"
},
"aggs": {
"top_tag_hits": {
"top_hits": {
"sort": [
{
"_score": {
"order": "desc"
}
}
],
"size": 5,
"_source": "CustomerInsightTargets"
}
}
}
}
}
}
},
"size": 0,
"_source": "CustomerInsightTargets"
}
My question is how I should use the aggregation to get the "autocomplete" values grouped by Source and ordered by _score. I tried a significant_terms aggregation, but it doesn't work very well. A terms aggregation doesn't sort by score either (only by _count), and the fuzzy matching adds further complexity.
