elasticsearch multiple aggregations not working

I'm fighting with elasticsearch aggregations and could use some advice.
elasticsearch version:
Version: 1.4.1, Build: 89d3241/2014-11-26T15:49:29Z, JVM: 1.7.0_72
sample dataset:
"_index": "logstash-2014.12.17",
"_type": "netflow",
"_id": "AUpaDdUVUcM5Us_C6x7Z",
"_score": 1,
"_source": {
"message": "<27>Dec 17 22:01:02 es01 nfcapd[29441]: expip= fweventtime=2014-12-17 22:01:02.793 fwevent=DENIED srcip= dstip= srcport=62327 dstport=41863 proto=UDP input=3 output=4 inbytes=0 outbytes=0 postnatsrcip= postnatdstip= postnatsrcport=62327 postnatdstport=41863 ingressacl=0x45b0635e/0x9872d678/0x724bf9a4 egressacl=0x0/0x0/0x0",
"@version": "1",
"@timestamp": "2014-12-17T21:01:02.794Z",
"type": "netflow",
"host": "",
"timestamp": "Dec 17 22:01:02",
"hostname": "es01",
"expip": "",
"time": "2014-12-17 22:01:02.793",
"fwevent": "DENIED",
"srcip": "",
"dstip": "",
"srcport": "62327",
"dstport": "41863",
"proto": "UDP",
"output": "4",
"inbytes": "0",
"outbytes": "0",
"postnatsrcip": "",
"postnatdstip": "",
"postnatsrcport": "62327",
"postnatdstport": "41863",
"ingressacl1": "0x45b0635e",
"ingressacl2": "0x9872d678",
"ingressacl3": "0x724bf9a4",
"egressacl1": "0x0",
"egressacl2": "0x0",
"egressacl3": "0x0",
"srcgeo": {
"country_code3": "CHE",
"latitude": 47,
"longitude": 8,
"location": [
sample query:
GET _search
{
  "size": 1,
  "query": {
    "filtered": {
      "filter": {
        "range": {
          "@timestamp": {
            "gt": "2014-12-17T21:00:00"
          }
        }
      }
    }
  },
  "aggs": {
    "proto": {
      "terms": {
        "field": "proto"
      },
      "aggs": {
        "traffic_sum": {
          "sum": {
            "field": "outbytes"
          }
        }
      }
    }
  }
}
results in error:
"error": "SearchPhaseExecutionException[Failed to execute phase [query], all shards failed;
shardFailures {[jJZG3gX7QlujjG4ZXttyRA][logstash-2014.12.17][0]:
ClassCastException[org.elasticsearch.index.fielddata.plain.PagedBytesIndexFieldData cannot be cast to org.elasticsearch.index.fielddata.IndexNumericFieldData]}{[8Rz-FI7JSvebgBdGG9zOkA][logstash-2014.12.17][1]:
nested: ClassCastException[org.elasticsearch.index.fielddata.plain.PagedBytesIndexFieldData cannot be cast to org.elasticsearch.index.fielddata.IndexNumericFieldData]; }{[8Rz-FI7JSvebgBdGG9zOkA][logstash-2014.12.17][2]:
nested: ClassCastException[org.elasticsearch.index.fielddata.plain.PagedBytesIndexFieldData cannot be cast to org.elasticsearch.index.fielddata.IndexNumericFieldData]; }{[jJZG3gX7QlujjG4ZXttyRA][logstash-2014.12.17][3]:
ClassCastException[org.elasticsearch.index.fielddata.plain.PagedBytesIndexFieldData cannot be cast to org.elasticsearch.index.fielddata.IndexNumericFieldData]}{[jJZG3gX7QlujjG4ZXttyRA][logstash-2014.12.17][4]:
ClassCastException[org.elasticsearch.index.fielddata.plain.PagedBytesIndexFieldData cannot be cast to org.elasticsearch.index.fielddata.IndexNumericFieldData]}]",
"status": 500
It works fine with just one aggregation but fails as soon as I add the second aggregation.
Any ideas?

This is the important part of the error:
cannot be cast to org.elasticsearch.index.fielddata.IndexNumericFieldData]}]"
You're attempting to do a sum on a string field. This field is the problem:
"outbytes": "0",
You have three options:
1. Delete your existing data and create a numeric field type by indexing a document containing "outbytes": 0 (note the lack of quotation marks).
2. Delete your existing data and create an explicit mapping with the field outbytes set to a numeric type.
3. Keep your data but update the aggregation to call a script that does the string-to-number conversion.
My recommendation would be to go for option 2.
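As a rough illustration of the explicit-mapping route, the sketch below builds a mapping body that declares the byte counters as numbers and converts the string values in a document before it is indexed. The field list, the choice of "long" and both helper names are assumptions made for this example; they are not taken from the original setup, and sending the mapping to the cluster is left out.

```python
# Hypothetical list of netflow fields that should be numeric; adjust to the real data.
NUMERIC_FIELDS = ["inbytes", "outbytes"]


def build_mapping(doc_type, numeric_fields):
    """Return a mapping body that declares each listed field as a long."""
    properties = {field: {"type": "long"} for field in numeric_fields}
    return {doc_type: {"properties": properties}}


def coerce_numeric(doc, numeric_fields):
    """Return a copy of doc with the listed string fields converted to int."""
    converted = dict(doc)
    for field in numeric_fields:
        value = converted.get(field)
        # Only touch non-empty strings; anything else is left unchanged.
        if isinstance(value, str) and value.strip():
            converted[field] = int(value)
    return converted


mapping = build_mapping("netflow", NUMERIC_FIELDS)
doc = coerce_numeric({"proto": "UDP", "outbytes": "0"}, NUMERIC_FIELDS)
```

Once outbytes is stored as a number, the sum sub-aggregation no longer has to cast string field data.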


elasticsearch filebeat mapper_parsing_exception when using decode_json_fields

I have ECK set up and I'm using Filebeat to ship logs from Kubernetes to Elasticsearch.
I've recently added the decode_json_fields processor to my configuration so that I can decode the JSON that is usually in the message field.
- decode_json_fields:
    fields: ["message"]
    process_array: false
    max_depth: 10
    target: "log"
    overwrite_keys: true
    add_error_key: true
However logs have stopped appearing since adding it.
example log:
"_index": "filebeat-7.9.1-2020.10.01-000001",
"_type": "_doc",
"_id": "wF9hB3UBtUOF3QRTBcts",
"_score": 1,
"_source": {
"@timestamp": "2020-10-08T08:43:18.672Z",
"kubernetes": {
"labels": {
"controller-uid": "9f3f9d08-cfd8-454d-954d-24464172fa37",
"job-name": "stream-hatchet-cron-manual-rvd"
"container": {
"name": "stream-hatchet-cron",
"image": "<redacted>.dkr.ecr.us-east-2.amazonaws.com/stream-hatchet:v0.1.4"
"node": {
"name": "ip-172-20-32-60.us-east-2.compute.internal"
"pod": {
"uid": "041cb6d5-5da1-4efa-b8e9-d4120409af4b",
"name": "stream-hatchet-cron-manual-rvd-bh96h"
"namespace": "default"
"ecs": {
"version": "1.5.0"
"host": {
"mac": [],
"hostname": "ip-172-20-32-60",
"architecture": "x86_64",
"name": "ip-172-20-32-60",
"os": {
"codename": "Core",
"platform": "centos",
"version": "7 (Core)",
"family": "redhat",
"name": "CentOS Linux",
"kernel": "4.9.0-11-amd64"
"containerized": false,
"ip": []
"cloud": {
"instance": {
"id": "i-06c9d23210956ca5c"
"machine": {
"type": "m5.large"
"region": "us-east-2",
"availability_zone": "us-east-2a",
"account": {
"id": "<redacted>"
"image": {
"id": "ami-09d3627b4a09f6c4c"
"provider": "aws"
"stream": "stdout",
"message": "{\"message\":{\"log_type\":\"cron\",\"status\":\"start\"},\"level\":\"info\",\"timestamp\":\"2020-10-08T08:43:18.670Z\"}",
"input": {
"type": "container"
"log": {
"offset": 348,
"file": {
"path": "/var/log/containers/stream-hatchet-cron-manual-rvd-bh96h_default_stream-hatchet-cron-73069980b418e2aa5e5dcfaf1a29839a6d57e697c5072fea4d6e279da0c4e6ba.log"
"agent": {
"type": "filebeat",
"version": "7.9.1",
"hostname": "ip-172-20-32-60",
"ephemeral_id": "6b3ba0bd-af7f-4946-b9c5-74f0f3e526b1",
"id": "0f7fff14-6b51-45fc-8f41-34bd04dc0bce",
"name": "ip-172-20-32-60"
"fields": {
"#timestamp": [
"suricata.eve.timestamp": [
In the Filebeat logs I can see the following error:
2020-10-08T09:25:43.562Z WARN [elasticsearch] elasticsearch/client.go:407 Cannot
index event
ext:63737745936, loc:(*time.Location)(nil)}, Meta:null,
Private:file.State{Id:"native::30998361-66306", PrevId:"",
Finished:false, Fileinfo:(*os.fileStat)(0xc001c14dd0),
Offset:539, Timestamp:time.Time{wall:0xbfd7d4a1e556bd72,
ext:916563812286, loc:(*time.Location)(0x607c540)}, TTL:-1,
Type:"container", Meta:map[string]string(nil),
FileStateOS:file.StateOS{Inode:0x1d8ff59, Device:0x10302},
IdentifierName:"native"}, TimeSeries:false}, Flags:0x1,
Cache:publisher.EventCache{m:common.MapStr(nil)}} (status=400):
{"type":"mapper_parsing_exception","reason":"failed to parse field
[log.message] of type [keyword] in document with id
'56aHB3UBLgYb8gz801DI'. Preview of field's value: '{log_type=cron,
get text on a START_OBJECT at 1:113"}}
It throws an error because apparently log.message is of type "keyword"; however, this field does not exist in the index mapping.
I thought this might be an issue with "target": "log", so I've tried changing it to something arbitrary like "my_parsed_message", "m_log" or "mlog", and I get the same error for all of them.
{"type":"mapper_parsing_exception","reason":"failed to parse field
[mlog.message] of type [keyword] in document with id
'J5KlDHUB_yo5bfXcn2LE'. Preview of field's value: '{log_type=cron,
get text on a START_OBJECT at 1:217"}}
Elastic version: 7.9.2
The problem is that some of your JSON messages contain a message field that is sometimes a simple string and other times a nested JSON object (like in the case you're showing in your question).
After this index was created, the very first message that was parsed was probably a string and hence the mapping has been modified to add the following field (line 10553):
"mlog": {
  "properties": {
    "message": {
      "type": "keyword",
      "ignore_above": 1024
    }
  }
}
You'll find the same pattern for my_parsed_message (line 10902), my_parsed_logs (line 10742), etc...
Hence the next message that comes with message being a JSON object, like
{"message":{"log_type":"cron","status":"start"}, ...
will not work because it's an object, not a string...
Looking at the fields of your custom JSON, it seems you don't really have control over either their taxonomy (i.e. naming) or what they contain.
If you really want to search within those custom fields (which I think you do, since you're parsing the field; otherwise you'd just store the stringified JSON), then I can only suggest working out a proper taxonomy so that each field always gets the same type.
If all you care about is logging your data, then I suggest simply disabling indexing of that message field. Another solution is to set dynamic: false in your mapping so that those fields are ignored, i.e. they do not modify your mapping.
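To make the "standard type" idea concrete, here is a minimal sketch of normalising the message field before a log line is written, so that it is always an object and never a bare string. It assumes you can change the application (or some pre-processing step) that emits the JSON; the function name and the "text" key are hypothetical, and nothing here is part of Filebeat itself.

```python
import json


def normalize_message(raw_line):
    """Decode a JSON log line so that `message` is always an object.

    A plain-string message is wrapped as {"text": <string>}; an object is
    kept as-is. The "text" key is a made-up name chosen for this sketch.
    """
    event = json.loads(raw_line)
    message = event.get("message")
    if isinstance(message, str):
        event["message"] = {"text": message}
    return event


plain = normalize_message('{"message": "job finished", "level": "info"}')
nested = normalize_message(
    '{"message":{"log_type":"cron","status":"start"},"level":"info"}'
)
```

With both shapes reduced to an object, the first document indexed can no longer pin message to keyword and cause later object-valued messages to be rejected.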

Elastic search Average time difference Aggregate Query

I have documents in Elasticsearch in which each document looks something like the following:
{
  "id": "T12890ADSA12",
  "status": "ENDED",
  "type": "SAMPLE",
  "updatedAt": "2020-05-29T18:18:08.483Z",
  "events": [
    {
      "event": "STARTED",
      "version": 1,
      "timestamp": "2020-04-30T13:41:25.862Z"
    },
    {
      "event": "INPROGRESS",
      "version": 2,
      "timestamp": "2020-05-14T17:03:09.137Z"
    },
    {
      "event": "INPROGRESS",
      "version": 3,
      "timestamp": "2020-05-17T17:03:09.137Z"
    },
    {
      "event": "ENDED",
      "version": 4,
      "timestamp": "2020-05-29T18:18:08.483Z"
    }
  ],
  "createdAt": "2020-04-30T13:41:25.862Z"
}
Now I want to write an Elasticsearch query that selects all documents of type "SAMPLE" and returns the average time between the STARTED and ENDED events across those documents, e.g. the average of (2020-05-29T18:18:08.483Z - 2020-04-30T13:41:25.862Z, ...). Assume that the STARTED and ENDED events are each present only once in the events array. Is there any way I can do that?
You can do something like this. The query selects the documents of type SAMPLE with status ENDED (to make sure there is an ENDED event). The avg aggregation then uses a script to find the STARTED and ENDED timestamps and returns the number of days between them:
POST test/_search
{
  "query": {
    "bool": {
      "filter": [
        { "term": { "status.keyword": "ENDED" } },
        { "term": { "type.keyword": "SAMPLE" } }
      ]
    }
  },
  "aggs": {
    "duration": {
      "avg": {
        "script": "Map findEvent(List events, String type) {return events.find(it -> it.event == type);} def started = Instant.parse(findEvent(params._source.events, 'STARTED').timestamp); def ended = Instant.parse(findEvent(params._source.events, 'ENDED').timestamp); return ChronoUnit.DAYS.between(started, ended);"
      }
    }
  }
}
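The script, formatted for readability, looks like this:
// Return the first entry of the events list whose event name matches.
Map findEvent(List events, String type) {
  return events.find(it -> it.event == type);
}
// Parse both timestamps and return the whole number of days between them.
def started = Instant.parse(findEvent(params._source.events, 'STARTED').timestamp);
def ended = Instant.parse(findEvent(params._source.events, 'ENDED').timestamp);
return ChronoUnit.DAYS.between(started, ended);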
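To sanity-check what the aggregation should return, the same arithmetic can be reproduced outside Elasticsearch. The sketch below is plain Python, not Painless; the helper names are made up for this example, and the second document is invented purely to show the averaging. It counts whole days, which matches the script for the case where ENDED comes after STARTED.

```python
from datetime import datetime

# Timestamp layout used in the sample document (UTC, millisecond precision).
TS_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def find_event(events, event_type):
    """Return the first event whose name matches event_type."""
    return next(e for e in events if e["event"] == event_type)


def duration_days(doc):
    """Whole days between the STARTED and ENDED events of one document."""
    started = datetime.strptime(find_event(doc["events"], "STARTED")["timestamp"], TS_FORMAT)
    ended = datetime.strptime(find_event(doc["events"], "ENDED")["timestamp"], TS_FORMAT)
    return (ended - started).days


def average_duration_days(docs):
    """Average of the per-document durations, like the avg aggregation."""
    durations = [duration_days(d) for d in docs]
    return sum(durations) / len(durations)


sample = {"events": [
    {"event": "STARTED", "timestamp": "2020-04-30T13:41:25.862Z"},
    {"event": "ENDED", "timestamp": "2020-05-29T18:18:08.483Z"},
]}
# Invented second document, only here to exercise the average.
other = {"events": [
    {"event": "STARTED", "timestamp": "2020-01-01T00:00:00.000Z"},
    {"event": "ENDED", "timestamp": "2020-01-02T12:00:00.000Z"},
]}
```

For the sample document in the question this gives 29 whole days; if you need finer resolution, the Painless script could use a smaller unit such as ChronoUnit.HOURS instead.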

Mapping exception while executing Rollup job

I have a Kibana instance which stores log data from our Java apps in daily indices, like logstash-java-beats-2019.09.01. Since the number of indices could become quite large in the future, I want to create a rollup job so that old logs can be archived in a separate index, something like logstash-java-beats-rollup. A typical document in one of these indices looks like this:
"_index": "logstash-java-beats-2019.10.01",
"_type": "_doc",
"_id": "C9mfhG0Bf_Fr5GBl6kTg",
"_version": 1,
"_score": 1,
"_source": {
"@timestamp": "2019-10-01T00:02:13.756Z",
"ecs": {
"version": "1.0.0"
"event_timestamp": "2019-10-01 00:02:13,756",
"log": {
"offset": 5729359,
"file": {
"path": "/var/log/application-name/application.log"
"tags": [
"loglevel": "WARN",
"java_class": "java.class.name",
"message": "Log message here",
"host": {
"name": "host-name-beat"
"@version": "1",
"agent": {
"hostname": "host-name",
"id": "a34af368-3359-495a-9775-63502693d148",
"ephemeral_id": "cc4afd3c-ad97-47a4-bd21-72255d450232",
"type": "filebeat",
"version": "7.2.0",
"name": "host-name-beat"
"input": {
"type": "log"
So I created a rollup job with the following config:
{
  "config": {
    "id": "Test 2 job",
    "index_pattern": "logstash-java-beats-2*",
    "rollup_index": "logstash-java-beats-rollup",
    "cron": "0 0 * * * ?",
    "groups": {
      "date_histogram": {
        "fixed_interval": "1000ms",
        "field": "@timestamp",
        "delay": "1d",
        "time_zone": "UTC"
      }
    },
    "metrics": [],
    "timeout": "20s",
    "page_size": 1000
  },
  "status": {
    "job_state": "stopped",
    "current_position": {
      "@timestamp.date_histogram": 1567933199000
    },
    "upgraded_doc_id": true
  },
  "stats": {
    "pages_processed": 1840,
    "documents_processed": 5322525,
    "rollups_indexed": 1838383,
    "trigger_count": 1,
    "index_time_in_ms": 1555018,
    "index_total": 1839,
    "index_failures": 0,
    "search_time_in_ms": 59059,
    "search_total": 1840,
    "search_failures": 0
  }
}
But it fails to roll up the data, with the following exception:
Error while attempting to bulk index documents: failure in bulk execution:
[0]: index [logstash-java-beats-rollup], type [_doc], id [Test 2 job$GTvyIZtPhKqi-dtfVd6MXg], message [MapperParsingException[Could not dynamically add mapping for field [@timestamp.date_histogram.time_zone]. Existing mapping for [@timestamp] must be of type object but found [date].]]
[1]: index [logstash-java-beats-rollup], type [_doc], id [Test 2 job$v-r89eEpLvImr0lWIrOb_Q], message [MapperParsingException[Could not dynamically add mapping for field [@timestamp.date_histogram.time_zone]. Existing mapping for [@timestamp] must be of type object but found [date].]]
[2]: index [logstash-java-beats-rollup], type [_doc], id [Test 2 job$quCHwZP1iVU_Bs2fmhgSjQ], message [MapperParsingException[Could not dynamically add mapping for field [@timestamp.date_histogram.time_zone]. Existing mapping for [@timestamp] must be of type object but found [date].]]
The logstash-java-beats-rollup index is empty, even though some stats are available for the rollup job.
I'm using elasticsearch v7.2.0
Could you please explain what is wrong with the data, or with the rollup job configuration?

Children are not mapping properly in elastic to parents

"chods": {
"mappings": {
"chod": {
"properties": {
"state": {
"type": "text"
"chods": {},
"variant": {
"_parent": {
"type": "chod"
"_routing": {
"required": true
"properties": {
"percentage": {
"type": "double"
When I execute:
PUT /chods/variant/565?parent=36442
{ // some data }
It returns:
But when I run this query:
GET /chods/variant/565?parent=36442
It returns variant with parent=36443
"_index": "chods",
"_type": "variant",
"_id": "565",
"_version": 7,
"_routing": "36443",
"_parent": "36443",
"found": true,
"_source": {
Why does it return parent 36443 and not 36442?
When I tried to reproduce this with your steps, I got the expected result (parent=36442). I noticed that after your PUT of the document with "_parent": "36442" the output is "_version": 6. In your GET of the document, "_version": 7 is returned. Is it possible that you posted another version of the document?
I also noticed that GET /chods/variant/565?parent=36443 would not actually filter by the parent id - the query parameter is disregarded. If you actually want to filter by parent id, this is the query you're looking for:
GET /chods/_search
{
  "query": {
    "parent_id": {
      "type": "variant",
      "id": "36442"
    }
  }
}
As @fylie pointed out, the main problem is that if you reuse the same document id, the document gets overwritten by the latest version (sort of).
Let's say that we have an index /tests and a type "a" which is a child of type "test", and we run the following commands:
PUT /tests/a/50?parent=25
{ "item": "C" }
PUT /tests/a/50?parent=26
{ "item": "D" }
PUT /tests/a/50?parent=50
{ "item": "E", "item2": "F" }
What will the result be? It can create anywhere from one to three documents.
If all three requests are routed to the same shard, you will end up with one document that has three versions.
If they are routed to three different shards, you will end up with three separate documents.
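The effect can be imitated with a toy model in which the parent id is used as the routing value and each shard keeps its own documents keyed by _id. This is only a sketch: the TinyCluster class is invented for this example, and CRC32 is a stand-in hash, since Elasticsearch uses its own hash function to pick the shard.

```python
import zlib


class TinyCluster:
    """Toy model of routing: each shard is a dict keyed by document _id."""

    def __init__(self, num_shards):
        self.shards = [dict() for _ in range(num_shards)]

    def shard_for(self, routing):
        # Stand-in hash (CRC32); real Elasticsearch hashes the routing value differently.
        return zlib.crc32(routing.encode("utf-8")) % len(self.shards)

    def index(self, doc_id, parent, body):
        """Store body under doc_id on the shard chosen by the parent id."""
        shard = self.shards[self.shard_for(parent)]
        version = shard[doc_id]["_version"] + 1 if doc_id in shard else 1
        shard[doc_id] = {"_parent": parent, "_version": version, "_source": body}
        return version

    def count(self, doc_id):
        """Number of shards that hold a document with this _id."""
        return sum(1 for shard in self.shards if doc_id in shard)


single = TinyCluster(num_shards=1)
single.index("50", "25", {"item": "C"})
single.index("50", "26", {"item": "D"})
last_version = single.index("50", "50", {"item": "E", "item2": "F"})

multi = TinyCluster(num_shards=5)
for parent in ("25", "26", "50"):
    multi.index("50", parent, {"item": parent})
```

With one shard every write lands in the same place, so the same _id is overwritten and only its version grows; with several shards, how many copies exist depends on where each parent id happens to hash.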

Elastic Search existing field mapping

I am new to Elasticsearch. I have created an index "cmn" with a type "mention". I am trying to import data from my existing Solr installation into Elasticsearch, so I want to map an existing field to the _id field.
I have created the following file under /config/mappings/cmn/:
"mappings": {
"_id" : {
"path" : "docKey"
But this doesn't seem to be working; every time I index a record, an _id like the following is generated:
"_index": "cmn",
"_type": "mentions",
"_id": "k4E0dJr6Re2Z39HAIjYMmg",
"_score": 1
Also, the mapping is not reflected. I have also tried the following option:
"mappings": {
"_id" : {
"path" : "docKey"
SAMPLE DOCUMENT: Basically a tweet.
"usrCreatedDate": "2012-01-24 21:34:47",
"sex": "U",
"listedCnt": 2,
"follCnt": 432,
"state": "Southampton",
"classified": 0,
"favCnt": 468,
"timeZone": "Casablanca",
"twitterId": 473333038,
"lang": "en",
"stnostem": "#ootd #ootw #fashion #styling #photography #white #pink #playsuit #prada #sunny #spring http://t.co/YbPFrXlpuh",
"sourceId": "tw",
"timestamp": "2014-04-09T22:58:00.396Z",
"sentiment": 0,
"updatedOnGMTDate": "2014-04-09T22:56:57.000Z",
"userLocation": "Southampton",
"age": 0,
"priorityScore": 57.4700012207031,
"statusCnt": 14612,
"name": "YazzyK",
"profilePicUrl": "http://pbs.twimg.com/profile_images/453578494556270594/orsA0pKi_normal.jpeg",
"mentions": "",
"sourceStripped": "Instagram",
"collectionName": "STREAMING",
"tags": "557/161/193/197",
"msgid": 1397084280396.33,
"_version_": 1464949081784713200,
"url2": "{\"urls\":[{\"url\":\"http://t.co/YbPFrXlpuh\",\"expandedURL\":\"http://instagram.com/p/mliZbgxVZm/\",\"displayURL\":\"instagram.com/p/mliZbgxVZm/\",\"start\":88,\"end\":110}]}",
"links": "http://t.co/YbPFrXlpuh",
"retweetedStatus": "",
"twtScreenName": "YazKader",
"postId": "454030232501358592",
"country": "Bermuda",
"message": "#ootd #ootw #fashion #styling #photography #white #pink #playsuit #prada #sunny #spring http://t.co/YbPFrXlpuh",
"source": "Instagram",
"parentStatusId": -1,
"bio": "Live and breathe Fashion. Persian and proud- Instagram: #Yazkader",
"createdOnGMTDate": "2014-04-09T22:56:57.000Z",
"searchText": "#ootd #ootw #fashion #styling #photography #white #pink #playsuit #prada #sunny #spring http://t.co/YbPFrXlpuh",
"isFavorited": "False",
"frenCnt": 214,
"docKey": "tw_454030232501358592"
Also, how can we create a unique mapping for each "TYPE" and not just for the index?
Define the mapping like this:
PUT index_name/type_name/_mapping
{
  "type_name": {
    "_id": {
      "path": "docKey"
    },
    "properties": {
      "docKey": {
        "type": "string"
      }
    }
  }
}
With this in place, _id is set from docKey whenever you index a document. You don't have to provide the rest of the mapping.
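If the _id path setting is not available on your version (as far as I know it was deprecated later in the 1.x series and removed in 2.0), the same result can be had on the client side by setting _id explicitly when the documents are sent. The sketch below only builds the newline-delimited body for the bulk API; the function name is made up for this example, and posting the body to the _bulk endpoint is left out.

```python
import json


def bulk_lines(docs, index, doc_type, id_field="docKey"):
    """Build a bulk request body that takes each document's _id from id_field."""
    lines = []
    for doc in docs:
        # Action line: tells Elasticsearch where to index and which _id to use.
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc[id_field]}}
        lines.append(json.dumps(action))
        # Source line: the document itself.
        lines.append(json.dumps(doc))
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"


body = bulk_lines(
    [{"docKey": "tw_454030232501358592", "lang": "en"}],
    index="cmn",
    doc_type="mentions",
)
```

This keeps docKey in the document as well, so nothing is lost if you later want to search on it.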
