Elastic search Average time difference Aggregate Query - elasticsearch

I have documents in Elasticsearch, each of which looks something like this:
{
"id": "T12890ADSA12",
"status": "ENDED",
"type": "SAMPLE",
"updatedAt": "2020-05-29T18:18:08.483Z",
"events": [
{
"event": "STARTED",
"version": 1,
"timestamp": "2020-04-30T13:41:25.862Z"
},
{
"event": "INPROGRESS",
"version": 2,
"timestamp": "2020-05-14T17:03:09.137Z"
},
{
"event": "INPROGRESS",
"version": 3,
"timestamp": "2020-05-17T17:03:09.137Z"
},
{
"event": "ENDED",
"version": 4,
"timestamp": "2020-05-29T18:18:08.483Z"
}
],
"createdAt": "2020-04-30T13:41:25.862Z"
}
Now I want to write an Elasticsearch query that selects all documents of type "SAMPLE" and returns the average time between the STARTED and ENDED events across those documents, e.g. avg of (2020-05-29T18:18:08.483Z - 2020-04-30T13:41:25.862Z, ...). Assume that the STARTED and ENDED events each appear only once in the events array. Is there any way I can do that?

You can do something like this. The query selects the documents of type SAMPLE with status ENDED (to make sure there is an ENDED event). The avg aggregation then uses a script to find the STARTED and ENDED timestamps and returns the number of whole days between them:
POST test/_search
{
"query": {
"bool": {
"filter": [
{
"term": {
"status.keyword": "ENDED"
}
},
{
"term": {
"type.keyword": "SAMPLE"
}
}
]
}
},
"aggs": {
"duration": {
"avg": {
"script": "Map findEvent(List events, String type) {return events.find(it -> it.event == type);} def started = Instant.parse(findEvent(params._source.events, 'STARTED').timestamp); def ended = Instant.parse(findEvent(params._source.events, 'ENDED').timestamp); return ChronoUnit.DAYS.between(started, ended);"
}
}
}
}
Formatted for readability, the script looks like this:
Map findEvent(List events, String type) {
return events.find(it -> it.event == type);
}
def started = Instant.parse(findEvent(params._source.events, 'STARTED').timestamp);
def ended = Instant.parse(findEvent(params._source.events, 'ENDED').timestamp);
return ChronoUnit.DAYS.between(started, ended);
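To sanity-check what the aggregation should return, the same logic can be sketched client-side in Python. This is only an illustration under assumptions: the function names (`parse_ts`, `find_event`, `duration_days`, `average_duration_days`) are made up for this example, and the documents are assumed to be already fetched as dictionaries shaped like the one in the question. Like `ChronoUnit.DAYS.between`, it counts whole days only, so partial days are dropped before averaging.

```python
from datetime import datetime


def parse_ts(ts):
    # Parse timestamps such as "2020-04-30T13:41:25.862Z" (%z accepts "Z" on Python 3.7+).
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%f%z")


def find_event(events, name):
    # Return the first event with the given name; the question assumes
    # STARTED and ENDED each appear exactly once.
    return next(e for e in events if e["event"] == name)


def duration_days(doc):
    # Whole days between STARTED and ENDED, mirroring the script's truncation.
    started = parse_ts(find_event(doc["events"], "STARTED")["timestamp"])
    ended = parse_ts(find_event(doc["events"], "ENDED")["timestamp"])
    return (ended - started).days


def average_duration_days(docs):
    # Plain arithmetic mean over the per-document durations.
    durations = [duration_days(d) for d in docs]
    return sum(durations) / len(durations)
```

If sub-day precision matters, the script could return milliseconds instead of days and the conversion could happen on the client.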


Get GraphQL output without second curly bracket

I have a problem: I do not want a nested "breakdown" in my GraphQL output. I have a GraphQL schema with a person, and that person can have one or more interests. Unfortunately, I only get the nested breakdown.
What I mean by breakdown is the second pair of curly brackets:
{
...
{
...
}
}
Is there an option to get the id of the person plus the id of the interests and the status without the second curly bracket?
GraphQL schema
Person
└── Interest
Query
query {
model {
allPersons{
id
name
interest {
id
status
}
}
}
}
[OUT]
{
"data": {
"model": {
"allPersons": [
{
"id": "01",
"name": "Max",
"interest": {
"id": 4488448,
"status": "active"
}
},
{
"id": "02",
"name": "Sophie",
"interest": {
"id": 15445,
"status": "deactivated"
}
},
What I want
[
{
"id": "01",
"id-interest": 4488448,
"status": "active"
},
{
"id": "02",
"name": "Sophie",
"id-interest": 15445,
"status": "deactivated"
}
]
What I tried, which gives me the same result:
fragment InterestTask on Interest {
id
status
}
query {
model {
allPersons{
id
interest {
...InterestTask
}
}
}
}
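A GraphQL response mirrors the shape of the query, so a fragment does not remove the nesting. One option is to flatten the result on the client after the query returns. The following is a minimal sketch, assuming the response has already been parsed into a Python dictionary shaped like the output above; the function name `flatten_persons` is hypothetical, and the `id-interest` key is taken from the desired output in the question.

```python
def flatten_persons(response):
    """Flatten each person and its interest(s) into one row per interest."""
    rows = []
    for person in response["data"]["model"]["allPersons"]:
        interests = person.get("interest") or []
        # The question shows a single object, but a person may have several
        # interests, so accept both a dict and a list of dicts.
        if isinstance(interests, dict):
            interests = [interests]
        for interest in interests:
            rows.append({
                "id": person["id"],
                "id-interest": interest.get("id"),
                "status": interest.get("status"),
            })
    return rows
```

A person with several interests produces several rows that share the same `id`.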

Incorrectly selected data in the query

I only need the articles that contain the EmailMarketing tag.
I'm probably filtering on the tag the wrong way, since tags is an array of values rather than a single object, but I don't know how to do it correctly. I'm just learning GraphQL. Any help would be appreciated.
query:
query {
enArticles {
title
previewText
tags(where: {name: "EmailMarketing"}){
name
}
}
}
result:
{
"data": {
"enArticles": [
{
"title": "title1",
"previewText": "previewText1",
"tags": [
{
"name": "EmailMarketing"
},
{
"name": "Personalization"
},
{
"name": "Advertising_campaign"
}
]
},
{
"title": "title2",
"previewText": "previewText2",
"tags": [
{
"name": "Marketing_strategy"
},
{
"name": "Marketing"
},
{
"name": "Marketing_campaign"
}
]
},
{
"title": "article 12",
"previewText": "article12",
"tags": []
}
]
}
}
I believe you first need to have coded an equality operator within your GraphQL schema. There's a good explanation of that here.
Once you add an equality operator - say, for example _eq - you can use it something like this:
query {
enArticles {
title
previewText
tags(where: {name: {_eq: "EmailMarketing"}}){
name
}
}
}
Specifically, you would need to create a filter and resolver.
The example here may help.
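As a rough illustration of what such a filter and resolver could do, here is a framework-agnostic Python sketch. Everything in it is an assumption for illustration: the names `matches`, `resolve_tags` and `articles_with_tag` are hypothetical, and a real server library would wire the `where` argument into its own resolver signature. It also shows that filtering the nested tags list and filtering the articles themselves are two separate steps.

```python
def matches(item, where):
    """Return True if the item satisfies every {field: {"_eq": value}} condition."""
    for field, condition in where.items():
        if "_eq" in condition and item.get(field) != condition["_eq"]:
            return False
    return True


def resolve_tags(article, where=None):
    """Hypothetical resolver for the tags field: narrows the nested tag list only."""
    tags = article.get("tags", [])
    if not where:
        return tags
    return [tag for tag in tags if matches(tag, where)]


def articles_with_tag(articles, where):
    """Keep only the articles that have at least one matching tag."""
    return [a for a in articles if resolve_tags(a, where)]
```

For example, `articles_with_tag(articles, {"name": {"_eq": "EmailMarketing"}})` would drop articles whose tag list contains no matching entry, including those with an empty list.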

Merge / flatten sub aggs into main agg

Is there a way in Elasticsearch to get the results back in a flattened form when using multiple child/sub aggregations?
For instance, I am currently trying to get back all product types and their status (online / offline).
This is what I end up with:
aggs
[
{ key: SuperProduct, doc_count:3, subagg:[
{status:online, doc_count:1},
{status:offline, doc_count:2}
]
},
{ key: SuperProduct2, doc_count:10, subagg:[
{status:online, doc_count:7},
{status:offline, doc_count:3}
]
}
]
Charting libraries tend to prefer flattened data, so I was wondering whether Elasticsearch could provide it in this form:
[
{ products_key: 'SuperProduct', status_key:'online', doc_count:1},
{ products_key: 'SuperProduct', status_key:'offline', doc_count:2},
{ products_key: 'SuperProduct2', status_key:'online', doc_count:7},
{ products_key: 'SuperProduct2', status_key:'offline', doc_count:3}
]
Thanks
This is possible with a composite aggregation, which you can use to combine two terms sources:
// POST /i/_search
{
"size": 0,
"aggregations": {
"distribution": {
"composite": {
"sources": [
{"product": {"terms": {"field": "product.keyword"}}},
{"status": {"terms": {"field": "status.keyword"}}}
]
}
}
}
}
This results in the following structure:
{
"aggregations": {
"distribution": {
"after_key": {
"product": "B",
"status": "online"
},
"buckets": [
{
"key": {
"product": "A",
"status": "offline"
},
"doc_count": 3
},
{
"key": {
"product": "A",
"status": "online"
},
"doc_count": 2
},
{
"key": {
"product": "B",
"status": "offline"
},
"doc_count": 1
},
{
"key": {
"product": "B",
"status": "online"
},
"doc_count": 4
}
]
}
}
}
If for any reason the composite aggregation doesn't fulfill your needs, you can create (via copy_to or by concatenation) or simulate (via scripted fields) a field that uniquely identifies the bucket. In our project we went with concatenation (partly because we needed to collapse on this field), e.g. {"bucket": "SuperProductA:online"}. This results in dirtier output (you'll have to decode that field back or use top hits to get the original values), but it still does the job.
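Either variant still needs a small reshaping step on the client to reach the exact row format from the question. The sketch below assumes the search response has already been parsed into a Python dictionary; the function names `flatten_composite` and `decode_bucket` are hypothetical, and the `:` separator is taken from the concatenation example above.

```python
def flatten_composite(response, agg_name="distribution"):
    """Turn composite buckets into flat rows for a charting library."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return [
        {
            "products_key": bucket["key"]["product"],
            "status_key": bucket["key"]["status"],
            "doc_count": bucket["doc_count"],
        }
        for bucket in buckets
    ]


def decode_bucket(value, separator=":"):
    """Split a concatenated bucket key such as "SuperProductA:online"."""
    # Split on the last separator so product names containing ":" survive.
    product, status = value.rsplit(separator, 1)
    return {"products_key": product, "status_key": status}
```

Note that a composite aggregation is paginated, so for large result sets the request would need to be repeated with the returned `after_key` until no more buckets come back.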

Date math in elastic watcher email

I would like to compute the datetime for 1 day ago so that I can build a link to Kibana in an email sent from the watcher. I'm using Elasticsearch 5.0.2.
I've tried the watch below, but it returns this error:
ScriptException[runtime error]; nested: IllegalArgumentException[Unable to find dynamic method [minusDays] with [1] arguments for class [org.joda.time.DateTime].];
minusDays does exist in the Joda DateTime spec,
but it doesn't seem to exist in the Elastic codebase.
Here's the watch:
PUT /_xpack/watcher/watch/errors-prod
{
"trigger": {
"schedule": {
"daily": {
"at": [
"08:36"
]
}
}
},
"input": {
"search": {
"request": {
"search_type": "query_then_fetch",
"indices": [
"<das-logstash-{now}>",
"<das-logstash-{now-1d}>"
],
"types": [
"redis-input"
],
"body": {
"size": 0,
"query": {
"match_all": {}
}
}
}
}
},
"actions": {
"send_email": {
"transform": {
"script" : "return [ 'from' : ctx.trigger.scheduled_time.minusDays(1) ]"
},
"email": {
"profile": "standard",
"from": "noreply@email.com",
"to": [
"me@email.com"
],
"subject": "errors",
"body": {
"html": "<html><body><p>from {{ctx.payload.from}}</p><p>to {{ctx.trigger.scheduled_time}}</p></body></html>"
}
}
}
}
}
I needed something similar and was able to hack this together by modifying a comment from the Elastic forum that almost worked:
"transform": {
"script" : {
"source" : "def payload = ctx.payload; DateFormat df = new SimpleDateFormat(\"yyyy-MM-dd'T'HH:mm:ss.SSS'Z'\"); ctx.payload.from = df.format(Date.from(Instant.ofEpochMilli(ctx.execution_time.getMillis() - (24 * 60 * 60 * 1000) ))); return payload"
}
},
Hope that helps!
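The arithmetic in that transform (execution time in epoch milliseconds, minus 24 hours, formatted as an ISO-style string) can be checked outside the watcher with a short Python sketch. The function name `one_day_before` is hypothetical. One difference to be aware of: the sketch always formats in UTC, whereas `SimpleDateFormat` formats in the JVM's default time zone unless one is set explicitly, so the literal `Z` suffix in the watch script is only accurate when that default is UTC.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
ONE_DAY_MS = 24 * 60 * 60 * 1000


def one_day_before(execution_millis):
    """Format (execution time - 24h) as yyyy-MM-dd'T'HH:mm:ss.SSS'Z' in UTC."""
    moment = EPOCH + timedelta(milliseconds=execution_millis - ONE_DAY_MS)
    # strftime has no millisecond directive, so append the milliseconds by hand.
    millis = moment.microsecond // 1000
    return moment.strftime("%Y-%m-%dT%H:%M:%S") + ".{:03d}Z".format(millis)
```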

Children are not mapping properly in elastic to parents

"chods": {
"mappings": {
"chod": {
"properties": {
"state": {
"type": "text"
}
}
},
"chods": {},
"variant": {
"_parent": {
"type": "chod"
},
"_routing": {
"required": true
},
"properties": {
"percentage": {
"type": "double"
}
}
}
}
},
When I execute:
PUT /chods/variant/565?parent=36442
{ // some data }
It returns:
{
"_index":"chods",
"_type":"variant",
"_id":"565",
"_version":6,
"result":"updated",
"_shards":{
"total":2,
"successful":1,
"failed":0
},
"created":false
}
But when I run this query:
GET /chods/variant/565?parent=36442
It returns the variant with parent=36443:
{
"_index": "chods",
"_type": "variant",
"_id": "565",
"_version": 7,
"_routing": "36443",
"_parent": "36443",
"found": true,
"_source": {
...
}
}
Why does it return parent 36443 and not 36442?
When I tried to reproduce this with your steps, I got the expected result (parent=36442). I noticed that after your PUT of the document with parent 36442, the output shows "_version":6, while your GET of the document returns "_version": 7. Is it possible that you indexed another version of the document in between?
I also noticed that GET /chods/variant/565?parent=36443 would not actually filter by the parent id - the query parameter is disregarded. If you actually want to filter by parent id, this is the query you're looking for:
GET /chods/_search
{
"query": {
"parent_id": {
"type": "variant",
"id": "36442"
}
}
}
As @fylie pointed out, the main problem is that if you reuse the same document id, your document gets overwritten by the latest version (sort of).
Let's say we have an index /tests and a type "a" which is a child of type "test", and we run the following commands:
PUT /tests/a/50?parent=25
{
"item": "C"
}
PUT /tests/a/50?parent=26
{
"item": "D"
}
PUT /tests/a/50?parent=50
{
"item": "E",
"item2": "F"
}
What will the result be? It can result in anywhere from 1 to 3 documents.
If all three requests route to the same shard, you end up with one document that has 3 versions.
If they route to 3 different shards, you end up with 3 separate documents.
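The effect can be simulated with a toy model of routing. Elasticsearch picks the shard as a hash of the routing value modulo the number of primary shards, and a child document is routed by its parent id. The sketch below is only an approximation under stated assumptions: `zlib.crc32` stands in for the real hash function, each shard is modelled as a plain dictionary, and the names `shard_for`, `index_child` and `count_copies` are hypothetical.

```python
import zlib


def shard_for(routing, num_shards):
    # Stand-in for Elasticsearch's routing hash; only the modulo idea is the same.
    return zlib.crc32(routing.encode("utf-8")) % num_shards


def index_child(shards, doc_id, parent, source):
    """Store a child document on the shard selected by its parent id."""
    shard = shard_for(parent, len(shards))
    # Within one shard the id is unique, so a second write overwrites the first.
    shards[shard][doc_id] = {"_parent": parent, "_source": source}
    return shard


def count_copies(shards, doc_id):
    """Count on how many shards a document with this id exists."""
    return sum(1 for shard in shards if doc_id in shard)
```

With a single shard the three PUT requests above always leave one document. With several shards, the number of copies equals the number of distinct shards the three parent ids hash to.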
