Elasticsearch scripting - convert unix timestamp to "YYYYMM" - elasticsearch

I am using Elasticsearch 2.4 and I have a Groovy script. I have a field in my document, say doc['created_unix_timestamp'], which is of type integer and holds a Unix timestamp. In a search query's script, I am trying to derive YYYYMM from that value.
For example, if doc['created_unix_timestamp'] is 1522703848, then during a calculation in the script I want to convert it to 201804, where the first 4 digits are the year and the last two are the month (zero-padded, if required).
I tried:
Integer.parseInt(new SimpleDateFormat('YYYYMM').format(new Date(doc['created_unix_timestamp'])))
But it throws a compilation error: "TransportError(400, 'search_phase_execution_exception', 'failed to compile groovy script')". Any idea how to get it to work, or what the correct syntax is?

A couple of recommendations.
Reindex and make created_unix_timestamp a true date in Elasticsearch. This will make all kinds of new querying possible (date ranges, for example). This would be ideal; see the mapping sketch below.
If you can't reindex, then pull it back as an int from ES and convert it to whatever you want client side. I wouldn't recommend pushing that work to the ES cluster.
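For the first option, a sketch of what the target mapping could look like. This is only illustrative: the index and type names are made up, but the epoch_second format is what tells ES 2.x to interpret the stored integer as a Unix timestamp in seconds:
PUT /my_index_v2
{
  "mappings": {
    "my_type": {
      "properties": {
        "created_unix_timestamp": { "type": "date", "format": "epoch_second" }
      }
    }
  }
}
With that mapping in place, date range queries and date histograms work on the field directly, with no scripting needed.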

As per the suggestion of Szymon Stepniak in the comment above, I solved this issue with
(new Date(1522705958L*1000).format('yyyyMM')).toInteger()
Thanks and credit go to Szymon Stepniak.
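Applied to the document field from the question, the same one-liner would look something like this (assuming, as in the question, that the stored value is in seconds, hence the * 1000L to get the milliseconds the Date constructor expects):
(new Date(doc['created_unix_timestamp'].value * 1000L).format('yyyyMM')).toInteger()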

Related

Kibana scripted fields using painless: How to find number of days passed using a preexisting index

I have a bunch of data in Kibana that I need to clean up using a scripted field written in Painless, Elasticsearch's Java-like scripting language. At the moment I have a preexisting index in my logs with a date in this format: "2021-09-27T13:54:17.165Z". I need to find how many days it's been from that day until today, whenever this search is run; if it's at or over 300 days it needs to return false, if it's lower, true.
I was trying this to get the number of days it's been:
new Date().getTime() - doc['date'].value;
On Stack Overflow I saw someone say that new Date().getTime() will give you today's date. But I think the issue is that new Date().getTime() returns the time as epoch milliseconds, like 1657151078131, while my index date is in the "2021-09-27T13:54:17.165Z" format. I am not sure how to convert them in order to find whether the difference is less or more than 300 days.
Any help will be greatly appreciated
TL;DR
You have a type-mismatch issue:
new Date().getTime() -> gives a long (epoch milliseconds)
"2021-09-27T13:54:17.165Z" -> a string
To solve
You need to convert them to the same type.
Below, I am converting the string-format date to a ZonedDateTime and then to a long.
new Date().getTime() - ZonedDateTime.parse("2021-09-27T13:54:17.165Z").toInstant().toEpochMilli() > 25920000000L;
That's one way to go about it (the 25920000000L literal is 300 days expressed in milliseconds).
You can learn more about Painless and time formats here.
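To get the true/false the question asks for directly from the indexed field, here is a sketch along the same lines. It assumes the field is named date and is mapped as a date, so that doc['date'].value exposes toInstant() (as it does in recent Painless versions):
// milliseconds now and at the document's date
long nowMillis = new Date().getTime();
long thenMillis = doc['date'].value.toInstant().toEpochMilli();
// true if fewer than 300 days have passed, false at or over 300 days
return (nowMillis - thenMillis) < 300L * 24L * 60L * 60L * 1000L;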

Grafana elasticsearch time from now

I've configured Grafana to use Elasticsearch as a data source and prepared a few panels.
My document in the ES index contains only a few fields. It describes some system actions, with fields such as userId, action date, and others.
Now I am facing the issue that I can't calculate the amount of time elapsed since the action happened. I mean, if the action happened 2 days ago, I need to have the number 2 in my table. Pretty simple, I suppose.
I couldn't find any metric or transformation that allows me to do it.
Any suggestions or help, please.
I resolved my issue with a scripted field.
In the table visualization, I just picked any numeric field, selected the Min metric, and added a script like the following:
// current time as an Instant, built from epoch milliseconds
Instant currentDate = Instant.ofEpochMilli(new Date().getTime());
// the document's 'activity' date as an Instant (getMillis() is the Joda-style accessor on this ES version)
Instant startDate = Instant.ofEpochMilli(doc['activity'].value.getMillis());
// whole days elapsed between the two instants
ChronoUnit.DAYS.between(startDate, currentDate);
As a result, it shows me the number I needed.
I don't find this solution the best, so if anybody knows a better way to resolve this issue, please add a comment.

Adding hours with Logstash configuration file

I've been working with Logstash for 2 weeks, and I have a question about modifying the data.
The device which generates the syslogs is not set to the right time, so the logs aren't at the right time either, and I'd like to know how I can add hours to the time field to finally generate the correct timestamp.
Thanks in advance for any help !
You can use the Ruby filter plugin to do this.
The sample code below will add 14 days (1209600 seconds) to the value in the field @timestamp.
Replace the field name and the numeric value in seconds with whatever you want.
ruby {
  code => 'event.set("@timestamp", LogStash::Timestamp.new(Time.at(event.get("@timestamp").to_f + 1209600)))'
}
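Since the question asks about hours specifically, here is the same pattern shifting @timestamp forward by 2 hours (7200 seconds; one hour is 3600 seconds, so adjust the offset to your device's clock skew):
ruby {
  code => 'event.set("@timestamp", LogStash::Timestamp.new(Time.at(event.get("@timestamp").to_f + 7200)))'
}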

elasticsearch fill gaps with previous value

I have time series data in Elasticsearch and I want to aggregate it to create a histogram. What I want to achieve is to fill the null buckets with the value of the previous data point. I know that I can use min_doc_count: 0, but it will report the value as 0, and I couldn't find any out-of-the-box way to do this via Elastic. Maybe there is some trick that I am not aware of?
Appreciate your feedback.
I think the Date Histogram Aggregation does not provide a native way to do what you would like.
The closest thing I can think of is using the missing parameter. However, this will set a static value for all the dates where no values are found, which is not exactly what you want.
I also thought of using Painless with the following logic:
Get the first value in the histogram and store it in a variable, current.
If the next value is different from 0, store this value in current.
If the value is 0, set that bucket's value to current; don't change current.
Repeat step 2 until you finish the histogram.
Using Painless is, in my experience, really painful, but you can consider it as an alternative.
Additionally, I would recommend limiting ES to searches and aggregations. If you require additional logic on the output, consider performing it outside ES. You can use the Python ES client, for instance.
I can think of the following script, with similar logic to the Painless scenario:
current = 0
res = es.search(...)
for i in res["aggregations"]["my_histogram_name"]["buckets"]:
    if not i["doc_count"]:  # same as: if i["doc_count"] == 0
        i["doc_count"] = current
    current = i["doc_count"]  # changed or not, we always carry the last value forward as current
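Note that for this loop to have empty buckets to fill at all, the request still needs min_doc_count: 0, as mentioned in the question. A minimal request sketch, where the index and field names are made up for illustration:
res = es.search(
    index="my-metrics",
    body={
        "size": 0,
        "aggs": {
            "my_histogram_name": {
                "date_histogram": {
                    "field": "timestamp",        # hypothetical date field
                    "calendar_interval": "day",  # called "interval" on older ES versions
                    "min_doc_count": 0           # keep empty buckets so the loop can fill them
                }
            }
        }
    }
)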
After that, the histogram should look the way you want and be ready to be displayed.
Hope this is helpful! :)

How to sort by a derived value that includes a moving date in ElasticSearch?

I have a requirement to sort the results returned by ElasticSearch by a special value I define; let's call it 'X'.
Now - the problem is, 'X' is a value derived based on:
field A in the document (which is a 'term')
field B (which is a 'date')
the current date (UTC)
So, the problem is obviously item 3. The date always changes, therefore I'm not sure how to include it in the sort, since it's not part of the document.
From my initial reading it appears I can use a script here, but I'm worried about the performance, since I could be searching and sorting over thousands of documents.
The only other idea that came to mind is to calculate the value nightly and store it in each document. But that has a few drawbacks:
I need to have something running in the background to update this value.
There could be a lot of documents to update (60%+ every night).
I lose precision for the value depending on how long between script runs (if I run nightly, the value is up to 23 hours stale).
Any advice?
Thanks
This can be done by having a script run nightly that calculates the value and stores it in each document.
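If you do decide to try the script route despite the performance concern, here is a sketch of a script-based sort. Everything in it is illustrative: fieldA, fieldB, the 'premium' term, and the weighting stand in for however 'X' is actually derived, and the current time is passed in as a parameter instead of being read inside the script:
{
  "sort": [
    {
      "_script": {
        "type": "number",
        "order": "asc",
        "script": {
          "lang": "painless",
          "params": { "now": 1657151078131 },
          "source": "double weight = doc['fieldA'].value == 'premium' ? 2.0 : 1.0; long ageDays = (params.now - doc['fieldB'].value.toInstant().toEpochMilli()) / 86400000L; return weight * ageDays;"
        }
      }
    }
  ]
}
Passing the current time as a param keeps the script deterministic, which also lets Elasticsearch cache and reuse the compiled script across requests.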
