I have an Elasticsearch index that stores each record's date in the @timestamp date field.
Many records are missing the @timestamp field but have a timestamp field containing a Unix timestamp (generated from PHP, so in seconds, not milliseconds).
Note that the timestamp field is mapped as a date type, but numeric data seems to be stored there.
How can I use a Painless script in a reindex to set @timestamp where it is missing, but only if there is a numeric timestamp field holding a Unix timestamp?
Here's an example record that I would want to transform:
{
  "_index": "my_log",
  "_type": "doc",
  "_id": "AWjEkbynNsX24NVXXmna",
  "_score": 1,
  "_source": {
    "name": null,
    "pid": "148651",
    "timestamp": 1549486104
  }
}
Have you had a look at the ingest module of Elasticsearch?
https://www.elastic.co/guide/en/elasticsearch/reference/current/date-processor.html
Parses dates from fields, and then uses the date or timestamp as the
timestamp for the document. By default, the date processor adds the
parsed date as a new field called @timestamp. You can specify a
different field by setting the target_field configuration parameter.
Multiple date formats are supported as part of the same date processor
definition. They will be used sequentially to attempt parsing the date
field, in the same order they were defined as part of the processor
definition.
It does exactly what you want :) In your reindex request you can route the documents through an ingest pipeline that contains this processor.
If you need more help let me know, then I can jump behind a computer and help out :D
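As a rough sketch of that suggestion, the request bodies might look something like the following. They are built as Python dictionaries so they can be inspected without a cluster. The pipeline id unix_ts_to_at_timestamp and the destination index my_log_fixed are hypothetical names. The "if" condition assumes a version with conditional processors (6.5 or later, as far as I know), and the script variant assumes @timestamp accepts epoch_millis, which is part of the default date format.

```python
import json
from datetime import datetime, timezone

# Hypothetical names, chosen only for this illustration.
PIPELINE_ID = "unix_ts_to_at_timestamp"
SOURCE_INDEX = "my_log"
DEST_INDEX = "my_log_fixed"

# Body for: PUT _ingest/pipeline/<PIPELINE_ID>
# The date processor reads `timestamp` as epoch seconds ("UNIX") and writes
# the parsed date to `@timestamp` (its default target, spelled out here).
pipeline_body = {
    "description": "Fill @timestamp from a Unix-seconds timestamp field",
    "processors": [
        {
            "date": {
                "field": "timestamp",
                "target_field": "@timestamp",
                "formats": ["UNIX"],
                # Assumption: conditional processors are supported.
                "if": "ctx['@timestamp'] == null && ctx.timestamp instanceof Number",
            }
        }
    ],
}

# Body for: POST _reindex, sending every document through the pipeline.
reindex_body = {
    "source": {"index": SOURCE_INDEX},
    "dest": {"index": DEST_INDEX, "pipeline": PIPELINE_ID},
}

# Alternative body for: POST _reindex, using a Painless script instead.
# Assumption: @timestamp accepts epoch_millis, so seconds are multiplied by 1000.
reindex_with_script_body = {
    "source": {"index": SOURCE_INDEX},
    "dest": {"index": DEST_INDEX},
    "script": {
        "lang": "painless",
        "source": (
            "if (ctx._source['@timestamp'] == null "
            "&& ctx._source.timestamp instanceof Number) { "
            "ctx._source['@timestamp'] = ctx._source.timestamp * 1000L; }"
        ),
    },
}


def unix_seconds_to_iso(seconds: int) -> str:
    """Local illustration of the seconds-to-date conversion (UTC)."""
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return moment.strftime("%Y-%m-%dT%H:%M:%S.000Z")


if __name__ == "__main__":
    print(json.dumps(pipeline_body, indent=2))
    print(json.dumps(reindex_body, indent=2))
    print(unix_seconds_to_iso(1549486104))
```

Either reindex body would be sent on its own. The pipeline variant keeps the conversion logic reusable for new incoming documents, while the script variant matches the Painless approach asked about in the question.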
Related
I am moving data from Kafka to Elasticsearch using Kafka Connect's SMTs, more specifically TimestampConverter. I fiddled around with it for a while and couldn't get it to output a timestamp format.
When I used the types "Date", "Time" or "Timestamp" as the value of transforms.TimestampConverter.target.type, I couldn't get data into Elasticsearch. Only once I set the value to "string" did the values arrive in Elasticsearch as a date data type. Unfortunately, this means that I only get the value with day-level precision.
Here are the transformer configs:
"transforms": "TimestampConverter",
"transforms.TimestampConverter.type": "org.apache.kafka.connect.transforms.TimestampConverter$Value",
"transforms.TimestampConverter.field": "UPDATED",
"transforms.TimestampConverter.format": "yyyy-MM-dd",
"transforms.TimestampConverter.target.type": "string"
Are there any known ways to achieve this with a more precise timestamp? I tried all kinds of configurations, altering the target.type and format fields.
The UPDATED value is an epoch bigint.
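To illustrate why the configuration above yields only day-level precision, here is a small Python sketch. It does not call Kafka Connect. It only mimics what a date-only pattern does to an epoch value. The sample UPDATED value is made up and assumed to be in milliseconds. As far as I know the format setting takes a SimpleDateFormat-compatible pattern, so a pattern that includes time fields (for example yyyy-MM-dd'T'HH:mm:ss.SSS) would be the analogous change, though that is an untested assumption.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Hypothetical UPDATED value, assumed to be epoch milliseconds.
SAMPLE_UPDATED = 1549486104123


def format_epoch_millis(epoch_millis: int, pattern: str) -> str:
    """Format an epoch-milliseconds value in UTC using a strftime pattern."""
    return (EPOCH + timedelta(milliseconds=epoch_millis)).strftime(pattern)


# Python counterpart of the configured "yyyy-MM-dd": the time of day is dropped.
date_only = format_epoch_millis(SAMPLE_UPDATED, "%Y-%m-%d")

# A pattern that also names hours, minutes and seconds keeps them.
with_time = format_epoch_millis(SAMPLE_UPDATED, "%Y-%m-%dT%H:%M:%S")

if __name__ == "__main__":
    print(date_only)
    print(with_time)
```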
I have an index that contains information about some objects. I want to display some of this information on my Kibana dashboard. Let's assume an object looks as follows:
{
  "_index": "obj",
  "_type": "_doc",
  "_id": "KwDPAHABfo5V345r4IYV",
  "_version": 1,
  "_score": 0,
  "_source": {
    "value_1": "some value",
    "value_2": "some_other value",
    "owner": "jason",
    "modified_date": "2020-02-01T12:53:08.210317+00:00",
    "created_date": "2020-02-01T12:53:08.243980+00:00"
  }
}
I need to show the (live) number of objects that have owner: 'UNKNOWN'. The thing is that this value changes over time. Each change arrives as a new document; existing documents are not updated. I need to track how many UNKNOWN owners I currently see. Updates (new documents) are sent to ELK at fixed intervals.
When I try to set up a metric, it sometimes shows 0 during the window between one update and the next, when no documents are flowing into ELK. How can I make Kibana display only the last documents with owner: 'UNKNOWN'?
How can I make Kibana display only last documents with owner: 'UNKNOWN'?
You could set up a data table visualization for that as an alternative to the one-dimensional metric visualization.
This is how I personally would configure the data table:
Set a filter with 'owner(.keyword) is UNKNOWN'.
Use the metric 'Top Hit' on the field created_date (or @timestamp, that's up to you) instead of the count metric.
Set the order to descending based on the timestamp field.
Split the rows (Terms aggregation) for every field you want to display. Each split creates a 'column' in your table.
Go to the options tab and enable count on the sum of all rows.
Set an appropriate time interval, e.g. last 1 hour.
This will display all the relevant data of your documents that have the field owner equal to UNKNOWN. Also, you see the ingestion/creation date timestamp of these documents in a descending order. Furthermore, you see the number of documents that match (configured via the options tab as described above).
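For readers who prefer to see the idea as a query, the following sketch is an approximate search body expressing the same filter, time window and ordering. It is not what Kibana generates for the data table. The owner.keyword sub-field and the page size of 10 are assumptions.

```python
import json

# Approximate query equivalent of the data table described above.
# Assumptions: `owner` has a `.keyword` sub-field; 10 rows are enough.
search_body = {
    "size": 10,
    "query": {
        "bool": {
            "filter": [
                # Only documents whose owner is exactly UNKNOWN.
                {"term": {"owner.keyword": "UNKNOWN"}},
                # Same idea as choosing "last 1 hour" in the time picker.
                {"range": {"created_date": {"gte": "now-1h"}}},
            ]
        }
    },
    # Newest documents first, like the descending order in the table.
    "sort": [{"created_date": {"order": "desc"}}],
}

if __name__ == "__main__":
    print(json.dumps(search_body, indent=2))
```

The total hit count in the response plays the role of the row count enabled in the options tab.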
I hope I could help you.
I have a date field defined in index as
"_reportDate": {
"type": "date",
"format": "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||epoch_millis"
}
and I have a query that reads from the _source field, which returns the _reportDate field as a string such as 2015-12-05 01:05:00.
I can't seem to find a way to get the date in a different format at query time, apart from using a script field (which is not preferable). From what I understand, a date field is parsed into a long value when it is indexed in Elasticsearch. Can we retrieve that long value as well in an Elasticsearch query?
You need to store the field and at search time ask for this stored field.
If that does not work, you can always apply the script at index time with the ingest feature and a script processor.
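A hedged sketch of two ways to get at the long value follows. The search body uses docvalue_fields with a format, which is a different technique from the stored-field approach suggested above and assumes a version that supports a format there. The helper shows a client-side fallback that converts the returned string, assuming the value is in UTC.

```python
import calendar
from datetime import datetime

# Assumption: the cluster supports `format` inside docvalue_fields.
# This asks for the indexed value of _reportDate rendered as epoch millis.
search_body = {
    "query": {"match_all": {}},
    "_source": ["_reportDate"],
    "docvalue_fields": [{"field": "_reportDate", "format": "epoch_millis"}],
}


def report_date_to_epoch_millis(value: str) -> int:
    """Convert a 'yyyy-MM-dd HH:mm:ss' string, read as UTC, to epoch millis."""
    parsed = datetime.strptime(value, "%Y-%m-%d %H:%M:%S")
    return calendar.timegm(parsed.timetuple()) * 1000


if __name__ == "__main__":
    print(report_date_to_epoch_millis("2015-12-05 01:05:00"))
```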
I have connected Kibana to my ES instance.
cat/indices returns:
yellow open .kibana 1 1 1 0 3.1kb 3.1kb
yellow open tests 5 1 413042 0 3.4gb 3.4gb
However I get the following on the kibana configuration screen. What am I missing?
Update:
My sample document looks like this
{
  "_index": "tests",
  "_type": "test7",
  "_id": "AVGlIKIM1CQ8BZRgLZVg",
  "_score": 1.7840601,
  "_source": {
    "severity": "ERROR",
    "code": "CODE",
    "message": "MESSAGE",
    "environment": "TEST",
    "error_uuid": "cbe99080-0bf3-495c-a417-77384ba0fd39",
    "correlation_id": "cf5a1fd5-4fd2-40bb-9cdf-405b91dcbd6f",
    "timestamp": "2015-11-20 15:24:39.831"
  }
}
Disable the option "Use event times to create index names" and enter the index name (tests) instead of a pattern.
The option you are trying to use is used when you have index names based on timestamp (imagine you create a new index per day with tests-2015.12.01, tests-2015.12.02...). It's quite clear if you read the message when you enable that option:
Patterns allow you to define dynamic index names. Static text in an index name is denoted using brackets. Example: [logstash-]YYYY.MM.DD. Please note that weeks are setup to use ISO weeks which start on Monday
EDIT: The time-field name dropdown is empty because no field in the mapping of your index has the date type. You can check this with GET /<index-name>/_mapping?pretty: the timestamp field is mapped as "string", not "date". This happens because the value did not match the formats used for date detection (yyyy/MM/dd HH:mm:ss Z||yyyy/MM/dd Z). To solve this:
You can change the format of the timestamp you are inserting so that it matches one of the default date detection formats.
You can modify the dynamic_date_formats setting and add a format that matches the current format of your timestamp.
You can create an index template that sets the type "date" for the "timestamp" field.
In any of the cases, you would need to delete the index and create a new one or reindex the data.
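As a minimal sketch of the explicit-mapping option, the body below declares timestamp as a date with a format that fits the sample document. The type name test7 is taken from the sample. The mapping layout assumes an Elasticsearch version that still uses mapping types, and the format string is my assumption for the sample value. The Python parse at the end only checks that the sample has the shape the format describes.

```python
import json
from datetime import datetime

SAMPLE_TIMESTAMP = "2015-11-20 15:24:39.831"

# Assumed Elasticsearch date format for the sample value above.
ES_DATE_FORMAT = "yyyy-MM-dd HH:mm:ss.SSS"
# Python strptime counterpart, used only for the local sanity check.
PY_EQUIVALENT = "%Y-%m-%d %H:%M:%S.%f"

# Body for creating the new index (or the "mappings" part of a template).
# Assumption: a version with mapping types, hence the "test7" level.
mapping_body = {
    "mappings": {
        "test7": {
            "properties": {
                "timestamp": {"type": "date", "format": ES_DATE_FORMAT}
            }
        }
    }
}

# Local check that the sample value matches the intended shape.
parsed = datetime.strptime(SAMPLE_TIMESTAMP, PY_EQUIVALENT)

if __name__ == "__main__":
    print(json.dumps(mapping_body, indent=2))
    print(parsed.isoformat())
```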
I have an Elasticsearch index with the following mapping:
"pickup_datetime": {
"type": "date",
"format": "dateOptionalTime"
}
Here is an example of a date contained in the file that is being read in
"pickup_datetime": "2013-01-07 06:08:51"
I am using Logstash to read and insert data into ES with the following lines to attempt to convert the date string into the date type.
date {
  match => [ "pickup_datetime", "yyyy-MM-dd HH:mm:ss" ]
  target => "pickup_datetime"
}
But the match never seems to occur.
What am I doing wrong?
It turns out the date filter was before the csv filter, where the columns get named, hence the date filter was not finding the pickup_datetime column since it had not yet been named.
It might be a good idea to clearly mention the sequentiality of the filters in the documentation to avoid others having similar problems in the future.
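The effect of filter order can be shown with a toy model in Python. This is not Logstash. The two functions only imitate the roles of the csv and date filters, and the medallion column and the sample line are made up.

```python
from datetime import datetime

# Hypothetical column names and input line, for illustration only.
COLUMNS = ["medallion", "pickup_datetime"]
PATTERN = "%Y-%m-%d %H:%M:%S"
LINE = "abc123,2013-01-07 06:08:51"


def csv_filter(event: dict) -> dict:
    """Split the raw line and name the columns, as a csv filter would."""
    event.update(dict(zip(COLUMNS, event["message"].split(","))))
    return event


def date_filter(event: dict) -> dict:
    """Parse pickup_datetime if it exists; otherwise leave the event as is."""
    if "pickup_datetime" in event:
        event["pickup_datetime"] = datetime.strptime(
            event["pickup_datetime"], PATTERN
        )
    return event


def run(event: dict, filters: list) -> dict:
    """Apply the filters strictly in the order given."""
    for apply_filter in filters:
        event = apply_filter(event)
    return event


# Date filter first: the field does not exist yet, so nothing is parsed.
wrong_order = run({"message": LINE}, [date_filter, csv_filter])
# Csv filter first: the field is named before the date filter looks for it.
right_order = run({"message": LINE}, [csv_filter, date_filter])

if __name__ == "__main__":
    print(type(wrong_order["pickup_datetime"]).__name__)
    print(type(right_order["pickup_datetime"]).__name__)
```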