Any Elasticsearch datatype matching a decimal timestamp? - elasticsearch

I have a timestamp from a log file like {"ts" : "1486418325.948487"}
My infrastructure is Filebeat 5.2.0 --> Elasticsearch 5.2.
I tried mapping "ts" to "date" with the "epoch_second" format, but the write to Elasticsearch failed in Filebeat.
PUT /auth_auditlog
{
  "mappings": {
    "auth-auditlogs": {
      "properties": {
        "ts": {
          "type": "date",
          "format": "epoch_second"
        }
      }
    }
  }
}
The Filebeat error message looks like:
WARN Can not index event (status=400): {"type":"mapper_parsing_exception","reason":"failed to parse [ts]","caused_by":{"type":"illegal_argument_exception","reason":"Invalid format: \"1486418325.948487\""}}
Using "1486418325" works, so I guess Elasticsearch doesn't accept decimal-format timestamps. However, Python's default timestamp output is in this format.
My goal is to get the type right in Elasticsearch and use ts as the original event timestamp.
Any solution is welcome except changing the original log data!

Filebeat doesn't have a processor for this type of thing; you can't replace @timestamp with the one from your log in Filebeat. What you can do is send the events to Logstash and let the date filter parse the epoch value:
date {
  match => [ "timestamp", "UNIX", "UNIX_MS" ]
}
The other option would be to use an ingest node. Although I haven't used this myself, it seems it is also able to do the job. Check out the docs here.
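For example, a minimal ingest pipeline sketch (the pipeline name parse-ts and the target field are my own choices, and I haven't verified that the UNIX format keeps the fractional part on 5.2):
PUT _ingest/pipeline/parse-ts
{
  "description": "Hypothetical pipeline: parse decimal epoch seconds from the ts field",
  "processors": [
    {
      "date": {
        "field": "ts",
        "formats": ["UNIX"],
        "target_field": "@timestamp"
      }
    }
  ]
}
You would then point Filebeat at it with the pipeline option of its elasticsearch output (output.elasticsearch.pipeline: parse-ts).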

Related

Specifying Field Types Indexing from Logstash to Elasticsearch

I have successfully ingested data from Logstash to Elasticsearch using the XML filter plugin; however, all the fields are of type "text".
Is there a way to manually or automatically specify the correct type?
I found the following technique worked well for my use case:
Logstash can filter the data and convert a field from the default (text) to whatever type you want. The documentation can be found here. The example given in the documentation is:
filter {
  mutate {
    convert => { "fieldname" => "integer" }
  }
}
You add this in the body of the /etc/logstash/conf.d/02-... file. The downside of this practice, as I understand it, is that altering data on its way into Elasticsearch is generally discouraged.
After you do this you will probably hit this problem. If that happens and your database is a test database whose old data you can erase, just DELETE the index so there is no conflict (for example, a field that was text until now and is suddenly received as a date would conflict with the old data). If you can't simply erase the old data, read the answer in the link above.
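If you go the test-index route, the delete is a single request (my-test-index below is a placeholder for your own index name):
curl -XDELETE 'http://localhost:9200/my-test-index'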
What you want to do is specify a mapping template.
PUT _template/template_1
{
  "index_patterns": ["te*", "bar*"],
  "settings": {
    "number_of_shards": 1
  },
  "mappings": {
    "type1": {
      "_source": {
        "enabled": false
      },
      "properties": {
        "host_name": {
          "type": "keyword"
        },
        "created_at": {
          "type": "date",
          "format": "EEE MMM dd HH:mm:ss Z YYYY"
        }
      }
    }
  }
}
Change the settings to match your needs, and list under properties the fields you want mapped and the types you want them mapped to.
Setting index_patterns is especially important because it tells Elasticsearch how to apply this template. You can set an array of index patterns and use * as a wildcard where appropriate. For example, Logstash's default is to rotate indices by date, so they look like logstash-2018.04.23; your pattern could then be logstash-*, and any index matching it will receive the template.
If you want to match based on some pattern, then you can use dynamic templates.
Edit: adding a little update here: if you want Logstash to apply the template for you, here is a link to the settings you'll want to be aware of.
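As a rough, untested sketch of those settings in the Logstash elasticsearch output (the host and the template file path are placeholders):
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    # let Logstash install the template itself
    manage_template => true
    # path to a file containing the template body shown above (placeholder path)
    template => "/etc/logstash/templates/template_1.json"
    template_name => "template_1"
    template_overwrite => true
  }
}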

Can we use a Unix Timestamp as _timestamp field in elasticsearch

When we create time-based indices, Elasticsearch/Kibana need a field named "_timestamp".
I found that this field should be a string, but in my log a Unix timestamp is a necessary segment.
Yes, you can store Unix timestamps in date-type fields, but make sure you use the proper format: epoch_millis for timestamps in milliseconds and epoch_second for timestamps in seconds.
Example mapping for a timestamp field that stores a Unix timestamp in seconds:
PUT my-index
{
  "mappings": {
    "my-type": {
      "properties": {
        "timestamp": {
          "type": "date",
          "format": "epoch_second"
        }
      }
    }
  }
}
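With that mapping in place, a document carrying epoch seconds indexes as expected (the document ID 1 below is arbitrary):
PUT my-index/my-type/1
{
  "timestamp": 1486418325
}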
You can find more information here

Timestamp mapping in Spark to Elasticsearch

I am writing logs to Elasticsearch using Spark. The logs are in JSON format and have a timestamp field,
for example { "timestamp": "2016-11-02 21:16:06.116" }.
When I write the JSON logs to the Elastic index, the timestamp is analysed as a string instead of a date. I tried setting the property in the Spark config using sparkConf.set("es.mapping.timestamp", "timestamp"), but it throws the following error at runtime: org.apache.spark.util.TaskCompletionListenerException: failed to parse timestamp [2016-11-03 15:46:55.1155]
You can change the timestamp format:
2016-11-02 21:16:06.116 -> 2016-11-02T21:16:06.116
Using 2016-11-02T21:16:06.116, the insert into Elastic works for me.
The type's properties:
"create_time": {
  "format": "strict_date_optional_time||epoch_millis",
  "type": "date"
}

ElasticSearch Mapping: is it possible to auto-truncate a date to fit its format?

On our project we're using NEST to insert data into Elasticsearch (1.7). We'd like to be able to force ES to truncate all dates to the mapped format.
Mapping example:
"dateFrom" : {
"type": "date",
"format": "dateHourMinute" // Or yyyy-MM-dd'T'HH:mm
}
Data example:
{
  "dateFrom": "2015-12-21T15:55:00.000Z"
}
Inserting this data throws an IllegalArgumentException:
Invalid format: "2015-12-21T15:55:00.000Z" is malformed at ":00.000Z"
Obviously we don't need the last part of the date. Can't we configure ES to just truncate it instead of erroring out?
Keep in mind we're using 1.7 right now, since date formatting seems to have changed in recent versions...
In order to get the data to index correctly, I could change the date format to date_optional_time (supported in 1.7):
PUT my_index
{
  "mappings": {
    "my_type": {
      "properties": {
        "date": {
          "type": "date",
          "format": "date_optional_time"
        }
      }
    }
  }
}
This will allow you to submit dates with the time portion being optional, such as:
PUT /my_index/my_type/1
{
  "date": "2015-12-21"
}
or as you have it
PUT /my_index/my_type/2
{
  "date": "2015-12-21T15:55:00.000Z"
}
Both are now valid submissions. I don't know of any transformation approach within ES that supports truncating or transforming field data at index time. If you want to parse the data and remove the time before submission, you will need to do that outside of ES when you create the JSON object.
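Another hedged option along the same lines, if the goal is just to accept both shapes rather than truly truncate: an untested sketch listing several accepted formats separated by ||:
"dateFrom": {
  "type": "date",
  "format": "dateHourMinute||date_time"
}
Note that this stores whatever precision is sent; it does not strip the seconds and millis.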
It appears ES is currently not capable of editing dates through a custom mapping. We ended up using JsonConverters (like this) to drop seconds and millis before inserting them into ES.

Logstash/ElasticSearch: guesses wrong for datatype for field

The log files I'm trying to import into Logstash contain a field that sometimes looks like a date/time and sometimes does not. Unfortunately, the first occurrence looked like a date/time, and someone (Logstash or Elasticsearch) decided to define the field as a date/time. When trying to import a later log record, Elasticsearch throws an exception:
Failed to execute [index ...]
org.elasticsearch.index.mapper.MapperParsingException: Failed to parse [@fields.field99]
at org.elasticsearch.index.mapper.core.AbstractFieldMapper.parse(AbstractFieldMapper.java:320)
at org.elasticsearch.index.mapper.object.ObjectMapper.serializeValue(ObjectMapper.java:587)
...
Caused by: java.lang.IllegalArgumentException: Invalid format: "empty"
at org.elasticsearch.common.joda.time.format.DateTimeFormatter.parseMillis(DateTimeFormatter.java:747)
...
Question: how do I tell Logstash/Elasticsearch not to define this field as a date/time? I would like all the fields from my log (except the one explicit timestamp field) to be defined as just text.
Question: it appears that Logstash gives up trying to import records from the log file after hitting the one that Elasticsearch throws an exception on. How can I tell Logstash to ignore this exception and keep importing the other records from the log file?
I found the answer to my first question myself.
Before adding data through Logstash, I had to set the defaults for Elasticsearch to treat the field as "string" instead of "date".
I did this by creating a defaults.js file like this:
{
  "template": "logstash-*",
  "mappings": {
    "_default_": {
      "dynamic_templates": [{
        "fields_template": {
          "mapping": { "type": "string" },
          "path_match": "@fields.*"
        }
      }]
    }
  }
}
and telling Elasticsearch to use it before adding any data through Logstash:
curl -XPUT 'http://localhost:9200/_template/template_logstash/' -d @defaults_for_elasticsearch.js
Hope this helps someone else.
