Why is Elasticsearch trying to parse a timestamp found in a field consisting of (and mapped as) strings to a date, and how do I stop it?

How do you prevent Elasticsearch from attempting to parse dates it finds in string fields?
I have a simple json document like this:
{
  "key": "val",
  "key2": "val",
  "text_blob": ["hello", "world", "something else", "2015-01-01T00:00:00+1", "sentence"]
}
The timestamp's existence in the text_blob field is totally arbitrary. It was just present in the data and doesn't really mean anything. However, because it's there, Elastic seemingly thinks it's special and tries to map it to dateOptionalTime. I want it to just keep on being a plain ol' string!
I tried explicitly declaring a mapping on that field before loading in my data.
POST myindex
{
  "mappings": {
    "mytype": {
      "_source": {"enabled": true},
      "properties": {
        "text_blob": {"type": "string"}
      }
    }
  }
}
But it seems to have no effect. As soon as Elastic finds that date string among the other strings, it tries to apply a new mapping and explodes with:
MapperParsingException[failed to parse [text_blob]]; nested: MapperParsingException[failed to parse date field [None], tried both date format [dateOptionalTime], and timestamp number with locale []];
But this error is really somewhat of a red herring in my opinion. It's exploding because it can't parse the timestamp string that contains an offset. However, the core issue is why it's trying to parse it as a date at all.

Change your mapping to this:
POST myindex
{
  "mappings": {
    "mytype": {
      "_source": {"enabled": true},
      "properties": {
        "text_blob": {
          "type": "string",
          "index": "not_analyzed"
        }
      }
    }
  }
}
This will stop Elasticsearch from analyzing the field in any way whatsoever. String fields are analyzed by default.
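If dynamic mapping still tries to turn other strings into dates, date detection itself can also be switched off at the type level. A minimal sketch, assuming you recreate the index from scratch:
POST myindex
{
  "mappings": {
    "mytype": {
      "date_detection": false,
      "_source": {"enabled": true},
      "properties": {
        "text_blob": {"type": "string"}
      }
    }
  }
}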

Related

keeping [Europe/Berlin] (or other timezones in this format) while indexing in Elasticsearch

I'm trying to familiarize myself with Elasticsearch, specifically with defining a mapping in a JSON file and creating a new index from it (using the new Java API Client and Spring Boot).
This is what my json file looks like:
{
  "mappings": {
    "properties": {
      "Id": {
        "type": "text"
      },
      "timestamp": {
        "type": "date",
        "format": "date_optional_time"
      },
      "metadata": {
        "type": "nested"
      },
      "attributes": {
        "type": "nested"
      }
    }
  }
}
This indexes my documents just fine, but I realized that if I use ZonedDateTime.now() for the data in my timestamp field, indexing fails because of the [Europe/Berlin] at the end. It works if I change it to:
ZonedDateTime now = ZonedDateTime.now();
String date = now.format(DateTimeFormatter.ISO_OFFSET_DATE_TIME);
which gives me the time but without [Europe/Berlin]! As far as I understand from my googling and "stackoverflow-ing", ES does not accept the [Timezone] suffix in its date types, only the +02:00 offset format. But is it possible to keep it? (Maybe through an ingest pipeline?)
There are various documents that I would like to reindex that have [Timezone] hanging at the end, but those older documents saved it as text... I would like to be able to do date math with the timestamp field in the future, which is why I decided to try and create a new/better mapping with proper fields. Any pointers appreciated!
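Picking up the asker's own suggestion: an ingest pipeline can strip the bracketed zone ID before the date is parsed. A minimal sketch, assuming the field is called timestamp (the pipeline name is a placeholder):
PUT _ingest/pipeline/strip-zone-id
{
  "description": "Remove a trailing [ZoneId] such as [Europe/Berlin] so the value parses as date_optional_time",
  "processors": [
    {
      "gsub": {
        "field": "timestamp",
        "pattern": "\\[[^\\]]+\\]$",
        "replacement": ""
      }
    }
  ]
}
Documents indexed with ?pipeline=strip-zone-id then keep the +02:00 offset but lose the zone name; if the zone ID itself must be preserved, it could first be copied into a separate keyword field.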

Elasticsearch 2.3 put mapping (Attempting to override date field type) error

I have some birth_dates that I want to store as a string. I don't plan on doing any querying or analysis on the data, I just want to store it.
The input data I have been given comes in lots of different random formats, and some values even include strings like "(approximate)". Elasticsearch has dynamically decided this should be a date field with a date format, which means that when it receives a value like "1981 (approx)" it rejects the input as invalidly formatted.
Instead of changing input dates I want to change the date type to string.
I have looked at the documentation and have been trying to update the mapping with the PUT mapping API, but elastic keeps returning a parsing error.
Based on the documentation here:
https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-put-mapping.html
I have tried:
PUT /sanctions_lists/eu_financial_sanctions/_mapping
{
  "mappings": {
    "eu_financial_sanctions": {
      "properties": {
        "birth_date": {
          "type": "string", "index": "not_analyzed"
        }
      }
    }
  }
}
but returns:
{
  "error": {
    "root_cause": [
      {
        "type": "mapper_parsing_exception",
        "reason": "Root mapping definition has unsupported parameters: [mappings : {eu_financial_sanctions={properties={birth_date={type=string, index=not_analyzed}}}}]"
      }
    ],
    "type": "mapper_parsing_exception",
    "reason": "Root mapping definition has unsupported parameters: [mappings : {eu_financial_sanctions={properties={birth_date={type=string, index=not_analyzed}}}}]"
  },
  "status": 400
}
Question Summary
Is it possible to override Elasticsearch's automatically determined date field, forcing string as the field type?
NOTE
I'm using the Google Chrome Sense plugin to send the requests.
Elasticsearch version is 2.3.
Just remove the type reference and _mapping from the URL; you already have them inside the request body:
PUT /sanctions_lists
{
  "mappings": {
    "eu_financial_sanctions": {
      "properties": {
        "birth_date": {
          "type": "string", "index": "not_analyzed"
        }
      }
    }
  }
}
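Note that ES 2.x cannot change an existing field's type in place, so if sanctions_lists already exists with birth_date dynamically mapped as a date, the index has to be dropped (or the data reindexed into a fresh index) before the mapping above will apply:
DELETE /sanctions_lists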

Unanalyzed fields on Kibana

I need help correcting a Kibana field. When I try to visualize the fields, it shows me the following warning:
Careful! The field contains analyzed strings. Analyzed strings are highly
unique and can use a lot of memory to visualize. Values such as foo-bar
will be broken into foo and bar. See Core Mapping Types for more
information on setting this field as not analyzed.
Elasticsearch's default dynamic mapping is to analyze any string field (break the field into tokens; for instance, aaa_bbb_ccc will be broken down into aaa, bbb and ccc).
If you do not want that behavior, you must change the mapping settings
before any document is pushed into the index.
You have two options to do that:
Change the mapping for a particular index using the mapping API, in a static or a dynamic way (dynamic means the mapping will also be applied to fields that do not yet exist in the index); see the per-index sketch further below.
Change the behavior of any index matching a name pattern, using the template API.
This example shows a template that changes the mapping for any index whose name starts with the prefix myindicesprefix, applying not_analyzed to every string field in every type and making sure timestamp is mapped as a date (useful when the timestamp arrives as a number of milliseconds since 1970). It is registered with the template API; the template name mytemplate below is arbitrary:
PUT /_template/mytemplate
{
  "template": "myindicesprefix*",
  "mappings": {
    "_default_": {
      "dynamic_templates": [
        {
          "strings": {
            "match_mapping_type": "string",
            "mapping": {
              "type": "string",
              "index": "not_analyzed"
            }
          }
        },
        {
          "timestamp_field": {
            "match": "timestamp",
            "mapping": {
              "type": "date"
            }
          }
        }
      ]
    }
  }
}
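To verify the template actually applied, check the mapping of a newly created matching index (the index name here is hypothetical):
GET myindicesprefix-2015.06/_mapping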
Really you don't have a problem here; it's only an informational message. But if you do not want analyzed fields, you must indicate in the mapping, when you build your index, that the field is not_analyzed.
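A minimal per-index sketch of that (the index, type, and field names are placeholders):
PUT myindex
{
  "mappings": {
    "mytype": {
      "properties": {
        "myfield": {
          "type": "string",
          "index": "not_analyzed"
        }
      }
    }
  }
}
As with the template, this only takes effect for indices created after the mapping is in place; existing analyzed fields keep their mapping.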

ElasticSearch Mapping: is it possible to auto-truncate a date to fit its format?

On our project we're using NEST to insert data into ElasticSearch (1.7). We'd like to be able to force ES to truncate all dates towards the mapped format.
Mapping example:
"dateFrom" : {
"type": "date",
"format": "dateHourMinute" // Or yyyy-MM-dd'T'HH:mm
}
Data example:
{
  "dateFrom": "2015-12-21T15:55:00.000Z"
}
Inserting this data throws an IllegalArgumentException:
Invalid format: "2015-12-21T15:55:00.000Z" is malformed at ":00.000Z"
Obviously we don't need the last part of the date. Can't we configure ES to just truncate it instead of erroring out?
Keep in mind we're using 1.7 right now, since date formatting seems to have changed in recent versions...
In order to get the data to index correctly, you could change the date format to date_optional_time (supported in 1.7):
PUT my_index
{
  "mappings": {
    "my_type": {
      "properties": {
        "date": {
          "type": "date",
          "format": "date_optional_time"
        }
      }
    }
  }
}
This will allow you to submit dates with the time portion being optional, such as:
PUT /my_index/my_type/1
{
  "date": "2015-12-21"
}
or as you have it:
PUT /my_index/my_type/2
{
  "date": "2015-12-21T15:55:00.000Z"
}
Both are now valid submissions. I don't know of any transformation approach within ES that supports truncating or transforming field data at index time. If you want to parse the data and remove the time pre-submission, you will need to do that outside of ES when you create the JSON object.
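Another mapping-level option that may be worth testing (an assumption on my side, not something the answer above verified on 1.7): date fields accept several formats separated by ||, so both the strict and the full representation can be allowed:
PUT my_index
{
  "mappings": {
    "my_type": {
      "properties": {
        "dateFrom": {
          "type": "date",
          "format": "dateHourMinute||date_time"
        }
      }
    }
  }
}
This does not truncate the stored value, but it stops the insert from failing on "2015-12-21T15:55:00.000Z".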
It appears ES is currently not capable of editing dates through a custom mapping. We ended up using JSON converters on the NEST side to drop seconds and millis before inserting the dates into ES.

How do you transform a date that's stored as a long (epoch time) into a dateOptionalTime in Elasticsearch?

I have a field in my database that's stored as Epoch time, which is a long. I'm trying to get Elasticsearch to recognize this field as what it actually is: a date. Once indexed by Elasticsearch, I want it to be of type dateOptionalTime.
My thinking is that I need to apply a transform to convert the Epoch long into a string date.
On my index, I have a mapping that specifies the type for my field as date with a format of dateOptionalTime. Finally, this timestamp is in all of my docs, so I've added my (attempted) mapping to _default_.
The code:
"_default_": {
  "_all": {"enabled": "true"},
  "dynamic_templates": [
    {
      "date_fixer": {
        "match": "my_timestamp",
        "mapping": {
          "transform": {
            "script": "ctx._source['my_timestamp'] = new Date(ctx._source['my_timestamp']).toString()"
          },
          "type": "date"
        }
      }
    }
  ]
}
I'm brand new to Elastic, so I'll walk through what I think is happening.
I'm setting this on the _default_ type, which will apply it to all new types Elastic encounters.
I set _all to enabled. I want Elastic to use the default mapping for all types, with the exception of my timestamp field.
Finally, I add my dynamic template that (a) converts the long into a date, and (b) applies a mapping to the timestamp field explicitly saying that it is a date.
The Problem:
When I try to add my data to the index, I get the following exception.
TransportError(400, u'MapperParsingException[failed to parse [my_timestamp]]; nested: ElasticsearchIllegalArgumentException[unknown property [$date]]; ')
My data looks like:
{
  "my_timestamp": 8374747594,
  "owner": "text",
  "some_more": {
    "key": "val",
    "key2": "val"
  }
}
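A detail visible in the very first question on this page: the parser "tried both date format [dateOptionalTime], and timestamp number", i.e. a plain date-mapped field already accepts a long as milliseconds since the epoch, so no transform should be needed as long as the values are milliseconds rather than seconds. A minimal sketch with a placeholder index name:
PUT myindex
{
  "mappings": {
    "_default_": {
      "properties": {
        "my_timestamp": {
          "type": "date",
          "format": "dateOptionalTime"
        }
      }
    }
  }
}
If the source values are epoch seconds, multiply them by 1000 client-side before indexing.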
