Elasticsearch 2.3 put mapping error (attempting to override date field type)

I have some birth_dates that I want to store as strings. I don't plan on doing any querying or analysis on the data; I just want to store it.
The input data I have been given comes in lots of different, inconsistent formats, and some values even include strings like (approximate). Elasticsearch's dynamic mapping has decided that this should be a date field with a date format, which means that when it receives a value like 1981 (approx) it rejects the input as invalid.
Rather than cleaning up the input dates, I want to change the field type from date to string.
I have looked at the documentation and have been trying to update the mapping with the PUT mapping API, but Elasticsearch keeps returning a parsing error.
Based on the documentation here:
https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-put-mapping.html
I have tried:
PUT /sanctions_lists/eu_financial_sanctions/_mapping
{
  "mappings": {
    "eu_financial_sanctions": {
      "properties": {
        "birth_date": {
          "type": "string", "index": "not_analyzed"
        }
      }
    }
  }
}
but returns:
{
  "error": {
    "root_cause": [
      {
        "type": "mapper_parsing_exception",
        "reason": "Root mapping definition has unsupported parameters: [mappings : {eu_financial_sanctions={properties={birth_date={type=string, index=not_analyzed}}}}]"
      }
    ],
    "type": "mapper_parsing_exception",
    "reason": "Root mapping definition has unsupported parameters: [mappings : {eu_financial_sanctions={properties={birth_date={type=string, index=not_analyzed}}}}]"
  },
  "status": 400
}
Question Summary
Is it possible to override Elasticsearch's automatically detected date field, forcing string as the field type?
NOTE
I'm using the Google Chrome Sense plugin to send the requests.
The Elasticsearch version is 2.3.

Just remove the type and _mapping from the URL; you already have them inside the request body. This creates the index together with its mapping:
PUT /sanctions_lists
{
  "mappings": {
    "eu_financial_sanctions": {
      "properties": {
        "birth_date": {
          "type": "string", "index": "not_analyzed"
        }
      }
    }
  }
}
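Note that if the index already exists with birth_date already detected as a date, even a correctly formed put mapping call will be rejected, since an existing field's type can't be changed; you'd have to delete and recreate the index (or reindex into a new one). For reference, the 2.x put mapping endpoint takes the type in the path and the properties at the top level of the body, roughly like this:

PUT /sanctions_lists/_mapping/eu_financial_sanctions
{
  "properties": {
    "birth_date": {
      "type": "string", "index": "not_analyzed"
    }
  }
}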

Related

Keeping [Europe/Berlin] (or other timezones in this format) while indexing in Elasticsearch

I'm trying to familiarize myself with Elasticsearch, specifically defining the mapping within a JSON file and creating a new index with it (with the help of the new Java API Client and Spring Boot).
This is what my json file looks like:
{
  "mappings": {
    "properties": {
      "Id": {
        "type": "text"
      },
      "timestamp": {
        "type": "date",
        "format": "date_optional_time"
      },
      "metadata": {
        "type": "nested"
      },
      "attributes": {
        "type": "nested"
      }
    }
  }
}
This can index my documents just fine, but I realized that if I use ZonedDateTime.now() for the data in my timestamp field, it fails to index due to the [Europe/Berlin] at the end. It works if I change it to
ZonedDateTime now = ZonedDateTime.now();
String date = now.format(DateTimeFormatter.ISO_OFFSET_DATE_TIME);
which gives me the time but without [Europe/Berlin]! As far as I understand from my various googling and "stackoverflow-ing", ES does not accept a [Timezone] suffix in its date types, only offsets like +02:00. But is it possible to keep it? (Maybe through an ingest pipeline?)
There are various documents that I would like to reindex that have [Timezone] hanging at the end, but these older documents saved it as text. I would like to be able to do date math with the timestamp field in the future, which is why I decided to try and create a new/better mapping with proper fields. Any pointers appreciated!
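One possible approach, since the question itself mentions ingest pipelines: a gsub processor can strip the bracketed zone ID before the date field is parsed. A minimal sketch, with a made-up pipeline name, assuming the field is called timestamp as in the mapping above:

PUT _ingest/pipeline/strip_zone_id
{
  "description": "Drop a trailing [ZoneId] such as [Europe/Berlin] from timestamp",
  "processors": [
    {
      "gsub": {
        "field": "timestamp",
        "pattern": "\\[[^\\]]+\\]$",
        "replacement": ""
      }
    }
  ]
}

Indexing with ?pipeline=strip_zone_id (or setting index.default_pipeline on the index) would then accept the ZonedDateTime.now() output as-is, though the zone ID itself is lost; keeping it would require first copying it into a separate keyword field.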

How to update data type of a field in elasticsearch

I am publishing data to Elasticsearch using fluentd. It has a field Data.CPU which is currently set to string. The index name is health_gateway.
I have made some changes in the Python code that generates the data, so this field Data.CPU has now become an integer. But Elasticsearch is still showing it as a string. How can I update its data type?
I tried running the below commands in the Kibana dev tools:
PUT health_gateway/doc/_mapping
{
  "doc" : {
    "properties" : {
      "Data.CPU" : {"type" : "integer"}
    }
  }
}
But it gave me the below error:
{
  "error" : {
    "root_cause" : [
      {
        "type" : "illegal_argument_exception",
        "reason" : "Types cannot be provided in put mapping requests, unless the include_type_name parameter is set to true."
      }
    ],
    "type" : "illegal_argument_exception",
    "reason" : "Types cannot be provided in put mapping requests, unless the include_type_name parameter is set to true."
  },
  "status" : 400
}
There is also this document which says the data type can be converted using mutate, but I am not able to understand it properly.
I do not want to delete the index and recreate it, as I have created a visualization based on this index and it would be deleted along with the index. Can anyone please help with this?
The short answer is that you can't change the mapping of a field that already exists in a given index, as explained in the official docs.
The specific error you got is because you included /doc/ in your request path (you probably wanted /<index>/_mapping), but fixing this alone won't be sufficient.
Finally, I'm not sure you really have a dot in the field name there. Last I heard it wasn't possible to use dots in field names.
Nevertheless, there are several ways forward in your situation... here are a few of them:
Use a scripted field
You can add a scripted field to the Kibana index-pattern. It's quick to implement, but has major performance implications. You can read more about them on the Elastic blog here (especially under the heading "Match a number and return that match").
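As a hedged sketch (not taken from that blog post), a Painless scripted field that parses the string into a number might look like the expression below, assuming the values were dynamically mapped with the usual .keyword sub-field:

doc['Data.CPU.keyword'].size() == 0 ? 0 : Integer.parseInt(doc['Data.CPU.keyword'].value)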
Add a new multi-field
You could add a new multi-field. The example below assumes that CPU is an object sub-field under Data, rather than really being called Data.CPU with a literal dot:
PUT health_gateway/_mapping
{
  "properties": {
    "Data": {
      "properties": {
        "CPU": {
          "type": "keyword",
          "fields": {
            "int": {
              "type": "short"
            }
          }
        }
      }
    }
  }
}
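Note that existing documents won't have the new int sub-field populated until they are reindexed in place. An _update_by_query with no script simply runs each document through the updated mapping again, which is enough to pick it up:

POST health_gateway/_update_by_query?conflicts=proceed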
Reindex your data within ES
Use the Reindex API. Be sure to set the correct mapping on the target index.
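A minimal sketch, where health_gateway_v2 is a made-up target index that you would create first with Data.CPU mapped as integer:

POST _reindex
{
  "source": { "index": "health_gateway" },
  "dest": { "index": "health_gateway_v2" }
}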
Delete and reindex everything from source
If you are able to regenerate the data from source in a timely manner, without disrupting users, you can simply delete the index and reingest all your data with an updated mapping.
You can update the mapping by indexing the same field in multiple ways, i.e. by using multi-fields.
Using the below mapping, Data.CPU.raw will be of integer type. (Note that the string type was removed in Elasticsearch 5.x; on the version producing the error above, the parent field must be text or keyword.)
{
  "mappings": {
    "properties": {
      "Data": {
        "properties": {
          "CPU": {
            "type": "keyword",
            "fields": {
              "raw": {
                "type": "integer"
              }
            }
          }
        }
      }
    }
  }
}
Or you can create a new index with the correct mapping and reindex the data into it using the Reindex API.

How to use aggregation value on document update using script in Elasticsearch

I am trying to update a document field, selected by id, using a script. The value of that field should be max(field) * 2. For example, consider the following index:
PUT /my-index
{
  "mappings": {
    "properties": {
      "name": {
        "type": "text"
      },
      "cost": {
        "type": "integer"
      }
    }
  }
}
The document will be created with only the name field value:
POST /my-index/_doc/sp1
{
  "name": "Shirt"
}
Once this document was created, I wanted to update it with a cost value equal to the maximum value of cost in that index (max(cost) * 2). I tried this logic using the update API as follows:
POST /my-index/_doc/sp1
{
  "script" : {
    "source": "ctx._source.cost = Math.max(doc['cost'].value) * 2"
  }
}
But I wasn't able to achieve this. I encountered the following error:
"caused_by" : {
"type" : "illegal_argument_exception",
"reason" : "static method [java.lang.Math, max/1] not found"
}
How can I achieve this scenario?
It doesn't work that way. The _update API (which you're not using in your example, by the way) only allows you to update a document in its own context. You don't have access to any other document, only to the document itself (via ctx._source or doc) and the script parameters (via params).
There's no way to perform an aggregation on the whole index and update a specific document with the result. You need to do this in two steps from your client application (first query for the aggregation result, then index it into the document), or via the transform API, but the latter works in its own way.
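A sketch of the two-step approach (the value 84 below is a stand-in for whatever max(cost) * 2 your client computes from the aggregation response):

POST /my-index/_search
{
  "size": 0,
  "aggs": {
    "max_cost": { "max": { "field": "cost" } }
  }
}

POST /my-index/_update/sp1
{
  "doc": { "cost": 84 }
}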

Why are fields specified by type rather than index in Elasticsearch?

If multiple types in an Elasticsearch index have fields with the same name, those fields must have the same mapping.
For example, if you try to PUT the following index mapping, which tries to create a "foobar" property as both string and long...
{
  "mappings": {
    "type_one": {
      "properties": {
        "foobar": {
          "type": "string"
        }
      }
    },
    "type_two": {
      "properties": {
        "foobar": {
          "type": "long"
        }
      }
    }
  }
}
...the following error will be returned
{
  "error": {
    "root_cause": [
      {
        "type": "mapper_parsing_exception",
        "reason": "Failed to parse mapping [type_one]: mapper [foobar] cannot be changed from type [long] to [string]"
      }
    ],
    "type": "mapper_parsing_exception",
    "reason": "Failed to parse mapping [type_one]: mapper [foobar] cannot be changed from type [long] to [string]",
    "caused_by": {
      "type": "illegal_argument_exception",
      "reason": "mapper [foobar] cannot be changed from type [long] to [string]"
    }
  },
  "status": 400
}
The following is from the Elasticsearch site:
Conflicts between fields in different types
Fields in the same index with the same name in two different types must have the same mapping, as they are backed by the same field internally. Trying to update a mapping parameter for a field which exists in more than one type will throw an exception, unless you specify the update_all_types parameter, in which case it will update that parameter across all fields with the same name in the same index.
If fields with the same name must have the same mapping for all types in the index, then why are field mappings specified per type? Why not specify the fields for the entire index, and then identify which fields are assigned to each type?
For example something like this:
{
  "fields": {
    "PropA": {
      "type": "string"
    },
    "PropB": {
      "type": "long"
    },
    "PropC": {
      "type": "boolean"
    }
  },
  "types": {
    "foo": [
      "PropA",
      "PropB"
    ],
    "bar": [
      "PropA",
      "PropC"
    ],
    "baz": [
      "PropA",
      "PropB",
      "PropC"
    ]
  }
}
Wouldn't a mapping format like this be more succinct, and a better representation of what is actually allowed?
The reason I ask is because I'm working on creating an index template JSON file with about 80 different fields used across 15 types. Many of the fields are used across multiple if not all the types. So anytime I need to update a field I have to make sure I update it for every type where it's used.
Looks like I'm not the only one that found this confusing.
Remove support for types? #15613
Sounds like removing support for multiple types per index and specifying fields at the index level is on the roadmap for future versions.

Why is Elasticsearch trying to parse a timestamp found in a field consisting of (and mapped as) strings to a date, and how do I stop it?

How do you prevent Elasticsearch from attempting to parse dates it finds in string fields?
I have a simple json document like this:
{
  "key": "val",
  "key2": "val",
  "text_blob": ["hello", "world", "something else", "2015-01-01T00:00:00+1", "sentence"]
}
The timestamp's existence in the text_blob field is totally arbitrary. It was just present in the data and doesn't really mean anything. However, because it's there, Elasticsearch seemingly thinks it's special and tries to map it to dateOptionalTime. I want it to just keep on being a plain ol' string!
I tried explicitly declaring a mapping on that field before loading in my data.
POST myindex
{
  "mappings": {
    "mytype": {
      "_source": {"enabled": true},
      "properties": {
        "text_blob": {"type": "string"}
      }
    }
  }
}
But it seems to have no effect. As soon as Elasticsearch finds that date string among the other strings, it tries to apply a new mapping and explodes with:
MapperParsingException[failed to parse [text_blob]]; nested: MapperParsingException[failed to parse date field [None], tried both date format [dateOptionalTime], and timestamp number with locale []];
But this error is really somewhat of a red herring in my opinion. It's exploding because it can't parse the timestamp string that contains an offset. However, the core issue is why it's trying to parse it as a date at all.
Change your mapping to this:
POST myindex
{
  "mappings": {
    "mytype": {
      "_source": {"enabled": true},
      "properties": {
        "text_blob": {
          "type": "string",
          "index": "not_analyzed"
        }
      }
    }
  }
}
This will stop Elasticsearch from analyzing the field in any way whatsoever; string fields are analyzed by default.
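A related lever, added here for completeness rather than taken from the answer above, is switching off dynamic date detection at the type level, so that strings in dynamically added fields are never sniffed as dates. It only affects dynamically mapped fields, so it complements, rather than replaces, the explicit mapping:

POST myindex
{
  "mappings": {
    "mytype": {
      "date_detection": false,
      "properties": {
        "text_blob": {"type": "string", "index": "not_analyzed"}
      }
    }
  }
}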
