How to update data type of a field in elasticsearch - elasticsearch

I am publishing a data to elasticsearch using fluentd. It has a field Data.CPU which is currently set to string. Index name is health_gateway
I have made some changes in python code which is generating the data so now this field Data.CPU has now become integer. But still elasticsearch is showing it as string. How can I update it data type.
I tried running below commands in kibana dev tools:
PUT health_gateway/doc/_mapping
{
"doc" : {
"properties" : {
"Data.CPU" : {"type" : "integer"}
}
}
}
But it gave me below error:
{
"error" : {
"root_cause" : [
{
"type" : "illegal_argument_exception",
"reason" : "Types cannot be provided in put mapping requests, unless the include_type_name parameter is set to true."
}
],
"type" : "illegal_argument_exception",
"reason" : "Types cannot be provided in put mapping requests, unless the include_type_name parameter is set to true."
},
"status" : 400
}
There is also this document which says using mutate we can convert the data type but I am not able to understand it properly.
I do not want to delete the index and recreate as I have created a visualization based on this index and after deleting it will also be deleted. Can anyone please help in this.

The short answer is that you can't change the mapping of a field that already exists in a given index, as explained in the official docs.
The specific error you got is because you included /doc/ in your request path (you probably wanted /<index>/_mapping), but fixing this alone won't be sufficient.
Finally, I'm not sure you really have a dot in the field name there. Last I heard it wasn't possible to use dots in field names.
Nevertheless, there are several ways forward in your situation... here are a couple of them:
Use a scripted field
You can add a scripted field to the Kibana index-pattern. It's quick to implement, but has major performance implications. You can read more about them on the Elastic blog here (especially under the heading "Match a number and return that match").
Add a new multi-field
You could add a new multifield. The example below assumes that CPU is a nested field under Data, rather than really being called Data.CPU with a literal .:
PUT health_gateway/_mapping
{
"doc": {
"properties": {
"Data": {
"properties": {
"CPU": {
"type": "keyword",
"fields": {
"int": {
"type": "short"
}
}
}
}
}
}
}
}
Reindex your data within ES
Use the Reindex API. Be sure to set the correct mapping on the target index.
Delete and reindex everything from source
If you are able to regenerate the data from source in a timely manner, without disrupting users, you can simply delete the index and reingest all your data with an updated mapping.

You can update the mapping, by indexing the same field in multiple ways i.e by using multi fields.
Using the below mapping, Data.CPU.raw will be of integer type
{
"mappings": {
"properties": {
"Data": {
"properties": {
"CPU": {
"type": "string",
"fields": {
"raw": {
"type": "integer"
}
}
}
}
}
}
}
}
OR you can create a new index with correct index mapping, and reindex the data in it using the reindex API

Related

How to use aggregation value on document update using script in Elasticsearch

I am trying to update a document field based on id using script. The value of that field should be MAX(field) * 2. For example consider the following index
PUT /my-index
{
"mappings": {
"properties": {
"name": {
"type": "text"
},
"cost": {
"type": "integer"
}
}
}
}
Document will be created with only name field value
POST /my-index/_doc/sp1
{
"name": "Shirt"
}
Once this document was created, I want to update this document with cost value as maximum value of cost in that index (max(cost) * 2). I tried this logic using update API as follows
POST /my-index/_doc/sp1
{
"script" : {
"source": "ctx._source.cost = Math.max(doc['cost'].value) * 2"
}
}
But I couldn't able to achieve this. Encountered the following error
"caused_by" : {
"type" : "illegal_argument_exception",
"reason" : "static method [java.lang.Math, max/1] not found"
}
How to achieve this scenario
It doesn't work that way. The _update API (which you're not using in your example by the way), only allows you to update a document in its own context. You don't have access to any other document, only to the document itself (via ctx._source or doc) and the script parameters (via params).
There's no way to perform an aggregation on the whole index and update a specific document with the result. You need to do this in two steps from your client application (first query for aggregation results + then index the result into a document) or via the transform API but the latter works in its own way.

How to get the analyzed text from the elasticsearch database

I need to get the analyzed text from the elasticseatch database. I know that I can apply an analyzer to any text using the analyze API, however, since the text has already be analyzed during indexing, there should be a way to get access to the analyzed data.
Here is what I want to do using the analyze API and Python Elasticsearch
res = es.indices.analyze(index=app.config['ES_ARXIV_PAPER_INDEX'],
body={"char_filter": ["html_strip"],
"tokenizer" : "standard",
"filter" : ["lowercase", "stop", "snowball"],
"text" : text})
tokens = []
for token in res['tokens']:
tokens.append(token['token'])
print("tokens = ", tokens)
I noticed that this procedure is actually quite slow. So getting the data directly from the indexed data should be much faster.
Using the termvectors api should do the job, but you must specify the id of every entry and it must be enabled (since the information is stored). If you don't want that, then you are already using the correct method.
Example below:
PUT my_index
{
"mappings": {
"my_type": {
"properties": {
"my_field": {
"type": "text"
}
}
}
}
}
POST my_index/my_type/1
{
"my_field": "this is a test"
}
GET /my_index/my_type/1/_termvectors?fields=*
https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-termvectors.html
https://www.elastic.co/guide/en/elasticsearch/reference/current/term-vector.html

Is it possible to update an existing field in an index through mapping in Elasticsearch?

I've already created an index, and it contains data from my MySQL database. I've got few fields which are string in my table, where I need them as different types (integer & double) in Elasticsearch.
So I'm aware that I could do it through mapping as follows:
{
"mappings": {
"my_type": {
"properties": {
"userid": {
"type": "text",
"fielddata": true
},
"responsecode": {
"type": "integer"
},
"chargeamount": {
"type": "double"
}
}
}
}
}
But I've tried this when I'm creating the index as a new one. What I wanted to know is how can I update an existing field (ie: chargeamount in this scenario) using mapping as a PUT?
Is this possible? Any help could be appreciated.
Once a mapping type has been created, you're very constrained on what you can update. According to the official documentation, the only changes you can make to an existing mapping after it's been created are the following, but changing a field's type is not one of them:
In general, the mapping for existing fields cannot be updated. There
are some exceptions to this rule. For instance:
new properties can be added to Object datatype fields.
new multi-fields can be added to existing fields.
doc_values can be disabled, but not enabled.
the ignore_above parameter can be updated.

Elasticsearch 2.3 put mapping (Attempting to override date field type) error

I have some birth_dates that I want to store as a string. I don't plan on doing any querying or analysis on the data, I just want to store it.
The input data I have been given is in lots of different random formats and some even include strings like (approximate). Elastic has determined that this should be a date field with a date format which means when elastic receives a date like 1981 (approx) it freaks out and says the input is in an invalid format.
Instead of changing input dates I want to change the date type to string.
I have looked at the documentation and have been trying to update the mapping with the PUT mapping API, but elastic keeps returning a parsing error.
based on the documentation here:
https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-put-mapping.html
I have tried:
PUT /sanctions_lists/eu_financial_sanctions/_mapping
{
"mappings":{
"eu_financial_sanctions":{
"properties": {
"birth_date": {
"type": "string", "index":"not_analyzed"
}
}
}
}
}
but returns:
{
"error": {
"root_cause": [
{
"type": "mapper_parsing_exception",
"reason": "Root mapping definition has unsupported parameters: [mappings : {eu_financial_sanctions={properties={birth_date={type=string, index=not_analyzed}}}}]"
}
],
"type": "mapper_parsing_exception",
"reason": "Root mapping definition has unsupported parameters: [mappings : {eu_financial_sanctions={properties={birth_date={type=string, index=not_analyzed}}}}]"
},
"status": 400
}
Question Summary
Is it possible to override elasticsearch's automatically determined date field, forcing string as the field type?
NOTE
I'm using the google chrome sense plugin to send the requests
Elastic search version is 2.3
Just remove type reference and mapping from url, you have them inside request body. More examples.
PUT /sanctions_lists
{
"mappings":{
"eu_financial_sanctions":{
"properties": {
"birth_date": {
"type": "string", "index":"not_analyzed"
}
}
}
}
}

Unanalyzed fields on Kibana

i need help to correct kibana field. when I try to visualizing the fields, shown me the following warning:
Careful! The field contains Analyzed selected strings. Analyzed
strings are highly unique and can use a lot of memory to visualize.
Values: such as bar will be foo-foo and bar broken into. See Core
Mapping Types for more information on setting esta field Analyzed as
not
Elasticsearch default dynamic mapping is to analyze any string field (break the field into tokens, for instance: aaa_bbb_ccc will be break down into aaa,bbb and ccc).
If you do not want such behavior you must change the mapping settings
before any document was pushed into the index.
You have two options to do that:
Change the mapping for a particular index using mapping API, in a static way or dynamic way (dynamic means that the mapping will be applies also to fields that still does not exist in the index)
You can change the behavior of any index according to a pattern, using the template API
This example shows a template that changes the mapping for any index that starts with "app", applying "not analyze" to any field in any type and make sure "timestamp" is a date (good for cases in with the timestamp is represented as a number of seconds from 1970):
{
"template": "myindciesprefix*",
"mappings": {
"_default_": {
"dynamic_templates": [
{
"strings": {
"match_mapping_type": "string",
"mapping": {
"type": "string",
"index": "not_analyzed"
}
}
},
{
"timestamp_field": {
"match": "timestamp",
"mapping": {
"type": "date"
}
}
}
]
}
}
}
Really you dont have any problem is only a message of info, but if you dont want analyzed fields when you build your index in elasticsearch you must indicate that one field is a not analyzed field.

Resources