Elasticsearch: map date as text?

I have json data that has a "product_ref" field that can take these values as an example:
"product_ref": "N/A"
"product_ref": "90323"
"product_ref": "SN3005"
"product_ref": "2015-05-23"
When pushing the data to the index I get a mapping error:
{"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"mapper [root.product_ref] of different type, current_type [date], merged_type [text]"}],"type":"illegal_argument_exception","reason":"mapper [root.product_ref] of different type, current_type [date], merged_type [text]"},"status":400}
Any idea?

There is something called date detection, and by default, it is enabled.
If date_detection is enabled (default), then new string fields are checked to see whether their contents match any of the date patterns specified in dynamic_date_formats. If a match is found, a new date field is added with the corresponding format.
You just need to disable it by modifying your mappings:
PUT /products
{
  "mappings": {
    "doc": {
      "date_detection": false,
      "properties": {
        "product_ref": { "type": "keyword" }
      }
    }
  }
}
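With date detection disabled and product_ref mapped as keyword, all of the sample values from the question should now index without a type conflict, for example (index and type names follow the mapping above):

```json
POST /products/doc
{ "product_ref": "2015-05-23" }

POST /products/doc
{ "product_ref": "N/A" }
```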

This is happening because Elasticsearch assumed you were indexing dates of a particular format, and a value that doesn't match it was then indexed: once the first date-like value has fixed the field's type, any non-date value is rejected. Make sure all the values are dates and none are empty; alternatively, filter the bad values out in your ingestion layer.
EDIT: If you don't mind losing the date typing, you can use a dynamic template so that any string that would be detected as a date is mapped as text instead:
PUT /products
{
  "mappings": {
    "dynamic_templates": [
      {
        "dates_as_text": {
          "match_mapping_type": "date",
          "mapping": {
            "type": "text"
          }
        }
      }
    ]
  }
}


Elastic Dynamic field mapping

ES version 6.8.12
I want to map all types of data to a given field in an index; it should accept any type of data instead of being bound to a specific one.
I'm facing an issue when a string is stored in a long-typed field.
[WARN ] 2020-09-14 06:34:36.470 [[main]>worker0] elasticsearch - Could not index event to Elasticsearch. {:status=>400, :action=>["index", {:_id=>"5f4632bab98bdd75a267546b", :_index=>"cdrindex", :_type=>"doc", :routing=>nil}, #<LogStash::Event:0x38a5506>], :response=>{"index"=>{"_index"=>"cdrindex", "_type"=>"doc", "_id"=>"5f4632bab98bdd75a267546b", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"failed to parse field [caller_id_number] of type [long] in document with id '5f4632bab98bdd75a267546b'", "caused_by"=>{"type"=>"illegal_argument_exception", "reason"=>"For input string: \"Anonymous\""}}}}}
Then you need to pick the text or keyword data type.
In your mapping, you need to set the caller_id_number data type explicitly to one of the above instead of letting Elasticsearch decide for you.
For instance:
PUT your-index
{
  "mappings": {
    "properties": {
      "caller_id_number": {
        "type": "text"
      },
      ...
    }
  }
}
Note that you can leverage dynamic mappings if you want to automatically set the mapping for some fields:
PUT your-index
{
  "mappings": {
    "dynamic_templates": [
      {
        "sources": {
          "match": "caller_*",
          "mapping": {
            "type": "text"
          }
        }
      }
    ],
    "properties": {
      "specific_field": {
        "type": "long"
      }
    }
  }
}
With the dynamic template above, all fields starting with caller_ are automatically mapped as text, while specific_field is mapped as long.

Trying to index fields which are missing from the defined Elasticsearch template

I couldn't find any documentation regarding the following issue:
We are creating a template file for all the fields we are indexing into Elasticsearch. The question is about fields which are not defined in the template:
What is the default mapping they are indexed with?
What are the limitations (if any) for indexing those fields?
I was trying to index a field whose value is a list of JSON objects and I got the exception "Can't get text on a START_OBJECT at 1:311" - what does it mean?
String fields are indexed as a text field with the standard analyzer, plus a .keyword subfield of type keyword with ignore_above set to 256. Strings that match one of the default dynamic date formats (strict_date_optional_time or yyyy/MM/dd HH:mm:ss) are mapped as date. Whole numbers default to long and decimals to double. You can modify this default behaviour with dynamic templates. For example, if we wanted to map all integer fields as short instead of long, and all string fields as keyword, we could use the following template:
PUT my_index
{
  "mappings": {
    "dynamic_templates": [
      {
        "integers": {
          "match_mapping_type": "long",
          "mapping": {
            "type": "short"
          }
        }
      },
      {
        "strings": {
          "match_mapping_type": "string",
          "mapping": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      }
    ]
  }
}
There are no limitations on indexing fields that are not defined in templates.
The "Can't get text on a START_OBJECT" error means Elasticsearch expected a scalar value (e.g. text) for a field but received a JSON object instead - the document's structure doesn't match the mapping. Could you share this JSON?
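To illustrate those defaults: after indexing a document with a new string field and fetching the mapping (GET <index>/_mapping), you would typically see something along these lines for that field (a sketch of the usual dynamic mapping; my_string_field is a placeholder name, not exact output):

```json
"my_string_field": {
  "type": "text",
  "fields": {
    "keyword": {
      "type": "keyword",
      "ignore_above": 256
    }
  }
}
```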

How to range query a date mapped as long

I have an external ES db (i.e. I can't change its structure) with the following mapping
"failure_url": {
  "properties": {
    "lastAccessTime": {
      "type": "long"
    },
    "url": {
      "type": "keyword"
    }
  }
}
lastAccessTime represents a date, but is mapped as a long. A standard "range" filter fails with
"caused_by": {
  "type": "number_format_exception",
  "reason": "For input string: \"now\""
}
is the error in my filter expression or is it due to the field not being a "date"? If the latter, how can I still query this date?
If you want to query your field with now and other date match expressions, then your field must be defined as a date type.
One thing you can do is to create another field of type date and then update your index to populate that new field.
So first change your mapping like this:
PUT my-index/_mappings
{
  "properties": {
    "lastAccessDate": {
      "type": "date"
    }
  }
}
And then you can leverage the update by query API in order to index your dates from the long field:
POST my-index/_update_by_query
{
  "script": {
    "source": "ctx._source.lastAccessDate = ctx._source.lastAccessTime"
  }
}
When that's done, you'll be able to run your range query on the new lastAccessDate field of type date.
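For example, assuming lastAccessTime holds epoch milliseconds (which the date type accepts by default), a range query with date math on the new field could look like this sketch:

```json
POST my-index/_search
{
  "query": {
    "range": {
      "lastAccessDate": {
        "gte": "now-30d/d",
        "lte": "now"
      }
    }
  }
}
```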

Can Elasticsearch only index a field without saving the original value, just like Lucene Field.Store.NO?

I have a big size field in MySQL and do not want to save the original value to ElasticSearch. Is there a method just like Lucene Field.Store.NO?
Thanks.
You just need to define the "store" mapping option accordingly (note that store already defaults to false), e.g.:
PUT your-index
{
  "mappings": {
    "properties": {
      "some_field": {
        "type": "text",
        "index": true,
        "store": false
      }
    }
  }
}
You may also want to disable the _source field. From the Elasticsearch docs:
The _source field contains the original JSON document body that was passed at index time [...] Though very handy to have around, the source field does incur storage overhead within the index.
For this reason, it can be disabled as follows:
PUT your-index
{
  "mappings": {
    "_source": {
      "enabled": false
    }
  }
}

How to define a mapping in Elasticsearch that doesn't accept fields other than the mapped ones?

OK, in my Elasticsearch I am using the following mapping for an index:
{
  "mappings": {
    "mytype": {
      "type": "object",
      "dynamic": "false",
      "properties": {
        "name": {
          "type": "string"
        },
        "address": {
          "type": "string"
        },
        "published": {
          "type": "date"
        }
      }
    }
  }
}
it works. In fact if I put a malformed date in the field "published" it complains and fails.
Also I've the following configuration:
...
node.name : node1
index.mapper.dynamic : false
index.mapper.dynamic.strict : true
...
And without the mapping, I can't really use the type. The problem is that if I insert something like:
{
  "name": "boh58585",
  "address": "hiohio",
  "published": "2014-4-4",
  "test": "hophiophop"
}
it will happily accept it, which is not the behaviour I expect, because the field test is not in the mapping. How can I restrict the fields of the document to only those that are in the mapping?
The use of "dynamic": false tells Elasticsearch to ignore new fields: they are kept in _source but never added to the mapping or indexed. If you want an error thrown when you try to index documents with fields outside of the defined mapping, use "dynamic": "strict" instead.
From the docs:
"The dynamic parameter can also be set to strict, meaning that not only new fields will not be introduced into the mapping, parsing (indexing) docs with such new fields will fail."
Since you've defined this in the settings, I would guess that leaving out the dynamic from the mapping definition completely will default to "dynamic": "strict".
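Applied to the mapping from the question (field types unchanged), that would look roughly like this sketch; with it in place, indexing the document containing the extra test field should be rejected with a strict dynamic mapping error:

```json
{
  "mappings": {
    "mytype": {
      "dynamic": "strict",
      "properties": {
        "name": { "type": "string" },
        "address": { "type": "string" },
        "published": { "type": "date" }
      }
    }
  }
}
```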
Is your problem with the malformed date field?
I would fix the date issue and continue to use dynamic: false.
You can read about the ways to set up the date field mapping for a custom format here.
Put the date format string in a { "type": "date", "format": "..." } mapping.
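For the "2014-4-4" value from the question, a format that tolerates single-digit months and days is one option (the exact format string is an assumption; adjust it to your data):

```json
"published": {
  "type": "date",
  "format": "yyyy-M-d"
}
```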
