Null value mapping in Elastic - elasticsearch

I have previously created an index to which I want to add a null_value property:
PUT /check_test-1/_mapping
{
  "properties": {
    "name": {
      "type": "keyword",
      "null_value": "N/A"
    }
  }
}
EXISTING INDEX:
{ "check_test-1" : {
"mappings" : {
"properties" : {
"name" : {
"type" : "keyword"
},
"status_code" : {
"type" : "keyword",
"null_value" : "N/A"
}
}
}}}
When I run the above query, it gives this error:
"type" : "illegal_argument_exception",
"reason" : "Mapper for [name] conflicts with existing mapper:\n\tCannot update parameter [null_value] from [null] to [N/A]"
The index was created using the query below:
PUT /check_test-1/
{
"mappings": {
"properties": {
"name": {
"type": "keyword"
},
"status_code": {
"type": "keyword",
"null_value": "N/A"
}
}
}
}
Elastic version 7.10.0

As the error message says, you cannot change the existing mapping of a field.
Check the Elasticsearch documentation about mapping updates, especially this part:
If you need to change the mapping of a field in other indices, create a new index with the correct mapping and reindex your data into that index.
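A minimal sketch of that approach, using the field names from the question (check_test-2 is just a hypothetical name for the new index):
PUT /check_test-2
{
  "mappings": {
    "properties": {
      "name": {
        "type": "keyword",
        "null_value": "N/A"
      },
      "status_code": {
        "type": "keyword",
        "null_value": "N/A"
      }
    }
  }
}

POST /_reindex
{
  "source": { "index": "check_test-1" },
  "dest": { "index": "check_test-2" }
}
Since null_value is applied at index time, any document copied by _reindex that contains an explicit "name": null will be indexed with "N/A" in the new index.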

Related

Elasticsearch change type existing fields

In my case, NiFi receives data from a syslog firewall and, after transformation, sends JSON to Elasticsearch. This is my first contact with Elasticsearch.
{
"LogChain" : "Corp01 input",
"src_ip" : "162.142.125.228",
"src_port" : "61802",
"dst_ip" : "177.16.1.13",
"dst_port" : "6580",
"timestamp_utc" : 1646226066899
}
Elasticsearch automatically created the index with these types:
{
"mt-firewall" : {
"mappings" : {
"properties" : {
"LogChain" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"dst_ip" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"dst_port" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"src_ip" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"src_port" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"timestamp_utc" : {
"type" : "long"
}
}
}
}
}
How can I change the field types in Elasticsearch?
"src_ip": type "ip"
"dst_ip": type "ip"
"timestamp_utc": type "date"
You can change or configure field types using mappings in Elasticsearch; a few of the ways are given below:
1. Explicit index mapping
Here, you define the index mapping yourself, with all the required fields and their specific types, before indexing any document into Elasticsearch.
PUT /my-index-000001
{
"mappings": {
"properties": {
"src_ip": { "type": "ip" },
"dst_ip": { "type": "ip" },
"timestamp_utc": { "type": "date" }
}
}
}
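For instance, with that mapping in place, indexing the sample document from the question into my-index-000001 (the example index name used above) stores src_ip and dst_ip as ip values and timestamp_utc as a date in epoch milliseconds; the remaining fields are mapped dynamically as before:
POST /my-index-000001/_doc
{
  "LogChain": "Corp01 input",
  "src_ip": "162.142.125.228",
  "src_port": "61802",
  "dst_ip": "177.16.1.13",
  "dst_port": "6580",
  "timestamp_utc": 1646226066899
}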
2. Dynamic template
Here, you provide a dynamic template while creating the index, and based on a condition Elasticsearch will map a field to a specific data type, e.g. if the field name ends with ip, map it as the ip type.
PUT my-index-000001/
{
"mappings": {
"dynamic_templates": [
{
"strings_as_ip": {
"match_mapping_type": "string",
"match": "*ip",
"runtime": {
"type": "ip"
}
}
}
]
}
}
Update 1:
Updating the mapping of an existing index is not recommended, as it can leave your data inconsistent.
You can follow the steps below (see the sketch after this list):
Use the Reindex API to copy the data to a temporary index.
Delete your original index.
Recreate the index, defining its mapping with one of the methods above.
Use the Reindex API to copy the data from the temporary index back into the original index (the newly created index with the mapping).
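A minimal sketch of that flow, assuming the index from the question (mt-firewall) and a hypothetical temporary index name (mt-firewall-temp):
# 1. copy the data to a temporary index
POST /_reindex
{
  "source": { "index": "mt-firewall" },
  "dest": { "index": "mt-firewall-temp" }
}

# 2. delete the original index
DELETE /mt-firewall

# 3. recreate it with the desired mapping (explicit mapping shown here)
PUT /mt-firewall
{
  "mappings": {
    "properties": {
      "src_ip": { "type": "ip" },
      "dst_ip": { "type": "ip" },
      "timestamp_utc": { "type": "date" }
    }
  }
}

# 4. copy the data back from the temporary index
POST /_reindex
{
  "source": { "index": "mt-firewall-temp" },
  "dest": { "index": "mt-firewall" }
}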

Copying co-ordinates to field geo_point type using copy_to in Elasticsearch

I am trying to work with geo data in Elasticsearch. I have an index with two separate fields, latitude and longitude, both stored as double. I want to use the copy_to feature of Elasticsearch to copy both field values into a third field of type geo_point. I tried that, but it is not working as intended.
{
"mappings": {
"properties": {
"unique_id": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword"
}
}
},
"location_data": {
"properties": {
"latitude": {
"type": "float",
"copy_to": "last_location"
},
"longitude": {
"type": "float",
"copy_to": "last_location"
},
"last_location": {
"type": "geo_point"
}
}
}
}
}
}
When I index a sample document such as
{
"unique_id": "12345_mytest",
"location_data": {
"latitude": 37.16,
"longitude": -124.76
}
}
In the resulting mapping you can see that the last_location field, which was supposed to live inside the location_data object, is also created at the root level, with a data type other than geo_point:
{
"mappings" : {
"properties" : {
"last_location" : {
"type" : "float"
},
"location_data" : {
"properties" : {
"last_location" : {
"type" : "geo_point",
"store" : true
},
"latitude" : {
"type" : "float",
"copy_to" : [
"last_location"
]
},
"longitude" : {
"type" : "float",
"copy_to" : [
"last_location"
]
}
}
},
"unique_id" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword"
}
}
}
}
}
}
Furthermore, when I query the field I do not get the expected results.
This doesn't work; any other ideas or ways to do this? I know I could fix it at the source or by altering the data before indexing, but I don't have the luxury to do that right away. Any other way of altering the mapping is most welcome. Thanks in advance for any pointers to get this done.
Thanks,
Ashit

Elastic Search GeoIp location not of type geo_point

I'm running ElasticSearch, Logstash and Kibana using Docker Compose based on the solution: https://github.com/deviantony/docker-elk.
I'm following this tutorial trying to add geoip information when processing my web logs: https://www.elastic.co/blog/geoip-in-the-elastic-stack.
In logstash I'm processing files from FileBeat and I've added geoip to my filter:
filter {
...
geoip {
source => "client_ip"
}
}
When I view the documents in Kibana they do contain additional information like geoip.country_name, geoip.city_name etc., but I expected the geoip.location field to be of type geo_point in my index.
Looking at how some of the geoip fields are mapped, instead of geo_point I see location.lat and location.lon. Why is my location field not of type geo_point? Do I need some kind of mapping?
The ingest-common, ingest-geoip, ingest-user-agent and x-pack plugins are all loaded when Elasticsearch starts up. I've refreshed the field list for my index in Kibana.
EDIT1:
Based on the answer from @Val, I'm trying to change the mapping of my index:
PUT iis-log-*/_mapping/log
{
"properties": {
"geoip": {
"dynamic": true,
"properties": {
"ip": {
"type": "ip"
},
"location": {
"type": "geo_point"
},
"latitude": {
"type": "half_float"
},
"longitude": {
"type": "half_float"
}
}
}
}
}
But that gives me this error:
{
"error": {
"root_cause": [
{
"type": "illegal_argument_exception",
"reason": "mapper [geoip.ip] of different type, current_type [text], merged_type [ip]"
}
],
"type": "illegal_argument_exception",
"reason": "mapper [geoip.ip] of different type, current_type [text], merged_type [ip]"
},
"status": 400
}
In the article you referred to, they do explain that you need to put a specific mapping for the geo_point field in the "Mapping, for Maps" section.
If you're using the default index names (i.e. logstash-*) and the default mapping type (i.e. log), then the mapping is taken care of for you by Logstash. But if not, you need to install it yourself using:
PUT your_index
{
"mappings" : {
"_default_" : {
"_all" : {"enabled" : true, "norms" : false},
"dynamic_templates" : [ {
"message_field" : {
"path_match" : "message",
"match_mapping_type" : "string",
"mapping" : {
"type" : "text",
"norms" : false
}
}
}, {
"string_fields" : {
"match" : "*",
"match_mapping_type" : "string",
"mapping" : {
"type" : "text", "norms" : false,
"fields" : {
"keyword" : { "type": "keyword", "ignore_above": 256 }
}
}
}
} ],
"properties" : {
"#timestamp": { "type": "date", "include_in_all": false },
"#version": { "type": "keyword", "include_in_all": false },
"geoip" : {
"dynamic": true,
"properties" : {
"ip": { "type": "ip" },
"location" : { "type" : "geo_point" },
"latitude" : { "type" : "half_float" },
"longitude" : { "type" : "half_float" }
}
}
}
}
}
}
In the above mappings, you see the geoip.location field being treated as a geo_point.
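Note that this mapping only takes effect when an index is created; as the error above shows, the existing geoip.ip field (already mapped as text) cannot be changed in place. A rough sketch of one way to fix existing data, with hypothetical index names (iis-log-old standing in for one of your existing iis-log-* indices, iis-log-fixed for a new index created with the geoip mapping; only the geoip part of the mapping is shown for brevity):
PUT iis-log-fixed
{
  "mappings": {
    "log": {
      "properties": {
        "geoip": {
          "dynamic": true,
          "properties": {
            "ip": { "type": "ip" },
            "location": { "type": "geo_point" },
            "latitude": { "type": "half_float" },
            "longitude": { "type": "half_float" }
          }
        }
      }
    }
  }
}

POST _reindex
{
  "source": { "index": "iis-log-old" },
  "dest": { "index": "iis-log-fixed" }
}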

Multiple document types with same mapping in Elasticsearch

I have an index named test which can be associated with any number of document types, named sub_text1 to sub_textn, but all of them will have the same mapping.
Is there any way to create the index so that all document types share the same mapping for their documents? I.e. test/sub_text1/_mapping should be the same as test/sub_text2/_mapping.
Otherwise, if I have, say, 1000 document types, I will have 1000 copies of the same mapping, one for each document type.
UPDATE:
PUT /test_index/
{
"settings": {
"index.store.type": "default",
"index": {
"number_of_shards": 5,
"number_of_replicas": 1,
"refresh_interval": "60s"
},
"analysis": {
"filter": {
"porter_stemmer_en_EN": {
"type": "stemmer",
"name": "porter"
},
"default_stop_name_en_EN": {
"type": "stop",
"name": "_english_"
},
"snowball_stop_words_en_EN": {
"type": "stop",
"stopwords_path": "snowball.stop"
},
"smart_stop_words_en_EN": {
"type": "stop",
"stopwords_path": "smart.stop"
},
"shingle_filter_en_EN": {
"type": "shingle",
"min_shingle_size": "2",
"max_shingle_size": "2",
"output_unigrams": true
}
}
}
}
}
Intended mapping:
{
"sub_text" : {
"properties" : {
"_id" : {
"include_in_all" : false,
"type" : "string",
"store" : true,
"index" : "not_analyzed"
},
"alternate_id" : {
"include_in_all" : false,
"type" : "string",
"store" : true,
"index" : "not_analyzed"
},
"text" : {
"type" : "multi_field",
"fields" : {
"text" : {
"type" : "string",
"store" : true,
"index" : "analyzed",
},
"pdf": {
"type" : "attachment",
"fields" : {
"pdf" : {
"type" : "string",
"store" : true,
"index" : "analyzed",
}
}
}
}
}
}
}
}
I want this mapping to be an individual mapping for each sub_text I create, so that I can change it for one sub_text without affecting the others, e.g. I may want to add two custom analyzers to sub_text1 and three analyzers to sub_text3, while the rest stay the same.
UPDATE:
PUT /my-index/document_set/_mapping
{
"properties": {
"type": {
"type": "string",
"index": "not_analyzed"
},
"doc_id": {
"type": "string",
"index": "not_analyzed"
},
"plain_text": {
"type": "string",
"store": true,
"index": "analyzed"
},
"pdf_text": {
"type": "attachment",
"fields": {
"pdf_text": {
"type": "string",
"store": true,
"index": "analyzed"
}
}
}
}
}
POST /my-index/document_set/1
{
"type": "d1",
"doc_id": "1",
"plain_text": "simple text for doc1."
}
POST /my-index/document_set/2
{
"type": "d1",
"doc_id": "2",
"pdf_text": "cGRmIHRleHQgaXMgaGVyZS4="
}
POST /my-index/document_set/3
{
"type": "d2",
"doc_id": "3",
"plain_text": "simple text for doc3 in d2."
}
POST /my-index/document_set/4
{
"type": "d2",
"doc_id": "4",
"pdf_text": "cGRmIHRleHQgaXMgaGVyZSBpbiBkMi4="
}
GET /my-index/document_set/_search
{
"query" : {
"filtered" : {
"filter" : {
"term" : {
"type" : "d1"
}
}
}
}
}
This gives me the documents of type "d1". How can I add analyzers only to documents of type "d1"?
At the moment a possible solution is to use index templates or dynamic mappings. However, they do not allow wildcard type matching, so you would have to use the _default_ root type to apply the mapping to all types in the index; it would then be up to you to ensure that all your types fit the same dynamic mapping. This template example may work for you:
curl -XPUT localhost:9200/_template/template_1 -d '
{
"template" : "test",
"mappings" : {
"_default_" : {
"dynamic": true,
"properties": {
"field1": {
"type": "string",
"index": "not_analyzed"
}
}
}
}
}
'
Do not do this.
Otherwise, if I have, say, 1000 document types, I will have 1000 copies of the same mapping, one for each document type.
You're exactly right. For every additional _type with an identical mapping you are needlessly adding to the size of your index's mapping. They will not be merged, nor will any compression save you.
A much better solution is to simply create a shared _type and to create a field that represents the intended type. This completely avoids having wasted mappings and all of the negatives associated with it, including an unnecessary increase for your cluster state's size.
From there, you can imitate what Elasticsearch is doing for you and filter on your custom type without ballooning your mappings.
$ curl -XPUT localhost:9200/my-index -d '{
"mappings" : {
"my-type" : {
"properties" : {
"type" : {
"type" : "string",
"index" : "not_analyzed"
},
# ... whatever other mappings exist ...
}
}
}
}'
Then, for any search against sub_text1 (etc.), you can use a term filter (for one value) or a terms filter (for more than one) to imitate the _type filter that would otherwise happen for you.
$ curl -XGET localhost:9200/my-index/my-type/_search -d '{
"query" : {
"filtered" : {
"filter" : {
"term" : {
"type" : "sub_text1"
}
}
}
}
}'
This does the same thing as the _type filter, and you can create filtered aliases containing the filter if you want the higher-level search capability without exposing client-level logic for the filtering.
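As a rough sketch of that aliasing idea (alias and field values are just examples reusing the type field above), a filtered alias per logical type can be created with the _aliases API:
$ curl -XPOST localhost:9200/_aliases -d '{
  "actions" : [
    {
      "add" : {
        "index" : "my-index",
        "alias" : "sub_text1",
        "filter" : { "term" : { "type" : "sub_text1" } }
      }
    }
  ]
}'
Searching against localhost:9200/sub_text1/_search then applies the term filter automatically, much like the built-in _type filtering would.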

Dynamic Mapping for Nested Type

I am trying to create a dynamic mapping for objects like the following:
{
"product": {
"productId": 99999,
"manufacturerId": "A0001",
"manufacturerCode": "A101LI",
"name": "Test Product",
"description": "Describe the product here.",
"feature_details":{
"category": "Category1",
"brand": "Brand Name"
},
"feature_tpcerts":{
"certifiedPass": true,
"levelCertified": 2
},
"feature_characteristics":{
"amount": 0.73,
"location": 49464
}
}
}
I would like the feature_* properties to be a nested type, which I have defined in the mapping below with the nested_feature template, and it is working as expected. However, I also want each property inside the nested object of a feature_* property to be a multi_field with an additional facet sub-field defined. I have tried the second template, nested_template, but without any success.
{
"product" : {
"_timestamp" : {"enabled" : true, "store": "yes" },
"dynamic_templates": [
{
"nested_feature": {
"match" : "feature_*",
"mapping" : {
"type" : "nested",
"stored": "true"
}
}
},
{
"nested_template": {
"match": "feature_*.*",
"mapping": {
"type": "multi_field",
"fields": {
"{name}": {
"type": "{dynamic_type}",
"index": "analyzed"
},
"facet": {
"type": "{dynamic_type}",
"index": "not_analyzed"
}
}
}
}
}
],
"properties" : {
"productId" : { "type" : "integer", "store" : "yes"},
"manufacturerId" : { "type" : "string", "store" : "yes", "index" : "analyzed"},
"manufacturer" : { "type" : "string", "store" : "yes", "index" : "not_analyzed"},
"manufacturerCode" : { "type" : "string", "store" : "yes"},
"name" : {"type" : "string", "store" : "yes"},
"description": {"type": "string", "index" : "analyzed"}
}
}
}
Unfortunately, the properties within the feature_* objects are created by another process and can be almost any name/value pair. Any suggestions on how to use a dynamic template to set up a property as nested, and also make each property within the nested object a multi_field with an additional facet sub-field?
You just have to use path_match instead of match when the pattern refers to the whole field path; otherwise only the field name (the last part of the path) is taken into account. Have a look at the reference page for the root object, which also contains some documentation related to dynamic templates.
You might also want to use match_mapping_type, as you can't set "index": "analyzed" for numeric or boolean fields, for instance; in that case you might want to do different things depending on the field type.
I noticed that your document contains the product root object, which you don't really need. I would remove it, as the type name is already product.
Also, I would avoid storing fields explicitly unless you really need to, as Elasticsearch stores the _source field by default, which is what you are going to need most of the time.
The following mapping should work in your case (without the product root object in the documents):
{
"product" : {
"dynamic_templates": [
{
"nested_feature": {
"match" : "feature_*",
"mapping" : {
"type" : "nested"
}
}
},
{
"nested_template": {
"path_match": "feature_*.*",
"match_mapping_type" : "string",
"mapping": {
"type": "multi_field",
"fields": {
"{name}": {
"type": "{dynamic_type}",
"index": "analyzed"
},
"facet": {
"type": "{dynamic_type}",
"index": "not_analyzed"
}
}
}
}
}
]
}
}
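As a usage sketch (the index name products is hypothetical, and the index is assumed to have already been created with the mapping above), you would index the document without the product wrapper and can then check how the feature_* fields were mapped:
$ curl -XPUT localhost:9200/products/product/1 -d '{
  "productId": 99999,
  "manufacturerId": "A0001",
  "manufacturerCode": "A101LI",
  "name": "Test Product",
  "description": "Describe the product here.",
  "feature_details": {
    "category": "Category1",
    "brand": "Brand Name"
  },
  "feature_tpcerts": {
    "certifiedPass": true,
    "levelCertified": 2
  },
  "feature_characteristics": {
    "amount": 0.73,
    "location": 49464
  }
}'

$ curl -XGET localhost:9200/products/product/_mapping
Each feature_* object should come back as nested; the string fields under feature_details should be multi_field with an analyzed {name} sub-field and a not_analyzed facet sub-field, while the numeric and boolean fields under the other two feature_* objects fall through to the default dynamic mapping because of match_mapping_type: string.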
