Only allow fields that are in the index template - elasticsearch

I have logstash pushing docs into an elasticsearch cluster.
And I apply a template to the indices with logstash:
elasticsearch {
  hosts => ["1.1.1.1", "2.2.2.2"]
  index => "logstash-myindex-%{+YYYY-MM-dd}"
  template_name => "mytemplate"
  template => "/etc/logstash/index_templates/mytemplate.json"
  template_overwrite => true
}
Is there a way to have only the fields defined in the template added to the docs? Sometimes the docs have a bunch of other fields I don't care about, and I don't want to filter each one out manually. I want to be able to say: if a field is not in the index template, do not add it.
Edit:
I did this in my index template, but fields not specified in the template are still getting added to the docs:
{
  "template": "logstash-myindex*",
  "order": 10,
  "mappings": {
    "_default_": {
      "dynamic": "strict",
      "_all": {
        "enabled": false
      },
      "properties": {
        "@timestamp": {
          "type": "date",
          "include_in_all": false
        },
        "@version": {
          "type": "keyword",
          "include_in_all": false
        },
        "bytesReceived": {
          "type": "text",
          "norms": false,
          "fields": {
            "keyword": {
              "type": "keyword"
            }
          }
        },
        .... etc

I'm not familiar with logstash, but I'm assuming this is just like creating an index in Elasticsearch.
In Elasticsearch you can disable the dynamic creation of fields by adding:
"dynamic": false
to the mapping.
This would look something like this:
{
  "mappings": {
    "_default_": {
      "dynamic": false
    }
  }
}
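Note the difference between the two values: "dynamic": false silently ignores unknown fields (they stay in _source but are not mapped or indexed), while "dynamic": "strict" rejects any document containing an unmapped field with a strict_dynamic_mapping_exception. A sketch of the strict variant, using the same _default_ mapping shape as above:
{
  "mappings": {
    "_default_": {
      "dynamic": "strict"
    }
  }
}
Use strict if you want indexing to fail loudly rather than have the extra fields silently carried along.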

Related

Elasticsearch: How to define Contexts property of nested completion field?

I've got the following mapping for an ES index (I'm not including the config for the analyzer and other things):
{
  "mappings": {
    "properties": {
      "topCustomer": {
        "type": "text",
        "analyzer": "autocomplete",
        "search_analyzer": "autocomplete_search",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      },
      "topCustomer_suggest": {
        "type": "completion",
        "contexts": [
          {
            "name": "index_name",
            "type": "category"
          }
        ]
      },
      "customer": {
        "type": "nested",
        "include_in_root": true,
        "properties": {
          "customer_name": {
            "type": "text",
            "analyzer": "autocomplete",
            "search_analyzer": "autocomplete_search",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              },
              "customer_name_suggest": {
                "type": "completion",
                "contexts": [
                  {
                    "name": "index_name",
                    "type": "category"
                  }
                ]
              }
            }
          },
          "customer_level": {
            "type": "integer"
          }
        }
      }
    }
  }
}
Also, I have the following logstash configuration file:
input {
  jdbc {
    # input config
  }
}
filter {
  mutate {
    remove_field => ["@version"]
  }
  ruby {
    code => "
      input = event.get('topCustomer').strip.gsub(/[\(\)]+/, '').split(/[\s\/\-,]+/);
      event.set('[topCustomer_suggest][input]', input);
      contexts = { 'index_name' => [event.get('type')] };
      event.set('[topCustomer_suggest][contexts]', contexts);
      input = event.get('[customer][customer_name]').strip.gsub(/[\(\)]+/, '').split(/[\s\/\-,]+/);
      event.set('[customer][customer_name][fields][customer_name_suggest][input]', input);
      contexts = { 'index_name' => [event.get('type')] };
      event.set('[customer][customer_name][fields][customer_name_suggest][contexts]', contexts);
    "
  }
}
output {
  elasticsearch {
    index => "%{type}"
    manage_template => false
    hosts => ["localhost:9200"]
  }
}
Now, when I try to refresh my index to apply the changes I made to one of these files, I get the following error:
Could not index event to Elasticsearch ...
:response=>{"index"=>{"index"=>"customers", "_type"=>"_doc",
"_id"=>"...", "status"=>400,
"error"=>{"type"=>"illegal_argument_exception", "reason"=>"Contexts
are mandatory in context enabled completion field
[customer.customer_name.customer_name_suggest]"}}}}
I tried modifying my config file so that the event.set calls (in the ruby filter section) match the field path that the error displays; I also tried many more combinations to see if this was causing the error.
As you can see, I defined another completion field in the mapping. This field works as expected. The difference is that this is not a nested field.
Notice that customer_name_suggest is a sub-field and not an 'independent' field like topCustomer_suggest. Is this the correct way of doing it, or should customer_name_suggest not be a sub-field? I really don't understand why I'm getting the error, since I am defining the contexts property in the mapping.

Elasticsearch copy multi fields to one field and do not mapping source field?

There are too many similar fields coming from my data source. I want to copy them all to one field and not map the individual source fields. Here is my dynamic template:
[
  {
    "username": {
      "path_match": "facts.wifi*username",
      "mapping": {
        "copy_to": "username",
        "type": "keyword",
        "enabled": false
      }
    }
  },
  {
    "ipaddress": {
      "path_match": "facts.ipaddress*",
      "mapping": {
        "copy_to": "ipaddress",
        "type": "keyword",
        "enabled": false
      }
    }
  },
  {
    "macaddress": {
      "path_match": "facts.macaddress*",
      "mapping": {
        "copy_to": "macaddress",
        "type": "keyword",
        "enabled": false
      }
    }
  },
  {
    "ignore_others": {
      "path_match": "facts.*",
      "mapping": {
        "type": "object",
        "enabled": false
      }
    }
  }
]
But when I write data to the index, ES says "enabled": false is invalid and ignores it:
#! Deprecation: dynamic template [username] has invalid content [{"path_match":"facts.wifi*username","mapping":{"copy_to":"username","enabled":false,"type":"keyword"}}], caused by [Unused mapping attributes [{enabled=false}]]
#! Deprecation: dynamic template [ipaddress] has invalid content [{"path_match":"facts.ipaddress*","mapping":{"copy_to":"ipaddress","enabled":false,"type":"keyword"}}], caused by [Unused mapping attributes [{enabled=false}]]
#! Deprecation: dynamic template [macaddress] has invalid content [{"path_match":"facts.macaddress*","mapping":{"copy_to":"macaddress","enabled":false,"type":"keyword"}}], caused by [Unused mapping attributes [{enabled=false}]]
I tried updating the mapping config to this:
"index": false,
"doc_values": false
That succeeded in writing data into the index, and I cannot filter or aggregate on those fields, but the field mappings are still created.
Can anyone help me figure this out?
Is it possible to do this in Elasticsearch, or am I making a mistake somewhere?
My Elasticsearch version is 7.7.

How to set elasticsearch index mapping as not_analysed for all the fields

I want my Elasticsearch index to match the exact value for all the fields. How do I map my index as "not_analysed" for all the fields?
I'd suggest making use of multi-fields in your mapping (which is the default behavior if you aren't creating the mapping yourself, i.e. dynamic mapping).
That way you can switch between full-text search and exact-match searches when required.
Note that for exact matches you need the keyword datatype plus a Term Query. Sample examples are provided in the links I've specified.
Hope it helps!
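For illustration, an exact match against a keyword multi-field would use a term query like the following (the index name my_index and the field color.keyword are just assumptions for this sketch):
GET my_index/_search
{
  "query": {
    "term": {
      "color.keyword": "red"
    }
  }
}
Because the keyword sub-field is not analyzed, the term query matches only documents whose value is exactly "red".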
You can use a dynamic_templates mapping for this. By default, Elasticsearch maps string fields as text with index: true, like below:
{
  "products2": {
    "mappings": {
      "product": {
        "properties": {
          "color": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "type": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          }
        }
      }
    }
  }
}
As you can see, it also creates a keyword field as a multi-field. This keyword field is indexed but not analyzed like the text field. If you want to drop this default behaviour, you can use the configuration below when creating the index:
PUT products
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 0
  },
  "mappings": {
    "product": {
      "dynamic_templates": [
        {
          "strings": {
            "match_mapping_type": "string",
            "mapping": {
              "type": "keyword",
              "index": false
            }
          }
        }
      ]
    }
  }
}
After doing this, the index will look like below:
{
  "products": {
    "mappings": {
      "product": {
        "dynamic_templates": [
          {
            "strings": {
              "match_mapping_type": "string",
              "mapping": {
                "type": "keyword",
                "index": false
              }
            }
          }
        ],
        "properties": {
          "color": {
            "type": "keyword",
            "index": false
          },
          "type": {
            "type": "keyword",
            "index": false
          }
        }
      }
    }
  }
}
Note: I don't know your use case, but you can use the multi-field feature as mentioned by @Kamal. Otherwise, you cannot search on the not-analyzed fields. Also, you can use a dynamic_templates mapping to keep some fields analyzed.
Please check the documentation for more information :
https://www.elastic.co/guide/en/elasticsearch/reference/current/dynamic-templates.html
I also explained this behaviour in an article; sorry, but it is in Turkish. You can check the example code samples with Google Translate if you want.
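For example, since dynamic templates are matched in order, a more specific pattern placed first can keep selected fields analyzed while everything else falls through to the catch-all keyword rule (the description* pattern here is purely hypothetical):
"dynamic_templates": [
  {
    "analyzed_strings": {
      "match": "description*",
      "match_mapping_type": "string",
      "mapping": {
        "type": "text"
      }
    }
  },
  {
    "strings": {
      "match_mapping_type": "string",
      "mapping": {
        "type": "keyword",
        "index": false
      }
    }
  }
]
With this ordering, any string field whose name starts with "description" is mapped as analyzed text, and all other new string fields get the non-indexed keyword mapping.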

ElasticSearch Logstash JDBC: How to aggregate into different column names

I am new to Elasticsearch and I am trying to use Logstash to load data into an index. The following is part of my logstash config:
filter {
  aggregate {
    task_id => "%{code}"
    code => "
      map['campaignId'] = event.get('CAM_ID')
      map['country'] = event.get('COUNTRY')
      map['countryName'] = event.get('COUNTRYNAME')
      # etc
    "
    push_previous_map_as_event => true
    timeout => 5
  }
}
output {
  elasticsearch {
    document_id => "%{code}"
    document_type => "company"
    index => "company_v1"
    codec => "json"
    hosts => ["127.0.0.1:9200"]
  }
}
I was expecting the aggregation to map, for instance, the column 'CAM_ID' to a property named 'campaignId' in the Elasticsearch index. Instead, it creates a property named 'cam_id', which is the column name in lowercase. The same happens with the rest of the properties.
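For what it's worth, the lowercasing usually comes from the jdbc input rather than from the aggregate filter: the logstash-input-jdbc plugin lowercases column names by default. Assuming that plugin is in use, a sketch of disabling this behaviour (alternatively, you could alias the columns in the SQL statement itself):
input {
  jdbc {
    # ... connection settings ...
    lowercase_column_names => false
  }
}
With lowercase_column_names => false, the event fields keep the column names exactly as the database returns them.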
Following is the Index Document after logstash being executed:
{
  "company_v1": {
    "aliases": {},
    "mappings": {
      "company": {
        "properties": {
          "@timestamp": {
            "type": "date"
          },
          "@version": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "cam_id": {
            "type": "long"
          },
          "campaignId": {
            "type": "long"
          },
          "cam_type": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "campaignType": {
            "type": "text"
          }
        }
      }
    },
    "settings": {
      "index": {
        "creation_date": "1545905435871",
        "number_of_shards": "5",
        "number_of_replicas": "1",
        "uuid": "Dz0x16ohQWWpuhtCB3Y4Vw",
        "version": {
          "created": "6050399"
        },
        "provided_name": "company_v1"
      }
    }
  }
}
I created 'campaignId' and 'campaignType' when I created the index, but logstash created the other two.
Can someone explain how to configure logstash to customize the document property names when the data is being loaded?
Thank you very much.
Best Regards

Default elasticsearch configuration for docker container

What is the best way to configure an ES index template with mappings in a docker container? I expected to use a template file, but it seems that from version 2 this is not possible. Executing an HTTP request also won't work, because the process isn't running during container creation. It could be done on each container launch with a script that starts ES and then executes an HTTP request against it, but that looks really ugly.
You can configure a template with mappings by executing an HTTP PUT request in a Linux terminal, as follows:
curl -XPUT http://ip:port/_template/logstash -d '
{
  "template": "logstash-*",
  "settings": {
    "number_of_replicas": 1,
    "number_of_shards": 8
  },
  "mappings": {
    "_default_": {
      "_all": {
        "store": false
      },
      "_source": {
        "enabled": true,
        "compress": true
      },
      "properties": {
        "_id": {
          "index": "not_analyzed",
          "type": "string"
        },
        "_type": {
          "index": "not_analyzed",
          "type": "string"
        },
        "field1": {
          "index": "not_analyzed",
          "type": "string"
        },
        "field2": {
          "type": "double"
        },
        "field3": {
          "type": "integer"
        },
        "xy": {
          "properties": {
            "x": {
              "type": "double"
            },
            "y": {
              "type": "double"
            }
          }
        }
      }
    }
  }
}
'
Here "logstash-*" is the index pattern the template applies to; you can give it a try.
If using logstash, you can make the template part of your logstash pipeline config:
pipeline/logstash.conf
input {
  ...
}
filter {
  ...
}
output {
  elasticsearch {
    hosts => "elasticsearch:9200"
    template => "/usr/share/logstash/templates/logstash.template.json"
    template_name => "logstash"
    template_overwrite => true
    index => "logstash-%{+YYYY.MM.dd}"
  }
}
Reference: https://www.elastic.co/guide/en/logstash/6.1/plugins-outputs-elasticsearch.html#plugins-outputs-elasticsearch-template
