Can I set the Logstash default Elasticsearch mapping through elasticsearch-template.json?

I use Logstash + Elasticsearch to collect syslog, and I want to set a TTL for log ageing.
I found a file named elasticsearch-template.json in Logstash; the path is logstash/logstash-1.4.2/lib/logstash/outputs/elasticsearch/elasticsearch-template.json
I added TTL info to the file like this:
{
"template" : "logstash-*",
"settings" : {
"index.refresh_interval" : "5s"
},
"mappings" : {
"_default_" : {
"_all" : {"enabled" : true},
"dynamic_templates" : [ {
"string_fields" : {
"match" : "*",
"match_mapping_type" : "string",
"mapping" : {
"type" : "string", "index" : "analyzed", "omit_norms" : true,
"fields" : {
"raw" : {"type": "string", "index" : "not_analyzed", "ignore_above" : 256}
}
}
}
} ],
"_ttl": {
"enabled": true,
"default": "1d"
},
"properties" : {
"@version": { "type": "string", "index": "not_analyzed" },
"geoip" : {
"type" : "object",
"dynamic": true,
"path": "full",
"properties" : {
"location" : { "type" : "geo_point" }
}
}
}
}
}
}
Then I restarted Logstash and deleted all Elasticsearch indices.
I checked the new index's mapping in Elasticsearch, but the TTL setting had not been applied.
How can I configure the index template?

You need to change your Logstash configuration.
If you have followed the default settings, Logstash has already created a template named logstash inside Elasticsearch. Logstash will keep using that stored template unless you explicitly tell it not to.
Modify the template file you found and, in addition, set the following in your Logstash configuration:
output {
elasticsearch {
...
template_overwrite => true
...
}
}
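One way to confirm that the edited template actually replaced the stored one is to look at what GET _template/logstash returns. The sketch below assumes that response has already been loaded into a Python dict; the function name and the two sample responses are illustrative, not taken from the question.

```python
def ttl_settings(template_response, template_name="logstash", mapping_type="_default_"):
    """Return the _ttl block stored for mapping_type, or None if it is absent."""
    body = template_response.get(template_name, {})
    mapping = body.get("mappings", {}).get(mapping_type, {})
    return mapping.get("_ttl")


# Hypothetical response after the overwrite succeeded.
stored = {
    "logstash": {
        "template": "logstash-*",
        "mappings": {"_default_": {"_ttl": {"enabled": True, "default": "1d"}}},
    }
}

# Hypothetical response where the old template is still in place.
stale = {
    "logstash": {
        "template": "logstash-*",
        "mappings": {"_default_": {"_all": {"enabled": True}}},
    }
}

print(ttl_settings(stored))
print(ttl_settings(stale))
```

If the function returns None, the stored template is still the old one, and new indices will keep getting the old mapping.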

It doesn't look like that JSON file is in the correct folder. Here is the documentation on how to use templates:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-templates.html
About the folder:
Config
Index templates can also be placed within the config location (path.conf) under the templates directory (note, make sure to place them on all master eligible nodes). For example, a file called template_1.json can be placed under config/templates and it will be added if it matches an index. Here is a sample of the mentioned file:

I created a new template.json file and set the path to it in the elasticsearch output block of my logstash.yml config file:
stdout { codec => json_lines }
elasticsearch {
"hosts" => ["ip:port"]
"index" => "name-of-index-%{+dd.MM.YYYY}"
template => "/{path-to-logstash-folder}/templates/your-template.json"
template_overwrite => true
manage_template => false
}
I defined the document_type for Elasticsearch in the input block of the logstash.yml config file:
input {
file {
path => "/your-path-to-directory/*.log"
type => "name-of-type"
}
}
Here is my template.json file:
{
"name-of-index": {
"order": 0,
"version": 50001,
"template": "name-of-index-*",
"settings": {
"index": {
"refresh_interval": "5s"
}
},
"mappings": {
"_default_": {
"dynamic_templates": [
{
"message_field": {
"path_match": "message",
"mapping": {
"norms": false,
"type": "text"
},
"match_mapping_type": "string"
}
},
{
"string_fields": {
"mapping": {
"norms": false,
"type": "text",
"fields": {
"keyword": {
"type": "keyword"
}
}
},
"match_mapping_type": "string",
"match": "*"
}
}
],
"_all": {
"norms": false,
"enabled": true
},
"properties": {
"@timestamp": {
"include_in_all": false,
"type": "date"
},
"geoip": {
"dynamic": true,
"properties": {
"ip": {
"type": "ip"
},
"latitude": {
"type": "half_float"
},
"location": {
"type": "geo_point"
},
"longitude": {
"type": "half_float"
}
}
},
"@version": {
"include_in_all": false,
"type": "keyword"
}
}
}
},
"aliases": {}
}
}
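The file above has the shape that GET _template returns, with the template body nested under its name. As far as I know, the PUT _template API expects the bare body instead, so a file copied from GET output may need unwrapping before it is used. A minimal sketch, where the helper name and the set of recognised keys are my own assumptions:

```python
# Top-level keys that indicate a bare template body (assumed list).
TEMPLATE_BODY_KEYS = {
    "template", "index_patterns", "settings",
    "mappings", "aliases", "order", "version",
}


def unwrap_template(doc):
    """Return the bare template body, unwrapping a {name: body} wrapper if present."""
    if doc.keys() & TEMPLATE_BODY_KEYS:
        return doc  # already a bare body
    if len(doc) == 1:
        (inner,) = doc.values()
        if isinstance(inner, dict) and inner.keys() & TEMPLATE_BODY_KEYS:
            return inner
    raise ValueError("document does not look like an index template")


wrapped = {
    "name-of-index": {
        "order": 0,
        "template": "name-of-index-*",
        "settings": {"index": {"refresh_interval": "5s"}},
    }
}

bare = unwrap_template(wrapped)
print(sorted(bare))
```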

Related

Custom indexing template is not being applied

I have a project where I am to analyze and visualize access log data. I use Logstash to send data to Elasticsearch and then visualize some stuff with Kibana.
Everything worked fine until I discovered that I needed the Path Hierarchy Analyzer to show what I want. I now have a custom template (JSON) and changed the output section of my Logstash configuration. But when I index data, my template is not being applied.
(Version 5.2 of Elasticsearch and Logstash; I can't update since that is the version in use at the place where I work.)
My JSON file is valid. As far as the input and filters go, my Logstash configuration is fine, too. I guess I made a mistake in the output.
I already tried setting manage_template to false. I also tried template_overwrite => "false" just for the sake of it.
I tried creating the index first (Kibana Dev Tools) and populating it afterwards. I created the index template and then the index. That way my template was applied, and when I created the index pattern, everything seemed correct. Then I indexed one of my log files. I ended up with a Courier Fetch Error. http://localhost:9200/_all/_mapping?pretty=1 showed me that while indexing my data a default template was being used instead of my custom one. Nothing was different from before adding a custom template.
I searched the web and read everything I could find on stackoverflow and in the elastic forum about custom templates not being applied. I tried out all the solutions provided there, that is why I ended up opting for a custom template saved locally and providing the path in my logstash output. But I am all out of ideas now.
This is the output of my logstash configuration:
output {
elasticsearch {
hosts => ["localhost:9200"]
template => "/etc/logstash/conf.d/template.json"
index => "beam-%{+YYYY.MM.dd}"
manage_template => "true"
template_overwrite => "true"
document_type => "beamlogs"
}
stdout {
codec => rubydebug
}
}
And this is my custom template:
{
"template": "beam_custom",
"index_patterns": "beam-*",
"order" : 5,
"settings": {
"number_of_shards": 1,
"analysis": {
"analyzer": {
"custom_path_tree": {
"tokenizer": "custom_hierarchy"
},
"custom_path_tree_reversed": {
"tokenizer": "custom_hierarchy_reversed"
}
},
"tokenizer": {
"custom_hierarchy": {
"type": "path_hierarchy",
"delimiter": "/"
},
"custom_hierarchy_reversed": {
"type": "path_hierarchy",
"delimiter": "/",
"reverse": "true"
}
}
}
},
"mappings": {
"beamlogs": {
"properties": {
"object": {
"type": "text",
"fields": {
"tree": {
"type": "text",
"analyzer": "custom_path_tree"
},
"tree_reversed": {
"type": "text",
"analyzer": "custom_path_tree_reversed"
}
}
},
"referral": {
"type": "text",
"fields": {
"tree": {
"type": "text",
"analyzer": "custom_path_tree"
},
"tree_reversed": {
"type": "text",
"analyzer": "custom_path_tree_reversed"
}
}
},
"@timestamp" : {
"type" : "date"
},
"action" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword"
}
}
},
"datetime" : {
"type" : "date",
"format": "time_no_millis",
"fields" : {
"keyword" : {
"type": "keyword"
}
}
},
"id" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword"
}
}
},
"info" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword"
}
}
},
"message" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword"
}
}
},
"page" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword"
}
}
},
"path" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword"
}
}
},
"result" : {
"type" : "long"
},
"s_direct" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword"
}
}
},
"s_limit" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword"
}
}
},
"s_mobile" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword"
}
}
},
"s_terms" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword"
}
}
},
"size" : {
"type" : "long"
},
"sort" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword"
}
}
}
}
}
}
}
After indexing my data this is part of what I get with http://localhost:9200/_all/_mapping?pretty=1
"datetime" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"object" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
datetime should not have the type text. But worse than that, fields like object.tree are not even created.
I really don't care about the wrong mapping for datetime, but I need to get the Path Hierarchy Analyzer to work. I just don't know what to do anymore.
So. What I just tried was creating the index template in Kibana.
PUT _template/beam_custom
/followed by what is in my template.json
I then checked if the template was created.
GET _template/beam_custom
The output was this:
{
"beam_custom": {
"order": 100,
"template": "beam_custom",
"settings": {
"index": {
"analysis": {
"analyzer": {
"custom_path_tree_reversed": {
"tokenizer": "custom_hierarchy_reversed"
},
"custom_path_tree": {
"tokenizer": "custom_hierarchy"
}
},
"tokenizer": {
"custom_hierarchy": {
"type": "path_hierarchy",
"delimiter": "/"
},
...
So I guess creating the template worked.
Then I created an index
PUT beam-2019-07-15
But when I checked the index, I got this:
{
"beam-2019.07.15": {
"aliases": {},
"mappings": {},
"settings": {
"index": {
"creation_date": "1563044670605",
"number_of_shards": "5",
"number_of_replicas": "1",
"uuid": "rGzplctSQDmrI_NSlt47hQ",
"version": {
"created": "5061699"
},
"provided_name": "beam-2019.07.15"
}
}
}
}
Shouldn't the index pattern have been recognized? I think this is the heart of the problem. I thought that my template would have been used and the output should have been something like this instead:
{
"beam-2019.07.15": {
"aliases": {},
"mappings": {
"logs": {
"properties": {
"@timestamp": {
"type": "date"
},
"action": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword"
}
}
},...
Why doesn't it recognize the pattern?
So, I found the mistake.
When I looked up how to build my own template, at some point I looked at the documentation for the current version. But in 5.2, "index_patterns" doesn't exist; it was only introduced in Elasticsearch 6.0.
"template": "beam_custom",
"index_patterns": "beam-*",
This doesn't work then, of course.
Instead, I dropped the "index_patterns" line and defined my pattern in the template-parameter.
"template": ["beam-*"],
//rest
This fixed the problem. After that, my pattern was recognized.
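The key rename described here can be sketched as a small conversion step. This is only an illustration: the function name is made up, and it assumes the 5.x form takes a single pattern string under "template", as the 5.x documentation shows.

```python
def to_legacy_template(body):
    """Return a copy of body that uses the 5.x "template" key instead of "index_patterns"."""
    converted = dict(body)
    patterns = converted.pop("index_patterns", None)
    if patterns is None:
        return converted  # nothing to convert
    if isinstance(patterns, str):
        patterns = [patterns]
    if len(patterns) != 1:
        raise ValueError("a 5.x template takes exactly one pattern")
    # In 5.x the "template" key holds the index pattern, not the template name.
    converted["template"] = patterns[0]
    return converted


original = {"template": "beam_custom", "index_patterns": "beam-*", "order": 5}
print(to_legacy_template(original))
```

The template name itself is not part of the body in either version; it comes from the PUT _template/<name> path or from Logstash's template_name option.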
Yet I am facing a different problem now. The Path Hierarchy Analyzer is not working properly. object.tree and the rest of the fields I want are not being created.
GET beam-*/_search
{
"query": {
"term": {
"object.tree": "/belletristik/"
}
}
}
yields nothing, though I should have a few hundred hits. Looking at my data, there are no analyzed fields for my paths. Any ideas?
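A term query is not analyzed, so the search term has to equal one of the tokens the analyzer emitted at index time. The sketch below is a simplified Python model of the path_hierarchy tokenizer with default settings, based on the example in the Elasticsearch documentation. It is not the real implementation, and the sample path is hypothetical.

```python
def path_hierarchy_tokens(path, delimiter="/"):
    """Approximate the tokens path_hierarchy emits: one token per path prefix."""
    parts = path.split(delimiter)
    tokens = []
    for end in range(1, len(parts) + 1):
        token = delimiter.join(parts[:end])
        if token:  # skip the empty prefix before a leading delimiter
            tokens.append(token)
    return tokens


# Example from the Elasticsearch tokenizer documentation.
print(path_hierarchy_tokens("/one/two/three"))

# Hypothetical value of the "object" field.
print(path_hierarchy_tokens("/belletristik/roman/titel"))
```

Comparing the searched term with such a token list can show whether it could match at all. Note also that multi-fields such as object.tree never appear in the document _source; they only exist in the mapping and in the index.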

Elastic Search GeoIp location not of type geo_point

I'm running ElasticSearch, Logstash and Kibana using Docker Compose based on the solution: https://github.com/deviantony/docker-elk.
I'm following this tutorial trying to add geoip information when processing my web logs: https://www.elastic.co/blog/geoip-in-the-elastic-stack.
In logstash I'm processing files from FileBeat and I've added geoip to my filter:
filter {
...
geoip {
source => "client_ip"
}
}
When I view the documents in Kibana they do contain additional information like geoip.country_name, geoip.city_name etc. but I expect the geoip.location field being of type geo_point in my index.
Here is an example of how some of the geoip fields are mapped:
Instead of geo_point I see location.lat and location.lon. Why is my location not of type geo_point? Do I need some kind of mapping?
ingest-common, ingest-geoip, ingest-user-agent and x-pack are all loaded when Elasticsearch starts up. I've refreshed the field list for my index in Kibana.
EDIT1:
Based on the answer from @Val, I'm trying to change the mapping of my index:
PUT iis-log-*/_mapping/log
{
"properties": {
"geoip": {
"dynamic": true,
"properties": {
"ip": {
"type": "ip"
},
"location": {
"type": "geo_point"
},
"latitude": {
"type": "half_float"
},
"longitude": {
"type": "half_float"
}
}
}
}
}
But that gives me this error:
{
"error": {
"root_cause": [
{
"type": "illegal_argument_exception",
"reason": "mapper [geoip.ip] of different type, current_type [text], merged_type [ip]"
}
],
"type": "illegal_argument_exception",
"reason": "mapper [geoip.ip] of different type, current_type [text], merged_type [ip]"
},
"status": 400
}
In the article you referred to, they do explain that you need to put a specific mapping for the geo_point field in the "Mapping, for Maps" section.
If you're using the default index names (i.e. logstash-*) and the default mapping type (i.e. log), then the mapping is taken care of for you by Logstash. But if not, you need to install it yourself using:
PUT your_index
{
"mappings" : {
"_default_" : {
"_all" : {"enabled" : true, "norms" : false},
"dynamic_templates" : [ {
"message_field" : {
"path_match" : "message",
"match_mapping_type" : "string",
"mapping" : {
"type" : "text",
"norms" : false
}
}
}, {
"string_fields" : {
"match" : "*",
"match_mapping_type" : "string",
"mapping" : {
"type" : "text", "norms" : false,
"fields" : {
"keyword" : { "type": "keyword", "ignore_above": 256 }
}
}
}
} ],
"properties" : {
"@timestamp": { "type": "date", "include_in_all": false },
"@version": { "type": "keyword", "include_in_all": false },
"geoip" : {
"dynamic": true,
"properties" : {
"ip": { "type": "ip" },
"location" : { "type" : "geo_point" },
"latitude" : { "type" : "half_float" },
"longitude" : { "type" : "half_float" }
}
}
}
}
}
}
In the above mappings, you see the geoip.location field being treated as a geo_point.
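The illegal_argument_exception in the question arises because the type of an already mapped field cannot be changed in place; the mapping has to be present before the first document is indexed, for example in a new index. A rough sketch for spotting such conflicts ahead of time is shown below. The function is illustrative, and the "existing" mapping is a hypothetical reconstruction, since the question only shows geoip.ip as text.

```python
def type_conflicts(existing, desired, prefix=""):
    """List (field, current_type, wanted_type) for fields whose types differ."""
    conflicts = []
    for name, want in desired.items():
        have = existing.get(name)
        if have is None:
            continue  # new fields can be added without conflict
        path = prefix + name
        # A field without an explicit type is treated as an object here.
        have_type = have.get("type", "object")
        want_type = want.get("type", "object")
        if have_type != want_type:
            conflicts.append((path, have_type, want_type))
        conflicts.extend(
            type_conflicts(have.get("properties", {}), want.get("properties", {}), path + ".")
        )
    return conflicts


# Hypothetical current mapping, as dynamic mapping might have produced it.
existing = {"geoip": {"properties": {
    "ip": {"type": "text"},
    "location": {"properties": {"lat": {"type": "float"}, "lon": {"type": "float"}}},
}}}

desired = {"geoip": {"properties": {
    "ip": {"type": "ip"},
    "location": {"type": "geo_point"},
    "latitude": {"type": "half_float"},
}}}

for conflict in type_conflicts(existing, desired):
    print(conflict)
```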

Elasticsearch Index template lost raw string mapping

I'm running a small ELK 5.4.0 stack server on a single node. When I started, I just took all the defaults, which meant 5 shards for each index. I didn't want the overhead of all those shards, so I created an index template like so:
PUT /_template/logstash
{
"template": "logstash*",
"settings": {
"number_of_shards": 1,
"number_of_replicas": 0
}
}
This worked fine, but I just realized that all my raw fields are now missing in ES. For example, "uri" is one of my indexed fields and I used to get "uri.raw" as an unanalyzed version of it. But since I updated the template, they are missing. Looking at the current template shows
GET /_template/logstash
Returns:
{
"logstash": {
"order": 0,
"template": "logstash*",
"settings": {
"index": {
"number_of_shards": "1",
"number_of_replicas": "0"
}
},
"mappings": {},
"aliases": {}
}
}
It seems that the mappings have gone missing. I can pull the mappings off an earlier index
GET /logstash-2017.03.01
and compare it with a recent one
GET /logstash-2017.08.01
Here I see that back in March there was a mapping structure like
mappings: {
"logs": {
"_all": {...},
"dynamic_templates": {...},
"properties": {...}
},
"_default_": {
"_all": {...},
"dynamic_templates": {...},
"properties": {...}
}
}
and now I have only
mappings: {
"logs": {
"properties": {...}
}
}
The dynamic_templates hash holds the information about creating "raw" fields.
My guess is that I need to update my index template to
PUT /_template/logstash
{
"template": "logstash*",
"settings": {
"number_of_shards": 1,
"number_of_replicas": 0
},
"mappings": {
"logs": {
"_all": {...},
"dynamic_templates": {...}
},
"_default_": {
"_all": {...},
"dynamic_templates": {...},
"properties": {...}
}
}
}
IOW, everything but logs.properties (which holds the current list of fields being sent over by logstash).
But I'm not an ES expert and now I'm a bit worried. My original index template didn't work out the way I thought it would. Is my above plan going to work? Or am I going to make things worse? Must you always include everything when you create an index template? And where did the mappings for the older indexes, before I had a template file, come from?
When Logstash first starts, the elasticsearch output plugin installs its own index template with the _default_ template and dynamic_templates as you correctly figured out.
Every time Logstash creates a new logstash-* index (i.e. every day), the template is leveraged and the index is created with the proper mapping(s) present in the template.
What you need to do now is simply to take the official logstash template that you have overridden and reinstall it like this (but with the modified shard settings):
PUT /_template/logstash
{
"template" : "logstash-*",
"version" : 50001,
"settings" : {
"index.refresh_interval" : "5s",
"index.number_of_shards": 1,
"index.number_of_replicas": 0
},
"mappings" : {
"_default_" : {
"_all" : {"enabled" : true, "norms" : false},
"dynamic_templates" : [ {
"message_field" : {
"path_match" : "message",
"match_mapping_type" : "string",
"mapping" : {
"type" : "text",
"norms" : false
}
}
}, {
"string_fields" : {
"match" : "*",
"match_mapping_type" : "string",
"mapping" : {
"type" : "text", "norms" : false,
"fields" : {
"keyword" : { "type": "keyword", "ignore_above": 256 }
}
}
}
} ],
"properties" : {
"@timestamp": { "type": "date", "include_in_all": false },
"@version": { "type": "keyword", "include_in_all": false },
"geoip" : {
"dynamic": true,
"properties" : {
"ip": { "type": "ip" },
"location" : { "type" : "geo_point" },
"latitude" : { "type" : "half_float" },
"longitude" : { "type" : "half_float" }
}
}
}
}
}
}
Another way you could have done it is to not overwrite the logstash template, but use any other id, such as _template/my_logstash, so that at index creation time, both templates would have kicked in and used the mappings from the official logstash template and the shard settings from your template.
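The two-template approach relies on Elasticsearch merging all templates that match a new index, with templates of a higher order overriding those of a lower order. The following is a simplified Python model of that merge, not the server's actual logic. The function names are made up, pattern matching is left out, and settings are written as flat dotted keys for brevity.

```python
import copy


def deep_merge(base, override):
    """Recursively merge override into a copy of base; override wins on conflicts."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def effective_template(templates):
    """Apply templates from lowest to highest order and return the merged result."""
    result = {"settings": {}, "mappings": {}}
    for tpl in sorted(templates, key=lambda t: t.get("order", 0)):
        for section in ("settings", "mappings"):
            result[section] = deep_merge(result[section], tpl.get(section, {}))
    return result


# Stand-in for the official logstash template (heavily abbreviated).
official = {
    "order": 0,
    "settings": {"index.refresh_interval": "5s"},
    "mappings": {"_default_": {"_all": {"enabled": True, "norms": False}}},
}

# Hypothetical second template, e.g. stored as _template/my_logstash.
mine = {
    "order": 1,
    "settings": {"index.number_of_shards": 1, "index.number_of_replicas": 0},
}

print(effective_template([mine, official]))
```

Because the second template carries no mappings, the mappings of the official template survive untouched, which is the point of using a separate template id.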

Creating custom elasticsearch index with logstash

I have to create a custom index in Elasticsearch using Logstash. I have created a new template in Elasticsearch, and in the Logstash configuration I have specified the template path, template_name and template_overwrite values. Still, whenever I run Logstash, the new index is generated with the logstash-dd-mm-yy naming pattern, not with the template_name specified in the properties.
The Logstash config file is:
input {
file {
path => "/temp/file.txt"
type => "words"
start_position => "beginning"
}
}
filter {
mutate {
add_field => {"words" => "%{message}"}
}
}
output {
elasticsearch {
hosts => ["elasticserver:9200"]
template => "pathtotemplate.json"
template_name => "newIndexName-*"
template_overwrite => true
}
stdout{}
}
Index template file is
{
"template": "dictinary-*",
"settings" : {
"number_of_shards" : 1,
"number_of_replicas" : 0,
"index" : {
"query" : { "default_field" : "#words" },
"store" : { "compress" : { "stored" : true, "tv": true } }
}
},
"mappings": {
"_default_": {
"_all": { "enabled": false },
"_source": { "compress": true },
"dynamic_templates": [
{
"string_template" : {
"match" : "*",
"mapping": { "type": "string", "index": "not_analyzed" },
"match_mapping_type" : "string"
}
}
],
"properties" : {
"#fields": { "type": "object", "dynamic": true, "path": "full" },
"#words" : { "type" : "string", "index" : "analyzed" },
"#source" : { "type" : "string", "index" : "not_analyzed" },
"#source_host" : { "type" : "string", "index" : "not_analyzed" },
"#source_path" : { "type" : "string", "index" : "not_analyzed" },
"#tags": { "type": "string", "index" : "not_analyzed" },
"#timestamp" : { "type" : "date", "index" : "not_analyzed" },
"#type" : { "type" : "string", "index" : "not_analyzed" }
}
}
}
}
Please help
To do what you want, you have to set the index parameter in the Elasticsearch output block. Your output block will look like this:
output {
elasticsearch {
hosts => ["elasticserver:9200"]
index => "newIndexName-%{+YYYY.MM.dd}"
template => "pathtotemplate.json"
template_name => "newIndexName-*"
template_overwrite => true
}
stdout{}
}
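For the template to take effect, the index name produced by the index option also has to match the pattern inside the template file. The sketch below approximates that check in Python. The helper names are illustrative, fnmatch is only a stand-in for Elasticsearch's wildcard matching, and the prefix is written in lowercase because Elasticsearch index names must be lowercase.

```python
from datetime import date
from fnmatch import fnmatchcase


def daily_index(prefix, day):
    """Build a name like the one an index option of "<prefix>-%{+YYYY.MM.dd}" would yield."""
    return f"{prefix}-{day:%Y.%m.%d}"


def matching_patterns(index_name, patterns):
    """Return the template patterns that match index_name (only * is expected in patterns)."""
    return [p for p in patterns if fnmatchcase(index_name, p)]


name = daily_index("newindexname", date(2019, 7, 15))
print(name)

# "dictinary-*" is the pattern from the template file in the question.
print(matching_patterns(name, ["dictinary-*", "newindexname-*", "logstash-*"]))
```

In this model the pattern from the question's template file does not match the new index name, so the "template" value in the file would need to be adjusted as well.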

Multiple document types with same mapping in Elasticsearch

I have an index named test which can be associated with n document types named sub_text_1 to sub_text_n, but all of them will have the same mapping.
Is there any way to make an index such that all document types have the same mapping for their documents? I.e. test\sub_text1\_mapping should be the same as test\sub_text2\_mapping.
Otherwise if I have like 1000 document types, I will be having 1000 mappings of the same type referring to each document type.
UPDATE:
PUT /test_index/
{
"settings": {
"index.store.type": "default",
"index": {
"number_of_shards": 5,
"number_of_replicas": 1,
"refresh_interval": "60s"
},
"analysis": {
"filter": {
"porter_stemmer_en_EN": {
"type": "stemmer",
"name": "porter"
},
"default_stop_name_en_EN": {
"type": "stop",
"name": "_english_"
},
"snowball_stop_words_en_EN": {
"type": "stop",
"stopwords_path": "snowball.stop"
},
"smart_stop_words_en_EN": {
"type": "stop",
"stopwords_path": "smart.stop"
},
"shingle_filter_en_EN": {
"type": "shingle",
"min_shingle_size": "2",
"max_shingle_size": "2",
"output_unigrams": true
}
}
}
}
}
Intended mapping:
{
"sub_text" : {
"properties" : {
"_id" : {
"include_in_all" : false,
"type" : "string",
"store" : true,
"index" : "not_analyzed"
},
"alternate_id" : {
"include_in_all" : false,
"type" : "string",
"store" : true,
"index" : "not_analyzed"
},
"text" : {
"type" : "multi_field",
"fields" : {
"text" : {
"type" : "string",
"store" : true,
"index" : "analyzed"
},
"pdf": {
"type" : "attachment",
"fields" : {
"pdf" : {
"type" : "string",
"store" : true,
"index" : "analyzed"
}
}
}
}
}
}
}
}
I want this mapping to be an individual mapping for all sub_texts I create so that I can change it for one sub_text without affecting others e.g. I may want to add two custom analyzers to sub_text1 and three analyzers to sub_text3, rest others will stay same.
UPDATE:
PUT /my-index/document_set/_mapping
{
"properties": {
"type": {
"type": "string",
"index": "not_analyzed"
},
"doc_id": {
"type": "string",
"index": "not_analyzed"
},
"plain_text": {
"type": "string",
"store": true,
"index": "analyzed"
},
"pdf_text": {
"type": "attachment",
"fields": {
"pdf_text": {
"type": "string",
"store": true,
"index": "analyzed"
}
}
}
}
}
POST /my-index/document_set/1
{
"type": "d1",
"doc_id": "1",
"plain_text": "simple text for doc1."
}
POST /my-index/document_set/2
{
"type": "d1",
"doc_id": "2",
"pdf_text": "cGRmIHRleHQgaXMgaGVyZS4="
}
POST /my-index/document_set/3
{
"type": "d2",
"doc_id": "3",
"plain_text": "simple text for doc3 in d2."
}
POST /my-index/document_set/4
{
"type": "d2",
"doc_id": "4",
"pdf_text": "cGRmIHRleHQgaXMgaGVyZSBpbiBkMi4="
}
GET /my-index/document_set/_search
{
"query" : {
"filtered" : {
"filter" : {
"term" : {
"type" : "d1"
}
}
}
}
}
This gives me the documents related to type "d1". How can I add analyzers only to documents of type "d1"?
At the moment a possible solution is to use index templates or dynamic mapping. However they do not allow wildcard type matching so you would have to use the _default_ root type to apply the mappings to all types in the index and thus it would be up to you to ensure that all your types can be applied to the same dynamic mapping. This template example may work for you:
curl -XPUT localhost:9200/_template/template_1 -d '
{
"template" : "test",
"mappings" : {
"_default_" : {
"dynamic": true,
"properties": {
"field1": {
"type": "string",
"index": "not_analyzed"
}
}
}
}
}
'
Do not do this.
Otherwise if I have like 1000 document types, I will be having 1000 mappings of the same type referring to each document type.
You're exactly right. For every additional _type with an identical mapping you are needlessly adding to the size of your index's mapping. They will not be merged, nor will any compression save you.
A much better solution is to simply create a shared _type and to create a field that represents the intended type. This completely avoids having wasted mappings and all of the negatives associated with it, including an unnecessary increase for your cluster state's size.
From there, you can imitate what Elasticsearch is doing for you and filter on your custom type without ballooning your mappings.
$ curl -XPUT localhost:9200/my-index -d '{
"mappings" : {
"my-type" : {
"properties" : {
"type" : {
"type" : "string",
"index" : "not_analyzed"
},
# ... whatever other mappings exist ...
}
}
}
}'
Then, for any search against sub_text1 (etc.), you can use a term (for one) or terms (for more than one) filter to imitate the _type filter that would otherwise be applied for you.
$ curl -XGET localhost:9200/my-index/my-type/_search -d '{
"query" : {
"filtered" : {
"filter" : {
"term" : {
"type" : "sub_text1"
}
}
}
}
}'
This is doing the same thing as the _type filter and you can create _aliases that contain the filter if you want to have the higher level search capability without exposing client-level logic to the filtering.
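The filtered aliases mentioned above could be set up with a request body for POST /_aliases along the following lines. This is a sketch that only builds the JSON; the helper name and the alias naming scheme (one alias per type value) are assumptions, not something the answer prescribes.

```python
import json


def filtered_alias_actions(index, type_field, values):
    """Build an _aliases request body with one filtered alias per type value."""
    return {
        "actions": [
            {
                "add": {
                    "index": index,
                    "alias": value,  # assumed naming: alias is named after the type value
                    "filter": {"term": {type_field: value}},
                }
            }
            for value in values
        ]
    }


body = filtered_alias_actions("my-index", "type", ["sub_text1", "sub_text2"])
print(json.dumps(body, indent=2))
```

Searching through an alias such as sub_text1 would then apply the term filter automatically, so clients do not need to know about the shared type field.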

Resources