Changing an Elasticsearch index's shard count on the next index rotation - elasticsearch

I have an ELK (Elasticsearch-Kibana) stack wherein the Elasticsearch node has the default shard value of 5. Logs are pushed to it in Logstash format (logstash-YYYY.MM.DD), which - correct me if I am wrong - means they are indexed date-wise.
Since I cannot change the shard count of an existing index without reindexing, I want to increase the number of shards to 8 when the next index is created. I figured that the ES API allows on-the-fly persistent changes.
How do I go about doing this?

You can use the "Template Management" features in Elasticsearch: http://www.elasticsearch.org/guide/en/elasticsearch/reference/1.x/indices-templates.html
Create a new logstash template by using:
curl -XPUT localhost:9200/_template/logstash -d '
{
  "template": "logstash-*",
  "settings": {
    "number_of_replicas": 1,
    "number_of_shards": 8,
    "index.refresh_interval": "5s"
  },
  "mappings": {
    "_default_": {
      "_all": {
        "enabled": true
      },
      "dynamic_templates": [
        {
          "string_fields": {
            "match": "*",
            "match_mapping_type": "string",
            "mapping": {
              "type": "string",
              "index": "analyzed",
              "omit_norms": true,
              "fields": {
                "raw": {
                  "type": "string",
                  "index": "not_analyzed",
                  "ignore_above": 256
                }
              }
            }
          }
        }
      ],
      "properties": {
        "@version": {
          "type": "string",
          "index": "not_analyzed"
        },
        "geoip": {
          "type": "object",
          "dynamic": true,
          "path": "full",
          "properties": {
            "location": {
              "type": "geo_point"
            }
          }
        }
      }
    }
  }
}'
The next time the index that matches your pattern is created, it will be created with your new settings.
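To verify, you can read the template back, and, once the next daily index has been created, check that the settings were applied (the index name below is just an example):
curl -XGET 'localhost:9200/_template/logstash?pretty'
curl -XGET 'localhost:9200/logstash-2015.06.01/_settings?pretty'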

This setting lives on your Elasticsearch node. You need to change the config file config/elasticsearch.yml:
set index.number_of_shards: 8 and restart Elasticsearch. The new configuration will be applied, and any newly created index will use it, i.e. be created with 8 shards as you want.
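A minimal sketch of that change (note that this is a node-level default that only affects indices created after the restart; also, Elasticsearch 5.x and later no longer allow index-level settings in elasticsearch.yml, so this approach only works on older releases such as the 1.x used here):
# config/elasticsearch.yml
index.number_of_shards: 8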

Best would be to use templates, and to add one I would recommend the Kopf plugin found here: https://github.com/lmenezes/elasticsearch-kopf
You can of course use the API:
curl -XPUT $ELASTICSEARCH-MASTER$:9200/_template/$TEMPLATE-NAME$ -d '$TEMPLATE-CONTENT$'
In the plugin: in the top left corner click on more -> Index templates, then create a new template and make sure you have the following settings as part of your template:
{
  "order": 0,
  "template": "logstash*",
  "settings": {
    "index": {
      "number_of_shards": "5",
      "number_of_replicas": "1"
    }
  },
  "mappings": {### your mapping ####},
  "aliases": {}
}
The above template ensures that any new index whose name matches logstash* is created with 5 shards and 1 replica.
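For instance, a version of the generic API call above with the placeholders filled in for the 8 shards the question asks about (the host and the empty mappings are illustrative):
curl -XPUT localhost:9200/_template/logstash -d '
{
  "order": 0,
  "template": "logstash*",
  "settings": {
    "index": {
      "number_of_shards": "8",
      "number_of_replicas": "1"
    }
  },
  "mappings": {},
  "aliases": {}
}'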

Related

Elasticsearch creates multiple indices on a daily basis

I load this index in elasticsearch
curl -XPUT 'localhost:9200/filebeat?pretty' -H 'Content-Type: application/json' -d'
{
  "mappings": {
    "_default_": {
      "_all": {
        "enabled": true,
        "norms": {
          "enabled": false
        }
      },
      "dynamic_templates": [
        {
          "template1": {
            "mapping": {
              "doc_values": true,
              "ignore_above": 50000,
              "index": "not_analyzed",
              "type": "{dynamic_type}"
            },
            "match": "*"
          }
        }
      ],
      "properties": {
        "@timestamp": {
          "type": "date"
        },
        "message": {
          "type": "string",
          "index": "analyzed"
        },
        "offset": {
          "type": "long",
          "doc_values": "true"
        },
        "geoip": {
          "type": "object",
          "dynamic": true,
          "properties": {
            "location": { "type": "geo_point" }
          }
        }
      }
    }
  },
  "settings": {
    "index.refresh_interval": "2s"
  },
  "template": "filebeat-*"
}
'
and the result of curl 'localhost:9200/_cat/indices?v' is filebeat-2018-02-05.
Every day one more index is added to Elasticsearch, and I have to add it in Kibana if I want to search my latest log file. Why does Elasticsearch add a new index every day, and how can I solve this problem (i.e. just have my own indices)?
Thank you.
I assume you're pushing data to Elasticsearch using Filebeat.
Elasticsearch doesn't decide what index your data should be written to; it is Filebeat that tells Elasticsearch where the data should be written, and the default behaviour of Filebeat/Logstash is to create a new index every day.
If you want to visualize data for a range of indices, you can use the wildcard symbol in your Kibana index pattern, say filebeat-*. All visualizations created against filebeat-* will then aggregate data from all your filebeat- indices.
The reason to have a new index every day is to help with the logging use case, where new data is more valuable than old data. This gives you an opportunity to easily retire old data, or to move old indices to a less performant Elasticsearch node, etc.
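For instance, retiring old data then becomes a single call against the dated indices (the index pattern is illustrative):
curl -XDELETE 'localhost:9200/filebeat-2018-01-*'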
If you still need a different index pattern, you should be able to modify your Filebeat config file and specify a new index value, as sketched below (see the documentation).
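A minimal sketch of such a filebeat.yml change, assuming Filebeat 6.x and a hypothetical myapp prefix (changing the index also requires overriding the template name and pattern):
output.elasticsearch:
  hosts: ["localhost:9200"]
  index: "myapp-%{+yyyy.MM.dd}"
setup.template.name: "myapp"
setup.template.pattern: "myapp-*"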

How do I alter the schema without destroying data in elasticsearch?

This is my current schema
{
  "mappings": {
    "historical_data": {
      "properties": {
        "continent": {
          "type": "string",
          "index": "not_analyzed"
        },
        "country": {
          "type": "string",
          "index": "not_analyzed"
        },
        "description": {
          "type": "string"
        },
        "funding": {
          "type": "long"
        },
        "year": {
          "type": "integer"
        },
        "agency": {
          "type": "string"
        },
        "misc": {
          "type": "string"
        },
        "university": {
          "type": "string",
          "index": "not_analyzed"
        }
      }
    }
  }
}
I have 700k records uploaded. Without destroying the data, how can I change the university field so that it is no longer "not_analyzed", in a way that the change reflects in my existing data?
The mapping for an existing field cannot be modified.
However, you can achieve the desired outcome in two ways.
Create another field. Adding fields is free, using the PUT _mapping API (the type name follows your schema above):
curl -XPUT localhost:9200/YOUR_INDEX/_mapping/historical_data -d '{
  "properties": {
    "new_university": {
      "type": "string"
    }
  }
}'
Use multi-fields: add an analyzed sub-field to your not_analyzed field.
curl -XPUT localhost:9200/YOUR_INDEX/_mapping/historical_data -d '{
  "properties": {
    "university": {
      "type": "string",
      "index": "not_analyzed",
      "fields": {
        "university_analyzed": {
          "type": "string"
        }
      }
    }
  }
}'
In both cases, you need to reindex your documents in order to populate the new field. Note that the _reindex API cannot write into the same index it reads from, so to rewrite the documents in place use the _update_by_query API instead; without a script it simply re-saves every document, which populates the new (sub-)field:
curl -XPOST 'localhost:9200/YOUR_INDEX/_update_by_query?conflicts=proceed'
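Once that completes, you can check that the new sub-field is searchable (the query term is illustrative):
curl -XGET 'localhost:9200/YOUR_INDEX/_search?pretty' -d '{
  "query": {
    "match": {
      "university.university_analyzed": "oxford"
    }
  }
}'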
You are not exactly forced to "destroy" your data; what you can do is reindex your data as described in this article (I'm not going to rip off the examples, as they are particularly clear in the section "Reindexing your data with zero downtime").
For reindexing, you can also take a look at the reindex API, the simplest invocation being:
POST _reindex
{
  "source": {
    "index": "twitter"
  },
  "dest": {
    "index": "new_twitter"
  }
}
Of course it will take some resources to perform this operation, so I would suggest that you take a complete look at the changes you want to introduce in your mapping, and perform the operation when you have the least amount of activity on your servers (e.g. during the weekend, or at night...)
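The zero-downtime recipe from that article boils down to letting your application talk to an alias rather than a concrete index, so that after the reindex finishes you can swap the alias atomically. A minimal sketch, with a hypothetical alias name and the index names from the example above:
curl -XPOST localhost:9200/_aliases -d '{
  "actions": [
    { "remove": { "index": "twitter", "alias": "my_alias" } },
    { "add": { "index": "new_twitter", "alias": "my_alias" } }
  ]
}'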

Default elasticsearch configuration for docker container

What is the best way to configure an ES index template with mappings in a docker container? I expected to use a template file, but it seems that from version 2 that is no longer possible. Executing an HTTP request also won't work, because the process isn't running during container creation. It could be done on each container launch with a script that starts ES and then executes the HTTP request against it, but that looks really ugly.
You can configure a template with mappings by executing an HTTP PUT request in a Linux terminal, as follows:
curl -XPUT http://ip:port/_template/logstash -d '
{
  "template": "logstash-*",
  "settings": {
    "number_of_replicas": 1,
    "number_of_shards": 8
  },
  "mappings": {
    "_default_": {
      "_all": {
        "store": false
      },
      "_source": {
        "enabled": true,
        "compress": true
      },
      "properties": {
        "_id": {
          "index": "not_analyzed",
          "type": "string"
        },
        "_type": {
          "index": "not_analyzed",
          "type": "string"
        },
        "field1": {
          "index": "not_analyzed",
          "type": "string"
        },
        "field2": {
          "type": "double"
        },
        "field3": {
          "type": "integer"
        },
        "xy": {
          "properties": {
            "x": {
              "type": "double"
            },
            "y": {
              "type": "double"
            }
          }
        }
      }
    }
  }
}
'
The "logstash-*" is your index name, you can have a try.
If using Logstash, you can make the template part of your Logstash pipeline config:
pipeline/logstash.conf
input {
  ...
}
filter {
  ...
}
output {
  elasticsearch {
    hosts => "elasticsearch:9200"
    template => "/usr/share/logstash/templates/logstash.template.json"
    template_name => "logstash"
    template_overwrite => true
    index => "logstash-%{+YYYY.MM.dd}"
  }
}
Reference: https://www.elastic.co/guide/en/logstash/6.1/plugins-outputs-elasticsearch.html#plugins-outputs-elasticsearch-template

Kibana doesn't show results on tile map

I have approximately 3300 documents with a geo_point typed field filled in.
When I try to visualize my documents on the tile map, Kibana says "no results found".
I've already tried putting the coordinates as:
- a geohash in a string
- a [lon, lat] array
- an object with "lat" and "lon" properties
- a string "lat,lon"
All these ways of setting a geo_point are allowed according to the ES docs.
Kibana detects this field as a geo_point (there is a globe icon near the field name), but nothing shows up on the tile map.
What am I doing wrong?
I'm using Kibana 4.2 and Elasticsearch 2.0.0.
I've managed to solve it.
It was happening because my geo_point typed field was inside a field with the "type": "nested" parameter.
I changed this outer field to "dynamic": "true" and now I can visualize my locations!
I was able to keep a nested geo_point by removing the "type": "nested" from the mapping; no "dynamic": "true" needed. My mapping looks like this:
"mappings": {
"_default_": {
"_all": {
"enabled": true
},
"_ttl": {
"enabled": true,
"default": "12m"
},
"dynamic_templates": [{
"string_fields": {
"match": "*",
"match_mapping_type": "string",
"mapping": {
"type": "string",
"index": "analyzed",
"omit_norms": true,
"fields": {
"raw": {
"type": "string",
"index": "not_analyzed",
"ignore_above": 256
}
}
}
}
}],
"properties": {
"#version": {
"type": "string",
"index": "not_analyzed"
},
"user_data": {
"properties": {
"user_geolocation": {
"properties": {
"location": {
"type": "geo_point"
}
}
}
}
}
}
}
}

ElasticSearch - Reindex to add doc_value

What am I trying to do?
Add doc_values to an existing index.
What have I tried?
Created index and document
POST /my_index-1/my_type/1
{
  "my_prop": "my_value"
}
Added a template
PUT /_template/my_template
{
  "id": "my_template",
  "template": "my_index-*",
  "mappings": {
    "_default_": {
      "dynamic_templates": [
        {
          "my_prop_template": {
            "mapping": {
              "index": "not_analyzed",
              "doc_values": true,
              "fielddata": {
                "format": "doc_values"
              },
              "type": "string"
            },
            "match": "my_prop",
            "match_mapping_type": "string"
          }
        }
      ]
    }
  }
}
Reindexed
./stream2es es --source http://localhost:9200/my_index-1 --target http://localhost:9200/my_index-2
What went wrong?
In the new index my_index-2 the property did not receive "doc_values": true:
...
"properties": {
  "my_prop": {
    "type": "string"
  }
}
...
Just for sanity, I have also tried adding the same document to a fresh my_index-3, and there it did get "doc_values": true.
My question
How can I reindex my old index with "doc_values": true?
Thanks @Val! Logstash indeed solved the problem.
Both stream2es and elasticsearch-reindex created the new mapping without "doc_values": true.
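For reference, the Logstash reindexing pipeline is essentially an elasticsearch input feeding an elasticsearch output. A minimal sketch, assuming a local cluster and the index names from the question:
input {
  # read every document from the old index, keeping its metadata
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "my_index-1"
    docinfo => true
  }
}
output {
  # write to the new index (where the template applies), preserving type and id
  elasticsearch {
    hosts => "localhost:9200"
    index => "my_index-2"
    document_type => "%{[@metadata][_type]}"
    document_id => "%{[@metadata][_id]}"
  }
}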
