How to point an Elasticsearch alias to the current index and remove the alias from the old index via an index template? - elasticsearch

In our application, we create an Elasticsearch index daily, with the index pattern index-<date> (e.g. index-17-09-2019). But our application accesses the index through an alias that points to the current index. At the moment, attaching the alias to the new index and removing it from the old one is done by a cron job. Is it possible to do this through an index template instead, since we want to avoid the cron job?
We can attach an alias to an index through an index template, but I am not sure whether we can detach the alias from the old index and add it to the new index through an index template.

That can be done with built-in index lifecycle management (ILM). Your application will send data to the index alias and ILM will take care of the rest.
Here is a description of how it can be done; basically you need to:
1. Create ILM job
PUT /_ilm/policy/my_policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_age": "1d"
          }
        }
      }
    }
  }
}
2. Create an index template with the ILM policy attached
PUT _template/my_template
{
  "index_patterns": ["test-*"],
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 1,
    "index.lifecycle.name": "my_policy",
    "index.lifecycle.rollover_alias": "test-alias"
  }
}
3. Start the process by creating the initial index
PUT test-000001
{
  "aliases": {
    "test-alias": {
      "is_write_index": true
    }
  }
}
That will handle the creation of a new index every day without an external cron job. You can also extend your policy later on, e.g. to delete old indices 7 days after rollover, as sketched below.
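For example, the policy from step 1 could later be extended with a delete phase; a sketch along these lines (the 7-day retention is just an illustration):
PUT /_ilm/policy/my_policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_age": "1d"
          }
        }
      },
      "delete": {
        "min_age": "7d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}
In the delete phase, min_age is measured from the rollover time, so each index is removed roughly 7 days after it stops being written to.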
Hope that helps.

Related

ILM new index does not obey my policy's GB limit

I have a policy using the index pattern logstash-* and the alias logstash-rollover. My first index is logstash-001; I created it manually and attached it to the rollover alias.
After 50GB (my policy sets a 50GB maximum), another index is created: logstash-002, which is fine. The problem is that my second index grows to about 200GB and beyond; it seems the policy is not applied to indices other than -001.
Index -001 : (check policy)
Index -002: (no policy here)
TL;DR:
The ILM policy name does not work with a pattern; it is just the name of the policy.
You need to create an index template, which holds the ILM policy. The index template uses an index pattern, which should match your indices.
This tutorial explains it nicely.
Solution
Create a template
PUT _index_template/automated_ILM
{
  "index_patterns": ["logstash-*"],
  "template": {
    "settings": {
      "index.lifecycle.name": "<Your ILM policy name>",
      "index.lifecycle.rollover_alias": "logstash-rollover"
    }
  }
}
Apply the ILM policy manually to the index logstash-002:
PUT logstash-002/_settings
{
  "index": {
    "lifecycle": {
      "name": "<Your ILM policy name>"
    }
  }
}
Then do the rollover manually:
POST logstash-rollover/_rollover
And you should be all set.
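If you want to double-check the result, the ILM explain API shows whether each index is now managed and by which policy:
GET logstash-*/_ilm/explain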

elasticsearch template doesn't change index ILM

In my Elasticsearch cluster, I receive a daily index with a format like dstack-prod_dcbs-<date>. I want to add ILM to them immediately after they are received. I don't know why the ILM policy is not added to the indices. Below you can find my command (I have already defined the "dstack-prod_dcbs-policy" ILM policy).
PUT _template/dstack-prod_dcbs
{
  "index_patterns": ["dstack-prod_dcbs-*"],
  "settings": {
    "index.lifecycle.name": "dstack-prod_dcbs-policy"
  }
}
but when I run
GET dstack-prod_dcbs*/_ilm/explain
the result below is returned:
{
  "indices" : {
    "dstack-prod_dcbs-20200821" : {
      "index" : "dstack-prod_dcbs-20200821",
      "managed" : false
    },
    "dstack-prod_dcbs-2020-09-22" : {
      "index" : "dstack-prod_dcbs-2020-09-22",
      "managed" : false
    }
  }
}
I believe ILM is an alternative to using daily indices: indices are rolled over when a condition in the policy is met (not when a new day starts).
For ILM you need to define a rollover alias in the template:
PUT _template/dstack-prod_dcbs
{
  "index_patterns": ["dstack-prod_dcbs-*"],
  "settings": {
    "index.lifecycle.name": "dstack-prod_dcbs-policy",
    "index.lifecycle.rollover_alias": "dstack-prod_dcbs"
  }
}
Then you need to create the first index manually and assign it as the write index for the alias:
PUT dstack-prod_dcbs-000001
{
  "aliases": {
    "dstack-prod_dcbs": {
      "is_write_index": true
    }
  }
}
After that everything will be handled automatically: a new index will be created on rollover and assigned as the write index for the alias.
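To verify, the explain call from the question can be re-run once rollover has created a new index through the alias:
GET dstack-prod_dcbs*/_ilm/explain
Indices created by rollover should now report "managed": true; the two pre-existing daily indices will remain unmanaged unless the policy is applied to them manually, as shown in the previous answer.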

Update configuration for actively used index without data loss

Sometimes, I need to update mappings, settings, or bind default pipelines to the actively used index.
For the time being, I am using a method with data loss as follows:
1. update the index template with the proper mapping (or bind the default pipeline via index.default_pipeline);
2. create a_new_index (matching the template's index_patterns);
3. reindex index_to_fix into a_new_index to migrate the data already indexed;
4. use an alias to redirect incoming indexing requests to a_new_index (the alias has the same name as index_to_fix so that indexing is undisturbed) and delete index_to_fix.
But between step 3 and step 4 there is a time gap, during which newly indexed data still goes to the original index_to_fix and is lost when it is deleted.
Is there a way to update the configuration of an actively used index without any data loss?
Thanks to the help of @LeBigCat and some discussion, I think this problem can be solved in three steps.
Use Alias for CRUD
First things first: try not to use the index directly; use an alias if possible. Since you can't create an alias with the same name as an existing index, you can't transparently swap out a badly designed index once clients address it by name. The easiest way is to use a template and include the index name directly in the alias:
PUT _template/test
{
  ...
  "aliases": {
    "{index}-alias": {}
  }
}
Redirect the Indexing
Since index_to_fix is being actively used, after updating the template and creating a new index a_new_index, we can use the alias to redirect indexing to a_new_index:
POST /_aliases
{
  "actions": [
    { "add": { "index": "a_new_index", "alias": "index_to_fix-alias" } },
    { "remove": { "index": "index_to_fix", "alias": "index_to_fix-alias" } }
  ]
}
Migrating the Data
Simply use _reindex to migrate all the data from index_to_fix to a_new_index (the destination is addressed through the alias, which now points to a_new_index):
POST _reindex
{
  "source": {
    "index": "index_to_fix"
  },
  "dest": {
    "index": "index_to_fix-alias"
  }
}
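Once the reindex has completed, the old index can be dropped; at this point all new writes already go to a_new_index through the alias:
DELETE index_to_fix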

Elasticsearch - Reindex whole cluster using pattern for new index name

I have a cluster with thousands of indices, with 5 shards per index.
I would like to reindex them with only 1 shard per index.
Is there a built-in solution in Elastic to reindex, for instance, all the indices while adding "-reindexed" to each index name?
Looks like you want to dynamically change the index names while reindexing.
Let's understand this with an example:
1) Add some indices:
POST sample/_doc/1
{
  "test": "sample"
}
POST sample1/_doc/1
{
  "test": "sample"
}
POST sample2/_doc/1
{
  "test": "sample"
}
2) Use Reindex API to dynamically change the index names while reindexing multiple indices:
POST _reindex
{
  "source": {
    "index": "sample*"
  },
  "dest": {
    "index": "sample"
  },
  "script": {
    "source": "ctx._index = ctx._index + '-reindexed'"
  }
}
The above request will reindex all the indices starting with sample and add -reindexed to their index names. That means sample, sample1, and sample2 will be reindexed as sample-reindexed, sample1-reindexed, and sample2-reindexed, all with this one request. (The dest.index value is only a default here; the script rewrites the destination index for every document.)
In order to set up the destination indices with one shard, you need to create those indices before reindexing, for example via an index template as sketched below.
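A minimal sketch of such a template (the template name is made up here; the pattern matches the -reindexed suffix so every destination index is created with one shard):
PUT _template/reindexed_defaults
{
  "index_patterns": ["*-reindexed"],
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 1
  }
}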
Hope that helps.
You could do a simple reindex but I'd also recommend you take a look at the Shrink Index API:
https://www.elastic.co/guide/en/elasticsearch/reference/7.0/indices-shrink-index.html
The documentation above links to v7.0, but this has been around for many iterations.
In your example, you would do something similar to the following:
First, relocate a copy of every shard (primary or replica) to a single node, and prevent any future write access while the shrink operation is being performed:
PUT /my_source_index/_settings
{
  "settings": {
    "index.routing.allocation.require._name": "shrink_node_name",
    "index.blocks.write": true
  }
}
Initiate the shrink operation, clear the index settings set in the previous command, and update your primary and replica settings on the target index:
POST my_source_index/_shrink/my_target_index-reindexed
{
  "settings": {
    "index.routing.allocation.require._name": null,
    "index.blocks.write": null,
    "index.number_of_replicas": 1,
    "index.number_of_shards": 1,
    "index.codec": "best_compression"
  }
}
Note that the above also allocates a replica shard; if you don't want this, set index.number_of_replicas to 0.
You would want to set up a script of some sort to iterate through the list of source indices one by one; the cat indices API shown below can provide that list.
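As a starting point, such a script could pull the list of source indices from the cat indices API (the index pattern here is just an illustration):
GET _cat/indices/my_source_*?h=index
This returns one index name per line, which is easy to loop over.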

Modify default number of Elasticsearch shards

If I have a 15 node cluster, do I have to change the
index.number_of_shards
value on all 15 nodes, and restart them, before the new value comes into effect for new indexes?
That is right: changing the index.number_of_shards default in the config file would involve changing the setting on all nodes and then restarting the instances, ideally following the guidelines for rolling restarts.
However, if that is not an option, and explicitly specifying number_of_shards in the settings when creating each new index is not ideal either, then the workaround would be to use index templates.
Example:
One can create an index_defaults template as below:
PUT /_template/index_defaults
{
  "template": "*",
  "settings": {
    "number_of_shards": 4
  }
}
This applies the settings specified in the index_defaults template to all new indices.
Once you set the number of shards for an index in Elasticsearch, you cannot change it. You will need to create a new index with the desired number of shards and, depending on your use case, you may then want to transfer the data to the new index.
I say depending on the use case because, for instance, if you are storing time based data such as log events, it is perfectly reasonable to close one index and open a new one with a different number of shards, and index all data going forward to that new index, keeping the old one for searches.
However, if your use case is, for instance, storing blog documents, and your indices are by topic, then you will need to (a) create new indices as stated above with a different number of shards and (b) reindex your data. For (b) I recommend using the Scroll and Scan API to get the data out of the old index.
You need to create a template for new indices that will be created:
PUT /_template/index_defaults
{
  "index_patterns": "*",
  "settings": {
    "index": {
      "number_of_shards": 1,
      "number_of_replicas": 1
    }
  }
}
For old indices you need to reindex.
Example: from my_old_index to my_new_index
Create the new index with appropriate mapping and settings:
PUT my_new_index
{
  "settings": {
    "index": {
      "number_of_shards": 1,
      "number_of_replicas": 1
    }
  }
}
Reindex from the old index to the new one; specify a type only if you desire:
POST /_reindex?slices=5
{
  "size": 100000,
  "source": { "index": "my_old_index" },
  "dest": { "index": "my_new_index", "type": "my_type" }
}
Updated syntax to avoid some deprecation warnings in Elasticsearch 6+
per
https://www.elastic.co/guide/en/elasticsearch/reference/6.0/indices-templates.html
PUT /_template/index_defaults
{
  "index_patterns": ["*"],
  "order": 0,
  "settings": {
    "number_of_shards": 2
  }
}
Please remember that the number of shards is a static setting that should be specified when creating an index. Any change after the index is created will require a complete reindex, which takes time.
To set the number of shards when creating an index, use this command:
curl -XPUT 'localhost:9200/my_sample_index?pretty' -H 'Content-Type: application/json' -d'
{
  "settings": {
    "number_of_shards": 2,
    "number_of_replicas": 0
  }
}
'
You don't have to run this on all the nodes; run it on any one node. All the nodes communicate with each other about the change to the index.
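To confirm the shard count took effect on the newly created index, you can read its settings back:
GET my_sample_index/_settings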
