Elasticsearch Opendistro ISM: What approach can be taken to apply the rollover alias and policy to new indices, automatically?

When ISM policies are used, the policy settings have to be applied at index creation time, but they are not carried over to the new index that the rollover action creates in a given state/phase of the policy.
For instance, having indices in the form:
pattern: msp-* [* => number, in the index template]
alias: msp-*-alias [applied during the index creation]
rollover alias: msp-*-alias
policy: msp-policy-id
Having a template index pattern msp-* (where * is a number) makes it impossible to automatically apply a distinct rollover alias msp-*-alias for each value that * can take. How can this situation be approached?
References:
Can variables be used in elasticsearch index templates?
https://discuss.elastic.co/t/index-lifecycle-management-dynamic-rollover-alias-and-template-name/169614
https://github.com/elastic/elasticsearch/issues/20367
https://github.com/opendistro-for-elasticsearch/index-management/issues/95
https://docs.aws.amazon.com/elasticsearch-service/latest/developerguide/ism.html

In an ISM policy the alias does not change after rollover. For example, after multiple rollovers you will have the indices msp-000001, msp-000002, msp-000003, while all of them keep pointing to a single static alias such as msp-alias.
The index settings can be applied through a template when rollover creates the new index. Below is an example index template.
PUT _template/msp_template
{
  "index_patterns": ["msp-*"],
  "settings": {
    "number_of_shards": 2,
    "number_of_replicas": 1,
    "index": {
      "opendistro.index_state_management.rollover_alias": "msp-alias"
    }
  }
}
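Note that rollover also needs an initial write index that already carries the alias. A minimal bootstrap sketch, assuming the alias name msp-alias from the template above (depending on the Open Distro version, the policy itself can also be attached automatically, for example through the opendistro.index_state_management.policy_id setting in the same template or through an ISM template defined inside the policy):
PUT msp-000001
{
  "aliases": {
    "msp-alias": {
      "is_write_index": true
    }
  }
}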

Related

Moving data from one Elasticsearch index to another with a higher number of shards, or increasing the shard count of an existing index

I am new to Elasticsearch and I have been reading the documentation to find a way of increasing the number of shards my index consists of. Currently my index looks like this:
country_data 0 p STARTED 227 100.7kb 192.168.0.115 $HOSTNAME
country_data 0 r STARTED 227 100.7kb 192.168.0.116 $HOSTNAME
I wanted to increase the number of shards to 5, but I was unable to find a proper way of doing it. I learnt from another Stack Overflow question that I should be able to do it like this:
POST _reindex?slices=5
{
  "source": {
    "index": "country_data"
  },
  "dest": {
    "index": "country_data_new"
  }
}
However, when I did that I got a copy of my country_data with the same number of shards and replicas (1 and 1). I tried to learn more about it in the documentation, but all I found is this: https://www.elastic.co/guide/en/elasticsearch/client/curator/current/option_slices.html
I couldn't find anything in the documentation about increasing the number of shards of an existing index, or about how to move data to a new index that has more shards. I would be grateful for any insights into this problem, or at least a pointer to where I could learn how to do it.
This can be done in either of the two ways mentioned below.
1st Option: You can use the Elasticsearch Split Index API (a rough sketch is included after the reindex snippet below).
I suggest you go through the documentation once before proceeding with this method.
2nd Option: Create a new index with the same mappings and the required number of shards in its settings, then use the Reindex API to copy the data from the source index to the destination index.
To create the new index:
PUT /<NEW_INDEX_NAME>
{
  "settings": {
    "number_of_shards": <REQUIRED_NUMBER_OF_SHARDS>
  },
  "mappings": {<MAPPINGS_OF_SOURCE_INDEX>}
}
If you don't specify the number of shards in the settings while creating an index, it is created with one primary and one replica shard by default.
To reindex from the source to the newly created index:
POST _reindex
{
  "source": {
    "index": "<SOURCE_INDEX_NAME>"
  },
  "dest": {
    "index": "<NEW_INDEX_NAME>"
  }
}
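For the 1st option, a rough sketch of the Split Index API (the target name country_data_split is only an example; the source index has to be made read-only first, and the target's shard count must be a multiple of the source's primary shard count):
PUT /country_data/_settings
{
  "index.blocks.write": true
}

POST /country_data/_split/country_data_split
{
  "settings": {
    "index.number_of_shards": 5
  }
}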

How to create a rolling index with the date as the index name?

My Elasticsearch index will ingest thousands of documents per second. The service that puts documents into the index doesn't create a new index; it just gets the current date in Node.js and indexes docs into "log-YYYY.MM.DD". So we know the index is created automatically if it is not present.
Now, my question is: can this practice of creating the index and putting docs into it at the same time cause performance issues or failures, given that the index will be ingesting thousands of docs per second?
If the answer to the above question is yes, how can I create a rolling index with the date as the index name? Say today is 5 May, 2021, so I want automatic creation of the index for 6 May, 2021 in the format log-2021.05.06.
For your first question, maybe this can help:
how many indices?
For the second question, I think you can use an index alias, like:
PUT /_index_template/logdate_template
{
  "index_patterns": [
    "log*"
  ],
  "priority": 1,
  "template": {
    "aliases": {
      "log": {}
    },
    "mappings": {
      //your mappings
    }
  }
}
Since the index_pattern here is "log*", in your application code you can have a job that creates the index every day by generating the date in the required format and calling
PUT log-YYYY.MM.DD
The advantage of the index alias is that you can access all those indices through "log" alone.
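As a rough illustration of that flow, using the date from the question (the daily job issues the first call once it has generated the date string):
# index created by the daily job for the next day
PUT log-2021.05.06

# search across every daily index through the shared "log" alias
GET log/_search
{
  "query": {
    "match_all": {}
  }
}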

AWS Elasticsearch: disable replication for all indices

I am using a single-node AWS ES cluster. Currently its health status is yellow, which is expected because there is no other node to which Amazon ES can assign a replica. I want to set the replica count of all my current and upcoming indices to 0. I have indices created in this pattern:
app-one-2021.02.10
app-two-2021.01.11
so on...
These indices currently have number_of_replicas set to 1. To disable replication for all of them I send a PUT request with a wildcard index pattern:
PUT /app-one-*/_settings
{
  "index": {
    "number_of_replicas": 0
  }
}
Since I am using a wildcard here, it should set number_of_replicas to 0 in all the matching indices, which it does successfully.
But if any new index is created in the future, let's say app-one-2021.03.10, then number_of_replicas is again set to 1 for that index.
Every time, I have to run a PUT request to set number_of_replicas to 0, which is tedious. Why do new indices not pick up number_of_replicas: 0 even though I used a wildcard (*) in my PUT request?
Is there any way to set number_of_replicas to 0 once and for all, regardless of whether an index is new or old? How can I achieve this?
Yes, the way is to define index templates.
Before Elasticsearch v7.8, you could only use the _template API (see docs). E.g., in your case, you can create a template matching all the app-* indices:
PUT _template/app_settings
{
  "index_patterns": ["app-*"],
  "settings": {
    "number_of_replicas": 0
  }
}
Since Elasticsearch v7.8, the old API is still supported but deprecated, and you can use the _index_template API instead (see docs).
PUT _index_template/app_settings
{
  "index_patterns": ["app-*"],
  "template": {
    "settings": {
      "number_of_replicas": 0
    }
  }
}
Update: added code snippets for both the _template and _index_template APIs.
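As a quick sanity check, assuming a new index such as the app-one-2021.03.10 mentioned in the question:
PUT app-one-2021.03.10

GET app-one-2021.03.10/_settings
# the response should now show "number_of_replicas": "0"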

Elasticsearch: reindexing to a different index gives a different number of documents

I am trying to reindex an existing index into another index, e.g. index all documents from index A into index B. Index B has a new mapping, but when I look at the number of documents in both indices they differ considerably; index B ends up with approximately 19000 fewer documents than index A. What could be the reason for this? Here is the reindexing call:
POST /_reindex
{
  "source": {
    "index": "A"
  },
  "dest": {
    "index": "B"
  }
}
EDIT: I needed to remove a type from an existing index and add some new types to it. Below are the steps that I performed to remove the existing type from the index and add the new type (a console sketch of steps 3-7 follows):
1. Remove all data of that type from index A.
2. Download the new data into index A.
3. Create index B and update its mapping (the latest mapping for the new data).
4. Reindex from the original index A to the new index B.
5. Remove the original index A.
6. Create the original index A with the updated mapping.
7. Reindex from index B back to index A.
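A rough console sketch of steps 3-7 as described above, with the mappings elided:
PUT B
{
  "mappings": { <NEW_MAPPINGS> }
}

POST _reindex
{
  "source": { "index": "A" },
  "dest": { "index": "B" }
}

DELETE A

PUT A
{
  "mappings": { <NEW_MAPPINGS> }
}

POST _reindex
{
  "source": { "index": "B" },
  "dest": { "index": "A" }
}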

Do you need to delete Elasticsearch aliases?

Can't seem to find a simple yes or no answer to this question.
When you have an index with one or more aliases, can you just delete the index without any negative side effects? Will deleting the index also delete the aliases? Should you remove all aliases first before deleting an index?
What is considered best practice?
A simple test provides the answer.
First create an index:
PUT my_index
Then create an alias:
POST _aliases
{
  "actions": [
    {
      "add": {
        "index": "my_index",
        "alias": "alias1"
      }
    }
  ]
}
Verify the alias exists:
GET _aliases # should return the alias named alias1
GET alias1 # should resolve to my_index and return its settings and mappings
Delete the index:
DELETE my_index
Check that the alias is gone too
GET _aliases # should be empty
GET alias1 # should return "no such index"
To sum it up: no, you don't need to delete aliases before or after deleting an index. Simply deleting the index takes care of removing the now-orphaned aliases as well.
