Possible to get an ElasticSerach template to pick up existing indexes? - elasticsearch

I have an ElasticSearch template that includes several indices and creates several aliases. When testing I've always created my template (and the underlying aliases) first and then the indexes. When I do this the aliases show up as expected. I'm not in a situation where the indexes already exist on a test environment and when I created my template there the aliases are not showing up.
Am I correct in assuming that the reason the aliases aren't showing is because the indexes already existed? If that is correct, is there a way to get the template to pick up the indices without deleting and re-creating the indices? Why is it that the indices need to be created after the template in order to be picked up?
I'm new to ElasticSerach so if the answer is obvious here I apologize. I looked through the documentation for templates, indexes and aliases but couldn't find an explanation for the behavior I was seeing.

Templates are only applied in the index creation.
If you create an index before you created the template that would match that index, this template won't be applied, the same things happens if you change something in your template like the number of shards or aliases.
From the documentation about index templates you have:
An index template is a way to tell Elasticsearch how to configure an index when it is created.
And
Templates are configured prior to index creation. When an index is created - either manually or through indexing a document - the template settings are used as a basis for creating the index.
If you define your alias in the template it will only be applied to index created after the creation of the template, if you want to set alias to existing indices you will need to do it manually using the alias API.
You can't apply templates to existing indices.

Related

Elastic Search:Update of existing Record (which has custom routing param set) results in duplicate record, if custom routing is not set during update

Env Details:
Elastic Search version 7.8.1
routing param is an optional in Index settings.
As per ElasticSearch docs - https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-routing-field.html
When indexing documents specifying a custom _routing, the uniqueness of the _id is not guaranteed across all of the shards in the index. In fact, documents with the same _id might end up on different shards if indexed with different _routing values.
We have landed up in same scenario where earlier we were using custom routing param(let's say customerId). And for some reason we need to remove custom routing now.
Which means now docId will be used as default routing param. This is creating duplicate record with same id across different shard during Index operation. Earlier it used to (before removing custom routing) it resulted in update of record (expected)
I am thinking of following approaches to come out of this, please advise if you have better approach to suggest, key here is to AVOID DOWNTIME.
Approach 1:
As we receive the update request, let duplicate record get created. Once record without custom routing gets created, issue a delete request for a record with custom routing.
CONS: If there is no update on records, then all those records will linger around with custom routing, we want to avoid this as this might results in unforeseen scenario in future.
Approach 2
We use Re-Index API to migrate data to new index (turning off custom routing during migration). Application will use new index after successful migration.
CONS: Some of our Indexes are huge, they take 12 hrs+ for re-index operation, and since elastic search re-index API will not migrate the newer records created between this 12hr window, as it uses snapshot mechanism. This needs a downtime approach.
Please suggest alternative if you have faced this before.
Thanks #Val, also found few other approaches like write to both indexes and read from old. And then shift to read new one after re-indexing is finished. Something on following lines -
Create an aliases pointing to the old indices (*_v1)
Point the application to these aliases instead of actual indices
Create a new indices (*_v2) with the same mapping
Move data from old indices to new using re-indexing and make sure we don't
retain custom routing during this.
Post re-indexing, change the aliases to point to new index instead of old
(need to verify this though, but there are easy alternatives if this
doesn't work)
Once verification is done, delete the old Indices
What do we do in transition period (window between reindexing start to reindexing finish) -
Write to both Indices (old and new) and read from old indices via aliases

Reindex all of ElasticSearch with Curator?

Is there a Recipe out there to Reindex all ElasticSearch Indices with Curator?
I'm seeing that it can Reindex a set of indices into one (Daily to Month use case), however I don't see anything that would suggest it could easily apply a new mapping file to every Elastic Index.
I'm taking a guess I'll need to write a wrapper script around Curator to grab index names and feed them into Curator.
I don't know if I got you right as you mentioned reindexing and mapping changes...
If you want to set/update a mapping in a collection of indices and if you know the indices to update by name (or pattern), you are able to apply the same mapping or a mapping change at once with https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-put-mapping.html#_multi_index_2
For reindexing, there is no way to specify multiple source/target pairs at once but you can split one index into many. But as you sugessted, you can use subsequent calls to the reindex api.
BTW: The reindex api does not copy the settings nor mappings from the source into the destination index. You need to handle it by yourself, maybe using https://www.elastic.co/guide/en/elasticsearch/reference/6.4/indices-templates.html

Updating existing documents in ElasticSearch (ES) while using rollover API

I have a data source which will create a high number of entries that I'm planning to store in ElasticSearch.
The source creates two entries for the same document in ElasticSearch:
the 'init' part which records init-time and other details under a random key in ES
the 'finish' part which contains the main data, and updates the initially created document (merges) in ES under the init's random key.
I will need to use time-based indexes in ElasticSearch, with an alias pointing to the actual index,
using the rollover index.
For updates I'll use the update API to merge init and finish.
Question: If the init document with the random key is not in the current index (but in an older one already rolled over) would updating it using it's key
successfully execute? If not, what is the best practice to perform the update?
After some quietness I've set out to test it.
Short answer: After the index is rolled over under an alias, an update operation using the alias refers to the new index only, so it will create the document in the new index, resulting in two separate documents.
One way of solving it is to perform a search in the last 2 (or more if needed) indexes and figure out which non-alias index name to use for the update.
Other solution which I prefer is to avoid using the rollover, but calculate index name from the required date field of our document, and create new index from the application, using template to define mapping. This way event sourcing and replaying the documents in order will yield the same indexes.

Elasticsearch update index template

I have a question about elasticsearch index template, there is a scene of my question.
Create a template for a series indices, named templateA, and there are some indices create from this template, named Index-yyyy.mm.dd2 and Index-yyyy.mm.dd2. After a period of time, I need create some new fields in indice, and I update the templateA.
SO, How to make the previously created indices use the new template? please give me some suggestion. Thanks a lot!
The template is only used at index creation. You'll have to modify your mapping or recreate your index and reindex your data.
You can use the PUT mapping API to modify your mapping.
At least in ElasticSearch 7.15 you could create or update an index template using the same endpoint, also:
Index templates are applied during data stream or index creation
It is ovious but "old" data needs to be updated in some way.
In case you are using Logstash, just restart it to do the reindex.

Speeding up mapping creation in ElasticSearch

With each index that we create, we need to create mapping for 10 types.
While indexes are created and documents are indexed blazingly fast, bottleneck that we keep hitting is slow mapping creation. In some cases (when we need to create multiple indexes at the same time) it even breaks when ElasticSearch rejects request because mapping was not created in 30 seconds.
Is there any way to speed up mapping creation, or send mappings in bulk?
I think you have to use Index templates, that will allow you to define templates that will automatically be applied to new indices created. The templates include both settings and mappings, and a simple pattern template that controls if the template will be applied to the index created.
More details here.
Regards,
Alain

Resources