Best way to reindex multiple indices in ElasticSearch - elasticsearch

I am using Elasticsearch 5.1.1 and have 500 + indices created with default mapping provided by ES.
Now we have decided to use dynamic templates.
In order to apply this template/mapping to old indices I need to reindex all indices.
What is the best way to do it? Can we use Kibana for this ? Couldn't find sufficient documentation to do so.

Example: Reindex from a daily index to a monthly index (August)
POST _reindex?slices=10&refresh
{
"source": {
"index": "myindex-2019.08.*"
},
"dest": {
"index": "myindex-2019.08"
}
}
Monitor reindex task (wait until is finished)
GET _tasks?detailed=true&actions=*reindex
Check if new index was created
GET _cat/indices/myindex-2019.08*?v&s=index
You can delete old indices
DELETE myindex-2019.08.*
Source:
https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-reindex.html

You can use the _reindex API which can also reindex multiple indices. It was specifically built for this.

Bash script to re-index all indices matching a pattern: https://gist.github.com/hartfordfive/e507bc47e17f4e03a89055918900e44d

If you want to filter some field and reindex it from index you can use this.
POST _reindex
{
"source": {
"index": "auditbeat",
"query": {
"match": {
"agent.version": "7.6.0"
}
}
},
"dest": {
"index":"auditbeat-7.6.0"
}
}

Related

How to reindex and change _type

We need to migrate a number of indexes from ElasticSearch 6.8 to ElasticSearch 7.x. To be able to do this, we now need to go back and fix a large number of documents are the _type field of these documents aren't _doc as required. We fixed this for newer indexes, but some of the older data which we still need has other values in here.
How do we reindex these indexes and also change the _type field?
POST /_reindex
{
"source": {
"index": "my-index-2021-11"
},
"dest": {
"index": "my-index-2021-11-n"
},
"script": {
"source": "ctx._type = '_doc';"
}
}
I saw a post indicating the above might work, but on execution, the value for _type in the next index was still the existing of my-index.
The one option I can think of is to iterate through each document in the index and add it to the new index again which should create the correct _type, but that will take days to complete, so not so keen on doing that.
I think below should work . Please test it out, before running on actual data
{
"source": {
"index": "my-index-2021-11"
},
"dest": {
"index": "my-index-2021-11-n",
"type":"_doc"
}
}
Docs to help in upgradation
https://www.elastic.co/guide/en/elasticsearch/reference/7.17/reindex-upgrade-inplace.html

elasticsearch : How can i tell _reindex api in to continue indexing docs while source index still receiving new docs?

I have daily created indices, these indices are filled by an agent which collects a logs every second of the day, and i'am reindexing them (by field) to new indices using _reindex api.
How can i tell _reindex api to still reindixing while the source index still receiving new documents ?
Any help woould be really appriciated!
Thank you
you cannot force reindex API to be online to reindex new received documents.
but I have solution. you can add a date field (index_time) to your source index. write an hourly cron job to run reindex API with a query to get last hour indexed docs via index_time.
POST _reindex
{
"source": {
"index": "my-index-000001",
"query": {
"filter" :{
"query": {
"range": {
"index_time": {"gte" : "now-1h"}
}
}
}
}
},
"dest": {
"index": "my-new-index-000001"
}
}

Elasticsearch reindex only missing documents

I am trying to reindex an index of 200M of documents from cluster A to cluster B. I used the Reindex API with a remote source and everything worked fine. In the menwhile of my reindex some documents were added into the cluster A so I want to add them as well into the cluster B.
I launched again the reindex request but it seems that the reindex process is taking a lot, like if it was reindexing everything again.
My question is, is the cluster reindexing from scratch all the documents, even if they didn't change ?
My elasticsearch version is the 5.6
The elasticsearch does not know there is a change in the documents or not. So it tries to have each document completely in both indices. If you have a field like insert_time in your data, you can use reindex with query to limit the part of index of A to become reindex on B. This will let you use your older reindex and finish it faster. Reindex by query would be something like this:
POST _reindex
{
"source": {
"index": "A",
"query": {
"range": {
"insert_time": {
"gt": "time you want"
}
}
},
"dest": {
"index": "B"
}
}

How can I batch reindex several elastic indexes?

For example, I have quite large numbers of indexes named like:
logstash-oidmsgcn_2016.12.01
logstash-oidmsgcn_2016.12.02
logstash-oidmsgcn_2016.12.03
...
logstash-oidmsgcn_2017.02.21
need to be indexed to names:
bk-logstash-oidmsgcn_2016.12.01
bk-logstash-oidmsgcn_2016.12.02
bk-logstash-oidmsgcn_2016.12.03
...
bk-logstash-oidmsgcn_2017.02.21
so, I only need to give their names a prefix in a batch way.
what can I do to get this job done?
I have referenced to reindex api and bulk api, but I still cannot get the hang of its way.
You can only do this be reindexing all your indices. If you are open to do this, you can do it with the reindex API like this:
POST _reindex
{
"source": {
"index": "logstash-oidmsgcn_*"
},
"dest": {
"index": "bk-logstash-oidmsgcn"
},
"script": {
"inline": "ctx._index = 'bk-logstash-oidmsgcn_' + (ctx._index.substring('logstash-oidmsgcn_'.length(), ctx._index.length()))"
}
}
Note that you need to enable dynamic scripting in order for this to work.

How to set existing elastic search mapping from index: no to index: analyzed

I am new to elastic search, I want to updated the existing mapping under my index. My existing mapping looks like
"load":{
"mappings": {
"load": {
"properties":{
"customerReferenceNumbers": {
"type": "string",
"index": "no"
}
}
}
}
}
I would like to update this field from my mapping to be analyzed, so that my 'customerReferenceNumber' field will be available for search.
I am trying to run the following query in Sense plugin to do so,
PUT /load/load/_mapping { "load": {
"properties": {
"customerReferenceNumbers": {
"type": "string",
"index": "analyzed"
}
}
}}
but I am getting following error with this command,
MergeMappingException[Merge failed with failures {[mapper customerReferenceNumbers] has different index values]
Though there exist data associated with these mappings, here I am unable to understand why elastic search not allowing me to update mapping from no-index to indexed?
Thanks in advance!!
ElasticSearch doesn't allow this kind of change.
And even if it was possible, as you will have to reindex your data for your new mapping to be used, it is faster for you to create a new index with the new mapping, and reindex your data into it.
If you can't afford any downtime, take a look at the alias feature which is designed for these use cases.
This is by design. You cannot change the mapping of an existing field in this way. Read more about this at https://www.elastic.co/blog/changing-mapping-with-zero-downtime and https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-put-mapping.html.

Resources