Use Elasticsearch Index from newer Version - elasticsearch

Is it possible to use (e.g. reindex) an existing index from a newer Elasticsearch version? I tried to do it via the snapshots API, but that fails with:
the snapshot was created with Elasticsearch version [7.5.0] which is higher than the version of this node [7.4.2]
The reason we need to use the newer index is that we want to experiment with a plugin that is not yet available for the new version, but the experiments must be done on data indexed by the newer version.

The snapshot API won't work here, because a snapshot cannot be restored on a node running an older version than the one that created it.
Instead, keep your index data on a 7.5 instance and use the reindex API on the 7.4.2 instance to reindex from remote. The remote host must be listed in reindex.remote.whitelist in elasticsearch.yml on the 7.4.2 node.
It looks something like this:
POST _reindex
{
"source": {
"remote": {
"host": "http://7-5-remote-host:9200"
},
"index": "source"
},
"dest": {
"index": "dest"
}
}
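If the request needs to be generated from a script, the body above can be built programmatically. This is a minimal sketch in Python using only the standard library; the function name `build_remote_reindex_body` is hypothetical, and the host and index names are the placeholders from the request above, not real endpoints.

```python
import json


def build_remote_reindex_body(remote_host, source_index, dest_index):
    """Build the JSON body for a reindex-from-remote request.

    The body is meant to be POSTed to /_reindex on the *older* (7.4.2) node,
    which then pulls documents from the newer (7.5) remote host.
    """
    return {
        "source": {
            "remote": {"host": remote_host},
            "index": source_index,
        },
        "dest": {"index": dest_index},
    }


# Placeholder host and index names taken from the example request.
body = build_remote_reindex_body("http://7-5-remote-host:9200", "source", "dest")
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted into Kibana Dev Tools after `POST _reindex`, or sent with any HTTP client.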
You can also use a Logstash pipeline to read from your 7.5 instance and index into your 7.4.2 instance.
Something like this:
input {
elasticsearch {
hosts => "http://7-5-instance:9200"
index => "your-index"
}
}
output {
elasticsearch {
hosts => "http://7-4-instance:9200"
index => "your-index"
}
}

Related

Elasticsearch reindex only missing documents

I am trying to reindex an index of 200M documents from cluster A to cluster B. I used the Reindex API with a remote source and everything worked fine. In the meantime, while the reindex was running, some documents were added to cluster A, so I want to add them to cluster B as well.
I launched the reindex request again, but the process is taking a long time, as if it were reindexing everything again.
My question is: is the cluster reindexing all the documents from scratch, even if they didn't change?
My Elasticsearch version is 5.6.
Elasticsearch does not know whether the documents have changed, so it copies every document from the source index to the destination again. If your data has a field like insert_time, you can use reindex with a query to limit which documents from A are reindexed into B. That way you keep the result of your earlier reindex and the second run finishes faster. A reindex with a query looks something like this:
POST _reindex
{
"source": {
"index": "A",
"query": {
"range": {
"insert_time": {
"gt": "time you want"
}
}
}
},
"dest": {
"index": "B"
}
}
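The range-query approach can be combined with two other Reindex API options, `"op_type": "create"` on the destination and `"conflicts": "proceed"`, so that documents already present in B are skipped rather than overwritten. This is a swapped-in variation and not part of the answer above. The sketch below only builds the request body; the function name and the timestamp value are hypothetical, and `insert_time` is assumed to exist in the data as described above.

```python
import json


def build_incremental_reindex_body(source_index, dest_index,
                                   time_field=None, since=None):
    """Build a reindex body that only creates documents missing in dest.

    - "op_type": "create" makes existing documents raise version conflicts
      instead of being overwritten.
    - "conflicts": "proceed" tells reindex to count those conflicts and
      continue rather than abort.
    - The optional range query limits which source documents are scanned.
    """
    body = {
        "conflicts": "proceed",
        "source": {"index": source_index},
        "dest": {"index": dest_index, "op_type": "create"},
    }
    if time_field is not None and since is not None:
        body["source"]["query"] = {"range": {time_field: {"gt": since}}}
    return body


# Hypothetical cutoff: only documents inserted after this instant.
body = build_incremental_reindex_body("A", "B", "insert_time", "2020-01-01T00:00:00Z")
print(json.dumps(body, indent=2))
```

Without the range query the source index is still scanned in full; `op_type: create` only avoids rewriting documents that already exist. Documents that were updated in A after the first run would be skipped as well, so this fits append-only data best. For a source on another cluster, a `remote` block would go inside `source`.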

How to dynamically add index to alias when new index is dynamically added

How to dynamically add an index to alias when index is dynamically created every day? I'm using Logstash to send data to our ElasticSearch engine, version 6.1.1, with the following convention:
elasticsearch {
hosts => "10.01.01.01:9200"
index => "%{[@metadata][beat]}-%{[@metadata][version]}-%{+YYYY.MM.dd}"
}
This dynamically creates a new index per day. I configured the system based on install instructions for this version.
I created an alias to be able to query across all index types (Filebeat/Winlogbeat/etc).
How can I dynamically make all dynamic indexes be added to this alias to avoid having a system administrator perform a daily task to add the index, like: (using Kibana DevTools)
POST /_aliases
{
"actions": [
{ "add": { "index": "winlogbeat-6.1.1-2018.02.16", "alias": "myaliasname"}}
]
}
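The manual daily task described above can at least be scripted. The sketch below builds the index name for a given day, following the Logstash naming convention from the question, and the matching `_aliases` body. The function names are hypothetical, and the script only prints the body; it does not contact a cluster.

```python
import json
from datetime import date


def daily_index_name(beat, version, day):
    """Return the index name Logstash would create for the given day."""
    return f"{beat}-{version}-{day:%Y.%m.%d}"


def build_alias_add_body(index_names, alias):
    """Build an _aliases request body with one "add" action per index."""
    return {
        "actions": [
            {"add": {"index": name, "alias": alias}} for name in index_names
        ]
    }


index = daily_index_name("winlogbeat", "6.1.1", date(2018, 2, 16))
print(json.dumps(build_alias_add_body([index], "myaliasname"), indent=2))
```

To avoid running anything daily, an index template that matches the daily index pattern and contains an `aliases` section attaches the alias when each new index is created. That is an alternative to the approach in the question, not something stated in it.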

Reindex ElasticSearch index returns "Incorrect HTTP method for uri [/_reindex] and method [GET], allowed: [POST]"

I'm trying to upgrade an elasticsearch cluster from 1.x to 6.x. I'm reindexing the remote 1.x indices into the 6.x cluster. According to the docs, this is possible:
To upgrade an Elasticsearch 1.x cluster, you have two options:
Perform a full cluster restart upgrade to Elasticsearch 2.4.x and reindex or delete the 1.x indices. Then, perform a full cluster restart upgrade to 5.6 and reindex or delete the 2.x indices. Finally, perform a rolling upgrade to 6.x. For more information about upgrading from 1.x to 2.4, see Upgrading Elasticsearch in the Elasticsearch 2.4 Reference. For more information about upgrading from 2.4 to 5.6, see Upgrading Elasticsearch in the Elasticsearch 5.6 Reference.
Create a new 6.x cluster and reindex from remote to import indices directly from the 1.x cluster.
I'm doing this locally for test purposes, and using the following command with 6.x running:
curl --request POST localhost:9200/_reindex -d @reindex.json
My reindex.json file looks like this:
{
"source": {
"remote": {
"host": "http://localhost:9200"
},
"index": "some_index_name",
"query": {
"match": {
"test": "data"
}
}
},
"dest": {
"index": "some_index_name"
}
}
However, this returns the following error:
Incorrect HTTP method for uri [/_reindex] and method [GET], allowed: [POST]
Why is it telling me I can't use GET and to use POST instead? I'm clearly specifying a POST request here, but it seems to think it's a GET request. Any idea why it's getting the wrong request type?
I was facing the same issue, but adding settings to the PUT request made it work.
PUT /my_blog
{
"settings" : {
"number_of_shards" : 1
},
"mappings": {
"post": {
"properties": {
"user_id": {
"type": "integer"
},
"post_text": {
"type": "text"
},
"post_date": {
"type": "date"
}
}
}
}
}
You can also refer to the create index documentation: https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-create-index.html
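For the `_reindex` call from the question, one way to rule out the HTTP method being changed along the way is to build the request with the method and `Content-Type` header stated explicitly (Elasticsearch 6.x rejects request bodies sent without a content type). The sketch below uses Python's standard library. It does not diagnose the error above, and the function name `build_reindex_post` is hypothetical. The body mirrors reindex.json from the question.

```python
import json
import urllib.request


def build_reindex_post(es_url, body):
    """Return an explicit POST request for the _reindex endpoint."""
    return urllib.request.Request(
        es_url.rstrip("/") + "/_reindex",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Same content as reindex.json in the question.
reindex_body = {
    "source": {
        "remote": {"host": "http://localhost:9200"},
        "index": "some_index_name",
        "query": {"match": {"test": "data"}},
    },
    "dest": {"index": "some_index_name"},
}

request = build_reindex_post("http://localhost:9200", reindex_body)
print(request.get_method(), request.full_url)
# To send it against a running cluster: urllib.request.urlopen(request)
```

Printing the method and URL before sending makes it easy to confirm what actually goes over the wire.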

Best way to reindex multiple indices in ElasticSearch

I am using Elasticsearch 5.1.1 and have 500 + indices created with default mapping provided by ES.
Now we have decided to use dynamic templates.
In order to apply this template/mapping to old indices I need to reindex all indices.
What is the best way to do it? Can we use Kibana for this? I couldn't find sufficient documentation on how to do so.
Example: Reindex from a daily index to a monthly index (August)
POST _reindex?slices=10&refresh
{
"source": {
"index": "myindex-2019.08.*"
},
"dest": {
"index": "myindex-2019.08"
}
}
Monitor the reindex task (wait until it has finished)
GET _tasks?detailed=true&actions=*reindex
Check that the new index was created
GET _cat/indices/myindex-2019.08*?v&s=index
You can then delete the old indices
DELETE myindex-2019.08.*
Source:
https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-reindex.html
You can use the _reindex API which can also reindex multiple indices. It was specifically built for this.
Bash script to re-index all indices matching a pattern: https://gist.github.com/hartfordfive/e507bc47e17f4e03a89055918900e44d
If you want to reindex only the documents that match a given field value, you can use this:
POST _reindex
{
"source": {
"index": "auditbeat",
"query": {
"match": {
"agent.version": "7.6.0"
}
}
},
"dest": {
"index":"auditbeat-7.6.0"
}
}
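With hundreds of indices, the request bodies are easier to generate than to write by hand. The sketch below groups daily index names by month and builds one `_reindex` body per month, mirroring the daily-to-monthly example above. It assumes index names end in `.dd`, as in `myindex-2019.08.01`, and the function names are hypothetical. The `source.index` field of the Reindex API accepts a list of index names.

```python
import json
from collections import defaultdict


def group_daily_indices_by_month(index_names):
    """Map each monthly index name to the daily indices that belong to it."""
    groups = defaultdict(list)
    for name in index_names:
        # "myindex-2019.08.01" -> monthly name "myindex-2019.08"
        monthly_name, _, _day = name.rpartition(".")
        groups[monthly_name].append(name)
    return dict(groups)


def build_monthly_reindex_bodies(index_names):
    """Build one reindex body per month, listing its daily source indices."""
    groups = group_daily_indices_by_month(index_names)
    return [
        {"source": {"index": sorted(daily)}, "dest": {"index": monthly}}
        for monthly, daily in sorted(groups.items())
    ]


daily = ["myindex-2019.08.02", "myindex-2019.08.01", "myindex-2019.09.01"]
for body in build_monthly_reindex_bodies(daily):
    print(json.dumps(body))
```

Each printed body can be sent as its own `POST _reindex` request, which keeps individual tasks small and easy to monitor.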

No effect of “size” in ‘query’ while reindexing in elasticsearch

I have been using Logstash to migrate one index to another. I recently tried to reindex a limited amount of data from a large dataset in my local environment, using the following configuration:
input{
elasticsearch{
hosts=>"localhost:9200"
index=>"old_index"
query=>'{"query":{"match_all":{}},"size":10 }'
}
}filter{
mutate{
remove_field=>[
"@version",
"@timestamp"
]
}
}output{
elasticsearch{
hosts=>"localhost:9200"
index=>"new_index"
document_type=>"contact"
manage_template=>false
document_id=>"%{contactId}"
}
}
But this reindexes all the documents in old_index into new_index, whereas I was expecting just 10 documents to be reindexed.
Am I missing some concept about using Logstash with Elasticsearch?
The elasticsearch input doesn't run a conventional search; it does a scan/scroll search instead. This means that all matching data will be retrieved from the index, and the size parameter only defines how many documents are fetched during each scroll, not how many are fetched altogether.
Also, note that the size parameter in the query itself has no effect. You need to use the size parameter of the elasticsearch input rather than specifying it in the query.
input{
elasticsearch{
hosts=> "localhost:9200"
index=> "old_index"
query=> '*'
size => 10 # size goes here
}
}
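The difference between page size and total result count can be illustrated with a small pure-Python simulation. This is not Logstash or Elasticsearch code: `scroll_pages` is a hypothetical stand-in for a scroll, and the documents are made-up data.

```python
def scroll_pages(documents, size):
    """Yield successive pages of at most `size` documents, like a scroll."""
    for start in range(0, len(documents), size):
        yield documents[start:start + size]


# 25 made-up documents standing in for the contents of old_index.
documents = [{"contactId": i} for i in range(25)]

pages = list(scroll_pages(documents, size=10))
fetched = [doc for page in pages for doc in page]

# size changes how many pages are needed, not how many documents arrive.
print(len(pages), len(fetched))
```

With `size=10` the 25 documents arrive in three pages; with `size=5` they arrive in five pages, but all 25 are still fetched either way.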
That being said, if you're running ES 2.3 or later, there's a way to achieve what you desire using the Reindex API, like this:
POST /_reindex
{
"size": 10,
"source": {
"index": "old_index"
},
"dest": {
"index": "new_index"
}
}
