Restore elasticsearch cluster onto another cluster - elasticsearch

Hello, I have a 3-node Elasticsearch cluster (source) and a snapshot called
snapshot-1 that was taken from the source cluster.
I also have another 6-node Elasticsearch cluster (destination).
When I restore the destination cluster from snapshot-1 using this command
curl -X POST -u elastic:321 "192.168.2.15:9200/_snapshot/snapshot_repository/snapshot-1/_restore?pretty" -H 'Content-Type: application/json' -d'
{
"indices": "*",
"ignore_unavailable": true,
"include_global_state": false,
"rename_pattern": ".security(.+)",
"rename_replacement": "delete_$1",
"include_aliases": false
}
'
I got this error:
{
"error" : {
"root_cause" : [
{
"type" : "snapshot_restore_exception",
"reason" : "[snapshot:snapshot-1 yjg/mHsYhycHQsKiEhWVhBywxQ] cannot restore index [.ilm-history-0003] because an open index with same name already exists in the cluster. Either close or delete the existing index or restore the index under a different name by providing a rename pattern and replacement name"
}
So, as you can see, the index .ilm-history-0003 already exists in the cluster. How can I rename the .security, .ilm, .slm and .transform indices using only one rename_pattern?
For example, like this one:
"rename_pattern": ".security(.+)",

In my experience, the rename pattern doesn't need to be fancy, because you will probably either
a) delete the index (as your rename pattern suggests), or
b) reindex data from the restored index into new indices. In this case the name of the restored index is insignificant.
So this is what I would suggest:
Use the following rename pattern to include all indices. Again, in my experience, your first aim is to get the old data restored. After that you can take care of the reindexing etc.
POST /_snapshot/REPOSITORY_NAME/SNAPSHOT_NAME/_restore
{
"indices": "*",
"ignore_unavailable": true,
"include_aliases": false,
"include_global_state": false,
"rename_pattern": "(.+)",
"rename_replacement": "restored_$1"
}
This will prepend restored_ to each original index name, resulting in the following restored indices:
restored_.security*
restored_.ilm*
restored_.slm*
restored_.transform*
I hope I could help you.
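To see what a given rename_pattern / rename_replacement pair would do before running a restore, here is a minimal sketch in Python. It assumes the rename behaves like a regex "replace all" with $1-style group references; Python's re module is used as a stand-in for the Java regex engine, and the helper name rename_index and the sample index names are made up for illustration. The second pattern is one possible way to cover .security, .ilm, .slm and .transform with a single rename_pattern.

```python
import re

def rename_index(name, rename_pattern, rename_replacement):
    """Approximate how a restore would rename one index (hypothetical helper)."""
    # Convert Java-style group references ($1, $2, ...) to Python's \g<1> form.
    py_replacement = re.sub(r"\$(\d+)", r"\\g<\1>", rename_replacement)
    return re.sub(rename_pattern, py_replacement, name)

# Catch-all pattern from the answer above: every index gets the prefix.
print(rename_index(".ilm-history-0003", "(.+)", "restored_$1"))

# One pattern for the four system prefixes only; other indices keep their name.
system_pattern = r"\.(security|ilm|slm|transform)(.*)"
for index in [".security-7", ".slm-history-1", "orders"]:
    print(index, "->", rename_index(index, system_pattern, "restored_.$1$2"))
```

Note that in a JSON request body the backslash in the pattern has to be escaped, i.e. "\\.(security|ilm|slm|transform)(.*)".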

I solved it this way, by excluding the system indices from the restore:
curl -X POST -u elastic:321 "192.168.2.15:9200/_snapshot/snapshot_repository/snapshot-1/_restore?pretty" -H 'Content-Type: application/json' -d'
{
"indices": "*,-.slm*,-.ilm*,-.transform*,-.security*",
"ignore_unavailable": true,
"include_global_state": false,
"include_aliases": false
}
'
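For anyone unsure how an "indices" expression with exclusions is resolved, here is a rough Python sketch. It assumes the parts are applied left to right and that a leading - removes matching names from what has been selected so far; fnmatch is only an approximation of Elasticsearch's wildcard handling, and select_indices and the sample index names are hypothetical.

```python
from fnmatch import fnmatchcase

def select_indices(expression, available):
    """Resolve a comma-separated include/exclude expression (illustrative only)."""
    selected = []
    for part in expression.split(","):
        if part.startswith("-"):
            # Exclusion: drop everything selected so far that matches the pattern.
            selected = [n for n in selected if not fnmatchcase(n, part[1:])]
        else:
            # Inclusion: add matching names that are not selected yet.
            selected += [n for n in available
                         if fnmatchcase(n, part) and n not in selected]
    return selected

available = [".security-7", ".ilm-history-0003", ".slm-history-1",
             ".transform-internal-005", "orders", "logs-2021"]
print(select_indices("*,-.slm*,-.ilm*,-.transform*,-.security*", available))
```

A typo in one of the exclusion patterns (a missing dot, for example) simply matches nothing, so the index it was meant to exclude is still restored.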

Related

Getting error index.max_inner_result_window during rolling upgrade of ES from 5.6.10 to 6.8.10

I have 2 data nodes and 3 master nodes in an ES cluster. I am doing a rolling upgrade from 5.6.10 to 6.8.10, as ES recommends.
Since there should be zero downtime, I was testing the upgrade and ran into one error.
I upgraded the first data node and did basic search testing; it worked fine. After I upgraded the second data node, search broke with the error below.
java.lang.IllegalArgumentException: Top hits result window is too large, the top hits aggregator [top]'s from + size must be less than or equal to: [100] but was [999]. This limit can be set by changing the [index.max_inner_result_window] index level setting.
The index.max_inner_result_window setting was introduced in 6.x, and the master nodes are still on 5.6.10. So what is the solution with zero downtime?
Note: Indexing is stopped completely. My 2 data nodes are now on 6.8.10 and the master nodes are on 5.6.
Thanks
1 - Change the setting on the existing indices:
curl -X PUT "http://localhost:9200/_all/_settings?pretty" -H 'Content-Type: application/json' -d'
{
"index.max_inner_result_window": "2147483647"
}
'
2 - Create a template for future indices:
curl -X PUT "http://localhost:9200/_index_template/template_max_inner_result?pretty" -H 'Content-Type: application/json' -d'
{
"index_patterns": ["*"],
"template": {
"settings": {
"index":{
"max_inner_result_window": 2147483647
}
}
}
}
'
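The rule behind the error message can be sketched in a few lines of Python. This only mirrors the condition quoted in the exception text (from + size must not exceed index.max_inner_result_window, which was 100 in the error above); check_top_hits_window is a made-up helper, not part of any Elasticsearch client.

```python
def check_top_hits_window(from_, size, max_inner_result_window=100):
    """Raise if a top_hits request would exceed the inner result window."""
    requested = from_ + size
    if requested > max_inner_result_window:
        raise ValueError(
            "from + size must be less than or equal to: "
            f"[{max_inner_result_window}] but was [{requested}]"
        )
    return requested

# Passes once the limit has been raised as in step 1.
check_top_hits_window(0, 999, max_inner_result_window=2147483647)

# Fails with the limit of 100 from the error message.
try:
    check_top_hits_window(0, 999)
except ValueError as err:
    print(err)
```

If the application never needs 999 inner hits, lowering the size in the top_hits aggregation is an alternative to raising the limit.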

Indices with ! in their name can't be filtered when restoring

I have an ES cluster with index names like web.analytics.data.api!monthly!2018-07_v0 and I take regular snapshots/backups.
Now, when I want to restore all of them, everything works well. If I want to restore just one specific index, however, ES won't do it. The command I use:
curl -X POST "localhost:9200/_snapshot/s3_backups/20191218_060001/_restore?pretty&wait_for_completion=true" -H 'Content-Type: application/json' -d'
{
"indices": "web.analytics.data.api!monthly!2018-07_v0",
"index_settings": {
"index.number_of_replicas": 0
}
}
'
The result I get is:
{
"snapshot" : {
"snapshot" : "20191218_060001",
"indices" : [ ],
"shards" : {
"total" : 0,
"failed" : 0,
"successful" : 0
}
}
}
Please note that if I use an index without ! in its name (e.g. .kibana), it works well. Any ideas how I can solve this? Preferably without telling the developers to rename the indices. The ES in question is version 1.7.3. I am aware it is EOL, but it is what I have to work with right now.
So it was my mistake in the end. The index name I was given did not exist (it contained a typo), but I had been told that ! is problematic, so I did not double-check it. The test indices were picked by me, so of course they were correct...
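Since an empty "indices" list in the restore response turned out to mean "no such index in the snapshot", a quick check of the requested names against the snapshot's index list can catch typos early. Below is a small Python sketch; check_requested_indices is a hypothetical helper, the sample names are made up, it expects exact names (no wildcards), and the list of snapshot indices is assumed to come from the snapshot info (GET _snapshot/<repository>/<snapshot>).

```python
from difflib import get_close_matches

def check_requested_indices(requested, snapshot_indices):
    """Map each requested index that is missing from the snapshot to near matches."""
    report = {}
    for name in requested:
        if name not in snapshot_indices:
            # Suggest similarly spelled index names that do exist in the snapshot.
            report[name] = get_close_matches(name, snapshot_indices, n=3, cutoff=0.6)
    return report

snapshot_indices = [".kibana", "web.analytics.data.api!monthly!2018-08_v0"]
requested = ["web.analytics.data.api!monthly!2018-07_v0", ".kibana"]
print(check_requested_indices(requested, snapshot_indices))
```

An empty result means every requested name exists in the snapshot, so the ! character can be ruled out as the cause.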

Backup and restore some records of an elasticsearch index

I wish to take a backup of some records (e.g. only the latest 1 million records) of an Elasticsearch index and restore this backup on a different machine. It would be better if this could be done using built-in Elasticsearch features.
I've tried Elasticsearch snapshot and restore (code below), but it looks like it backs up the whole index rather than selected records.
curl -H 'Content-Type: application/json' -X PUT "localhost:9200/_snapshot/es_data_dump?pretty=true" -d '
{
"type": "fs",
"settings": {
"compress" : true,
"location": "es_data_dump"
}
}'
curl -H 'Content-Type: application/json' -X PUT "localhost:9200/_snapshot/es_data_dump/snapshot1?wait_for_completion=true&pretty=true" -d '
{
"indices" : "index_name",
"type": "fs",
"settings": {
"compress" : true,
"location": "es_data_dump"
}
}'
The format of backup could be anything, as long as it can be successfully restored on a different machine.
You can use the _reindex API. It accepts any query, so after reindexing you have a new index that contains only the requested records and can serve as the backup. You can then copy it wherever you want.
Complete information is here: https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-reindex.html
In the end, I fetched the required data using the Python client, because that was the easiest approach for this use case.
For that, I ran an Elasticsearch query, stored its response in a file in newline-separated format, and later restored the data from it using another Python script. At most 10000 entries are returned per request, along with a scroll ID that is used to fetch the next 10000 entries, and so on.
from elasticsearch import Elasticsearch

es = Elasticsearch(timeout=30, max_retries=10, retry_on_timeout=True)
# _query is the query DSL dict that selects the records to export
page = es.search(index=['ct_analytics'], body={'size': 10000, 'query': _query, 'stored_fields': '*'}, scroll='5m')
while len(page['hits']['hits']) > 0:
    es_data = page['hits']['hits']  # Store this as you like
    scrollId = page['_scroll_id']
    page = es.scroll(scroll_id=scrollId, scroll='5m')
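The "store in a newline-separated file, restore later" part could look roughly like the sketch below. The file layout (one JSON object per line with _id and _source) and the helper names dump_hits and load_actions are assumptions, not the exact scripts used above. The actions produced by load_actions have the shape that elasticsearch.helpers.bulk accepts, so on the target machine they could be passed to it; that call is left out here so the sketch runs without a cluster.

```python
import io
import json

def dump_hits(hits, fp):
    """Append one JSON document per line for each hit of a scroll page."""
    for hit in hits:
        record = {"_id": hit["_id"], "_source": hit.get("_source", {})}
        fp.write(json.dumps(record) + "\n")

def load_actions(fp, target_index):
    """Yield bulk-style actions from a file written by dump_hits."""
    for line in fp:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        yield {"_index": target_index, "_id": record["_id"], "_source": record["_source"]}

# Round trip with an in-memory buffer standing in for the dump file.
sample_page = [{"_id": "1", "_source": {"user": "a"}},
               {"_id": "2", "_source": {"user": "b"}}]
buffer = io.StringIO()
dump_hits(sample_page, buffer)
buffer.seek(0)
for action in load_actions(buffer, "ct_analytics_restored"):
    print(action)
```

In the scroll loop, dump_hits would be called once per page with page['hits']['hits'], writing to a file opened in append mode.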

Autocompletion elasticsearch

I'm following the tutorial for Elasticsearch's completion suggester. It's pretty easy to get going, but I'm unable to get completions for more than one word. In the example, single incomplete words give great results, e.g.
"Nir" -> "options":[{"text":"Nevermind Nirvana..."
"Nev" -> "options":[{"text":"Nevermind Nirvana..."
But the following fail:
"Nirvana Nev" -> Nothing!
"Nevermind Nir" -> Nothing!
I can get it to work by populating combinatorial options e.g
curl -X PUT "localhost:9200/music/_doc/1?refresh" -H 'Content-Type: application/json' -d'
{
"suggest" : {
"input": [ "Nevermind", "Nirvana", "Nirvana Nevermind", "Nevermind Nirvana" ],
"weight" : 34
},
"title" : "Nevermind by Nirvana"
}
'
But this approach quickly leads to a massive number of text variants in the input.
Is there a better way?
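To put a number on how fast the combinatorial inputs grow, here is a short Python sketch that generates every ordered combination of the words, which is what the "input" array above does by hand for two words. The function name combinatorial_inputs and the four-word example are just for illustration.

```python
from itertools import permutations

def combinatorial_inputs(words):
    """Return every ordered selection of 1..n words, joined by spaces."""
    inputs = []
    for k in range(1, len(words) + 1):
        for combo in permutations(words, k):
            inputs.append(" ".join(combo))
    return inputs

print(combinatorial_inputs(["Nevermind", "Nirvana"]))
print(len(combinatorial_inputs(["Smells", "Like", "Teen", "Spirit"])))
```

Two words give the 4 inputs from the example document, four words already give 64, and the count keeps growing factorially from there.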

Laravel Scout with Elastic search not working

I tried using Elasticsearch with Laravel Scout, with these packages:
"laravel/scout": "^1.0",
"tamayo/laravel-scout-elastic": "^1.0"
I ran an Elasticsearch server on localhost:9200, created the index and set the necessary config,
added the Searchable trait to the model,
and imported data into the index like this:
php artisan scout:import "App\story"
Imported [App\story] models up to ID: 4
All [App\story] records have been imported.
But when I do a search it returns an empty array
story::search("the")->get()
=> Illuminate\Database\Eloquent\Collection {#754
all: [],
}
When I query the index with curl, it also shows no hits:
// http://localhost:9200/author/_search?pretty=true&q=*:*
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 0,
"max_score": null,
"hits": [
]
}
}
When I add a record while the index does not exist in ES, the model throws an "index not found" error. But after creating the index and importing the data, it still seems empty. Did I miss anything?
The exact same setup works fine with Algolia.
Set QUEUE=sync, or turn off queueing in config/scout.php.
Had the same issue: https://github.com/ErickTamayo/laravel-scout-elastic/issues/43
I had the same issue.
Delete your index in Elasticsearch and run:
php artisan scout:import App\\story
Let scout create it.
If you are using Elasticsearch, check the error log. In my case it contained the high disk watermark message shown below:
sudo su
tail -f /var/log/elasticsearch/elasticsearch.log
high disk watermark [90%] exceeded on [minbqqKpRV-umA0DPxkuww][mihai-MS-7A72][/var/lib/elasticsearch/nodes/0] free: 28.9gb[6.3%], shards will be relocated away from this node; currently relocating away shards totalling [0] bytes; the node is expected to continue to exceed the high disk watermark when these relocations are complete
If that's the case, remove the read-only block from the indices, at least for testing:
curl -XPUT -H "Content-Type: application/json" \
http://localhost:9200/_all/_settings \
-d '{"index.blocks.read_only_allow_delete": false}'
Or make those changes permanent in elasticsearch.yml and restart the service.
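To read the log line above: "free: 28.9gb[6.3%]" means about 93.7% of the disk is used, which is above the 90% high watermark. A tiny Python sketch of that comparison follows; exceeds_watermark is a made-up helper, and the 95% figure used in the second call is, to my knowledge, the default flood-stage watermark at which Elasticsearch marks indices read-only.

```python
def exceeds_watermark(free_percent, watermark_percent=90.0):
    """Return True if disk usage is above the given watermark percentage."""
    used_percent = 100.0 - free_percent
    return used_percent > watermark_percent

# Values from the log line: 6.3% free on the data path.
print(exceeds_watermark(6.3))        # compared against the 90% high watermark
print(exceeds_watermark(6.3, 95.0))  # compared against a 95% flood-stage watermark
```

Freeing disk space is the lasting fix; clearing the block only helps until the watermark is hit again.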

Resources