AWS Elasticsearch: How to update analyzer settings for existing index - elasticsearch

I'm trying to update the settings of an existing index with a custom analyzer, but Elasticsearch doesn't allow this on an open index, and AWS won't let me close the index.
Any thoughts?

AWS Elasticsearch only supports a subset of operations, and indeed _close is not supported. You can find the list of supported operations here.
Since you are updating an analyzer you will probably have to reindex your documents anyway, so you can create a new index with the correct settings and mappings and use the _reindex endpoint to copy the documents over.
If you are not already doing so, I would advise you to use an index alias to handle the index migration.
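Concretely, the migration could look roughly like this (the index, alias, and analyzer names are placeholders); the application keeps talking to my_alias while the indices are swapped underneath it:

PUT my_index_v2
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase"]
        }
      }
    }
  }
}

POST _reindex
{
  "source": { "index": "my_index" },
  "dest": { "index": "my_index_v2" }
}

POST _aliases
{
  "actions": [
    { "remove": { "index": "my_index", "alias": "my_alias" } },
    { "add": { "index": "my_index_v2", "alias": "my_alias" } }
  ]
}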

Related

How to manage Elastic mapping changes with ILM rollovers?

We have some Elastic rollover indices managed by ILM. Mappings are defined in templates.
When we try to add mapping fields, two concepts collide:
The obvious way is to (1) update the template and (2) provoke a rollover so the new index gets the new mapping. BUT manually rolling over an ILM-managed index breaks things: _ilm/explain will tell you index [blah] is not the write index for alias [blah-000004].
So... what do people do?
Directly updating the write index mapping might work, but it feels like there should be a better option?
This is Elastic 6.8.
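For what it's worth, updating the write index directly on 6.8 would look something like the following (index name, mapping type, and field are hypothetical); you'd still need to update the template as well so future rollover indices pick up the change:

PUT blah-000004/_mapping/_doc
{
  "properties": {
    "new_field": { "type": "keyword" }
  }
}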

Elasticsearch: when inserting a record to index I don't want to create an index mapping

Elasticsearch's default behavior when inserting a document into an index is to create the index (and its mapping) if it doesn't exist.
I know that I can change this behavior at the cluster level using this call:
PUT _cluster/settings
{
  "persistent": {
    "action.auto_create_index": "false"
  }
}
but I can't control the customer's Elasticsearch.
What I'm asking is: is there a parameter I can send with the index-document request that tells Elasticsearch not to create the index if it doesn't exist, but to fail instead?
If you can't change the cluster settings or the settings in elasticsearch.yml, I'm afraid it's not possible, since there are no special parameters for this on document POST/PUT requests.
Another possible solution would be to add an API layer that checks whether the index exists and avoids calling Elasticsearch entirely when it doesn't.
There is an issue on GitHub proposing to set action.auto_create_index to false by default, but unfortunately I couldn't see any progress on it.
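As a sketch of that API-layer idea (the index name is hypothetical): check whether the index exists with a HEAD request first, and only send the document when it does:

HEAD my_index            (returns 200 if the index exists, 404 if it doesn't)

PUT my_index/_doc/1
{
  "field": "value"
}

The check-then-write pattern has an obvious race if something else can create the index in between, but it is enough to stop the application itself from auto-creating indices.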

Reindex all of ElasticSearch with Curator?

Is there a recipe out there to reindex all Elasticsearch indices with Curator?
I'm seeing that it can reindex a set of indices into one (the daily-to-monthly use case), but I don't see anything suggesting it could easily apply a new mapping file to every index.
My guess is that I'll need to write a wrapper script around Curator to grab the index names and feed them into Curator.
I'm not sure I understood you correctly, since you mentioned both reindexing and mapping changes...
If you want to set or update a mapping across a collection of indices, and you know the indices to update by name (or pattern), you can apply the same mapping change to all of them at once with the multi-index Put Mapping API: https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-put-mapping.html#_multi_index_2
For reindexing, there is no way to specify multiple source/target pairs at once (although you can split one index into many), but as you suggested, you can make subsequent calls to the Reindex API.
BTW: the Reindex API copies neither the settings nor the mappings from the source into the destination index. You need to handle that yourself, perhaps using index templates: https://www.elastic.co/guide/en/elasticsearch/reference/6.4/indices-templates.html
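For example, adding the same field to every index matching a pattern in one call, using the 6.x syntax with a mapping type (index pattern, type, and field names are placeholders):

PUT logs-*/_mapping/_doc
{
  "properties": {
    "user_id": { "type": "keyword" }
  }
}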

Specifying data type and analyzer while creating index

I am using Elasticsearch to index my data, and I was able to do it with the following POST request:
http://localhost:9200/index/type/id
{
    JSON data over here
}
Yesterday, while going through some Elasticsearch tutorials, I found someone mentioning setting an analyzer on the fields where you plan to do full-text search. While googling, I found that the mapping API can be used to update data types and analyzers, but in my case I want to do it while creating the index.
How can I do it?
You can create the index with custom settings (and mappings) in a first request and then index your data with a second request; you cannot do both in the same request.
However, if you index your data first and the index does not exist yet, it will be created automatically with default settings. You can then update your mappings.
Source: Index
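For example, a single create-index request can carry both the analyzer settings and a mapping that uses it (the analyzer and field names here are just placeholders; "index" and "type" match the URL in your question):

PUT /index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase"]
        }
      }
    }
  },
  "mappings": {
    "type": {
      "properties": {
        "title": {
          "type": "text",
          "analyzer": "my_analyzer"
        }
      }
    }
  }
}

After this, documents indexed via POST /index/type/id will have their title field analyzed with my_analyzer.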

Does ElasticSearch Snapshot/Restore functionality cause the data to be analyzed again during restore?

I have a decent amount of data in my Elasticsearch index. I changed the default analyzer for the index, so essentially I need to reindex my data so that it is analyzed again with the new analyzer. Instead of writing a test script that deletes all of the existing data in the index and re-adds it, I wondered whether there is a backup/restore module I could use. I found the snapshot/restore module that ES supports - ElasticSearch-SnapshotAndRestore.
My question is: if I use the above snapshot/restore module, will it actually cause the data to be re-analyzed? Since I changed the default analyzer, I need the data to be re-analyzed. If not, is there an alternative tool/module you would suggest that allows a pure export and import of the data, so that it is re-analyzed during import?
No, it does not re-analyze the data. You will need to reindex your data.
Fortunately that's fairly straightforward with Elasticsearch, as by default it stores the source of your documents:
Reindexing your data
While you can add new types to an index, or add new fields to a type,
you can’t add new analyzers or make changes to existing fields. If you
were to do so, the data that has already been indexed would be
incorrect and your searches would no longer work as expected.
The simplest way to apply these changes to your existing data is just
to reindex: create a new index with the new settings and copy all of
your documents from the old index to the new index.
One of the advantages of the _source field is that you already have
the whole document available to you in Elasticsearch itself. You don’t
have to rebuild your index from the database, which is usually much
slower.
To reindex all of the documents from the old index efficiently, use
scan & scroll to retrieve batches of documents from the old index, and
the bulk API to push them into the new index.
http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/reindex.html
I'd read up on Scan and Scroll prior to taking this approach:
http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/scan-scroll.html
TaskRabbit has open-sourced an import/export tool; I haven't used it, so I can't vouch for it, but it's worth a look:
https://github.com/taskrabbit/elasticsearch-dump
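As a side note, on Elasticsearch 2.3 and later the scan & scroll plus bulk combination described in the quote above is wrapped up for you by the Reindex API, so after creating the new index with the new analyzer the copy becomes a single call (index names hypothetical):

POST _reindex
{
  "source": { "index": "old_index" },
  "dest": { "index": "new_index" }
}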
