Do you need to delete Elasticsearch aliases?

Can't seem to find a simple yes or no answer to this question.
When you have an index with one or more aliases can you just delete the index without any negative side effects? Will deleting the index also delete the aliases? Should you remove all aliases first before deleting an index?
What is considered best practice?

A simple test provides the answer.
First create an index:
PUT my_index
Then create an alias:
POST _aliases
{
"actions": [
{
"add": {
"index": "my_index",
"alias": "alias1"
}
}
]
}
Verify the alias exists:
GET _aliases # should list alias1 under my_index
GET alias1 # should return the settings and mappings of my_index
Delete the index:
DELETE my_index
Check that the alias is gone too:
GET _aliases # should be empty
GET alias1 # should return "no such index"
To sum up: no, you don't need to delete aliases before or after deleting an index. Deleting the index also removes its aliases, so no orphaned alias is left behind.

Related

When changing the index of an Elasticsearch alias via API, do write operations immediately point to the right index after the response?

Let's say I got the alias car-alias pointing to car-index-1. I now want car-alias to point to car-index-2.
I therefore perform the following POST request to the Aliases API:
{
"actions": [
{
"remove": {
"index": "car-index-1",
"alias": "car-alias"
}
},
{
"add": {
"index": "car-index-2",
"alias": "car-alias"
}
}
]
}
I receive the following response:
{
"acknowledged": true
}
Can I now immediately index data into the car-alias and it ends up in the car-index-2?
Or does the "acknowledged": true response not guarantee that write operations point to the right index immediately?
Yes, the alias is changed atomically and will point to car-index-2 immediately when the call returns.
As stated in the documentation "...during the swap, the alias has no downtime and never points to both streams at the same time."
In addition to @Val's answer:
Since in your case, the alias only points to one index, the "next" index is automatically set as the write index.
From the docs regarding the is_write_index option of the add action:
is_write_index (Optional, Boolean)
If true, sets the write index or data stream for the alias.
If an alias points to multiple indices or data streams and
is_write_index isn’t set, the alias rejects write requests. If an
index alias points to one index and is_write_index isn’t set, the
index automatically acts as the write index. [...]
Only the add action supports this parameter.
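To make the swap concrete, here is a minimal Python sketch that only builds the request body for the Aliases API; it does not send anything. The function name build_swap_actions is made up for illustration, and passing is_write_index is optional, as described in the quoted docs.

```python
def build_swap_actions(alias, old_index, new_index, is_write_index=None):
    """Build a body for POST /_aliases that moves `alias` from old_index to new_index.

    Both actions travel in one request, which is what makes the swap atomic.
    """
    add = {"index": new_index, "alias": alias}
    if is_write_index is not None:
        # Per the quoted docs, only the add action supports is_write_index.
        add["is_write_index"] = is_write_index
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": add},
        ]
    }


body = build_swap_actions("car-alias", "car-index-1", "car-index-2")
```

The resulting dict can be serialized with json.dumps and sent with whatever HTTP client you already use.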

Rebuild index with zero downtime

Currently working on something and needed some help. I will have an Elasticsearch index populated from a SQL database. There will be an initial full reindex from the SQL database, then there will be a nightly job which will update / delete / insert changes.
In the event of a major failure I may need to do a full reindex. Ideally I want zero downtime. I did find some articles about creating aliases etc.; however, these seem to be more about updates to field mappings. My situation is a full reindex of the data from my source db. Can I just get that data and push the docs to Elastic, and Elastic will just update the existing index as the ids will be the same? Or do I need to do something else?
Regards
Ismail
For zero downtime you can create a new index, populate it from your database, and use the alias to switch from the old index to the new one. Steps:
Call your main index something like main_index_1 (or whatever you like)
Create an alias for that index called main_index
curl -XPUT 'localhost:9200/main_index_1/_alias/main_index?pretty'
Set up your application to point to this alias
Create a new index called main_index_2 and index it from your database
Switch the alias to point to the new index
curl -XPOST 'localhost:9200/_aliases?pretty' -H 'Content-Type: application/json' -d '
{
"actions": [
{ "remove": { "index": "main_index_1", "alias": "main_index" }},
{ "add": { "index": "main_index_2", "alias": "main_index" }}
]
}'
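If you rebuild regularly, it can help to derive the next index name instead of hard-coding it. The sketch below assumes the naming scheme from the steps above (alias name, underscore, version number); the helper name next_versioned_index is hypothetical.

```python
import re


def next_versioned_index(current_index):
    """Return (alias, next_index) for names shaped like '<alias>_<number>'."""
    match = re.fullmatch(r"(.+)_(\d+)", current_index)
    if match is None:
        raise ValueError(f"{current_index!r} does not end in _<number>")
    alias, version = match.group(1), int(match.group(2))
    return alias, f"{alias}_{version + 1}"


alias, new_index = next_versioned_index("main_index_1")
```

You would then create new_index, fill it from the database, and switch the alias to it with the _aliases call shown above.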

ElasticSearch - How to make a 1-to-1 copy of an existing index

I'm using Elasticsearch 2.3.3 and trying to make an exact copy of an existing index. (using the reindex plugin bundled with Elasticsearch installation)
The problem is that the data is copied but settings such as the mapping and the analyzer are left out.
What is the best way to make an exact copy of an existing index, including all of its settings?
My main goal is to create a copy, change the copy and only if all went well switch an alias to the copy. (Zero downtime backup and restore)
In my opinion, the best way to achieve this would be to leverage index templates. Index templates allow you to store a specification of your index, including settings (hence analyzers) and mappings. Then whenever you create a new index which matches your template, ES will create the index for you using the settings and mappings present in the template.
So, first create an index template called index_template with the template pattern myindex-*:
PUT /_template/index_template
{
"template": "myindex-*",
"settings": {
... your settings ...
},
"mappings": {
"type1": {
"properties": {
... your mapping ...
}
}
}
}
What will happen next is that whenever you want to index a new document in any index whose name matches myindex-*, ES will use this template (+settings and mappings) to create the new index.
So say your current index is called myindex-1 and you want to reindex it into a new index called myindex-2. You'd send a reindex query like this one
POST /_reindex
{
"source": {
"index": "myindex-1"
},
"dest": {
"index": "myindex-2"
}
}
myindex-2 doesn't exist yet, but it will be created in the process using the settings and mappings of index_template because the name myindex-2 matches the myindex-* pattern.
Simple as that.
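To check up front whether a new index name would pick up the template, the pattern match can be approximated locally. This is a rough sketch that assumes "*" is the only wildcard in the pattern; the function name template_matches is made up and this is not how Elasticsearch implements it internally.

```python
import re


def template_matches(pattern, index_name):
    """Return True if index_name matches a template pattern where '*' is a wildcard."""
    # Escape everything literally, then let each '*' match any run of characters.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, index_name) is not None


applies = template_matches("myindex-*", "myindex-2")
```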
The following seems to achieve exactly what I wanted:
Using Snapshot And Restore I was able to restore to a different index:
POST /_snapshot/index_backup/snapshot_1/_restore
{
"indices": "original_index",
"ignore_unavailable": true,
"include_global_state": false,
"rename_pattern": "original_index",
"rename_replacement": "replica_index"
}
As far as I can currently tell, it has accomplished exactly what I needed.
A 1-to-1 copy of my original index.
I also suspect this operation has better performance than re-indexing for my purposes.
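As far as I understand, rename_pattern is a regular expression and rename_replacement may refer to capture groups as $1, $2 and so on. The Python sketch below only simulates that renaming locally so you can preview the restored name; it is an approximation (Java and Python regex dialects differ in places), and restored_name is a hypothetical helper.

```python
import re


def restored_name(index_name, rename_pattern, rename_replacement):
    """Preview the index name a restore would produce for the given rename options."""
    # Translate Java-style group references ($1) into Python's \g<1> form.
    py_replacement = re.sub(r"\$(\d+)", r"\\g<\1>", rename_replacement)
    return re.sub(rename_pattern, py_replacement, index_name)


copy_name = restored_name("original_index", "original_index", "replica_index")
```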
I'm facing the same issue when using the reindex API.
Basically I'm merging daily, weekly, monthly indices to reduce shards.
We have a lot of indices with different data inputs, and maintaining a template for all cases is not an option. Thus we use dynamic mapping.
Due to dynamic mapping the reindex process can produce conflicts if your data is complicated, say json stored in a string field, and the reindexed field can end up as something else.
Solution:
Copy the mapping of your source index
Create a new index, applying the mapping
Disable dynamic mapping
Start the reindex process.
A script can be created, and should of course have error checking in place.
Abbreviated scripts below.
Create a new empty index with the mapping from the original index:
#!/bin/bash
SRC=$1
DST=$2
# Create a temporary file for holding the SRC mapping
TMPF=$(mktemp)
# Extract the SRC mapping, use `jq` to get the first record
# write to TMPF
curl -f -s "${URL:?}/${SRC}/_mapping" | jq -M -c 'first(.[])' > ${TMPF:?}
# Create the new index
curl -s -H 'Content-Type: application/json' -XPUT ${URL:?}/${DST} -d @${TMPF:?}
# Disable dynamic mapping
curl -s -H 'Content-Type: application/json' -XPUT \
${URL:?}/${DST}/_mapping -d '{ "dynamic": false }'
Start reindexing:
curl -s -XPOST "${URL:?}/_reindex" -H 'Content-Type: application/json' -d'
{
"conflicts": "proceed",
"source": {
"index": "'${SRC}'"
},
"dest": {
"index": "'${DST}'",
"op_type": "create"
}
}'
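For anyone who prefers to script this outside of bash, here is a small Python sketch of the two data-shaping steps: picking the first mapping out of a GET _mapping response (the jq first(.[]) step) and building the reindex body. It does no HTTP; the function names and the sample index names are made up for illustration.

```python
def first_mapping(mapping_response):
    """Return the first index entry of a GET <index>/_mapping response.

    The response is keyed by index name, so the first value is the
    {"mappings": ...} object that can be used as a create-index body.
    """
    return next(iter(mapping_response.values()))


def reindex_body(src, dst):
    """Build the _reindex body used in the script above."""
    return {
        "conflicts": "proceed",
        "source": {"index": src},
        "dest": {"index": dst, "op_type": "create"},
    }


# Hypothetical response for a single source index.
response = {"logs-daily": {"mappings": {"properties": {"message": {"type": "text"}}}}}
create_body = first_mapping(response)
body = reindex_body("logs-daily", "logs-monthly")
```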

Why are Elasticsearch aliases not unique

The Elasticsearch documentation describes aliases as a feature to reindex data with zero downtime:
Create a new index and index the whole data
Let your alias point to the new index
Delete the old index
This would be a great feature if aliases were unique, but it's possible for one alias to point to multiple indexes. If the deletion of the old index fails, my application might talk to two indexes which might not be in sync. Even worse: the application doesn't know about that.
Why is it possible to reuse an alias?
It allows you to easily have several indexes that are both used individually and together with other indexes. This is useful for example when having a logging index where sometimes you want to query the most recent (logs-recent alias) and sometimes want to query everything (logs alias). There are probably lots of other use cases but this one pops up as the first for me.
As per the documentation you can send both the remove and add in one request:
curl -XPOST 'http://localhost:9200/_aliases' -H 'Content-Type: application/json' -d '
{
"actions" : [
{ "remove" : { "index" : "test1", "alias" : "alias1" } },
{ "add" : { "index" : "test2", "alias" : "alias1" } }
]
}'
After that succeeds you can remove your old index, and if that fails you will just have an extra index taking up some space until it's cleaned up.
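One way to find such leftovers later is to look at the GET _aliases response for indices that no longer carry any alias. A minimal sketch, assuming every index you care about is reachable through an alias; the helper name unaliased_indices is hypothetical and the sample response is hand-written.

```python
def unaliased_indices(aliases_response):
    """Return the sorted names of indices that have no aliases at all."""
    return sorted(
        index
        for index, info in aliases_response.items()
        if not info.get("aliases")
    )


# Hand-written example of the state after the swap above succeeded
# but the old index has not been deleted yet.
after_swap = {
    "test1": {"aliases": {}},
    "test2": {"aliases": {"alias1": {}}},
}
leftovers = unaliased_indices(after_swap)
```

Anything in leftovers is a candidate for cleanup, but double-check before deleting, since an index without an alias may still be in use by name.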

Is it possible to write to multiple indexes with an ElasticSearch alias?

The Elasticsearch docs read:
An alias can also be mapped to more than one index, and when specifying it, the alias will automatically expand to the aliased indices.
But when I add an alias to 2 indices and write to it, neither index seems to get updated with the document. If I remove the alias from one of the indices, the write goes through correctly to the index that still has it.
Fails with multiple write aliases:
$ curl -XGET 'http://localhost:9200/_aliases'
result:
{
"dev_01": {
"aliases": {
"dev_read": {},
"dev_write": {}
}
},
"dev": {
"aliases": {
"dev_write": {}
}
}
}
Works with single alias:
$ curl -XGET 'http://localhost:9200/_aliases'
result:
{
"dev_01": {
"aliases": {
"dev_read": {},
"dev_write": {}
}
},
"dev": {
"aliases": {}
}
}
Does elasticsearch support writing to multiple indices? Are aliases Read-Only if pointed at multiple indices?
The answer is no.
So it appears I should have triaged this a bit deeper, but the response my client gets from ES is:
ElasticSearchIllegalArgumentException[Alias [dev_write] has more than one indices associated with it [[dev_01, dev]], can't execute a single index op
I just wish the docs were a little more explicit up front, as they confused me a bit.
At first the page seems to imply you can:
The index aliases API allow to alias an index with a name, with all APIs automatically converting the alias name to the actual index name. An alias can also be mapped to more than one index...
Associating an alias with more than one index are simply several add actions...
Further down, the page lets you know that you cannot:
It is an error to index to an alias which points to more than one index.
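A client can guard against this before indexing by checking the GET _aliases output. The sketch below applies the rule from the error message to the two responses shown in the question; resolve_write_index is a hypothetical helper, and it deliberately ignores the is_write_index flag that newer versions offer.

```python
def resolve_write_index(aliases_response, alias):
    """Return the single index behind `alias`, or raise if writes would be rejected."""
    targets = sorted(
        index
        for index, info in aliases_response.items()
        if alias in info.get("aliases", {})
    )
    if not targets:
        raise LookupError(f"alias [{alias}] does not exist")
    if len(targets) > 1:
        raise ValueError(f"alias [{alias}] points to more than one index: {targets}")
    return targets[0]


# The two states from the question.
failing = {
    "dev_01": {"aliases": {"dev_read": {}, "dev_write": {}}},
    "dev": {"aliases": {"dev_write": {}}},
}
working = {
    "dev_01": {"aliases": {"dev_read": {}, "dev_write": {}}},
    "dev": {"aliases": {}},
}
target = resolve_write_index(working, "dev_write")
```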
