Question is pretty straight forward, is there a way to enable TTL on the index level. effectively means all types created under this index will inherit an enabled TTL.
on the documentation it is said that "You can provide a per index/type default _ttl value as follows", but I wasn't able to request TTL on an index level.
in case it isn't possible, what workaround can be suggested ? in our environment new types are created all the time, and the data has to be removed after it is not needed anymore.
You can accomplish this using default option under mapping. Under an index , if you put any configuration under_default_ it would be applied to all the mappings whose these configurations not defined under the same index.
curl -XPUT "http://localhost:9200/test_index" -d'{
"mappings": {
"_default_": {
"_ttl": {
"enabled": true
}
}
}
}'
Related
Sometimes, I need to update mappings, settings, or bind default pipelines to the actively used index.
For the time being, I am using a method with data loss as follows:
update the index template with proper mapping (or binding the default pipeline by index.default_pipeline);
create a_new_index (matching the template index_patterns);
reindex the index_to_fix to a_new_index to migrate the data already indexed;
use alias to redirect the coming indexing request to a_new_index (the alias will have the same name as index_to_fix to ensure the indexing is undisturbed) and delete the index_to_fix;
But between step 3 and step 4, there is a time gap, during which the newly indexed data are lost in the original index_to_fix.
Is there a way, to update configurations for actively used index without any data loss?
Thanks for the help of #LeBigCat, after some discussions. I think this problem could be solved in three steps.
Use Alias for CRUD
First thing first, try not to use index directly, use alias if possible; since you can't use an alias with the same name as the existed indices, directly you can't replace the index even if it's broken (badly designed). The easiest way is to use a template and include the index name directly in the alias.
PUT _template/test
{
...
"aliases" : {
"{index}-alias" : {}
}
}
Redirect the Indexing
Since the index_to_fix is being actively used, after updating the template and create a new index a_new_fix, we can use alias to redirect the indexing to a_new_fix.
POST /_aliases
{
"actions" : [
{ "add": { "index": "a_new_index", "alias": "index_to_fix-alias" } },
{ "remove": { "index": "index_to_fix", "alias": "index_to_fix-alias" } }
]
}
Migrating the Data
Simply use _reindex to migrate all the data from index_to_fix to a_new_index.
POST _reindex
{
"source": {
"index": "index_to_fix"
},
"dest": {
"index": "index_to_fix-alias"
}
}
I have a field that was originally set to "index": false because we did not think we would ever have to query on this particular field. Its now about 9 months later and we have a new feature request that is going to require us to query on this field.
I know ES offers some nice features like fields that allow you to add more functionality to a field, but it does not appear to allow you to go from index = false to index = true by simply adding a sub-field.
After some googling I was not able to find a solution to this issue that doesn't involve either 1) creating a new field altogether or 2) re-indexing all of the data.
Does anyone know of a clean/side effect-free way of adding this kind of functionality to an existing field? If not, what is the suggested process?
Here is what the current field looks like:
{
"mappings": {
"entity": {
"properties": {
"contentType": {
"type": "keyword",
"index": false
}
}
}
}
}
Like I said I am looking to find the cleanest way of changing "index" to true
Thanks!
You need to create a new index and reindex I’m afraid.
Reindex API can be a great help though.
what is offline and online indexing in Elastic search? I did my research but I couldn't find enough resources to see what these terms mean? any idea? and also when do we need to reindex? any examples would be great
The terms offline and online indexing are used here.
https://spark-summit.org/2014/wp-content/uploads/2014/07/Streamlining-Search-Indexing-using-Elastic-Search-and-Spark-Holden-Karau.pdf
Reindexing
The most basic form if reindexing just copies one index to another.
I have used this form of reindexing to change a mapping.
Elasticsearch doesn't allow you to change a mapping, so if you want to change a mapping you have to create a new index (index2) with a new mapping and then reindex. The reindex will fill that new mapping with the data of the old index.
The command below will move everything from index to index2.
curl -XPOST 'localhost:9200/_reindex?pretty' -d'
{
"source": {
"index": "index"
},
"dest": {
"index": "index2"
}
}'
You can also use reindexing to fill a new index with a part of the old one. You can do so by using a couple of parameters. The example below will copy the newest 1000 documents.
POST /_reindex
{
"size": 1000,
"source": {
"index": "index",
"sort": { "date": "desc" }
},
"dest": {
"index": "index2"
}
}
For more examples about reindexing please have a look at the official documentation.
offline vs online indexing
In ONLINE mode the new index is built while the old index is accessible to reads and writes. any update on the old index will also get applied to the new index.
In OFFLINE mode the table is locked up front for any read or write, and then the new index gets built from the old index. No read or write operation is permitted on the table while the index is being rebuilt. Only when the operation is done is the lock on the table released and reads and writes are allowed again.
If I have a 15 node cluster, do I have to change the
index.number_of_shards
value on all 15 nodes, and restart them, before the new value comes into effect for new indexes?
That is right changing index.number_of_shards defaults in config file would involve changing the setting on all nodes and then restarting the instance ideally following the guidelines for rolling restarts.
However if that is not an option and if explicitly specifying the number_of_shards in the settings while creating the new index is not ideal then the workaround would be using index templates
Example:
One can create an index_defaults default as below
PUT /_template/index_defaults
{
"template": "*",
"settings": {
"number_of_shards": 4
}
}
This applies the setting specified in index_defaults template to all new indexes.
Once you set the number of shards for an index in ElasticSearch, you cannot change them. You will need to create a new index with the desired number of shards, and depending on your use case, you may want then to transfer the data to the new index.
I say depending on the use case because, for instance, if you are storing time based data such as log events, it is perfectly reasonable to close one index and open a new one with a different number of shards, and index all data going forward to that new index, keeping the old one for searches.
However, if your use case is, for instance, storing blog documents, and your indices are by topic, then you will need to (a) create new indices as stated above with a different number of shards and (b) reindex your data. For (b) I recommend using the Scroll and Scan API to get the data out of the old index.
You need to create a template for new indices that will be created:
PUT /_template/index_defaults
{
"index_patterns": "*",
"settings" : {
"index" : {
"number_of_shards" : 1,
"number_of_replicas" : 1
}
}
}
For old indices you need to reindex.
Example: from my_old_index to my_new_index
Create the new index with appropriate mapping and settings:
PUT my_new_index
{
"settings" : {
"index" : {
"number_of_shards" : 1,
"number_of_replicas" : 1
}
}
}
Reindex from old index to new one, specify type only if you desire:
POST /_reindex?slices=5
{
"size": 100000,
"source": { "index": "my_old_index" },
"dest": { "index": "my_new_index", "type": "my_type" }
}
Updated syntax to avoid some deprecation warnings in Elasticsearch 6+
per
https://www.elastic.co/guide/en/elasticsearch/reference/6.0/indices-templates.html
PUT /_template/index_defaults
{
"index_patterns": ["*"],
"order" : 0,
"settings": {
"number_of_shards": 2
}
}
Please remember that specifying the number of shards is a static operation and should be done when creating an index. But, any change after the index is created will require complete reindexing again which will take time.
To create the number of shards when creating an index use this command.
curl -XPUT ‘localhost:9200/my_sample_index?pretty’ -H ‘Content-Type: application/json’ -d’
{
“settings:”{
“number_of_shards”:2,
“number_of_replicas”:0
}
}
you don't have to to run this on all the nodes. run them on any one node. All the nodes communicate with each other about the change to the elastic index.
If I change the mapping so certain properties have new/different boost values, does that work even if the documents have already been indexed? Or do the boost values get applied when the document is indexed?
You cannot change field level boost factors after indexing data. It's not even possible for new data to be indexed once the same fields have been indexed already for previous data.
The only way to change the boost factor is to reindex your data. The pattern to do this without changing the code of your application is to use aliases. An alias points to a specific index. In case you want to change the index, you create a new index, then reindex data from the old index to the new index and finally you change the alias to point to the new index. Reindexing data is either supported by the elasticsearch library or can be achieved with a scan/scroll.
First version of mapping
Index: items_v1
Alias: items -> items_v1
Change necessary, sencond version of the index with new field level boost values :
Create new index: items_v2
Reindex data: items_v1 => items_v2
Change alias: items -> items_v2
This might be useful in other situations where you want to change your mapping.
Field level boosts are, however, not recommended. The better approach is to use boosting at query time.
Alias commands are:
Adding an alias
POST /_aliases
{
"actions": [
{ "add": {
"alias": "tems",
"index": "items_v1"
}}
]
}
Removing an alias
POST /_aliases
{
"actions": [
{ "remove": {
"alias": "tems",
"index": "items_v1"
}}
]
}
They do not.
Index time boosting is generally not recommended. Instead, you should do your boosting when you search.