I am very new to Elasticsearch. I need to know what the settings in an index are. Are they optional? What happens if we don't include them, and what happens if we don't include the number of shards in the settings?
If you're new to Elasticsearch, it's important to understand its basic terminology first.
cluster – An Elasticsearch cluster consists of one or more nodes and is identifiable by its cluster name.
node – A single Elasticsearch instance. In most environments, each node runs on a separate box or virtual machine.
index – In Elasticsearch, an index is a collection of documents, comparable to a database in MySQL.
shard – Because Elasticsearch is a distributed search engine, an index is usually split into elements known as shards that are distributed across multiple nodes. Elasticsearch automatically manages the arrangement of these shards. It also rebalances the shards as necessary, so users need not worry about the details.
replica – A copy of a primary shard, kept on a different node than the primary. Before version 7.0, Elasticsearch created five primary shards and one replica for each index by default, so each index consisted of five primary shards and each shard had one copy. From 7.0 onward, the default is one primary shard with one replica.
Settings define how an index is laid out in the cluster, and they differ based on the requirements of the application. The settings block is optional: if you omit it, or leave out the number of shards, Elasticsearch falls back to its defaults.
It contains the number of shards, the number of replicas, and so on. This lets you size the index according to the needs of the application, as below:
{
  "settings" : {
    "index" : {
      "number_of_shards" : 3,
      "number_of_replicas" : 2
    }
  }
}
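To make the effect of these two numbers concrete, here is a minimal Python sketch. The helper names `index_settings` and `total_shard_copies` are made up for illustration and are not part of any Elasticsearch client. It builds the same settings body and counts how many shard copies the cluster has to place, since every primary shard exists once plus one copy per replica.

```python
import json


def index_settings(number_of_shards: int, number_of_replicas: int) -> dict:
    """Build an index-creation request body (hypothetical helper)."""
    return {
        "settings": {
            "index": {
                "number_of_shards": number_of_shards,
                "number_of_replicas": number_of_replicas,
            }
        }
    }


def total_shard_copies(number_of_shards: int, number_of_replicas: int) -> int:
    """Each primary shard exists once, plus one copy per configured replica."""
    return number_of_shards * (1 + number_of_replicas)


if __name__ == "__main__":
    print(json.dumps(index_settings(3, 2), indent=2))
    print(total_shard_copies(3, 2))  # 3 primaries + 6 replica copies = 9
```

With the values from the example above, the cluster has to hold 9 shard copies for this one index, which is worth keeping in mind when sizing a small cluster.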
For further clarification, see the official Elastic documentation, which is very well written:
Setting in ElasticSearch
I need a way to split a shard whenever it grows beyond a given size limit: add one more shard to the same index and move half of the oversized shard's data into the newly created shard, so that both end up roughly equal in size.
I can get the shard state as follows, but I need help finding a way to redistribute the data:
{
  "index": "public",
  "shard": "0",
  "store": "20GB"
}
P.S. I have tried the Split Index API, but it doesn't serve the purpose: it requires a new, non-existing target index and cannot work in place on the existing one. In the example above, the index 'public' needs to stay the same while its shard count increases and the data is redistributed among the shards.
This is not possible: you can't change the number of primary shards of an existing Elasticsearch index in place, because document routing (which shard a document is stored on) depends on the number of primary shards fixed at index creation time.
If you changed it, Elasticsearch would have to change the routing and redistribute the data evenly across all the shards (including replicas). Doing that in a large-scale, distributed, stateful application is not an easy feat, and Elasticsearch doesn't support it as of now.
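The dependency on the shard count can be illustrated with a simplified routing rule: hash the routing value (by default the document id) and take it modulo the number of primary shards. The sketch below is only a model of that idea. `zlib.crc32` is a stand-in for Elasticsearch's own hash function, and the function names and document ids are hypothetical.

```python
import zlib


def shard_for(routing: str, number_of_primary_shards: int) -> int:
    """Simplified routing: hash the routing value modulo the shard count.

    zlib.crc32 is only a stand-in; Elasticsearch uses its own hash function.
    """
    return zlib.crc32(routing.encode("utf-8")) % number_of_primary_shards


def moved_documents(doc_ids, old_shards: int, new_shards: int):
    """Return the ids whose target shard changes when the shard count changes."""
    return [
        doc_id
        for doc_id in doc_ids
        if shard_for(doc_id, old_shards) != shard_for(doc_id, new_shards)
    ]


if __name__ == "__main__":
    ids = [f"doc-{i}" for i in range(1000)]  # hypothetical document ids
    moved = moved_documents(ids, old_shards=2, new_shards=3)
    print(f"{len(moved)} of {len(ids)} documents would map to a different shard")
```

Under this model, changing the divisor sends many existing documents to a different shard number, so lookups by id would miss unless all the data were moved first. That is why the shard count is treated as fixed for the life of an index.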
You cannot just add a shard without reindexing (but you can add a replica).
If part of your data is read-only and you can activate a basic licence (probably not on AWS), you can define an ILM (index lifecycle management) policy.
In Open Distro, you can use the equivalent, ISM (Index State Management):
https://docs.aws.amazon.com/elasticsearch-service/latest/developerguide/ism.html
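The answer does not say which ILM action it has in mind. One common reading is a rollover policy: instead of growing one index without bound, writes go through an alias and a new index is started once the current one exceeds a size threshold. A minimal sketch of such a policy body, assuming it is sent with `PUT _ilm/policy/<policy_name>` (the helper name and the 20gb threshold are illustrative only):

```python
import json


def rollover_policy(max_size: str) -> dict:
    """Build an ILM policy body that rolls over to a new index once the
    current index reaches max_size (hypothetical helper)."""
    return {
        "policy": {
            "phases": {
                "hot": {
                    "actions": {
                        "rollover": {"max_size": max_size}
                    }
                }
            }
        }
    }


if __name__ == "__main__":
    # 20gb mirrors the store size from the question; pick your own limit.
    print(json.dumps(rollover_policy("20gb"), indent=2))
```

Note that this keeps index sizes bounded by creating new indices. It does not split a shard inside the existing index, which remains unsupported.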
I have a newly set up Elasticsearch 7.5.2 cluster. When I create an index, only one shard is created for it by default.
My cluster strategy is as below:
Total Nodes: 5
--------------
Node 1 & Node 2 - Master Only
Node 3 - Master & Data Node
Node 4 & Node 5 - Data Only
I could not find any cluster setting that restricts the number of shards at index creation.
Is the issue with the cluster strategy, or am I missing a setting here?
Please help me find the issue.
Earlier versions of Elasticsearch defaulted to 5 primary shards. This changed in Elasticsearch 7.x, which you are using, so you are seeing just 1 primary shard.
See the Elasticsearch link for this change, and more info in this SO answer.
Apart from the per-index API that @Kamal already mentioned, older versions let you specify this setting in your elasticsearch.yml, where it applied to every index created unless you overrode it in the API call. Since Elasticsearch 5.x, index-level settings are no longer accepted in elasticsearch.yml, so on 7.x a default like this has to be set through an index template instead.
Config to add in your elasticsearch.yml (pre-5.x versions only)
index.number_of_shards: {your desired number of shards}
Note: This applies to primary shards, whose number can't be changed after the index is created, so be cautious when setting it. The number of replicas, by contrast, can be changed dynamically.
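On 7.x, a default shard count for new indices is usually expressed as an index template. Below is a minimal sketch of such a body, assuming it is sent with `PUT _template/<template_name>` on 7.5. The helper name and the `logs-*` pattern are hypothetical, so substitute the pattern that matches your own index names.

```python
import json


def default_shards_template(number_of_shards: int, index_patterns=("logs-*",)) -> dict:
    """Build an index template body that sets the primary shard count for
    every new index whose name matches one of the patterns (hypothetical helper)."""
    return {
        "index_patterns": list(index_patterns),
        "settings": {
            "index": {
                "number_of_shards": number_of_shards
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(default_shards_template(5), indent=2))
```

A template only affects indices created after it is stored, and settings passed explicitly at index creation still take precedence over it.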
That is correct. From version 7 onward, Elasticsearch creates an index with one primary shard by default, as mentioned here.
You can always specify the number of shards with the settings below when creating the index.
PUT <your_index_name>
{
  "settings" : {
    "index" : {
      "number_of_shards" : 5
    }
  }
}
Hope this helps!
I'm trying to move all the shards (primaries and replicas) from one specific Elasticsearch node to others.
While reading up on this, I came across cluster-level shard allocation filtering, where I can specify the name of a node to exclude when allocating shards.
PUT _cluster/settings
{
  "transient" : {
    "cluster.routing.allocation.exclude._name" : "data-node-1"
  }
}
My questions are,
If I dynamically update the setting, will the shards be moved from the nodes that I excluded to other nodes automatically?
How can I check and make sure that all shards are moved from a specific node?
Yes, your shards will be moved automatically, if it is possible to do so:
Shards are only relocated if it is possible to do so without breaking another routing constraint, such as never allocating a primary and replica shard on the same node.
More information here
You can use the _cat/shards API to see the location of all shards. Alternatively, if you have access to a Kibana dashboard, you can see the shard allocation in the Monitoring tab, at the very bottom of the view for shards or indices.
I know that with the config below we can exclude some nodes from an Elasticsearch cluster, and Elasticsearch itself relocates the existing indices away from those nodes.
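As a rough sketch of that check: `GET _cat/shards?format=json` returns one row per shard copy with columns such as `index`, `shard`, `prirep`, `state` and `node`. The function below filters those rows for a given node name. The function name and the sample rows are made up for illustration, and fetching the rows from the cluster is left out.

```python
def shards_on_node(cat_shards_rows, node_name: str):
    """Return the rows from `GET _cat/shards?format=json` still on node_name."""
    remaining = []
    for row in cat_shards_rows:
        # Unassigned shards have no node. While a shard is relocating, the node
        # column typically reads "source -> ip id target", so only the first
        # token (the current node) is compared.
        node = (row.get("node") or "").split(" ")[0]
        if node == node_name:
            remaining.append(row)
    return remaining


if __name__ == "__main__":
    # Hypothetical sample rows, not real cluster output.
    rows = [
        {"index": "public", "shard": "0", "prirep": "p", "state": "STARTED", "node": "data-node-2"},
        {"index": "public", "shard": "0", "prirep": "r", "state": "STARTED", "node": "data-node-1"},
        {"index": "public", "shard": "1", "prirep": "r", "state": "UNASSIGNED", "node": None},
    ]
    left = shards_on_node(rows, "data-node-1")
    print(f"{len(left)} shard copies still on data-node-1")
```

Once this list stays empty for the excluded node, it no longer holds any shard copies.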
PUT /_cluster/settings
{
  "transient" : {
    "cluster.routing.allocation.exclude._ip" : "192.168.2.*"
  }
}
But what I really want is to exclude some indices from particular nodes. I tried this config:
PUT test/_settings
{
  "index.routing.allocation.exclude._ip": "192.168.2.*"
}
This config prevents Elasticsearch from assigning new shards to these nodes, but it does not seem to make it relocate the index's existing shards away from them. Am I right? If so, how can I move an existing index off a particular node?
I know I can reroute shards manually, but there are many shards and that is almost impossible! _reindex is another option, but it takes even longer!
If it matters, I use Elasticsearch 2.3.5.
OK, the answer is that this config does make Elasticsearch move the index off the excluded nodes, but it only does so when the cluster is green!
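To see which nodes a pattern such as `192.168.2.*` would cover before applying it, the wildcard can be tried against a list of node IPs. This is a minimal sketch: `fnmatchcase` only approximates Elasticsearch's wildcard matching, and the function names and IP addresses are hypothetical.

```python
from fnmatch import fnmatchcase


def matches_exclude(node_ip: str, exclude_setting: str) -> bool:
    """True if node_ip matches any comma-separated pattern in the setting.

    fnmatchcase approximates the wildcard matching; it is not Elasticsearch's code.
    """
    patterns = [p.strip() for p in exclude_setting.split(",") if p.strip()]
    return any(fnmatchcase(node_ip, pattern) for pattern in patterns)


def nodes_to_drain(node_ips, exclude_setting: str):
    """Return the node IPs that the exclude setting would cover."""
    return [ip for ip in node_ips if matches_exclude(ip, exclude_setting)]


if __name__ == "__main__":
    ips = ["192.168.1.10", "192.168.2.5", "192.168.2.77"]  # hypothetical nodes
    print(nodes_to_drain(ips, "192.168.2.*"))
```

Note that the trailing dot in the pattern matters: `192.168.2.*` does not cover an address like `192.168.20.1`.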
I am using Elasticsearch version 1.0.1 and want to achieve two things at the same time:
1. Allow new indices to be created (the primary and replica shards need to be allocated as per the usual logic).
2. Prevent existing shards from being rebalanced on node failure.
Which combination of settings will let me achieve both? I tried the settings from the cluster module documented at http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-cluster.html, but I am unable to achieve both of them at the same time.
Thanks,