I have an Elasticsearch server with thousands of indexes. How can I find out which node a given index is stored on?
You need to use the cat shards API.
GET /_cat/shards/index-name
It returns one row per shard of index-name, including the node on which each shard is located.
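If you need this in a script, the cat APIs can return JSON when you append ?format=json. Below is a minimal sketch that groups the shard copies by node, assuming the response has already been fetched and decoded. The helper name nodes_for_index and the sample rows are made up for illustration and are not output from a real cluster.

```python
from collections import defaultdict

def nodes_for_index(rows):
    """Group shard copies by the node that holds them.

    `rows` is the decoded JSON from GET /_cat/shards/<index>?format=json,
    i.e. a list of dicts with keys such as "shard", "prirep" and "node".
    """
    by_node = defaultdict(list)
    for row in rows:
        # Unassigned shards have no node; group them under a placeholder.
        node = row.get("node") or "UNASSIGNED"
        by_node[node].append((int(row["shard"]), row["prirep"]))
    return {node: sorted(copies) for node, copies in by_node.items()}

# Illustrative rows only (hypothetical node names).
sample = [
    {"index": "index-name", "shard": "0", "prirep": "p", "state": "STARTED", "node": "node-a"},
    {"index": "index-name", "shard": "0", "prirep": "r", "state": "STARTED", "node": "node-b"},
    {"index": "index-name", "shard": "1", "prirep": "p", "state": "STARTED", "node": "node-b"},
    {"index": "index-name", "shard": "1", "prirep": "r", "state": "UNASSIGNED", "node": None},
]
placement = nodes_for_index(sample)
```

Here "p" marks a primary and "r" a replica, so placement tells you which nodes hold which copies of each shard.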
I have two different Elasticsearch clusters.
One cluster runs Elasticsearch 6.x and holds the data; the second is a new Elasticsearch 7.7.1 cluster with pre-created indexes.
I reindexed the data from Elasticsearch 6.x to Elasticsearch 7.7.1.
Is there any way to fetch a document from the source and compare it with the corresponding document in the target, to check that the data is there and has not been altered?
When you perform a reindex, the data is indexed according to the destination index's mapping, so if the mappings are the same you should get the same search results. The _source value will be identical in both indices, but that alone does not guarantee that your search results will be the same. If you really want to be sure everything is OK, you would have to compare the inverted indexes generated by both indices for full-text search. That data can be very large and there is no easy way to retrieve it; you can check this for getting a term-document matrix.
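For the _source part of the check, a comparison by document ID is usually enough. Here is a minimal sketch, assuming you have already pulled the documents from both clusters into {doc_id: _source} dictionaries (how you fetch them is left out). The function name compare_sources and the sample documents are hypothetical. Note that this only compares stored sources; it says nothing about how the fields were analyzed.

```python
def compare_sources(source_docs, target_docs):
    """Compare two {doc_id: _source} mappings.

    Returns (missing, extra, changed): IDs absent from the target, IDs that
    exist only in the target, and IDs whose _source differs.
    """
    missing = sorted(set(source_docs) - set(target_docs))
    extra = sorted(set(target_docs) - set(source_docs))
    changed = sorted(
        doc_id
        for doc_id in set(source_docs) & set(target_docs)
        if source_docs[doc_id] != target_docs[doc_id]
    )
    return missing, extra, changed

# Made-up documents standing in for the 6.x and 7.7.1 indices.
old = {"1": {"name": "a"}, "2": {"name": "b"}, "3": {"name": "c"}}
new = {"1": {"name": "a"}, "2": {"name": "B"}, "4": {"name": "d"}}
result = compare_sources(old, new)
```

With the sample data, document 3 is missing from the target, document 4 exists only in the target, and document 2 has a different _source.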
I'm trying to move all the shards (primaries and replicas) from one specific Elasticsearch node to other nodes.
While researching, I came across cluster-level shard allocation filtering, which lets me specify the name of a node to exclude when allocating shards.
PUT _cluster/settings
{
  "transient" : {
    "cluster.routing.allocation.exclude._name" : "data-node-1"
  }
}
My questions are,
If I dynamically update the setting, will the shards be moved from the nodes that I excluded to other nodes automatically?
How can I check and make sure that all shards are moved from a specific node?
Yes, your shards will be moved automatically, if it is possible to do so:
Shards are only relocated if it is possible to do so without breaking another routing constraint, such as never allocating a primary and replica shard on the same node.
More information here
You can use the cat shards API to see the location of all shards. Alternatively, if you have access to a Kibana dashboard, you can see the shard allocation at the very bottom of the Monitoring tab for shards or indices.
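To confirm that a node has been fully drained, you can filter the JSON form of the cat shards output (GET /_cat/shards?format=json) for that node. This is a minimal sketch. It assumes that a shard which is still relocating reports its node as "source -> ip id target", so the part before " -> " is the node that still holds the copy. The helper name shards_still_on and the sample rows are hypothetical.

```python
def shards_still_on(rows, node_name):
    """Return (index, shard, prirep, state) for copies still held by node_name."""
    remaining = []
    for row in rows:
        node = row.get("node") or ""
        # Assumption: a relocating shard reports "source -> ip id target";
        # the text before " -> " is the node that currently holds the copy.
        current = node.split(" -> ")[0].strip()
        if current == node_name:
            remaining.append(
                (row["index"], int(row["shard"]), row["prirep"], row["state"])
            )
    return remaining

# Illustrative rows only, not output from a real cluster.
rows = [
    {"index": "logs", "shard": "0", "prirep": "p", "state": "RELOCATING",
     "node": "data-node-1 -> 10.0.0.2 abc123 data-node-2"},
    {"index": "logs", "shard": "0", "prirep": "r", "state": "STARTED",
     "node": "data-node-3"},
    {"index": "logs", "shard": "1", "prirep": "p", "state": "STARTED",
     "node": "data-node-2"},
]
remaining = shards_still_on(rows, "data-node-1")
```

When the returned list is empty, nothing is left on the excluded node.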
I am very new to Elasticsearch. I need to know what the settings of an index are. Are they optional? What happens if we don't include them, and what happens if we don't include shards in the settings?
If you're new to Elasticsearch, it's important to understand its basic terminology first.
cluster – An Elasticsearch cluster consists of one or more nodes and is identifiable by its cluster name.
node – A single Elasticsearch instance. In most environments, each node runs on a separate box or virtual machine.
index – In Elasticsearch, an index is a collection of documents, comparable to a database in MySQL.
shard – Because Elasticsearch is a distributed search engine, an index is usually split into elements known as shards that are distributed across multiple nodes. Elasticsearch automatically manages the arrangement of these shards. It also rebalances the shards as necessary, so users need not worry about the details.
replica – A copy of a primary shard. Before version 7.0, Elasticsearch created five primary shards and one replica for each index by default, meaning each index consisted of five primary shards and each shard had one copy. Since 7.0 the default is one primary shard and one replica.
Settings define how an index is laid out, and they differ based on the requirements of the application. They are optional: if you leave them out, or leave out the shard count, Elasticsearch applies its defaults.
Settings contain the number of shards, the number of replicas, and so on. This information lets you size the index according to the needs of the application, as below:
{
  "settings" : {
    "index" : {
      "number_of_shards" : 3,
      "number_of_replicas" : 2
    }
  }
}
For further clarification, you can visit the official Elastic documentation, which is very well written, here.
Settings in Elasticsearch
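To see what those two numbers mean in practice, here is a small worked calculation. number_of_replicas counts copies per primary shard, so the total number of shard copies is primaries × (1 + replicas). The function name total_shard_copies is made up for this sketch.

```python
def total_shard_copies(body):
    """Count every shard copy (primaries plus replicas) an index will have."""
    index = body["settings"]["index"]
    primaries = index["number_of_shards"]
    replicas = index["number_of_replicas"]
    # Each primary gets `replicas` additional copies.
    return primaries * (1 + replicas)

body = {"settings": {"index": {"number_of_shards": 3, "number_of_replicas": 2}}}
copies = total_shard_copies(body)  # 3 primaries + 6 replicas
```

With the settings above that comes to 9 shard copies. Because a primary and its replicas are never placed on the same node, you would need at least three data nodes for all of them to be assigned.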
I know that with the config below we can exclude some nodes from an Elasticsearch cluster, and Elasticsearch itself relocates the existing indexes away from those nodes.
PUT /_cluster/settings
{
  "transient" : {
    "cluster.routing.allocation.exclude._ip" : "192.168.2.*"
  }
}
But what I really want is to exclude some indexes from particular nodes. I tried this config:
PUT test/_settings
{
  "index.routing.allocation.exclude._ip": "192.168.2.*"
}
This config prevents Elasticsearch from assigning new shards to these nodes, but it does not seem to make Elasticsearch relocate the index's existing shards away from them. Am I right? If so, how can I move an existing index off a particular node?
I know I can reroute shards manually, but there are so many shards that it is almost impossible! _reindex is another option, but it takes even longer!
If it matters, I use Elasticsearch 2.3.5.
OK, the answer is that this config does make Elasticsearch move the index's shards off the excluded nodes, but it only does so when the cluster is green!
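Once the cluster is green, you can check whether any shards of the index are still sitting on an excluded address. This is a rough sketch over the JSON form of the cat shards output (GET /_cat/shards/test?format=json). It uses Python's fnmatchcase as a stand-in for Elasticsearch's wildcard matching, which is an assumption: the two agree for a simple trailing * like the one above, but are not guaranteed to agree in general. The helper name and the sample rows are hypothetical.

```python
from fnmatch import fnmatchcase

def shards_on_excluded_ips(rows, pattern):
    """List shard copies whose node IP matches the exclusion pattern."""
    return [
        (row["index"], int(row["shard"]), row["prirep"], row["ip"])
        for row in rows
        # Unassigned shards have no IP, so skip them.
        if row.get("ip") and fnmatchcase(row["ip"], pattern)
    ]

# Illustrative rows only, not output from a real cluster.
rows = [
    {"index": "test", "shard": "0", "prirep": "p", "ip": "192.168.2.15"},
    {"index": "test", "shard": "0", "prirep": "r", "ip": "192.168.3.4"},
    {"index": "test", "shard": "1", "prirep": "p", "ip": None},
]
leftover = shards_on_excluded_ips(rows, "192.168.2.*")
```

An empty result means the index no longer has any shard on the excluded nodes.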
Since I am new to ES, I need help.
I read that it is possible to specify the shard where a document is stored by using 'routing'. But is it possible to require that a document be saved on a particular node?
Suppose I have two nodes, node1 and node2. My requirement is that if I add a document from node1 to the index 'attenadance', the primary shard should be stored on node1 and the replica may be on node2. Likewise, if I add a document from node2 to the index 'attenadance', the primary shard should be stored on node2 and the replica may be on node1. Please advise me whether this is possible in ES and, if so, how to achieve it.
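Each node has a specific value associated with rack in its config file (elasticsearch.yml). Node 1 has the setting node.rack: rack1, Node 2 has node.rack: rack2, and so on. (From Elasticsearch 5.0 onward, custom attributes are written as node.attr.rack.)
We can create an index that will only be deployed on nodes that have rack set to rack1 by setting index.routing.allocation.include.rack to rack1. The attribute name in the setting must match the attribute defined on the nodes. For example:
curl -XPUT localhost:9200/test/_settings -H 'Content-Type: application/json' -d '{
  "index.routing.allocation.include.rack" : "rack1"
}'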
further ref:
Elasticsearch official doc
You don't control where shards and replicas go; Elasticsearch handles that. In general, it won't put a replica of a shard on the same node as its primary. There is a really good explanation of how it all works here: Shards and replicas in Elasticsearch
There is also good documentation on using shard routing on the Elasticsearch blog if you need to group data together (but be careful, because it can generate hot spots).
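To make the hot-spot warning concrete, here is a toy model of routing. It uses the simplified rule shard = hash(routing) % number_of_primary_shards and substitutes zlib.crc32 for the hash. That substitution is an assumption made for illustration: Elasticsearch uses its own Murmur3-based hash, so the shard numbers will not match a real cluster. The function name shard_for and the routing values are hypothetical.

```python
import zlib

def shard_for(routing, num_primary_shards):
    """Pick a shard number from a routing value (toy model)."""
    # Stand-in hash: NOT the hash Elasticsearch actually uses.
    return zlib.crc32(routing.encode("utf-8")) % num_primary_shards

# 100 documents that all share one routing value, on an index with 5 primaries.
docs = [("doc-%d" % i, "customer-42") for i in range(100)]
shards_used = {shard_for(routing, 5) for _doc_id, routing in docs}
```

Every document with the same routing value lands on the same shard, so shards_used contains a single shard number. That is what makes routed searches cheap, and also what overloads one shard when a single routing value dominates the data. Note that routing picks a shard, not a node; which node holds that shard is still decided by Elasticsearch.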