I have a 2-node Elasticsearch cluster with 5 shards and 1 replica. The cluster health shows green, but suddenly it turns yellow, and I fix it by shard re-routing.
I want to understand the root cause of the unassigned shards, because when the cluster went to yellow state I tried telnet between both nodes on ports 9300 and 9200 and connected successfully.
I also faced that problem earlier, and at that time I went to the Elasticsearch logs and fixed the issue based on what I found there.
So my suggestion is to go to the logs and check what the root cause of the issue is.
regards
Kartheek Gummaluri
I have an Elasticsearch cluster of 3 nodes (1 master and 2 data nodes). I enabled X-Pack, and after that I was not able to start the master node, so I ran the elasticsearch-node repurpose command and the cluster restarted.
But now I have shards which are unassigned.
analytics-2019-11-19 0 p UNASSIGNED
analytics-2019-11-19 0 r UNASSIGNED
and the cluster status is red. I am new to ELK. Could you let me know how to fix this and make the cluster green?
Thanks
In order to resolve the UNASSIGNED shards issue, follow these steps:
First, let's find out which shards are unassigned, and why. Run:
curl -XGET "localhost:9200/_cat/shards?h=index,shard,prirep,state,unassigned.reason" | grep UNASSIGNED
Via Kibana (the console cannot pipe to grep, so look for UNASSIGNED in the output):
GET _cat/shards?h=index,shard,prirep,state,unassigned.reason
Next, let's use the cluster allocation explain API to gather more information about the shard allocation issues:
curl -XGET "localhost:9200/_cluster/allocation/explain?pretty"
Via Kibana
GET _cluster/allocation/explain?pretty
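If you want the explanation for one specific shard, rather than the first unassigned one Elasticsearch picks for you, you can pass a small request body (the index name here is just a placeholder for one of your own indices):
curl -XGET "localhost:9200/_cluster/allocation/explain?pretty" -H 'Content-Type: application/json' -d'
{
  "index": "my-index",
  "shard": 0,
  "primary": true
}'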
The resulting output will provide helpful details about why certain shards in your cluster remain unassigned.
For example:
You might see this explanation: "explanation" : "the shard cannot be allocated to the same node on which a copy of the shard already exists"
This typically means the index has more shard copies than your nodes can hold; if it is an index you don't need anymore, you can delete it to restore your cluster status to green.
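If that is the case, you can either delete the index or simply drop its replica count so the copy that cannot be placed is no longer required. A sketch, with my-index standing in for your index name:
curl -XDELETE "localhost:9200/my-index"
or, to keep the data but remove the replicas:
curl -XPUT "localhost:9200/my-index/_settings" -H 'Content-Type: application/json' -d'
{
  "index": { "number_of_replicas": 0 }
}'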
If that is not your issue (the example above), it could be one of the following reasons:
- Shard allocation is purposefully delayed (see the sketch after this list)
- Too many shards, not enough nodes
- You need to re-enable shard allocation
- Shard data no longer exists in the cluster
- Low disk watermark
- Multiple Elasticsearch versions
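For the delayed-allocation case, for example, the requests below are usually enough to nudge things along; the first retries allocations that have previously failed too many times, the second shortens the delay applied after a node leaves the cluster. This is only a sketch, and the timeout value is an arbitrary example:
# retry allocations that failed too many times
curl -XPOST "localhost:9200/_cluster/reroute?retry_failed=true"
# shorten the node-departure delay for all indices
curl -XPUT "localhost:9200/_all/_settings" -H 'Content-Type: application/json' -d'
{
  "settings": { "index.unassigned.node_left.delayed_timeout": "1m" }
}'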
Follow this guide to resolve the unassigned shards issue.
Hope this helps
I'm running a 3-node cluster on AWS EC2. One of my nodes crashed, and after a reboot I see 2900 unassigned shards and cluster state RED.
I configured the indices to have 5 shards with 1 replica, and I don't understand why, after rebooting, the shards are not recovered from the replicas.
I tried to manually migrate shards with the Elasticsearch reroute API https://www.elastic.co/guide/en/elasticsearch/reference/current/cluster-reroute.html
but got errors:
can't cancel 2, failed to find it on node {infra-elasticsearch-1}
can't move 2, failed to find it on node {infra-elasticsearch-1}
[allocate_replica] trying to allocate a replica shard [filebeat-demo00-2018.07.21][2], while corresponding primary shard is still unassigned ("type": "illegal_argument_exception")
It looks like some primary shards were lost (they don't exist on disk) and I don't know how to get the state back to GREEN.
thanks
Make sure shard allocation is enabled on the active nodes by using the API request below (setting the value to null resets it to its default, which is "all"):
PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.enable": null
  }
}
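You can also read the setting back to confirm what it currently resolves to; include_defaults shows the default of "all" when no override is set, and filter_path just trims the response:
GET _cluster/settings?include_defaults=true&filter_path=*.cluster.routing.allocation.enable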
You can also check whether a replica exists for the indices whose primary shard has been lost by looking at the Indices information in the Monitoring app in Kibana.
To check the ongoing recovery process, use the API below:
GET /_recovery
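If the allocation explain output confirms that no copy of a primary is left on disk anywhere, the last resort is to force-allocate an empty primary through the reroute API. Be aware this accepts the loss of whatever was in that shard; the index, shard number, and node name below are taken from the question above and should be replaced with your own values:
POST _cluster/reroute
{
  "commands": [
    {
      "allocate_empty_primary": {
        "index": "filebeat-demo00-2018.07.21",
        "shard": 2,
        "node": "infra-elasticsearch-1",
        "accept_data_loss": true
      }
    }
  ]
}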
I don't know if this can help, but I just restarted the Elasticsearch and Kibana services, waited a few minutes, and the cluster health changed from red to yellow and then to green in a matter of minutes.
on elastic cluster nodes:
#systemctl restart elasticsearch.service
on kibana node:
#systemctl restart kibana.service
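Instead of waiting and refreshing, you can also let the health API block until a target status is reached (the timeout value here is just an example):
curl -XGET "localhost:9200/_cluster/health?wait_for_status=yellow&timeout=120s"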
I started two Elasticsearch clusters with different names, but the other one won't show up, either in Marvel or when querying the health manually.
curl 'http://127.0.0.1:9200/_cat/health?v'
epoch timestamp cluster status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
1501062768 15:22:48 Cove_dev_cluster yellow 1 1 8 8 0 0 8 0 - 50.0%
But it's running on my screen.
I am assuming you are running both clusters (single nodes, I believe, in this case) on the same machine. In that case the nodes have a default HTTP port range of 9200-9300 and are configured to bind to the first available port in that range. More details are available in the Network Settings documentation.
So in your case the other cluster is most likely running on port 9201. If you check Marvel or query the health manually on port 9201, you should find the other cluster.
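For example, the same health check pointed at the next port should show the second cluster, assuming it did bind to 9201:
curl 'http://127.0.0.1:9201/_cat/health?v'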
However, if you want to have two nodes participating in the same cluster, then make sure that the cluster name matches in the configuration of both instances of elasticsearch you have running.
Hope this helps.
I have set up an Elasticsearch cluster with 1 master node and 1 client node, but the problem is that as soon as I create an index, my cluster moves to red state with 3 initializing_shards on the client node, while the master node's shards are working fine.
I don't know how to resolve it.
It was an installation issue; we reinstalled Elasticsearch and that solved our problem.
As you said in your question, you have only 1 master node and 1 client node, but you need at least 1 data node to store at least the primary shards.
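A quick way to confirm which roles your nodes actually have is the cat nodes API; shards can only be assigned to nodes that report a data role (a sketch):
curl -XGET "localhost:9200/_cat/nodes?v&h=name,node.role,master"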
After many trials and errors, here is the pattern:
1 node: works perfectly.
2 nodes: the root endpoint and queries return 200 and cluster health is green, but indexing a new document gets no response, and search queries get no response either. Once I shut down one of the nodes, everything works again.
I did make sure that port 9300 is open in the firewall between the nodes. Am I missing some other important config? The cluster API reports 2 nodes, so the communication should be fine. Is there some other factor that prevents new documents from being indexed with two nodes?