Elasticsearch 6 index changes to read-only after a few seconds - elasticsearch

I want to use Elasticsearch 6 on macOS, but when I create an index by adding a document to a non-existent index, the index changes to read-only after a few seconds, and adding or updating a document then gives this error:
"error" : {
"root_cause" : [
{
"type" : "cluster_block_exception",
"reason" : "blocked by: [FORBIDDEN/12/index read-only / allow delete (api)];"
}
],
"type" : "cluster_block_exception",
"reason" : "blocked by: [FORBIDDEN/12/index read-only / allow delete (api)];"
},
"status" : 403
}
I tried to disable read-only with:
curl -H'Content-Type: application/json' -XPUT localhost:9200/test/_settings?pretty -d'
{
"index": {
"blocks.read_only": false
}
}'
{
"acknowledged" : true
}
but nothing changed.
I tested Elasticsearch 6 on another system running Ubuntu and it works without errors, so I thought something might be wrong with my machine, yet Elasticsearch 5.6.2 runs on it correctly without any error.
The Elasticsearch log shows:
[2018-01-05T21:56:52,254][WARN ][o.e.c.r.a.DiskThresholdMonitor] [gCjouly] flood stage disk watermark [95%] exceeded on [gCjoulysTFy1DDU7f7dOWQ][gCjouly][/Users/peter/Downloads/elasticsearch-6.1.1/data/nodes/0] free: 15.7gb[3.3%], all indices on this node will marked read-only

I had this problem.
Elasticsearch 6 introduces a safeguard that marks indices read-only when free disk space drops below the flood-stage watermark (95% used, i.e. less than 5% free, by default).
You can disable this check with the line below in elasticsearch.yml:
cluster.routing.allocation.disk.threshold_enabled: false
Then restart Elasticsearch.
I hope this works for you.

For convenience, here are the same commands for copy/pasting into the Kibana console:
# disable threshold alert
PUT /_cluster/settings
{
"persistent" : {
"cluster.routing.allocation.disk.threshold_enabled" : false
}
}
# unlock indices from read-only state
PUT /_all/_settings
{
"index.blocks.read_only_allow_delete": null
}

If you are working with Elasticsearch in Docker, it's possible that Docker has run out of space. Either run docker volume prune to remove unused local volumes or increase your disk image size in Docker Preferences.
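To see where the space is actually going before pruning, Docker's built-in disk report is a quick check (standard Docker CLI commands; nothing here is specific to Elasticsearch):
# show how much space images, containers and volumes use
docker system df
# remove unused local volumes (asks for confirmation before deleting)
docker volume prune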

Run these commands
curl -XPUT -H "Content-Type: application/json" http://localhost:9200/_cluster/settings -d '{ "transient": { "cluster.routing.allocation.disk.threshold_enabled": false } }'
curl -XPUT -H "Content-Type: application/json" http://localhost:9200/_all/_settings -d '{"index.blocks.read_only_allow_delete": null}'
If you are using Docker containers, make sure the container has enough space and that its disk usage stays below 85%.
You can free up space by clearing dangling images and volumes with the following commands:
# remove the dangling images
docker rmi $(docker images -f "dangling=true" -q)
# remove the dangling volumes
docker volume rm $(docker volume ls -qf dangling=true)
If you are still having space issues, it is better to increase the space for Docker under Docker > Preferences.
After making space for Docker, you need to run the curl commands shared at the top of this answer.
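To confirm how full the Elasticsearch data volume actually is, here is a hedged check from the host (the container name "elasticsearch" and the data path of the official image are assumptions; adjust to your setup):
# report disk usage for the data directory inside the container
docker exec -it elasticsearch df -h /usr/share/elasticsearch/data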

It's not recommended for anything beyond quick debugging, but you can add the following to your docker-compose file:
elasticsearch:
  image: elasticsearch:7.9.3
  environment:
    "cluster.routing.allocation.disk.threshold_enabled": "false"

Related

TransportError(403, u'cluster_block_exception', u'blocked by: [FORBIDDEN/12/index read-only / allow delete (api)];')

When I try to store anything in Elasticsearch, an error says:
TransportError(403, u'cluster_block_exception', u'blocked by: [FORBIDDEN/12/index read-only / allow delete (api)];')
I have already inserted about 200 million documents into my index, but I have no idea why this error is happening.
I've tried:
curl -u elastic:changeme -XPUT 'localhost:9200/_cluster/settings' -H 'Content-Type: application/json' -d '{"persistent":{"cluster.blocks.read_only":false}}'
As mentioned here:
ElasticSearch entered "read only" mode, node cannot be altered
And the result is:
{"acknowledged":true,"persistent":{"cluster":{"blocks":{"read_only":"false"}}},"transient":{}}
But nothing changed. What should I do?
Try GET yourindex/_settings; this will show your index settings. If read_only_allow_delete is true, then try:
PUT /<yourindex>/_settings
{
"index.blocks.read_only_allow_delete": null
}
That got my issue fixed.
Please refer to the Elasticsearch configuration guide for more detail.
The curl command for this is
curl -X PUT "localhost:9200/twitter/_settings?pretty" -H 'Content-Type: application/json' -d '
{
"index.blocks.read_only_allow_delete": null
}'
Last month I faced the same problem; you can try this from the command line (or the equivalent in Kibana Dev Tools):
curl -XPUT -H "Content-Type: application/json" http://localhost:9200/_all/_settings -d '{"index.blocks.read_only_allow_delete": null}'
I hope it helps
I faced the same issue when my disk space was full.
Please see the steps that I took:
1- Increase the disk space
2- Update the index read-only setting; see the following curl request:
curl -XPUT -H "Content-Type: application/json" http://localhost:9200/_all/_settings -d '{"index.blocks.read_only_allow_delete": null}'
This happens because of Elasticsearch's default disk watermark, which is usually 95% of the disk size.
This happens when Elasticsearch thinks the disk is running low on space so it puts itself into read-only mode.
By default Elasticsearch's decision is based on the percentage of disk space that's free, so on big disks this can happen even if you have many gigabytes of free space.
The flood stage watermark is 95% by default, so on a 1TB drive you need at least 50GB of free space or Elasticsearch will put itself into read-only mode.
For docs about the flood stage watermark see https://www.elastic.co/guide/en/elasticsearch/reference/6.2/disk-allocator.html.
Quoted from part of this answer
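To see how close each node is to the watermarks, the _cat/allocation API reports per-node disk usage (a quick check, assuming Elasticsearch is reachable at localhost:9200):
# per-node disk used/available/percent and shard counts
curl -XGET 'http://localhost:9200/_cat/allocation?v'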
One solution is to disable it entirely (I found this useful in my local and CI setups). To do so, run these two commands:
curl -XPUT -H "Content-Type: application/json" http://localhost:9200/_cluster/settings -d '{ "transient": { "cluster.routing.allocation.disk.threshold_enabled": false } }'
curl -XPUT -H "Content-Type: application/json" http://localhost:9200/_all/_settings -d '{"index.blocks.read_only_allow_delete": null}'
Adding to this later on, as I just encountered the problem myself - these are the steps I took.
1) Deleted older indexes to free up space immediately - this brought me to around 23% free.
2) Update the index read-only mode.
I still had the same issue. I checked the Dev Console to see which indices might still be locked, and none were. I restarted the cluster and had the same issue.
Finally, under Index Management I selected the indices with ILM lifecycle errors and chose to retry the ILM step. I had to do that a couple of times to clear them all out, but it worked.
The problem may still be disk space. I had this issue even after freeing a lot of disk space, so I finally deleted the data folder and it worked (note that this wipes all indexed data): sudo rm -rf /usr/share/elasticsearch/data/
This solved the issue:
PUT _settings
{
  "index": {
    "blocks": {
      "read_only_allow_delete": "false"
    }
  }
}

Elastic data restore from S3

I have an Elasticsearch backup stored in S3, but I am not able to restore it using either of the commands below.
curl -XPOST http://localhost:9200/_snapshot/elasticsearch/snap-dev_1/_restore
curl -XPOST http://localhost:9200/_snapshot/snap-deliveryreports_june2016bk/elasticsearch/_restore
I can see the files in S3:
What is the command to restore the data shown in the image?
Update:
The following command is successful (returns acknowledged: true).
It means the access key, secret key, bucket name and region are correct.
curl -XPUT 'http://localhost:9200/_snapshot/s3_repository?verify=true&pretty' -d'
{
"type": "s3",
"settings": {
"bucket": "todel162",
"region": "us-east-1"
}
}'
I guess I only need to know how to use the restore snapshot command.
You can use the cat recovery API to monitor your restore status, since restoring piggybacks on Elasticsearch's regular recovery mechanism, so check whether you see anything with that API.
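As a hedged sketch of the full flow, using the repository name "s3_repository" registered in the question and the snapshot name visible in the first command (both are assumptions about your setup):
# list the snapshots stored in the repository
curl -XGET 'http://localhost:9200/_snapshot/s3_repository/_all?pretty'
# restore one snapshot by name
curl -XPOST 'http://localhost:9200/_snapshot/s3_repository/snap-dev_1/_restore?pretty'
# watch restore progress via the recovery API
curl -XGET 'http://localhost:9200/_cat/recovery?v'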

How to bulk create (export/import) indices in elasticsearch?

I'm trying to upgrade our ELK stack from 1.x > 5.x following the re-index from remote instructions. I'm not sure how to export a list of the indices that I need to create and then import that list into the new instance. I've created a list of indices using this command, both with "pretty" and without, but I'm not sure which file format to use or what to do next with that file.
The create index instructions don't go into how to create more than one at a time, and the bulk instructions only refer to creating/indexing documents, not creating the indices themselves. Any assistance on how to best follow the upgrade instructions would be appreciated.
I apparently don't have enough reputation to link the "create index" and "bulk" instructions, so apologies for that.
With a single curl command you could create an index template that will trigger the index creation at the time the documents hit your ES 5.x cluster.
Basically, this single curl command will create an index template that will kick in for each new index created on-the-fly. You can then use the "reindex from remote" technique in order to move your documents from ES 1.x to ES 5.x and don't worry about index creation since the index template will take care of it.
curl -XPUT 'localhost:9200/_template/my_template' -H 'Content-Type: application/json' -d'
{
"template": "*",
"settings": {
"index.refresh_interval" : -1,
"index.number_of_replicas" : 0
}
}
'
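For reference, here is a hedged sketch of the reindex-from-remote call that then moves the documents (the remote host "oldhost:9200" and the index name are placeholders; the remote host must also be whitelisted via reindex.remote.whitelist on the 5.x cluster):
curl -XPOST 'localhost:9200/_reindex' -H 'Content-Type: application/json' -d'
{
  "source": {
    "remote": { "host": "http://oldhost:9200" },
    "index": "my-old-index"
  },
  "dest": {
    "index": "my-old-index"
  }
}'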
I was able to accomplish this with a list of indices (generated from the index list and formatted with sed), then feeding that file through the following script:
#!/bin/bash
# Read index names (one per line) from the file passed as $1 and create each index
while read some_index; do
  curl -XPUT "localhost:9200/$some_index?pretty" -d'
  {
    "settings" : {
      "index" : {
        "refresh_interval" : -1,
        "number_of_replicas" : 0
      }
    }
  }'
  sleep 1
done <$1
If anyone can point me in the direction of any pre-existing mechanisms in Elasticsearch, though, please do.

How to (persistently) update the index.number_of_replicas setting in Elasticsearch without restarting the cluster?

In a running Elasticsearch cluster, the index.number_of_replicas setting in the configuration file is 1.
I could update this to 2 on a running cluster, by running
# curl -XPUT "http://127.0.0.1:9200/_settings?pretty" \
-d '{ "index": {"number_of_replicas":2}}'
{
"acknowledged" : true
}
Elasticsearch immediately creates the extra replicas for existing indexes.
However, newly created indexes have only 1 replica. How can the setting be persisted for newly created indexes too?
The API you used dynamically updates the replica setting for existing indices.
If you want to apply it to indices created in the future, a better approach is to use an index template.
You can find more information on it here.
curl -XPUT localhost:9200/_template/template_1 -d '
{
"template" : "*",
"settings" : {
"number_of_replicas" : 2
}
}'
The above should work fine for your case.
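As a quick usage check (the index name test_index is just an example), create a new index and confirm that the template's replica setting was applied:
curl -XPUT 'localhost:9200/test_index'
curl -XGET 'localhost:9200/test_index/_settings?pretty'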

ElasticSearch: Unassigned Shards, how to fix?

I have an ES cluster with 4 nodes:
number_of_replicas: 1
search01 - master: false, data: false
search02 - master: true, data: true
search03 - master: false, data: true
search04 - master: false, data: true
I had to restart search03, and when it came back, it rejoined the cluster no problem, but left 7 unassigned shards laying about.
{
"cluster_name" : "tweedle",
"status" : "yellow",
"timed_out" : false,
"number_of_nodes" : 4,
"number_of_data_nodes" : 3,
"active_primary_shards" : 15,
"active_shards" : 23,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 7
}
Now my cluster is in yellow state. What is the best way to resolve this issue?
Delete (cancel) the shards?
Move the shards to another node?
Allocate the shards to the node?
Update 'number_of_replicas' to 2?
Something else entirely?
Interestingly, when a new index was added, that node started working on it and played nice with the rest of the cluster, it just left the unassigned shards laying about.
Follow on question: am I doing something wrong to cause this to happen in the first place? I don't have much confidence in a cluster that behaves this way when a node is restarted.
NOTE: If you're running a single node cluster for some reason, you might simply need to do the following:
curl -XPUT 'localhost:9200/_settings' -d '
{
"index" : {
"number_of_replicas" : 0
}
}'
By default, Elasticsearch will re-assign shards to nodes dynamically. However, if you've disabled shard allocation (perhaps you did a rolling restart and forgot to re-enable it), you can re-enable shard allocation.
# v0.90.x and earlier
curl -XPUT 'localhost:9200/_settings' -d '{
"index.routing.allocation.disable_allocation": false
}'
# v1.0+
curl -XPUT 'localhost:9200/_cluster/settings' -d '{
"transient" : {
"cluster.routing.allocation.enable" : "all"
}
}'
Elasticsearch will then reassign shards as normal. This can be slow; consider raising indices.recovery.max_bytes_per_sec and cluster.routing.allocation.node_concurrent_recoveries to speed it up.
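For example, here is a hedged sketch of bumping those two settings transiently (the values are illustrative, not recommendations):
curl -XPUT 'localhost:9200/_cluster/settings' -H 'Content-Type: application/json' -d '{
  "transient" : {
    "indices.recovery.max_bytes_per_sec" : "100mb",
    "cluster.routing.allocation.node_concurrent_recoveries" : 4
  }
}'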
If you're still seeing issues, something else is probably wrong, so look in your Elasticsearch logs for errors. If you see EsRejectedExecutionException your thread pools may be too small.
Finally, you can explicitly reassign a shard to a node with the reroute API.
# Suppose shard 4 of index "my-index" is unassigned, so you want to
# assign it to node search03:
curl -XPOST 'localhost:9200/_cluster/reroute' -d '{
"commands": [{
"allocate": {
"index": "my-index",
"shard": 4,
"node": "search03",
"allow_primary": 1
}
}]
}'
OK, I've solved this with some help from ES support. Issue the following command to the API on all nodes (or the nodes you believe to be the cause of the problem):
curl -XPUT 'localhost:9200/<index>/_settings' \
-d '{"index.routing.allocation.disable_allocation": false}'
where <index> is the index you believe to be the culprit. If you have no idea, just run this on all nodes:
curl -XPUT 'localhost:9200/_settings' \
-d '{"index.routing.allocation.disable_allocation": false}'
I also added this line to my YAML config, and since then any restarts of the server/service have been problem-free. The shards re-allocated immediately.
FWIW, to answer an oft sought after question, set MAX_HEAP_SIZE to 30G unless your machine has less than 60G RAM, in which case set it to half the available memory.
References
Shard Allocation Awareness
This little bash script will brute force reassign, you may lose data.
NODE="YOUR NODE NAME"
IFS=$'\n'
for line in $(curl -s 'localhost:9200/_cat/shards' | fgrep UNASSIGNED); do
INDEX=$(echo $line | (awk '{print $1}'))
SHARD=$(echo $line | (awk '{print $2}'))
curl -XPOST 'localhost:9200/_cluster/reroute' -d '{
"commands": [
{
"allocate": {
"index": "'$INDEX'",
"shard": '$SHARD',
"node": "'$NODE'",
"allow_primary": true
}
}
]
}'
done
I also encountered a similar error. It happened to me because one of my data nodes was full, which caused shard allocation to fail. If there are unassigned shards and your cluster is RED and a few indices are also RED, I followed the steps below and they worked like a champ.
In Kibana Dev Tools:
GET _cluster/allocation/explain
If any unassigned shards exist, you will get details; otherwise it will throw an ERROR.
Simply running the command below will solve everything:
POST _cluster/reroute?retry_failed
Thanks to -
https://github.com/elastic/elasticsearch/issues/23199#issuecomment-280272888
The only thing that worked for me was changing the number_of_replicas (I had 2 replicas, so I changed it to 1 and then changed back to 2).
First:
PUT /myindex/_settings
{
"index" : {
"number_of_replicas" : 1
}
}
Then:
PUT /myindex/_settings
{
"index" : {
"number_of_replicas" : 2
}
}
(I already answered it in this question)
Elasticsearch automatically allocates shards if the config below is set to all (this config can also be set using the REST API):
cluster.routing.allocation.enable: all
If, even after applying this config, Elasticsearch fails to assign the shards automatically, then you have to force-assign the shards yourself. (ES official link for this.)
I have written a script to force-assign all unassigned shards across the cluster.
The array below contains the list of nodes among which you want to balance the unassigned shards:
#!/bin/bash
array=( node1 node2 node3 )
node_counter=0
length=${#array[@]}
IFS=$'\n'
for line in $(curl -s 'http://127.0.0.1:9200/_cat/shards'| fgrep UNASSIGNED); do
INDEX=$(echo $line | (awk '{print $1}'))
SHARD=$(echo $line | (awk '{print $2}'))
NODE=${array[$node_counter]}
echo $NODE
curl -XPOST 'http://127.0.0.1:9200/_cluster/reroute' -d '{
"commands": [
{
"allocate": {
"index": "'$INDEX'",
"shard": '$SHARD',
"node": "'$NODE'",
"allow_primary": true
}
}
]
}'
node_counter=$(( (node_counter + 1) % length ))
done
I got stuck today with the same shard-allocation issue. The script that W. Andrew Loe III proposed in his answer didn't work for me, so I modified it a little and it finally worked:
#!/usr/bin/env bash
# The script performs force relocation of all unassigned shards,
# of all indices to a specified node (NODE variable)
ES_HOST="<elasticsearch host>"
NODE="<node name>"
curl ${ES_HOST}:9200/_cat/shards > shards
grep "UNASSIGNED" shards > unassigned_shards
while read LINE; do
IFS=" " read -r -a ARRAY <<< "$LINE"
INDEX=${ARRAY[0]}
SHARD=${ARRAY[1]}
echo "Relocating:"
echo "Index: ${INDEX}"
echo "Shard: ${SHARD}"
echo "To node: ${NODE}"
curl -s -XPOST "${ES_HOST}:9200/_cluster/reroute" -d "{
\"commands\": [
{
\"allocate\": {
\"index\": \"${INDEX}\",
\"shard\": ${SHARD},
\"node\": \"${NODE}\",
\"allow_primary\": true
}
}
]
}"; echo
echo "------------------------------"
done <unassigned_shards
rm shards
rm unassigned_shards
exit 0
Now, I'm not kind of a Bash guru, but the script really worked for my case. Note, that you'll need to specify appropriate values for "ES_HOST" and "NODE" variables.
In my case, when I created a new index, the default number_of_replicas was set to 1. The number of nodes in my cluster was only one, so there was no extra node to hold the replica, and the health kept turning yellow.
So when I created the index with a settings property and set number_of_replicas to 0, it worked fine. Hope this helps.
PUT /customer
{
"settings": {
"number_of_replicas": 0
}
}
In my case, the hard disk space upper bound was reached.
Look at this article: https://www.elastic.co/guide/en/elasticsearch/reference/current/disk-allocator.html
Basically, I ran:
PUT /_cluster/settings
{
"transient": {
"cluster.routing.allocation.disk.watermark.low": "90%",
"cluster.routing.allocation.disk.watermark.high": "95%",
"cluster.info.update.interval": "1m"
}
}
So that it will allocate if <90% hard disk space used, and move a shard to another machine in the cluster if >95% hard disk space used; and it checks every 1 minute.
Maybe it helps someone, but I had the same issue and it was due to a lack of storage space caused by a log getting way too big.
Hope it helps someone! :)
I was having this issue as well, and I found an easy way to resolve it.
Get the index of unassigned shards
$ curl -XGET http://172.16.4.140:9200/_cat/shards
Install the Curator tool, and use it to delete the index:
$ curator --host 172.16.4.140 delete indices --older-than 1 \
--timestring '%Y.%m.%d' --time-unit days --prefix logstash
NOTE: In my case, the index is the logstash index for 2016-04-21.
Then check the shards again, all the unassigned shards go away!
I had the same problem but the root cause was a difference in version numbers (1.4.2 on two nodes (with problems) and 1.4.4 on two nodes (ok)). The first and second answers (setting "index.routing.allocation.disable_allocation" to false and setting "cluster.routing.allocation.enable" to "all") did not work.
However, the answer by @Wilfred Hughes (setting "cluster.routing.allocation.enable" to "all" using transient) gave me an error with the following statement:
[NO(target node version [1.4.2] is older than source node version
[1.4.4])]
After updating the old nodes to 1.4.4, these nodes started to resync with the other good nodes.
I also ran into this situation and finally fixed it.
Firstly, I will describe my situation. I have two nodes in my Elasticsearch cluster; they can find each other, but when I created an index with the settings "number_of_replicas" : 2, "number_of_shards" : 5, ES showed a yellow status and unassigned_shards was 5.
The problem was the value of number_of_replicas: with only two nodes, two replicas per shard cannot be allocated. When I set its value to 1, all was fine.
For me, this was resolved by running this from the dev console: "POST /_cluster/reroute?retry_failed"
.....
I started by looking at the index list to see which indices were red and then ran
"get /_cat/shards?h=[INDEXNAME],shard,prirep,state,unassigned.reason"
and saw that it had shards stuck in ALLOCATION_FAILED state, so running the retry above caused them to re-try the allocation.
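For reference, curl equivalents of those console commands (a sketch; adjust the host and add an index filter as needed):
# list shards with their state and the reason they are unassigned
curl -XGET 'localhost:9200/_cat/shards?h=index,shard,prirep,state,unassigned.reason&v'
# retry allocations that previously failed
curl -XPOST 'localhost:9200/_cluster/reroute?retry_failed=true'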
In my case an old node with old shards was joining the cluster, so we had to shut down the old node and delete the indices with unassigned shards.
I tried several of the suggestions above and unfortunately none of them worked. We have a "Log" index in our lower environment where apps write their errors. It is a single node cluster. What solved it for me was checking the YML configuration file for the node and seeing that it still had the default setting "gateway.expected_nodes: 2". This was overriding any other settings we had. Whenever we would create an index on this node it would try to spread 3 out of 5 shards to the phantom 2nd node. These would therefore appear as unassigned and they could never be moved to the 1st and only node.
The solution was editing the config, changing the setting "gateway.expected_nodes" to 1, so it would quit looking for its never-to-be-found brother in the cluster, and restarting the Elastic service instance. Also, I had to delete the index, and create a new one. After creating the index, the shards all showed up on the 1st and only node, and none were unassigned.
# Set how many nodes are expected in this cluster. Once these N nodes
# are up (and recover_after_nodes is met), begin recovery process immediately
# (without waiting for recover_after_time to expire):
#
# gateway.expected_nodes: 2
gateway.expected_nodes: 1
Similar problem on ES 7.4.2; the commands have changed. As already mentioned in other answers, the first thing to check is GET _cluster/allocation/explain?pretty, then POST _cluster/reroute?retry_failed.
Primary
You have to pass "accept_data_loss": true for a primary shard
POST _cluster/reroute
{
"commands": [{
"allocate_stale_primary": {
"index": "filebeat-7.4.2-xxxx",
"shard": 0,
"node": "my_node",
"accept_data_loss": false
}
}]
}
Replica
POST _cluster/reroute
{
"commands": [{
"allocate_replica": {
"index": "filebeat-7.4.2-xxxx",
"shard": 0,
"node": "my_other_node"
}
}]
}
cluster-reroute doc
This might help, but I had this issue when trying to run ES in embedded mode. The fix was to make sure the node had local(true) set.
Another possible reason for unassigned shards is that your cluster is running more than one version of the Elasticsearch binary.
shard replication from the more recent version to the previous
versions will not work
This can be a root cause for unassigned shards.
Elastic Documentation - Rolling Upgrade Process
I ran into exactly the same issue. This can be prevented by temporarily setting the shard allocation to false before restarting elasticsearch, but this does not fix the unassigned shards if they are already there.
In my case it was caused by a lack of free disk space on the data node. The unassigned shards were still on the data node after the restart, but they were not recognized by the master.
Just freeing up disk space on one of the nodes got the replication process started for me. This is a rather slow process because all the data has to be copied from one data node to the other.
I tried deleting the unassigned shards and manually assigning them to a particular data node. It didn't work, because unassigned shards kept appearing and the health status was "red" over and over.
Then I noticed that one of the data nodes was stuck in a "restart" state. I reduced the number of data nodes and killed it. The problem is not reproducible anymore.
I had two indices with unassigned shards that didn't seem to be self-healing. I eventually resolved this by temporarily adding an extra data-node[1]. After the indices became healthy and everything stabilized to green, I removed the extra node and the system was able to rebalance (again) and settle on a healthy state.
It's a good idea to avoid killing multiple data nodes at once (which is how I got into this state). Likely, I had failed to preserve any copies/replicas for at least one of the shards. Luckily, Kubernetes kept the disk storage around, and reused it when I relaunched the data-node.
...Some time has passed...
Well, this time just adding a node didn't seem to be working (after waiting several minutes for something to happen), so I started poking around in the REST API.
GET /_cluster/allocation/explain
This showed my new node with "decision": "YES".
By the way, all of the pre-existing nodes had "decision": "NO" due to "the node is above the low watermark cluster setting". So this was probably a different case than the one I had addressed previously.
Then I made the following simple POST[2] with no body, which kicked things into gear...
POST /_cluster/reroute
Other notes:
Very helpful: https://datadoghq.com/blog/elasticsearch-unassigned-shards
Something else that may work. Set cluster_concurrent_rebalance to 0, then to null -- as I demonstrate here.
[1] Pretty easy to do in Kubernetes if you have enough headroom: just scale out the stateful set via the dashboard.
[2] Using the Kibana "Dev Tools" interface, I didn't have to bother with SSH/exec shells.
I just first increased the
"index.number_of_replicas"
by 1 (waiting until the nodes were synced), then decreased it by 1 afterwards, which effectively removes the unassigned shards, and the cluster is green again without the risk of losing any data.
I believe there are better ways, but this was easier for me.
Hope this helps.
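A minimal sketch of that bump-and-restore approach (the index name "myindex" and an original replica count of 1 are assumptions; adjust to your setup):
# temporarily raise the replica count by one
curl -XPUT 'localhost:9200/myindex/_settings' -H 'Content-Type: application/json' -d '{"index.number_of_replicas": 2}'
# wait for the cluster to go green, then set it back
curl -XPUT 'localhost:9200/myindex/_settings' -H 'Content-Type: application/json' -d '{"index.number_of_replicas": 1}'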
When dealing with corrupted shards you can set the replication factor to 0 and then set it back to the original value. This should clear up most if not all your corrupted shards and relocate the new replicas in the cluster.
Setting indexes with unassigned replicas to use a replication factor of 0:
curl -XGET http://localhost:9200/_cat/shards |\
grep UNASSIGNED | grep ' r ' |\
awk '{print $1}' |\
xargs -I {} curl -XPUT http://localhost:9200/{}/_settings -H "Content-Type: application/json" \
-d '{ "index":{ "number_of_replicas": 0}}'
Setting them back to 1:
curl -XGET http://localhost:9200/_cat/shards |\
awk '{print $1}' |\
xargs -I {} curl -XPUT http://localhost:9200/{}/_settings -H "Content-Type: application/json" \
-d '{ "index":{ "number_of_replicas": 1}}'
Note: Do not run this if you have different replication factors for different indexes. This would hardcode the replication factor for all indexes to 1.
This may be caused by disk space as well.
In Elasticsearch 7.5.2, by default, if disk usage is above 85% then replica shards are not assigned to any other node.
This can be fixed by setting a different threshold or by disabling it, either in the .yml or via Kibana:
PUT _cluster/settings
{
"persistent": {
"cluster.routing.allocation.disk.threshold_enabled": "false"
}
}
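Alternatively, here is a hedged sketch of raising the thresholds instead of disabling the check entirely (the percentages are illustrative, not recommendations):
curl -XPUT 'localhost:9200/_cluster/settings' -H 'Content-Type: application/json' -d '{
  "persistent": {
    "cluster.routing.allocation.disk.watermark.low": "90%",
    "cluster.routing.allocation.disk.watermark.high": "95%",
    "cluster.routing.allocation.disk.watermark.flood_stage": "97%"
  }
}'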
First use the cluster health API to get the current health of the cluster, where RED means one or more primary shards are missing and YELLOW means one or more replica shards are missing.
After this, use the cluster allocation explain API to learn why a particular shard is missing and Elasticsearch is unable to allocate it on a data node.
Once you get the exact root cause, try to address the issue, which often requires changing a few cluster settings (mentioned in @wilfred's answer earlier). But in some cases, if it's a replica shard and you have another copy of the same shard (i.e. another replica) available, you can reduce the replica count using the update replica setting and later increase it again if you need to.
Apart from the above, if the cluster allocation API says there are no valid data nodes to allocate a shard to, then you need to add new data nodes or change the shard allocation awareness settings.
If you have an unassigned shard, usually the first step is to call the allocation explain API and look for the reason. Depending on the reason, you'd do something about it. Here are some that come to mind:
node doesn't have enough disk space (check disk-based allocation settings)
node can't allocate the shard because of some restrictions like allocation is disabled or allocation filtering or awareness (e.g. node is on the wrong side of the cluster, like the other availability zone or a hot or a warm node)
there's some error loading the shard. E.g. a checksum fails on files, there's a missing synonyms file referenced by an analyzer
Sometimes it helps to bump-start it, like using the Cluster Reroute API to allocate the shard manually, or disabling and re-enabling replicas.
If you need more info on operating Elasticsearch, check Sematext's Elasticsearch Operations training (disclaimer: I'm delivering it).
If you are using the AWS Elasticsearch Service, the suggestions above will not provide a solution. In that case, I backed up the index to the snapshot repository connected to S3, then deleted the index and restored it. It worked for me. Please make sure the backup completed successfully!
