Unable to move Elasticsearch shards - elasticsearch

I have an Elasticsearch cluster with two nodes and eight shard copies (four primaries and four replicas). I am in the situation where all the primaries are on one node and all the replicas are on the other.
Running the command:
http://xx.xx.xx.1:9200/_cat/shards
returns this result:
myindex 2 r STARTED 16584778 1.4gb xx.xx.xx.2 node2
myindex 2 p STARTED 16584778 1.4gb xx.xx.xx.1 node1
myindex 1 r STARTED 16592755 1.4gb xx.xx.xx.2 node2
myindex 1 p STARTED 16592755 1.4gb xx.xx.xx.1 node1
myindex 3 r STARTED 16592009 1.4gb xx.xx.xx.2 node2
myindex 3 p STARTED 16592033 1.4gb xx.xx.xx.1 node1
myindex 0 r STARTED 16610776 1.3gb xx.xx.xx.2 node2
myindex 0 p STARTED 16610776 1.3gb xx.xx.xx.1 node1
I am trying to swap certain shards around by POSTing to:
http://xx.xx.xx.1:9200/_cluster/reroute?explain
with this body:
{
"commands" : [
{
"move" : {
"index" : "myindex",
"shard" : 1,
"from_node" : "node1",
"to_node" : "node2"
}
},
{
"allocate_replica" : {
"index" : "myindex",
"shard" : 1,
"node" : "node1"
}
}
]
}
It doesn't work, and the only "NO" I get in the list of decisions in the explanations is:
{
"decider": "same_shard",
"decision": "NO",
"explanation": "the shard cannot be allocated on the same node id [xxxxxxxxxxxxxxxxxxxxxx] on which it already exists"
},
It's not fully clear to me if this is the actual error, but there is no other negative feedback. How can I resolve this and move my shard?

This is expected: the same_shard decider refuses to put two copies of shard 1 on the same node, and in a two-node cluster with one replica there is nowhere else for either copy to go.
Why would you do that, though? Primaries and replicas do the same job, so there is nothing to gain by swapping them.
What problem do you think this would solve?
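If you still want a fuller breakdown of which deciders block a given shard copy, the cluster allocation explain API (available from Elasticsearch 5.0 onwards) reports a per-node decision list. A minimal sketch, reusing the index and shard from the question:
curl -XGET 'http://xx.xx.xx.1:9200/_cluster/allocation/explain?pretty' -H 'Content-Type: application/json' -d '{
  "index": "myindex",
  "shard": 1,
  "primary": true
}'
The response lists every node with a YES/NO decision per decider, which is usually clearer than the reroute?explain output.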

Related

Elasticsearch doesn't allow to allocate unassigned shard

I have an ES cluster of 2 nodes. After I restarted the nodes, the cluster status turned yellow because some of the shards are unassigned. I've googled around and the common solution is to reroute the unassigned shards, but unfortunately that doesn't work for me.
curl localhost:9200/_cluster/health?pretty=true
{
"cluster_name" : "infra",
"status" : "yellow",
"timed_out" : false,
"number_of_nodes" : 2,
"number_of_data_nodes" : 2,
"active_primary_shards" : 34,
"active_shards" : 68,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 31,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0,
"task_max_waiting_in_queue_millis" : 0,
"active_shards_percent_as_number" : 68.68686868686868
}
curl localhost:9200/_cluster/settings?pretty
{
"persistent" : { },
"transient" : {
"cluster" : {
"routing" : {
"allocation" : {
"enable" : "all"
}
}
}
}
}
curl localhost:9200/_cat/indices?v
health status index pri rep docs.count docs.deleted store.size pri.store.size
yellow open logstash-log-2016.05.13 5 2 88314 0 300.5mb 150.2mb
yellow open logstash-log-2016.05.12 5 2 254450 0 833.9mb 416.9mb
yellow open .kibana 1 2 3 0 47.8kb 25.2kb
green open .marvel-es-data-1 1 1 3 0 8.7kb 4.3kb
yellow open logstash-log-2016.05.11 5 2 313095 0 709.1mb 354.6mb
yellow open logstash-log-2016.05.10 5 2 613744 0 1gb 520.2mb
green open .marvel-es-1-2016.05.18 1 1 88720 495 89.9mb 45mb
green open .marvel-es-1-2016.05.17 1 1 69430 492 59.4mb 29.7mb
yellow open logstash-log-2016.05.17 5 2 188924 0 518.2mb 259mb
yellow open logstash-log-2016.05.18 5 2 226775 0 683.7mb 366.1mb
Rerouting
curl -XPOST 'localhost:9200/_cluster/reroute?pretty' -d '{
"commands": [
{
"allocate": {
"index": "logstash-log-2016.05.13",
"shard": 3,
"node": "elasticsearch-mon-1",
"allow_primary": true
}
}
]
}'
{
"error" : {
"root_cause" : [ {
"type" : "illegal_argument_exception",
"reason" : "[allocate] allocation of [logstash-log-2016.05.13][3] on node {elasticsearch-mon-1}{K-J8WKyZRB6bE4031kHkKA}{172.45.0.56}{172.45.0.56:9300} is not allowed, reason: [YES(allocation disabling is ignored)][NO(shard cannot be allocated on same node [K-J8WKyZRB6bE4031kHkKA] it already exists on)][YES(no allocation awareness enabled)][YES(allocation disabling is ignored)][YES(target node version [2.3.2] is same or newer than source node version [2.3.2])][YES(primary is already active)][YES(total shard limit disabled: [index: -1, cluster: -1] <= 0)][YES(shard not primary or relocation disabled)][YES(node passes include/exclude/require filters)][YES(enough disk for shard on node, free: [25.4gb])][YES(below shard recovery limit of [2])]"
} ],
"type" : "illegal_argument_exception",
"reason" : "[allocate] allocation of [logstash-log-2016.05.13][3] on node {elasticsearch-mon-1}{K-J8WKyZRB6bE4031kHkKA}{172.45.0.56}{172.45.0.56:9300} is not allowed, reason: [YES(allocation disabling is ignored)][NO(shard cannot be allocated on same node [K-J8WKyZRB6bE4031kHkKA] it already exists on)][YES(no allocation awareness enabled)][YES(allocation disabling is ignored)][YES(target node version [2.3.2] is same or newer than source node version [2.3.2])][YES(primary is already active)][YES(total shard limit disabled: [index: -1, cluster: -1] <= 0)][YES(shard not primary or relocation disabled)][YES(node passes include/exclude/require filters)][YES(enough disk for shard on node, free: [25.4gb])][YES(below shard recovery limit of [2])]"
},
"status" : 400
}
curl -XPOST 'localhost:9200/_cluster/reroute?pretty' -d '{
"commands": [
{
"allocate": {
"index": "logstash-log-2016.05.13",
"shard": 3,
"node": "elasticsearch-mon-2",
"allow_primary": true
}
}
]
}'
{
"error" : {
"root_cause" : [ {
"type" : "illegal_argument_exception",
"reason" : "[allocate] allocation of [logstash-log-2016.05.13][3] on node {elasticsearch-mon-2}{Rxgq2aWPSVC0pvUW2vBgHA}{172.45.0.166}{172.45.0.166:9300} is not allowed, reason: [YES(allocation disabling is ignored)][NO(shard cannot be allocated on same node [Rxgq2aWPSVC0pvUW2vBgHA] it already exists on)][YES(no allocation awareness enabled)][YES(allocation disabling is ignored)][YES(target node version [2.3.2] is same or newer than source node version [2.3.2])][YES(primary is already active)][YES(total shard limit disabled: [index: -1, cluster: -1] <= 0)][YES(shard not primary or relocation disabled)][YES(node passes include/exclude/require filters)][YES(enough disk for shard on node, free: [25.4gb])][YES(below shard recovery limit of [2])]"
} ],
"type" : "illegal_argument_exception",
"reason" : "[allocate] allocation of [logstash-log-2016.05.13][3] on node {elasticsearch-mon-2}{Rxgq2aWPSVC0pvUW2vBgHA}{172.45.0.166}{172.45.0.166:9300} is not allowed, reason: [YES(allocation disabling is ignored)][NO(shard cannot be allocated on same node [Rxgq2aWPSVC0pvUW2vBgHA] it already exists on)][YES(no allocation awareness enabled)][YES(allocation disabling is ignored)][YES(target node version [2.3.2] is same or newer than source node version [2.3.2])][YES(primary is already active)][YES(total shard limit disabled: [index: -1, cluster: -1] <= 0)][YES(shard not primary or relocation disabled)][YES(node passes include/exclude/require filters)][YES(enough disk for shard on node, free: [25.4gb])][YES(below shard recovery limit of [2])]"
},
"status" : 400
}
So it fails and doesn't make any change; the shards are still unassigned.
Thank you.
Added
curl localhost:9200/_cat/shards
logstash-log-2016.05.13 2 p STARTED 17706 31.6mb 172.45.0.166 elasticsearch-mon-2
logstash-log-2016.05.13 2 r STARTED 17706 31.5mb 172.45.0.56 elasticsearch-mon-1
logstash-log-2016.05.13 2 r UNASSIGNED
logstash-log-2016.05.13 4 p STARTED 17698 31.6mb 172.45.0.166 elasticsearch-mon-2
logstash-log-2016.05.13 4 r STARTED 17698 31.4mb 172.45.0.56 elasticsearch-mon-1
logstash-log-2016.05.13 4 r UNASSIGNED
All of the indices that are yellow are configured with 2 replicas:
health status index pri rep
yellow open logstash-log-2016.05.13 5 2
yellow open logstash-log-2016.05.12 5 2
yellow open .kibana 1 2
yellow open logstash-log-2016.05.11 5 2
yellow open logstash-log-2016.05.10 5 2
yellow open logstash-log-2016.05.17 5 2
yellow open logstash-log-2016.05.18 5 2
Two replicas on a two-node cluster is impossible: a node can never hold more than one copy of the same shard, so you would need a third node for all the replicas to be assigned.
Alternatively, decrease the number of replicas:
PUT /logstash-log-*,.kibana/_settings
{
"index": {
"number_of_replicas": 1
}
}
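For reference, the same change as a curl request against the node used above:
curl -XPUT 'localhost:9200/logstash-log-*,.kibana/_settings' -d '{
  "index": {
    "number_of_replicas": 1
  }
}'
Note that this only affects existing indices; if the logstash index template still specifies 2 replicas, new daily indices will come up yellow again until the template is changed as well.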
I had the same problem with version 5.1.2.
I tried the option below and it worked:
curl -XPUT 'localhost:9200/_cluster/settings' -d '{
  "transient": {
    "cluster.routing.allocation.enable" : "all"
  }
}'
After this, the shards were automatically allocated.
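On 5.x, if shards stay unassigned because they have already hit the allocation retry limit, you can also ask the cluster to retry them explicitly (a sketch; the retry_failed flag exists from Elasticsearch 5.0 onwards):
curl -XPOST 'localhost:9200/_cluster/reroute?retry_failed=true'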

Remove unassigned Graylog2 index/shards with Bash in a loop

There were some disk issues on a Graylog2 server I use for debug logs. There are unassigned shards now:
curl -XGET http://host:9200/_cat/shards
graylog_292 1 p STARTED 751733 648.4mb 127.0.1.1 Doctor Leery
graylog_292 1 r UNASSIGNED
graylog_292 2 p STARTED 756663 653.2mb 127.0.1.1 Doctor Leery
graylog_292 2 r UNASSIGNED
graylog_290 0 p STARTED 299059 257.2mb 127.0.1.1 Doctor Leery
graylog_290 0 r UNASSIGNED
graylog_290 3 p STARTED 298759 257.1mb 127.0.1.1 Doctor Leery
graylog_290 3 r UNASSIGNED
graylog_290 1 p STARTED 298314 257.3mb 127.0.1.1 Doctor Leery
graylog_290 1 r UNASSIGNED
graylog_290 2 p STARTED 297722 257.1mb 127.0.1.1 Doctor Leery
graylog_290 2 r UNASSIGNED
....
There are over 400 such shards. I can delete them without data loss, because it's a single-node setup. To do this I need to loop over the index (graylog_xxx) and over the shard number (1, 2, ...).
How do I loop over these two variables with Bash? As far as I know, there are two variables in the API call that I need to substitute:
curl -XPOST 'host:9200/_cluster/reroute' -d '{
"commands" : [ {
"allocate" : {
"index" : "$index",
"shard" : $shard,
"node" : "Doctor Leery",
"allow_primary" : true
}
}
]
}'
What also bothers me is that the unassigned shards have no node, yet the API call requires me to specify one.
From the _cat/shards output you shared, those simply look like unassigned replicas, which you can remove by updating the index settings and setting the replica count to 0, like this:
curl -XPUT 'localhost:9200/_settings' -d '{
"index" : {
"number_of_replicas" : 0
}
}'
After running the above curl, your cluster will be green again.
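If you did still want to loop over every unassigned shard and issue one reroute per (index, shard) pair, a minimal Bash sketch could look like this; it assumes the host is localhost:9200 and the node name from the question, and simply parses the index and shard columns of the _cat/shards output for UNASSIGNED rows. Bear in mind that on a single node these replica copies can never actually be assigned (the same_shard decider forbids two copies on one node), so dropping the replica count as shown above is the real fix:
#!/usr/bin/env bash
# Fire one reroute per unassigned (index, shard) pair found in _cat/shards.
curl -s 'localhost:9200/_cat/shards' | awk '$4 == "UNASSIGNED" {print $1, $2}' |
while read -r index shard; do
  curl -s -XPOST 'localhost:9200/_cluster/reroute' -d "{
    \"commands\": [{
      \"allocate\": {
        \"index\": \"$index\",
        \"shard\": $shard,
        \"node\": \"Doctor Leery\",
        \"allow_primary\": true
      }
    }]
  }"
  echo
done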

elasticsearch - remove a second elasticsearch node and add another node, get unassigned shards

I'm new to Elasticsearch; I have only been using it for two weeks and I just did something silly.
My cluster has two nodes: one master/data node (version 1.4.2) and one non-data node (version 1.1.1). Because of the version conflict, I decided to shut down and delete the non-data node and then install another data node (version 1.4.2). See my diagram for an easier picture; node3 was then renamed node2.
Then I checked the cluster health:
{
"cluster_name":"elasticsearch",
"status":"yellow",
"timed_out":false,
"number_of_nodes":2,
"number_of_data_nodes":2,
"active_primary_shards":725,
"active_shards":1175,
"relocating_shards":0,
"initializing_shards":0,
"unassigned_shards":273
}
Check the cluster state
curl -XGET http://localhost:9200/_cat/shards
logstash-2015.03.25 2 p STARTED 3031 621.1kb 10.146.134.94 node1
logstash-2015.03.25 2 r UNASSIGNED
logstash-2015.03.25 0 p STARTED 3084 596.4kb 10.146.134.94 node1
logstash-2015.03.25 0 r UNASSIGNED
logstash-2015.03.25 3 p STARTED 3177 608.4kb 10.146.134.94 node1
logstash-2015.03.25 3 r UNASSIGNED
logstash-2015.03.25 1 p STARTED 3099 577.3kb 10.146.134.94 node1
logstash-2015.03.25 1 r UNASSIGNED
logstash-2014.12.30 4 r STARTED 10.146.134.94 node2
logstash-2014.12.30 4 p STARTED 94 114.3kb 10.146.134.94 node1
logstash-2014.12.30 0 r STARTED 111 195.8kb 10.146.134.94 node2
logstash-2014.12.30 0 p STARTED 111 195.8kb 10.146.134.94 node1
logstash-2014.12.30 3 r STARTED 110 144kb 10.146.134.94 node2
logstash-2014.12.30 3 p STARTED 110 144kb 10.146.134.94 node1
I have read the related questions below and tried to follow them, but no luck. I also commented on one answer with the error I got.
ElasticSearch: Unassigned Shards, how to fix?
https://t37.net/how-to-fix-your-elasticsearch-cluster-stuck-in-initializing-shards-mode.html
elasticsearch - what to do with unassigned shards
http://www.elastic.co/guide/en/elasticsearch/reference/current/cluster-reroute.html#cluster-reroute
curl -XPOST 'localhost:9200/_cluster/reroute' -d '{
"commands" : [ {
"allocate" : {
"index" : "logstash-2015.03.25",
"shard" : 4,
"node" : "node2",
"allow_primary" : true
}
}
]
}'
I get
"routing_nodes":{"unassigned":[{"state":"UNASSIGNED","primary":false,"node":null,
"relocating_node":null,"shard":0,"index":"logstash-2015.03.25"}
And I followed the answer in https://stackoverflow.com/a/23781013/1920536
curl -XPUT 'localhost:9200/_cluster/settings' -d '{
"transient" : {
"cluster.routing.allocation.enable" : "all"
}
}'
but it had no effect.
What should I do?
Thanks in advance.
Update: when I check the pending tasks, it shows this:
{"tasks":[{"insert_order":88401,"priority":"HIGH","source":"shard-failed
([logstash-2015.01.19][3], node[PVkS47JyQQq6G-lstUW04w], [R], s[INITIALIZING]),
**reason [Failed to start shard, message** [RecoveryFailedException[[logstash-2015.01.19][3]: **Recovery failed from** [node1][_72bJJX0RuW7AyM86WUgtQ]
[localhost][inet[/localhost:9300]]{master=true} into [node2][PVkS47JyQQq6G-lstUW04w]
[localhost][inet[/localhost:9302]]{master=false}];
nested: RemoteTransportException[[node1][inet[/localhost:9300]]
[internal:index/shard/recovery/start_recovery]]; nested: RecoveryEngineException[[logstash-2015.01.19][3] Phase[2] Execution failed];
nested: RemoteTransportException[[node2][inet[/localhost:9302]][internal:index/shard/recovery/prepare_translog]];
nested: EngineCreationFailureException[[logstash-2015.01.19][3] **failed to create engine];
nested: FileSystemException**[data/elasticsearch/nodes/0/indices/logstash-2015.01.19/3/index/_0.si: **Too many open files**]; ]]","executing":true,"time_in_queue_millis":53,"time_in_queue":"53ms"}]}
If you have two nodes like this:
1) Node-1 - ES 1.4.2
2) Node-2 - ES 1.1.1
follow these steps to debug:
1) Stop the Elasticsearch instance on Node-2.
2) Install Elasticsearch 1.4.2 on the new node.
Change elasticsearch.yml to match the master node's configuration, especially these three settings:
cluster.name: <Same as master node>
node.name: < Node name for Node-2>
discovery.zen.ping.unicast.hosts: <Master Node IP>
3) Restart Elasticsearch on Node-2.
4) Check the Node-1 logs.
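Separately, the pending-tasks output in the question ends with a "Too many open files" error, so recovery will keep failing until the file-descriptor limit on the node is raised. A quick way to check the limit Elasticsearch actually got (a sketch; the exact way to raise it depends on how the service is started):
curl 'localhost:9200/_nodes/process?pretty'
# look for "max_file_descriptors" in the output; if it is low, raise the OS limit
# (e.g. ulimit -n 65536 before starting the node, or LimitNOFILE in a systemd unit) and restart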

How to move shards around in a cluster

I have a 5-node cluster with 5 indices and 5 shards per index. Currently the shards of each index are evenly distributed across the nodes. I need to move the shards belonging to 2 different indices from a specific node to a different node in the same cluster.
You can use the cluster reroute API.
A sample command looks like this:
curl -XPOST 'localhost:9200/_cluster/reroute' -H 'Content-Type: application/json' -d '{
"commands" : [ {
"move" :
{
"index" : "test", "shard" : 0,
"from_node" : "node1", "to_node" : "node2"
}
}
]
}'
This moves shard 0 of the index test from node1 to node2.
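Since the question involves shards of two different indices, note that a single reroute request can carry several move commands. A sketch with hypothetical index and node names:
curl -XPOST 'localhost:9200/_cluster/reroute' -H 'Content-Type: application/json' -d '{
  "commands" : [
    { "move" : { "index" : "index_a", "shard" : 0, "from_node" : "node1", "to_node" : "node2" } },
    { "move" : { "index" : "index_b", "shard" : 2, "from_node" : "node1", "to_node" : "node3" } }
  ]
}'
Bear in mind that the balancer may move shards again later unless you also constrain allocation, for example with allocation filtering on the index.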

Get rid of unassigned shard

I have an ELK stack with two Elasticsearch nodes running, and the cluster state has turned red due to an unassigned shard that I can't get rid of. Looking up the unassigned shard, or rather the incomplete index, with:
# curl -s elastic01.local:9200/_cat/shards | grep "logstash-2014.09.29"
Shows:
logstash-2014.09.29 4 p STARTED 745489 481.3mb 10.165.98.107 Crimson and the Raven
logstash-2014.09.29 4 r STARTED 745489 481.3mb 10.165.98.106 Glenn Talbot
logstash-2014.09.29 0 p STARTED 781110 502.3mb 10.165.98.107 Crimson and the Raven
logstash-2014.09.29 0 r STARTED 781110 502.3mb 10.165.98.106 Glenn Talbot
logstash-2014.09.29 3 p INITIALIZING 10.165.98.107 Crimson and the Raven
logstash-2014.09.29 3 r UNASSIGNED
logstash-2014.09.29 1 p STARTED 762991 490.1mb 10.165.98.107 Crimson and the Raven
logstash-2014.09.29 1 r STARTED 762991 490.1mb 10.165.98.106 Glenn Talbot
logstash-2014.09.29 2 p STARTED 761811 491.3mb 10.165.98.107 Crimson and the Raven
logstash-2014.09.29 2 r STARTED 761811 491.3mb 10.165.98.106 Glenn Talbot
My attempt to assign the shard to the other node fails:
curl -XPOST -s 'http://elastic01.local:9200/_cluster/reroute?pretty=true' -d '{
"commands" : [ {
"allocate" : {
"index" : "logstash-2014.09.29",
"shard" : 3 ,
"node" : "Glenn Talbot",
"allow_primary" : 1
}
}
]
}'
With:
NO(primary shard is not yet active)]
I can't really seem to find an API to push the shard states any further. How could I proceed here?
Just for a complete picture, that what the system health looks like:
{
"cluster_name" : "logstash_es",
"status" : "red",
"timed_out" : false,
"number_of_nodes" : 2,
"number_of_data_nodes" : 2,
"active_primary_shards" : 114,
"active_shards" : 228,
"relocating_shards" : 0,
"initializing_shards" : 1,
"unassigned_shards" : 1
}
Thank you for your time and help
I actually ran into this situation with Elasticsearch 1.5 just the other day. After initially getting the same error, I simply repeated the /_cluster/reroute request the next day for lack of other ideas, and it worked, putting the cluster back into a green state immediately.
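If waiting and retrying feels too blind, you can at least watch whether the INITIALIZING primary is making progress before re-issuing the reroute. A sketch using the cat recovery API with the host and index from the question:
curl -s 'elastic01.local:9200/_cat/recovery/logstash-2014.09.29?v'
# shows per-shard recovery stage and progress; once shard 3's primary reaches the done
# stage it becomes active, and the allocate command for the replica should then be accepted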

Resources