Elasticsearch shard relocation not working

I added 12 new data nodes to an existing cluster of 8 data nodes. I am trying to shut down the previous 8 nodes using "exclude allocation", as recommended:
curl -XPUT localhost:9200/_cluster/settings -d '{
  "transient" : {
    "cluster.routing.allocation.exclude._ip" : "10.0.0.1"
  }
}'
It wasn't relocating any shards, so I ran the reroute command with the 'explain' option. Can someone explain what the following output is saying?
> "explanations" : [ {
> "command" : "move",
> "parameters" : {
> "index" : "2015-09-20",
> "shard" : 0,
> "from_node" : "_dDn1SmqSquhMGgjti6vGg",
> "to_node" : "OQBFMt17RaWboOzMnUy2jA"
> },
> "decisions" : [ {
> "decider" : "same_shard",
> "decision" : "YES",
> "explanation" : "shard is not allocated to same node or host"
> }, {
> "decider" : "filter",
> "decision" : "YES",
> "explanation" : "node passes include/exclude/require filters"
> }, {
> "decider" : "replica_after_primary_active",
> "decision" : "YES",
> "explanation" : "shard is primary"
> }, {
> "decider" : "throttling",
> "decision" : "YES",
> "explanation" : "below shard recovery limit of [16]"
> }, {
> "decider" : "enable",
> "decision" : "YES",
> "explanation" : "allocation disabling is ignored"
> }, {
> "decider" : "disable",
> "decision" : "YES",
> "explanation" : "allocation disabling is ignored"
> }, {
> "decider" : "awareness",
> "decision" : "NO",
> "explanation" : "too many shards on nodes for attribute: [dc]" }, {
> "decider" : "shards_limit",
> "decision" : "YES",
> "explanation" : "total shard limit disabled: [-1] <= 0"
> }, {
> "decider" : "node_version",
> "decision" : "YES",
> "explanation" : "target node version [1.4.5] is same or newer than source node version [1.4.5]"
> }, {
> "decider" : "disk_threshold",
> "decision" : "YES",
> "explanation" : "enough disk for shard on node, free: [1.4tb]"
> }, {
> "decider" : "snapshot_in_progress",
> "decision" : "YES", "explanation" : "no snapshots are currently running"
>

If you have replicas, you can simply switch off your old nodes one by one, waiting each time for the cluster to become green again.
You don't need to explicitly reroute in that case.
That said, your explain output shows the awareness decider returning NO ("too many shards on nodes for attribute: [dc]"), so shard allocation awareness on a dc attribute is configured in your elasticsearch.yml and is what's blocking the relocation. You should check those settings.
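To confirm, you can check the dynamic cluster settings and each node's elasticsearch.yml; a minimal sketch (the dc attribute name comes from your explain output, the values are illustrative):
curl -XGET 'localhost:9200/_cluster/settings?pretty'
# and in elasticsearch.yml on each node, look for lines like:
# cluster.routing.allocation.awareness.attributes: dc
# cluster.routing.allocation.awareness.force.dc.values: dc1,dc2
# node.dc: dc1
If the 12 new nodes are missing the dc attribute, forced awareness can refuse to move shards onto them.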

You can install the kopf plugin; it helps you manage Elasticsearch nodes and makes a task like this much simpler.
You can download it here: https://github.com/lmenezes/elasticsearch-kopf .
Other plugins with support are listed here: https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-plugins.html .
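If you go that route, kopf is typically installed with the plugin script; a hedged sketch (replace the {branch|version} placeholder with the one matching your ES version, per the kopf README):
./bin/plugin --install lmenezes/elasticsearch-kopf/{branch|version}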

Related

default location for single primary shard on elastic search?

I have an ES 7.8 cluster to store log data, one index per tenant.
As you can see, the default index.number_of_shards is one. (Please ignore the fact that I don't have any replicas; the data are just imported.)
This looks problematic, as all primary shards are located on the same node. How can I assign them evenly across different nodes when creating the index?
Update1:
$ curl -sk 'myhost:19081/_cluster/settings?pretty'
{
"persistent" : { },
"transient" : { }
}
$ curl -sk 'myhost:19081/_cluster/allocation/explain?pretty&include_disk_info=true&include_yes_decisions=true'
{
"error" : {
"root_cause" : [
{
"type" : "illegal_argument_exception",
"reason" : "unable to find any unassigned shards to explain [ClusterAllocationExplainRequest[useAnyUnassignedShard=true,includeYesDecisions?=true]"
}
],
"type" : "illegal_argument_exception",
"reason" : "unable to find any unassigned shards to explain [ClusterAllocationExplainRequest[useAnyUnassignedShard=true,includeYesDecisions?=true]"
},
"status" : 400
}
$ curl -sk 'myhost:19081/_cat/nodeattrs?v'
node host ip attr value
node-001 10.96.110.92 10.96.110.92 ml.machine_memory 99750834176
node-001 10.96.110.92 10.96.110.92 ml.max_open_jobs 20
node-001 10.96.110.92 10.96.110.92 xpack.installed true
node-001 10.96.110.92 10.96.110.92 transform.node true
node-004 10.96.108.179 10.96.108.179 ml.machine_memory 99531649024
node-004 10.96.108.179 10.96.108.179 ml.max_open_jobs 20
node-004 10.96.108.179 10.96.108.179 xpack.installed true
node-004 10.96.108.179 10.96.108.179 transform.node true
node-003 10.96.113.19 10.96.113.19 ml.machine_memory 99531649024
node-003 10.96.113.19 10.96.113.19 ml.max_open_jobs 20
node-003 10.96.113.19 10.96.113.19 xpack.installed true
node-003 10.96.113.19 10.96.113.19 transform.node true
node-002 10.96.112.213 10.96.112.213 ml.machine_memory 99531649024
node-002 10.96.112.213 10.96.112.213 ml.max_open_jobs 20
node-002 10.96.112.213 10.96.112.213 xpack.installed true
node-002 10.96.112.213 10.96.112.213 transform.node true
node-005 10.96.101.214 10.96.101.214 ml.machine_memory 99516563456
node-005 10.96.101.214 10.96.101.214 ml.max_open_jobs 20
node-005 10.96.101.214 10.96.101.214 xpack.installed true
node-005 10.96.101.214 10.96.101.214 transform.node true
$ curl -sk 'myhost:19081/_all/_settings?include_defaults&filter_path=**.allocation&pretty'
{
// several hundred other identical index entries omitted
"my_index_1" : {
"defaults" : {
"index" : {
"routing" : {
"allocation" : {
"enable" : "all",
"total_shards_per_node" : "-1"
}
},
"allocation" : {
"max_retries" : "5",
"existing_shards_allocator" : "gateway_allocator"
}
}
}
}
}
Update2:
curl -sk -HContent-Type:application/json -d ' {"index": "my_index_1", "shard": 0, "primary": true }' 'myhost:19081/_cluster/allocation/explain?pretty&include_disk_info=true&include_yes_decisions=true'
{
"index" : "my_index_1",
"shard" : 0,
"primary" : true,
"current_state" : "started",
"current_node" : {
"id" : "CNyCF4_eTmCQYXh_Bhb0KQ",
"name" : "node004",
"transport_address" : "10.96.108.179:9300",
"attributes" : {
"ml.machine_memory" : "99531649024",
"ml.max_open_jobs" : "20",
"xpack.installed" : "true",
"transform.node" : "true"
},
"weight_ranking" : 1
},
"cluster_info" : {
"nodes" : {
"CNyCF4_eTmCQYXh_Bhb0KQ" : {
"node_name" : "node004",
"least_available" : {
"path" : "/data3/nodes/0",
"total_bytes" : 15999772393472,
"used_bytes" : 23527976960,
"free_bytes" : 15976244416512,
"free_disk_percent" : 99.9,
"used_disk_percent" : 0.1
},
"most_available" : {
"path" : "/data2/nodes/0",
"total_bytes" : 15999772393472,
"used_bytes" : 19824119808,
"free_bytes" : 15979948273664,
"free_disk_percent" : 99.9,
"used_disk_percent" : 0.1
}
},
"xiR8clLRSVirvkmlyDpgXg" : {
"node_name" : "node001",
"least_available" : {
"path" : "/data1/nodes/0",
"total_bytes" : 15999896125440,
"used_bytes" : 2815332352,
"free_bytes" : 15997080793088,
"free_disk_percent" : 100.0,
"used_disk_percent" : 0.0
},
"most_available" : {
"path" : "/data3/nodes/0",
"total_bytes" : 15999896125440,
"used_bytes" : 278740992,
"free_bytes" : 15999617384448,
"free_disk_percent" : 100.0,
"used_disk_percent" : 0.0
}
},
"afbAZaznQwaRtryF7yI4dA" : {
"node_name" : "node003",
"least_available" : {
"path" : "/data1/nodes/0",
"total_bytes" : 15999836385280,
"used_bytes" : 34533376,
"free_bytes" : 15999801851904,
"free_disk_percent" : 100.0,
"used_disk_percent" : 0.0
},
"most_available" : {
"path" : "/data1/nodes/0",
"total_bytes" : 15999836385280,
"used_bytes" : 34533376,
"free_bytes" : 15999801851904,
"free_disk_percent" : 100.0,
"used_disk_percent" : 0.0
}
},
"vhFAg67YSgquqP8tR-s98w" : {
"node_name" : "node002",
"least_available" : {
"path" : "/data1/nodes/0",
"total_bytes" : 15999836385280,
"used_bytes" : 34537472,
"free_bytes" : 15999801847808,
"free_disk_percent" : 100.0,
"used_disk_percent" : 0.0
},
"most_available" : {
"path" : "/data1/nodes/0",
"total_bytes" : 15999836385280,
"used_bytes" : 34537472,
"free_bytes" : 15999801847808,
"free_disk_percent" : 100.0,
"used_disk_percent" : 0.0
}
},
"KL8hcVTJTBmN9MTa3fX8eQ" : {
"node_name" : "node005",
"least_available" : {
"path" : "/data1/nodes/0",
"total_bytes" : 15999772393472,
"used_bytes" : 34983936,
"free_bytes" : 15999737409536,
"free_disk_percent" : 100.0,
"used_disk_percent" : 0.0
},
"most_available" : {
"path" : "/data1/nodes/0",
"total_bytes" : 15999772393472,
"used_bytes" : 34983936,
"free_bytes" : 15999737409536,
"free_disk_percent" : 100.0,
"used_disk_percent" : 0.0
}
}
},
"shard_sizes" : {
"[my_index_1][0][p]_bytes" : 2120083,
// several hundred others redacted
},
"shard_paths" : {
"[my_index_1][0], node[CNyCF4_eTmCQYXh_Bhb0KQ], [P], s[STARTED], a[id=dqceFOaFT0ugDALnFEJWvg]" : "/data2/nodes/0",
// several hundred others redacted
}
},
"can_remain_on_current_node" : "yes",
"can_rebalance_cluster" : "yes",
"can_rebalance_to_other_node" : "no",
"rebalance_explanation" : "cannot rebalance as no target node exists that can both allocate this shard and improve the cluster balance",
"node_allocation_decisions" : [
{
"node_id" : "KL8hcVTJTBmN9MTa3fX8eQ",
"node_name" : "node005",
"transport_address" : "10.96.101.214:9300",
"node_attributes" : {
"ml.machine_memory" : "99516563456",
"ml.max_open_jobs" : "20",
"xpack.installed" : "true",
"transform.node" : "true"
},
"node_decision" : "worse_balance",
"weight_ranking" : 1,
"deciders" : [
{
"decider" : "max_retry",
"decision" : "YES",
"explanation" : "shard has no previous failures"
},
{
"decider" : "replica_after_primary_active",
"decision" : "YES",
"explanation" : "shard is primary and can be allocated"
},
{
"decider" : "enable",
"decision" : "YES",
"explanation" : "all allocations are allowed"
},
{
"decider" : "node_version",
"decision" : "YES",
"explanation" : "can relocate primary shard from a node with version [7.8.0] to a node with equal-or-newer version [7.8.0]"
},
{
"decider" : "snapshot_in_progress",
"decision" : "YES",
"explanation" : "no snapshots are currently running"
},
{
"decider" : "restore_in_progress",
"decision" : "YES",
"explanation" : "ignored as shard is not being recovered from a snapshot"
},
{
"decider" : "filter",
"decision" : "YES",
"explanation" : "node passes include/exclude/require filters"
},
{
"decider" : "same_shard",
"decision" : "YES",
"explanation" : "this node does not hold a copy of this shard"
},
{
"decider" : "disk_threshold",
"decision" : "YES",
"explanation" : "enough disk for shard on node, free: [14.5tb], shard size: [2mb], free after allocating shard: [14.5tb]"
},
{
"decider" : "throttling",
"decision" : "YES",
"explanation" : "below shard recovery limit of outgoing: [0 < 2] incoming: [0 < 2]"
},
{
"decider" : "shards_limit",
"decision" : "YES",
"explanation" : "total shard limits are disabled: [index: -1, cluster: -1] <= 0"
},
{
"decider" : "awareness",
"decision" : "YES",
"explanation" : "allocation awareness is not enabled, set cluster setting [cluster.routing.allocation.awareness.attributes] to enable it"
}
]
},
{
"node_id" : "afbAZaznQwaRtryF7yI4dA",
"node_name" : "node003",
"transport_address" : "10.96.113.19:9300",
"node_attributes" : {
"ml.machine_memory" : "99531649024",
"ml.max_open_jobs" : "20",
"xpack.installed" : "true",
"transform.node" : "true"
},
"node_decision" : "worse_balance",
"weight_ranking" : 1,
"deciders" : [
{
"decider" : "max_retry",
"decision" : "YES",
"explanation" : "shard has no previous failures"
},
{
"decider" : "replica_after_primary_active",
"decision" : "YES",
"explanation" : "shard is primary and can be allocated"
},
{
"decider" : "enable",
"decision" : "YES",
"explanation" : "all allocations are allowed"
},
{
"decider" : "node_version",
"decision" : "YES",
"explanation" : "can relocate primary shard from a node with version [7.8.0] to a node with equal-or-newer version [7.8.0]"
},
{
"decider" : "snapshot_in_progress",
"decision" : "YES",
"explanation" : "no snapshots are currently running"
},
{
"decider" : "restore_in_progress",
"decision" : "YES",
"explanation" : "ignored as shard is not being recovered from a snapshot"
},
{
"decider" : "filter",
"decision" : "YES",
"explanation" : "node passes include/exclude/require filters"
},
{
"decider" : "same_shard",
"decision" : "YES",
"explanation" : "this node does not hold a copy of this shard"
},
{
"decider" : "disk_threshold",
"decision" : "YES",
"explanation" : "enough disk for shard on node, free: [14.5tb], shard size: [2mb], free after allocating shard: [14.5tb]"
},
{
"decider" : "throttling",
"decision" : "YES",
"explanation" : "below shard recovery limit of outgoing: [0 < 2] incoming: [0 < 2]"
},
{
"decider" : "shards_limit",
"decision" : "YES",
"explanation" : "total shard limits are disabled: [index: -1, cluster: -1] <= 0"
},
{
"decider" : "awareness",
"decision" : "YES",
"explanation" : "allocation awareness is not enabled, set cluster setting [cluster.routing.allocation.awareness.attributes] to enable it"
}
]
},
{
"node_id" : "vhFAg67YSgquqP8tR-s98w",
"node_name" : "node002",
"transport_address" : "10.96.112.213:9300",
"node_attributes" : {
"ml.machine_memory" : "99531649024",
"ml.max_open_jobs" : "20",
"xpack.installed" : "true",
"transform.node" : "true"
},
"node_decision" : "worse_balance",
"weight_ranking" : 1,
"deciders" : [
{
"decider" : "max_retry",
"decision" : "YES",
"explanation" : "shard has no previous failures"
},
{
"decider" : "replica_after_primary_active",
"decision" : "YES",
"explanation" : "shard is primary and can be allocated"
},
{
"decider" : "enable",
"decision" : "YES",
"explanation" : "all allocations are allowed"
},
{
"decider" : "node_version",
"decision" : "YES",
"explanation" : "can relocate primary shard from a node with version [7.8.0] to a node with equal-or-newer version [7.8.0]"
},
{
"decider" : "snapshot_in_progress",
"decision" : "YES",
"explanation" : "no snapshots are currently running"
},
{
"decider" : "restore_in_progress",
"decision" : "YES",
"explanation" : "ignored as shard is not being recovered from a snapshot"
},
{
"decider" : "filter",
"decision" : "YES",
"explanation" : "node passes include/exclude/require filters"
},
{
"decider" : "same_shard",
"decision" : "YES",
"explanation" : "this node does not hold a copy of this shard"
},
{
"decider" : "disk_threshold",
"decision" : "YES",
"explanation" : "enough disk for shard on node, free: [14.5tb], shard size: [2mb], free after allocating shard: [14.5tb]"
},
{
"decider" : "throttling",
"decision" : "YES",
"explanation" : "below shard recovery limit of outgoing: [0 < 2] incoming: [0 < 2]"
},
{
"decider" : "shards_limit",
"decision" : "YES",
"explanation" : "total shard limits are disabled: [index: -1, cluster: -1] <= 0"
},
{
"decider" : "awareness",
"decision" : "YES",
"explanation" : "allocation awareness is not enabled, set cluster setting [cluster.routing.allocation.awareness.attributes] to enable it"
}
]
},
{
"node_id" : "xiR8clLRSVirvkmlyDpgXg",
"node_name" : "node001",
"transport_address" : "10.96.110.92:9300",
"node_attributes" : {
"ml.machine_memory" : "99750834176",
"ml.max_open_jobs" : "20",
"xpack.installed" : "true",
"transform.node" : "true"
},
"node_decision" : "worse_balance",
"weight_ranking" : 1,
"deciders" : [
{
"decider" : "max_retry",
"decision" : "YES",
"explanation" : "shard has no previous failures"
},
{
"decider" : "replica_after_primary_active",
"decision" : "YES",
"explanation" : "shard is primary and can be allocated"
},
{
"decider" : "enable",
"decision" : "YES",
"explanation" : "all allocations are allowed"
},
{
"decider" : "node_version",
"decision" : "YES",
"explanation" : "can relocate primary shard from a node with version [7.8.0] to a node with equal-or-newer version [7.8.0]"
},
{
"decider" : "snapshot_in_progress",
"decision" : "YES",
"explanation" : "no snapshots are currently running"
},
{
"decider" : "restore_in_progress",
"decision" : "YES",
"explanation" : "ignored as shard is not being recovered from a snapshot"
},
{
"decider" : "filter",
"decision" : "YES",
"explanation" : "node passes include/exclude/require filters"
},
{
"decider" : "same_shard",
"decision" : "YES",
"explanation" : "this node does not hold a copy of this shard"
},
{
"decider" : "disk_threshold",
"decision" : "YES",
"explanation" : "enough disk for shard on node, free: [14.5tb], shard size: [2mb], free after allocating shard: [14.5tb]"
},
{
"decider" : "throttling",
"decision" : "YES",
"explanation" : "below shard recovery limit of outgoing: [0 < 2] incoming: [0 < 2]"
},
{
"decider" : "shards_limit",
"decision" : "YES",
"explanation" : "total shard limits are disabled: [index: -1, cluster: -1] <= 0"
},
{
"decider" : "awareness",
"decision" : "YES",
"explanation" : "allocation awareness is not enabled, set cluster setting [cluster.routing.allocation.awareness.attributes] to enable it"
}
]
}
]
}
Elasticsearch by default tries to spread the shards evenly across all data nodes; in your case it is really strange that all the shards are on the same data node.
You should debug the cause of this. Hopefully you don't have a single data node in your cluster; please provide your cluster settings so we can learn more about your setup.
Also provide the output of the shard allocation explain API.
For the time being, you can manually move these shards to other data nodes using the cluster reroute API, for example:
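(A minimal sketch; the index and node names are taken from the output above, and shard 0 is just an example.)
curl -sk -H 'Content-Type: application/json' -XPOST 'myhost:19081/_cluster/reroute?pretty' -d '{
  "commands" : [
    {
      "move" : {
        "index" : "my_index_1", "shard" : 0,
        "from_node" : "node-004", "to_node" : "node-001"
      }
    }
  ]
}'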
Elasticsearch automatically takes care of allocating shards to different nodes.
Try rebalancing the cluster; that may fix the problem:
https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-cluster.html#:~:text=Elasticsearch%20runs%20an%20automatic%20process,from%20completely%20balancing%20the%20cluster.
As it turns out, my cluster had cluster.routing.allocation.balance.shard set to zero.
I solved this with:
PUT /_cluster/settings
{
  "persistent" : {
    "cluster.routing.allocation.balance.shard" : "0.45"
  }
}
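You can then verify that the primaries have spread out; a quick check, assuming the same host and port as in the question:
curl -sk 'myhost:19081/_cat/shards?v'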

elasticsearch - cannot allocate because allocation is not permitted to any of the nodes

I have Elasticsearch running as a single-node cluster.
One of the indexes is yellow, with the explanation below.
I have read all the material here and elsewhere, and I did not find a solution to this problem.
Here is the index info:
yellow open research-pdl 8_TrwZieRM6oBes8sGBUWg 1 1 416656058 0 77.9gb 77.9gb
The command POST _cluster/reroute?retry_failed does not seem to do anything.
The setup is running on Docker, and I have 650GB of free space.
{
"index" : "research-pdl",
"shard" : 0,
"primary" : false,
"current_state" : "unassigned",
"unassigned_info" : {
"reason" : "CLUSTER_RECOVERED",
"at" : "2020-12-16T05:21:19.977Z",
"last_allocation_status" : "no_attempt"
},
"can_allocate" : "no",
"allocate_explanation" : "cannot allocate because allocation is not permitted to any of the nodes",
"node_allocation_decisions" : [
{
"node_id" : "5zzXP2kCQ9eDI0U6WY4j9Q",
"node_name" : "37f65704d9bb",
"transport_address" : "172.19.0.2:9300",
"node_attributes" : {
"ml.machine_memory" : "67555622912",
"xpack.installed" : "true",
"transform.node" : "true",
"ml.max_open_jobs" : "20"
},
"node_decision" : "no",
"deciders" : [
{
"decider" : "same_shard",
"decision" : "NO",
"explanation" : "a copy of this shard is already allocated to this node [[research-pdl][0], node[5zzXP2kCQ9eDI0U6WY4j9Q], [P], s[STARTED], a[id=J7IX30jBSP2jXl5-IGp0BQ]]"
}
]
}
]
}
Thanks
The exception message is very clear: Elasticsearch never assigns a replica to the same node as its primary shard, for high-availability reasons.
a copy of this shard is already allocated to this node
[[research-pdl][0], node[5zzXP2kCQ9eDI0U6WY4j9Q], [P], s[STARTED],
a[id=J7IX30jBSP2jXl5-IGp0BQ]]
And as you have a single-node cluster, there is no other node where your replicas can be assigned.
Solutions
Add more nodes to your cluster, so that replicas can be assigned on other nodes (the preferred way).
Reduce the replica count to 0; this removes redundancy (risking data loss if the node fails) and can hurt search performance. (Only do this if you cannot add data nodes and you want a green cluster.)
You can update the replica count with the index settings update API, as shown below.
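A minimal sketch for option 2, using the index name from the question:
PUT /research-pdl/_settings
{
  "index" : {
    "number_of_replicas" : 0
  }
}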

Elastic Search ICU plugin installed but failure IllegalArgumentException Unknown char_filter type

I'm using Elasticsearch 5.6.16.
I installed the analysis-icu plugin, but my shard allocation fails, and this is the reason behind it:
"node_id" : "wcF3Ob3ATKu5jB3_ur2k8w",
"node_name" : "prod-08-gce",
"transport_address" : "10.10.80.12:9300",
"node_decision" : "no",
"deciders" : [
{
"decider" : "max_retry",
"decision" : "NO",
"explanation" : "shard has exceeded the maximum number of retries [5] on failed allocation attempts - manually call [/_cluster/reroute?retry_failed=true] to retry, [unassigned_info[[reason=ALLOCATION_FAILED], at[2020-06-19T19:06:20.436Z], failed_attempts[5], delayed=false, details[failed to create index, failure IllegalArgumentException[Unknown char_filter type [icu_normalizer] for [icu_normalizer_casesensitive]]], allocation_status[no_attempt]]]"
}
]
}
]
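The failure text itself hints at the remedy: the node prod-08-gce does not recognize the icu_normalizer char_filter, which usually means the analysis-icu plugin is missing (or mismatched) on that node. A hedged sketch of the usual fix for ES 5.6, run on every node followed by a restart, then the retry call named in the message:
bin/elasticsearch-plugin install analysis-icu
curl -XPOST 'localhost:9200/_cluster/reroute?retry_failed=true'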

elasticsearch - there is no copy of the shard available?

I have a few indices in red after a system failure caused by a full disk.
But I cannot reallocate the lost shard. It says "there is no copy of the shard available".
curl -XGET 'localhost:9200/_cluster/allocation/explain?pretty'
{
"shard" : {
"index" : "my_index",
"index_uuid" : "iNY9t81wQf6wJc-KqufUrg",
"id" : 0,
"primary" : true
},
"assigned" : false,
"shard_state_fetch_pending" : false,
"unassigned_info" : {
"reason" : "ALLOCATION_FAILED",
"at" : "2017-05-30T07:33:04.192Z",
"failed_attempts" : 5,
"delayed" : false,
"details" : "failed to create shard, failure FileSystemException[/data/es/storage/nodes/0/indices/iNY9t81wQf6wJc-KqufUrg/0/_state/state-13.st.tmp: Read-only file system]",
"allocation_status" : "deciders_no"
},
"allocation_delay_in_millis" : 60000,
"remaining_delay_in_millis" : 0,
"nodes" : {
"KvOd2vSQTOSgjgqyEnOKpA" : {
"node_name" : "node1",
"node_attributes" : { },
"store" : {
"shard_copy" : "NONE"
},
"final_decision" : "NO",
"final_explanation" : "there is no copy of the shard available",
"weight" : -3.683333,
"decisions" : [
{
"decider" : "max_retry",
"decision" : "NO",
"explanation" : "shard has already failed allocating [5] times vs. [5] retries allowed unassigned_info[[reason=ALLOCATION_FAILED], at[2017-05-30T07:33:04.192Z], failed_attempts[5], delayed=false, details[failed to create shard, failure FileSystemException[/data/es/storage/nodes/0/indices/iNY9t81wQf6wJc-KqufUrg/0/_state/state-13.st.tmp: Read-only file system]], allocation_status[deciders_no]] - manually call [/_cluster/reroute?retry_failed=true] to retry"
}
]
},
"pC9fL41xRgeZDAEYvNR1eQ" : {
"node_name" : "node2",
"node_attributes" : { },
"store" : {
"shard_copy" : "AVAILABLE"
},
"final_decision" : "NO",
"final_explanation" : "the shard cannot be assigned because one or more allocation decider returns a 'NO' decision",
"weight" : -2.333333,
"decisions" : [
{
"decider" : "max_retry",
"decision" : "NO",
"explanation" : "shard has already failed allocating [5] times vs. [5] retries allowed unassigned_info[[reason=ALLOCATION_FAILED], at[2017-05-30T07:33:04.192Z], failed_attempts[5], delayed=false, details[failed to create shard, failure FileSystemException[/data/es/storage/nodes/0/indices/iNY9t81wQf6wJc-KqufUrg/0/_state/state-13.st.tmp: Read-only file system]], allocation_status[deciders_no]] - manually call [/_cluster/reroute?retry_failed=true] to retry"
}
]
},
"1g7eCfEQS9u868lFSoo7FQ" : {
"node_name" : "node3",
"node_attributes" : { },
"store" : {
"shard_copy" : "AVAILABLE"
},
"final_decision" : "NO",
"final_explanation" : "the shard cannot be assigned because one or more allocation decider returns a 'NO' decision",
"weight" : 40.866665,
"decisions" : [
{
"decider" : "max_retry",
"decision" : "NO",
"explanation" : "shard has already failed allocating [5] times vs. [5] retries allowed unassigned_info[[reason=ALLOCATION_FAILED], at[2017-05-30T07:33:04.192Z], failed_attempts[5], delayed=false, details[failed to create shard, failure FileSystemException[/data/es/storage/nodes/0/indices/iNY9t81wQf6wJc-KqufUrg/0/_state/state-13.st.tmp: Read-only file system]], allocation_status[deciders_no]] - manually call [/_cluster/reroute?retry_failed=true] to retry"
}
]
}
}
}
I tried basically every option of the reroute command (documentation here), but it gives me a 400 error, like this:
curl -XPOST 'localhost:9200/_cluster/reroute?pretty' -H 'Content-Type: application/json' -d'
{
"commands" : [
{
"allocate_replica" : {
"index" : "myindex", "shard" : 0,
"node" : "node2"
}
}
]
}'
response:
{
"error" : {
"root_cause" : [
{
"type" : "illegal_argument_exception",
"reason" : "[allocate_replica] trying to allocate a replica shard [myindex][0], while corresponding primary shard is still unassigned"
}
],
"type" : "illegal_argument_exception",
"reason" : "[allocate_replica] trying to allocate a replica shard [myindex][0], while corresponding primary shard is still unassigned"
},
"status" : 400
}
Try this:
curl -XPOST 'xx.xxx.xx:9200/_cluster/reroute' -H 'Content-Type: application/json' -d '{
  "commands" : [
    {
      "allocate_stale_primary" : {
        "index" : "myindex", "shard" : 0,
        "node" : "node2",
        "accept_data_loss" : true
      }
    }
  ]
}'
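Note that accept_data_loss acknowledges that any writes which reached only the lost primary copy are gone. If the shard still shows failed_attempts at its limit afterwards, your explain output itself names the reset call:
curl -XPOST 'localhost:9200/_cluster/reroute?retry_failed=true'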

strange {"OK":{}} response on ElasticSearch curl -X GET 'http://localhost:9200'

On one of my nodes in my ElasticSearch cluster I get the following strange response:
Command:
curl -X GET 'http://localhost:9200'
Response:
{"OK":{}}
Not sure what to do about this? Anyone run into this before?
UPDATE:
This is what I get when I call (I replaced IPs with xxx):
curl -XGET localhost:9200/_nodes/jvm?human\&pretty
{
"cluster_name" : "elasticsearch",
"nodes" : {
"dtUV63D4RBq9JXw_o03-eg" : {
"name" : "elasticsearch1",
"transport_address" : "inet[xxx/xxx:9300]",
"host" : "elasticsearch1",
"ip" : "xxx",
"version" : "1.3.2",
"build" : "dee175d",
"http_address" : "inet[/xxx:9200]",
"jvm" : {
"pid" : 1471,
"version" : "1.7.0_65",
"vm_name" : "OpenJDK 64-Bit Server VM",
"vm_version" : "24.65-b04",
"vm_vendor" : "Oracle Corporation",
"start_time" : "2014-11-19T14:50:10.408Z",
"start_time_in_millis" : 1416408610408,
"mem" : {
"heap_init" : "4gb",
"heap_init_in_bytes" : 4294967296,
"heap_max" : "3.9gb",
"heap_max_in_bytes" : 4277534720,
"non_heap_init" : "23.1mb",
"non_heap_init_in_bytes" : 24313856,
"non_heap_max" : "214mb",
"non_heap_max_in_bytes" : 224395264,
"direct_max" : "3.9gb",
"direct_max_in_bytes" : 4277534720
},
"gc_collectors" : [ "ParNew", "ConcurrentMarkSweep" ],
"memory_pools" : [ "Code Cache", "Par Eden Space", "Par Survivor Space", "CMS Old Gen", "CMS Perm Gen" ]
}
},
"8eGVx6IGQ8qiFTc4rnaG3A" : {
"name" : "elasticsearch2",
"transport_address" : "inet[/xxx:9300]",
"host" : "elasticsearch2",
"ip" : "xxx",
"version" : "1.3.2",
"build" : "dee175d",
"http_address" : "inet[/xxx:9200]",
"jvm" : {
"pid" : 1476,
"version" : "1.7.0_65",
"vm_name" : "OpenJDK 64-Bit Server VM",
"vm_version" : "24.65-b04",
"vm_vendor" : "Oracle Corporation",
"start_time" : "2014-11-19T14:54:33.909Z",
"start_time_in_millis" : 1416408873909,
"mem" : {
"heap_init" : "4gb",
"heap_init_in_bytes" : 4294967296,
"heap_max" : "3.9gb",
"heap_max_in_bytes" : 4277534720,
"non_heap_init" : "23.1mb",
"non_heap_init_in_bytes" : 24313856,
"non_heap_max" : "214mb",
"non_heap_max_in_bytes" : 224395264,
"direct_max" : "3.9gb",
"direct_max_in_bytes" : 4277534720
},
"gc_collectors" : [ "ParNew", "ConcurrentMarkSweep" ],
"memory_pools" : [ "Code Cache", "Par Eden Space", "Par Survivor Space", "CMS Old Gen", "CMS Perm Gen" ]
}
}
}
}
Elasticsearch 1.3.2 alone is not capable of producing such a response: there is simply no "OK" string in the production source code. So I would guess somebody installed a plugin on this node that intercepts some calls and replaces the response with this message.
One of the plugins that does this is the elasticsearch-http-basic plugin, which indeed displays {"OK":{}} to unauthorized users instead of a full response. You can verify the presence of this and other plugins by executing the following command on the node that gives you these responses:
curl "localhost:9200/_nodes/plugins?pretty"
