Elasticsearch Searchguard unassigned shards - elasticsearch

I've run into an issue with unassigned Searchguard shards after adding new nodes to the Elasticsearch cluster. The cluster is located in a public cloud and has shard allocation awareness enabled with node.awareness.attributes: availability_zone. Searchguard has replica count auto-expand enabled by default. The problem reoccurs when I have three nodes in one zone and one node in each of the two other zones:
eu-central-1a = 3 nodes
eu-central-1b = 1 node
eu-central-1c = 1 node
I do understand this cluster configuration is somewhat imbalanced; it is just a replay of a production issue. I want to understand the logic of Elasticsearch and Searchguard and why it causes this issue. So here are the details:
{
"cluster_name" : "test-cluster",
"status" : "yellow",
"timed_out" : false,
"number_of_nodes" : 8,
"number_of_data_nodes" : 5,
"active_primary_shards" : 1032,
"active_shards" : 3096,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 1,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0,
"task_max_waiting_in_queue_millis" : 0,
"active_shards_percent_as_number" : 99.96771068776235
}
indices
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
yellow open searchguard GuL6pHCUTUKbmygbIsLAYw 1 4 5 0 131.3kb 35.6kb
explanation
"deciders" : [
{
"decider" : "same_shard",
"decision" : "NO",
"explanation" : "the shard cannot be allocated to the same node on which a copy of the shard already exists [[searchguard][0], node[a59ptCI2SfifBWmnmRoqxA], [R], s[STARTED], a[id=d3rMAN8xQi2xrTD3y_SUPA]]"
},
{
"decider" : "awareness",
"decision" : "NO",
"explanation" : "there are too many copies of the shard allocated to nodes with attribute [aws_availability_zone], there are [5] total configured shard copies for this shard id and [3] total attribute values, expected the allocated shard count per attribute [3] to be less than or equal to the upper bound of the required number of shards per attribute [2]"
}
]
searchguard config
{
"searchguard" : {
"settings" : {
"index" : {
"number_of_shards" : "1",
"auto_expand_replicas" : "0-all",
"provided_name" : "searchguard",
"creation_date" : "1554095156112",
"number_of_replicas" : "4",
"uuid" : "GuL6pHCUTUKbmygbIsLAYw",
"version" : {
"created" : "6020499"
}
}
}
}
}
Questions I have:
The searchguard config says "number_of_replicas" : "4", but the allocation explanation says there are [5] total configured shard copies. So is 5 the count including the primary? Even if so...
What is the problem with putting all three of these shards into one zone (eu-central-1a)? Even if that zone collapsed, we would still have two replicas in the other zones; isn't that enough to recover?
How does Elasticsearch calculate the required number of shards per attribute [2]? Considering this limitation, I can only go up to 2 * zones_count (2 * 3 = 6) shard copies for my cluster. That is really not much, so it looks like there should be a way to overcome this limit.
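My own back-of-the-envelope reading of the decider message (just my interpretation of the reported numbers, not something taken from the Elasticsearch source) is a simple ceiling division:
upper bound per zone = ceil(total shard copies / number of zones) = ceil(5 / 3) = 2
where the 5 copies would come from auto_expand_replicas: 0-all expanding to 1 primary + 4 replicas on 5 data nodes. Putting this replica into eu-central-1a would mean 3 copies in that zone, which is more than the upper bound of 2, which seems to be why the decision is NO.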

Related

Elasticsearch cluster dropping messages

In my cluster, when trying to insert a message (from Filebeat), I get:
(status=400): {"type":"illegal_argument_exception","reason":"Validation Failed: 1: this action would add [2] shards, but this cluster currently has [2999]/[3000] maximum normal shards open;"}, dropping event!
Looking at the cluster health, it looks like it has shards available:
"cluster_name" : "elastic",
"status" : "yellow",
"timed_out" : false,
"number_of_nodes" : 4,
"number_of_data_nodes" : 1,
"active_primary_shards" : 1511,
"active_shards" : 1511,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 1488,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 1159,
"number_of_in_flight_fetch" : 0,
"task_max_waiting_in_queue_millis" : 57267,
"active_shards_percent_as_number" : 50.3834611537179
Any ideas?
TL;DR
It seems like you only have one data node.
As you can see, you have 1511 allocated shards and 1488 unassigned shards.
And guess what happens if we add them up ^^
We get 2999.
Elasticsearch does not allow a primary shard and its replicas to be placed on the same node.
But since you have only one data node, all those unassigned shards are most likely replicas.
Solution
short term
Create your new index with 0 replicas:
PUT /my-index-000001
{
"settings": {
"index" : {
"number_of_replicas" : 0
}
}
}
You could also set all other indices to 0 replicas via:
PUT /*/_settings
{
"index" : {
"number_of_replicas" : 0
}
}
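Once that is applied, re-running the cluster health check (the same endpoint as the output quoted above) should show unassigned_shards dropping; with a single data node and 0 replicas everywhere, it should eventually reach 0:
GET _cluster/health?pretty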
Long term
You either need to reduce the replica count on all your indices, or add another data node so that those replicas can be allocated.
At the moment you are risking data loss if the node crashes.
Then, as per the documentation, you can also raise the cluster-wide shard limit (cluster.max_shards_per_node), which is what produces the [2999]/[3000] cap in the error message.
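If you genuinely need to go past that limit instead, a minimal sketch using the cluster settings API would look like the following; the value 4000 is only an illustration, not a recommendation, and raising the limit does not fix the underlying unassigned replicas:
PUT _cluster/settings
{
  "persistent": {
    "cluster.max_shards_per_node": 4000
  }
}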

Unexpected Disk Capacity issue using Elasticsearch cluster on EKS with a 8Exabytes EFS disk

I have configured an Elasticsearch cluster in my Kubernetes cluster (EKS). The Elasticsearch cluster has 3 nodes, and I have set up an 8E (exabyte) EFS disk for the nodes to store the data (thinking that I won't have any space issues for a while...).
[root@es-cluster-0 elasticsearch]# curl -s -XGET http://localhost:9200/_cat/allocation?v
shards disk.indices disk.used disk.avail disk.total disk.percent host ip node
36 66.7gb 966.1gb 8191.9pb 8191.9pb 0 10.65.32.184 10.65.32.184 es-cluster-0
33 82.6gb 966.1gb 8191.9pb 8191.9pb 0 10.65.32.202 10.65.32.202 es-cluster-2
37 76gb 966.1gb 8191.9pb 8191.9pb 0 10.65.32.178 10.65.32.178 es-cluster-1
14 UNASSIGNED
The cluster current health is:
[root@es-cluster-0 elasticsearch]# curl -s -XGET http://localhost:9200/_cluster/health?pretty
{
"cluster_name" : "k8s-logs",
"status" : "red",
"timed_out" : false,
"number_of_nodes" : 3,
"number_of_data_nodes" : 3,
"active_primary_shards" : 56,
"active_shards" : 106,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 14,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0,
"task_max_waiting_in_queue_millis" : 0,
"active_shards_percent_as_number" : 88.33333333333333
}
I can see that I have 14 "unassigned_shards", which matches perfectly with the last line of the /_cat/allocation output above.
When I started figuring out what was happening, I found this:
[root@es-cluster-0 elasticsearch]# curl -s -XGET http://localhost:9200/_cluster/allocation/explain?pretty
{
"index" : "logstash-2022.01.22",
"shard" : 0,
"primary" : false,
"current_state" : "unassigned",
"unassigned_info" : {
"reason" : "ALLOCATION_FAILED",
"at" : "2022-01-22T00:00:11.254Z",
"failed_allocation_attempts" : 5,
"details" : "failed shard on node [bf_GjmcUQGuCTk-_voh4Xw]: failed recovery, failure RecoveryFailedException[[logstash-2022.01.22][0]: Recovery failed from {es-cluster-0}{hYJ4ifx7R7yWJq6VFP3Drw}{jjAAtdcmQXeVpJXxj4DYcA}{10.65.32.184}{10.65.32.184:9300}{dilmrt}{ml.machine_memory=15878057984, ml.max_open_jobs=20, xpack.installed=true, transform.node=true} into {es-cluster-1}{bf_GjmcUQGuCTk-_voh4Xw}{QNp4DD51TQa716D4TjMFPg}{10.65.32.178}{10.65.32.178:9300}{dilmrt}{ml.machine_memory=15878057984, xpack.installed=true, transform.node=true, ml.max_open_jobs=20}]; nested: RemoteTransportException[[es-cluster-0][10.65.32.184:9300][internal:index/shard/recovery/start_recovery]]; nested: RemoteTransportException[[es-cluster-1][10.65.32.178:9300][internal:index/shard/recovery/clean_files]]; nested: UncategorizedExecutionException[Failed execution]; nested: NotSerializableExceptionWrapper[execution_exception: java.io.IOException: Disk quota exceeded]; nested: IOException[Disk quota exceeded]; ",
"last_allocation_status" : "no_attempt"
},
"can_allocate" : "no",
"allocate_explanation" : "cannot allocate because allocation is not permitted to any of the nodes",
"node_allocation_decisions" : [
{
"node_id" : "7WHft5LVTYCEWvwKM64A-w",
"node_name" : "es-cluster-2",
"transport_address" : "10.65.32.202:9300",
"node_attributes" : {
"ml.machine_memory" : "15878057984",
"ml.max_open_jobs" : "20",
"xpack.installed" : "true",
"transform.node" : "true"
},
--- TRUNCATED ---
I don't know why it's saying "Disk quota exceeded" when the Elasticsearch cluster is correctly reporting the capacity it has available in /_cat/allocation. Is there any additional configuration that I need to set up in order to tell Elasticsearch that we have enough space to work with?
See here for EFS limitations that can cause the disk quota error, which is not necessarily related to disk size. Generally, EFS does not support a sizeable Elasticsearch stack; for example, Elasticsearch expects 64K file descriptors per data node instance, but EFS only supports 32K at the moment. If you look into your Elasticsearch logs, there should be a clue about which limitation has been breached.
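Note also that the explain output above shows "failed_allocation_attempts" : 5, and Elasticsearch stops retrying a shard once it hits the maximum number of failed attempts. So after the underlying limitation is addressed, the shards will probably still need a manual retry, for example:
POST /_cluster/reroute?retry_failed=true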

elasticsearch = cannot allocate because allocation is not permitted to any of the nodes

I have elasticsearch as a single node cluster.
One of the indexes is yellow with the explanation below.
I have read all the material here and, in general, I did not find a solution to this problem.
here is the index info:
yellow open research-pdl 8_TrwZieRM6oBes8sGBUWg 1 1 416656058 0 77.9gb 77.9gb
This command, POST _cluster/reroute?retry_failed, does not seem to be doing anything.
The setup is running on Docker, and I have 650GB of free space.
{
"index" : "research-pdl",
"shard" : 0,
"primary" : false,
"current_state" : "unassigned",
"unassigned_info" : {
"reason" : "CLUSTER_RECOVERED",
"at" : "2020-12-16T05:21:19.977Z",
"last_allocation_status" : "no_attempt"
},
"can_allocate" : "no",
"allocate_explanation" : "cannot allocate because allocation is not permitted to any of the nodes",
"node_allocation_decisions" : [
{
"node_id" : "5zzXP2kCQ9eDI0U6WY4j9Q",
"node_name" : "37f65704d9bb",
"transport_address" : "172.19.0.2:9300",
"node_attributes" : {
"ml.machine_memory" : "67555622912",
"xpack.installed" : "true",
"transform.node" : "true",
"ml.max_open_jobs" : "20"
},
"node_decision" : "no",
"deciders" : [
{
"decider" : "same_shard",
"decision" : "NO",
"explanation" : "a copy of this shard is already allocated to this node [[research-pdl][0], node[5zzXP2kCQ9eDI0U6WY4j9Q], [P], s[STARTED], a[id=J7IX30jBSP2jXl5-IGp0BQ]]"
}
]
}
]
}
Thanks
The exception message is very clear: for high-availability reasons, Elasticsearch never assigns a replica to the same node that already holds a copy of that shard.
a copy of this shard is already allocated to this node
[[research-pdl][0], node[5zzXP2kCQ9eDI0U6WY4j9Q], [P], s[STARTED],
a[id=J7IX30jBSP2jXl5-IGp0BQ]]
And as you have a single-node cluster, there is no other node where your replicas can be assigned.
Solutions
Add more nodes to your cluster so that replicas can be assigned on other nodes (preferred way).
Reduce the replica count to 0; this can cause data loss and performance issues (do this only if you don't have the option to add data nodes and you want the green state for your cluster).
You can update the replica count using the update index settings API.
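For the second option, a minimal example using the update index settings API (with the index name from the question) might look like:
PUT /research-pdl/_settings
{
  "index" : {
    "number_of_replicas" : 0
  }
}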

elasticsearch: How to interpret log file (cluster went to yellow status)?

Elasticsearch 1.7.2 on CentOS, 8GB RAM, 2 node cluster.
We posted the whole log here: http://pastebin.com/zc2iG2q4
When we look at /_cluster/health, we see 2 unassigned shards:
{
"cluster_name" : "elasticsearch-prod",
"status" : "yellow", <--------------------------
"timed_out" : false,
"number_of_nodes" : 2,
"number_of_data_nodes" : 2,
"active_primary_shards" : 5,
"active_shards" : 8,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 2, <--------------------------
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0
In the log, we see:
marking and sending shard failed due to [failed to create shard]
java.lang.OutOfMemoryError: Java heap space
And other errors.
The only memory related config value we have is:
indices.fielddata.cache.size: 75%
We are looking to:
understand the log more completely
understand what action we need to take to address the situation now (recover) and prevent it in the future
Additional details:
1) ES_HEAP_SIZE is stock, no changes. (Further, looking around, it is not clear where best to change it.... /etc/init.d/elasticsearch ?)
2) Our JVM stats are below. (And please note, as a test, I modified "/etc/init.d/elasticsearch" and added export ES_HEAP_SIZE=4g [in place of the existing "export ES_HEAP_SIZE" line] and restarted ES.... Comparing two identical nodes, one with the changed elasticsearch file and one stock, the values below appear identical.)
"jvm" : {
"timestamp" : 1448395039780,
"uptime_in_millis" : 228297,
"mem" : {
"heap_used_in_bytes" : 81418872,
"heap_used_percent" : 7,
"heap_committed_in_bytes" : 259522560,
"heap_max_in_bytes" : 1037959168,
"non_heap_used_in_bytes" : 50733680,
"non_heap_committed_in_bytes" : 51470336,
"pools" : {
"young" : {
"used_in_bytes" : 52283368,
"max_in_bytes" : 286326784,
"peak_used_in_bytes" : 71630848,
"peak_max_in_bytes" : 286326784
},
"survivor" : {
"used_in_bytes" : 2726824,
"max_in_bytes" : 35782656,
"peak_used_in_bytes" : 8912896,
"peak_max_in_bytes" : 35782656
},
"old" : {
"used_in_bytes" : 26408680,
"max_in_bytes" : 715849728,
"peak_used_in_bytes" : 26408680,
"peak_max_in_bytes" : 715849728
}
}
},
"threads" : {
"count" : 81,
"peak_count" : 81
},
"gc" : {
"collectors" : {
"young" : {
"collection_count" : 250,
"collection_time_in_millis" : 477
},
"old" : {
"collection_count" : 1,
"collection_time_in_millis" : 22
}
}
},
"buffer_pools" : {
"direct" : {
"count" : 112,
"used_in_bytes" : 20205138,
"total_capacity_in_bytes" : 20205138
},
"mapped" : {
"count" : 0,
"used_in_bytes" : 0,
"total_capacity_in_bytes" : 0
}
}
},
Solved.
The key here is the error "java.lang.OutOfMemoryError: Java heap space"
Another day, another gem from the ES docs:
https://www.elastic.co/guide/en/elasticsearch/guide/current/heap-sizing.html
says (emphasis mine):
The default installation of Elasticsearch is configured with a 1 GB heap. For just about every deployment, this number is far too small. If you are using the default heap values, your cluster is probably configured incorrectly.
Resolution:
Edit: /etc/sysconfig/elasticsearch
Set ES_HEAP_SIZE=4g // this system has 8GB RAM
Restart ES
And tada.... the unassigned shards are magically assigned, and the cluster goes green.
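One way to sanity-check that the new heap actually took effect is the same JVM stats endpoint quoted in the question, for example:
curl -s 'http://localhost:9200/_nodes/stats/jvm?pretty'
With ES_HEAP_SIZE=4g, heap_max_in_bytes should report roughly 4GB instead of the ~1GB shown above.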

Elasticsearch replica shard recovery failed, yellow status

During an Elasticsearch recovery process, a primary shard was lost and a replica shard was promoted. But after that, when the cluster tried to recover this shard, we got the following exception repeatedly:
RecoverFilesRecoveryException[[my-index][0] Failed to transfer [279] files with total size of [12.6gb]]; nested: RemoteTransportException[File corruption occured on recovery but checksums are ok]; ]]
In addition to that, the recovery status response for this shard was:
{
"id" : 0,
"type" : "REPLICA",
"stage" : "INDEX",
"primary" : false,
...
"index" : {
"files" : {
"total" : 279,
"reused" : 0,
"recovered" : 262,
"percent" : "93.9%"
},
"bytes" : {
"total" : 13630355592,
"reused" : 0,
"recovered" : 7036450677,
"percent" : "51.6%"
},
And after 2 minutes
"index" : {
"files" : {
"total" : 279,
"reused" : 0,
"recovered" : 276,
"percent" : "98.9%"
},
"bytes" : {
"total" : 13630355592,
"reused" : 0,
"recovered" : 10500690274,
"percent" : "77.0%"
}
The above numbers went up and down for quite some time, but the shard never really recovered and the cluster status remained yellow!
Is there a way to make cluster green again? Perhaps delete this replica somehow?
A dirty solution was to delete the index containing the above shard and reindex it again.
ES version: 1.3.1
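Regarding "Perhaps delete this replica somehow?": one less destructive thing that is sometimes tried before deleting the whole index (not what was done here, and it may well run into the same corruption error again) is dropping the replica and letting it rebuild from the promoted primary by toggling the replica count with the update index settings API:
PUT /my-index/_settings
{ "index" : { "number_of_replicas" : 0 } }
and then, once the cluster has settled:
PUT /my-index/_settings
{ "index" : { "number_of_replicas" : 1 } }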

Resources