How to free space in an Elasticsearch cluster

My Elasticsearch cluster status is red, apparently due to low disk space, but GET /_cat/allocation?v&pretty shows 6.8 GB free on both nodes.
Can anyone help me?
shards disk.indices disk.used disk.avail disk.total disk.percent host    ip      node
     6       25.5gb    27.3gb      6.8gb     34.2gb           80 x.x.x.x x.x.x.x
     6       25.5gb    27.3gb      6.8gb     34.2gb           80 x.x.x.x x.x.x.x

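You can adjust the disk watermarks, as described in the Elasticsearch documentation on disk-based shard allocation. With absolute values, each setting is the minimum free space required, so the values must decrease from low to flood_stage:
curl -XPUT "localhost:9200/_cluster/settings" -H 'Content-Type: application/json' -d '{
"transient": {
"cluster.routing.allocation.disk.watermark.low": "1gb",
"cluster.routing.allocation.disk.watermark.high": "500mb",
"cluster.routing.allocation.disk.watermark.flood_stage": "200mb",
"cluster.info.update.interval": "1m"
}
}'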
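Before changing the watermarks, it may help to confirm that a node actually exceeds one. The following is a minimal sketch, assuming the default low watermark of 85% disk usage and the column order of the _cat/allocation?v output shown in the question. The function name nodes_over and the addresses and node names in the sample are made up for illustration.

```shell
#!/bin/sh
# Print "node percent" for rows of `_cat/allocation?v` output whose
# disk.percent (column 6) is at or above a threshold. Reads the table on stdin.
nodes_over() {
  threshold="$1"
  awk -v t="$threshold" 'NR > 1 && $6 ~ /^[0-9]+$/ && $6 + 0 >= t + 0 { print $NF, $6 "%" }'
}

# Sample shaped like the output in the question (addresses and node names are hypothetical).
sample='shards disk.indices disk.used disk.avail disk.total disk.percent host ip node
6 25.5gb 27.3gb 6.8gb 34.2gb 80 10.0.0.1 10.0.0.1 node-a
6 25.5gb 27.3gb 6.8gb 34.2gb 80 10.0.0.2 10.0.0.2 node-b'

# At 80% usage neither node reaches the assumed default low watermark of 85%,
# so this prints nothing.
printf '%s\n' "$sample" | nodes_over 85

# Against a live cluster (assumes Elasticsearch listens on localhost:9200):
# curl -s 'localhost:9200/_cat/allocation?v' | nodes_over 85
```

If nothing is printed for the low watermark, the red status probably has another cause, and GET /_cluster/allocation/explain is a better place to look than the watermark settings.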

Related

elasticsearch 7 is in YELLOW state with unassigned shard

I'm unable to get a single-node cluster (one master node) into a green state:
elasticsearch: 7.17.0
I guess that is because I have unassigned_shards > 0
config:
apiVersion: v1
data:
  elasticsearch.yml: |-
    discovery:
      type: single-node
    network:
      host: 0.0.0.0
    path:
      data: /bitnami/elasticsearch/data
    xpack:
      ml:
        enabled: false
kind: ConfigMap
metadata:
  labels:
    app.kubernetes.io/instance: elasticsearch
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: elasticsearch
    argocd.argoproj.io/instance: elasticsearch
    helm.sh/chart: elasticsearch-19.5.5
  name: elasticsearch
  namespace: elasticsearch
kubectl logs elasticsearch-master-0
[2022-12-25T07:52:40,652][INFO ][o.e.c.r.a.AllocationService] [elasticsearch-master-0] Cluster health status changed from [RED] to [YELLOW] (reason: [shards started [[.ds-ilm-history-5-2022.10.28-000014][0], [.ds-ilm-history-5-2022.09.28-000012][0]]]).
[2022-12-25T07:52:40,856][INFO ][o.e.i.g.GeoIpDownloader ] [elasticsearch-master-0] updating geoip database [GeoLite2-ASN.mmdb]
✗ curl -XGET http://localhost:9200/_cluster/health\?pretty\=true
{
"cluster_name" : "elasticsearch",
"status" : "yellow",
"timed_out" : false,
"number_of_nodes" : 1,
"number_of_data_nodes" : 1,
"active_primary_shards" : 7,
"active_shards" : 7,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 1,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0,
"task_max_waiting_in_queue_millis" : 0,
"active_shards_percent_as_number" : 87.5
}
➜ curl -XGET http://localhost:9200/_cat/shards
magento2_product_1_v583 0 p STARTED 4868 18.9mb 10.110.4.229 elasticsearch-master-0
.ds-ilm-history-5-2022.10.28-000014 0 p STARTED 10.110.4.229 elasticsearch-master-0
.ds-ilm-history-5-2022.11.27-000015 0 p STARTED 10.110.4.229 elasticsearch-master-0
.ds-ilm-history-5-2022.08.29-000010 0 p STARTED 10.110.4.229 elasticsearch-master-0
.ds-ilm-history-5-2022.09.28-000012 0 p STARTED 10.110.4.229 elasticsearch-master-0
.geoip_databases 0 p STARTED 40 38.1mb 10.110.4.229 elasticsearch-master-0
.ds-.logs-deprecation.elasticsearch-default-2022.12.21-000022 0 p STARTED 10.110.4.229 elasticsearch-master-0
.ds-.logs-deprecation.elasticsearch-default-2022.12.21-000022 0 r UNASSIGNED
I tried to delete it but got an error:
➜ curl -XGET http://localhost:9200/_cat/shards | grep UNASSIGNED | awk {'print $1'} | xargs -i curl -XDELETE "http://localhost:9200/{}"
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 986 100 986 0 0 5241 0 --:--:-- --:--:-- --:--:-- 5244
{"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"index [.ds-.logs-deprecation.elasticsearch-default-2022.12.21-000022] is the write index for data stream [.logs-deprecation.elasticsearch-default] and cannot be deleted"}],"type":"illegal_argument_exception","reason":"index [.ds-.logs-deprecation.elasticsearch-default-2022.12.21-000022] is the write index for data stream [.logs-deprecation.elasticsearch-default] and cannot be deleted"},"status":400}
GET /_cluster/allocation/explain:
➜ curl -XGET http://localhost:9200/_cluster/allocation/explain\?pretty\=true | jq
{
"note": "No shard was specified in the explain API request, so this response explains a randomly chosen unassigned shard. There may be other unassigned shards in this cluster which cannot be assigned for different reasons. It may not be possible to assign this shard until one of the other shards is assigned correctly. To explain the allocation of other shards (whether assigned or unassigned) you must specify the target shard in the request to this API.",
"index": ".ds-.logs-deprecation.elasticsearch-default-2022.12.21-000022",
"shard": 0,
"primary": false,
"current_state": "unassigned",
"unassigned_info": {
"reason": "CLUSTER_RECOVERED",
"at": "2022-12-25T07:52:37.022Z",
"last_allocation_status": "no_attempt"
},
"can_allocate": "no",
"allocate_explanation": "cannot allocate because allocation is not permitted to any of the nodes",
"node_allocation_decisions": [
{
"node_id": "aURccTcnSuqPC3fBfmezCg",
"node_name": "elasticsearch-master-0",
"transport_address": "10.110.4.229:9300",
"node_attributes": {
"xpack.installed": "true",
"transform.node": "true"
},
"node_decision": "no",
"deciders": [
{
"decider": "same_shard",
"decision": "NO",
"explanation": "a copy of this shard is already allocated to this node [[.ds-.logs-deprecation.elasticsearch-default-2022.12.21-000022][0], node[aURccTcnSuqPC3fBfmezCg], [P], s[STARTED], a[id=tsxhnODlSn-i__-vEvJj3A]]"
}
]
}
]
}
So what can be done in such a scenario?
curl -v -XPUT "localhost:9200/*/_settings" -H 'Content-Type: application/json' -d '
{
"index" : {
"number_of_replicas" : 0
}
}
'
{"acknowledged":true}
curl -XGET http://localhost:9200/_cat/indices
green open magento2_product_1_v583 hvYpUxJUT16-g6_YS8qkaA 1 0 4868 0 18.9mb 18.9mb
green open .geoip_databases tDXBLQRdSFeQyi6Pk5zq2Q 1 0 40 40 38.1mb 38.1mb
Yellow status indicates that one or more replica shards in the Elasticsearch cluster are not allocated to a node. With only one node, an index that has at least one replica needs more nodes than you have, because Elasticsearch never assigns a replica to the same node as its primary shard. So with a single node it is perfectly normal and expected for the cluster to be yellow. If you still want the cluster to be green, set the number of replicas of each index to 0.
PUT /my-index/_settings
{
"index" : {
"number_of_replicas" : 0
}
}
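To see which indices can never go green on the current number of nodes, the rule above (a replica is never placed on the node that holds its primary) can be checked against _cat/indices output. This is a rough sketch: unplaceable_replicas is a hypothetical helper name, the sample rows are made up, and it assumes rep is the 6th column, as in the _cat/indices output shown earlier.

```shell
#!/bin/sh
# List indices whose replica count cannot be satisfied by the given number
# of data nodes: each index needs at least (rep + 1) nodes, because a
# replica is never allocated to the node that holds its primary.
unplaceable_replicas() {
  nodes="$1"
  awk -v n="$nodes" '$6 ~ /^[0-9]+$/ && $6 + 0 >= n + 0 { print $3, "rep=" $6 }'
}

# Sample rows in `_cat/indices` layout (index names and UUIDs are made up).
sample='yellow open logs-a aaaaaaaaaaaaaaaaaaaaaa 1 1 10 0 5kb 5kb
green open logs-b bbbbbbbbbbbbbbbbbbbbbb 1 0 20 0 9kb 9kb'

# With a single data node, only the index with rep=1 is reported.
printf '%s\n' "$sample" | unplaceable_replicas 1

# Against a live cluster (assumes Elasticsearch listens on localhost:9200):
# curl -s 'localhost:9200/_cat/indices' | unplaceable_replicas 1
```

Any index it prints will stay yellow until either a node is added or its number_of_replicas is lowered.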

Elasticsearch add node to cluster

I have a 3-node Elasticsearch cluster:
192.168.2.11 - node-01
192.168.2.12 - node-02
192.168.2.13 - node-03
I removed node-02 from shard allocation using this command:
curl -XPUT 192.168.2.12:9200/_cluster/settings -H 'Content-Type: application/json' -d '{
"transient" :{
"cluster.routing.allocation.exclude._ip" : "192.168.2.12"
}
}'
That worked: all my indices moved to node-01 and node-03. But how do I bring this node back into the cluster?
I tried this command:
curl -XPUT 192.168.2.12:9200/_cluster/settings -H 'Content-Type: application/json' -d '{
"transient" :{
"cluster.routing.allocation.include._ip" : "192.168.2.12"
}
}'
but this doesn't work:
:"node does not match cluster setting [cluster.routing.allocation.include] filters [_ip:\"192.168.2.12\"]
The node has not been deleted; it is only excluded from shard allocation. You can undo your command by setting the value you changed back to null.
Try updating the settings on either of the running nodes (01 or 03) with
curl -XPUT 192.168.2.11:9200/_cluster/settings -H 'Content-Type: application/json' -d '{
"transient" :{
"cluster.routing.allocation.exclude._ip" : null
}
}'
and the cluster should rebalance shards across the three nodes.
Be careful with include._ip: "192.168.2.12", as it might stop shards from being allocated to the other two nodes. If you want to use include, list all three IP addresses, for example
curl -XPUT 192.168.2.11:9200/_cluster/settings -H 'Content-Type: application/json' -d '{
"transient" :{
"cluster.routing.allocation.include._ip" : "192.168.2.11,192.168.2.12,192.168.2.13"
}
}'

We are losing data in an Elasticsearch cluster

We are building a proof of concept with Elasticsearch, but we lost data in a clustered environment. We use ES 2.4.0.
Can anyone tell us what we are missing?
Our scenario is:
Start Elasticsearch on Server-1 and Server-2 with the configurations below;
they form a cluster.
Index document over Server-1:
curl -XPUT '20.20.20.5:9200/ert/post/1' -d '
{
"user": "easlan",
"postDate": "01-16-2015",
"body": "Adding Data in ElasticSearch Cluster" ,
"title": "ElasticSearch Cluster Test - 1"
}'
Search for indexed docs on Server-1 or Server-2. The total number of results is 1 (as expected):
curl -XGET '20.20.20.5:9200/ert/post/_search?q=user:easlan&pretty=true'
curl -XGET '20.20.20.6:9200/ert/post/_search?q=user:easlan&pretty=true'
Then shut down Server-1.
Index a new document on Server-2:
curl -XPUT '20.20.20.6:9200/ert/post/2' -d '
{
"user": "easlan",
"postDate": "01-16-2015",
"body": "Adding Data in ElasticSearch Cluster" ,
"title": "ElasticSearch Cluster Test - 2"
}'
Search for indexed docs on Server-2. The total number of results is 2:
curl -XGET '20.20.20.6:9200/ert/post/_search?q=user:easlan&pretty=true'
Shut down Server-2.
Start Server-1.
Search for indexed docs on Server-1. The total number of results is 1 (as expected, because Server-2 is down):
curl -XGET '20.20.20.5:9200/ert/post/_search?q=user:easlan&pretty=true'
Then start Server-2 again and search on Server-1 or Server-2. We expect a total of 2 results, but we get 1. Even after restarting both nodes, the result is still 1:
curl -XGET '20.20.20.5:9200/ert/post/_search?q=user:easlan&pretty=true'
curl -XGET '20.20.20.6:9200/ert/post/_search?q=user:easlan&pretty=true'
Our Configurations:
*** Server-1 ****
cluster.name: ESCluster
node.master: true
node.name: "es1"
node.data: true
network.bind_host: ["127.0.0.1","20.20.20.5"]
network.publish_host: "20.20.20.5"
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["20.20.20.5","20.20.20.6"]
discovery.zen.minimum_master_nodes: 1
*** Server-2 ****
cluster.name: ESCluster
node.master: true
node.name: "es2"
node.data: true
network.bind_host: ["127.0.0.1","20.20.20.6"]
network.publish_host: "20.20.20.6"
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["20.20.20.5","20.20.20.6"]
discovery.zen.minimum_master_nodes: 1
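One setting in the configuration above that commonly contributes to this kind of loss on Elasticsearch 2.x is discovery.zen.minimum_master_nodes: 1 with two master-eligible nodes: either node can then form a cluster on its own, starting from its own (possibly stale) copy of the data. The usual recommendation for Zen discovery is a quorum of master-eligible nodes, (N / 2) + 1. Below is a small sketch of that arithmetic. The function name zen_quorum is made up for illustration, and whether this setting alone explains the loss in the scenario above is an assumption.

```shell
#!/bin/sh
# Quorum of master-eligible nodes for Zen discovery:
# (N / 2) + 1, using integer division.
zen_quorum() {
  echo $(( $1 / 2 + 1 ))
}

zen_quorum 2   # two master-eligible nodes, as in the configuration above
zen_quorum 3   # three master-eligible nodes
```

With two master-eligible nodes the quorum is 2, so neither node would elect itself master while the other is down. The trade-off is that the cluster cannot elect a master whenever one of the two nodes is stopped, which is why three master-eligible nodes are usually preferred.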

ElasticSearch Unassigned Shard

I have a 2-node Elasticsearch cluster with 8 CPUs and 16 GB RAM. I have set ES_HEAP_SIZE to 10 GB.
In the yml configuration file on both machines I have set
index.number_of_shards: 5
index.number_of_replicas: 1
Both machines are configured with master and data set to true. The problem is that shard 0 on node 1 is unassigned after a restart. I tried:
for shard in $(curl -XGET http://localhost:9201/_cat/shards | grep UNASSIGNED | awk '{print $2}'); do
echo "processing $shard"
curl -XPOST 'localhost:9201/_cluster/reroute' -d '{
"commands" : [ {
"allocate" : {
"index" : "inxn",
"shard" : '$shard',
"node" : "node1",
"allow_primary" : true
}
}
]
}'
done
It returns no error: the response says "acknowledged": true and shows the shard as initializing, but when I list the shards it is still not assigned.
Am I doing anything wrong in the settings? Should both nodes be master/data true, with 5 shards and 1 replica set on both machines?
Any help or suggestion would be greatly appreciated.
Thanks
I solved the same problem with a trick: I renamed the 0 folder under indices on node1 and then force-assigned shard 0 to node1, which worked for me.
curl -XPOST 'localhost:9201/_cluster/reroute' -d '{
"commands" : [ {
"allocate" : {
"index" : "inxc",
"shard" : 0,
"node" : "node1",
"allow_primary" : true
}
}
]
}'
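The loop in the question takes only the shard number from _cat/shards and hard-codes the index name. Here is a sketch of extracting both the index and the shard number for every unassigned copy, assuming the default _cat/shards column order (index, shard, prirep, state). The name unassigned_shards is a hypothetical helper, and the sample rows are made up.

```shell
#!/bin/sh
# Print "index shard prirep" for every UNASSIGNED row of `_cat/shards` output.
unassigned_shards() {
  awk '$4 == "UNASSIGNED" { print $1, $2, $3 }'
}

# Sample in `_cat/shards` layout (index and node names are made up).
sample='inx 0 p STARTED 10 1kb 10.0.0.1 node1
inx 0 r UNASSIGNED
inx 1 p STARTED 12 2kb 10.0.0.2 node2'

# Report each unassigned copy; a reroute call could be issued here instead.
printf '%s\n' "$sample" | unassigned_shards |
while read -r index shard prirep; do
  echo "unassigned: index=$index shard=$shard type=$prirep"
done
```

Each index and shard pair could then be passed to the _cluster/reroute call shown above. Keep in mind that forcing allocation with allow_primary can discard whatever data the shard previously held, so it is worth checking why the shard is unassigned first.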

ElasticSearch find disk space usage

How can I find the amount of disk space that Elasticsearch is using for my indices? I'm currently running it locally and I'm trying to see how much disk space I will need on the VM that I'll be spinning up.
The Elasticsearch way to do this would be to use _cat/shards and look at the store column:
curl -XGET "http://localhost:9200/_cat/shards?v"
index              shard prirep state     docs   store ip          node
myindex_2014_12_19 2     r      STARTED  76661 415.6mb 192.168.1.1 Georgianna Castleberry
myindex_2014_12_19 2     p      STARTED  76661 417.3mb 192.168.1.2 Frederick Slade
myindex_2014_12_19 2     r      STARTED  76661 416.9mb 192.168.1.3 Maverick
myindex_2014_12_19 0     r      STARTED  76984 525.9mb 192.168.1.1 Georgianna Castleberry
myindex_2014_12_19 0     r      STARTED  76984   527mb 192.168.1.2 Frederick Slade
myindex_2014_12_19 0     p      STARTED  76984   526mb 192.168.1.3 Maverick
myindex_2014_12_19 3     r      STARTED    163 208.5kb 192.168.1.1 Georgianna Castleberry
myindex_2014_12_19 3     p      STARTED    163 191.4kb 192.168.1.2 Frederick Slade
myindex_2014_12_19 3     r      STARTED    163 181.6kb 192.168.1.3 Maverick
myindex_2014_12_19 1     p      STARTED 424923   2.1gb 192.168.1.1 Georgianna Castleberry
myindex_2014_12_19 1     r      STARTED 424923   2.1gb 192.168.1.2 Frederick Slade
myindex_2014_12_19 1     r      STARTED 424923   2.1gb 192.168.1.3 Maverick
myindex_2014_12_19 4     r      STARTED  81020 435.9mb 192.168.1.1 Georgianna Castleberry
myindex_2014_12_19 4     p      STARTED  81020 437.8mb 192.168.1.2 Frederick Slade
myindex_2014_12_19 4     r      STARTED  81020 437.8mb 192.168.1.3 Maverick
Otherwise in Linux to view the space by folder use:
du -hs /myelasticsearch/data/folder
or to view the space by filesystem:
df -h
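To turn the per-shard store column from _cat/shards into a total per index, the sizes can be requested in bytes and summed. This is a sketch that assumes the cat API's h and bytes parameters are used to return only the index and store columns in plain bytes. The name store_per_index is a hypothetical helper, and the sample figures are made up.

```shell
#!/bin/sh
# Sum the store size (in bytes) per index from two-column "index store" input,
# e.g. the output of:
#   curl -s 'localhost:9200/_cat/shards?h=index,store&bytes=b'
# Rows without a numeric size (such as unassigned shards) are skipped.
store_per_index() {
  awk '$2 ~ /^[0-9]+$/ { total[$1] += $2 }
       END { for (i in total) printf "%s %.0f\n", i, total[i] }' | sort
}

# Sample input (index names and sizes are made up).
sample='idx_a 100
idx_a 150
idx_b 4096
idx_a 50'

printf '%s\n' "$sample" | store_per_index
```

The totals include replica copies, so they reflect what the cluster occupies on disk rather than the size of the primary data alone.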
In case you don't need per-shard statistics returned by /_cat/shards you can use
curl -XGET 'http://localhost:9200/_cat/allocation?v'
to get used and available disk space for each node.
To view the overall disk usage/available space on ES cluster you can use the following command:
curl -XGET 'localhost:9200/_cat/allocation?v&pretty'
Hope this helps.
You can use the nodes stats REST API.
See: https://www.elastic.co/guide/en/elasticsearch/reference/1.6/cluster-nodes-stats.html
Make a request for the fs stats like so:
http://:9200/_nodes/stats/fs?pretty=1
and you will see:
{
"cluster_name" : "<cluster>",
"nodes" : {
"pEO34wutR7qk3Ix8N7MgyA" : {
"timestamp" : 1438880525206,
"name" : "<name>",
"transport_address" : "inet[/10.128.37.111:9300]",
"host" : "<host>",
"ip" : [ "inet[/10.128.37.111:9300]", "NONE" ],
"fs" : {
"timestamp" : 1438880525206,
"total" : {
"total_in_bytes" : 363667091456,
"free_in_bytes" : 185081352192,
"available_in_bytes" : 166608117760,
"disk_reads" : 154891,
"disk_writes" : 482628039,
"disk_io_op" : 482782930,
"disk_read_size_in_bytes" : 6070391808,
"disk_write_size_in_bytes" : 1989713248256,
"disk_io_size_in_bytes" : 1995783640064,
"disk_queue" : "0",
"disk_service_time" : "0"
},
"data" : [ {
"path" : "/data1/elasticsearch/data/<cluster>/nodes/0",
"mount" : "/data1",
"dev" : "/dev/sda4",
"total_in_bytes" : 363667091456,
"free_in_bytes" : 185081352192,
"available_in_bytes" : 166608117760,
"disk_reads" : 154891,
"disk_writes" : 482628039,
"disk_io_op" : 482782930,
"disk_read_size_in_bytes" : 6070391808,
"disk_write_size_in_bytes" : 1989713248256,
"disk_io_size_in_bytes" : 1995783640064,
"disk_queue" : "0",
"disk_service_time" : "0"
} ]
}
}
}
}
the space for the data drive is listed:
"total" : {
"total_in_bytes" : 363667091456,
"free_in_bytes" : 185081352192,
"available_in_bytes" : 166608117760,
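From these byte counts the used percentage of the data drive can be derived. Here is a minimal sketch using the total_in_bytes and available_in_bytes values shown above. The helper name disk_used_percent is made up, and it treats everything that is not available as used, which is an assumption about what "used" should mean here.

```shell
#!/bin/sh
# Integer percentage of disk in use: (total - available) * 100 / total.
# Relies on 64-bit shell arithmetic, which is what common shells provide
# on 64-bit systems.
disk_used_percent() {
  total="$1"
  available="$2"
  echo $(( (total - available) * 100 / total ))
}

# Values taken from the fs.total section of the response above.
disk_used_percent 363667091456 166608117760
```

The result is rounded down to a whole percent, which is usually precise enough when comparing against the disk watermarks.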
A more concise solution to find the size of indices is to use
curl -XGET 'localhost:9200/_cat/indices?v'
The output has a 'store.size' column that tells you exactly the size of an index.
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
yellow open logstash-2017.03.01 TfraFM8TQkSXdxjx13CnpQ 5 1 33330000 0 1gb 1gb
yellow open .monitoring-es-2-2017.03.02 10YscrcfQuGny5wMxeb0TA 1 1 68834 88 30.3mb 30.3mb
yellow open .kibana GE6xXV7QT-mNbX7xTPbZ4Q 1 1 3 0 14.5kb 14.5kb
yellow open .monitoring-es-2-2017.03.01 SPeQNnPlRB6y7G6w1Axokw 1 1 29441 108 14.7mb 14.7mb
yellow open .monitoring-data-2 LLeWqsD-QE-rPFblwu5K_Q 1 1 3 0 6.9kb 6.9kb
yellow open .monitoring-kibana-2-2017.03.02 l_MAPERUTmSbq0xbhpnf2Q 1 1 5320 0 1.1mb 1.1mb
yellow open .monitoring-kibana-2-2017.03.01 UFVg9c7TTA-nbsEd2d4oFw 1 1 2699 0 763.4kb 763.4kb
In addition you can find out about available disk space by using
curl -XGET 'localhost:9200/_nodes/_local/stats/fs'
Look up the disk space information under the 'fs' key
{
"_nodes": {
"total": 1,
"successful": 1,
"failed": 0
},
"cluster_name": "elasticsearch",
"nodes": {
"MfgVaoRQT9iRAZtAvO549Q": {
"fs": {
"timestamp": 1488466297268,
"total": {
"total_in_bytes": 29475753984,
"free_in_bytes": 18352095232,
"available_in_bytes": 18352095232
},
}
}
}
}
I've tested this for ElasticSearch version 5.2.1
You may want to use the _cat API for per-node available disk space:
curl 'http://host:9200/_cat/nodes?h=h,diskAvail'
Reference : https://www.elastic.co/guide/en/elasticsearch/reference/current/cat-nodes.html
RUN THE COMMANDS BELOW TO FIND THE DISK SPACE USED BY EACH ELASTICSEARCH SHARD
# FOR SHARDS
curl 'http://host:9200/_cat/shards?v&pretty'
# OR
GET _cat/shards?v&pretty
RUN THE COMMANDS BELOW TO FIND THE DISK SPACE USED BY EACH ELASTICSEARCH INDEX
# FOR INDICES
curl -XGET 'host:9200/_cat/indices?v&pretty'
# SORT INDICES BY STORE SIZE
curl -XGET 'host:9200/_cat/indices/_all?v&s=store.size'
OUTPUT
# GET /_cat/indices/_all?v&s=store.size
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
yellow open sync-rails-logs sSIBqr2iQHG8TGeKFozTpQ 5 1 0 0 1.2kb 1.2kb
yellow open web-nginx-logs iTV-xFFBSdy-C2-NTuEwqQ 5 1 0 0 1.2kb 1.2kb
yellow open web-rails-logs BYD_qHS8SguZvBuGpNvCwA 5 1 0 0 1.2kb 1.2kb
yellow open sync-nginx-logs XAI1hsxlT6qBYN4Ql36lbg 5 1 0 0 1.2kb 1.2kb
green open .tasks XGrMZiqCR0Wr33cCG1u0VQ 1 0 1 0 6.2kb 6.2kb
green open .kibana_1 -g0ztoGWQnuOXnP6di7OYQ 1 0 13 0 100.6kb 100.6kb
green open .kibana_2 eAxt-LXbQyybCyp_6ZYNZg 1 0 14 5 432.2kb 432.2kb
green open sync-nginx-logs-2019-09-13 Q_Ki0dvXQEiuqiGCd10hRg 1 0 144821 0 28.8mb 28.8mb
green open sync-nginx-logs-2019-08-31 m7-oi7ZTSM6ZH_wPDWwbdw 1 0 384954 0 76.4mb 76.4mb
yellow open sync-nginx-logs-2019-08-26 gAvOPNhMRZK6fjAazpzPQQ 5 1 354260 0 76.5mb 76.5mb
green open sync-nginx-logs-2019-09-01 vvgysMB_SqGDFegF6_wOEQ 1 0 400248 0 79.5mb 79.5mb
green open sync-nginx-logs-2019-09-02 8yHv66FuTE6A8L5GgnEl3g 1 0 416184 0 84.8mb 84.8mb
green open sync-nginx-logs-2019-09-07 iZCX1A3fRMaglOCHFLaFsA 1 0 436122 0 86.7mb 86.7mb
green open sync-nginx-logs-2019-09-08 4Y9rA_1cSlGJ9KADmickQQ 1 0 446164 0 88.3mb 88.3mb
RUN THE COMMAND BELOW TO FIND THE DISK SPACE AVAILABLE ON EACH ELASTICSEARCH NODE
GET _cat/nodes?h=h,diskAvail
OR
curl 'http://host:9200/_cat/nodes?h=h,diskAvail'
OUTPUT
148.3gb
Alternatively, you can query the disk directly to measure the size of each directory under /var/lib/elasticsearch/[environment name]/nodes/0/indices on the Elasticsearch nodes.
$ du -b --max-depth=1 /var/lib/elasticsearch/[environment name]/nodes/0/indices \
| sort -rn | numfmt --to=iec --suffix=B --padding=5
> 17GB /var/lib/elasticsearch/env1/nodes/0/indices
3.8GB /var/lib/elasticsearch/env1/nodes/0/indices/index1
2.1GB /var/lib/elasticsearch/env1/nodes/0/indices/index2
1.2GB ...

Resources