Elasticsearch: Re-enable shard allocation ineffective?

I am running a 2-node cluster on version 5.6.12.
I followed this rolling upgrade guide: https://www.elastic.co/guide/en/elasticsearch/reference/5.6/rolling-upgrades.html
After reconnecting the last upgraded node to my cluster, the health status remained yellow due to unassigned shards.
Re-enabling shard allocation seemed to have no effect:
PUT _cluster/settings
{
  "transient": {
    "cluster.routing.allocation.enable": "all"
  }
}
My query results when checking cluster health:
GET _cat/health:
1541522454 16:40:54 elastic-upgrade-test yellow 2 2 84 84 0 0 84 0 - 50.0%
GET _cat/shards:
v2_session-prod-2018.11.05 3 p STARTED 6000 1016kb xx.xxx.xx.xxx node-25
v2_session-prod-2018.11.05 3 r UNASSIGNED
v2_session-prod-2018.11.05 1 p STARTED 6000 963.3kb xx.xxx.xx.xxx node-25
v2_session-prod-2018.11.05 1 r UNASSIGNED
v2_session-prod-2018.11.05 4 p STARTED 6000 1020.4kb xx.xxx.xx.xxx node-25
v2_session-prod-2018.11.05 4 r UNASSIGNED
v2_session-prod-2018.11.05 2 p STARTED 6000 951.4kb xx.xxx.xx.xxx node-25
v2_session-prod-2018.11.05 2 r UNASSIGNED
v2_session-prod-2018.11.05 0 p STARTED 6000 972.2kb xx.xxx.xx.xxx node-25
v2_session-prod-2018.11.05 0 r UNASSIGNED
v2_status-prod-2018.11.05 3 p STARTED 6000 910.2kb xx.xxx.xx.xxx node-25
v2_status-prod-2018.11.05 3 r UNASSIGNED
Is there another way to get shard allocation working again so I can get my cluster health back to green?

The other node within my cluster had a "high disk watermark [90%] exceeded" warning message so shards were "relocated away from this node".
I updated the config to:
cluster.routing.allocation.disk.watermark.high: 95%
After restarting the node, shards began to allocate again.
This is a quick fix; I will also attempt to increase the disk space on this node to ensure I don't lose reliability.
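For reference, the high watermark can also be raised dynamically via the cluster settings API, without a restart; a minimal sketch, assuming the same 95% threshold as above:
PUT _cluster/settings
{
  "transient": {
    "cluster.routing.allocation.disk.watermark.high": "95%"
  }
}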

Related

Daily index not created

On my single test server with 8G of RAM (1955m to the JVM) running ES v7.4, I have 12 application indices plus a few system indices (.monitoring-es-7-2021.08.02, .monitoring-logstash-7-2021.08.02, .monitoring-kibana-7-2021.08.02) getting created daily. So on average ES creates about 15 indices per day.
Today I can see that only two indices were created:
curl --silent -u elastic:xxxxx 'http://127.0.0.1:9200/_cat/indices?v' | grep '2021.08.03'
yellow open metricbeat-7.4.0-2021.08.03 KMJbbJMHQ22EM5Hfw 1 1 110657 0 73.9mb 73.9mb
green open .monitoring-kibana-7-2021.08.03 98iEmlw8GAm2rj-xw 1 0 3 0 1.1mb 1.1mb
I think the reason is the following. While looking into the ES logs, I found:
[2021-08-03T12:14:15,394][WARN ][o.e.x.m.e.l.LocalExporter] [elasticsearch_1] unexpected error while indexing monitoring document org.elasticsearch.xpack.monitoring.exporter.ExportException: org.elasticsearch.common.ValidationException: Validation Failed: 1: this action would add [1] total shards, but this cluster currently has [1000]/[1000] maximum shards open;
Logstash logs for the application index and the filebeat index:
[2021-08-03T05:18:05,246][WARN ][logstash.outputs.elasticsearch][main] Could not index event to Elasticsearch. {:status=>400, :action=>["index", {:_id=>nil, :_index=>"ping_server-2021.08.03", :_type=>"_doc", :routing=>nil}, #LogStash::Event:0x44b98479], :response=>{"index"=>{"_index"=>"ping_server-2021.08.03", "_type"=>"_doc", "_id"=>nil, "status"=>400, "error"=>{"type"=>"validation_exception", "reason"=>"Validation Failed: 1: this action would add [2] total shards, but this cluster currently has [1000]/[1000] maximum shards open;"}}}}
[2021-08-03T05:17:38,230][WARN ][logstash.outputs.elasticsearch][main] Could not index event to Elasticsearch. {:status=>400, :action=>["index", {:_id=>nil, :_index=>"filebeat-7.4.0-2021.08.03", :_type=>"_doc", :routing=>nil}, #LogStash::Event:0x1e2c70a8], :response=>{"index"=>{"_index"=>"filebeat-7.4.0-2021.08.03", "_type"=>"_doc", "_id"=>nil, "status"=>400, "error"=>{"type"=>"validation_exception", "reason"=>"Validation Failed: 1: this action would add [2] total shards, but this cluster currently has [1000]/[1000] maximum shards open;"}}}}
Adding up the active and unassigned shards gives a total of 1000:
"active_primary_shards" : 512,
"active_shards" : 512,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 488,
"delayed_unassigned_shards" : 0,
"active_shards_percent_as_number" : 51.2
If I check with the command below, I see that all unassigned shards are replica shards:
curl --silent -XGET -u elastic:xxxx http://localhost:9200/_cat/shards | grep 'UNASSIGNED'
.
.
dev_app_server-2021.07.10 0 r UNASSIGNED
apm-7.4.0-span-000028 0 r UNASSIGNED
ping_server-2021.07.02 0 r UNASSIGNED
api_app_server-2021.07.17 0 r UNASSIGNED
consent_app_server-2021.07.15 0 r UNASSIGNED
Q. For now, can I safely delete the unassigned shards to free up some shards, since it is a single-node cluster?
Q. Can I change the settings online from 2 shards per index (1 primary and 1 replica) to 1 primary shard, since it is a single server?
Q. If I have to keep one year of indices, is the calculation below correct?
15 indices daily with one primary shard * 365 days = 5475 total shards (or say 6000, rounding up)
Q. Can I set 6000 shards as the shard limit for this node so that I never face this shard issue again?
Thanks,
You have a lot of unassigned shards (probably because you have a single node and all indices have replicas=1), so it's easy to get rid of all of them, and of the error at the same time, by running the following command:
PUT _all/_settings
{
  "index.number_of_replicas": 0
}
Regarding the count of the indices, you probably don't have to create one index per day if those indices stay small (i.e. below 10GB each). So the default limit of 1000 shards is more than enough without you having to change anything.
You should simply leverage Index Lifecycle Management to keep your index size at bay and avoid creating too many small indices.
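As a sketch of what such a policy could look like (the policy name and thresholds below are illustrative, not taken from the question), an ILM policy that rolls an index over at roughly 10GB or 30 days and deletes it after a year might be:
PUT _ilm/policy/logs-yearly-retention
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_size": "10gb",
            "max_age": "30d"
          }
        }
      },
      "delete": {
        "min_age": "365d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}
The policy then needs to be referenced from an index template (via index.lifecycle.name and index.lifecycle.rollover_alias) so that newly created indices pick it up.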

Delete unused data elasticsearch

I'm new to using Elasticsearch. I use Elasticsearch to aggregate logs. My problem is with storage: I deleted all indices and now I have only one index.
When I call /_cat/allocation?v, disk.indices is 23.9mb and disk.used is 16.4gb. Why this difference? How can I remove unused data, or how can I properly remove indices?
I ran the command:
curl -XPOST "elasticsearch:9200/_forcemerge?only_expunge_deletes=true"
But I didn't see any improvement.
Output of _cat/allocation?v :
shards disk.indices disk.used disk.avail
12 24.3mb 16.4gb 22.7gb
Output of _cat/shards?v :
index shard prirep state docs store ip node
articles 0 p STARTED 3666 24.2mb 192.168.1.21 lW9hsd5
articles 0 r UNASSIGNED
storage_test 2 p STARTED 0 261b 192.168.1.21 lW9hsd5
storage_test 2 r UNASSIGNED
storage_test 3 p STARTED 0 261b 192.168.1.21 lW9hsd5
storage_test 3 r UNASSIGNED
storage_test 4 p STARTED 0 261b 192.168.1.21 lW9hsd5
storage_test 4 r UNASSIGNED
storage_test 1 p STARTED 0 261b 192.168.1.21 lW9hsd5
storage_test 1 r UNASSIGNED
storage_test 0 p STARTED 0 261b 192.168.1.21 lW9hsd5
storage_test 0 r UNASSIGNED
twitter 3 p STARTED 1 4.4kb 192.168.1.21 lW9hsd5
twitter 3 r UNASSIGNED
twitter 2 p STARTED 0 261b 192.168.1.21 lW9hsd5
twitter 2 r UNASSIGNED
twitter 4 p STARTED 0 261b 192.168.1.21 lW9hsd5
twitter 4 r UNASSIGNED
twitter 1 p STARTED 0 261b 192.168.1.21 lW9hsd5
twitter 1 r UNASSIGNED
twitter 0 p STARTED 0 261b 192.168.1.21 lW9hsd5
twitter 0 r UNASSIGNED
.kibana 0 p STARTED 4 26.4kb 192.168.1.21 lW9hsd5
Thanks
https://www.elastic.co/guide/en/elasticsearch/guide/current/delete-doc.html
As already mentioned in Updating a Whole Document, deleting a document
doesn’t immediately remove the document from disk; it just marks it as
deleted. Elasticsearch will clean up deleted documents in the
background as you continue to index more data.
You might be facing some side effects of a _forcemerge on a non-read-only index:
Warning: Force merge should only be called against read-only indices. Running force merge against a read-write index can cause very large segments to be produced (>5Gb per segment), and the merge policy will never consider it for merging again until it mostly consists of deleted docs. This can cause very large segments to remain in the shards.
In this case I would suggest first making the index read-only:
PUT your_index/_settings
{
  "index": {
    "blocks.read_only": true
  }
}
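Between these two steps, the force merge itself would be run again on that index; a minimal sketch, reusing the only_expunge_deletes flag from the question and the placeholder index name your_index:
POST your_index/_forcemerge?only_expunge_deletes=true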
Then, once the force merge has run, re-enable writing to the index:
PUT your_index/_settings
{
  "index": {
    "blocks.read_only": false
  }
}
In case this does not work, you can do a reindex from an old index into a new index and then delete the old index.
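A minimal sketch of that reindex step, assuming hypothetical index names old_index and new_index:
POST _reindex
{
  "source": {
    "index": "old_index"
  },
  "dest": {
    "index": "new_index"
  }
}
Once the new index has been verified, the old one can be removed with DELETE old_index.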
Is there a better way of deleting old logs?
Looks like you want to delete old log messages. Although you could issue a delete-by-query, there is in fact a better way: the Rollover API.
The idea is to create a new index every time the current one gets too big. Writes happen through a fixed alias, and the Rollover API makes the alias point to a new index when the current one is too old or too big. Then, to delete old data, you only have to delete the old indices.
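A minimal sketch of a rollover call, assuming a hypothetical write alias named logs-write and illustrative conditions:
POST logs-write/_rollover
{
  "conditions": {
    "max_age": "7d",
    "max_docs": 10000000
  }
}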
Hope that helps!

Kibana5: index pattern does not contain any of the following field types: number, boolean, date, ip or string

So I just started with ELK (actually just EK)
I uploaded my data using PHP and MySQL
I can see all of the data in Kibana, but whenever I try to create a visualization (except the maps) I get:
No Compatible Fields: The "sample" index pattern does not contain any of the following field types: number, boolean, date, ip or string
This is my index.
It clearly has the type number and string.
I also tried:
Kibana5, index pattern does not contain any of the following field types: *
which gives an error:
Field Capabilities: 5 of 5 shards failed.
To figure out the shards thing I tried:
curl 'localhost:9200/_cat/shards?h=index,shard,prirep,state,unassigned.reason'
It outputs:
sample 1 p STARTED
sample 1 r UNASSIGNED CLUSTER_RECOVERED
sample 4 p STARTED
sample 4 r UNASSIGNED CLUSTER_RECOVERED
sample 3 p STARTED
sample 3 r UNASSIGNED CLUSTER_RECOVERED
sample 2 p STARTED
sample 2 r UNASSIGNED CLUSTER_RECOVERED
sample 0 p STARTED
sample 0 r UNASSIGNED CLUSTER_RECOVERED
.kibana 0 p STARTED
.kibana 0 r UNASSIGNED INDEX_CREATED
I have no idea what to do. I just started with ELK.
I have elasticsearch 5.4.1
Kibana 5.4
Please help :(
Delete the index-pattern, and recreate it

I have the error 503, No Shard Available Action Exception with Elasticsearch

I am having problems when I look up a record inside an index, and the message is the following:
TransportError: TransportError(503, u'NoShardAvailableActionException[[new_gompute_history][2] null]; nested: IllegalIndexShardStateException[[new_gompute_history][2] CurrentState[POST_RECOVERY] operations only allowed when started/relocated]; ')
This happens when I search by an id inside an index.
The health of my cluster is green:
GET _cat/health?v
epoch timestamp cluster status node.total node.data shards pri relo init unassign
1438678496 10:54:56 juan green 5 4 212 106 0 0 0
GET _cat/allocation?v
shards disk.used disk.avail disk.total disk.percent host ip node
53 3.1gb 16.8gb 20gb 15 bc10-05 10.8.5.15 Anomaloco
53 6.4gb 80.8gb 87.3gb 7 bc10-03 10.8.5.13 Algrim the Strong
0 0b l8a 10.8.0.231 logstash-l8a-5920-4018
53 6.4gb 80.8gb 87.3gb 7 bc10-03 10.8.5.13 Harry Leland
53 3.1gb 16.8gb 20gb 15 bc10-05 10.8.5.15 Hypnotia
I solved it by putting a sleep between consecutive PUTs, but I do not like this solution.
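A less brittle alternative to a fixed sleep, assuming the root cause is querying while the shard is still in POST_RECOVERY, is to block until the cluster reports the desired health before searching; a minimal sketch:
GET _cluster/health?wait_for_status=green&timeout=30s
The call returns as soon as the status is reached (or the timeout expires), so the search can be issued right after it instead of after an arbitrary delay.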

elasticsearch - remove a second elasticsearch node and add an other node, get unassigned shards

As a starter in Elasticsearch, I have only been using it for two weeks and I have just done a silly thing.
My Elasticsearch has one cluster with two nodes: one master-data node (version 1.4.2) and one non-data node (version 1.1.1). There was a version conflict when using them, so I decided to shut down and delete the non-data node, then install another data node (version 1.4.2). See my image for an easier picture; node3 is then renamed node2.
Then I check the cluster status:
{
  "cluster_name": "elasticsearch",
  "status": "yellow",
  "timed_out": false,
  "number_of_nodes": 2,
  "number_of_data_nodes": 2,
  "active_primary_shards": 725,
  "active_shards": 1175,
  "relocating_shards": 0,
  "initializing_shards": 0,
  "unassigned_shards": 273
}
Check the cluster state
curl -XGET http://localhost:9200/_cat/shards
logstash-2015.03.25 2 p STARTED 3031 621.1kb 10.146.134.94 node1
logstash-2015.03.25 2 r UNASSIGNED
logstash-2015.03.25 0 p STARTED 3084 596.4kb 10.146.134.94 node1
logstash-2015.03.25 0 r UNASSIGNED
logstash-2015.03.25 3 p STARTED 3177 608.4kb 10.146.134.94 node1
logstash-2015.03.25 3 r UNASSIGNED
logstash-2015.03.25 1 p STARTED 3099 577.3kb 10.146.134.94 node1
logstash-2015.03.25 1 r UNASSIGNED
logstash-2014.12.30 4 r STARTED 10.146.134.94 node2
logstash-2014.12.30 4 p STARTED 94 114.3kb 10.146.134.94 node1
logstash-2014.12.30 0 r STARTED 111 195.8kb 10.146.134.94 node2
logstash-2014.12.30 0 p STARTED 111 195.8kb 10.146.134.94 node1
logstash-2014.12.30 3 r STARTED 110 144kb 10.146.134.94 node2
logstash-2014.12.30 3 p STARTED 110 144kb 10.146.134.94 node1
I have read the related questions below and tried to follow them, but no luck. I also commented on the answer with the error I got.
ElasticSearch: Unassigned Shards, how to fix?
https://t37.net/how-to-fix-your-elasticsearch-cluster-stuck-in-initializing-shards-mode.html
elasticsearch - what to do with unassigned shards
http://www.elastic.co/guide/en/elasticsearch/reference/current/cluster-reroute.html#cluster-reroute
curl -XPOST 'localhost:9200/_cluster/reroute' -d '{
"commands" : [ {
"allocate" : {
"index" : "logstash-2015.03.25",
"shard" : 4,
"node" : "node2",
"allow_primary" : true
}
}
]
}'
I get
"routing_nodes":{"unassigned":[{"state":"UNASSIGNED","primary":false,"node":null,
"relocating_node":null,"shard":0,"index":"logstash-2015.03.25"}
And I followed the answer in https://stackoverflow.com/a/23781013/1920536
curl -XPUT 'localhost:9200/_cluster/settings' -d '{
"transient" : {
"cluster.routing.allocation.enable" : "all"
}
}'
but it had no effect.
What should I do?
Thanks in advance.
Update: when I check the pending tasks, it shows:
{"tasks":[{"insert_order":88401,"priority":"HIGH","source":"shard-failed
([logstash-2015.01.19][3], node[PVkS47JyQQq6G-lstUW04w], [R], s[INITIALIZING]),
**reason [Failed to start shard, message** [RecoveryFailedException[[logstash-2015.01.19][3]: **Recovery failed from** [node1][_72bJJX0RuW7AyM86WUgtQ]
[localhost][inet[/localhost:9300]]{master=true} into [node2][PVkS47JyQQq6G-lstUW04w]
[localhost][inet[/localhost:9302]]{master=false}];
nested: RemoteTransportException[[node1][inet[/localhost:9300]]
[internal:index/shard/recovery/start_recovery]]; nested: RecoveryEngineException[[logstash-2015.01.19][3] Phase[2] Execution failed];
nested: RemoteTransportException[[node2][inet[/localhost:9302]][internal:index/shard/recovery/prepare_translog]];
nested: EngineCreationFailureException[[logstash-2015.01.19][3] **failed to create engine];
nested: FileSystemException**[data/elasticsearch/nodes/0/indices/logstash-2015.01.19/3/index/_0.si: **Too many open files**]; ]]","executing":true,"time_in_queue_millis":53,"time_in_queue":"53ms"}]}
If you have two nodes like:
1) Node-1 - ES 1.4.2
2) Node-2 - ES 1.1.1
Now follow these steps to debug.
1) Stop all Elasticsearch instances on Node-2.
2) Install Elasticsearch 1.4.2 on the new node.
Change elasticsearch.yml to match the master node's configuration, especially these three config settings (see the example snippet after these steps):
cluster.name: <Same as master node>
node.name: < Node name for Node-2>
discovery.zen.ping.unicast.hosts: <Master Node IP>
3) Restart Node-2 Elasticsearch.
4) Verify Node-1 logs.
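A sketch of what those three settings could look like in Node-2's elasticsearch.yml, using the cluster name and master IP visible earlier in this question (adjust to your environment):
cluster.name: elasticsearch                          # same cluster name as the master node
node.name: node2                                     # unique name for Node-2
discovery.zen.ping.unicast.hosts: ["10.146.134.94"]  # IP of the master node (Node-1)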
