shard unassigned for elasticsearch cluster with docker - elasticsearch

Here is my compose file; escp is my Docker image:
elasticsearch_master:
  #image: elasticsearch:latest
  image: escp
  command: "elasticsearch \
    -Des.cluster.name=dcluster \
    -Des.node.name=esmaster \
    -Des.node.master=true \
    -Des.node.data=true \
    -Des.node.client=false \
    -Des.discovery.zen.minimum_master_nodes=1"
  volumes:
    - "${PWD}/es/config:/usr/share/elasticsearch/config"
    - "${PWD}/esdata/node:/usr/share/elasticsearch/data"
    - "${PWD}/es/plugins:/usr/share/elasticsearch/plugins"
  environment:
    - ES_HEAP_SIZE=512m
  ports:
    - "9200:9200"
    - "9300:9300"
elasticsearch1:
  #image: elasticsearch:latest
  image: escp
  command: "elasticsearch \
    -Des.cluster.name=dcluster \
    -Des.node.name=esnode1 \
    -Des.node.data=true \
    -Des.node.client=false \
    -Des.node.master=false \
    -Des.discovery.zen.minimum_master_nodes=1 \
    -Des.discovery.zen.ping.unicast.hosts=elasticsearch_master"
  links:
    - elasticsearch_master
  volumes:
    - "${PWD}/es/config:/usr/share/elasticsearch/config"
    - "${PWD}/esdata/node1:/usr/share/elasticsearch/data"
    - "${PWD}/es/plugins:/usr/share/elasticsearch/plugins"
  environment:
    - ES_HEAP_SIZE=512m
elasticsearch2:
  #image: elasticsearch:latest
  image: escp
  command: "elasticsearch \
    -Des.cluster.name=dcluster \
    -Des.node.name=esnode2 \
    -Des.node.data=true \
    -Des.node.client=false \
    -Des.node.master=false \
    -Des.discovery.zen.minimum_master_nodes=1 \
    -Des.discovery.zen.ping.unicast.hosts=elasticsearch_master"
  links:
    - elasticsearch_master
  volumes:
    - "${PWD}/es/config:/usr/share/elasticsearch/config"
    - "${PWD}/esdata/node2:/usr/share/elasticsearch/data"
    - "${PWD}/es/plugins:/usr/share/elasticsearch/plugins"
  environment:
    - ES_HEAP_SIZE=512m
This is the config file:
index.number_of_shards: 1
index.number_of_replicas: 0
network.host: 0.0.0.0
After starting the containers, they all show as up:
Name Command State Ports
--------------------------------------------------------------------------------------------------------------------
est_elasticsearch1_1 /docker-entrypoint.sh elas ... Up 9200/tcp, 9300/tcp
est_elasticsearch2_1 /docker-entrypoint.sh elas ... Up 9200/tcp, 9300/tcp
est_elasticsearch_master_1 /docker-entrypoint.sh elas ... Up 0.0.0.0:9200->9200/tcp, 0.0.0.0:9300->9300/tcp
But when I create a new index, the shard shows as UNASSIGNED...
curl -s '192.168.99.100:9200/_cluster/health?pretty'
{
"cluster_name" : "dcluster",
"status" : "red",
"timed_out" : false,
"number_of_nodes" : 3,
"number_of_data_nodes" : 3,
"active_primary_shards" : 0,
"active_shards" : 0,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 1,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0,
"task_max_waiting_in_queue_millis" : 0,
"active_shards_percent_as_number" : 0.0
}
check nodes
curl -s '192.168.99.100:9200/_cat/nodes?v'
host ip heap.percent ram.percent load node.role master name
172.17.0.2 172.17.0.2 13 33 0.00 d * esmaster
172.17.0.3 172.17.0.3 16 33 0.00 d - esnode1
172.17.0.4 172.17.0.4 13 33 0.00 d - esnode2
check shards
curl -s '192.168.99.100:9200/_cat/shards'
abcq 0 p UNASSIGNED
check allocation
curl -s '192.168.99.100:9200/_cat/allocation?v'
shards disk.indices disk.used disk.avail disk.total disk.percent host ip node
0 0b 223.4gb 9.5gb 232.9gb 95 172.17.0.4 172.17.0.4 esnode2
0 0b 223.4gb 9.5gb 232.9gb 95 172.17.0.2 172.17.0.2 esmaster
0 0b 223.4gb 9.5gb 232.9gb 95 172.17.0.3 172.17.0.3 esnode1
1 UNASSIGNED
check setting
curl 'http://192.168.99.100:9200/_cluster/settings?pretty'
{
"persistent" : { },
"transient" : { }
}
Enabled allocation; the cluster settings now show:
curl 'http://192.168.99.100:9200/_cluster/settings?pretty'
{
"persistent" : { },
"transient" : {
"cluster" : {
"routing" : {
"allocation" : {
"enable" : "true"
}
}
}
}
}
reroute index abcq
curl -XPOST http://192.168.99.100:9200/_cluster/reroute?pretty -d '{
"commands" : [
{
"allocate" : {
"index" : "abcq",
"shard" : 0,
"node" : "esnode2",
"allow_primary" : true
}
}
]
}'
and get the error below:
{
"error" : {
"root_cause" : [ {
"type" : "illegal_argument_exception",
"reason" : "[allocate] allocation of [abcq][0] on node {esnode2}{Pisl95VUSPmZa3Ga_e3sDA}{172.17.0.4}{172.17.0.4:9300}{master=false} is not allowed, reason: [YES(shard is primary)][YES(no allocation awareness enabled)][NO(more than allowed [90.0%] used disk on node, free: [4.078553722498398%])][YES(allocation disabling is ignored)][YES(primary shard can be allocated anywhere)][YES(node passes include/exclude/require filters)][YES(shard is not allocated to same node or host)][YES(total shard limit disabled: [index: -1, cluster: -1] <= 0)][YES(allocation disabling is ignored)][YES(no snapshots are currently running)][YES(below primary recovery limit of [4])]"
} ],
"type" : "illegal_argument_exception",
"reason" : "[allocate] allocation of [abcq][0] on node {esnode2}{Pisl95VUSPmZa3Ga_e3sDA}{172.17.0.4}{172.17.0.4:9300}{master=false} is not allowed, reason: [YES(shard is primary)][YES(no allocation awareness enabled)][NO(more than allowed [90.0%] used disk on node, free: [4.078553722498398%])][YES(allocation disabling is ignored)][YES(primary shard can be allocated anywhere)][YES(node passes include/exclude/require filters)][YES(shard is not allocated to same node or host)][YES(total shard limit disabled: [index: -1, cluster: -1] <= 0)][YES(allocation disabling is ignored)][YES(no snapshots are currently running)][YES(below primary recovery limit of [4])]"
},
"status" : 400
}
Why does creating a new index leave the shard unassigned? Can anyone help? Thanks.

Here is how to fix the problem. The key part of the error is:
more than allowed [90.0%] used disk on node
It means my disk is nearly full, so there is not enough free space for shard allocation.
shards disk.indices disk.used disk.avail disk.total disk.percent host ip node
0 0b 223.4gb 9.5gb 232.9gb 95 172.17.0.4 172.17.0.4 esnode2
Either disable the disk threshold check or set the watermarks lower:
curl -XPUT localhost:9200/_cluster/settings -d '{
"transient" : {
"cluster.routing.allocation.disk.threshold_enabled" : false
}
}'
curl -XPUT http://192.168.99.100:9200/_cluster/settings -d '
{
"transient" : {
"cluster.routing.allocation.disk.watermark.low": "10%",
"cluster.routing.allocation.disk.watermark.high": "10gb",
"cluster.info.update.interval": "1m"
}
}'
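To confirm the fix, the same checks from earlier can be re-run against the same host; the shard should move out of UNASSIGNED and the cluster should go back to green:
curl -s '192.168.99.100:9200/_cluster/health?pretty'
curl -s '192.168.99.100:9200/_cat/shards?v'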
Hope this can help others; more details can be found here:
https://www.elastic.co/guide/en/elasticsearch/reference/current/disk-allocator.html

Related

elasticsearch 7 is in YELLOW state with unassigned shard

I'm unable to get a single-master-node cluster into a working green condition:
elasticsearch: 7.17.0
I guess that is because I have unassigned_shards > 0
config:
apiVersion: v1
data:
  elasticsearch.yml: |-
    discovery:
      type: single-node
    network:
      host: 0.0.0.0
    path:
      data: /bitnami/elasticsearch/data
    xpack:
      ml:
        enabled: false
kind: ConfigMap
metadata:
  labels:
    app.kubernetes.io/instance: elasticsearch
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: elasticsearch
    argocd.argoproj.io/instance: elasticsearch
    helm.sh/chart: elasticsearch-19.5.5
  name: elasticsearch
  namespace: elasticsearch
kubectl logs elasticsearch-master-0
[2022-12-25T07:52:40,652][INFO ][o.e.c.r.a.AllocationService] [elasticsearch-master-0] Cluster health status changed from [RED] to [YELLOW] (reason: [shards started [[.ds-ilm-history-5-2022.10.28-000014][0], [.ds-ilm-history-5-2022.09.28-000012][0]]]).
[2022-12-25T07:52:40,856][INFO ][o.e.i.g.GeoIpDownloader ] [elasticsearch-master-0] updating geoip database [GeoLite2-ASN.mmdb]
✗ curl -XGET http://localhost:9200/_cluster/health\?pretty\=true
{
"cluster_name" : "elasticsearch",
"status" : "yellow",
"timed_out" : false,
"number_of_nodes" : 1,
"number_of_data_nodes" : 1,
"active_primary_shards" : 7,
"active_shards" : 7,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 1,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0,
"task_max_waiting_in_queue_millis" : 0,
"active_shards_percent_as_number" : 87.5
}
➜ curl -XGET http://localhost:9200/_cat/shards
magento2_product_1_v583 0 p STARTED 4868 18.9mb 10.110.4.229 elasticsearch-master-0
.ds-ilm-history-5-2022.10.28-000014 0 p STARTED 10.110.4.229 elasticsearch-master-0
.ds-ilm-history-5-2022.11.27-000015 0 p STARTED 10.110.4.229 elasticsearch-master-0
.ds-ilm-history-5-2022.08.29-000010 0 p STARTED 10.110.4.229 elasticsearch-master-0
.ds-ilm-history-5-2022.09.28-000012 0 p STARTED 10.110.4.229 elasticsearch-master-0
.geoip_databases 0 p STARTED 40 38.1mb 10.110.4.229 elasticsearch-master-0
.ds-.logs-deprecation.elasticsearch-default-2022.12.21-000022 0 p STARTED 10.110.4.229 elasticsearch-master-0
.ds-.logs-deprecation.elasticsearch-default-2022.12.21-000022 0 r UNASSIGNED
I'm trying to delete it but am facing an error:
➜ curl -XGET http://localhost:9200/_cat/shards | grep UNASSIGNED | awk {'print $1'} | xargs -i curl -XDELETE "http://localhost:9200/{}"
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 986 100 986 0 0 5241 0 --:--:-- --:--:-- --:--:-- 5244
{"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"index [.ds-.logs-deprecation.elasticsearch-default-2022.12.21-000022] is the write index for data stream [.logs-deprecation.elasticsearch-default] and cannot be deleted"}],"type":"illegal_argument_exception","reason":"index [.ds-.logs-deprecation.elasticsearch-default-2022.12.21-000022] is the write index for data stream [.logs-deprecation.elasticsearch-default] and cannot be deleted"},"status":400}
GET /_cluster/allocation/explain:
➜ curl -XGET http://localhost:9200/_cluster/allocation/explain\?pretty\=true | jq
{
"note": "No shard was specified in the explain API request, so this response explains a randomly chosen unassigned shard. There may be other unassigned shards in this cluster which cannot be assigned for different reasons. It may not be possible to assign this shard until one of the other shards is assigned correctly. To explain the allocation of other shards (whether assigned or unassigned) you must specify the target shard in the request to this API.",
"index": ".ds-.logs-deprecation.elasticsearch-default-2022.12.21-000022",
"shard": 0,
"primary": false,
"current_state": "unassigned",
"unassigned_info": {
"reason": "CLUSTER_RECOVERED",
"at": "2022-12-25T07:52:37.022Z",
"last_allocation_status": "no_attempt"
},
"can_allocate": "no",
"allocate_explanation": "cannot allocate because allocation is not permitted to any of the nodes",
"node_allocation_decisions": [
{
"node_id": "aURccTcnSuqPC3fBfmezCg",
"node_name": "elasticsearch-master-0",
"transport_address": "10.110.4.229:9300",
"node_attributes": {
"xpack.installed": "true",
"transform.node": "true"
},
"node_decision": "no",
"deciders": [
{
"decider": "same_shard",
"decision": "NO",
"explanation": "a copy of this shard is already allocated to this node [[.ds-.logs-deprecation.elasticsearch-default-2022.12.21-000022][0], node[aURccTcnSuqPC3fBfmezCg], [P], s[STARTED], a[id=tsxhnODlSn-i__-vEvJj3A]]"
}
]
}
]
So what can be done in such a scenario?
curl -v -XPUT "localhost:9200/*/_settings" -H 'Content-Type: application/json' -d '
{
"index" : {
"number_of_replicas" : 0
}
}
'
{"acknowledged":true}
curl -XGET http://localhost:9200/_cat/indices
green open magento2_product_1_v583 hvYpUxJUT16-g6_YS8qkaA 1 0 4868 0 18.9mb 18.9mb
green open .geoip_databases tDXBLQRdSFeQyi6Pk5zq2Q 1 0 40 40 38.1mb 38.1mb
Yellow status indicates that one or more of the replica shards in the Elasticsearch cluster are not allocated to a node. When you have only one node, it means your number of replicas is greater than your number of nodes. Elasticsearch will never assign a replica to the same node as the primary shard, so if you only have one node it is perfectly normal and expected for your cluster to indicate yellow. But if you are not convinced and want your cluster to be green, set the number of replicas of each index to 0:
PUT /my-index/_settings
{
"index" : {
"number_of_replicas" : 0
}
}

Why does Elasticsearch allow me to index documents in a single node cluster that fails to meet quorum requirements?

From https://www.elastic.co/guide/en/elasticsearch/guide/2.x/distrib-write.html:
Note that the number_of_replicas is the number of replicas specified in the index settings, not the number of replicas that are currently active. If you have specified that an index should have three replicas, a quorum would be as follows:
int( (primary + 3 replicas) / 2 ) + 1 = 3
But if you start only two nodes, there will be insufficient active shard copies to satisfy the quorum, and you will be unable to index or delete any documents.
I ran the following commands on a single-node cluster and I was able to index a document successfully even though the math above says I should not be able to index documents.
curl -X DELETE http://localhost:9200/a/?pretty
curl -X PUT -siH 'Content-Type: application/json' \
http://localhost:9200/a?pretty -d '{
"settings": {
"number_of_replicas": 3
}
}'
curl -sH 'Content-Type: application/json' -X PUT http://localhost:9200/a/a/1?pretty -d '{"a": "a"}'
curl -si http://localhost:9200/_cluster/health?pretty
curl -si http://localhost:9200/a/a/1?pretty
Here is the output:
$ curl -X PUT -siH 'Content-Type: application/json' \
http://localhost:9200/a?pretty -d '{
"settings": {
"number_of_replicas": 3
}
}'
HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 77
{
"acknowledged" : true,
"shards_acknowledged" : true,
"index" : "a"
}
$ curl -sH 'Content-Type: application/json' -X PUT http://localhost:9200/a/a/1?pretty -d '{"a": "a"}'
{
"_index" : "a",
"_type" : "a",
"_id" : "1",
"_version" : 1,
"result" : "created",
"_shards" : {
"total" : 4,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 0,
"_primary_term" : 1
}
$ curl -si http://localhost:9200/_cluster/health?pretty
HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 468
{
"cluster_name" : "docker-cluster",
"status" : "yellow",
"timed_out" : false,
"number_of_nodes" : 1,
"number_of_data_nodes" : 1,
"active_primary_shards" : 5,
"active_shards" : 5,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 15,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0,
"task_max_waiting_in_queue_millis" : 0,
"active_shards_percent_as_number" : 25.0
}
$ curl -si http://localhost:9200/a/a/1?pretty
HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 124
{
"_index" : "a",
"_type" : "a",
"_id" : "1",
"_version" : 1,
"found" : true,
"_source" : {
"a" : "a"
}
}
How was I able to index a document with just 1 node even though I have configured the index to have 3 replica shards and the math says I must have at least 3 nodes?
That was an old quorum rule in ES 2.x. As of ES 5.x the write consistency checks have changed a little bit: a yellow cluster, i.e. a cluster with all primary shards allocated, will pass the consistency checks for write operations and allow you to index and delete documents.
Now the way to decide whether writes can be made is the wait_for_active_shards parameter on the indexing operation. By default, the index operation is allowed as long as the primary shard is up. You can override that setting by specifying the number of shard copies that must be active before an indexing operation is authorized, e.g. wait_for_active_shards=all, which is equivalent to wait_for_active_shards=4 (4 = 1 primary + 3 replicas) in your case. If you want the same quorum rule as before, you'd specify wait_for_active_shards=3.
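For example (a sketch against the same single-node setup from the question; the document id is made up), asking for a quorum of 3 active copies on the write itself would block the indexing call instead of letting it through:
curl -sH 'Content-Type: application/json' -X PUT 'http://localhost:9200/a/a/2?wait_for_active_shards=3&pretty' -d '{"a": "a"}'
On the one-node cluster above this request fails after the timeout rather than succeeding, because only 1 of the 3 requested copies is active.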
More info can be found in the official documentation and here

Kibana is not connecting to ElasticSearch

I am trying to use the Kubernetes 1.7.12 fluentd-elasticsearch addon:
https://github.com/kubernetes/kubernetes/tree/v1.7.12/cluster/addons/fluentd-elasticsearch
ElasticSearch starts up and can respond with:
{
"name" : "0322714ad5b7",
"cluster_name" : "kubernetes-logging",
"cluster_uuid" : "_na_",
"version" : {
"number" : "2.4.1",
"build_hash" : "c67dc32e24162035d18d6fe1e952c4cbcbe79d16",
"build_timestamp" : "2016-09-27T18:57:55Z",
"build_snapshot" : false,
"lucene_version" : "5.5.2"
},
"tagline" : "You Know, for Search"
}
But Kibana is still unable to connect to it. The connection error starts out with:
{"type":"log","#timestamp":"2018-01-23T07:42:06Z","tags":["warning","elasticsearch"],"pid":6,"message":"Unable to revive connection: http://elasticsearch-logging:9200/"}
{"type":"log","#timestamp":"2018-01-23T07:42:06Z","tags":["warning","elasticsearch"],"pid":6,"message":"No living connections"}
And after ElasticSearch is up, the error changes to:
{"type":"log","#timestamp":"2018-01-23T07:42:08Z","tags":["status","plugin:elasticsearch#1.0.0","error"],"pid":6,"state":"red","message":"Status changed from red to red - Service Unavailable","prevState":"red","prevMsg":"Unable to connect to Elasticsearch at http://elasticsearch-logging:9200."}
So it seems as though Kibana is finally able to get a response from ElasticSearch, but a connection still cannot be established.
This is what the Kibana dashboard looks like:
I tried to get the logs to output more information, but do not have enough knowledge about Kibana and ElasticSearch to know what else I can try next.
I am able to reproduce the error locally using this docker-compose.yml:
version: '2'
services:
  elasticsearch-logging:
    image: gcr.io/google_containers/elasticsearch:v2.4.1-2
    ports:
      - "9200:9200"
      - "9300:9300"
  kibana-logging:
    image: gcr.io/google_containers/kibana:v4.6.1-1
    ports:
      - "5601:5601"
    depends_on:
      - elasticsearch-logging
    environment:
      - ELASTICSEARCH_URL=http://elasticsearch-logging:9200
It doesn't look like there should be much involved based on what I can tell from this question:
Kibana on Docker cannot connect to Elasticsearch
and this blog: https://gunith.github.io/docker-kibana-elasticsearch/
But I can't figure out what I'm missing.
Any ideas what else I might be able to try?
Thank you for your time. :)
Update 1:
Curling http://elasticsearch-logging on the Kubernetes cluster resulted in the same output:
{
"name" : "elasticsearch-logging-v1-68km4",
"cluster_name" : "kubernetes-logging",
"cluster_uuid" : "_na_",
"version" : {
"number" : "2.4.1",
"build_hash" : "c67dc32e24162035d18d6fe1e952c4cbcbe79d16",
"build_timestamp" : "2016-09-27T18:57:55Z",
"build_snapshot" : false,
"lucene_version" : "5.5.2"
},
"tagline" : "You Know, for Search"
}
Curling http://elasticsearch-logging/_cat/indices?pretty on the Kubernetes cluster timed out because of a proxy rule. Using the docker-compose.yml and curling locally (e.g. curl localhost:9200/_cat/indices?pretty) results in:
{
"error" : {
"root_cause" : [ {
"type" : "master_not_discovered_exception",
"reason" : null
} ],
"type" : "master_not_discovered_exception",
"reason" : null
},
"status" : 503
}
The docker-compose logs show:
[2018-01-23 17:04:39,110][DEBUG][action.admin.cluster.state] [ac1f2a13a637] no known master node, scheduling a retry
[2018-01-23 17:05:09,112][DEBUG][action.admin.cluster.state] [ac1f2a13a637] timed out while retrying [cluster:monitor/state] after failure (timeout [30s])
[2018-01-23 17:05:09,116][WARN ][rest.suppressed ] path: /_cat/indices, params: {pretty=}
MasterNotDiscoveredException[null]
at org.elasticsearch.action.support.master.TransportMasterNodeAction$AsyncSingleAction$5.onTimeout(TransportMasterNodeAction.java:234)
at org.elasticsearch.cluster.ClusterStateObserver$ObserverClusterStateListener.onTimeout(ClusterStateObserver.java:236)
at org.elasticsearch.cluster.service.InternalClusterService$NotifyTimeout.run(InternalClusterService.java:804)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Update 2:
Running kubectl --namespace kube-system logs -c kubedns po/kube-dns-667321983-dt5lz --tail 50 --follow
yields:
I0124 16:43:33.591112 5 dns.go:264] New service: kibana-logging
I0124 16:43:33.591225 5 dns.go:264] New service: nginx
I0124 16:43:33.591251 5 dns.go:264] New service: registry
I0124 16:43:33.591274 5 dns.go:264] New service: sudoe
I0124 16:43:33.591295 5 dns.go:264] New service: default-http-backend
I0124 16:43:33.591317 5 dns.go:264] New service: kube-dns
I0124 16:43:33.591344 5 dns.go:462] Added SRV record &{Host:kube-dns.kube-system.svc.cluster.local. Port:53 Priority:10 Weight:10 Text: Mail:false Ttl:30 TargetStrip:0 Group: Key:}
I0124 16:43:33.591369 5 dns.go:462] Added SRV record &{Host:kube-dns.kube-system.svc.cluster.local. Port:53 Priority:10 Weight:10 Text: Mail:false Ttl:30 TargetStrip:0 Group: Key:}
I0124 16:43:33.591390 5 dns.go:264] New service: kubernetes
I0124 16:43:33.591409 5 dns.go:462] Added SRV record &{Host:kubernetes.default.svc.cluster.local. Port:443 Priority:10 Weight:10 Text: Mail:false Ttl:30 TargetStrip:0 Group: Key:}
I0124 16:43:33.591429 5 dns.go:264] New service: elasticsearch-logging
Update 3:
I'm still trying to get everything to work, but with the help of others, I am confident it is an RBAC issue. I'm not completely sure, but it looks like the Elasticsearch nodes were not able to connect to the master (which I never knew was even needed) due to permissions.
Here are some steps that helped, in case it helps others starting out:
with RBAC on:
# kubectl --kubeconfig kubeconfig.yaml --namespace kube-system logs po/elasticsearch-logging-v1-wkwcs
F0119 00:18:44.285773 9 elasticsearch_logging_discovery.go:60] kube-system namespace doesn't exist: User "system:serviceaccount:kube-system:default" cannot get namespaces in the namespace "kube-system". (get namespaces kube-system)
goroutine 1 [running]:
k8s.io/kubernetes/vendor/github.com/golang/glog.stacks(0x1f7f600, 0xc400000000, 0xee, 0x1b2)
vendor/github.com/golang/glog/glog.go:766 +0xa5
k8s.io/kubernetes/vendor/github.com/golang/glog.(*loggingT).output(0x1f5f5c0, 0xc400000003, 0xc42006c300, 0x1ef20c8, 0x22, 0x3c, 0x0)
vendor/github.com/golang/glog/glog.go:717 +0x337
k8s.io/kubernetes/vendor/github.com/golang/glog.(*loggingT).printf(0x1f5f5c0, 0xc400000003, 0x16949d6, 0x1e, 0xc420579ee8, 0x2, 0x2)
vendor/github.com/golang/glog/glog.go:655 +0x14c
k8s.io/kubernetes/vendor/github.com/golang/glog.Fatalf(0x16949d6, 0x1e, 0xc420579ee8, 0x2, 0x2)
vendor/github.com/golang/glog/glog.go:1145 +0x67
main.main()
cluster/addons/fluentd-elasticsearch/es-image/elasticsearch_logging_discovery.go:60 +0xb53
[2018-01-19 00:18:45,273][INFO ][node ] [elasticsearch-logging-v1-wkwcs] version[2.4.1], pid[5], build[c67dc32/2016-09-27T18:57:55Z]
[2018-01-19 00:18:45,275][INFO ][node ] [elasticsearch-logging-v1-wkwcs] initializing ...
# kubectl --kubeconfig kubeconfig.yaml --namespace kube-system exec kibana-logging-2104905774-69wgv curl elasticsearch-logging.kube-system:9200/_cat/indices?pretty
{
"error" : {
"root_cause" : [ {
"type" : "master_not_discovered_exception",
"reason" : null
} ],
"type" : "master_not_discovered_exception",
"reason" : null
},
"status" : 503
}
With RBAC off:
# kubectl --kubeconfig kubeconfig.yaml --namespace kube-system log elasticsearch-logging-v1-7shgk
[2018-01-26 01:19:52,294][INFO ][node ] [elasticsearch-logging-v1-7shgk] version[2.4.1], pid[5], build[c67dc32/2016-09-27T18:57:55Z]
[2018-01-26 01:19:52,294][INFO ][node ] [elasticsearch-logging-v1-7shgk] initializing ...
[2018-01-26 01:19:53,077][INFO ][plugins ] [elasticsearch-logging-v1-7shgk] modules [reindex, lang-expression, lang-groovy], plugins [], sites []
# kubectl --kubeconfig kubeconfig.yaml --namespace kube-system exec elasticsearch-logging-v1-7shgk curl http://elasticsearch-logging:9200/_cat/indices?pretty
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 40 100 40 0 0 2 0 0:00:20 0:00:15 0:00:05 10
green open .kibana 1 1 1 0 6.2kb 3.1kb
Thanks everyone for your help :)
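If you would rather keep RBAC on than disable it, a minimal sketch of granting the discovery step the access it complains about might look like the commands below (the verbs, resources, and role names are inferred from the error message above, not taken from the addon's official manifests):
# hypothetical: let the default service account in kube-system read namespaces, services and endpoints
kubectl create clusterrole elasticsearch-logging-discovery --verb=get,list --resource=namespaces,services,endpoints
kubectl create clusterrolebinding elasticsearch-logging-discovery --clusterrole=elasticsearch-logging-discovery --serviceaccount=kube-system:default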
A few troubleshooting tips:
1) Ensure ElasticSearch is running fine.
Enter the container running elasticsearch and run:
curl localhost:9200
You should get a JSON, with some data about elasticsearch.
2) Ensure ElasticSearch is reachable from the kibana container.
Enter the kibana container and run:
curl <elasticsearch_service_name>:9200
You should get the same output as above.
3) Ensure your ES indices are fine.
Run the following command from the elasticsearch container:
curl localhost:9200/_cat/indices?pretty
You should get a table with all indices in your ES cluster and their status (which should be green or yellow in case you only have one ES replica).
If one of the above points fails, check the logs of your ES container for any error messages and try to solve them.
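With the docker-compose.yml from the question, steps 1) and 2) can also be run without an interactive shell, assuming curl exists inside the images (service names taken from the compose file):
# step 1: from inside the elasticsearch container
docker-compose exec elasticsearch-logging curl -s localhost:9200
# step 2: from inside the kibana container, using the service name as hostname
docker-compose exec kibana-logging curl -s elasticsearch-logging:9200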
This exception indicates one of two misconfigurations:
1. The DNS addon of Kubernetes is not working properly. Check your DNS addon logs.
2. Pod-to-pod communication is not working properly. This is related to your underlying SDN/CNI addon (flannel, Calico, etc.).
You can check by pinging one pod from another pod. If it is not working, then check your networking configuration, especially the kube-proxy component.
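A quick way to test both from inside the cluster (the pod name and IP are placeholders, and nslookup/ping must exist in the image):
# DNS: resolve the elasticsearch service from the kibana pod
kubectl --namespace kube-system exec <kibana-pod> -- nslookup elasticsearch-logging
# pod-to-pod: reach another pod's IP directly
kubectl --namespace kube-system exec <kibana-pod> -- ping -c 3 <other-pod-ip>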

ElasticSearch Unassigned Shard

I have 2 nodes in an Elasticsearch cluster, each with 8 CPUs and 16 GB RAM. I have set ES_HEAP_SIZE to 10 GB.
In my yml configuration file on both machines I have set:
index.number_of_shards: 5
index.number_of_replicas: 1
Both machines are configured as master/data true. Now the problem is that shard 0 on node 1 is unassigned after a restart. I tried:
for shard in $(curl -XGET http://localhost:9201/_cat/shards | grep UNASSIGNED | awk '{print $2}'); do
echo "processing $shard"
curl -XPOST 'localhost:9201/_cluster/reroute' -d '{
"commands" : [ {
"allocate" : {
"index" : "inxn",
"shard" : '$shard',
"node" : "node1",
"allow_primary" : true
}
}
]
}'
done
It does not give any error; it says acknowledged: true and shows the shard status as initializing, but when I view the shard it is still unassigned.
Am I doing anything wrong in the settings? Should I make both nodes master/data true and set shards: 5 and replicas: 1 on both machines?
Any help or suggestion would be greatly appreciated.
Thanks
I used a trick to solve the same issue: I renamed the 0 folder under indices on node1 and then did a forced allocation of shard 0 on node1, and it worked for me.
curl -XPOST 'localhost:9201/_cluster/reroute' -d '{
"commands" : [ {
"allocate" : {
"index" : "inxc",
"shard" : 0,
"node" : "node1",
"allow_primary" : true
}
}
]
}'
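To confirm the forced allocation actually took, the shard listing for that index can be polled on the same host; the shard should go from INITIALIZING to STARTED:
curl -s 'localhost:9201/_cat/shards/inxc?v'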

ElasticSearch find disk space usage

How can I find the amount of disk space that Elastic Search is using for my indexes? I'm currently running it locally and I'm trying to see how much disk space I will need on the VM that I'll be spinning up.
The Elasticsearch way to do this would be to use _cat/shards and look at the store column:
curl -XGET "http://localhost:9200/_cat/shards?v"
index shard prirep state docs store ip node
myindex_2014_12_19 2 r STARTED 76661 415.6mb 192.168.1.1 Georgianna Castleberry
myindex_2014_12_19 2 p STARTED 76661 417.3mb 192.168.1.2 Frederick Slade
myindex_2014_12_19 2 r STARTED 76661 416.9mb 192.168.1.3 Maverick
myindex_2014_12_19 0 r STARTED 76984 525.9mb 192.168.1.1 Georgianna Castleberry
myindex_2014_12_19 0 r STARTED 76984 527mb 192.168.1.2 Frederick Slade
myindex_2014_12_19 0 p STARTED 76984 526mb 192.168.1.3 Maverick
myindex_2014_12_19 3 r STARTED 163 208.5kb 192.168.1.1 Georgianna Castleberry
myindex_2014_12_19 3 p STARTED 163 191.4kb 192.168.1.2 Frederick Slade
myindex_2014_12_19 3 r STARTED 163 181.6kb 192.168.1.3 Maverick
myindex_2014_12_19 1 p STARTED 424923 2.1gb 192.168.1.1 Georgianna Castleberry
myindex_2014_12_19 1 r STARTED 424923 2.1gb 192.168.1.2 Frederick Slade
myindex_2014_12_19 1 r STARTED 424923 2.1gb 192.168.1.3 Maverick
myindex_2014_12_19 4 r STARTED 81020 435.9mb 192.168.1.1 Georgianna Castleberry
myindex_2014_12_19 4 p STARTED 81020 437.8mb 192.168.1.2 Frederick Slade
myindex_2014_12_19 4 r STARTED 81020 437.8mb 192.168.1.3 Maverick
Otherwise, on Linux, to view the space used by folder:
du -hs /myelasticsearch/data/folder
or to view the space by filesystem:
df -h
In case you don't need per-shard statistics returned by /_cat/shards you can use
curl -XGET 'http://localhost:9200/_cat/allocation?v'
to get used and available disk space for each node.
To view the overall disk usage and available space on the ES cluster, you can use the following command:
curl -XGET 'localhost:9200/_cat/allocation?v&pretty'
Hope this helps.
You can use the nodes stats REST API.
see: https://www.elastic.co/guide/en/elasticsearch/reference/1.6/cluster-nodes-stats.html
Make a request for the fs stats like so:
http://<host>:9200/_nodes/stats/fs?pretty=1
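For example, with curl against a local node:
curl -XGET 'http://localhost:9200/_nodes/stats/fs?pretty=1'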
and you will see:
{
"cluster_name" : "<cluster>",
"nodes" : {
"pEO34wutR7qk3Ix8N7MgyA" : {
"timestamp" : 1438880525206,
"name" : "<name>",
"transport_address" : "inet[/10.128.37.111:9300]",
"host" : "<host>",
"ip" : [ "inet[/10.128.37.111:9300]", "NONE" ],
"fs" : {
"timestamp" : 1438880525206,
"total" : {
"total_in_bytes" : 363667091456,
"free_in_bytes" : 185081352192,
"available_in_bytes" : 166608117760,
"disk_reads" : 154891,
"disk_writes" : 482628039,
"disk_io_op" : 482782930,
"disk_read_size_in_bytes" : 6070391808,
"disk_write_size_in_bytes" : 1989713248256,
"disk_io_size_in_bytes" : 1995783640064,
"disk_queue" : "0",
"disk_service_time" : "0"
},
"data" : [ {
"path" : "/data1/elasticsearch/data/<cluster>/nodes/0",
"mount" : "/data1",
"dev" : "/dev/sda4",
"total_in_bytes" : 363667091456,
"free_in_bytes" : 185081352192,
"available_in_bytes" : 166608117760,
"disk_reads" : 154891,
"disk_writes" : 482628039,
"disk_io_op" : 482782930,
"disk_read_size_in_bytes" : 6070391808,
"disk_write_size_in_bytes" : 1989713248256,
"disk_io_size_in_bytes" : 1995783640064,
"disk_queue" : "0",
"disk_service_time" : "0"
} ]
}
}
}
}
The space for the data drive is listed:
"total" : {
"total_in_bytes" : 363667091456,
"free_in_bytes" : 185081352192,
"available_in_bytes" : 166608117760,
A more concise solution to find the size of indices is to use
curl -XGET 'localhost:9200/_cat/indices?v'
The output has a 'store.size' column that tells you exactly the size of an index.
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
yellow open logstash-2017.03.01 TfraFM8TQkSXdxjx13CnpQ 5 1 33330000 0 1gb 1gb
yellow open .monitoring-es-2-2017.03.02 10YscrcfQuGny5wMxeb0TA 1 1 68834 88 30.3mb 30.3mb
yellow open .kibana GE6xXV7QT-mNbX7xTPbZ4Q 1 1 3 0 14.5kb 14.5kb
yellow open .monitoring-es-2-2017.03.01 SPeQNnPlRB6y7G6w1Axokw 1 1 29441 108 14.7mb 14.7mb
yellow open .monitoring-data-2 LLeWqsD-QE-rPFblwu5K_Q 1 1 3 0 6.9kb 6.9kb
yellow open .monitoring-kibana-2-2017.03.02 l_MAPERUTmSbq0xbhpnf2Q 1 1 5320 0 1.1mb 1.1mb
yellow open .monitoring-kibana-2-2017.03.01 UFVg9c7TTA-nbsEd2d4oFw 1 1 2699 0 763.4kb 763.4kb
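If a single total across all indices is enough, the same cat API can return raw bytes that are easy to sum (a small sketch using awk; the total includes replica copies):
# sum store.size over all indices, in bytes
curl -s 'localhost:9200/_cat/indices?h=store.size&bytes=b' | awk '{total += $1} END {print total}'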
In addition you can find out about available disk space by using
curl -XGET 'localhost:9200/_nodes/_local/stats/fs'
Look up the disk space information under the 'fs' key
{
"_nodes": {
"total": 1,
"successful": 1,
"failed": 0
},
"cluster_name": "elasticsearch",
"nodes": {
"MfgVaoRQT9iRAZtAvO549Q": {
"fs": {
"timestamp": 1488466297268,
"total": {
"total_in_bytes": 29475753984,
"free_in_bytes": 18352095232,
"available_in_bytes": 18352095232
},
}
}
}
}
I've tested this for ElasticSearch version 5.2.1
You may want to use the _cat API for node-wise disk space usage:
curl http://host:9200/_cat/nodes?h=h,diskAvail
Reference : https://www.elastic.co/guide/en/elasticsearch/reference/current/cat-nodes.html
Run the command below to find the disk space used by each Elasticsearch shard:
# FOR SHARDS
curl 'http://host:9200/_cat/shards?v&pretty'
# OR
GET _cat/shards?v&pretty
Run the command below to find the disk space used by each Elasticsearch index:
# FOR INDICES
curl -XGET 'host:9200/_cat/indices?v&pretty'
# SORT INDICES BY STORE SIZE
curl -XGET 'host:9200/_cat/indices/_all?v&s=store.size'
OUTPUT
# GET /_cat/indices/_all?v&s=store.size
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
yellow open sync-rails-logs sSIBqr2iQHG8TGeKFozTpQ 5 1 0 0 1.2kb 1.2kb
yellow open web-nginx-logs iTV-xFFBSdy-C2-NTuEwqQ 5 1 0 0 1.2kb 1.2kb
yellow open web-rails-logs BYD_qHS8SguZvBuGpNvCwA 5 1 0 0 1.2kb 1.2kb
yellow open sync-nginx-logs XAI1hsxlT6qBYN4Ql36lbg 5 1 0 0 1.2kb 1.2kb
green open .tasks XGrMZiqCR0Wr33cCG1u0VQ 1 0 1 0 6.2kb 6.2kb
green open .kibana_1 -g0ztoGWQnuOXnP6di7OYQ 1 0 13 0 100.6kb 100.6kb
green open .kibana_2 eAxt-LXbQyybCyp_6ZYNZg 1 0 14 5 432.2kb 432.2kb
green open sync-nginx-logs-2019-09-13 Q_Ki0dvXQEiuqiGCd10hRg 1 0 144821 0 28.8mb 28.8mb
green open sync-nginx-logs-2019-08-31 m7-oi7ZTSM6ZH_wPDWwbdw 1 0 384954 0 76.4mb 76.4mb
yellow open sync-nginx-logs-2019-08-26 gAvOPNhMRZK6fjAazpzPQQ 5 1 354260 0 76.5mb 76.5mb
green open sync-nginx-logs-2019-09-01 vvgysMB_SqGDFegF6_wOEQ 1 0 400248 0 79.5mb 79.5mb
green open sync-nginx-logs-2019-09-02 8yHv66FuTE6A8L5GgnEl3g 1 0 416184 0 84.8mb 84.8mb
green open sync-nginx-logs-2019-09-07 iZCX1A3fRMaglOCHFLaFsA 1 0 436122 0 86.7mb 86.7mb
green open sync-nginx-logs-2019-09-08 4Y9rA_1cSlGJ9KADmickQQ 1 0 446164 0 88.3mb 88.3mb
Run the command below to find the overall disk space available on the Elasticsearch nodes:
GET _cat/nodes?h=h,diskAvail
OR
curl http://host:9200/_cat/nodes?h=h,diskAvail
OUTPUT:-
148.3gb
Alternatively, you can query the disk directly to measure the space used by each directory under /var/lib/elasticsearch/[environment name]/nodes/0/indices on the Elasticsearch nodes.
$ du -b --max-depth=1 /var/lib/elasticsearch/[environment name]/nodes/0/indices \
| sort -rn | numfmt --to=iec --suffix=B --padding=5
> 17GB /var/lib/elasticsearch/env1/nodes/0/indices
3.8GB /var/lib/elasticsearch/env1/nodes/0/indices/index1
2.1GB /var/lib/elasticsearch/env1/nodes/0/indices/index2
1.2GB ...
