Enterprise Search timeout for Elasticsearch create index
I am using ECK to deploy an Elasticsearch cluster on Kubernetes.
Elasticsearch itself is working fine and the cluster health is green. But when Enterprise Search starts and begins creating its indices in Elasticsearch, it fails with a timeout error after creating some of them.
pv.yaml
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: elasticsearch-master
  labels:
    type: local
spec:
  storageClassName: standard
  capacity:
    storage: 5Gi
  accessModes:
  - ReadWriteOnce
  hostPath:
    path: /mnt/nfs/kubernetes/elasticsearch/master/
...
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: elasticsearch-data
  labels:
    type: local
spec:
  storageClassName: standard
  capacity:
    storage: 5Gi
  accessModes:
  - ReadWriteOnce
  hostPath:
    path: /mnt/nfs/kubernetes/elasticsearch/data/
...
multi_node.yaml
---
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: bselastic
spec:
  version: 8.1.2
  nodeSets:
  - name: masters
    count: 1
    config:
      node.roles: ["master",
                   # "data",
                  ]
      xpack.ml.enabled: true
    # A volume claim template is needed to add the volume; without it the pod
    # was failing with a missing volume claim error and not starting.
    volumeClaimTemplates:
    - metadata:
        name: elasticsearch-data # Do not change this name unless you set up a volume mount for the data path.
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 5Gi
        storageClassName: standard
  - name: data-node
    count: 1
    config:
      node.roles: ["data", "ingest"]
    volumeClaimTemplates:
    - metadata:
        name: elasticsearch-data
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 5Gi
        storageClassName: standard
...
---
apiVersion: enterprisesearch.k8s.elastic.co/v1
kind: EnterpriseSearch
metadata:
  name: enterprise-search-bselastic
spec:
  version: 8.1.3
  count: 1
  elasticsearchRef:
    name: bselastic
  podTemplate:
    spec:
      containers:
      - name: enterprise-search
        env:
        - name: JAVA_OPTS
          value: -Xms2g -Xmx2g
        - name: "elasticsearch.startup_retry.interval"
          value: "30"
        - name: allow_es_settings_modification
          value: "true"
...
Apply these changes using the command below.
kubectl apply -f multi_node.yaml -n deleteme -f pv.yaml
Check the Elasticsearch cluster status
# kubectl get es -n deleteme
NAME HEALTH NODES VERSION PHASE AGE
bselastic unknown 8.1.2 ApplyingChanges 47s
Check all pods
# kubectl get pod -n deleteme
NAME READY STATUS RESTARTS AGE
bselastic-es-data-node-0 0/1 Running 0 87s
bselastic-es-masters-0 1/1 Running 0 87s
enterprise-search-bselastic-ent-54675f95f8-9sskf 0/1 Running 0 86s
The Elasticsearch cluster becomes green after 7+ minutes.
[root@1175014-kubemaster01 nilesh]# kubectl get es -n deleteme
NAME HEALTH NODES VERSION PHASE AGE
bselastic green 2 8.1.2 Ready 7m30s
Enterprise Search log
# kubectl -n deleteme logs -f enterprise-search-bselastic-ent-549bbcb9-rnhmc
Custom Enterprise Search configuration file detected, not overwriting it (any settings passed via environment will be ignored)
Found java executable in PATH
Java version detected: 11.0.14.1 (major version: 11)
Enterprise Search is starting...
[2022-04-25T16:34:22.282+00:00][7][2000][app-server][INFO]: Elastic Enterprise Search version=8.1.3, JRuby version=9.2.16.0, Ruby version=2.5.7, Rails version=5.2.6
[2022-04-25T16:34:23.862+00:00][7][2000][app-server][INFO]: Performing pre-flight checks for Elasticsearch running on https://bselastic-es-http.deleteme.svc:9200...
[2022-04-25T16:34:25.308+00:00][7][2000][app-server][WARN]: [pre-flight] Failed to connect to Elasticsearch backend. Make sure it is running and healthy.
[2022-04-25T16:34:25.310+00:00][7][2000][app-server][INFO]: [pre-flight] Error: /usr/share/enterprise-search/lib/war/shared_togo/lib/shared_togo/elasticsearch_checks.class:187: Connection refused (Connection refused) (Faraday::ConnectionFailed)
[2022-04-25T16:34:31.353+00:00][7][2000][app-server][WARN]: [pre-flight] Failed to connect to Elasticsearch backend. Make sure it is running and healthy.
[2022-04-25T16:34:31.355+00:00][7][2000][app-server][INFO]: [pre-flight] Error: /usr/share/enterprise-search/lib/war/shared_togo/lib/shared_togo/elasticsearch_checks.class:187: Connection refused (Connection refused) (Faraday::ConnectionFailed)
[2022-04-25T16:34:37.370+00:00][7][2000][app-server][WARN]: [pre-flight] Failed to connect to Elasticsearch backend. Make sure it is running and healthy.
[2022-04-25T16:34:37.372+00:00][7][2000][app-server][INFO]: [pre-flight] Error: /usr/share/enterprise-search/lib/war/shared_togo/lib/shared_togo/elasticsearch_checks.class:187: Connection refused (Connection refused) (Faraday::ConnectionFailed)
[2022-04-25T16:34:43.384+00:00][7][2000][app-server][WARN]: [pre-flight] Failed to connect to Elasticsearch backend. Make sure it is running and healthy.
[2022-04-25T16:34:43.386+00:00][7][2000][app-server][INFO]: [pre-flight] Error: /usr/share/enterprise-search/lib/war/shared_togo/lib/shared_togo/elasticsearch_checks.class:187: Connection refused (Connection refused) (Faraday::ConnectionFailed)
[2022-04-25T16:34:49.400+00:00][7][2000][app-server][WARN]: [pre-flight] Failed to connect to Elasticsearch backend. Make sure it is running and healthy.
[2022-04-25T16:34:49.401+00:00][7][2000][app-server][INFO]: [pre-flight] Error: /usr/share/enterprise-search/lib/war/shared_togo/lib/shared_togo/elasticsearch_checks.class:187: Connection refused (Connection refused) (Faraday::ConnectionFailed)
[2022-04-25T16:37:56.290+00:00][7][2000][app-server][INFO]: [pre-flight] Elasticsearch cluster is ready
[2022-04-25T16:37:56.292+00:00][7][2000][app-server][INFO]: [pre-flight] Successfully connected to Elasticsearch
[2022-04-25T16:37:56.367+00:00][7][2000][app-server][INFO]: [pre-flight] Successfully loaded Elasticsearch plugin information for all nodes
[2022-04-25T16:37:56.381+00:00][7][2000][app-server][INFO]: [pre-flight] Elasticsearch running with an active basic license
[2022-04-25T16:37:56.423+00:00][7][2000][app-server][INFO]: [pre-flight] Elasticsearch API key service is enabled
[2022-04-25T16:37:56.446+00:00][7][2000][app-server][INFO]: [pre-flight] Elasticsearch will be used for authentication
[2022-04-25T16:37:56.447+00:00][7][2000][app-server][INFO]: Elasticsearch looks healthy and configured correctly to run Enterprise Search
[2022-04-25T16:37:56.452+00:00][7][2000][app-server][INFO]: Performing pre-flight checks for Kibana running on http://localhost:5601...
[2022-04-25T16:37:56.482+00:00][7][2000][app-server][WARN]: [pre-flight] Failed to connect to Kibana backend. Make sure it is running and healthy.
[2022-04-25T16:37:56.486+00:00][7][2000][app-server][ERROR]: Could not connect to Kibana backend after 0 seconds.
[2022-04-25T16:37:56.488+00:00][7][2000][app-server][WARN]: Enterprise Search is unable to connect to Kibana. Ensure it is running at http://localhost:5601 for user deleteme-enterprise-search-bselastic-ent-user.
[2022-04-25T16:37:59.344+00:00][7][2000][app-server][INFO]: Elastic APM agent is disabled
{"timestamp": "2022-04-25T16:38:05+00:00", "message": "readiness probe failed", "curl_rc": "7"}
{"timestamp": "2022-04-25T16:38:06+00:00", "message": "readiness probe failed", "curl_rc": "7"}
{"timestamp": "2022-04-25T16:38:16+00:00", "message": "readiness probe failed", "curl_rc": "7"}
{"timestamp": "2022-04-25T16:38:26+00:00", "message": "readiness probe failed", "curl_rc": "7"}
{"timestamp": "2022-04-25T16:38:36+00:00", "message": "readiness probe failed", "curl_rc": "7"}
[2022-04-25T16:38:43.880+00:00][7][2000][app-server][INFO]: [db_lock] [installation] Status: [Starting] Ensuring migrations tracking index exists
{"timestamp": "2022-04-25T16:38:45+00:00", "message": "readiness probe failed", "curl_rc": "7"}
{"timestamp": "2022-04-25T16:38:56+00:00", "message": "readiness probe failed", "curl_rc": "7"}
[2022-04-25T16:39:05.283+00:00][7][2000][app-server][INFO]: [db_lock] [installation] Status: [Finished] Ensuring migrations tracking index exists
[2022-04-25T16:39:05.782+00:00][7][2000][app-server][INFO]: [db_lock] [installation] Status: [Starting] Creating indices for 38 models
[2022-05-02T16:21:47.303+00:00][8][2000][es][DEBUG]: {
"request": {
"url": "https://bselastic-es-http.deleteme.svc:9200/.ent-search-actastic-oauth_applications_v2",
"method": "put",
"headers": {
"Authorization": "[FILTERED]",
"Content-Type": "application/json",
"x-elastic-product-origin": "enterprise-search",
"User-Agent": "Faraday v1.8.0"
},
"params": null,
"body": "{\"settings\":{\"index\":{\"hidden\":true,\"refresh_interval\":-1},\"number_of_shards\":1,\"auto_expand_replicas\":\"0-3\",\"priority\":250},\"mappings\":{\"dynamic\":\"strict\",\"properties\":{\"id\":{\"type\":\"keyword\"},\"created_at\":{\"type\":\"date\"},\"updated_at\":{\"type\":\"date\"},\"name\":{\"type\":\"keyword\"},\"uid\":{\"type\":\"keyword\"},\"secret\":{\"type\":\"keyword\"},\"redirect_uri\":{\"type\":\"keyword\"},\"scopes\":{\"type\":\"keyword\"},\"confidential\":{\"type\":\"boolean\"},\"app_type\":{\"type\":\"keyword\"}}},\"aliases\":{}}"
},
"exception": "/usr/share/enterprise-search/lib/war/lib/swiftype/es/client.class:28: Read timed out (Faraday::TimeoutError)\n",
"duration": 30042.3,
"stack": [
"lib/actastic/schema.class:172:in `create_index!'",
"lib/actastic/schema.class:195:in `create_index_and_mapping!'",
"shared_togo/lib/shared_togo.class:894:in `block in apply_actastic_migrations'",
"shared_togo/lib/shared_togo.class:892:in `block in each'",
"shared_togo/lib/shared_togo.class:892:in `block in apply_actastic_migrations'",
"lib/db_lock.class:182:in `with_status'",
"shared_togo/lib/shared_togo.class:891:in `apply_actastic_migrations'",
"shared_togo/lib/shared_togo.class:406:in `block in install!'",
"lib/db_lock.class:171:in `with_lock'",
"shared_togo/lib/shared_togo.class:399:in `install!'",
"config/application.class:102:in `block in Application'",
"config/environment.class:9:in `<main>'",
"config/environment.rb:1:in `<main>'",
"shared_togo/lib/shared_togo/cli/command.class:37:in `initialize'",
"shared_togo/lib/shared_togo/cli/command.class:10:in `run_and_exit'",
"shared_togo/lib/shared_togo/cli.class:143:in `run_supported_command'",
"shared_togo/lib/shared_togo/cli.class:125:in `run_command'",
"shared_togo/lib/shared_togo/cli.class:112:in `run!'",
"bin/enterprise-search-internal:15:in `<main>'"
]
}
[2022-04-25T16:55:21.340+00:00][7][2000][app-server][INFO]: [db_lock] [installation] Status: [Failed] Creating indices for 38 models: Error = Faraday::TimeoutError: Read timed out
Unexpected exception while running Enterprise Search:
Error: Read timed out at
Master node logs
# kubectl -n deleteme logs -f bselastic-es-masters-0
Skipping security auto configuration because the configuration file [/usr/share/elasticsearch/config/elasticsearch.yml] is missing or is not a regular file
{"#timestamp":"2022-04-25T16:55:11.051Z", "log.level": "INFO", "current.health":"GREEN","message":"Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[.ent-search-actastic-search_relevance_suggestions-document_position_id-unique-constraint][0]]]).","previous.health":"YELLOW","reason":"shards started [[.ent-search-actastic-search_relevance_suggestions-document_position_id-unique-constraint][0]]" , "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch[bselastic-es-masters-0][masterService#updateTask][T#1]","log.logger":"org.elasticsearch.cluster.routing.allocation.AllocationService","elasticsearch.cluster.uuid":"rnaZmz4kQwOBNbWau43wYA","elasticsearch.node.id":"YMyOM1umSL22ro86II6Ymw","elasticsearch.node.name":"bselastic-es-masters-0","elasticsearch.cluster.name":"bselastic"}
{"#timestamp":"2022-04-25T16:55:21.447Z", "log.level": "WARN", "message":"writing cluster state took [10525ms] which is above the warn threshold of [10s]; [skipped writing] global metadata, wrote metadata for [0] new indices and [1] existing indices, removed metadata for [0] indices and skipped [48] unchanged indices", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch[bselastic-es-masters-0][generic][T#5]","log.logger":"org.elasticsearch.gateway.PersistedClusterStateService","elasticsearch.cluster.uuid":"rnaZmz4kQwOBNbWau43wYA","elasticsearch.node.id":"YMyOM1umSL22ro86II6Ymw","elasticsearch.node.name":"bselastic-es-masters-0","elasticsearch.cluster.name":"bselastic"}
{"#timestamp":"2022-04-25T16:55:21.448Z", "log.level": "INFO", "message":"after [10.3s] publication of cluster state version [226] is still waiting for {bselastic-es-masters-0}{YMyOM1umSL22ro86II6Ymw}{ljGkLdk-RAukc9NEJtQCVw}{192.168.88.213}{192.168.88.213:9300}{m}{k8s_node_name=1175027-kubeworker15.sb.rackspace.com, xpack.installed=true} [SENT_APPLY_COMMIT], {bselastic-es-data-node-0}{K88khDyfRwaGCBZwMKEaHA}{g9mXrT4WTumoj09W1OylYA}{192.168.88.214}{192.168.88.214:9300}{di}{k8s_node_name=1175027-kubeworker15.sb.rackspace.com, xpack.installed=true} [SENT_PUBLISH_REQUEST]", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch[bselastic-es-masters-0][generic][T#1]","log.logger":"org.elasticsearch.cluster.coordination.Coordinator.CoordinatorPublication","elasticsearch.cluster.uuid":"rnaZmz4kQwOBNbWau43wYA","elasticsearch.node.id":"YMyOM1umSL22ro86II6Ymw","elasticsearch.node.name":"bselastic-es-masters-0","elasticsearch.cluster.name":"bselastic"}
Which attribute do we have to set in Enterprise Search to increase the timeout? Or is there any way to get debug logs for Enterprise Search?
You can try increasing the client's default timeout globally; for example, with the Python Elasticsearch client:
es = Elasticsearch(timeout=30, max_retries=10, retry_on_timeout=True)
This gives the cluster more time to respond before the request is abandoned.
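For the debug-logging part of the question, here is a minimal sketch. It assumes the ECK EnterpriseSearch resource accepts custom settings under spec.config and that log_level is a valid Enterprise Search setting for your version; verify both against the documentation before relying on it.
---
apiVersion: enterprisesearch.k8s.elastic.co/v1
kind: EnterpriseSearch
metadata:
  name: enterprise-search-bselastic
spec:
  version: 8.1.3
  count: 1
  elasticsearchRef:
    name: bselastic
  config:
    # Assumption: log_level raises Enterprise Search verbosity from the default (info).
    log_level: debug
...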
Related
GKE cluster node ends up with CrashLoopBackOff
I've had a 3 node setup in GKE. And one of my pod creation is in CrashLoopBackOff state and it is not recovering. The log suggests the below java.lang.IllegalArgumentException. But the other 2 pods they have no such issue. They are up and running. I'm completely unsure of the issue, can someone help me? Is the issue, a by-product of install-plugins in the YML file? If yes, why didn't the same problem occur with other pods? Can you please help me with it? Exception: "type": "server", "timestamp": "2022-08-29T19:52:29,743Z", "level": "ERROR", "component": "o.e.b.ElasticsearchUncaughtExceptionHandler", "cluster.name": "dev", "node.name": "dev-es-data-hot-1", "message": "uncaught exception in thread [main]", "stacktrace": ["org.elasticsearch.bootstrap.StartupException: java.lang.IllegalArgumentException: unknown secure setting [dev-es-snapshot-backup-feeb83405c27.json] please check that any required plugins are installed, or check the breaking changes documentation for removed settings", "at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:170) ~[elasticsearch-7.16.3.jar:7.16.3]", "at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:157) ~[elasticsearch-7.16.3.jar:7.16.3]", "at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:77) ~[elasticsearch-7.16.3.jar:7.16.3]", "at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:112) ~[elasticsearch-cli-7.16.3.jar:7.16.3]", "at org.elasticsearch.cli.Command.main(Command.java:77) ~[elasticsearch-cli-7.16.3.jar:7.16.3]", "at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:122) ~[elasticsearch-7.16.3.jar:7.16.3]", "at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:80) ~[elasticsearch-7.16.3.jar:7.16.3]", "Caused by: java.lang.IllegalArgumentException: unknown secure setting [dev-es-snapshot-backup-feeb83405c27.json] please check that any required plugins are installed, or check the breaking changes documentation for removed settings", "at org.elasticsearch.common.settings.AbstractScopedSettings.validate(AbstractScopedSettings.java:561) ~[elasticsearch-7.16.3.jar:7.16.3]", uncaught exception in thread [main] "at org.elasticsearch.common.settings.AbstractScopedSettings.validate(AbstractScopedSettings.java:507) ~[elasticsearch-7.16.3.jar:7.16.3]", "at org.elasticsearch.common.settings.AbstractScopedSettings.validate(AbstractScopedSettings.java:477) ~[elasticsearch-7.16.3.jar:7.16.3]", "at org.elasticsearch.common.settings.AbstractScopedSettings.validate(AbstractScopedSettings.java:447) ~[elasticsearch-7.16.3.jar:7.16.3]", "at org.elasticsearch.common.settings.SettingsModule.<init>(SettingsModule.java:137) ~[elasticsearch-7.16.3.jar:7.16.3]", "at org.elasticsearch.node.Node.<init>(Node.java:500) ~[elasticsearch-7.16.3.jar:7.16.3]", "at org.elasticsearch.node.Node.<init>(Node.java:309) ~[elasticsearch-7.16.3.jar:7.16.3]", "at org.elasticsearch.bootstrap.Bootstrap$5.<init>(Bootstrap.java:234) ~[elasticsearch-7.16.3.jar:7.16.3]", "at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:234) ~[elasticsearch-7.16.3.jar:7.16.3]", "at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:434) ~[elasticsearch-7.16.3.jar:7.16.3]", "at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:166) ~[elasticsearch-7.16.3.jar:7.16.3]", "... 
6 more"] } Here is my YAML config: - name: data-hot-ingest count: 3 config: node.roles: ["data_hot", "ingest", "data_content"] node.attr.data: hot node.store.allow_mmap: false xpack.security.authc: anonymous: username: anon roles: monitoring_user podTemplate: spec: affinity: nodeAffinity: requiredDuringSchedulingIgnoredDuringExecution: nodeSelectorTerms: - matchExpressions: - key: type operator: In values: - hot initContainers: - name: install-plugins command: - sh - -c - | bin/elasticsearch-plugin install --batch repository-gcs - name: set-virtual-mem command: - sysctl - -w - vm.max_map_count=262144 containers: - name: elasticsearch resources: requests: memory: "64Gi" cpu: "30000m" limits: memory: "65Gi" cpu: "30000m" env: - name: ES_JAVA_OPTS value: -Xms32g -Xmx32g readinessProbe: httpGet: scheme: HTTPS port: 8080 volumeClaimTemplates: - metadata: name: elasticsearch-data spec: accessModes: - ReadWriteOnce resources: requests: storage: 350Gi storageClassName: gold EDIT: We have this secure setting configured, which is linked to a secret in our secureSettings: - secretName: credentials
[ANSWERING MY OWN QUESTION] Trying to resolve the below exception:
java.lang.IllegalArgumentException: unknown secure setting [dev-es-snapshot-backup-feeb83405c27.json]
I tried comparing the YAML config of the pods and found that the pods running successfully do not have a secure setting. But the pod that was crash looping had the secure setting under elastic-internal-secure-settings:
- name: elastic-internal-secure-settings
  secret:
    defaultMode: 420
    optional: false
    secretName: dev-es-secure-settings
And in the operator YAML, I found this:
secureSettings:
- secretName: credentials
Just to confirm the behaviour, I upscaled the StatefulSet and found the new pod also crash looping with the same error. So someone had tried the secure setting last month, it crash looped the pod, and it was never reset back to normal. Once I removed the secure setting from the operator YAML, the pods started running without any issue.
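For reference, a minimal sketch of where that block sits on the Elasticsearch resource (the metadata name dev is an assumption taken from the cluster name in the logs; the rest of the spec is omitted). Removing the secureSettings entry, or fixing the referenced secret so it contains only valid secure settings, lets the pods start:
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: dev
spec:
  # Deleting this block (or correcting the secret contents) stopped the crash loop.
  secureSettings:
  - secretName: credentials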
Filebeat initialization failed with 10.96.0.1:443 i/o timeout error
In my k8s cluster, filebeat connection is failing after a node restart. Other k8s nodes work normally. logs from filebeat pod: 2020-08-30T03:18:58.770Z ERROR kubernetes/util.go:90 kubernetes: Querying for pod failed with error: performing request: Get https://10.96.0.1:443/api/v1/namespaces/monitoring/pods/filebeat-gfg5l: dial tcp 10.96.0.1:443: i/o timeout 2020-08-30T03:18:58.770Z INFO kubernetes/watcher.go:180 kubernetes: Performing a resource sync for *v1.PodList 2020-08-30T03:19:28.771Z ERROR kubernetes/watcher.go:183 kubernetes: Performing a resource sync err performing request: Get https://10.96.0.1:443/api/v1/pods?fieldSelector=spec.nodeName%3Dlocalhost&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout for *v1.PodList 2020-08-30T03:19:28.771Z INFO instance/beat.go:357 filebeat stopped. 2020-08-30T03:19:28.771Z ERROR instance/beat.go:800 Exiting: error initializing publisher: error initializing processors: performing request: Get https://10.96.0.1:443/api/v1/pods?fieldSelector=spec.nodeName%3Dlocalhost&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout Exiting: error initializing publisher: error initializing processors: performing request: Get https://10.96.0.1:443/api/v1/pods?fieldSelector=spec.nodeName%3Dlocalhost&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout An error occurs and pod restarts are repeated. Also i restarted this node, but it didn't work. filebeat version is 6.5.2 and deployed using daemonset. Are there any known issues like this? All pods except filebeat work on that node has no problems. update: apiVersion: v1 data: filebeat.yml: |- filebeat.inputs: - type: docker multiline.pattern: '^[[:space:]]+' multiline.negate: false multiline.match: after symlinks: true cri.parse_flags: true containers: ids: [""] path: "/var/log/containers" processors: - decode_json_fields: fields: ["message"] process_array: false max_depth: 1 target: message_json overwrite_keys: false when: contains: source: "/var/log/containers/app" - add_kubernetes_metadata: in_cluster: true default_matchers.enabled: false matchers: - logs_path: logs_path: /var/log/containers/ output: logstash: hosts: - logstash:5044 kind: ConfigMap metadata: creationTimestamp: "2020-01-06T09:31:31Z" labels: k8s-app: filebeat name: filebeat-config namespace: monitoring resourceVersion: "6797684985" selfLink: /api/v1/namespaces/monitoring/configmaps/filebeat-config uid: 52d86bbb-3067-11ea-89c6-246e96da5c9c
The add_kubernetes_metadata processor failed querying https://10.96.0.1:443/api/v1/pods?fieldSelector=spec.nodeName%3Dlocalhost&resourceVersion=0. As it turned out in the discussion above, this was fixed by restarting the Beat, which resolved a temporary network interface problem.
Impossible to connect to Elasticsearch in Kubernetes (bare metal)
I've set up elastic + kibana + metricbeat in local cluster. But the metricbeat can't connect to elastic: ERROR pipeline/output.go:100 Failed to connect to backoff(elasticsearch(http://elasticsearch:9200)): Get http://elasticsearch:9200: lookup elasticsearch on 10.96.0.10:53: no such host 2019-10-15T14:14:32.553Z INFO pipeline/output.go:93 Attempting to reconnect to backoff(elasticsearch(http://elasticsearch:9200)) with 10 reconnect attempt(s) 2019-10-15T14:14:32.553Z INFO [publisher] pipeline/retry.go:189 retryer: send unwait-signal to consumer 2019-10-15T14:14:32.553Z INFO [publisher] pipeline/retry.go:191 done 2019-10-15T14:14:32.553Z INFO [publisher] pipeline/retry.go:166 retryer: send wait signal to consumer 2019-10-15T14:14:32.553Z INFO [publisher] pipeline/retry.go:168 done 2019-10-15T14:14:32.592Z WARN transport/tcp.go:53 DNS lookup failure "elasticsearch": lookup elasticsearch on 10.96.0.10:53: no such host In my cluster I use metalldb and ingress. I've set up ingress rules but it didnt help me. Also I've noticed that the elk and the metricbeat have different namespaces in docs. I've tried make everywhere the same namespaces but it was unsuccesfully. Below I've attached my yamls. Files for elastic/kibana and metricbeat I didn't attach because they have a lot of lines, I wrote only ref on them: elastic/kibana - https://download.elastic.co/downloads/eck/1.0.0-beta1/all-in-one.yaml metricbeat - https://raw.githubusercontent.com/elastic/beats/7.4/deploy/kubernetes/metricbeat-kubernetes.yaml Maybe anybody know why it happens? **elastic config** - apiVersion: elasticsearch.k8s.elastic.co/v1beta1 kind: Elasticsearch metadata: name: quickstart spec: version: 7.4.0 nodeSets: - name: default count: 1 config: node.master: true node.data: true node.ingest: true node.store.allow_mmap: false volumeClaimTemplates: - metadata: name: elasticsearch-data # note: elasticsearch-data must be the name of the Elasticsearch volume spec: accessModes: - ReadWriteOnce resources: requests: storage: 20Gi storageClassName: standard http: service: spec: type: LoadBalancer **kibana config** - apiVersion: kibana.k8s.elastic.co/v1beta1 kind: Kibana metadata: name: quickstart spec: version: 7.4.0 count: 1 elasticsearchRef: name: quickstart http: service: spec: type: LoadBalancer tls: selfSignedCertificate: disabled: true **ingress rules** - apiVersion: extensions/v1beta1 kind: Ingress metadata: name: ingress annotations: spec: rules: - http: paths: - path: / backend: serviceName: undemo-service servicePort: 80 - path: / backend: serviceName: quickstart-kb-http servicePort: 80 - path: / backend: serviceName: quickstart-es-http servicePort: 80
Just to be aware: Filebeat, Metricbeat, etc. run under the kube-system namespace. If you run Elasticsearch in the default namespace, you should use elasticsearch.default as the host so the service name resolves properly.
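A minimal sketch of that suggestion for the Metricbeat output section (the service name elasticsearch and port 9200 are assumptions based on the question; adjust to your deployment):
output.elasticsearch:
  # Namespace-qualified service name so it resolves from pods running in kube-system.
  hosts: ["http://elasticsearch.default:9200"]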
microk8s.enable dns gets stuck in ContainerCreating
I have installed microk8s snap on Ubuntu 19 in a VBox. When I run microk8s.enable dns, the pod for the deployment does not get past ContainerCreating state. I used to work in before. I have also re-installed microk8s, this helped in the passed, but not anymore. n.a. Output from microk8s.kubectl get all --all-namespaces shows that something is wrong with the volume for the secrets. I don't know how I can investigate further, so any help is appreciated. Cheers NAMESPACE NAME READY STATUS RESTARTS AGE kube-system pod/coredns-9b8997588-z88lz 0/1 ContainerCreating 0 16m NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE default service/kubernetes ClusterIP 10.152.183.1 <none> 443/TCP 20m kube-system service/kube-dns ClusterIP 10.152.183.10 <none> 53/UDP,53/TCP,9153/TCP 16m NAMESPACE NAME READY UP-TO-DATE AVAILABLE AGE kube-system deployment.apps/coredns 0/1 1 0 16m NAMESPACE NAME DESIRED CURRENT READY AGE kube-system replicaset.apps/coredns-9b8997588 1 1 0 16m Output from microk8s.kubectl describe pod/coredns-9b8997588-z88lz -n kube-system Name: coredns-9b8997588-z88lz Namespace: kube-system Priority: 2000000000 Priority Class Name: system-cluster-critical Node: peza-ubuntu-19/10.0.2.15 Start Time: Sun, 29 Sep 2019 15:49:27 +0200 Labels: k8s-app=kube-dns pod-template-hash=9b8997588 Annotations: scheduler.alpha.kubernetes.io/critical-pod: Status: Pending IP: IPs: <none> Controlled By: ReplicaSet/coredns-9b8997588 Containers: coredns: Container ID: Image: coredns/coredns:1.5.0 Image ID: Ports: 53/UDP, 53/TCP, 9153/TCP Host Ports: 0/UDP, 0/TCP, 0/TCP Args: -conf /etc/coredns/Corefile State: Waiting Reason: ContainerCreating Ready: False Restart Count: 0 Limits: memory: 170Mi Requests: cpu: 100m memory: 70Mi Liveness: http-get http://:8080/health delay=60s timeout=5s period=10s #success=1 #failure=5 Readiness: http-get http://:8181/ready delay=0s timeout=1s period=10s #success=1 #failure=3 Environment: <none> Mounts: /etc/coredns from config-volume (ro) /var/run/secrets/kubernetes.io/serviceaccount from coredns-token-h6qlm (ro) Conditions: Type Status Initialized True Ready False ContainersReady False PodScheduled True Volumes: config-volume: Type: ConfigMap (a volume populated by a ConfigMap) Name: coredns Optional: false coredns-token-h6qlm: Type: Secret (a volume populated by a Secret) SecretName: coredns-token-h6qlm Optional: false QoS Class: Burstable Node-Selectors: <none> Tolerations: CriticalAddonsOnly node.kubernetes.io/not-ready:NoExecute for 300s node.kubernetes.io/unreachable:NoExecute for 300s Events: Type Reason Age From Message ---- ------ ---- ---- ------- Normal Scheduled <unknown> default-scheduler Successfully assigned kube-system/coredns-9b8997588-z88lz to peza-ubuntu-19 Warning FailedMount 5m59s kubelet, peza-ubuntu-19 Unable to attach or mount volumes: unmounted volumes=[coredns-token-h6qlm config-volume], unattached volumes=[coredns-token-h6qlm config-volume]: timed out waiting for the condition Warning FailedMount 3m56s (x11 over 10m) kubelet, peza-ubuntu-19 MountVolume.SetUp failed for volume "coredns-token-h6qlm" : failed to sync secret cache: timed out waiting for the condition Warning FailedMount 3m44s (x2 over 8m16s) kubelet, peza-ubuntu-19 Unable to attach or mount volumes: unmounted volumes=[config-volume coredns-token-h6qlm], unattached volumes=[config-volume coredns-token-h6qlm]: timed out waiting for the condition Warning FailedMount 113s (x12 over 10m) kubelet, peza-ubuntu-19 MountVolume.SetUp failed for volume "config-volume" : failed to sync configmap cache: 
timed out waiting for the condition
I spent my morning fighting with this on Ubuntu 19.04. None of the microk8s add-ons worked. Their containers got stuck in "ContainerCreating" status with something like "MountVolume.SetUp failed for volume "kubernetes-dashboard-token-764ml" : failed to sync secret cache: timed out waiting for the condition" in their descriptions. I tried to start/stop/reset/reinstall microk8s a few times. Nothing worked. Once I downgraded it to the previous version, the problem went away:
sudo snap install microk8s --classic --channel=1.15/stable
Output: mount.nfs: requested NFS version or transport protocol is not supported
I am trying out the Kubernetes NFS volume claim in a replication controller example [1]. I have setup the NFS server, PV and PVC. And my replication controller looks like this apiVersion: v1 kind: ReplicationController metadata: name: node-manager labels: name: node-manager spec: replicas: 1 selector: name: node-manager template: metadata: labels: name: node-manager spec: containers: - name: node-manager image: org/node-manager-1.0.0:1.0.0 ports: - containerPort: 9763 protocol: "TCP" - containerPort: 9443 protocol: "TCP" volumeMounts: - name: nfs mountPath: "/mnt/data" volumes: - name: nfs persistentVolumeClaim: claimName: nfs When I try to deploy the Replication Controller, the container is in the ContainerCreating status and I can see the following error in the journal of the minion Feb 26 11:39:41 node-01 kubelet[1529]: Mounting arguments: 172.17.8.102:/ /var/lib/kubelet/pods/0e66affa-dc79-11e5-89b3-080027f84891/volumes/kubernetes.io~nfs/nfs nfs [] Feb 26 11:39:41 node-01 kubelet[1529]: Output: mount.nfs: requested NFS version or transport protocol is not supported Feb 26 11:39:41 node-01 kubelet[1529]: E0226 11:39:41.908756 1529 kubelet.go:1383] Unable to mount volumes for pod "node-manager-eemi2_default": exit status 32; skipping pod Feb 26 11:39:41 node-01 kubelet[1529]: E0226 11:39:41.923297 1529 pod_workers.go:112] Error syncing pod 0e66affa-dc79-11e5-89b3-080027f84891, skipping: exit status 32 Feb 26 11:39:51 node-01 kubelet[1529]: E0226 11:39:51.904931 1529 mount_linux.go:103] Mount failed: exit status 32 Used [2] Kubernetes-cluster-vagrant-cluster to setup my Kubernetes cluster. my minion details: core#node-01 ~ $ cat /etc/lsb-release DISTRIB_ID=CoreOS DISTRIB_RELEASE=969.0.0 DISTRIB_CODENAME="Coeur Rouge" DISTRIB_DESCRIPTION="CoreOS 969.0.0 (Coeur Rouge)" [1] - https://github.com/kubernetes/kubernetes/tree/master/examples/nfs [2] - https://github.com/pires/kubernetes-vagrant-coreos-cluster
I had the same problem, then realized that nfs-server.service was disabled. After enabling and starting it, the problem was solved.
Alternatively, this NFS mount version issue can be resolved by adding an entry with Defaultvers=4 to /etc/nfsmount.conf on the NFS server.
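Another option, offered here only as a sketch and not taken from the answers above, is to pin the NFS version from the Kubernetes side with mountOptions on the PersistentVolume (the capacity, server address, and path below are placeholders based on the question):
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs
spec:
  capacity:
    storage: 5Gi
  accessModes:
  - ReadWriteOnce
  # Ask the client to mount with an NFS version the server actually supports.
  mountOptions:
  - nfsvers=4.1
  nfs:
    server: 172.17.8.102   # NFS server from the question's mount error
    path: /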