Unable to connect Fluentd to Elasticsearch - elasticsearch

I am following instructions to connect FluentD to Elastic from HERE.
Unfortunately, when I deploy the daemonset, I get the following error:
2020-11-05 09:13:19 +0000 [warn]: #0 [out_es] Could not communicate to Elasticsearch, resetting connection and trying again. no implicit conversion of nil into String (TypeError)
2020-11-05 09:13:19 +0000 [warn]: #0 [out_es] Remaining retry: 14. Retry to communicate after 2 second(s).
2020-11-05 09:13:23 +0000 [warn]: #0 [out_es] Could not communicate to Elasticsearch, resetting connection and trying again. no implicit conversion of nil into String (TypeError)
2020-11-05 09:13:23 +0000 [warn]: #0 [out_es] Remaining retry: 13. Retry to communicate after 4 second(s).
2020-11-05 09:13:31 +0000 [warn]: #0 [out_es] Could not communicate to Elasticsearch, resetting connection and trying again. no implicit conversion of nil into String (TypeError)
2020-11-05 09:13:31 +0000 [warn]: #0 [out_es] Remaining retry: 12. Retry to communicate after 8 second(s).
2020-11-05 09:13:47 +0000 [warn]: #0 [out_es] Could not communicate to Elasticsearch, resetting connection and trying again. no implicit conversion of nil into String (TypeError)
2020-11-05 09:13:47 +0000 [warn]: #0 [out_es] Remaining retry: 11. Retry to communicate after 16 second(s).
2020-11-05 09:14:24 +0000 [warn]: #0 [out_es] Could not communicate to Elasticsearch, resetting connection and trying again. no implicit conversion of nil into String (TypeError)
2020-11-05 09:14:24 +0000 [warn]: #0 [out_es] Remaining retry: 10. Retry to communicate after 32 second(s).
2020-11-05 09:15:28 +0000 [warn]: #0 [out_es] Could not communicate to Elasticsearch, resetting connection and trying again. no implicit conversion of nil into String (TypeError)
2020-11-05 09:15:28 +0000 [warn]: #0 [out_es] Remaining retry: 9. Retry to communicate after 64 second(s).
#<Thread:0x00007fcf20f41a40#/fluentd/vendor/bundle/ruby/2.6.0/gems/fluent-plugin-kubernetes_metadata_filter-2.3.0/lib/fluent/plugin/filter_kubernetes_metadata.rb:265 run> terminated with exception (report_on_exception is true):
/usr/local/lib/ruby/2.6.0/openssl/buffering.rb:125:in `sysread': error reading from socket: Connection reset by peer (HTTP::ConnectionError)
from /usr/local/lib/ruby/2.6.0/openssl/buffering.rb:125:in `readpartial'
from /fluentd/vendor/bundle/ruby/2.6.0/gems/http-4.4.1/lib/http/timeout/null.rb:45:in `readpartial'
from /fluentd/vendor/bundle/ruby/2.6.0/gems/http-4.4.1/lib/http/connection.rb:212:in `read_more'
from /fluentd/vendor/bundle/ruby/2.6.0/gems/http-4.4.1/lib/http/connection.rb:92:in `readpartial'
from /fluentd/vendor/bundle/ruby/2.6.0/gems/http-4.4.1/lib/http/response/body.rb:30:in `readpartial'
from /fluentd/vendor/bundle/ruby/2.6.0/gems/http-4.4.1/lib/http/response/body.rb:36:in `each'
from /fluentd/vendor/bundle/ruby/2.6.0/gems/kubeclient-4.9.1/lib/kubeclient/watch_stream.rb:25:in `each'
from /fluentd/vendor/bundle/ruby/2.6.0/gems/fluent-plugin-kubernetes_metadata_filter-2.3.0/lib/fluent/plugin/kubernetes_metadata_watch_namespaces.rb:36:in `start_namespace_watch'
from /fluentd/vendor/bundle/ruby/2.6.0/gems/fluent-plugin-kubernetes_metadata_filter-2.3.0/lib/fluent/plugin/filter_kubernetes_metadata.rb:265:in `block in configure'
/usr/local/lib/ruby/2.6.0/openssl/buffering.rb:125:in `sysread': Connection reset by peer (Errno::ECONNRESET)
from /usr/local/lib/ruby/2.6.0/openssl/buffering.rb:125:in `readpartial'
from /fluentd/vendor/bundle/ruby/2.6.0/gems/http-4.4.1/lib/http/timeout/null.rb:45:in `readpartial'
from /fluentd/vendor/bundle/ruby/2.6.0/gems/http-4.4.1/lib/http/connection.rb:212:in `read_more'
from /fluentd/vendor/bundle/ruby/2.6.0/gems/http-4.4.1/lib/http/connection.rb:92:in `readpartial'
from /fluentd/vendor/bundle/ruby/2.6.0/gems/http-4.4.1/lib/http/response/body.rb:30:in `readpartial'
I have tried both true and false for this flag:
- name: FLUENT_ELASTICSEARCH_SSL_VERIFY
  value: "false"
and have had no luck.
I would appreciate any pointers as to what might be the issue.

You are passing a blank value to one of the env variables. Update that.
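For reference, a minimal sketch of the Elasticsearch env block from the fluentd-kubernetes-daemonset manifest; the values below are placeholders for illustration, and every variable must resolve to a non-empty string (the "no implicit conversion of nil into String (TypeError)" is what you get when one of them is nil or blank):
env:
  # Placeholder values, substitute your actual service host; none of these may be blank
  - name: FLUENT_ELASTICSEARCH_HOST
    value: "elasticsearch.default.svc.cluster.local"
  - name: FLUENT_ELASTICSEARCH_PORT
    value: "9200"
  - name: FLUENT_ELASTICSEARCH_SCHEME
    value: "http"
  - name: FLUENT_ELASTICSEARCH_SSL_VERIFY
    value: "false"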

Related

FluentD unable to establish connection to ElasticSearch

I am trying to set up FluentD + ECK on my Kubernetes cluster.
But FluentD is failing to establish a connection with ElasticSearch, which is served over SSL.
Error log
2022-10-12 04:55:27 +0000 [info]: adding match in @OUTPUT pattern="**" type="elasticsearch"
2022-10-12 04:55:29 +0000 [warn]: #0 Could not communicate to Elasticsearch, resetting connection and trying again. EOFError (EOFError)
2022-10-12 04:55:29 +0000 [warn]: #0 Remaining retry: 14. Retry to communicate after 2 second(s).
2022-10-12 04:55:33 +0000 [warn]: #0 Could not communicate to Elasticsearch, resetting connection and trying again. EOFError (EOFError)
2022-10-12 04:55:33 +0000 [warn]: #0 Remaining retry: 13. Retry to communicate after 4 second(s).
FluentD output Conf
<label @OUTPUT>
  <match **>
    @type elasticsearch
    host elasticsearch-es-http
    port 9200
    path ""
    user elastic
    password XXXXXXXXX
    ca_path "/etc/ssl/certs/ca.crt"
  </match>
</label>
I mounted the ElasticSearch certificate secret below on fluentd:
- name: elasticsearch-es-http-certs-public
  secret:
    secretName: elasticsearch-es-http-certs-public
- name: elasticsearch-es-http-certs-public
  mountPath: "/etc/ssl/certs"
elasticsearch-es-http is the ElasticSearch Service name, and the pods are up and running.
Please guide me on where I went wrong.
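An EOFError against an SSL-enabled ECK endpoint usually means fluentd is talking plain HTTP to an HTTPS port. As a sketch only (the scheme, ssl_verify and ca_file lines are assumptions, not taken from the post), a TLS-enabled match block would look roughly like this, reusing the service name and mounted CA from above:
<label @OUTPUT>
  <match **>
    @type elasticsearch
    # Assumption: ECK serves HTTPS on 9200 with the self-signed CA mounted at /etc/ssl/certs/ca.crt
    scheme https
    host elasticsearch-es-http
    port 9200
    user elastic
    password XXXXXXXXX
    ssl_verify true
    ca_file "/etc/ssl/certs/ca.crt"
  </match>
</label>
Note that the fluent-plugin-elasticsearch README documents the CA option as ca_file; ca_path may not be recognized.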

[warn]: [input_forward] incoming chunk is broken:

td-agent 4.3.0 (fluentd 1.14.3) is installed on Ubuntu 18.04. While running a Nessus scan to check for vulnerabilities on the td-agent server, the warnings below appeared in the td-agent log; logs are not pushed to Elasticsearch and td-agent stops working.
[warn]: #0 [input_http] unexpected error error="Could not parse data entirely (0 != 85)"
[warn]: #0 [input_forward] incoming chunk is broken: host="[Nessus server ip address]" msg=36

fluentd can't connect to elasticsearch in cluster

I tried to set up an EFK stack. While E+K work fine in the default namespace, the Fluentd container can't connect to Elasticsearch.
kubectl get services -n default
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
elasticsearch-master ClusterIP 10.43.40.136 <none> 9200/TCP,9300/TCP 92m
elasticsearch-master-headless ClusterIP None <none> 9200/TCP,9300/TCP 92m
kibana-kibana ClusterIP 10.43.152.189 <none> 5601/TCP 74m
kubernetes ClusterIP 10.43.0.1 <none> 443/TCP 14d
I've installed fluentd from this repo and changed the URL to point at elasticsearch:
https://github.com/fluent/fluentd-kubernetes-daemonset/blob/master/fluentd-daemonset-elasticsearch-rbac.yaml
kubectl -n kube-system get pods | grep fluentd
fluentd-4fd2s 1/1 Running 0 51m
fluentd-7t2v5 1/1 Running 0 49m
fluentd-dfnfg 1/1 Running 0 50m
fluentd-lvrsv 1/1 Running 0 48m
fluentd-rv4td 1/1 Running 0 50m
but the log is telling me:
2021-07-23 21:38:59 +0000 [info]: starting fluentd-1.13.2 pid=7 ruby="2.6.8"
2021-07-23 21:38:59 +0000 [info]: spawn command to main: cmdline=["/usr/local/bin/ruby", "-Eascii-8bit:ascii-8bit", "/fluentd/vendor/bundle/ruby/2.6.0/bin/fluentd", "-c", "/fluentd/etc/fluent.conf", "-p", "/fluentd/plugins", "--gemfile", "/fluentd/Gemfile", "-r", "/fluentd/vendor/bundle/ruby/2.6.0/gems/fluent-plugin-elasticsearch-5.0.5/lib/fluent/plugin/elasticsearch_simple_sniffer.rb", "--under-supervisor"]
2021-07-23 21:39:01 +0000 [info]: adding match in @FLUENT_LOG pattern="fluent.**" type="null"
2021-07-23 21:39:01 +0000 [info]: adding filter pattern="kubernetes.**" type="kubernetes_metadata"
2021-07-23 21:39:01 +0000 [warn]: #0 [filter_kube_metadata] !! The environment variable 'K8S_NODE_NAME' is not set to the node name which can affect the API server and watch efficiency !!
2021-07-23 21:39:01 +0000 [info]: adding match pattern="**" type="elasticsearch"
2021-07-23 21:39:09 +0000 [warn]: #0 [out_es] Could not communicate to Elasticsearch, resetting connection and trying again. connect_write timeout reached
2021-07-23 21:39:09 +0000 [warn]: #0 [out_es] Remaining retry: 14. Retry to communicate after 2 second(s).
2021-07-23 21:39:18 +0000 [warn]: #0 [out_es] Could not communicate to Elasticsearch, resetting connection and trying again. connect_write timeout reached
2021-07-23 21:39:18 +0000 [warn]: #0 [out_es] Remaining retry: 13. Retry to communicate after 4 second(s).
2021-07-23 21:39:31 +0000 [warn]: #0 [out_es] Could not communicate to Elasticsearch, resetting connection and trying again. connect_write timeout reached
2021-07-23 21:39:31 +0000 [warn]: #0 [out_es] Remaining retry: 12. Retry to communicate after 8 second(s).
2021-07-23 21:39:52 +0000 [warn]: #0 [out_es] Could not communicate to Elasticsearch, resetting connection and trying again. connect_write timeout reached
2021-07-23 21:39:52 +0000 [warn]: #0 [out_es] Remaining retry: 11. Retry to communicate after 16 second(s).
2021-07-23 21:40:29 +0000 [warn]: #0 [out_es] Could not communicate to Elasticsearch, resetting connection and trying again. connect_write timeout reached
2021-07-23 21:40:29 +0000 [warn]: #0 [out_es] Remaining retry: 10. Retry to communicate after 32 second(s).
2021-07-23 21:41:38 +0000 [warn]: #0 [out_es] Could not communicate to Elasticsearch, resetting connection and trying again. connect_write timeout reached
I installed dig and the service name resolves:
root@fluentd-dfnfg:/home/fluent# nslookup elasticsearch-master.default.svc.cluster.local
Server: 10.43.0.10
Address: 10.43.0.10#53
Name: elasticsearch-master.default.svc.cluster.local
Address: 10.43.40.136
I'm out of ideas.
PS: I'm using a hardened RKE2. (https://github.com/rancherfederal/rke2-ansible)
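Since DNS resolves but the output still hits connect_write timeouts, the next step is to test raw TCP/HTTP reachability from inside a fluentd pod; on a hardened RKE2 cluster, a NetworkPolicy or CNI policy between kube-system and default is a likely culprit. A quick check, assuming curl is present in the fluentd image (the pod name is the one shown above):
# A timeout here, rather than any HTTP response, points at network policy or
# firewalling between namespaces instead of a fluentd misconfiguration.
kubectl -n kube-system exec -it fluentd-dfnfg -- \
  curl -v --max-time 5 http://elasticsearch-master.default.svc.cluster.local:9200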

fail to flush the buffer in fluentd to elasticsearch

Describe the bug
Logs are not getting transferred to Elasticsearch.
Expected behavior
Logs from the source folder should've been transferred to Elasticsearch.
Your Environment
Fluentd or td-agent version: td-agent 1.11.2
Operating system: Linux
Your Configuration
<source>
  @type tail
  path /home/smtp-api/storage/logs/fluentd/*
  pos_file /home/smtp-api/storage/logs/fluentd/access-logs.pos
  tag smtp.access
  format json
  read_from_head true
</source>
<source>
  @type tail
  path /home/smtp-api-2/storage/logs/fluentd/*
  pos_file /home/smtp-api-2/storage/logs/fluentd/access-logs.pos
  tag smtp.access2
  format json
  read_from_head true
</source>
<match smtp.*>
  @type elasticsearch
  host elk-data.amartya-int.int
  <buffer>
    @type file
    path /var/log/fluent/smtp
    compress gzip
  </buffer>
  port 9200
  logstash_format true
  logstash_prefix smtp.access
</match>
Your Error Log
2021-05-17 15:30:41 +0530 [warn]: #0 failed to flush the buffer. retry_time=3 next_retry_seconds=2021-05-17 15:30:45 864793371479422728793/2199023255552000000000 +0530 chunk="5c279c260c779ea13a39e5ac45c9b1a9" error_class=IOError error="closed stream"
2021-05-17 15:30:41 +0530 [warn]: #0 suppressed same stacktrace
2021-05-17 15:30:46 +0530 [warn]: #0 failed to flush the buffer. retry_time=4 next_retry_seconds=2021-05-17 15:30:53 1246145672473190896169/2199023255552000000000 +0530 chunk="5c279c260c779ea13a39e5ac45c9b1a9" error_class=IOError error="closed stream"
2021-05-17 15:30:46 +0530 [warn]: #0 suppressed same stacktrace
2021-05-17 15:30:53 +0530 [warn]: #0 failed to flush the buffer. retry_time=5 next_retry_seconds=2021-05-17 15:31:09 431493802875340255437/549755813888000000000 +0530 chunk="5c279c260c779ea13a39e5ac45c9b1a9" error_class=IOError error="closed stream"
2021-05-17 15:30:53 +0530 [warn]: #0 suppressed same stacktrace
2021-05-17 15:31:10 +0530 [warn]: #0 failed to flush the buffer. retry_time=6 next_retry_seconds=2021-05-17 15:31:44 36018824927254535859/274877906944000000000 +0530 chunk="5c279c260c779ea13a39e5ac45c9b1a9" error_class=IOError error="closed stream"
2021-05-17 15:31:10 +0530 [warn]: #0 suppressed same stacktrace
2021-05-17 15:31:45 +0530 [warn]: #0 failed to flush the buffer. retry_time=7 next_retry_seconds=2021-05-17 15:32:41 213226392719973317877/274877906944000000000 +0530 chunk="5c279c260c779ea13a39e5ac45c9b1a9" error_class=IOError error="closed stream"
2021-05-17 15:31:45 +0530 [warn]: #0 suppressed same stacktrace
2021-05-17 15:32:41 +0530 [warn]: #0 failed to flush the buffer. retry_time=8 next_retry_seconds=2021-05-17 15:34:46 14488516743329774951/68719476736000000000 +0530 chunk="5c279c260c779ea13a39e5ac45c9b1a9" error_class=IOError error="closed stream"
2021-05-17 15:32:41 +0530 [warn]: #0 suppressed same stacktrace
^[2021-05-17 15:34:46 +0530 [warn]: #0 failed to flush the buffer. retry_time=9 next_retry_seconds=2021-05-17 15:38:59 1590472627730823409/17179869184000000000 +0530 chunk="5c279c260c779ea13a39e5ac45c9b1a9" error_class=IOError error="closed stream"
2021-05-17 15:34:46 +0530 [warn]: #0 suppressed same stacktrace
2021-05-17 15:38:59 +0530 [warn]: #0 failed to flush the buffer. retry_time=10 next_retry_seconds=2021-05-17 15:48:09 6447852607269569521/8589934592000000000 +0530 chunk="5c279c260c779ea13a39e5ac45c9b1a9" error_class=IOError error="closed stream"
2021-05-17 15:38:59 +0530 [warn]: #0 suppressed same stacktrace
2021-05-17 15:48:10 +0530 [warn]: #0 failed to flush the buffer. retry_time=11 next_retry_seconds=2021-05-17 16:05:21 2422140701418556159/8589934592000000000 +0530 chunk="5c279c260c779ea13a39e5ac45c9b1a9" error_class=IOError error="closed stream"
2021-05-17 15:48:10 +0530 [warn]: #0 suppressed same stacktrace
2021-05-17 16:05:21 +0530 [warn]: #0 failed to flush the buffer. retry_time=12 next_retry_seconds=2021-05-17 16:43:30 22032519751758471/268435456000000000 +0530 chunk="5c279c260c779ea13a39e5ac45c9b1a9" error_class=IOError error="closed stream"
2021-05-17 16:05:21 +0530 [warn]: #0 suppressed same stacktrace

error_class=Fluent::Plugin::ElasticsearchOutput::RecoverableRequestFailure error="could not push logs to Elasticsearch cluster

I am getting this error from the fluentd pods and they keep restarting. I am running this on Kubernetes v1.17.9-eks-4c6976.
Not sure of what the cause is. Any help would be appreciated.
/usr/local/bundle/gems/fluentd-1.11.4/lib/fluent/plugin_helper/http_server/compat/webrick_handler.rb:26: warning: The called method `build' is defined here
2020-11-23 18:02:08 +0000 [warn]: [elasticsearch] failed to flush the buffer. retry_time=0 next_retry_seconds=2020-11-23 18:02:09.126315296 +0000 chunk="5b4c9fd811e8162eb94f03d8cec677e5" error_class=Fluent::Plugin::ElasticsearchOutput::RecoverableRequestFailure error="could not push logs to Elasticsearch cluster ({:host=>\"elasticsearch-master\", :port=>9200, :scheme=>\"http\", :path=>\"\"}): read timeout reached"
2020-11-23 18:02:08.126340601 +0000 fluent.warn: {"retry_time":0,"next_retry_seconds":"2020-11-23 18:02:09.126315296 +0000","chunk":"5b4c9fd811e8162eb94f03d8cec677e5","error":"#<Fluent::Plugin::ElasticsearchOutput::RecoverableRequestFailure: could not push logs to Elasticsearch cluster ({:host=>\"elasticsearch-master\", :port=>9200, :scheme=>\"http\", :path=>\"\"}): read timeout reached>","message":"[elasticsearch] failed to flush the buffer. retry_time=0 next_retry_seconds=2020-11-23 18:02:09.126315296 +0000 chunk=\"5b4c9fd811e8162eb94f03d8cec677e5\" error_class=Fluent::Plugin::ElasticsearchOutput::RecoverableRequestFailure error=\"could not push logs to Elasticsearch cluster ({:host=>\\\"elasticsearch-master\\\", :port=>9200, :scheme=>\\\"http\\\", :path=>\\\"\\\"}): read timeout reached\""}
2020-11-23 18:02:08 +0000 [warn]: /usr/local/bundle/gems/fluent-plugin-elasticsearch-4.2.2/lib/fluent/plugin/out_elasticsearch.rb:1055:in `rescue in send_bulk'
2020-11-23 18:02:08 +0000 [warn]: /usr/local/bundle/gems/fluent-plugin-elasticsearch-4.2.2/lib/fluent/plugin/out_elasticsearch.rb:1017:in `send_bulk'
2020-11-23 18:02:08 +0000 [warn]: /usr/local/bundle/gems/fluent-plugin-elasticsearch-4.2.2/lib/fluent/plugin/out_elasticsearch.rb:842:in `block in write'
2020-11-23 18:02:08 +0000 [warn]: /usr/local/bundle/gems/fluent-plugin-elasticsearch-4.2.2/lib/fluent/plugin/out_elasticsearch.rb:841:in `each'
2020-11-23 18:02:08 +0000 [warn]: /usr/local/bundle/gems/fluent-plugin-elasticsearch-4.2.2/lib/fluent/plugin/out_elasticsearch.rb:841:in `write'
2020-11-23 18:02:08 +0000 [warn]: /usr/local/bundle/gems/fluentd-1.11.4/lib/fluent/plugin/output.rb:1136:in `try_flush'
2020-11-23 18:02:08 +0000 [warn]: /usr/local/bundle/gems/fluentd-1.11.4/lib/fluent/plugin/output.rb:1442:in `flush_thread_run'
2020-11-23 18:02:08 +0000 [warn]: /usr/local/bundle/gems/fluentd-1.11.4/lib/fluent/plugin/output.rb:462:in `block (2 levels) in start'
2020-11-23 18:02:08 +0000 [warn]: /usr/local/bundle/gems/fluentd-1.11.4/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create'
2020-11-23 18:02:08 +0000 [warn]: [elasticsearch] failed to flush the buffer. retry_time=1 next_retry_seconds=2020-11-23 18:02:09 475256825743319463889/8796093022208000000000 +0000 chunk="5b4c9fd80c4e40f1d7a4a799916ae12b" error_class=Fluent::Plugin::ElasticsearchOutput::RecoverableRequestFailure error="could not push logs to Elasticsearch cluster ({:host=>\"elasticsearch-master\", :port=>9200, :scheme=>\"http\", :path=>\"\"}): read timeout reached"
2020-11-23 18:02:08.127449054 +0000 fluent.warn: {"retry_time":1,"next_retry_seconds":"2020-11-23 18:02:09 475256825743319463889/8796093022208000000000 +0000","chunk":"5b4c9fd80c4e40f1d7a4a799916ae12b","error":"#<Fluent::Plugin::ElasticsearchOutput::RecoverableRequestFailure: could not push logs to Elasticsearch cluster ({:host=>\"elasticsearch-master\", :port=>9200, :scheme=>\"http\", :path=>\"\"}): read timeout reached>","message":"[elasticsearch] failed to flush the buffer. retry_time=1 next_retry_seconds=2020-11-23 18:02:09 475256825743319463889/8796093022208000000000 +0000 chunk=\"5b4c9fd80c4e40f1d7a4a799916ae12b\" error_class=Fluent::Plugin::ElasticsearchOutput::RecoverableRequestFailure error=\"could not push logs to Elasticsearch cluster ({:host=>\\\"elasticsearch-master\\\", :port=>9200, :scheme=>\\\"http\\\", :path=>\\\"\\\"}): read timeout reached\""}
The default request_timeout value for fluent-plugin-elasticsearch is 5s, which is often too short when fluentd has a large backlog to replay to Elasticsearch in large bulk messages.
So you may want to increase the request_timeout value for your elasticsearch output in your fluentd configuration to 15s, or even much higher, say 60s. It is important to specify the time unit (for example 60s) and not just a bare number like 60.
The documentation for that setting can be seen here: https://github.com/uken/fluent-plugin-elasticsearch#request_timeout
This could also be an indication that your elasticsearch node/cluster cannot ingest the data fast enough.
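As a sketch, assuming the output shown in the error (host elasticsearch-master, port 9200, plain http), the change amounts to adding request_timeout with an explicit unit to the elasticsearch match block:
<match **>
  @type elasticsearch
  host elasticsearch-master
  port 9200
  scheme http
  # Raise from the 5s default; keep the time unit, not just a bare number
  request_timeout 60s
  # remaining output and buffer settings unchanged
</match>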
