td-agent 4.3.0 (Fluentd 1.14.3) is installed on Ubuntu 18.04. While a Nessus vulnerability scan was running against the td-agent server, the warnings below appeared in the td-agent log. After that, td-agent stopped working and logs were no longer pushed to Elasticsearch.
[warn]: #0 [input_http] unexpected error error="Could not parse data entirely (0 != 85)"
[warn]: #0 [input_forward] incoming chunk is broken: host="[Nessus server ip address]" msg=36
I am trying to set up Fluentd + ECK on my Kubernetes cluster.
But Fluentd fails to establish a connection with Elasticsearch, which is served over SSL.
Error log
2022-10-12 04:55:27 +0000 [info]: adding match in @OUTPUT pattern="**" type="elasticsearch"
2022-10-12 04:55:29 +0000 [warn]: #0 Could not communicate to Elasticsearch, resetting connection and trying again. EOFError (EOFError)
2022-10-12 04:55:29 +0000 [warn]: #0 Remaining retry: 14. Retry to communicate after 2 second(s).
2022-10-12 04:55:33 +0000 [warn]: #0 Could not communicate to Elasticsearch, resetting connection and trying again. EOFError (EOFError)
2022-10-12 04:55:33 +0000 [warn]: #0 Remaining retry: 13. Retry to communicate after 4 second(s).
FluentD output Conf
<label @OUTPUT>
  <match **>
    @type elasticsearch
    host elasticsearch-es-http
    port 9200
    path ""
    user elastic
    password XXXXXXXXX
    ca_path "/etc/ssl/certs/ca.crt"
  </match>
</label>
I mounted the Elasticsearch secret below as a certificate in the Fluentd pod:
- name: elasticsearch-es-http-certs-public
  secret:
    secretName: elasticsearch-es-http-certs-public
- name: elasticsearch-es-http-certs-public
  mountPath: "/etc/ssl/certs"
elasticsearch-es-http is the Elasticsearch Service name, and the pods are up and running.
Please guide me on where I went wrong.
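One possible reading of the EOFError is that the client and the server disagree about whether the connection uses TLS; the thread does not confirm this. The sketch below, meant to be run from inside the Fluentd pod, checks whether a verified TLS handshake succeeds against the service. The function name `probe_tls` is hypothetical; the host, port and CA path in the usage comment are the ones from the question.

```python
import socket
import ssl


def probe_tls(host, port, ca_file=None, timeout=3.0):
    """Return a short string saying whether a verified TLS handshake works."""
    # A missing or unreadable CA file raises here, before any connection.
    context = ssl.create_default_context(cafile=ca_file)
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with context.wrap_socket(raw, server_hostname=host) as tls:
                return "tls ok: " + str(tls.version())
    except ssl.SSLError as exc:
        # Handshake or certificate verification problem.
        return "tls failed: " + str(exc)
    except OSError as exc:
        # DNS failure, refused connection, timeout and similar.
        return "connection failed: " + str(exc)


# Usage with the values from the question (not executed here):
# print(probe_tls("elasticsearch-es-http", 9200, ca_file="/etc/ssl/certs/ca.crt"))
```

If the handshake succeeds here but Fluentd still reports EOFError, the output plugin's scheme and CA settings are the next thing to compare against what this probe used.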
I tried to set up an EFK stack. While Elasticsearch and Kibana work fine in the default namespace, the Fluentd container can't connect to Elasticsearch.
kubectl get services -n default
NAME                            TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)             AGE
elasticsearch-master            ClusterIP   10.43.40.136    <none>        9200/TCP,9300/TCP   92m
elasticsearch-master-headless   ClusterIP   None            <none>        9200/TCP,9300/TCP   92m
kibana-kibana                   ClusterIP   10.43.152.189   <none>        5601/TCP            74m
kubernetes                      ClusterIP   10.43.0.1       <none>        443/TCP             14d
I installed Fluentd from this repo and changed the Elasticsearch URL:
https://github.com/fluent/fluentd-kubernetes-daemonset/blob/master/fluentd-daemonset-elasticsearch-rbac.yaml
kubectl -n kube-system get pods | grep fluentd
fluentd-4fd2s   1/1   Running   0   51m
fluentd-7t2v5   1/1   Running   0   49m
fluentd-dfnfg   1/1   Running   0   50m
fluentd-lvrsv   1/1   Running   0   48m
fluentd-rv4td   1/1   Running   0   50m
but the log is telling me:
2021-07-23 21:38:59 +0000 [info]: starting fluentd-1.13.2 pid=7 ruby="2.6.8"
2021-07-23 21:38:59 +0000 [info]: spawn command to main: cmdline=["/usr/local/bin/ruby", "-Eascii-8bit:ascii-8bit", "/fluentd/vendor/bundle/ruby/2.6.0/bin/fluentd", "-c", "/fluentd/etc/fluent.conf", "-p", "/fluentd/plugins", "--gemfile", "/fluentd/Gemfile", "-r", "/fluentd/vendor/bundle/ruby/2.6.0/gems/fluent-plugin-elasticsearch-5.0.5/lib/fluent/plugin/elasticsearch_simple_sniffer.rb", "--under-supervisor"]
2021-07-23 21:39:01 +0000 [info]: adding match in @FLUENT_LOG pattern="fluent.**" type="null"
2021-07-23 21:39:01 +0000 [info]: adding filter pattern="kubernetes.**" type="kubernetes_metadata"
2021-07-23 21:39:01 +0000 [warn]: #0 [filter_kube_metadata] !! The environment variable 'K8S_NODE_NAME' is not set to the node name which can affect the API server and watch efficiency !!
2021-07-23 21:39:01 +0000 [info]: adding match pattern="**" type="elasticsearch"
2021-07-23 21:39:09 +0000 [warn]: #0 [out_es] Could not communicate to Elasticsearch, resetting connection and trying again. connect_write timeout reached
2021-07-23 21:39:09 +0000 [warn]: #0 [out_es] Remaining retry: 14. Retry to communicate after 2 second(s).
2021-07-23 21:39:18 +0000 [warn]: #0 [out_es] Could not communicate to Elasticsearch, resetting connection and trying again. connect_write timeout reached
2021-07-23 21:39:18 +0000 [warn]: #0 [out_es] Remaining retry: 13. Retry to communicate after 4 second(s).
2021-07-23 21:39:31 +0000 [warn]: #0 [out_es] Could not communicate to Elasticsearch, resetting connection and trying again. connect_write timeout reached
2021-07-23 21:39:31 +0000 [warn]: #0 [out_es] Remaining retry: 12. Retry to communicate after 8 second(s).
2021-07-23 21:39:52 +0000 [warn]: #0 [out_es] Could not communicate to Elasticsearch, resetting connection and trying again. connect_write timeout reached
2021-07-23 21:39:52 +0000 [warn]: #0 [out_es] Remaining retry: 11. Retry to communicate after 16 second(s).
2021-07-23 21:40:29 +0000 [warn]: #0 [out_es] Could not communicate to Elasticsearch, resetting connection and trying again. connect_write timeout reached
2021-07-23 21:40:29 +0000 [warn]: #0 [out_es] Remaining retry: 10. Retry to communicate after 32 second(s).
2021-07-23 21:41:38 +0000 [warn]: #0 [out_es] Could not communicate to Elasticsearch, resetting connection and trying again. connect_write timeout reached
I installed dig and it resolved the service:
root@fluentd-dfnfg:/home/fluent# nslookup elasticsearch-master.default.svc.cluster.local
Server: 10.43.0.10
Address: 10.43.0.10#53
Name: elasticsearch-master.default.svc.cluster.local
Address: 10.43.40.136
I'm out of ideas.
PS: I'm using a hardened RKE2 (https://github.com/rancherfederal/rke2-ansible).
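Since the name resolves but the log reports "connect_write timeout reached", it may help to separate DNS problems from blocked TCP traffic. The sketch below is one way to do that from inside a Fluentd pod; `check_tcp` is a hypothetical helper, not part of Fluentd. A "timeout" result with working DNS would be consistent with something dropping traffic between the kube-system and default namespaces. On a hardened cluster, network policies are one candidate worth checking, though that is an assumption the thread does not confirm.

```python
import socket


def check_tcp(host, port, timeout=3.0):
    """Classify the outcome of a plain TCP connection attempt."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.gaierror:
        return "dns failure"   # the name could not be resolved
    except socket.timeout:
        return "timeout"       # packets silently dropped or host unreachable
    except ConnectionRefusedError:
        return "refused"       # host reachable, nothing listening on the port
    except OSError as exc:
        return "error: " + str(exc)


# Usage with the service name from the question (not executed here):
# print(check_tcp("elasticsearch-master.default.svc.cluster.local", 9200))
```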
I use the Bitnami Fluentd chart for Kubernetes, and my setup is close to the defaults apart from a few changes.
My source section looks like
@type tail
path /var/log/containers/*my-app*.log
pos_file /opt/bitnami/fluentd/logs/buffers/fluentd-docker.pos
tag kubernetes.*
read_from_head true
and my application writes multi-line log entries to stdout, such as:
2021-07-13 11:33:49.060 +0000 - [ERROR] - fatal error - play.api.http.DefaultHttpErrorHandler in postman-akka.actor.default-dispatcher-6 play.api.UnexpectedException: Unexpected exception[RuntimeException: java.net.ConnectException: Connection refused (Connection refused)]
at play.api.http.HttpErrorHandlerExceptions$.throwableToUsefulException(HttpErrorHandler.scala:328)
at play.api.http.DefaultHttpErrorHandler.onServerError(HttpErrorHandler
The problem is that in the Fluentd forwarder I can see (in /var/log/containers/*) that all records are stored in the following format:
{"log":"2021-07-13 19:54:48.523 +0000 - [ERROR] - from akka.io.TcpListener in postman-akka.actor.default-dispatcher-6 New connection accepted \n","stream":"stdout","time":"2021-07-13T19:54:48.523724149Z"}
{"log":"2021-07-13 19:54:48.523 +0000 - [ERROR] -- play.api.http.DefaultHttpErrorHandler in postman-akka.actor.default-dispatcher-6 \n","stream":"stdout","time":"2021-07-13T19:55:10.479279395Z"}
{"log":"2021-07-13 19:54:48.523 +0000 - [ERROR] - play.api.UnexpectedException: Unexpected exception[RuntimeException: }
{"log":"2021-07-13 19:54:48.523 +0000 - [ERROR] - java.net.ConnectException: Connection refused (Connection refused)] }
As you can see, each of those lines is stored as a separate log record.
I would like to extract the whole log message together with its full stack trace, so I wrote the following parse section for Fluentd:
<parse>
  @type regexp
  expression /^(?<time>^(.*?:.*?)):\d\d.\d+\s\+0000 - (?<type>(\[\w+\])).- (?<text>(.*))/m
  time_key time
  time_format %Y-%m-%d %H:%M:%S
</parse>
but I am fairly sure the parser is not the problem, because for some reason the files in /var/log/containers/*.log already store the records split this way. How can I configure the Fluentd forwarder to read the container logs and store them in a plain (non-JSON) format?
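The records quoted above have the one-JSON-object-per-line shape that Docker's json-file logging driver writes, so each stdout line is already a separate record before Fluentd reads it. A regexp applied to a single record cannot see the following lines, so the lines have to be joined first. On the Fluentd side this is usually done with a multiline parser or a concatenating filter such as fluent-plugin-concat. The Python below is only a sketch of that joining logic, assuming every new entry starts with a timestamp; the function name and the sample input are illustrative.

```python
import json
import re

# Assumption: a new log entry starts with a timestamp such as
# "2021-07-13 11:33:49.060 "; anything else continues the previous entry.
ENTRY_START = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+ ")


def merge_multiline(json_lines):
    """Join per-line JSON records into whole log entries."""
    entries = []
    for raw in json_lines:
        text = json.loads(raw)["log"]
        if ENTRY_START.match(text) or not entries:
            entries.append(text)      # first line of a new entry
        else:
            entries[-1] += text       # stack-trace or other continuation line
    return [entry.rstrip("\n") for entry in entries]


# Illustrative input modelled on the stack trace in the question.
sample = [
    '{"log":"2021-07-13 11:33:49.060 +0000 - [ERROR] - fatal error\\n","stream":"stdout"}',
    '{"log":"  at play.api.http.DefaultHttpErrorHandler.onServerError\\n","stream":"stdout"}',
    '{"log":"2021-07-13 11:33:50.000 +0000 - [INFO] - next entry\\n","stream":"stdout"}',
]
merged = merge_multiline(sample)
```

With this input the first two records end up in one entry and the third starts a new one. The same "does this line start a new entry" test is what a multiline start pattern expresses in a Fluentd configuration.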
I am following instructions to connect FluentD to Elastic from HERE.
Unfortunately, when I deploy the daemonset, I get the following error:
2020-11-05 09:13:19 +0000 [warn]: #0 [out_es] Could not communicate to Elasticsearch, resetting connection and trying again. no implicit conversion of nil into String (TypeError)
2020-11-05 09:13:19 +0000 [warn]: #0 [out_es] Remaining retry: 14. Retry to communicate after 2 second(s).
2020-11-05 09:13:23 +0000 [warn]: #0 [out_es] Could not communicate to Elasticsearch, resetting connection and trying again. no implicit conversion of nil into String (TypeError)
2020-11-05 09:13:23 +0000 [warn]: #0 [out_es] Remaining retry: 13. Retry to communicate after 4 second(s).
2020-11-05 09:13:31 +0000 [warn]: #0 [out_es] Could not communicate to Elasticsearch, resetting connection and trying again. no implicit conversion of nil into String (TypeError)
2020-11-05 09:13:31 +0000 [warn]: #0 [out_es] Remaining retry: 12. Retry to communicate after 8 second(s).
2020-11-05 09:13:47 +0000 [warn]: #0 [out_es] Could not communicate to Elasticsearch, resetting connection and trying again. no implicit conversion of nil into String (TypeError)
2020-11-05 09:13:47 +0000 [warn]: #0 [out_es] Remaining retry: 11. Retry to communicate after 16 second(s).
2020-11-05 09:14:24 +0000 [warn]: #0 [out_es] Could not communicate to Elasticsearch, resetting connection and trying again. no implicit conversion of nil into String (TypeError)
2020-11-05 09:14:24 +0000 [warn]: #0 [out_es] Remaining retry: 10. Retry to communicate after 32 second(s).
2020-11-05 09:15:28 +0000 [warn]: #0 [out_es] Could not communicate to Elasticsearch, resetting connection and trying again. no implicit conversion of nil into String (TypeError)
2020-11-05 09:15:28 +0000 [warn]: #0 [out_es] Remaining retry: 9. Retry to communicate after 64 second(s).
#<Thread:0x00007fcf20f41a40@/fluentd/vendor/bundle/ruby/2.6.0/gems/fluent-plugin-kubernetes_metadata_filter-2.3.0/lib/fluent/plugin/filter_kubernetes_metadata.rb:265 run> terminated with exception (report_on_exception is true):
/usr/local/lib/ruby/2.6.0/openssl/buffering.rb:125:in `sysread': error reading from socket: Connection reset by peer (HTTP::ConnectionError)
from /usr/local/lib/ruby/2.6.0/openssl/buffering.rb:125:in `readpartial'
from /fluentd/vendor/bundle/ruby/2.6.0/gems/http-4.4.1/lib/http/timeout/null.rb:45:in `readpartial'
from /fluentd/vendor/bundle/ruby/2.6.0/gems/http-4.4.1/lib/http/connection.rb:212:in `read_more'
from /fluentd/vendor/bundle/ruby/2.6.0/gems/http-4.4.1/lib/http/connection.rb:92:in `readpartial'
from /fluentd/vendor/bundle/ruby/2.6.0/gems/http-4.4.1/lib/http/response/body.rb:30:in `readpartial'
from /fluentd/vendor/bundle/ruby/2.6.0/gems/http-4.4.1/lib/http/response/body.rb:36:in `each'
from /fluentd/vendor/bundle/ruby/2.6.0/gems/kubeclient-4.9.1/lib/kubeclient/watch_stream.rb:25:in `each'
from /fluentd/vendor/bundle/ruby/2.6.0/gems/fluent-plugin-kubernetes_metadata_filter-2.3.0/lib/fluent/plugin/kubernetes_metadata_watch_namespaces.rb:36:in `start_namespace_watch'
from /fluentd/vendor/bundle/ruby/2.6.0/gems/fluent-plugin-kubernetes_metadata_filter-2.3.0/lib/fluent/plugin/filter_kubernetes_metadata.rb:265:in `block in configure'
/usr/local/lib/ruby/2.6.0/openssl/buffering.rb:125:in `sysread': Connection reset by peer (Errno::ECONNRESET)
from /usr/local/lib/ruby/2.6.0/openssl/buffering.rb:125:in `readpartial'
from /fluentd/vendor/bundle/ruby/2.6.0/gems/http-4.4.1/lib/http/timeout/null.rb:45:in `readpartial'
from /fluentd/vendor/bundle/ruby/2.6.0/gems/http-4.4.1/lib/http/connection.rb:212:in `read_more'
from /fluentd/vendor/bundle/ruby/2.6.0/gems/http-4.4.1/lib/http/connection.rb:92:in `readpartial'
from /fluentd/vendor/bundle/ruby/2.6.0/gems/http-4.4.1/lib/http/response/body.rb:30:in `readpartial'
I have tried both true and false for this flag:
- name: FLUENT_ELASTICSEARCH_SSL_VERIFY
  value: "false"
and have had no luck.
I would appreciate any pointers as to what might be the issue.
One of the environment variables is being passed a blank value. Set it to a non-empty value.
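To find which variable is blank, a small check along these lines could be run inside the container. It is a sketch: `blank_settings` is a hypothetical helper, and apart from FLUENT_ELASTICSEARCH_SSL_VERIFY, which appears in the question, the variable names and values in the example are illustrative.

```python
import os


def blank_settings(env, prefix="FLUENT_ELASTICSEARCH_"):
    """Return the names of prefixed variables whose value is empty or whitespace."""
    return sorted(
        name
        for name, value in env.items()
        if name.startswith(prefix) and not value.strip()
    )


# Illustrative environment; inside the pod, pass dict(os.environ) instead.
example_env = {
    "FLUENT_ELASTICSEARCH_HOST": "elasticsearch",
    "FLUENT_ELASTICSEARCH_SSL_VERIFY": "false",
    "FLUENT_ELASTICSEARCH_PASSWORD": "",
}
blank = blank_settings(example_env)
```

Here `blank` contains only the variable with the empty value. A variable that is missing entirely would not be reported, so the daemonset manifest is still worth reading alongside this check.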
In an EFK setup, Fluentd suddenly stopped sending to Elasticsearch, with the following errors in the logs:
2020-09-28 18:48:55 +0000 [warn]: #0 Could not communicate to Elasticsearch, resetting connection and trying again. getaddrinfo: Name or service not known (SocketError)
2020-09-28 18:48:55 +0000 [warn]: #0 Remaining retry: 6. Retry to communicate after 512 second(s).
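The warnings quoted in this thread pair "Remaining retry" with a wait that doubles each time (14 with 2 seconds, 13 with 4 seconds, and here 6 with 512 seconds). A minimal sketch of that pattern follows, assuming 15 total attempts and a base of 2. Both values are inferred from the logs shown here, not taken from the plugin's documentation.

```python
def retry_waits(total_retries=15, base=2):
    """Yield (remaining_retries, wait_seconds) pairs in the doubling pattern."""
    for attempt in range(1, total_retries):
        yield total_retries - attempt, base ** attempt


schedule = dict(retry_waits())

# Waits that would already have elapsed before the "Remaining retry: 6" warning.
elapsed_before_512 = sum(
    wait for remaining, wait in schedule.items() if remaining > 6
)
```

Under that assumption, the waits before the 512-second one add up to 510 seconds, so Fluentd would have been failing for more than eight minutes by the time this warning was logged.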
The Elasticsearch components are up and running, and I can curl Elasticsearch from inside the Fluentd pod. There are no error messages in the Elasticsearch logs.
Restarting the fluentd pod or elasticsearch components did not help.
The issue was in one of the configurations that had been uploaded to Fluentd. The Elasticsearch host was set to a wrong value in that configuration. After fixing it, the issue was resolved.
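Because the error came from getaddrinfo, one way to spot a bad host value is to run the same kind of lookup on every host that appears in the loaded configurations. The sketch below assumes the host values have already been collected by hand; `resolve` is a hypothetical helper and the host names in the usage comment are placeholders.

```python
import socket


def resolve(host, port=9200):
    """Return the sorted unique addresses for host, or [] if the lookup fails."""
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


# Usage with placeholder host values (not executed here):
# for host in ["elasticsearch", "elasticsearch-typo"]:
#     print(host, resolve(host) or "does not resolve")
```

A curl test against the correct host from inside the pod can succeed while one of several match sections still points at a name that does not resolve, which matches what this answer describes.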