Fluentd forwarder DaemonSet has wrong logs format - elasticsearch

I use the Bitnami Fluentd chart for Kubernetes, and my setup is mostly the defaults apart from a few changes.
My source section looks like:
<source>
  @type tail
  path /var/log/containers/*my-app*.log
  pos_file /opt/bitnami/fluentd/logs/buffers/fluentd-docker.pos
  tag kubernetes.*
  read_from_head true
</source>
and my application writes richer, multi-line log information to stdout, like:
2021-07-13 11:33:49.060 +0000 - [ERROR] - fatal error - play.api.http.DefaultHttpErrorHandler in postman-akka.actor.default-dispatcher-6 play.api.UnexpectedException: Unexpected exception[RuntimeException: java.net.ConnectException: Connection refused (Connection refused)]
at play.api.http.HttpErrorHandlerExceptions$.throwableToUsefulException(HttpErrorHandler.scala:328)
at play.api.http.DefaultHttpErrorHandler.onServerError(HttpErrorHandler
The problem is that in the Fluentd forwarder I can see (in /var/log/containers/*) that all records are stored in the following format:
{"log":"2021-07-13 19:54:48.523 +0000 - [ERROR] - from akka.io.TcpListener in postman-akka.actor.default-dispatcher-6 New connection accepted \n","stream":"stdout","time":"2021-07-13T19:54:48.523724149Z"}
{"log":"2021-07-13 19:54:48.523 +0000 - [ERROR] -- play.api.http.DefaultHttpErrorHandler in postman-akka.actor.default-dispatcher-6 \n","stream":"stdout","time":"2021-07-13T19:55:10.479279395Z"}
{"log":"2021-07-13 19:54:48.523 +0000 - [ERROR] - play.api.UnexpectedException: Unexpected exception[RuntimeException: }
{"log":"2021-07-13 19:54:48.523 +0000 - [ERROR] - java.net.ConnectException: Connection refused (Connection refused)] }
As you can see, the problem is that each of those lines becomes a separate log record.
I would like to extract the entire log message, including the full stack trace, so I wrote this configuration for the Fluentd parse section:
<parse>
  @type regexp
  expression /^(?<time>^(.*?:.*?)):\d\d.\d+\s\+0000 - (?<type>(\[\w+\])).- (?<text>(.*))/m
  time_key time
  time_format %Y-%m-%d %H:%M:%S
</parse>
but I am pretty sure this is not the problem, because for some reason the files in /var/log/containers/*.log already store the records in the wrong format. How can I configure the Fluentd forwarder to pick up the container logs and store them in a non-JSON format?
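For what it's worth, the JSON wrapping comes from the container runtime's json-file logging driver, not from Fluentd, so the forwarder cannot change what lands in /var/log/containers/*.log; it can only parse the log key and stitch multi-line events back together. A minimal sketch, assuming the fluent-plugin-concat plugin is available in the forwarder image and that every new record starts with a timestamp (both assumptions to verify):
<filter kubernetes.**>
  @type concat
  key log
  # assumption: each new record starts with "YYYY-MM-DD HH:MM:SS.mmm"
  multiline_start_regexp /^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}/
  separator ""
</filter>
A parse section like the regexp above can then run on the reassembled log field.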

Related

FluentD unable to establish connection to ElasticSearch

I am trying to set up FluentD + ECK on my Kubernetes cluster,
but FluentD is failing to establish a connection with ElasticSearch, which is served over SSL.
Error log
2022-10-12 04:55:27 +0000 [info]: adding match in @OUTPUT pattern="**" type="elasticsearch"
2022-10-12 04:55:29 +0000 [warn]: #0 Could not communicate to Elasticsearch, resetting connection and trying again. EOFError (EOFError)
2022-10-12 04:55:29 +0000 [warn]: #0 Remaining retry: 14. Retry to communicate after 2 second(s).
2022-10-12 04:55:33 +0000 [warn]: #0 Could not communicate to Elasticsearch, resetting connection and trying again. EOFError (EOFError)
2022-10-12 04:55:33 +0000 [warn]: #0 Remaining retry: 13. Retry to communicate after 4 second(s).
FluentD output conf:
<label @OUTPUT>
  <match **>
    @type elasticsearch
    host elasticsearch-es-http
    port 9200
    path ""
    user elastic
    password XXXXXXXXX
    ca_path "/etc/ssl/certs/ca.crt"
  </match>
</label>
Mounted the ElasticSearch secret below as a cert on Fluentd:
- name: elasticsearch-es-http-certs-public
  secret:
    secretName: elasticsearch-es-http-certs-public
- name: elasticsearch-es-http-certs-public
  mountPath: "/etc/ssl/certs"
elasticsearch-es-http is the ElasticSearch service name, and the pods are up and running.
Please guide me on where I went wrong.
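For what it's worth, a repeated EOFError often means the client is speaking plain HTTP to an endpoint that expects TLS, and ECK enables TLS on the HTTP layer by default. A hedged sketch of the output block connecting over HTTPS (scheme, ssl_verify, and ca_file are fluent-plugin-elasticsearch options; verify them against your plugin version):
<label @OUTPUT>
  <match **>
    @type elasticsearch
    scheme https                    # ECK serves HTTPS by default
    host elasticsearch-es-http
    port 9200
    user elastic
    password XXXXXXXXX
    ssl_verify true
    ca_file /etc/ssl/certs/ca.crt   # CA from the mounted certs-public secret
  </match>
</label>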

[warn]: [input_forward] incoming chunk is broken:

td-agent 4.3.0 (fluentd 1.14.3) is installed on Ubuntu 18.04. While running a Nessus scan to check for vulnerabilities on the td-agent server, the warnings below appeared in the td-agent log; logs were not pushed to Elasticsearch and td-agent stopped working.
[warn]: #0 [input_http] unexpected error error="Could not parse data entirely (0 != 85)"
[warn]: #0 [input_forward] incoming chunk is broken: host="[Nessus server ip address]" msg=36
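These warnings usually just mean that something other than a real Fluentd forwarder (here, the Nessus scanner) is sending arbitrary data to the forward and HTTP input ports. A hedged sketch, assuming the source is defined in /etc/td-agent/td-agent.conf, that adds shared-key authentication to the forward input so unauthenticated probes are rejected during the handshake (the hostname and key below are placeholders):
<source>
  @type forward
  port 24224
  bind 0.0.0.0
  <security>
    self_hostname td-agent.example.local   # placeholder hostname
    shared_key some-shared-secret          # placeholder; must match the forwarders' config
  </security>
</source>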

Fluentd is not filtering as intended before writing to Elasticsearch

Using:
Elasticsearch 7.5.1.
Fluentd 1.11.2
Fluent-plugin-elasticsearch 4.1.3
Spring Boot 2.3.3
I have a Spring Boot artifact with Logback configured with an appender that, in addition to the app's stdout, sends logs to Fluentd:
<appender name="FLUENT_TEXT"
          class="ch.qos.logback.more.appenders.DataFluentAppender">
  <filter class="ch.qos.logback.classic.filter.ThresholdFilter">
    <level>INFO</level>
  </filter>
  <tag>myapp</tag>
  <label>myservicename</label>
  <remoteHost>fluentdservicename</remoteHost>
  <port>24224</port>
  <useEventTime>false</useEventTime>
</appender>
Fluentd config file looks like this:
<ROOT>
  <source>
    @type forward
    port 24224
    bind "0.0.0.0"
  </source>
  <filter myapp.**>
    @type parser
    key_name "message"
    reserve_data true
    remove_key_name_field false
    <parse>
      @type "json"
    </parse>
  </filter>
  <match myapp.**>
    @type copy
    <store>
      @type "elasticsearch"
      host "elasticdb"
      port 9200
      logstash_format true
      logstash_prefix "applogs"
      logstash_dateformat "%Y%m%d"
      include_tag_key true
      type_name "app_log"
      tag_key "@log_name"
      flush_interval 1s
      user "elastic"
      password xxxxxx
      <buffer>
        flush_interval 1s
      </buffer>
    </store>
    <store>
      @type "stdout"
    </store>
  </match>
</ROOT>
So it just adds a filter to parse the information (a JSON string) into a structured record and then writes it to Elasticsearch (as well as to Fluentd's stdout). Note how the myapp.** pattern is used to match in both the filter and the match blocks.
Everything is up and running properly in OpenShift. Spring Boot sends the logs to Fluentd correctly, and Fluentd writes them to Elasticsearch.
But the problem is that every log generated by the app is also written. This means that every INFO log with, for example, the initial Spring configuration or any other information that the app sends through Logback is also written.
Example of "wanted" log:
2020-11-04 06:33:42.312840352 +0000 myapp.myservice: {"traceId":"bf8195d9-16dd-4e58-a0aa-413d89a1eca9","spanId":"f597f7ffbe722fa7","spanExportable":"false","X-Span-Export":"false","level":"INFO","X-B3-SpanId":"f597f7ffbe722fa7","idOrq":"bf8195d9-16dd-4e58-a0aa-413d89a1eca9","logger":"es.organization.project.myapp.commons.services.impl.LoggerServiceImpl","X-B3-TraceId":"f597f7ffbe722fa7","thread":"http-nio-8085-exec-1","message":"{\"traceId\":\"bf8195d9-16dd-4e58-a0aa-413d89a1eca9\",\"inout\":\"IN\",\"startTime\":1604471622281,\"finishTime\":null,\"executionTime\":null,\"entrySize\":5494.0,\"exitSize\":null,\"differenceSize\":null,\"user\":\"pmmartin\",\"methodPath\":\"Method Path\",\"errorMessage\":null,\"className\":\"CamelOrchestrator\",\"methodName\":\"preauthorization_validate\"}","idOp":"","inout":"IN","startTime":1604471622281,"finishTime":null,"executionTime":null,"entrySize":5494.0,"exitSize":null,"differenceSize":null,"user":"pmmartin","methodPath":"Method Path","errorMessage":null,"className":"CamelOrchestrator","methodName":"preauthorization_validate"}
Example of "unwanted" logs (check how there is a Fluentd warning per each unexpected log message):
2020-11-04 06:55:09.000000000 +0000 myapp.myservice: {"level":"INFO","logger":"org.apache.camel.impl.engine.InternalRouteStartupManager","thread":"restartedMain","message":"Route: route6 started and consuming from: servlet:/preAuth"}
2020-11-04 06:55:09 +0000 [warn]: #0 dump an error event: error_class=Fluent::Plugin::Parser::ParserError error="pattern not matched with data 'Total 20 routes, of which 20 are started'" location=nil tag="myapp.myservice" time=1604472909 record={"level"=>"INFO", "logger"=>"org.apache.camel.impl.engine.AbstractCamelContext", "thread"=>"restartedMain", "message"=>"Total 20 routes, of which 20 are started"}
2020-11-04 06:55:09.000000000 +0000 myapp.myservice: {"level":"INFO","logger":"org.apache.camel.impl.engine.AbstractCamelContext","thread":"restartedMain","message":"Total 20 routes, of which 20 are started"}
2020-11-04 06:55:09 +0000 [warn]: #0 dump an error event: error_class=Fluent::Plugin::Parser::ParserError error="pattern not matched with data 'Apache Camel 3.5.0 (MyService DEMO Mode) started in 0.036 seconds'" location=nil tag="myapp.myservice" time=1604472909 record={"level"=>"INFO", "logger"=>"org.apache.camel.impl.engine.AbstractCamelContext", "thread"=>"restartedMain", "message"=>"Apache Camel 3.5.0 (MyService DEMO Mode) started in 0.036 seconds"}
2020-11-04 06:55:09.000000000 +0000 myapp.myservice: {"level":"INFO","logger":"org.apache.camel.impl.engine.AbstractCamelContext","thread":"restartedMain","message":"Apache Camel 3.5.0 (MyService DEMO Mode) started in 0.036 seconds"}
2020-11-04 06:55:09 +0000 [warn]: #0 dump an error event: error_class=Fluent::Plugin::Parser::ParserError error="pattern not matched with data 'Started MyServiceApplication in 15.446 seconds (JVM running for 346.061)'" location=nil tag="myapp.myservice" time=1604472909 record={"level"=>"INFO", "logger"=>"es.organization.project.myapp.MyService", "thread"=>"restartedMain", "message"=>"Started MyService in 15.446 seconds (JVM running for 346.061)"}
The question is: how do I tell Fluentd to really filter the information that reaches it, so that the unwanted records are discarded?
Thanks to @Azeem, and according to the grep and regexp features documentation, I got it :).
I just added this to my Fluentd config file:
<filter myapp.**>
  @type grep
  <regexp>
    key message
    pattern /^.*inout.*$/
  </regexp>
</filter>
Any line that does not contain the word "inout" is now excluded.
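If matching on a substring of the message ever becomes too brittle, the same grep filter also supports <exclude> blocks, so noisy loggers could be dropped by name instead. A sketch based on the logger field visible in the records above (the pattern is an assumption, not part of the original answer):
<filter myapp.**>
  @type grep
  <exclude>
    key logger
    pattern /^org\.apache\.camel\./
  </exclude>
</filter>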

Fluentd - Could not communicate to Elasticsearch, resetting connection and trying again. getaddrinfo: Name or service not known (SocketError)

In an EFK setup, the fluentd suddenly stopped sending to elasticsearch with the following errors in the logs:
2020-09-28 18:48:55 +0000 [warn]: #0 Could not communicate to Elasticsearch, resetting connection and trying again. getaddrinfo: Name or service not known (SocketError)
2020-09-28 18:48:55 +0000 [warn]: #0 Remaining retry: 6. Retry to communicate after 512 second(s).
The elasticsearch components are up and running, and I can curl and access elasticsearch from inside the fluentd pod. There is no error message in the logs of the elasticsearch.
Restarting the fluentd pod or elasticsearch components did not help.
The issue was in one of the configurations uploaded to Fluentd: the Elasticsearch host was set to a wrong value there. After fixing that configuration, the issue was resolved.
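For anyone hitting the same getaddrinfo error: it means the hostname in the elasticsearch output does not resolve from inside the Fluentd pod. A hedged sketch of the relevant part of the output, assuming a hypothetical Service named elasticsearch in a logging namespace (substitute your own names):
<match **>
  @type elasticsearch
  # use the full in-cluster DNS name if Fluentd and Elasticsearch run in different namespaces
  host elasticsearch.logging.svc.cluster.local   # hypothetical service and namespace
  port 9200
</match>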

Preprocessing a message containing multiple log records

TL;DR. Is it possible to preprocess a message by splitting it on newlines, and then have each resulting message go through the Fluentd pipeline as usual?
I'm receiving these log messages in fluentd:
2018-09-13 13:00:41.251048191 +0000 : {"message":"146 <190>1 2018-09-13T13:00:40.685591+00:00 host app web.1 - 13:00:40.685 request_id=40932fe8-cd7e-42e9-af24-13350159376d [info] Received GET /alerts\n"}
2018-09-13 13:00:41.337628343 +0000 : {"message":"199 <190>1 2018-09-13T13:00:40.872670+00:00 host app web.1 - 13:00:40.871 request_id=40932fe8-cd7e-42e9-af24-13350159376d [info] Processing with Api.AlertController.index/2 Pipelines: [:api]\n156 <190>1 2018-09-13T13:00:40.898316+00:00 host app web.1 - 13:00:40.894 request_id=40932fe8-cd7e-42e9-af24-13350159376d [info] Rendered \"index.json\" in 1.0ms\n155 <190>1 2018-09-13T13:00:40.898415+00:00 host app web.1 - 13:00:40.894 request_id=40932fe8-cd7e-42e9-af24-13350159376d [info] Sent 200 response in 209.70ms\n"}
The problem is with the second message: it contains multiple application log lines.
This is, unfortunately, what I have to deal with: the system I'm working with (hello, Heroku logs!) buffers logs and then spits them out as a single chunk, making it impossible to know the number of records in the chunk upfront.
This is a known property of Heroku log draining.
Is there a way to preprocess the log message so that I get a flat stream of messages to be processed normally by subsequent Fluentd facilities?
This is what the post-processed stream of messages should look like:
2018-09-13 13:00:41.251048191 +0000 : {"message":"146 <190>1 2018-09-13T13:00:40.685591+00:00 host app web.1 - 13:00:40.685 request_id=40932fe8-cd7e-42e9-af24-13350159376d [info] Received GET /alerts\n"}
2018-09-13 13:00:41.337628343 +0000 : {"message":"199 <190>1 2018-09-13T13:00:40.872670+00:00 host app web.1 - 13:00:40.871 request_id=40932fe8-cd7e-42e9-af24-13350159376d [info] Processing with Api.AlertController.index/2 Pipelines: [:api]\n"}
2018-09-13 13:00:41.337628343 +0000 : {"message":"156 <190>1 2018-09-13T13:00:40.898316+00:00 host app web.1 - 13:00:40.894 request_id=40932fe8-cd7e-42e9-af24-13350159376d [info] Rendered \"index.json\" in 1.0ms\n"}
2018-09-13 13:00:41.337628343 +0000 : {"message":"155 <190>1 2018-09-13T13:00:40.898415+00:00 host app web.1 - 13:00:40.894 request_id=40932fe8-cd7e-42e9-af24-13350159376d [info] Sent 200 response in 209.70ms\n"}
P.S. My current config is super basic, but I'm posting it just in case. All I'm trying to do is understand whether it's possible, in principle, to preprocess the message.
<source>
  @type http
  port 5140
  bind 0.0.0.0
  <parse>
    @type none
  </parse>
</source>
<filter **>
  @type stdout
</filter>
How about https://github.com/hakobera/fluent-plugin-heroku-syslog ?
fluent-plugin-heroku-syslog has not been maintained for about four years, but it should still work with Fluentd v1 through the compatibility layer.
