How to send HAProxy logs to Fluentd via td-agent?

I want to send HAProxy logs to Fluentd/Elasticsearch/Kibana using td-agent, but I can't get it working correctly.
I have installed EFK with Docker and it runs correctly.
My HAProxy log lines (which I tag haproxy.tcp) look like this:
haproxy[27508]: info 127.0.0.1:45111 [12/Jul/2012:15:19:03.258] wss-relay wss-relay/local02_9876 0/0/50015 1277 cD 1/0/0/0/0 0/0
My td-agent.conf is this:
<source>
  @type tail
  path /var/log/haproxy.log
  format /^(?<ps>\w+)\[(?<pid>\d+)\]: (?<pri>\w+) (?<c_ip>[\w\.]+):(?<c_port>\d+) \[(?<time>.+)\] (?<f_end>[\w-]+) (?<b_end>[\w-]+)\/(?<b_server>[\w-]+) (?<tw>\d+)\/(?<tc>\d+)\/(?<tt>\d+) (?<bytes>\d+) (?<t_state>[\w-]+) (?<actconn>\d+)\/(?<feconn>\d+)\/(?<beconn>\d+)\/(?<srv_conn>\d+)\/(?<retries>\d+) (?<srv_queue>\d+)\/(?<backend_queue>\d+)$/
  tag haproxy.tcp
  time_format %d/%B/%Y:%H:%M:%S
</source>
<match haproxy.tcp>
  @type forward
  <server>
    host dockerdes01
    port 24224
  </server>
</match>
But nothing shows up in /var/log/td-agent/td-agent.log.
If I use this instead:
<match haproxy.tcp>
  @type copy
  <store>
    @type stdout
  </store>
  <store>
    @type elasticsearch
    logstash_format true
    flush_interval 10s # for testing
    host dockerdes01
    port 9200
  </store>
</match>
I see this in my /var/log/td-agent/td-agent.log:
2012-07-12 15:19:03.000000000 +0200 haproxy.tcp: {"ps":"haproxy","pid":"27508","pri":"info","c_ip":"127.0.0.1","c_port":"45111","f_end":"wss-relay","b_end":"wss-relay","b_server":"local02_9876","tw":"0","tc":"0","tt":"50015","bytes":"1277","t_state":"cD","actconn":"1","feconn":"0","beconn":"0","srv_conn":"0","retries":"0","srv_queue":"0","backend_queue":"0"}
but it doesn't arrive at the remote Fluentd...
I need the logs to arrive at Fluentd.

It is better to set up a syslog source in Fluentd and simply have HAProxy send its logs via syslog.
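A minimal sketch of that approach, assuming Fluentd's built-in syslog input listening on UDP port 5140 (the port and tag are arbitrary choices, not from the question):

# td-agent.conf: receive HAProxy logs over syslog
<source>
  @type syslog
  port 5140
  bind 0.0.0.0
  tag haproxy          # events arrive tagged haproxy.<facility>.<priority>
</source>

# haproxy.cfg (global section): send logs to the Fluentd syslog listener
#   log 127.0.0.1:5140 local0 info

Events can then be routed with a <match haproxy.**> block to the forward or elasticsearch output shown above.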

Related

Fluentd is not filtering as intended before writing to Elasticsearch

Using:
Elasticsearch 7.5.1
Fluentd 1.11.2
fluent-plugin-elasticsearch 4.1.3
Spring Boot 2.3.3
I have a Spring Boot artifact with Logback configured with an appender that, in addition to the app's STDOUT, sends logs to Fluentd:
<appender name="FLUENT_TEXT"
          class="ch.qos.logback.more.appenders.DataFluentAppender">
  <filter class="ch.qos.logback.classic.filter.ThresholdFilter">
    <level>INFO</level>
  </filter>
  <tag>myapp</tag>
  <label>myservicename</label>
  <remoteHost>fluentdservicename</remoteHost>
  <port>24224</port>
  <useEventTime>false</useEventTime>
</appender>
Fluentd config file looks like this:
<ROOT>
  <source>
    @type forward
    port 24224
    bind "0.0.0.0"
  </source>
  <filter myapp.**>
    @type parser
    key_name "message"
    reserve_data true
    remove_key_name_field false
    <parse>
      @type "json"
    </parse>
  </filter>
  <match myapp.**>
    @type copy
    <store>
      @type "elasticsearch"
      host "elasticdb"
      port 9200
      logstash_format true
      logstash_prefix "applogs"
      logstash_dateformat "%Y%m%d"
      include_tag_key true
      type_name "app_log"
      tag_key "@log_name"
      flush_interval 1s
      user "elastic"
      password xxxxxx
      <buffer>
        flush_interval 1s
      </buffer>
    </store>
    <store>
      @type "stdout"
    </store>
  </match>
</ROOT>
So it just adds a filter to parse the information (a JSON string) into a structured record and then writes it to Elasticsearch (as well as to Fluentd's STDOUT). Note how I use the myapp.** pattern so events match both the filter and the match blocks.
Everything is up and running properly in OpenShift. Spring Boot sends the logs to Fluentd correctly, and Fluentd writes them to Elasticsearch.
But the problem is that every log generated by the app is also written. This means that every INFO log, for example the initial Spring configuration or any other information the app sends through Logback, is also written.
Example of "wanted" log:
2020-11-04 06:33:42.312840352 +0000 myapp.myservice: {"traceId":"bf8195d9-16dd-4e58-a0aa-413d89a1eca9","spanId":"f597f7ffbe722fa7","spanExportable":"false","X-Span-Export":"false","level":"INFO","X-B3-SpanId":"f597f7ffbe722fa7","idOrq":"bf8195d9-16dd-4e58-a0aa-413d89a1eca9","logger":"es.organization.project.myapp.commons.services.impl.LoggerServiceImpl","X-B3-TraceId":"f597f7ffbe722fa7","thread":"http-nio-8085-exec-1","message":"{\"traceId\":\"bf8195d9-16dd-4e58-a0aa-413d89a1eca9\",\"inout\":\"IN\",\"startTime\":1604471622281,\"finishTime\":null,\"executionTime\":null,\"entrySize\":5494.0,\"exitSize\":null,\"differenceSize\":null,\"user\":\"pmmartin\",\"methodPath\":\"Method Path\",\"errorMessage\":null,\"className\":\"CamelOrchestrator\",\"methodName\":\"preauthorization_validate\"}","idOp":"","inout":"IN","startTime":1604471622281,"finishTime":null,"executionTime":null,"entrySize":5494.0,"exitSize":null,"differenceSize":null,"user":"pmmartin","methodPath":"Method Path","errorMessage":null,"className":"CamelOrchestrator","methodName":"preauthorization_validate"}
Example of "unwanted" logs (check how there is a Fluentd warning per each unexpected log message):
2020-11-04 06:55:09.000000000 +0000 myapp.myservice: {"level":"INFO","logger":"org.apache.camel.impl.engine.InternalRouteStartupManager","thread":"restartedMain","message":"Route: route6 started and consuming from: servlet:/preAuth"}
2020-11-04 06:55:09 +0000 [warn]: #0 dump an error event: error_class=Fluent::Plugin::Parser::ParserError error="pattern not matched with data 'Total 20 routes, of which 20 are started'" location=nil tag="myapp.myservice" time=1604472909 record={"level"=>"INFO", "logger"=>"org.apache.camel.impl.engine.AbstractCamelContext", "thread"=>"restartedMain", "message"=>"Total 20 routes, of which 20 are started"}
2020-11-04 06:55:09.000000000 +0000 myapp.myservice: {"level":"INFO","logger":"org.apache.camel.impl.engine.AbstractCamelContext","thread":"restartedMain","message":"Total 20 routes, of which 20 are started"}
2020-11-04 06:55:09 +0000 [warn]: #0 dump an error event: error_class=Fluent::Plugin::Parser::ParserError error="pattern not matched with data 'Apache Camel 3.5.0 (MyService DEMO Mode) started in 0.036 seconds'" location=nil tag="myapp.myservice" time=1604472909 record={"level"=>"INFO", "logger"=>"org.apache.camel.impl.engine.AbstractCamelContext", "thread"=>"restartedMain", "message"=>"Apache Camel 3.5.0 (MyService DEMO Mode) started in 0.036 seconds"}
2020-11-04 06:55:09.000000000 +0000 myapp.myservice: {"level":"INFO","logger":"org.apache.camel.impl.engine.AbstractCamelContext","thread":"restartedMain","message":"Apache Camel 3.5.0 (MyService DEMO Mode) started in 0.036 seconds"}
2020-11-04 06:55:09 +0000 [warn]: #0 dump an error event: error_class=Fluent::Plugin::Parser::ParserError error="pattern not matched with data 'Started MyServiceApplication in 15.446 seconds (JVM running for 346.061)'" location=nil tag="myapp.myservice" time=1604472909 record={"level"=>"INFO", "logger"=>"es.organization.project.myapp.MyService", "thread"=>"restartedMain", "message"=>"Started MyService in 15.446 seconds (JVM running for 346.061)"}
The question is: how do I tell Fluentd to filter the events that reach it so the unwanted ones are discarded?
Thanks to @Azeem, and following the grep filter's regexp documentation, I got it :).
I just added this to my Fluentd config file:
<filter onpay.**>
  @type grep
  <regexp>
    key message
    pattern /^.*inout.*$/
  </regexp>
</filter>
Any line that does not contain the word "inout" is now excluded.
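One refinement, sketched here on the assumption that the same myapp.** tag is used for both filters: Fluentd applies filters in the order they appear, so placing the grep filter before the parser filter also suppresses the "pattern not matched" warnings, because non-JSON messages are dropped before the parser ever sees them.

<filter myapp.**>
  @type grep
  <regexp>
    key message
    pattern /inout/       # keep only the structured audit messages
  </regexp>
</filter>
<filter myapp.**>
  @type parser            # runs after grep, so only matching records are parsed
  key_name message
  reserve_data true
  <parse>
    @type json
  </parse>
</filter>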

EFK with Searchguard

I have installed an EFK stack to log the nginx access log.
With a fresh install I'm able to send data from Fluentd to Elasticsearch without any problem. However, I then installed Search Guard to implement authentication on Elasticsearch and Kibana. Now I'm able to log in to Kibana and Elasticsearch with Search Guard's demo user credentials.
My problem now is that Fluentd is unable to connect to Elasticsearch. From the td-agent log I'm getting the following messages:
2018-07-19 15:20:34 +0600 [warn]: #0 failed to flush the buffer. retry_time=5 next_retry_seconds=2018-07-19 15:20:34 +0600 chunk="57156af05dd7bbc43d0b1323fddb2cd0" error_class=Fluent::Plugin::ElasticsearchOutput::ConnectionFailure error="Can not reach Elasticsearch cluster ({:host=>\"<elasticsearch-ip>\", :port=>9200, :scheme=>\"http\", :user=>\"logstash\", :password=>\"obfuscated\"})!"
Here is my Fluentd config
<source>
  @type forward
</source>
<match user_count.**>
  @type copy
  <store>
    @type elasticsearch
    host https://<elasticsearch-ip>
    port 9200
    ssl_verify false
    scheme https
    user "logstash"
    password "<logstash-password>"
    index_name "custom_user_count"
    include_tag_key true
    tag_key "custom_user_count"
    logstash_format true
    logstash_prefix "custom_user_count"
    type_name "custom_user_count"
    utc_index false
    <buffer>
      flush_interval 2s
    </buffer>
  </store>
</match>
sg_roles.yml:
sg_logstash:
  cluster:
    - CLUSTER_MONITOR
    - CLUSTER_COMPOSITE_OPS
    - indices:admin/template/get
    - indices:admin/template/put
  indices:
    'custom*':
      '*':
        - CRUD
        - CREATE_INDEX
    'logstash-*':
      '*':
        - CRUD
        - CREATE_INDEX
    '*beat*':
      '*':
        - CRUD
        - CREATE_INDEX
Can anyone help me on this?
It seems td-agent was using TLSv1 by default.
I added ssl_version TLSv1_2 to the config and it is now working.
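For reference, a sketch of where that option sits in the Elasticsearch <store> block (host and credentials elided as in the question):

<store>
  @type elasticsearch
  host <elasticsearch-ip>
  port 9200
  scheme https
  ssl_verify false
  ssl_version TLSv1_2   # force TLS 1.2 instead of the old TLSv1 default
  user "logstash"
  password "<logstash-password>"
  # remaining options unchanged from the configuration above
</store>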

Logs not forwarded (but fluentd container running)

I have a Fluentd container that, after a week of working regularly, stops forwarding logs to Elasticsearch.
If I run 'docker logs' on that container it shows me all the logs, but after a certain date/time they are no longer forwarded.
Fluentd config is this:
<source>
  @type forward
  @label @mainstream
  bind 0.0.0.0
  port 24224
</source>
<label @mainstream>
  <match **>
    @type copy
    <store>
      @type elasticsearch
      host elasticsearch
      port 9200
      logstash_format true
      logstash_prefix fluentd
      logstash_dateformat %Y%m%d
      include_tag_key true
      type_name access_log
      tag_key @log_name
      <buffer>
        flush_mode interval
        flush_interval 1s
      </buffer>
    </store>
  </match>
</label>
Do you have any suggestions for finding the root of this problem?
Thank you in advance.
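A sketch of one way to start narrowing this down (assuming Fluentd's built-in monitor_agent plugin is available): expose the monitoring endpoint and watch the Elasticsearch output's buffer and retry counters; if buffer_queue_length or retry_count keeps growing up to the time forwarding stops, the output is stuck retrying or its buffer has filled up.

# extra source for the Fluentd config: built-in monitoring endpoint
<source>
  @type monitor_agent
  bind 0.0.0.0
  port 24220
</source>

# then inspect the output plugin's state from the host:
#   curl http://localhost:24220/api/plugins.json
# and watch buffer_queue_length, buffer_total_queued_size and retry_count
# for the elasticsearch output.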

Fluentd - Using source tag as index

I have a setup with Fluentd and Elasticsearch running on a Docker engine. I have swarms of services which I would like to log to Fluentd.
What I want to do is create a tag for each service that I run and use that tag as an index in Elasticsearch. Here's the setup that I have:
<source>
  @type forward
  port 24224
  bind 0.0.0.0
</source>
<match docker.service1>
  @type elasticsearch
  host "172.20.0.3"
  port 9200
  index_name service1
  type_name fluentd
  flush_interval 10s
</match>
<match docker.service2>
  @type elasticsearch
  host "172.20.0.3"
  port 9200
  index_name service2
  type_name fluentd
  flush_interval 10s
</match>
and so forth.
It would be annoying to have to add a new match block for every single service I create, because I want to be able to add new services without updating my Fluentd configuration. Is there a way to do something like this:
<source>
  @type forward
  port 24224
  bind 0.0.0.0
</source>
<match docker.**>
  @type elasticsearch
  host "172.20.0.3"
  port 9200
  index_name $(TAG)
  type_name fluentd
  flush_interval 10s
</match>
where I use a $(TAG) variable to indicate that I want the tag name to be the name of the index?
I've tried ${tag_parts[0]} from an answer I found here, but it was printed literally as my index, so my index was "${tag_parts[0]}".
Thanks in advance.
I figured out that I needed to use the other Elasticsearch output plugin. Here's an example of a match block that I used:
<match>
  @type elasticsearch_dynamic
  host "172.20.0.3"
  port 9200
  type_name fluentd
  index_name ${tag_parts[2]}
  flush_interval 10s
  include_tag_key true
  reconnect_on_error true
</match>
I've used the elasticsearch_dynamic plugin instead of the elasticsearch plugin. With it, I can use the ${tag_parts} expression.
The include_tag_key option will include the tag in the JSON data.
It helps to read the documentation
I had the same problem, and the solution provided here (elasticsearch_dynamic) is being deprecated. What I ended up doing was this:
Add a record_transformer filter that adds the index name you want as a key on the record:
<filter xxx.*>
  @type record_transformer
  enable_ruby true
  <record>
    index_name ${tag_parts[1]}-${time.strftime('%Y%m')}
  </record>
</filter>
and then in the Elasticsearch output you configure:
<match xxx.*>
  @type elasticsearch-service
  target_index_key index_name
  index_name fallback-index-%Y%m
</match>
The fallback index_name here will be used if a record is missing the index_name key, but that should never happen.
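As a further alternative (a sketch, not from the original answers): recent versions of fluent-plugin-elasticsearch can expand placeholders such as ${tag} directly in index_name, provided the buffer is chunked by tag:

<match docker.**>
  @type elasticsearch
  host "172.20.0.3"
  port 9200
  index_name ${tag}        # expanded per chunk, e.g. docker.service1
  <buffer tag>             # chunk by tag so the placeholder can be resolved
    flush_interval 10s
  </buffer>
</match>

With a tag of docker.service1, events land in an index named docker.service1.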

How to create multiple indices in Elasticsearch using Fluentd (td-agent.conf)

I am setting up an EFK stack. In Kibana I want one index for the application and one index for syslogs.
I am using Fluentd for log forwarding.
syslogs --> /var/log/messages and /var/log/secure
application --> /var/log/application.log
What should the td-agent.conf look like to create the two indices? Please help.
Thank you.
If you are using the Elasticsearch output plugin and want to use Kibana, you can configure your index names by changing the logstash_prefix attribute.
Read the documentation: Elasticsearch output plugin documentation.
I have added the following fluent.conf to demonstrate your use case.
In this file I have two matches:
1. "alert" - pipes all logs tagged "alert" (FluentLogger.getLogger("alert")) to the "alert" index in Elasticsearch.
2. default match - pipes all other logs to Elasticsearch with the "fluentd" index (which is the default index of this plugin).
fluentd/conf/fluent.conf
<source>
  @type forward
  port 24224
  bind 0.0.0.0
</source>
<match alert.**>
  @type copy
  <store>
    @type elasticsearch
    host elasticsearch
    port 9200
    logstash_format true
    logstash_prefix alert
    logstash_dateformat %Y%m%d
    type_name access_log
    tag_key @log_name
    flush_interval 1s
  </store>
  <store>
    @type stdout
  </store>
</match>
<match *.**>
  @type copy
  <store>
    @type elasticsearch
    host elasticsearch
    port 9200
    logstash_format true
    logstash_prefix fluentd
    logstash_dateformat %Y%m%d
    include_tag_key true
    type_name access_log
    tag_key @log_name
    flush_interval 1s
  </store>
  <store>
    @type stdout
  </store>
</match>
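Since the question tails local files rather than receiving events over forward, here is a sketch of the same two-index idea with in_tail; the paths come from the question, while the tags, parse settings and prefixes are illustrative assumptions:

<source>
  @type tail
  path /var/log/messages,/var/log/secure
  pos_file /var/log/td-agent/syslog.pos
  tag system.syslog
  <parse>
    @type syslog
  </parse>
</source>
<source>
  @type tail
  path /var/log/application.log
  pos_file /var/log/td-agent/application.pos
  tag app.log
  <parse>
    @type none
  </parse>
</source>
<match system.**>
  @type elasticsearch
  host elasticsearch
  port 9200
  logstash_format true
  logstash_prefix syslog        # creates syslog-YYYY.MM.DD indices
</match>
<match app.**>
  @type elasticsearch
  host elasticsearch
  port 9200
  logstash_format true
  logstash_prefix application   # creates application-YYYY.MM.DD indices
</match>

With logstash_format true, the two outputs create daily syslog-* and application-* indices that can be added as separate index patterns in Kibana.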
