I have a problem with using environment variables in my td-agent config. I tried:
<source>
@type tail
path /home/td-agent/test.txt
tag "#{ENV['WEBTEST']}"
pos_file /var/log/td-agent/td-agent-test.pos
@include /etc/td-agent/web_parse_regex.conf
</source>
In /etc/sysconfig/td-agent:
export WEBTEST="webtest"
and when I start td-agent and check td-agent.log, the tag is empty:
2020-06-09 15:40:20 +0900 [info]: using configuration file: <ROOT>
<source>
@type tail
path "/home/td-agent/test.txt"
tag ""
pos_file "/var/log/td-agent/td-agent-test.pos"
.....
I'm using CentOS.
You need to make sure that /etc/sysconfig/td-agent has execute rights:
chmod a+x /etc/sysconfig/td-agent
and to make sure that the init script sources this file, the lines below need to be present in /etc/init.d/td-agent:
TD_AGENT_DEFAULT=/etc/sysconfig/td-agent
# Read configuration variable file if it is present
if [ -f "${TD_AGENT_DEFAULT}" ]; then
. "${TD_AGENT_DEFAULT}"
fi
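A quick way to verify (a sketch; paths assume the stock td-agent init layout on CentOS) is to restart the service and grep the dumped configuration in the log:
chmod a+x /etc/sysconfig/td-agent
sudo /etc/init.d/td-agent restart
grep 'tag' /var/log/td-agent/td-agent.log    # should now show tag "webtest" instead of tag ""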
I could not find a way to set env vars from inside the conf file, but you can set variable values in Ruby in the <system> block and reuse them later in the conf file:
<system>
"#{MONGO_CONNECTION_STRING='mongodb://localhost:27017/test'}"
</system>
<match>
@type mongo
connection_string "#{MONGO_CONNECTION_STRING}"
# database test
collection fluentd
</match>
I need to store the logs in an ES index. When I use logstash format, the date gets appended to the index name, as in logstash.2018-08-06, but when I try to give a custom name as in the following conf, the date is not getting added:
<store>
@type elasticsearch
host X.X.X.X
port 9200
logstash_format false
index_name updatetest.%Y%m%d --> in the index name, the date placeholders are not being replaced
</store>
Here is the index name created by the above conf: updatetest.%Y%m%d --> it should be like updatetest.20180806.
Thanks for the help in advance.
If you don't want to use the logstash format, this also works:
<store>
@type elasticsearch
host x.x.x.x
index_name test.%Y%m
<buffer tag, time>
timekey 1h
</buffer>
flush_interval 5s
</store>
Now %Y and %m get replaced. Defining a buffer makes the datetime formatting codes available.
Hi, I solved the above issue:
<store>
@type elasticsearch
host X.X.X.X
port 9200
logstash_format true
logstash_prefix babuji
</store>
My td-agent.conf:
<match test>
type webhdfs
host localhost
port 50070
path /test/%Y%m%d_%H
username hdfs
output_include_tag false
remove_prefix test
time_format %Y-%m-%d %H:%M:%S
output_include_time true
format json
localtime
buffer_type file
buffer_path /test/test
buffer_chunk_limit 4m
buffer_queue_limit 50
flush_interval 3s
</match>
In the HDFS log file it shows as below:
2016-02-22 16:04:15 {"login_id":123,"email":"abcd@gmail.com"}
Is there any way to embed the fluentd time field (instead of the client time) into the JSON data before it is stored in the file, like this:
{"time_key":"2016-02-22 16:04:15","login_id":123,"email":"abcd@gmail.com"}
I have the solution:
Use the plugin https://github.com/repeatedly/fluent-plugin-record-modifier
Add the time field and then push to HDFS.
:)
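For reference, here is a minimal sketch of the idea using the core record_transformer filter (a close alternative to record-modifier; it assumes a fluentd version with filter support, and the time_key name and layout are taken from the desired output above):
<filter test>
  @type record_transformer
  enable_ruby true
  <record>
    # embed the fluentd event time as a normal field before the record reaches webhdfs
    time_key ${Time.at(time).strftime('%Y-%m-%d %H:%M:%S')}
  </record>
</filter>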
I am working on logging with FluentD and Graylog GELF with limited success. I want to forward a JSON file:
<source>
@type tail
path /var/log/suricata/eve.json
pos_file /var/log/td-agent/suri_eve.pos # pos record
tag ids
format json
# JSON time stamp: 2016-02-01T11:52:49.157072+0000
# this timestamp is ruby's t.strftime("%Y-%m-%dT%H:%M:%S.%6N%z")
time_format %Y-%m-%dT%H:%M:%S.%6N%z
time_key timestamp # I show a JSON message below
</source>
<match **>
@type graylog
host 1.2.3.4 #(optional; default="localhost")
port 12201 #(optional; default=9200)
flush_interval 30
num_threads 2
</match>
This kicks in, but produces error messages:
2016-02-01 15:30:11 +0000 [warn]: plugin/in_tail.rb:263:rescue in
convert_line_to_event:
"{\"timestamp\":\"2016-02-01T15:27:09.000087+0000\",\"flow_id\":51921072,\"event_type\":\"flow\",\"src_ip\":\"10.1.1.85\",\"src_port\":59820,\"dest_ip\":\"224.0.0.252\",\"dest_port\":5355,\"proto\":\"UDP\",\"flow\":{\"pkts_toserver\":4,\"pkts_toclient\":0,\"bytes_toserver\":294,\"bytes_toclient\":0,\"start\":\"2016-02-01T15:26:30.393371+0000\",\"end\":\"2016-02-01T15:26:37.670904+0000\",\"age\":7,\"state\":\"new\",\"reason\":\"timeout\"}}" error="invalid time format: value = 2016-02-01T15:27:09.000087+0000,
error_class = ArgumentError, error = invalid strptime format -
`%Y-%m-%dT%H:%M:%S.%6N%z'"
An original message looks like this:
{"timestamp":"2016-02-01T15:31:02.000699+0000","flow_id":52015920,"event_type":"flow","src_ip":"10.1.1.44","src_port":49313,"dest_ip":"224.0.0.252","dest_port":5355,"proto":"UDP","flow":{"pkts_toserver":2,"pkts_toclient":0,"bytes_toserver":128,"bytes_toclient":0,"start":"2016-02-01T15:30:31.348568+0000","end":"2016-02-01T15:30:31.759024+0000","age":0,"state":"new","reason":"timeout"}}
So I checked the Ruby docs. I am not too familiar with FluentD, but from what I can tell the time format expression should fit. I tried format none, but that doesn't work either.
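One way to sanity-check the pattern outside fluentd is a Ruby one-liner (the sample timestamp is copied from the message above):
ruby -rtime -e 'p Time.strptime("2016-02-01T15:27:09.000087+0000", "%Y-%m-%dT%H:%M:%S.%6N%z")'
If this raises the same ArgumentError, the format string itself is what the parser rejects; if it parses, the problem is elsewhere in the pipeline.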
https://github.com/Graylog2/graylog2-server/issues/1761
This is a bug/problem with reserved fields (undocumented) in Graylog2.
If you find a similar bug with timestamps, check the linked issue and the dev response.
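If the strptime error is what blocks ingestion, one workaround (a sketch, not the confirmed fix from the issue) is to stop parsing the timestamp in the source, so fluentd assigns its own receive time and the original timestamp stays a plain JSON field:
<source>
  @type tail
  path /var/log/suricata/eve.json
  pos_file /var/log/td-agent/suri_eve.pos
  tag ids
  format json
  # no time_key/time_format: fluentd uses the time it read the line
</source>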
I recently started attempting to use the fluentd + elasticsearch + kibana setup.
I'm currently feeding information through fluentd by having it read a log file I'm spitting out with python code.
The log is made out of a list of json data, one per line, like so:
{"id": "1","date": "2014-02-01T09:09:59.000+09:00","protocol": "tcp","source ip": "xxxx.xxxx.xxxx.xxxx","source port": "37605","country": "CN","organization": "China Telecom jiangsu","dest ip": "xxxx.xxxx.xxxx.xxxx","dest port": "23"}
I have fluentd set up to read my field "id" and fill out "_id", as per the instructions here:
<source>
type tail
path /home/(usr)/bin1/fluentd.log
tag es
format json
keys id, date, prot, srcip, srcport, country, org, dstip, dstport
id_key id
time_key date
time_format %Y-%m-%dT%H:%M:%S.%L%:z
</source>
<match es.**>
type elasticsearch
logstash_format true
flush_interval 10s # for testing
</match>
However, the "_id" after inserting the above still comes out to be the randomly generated _id.
If anyone could point out to me what I'm doing wrong, I would much appreciate it.
id_key id should be inside <match es.**>, not <source>.
<source> is for the input plugin, tail in this case.
<match> is for the output plugin, elasticsearch in this case.
So the elasticsearch configuration should be set in <match>.
http://docs.fluentd.org/articles/config-file
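A minimal sketch of the corrected layout, based on the config above (id_key simply moves into the match block):
<source>
  type tail
  path /home/(usr)/bin1/fluentd.log
  tag es
  format json
  time_key date
  time_format %Y-%m-%dT%H:%M:%S.%L%:z
</source>

<match es.**>
  type elasticsearch
  id_key id          # the plugin now uses the record's "id" as the document _id
  logstash_format true
  flush_interval 10s # for testing
</match>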
I'm trying to extract XML data from multiple string outputs dynamically (the data changes) into a column format.
About 100 of these XML bits echo out when I run a query against an SQL database.
<?xml version="1.0"?>
<Connection>
<ConnectionType>Putty</ConnectionType>
<CreatedBy>Someone</CreatedBy>
<CreationDateTime>2014-10-27T11:53:59.8993492-04:00</CreationDateTime>
<Events>
<OpenCommentPrompt>true</OpenCommentPrompt>
<WarnIfAlreadyOpened>true</WarnIfAlreadyOpened>
</Events>
<Group>Cloud Services Client Delivery\Willis\Linux\Test - SJC</Group>
<ID>77e96d52-f165-482f-8389-ffb95b9d8ccd</ID>
<KeyboardHook>InFullScreenMode</KeyboardHook>
<MetaInformation />
<Name>Hostname-H-A10D</Name>
<OpenEmbedded>true</OpenEmbedded>
<PinEmbeddedMode>False</PinEmbeddedMode>
<Putty>
<PortFowardingArray />
<Scripting />
<SessionHost>10.0.0.100</SessionHost>
<SessionName>10.0.0.100</SessionName>
<TelnetEncoding>IBM437</TelnetEncoding>
</Putty>
<ScreenColor>C24Bits</ScreenColor>
<SoundHook>DoNotPlay</SoundHook>
<Stamp>771324d1-0c59-4f12-b81e-96edb5185ef7</Stamp>
</Connection>
And what I need is the <Name> and <SessionHost> in a column format. Essentially, where the hostname equals Hostname-H-A10D, I want to be able to match the D at the end and mark the first column with Dev, Q as Test, and no letter at the end as Prod. So the output would look like -->
Dev Hostname-H-A10D 10.0.0.100
Dev Hostname-H-A11D 10.0.0.101
Prod Hostname-H-A12 10.0.0.201
Test Hostname-H-A13Q 10.0.0.10
I have played around with sed/awk/etc. and just cannot get the format I want without writing out temp flat files. I would prefer to get this into an array using something like xmlstarlet or xmllint. Of course, better suggestions are welcome, and that is why I am here :) Thanks, folks.
It would be better to use an XML parser.
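For instance, a sketch with xmlstarlet (connections.xml is a placeholder filename, and it assumes the snippets are split into, or wrapped as, well-formed documents, since a file holding many <?xml?> declarations is not itself valid XML):
xmlstarlet sel -t -m '//Connection' -v 'Name' -o ' ' -v 'Putty/SessionHost' -n connections.xml
That prints "Hostname-H-A10D 10.0.0.100" per record; the Dev/Test/Prod classification could then be applied to the first column in a small shell or awk step.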
Using awk:
$ awk -F'[<>]' 'BEGIN{a["D"]="Dev";a["Q"]="Test"} /Name/{name=$3; type=a[substr(name,length(name))]; if (length(type)==0) type="Prod";} /SessionHost/{print type, name, $3;}' s.xml
Dev Hostname-H-A10D 10.0.0.100
How it works
BEGIN{a["D"]="Dev";a["Q"]="Test"}
This defines associative array a.
/Name/{name=$3; type=a[substr(name,length(name))]; if (length(type)==0) type="Prod";}
On the line that has the host name, this captures the host name and, from it, determines the host type.
/SessionHost/{print type, name, $3;}
On the line that contains the host IP, this prints the type, name, and IP.
You have not mentioned any parameter in the XML file that indicates whether the host is Dev, Prod, or Test. But from the above XML file you can get the name in the following way:
$cat test.xml |grep Name |awk -F '[<,>]' '{print $3}' |xargs
Hostname-H-A10D 10.0.0.100