time_format configuration in fluent.conf - elasticsearch

What is the correct time_format for following date and time format in fluent.conf configuration.
12/Apr/2021:12:17:03.747 +0530
I tried with below but not sure whether it is correct because access logs are not printing in the Kibana.
time_format %d/%b/%Y:%H:%M:%3N %z
Thanks.

I was able to make it work with following format.
time_format %d/%b/%Y:%H:%M:%S.%N %z
Thanks.

Related

Fluentbit Kubernetes - How to extract fields from existing logs

I have configured EFK stack with Fluent-bit on my Kubernetes cluster. I can see the logs in Kibana.
I also have deployed nginx pod, I can see the logs of this nginx pod also in Kibana. But all the log data are sent to a single field "log" as shown below.
How can I extract each field into a separate field. There is a solution for fluentd already in this question. Kibana - How to extract fields from existing Kubernetes logs
But how can I achieve the same with fluent-bit?
I have tried the below by adding one more FILTER section under the default FILTER section for Kubernetes, but it didn't work.
[FILTER]
Name parser
Match kube.*
Key_name log
Parser nginx
From this (https://github.com/fluent/fluent-bit/issues/723), I can see there is no grok support for fluent-bit.
In our official documentation for Kubernetes filter we have an example about how to make your Pod suggest a parser for your data based in an annotation:
https://docs.fluentbit.io/manual/filter/kubernetes
Look at this configmap:
https://github.com/fluent/fluent-bit-kubernetes-logging/blob/master/output/elasticsearch/fluent-bit-configmap.yaml
The nginx parser should be there:
[PARSER]
Name nginx
Format regex
Regex ^(?<remote>[^ ]*) (?<host>[^ ]*) (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$
Time_Key time
Time_Format %d/%b/%Y:%H:%M:%S %z

Converting timestamp with time zone to local time in Bash

I am trying to do basic time math in Bash having following timestamps from OpenDj:
createtimestamp: 20161123165725Z
I don't know what time zone the system will be set with, it can be with "Zulu" timezone (UTC) as above, or might be with some other timezone. The above timestamp is for the users on LDAP servers. What I need to do is to compare the timestamp of the user with a certain timestamp.
Question: how do I convert the timestamp with zone (Z) above to local timestamp or epoch time in Bash?
Maybe Dateutils will help
http://www.fresse.org/dateutils/
dateconv -i '%s' -f '%A, %d %b %Y' 1234567890
Friday, 13 Feb 2009
dateutils tools default to UTC (add -z your/timezone if needed).

Include fluentd time into json post data

td-agent.config
<match test>
type webhdfs
host localhost
port 50070
path /test/%Y%m%d_%H
username hdfs
output_include_tag false
remove_prefix test
time_format %Y-%m-%d %H:%M:%S
output_include_time true
format json
localtime
buffer_type file
buffer_path /test/test
buffer_chunk_limit 4m
buffer_queue_limit 50
flush_interval 3s
</match>
In hdfs log file it show as below:
2016-02-22 16:04:15 {"login_id":123,"email":"abcd#gmail.com"}
Have any way to embed the fluentd time field not the client time into json data before store in file such as:
{"time_key":"2016-02-22 16:04:15","login_id":123,"email":"abcd#gmail.com"}
I have the solution :
Use plugin https://github.com/repeatedly/fluent-plugin-record-modifier
Add the field time and then push to hdfs
:)

How to forward a JSON file with FluentD to Graylog2 with a valid time format

I am working on logging with FluentD and Graylog GELF with limited success. I want to forward a JSON file:
<source>
#type tail
path /var/log/suricata/eve.json
pos_file /var/log/td-agent/suri_eve.pos # pos record
tag ids
format json
# JSON time stamp: 2016-02-01T11:52:49.157072+0000
# this timestamp is ruby's t.strftime("%Y-%m-%dT%H:%M:%S.%6N%z")
time_format %Y-%m-%dT%H:%M:%S.%6N%z
time_key timestamp # I show a JSON message below
</source>
<match **>
#type graylog
host 1.2.3.4 #(optional; default="localhost")
port 12201 #(optional; default=9200)
flush_interval 30
num_threads 2
</match>
This kicks in, but produces error messages:
2016-02-01 15:30:11 +0000 [warn]: plugin/in_tail.rb:263:rescue in
convert_line_to_event:
"{\"timestamp\":\"2016-02-01T15:27:09.000087+0000\",\"flow_id\":51921072,\"event_type\":\"flow\",\"src_ip\":\"10.1.1.85\",\"src_port\":59820,\"dest_ip\":\"224.0.0.252\",\"dest_port\":5355,\"proto\":\"UDP\",\"flow\":{\"pkts_toserver\":4,\"pkts_toclient\":0,\"bytes_toserver\":294,\"bytes_toclient\":0,\"start\":\"2016-02-01T15:26:30.393371+0000\",\"end\":\"2016-02-01T15:26:37.670904+0000\",\"age\":7,\"state\":\"new\",\"reason\":\"timeout\"}}" error="invalid time format: value = 2016-02-01T15:27:09.000087+0000,
error_class = ArgumentError, error = invalid strptime format -
`%Y-%m-%dT%H:%M:%S.%6N%z'"
An original messages looks like this:
{"timestamp":"2016-02-01T15:31:02.000699+0000","flow_id":52015920,"event_type":"flow","src_ip":"10.1.1.44","src_port":49313,"dest_ip":"224.0.0.252","dest_port":5355,"proto":"UDP","flow":{"pkts_toserver":2,"pkts_toclient":0,"bytes_toserver":128,"bytes_toclient":0,"start":"2016-02-01T15:30:31.348568+0000","end":"2016-02-01T15:30:31.759024+0000","age":0,"state":"new","reason":"timeout"}}
So I checked the Ruby docs. I am not too familiar with FluentD but from what I know the time format expression should fit? I tried format=none but that also doesn't work.
https://github.com/Graylog2/graylog2-server/issues/1761
This is a bug/problem with reserved fields (undocumented) in Graylog2.
If you find a similar bug with timestamps, check the linked issue and the dev response.

Using id_key with fluentd/elasticsearch

I recently started attempting to use the fluentd + elasticsearch + kibana setup.
I'm currently feeding information through fluentd by having it read a log file I'm spitting out with python code.
The log is made out of a list of json data, one per line, like so:
{"id": "1","date": "2014-02-01T09:09:59.000+09:00","protocol": "tcp","source ip": "xxxx.xxxx.xxxx.xxxx","source port": "37605","country": "CN","organization": "China Telecom jiangsu","dest ip": "xxxx.xxxx.xxxx.xxxx","dest port": "23"}
I have the fluentd set-up to read my field "id" and fill out "_id", as per instructions here:
<source>
type tail
path /home/(usr)/bin1/fluentd.log
tag es
format json
keys id, date, prot, srcip, srcport, country, org, dstip, dstport
id_key id
time_key date
time_format %Y-%m-%dT%H:%M:%S.%L%:z
</source>
<match es.**>
type elasticsearch
logstash_format true
flush_interval 10s # for testing
</match>
However, the "_id" after inserting the above still comes out to be the randomly generated _id.
If anyone could point out to me what I'm doing wrong, I would much appreciate it.
id_key id should be in inside <match es.**>, not <source>.
<source> is for input plugin, tail in this case.
<match> is for output plugin, elasticsearch in this case.
So elasticsearch configuration should be set in <match>.
http://docs.fluentd.org/articles/config-file

Resources