Pattern not match error in fluentd while tailing json file - elasticsearch

I have installed the Fluentd logger and I want it to monitor the logs of my Python code. The logs are JSON and look like the following:
{
  "FileNo": 232,
  "FileClass": "timitry",
  "FileLevel": "24",
  "DataCount": 5,
  "Data": {
    "User1": <Username>,
    "User2": <Username>,
    "User3": <Username>,
    "User4": <Username>,
    "User5": <Username>
  },
  "time": "2018-05-14T05:33:02.071793"
}
This file is updated every 5 minutes. I need to configure a Fluentd input for it so that it reads the new JSON data and publishes it to Elasticsearch. I don't really know which input plugin to use here, but I tried tail, which gives me the following errors:
2018-05-14 05:31:04 +0000 [warn]: #0 pattern not match: " \"FileClass\": \"timitry\","
This is the same for all the data. Can anyone please suggest how I can resolve this issue? Below is the configuration file:
<source>
  @type tail
  format json
  path /home/user/Documents/logs/file_log.json
  tag first
</source>
<match first*>
  @type elasticsearch
  hosts 192.168.111.456:9200
  user <username>
  password <password>
</match>
I have seen others using regex and other formats. Do I also need to use them? How can the logs generated by the Python code be consumed by Fluentd and published to Elasticsearch?
Thanks

Could you try removing the wildcard after first in your match directive? Like:
<match first>
  @type elasticsearch
  hosts 192.168.111.456:9200
  user <username>
  password <password>
</match>
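As a side note, in_tail with format json parses each line as one JSON object, so a pretty-printed record spread over several lines (as in the sample above) will produce exactly these "pattern not match" warnings for every line. A minimal sketch of how the Python side could write one record per line instead (the path and field values are just placeholders taken from the sample):
import json
from datetime import datetime

def append_record(record, path="/home/user/Documents/logs/file_log.json"):
    # Write the record as a single JSON line so Fluentd's json parser can read it.
    record["time"] = datetime.utcnow().isoformat()
    with open(path, "a") as f:
        f.write(json.dumps(record) + "\n")

append_record({"FileNo": 232, "FileClass": "timitry", "FileLevel": "24", "DataCount": 5})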

Related

How can I send log from fluentd to influxDB?

I'm having trouble with the Fluentd parser: it is not working, and I can't figure out where the problem is.
fluentd.conf
<source>
  @type tail
  @id in_tail_app_logs
  path /tmp/log/test/*.log
  pos_file /var/log/app.log.pos
  tag logging.app
  refresh_interval 1
  read_from_head true
  <parse>
    @type regexp
    expression /^(?<time>.+?)\t(?<logname>.+?)\t(?<log>.+?)$/
    time_key time
    time_format %Y-%m-%dT%H:%M:%S%:z
  </parse>
</source>
<match logging.app>
  # @type stdout
  @type copy
  <store ignore_error>
    @type file
    path /tmp/log/influx
  </store>
  <store>
    @type influxdb
    host 20.10.222.22
    port 8086
    user test
    password test123
    use_ssl false
    dbname test
    measurement test_measurement
    time_precision s
    auto_tags true
    flush_interval 10
    verify_ssl false
    sequence_tag _seq
  </store>
</match>
/tmp/log/test/buffer.*.log
...
2023-02-06T08:09:46+00:00 kubernetes.var.log.containers.app-test-app-test-space-fb5ffca1ec-0_cf-workloads_opi-05942675f9f0dd17039304733f228e7abd6ebdfd91609baf1a7afefaeb33ced8.log {"stream":"stdout","log":"Console output from test-node-app","docker":{"container_id":"05942675f9f0dd17039304733f228e7abd6ebdfd91609baf1a7afefaeb33ced8"},"kubernetes":{"container_name":"opi","namespace_name":"cf-workloads","pod_name":"app-test-app-test-space-fb5ffca1ec-0","pod_id":"f4f2592e-ed57-4375-aff7-40c7b214abe0","host":"ap-joy-sidecar-5","labels":{"controller-revision-hash":"app-test-app-test-space-fb5ffca1ec-5bb7dc5769","cloudfoundry_org/app_guid":"8fc07280-506c-49b2-ab00-a97222fcf0a5","cloudfoundry_org/guid":"8fc07280-506c-49b2-ab00-a97222fcf0a5","cloudfoundry_org/org_guid":"fdf0a222-33a4-46fd-a7f9-7955b9ea862c","cloudfoundry_org/org_name":"system","cloudfoundry_org/process_type":"web","cloudfoundry_org/source_type":"APP","cloudfoundry_org/space_guid":"f1b70a4b-7581-4214-b128-2f1597f7789d","cloudfoundry_org/space_name":"app-test-space","cloudfoundry_org/version":"ebfde654-e73c-48da-b55b-42a37a6ba139","security_istio_io/tlsMode":"istio","service_istio_io/canonical-name":"app-test-app-test-space-fb5ffca1ec","service_istio_io/canonical-revision":"latest","statefulset_kubernetes_io/pod-name":"app-test-app-test-space-fb5ffca1ec-0"}},"app_id":"8fc07280-506c-49b2-ab00-a97222fcf0a5","instance_id":"f4f2592e-ed57-4375-aff7-40c7b214abe0","structured_data":"[tags#47450 source_type=\"APP/PROC/WEB\"]"}
...
I think \t exists between time, logname, and log.
I tried the following methods.
What I tried
1. fluentd.conf
json
<parse>
  @type json
  time_format %Y-%m-%dT%H:%M:%S%:z
</parse>
tsv
<parse>
  @type tsv
  keys time,logname,logs
  time_key time
  time_format %Y-%m-%dT%H:%M:%S%:z
</parse>
none
<parse>
  @type none
</parse>
2. Result
No output log file was created in any of these cases.
There is no output when using stdout, except with the json parser.
What I want
I want the parser to work correctly and to send the data to InfluxDB.
Please help me...
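One way to narrow this down is to test the expression outside Fluentd against a raw line from the file, for example with Python's re module (a sketch; the sample line is a shortened stand-in for the real one above):
import re

# Same pattern as the <parse> expression, written with Python's named-group syntax.
pattern = re.compile(r"^(?P<time>.+?)\t(?P<logname>.+?)\t(?P<log>.+?)$")

# Shortened stand-in for one line of /tmp/log/test/buffer.*.log
sample = "2023-02-06T08:09:46+00:00\tkubernetes.var.log.containers.app.log\t{\"stream\":\"stdout\"}"

m = pattern.match(sample)
if m:
    print(m.groupdict())  # the line really is tab-separated and the regexp splits it
else:
    print("no match - the separators are probably not literal tab characters")
If the pattern matches here but not in Fluentd, the remaining suspect is the time_format; if it does not match here either, the separators in the buffer file are not real tabs.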

Unable to monitor Elasticsearch server logs in kibana dashboard

Unable to monitor Elasticsearch server logs in Kibana dashboard.
I have 2 RHEL VMs for testing. I'm using this approach since production has a different architecture:
VM1- Elasticsearch,Kibana,Rsyslog
VM2 - FluentD
I want to push the Elasticsearch logs from VM1 using Rsyslog, send them to VM2 where Fluentd is installed, and have Fluentd send them back to Elasticsearch on VM1. Below are the configurations.
I've tried installing Fluentd on the Elasticsearch VM and was able to see the Elasticsearch logs in Kibana. But my requirement is to use Rsyslog and send the logs to Fluentd, since Fluentd is not installed on the Elasticsearch VMs.
td-agent.conf
<system>
  log_level info
  workers 2
</system>
<source>
  @type tcp
  port 5142
  bind 0.0.0.0
  <parse>
    @type multiline
    format_firstline /^(?<date>\[.*?\])/
    format1 /(?<date>\[.*?\])(?<logLevel>\[.*?\])(?<service>\[.*?\]) (?<node_name>\[.*?\]) (?<LogMessage>.*)/
  </parse>
  tag es_logs
</source>
<source>
  @type syslog
  port 5145
  <transport tcp>
  </transport>
  bind 0.0.0.0
  tag syslog
</source>
<filter es_logs**>
  @type parser
  format json
  time_key time_msec
  key_name message
  reserve_data true # tells Fluentd to keep the encompassing JSON - off by default
  remove_key_name_field true # removes the key of the parsed JSON: message - off by default
</filter>
<match es**>
  @type elasticsearch
  host vm1ip
  port 9200
  index_name es_logs_write
  include_timestamp true
  type_name fluentd
  # connection configs
  reconnect_on_error true
  reload_on_failure true
  slow_flush_log_threshold 90
  # buffer configs
  <buffer>
    @type file
    path /data/opt/fluentd/buffer/elaticsearch_logs
    chunk_limit_size 2MB
    total_limit_size 1GB
    flush_thread_count 8
    flush_mode interval
    retry_type exponential_backoff
    retry_timeout 10s
    retry_max_interval 30
    overflow_action drop_oldest_chunk
    flush_interval 5s
  </buffer>
</match>
rsyslog.conf
# Sample rsyslog configuration file
#
$ModLoad imfile
$ModLoad immark
$ModLoad imtcp
$ModLoad imudp
#$ModLoad imsolaris
$ModLoad imuxsock
module(load="omelasticsearch")
template(name="es_logs" type="list" option.json="on") {
constant(value="{")
constant(value="\"#timestamp\":\"") property(name="timereported" dateFormat="rfc3339")
constant(value="\",\"host\":\"") property(name="hostname")
constant(value="\",\"severity-num\":") property(name="syslogseverity")
constant(value=",\"facility-num\":") property(name="syslogfacility")
constant(value=",\"severity\":\"") property(name="syslogseverity-text")
constant(value="\",\"facility\":\"") property(name="syslogfacility-text")
constant(value="\",\"syslogtag\":\"") property(name="syslogtag")
constant(value="\",\"message\":\"") property(name="msg")
constant(value="\"}")
}
$UDPServerRun 514
#### GLOBAL DIRECTIVES ####
# Use default timestamp format
$ActionFileDefaultTemplate RSYSLOG_TraditionalFileFormat
# Where to place auxiliary files
$WorkDirectory /var/lib/rsyslog
#### RULES ####
# Log all kernel messages to the console.
# Logging much else clutters up the screen.
#kern.* /dev/console
# Log anything (except mail) of level info or higher.
# Don't log private authentication messages!
*.none;mail.none;authpriv.none;cron.none;local6.none /var/log/messages
# Log auth.info separate
auth.info /var/log/authlog
# The authpriv file has restricted access.
authpriv.* /var/log/secure
# Log all the mail messages in one place.
mail.* -/var/log/maillog
# Log cron stuff
cron.* /var/log/cron
# Everybody gets emergency messages
*.emerg :omusrmsg:*
# Save news errors of level crit and higher in a special file.
uucp,news.crit /var/log/spooler
# Save boot messages also to boot.log
local7.* /var/log/boot.log
# ### begin forwarding rule ###
# The statement between the begin ... end define a SINGLE forwarding
# rule. They belong together, do NOT split them. If you create multiple
# forwarding rules, duplicate the whole block!
# Remote Logging (we use TCP for reliable delivery)
#
# An on-disk queue is created for this action. If the remote host is
# down, messages are spooled to disk and sent when it is up again.
$ActionQueueFileName fwdRule1 # unique name prefix for spool files
$ActionQueueMaxDiskSpace 1g # 1gb space limit (use as much as possible)
$ActionQueueSaveOnShutdown on # save messages to disk on shutdown
$ActionQueueType LinkedList # run asynchronously
$ActionResumeRetryCount -1 # infinite retries if host is down
$MaxMessageSize 64k
# remote host is: name/ip:port, e.g. 192.168.0.1:514, port optional
# Forward output to Fluentd
#local8.* /data/elastic_logs/elasticdemo.log
*.* @Vm1Ip:5142;es_logs
I used the below configuration, creating a new file /etc/rsyslog.d/11-elastic.conf.
For rsyslog:
$ModLoad imfile
$InputFilePollInterval 1
$InputFileName /var/log/elasticsearch/elasticdemo.log
$InputFileTag eslogs:
$InputFileStateFile eslogs
$InputFileFacility local0
$InputRunFileMonitor
:syslogtag, isequal, "eslogs:" {
:msg, contains, "ERROR" {
local0.* /var/log/eslog_error.log
local0.* @fluentdVMip:5141
}
stop
}
For FluentD
td-agent.conf
<system>
  workers 2
</system>
<source>
  @type udp
  port 5141
  tag eslogs
  <parse>
    @type multiline
    format_firstline /^\[(?<date>.*?)\]/
    format1 /\[(?<date>.*?)\]\[(?<logLevel>.*?)\]\[(?<service>.*?)\] \[(?<node_name>.*?)\](?<LogMessage>.*)/
  </parse>
</source>
<match system.**>
  @type stdout
</match>
<match eslogs.**>
  @type elasticsearch
  host <ip or domain name of the Elasticsearch server>
  port 9200
  index_name es_logs_write
  include_timestamp true
  type_name fluentd
  # connection configs
  reconnect_on_error true
  reload_on_failure true
  slow_flush_log_threshold 90
  # buffer configs
  <buffer>
    @type file
    path /data/opt/fluentd/buffer/elaticsearch_logs
    chunk_limit_size 2MB
    total_limit_size 1GB
    flush_thread_count 8
    flush_mode interval
    retry_type exponential_backoff
    retry_timeout 10s
    retry_max_interval 30
    overflow_action drop_oldest_chunk
    flush_interval 5s
  </buffer>
</match>
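To check whether anything is actually reaching the Fluentd UDP source on port 5141, one quick test is to send a single datagram from the rsyslog VM. This is only a sketch: the Fluentd IP is a placeholder, and the test line just mimics the [date][logLevel][service] [node_name] message shape the multiline parser above expects.
import socket

FLUENTD_HOST = "192.0.2.10"  # placeholder: replace with the Fluentd VM's IP
FLUENTD_PORT = 5141

test_line = "[2023-02-06T08:09:46,123][ERROR][o.e.node.Node] [node-1] test message from the rsyslog VM"

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.sendto(test_line.encode("utf-8"), (FLUENTD_HOST, FLUENTD_PORT))
sock.close()
print("sent one test datagram to %s:%d" % (FLUENTD_HOST, FLUENTD_PORT))
If this test record shows up in the es_logs_write index, the Fluentd-to-Elasticsearch leg works and the problem is on the rsyslog forwarding side.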

Custom OpenFlow via OpenDaylight has no effect

I have two VMs in an OpenStack cloud. Using the following commands, I can send data between them:
# On the server (IP 10.0.0.7)
nc -u -l -p 7865
# On the client (10.0.0.10)
nc -u 10.0.0.7 7865
Now, I would like to block the communication from 10.0.0.10 to 10.0.0.7 (but still allow it in the other direction). So I create this flow:
root@ubuntu:/opt/stack/opendaylight# cat my_custom_flow.xml
<?xml version="1.0"?>
<flow xmlns="urn:opendaylight:flow:inventory">
<priority>1</priority>
<flow-name>nakrule-custom-flow</flow-name>
<idle-timeout>12000</idle-timeout>
<match>
<ethernet-match>
<ethernet-type>
<type>2048</type>
</ethernet-type>
</ethernet-match>
<ipv4-source>10.0.0.10/32</ipv4-source>
<ipv4-destination>10.0.0.7/32</ipv4-destination>
<ip-match>
<ip-dscp>28</ip-dscp>
</ip-match>
</match>
<id>10</id>
<table_id>0</table_id>
<instructions>
<instruction>
<order>6555</order>
</instruction>
<instruction>
<order>0</order>
<apply-actions>
<action>
<order>0</order>
<drop-action/>
</action>
</apply-actions>
</instruction>
</instructions>
</flow>
Then I send the flow to my switch. I use OpenDaylight as my SDN controller to manage my OpenStack cloud. I have two switches, br-int and br-ex. A port for each VM in OpenStack is created on br-int. I can get the switch IDs with the following command:
curl -u admin:admin http://192.168.100.100:8181/restconf/config/opendaylight-inventory:nodes | python -m json.tool | grep '"id": "openflow:'[0-9]*'"'
"id": "openflow:2025202531975591"
"id": "openflow:202520253197559"
The switch with the ID 202520253197559 has a lot of flows in its table, while the other only has 2-3. So I guess 202520253197559 is br-int, and I therefore add my new flow to it with the following command:
curl -u admin:admin -H 'Content-Type: application/yang.data+xml' -X PUT -d @my_custom_flow.xml http://192.168.100.100:8181/restconf/config/opendaylight-inventory:nodes/node/openflow:202520253197559/table/234/flow/10
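For reference, the same PUT can be scripted instead of using curl; this sketch with Python's requests library reuses the URL, credentials, and XML file from the command above and changes nothing else:
import requests

url = ("http://192.168.100.100:8181/restconf/config/opendaylight-inventory:nodes"
       "/node/openflow:202520253197559/table/234/flow/10")

with open("my_custom_flow.xml") as f:
    flow_xml = f.read()

resp = requests.put(
    url,
    data=flow_xml,
    auth=("admin", "admin"),
    headers={"Content-Type": "application/yang.data+xml"},
)
print(resp.status_code, resp.text)  # 200/201 means ODL accepted the flow into its config store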
Now, I can see my flow with another REST request:
curl -u admin:admin http://192.168.100.100:8181/restconf/config/opendaylight-inventory:nodes | python -m json.tool
{
    "flow-name": "nakrule-custom-flow",
    "id": "10",
    "idle-timeout": 12000,
    "instructions": {
        "instruction": [
            {
                "order": 6555
            },
            {
                "apply-actions": {
                    "action": [
                        {
                            "drop-action": {},
                            "order": 0
                        }
                    ]
                },
                "order": 0
            }
        ]
    },
    "match": {
        "ethernet-match": {
            "ethernet-type": {
                "type": 2048
            }
        },
        "ip-match": {
            "ip-dscp": 28
        },
        "ipv4-destination": "10.0.0.7/32",
        "ipv4-source": "10.0.0.10/32"
    },
    "priority": 1,
    "table_id": 0
},
However, when I go back to my two VMs, they can still send data to each other successfully. Moreover, the following command returns nothing:
ovs-ofctl dump-flows br-int --protocols=OpenFlow13 | grep nakrule
I should see my new flow; does that mean OpenDaylight did not add it to my switch?
root@ubuntu:/opt/stack# ovs-ofctl snoop br-int
2018-05-11T09:15:27Z|00001|vconn|ERR|unix:/var/run/openvswitch/br-int.snoop: received OpenFlow version 0x04 != expected 01
2018-05-11T09:15:27Z|00002|vconn|ERR|unix:/var/run/openvswitch/br-int.snoop: received OpenFlow version 0x04 != expected 01
Thank you in advance.
Are you sure openflow:1 is the node id of the switch (br-int) that you want to program? I am doubting that. Usually openflow:1 is something we see from a mininet deployment.
Do a GET on the topology API via RESTCONF and figure out the node id of your switch(es). Or you can probably guess it by finding the MAC address of the br-int you are using and converting the hex to decimal. For example, mininet actually makes its MAC addresses simple, like 00:00:00:00:00:01, so that's why it ends up as openflow:1.
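In case it helps, that hex-to-decimal conversion is a one-liner; the MAC below is just mininet's example value:
# Convert a switch MAC address to the decimal datapath id used in the openflow:<id> node name.
mac = "00:00:00:00:00:01"  # replace with the MAC of your br-int
dpid = int(mac.replace(":", ""), 16)
print("openflow:%d" % dpid)  # -> openflow:1 for this example MAC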
Another problem I notice in your updated question is that you are sending the flow for table 234 in the URL, but specifying table 0 in the flow data.
Also, you can check the config/ store in RESTCONF for those nodes to see if ODL is even accepting the flow. If it's in the config store and that switch is connected to the openflow plugin, then the flow should be pushed down to the switch.
Another place to look for clues is the karaf.log.
Finally, if you think everything is right and the flow should be getting sent down to the switch, but the switch is not showing the flow, then try doing a packet capture. It's possible that your switch is rejecting the flow for some reason. That might also be shown in the OVS logs, if that's the case. I doubt this is the problem, but I'm adding it just in case.

How to read /var/log/wtmp logs in elasticsearch

I am trying to read the access logs from /var/log/wtmp in Elasticsearch.
I can read the file when logged into the box by using last -F /var/log/wtmp
I have Logstash running and sending logs to Elasticsearch; here is the Logstash conf file.
input {
file {
path => "/var/log/wtmp"
start_position => "beginning"
}
}
output {
elasticsearch {
host => localhost
protocol => "http"
port => "9200"
}
}
What is showing up in Elasticsearch is
G
Once I opened the file using less, I could only see binary data.
Logstash can't understand this data.
A logstash file like the following should work fine -
input {
pipe {
command => "/usr/bin/last -f /var/log/wtmp"
}
}
output {
elasticsearch {
host => localhost
protocol => "http"
port => "9200"
}
}
Vineeth's answer is right but the following cleaner config works as well:
input { pipe { command => "last" } }
last -f /var/log/wtmp and last are exactly the same.
utmp, wtmp, btmp are Unix files that keep track of user logins and logouts. They cannot be read directly because they are not regular text files. However, there is the last command which displays the information of /var/log/wtmp in plain text.
$ last --help
Usage:
last [options] [<username>...] [<tty>...]
I can read the file when logged into the box by using last -F /var/log/wtmp
I doubt that. What the -F flag does:
-F, --fulltimes print full login and logout times and dates
So, last -F /var/log/wtmp will interpret /var/log/wtmp as a username and won't print any login information.
What the -f flag does:
-f, --file <file> use a specific file instead of /var/log/wtmp
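If you ever need the same records from a script rather than from Logstash, the conversion step is identical; a minimal sketch using Python's subprocess module (the path given is the default wtmp location anyway):
import subprocess

# wtmp is binary; the last command renders it as plain text lines.
result = subprocess.run(
    ["last", "-f", "/var/log/wtmp"],
    capture_output=True,
    text=True,
    check=True,
)

for line in result.stdout.splitlines():
    if line.strip():  # skip blank lines
        print(line)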

Logstash not matching the pattern

I was learning Logstash and have a very simple config file:
input {
file {
path => "D:\b.log"
start_position => beginning
}
}
# The filter part of this file is commented out to indicate that it is
# optional.
filter {
grok {
match => { "message" => "%{LOGLEVEL:loglevel}" }
}
}
output {
stdout { codec => rubydebug }
}
The input file is just this:
INFO
I am running Logstash on Windows and the command is:
logstash -f logstash.conf
I expect the output to be shown on the console to ensure that it's working, but Logstash produces no output, just the Logstash startup messages:
D:\Installables\logstash-2.0.0\logstash-2.0.0\bin>logstash -f logstash.conf
io/console not supported; tty will not be manipulated
Default settings used: Filter workers: 2
Logstash startup completed
I have deleted the sincedb file and tried again. Is there something that I am missing?
I think this answers your question:
How to force Logstash to reparse a file?
It looks like you are missing the quotes around "beginning", and the other post recommends redirecting sincedb to /dev/null. I don't know if there is a Windows equivalent for that. I used that as well, and it worked fine.
As an alternative, what I do now is to configure stdin() as input so that I don't have to worry about anything else.
