grok script for writing to logstash and rendering in Kibana - elasticsearch

I am following filebeat->logstash->elasticsearch->kibana pipeline. filebeat successfully working and fetching the logs from the target file.
Logstash receiving the logs on input plugin and bypassing the filter plugin and sending over to the output plugin.
filebeat.yml
# ============================== Filebeat inputs ===============================
filebeat.inputs:
- type: log
enabled: true
paths:
- D:\serverslogs\ch5shdmtbuil100\TeamCity.BuildServer-logs\launcher.log
fields:
type: launcherlogs
- type: filestream
# Change to true to enable this input configuration.
enabled: false
# Paths that should be crawled and fetched. Glob based paths.
paths:
- /var/log/*.log
# =================================== Kibana ===================================
setup.kibana:
host: "localhost:5601"
# ------------------------------ Logstash Output -------------------------------
output.logstash:
# The Logstash hosts
hosts: ["localhost:5044"]
logstash.conf
input{
beats{
port => "5044"
}
}
filter {
if [fields][type] == "launcherlogs"{
grok {
match => {"message" =>%{YEAR:year}-%{MONTH:month}-%{MONTHDAY:day}%{DATA:loglevel}%{SPACE}-%{SPACE}%{DATA:class}%{SPACE}-%{GREEDYDATA:message}}
}
}
}
output{
elasticsearch{
hosts => ["http://localhost:9200"]
index => "poclogsindex"
}
}
I am able to send the logs on kibana but the grok debugger scripts is not rendering desired json on kibana.
The data json rendered on Kibana is not showing all the attributes passed in the script. Please advise.

Your grok pattern does not match the sample you gave in comment : several parts are missing (the brackets, the HH:mm:ss,SSS part and an additionnal space). Grok debuggers are your friends ;-)
Instead of :
%{YEAR:year}-%{MONTH:month}-%{MONTHDAY:day}%{DATA:loglevel}%{SPACE}-%{SPACE}%{DATA:class}%{SPACE}-%{GREEDYDATA:message}
Your pattern should be :
\[%{TIMESTAMP_ISO8601:timestamp}\] %{DATA:loglevel}%{SPACE}-%{SPACE}%{DATA:class}%{SPACE}-%{GREEDYDATA:message}
TIMESTAMP_ISO8601 matches this date/time.
Additionnally, I always doubled-quote the pattern, so the grok part would be :
grok {
match => {"message" => "\[%{TIMESTAMP_ISO8601:timestamp}\] %{DATA:loglevel}%{SPACE}-%{SPACE}%{DATA:class}%{SPACE}-%{GREEDYDATA:message}"}
}

Related

How to resolve parsing error for CSV file in Logstash

I am using Filebeat to send a CSV file to Logstash and then up to Kibana, however I am getting a parsing error when the CSV file is picked up by Logstash.
This is the contents of the CSV file:
time version id score type
May 6, 2020 # 11:29:59.863 1 2 PPy_6XEBuZH417wO9uVe _doc
The logstash.conf:
input {
beats {
port => 5044
}
}
filter {
csv {
separator => ","
columns =>["time","version","id","index","score","type"]
}
}
output {
elasticsearch {
hosts => ["http://localhost:9200"]
index => "%{[#metadata][beat]}-%{[#metadata][version]}-%{+YYYY.MM.dd}"
}
}
Filebeat.yml:
filebeat.inputs:
# Each - is an input. Most options can be set at the input level, so
# you can use different inputs for various configurations.
# Below are the input specific configurations.
- type: log
# Change to true to enable this input configuration.
enabled: true
# Paths that should be crawled and fetched. Glob based paths.
paths:
- /etc/test/*.csv
#- c:\programdata\elasticsearch\logs\*
and the error in Logstash:
[2020-05-27T12:28:14,585][WARN ][logstash.filters.csv ][main] Error parsing csv {:field=>"message", :source=>"time,version,id,score,type,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,", :exception=>#<TypeError: wrong argument type String (expected LogStash::Timestamp)>}
[2020-05-27T12:28:14,586][WARN ][logstash.filters.csv ][main] Error parsing csv {:field=>"message", :source=>"\"May 6, 2020 # 11:29:59.863\",1,2,PPy_6XEBuZH417wO9uVe,_doc,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,", :exception=>#<TypeError: wrong argument type String (expected LogStash::Timestamp)>}
I do get some data in Kibana but not what I want to see.
I have managed to get it to work locally. the mistakes I have noticed so far were:
Using ES reserved fields like #timestamp, #version, and more.
The timestamp was not in ISO8601 format. It had an # sign in the middle.
Your filter set the separator to , but your CSV real separator is "\t".
According to the error you can see it is trying to also work on your titles line, I suggest you remove it from the CSV or use the skip_header option.
Below is the logstash.conf file I used:
input {
file {
path => "C:/work/elastic/logstash-6.5.0/config/test.csv"
start_position => "beginning"
}
}
filter {
csv {
separator => ","
columns =>["time","version","id","score","type"]
}
}
output {
elasticsearch {
hosts => ["localhost:9200"]
index => "csv-test"
}
}
The CSV file I used:
May 6 2020 11:29:59.863,1,PPy_6XEBuZH417wO9uVe,_doc
May 6 2020 11:29:59.863,1,PPy_6XEBuZH417wO9uVe,_doc
May 6 2020 11:29:59.863,1,PPy_6XEBuZH417wO9uVe,_doc
May 6 2020 11:29:59.863,1,PPy_6XEBuZH417wO9uVe,_doc
From my Kibana:

Filtering Filebeat input with or without Logstash

In our current setup we use Filebeat to ship logs to an Elasticsearch instance. The application logs are in JSON format and it runs in AWS.
For some reason AWS decided to prefix the log lines in a new platform release, and now the log parsing doesn't work.
Apr 17 06:33:32 ip-172-31-35-113 web: {"#timestamp":"2020-04-17T06:33:32.691Z","#version":"1","message":"Tomcat started on port(s): 5000 (http) with context path ''","logger_name":"org.springframework.boot.web.embedded.tomcat.TomcatWebServer","thread_name":"main","level":"INFO","level_value":20000}
Before it was simply:
{"#timestamp":"2020-04-17T06:33:32.691Z","#version":"1","message":"Tomcat started on port(s): 5000 (http) with context path ''","logger_name":"org.springframework.boot.web.embedded.tomcat.TomcatWebServer","thread_name":"main","level":"INFO","level_value":20000}
The question would be whether we can avoid using Logstash to convert the log lines into the old format? If not, how do I drop the prefix? Which filter is the best choice for this?
My current Filebeat configuration looks like this:
filebeat.inputs:
- type: log
paths:
- /var/log/web-1.log
json.keys_under_root: true
json.ignore_decoding_error: true
json.overwrite_keys: true
fields_under_root: true
fields:
environment: ${ENV_NAME:not_set}
app: myapp
cloud.id: "${ELASTIC_CLOUD_ID:not_set}"
cloud.auth: "${ELASTIC_CLOUD_AUTH:not_set}"
I would try to leverage the dissect and decode_json_fields processors:
processors:
# first ignore the preamble and only keep the JSON data
- dissect:
tokenizer: "%{?ignore} %{+ignore} %{+ignore} %{+ignore} %{+ignore}: %{json}"
field: "message"
target_prefix: ""
# then parse the JSON data
- decode_json_fields:
fields: ["json"]
process_array: false
max_depth: 1
target: ""
overwrite_keys: false
add_error_key: true
There is a plugin in Logstash called JSON filter that includes all the raw log line in a field called "message" (for instance).
filter {
json {
source => "message"
}
}
If you do not want to include the beginning part of the line, use the dissect filter in Logstash. It would be something like this:
filter {
dissect {
mapping => {
"message" => "%{}: %{message_without_prefix}"
}
}
}
Maybe in Filebeat there are these two features available as well. But in my experience, I prefer working with Logstash when parsing/manipulating logging data.

Can filebeat convert log lines output to json without logstash in pipeline?

We have standard log lines in our Spring Boot web applications (non json).
We need to centralize our logging and ship them to an elastic search as json.
(I've heard the later versions can do some transformation)
Can Filebeat read the log lines and wrap them as a json ? i guess it could append some meta data aswell. no need to parse the log line.
expected output :
{timestamp : "", beat: "", message: "the log line..."}
i have no code to show unfortunately.
filebeat supports several outputs including Elastic Search.
Config file filebeat.yml can look like this:
# filebeat options: https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-reference-yml.html
filebeat.inputs:
- type: log
enabled: true
paths:
- /var/log/../file.err.log
processors:
- drop_fields:
# Prevent fail of Logstash (https://www.elastic.co/guide/en/beats/libbeat/current/breaking-changes-6.3.html#custom-template-non-versioned-indices)
fields: ["host"]
- dissect:
# tokenizer syntax: https://www.elastic.co/guide/en/logstash/current/plugins-filters-dissect.html.
tokenizer: "%{} %{} [%{}] {%{}} <%{level}> %{message}"
field: "message"
target_prefix: "spring boot"
fields:
log_type: spring_boot
output.elasticsearch:
hosts: ["https://localhost:9200"]
username: "filebeat_internal"
password: "YOUR_PASSWORD"
Well it seems to do it by default. this is my result when i tried it locally to read log lines. it wraps it exactly like i wanted.
{
"#timestamp":"2019-06-12T11:11:49.094Z",
"#metadata":{
"beat":"filebeat",
"type":"doc",
"version":"6.2.4"
},
"message":"the log line...",
"source":"/Users/myusername/tmp/hej.log",
"offset":721,
"prospector":{
"type":"log"
},
"beat":{
"name":"my-macbook.local",
"hostname":"my-macbook.local",
"version":"6.2.4"
}
}

logstash-2.2.2, windows, IIS log file format

when I'm parsing iis log file in UTF-8 format I'm getting below error and When I'm parsing log file using ANSI format there is nothing working Logstash just display message on console " Logstash startup completed". There is almost 1000 files on my server i can't change each file format from ANSI to UTF-8. Can you please help where I need to change in my config file. I'm also attaching debug file when I'm parsing files on UTF-8 format. I'm using elastic search on same box and its completely working fine. I'm also able to telnet port 9200 with 127.0.0.1.
Log sample:
2016-03-26T05:40:40.764Z WIN-AK44913P759 2016-03-24 00:16:31 W3SVC20 ODSANDBOXWEB01 172.x.x.x GET /healthmonitor.axd - 80 - 172.x.x.x HTTP/1.1 - - - www.xyz.net 200 0 0 4698 122 531
stdout output:
{
"message" => "2016-03-24 04:43:02 W3SVC20 ODSANDBOXWEB01 172.x.x.x GET /healthmonitor.axd - 80 - 172.x.x.x HTTP/1.1 - - - www.xyz.net 200 0 0 4698 122 703\r",
"#version" => "1",
"#timestamp" => "2016-03-26T05:42:15.045Z",
"path" => "C:\\IISLogs/u_ex160324.log",
"host" => "WIN-AK44913P759",
"type" => "IISLog",
"tags" => [
[0] "_grokparsefailure"
]
}
Below is my logstash conf file configuration
input {
file {
type => "IISLog"
path => "C:\IISLogs/u_ex*.log"
start_position => "beginning"
}
}
filter {
#ignore log comments
if [message] =~ "^#" {
drop {}
}
grok {
match => ["message", "%{TIMESTAMP_ISO8601:log_timestamp} %{WORD:iisSite} %{IPORHOST:site} %{WORD:method} %{URIPATH:page} %{NOTSPACE:querystring} %{NUMBER:port} %{NOTSPACE:username} %{IPORHOST:clienthost} %{NOTSPACE:useragent} %{NOTSPACE:referer} %{NUMBER:response} %{NUMBER:subresponse} %{NUMBER:scstatus} %{NUMBER:bytes:int} %{NUMBER:timetaken:int}"]
}
#Set the Event Timesteamp from the log
date {
match => [ "log_timestamp", "YYYY-MM-dd HH:mm:ss" ]
timezone => "Etc/UCT"
}
useragent {
source=> "useragent"
prefix=> "browser"
}
mutate {
remove_field => [ "log_timestamp"]
}
}
# output logs to console and to elasticsearch
output {
stdout {}
elasticsearch {
hosts => ["127.0.0.1:9200"]
}
stdout {
codec => rubydebug
}
}
The _grokparsefailure tag means that your grok pattern didn't match your input. It looks like you're intending the pattern to skip the first two fields, which is fine.
Then, looking at the next four fields, I see:
2016-03-24 00:16:31 W3SVC20 ODSANDBOXWEB01 172.1.1.1
but your pattern is looking for
%{TIMESTAMP_ISO8601:log_timestamp} %{WORD:iisSite} %{IPORHOST:site} %{WORD:method}
You haven't accounted for the IP address (since ODSANDBOXWEB01 is going into [site]).
Building grok patterns is a deliberate, iterative process. Start at the debugger. Enter a sample input line and then add grok patterns - one at a time! - until the entire line has been matched.
Also, when you obfuscate your data, please leave it as valid data. Changing the ip to 172.x.x.x means that it won't match the %{IP} pattern without us having to figure out what you did. I changed it to 172.1.1.1 in this example.

Issue in reading log file that contains date in it's name

I have 2 linux boxes setup in which 1 box contains one component which generates log and logstash installed in it to transfer the logs. And in other box I have redis elasticsearch and logstash. here logstash will act as logstash indexer to grok the data.
Now my problem is that in 1st box component generate new log file everyday, but only difference in log file name varies as per date.
like
counters-20151120-0.log
counters-20151121-0.log
counters-20151122-0.log
and so on, I have included below type of code in my logstash shipper conf file:
file {
path => "/opt/data/logs/counters-%{YEAR}%{MONTHNUM}%{MONTHDAY}*.log"
type => "rg_counters"
}
And in my logstash indexer, I have below type of code to catch those log files:
if [type] == "rg_counters" {
grok{
match => ["message", "%{YEAR}%{MONTHNUM}%{MONTHDAY}\s*%{HOUR}:%{MINUTE}:%{SECOND}\s*(?<counters_raw_data>[0-9\-A-Z]*)\s*(?<counters_operation_type>[\-A-Z]*)\s*%{GREEDYDATA:counters_extradata}"]
}
}
output {
elasticsearch { host => ["elastichost1","elastichost1" ] port => "9200" protocol => "http" }
stdout { codec => rubydebug }
}
Please note that this is working setup and other types log files are getting transfered and processed successfully, so there is no issue of setup.
The problem is how do I process this log file which contains date in it's file name.
Any help here?
Thanks in advance!!
Based on the comments...
Instead of trying to use regexp patterns in your path:
path => "/opt/data/logs/counters-%{YEAR}%{MONTHNUM}%{MONTHDAY}*.log"
just use glob patterns:
path => "/opt/data/logs/counters-*.log"
logstash will remember which files (inodes) that it's seen before.

Resources