How to handle non-matching Logstash grok filters

I am wondering what the best approach is to take with my Logstash grok filters. I have some filters that are for specific log entries, and won't apply to all entries. The ones that don't apply always generate _grokparsefailure tags. For example, I have one grok filter that's for every log entry and it works fine. Then I have another filter that's for error messages with tracebacks. The traceback filter throws a _grokparsefailure for every single log entry that doesn't have a traceback.
I'd prefer to have it just pass the rule if there isn't a match instead of adding the parsefailure tag. I use the parsefailure tag to find things that aren't parsing properly, not things that simply didn't match a particular filter. Maybe it's just the nomenclature "parse failure" that gets me. To me that means there's something wrong with the filter (e.g. badly formatted), not that it didn't match.
So the question is, how should I handle this?
Make the filter pattern optional using ?
(ab)use the tag_on_failure option by setting it to nothing []
make the filter conditional using something like "if traceback in message"
something else I'm not considering?
Thanks in advance.
EDIT
I took the path of adding a conditional around the filter:
if [message] =~ /took\s\d+/ {
  grok {
    patterns_dir => "/etc/logstash/patterns"
    match => ["message", "took\s+(?<servicetime>[\d\.]+)"]
    add_tag => [ "stats", "servicetime" ]
  }
}
Still interested in feedback though. What is considered "best practice" here?

When possible, I'd go with a conditional wrapper just like the one you're using. Feel free to post that as an answer!
If your application produces only a few different line formats, you can use multiple match patterns with the grok filter. By default, the filter will process up to the first successful match:
grok {
  patterns_dir => "./patterns"
  match => {
    "message" => [
      "%{BASE_PATTERN} %{EXTRA_PATTERN}",
      "%{BASE_PATTERN}",
      "%{SOME_OTHER_PATTERN}"
    ]
  }
}
If your logic is less straightforward (maybe you need to check the same condition more than once), the grep filter can be useful to add a tag. Something like this:
grep {
  drop => false # grep normally drops non-matching events
  match => ["message", "took\s\d+"]
  add_tag => ["has_traceback"]
}
...
if "has_traceback" in [tags] {
...
}
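As far as I know, the grep filter was deprecated and later removed from Logstash, so on newer versions the same tagging has to be done differently. A minimal sketch using a conditional and mutate, reusing the regex and the has_traceback tag from the example above and the servicetime grok from the question:

```
filter {
  # Tag events whose message matches, without dropping the rest
  if [message] =~ /took\s\d+/ {
    mutate {
      add_tag => [ "has_traceback" ]
    }
  }

  # Later filters can branch on the tag as often as needed
  if "has_traceback" in [tags] {
    grok {
      match => ["message", "took\s+(?<servicetime>[\d\.]+)"]
    }
  }
}
```

The tag is only worth adding if the same condition is checked more than once; for a single check, wrapping the grok in the regex conditional directly is simpler.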

You can also add tag_on_failure => [] to your grok stanza like so:
grok {
  match => ["context", "\"tags\":\[%{DATA:apptags}\]"]
  tag_on_failure => [ ]
}
grok will still fail, but will do so without adding to the tags array.
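If you still want to see which events a particular grok skipped, tag_on_failure also accepts a tag name of your own, which keeps _grokparsefailure reserved for real parsing problems. A sketch, where _apptags_nomatch is a made-up tag name used only for illustration:

```
grok {
  match => ["context", "\"tags\":\[%{DATA:apptags}\]"]
  # Hypothetical tag name; replaces the default _grokparsefailure for this filter only
  tag_on_failure => [ "_apptags_nomatch" ]
}
```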

If events that fail to match are of no use to you, the simplest option is to drop them. Note that this discards the whole event, not just the tag:
filter {
  grok {
    match => [ "message", "something"]
  }
  if "_grokparsefailure" in [tags] {
    drop { }
  }
}

You can also add
remove_tag => [ "_grokparsefailure" ]
to a grok filter. It is applied only when that filter matches, so it clears a failure tag left by an earlier grok.
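A sketch of how that could look with two grok filters in sequence. BASE_PATTERN is the placeholder name from the earlier answer and is assumed to be defined in ./patterns; the second pattern is the servicetime one from the question:

```
filter {
  grok {
    patterns_dir => "./patterns"
    match => ["message", "%{BASE_PATTERN}"]
  }
  grok {
    match => ["message", "took\s+(?<servicetime>[\d\.]+)"]
    # Don't add a failure tag when this optional pattern misses
    tag_on_failure => [ ]
    # Applied only on a successful match: clears a tag left by the grok above
    remove_tag => [ "_grokparsefailure" ]
  }
}
```

Whether a later match should really clear an earlier failure depends on the pipeline; if the first grok failing is something you want to know about, leave remove_tag out.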

Related

Logstash create a new field based on existing field

I have data coming from database queries using the jdbc input plugin, and the query results contain a url field from which I want to extract a few properties.
Example urls:
/incident.do?sys_id=0dc18b246faa17007a64cbe64f3ee4e1&sysparm_view
/navpage_form_default.do
/u_pm_prov_project_list.do?sysparm_userpref_module=fa547ce26f661
JOB: email read events process
JOB: System - reduce resources
I added regex patterns in grok patterns file:
webpage_category .*
job_type .*
I have two types of url so I used if in filter block to distinguish between them
Config I tried so far:
filter {
  if [url] =~ /JOB: .*/ {
    grok {
      patterns_dir => ["/etc/logstash/patterns"]
      match => {
        "url" => "JOB: %{job_type:job_type}"
      }
    }
  } else if [url] =~ /\/.*\.do\?.*/ {
    grok {
      patterns_dir => ["/etc/logstash/patterns"]
      match => {
        "url" => "/{webpage_category:webpage_category}\.do\?.*"
      }
    }
  }
}
Creation of a new field for urls starting with JOB: works properly but webpage_category is not working at all. Is it because regex can not be used inside of match?
The problem is that you are trying to use a grok pattern inside a mutate filter, which won't work: mutate and grok are two separate filter plugins.
You need to use add_field inside the grok filter if you want to use a grok pattern to create a field. Remember that add_field is supported by all filter plugins.
Have a look at the following example:
filter {
  grok {
    add_field => { "foo_%{somefield}" => "Hello world, from %{host}" }
  }
}
In your case, it would be:
filter {
  grok {
    add_field => {
      "webpage_category" => "%{webpage_category:url}"
      "job_type" => "%{job_type:url}"
    }
  }
}
Also make sure patterns_dir is set:
patterns_dir => ["./patterns"]
Please check out the grok filter documentation as well.
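Looking at the config in the question, one likely cause is simply the missing % in front of {webpage_category:webpage_category}: without it, grok does not expand the braces as a pattern reference, so the match fails. A sketch of the same filter with that one change, assuming the patterns file shown in the question; the conditions are shortened but meant to select the same events:

```
filter {
  if [url] =~ /JOB: / {
    grok {
      patterns_dir => ["/etc/logstash/patterns"]
      match => { "url" => "JOB: %{job_type:job_type}" }
    }
  } else if [url] =~ /\.do\?/ {
    grok {
      patterns_dir => ["/etc/logstash/patterns"]
      # Note the % before {webpage_category:...}
      match => { "url" => "/%{webpage_category:webpage_category}\.do\?" }
    }
  }
}
```

As in the original, URLs without a query string, such as /navpage_form_default.do, enter neither branch and get no webpage_category field.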

Logstash: How to use date/time in a filename as an imported field

I have a bunch of log files that are named as 'XXXXXX_XX_yymmdd_hh:mm:ss.txt' - I need to include the date and time (separate fields) from the filename in fields that are added to Logstash.
Can anyone help?
Thanks
Use a grok filter to extract the date and time:
filter {
  grok {
    match => [
      "path",
      "^%{GREEDYDATA}/[^/]+_%{INT:date}_%{TIME:time}\.txt$"
    ]
  }
}
Depending on what goes instead of XXXXXX_XX you might prefer a stricter expression. Also, GREEDYDATA isn't very efficient. This might yield better performance:
filter {
  grok {
    match => [
      "path", "^(?:/[^/]+)+/[^/]+_%{INT:date}_%{TIME:time}\.txt$"
    ]
  }
}

Logstash change time format

My log statement looks like this.
2014-04-23 06:40:29 INFO [1605853264] [ModuleName] - [ModuleName] -
Blah blah
I am able to parse it fine and it gets logged to ES correctly with following ES field
"LogTimestamp": "2014-04-23T13:40:29.000Z"
But my requirement is to log this statement as follows; note that the 'Z' is replaced with +0000. I tried replace and gsub, but neither changes the output.
"LogTimestamp": "2014-04-23T13:40:29.000+0000"
Can somebody help?
Here is my pattern
TEMP_TIMESTAMP %{YEAR}-%{MONTHNUM}-%{MONTHDAY}\s%{HOUR}:%{MINUTE}:%{SECOND}
TEMP_LOG %{TEMP_TIMESTAMP:logdate}\s*?%{LOGLEVEL:TempLogLevel}\s*?\[\s?*%{BASE10NUM:TempThreadId}\]%{GREEDYDATA}
This is the filter config:
grok {
  patterns_dir => ["patterns"]
  match => ["message", "%{TEMP_LOG}"]
}
date {
  match => [ "logdate", "yyyy-MM-dd HH:mm:ss" ]
  target => "LogTimestamp"
  timezone => "PST8PDT"
}
mutate {
  gsub => ["logdate", ".000Z", ".000+0000"]
}
I haven't quite understood the meaning of fields in Logstash and how they map to Elasticsearch; that confusion is leading me astray here.
You can use the ruby filter plugin to do this.
You want to change
"LogTimestamp": "2014-04-23T13:40:29.000Z"
to
"LogTimestamp": "2014-04-23T13:40:29.000+0000"
Try this filter:
filter {
  ruby {
    code => "
      event['LogTimestamp'] = event['LogTimestamp'].localtime('+00:00')
    "
  }
}
Hope this helps.
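On newer Logstash versions (5.0 and later, as far as I know) fields are read and written through event.get and event.set rather than event['...']. A sketch under that assumption, which leaves LogTimestamp alone and writes the reformatted value to a separate string field; the field name LogTimestampFormatted is made up for illustration:

```
filter {
  ruby {
    code => "
      ts = event.get('LogTimestamp')
      # LogTimestamp is expected to be a Logstash timestamp set by the date filter
      unless ts.nil?
        # %L is milliseconds, %z is the numeric UTC offset
        event.set('LogTimestampFormatted', ts.time.getutc.strftime('%Y-%m-%dT%H:%M:%S.%L%z'))
      end
    "
  }
}
```

Note that mutate's gsub only operates on string fields, and the gsub in the question targets logdate rather than LogTimestamp, which may be why it appeared to have no effect.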

Tagging the Logs by Logstash - Grok - ElasticSearch

Summary:
I am using Logstash, grok and Elasticsearch. My main aim is to first accept the logs with Logstash, parse them with grok and associate tags with the messages depending on the type of log, and then finally feed them to the Elasticsearch server to query with Kibana.
I have already written this config but am not able to get the tags in Elasticsearch.
This is my Logstash config file.
input {
  stdin {
    type => "stdin-type"
  }
}
filter {
  grok {
    tags => "mytags"
    pattern => "I am a %{USERNAME}"
    add_tag => "mytag"
    named_captures_only => true
  }
}
output {
  stdout { debug => true debug_format => "json"}
  elasticsearch {}
}
Where am I going wrong?
1) I would first start with editing your values to match the data type they represent. For example
add_tag => "mytag"
actually should have an array as its value, not a simple string. Change that to
add_tag => ["mytag"]
as a good start. Double check all your values and verify they are of the correct type for logstash.
2) You are limiting your grok filters to messages that are already tagged with "mytags" based on the config line
tags => "mytags"
I don't see anywhere where you have added that tag ahead of time. Therefore, none of your messages will even go through your grok filter.
3) Please read the logstash docs carefully. I am rather new to the Logstash/Grok/ES/Kibana etc. world as well, but I have had very similar problems to what you have had, and all of them were solved by paying attention to what the documentation says.
You can run Logstash by hand (you may already be doing this) with /opt/logstash/bin/logstash -f $CONFIG_FILE, and you can check that your config file is valid with /opt/logstash/bin/logstash -f $CONFIG_FILE --configtest. I bet you're already doing that, though.
You may need to put your add_tag stanza into an array
grok {
  ...
  add_tag => [ "mytag" ]
}
It could also be that what you're piping into STDIN isn't being matched by the grok pattern. If grok doesn't match, it should result in _grokparsefailure being added to your tags. If you see those, it means your grok pattern isn't firing.
A better way to do this may be...
input {
  stdin {
    type => 'stdin'
  }
}
filter {
  if [type] == 'stdin' {
    mutate {
      add_tag => [ "mytag" ]
    }
  }
}
output {
  stdout {
    codec => 'rubydebug'
  }
}
This will add a "mytag" tag to everything coming from standard in, whether it's grokked or not.

logstash if field exists then grok

I'm trying to create a filter for logstash that will have "general" grok filter for all logs and if some field exists, then I want it to perform a different grok.
The first grok I'm using is
grok {
  match => [
    "message", "....%{NOTSPACE:name} %{GREEDYDATA:logcontent}"
  ]
}
This is working great. But I want it to be able to filter even more if the "name" field is, e.g., "foo"
if [name] == "foo" {
grok {
match => [
"message", ".....%{NOTSPACE:name} %{NOTSPACE:object1} %{NOTSPACE:object2}"
]
}
I tried this option but it didn't work.
Any thoughts?
The easiest way is to use a pattern match on the message before you grok anything.
For example:
if [message] =~ /....foo/ {
  # foo specific grok here
} else {
  # general grok
}
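Filled in with the two grok filters from the question (patterns copied as posted, leading dots included), that could look roughly like this:

```
filter {
  if [message] =~ /....foo/ {
    # foo-specific grok
    grok {
      match => [
        "message", ".....%{NOTSPACE:name} %{NOTSPACE:object1} %{NOTSPACE:object2}"
      ]
    }
  } else {
    # general grok for everything else
    grok {
      match => [
        "message", "....%{NOTSPACE:name} %{GREEDYDATA:logcontent}"
      ]
    }
  }
}
```

Because only one grok runs per event, the name field is captured once rather than by both filters.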

Resources