How to parse a log string in Logstash using Grok? - elasticsearch

I am trying to parse the following string using Grok:
2018-06-08 13:26:02.002851: <action cmd="run" options="IGNORE_ERROR" path="/usr/lib/vmware/likewise/bin/lw-lsa get-metrics"> (/etc/vmware/vm-support/ad.mfx) took 0.000 sec
I ultimately want to separate the above out into columns like TIMESTAMP, ACTION, OPTIONS, PATH, etc. I have tried multiple combinations but have so far failed.

Grok pattern for the above log:
%{TIMESTAMP_ISO8601:timestamp}:%{SPACE}\<%{WORD:action}%{SPACE} %{DATA:kvpairs}\>%{SPACE}\(%{DATA:path_2}\)%{SPACE}took%{SPACE}%{NUMBER:time_taken}%{SPACE}%{WORD:time_unit}
In the above grok pattern, I have captured cmd, options and path in a single field named kvpairs, because these key-value pairs can then be extracted easily in Logstash using the kv filter. So your filter configuration will look like this:
filter {
  grok {
    match => { "message" => "%{TIMESTAMP_ISO8601:timestamp}:%{SPACE}\<%{WORD:action}%{SPACE} %{DATA:kvpairs}\>%{SPACE}\(%{DATA:path_2}\)%{SPACE}took%{SPACE}%{NUMBER:time_taken}%{SPACE}%{WORD:time_unit}" }
  }
  kv {
    source => "kvpairs"
  }
  date {
    match => ["timestamp", "yyyy-MM-dd HH:mm:ss.SSSSSS"]
  }
}
The kv filter splits on spaces by default and will extract the fields cmd, options and path (quoted values, such as the path, keep their embedded spaces).
The date filter populates the @timestamp field; note that the sample timestamp carries six fractional digits, hence SSSSSS rather than SSS in the date pattern (Logstash keeps millisecond precision).
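For reference, a rough sketch of the fields this configuration would extract from the sample line (all values are strings; note that %{WORD:action} captures the literal tag name, action):
timestamp  => "2018-06-08 13:26:02.002851"
action     => "action"
kvpairs    => cmd="run" options="IGNORE_ERROR" path="/usr/lib/vmware/likewise/bin/lw-lsa get-metrics"
cmd        => "run"
options    => "IGNORE_ERROR"
path       => "/usr/lib/vmware/likewise/bin/lw-lsa get-metrics"
path_2     => "/etc/vmware/vm-support/ad.mfx"
time_taken => "0.000"
time_unit  => "sec"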

Related

Logstash - Breaking apart a field filtered with Grok into further fields

We have log messages that look like the following:
<TE CT="20:33:57.258102" Sv="N" As="CTWare.PerimeterService" T="PerimeterService" M="GetWallboard" TID="1" TN="" MID="" ID="" MSG="Exit method 'GetWallboard' took 00:00:00.0781247" />
Right now, we use the following Grok filter:
match => { "message" => "<TE CT=\"%{DATESTAMP:log_timestamp}\" Sv=%{QS:severity} As=%{QS:assembly} T=%{QS:T} M=%{QS:M} TID=%{QS:TID} TN=%{QS:TN} MID=%{QS:MID} ID=%{QS:ID} MSG=%{QS:log_raw} />" }
Inside the "MSG" / "log_raw" field, however, I want to try and extract the timestamp after "...took" into its own field. I was hoping to accomplish it by using a custom regex to extract "MSG" / "log_raw" up to a specific point, then another regex to capture the "took" timestamp and make a new field. But, when I test with online Grok debuggers I'm not having any luck. Is it even possible to do something like this?
Your CT field does not match DATESTAMP. It will match TIME. You can then use a second grok to pull the time from the [log_raw] field.
grok { match => { "[log_raw]" => "took %{TIME:someField}" } }
to get
"someField" => "00:00:00.0781247",
I would be tempted to use an xml filter to parse that [message], since it will adjust if there are ever additional or missing fields.
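For illustration, a minimal sketch of that xml approach combined with the second grok (parsedTE is an arbitrary target name chosen here; force_array => false keeps single values as plain strings rather than one-element arrays):
filter {
  xml {
    source => "message"
    target => "parsedTE"
    force_array => false
  }
  grok {
    match => { "[parsedTE][MSG]" => "took %{TIME:someField}" }
  }
}
The attributes of the <TE> element should land under [parsedTE] (e.g. [parsedTE][CT], [parsedTE][MSG]), so additional or missing attributes would be handled without touching the pattern.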

add common prefix to logstash output for given filter

I'm working with some Logstash input that generates lots of fields with names like 'a0', 'a1'. I can mutate these, but there are lots of them, so I'd like to prepend a 'namespace' (of sorts) to all the fields from a filter.
I.e. if the parsed records are 'a0' and 'a1', I'd like them to appear in Elasticsearch as 'somespace.a0' and 'somespace.a1'.
Is this possible?
It turns out that if you are using the kv filter you can add a 'prefix' (see the kv filter documentation):
prefix:
Value type is string
Default value is ""
A string to prepend to all of the extracted keys.
For example, to prepend arg_ to all keys:
filter { kv { prefix => "arg_" } }
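Applied to the fields in the question (assuming a0, a1, ... are produced by the kv filter), a sketch:
filter { kv { prefix => "somespace." } }
This would emit somespace.a0, somespace.a1, and so on.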

Add extra value to field before sending to elasticsearch

I'm using Logstash, Filebeat and grok to send data from logs to my Elasticsearch instance. This is the grok configuration in the pipeline:
filter {
  grok {
    match => {
      "message" => "%{SYSLOGTIMESTAMP:messageDate} %{GREEDYDATA:messagge}"
    }
  }
}
This works fine. The issue is that messageDate is in the format Jan 15 11:18:25, without a year entry.
Now, I actually know the year these files were created in, and I was wondering if it is possible to add that value to the field during processing, that is, to somehow turn Jan 15 11:18:25 into 2016 Jan 15 11:18:25 before sending it to Elasticsearch (obviously without editing the files, which I could do with ease, but that would be a temporary fix rather than a definitive solution).
I have tried googling whether it was possible, but with no luck.
Valepu,
One flexible way to modify the data of a field is the ruby filter:
filter {
  ruby {
    code => "# your code here"
  }
}
For more information, such as how to get and set field values, see:
https://www.elastic.co/guide/en/logstash/current/plugins-filters-ruby.html
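For illustration, a minimal sketch of getting and setting a field from ruby code (assuming the Logstash 5.x+ event API and the messageDate field from the question):
filter {
  ruby {
    # Prepend the known year to the parsed date string
    code => "event.set('messageDate', '2016 ' + event.get('messageDate').to_s)"
  }
}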
If you have a separate field for date as a string, you can use logstash date plugin:
https://www.elastic.co/guide/en/logstash/current/plugins-filters-date.html
If you don't have it as a separate field (as in this case) use this site to construct your own grok pattern:
http://grokconstructor.appspot.com/do/match
I made this to preprocess the values:
%{YEAR:yearVal} %{MONTH:monthVal} %{NUMBER:dayVal} %{TIME:timeVal} %{GREEDYDATA:message}
Not the most elegant, I guess, but you get the values in different fields. Using this you can create your own date field and parse it with the date filter, so you get a comparable value, or you can use these fields by themselves. I'm sure there is a better solution, for example you could make your own grok pattern and use that, but I'm gonna leave some exploration for you too. :)
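A sketch of that suggestion, building a combined date field from the captures above (fullDate is a hypothetical field name; the hard-coded 2016 stands in for the known year, since the log itself carries none):
filter {
  mutate {
    add_field => { "fullDate" => "2016 %{monthVal} %{dayVal} %{timeVal}" }
  }
  date {
    match => ["fullDate", "YYYY MMM dd HH:mm:ss"]
  }
}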
By reading the grok documentation thoroughly, I found what Google couldn't find for me, and which I apparently missed the first time I read that page:
https://www.elastic.co/guide/en/logstash/current/plugins-filters-grok.html#plugins-filters-grok-add_field
Using the add_field and remove_field options, I managed to add the year to my date; then I used the date plugin to turn it into the event timestamp. My filter configuration now looks like this:
filter {
  grok {
    match => {
      "message" => "%{SYSLOGTIMESTAMP:tMessageDate} %{GREEDYDATA:messagge}"
    }
    add_field => { "messageDate" => "2016 %{tMessageDate}" }
    remove_field => ["tMessageDate"]
  }
  date {
    match => ["messageDate", "YYYY MMM dd HH:mm:ss"]
  }
}
And it worked fine.

Logstash Filter for a custom message

I am trying to parse a bunch of strings in Logstash and output is set as ElasticSearch.
Sample input string is: 2016 May 24 10:20:15 User1 CREATE "Create a new folder"
The grok filter is:
match => { "message" => "%{SYSLOGTIMESTAMP:timestamp} %{WORD:user} %{WORD:action_performed} %{WORD:action_description} "}
In Elasticsearch, I am not able to see separate columns for the different fields such as timestamp, user, action_performed, etc.
Instead the whole string is under a single column "message".
I would like to store the information in separate fields instead of just a single column.
I am not sure what to change in the Logstash filter to achieve this.
Thanks!
You need to change your grok pattern to this, i.e. use QUOTEDSTRING instead of WORD, and it will work!
match => { "message" => "%{SYSLOGTIMESTAMP:timestamp} %{WORD:user} %{WORD:action_performed} %{QUOTEDSTRING:action_description}"}

How to assign a variable in logstash config?

I'm trying to fetch the host name from the events that Logstash processes, and if an event matches the criteria, I want the host name to be sent to another file, while the event itself should still be sent to the elasticsearch output.
The idea I have is to assign the host name to a variable, and send the variable's value to a file if the "if" condition is satisfied.
Will this be possible with logstash?
Regards,
Gaurav
Yes, what you want is possible in Logstash. The Logstash site has documentation for the config format and all the available plugins, which can be found at http://logstash.net/docs/1.4.0/. You will probably want to use the grok filter to extract the host name, and the file output to write the data.
Here is an example config which does what you want:
input {
  # some input
}
filter {
  grok {
    match => ["message", "%{HOSTNAME:host} rest of message line" ]
    add_tag => ["has_hostname"]
  }
}
output {
  elasticsearch {}
  if "has_hostname" in [tags] {
    file {
      message_format => "%{host}"
      path => "path/to/file"
    }
  }
}
The grok pattern will need to be altered to match your data; the Logstash docs include a link to the default pattern set that you can use.
