add common prefix to logstash output for given filter - elasticsearch

I'm working with some Logstash I/O that generates lots of fields with names like 'a0', 'a1'. I can mutate these, but there are lots of them, so I'd like to prepend a 'namespace' (of sorts) to all the fields from a filter.
I.e., if the parsed records are 'a0' and 'a1', I'd like them to appear in Elasticsearch as 'somespace.a0' and 'somespace.a1'.
Is this possible?

It turns out that if you are using the kv filter you can add a 'prefix' (see the documentation excerpt below).
prefix:
Value type is string
Default value is ""
A string to prepend to all of the extracted keys.
For example, to prepend arg_ to all keys:
filter { kv { prefix => "arg_" } }
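Applied to the fields in the question, a minimal, untested sketch would be (the somespace. prefix comes from the question; everything else is just the kv filter's documented option):
filter {
  kv {
    # prepend the namespace to every extracted key,
    # so 'a0' becomes 'somespace.a0' and 'a1' becomes 'somespace.a1'
    prefix => "somespace."
  }
}
Note that, depending on the Elasticsearch version, a dotted name like somespace.a0 may be mapped as an object field (somespace with a sub-field a0) rather than as a flat field name; a prefix such as somespace_ avoids that if it matters.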

Related

Some of the KV filter values have a custom date that is identified as a string in Kibana

I'm using the kv filter in Logstash to process a config file in the following format:
key1=val1
key2=val2
key3=2020-12-22-2150
with the following lines in Logstash:
kv {
  field_split => "\r\n"
  value_split => "="
  source => "message"
}
Some of my fields in the conf file have the following date format: YYYY-MM-DD-HHMMSS. When Logstash sends the fields to ES, Kibana displays them as strings. How can I let Logstash know that those fields are date fields, so that they are indexed in ES as dates and not strings?
I don't want to edit the mapping of the index because that would require reindexing. My final goal with those fields is to calculate the diff between the fields (in seconds, minutes, hours, ...) and display it in Kibana.
The idea that I have:
Iterate over the k,v filter results; if a value is in the format YYYY-MM-DD-HHMMSS (checked with a regex),
in that case, change the value of the field to milliseconds since epoch.
I decided to use the k,v filter and Ruby code as a solution, but I'm facing an issue.
This could be done more easily outside of Logstash by adding a dynamic_template to your index and letting it manage the field types.
You can use the field name as a detector if it is clear enough (*_date), or define a regex:
"match_pattern": "regex",
"match": "^(0[1-9]|1[012])[- /.](0[1-9]|[12][0-9]|3[01])[- /.](19|20)\d\d$"
The code above has not been tested.
You can find the official doc here.
https://www.elastic.co/guide/en/elasticsearch/reference/current/dynamic-templates.html
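As a rough, untested sketch of what such a template could look like for the question's date fields — the template name, the *_date naming convention, and the yyyy-MM-dd-HHmmss format string are assumptions, not something from the original answer:
{
  "mappings": {
    "dynamic_templates": [
      {
        "custom_dates_by_name": {
          "match": "*_date",
          "mapping": {
            "type": "date",
            "format": "yyyy-MM-dd-HHmmss"
          }
        }
      }
    ]
  }
}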
My solution:
I used the kv filter to convert each line into a key-value set.
I saved the kv filter result into a dedicated field.
On this dedicated field, I ran a Ruby script that changed all the dates with the custom format to milliseconds since epoch.
Code:
filter {
  if "kv_file" in [tags] {
    kv {
      field_split => "\r\n"
      value_split => "="
      source => "message"
      target => "config_file"
    }
    ruby {
      id => "kv_ruby"
      code => "
        require 'date'
        # match values in the custom YYYY-MM-DD-HHMMSS format
        re = /([12]\d{3}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])-([01]\d|2[0-3])[0-5]\d[0-5]\d)/
        hash = event.get('config_file').to_hash
        hash.each { |key, value|
          if value =~ re
            # convert the matched date to milliseconds since epoch
            date_epochs_milliseconds = DateTime.strptime(value, '%F-%H%M%S').strftime('%Q')
            event.set(key, date_epochs_milliseconds.to_i)
          end
        }
      "
    }
  }
}
By the way, if you are facing the following error in your Ruby compilation: (ruby filter code):6: syntax error, unexpected null hash, it doesn't actually mean that you got a null value; it seems to be related to the escape character of the double quotes. Just try replacing the double quotes with single quotes.

How to parse a log string in Logstash using Grok?

I am trying to parse the following string using Grok:
2018-06-08 13:26:02.002851: <action cmd="run" options="IGNORE_ERROR" path="/usr/lib/vmware/likewise/bin/lw-lsa get-metrics"> (/etc/vmware/vm-support/ad.mfx) took 0.000 sec
I ultimately want to separate the above out into columns like TIMESTAMP, ACTION, OPTIONS, PATH, etc. I have tried multiple combinations but have so far failed.
Grok pattern for the above log:
%{TIMESTAMP_ISO8601:timestamp}:%{SPACE}\<%{WORD:action}%{SPACE} %{DATA:kvpairs}\>%{SPACE}\(%{DATA:path_2}\)%{SPACE}took%{SPACE}%{NUMBER:time_taken}%{SPACE}%{WORD:time_unit}
In the above grok pattern, I have captured cmd, options and path in a field named kvpairs. This is because these key-value pairs can be easily extracted in Logstash using the kv filter. So your filter configuration will look like:
filter {
  grok {
    match => { "message" => "%{TIMESTAMP_ISO8601:timestamp}:%{SPACE}\<%{WORD:action}%{SPACE} %{DATA:kvpairs}\>%{SPACE}\(%{DATA:path_2}\)%{SPACE}took%{SPACE}%{NUMBER:time_taken}%{SPACE}%{WORD:time_unit}" }
  }
  kv {
    source => "kvpairs"
  }
  date {
    match => ["timestamp", "yyyy-MM-dd HH:mm:ss.SSS"]
  }
}
The kv filter by default takes a space as the delimiter and will extract the fields cmd, options and path.
The date filter will populate the @timestamp field.

Logstash extracting and customizing field with grok and ruby

I have this data in my Elasticsearch logs, saved in a referer field:
/clientReq?sessionid=3332&UID=ed91b-517234-4f4c211-a20e-d2e1aefc126a&signUp=false
I want to use Ruby to save this data, ed91b-517234-4f4c211-a20e-d2e1aefc126a, in a separate field.
I have tried this in Ruby in my pattern configuration file:
ruby {
code => "
saveid=event[referer].match((\w+[-]?)+)+)
event.set('saved',saveid) "
}
This doesn't even save the entire field. So I went ahead and tried the grok filter instead:
grok {
match => {"message" => "%{COMBINEDAPACHELOG}"}
add_field => { "savedData" => "%{referer}" }
}
Neither of these works. I have tested the configuration and it loads successfully, but when I visit the Kibana front end I don't see the new field created either.
The Ruby event syntax event[field] = foo is not used anymore; it has been replaced by the get/set event API, for example event.get('referer').
Besides that, your regex is not correct for the desired results. One of the solutions is to use a positive lookbehind to check for UID;
this should work:
ruby {
  code => "
    saveid = event.get('referer').match(/(?<=UID=)((\w+[-]?)+)+/)[1]
    event.set('saved', saveid)
  "
}
For grok, you can create a new filter for your referer field and use grok's predefined UUID pattern to match your string. Can you try this:
grok {
match => {"referer" => "UID=%{UUID:saveData}"}
}
Hope this helps.

Unique count, array to string

Here is my input:
{"Names":"Name1, Name2","Country":"TheCountry"}
What I have been trying to do is count how many times a certain name appears, not only in one input but also across all previous events. For that I have looked into Metrics, but I cannot figure out how I might be able to do that. The first problem I have met is that Names is a string and not an array.
I do not see how I might convert Names into an array and give it to Metrics. Is there any other solution?
First of all, please check your Logstash configuration and add the following split filter to your pipeline configuration file. Your comma-separated names will be split while ingesting the data:
filter {
  split {
    field => "Names"
    terminator => ","
    target => "NamesArray"
  }
}
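One aside that is not from the original answer: with the sample input "Name1, Name2" and a terminator of ",", the second value keeps a leading space after the split. A small sketch of a mutate step that strips it (the NamesArray field name comes from the filter above):
filter {
  mutate {
    # remove leading/trailing whitespace left over from splitting on ","
    strip => ["NamesArray"]
  }
}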
And you can change your mapping to add a new field to your type mapping, like below:
{
  "properties": {
    ...
    "NamesArray": {
      "type": "keyword"
    }
    ...
  }
}
You should use the keyword type for NamesArray so that values containing a blank character are kept whole and your metrics come out correct.
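With the keyword mapping in place, a terms aggregation is one way to get the per-name counts the question asks about. A minimal sketch, where the index name my-index is an assumption:
GET /my-index/_search
{
  "size": 0,
  "aggs": {
    "name_counts": {
      "terms": { "field": "NamesArray" }
    }
  }
}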

Logstash Filter for a custom message

I am trying to parse a bunch of strings in Logstash, with the output set to Elasticsearch.
A sample input string is: 2016 May 24 10:20:15 User1 CREATE "Create a new folder"
The grok filter is:
match => { "message" => "%{SYSLOGTIMESTAMP:timestamp} %{WORD:user} %{WORD:action_performed} %{WORD:action_description} "}
In Elasticsearch, I am not able to see separate columns for the different fields such as timestamp, user, action_performed, etc.
Instead, the whole string is under a single column, "message".
I would like to store the information in separate fields instead of just a single column.
I am not sure what to change in the Logstash filter to achieve the desired result.
Thanks!
You need to change your grok pattern to the following, i.e. use QUOTEDSTRING instead of WORD, and it will work!
match => { "message" => "%{SYSLOGTIMESTAMP:timestamp} %{WORD:user} %{WORD:action_performed} %{QUOTEDSTRING:action_description}"}
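For reference, a minimal sketch of the full filter block the corrected match line would sit in; nothing beyond the answer's own grok pattern is assumed:
filter {
  grok {
    # QUOTEDSTRING captures the double-quoted description as a single field
    match => { "message" => "%{SYSLOGTIMESTAMP:timestamp} %{WORD:user} %{WORD:action_performed} %{QUOTEDSTRING:action_description}" }
  }
}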

Resources