Some of KV filter values has custom date that identified as string in Kibana - elasticsearch

I'm using kv filter in Logstash to process config file in the following format :
key1=val1
key2=val2
key3=2020-12-22-2150
with the following lines in Logstash :
kv {
field_split => "\r\n"
value_split => "="
source => "message"
}
Some of my fields in the conf file have a the following date format : YYYY-MM-DD-HHMMSS. When Logstash send the fields to ES, Kibana display them as strings. How can I let Logstash know that those fields are date fields and by that indexing them in ES as dates and not strings ?
I don't want to edit the mapping of the index because it will require reindexing. My final goal with those fields is to calculate the diff between the fields (in seconds, minutes,hours..) and display it in Kibana.
The idea that I have :
Iterate over k,v filter results, if the value is of format YYYY-MM-DD-HHMMSS (check with regex)
In this case, chance the value of the field to milliseconds since epoch
I decided to use k,v filter and Ruby code as a solution but I'm facing an issue.

It could be done more easily outside of logstash by adding a dynamic_template on your index and let him manage field types.
You can use the field name as a detector if it is clear enough (*_date) or define a regex
"match_pattern": "regex",
"match": "^(0[1-9]|1[012])[- /.](0[1-9]|[12][0-9]|3[01])[- /.](19|20)\d\d$"
The code above hasnot been tested.
You can find the official doc here.
https://www.elastic.co/guide/en/elasticsearch/reference/current/dynamic-templates.html

My solution :
I used the kv filter to convert each line into key value set.
I saved the kv filter resut into a dedicated field.
On this dedicated field, I run a Ruby script that changed all the dates with the custom format to miliseconds since epoch.
code :
filter {
if "kv_file" in [tags] {
kv {
field_split => "\r\n"
value_split => "="
source => "message"
target => "config_file"
}
ruby {
id => "kv_ruby"
code => "
require 'date'
re = /([12]\d{3}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])-[0-23]{2}[0-5]{1}[0-9]{1}[0-5]{1}[0-9]{1})/
hash = event.get('config_file').to_hash
hash.each { |key,value|
if value =~ re
date_epochs_milliseconds = DateTime.strptime(value,'%F-%H%M%S').strftime('%Q')
event.set(key, date_epochs_milliseconds.to_i)
end
}
"
}
}
}
By the way, if you are facing the following error in your Ruby compilation : (ruby filter code):6: syntax error, unexpected null hash it doesn't actually mean that you got a null value, it seems that it is related to the escape character of the double quotes. Just try to replace double quotes with one quote.

Related

Logstash extracting and customizing field with grok and ruby

i have this data in elastic search logs saved in a referer field
/clientReq?sessionid=3332&UID=ed91b-517234-4f4c211-a20e-d2e1aefc126a&signUp=false
i want to use ruby to save this data ed91b-517234-4f4c211-a20e-d2e1aefc126a in a separate field.
i have tried this in ruby in my pattern configuration file,
ruby {
code => "
saveid=event[referer].match((\w+[-]?)+)+)
event.set('saved',saveid) "
}
this doesn't even save the entire filed. So i went ahead to try grok filter instead and tried this,
grok {
match => {"message" => "%{COMBINEDAPACHELOG}"}
add_field => { "savedData" => "%{referer}" }
}
neither of these works. I have tested configuration and if configuring successfully. when i visit kibana front end i don't see new field created either.
Ruby hash syntax event[field] = foo is not used anymore, and has been replaced by Get API for example, event.get(referrer).
Beside that, your regex is not correct to get desired results. One of the solutions is to use Positive Lookbehind to check for UID,
this should work,
ruby {
code => "
saveid = event.get('referer').match(/(?<=UID=)((\w+[-]?)+)+/)[1]
event.set('saved',saveid)
"
}
for grok, you can create a new filter for your referer field, and use the gork's predefined UUID pattern to match your string...can you try this,
grok {
match => {"referer" => "UID=%{UUID:saveData}"}
}
hope this helps.

Logstash Filter for a custom message

I am trying to parse a bunch of strings in Logstash and output is set as ElasticSearch.
Sample input string is: 2016 May 24 10:20:15 User1 CREATE "Create a new folder"
The grok filter is:
match => { "message" => "%{SYSLOGTIMESTAMP:timestamp} %{WORD:user} %{WORD:action_performed} %{WORD:action_description} "}
In Elasticsearch, I am not able to see separate columns for different field such as timstamp, user, action_performed etc.
Instead the whole string is under a single column "message".
I would like to store the information in separate fields instead of just a single column.
Not sure what to change in logstash filter to achieve as desired.
Thanks!
You need to change your grok pattern with this, i.e. use QUOTEDSTRING instead of WORD and it will work!
match => { "message" => "%{SYSLOGTIMESTAMP:timestamp} %{WORD:user} %{WORD:action_performed} %{QUOTEDSTRING:action_description}"}

Logstash Dynamic Index From Document Field Fails

I still face problems to figure out, how to tell Logstash to send a dynamic index, based on a document field. Furthermore, this Field must be transformed in order to get the "real" index at the very end.
Given, that there is a field "time" (which is a UNIX Timestamp). This Field gets already transformed with a "date" Filter to a DateTime Object for Elastic.
Additionally, it should server as index (YYYYMM). The index should NOT be derived from #Timestamp, which is not touched.
Example:
{...,"time":1453412341,...}
Shall go to the Index: 201601
I use the following Config:
filter {
date {
match => [ "time", "UNIX" ]
target => "time"
timezone => "Europe/Berlin"
}
}
output {
elasticsearch {
index => "%{time}%{+YYYYMM}"
document_type => "..."
document_id => "%{ID}"
hosts => "..."
}
}
Sadly, its not working. Any idea, how to achieve that?
Thanks a lot!
The "%{+YYYYMM}" says to use the date values from #timestamp. If you want an index named after the YYYYMM in %{time}, you need to make a string out of that date field and then reference that string in the output stanza. There might be a mutate{} that would do it, or drop into ruby{}.
In most installations, you want to set #timestamp to the event's value. The default of logstash's own time is not very useful (imagine if your events were delayed by an hour during processing). If you did that, then %{+YYYYMM}" would work just fine.
This is caused because the index name is created based on UTC time by default.

Logstash grok filter integer

I need to index numerical data in my ElasticSearch DB and i'm using grok filter to parse the log line (which is all comma separated integers).
trying to use this format %{NUMBER:userID_2:int} did not work and no data was indexed and no exception appeared.
When i changed the type to "float" -i.e. %{NUMBER:userID_2:float} it worked just fine.
Any idea why i'm not able to index integers?
(Using elastic 1.4.4 and logstash 1.4.1)
Thanks!
In "filter" section you set up match expression:
match => "%{NUMBER:user_id}"
and then you convert it:
mutate {
convert => {
"user_id" => "integer"
....
}
}

Convert a string field to date

So, I have a two fields in my log: timeLogged, timeQueued all this fields have date format: 2014-06-14 19:41:21+0000
My question is, how to convert string date value to logstash date? like in #timestamp
For the sole purpose of converting to #timestamp there is a dedicated date filter
date {
match => ["timeLogged","YYYY-MM-dd HH:mm:ss+SSSS"]
}
Now in your case there are basically two types of fields that might be used so you will have to dig a little, either use a grok filter to copy the values in a generic "log_date" field, or trying to see if the date filter can take several arguments like one of thoses possibilities:
date {
match => ["timeLogged","YYYY-MM-dd HH:mm:ss+SSSS",
"timeQueued","YYYY-MM-dd HH:mm:ss+SSSS" ]
}
OR
date {
match => ["timeLogged","YYYY-MM-dd HH:mm:ss+SSSS"]
match => ["timeQueued","YYYY-MM-dd HH:mm:ss+SSSS"]
}
It is up to you to experiment, I never tried myself ;)
this should suffice
date{
match => [ "timeLogged","ISO8601","YYYY-MM-dd HH:mm:ss" ]
target => "timeLogged"
locale => "en"
}
You can try this filter
filter {
ruby {
code => "
event['timeLogged'] = Time.parse(event['timeLogged'])
event['timeQueued'] = Time.parse(event['timeQueued'])
"
}
}
Use the powerful ruby library to do what you need!

Resources