Logstash grok filter integer - elasticsearch

I need to index numerical data in my Elasticsearch DB, and I'm using a grok filter to parse the log line (which is all comma-separated integers).
Trying to use the format %{NUMBER:userID_2:int} did not work: no data was indexed and no exception appeared.
When I changed the type to "float", i.e. %{NUMBER:userID_2:float}, it worked just fine.
Any idea why I'm not able to index integers?
(Using Elasticsearch 1.4.4 and Logstash 1.4.1)
Thanks!

In "filter" section you set up match expression:
match => "%{NUMBER:user_id}"
and then you convert it:
mutate {
convert => {
"user_id" => "integer"
....
}
}
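Put together, a minimal sketch for a line of comma-separated integers might look like the following. The pattern and field names are illustrative only, and on the 1.4.x releases mentioned in the question the array form convert => [ "userID_2", "integer" ] may be needed instead of the hash form.
filter {
  grok {
    # capture two of the comma-separated integers; extend the pattern to match the real line
    match => { "message" => "%{NUMBER:userID_1},%{NUMBER:userID_2}" }
  }
  mutate {
    # grok captures are strings, so convert them explicitly before indexing
    convert => {
      "userID_1" => "integer"
      "userID_2" => "integer"
    }
  }
}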

Related

Some KV filter values have a custom date that is identified as a string in Kibana

I'm using the kv filter in Logstash to process a config file in the following format:
key1=val1
key2=val2
key3=2020-12-22-2150
with the following lines in Logstash:
kv {
  field_split => "\r\n"
  value_split => "="
  source => "message"
}
Some of my fields in the conf file have the following date format: YYYY-MM-DD-HHMMSS. When Logstash sends the fields to ES, Kibana displays them as strings. How can I let Logstash know that those fields are date fields, so that they are indexed in ES as dates and not strings?
I don't want to edit the mapping of the index because it will require reindexing. My final goal with those fields is to calculate the diff between the fields (in seconds, minutes, hours..) and display it in Kibana.
The idea that I have:
Iterate over the kv filter results; if a value is of the format YYYY-MM-DD-HHMMSS (check with a regex), change the value of the field to milliseconds since epoch.
I decided to use the kv filter and Ruby code as a solution, but I'm facing an issue.
It could be done more easily outside of Logstash by adding a dynamic_template to your index and letting it manage the field types.
You can use the field name as a detector if it is clear enough (*_date) or define a regex:
  "match_pattern": "regex",
  "match": "^(0[1-9]|1[012])[- /.](0[1-9]|[12][0-9]|3[01])[- /.](19|20)\d\d$"
The code above has not been tested.
You can find the official doc here:
https://www.elastic.co/guide/en/elasticsearch/reference/current/dynamic-templates.html
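For illustration, here is a minimal, untested sketch of such a template, assuming the date-like fields share a _date suffix and use the custom yyyy-MM-dd-HHmmss format from the question (the index name is hypothetical):
PUT my-index
{
  "mappings": {
    "dynamic_templates": [
      {
        "custom_dates": {
          "match_mapping_type": "string",
          "match": "*_date",
          "mapping": {
            "type": "date",
            "format": "yyyy-MM-dd-HHmmss"
          }
        }
      }
    ]
  }
}
With this in place, any new string field whose name ends in _date should be mapped as a date, provided its values parse with that format.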
My solution:
I used the kv filter to convert each line into a key-value set.
I saved the kv filter result into a dedicated field.
On this dedicated field, I ran a Ruby script that changed all the dates with the custom format to milliseconds since epoch.
Code:
filter {
  if "kv_file" in [tags] {
    kv {
      field_split => "\r\n"
      value_split => "="
      source => "message"
      target => "config_file"
    }
    ruby {
      id => "kv_ruby"
      code => "
        require 'date'
        # match values in the custom YYYY-MM-DD-HHMMSS format (hours 00-23, minutes/seconds 00-59)
        re = /([12]\d{3}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])-([01]\d|2[0-3])[0-5]\d[0-5]\d)/
        hash = event.get('config_file').to_hash
        hash.each { |key, value|
          if value =~ re
            # convert the custom date to milliseconds since epoch
            date_epochs_milliseconds = DateTime.strptime(value, '%F-%H%M%S').strftime('%Q')
            event.set(key, date_epochs_milliseconds.to_i)
          end
        }
      "
    }
  }
}
By the way, if you are facing the following error in your Ruby compilation: (ruby filter code):6: syntax error, unexpected null hash, it doesn't actually mean that you got a null value; it seems to be related to the escaping of the double quotes. Just try to replace the double quotes with single quotes.
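For example, a quick illustration of that quoting swap (the field name here is made up):
ruby {
  # single quotes around the Ruby code, so the double quotes inside need no escaping
  code => 'event.set("some_field", event.get("message").to_s.upcase)'
}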

Add extra value to field before sending to elasticsearch

I'm using Logstash, Filebeat and grok to send data from logs to my Elasticsearch instance. This is the grok configuration in the pipeline:
filter {
  grok {
    match => {
      "message" => "%{SYSLOGTIMESTAMP:messageDate} %{GREEDYDATA:messagge}"
    }
  }
}
This works fine; the issue is that messageDate is in the format Jan 15 11:18:25 and doesn't have a year entry.
Now, I actually know the year these files were created in, and I was wondering if it is possible to add that value to the field during processing, that is, somehow turn Jan 15 11:18:25 into 2016 Jan 15 11:18:25 before sending to Elasticsearch (obviously without editing the files, which I could do easily enough, but that would be a temporary fix rather than a definitive solution).
I have tried googling whether it was possible, but no luck...
Valepu,
One way to modify the data in a field is using the ruby filter:
filter {
  ruby {
    code => "#your code here#"
  }
}
For more information, like how to get and set field values, here is the link:
https://www.elastic.co/guide/en/logstash/current/plugins-filters-ruby.html
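For instance, a rough sketch of that approach for this question, assuming a Logstash version with the event.get/event.set Ruby API (older releases use the event['field'] form instead) and using the year the asker already knows:
filter {
  ruby {
    # prepend the known year to the captured syslog-style timestamp
    code => "event.set('messageDate', '2016 ' + event.get('messageDate').to_s)"
  }
}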
If you have a separate field for the date as a string, you can use the Logstash date plugin:
https://www.elastic.co/guide/en/logstash/current/plugins-filters-date.html
If you don't have it as a separate field (as in this case), use this site to construct your own grok pattern:
http://grokconstructor.appspot.com/do/match
I made this to preprocess the values:
%{YEAR:yearVal} %{MONTH:monthVal} %{NUMBER:dayVal} %{TIME:timeVal} %{GREEDYDATA:message}
Not the most elegant I guess, but you get the values in different fields. Using this you can create your own date field and parse it with the date filter so you get a comparable value, or you can use these fields by themselves. I'm sure there is a better solution, for example you could make your own grok pattern and use that, but I'm gonna leave some exploration for you too. :)
By reading the grok documentation thoroughly I found what Google couldn't find for me, and which I apparently missed the first time I read that page:
https://www.elastic.co/guide/en/logstash/current/plugins-filters-grok.html#plugins-filters-grok-add_field
Using the add_field and remove_field options I managed to add the year to my date; then I used the date plugin to parse it into the event timestamp. My filter configuration now looks like this:
filter {
  grok {
    match => { "message" => "%{SYSLOGTIMESTAMP:tMessageDate} %{GREEDYDATA:messagge}" }
    add_field => { "messageDate" => "2016 %{tMessageDate}" }
    remove_field => ["tMessageDate"]
  }
  date {
    match => [ "messageDate", "YYYY MMM dd HH:mm:ss" ]
  }
}
And it worked fine

logstash add_field conversion issue

I am using Logstash version 5.0.2, parsing a file which holds a filename as one of the fields parsed by the Logstash grok filter. For visualization I needed a file number to identify each file, so I added a new field through the mutate filter's add_field, checking for the filename in [message]:
if 'filename_1' in [message] {
  mutate { add_field => { "file_no" => "13" } }
  mutate { convert => [ "file_no", "float" ] }
}
If I check the parsing through stdin/stdout (rubydebug codec), it shows the file_no field is converted properly, but if I send the Logstash output to Elasticsearch, Kibana shows a conflict in the data type of that field.
There I am able to see file_no.keyword (as string) and file_no (as conflict), with the error:
Mapping conflict! A field is defined as several types (string, integer,
etc) across the indices that match this pattern. You may still be able to use
these conflict fields in parts of Kibana, but they will be unavailable for
functions that require Kibana to know their type. Correcting this issue will
require reindexing your data
I have converted the added field, so I am not sure why it is still being sent to Elasticsearch as a string.
Any help would be great.
When I tried converting the field, there is no option for number in Kibana. The source logfile being monitored doesn't have this number, so I can't parse it directly as an integer with %{PATTERN_FOR_NUMBER:number_variable:int}; otherwise this could have been easier.

Logstash Dynamic Index From Document Field Fails

I still face problems figuring out how to tell Logstash to send documents to a dynamic index based on a document field. Furthermore, this field must be transformed in order to get the "real" index at the very end.
Given that there is a field "time" (which is a UNIX timestamp), this field already gets transformed with a "date" filter into a DateTime object for Elasticsearch.
Additionally, it should serve as the index (YYYYMM). The index should NOT be derived from @timestamp, which is not touched.
Example:
{...,"time":1453412341,...}
Shall go to the Index: 201601
I use the following config:
filter {
  date {
    match => [ "time", "UNIX" ]
    target => "time"
    timezone => "Europe/Berlin"
  }
}
output {
  elasticsearch {
    index => "%{time}%{+YYYYMM}"
    document_type => "..."
    document_id => "%{ID}"
    hosts => "..."
  }
}
Sadly, it's not working. Any idea how to achieve that?
Thanks a lot!
The "%{+YYYYMM}" says to use the date values from #timestamp. If you want an index named after the YYYYMM in %{time}, you need to make a string out of that date field and then reference that string in the output stanza. There might be a mutate{} that would do it, or drop into ruby{}.
In most installations, you want to set #timestamp to the event's value. The default of logstash's own time is not very useful (imagine if your events were delayed by an hour during processing). If you did that, then %{+YYYYMM}" would work just fine.
This is because the index name is created based on UTC time by default.
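A rough sketch of the first suggestion, assuming a Logstash release whose ruby filter exposes event.get/event.set and where the date-parsed "time" field is a Timestamp object (the index_month field name is made up for illustration):
filter {
  date {
    match => [ "time", "UNIX" ]
    target => "time"
    timezone => "Europe/Berlin"
  }
  ruby {
    # build a YYYYMM string from the parsed "time" field for use as the index name
    # note: the underlying Time is UTC, so adjust here if the index name must follow local time
    code => "event.set('index_month', event.get('time').time.strftime('%Y%m'))"
  }
}
output {
  elasticsearch {
    index => "%{index_month}"
    document_id => "%{ID}"
    hosts => "..."
  }
}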

Elasticsearch converting a string to number

I am new to Elasticsearch and am just starting out with the ELK stack. I am collecting key-value type logs in my Logstash and passing them to an index in Elasticsearch. I am using the kv filter plugin in Logstash. Due to this, all the fields are of string type by default.
When I try to perform an aggregation like avg or sum on a numeric field in Elasticsearch, I am getting an exception: ClassCastException[org.elasticsearch.index.fielddata.plain.PagedBytesIndexFieldData cannot be cast to org.elasticsearch.index.fielddata.IndexNumericFieldData]
When I check the mappings in the index, all the fields except the timestamp ones are marked as string.
Please tell me how to overcome this issue as I have many numeric fields in my log events for aggregation.
Thanks,
Keerthana
You could set explicit mappings for those fields (see e.g. Change default mapping of string to "not analyzed" in Elasticsearch for some guidance), but it's easier to just convert those fields to integers in Logstash using the mutate filter:
mutate {
  convert => ["name-of-field", "integer"]
}
Then Elasticsearch will do a better job at guessing the best data type for your field(s).
(See also Data type conversion using logstash grok.)
In the latest Logstash the syntax is as follows:
filter {
  mutate {
    convert => { "fieldname" => "integer" }
  }
}
You can visit this link for more detail: https://www.elastic.co/guide/en/logstash/current/plugins-filters-mutate.html#plugins-filters-mutate-convert
