Kibana: extract fields from @message containing JSON - elasticsearch

I would like to extract, in Kibana, fields from the @message field, which contains JSON.
ex:
Audit{
uuid='xxx-xx-d3sd-fds3-f43',
action='/v1.0/execute/super/method',
resultCode='SUCCESS',
browser='null',
ipAddress='192.168.2.44',
application='application1',
timeTaken='167'
}
Having "action" and "application" fields I hope to be able to find top 5 requests that hits the application.
I started with something similar to this:
filter {
if [message] =~ "Audit" {
grok {
match => {
"message" => "%{WORD:uuid}, %{WORD:action}, %{WORD:resultCode}, %{WORD:browser}, %{WORD:ipAddress}, %{WORD:application}, %{NUMBER:timeTaken}"
}
add_field => ["action", "%{action}"]
add_field => ["application", "%{application}"]
}
}
}
But it seems to be too far from reality.

If the content of "Audit" is really in json format, you can use the filter plugin "json"
json{
source => "Audit"
}
It will do the parsing for you and creates everything. You don't need grok / add_field.
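That said, the Audit block shown above is not strict JSON (it uses key='value' pairs). If it stays in that layout, a grok capture followed by a kv filter may get you action and application. This is only a sketch under that assumption; audit_body is a made-up intermediate field name:
filter {
if [message] =~ "Audit" {
# capture everything between "Audit{" and the closing "}" (assumes a single-line message)
grok {
match => { "message" => "Audit\{%{GREEDYDATA:audit_body}\}" }
}
# turn uuid='...', action='...' pairs into individual fields
kv {
source => "audit_body"
field_split => ","
value_split => "="
trim_key => " "
trim_value => "'"
remove_field => ["audit_body"]
}
}
}
With action and application as real fields, a terms aggregation (or a Kibana visualization) on action can then give the top 5 requests per application.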

Related

how filter {"foo":"bar", "bar": "foo"} with grok to get only foo field?

I copied
{"name":"myapp","hostname":"banana.local","pid":40161,"level":30,"msg":"hi","time":"2013-01-04T18:46:23.851Z","v":0}
from https://github.com/trentm/node-bunyan and saved it as my logs.json. I am trying to import only two fields (name and msg) into Elasticsearch via Logstash. The problem is that I depend on a sort of filter that I have not been able to write. I have successfully imported such a line as a single message, but that is certainly not useful in my real case.
That said, how can I import only name and msg into Elasticsearch? I tested several alternatives using http://grokdebug.herokuapp.com/ to reach a useful filter, with no success at all.
For instance, %{GREEDYDATA:message} will bring in the entire line as a single message, but how do I split it and ignore all fields other than name and msg?
In the end, I am planning to use it here:
input {
file {
type => "my_type"
path => [ "/home/logs/logs.log" ]
codec => "json"
}
}
filter {
grok {
match => { "message" => "data=%{GREEDYDATA:request}"}
}
#### some extra lines here probably
}
output {
elasticsearch {
codec => json
hosts => "http://127.0.0.1:9200"
index => "indextest"
}
stdout { codec => rubydebug }
}
I have just gone through the list of available Logstash filters. The prune filter should match your needs.
Assuming you have installed the prune filter, your config file should look like this:
input {
file {
type => "my_type"
path => [ "/home/logs/logs.log" ]
codec => "json"
}
}
filter {
prune {
whitelist_names => [
"#timestamp",
"type",
"name",
"msg"
]
}
}
output {
elasticsearch {
codec => json
hosts => "http://127.0.0.1:9200"
index => "indextest"
}
stdout { codec => rubydebug }
}
Please note that you will want to keep type so that Elasticsearch indexes the documents into the correct type. @timestamp is required if you want to view the data in Kibana.
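One caveat: the prune filter treats whitelist_names entries as regular expressions matched against field names, so a bare "name" also keeps fields such as "hostname". If that matters, anchoring the patterns is a small change (a sketch using the same fields as above):
filter {
prune {
whitelist_names => [
"^@timestamp$",
"^type$",
"^name$",
"^msg$"
]
}
}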

Modify the content of a field using logstash

I am using logstash to get data from a sql database. There is a field called "code" in which the content has
this structure:
PO0000001209
ST0000000909
And what I would like to do is to remove the 6 zeros after the letters to get the following result:
PO1209
ST0909
I will put the result in another field called "code_short" and use it for my query in Elasticsearch. I have configured the input
and the output in Logstash, but I am not sure how to do this using grok or maybe the mutate filter.
I have read some examples, but I am quite new to this and I am a bit stuck.
Any help would be appreciated. Thanks.
You could use a mutate/gsub filter for this but that will replace the value of the code field:
filter {
mutate {
gsub => [
"code", "000000", "",
]
}
}
Another option is to use a grok filter like this:
filter {
grok {
match => { "code" => "(?<prefix>[a-zA-Z]+)000000%{INT:suffix}" }
add_field => { "code_short" => "%{prefix}%{suffix}"}
}
}
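If you want to keep the original code untouched and only populate code_short, one way is to copy the field first and run gsub on the copy. Two mutate blocks are used because add_field is applied late within a single mutate, so the copy would not yet exist for gsub in the same block. A sketch:
filter {
mutate {
# copy the original value into the new field
add_field => { "code_short" => "%{code}" }
}
mutate {
# strip the run of six zeros from the copy only
gsub => [
"code_short", "000000", ""
]
}
}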

Logstash - Add fields from the log - Grok

I'm learning Logstash and I'm using Kibana to see the logs. I would like to know whether there is any way to add fields using data from the message property.
For example, the log is like this:
@timestamp:December 21st 2016, 21:39:12.444 port:47,144
appid:%{[path]} host:172.18.0.5 levell:level message:
{"#timestamp":"2016-12-22T00:39:12.438+00:00","#version":1,"message":"Hello","logger_name":"com.empresa.miAlquiler.controllers.UserController","thread_name":"http-nio-7777-exec-1","level":"INFO","level_value":20000,
"HOSTNAME":"6f92ae402cb4","X-Span-Export":"false","X-B3-SpanId":"8f548829e9d18a8a","X-B3-TraceId":"8f548829e9d18a8a"}
My Logstash config looks like this:
filter {
grok {
match => {
"message" =>
"^%{TIMESTAMP_ISO8601:timestamp}\s+%{LOGLEVEL:level}\s+%{NUMBER:pid}\s+---\s+\[\s*%{USERNAME:thread}\s*\]\s+%{JAVAFILE:class}\s*:\s*%{DATA:themessage}(?:\n+(?<stacktrace>(?:.|\r|\n)+))?$"
}
}
date {
match => [ "timestamp" , "yyyy-MM-dd HH:mm:ss.SSS" ]
}
mutate {
remove_field => ["#version"]
add_field => {
"appid" => "%{[path]}"
}
add_field => {
"levell" => "level"
}
}
}
I would like to take level (which in the log is INFO) and message (which in the log is Hello) and add them as fields.
Is there any way to do that?
What if you do something like this using mutate:
filter {
mutate {
add_field => ["newfield", "%{appid} %{levell}"] <-- this should concat both your appid and level to a new field
}
}
You might have a look at this thread.
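If the goal is specifically to pull level and message out of the embedded JSON at the end of the log line, another option is to capture that JSON with grok and hand it to the json filter. This is only a sketch; json_payload, loglevel, and logmessage are made-up names, and it assumes the JSON object is the last part of the line:
filter {
# capture the embedded JSON object at the end of the line
grok {
match => { "message" => "(?<json_payload>\{.*\})" }
}
# parse it into fields under [log]
json {
source => "json_payload"
target => "log"
}
mutate {
add_field => {
"loglevel" => "%{[log][level]}"
"logmessage" => "%{[log][message]}"
}
remove_field => ["json_payload"]
}
}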

Data type conversion using logstash grok

Basic is a float field. The mentioned index is not present in Elasticsearch. When running the config file with logstash -f, I get no exception. Yet the data indexed into Elasticsearch shows the mapping of Basic as string. How do I rectify this? And how do I do this for multiple fields?
input {
file {
path => "/home/sagnik/work/logstash-1.4.2/bin/promosms_dec15.csv"
type => "promosms_dec15"
start_position => "beginning"
sincedb_path => "/dev/null"
}
}
filter {
grok{
match => [
"Basic", " %{NUMBER:Basic:float}"
]
}
csv {
columns => ["Generation_Date","Basic"]
separator => ","
}
ruby {
code => "event['Generation_Date'] = Date.parse(event['Generation_Date']);"
}
}
output {
elasticsearch {
action => "index"
host => "localhost"
index => "promosms-%{+dd.MM.YYYY}"
workers => 1
}
}
You have two problems. First, your grok filter is listed prior to the csv filter, and because filters are applied in order there won't be a "Basic" field to convert when the grok filter is applied.
Secondly, unless you explicitly allow it, grok won't overwrite existing fields. In other words,
grok{
match => [
"Basic", " %{NUMBER:Basic:float}"
]
}
will always be a no-op. Either specify overwrite => ["Basic"] or, preferably, use mutate's type conversion feature:
mutate {
convert => ["Basic", "float"]
}
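Putting both points together, a corrected filter section might look like the sketch below: the csv filter runs first so Basic exists, and mutate does the conversion (the ruby date handling is left as in the question):
filter {
csv {
columns => ["Generation_Date","Basic"]
separator => ","
}
# convert after the csv filter has created the field
mutate {
convert => ["Basic", "float"]
}
ruby {
code => "event['Generation_Date'] = Date.parse(event['Generation_Date']);"
}
}
Note that convert only affects the event inside Logstash; if an index already has Basic mapped as string, you would still need a new index (or a reindex) for the float mapping to take effect.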

reparsing a logstash record? fix extracts?

I'm taking a JSON message (CloudTrail, many objects concatenated together) and by the time I'm done filtering it, Logstash doesn't seem to be parsing the message correctly. It's as if the hash was simply dumped into a string.
Anyhow, here's the input and filter.
input {
s3 {
bucket => "stanson-ops"
delete => false
#snipped unimportant bits
type => "cloudtrail"
}
}
filter {
if [type] == "cloudtrail" {
json { # http://logstash.net/docs/1.4.2/filters/json
source => "message"
}
ruby {
code => "event['RecordStr'] = event['Records'].join('~~~')"
}
split {
field => "RecordStr"
terminator => "~~~"
remove_field => [ "message", "Records" ]
}
}
}
By the time I'm done, elasticsearch entries include a RecordStr key with the following data. It doesn't have a message field, nor does it have a Records field.
{"eventVersion"=>"1.01", "userIdentity"=>{"type"=>"IAMUser", "principalId"=>"xxx"}}
Note that this is not JSON-style; it has been parsed (which is important for the concat->split thing to work).
So, the RecordStr key looks not quite right as one value. Further, in Kibana, filterable fields include RecordStr (no subfields). It includes some entries that aren't there anymore: Records.eventVersion, Records.userIdentity.type.
Why is that? How can I get the proper fields?
Edit 1: here's part of the input.
{"Records":[{"eventVersion":"1.01","userIdentity":{"type":"IAMUser",
It's unprettified JSON. It appears the body of the file (the above) ends up in the message field, the json filter extracts it, and I end up with an array of records in the Records field. That's why I join and split it: I then end up with individual documents, each with a single RecordStr entry. However, the template(?) doesn't seem to understand the new structure.
I've worked out a method that allows for indexing the appropriate CloudTrail fields as you requested. Here are the modified input and filter configs:
input {
s3 {
backup_add_prefix => "processed-logs/"
backup_to_bucket => "test-bucket"
bucket => "test-bucket"
delete => true
interval => 30
prefix => "AWSLogs/<account-id>/CloudTrail/"
type => "cloudtrail"
}
}
filter {
if [type] == "cloudtrail" {
json {
source => "message"
}
ruby {
code => "event.set('RecordStr', event.get('Records').join('~~~'))"
}
split {
field => "RecordStr"
terminator => "~~~"
remove_field => [ "message", "Records" ]
}
mutate {
gsub => [
"RecordStr", "=>", ":"
]
}
mutate {
gsub => [
"RecordStr", "nil", "null"
]
}
json {
skip_on_invalid_json => true
source => "RecordStr"
target => "cloudtrail"
}
mutate {
add_tag => ["cloudtrail"]
remove_field => ["RecordStr", "@version"]
}
date {
match => ["[cloudtrail][eventTime]", "ISO8601"]
}
}
}
The key observation here is that once the split is done we no longer have valid JSON in the event and are therefore required to execute the mutate replacements ('=>' to ':' and 'nil' to 'null'). Additionally, I found it useful to get the timestamp out of the CloudTrail eventTime and do some cleanup of unnecessary fields.
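As an aside, if your Logstash version's split filter handles array fields directly (recent versions do), you may be able to skip the join/gsub round-trip entirely and split the Records array itself. A sketch under that assumption:
filter {
if [type] == "cloudtrail" {
json {
source => "message"
}
# one event per element of the Records array
split {
field => "Records"
}
mutate {
rename => { "Records" => "cloudtrail" }
remove_field => ["message"]
}
date {
match => ["[cloudtrail][eventTime]", "ISO8601"]
}
}
}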
