How to overwrite field value in Kibana? - elasticsearch

I am using Logstash to feed data into Elasticsearch and then analyzing that data with Kibana. I have a field that contains numeric identifiers. These are not easy to read. How can I have Kibana overwrite or show a more human-readable value?
More specifically, I have an 'ip.proto' field. When this field contains a 6, it should be shown as 'TCP'. When it contains 17, it should be shown as 'UDP'.
I am not sure which tool in the ELK stack I need to modify to make this happen.
Thanks

You can use conditionals and the mutate filter:
filter {
  if [ip][proto] == "6" {
    mutate {
      replace => ["[ip][proto]", "TCP"]
    }
  } else if [ip][proto] == "17" {
    mutate {
      replace => ["[ip][proto]", "UDP"]
    }
  }
}
This quickly gets clumsy, and the translate filter is more elegant (and probably faster). Untested example:
filter {
  translate {
    field => "[ip][proto]"
    # Write the translation back into the same field instead of the
    # default "translation" field.
    destination => "[ip][proto]"
    override => true
    dictionary => {
      "6"  => "TCP"
      "17" => "UDP"
    }
  }
}
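One caveat for both examples: they assume [ip][proto] arrives as a string. If the field is actually a number, the == "6" comparison will not match, and the dictionary lookup may not either. An untested normalization step (my own addition, placed before either filter above) would be:
filter {
  mutate {
    # Assumption: ip.proto may arrive as an integer; coerce it to a string
    # so the string comparisons / dictionary keys above can match.
    convert => { "[ip][proto]" => "string" }
  }
}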

Related

Lowercase field name in Logstash for Elasticsearch index

I have a logstash command that I'm piping a file to that will write to Elasticsearch. I want to use one field to select the index I will write to (appName). However, the data in this field is not all lowercase, so I need to lowercase it when selecting the index, but I don't want the data in the document itself to be modified.
I have an attempt below where I first copy the original field (appName) to a new one (appNameIndex), lowercase the new field, remove it from the upload and then use it to pick the index.
input {
  stdin { type => stdin }
}
filter {
  csv {
    separator => " "
    columns => ["appName", "field1", "field2", ...]
    convert => {
      ...
    }
  }
}
filter {
  mutate {
    copy => ["appName", "appNameIndex"]
  }
}
filter {
  mutate {
    lowercase => ["appNameIndex"]
  }
}
filter {
  mutate {
    remove_field => [
      "appNameIndex", # if I remove this it works
      ...
    ]
  }
}
output {
  amazon_es {
    hosts => ["my-es-cluster.us-east-1.es.amazonaws.com"]
    index => "%{appNameIndex}"
    region => "us-east-1"
  }
}
However I am getting errors that say
Invalid index name [%{appNameIndex}]
Clearly it's not grabbing my mutation. Is it because the remove section takes the field out entirely? I was hoping it would just remove it from the document upload. Am I going about this incorrectly?
UPDATE: I tried taking out the remove-field part and it does in fact work, so that helps identify the source of the error. Now the question becomes how I get around it. With that part of the config removed, I essentially have two fields with the same data, one lowercased and one not.
You can use the @metadata field, a special field whose contents will never be included in the output: https://www.elastic.co/guide/en/logstash/current/event-dependent-configuration.html#metadata.
input {
  stdin { type => stdin }
}
filter {
  csv {
    separator => " "
    columns => ["appName", "field1", "field2", ...]
    convert => {
      ...
    }
  }
}
filter {
  mutate {
    copy => { "appName" => "[@metadata][appNameIndex]" }
  }
}
filter {
  mutate {
    lowercase => ["[@metadata][appNameIndex]"]
  }
}
output {
  amazon_es {
    hosts => ["my-es-cluster.us-east-1.es.amazonaws.com"]
    index => "%{[@metadata][appNameIndex]}"
    region => "us-east-1"
  }
}
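One note for testing: because @metadata is excluded from outputs, it will not show up in a plain rubydebug dump either. While debugging, a stdout output with metadata enabled (a temporary addition of mine, not part of the final pipeline) makes the field visible:
output {
  # Temporary debugging output: print events including @metadata contents.
  stdout { codec => rubydebug { metadata => true } }
}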

Modify the content of a field using logstash

I am using logstash to get data from a SQL database. There is a field called "code" whose content has this structure:
PO0000001209
ST0000000909
And what I would like to do is to remove the 6 zeros after the letters to get the following result:
PO1209
ST0909
I will put the result in another field called "code_short" and use it for my query in Elasticsearch. I have configured the input and the output in Logstash, but I am not sure how to do it using grok or maybe the mutate filter.
I have read some examples but I am quite new on this and I am a bit stuck.
Any help would be appreciated. Thanks.
You could use a mutate/gsub filter for this but that will replace the value of the code field:
filter {
  mutate {
    gsub => [
      "code", "000000", ""
    ]
  }
}
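Since the question asks for the result in a separate code_short field while leaving code intact, one untested variant (my own sketch) is to copy the field first and then run gsub on the copy. Two separate mutate blocks are used so the copy is guaranteed to happen before the gsub (operations inside a single mutate run in a fixed order, not the order written):
filter {
  mutate {
    # Duplicate the original value into code_short.
    copy => { "code" => "code_short" }
  }
  mutate {
    # Strip the run of six zeros only from the copy.
    gsub => [
      "code_short", "000000", ""
    ]
  }
}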
Another option is to use a grok filter like this:
filter {
  grok {
    match => { "code" => "(?<prefix>[a-zA-Z]+)000000%{INT:suffix}" }
    add_field => { "code_short" => "%{prefix}%{suffix}" }
  }
}

Can we extract numeric values from string through script in kibana?

I need to extract numeric values from a string and store them in a new field. Can we do this through a scripted field?
Ex: 1 hello 3 test
I need to extract 1 and 3.
You can do this through logstash if you are using elasticsearch.
Run a logstash process with a config like
input {
  elasticsearch {
    hosts => "your_host"
    index => "your_index"
    query => '{ "query": { "match_all": {} } }'
    # docinfo exposes the original document's _index and _id under @metadata,
    # so the output below can overwrite the same documents.
    docinfo => true
    docinfo_target => "[@metadata][doc]"
  }
}
filter {
  grok {
    match => { "your_string_field" => "%{NUMBER:num1} %{GREEDYDATA:middle_stuff} %{NUMBER:num2} %{GREEDYDATA:other_stuff}" }
  }
  mutate {
    remove_field => ["middle_stuff", "other_stuff"]
  }
}
output {
  elasticsearch {
    hosts => "your_host"
    index => "your_index"
    # Reuse the original document id so the documents are updated in place.
    document_id => "%{[@metadata][doc][_id]}"
  }
}
This would essentially overwrite each document in your index with two more fields, num1 and num2 that correspond to the numbers that you are looking for. This is just a quick and dirty approach that would take up more memory, but would allow you to do all of the break up at one time instead of at visualization time.
I am sure there is a way to do this with scripting, look into groovy regex matching where you return a specific group.
Also no guarantee my config representation is correct as I don't have time to test it at the moment.
Have a good day!

Logstash - find length of split result inside mutate

I'm a newbie with Logstash. Currently I'm trying to parse a log in CSV format. I need to split a field with a whitespace delimiter, then I'll add new field(s) based on the split result.
Here is the filter I need to create:
filter {
  ...
  mutate {
    split => ["user", " "]
    if [user.length] == 2 {
      add_field => { "sourceUsername" => "%{user[0]}" }
      add_field => { "sourceAddress" => "%{user[1]}" }
    }
    else if [user.length] == 1 {
      add_field => { "sourceAddress" => "%{user[0]}" }
    }
  }
  ...
}
I get an error after the if script.
Please advise: is there any way to capture the length of the split result inside the mutate plugin?
Thanks,
Heri
According to your code example, I suppose you are done with the CSV parsing and already have a field user whose value contains either just a sourceAddress, or a sourceUsername and a sourceAddress separated by whitespace.
Now, there are a lot of filters that can be used to retrieve further fields. You don't need to use the mutate filter to split the field. In this case, a more flexible approach would be the grok filter.
Filter:
grok {
  match => {
    "user" => [
      "%{WORD:sourceUsername} %{IP:sourceAddress}",
      "%{IP:sourceAddress}"
    ]
  }
}
A field "user" => "192.168.0.99" would result in
"sourceAddress" => "191.168.0.99".
A field "user" => "Herry 192.168.0.99" would result in
"sourceUsername" => "Herry", "sourceAddress" => "191.168.0.99"
Of course you can change IP to WORD if your sourceAddress is not an IP.
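If you specifically want to branch on how many tokens the split produced, that logic cannot live inside the mutate plugin itself, but a ruby filter can inspect the resulting array. A rough, untested sketch (my own, not from the original answer):
filter {
  mutate {
    split => ["user", " "]
  }
  ruby {
    # Assumes 'user' is now an array of one or two tokens.
    code => "
      parts = event.get('user')
      if parts.is_a?(Array) && parts.length == 2
        event.set('sourceUsername', parts[0])
        event.set('sourceAddress', parts[1])
      elsif parts.is_a?(Array) && parts.length == 1
        event.set('sourceAddress', parts[0])
      end
    "
  }
}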

reparsing a logstash record? fix extracts?

I'm taking a JSON message (Cloudtrail, many objects concatenated together) and by the time I'm done filtering it, Logstash doesn't seem to be parsing the message correctly. It's as if the hash was simply dumped into a string.
Anyhow, here's the input and filter.
input {
  s3 {
    bucket => "stanson-ops"
    delete => false
    # snipped unimportant bits
    type => "cloudtrail"
  }
}
filter {
  if [type] == "cloudtrail" {
    json { # http://logstash.net/docs/1.4.2/filters/json
      source => "message"
    }
    ruby {
      code => "event['RecordStr'] = event['Records'].join('~~~')"
    }
    split {
      field => "RecordStr"
      terminator => "~~~"
      remove_field => [ "message", "Records" ]
    }
  }
}
By the time I'm done, elasticsearch entries include a RecordStr key with the following data. It doesn't have a message field, nor does it have a Records field.
{"eventVersion"=>"1.01", "userIdentity"=>{"type"=>"IAMUser", "principalId"=>"xxx"}}
Note that this is not JSON-style; it's been parsed (which is important for the concat->split approach to work).
So the RecordStr key doesn't look quite right as a single value. Further, in Kibana, the filterable fields include RecordStr (no subfields). It also includes some entries that aren't there anymore: Records.eventVersion, Records.userIdentity.type.
Why is that? How can I get the proper fields?
Edit 1: here's part of the input.
{"Records":[{"eventVersion":"1.01","userIdentity":{"type":"IAMUser",
It's unprettified JSON. It appears the body of the file (the above) is in the message field; the json filter extracts it and I end up with an array of records in the Records field. That's why I join and split it: I then end up with individual documents, each with a single RecordStr entry. However, the template(?) doesn't seem to understand the new structure.
I've worked out a method that allows for indexing the appropriate CloudTrail fields as you requested. Here are the modified input and filter configs:
input {
  s3 {
    backup_add_prefix => "processed-logs/"
    backup_to_bucket => "test-bucket"
    bucket => "test-bucket"
    delete => true
    interval => 30
    prefix => "AWSLogs/<account-id>/CloudTrail/"
    type => "cloudtrail"
  }
}
filter {
  if [type] == "cloudtrail" {
    json {
      source => "message"
    }
    ruby {
      code => "event.set('RecordStr', event.get('Records').join('~~~'))"
    }
    split {
      field => "RecordStr"
      terminator => "~~~"
      remove_field => [ "message", "Records" ]
    }
    mutate {
      gsub => [
        "RecordStr", "=>", ":"
      ]
    }
    mutate {
      gsub => [
        "RecordStr", "nil", "null"
      ]
    }
    json {
      skip_on_invalid_json => true
      source => "RecordStr"
      target => "cloudtrail"
    }
    mutate {
      add_tag => ["cloudtrail"]
      remove_field => ["RecordStr", "@version"]
    }
    date {
      match => ["[cloudtrail][eventTime]", "ISO8601"]
    }
  }
}
The key observation here is that once the split is done we no longer have valid JSON in the event, and are therefore required to execute the mutate replacements ('=>' to ':' and 'nil' to 'null'). Additionally, I found it useful to get the timestamp out of the CloudTrail eventTime and do some cleanup of unnecessary fields.
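For illustration (using the value shown earlier in the question), the two gsub passes rewrite the Ruby-hash-style string into something the json filter can parse; any nil values would similarly become null:
before: {"eventVersion"=>"1.01", "userIdentity"=>{"type"=>"IAMUser", "principalId"=>"xxx"}}
after:  {"eventVersion":"1.01", "userIdentity":{"type":"IAMUser", "principalId":"xxx"}}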
