Modify the content of a field using logstash - elasticsearch

I am using logstash to get data from a sql database. There is a field called "code" in which the content has
this structure:
PO0000001209
ST0000000909
And what I would like to do is to remove the 6 zeros after the letters to get the following result:
PO1209
ST0909
I will put the result in another field called "code_short" and use it for my query in elasticsearch. I have configured the input
and the output in logstash, but I am not sure how to do this with grok or maybe the mutate filter.
I have read some examples, but I am quite new to this and I am a bit stuck.
Any help would be appreciated. Thanks.

You could use a mutate/gsub filter for this, but that will replace the value of the code field:
filter {
  mutate {
    gsub => [
      "code", "000000", ""
    ]
  }
}
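If the original code value has to survive, one option is to copy it first and run gsub on the copy. This is a minimal sketch, assuming a mutate plugin version that supports the copy option. The two separate mutate blocks are deliberate: within a single mutate block, copy runs after gsub.

```
filter {
  # Copy first, in its own block, so the copy exists before gsub runs.
  mutate {
    copy => { "code" => "code_short" }
  }
  # Strip the six padding zeros from the copy only; "code" stays untouched.
  mutate {
    gsub => [ "code_short", "000000", "" ]
  }
}
```

With the sample values this should leave code as PO0000001209 and set code_short to PO1209.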
Another option is to use a grok filter like this:
filter {
  grok {
    match => { "code" => "(?<prefix>[a-zA-Z]+)000000%{INT:suffix}" }
    add_field => { "code_short" => "%{prefix}%{suffix}" }
  }
}

Related

Kibana. Extract fields from @message containing a JSON

I would like to extract fields in Kibana from the @message field, which contains JSON.
ex:
Audit{
uuid='xxx-xx-d3sd-fds3-f43',
action='/v1.0/execute/super/method',
resultCode='SUCCESS',
browser='null',
ipAddress='192.168.2.44',
application='application1',
timeTaken='167'
}
With the "action" and "application" fields I hope to be able to find the top 5 requests that hit the application.
I started with something similar to this:
filter {
  if [message] =~ "Audit" {
    grok {
      match => {
        "message" => "%{WORD:uuid}, %{WORD:action}, %{WORD:resultCode}, %{WORD:browser}, %{WORD:ipAddress}, %{WORD:application}, %{NUMBER:timeTaken}"
      }
      add_field => ["action", "%{action}"]
      add_field => ["application", "%{application}"]
    }
  }
}
But it seems to be too far from reality.
If the content of "Audit" is really in JSON format, you can use the "json" filter plugin:
json {
  source => "Audit"
}
It will do the parsing for you and create the fields. You don't need grok / add_field.
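The sample above uses key='value' pairs rather than strict JSON, so the json filter may not accept it as is. In that case a grok-based sketch along these lines could pull out just the two fields of interest. It assumes the text lives in the message field and that the values never contain a single quote.

```
filter {
  if [message] =~ "Audit" {
    grok {
      # Each pattern is searched for anywhere in the message.
      match => {
        "message" => [
          "action='%{DATA:action}'",
          "application='%{DATA:application}'"
        ]
      }
      # Try every pattern instead of stopping at the first one that matches.
      break_on_match => false
    }
  }
}
```

A terms aggregation on action, filtered by application, could then be used in Kibana for the top 5 requests.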

Logstash normalise URL from JSON logs

I have logs in newline-separated JSON like the following:
{
  "httpRequest": {
    "requestMethod": "GET",
    "requestUrl": "/foo/submit?proj=56"
  }
}
Now I need the URL without its dynamic parts, i.e. the first path segment (someTenant) and the query parameters, added as a field in elasticsearch. The expected normalised URL is
"requestUrl": "/{{someTenant}}/submit?{{someParams}}"
I already have the following filter in my logstash config, but I am not sure how to apply a sequence of regex operations to a specific field and add the result as a new one.
json {
  source => "message"
}
This way I could aggregate the unique endpoints even though the URLs differ in the logs due to variable path params and query params.
Since this question is tagged with grok, I will go ahead and assume you can use grok filters.
Use a grok filter to create a new field from the requestUrl field; the URIPATHPARAM grok pattern then separates the components of requestUrl as follows. Note that after the json filter the field is nested under httpRequest, hence the bracketed reference:
grok {
  match => { "[httpRequest][requestUrl]" => "%{URIPATHPARAM:request_data}" }
}
This will produce the following output:
{
  "request_data": [
    [
      "/foo/submit?proj=56"
    ]
  ],
  "URIPATH": [
    [
      "/foo/submit"
    ]
  ],
  "URIPARAM": [
    [
      "?proj=56"
    ]
  ]
}
Can be tested on Grok Online Debugger
thanks
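To get from there to the normalised form in the question, a copy followed by two gsub substitutions might be enough. This is a sketch, assuming the json filter has already produced the nested [httpRequest][requestUrl] field and that the tenant is always the first path segment. normalisedUrl is a made-up field name.

```
filter {
  # Work on a copy so the original URL is kept.
  mutate {
    copy => { "[httpRequest][requestUrl]" => "normalisedUrl" }
  }
  mutate {
    gsub => [
      # First path segment becomes the tenant placeholder.
      "normalisedUrl", "^/[^/?]+", "/{{someTenant}}",
      # Everything from the first "?" onwards becomes the params placeholder.
      "normalisedUrl", "[?].*$", "?{{someParams}}"
    ]
  }
}
```

For /foo/submit?proj=56 this should yield /{{someTenant}}/submit?{{someParams}}.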

Not able to remove all fields with a prefix in Logstash

I have the following fields after parsing my JSON in Logstash.
parsedcontent.userinfo.appId
parsedcontent.userinfo.deviceId
parsedcontent.userinfo.id
parsedcontent.userinfo.token
parsedcontent.userinfo.type.
I want to remove all these fields using a filter. I can do it with:
filter {
  mutate {
    remove_field => "[parsedcontent][userinfo][appId]"
  }
}
But I have to write field names with the same prefix many times, and I have many fields of this kind. Is there a filter that removes fields by prefix easily? Please guide.
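As far as I know, remove_field takes field references literally, so a wildcard such as "[parsedcontent*]" is not expanded. Since all of these fields are nested under the same parent, you can remove the parent and the nested fields go with it:
filter {
  mutate {
    remove_field => "[parsedcontent][userinfo]"
  }
}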
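For fields that share a prefix at the top level of the event, the prune filter accepts regular expressions on field names. A hedged sketch: it assumes the prune plugin is installed and, as far as I know, it only looks at top-level field names, so it would drop the whole parsedcontent object rather than individual nested keys.

```
filter {
  prune {
    # Remove every top-level field whose name starts with "parsedcontent".
    blacklist_names => [ "^parsedcontent" ]
  }
}
```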

How to leverage logstash to index data but not generating extra fields from logstash

I am testing ElasticSearch to handle around 1 billion small docs (only 8 fields each). When I use logstash to index data, it adds other fields like "message", "@version" and "@timestamp" that are not useful in my case and seem to take up a lot of space per doc. Is there a way to index only the fields defined in the configuration?
Yes, simply add the following mutate filter in your Logstash configuration:
filter {
  mutate {
    remove_field => [ "@version", "@timestamp", "message" ]
  }
}
Yes, you can add and remove fields. To remove fields, use the following snippet in your conf file:
filter {
  mutate {
    remove_field => [ "@timestamp", "message", "@version" ]
  }
}
To add a new field, use the following snippet:
filter {
  mutate {
    add_field => { "foo_%{somefield}" => "Hello world, from %{host}" }
  }
}
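One thing to watch when dropping @timestamp: the elasticsearch output's default index name has traditionally been date-based and built from that field, so it may be worth setting a fixed index name as well. A sketch, where the host and the index name small-docs are placeholders and not values from the question:

```
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    # Fixed index name, so nothing depends on the removed @timestamp field.
    index => "small-docs"
  }
}
```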

Logstash - find length of split result inside mutate

I'm a newbie with Logstash. Currently I'm trying to parse a log in CSV format. I need to split a field on whitespace, then add new field(s) based on the split result.
Here is the filter I need to create:
filter {
  ...
  mutate {
    split => ["user", " "]
    if [user.length] == 2 {
      add_field => { "sourceUsername" => "%{user[0]}" }
      add_field => { "sourceAddress" => "%{user[1]}" }
    }
    else if [user.length] == 1 {
      add_field => { "sourceAddress" => "%{user[0]}" }
    }
  }
  ...
}
I get an error after the if statement.
Please advise: is there any way to capture the length of the split result inside the mutate plugin?
Thanks,
Heri
Judging by your code example, I suppose that you are done with the CSV parsing and already have a field user whose value is either a sourceAddress alone or a sourceUsername and a sourceAddress separated by whitespace.
Now, there are a lot of filters that can be used to retrieve further fields. You don't need to use the mutate filter to split the field. In this case, a more flexible approach would be the grok filter.
Filter:
grok {
  match => {
    "user" => [
      "%{WORD:sourceUsername} %{IP:sourceAddress}",
      "%{IP:sourceAddress}"
    ]
  }
}
A field "user" => "192.168.0.99" would result in
"sourceAddress" => "192.168.0.99".
A field "user" => "Herry 192.168.0.99" would result in
"sourceUsername" => "Herry", "sourceAddress" => "192.168.0.99"
Of course you can change IP to WORD if your sourceAddress is not an IP.
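If you would rather stay with split, the conditionals have to sit outside the mutate block, and the element count can be checked indirectly by testing whether a second element exists. A sketch under that assumption, using [user][0]-style references for the array elements:

```
filter {
  # Turn "user" into an array of whitespace-separated parts.
  mutate {
    split => { "user" => " " }
  }
  if [user][1] {
    # Two parts: username followed by address.
    mutate {
      add_field => {
        "sourceUsername" => "%{[user][0]}"
        "sourceAddress" => "%{[user][1]}"
      }
    }
  } else {
    # One part: address only.
    mutate {
      add_field => { "sourceAddress" => "%{[user][0]}" }
    }
  }
}
```

Unlike the grok version, this does not validate that the address looks like an IP; it only counts parts.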
