Not able to remove all fields with a prefix in Logstash - elasticsearch

I have the following fields after parsing my JSON in Logstash:
parsedcontent.userinfo.appId
parsedcontent.userinfo.deviceId
parsedcontent.userinfo.id
parsedcontent.userinfo.token
parsedcontent.userinfo.type
I want to remove all of these fields using a filter. I can do it with:
filter {
  mutate {
    remove_field => "[parsedcontent][userinfo][appId]"
  }
}
But then I have to repeat the same prefix for every field, and I have many fields like this. Is there a filter that removes all fields with a given prefix? Please advise.

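As far as I know, remove_field takes literal field references and does not expand wildcards or regexes, so "[parsedcontent*]" will not match anything. Because all of these fields are nested under the same parent, you can remove the parent and its children go with it:
filter {
  mutate {
    remove_field => "[parsedcontent][userinfo]"
  }
}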
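For the case where the matching fields are separate top-level keys rather than children of one parent, a ruby filter can loop over the event and drop keys by prefix. This is a minimal sketch, assuming a Logstash version that exposes the event.to_hash / event.remove Event API (5.x or later); the 'parsedcontent' prefix is taken from the question, and prefix matching on top-level names only is my own assumption about what should be removed.

```
filter {
  ruby {
    # event.to_hash returns a copy, so removing fields while iterating is safe.
    # Only top-level field names are inspected here.
    code => "
      event.to_hash.keys.each do |name|
        event.remove(name) if name.start_with?('parsedcontent')
      end
    "
  }
}
```

Removing a top-level key this way also drops everything nested beneath it.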

Related

Logstash: removing all the nodes of a given name, in any position of a json

I need something like a regular expression to remove every property named "source_user_id" from a JSON document, since that property appears at several different levels. E.g.
filter {
  mutate {
    remove_field => ["[extended_tweet][entities][media][0][source_user_id]", "message"]
    remove_field => ["[extended_tweet][extended_entities][media][0][source_user_id]", "message"]
    ...
I read that I can do something like:
remove_field => ["[%{source_user_id}]", "message"]
But that will only match at the first level.
Any suggestions?
Thanks in advance.
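One possible approach, since mutate works on fixed field references, is to walk the event in a ruby filter and delete the key wherever it appears. The following is only a sketch under assumptions: it relies on the Event API (event.to_hash, event.set, event.remove) of Logstash 5.x or later, and the helper name strip is made up for illustration.

```
filter {
  ruby {
    code => "
      # Recursively delete the key from nested hashes and from hashes inside arrays.
      strip = lambda do |node|
        if node.is_a?(Hash)
          node.delete('source_user_id')
          node.each_value { |child| strip.call(child) }
        elsif node.is_a?(Array)
          node.each { |child| strip.call(child) }
        end
      end

      # Handle the top level directly.
      event.remove('source_user_id')

      # to_hash returns a copy, so each cleaned value is written back to the event.
      event.to_hash.each do |name, value|
        next unless value.is_a?(Hash) || value.is_a?(Array)
        strip.call(value)
        event.set(name, value)
      end
    "
  }
}
```

The write-back step matters: without event.set the changes would only affect the copied hash, not the event itself.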

Logstash create a new field based on existing field

I have data coming from database queries via the jdbc input plugin, and the query results contain a url field from which I want to extract a few properties.
Example urls:
/incident.do?sys_id=0dc18b246faa17007a64cbe64f3ee4e1&sysparm_view
/navpage_form_default.do
/u_pm_prov_project_list.do?sysparm_userpref_module=fa547ce26f661
JOB: email read events process
JOB: System - reduce resources
I added these regex patterns to a grok patterns file:
webpage_category .*
job_type .*
I have two types of url, so I used an if in the filter block to distinguish between them.
Config I have tried so far:
filter {
  if [url] =~ /JOB: .*/ {
    grok {
      patterns_dir => ["/etc/logstash/patterns"]
      match => {
        "url" => "JOB: %{job_type:job_type}"
      }
    }
  } else if [url] =~ /\/.*\.do\?.*/ {
    grok {
      patterns_dir => ["/etc/logstash/patterns"]
      match => {
        "url" => "/{webpage_category:webpage_category}\.do\?.*"
      }
    }
  }
}
Creating the new field for urls starting with JOB: works properly, but webpage_category does not work at all. Is it because a regex cannot be used inside match?
The problem is that you are trying to use a grok pattern inside a mutate filter, which won't work. mutate and grok are two separate filter plugins.
You need to use add_field inside the grok filter if you want to use a grok pattern to create a field. Keep in mind that add_field is supported by all filter plugins.
Have a look at the following example:
filter {
  grok {
    add_field => { "foo_%{somefield}" => "Hello world, from %{host}" }
  }
}
In your case, it will be:
filter {
  grok {
    add_field => {
      "webpage_category" => "%{webpage_category:url}"
      "job_type" => "%{job_type:url}"
    }
  }
}
Also make sure patterns_dir is set:
patterns_dir => ["./patterns"]
Check the grok filter documentation as well.
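Another thing worth checking: the second pattern in the question writes {webpage_category:webpage_category} without the leading %, so grok would treat it as literal text. The conditional also requires a ? after .do, which skips urls such as /navpage_form_default.do. As far as I know, the %{NAME:field} capture syntax is only interpreted inside match, while add_field values are plain field references. The sketch below uses an inline named capture and the stock GREEDYDATA pattern so that no custom patterns file is needed; the looser conditionals are my own assumption about which urls should be handled.

```
filter {
  if [url] =~ /^JOB: / {
    grok {
      # Everything after the 'JOB: ' prefix becomes job_type
      match => { "url" => "^JOB: %{GREEDYDATA:job_type}" }
    }
  } else if [url] =~ /\.do/ {
    grok {
      # Inline named capture: the text between the leading slash and '.do'
      match => { "url" => "^/(?<webpage_category>[^/?]+)\.do" }
    }
  }
}
```

With the example urls from the question, this would be expected to capture values like incident and navpage_form_default.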

Remove header fields generated by http input plugin

When I use http input plugin, Logstash adds the following fields to Elasticsearch:
headers.http_accept
headers.content_type
headers.request_path
headers.http_version
headers.request_method
...
How can I remove all these fields starting with headers.?
Since these are all dotted paths, they are nested under [headers] as far as the Logstash config is concerned. This should do what you want:
filter {
  mutate {
    remove_field => [ "headers" ]
  }
}
That should drop [headers][http_accept], [headers][content_type], and the rest.
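If one of the header values is still needed, it can be moved out before the parent is dropped. This is a small sketch, assuming you want to keep the request path; the top-level name request_path is a hypothetical choice, not something the plugin sets.

```
filter {
  mutate {
    # rename runs as part of the mutate itself; remove_field is applied afterwards
    rename => { "[headers][request_path]" => "request_path" }
    remove_field => [ "headers" ]
  }
}
```

Both steps can share one mutate block because remove_field is a common option that is applied only after the filter's own operations succeed.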

Modify the content of a field using logstash

I am using Logstash to get data from a SQL database. There is a field called "code" whose content has this structure:
PO0000001209
ST0000000909
What I would like to do is remove the 6 zeros after the letters to get the following result:
PO1209
ST0909
I will put the result in another field called "code_short" and use it for my queries in Elasticsearch. I have configured the input and the output in Logstash, but I am not sure how to do this using grok or maybe the mutate filter. I have read some examples, but I am quite new to this and a bit stuck.
Any help would be appreciated. Thanks.
You could use a mutate/gsub filter for this, but that will replace the value of the code field:
filter {
  mutate {
    gsub => [
      "code", "000000", ""
    ]
  }
}
Another option is to use a grok filter like this:
filter {
  grok {
    match => { "code" => "(?<prefix>[a-zA-Z]+)000000%{INT:suffix}" }
    add_field => { "code_short" => "%{prefix}%{suffix}" }
  }
}
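A variation on the gsub approach that leaves code untouched is to copy the value into code_short first and strip the zeros from the copy. This is a minimal sketch; it uses two mutate blocks on the assumption that add_field is applied only after a mutate's own operations have run, so the gsub needs its own block.

```
filter {
  if [code] {
    mutate {
      # Copy the original value so code itself is left as it is
      add_field => { "code_short" => "%{code}" }
    }
    mutate {
      # Remove the run of six zeros from the copy only
      gsub => [ "code_short", "000000", "" ]
    }
  }
}
```

The if [code] guard avoids creating a code_short field containing the literal text %{code} on events that have no code field.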

Logstash: How to use date/time in a filename as an imported field

I have a bunch of log files that are named as 'XXXXXX_XX_yymmdd_hh:mm:ss.txt' - I need to include the date and time (separate fields) from the filename in fields that are added to Logstash.
Can anyone help?
Thanks
Use a grok filter to extract the date and time:
filter {
  grok {
    match => [
      "path",
      "^%{GREEDYDATA}/[^/]+_%{INT:date}_%{TIME:time}\.txt$"
    ]
  }
}
Depending on what goes instead of XXXXXX_XX you might prefer a stricter expression. Also, GREEDYDATA isn't very efficient. This might yield better performance:
filter {
  grok {
    match => [
      "path", "^(?:/[^/]+)+/[^/]+_%{INT:date}_%{TIME:time}\.txt$"
    ]
  }
}
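If a real timestamp is wanted in addition to the two separate string fields, the captured pieces can be joined and handed to the date filter. This is only a sketch: the field names file_timestamp and file_time are hypothetical, and the format string assumes the yymmdd_hh:mm:ss layout described in the question.

```
filter {
  mutate {
    # Join the two captured pieces into one temporary field
    add_field => { "file_timestamp" => "%{date} %{time}" }
  }
  date {
    # yyMMdd HH:mm:ss mirrors the yymmdd and hh:mm:ss parts of the filename
    match => [ "file_timestamp", "yyMMdd HH:mm:ss" ]
    target => "file_time"
    # The temporary field is removed only when parsing succeeds
    remove_field => [ "file_timestamp" ]
  }
}
```

The date filter interprets the value in the Logstash host's time zone unless its timezone option is set.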
