Logstash using gsub - ruby

I would like to use the gsub filter or a ruby code filter to do the following in logstash.
I have a field which is dynamically named eg. P12IP3, P12IP2, P13IP1 etc.
I would like to remove all white space characters in these fields.
However, the following does not seem to work
gsub => ["/(.)IP(.)/"," ",""]
I've tried some variations using ruby code filter as well, but could not get it to work. Can someone suggest a solution?
Sample Conf of what I have tried
grok {
patterns_dir => "/etc/logstash/patterns"
match => [ "message", "iLO %{BASE16NUM:P16F1} %{HLA_TS_1:ts1} / %{BASE16NUM:P16F2}
%{BASE16NUM:P16F3} :
%{BASE16NUM:P16F4} %{BASE16NUM:P16F5} Browser login : OA
Administrator1 \- \ %{IP_HLA:P16IP1} \( DNS name not found \) \." ]
add_tag => [ "pattern", "16" ]
tag_on_failure => []
}
grok {
patterns_dir => "/etc/logstash/patterns"
match => [ "message", "iLO %{BASE16NUM:P17F1} %{HLA_TS_1:ts1} / %{BASE16NUM:P17F2} %{BASE16NUM:P17F3} :
%{BASE16NUM:P17F4} %{BASE16NUM:P17F5} Browser login : OA
Administrator3 \- \ %{IP_HLA:P17IP1} \( DNS name not found \) \." ]
add_tag => [ "pattern", "17" ]
tag_on_failure => []
}
mutate{
gsub => [
"/(.*)IP(.*)/"," ",""
]
}
In the config above there are two IP fields, P16IP1 and P17IP1. I want the gsub mutation to apply to both of them, so that all whitespace is removed from the values of those fields.
Here is a sample input for the first pattern (16):
iLO 2 2012 / 31 / 14 13 : 24 : 01 / 2011 12 : 52 1 Browser login : OA Administrator1 - 15 . 33 . 64 . 119 ( DNS name not found ) .
Here the output for the IP field is currently "P16IP1":"15 . 33 . 64 . 119", what I would like is for the output to be "P16IP1":"15.33.64.119"

Removing all whitespace from a string is easy:
"a \t\n\r\fb".gsub(/\s+/, '') # => "ab"
/\s+/ is the regular expression for "one or more whitespace characters". This is the definition of \s:
/\s/ - A whitespace character: /[ \t\r\n\f]/
If you're trying to match lines containing variants on
P12IP2
P01IP1
P99IP9
then you can use a pattern like:
/P\d{2}IP\d/
http://rubular.com/r/MCnY87DkZv
From there you can capture the leading/trailing characters:
/^(.+)P\d{2}IP\d(.+)/
http://rubular.com/r/HmekyYzXcU
If the number between P and IP can have more or fewer than two digits, adjust the {2} quantifier accordingly (for example {1,3}, or + for one or more digits). See the Regexp documentation for how it works.
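To tie the two ideas together, here is a minimal sketch that strips whitespace only from values whose field name looks like P16IP1. It works on a plain Hash standing in for the event; the FIELD_NAME pattern and the strip_ip_whitespace helper are hypothetical names, and wiring this into a Logstash ruby filter (looping over the event's fields) is an assumption that is not shown here.

```ruby
# Field names such as P12IP3 or P16IP1: "P", digits, "IP", digits.
FIELD_NAME = /\AP\d+IP\d+\z/

# Returns a copy of the fields with whitespace removed from matching string values.
def strip_ip_whitespace(fields)
  fields.each_with_object({}) do |(name, value), cleaned|
    cleaned[name] =
      if name.match?(FIELD_NAME) && value.is_a?(String)
        value.gsub(/\s+/, "")
      else
        value
      end
  end
end

# Hypothetical event data modelled on the sample input in the question.
event = { "P16IP1" => "15 . 33 . 64 . 119", "P16F1" => "2", "message" => "iLO 2 2012" }
cleaned = strip_ip_whitespace(event)
puts cleaned["P16IP1"]
```

Fields whose names do not match, such as message, are passed through unchanged.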


Sed print the items in square brackets that's after the occurence of a text

I have the following Scenarios:
Scenario 1
foo_bar = ["123", "456", "789"]
Scenario 2
foo_bar = [
"123",
"456",
"789"
]
Scenario 3
variable "foo_bar" {
type = list(string)
default = ["123", "456", "789"]
}
I'm trying to figure out how to use sed to print the items inside the brackets that belong to foo_bar, including Scenario 2, where the list spans multiple lines.
The resulting matches would be:
Scenario 1
"123", "456", "789"
Scenario 2
"123",
"456",
"789"
Scenario 3
"123", "456", "789"
In the case of
not_foo_bar = [
"123",
"456",
"789"
]
This should not match; only foo_bar should match.
This is what I've tried so far
sed -e '1,/foo_bar/d' -e '/]/,$d' test.tf
And this
sed -n 's/.*\foo_bar\(.*\)\].*/\1/p' test.tf
This is a mouthful, but it’s POSIX sed and works.
sed -Ene \
'# scenario 1
s/(([^[:alnum:]_]|^)foo_bar[^[:alnum:]_][[:space:]]*=[[:space:]]*\[)([^]]+)(\]$)/\3/p
# scenario 2 and 3
/([^[:alnum:]_]|^)foo_bar[^[:alnum:]_][[:space:]]*=?[[:space:]]*[[{][[:space:]]*$/,/^[]}]$/ {
//!p
s/(([^[:alnum:]_]|^)default[^[:alnum:]_][[:space:]]*=[[:space:]]*\[)([^]]+)(\]$)/\3/p
}' test.tf |
# filter out unwanted lines from scenario 3 ("type =")
sed -n '/^[[:space:]]*"/p'
I couldn’t quite get it all in a single sed.
The first and last substitution commands in the first sed are the same, except that the last one matches default instead of foo_bar.
Edit: in case it confuses anyone, the final [[:space:]]* in the second, very long regex was left in by mistake. I won't edit it out, but it is neither vital nor consistent: the other patterns don't allow for trailing whitespace at the end of a line.
This might work for you (GNU sed):
sed -En '/foo_bar/{:a;/.*\[([^]]*)\].*/!{N;ba};s//\1/p}' file
The -nE options turn off implicit printing and turn on extended regular expressions.
On a line matching foo_bar, keep appending lines until the pattern space contains a [ ... ] pair, then print whatever is between the brackets.
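For comparison only, the same extraction can be sketched in Ruby rather than sed. This is not a translation of either sed script: it is a single regex, under the assumption that the whole file fits in memory, and FOO_BAR_LIST and foo_bar_items are hypothetical names.

```ruby
# foo_bar as a whole identifier (so not_foo_bar is skipped), followed by the
# first [...] pair; the capture group holds whatever is between the brackets.
FOO_BAR_LIST = /(?<!\w)foo_bar(?!\w)[^\[\]]*\[([^\]]*)\]/

def foo_bar_items(text)
  text.scan(FOO_BAR_LIST).flatten.map(&:strip)
end

scenario1 = 'foo_bar = ["123", "456", "789"]'
scenario2 = "foo_bar = [\n\"123\",\n\"456\",\n\"789\"\n]"
scenario3 = "variable \"foo_bar\" {\ntype = list(string)\ndefault = [\"123\", \"456\", \"789\"]\n}"
negative  = "not_foo_bar = [\n\"123\",\n\"456\",\n\"789\"\n]"

[scenario1, scenario2, scenario3, negative].each do |text|
  puts foo_bar_items(text).inspect
end
```

One limitation of this sketch: the regex takes the first bracket pair after foo_bar, wherever it is, so a foo_bar that is not followed by a list would pick up the next list in the file.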

grok not reading a word with hyphen

This is my log line:
2017-09-25 08:58:17,861 p=14774 u=ec2-user | 14774 1506329897.86160: checking for any_errors_fatal
I'm trying to read the user, but grok gives me only ec2 rather than the full word ec2-user.
Sorry, I'm new to the grok filter.
My current pattern :
%{TIMESTAMP_ISO8601:timestamp} p=%{WORD:process_id} u=%{WORD:user_id}
Current output :
...
...
...
"process_id": [
[
"14774"
]
],
"user_id": [
[
"ec2"
]
]
}
WORD is defined as "\b\w+\b"
See https://github.com/logstash-plugins/logstash-patterns-core/blob/master/patterns/grok-patterns
\b is a word boundary
\w matches a single alphanumeric character (an alphabetic character, or a decimal digit) or "_"
+ means one or more of the previous item. So \w+ means one or more word characters
Note that \w does NOT match -
So to make it work instead of WORD use
(?<user_id>\b[\w\-]+\b)
This does not use the predefined grok patterns but a "raw" regexp
the (?<name>...) named-capture syntax is used instead of %{...} because it is a "raw" regexp
\- means a literal - sign
[ ] means a character class. So [\w\-] will match everything \w does, and - as well
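The difference is easy to check in plain Ruby, whose regex engine is closely related to the Oniguruma syntax grok patterns are written in. This is only a sketch of the regex behaviour on the sample log line, not of the grok filter itself; the variable names are made up for the example.

```ruby
line = "2017-09-25 08:58:17,861 p=14774 u=ec2-user | 14774 1506329897.86160: checking for any_errors_fatal"

# \w+ stops at the hyphen, because - is not a word character.
with_word = line[/u=(\b\w+\b)/, 1]

# Adding - to the character class lets the match continue through the hyphen.
with_hyphen = line.match(/u=(?<user_id>\b[\w\-]+\b)/)[:user_id]

puts with_word   # ec2
puts with_hyphen # ec2-user
```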
Input
Allow1-2 : Success
Grok Filter
(?:%{GREEDYDATA:Output}?|-)
Result
{"Output":[["Allow1-2 : Success"]]}

How to match any pattern by ignoring any special character in Logstash?

I am writing a grok pattern for a switch log. I can't work out how to ignore the "%" character in a log entry such as %DAEMON-3-SYSTEM_MSG
The complete log line is:
Jul 16 21:06:50 %DAEMON-3-SYSTEM_MSG: Un-parsable frequency in /mnt/pss/ntp.drift
This can be done using the plain % character. A not very efficient example:
%%{NOTSPACE:switch_source}: %{GREEDYDATA:switch_message}
Which will set:
{
"switch_source": [
[
"DAEMON-3-SYSTEM_MSG"
]
],
"switch_message": [
[
"Un-parsable frequency in /mnt/pss/ntp.drift"
]
]
}
The percent sign is not a special character in Oniguruma regex, so you don't have to escape it. Problems only arise when % is followed by { and a later }, because grok reads %{...} as a pattern reference. Your log snippet doesn't seem to contain that sequence.
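The same point can be checked with an ordinary Ruby regex, where % is also an ordinary character. The sketch below mirrors the grok pattern above, using \S+ in place of NOTSPACE and .* in place of GREEDYDATA; SWITCH_LINE is a hypothetical name.

```ruby
line = "Jul 16 21:06:50 %DAEMON-3-SYSTEM_MSG: Un-parsable frequency in /mnt/pss/ntp.drift"

# The leading % is matched literally, with no escaping.
SWITCH_LINE = /%(?<switch_source>\S+): (?<switch_message>.*)/

match = SWITCH_LINE.match(line)
switch_source  = match[:switch_source]
switch_message = match[:switch_message]

puts switch_source
puts switch_message
```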

what would the regular expression to extract the 3 from be?

I basically need to get the bit after the last pipe
"3083505|07733366638|3"
What would the regular expression for this be?
You can do this without regex. Here:
"3083505|07733366638|3".split("|").last
# => "3"
With a regex (assuming the value is always an integer):
"3083505|07733366638|3".scan(/\|(\d+)$/)[0][0] # or use \w+ if you want to extract any word after `|`
# => "3"
Try this regex :
.*\|(.*)
It returns whatever comes after LAST | .
You could do that most easily by using String#rindex:
line = "3083505|07733366638|37"
line[line.rindex('|')+1..-1]
#=> "37"
If you insist on using a regex:
r = /
.* # match any number of any character (greedily!)
\| # match pipe
(.+) # match one or more characters in capture group 1
/x # extended mode
line[r,1]
#=> "37"
Alternatively:
r = /
.* # match any number of any character (greedily!)
\| # match pipe
\K # forget everything matched so far
.+ # match one or more characters
/x # extended mode
line[r]
#=> "37"
or, as suggested by @engineersmnky in a comment on @shivam's answer:
r = /
(?<=\|) # match a pipe in a positive lookbehind
\d+ # match any number of digits
\z # match end of string
/x # extended mode
line[r]
#=> "37"
I would use split and last, but you could also do
last_field = line.sub(/.+\|/, "")
That removes every character up to and including the last pipe.
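As a quick side-by-side check, the approaches from the answers above can be run on the same input. This is just a sketch; the by_* variable names are made up for the comparison.

```ruby
line = "3083505|07733366638|37"

by_split  = line.split("|").last             # no regex
by_rindex = line[line.rindex("|") + 1..-1]   # slice after the last pipe
by_regex  = line[/.*\|(.+)/, 1]              # greedy .* runs to the last pipe
by_sub    = line.sub(/.+\|/, "")             # delete through the last pipe

puts [by_split, by_rindex, by_regex, by_sub].inspect
```

They differ when the input contains no pipe at all: split and sub return the whole string, the regex lookup returns nil, and the rindex version raises an error because rindex returns nil.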

What is the Ruby regex to match a string with at least one period and no spaces?

What is the regex to match a string with at least one period and no spaces?
You can use this :
/^\S*\.\S*$/
It works like this :
^ <-- Start of a line (in Ruby; use \A for start of string)
\S <-- Any character but white spaces (notice the upper case) (roughly the same as [^ \t\r\n\f])
* <-- Repeated but not mandatory
\. <-- A period
\S <-- Any character but white spaces
* <-- Repeated but not mandatory
$ <-- End of a line (in Ruby; use \z for end of string)
You can replace \S by [^ ] to work strictly with spaces (not with tabs etc.)
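One Ruby-specific caveat: ^ and $ anchor at line boundaries, so a multi-line string can pass the check if any single line qualifies. The sketch below contrasts the pattern above with a variant anchored by \A and \z; the constant names are hypothetical.

```ruby
LINE_ANCHORED   = /^\S*\.\S*$/    # the pattern from the answer
STRING_ANCHORED = /\A\S*\.\S*\z/  # same pattern, anchored to the whole string

samples = ["test.txt", "test with spaces.txt", "bad line\nfoo.bar"]

samples.each do |s|
  puts "#{s.inspect}: line=#{s.match?(LINE_ANCHORED)} string=#{s.match?(STRING_ANCHORED)}"
end
```

The last sample contains a space, yet the line-anchored pattern accepts it because its second line, foo.bar, matches on its own.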
Something like
^[^ ]*\.[^ ]*$
(match any non-spaces, then a period, then some more non-spaces)
There is no need for a regular expression. Keep it simple:
>> s="test.txt"
=> "test.txt"
>> s["."] and s.count(" ")<1
=> true
>> s="test with spaces.txt"
=> "test with spaces.txt"
>> s["."] and s.count(" ")<1
=> false
Try this:
/^\S*\.\S*$/
