grokdebugger validates entries of a log that logstash eventually refuses - elasticsearch

Using the grok debugger, I've adapted what I found on the Internet for my first attempt at handling Logback/Spring Boot style logs.
Here is a log entry sent to grokdebugger:
2022-03-09 06:35:15,821 [http-nio-9090-exec-1] WARN org.springdoc.core.OpenAPIService - found more than one OpenAPIDefinition class. springdoc-openapi will be using the first one found.
with the grok pattern:
(?<timestamp>%{YEAR}-%{MONTHNUM}-%{MONTHDAY} %{TIME}) \[(?<thread>(.*?)+)\] %{LOGLEVEL:level}\s+%{GREEDYDATA:class} - (?<logmessage>.*)
and it splits the content into fields as intended:
{
  "timestamp": [["2022-03-09 06:35:15,821"]],
  "YEAR": [["2022"]],
  "MONTHNUM": [["03"]],
  "MONTHDAY": [["09"]],
  "TIME": [["06:35:15,821"]],
  "HOUR": [["06"]],
  "MINUTE": [["35"]],
  "SECOND": [["15,821"]],
  "thread": [["http-nio-9090-exec-1"]],
  "level": [["WARN"]],
  "class": [["org.springdoc.core.OpenAPIService"]],
  "logmessage": [["found more than one OpenAPIDefinition class. springdoc-openapi will be using the first one found."]]
}
But when I try to do the same thing inside Logstash, with this input declaration in my configuration:
input {
  file {
    path => "/home/lebihan/dev/Java/comptes-france/metier-et-gestion/dev/ApplicationMetierEtGestion/sparkMetier.log"
    codec => multiline {
      pattern => "^%{YEAR}-%{MONTHNUM}-%{MONTHDAY} %{TIME}.*"
      negate => "true"
      what => "previous"
    }
  }
}
and this filter declaration:
filter {
  #If log line contains tab character followed by 'at' then we will tag that entry as stacktrace
  if [message] =~ "\tat" {
    grok {
      match => ["message", "^(\tat)"]
      add_tag => ["stacktrace"]
    }
  }
  grok {
    match => [ "message",
      "(?<timestamp>%{YEAR}-%{MONTHNUM}-%{MONTHDAY} %{TIME}) \[(?<thread>(.*?)+)\] %{LOGLEVEL:level}\s+%{GREEDYDATA:class} - (?<logmessage>.*)"
    ]
  }
  date {
    match => [ "timestamp" , "yyyy-MM-dd HH:mm:ss.SSS" ]
  }
}
But it fails to parse the log, and I don't know how to get more detail about the underlying error behind the _grokparsefailure tag.

The main cause of my trouble was this:
grok {
  match => [
instead of:
grok {
  match => {
But after that, I also had to change:
the timestamp definition to %{TIMESTAMP_ISO8601:timestamp}
the date match pattern
and add a target to the date filter, to avoid a _dateparsefailure.
The resulting event in Kibana then looks like this:
@timestamp: Mar 16, 2022 @ 09:14:22.002
@version: 1
class: f.e.service.AbstractSparkDataset
host: debian
level: INFO
logmessage: Un dataset a été sauvegardé dans le fichier parquet /data/tmp/balanceComptesCommunes_2019_2019.
thread: http-nio-9090-exec-10
timestamp: 2022-03-16T06:34:09.394Z
_id: 8R_KkX8BBIYNTaMw1Jfg
_index: ecoemploimetier-2022.03.16
_score: -
_type: _doc
I eventually corrected my Logstash config file like this:
input {
  file {
    path => "/home/[...]/myLog.log"
    sincedb_path => "/dev/null"
    start_position => "beginning"
    codec => multiline {
      pattern => "^%{YEAR}-%{MONTHNUM}-%{MONTHDAY} %{TIME}.*"
      negate => "true"
      what => "previous"
    }
  }
}

filter {
  #If log line contains tab character followed by 'at' then we will tag that entry as stacktrace
  if [message] =~ "\tat" {
    grok {
      match => ["message", "^(\tat)"]
      add_tag => ["stacktrace"]
    }
  }
  grok {
    match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} \[(?<thread>(.*?)+)\] %{LOGLEVEL:level} %{GREEDYDATA:class} - (?<logmessage>.*)" }
  }
  date {
    # 2022-03-16 07:32:24,860
    match => [ "timestamp" , "yyyy-MM-dd HH:mm:ss,SSS" ]
    target => "timestamp"
  }
  # If there was no parsing error, drop the original, unparsed message
  if "_grokparsefailure" not in [tags] {
    mutate {
      remove_field => [ "message", "path" ]
    }
  }
}

output {
  stdout { codec => rubydebug }
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "ecoemploimetier-%{+YYYY.MM.dd}"
  }
}
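To iterate on the grok pattern itself, a throwaway configuration along these lines can also help (a minimal sketch, separate from the setup above): paste sample log lines on stdin and watch the rubydebug output for _grokparsefailure tags.
input {
  stdin { }
}
filter {
  grok {
    match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} \[(?<thread>(.*?)+)\] %{LOGLEVEL:level} %{GREEDYDATA:class} - (?<logmessage>.*)" }
  }
}
output {
  stdout { codec => rubydebug }
}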

Related

Have a key value pair as logstash output, by only using grok filter

I am working on a Spring Boot project and using the ELK stack for logging and auditing. I need a logstash.conf file that processes the logs so that the output can contain dynamic key-value pairs. This output data will be used for auditing.
Adding an example for better clarity.
Sample log:
[INFO] [3b1d04f219fc43d18ccb6cb22db6cff4] 2021-10-13_13:43:09.074 Audit_ key1:value1| key2:value2| key3:value3| keyN:valueN
Required logstash output:
{
  "logLevel": [["INFO"]],
  "threadId": [["3b1d04f219fc43d18ccb6cb22db6cff4"]],
  "timeStamp": [["2021-10-13_13:43:09.074"]],
  "class": [["Audit_"]],
  "key1": [["value1"]],
  "key2": [["value2"]],
  "key3": [["value3"]],
  "keyN": [["valueN"]]
}
Note:
"key" will always be a word or string value
"value" can be a word, a number, or a sentence (a string with spaces)
":" is the separator between key and value
"|" is the separator between key-value pairs
The number of key-value pairs can vary.
Can someone suggest or help me with the match pattern to be used here? I am only allowed to use the grok filter.
Thank you for the guidance, Filip and leandrojmp!
Using only a grok filter for this would make it very complex, and it also wouldn't support dynamic key-value pairs.
So I went with a grok filter followed by a kv filter, and this approach worked for me.
Sample Log:
[INFO] [3b1d04f219fc43d18ccb6cb22db6cff4] 2021-10-13_13:43:09.074 _Audit_ key1:value1| key2:value2| key3:value3| keyN:valueN
logstash.conf file:
input {
  beats {
    port => "5044"
  }
}

filter {
  grok {
    match => { "message" => "\[%{LOGLEVEL:logLevel}\]\ \[%{WORD:traceId}\]\ (?<timestamp>[0-9\-_:\.]*)\ %{WORD:class}\ %{GREEDYDATA:message}" }
    overwrite => [ "message" ]
  }
  if [class] == "_Audit_" {
    kv {
      source => "message"
      # separators matching the sample log above: "|" between pairs, ":" between key and value
      field_split => "|"
      value_split => ":"
      # drop the space that follows each "|" in the sample
      trim_key => " "
      remove_field => ["message"]
    }
  }
}
output {
  if [class] == "_Audit_" {
    elasticsearch {
      hosts => ["localhost:9200"]
      index => "audit-logs-%{+YYYY.MM.dd}"
    }
  }
  else {
    elasticsearch {
      hosts => ["localhost:9200"]
      index => "normal-logs-%{+YYYY.MM.dd}"
    }
  }
}

Logstash nginx filter doesn't apply to half of rows

Using filebeat to push nginx logs to logstash and then to elasticsearch.
Logstash filter:
filter {
  if [fileset][module] == "nginx" {
    if [fileset][name] == "access" {
      grok {
        match => { "message" => ["%{IPORHOST:[nginx][access][remote_ip]} - %{DATA:[nginx][access][user_name]} \[%{HTTPDATE:[nginx][access][time]}\] \"%{WORD:[nginx][access][method]} %{DATA:[nginx][access][url]} HTTP/%{NUMBER:[nginx][access][http_version]}\" %{NUMBER:[nginx][access][response_code]} %{NUMBER:[nginx][access][body_sent][bytes]} \"%{DATA:[nginx][access][referrer]}\" \"%{DATA:[nginx][access][agent]}\""] }
        remove_field => "message"
      }
      mutate {
        add_field => { "read_timestamp" => "%{@timestamp}" }
      }
      date {
        match => [ "[nginx][access][time]", "dd/MMM/YYYY:H:m:s Z" ]
        remove_field => "[nginx][access][time]"
      }
      useragent {
        source => "[nginx][access][agent]"
        target => "[nginx][access][user_agent]"
        remove_field => "[nginx][access][agent]"
      }
      geoip {
        source => "[nginx][access][remote_ip]"
        target => "[nginx][access][geoip]"
      }
    }
    else if [fileset][name] == "error" {
      grok {
        match => { "message" => ["%{DATA:[nginx][error][time]} \[%{DATA:[nginx][error][level]}\] %{NUMBER:[nginx][error][pid]}#%{NUMBER:[nginx][error][tid]}: (\*%{NUMBER:[nginx][error][connection_id]} )?%{GREEDYDATA:[nginx][error][message]}"] }
        remove_field => "message"
      }
      mutate {
        rename => { "@timestamp" => "read_timestamp" }
      }
      date {
        match => [ "[nginx][error][time]", "YYYY/MM/dd H:m:s" ]
        remove_field => "[nginx][error][time]"
      }
    }
  }
}
There is just one file, /var/log/nginx/access.log.
In Kibana, I see that roughly half of the rows have a parsed message and the other half do not.
All of the rows in Kibana have the tag "beats_input_codec_plain_applied".
Examples from filebeat -e
Row that works fine:
"source": "/var/log/nginx/access.log",
"offset": 5405195,
"message": "...",
"fileset": {
"module": "nginx",
"name": "access"
}
Row that doesn't work fine (no "fileset"):
"offset": 5405397,
"message": "...",
"source": "/var/log/nginx/access.log"
Any idea what could be the cause?
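Note that because the whole filter is wrapped in if [fileset][module] == "nginx", events that arrive without that field skip the grok entirely and are indexed unparsed. A small debugging sketch (not a fix, and the tag name is arbitrary) to confirm whether those are the unparsed half:
filter {
  if ![fileset][module] {
    # hypothetical debugging aid: mark events that arrive without fileset metadata
    mutate { add_tag => ["missing_fileset_module"] }
  }
}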

How to use the logstash mutate or ruby filter

I have the following JSON:
{
  "result": {
    "entities": {
      "SERVICE-CCC89FB0A922657A": "service1",
      "SERVICE-D279F46CD751424F": "service2",
      "SERVICE-7AB760E70FCDCA18": "service3"
    },
    "dataPoints": {
      "SERVICE-CCC89FB0A922657A": [
        [1489734240000, 1101.0],
        [1489734300000, null]
      ],
      "SERVICE-7AB760E70FCDCA18": [
        [1489734240000, 4080800.5470588235],
        [1489734300000, null]
      ],
      "SERVICE-D279F46CD751424F": [
        [1489734240000, 26677.695652173912],
        [1489734300000, null]
      ]
    }
  },
  "@timestamp": "2017-03-17T07:05:37.531Z",
  "data": "data",
  "@version": "1"
}
I want to transform it into the following before indexing it into Elasticsearch:
{
  "@timestamp": "2017-03-17T07:05:37.531Z",
  "data": "data",
  "@version": "1",
  "data": {
    "service1": [
      [1489734240000, 1101.0],
      [1489734300000, null]
    ],
    "service3": [
      [1489734240000, 4080800.5470588235],
      [1489734300000, null]
    ],
    "service2": [
      [1489734240000, 26677.695652173912],
      [1489734300000, null]
    ]
  }
}
These are the contents of my current Logstash conf file:
input {
  http_poller {
    urls => {
      test => {
        method => get
        url => "https://xxxx.com"
        headers => {
          Accept => "application/json"
        }
      }
    }
    request_timeout => 60
    schedule => { every => "60s" }
    codec => "plain"
  }
}

filter {
  json {
    source => "message"
    remove_field => ["[result][aggregationType]","message"]
  }
  # translate {
  # }
  # mutate {
  # }
  # ruby {
  # }
}

output {
  stdout {
    codec => rubydebug {
      #metadata => true
    }
  }
  elasticsearch {
    hosts => ["http://192.168.0.36:9200"]
  }
}
I have only just started using Elasticsearch, and I do not know which filter to use or how to implement it.
I wonder whether this can be done with the mutate filter's rename option,
or whether I should write code with a ruby filter.
The entities probably need to be iterated with a ruby filter to match the SERVICE-* keys of dataPoints.
However, I find the Ruby code difficult.
I would appreciate your help.
Thank you.
Here are a couple of the filters that are used with Logstash:
https://www.elastic.co/guide/en/logstash/current/plugins-filters-json.html
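For what it's worth, the renaming described in the question can be sketched with a ruby filter that walks dataPoints and looks up each SERVICE-* id in entities. This is a rough, untested sketch; the target field name data mirrors the desired output above (and would overwrite the existing data field):
filter {
  json {
    source => "message"
    remove_field => ["[result][aggregationType]", "message"]
  }
  ruby {
    code => '
      entities   = event.get("[result][entities]")   || {}
      datapoints = event.get("[result][dataPoints]") || {}
      renamed = {}
      datapoints.each do |service_id, points|
        # use the human-readable service name when one exists, otherwise keep the id
        renamed[entities[service_id] || service_id] = points
      end
      event.set("data", renamed)
      event.remove("result")
    '
  }
}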

Elasticsearch, Logstash and Kibana for pfsense logs with geo location

I want to create a tile map in Kibana to show source IPs from countries around the world.
When trying to set up a tile map, I get an error saying that "The "logstash-*" index pattern does not contain any of the following field types: geo_point".
I've googled the problem and found this link: https://github.com/elastic/logstash/issues/3137. At the end of that page it states that this is fixed in 2.x, but I am on 2.1.
Here are my configs:
1inputs.conf:
input {
  udp {
    type => "syslog"
    port => 5140
  }
}
5pfsense.conf:
filter {
  # Replace with your IP
  if [host] =~ /10\.1\.15\.200/ {
    grok {
      match => [ 'message', '.* %{WORD:program}:%{GREEDYDATA:rest}' ]
    }
    if [program] == "filterlog" {
      # Grab fields up to IP version. The rest will vary depending on IP version.
      grok {
        match => [ 'rest', '%{INT:rule_number},%{INT:sub_rule_number},,%{INT:tracker_id},%{WORD:interface},%{WORD:reason},%{WORD:action},%{WORD:direction},%{WORD:ip_version},%{GREEDYDATA:rest2}' ]
      }
    }
    mutate {
      replace => [ 'message', '%{rest2}' ]
    }
    if [ip_version] == "4" {
      # IPv4. Grab field up to dest_ip. Rest can vary.
      grok {
        match => [ 'message', '%{WORD:tos},,%{INT:ttl},%{INT:id},%{INT:offset},%{WORD:flags},%{INT:protocol_id},%{WORD:protocol},%{INT:length},%{IP:src_ip},%{IP:dest_ip},%{GREEDYDATA:rest3}' ]
      }
    }
    if [protocol_id] != 2 {
      # Non-IGMP has more fields.
      grok {
        match => [ 'rest3', '^%{INT:src_port:int},%{INT:dest_port:int}' ]
      }
    }
    else {
      # IPv6. Grab field up to dest_ip. Rest can vary.
      grok {
        match => [ 'message', '%{WORD:class},%{WORD:flow_label},%{INT:hop_limit},%{WORD:protocol},%{INT:protocol_id},%{INT:length},%{IPV6:src_ip},%{IPV6:dest_ip},%{GREEDYDATA:rest3}' ]
      }
    }
    mutate {
      replace => [ 'message', '%{rest3}' ]
      lowercase => [ 'protocol' ]
    }
    if [message] {
      # Non-ICMP has more fields
      grok {
        match => [ 'message', '^%{INT:src_port:int},%{INT:dest_port:int},%{INT:data_length}' ]
      }
    }
    mutate {
      remove_field => [ 'message' ]
      remove_field => [ 'rest' ]
      remove_field => [ 'rest2' ]
      remove_field => [ 'rest3' ]
      remove_tag => [ '_grokparsefailure' ]
      add_tag => [ 'packetfilter' ]
    }
    geoip {
      add_tag => [ "GeoIP" ]
      source => "src_ip"
    }
  }
}
Lastly, the 50outputs.conf:
output {
  elasticsearch {
    hosts => localhost
    index => "logstash-%{+YYYY.MM.dd}"
    template_overwrite => "true"
  }
  stdout { codec => rubydebug }
}

Adding fields depending on event message in Logstash not working

I have ELK installed and working on my machine, but now I want to do more complex filtering and add fields depending on the event message.
Specifically, I want to set "id_error" and "descripcio" depending on the message pattern.
I have been trying a lot of code combinations in the "logstash.conf" file, but I am not able to get the expected behavior.
Can someone tell me what I am doing wrong, what I have to do, or whether this is simply not possible? Thanks in advance.
This is my "logstash.conf" file with the last test I made, which results in no events being captured in Kibana:
input {
  file {
    path => "C:\xxx.log"
  }
}
filter {
  grok {
    patterns_dir => "C:\elk\patterns"
    match => [ "message", "%{ERROR2:error2}" ]
    add_field => [ "id_error", "2" ]
    add_field => [ "descripcio", "error2!!!" ]
  }
  grok {
    patterns_dir => "C:\elk\patterns"
    match => [ "message", "%{ERROR1:error1}" ]
    add_field => [ "id_error", "1" ]
    add_field => [ "descripcio", "error1!!!" ]
  }
  if ("_grokparsefailure" in [tags]) { drop {} }
}
output {
  elasticsearch {
    host => "localhost"
    protocol => "http"
    index => "xxx-%{+YYYY.MM.dd}"
  }
}
I have also tried the following code, which results in the fields "id_error" and "descripcio" containing both values ("[1,2]" and "[error1!!!,error2!!!]" respectively) in each matched event.
Since "break_on_match" is true by default, I expected to get only the fields attached to the matching clause, but this is not what happens.
input {
  file {
    path => "C:\xxx.log"
  }
}
filter {
  grok {
    patterns_dir => "C:\elk\patterns"
    match => [ "message", "%{ERROR1:error1}" ]
    add_field => [ "id_error", "1" ]
    add_field => [ "descripcio", "error1!!!" ]
    match => [ "message", "%{ERROR2:error2}" ]
    add_field => [ "id_error", "2" ]
    add_field => [ "descripcio", "error2!!!" ]
  }
  if ("_grokparsefailure" in [tags]) { drop {} }
}
output {
  elasticsearch {
    host => "localhost"
    protocol => "http"
    index => "xxx-%{+YYYY.MM.dd}"
  }
}
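As far as I can tell, this happens because break_on_match only stops grok from trying further patterns; the add_field decorations configured on a grok filter are all applied together once the filter succeeds, whichever pattern matched. One way to get per-pattern fields is to move them outside the grok and key off which capture exists, roughly like this (a sketch, assuming the same custom ERROR1/ERROR2 patterns):
filter {
  grok {
    patterns_dir => "C:\elk\patterns"
    # both patterns are tried; the first that matches wins (break_on_match default)
    match => [ "message", "%{ERROR1:error1}" ]
    match => [ "message", "%{ERROR2:error2}" ]
  }
  # decorate based on which capture actually exists
  if [error1] {
    mutate { add_field => { "id_error" => "1" "descripcio" => "error1!!!" } }
  } else if [error2] {
    mutate { add_field => { "id_error" => "2" "descripcio" => "error2!!!" } }
  }
}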
I have solved the problem. I get the expected results with the following code in "logstash.conf":
input {
  file {
    path => "C:\xxx.log"
  }
}
filter {
  grok {
    patterns_dir => "C:\elk\patterns"
    match => [ "message", "%{ERROR1:error1}" ]
    match => [ "message", "%{ERROR2:error2}" ]
  }
  if [message] =~ /error1_regex/ {
    grok {
      patterns_dir => "C:\elk\patterns"
      match => [ "message", "%{ERROR1:error1}" ]
    }
    mutate {
      add_field => [ "id_error", "1" ]
      add_field => [ "descripcio", "Error1!" ]
      remove_field => [ "message" ]
      remove_field => [ "error1" ]
    }
  }
  else if [message] =~ /error2_regex/ {
    grok {
      patterns_dir => "C:\elk\patterns"
      match => [ "message", "%{ERROR2:error2}" ]
    }
    mutate {
      add_field => [ "id_error", "2" ]
      add_field => [ "descripcio", "Error2!" ]
      remove_field => [ "message" ]
      remove_field => [ "error2" ]
    }
  }
  if ("_grokparsefailure" in [tags]) { drop {} }
}
output {
  elasticsearch {
    host => "localhost"
    protocol => "http"
    index => "xxx-%{+YYYY.MM.dd}"
  }
}
