Logstash to kibana multiline not working for custom message - elasticsearch

I am trying to push my log file through Logstash to Elasticsearch and display it on Kibana. It works fine for single-line log records; however, it fails when it comes to the multiline filter.
Here is my sample multiline log input:
2016-06-02T04:02:29,720 INFO Thread-25-match-entity-bolt a52488cc-316b-402e-af58-3b8a663cd76a STDIO invoke Error processing message:{
"eid": "f9f16541-4fab-4131-a82e-e3ddf6fcd949",
"entityInfo": {
"entityType": "style",
"defaultLocale": "en-US"
},
"systemInfo": {
"tenantId": "t1"
},
"attributesInfo": {
"externalId": 1514,
"attributesRead": {
"IsEntityVariantsValid": false,
"IsEntityExtensionsValid": false
},
"attributesUpdated": {
"DateAttribute": "2016-06-01T00:00:00.0000000",
"IsEntitySelfValid": true,
"IsEntityMetaDataValid": true,
"IsEntityCommonAttributesValid": true,
"IsEntityCategoryAttributesValid": true,
"IsEntityRelationshipsValid": true
}
},
"jsAttributesInfo": {
"jsRelationship": {
"entityId": "CottonMaterial001",
"parentEntityId": "Apparel",
"category": "Apparel",
"categoryName": "Apparel",
"categoryPath": "Apparel",
"categoryNamePath": "Apparel",
"variant": "1514",
"variantPath": "1035/1514",
"container": "Demo Master",
"containerName": "Demo Master",
"containerPath": "DemoOrg/Demo Master/Apparel",
"organization": "DemoOrg",
"segment": "A"
},
"jsChangeContext": {
"entityAction": "update",
"user": "cfadmin",
"changeAgent": "EntityEditor.aspx",
"changeAgentType": "PIM",
"changeInterface": "Entity",
"sourceTimestamp": "2016-06-01T19:48:19.4162475+05:30",
"ingestTimestamp": "2016-06-01T19:48:19.4162475+05:30"
}
}
}
I have tried these logstash configs so far:
input {
file {
path => "path_to_logs/logs.log"
start_position => "beginning"
}
}
filter{
multiline {
negate => "true"
pattern => "^%{TIMESTAMP_ISO8601} "
what => "previous"
}
grok{
match => { "message" => "^%{TIMESTAMP_ISO8601:JigsawTimestamp}%{SPACE}%{LOGLEVEL:JigsawLoglevel}%{SPACE}%{HOSTNAME:ThreadName}%{SPACE}%{UUID:GUID}%{SPACE}%{JAVACLASS:JigsawClassName}%{SPACE}%{WORD:JigsawMethodName}%{SPACE}%{GREEDYDATA:JigsawLogMessage}" }
}
}
output {
if "_grokparsefailure" not in [tags] {
elasticsearch {
hosts => ["localhost:9200"]
}
}
}
The second one:
input {
file {
path => "path_to_logs/logs.log"
start_position => "beginning"
codec => multiline {
negate => "true"
pattern => "^%{TIMESTAMP_ISO8601} "
what => "previous"
}
}
}
filter{
grok{
match => { "message" => "^%{TIMESTAMP_ISO8601:JigsawTimestamp}%{SPACE}%{LOGLEVEL:JigsawLoglevel}%{SPACE}%{HOSTNAME:ThreadName}%{SPACE}%{UUID:GUID}%{SPACE}%{JAVACLASS:JigsawClassName}%{SPACE}%{WORD:JigsawMethodName}%{SPACE}%{GREEDYDATA:JigsawLogMessage}" }
}
}
output {
if "_grokparsefailure" not in [tags] {
elasticsearch {
hosts => ["localhost:9200"]
}
}
}
I tried this pattern as well:
pattern => "^\s"
However, none of this helped: all of them got the _grokparsefailure tag. I want the JSON lines to be part of a single message. Please point out the mistake in this filter.

There are a couple of mistakes in your grok filter that prevent you from seeing any logs.
In your sample data there are 2 spaces after INFO.
For the field JigsawClassName you are using JAVACLASS, which is wrong for your log.
Why is JAVACLASS wrong?
Its definition is:
JAVACLASS (?:[a-zA-Z0-9-]+\.)+[A-Za-z0-9$]+
As per the above, JAVACLASS requires at least one period (.) to appear in the text, but in your log it is just STDIO.
Replace your grok match with the following:
match => { "message" => "^%{TIMESTAMP_ISO8601:JigsawTimestamp}%{SPACE}%{LOGLEVEL:JigsawLoglevel}%{SPACE}%{SPACE}%{HOSTNAME:ThreadName}%{SPACE}%{UUID:GUID}%{SPACE}%{WORD:JigsawClassName}%{SPACE}%{WORD:JigsawMethodName}%{SPACE}%{GREEDYDATA:JigsawLogMessage}" }
Also, to make debugging easier, redirect the output to the console as well by adding the stdout plugin, as shown below:
output {
if "_grokparsefailure" not in [tags] {
elasticsearch {
hosts => ["localhost:9200"]
}
stdout { codec => rubydebug }
}
}
This will make it easier for you to understand errors while processing data with Logstash.
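Putting it together, a sketch of the whole pipeline, combining the multiline codec from your second attempt with the corrected grok match (the file path and Elasticsearch host are the placeholders from your question), might look like this:
input {
  file {
    path => "path_to_logs/logs.log"
    start_position => "beginning"
    # join every line that does NOT start with a timestamp onto the previous line
    codec => multiline {
      pattern => "^%{TIMESTAMP_ISO8601} "
      negate => "true"
      what => "previous"
    }
  }
}
filter {
  grok {
    # WORD instead of JAVACLASS, plus an extra SPACE for the double space after INFO
    match => { "message" => "^%{TIMESTAMP_ISO8601:JigsawTimestamp}%{SPACE}%{LOGLEVEL:JigsawLoglevel}%{SPACE}%{SPACE}%{HOSTNAME:ThreadName}%{SPACE}%{UUID:GUID}%{SPACE}%{WORD:JigsawClassName}%{SPACE}%{WORD:JigsawMethodName}%{SPACE}%{GREEDYDATA:JigsawLogMessage}" }
  }
}
output {
  if "_grokparsefailure" not in [tags] {
    elasticsearch {
      hosts => ["localhost:9200"]
    }
    stdout { codec => rubydebug }
  }
}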

Related

How can I create a JSON format log when entering into Elasticsearch via Logstash

I've been told that by using a Logstash pipeline I can re-create a log format (i.e. JSON) when entering into Elasticsearch, but I don't understand how to do it.
Current Logstash config (I took the below from Google, not for any particular reason):
/etc/logstash/conf.d/metrics-pipeline.conf
input {
beats {
port => 5044
client_inactivity_timeout => "3600"
}
}
filter {
if [message] =~ />/ {
dissect {
mapping => {
"message" => "%{start_of_message}>%{content}"
}
}
kv {
source => "content"
value_split => ":"
field_split => ","
trim_key => "\[\]"
trim_value => "\[\]"
target => "message"
}
mutate {
remove_field => ["content","start_of_message"]
}
}
}
filter {
if [system][process] {
if [system][process][cmdline] {
grok {
match => {
"[system][process][cmdline]" => "^%{PATH:[system][process][cmdline_path]}"
}
remove_field => "[system][process][cmdline]"
}
}
}
grok {
match => { "message" => "%{COMBINEDAPACHELOG}" }
}
date {
match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]
}
}
output {
elasticsearch {
hosts => "1.2.1.1:9200"
manage_template => false
index => "%{[#metadata][beat]}-%{[#metadata][version]}-%{+YYYY.MM.dd}"
}
}
I have a couple of log files located at
/root/logs/File.log
/root/logs/File2.log
The log format there is:
08:26:51,753 DEBUG [ABC] (default-threads - 78) (1.2.3.4)(368)>[TIMESTAMP:Wed Sep 11 08:26:51 UTC 2019],[IMEI:03537],[COMMAND:INFO],[GPS STATUS:true],[INFO:true],[SIGNAL:false],[ENGINE:0],[DOOR:0],[LON:90.43],[LAT:23],[SPEED:0.0],[HEADING:192.0],[BATTERY:100.0%],[CHARGING:1],[O&E:CONNECTED],[GSM_SIGNAL:100],[GPS_SATS:5],[GPS POS:true],[FUEL:0.0V/0.0%],[ALARM:NONE][SERIAL:01EE]
In Kibana, by default, it shows like this:
https://imgshare.io/image/stackflow.I0u7S
https://imgshare.io/image/jsonlog.IHQhp
"message": "21:33:42,004 DEBUG [LOG] (default-threads - 100) (1.2.3.4)(410)>[TIMESTAMP:Sat Sep 07 21:33:42 UTC 2019],[TEST:123456],[CMD:INFO],[STATUS:true],[INFO:true],[SIGNAL:false],[ABC:0],[DEF:0],[GHK:1111],[SERIAL:0006]"
But I want to get it like below:
"message": {
"TIMESTAMP": "Sat Sep 07 21:33:42 UTC 2019",
"TEST": "123456",
"CMD":INFO,
"STATUS":true,
"INFO":true,
"SIGNAL":false,
"ABC":0,
"DEF":0,
"GHK":0,
"GHK":1111
}
Can this be done? If yes, how?
Thanks
With the if [message] =~ />/, the filters will only apply to messages containing a >. The dissect filter splits the message at the >. The kv filter applies a key-value transformation to the second part of the message, removing the []. The mutate.remove_field removes any leftover fields.
filter {
if [message] =~ />/ {
dissect {
mapping => {
"message" => "%{start_of_message}>%{content}"
}
}
kv {
source => "content"
value_split => ":"
field_split => ","
trim_key => "\[\]"
trim_value => "\[\]"
target => "message"
}
mutate {
remove_field => ["content","start_of_message"]
}
}
}
Result, using the provided log line:
{
"#version": "1",
"host": "YOUR_MACHINE_NAME",
"message": {
"DEF": "0",
"TIMESTAMP": "Sat Sep 07 21:33:42 UTC 2019",
"TEST": "123456",
"CMD": "INFO",
"SERIAL": "0006]\r",
"GHK": "1111",
"INFO": "true",
"STATUS": "true",
"ABC": "0",
"SIGNAL": "false"
},
"#timestamp": "2019-09-10T09:21:16.422Z"
}
In addition to doing the filtering with if [message] =~ />/, you can also do the comparison on the path field, which is set by the file input plugin. And if you have multiple file inputs, you can set the type field and use that instead; see https://stackoverflow.com/a/20562031/6113627.
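For illustration, a minimal sketch of those two alternatives, reusing the file paths from the question (the type name tracker_log is made up here):
input {
  file {
    path => "/root/logs/File.log"
    type => "tracker_log"    # hypothetical type name, referenced in the conditional below
  }
  file {
    path => "/root/logs/File2.log"
    type => "tracker_log"
  }
}
filter {
  # alternative 1: compare on the path field set by the file input
  if [path] =~ /File/ {
    # dissect / kv / mutate chain from above goes here
  }
  # alternative 2: compare on the type field set in the inputs
  if [type] == "tracker_log" {
    # dissect / kv / mutate chain from above goes here
  }
}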

Distributed tracing and Elastic Stack visualisation

2019-06-03 10:45:00.051 INFO [currency-exchange,411a0496b048bcf4,8d40fcfea92613ad,true] 45648 --- [x-Controller-10] logger : inside exchange
This is the log format in my console. I am using Spring Cloud Stream to transport my logs from the application to Logstash. This is the format for log parsing in Logstash:
grok {
match => { "message" => "%{TIMESTAMP_ISO8601:timestamp}\s+%{LOGLEVEL:severity}\s+\[%{DATA:service},%{DATA:trace},%{DATA:span},%{DATA:exportable}\]\s+%{DATA:pid}\s+---\s+\[%{DATA:thread}\]\s+%{DATA:class}\s+:\s+%{GREEDYDATA:rest}" }
}
This is my logstash.conf
input {
  kafka {
    topics => ['zipkin']
  }
}
filter {
  # pattern matching logback pattern
  grok {
    match => { "message" => "%{TIMESTAMP_ISO8601:timestamp}\s+%{LOGLEVEL:severity}\s+\[%{DATA:service},%{DATA:trace},%{DATA:span},%{DATA:exportable}\]\s+%{DATA:pid}\s+---\s+\[%{DATA:thread}\]\s+%{DATA:class}\s+:\s+%{GREEDYDATA:rest}" }
  }
}
output {
  elasticsearch {
    hosts => ['localhost:9200']
    index => 'logging'
  }
  stdout {}
}
And this is my output in the Logstash console, which shows a grok parse failure:
{
"message" => "[{\"traceId\":\"411a0496b048bcf4\",\"parentId\":\"8d40fcfea92613ad\",\"id\":\"f14c1c332d2ef077\",\"kind\":\"CLIENT\",\"name\":\"get\",\"timestamp\":1559538900053889,\"duration\":16783,\"localEndpoint\":{\"serviceName\":\"currency-exchange\",\"ipv4\":\"10.8.0.7\"},\"tags\":{\"http.method\":\"GET\",\"http.path\":\"/convert/1/to/4\"}},{\"traceId\":\"411a0496b048bcf4\",\"parentId\":\"411a0496b048bcf4\",\"id\":\"8d40fcfea92613ad\",\"name\":\"hystrix\",\"timestamp\":1559538900050039,\"duration\":34500,\"localEndpoint\":{\"serviceName\":\"currency-exchange\",\"ipv4\":\"10.8.0.7\"}},{\"traceId\":\"411a0496b048bcf4\",\"id\":\"411a0496b048bcf4\",\"kind\":\"SERVER\",\"name\":\"get
/convert\",\"timestamp\":1559538900041446,\"duration\":44670,\"localEndpoint\":{\"serviceName\":\"currency-exchange\",\"ipv4\":\"10.8.0.7\"},\"remoteEndpoint\":{\"ipv6\":\"::1\",\"port\":62200},\"tags\":{\"http.method\":\"GET\",\"http.path\":\"/convert\",\"mvc.controller.class\":\"Controller\",\"mvc.controller.method\":\"convert\"}}]",
"#timestamp" => 2019-06-03T05:15:00.296Z,
"#version" => "1",
"tags" => [
[0] "_grokparsefailure"
] }
When I use the Grok Debugger that is built into Kibana (under Dev Tools) I get the following result from your sample log and grok pattern:
{
"severity": "DEBUG",
"rest": "GET \"/convert/4/to/5\", parameters={}",
"pid": "35973",
"thread": "nio-9090-exec-1",
"trace": "62132b44a444425e",
"exportable": "true",
"service": "currency-conversion",
"class": "o.s.web.servlet.DispatcherServlet",
"timestamp": "2019-05-31 05:31:42.667",
"span": "62132b44a444425e"
}
That looks correct to me. So what is the missing part?
Also the logging output you are showing contains "ipv4":"192.168.xx.xxx"},"remoteEndpoint": {"ipv6":"::1","port":55394},"tags": ..., which is not in the sample log. Where is that coming from?

Using multiple config files for logstash

I am just learning Elasticsearch and I need to know how to correctly split a configuration file into multiple files. I'm using the official Logstash image on Docker with ports bound on 9600 and 5044. Originally I had a working single Logstash file without conditionals, like so:
input {
beats {
port => '5044'
}
}
filter
{
grok{
match => {
"message" => "%{TIMESTAMP_ISO8601:timestamp} \[(?<event_source>[\w\s]+)\]:\[(?<log_type>[\w\s]+)\]:\[(?<id>\d+)\] %{GREEDYDATA:details}"
"source" => "%{GREEDYDATA}\\%{GREEDYDATA:app}.log"
}
}
mutate{
convert => { "id" => "integer" }
}
date {
match => [ "timestamp", "ISO8601" ]
locale => en
remove_field => "timestamp"
}
}
output
{
elasticsearch {
hosts => ["http://elastic:9200"]
index => "logstash-supportworks"
}
}
When I wanted to add metricbeat I decided to split that configuration into a new file. So I ended up with 3 files:
__input.conf
input {
beats {
port => '5044'
}
}
metric.conf
# for testing I'm adding no filters just to see what the data looks like
output {
if ['@metadata']['beat'] == 'metricbeat' {
elasticsearch {
hosts => ["http://elastic:9200"]
index => "%{[#metadata][beat]}-%{[#metadata][version]}"
}
}
}
supportworks.conf
filter
{
if ["source"] =~ /Supportwork Server/ {
grok{
match => {
"message" => "%{TIMESTAMP_ISO8601:timestamp} \[(?<event_source>[\w\s]+)\]:\[(?<log_type>[\w\s]+)\]:\[(?<id>\d+)\] %{GREEDYDATA:details}"
"source" => "%{GREEDYDATA}\\%{GREEDYDATA:app}.log"
}
}
mutate{
convert => { "id" => "integer" }
}
date {
match => [ "timestamp", "ISO8601" ]
locale => en
remove_field => "timestamp"
}
}
}
output
{
if ["source"] =~ /Supportwork Server/ {
elasticsearch {
hosts => ["http://elastic:9200"]
index => "logstash-supportworks"
}
}
}
Now no data is being sent to the ES instance. I have verified that filebeat at least is running and publishing messages, so I'd expect to see at least that much going to ES. Here's a published message from my server running filebeat:
2019-03-06T09:16:44.634-0800 DEBUG [publish] pipeline/processor.go:308 Publish event: {
"#timestamp": "2019-03-06T17:16:44.634Z",
"#metadata": {
"beat": "filebeat",
"type": "doc",
"version": "6.6.1"
},
"source": "C:\\Program Files (x86)\\Hornbill\\Supportworks Server\\log\\swserver.log",
"offset": 4773212,
"log": {
"file": {
"path": "C:\\Program Files (x86)\\Hornbill\\Supportworks Server\\log\\swserver.log"
}
},
"message": "2019-03-06 09:16:42 [COMMS]:[INFO ]:[4924] Helpdesk API (5005) Socket error while idle - 10053",
"prospector": {
"type": "log"
},
"input": {
"type": "log"
},
"beat": {
"name": "WIN-22VRRIEO8LM",
"hostname": "WIN-22VRRIEO8LM",
"version": "6.6.1"
},
"host": {
"name": "WIN-22VRRIEO8LM",
"architecture": "x86_64",
"os": {
"platform": "windows",
"version": "6.3",
"family": "windows",
"name": "Windows Server 2012 R2 Standard",
"build": "9600.0"
},
"id": "e5887ac2-6fbf-45ef-998d-e40437066f56"
}
}
I got this working by adding a mutate filter to __input.conf to replace backslashes with forward slashes in the source field:
filter {
mutate{
gsub => [ "source", "[\\]", "/" ]
}
}
And removing the " from the field accessors in my conditionals So
if ["source"] =~ /Supportwork Server/
Became
if [source] =~ /Supportwork Server/
Both changes seemed to be necessary to get this configuration working.
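For reference, a sketch of how the affected parts might look after both changes (everything else stays as in the question):
# __input.conf
input {
  beats {
    port => '5044'
  }
}
filter {
  mutate {
    # replace Windows backslashes with forward slashes in the source field
    gsub => [ "source", "[\\]", "/" ]
  }
}

# supportworks.conf output section, with the unquoted field reference
output {
  if [source] =~ /Supportwork Server/ {
    elasticsearch {
      hosts => ["http://elastic:9200"]
      index => "logstash-supportworks"
    }
  }
}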

Logstash nginx filter doesn't apply to half of rows

Using filebeat to push nginx logs to logstash and then to elasticsearch.
Logstash filter:
filter {
if [fileset][module] == "nginx" {
if [fileset][name] == "access" {
grok {
match => { "message" => ["%{IPORHOST:[nginx][access][remote_ip]} - %{DATA:[nginx][access][user_name]} \[%{HTTPDATE:[nginx][access][time]}\] \"%{WORD:[nginx][access][method]} %{DATA:[nginx][access][url]} HTTP/%{NUMBER:[nginx][access][http_version]}\" %{NUMBER:[nginx][access][response_code]} %{NUMBER:[nginx][access][body_sent][bytes]} \"%{DATA:[nginx][access][referrer]}\" \"%{DATA:[nginx][access][agent]}\""] }
remove_field => "message"
}
mutate {
add_field => { "read_timestamp" => "%{#timestamp}" }
}
date {
match => [ "[nginx][access][time]", "dd/MMM/YYYY:H:m:s Z" ]
remove_field => "[nginx][access][time]"
}
useragent {
source => "[nginx][access][agent]"
target => "[nginx][access][user_agent]"
remove_field => "[nginx][access][agent]"
}
geoip {
source => "[nginx][access][remote_ip]"
target => "[nginx][access][geoip]"
}
}
else if [fileset][name] == "error" {
grok {
match => { "message" => ["%{DATA:[nginx][error][time]} \[%{DATA:[nginx][error][level]}\] %{NUMBER:[nginx][error][pid]}#%{NUMBER:[nginx][error][tid]}: (\*%{NUMBER:[nginx][error][connection_id]} )?%{GREEDYDATA:[nginx][error][message]}"] }
remove_field => "message"
}
mutate {
rename => { "#timestamp" => "read_timestamp" }
}
date {
match => [ "[nginx][error][time]", "YYYY/MM/dd H:m:s" ]
remove_field => "[nginx][error][time]"
}
}
}
}
There is just one file /var/log/nginx/access.log.
In Kibana, I see roughly half of the rows with a parsed message and the other half not.
All of the rows in Kibana have the tag "beats_input_codec_plain_applied".
Examples from filebeat -e:
Row that works fine:
"source": "/var/log/nginx/access.log",
"offset": 5405195,
"message": "...",
"fileset": {
"module": "nginx",
"name": "access"
}
Row that doesn't work (no "fileset"):
"offset": 5405397,
"message": "...",
"source": "/var/log/nginx/access.log"
Any idea what could be the cause?

Parsing this format of date dd/MMM/yyyy:HH:mm:ss with Logstash

My timestamp in the logs has the format below:
<p> 2018/03/15 16:22:31 SYST DEBUG :: RefOPoolConnexionsSQL::getConnexionSQL() --> A016 </P>
I need to have something like this:
{
"_type": "logs",
"_method1": "SYST",
"_method2": "DEBUG",
"line": RefOPoolConnexionsSQL::getConnexionSQL() --> A016,
"_source": {
"path": "xxxxxxxxxxxxxxxxxxx",
"#timestamp": "2018/03/15 16:22:31, **logstash #timestamp**
"timestamp": "2018/03/15 16:22:31" **Mine time stamp**
}
For that I made this filter, but it didn't work as I want:
input {
beats {
port => 5044
}
}
filter {
grok {
match => { "message" => "%{WORD:client} %{TIMESTAMP_ISO8601:timestamp} %{WORD:sys} %{GREEDYDATA:line}" }
}
date {
match => ["timestamp", "dd/MMM/yyyy:HH:mm:ss"]
target => "#timestamp"
}
}
output {
elasticsearch { hosts => ["localhost:9200"] }
stdout { codec => rubydebug }
}
