Distributed tracing and Elastic Stack visualisation - spring-boot

2019-06-03 10:45:00.051 INFO [currency-exchange,411a0496b048bcf4,8d40fcfea92613ad,true] 45648 --- [x-Controller-10] logger : inside exchange
This is the log format in my console. I am using Spring Cloud Stream to transport my logs from the application to Logstash. This is the grok pattern for log parsing in Logstash:
grok {
match => { "message" => "%{TIMESTAMP_ISO8601:timestamp}\s+%{LOGLEVEL:severity}\s+\[%{DATA:service},%{DATA:trace},%{DATA:span},%{DATA:exportable}\]\s+%{DATA:pid}\s+---\s+\[%{DATA:thread}\]\s+%{DATA:class}\s+:\s+%{GREEDYDATA:rest}" }
}
This is my logstash.conf
input {
  kafka { topics => ['zipkin'] }
}
filter {
  # pattern matching logback pattern
  grok {
    match => { "message" => "%{TIMESTAMP_ISO8601:timestamp}\s+%{LOGLEVEL:severity}\s+\[%{DATA:service},%{DATA:trace},%{DATA:span},%{DATA:exportable}\]\s+%{DATA:pid}\s+---\s+\[%{DATA:thread}\]\s+%{DATA:class}\s+:\s+%{GREEDYDATA:rest}" }
  }
}
output {
  elasticsearch { hosts => ['localhost:9200'] index => 'logging' }
  stdout {}
}
and this is my output in the Logstash console, which shows a parsing failure:
{
"message" => "[{\"traceId\":\"411a0496b048bcf4\",\"parentId\":\"8d40fcfea92613ad\",\"id\":\"f14c1c332d2ef077\",\"kind\":\"CLIENT\",\"name\":\"get\",\"timestamp\":1559538900053889,\"duration\":16783,\"localEndpoint\":{\"serviceName\":\"currency-exchange\",\"ipv4\":\"10.8.0.7\"},\"tags\":{\"http.method\":\"GET\",\"http.path\":\"/convert/1/to/4\"}},{\"traceId\":\"411a0496b048bcf4\",\"parentId\":\"411a0496b048bcf4\",\"id\":\"8d40fcfea92613ad\",\"name\":\"hystrix\",\"timestamp\":1559538900050039,\"duration\":34500,\"localEndpoint\":{\"serviceName\":\"currency-exchange\",\"ipv4\":\"10.8.0.7\"}},{\"traceId\":\"411a0496b048bcf4\",\"id\":\"411a0496b048bcf4\",\"kind\":\"SERVER\",\"name\":\"get
/convert\",\"timestamp\":1559538900041446,\"duration\":44670,\"localEndpoint\":{\"serviceName\":\"currency-exchange\",\"ipv4\":\"10.8.0.7\"},\"remoteEndpoint\":{\"ipv6\":\"::1\",\"port\":62200},\"tags\":{\"http.method\":\"GET\",\"http.path\":\"/convert\",\"mvc.controller.class\":\"Controller\",\"mvc.controller.method\":\"convert\"}}]",
"#timestamp" => 2019-06-03T05:15:00.296Z,
"#version" => "1",
"tags" => [
[0] "_grokparsefailure"
] }

When I use the Grok Debugger that is built into Kibana (under Dev Tools) I get the following result from your sample log and grok pattern:
{
"severity": "DEBUG",
"rest": "GET \"/convert/4/to/5\", parameters={}",
"pid": "35973",
"thread": "nio-9090-exec-1",
"trace": "62132b44a444425e",
"exportable": "true",
"service": "currency-conversion",
"class": "o.s.web.servlet.DispatcherServlet",
"timestamp": "2019-05-31 05:31:42.667",
"span": "62132b44a444425e"
}
That looks correct to me. So what is the missing part?
Also the logging output you are showing contains "ipv4":"192.168.xx.xxx"},"remoteEndpoint": {"ipv6":"::1","port":55394},"tags": ..., which is not in the sample log. Where is that coming from?
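If the messages on the zipkin Kafka topic really are JSON arrays of Zipkin spans (as the _grokparsefailure event above suggests) rather than logback console lines, no grok pattern for the console layout will ever match them. Purely as a sketch, and assuming you actually want to index those span payloads, they could be parsed with a json filter instead of grok (the spans field name is just an illustration):
filter {
  json {
    source => "message"
    target => "spans"   # the payload is a JSON array, so parse it into a dedicated field
  }
  split {
    field => "spans"    # emit one event per span in the array
  }
}
Otherwise, keep the grok filter as it is and point the input at whatever topic actually carries the logback-formatted log lines.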

Related

How can I create a JSON-format log when entering into Elasticsearch via Logstash

I've been told that by using a Logstash pipeline I can re-create the log format (i.e. JSON) when it enters Elasticsearch, but I don't understand how to do it.
Current Logstash configuration (I took the below from Google, not for any particular reason):
/etc/logstash/conf.d/metrics-pipeline.conf
input {
beats {
port => 5044
client_inactivity_timeout => "3600"
}
}
filter {
if [message] =~ />/ {
dissect {
mapping => {
"message" => "%{start_of_message}>%{content}"
}
}
kv {
source => "content"
value_split => ":"
field_split => ","
trim_key => "\[\]"
trim_value => "\[\]"
target => "message"
}
mutate {
remove_field => ["content","start_of_message"]
}
}
}
filter {
if [system][process] {
if [system][process][cmdline] {
grok {
match => {
"[system][process][cmdline]" => "^%{PATH:[system][process][cmdline_path]}"
}
remove_field => "[system][process][cmdline]"
}
}
}
grok {
match => { "message" => "%{COMBINEDAPACHELOG}" }
}
date {
match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]
}
}
output {
elasticsearch {
hosts => "1.2.1.1:9200"
manage_template => false
index => "%{[#metadata][beat]}-%{[#metadata][version]}-%{+YYYY.MM.dd}"
}
}
I have a couple of log files located at
/root/logs/File.log
/root/logs/File2.log
The log format there is:
08:26:51,753 DEBUG [ABC] (default-threads - 78) (1.2.3.4)(368)>[TIMESTAMP:Wed Sep 11 08:26:51 UTC 2019],[IMEI:03537],[COMMAND:INFO],[GPS STATUS:true],[INFO:true],[SIGNAL:false],[ENGINE:0],[DOOR:0],[LON:90.43],[LAT:23],[SPEED:0.0],[HEADING:192.0],[BATTERY:100.0%],[CHARGING:1],[O&E:CONNECTED],[GSM_SIGNAL:100],[GPS_SATS:5],[GPS POS:true],[FUEL:0.0V/0.0%],[ALARM:NONE][SERIAL:01EE]
In Kibana, by default, it shows like this:
https://imgshare.io/image/stackflow.I0u7S
https://imgshare.io/image/jsonlog.IHQhp
"message": "21:33:42,004 DEBUG [LOG] (default-threads - 100) (1.2.3.4)(410)>[TIMESTAMP:Sat Sep 07 21:33:42 UTC 2019],[TEST:123456],[CMD:INFO],[STATUS:true],[INFO:true],[SIGNAL:false],[ABC:0],[DEF:0],[GHK:1111],[SERIAL:0006]"
but I want to get it like below:
"message": {
"TIMESTAMP": "Sat Sep 07 21:33:42 UTC 2019",
"TEST": "123456",
"CMD":INFO,
"STATUS":true,
"INFO":true,
"SIGNAL":false,
"ABC":0,
"DEF":0,
"GHK":0,
"GHK":1111
}
Can this be done? If yes, how?
Thanks
With the if [message] =~ />/, the filters will only apply to messages containing a >. The dissect filter splits the message at the >. The kv filter applies a key-value transformation on the second part of the message, removing the [ and ]. The mutate.remove_field removes any extra fields.
filter {
if [message] =~ />/ {
dissect {
mapping => {
"message" => "%{start_of_message}>%{content}"
}
}
kv {
source => "content"
value_split => ":"
field_split => ","
trim_key => "\[\]"
trim_value => "\[\]"
target => "message"
}
mutate {
remove_field => ["content","start_of_message"]
}
}
}
Result, using the provided log line:
{
"#version": "1",
"host": "YOUR_MACHINE_NAME",
"message": {
"DEF": "0",
"TIMESTAMP": "Sat Sep 07 21:33:42 UTC 2019",
"TEST": "123456",
"CMD": "INFO",
"SERIAL": "0006]\r",
"GHK": "1111",
"INFO": "true",
"STATUS": "true",
"ABC": "0",
"SIGNAL": "false"
},
"#timestamp": "2019-09-10T09:21:16.422Z"
}
In addition to doing the filtering with if [message] =~ />/, you can also do the comparison on the path field, which is set by the file input plugin. And if you have multiple file inputs, you can set the type field and use that instead; see https://stackoverflow.com/a/20562031/6113627.
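For illustration, a minimal sketch of those two options (the device_log type name and the path regex are just examples, not taken from the question):
input {
  file {
    path => ["/root/logs/File.log", "/root/logs/File2.log"]
    type => "device_log"            # example name; sets the type field on every event
  }
}
filter {
  if [type] == "device_log" {       # or: if [path] =~ /File\.log/
    # dissect / kv / mutate blocks from the answer above go here
  }
}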

Beat input in Logstash is losing fields

I have the following infrastructure:
ELK installed as Docker containers, each in its own container, and on a virtual machine running CentOS I installed the nginx web server and Filebeat to collect the logs.
I enabled the nginx module in filebeat.
> filebeat modules enable nginx
Before starting Filebeat I set it up with Elasticsearch and installed its dashboards in Kibana.
config file (I have removed unnecessary comments from the file):
filebeat.config.modules:
path: ${path.config}/modules.d/*.yml
reload.enabled: false
setup.kibana:
host: "172.17.0.1:5601"
output.elasticsearch:
hosts: ["172.17.0.1:9200"]
Then, to set it up in Elasticsearch and Kibana:
> filebeat setup -e --dashboards
This works fine. In fact if I keep it this way everything works perfectly. I can use the collected logs in kibana and use the dashboards for NGinX I installed with the above command.
I want though to pass the logs through to Logstash.
And here's my Logstash configuration; it uses the following pipelines:
- pipeline.id: filebeat
path.config: "config/filebeat.conf"
filebeat.conf:
input {
beats {
port => 5044
}
}
#filter {
# mutate {
# add_tag => ["filebeat"]
# }
#}
output {
elasticsearch {
hosts => ["elasticsearch0:9200"]
index => "%{[#metadata][beat]}-%{[#metadata][version]}-%{+YYYY.MM.dd}"
}
stdout { }
}
Making the logs go through Logstash, the resulting log is just:
{
"offset" => 6655,
"#version" => "1",
"#timestamp" => 2019-02-20T13:34:06.886Z,
"message" => "10.0.2.2 - - [20/Feb/2019:08:33:58 -0500] \"GET / HTTP/1.1\" 304 0 \"-\" \"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/71.0.3578.98 Chrome/71.0.3578.98 Safari/537.36\" \"-\"",
"beat" => {
"version" => "6.5.4",
"name" => "localhost.localdomain",
"hostname" => "localhost.localdomain"
},
"source" => "/var/log/nginx/access.log",
"host" => {
"os" => {
"version" => "7 (Core)",
"codename" => "Core",
"family" => "redhat",
"platform" => "centos"
},
"name" => "localhost.localdomain",
"id" => "18e7cb2506624fb6ae2dc3891d5d7172",
"containerized" => true,
"architecture" => "x86_64"
},
"fileset" => {
"name" => "access",
"module" => "nginx"
},
"tags" => [
[0] "beats_input_codec_plain_applied"
],
"input" => {
"type" => "log"
},
"prospector" => {
"type" => "log"
}
}
A lot of fields are missing from my object. There should be much more structured information.
UPDATE: This is what I'm expecting instead
{
"_index": "filebeat-6.5.4-2019.02.20",
"_type": "doc",
"_id": "ssJPC2kBLsya0HU-3uwW",
"_version": 1,
"_score": null,
"_source": {
"offset": 9639,
"nginx": {
"access": {
"referrer": "-",
"response_code": "404",
"remote_ip": "10.0.2.2",
"method": "GET",
"user_name": "-",
"http_version": "1.1",
"body_sent": {
"bytes": "3650"
},
"remote_ip_list": [
"10.0.2.2"
],
"url": "/access",
"user_agent": {
"patch": "3578",
"original": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/71.0.3578.98 Chrome/71.0.3578.98 Safari/537.36",
"major": "71",
"minor": "0",
"os": "Ubuntu",
"name": "Chromium",
"os_name": "Ubuntu",
"device": "Other"
}
}
},
"prospector": {
"type": "log"
},
"read_timestamp": "2019-02-20T14:29:36.393Z",
"source": "/var/log/nginx/access.log",
"fileset": {
"module": "nginx",
"name": "access"
},
"input": {
"type": "log"
},
"#timestamp": "2019-02-20T14:29:32.000Z",
"host": {
"os": {
"codename": "Core",
"family": "redhat",
"version": "7 (Core)",
"platform": "centos"
},
"containerized": true,
"name": "localhost.localdomain",
"id": "18e7cb2506624fb6ae2dc3891d5d7172",
"architecture": "x86_64"
},
"beat": {
"hostname": "localhost.localdomain",
"name": "localhost.localdomain",
"version": "6.5.4"
}
},
"fields": {
"#timestamp": [
"2019-02-20T14:29:32.000Z"
]
},
"sort": [
1550672972000
]
}
The answer provided by @baudsp was mostly correct, but it was incomplete. I had exactly the same problem, and I also had exactly the same filter mentioned in the documentation (and in @baudsp's answer), but documents in Elasticsearch still did not contain any of the expected fields.
I finally found the problem: because I had Filebeat configured to send Nginx logs via the Nginx module and not the Log input, the data coming from Filebeat didn't quite match what the example Logstash filter was expecting.
The conditional in the example is if [fileset][module] == "nginx", which is correct if Filebeat was sending data from a Log input. However, since the log data is coming from the Nginx module, the fileset property doesn't contain a module property.
To make the filter work with data coming from the Nginx module, the conditional needs to be modified to look for something else. I found [event][module] to work in place of [fileset][module].
The working filter:
filter {
if [event][module] == "nginx" {
if [fileset][name] == "access" {
grok {
match => { "message" => ["%{IPORHOST:[nginx][access][remote_ip]} - %{DATA:[nginx][access][user_name]} \[%{HTTPDATE:[nginx][access][time]}\] \"%{WORD:[nginx][access][method]} %{DATA:[nginx][access][url]} HTTP/%{NUMBER:[nginx][access][http_version]}\" %{NUMBER:[nginx][access][response_code]} %{NUMBER:[nginx][access][body_sent][bytes]} \"%{DATA:[nginx][access][referrer]}\" \"%{DATA:[nginx][access][agent]}\""] }
remove_field => "message"
}
mutate {
add_field => { "read_timestamp" => "%{#timestamp}" }
}
date {
match => [ "[nginx][access][time]", "dd/MMM/YYYY:H:m:s Z" ]
remove_field => "[nginx][access][time]"
}
useragent {
source => "[nginx][access][agent]"
target => "[nginx][access][user_agent]"
remove_field => "[nginx][access][agent]"
}
geoip {
source => "[nginx][access][remote_ip]"
target => "[nginx][access][geoip]"
}
}
else if [fileset][name] == "error" {
grok {
match => { "message" => ["%{DATA:[nginx][error][time]} \[%{DATA:[nginx][error][level]}\] %{NUMBER:[nginx][error][pid]}#%{NUMBER:[nginx][error][tid]}: (\*%{NUMBER:[nginx][error][connection_id]} )?%{GREEDYDATA:[nginx][error][message]}"] }
remove_field => "message"
}
mutate {
rename => { "#timestamp" => "read_timestamp" }
}
date {
match => [ "[nginx][error][time]", "YYYY/MM/dd H:m:s" ]
remove_field => "[nginx][error][time]"
}
}
}
}
Now, documents in Elasticsearch have all of the expected fields.
Note: You'll have the same problem with other Filebeat modules, too. Just use [event][module] in place of [fileset][module].
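As a sketch of that one-line substitution (the added tag is only a placeholder for the real access/error filters shown above):
filter {
  # documented example, for the Filebeat Log input:
  #   if [fileset][module] == "nginx" { ... }
  # with the Filebeat nginx module, match on [event][module] instead:
  if [event][module] == "nginx" {
    mutate { add_tag => ["nginx_module"] }   # placeholder; put the grok/date/useragent/geoip filters here
  }
}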
From your logstash configuration, it doesn't look like you are parsing the log message.
There's an example in the logstash documentation on how to parse nginx logs:
Nginx Logs
The Logstash pipeline configuration in this example shows how to ship and parse access and error logs collected by the nginx Filebeat module.
input {
beats {
port => 5044
host => "0.0.0.0"
}
}
filter {
if [fileset][module] == "nginx" {
if [fileset][name] == "access" {
grok {
match => { "message" => ["%{IPORHOST:[nginx][access][remote_ip]} - %{DATA:[nginx][access][user_name]} \[%{HTTPDATE:[nginx][access][time]}\] \"%{WORD:[nginx][access][method]} %{DATA:[nginx][access][url]} HTTP/%{NUMBER:[nginx][access][http_version]}\" %{NUMBER:[nginx][access][response_code]} %{NUMBER:[nginx][access][body_sent][bytes]} \"%{DATA:[nginx][access][referrer]}\" \"%{DATA:[nginx][access][agent]}\""] }
remove_field => "message"
}
mutate {
add_field => { "read_timestamp" => "%{#timestamp}" }
}
date {
match => [ "[nginx][access][time]", "dd/MMM/YYYY:H:m:s Z" ]
remove_field => "[nginx][access][time]"
}
useragent {
source => "[nginx][access][agent]"
target => "[nginx][access][user_agent]"
remove_field => "[nginx][access][agent]"
}
geoip {
source => "[nginx][access][remote_ip]"
target => "[nginx][access][geoip]"
}
}
else if [fileset][name] == "error" {
grok {
match => { "message" => ["%{DATA:[nginx][error][time]} \[%{DATA:[nginx][error][level]}\] %{NUMBER:[nginx][error][pid]}#%{NUMBER:[nginx][error][tid]}: (\*%{NUMBER:[nginx][error][connection_id]} )?%{GREEDYDATA:[nginx][error][message]}"] }
remove_field => "message"
}
mutate {
rename => { "#timestamp" => "read_timestamp" }
}
date {
match => [ "[nginx][error][time]", "YYYY/MM/dd H:m:s" ]
remove_field => "[nginx][error][time]"
}
}
}
}
I know it doesn't deal with why Filebeat doesn't send the full object to Logstash, but it should give you a start on how to parse the nginx logs in Logstash.

Logstash to kibana multiline not working for custom message

I am trying to push my log file through Logstash to Elasticsearch and display it in Kibana. It works fine for single-line log records; however, it fails when it comes to the multiline filter.
Here is my sample multiline log input:
2016-06-02T04:02:29,720 INFO Thread-25-match-entity-bolt a52488cc-316b-402e-af58-3b8a663cd76a STDIO invoke Error processing message:{
"eid": "f9f16541-4fab-4131-a82e-e3ddf6fcd949",
"entityInfo": {
"entityType": "style",
"defaultLocale": "en-US"
},
"systemInfo": {
"tenantId": "t1"
},
"attributesInfo": {
"externalId": 1514,
"attributesRead": {
"IsEntityVariantsValid": false,
"IsEntityExtensionsValid": false
},
"attributesUpdated": {
"DateAttribute": "2016-06-01T00:00:00.0000000",
"IsEntitySelfValid": true,
"IsEntityMetaDataValid": true,
"IsEntityCommonAttributesValid": true,
"IsEntityCategoryAttributesValid": true,
"IsEntityRelationshipsValid": true
}
},
"jsAttributesInfo": {
"jsRelationship": {
"entityId": "CottonMaterial001",
"parentEntityId": "Apparel",
"category": "Apparel",
"categoryName": "Apparel",
"categoryPath": "Apparel",
"categoryNamePath": "Apparel",
"variant": "1514",
"variantPath": "1035/1514",
"container": "Demo Master",
"containerName": "Demo Master",
"containerPath": "DemoOrg/Demo Master/Apparel",
"organization": "DemoOrg",
"segment": "A"
},
"jsChangeContext": {
"entityAction": "update",
"user": "cfadmin",
"changeAgent": "EntityEditor.aspx",
"changeAgentType": "PIM",
"changeInterface": "Entity",
"sourceTimestamp": "2016-06-01T19:48:19.4162475+05:30",
"ingestTimestamp": "2016-06-01T19:48:19.4162475+05:30"
}
}
}
I have tried these logstash configs so far:
input {
file {
path => "path_to_logs/logs.log"
start_position => "beginning"
}
}
filter{
multiline {
negate => "true"
pattern => "^%{TIMESTAMP_ISO8601} "
what => "previous"
}
grok{
match => { "message" => "^%{TIMESTAMP_ISO8601:JigsawTimestamp}%{SPACE}%{LOGLEVEL:JigsawLoglevel}%{SPACE}%{HOSTNAME:ThreadName}%{SPACE}%{UUID:GUID}%{SPACE}%{JAVACLASS:JigsawClassName}%{SPACE}%{WORD:JigsawMethodName}%{SPACE}%{GREEDYDATA:JigsawLogMessage}" }
}
}
output {
if "_grokparsefailure" not in [tags] {
elasticsearch {
hosts => ["localhost:9200"]
}
}
}
The second one:
input {
file {
path => "path_to_logs/logs.log"
start_position => "beginning"
codec => multiline {
negate => "true"
pattern => "^%{TIMESTAMP_ISO8601} "
what => "previous"
}
}
}
filter{
grok{
match => { "message" => "^%{TIMESTAMP_ISO8601:JigsawTimestamp}%{SPACE}%{LOGLEVEL:JigsawLoglevel}%{SPACE}%{HOSTNAME:ThreadName}%{SPACE}%{UUID:GUID}%{SPACE}%{JAVACLASS:JigsawClassName}%{SPACE}%{WORD:JigsawMethodName}%{SPACE}%{GREEDYDATA:JigsawLogMessage}" }
}
}
output {
if "_grokparsefailure" not in [tags] {
elasticsearch {
hosts => ["localhost:9200"]
}
}
}
I tried this pattern as well:
pattern => "^\s"
However, none of this helped. All of them got the _grokparsefailure tag. I want the JSON lines to be part of a single message. Please point out the mistake in this filter.
There are a couple of mistakes in your grok filter because of which you are unable to see any logs.
In your sample data, there are 2 spaces after INFO.
For the field JigsawClassName you are using JAVACLASS, which is wrong for your log.
Why is JAVACLASS wrong?
Its implementation is:
JAVACLASS (?:[a-zA-Z0-9-]+\.)+[A-Za-z0-9$]+
As per the above, JAVACLASS requires at least a period (.) symbol to appear in the text. However, in your logs it is just STDIO.
Replace your grok match with the following:
match => { "message" => "^%{TIMESTAMP_ISO8601:JigsawTimestamp}%{SPACE}%{LOGLEVEL:JigsawLoglevel}%{SPACE}%{SPACE}%{HOSTNAME:ThreadName}%{SPACE}%{UUID:GUID}%{SPACE}%{WORD:JigsawClassName}%{SPACE}%{WORD:JigsawMethodName}%{SPACE}%{GREEDYDATA:JigsawLogMessage}" }
Also, for easier understanding, redirect the output to the console by adding the stdout plugin as shown below:
output {
if "_grokparsefailure" not in [tags] {
elasticsearch {
hosts => ["localhost:9200"]
}
stdout { codec => rubydebug }
}
}
It will make it easier for you to understand the error while processing data using Logstash.

logstash multiline codec with java stack trace

I am trying to parse a log file with grok. The configuration I use allows me to parse a single-line event but not a multiline one (with a Java stack trace).
#what i get on KIBANA for a single line:
{
"_index": "logstash-2015.02.05",
"_type": "logs",
"_id": "mluzA57TnCpH-XBRbeg",
"_score": null,
"_source": {
"message": " - 2014-01-14 11:09:35,962 [main] INFO (api.batch.ThreadPoolWorker) user.country=US",
"#version": "1",
"#timestamp": "2015-02-05T09:38:21.310Z",
"path": "/root/test2.log",
"time": "2014-01-14 11:09:35,962",
"main": "main",
"loglevel": "INFO",
"class": "api.batch.ThreadPoolWorker",
"mydata": " user.country=US"
},
"sort": [
1423129101310,
1423129101310
]
}
#what i get for a multiline with Stack trace:
{
"_index": "logstash-2015.02.05",
"_type": "logs",
"_id": "9G6LsSO-aSpsas_jOw",
"_score": null,
"_source": {
"message": "\tat oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:20)",
"#version": "1",
"#timestamp": "2015-02-05T09:38:21.380Z",
"path": "/root/test2.log",
"tags": [
"_grokparsefailure"
]
},
"sort": [
1423129101380,
1423129101380
]
}
input {
file {
path => "/root/test2.log"
start_position => "beginning"
codec => multiline {
pattern => "^ - %{TIMESTAMP_ISO8601} "
negate => true
what => "previous"
}
}
}
filter {
grok {
match => [ "message", " -%{SPACE}%{SPACE}%{TIMESTAMP_ISO8601:time} \[%{WORD:main}\] %{LOGLEVEL:loglevel}%{SPACE}%{SPACE}\(%{JAVACLASS:class}\) %{GREEDYDATA:mydata} %{JAVASTACKTRACEPART}"]
}
date {
match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]
}
}
output {
elasticsearch {
host => "194.3.227.23"
}
# stdout { codec => rubydebug}
}
Can anyone please tell me what I'm doing wrong in my configuration file? Thanks.
here's a sample of my log file:
- 2014-01-14 11:09:36,447 [main] INFO (support.context.ContextFactory) Creating default context
- 2014-01-14 11:09:38,623 [main] ERROR (support.context.ContextFactory) Error getting connection to database jdbc:oracle:thin:@HAL9000:1521:DEVPRINT, with user cisuser and driver oracle.jdbc.driver.OracleDriver
java.sql.SQLException: ORA-28001: the password has expired
at oracle.jdbc.driver.SQLStateMapping.newSQLException(SQLStateMapping.java:70)
at oracle.jdbc.driver.DatabaseError.newSQLException(DatabaseError.java:131)
EDIT: here's the latest configuration I'm using: https://gist.github.com/anonymous/9afe80ad604f9a3d3c00#file-output-L1
First point: when repeating tests with the file input, be sure to use sincedb_path => "/dev/null" so that the file is read from the beginning each time.
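For reference, a sketch of the question's file input with that setting added:
input {
  file {
    path => "/root/test2.log"
    start_position => "beginning"
    sincedb_path => "/dev/null"   # forget stored read positions so the file is re-read on every test run
    codec => multiline {
      pattern => "^ - %{TIMESTAMP_ISO8601} "
      negate => true
      what => "previous"
    }
  }
}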
About multiline, there must be something wrong either with your question content or with your multiline pattern, because none of the events have the multiline tag that is added by the multiline codec or filter when aggregating lines.
Your message field should contain all lines separated by line feed characters \n (\r\n in my case, being on Windows). Here is the expected output from your input configuration:
{
"#timestamp" => "2015-02-10T11:03:33.298Z",
"message" => " - 2014-01-14 11:09:35,962 [main] INFO (api.batch.ThreadPoolWorker) user.country=US\r\n\tat oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:20\r",
"#version" => "1",
"tags" => [
[0] "multiline"
],
"host" => "localhost",
"path" => "/root/test.file"
}
About grok, as you want to match a multiline string you should use a pattern like this.
filter {
grok {
match => {"message" => [
"(?m)^ -%{SPACE}%{TIMESTAMP_ISO8601:time} \[%{WORD:main}\] % {LOGLEVEL:loglevel}%{SPACE}\(%{JAVACLASS:class}\) %{DATA:mydata}\n%{GREEDYDATA:stack}",
"^ -%{SPACE}%{TIMESTAMP_ISO8601:time} \[%{WORD:main}\] %{LOGLEVEL:loglevel}%{SPACE}\(%{JAVACLASS:class}\) %{GREEDYDATA:mydata}"]
}
}
}
The (?m) prefix instructs the regex engine to do multiline matching.
And then you get an event like:
{
"#timestamp" => "2015-02-10T10:47:20.078Z",
"message" => " - 2014-01-14 11:09:35,962 [main] INFO (api.batch.ThreadPoolWorker) user.country=US\r\n\tat oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:20\r",
"#version" => "1",
"tags" => [
[0] "multiline"
],
"host" => "localhost",
"path" => "/root/test.file",
"time" => "2014-01-14 11:09:35,962",
"main" => "main",
"loglevel" => "INFO",
"class" => "api.batch.ThreadPoolWorker",
"mydata" => " user.country=US\r",
"stack" => "\tat oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:20\r"
}
You can build and validate your multiline patterns with this online tool http://grokconstructor.appspot.com/do/match
A final warning: there is currently a bug in the Logstash file input with the multiline codec that mixes up content from several files if you use a list or a wildcard in the path setting. The only workaround is to use the multiline filter, as sketched below.
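A sketch of that workaround, moving the same pattern from the codec into a multiline filter:
filter {
  multiline {
    pattern => "^ - %{TIMESTAMP_ISO8601} "
    negate => true
    what => "previous"
  }
  # grok and date filters follow as above
}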
HTH
EDIT: I was focusing on the multiline strings; you also need to add a similar pattern for single-line strings.

Logstash and ElasticSearch filter Date #timestamp issue

I'm trying to index some data from a file into Elasticsearch using Logstash.
If I'm not using the date filter to replace @timestamp, everything works very well, but when I use the filter I do not get all the data.
I can't figure out why there is a difference between the Logstash command line and Elasticsearch in the @timestamp value.
Logstash conf
filter {
mutate {
replace => {
"type" => "dashboard_a"
}
}
grok {
match => [ "message", "%{DATESTAMP:Logdate} \[%{WORD:Severity}\] %{JAVACLASS:Class} %{GREEDYDATA:Stack}" ]
}
date {
match => [ "Logdate", "dd-MM-yyyy hh:mm:ss,SSS" ]
}
}
Logstash Command line trace
{
**"#timestamp" => "2014-08-26T08:16:18.021Z",**
"message" => "26-08-2014 11:16:18,021 [DEBUG] com.fnx.snapshot.mdb.SnapshotMDB - SnapshotMDB Ctor is called\r",
"#version" => "1",
"host" => "bts10d1",
"path" => "D:\\ElasticSearch\\logstash-1.4.2\\Dashboard_A\\Log_1\\6.log",
"type" => "dashboard_a",
"Logdate" => "26-08-2014 11:16:18,021",
"Severity" => "DEBUG",
"Class" => "com.fnx.snapshot.mdb.SnapshotMDB",
"Stack" => " - SnapshotMDB Ctor is called\r"
}
ElasticSearch result
{
"_index": "logstash-2014.08.28",
"_type": "dashboard_a",
"_id": "-y23oNeLQs2mMbyz6oRyew",
"_score": 1,
"_source": {
**"#timestamp": "2014-08-28T14:31:38.753Z",
**"message": "15:07,565 [DEBUG] com.fnx.snapshot.mdb.SnapshotMDB - SnapshotMDB Ctor is called\r",
"#version": "1",
"host": "bts10d1",
"path": "D:\\ElasticSearch\\logstash-1.4.2\\Dashboard_A\\Log_1\\6.log",
"type": "dashboard_a",
"tags": ["_grokparsefailure"]
}
}
Please make sure all your logs are in the same format!
You can see in the Logstash command-line trace that the log is
26-08-2014 11:16:18,021 [DEBUG] com.fnx.snapshot.mdb.SnapshotMDB - SnapshotMDB Ctor is called\r
But in Elasticsearch the log is
15:07,565 [DEBUG] com.fnx.snapshot.mdb.SnapshotMDB - SnapshotMDB Ctor is called\r",
The two logs have different times and their formats are not the same! The second one does not have any date information, so it causes a grok filter parsing error. Go check the original logs, or provide a sample of the original logs for more discussion if all of them are supposed to be in the same format.
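While tracking down where the malformed lines come from, one option is to keep failed parses out of the index and print them for inspection instead. This is only a sketch, reusing the _grokparsefailure conditional from earlier answers in this thread; the host value is a placeholder, and host => is the Logstash 1.4.x elasticsearch output syntax matching the version used in this question:
output {
  if "_grokparsefailure" in [tags] {
    stdout { codec => rubydebug }           # inspect lines that did not match the grok pattern
  } else {
    elasticsearch { host => "localhost" }   # placeholder; use your Elasticsearch address
  }
}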
