elasticsearch/kibana - analyze and visualize total time for transactions?

I am parsing log files using Logstash; here is what the JSON sent to Elasticsearch looks like.
For log lines containing the transaction start time, I add a db_transaction_commit_begin_time field with the time it is logged:
{
"message" => "2015-05-27 10:26:47,048 INFO [T:3 ID:26] (ClassName.java:396) - End committing transaction",
"#version" => "1",
"#timestamp" => "2015-05-27T15:24:11.594Z",
"host" => "test.com",
"path" => "/abc/xyz/log.logstash.test",
"logTimestampString" => "2015-05-27 10:26:47,048",
"logLevel" => "INFO",
"threadInfo" => "T:3 ID:26",
"class" => "ClassName.java",
"line" => "396",
"logMessage" => "End committing transaction",
"db_transaction_commit_begin_time" => "2015-05-27 10:26:47,048"
}
For log lines containing the transaction end time, I add a db_transaction_commit_end_time field with the time it is logged:
{
"message" => "2015-05-27 10:26:47,048 INFO [T:3 ID:26] (ClassName.java:396) - End committing transaction",
"#version" => "1",
"#timestamp" => "2015-05-27T15:24:11.594Z",
"host" => "test.com",
"path" => "/abc/xyz/log.logstash.test",
"logTimestampString" => "2015-05-27 10:26:47,048",
"logLevel" => "INFO",
"threadInfo" => "T:3 ID:26",
"class" => "ClassName.java",
"line" => "396",
"logMessage" => "End committing transaction",
"db_transaction_commit_end_time" => "2015-05-27 10:26:47,048"
}
Is it possible to calculate the time taken by a DB transaction (db_transaction_commit_end_time - db_transaction_commit_begin_time) where the threadInfo is the same? I know aggregations might help, but I am new to this and couldn't figure it out.
If I somehow get the db_transaction_time calculated and stored in a field, how can I visualize the time taken in a Kibana chart?

Use the elapsed{} filter in logstash.
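A rough sketch of what that could look like for the events above; the start/end message texts and tag names are assumptions (both example events in the question show the same "End committing transaction" message), and threadInfo is used as the correlation id:

filter {
  # make sure @timestamp reflects the log time before correlating events
  date {
    match => [ "logTimestampString", "yyyy-MM-dd HH:mm:ss,SSS" ]
  }
  # tag the start and end events; the exact logMessage values are assumptions
  if [logMessage] == "Start committing transaction" {
    mutate { add_tag => [ "tx_commit_start" ] }
  }
  if [logMessage] == "End committing transaction" {
    mutate { add_tag => [ "tx_commit_end" ] }
  }
  # pair start/end events that share the same threadInfo value
  elapsed {
    start_tag       => "tx_commit_start"
    end_tag         => "tx_commit_end"
    unique_id_field => "threadInfo"
    timeout         => 300
  }
}

The elapsed filter adds an elapsed_time field (in seconds) to the matching end event, which you can then chart in Kibana, for example with a date histogram on @timestamp and an average of elapsed_time as the metric.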

Related

Logstash : Mutate filter does not work

I have the following filter
filter {
  grok {
    break_on_match => false
    match => { 'message' => '\[(?<log_time>\d{0,2}\/\d{0,2}\/\d{2} \d{2}:\d{2}:\d{2}:\d{3} [A-Z]{3})\]%{SPACE}%{BASE16NUM}%{SPACE}%{WORD:system_stat}%{GREEDYDATA}\]%{SPACE}%{LOGLEVEL}%{SPACE}(?<log_method>[a-zA-Z\.]+)%{SPACE}-%{SPACE}%{GREEDYDATA:log_message}%{SPACE}#%{SPACE}%{IP:app_host}:%{INT:app_port};%{SPACE}%{GREEDYDATA}Host:%{IPORHOST:host_name}:%{POSINT:host_port}' }
    match => { 'message' => '\[(?<log_time>\d{0,2}\/\d{0,2}\/\d{2} \d{2}:\d{2}:\d{2}:\d{3} [A-Z]{3})\]' }
  }
  kv {
    field_split => "\n;"
    value_split => "=:"
    trimkey => "<>\[\],;\n"
    trim => "<>\[\],;\n"
  }
  date {
    match => [ "log_time", "MM/dd/YY HH:mm:ss:SSS z" ]
    target => "log_time"
    locale => "en"
  }
  mutate {
    convert => {
      "line_number" => "integer"
      "app_port" => "integer"
      "host_port" => "integer"
      "et" => "integer"
    }
    #remove_field => [ "message" ]
  }
  mutate {
    rename => {
      "et" => "execution_time"
      "URI" => "uri"
      "Method" => "method"
    }
  }
}
I can get results out of the grok and kv filters, but neither of the mutate filters works. Is it because of the kv filter?
EDIT: Purpose
My problem is that my log contains heterogeneous log records. For example:
[9/13/16 15:01:18:301 EDT] 89798797 SystemErr jbhsdbhbdv [vjnwnvurnuvuv] INFO djsbbdyebycbe - Filter.doFilter(..) took 0 ms.
[9/13/16 15:01:18:302 EDT] 4353453443 SystemErr sdgegrebrb [dbebtrntn] INFO sverbrebtnnrb - [SECURITY AUDIT] Received request from: "null" # wrvrbtbtbtf:000222; Headers=Host:vervreertherg:1111
Connection:keep-alive
User-Agent:Mozilla/5.0
Accept:text/css,*/*;q=0.1
Referer:https:kokokfuwnvuwnev/ikvdwninirnv/inwengi
Accept-Encoding:gzip
Accept-Language:en-US,en;q=0.8
; Body=; Method=GET; URI=dasd/wgomnwiregnm/iwenviewn; et=10ms; SC=200
All I care about is capturing the timestamp at the beginning of each record, plus a few other fields if they are present. I want Method, et, Host, loglevel, and URI. If these fields are not present, I still want to capture the event with the loglevel and the message being logged.
Is it advisable to capture such events using the same Logstash process? Should I be running two Logstash processes? The problem is that I don't know the structure of the logs beforehand, apart from the few fields that I do want to capture.
Multiline config
input {
  file {
    path => ["path to log"]
    start_position => "beginning"
    ignore_older => 0
    sincedb_path => "/dev/null"
    codec => multiline {
      pattern => "^\[\d{0,2}\/\d{0,2}\/\d{2} \d{2}:\d{2}:\d{2}:\d{3} [A-Z]{3}\]"
      negate => "true"
      what => "previous"
    }
  }
}
Maybe it is because some fields (line_number, et, URI, Method) aren't being created during the initial grok. For example, I see you define "log_method" but in mutate->rename, you refer to "Method". Is there a json codec or something applied in the input block that adds these extra fields?
If you post sample logs, I can test them with your filter and help you more. :)
EDIT:
I see that the log you sent has multiple lines. Are you using a multiline filter on input? Could you share your input block as well?
You definitely don't need to run two Logstash processes. One Logstash can take care of multiple log formats. You can use conditionals, try/catch, or mark the fields as optional by adding a '?' after.
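For illustration, a minimal sketch of the conditional approach (the patterns and field names below are simplified placeholders, not the exact grok from your filter):

filter {
  grok {
    # try a detailed pattern first, then fall back to just timestamp + message
    # (break_on_match defaults to true, so the first matching pattern wins)
    match => { "message" => [
      "\[(?<log_time>[^\]]+)\]%{SPACE}%{BASE16NUM}%{SPACE}%{WORD:system_stat}%{SPACE}%{GREEDYDATA:log_message}",
      "\[(?<log_time>[^\]]+)\]%{SPACE}%{GREEDYDATA:log_message}"
    ] }
  }
  # only touch the optional fields when they were actually extracted
  if [et] {
    mutate { rename => { "et" => "execution_time" } }
    mutate { convert => { "execution_time" => "integer" } }
  }
}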
MORE EDIT:
I'm getting output that implies that your mutate filters work:
"execution_time" => 10,
"uri" => "dasd/wgomnwiregnm/iwenviewn",
"method" => "GET"
once I changed trimkey => "<>\[\],;\n" to trimkey => "<>\[\],;( )?\n". I noticed that those fields (et, Method) were being prefixed with a space.
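For reference, that change makes the kv block from your filter look like this:

kv {
  field_split => "\n;"
  value_split => "=:"
  trimkey => "<>\[\],;( )?\n"
  trim => "<>\[\],;\n"
}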
Note: I'm using the following multiline filter for testing; if yours is different, it would affect the outcome. Let me know if that helps.
codec => multiline {
pattern => "\n"
negate => true
what => previous
}

Parsing out text from a string using a logstash filter

I have an Apache access log from which I would like to parse out some text within the REQUEST field:
GET /foo/bar?contentId=ABC&_=1212121212 HTTP/1.1"
What I would like to do is extract the 1212121212 and assign it to a field, but which field it goes into depends on the prefix ABC&_ (so I think I need an if statement or something). The prefix could take on other forms (e.g., DDD&_).
So basically I would like to say
if (prefix == ABC&_)
ABCID = 1212121212
elseif (prefix == DDD&_)
DDDID = <whatever value>
else
do nothing
I have been struggling to build the right filter in logstash to extract the id based on the prefix. Any help would be great.
Thank you
For this you would use a grok filter.
For example:
artur@pandaadb:~/dev/logstash$ ./logstash-2.3.2/bin/logstash -f conf2
Settings: Default pipeline workers: 8
Pipeline main started
GET /foo/bar?contentId=ABC&_=1212121212 HTTP/1.1"
{
"message" => "GET /foo/bar?contentId=ABC&_=1212121212 HTTP/1.1\"",
"#version" => "1",
"#timestamp" => "2016-07-28T15:59:12.787Z",
"host" => "pandaadb",
"prefix" => "ABC&_",
"id" => "1212121212"
}
This is your sample input, parsing out your prefix and Id.
There is no need for an if here, since the regular expression of the GROK filter takes care of it.
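The filter producing this output is presumably just the grok match on its own, i.e. the same match that appears in the full filter further down:

filter {
  grok {
    match => { "message" => ".*contentId=%{GREEDYDATA:prefix}=%{NUMBER:id}" }
  }
}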
You can, however (if you need to put it in different fields), analyse the parsed field and add its value to a different one.
That would produce output like this:
GET /foo/bar?contentId=ABC&_=1212121212 HTTP/1.1"
{
"message" => "GET /foo/bar?contentId=ABC&_=1212121212 HTTP/1.1\"",
"#version" => "1",
"#timestamp" => "2016-07-28T16:05:07.442Z",
"host" => "pandaadb",
"prefix" => "ABC&_",
"id" => "1212121212",
"ABCID" => "1212121212"
}
GET /foo/bar?contentId=DDD&_=1212121212 HTTP/1.1"
{
"message" => "GET /foo/bar?contentId=DDD&_=1212121212 HTTP/1.1\"",
"#version" => "1",
"#timestamp" => "2016-07-28T16:05:20.026Z",
"host" => "pandaadb",
"prefix" => "DDD&_",
"id" => "1212121212",
"DDDID" => "1212121212"
}
The filter I used for this looks like this:
filter {
  grok {
    match => { "message" => ".*contentId=%{GREEDYDATA:prefix}=%{NUMBER:id}" }
  }
  if [prefix] =~ "ABC" {
    mutate {
      add_field => { "ABCID" => "%{id}" }
    }
  }
  if [prefix] =~ "DDD" {
    mutate {
      add_field => { "DDDID" => "%{id}" }
    }
  }
}
I hope that illustrates how to go about it. You can use this to test your grok regex:
http://grokdebug.herokuapp.com/
Have fun!
Artur

CSV filter in logstash throwing "_csvparsefailure" error

I asked another question earlier which I think might be related to this one:
JSON parser in logstash ignoring data?
The reason I think it's related is that in the previous question, Kibana wasn't displaying results from the JSON parser that have the "PROGRAM" field set to "mfd_status". Now I've changed the way I do things and removed the JSON parser, just in case it was interfering with something, but I still don't have any logs with "mfd_status" in them showing up.
csv
{
columns => ["unixTime", "unixTime2", "FACILITY_NUM", "LEVEL_NUM", "PROGRAM", "PID", "MSG_FULL"]
source => "message"
separator => " "
}
In my filter from the previous question I used two grok filters, now I've replaced them with a csv filter. I also have two date and a fingerprint filter but they're irrelevant for this question, I think.
Example log messages:
"1452564798.76\t1452496397.00\t1\t4\tkernel\t\t[ 6252.000246] sonar: sonar_write(): waiting..."
OUTPUT:
"unixTime" => "1452564798.76",
"unixTime2" => "1452496397.00",
"FACILITY_NUM" => "1",
"LEVEL_NUM" => "4",
"PROGRAM" => "kernel",
"PID" => nil,
"MSG_FULL" => "[ 6252.000246] sonar: sonar_write(): waiting...",
"TIMESTAMP" => "2016-01-12T02:13:18.760Z",
"TIMESTAMP_second" => "2016-01-11T07:13:17.000Z"
"1452564804.57\t1452496403.00\t1\t7\tmfd_status\t\t00800F08CFB0\textra\t{\"date\":1452543203,\"host\":\"ABCD1234\",\"inet\":[\"169.254.42.207/16\",\"10.8.207.176/32\",\"172.22.42.207/16\"],\"fb0\":[\"U:1280x800p-60\",32]}"
OUTPUT:
"tags" => [
[0] "_csvparsefailure"
After it says kernel/mfd_status in the logs, there shouldn't be any more delimiters, and everything should go under the MSG_FULL field.
So, to summarize: why does one of my log messages parse correctly while the other doesn't? Also, even if a message doesn't parse correctly, I think it should still be sent to Elasticsearch, just with empty fields; why doesn't that happen either?
You're almost there; you just need to override two more parameters in your csv filter and both lines will be parsed correctly.
The first is skip_empty_columns => true, because your second log line has an empty field that needs to be ignored.
The second is quote_char => "'" (or anything other than the double quote "), since your JSON contains double quotes.
csv {
columns => ["unixTime", "unixTime2", "FACILITY_NUM", "LEVEL_NUM", "PROGRAM", "PID", "MSG_FULL"]
source => "message"
separator => " "
skip_empty_columns => true
quote_char => "'"
}
Using this, your first log line parses as:
{
"message" => "1452564798.76\\t1452496397.00\\t1\\t4\\tkernel\\t\\t[ 6252.000246] sonar: sonar_write(): waiting...",
"#version" => "1",
"#timestamp" => "2016-01-12T04:21:34.051Z",
"host" => "iMac.local",
"unixTime" => "1452564798.76",
"unixTime2" => "1452496397.00",
"FACILITY_NUM" => "1",
"LEVEL_NUM" => "4",
"PROGRAM" => "kernel",
"MSG_FULL" => "[ 6252.000246] sonar: sonar_write(): waiting..."
}
And the second log line parses as:
{
"message" => "1452564804.57\\t1452496403.00\\t1\\t7\\tmfd_status\\t\\t00800F08CFB0\\textra\\t{\\\"date\\\":1452543203,\\\"host\\\":\\\"ABCD1234\\\",\\\"inet\\\":[\\\"169.254.42.207/16\\\",\\\"10.8.207.176/32\\\",\\\"172.22.42.207/16\\\"],\\\"fb0\\\":[\\\"U:1280x800p-60\\\",32]}",
"#version" => "1",
"#timestamp" => "2016-01-12T04:21:07.974Z",
"host" => "iMac.local",
"unixTime" => "1452564804.57",
"unixTime2" => "1452496403.00",
"FACILITY_NUM" => "1",
"LEVEL_NUM" => "7",
"PROGRAM" => "mfd_status",
"MSG_FULL" => "00800F08CFB0",
"column8" => "extra",
"column9" => "{\\\"date\\\":1452543203,\\\"host\\\":\\\"ABCD1234\\\",\\\"inet\\\":[\\\"169.254.42.207/16\\\",\\\"10.8.207.176/32\\\",\\\"172.22.42.207/16\\\"],\\\"fb0\\\":[\\\"U:1280x800p-60\\\",32]}"
}

How to write an expression for a special KV string in the logstash kv filter?

I have plenty of log lines like this:
uid[118930] pageview h5_act, actTag[cyts] corpId[2] inviteType[0] clientId[3] clientVer[2.3.0] uniqueId[d317de16a78a0089b0d94d684e7a9585565ffa236138c0.85354991] srcId[0] subSrc[]
Most of these are key-value expressions in KEY[VALUE] form.
I have read the documentation but still cannot figure out how to write the configuration.
Any help would be appreciated!
You can simply configure your kv filter using the value_split and trim settings, like below:
filter {
kv {
value_split => "\["
trim => "\]"
}
}
For the sample log line you've given, you'll get:
{
"message" => "uid[118930] pageview h5_act, actTag[cyts] corpId[2] inviteType[0] clientId[3] clientVer[2.3.0] uniqueId[d317de16a78a0089b0d94d684e7a9585565ffa236138c0.85354991] srcId[0] subSrc[]",
"#version" => "1",
"#timestamp" => "2015-12-12T05:04:00.888Z",
"host" => "iMac.local",
"uid" => "118930",
"actTag" => "cyts",
"corpId" => "2",
"inviteType" => "0",
"clientId" => "3",
"clientVer" => "2.3.0",
"uniqueId" => "d317de16a78a0089b0d94d684e7a9585565ffa236138c0.85354991",
"srcId" => "0",
"subSrc" => ""
}

Logstash converting date to valid joda time (@timestamp)

Hope someone can help me out!
I have a question about Logstash. I grok the following date with success: 26/Jun/2013:14:00:26 +0200
Next, I want this date to be used as the @timestamp of the event. As you know, Logstash automatically adds its own timestamp.
Replacing the timestamp that Logstash adds can be done with the date filter. I have added the following date filter: match => [ "date", "dd/MMM/YYYY:HH:mm:ss Z" ]
But for some reason that doesn't work. When I test it out, I see that Logstash just adds its own timestamp.
Code:
grok {
type => "log-date"
pattern => "%{HTTPDATE:date}"
}
date{
type => "log-date"
match => [ "date", "dd/MMM/YYYY:HH:mm:ss Z"]
}
I need to do this, so I can add events to elasticsearch.
Thanks in advance!
I used the following approach:
# strip the timestamp and force event timestamp to be the same.
# the original string is saved in field %{log_timestamp}.
# the original logstash input timestamp is saved in field %{event_timestamp}.
grok {
  patterns_dir => "./patterns"
  match => [ "message", "%{IRODS_TIMESTAMP:log_timestamp}" ]
  add_tag => "got_syslog_timestamp"
  add_field => [ "event_timestamp", "%{@timestamp}" ]
}
date {
  match => [ "log_timestamp", "MMM dd HH:mm:ss" ]
}
mutate {
  replace => [ "@timestamp", "%{log_timestamp}" ]
}
My problem now is that, even though @timestamp is replaced, I would like to convert it to an ISO8601-compatible format first so that other programs don't have problems interpreting it, like the timestamp present in "event_timestamp":
"#timestamp" => "Mar 5 14:38:40",
"#version" => "1",
"type" => "irods.relog",
"host" => "ids-dev",
"path" => "/root/logstash/reLog.2013.03.01",
"pid" => "5229",
"level" => "NOTICE",
"log_timestamp" => "Mar 5 14:38:40",
"event_timestamp" => "2013-09-17 12:20:28 UTC",
"tags" => [
[0] "got_syslog_timestamp"
]
You could convert it easily since you have the year information... In my case I would have to parse it out of the "path" (filename) attribute... but still, there does not seem to be a convert_to_iso8601 => @timestamp directive.
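As a rough sketch of that idea (the year is hard-coded here as an assumption; in your case it would come from the path), you could prepend the year and let the date filter set @timestamp, which is always stored as ISO8601, instead of replacing @timestamp with the raw string:

filter {
  # assumption: the year is known (e.g. parsed from the filename); prepend it
  # so the date filter gets a complete date to work with
  mutate {
    add_field => { "log_timestamp_full" => "2013 %{log_timestamp}" }
  }
  date {
    # the date filter writes @timestamp in ISO8601, so no mutate/replace is needed
    match => [ "log_timestamp_full", "yyyy MMM dd HH:mm:ss", "yyyy MMM d HH:mm:ss" ]
    locale => "en"
  }
}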
Hope this helps with your issue anyway! :)
The above answer is just a workaround! Try adding locale => "en" to your code.
If it is not set, the weekday and month names will be parsed using the platform's default locale (Spanish, French, or whatever), and that's why it didn't work (since your log is in English).
date{
type => "log-date"
match => [ "date", "dd/MMM/YYYY:HH:mm:ss Z"]
locale => "en"
}
