Logstash replace @timestamp with syslog date - filter

I'm a bit confused. I'm trying to pull out the syslog date (I'm backfilling Logstash) and replace @timestamp with it. I've tried almost everything.
This is my filter:
filter {
if [type] == "syslog" {
grok {
match => {
"message" => ["%{SYSLOGTIMESTAMP:DATETIME} %{WORD:SERVER} (?<BINARY>(.*?)(php\-cgi|php))\: %{DATA:PHP_ERROR_TYPE}\:\s\s(?<PHP_ERROR_DESC>(.*?)(e\s\d))""]
}
}
date {
match => { "DATETIME" => [ "MMM d HH:mm:ss", "MMM dd HH:mm:ss", "ISO8601" ] }
target => "#timestamp"
add_tag => [ "tmatch" ]
}
if !("_grokparsefailure" in [tags]) {
mutate {
replace => [ "#source_host", "%{SERVER}" ]
}
}
mutate {
remove_field => [ "SERVER" ]
}
}
}
sample output:
{
"message" => "Sep 10 00:00:00 xxxxxxx",
"#timestamp" => "2013-12-05T13:29:35.169Z",
"#version" => "1",
"type" => "xxxx",
"host" => "127.0.0.1:xxx",
"DATETIME" => "Sep 10 00:00:00",
"BINARY" => "xxxx",
"PHP_ERROR_TYPE" => "xxxx",
"PHP_ERROR_DESC" => "xxxxx",
"tags" => [
[0] "tmatch"
],
"#source_host" => "xxx"
}
tmatch is in the tags, so I assume the date filter ran, but why do I still have:
@timestamp => "2013-12-05T13:29:35.169Z"
?
Thanks for the help (my Logstash is logstash-1.2.2-flatjar.jar).

Let's take a look at your date filter:
date {
match => { "DATETIME" => [ "MMM d HH:mm:ss", "MMM dd HH:mm:ss", "ISO8601" ] }
target => "#timestamp"
add_tag => [ "tmatch" ]
}
In particular, the match parameter:
match => { "DATETIME" => [ "MMM d HH:mm:ss", "MMM dd HH:mm:ss", "ISO8601" ] }
Match expects an array. I'm not sure what you're passing, exactly, but it's definitely not an array. I tried running this with -v, and I'm surprised to see it doesn't complain.
You probably mean something closer to this:
match => ["DATETIME", "MMM d HH:mm:ss", "MMM dd HH:mm:ss", "ISO8601"]
Note the first element of the array is the target field; additional elements are pattern(s) to match against.
Past that, you really only need to pass the one format you expect, but it looks like that's included among the three you're sending.
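Putting it together, the whole date filter would then look something like this (a sketch of the corrected syntax, keeping your original tag):
date {
  match => [ "DATETIME", "MMM d HH:mm:ss", "MMM dd HH:mm:ss", "ISO8601" ]
  target => "@timestamp"
  add_tag => [ "tmatch" ]
}
With that, the parsed DATETIME value should end up in @timestamp instead of the time Logstash received the event.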

If you want the timestamp shown in your local timezone instead of UTC, you can do it like this:
ruby {
code => "event['@timestamp'] = event['@timestamp'].local('-08:00')"
}
Before: @timestamp => "2013-12-05T13:29:35.169Z"
After: @timestamp => "2013-12-05T05:29:35.169-08:00"
Update:
The local method doesn't work in version 1.4.2, so use another API instead:
ruby {
code => "event['@timestamp'] = event['@timestamp'].getlocal"
}
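Side note, not from the original answer: on Logstash 5.x and later the event['...'] hash syntax no longer works; you have to go through event.get / event.set, and @timestamp itself is expected to stay a Timestamp object. A rough equivalent (the timestamp_local field name is just an example) would be:
ruby {
  # store a local-time string in a separate field instead of overwriting @timestamp
  code => "event.set('timestamp_local', event.get('@timestamp').time.getlocal.to_s)"
}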

Related

logstash : create fingerprint from timestamp part

I have a problem creating a fingerprint based on client IP and a timestamp containing date + hour.
I'm using Logstash 7.3.1. Here is the relevant part of my configuration file:
filter {
grok {
match => { "message" => "%{COMBINEDAPACHELOG}" }
}
date{
match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
}
...
ruby{
code => "
keydate = Date.parse(event.get('timestamp'))
event.set('keydate', keydate.strftime('%Y%m%d-%H'))
"
}
fingerprint {
key => "my_custom_secret"
method => "SHA256"
concatenate_sources => "true"
source => [
"clientip",
"keydate"
]
}
}
The problem is in the 'ruby' block. I tried multiple methods to compute keydate, but none of them works without giving me errors.
The last one (using this config file) is:
[ERROR][logstash.filters.ruby ] Ruby exception occurred: Missing Converter handling for full class name=org.jruby.ext.date.RubyDateTime, simple name=RubyDateTime
input document
{
"timestamp" => "19/Sep/2019:00:07:56 +0200",
"referrer" => "-",
"#version" => "1",
"#timestamp" => 2019-09-18T22:07:56.000Z,
...
"request" => "index.php",
"type" => "apache_access",
"clientip" => "54.157.XXX.XXX",
"verb" => "GET",
...
"tags" => [
[0] "_rubyexception" # generated by the ruby exception above
],
"response" => "200"
}
expected output
{
"timestamp" => "19/Sep/2019:00:07:56 +0200",
"referrer" => "-",
"#version" => "1",
"#timestamp" => 2019-09-18T22:07:56.000Z,
...
"request" => "index.php",
"type" => "apache_access",
"clientip" => "54.157.XXX.XXX",
"verb" => "GET",
...
"keydate" => "20190919-00", #format : YYYYMMDD-HH
"fingerprint" => "ab347766ef....1190af",
"response" => "200"
}
As always, many thanks for all your help !
I advise removing the ruby snippet and using the built-in date filter: https://www.elastic.co/guide/en/logstash/current/plugins-filters-date.html
What you are doing in the ruby snippet is exactly what the date filter does: extract a timestamp from a field and reconstruct it in your desired format.
Another option (a bit less recommended, but it will also work) is to use grok to extract the relevant parts of the timestamp and combine them in a different manner.
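If you do want to stay with ruby, one possible workaround (a sketch, not taken from the original thread) is to parse the raw apache timestamp string with Ruby's Time class, so no DateTime object ever needs to be stored on the event:
ruby {
  code => "
    require 'time'
    # '19/Sep/2019:00:07:56 +0200' -> '20190919-00', keeping the original UTC offset
    t = Time.strptime(event.get('timestamp'), '%d/%b/%Y:%H:%M:%S %z')
    event.set('keydate', t.strftime('%Y%m%d-%H'))
  "
}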

How can I create JSON format log when entering into Elasticsearch by logstash

I've been told that by using a Logstash pipeline I can re-create a log format (i.e. JSON) when entering into Elasticsearch, but I don't understand how to do it.
Current Logstash configuration (I took the below from Google, not for any particular reason):
/etc/logstash/conf.d/metrics-pipeline.conf
input {
beats {
port => 5044
client_inactivity_timeout => "3600"
}
}
filter {
if [message] =~ />/ {
dissect {
mapping => {
"message" => "%{start_of_message}>%{content}"
}
}
kv {
source => "content"
value_split => ":"
field_split => ","
trim_key => "\[\]"
trim_value => "\[\]"
target => "message"
}
mutate {
remove_field => ["content","start_of_message"]
}
}
}
filter {
if [system][process] {
if [system][process][cmdline] {
grok {
match => {
"[system][process][cmdline]" => "^%{PATH:[system][process][cmdline_path]}"
}
remove_field => "[system][process][cmdline]"
}
}
}
grok {
match => { "message" => "%{COMBINEDAPACHELOG}" }
}
date {
match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]
}
}
output {
elasticsearch {
hosts => "1.2.1.1:9200"
manage_template => false
index => "%{[#metadata][beat]}-%{[#metadata][version]}-%{+YYYY.MM.dd}"
}
}
I have a couple of log files located at
/root/logs/File.log
/root/logs/File2.log
The log format there is:
08:26:51,753 DEBUG [ABC] (default-threads - 78) (1.2.3.4)(368)>[TIMESTAMP:Wed Sep 11 08:26:51 UTC 2019],[IMEI:03537],[COMMAND:INFO],[GPS STATUS:true],[INFO:true],[SIGNAL:false],[ENGINE:0],[DOOR:0],[LON:90.43],[LAT:23],[SPEED:0.0],[HEADING:192.0],[BATTERY:100.0%],[CHARGING:1],[O&E:CONNECTED],[GSM_SIGNAL:100],[GPS_SATS:5],[GPS POS:true],[FUEL:0.0V/0.0%],[ALARM:NONE][SERIAL:01EE]
In Kibana, by default, it shows like this:
https://imgshare.io/image/stackflow.I0u7S
https://imgshare.io/image/jsonlog.IHQhp
"message": "21:33:42,004 DEBUG [LOG] (default-threads - 100) (1.2.3.4)(410)>[TIMESTAMP:Sat Sep 07 21:33:42 UTC 2019],[TEST:123456],[CMD:INFO],[STATUS:true],[INFO:true],[SIGNAL:false],[ABC:0],[DEF:0],[GHK:1111],[SERIAL:0006]"
But I want to get it like below:
"message": {
"TIMESTAMP": "Sat Sep 07 21:33:42 UTC 2019",
"TEST": "123456",
"CMD":INFO,
"STATUS":true,
"INFO":true,
"SIGNAL":false,
"ABC":0,
"DEF":0,
"GHK":0,
"GHK":1111
}
Can this be done? If yes, how?
Thanks
With the if [message] =~ />/, the filters will only apply to messages containing a >. The dissect filter splits the message at the >. The kv filter applies a key-value transformation to the second part of the message, removing the []. The mutate remove_field removes the extra fields.
filter {
if [message] =~ />/ {
dissect {
mapping => {
"message" => "%{start_of_message}>%{content}"
}
}
kv {
source => "content"
value_split => ":"
field_split => ","
trim_key => "\[\]"
trim_value => "\[\]"
target => "message"
}
mutate {
remove_field => ["content","start_of_message"]
}
}
}
Result, using the provided log line:
{
"#version": "1",
"host": "YOUR_MACHINE_NAME",
"message": {
"DEF": "0",
"TIMESTAMP": "Sat Sep 07 21:33:42 UTC 2019",
"TEST": "123456",
"CMD": "INFO",
"SERIAL": "0006]\r",
"GHK": "1111",
"INFO": "true",
"STATUS": "true",
"ABC": "0",
"SIGNAL": "false"
},
"#timestamp": "2019-09-10T09:21:16.422Z"
}
In addition to doing the filtering with if [message] =~ />/, you can also do the comparison on the path field, which is set by the file input plugin. Also, if you have multiple file inputs, you can set the type field and use that instead; see https://stackoverflow.com/a/20562031/6113627.
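For example, with the two files mentioned above, a sketch of that approach (the type name device_log is just an example) could look like:
input {
  file {
    path => [ "/root/logs/File.log", "/root/logs/File2.log" ]
    type => "device_log"   # arbitrary label, pick whatever suits you
  }
}
filter {
  if [type] == "device_log" {
    # dissect / kv / mutate filters from above go here
  }
}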

Logstash nginx filter doesn't apply to half of rows

Using filebeat to push nginx logs to logstash and then to elasticsearch.
Logstash filter:
filter {
if [fileset][module] == "nginx" {
if [fileset][name] == "access" {
grok {
match => { "message" => ["%{IPORHOST:[nginx][access][remote_ip]} - %{DATA:[nginx][access][user_name]} \[%{HTTPDATE:[nginx][access][time]}\] \"%{WORD:[nginx][access][method]} %{DATA:[nginx][access][url]} HTTP/%{NUMBER:[nginx][access][http_version]}\" %{NUMBER:[nginx][access][response_code]} %{NUMBER:[nginx][access][body_sent][bytes]} \"%{DATA:[nginx][access][referrer]}\" \"%{DATA:[nginx][access][agent]}\""] }
remove_field => "message"
}
mutate {
add_field => { "read_timestamp" => "%{#timestamp}" }
}
date {
match => [ "[nginx][access][time]", "dd/MMM/YYYY:H:m:s Z" ]
remove_field => "[nginx][access][time]"
}
useragent {
source => "[nginx][access][agent]"
target => "[nginx][access][user_agent]"
remove_field => "[nginx][access][agent]"
}
geoip {
source => "[nginx][access][remote_ip]"
target => "[nginx][access][geoip]"
}
}
else if [fileset][name] == "error" {
grok {
match => { "message" => ["%{DATA:[nginx][error][time]} \[%{DATA:[nginx][error][level]}\] %{NUMBER:[nginx][error][pid]}#%{NUMBER:[nginx][error][tid]}: (\*%{NUMBER:[nginx][error][connection_id]} )?%{GREEDYDATA:[nginx][error][message]}"] }
remove_field => "message"
}
mutate {
rename => { "#timestamp" => "read_timestamp" }
}
date {
match => [ "[nginx][error][time]", "YYYY/MM/dd H:m:s" ]
remove_field => "[nginx][error][time]"
}
}
}
}
There is just one file, /var/log/nginx/access.log.
In Kibana, I see roughly half of the rows with the message parsed and the other half not.
All of the rows in Kibana have the tag "beats_input_codec_plain_applied".
Examples from filebeat -e
Row that works fine:
"source": "/var/log/nginx/access.log",
"offset": 5405195,
"message": "...",
"fileset": {
"module": "nginx",
"name": "access"
}
Row that doesn't work fine (no "fileset"):
"offset": 5405397,
"message": "...",
"source": "/var/log/nginx/access.log"
Any idea what could be the cause?
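One way to narrow this down (a sketch, not from the original thread; the no_fileset tag name is arbitrary) is to tag every event that arrives without a fileset field and then filter on that tag in Kibana:
filter {
  if ![fileset] {
    # events that arrived without filebeat module metadata end up here
    mutate { add_tag => [ "no_fileset" ] }
  }
}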

Grok configuration ELK

I have an unusual type of log to parse. The syntax is:
2013-01-05 03:29:38,842 INFO [ajp-bio-8009-exec-69] web.CustomAuthenticationSuccessHandler - doLogin : admin.ebusiness date : 2013-01-05 03:29:38
When I use the grok pattern :
if [type] in ["edai"] {
grok {
match => { "message" => ["%{YEAR:year}-%{WORD:month}-%{DATA:day} %{DATA:hour}:%{DATA:minute}:%{DATA:second},%{DATA:millis} %{NOTSPACE:loglevel} {0,1}%{GREEDYDATA:message}"] }
overwrite => [ "message" ]
}
}
The pattern works, as you can see, but when I go into Kibana, the log stays in one block in the "message" field, like this:
2013-01-05 23:27:47,030 INFO [ajp-bio-8009-exec-63] web.CustomAuthenticationSuccessHandler - doLogin : admin.ebusiness date : 2013-01-05 23:27:47
I would prefer to have it like this:
{ "year": [["2013"]], "month": [["01"]], "day": [["05"]], "hour": [["04"]], "minute": [["04"]], "second": [["39"]], "millis": [["398"] ], "loglevel": [ ["INFO"]] }
Can you help me to parse it correctly please?
Just tested this configuration. I kinda copied everything from your question.
input {
stdin { type => "edai" }
}
filter {
if [type] == "edai" {
grok {
match => { "message" => ["%{YEAR:year}-%{WORD:month}-%{DATA:day} %{DATA:hour}:%{DATA:minute}:%{DATA:second},%{DATA:millis} %{NOTSPACE:loglevel} {0,1}%{GREEDYDATA:message}"] }
overwrite => [ "message" ]
}
}
}
output {
stdout { codec => rubydebug }
}
This is the output:
{
"year" => "2013",
"message" => " [ajp-bio-8009-exec-69] web.CustomAuthenticationSuccessHandler - doLogin : admin.ebusiness date : 2013-01-05 03:29:38\r",
"type" => "edai",
"minute" => "29",
"second" => "38",
"#timestamp" => 2017-06-29T08:19:08.605Z,
"month" => "01",
"hour" => "03",
"loglevel" => "INFO",
"#version" => "1",
"host" => "host_name",
"millis" => "842",
"day" => "05"
}
Everything seems fine from my perspective.
I had an issue when I compared the type the way you did:
if [type] in ["edai"]
It did not work, so I replaced it with a direct comparison:
if [type] == "edai"
This also worked:
if [type] in "edai"
And that solved the issue.
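If you also want the event's @timestamp to reflect the parsed log time rather than the ingest time, one way (a sketch; the temporary log_date field name is just an example) is to reassemble the captures and hand them to the date filter:
mutate {
  add_field => { "log_date" => "%{year}-%{month}-%{day} %{hour}:%{minute}:%{second},%{millis}" }
}
date {
  match => [ "log_date", "yyyy-MM-dd HH:mm:ss,SSS" ]
  remove_field => [ "log_date" ]
}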

Reading positional file with logstash, converting two string fields to date, applying math operation to another

So, I have a positional file that looks like this
0100003074400003074400000000103000000000066167424000000000131527492000000000131527463C19860000000000000320160302201603300010019700XXXXXXXX XX XXXXXX 000000000133719971
02000008013000008013000000001010000000001327506142016033000000000000046053100000000013268252820160516000000000020091000000000066558874002002
And I want logstash to ship only the lines starting with '01' to elasticsearch. I've managed to do this by doing the following
filter {
# get only lines that start with 01
if ([message] !~ "^01") {
drop{}
}
grok {
match => { "message" => "^01(?<n_estab>.{9})(?<n_filial>.{9})(?<depart>.{9})(?<prod>.{2})(?<id_apres>.{18})(?<id_mov>.{18})(?<id_mov_orig>.{18})(?<orig_int>.{1})(?<cod_oper>.{1})(?<cod_moeda>.{3})(?<valor>.{14})(?<date_trans>.{8})(?<date_agenda>.{8})(?<num_parcela>.{3})(?<qtd_parcelas>.{3})(?<cod_rub>.{4})(?<desc_rub>.{30})(?<id_pgto>.{18})" }
}
mutate {
strip => [
"n_estab",
"n_filial",
"depart",
"prod",
"id_apres",
"id_mov",
"id_mov_orig",
"orig_int",
"cod_oper",
"cod_moeda",
"valor",
"date_trans",
"date_agenda",
"num_parcela",
"qtd_parcelas",
"cod_rub",
"desc_rub",
"id_pgto"
]
convert => {
"n_estab" => "integer"
"n_filial" => "integer"
"depart" => "integer"
"prod" => "integer"
"id_apres" => "integer"
"id_mov" => "integer"
"id_mov_orig" => "integer"
"orig_int" => "string"
"cod_oper" => "integer"
"cod_moeda" => "integer"
"valor" => "float"
"date_trans" => "string"
"date_agenda" => "string"
"num_parcela" => "integer"
"qtd_parcelas" => "integer"
"cod_rub" => "integer"
"desc_rub" => "string"
"id_pgto" => "integer"
}
}
}
Now, I want to divide valor by 100 and convert fields date_trans and date_agenda from string to date format, so I can index by any of those fields on elasticsearch and kibana.
I've tried adding the following lines to the filter:
ruby {
code => "event['valor'] = event['valor'] / 100
event['date_trans'] = Date.strptime(event['date_trans'], '%Y%m%d')
event['date_agenda'] = Date.strptime(event['date_agenda'], '%Y%m%d')"
}
After I added those lines to my conf file, logstash starts but doesn't parse any of my files... it simply hangs! Since I can add gibberish to the ruby code block and it won't alert me to anything, I figure it must be something with the ruby code, right?
UPDATE
After executing
/opt/logstash/bin/logstash -f /etc/logstash/conf.d/subq_detliq.conf -v --debug --verbose
It started inserting into elasticsearch... Does logstash keep track of which files it has already read somewhere, and never come back to them?
Also, this is what it's inserting into ES:
"message" => "0100001504000001504000000000101000000000063916400000000000124569419000000000124569414C09860000000000011620151127201601260020029700XXXXXXXX XX XXXXXX 000000000128479123 ",
"#version" => "1",
"#timestamp" => "2016-04-28T18:11:58.681Z",
"host" => "cherno-alpha",
"path" => "/tmp/HSTRD0003/SUBQ_DETLIQ_HSTR_20160124_000100.REM",
"n_estab" => 15040,
"n_filial" => 15040,
"depart" => 1,
"prod" => 1,
"id_apres" => 63916400,
"id_mov" => 124569419,
"id_mov_orig" => 124569414,
"orig_int" => "C",
"cod_oper" => 0,
"cod_moeda" => 986,
"valor" => 1.16,
"date_trans" => #<Date: 2015-11-27 ((2457354j,0s,0n),+0s,2299161j)>,
"date_agenda" => #<Date: 2016-01-26 ((2457414j,0s,0n),+0s,2299161j)>,
"num_parcela" => 2,
"qtd_parcelas" => 2,
"cod_rub" => 9700,
"desc_rub" => "XXXXXXXX XX XXXXXX",
"id_pgto" => 128479123
Somehow Ruby converted the date fields, but not really? ES still thinks they're just regular strings and won't let me create an index on them.
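One way around that (a sketch, not from the original thread) is to let the date filter do the conversion instead of ruby, so the fields end up as real timestamps that Elasticsearch will map as dates, and keep ruby only for the division:
date {
  match => [ "date_trans", "yyyyMMdd" ]
  target => "date_trans"
}
date {
  match => [ "date_agenda", "yyyyMMdd" ]
  target => "date_agenda"
}
ruby {
  # only the division still needs ruby; valor was already converted to float by the mutate above
  code => "event['valor'] = event['valor'] / 100.0"
}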
