Having trouble parsing Check Point firewall logs using grok filter - Elasticsearch

They are Check Point firewall logs and they look like this (first row = field names; second row and every row thereafter = values of the respective fields):
"Number" "Date" "Time" "Interface" "Origin" "Type" "Action" "Service" "Source Port" "Source" "Destination" "Protocol" "Rule" "Rule Name" "Current Rule Number" "Information"
"7319452" "18Mar2015" "15:00:00" "eth1-04" "grog1" "Log" "Accept" "domain-udp" "20616" "172.16.36.250" "8.8.8.8" "udp" "7" "" "7-open_1" "inzone: Internal; outzone: External; service_id: domain-udp" "Security Gateway/Management"
I have tried doing this bit by bit, starting from some grok filter examples I found online.
I have a file that has nothing more than
"GoLpoT" "502" (quotes included)
and some code that reads this file which is pasted below:
input {
  file {
    path => "/usr/local/bin/firewall_log"
  }
}
filter {
  grok {
    match => ["message", "%{WORD:type}\|%{NUMBER:nums}"]
  }
}
output {
  elasticsearch { host => localhost }
  stdout { codec => rubydebug }
}
When I run this, I get the following output with a _grokparsefailure tag:
"message" => "",
"#version" => "1",
"#timestamp" => "2015-04-30T15:52:48.331Z",
"host" => "UOD-220076",
"path" => "/usr/local/bin/firewall_log",
"tags" => [
[0] "_grokparsefailure"
Any help please.
My second question: how do I parse the Date and Time - together or separately?
The date doesn't change - it's all logs from one day - it's only the time that changes.
Many thanks.

Related

Automatically parse logs fields with Logstash

Let's say I have this kind of log:
Jun 2 00:00:00 192.168.14.4 date=2016-06-01 time=23:56:05
devname=POPB-FW-01 devid=FG1K2D3I14800220 logid=1059028704 type=utm
subtype=app-ctrl eventtype=app-ctrl-all level=information vd="root"
appid=40568 user="" srcip=10.20.4.35 srcport=52438
srcintf="VRF-PUBLIC" dstip=125.209.230.238 dstport=443 dstintf="OUT"
proto=6 service="HTTPS" sessionid=424666004 applist="Monitor-all"
appcat="Web.Others" app="HTTPS.BROWSER" action=pass
hostname="lcs.naver.com" url="/" msg="Web.Others: HTTPS.BROWSER,"
apprisk=medium
So with the code below, I can extract the timestamp and the IP into future Elastic fields:
filter {
  grok {
    match => { "message" => "%{SYSLOGTIMESTAMP:timestamp} %{IP:client}" }
  }
}
Now, how do I automatically get fields for the rest of the log? Is there a simple way to say:
The thing before the "=" is the field name and the thing after is the value.
So I can obtain a JSON document for the Elastic index with many fields for each log line:
{
    "path" => "C:/Users/yoyo/Documents/yuyu/temp.txt",
    "@timestamp" => 2017-11-29T10:50:18.947Z,
    "@version" => "1",
    "client" => "192.168.14.4",
    "timestamp" => "Jun 2 00:00:00",
    "date" => "2016-06-01",
    "time" => "23:56:05",
    "devname" => "POPB-FW-01 ",
    "devid" => "FG1K2D3I14800220",
    etc,...
}
Thanks in advance
Okay, I am really dumb
It was easy: rather than searching Google for how to match on equals signs, I just had to search for key-value matching with Logstash.
So I just have to write:
filter {
  kv { }
}
And it's done!
Sorry
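For reference, a fuller filter might look something like this (a minimal sketch combining the grok from the question with kv; the kv_pairs field name is mine, and the kv defaults are assumed to handle the quoted values):
filter {
  grok {
    # capture the syslog prefix, then keep the rest of the line for kv
    match => { "message" => "%{SYSLOGTIMESTAMP:timestamp} %{IP:client} %{GREEDYDATA:kv_pairs}" }
  }
  kv {
    # split the remainder into one field per key=value pair
    source => "kv_pairs"
  }
}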

ElasticSearch - not setting the date type

I am trying the ELK stack, and so far so good :)
I have run into a strange situation regarding parsing the date field and sending it to Elasticsearch. I manage to parse the field, and it really gets created in Elasticsearch, but it always ends up as a string.
I have tried many different combinations and many of the things people have suggested, but I still fail.
This is my setup:
The strings that come from Filebeat:
[2017-04-26 09:40:33] security.DEBUG: Stored the security token in the session. {"key":"securitysecured_area"} []
[2017-04-26 09:50:42] request.INFO: Matched route "home_logged_in". {"route_parameters":{"controller":"AppBundle\Controller\HomeLoggedInController::showAction","locale":"de","route":"home_logged_in"},"request_uri":"https://qa.someserver.de/de/home"} []
The logstash parsing section:
if [@metadata][type] == "feprod" or [@metadata][type] == "feqa" {
  grok {
    match => { "message" => "%{TIMESTAMP_ISO8601:logdate}" }
  }
  date {
    #timezone => "Europe/Berlin"
    match => [ "logdate", "yyyy-MM-dd HH:mm:ss" ]
  }
}
According to the documentation, my @timestamp field should be overwritten with the logdate value. But it is not happening.
In Elasticsearch I can see that the logdate field is being created and it has the value 2017-04-26 09:40:33, but its type is string.
I always create the index from scratch: I delete it first and let Logstash populate it.
I need either @timestamp overwritten with the actual log date (not the date when it was indexed), or the logdate field created with a date type. Either is fine.
Unless you are explicitly adding [@metadata][type] somewhere that you aren't showing, that is your problem. It's not set by default; [type] is what gets set by the 'type =>' parameter on your input.
You can validate this with a minimal complete example:
input {
  stdin {
    type => 'feprod'
  }
}
filter {
  if [@metadata][type] == "feprod" or [@metadata][type] == "feqa" {
    grok {
      match => { "message" => "%{TIMESTAMP_ISO8601:logdate}" }
    }
    date {
      match => [ "logdate", "yyyy-MM-dd HH:mm:ss" ]
    }
  }
}
output {
  stdout { codec => "rubydebug" }
}
And running it:
echo '[2017-04-26 09:40:33] security.DEBUG: Stored the security token in the session. {"key":"securitysecured_area"} []' | bin/logstash -f test.conf
And getting the output:
{
    "@timestamp" => 2017-05-02T15:15:05.875Z,
    "@version" => "1",
    "host" => "xxxxxxxxx",
    "message" => "[2017-04-26 09:40:33] security.DEBUG: Stored the security token in the session. {\"key\":\"securitysecured_area\"} []",
    "type" => "feprod",
    "tags" => []
}
If you use just if [type] == ... instead, it will work fine; see the sketch below.
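A minimal sketch of the corrected filter (only the condition changes; the grok and date filters stay the same):
filter {
  if [type] == "feprod" or [type] == "feqa" {
    grok {
      match => { "message" => "%{TIMESTAMP_ISO8601:logdate}" }
    }
    date {
      match => [ "logdate", "yyyy-MM-dd HH:mm:ss" ]
    }
  }
}
With that change, the same echo test produces: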
{
    "@timestamp" => 2017-04-26T14:40:33.000Z,
    "logdate" => "2017-04-26 09:40:33",
    "@version" => "1",
    "host" => "xxxxxxxxx",
    "message" => "[2017-04-26 09:40:33] security.DEBUG: Stored the security token in the session. {\"key\":\"securitysecured_area\"} []",
    "type" => "feprod",
    "tags" => []
}

Logstash Filtering and Parsing Deis Output

Environment
Ubuntu 16.04
Logstash 5.2.1
ElasticSearch 5.1
I've configured our Deis platform to send logs to our Logstash node with no issues. However, I'm still new to Ruby, and regexes are not my strong suit.
Log Example:
2017-02-15T14:55:24UTC deis-logspout[1]: 2017/02/15 14:55:24 routing all to udp://x.x.x.x:xxxx\n
Logstash Configuration:
input {
  tcp {
    port => 5000
    type => syslog
    codec => plain
  }
  udp {
    port => 5000
    type => syslog
    codec => plain
  }
}
filter {
  json {
    source => "syslog_message"
  }
}
output {
  elasticsearch { hosts => ["foo.somehost"] }
}
Elasticsearch output:
"#timestamp" => 2017-02-15T14:55:24.408Z,
"#version" => "1",
"host" => "x.x.x.x",
"message" => "2017-02-15T14:55:24UTC deis-logspout[1]: 2017/02/15 14:55:24 routing all to udp://x.x.x.x:xxxx\n",
"type" => "json"
Desired outcome:
"#timestamp" => 2017-02-15T14:55:24.408Z,
"#version" => "1",
"host" => "x.x.x.x",
"type" => "json"
"container" => "deis-logspout"
"severity level" => "Info"
"message" => "routing all to udp://x.x.x.x:xxxx\n"
How can I extract the information out of the message into their individual fields?
Unfortunately your assumptions about what you are trying to do are slightly off, but we can fix that!
You configured a JSON filter, but you are not parsing JSON. You are parsing a log that is bastardized syslog (see syslogStreamer in the source) but is not in fact syslog format (either RFC 5424 or 3164). Logstash provides JSON output afterwards.
Let's break down the message, which becomes the source that you parse. The key is that you have to parse the message front to back.
Message:
2017-02-15T14:55:24UTC deis-logspout[1]: 2017/02/15 14:55:24 routing all to udp://x.x.x.x:xxxx\n
2017-02-15T14:55:24UTC: Timestamp is a common grok pattern. This mostly follows TIMESTAMP_ISO8601 but not quite.
deis-logspout[1]: This would be your logsource, which you can name container. You can use the grok pattern URIHOST.
routing all to udp://x.x.x.x:xxxx\n: Since the message for most logs is contained at the end of the message, you can just then use the grok pattern GREEDYDATA which is the equivalent of .* in a regular expression.
2017/02/15 14:55:24: Another timestamp (why?) that doesn't match common grok patterns.
With grok filters, you can map a syntax (an abstraction over regular expressions) to a semantic (a name for the value that you extract), for example %{URIHOST:container}.
You'll see I did some hacking together of the grok patterns to make the formatting work. You have to match parts of the text even if you don't intend to capture the results. If you can't change the formatting of the timestamps to match standards, create a custom pattern, as sketched below.
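A custom pattern is just a named regex in a file under patterns_dir (the DEIS_TIMESTAMP name and the patterns file below are hypothetical, only to illustrate the mechanism):
# ./patterns/deis -- a hypothetical custom patterns file
DEIS_TIMESTAMP %{TIMESTAMP_ISO8601}(?:UTC|CST|EST|PST)

# referenced from the grok filter:
grok {
  patterns_dir => "./patterns"
  match => { "message" => "%{DEIS_TIMESTAMP:timestamp} %{GREEDYDATA:rest}" }
}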
Configuration:
input {
  tcp {
    port => 5000
    type => deis
  }
  udp {
    port => 5000
    type => deis
  }
}
filter {
  grok {
    match => { "message" => "%{TIMESTAMP_ISO8601:timestamp}(UTC|CST|EST|PST) %{URIHOST:container}\[%{NUMBER}\]: %{YEAR}/%{MONTHNUM}/%{MONTHDAY} %{TIME} %{GREEDYDATA:msg}" }
  }
}
output {
  elasticsearch { hosts => ["foo.somehost"] }
}
Output:
{
    "container" => "deis-logspout",
    "msg" => "routing all to udp://x.x.x.x:xxxx",
    "@timestamp" => 2017-02-22T23:55:28.319Z,
    "port" => 62886,
    "@version" => "1",
    "host" => "10.0.2.2",
    "message" => "2017-02-15T14:55:24UTC deis-logspout[1]: 2017/02/15 14:55:24 routing all to udp://x.x.x.x:xxxx",
    "timestamp" => "2017-02-15T14:55:24",
    "type" => "deis"
}
You can additionally mutate the event to drop @timestamp, host, etc., as these are provided by Logstash by default. Another suggestion is to use the date filter to convert any timestamps found into usable formats (better for searching), as sketched below.
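For example, something along these lines would parse the captured timestamp field into a proper date (a sketch; the trailing UTC was stripped by the grok pattern, so plain ISO8601 should match, but verify against your data):
date {
  # "timestamp" is the field captured by the grok filter above
  match => [ "timestamp", "ISO8601" ]
}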
Depending on the log formatting, you may have to slightly alter the pattern. I only had one example to go off of. This also maintains the original full message, because any field operations done in Logstash are destructive (they overwrite values with fields of the same name).
Resources:
Grok
Grok Patterns
Grok Debugger

Logstash Doesn't Read Entire Line With File Input

I'm using Logstash and I'm having trouble getting a rather simple configuration to work.
input {
  file {
    path => "C:/path/test-data/*.log"
    start_position => beginning
    type => "usage_data"
  }
}
filter {
  if [type] == "usage_data" {
    grok {
      match => { "message" => "^\s*%{NUMBER:lineNumber}\s+%{TIMESTAMP_ISO8601:date},(?<value1>[A-Za-z0-9+/]+),(?<value2>[A-Za-z0-9+/]+),(?<value3>[A-Za-z0-9+/]+),(?<value4>[^,]+),(?<value5>[^\r]*)" }
    }
  }
  if "_grokparsefailure" not in [tags] {
    drop { }
  }
}
output {
  stdout { codec => rubydebug }
}
I call Logstash like this:
SET LS_MAX_MEM=2g
DEL "%USERPROFILE%\.sincedb_*" 2> NUL
"C:\Program Files (x86)\logstash-1.4.1\bin\logstash.bat" agent -p "C:\path\\." -w 1 -f "logstash.conf"
The output:
Using milestone 2 input plugin 'file'. This plugin should be stable, but if you see strange behavior, please let us know! For more information on plugin milestones, see http://logstash.net/docs/1.4.1/plugin-milestones {:level=>:warn}
{
    "message" => ",",
    "@version" => "1",
    "@timestamp" => "2014-11-20T09:16:08.591Z",
    "type" => "usage_data",
    "host" => "my-machine",
    "path" => "C:/path/test-data/monitor_20141116223000.log",
    "tags" => [
        [0] "_grokparsefailure"
    ]
}
If I parse only C:\path\test-data\monitor_20141116223000.log all lines are read and there is no grokparsefailure. If I remove C:\path\test-data\monitor_20141116223000.log the same grokparsefailure pops up in another log-file:
{
    "message" => "atches in another context\r",
    "@version" => "1",
    "@timestamp" => "2014-11-20T09:14:04.779Z",
    "type" => "usage_data",
    "host" => "my-machine",
    "path" => "C:/path/test-data/monitor_20140829235900.log",
    "tags" => [
        [0] "_grokparsefailure"
    ]
}
The last output in particular shows that Logstash doesn't read the entire line, or interprets a newline where there is none. It always breaks at the same line, at the same position.
Maybe I should add that the log-files contain \n as a line separator and I'm running Logstash on Windows. However, I'm not getting a whole lot of errors, just that one. And there are quite a lot of lines in there. They all appear properly when I remove the if "_grokparsefailure" ....
I assume that there is some problem with buffering, but I have no clue how to make this work. Any ideas?
Workaround:
# diff -Nur /opt/logstash/vendor/bundle/jruby/1.9/gems/filewatch-0.5.1/lib/filewatch/tail.rb.orig /opt/logstash/vendor/bundle/jruby/1.9/gems/filewatch-0.5.1/lib/filewatch/tail.rb
--- /opt/logstash/vendor/bundle/jruby/1.9/gems/filewatch-0.5.1/lib/filewatch/tail.rb.orig 2015-02-25 10:46:06.916321816 +0700
+++ /opt/logstash/vendor/bundle/jruby/1.9/gems/filewatch-0.5.1/lib/filewatch/tail.rb 2015-02-12 18:39:34.943833909 +0700
@@ -86,7 +86,9 @@
       _read_file(path, &block)
       @files[path].close
       @files.delete(path)
-      @statcache.delete(path)
+      #@statcache.delete(path)
+      inode = @statcache.delete(path)
+      @sincedb[inode] = 0
     else
       @logger.warn("unknown event type #{event} for #{path}")
     end

Logstash date parsing as timestamp using the date filter

Well, after looking around quite a lot, I could not find a solution to my problem, as it "should" work, but obviously doesn't.
I'm using Logstash 1.4.2-1-2-2c0f5a1 on an Ubuntu 14.04 LTS machine, and I am receiving messages such as the following:
2014-08-05 10:21:13,618 [17] INFO Class.Type - This is a log message from the class:
BTW, I am also multiline
In the input configuration, I do have a multiline codec and the event is parsed correctly. I also separate the event text into several parts so that it is easier to read.
In the end, I obtain, as seen in Kibana, something like the following (JSON view):
{
    "_index": "logstash-2014.08.06",
    "_type": "customType",
    "_id": "PRtj-EiUTZK3HWAm5RiMwA",
    "_score": null,
    "_source": {
        "@timestamp": "2014-08-06T08:51:21.160Z",
        "@version": "1",
        "tags": [
            "multiline"
        ],
        "type": "utg-su",
        "host": "ubuntu-14",
        "path": "/mnt/folder/thisIsTheLogFile.log",
        "logTimestamp": "2014-08-05;10:21:13.618",
        "logThreadId": "17",
        "logLevel": "INFO",
        "logMessage": "Class.Type - This is a log message from the class:\r\n BTW, I am also multiline\r"
    },
    "sort": [
        "21",
        1407315081160
    ]
}
You may have noticed that I put a ";" in the timestamp. The reason is that I want to be able to sort the logs using the timestamp string, and apparently logstash is not that good at that (e.g.: http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/multi-fields.html).
I have unsuccessfully tried to use the date filter in multiple ways; it apparently did not work.
date {
  locale => "en"
  match => ["logTimestamp", "YYYY-MM-dd;HH:mm:ss.SSS", "ISO8601"]
  timezone => "Europe/Vienna"
  target => "@timestamp"
  add_field => { "debug" => "timestampMatched"}
}
Since I read that the Joda library may have problems if the string is not strictly ISO 8601-compliant (very picky and expects a T, see https://logstash.jira.com/browse/LOGSTASH-180), I also tried to use mutate to convert the string to something like 2014-08-05T10:21:13.618 and then use "YYYY-MM-dd'T'HH:mm:ss.SSS". That also did not work.
I do not want to have to manually put a +02:00 on the time because that would give problems with daylight saving.
In all of these cases, the event goes to Elasticsearch, but the date filter apparently does nothing, as @timestamp and logTimestamp are different and no debug field is added.
Any idea how I could make the logTime strings properly sortable? I focused on converting them to a proper timestamp, but any other solution would also be welcome.
When sorting over @timestamp, Elasticsearch can do it properly, but since this is not the "real" log timestamp, but rather the time when the Logstash event was read, I (obviously) need to be able to sort over logTimestamp as well. Sorting over that string field is obviously not that useful.
Any help is welcome! Just let me know if I forgot some information that may be useful.
Update:
Here is the filter config file that finally worked:
# Filters messages like this:
# 2014-08-05 10:21:13,618 [17] INFO Class.Type - This is a log message from the class:
# BTW, I am also multiline
# Take only type- events (type-componentA, type-componentB, etc)
filter {
  # You cannot write an "if" outside of the filter!
  if "type-" in [type] {
    grok {
      # Parse timestamp data. We need the "(?m)" so that grok (Oniguruma internally) correctly parses multi-line events
      patterns_dir => "./patterns"
      match => [ "message", "(?m)%{TIMESTAMP_ISO8601:logTimestampString}[ ;]\[%{DATA:logThreadId}\][ ;]%{LOGLEVEL:logLevel}[ ;]*%{GREEDYDATA:logMessage}" ]
    }
    # The timestamp may have commas instead of dots. Convert so as to store everything in the same way
    mutate {
      gsub => [
        # replace all commas with dots
        "logTimestampString", ",", "."
      ]
    }
    mutate {
      gsub => [
        # make the logTimestamp sortable. With a space, it is not! This does not work that well, in the end
        # but somehow apparently makes things easier for the date filter
        "logTimestampString", " ", ";"
      ]
    }
    date {
      locale => "en"
      match => ["logTimestampString", "YYYY-MM-dd;HH:mm:ss.SSS"]
      timezone => "Europe/Vienna"
      target => "logTimestamp"
    }
  }
}
filter {
  if "type-" in [type] {
    # Remove already-parsed data
    mutate {
      remove_field => [ "message" ]
    }
  }
}
I have tested your date filter. It works for me!
Here is my configuration:
input {
  stdin {}
}
filter {
  date {
    locale => "en"
    match => ["message", "YYYY-MM-dd;HH:mm:ss.SSS"]
    timezone => "Europe/Vienna"
    target => "@timestamp"
    add_field => { "debug" => "timestampMatched"}
  }
}
output {
  stdout {
    codec => "rubydebug"
  }
}
And I use this input:
2014-08-01;11:00:22.123
The output is:
{
    "message" => "2014-08-01;11:00:22.123",
    "@version" => "1",
    "@timestamp" => "2014-08-01T09:00:22.123Z",
    "host" => "ABCDE",
    "debug" => "timestampMatched"
}
So, please make sure that your logTimestamp has the correct value.
It is probably some other problem. Or could you provide your log event and Logstash configuration for more discussion? Thank you.
This worked for me - with a slightly different datetime format:
# 2017-11-22 13:00:01,621 INFO [AtlassianEvent::0-BAM::EVENTS:pool-2-thread-2] [BuildQueueManagerImpl] Sent ExecutableQueueUpdate: addToQueue, agents known to be affected: []
input {
  file {
    path => "/data/atlassian-bamboo.log"
    start_position => "beginning"
    type => "logs"
    codec => multiline {
      pattern => "^%{TIMESTAMP_ISO8601} "
      charset => "ISO-8859-1"
      negate => true
      what => "previous"
    }
  }
}
filter {
  grok {
    match => [ "message", "(?m)^%{TIMESTAMP_ISO8601:logtime}%{SPACE}%{LOGLEVEL:loglevel}%{SPACE}\[%{DATA:thread_id}\]%{SPACE}\[%{WORD:classname}\]%{SPACE}%{GREEDYDATA:logmessage}" ]
  }
  date {
    match => ["logtime", "yyyy-MM-dd HH:mm:ss,SSS", "yyyy-MM-dd HH:mm:ss,SSS Z", "MMM dd, yyyy HH:mm:ss a" ]
    timezone => "Europe/Berlin"
  }
}
output {
  elasticsearch { hosts => ["localhost:9200"] }
  stdout { codec => rubydebug }
}
