It's related to Logstash - logstash-configuration

Following is my log file:
2016-05-20 16:09:06.948UTC DEBUG spray.can.server.HttpServerConnection - Dispatching GET request to https://example.com/2.0/top.json to handler Actor[akka://test-server/system/IO-TCP/selectors/$a/1070#1248431494]
How do I filter "https://example.com/2.0/top.json" out of this log file?

A grok filter for this kind of log is:
%{TIMESTAMP_ISO8601:timestamp}%{TZ:timezone} %{LOGLEVEL:loglevel} %{DATA:package} - %{DATA:dispatching} %{WORD:method} request to %{DATA:url} to handler Actor\[%{DATA:foo}\]
Here the url field will contain https://example.com/2.0/top.json.
If you want to remove a field you can use the mutate filter's remove_field option; if you want to replace the field with something else you can use mutate's replace (or gsub) option.
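Putting it together, a minimal sketch of the filter block (the mutate options shown are illustrative; keep whichever you need):
filter {
    grok {
        match => { "message" => "%{TIMESTAMP_ISO8601:timestamp}%{TZ:timezone} %{LOGLEVEL:loglevel} %{DATA:package} - %{DATA:dispatching} %{WORD:method} request to %{DATA:url} to handler Actor\[%{DATA:foo}\]" }
    }
    mutate {
        # drop a field entirely
        remove_field => ["foo"]
        # or overwrite the url field with something else
        # replace => { "url" => "REDACTED" }
    }
}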

Related

How to let fluent-bit skip a field that cannot be parsed?

I am trying to send data from Fluent Bit to Elasticsearch.
Here is my fluent-bit parser:
[PARSER]
    Name   escape_utf8_log
    Format json
    # Command       | Decoder      | Field | Optional Action
    # ==============|==============|=======|=================
    Decode_Field_As   escaped_utf8   log
    Decode_Field      json           log

[PARSER]
    Name   escape_message
    Format json
    # Command       | Decoder      | Field | Optional Action
    # ==============|==============|=======|=================
    Decode_Field_As   escaped_utf8   message
    Decode_Field      json           message
Here is my fluent-bit config:
[FILTER]
    Name         parser
    Match        docker_logs
    Key_Name     message
    Parser       escape_message
    Reserve_Data True
In some cases other people send log data to Fluent Bit in the wrong format, so we get a "mapper_parsing_exception" (for example: failed to parse field [id] of type long in document).
I am trying to skip parsing a log and send it to ES anyway when Fluent Bit cannot parse it, so that we would not get the parser error even if someone sends the wrong format to Fluent Bit. Is it possible to do that?
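One way to avoid the mapper_parsing_exception without touching the parsers at all is on the Elasticsearch side: the index setting index.mapping.ignore_malformed tells Elasticsearch to keep a document even when a field value does not match the mapped type, instead of rejecting it. A sketch, assuming a hypothetical index named docker-logs created ahead of time:
PUT docker-logs
{
    "settings": {
        "index.mapping.ignore_malformed": true
    }
}
With this set, a string arriving in the long field id is simply not indexed for that document, rather than failing the whole document.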

Add text at the end of the logs

I actually use Rsyslog 8.24, and I configured my rsyslog to accept logs from multiple inputs/sources.
I want to add the syslog hostname at the end of every log.
Example:
Old log : timestamps, header, message
New log : timestamps, header, message syslog.domain.local
I know that the variable $myhostname (or $MYHOSTNAME) should return the hostname of the syslog server, but I don't understand how to implement this and append the syslog hostname to each log.
I managed to do what I wanted by adding the following template and binding it in the ruleset :
template(name="LogsFormat" type="string" string="%TIMESTAMP% %$year% %syslogtag% %msg% <SYSLOG_HOSTNAME>:%$myhostname%\n")

ruleset(name="RemoteLogPort") {
    if (re_match($msg, "AP:aaa-bbbb-ccc-dddd-ap")) then {
        action(type="omfile" dynaFile="ArubaNetworksPath" template="LogsFormat")
    }
}
PS: ArubaNetworksPath is also a template, defining the log path.
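For completeness, a dynafile template like the one referenced could look something like this (the path shown is a hypothetical example, not the one from the original setup):
template(name="ArubaNetworksPath" type="string" string="/var/log/remote/%FROMHOST%/aruba.log")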

Grok filter for logstash to match a specific value from a log file

I have the following log:
2018-10-30 11:47:52 INFO 30464 SMS-MT [cid:300038] [queue-msgid:bb7a195d-fb23-42ae-bbfa-d2dcda405af9] [smpp-msgid:j.11082.639364178944.#MARKET SETU] [status:ESME_ROK] [prio:1] [dlr:NO_SMSC_DELIVERY_RECEIPT_REQUESTED] [validity:none] [from:2323232] [to:23232132312] [content:'#MARKET SETUP\nadsadadadadasdasdadaasdada mo ang:\nC jean_rivera\n--Mag reply ng A-C']
I've created a grok filter based on a pattern in Logstash so I can parse the log the way I want, and I have this:
%{DATESTAMP:Timestamp} %{LOGLEVEL:Level} %{BASE10NUM:Pid} %{USERNAME:SMS_TYPE} %{CID:CID} %{GREEDYDATA:Message}
I'm trying to create a grok pattern that will match 300038, the number coming after cid:. The syntax is always the same: [cid:number]. What I have now is:
CID (\[cid:[0-9]{6}\])
but that results in:
"CID": [
    [
        "[cid:300038]"
    ]
],
and I only want to match the 300038, without the [cid:] part.
I have noticed that there is more than a single space character between the log level and the pid; you can match all of them using \s*.
To match just the number from [cid:300038] you can use a custom pattern, \[cid:(?<CID>[0-9]{1,})\]; this will match a cid of any length, not just 6 digits.
Your pattern will become,
%{DATESTAMP:Timestamp} %{LOGLEVEL:Level}\s*%{BASE10NUM:Pid} %{USERNAME:SMS_TYPE} \[cid:(?<CID>[0-9]{1,})\] %{GREEDYDATA:Message}
Use:
%{DATESTAMP:Timestamp} %{LOGLEVEL:Level} %{BASE10NUM:Pid} %{USERNAME:SMS_TYPE} \[cid:(?<CID>[0-9]{6})\] %{GREEDYDATA:Message}
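As a minimal sketch, either pattern drops into a Logstash filter block like this (shown with the variable-length variant):
filter {
    grok {
        match => { "message" => "%{DATESTAMP:Timestamp} %{LOGLEVEL:Level}\s*%{BASE10NUM:Pid} %{USERNAME:SMS_TYPE} \[cid:(?<CID>[0-9]{1,})\] %{GREEDYDATA:Message}" }
    }
}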

Logstash Grok Parser not working for error logs

I am trying to parse error logs using Logstash to capture a few fields, especially errormessage, but I am unable to capture errormessage in Logstash. Below are the actual error message and the parser I wrote.
12345 http://google.com 2017-04-17 09:02:43.065 ERROR 10479 --- [http-nio-8052-exec-2] com.utilities.TokenUtils : Error
org.xml.SAXParseException: An invalid XML character (Unicode: 0xe) was found in the value of attribute "ID" and element is "saml".
at org.apache.parsers.DOMParser.parse(Unknown Source)
at org.apache.jaxp.DocumentBuilderImpl.parse(Unknown Source)
at javax.parsers.DocumentBuilder.parse(DocumentBuilder.java:121)
at com.utilities.TokenUtils.validateSignature(TokenUtils.java:99)
Parser:
%{NOTSPACE:stnum}\s*%{NOTSPACE:requestURL}\s*%{TIMESTAMP_ISO8601:log_timestamp}\s*%{LOGLEVEL:loglevel}\s*%{NUMBER:pid}\s*---\s*\[(?<thread>[A-Za-z0-9-]+)\]\s*%{DATA:class}\s*:\s%{NOTSPACE:level}\s*(?<errormessage>.[^\n]*).[^\n]*
I am trying to capture this message from the log:
org.xml.SAXParseException: An invalid XML character (Unicode: 0xe) was found in the value of attribute "ID" and element is "saml".
Which Logstash parser are you using? Please provide the whole conf file, which would give us more info. Here's a sample that parses the exception type from your logs (using the grok filter):
filter {
    grok {
        match => ["message", "%{DATA:errormessage} %{GREEDYDATA:EXTRA}"]
    }
}
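Note that grok only sees one line per event, so the SAXParseException line never reaches the errormessage field unless the stack trace is folded into the same event as the first line. A sketch of one common approach, using the multiline codec on the input (the file path is hypothetical, and the pattern assumes every new event starts with the sequence number and URL):
input {
    file {
        path => "/var/log/app/tokens.log"    # hypothetical path
        codec => multiline {
            # lines that do NOT start with "<number> <url>" belong to the previous event
            pattern => "^%{NUMBER}\s+%{URI}"
            negate  => true
            what    => "previous"
        }
    }
}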

Issue in reading log file that contains date in its name

I have 2 Linux boxes set up: one box contains a component which generates logs, with Logstash installed on it to ship them, and the other box has Redis, Elasticsearch, and Logstash, where Logstash acts as an indexer to grok the data.
Now my problem is that on the first box the component generates a new log file every day; the only difference is that the log file name varies with the date,
like
counters-20151120-0.log
counters-20151121-0.log
counters-20151122-0.log
and so on. I have included the following in my Logstash shipper conf file:
file {
    path => "/opt/data/logs/counters-%{YEAR}%{MONTHNUM}%{MONTHDAY}*.log"
    type => "rg_counters"
}
And in my Logstash indexer, I have the following to catch those log files:
if [type] == "rg_counters" {
    grok {
        match => ["message", "%{YEAR}%{MONTHNUM}%{MONTHDAY}\s*%{HOUR}:%{MINUTE}:%{SECOND}\s*(?<counters_raw_data>[0-9\-A-Z]*)\s*(?<counters_operation_type>[\-A-Z]*)\s*%{GREEDYDATA:counters_extradata}"]
    }
}
output {
    elasticsearch { host => ["elastichost1","elastichost1" ] port => "9200" protocol => "http" }
    stdout { codec => rubydebug }
}
Please note that this is a working setup; other types of log files are getting transferred and processed successfully, so there is no issue with the setup.
The problem is: how do I process this log file which contains a date in its file name?
Any help here?
Thanks in advance!!
Based on the comments...
Instead of trying to use regexp patterns in your path:
path => "/opt/data/logs/counters-%{YEAR}%{MONTHNUM}%{MONTHDAY}*.log"
just use glob patterns:
path => "/opt/data/logs/counters-*.log"
Logstash will remember which files (inodes) it has seen before.
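So the shipper input becomes something like this (sincedb_path is optional, and the path shown is a hypothetical; the sincedb file is where Logstash records how far it has read each file):
file {
    path => "/opt/data/logs/counters-*.log"
    type => "rg_counters"
    sincedb_path => "/var/lib/logstash/sincedb_counters"    # hypothetical location
}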
