_grokparsefailure without Filters - elasticsearch

I have some simple logstash configuration:
input {
  syslog {
    port => 5140
    type => "fortigate"
  }
}
output {
  elasticsearch {
    cluster => "logging"
    node_name => "logstash-logging-03"
    bind_host => "10.100.19.77"
  }
}
That's it. The problem is that the documents that end up in Elasticsearch contain a _grokparsefailure tag:
{
  "_index": "logstash-2014.12.19",
  ...
  "_source": {
    "message": "...",
    ...
    "tags": [
      "_grokparsefailure"
    ],
    ...
  },
  ...
}
How come? There are no (grok) filters...

OK: the syslog input evidently uses grok internally. So if a log format other than syslog hits the input, a "_grokparsefailure" tag is added.
Instead, I used the "tcp" and "udp" inputs to get the required result (I was not aware of them before).
Cheers
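For reference, a minimal sketch of what that replacement input section could look like. It reuses the port and type from the question; listening on the same port for both protocols is an assumption about how the device sends its logs. Neither input applies a syslog pattern, so no _grokparsefailure tag is added at this stage.

```
input {
  # Plain TCP listener: no syslog parsing is applied to incoming lines
  tcp {
    port => 5140
    type => "fortigate"
  }
  # Plain UDP listener on the same port (assumption: the device may use either protocol)
  udp {
    port => 5140
    type => "fortigate"
  }
}
```

Any parsing of the FortiGate format then has to be done explicitly in a filter block.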

Related

how to add auto remove field in logstash filter

I am trying to add a _ttl field in logstash so that elasticsearch removes the document after a while, 120 seconds in this case but that's for testing.
filter {
  if "drop" in [message] {
    drop { }
  }
  add_field => { "_ttl" => "120s" }
}
but now nothing is logged in Elasticsearch.
I have two questions:
Where does Logstash log what is going wrong? Maybe the syntax of the filter is wrong?
How do I add a TTL field so that Elasticsearch removes the document automatically?
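add_field is not valid on its own inside the filter block; it has to be an option of a filter plugin. When you put it inside a mutate filter in logstash.conf, it works:
filter {
  mutate {
    add_field => { "_ttl" => "120s" }
  }
}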
POST myindex/_search
{
"query": {
"match_all": {}
}
}
Results:
"hits": [
{
"_index": "myindex",
...................
"_ttl": "120s",
For the other question, I can't really help. I'm running Logstash as a container, so I read its log output with:
docker logs d492eb3c3d0d
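Putting the two pieces of this thread together, a sketch of the full filter section might look like the following. It keeps the conditional drop from the question and moves add_field into a mutate filter as the answer suggests. Whether Elasticsearch honours the field depends on its version: as far as I know the _ttl mechanism was deprecated in Elasticsearch 2.0 and removed in 5.0, so treat this as applicable to older clusters only.

```
filter {
  # Discard events whose message contains the string "drop"
  if "drop" in [message] {
    drop { }
  }
  # add_field must live inside a filter plugin; mutate is the usual choice
  mutate {
    add_field => { "_ttl" => "120s" }
  }
}
```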

Logstash - Send output from log files to elk

I have an index in Elasticsearch that has a field named locationCoordinates. It's being sent to Elasticsearch from Logstash.
The data in this field looks like this:
-38.122, 145.025
When this field appears in Elasticsearch it is not mapped as a geo_point.
I know that it works if I create the mapping below:
{
  "mappings": {
    "logs": {
      "properties": {
        "http_request.locationCoordinates": {
          "type": "geo_point"
        }
      }
    }
  }
}
But what I would like to know is how I can change my logstash.conf file so that it does this at startup.
At the moment my logstash.conf looks a bit like this...
input {
  # Default GELF input
  gelf {
    port => 12201
    type => gelf
  }
  # Default TCP input
  tcp {
    port => 5000
    type => syslog
  }
  # Default UDP input
  udp {
    port => 5001
    type => prod
    codec => json
  }
  file {
    path => [ "/tmp/app-logs/*.log" ]
    codec => json {
      charset => "UTF-8"
    }
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}
filter {
  json {
    source => "message"
  }
}
output {
  elasticsearch {
    hosts => "elasticsearch:9200"
  }
}
And I end up with this in Kibana (without the little Geo sign).
You simply need to modify your elasticsearch output to configure an index template in which you can add your additional mapping.
output {
  elasticsearch {
    hosts => "elasticsearch:9200"
    template_overwrite => true
    template => "/path/to/template.json"
  }
}
And then in the file at /path/to/template.json you can add your additional geo_point mapping:
{
  "template": "logstash-*",
  "mappings": {
    "logs": {
      "properties": {
        "http_request.locationCoordinates": {
          "type": "geo_point"
        }
      }
    }
  }
}
If you want to keep the official logstash template, you can download it and add your specific geo_point mapping to it.

Multiple Logstash Outputs depending from collectd

I'm facing a configuration error that I can't solve on my own. I tried to find the solution in the documentation, but without luck.
I have a few different hosts that send their metrics via collectd to Logstash. Inside the Logstash configuration I'd like to separate the hosts and send each one to its own ES index. When I run configtest on my settings, Logstash throws an error. Maybe someone can help me.
The separation should be triggered by the hostname that collectd delivers:
[This is old raw JSON output, so please don't mind the wrong index]
{
  "_index": "wv-metrics",
  "_type": "logs",
  "_id": "AVHyJunyGanLcfwDBAon",
  "_score": null,
  "_source": {
    "host": "somefqdn.com",
    "@timestamp": "2015-12-30T09:10:15.211Z",
    "plugin": "disk",
    "plugin_instance": "dm-5",
    "collectd_type": "disk_merged",
    "read": 0,
    "write": 0,
    "@version": "1"
  },
  "fields": {
    "@timestamp": [
      1451466615211
    ]
  },
  "sort": [
    1451466615211
  ]
}
Please see my config:
Input Config (Working so far)
input {
  udp {
    port => 25826
    buffer_size => 1452
    codec => collectd { }
  }
}
Output Config File:
filter {
  if [host] == "somefqdn.com" {
    output {
      elasticsearch {
        hosts => "someip:someport"
        user => logstash
        password => averystrongpassword
        index => "somefqdn.com"
      }
    }
  }
}
Error which is thrown:
root@test-collectd1:/home/username# service logstash configtest
Error: Expected one of #, => at line 21, column 17 (byte 314) after filter {
if [host] == "somefqdn.com" {
output {
elasticsearch
I understand that a character is possibly missing in my config, but I can't locate it.
Thanks in advance!
I spot two errors in a quick scan:
First, your output stanza should not be wrapped with a filter{} block.
Second, your output stanza should start with output{} (put the conditional inside):
output {
  if [host] == "somefqdn.com" {
    elasticsearch {
      ...
    }
  }
}
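Filled in with the settings from the question, the corrected output section would look roughly like this. The else branch is an addition for illustration only: it assumes that events from other hosts should still be indexed, and it borrows the "wv-metrics" index name from the sample document above.

```
output {
  if [host] == "somefqdn.com" {
    # Events from this host go to their own index
    elasticsearch {
      hosts => "someip:someport"
      user => "logstash"
      password => "averystrongpassword"
      index => "somefqdn.com"
    }
  } else {
    # Assumption: everything else lands in a shared index
    elasticsearch {
      hosts => "someip:someport"
      user => "logstash"
      password => "averystrongpassword"
      index => "wv-metrics"
    }
  }
}
```

With many hosts, one conditional per host gets unwieldy. The index option accepts field references, so a single elasticsearch output with index => "%{host}" may be enough, provided the host values are valid (lowercase) index names.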

logstash, syslog and grok

I am working on an ELK-stack configuration. logstash-forwarder is used as a log shipper, each type of log is tagged with a type-tag:
{
  "network": {
    "servers": [ "___:___" ],
    "ssl ca": "___",
    "timeout": 15
  },
  "files": [
    {
      "paths": [
        "/var/log/secure"
      ],
      "fields": {
        "type": "syslog"
      }
    }
  ]
}
That part works fine... Now, I want logstash to split the message string in its parts; luckily, that is already implemented in the default grok patterns, so the logstash.conf remains simple so far:
input {
  lumberjack {
    port => 6782
    ssl_certificate => "___"
    ssl_key => "___"
  }
}
filter {
  if [type] == "syslog" {
    grok {
      match => [ "message", "%{SYSLOGLINE}" ]
    }
  }
}
output {
  elasticsearch {
    cluster => "___"
    template => "___"
    template_overwrite => true
    node_name => "logstash-___"
    bind_host => "___"
  }
}
The issue I have is that the document received by Elasticsearch still holds the whole line (including the timestamp etc.) in the message field. Also, @timestamp still shows the time at which Logstash received the message, which makes searching awkward because Kibana queries @timestamp to filter by date. Any idea what I'm doing wrong?
Thanks, Daniel
The reason your "message" field contains the original log line (including timestamps etc) is that the grok filter by default won't allow existing fields to be overwritten. In other words, even though the SYSLOGLINE pattern,
SYSLOGLINE %{SYSLOGBASE2} %{GREEDYDATA:message}
captures the message into a "message" field it won't overwrite the current field value. The solution is to set the grok filter's "overwrite" parameter.
grok {
  match => [ "message", "%{SYSLOGLINE}" ]
  overwrite => [ "message" ]
}
To populate the "@timestamp" field, use the date filter. This will probably work for you:
date {
  match => [ "timestamp", "MMM dd HH:mm:ss", "MMM  d HH:mm:ss" ]
}
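Combined with the conditional from the question, the whole filter section could be sketched as follows. The "timestamp" field is the one captured by the SYSLOGLINE pattern; the second date format has two spaces because syslog pads single-digit days with a space.

```
filter {
  if [type] == "syslog" {
    grok {
      match => [ "message", "%{SYSLOGLINE}" ]
      # Let the captured "message" replace the original raw line
      overwrite => [ "message" ]
    }
    date {
      # Parse the syslog timestamp captured by grok into @timestamp
      match => [ "timestamp", "MMM dd HH:mm:ss", "MMM  d HH:mm:ss" ]
    }
  }
}
```

Syslog timestamps carry no year or time zone, so the date filter has to assume them; if the shipping hosts are in a different zone than the Logstash host, the date filter's timezone option may be needed.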
It is hard to know where the problem is without seeing an example event that causes it. I suggest trying the grok debugger to verify that the pattern is correct and to adjust it to your needs once you see the problem.

Issue using grok filter with logstash and a windows file

I am attempting to filter a sql server error log using Logstash and grok. Logstash 1.3.3 is running as a windows service using NSSM and JRE6. My config file is below
input {
  file {
    path => "c:\program files\microsoft sql server\mssql10_50.mssqlserver\mssql\log\errorlog"
    type => SQLServerLog
    start_position => "beginning"
    codec => plain {
      charset => "UTF-8"
    }
  }
}
filter {
  grok {
    type => "SQLServerLog"
    match => [ "message", "%{DATESTAMP:DateStamp} %{WORD:Process} %{GREEDYDATA:Message}" ]
    named_captures_only => true
    singles => true
    remove_tag => [ "_grokparsefailure" ]
    add_tag => [ "GrokFilterWorked" ]
  }
}
output {
  stdout {
    codec => rubydebug
  }
  elasticsearch {
    embedded => true
  }
}
A sample of the log file content is below.
2014-01-31 00:00:38.73 spid21s This instance of SQL Server has been using a process ID of 14632 since 28/01/2014 13:09:24 (local) 28/01/2014 13:09:24 (UTC). This is an informational message only; no user action is required.
Events are visible in Kibana but when collapsed the message is displayed like {"message":"\u00002\u00000\u00001\u00004...
When expanded the table view shows the event message as text instead. The raw data for the event when viewed is as below.
{
"_index": "logstash-2014.01.31",
"_type": "SQLServerLog",
"_id": "NpvKSf4eTFSHkBdoG3zw6g",
"_score": null,
"_source": {
"message": "\u00002\u00000\u00001\u00004\u0000-\u00000\u00001\u0000-\u00003\u00000\u0000 \u00000\u00000\u0000:\u00000\u00000\u0000:\u00002\u00001\u0000.\u00006\u00004\u0000 \u0000s\u0000p\u0000i\u0000d\u00002\u00004\u0000s\u0000 \u0000 \u0000 \u0000 \u0000 \u0000T\u0000h\u0000i\u0000s\u0000 \u0000i\u0000n\u0000s\u0000t\u0000a\u0000n\u0000c\u0000e\u0000 \u0000o\u0000f\u0000 \u0000S\u0000Q\u0000L\u0000 \u0000S\u0000e\u0000r\u0000v\u0000e\u0000r\u0000 \u0000h\u0000a\u0000s\u0000 \u0000b\u0000e\u0000e\u0000n\u0000 \u0000u\u0000s\u0000i\u0000n\u0000g\u0000 \u0000a\u0000 \u0000p\u0000r\u0000o\u0000c\u0000e\u0000s\u0000s\u0000 \u0000I\u0000D\u0000 \u0000o\u0000f\u0000 \u00001\u00004\u00006\u00003\u00002\u0000 \u0000s\u0000i\u0000n\u0000c\u0000e\u0000 \u00002\u00008\u0000/\u00000\u00001\u0000/\u00002\u00000\u00001\u00004\u0000 \u00001\u00003\u0000:\u00000\u00009\u0000:\u00002\u00004\u0000 \u0000(\u0000l\u0000o\u0000c\u0000a\u0000l\u0000)\u0000 \u00002\u00008\u0000/\u00000\u00001\u0000/\u00002\u00000\u00001\u00004\u0000 \u00001\u00003\u0000:\u00000\u00009\u0000:\u00002\u00004\u0000 \u0000(\u0000U\u0000T\u0000C\u0000)\u0000.\u0000 \u0000T\u0000h\u0000i\u0000s\u0000 \u0000i\u0000s\u0000 \u0000a\u0000n\u0000 \u0000i\u0000n\u0000f\u0000o\u0000r\u0000m\u0000a\u0000t\u0000i\u0000o\u0000n\u0000a\u0000l\u0000 \u0000m\u0000e\u0000s\u0000s\u0000a\u0000g\u0000e\u0000 \u0000o\u0000n\u0000l\u0000y\u0000;\u0000 \u0000n\u0000o\u0000 \u0000u\u0000s\u0000e\u0000r\u0000 \u0000a\u0000c\u0000t\u0000i\u0000o\u0000n\u0000 \u0000i\u0000s\u0000 \u0000r\u0000e\u0000q\u0000u\u0000i\u0000r\u0000e\u0000d\u0000.\u0000\r\u0000",
"@version": "1",
"@timestamp": "2014-01-31T08:55:03.373Z",
"type": "SQLServerLog",
"host": "MyMachineName",
"path": "C:\\Program Files\\Microsoft SQL Server\\MSSQL10_50.MSSQLSERVER\\MSSQL\\Log\\ERRORLOG"
},
"sort": [
1391158503373,
1391158503373
]
}
I am unsure whether the encoding of the message is preventing Grok from filtering it properly.
I would like to be able to filter these events using Grok and am unsure how to proceed.
Further info:
I created a copy of the log file as UTF-8 and the filter worked fine. So it's definitely a charset issue. I guess I need to determine what the correct charset for the log file is and it should work.
I had the same issue when reading the SQL Server log file.
Then I realised that SQL Server logs the same entries to the Windows Event Log, which Logstash supports as an input.
SQL Server logs entries with the 'MSSQLSERVER' source on my systems. You will need the logstash-contrib package; simply extract its contents over the base Logstash files on your Windows box (wherever you run Logstash to collect data).
I have my Logstash agent configured to simply ship the entries to another Logstash instance on a Linux box that does some other stuff not relevant to this question ;)
Example logstash.conf:
input {
  eventlog {
    type => "Win32-EventLog"
    logfile => ["Application", "Security", "System"]
  }
}
filter {
  if "MSSQLSERVER" in [SourceName] {
    # Track logon failures
    grok {
      match => ["Message", "Login failed for user '%{DATA:username}'\..+CLIENT: %{IP:client_ip}"]
    }
    dns {
      action => "append"
      resolve => "client_ip"
    }
  }
}
output {
  stdout { codec => rubydebug }
  tcp {
    host => "another-logstash-instance.local"
    port => "5115"
    codec => "json_lines"
  }
}
Hope this helps.
