Multiple Logstash outputs depending on the collectd host - elasticsearch

I'm facing a configuration failure that I can't solve on my own; I tried to find the solution in the documentation, but without luck.
I have a few different hosts which send their metrics via collectd to Logstash. Inside the Logstash configuration I'd like to separate each host and pipe it into its own ES index. When I configtest my settings, Logstash throws an error - maybe someone can help me.
The separation should be triggered by the hostname collectd delivers:
[This is an old raw JSON output, so please don't mind that the wrong index is set]
{
  "_index": "wv-metrics",
  "_type": "logs",
  "_id": "AVHyJunyGanLcfwDBAon",
  "_score": null,
  "_source": {
    "host": "somefqdn.com",
    "@timestamp": "2015-12-30T09:10:15.211Z",
    "plugin": "disk",
    "plugin_instance": "dm-5",
    "collectd_type": "disk_merged",
    "read": 0,
    "write": 0,
    "@version": "1"
  },
  "fields": {
    "@timestamp": [
      1451466615211
    ]
  },
  "sort": [
    1451466615211
  ]
}
Please see my config:
Input Config (Working so far)
input {
  udp {
    port => 25826
    buffer_size => 1452
    codec => collectd { }
  }
}
Output Config File:
filter {
  if [host] == "somefqdn.com" {
    output {
      elasticsearch {
        hosts => "someip:someport"
        user => logstash
        password => averystrongpassword
        index => "somefqdn.com"
      }
    }
  }
}
Error which is thrown:
root@test-collectd1:/home/username# service logstash configtest
Error: Expected one of #, => at line 21, column 17 (byte 314) after filter {
if [host] == "somefqdn.com" {
output {
elasticsearch
I understand that there's possibly a character missing in my config, but I can't locate it.
Thx in advance!

I spot two errors in a quick scan:
First, your output stanza should not be wrapped in a filter {} block.
Second, your output stanza should start with output {}, with the conditional inside it:
output {
  if [host] == "somefqdn.com" {
    elasticsearch {
      ...
    }
  }
}
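For reference, a fuller sketch of the corrected output file could look like the one below. The hosts value, credentials, and the first index name are the placeholders from the question; the second hostname is a made-up example of how further hosts could get their own index, and quoting the user and password values is my addition:
output {
  if [host] == "somefqdn.com" {
    elasticsearch {
      hosts => "someip:someport"
      user => "logstash"
      password => "averystrongpassword"
      index => "somefqdn.com"
    }
  } else if [host] == "otherfqdn.com" {   # hypothetical second host
    elasticsearch {
      hosts => "someip:someport"
      user => "logstash"
      password => "averystrongpassword"
      index => "otherfqdn.com"
    }
  }
}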

Related

How to add an auto-remove field in a logstash filter

I am trying to add a _ttl field in Logstash so that Elasticsearch removes the document after a while - 120 seconds in this case, but that's just for testing.
filter {
  if "drop" in [message] {
    drop { }
  }
  add_field => { "_ttl" => "120s" }
}
but now nothing is logged in elasticsearch.
I have 2 questions.
Where can I see what is going wrong? Maybe the syntax of the filter is wrong?
How do I add a ttl field to elasticsearch for auto removal?
When you add a filter to logstash.conf with a mutate it works:
filter {
  mutate {
    add_field => { "_ttl" => "120s" }
  }
}
POST myindex/_search
{
  "query": {
    "match_all": {}
  }
}
Results:
"hits": [
{
"_index": "myindex",
...................
"_ttl": "120s",
For the other question, I can't really help there. I'm running Logstash as a container, so its logging is read with:
docker logs d492eb3c3d0d

Logstash parsing a different line than the 1st line as the header

I have some sample data:
employee_name,user_id,O,C,E,A,N
Yvette Vivien Donovan,YVD0093,38,19,29,15,36
Troy Alvin Craig,TAC0118,34,40,24,15,34
Eden Jocelyn Mcclain,EJM0952,20,37,48,35,34
Alexa Emma Wood,AEW0655,25,20,18,40,38
Celeste Maris Griffith,CMG0936,36,13,18,50,29
Tanek Orson Griffin,TOG0025,40,36,24,19,26
Colton James Lowery,CJL0436,39,41,27,25,28
Baxter Flynn Mcknight,BFM0761,42,32,28,17,22
Olivia Calista Hodges,OCH0195,37,36,39,38,32
Price Zachery Maldonado,PZM0602,24,46,30,18,29
Daryl Delilah Atkinson,DDA0185,17,43,33,18,25
And the Logstash config file is:
input {
  file {
    path => "/path/psychometric_data.csv"
    start_position => "beginning"
  }
}
filter {
  csv {
    separator => ","
    autodetect_column_names => true
    autogenerate_column_names => true
  }
}
output {
  amazon_es {
    hosts => [ "https://xxx-xxx-es-xxx.xx-xx-1.es.amazonaws.com:443" ]
    ssl => true
    region => "ap-south-1"
    index => "psychometric_data"
  }
}
I am expecting the 1st row (i.e. employee_name,user_id,O,C,E,A,N) to become the Elasticsearch field names (header), but I am getting the 3rd row (i.e. Troy Alvin Craig,TAC0118,34,40,24,15,34) as the header, as follows.
{
  "_index": "psychometric_data",
  "_type": "_doc",
  "_id": "md4hm3YB8",
  "_score": 1,
  "_source": {
    "15": "21",
    "24": "17",
    "34": "39",
    "40": "37",
    "@version": "1",
    "@timestamp": "2020-12-25T18:20:00.759Z",
    "message": "Ishmael Mannix Velazquez,IMV0086,22,37,17,21,39\r",
    "path": "/path/psychometric_data.csv",
    "Troy Alvin Craig": "Ishmael Mannix Velazquez",
    "host": "xx-ThinkPad-xx",
    "TAC0118": "IMV0086"
  }
}
What might be the reason for it?
If you set autodetect_column_names to true, then the filter interprets the first line that it sees as the column names. If pipeline.workers is set to more than one, then it is a race to see which thread sets the column names first. Since different workers are processing different lines, it may not use the first line of the file. You must set pipeline.workers to 1.
In addition to that, the Java execution engine (enabled by default) does not always preserve the order of events. There is a setting, pipeline.ordered, in logstash.yml that controls that. In 7.9 it keeps event order if and only if pipeline.workers is set to 1.
You do not say which version you are running. For anything from 7.0 (when java_execution became the default) to 7.6, the fix is to disable the Java engine using either pipeline.java_execution: false in logstash.yml or --java_execution false on the command line. For any 7.x release from 7.7 onwards, make sure pipeline.ordered is set to auto or true (auto is the default in 7.x). In future releases (8.x perhaps) pipeline.ordered will default to false.
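Concretely, a minimal logstash.yml sketch for the single-worker, ordered case described above (applying these two settings to this particular pipeline is my assumption):
# logstash.yml
pipeline.workers: 1      # one worker, so the header line is processed first
pipeline.ordered: auto   # preserves event order when workers == 1 (7.7+)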

Create a Kibana graph from logstash logs

I need to create a graph in Kibana according to a specific value.
Here is my raw log from Logstash:
2016-03-14T15:01:21.061Z Accueil-PC 14-03-2016 16:01:19.926 [pool-3-thread-1] INFO com.github.vspiewak.loggenerator.SearchRequest - id=300,ip=84.102.53.31,brand=Apple,name=iPhone 5S,model=iPhone 5S - Gris sideral - Disque 64Go,category=Mobile,color=Gris sideral,options=Disque 64Go,price=899.0
In this log line, I have the id information "id=300".
In order to create graphs in Kibana using the id value, I want a new field, so I have a specific grok configuration:
grok {
  match => ["message", "(?<mycustomnewfield>id=%{INT}+)"]
}
With this transformation I get the following JSON:
{
  "_index": "metrics-2016.03.14",
  "_type": "logs",
  "_id": "AVN1k-cJcXxORIbORG7w",
  "_score": null,
  "_source": {
    "message": "{\"message\":\"14-03-2016 15:42:18.739 [pool-1950-thread-1] INFO com.github.vspiewak.loggenerator.SellRequest - id=300,ip=54.226.24.77,email=client951@gmail.com,sex=F,brand=Apple,name=iPad R\\\\xE9tina,model=iPad R\\\\xE9tina - Noir,category=Tablette,color=Noir,price=509.0\\\\r\",\"@version\":\"1\",\"@timestamp\":\"2016-03-14T14:42:19.040Z\",\"path\":\"D:\\\\LogStash\\\\logstash-2.2.2\\\\logstash-2.2.2\\\\bin\\\\logs.logs.txt\",\"host\":\"Accueil-PC\",\"type\":\"metrics-type\",\"mycustomnewfield\":\"300\"}",
    "@version": "1",
    "@timestamp": "2016-03-14T14:42:19.803Z",
    "host": "127.0.0.1",
    "port": 57867
  },
  "fields": {
    "@timestamp": [
      1457966539803
    ]
  },
  "sort": [
    1457966539803
  ]
}
A new field was actually created (the field 'mycustomnewfield'), but within the message field! As a result I can't see it in Kibana when I try to create a graph. I tried to create a "scripted field" in Kibana, but only numeric fields can be accessed.
Should I create an index in Elasticsearch with a specific mapping to create a new field?
There was actually something wrong with my configuration. I should have pasted the whole configuration with my question. In fact I'm using Logstash as a shipper and also as a log server. On the server side, I modified the configuration:
input {
  tcp {
    port => "yyyy"
    host => "x.x.x.x"
    mode => "server"
    codec => json # I forgot this option
  }
}
Because the Logstash shipper is actually sending JSON, I need to tell the server about this. Now I no longer have a message field within a message field, and my new field is inserted at the right place.
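For completeness, the matching shipper side might look something like the sketch below; the shipper configuration was not posted, so the output plugin, host, port, and codec choice here are assumptions:
output {
  tcp {
    host => "x.x.x.x"  # the log server above
    port => "yyyy"
    codec => json      # assumption: mirrors the json codec configured on the server input
  }
}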

_grokparsefailure without Filters

I have a simple Logstash configuration:
input {
  syslog {
    port => 5140
    type => "fortigate"
  }
}
output {
  elasticsearch {
    cluster => "logging"
    node_name => "logstash-logging-03"
    bind_host => "10.100.19.77"
  }
}
That's it. The problem is that the documents that end up in Elasticsearch contain a _grokparsefailure tag:
{
  "_index": "logstash-2014.12.19",
  ...
  "_source": {
    "message": ...",
    ...
    "tags": [
      "_grokparsefailure"
    ],
    ...
  },
  ...
}
How come? There are no (grok) filters...
OK: the syslog input obviously makes use of grok internally. Therefore, if a log format other than syslog hits the input, a _grokparsefailure will occur.
Instead, I just used the tcp and udp inputs to achieve the required result (I was not aware of them before).
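A minimal sketch of what that could look like, reusing the port and type from the question (that reuse is my assumption; the FortiGate messages would then need explicit parsing in a filter):
input {
  tcp {
    port => 5140
    type => "fortigate"
  }
  udp {
    port => 5140
    type => "fortigate"
  }
}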
Cheers

Issue using grok filter with logstash and a windows file

I am attempting to filter a SQL Server error log using Logstash and grok. Logstash 1.3.3 is running as a Windows service using NSSM and JRE6. My config file is below:
input {
  file {
    path => "c:\program files\microsoft sql server\mssql10_50.mssqlserver\mssql\log\errorlog"
    type => SQLServerLog
    start_position => "beginning"
    codec => plain {
      charset => "UTF-8"
    }
  }
}
filter {
  grok {
    type => "SQLServerLog"
    match => [ "message", "%{DATESTAMP:DateStamp} %{WORD:Process} %{GREEDYDATA:Message}" ]
    named_captures_only => true
    singles => true
    remove_tag => [ "_grokparsefailure" ]
    add_tag => [ "GrokFilterWorked" ]
  }
}
output {
  stdout {
    codec => rubydebug
  }
  elasticsearch {
    embedded => true
  }
}
A sample of the log file content is below.
2014-01-31 00:00:38.73 spid21s This instance of SQL Server has been using a process ID of 14632 since 28/01/2014 13:09:24 (local) 28/01/2014 13:09:24 (UTC). This is an informational message only; no user action is required.
Events are visible in Kibana but when collapsed the message is displayed like {"message":"\u00002\u00000\u00001\u00004...
When expanded the table view shows the event message as text instead. The raw data for the event when viewed is as below.
{
  "_index": "logstash-2014.01.31",
  "_type": "SQLServerLog",
  "_id": "NpvKSf4eTFSHkBdoG3zw6g",
  "_score": null,
  "_source": {
"message": "\u00002\u00000\u00001\u00004\u0000-\u00000\u00001\u0000-\u00003\u00000\u0000 \u00000\u00000\u0000:\u00000\u00000\u0000:\u00002\u00001\u0000.\u00006\u00004\u0000 \u0000s\u0000p\u0000i\u0000d\u00002\u00004\u0000s\u0000 \u0000 \u0000 \u0000 \u0000 \u0000T\u0000h\u0000i\u0000s\u0000 \u0000i\u0000n\u0000s\u0000t\u0000a\u0000n\u0000c\u0000e\u0000 \u0000o\u0000f\u0000 \u0000S\u0000Q\u0000L\u0000 \u0000S\u0000e\u0000r\u0000v\u0000e\u0000r\u0000 \u0000h\u0000a\u0000s\u0000 \u0000b\u0000e\u0000e\u0000n\u0000 \u0000u\u0000s\u0000i\u0000n\u0000g\u0000 \u0000a\u0000 \u0000p\u0000r\u0000o\u0000c\u0000e\u0000s\u0000s\u0000 \u0000I\u0000D\u0000 \u0000o\u0000f\u0000 \u00001\u00004\u00006\u00003\u00002\u0000 \u0000s\u0000i\u0000n\u0000c\u0000e\u0000 \u00002\u00008\u0000/\u00000\u00001\u0000/\u00002\u00000\u00001\u00004\u0000 \u00001\u00003\u0000:\u00000\u00009\u0000:\u00002\u00004\u0000 \u0000(\u0000l\u0000o\u0000c\u0000a\u0000l\u0000)\u0000 \u00002\u00008\u0000/\u00000\u00001\u0000/\u00002\u00000\u00001\u00004\u0000 \u00001\u00003\u0000:\u00000\u00009\u0000:\u00002\u00004\u0000 \u0000(\u0000U\u0000T\u0000C\u0000)\u0000.\u0000 \u0000T\u0000h\u0000i\u0000s\u0000 \u0000i\u0000s\u0000 \u0000a\u0000n\u0000 \u0000i\u0000n\u0000f\u0000o\u0000r\u0000m\u0000a\u0000t\u0000i\u0000o\u0000n\u0000a\u0000l\u0000 \u0000m\u0000e\u0000s\u0000s\u0000a\u0000g\u0000e\u0000 \u0000o\u0000n\u0000l\u0000y\u0000;\u0000 \u0000n\u0000o\u0000 \u0000u\u0000s\u0000e\u0000r\u0000 \u0000a\u0000c\u0000t\u0000i\u0000o\u0000n\u0000 \u0000i\u0000s\u0000 \u0000r\u0000e\u0000q\u0000u\u0000i\u0000r\u0000e\u0000d\u0000.\u0000\r\u0000",
"#version": "1",
"#timestamp": "2014-01-31T08:55:03.373Z",
"type": "SQLServerLog",
"host": "MyMachineName",
"path": "C:\\Program Files\\Microsoft SQL Server\\MSSQL10_50.MSSQLSERVER\\MSSQL\\Log\\ERRORLOG"
},
"sort": [
1391158503373,
1391158503373
]
}
I am unsure whether the encoding of the message is preventing Grok from filtering it properly.
I would like to be able to filter these events using Grok and am unsure how to proceed.
Further info:
I created a copy of the log file as UTF-8 and the filter worked fine. So it's definitely a charset issue. I guess I need to determine what the correct charset for the log file is and it should work.
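Given the \u0000 padding between every character, the file is most likely UTF-16 little-endian, which is what SQL Server typically uses for its ERRORLOG. A sketch of the input with the charset changed accordingly (UTF-16LE here is an assumption to verify against the actual file):
input {
  file {
    path => "c:\program files\microsoft sql server\mssql10_50.mssqlserver\mssql\log\errorlog"
    type => SQLServerLog
    start_position => "beginning"
    codec => plain {
      charset => "UTF-16LE"  # assumed encoding; confirm against the file
    }
  }
}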
So I had the same issue with reading the SQL Server log file.
Then I realised that SQL Server will log the same entries to the Windows Event Log, which Logstash supports as an input.
SQL Server logs entries with the 'MSSQLSERVER' source on my systems. You will need the logstash-contrib package; simply extract the contents over the base Logstash files on your Windows box (wherever you run Logstash to collect data).
I have my Logstash agent configured to simply ship the entries to another Logstash instance on a Linux box that does some other stuff not relevant to this question ;)
Example logstash.conf:
input {
  eventlog {
    type => "Win32-EventLog"
    logfile => ["Application", "Security", "System"]
  }
}
filter {
  if "MSSQLSERVER" in [SourceName] {
    # Track logon failures
    grok {
      match => ["Message", "Login failed for user '%{DATA:username}'\..+CLIENT: %{IP:client_ip}"]
    }
    dns {
      action => "append"
      resolve => "client_ip"
    }
  }
}
output {
  stdout { codec => rubydebug }
  tcp {
    host => "another-logstash-instance.local"
    port => "5115"
    codec => "json_lines"
  }
}
Hope this helps.
