Logstash and filebeat set event.dataset value - elasticsearch

I've a configuration in which filebeat fetches logs from some files (using a custom format) and sends those logs to a logstash instance.
In logstash I apply a gork filter in order to split some of the fields and then I send the output to my elasticsearch instance.
The pipeline works fine and it is correctly loaded on elasticsearch, but no event data is present (such as event.dataset or event.module). So I'm looking for the piece of code for adding such information to my events.
Here my filebeat configuration:
filebeat.config:
modules:
path: ${path.config}/modules.d/*.yml
reload.enabled: false
filebeat.inputs:
- type: log
paths:
- /var/log/*/info.log
- /var/log/*/warning.log
- /var/log/*/error.log
output.logstash:
hosts: '${ELK_HOST:logstash}:5044'
Here my logstash pipeline:
input {
beats {
port => 5044
}
}
filter {
grok {
match => { "message" => "MY PATTERN"}
}
mutate {
add_field => { "logLevelLower" => "%{logLevel}" }
}
mutate {
lowercase => [ "logLevelLower" ]
}
}
output {
elasticsearch {
hosts => "elasticsearch:9200"
user => "USER"
password => "PASSWORD"
index => "%{[#metadata][beat]}-%{logLevelLower}-%{[#metadata][version]}"
}
}

You can do it like this easily with a mutate/add_field filter:
filter {
mutate {
add_field => {
"[ecs][version]" => "1.5.0"
"[event][kind]" => "event"
"[event][category]" => "host"
"[event][type]" => ["info"]
"[event][dataset]" => "module.dataset"
}
}
}
The Elastic Common Schema documentation explains how to pick values for kind, category and type.

Related

Filebeat -> Logstash indexing documents twice

I have Nginx logs being sent from Filebeat to Logstash which is indexing them into Elasticsearch.
Every entry gets indexed twice. Once with the correct grok filter and then again with no fields found except for the "message" field.
This is the logstash configuration.
02-beats-input.conf
input {
beats {
port => 5044
ssl => false
}
}
11-nginx-filter.conf
filter {
if [type] == "nginx-access" {
grok {
patterns_dir => ['/etc/logstash/patterns']
match => {"message" => "%{NGINXACCESS}"
}
date {
match => [ "timestamp", "dd/MMM/YYYY:HH:mm:ss Z", "d/MMM/YYYY:HH:mm:ss Z" ]
}
}
}
Nginx Patterns
NGUSERNAME [a-zA-Z\.\#\-\+_%]+
NGUSER %{NGUSERNAME}
NGINXACCESS %{IPORHOST:clientip}\s+%{NGUSER:ident}\s+%{NGUSER:auth}\s+\[%{HTTPDATE:timestamp}\]\s+\"%{WORD:verb}\s+%{URIPATHPARAM:request}\s+HTTP/%{NUMBER:httpversion}\"\s+%{NUMBER:response}\s+(?:%{NUMBER:bytes}|-)\s+(?:\"(?:%{URI:referrer}|-)\"|%{QS:referrer})\s+%{QS:agent}
30-elasticsearc-output.conf
output {
elasticsearch {
hosts => ["elastic00:9200", "elastic01:9200", "elastic02:9200"]
manage_template => false
index => "%{[#metadata][beat]}-%{+YYYY.MM.dd}"
document_type => "%{[#metadata][type]}"
}
}
Check your filebeat configuration!
During setup I had accidentally un-commented and configured the output.elasticsearch section of the filebeat.yml.
I then also configured the output.logstash section of the configuration but forgot to comment out the elasticsearch output section.
This caused one entry to be sent to logstash where it was grok'd and another one to be sent directly to elasticsearch.

Elasticsearch Logstash Filebeat mapping

Im having a problem with ELK Stack + Filebeat.
Filebeat is sending apache-like logs to Logstash, which should be parsing the lines. Elasticsearch should be storing the split data in fields so i can visualize them using Kibana.
Problem:
Elasticsearch recieves the logs but stores them in a single "message" field.
Desired solution:
Input:
10.0.0.1 some.hostname.at - [27/Jun/2017:23:59:59 +0200]
ES:
"ip":"10.0.0.1"
"hostname":"some.hostname.at"
"timestamp":"27/Jun/2017:23:59:59 +0200"
My logstash configuration:
input {
beats {
port => 5044
}
}
filter {
if [type] == "web-apache" {
grok {
patterns_dir => ["./patterns"]
match => { "message" => "IP: %{IPV4:client_ip}, Hostname: %{HOSTNAME:hostname}, - \[timestamp: %{HTTPDATE:timestamp}\]" }
break_on_match => false
remove_field => [ "message" ]
}
date {
locale => "en"
timezone => "Europe/Vienna"
match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
}
useragent {
source => "agent"
prefix => "browser_"
}
}
}
output {
stdout {
codec => rubydebug
}
elasticsearch {
hosts => ["localhost:9200"]
index => "test1"
document_type => "accessAPI"
}
}
My Elasticsearch discover output:
I hope there are any ELK experts around that can help me.
Thank you in advance,
Matthias
The grok filter you stated will not work here.
Try using:
%{IPV4:client_ip} %{HOSTNAME:hostname} - \[%{HTTPDATE:timestamp}\]
There is no need to specify desired names seperately in front of the field names (you're not trying to format the message here, but to extract seperate fields), just stating the field name in brackets after the ':' will lead to the result you want.
Also, use the overwrite-function instead of remove_field for message.
More information here:
https://www.elastic.co/guide/en/logstash/current/plugins-filters-grok.html#plugins-filters-grok-options
It will look similar to that in the end:
filter {
grok {
match => { "message" => "%{IPV4:client_ip} %{HOSTNAME:hostname} - \[%{HTTPDATE:timestamp}\]" }
overwrite => [ "message" ]
}
}
You can test grok filters here:
http://grokconstructor.appspot.com/do/match

logstash not reading logtype field from beats

I have logstash filebeat and elasticsearch running on one node.
I'm trying to get logstash to identify logs labeled as "syslog" and dump them in an index named "syslog", but it appears to not see the label as they are all going into the "uncategorized" index (my catch all default index)
Here is my beats config
/etc/filebeat/filebeat.yml
filebeat:
prospectors:
-
paths:
- /var/log/messages
fields:
type: syslog
output:
logstash:
hosts: ["localhost:9901"]
Here is my logstash config file
/etc/logstash/conf.d/logstash_server_syslog.conf
input {
beats {
port => "9901"
}
}
filter {
if [type] == "syslog" {
grok {
match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" }
add_field => [ "received_at", "%{#timestamp}" ]
add_field => [ "received_from", "%{host}" ]
}
date {
match => [ "syslog_timestamp", "MMM d HH:mm:ss", "MMM dd HH:mm:ss" ]
}
}
}
output {
if [type] == "syslog" {
elasticsearch {
hosts => ["10.0.0.167:9200", "10.0.0.168:9200"]
index => "syslog"
}
} else {
elasticsearch {
hosts => ["10.0.0.167:9200", "10.0.0.168:9200"]
index => "uncategorized"
}
}
}
Looking at the output (with a stdout{} stanza) would confirm this, but I'm guessing that you missed this part of the doc:
By default, the fields that you specify [in the 'fields' config'] will be grouped under a
fields sub-dictionary in the output document. To store the custom
fields as top-level fields, set the fields_under_root option to true.
To set a custom type field in Filebeat using the document_type configuration option.
filebeat:
prospectors:
- paths:
- /var/log/messages
document_type: syslog
This will set the #metadata.type field for use with Logstash whereas a custom field will not.

How to check if logstash receiving/parsing data from suricata to elasticsearch?

Trying to configure suricata v2.0.8 with ElasticSearch(v1.5.2)-Logstash(v1.4.2)-Kibana(v4.0.2) on Mac OS X 10.10.3 Yosemite.
suricata.yaml:
# Extensible Event Format (nicknamed EVE) event log in JSON format
- eve-log:
enabled: yes
type: file #file|syslog|unix_dgram|unix_stream
filename: eve.json
# the following are valid when type: syslog above
#identity: "suricata"
#facility: local5
#level: Info ## possible levels: Emergency, Alert, Critical,
## Error, Warning, Notice, Info, Debug
types:
- alert
- http:
extended: yes # enable this for extended logging information
# custom allows additional http fields to be included in eve-log
# the example below adds three additional fields when uncommented
#custom: [Accept-Encoding, Accept-Language, Authorization]
- dns
- tls:
extended: yes # enable this for extended logging information
- files:
force-magic: yes # force logging magic on all logged files
force-md5: yes # force logging of md5 checksums
#- drop
- ssh
#- smtp
#- flow
logstash.conf:
input {
file {
path => ["/var/log/suricata/eve.json"]
sincedb_path => ["/var/lib/logstash/"]
codec => json
type => "SuricataIDPS"
start_position => "beginning"
}
}
filter {
if [type] == "SuricataIDPS" {
date {
match => [ "timestamp", "ISO8601" ]
}
ruby {
code => "if event['event_type'] == 'fileinfo'; event['fileinfo']['type']=event['fileinfo']['magic'].to_s.split(',')[0]; end;"
}
}
if [src_ip] {
geoip {
source => "src_ip"
target => "geoip"
#database => "/usr/local/opt/logstash/libexec/vendor/geoip/GeoLiteCity.dat"
add_field => [ "[geoip][coordinates]", "%{[geoip][longitude]}" ]
add_field => [ "[geoip][coordinates]", "%{[geoip][latitude]}" ]
}
mutate {
convert => [ "[geoip][coordinates]", "float" ]
}
if ![geoip.ip] {
if [dest_ip] {
geoip {
source => "dest_ip"
target => "geoip"
#database => "/usr/local/opt/logstash/libexec/vendor/geoip/GeoLiteCity.dat"
add_field => [ "[geoip][coordinates]", "%{[geoip][longitude]}" ]
add_field => [ "[geoip][coordinates]", "%{[geoip][latitude]}" ]
}
mutate {
convert => [ "[geoip][coordinates]", "float" ]
}
}
}
}
}
output {
elasticsearch {
host => localhost
#protocol => http
}
}
Suricata logs all events successfully into eve.json. When I open kibana in browser, I see no dashboards or any information from suricata... So I assume either logstash doesn't read the data from eve.json or doesn't parse the data to elasticsearch (or both)... Are there any ways to check what's going on?
Turn on a debug output in logstash:
output {
stdout {
codec = rubydebug
}
}
Also, try running your query against Elasticsearch directly (curl) rather than with kibana.
I made an adaptation of the nginx log to the suricata log. I can have the geoip information in the suricata logs. I make the adaptation through swatch and send to a log file configured in filebeat.
Ex:
nginx.access.referrer: ET INFO Session Traversal Utilities for NAT (STUN Binding Request) [**
nginx.access.geoip.location:
{
“lon”: -119.688,
“lat”: 45.8696
}
Use the swatch to read the suricata logs and send them to the shell script that will do the adaptation.
Ex:
echo "$IP - - [$nd4] \"GET $IP2:$PORT2 --- $TYPE HTTP/1.1\" 777 0 \"$CVE\" \"Mozilla/5.0 (NONE) (NONE) NONE\"" >> /var/log/suricata_mod.log
Then configure filebeat.yml:
document_type: nginx-access
paths:
/var/log/suricata_mod.log
Restart filebeat.
Finally configure the logstash:
filter {
if [type] == "nginx-access" {
grok {
match => { "message" => ["%{IPORHOST:[nginx][access][remote_ip]} - %{DATA:[nginx][access][user_name]} \[%{HTTPDATE:[nginx][access][time]}\] \"%{WORD:[nginx][access][method]} %{DATA:[nginx][access][url]} HTTP/%{NUMBER:[$
remove_field => "message"}
mutate {
add_field => { "read_timestamp" => "%{#timestamp}" }}
date {
match => [ "[nginx][access][time]", "dd/MMM/YYYY:H:m:s Z" ]
remove_field => "[nginx][access][time]"}
useragent {
source => "[nginx][access][agent]"
target => "[nginx][access][user_agent]"
remove_field => "[nginx][access][agent]"}
geoip {
source => "[nginx][access][remote_ip]"
target => "[nginx][access][geoip]"
database => "/opt/GeoLite2-City.mmdb"}} } output {
elasticsearch {
hosts => [ "xxx.xxx.xxx.xxx:9200" ]
manage_template => false
document_type => "%{[#metadata][type]}"
index => "%{[#metadata][beat]}-%{+YYYY.MM.dd}"}}
And restart logstash. In Kibana create a filebeat- * index. Ready.

Logstash not importing files due to missing index error

I am having a difficult time trying to get the combination of the Logstash, Elasticsearch & Kibana working in my Windows 7 environment.
I have set all 3 up and they all seem to be running fine, Logstash and Elasticsearch are running as Windows services and Kibana as a website in IIS.
Logstash is running from http://localhost:9200
I have a web application creating log files in .txt with the format:
Datetime=[DateTime], Value=[xxx]
The log files get created in this directory:
D:\wwwroot\Logs\Errors\
My logstash.conf file looks like this:
input {
file {
format => ["plain"]
path => ["D:\wwwroot\Logs\Errors\*.txt"]
type => "testlog"
}
}
output {
elasticsearch {
embedded => true
}
}
My Kibana config.js file looks like this:
define(['settings'],
function (Settings) {
return new Settings({
elasticsearch: "http://localhost:9200",
kibana_index: "kibana-int",
panel_names: [
'histogram',
'map',
'pie',
'table',
'filtering',
'timepicker',
'text',
'fields',
'hits',
'dashcontrol',
'column',
'derivequeries',
'trends',
'bettermap',
'query',
'terms'
]
});
});
When I view Kibana I see the error:
No index found at http://localhost:9200/_all/_mapping. Please create at least one index.If you're using a proxy ensure it is configured correctly.
I have no idea on how to create the index, so if anyone can shed some light on what I am doing wrong that would be great.
It seems like nothing is making it to elasticsearch currently.
For the current version of es (0.90.5), I had to use elasticsearch_http output. The elasticsearch output seemed to be too closely associated with 0.90.3.
e.g: here is how my config is for log4j format to elastic search
input {
file {
path => "/srv/wso2/wso2am-1.4.0/repository/logs/wso2carbon.log"
path => "/srv/wso2/wso2as-5.1.0/repository/logs/wso2carbon.log"
path => "/srv/wso2/wso2is-4.1.0/repository/logs/wso2carbon.log"
type => "log4j"
}
}
output {
stdout { debug => true debug_format => "ruby"}
elasticsearch_http {
host => "localhost"
port => 9200
}
}
For my file format, I have a grok filter as well - to parse it properly.
filter {
if [message] !~ "^[ \t\n]+$" {
# if the line is a log4j type
if [type] == "log4j" {
# parse out fields from log4j line
grok {
match => [ "message", "TID:%{SPACE}\[%{BASE10NUM:thread_name}\]%{SPACE}\[%{WORD:component}\]%{SPACE}\[%{TIMESTAMP_ISO8601:timestamp}\]%{SPACE}%{LOGLEVEL:level}%{SPACE}{%{JAVACLASS:java_file}}%{SPACE}-%{SPACE}%{GREEDYDATA:log_message}" ]
add_tag => ["test"]
}
if "_grokparsefailure" not in [tags] {
mutate {
replace => ["message", " "]
}
}
multiline {
pattern => "^TID|^ $"
negate => true
what => "previous"
add_field => {"additional_log" => "%{message}"}
remove_field => ["message"]
remove_tag => ["_grokparsefailure"]
}
mutate {
strip => ["additional_log"]
remove_tag => ["test"]
remove_field => ["message"]
}
}
} else {
drop {}
}
}
Also, I would get elasticsearch head plugin to monitor your content in elasticsearch- to easily verify the data and state it is in.

Resources