Logstash kibana geoip filter conflict - elasticsearch

I have been trying to get the geo filter working for my logs, with no luck yet.
I keep recreating my Logstash index in ES and recreating the GeoIP field with the default type, double, and float, but Kibana keeps complaining that my geoip.location property has a conflict.
Any suggestion would be appreciated.
geoip {
  source => "[headers][x-forwarded-for]"
  target => "geoip"
  database => "/etc/logstash/GeoLiteCity.dat"
  add_field => [ "[geoip][coordinates]", "%{[geoip][longitude]}" ]
  add_field => [ "[geoip][coordinates]", "%{[geoip][latitude]}" ]
}
mutate {
  convert => [ "[geoip][coordinates]", "float"]
}

Resolved the issue by specifying a default mapping template and recreating the index; geoip.location now has the "geo_point" data type.
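For reference, here is a minimal sketch of such a template (the index pattern logstash-*, the template name, and the ES 5.x-style _default_ mapping are assumptions; adjust them to your own index naming and version):

curl -XPUT 'localhost:9200/_template/logstash_geoip' -H 'Content-Type: application/json' -d '
{
  "template": "logstash-*",
  "mappings": {
    "_default_": {
      "properties": {
        "geoip": {
          "properties": {
            "location": { "type": "geo_point" }
          }
        }
      }
    }
  }
}'

Any index created after this template exists (for example by recreating the Logstash index) will map geoip.location as geo_point.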

Related

How to filter data with Logstash before storing parsed data in Elasticsearch

I understand that Logstash is for aggregating and processing logs. I have NGINX logs and had my Logstash config set up as:
filter {
  grok {
    match => [ "message" , "%{COMBINEDAPACHELOG}+%{GREEDYDATA:extra_fields}"]
    overwrite => [ "message" ]
  }
  mutate {
    convert => ["response", "integer"]
    convert => ["bytes", "integer"]
    convert => ["responsetime", "float"]
  }
  geoip {
    source => "clientip"
    target => "geoip"
    add_tag => [ "nginx-geoip" ]
  }
  date {
    match => [ "timestamp" , "dd/MMM/YYYY:HH:mm:ss Z" ]
    remove_field => [ "timestamp" ]
  }
  useragent {
    source => "agent"
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "weblogs-%{+YYYY.MM}"
    document_type => "nginx_logs"
  }
  stdout { codec => rubydebug }
}
This would parse the unstructured logs into a structured form of data, and store the data into monthly indexes.
What I discovered is that the majority of logs were contributed by robots/web-crawlers. In Python I would filter them out with:
browser_names = browser_names[~browser_names.str.\
match('^[\w\W]*(google|bot|spider|crawl|headless)[\w\W]*$', na=False)]
However, I would like to filter them out with Logstash so I can save a lot of disk space on the Elasticsearch server. Is there a way to do that? Thanks in advance!
Thanks to LeBigCat for generously giving a hint. I solved this problem by adding the following under the filter:
if [browser_names] =~ /(?i)^[\w\W]*(google|bot|spider|crawl|headless)[\w\W]*$/ {
  drop {}
}
The (?i) flag is for case-insensitive matching.
In your filter you can ask for drop (https://www.elastic.co/guide/en/logstash/current/plugins-filters-drop.html). As you already have your pattern, it should be pretty fast ;)

Plot a Tile map with the ELK stack

I'm trying to create a tile map with Kibana. My Logstash config file works correctly and generates everything Kibana needs to plot a tile map. This is my Logstash config:
input {
  file {
    path => "/home/ec2-user/part.csv"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}
filter {
  csv {
    separator => ","
    columns => ["kilo_bytes_total","ip","session_number","request_number_total","duration_minutes_total","referer_list","filter_match_count_avg","request_number_avg","duration_minutes_avg","kilo_bytes_avg","segment_duration_avg","req_by_minute_avg","segment_mix_rank_avg","offset_avg_avg","offset_std_avg","extrem_interval_count_avg","pf0_avg","pf1_avg","pf2_avg","pf3_avg","pf4_avg","code_0_avg","code_1_avg","code_2_avg","code_3_avg","code_4_avg","code_5_avg","volume_classification_filter_avg","code_classification_filter_avg","profiles_classification_filter_avg","strange_classification_filter_avg"]
  }
  geoip {
    source => "ip"
    database => "/home/ec2-user/logstash-5.2.0/GeoLite2-City.mmdb"
    target => "geoip"
    add_field => [ "[geoip][coordinates]", "%{[geoip][longitude]}" ]
    add_field => [ "[geoip][coordinates]", "%{[geoip][latitude]}" ]
    add_tag => "geoip"
  }
  mutate {
    convert => [ "[geoip][coordinates]", "float"]
  }
}
output {
  elasticsearch {
    index => "geotrafficip"
  }
}
And this is what it generates:
It looks cool. But when I try to create my tile map, I get this message:
What should I do?
It seems that I must add the ability to use dynamic templates somewhere. Should I create a template and add it to my Logstash config file?
Can anybody give me some feedback? Thanks!
If you look in the Kibana settings for your index, you'll need at least one field to show up with a type of geo_point to be able to get anything on a map.
If you don't already have a geo_point field, you'll need to re-index your data after setting up an appropriate mapping for the geoip.coordinates field. For example: https://stackoverflow.com/a/42004303/2785358
If you are using a relatively new version of Elasticsearch (2.3 or later), it's relatively easy to re-index your data. You need to create a new index with the correct mapping, use the re-index API to copy the data to the new index, delete the original index and then re-index back to the original name.
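A rough sketch of those steps against the index from this question (the new index name geotrafficip-v2 is only an example, and the _default_ mapping assumes ES 5.x):

# 1. Create a new index with geoip.location mapped as geo_point
curl -XPUT 'localhost:9200/geotrafficip-v2' -H 'Content-Type: application/json' -d '
{
  "mappings": {
    "_default_": {
      "properties": {
        "geoip": {
          "properties": { "location": { "type": "geo_point" } }
        }
      }
    }
  }
}'

# 2. Copy the documents across with the reindex API
curl -XPOST 'localhost:9200/_reindex' -H 'Content-Type: application/json' -d '
{
  "source": { "index": "geotrafficip" },
  "dest": { "index": "geotrafficip-v2" }
}'

# 3. Then delete the old index and, if you want the original name back, reindex again the other way.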
You are using the geoip filter incorrectly by trying to convert the longitude and latitude to float. Get rid of your mutate filter and change the geoip filter to this:
geoip {
  source => "ip"
  fields => ["latitude","longitude"]
  add_tag => "geoip"
}
This will create the appropriate fields and the required GeoJSON object.

ELK Stack - Customize autogenerated field mappings

I've got a very basic ELK stack set up and am passing logs to it via syslog. I have used built-in grok patterns to split the logs into fields. But the field mappings are auto-generated by the Logstash Elasticsearch output plugin, and I am unable to customize them.
For instance, I create a new field named "dst_geoip" using my Logstash config file (see below):
geoip {
  database => "/usr/local/share/GeoIP/GeoLiteCity.dat" ### Change me to location of GeoLiteCity.dat file
  source => "dst_ip"
  target => "dst_geoip"
  fields => [ "ip", "country_code2", "country_name", "latitude", "longitude", "location" ]
  add_field => [ "coordinates", "%{[dst_geoip][latitude]},%{[dst_geoip][longitude]}" ]
  add_field => [ "dst_country", "%{[dst_geoip][country_code2]}"]
  add_field => [ "flow_dir", "outbound" ]
}
I want to assign it the type "geo_point", which I cannot edit from Kibana. Online documentation mentions manually updating the mapping on the respective index using the Elasticsearch APIs. But Logstash generates many indices (one per day). If I update one index, will the mapping stay the same for future indices?
What you're looking for is a "template".
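An index template is applied automatically to every new index whose name matches its pattern, so your daily indices will all pick up the same mapping. One way to manage it is directly from the Logstash elasticsearch output; a minimal sketch (the template file path and name below are placeholders for your own):

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    # JSON file containing your mapping, e.g. "coordinates" as geo_point
    template => "/etc/logstash/templates/dst_geoip_template.json"
    template_name => "dst_geoip"
    template_overwrite => true
  }
}

Alternatively, you can PUT the same template against the _template API yourself; either way it applies to indices created after the template exists, not to existing ones.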

How to parse an XML file with Logstash filters

I'm trying to index some simple XML files with Elasticsearch and Logstash. So far I have the ELK stack set up, plus logstash-forwarder. I am trying to use the documentation to set up an xml filter, but I just can't seem to get it right.
My XML format is pretty straightforward:
<Recording>
<DataFile description="desc" fileName="test.wav" Source="mic" startTime="2014-12-12_121212" stopTime="2014-12-12_131313"/>
</Recording>
I just want each file to be an entry in Elasticsearch, and every parameter in the DataFile tag to be a key-value pair that I can search. Since the documentation is getting me nowhere, how would such a filter look? I have also tried to use the answers in this and this without any luck.
Add the below to your logstash-forwarder configuration and change the Logstash server IP, certificate path, and log path accordingly.
{
  "network": {
    "servers": [ "x.x.x.x:5043" ],
    "ssl ca": "/cert/server.crt",
    "timeout": 15
  },
  "files": [
    {
      "paths": [
        "D:/ELK/*.log"
      ],
      "fields": { "type": "log" }
    }
  ]
}
Add the below input plugin to your Logstash server configuration. Change the certificate, key path, and name accordingly.
lumberjack {
  port => 5043
  type => "lumberjack"
  ssl_certificate => "/cert/server.crt"
  ssl_key => "D:/ELK/logstash/cert/server.key"
  codec => multiline {
    pattern => "(\/Recording>)"
    what => "previous"
    negate => true
  }
}
Now add the below grok filter under your Logstash filter section:
grok {
  match => ["message", "(?<content>(<Recording(.)*?</Recording>))"]
  tag_on_failure => [ ]
}
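If you also want each DataFile attribute to become its own searchable field (this is not part of the original answer), the Logstash xml filter can parse the block captured above; a rough sketch, assuming the content field created by the grok pattern:

xml {
  source => "content"      # the raw <Recording>...</Recording> block captured by grok
  target => "recording"    # parsed result is stored under this field
  store_xml => true
}

The DataFile attributes (description, fileName, Source, startTime, stopTime) should then show up nested under the recording field.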
Finally, in the Logstash output section, add:
elasticsearch {
  host => "127.0.0.1"
  port => "9200"
  protocol => "http"
  index => "recording-%{+YYYY.MM.dd}"
  index_type => "log"
}
Now, when you add your XML messages to your log file, each entry will be processed and stored in your Elasticsearch server.
Thanks,

logstash, syslog and grok

I am working on an ELK stack configuration. logstash-forwarder is used as a log shipper; each type of log is tagged with a type tag:
{
  "network": {
    "servers": [ "___:___" ],
    "ssl ca": "___",
    "timeout": 15
  },
  "files": [
    {
      "paths": [
        "/var/log/secure"
      ],
      "fields": {
        "type": "syslog"
      }
    }
  ]
}
That part works fine... Now I want Logstash to split the message string into its parts; luckily, that is already implemented in the default grok patterns, so the logstash.conf remains simple so far:
input {
  lumberjack {
    port => 6782
    ssl_certificate => "___"
    ssl_key => "___"
  }
}
filter {
  if [type] == "syslog" {
    grok {
      match => [ "message", "%{SYSLOGLINE}" ]
    }
  }
}
output {
  elasticsearch {
    cluster => "___"
    template => "___"
    template_overwrite => true
    node_name => "logstash-___"
    bind_host => "___"
  }
}
The issue I have here is that the document received by Elasticsearch still holds the whole line (including timestamp etc.) in the message field. Also, #timestamp still shows the date when Logstash received the message, which makes it bad to search, since Kibana queries #timestamp in order to filter by date... Any idea what I'm doing wrong?
Thanks, Daniel
The reason your "message" field contains the original log line (including timestamps etc) is that the grok filter by default won't allow existing fields to be overwritten. In other words, even though the SYSLOGLINE pattern,
SYSLOGLINE %{SYSLOGBASE2} %{GREEDYDATA:message}
captures the message into a "message" field, it won't overwrite the current field value. The solution is to set the grok filter's "overwrite" parameter:
grok {
  match => [ "message", "%{SYSLOGLINE}" ]
  overwrite => [ "message" ]
}
To populate the "#timestamp" field, use the date filter. This will probably work for you:
date {
  match => [ "timestamp", "MMM dd HH:mm:ss", "MMM d HH:mm:ss" ]
}
It is hard to know where the problem is without seeing an example event that is causing it. I suggest you try the grok debugger to verify the pattern is correct and adjust it to your needs once you see the problem.