Using event field as hash variable - elasticsearch

I'm receiving events in Logstash containing a measurement, values, and tags. I do not know ahead of time what fields and tags there are. So I wanted to do something like this:
input {
  http {}
}
filter {
  ruby {
    code => '
      tags = event.get("stats_tags").split(",")
      samples = event.get("stats_samples").split(" ")
      datapoints = {}
      samples.each {|s|
        splat = s.split(" ")
        datapoints[splat[0]] = splat[1]
      }
      event.set("[@metadata][stats-send-as-tags]", tags)
      event.set("[@metadata][stats-datapoints]", datapoints)
    '
  }
}
output {
  influxdb {
    host => "influxdb"
    db => "events_db"
    measurement => measurement
    send_as_tags => [@metadata][stats-send-as-tags]
    data_points => [@metadata][stats-datapoints]
  }
}
But this produces an error. After much googling to no avail, I'm starting to think this is impossible.
Is there a way to pass a hash and an array from event fields to output/filter configuration?
EDIT: If I double-quote it, the error I'm getting is:
output {
  influxdb {
    # This setting must be a hash
    # This field must contain an even number of items, got 1
    data_points => "[@metadata][stats-datapoints]"
    ...
  }
}

Related

I want to convert a single JSON event into multiple events through Logstash. Hope to get some inspiration, thanks.

Four fields (warnTags, warnSlrs, warnActions, denyMsg) need to be split on the semicolons (;) they contain.
Raw string:
{ "waf": {
"warnTags": "OWASP_CRS/WEB_ATTACK/SQL_INJECTION;OWASP_CRS/WEB_ATTACK/XSS;OWASP_CRS/WEB_ATTACK/XSS;OWASP_CRS/WEB_ATTACK/XSS;OWASP_CRS/WEB_ATTACK/SPECIAL_CHARS;OWASP_CRS/WEB_ATTACK/SQL_INJECTION",
"policy": "bot_77598",
"warnSlrs": "ARGS:wvstest;ARGS:wvstest;ARGS:wvstest;ARGS:wvstest;ARGS:wvstest;ARGS:wvstest",
"riskTuples": ":-973305-973333-973335",
"warnActions": "2;2;2;2;2;2",
"denyActions": "3",
"warnMsg": "SQL Injection Attack;XSS Attack Detected;IE XSS Filters - Attack Detected;IE XSS Filters - Attack Detected;Restricted SQL Character Anomaly Detection Alert - Total # of special characters exceeded;Classic SQL Injection Probes 1/2",
"riskGroups": ":XSS-ANOMALY",
"warnRules": "950901;973305;973333;973335;981173;981242",
"denyMsg": "Anomaly Score Exceeded for Cross-Site Scripting",
"ver": "2.0",
"denyData": "VmVjdG9yIFNjb3JlOiBx",
"riskScores": ":-5-5-2",
"warnData": "eHNzdGFnPigpbG9jeHNz;amF2YXNYcm"
} }
Expected Output Result
{
"waf": {
"warnTags": "OWASP_CRS/WEB_ATTACK/SQL_INJECTION",
"policy": "bot_77598",
"warnSlrs": "ARGS:wvstest",
"riskTuples": ":-973305-973333-973335",
"warnActions": "2",
"denyActions": "3",
"warnMsg": "SQL Injection Attack",
"riskGroups": ":XSS-ANOMALY",
"warnRules": "950901",
"denyMsg": "Anomaly Score Exceeded for Cross-Site Scripting",
"ver": "2.0",
"denyData": "VmVjdG9yIFNjb3JlOiBx",
"riskScores": ":-5-5-2",
"warnData": "eHNzdGFnPigpbG9jeHNz;amF2YXNYcm"
}
}
{
"waf": {
"warnTags": "OWASP_CRS/WEB_ATTACK/XSS",
"policy": "bot_77598",
"warnSlrs": "ARGS:wvstest",
"riskTuples": ":-973305-973333-973335",
"warnActions": "2",
"denyActions": "3",
"warnMsg": "XSS Attack Detected",
"riskGroups": ":XSS-ANOMALY",
"warnRules": "973305",
"denyMsg": "Anomaly Score Exceeded for Cross-Site Scripting",
"ver": "2.0",
"denyData": "VmVjdG9yIFNjb3JlOiBx",
"riskScores": ":-5-5-2",
"warnData": "eHNzdGFnPigpbG9jeHNz;amF2YXNYcm"
}
}
filter {
  ruby {
    code => "
      @info = []
      events = event.to_hash
      @warnTags = events['waf']['warnTags'].split(';')
      @warnMsgs = events['waf']['warnMsg'].split(';')
      @warnActions = events['waf']['warnActions'].split(';')
      @warnRules = events['waf']['warnRules'].split(';')
      @list = @warnTags.zip(@warnMsgs, @warnActions, @warnRules)
      @list.each do |tag, msg, action, rule|
        detail = {
          'tag' => tag,
          'msg' => msg,
          'action' => action,
          'rule' => rule
        }
        @info.push(detail)
      end
      event.remove('[waf][warnTags]')
      event.remove('[waf][warnMsg]')
      event.remove('[waf][warnActions]')
      event.remove('[waf][warnRules]')
      event.set('[waf][info]', @info)
    "
  }
  split {
    field => "[waf][info]"
  }
}
The config below should be along the lines of what you need. It includes parsing as JSON at the outset, which you may not need depending on prior steps in your pipeline. Essentially this will split the warnTags field on ; to begin with; that will result in warnTags being an array nested within one object. The output of the string split is then passed to the higher-level split filter, which creates multiple output events by splitting on an input field, in this case warnTags (again). Hope this helps!
[EDIT: Added warnSlrs as second split field]
filter {
  json {
    source => "message"
  }
  mutate {
    split => { "[waf][warnTags]" => ";" }
  }
  mutate {
    split => { "[waf][warnSlrs]" => ";" }
  }
  split {
    field => "[waf][warnTags]"
  }
  split {
    field => "[waf][warnSlrs]"
  }
}

Logstash escape JSON Keys

I have multiple systems that send data as JSON Request Body. This is my simple config file.
input {
  http {
    port => 5001
  }
}
output {
  elasticsearch {
    hosts => "elasticsearch:9200"
  }
}
In most cases this works just fine. I can look at the JSON data with Kibana.
In some cases the JSON will not be processed. It has something to do with JSON escaping. For example: if a key contains a '.', the JSON will not be processed.
I cannot control the JSON. Is there a way to escape these characters in a JSON key?
Update: As mentioned in the comments, here is an example of a JSON string (the content is altered, but I've tested this JSON string and it has the same behavior as the original):
{
  "http://example.com": {
    "a": "",
    "b": ""
  }
}
My research finally brings me back to my own post.
Before Elasticsearch 2.0, dots in keys were allowed. Since version 2.0 this is no longer the case.
One user in the Logstash forum developed a Ruby script that takes care of the dots in JSON keys:
filter {
  ruby {
    init => "
      def remove_dots hash
        new = Hash.new
        hash.each { |k,v|
          if v.is_a? Hash
            v = remove_dots(v)
          end
          new[ k.gsub('.','_') ] = v
          if v.is_a? Array
            v.each { |elem|
              if elem.is_a? Hash
                elem = remove_dots(elem)
              end
              new[ k.gsub('.','_') ] = elem
            } unless v.nil?
          end
        } unless hash.nil?
        return new
      end
    "
    code => "
      event.instance_variable_set(:@data, remove_dots(event.to_hash))
    "
  }
}
All credit goes to @hanzmeier1234 (Field name cannot contain ‘.’).
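For illustration, here is a quick plain-Ruby check of what remove_dots does to the example document from the question above (run the remove_dots definition from the init block first, e.g. in irb):

require 'json'

# The top-level key contains a dot, so it gets rewritten; the nested keys are untouched.
doc = JSON.parse('{ "http://example.com": { "a": "", "b": "" } }')
p remove_dots(doc)
# => {"http://example_com"=>{"a"=>"", "b"=>""}}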

logstash kv filter, converting strings to integers using dynamic mapping

I have a log with a format similar to:
name=johnny amount=30 uuid=2039248934
The problem is I am using this parser on multiple log files with each basically containing numerous kv pairs.
Is there a way to recognize when values are integers and cast them as such (rather than as strings), without having to use mutate on every single key-value pair?
I found this link, but it was very vague about where the template JSON file was supposed to go and how I was to go about using it.
Can kv be told to auto-detect numeric values and emit them as numeric JSON values?
You can use the ruby plugin to do it.
input {
  stdin {}
}
filter {
  ruby {
    code => "
      fieldArray = event['message'].split(' ');
      for field in fieldArray
        name = field.split('=')[0];
        value = field.split('=')[1];
        if value =~ /\A\d+\Z/
          event[name] = value.to_i
        else
          event[name] = value
        end
      end
    "
  }
}
output {
  stdout { codec => rubydebug }
}
First, split the message into an array on spaces.
Then, for each k,v pair, check whether the value is numeric; if it is, convert it to an Integer.
Here is the sample output for your input:
{
       "message" => "name=johnny amount=30 uuid=2039248934",
      "@version" => "1",
    "@timestamp" => "2015-06-25T08:24:39.755Z",
          "host" => "BEN_LIM",
          "name" => "johnny",
        "amount" => 30,
          "uuid" => 2039248934
}
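For reference, the numeric check used in the filter is just a whole-string digits match, which you can verify in irb:

"30"         =~ /\A\d+\Z/   # => 0   (whole string is digits, so it is cast with to_i)
"johnny"     =~ /\A\d+\Z/   # => nil (left as a string)
"2039248934" =~ /\A\d+\Z/   # => 0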
Update Solution for Logstash 5:
input {
  stdin {}
}
filter {
  ruby {
    code => "
      fieldArray = event.get('message').split(' ');
      for field in fieldArray
        name = field.split('=')[0];
        value = field.split('=')[1];
        if value =~ /\A\d+\Z/
          event.set(name, value.to_i)
        else
          event.set(name, value)
        end
      end
    "
  }
}
output {
  stdout { codec => rubydebug }
}
Note that if you decide to upgrade to Logstash 5, there are some breaking changes:
https://www.elastic.co/guide/en/logstash/5.0/breaking-changes.html
In particular, the event API changed: fields now need to be accessed with event.get or event.set. Here is what I used to get it working (based on Ben Lim's example):
input {
  stdin {}
}
filter {
  ruby {
    code => "
      fieldArray = event.get('message').split(' ');
      for field in fieldArray
        name = field.split('=')[0];
        value = field.split('=')[1];
        if value =~ /\A\d+\Z/
          event.set(name, value.to_i)
        else
          event.set(name, value)
        end
      end
    "
  }
}
output {
  stdout { codec => rubydebug }
}
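If you have other ruby filter code of your own to migrate, the change is mechanical; a minimal comparison (these lines assume they run inside a ruby filter's code block, where event is provided by Logstash):

# Old event API (Logstash 2.x, as in the first example):
#   value = event['message']
#   event['amount'] = 30
# New event API (Logstash 5+, as used above):
value = event.get('message')
event.set('amount', 30)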

Ruby mongoid aggregation return object

I am doing a MongoDB aggregation using Mongoid, via ModelName.collection.aggregate(pipeline). The value returned is an array and not a Mongoid::Criteria, so if I do a first on the array, I get the first element, which is of type BSON::Document instead of ModelName. As a result, I am unable to use it as a model.
Is there a method to return a criteria instead of an array from the aggregation, or to convert a BSON document to a model instance?
Using mongoid (4.0.0).
I've been struggling with this on my own too. I'm afraid you have to build your "models" on your own. Let's take an example from my code:
class Searcher
  # ...
  def results(page: 1, per_page: 50)
    pipeline = []
    pipeline << {
      "$match" => {
        title: /#{@params['query']}/i
      }
    }
    geoNear = {
      "near" => coordinates,
      "distanceField" => "distance",
      "distanceMultiplier" => 3959,
      "num" => 500,
      "spherical" => true,
    }
    pipeline << {
      "$geoNear" => geoNear
    }
    count = aggregate(pipeline).count
    pipeline << { "$skip" => ((page.to_i - 1) * per_page) }
    pipeline << { "$limit" => per_page }
    places_hash = aggregate(pipeline)
    places = places_hash.map { |attrs| Offer.new(attrs) { |o| o.new_record = false } }
    # ...
    places
  end

  def aggregate(pipeline)
    Offer.collection.aggregate(pipeline)
  end
end
I've omitted a lot of code from the original project, just to show the approach I've been taking.
The most important thing here was the line:
places_hash.map { |attrs| Offer.new(attrs) { |o| o.new_record = false } }
Here I'm creating an array of Offers, and additionally I'm manually setting their new_record attribute to false, so they behave like any other documents fetched by a simple Offer.where(...).
It's not beautiful, but it worked for me, and I could take the best of whole Aggregation Framework!
Hope that helps!
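A hypothetical usage sketch (the Searcher initializer, @params, and coordinates are omitted above, so this only illustrates the shape of what results returns):

searcher = Searcher.new                 # initialization details omitted above
offers = searcher.results(page: 1, per_page: 50)
offers.first.class                      # => Offer
offers.first.new_record?                # => false, so it behaves like a loaded document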

Ruby finding duplicates in MongoDB

I am struggling to get this working efficiently. I think map-reduce is the answer, but I can't get anything working. I know it is probably a simple answer; hopefully someone can help.
The Entry model looks like this:
field :var_name, type: String
field :var_data, type: String
field :var_date, type: DateTime
field :external_id, type: Integer
If the external data source malfunctions, we get duplicate data. One way to stop this was, when consuming the results, to check whether a record with the same external_id already exists as one we have already consumed. However, this is slowing down the process a lot. The plan now is to check for duplicates once a day, so we are looking to get a list of Entries with the same external_id, which we can then sort through and delete those no longer needed.
I have tried adapting the snippet from https://coderwall.com/p/96dp8g/find-duplicate-documents-in-mongoid-with-map-reduce as shown below, but I get:
failed with error 0: "exception: assertion src/mongo/db/commands/mr.cpp:480"
def find_duplicates
  map = %Q{
    function() {
      emit(this.external_id, 1);
    }
  }
  reduce = %Q{
    function(key, values) {
      return Array.sum(values);
    }
  }
  Entry.all.map_reduce(map, reduce).out(inline: true).each do |entry|
    puts entry["_id"] if entry["value"] != 1
  end
end
Am I way off? Could anyone suggest a solution? I am using Mongoid, Rails 4.1.6, and Ruby 2.1.
I got it working using Stennie's suggestion in the comments on the question to use the Aggregation Framework. It looks like this:
results = Entry.collection.aggregate([
  { "$group" => {
    _id: { "external_id" => "$external_id" },
    recordIds: { "$addToSet" => "$_id" },
    count: { "$sum" => 1 }
  }},
  { "$match" => {
    count: { "$gt" => 1 }
  }}
])
I then loop through the results and delete any unnecessary entries.
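The cleanup loop itself isn't shown above; a minimal sketch, assuming you simply keep the first _id in each recordIds set and delete the rest:

results.each do |doc|
  keep, *duplicates = doc["recordIds"]          # keep one record per external_id
  Entry.where(:_id.in => duplicates).delete_all # remove the remaining duplicates
end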
