How can I translate all values in an array with Logstash? - elasticsearch

I'm indexing logs in Elasticsearch through Logstash. The logs contain a field with an array of codes, for example:
indicator.codes : [ "3", "120", "148" ]
Is there some way in Logstash to look up these codes in a csv and save the categories and descriptions in 2 new fields, such as indicator.categories and indicator.descriptions?
A subset of the csv with 3 columns:
Column 1 => indicator.code
Column 2 => indicator.category
Column 3 => indicator.description
3;Hiding;There are signs in the header
4;Hiding;This binary might try to schedule a task
34;General;This is a 7zip selfextracting file
120;General;This is a selfextracting RAR file
121;General;This binary tries to run as a service
148;Stealthiness;This binary uses tunnel traffic
I've been looking at the csv filter and the translate filter, but neither seems able to look up multiple keys.
The translate filter appears to work with only 2 columns, and the csv filter seems unable to loop through the indicator.codes array.

I would suggest using a Ruby filter to loop over the indicator.codes and compare them against the data you retrieved from the csv.
https://www.elastic.co/guide/en/logstash/8.1/plugins-filters-ruby.html
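A minimal sketch of such a filter, assuming the csv sits at /etc/logstash/indicators.csv (the path, the uniq on categories, and the target field names are illustrative choices, not part of the original answer):

filter {
  ruby {
    # build the code => category/description lookup table once, at pipeline startup
    # (file path is an assumption for this sketch)
    init => "
      require 'csv'
      @lookup = {}
      CSV.foreach('/etc/logstash/indicators.csv', col_sep: ';') do |row|
        @lookup[row[0]] = { 'category' => row[1], 'description' => row[2] }
      end
    "
    # look up every code in the event's array and emit the two new array fields
    code => "
      codes = event.get('[indicator][codes]') || []
      matches = codes.map { |c| @lookup[c] }.compact
      event.set('[indicator][categories]', matches.map { |m| m['category'] }.uniq)
      event.set('[indicator][descriptions]', matches.map { |m| m['description'] })
    "
  }
}

With the sample csv above, an event with indicator.codes [ "3", "120", "148" ] would get indicator.categories [ "Hiding", "General", "Stealthiness" ] and the three matching descriptions.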

Related

MariaDB - NamedQuery to filter records based on a column that contains comma-separated values

I have a Spring Boot application that connects to a MariaDB to fetch records. In my MariaDB table, I have a column called "tags" and this column consists of comma-separated values, something like "summer,vacation,sea,sun,beach,travel".
There is a search form at the frontend. I want to create a NamedQuery to filter the search results.
I have something like this for the "tags" part:
" AND m.tags LIKE CONCAT('%', :tags, '%')"
This works fine if the user input contains only one word, like "summer" or "vacation". It works with multiple words as well, but only if you write those words exactly as they appear in the tags column, like "summer,vacation".
The problem occurs when the user types something like "vacation summer". Since these values are not in the same order as they are persisted in the database, the result is an empty list. I receive the input value as a list of Strings in the backend.
How can I modify my existing NamedQuery to work with all possible inputs?

How to parse a csv file which has some field containing the separator (comma) as values

sample message - 111,222,333,444,555,val1in6th,val2in6th,777
The sixth column contains a value that itself contains commas (val1in6th,val2in6th is a sample value of the 6th column).
When I use a simple csv filter this message gets converted to 8 fields. I want to be able to tell the filter that val1in6th,val2in6th should be treated as a single value and placed in the 6th column (it's okay not to have a comma between val1in6th and val2in6th when placed in the output as the 6th column).
Change your plugin: instead of the csv filter, use the grok filter - doc here.
Then use a debugger to build a pattern for your lines, like this one: https://grokdebug.herokuapp.com/
For your lines you could use this grok expression:
%{WORD:FIELD1},%{WORD:FIELD2},%{WORD:FIELD3},%{WORD:FIELD4},%{WORD:FIELD5},%{GREEDYDATA:FIELD6}
or:
%{INT:FIELD1},%{INT:FIELD2},%{INT:FIELD3},%{INT:FIELD4},%{INT:FIELD5},%{GREEDYDATA:FIELD6}
The latter changes the datatypes of the first 5 fields in Elasticsearch.
To learn more about parsing csv with the grok filter, you can use this official Elastic blog guide; it explains how to use grok with an ingest pipeline, but the same applies to Logstash.
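A minimal Logstash filter sketch wiring up the second pattern (the source field name message is the Logstash default, assumed here):

filter {
  grok {
    # parse the first five comma-separated integers, then capture the rest as FIELD6
    match => {
      "message" => "%{INT:FIELD1},%{INT:FIELD2},%{INT:FIELD3},%{INT:FIELD4},%{INT:FIELD5},%{GREEDYDATA:FIELD6}"
    }
  }
}

Note that GREEDYDATA captures everything after the fifth comma, so with the sample message FIELD6 would contain val1in6th,val2in6th,777; if the trailing 777 is meant to be a separate column, FIELD6 would need a more specific pattern.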

Elasticsearch + Logstash: How to add a field based on existing data at import time

Currently, I'm importing data into Elasticsearch through Logstash by reading csv files.
Now let's say I have two numeric fields in the csv: age and weight.
I would need to add a 3rd field on the fly, by doing some math on the age, the weight and other external data (or a function result); and I need that 3rd field to be created when importing the data.
Is there any way to do this?
What could be the best practice?
In all Logstash filter sections, you can add fields via add_field, but that's typically static data.
Math calculations need a separate plugin.
As mentioned there, the ruby filter plugin would probably be your best option. Here is an example snippet for your pipeline:
filter {
  # add a calculated field, for example BMI, from height and weight
  ruby {
    # event.get/event.set is the Event API required since Logstash 5;
    # BMI = weight (kg) divided by height (m) squared, so use floats
    code => "event.set('[data][bmi]', event.get('[data][weight]').to_f / (event.get('[data][height]').to_f ** 2))"
  }
}
Alternatively, Kibana has scripted fields, which are meant to be visualized but cannot be queried.

Logstash plugin kv - keys and values not getting read into Elastic

Small part of my CSV log:
TAGS
contentms:Drupal;contentms.ver:7.1.8;vuln:rce;cve:CVE-2018-0111;
cve:CVE-2014-0160;vuln:Heartbleed;
contentms.ver:4.1.6;contentms:WordPress;tag:backdoor
tag:energia;
The idea is that I know nothing of the keys and values other than the format
key:value;key:value;key:value;key:value; etc
I just create a pattern with the logstash plugin "kv":
kv {
  source => "TAGS"
  field_split => ";"
  value_split => ":"
  target => "TAGS"
}
I've been trying to get my data into Elasticsearch for Kibana and some of it goes through. But, for example, the keys contentms: and contentms.ver: don't get read. Also, for keys that do, only one value is searchable in Kibana. For example, the key cve: is seen on multiple lines multiple times in my log with different values, but only the value cve:CVE-2014-0160 is indexed; same problem for the tag: and vuln: keys.
I've seen some similar problems and solutions with ruby, but are there any solutions with just kv? Or should I change my log format around a bit?
I can't test it right now, but notice that you have both "contentms" (a string) and "contentms.ver", which probably looks to Elasticsearch like a nested field ([contentms][ver]); but "contentms" was already defined as a string, so you can't nest beneath it.
After the csv filter, try renaming "contentms" to "[contentms][name]", which would then be a peer to "[contentms][ver]".
You'd need to start with a new index to create this new mapping.
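A minimal sketch of that rename, assuming the kv output leaves a top-level contentms field (field names taken from the log sample above):

filter {
  mutate {
    # move the string value out of the way so [contentms][ver] can nest under it
    rename => { "contentms" => "[contentms][name]" }
  }
}

The rename happens before the event reaches Elasticsearch, so the new index can map contentms as an object from the start.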

Rename a dynamic field with logstash

I want to use the Logstash ruby plugin to rename a dynamic field.
Specifically, I want to strip out the dots so I can feed it to Elasticsearch, and remove some extra static text.
A field name like this
foo.bar.Host11.x.y.uptime => 37
would become
host11_uptime => 37
or, even better, split into separate fields
host => 11
uptime => 37
Here's some general code to loop across fields in ruby. You could then split the field name to create the one (or more) that you wanted.
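A minimal sketch of such a loop, assuming field names follow the foo.bar.HostNN.x.y.metric pattern from the question (the regex and the resulting field names are illustrative):

filter {
  ruby {
    code => "
      # event.to_hash returns a copy, so it is safe to modify the event while iterating
      event.to_hash.each do |name, value|
        # match names like foo.bar.Host11.x.y.uptime
        if name =~ /\.Host(\d+)\..*\.(\w+)$/
          event.set('host', $1)    # host => '11'
          event.set($2, value)     # uptime => 37
          event.remove(name)       # drop the original dotted field
        end
      end
    "
  }
}

Note that host comes out as the string '11'; a mutate filter with convert could cast it to an integer if needed.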
