Having issues creating conditional outputs with logstash using metadata fields - elasticsearch

I want to send winlogbeat data to a separate index from my main index. I have configured winlogbeat to send its data to my Logstash server, and I can confirm that I have received the data.
This is what I do currently:
output {
  if [@metadata][beat] == "winlogbeat" {
    elasticsearch {
      hosts => ["10.229.1.12:9200", "10.229.1.13:9200"]
      index => "%{[@metadata][beat]}-%{+YYYY-MM-dd}"
      user => logstash_internal
      password => password
      stdout { codec => rubydebug }
    }
    else {
      elasticsearch {
        hosts => ["10.229.1.12:9200", "10.229.1.13:9200"]
        index => "logstash-%{stuff}-%{+YYYY-MM-dd}"
        user => logstash_internal
        password => password
      }
    }
  }
}
However, I cannot start Logstash using this configuration. If I remove the if statements and only use one elasticsearch output, the one which handles regular Logstash data, it works.
What am I doing wrong here?

There are problems with the brackets in your configuration. To fix your code, please see below:
output {
  if [@metadata][beat] == "winlogbeat" {
    elasticsearch {
      hosts => ["10.229.1.12:9200", "10.229.1.13:9200"]
      index => "%{[@metadata][beat]}-%{+YYYY-MM-dd}"
      user => logstash_internal
      password => password
    }
    stdout { codec => rubydebug }
  } else {
    elasticsearch {
      hosts => ["10.229.1.12:9200", "10.229.1.13:9200"]
      index => "logstash-%{stuff}-%{+YYYY-MM-dd}"
      user => logstash_internal
      password => password
    }
  }
}
I hope this sorts your issue.
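A side note (not part of the original answer): fields under [@metadata] are not included in the event that gets sent to Elasticsearch, which is what makes them convenient for routing like this. If you want to verify the metadata while debugging, the rubydebug codec can be told to print it; a minimal sketch:
output {
  # metadata => true makes rubydebug print the [@metadata] fields as well
  stdout { codec => rubydebug { metadata => true } }
}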

Related

Logstash aggregate fields

I am trying to configure Logstash to aggregate similar syslog messages based on the message field within a specific time span.
To make my case clear, this is an example of what I would like to do.
Example: I have these junk syslog messages coming through my Logstash:
timestamp   message
13:54:24    hello
13:54:35    hello
What I would like to do is have a condition that checks whether the messages are the same and occur within a specific time span (for example 10 minutes); if so, I would like to aggregate them into one row and increase the count.
The output I am expecting to see is as follows:
timestamp   message   count
13:54:35    hello     2
I know there is an aggregate filter, but I was wondering whether this aggregation can be done based on a specific time range.
If anyone can help me I would be extremely grateful; I am new to Logstash, and my server is receiving tons of junk syslog messages that I would like to reduce.
So far I did some cleaning with this configuration:
input {
  syslog {
    port => 514
  }
}
filter {
  prune {
    whitelist_names => ["timestamp","message","newfield"]
  }
  mutate {
    add_field => { "newfield" => "%{@timestamp}%{message}" }
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "logstash_index"
  }
  stdout {
    codec => rubydebug
  }
}
Now I just need to do the aggregation.
Thank you so much for your help guys
EDIT:
Following the documentation, I put in place this configuration:
input {
  syslog {
    port => 514
  }
}
filter {
  prune {
    whitelist_names => ["timestamp","message","newfield"]
  }
  mutate {
    add_field => { "newfield" => "%{@timestamp}%{message}" }
  }
  if [message] =~ "MESSAGE FROM" {
    aggregate {
      task_id => "%{message}"
      code => "map['message'] ||= 0; map['message'] += 1;"
      push_map_as_event_on_timeout => true
      timeout_task_id_field => "message"
      timeout => 60
      inactivity_timeout => 50
      timeout_tags => ['_aggregatetimeout']
      timeout_code => "event.set('count_message', event.get('message') > 1)"
    }
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "logstash_index"
  }
  stdout {
    codec => rubydebug
  }
}
I don't get any errors, but the output is not what I am expecting.
The actual output is that it creates a tags field (good) containing an array with _aggregatetimeout and _aggregateexception:
{
       "message" => "<88>MESSAGE FROM\r\n",
          "tags" => [
        [0] "_aggregatetimeout",
        [1] "_aggregateexception"
    ],
    "@timestamp" => 2021-07-23T12:10:45.646Z,
      "@version" => "1"
}
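For what it's worth, the _aggregateexception tag usually means the timeout_code raised an error; here event.get('message') returns a string, so comparing it with > 1 fails. A minimal sketch of an aggregate block that keeps the count in its own field instead (the 600-second window and the count field name are assumptions, not from the original post):
filter {
  if [message] =~ "MESSAGE FROM" {
    aggregate {
      task_id => "%{message}"
      # count duplicates into a dedicated field instead of overwriting 'message'
      code => "map['count'] ||= 0; map['count'] += 1;"
      push_map_as_event_on_timeout => true
      timeout_task_id_field => "message"
      timeout => 600   # assumed 10-minute window
      timeout_tags => ['_aggregatetimeout']
    }
  }
}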

Read a CSV at the Logstash level and filter on the basis of the extracted data

I am using Metricbeat to get process-level data and push it to Elasticsearch using Logstash.
Now, the aim is to categorize the processes into 2 tags, i.e. the running process is either a browser or something else.
I am able to do that statically using this block of code:
input {
  beats {
    port => 5044
  }
}
filter {
  if [process][name] == "firefox.exe" or [process][name] == "chrome.exe" {
    mutate {
      add_field => { "process.type" => "browsers" }
      convert => {
        "process.type" => "string"
      }
    }
  }
  else {
    mutate {
      add_field => { "process.type" => "other" }
    }
  }
}
output {
  elasticsearch {
    hosts => "localhost:9200"
    # manage_template => false
    index => "metricbeatlogstash"
  }
}
But when I try to make that if condition dynamic by reading the process list from a CSV, I am not getting any valid results in Kibana, nor an error at the Logstash level.
The CSV config file code is as follows:
input {
  beats {
    port => 5044
  }
  file {
    path => "filePath"
    start_position => "beginning"
    sincedb_path => "NULL"
  }
}
filter {
  csv {
    separator => ","
    columns => ["processList","IT"]
  }
  if [process][name] in [processList] {
    mutate {
      add_field => { "process.type" => "browsers" }
      convert => {
        "process.type" => "string"
      }
    }
  }
  else {
    mutate {
      add_field => { "process.type" => "other" }
    }
  }
}
output {
  elasticsearch {
    hosts => "localhost:9200"
    # manage_template => false
    index => "metricbeatlogstash2"
  }
}
What you are trying to do does not work that way in Logstash; the events in a Logstash pipeline are independent of each other.
The events received by your beats input have no knowledge of the events received by your file input, so you can't use fields from different events in a conditional.
To do what you want you can use the translate filter with the following config.
translate {
  field => "[process][name]"
  destination => "[process][type]"
  dictionary_path => "process.csv"
  fallback => "others"
  refresh_interval => 300
}
This filter will check the value of the field [process][name] against a dictionary loaded into memory from the file process.csv. The dictionary is a .csv file with two columns: the first is the name of the browser process and the second is always browser.
chrome.exe,browser
firefox.exe,browser
If the filter gets a match, it will populate the field [process][type] (not process.type) with the value from the second column, in this case always browser. If there is no match, it will populate the field [process][type] with the value of the fallback config, in this case others. It will also reload the content of the process.csv file every 300 seconds (5 minutes).
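For context, a minimal sketch of how this might sit in the full pipeline; the dictionary path is an assumption, and the file input plus csv filter from the original config are not needed with this approach:
input {
  beats {
    port => 5044
  }
}
filter {
  translate {
    field => "[process][name]"
    destination => "[process][type]"
    dictionary_path => "/etc/logstash/process.csv"   # assumed location of the two-column dictionary
    fallback => "others"
    refresh_interval => 300
  }
}
output {
  elasticsearch {
    hosts => "localhost:9200"
    index => "metricbeatlogstash"
  }
}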

Kinesis input stream into Logstash

I am currently evaluating Logstash for our data ingestion needs. One of the use cases is to read data from an AWS Kinesis stream. I have tried installing the logstash-input-kinesis plugin. When I run it, I do not see Logstash processing any events from the stream. My Logstash works fine with other types of inputs (tcp). There are no errors in the debug logs; it just behaves as if there is nothing to process. My config file is:
input {
  kinesis {
    kinesis_stream_name => "GwsElasticPoc"
    application_name => "logstash"
    type => "kinesis"
  }
  tcp {
    port => 10000
    type => tcp
  }
}
filter {
  if [type] == "kinesis" {
    json {
      source => "message"
    }
  }
  if [type] == "tcp" {
    grok {
      match => { "message" => "Hello, %{WORD:name}" }
    }
  }
}
output {
  if [type] == "kinesis" {
    elasticsearch {
      hosts => "http://localhost:9200"
      user => "elastic"
      password => "changeme"
      index => elasticpoc
    }
  }
  if [type] == "tcp" {
    elasticsearch {
      hosts => "http://localhost:9200"
      user => "elastic"
      password => "changeme"
      index => elkpoc
    }
  }
}
I have not tried the Logstash way, but if you are running on AWS, there is Kinesis Firehose to Elasticsearch ingestion available, as documented at http://docs.aws.amazon.com/firehose/latest/dev/basic-create.html#console-to-es
You can see if that would work as an alternative to Logstash.
We need to provide AWS credentials for accessing the AWS services for this integration to work.
You can find the details here: https://github.com/logstash-plugins/logstash-input-kinesis#authentication
This plugin also requires access to AWS DynamoDB as a 'checkpointing' database.
You need to use 'application_name' to specify the DynamoDB table name if you have multiple streams.
https://github.com/logstash-plugins/logstash-input-kinesis
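As an illustration only, a minimal sketch of the kinesis input with those pieces in place; the region and application name are assumptions, and the AWS credentials are expected to come from the default credential chain (environment variables, instance profile, etc.):
input {
  kinesis {
    kinesis_stream_name => "GwsElasticPoc"
    application_name => "logstash-gws-poc"   # also used as the DynamoDB checkpoint table name
    region => "us-east-1"                    # assumed; use the region where the stream lives
    type => "kinesis"
  }
}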

separate indexes on logstash

Currently I have a Logstash configuration that pushes data to Redis, and an Elasticsearch server that pulls the data using the default index 'logstash'.
I've added another shipper and I've successfully managed to move the data using the default index as well. My goal is to move and restore that data into a separate index. What is the best way to achieve that?
This is my current configuration using the default index:
shipper output:
output {
  redis {
    host => "my-host"
    data_type => "list"
    key => "logstash"
    codec => json
  }
}
elk input:
input {
  redis {
    host => "my-host"
    data_type => "list"
    key => "logstash"
    codec => json
  }
}
Try setting the index field in the output. Give it the name you want and then run that, so a separate index will be created for it.
input {
  redis {
    host => "my-host"
    data_type => "list"
    key => "logstash"
    codec => json
  }
}
output {
  stdout { codec => rubydebug }
  elasticsearch {
    index => "redis-logs"
    cluster => "cluster name"
  }
}
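Note that the cluster option only exists in older versions of the elasticsearch output; newer releases talk to Elasticsearch over HTTP and take hosts instead. A roughly equivalent sketch for a newer version (the daily index name is just an example):
output {
  stdout { codec => rubydebug }
  elasticsearch {
    hosts => ["localhost:9200"]            # replaces the old cluster setting
    index => "redis-logs-%{+YYYY.MM.dd}"   # example: one index per day
  }
}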

Logstash configuration condition

I am new to Logstash and Elasticsearch.
I have a NodeJS app where I am sending logs through Winston:Redis. I have different types of logs, like requests, system, etc., and I want these logs to go into separate index_types inside Elasticsearch.
I am sending keys such as "web:production:request" and "web:production:system", and I'm sending JSON objects.
My configuration is:
NodeJS (Winston Redis client) -> Redis -> Logstash -> Elasticsearch
It's working well, except for the index_types.
I have 1 Redis client (stream/subscribe) and I want to filter these logs depending on the key value into different index_types in the Elasticsearch output.
I tried this config:
input {
  redis {
    host => "127.0.0.1"
    data_type => "pattern_channel"
    key => "web:production:*"
    codec => json
  }
}
filter {
  if [key] == "web:production:request" {
    alter {
      add_field => { "index_type" => "request" }
    }
  }
  if [key] == "web:production:system" {
    alter {
      add_field => { "index_type" => "system" }
    }
  }
}
output {
  elasticsearch {
    index => "web-production-%{+YYYY.MM.dd}"
    index_type => "%{index_type}"
    # THIS IS NOT WORKING
    protocol => "http"
  }
}
So my questions are:
How do I do conditionals right?
How would you proceed if you want to send data to different indexes depending on conditions?
Can't I have a condition inside a plugin, e.g. grok { if [key] == "1" {} }?
Suggestion for a workaround:
output {
  if [index_type] == "request" {
    elasticsearch {
      index => "web-production-request%{+YYYY.MM.dd}"
      protocol => "http"
    }
  }
  if [index_type] == "system" {
    elasticsearch {
      index => "web-production-system%{+YYYY.MM.dd}"
      protocol => "http"
    }
  }
}
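To complete the picture (a sketch, not from the original answer): the index_type field used above can be set in the filter section, with the conditionals wrapped around plugin blocks rather than placed inside them; the built-in mutate filter is used here instead of alter, which is a separate plugin:
filter {
  if [key] == "web:production:request" {
    mutate { add_field => { "index_type" => "request" } }
  } else if [key] == "web:production:system" {
    mutate { add_field => { "index_type" => "system" } }
  }
}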
