Using Redis key as Elasticsearch index name - elasticsearch

I am attempting to use a Logstash indexer to move data from Redis to Elasticsearch.
On the shipper's output end, I give a 'key' to one set of logs in the Logstash redis output:
redis
{
host => "server"
port => "7379"
data_type => "list"
key => "aruba"
}
On the input end, I read each key:
input
{
redis
{
host => "localhost"
port => "6379"
data_type => "list"
type => "redis-input"
key => "logstash"
codec => "json"
threads => 32
batch_count => 1000
#timeout => 10
}
redis
{
host => "localhost"
port => "6379"
data_type => "list"
type => "redis-input"
key => "aruba"
codec => "json"
threads => 32
batch_count => 1000
#timeout => 10
}
}
and I am attempting to use the key in Logstash to write to an index, i.e.
aruba-2017.24.10 or something like that, but the output always goes to the default logstash index. I tried
if[redis.key] == "xyz"
{
elasticsearch {index => "xyz-%{time}"}
}
or if[key] == "xyz" ....
also tried
elasticsearch
{
index => "%{key}-%{time}"
}
and elasticsearch{index => "%{redis.key}-%{time}"}
etc. None of it seems to work.

While #sysadmin1138 is right that accessing nested fields is done via [field][subfield] rather than [field.subfield], your problem is that you are trying to access data that is not in your log event.
While in Redis, your log events have a key associated with them, but this key is not part of the event itself; it is merely used to access the events in Redis. When Logstash fetches an event from Redis, it uses that "key" to specify which events it wants, but the key never makes it into Elasticsearch.
To see this for yourself, try running Logstash with stdout { codec => "rubydebug" } as an output plugin; it will pretty-print your whole log event, letting you see exactly what data is included.
To your rescue comes the add_field parameter that exists for every logstash plugin. You can add to your input:
redis
{
host => "localhost"
port => "6379"
data_type => "list"
type => "redis-input"
key => "aruba"
codec => "json"
threads => 32
batch_count => 1000
add_field => {
"[redis][key]" => "aruba"
}
}
Then changing your conditional to use [redis][key] will make your configuration work.
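For reference, the matching output side could then look like this (a sketch: it assumes the add_field setting above, and uses the standard %{+YYYY.MM.dd} date reference in place of the %{time} field from the question):
output
{
if [redis][key] == "aruba"
{
elasticsearch { index => "aruba-%{+YYYY.MM.dd}" }
}
}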
(Cheers to RELK stacks)

This is likely due to an incorrect definition of the name in your conditional.
if [redis.key] == "xyz" {
elasticsearch {index => "xyz-%{time}"}
}
Should be:
if [redis][key] == "xyz" {
elasticsearch {index => "xyz-%{time}"}
}

Related

can logstash send data simultaneously to multiple locations along with elasticsearch

Normally, in ELK, Logstash parses the data and sends it to Elasticsearch.
I want to know whether Logstash can send the same data to different locations at the same time, in real time.
If it is possible, please let me know how to do it.
Create several outputs that match on type and send to the different hosts.
output {
if [type] == "syslog" {
elasticsearch {
hosts => ["127.0.0.1:9200"]
index => "logstash-%{+YYYY.MM.dd}"
codec => "plain"
workers => 1
manage_template => true
template_name => "logstash"
template_overwrite => false
flush_size => 100
idle_flush_time => 1
}
}
}
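To actually fan the same events out to multiple destinations, you can simply list several outputs inside the conditional; every matching event is sent to each of them. A sketch (the second host address and the file path are placeholders):
output {
if [type] == "syslog" {
elasticsearch {
hosts => ["127.0.0.1:9200"]
index => "logstash-%{+YYYY.MM.dd}"
}
elasticsearch {
hosts => ["192.168.0.10:9200"]
index => "logstash-%{+YYYY.MM.dd}"
}
file {
path => "/var/log/logstash/syslog-%{+YYYY.MM.dd}.log"
}
}
}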

Logstash with elasticsearch output: how to write to different indices?

I hope to find here an answer to my question that I am struggling with since yesterday:
I'm configuring Logstash 1.5.6 with a rabbitMQ input and an elasticsearch output.
Messages are published to RabbitMQ in bulk format; my Logstash consumes them and writes them all to the Elasticsearch default index logstash-YYYY.MM.DD with this configuration:
input {
rabbitmq {
host => 'xxx'
user => 'xxx'
password => 'xxx'
queue => 'xxx'
exchange => "xxx"
key => 'xxx'
durable => true
}
}
output {
elasticsearch {
host => "xxx"
cluster => "elasticsearch"
flush_size =>10
bind_port => 9300
codec => "json"
protocol => "http"
}
stdout { codec => rubydebug }
}
Now what I'm trying to do is send the messages to different elasticsearch indices.
The messages coming from the amqp input already have the index and type parameters (bulk format).
So after reading the documentation:
https://www.elastic.co/guide/en/logstash/1.5/event-dependent-configuration.html#logstash-config-field-references
I try doing that
input {
rabbitmq {
host => 'xxx'
user => 'xxx'
password => 'xxx'
queue => 'xxx'
exchange => "xxx"
key => 'xxx'
durable => true
}
}
output {
elasticsearch {
host => "xxx"
cluster => "elasticsearch"
flush_size =>10
bind_port => 9300
codec => "json"
protocol => "http"
index => "%{[index][_index]}"
}
stdout { codec => rubydebug }
}
But what Logstash does is create a literal index named %{[index][_index]} and put all the docs there, instead of reading the _index parameter and sending the docs to that index!
I also tried the following:
index => %{index}
index => '%{index}'
index => "%{index}"
But none seems to work.
Any help ?
To sum up, the main question here is: if the RabbitMQ messages have this format:
{"index":{"_index":"indexA","_type":"typeX","_ttl":2592000000}}
{"#timestamp":"2017-03-09T15:55:54.520Z","#version":"1","#fields":{DATA}}
How do I tell Logstash to send the output to the index named "indexA" with type "typeX"?
If your messages in RabbitMQ are already in bulk format, then you don't need to use the elasticsearch output; a simple http output hitting the _bulk endpoint will do the trick:
output {
http {
http_method => "post"
url => "http://localhost:9200/_bulk"
format => "message"
message => "%{message}"
}
}
So everyone, with the help of Val, the solution was:
As he said, since the RabbitMQ messages were already in bulk format, there is no need to use the elasticsearch output; the http output to the _bulk API does the job (silly me).
So I replaced the output with this:
output {
http {
http_method => "post"
url => "http://172.16.1.81:9200/_bulk"
format => "message"
message => "%{message}"
}
stdout { codec => json_lines }
}
But it still wasn't working. I was using Logstash 1.5.6 and after upgrading to Logstash 2.0.0 (https://www.elastic.co/guide/en/logstash/2.4/_upgrading_using_package_managers.html) it worked with the same configuration.
There it is :)
If you store plain JSON messages in RabbitMQ, then this problem can be solved differently:
Use index and type as fields in the JSON message and assign those values in the Elasticsearch output plugin.
elasticsearch {
index => "%{index}"          # INDEX from JSON body received from Kafka Producer
document_type => "%{type}"   # TYPE from JSON body
}
With this approach, each message can have its own index and type.
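Put together, a minimal sketch of that last approach (host and queue names are placeholders; it assumes each message is a single JSON document carrying its own index and type fields, not bulk format):
input {
rabbitmq {
host => "xxx"
queue => "xxx"
codec => "json"
}
}
output {
elasticsearch {
hosts => ["localhost:9200"]
index => "%{index}"
document_type => "%{type}"
}
}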

Logstash agent not indexing anymore

I have a Logstash instance running as a service that reads from Redis and outputs to Elasticsearch. I just noticed there was nothing new in Elasticsearch for the last few days, but the Redis lists were increasing.
The Logstash log was filled with two errors repeated over thousands of lines:
:message=>"Got error to send bulk of actions"
:message=>"Failed to flush outgoing items"
The reason being:
{"error":"IllegalArgumentException[Malformed action/metadata line [107], expected a simple value for field [_type] but found [START_ARRAY]]","status":500},
Additionally, trying to stop the service failed repeatedly; I had to kill it. Restarting it emptied the Redis lists and imported everything to Elasticsearch. It seems to work ok now.
But I have no idea how to prevent that from happening again. The mentioned type field is set as a string for each input directive, so I don't understand how it could have become an array.
What am I missing?
I'm using Elasticsearch 1.7.1 and Logstash 1.5.3. The logstash.conf file looks like this:
input {
redis {
host => "127.0.0.1"
port => 6381
data_type => "list"
key => "b2c-web"
type => "b2c-web"
codec => "json"
}
redis {
host => "127.0.0.1"
port => 6381
data_type => "list"
key => "b2c-web-staging"
type => "b2c-web-staging"
codec => "json"
}
/* other redis inputs, only key/type variations */
}
filter {
grok {
match => ["msg", "Cache hit %{WORD:query} in %{NUMBER:hit_total:int}ms. Network: %{NUMBER:hit_network:int} ms. Deserialization %{NUMBER:hit_deserial:int}"]
add_tag => ["cache_hit"]
tag_on_failure => []
}
/* other groks, not related to type field */
}
output {
elasticsearch {
host => "[IP]"
port => "9200"
protocol=> "http"
cluster => "logstash-prod-2"
}
}
According to your log message:
{"error":"IllegalArgumentException[Malformed action/metadata line [107], expected a simple value for field [_type] but found [START_ARRAY]]","status":500},
It seems you're trying to index a document with a type field that's an array instead of a string.
I can't help you more without the rest of the logstash.conf file, but check the following to make sure:
If you use add_field to change type, you actually turn type into an array with multiple values, which is what Elasticsearch is complaining about.
You can use mutate join to convert arrays to strings: api link
filter {
mutate {
join => { "fieldname" => "," }
}
}
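Applied to the type field from the error above, that would look like this (a sketch; it assumes type really did end up as an array in your events):
filter {
mutate {
join => { "type" => "," }
}
}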

separate indexes on logstash

Currently I have a Logstash configuration that pushes data to Redis, and an ELK server that pulls the data using the default index 'logstash'.
I've added another shipper and have successfully managed to move its data using the default index as well. My goal is to move and restore that data into a separate index; what is the best way to achieve it?
This is my current configuration using the default index:
shipper output:
output {
redis {
host => "my-host"
data_type => "list"
key => "logstash"
codec => json
}
}
elk input:
input {
redis {
host => "my-host"
data_type => "list"
key => "logstash"
codec => json
}
}
Try setting the index field in the output. Give it the name you want and run that, and a separate index will be created for it.
input {
redis {
host => "my-host"
data_type => "list"
key => "logstash"
codec => json
}
}
output {
stdout { codec => rubydebug }
elasticsearch {
index => "redis-logs"
cluster => "cluster name"
}
}
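To keep the two shippers in separate indices, one common pattern (a sketch; the type values are assumptions, set via type => on each shipper's input) is to route on the indexer with a conditional:
output {
if [type] == "shipper2" {
elasticsearch { index => "shipper2-logs" }
} else {
elasticsearch { index => "redis-logs" }
}
}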

logstash not fast enough output to elasticsearch

I use Logstash as an indexer to output data from Redis into Elasticsearch, but it is not fast enough because of the large volume of data. I then tried multiple workers, but that led to various problems. Are there better ways to speed up the output? Thanks.
Here is my configuration:
input {
redis {
host => "10.240.93.41"
data_type => "list"
key => "tcpflow"
}
}
filter {
csv {
columns => ["ts","node","clientip","vip","rtt","city","isp","asn","province"]
separator => "|"
}
}
output {
elasticsearch {
index => "tcpflow-%{+YYYY.MM.dd}"
index_type => "tcpflow"
cluster => "elasticsearch"
host => ["10.240.93.41", "10.240.129.32"]
#protocol => "node"
#protocol => "http"
#port => 8200
protocol => "transport"
manage_template => false
workers => 30
}
}
The redis{} input in logstash defaults to reading one document at a time. Try setting batch_count to something in the 100-1000 range, depending on the size of your documents.
Having multiple worker threads ("-w") is ideal, unless you're using the multiline{} filter which is not thread-safe.
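Following that advice, the redis input could be tuned like this (a sketch; the batch_count value of 500 is just a starting point to experiment with, within the suggested 100-1000 range):
input {
redis {
host => "10.240.93.41"
data_type => "list"
key => "tcpflow"
batch_count => 500
}
}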
