Increase RabbitMQ throughput across multiple machines

I'm using Logstash and Elasticsearch to build a log system, with RabbitMQ queuing log messages between two Logstash instances.
The message path looks like this:
source log -> logstash -> rabbitMQ -> logstash(parse) -> elasticsearch
But I've found that no matter how many machines I add to RabbitMQ, only one machine's resources are used to process messages.
Some articles say that clustering only adds reliability and redundancy to prevent message loss.
What I want is to increase the whole RabbitMQ cluster's throughput (in and out) by adding more machines.
How do I configure my RabbitMQ cluster to increase its throughput?
Any comments are appreciated.
--
PS. I need to add more information here.
In my tests, the system's limit is about 7000 messages/s in and 1700 messages/s out on a 4-machine cluster, with HA disabled and just one exchange bound to one queue, and that queue bound to a single node. I guess one queue bound to one node is the throughput bottleneck. It's difficult to change the routing key now, so we have just one routing key and want to distribute messages across different nodes to increase whole-system throughput.
Below is my logstash-indexer config:
rabbitmq {
    codec => "json"
    auto_delete => false
    durable => true
    exchange => "logstash-exchange"
    key => "logstash-routingKey"
    queue => "logstash-queue"
    host => "VIP-of-rabbitMQ"
    user => "guest"
    password => "guest"
    passive => false
    exclusive => false
    threads => 4
    prefetch_count => 512
}

You need to add more queues. I guess you are using only one queue, so in other words you are tied to a single Erlang process: each queue is served by exactly one process on one node. What you want is to use multiple queues.
Here is a quick and dirty example of adding some logic to Logstash so that messages are routed to different queues:
filter {
    # check if path contains source subfolder
    if "foo" in [path] {
        mutate { add_field => [ "source", "foo" ] }
    }
    else if "bar" in [path] {
        mutate { add_field => [ "source", "bar" ] }
    }
    else {
        mutate { add_field => [ "source", "unknown" ] }
    }
}
Then in your output:
rabbitmq {
    debug => true
    durable => true
    exchange_type => "direct"
    host => "your_rabbit_ip"
    key => "%{source}"
    exchange => "my_exchange"
    persistent => true
    port => 5672
    user => "logstash"
    password => "xxxxxxxxxx"
    workers => 12
}
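On the consuming side, each logstash-indexer can then bind its own queue with one of those routing keys, so the queues (and the Erlang processes behind them) can live on different nodes. A minimal sketch, assuming hypothetical queue names like logstash-foo (run one input per routing key):
input {
    rabbitmq {
        host => "your_rabbit_ip"
        exchange => "my_exchange"
        key => "foo"                # one of the routing keys set by the filter above
        queue => "logstash-foo"     # hypothetical name; declare one queue per routing key
        durable => true
        prefetch_count => 256
    }
}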
Updated:
Take a look at the repositories that this guy has:
https://github.com/simonmacmullen
I guess you will be interested in this one:
https://github.com/simonmacmullen/random-exchange
This exchange type is for load-balancing among consumers.
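If you install that plugin, the publishing Logstash can point its output at an exchange of that type. A minimal sketch, assuming the plugin registers the x-random exchange type and using a hypothetical exchange name:
output {
    rabbitmq {
        host => "your_rabbit_ip"
        exchange => "logstash-random"   # hypothetical exchange name
        exchange_type => "x-random"     # type provided by the random-exchange plugin
        key => "ignored"                # a random exchange picks a bound queue at random, ignoring the key
        durable => true
    }
}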

Related

Trying to consume data from RabbitMQ to Elasticsearch

I am trying to consume data from RabbitMQ into Elasticsearch, following this tutorial: https://akintola-lonlon.medium.com/logstash-5-easy-steps-to-consume-data-from-rabbitmq-to-elasticsearch-8fb0eb6e9196
This is my RabbitMQ queue.
This is my logstash-rabbitmq.conf:
input {
    rabbitmq {
        id => "rabbitmq_logs"
        host => "localhost"
        port => 5672
        vhost => "/"
        queue => "system_logs"
        ack => false
    }
}
filter {
    grok {
        match => { "message" => "%{COMBINEDAPACHELOG}" }
    }
    date {
        match => ["timestamp", "dd/MM/yyyy:HH:mm:ss Z"]
    }
}
output {
    elasticsearch {
        hosts => ["127.0.0.1:9200"]
        index => "logstash_rabbit_mq_hello"
    }
    stdout {
        codec => rubydebug
    }
}
Then when I run sudo bin/logstash -f conf.d/logstash-rabbitmq.conf I get the following error:
[2022-10-17T10:08:43,917][WARN ][logstash.inputs.rabbitmq ][main][rabbitmq_logs] Error while setting up connection, will retry {:exception=>MarchHare::PreconditionFailed, :message=>"PRECONDITION_FAILED - inequivalent arg 'durable' for queue 'system_logs' in vhost '/': received 'false' but current is 'true'", :cause=>#<Java::JavaIo::IOException: >}
[2022-10-17T10:08:43,917][WARN ][logstash.inputs.rabbitmq ][main][rabbitmq_logs] RabbitMQ connection was closed {:url=>"amqp://guest:XXXXXX@localhost:5672/", :automatic_recovery=>true, :cause=>#<Java::ComRabbitmqClient::ShutdownSignalException: clean connection shutdown; protocol method: #method<connection.close>(reply-code=200, reply-text=OK, class-id=0, method-id=0)>}
[2022-10-17T10:08:44,929][INFO ][logstash.inputs.rabbitmq ][main][rabbitmq_logs] Connected to RabbitMQ {:url=>"amqp://guest:XXXXXX@localhost:5672/"}
How can I fix this problem?
I am a beginner with RabbitMQ and the ELK stack, please help me.
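The PRECONDITION_FAILED error says the queue system_logs already exists on the broker as durable, while this input declares it as non-durable. A minimal sketch of the likely fix, setting durable => true in the input so the declaration matches the existing queue:
input {
    rabbitmq {
        id => "rabbitmq_logs"
        host => "localhost"
        port => 5672
        vhost => "/"
        queue => "system_logs"
        durable => true   # must match how the queue was originally declared on the broker
        ack => false
    }
}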

Consume messages from rabbitmq in logstash

I'm trying to read logs from a RabbitMQ queue with Logstash and then pass them to Elasticsearch, but with no success. Here is my Logstash config:
input {
    rabbitmq {
        host => "localhost"
        port => 15672
        heartbeat => 30
        durable => true
        exchange => "logging_queue"
        exchange_type => "logging_queue"
    }
}
output {
    elasticsearch {
        hosts => "localhost:9200"
    }
    stdout {}
}
But no index is created, so of course I can't see any logs in Kibana.
There are some messages in the queue.
I think the correct (default) port is 5672, as 15672 is the port of the web admin console.
input {
    rabbitmq {
        host => "localhost"
        port => 5672          # <-- change this: 5672 is the AMQP port
        heartbeat => 30
        durable => true
        exchange => "logging_queue"
        exchange_type => "logging_queue"
    }
}
output {
    elasticsearch {
        hosts => "localhost:9200"
    }
    stdout {}
}
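Note also that exchange_type should be an AMQP exchange type (direct, fanout, topic, or headers), not the queue or exchange name. A corrected sketch, assuming the exchange is actually a fanout:
input {
    rabbitmq {
        host => "localhost"
        port => 5672
        heartbeat => 30
        durable => true
        exchange => "logging_queue"   # name of the exchange to bind to
        exchange_type => "fanout"     # must be a real AMQP exchange type
    }
}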

Logstash with queue enabled not ack http input events after jdbc input runs

I'm using Logstash with the persistent queue enabled.
I've set up Logstash to inject rows from MySQL via the jdbc input plugin on startup. Currently this injects 1846 rows.
I also have an http input.
When I take down ES and restart Logstash, as expected I get errors:
logstash_1  WARN  logstash.outputs.amazones - Failed to flush outgoing items {:outgoing_count=>1, :exception=>"Faraday::ConnectionFailed", :backtrace=>nil}
logstash_1  ERROR logstash.outputs.amazones - Attempted to send a bulk request to Elasticsearch configured at …
I'd expect that in this situation, hitting the Logstash http input would still result in an ack.
Actually the http POST does not return and the injection is not seen in logstash logs.
My logstash.yml looks like:
queue.type: persisted
queue.checkpoint.writes: 1
queue.max_bytes: 8gb
queue.page_capacity: 512mb
And my logstash.conf:
input {
    jdbc {
        jdbc_connection_string => "${JDBC_CONNECTION_STRING}"
        jdbc_user => "${JDBC_USER}"
        jdbc_password => "${JDBC_PASSWORD}"
        jdbc_driver_library => "/home/logstash/jdbc_driver.jar"
        jdbc_driver_class => "com.mysql.jdbc.Driver"
        statement => "
            SELECT blah blah blah
        "
    }
    http {
        host => "0.0.0.0"
        port => 31311
    }
}
output {
    stdout { codec => json_lines }
    amazon_es {
        hosts => ["${AWS_ES_HOST}"]
        region => "${AWS_REGION}"
        aws_access_key_id => '${AWS_ACCESS_KEY_ID}'
        aws_secret_access_key => '${AWS_SECRET_ACCESS_KEY}'
        index => "${INDEX_NAME}"
        document_type => "data"
        document_id => "%{documentid}"
    }
}
Is it possible for the http input to still ack events? I'm pretty sure the queue cannot be full, as each event payload is only about 850 characters.
Thanks in advance.

Create a virtual server under a specific subnet

I am using SoftLayer's Ruby API, and I am trying to create a virtual server under a specific subnet in a VLAN, but I couldn't find a way of doing it.
At the moment I am using the following creation hash:
creation_hash = {
    'complexType' => 'SoftLayer_Virtual_Guest',
    'hostname' => XXX,
    'domain' => XXXX,
    'datacenter' => { 'name' => @datacenter },
    'startCpus' => sl_machine_type(@params['instance_type'])['cpu'],
    'maxMemory' => sl_machine_type(@params['instance_type'])['memory'],
    'hourlyBillingFlag' => true,
    'blockDeviceTemplateGroup' => { 'globalIdentifier' => @params['image_id'] },
    'localDiskFlag' => false,
    'dedicatedAccountHostOnlyFlag' => true,
    'primaryBackendNetworkComponent' => {
        'networkVlan' => {
            'id' => @private_vlan['id']
        }
    },
    'networkComponents' => [{ 'maxSpeed' => 1000 }],
    'privateNetworkOnlyFlag' => true
}
So when I choose a VLAN, it picks a random subnet under that VLAN.
How can I specify a subnet? I didn't find this option in the documentation.
Unfortunately it is not possible to specify which subnet a server should be provisioned into.
The provisioning system will choose an IP from the VLAN's primary subnet.
The wording is a bit vague in this article, but it states that IPs are automatically assigned. I will get it updated to state that it is not possible to request a specific block of IPs for the primary.
Adding an IP to the server from a secondary subnet directly after provisioning could be a possible workaround. This could be done with a post-install script or a config manager (Salt, Chef, etc.) if automation is needed. It would also allow you to control specifically which IPs are used for each server.

Where does Logstash/Elasticsearch write data?

In the input section of my Logstash config file, I have created a configuration for reading a RabbitMQ queue. Using the RabbitMQ console, I can see Logstash drain the queue. However, I have no idea what Logstash is doing with the messages. Is it discarding them? Is it forwarding them to Elasticsearch?
Here's the Logstash configuration:
input {
    rabbitmq {
        host => "192.168.34.151"
        exchange => "an_exchange"
        key => "a_key"
        queue => "a_queue"
    }
}
output {
    elasticsearch {
        embedded => true
        protocol => "http"
    }
}
edit - removed the bogus comma from the config.
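With that output, Logstash forwards every drained message to the embedded Elasticsearch instance over HTTP, into daily logstash-YYYY.MM.dd indices by default. To see exactly what happens to each message, a stdout output can be added alongside; a minimal sketch:
output {
    elasticsearch {
        embedded => true
        protocol => "http"
    }
    # print each event to the console so you can watch what leaves the pipeline
    stdout { codec => rubydebug }
}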
