Decreased consume rate on RabbitMQ server - performance

We are running a single-server RabbitMQ (3.7) in production, with around 500 mobile applications connected as producers (over MQTT) and around 10 server applications as consumers. Those 500 publishers push messages mostly into one queue and, less often, into another one.
Recently we had an issue with spikes of backlogged messages in all our queues: the backlog went from 1 to 1000 messages. The spike was caused by a drop in the consume rate.
I tried to find out what happened and how to eliminate the spikes. The suggestions I found were to limit the queue length or reduce the number of connections, but we can't limit anything; we have to perform better. I looked at RabbitMQ's memory and CPU usage, and at the same metrics for the consumers, and everything looked fine: RabbitMQ was running at around 50% of total CPU load, and similarly for memory. The consumers don't seem to be the bottleneck either, because the consume rate went even higher after the queue length grew.
I have a couple of questions:
Is RabbitMQ designed for such a large number of consumers?
I read that each queue is single-threaded. Is it possible that RabbitMQ simply can't handle 500 producers on one queue, so throughput drops?
What else can I look at to tackle the cause of the lower consume rate? The number of threads in RabbitMQ?
What do you recommend for measuring or benchmarking the performance of a RabbitMQ server?

Related

Confluent Kafka Go client memory leak

My service consumes messages from one Kafka topic. While the consumer is idle, blocked waiting for messages, I see a continuous, linear increase in the pod's memory. Go's pprof shows that Go memory consumption is constant at around 40 MB, while at the same time the pod metrics show more than 100 MB consumed.
This leads me to the conclusion that the memory is being consumed by the C library librdkafka, as described here: https://zendesk.engineering/hunting-down-a-c-memory-leak-in-a-go-program-2d08b24b617d
The solution to the librdkafka memory consumption in that post was to consume the OffsetCommitResponse events that librdkafka produces. Here is the relevant quote:
It turned out that librdkafka was generating an event every time it received an OffsetCommitResponse from the Kafka broker (which, with our auto-commit interval set to 5 seconds, was pretty often), and placing it in a queue for our app to handle. However, our application was not actually handling events from that queue, so the size of that queue grew without bound.
Does anyone know how to consume these events in Go? Unfortunately the post above didn't spell out the solution.
I solved this issue by counting the number of consumed messages in my service. When the count reaches a configured value (100,000 in my case), I simply close and recreate the Kafka consumer and producer.
This solution is not elegant and doesn't solve the original issue, but hey, it stabilized my production. Now I have a flat memory-consumption curve.
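For reference, here is a minimal Go sketch of that workaround, assuming the confluent-kafka-go client with a Poll-based loop (the broker address, topic, group id, and the maxMessages threshold are all placeholders). Handling the kafka.OffsetsCommitted events in the same loop is, as far as I understand it, how the event queue the quoted post talks about gets drained:

```go
package main

import (
	"log"

	"github.com/confluentinc/confluent-kafka-go/kafka"
)

// maxMessages is the recreate threshold from the workaround above (illustrative).
const maxMessages = 100000

func newConsumer() (*kafka.Consumer, error) {
	c, err := kafka.NewConsumer(&kafka.ConfigMap{
		"bootstrap.servers": "localhost:9092", // placeholder broker
		"group.id":          "my-service",     // placeholder group
		"auto.offset.reset": "earliest",
	})
	if err != nil {
		return nil, err
	}
	return c, c.SubscribeTopics([]string{"my-topic"}, nil) // placeholder topic
}

func main() {
	c, err := newConsumer()
	if err != nil {
		log.Fatal(err)
	}
	consumed := 0
	for {
		switch e := c.Poll(100).(type) {
		case *kafka.Message:
			log.Printf("got %d bytes", len(e.Value)) // ...process the message...
			consumed++
		case kafka.OffsetsCommitted:
			// Handling these keeps librdkafka's event queue from growing.
			log.Printf("committed: %v", e.Offsets)
		case kafka.Error:
			log.Printf("kafka error: %v", e)
		}
		// The workaround: periodically tear down and rebuild the consumer
		// so whatever librdkafka has accumulated is released.
		if consumed >= maxMessages {
			c.Close()
			if c, err = newConsumer(); err != nil {
				log.Fatal(err)
			}
			consumed = 0
		}
	}
}
```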

JMS Priority Messages Causing Starvation of Lower Priority Messages

I have a queue that is loaded with high-priority JMS messages throughout the day; I want to get them out the door quickly. The queue is also loaded periodically with large batches of lower-priority messages. The problem I see on busy days is that there are always enough high-priority messages at the front of the queue that none of the lower-priority messages get selected until that volume drops off. Often they will sit on the queue until the middle of the night. The app is distributed over a number of servers, but the CPUs are not even breathing hard; JMS seems to be the choke point.
My hunch is to implement some sort of aging algorithm that increases the priority of messages that have been on the queue for a very long time, but of course that is what middleware is supposed to do for me. I can't imagine that the JMS provider (IBM WebSphere MQ) or the application server (TIBCO BusinessWorks) doesn't have some facility to cope with this. So before I go write some code, I thought I would ask: is there any way to get either of these technologies to help me out with this problem?
The BusinessWorks activity that is reading the queue is a JMS SOAP Event Source, but I could turn it into a JMS Queue Receiver activity or whatever.
All thoughts on how to solve this are welcome :-) TIA
That's like tying one hand behind your back and then complaining that you cannot swim properly. D'oh! First off, whose bright idea was it to mix messages? Just because you can do something does not mean you should.
The app is distributed over a number of servers, but the CPUs are not even breathing hard; JMS seems to be the choke point.
Well then, the solution is easy. Put the high-priority messages into queue "A" (the existing queue) and the low-priority messages into a new queue "B". Next, start up another instance of your JMS application to read messages off queue "B".
Also, JMS is probably not the choke point. More likely it is what the application does with the message data after the JMS layer picks up the message (i.e. the backend work) that is taking the time.
Finally, how many instances of your JMS application are running against the existing queue? If you are only running one instance, why? If you have lots of CPU capacity, run 10 instances of your JMS application and do some true parallel processing of messages.
If you really want to keep your messages mixed on the same queue and have the high-priority messages processed first, yet your message volume is such that you sometimes cannot work through it all until the middle of the night, then you quite simply do not have enough processing applications. MQ is a parallel processing system; it is designed to allow many applications to put to or get from a queue at once. Make use of this by running more of your getting applications at the same time. They will work through your high-priority messages quicker and then get back to processing the lower-priority ones.
From your description it's clear that you want the high-priority messages to be processed first. In that case the lower-priority messages will have to wait.
MQ will not increase the priority of messages that have been sitting in the queue for a long time. How would it know that it has to change a property of a message? :) You would need to develop an application to do that; a toy sketch of the aging idea follows.
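As a rough illustration only (plain Go, no MQ API; the step size and base priorities are invented for the example), the aging computation itself is trivial; the hard part is the plumbing to get each message back off the queue and re-put it with its new priority:

```go
package main

import (
	"fmt"
	"time"
)

// effectivePriority bumps a message's base priority by one level for every
// ageStep it has spent on the queue, capped at 9 (the WebSphere MQ maximum).
func effectivePriority(base int, enqueued time.Time, ageStep time.Duration) int {
	const maxPriority = 9
	p := base + int(time.Since(enqueued)/ageStep)
	if p > maxPriority {
		p = maxPriority
	}
	return p
}

func main() {
	enqueued := time.Now().Add(-3 * time.Hour)
	// A priority-2 message that has waited 3 hours, aged one level per hour:
	fmt.Println(effectivePriority(2, enqueued, time.Hour)) // prints 5
}
```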
Segregating messages based on priority, for example putting high-priority messages on one queue and lower-priority messages on another, could be one option to look at.
A second option would be to change the queue's delivery sequence (MSGDLVSQ) to FIFO. This makes messages get delivered to consumers in the order they arrived on the queue. But note that this ignores message priority entirely: if a lower-priority message is followed by a higher-priority one, the higher-priority message will wait until the lower-priority message has been delivered.
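For reference, that change is a single MQSC alteration on the local queue (queue name hypothetical):

```
ALTER QLOCAL('APP.INBOUND.QUEUE') MSGDLVSQ(FIFO)
```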

HornetQ low throughput when max-size-bytes reached

I have a simple configuration for testing: a fast C++ producer sending ~60-byte messages via STOMP to a topic, a slow consumer, and address-full-policy set to DROP.
The queue grows rapidly, receiving several thousand messages per second, until it reaches my max-size-bytes, which amounts to about 300,000 messages. HornetQ starts dropping messages as expected, but from then on it accepts only 3-4 messages per second from the producer. What would cause that? If it's dropping messages, shouldn't it be able to accept them at full speed from the producer?
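For context, the setup described corresponds to an address-settings entry along these lines in hornetq-configuration.xml (the match pattern and the size, picked so roughly 300,000 ~60-byte messages fit, are illustrative):

```xml
<address-settings>
   <address-setting match="jms.topic.my-topic">
      <max-size-bytes>20000000</max-size-bytes>
      <address-full-policy>DROP</address-full-policy>
   </address-setting>
</address-settings>
```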

ActiveMQ load balancing to achieve high throughput

Currently my ActiveMQ configuration (non-persistent messaging) allows me to achieve 2000 msgs/sec. There are four queues and four consumers consuming the messages, and only one ActiveMQ broker in this configuration. I would like to achieve a higher throughput of about 5000 msgs/sec by adding brokers. I'm pretty clueless about how to achieve this without splitting individual queues onto individual ActiveMQ instances. What topologies support higher throughput than a single instance without splitting the queues among instances?
Adding a network of brokers might help, provided you have a decent number of consumers and a decent number of producers connecting to different brokers.
If you have a single producer or a single consumer, all traffic will still go over one of the brokers, making it the bottleneck in any case. So the actual layout of the applications using the AMQ brokers matters.
You will also need to check what the bottleneck of your physical machines is. Is it I/O? CPU? Memory usage/heap size? Even link speed? Use OS tools together with VisualVM to track this down. Then you will at least know what kind of server you need next.
In any case, some semi-manual load balancing is always possible across several nodes, whether you are using a network of brokers or not. Just make sure messages are routed through particular brokers depending on their content or some other attribute. If you cannot distinguish between message types in any logical way, you can do things like extracting some integer from the message (be it the client IP, yesterday's temperature in Celsius, or whatever) and taking it modulo the number of brokers, then routing the message to the broker you selected (see the sketch below). Round robin is also an option. There is almost always a way to distribute the load in a logical way among several brokers.
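That modulo routing is only a few lines in any language; here is a sketch in Go (the broker URLs and keys are made up, and the actual send is left out):

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// pickBroker maps a message key (client IP, account ID, etc.) to one of the
// broker URLs, so related messages always go through the same broker.
func pickBroker(key string, brokers []string) string {
	h := fnv.New32a()
	h.Write([]byte(key))
	return brokers[int(h.Sum32())%len(brokers)]
}

func main() {
	brokers := []string{ // illustrative broker URLs
		"tcp://amq-1:61616",
		"tcp://amq-2:61616",
	}
	for _, clientIP := range []string{"10.0.0.7", "10.0.0.8", "10.0.0.7"} {
		fmt.Println(clientIP, "->", pickBroker(clientIP, brokers))
	}
}
```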

Are there any tools to optimize the number of consumer and producer threads on a JMS queue?

I'm working on an application that is distributed over two JBoss instances and that produces/consumes JMS messages on several JMS queues.
When we configured the application we had to determine which threading model to use, in particular the number of producing and consuming threads per queue. We did this in a rather ad-hoc fashion, but after reading the most recent columns by Herb Sutter in Dr. Dobb's (in particular this one) I would like to size our thread pools in a more rigorous manner.
Are there any methods/tools to measure the throughput of JMS queues (in particular JBoss Messaging queues) as a function of the number of producing/consuming threads?
This is not really about a specific tool, but it may be helpful.
Consumers:
Not sure what your inner architecture is, but let's assume it's an MDB reading in messages. I assert that your only requirement here for rigorous thread-count sizing is to choose a maximum cap. If your MDB uses resources from a finite supplier, like a JDBC connection pool, take as the maximum cap the highest number of concurrent instances of that resource you can tolerate taking. If the MDB's queue is remote, you probably want to treat remote connections (or, technically, JMS sessions) as a finite resource. If the MDB has less finite requirements (and the queue is local), your maximum cap is determined by the number of threads, the memory used, and/or the flat-out CPU consumed by the working threads. The reasoning here is that the JBoss MDB container will simply keep allocating more MDB instances (and therefore threads) until the queue is empty or the maximum cap is reached. The only reason I can think of to really agonize over the minimum would be if the container's elapsed time or overhead to create new instances is above your tolerance, and those operations are usually pretty small potatoes.
Producers
A general axiom of messaging is that producers nearly always outperform consumers. You would think this is pretty arbitrary, but it is a pattern I see recurring all the time, even in widely different messaging scenarios. Anyway, it's tough to say how the threading should work for the producer without knowing a bit about the application, but: are you basically capable of increasing the number of producer threads, and with it the number of messages generated, indefinitely and proportionally, or do you have some cap beyond which additional threads simply do not generate more messages? I would guess it is the latter, since most useful work has some limited data or calculation supplier. As I see it, the two drivers here are ordering and persistence.
First off, if you have strict message ordering, where messages must be processed First Produced, First Processed (FPFP), then you're in a bit of a bind, because you almost have to drop down to single-threaded throughput unless you can devise some form of logical message demarcation (e.g. a client number, where any given client's messages are always sent to the same queue, but you may have multiple queues each serviced by one thread, so each client is effectively FPFP).
Ordering aside, persistence is the next consideration: if you have reliable and extensive message persistence (or a very high tolerance for message loss), just let the producer threads go to town. The messages will queue up reliably and eventually the consumers will, hopefully, catch up. However, if your persisted message count or simple queue depth can give you the willies when it gets too high, here is where a tool might come in useful. If your producer thread count can be modified dynamically (as it can in many Java thread-pool implementations), then you could sample the queue depths and raise or lower the producer thread count according to the queue-depth ranges you define, optionally to the point where, if the consumers basically stall, so do the producers. I do not know of a specific tool that does this, but between two JBoss servers it is fairly simple to whip up; see the sketch below. Picking your mapping from queue depth to producer thread count will be trickier.
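Here is a rough sketch of that sampling loop, in Go rather than a Java thread pool, with a plain counter standing in for a real queue-depth query (e.g. JMX against JBoss); every number in it is made up:

```go
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

// queueDepth stands in for a real depth query against the broker.
var queueDepth int64

func produce(stop <-chan struct{}) {
	for {
		select {
		case <-stop:
			return
		default:
			atomic.AddInt64(&queueDepth, 1) // "send" one message
			time.Sleep(5 * time.Millisecond)
		}
	}
}

func consume() {
	for {
		if atomic.LoadInt64(&queueDepth) > 0 {
			atomic.AddInt64(&queueDepth, -1) // deliberately slow consumer
		}
		time.Sleep(20 * time.Millisecond)
	}
}

func main() {
	const maxProducers = 8
	const depthPerStep = 100 // shed one producer per 100 queued messages
	var stops []chan struct{}
	go consume()

	for range time.Tick(time.Second) {
		depth := int(atomic.LoadInt64(&queueDepth))
		target := maxProducers - depth/depthPerStep
		if target < 0 {
			target = 0 // consumers have stalled: stall the producers too
		}
		for len(stops) < target { // scale up
			stop := make(chan struct{})
			stops = append(stops, stop)
			go produce(stop)
		}
		for len(stops) > target { // scale down
			close(stops[len(stops)-1])
			stops = stops[:len(stops)-1]
		}
		fmt.Printf("depth=%d producers=%d\n", depth, len(stops))
	}
}
```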
Having said all that, I am going to actually read the article you linked to.....
I've got the perfect thing for you: IBM provides a free command-line tool called perfharness.
It's aimed at benchmarking JMS providers, i.e. measuring the throughput of queues (single or multiple) given different numbers of producing or consuming threads.
Some features:
Send and consume messages at a fixed rate (msg/s) or at maximum rate possible on the queue
Use a specific number of threads
Use either JMS or native MQ
Can use data either generated randomly or taken from a file
Generates statistics telling you exactly how fast your queue is performing
The only downside is that it's not super intuitive, given the number of options it supports. And IBM hasn't open-sourced it, which is a shame. Still, it sounds perfect for your purposes.
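From memory, an invocation against WebSphere MQ looks roughly like this; all the connection details are placeholders, and the flags should be double-checked against the tool's own help output:

```
java -cp perfharness.jar JMSPerfHarness -pc WebSphereMQ \
     -jh mqhost -jp 1414 -jc SYSTEM.DEF.SVRCONN -jb QM1 \
     -d TEST.QUEUE -tc jms.r11.Sender -nt 4 -rl 60 -ss 5
```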
