HornetQ throughput limited to 4000 TPS without persistence - performance

I am using HornetQ embedded in JBoss 6.1 application server.
My applications (a client app producing messages, and a JBoss app consuming them) cannot handle more than 4000 TPS on a server whose CPU is still 60% idle. I tried removing persistence to check whether I was disk-bound, but it did not improve the throughput.
The problem seems to be on the producer side. At least, while monitoring the queue size, it stays very small, which means the consumers are not the bottleneck.
Should I use several queues to be more efficient? I have already read the HornetQ performance tuning documentation but could not find an explanation for this.
Or maybe it is because I am using AUTO_ACKNOWLEDGE mode? I am running several producer threads, so this should not matter much. The producer JVM never uses more than one CPU core anyway. I even tried running several instances of my producer application, but it does not go any faster.
The network bandwidth is high (1 Gbps) and my messages are very small (< 1 KB).
Also, the producer and consumer applications are running on the same server.
HornetQ is configured in a JBoss cluster of 2 servers.

I was able to solve this by using several queues. Using JProfiler, I could see lock contention on my queue: all producer threads were waiting on the same locks.
I tried with 2 queues and was able to double the throughput, so I am now putting a set of queues in place, as sketched below.
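A minimal sketch of that pattern in plain JMS; the queue names and count are hypothetical:

import javax.jms.JMSException;
import javax.jms.MessageProducer;
import javax.jms.Session;

public class RoundRobinProducer {
    private final Session session;
    private final MessageProducer[] producers;
    private int next = 0;

    public RoundRobinProducer(Session session, int queueCount) throws JMSException {
        this.session = session;
        this.producers = new MessageProducer[queueCount];
        for (int i = 0; i < queueCount; i++) {
            // One producer per queue, e.g. orders.0, orders.1, ... (hypothetical names)
            producers[i] = session.createProducer(session.createQueue("orders." + i));
        }
    }

    // Round-robin over the queues so no single queue lock serializes all sends.
    public void send(String text) throws JMSException {
        producers[next].send(session.createTextMessage(text));
        next = (next + 1) % producers.length;
    }
}

Note that a JMS Session is single-threaded, so with several producer threads each thread needs its own Session and its own copy of this helper.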

You could maybe try 2.3.0. I have removed a few locks on appending messages to the queue, so it may scale up with a single queue on 2.3.0.Final.
(At the time I wrote this, 2.3.0.Final was about to be released; 2.3.0.CR2 did not yet have the change I am talking about.)

Did you increase your client's producer window size?
<connection-factory name="ConnectionFactory">
<producer-window-size>1000</producer-window-size>
...
This parameter limits the number of outstanding messages per client. It applies not to a single producer thread but to all of them combined, so adding threads will not help once that limit is reached. Network round-trip delays make you hit it quite quickly.
see:
http://docs.jboss.org/hornetq/2.2.14.Final/user-manual/en/html/flow-control.html#d0e4005
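If you build the connection factory in code instead of looking it up from JBoss JNDI, the same knob is exposed as a setter. A minimal sketch, assuming the HornetQ 2.2+ client API; the connector and window value are illustrative:

import javax.jms.Connection;
import org.hornetq.api.core.TransportConfiguration;
import org.hornetq.api.jms.HornetQJMSClient;
import org.hornetq.api.jms.JMSFactoryType;
import org.hornetq.core.remoting.impl.netty.NettyConnectorFactory;
import org.hornetq.jms.client.HornetQConnectionFactory;

public class WindowTuning {
    public static void main(String[] args) throws Exception {
        // A larger producer window allows more unconfirmed sends in flight per client.
        HornetQConnectionFactory cf = HornetQJMSClient.createConnectionFactoryWithoutHA(
                JMSFactoryType.CF,
                new TransportConfiguration(NettyConnectorFactory.class.getName()));
        cf.setProducerWindowSize(1024 * 1024); // illustrative value; see the flow-control docs
        Connection connection = cf.createConnection();
        connection.close();
    }
}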

Related

ActiveMQ JMS queues get very full during long performance runs

I'm running ActiveMQ 5.16.3, and I'm testing a Java-based order management application which runs across several JVMs. These JVMs each do work to push orders through a status pipeline. For example, one JVM creates orders, one schedules orders, and so forth. JDBC is used to store data, and JMS is used to pass work between the JVMs. One JVM will read work from the database and put up to 5000 messages into a JMS queue for another JVM to process and do its own work. I am running just one ActiveMQ server with many JMS queues. I have not changed how ActiveMQ stores messages, so the default (which should be KahaDB) is being used. JDBC is used only by the Java application.
We support several JMS vendors other than ActiveMQ (such as WebLogic JMS and IBM MQ). However, only with ActiveMQ do I find that, for longer-running or high-volume tests, the JMS queues start to back up and sometimes hold hundreds of thousands of messages. There is nowhere near that much work in the system, so something else is going on. Via JMX I've confirmed that the ActiveMQ console is showing the numbers correctly. This behavior seems random in that it's not all JVMs doing this (though all JVMs conceptually do the same thing), and if I stop the work generators (so that the JVMs only process what is already in the queue), the queues usually empty out quickly.
For example, if I want to create 10k orders, 10k messages go into the first JMS queue for the create-order JVM to process. This JVM then updates the JDBC database to create the orders and inserts records that the next JVM (in this case, schedule order) picks up. The schedule JVM reads from the database and, based on what work it sees, puts messages into the next JMS queue (only up to 5k, then it waits for the queue to empty before fetching 5k more) for it to process. What I am seeing is that the schedule-order JMS queue fills up far past 10k messages.
My studies have led me to the possibility of uncommitted reads and concurrency issues, but I've come to a dead end. Does anyone have any thoughts?
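One way to sanity-check those numbers outside the console is to query the broker's JMX MBeans directly. A minimal sketch, assuming ActiveMQ's default object-name layout; the JMX URL, broker name, and queue name are placeholders:

import javax.management.JMX;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;
import org.apache.activemq.broker.jmx.QueueViewMBean;

public class QueueDepthCheck {
    public static void main(String[] args) throws Exception {
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://localhost:1099/jmxrmi");
        try (JMXConnector jmxc = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection conn = jmxc.getMBeanServerConnection();
            ObjectName name = new ObjectName(
                    "org.apache.activemq:type=Broker,brokerName=localhost,"
                    + "destinationType=Queue,destinationName=SCHEDULE.ORDERS");
            QueueViewMBean queue = JMX.newMBeanProxy(conn, name, QueueViewMBean.class);
            // Comparing pending size against enqueue/dequeue counters over time
            // shows whether the backlog comes from over-production or slow consumption.
            System.out.printf("size=%d enqueued=%d dequeued=%d%n",
                    queue.getQueueSize(), queue.getEnqueueCount(), queue.getDequeueCount());
        }
    }
}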

ActiveMQ non-persistent delivery mode limitations?

I am using ActiveMQ with the following requirements:
Very fast consumers, as my producers are already very fast
Processing of at least 2K messages per second
No need to process/consume messages again after a server crash or other failure; I can trigger the whole process again
A very modest server configuration: 4 GiB RAM
I have configured ActiveMQ as given below:
Using non-persistent delivery mode (vm://localhost) (http://activemq.apache.org/what-is-the-difference-between-persistent-and-non-persistent-delivery.html)
Using Spring Integration to put/fetch messages into/from the queue/channel
Using max-concurrent-consumers with 10 threads
Assume all other settings are the ActiveMQ and Spring Integration defaults
Problems/Questions
I am not sure how ActiveMQ stores messages in non-persistent delivery mode. Is it possible that my process will fail with out-of-memory errors once the queue size exceeds some limit? I am asking because the whole process is very difficult for me to test, so I need to know the limitations before I trigger it.
If non-persistent delivery mode cannot meet the above requirements, are there any performance tuning tips that would let me achieve them with persistent delivery mode (tcp://)? I have already tested that mode, but the consumers seem very slow there. I have also tried DUPS_OK_ACKNOWLEDGE to speed up my consumers with persistent delivery, but with no luck.
NOTE: I am using the latest ActiveMQ version, 5.14.
I am not sure how ActiveMQ stores messages in non-persistent delivery mode
ActiveMQ stores messages in memory first, and it will also swap them to disk (there is a tmp_storage folder under ActiveMQ's data path).
is it possible that my process will fail with out-of-memory errors once my queue size exceeds some limit
I have never run into out-of-memory errors with ActiveMQ, even with around a million messages.
You can also protect yourself with producer flow control (http://activemq.apache.org/producer-flow-control.html), which can make the producer block when there are too many unconsumed messages.
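If you run the broker embedded (as vm://localhost implies), the memory and temp-store limits that drive flow control can be set in code. A minimal sketch; the limit values here are illustrative assumptions:

import org.apache.activemq.broker.BrokerService;
import org.apache.activemq.broker.region.policy.PolicyEntry;
import org.apache.activemq.broker.region.policy.PolicyMap;

public class FlowControlledBroker {
    public static void main(String[] args) throws Exception {
        BrokerService broker = new BrokerService();
        broker.setPersistent(false); // non-persistent: messages live in memory/temp store

        // Cap heap used for messages; past this, non-persistent messages spill to the
        // temp store, and producers block when that fills up too.
        broker.getSystemUsage().getMemoryUsage().setLimit(512L * 1024 * 1024);
        broker.getSystemUsage().getTempUsage().setLimit(1024L * 1024 * 1024);

        // Explicitly enable producer flow control on all queues.
        PolicyEntry policy = new PolicyEntry();
        policy.setQueue(">");
        policy.setProducerFlowControl(true);
        PolicyMap policyMap = new PolicyMap();
        policyMap.setDefaultEntry(policy);
        broker.setDestinationPolicy(policyMap);

        broker.start();
    }
}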
As for the performance of persistent delivery, I have no good methods either.

Cassandra throttling workload

I've recently been attempting to send a workload of read operations to a 2-node Cassandra cluster (version 2.0.9, with rf=2). My intention was to send reads at a rate higher than the capacity of my backend servers, thereby overwhelming them and producing server-side queuing. To do so, I'm using the DataStax Java driver (CQL version 2) to run my operations asynchronously (in other words, the calling thread doesn't block waiting for a response).
The problem is that I'm unable to reach a high enough sending rate to overload my backend servers; the number of requests I'm sending is somehow being throttled by Cassandra. To confirm this, I've run clients from two different machines simultaneously, and the total number of requests sent per unit time still peaks at the same value. I'm wondering whether there is a mechanism employed by Cassandra to throttle the number of requests being received? Otherwise, what else might be causing this behavior?
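For reference, the async pattern in question looks roughly like this with the DataStax Java driver 2.x; the keyspace, table, and row count are hypothetical:

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.ResultSetFuture;
import com.datastax.driver.core.Session;
import com.google.common.util.concurrent.FutureCallback;
import com.google.common.util.concurrent.Futures;

public class AsyncReadLoad {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect("my_keyspace"); // hypothetical keyspace

        for (int i = 0; i < 100000; i++) {
            // executeAsync returns immediately; the calling thread never blocks.
            ResultSetFuture future = session.executeAsync(
                    "SELECT * FROM my_table WHERE id = " + i); // hypothetical table
            Futures.addCallback(future, new FutureCallback<ResultSet>() {
                public void onSuccess(ResultSet rs) { /* count successes */ }
                public void onFailure(Throwable t) { /* timeouts surface here */ }
            });
        }
    }
}

Note that the driver itself also caps the number of simultaneous in-flight requests per connection, and the connection pool is configurable via PoolingOptions, so the plateau may come from the client side rather than the cluster.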
Each request received by Cassandra is handled by multiple thread pools implementing a staged event-driven architecture, where requests are queued for each stage. You can use nodetool tpstats to inspect the current status of each queue. When too many requests threaten to overwhelm the server, Cassandra sheds load by dropping requests as queues approach their capacity. You'll notice this in the numbers shown in the dropped section of tpstats. If no requests are dropped, all of them will eventually complete, but you may see higher latencies (visible with nodetool cfhistograms) or WriteTimeoutExceptions on the client.
The network bandwidth on the Cassandra side is throttling the number of requests being received.
As far as I know, there is no other mechanism employed by Cassandra to prevent itself from receiving too many requests. Timeout exceptions are the main mechanism Cassandra uses to avoid crashing when it is overloaded.
Yes, Cassandra has multiple ways to throttle incoming requests. The first action on your part would be to find out which mechanism is the culprit. Then you can tune this mechanism to fit your needs.
The first step in finding out where the block occurs would be to connect to JMX with jconsole or similar and look at the queue and block values.
If I were to hazard a guess, check MessagingService for timeouts and dropped messages between nodes. Then check the native transport requests for blocked tasks before the requests even get to the stages.

Spring Integration JMS Outbound adapter transaction control

In order to achieve high-performance production of messages with JMS with transactions enabled, one needs to control the number of messages sent in each transaction: the larger the number, the better the performance.
Is it possible to control transactions in such a way using Spring Integration?
One might suggest using an aggregator, but that defeats the purpose, because I don't want one message containing X smaller messages on the queue; I want X actual messages on my queue.
Thanks!
I'm not aware of your setup, but I'd bump up the concurrent consumers on the source rather than try to tweak the outbound adapter. What kind of data source is pumping in this volume of data? From my experience, the upstream producer usually lags behind the JMS publisher, unless both are JMS/messaging resources, as in the case of a bridge. In that case you will mostly see a significant improvement by bumping up the concurrent consumers, because you are dedicating n threads to receive messages and process them in parallel, and each thread runs in its own "transaction environment".
It's also worth noting that JMS does not specify a transport mechanism; it's up to the broker to choose the transport. If you are using ActiveMQ, you can try experimenting with OpenWire vs. AMQP and see whether you get the desired throughput.
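For reference, the per-transaction batching the question describes is straightforward in the plain JMS API that sits underneath Spring Integration. A minimal sketch; the queue name and batch size are placeholders:

import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.MessageProducer;
import javax.jms.Session;

public class BatchedTransactedSender {
    // Sends batchSize messages per commit: larger batches amortize the commit cost,
    // while each message still lands on the queue individually.
    public static void sendBatched(ConnectionFactory cf, String queueName,
                                   String[] payloads, int batchSize) throws Exception {
        Connection connection = cf.createConnection();
        try {
            Session session = connection.createSession(true, Session.SESSION_TRANSACTED);
            MessageProducer producer = session.createProducer(session.createQueue(queueName));
            int inBatch = 0;
            for (String payload : payloads) {
                producer.send(session.createTextMessage(payload));
                if (++inBatch == batchSize) {
                    session.commit(); // one transaction covering batchSize sends
                    inBatch = 0;
                }
            }
            if (inBatch > 0) {
                session.commit(); // commit the trailing partial batch
            }
        } finally {
            connection.close();
        }
    }
}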

IBM MQ Message Throttling

We are using IBM MQ and we are facing some serious problems controlling its asynchronous delivery to recipients. We have some Java listeners configured; the problem is that we need to control the messages coming towards the listeners, because millions of messages arrive and the server machine does not have the capacity to process that many threads at a time. So is there any way to throttle on the IBM MQ side, for example by configuring a prefetch limit the way ActiveMQ does?
Or is there any other way to achieve this?
Currently we close the connection to IBM MQ when some limit X has been reached on the listener, but that does not seem to be an efficient way.
Please help us solve this issue.
Generally with message queueing technologies like MQ the point of the queue is that the sender is decoupled from the receiver. If you're having trouble with message volumes then the answer is to let them queue up on the receiver queue and process them as best you can, not to throttle the sender.
The obvious answer is to limit the maximum number of threads that your listeners are allowed to take up. I'm assuming you're using some sort of MQ threadpool? What platform are you using that provides unlimited listener threads?
From your description, it almost sounds like you have a process running that, as soon as it detects a message in the queue, reads the message, starts up a new thread, and goes back to look at the queue again. This is the WRONG approach.
You should have a defined number of processing threads running (start with one and scale up as required, within the limits of your server) which read from the queue themselves. Each would open the queue in shared mode and either get-with-wait or do an immediate get with a sleep when it receives MQRC 2033 (no messages in queue), as in the sketch below.
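A minimal sketch of that pattern using the IBM MQ classes for Java; the queue manager and queue names are hypothetical, and error handling is trimmed:

import com.ibm.mq.MQException;
import com.ibm.mq.MQGetMessageOptions;
import com.ibm.mq.MQMessage;
import com.ibm.mq.MQQueue;
import com.ibm.mq.MQQueueManager;
import com.ibm.mq.constants.CMQC;

public class ThrottledConsumer implements Runnable {
    public void run() {
        try {
            MQQueueManager qmgr = new MQQueueManager("QM1"); // hypothetical QM name
            MQQueue queue = qmgr.accessQueue("APP.INPUT",    // hypothetical queue
                    CMQC.MQOO_INPUT_SHARED | CMQC.MQOO_FAIL_IF_QUIESCING);
            MQGetMessageOptions gmo = new MQGetMessageOptions();
            gmo.options = CMQC.MQGMO_WAIT | CMQC.MQGMO_FAIL_IF_QUIESCING;
            gmo.waitInterval = 5000; // get-with-wait: block up to 5 s for a message
            while (true) {
                MQMessage msg = new MQMessage();
                try {
                    queue.get(msg, gmo);
                    process(msg); // handle the message on THIS thread, no thread fan-out
                } catch (MQException e) {
                    // 2033 = no message arrived within the wait interval: just loop
                    if (e.reasonCode != CMQC.MQRC_NO_MSG_AVAILABLE) throw e;
                }
            }
        } catch (MQException e) {
            e.printStackTrace();
        }
    }

    private void process(MQMessage msg) { /* application logic */ }
}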
Hope that helps.
If you are running in an application server environment, then the maxPoolDepth property on the activation spec defines the maximum ServerSessionPool size for the MDB; decreasing this will throttle the number of messages being delivered concurrently.
Of course, if your MDB (or javax.jms.MessageListener in the JSE environment) does nothing but hand the message to something else (or, worse, just spawn an unmanaged Thread and start it) onMessage will spin rapidly and you can still encounter problems. So in that case you need to limit other resources too, e.g. via threadpool configuration.
Closing the connection to the queue manager is never an efficient approach, as the MQCONN/MQDISC cycle is expensive.
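In code, that activation spec property looks roughly like this on an MDB; the destination name is a placeholder, and the exact property names depend on your resource adapter:

import javax.ejb.ActivationConfigProperty;
import javax.ejb.MessageDriven;
import javax.jms.Message;
import javax.jms.MessageListener;

// maxPoolDepth caps the ServerSessionPool, i.e. concurrent deliveries to this MDB.
@MessageDriven(activationConfig = {
    @ActivationConfigProperty(propertyName = "destinationType",
                              propertyValue = "javax.jms.Queue"),
    @ActivationConfigProperty(propertyName = "destination",
                              propertyValue = "APP.INPUT"), // hypothetical queue
    @ActivationConfigProperty(propertyName = "maxPoolDepth",
                              propertyValue = "10")         // at most 10 in flight
})
public class ThrottledMdb implements MessageListener {
    @Override
    public void onMessage(Message message) {
        // Do the real work here; don't hand off to an unmanaged thread.
    }
}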
