Spring JMS WebSphere MQ open input count issue

I am using Spring 3.2.8 with JDK 6 and WebSphere MQ 7.5.0.5. My application makes JMS calls through jmsTemplate from a ThreadPool. At first, the "Current queue depth" count kept increasing as I made JMS calls. I tracked every object I start via the ThreadPool and now interrupt or cancel all threads and Future objects, which brought the "Current queue depth" count under control.
Now the problem is that the "Open input count" value grows to nearly the number of requests I send. When I stop my server, this count drops to 0.
With a ThreadPool size of 30, I can send requests and get responses up to a count of about 80. After roughly 80 requests I keep getting Future rejection errors and can no longer receive responses; the remaining calls return null.
Please suggest.

I am using a queue in my application, with a filter on the correlation ID. I read more about it and found that calling jmsTemplate.receiveSelected(queue, filter) has a serious impact on performance. Once I removed the filter, the thread congestion issue was resolved, but I still need filtering.
For now I will apply the filter in a different way, within some limitations of the application, using jmsTemplate.receive instead of receiveSelected.
Update on 14-Sep
All - I found a solution and would like to post it here.
A colleague helped me fix this issue, which was a great help. While debugging, we observed that if cacheConsumers is true, Spring caches consumers based on the combination of
queue + message-selector + session
Calling close() on such a consumer does nothing; it is basically an empty method, which leaves the thread hung/stuck.
After setting cacheConsumers to false, I reverted my code to the original, i.e. jmsTemplate.receiveSelected(destination, messageSelector). Now when I send 100 requests, the count of threads only increased between 5 and 10 across multiple test iterations.
So this property needs to be used carefully.
Hope this helps!
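To make the effect described above easier to see, here is a small toy model in plain Java. It is not Spring's code: the class `ConsumerCacheModel`, the queue name `REPLY.QUEUE` and the counter standing in for "Open input count" are all made up for illustration. It only assumes what was observed above, namely that consumers are cached per queue + selector within a session and that close() on a cached consumer is a no-op.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: mimics the caching behaviour described in the post, not Spring's classes.
class ConsumerCacheModel {
    private final boolean cacheConsumers;
    // One entry per queue + selector combination, all within a single (modelled) session.
    private final Map<String, Object> cachedConsumers = new HashMap<>();
    // Hypothetical stand-in for the "Open input count" shown by MQ.
    private int openConsumers = 0;

    ConsumerCacheModel(boolean cacheConsumers) {
        this.cacheConsumers = cacheConsumers;
    }

    // Models one receiveSelected(queue, selector) call followed by consumer.close().
    void receiveSelected(String queue, String selector) {
        if (cacheConsumers) {
            String key = queue + "|" + selector;
            if (!cachedConsumers.containsKey(key)) {
                cachedConsumers.put(key, new Object());
                openConsumers++; // a new consumer is opened and kept in the cache
            }
            // close() on a cached consumer is modelled as a no-op: nothing is released
        } else {
            openConsumers++; // consumer opened for this call
            openConsumers--; // close() really closes it
        }
    }

    int openConsumers() {
        return openConsumers;
    }

    // Sends `requests` calls, each with its own correlation-ID selector.
    static int simulate(boolean cacheConsumers, int requests) {
        ConsumerCacheModel session = new ConsumerCacheModel(cacheConsumers);
        for (int i = 0; i < requests; i++) {
            session.receiveSelected("REPLY.QUEUE", "JMSCorrelationID = 'ID-" + i + "'");
        }
        return session.openConsumers();
    }

    public static void main(String[] args) {
        System.out.println("open consumers, caching on:  " + simulate(true, 100));
        System.out.println("open consumers, caching off: " + simulate(false, 100));
    }
}
```

Because every request uses a different correlation ID, each selector is a new cache key, so in this model the cached variant never reuses a consumer. In a real application the corresponding switch would be setCacheConsumers(false) on Spring's CachingConnectionFactory.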

First I faced condition that "Current queue depth" count increases as
I hit jms calls. I tracked all objects I am initiating via ThreadPool
and interrupt or cancel all threads/future objects.
I have no idea what you are talking about but you should NOT be using/monitoring the 'current queue depth' value from your application. Bad, bad design. Only MQ monitoring tools should be using it.
Now problem is "Open input count" value increases nearly to the number
of requests I am sending. When I stops my server this count becomes 0.
Bad programming. You are 'opening a queue then putting a message' over and over and over again. How about you put some code to CLOSE the queue or better yet, REUSE the open queue!!!!!!!

Related

Spring JMS Message Listener - DMLC - what is benefit of polling?

I know the DefaultMessageListenerContainer polls by design. And that the receiveTimeout which sets the polling interval defaults to 1 second.
The way I understand it is that the DMLC will issue a get, and waits the 'receiveTimeout' defined interval (1 second) before it times out and issues another get.
From what I have read, we can set this receiveTimeout value to a larger value with NO effect on messages getting picked up from MQ, because the active 'get' will sit on the listener until a message arrives... and if the timeout interval expires, it will just submit another get, which remains active on the queue until a message arrives.
So my question is, what is the benefit of a smaller receiveTimeout interval? If we are always going to process a message when it arrives, why on earth would we want to poll the queue every second?
We are running many large applications, and the polling is simply running the CPU usage/bill through the roof, and I cannot find a justification for this.
Yes - the 1 second receive timeout can be very CPU intensive with a large number of queues.
The general idea for the DefaultMessageListenerContainer was to wait for a bit (1 second seems to be a very short wait period), and then, if you don't get a message, it actually tears everything down and does a full reconnect. This is kind of a poor man's error handling: "If I haven't heard from the broker, assume that something is broken, drop everything and reconnect". If the reconnect were not so expensive, it might not be a bad strategy. The same goes if you have only one queue, or if you are expecting 10 messages a second and do want to reconnect when a second goes by. If you have a reasonable number of destinations, the reconnect traffic can get downright abusive.
For IBM MQ, failures on the JMS connection/session are reliably picked up. You don't have the, "it just sits there not getting any messages for some reason" scenario. So setting the timeout to 10 minutes (whatever) would be fine.
Note that if you are running in a JEE application server, and your JMS connections are managed by the JCA, then that layer is responsible for detecting bad connections and you don't have to worry about it up in the application layer.
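As a rough illustration of the point about the blocking get, here is a sketch in plain Java where a BlockingQueue stands in for the MQ queue. This is not DMLC or MQ code; the class `ReceiveTimeoutDemo`, the method name and the timing values are assumptions chosen for the example. It counts how many receive calls are issued before a message that arrives after a short delay is picked up.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Toy model: a BlockingQueue plays the role of the broker queue.
class ReceiveTimeoutDemo {

    // Polls until a message arrives and returns how many receive calls were needed.
    static int receiveCallsUntilMessage(long receiveTimeoutMs, long messageDelayMs) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();

        // Producer puts a single message after the given delay.
        Thread producer = new Thread(() -> {
            try {
                Thread.sleep(messageDelayMs);
                queue.put("hello");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();

        int calls = 0;
        String message = null;
        try {
            while (message == null) {
                calls++;
                // Waits up to the timeout, but returns as soon as a message is available.
                message = queue.poll(receiveTimeoutMs, TimeUnit.MILLISECONDS);
            }
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a message", e);
        }
        return calls;
    }

    public static void main(String[] args) {
        System.out.println("calls with 50 ms timeout: " + receiveCallsUntilMessage(50, 300));
        System.out.println("calls with 10 s timeout:  " + receiveCallsUntilMessage(10_000, 300));
    }
}
```

With the long timeout the message is still picked up as soon as it arrives; the only difference is how many times the loop wakes up with nothing to do. On the real container the corresponding setting is setReceiveTimeout (in milliseconds) on DefaultMessageListenerContainer.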
With Camel and for SpringBoot GitHub might be useful.

Apache.NMS.AMQP setting prefetch size

I am using Apache.NMS.AMQP (v1.8.0) to connect to an AWS-managed ActiveMQ (v5.15.9) broker, but I am having problems setting the prefetch size for the connection/consumer/destination (I couldn't set a custom value on any of them).
While digging through the source code I found that the default prefetch value (DEFAULT_CREDITS) is set to 200.
To test this behavior I wrote a test that enqueues 220 messages on a single queue, creates two consumers and then consumes the messages. The result was, as expected, that the first consumer dequeued 200 messages and the second dequeued 20 messages.
After that I looked for a way to set the prefetch size on my consumer, without success, since the LinkCredit property of the ConsumerInfo class is read-only.
Since my use case requires one prefetch size per connection, that is what I tried next, following this documentation page, but again without success. These are the URLs that I tried:
amqps://*my-broker-url*.amazonaws.com:5671?transport.prefetch=50
amqps://*my-broker-url*.amazonaws.com:5671?jms.prefetchPolicy.all=50
amqps://*my-broker-url*.amazonaws.com:5671?jms.prefetchPolicy.queuePrefetch=50
After trying everything stated above I've tried setting prefetch for my queue destinations by appending
?consumer.prefetchSize=50 to queue name. Resulting in something like this:
queue://TestQueue?consumer.prefetchSize=50
All of the above attempts resulted in an effective prefetch size of 200 (determined through the test described above).
Is there any way to set a custom prefetch size per connection when connecting to the broker using AMQP? Is there any way to configure the broker other than through the query parameters stated on this documentation page?
From a quick read of the code there isn't any means of setting the consumer link credit in the NMS.AMQP client implementation at this time. This seems to be something that would need to be added as it currently seems to just use a default value to supply to the AmqpNetLite receiver link for auto refill.
Their issue reporter is here.

Kafka Producer is not retrying after Timeout

Intermittently (once or twice a month) I am seeing the error
org.apache.kafka.common.errors.TimeoutException: Expiring 1 record(s) for cart-topic-0: 5109 ms has passed since batch creation plus linger time
in my logs, and as a result the corresponding message was not processed by the Kafka Producer.
All the brokers are up and available, so I'm not sure why this error occurs. The load is not even high during this period.
I have set the retries property to 10 in the Producer configs, but the message was still not retried. Is there anything else I need to add for the Kafka send method? I have gone through similar issues that were raised, but there is no proper conclusion for this error.
Can someone please help me fix this?
From the KIP proposal, which has since been addressed:
We propose adding a new timeout delivery.timeout.ms. The window of enforcement includes batching in the accumulator, retries, and the inflight segments of the batch. With this config, the user has a guaranteed upper bound on when a record will either get sent, fail or expire from the point when send returns. In other words we no longer overload request.timeout.ms to act as a weak proxy for accumulator timeout and instead introduce an explicit timeout that users can rely on without exposing any internals of the producer such as the accumulator.
So basically, with this change you can additionally configure a delivery timeout, alongside retries, for every async send you execute.
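A minimal sketch of how these settings might be put together, using only java.util.Properties so it runs without the Kafka client on the classpath. The class `ProducerTimeoutConfig`, the broker address and the numeric values are placeholders I made up; the config keys themselves are the producer's documented names. The check mirrors the producer's own rule that delivery.timeout.ms should be at least linger.ms + request.timeout.ms.

```java
import java.util.Properties;

// Hypothetical helper: assembles producer settings and checks the timeout relationship.
class ProducerTimeoutConfig {

    static Properties build(int lingerMs, int requestTimeoutMs, int deliveryTimeoutMs) {
        // The overall delivery window has to cover batching time plus one request attempt.
        if (deliveryTimeoutMs < lingerMs + requestTimeoutMs) {
            throw new IllegalArgumentException(
                    "delivery.timeout.ms must be >= linger.ms + request.timeout.ms");
        }
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // placeholder address
        props.setProperty("retries", "10");                       // value from the question
        props.setProperty("linger.ms", Integer.toString(lingerMs));
        props.setProperty("request.timeout.ms", Integer.toString(requestTimeoutMs));
        // Upper bound on the time between send() returning and success or failure being reported.
        props.setProperty("delivery.timeout.ms", Integer.toString(deliveryTimeoutMs));
        return props;
    }

    public static void main(String[] args) {
        Properties props = build(5, 30_000, 120_000);
        props.stringPropertyNames().stream().sorted()
                .forEach(name -> System.out.println(name + "=" + props.getProperty(name)));
    }
}
```

The resulting Properties object is the kind of thing you would pass to the KafkaProducer constructor, after adding serializers and the rest of your settings. Retries only happen while the delivery window is still open, so a batch that expires before it is ever sent will not be retried no matter how high retries is set.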
I had an issue where retries were not being obeyed, but in my particular case it was because we were calling the get() method on send for synchronous behaviour. We hadn't realized it would impact retries.
While investigating the issue through various paths, I came across the definition of the sorts of errors that are retriable:
https://kafka.apache.org/11/javadoc/org/apache/kafka/common/errors/RetriableException.html
What had confused me is that timeout was listed as a retriable one.
I would normally have suggested you would want to look into if the delivery of your batches was taking too long and messages in your buffer were expiring due to increased volume, but you've mentioned that the volume isn't particularly high.
Did you determine whether increasing request.timeout.ms has an impact on the frequency of occurrence? That might treat the symptom rather than the cause, though.

WMQ Message Logging Scenarios v 7.5

In the following scenario, I'm curious what happens with regard to the active log files of the queue manager in question. Linear logging is being used.
What activity (if any) do the MQ active logs see when a queue with, say, 100 messages is being read with a JMS context attribute (looking for a specific message) that, for the sake of argument, it will NEVER find? All messages are read off the queue, but none are committed, so the messages are never actually deleted from the queue. Does the queue manager nevertheless record such GET operations so that it can recover these "in flight" conditions should the queue manager crash while this is happening?
We recently experienced a situation where the dequeue rate from a specific queue was in the 4000-4500 msg/min range while the queue depth was only about 2500. We surmise that more than one process thread was trying to read a JMS message by context (somewhat like a correlation ID, I suppose) without any hope of ever finding the message it was looking for, probably due to a misconfiguration. During this time, the active logs filled up rapidly. Is it likely that the wanton dequeue rates we saw were the culprit?
MQ writes log records for persistent messages during get and put. More details can be found here:
http://pic.dhe.ibm.com/infocenter/wmqv7/v7r5/topic/com.ibm.mq.dev.doc/q023070_.htm

Detect dropped messages in ZeroMQ Queues

Since it does not seem to be possible to query/inspect the underlying ZeroMQ queues/buffers sockets to see how much they are utilized, is there some way to detect when a message is dropped due to full buffers in a Publisher socket when sent/queued?
For example, if the publisher queue is full, the zmq_send operation will simply drop the message.
Basically, what I want to achieve is a way to detect situations where the queues are getting stressed and/or full to be able to (later on) tune the solution to work better. One alternative way would be to add a sequence number to each message and do a simple calculation in the subscriber but I can never be sure that a message was lost due to full buffers in the publisher.
There is an example for this in the ZeroMQ Guide (which you should read and digest if you want to use 0MQ happily): http://zguide.zeromq.org/page:all#Slow-Subscriber-Detection-Suicidal-Snail-Pattern
The mechanism is as you answered yourself, to add a sequence number in the message, and allow the subscriber to detect gaps and take appropriate action. For most pubsub scenarios you can raise the default HWM, which is 1,000, to something much higher; it depends on your average message size.
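The sequence-number idea can be sketched on the subscriber side roughly like this. It is plain Java with no ZeroMQ binding involved; the class `SequenceGapDetector` and its methods are hypothetical names, and it assumes the publisher stamps each message with a counter that increases by one.

```java
// Hypothetical subscriber-side helper: tracks sequence numbers and counts the gaps.
class SequenceGapDetector {
    private long expected = -1; // -1 means no message has been seen yet
    private long missed = 0;

    // Records a received sequence number and returns how many messages were skipped before it.
    long onMessage(long sequence) {
        long gap = 0;
        if (expected >= 0 && sequence > expected) {
            gap = sequence - expected;
            missed += gap;
        }
        // A lower-than-expected number (e.g. publisher restart) just resets tracking.
        expected = sequence + 1;
        return gap;
    }

    long missed() {
        return missed;
    }

    // Convenience for trying it out: feeds a series of sequence numbers, returns total missed.
    static long countMissed(long... sequences) {
        SequenceGapDetector detector = new SequenceGapDetector();
        for (long sequence : sequences) {
            detector.onMessage(sequence);
        }
        return detector.missed();
    }

    public static void main(String[] args) {
        // Messages 3, 4, 7 and 8 never arrive in this made-up stream.
        System.out.println("missed: " + countMissed(0, 1, 2, 5, 6, 9));
    }
}
```

As noted in the question, this only tells the subscriber that something was lost, not where along the path it was dropped.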
I know this is an old post but here is what I did when recently facing the same issue.
I opted to use a DEALER/ROUTER and set the ZMQ_SNDHWM option to 1. I also provided the timeout parameter on each zmq_send(). The timeout could be anything between 10 ms and 3 seconds, depending on your scenario (a local or remote send).
If the message is not sent within the timeout or the send-buffer is full the zmq_send() will return false. That enabled me to set up a retry queue in front of zmq. I know it's not a perfect solution but for me it worked just fine. What puzzles me though is the meaning of true/false returned by the DEALER-socket zmq_send(). I have not been able to find the answer to that question. Whether it indicates that the message has been buffered or that the message has been delivered to the ROUTER has eluded me. In my case I got the results needed anyway.
Just for the record this was done using netmq but I guess it applies to ZeroMQ as well.
I do agree with james though. ZeroMQ (and netmq) should at least provide a way to inspect the queue (and get the messages out) and also a way to tell the various sockets not to drop messages. The best option would be to send messages that were not delivered in a timely fashion, according to the configured options, to some sort of dead-letter queue. The dead-letter queue could then be handled separately.

Resources