Jms connection is timing out in spring batch remote partitoning on the master side - spring

I am trying to implement spring batch remote partitioning using spring integration. I am triggering the master using standalone app and slaves are running on jboss eap 6.1 cluster. I am able to trigger the job and i can see slaves also got triggered. but after some time master jms connection is timing out. can someone shed me a light, how can i configure this timeout settings..
2636699 21/11 16:59:55,580[org.springframework.jms.listener.DefaultMessageListenerContainer#0-378] WARN jms.listener.DefaultMessageListenerContainer.handleListenerSetupFailure - Setup of JMS message listener invoker failed for destination 'queue-screening-replies-partitioning' - trying to recover. Cause: Session is closed
2636699 21/11 16:59:55,580[org.springframework.jms.listener.DefaultMessageListenerContainer#0-378] INFO jms.listener.DefaultMessageListenerContainer.refreshConnectionUntilSuccessful - Successfully refreshed JMS Connection
I am getting this kind of errors..
Thanks in advance..
--M K

Either the network or your broker is timing out the connection.
Consult your broker documentation to enable some kind of heartbeats/timeouts to keep the connection open.
Or, it would be better to reduce your partition sizes (increase the number of partitions) so they complete in a reasonable time.
EDIT:
Also, configure the outbound gateway with a <reply-listener/> and use a fixed, named, reply queue and the gateway will be able to recover from the dropped connection.
But I would still recommend smaller partitions in general.

Related

how to limit number of connections to IBM MQ

I have a Spring Boot based messaging app sending/receiving JMS messages to/from IBM MQ queue manager.
Basically, it uses MQConnectionFactory to organize connection to IBM MQ and a JmsPoolConnectionFactory from messaginghub:pooledjms to enable JMS connection pool, which is removed from MQConnectionFactory in IBM MQ 7.x
The app uses two different appoach to work with JMS. A "correct" one runs a JMSListener to receive messages and then sends a response on each message using JmsTemplate.send(). And there is a second "troubling" approach, where the app sends requests using JmsTemplate.send() and waits for response using JmsTemplate.readByCorrelId() until received or timed out.
I say troubling because this makes JMS sessions last longer if the response is delayed and could easily exhaust IBM MQ connection limit. Unfortunately, I cannot rewrite the app at the moment to the first approach to resolve the issue.
Now I want to restrict the number of connections in the pool. Of course, the delayed requests will fail but IBM MQ connection limit is more important at the moment, so this is kind of appropriate. The problem is that even if I disable the JmsPoolConnectionFactory, it seems that MQConnectionFactory still opens multiple connections to the query manager.
While profiling the app I see multiple threads RvcThread: com.ibm.mq.jmmqi.remote.impl.RemoteTCPConnection#12433875[...] created by JMSCCMasterThreadPool and corresponding connections to the query manager in MQ Explorer. I wonder why there are many of them in spite of the connection pooling is removed from MQConnectionFactory? I suppose it should open and reuse a single connection then but it is not true in my test.
Disabling "troubling" JmsTemplate.readByCorrelId() and leaving only "correct" way in the app removes these multiple connections (and the waiting threads of course).
Replacing JmsPoolConnectionFactory with SingleConnectionFactory has not effect on the issue.
Is there any way to limit those connections? Is it possible to control max threads in the JMSCCMasterThreadPool as a workaround?
Because it affects other applications your MQ admins probably want you to not exhaust the overall Queue Manager's connection limit (MaxChannels and MaxActiveChannels parameters in qm.ini). They can help you by defining an MQ channel exclusively used by your application. By this, they can limit the number of connections of your application with the MAXINST / MAXINSTC channel parameter. You will get an exception when this number is exhausted which is appropriate as you say. Other applications won’t be affected anymore.

Payara 4 JMS - Any way to log when a connection to MQ Server is created or dropped?

We currently have issues after connecting a new MQ Server (IBM MQ v9) to our service.
The current expectation is that we have two issues:
The connection establishment at MQ-Server takes too much time
We assume that on every message send a new physical connection is established instead of using the payara pool
Especially to prove No. 2 - is there any possibility to find out or log every Connection establishment that is done by the Resource Adapter?
You can use the Application activity trace of IBM MQ.
There is also a very nice presentation available to find out more about how to use the Activity Trace.

Can you check to see if an IBM MQ topic is up and available through a Java application before attempting to create a connection?

I would like to add some conditional logic to our Java application code for attempting to create a JMS Topic Connection. I have seen problems in the past stemming from attempting to create a connection when the MQ server had been restarted or was currently down. One improvement I added was to check for the quiescent state, and another was to increase the timer before attempting reconnection to our durable topic queue.
Is there a way to confirm with the MQ server/topic/channel that it is up and running and a connection request can safely be made?
The best way to confirm that a queue manager (and the channel you are using to connect to the queue manager) is up and running is to attempt to connect to it.
If your connection attempt fails, you will get an MQ Reason code telling you exactly why. This is a much better way to confirm than any administrative command, because it also confirms that your application, and it's security context is correct and able to connect to the queue manager. It is completely possible to have an up-and-running queue manager but an application that is not yet correctly configured to use it. So connect from the application and if it works, the queue manager is up-and-running.
Your comment about having an increased timer before attempting to reconnect after a failure is well made. It doesn't help anyone if you hammer the queue manager with lots of repeated and close together connection attempts until it is ready to accept your connection, but still anything that is going to test the availability of the queue manager needs to ultimately connect to it, so very simply, just connect.

ActiveMQ crash after some times

I am using ActiveMQ 5.10.0. I have many consumer that connect through stomp connection to ActiveMQ. I have about 2000 consumer that connect to unique queue. My problem now is that ActiveMQ always crash/hang after a couple of hours and the log not showing any error. It makes me difficult to trace where is the problem. As far as I know, the new consumer can't create new connection and subscription to queue after ActiveMQ crashed. I can't open the web console too. Is there any way to improve the performance of ActiveMQ to handle many connections or anything else for performance tuning?
In order to debug to exact issue, try:
Enable debug logs of active mq
Add TransportListener to active mq connection/connection factory. Logging connection interrupt, resume and exception.
Refer http://activemq.apache.org/maven/apidocs/org/apache/activemq/transport/TransportListener.html
JMX monitoring via jconsole or jvisual vm, thread dump might help in debugging the same.
We faced similar issue in production env. where producers were able to produce but consumers were not consuming after 24-48 hrs of everything running fine. Restarting active mq lead to consumers starting consuming messages. We haven't found the exact cause/fix yet but recently added above debugging steps.

JMS with clustered nodes

I have two clustered managed servers running on Weblogic, and seperate JMS server1 and server2 are running on each managed server. The problem is in application properties file, we only hardcoded and pass JMS server1 JNDI name to the application. So both applications running on each node actually only uses one fixed JMS server, which is not truly distributed and clustered. If JMS server 1 is down, the whole application will be down.
My question is how to let application dynamically find JMS server in above senario? Can you please point me a direction? Thanks!
It's in the Weblogic docs at: http://docs.oracle.com/cd/E14571_01/web.1111/e13738/best_practice.htm#CACDDFJD
Basically you created a comma separated list of servers and the JMS connection logic should be automatically able to handle to case when one of the servers is down:
e.g.
t3://hostA:7001,hostB:7001
When you use a property like jms.jndi.provider.url=t3://hostA:31122,hostA:31124
it tells wls to connect to either hostA:31122 or hostA:31124.
Note your JMS client is connected to only one Host at any given time.
when you shutdown hostA the connection between JMS client and server is cut abruptly resulting in an exception, your code will have to handle this exception gracefully and attempt to connect to WLS again periodically to ensure it connects to hostB.
WLS internally will round robin the request if more than 1 instance of the JMS client is running.
When using MDB as JMS client and deploying it to a cluster and using such a url 1 mdb instance would connect to one host and the other instance would connect to another host. MDB also inherently has the ability to reconnect periodically to the JMS destination.
A easy solution to your problem could be to
1) Set the jms.jndi.provider.url=t3://hostA:31122,hostA:31124
2) Have 2 instance of the JMS client code running, so one will connect to port 31122 and other to 31124
3) Set Forward-Delay on the JMS Queue so that message dont remain in queue without getting consumed for long and get forwarded to the other queue which has an active consumer.
I am updating my progress here instead of adding more comments. I have tested using a standalone JMS client by changing properties file from t3://hostA:7001 to t3://hostA:7001,hostB:7001 for JMS provider. The failover is automatically handled by WLS. No code change. The exception I got above is caused by using wlclient.jar, it is working after it changed to wlfullclient.jar.
I followed this link to generate wlfullclient.jar.
Thanks everyone!

Resources