Recently I started hitting the following problem, which results in messages not being delivered:
"PollThread" prio=10 tid=0x00007f0a2cf86000 nid=0x76b8 in Object.wait() [0x00007f09eb6bf000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:503)
at com.ibm.mq.jmqi.remote.api.RemoteHconn.checkUsable(RemoteHconn.java:2121)
- locked <0x000000048f040a10> (a com.ibm.mq.jmqi.remote.api.RemoteHconn$ReconnectMutex)
at com.ibm.mq.jmqi.remote.api.RemoteHconn.enterCall(RemoteHconn.java:1787)
at com.ibm.mq.jmqi.remote.api.RemoteHconn.enterCall(RemoteHconn.java:1764)
at com.ibm.mq.jmqi.remote.api.RemoteFAP.jmqiPutMessageWithProps(RemoteFAP.java:7804)
at com.ibm.mq.jmqi.remote.api.RemoteFAP.jmqiPut(RemoteFAP.java:7254)
at com.ibm.mq.ese.jmqi.InterceptedJmqiImpl.jmqiPut(InterceptedJmqiImpl.java:496)
at com.ibm.mq.ese.jmqi.ESEJMQI.jmqiPut(ESEJMQI.java:385)
at com.ibm.msg.client.wmq.internal.WMQMessageProducer$SpiIdentifiedProducerShadow.sendInternal(WMQMessageProducer.java:812)
at com.ibm.msg.client.wmq.internal.WMQMessageProducer$ProducerShadow.send(WMQMessageProducer.java:531)
at com.ibm.msg.client.wmq.internal.WMQMessageProducer.send(WMQMessageProducer.java:1178)
at com.ibm.msg.client.jms.internal.JmsMessageProducerImpl.sendMessage(JmsMessageProducerImpl.java:927)
at com.ibm.msg.client.jms.internal.JmsMessageProducerImpl.send_(JmsMessageProducerImpl.java:783)
at com.ibm.msg.client.jms.internal.JmsMessageProducerImpl.send(JmsMessageProducerImpl.java:446)
The entire client application is unresponsive. How do I troubleshoot this type of problem with the IBM MQ classes for JMS? Is it a connection setup problem? There are no errors in the MQ log file. The MQ version is 7.5.2. Thanks in advance for any help.
The thread is waiting on the ReconnectMutex, which means that a reconnection to the queue manager is currently in progress; the send call waits until it is notified that the reconnection has succeeded. Is there another thread that looks like it is attempting to reconnect to the queue manager and is not moving? Do you know if the queue manager is up and running at this time?
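To answer the "is another thread stuck?" question from inside the application, something like the following could help. It is only a sketch using the standard `java.lang.management` API, so it has to run in the same JVM as the JMS client (for example from a diagnostic hook); the class name `WaitingThreadReport` and the idea of filtering on the string "ReconnectMutex" are my own assumptions, not anything provided by the MQ classes. Taking a full thread dump with your usual JVM tooling gives the same information.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: lists threads in this JVM that are waiting on a given kind of lock.
final class WaitingThreadReport {

    /**
     * Returns one line per live thread that is WAITING or TIMED_WAITING on an
     * object whose description contains lockClassFragment.
     */
    static List<String> threadsWaitingOn(String lockClassFragment) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<String> report = new ArrayList<>();
        // false, false: do not collect locked monitors or ownable synchronizers
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            Thread.State state = info.getThreadState();
            // Null when the thread is not blocked or waiting on an object
            String lock = info.getLockName();
            boolean waiting = state == Thread.State.WAITING
                    || state == Thread.State.TIMED_WAITING;
            if (waiting && lock != null && lock.contains(lockClassFragment)) {
                report.add(info.getThreadName() + " [" + state + "] waiting on " + lock);
            }
        }
        return report;
    }

    public static void main(String[] args) {
        // In a real client this would list threads parked like "PollThread" above.
        threadsWaitingOn("ReconnectMutex").forEach(System.out::println);
    }
}
```

If every application thread shows up in this list and no thread appears to be doing reconnection work, that points at the queue manager or the network rather than at the application code.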
I have installed a J2EE application in WebSphere ND 8.5.5.9 on an IBM AIX 7.2 server.
While installing the application, I skipped the queue setup by giving it dummy values. A listener port issue then came up, as it was trying to connect to the dummy setup. As a result the connection pool filled up and the system started throwing exceptions. So I re-installed the application and kept the listener port in the STOP state. For the first few hours the application ran as expected. Now it is throwing the exceptions below:
[5/23/18 17:29:53:609 CEST] 000000a9 FreePool E J2CA0045E: Connection not available while invoking method createOrWaitForConnection for resource jdbc/"".
[5/23/18 17:31:12:899 CEST] 00000055 FreePool E J2CA0045E: Connection not available while invoking method createOrWaitForConnection for resource jdbc/"".
[5/23/18 17:31:12:900 CEST] 00000055 AlarmThreadMo W UTLS0009W: Alarm Thread "Non-deferrable Alarm : 0" (00000055) previously reported to be delayed has now completed. It was active for approximately 180004 milliseconds.
[5/23/18 17:32:11:191 CEST] 00000029 AlarmThreadMo W UTLS0008W: The return of alarm thread "Non-deferrable Alarm : 2" (00000057) to the alarm thread pool has been delayed for 18271 milliseconds. This may be preventing normal alarm function within the application server. The alarm listener stack trace is as follows:
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:201)
at com.ibm.ejs.j2c.FreePool.queueRequest(FreePool.java:438)
at com.ibm.ejs.j2c.FreePool.createOrWaitForConnection(FreePool.java:1344)
at com.ibm.ejs.j2c.PoolManager.reserve(PoolManager.java:3898)
at com.ibm.ejs.j2c.PoolManager.reserve(PoolManager.java:3118)
at com.ibm.ejs.j2c.ConnectionManager.allocateMCWrapper(ConnectionManager.java:1548)
at com.ibm.ejs.j2c.ConnectionManager.allocateConnection(ConnectionManager.java:1031)
at com.ibm.ws.rsadapter.jdbc.WSJdbcDataSource.getConnection(WSJdbcDataSource.java:646)
at com.ibm.ws.rsadapter.jdbc.WSJdbcDataSource.getConnection(WSJdbcDataSource.java:924)
at com.ibm.ws.extensionhelper.db.impl.DatabaseHelperImpl$DSWrapper.getConnection(DatabaseHelperImpl.java:1595)
at com.ibm.ws.extensionhelper.db.impl.DatabaseHelperImpl.getConnection(DatabaseHelperImpl.java:750)
at com.ibm.ws.leasemanager.impl.LeaseManagerDBHelper.getConnection(LeaseManagerDBHelper.java:213)
at com.ibm.ws.leasemanager.impl.LeaseStoreImpl.renew(LeaseStoreImpl.java:452)
at com.ibm.ws.leasemanager.impl.LeaseImpl.renew(LeaseImpl.java:141)
at com.ibm.ws.scheduler.LeaseAlarm.alarm(LeaseAlarm.java:173)
at com.ibm.ejs.util.am._Alarm.runImpl(_Alarm.java:151)
at com.ibm.ejs.util.am._Alarm.run(_Alarm.java:136)
at com.ibm.ws.util.ThreadPool$Worker.run(ThreadPool.java:1892).
Please suggest what can be done to free the connection pool without making any code changes. Can this be handled at the OS level or the WebSphere level?
The last of the warnings, with the 18-second wait, is for a connection attempt made by the WAS scheduler. Look in your configuration to see whether the scheduler is configured to use the same data source as the prior errors, jdbc/"" (which is an unusual name; is this data source configured properly?). There are a couple of possible causes behind these errors and warnings: the connection pool may be too small for the load your application generates, or some code may be holding on to connections for too long, starving the other users of the data source.
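To illustrate the second possibility (code holding on to connections), here is a toy model. It is not WebSphere's J2C pool; `TinyPool` and `Lease` are made-up names and the pool is just a `Semaphore`. It only shows the mechanism: a borrowed slot that is never handed back makes the next caller time out, which is roughly the situation J2CA0045E reports, while try-with-resources hands the slot back even if the work fails.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Toy fixed-size pool (hypothetical, for illustration only).
final class TinyPool {
    private final Semaphore permits;

    TinyPool(int maxConnections) {
        this.permits = new Semaphore(maxConnections);
    }

    /** A borrowed slot; close() hands it back to the pool. */
    final class Lease implements AutoCloseable {
        private boolean released;

        @Override
        public void close() {
            if (!released) {
                released = true;
                permits.release();
            }
        }
    }

    /** Waits up to timeoutMillis for a free slot; returns null if none became free. */
    Lease borrow(long timeoutMillis) {
        try {
            return permits.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS) ? new Lease() : null;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return null;
        }
    }

    int available() {
        return permits.availablePermits();
    }

    public static void main(String[] args) {
        TinyPool pool = new TinyPool(1);

        // Models code that takes a connection and never closes it.
        TinyPool.Lease leaked = pool.borrow(10);
        System.out.println("borrow while slot is leaked: " + pool.borrow(10));

        leaked.close(); // what the leaking code should have done

        // try-with-resources returns the slot when the block exits.
        try (TinyPool.Lease lease = pool.borrow(10)) {
            System.out.println("free slots inside block: " + pool.available());
        }
        System.out.println("free slots after block: " + pool.available());
    }
}
```

In real code the same pattern applies to `java.sql.Connection`, which is also `AutoCloseable`. If the application cannot be changed, the remaining options are on the configuration side (pool size and timeouts).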
I am using IBM WebSphere MQ 7.5 on a Unix system. I have installed the client on my machine, and the server is running on another machine. I can communicate with the server when running my JMS application as the 'mqm' user, but I get the error below when running it as another user.
However, I am able to run the 'amqsputc' and 'amqsgetc' commands and communicate with the server both as mqm and as the other user. I have followed all the steps described at http://www-01.ibm.com/support/knowledgecenter/SSFKSJ_7.5.0/com.ibm.mq.ins.doc/q009300_.htm?lang=en
Exception:
com.ibm.msg.client.jms.DetailedJMSSecurityException: JMSWMQ2013: The security authentication was not valid that was supplied for QueueManager 'TestManager' with connection mode 'Client' and host name 'x.x.x.x(9923)'.
Please check if the supplied username and password are correct on the QueueManager to which you are connecting.
at com.ibm.msg.client.wmq.common.internal.Reason.reasonToException(Reason.java:521)
at com.ibm.msg.client.wmq.common.internal.Reason.createException(Reason.java:221)
at com.ibm.msg.client.wmq.internal.WMQConnection.<init>(WMQConnection.java:426)
at com.ibm.msg.client.wmq.factories.WMQConnectionFactory.createV7ProviderConnection(WMQConnectionFactory.java:6902)
at com.ibm.msg.client.wmq.factories.WMQConnectionFactory.createProviderConnection(WMQConnectionFactory.java:6277)
at com.ibm.msg.client.jms.admin.JmsConnectionFactoryImpl.createConnection(JmsConnectionFactoryImpl.java:285)
at com.ibm.mq.jms.MQConnectionFactory.createCommonConnection(MQConnectionFactory.java:6233)
at com.ibm.mq.jms.MQQueueConnectionFactory.createQueueConnection(MQQueueConnectionFactory.java:120)
at com.ibm.mq.jms.MQQueueConnectionFactory.createConnection(MQQueueConnectionFactory.java:203)
at performance.IBMMQTestProducer.start(IBMMQTestProducer.java:142)
at performance.IBMMQTestProducer.main(IBMMQTestProducer.java:177)
Caused by: com.ibm.mq.MQException: JMSCMQ0001: WebSphere MQ call failed with compcode '2' ('MQCC_FAILED') reason '2035' ('MQRC_NOT_AUTHORIZED').
at com.ibm.msg.client.wmq.common.internal.Reason.createException(Reason.java:209)
... 9 more
Caused by: com.ibm.mq.jmqi.JmqiException: CC=2;RC=2035;AMQ9509: Program cannot open queue manager object. [1=2035,5=???]
at com.ibm.mq.jmqi.internal.JmqiTools.getQueueManagerInfo(JmqiTools.java:783)
at com.ibm.mq.jmqi.remote.impl.RemoteSession.loadInfo(RemoteSession.java:1993)
at com.ibm.mq.jmqi.remote.impl.RemoteSession.getName(RemoteSession.java:2026)
at com.ibm.mq.jmqi.remote.api.RemoteHconn.getName(RemoteHconn.java:728)
at com.ibm.mq.ese.intercept.JmqiConnInterceptorImpl.validate(JmqiConnInterceptorImpl.java:321)
at com.ibm.mq.ese.intercept.JmqiConnInterceptorImpl.afterConnect(JmqiConnInterceptorImpl.java:226)
at com.ibm.mq.ese.intercept.JmqiConnInterceptorImpl.afterJmqiConnect(JmqiConnInterceptorImpl.java:133)
at com.ibm.mq.ese.jmqi.InterceptedJmqiImpl.jmqiConnect(InterceptedJmqiImpl.java:315)
at com.ibm.mq.ese.jmqi.ESEJMQI.jmqiConnect(ESEJMQI.java:337)
I am able to run it when I run as myself and pass 'mqm' when creating the connection:
connection = cf.createConnection("mqm", "pswd");
I am not getting anything in the queue manager's log. Below is the log.
------------------------------------------------------------------------------
10/30/2015 06:50:54 AM - Process(31064.1) User(mqm) Program(strmqm)
Host(x.x.x.x) Installation(Installation1)
VRMF(7.5.0.2)
AMQ7125: There are 83 days left in the trial period for this copy of WebSphere
MQ.
EXPLANATION:
This copy of WebSphere MQ is licensed for a limited period only.
ACTION:
None.
Given that you have told us that you successfully connected and ran the amqsgetc and amqsputc client samples using the same server-connection channel that your JMS program uses, this does not look like a connection-time problem. In other words, the MQCONN to the queue manager succeeded and something after it is failing. We know that you can MQPUT and MQGET (since that is what the aforementioned samples do).
Something that JMS does that those simple samples do not do is an MQINQ of the queue manager. The following part of your exception makes me wonder if that is what you are tripping over:
Caused by: com.ibm.mq.jmqi.JmqiException: CC=2;RC=2035;
AMQ9509: Program cannot open queue manager object. [1=2035,5=???] at
com.ibm.mq.jmqi.internal.JmqiTools.getQueueManagerInfo(JmqiTools.java:783) at
To be completely certain you must check the queue manager AMQERR01.LOG to see what is reported there. If it is missing authorization then it will tell you there.
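On the client side, it can also help to log the innermost reason code rather than only the top-level JMS message. The sketch below uses nothing but `Throwable.getCause()`, which is what produces the "Caused by:" lines in the trace above. The class name `ReasonCodes` and the regular expression are my own, derived from the message texts in that trace, so treat the pattern as an assumption about the message format rather than a documented contract.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: pulls the MQ reason code out of an exception chain.
final class ReasonCodes {
    // Matches "RC=2035" (JmqiException text) or "reason '2035'" (MQException text).
    private static final Pattern REASON = Pattern.compile("(?:RC=|reason ')(\\d{4})");

    /** Walks the getCause() chain and returns the deepest reason code found, or -1 if none. */
    static int deepestReasonCode(Throwable t) {
        int code = -1;
        int depth = 0;
        // The depth limit guards against a pathological cyclic cause chain.
        for (Throwable c = t; c != null && depth < 32; c = c.getCause(), depth++) {
            String msg = c.getMessage();
            if (msg == null) {
                continue;
            }
            Matcher m = REASON.matcher(msg);
            if (m.find()) {
                code = Integer.parseInt(m.group(1)); // later (deeper) matches overwrite earlier ones
            }
        }
        return code;
    }
}
```

A 2035 found this way only tells you that some call was not authorized; the queue manager's error log is still where you find out which object and which user ID were involved.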
I have a queue whose reader consumes messages under syncpoint, and the reader ended abruptly. This left 2 messages in the uncommitted state. The "msgage" property keeps increasing for the message and the "uncom" property of the queue stays at 2, even though we restarted the consumer application and there is no long-running UOW.
Is there any way we can reset these properties without restarting MQ?
Presuming your application is connecting in client mode (over TCP) I expect that although your application has gone away, from the queue manager's point of view it is still active.
When the network socket the application opened closes, then MQ should roll back the 2 messages so they're eligible for consumption by another application.
The network socket will close when the operating system eventually notices the remote end of the TCP connection is unresponsive - this triggers a 'connection reset by peer' type socket closure. It's the operating system the queue manager is running on which will do this, not the remote one.
Some operating systems can take hours to notice a duff socket in their default configuration. Look into 'TCP keepalive' settings on your operating system to tune how long this takes.
I have the following error showing up in AMQERR01.LOG
AMQ9999: Channel 'MGATESrvChannel' to host 'Mgate (127.0.0.1)' ended
abnormally.
EXPLANATION:
The channel program running under process ID 1060(4364) for channel
'MGATESrvChannel' ended abnormally. The host name is 'Mgate (127.0.0.1)'; in
some cases the host name cannot be determined and so is shown as '????'.
This error is preceded with following message:
AMQ9508: Program cannot connect to the queue manager.
EXPLANATION:
The connection attempt to queue manager 'MGATE.QM' failed with reason code
2059.
ACTION:
Ensure that the queue manager is available and operational.
From what I have been told, this can be caused by an application that is using the queue manager; however, it seems to me that it has more to do with the way the queue manager was set up, or something similar. Can anyone please shed some light on this?
Thanks in advance!
The 2059 says that a connection request was received and refused because the QMgr was not available. We used to see this a lot when the listener was run as a separate process or when inetd was used to start channels. This is because the listener was there to accept the connection but the QMgr processes were not.
Now that the listener is run as a child process of the QMgr, it is quite rare to see this in the WMQ error logs, though clients commonly see it. This is because when the listener is a child process of the QMgr and the QMgr is down, the listener is down too, so there is nothing listening to receive the connection request and it bounces off the host's IP stack before ever getting to MQ code.
The AMQ9999 message says that a channel program, one of the QMgr's child processes, died or was killed, and this caused the channel to terminate. There are many reasons for a channel process to die, including being killed by the OS if resources are short or being killed by a human operator. Other than that, the most common way they die is when they run in trusted or fastpath mode and the attached program corrupts them.
It would help to narrow down the field to know the details of the QMgr in question - version and fix pack, how the listeners are started, channel settings, etc.
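One quick way to tell the two cases apart from the client machine is to check whether anything accepts a TCP connection on the listener port at all. The sketch below uses only `java.net`; the class name `ListenerProbe` is made up, and the host and port are whatever your channel definition uses. Note the limits: a successful TCP connect only shows that some process is listening there, not that the queue manager behind it is healthy.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper: checks whether a TCP listener answers on host:port.
final class ListenerProbe {

    /** Returns true if something accepts a TCP connection on host:port within timeoutMillis. */
    static boolean accepts(String host, int port, int timeoutMillis) {
        try (Socket s = new Socket()) {
            s.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused, timed out, or host unreachable
            return false;
        }
    }

    public static void main(String[] args) {
        // Usage: java ListenerProbe <host> <port>
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 1414; // 1414 is only a placeholder default
        System.out.println(host + ":" + port + " accepts TCP connections: "
                + accepts(host, port, 2000));
    }
}
```

If the probe fails, the request never reaches MQ code, which matches the "bounces off the IP stack" case. If it succeeds and you still see 2059 in the error log, the listener is up but the queue manager it serves was not available when the channel program tried to connect.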
Start your listener. You may also want to check the Control property for that channel, so that it starts automatically when the queue manager restarts.
We are trying to write a message to a broker queue, but the whole request fails when it tries to commit the JMS transaction, and then it tries to roll back on each subsequent attempt. We use Oracle XA drivers. I was not sure where to post this issue (the MQ forums or the Oracle forum), so I thought I would give it a try here. Can someone please help resolve this?
Error:
[9/25/12 17:10:06:871 EDT] 0000003e XATransaction E J2CA0027E: An exception occurred while invoking commit on an XA Resource Adapter from dataSource JMS$QCF$JMSManagedConnection#23, within transaction ID {XidImpl: formatId(57415344), gtrid_length(36), bqual_length(54), data(00000139ff43ef2500000001000043106c82332ef6bc723402e84f341fb357080ddd4d1b00000139ff43ef2500000001000043106c82332ef6bc723402e84f341fb357080ddd4d1b000000010000000000000000000000000001)}: javax.transaction.xa.XAException: The method 'xa_commit' has failed with errorCode '-7'.
at com.ibm.mq.jmqi.JmqiXAResource.commit(JmqiXAResource.java:407)
at com.ibm.ejs.jms.JMSManagedSession$JMSXAResource.commit(JMSManagedSession.java:1702)
at com.ibm.ejs.j2c.XATransactionWrapper.commit(XATransactionWrapper.java:463)
at com.ibm.ws.Transaction.JTA.JTAXAResourceImpl.commit_one_phase(JTAXAResourceImpl.java:305)
at com.ibm.ws.Transaction.JTA.RegisteredResources.flowCommitOnePhase(RegisteredResources.java:2916)
at com.ibm.ws.Transaction.JTA.TransactionImpl.commitXAResources(TransactionImpl.java:2533)
at com.ibm.ws.Transaction.JTA.TransactionImpl.stage1CommitProcessing(TransactionImpl.java:1687)
at com.ibm.ws.Transaction.JTA.TransactionImpl.processCommit(TransactionImpl.java:1647)
at com.ibm.ws.Transaction.JTA.TransactionImpl.commit(TransactionImpl.java:1582)
at com.ibm.ws.Transaction.JTA.TranManagerImpl.commit(TranManagerImpl.java:247)
at com.ibm.ws.Transaction.JTA.TranManagerSet.commit(TranManagerSet.java:168)
at com.ibm.ws.Transaction.JTA.UserTransactionImpl.commit(UserTransactionImpl.java:293)
at org.springframework.transaction.jta.JtaTransactionManager.doCommit(JtaTransactionManager.java:1009)
at org.springframework.transaction.support.AbstractPlatformTransactionManager.processCommit(AbstractPlatformTransactionManager.java:754)
at org.springframework.transaction.support.AbstractPlatformTransactionManager.commit(AbstractPlatformTransactionManager.java:723)
at org.springframework.jms.listener.AbstractPollingMessageListenerContainer.receiveAndExecute(AbstractPollingMessageListenerContainer.java:255)
at org.springframework.jms.listener.DefaultMessageListenerContainer$AsyncMessageListenerInvoker.invokeListener(DefaultMessageListenerContainer.java:1002)
at org.springframework.jms.listener.DefaultMessageListenerContainer$AsyncMessageListenerInvoker.run(DefaultMessageListenerContainer.java:901)
at org.springframework.scheduling.commonj.DelegatingWork.run(DelegatingWork.java:61)
at com.ibm.ws.asynchbeans.J2EEContext$RunProxy.run(J2EEContext.java:264)
at java.security.AccessController.doPrivileged(Native Method)
at com.ibm.ws.asynchbeans.J2EEContext.run(J2EEContext.java:1137)
at com.ibm.ws.asynchbeans.WorkWithExecutionContextImpl.go(WorkWithExecutionContextImpl.java:195)
at com.ibm.ws.asynchbeans.CJWorkItemImpl.run(CJWorkItemImpl.java:187)
at com.ibm.ws.util.ThreadPool$Worker.run(ThreadPool.java:1560)
.
[9/25/12 17:10:06:880 EDT] 0000003e RegisteredRes W WTRN0052E: An attempt by the transaction manager to call one phase commit on a transactional resource has resulted in an XAER_RMFAIL error. The resource was com.ibm.ws.Transaction.JTA.JTAXAResourceImpl#1d07bf1#{XidImpl: formatId(57415344), gtrid_length(36), bqual_length(54), data(00000139ff43ef2500000001000043106c82332ef6bc723402e84f341fb357080ddd4d1b00000139ff43ef2500000001000043106c82332ef6bc723402e84f341fb357080ddd4d1b000000010000000000000000000000000001)}
[9/25/12 17:10:06:887 EDT] 0000003e DefaultMessag W org.springframework.jms.listener.DefaultMessageListenerContainer handleListenerSetupFailure Setup of JMS message listener invoker failed for destination 'queue:///RANDOM QUEUE?targetClient=1' - trying to recover. Cause: Heuristic completion: outcome state is mixed; nested exception is javax.transaction.HeuristicMixedException
Here's the cause and resolution
Quote-
The cause of these errors is usually the result of a WebSphere MQ
messaging provider JMS Connection being closed off by WebSphere
Application Server because the Aged timeout for the Connection has
expired.
Resolution-
To resolve this issue, ensure that the JMS Connection Factory being
used by the application has the Connection Pool property Aged timeout
set to zero. This will prevent JMS Connections being closed when they
are returned to the Free Pool, and so ensures that any outstanding
transactional work can be completed
It is sometimes also caused by a faulty DataDirect driver, an issue that has been reported and fixed by IBM; see this.
Earlier we had multiple JMS sessions. That was probably the cause of the issue in one of the environments. So we changed to two different sessions, and now it works!
Julian:
My scenario is slightly different from yours. Earlier we had a single flow: a request message was put on the queue; the message was picked up and processed; we saved to the DB; we then generated another message and put it on another broker queue; and finally we sent a response to the first message.
Now we have changed that to two different flows: Request --> Process --> Save to DB --> Reply,
and then a separate flow to put the message on the broker queue.
Hope this helps