MQ base classes for Java: intermittent MQRC 2128 (UoW is already in progress) - ibm-mq

We are using the MQ base classes for Java, with MQ as an XA transaction coordinator.
Environment
MQ 7.5
Red Hat Linux 6.4
Java 1.7
Scenario:
1. MqManager.begin
2. queue.get (syncpoint set in the get options)
3. db save
4. MqManager.commit/rollback
5. go back to step 1
Most of the time step 5, i.e. starting the new transaction, works fine; however, intermittently an exception is thrown: UoW is already in progress. Since the step 4 call was successful, we believed the transaction should have been either committed or rolled back successfully, and that it shouldn't cause issues when a new transaction is started. Can someone please suggest what could be causing the UoW not to be committed or rolled back even after a successful commit/rollback call on the queue manager?
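Roughly, the loop looks like the sketch below (the queue manager and queue names are placeholders and the DB step is stubbed out):

    import com.ibm.mq.MQException;
    import com.ibm.mq.MQGetMessageOptions;
    import com.ibm.mq.MQMessage;
    import com.ibm.mq.MQQueue;
    import com.ibm.mq.MQQueueManager;
    import com.ibm.mq.constants.CMQC;

    public class UowLoop {
        public static void main(String[] args) throws MQException {
            MQQueueManager qMgr = new MQQueueManager("QM1");               // placeholder queue manager name
            MQQueue queue = qMgr.accessQueue("APP.IN",                     // placeholder queue name
                    CMQC.MQOO_INPUT_AS_Q_DEF | CMQC.MQOO_FAIL_IF_QUIESCING);
            while (true) {
                qMgr.begin();                                              // step 1: start the unit of work
                try {
                    MQGetMessageOptions gmo = new MQGetMessageOptions();
                    gmo.options = CMQC.MQGMO_SYNCPOINT | CMQC.MQGMO_WAIT;  // step 2: get under syncpoint
                    gmo.waitInterval = 5000;
                    MQMessage msg = new MQMessage();
                    queue.get(msg, gmo);
                    saveToDatabase(msg);                                   // step 3: DB work in the same unit of work
                    qMgr.commit();                                         // step 4: commit
                } catch (Exception e) {
                    qMgr.backout();                                        // step 4: or roll back
                }
            }                                                              // step 5: loop back to begin()
        }

        private static void saveToDatabase(MQMessage msg) {
            // application-specific DB save, coordinated in the same unit of work
        }
    }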
Thanks
Vaibhav

You are probably caching JMS consumers or some other JMS objects.
Essentially, at a lower level the MQ conversation is driven by various client calls such as MQPUT, MQGET or MQOPEN. Your problem is that an issued MQBEGIN call was not matched by the proper MQCLOSE (?) call. This all happens in the JMS driver and is hidden from the Java developer.
If I remember correctly, the MQBEGIN call is mapped to the creation of a JMS consumer. Closing all JMS objects except the connection itself will work; see the sketch below.
There is a downside to this: closing and re-creating all the JMS objects increases CPU usage, as MQOPEN and similar calls tend to be CPU-expensive.
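For example, something along these lines keeps the connection cached but creates and closes the session and consumer per unit of work (the class and method names here are mine, not from your code):

    import javax.jms.Connection;
    import javax.jms.Message;
    import javax.jms.MessageConsumer;
    import javax.jms.Queue;
    import javax.jms.Session;

    public class PerUowConsumer {
        private final Connection connection;   // cached across units of work

        public PerUowConsumer(Connection connection) {
            this.connection = connection;
        }

        public void processOne(Queue queue) throws Exception {
            Session session = connection.createSession(true, Session.SESSION_TRANSACTED);
            MessageConsumer consumer = session.createConsumer(queue);
            try {
                Message msg = consumer.receive(5000);
                if (msg != null) {
                    // ... process the message and do the DB work ...
                    session.commit();
                }
            } catch (Exception e) {
                session.rollback();
                throw e;
            } finally {
                consumer.close();   // close everything except the connection itself
                session.close();
            }
        }
    }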

Related

What happens to uncommited messages if the application suddenly fails?

IBM MQ documentation says:
"If an application disconnects (MQDISC) from a queue manager while a global unit of work coordinated by IBM® MQ is still active, an attempt is made to commit the unit of work. If, however, the application terminates without disconnecting, the unit of work is rolled back as the application is deemed to have terminated abnormally."
The case of the application issuing an MQDISC is clear. My doubt is about the second scenario: let's say I'm putting messages and the application server suddenly loses power before the commit. How does the rollback work if the application is not available at all? Would those messages end up in the DLQ, or...?

Two consumers on same Websphere MQ JMS Queue, both receiving same message

I am working with someone who is trying to achieve load-balancing behavior using JMS queues with IBM WebSphere MQ. As such, they have multiple Camel JMS consumers configured to read from the same queue. Despite the fact that this behavior is undefined according to the JMS spec (last time I looked, anyway), they expect a sort of round-robin / load-balancing behavior. And, while the spec leaves this undefined, I'm led to believe that the normal behavior of WebSphere MQ is to deliver the message to only one of the consumers, and that it may do some type of load balancing. See here, for example: When multi MessageConsumer connect to same queue(Websphere MQ),how to load balance message-consumer?
But in this particular case, it appears that both consumers are receiving the same message.
Can anyone who is more of an expert with Websphere MQ shed any light on this? Is there any situation where this behavior is expected? Is there any configuration change that can alleviate this?
I'm leaning towards telling everyone here to use the native Websphere MQ clustering facility and go away from having multiple consumers pointing at the same Queue, but that will be a big change for them, so I'd love to discover a way to make this work.
Not that I'm a fan of relying on anything that's undefined, but if they're willing to rely on IBM specific behavior, I'll leave that up to them.
The only ways for them both to receive the same message are:
1. There are multiple copies of the message.
2. The apps are browsing the message without a lock, then circling back to delete it.
3. The apps are backing out a transaction and making the message available again.
4. The connection is severed before the app acknowledges the message.
Having multiple apps compete for messages in a queue is a recommended practice. If one app goes down the queue is still served. In a cluster this is crucial because the cluster will continue to direct messages to the un-served queue instance until it fills up.
If it's a Dev system, install SupportPac MA0W and tell it to trace just that one queue and you will be able to see exactly what is happening.
See the JMS spec, section 4.4. The provider must never deliver a second copy of an acknowledged message. An exception is made for session handling in 4.4.13, which I cover in item 4 above. That's pretty unambiguous and part of the official spec, so it is not IBM-specific behavior.
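As an illustration of item 4, here is a rough sketch (the names are mine) of a CLIENT_ACKNOWLEDGE consumer: once acknowledge() has been called the provider must not deliver the message again, but if the connection is severed before that call the message becomes available for redelivery, possibly to the other consumer on the same queue:

    import javax.jms.Connection;
    import javax.jms.Message;
    import javax.jms.MessageConsumer;
    import javax.jms.Queue;
    import javax.jms.Session;

    public class ClientAckConsumer {
        public static void receiveOne(Connection connection, Queue queue) throws Exception {
            Session session = connection.createSession(false, Session.CLIENT_ACKNOWLEDGE);
            MessageConsumer consumer = session.createConsumer(queue);
            try {
                Message msg = consumer.receive(5000);
                if (msg != null) {
                    // ... process the message ...
                    msg.acknowledge();   // after this, the provider must not deliver a second copy
                }
            } finally {
                consumer.close();
                session.close();
            }
        }
    }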

JMS p2p failover pattern in order to guarantee delivery

I'm a web developer who ended up doing some J2EE development (newbie). I sincerely need this theory confirmed.
I have been given the privilege of delivering a message from our system (producer) to the SOA Enterprise Service Bus (consumer) when the user hits the save button. The information cannot be lost or left undelivered, and the delivery order must be kept.
Environment:
JBoss EAP 5.1 as the producer.
The JNDI server is the ESB (maybe standard).
JBoss ESB as the consumer.
My weapon of choice is JMS, p2p, due to the asynchronous nature.
When the producer is about to send the message, some problems can occur:
The ESB is down, causing a JNDI exception.
The queue manager is for some reason not awake or is wrongly configured. This should cause some JMS exception.
A network hiccup, causing a JMS error.
So I'm looking for some failover pattern. Here is my suggestion:
Add an internal JMS queue to which the message is initially added.
Add an MDB that listens to the internal queue and tries to send the message to the target queue (ESB).
If it fails in any way, log a fatal error and send an email to the cool support people.
This should give a reliable pattern where a message remains on the internal queue until processed by the MDB; a rough sketch of such an MDB follows.
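A rough sketch of such an MDB (queue and JNDI names are placeholders; container-managed transactions assumed), where a failed send rolls back and leaves the message on the internal queue:

    import javax.annotation.Resource;
    import javax.ejb.ActivationConfigProperty;
    import javax.ejb.EJBException;
    import javax.ejb.MessageDriven;
    import javax.jms.Connection;
    import javax.jms.ConnectionFactory;
    import javax.jms.Message;
    import javax.jms.MessageListener;
    import javax.jms.MessageProducer;
    import javax.jms.Queue;
    import javax.jms.Session;

    @MessageDriven(activationConfig = {
        @ActivationConfigProperty(propertyName = "destinationType", propertyValue = "javax.jms.Queue"),
        @ActivationConfigProperty(propertyName = "destination", propertyValue = "queue/internalQueue") // placeholder
    })
    public class ForwardingMDB implements MessageListener {

        @Resource(mappedName = "jms/EsbConnectionFactory")   // placeholder JNDI name
        private ConnectionFactory connectionFactory;

        @Resource(mappedName = "queue/esbTargetQueue")       // placeholder JNDI name
        private Queue esbQueue;

        public void onMessage(Message message) {
            try {
                Connection connection = connectionFactory.createConnection();
                try {
                    // Inside a container-managed transaction the session arguments are ignored.
                    Session session = connection.createSession(true, Session.SESSION_TRANSACTED);
                    MessageProducer producer = session.createProducer(esbQueue);
                    producer.send(message);
                } finally {
                    connection.close();
                }
            } catch (Exception e) {
                // Throwing forces a rollback, so the message stays on the internal queue
                // and will be redelivered; log fatal / alert support here as described above.
                throw new EJBException("Forwarding to the ESB failed", e);
            }
        }
    }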
Please advise.
Best Regards
ds
Well, a 'temporary' queue is not a totally bad idea, but between getting the message off one queue and putting it on the other you have a potential window of risk. Even though that window is close to nothing, what would happen if you hit a failure right there and then? You'd have to put the message back on the first queue (and there you'd run into the problem of getting it back in the correct order - nasty stuff!) or hold on to it in some way until you can put it on the other queue (which in turn can be cumbersome if you, e.g., get into some failure situation).
A more stable solution would be to put the data in a database with a queue-order column. You can then select the data in the correct order, send it to the new queue, and finally flag it as 'done' - or even (better?) delete the row from the database.
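As a rough sketch (the outbox table, its columns and the JMS objects are my assumptions, not something from the question), the forwarding job could look like the code below. Note the two separate commits; for true atomicity the DB and JMS work would need to share an XA transaction:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import javax.jms.MessageProducer;
    import javax.jms.Queue;
    import javax.jms.Session;
    import javax.jms.TextMessage;
    import javax.sql.DataSource;

    public class OutboxForwarder {

        private final DataSource dataSource;   // assumed injected or looked up via JNDI
        private final Session session;         // assumed transacted JMS session
        private final Queue esbQueue;          // assumed target (ESB) queue

        public OutboxForwarder(DataSource dataSource, Session session, Queue esbQueue) {
            this.dataSource = dataSource;
            this.session = session;
            this.esbQueue = esbQueue;
        }

        public void forwardPending() throws Exception {
            MessageProducer producer = session.createProducer(esbQueue);
            Connection db = dataSource.getConnection();
            try {
                db.setAutoCommit(false);
                PreparedStatement select = db.prepareStatement(
                    "SELECT id, payload FROM outbox WHERE done = 0 ORDER BY queue_order");
                PreparedStatement markDone = db.prepareStatement(
                    "UPDATE outbox SET done = 1 WHERE id = ?");
                ResultSet rs = select.executeQuery();
                while (rs.next()) {
                    TextMessage msg = session.createTextMessage(rs.getString("payload"));
                    producer.send(msg);                      // send in queue_order
                    markDone.setLong(1, rs.getLong("id"));
                    markDone.executeUpdate();                // or DELETE the row instead
                }
                session.commit();                            // commit the JMS sends
                db.commit();                                 // then flag the rows as done
            } finally {
                producer.close();
                db.close();
            }
        }
    }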

WebSphere MQ: Message keeps toggling between input queue and backout queue

The logic flow is like this
A message is sent to an input queue
A ProcessorMDB's onMessage() is invoked. Within this method several operations/validations are done.
In case of a poison message (a message that the application code cannot handle), a RuntimeException is thrown.
This should roll back the transaction; we see evidence of this in the log file.
There is a backout threshold defined with a backout queue name
Once the threshold is reached, the message is sent to the backout queue.
But immediately it starts going back and forth between the input queue and the backout queue.
We are using the MQMON tool to observe this weird behavior. It continues almost forever, even after the app server (where the MDB is running) is shut down.
We are using WebLogic 10.3.1 and WebSphere MQ 6.0.2.
Any help will be much appreciated; it looks like we are running out of ideas.
This sounds like a syncpoint issue. If the QMgr were to issue a COMMIT when a message is requeued inside of a unit of work it would affect all messages under syncpoint inside of that thread. This would cause serious problems if an application had performed several PUT or GET calls prior to hitting the poison message. Rather than issue a COMMIT outside of the program's control, the QMgr just leaves the message on the backout queue inside the unit of work and waits for the program to issue the COMMIT. This can lead to some unexpected behavior such as what you are seeing where a message lands back on the input queue.
If another message is in the queue behind the "bad" one and it is processed successfully by the same thread, everything works out perfectly. The app issues a COMMIT on the new message and this also affects the poison message on the Backout Queue. However if the thread were to exit uncleanly (without an explicit disconnect or COMMIT) then the transaction is rolled back and the poison message is returned to the input queue.
The usual way of dealing with this is that the next good message (or batch of messages if transactions are batched) in the input queue will force the COMMIT. However in some cases where the owning thread gets no new work (perhaps it was performing a GET by Correlation ID) there is nothing to push the bad message through. In these cases, it is important to make sure that the application issues a COMMIT before ending. One way to do this is to write the code to perform the GET by CORRELID with a wait interval. If the wait interval expires, the application would get a return code of 2033 and then issue a COMMIT before closing the thread. If the reply message is legitimately late for whatever reason, the COMMIT will have no effect. But if the message arrived and had been backed out and requeued, the COMMIT will cause it to stay in the Backout Queue.
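In MQ base classes terms that might look something like the sketch below (queue manager, queue and correlation ID are placeholders):

    import com.ibm.mq.MQException;
    import com.ibm.mq.MQGetMessageOptions;
    import com.ibm.mq.MQMessage;
    import com.ibm.mq.MQQueue;
    import com.ibm.mq.MQQueueManager;
    import com.ibm.mq.constants.CMQC;

    public class ReplyWaiter {
        public static void waitForReply(MQQueueManager qMgr, MQQueue replyQueue, byte[] correlId)
                throws MQException {
            MQMessage reply = new MQMessage();
            reply.correlationId = correlId;

            MQGetMessageOptions gmo = new MQGetMessageOptions();
            gmo.options = CMQC.MQGMO_WAIT | CMQC.MQGMO_SYNCPOINT;
            gmo.matchOptions = CMQC.MQMO_MATCH_CORREL_ID;
            gmo.waitInterval = 30000;                              // 30-second wait

            try {
                replyQueue.get(reply, gmo);
                // ... process the reply ...
                qMgr.commit();
            } catch (MQException e) {
                if (e.reasonCode == CMQC.MQRC_NO_MSG_AVAILABLE) {  // 2033: the wait interval expired
                    qMgr.commit();   // pushes any backed-out poison message onto the backout queue
                } else {
                    qMgr.backout();
                    throw e;
                }
            }
        }
    }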
One way to see exactly what is going on is to run a trace against the queue in question. You can use the built-in trace function - strmqtrc - which has a few more options in V7 than does the V6 version. However if you want very fine grained control you can use the trace exit in SupportPac MA0W. With MA0W you can see exactly what API calls are made by the program and those made on its behalf.
[EDIT] Updating the response with some info from the PMR:
The following is from the WMQ V7 Infocenter:
"MessageConsumers are single threaded below the Session level, and any requeuing of poison messages takes place within the current unit of work. This does not affect the operation of the application, however when poison messages are requeued under a transacted or Client_acknowledge Session, the requeue action itself will not be committed until the current unit of work is committed by the application code or, if appropriate, the application container code."
Hence, if it is important for the customer to have poison messages committed immediately after they are backed out, it is recommended they either make use of the Application Server Facilities (ConnectionConsumer) which can commit the message immediately, or another mechanism to move poison messages from the queue.
Here is the link to this information in the V6 and V7 Information Centers. Since you are using the V6 client, you would want to refer to the V6 Infocenter. Note that with the V6 client, there is no mention in the Infocenter of ASF being able to commit the poison message immediately, even when using a ConnectionConsumer. The way I read it, this means you will probably need to upgrade to the V7 client to get the behavior you are looking for. It will be interesting to see if the PMR results in a similar recommendation.

websphere server stop causes inflight ejb transactions to rollback

I observe errors like the ones below when the WebSphere server instance is stopped from the admin console:
Caused by: javax.ejb.TransactionRolledbackLocalException: ; nested exception
is: javax.transaction.TransactionRolledbackException: Transaction is ended due to timeout
at com.ibm.ws.Transaction.JTA.TranManagerImpl.completeTxTimeout(TranManagerImpl.java:576)
at com.ibm.ws.Transaction.JTA.TranManagerSet.completeTxTimeout(TranManagerSet.java:625)
These are in-flight transactions during the server stop.
Increasing the timeouts from "Application servers->server->Transaction Service" does not seem to help.
Is this to do with the server quiesce timeouts? If yes, is there a way to configure those?
Also, the rollbacks are not observed when I "terminate" the server from the Admin Console; they are only observed when I "stop" the server.
Any ideas to debug this issue would be great.
What you want is probably Deployment for transactional high availability. The approach described there is the only product feature available for finishing those transactions without getting actual errors.
What happens for you is that the WebSphere Application Server gives each container some time to shut down. After the shutdown timeout it uses force, and the transactions get rolled back. You could also change the heuristic policy to, for instance, COMMIT. That depends on whether it is better for your application that everything in the transaction gets lost or that only the rest of the transaction gets lost.
