Spring AMQP RabbitMQ: how to make sure two parallel consumers do not grab the same task at the same time? - parallel-processing

I have two systems integrated with RabbitMQ.
Background
The client sends multiple request messages from a Spring-AMQP outbound-gateway to a RabbitMQ direct exchange. The exchange then round-robin dispatches those messages to multiple workers. The workers are independent, located on different desktops, and run the same worker code in parallel to process different messages from the exchange, each using a SimpleMessageListenerContainer.
Logic Flow
Similar to the RabbitMQ tutorial on multiple workers with a direct exchange.
Client ----- sends requests (5 tasks) ----> RabbitMQ DirectExchange
The direct exchange then distributes those 5 tasks to the workers:
PC1 (Worker1), PC2 (Worker2)
Exchange type & my bindings
<!-- rabbit connection factory, rabbit template, and rabbit admin -->
<rabbit:connection-factory
    id="connectionFactory"
    host="local IP address"
    username="guest"
    password="guest"
    channel-cache-size="10" />

<rabbit:template id="amqpTemplate"
    connection-factory="connectionFactory"
    reply-timeout="600000"
    exchange="JobRequestDirectExchange"/>

<rabbit:admin connection-factory="connectionFactory" id="rabbitAdmin" />

<rabbit:direct-exchange name="taskRequests"
    auto-delete="false"
    durable="true" >
    <rabbit:bindings>
        <rabbit:binding queue="jobRequests" key="request.doTask" />
    </rabbit:bindings>
</rabbit:direct-exchange>

<rabbit:queue name="jobRequests" auto-delete="false" durable="true" />
Worker (consumer) configuration
<rabbit:listener-container id="workerContainer"
    acknowledge="auto"
    prefetch="1"
    connection-factory="connectionFactory">
    <rabbit:listener ref="taskWorker" queue-names="jobRequests" />
</rabbit:listener-container>
The worker class is a simple POJO that processes the request and completes the task.
Using: RabbitMQ 3.2.2 with Spring-Integration-AMQP 2.2
What I expect
I expect Worker1 to receive some of the tasks while Worker2 picks up the rest (the other tasks).
I want the workers to process all 5 tasks in parallel. Each worker should handle only one task at a time and be given another task only after it finishes (the rabbit listener has been set to prefetch=1).
For example:
worker1: t2 t3 t5
worker2: t1 t4
But
After many runtime tests, sometimes the tasks are dispatched correctly:
Worker1------task4 task1
Worker2------task3 task2 task5
But sometimes it goes wrong like this:
Worker1------task4 task1
Worker2------task4 task2 task1
Apparently, task4 and task1 were picked up by both worker1 and worker2 at the same time.
Runtime test:
I checked that the client correctly sends out the task1 task2 task3 task4 task5 request messages to the RabbitMQ exchange, but every time each worker receives different tasks. There is a common case that can trigger the wrong dispatching.
There are 5 tasks (t1, t2, t3, t4, t5) at the RabbitMQ exchange, and they will be sent to 2 parallel workers (w1, w2).
w1 got tasks: t2 t1 t4
w2 got tasks: t3 t1
With round-robin dispatching, w1 and w2 receive tasks in sequence:
w1 gets t2 and w2 gets t3.
While t2 and t3 are running, RabbitMQ sends t1 to w1 and waits for the ack from w1.
Suppose t2 takes longer to finish than t3, so w2 becomes free while w1 is still working on t1.
After finishing t3, w2 also receives t1 (redispatched by RabbitMQ), because w2 is idle and RabbitMQ has not received an ack for t1.
My understanding is:
Both w1 and w2 are now doing the same task t1. Whichever finishes t1 first sends an ack back to RabbitMQ, which then dequeues one message. Because t1 is finished twice, RabbitMQ dequeues one more message than it should, so the t5 message is dequeued as if it were done. Although all 5 messages in the queue end up acked and dequeued, the two workers processed t1 twice and never processed t5.
What should I do to prevent two parallel workers from grabbing the same message from the same RabbitMQ queue?
I tried the auto-ack approach and the messages are correctly acked, but while the broker is waiting for a worker's ack, RabbitMQ may redispatch the not-yet-acked message to another worker.
I have also thought about synchronizing the sent messages or giving them priorities, but I don't have a clear idea of how to accomplish that.
I'd be grateful for any ideas about this problem. Thanks.

One thing I can think of that causes duplicated messages for your consumers is a consumer closing the channel before sending an ack.
In that case, the RabbitMQ broker will requeue the message and set its redelivered flag to true. From the RabbitMQ docs:
If a message is delivered to a consumer and then requeued (because it was not acknowledged before the consumer connection dropped, for example) then RabbitMQ will set the redelivered flag on it when it is delivered again (whether to the same consumer or a different one). This is a hint that a consumer may have seen this message before (although that's not guaranteed, the message may have made it out of the broker but not into a consumer before the connection dropped). Conversely if the redelivered flag is not set then it is guaranteed that the message has not been seen before. Therefore if a consumer finds it more expensive to deduplicate messages or process them in an idempotent manner, it can do this only for messages with the redelivered flag set.
If during your tests you close one of the worker processes before it sends an ack, or if a worker faults, this is very likely what is happening. You can examine the redelivered flag to avoid processing the message again on a different consumer, if that is the case.
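For example, a minimal sketch of a listener that checks the flag could look like this (the TaskWorker class shape and the idempotency check are placeholders, not your actual worker):

import org.springframework.amqp.core.Message;
import org.springframework.amqp.core.MessageListener;

// Hypothetical worker that deduplicates possibly-redelivered messages.
public class TaskWorker implements MessageListener {

    @Override
    public void onMessage(Message message) {
        String body = new String(message.getBody());
        Boolean redelivered = message.getMessageProperties().isRedelivered();

        if (Boolean.TRUE.equals(redelivered) && alreadyProcessed(body)) {
            // The broker hints this message may have been seen before; skip the duplicate.
            return;
        }
        process(body);
    }

    private boolean alreadyProcessed(String body) { return false; /* your idempotency check */ }

    private void process(String body) { /* do the actual task */ }
}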
Another thing I've noticed is the prefetch setting in your consumer configuration. You should set this to a higher value (tuned for your needs) instead of leaving it at just 1. You can learn more about prefetch here.
Hope that helps!

I tried for a long time to work out a Spring-configured way to implement this feature, but failed.
In the end I came up with a workable solution using the RabbitMQ Java Client API.
Using the Spring asynchronous gateway with the Quartz scheduler, there was always a problem sending messages as needed. I suspect the reason is related to multi-threading.
At the beginning I thought it was because the Channel instance may be accessed concurrently by multiple threads, in which case confirms are not handled properly:
An important caveat to this is that confirms are not handled properly when a Channel is shared between multiple threads. In that scenario, it is therefore important to ensure that the Channel instance is not accessed concurrently by multiple threads.
The above is from http://www.rabbitmq.com/javadoc/com/rabbitmq/client/Channel.html
Finally, I decided to give up on the Spring way and go back to the RabbitMQ API (before, I used Spring XML to configure the gateway/channels; now I declare the exchange and channels programmatically with the RabbitMQ Java Client), and added RabbitMQ RPC for the asynchronous callback. Now everything works fine for the current requirement.
So in summary, the final solution for my requirement is:
Use the RabbitMQ Java Client API to declare the exchange/channels/bindings/routing keys,
for both the client and the server side.
Use RabbitMQ RPC to implement the asynchronous callback feature.
(I followed RabbitMQ's Java tutorial: http://www.rabbitmq.com/tutorials/tutorial-six-java.html)
A rough sketch of the worker side is shown below.
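Roughly, the worker side ended up like the following sketch (the host and queue name are placeholders rather than my real configuration); basicQos(1) plus manual acks is what keeps a busy worker from being handed a second task, and the replyTo/correlationId pair is the RPC-style callback from tutorial six:

import com.rabbitmq.client.*;

public class Worker {

    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost");                        // placeholder host
        Connection connection = factory.newConnection();
        Channel channel = connection.createChannel();

        channel.queueDeclare("jobRequests", true, false, false, null);
        channel.basicQos(1);                                  // at most one unacked task per worker

        QueueingConsumer consumer = new QueueingConsumer(channel);
        channel.basicConsume("jobRequests", false, consumer); // manual ack

        while (true) {
            QueueingConsumer.Delivery delivery = consumer.nextDelivery();
            AMQP.BasicProperties props = delivery.getProperties();

            String result = doTask(new String(delivery.getBody()));

            // RPC-style reply: send the result back to the client's replyTo queue.
            AMQP.BasicProperties replyProps = new AMQP.BasicProperties.Builder()
                    .correlationId(props.getCorrelationId())
                    .build();
            channel.basicPublish("", props.getReplyTo(), replyProps, result.getBytes());

            channel.basicAck(delivery.getEnvelope().getDeliveryTag(), false);
        }
    }

    private static String doTask(String request) { /* do the actual work */ return "done"; }
}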

Did you try setting concurrentConsumers property on the listener container as discussed here?
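For reference, a rough Java-config sketch of such a container (the host, queue name, listener method, and counts are placeholders, not a verified setup):

import org.springframework.amqp.rabbit.connection.CachingConnectionFactory;
import org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer;
import org.springframework.amqp.rabbit.listener.adapter.MessageListenerAdapter;

public class WorkerContainerConfig {

    // Rough equivalent of the XML <rabbit:listener-container>, with concurrency added.
    public static SimpleMessageListenerContainer workerContainer(Object taskWorker) {
        CachingConnectionFactory connectionFactory = new CachingConnectionFactory("localhost"); // placeholder host
        SimpleMessageListenerContainer container = new SimpleMessageListenerContainer(connectionFactory);
        container.setQueueNames("jobRequests");
        container.setConcurrentConsumers(2);   // several consumers inside one container
        container.setPrefetchCount(1);         // each consumer holds one unacked message at a time
        container.setMessageListener(new MessageListenerAdapter(taskWorker, "handleMessage"));
        return container;
    }
}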

Related

Redis queue with Spring Integration at high throughput was losing messages

I am using Redis as a queue (using the Spring queue-in/outbound-channel-adapter) to distribute tasks (a message into the queue, etc.).
As the throughput is quite high, we observed that although the messages were sent to the Redis queue, a lot of them were lost and never arrived at the component after the inbound adapter (a header router).
The channel config is attached below; the point is that we thought the problem was in this header router after the inbound adapter, which was unable to keep up with the rate of messages read from the queue, so they were lost.
We have put an intermediate element between the inbound adapter and this component (the header router) and added a queue to fix this.
This works fine, but we don't fully understand the solution or whether it is the proper one.
An expert view and opinion on this configuration would be welcome!
Thanks
<!-- a Queue Inbound Channel Adapter is available to 'right pop' messages
     from a Redis List. -->
<redis:queue-inbound-channel-adapter
    id="fromRedis" channel="in" queue="${name}"
    receive-timeout="1000" recovery-interval="3000" expect-message="true"
    auto-startup="true"/>

<!-- a queue to avoid lost messages before the header router -->
<int:channel id="in">
    <int:queue capacity="1000"/>
</int:channel>

<!-- a bridge to connect channels and have a poller -->
<int:bridge input-channel="in" output-channel="out">
    <int:poller fixed-delay="500" />
</int:bridge>

<int:header-value-router id="router" timeout="15000"
    input-channel="out" header-name="decision"
    resolution-required="false" default-output-channel="defaultChannel" />
--- added on 26/02
To insert messages into Redis we have a web service, but it is just as you said: it simply writes messages into Redis (for ... channel.send(msg)). Nothing more.
Based on your answer I am now thinking of removing the in channel and its queue and using the header-value-router directly, but I have more questions:
I think the right solution is a low value for the timeout in the header-value-router, so I get the error notification faster if no consumer is available. If I don't set a timeout it will block indefinitely, and that is a bad idea, isn't it?
I don't know how to manage the MessageDeliveryException, because the router doesn't have an error-channel configuration.
I think that if I can handle this error and get the message back, I can re-send it to Redis again. There are other servers that take messages from Redis, and hopefully one of them could attend to it.
I've added my proposed solution below, but it is not complete and we are not sure about the error management, as explained above.
<!-- a Queue Inbound Channel Adapter is available to 'right pop' messages
     from a Redis List. -->
<redis:queue-inbound-channel-adapter
    id="fromRedis" channel="in" queue="${name}"
    receive-timeout="1000" recovery-interval="3000" expect-message="true"
    auto-startup="true"/>

<!-- a header-value-router with a quite low timeout -->
<int:header-value-router id="router" timeout="150"
    input-channel="in" header-name="decision"
    resolution-required="false" default-output-channel="defaultChannel" />

<!-- if MessageDeliveryException: what to do? -->
<int:channel id="someConsumerHeaderValue">
    <int:dispatcher task-executor="ConsumerExecutor" />
</int:channel>

<!-- If 5 threads are busy we queue messages up to 5; if the queue is full we can grow by
     5 more working threads; if there are no more threads we get a... MessageDeliveryException? -->
<task:executor id="ConsumerExecutor" pool-size="5-5"
    queue-capacity="5" />
Well, it's great to see such an observation. That might improve the Framework somehow.
So, I'd like to see:
Some test case to reproduce it from the Framework perspective, although I guess it is enough to send a lot of messages to Redis and use your config to consume them. (Correct me if anything else is needed.)
The downstream flow after the <int:header-value-router>. Look, you use timeout="15000" there, which is a synonym for send-timeout:
Specify the maximum amount of time in milliseconds to wait
when sending Messages to the target MessageChannels if blocking
is possible (e.g. a bounded queue channel that is currently full).
By default the send will block indefinitely.
Synonym for 'timeout' - only one can be supplied.
From here I can say that if your downstream consumer is slow enough on some QueueChannel, you end up in:
/**
 * Inserts the specified element at the tail of this queue, waiting if
 * necessary up to the specified wait time for space to become available.
 *
 * @return {@code true} if successful, or {@code false} if
 *         the specified waiting time elapses before space is available
 * @throws InterruptedException {@inheritDoc}
 * @throws NullPointerException {@inheritDoc}
 */
public boolean offer(E e, long timeout, TimeUnit unit)
    ....
    while (count.get() == capacity) {
        if (nanos <= 0)
            return false;
        nanos = notFull.awaitNanos(nanos);
    }
Pay attention to that return false; it indicates exactly where the message is lost.
That is also known as a back-pressure drop strategy.
Let me know if you have a different picture there.
You may consider removing that timeout="15000" to get the same blocking behavior as the plain queue channel.
UPDATE
Well, the error handling works a bit differently. The "guilty" component just throws the Exception, as with raw Java, and it is fine that this component isn't responsible for catching the Exception; that is up to the caller.
And the caller in our case is an upstream component: the <redis:queue-inbound-channel-adapter>.
Any inbound channel adapter has an error-channel option: through the <poller> if it is a MessageSource, or directly when it is a MessageProducer.
I'm sure you will be able to handle:
if (!sent) {
throw new MessageDeliveryException(message,
"failed to send message to channel '" + channel + "' within timeout: " + timeout);
}
in that error-channel sub-flow and achieve your requirements for recovery.
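For illustration only, a recovery handler wired as a service activator on that error-channel sub-flow could look roughly like this sketch; the re-push to the Redis list via StringRedisTemplate, the class and queue names, and the exact exception package (which depends on your Spring Integration version) are assumptions, not a prescribed solution:

import org.springframework.data.redis.core.StringRedisTemplate;
import org.springframework.messaging.MessagingException;

// Hypothetical error handler, invoked from the adapter's error-channel sub-flow.
public class RedisRecoverer {

    private final StringRedisTemplate redisTemplate;
    private final String queueName;

    public RedisRecoverer(StringRedisTemplate redisTemplate, String queueName) {
        this.redisTemplate = redisTemplate;
        this.queueName = queueName;
    }

    // The ErrorMessage payload is the MessagingException (a MessageDeliveryException here),
    // which still carries the original failed message.
    public void recover(MessagingException exception) {
        Object failedPayload = exception.getFailedMessage().getPayload();
        // Push the payload back onto the Redis list so another server can pick it up.
        redisTemplate.boundListOps(queueName).leftPush(failedPayload.toString());
    }
}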

How to use multiple sessions per connection in a multi-threaded application?

Suppose I have one connection c and many session objects s1, s2 .. sn, each working in different threads t1, t2 ... tn.
c
|
-------------------------------------------------
| | | |
(t1,s1) (t2,s2) (t3,s3) ...... (tn,sn)
Now suppose one of the threads, t3, wants to send a message to a particular queue q3 and then listen for the reply asynchronously. So it does the following:
1: c.stop();
2: auto producer = s3.createProducer(s3.createQueue(q3));
3: auto text = s3.createTextMessage(message);
4: auto replyQueue = s3.createTemporaryQueue();
5: text.setJMSReplyTo(replyQueue);
6: producer.send(text);
7: auto consumer = s3.createConsumer(replyQueue);
8: consumer.setMessageListener(myListener);
9: c.start();
The reason I call c.stop() at the beginning and then c.start() at the end is that I'm not sure whether any of the other threads has called start on the connection (making all the sessions asynchronous — is that right?), and as per the documentation:
"If synchronous calls, such as creation of a consumer or producer, must be made on an asynchronous session, the Connection.Stop must be called. A session can be resumed by calling the Connection.Start method to start delivery of messages."
So calling stop at the beginning of the steps and start at the end seems reasonable, and thus the code seems correct (at least to me). However, thinking about it more, I believe the code is buggy, as it doesn't ensure that no other thread calls start before t3 finishes all the steps.
So my questions are:
Do I need to use a mutex to ensure this? Or does XMS handle it automatically (which would mean my reasoning is wrong)?
How should I design my application so that I don't have to call stop and start every time I want to send a message and listen for the reply asynchronously?
As per the quoted text above, I cannot call createProducer() and createConsumer() if the connection is in asynchronous mode. What other methods can I not call? The documentation doesn't categorise the methods in this way.
Also, the documentation doesn't say clearly what makes a session asynchronous. It says this:
"A session is not made asynchronous by assigning a message listener to a consumer. A session becomes asynchronous only when the Connection.Start method is called."
I see two problems here:
Calling c.start() makes all sessions asynchronous, not just one.
If I call c.start() but don't assign any message listener to a consumer, are the session(s) still asynchronous?
It seems I have lots of questions, so it would be great if anyone could provide links to the parts or sections of the documentation that explain XMS objects in this level of detail.
This says,
"According to the specification, calling stop(), close() on a Connection, setMessageListener() on a Session etc. must wait till all message processing finishes, that is till all onMessage() calls which have already been entered exit. So if anyone attempts to do that operation inside onMessage() there will be a deadlock by design."
But I'm not sure whether that information is authoritative, as I didn't find it in the IBM documentation.
I prefer the KIS rule. Why don't you use one connection per thread? Then the code would not have to worry about conflicts between threads.
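In JMS terms (shown in Java here for brevity; the XMS API is analogous), a sketch of the one-connection-per-thread idea could look like the following, assuming an already configured ConnectionFactory; the class and variable names are placeholders:

import javax.jms.*;

// One connection (and session) per thread, so start/stop on one thread's
// connection cannot affect the consumers of any other thread.
public class PerThreadRequester implements Runnable {

    private final ConnectionFactory factory;
    private final String queueName;
    private final String message;

    public PerThreadRequester(ConnectionFactory factory, String queueName, String message) {
        this.factory = factory;
        this.queueName = queueName;
        this.message = message;
    }

    @Override
    public void run() {
        try {
            Connection connection = factory.createConnection();
            Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);

            MessageProducer producer = session.createProducer(session.createQueue(queueName));
            TextMessage text = session.createTextMessage(message);
            TemporaryQueue replyQueue = session.createTemporaryQueue();
            text.setJMSReplyTo(replyQueue);
            producer.send(text);

            MessageConsumer consumer = session.createConsumer(replyQueue);
            consumer.setMessageListener(reply -> System.out.println("got reply: " + reply));

            connection.start(); // only this thread's deliveries are affected
        } catch (JMSException e) {
            throw new RuntimeException(e);
        }
    }
}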

JMS Delayed Delivery based on conditional variable(s)

I'm looking for a possibility in any of the more popular message queues (AMQP, RabbitMQ, ActiveMQ, etc.) to conditionally delay the delivery of a message.
For example:
System A sends a message(foo, condition = bar.x > 1);
System B sends a message(bar, x = 2)
Because the message of System B satisfies the condition set on the Message for System A, the message is unlocked and delivered.
Do such strategies exist?
Sort of, yes, with RabbitMQ.
You need two things:
code that checks the condition - your code, not RabbitMQ code.
the Delayed Message Exchange plugin https://github.com/rabbitmq/rabbitmq-delayed-message-exchange/
RabbitMQ does not have the ability to process logic statements or code. But you are already writing code so you can easily do that in your code.
If the condition is true, then send your message to the Delayed Message Exchange. If it is not true, send your message to a normal exchange.
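A rough sketch of the mechanics with the RabbitMQ Java client follows; the exchange type "x-delayed-message", the "x-delayed-type" argument, and the per-message "x-delay" header come from the plugin, while the exchange names, routing key, delay value, and the needsDelay decision are placeholders for your own condition-checking code:

import com.rabbitmq.client.AMQP;
import com.rabbitmq.client.Channel;
import java.util.HashMap;
import java.util.Map;

public class ConditionalPublisher {

    public static void publish(Channel channel, byte[] body, boolean needsDelay) throws Exception {
        if (!needsDelay) {
            // Condition already satisfied: publish to a normal exchange for immediate delivery.
            channel.basicPublish("normal-exchange", "foo", null, body);
            return;
        }

        // Delayed Message Exchange (type "x-delayed-message", provided by the plugin);
        // the per-message "x-delay" header postpones delivery.
        Map<String, Object> args = new HashMap<String, Object>();
        args.put("x-delayed-type", "direct");
        channel.exchangeDeclare("delayed-exchange", "x-delayed-message", true, false, args);

        Map<String, Object> headers = new HashMap<String, Object>();
        headers.put("x-delay", 60000); // deliver 60 seconds later, then re-check the condition
        AMQP.BasicProperties props = new AMQP.BasicProperties.Builder()
                .headers(headers)
                .build();
        channel.basicPublish("delayed-exchange", "foo", props, body);
    }
}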

Changing state of messages which are "in delivery"

In my application, I have a queue (HornetQ) set up on JBoss 7 AS.
I have used Spring batch to do some work once the messages is received (save values in database etc.) and then the consumer commits the JMS session.
Sometimes when there is an exception while processing a message, the execution of the consumer is aborted abruptly, and the message remains in the "in delivery" state. There are about 30 messages in this state on my production queue.
I have tried restarting the consumer, but the state of these messages does not change. The only way to remove these messages from the queue is to restart the queue. But before doing that I want a way to read these messages so that they can be corrected and sent to the queue again to be processed.
I have tried using a QueueBrowser to read them, but it does not work. I have searched a lot on Google but could not find any way to read these messages.
I am using a Transacted session, where once the message is processed, I am calling:
session.commit();
This sends the acknowledgement.
I am implementing Spring's
org.springframework.jms.listener.SessionAwareMessageListener
to receive messages and then process them.
While processing the messages, I am using spring batch to insert some data in database.
For a particular case, it tries to insert data too big to fit in a column.
It throws an exception and the transaction is aborted.
Now, I have fixed my producer and consumer not to have such data, so that this case should not happen again.
But my question is what about the 30 "in delivery" state messages that are in my production queue? I want to read them so that they can be corrected and sent to the queue again to be processed. Is there any way to read these messages? Once I know their content, I can restart the queue and submit them again (after correcting them).
Thanking you in anticipation,
Suvarna
It all depends on the Transaction mode you are using.
For instance, if you use transactions:
// session here is a transacted (TX) session
MessageConsumer consumer = session.createConsumer(someQueue);
connection.start();
Message msg = consumer.receive...
session.rollback(); // this will make the messages be redelivered
If you are using non-TX:
// session here is auto-ack
MessageConsumer consumer = session.createConsumer(someQueue);
connection.start();
// this means the message is ACKed as we receive it (auto-ACK)
Message msg = consumer.receive...
// however the consumer here could still have a buffer from the server...
// if you are not using the consumer any longer, close it
consumer.close(); // this will release messages held in the client buffer
Alternatively you could also set consumerWindowSize=0 on the connectionFactory.
This is documented for 2.2.5, but it hasn't changed in later releases:
http://docs.jboss.org/hornetq/2.2.5.Final/user-manual/en/html/flow-control.html
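For example, a sketch of disabling the client-side buffer programmatically (the JNDI name is a placeholder; you could equally set consumer-window-size on the connection-factory configuration):

import javax.naming.InitialContext;
import org.hornetq.jms.client.HornetQConnectionFactory;

public class ZeroBufferFactory {

    // Sketch only: look up the factory and disable the client-side consumer buffer.
    public static HornetQConnectionFactory lookup() throws Exception {
        InitialContext ctx = new InitialContext();
        HornetQConnectionFactory cf =
                (HornetQConnectionFactory) ctx.lookup("/ConnectionFactory"); // placeholder JNDI name
        cf.setConsumerWindowSize(0); // no client-side prefetch buffer
        return cf;
    }
}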
I'm covering all the possibilities I can think of, since you're not being specific about how you are consuming. If you provide more detail, I will be able to tell you more.
You can indeed read the messages in the queue using JMX (with, for example, jconsole).
In JBoss AS7 you can do it the following way:
MBeans>jboss.as>messaging>default>myJmsQueue>Operations
listMessagesAsJson
[edit]
Since 2.3.0 you have a dedicated method for this specific case:
listDeliveringMessages
See https://issues.jboss.org/browse/HORNETQ-763

Must the DLQ on a Queue Manager be a local queue on the QM?

We are attempting to consolidate the DLQs across the board in our enterprise into a single queue (an ENTERPRISE_DLQ, if you will). We have a mix of QMs on various platforms: mainframe, various Unix flavours (Linux, AIX, Solaris, etc.), Windows, AS/400...
The idea was to configure the DLQ on each QM (set the DEADQ attribute on the QM) to the ENTERPRISE_DLQ, which is a cluster queue. All the QMs in the enterprise are members of the cluster. This approach, however, does not seem to work when we tested it.
I have tested this by setting up a simple cluster with 4 QMs. On one of the QMs I defined a QRemote to a non-existent QM and a non-existent queue, but with a valid XMITQ, and configured the requisite SDR channel between the QMs as follows:
QM_FR - Full_Repos
QM1, QM2, QM3 - members of the Cluster
QM_FR hosts ENTERPRISE_DLQ which is advertised to the Cluster
On QM3 setup the following:
QM3.QM1 - sdr to QM1, ql(QM1) with usage xmitq, qr(qr.not_exist) rqmname(not_exist) rname(not_exist) xmitq(qm1), setup QM1 to trigger-start QM3.QM1 when a msg arrives on QM1
On QM1:
QM3.QM1 - rcvr chl, ql(local_dlq), ql(qa.enterise_dlq), qr(qr.enterprise.dlq)
Test 1:
Set deadq on QM1 to ENTERPRISE_DLQ, write a msg to QR.NOT_EXIST on QM3
Result: Msg stays put on QM1, QM3.QM1 is RETRYING, QM1 error logs complain about not being able to MQOPEN the Q - ENTERPRISE_DLQ!!
ql(qm1) curdepth(1)
Test 2:
Set deadq on QM1 to qr.enterprise.dlq, write a msg to QR.NOT_EXIST on QM3
Result: Msg stays put on QM1, QM3.QM1 is RETRYING, QM1 error logs complain about not being able to MQOPEN the Q - qr.enterprise.dlq (all caps)!!
ql(qm1) curdepth(2)
Test 3:
Set deadq on QM1 to qa.enterise_dlq, write a msg to QR.NOT_EXIST on QM3
Result: Msg stays put on QM1, QM3.QM1 is RETRYING, QM1 error logs complain about not being able to MQOPEN the Q - qa.enterise_dlq (all caps)!!
ql(qm1) curdepth(3)
Test 4:
Set deadq on QM1 to local_dlq, write a msg to QR.NOT_EXIST on QM3
Result: Msg stays put on QM1, QM3.QM1 is RUNNING, all msgs on QM3 ql(QM1) make it to local_dlq on QM3.
ql(qm1) curdepth(0)
Now the question: it looks like the DLQ on a QM must be a local queue. Is this a correct conclusion? If not, how can I make all the DLQ messages go to a single queue, the ENTERPRISE_DLQ above?
One obvious solution is to define a trigger on local_dlq on QM3 (and do the same on the other QMs) which will read the messages and write them to the cluster queue ENTERPRISE_DLQ. But this involves additional moving parts: a trigger and a trigger monitor on each QM. It would be most desirable to be able to configure a cluster queue/QRemote/QAlias as the DLQ on the QM. Thoughts/ideas?
Thanks
-Ravi
Per the documentation here:
A dead-letter queue has no special requirements except that:
It must be a local queue
Its MAXMSGL (maximum message length) attribute must enable the queue to accommodate the largest messages that the queue manager has
to handle plus the size of the dead-letter header (MQDLH)
The DLQ provides a means for a QMgr to handle messages that a channel was unable to deliver. If the DLQ were not local then the error handling for channels would itself be dependent on channels. This would present something of an architectural design flaw.
The prescribed way to do what you require is to trigger a job to forward the messages to the remote queue. This way whenever a message hits the DLQ, the triggered job fires up and forwards the messages. If you didn't want to write such a program, you could easily use a bit of shell or Perl code and the Q program from SupportPac MA01. It would be advisable that the channels used to send such messages off the QMgr would be set to not use the DLQ. Ideally, these would exist in a dedicated cluster so that DLQ traffic did not mix with application traffic.
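If you would rather stay in Java than use shell/Perl with the MA01 Q program, the triggered forwarder is essentially a get-from-local-DLQ / put-to-ENTERPRISE_DLQ loop; a hedged JMS-style sketch (the connection-factory setup for your queue manager is omitted and the queue names are placeholders):

import javax.jms.*;

public class DlqForwarder {

    // Sketch of a triggered job: drain the local DLQ and forward each message
    // to the (remote/cluster) ENTERPRISE_DLQ via a locally defined destination.
    public static void forward(ConnectionFactory cf) throws JMSException {
        Connection connection = cf.createConnection();
        Session session = connection.createSession(true, Session.SESSION_TRANSACTED);
        MessageConsumer consumer = session.createConsumer(session.createQueue("LOCAL_DLQ"));
        MessageProducer producer = session.createProducer(session.createQueue("ENTERPRISE_DLQ"));
        connection.start();

        Message msg;
        while ((msg = consumer.receive(1000)) != null) {
            producer.send(msg);
            session.commit();   // get + put as one unit of work
        }
        connection.close();
    }
}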
Also, be aware that one of the functions of the DLQ is to move messages out of the XMitQ if a conversion error prevents them from being sent. Forwarding them to a central location would have the effect of putting them back onto the cluster XMitQ. Similarly, if the destination filled up, these messages would also sit on the sending qMgr's cluster XMitQ. If they built up there in sufficient numbers, a full cluster XMitQ would prevent all cluster channels from working. In that event you'd need some kind of tooling to let you selectively delete or move messages out of the cluster XMitQ which would be a bit challenging.
With all that in mind, the requirement would seem to present more challenges than it solves. Recommendation: error handling for channels is best handled without further use of channels - i.e. locally.
