I want to read 10,000 messages from WebSphere MQ in groups, in sequential order. I am using the code below, but it is taking a long time to read all the messages. I also tried a multi-threaded approach, but sometimes two threads consume the same group and a race condition occurs.
I am using 3 threads to read the messages, but two of them sometimes access the same group at the same time. How can I avoid this, and what is the best way to read a large volume of messages sequentially? The code snippet is below.
MQConnectionFactory factory = new MQConnectionFactory();
factory.setQueueManager("QM_host");
MQQueue destination = new MQQueue("default");
Connection connection = factory.createConnection();
connection.start();
Session session = connection.createSession(true, Session.AUTO_ACKNOWLEDGE);

// Peek at the last message of a group to learn the group id and size
MessageConsumer lastMessageConsumer =
        session.createConsumer(destination, "JMS_IBM_Last_Msg_In_Group = TRUE");
TextMessage lastMessage = (TextMessage) lastMessageConsumer.receiveNoWait();
lastMessageConsumer.close();

if (lastMessage != null) {
    int groupSize = lastMessage.getIntProperty("JMSXGroupSeq");
    String groupId = lastMessage.getStringProperty("JMSXGroupID");
    boolean failed = false;
    // Fetch messages 1 .. groupSize-1 one at a time by sequence number
    for (int i = 1; (i < groupSize) && !failed; i++) {
        MessageConsumer consumer = session.createConsumer(destination,
                "JMSXGroupID = '" + groupId + "' AND JMSXGroupSeq = " + i);
        TextMessage message = (TextMessage) consumer.receiveNoWait();
        if (message != null) {
            System.out.println(message.getText());
        } else {
            failed = true;
        }
        consumer.close();
    }
    if (failed) {
        session.rollback();
    } else {
        System.out.println(lastMessage.getText());
        session.commit();
    }
}
connection.close();
I think a better way would be to have a coordinator thread in your application that listens for the last message of each group and, for each one, starts a new thread to fetch the messages belonging to the group assigned to that thread. (This would take care of the race conditions.)
Within a thread fetching the messages of a group, you don't need a for loop that fetches each message with its own consumer; instead, take any message belonging to the group while maintaining a group counter and buffering out-of-order messages. This is safe as long as you commit your session only after receiving and processing all messages of the group. (It also yields more performance, as each group is processed by a separate thread, and that thread accesses every message in MQ only once.) A sketch of the worker side follows.
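For illustration, here is a minimal, hedged sketch of such a worker, assuming the coordinator only browsed the last message of the group (so all messages are still on the queue) and hands the worker the group id and size; the method name and the 5-second timeout are placeholders, not a definitive implementation:

```java
// Sketch only: uses javax.jms.* and java.util.HashMap.
void consumeGroup(Session session, Destination destination,
                  String groupId, int groupSize) throws JMSException {
    MessageConsumer consumer =
            session.createConsumer(destination, "JMSXGroupID = '" + groupId + "'");
    Map<Integer, TextMessage> buffered = new HashMap<>(); // out-of-order messages
    int expected = 1;                                     // group counter
    while (expected <= groupSize) {
        TextMessage m = buffered.remove(expected);
        if (m == null) {
            m = (TextMessage) consumer.receive(5000);
            if (m == null) {          // group incomplete: give up, redeliver all
                consumer.close();
                session.rollback();
                return;
            }
        }
        int seq = m.getIntProperty("JMSXGroupSeq");
        if (seq == expected) {
            System.out.println(m.getText()); // process in sequence order
            expected++;
        } else {
            buffered.put(seq, m);            // hold until its turn comes
        }
    }
    consumer.close();
    session.commit(); // only after the whole group has been processed
}
```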
Please see IBM's documentation on sequential retrieval of messages. In case the page moves or is changed, I'll quote the most relevant part. For sequential processing to be guaranteed, the following conditions must be met:
All the put requests were done from the same application.
All the put requests were either from the same unit of work, or all the put requests were made outside of a unit of work.
The messages all have the same priority.
The messages all have the same persistence.
For remote queuing, the configuration is such that there can only be one path from the application making the put request, through its queue manager, through intercommunication, to the destination queue manager and the target queue.
The messages are not put to a dead-letter queue (for example, if a queue is temporarily full).
The application getting the message does not deliberately change the order of retrieval, for example by specifying a particular MsgId or CorrelId or by using message priorities.
Only one application is doing get operations to retrieve the messages from the destination queue. If there is more than one application, these applications must be designed to get all the messages in each sequence put by a sending application.
Though the page does not state this explicitly, when they say "one application" what is meant is a single thread of that one application. If an application has concurrent threads, the order of processing is not guaranteed.
Furthermore, reading 10,000 messages in a single unit of work as suggested in another response is not recommended as a means to preserve message order! Only do that if the 10,000 messages must succeed or fail as an atomic unit, which has nothing to do with whether they were received in order. In the event that large numbers of messages must be processed in a single unit of work it is absolutely necessary to tune the size of the log files, and quite possibly a few other parameters. Preserving sequence order is torture enough for any threaded async messaging transport without also introducing massive transactions that run for very long periods of time.
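For reference, the log file sizing mentioned above lives in the queue manager's qm.ini Log stanza; the values below are placeholders for illustration only, not tuning advice:

```
Log:
   LogPrimaryFiles=16
   LogSecondaryFiles=8
   LogFilePages=65535
   LogType=CIRCULAR
```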
You can do what you want with the MQ classes for Java (non-JMS); it may also be possible with the MQ classes for JMS, but it would be really tricky.
First, read this page from the MQ Knowledge Center.
I converted the pseudo-code (from the web page above) to MQ classes for Java and changed it from a browse to a destructive get.
Also, I prefer to process each group of messages under a syncpoint (assuming reasonably sized groups).
First off, you are missing several flags in the 'options' field of the GMO (MQGetMessageOptions), and the matchOptions field needs to be set to MQMO_MATCH_MSG_SEQ_NUMBER, so that all threads will always grab the first message in a group as their first message, rather than the 2nd message of a group, as happened in your scenario above.
MQGetMessageOptions gmo = new MQGetMessageOptions();
MQMessage rcvMsg = new MQMessage();

/* Get the first message in a group, or a message not in a group */
gmo.options = CMQC.MQGMO_COMPLETE_MSG | CMQC.MQGMO_LOGICAL_ORDER
        | CMQC.MQGMO_ALL_MSGS_AVAILABLE | CMQC.MQGMO_WAIT | CMQC.MQGMO_SYNCPOINT;
gmo.matchOptions = CMQC.MQMO_MATCH_MSG_SEQ_NUMBER;
rcvMsg.messageSequenceNumber = 1;
inQ.get(rcvMsg, gmo);
/* Examine first or only message */
...

/* Get the remaining messages of the group, if any, in logical order */
gmo.options = CMQC.MQGMO_COMPLETE_MSG | CMQC.MQGMO_LOGICAL_ORDER | CMQC.MQGMO_SYNCPOINT;
while ((rcvMsg.messageFlags & CMQC.MQMF_MSG_IN_GROUP) == CMQC.MQMF_MSG_IN_GROUP)
{
    rcvMsg.clearMessage();
    inQ.get(rcvMsg, gmo);
    /* Examine each remaining message in the group */
    ...
}
qMgr.commit();
My problem is this: let's say we have 10 consumers subscribed to the topic. From the producer side, I have to send a message to only 5 of those consumers.
Let's say the 5 consumers have the unique ids [1,2,3,4,5].
On the producer side I have concatenated these ids into a string and set it as a message property:
String devices = "1,2,3,4,5";
messagePostProcessor.setStringProperty("deviceIds", devices);
How do I handle this on the consumer side as a selector? I may send to 5, 10, or 50 consumers out of 100, depending on the situation.
From the producer's side we know which consumer ids to send to, but how can the consumers identify and handle this?
As mentioned in jms-selectors, jms-message and activemq-message, you cannot use an array object as a selector property of a JMS message. What you can try instead is something like this.
I am assuming your device ids look something like this:
For example: 'P8O4O18143JA3068', 'M0A0H8081436A22N', 'A0N0G8081436A2DI', etc.
So, when sending the message from the producer, do this:
String messageBody = "Message body that you want to send.";
String deviceIds = "P8O4O18143JA3068, M0A0H8081436A22N, A0N0G8081436A2DI";

TextMessage message = session.createTextMessage(messageBody);
message.setStringProperty("deviceIds", deviceIds); // property the selector matches on
producer.send(message);
And when receiving the message in the consumer, do this:
String myDeviceId = "P8O4O18143JA3068";
String messageSelector = "deviceIds LIKE '%" + myDeviceId + "%'"; // substring match

MessageConsumer consumer = session.createConsumer(destination, messageSelector);
Message message = consumer.receive();
This way, your consumers select/receive a message only if their device id appears in the message property. Note that a plain substring LIKE match can produce false positives when one id is a substring of another; if that can happen with your ids, surround each id with delimiters (for example ',id,') and match on the delimited form.
I am trying to schedule my consumption process from a single-partition topic. I can start it using KafkaListenerEndpointRegistry.start(), but I want to stop it after I have consumed all the messages currently in the partition, i.e. when I reach the last offset in the partition. Production into the topic happens only after I have finished the consumption and closed the consumer. How can I be sure I have read all the messages that were present when the scheduler started, and then stop my consumer? I am using @KafkaListener for the consumer.
Set the idleEventInterval container property and add an @EventListener method to listen for ListenerContainerIdleEvents.
Then stop the container from that event listener.
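A minimal sketch of that wiring, assuming Spring for Apache Kafka; the bean names and the 60-second interval are illustrative:

```java
// Sketch only: emit an idle event after 60s without records, then stop.
@Bean
public ConcurrentKafkaListenerContainerFactory<String, String> kafkaListenerContainerFactory(
        ConsumerFactory<String, String> consumerFactory) {
    ConcurrentKafkaListenerContainerFactory<String, String> factory =
            new ConcurrentKafkaListenerContainerFactory<>();
    factory.setConsumerFactory(consumerFactory);
    // publish a ListenerContainerIdleEvent after 60s with no records
    factory.getContainerProperties().setIdleEventInterval(60_000L);
    return factory;
}

@Autowired
private KafkaListenerEndpointRegistry registry;

@EventListener
public void onIdle(ListenerContainerIdleEvent event) {
    // no records arrived for idleEventInterval: the partition is drained,
    // so stop the container that raised the event
    registry.getListenerContainer(event.getListenerId()).stop();
}
```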
To read up to the last offset, you simply poll until you get back empty records.
You can invoke kafkaConsumer.pause() at the end of consumption. During the next scheduled run, invoke kafkaConsumer.resume().
From the KafkaConsumer.pause() Javadoc: "Suspend fetching from the requested partitions. Future calls to poll(Duration) will not return any records from these partitions until they have been resumed using resume(Collection). Note that this method does not affect partition subscription. In particular, it does not cause a group rebalance when automatic assignment is used."
Something like this,
List<TopicPartition> topicPartitions = new ArrayList<>();

void scheduleProcess() {
    topicPartitions = ...; // assign the partition info for this consumer
    kafkaConsumer.resume(topicPartitions); // undo the pause() from the previous run
    while (true) {
        ConsumerRecords<String, Object> events = kafkaConsumer.poll(Duration.ofMillis(1000));
        if (!events.isEmpty()) {
            // processing logic
        } else {
            // an empty poll means we have caught up to the last offset
            kafkaConsumer.pause(topicPartitions);
            break;
        }
    }
}
I have a Spring Boot Kafka consumer which consumes data from a topic, stores it in a database, and acknowledges the message once it is stored.
This works fine, but a problem arises if the application fails to get a DB connection after consuming the record. In that case we do not send the acknowledgement, yet the message is never consumed again unless we change the group id and restart the consumer.
My consumer looks like this:
@KafkaListener(id = "${group.id}", topics = {"${kafka.edi.topic}"})
public void onMessage(ConsumerRecord record, Acknowledgment acknowledgment) {
    boolean shouldAcknowledge = false;
    try {
        String tNo = getTrackingNumber((String) record.key());
        log.info("Check Duplicate By Comparing With DB records");
        if (!ediRecordService.isDuplicate(tNo)) { // checks for the record in my DB
            shouldAcknowledge = insertEDIRecord(record, tNo); // returns true on success
        } else {
            log.warn("Duplicate record found.");
            shouldAcknowledge = true;
        }
        if (shouldAcknowledge) {
            acknowledgment.acknowledge();
        }
    } catch (Exception e) {
        log.error("Failed to process record", e);
    }
}
So as you can see in the snippet above, we did not send the acknowledgment in that case.
That is not how Kafka offsets work.
The records in the partitions are each assigned a sequential id number called the offset that uniquely identifies each record within the partition.
From the above statement: say the consumer's first poll returns the message at offset 300, and persisting it to the database fails, so the offset is not committed.
In the next poll the consumer will nevertheless get the next record, at offset 301. If that record is persisted successfully and offset 301 is committed, all records in the partition up to that offset are considered processed, including the failed record at 300.
Solution for this: use a retry mechanism with a limited number of retries until the data is successfully stored in the database, or publish failed records to an error topic and reprocess them later, or save the offsets of failed records somewhere so you can reprocess them afterwards. A sketch of the error-topic variant follows.
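For illustration, here is a minimal sketch of the retry-plus-error-topic approach using Spring Kafka's DefaultErrorHandler (spring-kafka 2.8+); the template bean and back-off values are assumptions, and the listener must rethrow the exception rather than swallow it so the handler sees the failure:

```java
// Sketch only: retry 3 times, 1s apart, then publish the record to <topic>.DLT.
@Bean
public DefaultErrorHandler errorHandler(KafkaTemplate<Object, Object> template) {
    DeadLetterPublishingRecoverer recoverer = new DeadLetterPublishingRecoverer(template);
    return new DefaultErrorHandler(recoverer, new FixedBackOff(1000L, 3L));
}
```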
I have two consumers which need to process messages from the same queue, but only one of them at any time. The sequence I am trying to accomplish is like this:
1. (start) Neither of the consumers has subscribed to the queue.
2. Consumer1 subscribes to the queue.
3. A producer sends messages to the queue; the messages are delivered to consumer1.
4. Consumer1 processes the messages and then unsubscribes after some time.
5. The producer sends more messages to the queue; the messages are stored in the queue (autoDelete=false, so the queue is not destroyed when no consumer is subscribed).
6. Consumer2 subscribes to the queue, processes the stored messages, and unsubscribes after some time.
7. Consumer1 subscribes, processes messages...
... and so on.
This works as expected initially. After step #5 above, I see that further messages from the producer are delivered to both consumers, alternating between them, even though only one has subscribed and the other has unsubscribed.
The code I am using to get this working is like this:
1. Code for the consumer subscribing to the queue
connection = amqp.createConnection({ url: "amqp://guest@localhost:5672" });
connection.on('ready', function() {
    connection.queue(queuename, {autoDelete: false}, function(queue) {
        queue.bind('myexchange', '1');
        queue.subscribe(mycallback).addCallback(function(ok) { qtag = ok.consumerTag; });
    });
});
2. Code for the consumer unsubscribing
queue.unsubscribe(qtag);
queue.on('basicCancelOk', function() {
    // unsubscribed
});
Is there anything wrong with this code, or with the overall approach to achieving the sequence I described earlier?
I have a bunch of threads, each creating an org.apache.qpid.client.AMQConnection and then a session.
public void run() {
    try {
        Connection connection = new AMQConnection(
                "amqp://*******:*****@clientid/test?brokerlist='tcp://********:****?sasl_mechs='ANONYMOUS''");
        connection.start();
        Session ssn = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        System.out.println(ssn.toString());
        ssn.close();
        connection.close();
    } catch (Exception e) {
        e.printStackTrace(); // run() cannot throw checked exceptions
    }
}
On some runs, I get the same Session hashCode() in two different threads, like so:
org.apache.qpid.client.AMQSession_0_10#420e44
org.apache.qpid.client.AMQSession_0_10#d76237
org.apache.qpid.client.AMQSession_0_10#d76237
org.apache.qpid.client.AMQSession_0_10#7148e9
Now, I understand hashCode() is not guaranteed to be unique. How can I prove or disprove that createSession() returns the same Session object on two separate threads?
This turned out to be more of a Java object-identity question than anything to do with Qpid or messaging.
Instead of printing hash codes, I inserted the Session objects themselves into a Vector<Session> and compared them with ==. It turns out they were all unique across all threads. A sketch of the check is below.
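A minimal sketch of that identity check, assuming each worker thread adds its session to a shared Vector before closing it; the names are illustrative:

```java
// Sketch only: Vector is synchronized, so concurrent add() calls are safe.
static final Vector<Session> sessions = new Vector<>();

// inside each thread's run(), before ssn.close():
//     sessions.add(ssn);

// after all threads have finished, compare by reference, not equals():
static void checkForDuplicates() {
    for (int i = 0; i < sessions.size(); i++) {
        for (int j = i + 1; j < sessions.size(); j++) {
            if (sessions.get(i) == sessions.get(j)) { // same object, not just equal
                System.out.println("Same Session object at indices " + i + " and " + j);
            }
        }
    }
}
```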