I am implementing a Kafka-based application where I would like to manually acknowledge incoming messages. The architecture forces me to do it in a separate thread.
The question is: is it possible and safe to execute Acknowledgement.acknowledge() in a different thread from the consumer's?
Yes it is, as long as you use MANUAL and not MANUAL_IMMEDIATE, but I don't think you'll get what you expect.
Kafka doesn't track each message, just offsets within the partition.
Let's say message 1 arrives and you hand off to another thread. Then message 2 arrives and it is handed off to yet another thread.
When the offset for message 2 is ack'd, you are effectively acking both messages.
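One way to avoid acking message 1 by accident is to commit only up to the highest offset below which everything has finished. The following is a minimal sketch of that bookkeeping for a single partition, in plain Java. The class ContiguousOffsetTracker and its methods are hypothetical names for illustration and are not part of Kafka or Spring Kafka.

```java
import java.util.TreeSet;

// Hypothetical helper (not a Kafka or Spring Kafka class): records which offsets of
// ONE partition have finished processing and reports the offset that is safe to commit.
final class ContiguousOffsetTracker {
    private final TreeSet<Long> completed = new TreeSet<>();
    private long nextToCommit; // first offset not yet known to be processed

    ContiguousOffsetTracker(long firstOffset) {
        this.nextToCommit = firstOffset;
    }

    // May be called from any worker thread when it finishes a record.
    synchronized void markDone(long offset) {
        if (offset < nextToCommit) {
            return; // already covered by an earlier commit position
        }
        completed.add(offset);
        // Advance past every offset that now forms an unbroken run.
        while (completed.remove(nextToCommit)) {
            nextToCommit++;
        }
    }

    // Kafka convention: the committed offset is the offset of the NEXT record to read.
    synchronized long committableOffset() {
        return nextToCommit;
    }
}
```

If message 2 (offset 1) finishes before message 1 (offset 0), committableOffset() stays at 0 until offset 0 is also marked done, and only then moves to 2.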
Related
I am new to Spring Boot @KafkaListener. My application receives almost 200K messages per second on a topic. I want to separate the message listener from the processing of the messages.
How can I use java.util.concurrent.BlockingQueue with @KafkaListener? Can I do it with CompletableFuture?
Any sample code will help more.
I believe you want a consumer with pipelining. It's not uncommon to implement this in a scenario like yours. Why? With KafkaConsumer, decompressing and deserializing can be time-consuming even before you count the time spent on processing. Since all of these operations are stacked behind one thread, it is better to separate the polling from the processing, which can be done with a couple of buffers.
One way to do this: your EventReceiver spins up a thread for the polling. That thread would do the same thing you always do, but instead of firing off the listeners for each event, it would pass the event to a receivedEvents buffer, which could be a BlockingQueue<RecieveEvent>. So in the for loop, you pass each record to the blocking queue. Once the for loop is over, this thread would read from a second buffer, such as a Queue<Map<TopicPartition, OffsetAndMetadata>>, and commit the offsets that the processingThread has successfully processed.
Next, your EventReceiver spins up another thread, the processingThread. This thread would pull records from the buffer, fire the event to all the listeners for this receiver, and then update the queue's state so that the pollingThread can commit.
Why doesn't the processingThread just commit the offsets itself instead of passing them back to the pollingThread? Because KafkaConsumer is not thread-safe: the same thread that calls .poll() should be the one that calls consumer.commitAsync(...), or else you'll get a ConcurrentModificationException.
This approach doesn't work with auto commit enabled.
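The hand-off between the two threads can be sketched with plain java.util.concurrent. This is a simplified stand-in, not a working consumer: a "record" is just an offset, no Kafka client call is made, and PipelineSketch, pollingLoop and processingLoop are hypothetical names. In a real application the polling thread would call consumer.poll(...) where the batch is passed in, and consumer.commitAsync(...) where the offset is added to the committed list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical stand-in: nothing below touches the Kafka client.
final class PipelineSketch {
    // Bounded buffer: put() blocks when the workers fall behind (back-pressure).
    private final BlockingQueue<Long> receivedEvents = new LinkedBlockingQueue<>(1000);
    private final BlockingQueue<Long> processedOffsets = new LinkedBlockingQueue<>();
    final List<Long> committed = new ArrayList<>(); // touched only by the polling thread

    // Polling thread: hand records to the worker, then commit what it reports back.
    void pollingLoop(List<Long> batch) throws InterruptedException {
        for (long offset : batch) {
            receivedEvents.put(offset);
        }
        // Simplification: a real loop would drain without blocking and keep polling.
        for (int remaining = batch.size(); remaining > 0; remaining--) {
            long done = processedOffsets.take();
            committed.add(done + 1); // commit position = next offset to read
        }
    }

    // Processing thread: take a record, run the business logic, report the offset.
    void processingLoop(int count) throws InterruptedException {
        for (int i = 0; i < count; i++) {
            long offset = receivedEvents.take();
            // ... fire listeners / business logic here ...
            processedOffsets.put(offset);
        }
    }

    // Runs one batch through both threads and returns the "committed" positions.
    static List<Long> runDemo() {
        PipelineSketch p = new PipelineSketch();
        List<Long> batch = Arrays.asList(0L, 1L, 2L);
        Thread worker = new Thread(() -> {
            try {
                p.processingLoop(batch.size());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.start();
        try {
            p.pollingLoop(batch);
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return p.committed;
    }
}
```

With a single processing thread the offsets come back in order. With several workers they can come back out of order, so the polling thread would have to commit only up to the highest contiguous processed offset.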
As for how to do this with Spring Kafka, I'm not completely sure. However, I do know that Spring Kafka separates the EventReceiver from the EventListener (@KafkaListener), which separates the low-level Kafka work from the business logic. In theory you'd have to tune their implementation, but I think implementing this without the Spring Kafka library would be easier.
Is there any way to confirm a JMS message in a subprocess?
I have process A that starts with a JMS Queue Receiver (or JMS Topic Subscriber). It calls process B which has to confirm the message received - I'm using Tibco EMS Explicit acknowledge mode.
This will allow me to reuse some parts. Is it possible to do it?
I'm afraid this is not possible. The confirm always has to be in the same process as the receiver.
In a well-designed architecture you do not want to split the messaging (and confirm) layer but rather push all functional processing into a sub-process having a result parameter indicating if the initial message shall be kept (defer processing to a later time by not confirming) or mark it as "processed" (and confirm it).
By default, all (JMS) messages are auto-confirmed, so an explicit confirm is a design choice you make (based on a particular consumption model) in the config tab of the process starter/step. You should only use it if you know what happens with that message and if deferring the processing is possible. Most loosely coupled messaging is not "transactional" in a DB sense (unless you decide to go the extra mile), so stick to auto-confirm if you have no special handling demand! BW/EMS are quite good at handling (reasonably small) messages, so NOT auto-confirming can create re-deliveries within milliseconds and flood your whole system (heap space) if not handled properly.
I know that JMS messages are immutable. But I have a task to solve which requires rewriting a message in the queue by entity id. Maybe there is a problem with the system design; please help me.
App A sends message (with entity id = 1) to JMS. App B checks for new messages every minute.
App A might send many messages with entity id = 1 in a minute, but App B should see just the last one.
Is it possible?
App A should work as fast as possible, so I don't like the idea of performing removeMatchingMessages(String selector) before each new message push.
IMO the approach is flawed.
Even if you did accept clearing off the queue, using a message selector to remove all messages where entity id = 1 before writing the new message, timing becomes an issue: whichever process writes the outdated messages may need to complete before the new message is written, which requires some level of synchronization.
The other solution I can think of is reading all messages before processing them. Every minute, the thread takes the messages and bucketizes them. An earlier entity id = 1 message would be replaced by a later one, so that at the end you have a unique set of messages to process. Then you process them. Of course now you might have too many messages in memory at once, and transactionality gets thrown out the window, but it might achieve what you want.
In this case you could actually be reading the messages as they come in and bucketizing them, and once a minute just run your processing logic. Make sure you synchronize your buckets so they aren't changed out from under you as new messages come in.
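The bucketizing idea can be sketched as a small synchronized buffer that keeps only the latest payload per entity id. This is an in-memory illustration only. LatestByEntityBuffer and its method names are hypothetical, the payload is assumed to be a String, and the JMS listener and the once-a-minute scheduler that would call it are left out.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical buffer: keeps only the most recent payload per entity id.
final class LatestByEntityBuffer {
    private Map<Long, String> latest = new HashMap<>();

    // Called from the message listener as each message arrives.
    synchronized void onMessage(long entityId, String payload) {
        latest.put(entityId, payload); // a later message overwrites an earlier one
    }

    // Called once a minute by the processing job: swaps in a fresh map and
    // returns the old one, so new arrivals cannot change the returned snapshot.
    synchronized Map<Long, String> drain() {
        Map<Long, String> snapshot = latest;
        latest = new HashMap<>();
        return snapshot;
    }
}
```

Because messages are acknowledged before they are processed, anything still in the buffer is lost if the application stops, which is the loss of transactionality mentioned above.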
But overall, I'm not sure it's going to work.
Say I have one JMS message FooCompleted
{"businessId": 1,"timestamp": "20140101 01:01:01.000"}
and another JMS message BazCompleted
{"businessId": 1,"timestamp": "20140101 01:02:02.000"}
The use case is that I want some action triggered when both messages have been received for the business id in question - essentially a join point of reception of the two messages. The two messages are published on two different queues and order between reception of FooCompleted and BazCompleted may change. In reality, I may need to have join of reception of several different messages for the businessId in question.
The naive approach was to store the reception of each message in a DB, check whether the messages on its dependent join arm(s) have been received, and only then kick off the desired action. Given that the problem seems generic enough, we were wondering if there is a better way to solve this.
Another thought was to move messages from these two queues into a third queue on reception. The listener on this third queue would use a special variant of DefaultMessageListenerContainer that overrides doReceiveAndExecute to call receiveMessage for all outstanding messages in the queue, adding back to the queue any message whose dependent messages have not all arrived yet; the remaining ones would be acknowledged and hence removed. Given that the volume of messages will be low, repeatedly probing the queue and re-adding messages should not be a problem. The advantage would be avoiding the DB dependency and the associated scaffolding code. I wanted to see if there is anything glaringly bad with this.
Gurus, please critique and point out better ways to achieve this.
Thanks in advance!
Spring Integration with a JMS message-driven adapter, an aggregator with custom correlation and release strategies, and a persistent (JDBC) message store will provide your first solution without writing much (or any) code.
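To make the correlation and release idea concrete, here is a minimal in-memory sketch in plain Java. It does not use the Spring Integration API. CompletionJoin is a hypothetical class that correlates on businessId and releases once every required message type has arrived. State is kept in memory, so unlike a persistent message store it is lost on restart.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical join point: correlates by businessId, releases when all required
// message types (e.g. "FooCompleted", "BazCompleted") have been seen.
final class CompletionJoin {
    private final Set<String> required;
    private final Map<Long, Set<String>> seen = new HashMap<>();

    CompletionJoin(Set<String> requiredTypes) {
        this.required = new HashSet<>(requiredTypes);
    }

    // Returns true when this message completes the set for its businessId,
    // regardless of the order in which the messages arrived.
    synchronized boolean onMessage(long businessId, String type) {
        if (!required.contains(type)) {
            return false; // not part of this join
        }
        Set<String> arrived = seen.computeIfAbsent(businessId, id -> new HashSet<>());
        arrived.add(type);
        if (arrived.containsAll(required)) {
            seen.remove(businessId); // clear state so the group is released once
            return true;
        }
        return false;
    }
}
```

The caller would trigger the desired action whenever onMessage returns true. Joining more message types only means passing a larger required set.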
I send a Java object to a queue from a thread. The relevant MDB's onMessage is invoked with a message from the queue. In onMessage, I match a key present in the message against a key in a cache; if the key is not present, I throw a custom RuntimeException just to make the container redeliver this message. (I have another autonomous system that adds the key to the cache from the external system's response; it may lag by 3-5 seconds.)
In such a case, does the container add this unprocessed message to the end of the queue, or is it redelivered immediately? Is there a way to delay the redelivery, assuming the queue is always filled with ~550 messages every second?
regards
There's currently a redelivery delay feature in HornetQ, but all the subsequent messages are still delivered normally in the meantime.
There's a feature request in place to hold the queue for some time if a redelivery happens but that has not been implemented yet.
But if you have multiple consumers on the queue, the order will be spread across your consumers anyway. You could use message grouping and add a sleep in your onMessage if deliveryCount > 1. The message grouping is there to guarantee that no other consumer (or another MDB instance) will receive the messages out of order.
Depending on how your application is built, and depending on your requirements, you may want to allow only a single instance of your MDB.
Also, look at consumer-window-size, where you can select no buffering on the client; this behaves better when you have multiple consumers or multiple MDB instances.