Spring Integration: TaskExecutor and MaxConcurrentConsumers on AmqpInboundChannelAdapter

My Spring Integration application consumes messages from RabbitMQ, transforms them into SOAP messages, and makes a web service request for each.
It can receive many (10–50) messages per second from the queue,
and after an application restart there could be many thousands of messages waiting in the RabbitMQ queue.
What is the best way to process up to 10 messages in parallel threads? (Message ordering is nice to have but not a required feature; if the web service answers with a business failure, the failed message should be retried until it succeeds.)
The AMQP listener should not consume more messages from the queue than there are free threads available in the task executor.
I could define a TaskExecutor on a channel like this:
@Bean
public AmqpInboundChannelAdapterSMLCSpec amqpInboundChannelAdapter(ConnectionFactory connectionFactory, Queue queue) {
    return Amqp.inboundAdapter(connectionFactory, queue);
}

IntegrationFlow integrationFlow = IntegrationFlows
        .from(amqpInboundChannelAdapter)
        .channel(c -> c.executor(exportFlowsExecutor))
        .transform(businessObjectToSoapRequestTransformer)
        .handle(webServiceOutboundGatewayFactory.getObject())
        .get();
Or is it enough to define a task executor on the AmqpInboundChannelAdapter like this, without defining a channel task executor in the flow definition:
@Bean
public AmqpInboundChannelAdapterSMLCSpec amqpInboundChannelAdapter(ConnectionFactory connectionFactory, Queue queue) {
    return Amqp.inboundAdapter(connectionFactory, queue)
            .configureContainer(c -> c.taskExecutor(taskExecutor));
}
Or maybe define a task executor for a channel as in option 1, but additionally set maxConcurrentConsumers on the channel adapter like this:
@Bean
public AmqpInboundChannelAdapterSMLCSpec amqpInboundChannelAdapter(ConnectionFactory connectionFactory, Queue queue) {
    return Amqp.inboundAdapter(connectionFactory, queue)
            .configureContainer(c -> c.maxConcurrentConsumers(10));
}

The best practice is to configure concurrency on the listener container and let all downstream processing happen on those container threads. This way you get natural back-pressure: no more messages are polled from the queue while the threads are busy. And on the other hand there is no message loss, whereas with an ExecutorChannel after the listener container you free the polling thread and the current message is acked as consumed, but it may still fail downstream.
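As a minimal sketch of that recommendation (the flow bean name exportFlow is a placeholder; the transformer and gateway beans are the question's own), the whole flow stays on the container's listener threads:

@Bean
public IntegrationFlow exportFlow(ConnectionFactory connectionFactory, Queue queue) {
    return IntegrationFlows
            .from(Amqp.inboundAdapter(connectionFactory, queue)
                    // up to 10 container threads; downstream work runs on them,
                    // so no more than 10 messages are in flight at once
                    .configureContainer(c -> c.concurrentConsumers(10)))
            .transform(businessObjectToSoapRequestTransformer)
            .handle(webServiceOutboundGatewayFactory.getObject())
            .get();
}

Because there is no executor channel in between, the message is acked only after the web service call completes on the same listener thread.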

Related

Spring integration messages queue

I have a JMS message-driven endpoint like this:
@Bean
public JmsMessageDrivenEndpoint fsJmsMessageDrivenEndpoint(ConnectionFactory fsConnectionFactory,
        Destination fsInboundDestination,
        MessageConverter fsMessageConverter) {
    return Jms.messageDrivenChannelAdapter(fsConnectionFactory)
            .destination(fsInboundDestination)
            .jmsMessageConverter(fsMessageConverter)
            .outputChannel("fsChannelRouter.input")
            .errorChannel("fsErrorChannel.input")
            .get();
}
So, my question is: will it fetch the next message before the current message has been processed? And if it does, will it keep fetching all the messages in the MQ queue until memory fills up? How can I avoid that?
The JmsMessageDrivenEndpoint is based on the JmsMessageListenerContainer, its threading model, and the MessageListener callback for pulled messages. As long as your MessageListener blocks, the container doesn't go to the next message in the queue. When we build an integration flow starting with a JmsMessageDrivenEndpoint, the flow becomes a MessageListener callback. As long as we process the message downstream in the same thread (DirectChannel by default between endpoints), we don't pull the next message from the JMS queue. If you place a QueueChannel or an ExecutorChannel in between, you shift processing to a different thread. The current one (the JMS listener) gets control back and is ready to pull the next message. In that case your concern about memory is correct. You can still use a QueueChannel with a limited size, or configure your ExecutorChannel with a limited thread pool.
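If you do decide to decouple anyway, the buffer should be bounded so memory stays under control; a minimal sketch (the capacity of 100 and the bean name are arbitrary illustrations):

@Bean
public PollableChannel boundedChannel() {
    // at most 100 messages buffered in memory; senders block when it is full
    return new QueueChannel(100);
}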
In any case, my recommendation is: do not do any thread shifting in the flow when you start from a JMS listener container. It is better to block for the next message and let the current transaction finish its job. That way you won't lose a message when something crashes.

How to Ack/Nack with reactive RabbitListener in Spring AMQP?

I'm using Spring AMQP 2.1.6 to consume messages with a @RabbitListener returning a Mono<Void>. For example:
@RabbitListener
public Mono<Void> myListener(MyMessage myMessage) {
    Mono<Void> mono = myService.doSomething(myMessage);
    return mono;
}
Reading the documentation, it says:
The listener container factory must be configured with AcknowledgeMode.MANUAL so that the consumer thread will not ack the message; instead, the asynchronous completion will ack or nack the message when the async operation completes.
I've thus configured the container factory with AcknowledgeMode.MANUAL, but it's not clear to me whether "the asynchronous completion will ack or nack the message when the async operation completes" means that this is handled by spring-amqp itself, or whether it is something I have to do. That is, do I have to ack/nack the message after the call to myService.doSomething(myMessage), or does Spring AMQP automatically ack it since I'm returning a Mono (even though AcknowledgeMode.MANUAL is set)?
If it is the case that I need to manually send acks or rejects, what is the idiomatic way to do this in a non-blocking manner when using the @RabbitListener?
The listener adapter takes care of the ack when the Mono is completed.
See AbstractAdaptableMessageListener.asyncSuccess() and asyncFailure().
EDIT
I am not a Reactor person but, as far as I can tell, completing a Mono<Void> does nothing, so the on...() methods are never called.
You can ack the delivery manually using channel.basicAck or basicReject...
@RabbitListener(queues = "foo")
public void listen(String in, Channel channel,
        @Header(AmqpHeaders.DELIVERY_TAG) long tag) throws IOException {
    ...
}
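As a hedged illustration of those calls at the end of the listener body (these are the standard RabbitMQ Java client Channel methods the answer refers to):

// positively acknowledge just this delivery (false = not "multiple")
channel.basicAck(tag, false);
// or negatively acknowledge; true requeues it, false discards/dead-letters it
channel.basicReject(tag, true);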

Spring-Kafka Concurrency Property

I am writing my first Kafka consumer using Spring-Kafka. I had a look at the different options provided by the framework and have a few doubts. Can someone please clarify the points below if you have already worked with it?
Question 1: As per the Spring-Kafka documentation, there are 2 ways to implement a Kafka consumer: "You can receive messages by configuring a MessageListenerContainer and providing a message listener or by using the @KafkaListener annotation". When should I choose one option over the other?
Question 2: I have chosen the @KafkaListener approach for my application. For this I need to initialize a container factory instance, and the container factory has an option to control concurrency. I just want to double-check that my understanding of concurrency is correct.
Suppose I have a topic named MyTopic with 4 partitions. To consume messages from MyTopic, I start 2 instances of my application, each with concurrency set to 2. Ideally, as per the Kafka assignment strategy, 2 partitions should go to consumer1 and the other 2 partitions to consumer2. Since the concurrency is set to 2, will each consumer start 2 threads and consume data from the topic in parallel? Also, should we consider anything when consuming in parallel?
Question 3: I have chosen manual ack mode and am not managing the offsets externally (not persisting them to any database/filesystem). So do I need to write custom code to handle rebalances, or will the framework manage it automatically? I think not, as I am acknowledging only after processing all the records.
Question 4: Also, with manual ack mode, which listener will give more performance: the batch message listener or the normal message listener? I guess that if I use the normal message listener, the offsets will be committed after processing each of the messages.
Pasted the code below for your reference.
Batch Acknowledgement Consumer:
public void onMessage(List<ConsumerRecord<String, String>> records, Acknowledgment acknowledgment,
        Consumer<?, ?> consumer) {
    for (ConsumerRecord<String, String> record : records) {
        System.out.println("Record : " + record.value());
        // Process the message here..
        listener.addOffset(record.topic(), record.partition(), record.offset());
    }
    acknowledgment.acknowledge();
}
Initialising container factory:
@Bean
public ConsumerFactory<String, String> consumerFactory() {
    return new DefaultKafkaConsumerFactory<String, String>(consumerConfigs());
}

@Bean
public Map<String, Object> consumerConfigs() {
    Map<String, Object> configs = new HashMap<String, Object>();
    configs.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, bootStrapServer);
    configs.put(ConsumerConfig.GROUP_ID_CONFIG, groupId);
    configs.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, enablAutoCommit);
    configs.put(ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG, maxPolInterval);
    configs.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, autoOffsetReset);
    configs.put(ConsumerConfig.CLIENT_ID_CONFIG, clientId);
    configs.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
    configs.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
    return configs;
}

@Bean
public ConcurrentKafkaListenerContainerFactory<String, String> kafkaListenerContainerFactory() {
    ConcurrentKafkaListenerContainerFactory<String, String> factory = new ConcurrentKafkaListenerContainerFactory<String, String>();
    // Not sure about the impact of this property, so going with 1
    factory.setConcurrency(2);
    factory.setBatchListener(true);
    factory.getContainerProperties().setAckMode(AckMode.MANUAL);
    factory.getContainerProperties().setConsumerRebalanceListener(RebalanceListener.getInstance());
    factory.setConsumerFactory(consumerFactory());
    factory.getContainerProperties().setMessageListener(new BatchAckConsumer());
    return factory;
}
@KafkaListener is a message-driven "POJO" listener; it adds features such as payload conversion, argument matching, and so on. If you implement MessageListener, you can only get the raw ConsumerRecord from Kafka. See @KafkaListener Annotation.
Yes, the concurrency represents the number of threads; each thread creates a Consumer; they run in parallel; in your example, each would get 2 partitions.
Also, should we consider anything when consuming in parallel?
Your listener must be thread-safe (no shared state, or any such state needs to be protected by locks).
It's not clear what you mean by "handle rebalance events". When a rebalance occurs, the framework will commit any pending offsets.
It doesn't make a difference; message listener vs. batch listener is just a preference. Even with a message listener, in MANUAL ack mode the offsets are committed when all the results from the poll have been processed. In MANUAL_IMMEDIATE mode, the offsets are committed one by one.
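For instance, switching the question's factory to per-record commits would be a one-line change (a sketch against the same container properties used in the question's configuration):

// commit each offset as soon as Acknowledgment.acknowledge() is called,
// instead of waiting until the whole poll has been processed
factory.getContainerProperties().setAckMode(AckMode.MANUAL_IMMEDIATE);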
Q1:
From the documentation,
The @KafkaListener annotation is used to designate a bean method as a listener for a listener container. The bean is wrapped in a MessagingMessageListenerAdapter configured with various features, such as converters to convert the data, if necessary, to match the method parameters.
You can configure most attributes on the annotation with SpEL by using #{…} or property placeholders (${…}). See the Javadoc for more information.
This approach can be useful for simple POJO listeners, and you do not need to implement any interfaces. It also lets you listen on any topics and partitions in a declarative way using the annotations. You can also potentially return the value you received, whereas with MessageListener you are bound by the signature of the interface.
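A minimal sketch of the annotation approach (topic and group names are illustrative placeholders):

@KafkaListener(topics = "myTopic", groupId = "myGroup")
public void listen(String payload) {
    // the payload has already been converted to the method parameter type
    System.out.println("Received: " + payload);
}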
Q2:
Ideally, yes, although it gets more complicated if you have multiple topics to consume from. By default Kafka uses the RangeAssignor, which has its own behaviour (you can change this; see the sketch below).
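For example, switching to the RoundRobinAssignor is a standard Kafka consumer property; a sketch of the extra entry in the question's consumerConfigs():

// distribute partitions across consumers round-robin instead of the default RangeAssignor
configs.put(ConsumerConfig.PARTITION_ASSIGNMENT_STRATEGY_CONFIG,
        RoundRobinAssignor.class.getName());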
Q3:
If your consumer dies, there will be a rebalance. If you acknowledge manually and your consumer dies before committing offsets, you do not need to do anything; Kafka handles that. But you could end up with some duplicate messages (at-least-once delivery).
Q4:
It depends on what you mean by "performance". If you mean latency, then consuming each record as fast as possible is the way to go. If you want to achieve high throughput, then batch consumption is more efficient.
I have written some samples using Spring Kafka and various listeners; check out this repo.

StreamListener Overwhelming TaskExecutor

Two questions:
I have a @StreamListener reading from a RabbitMQ channel, and a ThreadPoolTaskExecutor with a pool of 500 threads to process the messages as they are read.
The problem is that the @StreamListener keeps reading messages even when the pool is completely utilized:
Caused by: org.springframework.core.task.TaskRejectedException:
Executor [java.util.concurrent.ThreadPoolExecutor@4c15ce96
[Running, pool size = 500, active threads = 500, queued tasks = 1500,
completed tasks = 1025020]] did not accept task:
org.springframework.cloud.sleuth.instrument.async.SpanContinuingTraceCallable@4dc03919
Is there a way to configure @StreamListener so that it only reads from the queue when it has capacity?
In addition, this error trickles up to an UndeclaredThrowableException. I think it is trying to throw the exception back to RabbitMQ so that the message gets requeued. However, it ends with this:
[WARN] o.s.a.r.l.ConditionalRejectingErrorHandler
Execution of Rabbit message listener failed.
org.springframework.amqp.rabbit.listener.exception.ListenerExecutionFailedException:
Retry Policy Exhausted
The final result is that my message is lost.
Any suggestions for this second issue?
Have you tried a CallerRunsPolicy for your ThreadPoolTaskExecutor? That way the task won't fail with an error; instead, the thread from the SimpleMessageListenerContainer will be kept busy running the task for the just-arrived message. As long as you don't use the maxConcurrentConsumers option, no new concurrent listeners will be started, and the current one (concurrentConsumers = 1 by default) will stay busy, so no new message is pulled from RabbitMQ.
See more info about listener container concurrency in the docs. You may even reconsider your custom ThreadPoolTaskExecutor solution and rely entirely on the built-in mechanism in the framework.
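A sketch of that executor configuration (the pool size comes from the question; the rejection policy is the key line):

@Bean
public ThreadPoolTaskExecutor taskExecutor() {
    ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
    executor.setCorePoolSize(500);
    executor.setMaxPoolSize(500);
    // when the pool is saturated, run the task on the caller's (listener) thread,
    // which keeps the container busy and stops it from pulling the next message
    executor.setRejectedExecutionHandler(new ThreadPoolExecutor.CallerRunsPolicy());
    return executor;
}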
The maxConcurrency option is exposed for the RabbitMQ Binder consumer as well.

Redelivery of JMS messages in microservices

I want to understand JMS message redelivery in a microservices system.
For example, suppose I have 2 instances of a user service, each with a listener on the same destination; that means I have 2 listeners. The listener looks like this:
@JmsListener(destination = "order:new", containerFactory = "orderFactory")
@Transactional
public void create(OrderDTO orderDTO) {
    Order order = new Order(orderDTO);
    orderRepository.save(order);
    jmsTemplate.convertAndSend("order:need_to_pay", order);
}
So my questions are: how many times will a message be delivered? And if there is an error in this method and the message is redelivered, given that I have 2 instances of the service, to which instance will it be delivered?
It's not part of the JMS spec; how many times a message will be delivered depends on the broker configuration. Many brokers can be configured to send the message to a dead-letter queue after some number of attempts.
There is no guarantee the redelivery will go to the same instance.
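As an illustration only (the question does not name a broker), with ActiveMQ the redelivery limit can be set on the client connection factory; broker URL and limit below are hypothetical:

// hypothetical ActiveMQ example; other brokers configure this differently
ActiveMQConnectionFactory cf = new ActiveMQConnectionFactory("tcp://localhost:61616");
RedeliveryPolicy policy = cf.getRedeliveryPolicy();
policy.setMaximumRedeliveries(5); // after 5 failed attempts the message goes to the DLQ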
