Configuring a Dedicated Listener Container for each Queue using Spring AMQP Java Configuration

I have listeners configured in XML like this:
<rabbit:listener-container connection-factory="connectionFactory" concurrency="1" acknowledge="manual">
    <rabbit:listener ref="messageListener" queue-names="${address.queue.s1}" exclusive="true"/>
    <rabbit:listener ref="messageListener" queue-names="${address.queue.s2}" exclusive="true"/>
    <rabbit:listener ref="messageListener" queue-names="${address.queue.s3}" exclusive="true"/>
    <rabbit:listener ref="messageListener" queue-names="${address.queue.s4}" exclusive="true"/>
    <rabbit:listener ref="messageListener" queue-names="${address.queue.s5}" exclusive="true"/>
    <rabbit:listener ref="messageListener" queue-names="${address.queue.s6}" exclusive="true"/>
</rabbit:listener-container>
I am trying to move that to Java Configuration and I don't see a way to add more than one MessageListener to a ListenerContainer. Creating multiple ListenerContainer beans is not an option in my case because I would not know the number of queues to consume from until runtime. Queue names will come from a configuration file.
I did the following
@PostConstruct
public void init()
{
    for (String queue : queues.split(","))
    {
        // The Consumers would not connect if I don't call the 'start()' method.
        messageListenerContainer(queue).start();
    }
}

@Bean
public SimpleMessageListenerContainer messageListenerContainer(String queue)
{
    SimpleMessageListenerContainer container = new SimpleMessageListenerContainer(consumerConnectionFactory);
    container.setQueueNames(queue);
    container.setMessageListener(messageListener());
    // Set Exclusive Consumer 'ON'
    container.setExclusive(true);
    // Should be restricted to '1' to maintain data consistency.
    container.setConcurrentConsumers(1);
    container.setAcknowledgeMode(AcknowledgeMode.MANUAL);
    return container;
}
It "sort of" works, BUT I see some weird behavior with lots of ghost channels getting opened, which never happened with the XML configuration. That makes me suspect I am doing something wrong. What is the correct way of creating MessageListenerContainers in Java configuration? Simply put, "How does Spring convert 'rabbit:listener-container' with multiple 'rabbit:listener' to Java objects properly?" Any help/insight into this would be greatly appreciated.
Business Case
We have a Publisher that publishes User Profile Updates. The publisher could dispatch multiple updates for the same user, and we have to process them in the correct order to maintain data integrity in the data store.
Example : User : ABC, Publish -> {UsrA:Change1,...., UsrA:Change 2,....,UsrA:Change 3} -> Consumer HAS to process {UsrA:Change1,...., UsrA:Change 2,....,UsrA:Change 3} in that order.
In our previous setup, we had 1 Queue that got all the User Updates and we had a consumer app with concurrency = 5. There were multiple app servers running the consumer app. That resulted in 5 * 'Number of instances of the consumer app' channels/threads that could process the incoming messages. The speed was GREAT! But we were quite often processing messages out of order, resulting in data corruption.
To maintain strict FIFO order and still process messages in parallel as much as possible, we implemented queue sharding. We have an "x-consistent-hash" exchange with a hash-header on employee-id. Our Publisher publishes messages to the hash exchange and we have multiple sharded queues bound to the hash exchange. The idea is that all changes for a given user (User A for example) are queued up in the same shard. We then have our consumers connect to the sharded queues in 'Exclusive' mode with 'ConcurrentConsumers = 1' and process the messages. That way we are sure to process messages in the correct order while still processing messages in parallel. We could make it more parallel by increasing the number of shards.
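To illustrate why routing on a hash of the employee id preserves per-user ordering, here is a minimal plain-Java sketch. It is not the algorithm used by RabbitMQ's x-consistent-hash exchange; the class name ShardRouter, the modulo-based hashing and the queue names are assumptions made purely for illustration.

```java
import java.util.List;

// Illustrative only: maps a routing key to one of N shard queues deterministically.
final class ShardRouter {
    private final List<String> shardQueues;

    ShardRouter(List<String> shardQueues) {
        if (shardQueues.isEmpty()) {
            throw new IllegalArgumentException("at least one shard queue is required");
        }
        this.shardQueues = List.copyOf(shardQueues);
    }

    // The same employee id always yields the same queue, so all updates for
    // one user sit in a single FIFO queue read by a single consumer.
    String queueFor(String employeeId) {
        int index = Math.floorMod(employeeId.hashCode(), shardQueues.size());
        return shardQueues.get(index);
    }

    public static void main(String[] args) {
        ShardRouter router = new ShardRouter(
                List.of("user.queue.s1", "user.queue.s2", "user.queue.s3"));
        for (String change : List.of("Change1", "Change2", "Change3")) {
            System.out.println("UsrA:" + change + " -> " + router.queueFor("UsrA"));
        }
    }
}
```

All three changes for UsrA print the same queue name, which is the property the exclusive, single-consumer setup relies on.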
Now on to the consumer configuration
We have the consumer app deployed on multiple app servers.
Original Approach:
I simply added multiple 'rabbit:listener' to my 'rabbit:listener-container' in my consumer app as you can see above, and it works great, except that the server that starts first gets an exclusive lock on all the sharded queues and the other servers just sit there doing no work.
New Approach:
We moved the sharded queue names to the application configuration file. Like so
Consumer Instance 1 : Properties
queues=user.queue.s1,user.queue.s2,user.queue.s3
Consumer Instance 2 : Properties
queues=user.queue.s4,user.queue.s5,user.queue.s6
Also worth noting, we could have Any number of Consumer instances and the shards could be distributed unevenly between instances depending on resource availability.
With the queue names moved to the configuration file, the XML configuration will no longer work because we cannot dynamically add 'rabbit:listener' to our 'rabbit:listener-container' like we did before.
Then we decided to switch over to the Java Configuration. That is where we are STUCK!
We did this initially
@Bean
public SimpleMessageListenerContainer messageListenerContainer()
{
    SimpleMessageListenerContainer container = new SimpleMessageListenerContainer(consumerConnectionFactory);
    container.setQueueNames(queues.split(","));
    container.setMessageListener(messageListener());
    container.setMissingQueuesFatal(false);
    // Set Exclusive Consumer 'ON'
    container.setExclusive(true);
    // Should be restricted to '1' to maintain data consistency.
    container.setConcurrentConsumers(1);
    container.setAcknowledgeMode(AcknowledgeMode.MANUAL);
    container.start();
    return container;
}
and it works BUT all our queues are on one connection sharing 1 channel. That is NOT good for speed. What we want is One connection and every queue gets its own channel.
Next Step
No success here YET! The Java configuration in my original question is where we are at now.
I am baffled why this is so HARD to do. Clearly the XML configuration does something that is NOT easily doable in Java configuration (or at least it feels that way to me). I see this as a gap that needs to be filled, unless I am completely missing something. Please correct me if I am wrong. This is a genuine business case, NOT some fictitious edge case. Please feel free to comment if you think otherwise.

"and it works BUT all our queues are on one connection sharing 1 channel. That is NOT good for speed. What we want is One connection and every queue gets its own channel."
If you switch to the DirectMessageListenerContainer, each queue in that configuration gets its own Channel.
See the documentation.
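As a rough configuration sketch of that suggestion (not taken from the answer itself), the container from the question could be declared as a DirectMessageListenerContainer. It assumes Spring AMQP 2.0 or later, the comma-separated queues property from the question, and injectable ConnectionFactory and MessageListener beans; the class and bean names are hypothetical.

```java
import org.springframework.amqp.core.AcknowledgeMode;
import org.springframework.amqp.core.MessageListener;
import org.springframework.amqp.rabbit.connection.ConnectionFactory;
import org.springframework.amqp.rabbit.listener.DirectMessageListenerContainer;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class ShardConsumerConfig {

    // Assumed property, e.g. queues=user.queue.s1,user.queue.s2,user.queue.s3
    @Value("${queues}")
    private String queues;

    @Bean
    public DirectMessageListenerContainer shardListenerContainer(
            ConnectionFactory consumerConnectionFactory,
            MessageListener messageListener) {
        DirectMessageListenerContainer container =
                new DirectMessageListenerContainer(consumerConnectionFactory);
        // One consumer per configured queue; the queue list is only known at runtime.
        container.setQueueNames(queues.split(","));
        container.setConsumersPerQueue(1);
        // Exclusive consumers require consumersPerQueue to be 1.
        container.setExclusive(true);
        container.setAcknowledgeMode(AcknowledgeMode.MANUAL);
        container.setMessageListener(messageListener);
        return container;
    }
}
```

A container declared as a bean is normally started by the application context, so an explicit start() call inside the bean method should not be needed.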
To answer your original question (pre-edit):
@Bean
public SimpleMessageListenerContainer messageListenerContainer1(@Value("${address.queue.s1}") String queue)
{
    SimpleMessageListenerContainer container = new SimpleMessageListenerContainer(consumerConnectionFactory);
    container.setQueueNames(queue);
    container.setMessageListener(messageListener());
    // Set Exclusive Consumer 'ON'
    container.setExclusive(true);
    // Should be restricted to '1' to maintain data consistency.
    container.setConcurrentConsumers(1);
    container.setAcknowledgeMode(AcknowledgeMode.MANUAL);
    return container;
}
...
@Bean
public SimpleMessageListenerContainer messageListenerContainer6(@Value("${address.queue.s6}") String queue)
{
    SimpleMessageListenerContainer container = new SimpleMessageListenerContainer(consumerConnectionFactory);
    container.setQueueNames(queue);
    container.setMessageListener(messageListener());
    // Set Exclusive Consumer 'ON'
    container.setExclusive(true);
    // Should be restricted to '1' to maintain data consistency.
    container.setConcurrentConsumers(1);
    container.setAcknowledgeMode(AcknowledgeMode.MANUAL);
    return container;
}

Here is the Java configuration for creating a SimpleMessageListenerContainer:
@Value("#{'${queue.names}'.split(',')}")
private String[] queueNames;

@Bean
public SimpleMessageListenerContainer listenerContainer(final ConnectionFactory connectionFactory) {
    final SimpleMessageListenerContainer container = new SimpleMessageListenerContainer();
    container.setConnectionFactory(connectionFactory);
    container.setQueueNames(queueNames);
    container.setMessageListener(vehiclesReceiver());
    setCommonQueueProperties(container);
    return container;
}

Each <rabbit:listener> creates its own SimpleMessageListenerContainer bean with the same ConnectionFactory. To do something similar in Java config, you have to declare as many SimpleMessageListenerContainer beans as you have queues: one for each of them.
You may also consider using the @RabbitListener approach instead: https://docs.spring.io/spring-amqp/docs/2.0.4.RELEASE/reference/html/_reference.html#async-annotation-driven

Related

Way to determine Kafka Topic for @KafkaListener on application startup?

We have 5 topics and we want to have a service that scales for example to 5 instances of the same app.
This would mean that I would want to dynamically (for example via Redis locking or a similar mechanism) determine which instance should listen to which topic.
I know that we could have 1 topic that has 5 partitions, and each node in the same consumer group would pick up a partition. Also, if we have a separately deployed service, we can set the topic via properties.
The issue is that those two options are not suitable for our situation, and we want to see if it is possible to do it the way I explained above.
@PostConstruct
private void postConstruct() {
    // Do logic via redis locking or something to determine topic
    dynamicallyDeterminedVariable = // SOME LOGIC
}

@KafkaListener(topics = "{dynamicallyDeterminedVariable}")
void listener(String data) {
    LOG.info(data);
}
Yes, you can use SpEL for the topic name:
#{@someOtherBean.whichTopicToUse()}

Spring-Kafka Concurrency Property

I am progressing on writing my first Kafka Consumer by using Spring-Kafka. Had a look at the different options provided by framework, and have few doubts on the same. Can someone please clarify below if you have already worked on it.
Question - 1 : As per the Spring-Kafka documentation, there are 2 ways to implement a Kafka consumer: "You can receive messages by configuring a MessageListenerContainer and providing a message listener or by using the @KafkaListener annotation". Can someone tell me when I should choose one option over the other?
Question - 2 : I have chosen KafkaListener approach for writing my application. For this I need to initialize a container factory instance and inside container factory there is option to control concurrency. Just want to double check if my understanding about concurrency is correct or not.
Suppose I have a topic named MyTopic which has 4 partitions. To consume messages from MyTopic, I've started 2 instances of my application, each with concurrency set to 2. So, ideally, as per the Kafka assignment strategy, 2 partitions should go to consumer1 and the 2 other partitions should go to consumer2. Since the concurrency is set to 2, will each of the consumers start 2 threads and consume data from the topic in parallel? Also should we consider anything if we are consuming in parallel.
Question 3 - I have chosen manual ack mode, and not managing the offsets externally (not persisting it to any database/filesystem). So should I need to write custom code to handle rebalance, or framework will manage it automatically ? I think no as I am acknowledging only after processing all the records.
Question - 4 : Also, with Manual ACK mode, which Listener will give more performance? BATCH Message Listener or normal Message Listener. I guess if I use Normal Message listener, the offsets will be committed after processing each of the messages.
Pasted the code below for your reference.
Batch Acknowledgement Consumer:
public void onMessage(List<ConsumerRecord<String, String>> records, Acknowledgment acknowledgment,
        Consumer<?, ?> consumer) {
    for (ConsumerRecord<String, String> record : records) {
        System.out.println("Record : " + record.value());
        // Process the message here..
        listener.addOffset(record.topic(), record.partition(), record.offset());
    }
    acknowledgment.acknowledge();
}
Initialising container factory:
@Bean
public ConsumerFactory<String, String> consumerFactory() {
    return new DefaultKafkaConsumerFactory<String, String>(consumerConfigs());
}

@Bean
public Map<String, Object> consumerConfigs() {
    Map<String, Object> configs = new HashMap<String, Object>();
    configs.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, bootStrapServer);
    configs.put(ConsumerConfig.GROUP_ID_CONFIG, groupId);
    configs.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, enablAutoCommit);
    configs.put(ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG, maxPolInterval);
    configs.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, autoOffsetReset);
    configs.put(ConsumerConfig.CLIENT_ID_CONFIG, clientId);
    configs.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
    configs.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
    return configs;
}

@Bean
public ConcurrentKafkaListenerContainerFactory<String, String> kafkaListenerContainerFactory() {
    ConcurrentKafkaListenerContainerFactory<String, String> factory = new ConcurrentKafkaListenerContainerFactory<String, String>();
    // Not sure about the impact of this property, so going with 2
    factory.setConcurrency(2);
    factory.setBatchListener(true);
    factory.getContainerProperties().setAckMode(AckMode.MANUAL);
    factory.getContainerProperties().setConsumerRebalanceListener(RebalanceListener.getInstance());
    factory.setConsumerFactory(consumerFactory());
    factory.getContainerProperties().setMessageListener(new BatchAckConsumer());
    return factory;
}
1. @KafkaListener is a message-driven "POJO"; it adds features such as payload conversion and argument matching. If you implement MessageListener, you only get the raw ConsumerRecord from Kafka. See @KafkaListener Annotation.
2. Yes, the concurrency represents the number of threads; each thread creates a Consumer and they run in parallel. In your example, each instance would get 2 partitions (one per consumer thread).
"Also should we consider anything if we are consuming in parallel."
Your listener must be thread-safe (no shared state, or any such state needs to be protected by locks).
3. It's not clear what you mean by "handle rebalance events". When a rebalance occurs, the framework will commit any pending offsets.
4. It doesn't make a difference; message listener vs. batch listener is just a preference. Even with a message listener, with MANUAL ack mode, the offsets are committed when all the results from the poll have been processed. With MANUAL_IMMEDIATE mode, the offsets are committed one-by-one.
Q1:
From the documentation,
The @KafkaListener annotation is used to designate a bean method as a
listener for a listener container. The bean is wrapped in a
MessagingMessageListenerAdapter configured with various features, such
as converters to convert the data, if necessary, to match the method
parameters.
You can configure most attributes on the annotation with SpEL by using
#{…} or property placeholders (${…}). See the Javadoc for more information.
This approach can be useful for simple POJO listeners, and you do not need to implement any interfaces. You can also listen on any topics and partitions in a declarative way using the annotations. You can also potentially return the value you received, whereas in the case of MessageListener, you are bound by the signature of the interface.
Q2:
Ideally yes. If you have multiple topics to consume from, it gets more complicated though. Kafka by default uses RangeAssignor which has its own behaviour (you can change this -- see more details under).
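To make the range-style assignment concrete, here is a simplified plain-Java model of how partitions of a single topic are split across sorted consumers. It is a sketch of the idea only, not Kafka's RangeAssignor class; the consumer names and the class name RangeAssignmentSketch are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified model: contiguous ranges of one topic's partitions per consumer.
final class RangeAssignmentSketch {

    static Map<String, List<Integer>> assign(List<String> consumers, int partitionCount) {
        List<String> sorted = new ArrayList<>(consumers);
        Collections.sort(sorted);
        int perConsumer = partitionCount / sorted.size();
        int withExtra = partitionCount % sorted.size();

        Map<String, List<Integer>> result = new LinkedHashMap<>();
        int next = 0;
        for (int i = 0; i < sorted.size(); i++) {
            // The first 'withExtra' consumers receive one additional partition.
            int count = perConsumer + (i < withExtra ? 1 : 0);
            List<Integer> partitions = new ArrayList<>();
            for (int p = 0; p < count; p++) {
                partitions.add(next++);
            }
            result.put(sorted.get(i), partitions);
        }
        return result;
    }

    public static void main(String[] args) {
        // 2 application instances x concurrency 2 = 4 consumers in the group.
        List<String> consumers = List.of("app1-0", "app1-1", "app2-0", "app2-1");
        System.out.println(assign(consumers, 4));
        // A topic with a single partition leaves three of the consumers idle.
        System.out.println(assign(consumers, 1));
    }
}
```

Because the split is computed per topic, subscribing to several small topics tends to load the consumers that sort first, which is one reason the multi-topic case gets more complicated.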
Q3:
If your consumer dies, there will be a rebalance. If you acknowledge manually and your consumer dies before committing offsets, you do not need to do anything; Kafka handles that. But you could end up with some duplicate messages (at-least-once delivery).
Q4:
It depends what you mean by "performance". If you meant latency, then consuming each record as fast as possible will be the way to go. If you want to achieve high throughput, then batch consumption is more efficient.
I had written some samples using Spring kafka and various listeners - check out this repo

Kafka Producer Thread, huge amount of threads even when no message is sent

I profiled my Kafka producer Spring Boot application and found many "kafka-producer-network-thread"s running (47 in total), which never stop running, even when no data is being sent.
My application looks a bit like this:
var kafkaSender = KafkaSender(kafkaTemplate, applicationProperties)
kafkaSender.sendToKafka(json, rs.getString("KEY"))
with the KafkaSender:
@Service
class KafkaSender(val kafkaTemplate: KafkaTemplate<String, String>, val applicationProperties: ApplicationProperties) {

    @Transactional(transactionManager = "kafkaTransactionManager")
    fun sendToKafka(message: String, stringKey: String) {
        kafkaTemplate.executeInTransaction { kt ->
            kt.send(applicationProperties.kafka.topic, System.currentTimeMillis().mod(10).toInt(), System.currentTimeMillis().rem(10).toString(),
                    message)
        }
    }

    companion object {
        val log = LoggerFactory.getLogger(KafkaSender::class.java)!!
    }
}
Since each time I want to send a message to Kafka I instantiate a new KafkaSender, I thought a new thread would be created which then sends the message to the kafka queue.
Currently it looks like a pool of producers is generated, but never cleaned up, even when none of them has anything to do.
Is this behaviour intended?
In my opinion the behaviour should be nearly the same as datasource pooling, keep the thread alive for some time, but when there is nothing to do, clear it up.
When using transactions, the producer cache grows on demand and is not reduced.
If you are producing messages on a listener container (consumer) thread; there is a producer for each topic/partition/consumer group. This is required to solve the zombie fencing problem, so that if a rebalance occurs and the partition moves to a different instance, the transaction id will remain the same so the broker can properly handle the situation.
If you don't care about the zombie fencing problem (and you can handle duplicate deliveries), set the producerPerConsumerPartition property to false on the DefaultKafkaProducerFactory and the number of producers will be much smaller.
EDIT
Starting with version 2.8 the default EOSMode is now V2 (aka BETA); which means it is no longer necessary to have a producer per topic/partition/group - as long as the broker version is 2.5 or later.

Spring AMQP multiple consumers vs higher prefetch value

Even after reading plenty of SO questions (1,2) and articles, it is unclear to me which is the better option to set for consumers: multiple consumers or a higher prefetch value?
From what I understand, SimpleRabbitListenerContainerFactory was initially designed to address a limitation that the amqp-client had only one thread per connection. Does that mean that setting multiple consumers won't make much difference, as there is only one thread that actually consumes from Rabbit and then hands messages off to the multiple consumers (threads)?
Or there are actually several consumers consuming at the same time?
So what is the best practice when it comes to spring implementation of rabbit concerning prefetch/consumers? When should one be used over the other? And should I switch to this new DirectRabbitListenerContainerFactory? Is it 'better' or just depends on the use case?
One downside I see with a high prefetch is that it may cause memory issues if an app consumes more messages than it can hold in the buffer (I haven't actually tested this yet, tbh).
And when it comes to multiple consumers, I see the downside of having more file descriptors open at the OS level, and I saw this article saying that each consumer actually pings Rabbit for each ack, which makes it slower.
FYI, if it is relevant, I usually have my config set up like this:
@Bean
public ConnectionFactory connectionFactory() {
    final CachingConnectionFactory connectionFactory = new CachingConnectionFactory(server);
    connectionFactory.setUsername(username);
    connectionFactory.setPassword(password);
    connectionFactory.setVirtualHost(virtualHost);
    connectionFactory.setRequestedHeartBeat(requestedHeartBeat);
    return connectionFactory;
}

@Bean
public AmqpAdmin amqpAdmin() {
    AmqpAdmin admin = new RabbitAdmin(connectionFactory());
    admin.declareQueue(getRabbitQueue());
    return admin;
}

@Bean
public SimpleRabbitListenerContainerFactory rabbitListenerContainerFactory() {
    final SimpleRabbitListenerContainerFactory factory = new SimpleRabbitListenerContainerFactory();
    factory.setConnectionFactory(connectionFactory());
    factory.setConcurrentConsumers(concurrency);
    factory.setMaxConcurrentConsumers(maxConcurrency);
    factory.setPrefetchCount(prefetch);
    factory.setMissingQueuesFatal(false);
    return factory;
}

@Bean
public Queue getRabbitQueue() {
    final Map<String, Object> p = new HashMap<String, Object>();
    p.put("x-max-priority", 10);
    return new Queue(queueName, true, false, false, p);
}
No; the SMLC (SimpleMessageListenerContainer) wasn't "designed for one thread per connection". It was designed to address a limitation that the amqp-client only had one thread per connection, so that thread handed off to consumer threads via an in-memory queue. That is no longer the case: the client is multi-threaded and there is one dedicated thread per consumer.
Having multiple consumers (increasing the concurrency) is completely effective (and was, even with the older client).
Prefetch is really to reduce network chatter and improve overall throughput. Whether you need to increase concurrency really is orthogonal to prefetch. You would typically increase concurrency if (a) your listener is relatively slow to process each message and (b) strict message ordering is not important.
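As a back-of-the-envelope illustration of how the two settings interact for the memory concern raised in the question, the sketch below computes an upper bound on unacknowledged messages held by one container. It assumes a single queue and that the prefetch limit applies to each consumer separately; the class name and the sample numbers are illustrative only.

```java
// Rough upper bound on in-flight (delivered but unacknowledged) messages.
final class PrefetchBudget {

    static long maxUnacked(int concurrentConsumers, int prefetchCount) {
        // Each consumer may hold up to prefetchCount unacknowledged messages.
        return (long) concurrentConsumers * prefetchCount;
    }

    public static void main(String[] args) {
        System.out.println(maxUnacked(5, 250)); // prints 1250
    }
}
```

Multiplying this bound by a typical message size gives a first estimate of the buffer memory a given concurrency and prefetch combination can require.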
The DirectMessageListenerContainer (DMLC) was introduced to provide a different threading model, where the listener is invoked directly on the amqp-client thread.
The reasons for choosing one container over the other are described in Choosing a Container.
The following features are available with the SMLC, but not the DMLC:
txSize - with the SMLC, you can set this to control how many messages are delivered in a transaction and/or to reduce the number of acks, but it may cause the number of duplicate deliveries to increase after a failure. (The DMLC does have messagesPerAck which can be used to reduce the acks, the same as with txSize and the SMLC, but it can’t be used with transactions - each message is delivered and ack’d in a separate transaction).
maxConcurrentConsumers and consumer scaling intervals/triggers - there is no auto-scaling in the DMLC; it does, however, allow you to programmatically change the consumersPerQueue property and the consumers will be adjusted accordingly.
However, the DMLC has the following benefits over the SMLC:
Adding and removing queues at runtime is more efficient; with the SMLC, the entire consumer thread is restarted (all consumers canceled and re-created); with the DMLC, unaffected consumers are not canceled.
The context switch between the RabbitMQ Client thread and the consumer thread is avoided.
Threads are shared across consumers rather than having a dedicated thread for each consumer in the SMLC. However, see the IMPORTANT note about the connection factory configuration in the section called “Threading and Asynchronous Consumers”.

How to have dynamic selector with DefaultMessageListenerContainer (in spring-boot)?

I have a spring-boot application with ActiveMQ JMS. I have a queue in the application which will get messages with a string property say color. Value of color can be red, green or blue. Application has a Rest service where it will get list of color(s) { one or more } which should be used as a SELECTOR when listening for the messages on the queue. Over the lifetime of the application, this might change, so the value of SELECTOR can look like
"color='red'", "color='blue' OR color='red'" or "color='green'".
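For reference, a selector string of that shape could be built from the list of colors along these lines. This is a plain-Java sketch; the class name ColorSelector is hypothetical, and embedded single quotes are doubled, which is how JMS selector string literals represent a quote.

```java
import java.util.List;
import java.util.stream.Collectors;

// Builds "color='a' OR color='b'" from a non-empty list of color values.
final class ColorSelector {

    static String build(List<String> colors) {
        if (colors.isEmpty()) {
            throw new IllegalArgumentException("at least one color is required");
        }
        return colors.stream()
                .map(c -> "color='" + c.replace("'", "''") + "'")
                .collect(Collectors.joining(" OR "));
    }

    public static void main(String[] args) {
        System.out.println(build(List.of("blue", "red"))); // prints color='blue' OR color='red'
    }
}
```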
@Bean
MessageListenerAdapter adapter() {
    return new MessageListenerAdapter(new Object() {
        // message handler
    });
}

@Bean
DefaultMessageListenerContainer container(ConnectionFactory cf) throws Exception {
    DefaultMessageListenerContainer c = new DefaultMessageListenerContainer();
    c.setMessageListener(adapter());
    c.setConcurrency(this.concurrency);
    c.setMessageSelector(this.selector);
    c.setConnectionFactory(cf);
    c.setDestinationName(this.q);
    return c;
}
I was planning to use the above code to achieve this. It works fine with the initial selector; however, when the selector needs to change, the following code does not work.
c.stop();
// modify value of selector
c.setMessageSelector(this.selector);
c.start();
It looks like I have a working solution. I put @Scope("prototype") on top of the container() method and have a method which instantiates a new DefaultMessageListenerContainer whenever the selector changes.
public void xx(String selector) {
    this.selector = selector;
    DefaultMessageListenerContainer c =
            context.getBean("container", DefaultMessageListenerContainer.class);
    c.start();
}
Is this the right way to go about this? Also, when selector changes and I instantiate a new DefaultMessageListenerContainer, what's the correct way to shutdown/stop existing DefaultMessageListenerContainer?
regards,
Yogi
The prototype approach looks like a very bad idea to me.
MessageListenerContainer is meant to be a singleton that is responsible for handling listeners for a particular queue or topic configuration. The selector is part of the JMS spec, so you basically need to reconfigure the listener container at runtime, which requires you to fully stop the listener, change its configuration and restart it. I haven't seen your code but I don't see a reason why it wouldn't work.
Having said that, why are you using a selector for this? If the selector changes during the lifetime of your application, wouldn't it be better to actually perform that selection in your own logic? Having a selector at the JMS level is interesting if you have several message types on the same queue and you want different thread pools for them (i.e. you want 5 concurrent listeners for "red" and only 2 for green for instance). If you don't have that requirement having a generic route that filters the incoming message is probably a better idea.
If you do have that requirement, stopping the container, changing its config and restarting it should work. Unfortunately it doesn't so I've created SPR-14604 to track this issue.
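If the selection is done in application logic instead, as suggested above, the listener could consult a small thread-safe filter that the REST service updates. This is a plain-Java sketch under that assumption; the class name ColorFilter is hypothetical, and what to do with non-matching messages (discard, re-route, and so on) is left to the application.

```java
import java.util.Set;
import java.util.concurrent.atomic.AtomicReference;

// Holds the currently accepted colors; safe to update while listeners are running.
final class ColorFilter {
    private final AtomicReference<Set<String>> accepted = new AtomicReference<>(Set.of());

    // Called when the REST service receives a new list of colors.
    void setAcceptedColors(Set<String> colors) {
        accepted.set(Set.copyOf(colors));
    }

    // Called by the message handler with the message's "color" property.
    boolean accepts(String color) {
        return color != null && accepted.get().contains(color);
    }

    public static void main(String[] args) {
        ColorFilter filter = new ColorFilter();
        filter.setAcceptedColors(Set.of("red", "blue"));
        System.out.println(filter.accepts("red"));   // prints true
        System.out.println(filter.accepts("green")); // prints false
    }
}
```

Swapping the whole set atomically means a listener thread never observes a half-updated selection, and the container itself never has to be stopped.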
