kafka consumer max-poll-records: 1 - performance - spring

I have a Spring Boot project with a Kafka consumer. When a message fails to process, I need to stop the container, so I added this setting:
spring.kafka.consumer.max-poll-records: 1
Now I need to know how much impact (big or not so much) this setting has on performance compared with the default (500). If I leave the default, then kafkaListenerEndpointRegistry.getListenerContainer("myID").stop(); does not take effect until the listener has processed all the messages already fetched in the batch, and that breaks the ordering I need.

You have to measure that. The kafka-verifiable-producer.sh script can help you produce a large number of messages. On the consumer side you can then measure how long it takes to consume them all with the default value and how long with spring.kafka.consumer.max-poll-records: 1.
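Only a benchmark can settle the throughput question, but the ordering effect is easy to see in isolation. Below is a self-contained sketch (plain Java, no Kafka involved; all names are made up) of a listener loop that checks its stop flag only between records of a poll, which is why stop() cannot take effect mid-batch:

```java
import java.util.ArrayDeque;
import java.util.Queue;

public class StopLatencyDemo {

    // How many records still get processed after stop() is requested.
    // The "container" finishes the in-flight batch before it can stop,
    // so a large max-poll-records means a long tail of unwanted processing.
    public static int recordsProcessedAfterStop(int maxPollRecords, int failingRecord) {
        Queue<Integer> topic = new ArrayDeque<>();
        for (int i = 0; i < 1000; i++) topic.add(i);

        boolean stopRequested = false;
        int processedAfterStop = 0;

        while (!stopRequested && !topic.isEmpty()) {
            int batch = Math.min(maxPollRecords, topic.size()); // one poll()
            for (int i = 0; i < batch; i++) {
                int record = topic.remove();
                if (stopRequested) processedAfterStop++;            // batch keeps going
                if (record == failingRecord) stopRequested = true;  // error -> stop()
            }
        }
        return processedAfterStop;
    }

    public static void main(String[] args) {
        System.out.println("max-poll-records=500: " + recordsProcessedAfterStop(500, 10));
        System.out.println("max-poll-records=1:   " + recordsProcessedAfterStop(1, 10));
    }
}
```

With max-poll-records: 1, no already-fetched records are processed after the stop request, at the cost of one fetch round trip per record; whether that cost is acceptable is exactly what the benchmark would tell you.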

Related

Missing Kafka events with 2 consumers on a Topic

Hi, I have 2 consumers on a Kafka topic. They belong to different consumer groups but consume from the same topic. I am using a Go Kafka library to consume the messages through both consumers.
This problem only occurs when there is a large number of events on the topic. For example, when 1000 events arrive, for some reason one consumer receives around 600 and the other around 200; the numbers vary from time to time. Under normal circumstances, when there is a single event, it is always consumed by both of them.
I also noticed that a message missing on one consumer is seen on the other, and vice versa, though there might be messages missing from both as well. This rules out the producer failing to produce the messages.
Is there any way in which consumer groups can get mixed up? I'm not sure what exactly is going on.
Does anyone know what's going on and how I can debug this further?
Thanks in advance.

How Kafka poll method works behind the scene in Spring Boot?

In Spring for Apache Kafka, I see the default max-poll-records value is 500.
So my question is: suppose 500 messages are not present in the topic. Will the consumer wait until it has 500 records before the poll method runs and fetches the batch?
I am a bit confused about what checks happen before messages are pulled from the topic.
Kafka polls with hybrid strategies: usually a combination of the number of records (or bytes) and a time interval. The consumer does not wait for max-poll-records messages to accumulate; that value is only an upper bound on how many records a single poll returns.
All of these properties can be overridden to fit your expectations for consumption.
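As an illustration of that interplay, here is a simplified model (not Kafka's actual fetch code; the parameters mirror fetch.min.bytes, fetch.max.wait.ms and max.poll.records):

```java
public class PollModel {

    // Broker side: a fetch response is sent as soon as enough bytes have
    // accumulated (fetch.min.bytes, default 1) or the wait limit is reached
    // (fetch.max.wait.ms, default 500).
    public static boolean brokerResponds(int accumulatedBytes, long waitedMs,
                                         int fetchMinBytes, long fetchMaxWaitMs) {
        return accumulatedBytes >= fetchMinBytes || waitedMs >= fetchMaxWaitMs;
    }

    // Consumer side: poll() never waits for max.poll.records messages to
    // pile up; the value only caps what one poll hands to your listener.
    public static int recordsReturned(int availableRecords, int maxPollRecords) {
        return Math.min(availableRecords, maxPollRecords);
    }

    public static void main(String[] args) {
        System.out.println(recordsReturned(7, 500));    // 7: no waiting for 500
        System.out.println(recordsReturned(1200, 500)); // capped at 500
    }
}
```

So with only 7 messages in the topic, a poll returns those 7 immediately; the batch of 500 is a maximum, not a target.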

Kafka consumption rate is low as compare to message publish on topic

Hi, I am new to Spring Boot and @KafkaListener. Service A publishes messages to a Kafka topic continuously, and my service consumes them from that topic. The partition count is the same for both services, but the rate of consuming messages is lower than the rate of publishing them, and I can see consumer lag in Kafka.
How can I fill that lag? Or how can I increase the rate of consuming messages?
Can I have a separate thread for processing messages? I could consume a message into a queue (acknowledging after adding it to the queue) and have another thread read from that queue to process the messages.
Is there any setting or property provided by Spring to increase the rate of consumption?
Lag is something you want to reduce, not "fill".
Can you consume faster? Yes. For example, max.poll.records can be increased from its default of 500, according to your I/O rates (do your own benchmarking), to fetch more data at once from Kafka. However, this increases the surface area for consumer error handling.
You can also consume and immediately ack the offsets, then hand the records to a queue for processing. There is a possibility of skipping records in this case, though, as processing moves off the critical path for offset tracking.
Or you could commit only once per consumer poll loop, rather than ack every record, but this may result in duplicate record processing.
As mentioned before, adding partitions is the best way to scale consumption after distributing the producer workload.
You will generally need to increase the number of partitions (and the concurrency in the listener container) if a single consumer thread can't keep up with the production rate.
If that doesn't help, you will need to profile your consumer app to see where the bottleneck is.
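The ack-then-queue idea above can be sketched without any Kafka dependency. This is a minimal sketch (all names are hypothetical) in which a bounded BlockingQueue decouples the listener thread from a processing thread and provides back-pressure:

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;

public class HandOffDemo {

    // Bounded buffer between the listener thread and the processing thread.
    private static final BlockingQueue<String> buffer = new LinkedBlockingQueue<>(1000);

    // Listener side: the offset would be acknowledged here, BEFORE processing.
    // A crash between the ack and the actual processing skips the record --
    // the at-most-once trade-off the answer warns about.
    public static void onMessage(String record) throws InterruptedException {
        buffer.put(record); // blocks when the buffer is full (back-pressure)
    }

    public static List<String> runDemo(int messages) throws Exception {
        List<String> processed = new CopyOnWriteArrayList<>();
        ExecutorService worker = Executors.newSingleThreadExecutor();
        worker.submit((Callable<Void>) () -> {
            while (true) processed.add(buffer.take()); // the slow processing side
        });
        for (int i = 0; i < messages; i++) onMessage("m" + i);
        while (processed.size() < messages) Thread.sleep(10);
        worker.shutdownNow();
        return processed;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runDemo(5).size() + " records processed");
    }
}
```

Note that with more than one worker thread, per-partition ordering is lost; keeping one worker per partition preserves it.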

Circuit breaker for asynchronous microservices..?

There is an ActiveMQ queue (QueueA). A service (MyService) subscribes to its messages, processes them, and sends the results to another ActiveMQ queue (QueueB).
QueueA -> MyService -> QueueB
Consider a scenario where thousands of messages are in QueueA while QueueB is down. I want to stop processing if a certain number of messages (say 100) consecutively fail while being sent to QueueB. It should test over a rolling time window (say, 100 consecutive messages failed within 60 seconds) and then stop consuming from QueueA. After 15 minutes or so, it should test whether the service is back up by sending one more message. If that still fails, it should again stop consuming from QueueA for another 15 minutes.
Right now, all the messages are erroring out and we have to reprocess every message. There is a recovery mechanism, but it is getting overloaded because of the limitations of the current architecture.
Is there a pattern for this? Is it the same circuit breaker pattern (I am aware of it in a synchronous context)? If so, I'm not sure whether there is a solution in Java / Spring Boot / Apache Camel; yes, that is the technology stack we are currently on. Any guidelines for the pattern will also help, even if you don't know this specific technology platform.
I have also read the following question in StackOverflow.
Is circuit breaker pattern applicable for asynchronous requests also?
Thanks and appreciate your time in helping me with this.
Have a look at the Camel RoutePolicy of type ThrottlingExceptionRoutePolicy, which is based on the CircuitBreakerLoadBalancer.
Using this policy allows you to stop consuming from the endpoint while the circuit is in the open state (compare with standard circuit-breaker behaviour: bypass the service call and fall back to another response).
@Bean
public ThrottlingExceptionRoutePolicy myCustomPolicy() {
    int failureThreshold = 100;            // open after 100 failures...
    long failureWindow = 60 * 1000L;       // ...within a 60-second window
    long halfOpenAfter = 15 * 60 * 1000L;  // probe again after 15 minutes
    // Exceptions the policy counts as failures (see the javadoc for exact semantics)
    List<Class<?>> handledExceptions = Arrays.asList(MyException.class);
    return new ThrottlingExceptionRoutePolicy(failureThreshold, failureWindow, halfOpenAfter, handledExceptions);
}

from("jms:queue:QueueA")
    .routePolicy(myCustomPolicy())
    .to("mock:MyService");
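For readers not on Camel, the state machine such a policy implements can be reduced to a few lines. This sketch (all names are made up) counts consecutive failures rather than a true rolling time window, and the clock is passed in for testability:

```java
public class QueueCircuitBreaker {

    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold;   // e.g. 100 consecutive failures
    private final long halfOpenAfterMs;   // e.g. 15 minutes
    private State state = State.CLOSED;
    private int consecutiveFailures;
    private long openedAtMs;

    public QueueCircuitBreaker(int failureThreshold, long halfOpenAfterMs) {
        this.failureThreshold = failureThreshold;
        this.halfOpenAfterMs = halfOpenAfterMs;
    }

    // Ask before consuming the next message from QueueA.
    public boolean allowConsume(long nowMs) {
        if (state == State.OPEN && nowMs - openedAtMs >= halfOpenAfterMs) {
            state = State.HALF_OPEN; // let a single probe message through
        }
        return state != State.OPEN;
    }

    public void onSuccess() {
        consecutiveFailures = 0;
        state = State.CLOSED;
    }

    public void onFailure(long nowMs) {
        consecutiveFailures++;
        if (state == State.HALF_OPEN || consecutiveFailures >= failureThreshold) {
            state = State.OPEN;   // stop consuming from QueueA
            openedAtMs = nowMs;
        }
    }
}
```

A failed probe in the half-open state reopens the circuit immediately, which gives the repeated 15-minute back-off described in the question.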

Concurrency value in JMS topic listener in Spring

I know Spring, but I am a newbie in JMS and have started reading about Spring JMS. From the Spring JMS documentation, I read the following:
The number of concurrent sessions/consumers to start for each listener.
Can either be a simple number indicating the maximum number (e.g. "5")
or a range indicating the lower as well as the upper limit (e.g. "3-5").
Note that a specified minimum is just a hint and might be ignored at
runtime. Default is 1; keep concurrency limited to 1 in case of a topic
listener or if queue ordering is important; consider raising it for
general queues.
I just want to understand why the concurrency should be limited to 1 in the case of a topic listener. If I increase it, say to 10 instead of 1, what would happen?
Each subscriber receives a copy of each message published to a topic. It makes no sense at all to set up multiple consumers, since all your application would do is receive the same message 10 times, in different threads.
In the case of a queue, the messages would be distributed among the 10 threads and hence handled concurrently. That is indeed a very common scenario: load balancing.
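The difference can be sketched in a few lines of plain Java (no JMS; all names are made up): a topic delivers a copy to every subscriber, while a queue shares the messages out among competing consumers:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DeliveryDemo {

    // Topic (pub/sub): every subscriber gets its own copy of every message.
    public static Map<String, List<String>> publishToTopic(List<String> messages,
                                                           List<String> subscribers) {
        Map<String, List<String>> delivered = new LinkedHashMap<>();
        for (String s : subscribers) delivered.put(s, new ArrayList<>(messages));
        return delivered;
    }

    // Queue (point-to-point): each message goes to exactly one consumer.
    public static Map<String, List<String>> consumeFromQueue(List<String> messages,
                                                             List<String> consumers) {
        Map<String, List<String>> delivered = new LinkedHashMap<>();
        for (String c : consumers) delivered.put(c, new ArrayList<>());
        for (int i = 0; i < messages.size(); i++) {
            // round-robin stands in for the broker's load balancing
            delivered.get(consumers.get(i % consumers.size())).add(messages.get(i));
        }
        return delivered;
    }

    public static void main(String[] args) {
        List<String> msgs = List.of("a", "b", "c", "d");
        System.out.println("topic: " + publishToTopic(msgs, List.of("s1", "s2")));
        System.out.println("queue: " + consumeFromQueue(msgs, List.of("c1", "c2")));
    }
}
```

With concurrency 10 on a topic listener you would therefore process every message 10 times; on a queue the same setting simply spreads the load.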
