Resume kafka stream when consumer is within a group - apache-kafka-streams

I have a circuit breaker in my Spring Cloud Stream application. It pauses and resumes the stream correctly when the circuit changes state and my stream consumer is anonymous (not in a group).
When the consumer belongs to a group, pausing the stream works well, but resuming is ignored, which eventually ends with a poll timeout and the consumer leaving the group. Is there any explanation for this inconsistent behavior?
The Spring Cloud Stream version is 3.0.8.RELEASE.
This is my circuit breaker state transition handler:
@Component
public class CircuitBreakerKafkaStream {

    private final Logger log = LoggerFactory.getLogger(CircuitBreakerKafkaStream.class);

    private List<InputBindingLifecycle> inputBindingLifecycles;

    public CircuitBreakerKafkaStream(List<InputBindingLifecycle> inputBindingLifecycles) {
        this.inputBindingLifecycles = inputBindingLifecycles;
    }

    @Override
    // Pause or resume all input bindings according to the circuit breaker state.
    public void transitionHandler(CircuitBreaker.State toState) {
        log.info("Circuit breaker is transitioning to {} state", toState.toString());
        if (toState == CircuitBreaker.State.OPEN) {
            log.info("Pausing kafka binder...");
            gatherInputBindings().stream().forEach(binding -> binding.pause());
        } else {
            log.info("Resuming kafka binder...");
            gatherInputBindings().stream().forEach(binding -> binding.resume());
        }
    }

    private List<Binding<?>> gatherInputBindings() {
        List<Binding<?>> inputBindings = new ArrayList<>();
        for (InputBindingLifecycle inputBindingLifecycle : this.inputBindingLifecycles) {
            Collection<Binding<?>> lifecycleInputBindings =
                (Collection<Binding<?>>)
                    new DirectFieldAccessor(inputBindingLifecycle).getPropertyValue("inputBindings");
            inputBindings.addAll(lifecycleInputBindings);
        }
        return inputBindings;
    }
}
Update: I think the problem is more specific. The circuit breaker has OPEN, HALF_OPEN and CLOSED states. Once the circuit is closed and the stream is resumed, it consumes some amount of messages and then, for some reason, stops until a poll timeout occurs.

You should increase max.poll.interval.ms; by default it is set to 5 minutes. If the consumer does not poll again within this limit, Kafka considers it dead and removes it from the group.
We ran into this while looking for a way to delay processing in Kafka Streams: we wanted to delay message delivery by a couple of minutes, and increasing max.poll.interval.ms kept the stream from being kicked out.
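For reference, a minimal sketch of raising this limit on a plain Kafka consumer (the bootstrap server and group id are illustrative; in Spring Cloud Stream the same max.poll.interval.ms key goes under the binder's Kafka consumer configuration):
Properties props = new Properties();
props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // illustrative
props.put(ConsumerConfig.GROUP_ID_CONFIG, "my-group");                // illustrative
// Allow up to 10 minutes between polls instead of the 5-minute default.
props.put(ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG, 600000);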

Related

AmqpResourceNotAvailableException: The channelMax limit is reached. Try later

I know this error has been mentioned in other posts (https://github.com/spring-projects/spring-amqp/issues/999 and https://github.com/spring-projects/spring-amqp/issues/853), but I haven't found a solution that works for me.
My project defines a set of microservices that publish and consume messages via queues. When I run my stress tests at 200 transactions per second I get this error:
The channelMax limit is reached. Try later
The error is shown in only one microservice; the rest are fine.
In my project I am using:
spring-boot-starter-amqp 2.3.4.RELEASE
spring-rabbit 2.2.11
and my RabbitMQ configuration is:
public ConnectionFactory publisherConnectionFactory() {
    final CachingConnectionFactory connectionFactory = new
        CachingConnectionFactory(rabbitMQConfigProperties.getHost(), rabbitMQConfigProperties.getPort());
    connectionFactory.setUsername(rabbitMQConfigProperties.getUser());
    connectionFactory.setPassword(rabbitMQConfigProperties.getPass());
    connectionFactory.setPublisherReturns(true);
    connectionFactory.setPublisherConfirms(true);
    connectionFactory.setConnectionNameStrategy(connecFact -> rabbitMQConfigProperties.getNameStrategy());
    connectionFactory.setRequestedHeartBeat(15);
    return connectionFactory;
}

@Bean(name = "firstRabbitTemplate")
public RabbitTemplate firstRabbitTemplate(MessageDeliveryCallbackService messageDeliveryCallbackService) {
    final RabbitTemplate template = new RabbitTemplate(publisherConnectionFactory());
    template.setMandatory(true);
    template.setMessageConverter(jsonMessageConverter());
    template.setReturnCallback((msg, i, s, s1, s2) -> {
        log.error("Publisher Unable to deliver the message {} , Queue {}: --------------", s1, s2);
        messageDeliveryCallbackService.returnedMessage(msg, i, s, s1, s2);
    });
    template.setConfirmCallback((correlationData, ack, cause) -> {
        if (!ack) {
            log.error("Message unable to connect Exchange, Cause {}: ack{}--------------", cause, ack);
        }
    });
    return template;
}
My questions are:
Should I set up the channelCacheSize and channelCheckoutTimeout? I did a test increasing the channelCacheSize to 50, but the issue still happens. What would be the best values for these parameters given the load I mentioned earlier? I read that channelCheckoutTimeout should be higher than 0, but I don't know what value I should set.
Right now I am processing around 200 transactions per second, but this number will increase progressively.
Thank you in advance.
channel_max is negotiated between the client and server and applies to connections. The default is 2047, so it looks like your broker has imposed a lower limit.
https://www.rabbitmq.com/channels.html#channel-max
When using publisher confirms, returning channels to the cache is delayed until the confirm is received; hence more channels are generally needed when the volume is high.
You can either reconfigure the broker to allow more channels, or change the CacheMode to CONNECTION instead of the default (CHANNEL).
https://docs.spring.io/spring-amqp/docs/current/reference/html/#cachingconnectionfactory
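As a rough sketch, assuming the same CachingConnectionFactory as in the question, the cache-mode change would look like this (the cache size is an illustrative value to tune under load):
final CachingConnectionFactory connectionFactory =
    new CachingConnectionFactory(rabbitMQConfigProperties.getHost(), rabbitMQConfigProperties.getPort());
// Cache whole connections instead of channels on one shared connection.
connectionFactory.setCacheMode(CachingConnectionFactory.CacheMode.CONNECTION);
connectionFactory.setConnectionCacheSize(5); // illustrative; tune under load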

How to tell RSocket to read data stream by Java 8 Stream which backed by Blocking queue

I have the following scenario where my program uses a blocking queue to process messages asynchronously. There are multiple RSocket clients that wish to receive these messages. My design is such that when a message arrives in the blocking queue, the stream bound to the Flux should emit it. I have tried to implement this requirement as below, but the client doesn't receive any response, even though I can see the Stream supplier being triggered correctly.
Can someone please help?
@MessageMapping("addListenerHook")
public Flux<QueryResult> addListenerHook(String clientName) {
    System.out.println("Adding Listener:" + clientName);
    BlockingQueue<QueryResult> listenerQ = new LinkedBlockingQueue<>();
    Datalistener.register(clientName, listenerQ);
    return Flux.fromStream(
        () -> Stream.generate(() -> streamValue(listenerQ))).map(q -> {
            System.out.println("I got an event : " + q.getResult());
            return q;
        });
}

private QueryResult streamValue(BlockingQueue<QueryResult> inStream) {
    try {
        return inStream.take();
    } catch (Exception e) {
        return null;
    }
}
This is tough to solve simply and cleanly because of the blocking API, and I think that is why there aren't simple bridge APIs to help you implement this. You should first come up with a clean solution to turn the BlockingQueue into a Flux; then the Spring Boot part becomes a non-event.
This is why the correct solution probably involves a custom BlockingQueue implementation like ObservableQueue in https://www.nurkiewicz.com/2015/07/consuming-javautilconcurrentblockingque.html
An alternative approach is in How can I create reactor Flux from a blocking queue?
If you need to retain the LinkedBlockingQueue, a starting solution might be something like the following.
val f = flux<QueryResult> {
    val listenerQ = LinkedBlockingQueue<QueryResult>()
    Datalistener.register(clientName, listenerQ)
    while (true) {
        send(listenerQ.take())
    }
}.subscribeOn(Schedulers.elastic())
With an API like flux you should definitely avoid any side effects before the subscribe, so don't register your listener until inside the body of the method. But you will need to improve this example to handle cancellation, or however you cancel the listener and interrupt the thread doing the take.
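If you prefer to stay in Java, here is a minimal sketch of the same idea, reusing the names from the question (Schedulers.boundedElastic() is the bounded replacement for Schedulers.elastic() on recent Reactor versions); as noted above, cancellation still needs more care because take() blocks:
Flux<QueryResult> results = Flux.<QueryResult>create(sink -> {
    BlockingQueue<QueryResult> listenerQ = new LinkedBlockingQueue<>();
    Datalistener.register(clientName, listenerQ);
    while (!sink.isCancelled()) {
        try {
            sink.next(listenerQ.take()); // blocks until a message arrives
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // the thread doing take() must be interrupted on cancel
            sink.complete();
        }
    }
}).subscribeOn(Schedulers.boundedElastic());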

Restarting inifinite Flux on error with pubSubReactiveFactory

I'm developing an application which uses the Reactor libraries to connect with Google Pub/Sub, so I have a Flux of messages. I want it to always consume from the queue, no matter what happens: this means handling all errors so the flux is never terminated. I was thinking about the (very unlikely) event that the connection to Pub/Sub is lost, or whatever else may cause the just-created Flux to signal an error. I came up with this solution:
private final PubSubReactiveFactory pubSubReactiveFactory;
private final String requestSubscription;
private final Long requestPollTime;
private final Flux<AcknowledgeablePubsubMessage> requestFlux;

@Autowired
public FluxContainer(/* Field args...*/) {
    // init stuff...
    this.requestFlux = initRequestFlux();
}

private Flux<AcknowledgeablePubsubMessage> initRequestFlux() {
    return pubSubReactiveFactory.poll(requestSubscription, requestPollTime)
        .doOnError(e -> log.error("FATAL ERROR: could not retrieve message from queue. Resetting flux", e))
        .onErrorResume(e -> initRequestFlux());
}

@EventListener(ApplicationReadyEvent.class)
public void configureFluxAndSubscribe() {
    log.info("Setting up requestFlux...");
    this.requestFlux
        .doOnNext(AcknowledgeablePubsubMessage::ack)
        // ...many more concatenated calls handling flux
Does it make sense? I'm concerned about memory allocation (I'm relying on the GC to clean things up). Any comment is welcome.
What I think you're looking for is basically a Flux that restarts itself when it terminates for any reason except the subscription being disposed. In my case I have a source generating infinite events from the Docker daemon, which can disconnect "successfully".
Let sourceFlux be the flux providing your data; it is what you'd want to restart on error or completion, but stop on subscription disposal.
Create a recovery function:
Function<Throwable, Publisher<Integer>> recoverFromThrow =
    throwable -> sourceFlux;
Create a new flux that recovers from the throw:
var recoveringFromThrowFlux =
sourceFlux.onErrorResume(recoverFromThrow);
Create a Flux generator that generates the flux that recovers from a throw (note that the generic coercion is needed):
var foreverFlux =
Flux.<Flux<Integer>>generate((sink) -> sink.next(recoveringFromThrowFlux))
.flatMap(flux -> flux);
foreverFlux is the flux that does self recovery.
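Illustrative usage of the pattern above, where disposing the subscription is the only way to stop consuming:
var subscription = foreverFlux.subscribe(System.out::println);
// ... later, when you really want to stop:
subscription.dispose();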

Reactor Flux conditional emit

Is it possible to allow emitting values from a Flux conditionally, based on a global boolean variable?
I'm working with Flux delayUntil(...) but am not able to fully grasp the functionality, or my assumptions are wrong.
I have a global AtomicBoolean that represents the availability of a downstream connection, and I only want the upstream Flux to emit if the downstream is ready to process.
To represent the scenario, I created a (not working) test sample:
//Randomly generates a boolean value every 5 seconds
private Flux<Boolean> signalGenerator() {
return Flux.range(1, Integer.MAX_VALUE)
.delayElements(Duration.ofMillis(5000))
.map(integer -> new Random().nextBoolean());
}
and
Flux.range(1, Integer.MAX_VALUE)
.delayElements(Duration.ofMillis(1000))
.delayUntil(evt -> signalGenerator()) // ?? Only proceed when signalGenerator returns true
.subscribe(System.out::println);
I have another scenario where a downstream process can accept only x messages a second. In the current non-reactive implementation we have a Semaphore with x permits and the thread blocks if no more permits are available, with the permits being reset every second.
In both scenarios I want the upstream Flux to emit only when there is demand from the downstream process, and I do not want to buffer.
You might consider using Mono.fromRunnable() as an input to delayUntil(), like below.
Helper class:
public class FluxCondition {
    CountDownLatch latch = new CountDownLatch(10); // it depends, might be managed somehow

    public Mono<Void> lock() {
        return Mono.fromRunnable(() -> {
            try { latch.await(); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        });
    }

    public void release() { latch.countDown(); }
}
Usage:
FluxCondition delayCondition = new FluxCondition();
Flux.range(1, 10).delayUntil(o -> delayCondition.lock()).subscribe();
.....
delayCondition.release(); // shall call this for each element
I guess there might be a better solution using sink.emitNext, but that might also require a condition variable for controlling the Flux flow.
According to my understanding of reactive programming, your data should be considered at every operator step, so it might be better to design your consumer as a reactive processor. In my case I had no choice and followed the approach described above.

Consuming from Camel queue every x minutes

I'm attempting to implement a way to time my consumer so it receives messages from a queue every 30 minutes or so.
For context: I have around 20 messages sitting in my error queue; they should stay there until x minutes have passed, then my route consumes all messages on the queue and 'sleeps' until another 30 minutes have passed.
I'm not sure of the best way to implement this. I've tried Spring @Scheduled, the Camel timer, etc., and none of them do what I'm hoping for. I've also tried a route policy but had no luck getting the correct functionality; it just seems to consume from the queue immediately.
Is route policy the correct path or is there something else to use?
The route that reads from the queue will always read any message as quickly as it can.
One thing you could do is start / stop or suspend the route that consumes the messages, so have this sort of set up:
route 1: error_q_reader, which goes from('jms').
route 2: a timed route that fires every 20 mins
route 2 can use a control bus component to start the route.
from("timer?20mins") // or whatever the timer syntax is...
    .to("controlbus:route?routeId=route1&action=start")
The tricky part here is knowing when to stop the route. Do you let it run for 5 minutes? Do you want to stop it once all the messages are consumed? There's probably a way to run another route that checks the queue depth (say every minute or so) and shuts down route 1 when it reaches 0. You might get it to work, but I can assure you this will get messy as you try to deal with a number of async operations.
You could also try something more exotic, like a custom QueueBrowseStrategy which can fire an event to shutdown route 1 when there are no messages on the queue.
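As a sketch of the stop side, under the assumption of a fixed drain window (the route id and timings are illustrative), a second timed route can use the same control bus component:
from("timer:stopDrain?period=1200000&delay=300000") // fires 5 minutes after each start, every 20 minutes
    .to("controlbus:route?routeId=route1&action=stop")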
I built a custom bean to drain a queue and close, but it's not a very elegant solution, and I'd love to find a better one.
public class TriggeredPollingConsumer {

    private ConsumerTemplate consumer;
    private Endpoint consumerEndpoint;
    private String endpointUri;
    private ProducerTemplate producer;
    private static final Logger logger = Logger.getLogger( TriggeredPollingConsumer.class );

    public TriggeredPollingConsumer() {}

    public TriggeredPollingConsumer( ConsumerTemplate consumer, String endpoint, ProducerTemplate producer ) {
        this.consumer = consumer;
        this.endpointUri = endpoint;
        this.producer = producer;
    }

    public void setConsumer( ConsumerTemplate consumer ) {
        this.consumer = consumer;
    }

    public void setProducer( ProducerTemplate producer ) {
        this.producer = producer;
    }

    public void setConsumerEndpoint( Endpoint endpoint ) {
        consumerEndpoint = endpoint;
    }

    public void pollConsumer() throws Exception {
        long count = 0;
        try {
            if ( consumerEndpoint == null ) consumerEndpoint = consumer.getCamelContext().getEndpoint( endpointUri );
            logger.debug( "Consuming: " + consumerEndpoint.getEndpointUri() );
            consumer.start();
            producer.start();
            while ( true ) {
                logger.trace( "Awaiting message: " + ++count );
                Exchange exchange = consumer.receive( consumerEndpoint, 60000 );
                if ( exchange == null ) break;
                logger.trace( "Processing message: " + count );
                producer.send( exchange );
                consumer.doneUoW( exchange );
                logger.trace( "Processed message: " + count );
            }
            producer.stop();
            consumer.stop();
            logger.debug( "Consumed " + (count - 1) + " message" + ( count == 2 ? "." : "s." ) );
        } catch ( Throwable t ) {
            logger.error( "Something went wrong!", t );
            throw t;
        }
    }
}
You configure the bean, and then call the bean method from your timer, and configure a direct route to process the entries from the queue.
from("timer:...")
.beanRef("consumerBean", "pollConsumer");
from("direct:myRoute")
.to(...);
It will then read everything in the queue and stop as soon as no entries arrive within a minute. You might want to reduce that minute, but I found that with only a second, if JMS was a bit slow, it would time out halfway through draining the queue.
I've also been looking at the sjms-batch component and how it might be used with a pollEnrich pattern, but so far I haven't been able to get that to work.
I have solved this by running my application as a CronJob in a microservices approach; to give it the ability to shut itself down gracefully, you can set the property camel.springboot.duration-max-idle-seconds. That way your JMS consumer route stays simple.
Another approach would be to declare a route to control the "lifecycle" (start, sleep and resume) of your JMS consumer route.
I would strongly suggest that you use the first approach.
If you use ActiveMQ you can leverage its scheduler feature.
You can delay the delivery of a message on the broker by simply setting the JMS property AMQ_SCHEDULED_DELAY to the delay in milliseconds. This is very easy in a Camel route:
.setHeader("AMQ_SCHEDULED_DELAY", constant(60000))
It is not exactly what you are looking for because it does not drain the queue every 30 minutes, but instead delays each individual message by 30 minutes.
Notice that you have to enable schedulerSupport in your broker configuration; otherwise the delay properties are ignored.
<broker brokerName="localhost" dataDirectory="${activemq.data}" schedulerSupport="true">
...
</broker>
You should consider the Aggregator EIP:
from(URI_WAITING_QUEUE)
    .aggregate(new GroupedExchangeAggregationStrategy())
        .constant(true)
        .completionInterval(TIMEOUT)
    .to(URI_PROCESSING_BATCH_OF_EXCEPTIONS);
This example works as follows: all objects arriving in URI_WAITING_QUEUE are grouped into a List. constant(true) is the grouping (correlation) condition, i.e. everything is grouped together. Every TIMEOUT period (in millis), all grouped objects are passed on to the URI_PROCESSING_BATCH_OF_EXCEPTIONS queue.
So the URI_PROCESSING_BATCH_OF_EXCEPTIONS queue will receive a List of objects to process. You can apply the Split EIP to split them and process them one by one, as sketched below.
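A rough sketch of that splitting step, assuming the aggregated List reaches this endpoint intact (the target endpoint name is hypothetical):
from(URI_PROCESSING_BATCH_OF_EXCEPTIONS)
    .split(body()) // each element of the aggregated List becomes its own exchange
    .to("direct:processSingleException"); // hypothetical endpoint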
