Spring Cloud Stream and backpressure using PubSubReactiveFactory

I am trying to implement flow control similar to https://cloud.google.com/pubsub/docs/pull#flow_control using the Spring Cloud Stream reactive client.
@Bean
ApplicationRunner reactiveSubscriber(PubSubReactiveFactory reactiveFactory, PubSubMessageConverter converter) {
    return (args) -> {
        reactiveFactory.poll("orders-subscription", 250L)
                .doOnNext(msg -> {
                    // Convert the JSON payload into an object and process it
                    Order order = converter.fromPubSubMessage(msg.getPubsubMessage(), Order.class);
                    processOrder(order);
                })
                // Manually acknowledge the message
                .doOnNext(AcknowledgeablePubsubMessage::ack)
                .subscribe();
    };
}
It seems that back-pressure cannot be used here to limit the amount of data flowing through the system. Since processing can take a few seconds, I am afraid of running into memory problems with a high number of incoming messages (millions of incoming messages per day).
What I would like to achieve is a constant processing of 100 messages in the flux. Has anyone had success implementing this? Maybe with reactive RabbitMQ?
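One way to bound the number of in-flight messages in the reactive pipeline itself is to move the processing into a flatMap with a fixed concurrency, so upstream demand is limited to what is actually being processed. A minimal sketch, not from the original post; the concurrency of 100, the boundedElastic scheduler and processOrder are illustrative assumptions:
reactiveFactory.poll("orders-subscription", 250L)
        .flatMap(msg -> Mono.fromCallable(() -> {
                    // blocking conversion and processing, run off the polling thread
                    Order order = converter.fromPubSubMessage(msg.getPubsubMessage(), Order.class);
                    processOrder(order);
                    return msg;
                }).subscribeOn(Schedulers.boundedElastic()),
                100) // at most 100 messages are processed concurrently
        .doOnNext(AcknowledgeablePubsubMessage::ack)
        .subscribe();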

Resolved using Spring AMQP and the prefetch count.
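With Spring AMQP, the prefetch count caps how many unacknowledged messages the broker delivers to a consumer at a time, which effectively bounds in-flight work. A minimal sketch, assuming a plain Spring AMQP listener container; the value of 100 is illustrative, and with the Spring Cloud Stream Rabbit binder the binding's consumer prefetch property serves the same purpose:
@Bean
public SimpleRabbitListenerContainerFactory rabbitListenerContainerFactory(ConnectionFactory connectionFactory) {
    SimpleRabbitListenerContainerFactory factory = new SimpleRabbitListenerContainerFactory();
    factory.setConnectionFactory(connectionFactory);
    // the broker delivers at most 100 unacknowledged messages per consumer
    factory.setPrefetchCount(100);
    return factory;
}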

Related

Implementing DLQ in Kafka using Spring Cloud Stream with Batch mode enabled

I am trying to implement a DLQ using Spring Cloud Stream with batch mode enabled,
@Bean
public ListenerContainerCustomizer<AbstractMessageListenerContainer<?, ?>> customizer(BatchErrorHandler handler) {
    return (container, destinationName, group) -> {
        if (dlqEnabledTopic.contains(destinationName)) {
            container.setBatchErrorHandler(handler);
        }
    };
}

@Bean
public BatchErrorHandler batchErrorHandler(KafkaOperations<String, byte[]> kafkaOperations) {
    CustomDeadLetterPublishingRecoverer recoverer = new CustomDeadLetterPublishingRecoverer(kafkaOperations,
            (cr, e) -> new TopicPartition(cr.topic() + "_dlq", cr.partition()));
    return new RecoveringBatchErrorHandler(recoverer, new FixedBackOff(1000, 1));
}
but I have a few queries:
How do I configure the key/value serializer using properties? My message is a String, but the KafkaOperations is using ByteArraySerializer.
The batch contains multiple messages, but if the first message fails it goes to the DLQ and I don't see the next messages being processed.
Requirement: if the message at any index of the batch fails, only that message should be sent to the DLQ and the rest of the messages should be processed again.
Is DLQ supported with batch mode now, just like with record mode where it can be enabled using properties?
The serializers can be set with the spring.kafka.producer.* properties; however, the DLT publishing should use the same serializers as the main stream app, so ByteArraySerializer is generally correct.
The recovering batch error handler will perform seeks for the unprocessed records and they will be returned. Debug logging should help you figure out what's wrong. If you can't figure it out, provide an MCRE that exhibits the behavior you are seeing.
No; the binder does not support DLQ for batch mode; configuring the error handler is the correct approach.
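On the serializer point, a minimal sketch of wiring the KafkaOperations<String, byte[]> used by the recoverer with explicit serializers; the bean name and the bootstrap address are illustrative assumptions, not the binder's own configuration:
@Bean
public KafkaOperations<String, byte[]> dlqKafkaOperations() {
    Map<String, Object> config = new HashMap<>();
    config.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // illustrative
    // String keys, raw byte[] values so the DLT receives the record exactly as it was consumed
    config.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
    config.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, ByteArraySerializer.class);
    return new KafkaTemplate<>(new DefaultKafkaProducerFactory<>(config));
}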

Periodically polling consumer metrics with Reactor Kafka

We have a Spring Boot project using Reactor Kafka and a KafkaReceiver for consuming and we would like to collect and emit the underlying consumer metrics. It looks like we could leverage KafkaReceiver.doOnConsumer() with something like this:
receiver.doOnConsumer(Consumer::metrics)
        .flatMapIterable(Map::entrySet)
        .map(m -> Tuples.of(m.getKey(), m.getValue()))
If this is the best approach, I'm not sure what the best way to run it periodically would be.
I notice there's also a version of the KafkaReceiver.create() factory method that takes a custom ConsumerFactory; maybe there's some way to use that to register the underlying Kafka consumer with Micrometer at creation time? I'm new to Spring Boot and relatively new to Reactor Kafka, so I'm not totally sure.
Here's a snippet of my code so far for more context:
KafkaReceiver.create(receiverOptions(Collections.singleton(topic)).commitInterval(Duration.ZERO))
        .receive()
        .groupBy(m -> m.receiverOffset().topicPartition())
        .flatMap(partitionFlux -> partitionFlux.publishOn(this.scheduler)
                .map(r -> processEvent(partitionFlux.key(), r))
                .concatMap(this::commit))
        .doOnCancel(this::close)
        .doOnError(error -> LOG.error("An error was encountered", error))
        .blockLast();
If taking the doOnConsumer() approach makes sense, we could possibly hook into doOnNext(), but then we'd be collecting and emitting metrics for every event, which is too much; it would be better if we could stagger and batch.
Any suggestions or tips appreciated, thanks.
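If the doOnConsumer() route is used, one way to run it periodically is to drive it from Flux.interval. A minimal sketch, not from the original post; the one-minute period and the logging are illustrative:
Flux.interval(Duration.ofMinutes(1))
        .flatMap(tick -> receiver.doOnConsumer(Consumer::metrics))
        .flatMapIterable(Map::entrySet)
        // log each Kafka consumer metric; a registry such as Micrometer could be updated here instead
        .subscribe(entry -> LOG.info("{} = {}", entry.getKey().name(), entry.getValue().metricValue()));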

Using onErrorResume to handle problematic payloads posted to Kafka using Reactor Kafka

I am using Reactor Kafka to send Kafka messages and to receive and process them.
While receiving the Kafka payload, I do some deserialization, and if there is an exception, I want to just log that payload (by saving it to Mongo) and then continue receiving other payloads.
For this I am using the approach below:
@EventListener(ApplicationStartedEvent.class)
public void kafkaReceiving() {
    for (Flux<ReceiverRecord<String, Object>> flux : kafkaService.getFluxReceives()) {
        flux.delayUntil(/* some function to do something */)
                .doOnNext(r -> r.receiverOffset().acknowledge())
                .onErrorResume(this::handleException) // here I'll just save to Mongo
                .subscribe();
    }
}

private Publisher<? extends ReceiverRecord<String, Object>> handleException(Throwable ex) {
    // save to Mongo
    return Flux.empty();
}
Here I expect that whenever I encounter an exception while receiving a payload, onErrorResume should catch it and log it to Mongo, and then I should be able to continue receiving more messages when I send them to the Kafka queue. However, I see that after the exception, even though the onErrorResume method gets invoked, I am not able to process any more messages sent to the Kafka topic.
Is there anything I might be missing here?
If you need to handle the error gracefully, you can add onErrorResume inside delayUntil:
flux
    .delayUntil(r -> {
        return process(r)
                .onErrorResume(e -> saveToMongo(r)); // saveToMongo should return a Publisher (e.g. a Mono) so the error is swallowed
    })
    .doOnNext(r -> r.receiverOffset().acknowledge())
    .subscribe();
Reactive operators treat an error as a terminal signal, and if your inner logic (inside delayUntil) throws an error, delayUntil will terminate the sequence; an onErrorResume placed after delayUntil will not make it continue processing the events from Kafka.
As mentioned by @bsideup too, I ultimately went with not throwing an exception from the deserializer. Kafka is not able to commit the offset for that record, and there is no clean way of ignoring the record and carrying on with further consumption, because we don't have the offset information of the malformed record. So even if I try to ignore the record using reactive error operators, the poll fetches the same record again and the consumer gets stuck.
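For reference, a minimal sketch of the "don't throw from the deserializer" idea; the class name, the JSON delegate and the Object payload type are illustrative assumptions, not from the original post (Spring Kafka's ErrorHandlingDeserializer follows a similar pattern):
public class SafePayloadDeserializer implements Deserializer<Object> {

    private final Deserializer<Object> delegate = new JsonDeserializer<>(Object.class);

    @Override
    public Object deserialize(String topic, byte[] data) {
        try {
            return delegate.deserialize(topic, data);
        } catch (Exception ex) {
            // Returning null instead of throwing lets the offset advance;
            // the consuming flux can detect the null value, save the raw
            // payload to Mongo, and still acknowledge the record.
            return null;
        }
    }
}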

Spring Cloud Stream Listener not pausing / waiting for the messages in Integration Testing Code

I have an application which connects to RabbitMQ through Spring Cloud Stream, and it works perfectly.
For integration test cases I am trying to use this sample: https://github.com/piomin/sample-message-driven-microservices/blob/master/account-service/src/test/java/pl/piomin/services/account/OrderReceiverTest.java
However, in my case the application sends back 3 messages over some time interval. So if I put the lines below, it fetches the messages, but it just keeps spinning when there is a delay in getting the messages.
int i = 1;
while (i > 0) {
    Message<String> received = (Message<String>) collector.forChannel(channels.statusMessage()).poll();
    if (received != null) {
        LOGGER.info("Order response received: {}", received.getPayload());
    }
}
So instead of my custom polling, is there any way I can wait and poll for my messages, and stop once I have received them?
I also want to pick up messages on different channels based on the response routing key. Is that possible?
Example: if the routing key is "InProcess", it should go to the inProcess method.
1) Your question is not at all clear; expand on it and explain exactly what you mean.
2) Routing keys are used within Rabbit to route to different queues; they are not used within the framework to route to channels or methods.
You can, however, use a condition on the @StreamListener (matching on headers['amqp_receivedRoutingKey']), but it's better to route messages to different queues instead.
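A minimal sketch of such a condition, not from the original answer; Sink.INPUT and the handler method are illustrative:
@StreamListener(target = Sink.INPUT, condition = "headers['amqp_receivedRoutingKey'] == 'InProcess'")
public void handleInProcess(Message<String> message) {
    // invoked only for messages whose routing key was 'InProcess'
}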

Stream response from HTTP client with Spring/Project reactor

How can I stream the response from a reactive HTTP client to the controller without having the whole response body in the application memory at any time?
Practically all examples of Project Reactor clients return a Mono<T>. As far as I understand, reactive streams are about streaming, not loading everything and then sending the response.
Is it possible to return something like a Flux<Byte>, so that big files can be transferred from an external service to the application client without needing a huge amount of RAM to store an intermediate result?
It should be done naturally by simply returning a Flux<WHATEVER>, where each WHATEVER will be flushed on the network as soon as possible. In such a case, the response uses chunked HTTP encoding, and the bytes from each chunk are discarded once they've been flushed to the network.
Another possibility is to upgrade the HTTP response to SSE (Server-Sent Events), which can be achieved in WebFlux by annotating the controller method with something like @GetMapping(path = "/stream-flux", produces = MediaType.TEXT_EVENT_STREAM_VALUE) (the produces part is the important one).
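To make the chunked-streaming suggestion concrete, a sketch of proxying a large body straight through without buffering it; the webClient field and the remote URL are illustrative assumptions:
@GetMapping(value = "/proxy-download", produces = MediaType.APPLICATION_OCTET_STREAM_VALUE)
public Flux<DataBuffer> proxyDownload() {
    // each DataBuffer chunk is written to the client response and released
    // as it arrives, so the full body is never held in memory
    return webClient.get()
            .uri("https://example.org/big-file")
            .retrieve()
            .bodyToFlux(DataBuffer.class);
}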
I don't think that in your scenario you need to create an event stream, because an event stream is mostly used to emit events in real time. I think you are better off doing it like this:
@GetMapping(value = "bytes")
public Flux<Byte> getBytes() {
    return byteService.getBytes();
}
and it will be sent as a stream.
If you still want it as an event stream:
@GetMapping(value = "bytes", produces = MediaType.TEXT_EVENT_STREAM_VALUE)
public Flux<List<Byte>> getBytes() {
    return byteService.getBytes();
}
