Spring Cloud Stream with Project Reactor Stability

I want to use Spring Cloud Stream to consume and process messages from Apache Kafka topics and write them to MongoDB. I saw that the library gives you the option of writing functions in either a reactive or an imperative style. In most Spring projects the imperative way is the default, but as far as I understand, in Spring Cloud Stream the reactive paradigm is the default.
I wonder what is considered the most "stable" API, i.e. what is recommended for enterprise use?

The reactive API is stable and, yes, we provide support for it. In other words, you can write functions using the reactive API (e.g., Function<Flux, Flux>).
However, I want to be very clear that support for the API does not mean support for the full stack of reactive capabilities, since those actually depend on sources and targets which are not reactive.
That said, with Kafka you can rely on the native reactive support provided by Kafka itself and by Spring Cloud Stream using the Kafka Streams binder - https://docs.spring.io/spring-cloud-stream-binder-kafka/docs/3.1.5/reference/html/spring-cloud-stream-binder-kafka.html#_kafka_streams_binder
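To make that concrete, here is a minimal sketch of what such a reactive function can look like with the functional programming model; the function name, binding destination, and the uppercase logic are illustrative assumptions, not taken from the question:

import java.util.function.Function;

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

import reactor.core.publisher.Flux;

@Configuration
public class ProcessorConfig {

    // Bound by convention to uppercase-in-0 / uppercase-out-0, e.g.
    // spring.cloud.stream.bindings.uppercase-in-0.destination=my-topic
    @Bean
    public Function<Flux<String>, Flux<String>> uppercase() {
        return flux -> flux.map(String::toUpperCase);
    }
}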

Related

What is the difference between spring-kafka and Apache-Kafka-Streams-Binder regarding the interaction with the Kafka Streams API?

My understanding was that spring-kafka was created to interact with the Kafka client APIs, and that later on the spring-cloud-stream project was created for "building highly scalable event-driven microservices connected with shared messaging systems". This project includes a couple of binders, one of which allows interaction with the Kafka Streams API:
spring-cloud-stream-binder-kafka-streams
So it was clear to me that if I wanted to interact with the Kafka Streams API, I would use the spring-cloud-stream approach with the appropriate binder.
But I found out that you can also interact with the Kafka Streams API using the spring-kafka approach.
You need the two dependencies below. One example is here.
'org.springframework.kafka:spring-kafka'
'org.apache.kafka:kafka-streams'
So my question is: if both approaches allow interaction with the Kafka Streams API, what are the differences between them?
As Gary pointed out in the comments above, spring-kafka is the lower-level library that provides the building blocks for the Spring Cloud Stream Kafka Streams binder (spring-cloud-stream-binder-kafka-streams). The binder provides a programming model that lets you write your Kafka Streams processor as a java.util.function.Function or java.util.function.Consumer.
You can have multiple such functions, and each of them will build its own Kafka Streams topology. Behind the scenes, the binder uses spring-kafka to build the Kafka Streams StreamsBuilder object via the StreamsBuilderFactoryBean. The binder also allows you to compose various functions. The functional model comes largely from Spring Cloud Function, but it is adapted for Kafka Streams in the binder implementation. The short answer is that both spring-kafka and the Spring Cloud Stream Kafka Streams binder will work, but the binder gives you a programming model and extra features consistent with Spring Cloud Stream, whereas spring-kafka gives you various low-level building blocks.
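For illustration, a minimal sketch of a Kafka Streams processor written against the binder's functional model; the function name, topics, and the uppercase logic are assumptions made for the example:

import java.util.function.Function;

import org.apache.kafka.streams.kstream.KStream;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class UppercaseProcessor {

    // The binder builds the Kafka Streams topology from this function;
    // input/output topics come from
    // spring.cloud.stream.bindings.process-in-0.destination and process-out-0.destination
    @Bean
    public Function<KStream<String, String>, KStream<String, String>> process() {
        return input -> input.mapValues(value -> value.toUpperCase());
    }
}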

Spring Reactive Stack with Spring for Apache Kafka

In a few words:
I'm trying to decide between the default Spring for Apache Kafka stack (KafkaTemplate) and the pair ReactiveKafkaProducerTemplate / ReactiveKafkaConsumerTemplate for my Reactor-based application.
Some more context:
At the company I work for, we're developing a high-availability application that publishes a set of requests directly to a Kafka broker. Since this is an API-centric application expected to receive a few million requests per week, we decided to go with a stack based on Project Reactor with Spring WebFlux and Kotlin.
After doing some digging, I discovered that Spring for Apache Kafka has a simple wrapper around the Reactor Kafka implementation, but this wrapper lacks many of the features present in the default KafkaTemplate mentioned before, such as a metrics binder out of the box (for Prometheus integration), associated factories, extensive documentation, auto-configuration, etc.
I'm trying to understand what I'm really giving up when using the default implementation instead of the reactive one. Am I giving up back-pressure functionality? Am I sacrificing the reactive stack present in my application? Will this become a burden in the future? Does anyone have experience working with a reactive stack alongside a non-reactive solution?
I also have a few concerns regarding the DLT flow facilitated by the default implementation, such as the SeekToCurrentErrorHandler strategy.
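For reference, since there is no auto-configuration for it, wiring the reactive producer template by hand looks roughly like the sketch below; the bootstrap server, topic, and payload are illustrative assumptions:

import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.serialization.StringSerializer;
import org.springframework.kafka.core.reactive.ReactiveKafkaProducerTemplate;

import reactor.core.publisher.Mono;
import reactor.kafka.sender.SenderOptions;

public class ReactiveProducerSketch {

    public static void main(String[] args) {
        Map<String, Object> props = new HashMap<>();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumption
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);

        // The template is created manually from Reactor Kafka SenderOptions
        ReactiveKafkaProducerTemplate<String, String> template =
                new ReactiveKafkaProducerTemplate<>(SenderOptions.create(props));

        // send(...) returns a Mono, so it composes with the rest of a WebFlux pipeline
        Mono<Void> result = template.send("requests-topic", "key", "payload").then();
        result.block(); // block only in this standalone sketch
    }
}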

Checkpointing with Spring AWS Integration

According to the Spring release notes, spring-integration-aws 1.1.0.M1 does not include a DynamoDB MetadataStore implementation. There is still the ConcurrentMetadataStore class, which is a key-value based store, and based on the implementation I suppose it maps streams to the latest sequence number read. But it does not use any data store to persist or retrieve checkpoints.
I am using Spring Integration for Kinesis consumption and need to implement checkpointing. I am wondering if I need to do it manually by connecting to DynamoDB and updating checkpoints myself, or whether there is another way of doing it with the Spring framework.
P.S.: I can't use the Spring Cloud KinesisBinderConfiguration since I dynamically consume events from a list of configurable streams.
Thank you
If you are not talking about Spring Cloud Stream and the AWS Kinesis Binder implementation, then I don't see any blockers to upgrading your solution to Spring Integration AWS 2.0 and going ahead with the already provided DynamoDbMetaDataStore.
Or, if it is too hard for you to move to Spring Integration 5.0, you can simply copy the implementation into your own class and inject it into the KinesisMessageDrivenChannelAdapter: https://github.com/spring-projects/spring-integration-aws/blob/master/src/main/java/org/springframework/integration/aws/metadata/DynamoDbMetaDataStore.java
Although it is actually available in 1.1.0.RELEASE - I don't see a reason for you to stick with 1.1.0.M1: https://spring.io/blog/2017/11/27/spring-integration-for-aws-1-1-ga-available
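As an illustration, a minimal sketch of wiring the store into the adapter; the table name, stream names, and client builders are assumptions, and the exact constructor and setter signatures should be verified against the spring-integration-aws version in use:

import com.amazonaws.services.dynamodbv2.AmazonDynamoDBAsync;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBAsyncClientBuilder;
import com.amazonaws.services.kinesis.AmazonKinesis;
import com.amazonaws.services.kinesis.AmazonKinesisClientBuilder;

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.integration.aws.inbound.kinesis.KinesisMessageDrivenChannelAdapter;
import org.springframework.integration.aws.metadata.DynamoDbMetaDataStore;
import org.springframework.integration.channel.DirectChannel;
import org.springframework.messaging.MessageChannel;

@Configuration
public class KinesisCheckpointConfig {

    @Bean
    public DynamoDbMetaDataStore checkpointStore() {
        AmazonDynamoDBAsync dynamoDb = AmazonDynamoDBAsyncClientBuilder.defaultClient();
        // table name is an illustrative assumption
        return new DynamoDbMetaDataStore(dynamoDb, "my-checkpoint-table");
    }

    @Bean
    public MessageChannel kinesisChannel() {
        return new DirectChannel();
    }

    @Bean
    public KinesisMessageDrivenChannelAdapter kinesisAdapter(DynamoDbMetaDataStore checkpointStore) {
        AmazonKinesis kinesis = AmazonKinesisClientBuilder.defaultClient();
        // stream names are illustrative; they could come from configurable properties
        KinesisMessageDrivenChannelAdapter adapter =
                new KinesisMessageDrivenChannelAdapter(kinesis, "stream-a", "stream-b");
        adapter.setCheckpointStore(checkpointStore);
        adapter.setOutputChannel(kinesisChannel());
        return adapter;
    }
}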

Spring-Kafka vs. Spring-Cloud-Stream (Kafka)

Using Kafka as a messaging system in a microservice architecture, what are the benefits of using spring-kafka vs. spring-cloud-stream + spring-cloud-starter-stream-kafka?
The Spring Cloud Stream framework supports more messaging systems and therefore has a more modular design. But what about the functionality? Is there a gap between the functionality of spring-kafka and spring-cloud-stream + spring-cloud-starter-stream-kafka?
Which API is better designed?
Looking forward to reading your opinions.
Spring Cloud Stream with the Kafka binder relies on spring-kafka. So the former has all the functionality supported by the latter, but the former is more heavyweight. Below are some points to help you make the choice:
If you might switch from Kafka to another message middleware in the future, then Spring Cloud Stream should be your choice, since it hides the implementation details of Kafka.
If you want to integrate other message middleware with Kafka, then you should go for Spring Cloud Stream, since its selling point is to make such integration easy.
If you want simplicity and don't want to accept the performance overhead, then choose spring-kafka.
If you plan to migrate to a public cloud service such as AWS Kinesis or Azure Event Hubs, then use Spring Cloud Stream, which is part of the Spring Cloud family.
Use Spring Cloud Stream when you are creating a system where one channel is used for input, does some processing, and sends it to one output channel. In other words, it is more of an RPC-style system to replace, say, RESTful API calls.
If you plan to build an event sourcing system, use spring-kafka, where you can publish and subscribe to the same stream. This is something that Spring Cloud Stream does not allow you to do easily, as it disallows the following:
public interface EventStream {

    String STREAM = "event_stream";

    @Output(EventStream.STREAM)
    MessageChannel publisher();

    @Input(EventStream.STREAM)
    SubscribableChannel stream();
}
A few things that Spring Cloud Stream helps you avoid doing are:
setting up the serializers and deserializers
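For comparison, this is roughly what the manual serializer setup looks like with plain spring-kafka, which the binder configures for you via message conversion; the bootstrap server and the JSON value serializer are assumptions made for the example:

import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.serialization.StringSerializer;
import org.springframework.kafka.core.DefaultKafkaProducerFactory;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.kafka.core.ProducerFactory;
import org.springframework.kafka.support.serializer.JsonSerializer;

public class ManualSerializerSetup {

    // With plain spring-kafka you wire the serializers yourself
    public static KafkaTemplate<String, Object> kafkaTemplate() {
        Map<String, Object> props = new HashMap<>();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumption
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, JsonSerializer.class);
        ProducerFactory<String, Object> factory = new DefaultKafkaProducerFactory<>(props);
        return new KafkaTemplate<>(factory);
    }
}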

Reactive streams using Spring and Akka

At the moment of writing this question, I am using Spring Framework 5.0.0.M5 and Akka 2.4.17. In the project I am currently using actor messaging and streams from Akka. I see that Spring Framework 5 also has streams.
According to http://projectreactor.io/
"As a Reactive Engine/SPI, both Reactor Core and IO modules expose reactive streams constructs for focused use cases, eventually combined with Spring, RxJava, Akka Streams, Ratpack... As a Reactive API, reactor framework modules will offer rich consumable features like composition and pub-sub eventing."
it looks like Spring uses Akka Streams under the hood (plus other stuff).
Question is: what are possible advantages and disadvantages of switching from Akka streams to Spring streams?
EDIT: Here is a much wider question (Reactor vs Akka in general): Akka or Reactor
Spring uses Reactor (http://projectreactor.io) under the hood, which implements the Reactive Streams specification. One main goal of that specification, which is also implemented by Akka Streams, is to allow interoperability of reactive libraries.
This means that you can use an Akka Stream in a reactive Spring app, and Spring will connect it to its internal Reactor streams, or make conversions if you expose different reactive types, e.g. in your controllers.
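As a rough sketch of that interoperability (assuming the Akka 2.4.x javadsl API), an Akka Stream can be materialized as a Reactive Streams Publisher and then consumed as a Reactor Flux, for example to return it from a WebFlux controller:

import akka.actor.ActorSystem;
import akka.stream.ActorMaterializer;
import akka.stream.Materializer;
import akka.stream.javadsl.AsPublisher;
import akka.stream.javadsl.Sink;
import akka.stream.javadsl.Source;

import org.reactivestreams.Publisher;

import reactor.core.publisher.Flux;

public class AkkaReactorInterop {

    public static void main(String[] args) {
        ActorSystem system = ActorSystem.create("interop");
        Materializer materializer = ActorMaterializer.create(system);

        // An Akka Stream materialized as a Reactive Streams Publisher ...
        Publisher<Integer> publisher = Source.range(1, 5)
                .map(i -> i * 10)
                .runWith(Sink.asPublisher(AsPublisher.WITHOUT_FANOUT), materializer);

        // ... can be consumed as a Reactor Flux
        Flux.from(publisher)
                .doOnNext(System.out::println)
                .blockLast();

        system.terminate();
    }
}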

Resources