Using Kafka as a messaging system in a microservice architecture what are the benefits of using spring-kafka vs. spring-cloud-stream + spring-cloud-starter-stream-kafka ?
The spring cloud stream framework supports more messaging systems and has therefore a more modular design. But what about the functionality ? Is there a gap between the functionality of spring-kafka and spring-cloud-stream + spring-cloud-starter-stream-kafka ?
Which API is better designed?
Looking forward to read about your opinions
Spring Cloud Stream with kafka binder rely on Spring-kafka. So the former has all functionalities supported by later, but the former will be more heavyweight. Below are some points help you make the choice:
If you might change kafka into another message middleware in the future, then Spring Cloud stream should be your choice since it hides implementation details of kafka.
If you want to integrate other message middle with kafka, then you should go for Spring Cloud stream, since its selling point is to make such integration easy.
If you want to enjoy the simplicity and not accept performance overhead, then choose spring-kafka
If you plan to migrate to public cloud service such as AWS Kensis, Azure EventHub, then use spring cloud stream which is part of spring cloud family.
Use Spring Cloud Stream when you are creating a system where one channel is used for input does some processing and sends it to one output channel. In other words it is more of an RPC system to replace say RESTful API calls.
If you plan to do an event sourcing system, use Spring-Kafka where you can publish and subscribe to the same stream. This is something that Spring Cloud Stream does not allow you do do easily as it disallows the following
public interface EventStream {
String STREAM = "event_stream";
#Output(EventStream.STREAM)
MessageChannel publisher();
#Input(EventStream.STREAM)
SubscribableChannel stream();
}
A few things that Spring Cloud Stream helps you avoid doing are:
setting up the serializers and deserializers
Related
My understanding was that spring-kafka was created to interact with Kafka Client APIs, and later on, spring-cloud-stream project was created for "building highly scalable event-driven microservices connected with shared messaging systems", and this project includes a couple of binders, one of them is a binder that allows the interaction with Kafka Stream API:
spring-cloud-stream-binder-kafka-streams
So it was clear to me that if I want to interact with Kafka Stream API, I will use the spring-cloud-stream approach with the appropriate binder.
But, I found out that you can interact with Kafka Stream API also with the spring-kafka approach.
Need the below two dependencies. One example is here.
'org.springframework.kafka:spring-kafka'
'org.apache.kafka:kafka-streams'
So my question is - if both the approaches allow interaction with Kafka Stream API, what are the differences between the approaches?
As Gary pointed out in the comments above, spring-kafka is the lower-level library that provides the building blocks for Spring Cloud Stream Kafka Streams binder (spring-cloud-stream-binder-kafka-streams). The binder provides a programming model to write your Kafka Streams processor as a java.util.function.Function or java.util.function.Consumer.
You can have multiple such functions and each of them will build its own Kafka Streams topologies. Behind the scenes, the binder uses Spring-Kafka to build the Kafka Streams StreamsBuilder object using the StreamsBuilderFactoryBean. Binder also allows you to compose various functions. The functional model comes largely from Spring Cloud Function, but it is adapted for the Kafka Streams in the binder implementation. The short answer is that both spring-Kafka and the Spring Cloud Stream Kafka Streams binder will work, but the binder gives a programming model and extra features consistent with Spring Cloud Stream whereas spring-kafka gives various low-level building blocks.
I want to use Spring Cloud Stream for consuming and processing Apache Kafka queues and writing them to MongoDB. I saw that there is an option of using the library so that functions will be Reactive, or Imperative. In most Spring projects the imperative way is the default, but as for my understanding, in spring cloud stream the reactive paradigm is the default.
I wonder what is considered the most “stable” api e.g. what is recommended to use for enterprise?
Reactive API is stable and yes we provide support for it. In other words you can write functions using reactive API (e.g., Function<Flux, Flux>).
However, i want to be very clear that support for API does not mean support for the full stack of reactive capabilities since those actually depend on source and targets which are not reactive.
That said, with Kafka you can rely on native reactive support provided by Kafka itself and Spring Cloud Stream using Kafka Streams binder - https://docs.spring.io/spring-cloud-stream-binder-kafka/docs/3.1.5/reference/html/spring-cloud-stream-binder-kafka.html#_kafka_streams_binder
I'm developing an application that consumes messages from an exchange and it can publish to one of the multiple exchanges based on the input message transformation result.
I am trying to decide whether to go with sprrimg amqp or spring cloud stream.
What would be apt for this scenario?
Spring Cloud Stream (its Rabbit Binder) is a higher-level abstraction on top of Spring AMQP.
It is more opinionated and performs some configuration automatically.
What benefits does spring Kafka template provide?
I have tried the existing Producer/Consumer API by Kafka. That is very simple to use, then why use Kafka template.
Kafka Template internally uses Kafka producer so you can directly use Kafka APIs. The benefit of using Kafka template is it provides different methods for sending message to Kafka topic, kind of added benefits you can see the API comparison between KafkaProducer and KafkaTemplate here:
https://kafka.apache.org/10/javadoc/org/apache/kafka/clients/producer/KafkaProducer.html
https://docs.spring.io/spring-kafka/api/org/springframework/kafka/core/KafkaTemplate.html
You can see KafkaTemplate provide many additional ways of sending data to Kafka topics because of various send methods while some calls are the same as Kafka API and are simply forwarded from KafkaTemplate to KafkaProducer.
It's up to the developer what to use. If you feel like working with KafkaTemplate is easy as you don't have to create ProducerRecord a simple send method will do all the work for you.
At a high level, the benefit is that you can externalize your properties objects more easily and you can just focus on the record processing logic
Plus Spring is integrated with lots of other components.
Note: Other options still exist like Reactor Kafka, Alpakka, Apache Camel, Smallrye reactive messaging, Vert.x... But they all wrap the same Kafka API.
So, I'd say you're (marginally) trading efficiency for convinience
According to Spring release notes, spring-integration-aws.1.1.0.M1 does not include DynamoDB MetaDataStore implementation. There is still ConcurrentMetadataStore class which is a key-value based store and based on implementation I suppose it maps streams with latest sequence number read. But it does not use any data store as to retrieve checkpoints.
I am using spring integration for kinesis consuming and need to implement checkpointing. I am wondering if I need to do it manually by connecting to DynamoDB and always update checkpoints or there is another way of doing it using spring framework?
P.S: I can't use Spring Cloud KinesisBinderConfiguration as I dynamically consume events from a list of configurable streams.
Thank you
If you are not talking about Spring Cloud Stream and the AWS Kinesis Binder implementation, then I don't see any blockers for you to upgrade your solution to the Spring Integration AWS 2.0 and go ahead with already provided DynamoDbMetaDataStore.
Or if that is so hard for you to move to the Spring Integration 5.0, then you simply can consider to copy/paste an implementation to your own class and inject it into the KinesisMessageDrivenChannelAdapter: https://github.com/spring-projects/spring-integration-aws/blob/master/src/main/java/org/springframework/integration/aws/metadata/DynamoDbMetaDataStore.java
Although it is really available in the 1.1.0.RELEASE - I don't see reason for your to stick with the 1.1.0.M1: https://spring.io/blog/2017/11/27/spring-integration-for-aws-1-1-ga-available