Kafka Source in Spring Cloud Data Flow - spring-xd

I am migrating from Spring XD to Spring Cloud Data Flow. When I am looking for module list I realised that some of the sources are not listed in Spring Cloud Flow - One of them is KAFKA source.
My question is why KAFKA source is removed from standard sources list in spring cloud data flow ?

When I am looking for module list I realised that some of the sources are not listed in Spring Cloud Flow
Majority of the applications are ported over and the remaining are incrementally prioritized - you can keep track of the remaining subset in the backlog.
My question is why KAFKA source is removed from standard sources list in spring cloud data flow ?
Kafka is not removed and in fact, we are highly opinionated about Kafka in the context of streaming use-cases so much so that it is baked into the DSL directly. More details here.
For instance,
(i) if you've to consume from a Kafka topic (as a source), your stream definition would be:
stream create --definition ":someAwesomeTopic > log" --name subscribe_to_broker --deploy
(ii) if you've to write to a Kafka topic (as a sink), your stream definition would be:
stream create --definition "http --server.port=9001 > :someAwesomeTopic" --name publish_to_broker --deploy
(where *someAwesomeTopic* is the named destination, a topic name)

Related

kafka streams - can I use kafka streams processing in cases where the source is not a kafka topic?

I have an application (call it smscb-router) as shown in the diagram.
It reads data from a legacy system (sms).
Based on the content (callback type), I have to put into corresponding outgoing topic (such as billing-n-cdr, dr-cdr, ...)
I think streams API is better suited in this case, as it has the map functionality to do the content mapping check. What I am unsure is, can I read source data from a non-kafka-topic source.
All the examples that I see on the internet blogs, explain steaming apps with the context of reading from a source topic and put to other destination topics.
So, is this possible to read from a non-topic source, such as say a redis store, or a message queue such as RabbitMQ?
We had a recent implementation, where we had to poll an .xml file from a network attached drive and convert it into the KAFKA Events i.e. publishing each record into an output topic. In such, we wont even call it as something we have developed using a Streams API, but it is just a KAFKA Producer Component.
Java File Poller Module (Quartz time based) -> XML Schema Management -> KAFKA Producer Component -> Output Topic (KAFKA Broker).
And you will get all native features of KAKFA Producer API in terms of retries and you can use producer.send (Sync) or producer.send.get(Asyn) with call-back.
Hope this helps. Streams API is meant for big and something very complex that to be normalized through using Stateful operations.
Thanks,
Christopher
Kafka Streams is only about Topic to Topic Data Streaming
All external system should be integrated by another method :
Ideally Kafka Connect : for example with this one :
https://docs.confluent.io/kafka-connect-rabbitmq-source/current/overview.html
You may also use a manual consumer for the first step, but it always better to reuse all availability mecanism built in Kafka Connect. (No code, just some Json config).
In your schema i would recommend to add one topic and one producer or one connector in front of your Pink Component, then it can become a fully standard Kafka Streams microservice.

What is the difference between spring-kafka and Apache-Kafka-Streams-Binder regarding the interaction with Kafka Stream API?

My understanding was that spring-kafka was created to interact with Kafka Client APIs, and later on, spring-cloud-stream project was created for "building highly scalable event-driven microservices connected with shared messaging systems", and this project includes a couple of binders, one of them is a binder that allows the interaction with Kafka Stream API:
spring-cloud-stream-binder-kafka-streams
So it was clear to me that if I want to interact with Kafka Stream API, I will use the spring-cloud-stream approach with the appropriate binder.
But, I found out that you can interact with Kafka Stream API also with the spring-kafka approach.
Need the below two dependencies. One example is here.
'org.springframework.kafka:spring-kafka'
'org.apache.kafka:kafka-streams'
So my question is - if both the approaches allow interaction with Kafka Stream API, what are the differences between the approaches?
As Gary pointed out in the comments above, spring-kafka is the lower-level library that provides the building blocks for Spring Cloud Stream Kafka Streams binder (spring-cloud-stream-binder-kafka-streams). The binder provides a programming model to write your Kafka Streams processor as a java.util.function.Function or java.util.function.Consumer.
You can have multiple such functions and each of them will build its own Kafka Streams topologies. Behind the scenes, the binder uses Spring-Kafka to build the Kafka Streams StreamsBuilder object using the StreamsBuilderFactoryBean. Binder also allows you to compose various functions. The functional model comes largely from Spring Cloud Function, but it is adapted for the Kafka Streams in the binder implementation. The short answer is that both spring-Kafka and the Spring Cloud Stream Kafka Streams binder will work, but the binder gives a programming model and extra features consistent with Spring Cloud Stream whereas spring-kafka gives various low-level building blocks.

Spring Project Reactor for Kinesis

I have a micro service that consumes data from Kinesis stream , I want to convert it into reactive stream consumer since application is already written using spring-boot 2.x. I was going through support of Project Rector for Kinesis and found few example like aws sdk example.
I want to know if there is better way to do this. There are dedicated projects for streaming with Kafka and project reactor , is there any such thing for Kinesis ?

Spring Cloud Stream vs Spring AMQP

I'm developing an application that consumes messages from an exchange and it can publish to one of the multiple exchanges based on the input message transformation result.
I am trying to decide whether to go with sprrimg amqp or spring cloud stream.
What would be apt for this scenario?
Spring Cloud Stream (its Rabbit Binder) is a higher-level abstraction on top of Spring AMQP.
It is more opinionated and performs some configuration automatically.

Spring-Kafka vs. Spring-Cloud-Stream (Kafka)

Using Kafka as a messaging system in a microservice architecture what are the benefits of using spring-kafka vs. spring-cloud-stream + spring-cloud-starter-stream-kafka ?
The spring cloud stream framework supports more messaging systems and has therefore a more modular design. But what about the functionality ? Is there a gap between the functionality of spring-kafka and spring-cloud-stream + spring-cloud-starter-stream-kafka ?
Which API is better designed?
Looking forward to read about your opinions
Spring Cloud Stream with kafka binder rely on Spring-kafka. So the former has all functionalities supported by later, but the former will be more heavyweight. Below are some points help you make the choice:
If you might change kafka into another message middleware in the future, then Spring Cloud stream should be your choice since it hides implementation details of kafka.
If you want to integrate other message middle with kafka, then you should go for Spring Cloud stream, since its selling point is to make such integration easy.
If you want to enjoy the simplicity and not accept performance overhead, then choose spring-kafka
If you plan to migrate to public cloud service such as AWS Kensis, Azure EventHub, then use spring cloud stream which is part of spring cloud family.
Use Spring Cloud Stream when you are creating a system where one channel is used for input does some processing and sends it to one output channel. In other words it is more of an RPC system to replace say RESTful API calls.
If you plan to do an event sourcing system, use Spring-Kafka where you can publish and subscribe to the same stream. This is something that Spring Cloud Stream does not allow you do do easily as it disallows the following
public interface EventStream {
String STREAM = "event_stream";
#Output(EventStream.STREAM)
MessageChannel publisher();
#Input(EventStream.STREAM)
SubscribableChannel stream();
}
A few things that Spring Cloud Stream helps you avoid doing are:
setting up the serializers and deserializers

Resources