Kafka connect with EventStoreDB - apache-kafka-connect

I'm working on a small academic project: event sourcing with EventStoreDB, with Apache Kafka as the broker. The idea is to get events from EventStoreDB and push them to Kafka for further distribution. I saw that Apache Kafka has connectors for different DB systems, but I didn't find any connector for EventStoreDB.
How can I create a Kafka connector for EventStoreDB (by coding one or using an existing one), so that these two systems can transfer events both ways, from Kafka to EventStoreDB and from EventStoreDB to Kafka?

There is no official Kafka Connect connector between Kafka and EventStoreDB, and I haven't heard of any unofficial one so far. Still, there is a tool called Replicator that can replicate data from EventStoreDB to Kafka (https://replicator.eventstore.org/docs/features/sinks/kafka/). It's open source, so you can either use it or check the implementation.
For EventStoreDB to Kafka, I recommend using the subscriptions mechanism: catch-up if you need an ordering guarantee, persistent if ordering is not critical: https://developers.eventstore.com/clients/grpc/subscriptions.html. The crucial part here is to define how to map EventStoreDB streams to Kafka topics and partitions. Typically you'd expect at least an ordering guarantee at the stream level, so events from a single stream should land in the same partition.
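To make the stream-to-partition mapping concrete, here is a minimal sketch. It assumes a hypothetical convention where the topic is the stream's category (the part of the stream name before the first "-") and the stream name is used as the record key. CRC32 is only a stand-in for illustration; Kafka's default partitioner hashes the key with murmur2, and in practice setting the key and letting the producer pick the partition is usually enough. All function names here are made up for the example.

```python
import zlib


def topic_for_stream(stream_name: str) -> str:
    """Assumed convention: 'order-123' belongs to topic 'order'."""
    return stream_name.split("-", 1)[0]


def partition_for_stream(stream_name: str, num_partitions: int) -> int:
    """Map a stream name to a partition so one stream always hits one partition."""
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    # CRC32 is a stand-in hash; Kafka's default partitioner uses murmur2.
    return zlib.crc32(stream_name.encode("utf-8")) % num_partitions


def to_kafka_record(stream_name: str, event_type: str, data: bytes,
                    num_partitions: int) -> dict:
    """Describe the Kafka record that an event from `stream_name` would become."""
    return {
        "topic": topic_for_stream(stream_name),
        # Using the stream name as key keeps per-stream ordering in Kafka.
        "key": stream_name.encode("utf-8"),
        "partition": partition_for_stream(stream_name, num_partitions),
        "value": data,
        "headers": [("event-type", event_type.encode("utf-8"))],
    }
```

Two events read from the same stream get the same key and therefore the same partition, which is what preserves their relative order on the Kafka side.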
For Kafka to EventStoreDB integration, you could either write your own pass-through service or try to use the HTTP sink connector (e.g. https://docs.confluent.io/kafka-connect-http/current/overview.html). EventStoreDB exposes an HTTP API (https://developers.eventstore.com/clients/http-api/v5/introduction/). Side note: this API (AtomPub-based) may be replaced with another HTTP API in the future, so the structure may change.
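A pass-through service would essentially turn each Kafka record into an append request. The sketch below only builds the request and does not send it. It assumes the request shape described in the v5 HTTP API docs linked above (POST to /streams/{stream} with the application/vnd.eventstore.events+json media type); check the docs for your server version, since the API may change. The helper names and the idea of deriving the event ID from the Kafka coordinates are illustrative assumptions, not part of either product.

```python
import json
import urllib.parse
import urllib.request
import uuid

# Media type taken from the v5 HTTP API docs; verify for your version.
EVENTS_MEDIA_TYPE = "application/vnd.eventstore.events+json"


def event_id_for(topic: str, partition: int, offset: int) -> str:
    """Deterministic ID, so a retried Kafka record carries the same event ID."""
    return str(uuid.uuid5(uuid.NAMESPACE_URL, f"kafka://{topic}/{partition}/{offset}"))


def build_append_request(base_url: str, stream: str, event_type: str,
                         data: dict, event_id: str) -> urllib.request.Request:
    """Build (but do not send) an HTTP request appending one event to a stream."""
    body = [{"eventId": event_id, "eventType": event_type, "data": data}]
    url = f"{base_url.rstrip('/')}/streams/{urllib.parse.quote(stream, safe='')}"
    return urllib.request.Request(
        url=url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": EVENTS_MEDIA_TYPE},
        method="POST",
    )
```

Against a running server, the request could be sent with urllib.request.urlopen(request); authentication and error handling are left out of the sketch.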

You can use Event Store Replicator, which has a Kafka sink.
Keep in mind that it doesn't do anything with regard to event schemas, so things like Kafka Streams and KSQL might not work properly.
The sink was created solely for pushing events to Kafka when Kafka is used as a message broker.

Related

kafka streams - can I use kafka streams processing in cases where the source is not a kafka topic?

I have an application (call it smscb-router) as shown in the diagram.
It reads data from a legacy system (sms).
Based on the content (callback type), I have to put each record into the corresponding outgoing topic (such as billing-n-cdr, dr-cdr, ...).
I think the Streams API is better suited to this case, as it has map functionality to do the content mapping check. What I am unsure about is whether I can read source data from a non-Kafka-topic source.
All the examples I see in blogs explain streaming apps in the context of reading from a source topic and writing to other destination topics.
So, is it possible to read from a non-topic source, such as a Redis store or a message queue such as RabbitMQ?
We had a recent implementation where we had to poll an .xml file from a network-attached drive and convert it into Kafka events, i.e. publish each record to an output topic. We wouldn't even call that something developed with the Streams API; it is just a Kafka producer component.
Java File Poller Module (Quartz, time-based) -> XML Schema Management -> Kafka Producer Component -> Output Topic (Kafka Broker).
You get all the native features of the Kafka Producer API in terms of retries, and you can use producer.send(...).get() (synchronous) or producer.send(...) with a callback (asynchronous).
Hope this helps. The Streams API is meant for larger, more complex processing, such as normalizing data with stateful operations.
Thanks,
Christopher
Kafka Streams is only about topic-to-topic data streaming.
All external systems should be integrated by another method.
Ideally Kafka Connect, for example with this connector:
https://docs.confluent.io/kafka-connect-rabbitmq-source/current/overview.html
You may also use a manual consumer for the first step, but it is always better to reuse the availability mechanisms built into Kafka Connect (no code, just some JSON config).
In your diagram, I would recommend adding one topic and one producer or one connector in front of your pink component; then it can become a fully standard Kafka Streams microservice.
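The content-based routing from the question does not need Kafka Streams when the source is not a topic; it can be sketched as a plain producer-side function. In this sketch the topic names billing-n-cdr and dr-cdr come from the question, while the callback type values, the fallback topic, and the `send` callable (standing in for a real producer's send method) are assumptions for illustration.

```python
from typing import Callable, Dict, Iterable

# Hypothetical callback types; only the topic names come from the question.
TOPIC_BY_CALLBACK_TYPE: Dict[str, str] = {
    "billing": "billing-n-cdr",
    "delivery-report": "dr-cdr",
}
# Assumed fallback topic for records with an unknown callback type.
DEFAULT_TOPIC = "unrouted-callbacks"


def route(record: dict) -> str:
    """Pick the outgoing topic from the record's callback type."""
    return TOPIC_BY_CALLBACK_TYPE.get(record.get("callback_type"), DEFAULT_TOPIC)


def forward(records: Iterable[dict], send: Callable[[str, dict], None]) -> Dict[str, int]:
    """Send each record to its topic; return how many records went to each topic."""
    counts: Dict[str, int] = {}
    for record in records:
        topic = route(record)
        send(topic, record)  # in a real app: a Kafka producer call
        counts[topic] = counts.get(topic, 0) + 1
    return counts
```

The same `route` function could later be reused inside a Kafka Streams topology if the legacy source is first landed in a topic, as the answer above suggests.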

Using Kafka without RabbitMQ

I am testing MassTransit with Kafka, without RabbitMQ.
I need an IBusControl; can I use bus.UsingInMemory()?
Is it safe to use for Kafka ITopicProducers?
Will features like scheduling/sagas work with Kafka or with the in-memory bus?
You can use an in-memory bus with Kafka, but realize that the in-memory bus is not durable. The only reason you would want to use it is because you want to consume/produce Kafka messages without using an accompanying broker.
Sagas work with Kafka just fine, since sagas are a type of consumer.
Scheduling is not supported with Kafka, since it isn't a broker. Using scheduling with in-memory is not recommended outside of unit tests.

Spring Kafka JDBC Connector compatibility

Is the Kafka Connect JDBC connector compatible with the Spring-Kafka library?
I followed https://www.confluent.io/blog/kafka-connect-deep-dive-jdbc-source-connector/ and am still somewhat confused.
Let's say you want to consume from a Kafka topic and write to a JDBC database. Some of your options are:
1. Use a plain Kafka consumer to consume from the topic and use the JDBC API to write the consumed records to the database.
2. Use Spring Kafka to consume from the Kafka topic and Spring JdbcTemplate or Spring Data to write to the database.
3. Use Kafka Connect with the JDBC connector as a sink to read from the topic and write to a table.
So, as you can see:
The Kafka JDBC connector is a specialised component that can only do one job.
The Kafka consumer is a very generic component that can do many jobs, but you will be writing a lot of code. In fact, it is the foundational API that other frameworks build on and specialise.
Spring Kafka simplifies this and lets you deal with Kafka records as Java objects, but it doesn't tell you how to write those objects to your database.
So they are alternative solutions for the same task. Having said that, you may have a flow where different segments are controlled by different teams; for each segment any of them can be used, and the Kafka topic acts as the joining channel.
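To show what the "plain consumer plus database API" option means in terms of hand-written code, here is a rough sketch of the write side only. It uses Python's built-in sqlite3 purely as a stand-in for a JDBC database, and the table name, column names, and record shape are all hypothetical. Records are keyed by topic, partition and offset so that a redelivered batch overwrites rows instead of duplicating them.

```python
import sqlite3
from typing import Iterable, Tuple

# Hypothetical record shape: (topic, partition, offset, payload).
Record = Tuple[str, int, int, str]


def create_table(conn: sqlite3.Connection) -> None:
    conn.execute(
        """CREATE TABLE IF NOT EXISTS kafka_events (
               kafka_topic TEXT NOT NULL,
               kafka_partition INTEGER NOT NULL,
               kafka_offset INTEGER NOT NULL,
               payload TEXT NOT NULL,
               PRIMARY KEY (kafka_topic, kafka_partition, kafka_offset)
           )"""
    )


def write_batch(conn: sqlite3.Connection, records: Iterable[Record]) -> None:
    """Write one polled batch in a single transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.executemany(
            # INSERT OR REPLACE is SQLite syntax; other databases use their own upsert.
            "INSERT OR REPLACE INTO kafka_events VALUES (?, ?, ?, ?)",
            list(records),
        )
```

In a real consumer loop, offsets would be committed only after `write_batch` returns; with the JDBC sink connector, this kind of code is replaced by configuration.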

Is it possible sending websocket messages to a kafka topic?

I am trying to find a way to consume messages that are being sent over a websocket and write them to a Kafka topic (the messages are sent by the websocket to the address 'ws://address:port/topic_name', and I want to add all of those messages to a Kafka topic).
I read about Kafka Connect and tried to find a way to do it with that, but it doesn't seem to work...
Thanks in advance :)
There is no Kafka connector for a socket in Confluent Platform.
I work in a team that uses Kafka in production and our source is a socket, so your options are to use a platform that supports this socket->Kafka producing, or to write one yourself.
As for possible platforms, I think most of them will be overkill, though you can use them for this problem. Some options are:
1. NiFi or MiNiFi for smaller loads; use the PublishKafka processor
2. StreamSets with the Kafka Producer destination
3. Apache Flume: not really recommended, as the project is no longer evolving.
If you wish to write your own producer, you basically have to create a listener on this port and produce the incoming messages to Kafka; if this is a web socket, just get the payload of the requests and produce them to Kafka.
Example Kafka producer code can be copied from the tutorialspoint simple producer example.
Here are some examples of open-source projects:
1. https://github.com/DataReply/kafka-connect-socket-source
2. https://github.com/kafka-socket/miniature_engine
3. https://github.com/dhanuka84/kafka-connect-tcp
4. https://github.com/krux/tcp-stream-kafka-producer
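The "write your own producer" option above can be sketched roughly as follows. This is a simplification under stated assumptions: it listens on a plain TCP socket and treats each newline-terminated line as one message, so it does not speak the WebSocket protocol (a real websocket source would need a WebSocket library for the handshake and framing). The `produce` callable is a placeholder for a real Kafka producer's send call, and the names and port are made up for the example.

```python
import socketserver
from typing import Callable, Iterable

Produce = Callable[[str, bytes], None]


def forward_lines(lines: Iterable[bytes], produce: Produce, topic: str) -> int:
    """Produce each non-empty line to `topic`; return the number of messages sent."""
    count = 0
    for raw in lines:
        payload = raw.rstrip(b"\r\n")
        if payload:
            produce(topic, payload)
            count += 1
    return count


def make_handler(produce: Produce, topic: str):
    """Build a request handler class that forwards every received line."""

    class Handler(socketserver.StreamRequestHandler):
        def handle(self) -> None:
            # self.rfile yields the connection's data line by line.
            forward_lines(self.rfile, produce, topic)

    return Handler


def serve(host: str, port: int, produce: Produce, topic: str) -> None:
    """Listen forever; each client connection is handled in its own thread."""
    with socketserver.ThreadingTCPServer((host, port), make_handler(produce, topic)) as server:
        server.serve_forever()
```

For example, `serve("0.0.0.0", 9000, produce, "topic_name")` would forward everything received on port 9000, where `produce` wraps whichever Kafka client you use.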
The idea of Kafka Connect is that you have some sort of external integration that serves as storage. This can be SAP, Salesforce, an RDBMS, an MQ or anything else that has state. Your websocket endpoint does not hold data and you cannot poll it; someone else invokes it, and that is how the data is transferred. Now, if you know who is actually holding the data, then you can potentially build a connector using this guide: https://docs.confluent.io/current/connect/devguide.html
For your particular case, the best you can do is either to use the Kafka Producer API https://docs.confluent.io/current/clients/producer.html
and use this producer from your websocket endpoint to post messages to the topic, or, even better, if you are using Spring, use a higher-level abstraction, KafkaTemplate: https://docs.spring.io/spring-kafka/reference/html/#sending-messages.
Full disclosure: I work for MigratoryData.
You can check out MigratoryData's solution for Kafka. MigratoryData is a scalable WebSocket server. The MigratoryData Source/Sink Connector for Kafka makes use of Kafka Connect API and can be used to stream data in real-time from Kafka to WebSocket clients and vice versa. The main advantage of the solution is it extends Kafka messaging to WebSocket clients while preserving Kafka's key features like guaranteed delivery, message ordering, etc.

Connection between Apache Kafka and JMS

I was wondering: could Apache Kafka communicate with and send messages to JMS? Can I establish a connection between them? For example, I'm using JMS in my system and it should send messages to another system that uses Kafka.
Answering a bit late, but if I understood the requirement correctly:
If the requirement is synchronous messaging, like
client->JMS->Kafka --- > consumer
then the following is not the solution, but if it is (most likely) an async requirement like:
client->JMS | ----> Kafka ---> consumer
then this is a job for the Kafka Connect framework, which solves the problem of integrating different sources and sinks with Kafka.
http://docs.confluent.io/2.0.0/connect/
http://www.confluent.io/product/connectors
So what you need is a JMSSourceConnector.
Not directly. And the two are incomparable concepts. JMS is a vendor-neutral API specification of a messaging service.
While Kafka may be classified as a messaging service, it is not compatible with the JMS API, and to the best of my knowledge there is no trivial way of adapting JMS to fit Kafka's use cases without making significant compromises.
However, if your needs are simply to move messages between Kafka and a JMS-compliant broker, then this can easily be achieved either by writing a simple relay app that consumes from one and publishes onto the other, or by using something like Kafka Connect, which has pre-canned connectors for many systems, including JMS brokers, databases, etc.
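The "simple relay app" idea boils down to a receive, publish, acknowledge loop. The sketch below is deliberately generic: `receive`, `publish` and `acknowledge` are hypothetical callables standing in for the real JMS and Kafka client calls. The one design choice it illustrates is ordering: the source message is acknowledged only after the publish succeeds, which gives at-least-once delivery (duplicates are possible after a crash, losses are not).

```python
from typing import Callable, Optional

Receive = Callable[[], Optional[bytes]]   # returns None when nothing is pending
Publish = Callable[[bytes], None]         # may raise if the target is unavailable
Acknowledge = Callable[[bytes], None]     # confirms the message to the source broker


def relay_once(receive: Receive, publish: Publish, acknowledge: Acknowledge) -> bool:
    """Move one message from source to target; return False if none was pending."""
    message = receive()
    if message is None:
        return False
    publish(message)       # if this raises, the message is never acknowledged
    acknowledge(message)   # acknowledged only after a successful publish
    return True


def relay_all(receive: Receive, publish: Publish, acknowledge: Acknowledge) -> int:
    """Relay until the source has nothing pending; return the number relayed."""
    count = 0
    while relay_once(receive, publish, acknowledge):
        count += 1
    return count
```

The same loop works in either direction, JMS to Kafka or Kafka to JMS; only the three callables change.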
If the requirement is the reverse of the previous answer:
Kafka Producer -> Kafka Broker -> JMS Broker -> JMS Consumer
then you would need a Kafka Connect sink like the following one from DataMountaineer:
http://docs.datamountaineer.com/en/latest/jms.html
