I'm using the Spring Cloud Stream library in a Java application, and I want to use the Kafka Streams binder for a state store. The application will post messages to a topic, and I want to use the Kafka Streams InteractiveQueryService to retrieve data from that same topic. Is it possible to perform such queries as-is, or do I need to first consume the topic as a KTable/KStream and materialize it before I can perform queries? I don't have any requirement to perform KTable/KStream processing on the topic; I just want to query the topic contents. I'm hoping there is some way to implicitly materialize it as a state store.
Interactive Queries is a feature that allows you to query client-side state stores. It's not a feature that allows you to query topics.
Hence, if you have data in a topic that you want to query using Interactive Queries, you need to load the data into a state store within Kafka Streams.
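For what it's worth, here is a minimal sketch of that approach with the Kafka Streams binder (the store name, binding name, and class names are placeholders of mine): consume the topic with a no-op function that materializes it into a named store, then query the store through the binder's InteractiveQueryService.

```java
import java.util.function.Consumer;
import org.apache.kafka.common.utils.Bytes;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.state.KeyValueStore;
import org.apache.kafka.streams.state.QueryableStoreTypes;
import org.apache.kafka.streams.state.ReadOnlyKeyValueStore;
import org.springframework.cloud.stream.binder.kafka.streams.InteractiveQueryService;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.stereotype.Service;

@Configuration
class TopicStoreConfig {

    // No processing: just pipe the topic into a queryable state store.
    // Bind the input with:
    //   spring.cloud.stream.bindings.materialize-in-0.destination=<your-topic>
    @Bean
    public Consumer<KStream<String, String>> materialize() {
        return stream -> stream.toTable(
                Materialized.<String, String, KeyValueStore<Bytes, byte[]>>as("topic-store"));
    }
}

@Service
class TopicQueryService {

    private final InteractiveQueryService interactiveQueryService;

    TopicQueryService(InteractiveQueryService interactiveQueryService) {
        this.interactiveQueryService = interactiveQueryService;
    }

    // Look up the latest value for a key directly from the local state store.
    public String lookup(String key) {
        ReadOnlyKeyValueStore<String, String> store = interactiveQueryService
                .getQueryableStore("topic-store", QueryableStoreTypes.keyValueStore());
        return store.get(key);
    }
}
```

Note that in a multi-instance deployment the key may live on another instance, so you may need host discovery (the InteractiveQueryService's getHostInfo-style lookup) on top of this.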
Related
I have a problem adding a custom metric in Kafka Streams.
I built a Kafka Streams application with Spring Boot like this (Kafka Streams with Spring Boot, Baeldung)
and deployed several instances of this app on Kubernetes.
I want to know the average number of processed messages per second for each app instance, and that value exists in the Kafka Streams built-in thread metrics (process-rate). (ref. Kafka Streams Metrics)
But that metric uses thread-id as a tag key, so each app instance ends up with a different metric tag key.
I'd like to expose that metric value under the same tag key in every app instance.
So I came up with a solution: use that built-in metric value to register a new custom metric.
But there is no specific information about how to read the built-in metric values in source code and add a custom metric.
The reference shows a way to add custom metrics, but nothing specific about how to apply it in source code.
Is there a way to solve this problem, or is there some other way?
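Concretely, this is roughly what I have in mind (the metric name, tag key, and POD_NAME environment variable are placeholders I made up): read the built-in per-thread process-rate values via KafkaStreams#metrics() and expose their sum as a single Micrometer gauge with a stable per-instance tag. Is this a reasonable approach?

```java
import io.micrometer.core.instrument.Gauge;
import io.micrometer.core.instrument.MeterRegistry;
import org.apache.kafka.streams.KafkaStreams;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.kafka.config.StreamsBuilderFactoryBean;
import org.springframework.stereotype.Component;

@Component
public class ProcessRateAggregator {

    public ProcessRateAggregator(StreamsBuilderFactoryBean factoryBean,
                                 MeterRegistry registry,
                                 @Value("${POD_NAME:unknown}") String podName) {
        // Sum the built-in per-thread process-rate metrics and expose the
        // total under one stable tag per app instance (instead of thread-id).
        Gauge.builder("app.kafka.streams.process.rate", () -> {
                KafkaStreams streams = factoryBean.getKafkaStreams();
                if (streams == null) {
                    return 0.0; // topology not started yet
                }
                return streams.metrics().values().stream()
                        .filter(m -> "process-rate".equals(m.metricName().name())
                                && "stream-thread-metrics".equals(m.metricName().group()))
                        .mapToDouble(m -> ((Number) m.metricValue()).doubleValue())
                        .sum();
            })
            .tag("instance", podName) // stable tag key, e.g. the pod name
            .register(registry);
    }
}
```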
I have an application (call it smscb-router) as shown in the diagram.
It reads data from a legacy system (sms).
Based on the content (callback type), I have to put it into the corresponding outgoing topic (such as billing-n-cdr, dr-cdr, ...)
I think the Streams API is better suited in this case, as it has the map functionality to do the content-mapping check. What I am unsure of is whether I can read source data from a non-Kafka-topic source.
All the examples that I see in blogs on the internet explain streaming apps in the context of reading from a source topic and writing to other destination topics.
So, is it possible to read from a non-topic source, such as, say, a Redis store, or a message queue such as RabbitMQ?
We had a recent implementation where we had to poll an .xml file from a network-attached drive and convert it into Kafka events, i.e. publishing each record into an output topic. As such, we wouldn't even call it something developed using the Streams API; it is just a Kafka producer component.
Java file poller module (Quartz, time-based) -> XML schema management -> Kafka producer component -> output topic (Kafka broker).
And you get all the native features of the Kafka Producer API in terms of retries, and you can use producer.send() with a callback (async) or producer.send(...).get() (sync).
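As a rough illustration (the topic name and payloads are placeholders), the producer component boils down to something like this:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class XmlRecordPublisher {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        props.put(ProducerConfig.RETRIES_CONFIG, 5); // native retry handling

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Asynchronous send with a callback
            producer.send(new ProducerRecord<>("output-topic", "record-1", "<record>...</record>"),
                    (metadata, exception) -> {
                        if (exception != null) {
                            exception.printStackTrace(); // handle/route the failure
                        }
                    });

            // Synchronous send: block until the broker acknowledges
            producer.send(new ProducerRecord<>("output-topic", "record-2", "<record>...</record>"))
                    .get();
        }
    }
}
```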
Hope this helps. The Streams API is meant for large and more complex processing that needs stateful operations.
Thanks,
Christopher
Kafka Streams is only about topic-to-topic data streaming.
All external systems should be integrated by other means:
Ideally Kafka Connect, for example with this one:
https://docs.confluent.io/kafka-connect-rabbitmq-source/current/overview.html
You may also use a manual consumer for the first step, but it is always better to reuse all the availability mechanisms built into Kafka Connect (no code, just some JSON config).
In your diagram I would recommend adding one topic and one producer, or one connector, in front of your pink component; then it can become a fully standard Kafka Streams microservice.
I'm working on a small academic project: event sourcing with EventStoreDB and Apache Kafka as a broker. The idea is to get events from EventStoreDB and push them to Kafka for further distribution. I saw that Apache Kafka has connectors to different DB systems, but I didn't find any connector for EventStoreDB.
How can I create a Kafka connector to EventStoreDB (code one, or use an existing one), so these two systems would be able to transfer events both ways, from Kafka to EventStoreDB and from EventStoreDB to Kafka?
There is no official Kafka Connect connector between Kafka and EventStoreDB, and I haven't heard of any unofficial one so far. Still, there is a tool called Replicator that enables replicating data from EventStoreDB to Kafka (https://replicator.eventstore.org/docs/features/sinks/kafka/). It's open-sourced, so you can either use it or check the implementation.
For the EventStoreDB-to-Kafka direction, I recommend using the subscriptions mechanism: catch-up if you need an ordering guarantee, persistent if ordering is not critical: https://developers.eventstore.com/clients/grpc/subscriptions.html. The crucial part here is to define how to map EventStoreDB streams to Kafka topics and partitions. Typically you'd expect to have at least an ordering guarantee on the stream level, so a single stream's events should land in the same partition.
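A rough sketch of that direction, assuming the EventStoreDB Java gRPC client (com.eventstore.dbclient) and a plain Kafka producer; the topic name is a placeholder, and the exact client signatures should be double-checked against the current client docs. Keying each record by the stream id keeps a single stream's events in one partition:

```java
import java.util.Properties;
import com.eventstore.dbclient.*;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.ByteArraySerializer;
import org.apache.kafka.common.serialization.StringSerializer;

public class EsdbToKafkaForwarder {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, ByteArraySerializer.class);
        KafkaProducer<String, byte[]> producer = new KafkaProducer<>(props);

        EventStoreDBClient client = EventStoreDBClient.create(
                EventStoreDBConnectionString.parseOrThrow("esdb://localhost:2113?tls=false"));

        // Catch-up subscription to $all; forward each event to Kafka.
        client.subscribeToAll(new SubscriptionListener() {
            @Override
            public void onEvent(Subscription subscription, ResolvedEvent event) {
                RecordedEvent recorded = event.getEvent();
                if (recorded.getStreamId().startsWith("$")) {
                    return; // skip system events
                }
                producer.send(new ProducerRecord<>(
                        "esdb-events",          // placeholder topic
                        recorded.getStreamId(), // key = stream id -> stable partition
                        recorded.getEventData()));
            }
        });
    }
}
```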
For the Kafka-to-EventStoreDB integration, you could either write your own pass-through service or try to use the HTTP sink connector (e.g. https://docs.confluent.io/kafka-connect-http/current/overview.html). EventStoreDB exposes an HTTP API (https://developers.eventstore.com/clients/http-api/v5/introduction/). Side note: this API (AtomPub-based) may be replaced with another HTTP API in the future, so the structure may change.
You can use Event Store Replicator, which has a Kafka sink.
Keep in mind that it doesn't do anything with regard to the event schema, so things like Kafka Streams and KSQL might not work properly.
The sink was created solely for the purpose of pushing events to Kafka being used as a message broker.
I'm new to Spring Cloud Stream.
Say I have a Spring Cloud Stream app that listens to some topic from Kafka using @StreamListener("input-channel").
I want to do some calculation and send the result to another topic, but in the middle of the processing I also need to call Hibernate (via Spring Data JPA) to persist some data to my MySQL database.
Is it valid to call Hibernate in the middle of stream processing? Is there another pattern for doing this?
Yes, it's a database call, so why not? People do it all the time.
Also, @StreamListener has been deprecated for 3 years now and is already removed from the newer versions, so please transition to the functional programming model.
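For reference, a minimal sketch of the functional equivalent (the entity and repository are hypothetical placeholders), with the JPA save happening in the middle of the processing:

```java
import java.util.function.Function;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class ProcessorConfig {

    private final MyEntityRepository repository; // hypothetical Spring Data JPA repository

    public ProcessorConfig(MyEntityRepository repository) {
        this.repository = repository;
    }

    // Bind with:
    //   spring.cloud.stream.bindings.process-in-0.destination=<input-topic>
    //   spring.cloud.stream.bindings.process-out-0.destination=<output-topic>
    @Bean
    public Function<String, String> process() {
        return payload -> {
            String result = payload.toUpperCase(); // stand-in for your real calculation
            repository.save(new MyEntity(result)); // persist via Hibernate/JPA mid-stream
            return result;                         // goes to the output topic
        };
    }
}
```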
I'm very new to Akka Streams and reactive streaming. I have a question: is it possible to have a REST API that receives a message and drops it on the Kafka bus, and a Kafka streaming consumer that then aggregates the messages in a max. time window and returns the answer back?
How would I implement such a system? Or where should I start?
Thanks
For a REST API you can consider the Kafka REST Proxy: https://github.com/confluentinc/kafka-rest
Of course, you can instead build your own using akka-http and akka-stream-kafka.
As to windowing, I'm sure it can be done in Akka Streams, but personally I'd suggest using Kafka Streams as the first port of call:
http://docs.confluent.io/current/streams/developer-guide.html#windowing
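To give a flavour of it, here is a minimal Kafka Streams sketch (topic and store names are made up): count messages per key in one-minute tumbling windows and emit the aggregates to an output topic that a REST layer could consume.

```java
import java.time.Duration;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.*;

public class WindowedAggregation {
    public static void buildTopology(StreamsBuilder builder) {
        builder.stream("incoming-messages", Consumed.with(Serdes.String(), Serdes.String()))
               .groupByKey()
               // One-minute tumbling windows per key
               .windowedBy(TimeWindows.ofSizeWithNoGrace(Duration.ofMinutes(1)))
               .count(Materialized.as("windowed-counts")) // queryable state store
               .toStream()
               // Flatten the windowed key into "key@windowStart" for downstream consumers
               .map((windowedKey, count) -> KeyValue.pair(
                       windowedKey.key() + "@" + windowedKey.window().start(),
                       String.valueOf(count)))
               .to("aggregated-messages", Produced.with(Serdes.String(), Serdes.String()));
    }
}
```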
I'm not sure what exactly you mean by returning the answer back, but if you follow the approach above, you can use the REST Proxy to consume the windowed-aggregated messages, or you can build a REST service that queries the Kafka Streams state stores via the so-called "interactive queries". This post shows how to do it using javax.ws.rs: https://www.confluent.io/blog/unifying-stream-processing-and-interactive-queries-in-apache-kafka/ but for a reactive application you can do the same using akka-http instead (I'm implementing this exact thing on one of my projects).