Is there a way to get the client id in a processor? My thought is that it might be possible to build it from information in the ProcessorContext.
For example, "my_app-e2e751f2-7c99-484d-9a5b-172de63bc6e1-StreamThread-1"
The reason for this is that I want to add new metrics at the existing location:
kafka.streams->my_app-e2e751f2-7c99-484d-9a5b-172de63bc6e1-StreamThread-1->*
Is there a way to get the client id in a processor?
You can access the application.id (which represents the Kafka consumer group ID used by your Kafka Streams application) as well as the stream task id via the ProcessorContext:
ProcessorContext#applicationId()
ProcessorContext#taskId()
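For example, a minimal sketch (the class name is illustrative; the API shown matches the 2.1-era Processor interface):

```java
import org.apache.kafka.streams.processor.Processor;
import org.apache.kafka.streams.processor.ProcessorContext;

// Reads the application id and task id from the ProcessorContext in init().
public class MyProcessor implements Processor<String, String> {

    private ProcessorContext context;

    @Override
    public void init(ProcessorContext context) {
        this.context = context;
        String applicationId = context.applicationId(); // == consumer group id
        String taskId = context.taskId().toString();    // e.g. "0_1"
        System.out.println("applicationId=" + applicationId + ", taskId=" + taskId);
        // Note: the stream thread's name usually embeds the full client id
        // (e.g. "my_app-<uuid>-StreamThread-1"), so Thread.currentThread().getName()
        // may give you the exact string you showed.
    }

    @Override
    public void process(String key, String value) {
        // regular processing logic
    }

    @Override
    public void close() {}
}
```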
See the Apache Kafka 2.1 docs for more information:
https://kafka.apache.org/21/javadoc/org/apache/kafka/streams/processor/ProcessorContext.html
https://kafka.apache.org/21/documentation/streams/developer-guide/config-streams.html#application-id
Is that what you need?
Related
I have a problem adding custom metrics in Kafka Streams.
I built a Kafka Streams application with Spring Boot (following the Baeldung guide "Kafka Streams with Spring Boot")
and deployed several instances of it on Kubernetes.
I want to know the average number of processed messages per second for each app instance, and that value already exists in the Kafka Streams built-in thread metrics (process-rate). (ref. Kafka Streams Metrics)
But that metric uses the thread-id as a tag key, so each app instance ends up with a different metric tag key.
I'd like to expose that metric value under the same tag key in every app instance.
So I came up with a solution: use that built-in metric value to add a new custom metric.
But there is no specific information about how to read built-in metric values in source code and add a custom metric.
The reference describes a way to add custom metrics, but not specifically how to apply it in source code.
Is there a way to solve this problem? Or is there any other way?
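One possible way to implement that idea, sketched below: read the built-in values from KafkaStreams#metrics() and republish them under a stable, instance-level name. The group name "stream-thread-metrics" matches recent Kafka versions and should be verified against the version you run:

```java
import java.util.Map;
import org.apache.kafka.common.Metric;
import org.apache.kafka.common.MetricName;
import org.apache.kafka.streams.KafkaStreams;

// Sums the built-in process-rate across all stream threads of this instance.
public class ProcessRateReader {

    public static double totalProcessRate(KafkaStreams streams) {
        double total = 0.0;
        for (Map.Entry<MetricName, ? extends Metric> entry : streams.metrics().entrySet()) {
            MetricName name = entry.getKey();
            if ("process-rate".equals(name.name())
                    && "stream-thread-metrics".equals(name.group())) {
                total += ((Number) entry.getValue().metricValue()).doubleValue();
            }
        }
        return total;
    }
}
```

The returned value could then be registered with whatever registry you use (e.g. Micrometer) under a tag that is stable per instance, such as the pod name.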
I have an application (call it smscb-router) as shown in the diagram.
It reads data from a legacy system (sms).
Based on the content (callback type), I have to put each message into the corresponding outgoing topic (such as billing-n-cdr, dr-cdr, ...).
I think the Streams API is better suited in this case, as it has the map functionality to do the content-mapping check. What I am unsure about is whether I can read source data from a non-Kafka-topic source.
All the examples that I see in internet blogs explain streaming apps in the context of reading from a source topic and writing to other destination topics.
So, is this possible to read from a non-topic source, such as say a redis store, or a message queue such as RabbitMQ?
We had a recent implementation where we had to poll an .xml file from a network-attached drive and convert it into Kafka events, i.e. publish each record to an output topic. In that case we wouldn't even call it something developed with the Streams API; it is just a Kafka producer component.
Java File Poller Module (Quartz, time-based) -> XML Schema Management -> Kafka Producer Component -> Output Topic (Kafka Broker).
You get all the native features of the Kafka Producer API in terms of retries, and you can use producer.send() with a callback (async) or producer.send().get() (sync), as in the sketch below.
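A minimal sketch of such a producer component (broker address, topic, and payload are placeholders):

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class XmlRecordPublisher {

    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.RETRIES_CONFIG, 3); // native retry support

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            ProducerRecord<String, String> record =
                    new ProducerRecord<>("output-topic", "record-key", "parsed-xml-record");

            // Async: send() returns immediately; the callback fires on completion.
            producer.send(record, (metadata, exception) -> {
                if (exception != null) {
                    exception.printStackTrace();
                }
            });

            // Sync: block until the broker acknowledges the record.
            producer.send(record).get();
        }
    }
}
```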
Hope this helps. The Streams API is meant for bigger, more complex processing that needs to be normalized using stateful operations.
Thanks,
Christopher
Kafka Streams is only about topic-to-topic data streaming.
All external systems should be integrated by another method:
Ideally Kafka Connect, for example with this RabbitMQ source connector:
https://docs.confluent.io/kafka-connect-rabbitmq-source/current/overview.html
You may also use a manual consumer for the first step, but it is always better to reuse the availability mechanisms built into Kafka Connect (no code, just some JSON config).
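For illustration, such a connector is driven by a small JSON config along these lines (all names and values below are placeholders; check the exact property names against the Confluent documentation linked above):

```json
{
  "name": "rabbitmq-source",
  "config": {
    "connector.class": "io.confluent.connect.rabbitmq.RabbitMQSourceConnector",
    "kafka.topic": "sms-events",
    "rabbitmq.queue": "sms-queue",
    "rabbitmq.host": "rabbitmq.example.com",
    "rabbitmq.username": "guest",
    "rabbitmq.password": "guest",
    "tasks.max": "1"
  }
}
```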
In your diagram I would recommend adding one topic and one producer (or one connector) in front of your pink component; then it can become a fully standard Kafka Streams microservice.
Problem: I have a use case where I want to get the topic name, partition id, etc. per window of a Kafka Stream.
What I've explored so far: I found that we CANNOT get the partition id from the Kafka DSL API. The only way for me to get the partition id when using Kafka windowing is to use the Processor API. I looked at the documentation Kafka provides about the Processor API here: https://docs.confluent.io/current/streams/developer-guide/processor-api.html#accessing-processor-context
Question: How do we integrate this Processor API with Kafka windowing and other Kafka Streams features? The example there is a simple word count. Can we have a more detailed example?
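One common way to combine the two, sketched under the assumption of a 2.x-era DSL (topic name and window size are placeholders): use transform() to capture the topic/partition from the ProcessorContext, then continue with windowing in the DSL:

```java
import java.time.Duration;
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.kstream.TimeWindows;
import org.apache.kafka.streams.kstream.Transformer;
import org.apache.kafka.streams.processor.ProcessorContext;

public class PartitionAwareWindowCount {

    public static Topology build() {
        StreamsBuilder builder = new StreamsBuilder();
        builder.<String, String>stream("input-topic")
            .transform(() -> new Transformer<String, String, KeyValue<String, String>>() {
                private ProcessorContext context;

                @Override
                public void init(ProcessorContext context) {
                    this.context = context;
                }

                @Override
                public KeyValue<String, String> transform(String key, String value) {
                    // Re-key with source metadata so it survives the windowed aggregation.
                    return KeyValue.pair(context.topic() + "-" + context.partition(), value);
                }

                @Override
                public void close() {}
            })
            .groupByKey()
            .windowedBy(TimeWindows.of(Duration.ofMinutes(5)))
            .count();
        return builder.build();
    }
}
```

Because transform() changes the key, Kafka Streams will repartition before the windowed count, which is what groups records from the same source partition together.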
I'm using Spring Cloud Stream library in a Java application. I want to use the Kafka Streams binder for a state store. The application will post messages to a topic, and I wish to use the Kafka Streams InteractiveQueryService to retrieve data from the same topic. Is it possible to perform such queries as-is, or do I need to first consume the topic as a KTable/KStream and materialize it before I can perform queries? I don't have any requirement to perform KTable/KStream processing on the topic, I just want to query the topic contents. I'm hoping there is some way to implicitly materialize it as a state store.
Interactive Queries is a feature that allows you to query client-side state stores. It's not a feature that allows you to query topics.
Hence, if you have data in a topic that you want to query via Interactive Queries, you need to load the data into a state store within Kafka Streams.
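A minimal sketch of that (plain Kafka Streams API with placeholder names, using the Kafka 2.5+ store-query API; Spring Cloud Stream's InteractiveQueryService can fetch the store by the same name once it is materialized):

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StoreQueryParameters;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.state.QueryableStoreTypes;
import org.apache.kafka.streams.state.ReadOnlyKeyValueStore;

public class TopicQueryExample {

    public static void main(String[] args) {
        // Materialize the topic as a table backed by a named state store.
        StreamsBuilder builder = new StreamsBuilder();
        builder.table("my-topic", Materialized.as("my-topic-store"));

        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "query-app");         // placeholder
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();

        // Query the store (in practice, wait until the instance reaches RUNNING).
        ReadOnlyKeyValueStore<String, String> store = streams.store(
                StoreQueryParameters.fromNameAndType(
                        "my-topic-store", QueryableStoreTypes.keyValueStore()));
        String value = store.get("some-key");
    }
}
```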
I was trying to use the exactly-once capabilities of Kafka via the Kafka Streams library. I've only configured processing.guarantee as exactly_once. Along with this, transaction state needs to be stored in an internal topic (__transaction_state).
My question is: how do I customize the name of that topic? And if the Kafka cluster is shared by multiple applications, does each one need a different topic for transaction management?
Thanks
Murthy
You don't need to worry about the __transaction_state topic -- it's an internal topic that is created automatically for you. You don't need to create it manually, and it will always have this name (it's not possible to customize it). It is used for all producers that use transactions.
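For completeness, enabling exactly-once in Kafka Streams is a single configuration entry, sketched here (application id and bootstrap servers are placeholders):

```java
import java.util.Properties;
import org.apache.kafka.streams.StreamsConfig;

public class EosConfigExample {

    public static Properties streamsConfig() {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-eos-app");        // placeholder
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
        // Turns on transactions under the hood; the brokers manage
        // __transaction_state automatically. (Kafka 3.0+ prefers
        // StreamsConfig.EXACTLY_ONCE_V2.)
        props.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG, StreamsConfig.EXACTLY_ONCE);
        return props;
    }
}
```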