We have a Kafka cluster that we built with Kafka from this Maven repository: https://mvnrepository.com/artifact/org.apache.kafka/kafka_2.10/0.10.0.0
We are using another Kafka client library to send messages into this cluster. I would like to log the messages that are sent to this cluster and ship them to AWS CloudWatch (my cluster is running on AWS EC2).
How can I achieve this?
Also, once a producer has sent a message, what is the entry point in this Kafka library's code that will be invoked?
-Thanks!
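One client-side option, sketched below under the assumption that the producer is the standard Java client (interceptors were added in 0.10.0 via KIP-42): a ProducerInterceptor that logs every record on the send path. The class name and log format are illustrative, and getting the resulting log output into CloudWatch is assumed to be handled separately, for example by the CloudWatch agent running on the EC2 instances.

```java
// A sketch, not a drop-in solution: a ProducerInterceptor that logs every
// record the client sends. Register it on the producer with
//   props.put("interceptor.classes", "com.example.LoggingInterceptor");
import java.util.Map;
import org.apache.kafka.clients.producer.ProducerInterceptor;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;

public class LoggingInterceptor implements ProducerInterceptor<String, String> {

    @Override
    public ProducerRecord<String, String> onSend(ProducerRecord<String, String> record) {
        // First user-visible hook invoked for each message, before serialization.
        System.out.printf("sending topic=%s key=%s value=%s%n",
                record.topic(), record.key(), record.value());
        return record; // must return the (possibly modified) record
    }

    @Override
    public void onAcknowledgement(RecordMetadata metadata, Exception exception) {
        // Invoked when the broker acknowledges the record, or when the send fails.
    }

    @Override
    public void close() { }

    @Override
    public void configure(Map<String, ?> configs) { }
}
```

As for the entry point: on the client side each message enters the library through KafkaProducer.send(), for which onSend() above is the first user-visible hook; on the broker side, produce requests are dispatched through kafka.server.KafkaApis.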
Related
How can I connect my Spring Boot application to a Kafka topic as soon as the application starts, so that when the send method is invoked there is no need to fetch the metadata information?
Kafka clients are required to do an initial metadata fetch to determine the leader broker before actually sending data, but this shouldn't drastically change the startup time of any application and wouldn't prevent you from calling any Kafka producer actions.
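That said, a minimal sketch of a warm-up is shown below, assuming spring-kafka's KafkaTemplate and a placeholder topic name: calling partitionsFor() during startup forces the metadata fetch so the first send() doesn't pay for it.

```java
import org.springframework.boot.ApplicationRunner;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.core.KafkaTemplate;

@Configuration
public class KafkaWarmupConfig {

    // Runs once at startup; partitionsFor() forces the producer to fetch
    // cluster metadata so the first real send() doesn't pay that cost.
    // "my-topic" is a placeholder for your actual topic name.
    @Bean
    public ApplicationRunner kafkaMetadataWarmup(KafkaTemplate<String, String> template) {
        return args -> template.partitionsFor("my-topic");
    }
}
```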
I have developed a StreamSets pipeline that uses a Kafka consumer as its origin. My pipeline works fine if the Kafka consumer has messages in it. But if the Kafka consumer has 0 messages, my pipeline goes into a loop, runs continuously, and never finishes.
I need my pipeline to finish if the Kafka consumer has zero messages in its topic.
By default it is a streaming application's nature to keep checking for messages, but a similar approach to what you want to achieve has been answered here.
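StreamSets specifics aside, the underlying idea can be sketched with the plain Kafka consumer API: treat a few consecutive empty polls as "topic drained" and stop, instead of polling forever. The broker address, group id, and topic name below are placeholders.

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class DrainAndStop {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "drain-demo");
        props.put("auto.offset.reset", "earliest");
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("my-topic"));
            int emptyPolls = 0;
            // Tolerate a couple of empty polls: the first poll can return
            // nothing simply because partitions aren't assigned yet.
            while (emptyPolls < 3) {
                ConsumerRecords<String, String> records =
                        consumer.poll(Duration.ofSeconds(2));
                if (records.isEmpty()) {
                    emptyPolls++;
                } else {
                    emptyPolls = 0;
                    records.forEach(r -> System.out.println(r.value()));
                }
            }
        }
    }
}
```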
I have a Spring Boot (2.3.3) service that uses spring-kafka to access a dedicated Kafka/Zookeeper configuration. I have been using the application.properties setting spring.kafka.bootstrap-servers=localhost:9092 to access my dev/test Apache Kafka service.
However, in production we have a cluster of Kafka brokers (on many servers) configured in Zookeeper, and I have been asked to modify my service to query Zookeeper for the list of brokers and use that list instead of the bootstrap-servers configuration. The reason: our DevOps folks have been known to reconfigure servers/nodes and Kafka brokers.
Basically, I have been asked to make my service agnostic to where the Apache Kafka brokers are running. All my service needs to know is how to get the list of brokers (bootstrap server info including host and port) from Zookeeper.
Is there a way in Spring Boot and spring-kafka to retrieve the broker list (aka bootstrap servers) from Zookeeper and use it in my service?
Spring delegates to the kafka-clients for all connections; for a long time now, the kafka-clients no longer connect to Zookeeper, only to the brokers themselves.
There is no built-in support in Spring for querying Zookeeper to determine the broker list.
Furthermore, in a future Kafka version, Zookeeper is going away altogether; see KIP-500.
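If you decide to query Zookeeper yourself anyway, a rough sketch follows, assuming a pre-KRaft cluster where brokers still register under /brokers/ids and using the plain org.apache.zookeeper client; the JSON parsing is deliberately naive, and real code should use a JSON library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import org.apache.zookeeper.ZooKeeper;

public class ZkBrokerList {

    // Reads broker registrations from Zookeeper's /brokers/ids znodes
    // and builds a comma-separated bootstrap-server string.
    public static String bootstrapServers(String zkConnect) throws Exception {
        CountDownLatch connected = new CountDownLatch(1);
        ZooKeeper zk = new ZooKeeper(zkConnect, 10_000, event -> connected.countDown());
        connected.await();
        try {
            List<String> servers = new ArrayList<>();
            for (String id : zk.getChildren("/brokers/ids", false)) {
                // Each znode holds JSON such as {"host":"...","port":9092,...}.
                // Naive regex extraction; a JSON library belongs here.
                String json = new String(zk.getData("/brokers/ids/" + id, false, null));
                String host = json.replaceAll(".*\"host\":\"([^\"]+)\".*", "$1");
                String port = json.replaceAll(".*\"port\":(\\d+).*", "$1");
                servers.add(host + ":" + port);
            }
            return String.join(",", servers);
        } finally {
            zk.close();
        }
    }
}
```

The returned string could then feed spring.kafka.bootstrap-servers, for example via an EnvironmentPostProcessor, but that wiring is left out here; given KIP-500, the bootstrap-servers route remains the safer long-term choice.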
I was reading the blog post below from Databricks:
https://databricks.com/blog/2015/03/30/improvements-to-kafka-integration-of-spark-streaming.html
While explaining how Spark's Kafka integration works when using a receiver with a WAL, it says:
1. The Kafka data is continuously received by Kafka Receivers running in the Spark workers/executors. This uses the high-level consumer API of Kafka.
2. The received data is stored in Spark's worker/executor memory as well as in the WAL (replicated on HDFS). The Kafka Receiver updates Kafka's offsets in Zookeeper only after the data has been persisted to the log.
Now my doubt is how a high-level consumer can update the offset in Zookeeper, as the high-level consumer does not handle offsets; that is handled by Zookeeper. So once we read a message from Kafka via Zookeeper, Zookeeper automatically updates the offset.
When a consumer retrieves data from a particular topic in Kafka, it is the responsibility of the consumer to update the offsets in Zookeeper. So when you use a custom Kafka consumer, it has a built-in Kafka API (org.apache.kafka.clients.consumer.* does that) that will update the offsets once you receive the data from that particular topic.
In the case of the receiver-based approach in Spark, it uses Kafka's high-level API to update the offsets in Zookeeper.
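As a concrete illustration of the "commit only after the data is safe" pattern described above, here is a sketch with the modern consumer API. Note that this client commits offsets to Kafka itself rather than Zookeeper (only the old high-level consumer used Zookeeper), but the sequencing is the same one the receiver-with-WAL approach follows. Broker address, group id, and topic are placeholders.

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class CommitAfterPersist {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "wal-demo");
        props.put("enable.auto.commit", "false"); // we commit manually below
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("events"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> r : records) {
                    persistToLog(r.value()); // make the data durable first
                }
                if (!records.isEmpty()) {
                    consumer.commitSync();   // only then advance the offsets
                }
            }
        }
    }

    private static void persistToLog(String value) { /* stand-in for a WAL write */ }
}
```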
I would like to have my Weblogic cluster listen on a distributed topic. Whenever a JMS message is sent on that topic, I would like for only one node in the cluster to handle this message. Is this possible?
I can't use a distributed queue because there are multiple listeners (other clusters) on the topic.
With WebLogic 10.3.4 this is possible with Partitioned Topics. To enable this, set the replication mode to 'Partitioned'. The default is 'Replicated', which delivers the message to every node in the cluster.