I have spring boot application, which is recives data over rest, based on some business logic I need to forwaard the data to two different kafka cluster which have their own kerberos keys menttioned jaas file.
I have written two different producer Instance with below properties in their different object instances.
#Service
public class EventProducer {
private Logger logger = LoggerFactory.getLogger(EventProducer.class);
Producer<String, String> kafkaProducer = null;
#Autowired
public Producer<String, String> createProducer() {
if (kafkaProducer == null) {
Properties props = getKafkaConfig();
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "Cluste_1_hostaddress:9092");
props.put(ProducerConfig.CLIENT_ID_CONFIG,"usertest");
props.put(ProducerConfig.ACKS_CONFIG, "all");
props.put(ProducerConfig.RETRIES_CONFIG, "3");
props.put(ProducerConfig.BUFFER_MEMORY_CONFIG, 33554432);
props.put(ProducerConfig.BATCH_SIZE_CONFIG, 1600);
System.setProperty("javax.security.auth.useSubjectCredsOnly", "true");
System.setProperty("java.security.auth.login.config", "/home/user/clusrter_1_jaas.conf);
props.put("security.protocol", "SASL_PLAINTEXT");
props.put("kafka.cluster.SecurityProtocol",PLAINTEXTSASL);
props.put("sasl.kerberos.service.name", "kafka");
props.put("sasl.kerberos", "sasl.kerberos.service.namekafka");
props.put("security.inter.broker.protocol", "SASL_PLAINTEXT");
props.put("sasl.mechanism.inter.broker.protocol", "PLAIN");
props.put("sasl.enabled.mechanisms", "PLAIN");
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
kafkaProducer = new KafkaProducer<String, String>(props);
}
return kafkaProducer;
}
}
Second producer
#Service
public class MovementProducer {
private Logger logger = LoggerFactory.getLogger(MovementProducer.class);
Producer<String, String> kafkaProducer = null;
#Autowired
public Producer<String, String> createProducer() {
if (kafkaProducer == null) {
Properties props = getKafkaConfig();
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "Cluste_2_hostaddress:9092");
props.put(ProducerConfig.CLIENT_ID_CONFIG,"usertest");
props.put(ProducerConfig.ACKS_CONFIG, "all");
props.put(ProducerConfig.RETRIES_CONFIG, "3");
props.put(ProducerConfig.BUFFER_MEMORY_CONFIG, 33554432);
props.put(ProducerConfig.BATCH_SIZE_CONFIG, 1600);
System.setProperty("javax.security.auth.useSubjectCredsOnly", "true");
System.setProperty("java.security.auth.login.config", "/home/user/clusrter_2_jaas.conf);
props.put("security.protocol", "SASL_PLAINTEXT");
props.put("kafka.cluster.SecurityProtocol",PLAINTEXTSASL);
props.put("sasl.kerberos.service.name", "kafka");
props.put("sasl.kerberos", "sasl.kerberos.service.namekafka");
props.put("security.inter.broker.protocol", "SASL_PLAINTEXT");
props.put("sasl.mechanism.inter.broker.protocol", "PLAIN");
props.put("sasl.enabled.mechanisms", "PLAIN");
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
kafkaProducer = new KafkaProducer<String, String>(props);
}
return kafkaProducer;
}
}
When I start this as two service with enabling only producer instance it works, but when I enable both instance in a single jar only one producer works and other gets authentication issues.
I feel this is due to System.setProperty("java.security.auth.login.config","") , since it is global system variable so it overrides when i use both in single process, so only one works.
So is there any way to solve this issue other than starting a two process. I have only one spring service and should be able to produce to both different kafka cluster ..
Kafka client's latest version has given options for multiple JAAS confs for different clients.
For example, if your instance wants to connect two clusters with different JAAS conf we can override on different producer and consumer levels. Just create 2 separate producer factories and set sasl.jaas.conig
clustera.java.security.auth.login.config=com.sun.security.auth.module.Krb5LoginModule required \
useKeyTab=true \
storeKey=true \
keyTab="test.keytab" \
principal="test#domain.com";
clusterb.java.security.auth.login.config=com.sun.security.auth.module.Krb5LoginModule required \
useKeyTab=true \
storeKey=true \
keyTab="testb.keytab" \
principal="testb#domain.com"
Related
Im using springboot v1.5 and spring kafka v1.1.6 to publish a message to a kafka broker.
when it publishes the message to the topic, the topic is created in the broker by default if not present.
I do not want it to create topics if not present. I tried to disable it by adding the property spring.kafka.topic.properties.auto.create=false but it does not work.
below is my bean configuration
#Value("${kpi.kafka.bootstrap-servers}")
private String bootstrapServer;
#Bean
public ProducerFactory<String, CmsMonitoringMetrics> producerFactoryJson() {
Map<String, Object> configProps = new HashMap<>();
configProps.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServer);
configProps.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
configProps.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, JsonSerializer.class);
configProps.put("allow.auto.create.topics", "false");
return new DefaultKafkaProducerFactory<>(configProps);
}
#Bean
public KafkaTemplate<String, CmsMonitoringMetrics> kafkaTemplateJson() {
return new KafkaTemplate<>(producerFactoryJson());
}
in producer method im using the below code to publish
Message<CmsMonitoringMetrics> message = MessageBuilder.withPayload(data)
.setHeader(KafkaHeaders.TOPIC, topicName)
.build();
SendResult<String, CmsMonitoringMetrics> result = kafkaTemplate.send(message).get();
it still creates the topic. please help me disable it.
As per the documentation, auto.create.topics.enable is a broker configuration. That means that you have to set this property on the server side of Kafka, not on producer/consumer clients.
I am investigating a bug report in our application (spring boot) regarding the kafka metric kafka.consumer.fetch.manager.records.consumed.total being missing.
The application has two kafka consumers, lets call them query-routing and query-tracking consumers, and they are configured via #KafkaListener annotation and each kafka consumer has it's own instance of ConcurrentKafkaListenerContainerFactory.
The query-router consumer is configured as
#Configuration
#EnableKafka
public class QueryRoutingConfiguration {
#Bean(name = "queryRoutingContainerFactory")
public ConcurrentKafkaListenerContainerFactory<String, RoutingInfo> kafkaListenerContainerFactory(MeterRegistry meterRegistry) {
Map<String, Object> consumerConfigs = new HashMap<>();
// For brevity I removed the configs as they are trivial configs like bootstrap servers and serializers
DefaultKafkaConsumerFactory<String, RoutingInfo> consumerFactory =
new DefaultKafkaConsumerFactory<>(consumerConfigs);
consumerFactory.addListener(new MicrometerConsumerListener<>(meterRegistry));
ConcurrentKafkaListenerContainerFactory<String, RoutingInfo> factory =
new ConcurrentKafkaListenerContainerFactory<>();
factory.setConsumerFactory(consumerFactory);
factory.getContainerProperties().setIdleEventInterval(5000L);
return factory;
}
}
And the query-tracking consumer is configured as:
#Configuration
#EnableKafka
public class QueryTrackingConfiguration {
private static final FixedBackOff NO_ATTEMPTS = new FixedBackOff(Duration.ofSeconds(0).toMillis(), 0L);
#Bean(name = "queryTrackingContainerFactory")
public ConcurrentKafkaListenerContainerFactory<String, QueryTrackingMessage> kafkaListenerContainerFactory(MeterRegistry meterRegistry) {
Map<String, Object> consumerConfigs = new HashMap<>();
// For brevity I removed the configs as they are trivial configs like bootstrap servers and serializers
DefaultKafkaConsumerFactory<String, QueryTrackingMessage> consumerFactory =
new DefaultKafkaConsumerFactory<>(consumerConfigs);
consumerFactory.addListener(new MicrometerConsumerListener<>(meterRegistry));
ConcurrentKafkaListenerContainerFactory<String, QueryTrackingMessage> factory =
new ConcurrentKafkaListenerContainerFactory<>();
factory.setConsumerFactory(consumerFactory);
factory.getContainerProperties().setAckMode(ContainerProperties.AckMode.MANUAL);
factory.setBatchListener(true);
DefaultErrorHandler deusErrorHandler = new DefaultErrorHandler(NO_ATTEMPTS);
factory.setCommonErrorHandler(deusErrorHandler);
return factory;
}
}
The MeterRegistryConfigurator bean configuaration is set as:
#Configuration
public class MeterRegistryConfigurator {
private static final Logger LOG = LoggerFactory.getLogger(MeterRegistryConfigurator.class);
private static final String PREFIX = "dps";
#Bean
MeterRegistryCustomizer<MeterRegistry> meterRegistryCustomizer() {
return registry -> registry.config()
.onMeterAdded(meter -> LOG.info("onMeterAdded: {}", meter.getId().getName()))
.onMeterRemoved(meter -> LOG.info("onMeterRemoved: {}", meter.getId().getName()))
.onMeterRegistrationFailed(
(id, s) -> LOG.info("onMeterRegistrationFailed - id '{}' value '{}'", id.getName(), s))
.meterFilter(PrefixMetricFilter.withPrefix(PREFIX))
.meterFilter(
MeterFilter.deny(id ->
id.getName().startsWith(PREFIX + ".jvm")
|| id.getName().startsWith(PREFIX + ".system")
|| id.getName().startsWith(PREFIX + ".process")
|| id.getName().startsWith(PREFIX + ".logback")
|| id.getName().startsWith(PREFIX + ".tomcat"))
)
.meterFilter(MeterFilter.ignoreTags("host", "host.name"))
.namingConvention(NamingConvention.snakeCase);
}
}
The #KafkaListener for each consumer is set as
#KafkaListener(
id = "query-routing",
idIsGroup = true,
topics = "${query-routing.consumer.topic}",
groupId = "${query-routing.consumer.groupId}",
containerFactory = "queryRoutingContainerFactory")
public void listenForMessages(ConsumerRecord<String, RoutingInfo> record) {
// Handle each record ...
}
and
#KafkaListener(
id = "query-tracking",
idIsGroup = true,
topics = "${query-tracking.consumer.topic}",
groupId = "${query-tracking.consumer.groupId}",
containerFactory = "queryTrackingContainerFactory"
)
public void listenForMessages(List<ConsumerRecord<String, QueryTrackingMessage>> consumerRecords, Acknowledgment ack) {
// Handle each record ...
}
When the application starts up, going to the actuator/prometheus endpoing I can see the metric for both consumers:
# HELP dps_kafka_consumer_fetch_manager_records_consumed_total The total number of records consumed
# TYPE dps_kafka_consumer_fetch_manager_records_consumed_total counter
dps_kafka_consumer_fetch_manager_records_consumed_total{client_id="consumer-qf-query-tracking-consumer-1",kafka_version="3.1.2",spring_id="not.managed.by.Spring.consumer-qf-query-tracking-consumer-1",} 7.0
dps_kafka_consumer_fetch_manager_records_consumed_total{client_id="consumer-QF-Routing-f5d0d9f1-e261-407b-954d-5d217211dee0-2",kafka_version="3.1.2",spring_id="not.managed.by.Spring.consumer-QF-Routing-f5d0d9f1-e261-407b-954d-5d217211dee0-2",} 0.0
But a few seconds later there is a new call to io.micrometer.core.instrument.binder.kafka.KafkaMetrics#checkAndBindMetrics which will remove a set of metrics (including kafka.consumer.fetch.manager.records.consumed.total)
onMeterRegistrationFailed - dps.kafka.consumer.fetch.manager.records.consumed.total string Prometheus requires that all meters with the same name have the same set of tag keys. There is already an existing meter named 'dps.kafka.consumer.fetch.manager.records.consumed.total' containing tag keys [client_id, kafka_version, spring_id]. The meter you are attempting to register has keys [client_id, kafka_version, spring_id, topic].
Going again to actuator/prometheus will only show the metric for the query-routing consumer:
# HELP deus_dps_persistence_kafka_consumer_fetch_manager_records_consumed_total The total number of records consumed for a topic
# TYPE deus_dps_persistence_kafka_consumer_fetch_manager_records_consumed_total counter
deus_dps_persistence_kafka_consumer_fetch_manager_records_consumed_total{client_id="consumer-QF-Routing-0a739a21-4764-411a-9cc6-0e60293b40b4-2",kafka_version="3.1.2",spring_id="not.managed.by.Spring.consumer-QF-Routing-0a739a21-4764-411a-9cc6-0e60293b40b4-2",theKey="routing",topic="QF_query_routing_v1",} 0.0
As you can see above the metric for the query-tracking consumer is gone.
As the log says, The meter you are attempting to register has keys [client_id, kafka_version, spring_id, topic]. The issue is I cannot find where is this metric with a topic key being registered which will trigger io.micrometer.core.instrument.binder.kafka.KafkaMetrics#checkAndBindMetrics which will remove the metric for the query-tracking consumer.
I am using
micrometer-registry-prometheus version 1.9.5
spring boot version 2.7.5
spring kafka (org.springframework.kafka:spring-kafka)
My question is, why does the metric kafka.consumer.fetch.manager.records.consumed.total fails causing it to be removed for the query-tracking consumer and how can I fix it?
I believe this is internal in Micrometer KafkaMetrics.
Periodically, it checks for new metrics; presumably, the topic one shows up after the consumer subscribes to the topic.
#Override
public void bindTo(MeterRegistry registry) {
this.registry = registry;
commonTags = getCommonTags(registry);
prepareToBindMetrics(registry);
checkAndBindMetrics(registry);
VVVVVVVVVVVVVVVVVVVVVVVVVVVVVV
scheduler.scheduleAtFixedRate(() -> checkAndBindMetrics(registry), getRefreshIntervalInMillis(),
getRefreshIntervalInMillis(), TimeUnit.MILLISECONDS);
}
You should be able to write a filter to exclude the one with fewer tags.
I have two instances of kafka consumer, configured with the same consumer group and listening to partition 0 in the same topic. The problem is when I send a message to the topic. The message is consumed by both instances which supposed not to happen as they are in the same group.
I am using Spring Boot configuration class to configure them.
Here is the configuration:
#Bean
ConcurrentKafkaListenerContainerFactory<Integer, String> kafkaListenerContainerFactory() {
ConcurrentKafkaListenerContainerFactory<Integer, String> factory = new ConcurrentKafkaListenerContainerFactory<>();
factory.setConsumerFactory(consumerFactory());
factory.getContainerProperties().setAckMode(AbstractMessageListenerContainer.AckMode.MANUAL_IMMEDIATE);
return factory;
}
#Bean
public ConsumerFactory<Integer, String> consumerFactory() {
return new DefaultKafkaConsumerFactory<>(consumerConfigs());
}
#Bean
public Map<String, Object> consumerConfigs() {
Map<String, Object> props = new HashMap<>();
props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
props.put(ConsumerConfig.GROUP_ID_CONFIG, consumerGroupId);
props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
props.put(ConsumerConfig.AUTO_COMMIT_INTERVAL_MS_CONFIG, "100");
props.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, "15000");
props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, keyDeserializer);
props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, valueDeserializer);
props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false);
return props;
}
Here is the listener:
#KafkaListener(topicPartitions = {#TopicPartition(topic = "${kafka.topic.orders}", partitions = "0")})
public void consume(ConsumerRecord<String, String> record, Acknowledgment acknowledgment) {
log.info("message received at " + orderTopic + "at partition 0");
processRecord(record, acknowledgment);
}
Kafka doesn't work like that; when you manually assign partitions like that (#TopicPartition) you are explicitly telling Kafka you want to receive messages from that partition - the consumer assign() s the partitions to itself.
In other words, with manual assignment, you are taking responsibility for distributing the partitions.
You need use group management, and let Kafka assign the topics to the instances.
use topics = "..." and Kafka will do the assignment. If you don't have enough topics, instances will be idle. You need at least as many partitions as instances to have all instances participate.
I'm setting up a kafka listener in a spring boot application and I can't seem to get the listener running in a pool using an executor. Here's my kafka configuration:
#Bean
ThreadPoolTaskExecutor messageProcessorExecutor() {
logger.info("Creating a message processor pool with {} threads", numThreads);
ThreadPoolTaskExecutor exec = new ThreadPoolTaskExecutor();
exec.setCorePoolSize(200);
exec.setMaxPoolSize(200);
exec.setKeepAliveSeconds(30);
exec.setAllowCoreThreadTimeOut(true);
exec.setQueueCapacity(0); // Yields a SynchronousQueue
exec.setThreadFactory(ThreadFactoryFactory.defaultNamingFactory("kafka", "processor"));
return exec;
}
#Bean
public ConsumerFactory<String, PollerJob> consumerFactory() {
Map<String, Object> props = new HashMap<>();
props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
props.put(ConsumerConfig.GROUP_ID_CONFIG, consumerGroup);
props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false);
DefaultKafkaConsumerFactory<String, PollerJob> factory = new DefaultKafkaConsumerFactory<>(props,
new StringDeserializer(),
new JsonDeserializer<>(PollerJob.class));
return factory;
}
#Bean
public ConcurrentKafkaListenerContainerFactory<String, PollerJob> kafkaListenerContainerFactory() {
ConcurrentKafkaListenerContainerFactory<String, PollerJob> factory
= new ConcurrentKafkaListenerContainerFactory<>();
factory.setConsumerFactory(consumerFactory());
factory.setConcurrency(Integer.valueOf(kafkaThreads));
factory.getContainerProperties().setListenerTaskExecutor(messageProcessorExecutor());
factory.getContainerProperties().setAckMode(AbstractMessageListenerContainer.AckMode.MANUAL);
return factory;
}
The ThreadFactoryFactory used by the ThreadPoolTaskExecutor just makes sure the thread is named like 'kafka-1-processor-1'.
The ConsumerFactory has the ENABLE_AUTO_COMMIT_CONFIG flag set to false and I'm using manual mode for the acknowledgement which is required to use executors according to the documentation.
My listener looks like this:
#KafkaListener(topics = "my_topic",
group = "my_group",
containerFactory = "kafkaListenerContainerFactory")
public void listen(#Payload SomeJob job, Acknowledgment ack) {
ack.acknowledge();
logger.info("Running job {}", job.getId());
....
}
Using the Admin Server I can inspect all the threads and only one kafka-N-processor-N threads is being created but I expected to see up to 200. The jobs are all running one at a time on the that one thread and I can't figure out why.
How can I get this setup to run the listeners using my executor with as many threads as possible?
I'm using Spring Boot 1.5.4.RELEASE and kafka 0.11.0.0.
If your topic has only one partition, according the consumer group policy, only one consumer is able to poll that partition.
The ConcurrentMessageListenerContainer indeed creates as much target KafkaMessageListenerContainer instances as provided concurrency. And it does that only in case it doesn't know the number of partitions in the topic.
When the rebalance in consumer group happens only one consumer gets partition for consuming. All the work is really done there in a single thread:
private void startInvoker() {
ListenerConsumer.this.invoker = new ListenerInvoker();
ListenerConsumer.this.listenerInvokerFuture = this.containerProperties.getListenerTaskExecutor()
.submit(ListenerConsumer.this.invoker);
}
One partition - one thread for sequential records processing.
Spring boot new for me as non web project. please guide me how to code Spark streaming in spring boot, i have already work in java-spark project and want to convert in spring boot non web application. any help or suggestion please.
Here is my Spark config
#Bean
public SparkConf sparkConf() {
SparkConf sparkConf = new SparkConf();
sparkConf.set("spark.app.name", "SparkReceiver"); //The name of application. This will appear in the UI and in log data.
//conf.set("spark.ui.port", "7077"); //Port for application's dashboard, which shows memory and workload data.
sparkConf.set("dynamicAllocation.enabled","false"); //Which scales the number of executors registered with this application up and down based on the workload
//conf.set("spark.cassandra.connection.host", "localhost"); //Cassandra Host Adddress/IP
sparkConf.set("spark.serializer","org.apache.spark.serializer.KryoSerializer"); //For serializing objects that will be sent over the network or need to be cached in serialized form.
sparkConf.set("spark.driver.allowMultipleContexts", "true");
sparkConf.setMaster("local[4]");
return sparkConf;
}
#Bean
public JavaSparkContext javaSparkContext() {
return new JavaSparkContext(sparkConf());
}
#Bean
public SparkSession sparkSession() {
return SparkSession
.builder()
.sparkContext(javaSparkContext().sc())
.appName("Java Spark SQL basic example")
.getOrCreate();
}
#Bean
public JavaStreamingContext javaStreamingContext(){
return new JavaStreamingContext(sparkConf(), new Duration(2000));
}
Here is my testing class
#Autowired
private JavaSparkContext sc;
#Autowired
private SparkSession session;
public void testMessage() throws InterruptedException{
JavaStreamingContext jsc = new JavaStreamingContext(sc, new Duration(2000));
Map<String, String> kafkaParams = new HashMap<String, String>();
kafkaParams.put("zookeeper.connect", "localhost:2181"); //Make all kafka data for this cluster appear under a particular path.
kafkaParams.put("group.id", "testgroup"); //String that uniquely identifies the group of consumer processes to which this consumer belongs
kafkaParams.put("metadata.broker.list", "localhost:9092"); //Producer can find a one or more Brokers to determine the Leader for each topic.
kafkaParams.put("serializer.class", "kafka.serializer.StringEncoder"); //Serializer to use when preparing the message for transmission to the Broker.
kafkaParams.put("request.required.acks", "1"); //Producer to require an acknowledgement from the Broker that the message was received.
Set<String> topics = Collections.singleton("16jnfbtopic");
//Create an input DStream for Receiving data from socket
JavaPairInputDStream<String, String> directKafkaStream = KafkaUtils.createDirectStream(jsc,
String.class,
String.class,
StringDecoder.class,
StringDecoder.class,
kafkaParams, topics);
//Create JavaDStream<String>
JavaDStream<String> msgDataStream = directKafkaStream.map(new Function<Tuple2<String, String>, String>() {
#Override
public String call(Tuple2<String, String> tuple2) {
return tuple2._2();
}
});
//Create JavaRDD<Row>
msgDataStream.foreachRDD(new VoidFunction<JavaRDD<String>>() {
#Override
public void call(JavaRDD<String> rdd) {
JavaRDD<Row> rowRDD = rdd.map(new Function<String, Row>() {
#Override
public Row call(String msg) {
Row row = RowFactory.create(msg);
return row;
}
});
//Create Schema
StructType schema = DataTypes.createStructType(new StructField[] {DataTypes.createStructField("Message", DataTypes.StringType, true)});
Dataset<Row> msgDataFrame = session.createDataFrame(rowRDD, schema);
msgDataFrame.show();
}
});
jsc.start();
jsc.awaitTermination();
while run this app i am getting error please Guide me.
Here is my error log
Eclipse Error Log