Kafka Stream: org.apache.kafka.clients.consumer.ConsumerConfig.addDeserializerToConfig - spring-boot

I'm learning Kafka Streams and I'm getting an error. I have tried a few things, but nothing works.
Input: value_1, value_2, value_3, ...
public static void main(String[] args) throws InterruptedException {
String host = "127.0.0.1:9092";
String consumer_group = "firstGroup1";
String topic = "test1";
// create properties
Properties properties = new Properties();
properties.setProperty(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, host);
properties.setProperty(StreamsConfig.APPLICATION_ID_CONFIG, consumer_group);
properties.setProperty(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.StringSerde.class.getName());
properties.setProperty(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.StringSerde.class.getName());
// create a topology
StreamsBuilder builder = new StreamsBuilder();
// input topic
KStream<String, String> inputtopic = builder.stream(topic);
// filter the value
KStream<String, String> filtered_stream = inputtopic.filter((k, v) -> ((v.equalsIgnoreCase("value_5")) || (v.equalsIgnoreCase("value_7")) || (v.equalsIgnoreCase("value_9"))));
filtered_stream.foreach((k, v) -> System.out.println(v));
// output topic set
filtered_stream.to("prime_value");
// build a topology
KafkaStreams kafkaStreams = new KafkaStreams(builder.build(), properties);
// start our stream system
kafkaStreams.start();
}
Error message
1800 [main] INFO org.apache.kafka.streams.processor.internals.assignment.AssignorConfiguration - stream-thread [firstGroup1-d1244e8e-dbc1-4139-8876-ca75cb89c609-StreamThread-1-consumer] Cooperative rebalancing enabled now
1852 [main] INFO org.apache.kafka.common.utils.AppInfoParser - Kafka version: 3.0.0
1852 [main] INFO org.apache.kafka.common.utils.AppInfoParser - Kafka commitId: 8cb0a5e9d3441962
1852 [main] INFO org.apache.kafka.common.utils.AppInfoParser - Kafka startTimeMs: 1641673582569
Exception in thread "main" java.lang.NoSuchMethodError: org.apache.kafka.clients.consumer.ConsumerConfig.addDeserializerToConfig(Ljava/util/Map;Lorg/apache/kafka/common/serialization/Deserializer;Lorg/apache/kafka/common/serialization/Deserializer;)Ljava/util/Map;
at org.apache.kafka.streams.processor.internals.StreamThread$InternalConsumerConfig.<init>(StreamThread.java:537)
at org.apache.kafka.streams.processor.internals.StreamThread$InternalConsumerConfig.<init>(StreamThread.java:535)
at org.apache.kafka.streams.processor.internals.StreamThread.<init>(StreamThread.java:527)
at org.apache.kafka.streams.processor.internals.StreamThread.create(StreamThread.java:406)
at org.apache.kafka.streams.KafkaStreams.createAndAddStreamThread(KafkaStreams.java:897)
at org.apache.kafka.streams.KafkaStreams.<init>(KafkaStreams.java:887)
at org.apache.kafka.streams.KafkaStreams.<init>(KafkaStreams.java:783)
at org.apache.kafka.streams.KafkaStreams.<init>(KafkaStreams.java:693)
at com.example.kafkastreams.main(kafkastreams.java:47)
Line 47 is: KafkaStreams kafkaStreams = new KafkaStreams(builder.build(), properties);

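It looks like you have mismatched versions of kafka-clients and kafka-streams; they must be the same version. A NoSuchMethodError at runtime means the kafka-streams code is calling a method that does not exist in the kafka-clients JAR actually on the classpath.
When using Spring Boot, you should not specify versions for the Kafka dependencies; Boot will bring in the correct versions of both libraries.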
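One way to confirm such a mismatch is to print which JAR each library's classes are actually loaded from. The following is a minimal sketch using only the JDK; the class name `JarLocator` and its `locate` helper are made up for illustration, and the two Kafka class names are simply taken from the stack trace above. If the two printed paths carry different version numbers, the dependency versions are out of sync.

```java
import java.security.CodeSource;

/** Hypothetical helper: reports which JAR (code source) a class is loaded from. */
class JarLocator {

    /** Returns the code-source location of the named class, or a marker string. */
    static String locate(String className) {
        try {
            // Load without running static initializers; we only need the location.
            Class<?> type = Class.forName(className, false, JarLocator.class.getClassLoader());
            CodeSource source = type.getProtectionDomain().getCodeSource();
            // Classes that ship with the JDK typically report no code source.
            return source == null ? "(no code source)" : String.valueOf(source.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not on classpath)";
        }
    }

    public static void main(String[] args) {
        // One class from kafka-clients and one from kafka-streams.
        String[] names = {
            "org.apache.kafka.clients.consumer.ConsumerConfig",
            "org.apache.kafka.streams.KafkaStreams"
        };
        for (String name : names) {
            System.out.println(name + " -> " + locate(name));
        }
    }
}
```

Running this inside the failing application (same classpath) should show, for example, a kafka-streams 3.0.0 JAR next to an older kafka-clients JAR if the versions really are mixed.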

Related

Meter registration fails on Spring Boot Kafka consumer with Prometheus MeterRegistry

I am investigating a bug report in our application (spring boot) regarding the kafka metric kafka.consumer.fetch.manager.records.consumed.total being missing.
The application has two Kafka consumers, let's call them the query-routing and query-tracking consumers. They are configured via the @KafkaListener annotation, and each consumer has its own instance of ConcurrentKafkaListenerContainerFactory.
The query-routing consumer is configured as:
@Configuration
@EnableKafka
public class QueryRoutingConfiguration {
@Bean(name = "queryRoutingContainerFactory")
public ConcurrentKafkaListenerContainerFactory<String, RoutingInfo> kafkaListenerContainerFactory(MeterRegistry meterRegistry) {
Map<String, Object> consumerConfigs = new HashMap<>();
// For brevity I removed the configs as they are trivial configs like bootstrap servers and serializers
DefaultKafkaConsumerFactory<String, RoutingInfo> consumerFactory =
new DefaultKafkaConsumerFactory<>(consumerConfigs);
consumerFactory.addListener(new MicrometerConsumerListener<>(meterRegistry));
ConcurrentKafkaListenerContainerFactory<String, RoutingInfo> factory =
new ConcurrentKafkaListenerContainerFactory<>();
factory.setConsumerFactory(consumerFactory);
factory.getContainerProperties().setIdleEventInterval(5000L);
return factory;
}
}
And the query-tracking consumer is configured as:
@Configuration
@EnableKafka
public class QueryTrackingConfiguration {
private static final FixedBackOff NO_ATTEMPTS = new FixedBackOff(Duration.ofSeconds(0).toMillis(), 0L);
@Bean(name = "queryTrackingContainerFactory")
public ConcurrentKafkaListenerContainerFactory<String, QueryTrackingMessage> kafkaListenerContainerFactory(MeterRegistry meterRegistry) {
Map<String, Object> consumerConfigs = new HashMap<>();
// For brevity I removed the configs as they are trivial configs like bootstrap servers and serializers
DefaultKafkaConsumerFactory<String, QueryTrackingMessage> consumerFactory =
new DefaultKafkaConsumerFactory<>(consumerConfigs);
consumerFactory.addListener(new MicrometerConsumerListener<>(meterRegistry));
ConcurrentKafkaListenerContainerFactory<String, QueryTrackingMessage> factory =
new ConcurrentKafkaListenerContainerFactory<>();
factory.setConsumerFactory(consumerFactory);
factory.getContainerProperties().setAckMode(ContainerProperties.AckMode.MANUAL);
factory.setBatchListener(true);
DefaultErrorHandler deusErrorHandler = new DefaultErrorHandler(NO_ATTEMPTS);
factory.setCommonErrorHandler(deusErrorHandler);
return factory;
}
}
The MeterRegistryConfigurator bean configuration is set as:
@Configuration
public class MeterRegistryConfigurator {
private static final Logger LOG = LoggerFactory.getLogger(MeterRegistryConfigurator.class);
private static final String PREFIX = "dps";
@Bean
MeterRegistryCustomizer<MeterRegistry> meterRegistryCustomizer() {
return registry -> registry.config()
.onMeterAdded(meter -> LOG.info("onMeterAdded: {}", meter.getId().getName()))
.onMeterRemoved(meter -> LOG.info("onMeterRemoved: {}", meter.getId().getName()))
.onMeterRegistrationFailed(
(id, s) -> LOG.info("onMeterRegistrationFailed - id '{}' value '{}'", id.getName(), s))
.meterFilter(PrefixMetricFilter.withPrefix(PREFIX))
.meterFilter(
MeterFilter.deny(id ->
id.getName().startsWith(PREFIX + ".jvm")
|| id.getName().startsWith(PREFIX + ".system")
|| id.getName().startsWith(PREFIX + ".process")
|| id.getName().startsWith(PREFIX + ".logback")
|| id.getName().startsWith(PREFIX + ".tomcat"))
)
.meterFilter(MeterFilter.ignoreTags("host", "host.name"))
.namingConvention(NamingConvention.snakeCase);
}
}
The @KafkaListener for each consumer is set as:
@KafkaListener(
id = "query-routing",
idIsGroup = true,
topics = "${query-routing.consumer.topic}",
groupId = "${query-routing.consumer.groupId}",
containerFactory = "queryRoutingContainerFactory")
public void listenForMessages(ConsumerRecord<String, RoutingInfo> record) {
// Handle each record ...
}
and
@KafkaListener(
id = "query-tracking",
idIsGroup = true,
topics = "${query-tracking.consumer.topic}",
groupId = "${query-tracking.consumer.groupId}",
containerFactory = "queryTrackingContainerFactory"
)
public void listenForMessages(List<ConsumerRecord<String, QueryTrackingMessage>> consumerRecords, Acknowledgment ack) {
// Handle each record ...
}
When the application starts up, going to the actuator/prometheus endpoint I can see the metric for both consumers:
# HELP dps_kafka_consumer_fetch_manager_records_consumed_total The total number of records consumed
# TYPE dps_kafka_consumer_fetch_manager_records_consumed_total counter
dps_kafka_consumer_fetch_manager_records_consumed_total{client_id="consumer-qf-query-tracking-consumer-1",kafka_version="3.1.2",spring_id="not.managed.by.Spring.consumer-qf-query-tracking-consumer-1",} 7.0
dps_kafka_consumer_fetch_manager_records_consumed_total{client_id="consumer-QF-Routing-f5d0d9f1-e261-407b-954d-5d217211dee0-2",kafka_version="3.1.2",spring_id="not.managed.by.Spring.consumer-QF-Routing-f5d0d9f1-e261-407b-954d-5d217211dee0-2",} 0.0
But a few seconds later there is a new call to io.micrometer.core.instrument.binder.kafka.KafkaMetrics#checkAndBindMetrics, which removes a set of metrics (including kafka.consumer.fetch.manager.records.consumed.total):
onMeterRegistrationFailed - dps.kafka.consumer.fetch.manager.records.consumed.total string Prometheus requires that all meters with the same name have the same set of tag keys. There is already an existing meter named 'dps.kafka.consumer.fetch.manager.records.consumed.total' containing tag keys [client_id, kafka_version, spring_id]. The meter you are attempting to register has keys [client_id, kafka_version, spring_id, topic].
Going again to actuator/prometheus will only show the metric for the query-routing consumer:
# HELP deus_dps_persistence_kafka_consumer_fetch_manager_records_consumed_total The total number of records consumed for a topic
# TYPE deus_dps_persistence_kafka_consumer_fetch_manager_records_consumed_total counter
deus_dps_persistence_kafka_consumer_fetch_manager_records_consumed_total{client_id="consumer-QF-Routing-0a739a21-4764-411a-9cc6-0e60293b40b4-2",kafka_version="3.1.2",spring_id="not.managed.by.Spring.consumer-QF-Routing-0a739a21-4764-411a-9cc6-0e60293b40b4-2",theKey="routing",topic="QF_query_routing_v1",} 0.0
As you can see above, the metric for the query-tracking consumer is gone.
As the log says: The meter you are attempting to register has keys [client_id, kafka_version, spring_id, topic]. The issue is that I cannot find where this metric with a topic key is being registered. That registration is what triggers io.micrometer.core.instrument.binder.kafka.KafkaMetrics#checkAndBindMetrics to remove the metric for the query-tracking consumer.
I am using
micrometer-registry-prometheus version 1.9.5
spring boot version 2.7.5
spring kafka (org.springframework.kafka:spring-kafka)
My question is: why does registration of the metric kafka.consumer.fetch.manager.records.consumed.total fail, causing it to be removed for the query-tracking consumer, and how can I fix it?
I believe this is internal in Micrometer KafkaMetrics.
Periodically, it checks for new metrics; presumably, the topic one shows up after the consumer subscribes to the topic.
@Override
public void bindTo(MeterRegistry registry) {
this.registry = registry;
commonTags = getCommonTags(registry);
prepareToBindMetrics(registry);
checkAndBindMetrics(registry);
// vvv the scheduled task below is the periodic re-check vvv
scheduler.scheduleAtFixedRate(() -> checkAndBindMetrics(registry), getRefreshIntervalInMillis(),
getRefreshIntervalInMillis(), TimeUnit.MILLISECONDS);
}
You should be able to write a filter to exclude the one with fewer tags.
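The decision such a filter has to make can be sketched in plain Java, without Micrometer on the classpath. In a real registry the check would presumably live inside a `MeterFilter.deny(...)` predicate like the one already used in the question, reading the name via `id.getName()` and testing `id.getTag("topic")`. The class `TopicTagPolicy` and the constant `RECORDS_CONSUMED` are hypothetical names for illustration; the prefixed meter name is copied from the log message above.

```java
import java.util.Set;

/** Plain-Java sketch of the "deny the meter with fewer tags" decision. */
class TopicTagPolicy {

    // Prefixed meter name as it appears in the onMeterRegistrationFailed log line.
    static final String RECORDS_CONSUMED =
            "dps.kafka.consumer.fetch.manager.records.consumed.total";

    /**
     * Returns true when a meter should be rejected: it has the target name
     * but lacks the "topic" tag key, so it would clash with the per-topic meter.
     */
    static boolean shouldDeny(String meterName, Set<String> tagKeys) {
        return RECORDS_CONSUMED.equals(meterName) && !tagKeys.contains("topic");
    }
}
```

With this policy only the per-topic variant of the meter is ever registered, so Prometheus sees a single, consistent set of tag keys. The trade-off is that the consumer-level total (without a topic tag) is no longer exported.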

Spring kafka unit test listener not subscribing to topic

I have a sample project to explore Spring with Kafka (find here). I have a listener subscribing to the topic my-test-topic-upstream, which just logs the message and key and publishes the same to another topic, my-test-topic-downstream. I tried this against a local Kafka (a docker-compose file is included) and it works.
Now I'm trying to write a test for this using an embedded Kafka server. Under test I have an embedded server (TestContext.java) which should start before the test (in an overridden JUnit beforeAll).
private static EmbeddedKafkaBroker kafka() {
EmbeddedKafkaBroker kafkaEmbedded =
new EmbeddedKafkaBroker(
3,
false,
1,
"my-test-topic-upstream", "my-test-topic-downstream");
Map<String, String> brokerProperties = new HashMap<>();
brokerProperties.put("default.replication.factor", "1");
brokerProperties.put("offsets.topic.replication.factor", "1");
brokerProperties.put("group.initial.rebalance.delay.ms", "3000");
kafkaEmbedded.brokerProperties(brokerProperties);
try {
kafkaEmbedded.afterPropertiesSet();
} catch (Exception e) {
throw new RuntimeException(e);
}
return kafkaEmbedded;
}
Then I create a producer ( TickProducer ) and publish a message to the topic which I expect my listener will be able to consume.
public TickProducer(String brokers) {
Properties props = new Properties();
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, brokers);
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
producer = new KafkaProducer<>(props);
}
public RecordMetadata publishTick(String brand)
throws ExecutionException, InterruptedException {
return publish(TOPIC, brand, Instant.now().toString());
}
private RecordMetadata publish(String topic, String key, String value)
throws ExecutionException, InterruptedException {
final RecordMetadata recordMetadata;
recordMetadata = producer.send(new ProducerRecord<>(topic, key, value)).get();
producer.flush();
return recordMetadata;
}
I see the following log message repeated over and over:
11:32:35.745 [main] WARN o.apache.kafka.clients.NetworkClient - [Consumer clientId=consumer-1, groupId=my-test-group] Connection to node -1 could not be established. Broker may not be available.
and the application finally fails with:
11:36:52.774 [main] ERROR o.s.boot.SpringApplication - Application run failed
org.springframework.context.ApplicationContextException: Failed to start bean 'org.springframework.kafka.config.internalKafkaListenerEndpointRegistry'; nested exception is org.apache.kafka.common.errors.TimeoutException: Timeout expired while fetching topic metadata
Any tips here?
Look at the INFO log for ConsumerConfig to see where the consumer is trying to connect (compare it to the ProducerConfig). I suspect you haven't updated the Spring Boot bootstrap-servers property to point to the embedded broker.
See
/**
* Set the system property with this name to the list of broker addresses.
* @param brokerListProperty the brokerListProperty to set
* @return this broker.
* @since 2.3
*/
public EmbeddedKafkaBroker brokerListProperty(String brokerListProperty) {
this.brokerListProperty = brokerListProperty;
return this;
}
Set it to spring.kafka.bootstrap-servers which will then be used instead of SPRING_EMBEDDED_KAFKA_BROKERS.
BTW, it's generally easier to use the @EmbeddedKafka annotation instead of instantiating the server yourself.

Spring Kafka Producer not sending to Kafka 1.0.0 (Magic v1 does not support record headers)

I am using this docker-compose setup for setting up Kafka locally: https://github.com/wurstmeister/kafka-docker/
docker-compose up works fine, creating topics via shell works fine.
Now I try to connect to Kafka via spring-kafka:2.1.0.RELEASE
When starting up the Spring application it prints the correct version of Kafka:
o.a.kafka.common.utils.AppInfoParser : Kafka version : 1.0.0
o.a.kafka.common.utils.AppInfoParser : Kafka commitId : aaa7af6d4a11b29d
I try to send a message like this
kafkaTemplate.send("test-topic", UUID.randomUUID().toString(), "test");
Sending on client side fails with
UnknownServerException: The server experienced an unexpected error when processing the request
In the server console I get the message Magic v1 does not support record headers
Error when handling request {replica_id=-1,max_wait_time=100,min_bytes=1,max_bytes=2147483647,topics=[{topic=test-topic,partitions=[{partition=0,fetch_offset=39,max_bytes=1048576}]}]} (kafka.server.KafkaApis)
java.lang.IllegalArgumentException: Magic v1 does not support record headers
Googling suggests a version conflict, but the versions seem to match (org.apache.kafka:kafka-clients:1.0.0 is on the classpath).
Any clues? Thanks!
Edit:
I narrowed down the source of the problem. Sending plain Strings works, but sending Json via JsonSerializer results in the given problem. Here is the content of my producer config:
@Value("\${kafka.bootstrap-servers}")
lateinit var bootstrapServers: String
@Bean
fun producerConfigs(): Map<String, Any> =
HashMap<String, Any>().apply {
// list of host:port pairs used for establishing the initial connections to the Kafka cluster
put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers)
put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer::class.java)
put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, JsonSerializer::class.java)
}
@Bean
fun producerFactory(): ProducerFactory<String, MyClass> =
DefaultKafkaProducerFactory(producerConfigs())
@Bean
fun kafkaTemplate(): KafkaTemplate<String, MyClass> =
KafkaTemplate(producerFactory())
I had a similar issue. Spring Kafka's JsonSerializer and JsonSerde add type-information headers to values by default.
To prevent this issue, we need to disable those type-info headers.
If you are fine with the default JSON serialization, use the following (the key point here is ADD_TYPE_INFO_HEADERS):
Map<String, Object> props = new HashMap<>(defaultSettings);
props.put(JsonSerializer.ADD_TYPE_INFO_HEADERS, false);
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, JsonSerializer.class);
ProducerFactory<String, Object> producerFactory = new DefaultKafkaProducerFactory<>(props);
But if you need a custom JsonSerializer with a specific ObjectMapper (for example, one using PropertyNamingStrategy.SNAKE_CASE), you must disable type-info headers explicitly on the JsonSerializer instance, because Spring Kafka ignores the ADD_TYPE_INFO_HEADERS property passed to DefaultKafkaProducerFactory in that case (which, to me, is a design flaw in Spring Kafka):
JsonSerializer<Object> valueSerializer = new JsonSerializer<>(customObjectMapper);
valueSerializer.setAddTypeInfo(false);
ProducerFactory<String, Object> producerFactory = new DefaultKafkaProducerFactory<>(props, Serdes.String().serializer(), valueSerializer);
or if we use JsonSerde, then:
Map<String, Object> jsonSerdeProperties = new HashMap<>();
jsonSerdeProperties.put(JsonSerializer.ADD_TYPE_INFO_HEADERS, false);
JsonSerde<T> jsonSerde = new JsonSerde<>(serdeClass);
jsonSerde.configure(jsonSerdeProperties, false);
Solved. The problem is neither the broker, some docker cache nor the Spring app.
The problem was a console consumer which I used in parallel for debugging. This was an "old" consumer started with kafka-console-consumer.sh --topic=topic --zookeeper=...
It actually prints a warning when started: Using the ConsoleConsumer with old consumer is deprecated and will be removed in a future major release. Consider using the new consumer by passing [bootstrap-server] instead of [zookeeper].
A "new" consumer with --bootstrap-server option should be used (especially when using Kafka 1.0 with JsonSerializer).
Note: Using an old consumer here can indeed affect the producer.
I just ran a test against that docker image with no problems...
$docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
f093b3f2475c kafkadocker_kafka "start-kafka.sh" 33 minutes ago Up 2 minutes 0.0.0.0:32768->9092/tcp kafkadocker_kafka_1
319365849e48 wurstmeister/zookeeper "/bin/sh -c '/usr/sb…" 33 minutes ago Up 2 minutes 22/tcp, 2888/tcp, 3888/tcp, 0.0.0.0:2181->2181/tcp kafkadocker_zookeeper_1
.
@SpringBootApplication
public class So47953901Application {
public static void main(String[] args) {
SpringApplication.run(So47953901Application.class, args);
}
@Bean
public ApplicationRunner runner(KafkaTemplate<Object, Object> template) {
return args -> template.send("foo", "bar", "baz");
}
@KafkaListener(id = "foo", topics = "foo")
public void listen(String in) {
System.out.println(in);
}
}
.
spring.kafka.bootstrap-servers=192.168.177.135:32768
spring.kafka.consumer.auto-offset-reset=earliest
spring.kafka.consumer.enable-auto-commit=false
.
2017-12-23 13:27:27.990 INFO 21305 --- [ foo-0-C-1] o.s.k.l.KafkaMessageListenerContainer : partitions assigned: [foo-0]
baz
EDIT
Still works for me...
spring.kafka.bootstrap-servers=192.168.177.135:32768
spring.kafka.consumer.auto-offset-reset=earliest
spring.kafka.consumer.enable-auto-commit=false
spring.kafka.consumer.value-deserializer=org.springframework.kafka.support.serializer.JsonDeserializer
spring.kafka.producer.value-serializer=org.springframework.kafka.support.serializer.JsonSerializer
.
2017-12-23 15:27:59.997 INFO 44079 --- [ main] o.a.k.clients.producer.ProducerConfig : ProducerConfig values:
acks = 1
...
value.serializer = class org.springframework.kafka.support.serializer.JsonSerializer
...
2017-12-23 15:28:00.071 INFO 44079 --- [ foo-0-C-1] o.s.k.l.KafkaMessageListenerContainer : partitions assigned: [foo-0]
baz
You are using Kafka version <= 0.10.x.x.
With such a version, you must set JsonSerializer.ADD_TYPE_INFO_HEADERS to false, as below,
Map<String, Object> props = new HashMap<>(defaultSettings);
props.put(JsonSerializer.ADD_TYPE_INFO_HEADERS, false);
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, JsonSerializer.class);
ProducerFactory<String, Object> producerFactory = new DefaultKafkaProducerFactory<>(props);
for your producer factory properties.
If you are using a Kafka version > 0.10.x.x, it should just work.

Spring-kafka and kafka 0.10

I'm currently trying to use Kafka and spring-kafka in order to consume messages.
But I have trouble running several consumers for the same topic, and I have several questions:
1 - My consumers tend to disconnect after some time and have trouble reconnecting
The following WARN is raised regularly on my consumers:
2017-09-06 15:32:35.054 INFO 5203 --- [nListener-0-C-1] f.b.poc.crawler.kafka.KafkaListener : Consuming {"some-stuff": "yes"} from topic [job15]
2017-09-06 15:32:35.054 INFO 5203 --- [nListener-0-C-1] f.b.p.c.w.services.impl.CrawlingService : Start of crawling
2017-09-06 15:32:35.054 INFO 5203 --- [nListener-0-C-1] f.b.p.c.w.services.impl.CrawlingService : Url has already been treated ==> skipping
2017-09-06 15:32:35.054 WARN 5203 --- [nListener-0-C-1] o.a.k.c.c.internals.ConsumerCoordinator : Auto-commit of offsets {job15-3=OffsetAndMetadata{offset=11547, metadata=''}, job15-2=OffsetAndMetadata{offset=15550, metadata=''}} failed for group group-3: Commit cannot be completed since the group has already rebalanced and assigned the partitions to another member. This means that the time between subsequent calls to poll() was longer than the configured max.poll.interval.ms, which typically implies that the poll loop is spending too much time message processing. You can address this either by increasing the session timeout or by reducing the maximum size of batches returned in poll() with max.poll.records.
2017-09-06 15:32:35.054 INFO 5203 --- [nListener-0-C-1] o.a.k.c.c.internals.ConsumerCoordinator : Revoking previously assigned partitions [job15-3, job15-2] for group group-3
2017-09-06 15:32:35.054 INFO 5203 --- [nListener-0-C-1] s.k.l.ConcurrentMessageListenerContainer : partitions revoked:[job15-3, job15-2]
2017-09-06 15:32:35.054 INFO 5203 --- [nListener-0-C-1] o.a.k.c.c.internals.AbstractCoordinator : (Re-)joining group group-3
This causes the consumer to stop and wait for several seconds.
As mentioned in the message, I increased the consumer's session.timeout.ms to something like 30000. I still get the message.
As you can see in the provided logs, the disconnection occurs right after a record has finished processing.
So ... long before 30s of inactivity.
2 - Two consumer applications receive the same message REALLY often
While looking at my consumers' logs I saw that they tend to process the same message. I understood Kafka is at-least-once, but I never thought I would encounter so much duplication.
Fortunately I use Redis, but I have probably misunderstood some tuning / properties I need to set.
THE CODE
Note: I'm using ConcurrentMessageListenerContainer with auto-commit=true but run with 1 Thread. I just start several instances of the same application because the consumer uses services that aren't thread-safe.
KafkaContext.java
@Slf4j
@Configuration
@EnableConfigurationProperties(value = KafkaConfig.class)
class KafkaContext {
@Bean(destroyMethod = "stop")
public ConcurrentMessageListenerContainer kafkaInListener(IKafkaListener listener, KafkaConfig config) {
final ContainerProperties containerProperties =
new ContainerProperties(config.getIn().getTopic());
containerProperties.setMessageListener(listener);
final DefaultKafkaConsumerFactory<Integer, String> defaultKafkaConsumerFactory =
new DefaultKafkaConsumerFactory<>(consumerConfigs(config));
final ConcurrentMessageListenerContainer messageListenerContainer =
new ConcurrentMessageListenerContainer<>(defaultKafkaConsumerFactory, containerProperties);
messageListenerContainer.setConcurrency(config.getConcurrency());
messageListenerContainer.setAutoStartup(false);
return messageListenerContainer;
}
private Map<String, Object> consumerConfigs(KafkaConfig config) {
final String kafkaHost = config.getHost() + ":" + config.getPort();
log.info("Crawler_Worker connecting to kafka at {} with consumerGroup {}", kafkaHost, config.getIn().getGroupId());
final Map<String, Object> props = new HashMap<>();
props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, kafkaHost);
props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, true);
props.put(ConsumerConfig.GROUP_ID_CONFIG, config.getIn().getGroupId());
props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, JacksonNextSerializer.class);
props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, IntegerDeserializer.class);
props.put(ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG, 30000);
return props;
}
}
Listener
@Slf4j
@Component
class KafkaListener implements IKafkaListener {
private final ICrawlingService crawlingService;
@Autowired
public KafkaListener(ICrawlingService crawlingService) {
this.crawlingService = crawlingService;
}
@Override
public void onMessage(ConsumerRecord<Integer, Next> consumerRecord) {
log.info("Consuming {} from topic [{}]", JSONObject.wrap(consumerRecord.value()), consumerRecord.topic());
// Use the injected crawlingService field (the original referenced an undefined consumerService).
crawlingService.apply(consumerRecord.value());
}
}
The main issue here is that your consumer group is continuously being rebalanced. You are right about increasing session.timeout.ms, but I don't see this config applied in your configuration. Try removing:
props.put(ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG, 30000);
and setting:
props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, 10);
props.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, 30000);
You can increase MAX_POLL_RECORDS_CONFIG to get better throughput when fetching from the brokers, but if you process messages on a single thread it is safer to keep this value low.
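The reasoning behind a small max.poll.records can be made concrete with a little arithmetic: the whole batch returned by one poll() has to be processed before the next poll(), so batch size multiplied by the worst-case per-record time must stay below max.poll.interval.ms. The sketch below is only an illustration; the class `PollBudget`, the 1500 ms per-record figure, and the 0.5 safety factor are assumptions, not measurements from the question.

```java
/** Hypothetical helper: sizes max.poll.records from a per-record time budget. */
class PollBudget {

    /**
     * Largest batch size whose worst-case processing time fits within
     * maxPollIntervalMs * safetyFactor. Always returns at least 1.
     */
    static int maxSafePollRecords(long maxPollIntervalMs, long worstCaseMsPerRecord, double safetyFactor) {
        if (maxPollIntervalMs <= 0 || worstCaseMsPerRecord <= 0
                || safetyFactor <= 0.0 || safetyFactor > 1.0) {
            throw new IllegalArgumentException("intervals must be positive and 0 < safetyFactor <= 1");
        }
        // Time we allow ourselves to spend on one batch.
        long budgetMs = (long) (maxPollIntervalMs * safetyFactor);
        long records = budgetMs / worstCaseMsPerRecord;
        return (int) Math.max(1L, Math.min(records, (long) Integer.MAX_VALUE));
    }

    public static void main(String[] args) {
        // 30000 ms interval (as configured in the question), assumed 1500 ms per record,
        // half of the interval kept as headroom: 15000 / 1500 = 10 records.
        System.out.println(maxSafePollRecords(30_000L, 1_500L, 0.5)); // prints 10
    }
}
```

Under these assumed numbers the result happens to match the value of 10 suggested above. If a single record can take longer than the whole interval, no batch size helps, and the interval itself has to be raised instead.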

Storm Program not running

So I was trying to learn Apache Storm and was using the tutorialspoint guide as a reference for my first Storm program (https://www.tutorialspoint.com/apache_storm/apache_storm_quick_guide.htm).
I do not get the call log count output as expected. My ZooKeeper, however, shuts down.
My topology is:
public class logAnalyserStorm {
public static void main(String[] args) throws InterruptedException{
Config config = new Config();
config.setDebug(true);
TopologyBuilder builder = new TopologyBuilder();
builder.setSpout("call-log-reader-spout", new FakeCallLogGeneratorSpout(),100);
builder.setBolt("call-log-creator-bolt", new callLogCreatorBolt()).shuffleGrouping("call-log-reader-spout");
builder.setBolt("call-log-counter-bolt", new callLogCounterBolt()).fieldsGrouping("call-log-creator-bolt", new Fields("call"));
LocalCluster cluster = new LocalCluster();
cluster.submitTopology("logAnalyserStorm", config, builder.createTopology());
Thread.sleep(10000);
cluster.killTopology("logAnalyserStorm");
cluster.shutdown();
}
}
The error is:
20680 [Thread-10] INFO o.a.s.event - Event manager interrupted
20683 [main] INFO o.a.s.testing - Shutting down in process zookeeper
20683 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2000] INFO o.a.s.s.o.a.z.s.NIOServerCnxnFactory - NIOServerCnxn factory exited run method
I realized that my nimbus was not running. Ugh. Thank You
Change Thread.sleep(10000); to Thread.sleep(60000);
