Spring Kafka Stream - No type information in headers and no default type provided when using BiFunction - spring-boot

I am trying to join two topics and produce output to a third topic using a BiFunction. I am facing an issue with resolving the type of the incoming message: the left-side message is deserialized successfully, but the right side throws "No type information in headers and no default type provided".
When I step through the code, I can see it fails in org.springframework.kafka.support.serializer.JsonDeserializer at the line
Assert.state(localReader != null, "No headers available and no default type provided");
The messages are produced by Spring Boot with the Kafka binder, and the producer has the properties below.
spring.kafka.producer.key-serializer=org.apache.kafka.common.serialization.StringSerializer
spring.kafka.producer.value-serializer=org.springframework.kafka.support.serializer.JsonSerializer
##
spring.kafka.producer.properties.spring.json.type.mapping=type1:com.demo.domain.type2,type1:com.demo.domain.type2
spring.kafka.producer.properties.spring.json.trusted.packages=com.demo.domain
spring.kafka.producer.properties.spring.json.add.type.headers=true
And on the Kafka Streams binder consumer side:
# kafka stream setting
spring.cloud.stream.bindings.joinProcess-in-0.destination=local-stream-process-type1
spring.cloud.stream.bindings.joinProcess-in-1.destination=local-stream-process-type2
spring.cloud.stream.bindings.joinProcess-out-0.destination=local-stream-process-type3
spring.cloud.stream.kafka.streams.binder.functions.joinProcess.applicationId=local-stream-process
spring.cloud.stream.kafka.streams.binder.configuration.default.key.serde=org.apache.kafka.common.serialization.Serdes$StringSerde
spring.cloud.stream.kafka.streams.binder.configuration.default.value.serde=org.springframework.kafka.support.serializer.JsonSerde
spring.kafka.streams.properties.spring.json.trusted.packages=*
spring.kafka.properties.spring.json.type.mapping=type1:com.demo.domain.type2,type1:com.demo.domain.type2
spring.kafka.streams.properties.spring.json.use.type.headers=true
And my BiFunction looks like:
@Configuration
public class StreamsConfig {

    @Bean
    public RecordMessageConverter converter() {
        return new StringJsonMessageConverter();
    }

    @Bean
    public BiFunction<KStream<String, type1>, KStream<String, type2>, KStream<String, type3>> joinProcess() {
        return (type1, type2) ->
                type1.join(type2, joiner(),
                        JoinWindows.of(Duration.ofDays(1)));
    }

    private ValueJoiner<type1, type2, type3> joiner() {
        return (type1, type2) -> new type3("test");
    }
}
I have pretty much gone through all the previous questions, and none of them involved a BiFunction. The one thing I haven't tried is setting VALUE_TYPE_METHOD.
Update
I resolved my issue by explicitly providing the Serdes and disabling automatic type conversion.
@Bean
public BiFunction<KStream<String, Type1>, KStream<String, Type2>, KStream<String, Type3>> joinStream() {
    return (type1, type2) ->
            type1.join(type2, myValueJoiner(),
                    JoinWindows.of(Duration.ofMinutes(1)),
                    StreamJoined.with(Serdes.String(), new Type1Serde(), new Type2Serde()));
}
And I disabled the automatic deserialization (native decoding) as below:
spring.cloud.stream.bindings.joinStream-in-0.consumer.use-native-decoding=false
spring.cloud.stream.bindings.joinStream-in-1.consumer.use-native-decoding=false
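For reference, Type1Serde and Type2Serde are not shown in the post; a minimal sketch of what such a Serde could look like, assuming the domain classes are plain POJOs, is a JsonSerde fixed to the target type so the deserializer never needs type headers:

public class Type1Serde extends JsonSerde<Type1> {

    public Type1Serde() {
        // fixing the target type up front means no type information is needed in the headers
        super(Type1.class);
    }
}

Type2Serde would be the same with Type2.class.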

Related

ReadOnlyKeyValueStore from a KTable using spring-kafka

I am migrating a Kafka Streams implementation that uses the plain Kafka APIs to spring-kafka, as it is incorporated in a Spring Boot application.
Everything works fine: the stream, GlobalKTable, and branching I have all work perfectly, but I am having a hard time incorporating a ReadOnlyKeyValueStore. Based on the spring-kafka documentation here: https://docs.spring.io/spring-kafka/docs/2.6.10/reference/html/#streams-spring
It says:
If you need to perform some KafkaStreams operations directly, you can
access that internal KafkaStreams instance by using
StreamsBuilderFactoryBean.getKafkaStreams(). You can autowire
StreamsBuilderFactoryBean bean by type, but you should be sure to use
the full type in the bean definition.
Based on that, I tried to incorporate it into my example as in the following fragments:
@Bean(name = KafkaStreamsDefaultConfiguration.DEFAULT_STREAMS_CONFIG_BEAN_NAME)
public KafkaStreamsConfiguration defaultKafkaStreamsConfig() {
    Map<String, Object> props = defaultStreamsConfigs();
    props.put(StreamsConfig.APPLICATION_ID_CONFIG, "quote-stream");
    props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());
    props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, SpecificAvroSerde.class);
    props.put(ConsumerConfig.GROUP_ID_CONFIG, "stock-quotes-stream-group");
    return new KafkaStreamsConfiguration(props);
}

@Bean(name = KafkaStreamsDefaultConfiguration.DEFAULT_STREAMS_BUILDER_BEAN_NAME)
public StreamsBuilderFactoryBean defaultKafkaStreamsBuilder(KafkaStreamsConfiguration defaultKafkaStreamsConfig) {
    return new StreamsBuilderFactoryBean(defaultKafkaStreamsConfig);
}
...
final GlobalKTable<String, LeveragePrice> leverageBySymbolGKTable = streamsBuilder
.globalTable(KafkaConfiguration.LEVERAGE_PRICE_TOPIC,
Materialized.<String, LeveragePrice, KeyValueStore<Bytes, byte[]>>as("leverage-by-symbol-table")
.withKeySerde(Serdes.String())
.withValueSerde(leveragePriceSerde));
leveragePriceView = myKStreamsBuilder.getKafkaStreams().store("leverage-by-symbol-table", QueryableStoreTypes.keyValueStore());
But adding the StreamsBuilderFactoryBean definition (which seems to be needed to get a reference to KafkaStreams) causes an error:
The bean 'defaultKafkaStreamsBuilder', defined in class path resource [com/resona/springkafkastream/repository/KafkaConfiguration.class], could not be registered. A bean with that name has already been defined in class path resource [org/springframework/kafka/annotation/KafkaStreamsDefaultConfiguration.class] and overriding is disabled.
The issue is that I don't want to control the lifecycle of the stream, which is what I get with the plain Kafka APIs; I want Spring to manage it and just give me a reference to the default managed instance. But whenever I try to expose the bean, it gives the error above. Any ideas on the correct approach to this using spring-kafka?
P.S. - I am not interested in solutions using spring-cloud-stream; I am looking for a spring-kafka implementation.
You don't need to define any new beans; something like this should work...
spring.application.name=quote-stream
spring.kafka.streams.properties.default.key.serde=org.apache.kafka.common.serialization.Serdes$StringSerde
spring.kafka.streams.properties.default.value.serde=org.apache.kafka.common.serialization.Serdes$StringSerde
@SpringBootApplication
@EnableKafkaStreams
public class So69669791Application {

    public static void main(String[] args) {
        SpringApplication.run(So69669791Application.class, args);
    }

    @Bean
    GlobalKTable<String, String> leverageBySymbolGKTable(StreamsBuilder sb) {
        return sb.globalTable("gkTopic",
                Materialized.<String, String, KeyValueStore<Bytes, byte[]>> as("leverage-by-symbol-table"));
    }

    private ReadOnlyKeyValueStore<String, String> leveragePriceView;

    @Bean
    StreamsBuilderFactoryBean.Listener afterStart(StreamsBuilderFactoryBean sbfb,
            GlobalKTable<String, String> leverageBySymbolGKTable) {

        StreamsBuilderFactoryBean.Listener listener = new StreamsBuilderFactoryBean.Listener() {

            @Override
            public void streamsAdded(String id, KafkaStreams streams) {
                leveragePriceView = streams.store("leverage-by-symbol-table", QueryableStoreTypes.keyValueStore());
            }

        };
        sbfb.addListener(listener);
        return listener;
    }

    @Bean
    KStream<String, String> stream(StreamsBuilder builder) {
        KStream<String, String> stream = builder.stream("someTopic");
        stream.to("otherTopic");
        return stream;
    }

}
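Once the listener has run, leveragePriceView can be queried like any other read-only store; a minimal sketch of an accessor (the null guard and method are assumptions, not part of the original answer):

public String leverageFor(String symbol) {
    // leveragePriceView is populated by the streamsAdded callback above,
    // so guard against lookups that arrive before the streams have started
    if (this.leveragePriceView == null) {
        throw new IllegalStateException("State store not available yet");
    }
    return this.leveragePriceView.get(symbol);
}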

Producer callback in Spring Cloud Stream with reactor core publisher

I have written a Spring Cloud Stream application where producers publish messages to designated Kafka topics. My question is: how can I add a producer callback to receive an ack/confirmation that a message has been successfully published to the topic, the way we do with Spring Kafka's producer.send(record, new Callback() { ... }) (keeping the producer asynchronous)? Below is my code:
private final Sinks.Many<Message<?>> responseProcessor = Sinks.many().multicast().onBackpressureBuffer();

@Bean
public Supplier<Flux<Message<?>>> event() {
    return responseProcessor::asFlux;
}

public Message<?> publishEvent(String status) {
    try {
        String key = ...;
        response = MessageBuilder.withPayload(payload)
                .setHeader(KafkaHeaders.MESSAGE_KEY, key)
                .build();
        responseProcessor.tryEmitNext(response);
    }
How can I make sure that tryEmitNext has successfully written to the topic?
Is implementing a ProducerListener a possible solution? I couldn't find a concrete solution or documentation for this in Spring Cloud Stream.
UPDATE
I have implemented the below now; it seems to work as expected:
@Component
public class MyProducerListener<K, V> implements ProducerListener<K, V> {

    @Override
    public void onSuccess(ProducerRecord<K, V> producerRecord, RecordMetadata recordMetadata) {
        // Do nothing on onSuccess
    }

    @Override
    public void onError(ProducerRecord<K, V> producerRecord, RecordMetadata recordMetadata, Exception exception) {
        log.error("Producer exception occurred while publishing message : {}, exception : {}", producerRecord, exception);
    }
}

@Bean
ProducerMessageHandlerCustomizer<KafkaProducerMessageHandler<?, ?>> customizer(MyProducerListener pl) {
    return (handler, destinationName) -> handler.getKafkaTemplate().setProducerListener(pl);
}
See the Kafka Producer Properties.
recordMetadataChannel
The bean name of a MessageChannel to which successful send results should be sent; the bean must exist in the application context. The message sent to the channel is the sent message (after conversion, if any) with an additional header KafkaHeaders.RECORD_METADATA. The header contains a RecordMetadata object provided by the Kafka client; it includes the partition and offset where the record was written in the topic.
RecordMetadata meta = sendResultMsg.getHeaders().get(KafkaHeaders.RECORD_METADATA, RecordMetadata.class);
Failed sends go to the producer error channel (if configured); see Error Channels. Default: null
You can add a @ServiceActivator to consume from this channel asynchronously.
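A hedged sketch of how that could be wired for the event binding above; the channel name sendResults and the logger are assumptions:

spring.cloud.stream.kafka.bindings.event-out-0.producer.record-metadata-channel=sendResults

@Bean
public MessageChannel sendResults() {
    // the channel bean must exist in the application context, per the docs above
    return new DirectChannel();
}

@ServiceActivator(inputChannel = "sendResults")
public void handleSendResult(Message<?> sent) {
    RecordMetadata meta = sent.getHeaders().get(KafkaHeaders.RECORD_METADATA, RecordMetadata.class);
    log.info("Published to partition {} at offset {}", meta.partition(), meta.offset());
}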

How to create multi output stream from single input stream with Spring Cloud Kafka stream binder?

I am trying to create multiple output streams (with different time windows) from a single input stream.
interface AnalyticsBinding {

    String PAGE_VIEWS_IN = "pvin";
    String PAGE_VIEWS_COUNTS_OUT_Last_5_Minutes = "pvcout_last_5_minutes";
    String PAGE_VIEWS_COUNTS_OUT_Last_30_Minutes = "pvcout_last_30_minutes";

    @Input(PAGE_VIEWS_IN)
    KStream<String, PageViewEvent> pageViewsIn();

    @Output(PAGE_VIEWS_COUNTS_OUT_Last_5_Minutes)
    KStream<String, Long> pageViewsCountOutLast5Minutes();

    @Output(PAGE_VIEWS_COUNTS_OUT_Last_30_Minutes)
    KStream<String, Long> pageViewsCountOutLast30Minutes();
}
@StreamListener
@SendTo({ AnalyticsBinding.PAGE_VIEWS_COUNTS_OUT_Last_5_Minutes })
public KStream<String, Long> processPageViewEventForLast5Mintues(
        @Input(AnalyticsBinding.PAGE_VIEWS_IN) KStream<String, PageViewEvent> stream) {
    // aggregate by Duration.ofMinutes(5)
}

@StreamListener
@SendTo({ AnalyticsBinding.PAGE_VIEWS_COUNTS_OUT_Last_30_Minutes })
public KStream<String, Long> processPageViewEventForLast30Mintues(
        @Input(AnalyticsBinding.PAGE_VIEWS_IN) KStream<String, PageViewEvent> stream) {
    // aggregate by Duration.ofMinutes(30)
}
When I start the application, only one stream task works. Is there a way to get both processPageViewEventForLast5Mintues and processPageViewEventForLast30Mintues working simultaneously?
You are using the same input binding in both processors, and that's why you are seeing only one of them working. Add another input binding in the binding interface and set its destination to the same topic. Also, change one of the StreamListener methods to use this new binding name.
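A hedged sketch of that extra binding; the name pvin2 is an assumption:

// in AnalyticsBinding, next to PAGE_VIEWS_IN
String PAGE_VIEWS_IN_2 = "pvin2";

@Input(PAGE_VIEWS_IN_2)
KStream<String, PageViewEvent> pageViewsIn2();

Point it at the same topic,

spring.cloud.stream.bindings.pvin2.destination=<same topic as pvin>

and change one of the two @StreamListener methods to consume @Input(AnalyticsBinding.PAGE_VIEWS_IN_2) instead.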
With that said, if you are using the latest versions of Spring Cloud Stream, you should consider migrating to the functional model. For example, the following should work.
@Bean
public Function<KStream<String, PageViewEvent>, KStream<String, Long>> processPageViewEventForLast5Mintues() {
    ...
}

and

@Bean
public Function<KStream<String, PageViewEvent>, KStream<String, Long>> processPageViewEventForLast30Mintues() {
    ...
}
The binder automatically creates two distinct input bindings in this case.
You can set destinations on those bindings.
spring.cloud.stream.bindings.processPageViewEventForLast5Mintues-in-0.destination=<your Kafka topic>
spring.cloud.stream.bindings.processPageViewEventForLast30Mintues-in-0.destination=<your Kafka topic>
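For completeness, a hedged sketch of what the 5-minute variant's body could look like; the store name and the unwrap of the windowed key are assumptions:

@Bean
public Function<KStream<String, PageViewEvent>, KStream<String, Long>> processPageViewEventForLast5Mintues() {
    return pageViews -> pageViews
            .groupByKey()
            .windowedBy(TimeWindows.of(Duration.ofMinutes(5)))
            .count(Materialized.as("pv-counts-last-5-minutes"))
            .toStream()
            // unwrap the windowed key so the output stays KStream<String, Long>
            .map((windowedKey, count) -> KeyValue.pair(windowedKey.key(), count));
}

The 30-minute variant is identical apart from Duration.ofMinutes(30) and a different store name.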

Kafka stream does not retry on deserialisation error

Spring Cloud Kafka Streams does not retry upon a deserialization error, even after specific configuration. The expectation is that it should retry based on the configured retry policy and, at the end, push the failed message to a DLQ.
The configuration is as below.
spring.cloud.stream.bindings.input_topic.consumer.maxAttempts=7
spring.cloud.stream.bindings.input_topic.consumer.backOffInitialInterval=500
spring.cloud.stream.bindings.input_topic.consumer.backOffMultiplier=10.0
spring.cloud.stream.bindings.input_topic.consumer.backOffMaxInterval=100000
spring.cloud.stream.bindings.input_topic.consumer.defaultRetryable=true
public interface MyStreams {

    String INPUT_TOPIC = "input_topic";
    String INPUT_TOPIC2 = "input_topic2";
    String ERROR = "apperror";
    String OUTPUT = "output";

    @Input(INPUT_TOPIC)
    KStream<String, InObject> inboundTopic();

    @Input(INPUT_TOPIC2)
    KStream<Object, InObject> inboundTOPIC2();

    @Output(OUTPUT)
    KStream<Object, outObject> outbound();

    @Output(ERROR)
    MessageChannel outboundError();
}

@StreamListener(MyStreams.INPUT_TOPIC)
@SendTo(MyStreams.OUTPUT)
public KStream<Key, outObject> processSwft(KStream<Key, InObject> myStream) {
    return myStream.mapValues(this::transform);
}
The metadataRetryOperations in KafkaTopicProvisioner.java is always null and hence it creates a new RetryTemplate in the afterPropertiesSet().
public KafkaTopicProvisioner(KafkaBinderConfigurationProperties kafkaBinderConfigurationProperties, KafkaProperties kafkaProperties) {
    Assert.isTrue(kafkaProperties != null, "KafkaProperties cannot be null");
    this.adminClientProperties = kafkaProperties.buildAdminProperties();
    this.configurationProperties = kafkaBinderConfigurationProperties;
    this.normalalizeBootPropsWithBinder(this.adminClientProperties, kafkaProperties, kafkaBinderConfigurationProperties);
}

public void setMetadataRetryOperations(RetryOperations metadataRetryOperations) {
    this.metadataRetryOperations = metadataRetryOperations;
}

public void afterPropertiesSet() throws Exception {
    if (this.metadataRetryOperations == null) {
        RetryTemplate retryTemplate = new RetryTemplate();

        SimpleRetryPolicy simpleRetryPolicy = new SimpleRetryPolicy();
        simpleRetryPolicy.setMaxAttempts(10);
        retryTemplate.setRetryPolicy(simpleRetryPolicy);

        ExponentialBackOffPolicy backOffPolicy = new ExponentialBackOffPolicy();
        backOffPolicy.setInitialInterval(100L);
        backOffPolicy.setMultiplier(2.0D);
        backOffPolicy.setMaxInterval(1000L);
        retryTemplate.setBackOffPolicy(backOffPolicy);
        this.metadataRetryOperations = retryTemplate;
    }
}
The retry configuration only works with MessageChannel-based binders. With the KStream binder, Spring just helps with building the topology in a prescribed way; it is not involved with the message flow once the topology is built.
The next version of spring-kafka (used by the binder) has added the RecoveringDeserializationExceptionHandler (commit here); while it can't help with retries, it can be used together with a DeadLetterPublishingRecoverer to send the record to a dead-letter topic.
You can use a RetryTemplate within your processors/transformers to retry specific operations.
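For example, a hedged sketch that wraps the transform call from the processor above in a RetryTemplate; the retry settings are assumptions:

RetryTemplate retryTemplate = new RetryTemplate();
retryTemplate.setRetryPolicy(new SimpleRetryPolicy(3)); // 3 attempts, then give up

return myStream.mapValues(value ->
        // retries the (possibly flaky) transformation itself, not the deserialization
        retryTemplate.execute(context -> transform(value)));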
Spring cloud Kafka stream does not retry upon deserialization error even after specific configuration.
The behavior you are seeing matches the default settings of Kafka Streams when it encounters a deserialization error.
From https://docs.confluent.io/current/streams/faq.html#handling-corrupted-records-and-deserialization-errors-poison-pill-records:
LogAndFailExceptionHandler implements DeserializationExceptionHandler and is the default setting in Kafka Streams. It handles any encountered deserialization exceptions by logging the error and throwing a fatal error to stop your Streams application. If your application is configured to use LogAndFailExceptionHandler, then an instance of your application will fail-fast when it encounters a corrupted record by terminating itself.
I am not familiar with Spring's facade for Kafka Streams, but you probably need to configure the desired org.apache.kafka.streams.errors.DeserializationExceptionHandler, instead of configuring retries (they are meant for a different purpose). Or, you may want to implement your own, custom handler (see link above for more information), and then configure Spring/KStreams to use it.
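With the Kafka Streams binder shown earlier, a hedged way to do that is to pass the handler through the binder's configuration passthrough, e.g. log-and-continue so poison pills are logged and skipped instead of stopping the application:

spring.cloud.stream.kafka.streams.binder.configuration.default.deserialization.exception.handler=org.apache.kafka.streams.errors.LogAndContinueExceptionHandler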

KStream with Testbinder - Spring Cloud Stream Kafka

I recently started looking into Spring Cloud Stream for Kafka and have struggled to make the TestBinder work with KStreams. Is this a known limitation, or have I just overlooked something?
This works fine:
String processor:
@StreamListener(TopicBinding.INPUT)
@SendTo(TopicBinding.OUTPUT)
public String process(String message) {
    return message + " world";
}
String test:
@Test
@SuppressWarnings("unchecked")
public void testString() {
    Message<String> message = new GenericMessage<>("Hello");
    topicBinding.input().send(message);
    Message<String> received = (Message<String>) messageCollector.forChannel(topicBinding.output()).poll();
    assertThat(received.getPayload(), equalTo("Hello world"));
}
But when I try to use KStream in my processor, I can't get the TestBinder to work.
KStream processor:
@SendTo(TopicBinding.OUTPUT)
public KStream<String, String> process(
        @Input(TopicBinding.INPUT) KStream<String, String> events) {
    return events.mapValues((value) -> value + " world");
}
KStream test:
@Test
@SuppressWarnings("unchecked")
public void testKstream() {
    Message<String> message = MessageBuilder
            .withPayload("Hello")
            .setHeader(KafkaHeaders.TOPIC, "event.sirism.dev".getBytes())
            .setHeader(KafkaHeaders.MESSAGE_KEY, "Test".getBytes())
            .build();
    topicBinding.input().send(message);
    Message<String> received = (Message<String>)
            messageCollector.forChannel(topicBinding.output()).poll();
    assertThat(received.getPayload(), equalTo("Hello world"));
}
As you might have noticed, I omitted the @StreamListener from the KStream processor; without it, the test binder doesn't seem to find the handler (but with it, the application fails on startup).
Is this a known bug / limitation, or am I just doing something stupid here?
The test binder is only for MessageChannel-based binders (subclasses of AbstractMessageChannelBinder). The KStreamBinder does not use MessageChannels.
You can test using the real binder and an embedded Kafka broker, provided by the spring-kafka-test module.
Also see this issue.
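A hedged sketch of such a test with the real binder and an embedded broker; the output topic name, group id, and binder brokers property are assumptions:

@SpringBootTest(properties = "spring.cloud.stream.kafka.streams.binder.brokers=${spring.embedded.kafka.brokers}")
@EmbeddedKafka(partitions = 1, topics = { "event.sirism.dev", "event.out" })
class KStreamProcessorTests {

    @Test
    void processAppendsWorld(@Autowired EmbeddedKafkaBroker broker) throws Exception {
        // send a record to the input topic with the plain Kafka client
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(
                KafkaTestUtils.producerProps(broker), new StringSerializer(), new StringSerializer())) {
            producer.send(new ProducerRecord<>("event.sirism.dev", "Test", "Hello")).get();
        }

        // read the transformed record back from the output topic
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(
                KafkaTestUtils.consumerProps("testGroup", "false", broker),
                new StringDeserializer(), new StringDeserializer())) {
            broker.consumeFromAnEmbeddedTopic(consumer, "event.out");
            ConsumerRecord<String, String> received = KafkaTestUtils.getSingleRecord(consumer, "event.out");
            assertThat(received.value(), equalTo("Hello world"));
        }
    }
}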
