LongDeserializer exception while reading KTable using spring-kafka-stream - apache-kafka-streams

I am trying to read a KTable using the spring-cloud-stream-binder-kafka-streams project. Can we read a KTable using Spring's @StreamListener and all the interfaces that spring-cloud-stream provides around messaging?
I am getting a LongDeserializer exception while reading the KTable.
I am using springCloudVersion = 'Finchley.RC1' and springBootVersion = '2.0.1.RELEASE'.
The project is available on GitHub at
https://github.com/jaysara/KStreamAnalytics
Here is the stack trace:
Exception in thread "panalytics-ac0fa75f-2ae4-4b26-9a04-1f80d1479112-StreamThread-2" org.apache.kafka.streams.errors.StreamsException: Deserialization exception handler is set to fail upon a deserialization error. If you would rather have the streaming pipeline continue after a deserialization error, please set the default.deserialization.exception.handler appropriately.
at org.apache.kafka.streams.processor.internals.RecordDeserializer.deserialize(RecordDeserializer.java:74)
at org.apache.kafka.streams.processor.internals.RecordQueue.addRawRecords(RecordQueue.java:91)
at org.apache.kafka.streams.processor.internals.PartitionGroup.addRawRecords(PartitionGroup.java:117)
at org.apache.kafka.streams.processor.internals.StreamTask.addRecords(StreamTask.java:549)
at org.apache.kafka.streams.processor.internals.StreamThread.addRecordsToTasks(StreamThread.java:920)
at org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:821)
at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:774)
at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:744)
Caused by: org.apache.kafka.common.errors.SerializationException: Size of data received by LongDeserializer is not 8

You need to uncomment this line: https://github.com/jaysara/KStreamAnalytics/blob/master/src/main/resources/application.properties#L19
spring.cloud.stream.bindings.policyPaidAnalytic.producer.useNativeEncoding=true
By default, the binder serializes on the outbound using application/json as the content type. So, in your case, the data was going out as JSON (a String), and that's why you were getting that LongDeserializer exception. By setting the above flag to true, you are asking the binder to stand back and let Kafka Streams serialize natively using the LongSerde.
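As an illustration of what "serialize natively" means, here is a minimal Kafka Streams sketch (the input topic, store name, and class name are illustrative, not taken from the linked repo); with native encoding enabled, the Serdes declared in the topology are what actually reach the policyAnalytic topic:

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.kstream.Produced;

public class NativeLongSerdeSketch {

    // With useNativeEncoding=true the binder no longer converts the outbound
    // payload to JSON, so the Serdes declared here are used on the wire.
    public Topology build() {
        StreamsBuilder builder = new StreamsBuilder();

        KTable<String, Long> counts = builder
                .stream("policy-events", Consumed.with(Serdes.String(), Serdes.String()))
                .groupByKey()
                .count(Materialized.as("policy-counts"));

        // LongSerde on the value side: a downstream LongDeserializer now
        // receives exactly 8 bytes per record value.
        counts.toStream().to("policyAnalytic", Produced.with(Serdes.String(), Serdes.Long()));

        return builder.build();
    }
}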
When you re-run, you might want to clear your topic policyAnalytic or use a new topic.
Hope that helps.

Related

Implementing DLQ in Kafka using Spring Cloud Stream with Batch mode enabled

I am trying to implement a DLQ using Spring Cloud Stream with batch mode enabled:
@Bean
public ListenerContainerCustomizer<AbstractMessageListenerContainer<?, ?>> customizer(BatchErrorHandler handler) {
    return (container, destinationName, group) -> {
        if (dlqEnabledTopic.contains(destinationName)) {
            container.setBatchErrorHandler(handler);
        }
    };
}

@Bean
public BatchErrorHandler batchErrorHandler(KafkaOperations<String, byte[]> kafkaOperations) {
    CustomDeadLetterPublishingRecoverer recoverer = new CustomDeadLetterPublishingRecoverer(kafkaOperations,
            (cr, e) -> new TopicPartition(cr.topic() + "_dlq", cr.partition()));
    return new RecoveringBatchErrorHandler(recoverer, new FixedBackOff(1000, 1));
}
but I have a few queries:
How do I configure the key/value serializer using properties? My message is of String type, but KafkaOperations is using ByteArraySerializer.
There are multiple messages in the batch, but if the first message fails it goes to the DLQ and I don't see the next messages being processed.
Requirement: if the batch fails at any index, I need only that message to be sent to the DLQ and the rest of the messages to be processed again.
Is DLQ now supported with batch mode, just like it can be enabled using properties in record mode?
Key/value serializers can be configured with the spring.kafka.producer.* properties; however, the DLT publishing should use the same serializers as the main stream app, and ByteArraySerializer is generally correct (see the sketch after this answer).
The recovering batch error handler will perform seeks for the unprocessed records and they will be redelivered. Debug logging should help you figure out what's wrong. If you can't figure it out, provide an MCRE that exhibits the behavior you are seeing.
No; the binder does not support DLQ for batch mode; configuring the error handler is the correct approach.
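For the serializer point, a minimal sketch (assuming Spring Boot's auto-configured KafkaProperties; the class and bean names are mine, not from the question) of pinning the serializers on the KafkaOperations the recoverer publishes with:

import java.util.Map;

import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.serialization.ByteArraySerializer;
import org.apache.kafka.common.serialization.StringSerializer;
import org.springframework.boot.autoconfigure.kafka.KafkaProperties;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.core.DefaultKafkaProducerFactory;
import org.springframework.kafka.core.KafkaOperations;
import org.springframework.kafka.core.KafkaTemplate;

@Configuration
public class DlqProducerConfig {

    // Start from the spring.kafka.producer.* settings, then pin the serializers
    // so the DLT records are written with the same types the listener consumed.
    @Bean
    public KafkaOperations<String, byte[]> dlqKafkaOperations(KafkaProperties kafkaProperties) {
        Map<String, Object> props = kafkaProperties.buildProducerProperties();
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, ByteArraySerializer.class);
        return new KafkaTemplate<>(new DefaultKafkaProducerFactory<>(props));
    }
}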

How to set a separate bootstrap server for the DLT of a binding

I need to send invalid consumed messages to a DLT, but on a separate bootstrap server. I currently have this config:
spring.cloud.stream.binders.some-kafka-binder.type=kafka
spring.cloud.stream.binders.some-kafka-binder.environment.spring.cloud.stream.kafka.binder.brokers=localhost:29092
spring.cloud.stream.bindings.processor-in-0.binder=some-kafka-binder
spring.cloud.stream.bindings.processor-in-0.group=${spring.application.name}
spring.cloud.stream.bindings.processor-in-0.destination=outbound-topic
spring.cloud.stream.kafka.bindings.processor-in-0.consumer.configuration.value.deserializer=io.confluent.kafka.serializers.KafkaAvroDeserializer
spring.cloud.stream.kafka.bindings.processor-in-0.consumer.configuration.schema.registry.url=${schema.registry.url}
spring.cloud.stream.kafka.bindings.processor-in-0.consumer.configuration.specific.avro.reader=true
spring.cloud.stream.kafka.bindings.processor-in-0.consumer.enable-dlq=true
spring.cloud.stream.kafka.bindings.processor-in-0.consumer.dlq-name=outbound-topic.DLT
spring.cloud.stream.kafka.bindings.processor-in-0.consumer.dlq-producer-properties.configuration.value.serializer=io.confluent.kafka.serializers.KafkaAvroSerializer
spring.cloud.stream.kafka.bindings.processor-in-0.consumer.dlq-producer-properties.configuration.schema.registry.url=${schema.registry.url}
spring.cloud.stream.kafka.bindings.processor-in-0.consumer.dlq-producer-properties.configuration.bootstrap.servers=localhost:9092
...but I'm getting this error:
Caused by: java.lang.IllegalStateException: bootstrap.servers cannot be overridden at the binding level; use multiple binders instead
at org.springframework.util.Assert.state(Assert.java:76)
at org.springframework.cloud.stream.binder.kafka.KafkaMessageChannelBinder.getProducerFactory(KafkaMessageChannelBinder.java:560)
at org.springframework.cloud.stream.binder.kafka.KafkaMessageChannelBinder.getErrorMessageHandler(KafkaMessageChannelBinder.java:1148)
at org.springframework.cloud.stream.binder.kafka.KafkaMessageChannelBinder.getErrorMessageHandler(KafkaMessageChannelBinder.java:158)
at org.springframework.cloud.stream.binder.AbstractMessageChannelBinder.registerErrorInfrastructure(AbstractMessageChannelBinder.java:695)
at org.springframework.cloud.stream.binder.AbstractMessageChannelBinder.registerErrorInfrastructure(AbstractMessageChannelBinder.java:639)
at org.springframework.cloud.stream.binder.kafka.KafkaMessageChannelBinder.createConsumerEndpoint(KafkaMessageChannelBinder.java:734)
at org.springframework.cloud.stream.binder.kafka.KafkaMessageChannelBinder.createConsumerEndpoint(KafkaMessageChannelBinder.java:158)
at org.springframework.cloud.stream.binder.AbstractMessageChannelBinder.doBindConsumer(AbstractMessageChannelBinder.java:408)
... 27 common frames omitted
Is there a way in Spring Cloud that I can set this up? I really hope there's no need for me to do a custom DLT implementation.
In case you're wondering why there's a need for another bootstrap setup for the DLT: there are two separate AWS KMS accounts involved.
When you enable DLT, the destination needs to be on the same Kafka cluster as the input topic. At the moment, the binder does not provide a way to use a different cluster for the DLT. I think you can work around this, though, if you are willing to create an extra topic. Let the DLT be on the first cluster, so that when an error occurs the record goes to the DLT on the same cluster the data came from. Then define another passthrough function for which that DLT is the input topic and whose output topic is bound to the binder for the second cluster.
@Bean
public Consumer<IN> consumer() {
    // records in error go to a DLT on the first cluster
    return in -> {
        // ... business logic that may fail ...
    };
}

// input is the DLT topic from the first cluster
// output is a topic on the second cluster
@Bean
public Function<IN, OUT> function() {
    return in -> out;
}
spring.cloud.stream.binders.some-kafka-binder-1.type=kafka
spring.cloud.stream.binders.some-kafka-binder-1.environment.spring.cloud.stream.kafka.binder.brokers=localhost:29092
spring.cloud.stream.binders.some-kafka-binder-2.type=kafka
spring.cloud.stream.binders.some-kafka-binder-2.environment.spring.cloud.stream.kafka.binder.brokers=localhost:9092
Also, remove the bootstrap config from the DLT configuration.
Feel free to add a new feature request at the GitHub repository for the Kafka binder. We can consider using the StreamBridge API, perhaps, to accommodate this use case.

Spring boot and CloudEvent AMQP binding

I'm trying to implement a CloudEvent demo.
I have a few Spring Boot services with RabbitMQ as a message bus; they all send messages to a queue and one listens for the queue messages.
I'm trying to wrap my messages as CloudEvents to make them more standard.
I use the following code to wrap the message (data) as a CloudEvent:
try {
    inputEvent = CloudEventBuilder.v1()
            .withSource(new URI("app://" + messageData.getChangeRequestId().toString()))
            .withDataContentType("application/json")
            .withId(messageData.myId().toString())
            .withType("com.data.BaseMessageData")
            .withData(objMapper.writeValueAsBytes(eventData))
            .build();
} catch (Exception e) {
    throw new MyMessagingException("Failed to convert the message into json. (See inner exception for further details)", e);
}
The data is converted to bytes since the message's CloudEventData is byte-based.
Of course, in my listener method I get an exception, since SimpleMessageConverter can't handle a byte array.
Now, I could implement a custom message handler, or check out the suggested CloudEvents AMQP binding solution, but I'm not keen on the amount of code it involves and I don't want to bring in more technologies unless absolutely necessary.
Should I go down this path and implement a custom message converter?
Is there any other standard solution for standardizing service messaging over queues?
You will need a custom message converter; but see this blog post:
https://spring.io/blog/2020/12/10/cloud-events-and-spring-part-1
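As a rough sketch of such a converter (my own illustration, not the blog's code; it assumes the CloudEvents Java SDK with the cloudevents-json-jackson module on the classpath and uses JSON structured mode):

import org.springframework.amqp.core.Message;
import org.springframework.amqp.core.MessageProperties;
import org.springframework.amqp.support.converter.MessageConversionException;
import org.springframework.amqp.support.converter.MessageConverter;

import io.cloudevents.CloudEvent;
import io.cloudevents.core.format.EventFormat;
import io.cloudevents.core.provider.EventFormatProvider;
import io.cloudevents.jackson.JsonFormat;

public class CloudEventMessageConverter implements MessageConverter {

    // Resolves the JSON event format; requires cloudevents-json-jackson on the classpath.
    private final EventFormat format =
            EventFormatProvider.getInstance().resolveFormat(JsonFormat.CONTENT_TYPE);

    @Override
    public Message toMessage(Object object, MessageProperties props) {
        if (!(object instanceof CloudEvent)) {
            throw new MessageConversionException("Expected a CloudEvent payload");
        }
        // Structured mode: the whole event (attributes + data) travels in the body.
        props.setContentType(JsonFormat.CONTENT_TYPE);
        return new Message(format.serialize((CloudEvent) object), props);
    }

    @Override
    public Object fromMessage(Message message) {
        return format.deserialize(message.getBody());
    }
}

You would then register it via RabbitTemplate#setMessageConverter and on the listener container factory, so both the senders and the listener work with CloudEvent objects directly.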

JmsListener called again and again when an error happens in the method

In a Spring Boot application, I have a class with a JMS listener:
public class PaymentNotification {

    @JmsListener(destination = "payment")
    public void receive(String payload) throws Exception {
        // map/String conversion
        ....
        paymentEvent = billingService.insert(paymentEvent); // transactional method
        // call REST...
        billingService.save(paymentEvent);
        // send info to JMS
    }
}
I saw that when an error happens, data is inserted into the database, that's OK, but it's like the receive method is called again and again... yet the queue is empty when I check on the server.
If there is an error, I don't want the method to be called again. Is there something for that?
The JMS Message Headers might contain additional information to help with your processing. In particular JMSRedelivered could be of some value. The Oracle doc states that "If a client receives a message with the JMSRedelivered field set, it is likely, but not guaranteed, that this message was delivered earlier but that its receipt was not acknowledged at that time."
I ran the following code to explore what was available in my configuration (Spring Boot with IBM MQ).
@JmsListener(destination = "DEV.QUEUE.1")
public void receive(Message message) throws Exception {
    for (Enumeration<String> e = message.getPropertyNames(); e.hasMoreElements();) {
        System.out.println(e.nextElement());
    }
}
From here I could see that JMSXDeliveryCount is available in JMS 2.0. If that property is not available, then you may well find something similar in your own configuration.
One strategy would be to use JMSXDeliveryCount, a vendor-specific property, or maybe JMSRedelivered (if suitable for your needs) as a check before you process the message. Typically, the message would be sent to a specific backout queue when the redelivery count exceeds a set threshold.
Depending on the messaging provider you are using, it might also be possible to configure backout queue processing as properties of the queue.
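As an illustration of that strategy, a hedged sketch (the threshold of 3, the "payment.backout" queue name, and the JmsTemplate wiring are my assumptions, not from the question):

import javax.jms.JMSException;
import javax.jms.Message;

import org.springframework.jms.annotation.JmsListener;
import org.springframework.jms.core.JmsTemplate;

public class PaymentNotification {

    private final JmsTemplate jmsTemplate;

    public PaymentNotification(JmsTemplate jmsTemplate) {
        this.jmsTemplate = jmsTemplate;
    }

    @JmsListener(destination = "payment")
    public void receive(Message message) throws JMSException {
        // JMS 2.0 providers expose the delivery attempt count as JMSXDeliveryCount.
        int deliveryCount = message.propertyExists("JMSXDeliveryCount")
                ? message.getIntProperty("JMSXDeliveryCount")
                : 1;
        if (deliveryCount > 3) {
            // Threshold exceeded: park the message on a backout queue instead of retrying forever.
            jmsTemplate.convertAndSend("payment.backout", message.getBody(String.class));
            return;
        }
        // normal processing here ...
    }
}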

Spring Integration Aggregation Error

I have a Spring Integration implementation where I have two client subscribers listening to the same JMS topic. I am using a JDBC message store (different REGIONs) in both implementations to save the incoming messages. While processing the data I get this exception:
org.springframework.dao.EmptyResultDataAccessException: Incorrect result size: expected 1, actual 0
I know this is a JIRA issue: https://jira.spring.io/browse/INT-2912
Right now I can't upgrade the Spring version, and I am unable to understand the workaround: "The work-around is either to always use a different groupKey or to use separate tables for each Message Store. We will need to add a REGION column to the INT_GROUP_TO_MESSAGE as well."
How can I create a different groupKey?
My implementation is as below:
<bean id="jdbcMessageStore"
      class="org.springframework.integration.jdbc.JdbcMessageStore"
      p:dataSource-ref="datasource"
      p:region="REPORTS"/>

<si:aggregator
    send-partial-result-on-expiry="false"
    message-store="jdbcMessageStore"
    discard-channel="discardedLogger"/>
The "groupKey" mentioned there is the correlation strategy result; by default it just uses the correlationId header.
You can use correlation-strategy-expression="'foo' + headers['correlationId']" in one application and correlation-strategy-expression="'bar' + headers['correlationId']" in the other, so that each app uses a different group key.
