Spring Kafka + JSON + Confluent Schema Registry - spring-boot

The Confluent Schema Registry currently supports JSON Schema.
Does Spring Kafka provide support for JSON Schema?
Avro with Spring Kafka works well using this config:
spring:
  kafka:
    producer:
      key-serializer: org.apache.kafka.common.serialization.StringSerializer
      value-serializer: io.confluent.kafka.serializers.KafkaAvroSerializer
    consumer:
      group-id: template-cg
      auto-offset-reset: earliest
      key-deserializer: org.apache.kafka.common.serialization.StringDeserializer
      value-deserializer: io.confluent.kafka.serializers.KafkaAvroDeserializer
    properties:
      schema.registry.url: "http://localhost:8081"
But how do I configure Spring Kafka to use JSON Schema with the Confluent Schema Registry?

OK, as Gary pointed out, this is not a Spring problem. Kafka just needs to be configured like this:
spring:
  kafka:
    producer:
      key-serializer: org.apache.kafka.common.serialization.StringSerializer
      value-serializer: io.confluent.kafka.serializers.json.KafkaJsonSchemaSerializer
    consumer:
      group-id: template-cg
      auto-offset-reset: earliest
      key-deserializer: org.apache.kafka.common.serialization.StringDeserializer
      value-deserializer: io.confluent.kafka.serializers.json.KafkaJsonSchemaDeserializer
    properties:
      schema.registry.url: http://localhost:8081
Dependencies (Gradle):
repositories {
    mavenCentral()
    maven {
        url "https://packages.confluent.io/maven/"
    }
    maven {
        url "https://jitpack.io"
    }
}
dependencies {
    implementation 'org.springframework.boot:spring-boot-starter'
    implementation 'org.springframework.kafka:spring-kafka'
    implementation 'io.confluent:kafka-json-schema-serializer:6.1.0'
    implementation 'com.github.everit-org.json-schema:org.everit.json.schema:1.12.2'
}
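With this in place, producing works like with any other serializer: KafkaJsonSchemaSerializer derives a JSON schema from the POJO and registers it on first send. A rough sketch of what that can look like (the User class and the "users" topic are made-up examples, not part of the original setup):

import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.stereotype.Service;

// Hypothetical payload class; KafkaJsonSchemaSerializer derives a JSON schema from it.
class User {
    private String name;
    private int age;

    public User() { }
    public User(String name, int age) { this.name = name; this.age = age; }

    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    public int getAge() { return age; }
    public void setAge(int age) { this.age = age; }
}

@Service
class UserProducer {

    private final KafkaTemplate<String, User> kafkaTemplate;

    UserProducer(KafkaTemplate<String, User> kafkaTemplate) {
        this.kafkaTemplate = kafkaTemplate;
    }

    void send(User user) {
        // "users" is an example topic name.
        kafkaTemplate.send("users", user.getName(), user);
    }
}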

Related

Found duplicate key in application.yml

I have been working on a Kafka Streams project. I am new to YAML files; I have been using application.properties. Basically I want to use Kafka properties as well as Springfox properties. Here is my
application.yml
spring:
  cloud:
    function:
      definition: consumer;producer
    stream:
      kafka:
        bindings:
          producer-out-0:
            producer:
              configuration:
                value.serializer: <packagename>
          consumer-in-0:
            consumer:
              configuration:
                value.deserializer: <packagename>
        binder:
          brokers: localhost:9092
      bindings:
        producer-out-0:
          destination: <topic name>
          producer:
            useNativeEncoding: true # Enables using the custom serializer
        consumer-in-0:
          destination: <topic name>
          consumer:
            use-native-decoding: true # Enables using the custom deserializer
spring:
  mvc:
    pathmatch:
      matching-strategy: ANT_PATH_MATCHER
It shows:
in 'reader', line 1, column 1:
    spring:
    ^
found duplicate key spring
in 'reader', line 31, column 1:
    spring:
    ^
I tried adding mvc: in between and other things; nothing seems to work. I don't know how to get the spring key as parent of mvc: pathmatch: without ruining the configuration above.
spring is a duplicate! Your YAML should look like this:
spring:
  cloud:
    function:
      definition: consumer;producer
    stream:
      kafka:
        bindings:
          producer-out-0:
            producer:
              configuration:
                value.serializer: <packagename>
          consumer-in-0:
            consumer:
              configuration:
                value.deserializer: <packagename>
        binder:
          brokers: localhost:9092
      bindings:
        producer-out-0:
          destination: <topic name>
          producer:
            useNativeEncoding: true # Enables using the custom serializer
        consumer-in-0:
          destination: <topic name>
          consumer:
            use-native-decoding: true # Enables using the custom deserializer
  mvc:
    pathmatch:
      matching-strategy: ANT_PATH_MATCHER

Spring Cloud Kafka Stream: Publishing to DLQ is failing with Avro

I'm unable to publish to the DLQ topic while using ErrorHandlingDeserializer in combination with Avro to handle errors. Below is the error while publishing.
Topic TOPIC_DLT not present in metadata after 60000 ms.
ERROR KafkaConsumerDestination{consumerDestinationName='TOPIC', partitions=6, dlqName='TOPIC_DLT'}.container-0-C-1 o.s.i.h.LoggingHandler:250 - org.springframework.messaging.MessageHandlingException: error occurred in message handler [org.springframework.cloud.stream.function.FunctionConfiguration$FunctionToDestinationBinder$1#49abe531]; nested exception is java.lang.RuntimeException: failed, failedMessage=GenericMessage
And here is the application.yml
spring:
  cloud:
    stream:
      bindings:
        process-in-0:
          destination: TOPIC
          group: groupID
      kafka:
        binder:
          brokers:
            - xxx:9092
          configuration:
            security.protocol: SASL_SSL
            sasl.mechanism: PLAIN
          jaas:
            loginModule: org.apache.kafka.common.security.plain.PlainLoginModule
            options:
              username: username
              password: pwd
          consumer-properties:
            key.deserializer: org.apache.kafka.common.serialization.StringDeserializer
            value.deserializer: org.springframework.kafka.support.serializer.ErrorHandlingDeserializer
            spring.deserializer.value.delegate.class: io.confluent.kafka.serializers.KafkaAvroDeserializer
          producer-properties:
            key.serializer: org.apache.kafka.common.serialization.StringSerializer
            value.serializer: io.confluent.kafka.serializers.KafkaAvroSerializer
        bindings:
          process-in-0:
            consumer:
              configuration:
                basic.auth.credentials.source: USER_INFO
                schema.registry.url: registryUrl
                schema.registry.basic.auth.user.info: user:pwd
                security.protocol: SASL_SSL
                sasl.mechanism: PLAIN
              max-attempts: 1
              dlqProducerProperties:
                configuration:
                  basic.auth.credentials.source: USER_INFO
                  schema.registry.url: registryUrl
                  schema.registry.basic.auth.user.info: user:pwd
                  key.serializer: org.apache.kafka.common.serialization.StringSerializer
                  value.serializer: io.confluent.kafka.serializers.KafkaAvroSerializer
              deserializationExceptionHandler: sendToDlq
              ackEachRecord: true
              enableDlq: true
              dlqName: TOPIC_DLT
              autoCommitOnError: true
              autoCommitOffset: true
I'm using the following dependencies:
spring-cloud-dependencies - 2021.0.1
spring-boot-starter-parent - 2.6.3
spring-cloud-stream-binder-kafka
kafka-schema-registry-client - 5.3.0
kafka-avro-serializer - 5.3.0
I'm not sure what exactly I'm missing.
After going through a lot of documentation, I found out that for Spring to post to the DLQ, the original topic and the DLT topic need the same number of partitions. If that can't be done, we need to set dlqPartitions to 1 or provide a DlqPartitionFunction bean manually. By setting dlqPartitions: 1, all messages go to partition 0.
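If you go the DlqPartitionFunction route, a minimal sketch could look like the following (this assumes the DlqPartitionFunction contract from the Kafka binder; routing everything to partition 0 is just an example policy):

import org.springframework.cloud.stream.binder.kafka.DlqPartitionFunction;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class DlqConfig {

    // Example policy: route every failed record to partition 0 of the DLT,
    // independent of the partition count of the original topic.
    @Bean
    public DlqPartitionFunction dlqPartitionFunction() {
        return (group, record, throwable) -> 0;
    }
}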

Spring-cloud kafka stream schema registry

I am trying to transform an input Avro message from an input topic using functional programming (and Spring Cloud Stream) and publish a new message to an output topic.
Here is my transform function:
@Bean
public Function<KStream<String, Data>, KStream<String, Double>> evenNumberSquareProcessor() {
    return kStream -> kStream.transform(() -> new CustomProcessor(STORE_NAME), STORE_NAME);
}
The CustomProcessor is a class that implements the "Transformer" interface.
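For reference, a class satisfying that contract could look roughly like this sketch (the pass-through logic and the state-store typing are assumptions; the real CustomProcessor is not shown here):

import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.kstream.Transformer;
import org.apache.kafka.streams.processor.ProcessorContext;
import org.apache.kafka.streams.state.KeyValueStore;

public class CustomProcessor implements Transformer<String, Data, KeyValue<String, Double>> {

    private final String storeName;
    private KeyValueStore<String, Double> store;

    public CustomProcessor(String storeName) {
        this.storeName = storeName;
    }

    @Override
    @SuppressWarnings("unchecked")
    public void init(ProcessorContext context) {
        this.store = (KeyValueStore<String, Double>) context.getStateStore(storeName);
    }

    @Override
    public KeyValue<String, Double> transform(String key, Data value) {
        // Placeholder logic: the actual computation is not shown in the question.
        Double result = 0.0;
        store.put(key, result);
        return KeyValue.pair(key, result);
    }

    @Override
    public void close() {
    }
}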
I have tried the transformation with non-Avro input and it works fine.
My difficulty is how to declare the schema registry in the application.yaml file or in the Spring application.
I have tried a lot of different configurations (it seems difficult to find the right documentation), and each time the application doesn't find the settings for schema.registry.url. I get the following error:
Error creating bean with name 'kafkaStreamsFunctionProcessorInvoker':
Invocation of init method failed; nested exception is
java.lang.IllegalStateException:
org.apache.kafka.common.config.ConfigException: Missing required
configuration "schema.registry.url" which has no default value.
Here is my application.yml file:
spring:
  cloud:
    stream:
      function:
        definition: evenNumberSquareProcessor
      bindings:
        evenNumberSquareProcessor-in-0:
          destination: input
          content-type: application/*+avro
          group: group-1
        evenNumberSquareProcessor-out-0:
          destination: output
      kafka:
        binder:
          brokers: my-cluster-kafka-bootstrap.kafka:9092
          consumer-properties:
            value.deserializer: io.confluent.kafka.serializers.KafkaAvroDeserializer
            schema.registry.url: http://localhost:8081
I have tried this configuration too:
spring:
  cloud:
    stream:
      kafka:
        streams:
          binder:
            brokers: my-cluster-kafka-bootstrap.kafka:9092
            configuration:
              schema.registry.url: http://localhost:8081
              default.value.serde: io.confluent.kafka.streams.serdes.avro.SpecificAvroSerde
          bindings:
            evenNumberSquareProcessor-in-0:
              consumer:
                destination: input
                valueSerde: io.confluent.kafka.streams.serdes.avro.SpecificAvroSerde
            evenNumberSquareProcessor-out-0:
              destination: output
My Spring Boot application is declared this way, with the schema registry client activated:
@EnableSchemaRegistryClient
@SpringBootApplication
public class TransformApplication {

    public static void main(String[] args) {
        SpringApplication.run(TransformApplication.class, args);
    }
}
Thanks for any help you could bring to me.
Regards
CG
Configure the schema registry under configuration; then it will be available to all bindings. By the way, the Avro Serde goes under bindings for the specific channel; if you want a default, use the default.value.serde property. Your Serde might be wrong, too.
spring:
  cloud:
    stream:
      kafka:
        streams:
          binder:
            brokers: localhost:9092
            configuration:
              schema.registry.url: http://localhost:8081
              default.value.serde: io.confluent.kafka.streams.serdes.avro.SpecificAvroSerde
          bindings:
            process-in-0:
              consumer:
                valueSerde: io.confluent.kafka.streams.serdes.avro.SpecificAvroSerde
Don't use @EnableSchemaRegistryClient. Enable the schema registry on the Avro Serde instead. In this example, I am using the Data bean from your definition. Try to follow this example:
@Service
public class CustomSerdes extends Serdes {

    // SCHEMA_REGISTRY_URL_CONFIG is statically imported from AbstractKafkaAvroSerDeConfig
    private final static Map<String, String> serdeConfig = Stream.of(
            new AbstractMap.SimpleEntry<>(SCHEMA_REGISTRY_URL_CONFIG, "http://localhost:8081"))
            .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));

    public static Serde<Data> DataAvro() {
        final Serde<Data> dataAvroSerde = new SpecificAvroSerde<>();
        dataAvroSerde.configure(serdeConfig, false);
        return dataAvroSerde;
    }
}
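If the Serde ever needs to be wired explicitly, for example for the transformer's state store or an explicit Consumed, a sketch along these lines should work (the store and topic names are made up, and this uses plain Kafka Streams APIs rather than anything binder-specific):

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.state.Stores;

public class ExplicitSerdeWiring {

    public static void wire(StreamsBuilder builder) {
        // Hypothetical state store backed by the custom Avro Serde.
        builder.addStateStore(Stores.keyValueStoreBuilder(
                Stores.persistentKeyValueStore("my-store"),
                Serdes.String(),
                CustomSerdes.DataAvro()));

        // Hypothetical topic read with explicit Serdes instead of relying on binder inference.
        KStream<String, Data> stream =
                builder.stream("input", Consumed.with(Serdes.String(), CustomSerdes.DataAvro()));
        // ... continue building the topology with 'stream'.
    }
}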

Spring Cloud Stream Kafka Multiple Binding

I am using the Spring Cloud Stream Kafka binder to consume messages from Kafka. I am able to make my sample work with a single Kafka binder, as below:
spring:
  cloud:
    stream:
      kafka:
        binder:
          consumer-properties: {enable.auto.commit: true}
          auto-create-topics: false
          brokers: <broker url>
      bindings:
        consumer:
          destination: some-topic
          group: testconsumergroup
          consumer:
            concurrency: 1
            valueSerde: JsonSerde
        producer:
          destination: some-other-topic
          producer:
            valueSerde: JsonSerde
Note that both bindings point to the same Kafka broker here. However, I have a situation where I need to publish to a topic in one Kafka cluster and also consume from a topic in a different Kafka cluster. How should I change my configuration to bind to different Kafka clusters?
I tried something like this:
spring:
  cloud:
    stream:
      binders:
        defaultbinder:
          type: kafka
          environment:
            spring.cloud.stream.kafka.streams.binder.brokers: <cluster1-brokers>
        kafka1:
          type: kafka
          environment:
            spring.cloud.stream.kafka.streams.binder.brokers: <cluster2-brokers>
      bindings:
        consumer:
          binder: kafka1
          destination: some-topic
          group: testconsumergroup
          consumer:
            concurrency: 1
            valueSerde: JsonSerde
        producer:
          binder: defaultbinder
          destination: some-topic
          producer:
            valueSerde: JsonSerde
      kafka:
        binder:
          consumer-properties: {enable.auto.commit: true}
          auto-create-topics: false
          brokers: <cluster1-brokers>
and
spring:
  cloud:
    stream:
      binders:
        defaultbinder:
          type: kafka
          environment:
            spring.cloud.stream.kafka.streams.binder.brokers: <cluster1-brokers>
        kafka1:
          type: kafka
          environment:
            spring.cloud.stream.kafka.streams.binder.brokers: <cluster2-brokers>
      kafka:
        bindings:
          consumer:
            binder: kafka1
            destination: some-topic
            group: testconsumergroup
            consumer:
              concurrency: 1
              valueSerde: JsonSerde
          producer:
            binder: defaultbinder
            destination: some-topic
            producer:
              valueSerde: JsonSerde
        binder:
          consumer-properties: {enable.auto.commit: true}
          auto-create-topics: false
          brokers: <cluster1-brokers>
But neither of them seems to work.
The first configuration seems to be invalid.
For the second configuration I get the error below:
Caused by: java.lang.IllegalStateException: A default binder has been requested, but there is more than one binder available for 'org.springframework.cloud.stream.messaging.DirectWithAttributesChannel' : kafka1,defaultbinder, and no default binder has been set.
I am using the dependency 'org.springframework.cloud:spring-cloud-starter-stream-kafka:3.0.1.RELEASE' and Spring Boot 2.2.6
Please let me know how to configure multiple bindings for Kafka using Spring Cloud Stream.
Update
I tried the configuration below:
spring:
  cloud:
    stream:
      binders:
        kafka2:
          type: kafka
          environment:
            spring.cloud.stream.kafka.binder.brokers: <cluster2-brokers>
        kafka1:
          type: kafka
          environment:
            spring.cloud.stream.kafka.binder.brokers: <cluster1-brokers>
      bindings:
        consumer:
          destination: <some-topic>
          binder: kafka1
          group: testconsumergroup
          content-type: application/json
          nativeEncoding: true
          consumer:
            concurrency: 1
            valueSerde: JsonSerde
        producer:
          destination: some-topic
          binder: kafka2
          contentType: application/json
          nativeEncoding: true
          producer:
            valueSerde: JsonSerde
The MessageStreams interface and the EventHubStreamsConfiguration binding are as follows:
public interface MessageStreams {
    String PRODUCER = "producer";
    String CONSUMER = "consumer";

    @Output(PRODUCER)
    MessageChannel producerChannel();

    @Input(CONSUMER)
    SubscribableChannel consumerChannel();
}

@EnableBinding(MessageStreams.class)
public class EventHubStreamsConfiguration {
}
My producer class looks like this:
@Component
@Slf4j
public class EventPublisher {
    private final MessageStreams messageStreams;

    public EventPublisher(MessageStreams messageStreams) {
        this.messageStreams = messageStreams;
    }

    public boolean publish(CustomMessage event) {
        MessageChannel messageChannel = getChannel();
        MessageBuilder messageBuilder = MessageBuilder.withPayload(event);
        boolean messageSent = messageChannel.send(messageBuilder.build());
        return messageSent;
    }

    protected MessageChannel getChannel() {
        return messageStreams.producerChannel();
    }
}
And the consumer class looks like this:
@Component
@Slf4j
public class EventHandler {
    private final MessageStreams messageStreams;

    public EventHandler(MessageStreams messageStreams) {
        this.messageStreams = messageStreams;
    }

    @StreamListener(MessageStreams.CONSUMER)
    public void handleEvent(Message<CustomMessage> message) throws Exception {
        // process the event
    }

    @Override
    @ServiceActivator(inputChannel = "some-topic.testconsumergroup.errors")
    protected void handleError(ErrorMessage errorMessage) throws Exception {
        // handle error;
    }
}
I am getting the error below while trying to publish and consume messages in my test.
Dispatcher has no subscribers for channel 'application.producer'.; nested exception is org.springframework.integration.MessageDispatchingException: Dispatcher has no subscribers, failedMessage=GenericMessage [payload=byte[104], headers={contentType=application/json, timestamp=1593517340422}]
Am I missing anything? With a single cluster, I am able to publish and consume messages. The issue only happens with multiple cluster bindings.

Kafka producer JSON serialization

I'm trying to use Spring Cloud Stream to integrate with Kafka. The message being written is a Java POJO, and while it works as expected (the message is written to the topic and I can read it off with a consumer app), there are some unknown characters added to the start of the message, which cause trouble when trying to integrate Kafka Connect to sink the messages from the topic.
With the default setup this is the message being pushed to Kafka:
 contentType "text/plain"originalContentType "application/json;charset=UTF-8"{"payload":{"username":"john"},"metadata":{"eventName":"Login","sessionId":"089acf50-00bd-47c9-8e49-dc800c1daf50","username":"john","hasSent":null,"createDate":1511186145471,"version":null}}
If I configure the Kafka producer within the Java app then the message is written to the topic without the leading characters / headers:
@Configuration
public class KafkaProducerConfig {

    @Bean
    public ProducerFactory<String, Object> producerFactory() {
        Map<String, Object> configProps = new HashMap<>();
        configProps.put(
                ProducerConfig.BOOTSTRAP_SERVERS_CONFIG,
                "localhost:9092");
        configProps.put(
                ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
                StringSerializer.class);
        configProps.put(
                ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
                JsonSerializer.class);
        return new DefaultKafkaProducerFactory<String, Object>(configProps);
    }

    @Bean
    public KafkaTemplate<String, Object> kafkaTemplate() {
        return new KafkaTemplate<>(producerFactory());
    }
}
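For reference, a send through this template could look like the sketch below (the payload type and the "session" topic are assumptions based on the binding configuration later in the question); the resulting record is shown right after:

import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.stereotype.Component;

@Component
public class SessionEventSender {

    private final KafkaTemplate<String, Object> kafkaTemplate;

    public SessionEventSender(KafkaTemplate<String, Object> kafkaTemplate) {
        this.kafkaTemplate = kafkaTemplate;
    }

    public void send(Object sessionEvent) {
        // "session" matches the destination used in the binding configuration.
        kafkaTemplate.send("session", sessionEvent);
    }
}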
Message on Kafka:
{"payload":{"username":"john"},"metadata":{"eventName":"Login","sessionId":"089acf50-00bd-47c9-8e49-dc800c1daf50","username":"john","hasSent":null,"createDate":1511186145471}
Since I'm just setting the key/value serializers, I would've expected to be able to do this within the application.yml properties file rather than through code.
However, when the yml is updated to specify the serializers, it doesn't work as I would expect, i.e. it doesn't generate the same message as the producer configured in Java (above):
spring:
  profiles: local
  cloud:
    stream:
      bindings:
        session:
          destination: session
          contentType: application/json
      kafka:
        binder:
          brokers: localhost
          zkNodes: localhost
          defaultZkPort: 2181
          defaultBrokerPort: 9092
        bindings:
          session:
            producer:
              configuration:
                value:
                  serializer: org.springframework.kafka.support.serializer.JsonSerializer
                key:
                  serializer: org.apache.kafka.common.serialization.StringSerializer
Message on Kafka:
"/wILY29udGVudFR5cGUAAAAMInRleHQvcGxhaW4iE29yaWdpbmFsQ29udGVudFR5cGUAAAAgImFwcGxpY2F0aW9uL2pzb247Y2hhcnNldD1VVEYtOCJ7InBheWxvYWQiOnsidXNlcm5hbWUiOiJqb2huIn0sIm1ldGFkYXRhIjp7ImV2ZW50TmFtZSI6IkxvZ2luIiwic2Vzc2lvbklkIjoiNGI3YTBiZGEtOWQwZS00Nzg5LTg3NTQtMTQyNDUwYjczMThlIiwidXNlcm5hbWUiOiJqb2huIiwiaGFzU2VudCI6bnVsbCwiY3JlYXRlRGF0ZSI6MTUxMTE4NjI2NDk4OSwidmVyc2lvbiI6bnVsbH19"
Should it be possible to configure this solely through the application.yml? Are there additional settings that are missing?
Credit to @Gary for the answer above!
For completeness, the configuration which is now working for me is below.
spring:
  profiles: local
  cloud:
    stream:
      bindings:
        session:
          producer:
            useNativeEncoding: true
          destination: session
          contentType: application/json
      kafka:
        binder:
          brokers: localhost
          zkNodes: localhost
          defaultZkPort: 2181
          defaultBrokerPort: 9092
        bindings:
          session:
            producer:
              configuration:
                value:
                  serializer: org.springframework.kafka.support.serializer.JsonSerializer
                key:
                  serializer: org.apache.kafka.common.serialization.StringSerializer
See headerMode and useNativeEncoding in the producer properties (....session.producer.useNativeEncoding).
headerMode
When set to raw, disables header embedding on output. Effective only for messaging middleware that does not support message headers natively and requires header embedding. Useful when producing data for non-Spring Cloud Stream applications.
Default: embeddedHeaders.
useNativeEncoding
When set to true, the outbound message is serialized directly by client library, which must be configured correspondingly (e.g. setting an appropriate Kafka producer value serializer). When this configuration is being used, the outbound message marshalling is not based on the contentType of the binding. When native encoding is used, it is the responsibility of the consumer to use appropriate decoder (ex: Kafka consumer value de-serializer) to deserialize the inbound message. Also, when native encoding/decoding is used the headerMode property is ignored and headers will not be embedded into the message.
Default: false.
Now the spring.kafka.producer.value-serializer property can be used.
yml:
spring:
  kafka:
    producer:
      value-serializer: org.springframework.kafka.support.serializer.JsonSerializer
properties:
spring.kafka.producer.value-serializer=org.springframework.kafka.support.serializer.JsonSerializer
