Apache Avro generated POJO - spring-boot

I am using Apache Avro with Kafka to serialize/deserialize messages.
I would like to know if there is a way to use Avro without the generated POJO (which is not very readable)?
Thank you.

Sure, you're talking about Avro's GenericRecord:
final KafkaConsumer<String, GenericRecord> consumer = new KafkaConsumer<>(props);
consumer.subscribe(Collections.singletonList(topic));
while (true) {
    final ConsumerRecords<String, GenericRecord> records = consumer.poll(100);
    for (final ConsumerRecord<String, GenericRecord> record : records) {
        final GenericRecord value = record.value();
        System.out.printf("the value of userId: %s%n", value.get("userId"));
    }
}
The downside of this is that you have to know, and check for the existence of, the field key you call GenericRecord#get() with.
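One way to guard against missing fields is to check the record's schema before calling get(); a minimal sketch, reusing the userId field from the snippet above:
final GenericRecord value = record.value();
// Schema#getField returns null if the record's schema has no such field
if (value.getSchema().getField("userId") != null) {
    System.out.printf("the value of userId: %s%n", value.get("userId"));
} else {
    // field not present in this record's schema; handle accordingly
}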

Related

Is there a way to log all incoming kafka requests in spring?

I'm using a simple Kafka handler:
@KafkaListener(
    topics = Topic.NAME,
    clientIdPrefix = KafkaHandler.LISTENER_ID)
public class KafkaHandler {
    public static final String LISTENER_ID = "kafka_listener";

    @KafkaHandler(isDefault = true)
    @Description(value = "Event received")
    public void onEvent(@Payload Payload payload) {
    ...
    }
}
However, my object (Payload in the example) is not mapped properly (some fields are null).
Is there a way to log all incoming Kafka key/value pairs somewhere in a spring-kafka app?
You can process the entire Kafka record instead of only the payload.
@KafkaListener(topics = "any-topic")
void listener(ConsumerRecord<String, String> record) {
    log.info("{}", record.key());
    log.info("{}", record.value());
    log.info("{}", record.partition());
    log.info("{}", record.topic());
    log.info("{}", record.offset());
}
Replace String with your desired key and value types, and set the fully qualified deserializer class names in your application properties.
spring.kafka.consumer.key-deserializer=com.example.YourKeyDeserializer
spring.kafka.consumer.value-deserializer=com.example.YourValueDeserializer
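If you need a custom value type, a deserializer is just a class implementing Kafka's org.apache.kafka.common.serialization.Deserializer (on recent clients only deserialize() needs overriding). A minimal sketch, assuming a Jackson-based implementation and reusing the Payload type from the question:
public class YourValueDeserializer implements Deserializer<Payload> {

    private final ObjectMapper mapper = new ObjectMapper();

    @Override
    public Payload deserialize(String topic, byte[] data) {
        try {
            // null-safe mapping of the raw bytes to the Payload DTO
            return data == null ? null : mapper.readValue(data, Payload.class);
        } catch (IOException e) {
            throw new SerializationException("Failed to deserialize payload", e);
        }
    }
}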

How to maintain case during serializing using object mapper?

I have a listener that listens to a queue. The message from the queue is JSON text. I need to process it and then save it in a MongoDB database. I have used a DTO for the incoming JSON. The problem is that I can only save the data in lower case, since I have used a DTO, but the incoming data is in upper case. How can I do this gracefully using Jackson/Spring?
I tried @JsonGetter and @JsonSetter in the DTO, but that didn't work. It is still saving the data in lower case.
Mini version of my code:
DTO:
public String getMessage() {
    return message;
}

@JsonSetter("MESSAGE")
public void setMessage(String message) {
    this.message = message;
}
Datasaver:
mongoOperations.save(dto, collectionName);
Document in database:
_id: ObjectId("5da831183852090ddc7075fb")
message: "hi"
I want the data in mongodb as:
_id: ObjectId("5da831183852090ddc7075fb")
MESSAGE: "hi"
The incoming data has MESSAGE as its key, so I would like to store it the same way. I do not want the DTO field names themselves to be in upper case.
As per @MichaelZiober's comment above, none of the Jackson annotations helped in my case. Spring's @Field annotation worked.
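A minimal sketch of that approach, assuming Spring Data MongoDB's org.springframework.data.mongodb.core.mapping.Field annotation and a hypothetical DTO name:
public class MessageDto {

    // stored in MongoDB under the key "MESSAGE" while the Java field stays lower case
    @Field("MESSAGE")
    private String message;

    public String getMessage() { return message; }

    public void setMessage(String message) { this.message = message; }
}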
It should work with @JsonProperty("MESSAGE").
If not (for some reason), you could use a custom serializer for this field:
class CustomStringSerializer extends JsonSerializer<String> {
    @Override
    public void serialize(String value, JsonGenerator jgen, SerializerProvider provider) throws IOException {
        jgen.writeStartObject();
        jgen.writeObjectField("MESSAGE", value);
        jgen.writeEndObject();
    }
}
and initialize the mapper this way:
ObjectMapper objectMapper = new ObjectMapper();
SimpleModule mod = new SimpleModule("message");
mod.addSerializer(String.class, new CustomStringSerializer());
objectMapper.registerModule(mod);
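With that module registered, any bare String written by this mapper is wrapped under the MESSAGE key (note that it applies to every String the mapper serializes, not only the message field):
String json = objectMapper.writeValueAsString("hi");
// json is {"MESSAGE":"hi"}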

Error handling - Consumer - apache kafka and spring

I am learning to use Kafka. I have two services, a producer and a consumer.
The producer produces messages that require processing (queries to services and a database). These messages are received by the consumer, which is responsible for processing them and saving the result in a database.
Producer
@Autowired
private KafkaTemplate<String, String> kafkaTemplate;
...
kafkaTemplate.send(topic, message);
Consumer
@KafkaListener(topics = "....")
public void listen(@Payload String message) {
    ....
}
I would like all messages to be processed correctly by the consumer.
I do not know how to handle errors on the consumer side in this context. For example, a database might be temporarily unavailable and unable to handle certain messages.
What to do in these cases?
I know that the responsibility belongs to the consumer.
I could retry, but retrying several times in a row while a database is down does not seem like a good idea. And if I keep consuming messages, the offset advances and I lose the events that failed to process.
You have control over the Kafka consumer in the form of committing the offsets of the records you have read. Kafka will continue to return the same records until the offset is committed. You can set offset commits to manual and, based on the success of your business logic, decide whether or not to commit. See the sample below:
Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");
props.put("group.id", "test");
props.put("enable.auto.commit", "false");
props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
consumer.subscribe(Arrays.asList("foo", "bar"));
final int minBatchSize = 200;
List<ConsumerRecord<String, String>> buffer = new ArrayList<>();
while (true) {
    ConsumerRecords<String, String> records = consumer.poll(100);
    for (ConsumerRecord<String, String> record : records) {
        buffer.add(record);
    }
    if (buffer.size() >= minBatchSize) {
        insertIntoDb(buffer);
        consumer.commitSync();
        buffer.clear();
    }
}
consumer.commitSync() commits the offset.
Also see the Kafka consumer documentation to understand consumer offsets.
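In a spring-kafka listener the same idea can be expressed with manual acknowledgment; a minimal sketch, assuming spring.kafka.listener.ack-mode=manual is set and process() stands in for your business logic:
@KafkaListener(topics = "....")
public void listen(String message, Acknowledgment ack) {
    // placeholder for the processing step (queries, database writes, ...)
    process(message);
    // commit the offset only after the message was processed successfully;
    // if process() throws, the offset stays uncommitted
    ack.acknowledge();
}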
This link was very helpful: https://dzone.com/articles/spring-for-apache-kafka-deep-dive-part-1-error-han
Spring provides the DeadLetterPublishingRecoverer class, which handles errors correctly by publishing the failed records to a dead-letter topic.
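A minimal configuration sketch, assuming a recent spring-kafka version where DefaultErrorHandler is available (older versions use SeekToCurrentErrorHandler for the same purpose):
@Bean
public DefaultErrorHandler errorHandler(KafkaTemplate<Object, Object> template) {
    // records that still fail after the retries are published to <original-topic>.DLT
    DeadLetterPublishingRecoverer recoverer = new DeadLetterPublishingRecoverer(template);
    // retry twice with a 1-second back-off before recovering to the dead-letter topic
    return new DefaultErrorHandler(recoverer, new FixedBackOff(1000L, 2L));
}
Register it on the listener container factory (for example via factory.setCommonErrorHandler(errorHandler)) so that listener exceptions are routed through it.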

Kafka streams headers support

In our application the producer sends different data types, and it may happen that a partition contains objects of different data types, because we didn't want to partition based on data type.
In Kafka Streams I was trying to use headers.
The producer adds a header to the Bytes object and pushes the data to Kafka.
The header is, say, a particular data type (a custom object).
Now, based on the header, I want to deserialize the Bytes object received in Kafka Streams, but I am bound to the processor interface, where I have to pass the actual deserializer.
Is there any way to avoid specifying the deserializer beforehand, and instead, based on the header in the ProcessorContext of a received record, deserialize the object?
public class StreamHeaderProcessor extends AbstractProcessor<String, Bytes> {
    @Override
    public void process(String key, Bytes value) {
        Iterator<Header> it = context().headers().iterator();
        while (it.hasNext()) {
            Header head = it.next();
            if (head.key().equals("dataType")) {
                String headerValue = new String(head.value());
                if (headerValue.equals("X")) {
                } else if (headerValue.equals("Y")) {
                }
            }
        }
    }
}
If you don't set Serdes in StreamsConfig and don't set Serdes on builder.stream(..., Consumed.with(/*Serdes*/)), Kafka Streams will use ByteArraySerde by default, and thus key and value are handed to you as byte[] arrays. (The same applies to the Processor API if you don't set a Serde on topology.addSource(...).)
Thus, you can apply a Processor or Transformer to the data stream, inspect the header, and call the corresponding deserializer in your own code. You need to know all possible data types in advance.
public class MyProcessor implements Processor<byte[], byte[]> {
    // add corresponding deserializers for all expected types (eg, String)
    private StringDeserializer stringDeserializer = new StringDeserializer();
    // other methods (init(), close()) omitted; `context` is the ProcessorContext kept from init()

    public void process(byte[] key, byte[] value) {
        // inspect header (here: the "dataType" header from the question)
        Header dataTypeHeader = context.headers().lastHeader("dataType");
        String header = dataTypeHeader != null ? new String(dataTypeHeader.value()) : "";
        if (header.equals("StringType")) {
            String stringValue = stringDeserializer.deserialize(context.topic(), value);
            // similar for `key`
            // apply processing logic for String type
        }
    }
}
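A sketch of how such a processor could be wired into a topology without Serdes, so the raw byte[] key and value reach it (the source/processor names and topic are placeholders, and the older ProcessorSupplier API from the snippet above is assumed):
Topology topology = new Topology();
// no deserializers set here, so the default ByteArraySerde applies
topology.addSource("HeaderSource", "input-topic");
ProcessorSupplier<byte[], byte[]> headerRouter = MyProcessor::new;
topology.addProcessor("HeaderRouter", headerRouter, "HeaderSource");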

Spring Cloud Stream Kafka Consumer Test

I am trying to set up a test as suggested in a GitHub link:
Map<String, Object> senderProps = KafkaTestUtils.producerProps(embeddedKafka);
DefaultKafkaProducerFactory<Integer, String> pf = new DefaultKafkaProducerFactory<>(senderProps);
try {
    KafkaTemplate<Integer, String> template = new KafkaTemplate<>(pf, true);
    template.setDefaultTopic("words");
    template.sendDefault("foobar");
--> ConsumerRecord<String, String> cr = KafkaTestUtils.getSingleRecord(consumer, "output");
    log.debug(cr);
}
finally {
    pf.destroy();
}
Where StreamProcessor is set to
@StreamListener
@SendTo("output")
public KStream<?, WordCount> process(@Input("input") KStream<Object, String> input) {
    return input.map((key, value) -> new KeyValue<>(value, new WordCount(value, 10, new Date(), new Date())));
}
The --> line never consumes any messages, which to my mind should be on topic "output", given that the stream processor has @SendTo("output").
I want to be able to test the stream-processed messages.
You need to consume from the actual topic that your output is bound to.
Do you have a configuration for spring.cloud.stream.bindings.output.destination? That should be the value that you need to use. If you don't set that, the default will be the same as the binding - output in this case.
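For example (the destination name below is hypothetical): if the binding is configured as spring.cloud.stream.bindings.output.destination=word-counts, the test has to read from that topic rather than from the binding name:
// consume from the bound destination topic, not from the binding name "output"
ConsumerRecord<String, String> cr = KafkaTestUtils.getSingleRecord(consumer, "word-counts");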
