Kafka Streams headers support - apache-kafka-streams

In our application the producer sends different data types, and a partition may contain objects of different data types because we didn't want to partition by data type.
In Kafka Streams I am trying to use headers.
The producer adds a header to the bytes object and pushes the data to Kafka.
The header is, say, a particular dataType (a custom object).
Now, based on the header, I want to deserialize the bytes received in Kafka Streams, but I am bound to the Processor interface, where I have to pass the actual deserializer.
Is there any way to avoid specifying the deserializer beforehand, and instead deserialize the objects based on the header in the ProcessorContext of the received record?
public class StreamHeaderProcessor extends AbstractProcessor<String, Bytes> {
    @Override
    public void process(String key, Bytes value) {
        Iterator<Header> it = context().headers().iterator();
        while (it.hasNext()) {
            Header head = it.next();
            if (head.key().equals("dataType")) {
                String headerValue = new String(head.value());
                if (headerValue.equals("X")) {
                } else if (headerValue.equals("Y")) {
                }
            }
        }
    }
}

If you don't set Serdes in StreamsConfig and don't set Serdes via builder.stream(..., Consumed.with(/*Serdes*/)), Kafka Streams uses ByteArraySerde by default, so key and value are handed to you as plain byte[] arrays. (The same holds for the Processor API if you don't set Serdes on topology.addSource(...).)
Thus, you can apply a Processor or Transformer to the data stream, inspect the header, and call the corresponding deserializer in your own code. You need to know all possible data types in advance.
public class MyProcessor implements Processor<byte[], byte[]> {
    // add corresponding deserializers for all expected types (eg, String)
    private final StringDeserializer stringDeserializer = new StringDeserializer();
    // `context` is stored in init(); init() and close() are omitted here
    private ProcessorContext context;

    @Override
    public void process(byte[] key, byte[] value) {
        // inspect the "dataType" header of the current record
        Header header = context.headers().lastHeader("dataType");
        if (header != null && new String(header.value()).equals("StringType")) {
            String stringValue = stringDeserializer.deserialize(context.topic(), value);
            // similar for `key`
            // apply processing logic for String type
        }
    }
}
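If it helps, here is a minimal wiring sketch (not from the original answer; the topic and node names are made up) showing how such a processor can be attached to a Topology so that records reach it as raw bytes:
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.Topology;

Topology topology = new Topology();
// explicit ByteArray deserializers: nothing is deserialized before the processor runs
topology.addSource("bytes-source",
        Serdes.ByteArray().deserializer(), Serdes.ByteArray().deserializer(),
        "input-topic");
// the processor inspects the "dataType" header and picks the deserializer itself
topology.addProcessor("header-dispatcher", MyProcessor::new, "bytes-source");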

Related

Is there a way to log all incoming kafka requests in spring?

I'm using a simple Kafka handler:
@KafkaListener(
        topics = Topic.NAME,
        clientIdPrefix = KafkaHandler.LISTENER_ID)
public class KafkaHandler {
    public static final String LISTENER_ID = "kafka_listener";

    @KafkaHandler(isDefault = true)
    @Description(value = "Event received")
    public void onEvent(@Payload Payload payload) {
    ...
    }
}
However, my object (Payload in the example) is not mapped properly (some fields are null).
Is there a way to log all incoming Kafka key/value pairs somewhere in a spring-kafka app?
You can process the entire Kafka record instead of only the payload.
@KafkaListener(topics = "any-topic")
void listener(ConsumerRecord<String, String> record) {
    log.info("{}", record.key());
    log.info("{}", record.value());
    log.info("{}", record.partition());
    log.info("{}", record.topic());
    log.info("{}", record.offset());
}
Replace String with your desired key and value types, and define the deserializer classes (fully qualified) in your application properties.
spring.kafka.consumer.key-deserializer=com.example.YourKeyDeserializer
spring.kafka.consumer.value-deserializer=com.example.YourValueDeserializer
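For example, if the values are JSON, spring-kafka's JsonDeserializer could be configured like this (the trusted-packages value is only an illustration):
spring.kafka.consumer.key-deserializer=org.apache.kafka.common.serialization.StringDeserializer
spring.kafka.consumer.value-deserializer=org.springframework.kafka.support.serializer.JsonDeserializer
spring.kafka.consumer.properties.spring.json.trusted.packages=com.example.*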

Apache Avro generated POJO

I am using Apache Avro with Kafka to serialize/deserialize messages.
I would like to know if there is a way to use Avro without the generated POJOs (they are not very readable)?
Thank you.
Sure, you're talking about Avro's GenericRecord:
final KafkaConsumer<String, GenericRecord> consumer = new KafkaConsumer<>(props);
consumer.subscribe(Collections.singletonList(topic));

while (true) {
    final ConsumerRecords<String, GenericRecord> records = consumer.poll(100);
    for (final ConsumerRecord<String, GenericRecord> record : records) {
        final GenericRecord value = record.value();
        System.out.printf("the value of userId: %s%n", value.get("userId"));
    }
}
The downside of this is that you have to know and check for the existence of the field key you're calling GenericRecord#get() with.
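For example, one way to guard against a missing field (the userId name is just taken from the snippet above) is to check the record's schema first:
final GenericRecord value = record.value();
// Schema#getField returns null if the writer's schema has no such field
if (value.getSchema().getField("userId") != null) {
    System.out.printf("the value of userId: %s%n", value.get("userId"));
} else {
    // record was written without that field; handle accordingly
}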

Spring Cloud Stream - @StreamListener condition

According to documentation: https://cloud.spring.io/spring-cloud-static/spring-cloud-stream/3.0.3.RELEASE/reference/html/spring-cloud-stream.html#_using_streamlistener_for_content_based_routing
I can route the incoming message to a handler based on a condition like below:
@EnableBinding(MySink.class)
@EnableAutoConfiguration
public static class TestPojoWithAnnotatedArguments {

    @StreamListener(target = MySink.INPUT, condition = "headers['type']=='bogey'")
    public void receiveBogey(@Payload BogeyPojo bogeyPojo) {
        // handle the message
    }

    @StreamListener(target = MySink.INPUT, condition = "headers['type']=='bacall'")
    public void receiveBacall(@Payload BacallPojo bacallPojo) {
        // handle the message
    }

    @StreamListener(target = MySink.ANOTHER_INPUT, condition = "headers['type']=='bacall'")
    public void receiveBacall(@Payload BacallPojo bacallPojo) {
        // handle the message
    }
}
How do I provide a handler that's called when none of the conditions match?
If I have two handlers, the first one with a condition and the second one without any, both handlers are called when the first one's condition matches. How do I avoid this?
We probably need to modify the section you're referring to as it is somewhat outdated.
Also, we cannot (should not) do any kind of routing based on the payload type, since the data comes off the wire in serialized form, such as byte[]. I discuss this in detail in this old post.
But you can definitely use other parts of the incoming Message as a routing condition. The recommended best practice is to rely on Message Headers.
So let's look at the sample:
@Bean
public Function<String, String> uppercase() {
    return v -> v.toUpperCase();
}

@Bean
public Function<String, String> lowercase() {
    return v -> v.toLowerCase();
}

@Bean
public Function<String, String> reverse() {
    return v -> new StringBuilder(v).reverse().toString();
}
. . . and indeed a single routing-expression property. You only need one expression, since however complex or simple your condition is, it can be encoded in standard Spring SpEL:
--spring.cloud.function.routing-expression=headers['type'] == 'upper' ? 'uppercase' : (headers['type'] == 'lower' ? 'lowercase' : 'reverse')
What happens is that the incoming Message's header named type is evaluated: if its value is 'upper' the message goes to the 'uppercase' function, if 'lower' to the 'lowercase' function, and otherwise it defaults to 'reverse'.
Hope that helps.

How to maintain case during serializing using object mapper?

I have a listener that listens to a queue. The message from the queue is JSON text. I need to process it and then save it in a MongoDB database. I have used a DTO for the incoming JSON. The problem is that, because of the DTO, I can only save the data with lower-case field names, but the incoming data uses upper-case keys. How can I handle this gracefully using Jackson/Spring?
I tried @JsonGetter and @JsonSetter in the DTO, but that didn't work; the data is still saved with lower-case keys.
Mini version of my code:
DTO:
private String message;

public String getMessage() {
    return message;
}

@JsonSetter("MESSAGE")
public void setMessage(String message) {
    this.message = message;
}
Datasaver:
mongoOperations.save(DTO,collectionname);
Document in database:
_id: ObjectId("5da831183852090ddc7075fb")
message: "hi"
I want the data in mongodb as:
_id: ObjectId("5da831183852090ddc7075fb")
MESSAGE: "hi"
The incoming data has the key MESSAGE, so I would like to store it the same way. I do not want the DTO field names themselves to be upper case.
As per @MichaelZiober's comment above, none of the Jackson-related annotations helped in my case; Spring's @Field annotation worked.
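For reference, a minimal sketch of that approach (the class name is illustrative):
import org.springframework.data.mongodb.core.mapping.Field;

public class MessageDto {

    // stored in MongoDB under the key "MESSAGE", regardless of the Java field name
    @Field("MESSAGE")
    private String message;

    // getter/setter omitted
}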
It should work with @JsonProperty("MESSAGE").
If not (for some reason), you could use a custom serializer for this field:
class CustomStringSerializer extends JsonSerializer<String> {
    @Override
    public void serialize(String value, JsonGenerator jgen, SerializerProvider provider) throws IOException {
        jgen.writeStartObject();
        jgen.writeObjectField("MESSAGE", value);
        jgen.writeEndObject();
    }
}
and initialize the mapper in this way:
ObjectMapper objectMapper = new ObjectMapper();
SimpleModule mod = new SimpleModule("message");
mod.addSerializer(String.class, new CustomStringSerializer());
objectMapper.registerModule(mod);
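As a quick check (illustrative only), any String serialized through that mapper now comes out wrapped:
// prints {"MESSAGE":"hi"} -- note this serializer applies to every String the mapper writes
System.out.println(objectMapper.writeValueAsString("hi"));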

Gson: How do I deserialize an inner JSON object to a map if the property name is not fixed?

My client retrieves JSON content as below:
{
    "table": "tablename",
    "update": 1495104575669,
    "rows": [
        {"column5": 11, "column6": "yyy"},
        {"column3": 22, "column4": "zzz"}
    ]
}
In the rows array, the keys are not fixed. I want to retrieve the keys and values and save them into a Map using Gson 2.8.x.
How can I configure Gson to deserialize this?
Here is my idea:
public class Dataset {
    private String table;
    private long update;
    private List<Rows> lists;   <-- little confused here.
    or private List<HashMap<String, Object>> lists;
    // Setter/Getter
}

public class Rows {
    private HashMap<String, Object> map;
    ....
}

Dataset k = gson.fromJson(jsonStr, Dataset.class);
log.info(k.getRows().size());   <-- I got two null objects
Thanks.
Gson does not support such a thing out of the box. It would be nice if you could make the property name fixed. If not, there are a few options that might help you.
1. If the property name is in fact fixed (rows), just rename the Dataset.lists field to Dataset.rows.
2. If the set of possible names is known in advance, tell Gson to accept alternative names using @SerializedName.
3. If the set of possible names is really unknown and may change in the future, you might want to make it fully dynamic using a custom TypeAdapter (streaming mode; requires less memory, but is harder to use) or a custom JsonDeserializer (object mode; requires more memory to store intermediate tree views, but is easy to use) registered with GsonBuilder.
For option #2, you can simply add the alternative names:
@SerializedName(value = "lists", alternate = "rows")
final List<Map<String, Object>> lists;
For option #3, bind a downstream List<Map<String, Object>> type adapter and try to detect the name dynamically. Note that I omit a deserialization strategy for the Rows class for simplicity (I believe you might want to remove the Rows class in favor of a plain Map<String, Object> anyway). Another note: declare the field as Map rather than a concrete implementation; hash maps are unordered, whereas telling Gson you only need a Map lets it pick an ordered implementation such as LinkedTreeMap (a Gson internal) or LinkedHashMap, which may matter for datasets.
// Type tokens are immutable and can be declared constants
private static final TypeToken<String> stringTypeToken = new TypeToken<String>() {
};
private static final TypeToken<Long> longTypeToken = new TypeToken<Long>() {
};
private static final TypeToken<List<Map<String, Object>>> stringToObjectMapListTypeToken = new TypeToken<List<Map<String, Object>>>() {
};

private static final Gson gson = new GsonBuilder()
        .registerTypeAdapterFactory(new TypeAdapterFactory() {
            @Override
            public <T> TypeAdapter<T> create(final Gson gson, final TypeToken<T> typeToken) {
                if ( typeToken.getRawType() != Dataset.class ) {
                    return null;
                }
                // If the actual type token represents the Dataset class, then pick the bunch of downstream type adapters
                final TypeAdapter<String> stringTypeAdapter = gson.getDelegateAdapter(this, stringTypeToken);
                final TypeAdapter<Long> primitiveLongTypeAdapter = gson.getDelegateAdapter(this, longTypeToken);
                final TypeAdapter<List<Map<String, Object>>> stringToObjectMapListTypeAdapter = gson.getDelegateAdapter(this, stringToObjectMapListTypeToken);
                // And compose the bunch into a single dataset type adapter
                final TypeAdapter<Dataset> datasetTypeAdapter = new TypeAdapter<Dataset>() {
                    @Override
                    public void write(final JsonWriter out, final Dataset dataset) {
                        // Omitted for brevity
                        throw new UnsupportedOperationException();
                    }

                    @Override
                    public Dataset read(final JsonReader in)
                            throws IOException {
                        in.beginObject();
                        String table = null;
                        long update = 0;
                        List<Map<String, Object>> lists = null;
                        while ( in.hasNext() ) {
                            final String name = in.nextName();
                            switch ( name ) {
                            case "table":
                                table = stringTypeAdapter.read(in);
                                break;
                            case "update":
                                update = primitiveLongTypeAdapter.read(in);
                                break;
                            default:
                                // any other property is assumed to hold the rows, whatever its name is
                                lists = stringToObjectMapListTypeAdapter.read(in);
                                break;
                            }
                        }
                        in.endObject();
                        return new Dataset(table, update, lists);
                    }
                }.nullSafe(); // Making the type adapter null-safe
                @SuppressWarnings("unchecked")
                final TypeAdapter<T> typeAdapter = (TypeAdapter<T>) datasetTypeAdapter;
                return typeAdapter;
            }
        })
        .create();
final Dataset dataset = gson.fromJson(jsonReader, Dataset.class);
System.out.println(dataset.lists);
The code above would then print the following (note that Gson maps JSON numbers to Double when the declared type is Object, hence 11.0):
[{column5=11.0, column6=yyy}, {column3=22.0, column4=zzz}]
