Kafka Streams: using the DSL API, within a transform, how can I send two messages to different topics/separate downstream DSL processors? - apache-kafka-streams

I'm using the DSL API, and I have a use case where I need to check a condition and, if it is true, send an additional message to a topic separate from the happy path. My question is: how can I attach child processors to parents in the DSL API? Is it as simple as caching a stream variable, using it in two subsequent places, and naming those stream processors? Here's some brief code that shows what I'm trying to do. I am using the DSL API because I need foreignKeyJoin.
var myStream = stream.process(myProcessorSupplier); // 3.3 returns a stream
myStream.to("happyThingTopic"); // Q: will the forward ever land here?
myStream.map(myKvMapper, Named.as("what-is-this")).to("myOtherTopic"); // will the forward land here?
public KeyValue<String, Object> process(Object key, Object value) {
    if (value.hasFlag) {
        processorContext.forward(key, new OtherThing(), "what-is-this?");
    }
    return new KeyValue(key, new HappyThing(value));
}

Related

Multiple Send filters with KafkaFactoryConfigurator

Is it possible to configure the Kafka Rider to use more than one SendFilter? As far as I can see, KafkaFactoryConfigurator can register only one delegate to configure a SendFilter.
What I want to do: first, I communicate with a Spring Boot application over Kafka topics, and I need to send the same header names, e.g. MessageId. I also want to send some business context information in additional headers. I don't want to mix these two concerns in one filter, but I don't know how to set up two filters for the Kafka Rider.
I also tried to set up filters on the in-memory bus, but it looks like these filters are not applied to Rider sends.
Is there a way to do it?
I'm using MassTransit v8.
Thank you.
EDIT:
My setup looks like this sample
builder.Services.AddMassTransit(x =>
{
    x.UsingInMemory((context, config) =>
    {
        //config.UseSendFilter(typeof(KBHeaderFilter<>), context);
    });
    x.AddRider(rider =>
    {
        rider.AddProducer<MessageV1>("to-poc-masstransit", (riderContext, producerConfig) =>
        {
            var schemaRegistryClient = riderContext.GetRequiredService<ISchemaRegistryClient>();
            var serializerConfig = new AvroSerializerConfig{...};
            producerConfig.SetValueSerializer(new AvroSerializer<MessageV1>(schemaRegistryClient, serializerConfig).AsSyncOverAsync());
        });
        rider.UsingKafka((context, k) =>
        {
            //k.UseSendFilter(typeof(TestScopedFilter<>), context);
            k.SetHeadersSerializer(new TestSerializer());
            k.Host("localhost:port");
        });
    });
});
I want to find a single place where I can override the sent headers. But it looks like the appropriate place is where serialization/deserialization happens. This place is not the same for standard brokers and riders, so I will have to implement one header serializer for the Kafka Rider and another for the Artemis broker.

Spring reactive: Chaining repository results

Repository repo
Repository otherRepo
foreach entity : repo.FindAll() {
    entityFind = otherRepo.FindById(entity.Prop)
    if (entityFind != null) {
        return entityFind
    }
}
How could I do this using Spring's reactive API?
I could use blockFirst() to search in otherRepo, but it would break the reactive chain.
I have also tried using handle() to control the flow, but I can't stop the flow once I find an item.
Any ideas?
Thanks
If you have repos like this and, for each record of repo1, you need to find a record in repo2, you could probably join the tables using Spring Data JPQL and use a custom method instead, as your current approach could have a performance impact.
As you seem to be interested only in the first record, here is something to give you an idea of how this can be achieved:
return Flux.fromIterable(repo.findAll()) //assuming it returns a list
.map(entity -> otherRepo.findById(entity.property)) // for each entity we query the other repo
.filter(Objects::nonNull) // replace it with Optional::isPresent if it is optional
.next(); //converts the flux to mono with the first record
The answer from vins assumes a non-reactive repository, so here it is in a fully reactive style:
return repo.findAll() //assuming reactive repository, which returns Flux<Entity>
.flatMap(entity -> otherRepo.findById(entity.property)) //findById returns an empty Mono if id not found, which basically gets ignored by flatMap
.next(); //first record is turned into a Mono, and the Flux is cancelled
Note that as you've stated, this can lead to unnecessary requests being made to Cassandra (and then cancelled by the next()). This is due to flatMap allowing several concurrent requests (256 by default). You can either reduce the parallelism of flatMap (by providing a second parameter, an int) or use concatMap to perform findById queries serially.

Different serde for Kafka Streams KTable state store

As part of our application logic, we use a Kafka Streams state store for range lookups; the data is loaded from a Kafka topic using the builder.table() method.
The problem is that the source topic's key is serialised as JSON and doesn't suit the binary key comparisons used internally by the RocksDB-based state store.
We were hoping to use a separate serde for keys by passing it to Materialized.as(). However, it looks like the streams implementation resets whatever is passed back to the original serdes used to load from the table topic.
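To illustrate the ordering problem described above, here is a minimal standalone sketch using only the JDK. The key shape {"id":N} and the class and method names are assumptions made for illustration; the question does not show the real key schema. It compares keys as unsigned bytes, lexicographically, which is how a byte-oriented store orders them.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class KeyByteOrderSketch {

    // Hypothetical JSON key shape, e.g. {"id":10}; the real schema is not given.
    static byte[] jsonKey(long id) {
        return ("{\"id\":" + id + "}").getBytes(StandardCharsets.UTF_8);
    }

    // Fixed-width big-endian encoding: byte order matches numeric order
    // for non-negative ids.
    static byte[] binaryKey(long id) {
        return ByteBuffer.allocate(Long.BYTES).putLong(id).array();
    }

    // Unsigned lexicographic byte comparison (requires Java 9+).
    static int compare(byte[] a, byte[] b) {
        return Arrays.compareUnsigned(a, b);
    }

    public static void main(String[] args) {
        // JSON: '9' > '1', so the key for id 9 sorts after the key for id 10.
        System.out.println(compare(jsonKey(9), jsonKey(10)) > 0);      // prints true
        // Binary: the key for id 9 sorts before the key for id 10.
        System.out.println(compare(binaryKey(9), binaryKey(10)) < 0);  // prints true
    }
}
```

With the JSON encoding, a range scan over ids would not return keys in numeric order, whereas the fixed-width big-endian encoding keeps byte order and numeric order aligned for non-negative ids.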
This is what I can see in streams builder internals:
public synchronized <K, V> KTable<K, V> table(final String topic,
                                              final Consumed<K, V> consumed,
                                              final Materialized<K, V, KeyValueStore<Bytes, byte[]>> materialized) {
    Objects.requireNonNull(topic, "topic can't be null");
    Objects.requireNonNull(consumed, "consumed can't be null");
    Objects.requireNonNull(materialized, "materialized can't be null");
    // The serdes from Consumed overwrite whatever was set on Materialized.
    materialized.withKeySerde(consumed.keySerde).withValueSerde(consumed.valueSerde);
    return internalStreamsBuilder.table(topic,
                                        new ConsumedInternal<>(consumed),
                                        new MaterializedInternal<>(materialized, internalStreamsBuilder, topic + "-"));
}
Does anybody know why it's done this way, and whether it's possible to use a different serde for a DSL state store?
Please don't propose using the Processor API; this route is well explored. I would like to avoid writing a processor and a custom state store every time I need to massage data before saving it into a state store.
After some digging through the streams sources, I found out that I can pass a custom Materialized.as to a filter with an always-true predicate. But it smells a bit hacky.
This is my code, which unfortunately doesn't work as we hoped because of the "serdes reset" described above.
Serde<Value> valueSerde = new JSONValueSerde();
KTable<Key, Value> table = builder.table(
    tableTopic,
    Consumed.with(new JSONKeySerde(), valueSerde),
    Materialized.<Key, Value, KeyValueStore<Bytes, byte[]>>as(cacheStoreName)
        .withKeySerde(new BinaryComparisonsCompatibleKeySerde())
        .withValueSerde(valueSerde)
);
The code works by design. From a streams point of view, there is no reason to use a different Serde for the store than for reading the data from the topic, because it's known to be the same data. Thus, if one does not use the default Serdes from the StreamsConfig, it's sufficient to specify the Serde once (in Consumed), and it's not required to specify it in Materialized again.
For your special case, you could read the topic as a stream and do a "dummy aggregation" that just returns the latest value per record (instead of computing an actual aggregate). This allows you to specify a different Serde for the result type.

How to get Flux<T> from Mono<K> in Spring Reactive API?

I have two independent collections in a NoSQL document DB, Photo and Property, where Photo has a propertyId parameter, meaning that I can find all photos that belong to a given property, such as a house. Normally, without reactive, I would simply do:
Property property = ....
List<Photo> photos = photoService.findByPropertyId(property.getId());
Just two lines. How do I do the above in reactive programming when I have a Mono<Property> and want to find a Flux<Photo> without using block()? Assume photoService.findByPropertyId returns a List in the blocking case and a Flux in the reactive case.
You should use flatMapMany, which triggers an async processing from the Mono's value which can emit multiple elements:
Flux<Photo> photoFlux = propertyMono
.flatMapMany(prop -> photoService.findByPropertyId(prop.getId()));

How to get ordering for different MessagePostProcessors in SimpleMessageListenerContainer

I have multiple MessagePostProcessors in Spring AMQP, which I set using the SimpleMessageListenerContainer.setAfterReceivePostProcessors API. My question is: are these MessagePostProcessors called in the order I have listed them?
Pseudocode:
SimpleMessageListenerContainer container = // API returning a SimpleMessageListenerContainer object
container.setAfterReceivePostProcessors(new MessagePostProcessor[] {
    messagePostProcessors1, messagePostProcessors2});
So does Spring AMQP call messagePostProcessors1 followed by messagePostProcessors2 in sequence, or does it pick between them randomly?
If the order is random, is there any way to enforce it, i.e. so that messagePostProcessors2 always gets called after messagePostProcessors1?
Akshat, the order is based on the order that is set on the processor; I quote the documentation below. When I look at the concrete implementations of the processors, I find there is a setOrder method (from the Ordered interface, I think). Setting that on your message post processors may do the trick.
public void setAfterReceivePostProcessors(MessagePostProcessor... afterReceivePostProcessors)
Set a MessagePostProcessor that will be invoked immediately after a Channel#basicGet() and before any message conversion is performed. May be used for operations such as decompression. Processors are invoked in order, depending on PriorityOrder, Order and finally unordered.
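The ordering rule in the quoted Javadoc can be sketched with plain JDK types. This is only an illustration: Ordered, PriorityOrdered, and the Plain/WithOrder/WithPriority classes below are hypothetical stand-ins, not Spring's types, and the sketch assumes that a lower order value runs first and that unordered processors keep their registration order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class PostProcessorOrderSketch {

    // Hypothetical stand-ins for the Ordered / PriorityOrdered contracts.
    interface Ordered { int getOrder(); }
    interface PriorityOrdered extends Ordered { }

    // A processor with no ordering information.
    static class Plain {
        final String name;
        Plain(String name) { this.name = name; }
        @Override public String toString() { return name; }
    }

    // A processor that declares an order value.
    static class WithOrder extends Plain implements Ordered {
        final int order;
        WithOrder(String name, int order) { super(name); this.order = order; }
        @Override public int getOrder() { return order; }
    }

    // A processor that declares an order value and priority status.
    static class WithPriority extends WithOrder implements PriorityOrdered {
        WithPriority(String name, int order) { super(name, order); }
    }

    // Priority-ordered first, then ordered, then unordered in registration order.
    static List<Plain> sort(List<Plain> processors) {
        List<Plain> priority = new ArrayList<>();
        List<Plain> ordered = new ArrayList<>();
        List<Plain> unordered = new ArrayList<>();
        for (Plain p : processors) {
            if (p instanceof PriorityOrdered) {
                priority.add(p);
            } else if (p instanceof Ordered) {
                ordered.add(p);
            } else {
                unordered.add(p);
            }
        }
        // Assumption: a lower order value means earlier invocation.
        Comparator<Plain> byOrder = Comparator.comparingInt(p -> ((Ordered) p).getOrder());
        priority.sort(byOrder);
        ordered.sort(byOrder);
        List<Plain> result = new ArrayList<>(priority);
        result.addAll(ordered);
        result.addAll(unordered);
        return result;
    }

    // Registers five processors in mixed order and returns the sorted names.
    static String demo() {
        List<Plain> registered = Arrays.asList(
                new Plain("u1"), new WithOrder("o2", 2), new WithPriority("p", 5),
                new WithOrder("o1", 1), new Plain("u2"));
        return sort(registered).toString();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [p, o1, o2, u1, u2]
    }
}
```

Under these assumptions, two processors that implement neither interface (u1 and u2 here) stay in the order they were registered, which would correspond to the case in the question if neither post processor declares an order.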
