Consume multiple topics with different Avro types in Spring Cloud Stream - spring-boot

I was able to figure out how to consume 2 different Avro types from 2 different topics, but I have 3 topics with 3 different Avro types. How do we do this in Spring Cloud Kafka Streams?
Working code with 2 topics is below.
When I add consumerProcess-in-2, the BiConsumer won't pick up the third Avro type or the third topic, which makes sense since it is a BiConsumer. Any suggestion would be helpful.
Thank you in advance.
cloud:
  stream:
    function:
      definition: consumerProcess
    bindings:
      consumerProcess-in-0:
        content-type: application/*+avro
        destination: Topic-1
      consumerProcess-in-1:
        content-type: application/*+avro
        destination: Topic-2
@Bean
public BiConsumer<KStream<String, Avro1>, KStream<String, Avro2>> consumerProcess() {
    return (avro1, avro2) -> {
        avro1.foreach((key, value) -> log.info("Avro1 ({})", value));
        avro2.foreach((key, value) -> log.info("Avro2 ({})", value));
    };
}
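For three or more input topics, the Kafka Streams binder documents partially curried functions. A minimal sketch that would replace the BiConsumer above, assuming a third Avro type Avro3 and an additional consumerProcess-in-2 binding pointing at Topic-3:

@Bean
public Function<KStream<String, Avro1>,
        Function<KStream<String, Avro2>,
                Consumer<KStream<String, Avro3>>>> consumerProcess() {
    // Bound as consumerProcess-in-0, consumerProcess-in-1 and consumerProcess-in-2.
    return avro1 -> avro2 -> avro3 -> {
        avro1.foreach((key, value) -> log.info("Avro1 ({})", value));
        avro2.foreach((key, value) -> log.info("Avro2 ({})", value));
        avro3.foreach((key, value) -> log.info("Avro3 ({})", value));
    };
}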

Related

Spring Data Redis - watch and multi on same cluster node

My objective is to apply transaction logic (WATCH + MULTI) to a Redis cluster. I can see here, here and in the spring-data-redis repo that transactions on clusters are not supported by Spring Data Redis. Nevertheless, considering that I need to do it on the same node, I have tried to do something like this:
val keySerialized = "myKey".toByteArray()
val valueSerialized = "myValue".toByteArray()
val node = redisTemplate.connectionFactory.clusterConnection.clusterGetNodeForKey(keySerialized)
val clusterExecutor = (redisTemplate.connectionFactory.clusterConnection as LettuceClusterConnection).clusterCommandExecutor

clusterExecutor.executeCommandOnSingleNode(
    ClusterCommandExecutor.ClusterCommandCallback { client: RedisCommands<ByteArray, ByteArray> -> client.watch(keySerialized) },
    node)
clusterExecutor.executeCommandOnSingleNode(
    ClusterCommandExecutor.ClusterCommandCallback { client: RedisCommands<ByteArray, ByteArray> -> client.multi() },
    node)
clusterExecutor.executeCommandOnSingleNode(
    ClusterCommandExecutor.ClusterCommandCallback { client: RedisCommands<ByteArray, ByteArray> -> client.get(keySerialized) },
    node)
clusterExecutor.executeCommandOnSingleNode(
    ClusterCommandExecutor.ClusterCommandCallback { client: RedisCommands<ByteArray, ByteArray> -> client.set(keySerialized, valueSerialized) },
    node)
clusterExecutor.executeCommandOnSingleNode(
    ClusterCommandExecutor.ClusterCommandCallback { client: RedisCommands<ByteArray, ByteArray> -> client.exec() },
    node)
This works fine when I'm running just one instance of the code. For example, if I do a SET in the Redis console while debugging the code, the transaction fails as expected.
However, if I run this code in two threads, both end up using the same connection. When the second thread runs the MULTI command, the following error is raised:
Caused by: io.lettuce.core.RedisCommandExecutionException: ERR MULTI calls can not be nested
I believe that forcing the executor to use a new connection could be a solution, but I don't know how to do it. Any thoughts on this?
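Not a verified fix, but one knob worth trying is LettuceConnectionFactory.setShareNativeConnection(false), which gives each RedisConnection its own physical Lettuce connection instead of the shared one; whether that also decouples the connections used by the ClusterCommandExecutor path above is an assumption to verify. A minimal sketch (other factory settings omitted):

@Bean
public LettuceConnectionFactory redisConnectionFactory(RedisClusterConfiguration clusterConfiguration) {
    // Hypothetical configuration sketch: disable native connection sharing so
    // concurrent threads do not WATCH/MULTI on the same underlying connection.
    LettuceConnectionFactory factory = new LettuceConnectionFactory(clusterConfiguration);
    factory.setShareNativeConnection(false); // one native connection per RedisConnection
    return factory;
}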

Multiple Spring Kafka Streams processors in one Spring Boot application

I have a requirement to write 2 Spring Kafka Streams processors in one Spring Boot application. Both stream processors need to consume messages from a single topic and produce their output to another single topic.
The first processor does not process all messages coming into the input topic; it processes only selected messages (grouped by message key) and uses the windowedBy option.
The second processor, however, needs to process all messages which do not fall into the first processor's groupBy logic.
I started with the first processor, which works fine. The problem came when I tried to introduce the second processor.
KafkaProcessor.java:
@Bean
public Function<KStream<String, Input>, KStream<String, Output>> processorOne() {
    final AtomicReference<KeyValue<String, Output>> result = new AtomicReference<>(null);
    return kStream -> kStream
        .filter((key, value) -> value != null)
        .filter(this::isValidKey)
        .peek((key, value) -> print(value))
        .groupBy((key, value) -> key)
        .windowedBy(TimeWindows.ofSizeWithNoGrace(Duration.ofMinutes(5)))
        .count(Materialized.as("my-state-store"))
        .filter((windowedKey, count) -> count > 0)
        .toStream()
        .filter((messageKey, messageValue) -> {
            log.info("Key: {} count: {}", messageKey.key(), messageValue);
            Optional<Output> outputResult = service.process(messageKey.key());
            if (outputResult.isPresent()) {
                Output output = outputResult.get();
                print(output);
                result.set(KeyValue.pair(messageKey.key(), output));
                return true;
            }
            return false;
        })
        .map((messageKey, messageValue) -> result.get());
}
Then I added a second processor to the same class.
@Bean
public Function<KStream<String, Input>, KStream<String, Output>> processorTwo() {
    final AtomicReference<KeyValue<String, Output>> result = new AtomicReference<>(null);
    return kStream -> kStream
        .filter((key, value) -> value != null)
        .filter((key, value) -> isValidEvent(value))
        .peek((key, value) -> print(value))
        .filter((messageKey, messageValue) -> {
            Optional<Output> outputResult = service.process(messageValue);
            if (outputResult.isPresent()) {
                Output output = outputResult.get();
                print(output);
                result.set(KeyValue.pair(String.format("%s-%s", output.getNameOne(), output.getNametwo()), output));
                return true;
            }
            return false;
        })
        .map((messageKey, messageValue) -> result.get());
}
application.yaml
spring:
  application:
    name: my-common-processor
  cloud:
    stream:
      function:
        definition: processorOne;processorTwo
      bindings:
        processorOne-in-0:
          destination: input-topic
          group: ${spring.application.name}-processorOne
        processorOne-out-0:
          destination: output-topic
          group: ${spring.application.name}-processorOne
        processorTwo-in-0:
          destination: input-topic
          group: ${spring.application.name}-processorTwo
        processorTwo-out-0:
          destination: output-topic
          group: ${spring.application.name}-processorTwo
      kafka:
        binder:
          brokers: 127.0.0.1:9092
          auto-create-topics: false
          auto-add-partitions: false
        streams:
          binder:
            configuration:
              spring.json.use.type.headers: false
              spring.json.trusted.packages: '*'
              max.poll.records: 10
              max.block.ms: 5000
              default.key.serde: org.apache.kafka.common.serialization.Serdes$StringSerde
              default.value.serde: org.apache.kafka.common.serialization.Serdes$StringSerde
            deserialization-exception-handler: sendtodlq
            auto-create-topics: false
            auto-add-partitions: false
            functions:
              processorOne:
                application-id: ${spring.application.name}-processorOne
              processorTwo:
                application-id: ${spring.application.name}-processorTwo
          bindings:
            processorOne-in-0:
              consumer:
                dlqName: error-topic
            processorTwo-in-0:
              consumer:
                dlqName: error-topic
  kafka:
    bootstrap-servers: 127.0.0.1:9092  # for kafka admin
Questions:
Is this the correct way to do this?
With this setup, the second processor never processes any messages, not even the ones that should have been processed by it.
If the setup is correct, how do I add/configure a deserialization-exception-handler for each processor?
This is where the low-level Processor API comes into the picture. Structure your project in this way.
Write a filtering processor that extends AbstractProcessor and filters the events from the input topic (hopefully your event has a field that identifies it as EventType1 or EventType2). Use the context forwarder inside the overridden process method; a fuller sketch follows the snippet below. I would call this EventFilteringProcessor.
if (EventType1)
    context.forward(key, value, To.child(EventType1));
if (EventType2)
    context.forward(key, value, To.child(EventType2));
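A minimal sketch of such a filtering processor, assuming a value type Event with a hypothetical getEventType() accessor and child processor names matching the ones registered in the topology further below:

import org.apache.kafka.streams.processor.AbstractProcessor;
import org.apache.kafka.streams.processor.To;

public class EventFilteringProcessor extends AbstractProcessor<String, Event> {

    @Override
    public void process(String key, Event value) {
        // Event and getEventType() are assumed here; route each record to the
        // matching downstream (child) processor by name.
        if ("EventType1".equals(value.getEventType())) {
            context().forward(key, value, To.child("EventType1"));
        } else if ("EventType2".equals(value.getEventType())) {
            context().forward(key, value, To.child("EventType2"));
        }
    }
}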
Write two separate processor implementations, i.e. two separate classes that extend AbstractProcessor. I would call these classes EventType1Processor and EventType2Processor.
Describe your stream processor topology in the following way, in the main Spring Boot application that implements the ProcessController.
Topology topology = new Topology();
topology.addSource("Source", "YourInputTopic")
    // EventType1 / EventType2 are assumed to be String constants holding the child processor names.
    .addProcessor("EventFilterClass", EventFilteringProcessor::new, "Source")
    .addProcessor(EventType1, EventType1Processor::new, "EventFilterClass")
    .addProcessor(EventType2, EventType2Processor::new, "EventFilterClass");

final KafkaStreams streams = new KafkaStreams(topology, streamConfig);
streams.start();
You can put the respective business logic in the overridden process methods of EventType1Processor and EventType2Processor.
Hope this helps.

NestJS Kafka implementation

I've read the NestJS microservice and Kafka documentation but I couldn't figure some of it out. I'll be thankful if you can help me out.
As the docs say, I have to create a microservice in the main.ts file as follows:
const app = await NestFactory.createMicroservice<MicroserviceOptions>(AppModule, {
  transport: Transport.KAFKA,
  options: {
    client: {
      brokers: ['localhost:9092'],
    },
  },
});
await app.listen(() => console.log('app started'));
Then there is a kafkaModule file like this:
@Module({
  imports: [
    ClientsModule.register([
      {
        name: 'HERO_SERVICE',
        transport: Transport.KAFKA,
        options: {
          client: {
            clientId: 'hero',
            brokers: ['localhost:9092'],
          },
          consumer: {
            groupId: 'hero-consumer',
          },
        },
      },
    ]),
  ],
})
export class KafkaModule implements OnModuleInit {
  constructor(@Inject('HERO_SERVICE') private readonly clientService: ClientKafka) {}

  async onModuleInit() {
    await this.clientService.connect();
  }
}
The first thing I can't figure out is the purpose of the first parameter of createMicroservice. (I passed AppModule and KafkaModule and both worked correctly, knowing that KafkaModule is imported in AppModule.)
The other thing is that, from what I understood, the microservice part and the configuration in the main.ts file are used to subscribe to the topics used in the MessagePattern or EventPattern decorators, and the Kafka client described in the KafkaModule is used to send messages to different topics.
The problem here is that, if what I said earlier is true, then why does the ClientsModule use a default groupId as a consumer if none is specified? The strange thing is I couldn't find a way to receive any message from any topic using the ClientsModule.
What I'm doing right now is to use different group ids in each file so they won't have any conflicts.
The first parameter of createMicroservice guides how the consumer will connect to Kafka when you want to consume a message from a specific topic.
Example: we want to get messages from the topic test01.
How do we declare it?
import {Controller} from '@nestjs/common'
import {MessagePattern, Payload} from '@nestjs/microservices'

@Controller('sync')
export class SyncController {
  @MessagePattern('test01')
  handleTopicTest01(@Payload() message: Sync): any {
    // Handle your message here
  }
}
The second block is used as a producer, not a consumer. When the application wants to send a message to a specific topic, the client from the ClientsModule supports this.
@Get()
sayHello() {
  return this.clientModule.send('say.hello', 'hello world')
}

Spring functional binding names: multiple inputs to one output in Spring Cloud

How do I set multiple input channels to output to the same destination?
I have the following configuration:
spring:
  cloud:
    stream:
      function:
        definition: beer;scotch
      bindings:
        notification:
          destination: labron
        beer-in-0:
          destination: wheat
        scotch-in-0:
          destination: wiskey
I want to create a function binding so that each input channel will output its message to the notification binding.
So in the corresponding code:
@Service
class Notifications {

    @Bean
    fun beer(): Function<String, String> = Function {
        // wanted out channel:
        // beer -> notification
        it.toUpperCase()
    }

    @Bean
    fun scotch(): Function<String, String> = Function {
        // wanted out channel:
        // scotch -> notification
        it.toUpperCase()
    }
}
I want to use Spring Cloud Stream 3.0 functional binding names:
beer -> notification
scotch -> notification
What is the best way to achieve that?
I'd suggest going with something like the following (based on this):
@Bean
public Function<Tuple2<Flux<String>, Flux<String>>, Flux<String>> beerAndScotch() {
    return tuple -> {
        Flux<String> beerStream = tuple.getT1().map(item -> item.toUpperCase());
        Flux<String> scotchStream = tuple.getT2().map(item -> item.toUpperCase());
        return Flux.merge(beerStream, scotchStream);
    };
}
and so your definition should look something like:
spring:
  cloud:
    stream:
      function:
        definition: beerAndScotch
      bindings:
        notification:
          destination: labron
        beerAndScotch-in-0:
          # ...
        beerAndScotch-in-1:
          # ...
        beerAndScotch-out-0:
          destination: labron
This way, both inputs to beer and inputs to scotch get sent to labron

Kafka sync producer takes longer on the first request

I am using a Spring Cloud Stream Kafka sync producer in a Spring Boot microservice. Every time we deploy the service, the very first call to Kafka takes more than 20 seconds to publish the message to the topic, but all subsequent calls take hardly 3 to 4 milliseconds. This issue also happens randomly and is intermittent, but mostly occurs when we restart the service.
We are using Kafka version 0.9.0.1 and the Gradle dependencies below:
dependencies {
    compile('org.springframework.cloud:spring-cloud-starter-stream-kafka')
}
dependencyManagement {
    imports {
        mavenBom "org.springframework.cloud:spring-cloud-dependencies:Camden.SR3"
    }
}
Here is the application.yml:
spring:
  cloud:
    stream:
      bindings:
        output:
          content-type: application/json
          destination: SOPOrderReceiveTopic
      kafka:
        binder:
          brokers: "localhost:9092,localhost:9093"
          headers: eventType
          requiredAcks: -1
          zkNodes: "localhost:2181"
        bindings:
          output:
            producer:
              configuration:
                max:
                  block:
                    ms: 20000
                reconnect:
                  backoff:
                    ms: 5000
                request:
                  timeout:
                    ms: 30000
                retries: 3
                retry:
                  backoff:
                    ms: 10000
                timeout:
                  ms: 30000
              sync: true
I am using org.springframework.cloud.stream.messaging.Source as the output channel, and this is the method used to publish a message:
public void publish(Message event) {
    try {
        boolean result = source.output().send(event, orderEventConfig.getTimeoutMs());
        logger.log(LoggingEventType.INFORMATION, "MESSAGE SENT TO KAFKA : " + result);
    } catch (Exception publishingExceptionMessage) {
        logger.log(LoggingEventType.ERROR, "publish event to kafka failed!", publishingExceptionMessage);
        throw new PublishEventException("publish event to kafka failed for eventPayload: " + event.getPayload(),
                ThreadVariables.getTenantId());
    }
}
I am aware that a sync producer is slower in terms of performance because it guarantees the order and durability of messages, but why does only the first request take so long? Is this a known issue? Is it fixed in the latest Kafka version? Can somebody suggest? Thanks.
It looks like there is an issue with the version of Spring Cloud Stream downloaded using the dependency below:
imports {
    mavenBom "org.springframework.cloud:spring-cloud-dependencies:Camden.SR3"
}
Try upgrading Spring Cloud Stream and check again. It should fix the latency of the first publish call to the Kafka server after the Spring Boot service starts up.
dependencies {
    compile('org.springframework.cloud:spring-cloud-stream-binder-kafka')
}

ext { springCloudVersion = 'Dalston.RELEASE' }

dependencyManagement {
    imports {
        mavenBom "org.springframework.cloud:spring-cloud-dependencies:${springCloudVersion}"
    }
}
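As a side note beyond the version question, the Kafka producer fetches topic metadata lazily on the first send and blocks for up to max.block.ms (20000 ms in the configuration above), which lines up with a 20-plus-second first call. If that turns out to be the cause, one hedged workaround is to pay that cost at startup rather than on the first real request; the warm-up bean below is a hypothetical sketch, and downstream consumers would need to tolerate or filter the marker message.

@Bean
public ApplicationRunner kafkaWarmUp(Source source) {
    // Hypothetical warm-up: the first send triggers the producer's metadata
    // fetch, so later publishes avoid the max.block.ms-bounded delay.
    // Downstream consumers must be able to ignore this marker payload.
    return args -> source.output().send(MessageBuilder.withPayload("warm-up").build());
}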
