Multiple Spring Kafka Streams processors in one Spring Boot application - spring-boot

I have a requirement to write two Spring Kafka Streams processors in one Spring Boot application. Both processors need to consume messages from a single input topic and produce their output to another single topic.
The first processor does not handle every message arriving on the input topic; it processes only selected messages (grouped by message key) and applies a windowedBy aggregation.
The second processor, however, needs to process all messages that do not fall into the first processor's groupBy logic.
I started with the first processor, which works fine. The problem appeared when I tried to introduce the second processor.
KafkaProcessor.java:
@Bean
public Function<KStream<String, Input>, KStream<String, Output>> processorOne() {
    final AtomicReference<KeyValue<String, Output>> result = new AtomicReference<>(null);
    return kStream -> kStream
        .filter((key, value) -> value != null)
        .filter(this::isValidKey)
        .peek((key, value) -> print(value))
        .groupBy((key, value) -> key)
        .windowedBy(TimeWindows.ofSizeWithNoGrace(Duration.ofMinutes(5)))
        .count(Materialized.as("my-state-store"))
        .filter((windowedKey, count) -> count > 0)
        .toStream()
        .filter((messageKey, messageValue) -> {
            log.info("Key: {} count: {}", messageKey.key(), messageValue);
            Optional<Output> outputResult = service.process(messageKey.key());
            if (outputResult.isPresent()) {
                Output output = outputResult.get();
                print(output);
                result.set(KeyValue.pair(messageKey.key(), output));
                return true;
            }
            return false;
        })
        .map((messageKey, messageValue) -> result.get());
}
Then I added a second processor to the same class.
@Bean
public Function<KStream<String, Input>, KStream<String, Output>> processorTwo() {
    final AtomicReference<KeyValue<String, Output>> result = new AtomicReference<>(null);
    return kStream -> kStream
        .filter((key, value) -> value != null)
        .filter((key, value) -> isValidEvent(value))
        .peek((key, value) -> print(value))
        .filter((messageKey, messageValue) -> {
            Optional<Output> outputResult = service.process(messageValue);
            if (outputResult.isPresent()) {
                Output output = outputResult.get();
                print(output);
                result.set(KeyValue.pair(String.format("%s-%s", output.getNameOne(), output.getNametwo()), output));
                return true;
            }
            return false;
        })
        .map((messageKey, messageValue) -> result.get());
}
application.yaml
spring:
  application:
    name: my-common-processor
  cloud:
    stream:
      function:
        definition: processorOne; processorTwo
      bindings:
        processorOne-in-0:
          destination: input-topic
          group: ${spring.application.name}-processorOne
        processorOne-out-0:
          destination: output-topic
          group: ${spring.application.name}-processorOne
        processorTwo-in-0:
          destination: input-topic
          group: ${spring.application.name}-processorTwo
        processorTwo-out-0:
          destination: output-topic
          group: ${spring.application.name}-processorTwo
      kafka:
        binder:
          brokers: 127.0.0.1:9092
          auto-create-topics: false
          auto-add-partitions: false
        streams:
          binder:
            configuration:
              spring.json.use.type.headers: false
              spring.json.trusted.packages: '*'
              max.poll.records: 10
              max.block.ms: 5000
              default.key.serde: org.apache.kafka.common.serialization.Serdes$StringSerde
              default.value.serde: org.apache.kafka.common.serialization.Serdes$StringSerde
            deserialization-exception-handler: sendtodlq
            auto-create-topics: false
            auto-add-partitions: false
            functions:
              processorOne:
                application-id: ${spring.application.name}-processorOne
              processorTwo:
                application-id: ${spring.application.name}-processorTwo
          bindings:
            processorOne-in-0:
              consumer:
                dlqName: error-topic
            processorTwo-in-0:
              consumer:
                dlqName: error-topic
  kafka:
    bootstrap-servers: 127.0.0.1:9092  # for kafka admin
Questions:
Is this the correct way to do this?
With this setup, the second processor never processes any messages, not even the ones it should have processed.
If the setup is correct, how do I add/configure a deserialization-exception-handler for each processor?

This is where the low-level Processor API comes into the picture. Define your project in this way.
Write a filtering processor that extends AbstractProcessor and filters the events coming from the input topic (hopefully your event has a field that identifies it as EventType1 or EventType2). Use the context forwarder inside the overridden process method. I would call this EventFilteringProcessor.
if (EventType1)
    context.forward(key, value, To.child("EventType1"));
if (EventType2)
    context.forward(key, value, To.child("EventType2"));
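For illustration, here is a minimal sketch of such a filtering processor, using the older AbstractProcessor API this answer refers to and assuming a hypothetical Input type with a getEventType() accessor (adapt the check to whatever field identifies your event type):

import org.apache.kafka.streams.processor.AbstractProcessor;
import org.apache.kafka.streams.processor.To;

// Sketch only: forwards each record to the child processor whose name matches the event type.
public class EventFilteringProcessor extends AbstractProcessor<String, Input> {

    @Override
    public void process(String key, Input value) {
        // getEventType() is a hypothetical accessor on the event payload.
        if ("EventType1".equals(value.getEventType())) {
            context().forward(key, value, To.child("EventType1"));
        } else if ("EventType2".equals(value.getEventType())) {
            context().forward(key, value, To.child("EventType2"));
        }
    }
}

The names passed to To.child must match the processor names registered in the topology below.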
Write two separate processor classes, each extending AbstractProcessor. I would call these EventType1Processor and EventType2Processor.
Describe your stream processor topology in the following way in the main Spring Boot application that implements the ProcessController.
Topology topology = new Topology();
topology.addSource("Source", "YourInputTopic")
        .addProcessor("EventFilterClass", EventFilteringProcessor::new, "Source")
        .addProcessor("EventType1", EventType1Processor::new, "EventFilterClass")
        .addProcessor("EventType2", EventType2Processor::new, "EventFilterClass");
final KafkaStreams streams = new KafkaStreams(topology, streamConfig);
streams.start();
You can put the respective business logic in the overridden process methods of EventType1Processor and EventType2Processor.
Hope this helps.

Related

How to define a default filter for all routes but disable it for a specific route?

When using Spring Cloud Gateway (v3.1.3), how would one go about defining a default filter to perform retries for all routes, but then disable it for individual routes? I would like something as intuitive as this:
spring:
  cloud:
    gateway:
      default-filters:
        - name: Retry
          args:
            retries: 3
            statuses: BAD_GATEWAY
            methods: GET,POST,PUT,DELETE
            backoff:
              firstBackoff: 10ms
              maxBackoff: 50ms
              factor: 2
              basedOnPreviousValue: false
      routes:
        - id: retry_disabled
          uri: http://localhost:8080/retry_disabled
          filters:
            - name: Retry
              args:
                retries: 0
        - id: retry_enabled
          uri: http://localhost:8080/retry_enabled
I see in the RetryGatewayFilterFactory class that the RetryConfig.validate() method will fail when the number of retries is less than 1 or the other config options are not defined properly:
public void validate() {
    Assert.isTrue(this.retries > 0, "retries must be greater than 0");
    Assert.isTrue(!this.series.isEmpty() || !this.statuses.isEmpty() || !this.exceptions.isEmpty(),
            "series, status and exceptions may not all be empty");
    Assert.notEmpty(this.methods, "methods may not be empty");
    if (this.backoff != null) {
        this.backoff.validate();
    }
}
Edit: I'm considering implementing it like this in code:
@Bean
public Function<GatewayFilterSpec, UriSpec> defaultRetryGatewayFilter() {
    return gatewayFilterSpec -> gatewayFilterSpec
        .retry(retryConfig -> {
            RetryGatewayFilterFactory.BackoffConfig backoffConfig = new RetryGatewayFilterFactory.BackoffConfig();
            backoffConfig.setFirstBackoff(Duration.ofMillis(10));
            backoffConfig.setMaxBackoff(Duration.ofMillis(50));
            backoffConfig.setFactor(2);
            backoffConfig.setBasedOnPreviousValue(false);
            retryConfig
                .setRetries(3)
                .allMethods()
                .setSeries(HttpStatus.Series.SERVER_ERROR)
                .setStatuses(HttpStatus.BAD_GATEWAY)
                .setBackoff(backoffConfig);
        });
}

@Bean
public RouteLocator routes(RouteLocatorBuilder builder, Function<GatewayFilterSpec, UriSpec> defaultRetryGatewayFilter) {
    return builder.routes()
        .route("retry_enabled", r -> r
            .path("/retry_enabled")
            .filters(defaultRetryGatewayFilter)
            .uri("lb://foo"))
        .route("retry_disabled", r -> r
            .path("/retry_disabled")
            // not retryable
            .uri("lb://foo"))
        .build();
}
Will the singleton defaultRetryGatewayFilter be thread safe?

Consume multiple consumers with different topics & Avro in the Spring Cloud Stream framework

I was able to figure out how to consume two different Avro types from two different topics, but I have 3 topics with 3 different Avro types. How do we do this in Spring Cloud Stream with Kafka Streams?
Working code with 2 topics is below.
When I add consumerProcess-in-2, the BiConsumer won't pick up the third Avro type or topic, which makes sense since it is a BiConsumer. Any suggestion would be helpful (a curried-function sketch follows the working example below).
Thank you in advance.
cloud:
  stream:
    function:
      definition: consumerProcess
    bindings:
      consumerProcess-in-0:
        content-type: application/*+avro
        destination: Topic-1
      consumerProcess-in-1:
        content-type: application/*+avro
        destination: Topic-2
@Bean
public BiConsumer<KStream<String, Avro1>, KStream<String, Avro2>> consumerProcess() {
    return (avro1, avro2) -> {
        avro1.foreach((key, value) -> {
            log.info("Avro1 ({})", value);
        });
        avro2.foreach((key, value) -> {
            log.info("Avro2 ({})", value);
        });
    };
}
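For the third topic, one option the Kafka Streams binder documents for more than two inputs is a curried function, where each curried level maps to the next -in index (consumerProcess-in-0, -in-1, -in-2). A minimal sketch, assuming a hypothetical Avro3 type bound to Topic-3 and that the binder accepts a terminal Consumer in the curried chain (the documented examples end in a Function producing an output KStream; if a Consumer is not accepted, three separate Consumer beans, one per topic, is a simpler fallback):

@Bean
public Function<KStream<String, Avro1>,
        Function<KStream<String, Avro2>,
                Consumer<KStream<String, Avro3>>>> consumerProcess() {
    // Each curried level receives one input binding; the innermost Consumer ends the chain.
    return avro1 -> avro2 -> avro3 -> {
        avro1.foreach((key, value) -> log.info("Avro1 ({})", value));
        avro2.foreach((key, value) -> log.info("Avro2 ({})", value));
        avro3.foreach((key, value) -> log.info("Avro3 ({})", value));
    };
}

A matching consumerProcess-in-2 binding with destination Topic-3 and the same Avro content type would then be added to the YAML above.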

Spring functional binding names: multiple inputs to one output in Spring Cloud

How do I set multiple input channels to output to the same destination?
I have the following configuration:
spring:
  cloud:
    stream:
      function:
        definition: beer;scotch
      bindings:
        notification:
          destination: labron
        beer-in-0:
          destination: wheat
        scotch-in-0:
          destination: wiskey
I want to create a Function binding so that each input channel will output its message to the notification binding.
So in the corresponding code:
@Service
class Notifications {
    @Bean
    fun beer(): Function<String, String> = Function {
        // wanted out channel
        // beer -> notification
        it.toUpperCase()
    }

    @Bean
    fun scotch(): Function<String, String> = Function {
        // wanted out channel
        // scotch -> notification
        it.toUpperCase()
    }
}
I want to use Spring Cloud Stream 3.0 functional binding names.
beer -> notification
scotch -> notification
What is the best way to achieve that?
I'd suggest going with something like this (based on this):
@Bean
public Function<Tuple2<Flux<String>, Flux<String>>, Flux<String>> beerAndScotch() {
    return tuple -> {
        Flux<String> beerStream = tuple.getT1().map(item -> item.toUpperCase());
        Flux<String> scotchStream = tuple.getT2().map(item -> item.toUpperCase());
        return Flux.merge(beerStream, scotchStream);
    };
}
and so your definition should look something like:
spring:
  cloud:
    stream:
      function:
        definition: beerAndScotch
      bindings:
        notification:
          destination: labron
        beerAndScotch-in-0:
          # ...
        beerAndScotch-in-1:
          # ...
        beerAndScotch-out-0:
          destination: labron
This way, both inputs to beer and inputs to scotch get sent to labron.

Log Spring WebFlux types - Mono and Flux

I am new to Spring 5.
1) How can I log the method params which are Mono and Flux types without blocking them?
2) How do I map models at the API layer to business objects at the service layer using MapStruct?
Edit 1:
I have this imperative code which I am trying to convert into reactive code. It has a compilation issue at the moment due to the introduction of Mono in the argument.
public Mono<UserContactsBO> getUserContacts(Mono<LoginBO> loginBOMono) {
    LOGGER.info("Get contact info for login: {}, and client: {}", loginId, clientId);
    if (StringUtils.isAllEmpty(loginId, clientId)) {
        LOGGER.error(ErrorCodes.LOGIN_ID_CLIENT_ID_NULL.getDescription());
        throw new ServiceValidationException(
                ErrorCodes.LOGIN_ID_CLIENT_ID_NULL.getErrorCode(),
                ErrorCodes.LOGIN_ID_CLIENT_ID_NULL.getDescription());
    }
    if (!loginId.equals(clientId)) {
        if (authorizationFeignClient.validateManagerClientAccess(new LoginDTO(loginId, clientId))) {
            loginId = clientId;
        } else {
            LOGGER.error(ErrorCodes.LOGIN_ID_VALIDATION_ERROR.getDescription());
            throw new AuthorizationException(
                    ErrorCodes.LOGIN_ID_VALIDATION_ERROR.getErrorCode(),
                    ErrorCodes.LOGIN_ID_VALIDATION_ERROR.getDescription());
        }
    }
    UserContactDetailEntity userContactDetail = userContactRepository.findByLoginId(loginId);
    LOGGER.debug("contact info returned from DB {}", userContactDetail);
    // mapstruct to map entity to BO
    return contactMapper.userEntityToUserContactBo(userContactDetail);
}
You can try it like this.
If you want to add logs, you can use .map and add the logs there. If the filters are not passed, the chain returns empty, which you can handle with switchIfEmpty.
loginBOMono
    .filter(loginBO -> !StringUtils.isAllEmpty(loginId, clientId))
    .filter(loginBO -> loginBO.loginId.equals(clientId))
    .filter(loginBO -> authorizationFeignClient.validateManagerClientAccess(new LoginDTO(loginId, clientId)))
    .map(loginBO -> {
        loginBO.loginId = clientId;
        return loginBO;
    })
    .flatMap(loginBO -> userContactRepository.findByLoginId(loginBO.loginId))
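On the first question (logging a Mono or Flux parameter without blocking), a minimal sketch is to attach the logging as a side effect on the chain, for example with doOnNext, rather than blocking for the value; the method and names below are illustrative only:

public Mono<UserContactsBO> getUserContacts(Mono<LoginBO> loginBOMono) {
    return loginBOMono
            // Logs each emitted LoginBO as a non-blocking side effect when the value arrives.
            .doOnNext(loginBO -> LOGGER.info("Get contact info for login: {}", loginBO))
            // .log("getUserContacts") would instead emit signal-level logs (onNext, onError, ...).
            .flatMap(this::toUserContacts); // toUserContacts is a hypothetical reactive continuation
}

The same doOnNext/log operators apply to Flux parameters.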

Kafka sync producer takes a long time on the first request

I am using the Spring Cloud Stream Kafka sync producer in a Spring Boot microservice. Every time we deploy the service, the very first call to Kafka takes more than 20 seconds to publish the message to the topic, but all subsequent calls take barely 3 to 4 milliseconds. This issue also happens randomly and intermittently, but mostly when we restart the service.
We are using Kafka version 0.9.0.1 and the Gradle dependencies below:
dependencies {
    compile('org.springframework.cloud:spring-cloud-starter-stream-kafka')
}

dependencyManagement {
    imports {
        mavenBom "org.springframework.cloud:spring-cloud-dependencies:Camden.SR3"
    }
}
Here is the application.yml:
spring:
  cloud:
    stream:
      bindings:
        output:
          content-type: application/json
          destination: SOPOrderReceiveTopic
      kafka:
        binder:
          brokers: "localhost:9092,localhost:9093"
          headers: eventType
          requiredAcks: -1
          zkNodes: "localhost:2181"
        bindings:
          output:
            producer:
              configuration:
                max:
                  block:
                    ms: 20000
                reconnect:
                  backoff:
                    ms: 5000
                request:
                  timeout:
                    ms: 30000
                retries: 3
                retry:
                  backoff:
                    ms: 10000
                timeout:
                  ms: 30000
              sync: true
I am using org.springframework.cloud.stream.messaging.Source as the output channel, and this is the method used to publish a message:
public void publish(Message event) {
    try {
        boolean result = source.output().send(event, orderEventConfig.getTimeoutMs());
        logger.log(LoggingEventType.INFORMATION, "MESSAGE SENT TO KAFKA : " + result);
    } catch (Exception publishingExceptionMessage) {
        logger.log(LoggingEventType.ERROR, "publish event to kafka failed!", publishingExceptionMessage);
        throw new PublishEventException("publish event to kafka failed for eventPayload: " + event.getPayload(),
                ThreadVariables.getTenantId());
    }
}
I am aware that a sync producer is slower in terms of performance, as it guarantees the ordering and durability of messages, but why does only the first request take so long? Is this a known issue? Is it fixed in the latest Kafka version? Can somebody suggest? Thanks.
It looks like there is an issue with the Spring Cloud Stream version pulled in by the dependency below:
imports {
    mavenBom "org.springframework.cloud:spring-cloud-dependencies:Camden.SR3"
}
Try upgrading Spring Cloud Stream and check. It should fix the latency in the first publishing call to the Kafka server after the Spring Boot service starts up.
dependencies {
    compile('org.springframework.cloud:spring-cloud-stream-binder-kafka')
}

ext { springCloudVersion = 'Dalston.RELEASE' }

dependencyManagement {
    imports {
        mavenBom "org.springframework.cloud:spring-cloud-dependencies:${springCloudVersion}"
    }
}
