How do I configure a consumer to check more than one schema when listening to multiple topics? - spring

I'm working on a project for a large company with millions of users. We are attempting to convert their REST based architecture to an event based architecture. The current architecture involves a service, we'll call it Service-A, that makes 7 REST calls when a user logs in.
Rather than calling out to the 7 services for that data when the user logs in, we want to modify those 7 services to produce events when there are updates to the data. Then we will have Service-A listen to 7 different Kafka topics and save those updates to the database.
It is a Java Spring Boot application. We are using AWS MSK to host our Kafka cluster and AWS Glue for the schema registry. I can configure my consumer in Service-A to listen to 7 topics, but I don't know how to get Service-A to check 7 different schemas when consuming a message from one of those 7 topics.
So far, the only configuration I've found for the Kafka consumer is one property that takes one schema name.
Here is my config YAML:
spring:
  kafka:
    listener:
      ack-mode: manual_immediate
    consumer:
      enable-auto-commit: false
      group-id: my-group
      key-deserializer: org.springframework.kafka.support.serializer.ErrorHandlingDeserializer
      value-deserializer: org.springframework.kafka.support.serializer.ErrorHandlingDeserializer
      properties:
        spring.json.trusted.packages: com.app.somepackage.domain
        spring.deserializer.key.delegate.class: org.apache.kafka.common.serialization.StringDeserializer
        spring.deserializer.value.delegate.class: com.amazonaws.services.schemaregistry.deserializers.avro.AWSKafkaAvroDeserializer
      auto-offset-reset: earliest
    bootstrap-servers: <my-msk-url>
    properties:
      region: us-west-2
      schemaName: my-first-schema
      registry.name: my-registry-name
      avroRecordType: SPECIFIC_RECORD
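For context, the consuming side of Service-A would look something like the sketch below. The listener class, topic names, and handler body are illustrative placeholders, not the actual project code; the sketch only shows that a single @KafkaListener can subscribe to several topics, while how the value deserializer picks the right schema per topic is exactly the open question.

import org.apache.avro.specific.SpecificRecord;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.kafka.support.Acknowledgment;
import org.springframework.stereotype.Component;

@Component
public class UpdateEventListener {

    // One listener subscribed to several topics (topic names are placeholders).
    @KafkaListener(topics = {"user-updates", "profile-updates", "billing-updates"},
                   groupId = "my-group")
    public void onEvent(ConsumerRecord<String, SpecificRecord> record, Acknowledgment ack) {
        SpecificRecord event = record.value();
        // ... map the event to an entity and persist it ...
        ack.acknowledge(); // manual ack because ack-mode is manual_immediate
    }
}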

Related

Spring Kafka consuming old messages which are already consumed by the consumer

I have a Spring Boot application using Spring Kafka. We have created a consumer which consumes messages from 4 topics. These topics don't have any partitions. The issue I am facing is random behavior: in any one of these topics the offset stops advancing and my consumer keeps consuming the same messages from that topic again and again, until we manually move the offset to the latest. Below is the YAML configuration I have:
spring:
  kafka:
    consumer:
      bootstrap-servers: ${KAFKA_BOOTSTRAP_SERVERS}
      group-id: group_id
      key-deserializer: org.apache.kafka.common.serialization.StringDeserializer
      value-deserializer: org.apache.kafka.common.serialization.StringDeserializer

kafka:
  consumer:
    allTopicList: user.topic,student.topic,class.topic,teachers.topic
As it is a Spring Boot application, the default offset reset is set to latest.
What am I doing wrong here? Please help me understand.
What version are you using?
You should set
...
  consumer:
    enable-auto-commit: false
The listener container will more reliably commit the offsets.
You should also consider
ack-mode: RECORD
and the container will commit the offset for each successfully processed record (default is BATCH).
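In Java terms, the equivalent container configuration would look roughly like the sketch below. This is an illustration, not part of the original answer, and it assumes a Spring Kafka version where the ack mode is set on ContainerProperties (in older versions the enum lived on AbstractMessageListenerContainer).

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.config.ConcurrentKafkaListenerContainerFactory;
import org.springframework.kafka.core.ConsumerFactory;
import org.springframework.kafka.listener.ContainerProperties;

@Configuration
public class KafkaConsumerConfig {

    @Bean
    public ConcurrentKafkaListenerContainerFactory<String, String> kafkaListenerContainerFactory(
            ConsumerFactory<String, String> consumerFactory) {
        ConcurrentKafkaListenerContainerFactory<String, String> factory =
                new ConcurrentKafkaListenerContainerFactory<>();
        factory.setConsumerFactory(consumerFactory);
        // Commit the offset after each successfully processed record
        // (the default is BATCH); auto-commit stays disabled.
        factory.getContainerProperties().setAckMode(ContainerProperties.AckMode.RECORD);
        return factory;
    }
}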

Spring Boot and Kafka: Broker disconnected

I have setup a Spring Boot application to receive Kafka messages from an existing and working Kafka producer. The setup is standard, and based on the following: https://www.codenotfound.com/spring-kafka-consumer-producer-example.html
Messages are not received, and the following is continually displayed in the console:
WARN org.apache.clients.NetworkClient :Bootstrap broker <hostname>:9092 disconnected
In addition, the following debug message is logged:
org.apache.common.errors.Timeout: Failed to update metadata after 60000 ms.
The console message is discussed in the following link:
https://community.hortonworks.com/content/supportkb/150148/errorwarn-bootstrap-broker-6668-disconnected-orgap.html
The logged message is discussed here:
https://community.cloudera.com/t5/Data-Ingestion-Integration/Error-when-sending-message-to-topic-in-Kafka/td-p/41440
Very likely, the timeout will not happen when the first issue is resolved.
The suggested solution for the console message is to explicitly pass --security-protocol SSL as an argument to the producer or consumer command.
Given that I am listening on an existing Kafka broker and topic, no settings can be changed there. Any changes must be on the Spring Boot side.
Is it possible to configure application.yml so that --security-protocol SSL is passed as an argument to the consumer? Also, has anyone experienced this before, and is there another way to resolve the issue using the configuration options available in Spring Boot and Spring Kafka?
Thanks
See the documentation.
Scroll down to Kafka. Arbitrary Kafka properties can be set using
spring:
  kafka:
    properties:
      security.protocol: SSL
This applies to the consumer and producer (and admin in 2.0).
In the upcoming 2.0 release (currently RC1), there is also
spring:
  kafka:
    consumer:
      properties:
        some.property: foo
for properties that only apply to consumers (and similarly for producers and admins).
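If you would rather keep the setting out of application.yml, the same property can also be applied programmatically when building the consumer factory. The sketch below is not part of the original answer; it assumes Boot's auto-configured KafkaProperties bean is available for injection:

import java.util.Map;

import org.apache.kafka.clients.CommonClientConfigs;
import org.springframework.boot.autoconfigure.kafka.KafkaProperties;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.core.ConsumerFactory;
import org.springframework.kafka.core.DefaultKafkaConsumerFactory;

@Configuration
public class KafkaSslConfig {

    @Bean
    public ConsumerFactory<String, String> consumerFactory(KafkaProperties kafkaProperties) {
        // Start from the consumer properties Boot builds from application.yml, then override.
        Map<String, Object> props = kafkaProperties.buildConsumerProperties();
        // Equivalent to setting spring.kafka.properties.security.protocol: SSL
        props.put(CommonClientConfigs.SECURITY_PROTOCOL_CONFIG, "SSL");
        return new DefaultKafkaConsumerFactory<>(props);
    }
}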

Spring Cloud Streaming - Separate Connection for Producer & Consumer

I have a Spring Cloud Streaming transformer application using RabbitMQ. It is reading from a Rabbit queue, doing some transformation, and writing to a Rabbit exchange. I have my application deployed to PCF and am binding to a Rabbit service.
This works fine, but now I am needing a separate connection for consuming and producing the message. (I want to read from the Rabbit queue using one connection, and write to a Rabbit exchange using a different connection). How would I configure this? Is it possible to bind my applications to 2 different Rabbit services using 1 as the producer and 1 as the consumer?
Well, starting with version 1.3 Rabbit Binder indeed creates a separate ConnectionFactory for producers: https://docs.spring.io/spring-cloud-stream/docs/Ditmars.RELEASE/reference/htmlsingle/#_rabbitmq_binder
Starting with version 1.3, the RabbitMessageChannelBinder creates an internal ConnectionFactory copy for the non-transactional producers to avoid dead locks on consumers when shared, cached connections are blocked because of Memory Alarm on Broker.
So maybe that is already enough for you, as-is, after upgrading to Spring Cloud Stream Ditmars.
UPDATE
How would I go about configuring this internal ConnectionFactory copy with different connection properties?
No, that's a different story. What you need is called multi-binder support: https://docs.spring.io/spring-cloud-stream/docs/Ditmars.RELEASE/reference/htmlsingle/#multiple-binders
You should declare several blocks for different connection factories:
spring.cloud.stream.bindings.input.binder=rabbit1
spring.cloud.stream.bindings.output.binder=rabbit2
...
spring:
  cloud:
    stream:
      bindings:
        input:
          destination: foo
          binder: rabbit1
        output:
          destination: bar
          binder: rabbit2
      binders:
        rabbit1:
          type: rabbit
          environment:
            spring:
              rabbitmq:
                host: <host1>
        rabbit2:
          type: rabbit
          environment:
            spring:
              rabbitmq:
                host: <host2>
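The transformer code itself does not change with multiple binders; it keeps consuming from the input binding and producing to the output binding, and the binder assignment above decides which RabbitMQ connection each side uses. A rough sketch with the annotation-based API of that generation (the transformation body is a placeholder):

import org.springframework.cloud.stream.annotation.EnableBinding;
import org.springframework.cloud.stream.annotation.StreamListener;
import org.springframework.cloud.stream.messaging.Processor;
import org.springframework.messaging.handler.annotation.SendTo;

@EnableBinding(Processor.class)
public class TransformerApplication {

    // Reads from the "input" binding (rabbit1) and writes the result
    // to the "output" binding (rabbit2), per the binder configuration above.
    @StreamListener(Processor.INPUT)
    @SendTo(Processor.OUTPUT)
    public String transform(String payload) {
        return payload.toUpperCase();
    }
}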

How to connect to Kafka Mesos Framework from an application using Spring Cloud Stream?

Having a Mesos-Marathon cluster in place and a Spring Boot application with Spring Cloud Stream that consumes a topic from Kafka, we now want to integrate Kafka with the Mesos cluster. For this we want to install Kafka Mesos Framework.
Right now we have the application.yml configuration like this:
---
spring:
  profiles: local-docker
  cloud:
    stream:
      kafka:
        binder:
          zk-nodes: 192.168.88.188
          brokers: 192.168.88.188
....
Once we have installed the Kafka Mesos Framework, how can we connect to Kafka from Spring Cloud Stream?
Or, more specifically, what will the configuration look like?
The configuration properties look good. Do you have the host addresses correct?
For more info on the kafka binder config properties, you can refer here:
https://github.com/spring-cloud/spring-cloud-stream/blob/master/spring-cloud-stream-docs/src/main/asciidoc/spring-cloud-stream-overview.adoc#kafka-specific-settings

Spring Cloud Turbine - Unable to handle multiple clients?

I’m having a bit of trouble getting Turbine to work in Spring Cloud. In a nutshell, I can’t determine how to configure it to aggregate circuits from more than one application at a time.
I have 6 separate services, a Eureka server, and a Turbine server running in standalone mode. I can see from my Eureka server that all of the services are registered, including Turbine. My Turbine server is up and running, and I can see its /hystrix page without issue. But when I try to use it to examine turbine.stream, I only see the FIRST server listed in turbine.appConfig; the rest are ignored.
This is my Turbine server’s application.yml, or at least the relevant parts:
---
eureka:
  client:
    serviceUrl:
      defaultZone: http://localhost:8010/eureka/
server:
  port: 8030
info:
  component: Turbine
turbine:
  clusterNameExpression: new String("default")
  appConfig: sentence,subject,verb,article,adjective,noun
management:
  port: 8990
When I run this and access the hystrix dashboard on my turbine instance, asking for the turbine.stream, the ONLY circuit breakers listed in the output are for the first service listed in appConfig, the “sentence” service in this case. Curiously, if I re-arrange the order of these services and put another one first (like “noun”), I see only the circuits for THAT service. Only the first service in the list is displayed.
I’ll admit to being a little confused on some of the terminology, like streams, clusters, etc., so I could be missing some basic concept here, but my understanding is that Turbine could digest streams from more than one service and aggregate them in a single display. Suggestions would be appreciated.
I don't have enough reputation to comment, so I have to write this in an answer :)
I had exactly the same problem:
There are two services, "test-service" and "other-service", each with its own working hystrix-stream,
and there is one Turbine application, which is configured like this:
turbine:
  clusterNameExpression: new String("default")
  appConfig: test-service,other-service
All of my services are running on my local machine.
Result is: My Hystrix-Dashboard just shows the metrics from "test-service".
Reason:
It seems that a Turbine client configured in the described way doesn't handle multiple services when they are running on the same host.
This is explained here:
https://github.com/Netflix/Hystrix/issues/117#issuecomment-14262713
Turbine maintains state of all these instances in order to maintain persistent connections to them and it does rely on the "hostname" and if the host name is the same then it won't instantiate a new connection to that same server (on a different port).
So the main point is that all of your services must be registered with different hostnames. How you can do this on your local machine is described below.
UPDATE 2015-06-12/2016-01-23: Workaround for local testing
Change your hostfile:
# ...
127.0.0.1 localhost
127.0.0.1 localdomain1
127.0.0.1 localdomain2
# ...
127.0.0.1 localdomainx
And then set the hostname for your clients each to a different domain-entry like this:
application.yml:
eureka:
  instance:
    hostname: localdomainx
