Spring Cloud Stream producer retries and error handling - spring-boot

I have set up a Spring Cloud Stream Kafka producer and consumer, and there are 3 Kafka brokers running. I have set min.insync.replicas to 4 to see how producer error handling works. The MessageChannel.send() call returns immediately, and the producer logs keep saying NOT_ENOUGH_REPLICAS, which is fine and expected.
server.port: 9050
spring:
  cloud:
    stream:
      bindings:
        errorChannel:
          destination: error-topic
        output:
          destination: stream-topic
          group: top-group
          producer:
            errorChannelEnabled: true
      kafka:
        bindings:
          output:
            producer:
              retries: 3
              sync: false
        binder:
          autoCreateTopics: true
          configuration:
            value:
              serializer: com.example.kafkapublisher.MySerializer
          producer-properties:
            acks: all
spring.cloud.stream.kafka.bindings.errorChannel.consumer.enableDlq: true
The above is my producer configuration. Although retries is set to 3, the producer keeps on retrying a large number of times. Although sync is set to true, the send call returns immediately. Although the error channel and destination are defined, and errorChannelEnabled is set to true, I don't see the failed message in the error topic my-error, nor is the error topic created. Request your help.

Arbitrary Kafka producer properties go in the ...producer.configuration property.
https://docs.spring.io/spring-cloud-stream/docs/current/reference/html/spring-cloud-stream-binder-kafka.html#kafka-producer-properties
configuration
Map with a key/value pair containing generic Kafka producer properties. The bootstrap.servers property cannot be set here; use multi-binder support if you need to connect to multiple clusters.
Default: Empty map.
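Applied to the binding in the question, here is a minimal sketch with the generic Kafka producer properties (acks, retries) moved under the binding's producer configuration map, as the documentation above describes. sync: true is also shown, since that is what makes the MessageChannel.send() call block until the result is known, and delivery.timeout.ms is an assumed extra property (not from the question) that bounds the total time the producer spends retrying:

spring:
  cloud:
    stream:
      bindings:
        output:
          destination: stream-topic
          producer:
            errorChannelEnabled: true
      kafka:
        bindings:
          output:
            producer:
              sync: true
              configuration:
                acks: all
                retries: 3
                # assumed value; must not be smaller than request.timeout.ms (default 30000)
                delivery.timeout.ms: 30000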

Related

Spring Cloud Streams kafka binder - topic serialization configuration

So I think I've run myself into confusion, as I understand there are two different Kafka binders for Spring Cloud Stream:
Spring Cloud Stream Kafka Binder
Spring Cloud Stream Kafka Streams Binder
I'm looking for the correct YAML settings to define the serializer and deserializer in the normal Kafka binder for Spring Cloud Stream:
I can tweak the defaults using this logic:
spring:
  main:
    web-application-type: NONE
  application:
    name: tbfm-translator
  kafka:
    consumer:
      group-id: ${consumer_id}
    bootstrap-servers: ${kafka_servers}
  cloud:
    schemaRegistryClient:
      endpoint: ${schema_registry}
    stream:
      # default:
      #   producer.useNativeEncoding: true
      #   consumer.useNativeEncoding: true
      defaultBinder: kafka
      kafka:
        binder:
          auto-add-partitions: true # I wonder if it's because this is set
          auto-create-topics: true  # Disabling this seems to override the server settings and will auto-create
          producer-properties:
            # For additional properties you can check here:
            # https://docs.confluent.io/current/installation/configuration/producer-configs.html
            schema.registry.url: ${schema_registry}
            # Disable for auto schema registration
            auto.register.schemas: false
            # Use only the latest schema version
            use.latest.version: true
            # This will use reflection to generate schemas from classes - used to validate the current data set
            # against the schema registry for valid production
            schema.reflection: true
            # To use an Avro key enable the following line
            # key.serializer: io.confluent.kafka.serializers.KafkaAvroSerializer
            # This will use a string-based key - aka not in the registry - no name strategy needed with a string serializer
            key.serializer: org.apache.kafka.common.serialization.StringSerializer
            # This controls the serializer setup
            value.subject.name.strategy: io.confluent.kafka.serializers.subject.RecordNameStrategy
            value.serializer: io.confluent.kafka.serializers.KafkaAvroSerializer
which is:
spring.cloud.stream.kafka.binder.producer-properties.value.serializer
spring.cloud.stream.kafka.binder.producer-properties.key.serializer
I figure I should be able to do this on a per-topic basis:
spring:
  cloud:
    stream:
      bindings:
        my-topic:
          destination: a-topic
          xxxxxxxx??
I've come across setting:
producer:
  use-native-encoding: false
  keySerde: <CLASS>
But this doesn't seem to be working. Is there an easy property I can set to do this on a per-topic basis? I think keySerde is for the Kafka Streams implementation, not the normal Kafka binder.
use-native-encoding must be true to use your own serializers.
spring.cloud.stream.kafka.bindings.my-topic.producer.configuration.value.serializer: ...
See the documentation for kafka-specific producer properties.
configuration
Map with a key/value pair containing generic Kafka producer properties.
Default: Empty map.
stream:
  bindings: # Define output topics here and then again in the kafka.bindings section
    test:
      destination: multi-output
      producer:
        useNativeEncoding: true
  kafka:
    bindings:
      test:
        destination: multi-output
        producer:
          configuration:
            value.serializer: org.apache.kafka.common.serialization.StringSerializer
This seems to work, but it's very annoying that I have to duplicate the binding definition in two places.
It makes me want to shy away from the YAML-style definition.
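For what it's worth, the same two settings can also be written as flat dotted keys (still valid YAML for Spring Boot's relaxed binding, and the same form used in the answer above), which at least shrinks the duplication to repeating the binding name rather than the whole definition. A sketch reusing the test binding:

spring.cloud.stream.bindings.test.destination: multi-output
spring.cloud.stream.bindings.test.producer.useNativeEncoding: true
spring.cloud.stream.kafka.bindings.test.producer.configuration.value.serializer: org.apache.kafka.common.serialization.StringSerializer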

Spring Cloud Stream: Cannot connect to 2 rabbitmq clusters without allowOverride

I have a Spring Boot application (let's call it example-service) with the following configuration to connect to 2 different RabbitMQ clusters.
spring:
  cloud:
    stream:
      defaultBinder: rabbitA
      binders:
        rabbitA:
          inheritEnvironment: false
          defaultCandidate: false
          type: rabbit
          environment:
            spring:
              rabbitmq:
                addresses: rabbitmq-a:5672
                username: user-a
                password: password-a
        rabbitB:
          inheritEnvironment: false
          defaultCandidate: false
          type: rabbit
          environment:
            spring:
              rabbitmq:
                addresses: rabbitmq-b:5672
                username: user-b
                password: password-b
      bindings:
        dataFromA:
          destination: exchange-1
          group: queue-1
          binder: rabbitA
        dataFromB:
          destination: exchange-2
          group: queue-2
          binder: rabbitB
That itself works fine; it connects to both clusters. The problem is that this service is deployed in an environment where there is a Spring Cloud Config server with the following files:
application.yml
spring.rabbitmq:
  addresses: rabbitmq-a:5672
  username: user-a
  password: password-a
That seems to override the configuration set for each binder, located under the environment property, so I needed to add this extra config.
example-service.yml
spring.cloud:
  config:
    overrideSystemProperties: false
    allowOverride: true
    overrideNone: false
Now the example-service connects to both rabbitmq clusters again. But I have observed certain side effects, mainly not being able to override other properties in the config server example-service.yml anymore, which is a real need for me. So I have discarded using allowOverride and its related properties.
The question is... is it possible to make it work without using allowOverride, while keeping the spring.rabbitmq.addresses/username/password in the remote config server application.yml?
Thank you very much in advance.
Kind regards.
Which version are you using? I just tested it with 3.0.6 and it works fine:
spring:
  cloud:
    stream:
      binders:
        rabbitA:
          type: rabbit
          inherit-environment: false
          environment:
            spring:
              rabbitmq:
                virtual-host: A
        rabbitB:
          type: rabbit
          inherit-environment: false
          environment:
            spring:
              rabbitmq:
                virtual-host: B
      bindings:
        input1:
          binder: rabbitA
          destination: input1
          group: foo
        input2:
          binder: rabbitB
          destination: input2
          group: bar
  rabbitmq:
    virtual-host: /
Probably not related, but your group indentation is wrong.
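To make that last remark concrete: group belongs directly under each binding, at the same level as destination and binder. A corrected fragment of the bindings from the question would look like this:

bindings:
  dataFromA:
    destination: exchange-1
    group: queue-1
    binder: rabbitA
  dataFromB:
    destination: exchange-2
    group: queue-2
    binder: rabbitB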

SCDF kubernetes custom source is writing data to "output" channel

I have a custom source application which reads data from an external Kafka server and passes the information to the next processor in the stream. Locally everything works perfectly. I have created a Docker image of the code, and when I deploy the stream in a Kubernetes environment, I do see that a topic with the name stream.source-app gets created, but the messages produced by the source are actually going to the "output" topic. I don't see this issue in the local environment.
application.yaml
spring:
  cloud:
    stream:
      bindings:
        Workitemconnector_subscribe:
          destination: Workitemconnector
          contentType: application/json
          group: SCDFMessagingSourceTestTool1
          consumer:
            partitioned: true
            concurrency: 1
            headerMode: embeddedHeaders
        output:
          # destination: dataOut
          binder: kafka2
      binders:
        kafka1:
          type: kafka
          environment:
            spring:
              cloud:
                stream:
                  kafka:
                    binder:
                      brokers: xx.xxx.xx.xxx:9092
                      zkNodes: xx.xxx.xx.xxx:2181
        kafka2:
          type: kafka
          environment:
            spring:
              cloud:
                stream:
                  kafka:
                    binder:
                      brokers: server1:9092
                      zkNodes: server1:2181
spring.cloud.stream.defaultBinder: kafka1
Locally, without defining any parameters during stream deployment, I notice the source is consuming messages from the xxxx server and producing data to server1, to the topic named "stream.sourceapp", but in the Kubernetes environment it is acting strangely. It is always sending data to the "output" topic even though the "stream.sourceapp" topic exists.
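Worth noting when reading the config above: when a binding has no destination set, Spring Cloud Stream falls back to the binding name itself as the destination, which here would be a topic called output, and that matches the symptom exactly. As far as I understand, the SCDF deployer normally injects spring.cloud.stream.bindings.output.destination for a source at deployment time, so one thing to try is pinning the destination explicitly on the output binding and checking whether the deployer-supplied property actually reaches the pod, e.g.:

spring:
  cloud:
    stream:
      bindings:
        output:
          destination: stream.sourceapp # assumed topic name, taken from the one mentioned above
          binder: kafka2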

Loadbalancing fails when a server is down

I have written a simple set of microservices with the following architecture:
For all of them, I have added spring-boot-starter-actuator in order to add the /health endpoint.
In the Zuul/Ribbon configuration I have added:
zuul:
  ignoredServices: "*"
  routes:
    home-service:
      path: /service/**
      serviceId: home-service
      retryable: true
home-service:
  ribbon:
    listOfServers: localhost:8080,localhost:8081
    eureka.enabled: false
    ServerListRefreshInterval: 1
So each time a client calls GET http://localhost:7070/service/home, the load balancer will choose one of the two HomeService instances, which run on ports 8080 and 8081, and will call its /home endpoint.
But when one of the HomeService instances is shut down, the load balancer does not seem to be aware of it (in spite of the ServerListRefreshInterval configuration) and will fail with error=500 if it tries to call the stopped instance.
How could I fix it?
I have received and tested a solution from the Spring Cloud team.
The solution is here in GitHub.
To summarize:
I have added org.springframework.retry:spring-retry to my Zuul classpath
I have added @EnableRetry to my Zuul application
I have put the following properties in my Zuul configuration
application.yml
server:
  port: ${PORT:7070}
spring:
  application:
    name: gateway
endpoints:
  health:
    enabled: true
    sensitive: true
  restart:
    enabled: true
  shutdown:
    enabled: true
zuul:
  ignoredServices: "*"
  routes:
    home-service:
      path: /service/**
      serviceId: home-service
      retryable: true
  retryable: true
home-service:
  ribbon:
    listOfServers: localhost:8080,localhost:8081
    eureka.enabled: false
    ServerListRefreshInterval: 100
    retryableStatusCodes: 500
    MaxAutoRetries: 2
    MaxAutoRetriesNextServer: 1
    OkToRetryOnAllOperations: true
    ReadTimeout: 10000
    ConnectTimeout: 10000
    EnablePrimeConnections: true
ribbon:
  eureka:
    enabled: false
hystrix:
  command:
    default:
      execution:
        isolation:
          thread:
            timeoutInMilliseconds: 30000
Debugging timeouts may be tricky, considering there are 3 levels of routing alone (Zuul→Hystrix→Ribbon), not including async execution layers and the retry engine. The following scheme is valid for Spring Cloud releases Camden.SR6 and newer (I've checked this on Dalston.SR1):
Zuul routes the request through RibbonRoutingFilter, which creates a Ribbon command with the request context. The Ribbon command then creates a LoadBalancer command, which uses spring-retry for command execution, choosing the retry policy for the RetryTemplate according to the Zuul settings. @EnableRetry does nothing in this case, because that annotation only enables wrapping methods annotated with @Retryable in retrying proxies.
This means your command duration is limited to the lesser of these two values (see this post); a worked example with the numbers from this question follows below:
[HystrixTimeout], which is the timeout for the invoked Hystrix command
[RibbonTimeout * (MaxAutoRetries + 1) * (MaxAutoRetriesNextServer + 1)] (the retry factors only apply if Zuul has retries enabled in its configuration), where [RibbonTimeout = ConnectTimeout + ReadTimeout] on the HTTP client.
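Plugging in the values from the configuration in this question (assuming the +1 factors for the initial attempt, as documented for Spring Cloud Netflix):
RibbonTimeout = ConnectTimeout + ReadTimeout = 10000 + 10000 = 20000 ms
Ribbon total = 20000 * (2 + 1) * (1 + 1) = 120000 ms
HystrixTimeout = 30000 ms
So here the Hystrix timeout of 30 seconds is the effective limit and will fire long before Ribbon exhausts its retries.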
For debugging, it's convenient to create a breakpoint in RetryableRibbonLoadBalancingHttpClient#executeWithRetry or RetryableRibbonLoadBalancingHttpClient#execute method. At this point, you have:
ContextAwareRequest instance (e.g. RibbonApacheHttpRequest or OkHttpRibbonRequest) with the request context, which contains Zuul's retryable property;
LoadBalancedRetryPolicy instance with the load balancer context, which contains Ribbon's maxAutoRetries, maxAutoRetriesNextServer and okToRetryOnAllOperations properties;
RetryCallback instance with a requestConfig, which contains HttpClient's connectTimeout and socketTimeout properties;
RetryTemplate instance with chosen retry policy.
If the breakpoint is not hit, it means that the org.springframework.cloud.netflix.ribbon.apache.RetryableRibbonLoadBalancingHttpClient bean was not instantiated. This happens when the spring-retry library is not on the classpath.

Spring cloud netflix turbine.stream reports no data

Like a few others before me, I cannot get hystrix streams reported by my services to be aggregated by turbine (local, not amqp). I have read all the questions and answers here on SO, applied their advice and have got nowhere. Here's my setup.
Version: Brixton M4
Services running on localhost: eureka, zuul, myservice, myservice2, turbine
myservice and myservice2 are microservices that expose hystrix.stream. As with zuul, I can connect directly to these hystrix.stream endpoints and see data. turbine.stream is always empty.
turbine application.yml
The turbine part of this configuration is taken almost directly from the example in the spring cloud docs.
spring:
  application:
    name: turbine
server:
  port: 8989
management:
  port: 8990
turbine:
  aggregator:
    clusterNameExpression: metadata['cluster']
    clusterConfig: LOCAL
  appConfig: myservice,myservice2,zuul
  InstanceMonitor:
    eventStream:
      skipLineLogic:
        enabled: false
eureka:
  instance:
    leaseRenewalIntervalInSeconds: 10
  client:
    serviceUrl:
      defaultZone: http://localhost:8761/eureka/
info:
  component: Turbine!
Each of the 3 services includes:
eureka:
  instance:
    leaseRenewalIntervalInSeconds: 10
    metadataMap:
      cluster: LOCAL
  client:
    serviceUrl:
      defaultZone: http://localhost:8761/eureka/
Verifying that the metadata map is correct in Eureka from http://localhost:8761/eureka/apps
<applications>
  <versions__delta>1</versions__delta>
  <apps__hashcode>UP_4_</apps__hashcode>
  <application>
    <name>MYSERVICE</name>
    <instance>
      <instanceId>192.168.43.128:myservice:2222</instanceId>
      <hostName>192.168.43.128</hostName>
      <app>MYSERVICE</app>
      <ipAddr>192.168.43.128</ipAddr>
      <status>UP</status>
      <overriddenstatus>UNKNOWN</overriddenstatus>
      <port enabled="true">2222</port>
      <securePort enabled="false">443</securePort>
      <countryId>1</countryId>
      <dataCenterInfo class="com.netflix.appinfo.InstanceInfo$DefaultDataCenterInfo">
        <name>MyOwn</name>
      </dataCenterInfo>
      <leaseInfo>
        <renewalIntervalInSecs>10</renewalIntervalInSecs>
        <durationInSecs>90</durationInSecs>
        <registrationTimestamp>1453382031096</registrationTimestamp>
        <lastRenewalTimestamp>1453382630966</lastRenewalTimestamp>
        <evictionTimestamp>0</evictionTimestamp>
        <serviceUpTimestamp>1453382031096</serviceUpTimestamp>
      </leaseInfo>
      <metadata>
        <cluster>LOCAL</cluster>
      </metadata>
When I run turbine it is finding the service instances in Eureka but is not choosing them to report data:
o.s.c.n.t.CommonsInstanceDiscovery : Fetching instance list for apps: [myservice, myservice2, zuul]
o.s.c.n.turbine.EurekaInstanceDiscovery : Fetching instances for app: myservice
o.s.c.n.turbine.EurekaInstanceDiscovery : Received instance list for app: myservice, size=1
o.s.c.n.turbine.EurekaInstanceDiscovery : Fetching instances for app: myservice2
o.s.c.n.turbine.EurekaInstanceDiscovery : Received instance list for app: myservice2, size=1
o.s.c.n.turbine.EurekaInstanceDiscovery : Fetching instances for app: zuul
o.s.c.n.turbine.EurekaInstanceDiscovery : Received instance list for app: zuul, size=1
c.n.t.discovery.InstanceObservable : Retrieved hosts from InstanceDiscovery: 3
c.n.t.discovery.InstanceObservable : Found hosts that have been previously terminated: 0
And another (potentially useful?) log line that appears when logging is raised to DEBUG:
c.n.t.discovery.InstanceObservable : Retrieved hosts from InstanceDiscovery: [StatsInstance [hostname=192.168.43.128, cluster: MYSERVICE, isUp: true, attrs={cluster=LOCAL, port=2222}], StatsInstance [hostname=192.168.43.128, cluster: MYSERVICE2, isUp: true, attrs={cluster=LOCAL, port=2223}], StatsInstance [hostname=192.168.43.128, cluster: ZUUL, isUp: true, attrs={cluster=LOCAL, port=8765}]]
OK never mind, it was a silly mistake on my part that I spotted while running turbine in the debugger. clusterNameExpression is a child of turbine and not aggregator.
With that error corrected, I can see the first service in the comma-separated list reporting data in the turbine stream, but not the others. Is this expected? I.e. is Turbine designed for monitoring the streams from multiple microservices that make up the same logical application, or is it purely for multiple instances of the same microservice?
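For reference, the corrected placement described above would look roughly like this (a sketch of just the turbine block; only clusterNameExpression moves up a level, everything else stays where it was):

turbine:
  clusterNameExpression: metadata['cluster']
  aggregator:
    clusterConfig: LOCAL
  appConfig: myservice,myservice2,zuul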
