I have a problem with a Spring Cloud Stream application that uses the KStream component. It listens to one input and directs messages to one output after processing them.
It expects a JSON string to come in and tries to convert it to a Spring Tuple on arrival. The reverse happens when the message is sent out.
The problem is that when a sysadmin wants to test a topic with kafka-console-producer.sh, for instance, and sends a plain string such as
"lol"
to it, the whole Spring Cloud Stream application dies right there with the following exception:
java.lang.RuntimeException: com.fasterxml.jackson.core.JsonParseException: Unrecognized token 'lol': was expecting ('true', 'false' or 'null')
at [Source: lol; line: 1, column: 7]
at org.springframework.tuple.JsonStringToTupleConverter.convert(JsonStringToTupleConverter.java:71) ~[spring-tuple-1.0.0.RELEASE.jar:na]
at org.springframework.tuple.JsonStringToTupleConverter.convert(JsonStringToTupleConverter.java:31) ~[spring-tuple-1.0.0.RELEASE.jar:na]
at org.springframework.tuple.TupleBuilder.fromString(TupleBuilder.java:153) ~[spring-tuple-1.0.0.RELEASE.jar:na]
at org.springframework.cloud.stream.converter.TupleJsonMessageConverter.convertFromInternal(TupleJsonMessageConverter.java:90) ~[spring-cloud-stream-1.3.2.RELEASE.jar:1.3.2.RELEASE]
at org.springframework.messaging.converter.AbstractMessageConverter.fromMessage(AbstractMessageConverter.java:175) ~[spring-messaging-4.3.14.RELEASE.jar:4.3.14.RELEASE]
at org.springframework.messaging.converter.AbstractMessageConverter.fromMessage(AbstractMessageConverter.java:167) ~[spring-messaging-4.3.14.RELEASE.jar:4.3.14.RELEASE]
at org.springframework.messaging.converter.CompositeMessageConverter.fromMessage(CompositeMessageConverter.java:55) ~[spring-messaging-4.3.14.RELEASE.jar:4.3.14.RELEASE]
at org.springframework.cloud.stream.binder.kstream.KStreamListenerParameterAdapter$1.apply(KStreamListenerParameterAdapter.java:66) ~[spring-cloud-stream-binder-kstream-1.3.2.RELEASE.jar:1.3.2.RELEASE]
at org.apache.kafka.streams.kstream.internals.KStreamMap$KStreamMapProcessor.process(KStreamMap.java:42) ~[kafka-streams-0.10.1.1.jar:na]
at org.apache.kafka.streams.processor.internals.ProcessorNode.process(ProcessorNode.java:82) ~[kafka-streams-0.10.1.1.jar:na]
at org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:202) ~[kafka-streams-0.10.1.1.jar:na]
at org.apache.kafka.streams.processor.internals.SourceNode.process(SourceNode.java:66) ~[kafka-streams-0.10.1.1.jar:na]
at org.apache.kafka.streams.processor.internals.StreamTask.process(StreamTask.java:180) ~[kafka-streams-0.10.1.1.jar:na]
at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:436) ~[kafka-streams-0.10.1.1.jar:na]
at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:242) ~[kafka-streams-0.10.1.1.jar:na]
Caused by: com.fasterxml.jackson.core.JsonParseException: Unrecognized token 'lol': was expecting ('true', 'false' or 'null')
at [Source: lol; line: 1, column: 7]
at com.fasterxml.jackson.core.JsonParser._constructError(JsonParser.java:1702) ~[jackson-core-2.8.10.jar:2.8.10]
at com.fasterxml.jackson.core.base.ParserMinimalBase._reportError(ParserMinimalBase.java:558) ~[jackson-core-2.8.10.jar:2.8.10]
at com.fasterxml.jackson.core.json.ReaderBasedJsonParser._reportInvalidToken(ReaderBasedJsonParser.java:2839) ~[jackson-core-2.8.10.jar:2.8.10]
at com.fasterxml.jackson.core.json.ReaderBasedJsonParser._handleOddValue(ReaderBasedJsonParser.java:1903) ~[jackson-core-2.8.10.jar:2.8.10]
at com.fasterxml.jackson.core.json.ReaderBasedJsonParser.nextToken(ReaderBasedJsonParser.java:749) ~[jackson-core-2.8.10.jar:2.8.10]
at com.fasterxml.jackson.databind.ObjectMapper._initForReading(ObjectMapper.java:3850) ~[jackson-databind-2.8.10.jar:2.8.10]
at com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:3799) ~[jackson-databind-2.8.10.jar:2.8.10]
at com.fasterxml.jackson.databind.ObjectMapper.readTree(ObjectMapper.java:2397) ~[jackson-databind-2.8.10.jar:2.8.10]
at org.springframework.tuple.JsonStringToTupleConverter.convert(JsonStringToTupleConverter.java:44) ~[spring-tuple-1.0.0.RELEASE.jar:na]
I would expect the framework to have at least some fault tolerance for such behaviour; you cannot expect input to always be well formed.
So I looked into the Spring documentation: https://docs.spring.io/spring-cloud-stream/docs/current/reference/htmlsingle/#_configuration_options
There are some configuration options for what seems to be a hidden implementation of retry logic in case of failures, for instance the maxAttempts parameter. But that parameter already defaults to 3, and yet I don't see the Spring Cloud Stream application making any attempt to recover from this error.
So I would like to know the recommended way of building bad-input tolerance into Spring Cloud Stream applications.
The configuration for the application looks like this:
spring:
  cloud:
    stream:
      bindings:
        input:
          content-type: application/json
          destination: inbound
          group: fraud
          consumer:
            headerMode: raw
        output:
          content-type: application/x-spring-tuple
          destination: outbound
          producer:
            headerMode: raw
            useNativeEncoding: true
spring.cloud.stream.kstream.binder.configuration:
  key.serde: org.apache.kafka.common.serialization.Serdes$StringSerde
  value.serde: org.apache.kafka.common.serialization.Serdes$StringSerde
In Spring Cloud Stream 1.3.x (Ditmars), there is only very limited support for error handling for Kafka Streams; in fact, in 1.3 it is up to the application to handle any such errors itself. However, in 2.0.0 we are adding support for KIP-161: https://cwiki.apache.org/confluence/display/KAFKA/KIP-161%3A+streams+deserialization+exception+handlers
Using this new feature in the 2.0.0 version of the Kafka Streams binder, you can either logAndSkip or logAndFail the records on deserialization errors. In addition to these, the binder also provides a DLQ-sending exception handler implementation. Docs are still being updated on the 2.0 line for all of this; I will update the doc links here once that's ready. But here is the gist of it:
spring.cloud.stream.kafka.streams.binder.serdeError: sendToDlq (or logAndFail or logAndSkip)
spring.cloud.stream.kafka.streams.bindings.input.consumer.dlqName: [dlq name] - If this is not provided, it defaults to error.[incoming-topic].[group-name].
Then you will see the records that failed deserialization in the DLQ topic.
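As a rough sketch, those two properties could also be expressed in the YAML style used above (the topic name inbound-dlq is a made-up example; omitting dlqName would fall back to error.inbound.fraud, given the destination and group in the question):
spring:
  cloud:
    stream:
      kafka:
        streams:
          binder:
            serdeError: sendToDlq        # or logAndFail / logAndSkip
          bindings:
            input:
              consumer:
                dlqName: inbound-dlq     # hypothetical DLQ topic name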
Once again, these features are only available in the 2.0.0.BUILD-SNAPSHOT and will be part of the upcoming 2.0.0.RC1 release.
Related
With our Spring Boot app, we notice the Kafka consumer consuming a message twice, randomly and once in a while, only in the prod environment. We have 6 instances with 6 partitions deployed in PCF. We have caught messages with the same offset and partition received twice on the same topic, which causes duplicates and is business critical for us.
We haven't noticed this in non-production environments and it is hard to reproduce there. We have recently switched to Kafka and we have not been able to find the root cause.
We are using spring-cloud-stream / spring-cloud-stream-binder-kafka 2.1.2.
Here is the Config:
spring:
  cloud:
    stream:
      default.consumer.concurrency: 1
      default-binder: kafka
      bindings:
        channel:
          destination: topic
          content_type: application/json
          autoCreateTopics: false
          group: group
          consumer:
            maxAttempts: 1
      kafka:
        binder:
          autoCreateTopics: false
          autoAddPartitions: false
          brokers: brokers list
        bindings:
          channel:
            consumer:
              autoCommitOnError: true
              autoCommitOffset: true
              configuration:
                max.poll.interval.ms: 1000000
                max.poll.records: 1
                group.id: group
We use @StreamListener to consume the messages.
Here is the instance we received duplicate and the error message captured in server logs.
ERROR 46 --- [container-0-C-1] o.a.k.c.c.internals.ConsumerCoordinator : [Consumer clientId=consumer-3, groupId=group] Offset commit failed on partition topic-0 at offset 1291358: The coordinator is not aware of this member.
ERROR 46 --- [container-0-C-1] o.s.kafka.listener.LoggingErrorHandler : Error while processing: null OUT org.apache.kafka.clients.consumer.CommitFailedException: Commit cannot be completed since the group has already rebalanced and assigned the partitions to another member. This means that the time between subsequent calls to poll() was longer than the configured max.poll.interval.ms, which typically implies that the poll loop is spending too much time message processing. You can address this either by increasing the session timeout or by reducing the maximum size of batches returned in poll() with max.poll.records.
at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator$OffsetCommitResponseHandler.handle(ConsumerCoordinator.java:871) ~[kafka-clients-2.0.1.jar!/:na]
There is no crash and all the instances are healthy at the time of the duplicate. There is also some confusion with the error log - "Error while processing: null" - since the message was successfully processed, twice. And max.poll.interval.ms is 1000000, which is about 16 minutes and should be more than enough time for the system to process any message; the session timeout and heartbeat configs are at their defaults. The duplicate is received within 2 seconds on most of the instances.
Are there any configs that we are missing? Any suggestion/help is highly appreciated.
Commit cannot be completed since the group has already rebalanced
A rebalance occurred because your listener took too long; you should adjust max.poll.records and max.poll.interval.ms to make sure you can always handle the records received within the time limit.
In any case, Kafka does not guarantee exactly once delivery, only at least once delivery. You need to add idempotency to your application and detect/ignore duplicates.
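As a very rough illustration of that idempotency advice (a sketch only, not tied to any particular framework API; the notion of a per-message business key and the in-memory bound are assumptions - in production the "seen" state would usually live in a shared store such as a database):
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Sketch: bounded, in-memory deduplication guard keyed by a business identifier.
public class DuplicateFilter {

    private static final int MAX_TRACKED = 10_000;

    // Insertion-ordered set that evicts the oldest entries once MAX_TRACKED is exceeded.
    private final Set<String> seenKeys = Collections.newSetFromMap(
            new LinkedHashMap<String, Boolean>() {
                @Override
                protected boolean removeEldestEntry(Map.Entry<String, Boolean> eldest) {
                    return size() > MAX_TRACKED;
                }
            });

    // Returns true only the first time a given key is seen; callers skip processing otherwise.
    public synchronized boolean firstTime(String businessKey) {
        return seenKeys.add(businessKey);
    }
}
Inside the listener, a record would only be handled when firstTime(key) returns true; anything else is treated as an already-processed duplicate.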
Also, keep in mind that StreamListener and the annotation-based programming model have been deprecated for 3+ years and have been removed from the current main, which means the next release will not have them. Please migrate your solution to the functional programming model.
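For orientation, a minimal sketch of that functional style (the bean name process and the String payload type are placeholders; Spring Cloud Stream would wire it up through a binding named process-in-0):
import java.util.function.Consumer;

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class MessageHandlers {

    // Functional replacement for an @StreamListener method: the framework binds
    // this Consumer to the destination configured for the "process-in-0" binding.
    @Bean
    public Consumer<String> process() {
        return payload -> {
            // business logic that previously lived in the @StreamListener method
            System.out.println("Received: " + payload);
        };
    }
}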
I'm using the spring-kafka dependency version 2.8.1 with the following config:
kafka:
  bootstrap-servers: localhost:9092
  producer:
    acks: all
    key-serializer: org.apache.kafka.common.serialization.StringSerializer
    value-serializer: org.apache.kafka.common.serialization.StringSerializer
    buffer-memory: 16384
I'm trying to send about 65,000 messages to a topic but my server crashes with the following exception:
org.springframework.kafka.KafkaException: Send failed; nested exception is org.apache.kafka.clients.producer.BufferExhaustedException: Failed to allocate memory within the configured max blocking time 60000 ms.
This is how I'm sending all of these messages to my topic:
public void processMessages(List<Message> messages) {
    for (Message msg : messages) {
        kafkaTemplate.send("prepared-messages", msg.toJson());
    }
}
I tried setting the batch size to 0 but that also didn't work.
Have you tried increasing buffer.memory?
https://kafka.apache.org/documentation/#producerconfigs_buffer.memory
The total bytes of memory the producer can use to buffer records waiting to be sent to the server. If records are sent faster than they can be delivered to the server the producer will block for max.block.ms after which it will throw an exception.
This setting should correspond roughly to the total memory the producer will use, but is not a hard bound since not all memory the producer uses is used for buffering. Some additional memory will be used for compression (if compression is enabled) as well as for maintaining in-flight requests.
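As a minimal sketch in the Spring Boot YAML style used in the question, raising the buffer back to the Kafka default of 32 MB (the exact value is only a starting point to tune for your workload):
kafka:
  producer:
    # 33554432 bytes (32 MB) is the Kafka default; 16384 bytes is far too small
    # for a burst of ~65,000 messages.
    buffer-memory: 33554432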
I am using https://github.com/zendesk/racecar (which uses https://github.com/appsignal/rdkafka-ruby under the hood, a Ruby wrapper for https://github.com/edenhill/librdkafka/) to consume a Kafka stream, and every now and then I receive this error: Local: Unknown partition (unknown_partition)
"/app/vendor/bundle/ruby/2.6.0/gems/rdkafka-0.8.1/lib/rdkafka/consumer.rb:339:in `store_offset'",
"/app/vendor/bundle/ruby/2.6.0/gems/racecar-2.3.0/lib/racecar/consumer_set.rb:51:in `store_offset'"
This error seems to be raised when calling rd_kafka_offset_store, which is responsible for storing the offset of a message to be used in the next commit by this consumer.
Inspecting the logs with config.log_level = "debug", it looks like the error above is always preceded by:
rdkafka: [thrd:sasl_ssl://a1-abc-d1er1.eu-east-2.aws.confluent.cloud:9092/2]: sasl_ssl://a1-abc-d1er1.eu-east-2.aws.confluent.cloud:9092/2: Disconnected (after 3600185ms in state UP)
(try 1/10): Error for topic subscription #<struct Racecar::Consumer::Subscription topic="my.topic", start_from_beginning=true, max_bytes_per_partition=1048576, additional_config={}>: Local: Broker transport failure (transport)
This is my first time using Kafka and I have no idea what the cause might be.
I am running 3 instances of a service that I wrote using:
Scala 2.11.12
kafkaStreams 1.1.0
kafkaStreamsScala 0.2.1 (by Lightbend)
The service uses Kafka Streams with the following topology (high level):
InputTopic
Parse to a known type
Filter out messages that failed parsing
Split every single message into 6 new messages
On each message run: map.groupByKey.reduce(with local store).toStream.to
Everything works as expected, but I can't get rid of a WARN message that keeps showing up:
15:46:00.065 [kafka-producer-network-thread | my_service_name-1ca232ff-5a9c-407c-a3a0-9f198c6d1fa4-StreamThread-1-0_0-producer] [WARN ] [o.a.k.c.p.i.Sender] - [Producer clientId=my_service_name-1ca232ff-5a9c-407c-a3a0-9f198c6d1fa4-StreamThread-1-0_0-producer, transactionalId=my_service_name-0_0] Got error produce response with correlation id 28 on topic-partition my_service_name-state_store_1-repartition-1, retrying (2 attempts left). Error: UNKNOWN_PRODUCER_ID
As you can see, I get these errors from the INTERNAL topics that Kafka Streams manages. It seems like some kind of retention period on the producer metadata in the internal topics, or some kind of producer id reset.
Couldn't find anything regarding this issue, only a description of the error itself from here:
ERROR: UNKNOWN_PRODUCER_ID
CODE: 59
RETRIABLE: False
DESCRIPTION: This exception is raised by the broker if it could not locate the producer metadata associated with the producerId in question. This could happen if, for instance, the producer's records were deleted because their retention time had elapsed. Once the last records of the producer id are removed, the producer's metadata is removed from the broker, and future appends by the producer will return this exception.
Hope you can help,
Thanks
Edit:
It seems that the WARN message does not pop up on version 1.0.1 of Kafka Streams.
Spring Boot properties for the Kafka producer:
spring.kafka.bootstrap-servers=localhost:9092
spring.kafka.client-id=bam
#spring.kafka.producer.acks= # Number of acknowledgments the producer requires the leader to have received before considering a request complete.
spring.kafka.producer.batch-size=0
spring.kafka.producer.bootstrap-servers=localhost:9092
#spring.kafka.producer.buffer-memory= # Total bytes of memory the producer can use to buffer records waiting to be sent to the server.
spring.kafka.producer.client-id=bam-producer
spring.kafka.consumer.auto-offset-reset=earliest
#spring.kafka.producer.compression-type= # Compression type for all data generated by the producer.
spring.kafka.producer.key-serializer= org.apache.kafka.common.serialization.StringSerializer
#spring.kafka.producer.retries= # When greater than zero, enables retrying of failed sends.
spring.kafka.producer.value-serializer= org.apache.kafka.common.serialization.StringSerializer
#spring.kafka.properties.*= # Additional properties used to configure the client.
I am getting the below exception when I try to send a message to a Kafka topic:
Caused by: org.springframework.kafka.core.KafkaProducerException: Failed to send; nested exception is org.apache.kafka.common.errors.TimeoutException: Expiring 1 record(s) for bam-0 due to 30004 ms has passed since last append
at org.springframework.kafka.core.KafkaTemplate$1.onCompletion(KafkaTemplate.java:255)
at org.apache.kafka.clients.producer.internals.RecordBatch.done(RecordBatch.java:109)
at org.apache.kafka.clients.producer.internals.RecordBatch.maybeExpire(RecordBatch.java:160)
at org.apache.kafka.clients.producer.internals.RecordAccumulator.abortExpiredBatches(RecordAccumulator.java:245)
at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:212)
at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:135)
... 1 more
Caused by: org.apache.kafka.common.errors.TimeoutException: Expiring 1 record(s) for bam-0 due to 30004 ms has passed since last append
I am not able to figure out why I am getting this exception. Can someone please help?
The producer is timing out trying to send messages. I notice you are using localhost in your bootstrap. Make sure a broker is available locally and listening on port 9092.
Issue resolved by setting advertised.listeners in server.properties to PLAINTEXT://<ExternalIP>:9092.
Note: Kafka is deployed on AWS.
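For reference, a minimal server.properties sketch of that fix (the <ExternalIP> placeholder stands for the broker's public address; the listeners line is a typical companion setting, not something stated in the original answer):
# server.properties
listeners=PLAINTEXT://0.0.0.0:9092                  # bind on all interfaces inside the instance
advertised.listeners=PLAINTEXT://<ExternalIP>:9092  # address clients use to reach the broker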