Kafka Streams - Caused by: org.rocksdb.RocksDBException - Too many open files

I am using Kafka Streams 2.4 and the DSL API.
I have a stateful streaming application that consumes a user topic with 100 partitions. The application's internal topics are created with the same default number of partitions as the user topic.
I am observing the error below, and eventually all the task threads shut down.
Could you give me some pointers on a formula to calculate the required number of open file descriptors?
import java.util.Map;

import org.apache.kafka.streams.state.RocksDBConfigSetter;
import org.rocksdb.BlockBasedTableConfig;
import org.rocksdb.Options;

public class CustomRocksDBConfig implements RocksDBConfigSetter {

    private final org.rocksdb.Cache cache = new org.rocksdb.LRUCache(2 * 1024L * 1024L * 1024L);

    @Override
    public void setConfig(final String storeName, final Options options, final Map<String, Object> configs) {
        BlockBasedTableConfig tableConfig = (BlockBasedTableConfig) options.tableFormatConfig();
        tableConfig.setBlockCache(cache);
        tableConfig.setBlockCacheSize(1024L * 1024L * 1024L);
        tableConfig.setBlockSize(4 * 1024L);
        tableConfig.setCacheIndexAndFilterBlocks(true);
        options.setTableFormatConfig(tableConfig);
        options.setMaxWriteBufferNumber(7);
        options.setMinWriteBufferNumberToMerge(4);
        options.setWriteBufferSize(25 * 1024L * 1024L);
    }
}
Caused by: org.rocksdb.RocksDBException: While open a file for appending: /data/directory/generator.1583280000000/002360.sst: Too many open files
at org.rocksdb.RocksDB.flush(Native Method)
at org.rocksdb.RocksDB.flush(RocksDB.java:2394)
at org.apache.kafka.streams.state.internals.RocksDBStore$SingleColumnFamilyAccessor.flush(RocksDBStore.java:581)
at org.apache.kafka.streams.state.internals.RocksDBStore.flush(RocksDBStore.java:384)
... 17 more

The issue was resolved after increasing the ulimit.
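There is no exact formula, but roughly each stream task keeps one RocksDB instance per state store (and per segment for windowed stores), and every instance holds descriptors for its open SST files, so the requirement grows with the number of state stores times the number of partitions, times the SST files each store keeps open. As an alternative or complement to raising the ulimit, the number of SST files each RocksDB instance may keep open can be capped from a RocksDBConfigSetter; a minimal sketch, with an illustrative limit of 300:
import java.util.Map;

import org.apache.kafka.streams.state.RocksDBConfigSetter;
import org.rocksdb.Options;

// Sketch only: bounds the number of SST files each RocksDB instance keeps open.
// The limit of 300 is illustrative; it should exceed the expected SST count per store,
// otherwise reads pay the cost of reopening files.
public class BoundedOpenFilesConfig implements RocksDBConfigSetter {

    @Override
    public void setConfig(final String storeName, final Options options, final Map<String, Object> configs) {
        options.setMaxOpenFiles(300); // -1 would mean "keep every SST file open"
    }

    @Override
    public void close(final String storeName, final Options options) {
        // nothing to release; no Cache or Filter objects are created in this sketch
    }
}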

Related

Spring Apache Kafka onFailure Callback of KafkaTemplate not fired on connection error

I'm experimenting a lot with Apache Kafka in a Spring Boot App at the moment.
My current goal is to write a REST endpoint that takes in some message payload, which will use a KafkaTemplate to send the data to my local Kafka running on port 9092.
This is my producer config:
@Bean
public Map<String, Object> producerConfig() {
    // config settings for creating producers
    Map<String, Object> configProps = new HashMap<>();
    configProps.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, this.bootstrapServers);
    configProps.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
    configProps.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
    configProps.put(ProducerConfig.MAX_BLOCK_MS_CONFIG, 5000);
    configProps.put(ProducerConfig.REQUEST_TIMEOUT_MS_CONFIG, 4000);
    configProps.put(ProducerConfig.RETRIES_CONFIG, 0);
    return configProps;
}

@Bean
public ProducerFactory<String, String> producerFactory() {
    // creates a kafka producer
    return new DefaultKafkaProducerFactory<>(producerConfig());
}

@Bean("kafkaTemplate")
public KafkaTemplate<String, String> kafkaTemplate() {
    // template which abstracts sending data to kafka
    return new KafkaTemplate<>(producerFactory());
}
My REST endpoint forwards to a service; the service looks like this:
@Service
public class KafkaSenderService {

    private static final Logger logger = LoggerFactory.getLogger(KafkaSenderService.class);

    @Qualifier("kafkaTemplate")
    private final KafkaTemplate<String, String> kafkaTemplate;

    @Autowired
    public KafkaSenderService(KafkaTemplate<String, String> kafkaTemplate) {
        this.kafkaTemplate = kafkaTemplate;
    }

    public void sendMessageWithCallback(String message, String topicName) {
        // possibility to add callbacks to define what shall happen in the success/error case
        ListenableFuture<SendResult<String, String>> future = kafkaTemplate.send(topicName, message);
        future.addCallback(new KafkaSendCallback<String, String>() {

            @Override
            public void onFailure(KafkaProducerException ex) {
                logger.warn("Message could not be delivered. " + ex.getMessage());
            }

            @Override
            public void onSuccess(SendResult<String, String> result) {
                logger.info("Your message was delivered with following offset: " + result.getRecordMetadata().offset());
            }
        });
    }
}
The thing is: I expect the onFailure() method to be called when the message could not be sent, but this does not seem to work. When I change the bootstrapServers variable in the producer config to localhost:9091 (the wrong port, so no connection should be possible), the producer tries to connect to the broker. It makes several connection attempts, and after 5 seconds a TimeoutException occurs, but the onFailure() method is not called. Is there a way to have onFailure() called even if the connection cannot be established?
And by the way, I set the retries count to zero, but the producer still makes a second connection attempt after the first one. This is the log output:
EDIT: it seems like the Kafka producer / KafkaTemplate goes into an infinite loop when the broker is not available. Is that really the intended behaviour?
The KafkaTemplate does nothing fancy about connecting and publishing; everything is delegated to the KafkaProducer. What you describe would happen in exactly the same way if you used the plain Kafka client.
See KafkaProducer.send() JavaDocs:
 * @throws TimeoutException If the record could not be appended to the send buffer due to memory unavailable
 *                          or missing metadata within {@code max.block.ms}.
This happens because of the blocking logic in that producer:
/**
 * Wait for cluster metadata including partitions for the given topic to be available.
 * @param topic The topic we want metadata for
 * @param partition A specific partition expected to exist in metadata, or null if there's no preference
 * @param nowMs The current time in ms
 * @param maxWaitMs The maximum time in ms for waiting on the metadata
 * @return The cluster containing topic metadata and the amount of time we waited in ms
 * @throws TimeoutException if metadata could not be refreshed within {@code max.block.ms}
 * @throws KafkaException for all Kafka-related exceptions, including the case where this method is called after producer close
 */
private ClusterAndWaitTime waitOnMetadata(String topic, Integer partition, long nowMs, long maxWaitMs) throws InterruptedException {
Unfortunately this is not explained in the send() JavaDocs, which claim the call is fully asynchronous, but apparently it is not, at least for this metadata part, which has to be available before the record is enqueued for publishing.
That part is outside our control, and it is not reflected in the returned Future:
try {
    clusterAndWaitTime = waitOnMetadata(record.topic(), record.partition(), nowMs, maxBlockTimeMs);
} catch (KafkaException e) {
    if (metadata.isClosed())
        throw new KafkaException("Producer closed while send in progress", e);
    throw e;
}
See the Apache Kafka docs for more information on how to adjust the KafkaProducer for this: https://kafka.apache.org/documentation/#theproducer
The question was answered in the discussion at https://github.com/spring-projects/spring-kafka/discussions/2250# for anyone else stumbling across this thread. In short, kafkaTemplate.getProducerFactory().reset(); does the trick.
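For anyone who wants to see that workaround in context, here is a minimal sketch (my own, not from the thread), assuming the kafkaTemplate and logger fields from the KafkaSenderService above; the method name and the 5-second wait are illustrative. Blocking briefly on the returned future surfaces both the synchronous metadata timeout described above and asynchronous delivery failures, after which the cached producer is reset as suggested:
public void sendOrReset(String message, String topicName) {
    try {
        // get(...) surfaces the blocking metadata timeout as well as delivery errors
        kafkaTemplate.send(topicName, message).get(5, java.util.concurrent.TimeUnit.SECONDS);
        logger.info("Message delivered to " + topicName);
    } catch (Exception ex) {
        logger.warn("Message could not be delivered. " + ex.getMessage());
        // Workaround from the linked discussion: drop the cached KafkaProducer so the
        // next send starts with a fresh connection attempt.
        kafkaTemplate.getProducerFactory().reset();
    }
}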

How to disable logging all messages in a Kafka batch in case of an exception?

When using #KafkaListener with batches, the error handler logs the content of the full batch (all messages) in case of an exception.
How can I make this less verbose? I'd like to avoid spamming the log files with all the messages and only see the actual exception.
Here is a minimal example of what my consumer currently looks like:
@Component
class TestConsumer {

    @Bean
    fun kafkaBatchListenerContainerFactory(kafkaProperties: KafkaProperties): ConcurrentKafkaListenerContainerFactory<String, String> {
        val configs = kafkaProperties.buildConsumerProperties()
        configs[ConsumerConfig.MAX_POLL_RECORDS_CONFIG] = 10000
        val factory = ConcurrentKafkaListenerContainerFactory<String, String>()
        factory.consumerFactory = DefaultKafkaConsumerFactory(configs)
        factory.isBatchListener = true
        return factory
    }

    @KafkaListener(
        topics = ["myTopic"],
        containerFactory = "kafkaBatchListenerContainerFactory"
    )
    fun batchListen(values: List<ConsumerRecord<String, String>>) {
        // Something that might throw an exception in rare cases.
    }
}
What version are you using?
This container property was added in 2.2.14.
/**
 * Set to false to log {@code record.toString()} in log messages instead
 * of {@code topic-partition@offset}.
 * @param onlyLogRecordMetadata false to log the entire record.
 * @since 2.2.14
 */
public void setOnlyLogRecordMetadata(boolean onlyLogRecordMetadata) {
    this.onlyLogRecordMetadata = onlyLogRecordMetadata;
}
It has been true by default since version 2.7 (which is why the javadocs now read that way).
This was the previous javadoc:
/**
 * Set to true to only log {@code topic-partition@offset} in log messages instead
 * of {@code record.toString()}.
 * @param onlyLogRecordMetadata true to only log the topic/partition/offset.
 * @since 2.2.14
 */
Also, starting with version 2.5, you can set the log level on the error handler:
/**
 * Set the level at which the exception thrown by this handler is logged.
 * @param logLevel the level (default ERROR).
 */
public void setLogLevel(KafkaException.Level logLevel) {
    Assert.notNull(logLevel, "'logLevel' cannot be null");
    this.logLevel = logLevel;
}
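Putting both together, here is a sketch in Java of how the container factory from the question could apply these settings (assuming Spring Kafka 2.5 to 2.7; SeekToCurrentBatchErrorHandler is used only as an example of an error handler that exposes setLogLevel(), and the bean wiring is illustrative):
@Bean
public ConcurrentKafkaListenerContainerFactory<String, String> kafkaBatchListenerContainerFactory(
        ConsumerFactory<String, String> consumerFactory) {
    ConcurrentKafkaListenerContainerFactory<String, String> factory =
            new ConcurrentKafkaListenerContainerFactory<>();
    factory.setConsumerFactory(consumerFactory);
    factory.setBatchListener(true);
    // Log only topic-partition@offset for the records of a failed batch
    factory.getContainerProperties().setOnlyLogRecordMetadata(true);
    // Since 2.5: lower the level at which the error handler logs the exception itself
    SeekToCurrentBatchErrorHandler errorHandler = new SeekToCurrentBatchErrorHandler();
    errorHandler.setLogLevel(KafkaException.Level.WARN);
    factory.setBatchErrorHandler(errorHandler);
    return factory;
}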

Hazelcast ClassNotFound using Near Cache in Client

I am trying to use Hazelcast (3.9.2 and 3.11, no difference) in the following way:
I have Hazelcast servers (members), which I run dedicated, not embedded.
I do not want to teach the Hazelcast members the classes I want to store in them. I used the bundled hazelcast.xml file and made the following addition (3.9.2):
<replicatedmap name="default">
    <in-memory-format>BINARY</in-memory-format>
    <statistics-enabled>true</statistics-enabled>
</replicatedmap>
I also enabled TCP/IP discovery and disabled multicast.
Those are all the changes I made. I started with one member listening on 127.0.0.1:5701.
Then I try to attach Hazelcast clients to the member for storing and retrieving maps (primarily ReplicatedMaps, but plain Maps also do not work in my scenario).
My client code looks like this (Cache is just a Serializable class with no attributes):
public class Main {

    public static final String HAZELCAST_INSTANCE_NAME = "HAZI";
    public static final String REPLICATEDMAP_NAME = "REP_MAP";
    public static final String MAP_NAME = "NORMAL_MAP";

    public static void main(String[] args) {
        init();
        HazelcastInstance instance = HazelcastClient.getHazelcastClientByName(HAZELCAST_INSTANCE_NAME);
        Map<String, Object> repMap = instance.getReplicatedMap(REPLICATEDMAP_NAME);
        repMap.put("MyKey", new Cache());
        System.err.println("Retrieve " + repMap.get("MyKey"));
        Map<String, Object> normalMap = instance.getReplicatedMap(MAP_NAME);
        normalMap.put("MyKey", new Cache());
        System.err.println("Retrieve " + normalMap.get("MyKey"));
        System.exit(1);
    }

    private static void init() {
        ClientConfig cfg = new ClientConfig();
        cfg.setInstanceName(HAZELCAST_INSTANCE_NAME);
        cfg.addNearCacheConfig(defineNearCache(REPLICATEDMAP_NAME));
        cfg.addNearCacheConfig(defineNearCache(MAP_NAME));
        // for analysis in the hazelcast management console
        cfg.getProperties().put("hazelcast.client.statistics.enabled", "true");
        cfg.getProperties().put("hazelcast.client.statistics.period.seconds", "60");
        cfg.getNetworkConfig().addAddress("127.0.0.1:5701");
        if (HazelcastClient.newHazelcastClient(cfg) == null) {
            System.err.println(" !!! ERROR in Cache Config !!!");
        }
    }

    private static NearCacheConfig defineNearCache(String mapName) {
        EvictionConfig evictionConfig = new EvictionConfig()
                .setMaximumSizePolicy(EvictionConfig.MaxSizePolicy.ENTRY_COUNT)
                .setSize(200);
        return new NearCacheConfig()
                .setName(mapName)
                .setInMemoryFormat(InMemoryFormat.BINARY)
                .setInvalidateOnChange(true)
                .setEvictionConfig(evictionConfig);
    }
}
My problem now is:
Using this code I get a ClassNotFoundException when trying to put things into the replicated map or the regular map, but on the dedicated Hazelcast server (member), not on the client side.
SCHWERWIEGEND: [127.0.0.1]:5701 [dev] [3.9.2] hz._hzInstance_1_dev.event-3 caught an exception while processing task:com.hazelcast.spi.impl.eventservice.impl.LocalEventDispatcher#eeed098
com.hazelcast.nio.serialization.HazelcastSerializationException: java.lang.ClassNotFoundException: de.empic.hazelwar.model.Cache
at com.hazelcast.internal.serialization.impl.JavaDefaultSerializers$JavaSerializer.read(JavaDefaultSerializers.java:224)
at com.hazelcast.internal.serialization.impl.StreamSerializerAdapter.read(StreamSerializerAdapter.java:48)
at com.hazelcast.internal.serialization.impl.AbstractSerializationService.toObject(AbstractSerializationService.java:185)
at com.hazelcast.map.impl.DataAwareEntryEvent.getValue(DataAwareEntryEvent.java:90)
at com.hazelcast.client.impl.protocol.task.replicatedmap.AbstractReplicatedMapAddEntryListenerMessageTask.handleEvent(AbstractReplicatedMapAddEntryListenerMessageTask.java:92)
at com.hazelcast.client.impl.protocol.task.replicatedmap.AbstractReplicatedMapAddEntryListenerMessageTask.entryAdded(AbstractReplicatedMapAddEntryListenerMessageTask.java:132)
at com.hazelcast.replicatedmap.impl.ReplicatedMapEventPublishingService.dispatchEvent(ReplicatedMapEventPublishingService.java:82)
at com.hazelcast.replicatedmap.impl.ReplicatedMapService.dispatchEvent(ReplicatedMapService.java:247)
at com.hazelcast.spi.impl.eventservice.impl.LocalEventDispatcher.run(LocalEventDispatcher.java:64)
at com.hazelcast.util.executor.StripedExecutor$Worker.process(StripedExecutor.java:225)
at com.hazelcast.util.executor.StripedExecutor$Worker.run(StripedExecutor.java:208)
Caused by: java.lang.ClassNotFoundException: de.empic.hazelwar.model.Cache
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at com.hazelcast.nio.ClassLoaderUtil.tryLoadClass(ClassLoaderUtil.java:173)
at com.hazelcast.nio.ClassLoaderUtil.loadClass(ClassLoaderUtil.java:147)
at com.hazelcast.nio.IOUtil$ClassLoaderAwareObjectInputStream.resolveClass(IOUtil.java:591)
at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1868)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1751)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2042)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:431)
at com.hazelcast.internal.serialization.impl.JavaDefaultSerializers$JavaSerializer.read(JavaDefaultSerializers.java:219)
... 10 more
Whenever I remove the near cache config from the client config, everything works perfectly, except of course that I do not have a near cache.
What am I missing here?
@magicroomy, I ran the same on both 3.9.2 & 3.11. I can confirm that:
If you change the Replicated Map to a Map, it works with or without a Near Cache.
When using a Replicated Map with a Near Cache defined, the exception is thrown on the server side.
Without a Near Cache, the ReplicatedMap also works.
I created a github issue as well: https://github.com/hazelcast/hazelcast/issues/14210
My problem was solved by using version 3.11.1 of Hazelcast.
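Until an upgrade to 3.11.1 is possible, a workaround consistent with the observations above would be to register a near cache only for the regular map and leave the replicated map without one; a sketch of the adjusted init() (my own adaptation, not from the thread):
private static void init() {
    ClientConfig cfg = new ClientConfig();
    cfg.setInstanceName(HAZELCAST_INSTANCE_NAME);
    // Near cache only for NORMAL_MAP; the ReplicatedMap + Near Cache combination is
    // what triggers the server-side deserialization and the ClassNotFoundException.
    cfg.addNearCacheConfig(defineNearCache(MAP_NAME));
    cfg.getNetworkConfig().addAddress("127.0.0.1:5701");
    HazelcastClient.newHazelcastClient(cfg);
}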

Bind RabbitMQ consumer using Spring Cloud Stream to RabbitMQ producer

I have two microservices: one collects XML files from an internal FTP server, transforms them into DTO objects, and then publishes them as bytes to RabbitMQ; the other deserializes the incoming bytes from RabbitMQ into DTO objects, maps them to JPA entities, and persists them to the database.
I'd like to configure the RabbitMQ broker between these two microservices as follows:
1) For the microservice that collects XML files, I edited application.properties as below:
spring.cloud.stream.bindings.output.destination=TOPIC
spring.cloud.stream.bindings.output.group=proactive-policy
2) For the microservice that persists the incoming DTO objects, I configured application.properties as follows:
spring.cloud.stream.bindings.input.destination=TOPIC
spring.cloud.stream.bindings.input.group=proactive-policy
For receiving the incoming bytes from RabbitMQ, I'm using the second microservice as a sink:
@EnableJpaAuditing
@EnableBinding(Sink.class)
@SpringBootApplication(scanBasePackages = { "org.proactive.policy.data.cache" })
@RefreshScope
public class ProactivePolicyDataCacheApplication {

    private static Logger logger = LoggerFactory.getLogger(ProactivePolicyDataCacheApplication.class);

    @Autowired
    PolicyService policyService;

    public static void main(String[] args) {
        SpringApplication.run(ProactivePolicyDataCacheApplication.class, args);
    }

    @StreamListener(Sink.INPUT)
    public void input(Message<byte[]> message) throws Exception {
        if (Objects.isNull(message) || Objects.isNull(message.getPayload())) {
            logger.error("the message is null ");
            throw new IllegalArgumentException("`message` and `message.payload` cannot be null");
        }
        byte[] data = message.getPayload();
        if (data.length == 0) {
            logger.warn("Received empty message");
            return;
        }
        logger.info("Got data from policy-collector = " + new String(data, "UTF-8"));
        PolicyListDto policyListDto = (PolicyListDto) SerializationUtils.deserialize(data);
        logger.info("Policies.xml from policy-collector = " + policyListDto.getPolicy().toString());
        policyService.save(policyListDto);
    }
}
But when I open the RabbitMQ console to look at the exchanges, I see nothing arriving in the queue TOPIC.proactive-policy. Instead, the incoming messages are received in another queue that I have not configured, named FTPSTREAM.proactive-policy-collector.
Are there any suggestions for resolving this issue?
A couple of points:
1. There is no such thing as a 'group' for the output binding. The consumer group is a consumer property. Here is a fragment of the javadocs:
/**
 * Unique name that the binding belongs to (applies to consumers only). Multiple
 * consumers within the same group share the subscription. A null or empty String
 * value indicates an anonymous group that is not shared.
 * @see org.springframework.cloud.stream.binder.Binder#bindConsumer(java.lang.String,
 * java.lang.String, java.lang.Object,
 * org.springframework.cloud.stream.binder.ConsumerProperties)
 */
private String group;
2. The name 'FTPSTREAM.proactive-policy-collector' is definitely not something generated by Spring Cloud Stream from the configuration you show, so look into your configuration and see what you have missed.
It tells me that you have some consumer that has its 'destination' named FTPSTREAM and its 'group' proactive-policy-collector. It also tells me that your producer sends messages to the FTPSTREAM exchange.
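To make both sides meet on the expected exchange and queue, the producer can pre-declare the consumer group's queue with the required-groups producer property; a sketch of the two application.properties files (the binding names output and input are assumed to match the code above):
# policy-collector (producer side)
spring.cloud.stream.bindings.output.destination=TOPIC
spring.cloud.stream.bindings.output.producer.required-groups=proactive-policy

# policy-data-cache (consumer side)
spring.cloud.stream.bindings.input.destination=TOPIC
spring.cloud.stream.bindings.input.group=proactive-policy
With that in place, the RabbitMQ binder binds the durable queue TOPIC.proactive-policy to the TOPIC exchange for both sides, so the messages published by the collector land where the sink is listening.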

Spring Integration TCP Performance Monitor

I am now coding based on https://github.com/spring-projects/spring-integration-samples/tree/master/basic/tcp-client-server
I want to get monitoring information for this good example.
For example, I want to know:
how many clients are connected?
how many threads are being used?
Is there anybody who can reply to my question?
You can use this on the AbstractConnectionFactory:
/**
 * Returns a list of (currently) open {@link TcpConnection} connection ids; allows,
 * for example, broadcast operations to all open connections.
 * @return the list of connection ids.
 */
public List<String> getOpenConnectionIds() {
    return Collections.unmodifiableList(this.removeClosedConnectionsAndReturnOpenConnectionIds());
}
The thread stats you can get from the externally configured ThreadPoolTaskExecutor, injected into that AbstractConnectionFactory via:
/**
 * @param taskExecutor the taskExecutor to set
 */
public void setTaskExecutor(Executor taskExecutor) {
    this.taskExecutor = taskExecutor;
}
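Combining the two, here is a small sketch of a monitoring helper (the class name and wiring are my own; it assumes the connection factory was given an external ThreadPoolTaskExecutor via setTaskExecutor()):
import org.springframework.integration.ip.tcp.connection.AbstractConnectionFactory;
import org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor;

// Sketch only: exposes simple connection and thread metrics for the TCP sample.
public class TcpStats {

    private final AbstractConnectionFactory connectionFactory;
    private final ThreadPoolTaskExecutor taskExecutor;

    public TcpStats(AbstractConnectionFactory connectionFactory, ThreadPoolTaskExecutor taskExecutor) {
        this.connectionFactory = connectionFactory;
        this.taskExecutor = taskExecutor;
    }

    public int openConnections() {
        // ids of the currently open TcpConnections, as returned by the factory
        return connectionFactory.getOpenConnectionIds().size();
    }

    public String threadStats() {
        return "active=" + taskExecutor.getActiveCount() + ", poolSize=" + taskExecutor.getPoolSize();
    }
}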
