Getting the current number of messages in the ring buffer - spring

I am using Spring's Reactor pattern in my web application. Internally it uses LMAX's RingBuffer implementation as one of its message queues. I was wondering if there is any way to find out the current RingBuffer occupancy dynamically. It would help me determine the number of producers and consumers needed (and also their relative rates), and whether the RingBuffer as a message queue is being used optimally.
I tried the getBacklog() of the reactor.event.dispatch.AbstractSingleThreadDispatcher class, but it seems to always give the same value: the size of the RingBuffer I used while instantiating the reactor.
Any light on the problem would be greatly appreciated.

Use com.lmax.disruptor.Sequencer.remainingCapacity()
To get access to the Sequencer instance, you have to create it explicitly, along with the RingBuffer.
In my case, the initialization of the outgoing Disruptor
Disruptor<MessageEvent> outcomingDisruptor =
    new Disruptor<MessageEvent>(
        MyEventFactory.getInstance(),
        RING_BUFFER_OUT_SIZE,
        MyExecutor.getInstance(),
        ProducerType.SINGLE, new BlockingWaitStrategy());
transforms into
this.sequencer =
    new SingleProducerSequencer(RING_BUFFER_OUT_SIZE, new BlockingWaitStrategy());
RingBuffer<MessageEvent> ringBuffer =
    new RingBuffer<MessageEvent>(MyEventFactory.getInstance(), sequencer);
Disruptor<MessageEvent> outcomingDisruptor =
    new Disruptor<MessageEvent>(ringBuffer, MyExecutor.getInstance());
and then
public long getOutCapacity() {
    return sequencer.remainingCapacity();
}
UPDATE
Small bug :| remainingCapacity() returns the free space, not the occupancy, so we need outMessagesCount instead of getOutCapacity:
public long outMessagesCount() {
    return RING_BUFFER_OUT_SIZE - sequencer.remainingCapacity();
}

The latest version of reactor-core in mvnrepository (version 1.1.4.RELEASE) does not have a way to dynamically monitor the state of the message queue. However, after going through the reactor code on GitHub, I found the TraceableDelegatingDispatcher, which allows tracing the message queue (if supported by the underlying dispatcher implementation) at runtime, via its remainingSlots() method. The easiest option was to compile the source and use it.
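For illustration, querying it might look like this (a sketch only; the decorator's constructor shape and the reactor.getDispatcher() accessor are my assumptions from reading the source, not a documented API):

// Sketch: wrap the reactor's dispatcher in the tracing decorator
// (constructor shape assumed from the GitHub source).
Dispatcher delegate = reactor.getDispatcher();
TraceableDelegatingDispatcher tracing = new TraceableDelegatingDispatcher(delegate);

// remainingSlots() reports free capacity when the underlying dispatcher
// supports it, so occupancy is the buffer size minus the free slots.
long occupancy = RING_BUFFER_SIZE - tracing.remainingSlots();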

Related

Multithreaded Use of Spring Pulsar

I am working on a project to read from our existing ElasticSearch instance and produce messages in Pulsar. If I do this in a highly multithreaded way without any explicit synchronization, I get many occurrences of the following log line:
Message with sequence id X might be a duplicate but cannot be determined at this time.
That is produced from this line of code in the Pulsar Java client:
https://github.com/apache/pulsar/blob/a4c3034f52f857ae0f4daf5d366ea9e578133bc2/pulsar-client/src/main/java/org/apache/pulsar/client/impl/ProducerImpl.java#L653
When I add a synchronized block to my method, synchronizing on the pulsar template, the error disappears, but my publish rate drops substantially.
Here is the current working implementation of my method that sends Protobuf messages to Pulsar:
public <T extends GeneratedMessageV3> CompletableFuture<MessageId> persist(T o) {
    var descriptor = o.getDescriptorForType();
    PulsarPersistTopicSettings settings = pulsarPersistConfig.getSettings(descriptor);
    MessageBuilder<T> messageBuilder = Optional.ofNullable(pulsarPersistConfig.getMessageBuilder(descriptor))
            .orElse(DefaultMessageBuilder.DEFAULT_MESSAGE_BUILDER);
    Optional<ProducerBuilderCustomizer<T>> producerBuilderCustomizerOpt =
            Optional.ofNullable(pulsarPersistConfig.getProducerBuilder(descriptor));

    PulsarOperations.SendMessageBuilder<T> sendMessageBuilder;
    sendMessageBuilder = pulsarTemplate.newMessage(o)
            .withSchema(Schema.PROTOBUF_NATIVE(o.getClass()))
            .withTopic(settings.getTopic());
    producerBuilderCustomizerOpt.ifPresent(sendMessageBuilder::withProducerCustomizer);
    sendMessageBuilder.withMessageCustomizer(mb -> messageBuilder.applyMessageBuilderKeys(o, mb));

    synchronized (pulsarTemplate) {
        try {
            return sendMessageBuilder.sendAsync();
        } catch (PulsarClientException re) {
            throw new PulsarPersistException(re);
        }
    }
}
The original version of the above method did not have the synchronized(pulsarTemplate) { ... } block. It performed faster, but generated a lot of logs about duplicate messages, which I knew to be incorrect. Adding the synchronized block got rid of the log messages, but slowed down publishing.
What are the best practices for multithreaded access to the PulsarTemplate? Is there a better way to achieve very high throughput message publishing?
Should I look at using the reactive client instead?
EDIT: I've updated the code block to show the minimum synchronization necessary to avoid the log lines, which is just synchronizing during the .sendAsync(...) call.
Your usage without the synchronized should work. I will look into it, though, to see if anything else is going on. In the meantime, it would be great if you could give the Reactive client a try.
This issue was initially tracked here, and the final resolution was that it was an issue that has been fixed in Pulsar 2.11.
Please try updating to Pulsar 2.11.
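If that resolves it, the tail of persist(...) above should presumably work without the lock again, i.e. back to the original form (an untested sketch, assuming the 2.11 fix removes the need for synchronization):

// With the duplicate-detection fix in Pulsar 2.11+, the async send should be
// safe to call concurrently without synchronizing on the template:
try {
    return sendMessageBuilder.sendAsync();
} catch (PulsarClientException re) {
    throw new PulsarPersistException(re);
}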

Spring Integration Flow: Circuit breaker for each endpoint or at flow level

I have successfully implemented some Spring Integration flows.
I am looking to have a circuit breaker, either the same one for each endpoint or one at the flow level.
I have already read this documentation https://docs.spring.io/spring-integration/reference/html/handler-advice.html, but I haven't found my answer.
Should I use some AOP ?
Thanks
G.
I'm not sure what you have missed in the mentioned docs, but RequestHandlerCircuitBreakerAdvice is indeed over there: https://docs.spring.io/spring-integration/reference/html/handler-advice.html#circuit-breaker-advice
Advice like this should be applied in the Java DSL with this configuration option:
.transform(..., c -> c.advice(expressionAdvice()))
Pay attention to that advice(expressionAdvice()) call. The expressionAdvice() is a bean method. So, you can do something similar for the RequestHandlerCircuitBreakerAdvice and any of your endpoints in the flow which need to be guarded by the circuit.
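For example, a minimal sketch of wiring the circuit breaker advice into a flow (the bean, channel, and service names are illustrative, and the threshold values are arbitrary):

@Configuration
@EnableIntegration
public class GuardedFlowConfig {

    @Bean
    public RequestHandlerCircuitBreakerAdvice circuitBreakerAdvice() {
        RequestHandlerCircuitBreakerAdvice advice = new RequestHandlerCircuitBreakerAdvice();
        advice.setThreshold(5);          // open the circuit after 5 consecutive failures
        advice.setHalfOpenAfter(10_000); // let one attempt through 10s after opening
        return advice;
    }

    @Bean
    public IntegrationFlow guardedFlow() {
        return IntegrationFlows.from("inputChannel")
                .handle("myService", "process", e -> e.advice(circuitBreakerAdvice()))
                .get();
    }
}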
And yes, you can use only a single bean for the RequestHandlerCircuitBreakerAdvice. It keeps separate state for each endpoint it is called against:
protected Object doInvoke(ExecutionCallback callback, Object target, Message<?> message) {
    AdvisedMetadata metadata = this.metadataMap.get(target);
    if (metadata == null) {
        this.metadataMap.putIfAbsent(target, new AdvisedMetadata());
        metadata = this.metadataMap.get(target);
    }
Thanks for your answer @artem-bilan.
I really appreciate that a Spring Integration team member answered this.
After more thought, I have reformulated my problem.
Given an IntegrationFlow with a specific error channel: if there are more than a given number of errors in a given time span (more than 10 errors in 10s), I want to stop polling the input channel.
So I redirect all the errors for this flow to the flow's specific error channel.
An error counter is incremented, and if the threshold is reached within the given time span, I stop the poller.
I have a second flow that monitors "stopped" pollers and restarts them after some time.
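A rough sketch of the error-counting handler described above (the channel and bean names, the window logic, and the thresholds are illustrative, not the actual implementation):

// Count errors arriving on the flow's error channel; stop the poller once more
// than 10 errors land within a 10-second window. Not fully thread-safe; a
// scheduled task elsewhere is assumed to restart stopped pollers.
private final AtomicInteger errorCount = new AtomicInteger();
private volatile long windowStart = System.currentTimeMillis();

@Autowired
private ApplicationContext applicationContext;

@ServiceActivator(inputChannel = "myFlowErrorChannel")
public void onFlowError(ErrorMessage error) {
    long now = System.currentTimeMillis();
    if (now - windowStart > 10_000) {        // start a new 10s window
        windowStart = now;
        errorCount.set(0);
    }
    if (errorCount.incrementAndGet() > 10) { // threshold exceeded: stop polling
        applicationContext.getBean("myPollingAdapter", Lifecycle.class).stop();
    }
}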
[UPDATE]
I did use your recommendations.
Mainly because if the framework doesn't solve your problem, you're probably wrong.
And I was wrong.
Thanks!

Kafka Streams: How to use persistentKeyValueStore to reload existing messages from disk?

My code is currently using an InMemoryKeyValueStore, which avoids any persistence to disk or to Kafka.
I want to use rocksdb (Stores.persistentKeyValueStore) so that the app will reload state from disk. I'm trying to implement this, and I'm very new to Kafka and the streams API. Would appreciate help on how I might make changes, while I still try to understand stuff as I go.
I tried to create the state store here:
StoreBuilder<KeyValueStore<String, LinkedList<StoreItem>>> store =
    Stores.<String, LinkedList<StoreItem>>keyValueStoreBuilder(
        Stores.persistentKeyValueStore(storeKey), Serdes.String(), valueSerde);
How do I register it with the streams builder?
Existing code which uses the inMemoryKeyValueStore:
static StoreBuilder<KeyValueStore<String, LinkedList<StoreItem>>> makeStoreBuilder(
        final String storeKey,
        final Serde<LinkedList<StoreItem>> valueSerde,
        final boolean loggingDisabled) {
    final StoreBuilder<KeyValueStore<String, LinkedList<StoreItem>>> storeBuilder =
        Stores.keyValueStoreBuilder(Stores.inMemoryKeyValueStore(storeKey), Serdes.String(), valueSerde);
    return storeBuilder;
}
I need to ensure that the streams app will not end up missing existing messages in the log topic each time it restarts.
How do I register it with the streams builder?
By calling StreamsBuilder#addStateStore().
https://kafka.apache.org/22/javadoc/org/apache/kafka/streams/StreamsBuilder.html#addStateStore-org.apache.kafka.streams.state.StoreBuilder-
See StateStoresInTheDSLIntegrationTest at https://github.com/confluentinc/kafka-streams-examples for an end-to-end demo application.
You use a persistent store the exact same way as an in-memory store. The store takes care of the rest, and you don't need to worry about loading data etc. You just use it.
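Putting it together, a minimal sketch (the topic and store names are placeholders, and MyStoreTransformer stands in for whatever Transformer reads the store):

// Sketch: the question's store builder with persistentKeyValueStore swapped in,
// registered on the StreamsBuilder.
StreamsBuilder builder = new StreamsBuilder();
builder.addStateStore(Stores.keyValueStoreBuilder(
        Stores.persistentKeyValueStore("items-store"), Serdes.String(), valueSerde));

// Any transformer that names the store gets it from the ProcessorContext in init();
// on restart, RocksDB plus the changelog topic restore its contents automatically.
builder.<String, String>stream("input-topic")
       .transform(MyStoreTransformer::new, "items-store");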

Workaround to fix StreamListener constant Channel Name

I am using Spring Cloud Stream to consume messages, with something like:
@StreamListener(target = "CONSTANT_CHANNEL_NAME")
public void readingData(String input) {
    System.out.println("consumed info is " + input);
}
But I want the channel name to vary per environment, picked up from a property file, whereas per Spring the channel name should be a constant.
Is there any workaround for this problem?
Edit 1:
Let's look at the actual situation:
I am using multiple queues and DLQ queues, and their binding is done with RabbitMQ.
I want to change my channel names and queue names per environment.
I want to do it all on the same AMQP host.
My sink code:
public interface ProcessorSink extends Sink {

    @Input(CONSTANT_CHANNEL_NAME)
    SubscribableChannel channel();

    @Input(CONSTANT_CHANNEL_NAME_1)
    SubscribableChannel channel2();

    @Input(CONSTANT_CHANNEL_NAME_2)
    SubscribableChannel channel3();
}
You can pick the target value from a property file as below:
@StreamListener(target = "${streamListener.target}")
public void readingData(String input) {
    System.out.println("consumed info is " + input);
}
application.yml
streamListener:
  target: CONSTANT_CHANNEL_NAME
While there are many ways to do that, I wonder why you even care. In fact, if anything, you do want to make it constant so it is always the same, but through configuration properties map it to different remote destinations (e.g., Kafka, Rabbit etc.). For example, spring.cloud.stream.bindings.input.destination=myKafkaTopic states that the channel named input will be mapped to (bridged with) a Kafka topic named myKafkaTopic.
In fact, to further prove my point, we completely abstracted away channels altogether for users who use the spring-cloud-function programming model, but that is a whole different discussion.
My point is that I believe you are actually creating a problem rather than solving one, since with externalisation of the channel name you create the risk that, due to misconfiguration, the actual bound channel and the channel you mention in your properties will not be the same.
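In configuration terms, that looks something like the following sketch, using the bindings convention quoted above (the destination value and environment variable are illustrative):

# application.yml: channel name stays constant in code; the remote
# destination varies per environment through a property placeholder.
spring:
  cloud:
    stream:
      bindings:
        input:
          destination: ${ENV:dev}-myTopic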

Project reactor processors v3.X

We are trying to migrate from 2.X to 3.X.
https://github.com/reactor/reactor-core/issues/375
We have used the EventBus as the event manager in our application (a low-latency FX system) and it works very well for us.
After the change, we decided to take every module and give it its own processor to handle events.
1. Does this usage seem correct from your point of view? Given the lack of documentation at the current stage, and after reviewing everything we could, we don't really know what to do here.
2. We have tried to use Flux in order to perform an action every X interval.
For example: market data arrives 1000 times in 1 second, but we want to process an update only 4 times a second. After upgrading we are using:
a processor with a buffer that sends to another method.
In this method we have a Flux that takes the list and tries to work in parallel in order to complete its task.
We had 2 major problems:
1. Sometimes we receive a null event, and we cannot find what in our system is sending it; I suppose maybe we are misusing the processor.
// Definition of the processor
ReplayProcessor<Event> classAEventProcessor = ReplayProcessor.create();

// Event handler subscribing
public void onMyEventX(Consumer<Event> consumer) {
    Flux<Event> handler = classAEventProcessor.filter(event -> event.getType().equals(EVENT_X));
    handler.subscribe(consumer);
}
In the example above, the event in the handler sometimes is null. Once that happens, the stream stops working until we restart the server (because we only create the processor on restart).
2. We have tried to use parallel, but sometimes some of the messages disappeared, so maybe we are misusing the framework.
// On constructor
tickProcessor.buffer(1024, Duration.of(250, ChronoUnit.MILLIS))
        .subscribe(markets -> handleMarkets(markets));

// Handler
Flux.fromIterable(getListToProcess())
        .parallel()
        .runOn(Schedulers.parallel())
        .doOnNext(entryMap -> DoBlockingWork(entryMap))
        .sequential()
        .subscribe();
The intention is that the processor will wake up every 250ms and invoke the handler. The handler then works with a parallel Flux to make processing better and faster.
*In case DoBlockingWork takes more than 250ms, I couldn't figure out what the behavior would be.
UPDATE:
We had wrapped the EventBus, and every event was subscribed through the wrapped event manager.
Now we have tried to create an event processor for every module, but it works very slowly. We have used TopicProcessor with a ThreadExecutor and it is still very slow; the EventBus did the same work at high speed.
Does anyone have any idea? BTW, when I tried to use DirectProcessor, it seemed to work much better than the TopicProcessor.
Reactor 3 is built around the concept that you should avoid blocking as much as you can, so in your second snippet DoBlockingWork doesn't look good.
How are the events generated? Do you maybe have a listener-based asynchronous API to get them? If so, you could try using Flux.create.
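A hypothetical sketch of that bridge (MarketDataApi and MarketDataListener are assumed names for your listener-based API; the listener is assumed to be a functional interface receiving one Event):

// Bridge a push-style listener API into a Flux with Flux.create:
Flux<Event> events = Flux.create(sink -> {
    MarketDataListener listener = sink::next;   // push each event into the sink
    marketDataApi.addListener(listener);
    sink.onDispose(() -> marketDataApi.removeListener(listener));
});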
For your use case of "we have 1000 events in 1 second, but only want to process 4", I'd chain a sample operator. For instance, sample(Duration.ofMillis(250)) will divide each second into 4 windows, from which it will only emit the last element.
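Applied to the tickProcessor from the earlier snippet, that would look like this (handleMarket is an illustrative per-element handler):

// sample() keeps only the latest element of each 250 ms window,
// throttling ~1000 events/s down to at most 4/s:
tickProcessor.sample(Duration.ofMillis(250))
        .subscribe(this::handleMarket);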
The reference guide is being written, as well as a page where you can find links to external articles and learning material. There's a preview of the WIP reference guide here and the learning resources page here.
