Project Reactor processors v3.X - Spring

We are trying to migrate from 2.X to 3.X.
https://github.com/reactor/reactor-core/issues/375
We have used the EventBus as the event manager in our application (a low-latency FX system) and it has worked very well for us.
After the change we decided to give every module its own processor to handle events.
1. Does this usage seem correct from your point of view? Because of the lack of documentation at the current stage, and after reviewing everything we could, we don't really know what to do here.
2. We have tried to use Flux in order to perform an action every X interval.
For example: market data arrives at about 1000 updates per second, but we want to process an update only 4 times per second. After upgrading we are using:
A processor with a buffer that sends to another method.
In this method we have a Flux that gets the list and tries to work in parallel in order to complete its task.
We had 2 major problems:
1. Sometimes we received a null event that we cannot find being sent anywhere in our system, so I suppose we may be misusing the processor.
//Definition of processor
ReplayProcessor<Event> classAEventProcessor = ReplayProcessor.create();

//Event handler subscribing
public void onMyEventX(Consumer<Event> consumer) {
    Flux<Event> handler = classAEventProcessor.filter(event -> event.getType().equals(EVENT_X));
    handler.subscribe(consumer);
}
In the example above, the event in the handler sometimes comes through as null. Once it does, the stream stops working until we restart the server (because we only create the processor on restart).
2. We have tried to use parallel, but sometimes some of the messages disappeared, so maybe we are misusing the framework.
//On constructor
tickProcessor.buffer(1024, Duration.of(250, ChronoUnit.MILLIS))
        .subscribe(markets -> handleMarkets(markets));

//Handler
Flux.fromIterable(getListToProcess())
        .parallel()
        .runOn(Schedulers.parallel())
        .doOnNext(entryMap -> {
            DoBlockingWork(entryMap);
        })
        .sequential()
        .subscribe();
The intention is that the processor will wake up every 250ms and invoke the handler. The handler will then work with a parallel Flux for better and faster processing.
*In case DoBlockingWork takes more than 250ms, I couldn't understand what the behavior would be.
UPDATE:
We had wrapped the EventBus, and every event was subscribed through the wrapping event manager.
Now we have tried to create an event processor for every module, but it is very slow. We have used a TopicProcessor with a ThreadExecutor and it is still very slow; the EventBus did the same work at high speed.
Does anyone have any idea? BTW, when I tried to use a DirectProcessor it seemed to work much better than the TopicProcessor.

Reactor 3 is built around the concept that you should avoid blocking as much as you can, so in your second snippet DoBlockingWork doesn't look good.
How are the events generated? Do you maybe have a listener-based asynchronous API to get them? If so, you could try using Flux.create.
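A rough sketch of such a bridge could look like the following (marketFeed, addListener and removeListener are hypothetical stand-ins for your actual API, not names from the question):

import java.util.function.Consumer;
import reactor.core.publisher.Flux;
import reactor.core.publisher.FluxSink;

Flux<Event> events = Flux.create(sink -> {
    Consumer<Event> listener = sink::next;                        // push each incoming event into the Flux
    marketFeed.addListener(listener);                             // hypothetical registration call
    sink.onDispose(() -> marketFeed.removeListener(listener));    // unregister when the subscriber cancels or the Flux terminates
}, FluxSink.OverflowStrategy.BUFFER);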
For your use case of "we have 1000 events in 1 second, but only want to process 4", I'd chain a sample operator. For instance, sample(Duration.ofMillis(250)) will divide each second into 4 windows, from which it will only emit the last element.
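A minimal sketch of that, assuming the high-rate source is a Flux<Event> called marketEvents and handleMarket is a placeholder for your handler:

import java.time.Duration;
import reactor.core.publisher.Flux;

Flux<Event> throttled = marketEvents
        .sample(Duration.ofMillis(250));   // keep only the latest element of each 250ms window

throttled.subscribe(this::handleMarket);

Unlike buffer, sample drops the intermediate elements instead of collecting them, which matches "process only 4 updates per second".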
The reference guide is being written, as well as a page where you can find links to external articles and learning material. There's a preview of the WIP reference guide here and the learning resources page here.

Related

How do I debug a Mono that never completes

I have a Spring Boot application which contains a complex reactive flow (it involves MongoDB and RabbitMQ operations). Most of the time it works, but...
Some of the methods return a Mono<Void>. This is a typical pattern, in multiple layers:
fun workflowStep(things: List<Thing>): Mono<Void> =
    Flux.fromIterable(things)
        .flatMap { thing -> doSomethingTo(thing) }
        .collectList()
        .then()
Let's say doSomethingTo() returns a Mono<Void> (it writes something to the database, sends a message, etc.). If I just replace it with Mono.empty() then everything works as expected, but otherwise it doesn't. More specifically, the Mono never completes: it runs through all the processing but misses the termination signal at the end. So the things are actually written to the database, messages are actually sent, etc.
To prove that the lack of termination is the problem, here is a hack that works:
val hackedDelayedMono = Mono.empty<Void>().delayElement(Duration.ofSeconds(1))
return Mono.first(
    workflowStep(things),
    hackedDelayedMono
)
The question is, what can I do with a Mono that never completes, to figure out what's going on? There is nowhere I could put a logging statement or a breakpoint, because:
there are no errors
there are no signals emitted
How could I check what the Mono is waiting for to be completed?
ps. I could not reproduce this behaviour outside the application, with simple Mono workflows.
You can trace and log events in your reactive stream by using the log() operator. This is useful for gaining a better understanding of what events are occurring within your app.
Flux.fromIterable(things)
    .flatMap(thing -> doSomethingTo(thing))
    .log()
    .collectList()
    .then()
Chained inside a sequence, it peeks at every event of the Flux or Mono
upstream of it (including onNext, onError, and onComplete as well as
subscriptions, cancellations, and requests).
Reactor Reference Documentation - Logging a Sequence
The Reactor reference documentation also contains other helpful advice for debugging a reactive stream and can be found here: Debugging Reactor
(We managed to fix the problem - it was not directly in the code I was working on, but for some reason my changes triggered it. I still don't understand the root cause, but higher up the chain we found a Mono.zip() zipping a Mono<Void>. Although this used to work before, it stopped working at some point. Why is a Mono<Void> even zippable, why don't we get a compiler error, and even worse, why does it work sometimes?)
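For reference, Reactor documents that Mono.zip completes empty as soon as any of its sources completes without emitting a value, and a Mono<Void> by definition never emits one, so the combinator is never invoked. A small sketch of that documented behaviour (Mono.empty() stands in for the Mono<Void>):

import reactor.core.publisher.Mono;

Mono<String> zipped = Mono
        .zip(Mono.just("value"), Mono.empty())   // the second source completes without emitting
        .map(tuple -> tuple.getT1());            // never reached

zipped.subscribe(
        v -> System.out.println("next: " + v),         // never called
        e -> System.out.println("error: " + e),        // never called
        () -> System.out.println("completed empty"));  // the only signal you get

That empty completion silently skips any downstream mapping, which can easily look like a step doing nothing.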
To answer my own question here, the tool used for debugging was adding the following to all Monos in the chain, until it didn't produce any output:
mono.doOnEach { x ->
        logger.info("signal: ${x}")
    }
    .then(Mono.defer {
        logger.info("then()")
        Mono.empty<Void>()
    })
I also experimented with .log() - also a fine tool, but maybe too detailed, and it is not very easy to tell which Mono produces which log messages, as these are logged with dynamic scope, not the lexical scope that the above method gives you unambiguously.

Speeding up Message Hub Kafka Java Console sample

I've been working with the Message Hub sample code found at this link: https://github.com/ibm-messaging/message-hub-samples
In particular, I've been trying to increase the throughput of the producer with the Kafka Java console example. I noticed the documentation in this snippet of code:
// Synchronously wait for a response from Message Hub / Kafka on every message produced.
// For high throughput the future should be handled asynchronously.
RecordMetadata recordMetadata = future.get(5000, TimeUnit.MILLISECONDS);
producedMessages++;
I've already turned off the thread sleep found later in the code which also helped increase the throughput, but I was hoping I could get some help on implementing the future asynchronously in this block. Thanks in advance!
You have two basic options for handling the outcome of a produce request asynchronously:
1) use the overloaded send with a completion callback argument, which will be invoked asynchronously:
public Future<RecordMetadata> send(ProducerRecord<K, V> record, Callback callback);
If you use the callback, you can ignore the returned future (see the sketch below).
2) pass the Future to some other thread you have created, and have it inspect the future for completion, while leaving the thread that calls send free to carry on.
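A minimal sketch of option 1, assuming the producer and record variables from the sample (only the callback handling is new here):

import org.apache.kafka.clients.producer.RecordMetadata;

// The send returns immediately; the callback runs later on the producer's I/O thread
// once the broker acknowledges (or rejects) the record.
producer.send(record, (RecordMetadata metadata, Exception exception) -> {
    if (exception != null) {
        exception.printStackTrace();          // delivery failed: log, retry, or record the failure
    } else {
        // delivery succeeded: metadata carries topic, partition and offset
        System.out.printf("Sent to %s-%d at offset %d%n",
                metadata.topic(), metadata.partition(), metadata.offset());
    }
});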

spring integration message released twice from aggregator

I have a Spring Integration flow that starts with an inbound channel adapter, picks up files, and passes them through the system as messages.
After a few components, the messages are aggregated at an "Aggregator", from where they are released based on release strategies or by a group timeout of 30 seconds.
The downstream processing has another set of components up to the final one.
The problem I am facing is this:
When I send 33 files, which create 33 "groups/buckets" based on correlation IDs and are aggregated at the "Aggregator", some of the files or messages seem to be "released" twice. I conclude that because I have a channel interceptor which shows a few messages passing through the "released" channel (right after the aggregator) a second time, after having completed the downstream processing successfully the first time. Additionally, this behavior causes my application to not find a file and throw an exception, which I can see. This leads me to conclude that the message bucket/group/corrID is somehow being "released" twice.
I have tried to debug this in many ways, but essentially I want to know how a corrID/bucket, after being released and having successfully gone through all downstream components in a single thread, can be "released" again.
My question is, how can I debug this? I want to know what is making this message/bucket re-appear in the aggregator.
My aggregator is as follows:
<int:aggregator id="bufferedFiles" input-channel="inQueueForStage"
output-channel="released" expire-groups-upon-completion="true"
send-partial-result-on-expiry="true" release-strategy="releaseHandler"
release-strategy-method="canRelease"
group-timeout-expression="size() > 0 ? T(com.att.datalake.ifr.loader.utils.MessageUtils).getAggregatorTimeout(one, #sourceSnapshot) : -1">
<int:poller fixed-delay="${files.pickup.delay:3000}"
max-messages-per-poll="${num.files.pickup.per.poll:10}"
task-executor="executor" />
</int:aggregator>
Explanation of the aggregator: the size() > 0 applies to EACH correlation bucket. Each of the 33 files I am sending will create a new bucket because of the file name, so the aggregator will have 33 buckets/groups/corrIds, and each bucket will contain only one file.
So the aggregator SpEL expression simply says that if no release strategy has fired, release the bucket/group after 30 seconds, provided the group has at least one file.
My Channel inbound adapter is as follows:
<int-file:inbound-channel-adapter id="files"
channel="dispatchFiles" directory="${source.dir}" scanner="directoryScanner">
<int:poller fixed-delay="${files.pickup.delay:3000}"
max-messages-per-poll="${num.files.pickup.per.poll:10}" />
</int-file:inbound-channel-adapter>
Logs
Here is the log of the message completing the flow the first time. The completion time suggests it reached the last component, a "completionHandler" SA.
Explanation of the log: "cor" is the bucket/corrId that is being released twice. The reason I get the final exception is that the first time around, the file is removed from its original location and processed. So the second time, when this erroneous release happens, there is nothing left to process.
From the pictures it can be seen that the first batch/corrId/bucket is processed and finished around 11:09, and the second one is started around 11:10
An important point I noticed is that this behavior only happens when I have a global channel interceptor in which I am doing somewhat long processing. When this interceptor is commented out, the errors go away.
Question:
Is it possible for the aggregator to double-release a batch/corrId under any circumstance? How can I make the aggregator emit any logs?
Thanks
Edit 10:15pm
The channel following the aggregator has an interceptor as follows:
public Message<?> preSend(Message<?> message, MessageChannel channel) {
    LOGGER.info("******** Releasing from aggregator(interceptor) , corrID:{} at time:{} ********",
            MessageUtils.getCorrelationId(message), new Date());
    finalReporter.callback(channel.toString(), message);
    return message;
}
From the aggregator down to the final completionHandler SA, I have single-threaded processing:
Aggregator -> releasedChannel -> some SA1 -> some channel -> ..... -> completionChannel->completeSA
When I run with 33 partitions, let's follow corrId = "alh". The first time it is released, it looks like the following:
What it shows is that thread-5 released it and should process all the downstream components. But it leaves off mid-way, starts doing other things, and the message is picked up again by a different thread a little later, as follows:
That seems/seemed to be the problem.
Solution Update:
I did the following 3 things to work around it, for the moment:
For some reason, my interceptors were doing return super.preSend(message, channel) instead of simply return message. I changed it to the latter.
I had global channel interceptors; I removed the global ones and kept individual ones.
If the channel interceptors had any issues before returning, would that cause a new release?
Although I still see the scenario depicted in the pictures above, I am no longer getting double processing attempts, and as such the errors are avoided. I am still trying to make sense of this.
I understand it's too specific and difficult to explain; still thanks for the time and comments...
However, yes, I think @GaryRussell is right: since you use expire-groups-upon-completion="true", some partial groups may be released by the group-timeout-expression, and new messages with the same correlationId will then form a new group, which is released by the next group timeout. Your size() > 0 isn't good either: it means the partial group will be released after that group timeout. Maybe size() > 1? The group can't be size() == 0 anyway, because it is created on the first message, so if the group exists it contains at least one message. Yes, a group can become empty, but in that case the aggregator should be marked with expire-groups-upon-completion="false"; the group is then marked as completed and does not accept new messages.
After struggling with debugging and various blind scenarios, I believe that at least I have a workaround and a possible root cause. I will try to outline all the things that I modified,
Root Cause:
My interceptors were calling a common class with a shared callback method. This method, based on the channel name the request was coming from, would decide the appropriate action to take. The actions were essentially collecting data, incrementing counters and persisting some information to the database.
It seems that some of them were throwing errors, and consequently the thread was dying and the message was being re-released. I am not entirely sure about this, so please correct me if that's not the case.
But after I fixed those errors, the re-release issue seems to have subsided or vanished altogether.
The reason it was hard to diagnose was that I could not see the errors thrown during the callback invocations; maybe I was catching them, or maybe they were lost.
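One way to make such swallowed errors visible, and to keep them from disturbing the flow, is to make the interceptor defensive. A hedged sketch along the lines of the interceptor shown earlier (the ReportingInterceptor class name and the Reporter type are illustrative; the try/catch is my addition, not the original code):

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.messaging.Message;
import org.springframework.messaging.MessageChannel;
import org.springframework.messaging.support.ChannelInterceptorAdapter;

public class ReportingInterceptor extends ChannelInterceptorAdapter {

    private static final Logger LOGGER = LoggerFactory.getLogger(ReportingInterceptor.class);

    private final Reporter finalReporter;   // illustrative type for the reporting collaborator

    public ReportingInterceptor(Reporter finalReporter) {
        this.finalReporter = finalReporter;
    }

    @Override
    public Message<?> preSend(Message<?> message, MessageChannel channel) {
        try {
            finalReporter.callback(channel.toString(), message);   // side effect only
        } catch (Exception e) {
            // Never let a reporting failure propagate into the channel/aggregator machinery.
            LOGGER.error("Reporting callback failed", e);
        }
        return message;   // always hand the original message on unchanged
    }
}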
I also found that the issue only occurred with channel interceptors AFTER the aggregator. Interceptors before the aggregator did not present any issues; maybe because they were simpler...
To debug:
I removed the interceptors and made the callbacks directly from the various components (SAs), removed the global interceptors, and added individual interceptors for specific channels.
Thanks for all the help.

WinJS Promise async test

I'm developing a WinJS/HTML Windows Store application.
I have to run some tests periodically, so let me explain my need.
When I navigate to my specific page, I have to keep testing a condition (there is no specific time known in advance, so it is effectively a loop).
When my condition is verified, it will render a Flyout (popup) and then exit from the Promise. (setTimeout needs a specific time, but I need to verify periodically.)
I read MSDN but I can't fulfill this goal.
If someone has an idea how to do it, I will be thankful.
Any help will be appreciated.
setInterval can be used.
var timerId = setInterval(function () {
    // Do your periodic check here; when the condition is met,
    // show the flyout and then call clearInterval(timerId) to stop polling.
}, 2000); // timer event every 2s

// Invoke this when the timer needs to be stopped or you move out of the page, i.e. in the unload() method:
clearInterval(timerId);
Instead of polling at fixed intervals, you should check whether you can adapt your code to use events or databinding instead.
In WinJS you can use databinding to bind input values to a view model and then check in its setter functions if your condition has been fulfilled.
Generally speaking, setInterval et al should be avoided for anything that's not really time-related domain logic (clocks, countdowns, timeouts or such). Of course there are situations when there's no other way (like polling remote services), so this may not apply to your situation at hand.

What is considered overloading the main thread?

I am displaying information from a data model on a user interface. My current approach to doing so is by means of delegation as follows:
@protocol DataModelDelegate <NSObject>
- (void)updateUIFromDataModel;
@end
I am implementing the delegate method in my controller class as follows, using GCD to push the UI updating to the main thread:
- (void)updateUIFromDataModel {
    dispatch_async(dispatch_get_main_queue(), ^{
        // Code to update various UI controllers
        // ...
        // ...
    });
}
What I am concerned about is that in some situations, this method can be called very frequently (~1000 times per second, each updating multiple UI objects), which to me feels very much like I am 'spamming' the main thread with commands.
Is this too much to be sending to the main thread? If so does anyone have any ideas on what would be the best way of approaching this?
I have looked into dispatch_apply, but that appears to be more useful when coalescing data, which is not what I am after - I really just want to skip updates if they are too frequent so only a sane amount of updates are sent to the main thread!
I was considering taking a different approach and implementing a timer instead to constantly poll the data, say every 10 ms, however since the data updating tends to be sporadic I feel that it would be wasteful to do so.
Combining both approaches, another option I have considered would be to wait for an update message and respond by setting the timer to poll the data at a set interval, and then disabling the timer if the data appears to have stopped changing. But would this be over-complicating the issue, and would the sane approach be to simply have a constant timer running?
edit: Added an answer below showing the adaptations using a dispatch source
One option is to use a Dispatch Source with type DISPATCH_SOURCE_TYPE_DATA_OR which lets you post events repeatedly and have libdispatch combine them together for you. When you have something to post, you use dispatch_source_merge_data to let it know there's something new to do. Multiple calls to dispatch_source_merge_data will be coalesced together if the target queue (in your case, the main queue) is busy.
I have been experimenting with dispatch sources and got it working as expected now - Here is how I have adapted my class implementation in case it is of use to anyone who comes across this question:
@implementation AppController {
@private
    dispatch_source_t _gcdUpdateUI;
}
- (void)awakeFromNib {
    // Added the following code to set up the dispatch source event handler:
    _gcdUpdateUI = dispatch_source_create(DISPATCH_SOURCE_TYPE_DATA_ADD, 0, 0,
                                          dispatch_get_main_queue());
    dispatch_source_set_event_handler(_gcdUpdateUI, ^{
        // For each UI element I want to update, pull data from model object:
        // For testing purposes - print out a notification:
        printf("Data Received. Messages Passed: %ld\n",
               dispatch_source_get_data(_gcdUpdateUI));
    });
    dispatch_resume(_gcdUpdateUI);
}
And now in the delegate method I have removed the call to dispatch_async, and replaced it with the following:
- (void)updateUIFromDataModel {
    dispatch_source_merge_data(_gcdUpdateUI, 1);
}
This is working absolutely fine for me. Now, even during the most intense data updating, the UI stays perfectly responsive.
Although the printf() output was a very crude way of checking whether the coalescing is working, a quick scroll back up the console output showed me that the majority of the message printouts had a value of 1 (easily 98% of them); however, there were intermittent jumps to around 10-20, reaching a peak of just over 100 coalesced messages around the time when the model was sending the most update messages.
Thanks again for the help!
If the app beach-balls under heavy load, then you've blocked the main thread for too long and you need to implement a coalescing strategy for UI updates. If the app remains responsive to clicks, and doesn't beach-ball, then you're fine.
