Sequential execution of Reactive tasks in reactor Java - spring-boot

I'm working on converting a blocking sequential orchestration framework to reactive. Right now, these tasks are dynamic and are fed into the engine by a JSON input. The engine pulls classes and executes the run() method and saves the state with the responses from each task.
How do I achieve the same chaining in Reactor? If this were a static DAG, I would have chained it with flatMap or then operators, but since it is dynamic, how do I proceed with executing a reactive task and collecting the output from each task?
Examples:
Non reactive interface:
public interface OrchestrationTask {
    OrchestrationContext run(IngestionContext ctx);
}
Core Engine
public Status executeDAG(String id) {
    IngestionContext ctx = ContextBuilder.getCtx(id);
    List<OrchestrationTask> tasks = app.getEligibleTasks(id);
    for (OrchestrationTask task : tasks) {
        // Eligible tasks are executed sequentially and results are collected.
        OrchestrationContext stepContext = task.run(ctx);
        if (!evaluateResult(stepContext)) break;
    }
    return Status.SUCCESS;
}
Following the above example, if I convert the tasks to return Mono<?>, how do I wait for, or chain, other tasks so that they operate on the result of the previous tasks?
Any help is appreciated. Thanks.
Update:
Reactive Task example.
public class SampleTask implements OrchestrationTask {
    @Override
    public Mono<OrchestrationContext> run(OrchestrationContext context) {
        // I'm simulating a delay here. Treat this as a long-running task (web call), but the next task needs the response from the call below.
        return Mono.just(context).delayElement(Duration.ofSeconds(2));
    }
}
So I will have a series of tasks that accomplish various things, but each task depends on the response of the previous one, which is stored in the OrchestrationContext. Whenever an error occurs, the orchestration context flag is set to false and the flux should stop.

Sure, we can:
Create the flux from the task list (if it's appropriate to generate the task list reactively then you can replace that arraylist with the flux directly, if not then keep as-is);
flatMap() each task to your task.run() method (which as per the question now returns a Mono);
Ensure we only consume elements while evaluateResult() is true;
...then finally just return the SUCCESS status as before.
So putting all that together, just replace your loop & return statement with:
Flux.fromIterable(tasks)
    .flatMap(task -> task.run(ctx))
    .takeWhile(stepContext -> evaluateResult(stepContext))
    .then(Mono.just(Status.SUCCESS));
(Since we've made it reactive, your method will obviously need to return a Mono<Status> rather than just Status too.)
Update as per the comment - if you just want this to execute "one at a time" rather than with multiple concurrently, you can use concatMap() instead of flatMap().
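The snippet above passes the same ctx to every task. If each task should instead receive the context produced by the previous one, the task list can be folded into a single chain. The following is a minimal sketch of that idea, not the answerer's code: to keep it dependency-free it uses the JDK's CompletableFuture, where thenCompose plays the role that flatMap plays on a Mono. The Ctx class, the step helper and the demo method are hypothetical names invented for illustration.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

class SequentialChain {
    // Hypothetical stand-in for OrchestrationContext: a success flag plus a trail of executed steps.
    static final class Ctx {
        final boolean ok;
        final String trail;
        Ctx(boolean ok, String trail) { this.ok = ok; this.trail = trail; }
    }

    // Folds a dynamic task list into one chain; each task receives the previous task's context.
    static CompletableFuture<Ctx> runAll(List<Function<Ctx, CompletableFuture<Ctx>>> tasks, Ctx start) {
        CompletableFuture<Ctx> chain = CompletableFuture.completedFuture(start);
        for (Function<Ctx, CompletableFuture<Ctx>> task : tasks) {
            // Once a step clears the flag, the remaining tasks are skipped.
            chain = chain.thenCompose(ctx -> ctx.ok ? task.apply(ctx) : CompletableFuture.completedFuture(ctx));
        }
        return chain;
    }

    // Hypothetical async task: appends its name to the trail and reports success or failure.
    static Function<Ctx, CompletableFuture<Ctx>> step(String name, boolean succeeds) {
        return ctx -> CompletableFuture.supplyAsync(() -> new Ctx(succeeds, ctx.trail + name));
    }

    static String demo() {
        Ctx result = runAll(List.of(step("A", true), step("B", false), step("C", true)),
                            new Ctx(true, "")).join();
        return result.trail + ":" + result.ok; // "AB:false": C never runs because B cleared the flag
    }
}
```

In Reactor terms the same fold would presumably start from Mono.just(ctx) and apply flatMap once per task inside the loop; nothing runs until the final Mono is subscribed.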

Related

Kotlin coroutines running sequentially even with keyword async

Hi guys, I'm trying to improve the performance of some computation in my system. Basically, I want to generate a series of actions based on some data. This doesn't scale well, so I want to try doing it in parallel and collecting the result afterwards (a bit like how futures work).
I have an interface with a series of implementations that each get a collection of actions, and I want to call all of them in parallel and await the results at the end.
The issue is that when I view the logs, it's clearly doing this sequentially, waiting on each action getter before going to the next one. I thought async would run these asynchronously, but it doesn't.
The method containing the runBlocking call runs within a Spring transaction. Maybe that has something to do with it.
runBlocking {
    val actions = actionsReportGetters.map { actionReportGetter ->
        async {
            getActions(actionReportGetter, abstractUser)
        }
    }.awaitAll().flatten()
    allActions.addAll(actions)
}
private suspend fun getActions(actionReportGetter: ActionReportGetter, traderUser: TraderUser): List<Action> {
    return actionReportGetter.getActions(traderUser)
}
interface ActionReportGetter {
    fun getActions(traderUser: TraderUser): List<Action>
}
Looks like you are doing some blocking operation in ActionReportGetter.getActions in a single threaded environment (probably in the main thread).
For such IO operations you should launch your coroutines in Dispatchers.IO which provides a thread pool with multiple threads.
Update your code to this:
async(Dispatchers.IO) { // Switch to IO dispatcher
    getActions(actionReportGetter, abstractUser)
}
Also getActions need not be a suspending function here. You can remove the suspend modifier from it.
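The underlying point, that blocking calls only overlap when several threads are available to run them, is not specific to coroutines. As a rough Java illustration (the class and method names are hypothetical, and a fixed thread pool stands in for a dispatcher), the sketch below runs four blocking jobs on pools of different sizes and reports how many threads actually did the work.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class BlockingFanOut {
    // Runs four blocking jobs on a pool of the given size and counts the threads that executed them.
    static int distinctThreads(int poolSize) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        Set<String> threads = ConcurrentHashMap.newKeySet();
        List<Callable<Void>> jobs = new ArrayList<>();
        for (int i = 0; i < 4; i++) {
            jobs.add(() -> {
                threads.add(Thread.currentThread().getName());
                Thread.sleep(50); // simulated blocking I/O
                return null;
            });
        }
        try {
            pool.invokeAll(jobs); // returns once every job has finished
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
        return threads.size();
    }
}
```

With a pool of one, all four jobs queue behind each other on the same thread, which mirrors async inside a plain runBlocking; with a pool of four, each job gets its own thread.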

How to enforce only 1 subscriber per multiple instances of a same Single/Observable?

I have this sync modeled as a Single, and only 1 sync can be running at a time.
I'm trying to subscribe the "job" on Schedulers.single(), which mostly works, but inside the chain there are scheduler hops (to the db writes scheduler), which unblock the natural queue created by single().
Then I looked at flatMap(maxConcurrency=1), but this won't work, as that always requires the same instance, i.e. from what I understand, some sort of Subject of sync requests, which is not composable because my use case mostly looks like this:
fun someAction1AndSync(): Single<Unit> {
    return someAction1()
        .flatMap { sync() }
}
fun someAction2AndSync(): Single<Unit> {
    return someAction2()
        .flatMap { sync() }
}
...
As you can see, these are separate sync Single instances :/
Also note that someActionXAndSync should not emit until the sync is also done.
Basically, I'm looking for an equivalent of the coroutines Semaphore.
I can think of three ways:
use a single thread for whole sync operation (decoupling through queue)
use a semaphore to protect the sync method from being entered multiple times (not recommended, because it will block the caller)
fast return, when sync is in progress (AtomicBoolean)
There might be other solutions that I am not aware of.
fast return, when sync is in progress
Also note someActionXAndSync should not emit until the sync is also done
This solution will not queue up sync requests, but will fail fast. The caller must handle the error appropriately.
SyncService
class SyncService {
    val isSync: AtomicBoolean = AtomicBoolean(false)
    fun sync(): Completable {
        return if (isSync.compareAndSet(false, true)) {
            Completable.fromCallable { "" }.doOnEvent { isSync.set(false) }
        } else {
            Completable.error(IllegalStateException("whatever"))
        }
    }
}
Handling
When a sync is already in progress, you will receive an onError. This must be handled somehow, because the onError will be emitted to the subscriber. Either you are fine with that, or you can simply ignore it with onErrorComplete:
fun someAction1AndSync(): Completable {
    return Single.just("")
        .flatMapCompletable {
            sync().onErrorComplete()
        }
}
use a single thread for whole sync operation
You have to make sure that the whole sync process runs as a single job. When the sync process is composed of multiple reactive steps on other threads, another sync process could be started while one is already in progress.
How?
You need a scheduler with exactly one thread. Every sync invocation must be made from that scheduler, and the sync operation must run to completion synchronously within that one job.
I would use this:
fun Observable<Unit>.forceOnlyOneSubscriber(): Observable<Unit> {
    val subscriberCount = AtomicInteger(0)
    return doOnSubscribe { subscriberCount.incrementAndGet() }
        .doFinally { subscriberCount.decrementAndGet() }
        .doOnSubscribe { check(subscriberCount.get() <= 1) }
}
You can always generify Unit using generics if you need.
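Neither answer shows the queueing option (requests wait their turn without blocking the caller), which is closest to what the question asks for. One way to sketch it, offered here as an assumption rather than as either answerer's method, is to chain every new sync onto the completion of the previous one. The example uses the JDK's CompletableFuture instead of RxJava to stay dependency-free; SerialSync, enqueue and maxConcurrentSyncs are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

class SerialSync {
    // Tail of the queue: completes when the most recently enqueued sync has finished.
    private CompletableFuture<Void> tail = CompletableFuture.completedFuture(null);

    // Starts the job only after every previously enqueued job is done; never blocks the caller.
    synchronized CompletableFuture<Void> enqueue(Supplier<CompletableFuture<Void>> job) {
        CompletableFuture<Void> next = tail.thenCompose(ignored -> job.get());
        tail = next.handle((result, error) -> null); // a failed sync must not stall the queue
        return next;
    }

    // Enqueues three overlapping sync requests and reports the highest number seen running at once.
    static int maxConcurrentSyncs() {
        SerialSync queue = new SerialSync();
        AtomicInteger running = new AtomicInteger();
        AtomicInteger max = new AtomicInteger();
        List<CompletableFuture<Void>> pending = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            pending.add(queue.enqueue(() -> CompletableFuture.runAsync(() -> {
                max.accumulateAndGet(running.incrementAndGet(), Math::max);
                try { Thread.sleep(20); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
                running.decrementAndGet();
            })));
        }
        pending.forEach(CompletableFuture::join);
        return max.get();
    }
}
```

The future returned by enqueue completes only when that particular sync is done, so a caller composing "action, then sync" does not emit early, and the job may hop threads internally without letting the next sync start.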

Reactor Flux conditional emit

Is it possible to allow emitting values from a Flux conditionally based on a global boolean variable?
I'm working with Flux delayUntil(...) but not able to fully grasp the functionality or my assumptions are wrong.
I have a global AtomicBoolean that represents the availability of a downstream connection and only want the upstream Flux to emit if the downstream is ready to process.
To represent the scenario, I created a (not working) test sample:
//Randomly generates a boolean value every 5 seconds
private Flux<Boolean> signalGenerator() {
    return Flux.range(1, Integer.MAX_VALUE)
        .delayElements(Duration.ofMillis(5000))
        .map(integer -> new Random().nextBoolean());
}
and
Flux.range(1, Integer.MAX_VALUE)
    .delayElements(Duration.ofMillis(1000))
    .delayUntil(evt -> signalGenerator()) // ?? Only proceed when signalGenerator returns true
    .subscribe(System.out::println);
I have another scenario where a downstream process can accept only x messages a second. In the current non-reactive implementation we have a Semaphore of x permits and the thread is blocked if no more permits are available, with Semaphore permits resetting every second.
In both scenarios I want upstream Flux to emit only when there is a demand from the downstream process, and I do not want to Buffer.
You might consider using Mono.fromRunnable() as an input to delayUntil() like below;
Helper class;
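import java.util.concurrent.CountDownLatch;
import reactor.core.publisher.Mono;

public class FluxCondition {
    private final CountDownLatch latch = new CountDownLatch(10); // it depends, might be managed somehow
    private final Runnable r = () -> {
        try { latch.await(); } // blocks the subscribing thread until the count reaches zero
        catch (InterruptedException e) { Thread.currentThread().interrupt(); }
    };
    public Mono<Void> lock() { return Mono.fromRunnable(r); } // delayUntil() needs a Publisher
    public void release() { latch.countDown(); }
}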
Usage;
FluxCondition delayCondition = new FluxCondition();
Flux.range(1, 10).delayUntil(o -> delayCondition.lock()).subscribe();
.....
delayCondition.release(); // shall call this for each element
I guess there might be a better solution using sink.emitNext, but this might also require a condition variable for controlling the Flux flow.
According to my understanding, in reactive programming your data should be considered at every operator step. So it might be better to design your consumer as a reactive processor. In my case I had no choice and followed the approach described above.
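The permit-based scenario from the question (a downstream that accepts only x messages, after which the producer must wait) can be sketched with a plain Semaphore. Like the latch above, this blocks a thread and is therefore a non-reactive illustration of demand gating, not a Reactor solution; PermitGate and emitWithPermits are hypothetical names.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Semaphore;

class PermitGate {
    // Emits 1..items, but only as many as the downstream has granted permits for.
    static List<Integer> emitWithPermits(int permits, int items) {
        Semaphore demand = new Semaphore(0);
        List<Integer> emitted = new CopyOnWriteArrayList<>();
        CountDownLatch granted = new CountDownLatch(Math.min(permits, items));
        Thread producer = new Thread(() -> {
            for (int i = 1; i <= items; i++) {
                try {
                    demand.acquire(); // wait until the downstream signals demand
                } catch (InterruptedException e) {
                    return; // stop when the pipeline is torn down
                }
                emitted.add(i);
                granted.countDown();
            }
        });
        producer.start();
        demand.release(permits); // downstream announces how many items it can take
        try {
            granted.await();      // every granted item has been emitted
            producer.interrupt(); // tear down; no further permits will arrive
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return emitted;
    }
}
```

Nothing is buffered here: an item is produced only after a permit has been acquired, so with 3 permits and 10 candidate items exactly the first 3 are emitted.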

Mono returned by ServerRequest.bodyToMono() method not extracting the body if I return ServerResponse immediately

I am using web reactive in Spring WebFlux. I have implemented a handler function for a POST request, and I want the server to return immediately. So I have implemented the handler as below:
public class Sample implements HandlerFunction<ServerResponse>{
    public Mono<ServerResponse> handle(ServerRequest request) {
        Mono bodyMono = request.bodyToMono(String.class);
        bodyMono.map(str -> {
            System.out.println("body got is " + str);
            return str;
        }).subscribe();
        return ServerResponse.status(HttpStatus.CREATED).build();
    }
}
But the print statement inside the map function is not getting called. It means the body is not getting extracted.
If I do not return the response immediately and use
return bodyMono.then(ServerResponse.status(HttpStatus.CREATED).build())
then the map function is getting called.
So, how can I do processing on my request body in the background?
Please help.
EDIT
I tried using flux.share() like below -:
Flux<String> bodyFlux = request.bodyToMono(String.class).flux().share();
Flux<String> processFlux = bodyFlux.map(str -> {
    System.out.println("body got is");
    try{
        Thread.sleep(1000);
    }catch (Exception ex){
    }
    return str;
});
processFlux.subscribeOn(Schedulers.elastic()).subscribe();
return bodyFlux.then(ServerResponse.status(HttpStatus.CREATED).build());
In the above code, sometimes the map function is getting called and sometimes not.
As you've found, you can't just arbitrarily subscribe() to the Mono returned by bodyToMono(), since in that case the body simply doesn't get passed into the Mono for processing. (You can verify this by putting a single() call in that Mono, it'll throw an exception since no element will be emitted.)
So, how can I do processing on my request body in the background?
If you really still want to just use reactor to do a long task in the background while returning immediately, you can do something like:
return request.bodyToMono(String.class).doOnNext(str -> {
    Mono.just(str).publishOn(Schedulers.elastic()).subscribe(s -> {
        System.out.println("proc start!");
        try {
            Thread.sleep(1000);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        System.out.println("proc end!");
    });
}).then(ServerResponse.status(HttpStatus.CREATED).build());
This approach immediately publishes the emitted element to a new Mono, set to publish on an elastic scheduler, that is then subscribed in the background. However, it's kind of ugly, and it's not really what reactor is designed to do. You may be misunderstanding the idea behind reactor / reactive programming here:
It's not written with the idea of "returning a quick result and then doing stuff in the background"; that's generally the purpose of a work queue, often implemented with something like RabbitMQ or Kafka. Its "raison d'être" is instead to be non-blocking, so a single thread is never idly blocked, waiting for something else to complete.
The map() method isn't designed for side effects, it's designed to transform each object into another. For side effects, you want doOnNext() instead;
Reactor uses a single thread by default, so your "additional processing" in your map() method would still block that thread.
If your application is for anything more than quick demo purposes, and/or you need to make heavy use of this pattern, then I'd seriously consider setting up a proper work queue instead.
This is not possible.
Web servers (including Reactor Netty, Tomcat, etc) clean up and recycle resources when request processing is done. This means that when your controller handler is done, the HTTP resources, the request itself, reusable buffers, etc are recycled or closed. At that point, you cannot read from the request body anymore.
In your case, you need to read and buffer the whole request body first, then return a response and kick off a task for processing that request in a separate execution.
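The order of operations described here (buffer the body, hand it to separate work, then respond) can be sketched without WebFlux. The example below is a dependency-free illustration using the JDK's CompletableFuture in place of Mono; BufferThenRespond, handle and the integer status are hypothetical stand-ins for a real handler and ServerResponse.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Consumer;

class BufferThenRespond {
    // Waits for the fully buffered body, hands it to background work, then answers with 201.
    static CompletableFuture<Integer> handle(CompletableFuture<String> bufferedBody, Consumer<String> work) {
        return bufferedBody.thenApply(text -> {
            CompletableFuture.runAsync(() -> work.accept(text)); // fire and forget on another thread
            return 201;
        });
    }

    static String demo() {
        CompletableFuture<String> processed = new CompletableFuture<>();
        Consumer<String> slowWork = text -> {
            try { Thread.sleep(50); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
            processed.complete(text.toUpperCase());
        };
        // The status is available as soon as the body is buffered; the slow work finishes later.
        int status = handle(CompletableFuture.completedFuture("hello"), slowWork).join();
        return status + ":" + processed.join();
    }
}
```

The background work receives a copy of the body that no longer depends on the request, which is why it can outlive the response. Work started this way is lost if the process stops, which is the argument for a proper work queue made in the other answer.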

How to implement kind of global try..finally in TPL?

I have an async method that returns a Task. From time to time my process is recycled or restarted, and work is interrupted in the middle of the Task. Is there a more or less general approach in TPL that lets me at least log that the Task was interrupted?
I am hosting in ASP.NET, so I can use IRegisteredObject to cancel tasks with a CancellationToken. I do not like this, however: I would need to pass the CancellationToken into all methods, and I have many of them.
A try..finally in each method does not even seem to fire. ContinueWith does not work either.
Any advice?
I have single place I start my async tasks, however each task can have any number of child tasks. To get an idea:
class CommandRunner
{
    public Task Execute(object cmd, Func<object, Task> handler)
    {
        return handler(cmd).ContinueWith(t =>
        {
            if (t.Status == TaskStatus.Faulted)
            {
                // Handle faults, log them
            }
            else if (t.Status == TaskStatus.RanToCompletion)
            {
                // Audit
            }
        });
    }
}
Tasks don't just get "interrupted" somehow. They always get completed, faulted or cancelled. There is no global hook to find out about those completions. So the only option to do your logging is to either instrument the bodies of your tasks or hook up continuations for everything.
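"Hooking up continuations for everything" amounts to wrapping each task at the single place where it is started and recording which of the three outcomes it reached. The sketch below shows that pattern as a Java analogue using CompletableFuture.whenComplete, since the question's CommandRunner is C#; AuditedRunner and audited are hypothetical names, and the log is a plain list for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CancellationException;
import java.util.concurrent.CompletableFuture;

class AuditedRunner {
    // Attaches one continuation that records how the task ended: completed, faulted or cancelled.
    static <T> CompletableFuture<T> audited(CompletableFuture<T> task, List<String> log) {
        return task.whenComplete((value, error) -> {
            if (error instanceof CancellationException) log.add("cancelled");
            else if (error != null) log.add("faulted");
            else log.add("completed");
        });
    }

    static String demo() {
        List<String> log = new ArrayList<>(); // callbacks run on this thread: the futures are already done
        CompletableFuture<String> ok = CompletableFuture.completedFuture("done");
        CompletableFuture<String> failed = new CompletableFuture<>();
        failed.completeExceptionally(new IllegalStateException("boom"));
        CompletableFuture<String> cancelled = new CompletableFuture<>();
        cancelled.cancel(true);
        audited(ok, log);
        audited(failed, log);
        audited(cancelled, log);
        return String.join(",", log);
    }
}
```

A continuation like this only runs if the task reaches one of those three states while the process is still alive, so it cannot report work that was cut off by the process being torn down.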

Resources