Why would one use runBlocking(IO) instead of just runBlocking in a Spring Boot app

I have a Spring Boot app and, when handling a given request, I need to call upstream services in parallel and wait for all of them to complete before returning their results in my own response.
In the existing code base, I noticed that the pattern for doing so is runBlocking(IO) { ... }:
@Service
class MyUpstreamService {
    fun getSomething() = 1
}

@RestController
class MyController(
    val upstream: MyUpstreamService
) {
    @GetMapping("/foo")
    fun foo() =
        runBlocking(Dispatchers.IO) {
            val a = async { upstream.getSomething() }
            val b = async { upstream.getSomething() }
            a.await() + b.await()
        }
}
This works as expected.
Now, for some reason, I need to set the scope of MyUpstreamService to @RequestScope. If I do so, I get the following exception as soon as I access MyUpstreamService from within the runBlocking(IO) { ... } block:
Caused by: java.lang.IllegalStateException: No thread-bound request found: Are you referring to request attributes outside of an actual web request, or processing a request outside of the originally receiving thread? If you are actually operating within a web request and still receive this message, your code is probably running outside of DispatcherServlet: In this case, use RequestContextListener or RequestContextFilter to expose the current request.
at org.springframework.web.context.request.RequestContextHolder.currentRequestAttributes(RequestContextHolder.java:131) ~[spring-web-5.3.22.jar:5.3.22]
If I do not use the Dispatchers.IO context, then everything works fine.
So the question is why would one use runBlocking(Dispatchers.IO) { .. } instead of just runBlocking { .. } when waiting for several async calls to complete?
For completeness, here is the entire snippet demonstrating the question.
GET /bar works
GET /foo throws the exception
@RequestScope
@Service
class MyUpstreamService(
    // val currentUser: CurrentUser
) {
    fun getSomething() = 1
}

@RestController
class MyController(
    val upstream: MyUpstreamService
) {
    @GetMapping("/foo")
    fun foo() =
        runBlocking(Dispatchers.IO) {
            val a = async { upstream.getSomething() }
            val b = async { upstream.getSomething() }
            a.await() + b.await()
        }

    @GetMapping("/bar")
    fun bar() =
        runBlocking {
            val a = async { upstream.getSomething() }
            val b = async { upstream.getSomething() }
            a.await() + b.await()
        }
}

runBlocking without a particular dispatcher means that all coroutines inside are launched in a special single-threaded event-loop dispatcher backed by the thread that you're blocking. This, in turn, means your coroutines would not run in parallel: they can only interleave at suspension points, so two blocking calls run one after the other.
runBlocking(Dispatchers.IO) means the nested coroutines run on the IO dispatcher, which is backed by a thread pool of dynamic size, so the coroutines effectively run in parallel (within some limit). It is still a runBlocking, though, which means the calling thread is still blocked while waiting for the nested coroutines to complete; it is just not used to do any of the work.
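As a rough analogy only (this is not how coroutine dispatchers are implemented), the difference between a single-threaded event loop and a thread pool can be sketched with plain Java executors. DispatcherAnalogy and threadsUsed are hypothetical names; the sketch just records which threads ran two submitted tasks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class DispatcherAnalogy {
    /** Submits two tasks and returns the names of the distinct threads that ran them. */
    static Set<String> threadsUsed(ExecutorService executor) {
        Callable<String> task = () -> Thread.currentThread().getName();
        try {
            List<Future<String>> futures = new ArrayList<>();
            futures.add(executor.submit(task)); // roughly: val a = async { ... }
            futures.add(executor.submit(task)); // roughly: val b = async { ... }
            Set<String> names = new TreeSet<>();
            for (Future<String> future : futures) {
                names.add(future.get());        // roughly: a.await(), b.await()
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        // One thread: the two tasks can only run one after the other.
        System.out.println(threadsUsed(Executors.newSingleThreadExecutor()).size()); // prints 1
        // Two core threads: each submitted task gets its own thread.
        System.out.println(threadsUsed(Executors.newFixedThreadPool(2)).size());     // prints 2
    }
}
```

The single-thread case corresponds to plain runBlocking { ... }, the pool case to runBlocking(Dispatchers.IO) { ... }; in both cases the caller still waits for the results.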
"for some reasons I need to set the scope of MyUpstreamService to @RequestScope"
When you do this, Spring creates one service instance per request, and it resolves that instance based on the request's thread (using some ThreadLocal machinery, I assume; the RequestContextHolder in your stack trace points that way). As we have just seen, runBlocking without a dispatcher actually uses the calling thread, that is, the request thread, which is why this mechanism still works. If you use runBlocking(IO) and dispatch to other threads, you break this Spring mechanism.
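The thread-bound lookup can be illustrated without Spring. This is a minimal Java sketch, assuming a plain ThreadLocal stands in for whatever holder Spring uses; RequestScopeDemo, CURRENT_REQUEST and the value "request-42" are hypothetical and not part of any Spring API.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class RequestScopeDemo {
    // Stand-in for a thread-bound request holder (hypothetical, not Spring's actual class).
    static final ThreadLocal<String> CURRENT_REQUEST = new ThreadLocal<>();

    /** Reads the value on the calling thread, like a coroutine run by plain runBlocking. */
    static String readOnCallingThread() {
        return CURRENT_REQUEST.get();
    }

    /** Reads the value on a pool thread, like a coroutine dispatched to another thread. */
    static String readOnPoolThread() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Callable<String> lookup = CURRENT_REQUEST::get;
        try {
            return pool.submit(lookup).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        CURRENT_REQUEST.set("request-42"); // bound to the "request thread" only
        System.out.println("calling thread: " + readOnCallingThread()); // calling thread: request-42
        System.out.println("pool thread: " + readOnPoolThread());       // pool thread: null
    }
}
```

The pool thread sees nothing because the value was never bound to it, which is the same situation the "No thread-bound request found" message describes.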
Now, I haven't done Spring dev in a while, so I'm not 100% sure how to fix your problem. But I believe a good start would be to stop using the thread-per-request model if you're using coroutines, and thus switch to suspend functions in your controllers using Spring WebFlux. I think it will still not allow you to use @RequestScope, though, because you would be giving up the "request thread" concept altogether. See https://github.com/spring-projects/spring-framework/issues/28235

Related

Spring Boot WebFlux and suspend functions cannot be evaluated in IntelliJ IDEA (this@HelloController is not captured)

In IntelliJ IDEA, evaluating expressions (Cmd + F8 or Option + click) in suspend functions while debugging normally works just fine.
The problem arises when WebFlux is combined with suspend functions.
Once the function has suspended, you can no longer evaluate the captured variables.
The error states: this@HelloController is not captured
One workaround is to wrap the endpoint body in a coroutineScope.
I'm not sure why this works at all, and I have not tested its performance impact.
I'm looking for a proper way to debug suspend endpoints in IntelliJ with WebFlux and coroutines without having to add this coroutineScope, and for an explanation of why it does not work otherwise.
@RestController
class HelloController(
    val helloService: HelloService,
) {
    @GetMapping("hello")
    suspend fun hello() =
        helloService.hello() // Throws error when evaluated

    @GetMapping("helloCoroutine")
    suspend fun helloCoroutine() = coroutineScope {
        helloService.hello() // Works just fine when evaluated
    }
}

@Service
class HelloService {
    suspend fun hello(): String {
        delay(10)
        return "hello there!"
    }
}
Complete code can be found here: https://github.com/Ch4s3r/webflux_coroutines_test

Implementing SmartLifecycle with a Reactor subscription

Below is code I have for a component that starts a Flux and subscribes to it, all within the constructor of the class. This particular Flux comes from a mongoChangeStreams call. It does not terminate unless there is an error.
I want the subscription to stay alive constantly, so I restart the subscription in the event it terminates due to an error.
It has occurred to me that calling subscribe within a constructor might be a bad idea. I should probably also provide a way to shut down this app gracefully by calling cancel on the subscription during shutdown.
My guess is that I should be implementing SmartLifecycle, but I'm not sure how to do that. Is there a standard way of implementing SmartLifecycle on a component backed by a Flux subscription?
@Component
class SubscriptionManager(
    private val fooFluxProvider: FooFluxProvider, // calling foos() on this returns a Flux of foos
    private val fooProcessor: FooProcessor
) {
    private var subscription: BaseSubscriber<Foo> = subscribe() // called in constructor

    private fun subscribe() = buildSubscriber().also {
        fooFluxProvider.foos().subscribe(it)
    }

    private fun buildSubscriber(): BaseSubscriber<Foo> {
        return object : BaseSubscriber<Foo>() {
            override fun hookOnSubscribe(subscription: Subscription) {
                subscription.request(1)
            }

            override fun hookOnNext(value: Foo) {
                // process the foo
                fooProcessor.process(value) // sync call
                // ask for another foo
                request(1)
            }

            override fun hookOnError(throwable: Throwable) {
                logger.error("Something went wrong, restarting subscription", throwable)
                // restart the subscription. We'll recover if we're lucky
                subscription = subscribe()
            }
        }
    }
}
Instead of creating a Subscriber subclass that resubscribes on exception, chain one of the retry* operators on the Flux before subscribing. The retry operators will resubscribe to the upstream Flux if it completes with an exception. For example, fooFluxProvider.foos().retry() will retry indefinitely. There are other variations of retry* for more advanced behavior, including an extremely customizable retryWhen that can be used with the reactor.retry.Retry class from reactor-extra.
Instead of passing a subscriber to subscribe(subscriber), call one of the subscribe methods that returns a Disposable. This gives you an object on which you can call dispose() later during shutdown to cancel the subscription.
To implement SmartLifecycle:
1. In the constructor (or in start()), create the Flux (but do not subscribe to it in the constructor).
2. In start(), call flux.subscribe() and save the returned Disposable to a member field. The start() method is much better suited for starting background jobs than a constructor. Consider also chaining .subscribeOn(Scheduler) before .subscribe() if you want this to run in the background (by default, the subscription occurs on the thread on which subscribe was called).
3. In stop(), call disposable.dispose().
Perhaps something like this:
import org.slf4j.LoggerFactory
import org.springframework.context.SmartLifecycle
import reactor.core.Disposable
import reactor.core.scheduler.Schedulers

class SubscriptionManager(
    fooFluxProvider: FooFluxProvider, // calling foos() on this returns a Flux of foos
    fooProcessor: FooProcessor
) : SmartLifecycle {
    private val logger = LoggerFactory.getLogger(javaClass)

    private val fooFlux = fooFluxProvider.foos()
        // Subscribe on a parallel scheduler to run in the background
        .subscribeOn(Schedulers.parallel())
        // Publish on a boundedElastic scheduler if fooProcessor.process blocks
        .publishOn(Schedulers.boundedElastic())
        // Use .doOnNext to send the foo to your processor
        // Alternatively use .flatMap/.concatMap/.flatMapSequential if the processor returns a Publisher
        // Alternatively use .map if the processor transforms the foo, and you need to operate on the returned value
        .doOnNext(fooProcessor::process)
        // Log if an exception occurred
        .doOnError { e -> logger.error("Something went wrong, restarting subscription", e) }
        // Resubscribe if an exception occurred
        .retry()
        // Repeat if you want to resubscribe if the upstream flux ever completes successfully
        .repeat()

    private var disposable: Disposable? = null

    @Synchronized
    override fun start() {
        if (!isRunning()) {
            disposable = fooFlux.subscribe()
        }
    }

    @Synchronized
    override fun stop() {
        disposable?.dispose()
        disposable = null
    }

    @Synchronized
    override fun isRunning(): Boolean {
        return disposable != null
    }
}

How to initialize/enable a bean after another process finishes?

The idea is that I would like to first let a @Scheduled method retrieve some data and, only when that process has finished, enable/initialize my @KafkaListener. Currently the Kafka listener starts up immediately without waiting for the scheduler to be done.
I've tried to use @Conditional with a custom Condition, but this is only evaluated on context creation (aka startup). @ConditionalOnBean didn't work either, because my Scheduler bean is already created before it finishes the process.
This is what my setup looks like.
Kafka Listener:
@Service
class KafkaMessageHandler(private val someRepository: SomeRepository) { // SomeRepository is a placeholder type name
    @KafkaListener(topics = ["myTopic"])
    fun listen(messages: List<ConsumerRecord<*, *>>) {
        // filter messages based on data in someRepository
        // Do fancy stuff
    }
}
Scheduler:
@Component
class Scheduler(private val someRepository: SomeRepository) { // SomeRepository is a placeholder type name
    @Scheduled(fixedDelayString = "\${schedule.delay}")
    fun updateData() {
        // Fetch data from API
        // update someRepository with this data
    }
}
Is there any nice Spring way of waiting for the scheduler to finish before initializing the KafkaMessageHandler?

Vert.x: how to process HttpRequest with a blocking operation

I've just started with Vert.x and would like to understand what is the right way of handling potentially long (blocking) operations as part of processing a REST HttpRequest. The application itself is a Spring app.
Here is a simplified REST service I have so far:
public class MainApp {
    // instantiated by Spring
    private AlertsRestService alertsRestService;

    @PostConstruct
    public void init() {
        Vertx.vertx().deployVerticle(alertsRestService);
    }
}

public class AlertsRestService extends AbstractVerticle {
    // instantiated by Spring
    private PostgresService pgService;

    @Value("${rest.endpoint.port:8080}")
    private int restEndpointPort;

    @Override
    public void start(Future<Void> futureStartResult) {
        HttpServer server = vertx.createHttpServer();
        Router router = Router.router(vertx);
        // enable reading of the request body for all routes
        router.route().handler(BodyHandler.create());
        router.route(HttpMethod.GET, "/allDefinitions")
            .handler(this::handleGetAllDefinitions);
        server.requestHandler(router)
            .listen(restEndpointPort,
                result -> {
                    if (result.succeeded()) {
                        futureStartResult.complete();
                    } else {
                        futureStartResult.fail(result.cause());
                    }
                }
            );
    }

    private void handleGetAllDefinitions(RoutingContext routingContext) {
        HttpServerResponse response = routingContext.response();
        Collection<AlertDefinition> allDefinitions;
        try {
            allDefinitions = pgService.getAllDefinitions(); // blocking call
        } catch (Exception e) {
            response.setStatusCode(500).end(e.getMessage());
            return; // the response has already been ended
        }
        response.putHeader("content-type", "application/json")
            .setStatusCode(200)
            .end(Json.encodePrettily(allDefinitions));
    }
}
Spring config:
<bean id="alertsRestService" class="com.my.AlertsRestService"
p:pgService-ref="postgresService"
p:restEndpointPort="${rest.endpoint.port}"
/>
<bean id="mainApp" class="com.my.MainApp"
p:alertsRestService-ref="alertsRestService"
/>
Now the question is: how do I properly handle the (blocking) call to my postgresService, which may take a long time if there are many items to get/return?
After researching and looking at some examples, I see a few ways to do it, but I don't fully understand the differences between them:
Option 1. convert my AlertsRestService into a Worker Verticle and use the worker thread pool:
public class MainApp {
    private AlertsRestService alertsRestService;

    @PostConstruct
    public void init() {
        DeploymentOptions options = new DeploymentOptions().setWorker(true);
        Vertx.vertx().deployVerticle(alertsRestService, options);
    }
}
What confuses me here is this statement from the Vert.x docs: "Worker verticle instances are never executed concurrently by Vert.x by more than one thread, but can [be] executed by different threads at different times"
Does it mean that all HTTP requests to my alertsRestService are going to be, effectively, throttled to be executed sequentially, by one thread at a time? That's not what I would like: this service is purely stateless and should be able to handle concurrent requests just fine ....
So, maybe I need to look at the next option:
Option 2. convert my service to be a multi-threaded Worker Verticle, by doing something similar to the example in the docs:
public class MainApp {
    private AlertsRestService alertsRestService;

    @PostConstruct
    public void init() {
        DeploymentOptions options = new DeploymentOptions()
            .setWorker(true)
            .setInstances(5) // matches the worker pool size below
            .setWorkerPoolName("the-specific-pool")
            .setWorkerPoolSize(5);
        Vertx.vertx().deployVerticle(alertsRestService, options);
    }
}
So, in this example - what exactly will be happening? As I understand, ".setInstances(5)" directive means that 5 instances of my 'alertsRestService' will be created. I configured this service as a Spring bean, with its dependencies wired in by the Spring framework. However, in this case, it seems to me the 5 instances are not going to be created by Spring, but rather by Vert.x - is that true? and how could I change that to use Spring instead?
Option 3. use the 'blockingHandler' for routing. The only change in the code would be in the AlertsRestService.start() method in how I define a handler for the router:
boolean ordered = false;
router.route(HttpMethod.GET, "/allDefinitions")
.blockingHandler(this::handleGetAllDefinitions, ordered);
As I understand, setting the 'ordered' parameter to FALSE means that the handler can be called concurrently. Does it mean this option is equivalent to Option #2 with multi-threaded Worker Verticles?
What is the difference? Is it that the async multi-threaded execution pertains only to the one specific HTTP request (the one for the /allDefinitions path) as opposed to the whole AlertsRestService Verticle?
Option 4. and the last option I found is to use the 'executeBlocking()' directive explicitly to run only the enclosed code in worker threads. I could not find many examples of how to do this with HTTP request handling, so below is my attempt - maybe incorrect. The difference here is only in the implementation of the handler method, handleGetAllAlertDefinitions() - but it is rather involved... :
private void handleGetAllAlertDefinitions(RoutingContext routingContext) {
    vertx.executeBlocking(
        fut -> { fut.complete(sendAsyncRequestToDB(routingContext)); },
        false,
        res -> { handleAsyncResponse(res, routingContext); }
    );
}

public Collection<AlertDefinition> sendAsyncRequestToDB(RoutingContext routingContext) {
    Collection<AlertDefinition> allAlertDefinitions = new LinkedList<>();
    try {
        // blocking call; runs on a worker thread
        allAlertDefinitions = alertDefinitionsDao.getAllAlertDefinitions();
    } catch (Exception e) {
        routingContext.response().setStatusCode(500)
            .end(e.getMessage());
    }
    return allAlertDefinitions;
}

private void handleAsyncResponse(AsyncResult<Object> asyncResult, RoutingContext routingContext) {
    if (asyncResult.succeeded()) {
        try {
            routingContext.response().putHeader("content-type", "application/json")
                .setStatusCode(200)
                .end(Json.encodePrettily(asyncResult.result()));
        } catch (EncodeException e) {
            routingContext.response().setStatusCode(500)
                .end(e.getMessage());
        }
    } else {
        routingContext.response().setStatusCode(500)
            .end(asyncResult.cause().getMessage());
    }
}
How is this different from the other options? And does Option 4 provide concurrent execution of the handler, or is it single-threaded like Option 1?
Finally, coming back to the original question: what is the most appropriate Option for handling longer-running operations when handling REST requests?
Sorry for such a long post.... :)
Thank you!
That's a big question, and I'm not sure I'll be able to address it fully. But let's try:
Option #1: what the quoted statement actually means is that you shouldn't use ThreadLocal in your worker verticles if you use more than one worker of the same type. Using only one worker means that your requests will be serialised.
Option #2 is simply incorrect. You cannot use setInstances with an instance of a class, only with its name. You're correct, though, that if you choose to deploy by class name, Vert.x will instantiate the verticles itself.
Option #3 is less concurrent than using workers, and I wouldn't use it.
Option #4, executeBlocking, basically does the same thing as Option #3, and is also quite bad.

Run a task in the background using DeferredResult in Spring without freezing the browser client

I have implemented a simple REST service with which I'd like to test DeferredResult from Spring. I am getting the output in this order:
TEST
TEST 1
TEST AFTER DEFERRED RESULT
I would like to know why, in a browser (the client), I need to wait those 8 seconds. Shouldn't DeferredResult be non-blocking and run the task in the background? If not, how can I create a REST service that is non-blocking and runs tasks in the background without using Java 9 and reactive streams?
@RestController("/")
public class Controller {
    @GetMapping
    public DeferredResult<Person> test() {
        System.out.println("TEST");
        DeferredResult<Person> result = new DeferredResult<>();
        CompletableFuture.supplyAsync(this::test1)
            .whenCompleteAsync((res, throwable) -> {
                System.out.println("TEST AFTER DEFERRED RESULT");
                result.setResult(res);
            });
        System.out.println("TEST 1");
        return result;
    }

    private Person test1() {
        try {
            Thread.sleep(8000);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        return new Person("michal", 20);
    }
}

class Person implements Serializable {
    private String name;
    private int age;

    // constructor used by test1()
    Person(String name, int age) {
        this.name = name;
        this.age = age;
    }

    // getters so the object can be serialized to JSON
    public String getName() { return name; }
    public int getAge() { return age; }
}
DeferredResult is a holder for a web request that lets the serving thread be released to handle another incoming HTTP request instead of waiting for the current one's result. Once setResult or setErrorResult is invoked, Spring resumes the stored request and your client receives the response.
DeferredResult is a Spring Framework abstraction for non-blocking request handling.
The DeferredResult abstraction has nothing to do with background tasks. Using it without any threading abstraction results in ordinary same-thread execution. Your test1 method runs in the background because of the CompletableFuture.supplyAsync call, which hands execution to the common pool.
The result is returned after 8 seconds because the callback passed to whenCompleteAsync is called only after test1 returns.
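The ordering can be reproduced with CompletableFuture alone. This is a minimal sketch without Spring, in which CallbackOrderDemo, slowCall and the event strings are hypothetical names, and a short sleep replaces the 8-second one.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;

final class CallbackOrderDemo {
    /** Stand-in for the slow service call (test1 in the question). */
    static String slowCall(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "person";
    }

    /** Records the order in which the handler and the callback run. */
    static List<String> run(long workMillis) {
        List<String> events = new CopyOnWriteArrayList<>();
        events.add("handler started");
        CompletableFuture<String> pending = CompletableFuture
                .supplyAsync(() -> slowCall(workMillis))
                .whenCompleteAsync((result, error) -> events.add("callback got " + result));
        events.add("handler returned"); // the calling thread is free from here on
        pending.join();                 // only this demo waits; a controller would not
        return events;
    }

    public static void main(String[] args) {
        System.out.println(run(500));
    }
}
```

The handler finishes immediately, but the callback (where setResult would be called) is only reached once the slow call has returned, so the client cannot get its response any sooner than that.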
You cannot receive the result immediately when your service call logic takes 8 seconds, even though you are performing it in the background. If you want to release the HTTP request early, return whatever is already available from the controller method (for example, an object containing a UUID with which to fetch the created person later), or nothing at all. The client can then try to GET the created person after N seconds. There is a specific HTTP response code, 202 Accepted, which means the server side is still processing the request. Finally, the client just GETs the created object.
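The "return an id now, fetch the result later" idea could look roughly like the following. This is a sketch under assumptions, not a Spring implementation: JobRegistry, submit and poll are hypothetical names, the result is a plain String, and the mapping to HTTP (202 Accepted for submit, a later GET for poll) is left out.

```java
import java.util.Map;
import java.util.Optional;
import java.util.UUID;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

final class JobRegistry {
    private final Map<String, CompletableFuture<String>> jobs = new ConcurrentHashMap<>();

    /** Starts the work in the background and returns an id immediately (the "202 Accepted" step). */
    String submit(Supplier<String> work) {
        String id = UUID.randomUUID().toString();
        jobs.put(id, CompletableFuture.supplyAsync(work));
        return id;
    }

    /**
     * Returns the result if the job has finished, or empty if the id is unknown
     * or the job is still running (the later "GET" step).
     * A job that failed makes join() throw a CompletionException.
     */
    Optional<String> poll(String id) {
        CompletableFuture<String> job = jobs.get(id);
        if (job == null || !job.isDone()) {
            return Optional.empty();
        }
        return Optional.ofNullable(job.join());
    }

    public static void main(String[] args) {
        JobRegistry registry = new JobRegistry();
        String id = registry.submit(() -> "michal"); // returns right away
        Optional<String> result = registry.poll(id);
        while (!result.isPresent()) {                // a client would retry after a delay instead
            Thread.yield();
            result = registry.poll(id);
        }
        System.out.println(result.get());            // prints michal
    }
}
```

A real service would also need to expire finished jobs and decide what to return for failed ones; both are omitted here.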
The second approach, if you need to notify the client side (though I would not recommend it if this is the only reason), is to use WebSockets to push a message to the client.
