Does Spring create a new thread per request in REST controllers?

I wanted to learn non-blocking REST, but first I wrote a blocking controller for comparison. To my surprise, Spring doesn't block incoming requests.
Simple blocking service:
@Service
public class BlockingService {

    public String blocking() {
        try {
            Thread.sleep(10000L);
        } catch (InterruptedException ign) {}
        return "Blocking response";
    }
}
Simple REST controller:
@Slf4j
@RestController
public class BlockingRestController {

    private final BlockingService blockingService;

    @Autowired
    public BlockingRestController(BlockingService blockingService) {
        this.blockingService = blockingService;
    }

    @GetMapping("blocking")
    public String blocking() {
        log.info("Starting blocking request processing...");
        return blockingService.blocking();
    }
}
I expected that when I send 4 requests using curl from 4 separate terminals, I would get:
1. Starting blocking request processing... (console where spring is running)
2. (4 terminals waiting)
3. "Blocking response" (in 1st terminal)
4. Starting blocking request processing... (console where spring is running)
5. (3 terminals waiting)
6. "Blocking response" (in 2nd terminal)
And so on...
But to my surprise I got:
1. Starting blocking request processing... (console where spring is running)
2. Starting blocking request processing... (console where spring is running)
3. Starting blocking request processing... (console where spring is running)
4. Starting blocking request processing... (console where spring is running)
5. "Blocking response" (in 1st terminal)
6. "Blocking response" (in 2nd terminal)
7. "Blocking response" (in 3rd terminal)
8. "Blocking response" (in 4th terminal)
Why doesn't the first request block the processing of the others, even though I don't create any new threads and I don't process anything asynchronously?
Why do I ask? Because I want to learn how to use DeferredResult, but now I don't see the need for it.

It's blocking in the sense that it blocks one thread: the thread taken out of the pool of threads by the servlet container (Tomcat, Jetty, etc., not Spring) to handle your request. Fortunately, many threads are used concurrently to handle requests; otherwise the throughput of any Java web application would be dreadful.
If you have, let's say, 500 concurrent requests all taking 1 minute to complete, and the pool of threads has 300 threads, then 200 requests will be queued, waiting for one of the threads to become available.
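The size of that request-handling pool is configurable. A minimal sketch, assuming Spring Boot's embedded Tomcat and its standard properties (note that newer Boot versions, 2.3 and later, renamed these to server.tomcat.threads.max and similar):
# application.properties
# worker threads handling requests
server.tomcat.max-threads=300
# connection queue length once all worker threads are busy
server.tomcat.accept-count=200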

Absolutely not, since NIO.
Spring Web runs in a web container such as Tomcat or Netty; it is Tomcat's or Netty's job to create threads, not Spring MVC's or Spring WebFlux's.
If you use Tomcat with the old BIO connector, it is definitely one new thread per request.
Netty is of course NIO, and Tomcat supports NIO and APR connectors, both of which are non-blocking.
Spring Boot's embedded Tomcat for Web MVC uses NIO by default, so there is no need to worry about too many threads being created.
See also: Tomcat 8 NIO, how it works?
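For context, in standalone Tomcat the connector implementation is chosen in server.xml; a minimal sketch (Spring Boot's embedded Tomcat already defaults to the NIO connector, so this only matters for a standalone install):
<!-- server.xml: explicit NIO connector for standalone Tomcat -->
<Connector port="8080"
           protocol="org.apache.coyote.http11.Http11NioProtocol"
           maxThreads="200"
           connectionTimeout="20000"
           redirectPort="8443" />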

Related

Netty - EventLoop Queue Monitoring

I am using the Netty server for a Spring Boot application. Is there any way to monitor the Netty server queue size, so that we know when the queue is full and the server is unable to accept new requests? Also, does the Netty server log anything when the queue is full or a new request cannot be accepted?
Netty does not have any logging for that purpose, but I implemented a way to find the pending tasks and added some logging along the lines of your question, which I verified with a sample run on my local machine.
You can find all the code here: https://github.com/ozkanpakdil/spring-examples/tree/master/reactive-netty-check-connection-queue
The code is mostly self-explanatory, but NettyConfigure is what actually does the Netty configuration in the Spring Boot environment; at https://github.com/ozkanpakdil/spring-examples/blob/master/reactive-netty-check-connection-queue/src/main/java/com/mascix/reactivenettycheckconnectionqueue/NettyConfigure.java#L46 you can see how many tasks are pending in the queue. DiscardServerHandler may show you how to discard requests when the limit is reached. You can use JMeter for testing; the JMeter file is at https://github.com/ozkanpakdil/spring-examples/blob/master/reactive-netty-check-connection-queue/PerformanceTestPlanMemoryThread.jmx
If you want to react when Netty's limit is hit, you can do it like the code below:
@Override
public void channelActive(ChannelHandlerContext ctx) throws Exception {
    totalConnectionCount.incrementAndGet();
    if (!ctx.channel().isWritable()) { // means we hit the max limit of netty
        System.out.println("I suggest we should restart or put a new server to our pool :)");
    }
    super.channelActive(ctx);
}
You should check https://stackoverflow.com/a/49823055/175554 for handling the limits, and there is another explanation of "isWritable" at https://stackoverflow.com/a/44564482/175554
One more thing: I added the actuator endpoints, and http://localhost:8080/actuator/metrics/http.server.requests is nice to check too.

Readiness probe during Spring context startup

We are deploying our spring boot applications in OpenShift.
Currently we are trying to run a potentially long-running task (a database migration) before the web context is fully set up.
It is especially important that the app does not accept REST requests or process messages before the migration is fully run.
See the following minimal example:
// DemoApplication.java
@SpringBootApplication
public class DemoApplication {
    public static void main(String[] args) {
        SpringApplication.run(DemoApplication.class, args);
    }
}

// MigrationConfig.java
@Configuration
@Slf4j
public class MigrationConfig {
    @PostConstruct
    public void run() throws InterruptedException {
        log.info("Migration...");
        // long running task
        Thread.sleep(10000);
        log.info("...Migration");
    }
}

// Controller.java
@RestController
public class Controller {
    @GetMapping("/test")
    public String test() {
        return "test";
    }
}

// MessageHandler.java
@EnableBinding(Sink.class)
public class MessageHandler {
    @StreamListener(Sink.INPUT)
    public void handle(String message) {
        System.out.println("Received: " + message);
    }
}
This works fine so far: the configuration class's @PostConstruct runs before the app responds to requests.
What we are worried about, however, is OpenShift's readiness probe: currently we use an actuator health endpoint to check if the application is up and running.
If the migration takes a long time, OpenShift might stop the container, potentially leaving us with inconsistent state in the database.
Does anybody have an idea how we could communicate that the application is starting, but prevent REST controller or message handlers from running?
Edit
There are multiple ways of blocking incoming REST requests; @martin-frey suggested a servlet filter.
The larger problem for us is the stream listener. We use Spring Cloud Stream to listen to a RabbitMQ queue.
I added an exemplary handler in the example above.
Do you have any suggestions on how to "pause" that?
What about a servlet filter that knows about the state of the migration? That way you should be able to intercept any inbound request and return a response code of your liking. There would also be no need to disable any request handlers until the system is fully up.
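A minimal sketch of such a filter, assuming a hypothetical MigrationState bean whose "done" flag the migration task flips when it finishes (all names here are illustrative, not from the original answer):
import java.io.IOException;
import javax.servlet.FilterChain;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import org.springframework.stereotype.Component;
import org.springframework.web.filter.OncePerRequestFilter;

@Component
public class MigrationGateFilter extends OncePerRequestFilter {

    private final MigrationState migrationState; // hypothetical bean exposing a volatile "done" flag

    public MigrationGateFilter(MigrationState migrationState) {
        this.migrationState = migrationState;
    }

    @Override
    protected void doFilterInternal(HttpServletRequest request, HttpServletResponse response,
                                    FilterChain chain) throws ServletException, IOException {
        if (!migrationState.isDone()) {
            // Reject everything until the migration has finished.
            response.sendError(HttpServletResponse.SC_SERVICE_UNAVAILABLE, "Migration in progress");
            return;
        }
        chain.doFilter(request, response);
    }
}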
I think your app pod can run without being affected if you set a large enough initialDelaySeconds for the initialization of your application. [0][1] For example:
readinessProbe:
  httpGet:
    path: /_status/healthz
    port: 8080
  initialDelaySeconds: 10120
  timeoutSeconds: 3
  periodSeconds: 30
  failureThreshold: 100
  successThreshold: 1
Additionally, I recommend setting up the liveness probe with the same condition (but with more time than the readiness probe's value); that way you can implement automated recovery of your pods if the application has not come up within initialDelaySeconds.
[0] [ https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-probes/#define-readiness-probes ]
[1] [ https://docs.openshift.com/container-platform/latest/dev_guide/application_health.html ]
How about adding an init container whose only role is the DB migration, without the application? Then another container serves the application; see the sketch below.
But be careful when deploying the application with more than 1 replica: the replicas will all execute the init container at the same time if you are using a Deployment.
If you need multiple replicas, you might want to consider StatefulSets instead.
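A rough sketch of that pod layout (a fragment of the Deployment's pod template; image names and the migration command are placeholders):
# fragment of spec.template.spec in a Deployment
initContainers:
- name: db-migration
  image: myapp-migrations:latest        # placeholder image that only runs the migration
  command: ["java", "-jar", "/migration.jar"]
containers:
- name: app
  image: myapp:latest                   # the actual Spring Boot application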
Such database migrations are best handled by switching to a Recreate deployment strategy and doing the migration as a mid lifecycle hook (a rough sketch follows). At that point there are no instances of your application running, so it can be done safely. If you can't have downtime, then you need to be able to switch the application into some offline or read-only mode against a copy of your database while doing the migration.
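A rough sketch of what that might look like in an OpenShift DeploymentConfig; the command and container name are placeholders, and the exact fields should be checked against the OpenShift documentation:
strategy:
  type: Recreate
  recreateParams:
    mid:
      failurePolicy: Abort              # abort the rollout if the migration fails
      execNewPod:
        containerName: app              # placeholder container name
        command: ["java", "-jar", "/migration.jar"]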
Don't keep the context busy with a long task in @PostConstruct. Instead, start the migration as a fully asynchronous task and allow Spring to build the rest of the context in the meantime. At the end of the task, just complete some shared Future with success or failure. Wrap the controller in a proxy (this can be facilitated with AOP, for example) where every method except the health check tries to get the value from that same Future within a timeout. If it succeeds, the migration is done and all calls are available; if not, reject the call. The proxy serves as a gate that keeps available only the part of the API that is critical while the migration is going on; the rest of it can simply respond with 503, indicating that the service is not ready yet. Those 503 responses could also be improved by measuring and averaging the time the migration typically takes and returning that value in a Retry-After header.
And with the MessageHandler you can do essentially the same thing: wait for the result of the Future in the handle method (provided message handlers are allowed to hang indefinitely). Once the result is set, message handling proceeds from that moment on. A sketch of this gate follows.
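A minimal sketch of that gate, assuming a shared CompletableFuture that the migration task completes; the class, method, and timeout values are illustrative:
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import org.springframework.boot.context.event.ApplicationReadyEvent;
import org.springframework.context.event.EventListener;
import org.springframework.stereotype.Component;

@Component
public class MigrationGate {

    private final CompletableFuture<Void> migrationDone = new CompletableFuture<>();

    @EventListener(ApplicationReadyEvent.class)
    public void startMigration() {
        // Run the migration off the startup thread so the context can finish building.
        CompletableFuture.runAsync(() -> {
            // ... long-running migration ...
            migrationDone.complete(null);
        });
    }

    /** Blocks until the migration is done; throws TimeoutException if it is still running. */
    public void awaitMigration(long timeout, TimeUnit unit) throws Exception {
        migrationDone.get(timeout, unit);
    }
}
The controller proxy would call awaitMigration(...) before delegating and translate a TimeoutException into a 503, while the message handler could call it with a longer (or no) timeout before processing the message.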

Can I use the block() method of a Flux returned from Spring 5's WebClient?

I created a Spring Boot 2.0 demo that contains two applications which communicate using WebClient. The problem is that they often stop communicating when I use the block() method on the Flux from the WebClient's response. I want to use List rather than Flux for certain reasons.
The server-side application is like this; it just returns a Flux.
@GetMapping
public Flux<Item> findAll() {
    return Flux.fromIterable(items);
}
And the client-side (or BFF-side) application is like this. I get the Flux from the server and convert it to a List by calling the block() method.
@GetMapping
public List<Item> findBlock() {
    return webClient.get()
            .retrieve()
            .bodyToFlux(Item.class)
            .collectList()
            .block(Duration.ofSeconds(10L));
}
While it works well at first, findBlock() stops responding and times out after several accesses. When I modify the findBlock() method to return Flux, removing collectList() and block(), it works well, so I assume the block() method causes this problem.
And, when I modify the findAll() method to return List, nothing changes.
Source code of the entire example application is here.
https://github.com/cero-t/webclient-example
"resource" is the server application, and "front" is the client application. After running both application, when I access to localhost:8080 it works well and I can reload any times, but when I access to localhost:8080/block it seems to work well but after several reloads it won't respond.
By the way, when I add "spring-boot-starter-web" dependency to the "front" applications's (not resource application's) pom.xml, which means I use tomcat, this problem never happens. Is this problem due to Netty server?
Any guidance would be greatly appreciated.
First, let me point out that using Flux.fromIterable(items) is advised only if items has been fetched from memory, with no I/O involved. Otherwise, chances are you'd be using a blocking API to get it, and this can break your reactive application. In this case it is an in-memory list, so there is no problem. Note that you can also use Flux.just(item1, item2, item3).
Using the following is the most efficient:
#GetMapping("/")
public Flux<Item> findFlux() {
return webClient.get()
.retrieve()
.bodyToFlux(Item.class);
}
Item instances will be read/written, decoded/encoded on the fly in a very efficient way.
On the other hand, this is not the preferred way:
#GetMapping("/block")
public List<Item> findBlock() {
return webClient.get()
.retrieve()
.bodyToFlux(Item.class)
.collectList()
.block(Duration.ofSeconds(10L));
}
In this case, your front application is buffering in memory the whole items list with collectList but is also blocking one of the few server threads available. This might cause very poor performance because your server might be blocked waiting for that data and can't service other requests at the same time.
In this particular case it's worse, since the application totally breaks.
Looking at the console, we can see the following:
WARN 3075 --- [ctor-http-nio-7] io.netty.util.concurrent.DefaultPromise : An exception was thrown by reactor.ipc.netty.channel.PooledClientContextHandler$$Lambda$532/356589024.operationComplete()
reactor.core.Exceptions$BubblingException: java.lang.IllegalArgumentException: Channel [id: 0xab15f050, L:/127.0.0.1:59350 - R:localhost/127.0.0.1:8081] was not acquired from this ChannelPool
at reactor.core.Exceptions.bubble(Exceptions.java:154) ~[reactor-core-3.1.3.RELEASE.jar:3.1.3.RELEASE]
This is probably linked to a reactor-netty client connection pool issue that should be fixed in 0.7.4.RELEASE. I don't know the specifics of this, but I suspect the whole connection pool gets corrupted as HTTP responses aren't properly read from the client connections.
Adding spring-boot-starter-web does make your application use Tomcat, but it mainly turns your Spring WebFlux application into a Spring MVC application (which now supports some reactive return types, but has a different runtime model). If you wish to test your application with Tomcat, you can add spring-boot-starter-tomcat to your POM and this will use Tomcat with Spring WebFlux.
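For reference, one way to do that in the front application's pom.xml is to exclude the default Reactor Netty runtime and add the Tomcat starter; a sketch, assuming Maven and the standard starter artifact IDs:
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-webflux</artifactId>
    <exclusions>
        <exclusion>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-reactor-netty</artifactId>
        </exclusion>
    </exclusions>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-tomcat</artifactId>
</dependency>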

Spring Boot embedded Tomcat application session does not invalidate

Recently we ported our application from a web application running in Tomcat to a Spring Boot application with embedded Tomcat.
After running the app for several days, memory and CPU usage reached 100%.
A heap dump analysis showed a bunch of HTTP session objects which were not destroyed.
I can see in the debugger that sessions are created with the configured timeout value, let's say 5 minutes, but after this time the invalidation is not triggered. It is invoked only if I make another request after the timeout period.
I have compared this behavior with the app running in standalone Tomcat, and I can see that session invalidation is triggered by the ContainerBackgroundProcessor thread [StandardManager(ManagerBase).processExpires()].
I do not see this background thread in the Spring Boot application.
What I have done so far, following some suggestions I found:
Set the session timeout in application.properties:
server.session.timeout=300
or in an EmbeddedServletContainerCustomizer @Bean:
factory.setSessionTimeout(5, TimeUnit.MINUTES)
Added HttpSessionEventPublisher and SessionRegistry beans.
Nothing helps; sessions are just not invalidated at the expiration time.
Any clue about this?
After some more debugging and documentation reading, here is the reason and the solution:
In Tomcat, a thread is spawned on behalf of the root container that periodically scans the container's and its child containers' session pools and invalidates expired sessions. Each container or child container may be configured to have its own background processor to do the job, or to rely on its parent's background processor.
This is controlled by context.backgroundProcessorDelay.
Apache Tomcat 8 Configuration Reference
backgroundProcessorDelay -
This value represents the delay in seconds between the invocation of the backgroundProcess method on this engine and its child containers, including all hosts and contexts. Child containers will not be invoked if their delay value is not negative (which would mean they are using their own processing thread). Setting this to a positive value will cause a thread to be spawn. After waiting the specified amount of time, the thread will invoke the backgroundProcess method on this engine and all its child containers. If not specified, the default value for this attribute is 10, which represent a 10 seconds delay.
In a Spring Boot application with embedded Tomcat,
TomcatEmbeddedServletContainerFactory.configureEngine() sets this property to -1 for StandardEngine[Tomcat], which is the root container in the Tomcat hierarchy, as I understand it.
All the child containers, including the web app's context, also have this parameter set to -1.
This means they all rely on someone else to do the job,
but Spring does not do it, and no one else does either.
The solution for me was to set this parameter for the app context:
@Bean
public EmbeddedServletContainerCustomizer servletContainerCustomizer() {
    return new EmbeddedServletContainerCustomizer() {
        @Override
        public void customize(ConfigurableEmbeddedServletContainer container) {
            if (container instanceof TomcatEmbeddedServletContainerFactory) {
                TomcatEmbeddedServletContainerFactory factory = (TomcatEmbeddedServletContainerFactory) container;
                TomcatContextCustomizer contextCustomizer = new TomcatContextCustomizer() {
                    @Override
                    public void customize(Context context) {
                        // re-enable the background processing thread for the app context
                        context.setBackgroundProcessorDelay(10);
                    }
                };
                List<TomcatContextCustomizer> contextCustomizers = new ArrayList<TomcatContextCustomizer>();
                contextCustomizers.add(contextCustomizer);
                factory.setTomcatContextCustomizers(contextCustomizers);
                customizeTomcat(factory);
            }
        }
    };
}

Spring DeferredResult

I am new to Spring and want to implement long polling for a website so that an admin message is displayed to all clients immediately when it becomes available. I searched Google for hours and could only find out that DeferredResult (Spring 3.2) can be used to implement it. My question is how I can achieve long polling with DeferredResult; I would appreciate it if anyone could refer me to such a tutorial.
Another option is to use AsyncContext. This will keep the initial GET request "open" and enable you to send multiple messages as part of the response, unlike DeferredResult, which allows only ONE response message. Here is a good link that explains how.
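A rough sketch of long polling with the raw Servlet 3.0 AsyncContext (class and URL names are illustrative; a real implementation should also react to timeouts via an AsyncListener):
import java.io.IOException;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import javax.servlet.AsyncContext;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

@WebServlet(urlPatterns = "/messages", asyncSupported = true)
public class AdminMessageServlet extends HttpServlet {

    // Clients currently parked, waiting for the next admin message.
    private final Queue<AsyncContext> waitingClients = new ConcurrentLinkedQueue<>();

    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp) {
        AsyncContext ctx = req.startAsync();  // releases the container thread, keeps the connection open
        ctx.setTimeout(30_000L);              // client re-issues the poll after a timeout
        waitingClients.add(ctx);
    }

    /** Called by the admin side whenever a new message becomes available. */
    public void publish(String message) {
        AsyncContext ctx;
        while ((ctx = waitingClients.poll()) != null) {
            try {
                ctx.getResponse().getWriter().write(message);
            } catch (IOException ignored) {
            } finally {
                ctx.complete();
            }
        }
    }
}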
Straight from the horse's mouth.
You have two basic options. Option 1 is a Callable, where the Callable returns the String view name (you may also be able to use @ResponseBody or some of the other normal Spring return types like ModelAndView, but I have never investigated that).
Option 2 is to return a DeferredResult, which is like Callable, except you can pass it off to a separate thread and fill in the result there. Again, I'm not certain whether you can return a ModelAndView or use @ResponseBody to return XML/JSON, but I am fairly sure you can.
Short background about DeferredResult:
Your controller is eventually a function executed by a servlet container worker thread (for the sake of discussion, let's assume the servlet container is Tomcat). Your service flow starts with Tomcat and ends with Tomcat: Tomcat gets the request from the client, holds the connection, and eventually returns a response to the client. Your code (controller or servlet) is somewhere in the middle.
Consider this flow:
1. Tomcat gets the client request.
2. Tomcat executes your controller.
3. The Tomcat thread is released, but the client connection is kept open (no response is returned yet), and the heavy processing runs on a different thread.
4. When the heavy processing completes, Tomcat is updated with its response, which is returned to the client (by Tomcat).
Because the servlet (your code) and the servlet container (Tomcat) are different entities, allowing this flow (releasing the Tomcat thread while keeping the client connection open) requires support in their contract: the asynchronous processing support added to the javax.servlet package in Servlet 3.0. Spring MVC uses this Servlet 3.0 capability when the return value of the controller is a DeferredResult or a Callable, although they are two different things. Callable is an interface from java.util.concurrent and is an improvement over the Runnable interface. DeferredResult is a class designed by Spring to allow more options (which I will describe) for asynchronous request processing in Spring MVC; this class just holds the result (as implied by its name), while your Callable implementation holds the async code. So you can use both in your controller: run your async code with a Callable and set the result in a DeferredResult, which will be the controller's return value.
So what do you get by using DeferredResult as the return value instead of Callable? DeferredResult has built-in callbacks like onError, onTimeout, and onCompletion, which make error handling very easy. In addition, since it is just the result container, you can choose any thread (or thread pool) to run your async code on; with Callable, you don't have this choice.
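A minimal sketch of that combination as a controller method, with the DeferredResult completed from a separate thread and the built-in callbacks wired up (the timeout value and the longRunningLookup() call are placeholders):
@GetMapping("/poll")
public DeferredResult<String> poll() {
    DeferredResult<String> result = new DeferredResult<>(30_000L);           // 30s timeout
    result.onTimeout(() -> result.setErrorResult("No message within 30 seconds"));
    result.onError(throwable -> result.setErrorResult(throwable));

    // Any thread or thread pool of your choosing can complete the result later;
    // the servlet container thread returns immediately.
    CompletableFuture.runAsync(() -> result.setResult(longRunningLookup()));

    return result;
}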
Here you can find simple working examples I created with both options, Callable and DeferredResult.
