Decrease initial lag in webclient call - spring-boot

I am currently using Webclient of spring-webflux package to make synchronous REST calls.
But the time taken to make the first request is longer than the time taken by RestTemplate.
I have observed that the successive calls take much less time, more or less the same as RestTemplate.
Is there a solution to decrease the initial lag for Webclient?

By default, the initialization of the HttpClient resources happens on demand. This means that the first request absorbs the extra time needed to initialize and load:
the event loop group
the host name resolver
the native transport libraries (when native transport is used)
the native libraries for the security (in case of OpenSsl)
You can preload these resources - check this documentation; a warm-up sketch is shown after the list below.
Things that cannot be preloaded are:
host name resolution happens with the first request
in case a connection pool is used (the default), the first request establishes a connection to the URL; subsequent requests to the same URL reuse connections from the pool, so they are faster.
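
As a minimal sketch of that preloading, assuming a recent Reactor Netty version where HttpClient.warmup() is available (it initializes the event loop group, the host name resolver and, where applicable, the native libraries before the first request is made):

import org.springframework.http.client.reactive.ReactorClientHttpConnector;
import org.springframework.web.reactive.function.client.WebClient;
import reactor.netty.http.client.HttpClient;

public class WebClientWarmup {
    public static void main(String[] args) {
        HttpClient httpClient = HttpClient.create();

        // warmup() (recent Reactor Netty versions) preloads the event loop group,
        // the host name resolver and the native libraries so the first real
        // request does not pay for them.
        httpClient.warmup().block();

        WebClient webClient = WebClient.builder()
                .clientConnector(new ReactorClientHttpConnector(httpClient))
                .build();

        // ... use webClient as usual; the first call should now be much
        // closer in duration to the subsequent calls.
    }
}

Host name resolution and the first connection to a given URL still happen with the first request, as noted above.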

Related

GRPC: Spring boot: How to warm up a grpc client. Especially to make it resolve domain names

The issue is: a gRPC client is taking too much time on its first call. After debugging, it turned out that domain name resolution was taking most of that time.
Is there any way to warm up the gRPC clients just after the application starts?
Update: The request eventually succeeds, but the first call takes much longer; the consecutive calls are much faster.
And I'm using this library: https://github.com/yidongnan/grpc-spring-boot-starter
When created, the ManagedChannel has not performed any I/O. It will lazily initialize itself on the first RPC or if you call managedChannel.getState(true).
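
A minimal warm-up sketch using plain grpc-java (the host and port are placeholders; with grpc-spring-boot-starter you would make the same getState(true) call on the injected channel right after startup, for example from an ApplicationRunner):

import io.grpc.ConnectivityState;
import io.grpc.ManagedChannel;
import io.grpc.ManagedChannelBuilder;

public class GrpcChannelWarmup {
    public static void main(String[] args) {
        // Host and port are placeholders for the real target.
        ManagedChannel channel = ManagedChannelBuilder
                .forAddress("my-grpc-host", 9090)
                .usePlaintext()
                .build();

        // getState(true) asks the channel to leave the IDLE state, which
        // triggers name resolution and connection establishment eagerly
        // instead of on the first RPC.
        ConnectivityState state = channel.getState(true);
        System.out.println("Channel state after warm-up request: " + state);
    }
}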

Throttle HTTP Request based on Available Memory

I have a REST API that is expected to receive a large payload as request body. The API calls a blocking method that takes 2 seconds to process each request and then returns 200 OK. I wish to introduce throttling based on available memory such that the API returns 429 Too Many Request when the available memory falls below a threshold.
When the threshold condition is met, I wish to reject subsequent requests right away, even before loading the large request payloads in my application memory. This will also give me some protection against denial of service attacks.
In a Java EE, Tomcat environment, if I use a Filter to check available memory, I understand the complete request is already loaded in memory. Is it then better to add the check in ServletRequestListener.requestInitialized method so that I can reject the request even before the app receives it?
P.S. I use the below formula to calculate available memory based on this SO post:
long presumableFreeMemory =
Runtime.getRuntime().maxMemory()
- Runtime.getRuntime().totalMemory()
+ Runtime.getRuntime().freeMemory();
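
As a rough sketch of the rejection itself, assuming a javax.servlet based stack (the threshold value is a placeholder, and whether the container has already buffered part of the request body before the filter runs depends on the server and its connector configuration):

import java.io.IOException;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletResponse;

public class MemoryThrottleFilter implements Filter {

    // Hypothetical threshold: start rejecting once less than ~100 MB appears free.
    private static final long THRESHOLD_BYTES = 100L * 1024 * 1024;

    @Override
    public void init(FilterConfig filterConfig) {
    }

    @Override
    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
            throws IOException, ServletException {
        long presumableFreeMemory =
                Runtime.getRuntime().maxMemory()
                - Runtime.getRuntime().totalMemory()
                + Runtime.getRuntime().freeMemory();

        if (presumableFreeMemory < THRESHOLD_BYTES) {
            // 429 Too Many Requests; the body has not been read by this filter.
            ((HttpServletResponse) response).sendError(429, "Too Many Requests");
            return;
        }
        chain.doFilter(request, response);
    }

    @Override
    public void destroy() {
    }
}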

Why is fasthttp faster than net/http?

A fasthttp based server is up to 10 times faster than net/http.
Which implementation details make fasthttp so much faster? Moreover, how does it manage incoming requests better than net/http?
The article "http implementation fasthttp in golang" from husobee mentions:
Well, this is a much better implementation for several reasons:
The worker pool model is a zero allocation model, as the workers are already initialized and are ready to serve, whereas in the stdlib implementation the go c.serve() has to allocate memory for the goroutine.
The worker pool model is easier to tune, as you can increase/decrease the buffer size of the number of work units you are able to accept, versus the fire-and-forget model in the stdlib
The worker pool model allows for handlers to be more connected with the server through channel communications, if the server needs to shutdown for example, it would be able to more easily communicate with the workers than in the stdlib implementation
The handler function definition signature is better, as it takes in only a context which includes both the request and writer needed by the handler. This is HUGELY better than the standard library, as all you get from the stdlib is a request and response writer… The work in go1.7 to include context within the request is pretty much a hack to give people what they really want (context) without breaking anyone.
Overall it is just better to write a server with a worker pool model for serving requests, as opposed to just spawning a “thread” per request, with no way of throttling out of the box.
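
fasthttp itself is written in Go, but the worker-pool idea described above is language-neutral. A rough Java sketch of the same pattern, purely as an illustration (a fixed set of pre-initialized workers fed through a bounded queue, which gives you throttling out of the box):

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class WorkerPoolSketch {
    public static void main(String[] args) {
        // Workers are created up front, so no per-request allocation of a new
        // "thread of execution"; the bounded queue size is the tuning knob.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                8, 8,
                0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(1024),
                new ThreadPoolExecutor.AbortPolicy()); // reject work when the queue is full

        for (int i = 0; i < 10; i++) {
            final int requestId = i;
            pool.submit(() -> System.out.println("handled request " + requestId));
        }
        pool.shutdown();
    }
}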

Batch HTTP Request Performance gain

I want to know the performance gain from doing an HTTP batch request. Is it only reducing the number of round trips to one instead of n, where n is the number of HTTP requests? If that is all, I guess you could keep the HTTP connection open, send your HTTP messages through it, and close it once finished to get the same performance gain.
The performance gain of doing batch requests depends on what you are doing with them. However just as an agnostic approach here you go:
If you can manage a keep-alive connection, yes this means you don't have to do the initial handshake for the connection. That reduces some overhead and certainly saves time spent handling subsequent packets along this connection. Because of this you can "pipeline" requests and decrease overall load latency (all else not considered). However, requests in HTTP/1.1 are still bound to be FIFO, so you can have hangups. This is where batching is useful. Since even with a keep-alive connection you can have this hangup (HTTP/2 will allow asynchronous handling), you can still have some significant latency between requests.
This can be mitigated further by batching. If possible, you lump all the data needed for subsequent requests into one, and this way everything is processed together and sent back as one response. Sure, it may take a bit longer to handle a single packet as opposed to the sequential method, but your throughput is increased per unit of time because round-trip latency for request->response is not multiplied. Thus you get an even better performance gain in terms of request handling speed.
Naturally this approach depends on what you're doing with the requests for it to be effective. Sometimes batching can put too much stress on a server if you have a lot of users doing this with a lot of data so to increase overall concurrent throughput across all users you sometimes need to take the technically slower sequential approach to balance things out. However, the best approach will be known by you upon some simple monitoring and analysis.
And as always, don't optimize prematurely :)
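
As an illustration of the keep-alive point, a minimal sketch with the Java 11+ HttpClient, which keeps persistent connections and reuses them by default (the URL is a placeholder):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class KeepAliveSketch {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://example.com/resource/1"))
                .GET()
                .build();

        // The first call pays for the TCP/TLS handshake; later calls on the
        // same client typically reuse the pooled connection to the host.
        for (int i = 0; i < 3; i++) {
            HttpResponse<String> response =
                    client.send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println(response.statusCode());
        }
    }
}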
Consider this typical scenario: the client has the identifier of a resource which resides in a database behind an HTTP server, of which resource they want to get an object representation.
The general flow to execute that goes like this:
The client code constructs an HTTP client.
The client builds a URI and sets the proper HTTP request fields.
Client issues the HTTP request.
Client OS initiates a TCP connection, which the server accepts.
Client sends the request to the server.
Server OS or webserver parses the request.
Server middleware parses the request components into a request for the server application.
Server application gets initialized, the relevant module is loaded and passed the request components.
The module obtains an SQL connection.
Module builds an SQL query.
The SQL server finds the record and returns that to the module.
Module parses the SQL response into an object.
Module selects the proper serializer through content negotiation, JSON in this case.
The JSON serializer serializes the object into a JSON string.
The response containing the JSON string is returned by the module.
Middleware returns this response to the HTTP server.
Server sends the response to the client.
Client fires up their version of the JSON serializer.
Client deserializes the JSON into an object.
And there you have it, one object obtained from a webserver.
Now each of those steps along the way is heavily optimized, because a typical server and client execute them so many times. However, even if one of those steps only takes a millisecond, when you have, for example, fifty resources to obtain, those milliseconds add up fast.
So yes, HTTP keep-alive cuts away the time the TCP connection takes to build up and warm up, but each and every other step will still have to be executed fifty times. Yes, there's SQL connection pooling, but every query to the database adds overhead.
So instead of going through this flow fifty separate times, if you have an endpoint that can accept fifty identifiers at once, for example through a comma-separated query string or even a POST with a body, and return their JSON representation at once, that will always be way faster than individual requests.
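
As a sketch of that last point, a call to a hypothetical batch endpoint that accepts fifty identifiers in one comma-separated query parameter (the URL and the ids parameter are assumptions, not a real API):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class BatchRequestSketch {
    public static void main(String[] args) throws Exception {
        // Fifty identifiers collapsed into a single comma-separated parameter.
        String idList = IntStream.rangeClosed(1, 50)
                .mapToObj(String::valueOf)
                .collect(Collectors.joining(","));

        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://example.com/resources?ids=" + idList))
                .GET()
                .build();

        // One round trip and one pass through the server-side flow instead of fifty.
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}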

Stream writeheaders take too much time WCF

A similar question is asked at NewRelic stream & writeHeaders
I am profiling my WCF services on New Relic. There is a WCF service which calls another WCF service.
Now I suppose that while calling the other WCF service, when it creates the request, somewhere the internal process writes headers to the request stream, which is slow at times.
The traces I found in New Relic tell me that a particular method of one of my WCF services, which calls a method of another WCF service, takes around 50-60 seconds, out of which 95-100% of the time is consumed by System.Net.ConnectStream.WriteHeaders.
Stream[url of WCF service/soap]: WriteHeaders -> 99.78 % time (approx 49 seconds).
I am not getting what this is or how to reduce this time.
I have searched, but I didn't find what ConnectStream actually does or any details about it, so I haven't found a way to lessen the amount of time it's taking.
Please, let me know your suggestions.
It sounds like you're streaming a large file up from a client, catching it in one WCF web service, then re-writing the data into a new HttpWebRequest, then sending it to another host. I think I'd be tempted to try buffering the data from the client to your web service rather than streaming.
I've spent the last year working on a project that sounds similar to what you're doing. The difference between streaming and buffering is this:
Streaming reads (from source) and then writes (to target) the data in an iterative process you don't have much control over. If the source file is large (like a gig or more), the WCF request/response will iterate a dozen or more times back and forth between the client and host before the request is complete.
Buffering, on the other hand, accumulates the entire content of the target file BEFORE filling the request and sending it to the host, thus speeding up the process. And since the performance penalty incurred by buffering (the time required to accumulate the bytes in memory) is placed on the client, it's generally not a problem.
So when buffering data from the client, your host will receive one HTTP request with a complete byte array (let's say) that's ready to be repackaged into the request you're passing on to the second, target WCF host. At that point, again, you have the choice between buffering and streaming. On the host, where performance matters, streaming the request to the second host will improve your scalability but (again) potentially hurt your performance speed.
On the client side:
With binding
    .TransferMode = TransferMode.Buffered 'instead of TransferMode.Streamed
    .MessageEncoding = WSMessageEncoding.Text
    .TextEncoding = System.Text.Encoding.UTF8
    .MaxReceivedMessageSize = Integer.MaxValue
    .ReaderQuotas.MaxArrayLength = Integer.MaxValue
    .ReaderQuotas.MaxBytesPerRead = Integer.MaxValue
    .ReaderQuotas.MaxDepth = Integer.MaxValue
    .ReaderQuotas.MaxNameTableCharCount = Integer.MaxValue
    .ReaderQuotas.MaxStringContentLength = Integer.MaxValue
    .MaxBufferSize = Integer.MaxValue
    .MaxBufferPoolSize = Integer.MaxValue
End With
On the host side:
With binding
    .TransferMode = TransferMode.Buffered
    .MaxReceivedMessageSize = Integer.MaxValue
End With
I've seen the same thing before when the service you're calling is stalling or is flooded with too many concurrent connections. If the issue is the former, profiling your WCF service may help identify the root cause -- maybe it's slow to respond due to database access or some other I/O-bound process. If the issue is the latter, it may be something that you can resolve by tuning the performance of the service (http://msdn.microsoft.com/en-us/library/ee377061(v=bts.10).aspx).
This can also manifest itself as "BeginRequest" for an ASP.NET application in New Relic. Rarely does BeginRequest or WriteHeaders mean the problem is really with sending the data itself (though it could be if you have large payloads); in regular calls where the data transmitted is small, a slow time to connect or a slow response will show up in these two areas.

Resources