Batch HTTP Request Performance gain - performance

I want to know where the performance gain from an HTTP batch request comes from. Is it only that the number of round trips drops from n to one, where n is the number of HTTP requests? If so, I guess you could get the same gain by keeping one HTTP connection open, sending all your HTTP messages through it, and closing it once you are finished.

The performance gain of doing batch requests depends on what you are doing with them. However just as an agnostic approach here you go:
If you can manage a keep-alive connection, then yes, you avoid repeating the initial connection handshake for every request. That removes some overhead and saves time on the subsequent requests sent over the same connection. On top of that you can "pipeline" requests, sending the next one without waiting for the previous response, which decreases overall latency (all else being equal). However, responses in HTTP/1.1 must still come back in FIFO order, so one slow response holds up everything queued behind it (HTTP/2 lifts this restriction by multiplexing requests over one connection). So even with a keep-alive connection you can still see significant latency between requests.
This can be mitigated further by batching. If possible, you lump all the data needed for the subsequent requests into a single request, so everything is processed together and sent back as one response. Sure, that one request may take a bit longer to handle than any single request in the sequential method, but overall throughput increases because the request->response round-trip latency is paid once instead of once per request. Thus you get an even better performance gain in terms of request handling speed.
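As a rough illustration of why the round trip matters, here is a back-of-the-envelope model. The numbers (50 ms round trip, 2 ms of server work per item) and the function names are assumptions picked for the example, not measurements, and the model ignores connection setup and pipelining.

```python
def sequential_time_ms(n, rtt_ms, work_ms):
    """n requests sent one after another: each one pays a full round trip."""
    return n * (rtt_ms + work_ms)


def batched_time_ms(n, rtt_ms, work_ms):
    """One batch request: a single round trip, but the server still does n units of work."""
    return rtt_ms + n * work_ms


if __name__ == "__main__":
    n, rtt_ms, work_ms = 50, 50, 2  # assumed values, for illustration only
    print("sequential:", sequential_time_ms(n, rtt_ms, work_ms), "ms")
    print("batched:   ", batched_time_ms(n, rtt_ms, work_ms), "ms")
```

With these assumed values the single batch response takes longer than any one sequential response, but the total is far lower because the round trip is paid only once.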
Naturally, whether this approach is effective depends on what you're doing with the requests. Batching can put too much stress on a server if a lot of users send batches with a lot of data, so to increase overall concurrent throughput across all users you sometimes need to take the technically slower sequential approach to balance things out. Some simple monitoring and analysis will tell you which approach is best in your case.
And as always, don't optimize prematurely :)

Consider this typical scenario: the client has the identifier of a resource that resides in a database behind an HTTP server, and wants an object representation of that resource.
The general flow to execute that goes like this:
The client code constructs an HTTP client.
The client builds a URI and sets the proper HTTP request fields.
Client issues the HTTP request.
Client OS initiates a TCP connection, which the server accepts.
Client sends the request to the server.
Server OS or webserver parses the request.
Server middleware parses the request components into a request for the server application.
Server application gets initialized, the relevant module is loaded and passed the request components.
The module obtains an SQL connection.
Module builds an SQL query.
The SQL server finds the record and returns that to the module.
Module parses the SQL response into an object.
Module selects the proper serializer through content negotiation, JSON in this case.
The JSON serializer serializes the object into a JSON string.
The response containing the JSON string is returned by the module.
Middleware returns this response to the HTTP server.
Server sends the response to the client.
Client fires up their version of the JSON serializer.
Client deserializes the JSON into an object.
And there you have it, one object obtained from a webserver.
Now each of those steps along the way is heavily optimized, because a typical server and client execute them so many times. However, even if each of those steps takes only a millisecond, when you have, for example, fifty resources to obtain, those milliseconds add up fast.
So yes, HTTP keep-alive cuts away the time the TCP connection takes to build up and warm up, but each and every other step will still have to be executed fifty times. Yes, there's SQL connection pooling, but every query to the database adds overhead.
So instead of going through this flow fifty separate times, if you have an endpoint that can accept fifty identifiers at once, for example through a comma-separated query string or even a POST with a body, and return their JSON representations in one response, that will almost always be much faster than individual requests.
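A minimal sketch of such an endpoint's core, assuming a comma-separated `ids` query parameter. The table `resource`, its columns, and the function names are hypothetical, and an in-memory SQLite database stands in for the real one. The point is that fifty identifiers turn into one query and one serialization pass.

```python
import json
import sqlite3


def make_db():
    """Create a throwaway in-memory table; the schema is hypothetical."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE resource (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO resource (id, name) VALUES (?, ?)",
        [(i, f"item-{i}") for i in range(1, 101)],
    )
    return conn


def get_many(conn, ids_param):
    """Handle something like GET /resources?ids=1,2,3 with a single query."""
    ids = [int(part) for part in ids_param.split(",") if part.strip()]
    if not ids:
        return "[]"
    # One placeholder per id keeps the query parameterized.
    placeholders = ",".join("?" for _ in ids)
    rows = conn.execute(
        f"SELECT id, name FROM resource WHERE id IN ({placeholders}) ORDER BY id",
        ids,
    ).fetchall()
    return json.dumps([{"id": row[0], "name": row[1]} for row in rows])


if __name__ == "__main__":
    print(get_many(make_db(), "3,1,2"))
```

A real endpoint would also want to cap the number of identifiers per request, which ties in with the point above about batches putting stress on the server.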

Related

Queue handling in HTTP server with heavy load

Consider a web server under very heavy load, resulting in important lag (up to ~30s response time). I've noticed that if I asynchronously request the same page multiple times (e.g. send multiple requests before the previous ones are answered), responses don't necessarily come back in the order I sent them.
Can anyone explain how the server chooses which requests to handle first? It seems there is no obvious queueing, so what makes a request get picked instead of another?

HTTP/2: Why are multiple HTTP requests better? Or is the statement false?

I was reading through https://hackernoon.com/how-it-feels-to-learn-javascript-in-2016-d3a717dd577f
A line says
Yes, but because HTTP/2 is coming now multiple HTTP requests are actually better.
Embedded within all the sarcasm in that post, this statement is presented as true. So I would like to know whether it actually is true, and if yes, how multiple requests are better. From what I know from my computer networks class, for each new linked resource a bunch of messages or packets are exchanged between the end hosts, i.e. eating resources/time/space on all the routers/bridges on that path.
In HTTP/2, multiple requests mean something slightly different than in HTTP/1.1. HTTP/2 multiplexes requests over a single connection, which stays open until the page has completed all its tasks. This way you can dynamically load smaller pieces of a library and share the connection overhead between them, which can amount to a smaller download overall than the one large JS file that is the efficient choice in HTTP/1.1.
Marc B had it right with the groceries analogy: HTTP/2 is one trip to the server that grabs multiple pieces and returns, whereas HTTP/1.1 is a series of trips to grab the same pieces.
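To make the ordering difference concrete, here is a toy model, not a protocol implementation. It assumes the server works on all requests concurrently and that `finish_times` (a made-up list, in milliseconds) says when each response is ready. With in-order delivery, as in HTTP/1.1 pipelining, a response cannot be delivered before the ones ahead of it. With multiplexing, each response is delivered when it is ready.

```python
from itertools import accumulate


def in_order_delivery(finish_times):
    """In-order (pipelined) delivery: each response waits for all earlier ones."""
    return list(accumulate(finish_times, max))


def multiplexed_delivery(finish_times):
    """Multiplexed delivery: each response goes out as soon as it is ready."""
    return list(finish_times)


if __name__ == "__main__":
    finish_times = [300, 20, 30]  # assumed: the first request is the slow one
    print("in order:   ", in_order_delivery(finish_times))
    print("multiplexed:", multiplexed_delivery(finish_times))
```

The model deliberately ignores bandwidth sharing and packet loss. It only shows why one slow response at the front hurts more when delivery order is fixed.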

Batching generation of http responses

I'm trying to find an architecture for the following scenario. I'm building a REST service that performs some computation that can be quickly batch computed. Let's say that computing 1 "item" takes 50ms, and computing 100 "items" takes 60ms.
However, the nature of the client is that only 1 item needs to be processed at a time. So if I have 100 simultaneous clients, and I write the typical request handler that sends one item and generates a response, I'll end up using 5000ms, but I know I could compute the same in 60ms.
I'm trying to find an architecture that works well in this scenario. I.e., I would like to have something that merges data from many independent requests, processes that batch, and generates the equivalent responses for each individual client.
If you're curious, the service in question is python+django+DRF based, but I'm curious about what kind of architectural solutions/patterns apply here and if anything solving this is already available.
At first you could think of a reverse proxy that detects all queries matching the pattern, collects them, and sends them to your application in an HTTP/1.1 pipeline (pipelining is a way to send a large number of queries one after another and receive all the HTTP responses in the same order at the end, without waiting for a response after each query).
But:
Pipelining is very hard to do well.
You would have to code the reverse proxy yourself, as I do not know of an existing way to do it.
One slow response in the pipeline blocks all the other responses.
You need an HTTP server able to hand several queries to your application at once, which never happens unless the HTTP server is coded directly into your application, because HTTP handling is usually built to work on one query at a time (for example, you never receive two queries in a PHP environment: you receive the first one, send the response, and then receive the next one, even if the connection contains two queries).
So the better idea would be to do this on the application side. You could identify matching queries and wait for a small amount of time (10ms?) to see whether other such queries are also incoming. You will need a way to communicate between several parallel workers here (say you have 50 application workers and 10 of them have received queries that could be treated in the same batch). This communication channel could be a database (a very fast one) or some shared memory, depending on the technology used.
Then, when too much time has been spent waiting (10ms?) or when a large number of queries have been received, one of the workers could collect all the queries, run the batch, and tell all the other workers that a result is available (here again you need a central point of communication, like LISTEN/NOTIFY in PostgreSQL, some shared memory, a message queue service, etc.).
Finally every worker is responsible for sending the right HTTP response.
The key here is having a system where the time you lose coordinating the shared treatment of requests is smaller than the time you save by batching several queries together, and where in case of low traffic this lost time stays reasonable (since you will then always lose time waiting for nothing). And of course you are also adding complexity to the system, making it harder to maintain, etc.
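The collect-wait-flush idea can be sketched in a simplified form. This is a single-process variant built on asyncio, so in-memory futures replace the cross-worker communication described above; it is not the multi-worker design itself. The class name `MicroBatcher`, its parameters, and `batch_fn` are all hypothetical. `batch_fn` is assumed to take a list of items and return a list of results in the same order.

```python
import asyncio


class MicroBatcher:
    """Collects single items for a short window, then processes them as one batch."""

    def __init__(self, batch_fn, max_wait=0.010, max_size=100):
        self.batch_fn = batch_fn  # hypothetical: list in, list of same length out
        self.max_wait = max_wait  # seconds to wait for more items
        self.max_size = max_size  # flush early once this many items are pending
        self.pending = []         # (item, future) pairs
        self.timer = None

    async def submit(self, item):
        """Called once per request; resolves with that request's own result."""
        loop = asyncio.get_running_loop()
        future = loop.create_future()
        self.pending.append((item, future))
        if len(self.pending) >= self.max_size:
            self._flush()
        elif self.timer is None:
            self.timer = loop.call_later(self.max_wait, self._flush)
        return await future

    def _flush(self):
        if self.timer is not None:
            self.timer.cancel()
            self.timer = None
        batch, self.pending = self.pending, []
        if not batch:
            return
        try:
            # Runs synchronously here; a long batch would block the event loop.
            results = self.batch_fn([item for item, _ in batch])
        except Exception as exc:
            for _, future in batch:
                if not future.done():
                    future.set_exception(exc)
            return
        for (_, future), result in zip(batch, results):
            if not future.done():
                future.set_result(result)


async def demo():
    batch_sizes = []

    def square_all(items):  # stand-in for the real batch computation
        batch_sizes.append(len(items))
        return [x * x for x in items]

    batcher = MicroBatcher(square_all)
    results = await asyncio.gather(*(batcher.submit(i) for i in range(5)))
    return results, batch_sizes


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In the demo, five concurrent callers each get their own result back while the batch function runs once. With several worker processes, as in a typical Django deployment, the pending list and the result notification would have to live in the shared channel discussed above instead of in process memory.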

How to Reduce 'Waiting Time' and 'Receiving Time' on Page Load

I am using CloudFront, and many times I see that Waiting time and Receiving time are too high.
According to the Firebug documentation, Waiting time and Receiving time mean:
Waiting - Waiting for a response from the server
Receiving - Time required to read the entire response from the server (and/or time required to read from cache)
I do not understand why it takes so much time. What can I do to reduce it?
There are multiple things you can do.
Set appropriate headers: Expires, Cache-Control, ETag, etc.
Use gzipped versions of the assets.
Use sprites where possible. Merge your CSS files into one, and merge your JS files into one.
Run your site through WebpageTest.org and go through all the recommendations.
Run your site through YSlow and go through all the recommendations
Waiting
This means that the browser is waiting for the server to process the request and return the response.
When that time is long, it normally means your server-side script takes long to process the request.
There are many reasons why a server-side script is slow, e.g. a long-running database query, processing of a huge file, deep recursions, etc.
To fix that, you need to optimize your script. Besides optimizing the code itself, a simple way to reduce the execution time for subsequent requests is to implement some kind of server-side caching.
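As one minimal sketch of server-side caching, here is an in-process memoization using Python's `functools.lru_cache`. The function `render_report` and its 50 ms sleep are hypothetical stand-ins for a slow script. A real site would more likely use a shared cache with expiry, which `lru_cache` does not provide.

```python
import time
from functools import lru_cache


@lru_cache(maxsize=128)
def render_report(report_id):
    """Stand-in for a slow server-side computation (hypothetical)."""
    time.sleep(0.05)  # pretend this is a long-running database query
    return f"report {report_id}"


if __name__ == "__main__":
    for attempt in (1, 2):
        start = time.perf_counter()
        render_report(7)
        elapsed_ms = (time.perf_counter() - start) * 1000
        print(f"attempt {attempt}: {elapsed_ms:.1f} ms")
```

The second call is answered from the cache, so only the first request pays for the slow work.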
Receiving
This means the browser is receiving the response from the server.
When that time is long, it either means your network connection is slow or the received data is (too) big.
To reduce this time, you therefore need to improve the network connection and/or to reduce the size of the response.
Reducing the response size can be done by compressing the transferred data, e.g. by enabling gzip, and/or by removing unnecessary characters like spaces from the output before sending it. You may also choose a different format for the returned data where possible, e.g. use JSON instead of XML, or return HTML directly.
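To get a feel for what gzip can do to a repetitive response, here is a small check using Python's standard `gzip` module. The payload shape and the function name are made up for the example; actual savings depend entirely on your data.

```python
import gzip
import json


def compressed_sizes(payload):
    """Return (raw_bytes, gzipped_bytes) for a JSON-serializable payload."""
    raw = json.dumps(payload).encode("utf-8")
    return len(raw), len(gzip.compress(raw))


if __name__ == "__main__":
    # Hypothetical response body: many records sharing the same keys.
    payload = [{"id": i, "status": "active"} for i in range(1000)]
    raw_size, gzipped_size = compressed_sizes(payload)
    print(f"raw: {raw_size} bytes, gzipped: {gzipped_size} bytes")
```

In practice the web server or CDN usually does this for you once compression is enabled, so the sketch is only for estimating the effect.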
Generally
To generally reduce the waiting and receiving times you may implement some client-side caching, e.g. by setting appropriate HTTP headers like Expires, Cache-Control, etc. Then the browser will only make rather small requests to check whether there are new versions of the data to fetch.
You can also avoid the requests completely by saving the data on the client side (e.g. by putting it into the local or session storage) instead of fetching it from the server every time you need it.

Is the HEAD response faster than the GET?

I'm currently getting the info about the files with GET. Will it be faster if I rewrite it to use HEAD requests? I ask because I close the connection after the first response.
A HEAD response includes only the HTTP headers and no body. It is generally faster to use HEAD if you do not need any information from the body that a GET response would normally transfer. If there was no body to begin with, it should not make a difference.
Also from here:
The HEAD method is identical to GET except that the server MUST NOT
return a message-body in the response. The metainformation contained
in the HTTP headers in response to a HEAD request SHOULD be identical
to the information sent in response to a GET request. This method can
be used for obtaining metainformation about the entity implied by the
request without transferring the entity-body itself. This method is
often used for testing hypertext links for validity, accessibility,
and recent modification.
Whether HEAD is faster than GET depends purely on the implementation of the server-side (it usually is due to less data transfer)... IF the information HEAD delivers is sufficient in your case I would go with HEAD and only fallback to GET where HEAD is not implemented properly and/or some obscure proxy is messing with it...
You haven't given any information about the type of server you're accessing or network you're accessing it over.
It is indeed plausible that a HEAD request would complete faster than a GET, since it involves less data transfer. However, on a fast or high-latency connection this will rarely matter. As for the server side, it depends heavily on what you're doing, but in most circumstances there would be no measurable difference if you timed it.
If you don't need the body of the response, why not use HEAD anyway? Regardless of whether you can measure any difference in response time or you can't, it is more bandwidth-efficient.
It's probably negligible. It really depends on what the server is doing. Once it receives a request, there is no guarantee that the response to a HEAD request will arrive any quicker than the response to a GET request.
In theory, because the response to a HEAD request should be the same as that of a GET request but without the response body, it should be quicker because it's transferring less data. But there is no guarantee that one connection processing a HEAD request will be any quicker than another connection processing a GET request.
The important thing to note about your question is that you are talking about 'GET requests and HEAD requests' instead of 'GET responses and HEAD responses'.
Logically - the request for a HEAD and a GET both take the same amount of time to travel from your PC to the server destination. Whatever that server does with the HEAD/GET will be up to the server owner, so they could make a HEAD take longer if they coded it to do so.
If you really want to get into semantics, you could argue that a HEAD request is one character longer than a GET request, so a HEAD request technically has to transmit 1 more byte of data in the request phase. In practice, this makes no measurable difference in request time.
If you were to start a timer from the moment both 'RESPONSES' left the server on their way back to the requester, then logically speaking, a GET response will take longer to travel across the network, since it will usually consist of HEADERS and BODY, and the BODY can be a huge amount of data.
A HEAD response will take less time to travel, because it is just HEADERS.
Using a really extreme example - if you send a GET request for a 4GB file, it will take minutes for that GET response to finish writing the data to your network stream.
A HEAD request for the same 4GB file will finish almost instantly, because it is only sending information that describes the 4GB file at a high level, without having to transmit its contents to the requester.
A GET response will encompass a HEAD + BODY.
A HEAD response will contain the HTTP Headers only.
I personally use HEAD requests in combination with a technology called IPFS, which is a type of distributed internet where files and data can be stored on a P2P network. In order to keep files alive on the network, they need to be requested frequently. However, if you pull the file via a GET request, you end up using bandwidth to download that 4GB file you stored weeks ago.
Performing a HEAD request however, in my case, keeps the file alive on the network, but does not request the 4GB of data to travel to me on the network.
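The difference in transferred data is easy to observe against a throwaway local server. This sketch uses only Python's standard library. The handler, the 100,000-byte body, and the helper `fetch` are made up for the demo and say nothing about how any particular real server treats HEAD.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

BODY = b"x" * 100_000  # stand-in for a large file


class Handler(BaseHTTPRequestHandler):
    def _send_headers(self):
        self.send_response(200)
        self.send_header("Content-Type", "application/octet-stream")
        self.send_header("Content-Length", str(len(BODY)))
        self.end_headers()

    def do_HEAD(self):
        self._send_headers()  # same headers as GET, but no body

    def do_GET(self):
        self._send_headers()
        self.wfile.write(BODY)

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch(method):
    """Issue one request to a local server; return (declared length, bytes received)."""
    server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0: pick any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1], timeout=5)
        conn.request(method, "/")
        response = conn.getresponse()
        body = response.read()
        declared = int(response.getheader("Content-Length"))
        conn.close()
        return declared, len(body)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print("HEAD:", fetch("HEAD"))
    print("GET: ", fetch("GET"))
```

Both responses declare the same Content-Length, but only the GET actually transfers the body, which is where any time saving comes from.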
