I'm currently getting information about files with GET requests. Would it be faster to rewrite this to use HEAD requests? I ask because I close the connection after the first response.
A HEAD response includes only the HTTP headers and no body. It is generally faster to use HEAD if you don't need any information from the body that a GET response would normally transfer; if there was no body to begin with, it should make no difference.
Also from here:
The HEAD method is identical to GET except that the server MUST NOT
return a message-body in the response. The metainformation contained
in the HTTP headers in response to a HEAD request SHOULD be identical
to the information sent in response to a GET request. This method can
be used for obtaining metainformation about the entity implied by the
request without transferring the entity-body itself. This method is
often used for testing hypertext links for validity, accessibility,
and recent modification.
Whether HEAD is faster than GET depends purely on the server-side implementation (it usually is, due to less data transfer). If the information HEAD delivers is sufficient in your case, I would go with HEAD and only fall back to GET where HEAD is not implemented properly and/or some obscure proxy is messing with it.
You haven't given any information about the type of server you're accessing or network you're accessing it over.
It is indeed plausible that a HEAD request would complete faster than a GET, since it involves less data transfer. However, on a fast connection, or one where latency dominates, this almost never matters. On the server side, it depends heavily on what you're doing, but in most circumstances there would be no measurable difference if you timed it.
If you don't need the body of the response, why not use HEAD anyway? Whether or not you can measure a difference in response time, it is more bandwidth-efficient.
It's probably negligible. It really depends on what the server is doing: once it receives a request, there is no guarantee that it will answer a HEAD request any quicker than a GET request.
In theory, because the response to a HEAD request should be the same as that of a GET request minus the body, it should be quicker, since it transfers less data. But there is no guarantee that a connection processing a HEAD request will be any quicker than one processing a GET request.
The important thing to note with your question is that you are talking about 'GET requests and HEAD requests' rather than 'GET responses and HEAD responses'.
Logically, a HEAD request and a GET request both take the same amount of time to travel from your PC to the server. Whatever the server does with the HEAD/GET is up to the server owner, so they could make a HEAD take longer if they coded it to do so.
If you really want to get into semantics, you could argue that HEAD is one character longer than GET, so a HEAD request technically transmits one extra byte in the request phase. In practice, this is a non-measurable difference in request time.
If you were to start a timer from the moment both responses left the server on their way back to the requester, then, logically speaking, a GET response will take longer to travel across the network, since it usually consists of headers and a body, and the body can be a huge amount of data.
A HEAD response will take less time to travel, because it is just headers.
Using a really extreme example - if you send a GET request for a 4GB file, it will take minutes for that GET response to finish writing the data to your network stream.
A HEAD request for the same 4GB file will finish almost instantly, because it is only sending information that describes the 4GB file at a high level, without having to transmit its contents to the requester.
A GET response will encompass the headers plus the body.
A HEAD response will contain the HTTP headers only.
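To make that concrete, here is a minimal sketch using Python's requests library (the URL is made up); the HEAD call pulls back only the headers, while the GET call also downloads the body:

```python
import requests

# Hypothetical large resource; the URL is only for illustration.
url = "https://example.com/files/big-video.bin"

# HEAD: only the status line and headers come back, no body is transferred.
head = requests.head(url, allow_redirects=True)
print(head.status_code, head.headers.get("Content-Length"))

# GET: the same headers come back, followed by the entire body.
get = requests.get(url)
print(get.status_code, len(get.content))
```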
I personally use HEAD requests in combination with a technology called IPFS - which is a type of distributed internet, where files and data can be stored on a P2P network. In order to keep files alive on the network, they need to be requested frequently. However, if you pull the file via a GET request, you end up using bandwidth, to download that 4GB file you stored weeks ago.
Performing a HEAD request however, in my case, keeps the file alive on the network, but does not request the 4GB of data to travel to me on the network.
Related
I'm coming to this from the InfoSec side, not the AppDev side, I just wanted to put that caveat in first. The issue is that my WAF is blocking certain images with the response, HTTP protocol compliance failed:Body in GET or HEAD requests. I need to justify keeping this rule active, so I'm asking, as a non-developer:
Is this being blocked because those are the rules for GET and HEAD requests, or can we allow a body in GET and HEAD requests even though it's really not a good idea?
Why is it not a good idea? What are the potential problems that arise from allowing a body in a GET or HEAD request?
Thanks in advance for everyone's help.
GET and HEAD requests are not meant to carry a message body; only methods such as POST and PUT do (see RFC 2616).
You are being blocked for another reason. As I have often seen myself, one of the request headers Content-Length or Transfer-Encoding is present, which can be tolerated if there really is no request body at all (Content-Length: 0). When the WAF finds these headers, it considers the request as having a body, even one of size 0.
If you loosen the policy, you will allow legitimate traffic but also open the door to abnormal traffic on GET/HEAD. To work around this, you can add an iRule or LTM policy to remove those headers on GET/HEAD, until F5 releases a version of the software that does not block the traffic when the body is of size 0.
The potential problem comes when a buggy web server buffers the data sent in a GET/HEAD body instead of returning a 400 error and ignoring the data. This could lead to memory consumption, or to injecting an attacker's data into legitimate users' requests, with unknown results at this time. If you are confident in your web server, you may loosen the WAF policy.
I want to know the performance gain from doing an HTTP batch request. Is it only reducing the number of round trips to one instead of n, where n is the number of HTTP requests? If that's all it is, I guess you could keep the HTTP connection open, send your HTTP messages through, and close it once finished to get the same gain.
The performance gain of doing batch requests depends on what you are doing with them. However, just as a general approach, here you go:
If you can manage a keep-alive connection, then yes, you don't have to do the initial handshake for each connection. That reduces some overhead and saves time spent handling subsequent packets along the connection. Because of this you can "pipeline" requests and decrease overall load latency (all else not considered). However, requests in HTTP/1.1 are still bound to be served FIFO, so you can have hang-ups. This is where batching is useful: since even with a keep-alive connection you can have this hang-up (HTTP/2 allows asynchronous handling), you can still see significant latency between requests.
This can be mitigated further by batching. If possible, you lump all the data needed for subsequent requests into one, and that way everything is processed together and sent back as one response. Sure, a single batched request may take a bit longer to handle than one request in the sequential approach, but your overall throughput increases because the round-trip latency of request->response is not multiplied. Thus you get an even better performance gain in terms of request-handling speed.
Naturally this approach depends on what you're doing with the requests for it to be effective. Sometimes batching can put too much stress on a server if you have a lot of users doing this with a lot of data, so to increase overall concurrent throughput across all users you sometimes need to take the technically slower sequential approach to balance things out. However, the best approach will become clear from some simple monitoring and analysis.
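As a small sketch of the difference (Python's requests library assumed; the endpoint and its ids parameter are hypothetical):

```python
import requests

# One Session reuses a keep-alive connection, so the TCP/TLS handshake
# happens once instead of once per request.
session = requests.Session()
ids = [1, 2, 3, 4, 5]

# Sequential: n round trips, each waiting for the previous response (FIFO).
sequential = [session.get(f"https://api.example.com/items/{i}").json() for i in ids]

# Batched: a single round trip carrying all identifiers, assuming the server
# exposes such an endpoint (the `ids` parameter here is hypothetical).
batched = session.get(
    "https://api.example.com/items",
    params={"ids": ",".join(map(str, ids))},
).json()
```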
And as always, don't optimize prematurely :)
Consider this typical scenario: the client has the identifier of a resource which resides in a database behind an HTTP server, of which resource they want to get an object representation.
The general flow to execute that goes like this:
The client code constructs an HTTP client.
The client builds a URI and sets the proper HTTP request fields.
Client issues the HTTP request.
Client OS initiates a TCP connection, which the server accepts.
Client sends the request to the server.
Server OS or webserver parses the request.
Server middleware parses the request components into a request for the server application.
Server application gets initialized, the relevant module is loaded and passed the request components.
The module obtains an SQL connection.
Module builds an SQL query.
The SQL server finds the record and returns that to the module.
Module parses the SQL response into an object.
Module selects the proper serializer through content negotiation, JSON in this case.
The JSON serializer serializes the object into a JSON string.
The response containing the JSON string is returned by the module.
Middleware returns this response to the HTTP server.
Server sends the response to the client.
Client fires up their version of the JSON serializer.
Client deserializes the JSON into an object.
And there you have it, one object obtained from a webserver.
Now each of those steps along the way is heavily optimized, because a typical server and client execute them so many times. However, even if each step takes only a millisecond, when you have, for example, fifty resources to obtain, those milliseconds add up fast.
So yes, HTTP keep-alive cuts away the time the TCP connection takes to build up and warm up, but each and every other step will still have to be executed fifty times. Yes, there's SQL connection pooling, but every query to the database adds overhead.
So instead of going through this flow fifty separate times, if you have an endpoint that can accept fifty identifiers at once, for example through a comma-separated query string or even a POST with a body, and return their JSON representation at once, that will always be way faster than individual requests.
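As a rough server-side sketch of such a batch endpoint (Flask is assumed here purely for illustration, with an in-memory dictionary standing in for the database):

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Stand-in for the database behind the server.
RESOURCES = {str(i): {"id": i, "name": f"item {i}"} for i in range(1000)}

@app.route("/resources")
def get_many():
    # GET /resources?ids=3,7,42 -> one request/response for many records.
    wanted = request.args.get("ids", "").split(",")
    found = [RESOURCES[i] for i in wanted if i in RESOURCES]
    return jsonify(found)
```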
I am using CloudFront, and many times I see that the Waiting and Receiving times are too high.
According to the Firebug documentation, Waiting and Receiving mean:
Waiting - Waiting for a response from the server
Receiving (or "from cache") - Time required to read the entire response from the server (and/or time required to read from cache)
I do not understand why these take so much time. What can I do to reduce them?
There are multiple things you can do.
Set appropriate headers: Expires, Cache-Control, ETag, etc.
Use gzipped versions of the assets
Use sprites where possible. Merge your CSS files into one, and merge your JS files into one
Run your site through WebpageTest.org and go through all the recommendations.
Run your site through YSlow and go through all the recommendations
Waiting
This means that the browser is waiting for the server to process the request and return the response.
When that time is long, it normally means your server-side script takes long to process the request.
There are many reasons why a server-side script is slow, e.g. a long-running database query, processing of a huge file, deep recursions, etc.
To fix that, you need to optimize your script. Besides optimizing the code itself, a simple way to reduce the execution time for subsequent requests is to implement some kind of server-side caching.
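One very simple form of server-side caching is in-process memoization; a minimal Python sketch, with a sleep standing in for the slow step:

```python
import time
from functools import lru_cache

@lru_cache(maxsize=1024)
def load_report(report_id):
    # Stand-in for a slow database query or file-processing step.
    time.sleep(0.5)
    return {"id": report_id, "rows": 42}

load_report(7)  # first call: pays the full half-second cost
load_report(7)  # subsequent calls: served from the in-process cache
```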
Receiving
This means the browser is receiving the response from the server.
When that time is long, it either means your network connection is slow or the received data is (too) big.
To reduce this time, you therefore need to improve the network connection and/or to reduce the size of the response.
Reducing the response size can be done by compressing the transferred data e.g. by enabling gzip and/or removing unnecessary characters like spaces from the output before outputting the data. You may also choose a different format for the returned data, where possible, e.g. use JSON instead of XML for data or directly returning HTML.
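A quick way to see how much gzip can save on a typical JSON payload (plain Python standard library, made-up data):

```python
import gzip
import json

# A repetitive JSON payload, typical of API responses.
payload = json.dumps([{"id": i, "name": "item number %d" % i} for i in range(500)])

raw = payload.encode("utf-8")
compressed = gzip.compress(raw)

print(len(raw), "bytes uncompressed")
print(len(compressed), "bytes gzipped")  # usually a small fraction of the original
```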
Generally
To generally reduce the waiting and receiving times you may implement some client-side caching, e.g. by setting appropriate HTTP headers like Expires, Cache-Control, etc. Then the browser will only make rather small requests to check whether there are new versions of the data to fetch.
You can also avoid the requests completely by saving the data on the client side (e.g. by putting it into the local or session storage) instead of fetching it from the server every time you need it.
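A rough sketch of the revalidation mentioned above (the small "has anything changed?" requests), assuming Python's requests library, a hypothetical URL, and a server that emits an ETag:

```python
import requests

url = "https://example.com/data.json"  # hypothetical cacheable resource

first = requests.get(url)
etag = first.headers.get("ETag")

# Revalidation: if the data has not changed, the server can answer
# 304 Not Modified with an empty body instead of resending everything.
headers = {"If-None-Match": etag} if etag else {}
second = requests.get(url, headers=headers)
print(second.status_code)  # 304 means "use your cached copy"
```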
I am setting up a REST web service that just needs to answer YES or NO, as fast as possible.
Designing a HEAD service seems the best way to do it, but I would like to know if I will really gain some time versus doing a GET request.
I suppose I gain by the body stream not having to be opened/closed on my server (about 1 millisecond?).
Since the number of bytes to return is very low, do I gain any time in transport, in the number of IP packets?
Edit:
To explain further the context:
I have a set of REST services executing some processes, if they are in an active state.
I have another REST service indicating the state of all these first services.
Since that last service will be called very often by a very large set of clients (one call expected every 5 ms), I was wondering whether using the HEAD method could be a valuable optimization. About 250 characters are returned in the response body. The HEAD method would at least save the transport of those 250 characters, but what is the actual impact?
I tried benchmarking the difference between the two methods (HEAD vs GET), running the calls 1000 times, but I see no gain at all (< 1 ms)...
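For reference, a rough sketch of that kind of benchmark loop (Python's requests library assumed; the endpoint is hypothetical):

```python
import time
import requests

url = "https://api.example.com/status"  # hypothetical status endpoint
session = requests.Session()            # keep-alive, as in the setup above

def average_time(method, n=1000):
    start = time.perf_counter()
    for _ in range(n):
        session.request(method, url)
    return (time.perf_counter() - start) / n

print("GET :", average_time("GET"))
print("HEAD:", average_time("HEAD"))
```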
A RESTful URI should represent a "resource" at the server. Resources are often stored as a record in a database or a file on the filesystem. Unless the resource is large or is slow to retrieve at the server, you might not see a measurable gain by using HEAD instead of GET. It could be that retrieving the meta data is not any faster than retrieving the entire resource.
You could implement both options and benchmark them to see which is faster, but rather than micro-optimize, I would focus on designing the ideal REST interface. A clean REST API is usually more valuable in the long run than a kludgey API that may or may not be faster. I'm not discouraging the use of HEAD, just suggesting that you only use it if it's the "right" design.
If the information you need really is meta data about a resource that can be represented nicely in the HTTP headers, or to check if the resource exists or not, HEAD might work nicely.
For example, suppose you want to check if resource 123 exists. A 200 means "yes" and a 404 means "no":
HEAD /resources/123 HTTP/1.1
[...]
HTTP/1.1 404 Not Found
[...]
However, if the "yes" or "no" you want from your REST service is a part of the resource itself, rather than meta data, you should use GET.
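A minimal sketch of such an existence check (Python's requests library; the endpoint is hypothetical and relies on the 200/404 semantics above):

```python
import requests

def resource_exists(resource_id):
    # 200 -> "yes", 404 -> "no"; assumes the server follows those semantics.
    resp = requests.head(f"https://api.example.com/resources/{resource_id}")
    return resp.status_code == 200

print(resource_exists(123))
```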
I found this reply when looking for the same question that the requester asked. I also found this at http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html:
The HEAD method is identical to GET except that the server MUST NOT return a message-body in the response. The metainformation contained in the HTTP headers in response to a HEAD request SHOULD be identical to the information sent in response to a GET request. This method can be used for obtaining metainformation about the entity implied by the request without transferring the entity-body itself. This method is often used for testing hypertext links for validity, accessibility, and recent modification.
It would seem to me that the correct answer to requester's question is that it depends on what is represented by the REST protocol. For example, in my particular case, my REST protocol is used to retrieve fairly large (as in more than 10K) images. If I have a large number of such resources being checked on a constant basis, and given that I make use of the request headers, then it would make sense to use HEAD request, per w3.org's recommendations.
GET fetches the headers plus the body, HEAD fetches the headers only. It should not be a matter of opinion which one is faster. I don't understand the upvoted answers above. If you are looking for meta information, then go for HEAD, which is meant for this purpose.
I strongly discourage this kind of approach.
A RESTful service should respect the HTTP verbs semantics. The GET verb is meant to retrieve the content of the resource, while the HEAD verb will not return any content and may be used, for example, to see if a resource has changed, to know its size or its type, to check if it exists, and so on.
And remember: premature optimization is the root of all evil.
HEAD requests are just like GET requests, except the body of the response is empty. This kind of request can be used when all you want is metadata about a file but don't need to transport all of the file's data.
Your performance will hardly change by using a HEAD request instead of a GET request.
Furthermore, if you want it to be RESTful and you want to GET data, you should use a GET request instead of a HEAD request.
I don't understand your concern about the 'body stream being open/closed'. The response body will be sent over the same stream as the HTTP response headers and will NOT create a second connection (which, by the way, would be more in the range of 3-6 ms).
This seems like a very premature optimization attempt on something that just won't make a significant, or even measurable, difference. The real difference is conformity with REST in general, which recommends using GET to get data.
My answer is NO, use GET if it makes sense, there's no performance gain using HEAD.
You could easily make a small test to measure the performance yourself. I think the performance difference would be negligible, because if you're only returning 'Y' or 'N' in the body, it's a single extra byte appended to an already open stream.
I'd also go with GET since it's more correct. You're not supposed to return content in HTTP headers, only metadata.
Several of my Ajax applications in the past have used GET requests, but now I'm starting to use POST requests instead. POST requests seem to be slightly more secure and definitely more URL-friendly/pretty. Thus, I'm wondering if there is any reason why I should use GET requests at all.
I generally frame the question thus: does anything important change after the request (logging and the like notwithstanding)? If it does, it should be a POST request; if it doesn't, it should be a GET request.
I'm glad that you call POST requests "slightly" more secure, because that's pretty much what they are; it's trivial to fake a POST request by a user to a page. Making it a POST request, however, prevents web accelerators or reloads from re-triggering the action accidentally.
As AJAX, there is one more consideration: if you are returning JSON with callback support, be very careful not to put any sensitive data that you don't want other websites to be able to see in there. Wikipedia had a vulnerability along these lines where the user anti-CSRF token was revealed via their JSON API.
All good points, however, in answer to the question, GET requests are more useful in certain scenarios over POST requests:
They can be bookmarked
They can be cached
They're faster
They have known consequences (assuming they don't change data), so visiting them multiple times is not a problem.
For the sake of posterity, updating this comment with the blog notes re: point #3 here, all credit to Omar AL Zabir (the author of the referenced blog post):
"Atlas by default makes HTTP POST for all AJAX calls. Http POST is
more expensive than Http GET. It transmits more bytes over the wire,
thus taking precious network time and it also makes ASP.NET do extra
processing on the server end. So, you should use Http Get as much as
possible. However, Http Get does not allow you to pass objects as
parameters. You can pass numeric, string and date only. When you make
a Http Get call, Atlas builds an encoded url and makes a hit to that
url. So, you must not pass too much content which makes the url become
larger than 2048 chars. As far as I know, that’s what is the max
length of any url.
Another evil thing about http post is, it’s actually 2 calls. First
browser sends the http post headers and server replies with “HTTP 100
Continue”. When browser receives this, it sends the actual body."
You should use GET where you're doing a request which has no side effects, e.g. just fetching some info. This request can:
Be repeated without any problem - if the browser detects an error it can silently retry
Have its result cached by the browser
Be cached by a proxy
These things are all good. Anything which is only retrieving data (particularly public data) should really be a GET. The server should send sensible Last-Modified: and Expires: headers to allow caching if required.
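As a sketch of a handler that opts in to such caching (Flask is assumed here purely for illustration):

```python
import time
from email.utils import formatdate

from flask import Flask, Response

app = Flask(__name__)

@app.route("/public-data")
def public_data():
    resp = Response('{"status": "ok"}', mimetype="application/json")
    # Allow browsers and proxies to cache the result for five minutes.
    resp.headers["Cache-Control"] = "public, max-age=300"
    resp.headers["Expires"] = formatdate(time.time() + 300, usegmt=True)
    resp.headers["Last-Modified"] = formatdate(time.time(), usegmt=True)
    return resp
```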
There is one other difference not mentioned by anyone.
GET requests are passed in the URL string and are therefore subject to a length limit usually dependent on the browser. It seems that most are around 2000 chars.
POST requests can be much, much larger; in fact, they are effectively unlimited. So if you need to request data from a web server and you're passing in a lot of parameter information, then a POST request might be the only option.
So, as mentioned before, a GET request is really for requesting data (no side effects) while a POST request is generally used for transmitting data back to the server to be stored (with side effects). For example, use POST to upload a file and GET to retrieve a file.
There was a time when, I believe, IE had a very short limit on GET URL strings. Some applications like Lotus Notes use large numbers of random characters to represent document IDs. I had the displeasure of using another product that generated random strings, so the page URL was unique each time. The random string was huge, and from memory it didn't always work with IE6.
This might help you to decide where to use GET and where to use POST:
URIs, Addressability, and the use of HTTP GET and POST.
POST requests are just as insecure as GETs. The main difference is that POST is used to modify the state of the server application, while GET only requests data from it.
The difference matters when you use clean, "restful" URLs, where the URL itself specifies the resource, and the different methods trigger different actions on the server side.
Perhaps most importantly, GET is book-markable / viewable in url history, and searchable with Google.
POST is important where you don't want the event to be bookmarkable or able to be typed in as a URL - otherwise you (or Google crawling your URLS) could end up accidentally doing things like deleting users from your system, for example.
GET | POST
Values are visible in the URL. | Values are not visible in the URL.
Limited by the maximum URL length (roughly 2,000 characters in most browsers). | No practical limit on the length of the values, since they are submitted in the body of the HTTP request.
Performs slightly better, because of the simple nature of appending values to the URL. | Slightly lower performance, because of the time spent including the values in the HTTP body.
Supports only string data (everything is encoded into the URL). | Supports different data types, such as string, numeric, binary, etc.
Results can be bookmarked. | Results cannot be bookmarked.
Requests are often cacheable. | Requests are rarely cacheable.
Parameters remain in web browser history. | Parameters are not saved in web browser history.
Source and more in depth analysis: https://www.guru99.com/difference-get-post-http.html