Varnish: How to send hit/miss stats to backend - caching

I hope you can help
I have a image server that generates images on the fly.
I'm using varnish to cache generated images.
I need to record how many requests (per image) varnish receives as well as if it was a hit or miss (pass gets marked as miss). Currently, I'm writing access logs with hit/miss to file, I then using crontab process this access-log file and write the data to my db...
What I would like to do instead is:
Have Varnish make a request to my backend notifying it of a cache hit (and if possible the response size (bytes)).
My backend could then save this data...
Is this at all possible and if so how?
In-case anybody is interested:
2 varnish instances each with 1 (java+tomcat) backend.
Service manipulates and generates each image specific to the requirements made in the request...
Below are per day:
Over 35 million page views where each page has at least 3 images in it.
Varnish gets around 3+ million requests for images (images are also cached by the browser).
Varnish has a 87% hit rate
Response times for a hit are a few micro seconds
Response times for a miss are 50ms to 1000ms depending on the size of the image (both source and output)

The best way of doing this is to have a helper process that tails varnishlog output and
does the HTTP calls when needed.
You can do this by logging the necessary data with std.log() in vcl_deliver, so the
helper process gets all the data it needs. Use obj.hits > 0 to check if this was a cache hit.
If you really really need to do it inline (and slowing down all your cache hits badly), you
can use libvmod-curl:
https://github.com/varnish/libvmod-curl

If you are going to send a request to a stats server from within your vcl I would try to incorporate some type of aggregate request, where you send it every 100 (or whatever) requests instead of every single incoming request.
Like the other answer, I would recommend using varnishncsa (or varnishlog) with a process that tails the log file. There could be some delay in that method but if that is acceptable then I would consider post processing the varnish log when logrotated runs. This way you have a full day's worth of data and you can churn through it, producing whatever report you need.

Related

Is there a way to keep ajax calls from firing off seemingly sequentially in web2py?

I'm developing an SPA and find myself needing to fire off several (5-10+) ajax calls when loading some sections. With web2py, it seems that many of them are waiting until others are done or near done to get any data returned.
Here's an example of some of Chrome's timeline output
Where green signifies time spent waiting, gray signifies time stalled, transparent signifies time queued, and blue signifies actually receiving the content.
These are all requests that go through web2py controllers, and most just do a simple operation (usually a database query). Anything that accesses a static resource seems to have no trouble being processed quickly.
For the record, I'm using sessions in cookies, since I did read about how file-based sessions force web2py into similar behavior. I'm also calling session.forget() at the top of any controller that doesn't modify the session.
I know that I can and I intend to optimize this by reducing the number of ajax calls, but I find this behavior strange and undesirable regardless. Is there anything else that can be done to improve the situation?
If you are using cookie based sessions, then requests are not serialized. However, note that browsers limit the number of concurrent connections to the same host. Looking at the timeline output, it does look like groups of requests are indeed made concurrently, but Chrome will not make all 21 requests concurrently.
If you can't reduce the number of requests but must make them all concurrently, you could look into domain sharding or configuring your web server to use HTTP/2.
As an aside, in web2py, if you are using file based sessions and want to unlock the session file within a given request in order to prevent serialization of requests, you must use session.forget(response) rather than just session.forget() (the latter prevents the session from being saved even if it has been changed, but it does not immediately unlock the file). In any case, there is no session file to unlock if you are using cookie based sessions.

How to Reduce 'Waiting Time' and 'Receiving Time' on Page Load

I am using CloudFront and many time I see Wait Time and Receiving Time is too high.
According to Firebug document, Waiting time and Receiving time means:
Waiting - Waiting for a response from the server
Receiving - / (from cache) Time required to read the entire
response from the server (and/or time required to read from cache)
I do not understand why it takes so much time and what I can do to reduce the time?
There are multiple things you can do.
Set appropriate headers Expires, Cache-control, ETag etc.
Use gzipped versions of the assets
User Sprites where possible. Merge your CSS files into one, merge your JS files into one
Run your site through WebpageTest.org and go through all the recommendations.
Run your site through YSlow and go through all the recommendations
Waiting
This means that the browser is waiting for the server to process the request and return the response.
When that time is long, it normally means your server-side script takes long to process the request.
There are many reasons why a server-side script is slow, e.g. a long-running database query, processing of a huge file, deep recursions, etc.
To fix that, you need to optimize your script. Besides optimizing the code itself, a simple way is to reduce the execution time for subsequent requests is to implement some kind of server-side caching.
Receiving
This means the browser is receiving the response from the server.
When that time is long, it either means your network connection is slow or the received data is (too) big.
To reduce this time, you therefore need to improve the network connection and/or to reduce the size of the response.
Reducing the response size can be done by compressing the transferred data e.g. by enabling gzip and/or removing unnecessary characters like spaces from the output before outputting the data. You may also choose a different format for the returned data, where possible, e.g. use JSON instead of XML for data or directly returning HTML.
Generally
To generally reduce the waiting and receiving times you may implement some client-side caching, e.g. by setting appropriate HTTP headers like Expires, Cache-Control, etc. Then the browser will only make rather small requests to check whether there are new versions of the data to fetch.
You can also avoid the requests completely by saving the data on the client side (e.g. by putting it into the local or session storage) instead of fetching it from the server every time you need it.

Is a CDN an appropriate solution to cache query results?

I'm working on a little search engine, where I'm trying to find out how to cache query results.
These results are simple JSON text, retrieved using an ajax request.
Storing results in memory is not an option, I can see two options remaining:
Use a nosql database to retrieve cached results.
Store results on a CDN and redirect the http request (307 - Temporary Redirect) in case the result was already cached.
However, I don't have much experience with CDN, and wonder if using it for a huge amount of temporary small text files is a good practice.
Is it a good practice to use redirection on an ajax request?
Is a CDN an appropriate solution to cache small text files?
Short answer: no.
Long: Usually, you use a CDN for large static files that you want the CDN to mirror all around the world so it's close to a user when she requests them. When you have data that changes a lot, it would always take a while to propagate the changes to all nodes of the CDN, in the meantime users get inconsistent results (this may or may not matter to you).
Also, to avoid higher latency I wouldn't use an HTTP redirect (where you tell the client to make a second request to somewhere else) but rather figure out whether to get the data from the cache or the engine on your end (e.g. using a caching proxy or a load balancer) and then serve it directly to the client.

Do browsers limit AJAX polling rate? What is the limit?

I just read that some browsers would prevent HTTP polling (I guess by limiting the rate of requests)...
From https://github.com/sstrigler/JSJaC:
Note: As security restrictions of most modern browsers prevent HTTP
Polling from being usable anymore this module is disabled by default
now. If you want to compile it in use 'make polling'.
This could explain some misbehavior of some of my JavaScripts (sometimes requests are just not sent or retried, even if they were actually successful). But I couldn't find further information on details..
Questions
if it's "max. number of requests n per x seconds", what are the usual/default settings for x and n?
Is there any way good resource for this?
Any way to detect if a request has been "delayed" or "rejected" because of a rate limit?
Thanks for your help...
Stefan
Yes, as far as I am aware there is a default pool limit of 10 and a default request timeout of 30 seconds per request, however the timeout and poll limits can be controlled and different browsers implement different limitations!
Check out this Google implementation.
and this is an awesome implementation of catching a timeout error!
You can find the Firefox specifics HERE!
Internet Explorer specifics are controlled from inside the Windows registry.
Also have a look at this question.
Basically, the way you control is not by changing the browser limitations, but by abiding them. So you apply a technique called throttle-ing.
Think of it as creating a FIFO/priority queue of functions. A queue struct that takes xhr requests as members and enforces delay between them is an Xhr Poll. For instance, I am using
Jsonp to get data from a node.js server located on another domain and I am polling of course due to browser limitations. Otherwise, I get zero response back from the server and that is only because of browser limitations.
I am actually doing a console log for every request that's supposed to be sent, but not all of them are being logged. So the browser limits them.
I'll be even more specific with helping you out. I have a page on my website which is supposed to render a view for tens or even hundreds of articles. You go through them using a cool horizontal slider.
The current value of the slider matches the currrent 'page'. Since I am only displaying 5 articles per page and I can't exactly load thousands of articles 'onload' without severe performance implications, I load the articles for the current page. I get them from a MongoDB by sending a cross-domain request to a Python script.
The script is supposed to return an array of five objects with all the details I need to build the DOM elements for a 'page'. However, there are a couple of issues.
First, the slider works extremely fast, as it's more or less a value change. Even if there is drag drop functionality, key down events etc, the actual change takes miliseconds. However, the code of the slider looks something like this:
goog.events.listen(slider, goog.events.EventType.CHANGE, function() {
myProject.Articles.page(slider.getValue());
}
The slider.getValue() method returns an int with the current page number, so basically I have to load from:
currentPage * articlesPerPage to (currentPage * articlesPerPage + 1) - 1
But in order to load, i do something like this:
I have a storage engine(think of it as an array):
I check if the content is not already there
If it is, there is no point to make another request, so go forward with getting the DOM elements from the array with the already created DOM elements in place.
If it isn't, then I need to get it so I need to send that request I was mentioning, which would look something like(without accounting for browser limitations):
JSONP.send({'action':'getMeSomeArticles','start':start,'length': itemsPerPage, function(callback){
// now I just parse the callback quickly to make sure it is consistent
// create DOM elements, and populate the client side storage
// and update the view for the user.
}}
The problem comes from the speed with which you can change that slider. Since every change supposedly triggers a request(same would happen for normal Xhr requests), then you are basically crossing the limitations of all browsers, so without throttle-ing, there would be no 'callback' for most of the requests. 'callback' is the JS code returned by the JSONP request(which is more of a remote script inclusion than anything else).
So what I do is push a request to a priority queue, not POLL, as now I don't need to send multiple simultaneous requests. If the queue is empty, the recently added member is executed and everyone is happy. If it's not, then all non-completed requests in progress are cancelled and only the last one is executed.
Now in my particular case, I do a binary search(0(log n)) to see if the storage engine doesn't have data for the previous requests yet, which tells me if the previous request has been completed or not. If it has, then it's removed from the queue and the current one is processed, otherwise the new one fires. So an and so forth.
Again, for speed consideration and shit browser wanna-bes such as Internet Explorer, I do the above described procedure about 3-4 steps ahead. So I pre-load 20 pages ahead till everything is the client side storage engine. This way, every limitation is successfully dealt with.
The cooldown time is covered by the minimum time it would take to slide through 20 pages and the throttle-ing makes sure there are no more than 1 active requests at any given time(with backwards compatibility going as far as Internet Explorer 5).
The reason why I wrote all this is to give you an example trying to say that you cannot always enforce delay directly from the FIFO structure, as your calls may need to turn into what a user sees, and you don't exactly want to make a user wait 10-15 seconds for a single page to render.
Also, always minimize the polling and the need to poll(simultaneously fired Ajax events, as not all browsers actually do good things with them). For instance, instead of doing something like sending one request to get content and sending another for that content to be tracked as viewed in your app metrics, do as many tasks at server level as you possibly can!
Of course, you probably want to track your errors properly, so your Xhr object from your library of choice implement error handling for ajax and because you are an awesome developer you want to make use of them.
so say you have a try - catch block in place
The scenario is this:
An Ajax call has finished and it's supposed to return a JSON, but the call somehow failed. However, you try to parse the JSON and do whatever you need to do with it.
so
function onAjaxSuccess (ajaxResponse) {
try {
var yourObj = JSON.parse(ajaxRespose);
} catch (err) {
// Now I've actually seen this on a number of occasions, to log that an error occur
// a lot of developers will attempt to send yet another ajax request to log the
// failure of the previous one.
// for these reasons, workers exist.
myProject.worker.message('preferrably a pre-determined error code should go here');
// Then only the worker should again throttle and poll the ajax requests that log the
//specific error.
};
};
While I have seen various implementations that try to fire as many Xhr requests at the same time as they possible can until they encounter browser limitations, then do quite a good job at stalling the ones that haven't fired in wait for the browser 'cooldown', what I can advise you is to think about the following:
How important is speed for your app?
Just how scalable and how intensive the I/O will be?
If the answer to the first one is 'very' and to the latter 'OMFG modern technology', then try to optimize your code and architecture as much as you can so that you never need to send 10 simultaneous Xhr requests. Also, for large scale apps, multi-thread your processes. The JavaScript way to accomplish that is by using workers. Or you could call the ECMA board, tell them to make this a default, and then post it here so that the rest of us JS devs can enjoy native multi-threading in JS:)(how dafuq did they not think about this?!?!)
Stefan, quick answers below:
-if it's "max. number of requests n per x seconds", what are the usual/default settings for x and n?
This sounds more like a server restriction. The browser ones usually sound like:
-"the maximum requests for the same hostname is x"
-"the maximum connections for ANY hostname is y"
-Is there any way good resource for this?
http://www.browserscope.org/?category=network (also hover over table headers to see what is measured)
http://www.stevesouders.com/blog/2008/03/20/roundup-on-parallel-connections
-Any way to detect if a request has been "delayed" or "rejected" because of a rate limit?
You could look at the http headers for "Connection: close" to detect server restrictions but I am not aware of being able in JavaScript to read settings from so many browsers in a consistent, browser-independent way. (For Firefox, you could read this http://support.mozilla.org/en-US/questions/746848)
Hope this quick answer helps?
No, browser does not in any way affect polling. I think what was meant on that page is the same origin policy - you can only access the same host and port as your original page.
Only known limitation to connections themselves is that you usually can only have from two to four simultaneous connections to the same host.
I've written some apps with long poll, some with C++ backend with my own webserver, and one with PHP backend with Apache2.
My long poll timeout is 4..10 s. When something occurs, or 4..10 s passes, my server returns an empty response. Then the client immediatelly starts another AJAX request. I found that some browsers hangs up when I start AJAX call from previous AJAX handler, so I am using setTimeout() with a small value to start the next AJAX request.
When something happens on the client side, which should be sent to server, I use another AJAX request for it, but it's a one-way thing: the server does not send any response, and the client does not process anything. The result of the operation (if any) will be received on the long poll. It requires max. 2 connection to the server, which all browsers supports.
Keep in mind, that if there's 500 client, it means 500 server-side webserver thread, which will move together, occurring load peaks, because when something happens, the server have to report it at the same time for each clients, the clients will process it near same time long, they will start the next long request in the same time, and from then, the timeout will expire also at the same time, and furthcoming ones too. You can trick with rnd timeout, say 4 rnd(0..4), but it's worthless, if anything happens, they will "sync" again, all the request have to be served at the same time, when something reportable happens.
I've tested it thru a router, and it works. I assume, routers respects 4..10 lag, it's around the speed of a slow webapge (far, far away), which no router think, that it should be canceled.
My PHP work is a collaborative spreadsheet, it looks amazing when you hit enter and the stuff is updating simultaneously in several browsers. Have fun!
No limit for no of ajax requests. However it will be on same host & port.
Server can limit no of request from a machine based on its setting.
For example. A server can set so that if there are more than few request from same machine within specified time it will reject request.
After small mistake in javascript code, neverending loop was made witch each step calling 2 ajax requests. In firebug i could see more and more requests until firefox started to slow down, dont response and finally crash.
So, yes, there is a "limit" ;)

What happens if I exceed my browser's ajax request limit?

Will it be smart and queue later requests for submission after earlier requests complete, or will it do something stupid like discard later requests which push it over its maximum number.
Is the answer the same across browsers, or does it vary?
Are you referring to the fact that browsers typically limit the number of simultaneous connections to a particular host (2 is recommended by the HTTP spec)? If so, then yes, all requests will be queued. It's really no different then loading a web page that has a lot of images in it -- the initial load will result in a bunch of new requests, but they may have to wait based on the connection limit. But all of your images do load.
I'm not aware of an ajax-specific request limit.

Resources