Drupal 6 caching and blocks

I've read all over the place and I'm trying to figure out whether I'm understanding the way caching happens in Drupal 6. We have a site with a real-time stock ticker on it. With Drupal caching enabled, the stock price ends up getting cached and frozen at a specific value.

I figured one way to handle it would be to put the ticker in a block I create in a custom module and set BLOCK_NO_CACHE, but if I'm understanding this correctly, when site caching is enabled the ENTIRE page gets cached, including any and all blocks on it, regardless of their individual cache settings. Is this correct? So am I unable to take advantage of site caching if I have certain spots that should not be cached?

Does anyone know of another solution that would give me the best of both worlds: site caching, but also a real-time stock ticker? By the way, the ticker makes a JSON request to the Yahoo finance API to get the quote.

You are correct: BLOCK_NO_CACHE only applies at the block level. When page caching is enabled, Drupal caches the entire page, including all of its blocks, but only for anonymous users. Drupal's philosophy is that content for anonymous users is always the same, so they are served the cached page. This does not apply to authenticated users, since different users may have access to different parts of the page (e.g. the links block looks different for an admin than for a regular user).
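For reference, declaring such a block in a Drupal 6 custom module looks roughly like this. This is only a sketch: the module name "mymodule", the delta and the placeholder markup are made up, and the real ticker content would be filled in client-side by your JSON request to the Yahoo finance API.

    <?php
    /**
     * Implementation of hook_block() (Drupal 6).
     */
    function mymodule_block($op = 'list', $delta = 0, $edit = array()) {
      switch ($op) {
        case 'list':
          return array(
            'ticker' => array(
              'info' => t('Stock ticker'),
              // Disables Drupal's block-level cache for this block only; it
              // does NOT bypass the page cache served to anonymous users.
              'cache' => BLOCK_NO_CACHE,
            ),
          );

        case 'view':
          if ($delta == 'ticker') {
            return array(
              'subject' => t('Stock ticker'),
              // Placeholder markup; replaced client-side with the live quote.
              'content' => '<div id="stock-ticker">Loading...</div>',
            );
          }
          break;
      }
    }

Again, BLOCK_NO_CACHE only affects the block cache; for anonymous users the page cache still wins, which is why the AJAX approach below is needed.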
You might want to have a look at this discussion: BLOCK_NO_CACHE not working for anonymous users
And there is a solution, which you'll stumble upon in that discussion: the Ajax Blocks module. An extract from the module description:
Permits to load some blocks by additional AJAX request after loading the whole cached page when the page is viewed by anonymous user. It is suitable for sites which are mostly static, and the page caching for anonymous users is a great benefit, but there are some pieces of information that have to be dynamic.

Related

Employing a CDN for a dynamic website

I have a website forum where users exchange photos and text with one another on the home page. The home page shows the 20 latest objects, be they photos or text; the 21st object is pushed out of view. A new photo is uploaded every 5 seconds and a new text string is posted every second, so in around 20 seconds a photo that appeared at the top has disappeared off the bottom.
My question is: would I get a performance improvement if I introduced a CDN in the mix?
Since the content is changing, it seems I shouldn't be doing it. However, when I think about it logically, it does seem I'd get a performance improvement by introducing a CDN for my photos. Here's how: imagine a photo is posted, appearing on the page at t=1 and remaining there until t=20. The first person to access the page (close to t=1) causes the photo to be pulled to an edge server. Thereafter, anyone accessing the photo will receive it from the CDN; this lasts until t=20, after which the photo disappears. This is a veritable performance boost.
Can anyone comment on the flaws in my reasoning, or on what I'm failing to consider? It would also be good to know what alternative performance optimizations I could make for a website like mine. Thanks in advance.
You've got it right. As long as someone accesses the photo within the 20 seconds that the image is within view it will be pulled to an edge server. Then upon subsequent requests, other visitors will receive a cached response from the nearest edge server.
As long as you're using the CDN for delivering just your static assets, there should be no issues with your setup.
Additionally, you may want to look into a CDN which supports HTTP/2. This will provide you with improved performance. Check out cdncomparison.com for a comparison between popular CDN providers.
You need to consider all requests hitting your server, which includes the primary dynamically generated HTML document, but also all static assets like CSS files, JavaScript files and, yes, image files (both static and user-uploaded content). An HTML document will reference several other assets, each of which needs to be downloaded separately and thus incurs a server hit. Assuming for the sake of argument that each visitor has an empty local cache, a single page load may incur, say, ~50 resource hits on your server.
Probably the only request which actually needs to be handled by your server is the dynamically generated HTML document, if it's specific to the user (because they're logged in). All other 49 resource requests are identical for all visitors and can easily be shunted off to a CDN. Those will just hit your server once [per region], and then be cached by the CDN and rarely bother your server again. You can even have the CDN cache public HTML documents, e.g. for non-logged in users, you can let the CDN cache HTML documents for ~5 seconds, depending on how up-to-date you want your site to appear; so the CDN can handle an entire browsing session without hitting your server at all.
If you have roughly one new upload per second, there are likely an order of magnitude more passive visitors per second. If you can let a CDN handle ~99% of requests, that's a dramatic reduction in actual hits to your server. If you are clever about what you cache and for how long, and depending on your particular mix of anonymous and authenticated users, you can easily reduce server load by an order of magnitude or two. At the same time, you're speeding up page load times accordingly for your visitors.
For every single HTML document and other asset, really think whether this can be cached and for how long:
For HTML documents, is the user logged in? If no, and there's no other specific cookie tracking or similar things going on, then the asset is static and public for all intents and purposes and can be cached. Decide on a maximum age for the document and let the CDN cache it. Even caching it for just a second makes a giant difference when you get 1000 hits per second.
If the user is logged in, set Cache-Control to private, but still let the visitor's browser cache it for a few seconds. These headers must be decided by your forum software while it's generating the document (see the sketch after this list).
For all other assets which aren't access restricted: let the CDN cache them for a long time and you can practically forget about ever having to serve those particular files again. These headers can be statically configured for entire directories in the web server.
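As a rough illustration (not tied to any particular forum software; the cookie name is just a placeholder), the decision for the HTML document could be made while generating it:

    <?php
    // Sketch: pick a caching policy while generating the HTML document.
    // "forum_session" is a hypothetical session cookie name.
    if (empty($_COOKIE['forum_session'])) {
      // Anonymous visitor: the page is effectively public. Let the CDN
      // (s-maxage) and browsers (max-age) share a short-lived copy.
      header('Cache-Control: public, max-age=5, s-maxage=5');
    }
    else {
      // Logged-in visitor: only this user's browser may cache it, briefly.
      header('Cache-Control: private, max-age=5');
    }

    // Static assets are usually handled in the web server config instead,
    // e.g.  Cache-Control: public, max-age=31536000
    // which lets the CDN serve them essentially forever.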

Use google hosted jQuery-ui or self host custom download of jQuery UI?

I'm working on a site where we are using the slide function from jquery-ui.
The Google-hosted minified version of jquery-ui weighs 63KB - this is for the whole library. The custom download of just the slide function weighs 14KB.
Obviously if a user has already cached the Google-hosted version it's a no-brainer, but if they haven't, it may take longer to load than if I just lumped the custom jquery-ui slide function into my main.js file.
I guess it comes down to how many other sites are using jquery-ui (if this were just plain jquery the above would be a no-brainer, since loads of sites use jquery, but I'm unsure about how widely jquery-ui is used)...
I can't work out what the best thing to do is in the above scenario.
I'd say if the custom selective build is that small, both absolutely and relatively, there's a good reason to choose that path.
Loading a JavaScript resource has several implications, in the following order of events:
Loading: Request/response communication or, in the case of a cache hit, fetching from cache. Keep in mind that, CDN or not, this communication only affects the first page load. If your site is built in a traditional "full page request" style (as opposed to SPAs and the like), this is essentially a non-issue.
Parsing: The JS engine needs to parse the entire resource.
Executing: The JS engine executes the entire resource. That means that any initialization / loading code is executed, even if that's initialization for features that aren't used in the hosting page.
Memory usage: The memory footprint depends on the entire resource. That includes static objects as well as functions (which are also objects).
With that in mind, having a smaller resource is advantageous in ways beyond simple loading. What's more, a request for such a small resource is negligible in terms of communication. You wouldn't think twice about it if it were a mini version of the company logo at the bottom of the screen where nobody even notices.
As a side note and potential optimization, if your site serves any proprietary library, or a group of less common libraries, you can bundle all of these together, including the jQuery UI subset, and your users will only have a single request, again making this advantageous.
Go with the Google hosted version
It is likely that the user would have recently visited a website that loads jQuery-UI hosted on Google servers.
It will take load off from your server and make other elements load faster.
Browsers only download a limited number of resources concurrently from one domain. Loading jQuery-UI from Google's servers means it is downloaded in parallel with the resources that reside on your own servers.
The Yahoo developer network recommends using a CDN. Their full reasons are posted here.
https://developer.yahoo.com/performance/rules.html
This quote from their site really seals it in my mind.
"Deploying your content across multiple, geographically dispersed servers will make your pages load faster from the user's perspective."
I am not an expert, but here are my two cents anyway. With a CDN you can be sure there is reduced latency, plus, as mentioned, the user is most likely to have already picked it up from some other website that loads it from Google. Also, the thing I always care about: it saves bandwidth.

WP Super Cache conflicts with Contact Form 7

Two popular WordPress plugins, WP Super Cache and Contact Form 7, still have some issues around _nonce AJAX calls. How is it possible to make them work together smoothly? There are definitely situations where you want ALL your posts (pages) to stay super-cached, rather than being served dynamically without caching, while still preserving contact form functionality.
With "Cache rebuild" unchecked in the settings, Super Cache behaves like this:
standard pages do NOT force a new cache file to be generated (i.e. an existing supercache file, if present, really is served without triggering regeneration);
pages with a wpcf7 form included DO trigger regeneration of the supercache file while still serving the initial (cached) one for that request. I think that is the point where things go wrong.
Question: how can I stop this unwanted regeneration of pages with a wpcf7 form included? Based on the docs on the wpcf7 plugin homepage, my expectation is that the wpcf7 form has been evolving to meet these AJAX-call requirements such as the _nonce calls. Any idea how to resolve this? For example:
something hard-coded within the wpcf7 form (e.g. excluding the _nonce call and/or URL parameter), which is against WP standards, or
settings in Super Cache, or
any other idea, solution or alternative?
In this thread, the wpcf7 author says:
Contact Form 7 uses WP nonce as secret key. As nonce life is 24 hours by default, if cache clears less than 24 hours, the two plugins should work without problem, even if the page is cached.
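In practice that means keeping WP Super Cache's cache expiry / garbage collection below 24 hours (a plugin setting, not code), or, if you really need longer-lived cached pages, extending the nonce lifetime instead. A minimal, hypothetical sketch of the latter using WordPress's nonce_life filter (e.g. in the theme's functions.php); the 48-hour value is just an example:

    <?php
    // Sketch: stretch the nonce lifetime so it outlives the cached pages.
    // WordPress's default for this filter is DAY_IN_SECONDS (24 hours).
    add_filter( 'nonce_life', function () {
        return 2 * DAY_IN_SECONDS;
    } );

Note that longer-lived nonces slightly weaken the protection they provide, so only go as far as your cache lifetime actually requires.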

Designing an application around HMVC and AJAX [Kohana 3.2]

I am currently designing an application that will have a few different pages, and each page will have components that update through AJAX. The layout is similar to the new Twitter design where 'Home', 'Discover', and 'Connect' are separate pages, but interacting within the page (such as clicking 'Followers' or 'Following') uses AJAX.
Since the design requires an initial page load with several components (in the context of Twitter: tweets, followers, following), each of which can be updated individually through AJAX, I thought it'd be best to have a default controller for serving pages, and other controllers with actions that, rather than serving full pages, strictly handle querying the database and returning JSON objects. This way, on initial page load several HMVC requests can be made to gather the data for each component, and AJAX calls can also be made to update each component individually.
My idea is to have a Controller_Default that handles serving pages. In the context of Twitter, Controller_Default would contain:
action_home()
action_connect()
action_discover()
I would then have other Controllers that don't deal with serving full pages, but rather components of pages. For instance, in the context of Twitter Controller_Tweet may have:
action_get()
which returns a JSON object containing tweets for a specific user. Action_home() could then make several HMVC requests to get the data for the several different components of the page (i.e. make requests to 'tweet/get', 'followers/get', 'following/get'). While on the page, however, AJAX calls could be made to the function specific controllers (i.e. 'tweet/get') to update the content.
My question: is this a good design? Does it make sense to have the pages served through a default controller, with page components served (in JSON format) through other function specific controllers?
If there is any confusion regarding the question please feel free to ask for clarification!
One of the strengths of the HMVC pattern is that employing this type of layered application doesn't lock you into a workflow that might be difficult to change later on.
From what you've indicated above, this would be perfectly acceptable as a way of serving content to a client; the default controller wraps sub-requests, which avoids multiple AJAX calls from the client to achieve the same goal.
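To make that concrete, here is a minimal sketch of the two sides in Kohana 3.2. The class and route names follow the question; everything else (fetching logic, routing) is assumed:

    <?php defined('SYSPATH') or die('No direct script access.');

    // Component controller: returns JSON only, never a full page.
    class Controller_Tweet extends Controller {

      public function action_get()
      {
        $tweets = array(); // fetch tweets for the requested user here
        $this->response->headers('Content-Type', 'application/json');
        $this->response->body(json_encode($tweets));
      }
    }

    // Inside Controller_Default::action_home(), the same URI can be consumed
    // via an internal (HMVC) sub-request on the initial page load:
    //   $tweets = json_decode(Request::factory('tweet/get')->execute()->body(), TRUE);
    // while the browser hits 'tweet/get' directly via AJAX for later updates.

The nice part of this layout is that the JSON-producing controllers serve both the initial server-side composition and the subsequent AJAX updates, so there is only one code path to maintain per component.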
Two suggestions I might make:
Ensure that your Twitter back-end requests are abstracted out and managed in a library to make the application DRY'er and easier to maintain.
Consider whether the default controller is making only the absolutely necessary calls on each request. Employ caching to avoid pulling infrequently changed data on every request (e.g., followers might only need refreshing every 30 seconds); a sketch follows below. This of course depends entirely on your application requirements, but if you get heavily loaded you could quickly find your Twitter API request limit being reached.
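For the caching suggestion, the Kohana Cache module makes this straightforward. A sketch, assuming the Cache module is enabled, $user_id is available from context, and Model_Follower::recent() is a hypothetical model call:

    <?php
    // Sketch: serve follower data from cache, refreshing at most every 30 s.
    $cache = Cache::instance();
    $key   = 'followers_'.$user_id;

    if (($followers = $cache->get($key)) === NULL)
    {
      $followers = Model_Follower::recent($user_id); // hypothetical model call
      $cache->set($key, $followers, 30);             // lifetime in seconds
    }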
One final observation: if you do find the server is experiencing high load and Twitter API requests are taking a long time to return, consider provisioning another server and installing a copy of your application. You can then "point" sub-requests from the default gateway application to your second application server, which should help improve response times if the two servers are connected by a high-speed link.

Lazy HTTP caching

I have a website which is displayed to visitors via a kiosk. People can interact with it. However, since the website is not locally hosted and is loaded over an internet connection, page loads are slow.
I would like to implement some kind of lazy caching mechanism such that as and when people browse the pages - the pages and the resources referenced by the pages get cached, so that subsequent loads of the same page are instant.
I considered using HTML5 offline caching - but it requires me to specify all the resources in the manifest file, and this is not feasible for me, as the website is pretty large.
Is there any other way to implement this? Perhaps using HTTP caching headers? I would also need some way to invalidate the cache at some point to "push" the new changes to the browser...
The usual approach to handling problems like this is with HTTP caching headers, combined with smart construction of URLs for resources referenced by your pages.
The general idea is this: every resource loaded by your page (images, scripts, CSS files, etc.) should have a unique, versioned URL. For example, instead of loading /images/button.png, you'd load /images/button_v123.png and when you change that file its URL changes to /images/button_v124.png. Typically this is handled by URL rewriting over static file URLs, so that, for example, the web server knows that /images/button_v124.png should really load the /images/button.png file from the web server's file system. Creating the version numbers can be done by appending a build number, using a CRC of file contents, or many other ways.
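One possible way to generate such URLs in PHP, using a short content hash as the version (the helper name versioned_url and the rewrite rule in the comment are illustrative assumptions, not part of any framework):

    <?php
    // Sketch: turn /images/button.png into /images/button_v3f2a91bc.png.
    // Assumes a web-server rewrite maps the versioned name back to the real
    // file, e.g.  RewriteRule ^(.+)_v[0-9a-f]{8}\.(\w+)$ $1.$2 [L]
    function versioned_url($path) {
      $hash = substr(md5_file($_SERVER['DOCUMENT_ROOT'].$path), 0, 8);
      return preg_replace('/\.(\w+)$/', '_v'.$hash.'.$1', $path);
    }

    echo '<img src="'.versioned_url('/images/button.png').'">';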
Then you need to make sure that, wherever URLs are constructed in the parent page, they refer to the versioned URL. This obviously requires dynamic code used to construct all URLs, which can be accomplished either by adjusting the code used to generate your pages or by server-wide plugins which affect all text/html requests.
Then you set the Expires header for all resource requests (images, scripts, CSS files, etc.) to a date far in the future (e.g. 10 years from now). This effectively caches them forever. It means that all resources loaded by each of your pages will always be fetched from cache; cache invalidation never happens, which is OK because when the underlying resource changes, the parent page will use a new URL to find it.
Finally, you need to figure out how you want to cache your "parent" pages. How you do this is a judgement call. You can use ETag/If-None-Match HTTP headers to check for a new version of the page every time, which will very quickly load the page from cache if the server reports that it hasn't changed. Or you can use Expires (and/or Max-Age) to reload the parent page from cache for a given period of time before checking the server.
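A bare-bones sketch of the ETag / If-None-Match handshake for a parent page, assuming $html already holds the generated document:

    <?php
    // Sketch: let the kiosk browser revalidate the page cheaply.
    $etag = '"'.md5($html).'"';
    header('ETag: '.$etag);
    header('Cache-Control: no-cache'); // always revalidate, but allow 304s

    if (isset($_SERVER['HTTP_IF_NONE_MATCH'])
        && trim($_SERVER['HTTP_IF_NONE_MATCH']) === $etag)
    {
      header('HTTP/1.1 304 Not Modified');
      exit;
    }

    echo $html;

With this in place, an unchanged page costs one small round trip instead of a full download, which is usually enough to make repeat visits on the kiosk feel instant.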
If you want to do something even more sophisticated, you can always put a custom proxy server on the kiosk; in that case you'd have total, centralized control over how caching is done.
