On a new project with a lot of traffic, we are thinking about how to structure our Symfony2 app to take advantage of caches, and to be ready to cache more aggressively in the future. I'd love to know your opinion.
Let's say a user requests a page with a list of places. This page has:
- the list
  - common data (title, author, description)
  - user data (the user likes the list + other data)
- the first 20 places
  - common data (title, photo of each place)
  - user data (the rates of the user for those places)
The HTML could be like:
<html>...
<body>
<header>
...
<!-- Embed the top user menu -->
<esi:include src="http://example.com/profile/menu" />
...
</header>
<content>
...
common data of the list
...
<!-- Embed the common data of the first 20 places, the same for everyone -->
<esi:include src="http://example.com/lists/17/places" />
...
<!-- Embed the user data of the list (used in JS) -->
<esi:include src="http://example.com/lists/17/user" />
...
<!-- Embed the user data of the list of places (used in JS) -->
<esi:include src="http://example.com/lists/17/places/user" />
...
</content>
</body>
</html>
The HTML will be cached on the gateway (Symfony or Varnish). The list of places will be cached on the gateway most of the time too. The user data requests will be the only ones that actually hit the backend and are not cached (at least not initially).
Questions:
How do you feel about this structure?
If the user is anonymous, can I avoid making the ESI includes for the user data? What if I have a cookie for the anonymous user? How?
Does the ESI include for the user menu make sense?
Or should we forget about ESI and go always through the controller (caching the rendered view of the common data for example)?
Should we turn the 2 ESI requests that ask for user data into AJAX calls, instead of waiting for them on the server?
Is this a good approach to scale if we need to do it fast? What would be best?
thanks a lot!
We have used Varnish on one site for whole-page caching and I've been using Symfony2 for a few years, but keep in mind that I haven't used Varnish + Symfony2 + ESI in any production environment.
I think the basic idea is OK. If the menu is the same on many pages, and the list of places is too, the common content gets cached by Varnish or by Symfony's reverse proxy cache. As Varnish usually holds its cache in memory, you get your content faster and don't have to run the rendering and DB-querying code on each request.
The hard part is getting those ESI requests cached when the user is logged in. As far as I know, in the default Varnish configuration, requests that carry a Cookie header are never cached. If you pass cookies along to the ESI requests, those ESI responses will not be shared between users.
You can try making some rules based on the URL, but if you use the default Symfony Twig helpers, the generated URLs are /_internal/..., so it might be hard to tell the public ones from the private ones.
You can also configure the cache to ignore any cookies whenever Cache-Control: public is set. Symfony's reverse proxy does this by default:
if ($this->isPrivateRequest($request) && !$response->headers->hasCacheControlDirective('public')) {
    $response->setPrivate(true);
}
As you can see from the code, if you have the public directive, the response will never be marked private.
I haven't found out exactly how Varnish processes this directive - as I understand it, by default it simply does not cache any request that carries a cookie. So I think you have to tweak the configuration to accomplish this.
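As a sketch of what that tweak could look like (untested, and the URL pattern is just taken from the example page above): in vcl_recv you can drop the Cookie header for fragment URLs you know are public, so Varnish's default "never cache requests with a Cookie" behavior does not apply to them:

```vcl
sub vcl_recv {
    # These list fragments are the same for everyone (the URL pattern is an
    # assumption based on the /lists/17/places example above), so drop the
    # cookie and let Varnish cache and share them normally.
    if (req.url ~ "^/lists/[0-9]+/places$") {
        unset req.http.Cookie;
    }
}
```

The user-data fragments keep their cookie and therefore keep bypassing the cache, which matches the intent of the original structure.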
If the main page is also to be cached, I don't see how you could skip the includes.
I assume JS is available for your registered users (as opposed to search bots), so I would suggest using JavaScript to defer the loading of user data.
The JavaScript code can check whether the user has a session-id cookie etc. and request the user data only in that case. It might also be a good idea to set some other cookie, like _loggedin, so the JavaScript code does not have to touch the session id itself.
Users who are not logged in can also have some data in cookies, like _likedPost:1,2,132. JavaScript can read this cookie and make some HTML corrections without even making the additional request.
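For example, reading such a cookie could look like this (the _likedPost name and comma-separated format are just the example from above; adapt to your own scheme):

```javascript
// Sketch: read a JS-only cookie like "_likedPost=1,2,132" out of a cookie
// header string and return the liked post ids as numbers.
function likedPostsFrom(cookieString) {
  var match = /(?:^|;\s*)_likedPost=([^;]*)/.exec(cookieString || '');
  if (!match || match[1] === '') {
    return [];
  }
  return match[1].split(',').map(Number);
}

// In the browser you would call it with document.cookie and then adjust the
// markup, e.g.:
// likedPostsFrom(document.cookie).forEach(function (id) {
//   var el = document.querySelector('[data-post-id="' + id + '"]');
//   if (el) { el.classList.add('liked'); }
// });
```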
Here is what we did with these cookies: we separated JS-only cookies from application cookies by a naming pattern, like _\w for JS cookies. Then we tweaked the Varnish configuration to split the Cookie header and remove these JS-only cookies. If no other cookie is left, the response is shared with everyone. The application (Symfony) never sees those cookies, as they are stripped.
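The filtering rule itself is simple; here it is sketched in JavaScript just to show the logic (in Varnish you would implement the same thing in vcl_recv with regsuball()):

```javascript
// Drop every JS-only cookie (name starting with "_", per the convention
// described above) from a Cookie header, keeping only the cookies the
// application should see.
function stripJsCookies(cookieHeader) {
  return cookieHeader
    .split(';')
    .map(function (part) { return part.trim(); })
    .filter(function (part) { return part !== '' && part.charAt(0) !== '_'; })
    .join('; ');
}
```

If the result is empty, the cache can unset the Cookie header entirely, and the response becomes shareable between all users.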
I think it does, if it is the same on every page.
I think ESI is good because Varnish can hold its cache in memory, so it might not even touch your hard disk for the content. Even if your controller cache is also in-memory, I think Varnish would look up the cache faster than the Symfony framework with all its routing, PHP code, service initialization etc.
It depends, but I think it could be the better approach. Keep in mind that the caches live different lives. For example, if your places list is cached for 2 hours, the places can change before that time is up - some items on the list are new and some are gone. The list you serve the user is still the old (cached) one, but you provide user data for the new list - some of that data is not needed, and some of it is missing.
It might be a better approach to have JavaScript discover the loaded places, for example by searching for an HTML attribute like data-list-item-id, and then make an AJAX request querying the data for exactly those items. That way your user data stays synchronized with the currently cached list, and you can make 1 AJAX request for both lists instead of 2.
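A sketch of that idea (the /lists/user-data endpoint and its ids parameter are made up for the example - substitute your own API):

```javascript
// Build one user-data request for whatever items are actually present in
// the cached HTML, so the request always matches the cached list.
function userDataUrl(ids) {
  return '/lists/user-data?ids=' + ids.join(',');
}

// Browser usage:
// var ids = Array.prototype.map.call(
//   document.querySelectorAll('[data-list-item-id]'),
//   function (el) { return el.getAttribute('data-list-item-id'); }
// );
// fetch(userDataUrl(ids))          // one request for list + places
//   .then(function (r) { return r.json(); })
//   .then(applyUserDataToDom);     // hypothetical rendering function
```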
As long as cache invalidation (PURGE requests) is not used, the whole HTTP cache scheme is indeed good for scaling. You can scale the application to several servers and configure Varnish to call them randomly, by some rule, or just to use one of them as a failover. If the load is still too high, you can always adjust the cache timeouts and other configuration.
I just added some functionality to my site: when a user hovers their mouse over a link (to a 3rd-party page), a preview of the link is built from the meta tags on the target page and displayed. I'm worried about the hot-linking implications of my current implementation.
I'm now thinking of implementing some kind of server-side caching such that the first request for a preview fetches the info and image from the target page, and each subsequent request (up to some age limit) is served from a cache on my host. I'm relatively confident that I could implement something of my own, but is there an off-the-shelf solution for something like this? I'm self-taught, so I'm guessing my DIY solution would be less than optimal. Thanks.
Edit: I implemented a DIY solution (see below), but I'm still open to suggestions as to how this could be accomplished efficiently.
I couldn't find any off-the-shelf solutions so I wrote one in PHP.
It accepts a URL as an HTTP GET parameter and does some error checking. If the checks pass, it opens a JSON-encoded database from disk and parses the data into an array of Record objects that contain the info I want, keyed by the supplied URL. If the key exists in the array, the cached info is returned. Otherwise, the web page is fetched, its meta tags are parsed, the image is saved locally, and the data is returned and inserted into the database. After the cached info is returned to the requesting page, each record's expiration date is examined and expired records are removed. Each request for a cached record extends its expiration date. Lastly, the database is JSON-encoded and written back to disk.
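The core of that flow can be sketched like this (in JavaScript rather than the original PHP; the one-day TTL and the shape of the fetch function are assumptions, and the disk persistence - reading/writing the JSON store with file I/O - is left out for brevity):

```javascript
// Cache of preview records keyed by URL, each with an expiration that is
// extended on every hit. Expired records are pruned after each lookup.
var TTL_MS = 24 * 60 * 60 * 1000; // assumption: one-day age limit

function getCached(store, url, fetchPreview, now) {
  now = now || Date.now();
  var record = store[url];
  if (record && record.expires > now) {
    record.expires = now + TTL_MS; // each hit extends the expiration
    return record.data;
  }
  var data = fetchPreview(url); // on a miss: fetch page, parse meta tags
  store[url] = { data: data, expires: now + TTL_MS };
  // prune expired records
  Object.keys(store).forEach(function (key) {
    if (store[key].expires <= now) { delete store[key]; }
  });
  return data;
}
```

In the real script the store would be loaded from disk at the start of the request and written back at the end, as described above.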
I need to implement like/dislike functionality (for anonymous users, so there is no need to sign up). The problem is that the content is served by Varnish and I need to display the actual number of likes.
I'm wondering how it's done on a website like Stack Overflow. Assuming pages are cached in Varnish (for anonymous users only), every time a user votes on an answer/question, the page needs to be purged from the cache. Am I right? The current number of votes needs to be visible to other users.
What is a good approach in this situation? Should I send a PURGE to Varnish every time a user hits the "like" button?
A common way of implementing this is to handle the like button and the count display client side in JavaScript instead. This largely sidesteps the issue.
Assuming that pressing Like leads to a POST request hitting a single Varnish server, you can have the object invalidated/replaced in different ways. Using a purge and a VCL restart is most likely the better way to do this.
Of course there is a slight race here, where other clients will be served the old page while this is ongoing.
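The client-side variant mentioned first could look roughly like this (the /votes endpoint, its response shape, and the data-vote-id attribute are all assumptions for the sketch):

```javascript
// The cached page ships without vote counts; JavaScript fills them in from
// a small uncached endpoint, so the page itself stays shareable in Varnish.
function applyVotes(counts, setText) {
  Object.keys(counts).forEach(function (id) {
    setText(id, String(counts[id]));
  });
}

// Browser usage:
// fetch('/votes?ids=12,34', { cache: 'no-store' })
//   .then(function (r) { return r.json(); })   // e.g. {"12": 5, "34": 0}
//   .then(function (counts) {
//     applyVotes(counts, function (id, text) {
//       document.querySelector('[data-vote-id="' + id + '"]').textContent = text;
//     });
//   });
```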
I am trying to use Varnish to cache a page that has some user specific text and links on it. The best way to cache such pages is via Edge Side Includes.
Context
My web application is RESTful and does not support sessions, or even cookies for that matter. Every source URL is complete in the sense that it contains a user-specific query parameter identifying a unique user. The most-visited pages in the application are listing pages. I just need to show the user's email in the header, and the links on the page must also carry the user-specific query parameter forward so as to simulate logged-in behavior. Page contents are supposed to be the same for each user except for the header and those internal links.
I tried to use <esi:include /> for such areas on the page but, obviously, could not include the user-specific parameter in the page source (else the first user's hit would be cached with that user's parameter and served to every subsequent user). So I tried stripping the user-specific parameter in Varnish's vcl_recv subroutine and storing it temporarily in a header such as req.http.X-User just before the lookup. Each source URL then gets hashed with a req.url that contains no user-specific parameters and hence does not create duplicate cache objects per user.
Question
I would like to read the user-specific parameter from req.http.X-User and hash the user-specific ESI requests by adding this value to each ESI URL as a query parameter. I do not see a way to share query parameters between a source request and its included ESI requests. Could someone help?
I have tried to depict my objective in the following diagram:
I guess your problem is that the ESI call itself is going to be cached, including any query strings in the URL.
I can't remember the specifics, but I think you can get Varnish to pass cookies through to the ESI requests, so you could store the value in a cookie (encrypted?) and have it read by whatever handles the ESI call.
Or maybe you can get it to pass the HTTP headers through? In that case the value can be read directly from the HTTP header.
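One untested sketch of the header route (Varnish 3-style VCL; the /esi/ URL prefix and the regexes are assumptions - as far as I can tell, ESI subrequests inherit the parent request's headers, so X-User set in vcl_recv should be visible when the fragment is looked up):

```vcl
sub vcl_recv {
    # Move the user parameter out of the URL into a header. These regexes
    # are simplistic - adjust them for your actual URL scheme.
    if (req.url ~ "[?&]user=") {
        set req.http.X-User = regsub(req.url, ".*[?&]user=([^&]*).*", "\1");
        set req.url = regsuball(req.url, "[?&]user=[^&]*", "");
    }
}

sub vcl_hash {
    hash_data(req.url);
    hash_data(req.http.host);
    # Only the user-specific fragments vary per user; the parent page and
    # the common fragments stay shared.
    if (req.url ~ "^/esi/" && req.http.X-User) {
        hash_data(req.http.X-User);
    }
    return (hash);
}
```

This keeps the user value out of the cached page source entirely while still giving each user their own copy of the personalized fragments.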
I am using Varnish to enhance performance on my Magento store.
My problem is that Varnish is caching the top-links block, including the number of items in the cart.
I was thinking of using an Ajax call after the page loads, but I'm not sure how to implement it. Any suggestions?
Thanks
If you want to implement this via ajax, here's one possible approach:
Backend work:
For each action that modifies the number of items in the cart, observe the event and fire a method that updates a cookie on the client with the data you need. You can keep it simple and store a JSON structure: {"cartItem": 2, "isLoggedIn": false}. Some events to observe:
controller_action_postdispatch_checkout
controller_action_postdispatch_customer
checkout_onepage_controller_success_action
Create a controller/action that returns the exact same data structure (and sets the cookie while it's at it).
Frontend work:
On DOM ready, your code should look for the cookie set by the backend. If it doesn't exist, make an ajax request to the controller to fetch it.
Once it has the necessary data, update the values in the DOM as necessary.
You'll want to make sure you listen to all the necessary events. Using the cookie helps speed things up on the client side and reduces the number of HTTP requests the browser needs to make.
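The frontend half could be sketched like this (the cart_state cookie name is an assumption - use whatever name your backend observers set):

```javascript
// Read the JSON cookie written by the backend observers; return null when
// it is missing or malformed so the caller can fall back to the ajax call.
function readCartState(cookieString) {
  var match = /(?:^|;\s*)cart_state=([^;]*)/.exec(cookieString || '');
  if (!match) { return null; }
  try {
    return JSON.parse(decodeURIComponent(match[1]));
  } catch (e) {
    return null;
  }
}

// On DOM ready (adapt to Prototype, which Magento ships, or jQuery):
// var state = readCartState(document.cookie);
// if (state) { updateTopLinks(state); }         // hypothetical DOM updater
// else { /* ajax GET to the controller above, which also sets the cookie */ }
```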
To improve performance, I'd like to add a fairly long Cache-Control (up to 30 minutes) to each page, since the pages do not change often. However, each page also displays the name of the logged-in user (like this website).
The problem is when the user logs in or logs out: the user name must change. How can I change the user name after each login/logout action while keeping a long Cache-Control?
Here are the solutions I can think of:
An Ajax request (not cached) to retrieve and display the user name. If I have 2 requests (/user?registered and /user?new), they could be cached as well. But I am afraid this extra request would nullify the performance gain from caching
Add a unique URL variable (?time=) to make the URL different and bypass the cache. However, I would have to add this variable to all links on my webpage, which is not very convenient code-wise
This problem becomes bigger if I actually have more content that differs between registered users and new users.
Cache-Control: private
Is usually enough in practice. It's what SO uses.
In theory, if you needed to allow for variable logins from the same client, you should probably set Vary: Cookie (assuming that's the mechanism you're using for login). However, this value of Vary (along with most others) messes up IE's caching completely, so it's generally avoided. Also, it's often desirable to let the user step through the back/forward list, including logged-in/out pages, without re-fetching.
For situations where enforcing proper logged-in-ness on every page is critical (such as banking), a full Cache-Control: no-cache is typically used instead.