what is the best practice for coding user page? - performance

I just need a quick answer, I would like to know what is the best practice for coding an user page? Should i create 2 differents pages one for myself and one for other user's or only create one page, and modify the page programatically ( ex:if personnal page do this , else do that)?

A quick answer: from network/browser performance perspective it is better to create as consize pages as they could be. I.e. separate functionality (view-only/view-modify) should get a separate page since you would load different resources for each one and the functionality you don't need would affect the loading time (downloading, parsing, compiling, rendering).
Though it really depends on the environment of your website (network speed, user's device hardware) and size of your resources, since saving 5Kb between a view-only and a view-modify template would be an overkill.
So the best practice would be the following: understand your website requirements (page loading time, page navigation time), understand your website limitations (page size, network speed, user's device hardware), understand your time bounds (when your website need to go live) and basing on that choose your architecture.

Related

Magnolia "Page Aware" Caching

I am using Magnolia v5.7.1 and just configured the advanced cache module for site aware caching. Before that, the default behavior was to flush all caches in case of any (activation, import, edit) in a workspace. Using the advanced cache module, if any content on a specific site is changed, only the corresponding caches are flushed. So far, so good.
Now, let's say pages A and B are cached. If page A is changed, this will flush the cache for page A and B (as long as both pages are on the same site). I am wondering if there is a good reason that the default behavior isn't the following: If page A is changed, only the cache for page A gets flushed.
I know it's possible to implement my own FlushPolicy, however, this seems to be a difficult task and perhaps I miss a good reason why "page aware" caching can't be done.
The good reason to flush all is that change in one page might affect others. It is quite common to eg. generate menu from page strucuture, thus renaming one page would affect all pages that show menu. Or eg. have teaser component in a page that will take abstract text and image from the page it is teasing. And so on. In short, without calculating dependency graph there is no way for system to know which pages rendering might be affected by which other page changes. And in some cases it can be near impossible to know. Imagine event calendar page, with subpages for each event. And calendar being composed by query that searches for all events in current month. Since dynamic query is involved, calculating dependency graph just gets even more complicated.
That said it would still be possible to calculate dependencies and flush only affected pages, but in reality, for most cases, effort (cpu time) to calculate such graph is bigger than simply flushing all and re-rendering the pages as rendering is cheap (except special cases). On top of that it's also much faster to drop all items in the cache than retrieving them one by one and flushing only those that need to be flushed.
TLDR; For majority of the websites, it's not worth the effort to try to do very smart page cache management as costs outweigh the benefits w/o really affecting the performance.

"Time to Interact" metric in web performance measurements

Apparently "Time to Interact" is the new metric to use when measuring the perceived speed of a webpage. I'm interested in understanding a bit more about what this actually is.
The term was apparently coined by Radware, and is being pushed as the most meaningful performance measurement (compared to things such as Time to First/Last Byte, Time to Render etc.).
It is described as:
the point which a page displays its primary interactive (think
clickable) content, rather than full page load.
This seems pretty subjective to me; what is the "primary interactive content" of a webpage for example?
There have been reports citing results for the measurement, so some how this is being measured, and further, it must be automated as the result sets are pretty big (~500 sites were tested).
Other than the above quote, I cannot find any more information on how to measure this.
As Google are placing more emphasis on above the fold content (or visible content), I am wondering whether this metric is actually more like "Time to First Meaningful Render", i.e. it is contextual to the current page goal. So for example, on an eCommerce site's product page, this could be the main image, and an add to basket link.
I am keen to understand this metric, as to me it does seem like the most useful one. My question is therefore whether anyone is measuring this, and if so how are they doing so?
You kind of answered your own question, it is subjective, and contextual to you current project.
What if I'm testing a site with only HTML without any complex resources? There is no point measuring TTI there. On the other hand, let's see this demo site.
Bigger picture here.
Blue line is marking the "COMContentLoaded" event (main document is loaded and markup parsed), red line indicates the load event, where all page resources are loaded. The TTI line would go in-between the two lines, that is defined differently for each project, based on some essential to interact resources loaded event.
For example, let's say that the pictures on the demo site are not essential to the core features of the site. While the main site loaded in 0.8 seconds, the 3 big pictures took 36 extra seconds to load, so in this case using the overall response time as a KPI would yield ~36second response time, while if you define TTI excluding those big, non essential resources, you end up with < 1s response time.
I am keen to understand this metric, as to me it does seem like the most useful one.
Definitely useful, but as you said it in your question, it's specific to the project. You wouldn't measure TTI on a simple, relatively static web app, you would probably measure overall response time. I always define KPIs "tailored" for the current project, instead of trying to use common metrics, and "force them" on a project.
My question is therefore whether anyone is measuring this, and if so how are they doing so?
Definitely used it before, you should identify the essential resources for your site, and when the last of those resources are loaded, that is your TTI. This could be a javascript file, a css, etc...
Websites are getting more complex. Whereas they might not always contain more content they still have more resources to load as the user interaction/user experience is more complex from a technical point of view. Ajax helps us to load different parts separately. So rather than one page load we have the loading of several small things. And for each of these parts we can measure the loading performance. But there might some parts on the site that might be more important than others. The "primary interactive content" is that part of your view that enables the user to do what he intends to do, for example buy a train ticket. If some advertisement or a special animation on the left side of the screen hasn't loaded this does not prevent the user to buy start buying a ticket. But of course "primary interactive content" as a term is quite vague and you have to define it for your specific application. It is the point an average user can and will start to interact with the website while some parts are sill loading.
This is how I understand the concept and I see the difference to "Time to First Meaningful Render" here: you might have a basket rendered on your eCommerce page but the GUI is not yet responsive. So you see something meaningful but the interactivity is not yet there. Therefore TTI >= TtFMR.
Measuring TTI requires you to define what elements are required for interactivity which not only depends on what the site does but also HOW it does it. So it highly depends on your implementation/technology.

When is it better to generate a static page or dynamically generate?

The title pretty much sums up my question.
When is it more efficient to generate a static page, that a user can access, as apposed to using dynamically generated pages that query a database? As in what situations would one be better than the other.
To serve up a static page, your web server just needs to read the page off the disk and send it. Virtually no processing will be required. If the page is frequently accessed, it will probably be cached in memory, so even the disk access will not be needed.
Generating pages dynamically obviously has more overhead. There is a cost for every DB access you make, no matter how simple the query is. (On a project I worked on recently, I measured a minimum overhead of 0.7ms for each query, even for SELECT 1;) So if you can just generate a static page and save it to disk, page accesses will be faster. How much faster? It just depends on how much work is being done to generate the page dynamically. We don't know what you are doing, so we can't comment on that.
Now, if you generate a static page and save it to disk, that means you need to re-generate it every time the data which went into generating that page changes. If the data changes more often than the page is actually accessed, you could be doing more work rather than less! But in most cases, that's a very unlikely situation.
More likely, the biggest problem you will experience from generating static pages and saving them to disk is coding (and maintaining) the logic for re-generating the pages whenever necessary. You will need to keep track of exactly what data goes into each page, and in every place in the code where data can be changed, you will need to invoke re-generation of all the relevant pages. If you forget just one, then your users may be looking at stale data some of the time.
If you mix dynamic generation per-request and caching generated pages on disk, then your code will be harder to read and maintain, because of mixing the two styles.
And you can't really cache generated pages on disk in certain situations -- like responding to POST requests which come from a form submission. Or imagine that when your users invoke certain actions, you have to send a request to a 3rd party API, and the data which comes back from that API will be used in the page. What comes back from the API may be different each time, so in this case, you need to generate the page dynamically each time.
Static pages (or better resources) are filled with content, that does not change or at least not often, and does not allow further queries on it: About Page, Contact, ...
In this case it doesn't make any sense to query these pages. On the other side we have Data (e.g. in a Database) and want to query it/give the user the opportunity to query it. In this case you give the User a page with the possibility to specify the query and return a rendered page with the dynamically generated data.
In my opinion it depends on the result you want to present to the user. Either it is only an information or it is the possibility to query a Datasource. The first result is known before you do something, the second (query data) is known after you have the query parameters, which means you don't know the result beforehand (it could be empty or invalid).
It depends on your architecture, but when you consider that GET Requests should be idempotent it should be also easy to cache dynamic Pages with a Proxy, and invalidate the cache, when something new happens to the data which is displayed on the cached path. In this case one could save a lot of time, because the system behaves like the cached pages would be static, but instead coming from the filesystem, they come from your memory, which is really fast.
Cheers
Laidback

Programmatically maintaining sitemaps

Hoping someone can chime in on an ideal methodology.
I don't want to run my site through a crawler every month to add new pages to my sitemap, I'd like some robust systematic method to do so, because maintaining it by hand seems very prone to ahem human forgetfulness. Is there some sorta way to programmatically validate new controllers, controller methods, views, etc. to some special controller? What I'm picturing is some mechanism that enforces updating the sitemap whenever you create a new controller method or view. I work in LAMP stack if that's relevant. This guy here is doing it through the file system and that's not what I want for a public facing sitemap.
Perhaps there's another best practice for this type of maintenance other than the concept I'm proposing. Would love to hear how everyone else does this! :)
If your site is content based, best practise is reading database periodically and generating each contents link. With this method you can specify some subjects are more prior or vice versa in sitemap.
That method already mentioned before at topic that you linked.
Else, you can hold a visited pages list (static) at server-side. Or just log them. After recording your site traffic, without blocking the user experience, I mean asynchronously, check the sitemaps and add your page links there. You can specify priority with this method too, by visiting intensity of your pages and some statistical logic.

Strategies for Caching on the Web? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 4 years ago.
Improve this question
What concerns, processes, and questions do you take into account when deciding when and how to cache. Is it always a no win situation?
This presupposes you are stuck with a code base that has been optimized.
I have been working with DotNetNuke most recently for web applications and there are a number of things that I consider each time I implement caching solutions.
Do all users need to see cached content?
How often does each bit of content change?
Can I cache the entire page?
Do I need a manual way to purge the cache?
Can I use a single cache mechanism for the entire site, or do I need multiple solutions?
What impacts occur if informaiton is somehow out of date?
I would look at each feature of your website/application a decided for each feature:
Should it be cached?
How long should it be cached for?
When should the cache be expunged?
I would personally go against caching whole pages in favour of caching sections of the website/application.
First off, if your code is optimized as you said, you will only see noticable performance benefits when the site is being hammered with a lot of requests.
However, It is faster to pull resources from RAM than from the disk, so your web server will be able to handle more requests if you have a caching strategy in place.
As for knowing when you're going to need caching, consider that even low end modern web servers can handle hundreds of requests per second, so unless you expect a decent amount of traffic, caching is probably something you can just skip.
Also, if you are pulling content from your database (for example, StackOverflow probably does this) caching can be very helpful because database operations are relatively expensive and can be a huge bottleneck in high-volume situations.
As for a scenario when it's not appropriate to cache or when caching becomes difficult... If you try to cache a dynamic page that, say, displays the current date and time, you will constantly see an old date/time unless you get a little more involved with your caching strategy. So that's something to think about.
What language are you using? With ASP you have some very easy caching with only adding some property tag over the method and the value is cached depending of the time.
If you want more control over the cache, you can use some popular system like MemCached and have a control with time or by event.
Yahoo for example "versions" their JavaScript, so your browser downloads code-1.2.3.js and when a new version appears they reference that version. By doing this they can make their Javascript code cacheable for a very-very long time.
As for the general answer I think it depends on your data, on how often does it change. For example, images don't change very often, but html pages do. The "About us" page doesn't change too often, but the news section does.
You can cache by time. This is useful for data that change fast. You can set time for 30 sec or 1 min. Of course, this require some traffic. More traffic you have, more you can play with the time because if you have 1 visit every hour, this visit will be populate the cache and not using it...
You can cache by event... if your data change, you update the cache... this is one very useful if the data need to be accurate for the user very fast.
You can cache static content that you know that won't change ofen. If you have a top 10 of the day that refresh every day, than you can stock all in the cache and update every day.
Where available, look out for whole object memory caching. In ASPNET, this is a built-in feature where you can just plant your business logic objects in the IIS Application and access them from there.
This means you can store everything you need to generate a page in memory (persisting writes to database) and generate a page without ANY database IO.
You still need to use the page-building logic to generate the page, but you save a lot of time in getting the data.
Other techniques involve localised output caching, where you capture the output before sending and save it to file. This is great for static sections (like navigation on certain pages, or text bodies) and include them out when they're requested. Most implementations purge cached objects like this when a write happens or after a certain period of time.
Then there's the least "accurate": whole page caching. It's the highest performer but it's pretty useless unless you have very simple pages.
What kind of caching? Server side caching? Client side caching?
Client side caching is a no-brainer with certain things, like Static HTML, SWFs and images. Figure out how often the assets are likely to change, and set up "Expires" headers as appropriate. (2 days? 2 weeks? 2 months?)
Dynamic pages, by definition, are a little harder to cache. There have been some explorations in caching of certain chunks using Javascript (and degrading to IFrames if JS is not available.) This however, might be a little more difficult to retrofit into an existing site.
DB and application level caching may, or may not work, depending on your situation. That really depends on where your bottlenecks are. Figuring out where your application spends the most time on page-rendering is probably priority 1, then you can start looking at where and how to cache.

Resources