How to do cache warmup in TYPO3 - caching

In another question, there is a recommendation to set up cache_clearAtMidnight via TypoScript and do a subsequent cache warmup.
I would like to know how to do this cache warmup, because I did not find a scheduler task for it.
(Clearing the entire cache once a day seems excessive, but the cache warmup seems like a good idea to me in any case.)

As I don't know whether TYPO3 has an internal mechanism for cache warming, I built my own little cache warmer based on a simple PHP script (it could just as well be Python, Bash, or anything else). The script reads the sitemap.xml and requests each page via cURL.
I use a custom user agent to exclude these requests from statistics.
curl_setopt($ch, CURLOPT_USERAGENT, 'cache warming - TYPO3');
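A minimal sketch of what such a warmer could look like, assuming a flat sitemap (a plain urlset with <loc> entries rather than a sitemap index); the sitemap URL is a placeholder:
<?php
// Minimal cache-warmer sketch: fetch every URL listed in the sitemap once.
// Assumptions: flat urlset sitemap, reachable without authentication.
$sitemapUrl = 'https://www.example.org/sitemap.xml'; // placeholder
$sitemap = simplexml_load_file($sitemapUrl);
if ($sitemap === false) {
    exit('Could not load sitemap.' . PHP_EOL);
}
foreach ($sitemap->url as $entry) {
    $pageUrl = (string) $entry->loc;
    $ch = curl_init($pageUrl);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // body is discarded, the request itself warms the cache
    curl_setopt($ch, CURLOPT_USERAGENT, 'cache warming - TYPO3'); // custom UA to filter these hits from statistics
    curl_exec($ch);
    echo curl_getinfo($ch, CURLINFO_HTTP_CODE) . ' ' . $pageUrl . PHP_EOL;
    curl_close($ch);
}
Run as a cron job or Scheduler task shortly after the cache expires, this keeps the first real visitor from paying the rendering cost.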

You can also use this extension. It is a simple wget wrapper, but you can add it as a Scheduler task:
https://github.com/visuellverstehen/t3fetch

There are extensions available to do cache warmup:
crawler
b13/warmup
See also this relatively new blog post (part 1) on caching by Benni Mack:
Caching in TYPO3 - Part 1
In general, there are a number of other things to consider as well, e.g. changing the cache duration, optimizing pages so they load faster even when not cached, etc.
By the way, cache_clearAtMidnight does not clear the cache at midnight; it sets the cache expiry time to midnight. Once the cache has expired, the page is regenerated on the next hit. The effect is the same, but it is good to know.

Related

JMeter - Throughput increases after ehcache clear

While executing a JMeter test on the staging environment, we ran the ehcache clear command, which removed the entire site cache. Since ehcache was cleared, we expected performance and throughput to drop for some time. Instead, the number of transactions per second (throughput) increased drastically.
What can explain this?
It could be a bug or a wrong ehcache implementation; you can check a detailed write-up of how ehcache was dissected:
...database connections were kept open. Which meant that the database started to slow down. This meant that other activity started to take longer as well...
and in summary:
for a non-distributed cache, it performs well enough as long as you configure it okay.
Also check the guidelines, which conclude in an interesting way:
we learned that we do not need a cache. In fact, in most cases where people introduce a cache it is not really needed. ...
Our guidelines for using a cache are as follows:
You do not need a cache.
Really, you don't.
If you still have a performance issue, can you solve it at the source? What is slow? Why is it slow? Can you architect it differently to not be slow? Can you prepare data to be read-optimized?
If it's not due to a slow ehcache configuration, I suppose the explanation could be that:
You don't have a Response Assertion in your test plan, so results are judged only by the response code, which might be 200 even though the returned page was not the one you requested.
The pages served after the clear may be lighter (error pages? default pages?) and do not require as much processing to generate.
See:
http://www.ubik-ingenierie.com/blog/best-practice-using-jmeter-assertions/

Magento Admin suddenly slowed down

We have Magento EE 1.14. The admin was working fine until two days ago, when its speed dropped dramatically. The frontend is not affected, and there were no changes in code or server configuration. Here is what I tried to fix the problem, but nothing worked:
Log cleaning is properly configured
Removed two unused extensions, but no improvement
Tried disabling non-critical extensions to see if speed would improve, but no luck
I cannot use a Redis cache at this time, but I have configured a new server that uses Redis and will move to it next month
Sometimes the backend gains speed for a few minutes
I enabled the profiler; the source of the delay is Mage (screenshot attached).
Here are my questions:
Is there any way to know the exact reason for the Mage delay?
Are there other tests I can use to identify the cause of the delay?
Thanks in advance,
It could be a delay on connections to external resources. Do you have New Relic or similar software? Check there for slow connections. If you don't have New Relic, profile the admin with blackfire.io. The Magento profiler is really unhelpful :)
Follow the steps below:
Delete unused extensions
It is best to remove unused extensions rather than just disabling them. If you disable an extension, it still exists in the database, which not only increases the size of the database but also adds to its read time. So keep your approach clear: if you don't need it, delete it.
Keep your store clean by deleting unused and outdated products
Keep in mind that a clean store is a fast store. You can make the front end fast by caching and displaying only a limited set of products even if you have more than 10,000 items in the back end, but you cannot escape their wrath. If the number of products keeps increasing in the backend, it may get slower, so it is best to remove unused products. Repeat this activity every few months to keep the store fast.
Reindexing
One of the basic reasons why website administrators experience slow performance while saving a product is reindexing. Whenever you save a product, the Magento backend starts to reindex, and since you have a lot of products, it takes some time to complete, causing unnecessary delays (see the sketch after this list for one way to move reindexing out of the save path).
Clear the Cache
Caching is very important for any web application, so that the web server does not have to process the same request again and again.
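As a rough illustration of moving reindexing out of the save path (not specific to this site's setup): set the indexes to "Manual Update" in the admin and run something like the following from an off-peak cron job. The paths and bootstrap assume a standard Magento 1.x root.
<?php
// Hypothetical cron script for Magento 1.x: reindex everything off-peak
// instead of on every product save (assumes it lives in the Magento root).
require_once __DIR__ . '/app/Mage.php';
Mage::app();
$indexer = Mage::getSingleton('index/indexer');
foreach ($indexer->getProcessesCollection() as $process) {
    echo 'Reindexing ' . $process->getIndexerCode() . PHP_EOL;
    $process->reindexAll();
}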

Shouldn't CloudFlare with "Cache Everything" cache everything?

I have a CloudFlare account and found out that if I use page rules, I can use a more aggressive cache setting called "Cache Everything". From what I read about it, I understood that it should basically cache everything. I tested it on a site that is completely static, and I set the expiration time to 1 day.
Now, after a few days of looking at how many requests have been served from the cache versus not, there's no change: still about 25% of the requests have not been served from the cache.
The two rules I've added for Cache Everything are:
http://www.example.com/
and
http://www.example.com/*
Both with Cache Everything and 1 day expiration time.
So my questions are: have I misinterpreted the use of Cache Everything (I thought I should only get one request per page/file per day with this setting), or is something wrong with my rules? Or do I maybe need to wait a few days for the cache to kick in?
Thanks in advance
"Or maybe do I need to wait a few days for the cache to kick in?"
Our caching is really designed to function based on the number of requests for the resources (a minimum of three requests), and works basically off of the "hot files" on your site (frequently requested) and is also very much data center related. If we get a lot of requests in one data center, for example, then we would cache the resources.
Also keep in mind that our caching will not cache third-party resources that are on your site (calls to ad platforms, etc.).

Performance of memcache on a shared server

Lately I've been experimenting with increasing performance on my blog: not just one-click fixes, but also looking at the code, in addition to other things like a CDN, caching, etc.
I talked to my host about installing memcache so I can enable it in W3 Total Cache and he seems to think it will actually hinder my site as it will instantaneously max out my RAM usage (which is 1GB).
Do you think he is accurate, and should I try it anyway? My blog and forum (MyBB) get a combined 200,000 pageviews a month.
In fact, with 200,000 pageviews a month I would move away from a shared host and buy a VPS or dedicated server or something. Memcache(d) is a good tool indeed, but there are lots of other ways to get better performance.
Memcached is good if you know how to use it correctly (the W3 Total Cache memcached integration doesn't really do the job).
As a performance engineer, I think a lot about speed, but also about server load. I work a lot with WordPress sites, and the way I push performance to the maximum on my servers is to generate static HTML pages of my WordPress sites. This results in zero or minimal access to the PHP handler itself, which increases performance a lot.
On top of that, you can add another caching proxy in front of the web server, e.g. Varnish, which caches responses, so requests may never even touch the web server.
When a client requests your page, it serves the already processed page directly from memory, which is pretty fast. You then have a TTL on your files, which can be as low as 50 seconds (the default). That doesn't sound like a lot, but 200k pageviews a month works out to about 4.5 pageviews per minute if they were spread evenly, peak hours aside.
A single page view involves a lot of processing: the request reaches the web server, a PHP process is started, data is fetched from the database and processed, the PHP site is rendered, and so on. If we only have to do all of that for a few requests, performance improves a lot.
Often you should be able to generate HTML files of your forum too, renewed every 1-2 minutes when the file is requested; that means one request being fully processed instead of 4-9 (if not more).
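A very rough sketch of that static-HTML idea, assuming a single PHP entry point and a writable cache directory; the paths, the TTL, and the render-page.php include are made up for illustration:
<?php
// Serve a previously rendered copy of the page if it is fresh enough;
// otherwise render it once, store it, and serve the stored copy.
$cacheDir  = __DIR__ . '/html-cache';              // assumed writable directory
$ttl       = 120;                                  // seconds ("renewed every 1-2 minutes")
$cacheFile = $cacheDir . '/' . md5($_SERVER['REQUEST_URI']) . '.html';
if (is_file($cacheFile) && (time() - filemtime($cacheFile)) < $ttl) {
    readfile($cacheFile);                          // cache hit: no application code, no database
    exit;
}
ob_start();
require __DIR__ . '/render-page.php';              // hypothetical: the normal, expensive page rendering
$html = ob_get_clean();
file_put_contents($cacheFile, $html, LOCK_EX);     // store for subsequent requests
echo $html;
A caching proxy such as Varnish does essentially the same thing at the HTTP layer; the sketch only illustrates the principle.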
You can limit the amount of memory that memcached uses; if the memory is maxed out, the oldest entries are pruned. On CentOS/Debian there is /etc/default/memcached, and you can set the maximum memory with the -m flag.
In my experience, 64 MB or even 32 MB of memcached memory is enough for WordPress and makes a huge difference. Be sure not to cache whole pages (that fills the cache pretty fast); instead, use memcache for the WordPress object cache.
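To check whether such a small limit actually holds up, a quick sketch using the PHP Memcached extension (host and port are the usual defaults, adjust as needed):
<?php
// Report memcached memory usage against its configured limit
// (requires the php-memcached extension; 127.0.0.1:11211 is the default).
$mc = new Memcached();
$mc->addServer('127.0.0.1', 11211);
foreach ($mc->getStats() as $server => $stats) {
    printf(
        "%s: %.1f MB used of %.1f MB, %d evictions\n",
        $server,
        $stats['bytes'] / 1048576,
        $stats['limit_maxbytes'] / 1048576,
        $stats['evictions']
    );
}
A steadily climbing eviction count is the sign that the -m value is too small for what is being cached.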
For general performance: make sure you have a recent PHP version (5.3+) and APC installed. For database queries I would skip W3TC and go directly for the MySQL query cache.

Does Google Analytics have performance overhead?

To what extent does Google Analytics impact performance?
I'm looking for the following:
Benchmarks (including response times/pageload times et al)
Links or results to similar benchmarks
One (possible) method of testing Google Analytics (GA) on your site:
Serve ga.js (the Google Analytics JavaScript file) from your own server.
Update it from Google daily (test 1) and weekly (test 2).
I would be interested to see how this reduces the communication between the client webserver and the GA server.
Has anyone conducted any of these tests? If so, can you provide your results? If not, does anyone have a better method for testing the performance hit (or lack thereof) for using GA?
2018 update: Where and how you mount Analytics has changed over and over and over again. The current gtag.js code does a few things:
Load the gtag script but async (non-blocking). This means it doesn't slow your page down in any other way than bandwidth and processing.
Create an array on the page called window.dataLayer
Define a little gtag() function that just pushes whatever you throw at it into that array.
Call that with a page-load event.
Once the main gtag script loads, it syncs this array with Google and monitors it for changes. It's a good system, and unlike the previous approaches (e.g. stuffing code in just before </body>), it means you can call events before the DOM has rendered, and script order doesn't really matter, as long as you define gtag() first.
That's not to say there isn't a performance overhead here. We're still using bandwidth on loading up the script (it's cached locally for 15 minutes), and it's not a small pile of scripts that they throw at you, so there's some CPU time processing it.
But it's all negligible compared to (eg) modern frontend frameworks.
If you're going for the absolute, most cut-down website possible, avoid it completely. If you're trying to protect the privacy of your users, don't use any third party scripts... But if we're talking about an average modern website, there is much lower hanging fruit than gtag.js if you're hitting performance issues.
There are some great slides by Steve Souders (client-side performance expert) about:
Different techniques to load external JavaScript files in parallel
their effect on loading time and page rendering
what kind of "in progress" indicators the browser displays (e.g. 'loading' in the status bar, hourglass mouse cursor).
I haven't done any fancy automated testing or programmatic number crunching, but using good old Firefox with the Firebug plugin and a pair of JS variables to tell the time difference before and after all GA code is executed, here is what I found.
Two things are downloaded:
ga.js is the JavaScript file containing the code. This is 9kb, so the initial download is negligible and the filename isn't dynamic so it's cached after the first request.
a 35-byte gif file with a dynamic URL (via query string args), so this is requested every time. 35 bytes is a negligible download as well (Firebug says it took 70 ms to download it).
As far as execution time, my first request with a clean browser cache was an average of about 330ms each time and subsequent requests were between 35 and 130 ms.
In my own experience, adding Google Analytics has not changed the load times.
According to Firebug it loads in less than a second (648 ms average), and according to some of my other tests ~60%-80% of that time was spent transferring the data from the server, which of course will vary from user to user. I don't particularly think that caching the analytics code locally will change the load times much, for the above reasons.
I use Google Analytics on more than 40 websites without it ever being the cause of any slowdown, even a small one; most of the time is spent getting the images, which, given their typical sizes, is understandable.
You can host the ga.js on your servers with no problems whatsoever, but the idea is that your users will have the ga.js cached from some other site they may have visited. So downloading ga.js, because it's so popular, adds very little overhead in many cases (i.e., it's already been cached).
Plus, DNS lookups do not cost the same in different places due to network topology. Caching behavior would change depending on whether users use other sites that include ga.js or not.
Once the JavaScript has been loaded, the ga.js does communicate with Google servers, but that is an asynchronous process.
There's no/minimal site overhead on the server side.
The HTML for Google Analytics is three lines of javascript that you place at the bottom of your webpage. It's nothing really, and doesn't consume any more server resource than a copyright notice.
On the client side, the page can take a little bit of extra time (up to a couple of seconds) to finish loading. However, in my experience, the only bit of the page not loaded is the Google stuff, so users can see your page perfectly fine. You just get the throbber at the top of the page throbbing for a little longer.
(Note: You need to place your google analytics code block at the bottom of any served pages for this to be the case. I don't know what happens if the code block is placed at the top of your HTML)
The traditional instructions from Google on how to include ga.js use document.write(). So, even if a browser were to somehow load external JavaScript libraries asynchronously until some code actually has to be executed, the document.write() would still block page loading. The later asynchronous instructions do not use document.write() directly, but maybe insertBefore also blocks page loading?
However, Google sets the cache's max-age to 86,400 seconds (being 1 day, and even set to be public, so also applicable to proxies). So, as many sites load the very same Google script, the JavaScript will often be fetched from the cache. Still, even when ga.js is cached, simply clicking the reload button will often make a browser ask Google about any changes. And then, just like when ga.js was not cached yet, the browser has to await the response before continuing:
GET /ga.js HTTP/1.1
Host: www.google-analytics.com
...
If-Modified-Since: Mon, 22 Jun 2009 20:00:33 GMT
Cache-Control: max-age=0

HTTP/1.x 304 Not Modified
Last-Modified: Mon, 22 Jun 2009 20:00:33 GMT
Date: Sun, 26 Jul 2009 12:08:27 GMT
Cache-Control: max-age=604800, public
Server: Golfe
Note that many users click reload for news sites, forums and blogs they already have open in a browser window, making many browsers block until a response from Google is received. How often do you reload the SO home page? When Google Analytics response is slow, then such users will notice right away. (There are many solutions published on the net to asynchronously load the ga.js script, especially useful for these kind of sites, but maybe no longer better than Google's updated instructions.)
Once the JavaScript has loaded and executed, the actual loading of the web bug (the tracking image) should be asynchronous. So, the loading of the tracking image should not block anything else, unless the page uses body.onload(). In this case, if the web bug fails to load promptly then clicking reload actually makes things worse because clicking reload will also make the browser request the script again, with the If-Modified-Since described above. Before the reload the browser was only awaiting the web bug, while after clicking reload it also needs the response for the ga.js script.
So, sites using Google Analytics should not use body.onload(). Instead, one should use something like jQuery's $(document).ready() or MooTools' domready event.
See also Google's Functional Overview, explaining How Does Google Analytics Collect Data?, including How the Tracking Code Works. (This also makes it official that Google collects the contents of first-party cookies. That is: the cookies from the site you're visiting.)
Update: in December 2009, Google released an asynchronous version. The above should tell everyone to upgrade, just to be sure, though upgrading does not solve everything.
It really depends on the day. I'm just adding this to a blog. I'm in California, very close to their main data centers, on fast low-latency business DSL, on an overclocked i5 with plenty of RAM, running a recent Linux kernel and stable Firefox.
Here's a sample page load:
Google Analytics alone added 5 seconds of network download time... to fetch 15 KB!
You can see blogger.com served 34 KB in 300 milliseconds. That's 32x faster!
Also, look at how far to the right the red line is (it represents the onLoad event, meaning there's no more script executing on the page, so the browser can finally stop the loading indicators/spinners/etc.). That's probably 3 seconds of garbage JavaScript processing happening there. It's very uncommon for that line to be far away from the end of the resource download bars. I'm done debugging this, and it's 1/3 Analytics' fault, 2/3 Blogger's fault. ...one would think Google's stuff was fast.
Edit:
Some more data. Here's a request with everything cached; the one above was a first visit.
I've removed the Google+ widget from the page above for two reasons: I was trying to see if it played some part in the slow onLoad event (it doesn't), and because it is mostly useless.
With this we can see that the network time is the least of your worries. Even on a fast computer with modern software, the toll Google Analytics + Blogger take on processing time will still push your page load past 7 s. Without Blogger (just check this very site), I'm seeing 0.5 s of delay between resources finishing loading and the red line kicking in.
Loading any extra JavaScript on your page is going to increase the download time from the client's perspective. You can mitigate this by loading it at the bottom of your page so that the page is rendered even if GA has not loaded yet. I would avoid hosting the script yourself, because you would lose the advantage of the client's cache: if the client already has ga.js cached from some other page, your page's request is served from the client itself, whereas if you serve it from your site it will require a download even if the client already has the code (which is likely). Adding a task to your software processes just to avoid loading the file from Google seems unwarranted for what may be an unnecessary optimization. It would also be hard to test, since it will always be served faster locally, but what really matters is how fast it works for your customers. If you decide to evaluate hosting it locally, make sure you test from your home internet connection, not from the machine sitting next to the server in your rack.
Use Firebug and YSlow to check for yourself. What you will discover, however, is that GA is about 9 KB in size (which is actually quite substantial for what it does) and that it also sometimes does NOT load very fast (for what reason I don't know; I think the servers might be "choking" sometimes).
We removed it due to performance issues on our Ajax samples, but then again, for us being ultra fast and responsive was priority 1, 2, and 3.
Nothing noticeable.
The call to Google (including the DNS lookup, loading the JavaScript if not already cached, and the actual tracker calls themselves) should be done by the client's browser in a separate thread from actually loading your page. Certainly the DNS lookup will be done by the underlying system and will not, to my knowledge, count as a lookup within the browser (browsers have a limit on the number of request threads they will use per site).
Beyond that, the browser will load the Google script in parallel with all other embedded resources, so in the worst case you will potentially get an extremely slight increase in the time it takes to download everything (we're talking on the order of milliseconds, unnoticeable). If the Google script is loaded last by the browser, or you don't have many external resources on your page, or your page's external resources are cached by the browser, or Google's script is cached by the browser (extremely likely), then you won't see any difference. It's absolutely trivial overall, roughly the same effect as sticking an extra tiny picture on your page.
About the only time it might make a concrete difference is if you have some behaviour that fires on the onLoad event (which waits for external resources to load) and the Google servers are down or slow. The latter is unlikely to happen often, but if it did, the onLoad event wouldn't fire until the script had downloaded. You can work around this anyway by using various "when DOM loaded" events, which are generally more responsive since you don't have to wait for your own scripts/images to load either.
If you're really that worried about the effects on page load time, then have a look a the "Net speed" section of Firebug, which will quantify this and draw you a pretty graph. I would encourage you to do this for yourself anyway as even if other people give you the figures and benchmarks you request, it will be completely different for your own site.
Well, I have searched, researched, and explored the net extensively, but I have not found any statistical data that argues either for or against the premise.
However, this excerpt from http://www.ga-experts.com claims that it's a myth that GA slows down your website:
Err, well okay, maybe slightly, but we're talking about milliseconds. GA works by page tagging, and any time you add more content to a web page, it will increase loading times. However, if you follow best practice (adding the tag before the </body> tag) then your page will load first. Also, bear in mind that any page-tag-based web analytics package (which is the majority) will work the same way.
From the answers above and all other sources, my feeling is that whatever slowdown it causes is not perceived by the user, as the script is included at the bottom of the page. But if we talk about complete page loads, we might say that it slows down the page-load time.
Please post more info, and data if you have any.
I don't think this is what you're looking for, but what are you worried about performance for?
If it's your server, then there's obviously no impact, as the script resides on Google's servers.
If it's your users you're worried about, then there is no impact either. As long as you place it just above the closing body tag, your users will not receive anything slower than they would before... the script is loaded last and has no effect on what the user sees. So they essentially aren't waiting on anything, and can even continue to browse through the page without noticing that it's still loading.
The question was whether Google Analytics will cause your site to slow down, and the answer is yes. Right now, at the time of writing this, Google-Analytics.com is not working, so sites that include it in their pages won't finish loading them; so yes, it can slow things down and even cause your site not to load. It's uncommon for google-analytics.com to be down this long (over 10 minutes so far), but it shows that it is possible.
There are two aspects to it:
the download of the analytics script (and a gif)
the execution of the downloaded scripts
Download time is almost always less than 100 ms, which is acceptable.
Here comes the twist:
analytics.js execution: 250 ms
re-marketing (if enabled): 300 ms
demographics (if enabled): 200 ms
So analytics with re-marketing takes 750 ms on average. I feel that this is a huge number when it comes to performance overhead.
I noticed frequent I/O and CPU overload in cPanel, resulting in:
Site unreachable error
And that stopped after I disabled the WP Analytics plugin. So I reckon it does have some impact.
