Tweaking Magento for performance

I'm looking at the performance (server load time) of a Magento site and trying to tune the search result pages. I realized that even after I disabled all the heavy things like top navigation, layered navigation, and product listing, and cleared all caches, the Magento core still runs about 60 SQL queries against the database. Does anyone have a procedure for getting rid of them, or for reducing them to some acceptable amount?
Also, can I somehow reduce the time spent creating blocks?
Thank you very much,
Jaro.

Magento is an extremely flexible ecommerce framework, but that flexibility comes at a price: performance. This answer is a collection of pointers and some details on caching (especially for blocks).
One thing to consider is the Magento environment, i.e. tuning PHP, the web server (favor nginx over Apache), and MySQL. Also, set up a good caching backend for Magento. All of this is covered in the Magento Performance Whitepaper, which applies to the CE as well.
After the environment is set up, the other side of things is the code.
Reducing the number of queries is possible for some pages by enabling the flat catalog tables (System > Configuration > Catalog > Frontend), but you will always have a fairly high number of queries.
You also can't really reduce the time spent creating the blocks except by tuning the environment (APC, memory, CPU). So, as the other commenters said, your best option is to utilize the caching functionality that Magento has built in.
Magento Block Caching
Because you specifically mentioned blocks in the question, I'll elaborate a bit more on block caching. Block caching is governed by three properties:
cache_lifetime
cache_key
cache_tags
All these properties can be set in the _construct() method of a block using setData() or magic setters, or by implementing the associated getter methods (getCacheLifetime(), getCacheKey(), getCacheTags()).
The cache_lifetime is specified in (integer) seconds. If it is set to false (boolean), the block will be cached forever (no expiry). If it is set to null, the block will not be cached (this is the default in Mage_Core_Block_Abstract).
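For illustration, a minimal sketch of a template block that caches itself for an hour under a custom tag (the class name and tag are hypothetical):
class My_Module_Block_Example extends Mage_Core_Block_Template
{
    protected function _construct()
    {
        parent::_construct();
        $this->addData(array(
            'cache_lifetime' => 3600,               // seconds; false = no expiry, null = no caching
            'cache_tags'     => array('MY_MODULE'), // custom tag, can be cleaned selectively
        ));
    }
}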
The cache_key is the unique string that is used to identify the cache record in the cache pool. By default it is constructed from the array returned by the method getCacheKeyInfo().
// Mage_Core_Block_Abstract

public function getCacheKeyInfo()
{
    return array(
        $this->getNameInLayout()
    );
}

public function getCacheKey()
{
    if ($this->hasData('cache_key')) {
        return $this->getData('cache_key');
    }
    /**
     * don't prevent recalculation by saving generated cache key
     * because of ability to render single block instance with different data
     */
    $key = $this->getCacheKeyInfo();
    //ksort($key);  // ignore order
    $key = array_values($key);  // ignore array keys
    $key = implode('|', $key);
    $key = sha1($key);
    return $key;
}
The best way to customize the cache key in custom blocks is to override the getCacheKeyInfo() method and add the data that you need to uniquely identify the cached block as needed.
For example, in order to cache a different version of a block depending on the customer group you could do:
public function getCacheKeyInfo()
{
    $info = parent::getCacheKeyInfo();
    $info[] = Mage::getSingleton('customer/session')->getCustomerGroupId();
    return $info;
}
The cache_tags are an array that enables cache segmentation. You can delete just the sections of the cache matching one or more tags.
In the admin interface under System > Cache Management you can see a couple of the default cache tags that are available (e.g. BLOCK_HTML, CONFIG, ...). You can use custom cache tags, too, simply by specifying them.
This is part of the Zend_Cache implementation, and it needs to be customized far less frequently than the cache_lifetime and the cache_key.
Other Caching
Besides blocks Magento caches many other things (collection data, configuration, ...).
You can cache your own data using Mage::app()->saveCache(), Mage::app()->loadCache(), Mage::app()->cleanCache() and Mage::app()->removeCache(). Look in Mage_Core_Model_App for details on these methods; they are rather straightforward.
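For example, a sketch of caching an expensive computation under a custom tag (the key, tag, lifetime, and calculateExpensiveData() helper are all hypothetical):
$cacheKey = 'my_expensive_data';
$data = Mage::app()->loadCache($cacheKey);
if (!$data) {
    $data = serialize($this->calculateExpensiveData()); // hypothetical expensive helper
    // save for two hours, tagged so it can be cleaned selectively
    Mage::app()->saveCache($data, $cacheKey, array('MY_MODULE'), 7200);
}
$data = unserialize($data);

// later, e.g. after the underlying data changes: drop everything tagged MY_MODULE
Mage::app()->cleanCache(array('MY_MODULE'));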
You will also want to use a full page cache module. If you are using Magento EE, you already have one. Otherwise, search Magento Connect - there are many commercial options.
Some of those modules also tune various parts of Magento beyond the full page caching aspect, e.g. Nitrogento (commercial).
Using a reverse proxy like Varnish is also very beneficial.
There are quite a number of blog posts on this subject. Here is one post by the publishers of the Nitrogento extension.
If you are running Magento in a more low-scale environment, check out my post on optimizing the file cache backend on magebase.com.

I am adding additional comments for speed:
Instead of Apache, use nginx or LiteSpeed.
Make sure the flat catalog is enabled.
If possible, use a full page cache (FPC).
Turn compiler mode on.
Merge CSS and JS (e.g. with Fooman Speedster).
Use image sprites to reduce the number of requests.
Use the MySQL query cache, but avoid a size greater than 64 MB.
Remove all modules not in use by removing their XML declarations; just disabling them will not do.
Store sessions in RAM.
Use of APC is recommended.
Run your cron jobs during off-peak hours.
Delete additional stores if not in use.
Delete cart price rules if not in use.
Optimize images for size.
Use Ajax wherever possible.
CMS blocks take more time than Magento blocks, so do not use a CMS block unless you need it to be editable.
Do not use a collection's count(); use getSize() to get the collection size.
Minimize the number of searchable attributes, as each one adds a column to the flat catalog table and slows down your search.
Use of Solr search is recommended. It comes with the EE version, but it can be installed with CE as well.
Minimize the number of customer groups, as suggested in the comments.
Enable compression in .htaccess (mod_gzip for Apache 1.3, mod_deflate for Apache 2); see the sketch after this list.
Remove staging stores if on EE.
If you are on an Apache server, use mod_expires and be sure to set how long files should be cached.
Use a Content Delivery Network (CDN).
Enable Apache KeepAlive.
Make your output W3C compliant.
Use getChildHtml('childName') rather than embedding block code directly in a .phtml file, as getChildHtml() allows the child block to be cached.
Make sure cron runs so the logs stored in the database get cleaned.
Minimize the number of days logs are kept, as far as your requirements allow.
Keep the cache in RAM if memory permits.
Reduce hard disk reads in favor of reads from RAM, which is much faster.
Upgrade PHP to a version above 5.3.
If on EE, make sure most pages are delivered without application initialization. Even one container that needs application initialization will affect execution speed, because apart from URL rewrites most of the other code will get executed.
Check the layout XML for blocks placed in the default handle; if a block does not appear on every page, move its XML from the default handle to the specific handles that need it. It has been observed that many blocks are executed that are never displayed.
If using FPC, make sure your containers are cached and repeat requests for a container are delivered from the cache. An improper placeholder definition results in the container cache not being used and new container content being generated on every request.
Analyze page blocks and variables, and if possible add those variables/blocks to the cache.
Switch off log writing in Magento.
Remove the admin notification module.
Use a web testing tool to analyze the number of requests and other HTML-related parameters responsible for download time, and act accordingly.
Remove attributes if not needed. With proper care you can even remove system attributes that are not in use.
If on Enterprise, make sure partial indexing is used effectively.
Write your own Solr search populator to bypass Magento's search indexing.
Clean the _cl tables or reduce the number of _cl table rows.
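For the compression item above, a minimal .htaccess sketch, assuming Apache 2.x with mod_deflate available (the MIME type list is illustrative):
<IfModule mod_deflate.c>
    # compress text-based responses before sending them to the client
    AddOutputFilterByType DEFLATE text/html text/css application/javascript text/javascript
</IfModule>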
I would add to the list: try to avoid the file cache if possible; replace it with APC / Redis / memcache (as suggested by Jaro).
Remove system attributes that are not in use (be careful: do a thorough check before removing).
Some crontab jobs are not applicable to all stores, so depending on your store's features those can be removed.
Optimize through proper attribute management, e.g. setting 'required', 'is searchable', or 'used in product listing' only where actually needed.
Some observers are not required for all stores, so if an observer is not applicable to your specific Magento site, it should be removed.
Make sure FPC applies to most of the site's pages, especially when you have added new controllers that deliver pages.
Magento has lots of features, and for this it has many events and associated observers. If a feature is not used by a store, any observer related to that feature should be removed. E.g.: the Enterprise version has a category permission concept; if it is not used, it is recommended to remove the permission-related observers from the save-after events.
If only a specific attribute needs to be saved on a product, call a function that saves just that attribute instead of calling $product->save() (see the sketch after this list).
In the EE version, which has partial indexing and triggers, modify the triggers to avoid multiple entries in the _cl tables.
No .phtml file should bypass blocks and use models or resources directly, as this results in no caching, which in turn means more work for Magento.
Deliver images appropriate to the device in use.
Some FPCs recommended for Community: Lesti (free as of this writing), Amasty (commercial), Extender (commercial), and Bolt (commercial).
Warm the cache.
Control bots via .htaccess during peak hours.
Pre-populate values in a custom table for layered navigation via a custom script executed daily by cron.
Avoid unwanted keys to reduce cache size.
Use a higher PHP version (5.4+).
Use a higher MySQL version (5.5+).
Reduce the number of DOM elements.
Move all JS out of the HTML pages.
Remove commented-out HTML.
If on the Enterprise version (1.13 or higher), modify triggers to reduce _cl table entries, as these entries result in cache flushing, which in turn results in a lower cache hit rate and hence more TTFB time.
Use Magmi to import products.
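For the single-attribute save mentioned above, a minimal sketch (the attribute code and value are illustrative; assumes $product is already loaded):
// Update one attribute without the overhead of a full $product->save()
$product->setData('special_price', 9.99);
$product->getResource()->saveAttribute($product, 'special_price');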

As Vinai said, Magento is all about extensibility; raw performance is secondary, but it is remedied by things like indexing and caching. Significantly improving performance without caching is going to be very difficult. Short of full page caching, enabling block caching is a good method of improving performance, but proper cache invalidation is key. Many blocks are cacheable but not configured to be cached by default, so identify the slowest ones using profiling and then use Vinai's guide for enabling caching. Here are a few additional things to keep in mind with block caching:
Any block that lists product info should have the product's tag, which is 'catalog_product_'.$productId. Similarly, use 'catalog_category_'.$categoryId for categories. This ensures the cache is invalidated when the product or category is saved (edited in the backend). Don't set these in the constructor; set them in an overridden getCacheTags() so that they are only collected when the block is saved to the cache, not when it is loaded from the cache (since that would defeat the purpose of caching).
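A minimal sketch of such an override for a block that renders a single product (getProduct() is assumed to return the loaded product):
public function getCacheTags()
{
    $tags = parent::getCacheTags();
    // invalidate this cache entry whenever the product is saved in the backend
    $tags[] = 'catalog_product_' . $this->getProduct()->getId();
    return $tags;
}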
If you use https and the block can appear on an https page and includes static resources, make sure the cache key includes Mage::app()->getRequest()->isSecure(), or else you'll end up with http URLs on https pages and vice versa.
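For example, a sketch of a cache key that keeps the two schemes apart:
public function getCacheKeyInfo()
{
    $info = parent::getCacheKeyInfo();
    // separate cache entries for http and https so embedded URLs match the scheme
    $info[] = Mage::app()->getRequest()->isSecure() ? 'https' : 'http';
    return $info;
}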
Make sure your cache backend has plenty of capacity and avoid needless cache flushes.
Don't cache child blocks of a block that is itself cached unless the parent changes much more frequently than the children; otherwise you're just cluttering your cache backend.
If you do cache tagging properly, you should be able to use a very long default cache lifetime safely. I believe setting "false" as the lifetime actually uses the default, not infinite. The default is 7200 seconds, but it can be configured in local.xml.
Using the Redis backend will in most cases give you the best and most consistent performance. When using Redis, you can monitor the used memory size with this munin plugin.

Just to follow on from Mark... most of the tables in the Magento database are InnoDB. Whilst the query cache can be used in a few specific places, the following settings are more directly relevant:
innodb_buffer_pool_size
innodb_thread_concurrency
innodb_flush_method
innodb_flush_log_at_trx_commit
I also use
innodb_file_per_table
as this can be beneficial in reorganising specific tables.
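As a starting point, a my.cnf sketch; the values are illustrative and depend on available RAM and your durability requirements:
[mysqld]
innodb_buffer_pool_size = 2G           # large enough to hold the working set
innodb_thread_concurrency = 8
innodb_flush_method = O_DIRECT         # avoid double-buffering through the OS cache
innodb_flush_log_at_trx_commit = 2     # faster, but can lose ~1s of transactions on a crash
innodb_file_per_table = 1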
If you give the database enough resources (within reason), the amount of traffic really doesn't load the server up at all, as the majority of queries are repeats anyway and are delivered out of the database cache.
In other words, you're probably worrying about nothing...

Make sure the MySQL query cache is turned on, and set these variables in MySQL (they may need tweaking depending on your setup):
query_cache_type=1
query_cache_size=64M

I found a very interesting blog post about Magento performance optimization; it covers many configuration settings for your server and your Magento store and was very helpful for me.
http://www.mgt-commerce.com/blog/magento-on-steroids-best-practice-for-highest-performance/

First you need to audit and optimize the time to first byte (TTFB).
Magento has a built-in profiler that will help you identify unoptimized code blocks.
Examine your template files and make sure you DO NOT load product models inside a loop (a common performance hog):
foreach ($collection as $_product) {
    // anti-pattern: a full model load (and many extra queries) for every product
    $_product = Mage::getModel('catalog/product')->load($_product->getId());
}
I see this code often in product/list.phtml
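A common remedy (a sketch): select the attributes you need on the collection up front, so no per-product load is necessary:
// In the block or model that prepares the collection
$collection = Mage::getResourceModel('catalog/product_collection')
    ->addAttributeToSelect(array('name', 'price', 'small_image'))
    ->addStoreFilter();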
I wrote a step-by-step article on how to optimize TTFB
Disclaimer: the link points to my own website

Related

Conditional incremental builds in Next.js

Context
I am learning Next.js, a framework for quickly developing React applications, which provides many functionalities out of the box, such as server-side rendering and Fast Refresh, without any configuration. It also provides the ability to optionally generate some web pages statically: they are pre-rendered at build time instead of rendered on demand. It achieves this by querying the data required for the page at build time. Next.js also accepts an optional argument, expressed in seconds, after which the data is re-queried and the page re-rendered. All of this happens at page level rather than rebuilding the entire website.
Problems
We cannot know in advance how frequently the data will change; it may change after 1 second or after 10 minutes, and it is extremely hard to predict. It is most certainly not a constant number of seconds, however. With this approach, I might show outdated information because the time limit was too high, or I might end up querying the database unnecessarily even when the data hasn't changed.
Suppose I have implemented some sort of pagination and I want to exploit the fact that most users will only visit the first few pages before following a different link. I could statically pre-render the first 1000 pages, so the most visited pages are served statically without going to the database, while the rest are server-side rendered. Now, if my data changes frequently, I would have to re-render the first 1000 pages at regular intervals, and each page would issue a separate query against the same database or external API, which would cause too many round trips. I am not aware of the internals of Next.js, but I suspect this is true, because Next.js assumes nothing about the function that pulls the data, and a generic implementation would necessitate it.
Attempted Solution
Both problems can be solved by client-side or server-side rendering, because the data would be fetched on demand, but then we lose the benefits of static generation, specifically serving static assets instead of querying the database. I believe static generation is useful when mutations to the data happen infrequently most of the time, but we still want to show updated information as fast as we can once it becomes available.
If I forget about Next.js for a while, both problems can be solved by spawning a new process that listens for mutations to the relevant data and rebuilds only those static assets that need updating; kind of like how React updates components, but on the server side. However, Next.js offers a lot of functionality that would be difficult to replicate, so I cannot use this approach.
If I want to use Next.js, problem (1) seems impossible to solve due to a (perceived?) limitation of Next.js, which offers only one way to rebuild static pages: periodically re-render them after a predetermined time. However, (2) can be solved by using some sort of in-memory cache that pulls all the required data from the data store in one round trip and structures it for every page. Every page would then pull data from this cache instead of the database. It looks like a hack to me, however.
Questions
Are there other ways to deal with these problems that I might have missed?
Is there a built-in way to deal with problems (1) and (2) in Next.js?
Is my assessment of attempted solutions and their viability correct?

Magento - slow homepage

I have a problem with the loading of my site's homepage. PHP execution time is about 14 seconds! Other pages, like categories, take about 1.5 seconds to execute. The cache is enabled, and I also cleaned the DB, with no results.
The cache being enabled doesn't help if the blocks on your homepage aren't set up to be cached. If you have any custom blocks generating content on your homepage, you should set the cache lifetime and override getCacheKeyInfo(), especially if these blocks run large DB queries involving joins across many EAV tables. Obviously there's not much anyone can do with the limited information you've given, but start by looking at your homepage layout XML to see which blocks are being used, and investigate whether they are being cached.
Install an FPC.
Pre-warm the cache.
If there is any product listing, make sure flat tables are used.
No block-level code in .phtml files.
In short, you need to know how to optimize Magento for performance.
Solved the problem with a full page cache. Thanks!

What should be stored in cache for web app?

I realize that this might be a vague question that begets a vague answer, but I'm in need of some real-world examples, thoughts, and/or best practices for caching data for a web app. All of the examples I've read are more technical in nature (how to add or remove cache data from the respective cache store), but I've not been able to find a higher-level strategy for caching.
For example, my web app has an inbox/mail feature for each user. What I've been doing to date is storing typical session data in the cache. In this example, when the user logs in, I go to the database, retrieve the user's mail messages, and store them in the cache. I'm beginning to wonder if I should just maintain a copy of all users' messages in the cache all the time, and retrieve them from the cache when needed, instead of loading them from the database upon login. I have a bunch of other data that's loaded on login (product catalogs and related entities), and login is starting to slow down.
So I guess my question to the community is: what would you do/recommend as an approach in this scenario?
Thanks.
This might be better suited to https://softwareengineering.stackexchange.com/, but generally you want to cache:
Metadata/configuration data that does not change frequently. E.g. country/state lists, external resource addresses, logic/branching settings, product/price/tax definitions, etc.
Data that is costly to retrieve or generate and that does not need to frequently change. E.g. historical data sets for reports.
Data that is unique to the current user's session.
The last item above is where you need to be careful, as you can drastically increase your app's memory usage by adding a few megabytes of data for every active session. It also implies different levels of caching: application-wide, user session, etc.
Generally you should NOT cache data that is under active change.
In larger systems you also need to think about where the cache(s) will sit. Is it possible to have one central cache server, or is it good enough for each server/process to handle its own caching?
Also: you should have some method to quickly reset/invalidate the cached data. For a smaller or less mission-critical app, this could be as simple as restarting the web server. For the large system that I work on, we use a 12 hour absolute expiration window for most cached data, but we have a way of forcing immediate expiration if we need it.
This is a really broad question, and the answer depends heavily on the specific application/system you are building. I don't know enough about your specific scenario to say whether you should cache all the users' messages, but instinctively it seems like a bad idea, since you would effectively be caching your entire data set. This could lead to problems when new messages come in or get deleted. Would you then update them in the cache? Would that not simply duplicate the backing store?
Caching is only a performance optimization technique, and as with any optimization, measure first before making substantial changes, to avoid wasting time optimizing the wrong thing. Maybe you don't need much caching, and it would only complicate your app. Maybe the data you are thinking of caching can be retrieved in a faster way, or less of it can be retrieved at once.
Cache anything that causes duplicate database queries.
Client-side file caching is important as well. Assuming files are marked with an id in your database, cache them when they are first fetched to avoid repeated network requests for the same file. A resource for doing this can be found here (https://developer.mozilla.org/en-US/docs/Web/API/IndexedDB_API). If you don't need to cache files, web storage (localStorage/sessionStorage) and cookies are good for smaller pieces of data. A sketch using the browser Cache API (IndexedDB works similarly):
async function getFile(url) {
    const cache = await caches.open('files');    // open (or create) the cache
    const hit = await cache.match(url);
    if (hit) return hit;                         // cache hit: skip the network
    const response = await fetch(url);           // cache miss: fetch the file
    await cache.put(url, response.clone());      // store a copy for next time
    return response;
}

When is it better to generate a static page or dynamically generate?

The title pretty much sums up my question.
When is it more efficient to generate a static page that a user can access, as opposed to using dynamically generated pages that query a database? In what situations would one be better than the other?
To serve up a static page, your web server just needs to read the page off the disk and send it. Virtually no processing will be required. If the page is frequently accessed, it will probably be cached in memory, so even the disk access will not be needed.
Generating pages dynamically obviously has more overhead. There is a cost for every DB access you make, no matter how simple the query is. (On a project I worked on recently, I measured a minimum overhead of 0.7ms for each query, even for SELECT 1;) So if you can just generate a static page and save it to disk, page accesses will be faster. How much faster? It just depends on how much work is being done to generate the page dynamically. We don't know what you are doing, so we can't comment on that.
Now, if you generate a static page and save it to disk, that means you need to re-generate it every time the data which went into generating that page changes. If the data changes more often than the page is actually accessed, you could be doing more work rather than less! But in most cases, that's a very unlikely situation.
More likely, the biggest problem you will experience from generating static pages and saving them to disk is coding (and maintaining) the logic for re-generating the pages whenever necessary. You will need to keep track of exactly what data goes into each page, and in every place in the code where data can be changed, you will need to invoke re-generation of all the relevant pages. If you forget just one, then your users may be looking at stale data some of the time.
If you mix dynamic generation per-request and caching generated pages on disk, then your code will be harder to read and maintain, because of mixing the two styles.
And you can't really cache generated pages on disk in certain situations -- like responding to POST requests which come from a form submission. Or imagine that when your users invoke certain actions, you have to send a request to a 3rd party API, and the data which comes back from that API will be used in the page. What comes back from the API may be different each time, so in this case, you need to generate the page dynamically each time.
Static pages (or better: static resources) are filled with content that does not change, or at least not often, and that does not support further queries: an About page, Contact, ...
In this case it doesn't make any sense to generate these pages dynamically. On the other side we have data (e.g. in a database) and want to give the user the opportunity to query it. In this case you give the user a page where they can specify the query, and you return a rendered page with the dynamically generated results.
In my opinion it depends on the result you want to present to the user: either it is plain information, or it is the possibility to query a data source. The first result is known before you do anything; the second (queried data) is only known once you have the query parameters, which means you can't know the result beforehand (it could be empty or invalid).
It depends on your architecture, but since GET requests should be idempotent, it should also be easy to cache dynamic pages with a proxy and invalidate the cache when something new happens to the data displayed on the cached path. In that case one can save a lot of time, because the system behaves as if the cached pages were static, except that instead of coming from the filesystem they come from memory, which is really fast.
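As a minimal illustration of that idea at the application level (a sketch; the max-age value is arbitrary):
// Mark an idempotent GET response as cacheable by proxies for five minutes
header('Cache-Control: public, max-age=300');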
Cheers
Laidback

Does Wordpress load-cache functions such as wp_get_nav_menu_object?

I've been asking myself this question for quite a while. Maybe someone has already done some digging (or is involved in WP) to know the answer.
I'm talking about storing objects from WP-functions in PHP variables, for the duration of a page load, e.g. to avoid having to query the database twice for the same result set.
I don't mean caching in the sense of pre-rendering dynamic pages and saving them in HTML format for faster retrieval.
Quite a few "template tags" (WordPress functions) may be used multiple times in a theme during one page load. When a theme or plugin calls such a function, does WP run a database query every time to retrieve the necessary data, and does it parse this data every time to return the desired object?
Or does the function store its result in a PHP variable the first time it runs, and check whether that already exists before querying the database or parsing again?
Examples include:
wp_get_nav_menu_object()
wp_get_nav_menu_items()
wp_list_categories()
wp_tag_cloud()
wp_list_authors()
...but also such important functions as bloginfo() or wp_nav_menu().
Of course, it wouldn't make much sense to cache each and every query, such as post-related ones. But for the above examples (there are more), I believe it would.
So far, I've been caching these generic functions myself whenever a theme required the same function to be called more than once on a page, by writing my own functions or classes and caching the results in global or static variables. I don't see why I should add to the server load by running the exact same generic query more than once.
Does this sort of caching already exist in Wordpress?
Yes, for some queries and functions. See WP Object Cache. The relevant functions are wp_cache_get, wp_cache_set, wp_cache_add, and wp_cache_delete. You can find these functions used in many places throughout the WordPress code to do exactly what you are describing.
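A typical usage pattern looks like this (a sketch; the key, group, and menu name are placeholders):
function my_get_menu_items() {
    $items = wp_cache_get( 'my_menu_items', 'my_theme' );  // check the object cache first
    if ( false === $items ) {
        $items = wp_get_nav_menu_items( 'primary' );       // expensive call, runs only on a miss
        wp_cache_set( 'my_menu_items', $items, 'my_theme' );
    }
    return $items;
}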
