Magento massive product import cache & performance issue

I am importing about 10,000 products and updating their data with a custom import script on a regular basis. I use the Magento product object to save the product data. The problem is that each product save is slower than the one before it. After about 1,000 product saves the process becomes really slow. When I clear the cache, everything is OK again.
I have a couple of questions to help me understand this:
Does anybody have any idea why that is?
Should I disable "Collections Data" cache, or maybe any other type of cache as well?
Or is there any way to tell Magento not to cache collection data on product save?
If not, will disabling the Collections Data cache slow the site down a lot?
Thank you

The reason for the slowness is that your indexes are getting larger. Unless told otherwise, Magento reindexes for each new product. You can speed up your imports by disabling this, but you will need to reindex at some point so that the newly imported products appear on the frontend.
A solution to consider:
Magento API: Rebuild Indexes after adding new products
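As a rough illustration of the idea, here is a minimal sketch for Magento 1.x that switches every index process to manual mode, runs the import, rebuilds the indexes once, and then restores whatever mode each process had before. The bootstrap path and the runImport() function are assumptions made for the example; replace them with your own script location and import logic.

```php
<?php
// Assumption: this script is run from the Magento root directory.
require_once 'app/Mage.php';
Mage::app('admin');

/**
 * Hypothetical placeholder for the custom import logic.
 */
function runImport()
{
    // Load, update and save the products here.
}

$processes = Mage::getSingleton('index/indexer')->getProcessesCollection();

// Remember each process's current mode, then switch it to manual
// so that product saves no longer trigger a reindex.
$originalModes = array();
foreach ($processes as $process) {
    $originalModes[$process->getId()] = $process->getMode();
    $process->setMode(Mage_Index_Model_Process::MODE_MANUAL)->save();
}

try {
    runImport();

    // Rebuild every index once, after all products have been saved.
    foreach ($processes as $process) {
        $process->reindexAll();
    }
} catch (Exception $e) {
    Mage::logException($e);
}

// Restore the original modes whether or not the import succeeded.
foreach ($processes as $process) {
    $process->setMode($originalModes[$process->getId()])->save();
}
```

Restoring the saved modes, rather than forcing every process back to real-time, keeps any index that an administrator deliberately left in manual mode as it was.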

Related

Magento Reindexing Data - Risks

I have a Magento site in which the cross-selling products do not seem to be appearing.
After looking on Stack and Google it seems that 'reindexing the data' has solved this issue for a lot of individuals.
My question is, are there any risks associated with performing this task? Or is it a relatively straightforward procedure?
Indexing is a fundamental part of Magento and will not affect your site in a negative way.
Magento uses a complex EAV (entity-attribute-value) database structure that can sometimes require heavy database queries to retrieve simple results. Because of this, the Magento developers have implemented index tables that gather all of this data and store it in a single table structure. This allows Magento to quickly query the single index table, rather than making complex joins across multiple tables.
With that being said, reindexing does not alter your existing data. It simply queries your existing data and copies it to its own tables.
To reindex your site, you can simply go to System > Index Management, check off all the indexes that you wish to reindex, then submit.
If you have a large set of products, I recommend reindexing your site from the shell command line.
Log in to your site using an SSH program (such as PuTTY)
Once logged in, cd to your magento/shell/ directory (where magento is your Magento root directory)
Run the following command to reindex your site: php indexer.php reindexall
Wait for the index processes to complete.
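If you would rather trigger the rebuild from your own PHP script, and only for one index, something like the following sketch should work on Magento 1.x. It assumes the script runs from the Magento root and that catalog_product_price is the code of the index you want to rebuild; adjust both as needed.

```php
<?php
// Assumption: this script is run from the Magento root directory.
require_once 'app/Mage.php';
Mage::app('admin');

// Assumption: the code of the index to rebuild (here, product prices).
$code = 'catalog_product_price';

// getProcessByCode() returns false when no index has that code.
$process = Mage::getSingleton('index/indexer')->getProcessByCode($code);

if ($process) {
    $process->reindexAll();
    echo "Rebuilt index: " . $code . "\n";
} else {
    echo "Unknown index code: " . $code . "\n";
}
```

Rebuilding a single index this way can be much quicker than a full reindexall when only one kind of data has changed.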
Lastly, ensure that your Catalog is using the Flat index tables. To do this:
Go to System > Configuration > Catalog > Frontend (section)
Set Use Flat Catalog Category to Yes
Set Use Flat Catalog Product to Yes
Click Save Config
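To confirm from a script that both settings are switched on, you can read the corresponding configuration flags. This is only a sketch; it assumes Magento 1.x, a script run from the Magento root, and the default-scope values of the two config paths shown.

```php
<?php
// Assumption: this script is run from the Magento root directory.
require_once 'app/Mage.php';
Mage::app('admin');

// Config paths behind the two admin settings above.
$paths = array(
    'Use Flat Catalog Category' => 'catalog/frontend/flat_catalog_category',
    'Use Flat Catalog Product'  => 'catalog/frontend/flat_catalog_product',
);

foreach ($paths as $label => $path) {
    $enabled = Mage::getStoreConfigFlag($path);
    echo $label . ': ' . ($enabled ? 'Yes' : 'No') . "\n";
}
```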
No, you're safe to reindex whenever you see the notice appear.
If you know you're going to make a lot of changes, you can wait until you're done and save yourself some time by running it only once at the end.
The only exception where this is not safe is if you have tens of thousands of products and/or lots of store views. It may end up running for hours and hours, slowing down your site and leading to an undesirable experience for the customer.
I have found that on sites with a large number of products, running the price reindex can cause a database lock, which can make certain actions unavailable and cause orders to be duplicated during that time. It can also affect performance and eat resources. I recommend performing this only late at night if possible.

Could disabled products slow down the site?

I have more and more disabled products and I am wondering if they could slow down my site.
All operations will be much faster on a smaller database.
Each product has tens of records in the EAV tables. 10-20k inactive products will have a major impact on anything that is not cached. Reindexing will take longer.
You should take into consideration the following when you delete products:
Reorder will not work (as the same product ordered is expected to exist)
Check if you have any custom reports/extensions that pull data from the product tables
Orders/invoices that were already created will still display all of their information
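Before deciding whether to delete anything, it can help to see how many disabled products you actually have. A minimal sketch for Magento 1.x, assuming the script is run from the Magento root:

```php
<?php
// Assumption: this script is run from the Magento root directory.
require_once 'app/Mage.php';
Mage::app('admin');

// Count products whose status attribute is set to "Disabled".
$disabled = Mage::getModel('catalog/product')->getCollection()
    ->addAttributeToFilter('status', Mage_Catalog_Model_Product_Status::STATUS_DISABLED)
    ->getSize();

// Count all products for comparison.
$total = Mage::getModel('catalog/product')->getCollection()->getSize();

echo "Disabled products: " . $disabled . " of " . $total . "\n";
```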
If you know how to optimize the whole setup, and your site was working fine before, deactivating products does not have much impact on the speed of the store as a whole.

Disabling store views in Magento for improving performance

I need to improve indexing times, specifically "Product Prices".
I would like to know if I need to actually delete a store view in order to improve indexing performance, or would it be enough to disable it. I'm talking about store views in different websites in a multi-site Magento installation.
How much does an extra store view affect performance with tens of thousands of products, each with different prices per store view (the other attributes are the same)?
Thank you.
Magento does not check whether a store view is disabled. If you do not want this data indexed, you must delete the store view, create your own indexer, or rewrite Magento's indexing behavior.

Magento flat catalog advice wanted

I'm having to convert an existing e-commerce site with 50k plus products to a Magento site. Everywhere I look, the advice is to use the flat catalog for this number of products.
My question is, once enabled do new products have to be created using the old EAV tables or can I just import and update new products in the newly created flat catalog?
Thanks for any advice, I'm not looking forward to this transition at all, lol. ;)
Think of Flat Catalog as a cache of the EAV structure. It does not replace the EAV system; it simply creates a "flattened" or simplified version of the data stored in the EAV tables.
The EAV system is the most flexible way to store data, allowing any number of user or system variables without changes to the table structure. The downside is that this system requires multiple and/or recursive queries, which is slow and memory intensive. This is where the flat catalog comes in. The following is still quite accurate (even though it was written when flat catalog was first introduced) and quite clear: http://www.magentocommerce.com/boards/viewthread/37247/#t122010
You will need to optimise memory usage within PHP and MySQL to enable rebuilding of flat catalog for a site with a large number of products.
I don't know what version of Magento you're using, but up to 1.4 you need to put products into the EAV structure if you have to manage them using the back office.
The creation of the flat_ tables is automatic; it's part of the indexing process (which can take very long for this number of products).
Edit: I don't know about versions after 1.4.

Magento index and cache. Do I need both?

I am developing an import module which updates product data. To speed up the process, I set the indexes to manual mode.
$processes = Mage::getSingleton('index/indexer')->getProcessesCollection();
$processes->walk('setMode', array(Mage_Index_Model_Process::MODE_MANUAL));
$processes->walk('save');
After the import is finished, I reindex the data and set the index mode back to auto:
$processes = Mage::getSingleton('index/indexer')->getProcessesCollection();
$processes->walk('reindexAll');
$processes->walk('setMode', array(Mage_Index_Model_Process::MODE_REAL_TIME));
$processes->walk('save');
But I am not sure if I also need to clear the cache. So my question is how index and cache are related. For example, if I clear the cache, does it also reindex all data? And conversely, if I reindex all data, does it clear the cache? Or do I need to trigger both processes every time if I have the index mode set to manual? I am not quite sure about this, and I hope somebody can confirm it.
Thank you
Magento's System -> Cache Management and System -> Index Management are both stand-alone features. If you rebuild an index, no matter whether through the backend or directly using reindexAll(), Magento will not automatically refresh any cache, and vice versa.
The answer to Do I need both? (caches and indexes) is: it depends.
If you're running Magento with caches COLLECTION_DATA and/or EAV enabled, you should refresh those caches after importing and reindexing product data.
The refresh is necessary because your importer has updated/inserted product data that the caches are not aware of, not because you've reindexed.
If you're running Magento with all caches disabled, you don't need both. Technically, there is no need to refresh a disabled cache. Magento would be slower of course, but it would still be fully functional.
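As a rough sketch of that refresh step at the end of an import script, the following cleans only the two cache types mentioned above, and only if they are enabled. It assumes Magento 1.x, a script run from the Magento root, and that collections and eav are the type codes behind the COLLECTION_DATA and EAV caches.

```php
<?php
// Assumption: this script is run from the Magento root directory.
require_once 'app/Mage.php';
Mage::app('admin');

$cache = Mage::app()->getCacheInstance();

// Assumption: type codes for the COLLECTION_DATA and EAV caches.
$types = array('collections', 'eav');

foreach ($types as $type) {
    // A disabled cache type holds no stale data, so skip it.
    if ($cache->canUse($type)) {
        $cache->cleanType($type);
        echo "Cleaned cache type: " . $type . "\n";
    }
}
```

Cleaning only the affected types leaves the configuration, layout and block caches warm, so the frontend does not pay the cost of a full cache flush after every import.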
