Creating/Updating products programmatically in Magento - performance

I've written a module to import products and I'm currently using Magento's product model to add/update products accordingly. However, this is proving to be very slow (possibly because I have the catalog index enabled?).
Even doing just
$product = Mage::getModel('catalog/product')->load($id);
$product->save();
is incredibly slow - we're talking maybe 2 seconds per product (I'm doing 5 per HTTP request, and using JavaScript to make several requests).
Each product needs to have some attributes updated, category ids changed, and the stock levels updated. At the moment, I'm looping through 20 products and it's taking near enough 60 seconds. In production, it will be looping through 200-300 products (although on a much more powerful server).
Is there a better/faster way of creating/updating the products? Obviously I could just use SQL but I don't fancy figuring out Magento's intense EAV database structure!
Sorry if this is a naff question, I'm not sure how best to word it!

Setting the indexer mode to manual whilst importing will give you at least some performance gains. You can obviously set this in the admin area, but you can also do it via your script:
// Set all indexers to manual mode
$processCollection = Mage::getSingleton('index/indexer')->getProcessesCollection();
foreach ($processCollection as $process) {
    $process
        ->setMode(Mage_Index_Model_Process::MODE_MANUAL)
        ->save();
}
// Set all indexers back to real-time mode
$processCollection = Mage::getSingleton('index/indexer')->getProcessesCollection();
foreach ($processCollection as $process) {
    $process
        ->setMode(Mage_Index_Model_Process::MODE_REAL_TIME)
        ->save();
}
If you are looking for a way to reindex directly in your script after importing...
$processCollection = Mage::getSingleton('index/indexer')->getProcessesCollection();
foreach ($processCollection as $process) {
    $process->reindexEverything();
}
That said, Magmi - http://sourceforge.net/projects/magmi/files/magmi-0.7/ - is not only amazingly fast at importing products, it also provides some really nice features.
I don't know of any other product import tool for Magento that is as fast (I'd be interested if anyone knows of one that is as fast or faster?)
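Putting the pieces together, an import run can switch the indexers to manual, do the work, then reindex once at the end. A minimal sketch (importProducts() is a hypothetical stand-in for your own import logic):
// Sketch: wrap the import so per-product saves don't trigger reindexing.
// importProducts() is hypothetical - replace it with your own routine.
require_once 'app/Mage.php';
Mage::app('admin');

$processCollection = Mage::getSingleton('index/indexer')->getProcessesCollection();

// 1. Switch every indexer to manual
foreach ($processCollection as $process) {
    $process->setMode(Mage_Index_Model_Process::MODE_MANUAL)->save();
}

// 2. Run the actual import
importProducts();

// 3. Reindex once, then restore real-time mode
foreach ($processCollection as $process) {
    $process->reindexEverything();
    $process->setMode(Mage_Index_Model_Process::MODE_REAL_TIME)->save();
}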

Related

Laravel 5: Heavy Select Query

I have about 25,000 rows in my DB table 'movies' (InnoDB, 17.5 MB).
And when I try to get them all to display in my admin panel, nothing happens. Just 5-8 seconds of waiting and a white screen. No errors displayed, just nothing. (Max execution time is 3600 seconds, because it's on my local machine.) My simple as hell code:
public function index()
{
    $data['movies'] = Movies::all();
    dd('This var_dump & die never fires');
    // return view('admin.movies', $data);
}
I just wonder why it doesn't perform the query and simply dies without a declaration of war.
I didn't find anything interesting in .env or config/database.php that would explain what happens in such situations.
PS: Yes, I could do server-side pagination and search and take only 10-25 records from the DB, but the question is not about that.
Looks like you are running out of memory. Try querying half of the results, or maybe just 100, to see if that at least fixes the white page. If so, use chunk:
Movies::chunk(200, function ($movies) {
    foreach ($movies as $movie) {
        var_dump($movie);
    }
});
You should definitely look at your storage/logs directory to verify the error. It's quite possible that fetching 25k rows takes too much memory.
In fact, as you mentioned, in real life there is no need to fetch that many rows at once unless you are exporting them to CSV or XLS.
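If the end goal is an export, the same chunking idea keeps memory flat. A minimal sketch (the column names title and year are assumptions; adjust to your schema):
// Sketch: stream the movies table to CSV in chunks so it is never fully in memory.
// The column names (title, year) are assumptions.
$handle = fopen(storage_path('exports/movies.csv'), 'w');
fputcsv($handle, ['id', 'title', 'year']);

Movies::chunk(500, function ($movies) use ($handle) {
    foreach ($movies as $movie) {
        fputcsv($handle, [$movie->id, $movie->title, $movie->year]);
    }
});

fclose($handle);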

How many lines and documents should be there in the training data opennlp categorizer

I am following the documentation for Apache OpenNLP. I was able to understand sentence detection, the tokenizer, and the name finder, but I got stuck on the categorizer. The reason: I cannot understand how to create a model for categorization.
I do understand that I need to create a file. The format is very clear: each line contains a category, a space, and a document. Save the file with a .train extension.
So I created the following file:
Refund What is the refund status for my order #342 ?
NewOffers Are there any new offers for your products ?
I gave this command:
opennlp DoccatTrainer -model en-doccat.bin -lang en -data en-doccat.train -encoding UTF-8
It starts doing something and then returns with an error. These are the contents in the command prompt:
Indexing events using cutoff of 5
Computing event counts... done. 2 events
Indexing... Dropped event Refund:[bow=What, bow=is, bow=the, bow=refund, bow=status, bow=for, bow=my, bow=order, bow=#342, bow=?]
Dropped event NewOffers:[bow=Are, bow=there, bow=any, bow=new, bow=offers, bow=for, bow=your, bow=products, bow=?]
done.
Sorting and merging events... Done indexing.
Incorporating indexed data for training...
Exception in thread "main" java.lang.NullPointerException
at opennlp.maxent.GISTrainer.trainModel(GISTrainer.java:263)
at opennlp.maxent.GIS.trainModel(GIS.java:256)
at opennlp.model.TrainUtil.train(TrainUtil.java:184)
at opennlp.tools.doccat.DocumentCategorizerME.train(DocumentCategorizerME.java:162)
at opennlp.tools.cmdline.doccat.DoccatTrainerTool.run(DoccatTrainerTool.java:61)
at opennlp.tools.cmdline.CLI.main(CLI.java:222)
I am just not able to figure out why this is giving a null pointer exception here. I also tried adding two more lines, but no result.
Refund What is the refund status for my order #342 ?
NewOffers Are there any new offers for your products ?
Refund Can I place a refund request for electronics ?
NewOffers Is there any new offer on buying worth 5000 ?
I found a blog where pretty much the same thing is done, and his training file works like a charm when I try it. What is wrong with my file, and how do I resolve the error?
When I run opennlp DoccatTrainer on its own it prints the usage/help, so the path is not an issue. Any help is appreciated.
EDIT: I changed the file to
Refund What is the refund status for my order #342 ? Can I place a refund request for clothes ?
NewOffers Are there any new offers for your products ? what are the offers on new products or new offers on old products?
Refund Can I place a refund request for electronics ?
NewOffers Is there any new offer on buying worth 5000 ?
and it works. I thought it had something to do with the documents (apparently each should be two sentences), so I removed the last two lines
to make it
Refund What is the refund status for my order #342 ? Can I place a refund request for clothes ?
NewOffers Are there any new offers for your products ? what are the offers on new products or new offers on old products?
But then it fails again, so the question boils down to: what kind of data/format/documents does it need?
Thanks
You have to add at least 5 samples for each category, because the default cutoff is 5.
Please refer to this blog post:
http://madhawagunasekara.blogspot.com/2014/11/nlp-categorizer.html
You can use the -cutoff flag in your DoccatTrainer command to change the default. In your case, you would add -cutoff 1 so that events from your small training set are no longer dropped during indexing.
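Assuming your OpenNLP build accepts the flag on the command line, the full invocation would look something like this:
opennlp DoccatTrainer -model en-doccat.bin -lang en -data en-doccat.train -encoding UTF-8 -cutoff 1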

Web Scraping using simplehtmldom on multiple sites

I am using the simplehtmldom parser for my recent web scraping project; the project is a price comparison website built with CodeIgniter. The website has to fetch product names, descriptions and prices from different shopping websites. Here is my code:
$this->dom->load_file('http://www.site1.com');
$price1 = $this->dom->find("span[itemprop=price]");
$this->dom->load_file('http://www.site2.com');
$price2 = $this->dom->find("div.price");
$this->dom->load_file('http://www.site3.com');
$price3 = $this->dom->find("div.priceBold");
$this->dom->load_file('http://www.site4.com');
$price4 = $this->dom->find("span.fntBlack");
$this->dom->load_file('http://www.site5.com');
$price5 = $this->dom->find("div.price");
The above code takes approximately 15-20 seconds to load the result onto the screen. When I try with only one site, it takes just 2 seconds. Is this how simplehtmldom works with multiple domains, or is there a way to optimize it?
PHP Simple HTML DOM Parser has some memory leak issues, so before trying to load a new page, clear the previous one using:
$this->dom->clear();
unset($this->dom);
If this doesn't change anything, then one of the websites is simply slow to respond... you'll have to check them one by one to find the culprit xD
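A quick way to check them one by one is to time each fetch separately, clearing the parser between loads. A minimal sketch (the URLs and selectors mirror the ones from the question):
// Sketch: time each site individually to find the slow one, and clear the
// parser between loads to avoid Simple HTML DOM's memory leaks.
$sites = array(
    'http://www.site1.com' => 'span[itemprop=price]',
    'http://www.site2.com' => 'div.price',
    'http://www.site3.com' => 'div.priceBold',
    'http://www.site4.com' => 'span.fntBlack',
    'http://www.site5.com' => 'div.price',
);

foreach ($sites as $url => $selector) {
    $start = microtime(true);

    $dom = new simple_html_dom();
    $dom->load_file($url);
    $price = $dom->find($selector);

    echo $url . ' took ' . round(microtime(true) - $start, 2) . "s\n";

    $dom->clear();
    unset($dom);
}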

Extremely slow WordPress user import on XAMPP

I'm posting this question here because I'm not sure it's a WordPress issue.
I'm running XAMPP on my local system, with 512 MB max headroom and a 2.5-hour PHP timeout. I'm importing about 11,000 records into the WordPress wp_users and wp_usermeta tables via a custom script. The only unknown quantity (performance-wise) on the WordPress end is the wp_insert_user and update_user_meta calls. Otherwise it's a straight CSV import.
The process to import 11,000 users and create 180,000 usermeta entries took over 2 hours to complete. It was importing about 120 records a minute. That seems awfully slow.
Are there known performance issues importing user data into WordPress? A quick Google search was unproductive (for me).
Are there settings I should be tweaking beyond the timeout in XAMPP? Is its MySQL implementation notoriously slow?
I've read something about virus software dramatically slowing down XAMPP. Is this a myth?
Yes, there are a few issues with local vs. hosted setups. One of the important things to remember is the max_execution_time for the PHP script. You may need to reset the timer once in a while during the data upload.
I suppose you have a loop that takes the data row by row from a CSV file, for example, and uses an SQL query to insert it into the WP database. I usually put this simple snippet into my loop so it keeps resetting the PHP max_execution_time:
$counter = 1;

if (($handle = fopen("some-file.csv", "r")) !== FALSE) {
    while (($data = fgetcsv($handle, 1000, ",")) !== FALSE) {
        // some upload query
        // mysql_query..... blablabla....

        // snippet: every 20 rows, reset the execution timer
        if ($counter == 20) {
            set_time_limit(0);
            $counter = 0;
        }
        $counter = $counter + 1;
    } // end of the loop
    fclose($handle);
}
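Applied to the WordPress calls from the question, the same idea might look like the sketch below (the CSV column order and the meta key are assumptions):
// Sketch: import users from CSV with wp_insert_user()/update_user_meta(),
// resetting the PHP execution timer every 20 rows.
// The CSV layout (login, email, first name) and the meta key are assumptions.
$counter = 1;

if (($handle = fopen("users.csv", "r")) !== FALSE) {
    while (($data = fgetcsv($handle, 1000, ",")) !== FALSE) {
        $user_id = wp_insert_user(array(
            'user_login' => $data[0],
            'user_email' => $data[1],
            'user_pass'  => wp_generate_password(),
        ));

        if (!is_wp_error($user_id)) {
            update_user_meta($user_id, 'first_name', $data[2]);
        }

        if ($counter == 20) {
            set_time_limit(0);
            $counter = 0;
        }
        $counter = $counter + 1;
    }
    fclose($handle);
}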
Also, BTW, 512 MB of headroom is not much if the database is big. Count how many resources your OS and all running apps are taking. I have a WP database of over 2 GB and my MySQL needs a lot of RAM to run fast (it depends on the queries you are running as well).

Saving In Magento Taking A Very Very Long Time

In Magento I write a number of small command-line scripts to do things like set a new attribute on a number of products. I am finding that updating 900 products takes about 6 hours to complete.
Loading the individual products is as fast as I would expect, but saving once I have made the change takes a very long time.
I am attaching how I am loading the products in case there is something I can do to better optimize the process. Any help here would be greatly appreciated.
$product = Mage::getModel('catalog/product')->load($magento_id);
$product->setMadeInUsa(1);
try {
    $product->save();
} catch (Exception $e) {
    echo "ERROR: " . $e->getMessage() . "\n";
}
The code runs without error, but it takes forever.
Mage::getSingleton('catalog/product_action')
    ->updateAttributes(array($product->getId()), $attributesData, $storeId);
This code only updates the attributes you want to change. The first parameter is an array of product IDs, the second is an array of attribute names and values, and the third is the store ID you wish to update.
This is MUCH faster than saving the entire model.
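For the made_in_usa attribute from the question, the call would look something like this (a sketch; store ID 0 for the default/admin scope is an assumption):
// Sketch: update only the made_in_usa attribute for a batch of products.
// Store ID 0 (default/admin scope) is an assumption.
$productIds     = array($magento_id); // or a whole batch of IDs
$attributesData = array('made_in_usa' => 1);
$storeId        = 0;

Mage::getSingleton('catalog/product_action')
    ->updateAttributes($productIds, $attributesData, $storeId);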
Try first setting indexing to Manual and then reindexing after the update is done. This should improve performance. However, the ultimate solution, if you are going to do the import often, is to follow the code ideas you can find in the "update attributes" mass action, which is optimized for saving many products at once.
