Reduce inode count on Magento website - magento

I am getting errors on my website and my inode count is over the limit. The hosting inode limit is 200,000 but my website's inode count is 909,496, and I can't even open phpMyAdmin. The hosting support asked me to remove unused files. How can I decrease the inode count, and which files are unused in a Magento-based website?

This is usually an indicator that you need a more capable hosting provider.
The major places that Magento creates files during operation are the var/ folder and your product image cache.
If you've never checked before, the following areas can accumulate a phenomenal amount of detritus. Using an FTP client, check the following areas in your var/ folder (a small cleanup sketch follows the list):
Check that you don't have a bazillion session files in var/session; remove anything older than the current date.
Check that there isn't an excessive number of files in var/report; you might want to find out why Magento is generating them and fix the issue. Delete them all.
Over time, logging will generate several huge files in var/log; delete them, then look at the new ones to find out what errors are being generated.
Imports and other operations can cause temporary files to accumulate in var/tmp; delete them. Also check var/import for old imports that can be deleted.
Stored database backups are kept in var/backup and are managed via the admin backend under System > Tools > Backups:
Download the latest database backups to a local workstation, then delete all the backups.
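If you have shell access rather than just FTP, pruning those var/ areas by file age is quick to script. Below is only a minimal sketch of that idea, written in Go; the relative paths and the 7-day cutoff are assumptions for illustration, it deliberately leaves var/backup alone, and you should review what it would delete before running anything like it against a live store:
package main

import (
    "fmt"
    "os"
    "path/filepath"
    "time"
)

// Hypothetical cleanup sketch: prune files older than a cutoff from the var/
// subfolders discussed above. The 7-day cutoff and the relative paths are
// assumptions; run it from the Magento root and check its output first.
func main() {
    cutoff := time.Now().AddDate(0, 0, -7) // keep the last 7 days
    dirs := []string{"var/session", "var/report", "var/log", "var/tmp"}

    for _, dir := range dirs {
        filepath.Walk(dir, func(path string, info os.FileInfo, err error) error {
            if err != nil || info.IsDir() {
                return nil // skip unreadable entries and directories
            }
            if info.ModTime().Before(cutoff) {
                fmt.Println("removing", path)
                return os.Remove(path)
            }
            return nil
        })
    }
}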
Magento uses a lot of caching to store information. The biggest consumer will be the image cache if you have a large catalog; it will contain cached images from the beginning of time, plus lots of useless ones if you've deleted products over time. Using the admin backend, go into System > Cache Management:
Clear the Magento Cache.
Flush Catalog Images Cache.
Magento does not delete product images when you delete a product. In fact, Magento would be a prime candidate for appearing on one of those Hoarders programs that were prevalent on TV there for a while.
After you get the site working, consider installing ImageClean.
Hopefully this will have reduced your inode count enough to carry out the following operations. Before proceeding, do a couple of database backups and store them off-server!!!
The next step is to ask your hosting provider whether they include your database in that inode count. If they do, you are kind of stuck: Magento uses InnoDB, and they have likely (cheaply) not set up MySQL to use file-per-table, which is what lets you shrink InnoDB storage by optimizing each table. Ask them if they used innodb_file_per_table when they set up MySQL; if they don't know what it is, develop that sinking feeling in the pit of your stomach.
Some tables get excessively huge, especially if you haven't properly set up the Magento master cron job trigger in your cPanel and checked that log table cleaning is enabled in System > Configuration > Advanced > System > Log Cleaning. These tables are as follows:
'dataflow_batch_export',
'dataflow_batch_import',
'log_customer',
'log_quote',
'log_summary',
'log_summary_type',
'log_url',
'log_url_info',
'log_visitor',
'log_visitor_info',
'log_visitor_online',
'index_event',
'report_event',
'report_viewed_product_index',
'report_compared_product_index',
'catalog_compare_item',
'catalogindex_aggregation',
'catalogindex_aggregation_tag',
'catalogindex_aggregation_to_tag'
Magento has a built-in script to clean the logs. If running it crashes with a memory error because you've never set the cron job up and there's too much bloat to clean out, Crucial Web Host has a script that can be run to manually delete all the log table contents, including the dataflow tables, which won't be cleaned out by the Magento log-cleaning process. If you use dataflow import/export a lot, Nexcess has a script that can check on the dataflow tables' size and clear them as well.
After cleaning the database, you will need to use phpMyAdmin to optimize each table in your Magento database. If the hosting provider hasn't set up file-per-table in MySQL, though, this will do squat for reducing your inode count.
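For reference, here is a minimal sketch, in Go with the github.com/go-sql-driver/mysql driver, of what those cleanup scripts essentially boil down to: truncate the bloated log and dataflow tables, then optimize them. The DSN and the shortened table list are placeholders for illustration (use the full list above), and as noted, optimizing only reclaims disk space with file-per-table enabled. Back up the database first:
package main

import (
    "database/sql"
    "log"

    _ "github.com/go-sql-driver/mysql"
)

// Hypothetical sketch of the "truncate the log tables, then optimize them"
// step. The connection string and the table list are assumptions; adjust
// them for your installation and back up the database before running it.
func main() {
    db, err := sql.Open("mysql", "user:password@tcp(127.0.0.1:3306)/magento")
    if err != nil {
        log.Fatal(err)
    }
    defer db.Close()

    tables := []string{
        "log_customer", "log_quote", "log_summary", "log_summary_type",
        "log_url", "log_url_info", "log_visitor", "log_visitor_info",
        "log_visitor_online", "report_event",
        "dataflow_batch_export", "dataflow_batch_import",
    }
    for _, t := range tables {
        if _, err := db.Exec("TRUNCATE TABLE " + t); err != nil {
            log.Printf("truncate %s: %v", t, err)
            continue
        }
        // OPTIMIZE TABLE only shrinks files on disk if innodb_file_per_table is on.
        if _, err := db.Exec("OPTIMIZE TABLE " + t); err != nil {
            log.Printf("optimize %s: %v", t, err)
        }
        log.Printf("cleaned %s", t)
    }
}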
After all that, don't bother messing with deleting application files or anything else Magento uses. Magento doesn't really accumulate that much outside the var/ folders and the image cache, and you will likely end up with a dead website.
At this point, you're at the mercy of a shared hosting plan that has decided to be fair to everyone by limiting what can be done in each account and doesn't allow enough resources to run Magento. Start looking for a hosting provider that supports Magento; often they don't bother limiting your inode count (a cheap trick to allow too many people to share a hard drive), as they offer plenty of disk space for you to run your e-commerce website.

Related

DFS starts a new metadata refresh of the namespace every hour and never finishes - replication halted

My organisation uses DFS to replicate three servers: a hub at one site, a spoke at one site, and a spoke at a remote site. They contain a number of folders that are separate shares, all within the same replication group. So the namespace is CompFileShare, and it contains IT, Public, etc. as shares that users can map based on permissions.
Our remote site's drive recently filled up, and since the drive it was hosted on was MBR and maxed out at 2 TB, we created a new 4 TB GPT drive. We then robocopied the MBR drive to the GPT drive using the recommended robocopy flags and let it finish. After it finished, we unshared the original IT folder, shared the new IT folder on the GPT drive, changed the replication target from MBR to GPT, and let it replicate.
For some reason, possibly unrelated, we are now seeing the remote site's fileshare server throwing Event ID 516 every hour:
DFSN service has started performing complete refresh of metadata for namespace TTFileShare. This task can take time if the namespace has large number of folders and may delay namespace administration operations.
Because it is running, all replication to the remote site is halted, even for shares still on the old MBR drive. It had been sitting like this for a day before I really got a chance to look at it. Any tips or places I should look to resolve this issue?
So...
Part of the reason for the migration was that our MBR drive was completely full: 100 KB left.
We deleted a folder (that had already been robocopied to the new drive) from the full drive, freeing up ~20 MB, and restarted the server.
The server then replicated everything, no problems. So, lesson learned: don't let drives get so full that DFS can't even manage their data.
EDIT: Also, the namespace refresh is flagged as a warning when it should be flagged as informational. If you look under the informational messages you will see the namespace-refresh-finished message shortly after the namespace-refresh-started message. This is the actual cause of the problem in the title: a Microsoft bug mislabelling the namespace refresh as a warning. It's just informational.

PostgreSQL statistics issue - could not rename temporary statistics file

I am running PostgreSQL 9.4 on Windows, and constantly get the error:
2015-06-15 09:35:36 EDT LOG: could not rename temporary statistics file "pg_stat_tmp/global.tmp" to "pg_stat_tmp/global.stat": Permission denied
I also see constant 200-800k writes to global.stat and global.tmp. I have seen other users with the same issue, but no solution.
It is a big database server, with 300 GB of data and 6,000 databases.
I tried setting,
track_activities=off
in the config file, but it did not seem to have any effect.
Any help with the error, or with reducing the writes?
After my initial answer, I decided to research the operation of the stats collector and in particular what it is doing with the files in pg_stat_tmp. I've substantially re-written the answer as a result.
What are the global.stat / global.tmp files used for?
PostgreSQL contains functionality to collect statistics and status information about its operation. This functionality is described in Section 27.2 of the manual.
This information is collated by the stats collector process and made available to the other PostgreSQL processes via the global.stat file. The first time you run a query that accesses this data within a transaction, the backend you are connected to will read the global.stat file and cache the result, using it until the end of the transaction.
To keep this file up to date, the stats collector process periodically re-writes it with updated information. It typically does this several times a second. The process is as follows:
Create a new file global.tmp
Write data to this file
Rename global.tmp as global.stat, overwriting the previous global.stat
The global.tmp and global.stat files are written into the directory configured by the stats_temp_directory configuration parameter. Normally this is set to $PGDATA/pg_stat_tmp.
On shutdown, the stats file is written into the file $PGDATA/global/pgstat.stat, and the files in the tmp dir above are removed. This file is then read and removed when the database is started up again.
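To make that concrete, here is an illustrative sketch in Go (not PostgreSQL's actual code) of the create/write/rename cycle described above. The point is that the final rename is the step that fails with a permission error on Windows if a reader still has global.stat open, while on Linux it succeeds:
package main

import (
    "log"
    "os"
)

// Illustrative sketch of the write-then-rename cycle. On Linux the rename
// succeeds even while readers hold global.stat open; on Windows it fails
// with a "permission denied"-style error instead.
func rewriteStats(data []byte) error {
    tmp := "pg_stat_tmp/global.tmp"
    final := "pg_stat_tmp/global.stat"

    // 1-2. Create the temporary file and write the fresh statistics into it.
    if err := os.WriteFile(tmp, data, 0o600); err != nil {
        return err
    }
    // 3. Replace the old stats file with the new one in a single rename.
    return os.Rename(tmp, final)
}

func main() {
    if err := rewriteStats([]byte("fresh stats snapshot")); err != nil {
        // The stats collector just logs this and tries again on the next cycle.
        log.Println("could not rename temporary statistics file:", err)
    }
}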
Why is the stats collector process creating so much I/O load?
Normally, the amount of data written to global.stat is relatively modest and writing it does not generate that much I/O traffic. However, under some circumstances it does seem to get very bloated. When this happens, the amount of load generated can become excessive, as the entire file is rewritten more than once a second.
I have had one experience where it grew by a factor of 10 or more compared to other similar servers. This machine did have an unusually large number of databases (for our application at least: 30-40 databases, but nothing like the 6,000 you say you have). It is possible that having a large number of databases exacerbates this.
Some of the references below talk about a pattern of creating and dropping lots of tables causing bloat in these files, and suggest that perhaps autovacuum is not running aggressively enough to remove the associated bloat. You may wish to review your autovacuum settings.
Why do I get 'Permission Denied' errors on Windows?
After examining the PostgreSQL source code, I think there may be a race condition in accessing the global.stat file which could happen at any time, but is exacerbated by the size of the file.
The default mode of operation in Windows is that it is not possible to rename or remove a file while another process has it open. This is different to Linux (or Unix) where a file can be renamed or removed while other processes are accessing it.
In the sequence above you can see that if one of the backend processes is reading the file at the same time as the stats collector is rewriting it, then the backend process may still have the file open at the time the rename is attempted. That leads to the 'Permission Denied' error you are seeing.
Naturally when the file becomes very large, then the amount of time taken to read it becomes more significant, therefore the probability of the stats collector process attempting a rename while a backend still has it open increases.
However, since the file is frequently being rewritten, the impact of these errors is relatively mild. It just means that this particular update fails, leading to the backends getting slightly out-of-date statistics. The next update will probably succeed.
Note that Windows does offer a file-opening mode which allows files to be deleted or renamed while they are open in another process; however, as far as I could tell, this mode is not used by PostgreSQL. I could not find any bug report on this, so it seems like it should be reported.
In summary, these errors are a side effect of the main problem, which is the excessive size of the global.stat file.
I've turned track_activities off but the file is still being written - Why?
From what I can see, track_activities affects only one of the sets of information that the stats collector is collecting.
In addition, it looks as though the stats collector process is started regardless of these settings, and will continue to re-write the file. The settings appear to control only the collection of fresh data.
My conclusion is that once the file has become bloated, it will remain so and continue to be re-written, even once all of the stats collection options are turned off.
What can I do to avoid this problem?
Once the file has become bloated, it seems that the easiest way to get the database back into a good working state is to remove the file, using the following steps:
Stop the database
When the DB is stopped, the pg_stat_tmp directory is empty and a file $PGDATA/global/pgstat.stat is written. We renamed this file to pgstat.stat.old.
Start the database. It creates a fresh set of pgstat files. After confirming the server was operating correctly you can remove the old file you have renamed.
This is the process we used when one of our servers suffered from this problem.
Needless to say be very careful when manually manipulating any files under the Postgresql Data directory.
After this you may want to monitor the server to see if the file becomes bloated again. If it does, here are some additional ideas to consider:
As mentioned above I have seen some references to this file becoming bloated if autovacuum is not running aggressively enough. You may wish to tune the autovacuum settings
Disabling any of the track_xxx options described in Section 18.9.1 of the manual which are not required may help.
It is possible to place the pg_stat_tmp directory on a tmpfs filesystem (or whatever equivalent RAM-based filesystem is available on Windows). Doing so should eliminate I/O as a concern for these files.
References:
Postgres stats collector showing high disk I/O
Too much I/O generated by postgres stats collector process
stats collector suddenly causing lots of IO
This might be a solution for your problem: https://wiki.postgresql.org/wiki/May_2015_Fsync_Permissions_Bug
Another possibility could be antivirus settings. Try to turn it off temporarily.
This happened to me a few days ago. I rebooted the machine, but the error did not disappear.
I don't know why, but performing a vacuum analyze verbose did the trick, and the error has stopped showing up.

Golang file and folder replication / mirroring across multiple servers

Consider this scenario. In a load-balanced environment, I have 3 separate instances of a CMS running on 3 different physical servers. These 3 separate running instances of the application share the same database.
On each server, the CMS has a /media folder where all media subfolders and files reside. My question is how I'd implement/code a file replication service/functionality in Golang, so when a subfolder or file is added/changed/deleted on one of the servers, it'll get copied/replicated/deleted on all other servers?
What packages would I need to look in to, or perhaps you have a small code snippet to help me get started? That would be awesome.
Edit:
This question has been marked as a "duplicate", but it is not. It is, however, an alternative to setting up a shared network file system. I'm thinking that keeping a copy of the same file on all servers, synchronizing them and keeping them updated, might be better than sharing them.
You probably shouldn't do this. Use a distributed file system, object storage (a la S3 or GCS), or a syncing program like btsync or syncthing.
If you still want to do this yourself, it will be challenging. You are basically building a distributed database and they are difficult to get right.
At first blush you could check out something like etcd or raft, but unfortunately etcd doesn't work well with large files.
You could, on upload, also copy the file to every other server using ssh. But then what happens when a server goes down? Or what happens when two people update the same file at the same time?
Maybe you could design it such that every file gets a unique id (perhaps based on the hash of its contents so you can safely dedupe) and those files can never be updated or deleted, only added. That would solve the simultaneous update problem, but you'd still have the downtime problem.
One approach would be for each server to maintain an append-only version log, recording an entry whenever a file is added:
VERSION | FILE HASH
1 | abcd123
2 | efgh456
3 | ijkl789
With that, you can pull every file from a server, and a single number is sufficient to know when a file has been added. (For example, if you think Server A is on version 5 and you are informed it is now on version 7, you know you need to sync 2 files.)
You could do this with a database table:
ID | LOCAL_SERVER_ID | REMOTE_SERVER_ID | VERSION | FILE HASH
You could periodically poll this table and do your syncing via ssh or http between machines. If a server was down, you could just retry until it works.
Or if you didn't want to have a centralized database for this you could use a library like memberlist. The local meta data for each node could be its version.
Either way, there will be some delay between when a file is uploaded to a single server and when it's available on all of them. Handling that well is hard, which is why you probably shouldn't do this.
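To help you get started, here is a minimal sketch in Go of the append-only version log idea described above. The type and function names are illustrative only; a real service would call Append from its upload handler and expose Missing over HTTP or ssh for peers to poll:
package main

import (
    "crypto/sha256"
    "encoding/hex"
    "fmt"
    "os"
    "sync"
)

// Entry is one line of the append-only version log: a version number and a
// content hash. Because IDs are content-addressed, entries are never updated.
type Entry struct {
    Version int
    Hash    string
}

// VersionLog is the per-server log.
type VersionLog struct {
    mu      sync.Mutex
    entries []Entry
}

// Append registers a newly added file and returns its log entry.
func (l *VersionLog) Append(path string) (Entry, error) {
    data, err := os.ReadFile(path)
    if err != nil {
        return Entry{}, err
    }
    sum := sha256.Sum256(data)

    l.mu.Lock()
    defer l.mu.Unlock()
    e := Entry{Version: len(l.entries) + 1, Hash: hex.EncodeToString(sum[:])}
    l.entries = append(l.entries, e)
    return e, nil
}

// Missing returns the entries a peer that has synced up to sinceVersion
// still needs to pull.
func (l *VersionLog) Missing(sinceVersion int) []Entry {
    l.mu.Lock()
    defer l.mu.Unlock()
    if sinceVersion >= len(l.entries) {
        return nil
    }
    return append([]Entry(nil), l.entries[sinceVersion:]...)
}

func main() {
    var vlog VersionLog
    // In a real service, Append would be called when a file lands in /media.
    if e, err := vlog.Append("media/example.jpg"); err == nil {
        fmt.Println("added version", e.Version, "hash", e.Hash)
    }
    fmt.Println("a peer at version 0 needs:", vlog.Missing(0))
}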

Magento reindex does not populate solr search

We have Solr installed as a Tomcat application. With the default installation of Magento EE with sample data, everything runs perfectly.
However, with a couple of our stores running, the index does not touch the /solr/data folder. Normally when a reindex is called, all files in /data are cleared and repopulated; however, on the live site this does not happen.
I have stripped back a lot of extensions and run the index again. This time the files are recreated, but almost no data is captured, even less than the sample stores, despite a much larger database.
Does anyone have any clues as to where we should be looking?

Magento - Magento Cache

I am using memcache.
I want to understand what is stored in the Magento cache, and how.
Does Magento store cache variables with website scope or store scope?
I have googled and grepped the code but couldn't conclude anything.
Please can someone direct me to the correct links and paths?
Thanks & Regards,
Saurabh
If you go to the Cache Management section of the admin area you can see what it caches (configuration, layout configuration, block HTML output, translations, EAV types, etc.). I am no expert on Magento's caching mechanisms, but here are a few random tidbits that might be helpful (maybe). (Also note that I am only familiar with Magento 1.3.x, not 1.4.x, so things could have changed.)
The caching is actually stored in the var/cache directory. There are a ton of directories in there (mage--0, mage--1, mage--2) and each directory has the cache files. Do a ls var/cache/mage*/* to see all the files.
Configuration - The source for the configuration is varied. Your app/etc/local.xml and all of the config.xml files (in each module's etc dir) are combined together to make one big configuration object. Then Magento reads from the core_config_data table to update the configuration object. Then the configuration is written to a cache file so that the next time a request is made it doesn't need to open a ton of config files and hit the database. Somehow this info gets stored in a bunch of files under var/cache. For some insight, do a ls var/cache/mage*/*CONF*.
Layout - This is a lot like the configuration: there are a bunch of XML files in the app/design/frontendOrAdminhtml/yournamespace/layout/ directory, and all of these are merged into one layout configuration object, then cached in the cache directory.
Block HTML - The actual html generated by a block is cached. Each block is able to decide how long it is going to be cached.
Lastly, to (not really) answer your question about whether the cache is per website or per store: I can't really say, since I haven't had the need to set up a multi-website/multi-store shop yet. It looks like there may be some store/website-specific files, but I can't see that they are really organized in a logical way. For example, in one of my instances I see a var/cache/mage--f/mage---LAYOUT_FRONTEND_STORE0_DEFAULT_BLANK_SEO file and a var/cache/mage--f/mage---LAYOUT_FRONTEND_STORE1_DEFAULT_BLANK_SEO, but then again, I only have one store configured and those two files have the same contents. Good luck with that!
You could also use some of the very good memcached analysis and reporting tools available:
http://code.google.com/p/memcached/wiki/Tools
The best solution I have come up with is to use a two level cache.
Consult app/etc/local.xml.additional to see how to put memcached server nodes in there. Note that within the <servers> tag you will have to have tags like <server1> and <server2> encapsulating each memcached node's settings.
<cache>
<backend>memcached</backend>
<slow_backend>database</slow_backend>
</cache>
In this way all cache is shared.
To clear it, the way I do it is to:
1. Shut down Apache.
2. Connect to MySQL, switch to the Magento DB, and run truncate core_cache; truncate core_cache_tag.
3. Bounce the memcached nodes.
4. Restart Apache, but keep it out of the load balancer until I have hit it at least once to generate the APC opcode cache. Otherwise the load can shoot up through the roof.
This all seems extreme, but I have found it works for me. Clearing the cache using the backend is REALLY slow. I have around 100k entries in the core_cache table and close to 1 million entries in core_cache_tag. If I don't do it this way, sometimes I get strange behavior.
Your memcached configuration in ./app/etc/local.xml will dictate what memcached is actually caching.
If you are only using the single-level cache (without <slow_backend>), then Magento will store its cache (in its entirety) in memcached.
HOWEVER, without the slow_backend defined, it is caching content without cache tags, i.e. without the ability to differentiate cache items (e.g. configuration, blocks, layouts, translations, etc.).
So, without the <slow_backend> defined, you cannot refresh caches individually; in fact, you'll almost always have to rely on "Flush Cache Storage" to actually see updates take effect.
We wrote a nice article here which covers your very issue - http://www.sonassi.com/knowledge-base/magento-knowledge-base/what-is-memcache-actually-caching-in-magento/
Memcached is a distributed memory caching system. It speeds up websites with large dynamic databases by storing database objects in memory, reducing the pressure on the server whenever an external data source requests a read. A memcached layer reduces the number of times database requests are made.
Configure Memcached in Magento 2
Magento 2 also supports memcached for caching objects, but it isn't enabled by default. You need to make simple changes to the $Magento2Root/app/etc/env.php file to enable it.
In env.php, you will see a large number of PHP arrays with different settings and configurations. Open the file in your favorite code editor and locate the following code:
'session' =>
array (
  'save' => 'files',
),
Modify this chunk as:
'session' =>
array (
  'save' => 'memcached',
  'save_path' => '<memcache ip or host>:<memcache port>'
),
Note that the default value for the memcache ip or host is 127.0.0.1, and the default value for the memcache port is 11211.
For the complete manual, please see:
https://www.cloudways.com/blog/magento-2-memcached/
https://devdocs.magento.com/guides/v2.4/config-guide/memcache/memcache_magento.html
