It's a frequently asked question, I know, but I've tried every suggested solution (apc.stat=0, increasing shared memory, etc.) with no benefit.
Here's a screenshot of the stats (you can see nginx and php5-fpm) and the parameters set in apc.ini:
APC is used for system and user-cache entries on multiple sites (8-9 WordPress sites and one with MediaWiki and SMF).
What would you suggest?
Each WordPress site is going to cache a healthy amount in the user cache. I've looked at this myself in depth and have found the best 'guesstimate' is that if you use the user cache in APC, keep the fragmentation under 10%. This can sometimes mean you need to reserve upwards of 10x the amount of memory you intend to actually use for caching to avoid fragmentation. Start where you are and keep increasing the memory allocated (apc.shm_size) until fragmentation stays below 10% after running a while.
BTW: the WordPress pages being cached are huge, so you'll probably need a lot of memory to avoid fragmentation.
Why 10% fragmentation? It's a bit of a black art, but I observed this is where performance starts to noticeably decline. I haven't found any good benchmarks (or run my own controlled tests), however.
This 10x amount may seem insane, but the root cause is that APC has no way to defragment other than a restart (which completely dumps the cache). Having a slab of 1 GB of memory when you only plan to use 100-200 MB gives a lot of space to fill without having to look for 'holes' to put stuff in. Think about bad old FAT16 or FAT32 disk performance with Windows 98--the constant need to manually defrag as the disk gets over 50% full.
If you can't afford the extra memory to spare, you might want to look at either memcached or plain old file caching for your user cache.
I've been reading a lot about operating systems lately. I understand how a cache works and what it is used for.
However, when I asked myself a question, I was unable to find an answer.
If a cache can be made as large as the device it is caching (for instance, a cache as large as a disk), why not do so and eliminate the device?
Assuming the cache is in-memory, it's not persistent, meaning you'll lose it once your machine is rebooted. If nothing else, you'll need persistent storage (disk) to avoid losing data on restarts and power losses.
For a disk cache, there's no technical reason why battery-backed RAM or the flash storage usually used as an HDD cache couldn't be combined to provide terabytes of capacity. The average user won't be using those RAM-based storage products because they're far more expensive than normal storage and still consume too much power to be left unplugged for hours. We are leaving SSHDs behind for all-flash storage, but even SSDs have a RAM cache to reduce and consolidate I/O calls.
For CPU caches, there's a practical limit: they take up precious physical space that could be used for other units, they increase distance (and thus latency), and they generate heat (which puts a hard limit on size no matter how large your cooling budget is). That's why CPU caches are multilevel instead, with the small L1 tightly coupled to a core, the bigger L2 shared by multiple cores, and the even bigger but slower L3 built from slower, cheaper components. Even with an unlimited budget and an exotic process, a CPU with a multilevel cache will have better performance than one with a single-level cache that is forced to put everything on (relatively) distant silicon and suffer the latency.
Just learned these 3 new techniques from https://unix.stackexchange.com/questions/87908/how-do-you-empty-the-buffers-and-cache-on-a-linux-system:
To free pagecache:
# echo 1 > /proc/sys/vm/drop_caches
To free dentries and inodes:
# echo 2 > /proc/sys/vm/drop_caches
To free pagecache, dentries and inodes:
# echo 3 > /proc/sys/vm/drop_caches
I am trying to understand what exactly are pagecache, dentries and inodes. What exactly are they?
Does freeing them up also remove the useful memcached and/or redis caches?
--
Why am I asking this question? My Amazon EC2 server's RAM was getting filled up over the days - from 6% up to 95% in a matter of 7 days. I am having to run a bi-weekly cron job to remove these caches. Then memory usage drops to 6% again.
With some oversimplification, let me try to explain this in what appears to be the context of your question, because there are multiple possible answers.
It appears you are working with memory caching of directory structures. An inode, in your context, is a data structure that represents a file. A dentry is a data structure that represents a directory. These structures can be used to build a memory cache that represents the file structure on a disk. To get a directory listing, the OS can go to the dentry cache--if the directory is there--and list its contents (a series of inodes). If it is not there, it goes to the disk and reads the directory into memory so that it can be used again.
The page cache could contain any memory mappings to blocks on disk. That could conceivably be buffered I/O, memory mapped files, paged areas of executables--anything that the OS could hold in memory from a file.
Your commands flush these buffers.
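If you want to see how much memory those caches actually occupy, /proc/meminfo exposes them directly: Buffers and Cached for the page cache, and SReclaimable for the slab-backed caches that hold dentries and inodes. Here is a minimal C sketch, assuming a Linux system (field names can vary slightly between kernel versions):

#include <stdio.h>
#include <string.h>

/* Print the /proc/meminfo fields that back the page cache (Buffers, Cached)
 * and the reclaimable slab caches such as dentries and inodes (SReclaimable). */
int main(void)
{
    FILE *f = fopen("/proc/meminfo", "r");
    if (f == NULL) {
        perror("fopen /proc/meminfo");
        return 1;
    }

    char line[256];
    while (fgets(line, sizeof line, f) != NULL) {
        if (strncmp(line, "MemFree:", 8) == 0 ||
            strncmp(line, "Buffers:", 8) == 0 ||
            strncmp(line, "Cached:", 7) == 0 ||
            strncmp(line, "SReclaimable:", 13) == 0)
            fputs(line, stdout);
    }

    fclose(f);
    return 0;
}

Comparing these values before and after echoing into /proc/sys/vm/drop_caches shows what each command actually reclaimed.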
I am trying to understand what exactly are pagecache, dentries and inodes. What exactly are they?
user3344003 already gave an exact answer to that specific question, but it's still important to note those memory structures are dynamically allocated.
When there's no better use for "free memory", memory will be used for those caches, but automatically purged and freed when some other "more important" application wants to allocate memory.
No, those caches don't affect any caches maintained by any applications (including redis and memcached).
My Amazon EC2 server's RAM was getting filled up over the days - from 6% up to 95% in a matter of 7 days. I am having to run a bi-weekly cron job to remove these caches. Then memory usage drops to 6% again.
You're probably misinterpreting the situation: your system may just be making efficient use of its resources.
To simplify things a little: "free" memory can also be seen as "unused", or even more dramatically, as a waste of resources: you paid for it, but don't make use of it. That's a very uneconomical situation, and the Linux kernel tries to make some "more useful" use of your "free" memory.
Part of its strategy involves using it to save various kinds of disk I/O by using various dynamically sized memory caches. A quick access to cache memory saves "slow" disk access, so that's often a useful idea.
As soon as a "more important" process wants to allocate memory, the Linux kernel voluntarily frees those caches and makes the memory available to the requesting process. So there's usually no need to "manually free" those caches.
The Linux kernel may even decide to swap out memory of an otherwise idle process to disk (swap space), freeing RAM to be used for "more important" tasks, probably also including to be used as some cache.
So as long as your system is not actively swapping in/out, there's little reason to manually flush caches.
A common case to "manually flush" those caches is purely for benchmark comparison: your first benchmark run may run with "empty" caches and so give poor results, while a second run will show much "better" results (due to the pre-warmed caches). By flushing your caches before any benchmark run, you're removing the "warmed" caches and so your benchmark runs are more "fair" to be compared with each other.
A common misconception is that "free memory" is important.
Memory is meant to be used.
So let's clear that up:
There's used memory, which is where important data is stored, and if that reaches 100% you're dead.
Then there's cache/buffer memory, which is used as long as there is space to do so. It's optional memory, mostly used to access disk files faster. If you run out of free memory, it will simply free itself and let you access the disk directly.
Clearing cached memory as you suggest is in most cases useless; it means you're disabling an optimization, so you'll get a slowdown.
If you really run out of memory, that is if your "used memory" is high, and you begin to see swap usage, then you must do something.
HOWEVER: there's a known bug affecting AWS instances, with the dentry cache eating memory for no apparent reason. It's clearly described and solved in this blog.
My own experience with this bug is that "dentry" cache consumes both "used" and "cached" memory and does not seem to release it in time, eventually causing swap. The bug itself can consume resources anyway, so you need to look into it.
Hate to bring an old thread back from the dead, but I've been dealing with memory issues lately on my Linux Virtual Machines. Unfortunately, even with the virtualization of computing machines being great and the advancements of Linux memory and resource allocation being superb, conflicts occur when the hypervisor acts out what it calls "performance features".
VMWare will actively send RAM that hasn't been "written or modified" recently, to the disk. When your disk is on a SAN, that means reading from the RAM is now at 1Gbps to 10Gbps at best if you have a REALLY performant RAID and steady network access (ignoring the fact that now the RAM of say 100 VMs are all using the same SAN). DDR3 RAM operates at 25Gbps+ on modern systems, so I'll assume you can see the problem with systems running at 1/25th to less than 1/2 of the speed anticipated.
The caches on my Linux systems are literally the same speed as the filesystem's disk I/O, meaning they do not help our performance and are actively sending the OS's RAM into swap instead of clearing caches. This is a huge problem thanks to VMWare, not because of Linux, but be aware that cloud infrastructure often does stupid crap like this all the time, unfortunately. You can read more here: https://www.vmware.com/content/dam/digitalmarketing/vmware/en/pdf/techpaper/perf-vsphere-memory_management.pdf or, if you use VMWare, you'll surely notice the "allocated memory" vs "active memory" distinction, and that your VMs will always display a different amount than VMWare does because of this treatment of memory.
Our application is memory intensive and deals with reading a large number of disk files. The total load can be more than 3 GB.
There is a custom memory manager that uses memory-mapped files to read such a huge amount of data. The files are mapped into the process memory space only when needed, and with this the process memory is well under control. But what is observed is that, with memory mapping, the system cache keeps on increasing until it occupies all the available physical memory. This leads to the slowing down of the entire system.
My question is: how do I prevent the system cache from hogging the physical memory? I attempted to remove the file buffering (by using FILE_FLAG_NO_BUFFERING), but with this, the read operations take a considerable amount of time and slow down the application's performance. How can I achieve scalability without sacrificing much performance? What are the common techniques used in such cases?
I don't have a good understanding of Windows XP's caching behavior. Any good links explaining it would also be helpful.
I work on a file backup product, so we often run into a similar scenario, where our own file access causes the cache manager to keep data around -- which can cause memory usage to spike.
By default the Windows cache manager will try to be useful by reading ahead, and by keeping file data around in case it's needed again.
There are several registry keys that let you tweak the cache behavior, and some of our customers have had good results with this.
XP is unique in that it has some server capability, but by default it is optimized for desktop programs, not caching. You can enable System Cache Mode in XP, which causes more memory to be set aside for caching. This might improve performance, or you may be doing this already and it's having a negative side effect! You can read about that here.
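For reference, the setting behind that option is, as far as I know, the LargeSystemCache value under the Memory Management key. Here's a hedged C sketch that flips it on programmatically; normally you'd change it through System Properties instead, it needs administrative rights, and it only takes effect after a reboot:

#include <windows.h>
#include <stdio.h>

/* Enable "System cache" mode by setting LargeSystemCache = 1.
 * Requires administrative rights; the change takes effect after a reboot. */
int main(void)
{
    HKEY key;
    LONG rc = RegOpenKeyExA(HKEY_LOCAL_MACHINE,
        "SYSTEM\\CurrentControlSet\\Control\\Session Manager\\Memory Management",
        0, KEY_SET_VALUE, &key);
    if (rc != ERROR_SUCCESS) {
        printf("RegOpenKeyEx failed: %ld\n", rc);
        return 1;
    }

    DWORD enable = 1; /* 1 = favor the system cache, 0 = favor programs */
    rc = RegSetValueExA(key, "LargeSystemCache", 0, REG_DWORD,
                        (const BYTE *)&enable, sizeof(enable));
    if (rc != ERROR_SUCCESS)
        printf("RegSetValueEx failed: %ld\n", rc);

    RegCloseKey(key);
    return rc == ERROR_SUCCESS ? 0 : 1;
}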
I can't recommend a custom memory manager, but I do know that most heavyweight apps do their own caching (Exchange, SQL Server). You can observe that by running Process Monitor.
If you want to completely prevent cache manager from using memory to cache your files you must disable both read and write caching:
FILE_FLAG_NO_BUFFERING and
FILE_FLAG_WRITE_THROUGH
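For illustration, here's roughly what opening a file with both flags looks like in C. The path and the 64 KB chunk size are made up; the important parts are the two flags and the sector-aligned buffer that FILE_FLAG_NO_BUFFERING requires:

#include <windows.h>
#include <stdio.h>

/* Open a file with caching disabled on both the read and the write path.
 * With FILE_FLAG_NO_BUFFERING, every read/write must be a multiple of the
 * volume sector size and the buffer must be sector-aligned, hence VirtualAlloc. */
int main(void)
{
    HANDLE h = CreateFileA("D:\\data\\bigfile.bin",          /* hypothetical path */
                           GENERIC_READ | GENERIC_WRITE,
                           0, NULL, OPEN_EXISTING,
                           FILE_FLAG_NO_BUFFERING | FILE_FLAG_WRITE_THROUGH,
                           NULL);
    if (h == INVALID_HANDLE_VALUE) {
        printf("CreateFile failed: %lu\n", GetLastError());
        return 1;
    }

    /* 64 KB is a multiple of any common sector size; VirtualAlloc returns
     * page-aligned memory, which satisfies the alignment requirement. */
    DWORD chunk = 64 * 1024;
    void *buf = VirtualAlloc(NULL, chunk, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);

    DWORD read = 0;
    if (!ReadFile(h, buf, chunk, &read, NULL))
        printf("ReadFile failed: %lu\n", GetLastError());

    VirtualFree(buf, 0, MEM_RELEASE);
    CloseHandle(h);
    return 0;
}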
There are other hints you can give the cache manager (random access, temporary file); read this document on Caching Behavior here.
You can still get good read performance even if you disable caching, but you're going to have to emulate the cache manager's behavior by having your own background threads doing read-aheads.
Also, I would recommend upgrading to a server-class OS; even Windows Server 2003 is going to give you more cache manager tuning options. And of course, if you can move to Windows 7 / Server 2008, you will get even more performance improvements with the same physical resources, because of dynamic paged/non-paged pool sizing and working set improvements. There is a nice article on that here.
Type this into Notepad and save it as a .vbs file. Run it whenever you realize the system RAM is too low. The system cache gets cleared and added back to RAM. I found it elsewhere on the net and am giving it here so that it might help you. Also, it is suggested that the value you use should never exceed half your actual RAM. So if you have 1 GB of RAM, start with the appropriate line from the following in your .vbs file.
FreeMem = Space(240000000)  ' this one is to clear 512 MB of RAM
FreeMem = Space(120000000)  ' this one is to clear 256 MB of RAM
FreeMem = Space(90000000)   ' this one is to clear 128 MB of RAM
FreeMem = Space(48000000)   ' this one is to clear 64 MB of RAM
FreeMem = Space(20000000)   ' this one is to clear 52 MB of RAM
I have 2 GB of RAM and am running a memory-intensive application. The machine gets into a low-available-physical-memory state and the system stops responding to user actions, like opening an application, invoking a menu, etc.
How do I trigger or tell the system to swap memory out to the pagefile and free up physical memory?
I'm using Windows XP.
If I run the same application on a 4 GB machine this is not the case; system response is good. Once available physical memory gets tight, the system automatically swaps to the pagefile and frees physical memory, and it is not as bad as on the 2 GB system.
To overcome this problem (on the 2 GB machine) I attempted to use memory-mapped files for the large datasets allocated by the application. In this case the virtual memory of the application (process) is fine, but the system cache is high and I have the same problem as above: physical memory is low.
Even though the memory-mapped file is not mapped into the process's virtual memory, the system cache is high. Why??? :(
Any help is appreciated.
Thanks.
If your data access pattern for using the memory mapped file is sequential, you might get slightly better page recycling by specifying the FILE_FLAG_SEQUENTIAL_SCAN flag when opening the underlying file. If your data pattern accesses the mapped file in random order, this won't help.
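For illustration, here's roughly how that might look in C when the file is mapped read-only. The path is hypothetical and most error handling is elided; the point is that the hint goes on CreateFile, not on the mapping itself:

#include <windows.h>
#include <stdio.h>

/* Map a file for read-only, front-to-back processing, passing the
 * FILE_FLAG_SEQUENTIAL_SCAN hint when the underlying file is opened. */
int main(void)
{
    HANDLE file = CreateFileA("D:\\data\\bigfile.bin", GENERIC_READ, FILE_SHARE_READ,
                              NULL, OPEN_EXISTING, FILE_FLAG_SEQUENTIAL_SCAN, NULL);
    if (file == INVALID_HANDLE_VALUE) {
        printf("CreateFile failed: %lu\n", GetLastError());
        return 1;
    }

    HANDLE mapping = CreateFileMappingA(file, NULL, PAGE_READONLY, 0, 0, NULL);
    const unsigned char *base =
        (const unsigned char *)MapViewOfFile(mapping, FILE_MAP_READ, 0, 0, 0);

    /* ... walk base[] sequentially from front to back ... */

    UnmapViewOfFile(base);
    CloseHandle(mapping);
    CloseHandle(file);
    return 0;
}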
You should consider decreasing the size of your map view. That's where all the memory is actually consumed and cached. Since it appears that you need to handle files that are larger than available contiguous free physical memory, you can probably do a better job of memory management than the virtual memory page swapper since you know more about how you're using the memory than the virtual memory manager does. If at all possible, try to adjust your design so that you can operate on portions of the large file using a smaller view.
Even if you can't get rid of the need for full random access across the entire range of the underlying file, it might still be beneficial to tear down and recreate the view as needed to move the view to the section of the file that the next operation needs to access. If your data access patterns tend to cluster around parts of the file before moving on, then you won't need to move the view as often. You'll take a hit to tear down and recreate the view object, but since tearing down the view also releases all the cached pages associated with the view, it seems likely you'd see a net gain in performance because the smaller view significantly reduces memory pressure and page swapping system wide. Try setting the size of the view based on a portion of the installed system RAM and move the view around as needed by your file processing. The larger the view, the less you'll need to move it around, but the more RAM it will consume potentially impacting system responsiveness.
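A rough C sketch of that sliding-view idea, assuming the file mapping handle already exists elsewhere in your program; the function name is mine, the 64 MB window size is an arbitrary example, and view offsets have to stay aligned to the system allocation granularity (usually 64 KB):

#include <windows.h>

/* Process a huge file through a fixed-size sliding view instead of mapping it
 * all at once. The window size is a multiple of the typical 64 KB allocation
 * granularity and should be tuned against installed RAM. */
void process_in_windows(HANDLE mapping, unsigned long long fileSize)
{
    const unsigned long long window = 64ULL * 1024 * 1024; /* 64 MB, granularity-aligned */
    unsigned long long offset = 0;

    while (offset < fileSize) {
        SIZE_T len = (SIZE_T)(fileSize - offset < window ? fileSize - offset : window);

        void *view = MapViewOfFile(mapping, FILE_MAP_READ,
                                   (DWORD)(offset >> 32),
                                   (DWORD)(offset & 0xFFFFFFFFu),
                                   len);
        if (view == NULL)
            break;

        /* ... work on the bytes in [view, view + len) ... */

        /* Unmapping the view also releases its cached pages. */
        UnmapViewOfFile(view);
        offset += window; /* stays aligned to the allocation granularity */
    }
}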
As I think you are hinting in your post, the slow response time is probably at least partially due to delays in the system while the OS writes the contents of memory to the pagefile to make room for other processes in physical memory.
The obvious solution (and possibly not practical) is to use less memory in your application. I'll assume that is not an option or at least not a simple option. The alternative is to try to proactively flush data to disk to continually keep available physical memory for other applications to run. You can find the total memory on the machine with GlobalMemoryStatusEx. And GetProcessMemoryInfo will return current information about your own application's memory usage. Since you say you are using a memory mapped file, you may need to account for that in addition. For example, I believe the PageFileUsage information returned from that API will not include information about your own memory mapped file.
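For example, a small C helper along these lines can sample both numbers (the function name is mine; link against psapi.lib for GetProcessMemoryInfo):

#include <windows.h>
#include <psapi.h>
#include <stdio.h>

/* Sample machine-wide memory load and this process's own usage.
 * Note: PagefileUsage will not reflect pages backed by your own mapped file. */
void report_memory(void)
{
    MEMORYSTATUSEX ms;
    ms.dwLength = sizeof(ms);
    GlobalMemoryStatusEx(&ms);
    printf("physical: %lu MB total, %lu MB available (%lu%% in use)\n",
           (unsigned long)(ms.ullTotalPhys >> 20),
           (unsigned long)(ms.ullAvailPhys >> 20),
           ms.dwMemoryLoad);

    PROCESS_MEMORY_COUNTERS pmc;
    pmc.cb = sizeof(pmc);
    GetProcessMemoryInfo(GetCurrentProcess(), &pmc, sizeof(pmc));
    printf("this process: %lu MB working set, %lu MB pagefile-backed\n",
           (unsigned long)(pmc.WorkingSetSize >> 20),
           (unsigned long)(pmc.PagefileUsage >> 20));
}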
If your application is monitoring the usage, you may be able to use FlushViewOfFile to proactively force data to disk from memory. There is also an API (EmptyWorkingSet) that I think attempts to write as many dirty pages to disk as possible, but that seems like it would very likely hurt performance of your own application significantly. Although, it could be useful in a situation where you know your application is going into some kind of idle state.
And, finally, one other API that might be useful is SetProcessWorkingSetSizeEx. You might consider using this API to give a hint on an upper limit for your application's working set size. This might help preserve more memory for other applications.
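Putting those last suggestions together, a hedged C sketch might look like this; the function name, the view parameters and the working-set limits are all made-up example values:

#include <windows.h>
#include <psapi.h>

/* Illustrative only: proactively write dirty mapped pages to disk, trim the
 * working set, and hint an upper working-set limit. viewBase/viewBytes refer
 * to a view obtained from MapViewOfFile elsewhere; link against psapi.lib. */
void relieve_memory_pressure(void *viewBase, SIZE_T viewBytes)
{
    /* Write any dirty pages of the mapped view back to the file. */
    FlushViewOfFile(viewBase, viewBytes);

    /* Aggressively drop pages from our working set (can hurt performance;
     * best reserved for idle periods). */
    EmptyWorkingSet(GetCurrentProcess());

    /* Ask the OS to keep our working set between roughly 16 MB and 256 MB;
     * the DISABLE flags make these soft hints rather than hard limits. */
    SetProcessWorkingSetSizeEx(GetCurrentProcess(),
                               16 * 1024 * 1024,
                               256 * 1024 * 1024,
                               QUOTA_LIMITS_HARDWS_MIN_DISABLE |
                               QUOTA_LIMITS_HARDWS_MAX_DISABLE);
}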
Edit: This is another obvious statement, but I forgot to mention it earlier. It also may not be practical for you, but it sounds like one of the best things you might do considering that you are running into 32-bit limitations is to build your application as 64-bit and run it on a 64-bit OS (and throw a little bit more memory at the machine).
Well, it sounds like your program needs more than 2GB of working set.
Modern operating systems are designed to use most of the RAM for something at all times, only keeping a fairly small amount free so that it can be immediately handed out to processes that need more. The rest is used to hold memory pages and cached disk blocks that have been used recently; whatever hasn't been used recently is flushed back to disk to replenish the pool of free pages. In short, there isn't supposed to be much free physical memory.
The principal difference between using a normal memory allocation and a memory-mapped file is where the data gets stored when it must be paged out of memory. It doesn't necessarily have any effect on when the memory will be paged out, and it will have little effect on the time it takes to page it out.
The real problem you are seeing is probably not that you have too little free physical memory, but that the paging rate is too high.
My suggestion would be to attempt to reduce the amount of storage needed by your program, and see if you can increase the locality of reference to reduce the amount of paging needed.
Applications like Microsoft Outlook and the Eclipse IDE consume a lot of RAM, as much as 200 MB. Is it OK for a modern application to consume that much memory, given that a few years back we had only 256 MB of RAM? Also, why is this happening? Are we taking the resources for granted?
Is it acceptable when most people have 1 or 2 gigabytes of RAM on their PCS?
Think of this - although your 200 MB is small and nothing to worry about given a 2 GB limit, everyone else also has apps that take masses of RAM. Add them together and you find that the 2 GB I have very quickly gets all used up. End result - your app appears slow, resource hungry, and takes a long time to start up.
I think people will start to rebel against resource-hungry applications unless they get 'value for RAM'. You can see this starting to happen on servers, as virtualised systems gain popularity - people are complaining about resource requirements and the corresponding server costs.
As a real-world example, I used to code with VC6 on my old 512 MB, 1.7 GHz machine, and things were fine - I could open 4 or 5 copies along with Outlook, Word and a web browser, and my machine was responsive.
Today I have a dual-processor 2.8 GHz server box with 3 GB of RAM, but I cannot realistically run more than 2 copies of Visual Studio 2008; they both take ages to start up (as all that RAM still has to be copied in and set up, along with all the other startup costs we now have), and even Word takes ages to load a document.
So if you can reduce memory usage you should. Don't think that you can just use whatever bloated framework/library/practice you want with impunity.
http://en.wikipedia.org/wiki/Moore%27s_law
also:
http://en.wikipedia.org/wiki/Wirth%27s_law
There's a couple of things you need to think about.
1/ Do you have 256M now? I wouldn't think so - my smallest memory machine is 2G so a 200M application is not much of a problem.
2a/ That 200M you talk about might not be "real" memory. It may just be address space in which case it might not all be in physical memory at once. Some bits may only be pulled in to physical memory when you choose to do esoteric things.
2b/ It may also be shared with other processes (such as a DLL). This means it could be held in physical memory as only one copy but be present in the address space of many processes. That way, the usage is amortized over those many processes. Both 2a and 2b depend on where your figure of 200M actually came from (which I don't know and, running Linux, I'm unlikely to find out without you telling me :-).
3/ Even if it is physical memory, modern operating systems aren't like the old DOS or Windows 3.1 - they have virtual memory where bits of applications can be paged out (data) or thrown away completely (code, since it can always reload from the executable). Virtual memory gives you the ability to use far more memory than your actual physical memory.
Many modern apps will take advantage of the existence of more memory to cache more. Some, like Firefox and SQL Server, have explicit settings for how much memory they will use. In my opinion, it's foolish not to use available memory - what's the point of having 2 GB of RAM if your apps all sit around at 10 MB, leaving 90% of your physical memory unused? Of course, if your app does use caching like this, it had better be good at releasing that memory if page file thrashing starts, or it should allow the user to limit the cache size manually.
You can see the advantage of this by running a decent-sized query against SQL server. The first time you run the query, it may take 10 seconds. But when you run that exact query again, it takes less than a second - why? The query plan was only compiled the first time and cached for use later. The database pages that needed to be read were only loaded from disk the first time - the second time, they were still cached in RAM. If done right, the more memory you use for caching (until you run into paging) the faster you can re-access data. You'll see the same thing in large documents (e.g. in Word and Acrobat) - when you scroll to new areas of a document, things are slow, but once it's been rendered and cached, things speed up. If you don't have enough memory, that cache starts to get overwritten and going to the old parts of the document gets slow again.
If you can make good use of the RAM, it is your responsibility to use it.
Yes, it is perfectly normal. Also, something big has changed since 256 MB was normal... and do not forget that before that, 640 KB was supposed to be enough for everybody!
Now most software is built with a garbage collector: C#, Java, Ruby, Python... everybody loves them because development can certainly be faster, however there is one catch.
The same program can be free of memory leaks with either manual or automatic memory deallocation. However, in the second case the memory consumption is likely to grow. Why? In the first case, memory is deallocated and kept clean immediately after something becomes useless (garbage). However, it takes time and computing power to detect that automatically, so most collectors (except for reference counting) wait for garbage to accumulate in order to make the cost of the exploration worthwhile. The more you wait, the more garbage you can sweep in one blow, but more memory is needed to accumulate that garbage. If you try to force the collector to run constantly, your program will spend more time exploring memory than working on your problems.
You can be completely sure that as long as programmers get more resources, they will sacrifice them, using heavier tools in exchange for more freedom, abstraction and faster development.
A few years ago 256 MB was the norm for a PC, and Outlook consumed about 30-35 MB or so of memory, which is around 10% of the available memory. Now PCs have 2 GB or more as the norm, and Outlook consumes 200 MB of memory, which is also about 10%.
The 1st conclusion: as more memory becomes available, applications use more of it.
The 2nd conclusion: no matter what time frame you pick, there are applications that are true memory hogs (like Outlook) and applications that are very efficient memory-wise.
The 3rd conclusion: the memory consumption of an app doesn't go down over time; otherwise 640 KB would still be enough today.
It completely depends on the application.