I am writing a server app which I want to efficiently use ALL available physical RAM of the machine when possible. The plan is that it will allocate physical pages using AWE until it detects that 99% of physical memory is in use (that is, it stops when only 1% is free), and any time free physical memory drops below 1%, it will free physical pages it doesn't need.
However when I put this plan into practice, Windows seems to think any time it has 99% of RAM in use it would be a good idea to free up more physical memory, and so it starts paging all sorts of stuff to disk, and my system crashes.
How can I tell Windows that it is OK to have 99% of RAM in use, and that it doesn't need to page stuff out to disk to get back to whatever its default perceived ideal level of usage is (I guess it will be something like 90%...)?
Note: Raymond says 'Unless you are designing a system where you are the only program running on the computer, this is a bad idea.'
In this server scenario this is basically intended to be the only app running on the computer. But unfortunately there are some OS/background tasks that need to run...
But certainly I don't expect there is any other process on the computer indulging in this 'use all but 1% of RAM' behavior...?
Update: I've done more experimentation and started to wonder if I'm somewhat asking the wrong question. My assumption that Windows is being overeager may be wrong. Perhaps the question should instead have been 'how can I determine how much physical RAM my process can safely use without compromising overall responsiveness on the machine'?
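For concreteness, the 1%-free policy from the plan above can be written as a small calculation. This is only a sketch of the arithmetic, not a recommendation: the function name, the 4096-byte page size, and the idea of feeding it the total and available physical memory (for example from GlobalMemoryStatusEx on Windows) are assumptions made for illustration.

```python
def physical_page_adjustment(total_bytes: int,
                             available_bytes: int,
                             min_free_percent: int = 1,
                             page_size: int = 4096) -> int:
    """Return how many pages to allocate (positive) or release (negative).

    Hypothetical helper: it only encodes the "keep at least N% free" rule
    from the question. It knows nothing about what the OS will do.
    """
    # Amount of physical memory that must stay free (the 1% floor).
    floor_bytes = total_bytes * min_free_percent // 100
    surplus = available_bytes - floor_bytes
    if surplus >= 0:
        # Round down so an allocation never dips below the floor.
        return surplus // page_size
    # Round up so a release always gets back to the floor.
    return -((-surplus + page_size - 1) // page_size)


if __name__ == "__main__":
    total = 100_000 * 4096  # a made-up machine with 100,000 physical pages
    print(physical_page_adjustment(total, 2_000 * 4096))
    print(physical_page_adjustment(total, 500 * 4096))
```

A positive result means there is headroom above the floor; a negative result is the number of pages the process would have to give back. Whether Windows stays responsive at that floor is a separate matter that this calculation cannot answer.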
You can't. The Windows memory manager runs at a lower level than your program and knows nothing about your program (and even if it did, it has no reason to assume your program is the good citizen you claim it to be. What if your program crashes, or has an off-by-one error in a loop that mallocs? What about other programs that need memory while yours is running? What about the thousand other scenarios that the guys who wrote the Windows MM encountered when they were writing it?)
Don't try to be cleverer than Windows. A more productive use of your time would be to consider if your application really needs to allocate 99% physical memory up front.
So, yesterday I opened Task Manager in Windows 8 (64-bit) and noticed that Chrome (32-bit for some reason) wasn't using the full power of my PC. I was running an AI JavaScript program and noticed that CPU usage was at 1% and memory usage was only 120 MB, which made me wonder why I should wait 5 minutes for it to run instead of somehow boosting it to at least 60%. As far as I know, Windows automatically distributes hardware resources among programs, so I'm asking what the problem is:
Is it because it's 32-bit?
Is it because I should manually configure windows to give it more power?
Note: I did search Google, but all I got is that people actually complain about High CPU usage and I've got the opposite.
32-bit doesn't make a difference here. JavaScript is inherently single-threaded, so by default (not counting web workers) it won't use more than a single core on your machine. It just cannot. Also, memory usage doesn't necessarily tell you how hard a program is working. Some programs need lots of memory, others only a little.
It's up to programs to use the resources of the machine most efficiently; if they don't, there is nothing you can do with Windows to make them run better or faster.
I am not an OS expert, and I am having trouble understanding my server's memory usage. I need your advice to understand the following:
My server has 8 GB of RAM and operates as a web server. PHP, MySQL and Apache processes consume the majority of the memory. When I issue the command "free" after the system is rebooted, I normally see something along these lines:
             total       used       free     shared    buffers     cached
Mem:       8059080    2277924    5781156          0        948     310852
-/+ buffers/cache:    1966124    6092956
Swap:      4194296          0    4092668
Obviously, sooner or later the free memory would drop and the cached memory would increase and I assume there is nothing wrong with that since the OS decides to cache it.
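The "-/+ buffers/cache" row is derived from the "Mem:" row, which is why a growing cache is not a sign of trouble. A minimal sketch of that arithmetic, using the numbers from the output above (the function name is made up for illustration):

```python
def buffers_cache_row(used: int, free: int, buffers: int, cached: int):
    """Reproduce the '-/+ buffers/cache' row of the older `free` layout.

    Returns (memory used by applications, memory available to applications),
    in the same unit as the inputs (kB here).
    """
    # Buffers and cache can be dropped on demand, so they are subtracted
    # from "used" and added to "free".
    used_by_apps = used - buffers - cached
    available_to_apps = free + buffers + cached
    return used_by_apps, available_to_apps


if __name__ == "__main__":
    # Values copied from the "Mem:" row above.
    print(buffers_cache_row(used=2277924, free=5781156,
                            buffers=948, cached=310852))
```

The second number is the one to watch: it is what applications could still get before the kernel would have to push anything to swap.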
What I don't understand is that about 1-2 days after the machine is rebooted, I see a slight increase in used swap. Doesn't this mean that the server has no free memory left and is using disk I/O instead? How can I find out which processes cause this?
I am asking Stack Overflow users because if I ask my hosting provider, I am sure they would ask for more money to increase the RAM.
Thanks.
This is perfectly normal. When the machine starts up, a large number of services also start up. As they run their startup code, read their configuration, and so on, they dirty some pages of memory. Many of these services will never run again. By writing this data to swap, the operating system accomplishes two things:
First, if it ever does encounter memory pressure, it can discard the pages without having to write them first, since it has already written them. Second, it can discard the pages to make more free memory to enlarge the cache.
The alternative is to keep information that hasn't been touched in days in physical memory. And that just doesn't make sense.
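As for finding out which processes have pages in swap: on reasonably recent Linux kernels, /proc/<pid>/status contains a VmSwap line. Below is a rough sketch that reads it for every process. The function names are made up for illustration, and it assumes a kernel that exposes VmSwap (older ones do not).

```python
import os
from typing import Dict, Optional


def parse_vmswap_kb(status_text: str) -> Optional[int]:
    """Extract the VmSwap value (in kB) from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("VmSwap:"):
            # The line looks like "VmSwap:      512 kB".
            return int(line.split()[1])
    return None  # kernel threads and old kernels have no VmSwap line


def swap_by_process(proc_root: str = "/proc") -> Dict[int, int]:
    """Map pid -> swapped-out kB for every process that has any swap in use."""
    usage = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "status")) as f:
                swapped = parse_vmswap_kb(f.read())
        except OSError:
            continue  # process exited or is not readable
        if swapped:
            usage[int(entry)] = swapped
    return usage


if __name__ == "__main__":
    # Print the biggest swap users first.
    for pid, kb in sorted(swap_by_process().items(), key=lambda item: -item[1]):
        print(pid, kb, "kB")
```

If the list is dominated by services that started at boot and have been idle since, that matches the explanation above.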
I've written a program that (among other things) downloads multiple large files from a server on the LAN, using TCP. This program runs fine under Linux, MacOS/X, and generally under Windows as well (it uses Qt for the GUI and straight sockets calls for networking), but on certain Windows machines the download appears to be too much for the machine to handle, and I'm wondering if anyone has any ideas as to why that is and what can be done about it.
When downloading files, my program spawns a separate I/O thread that basically just sits in a loop, downloading data over TCP and writing it to a file, writing 128KB per call to QFile::write(). Each file is typically several hundred megabytes long, and a typical download session writes out several dozen of these files. Note that the I/O thread runs independently of the GUI thread, so I wouldn't expect it to affect the GUI's performance much if at all -- especially not when running on a multicore PC.
The PC in question is a Core 2 Quad Q6600 running at 2.40GHz, with 4GB of RAM. It's running Windows 7 Ultimate SP1, 32-bit. It is receiving data over a Gigabit Ethernet connection and writing it to files on the NTFS-formatted boot partition of the 232GB internal Hitachi ATA drive.
The symptom is that sometimes during a download (seemingly at random) the program's GUI will become non-responsive for 10 to 30 seconds at a time, and often the title bar of the window will have "(not responding)" appended to it. The symptom will then clear up again and the download will proceed normally again. Another symptom is that the desktop is extremely sluggish during the download... for example, if I click on the "Start" button, the Start menu will take ~30 seconds to populate, instead of being populated near-instantaneously as I would expect.
Note that Task Manager shows plenty of free memory, but it does show short spikes of CPU usage to 100% on one of the 4 cores, at the same time the problems are seen.
The data is arriving over Gigabit Ethernet, and if I have my program just receive the data and throw it away (without writing it to the hard drive), the machine can maintain a constant download rate of about 96MB/sec without breaking a sweat. If I write the received data to a file, however, the download rate decreases to about 37MB/sec, and the symptoms described above start to appear.
The interesting thing is that just for curiosity's sake I added this call to my I/O thread's entry function, just before the beginning of its event loop:
SetThreadPriority(GetCurrentThread(), THREAD_PRIORITY_BELOW_NORMAL);
When I did that, the "(not responding)" symptoms cleared, but then download speed was reduced to only ~25MB/sec.
So my questions are:
Does anyone know what might be causing the sporadic hangups of the GUI when the hard drive is under a heavy write-load?
Why does lowering the I/O thread's priority cause the download rate to drop so much, given that there are three idle cores on the machine? I would think that even a lower-priority thread would have plenty of CPU available in this situation.
Is there any way to get a maximum download rate without causing Windows' desktop responsiveness and/or my app's GUI responsiveness to suffer problems?
Without seeing any code it is hard to answer, but this seems to be related to the processors and the fact that your download thread is not leaving any room for other threads to perform other operations.
It seems that it never waits, and that the network card driver is not well written.
Are you sure your thread enters an idle state when there is no incoming data?
On an OS with a single processor, a for (;;) {} loop will consume 100% of the CPU, and if it talks to the kernel continuously it may stop other processes or threads from doing so, especially if there is a bug or very bad behaviour in a network card driver, as may be the case here.
By setting the thread priority below normal you are probably asking the OS to schedule your thread less often, which by some lucky combination of factors keeps things from hanging too much.
Check the code; maybe you are forgetting something?
Check whether adding a sleep(0), to force the OS to yield to another thread from time to time, makes things better. But this is only a temporary fix: you should find out why your thread is consuming 100% CPU, if it is.
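To illustrate what "entering an idle state" means here: a download loop built on a blocking receive sleeps inside the kernel whenever no data is available, so it cannot spin at 100% CPU by itself. The sketch below is in Python rather than the Qt/C++ of the question, and the function name and structure are assumptions for illustration only; it is not the asker's code.

```python
import socket
from typing import BinaryIO

CHUNK_SIZE = 128 * 1024  # same 128KB unit as the writes in the question


def receive_to_file(sock: socket.socket, out: BinaryIO,
                    chunk_size: int = CHUNK_SIZE) -> int:
    """Copy everything from a blocking socket into a file-like object.

    Returns the number of bytes written.
    """
    total = 0
    while True:
        # On a blocking socket, recv() suspends the thread until data
        # arrives, so the loop uses no CPU while the network is quiet.
        data = sock.recv(chunk_size)
        if not data:
            break  # the peer closed the connection
        out.write(data)
        total += len(data)
    return total
```

If a loop of this shape still pegs a core, the time is probably being spent below the application (driver, disk write path), which is what the kernel-time display in Task Manager can help confirm.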
We run a Fortran console program that we have run for years. Recently we purchased identical new HP server-class machines (4 processors, 8 GB of RAM, 4 hard drives) for everyone in the office. We configured them identically, as nearly as we know. We can compile the Fortran program on one machine and pass the executable to the other machines; on two machines it executes painfully slowly, while on two others it has modest performance (but not as good as before we upgraded from XP machines).
It uses almost no console output (about 40 lines) but outputs about 15 megs of files.
We open task manager to see what's going on, and we see that on the slow machines it's loading ONE CPU to about 15%. On the fast machines it's loading ALL CPUs to about 40% (but one of them seems to load more than the others). As I recall, on XP it loaded the CPU to 99%, and ran much faster.
These machines are the employees' general purpose machines, and have lots of company bloatware on them. And there is the possibility they have slightly different directory structures. But what seems totally puzzling to me is why Vista is not giving them more CPU time. If the CPUs were loading up, I might blame the performance variation on different directory structures, but not loading up the CPUs just boggles my mind.
David
If there's a bottleneck in I/O, the CPU won't be loaded as much because it's mostly waiting for the I/O to complete. One could even imagine this causing the one-CPU-versus-many-CPUs difference, if there's simply no point in bringing in another CPU because so much time is spent waiting. What if you take an external hard drive and check whether the differences still appear when you run the same program from that drive on the different machines?
Please go into Windows Task Manager, open the Performance tab, select the [Kernel Times] option in the [View] menu, and look at what is displayed on the bars during program execution.
If it's only a 15% load on a quad-core box with hyperthreading, that basically says that OpenMP, MPI (or whatever it uses) isn't working properly: the program is running on 1 of 8 logical CPUs, which is about 12.5% and close to the 15% you see. Can you run the MPI test command for your specific system in order to check for errors in multiprocessing on each box? The question would then be: why does the multiprocessing environment not work?
Regards
rbo
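The 1/8 estimate in the answer above is simply the ceiling that a single busy thread can reach on the overall CPU graph. A tiny sketch of that arithmetic (the function name is made up for illustration):

```python
def single_thread_ceiling(logical_cpus: int) -> float:
    """Highest total CPU percentage one fully busy thread can show."""
    if logical_cpus < 1:
        raise ValueError("logical_cpus must be at least 1")
    return 100.0 / logical_cpus


if __name__ == "__main__":
    for cpus in (1, 4, 8):
        print(cpus, "logical CPUs:", single_thread_ceiling(cpus), "%")
```

So a reading of about 15% on an 8-way box is consistent with one thread running flat out plus a little background activity, while 99% on the old XP machines would fit a machine with a single logical CPU.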
SWAG, but have you checked your virus scanner configuration? If the scanner isn't set to ignore the type of file you're writing on the slow machine, then each write to those files might be getting intercepted and scanned before being written to the disk. This could lead to the process sitting in I/O wait and not getting scheduled as often.
Vista had a problem with some uncontrollable memory leaks. Perhaps that is your problem: some conflict in the "bloatware" is causing a memory leak, and so your Fortran program runs so much slower?
I assume you have tested this with all programs ended. It seems unlikely that your console program is the issue. Sounds like there's definitely a memory conflict going on though.
We have some Win32 console applications running on Windows Server 2003 Service Pack 2 that regularly fail with this:
Error 1450 (ERROR_NO_SYSTEM_RESOURCES): "Insufficient system resources exist to complete the requested service."
All the documentation we've found suggests it is linked to the number of Free System Page Table Entries running out. We have 16GB RAM in these machines and use the /3GB Operating System switch to squeeze the Windows kernel into 1GB and allow our processes access to 3GB of address space. This drastically reduces the total number of Free System Page Table Entries, so combined with our heavy use of MapViewOfFile() it is perhaps not surprising that the kernel page table entries are running out.
However, when using Performance Monitor to view the Free System Page Table Entries counter, the value is around 36,000 on reboot and doesn't go down when our application starts. I find it hard to believe that our application, which opens many large memory-mapped files, doesn't have any effect on the kernel page table. If we can't believe the counter, it's much more difficult to test the effect of any system changes we make.
There is a promising Knowledge Base article, "The Performance tool does not accurately show the available Free System Page Table entries in Windows Server 2003", but it says the problem has been fixed in Service Pack 1, and we are already on Service Pack 2.
Has anyone else struggled with or solved this issue?
Update: I have checked !sysptes in windbg (debugging the kernel) and the value matches the performance counter, around 36,000. I guess this is most likely to mean that there really are that many free page table entries and Windows is telling the truth. It does leave the question of why we're getting 1450 errors though, if the PTEs are not running out.
Further update: We never did get to the bottom of why the 1450 errors were occurring. However, instead we upgraded the OS on these servers to 64-bit Windows. This allows the existing 32-bit applications (without recompilation) to access a full 4GB of virtual address space, and lets the kernel memory area with those pesky Page Table Entries be as big as it likes too. I don't think we've had a 1450 error since.
Can you try the windbg command "!sysptes" to get System PTE Information? I'm not sure if you can do this with live kernel debug, you may have to get a memory dump.
I'm not sure why you assume that ERROR_NO_SYSTEM_RESOURCES can only be caused by running out of free System Page Table Entries. As far as I know, such generic error codes are used for more than one resource type. In fact, the first Google hit suggests that running out of file cache memory may cause it too (a KB article on an XP bug that triggered this error).
In your case, I'd be checking the "Handle Count". Another possible problem is address space fragmentation. If you want to create a 1GB file mapping view, you need 1GB of free address space, and it has to be contiguous. If you map a 1GB file, an 800MB file, and a 1GB file, then close the 800MB one and open a 900MB file, the 900MB file may not fit in the hole that's left.
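The fragmentation scenario above can be played through with a toy model. This is a deliberately simplified sketch: it assumes a flat 3GB user address space with nothing else in it (no DLLs, heaps or stacks) and first-fit placement, and the class and method names are made up for illustration.

```python
from typing import Dict, List, Tuple


class AddressSpace:
    """Toy model of a process address space, measured in MB."""

    def __init__(self, size_mb: int) -> None:
        self.size = size_mb
        self.mappings: Dict[str, Tuple[int, int]] = {}  # name -> (start, length)

    def free_ranges(self) -> List[Tuple[int, int]]:
        """Return the free holes as (start, length), in address order."""
        holes = []
        cursor = 0
        for start, length in sorted(self.mappings.values()):
            if start > cursor:
                holes.append((cursor, start - cursor))
            cursor = start + length
        if cursor < self.size:
            holes.append((cursor, self.size - cursor))
        return holes

    def map_view(self, name: str, length: int) -> bool:
        """Place a view in the first hole that is large enough (first fit)."""
        for start, hole_length in self.free_ranges():
            if hole_length >= length:
                self.mappings[name] = (start, length)
                return True
        return False  # no single contiguous hole is big enough

    def unmap_view(self, name: str) -> None:
        del self.mappings[name]


if __name__ == "__main__":
    space = AddressSpace(3 * 1024)  # 3GB of user address space, as with /3GB
    for name, size in (("a", 1024), ("b", 800), ("c", 1024)):
        space.map_view(name, size)
    space.unmap_view("b")
    holes = space.free_ranges()
    print("free holes:", holes, "total free:", sum(n for _, n in holes), "MB")
    print("900MB view fits:", space.map_view("d", 900))
```

After the 800MB view is closed there is 1024MB free in total, but it is split into an 800MB hole and a 224MB tail, so the 900MB view cannot be placed. A real process is more fragmented than this model, not less.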
Microsoft has two ways to let its 32-bit OS "deal" with hardware that has 4 GB or more of RAM.
Option 1 is what you did with the /3GB switch in Boot.ini.
Option 1 Pros and Cons:
(CONS) This option takes 1 GB away from the normal 2 GB kernel area, making the OS struggle to meet the demands of both paged pool allocations and kernel stack allocations. So a person might think that using the /3GB switch will help them, but really this option is sending the 32-bit Windows OS to a slow death.
(CONS) "But this gives my app 3 GB..." WRONG (hence this is a CON). The catch is that ONLY applications that the vendor has built to be "/3GB switch aware" (that is, linked with the /LARGEADDRESSAWARE flag) can really use the extra 1 GB. Hence the whole use of the /3GB switch is a really BAD J.O.K.E on everyone.
Read this link for a much better write-up:
http://blogs.technet.com/askperf/archive/2007/03/23/memory-management-demystifying-3gb.aspx
Option 2: Use the /PAE switch in the Boot.ini.
Option 2 Pros and Cons:
(PROS) This is really the only option if you have more than 4 GB of RAM. It tricks an application by placing the complete application memory footprint in RAM. Normally, only an application's "Working Set" memory is in RAM and the remaining application memory requirements go into the Windows pagefile. What are an application's total memory requirements? That is called the "Virtual Size".
In my world, I have a big fat Java-based IBM product that I deal with. The server that is running the "application" has 16 GB of RAM. I simply add the /PAE switch and watch (thanks to Sysinternals Process Explorer) application paging requests go from 200 KB per sec to up to 4 MB per sec.
Question: "Why"?
Answer: The whole application is in RAM.
Question: "Does the application know that it is completely running in RAM?"
Answer: No. It is running the same old way it has always run, "THINKING" that it has part of itself as the "Working Set" memory living in RAM and the remaining application memory requirements going into the Windows pagefile.
Yes, it is that flipping GOOD.
Please note: Microsoft has done a poor job of telling anyone about this great Windows OS option. Duh
Try it and report back to stackoverflow....