Unmanaged VC++ Application's memory consumption on windows server - windows

OK, so I have a very large multi-threaded unmanaged c++ application (server) that runs on a windows 2003 server. It hosts sessions for 20-50 concurrent users doing all sorts of business logic... At times it can be using a very large amount of memory due to things like object/session caching due to users having large numbers of windows open in the clients (each window has a separate server 'session'.
We routinely see consumption of more than 5-600 MB physical memory and 5-600 MB of virtual memory. Once it gets to this point we seem to start having 'out of memory' errors.
Now I know that Windows will start page-faulting when it feels it needs to free up physical memory, and also that win32 applications normally are only able to allocate up to a maximum of 4GB worth of memory, really only with 2GB of that available for actual use by the application for 'user-mode' address space, and even less of that after other libraries are loaded... I'm not sure if the 'user-mode' memory usage is what is reported on the Task Manager...
So anyway my real question is:
How can I find out how much user-mode memory my application has access to, and how much has been used at any given time (preferably from outside of the application, i.e. some windows management tool)?
[edit] After looking at the Process Explorer and the website, it looks like the value 'Virtual Size' is the value of how much memory the application has access to.

Sounds like a case for Process Explorer, a free utility from Microsoft SysInternals:
(source: microsoft.com)
Description:
Ever wondered which program has a
particular file or directory open? Now
you can find out. Process Explorer
shows you information about which
handles and DLLs processes have opened
or loaded.
The Process Explorer display consists
of two sub-windows. The top window
always shows a list of the currently
active processes, including the names
of their owning accounts, whereas the
information displayed in the bottom
window depends on the mode that
Process Explorer is in: if it is in
handle mode you'll see the handles
that the process selected in the top
window has opened; if Process Explorer
is in DLL mode you'll see the DLLs and
memory-mapped files that the process
has loaded. Process Explorer also has
a powerful search capability that will
quickly show you which processes have
particular handles opened or DLLs
loaded.
The unique capabilities of Process
Explorer make it useful for tracking
down DLL-version problems or handle
leaks, and provide insight into the
way Windows and applications work.
If you are looking for more info in terms of terminal-server specific info, I've been following the blog of a programmer that is releasing a beta of a tool that I believe will fit your needs perfectly. It is an improved TSAdmin. He calls it TSAdminEx.
See below for a screenshot, and click here to learn more about it and to get the beta. It's free software, BTW.

I know you asked for preferably from outside of the application, but I was googling for how to find such information from within my own program and stumbled upon your post. So, this is to benefit people who want this information from within their program.
unmanaged C++
#include <windows.h>
#include <stdio.h>
#include <psapi.h>
void PrintMemoryInfo( DWORD processID )
{
HANDLE hProcess;
PROCESS_MEMORY_COUNTERS pmc;
// Print the process identifier.
printf( "\nProcess ID: %u\n", processID );
// Print information about the memory usage of the process.
hProcess = OpenProcess( PROCESS_QUERY_INFORMATION |
PROCESS_VM_READ,
FALSE,
processID );
if (NULL == hProcess)
return;
if ( GetProcessMemoryInfo( hProcess, &pmc, sizeof(pmc)) )
{
printf( "\tPageFaultCount: 0x%08X\n", pmc.PageFaultCount );
printf( "\tYour app's PEAK MEMORY CONSUMPTION: 0x%08X\n",
pmc.PeakWorkingSetSize );
printf( "\tYour app's CURRENT MEMORY CONSUMPTION: 0x%08X\n", pmc.WorkingSetSize );
printf( "\tQuotaPeakPagedPoolUsage: 0x%08X\n",
pmc.QuotaPeakPagedPoolUsage );
printf( "\tQuotaPagedPoolUsage: 0x%08X\n",
pmc.QuotaPagedPoolUsage );
printf( "\tQuotaPeakNonPagedPoolUsage: 0x%08X\n",
pmc.QuotaPeakNonPagedPoolUsage );
printf( "\tQuotaNonPagedPoolUsage: 0x%08X\n",
pmc.QuotaNonPagedPoolUsage );
printf( "\tPagefileUsage: 0x%08X\n", pmc.PagefileUsage );
printf( "\tPeakPagefileUsage: 0x%08X\n",
pmc.PeakPagefileUsage );
}
CloseHandle( hProcess );
}
int main( )
{
PrintMemoryInfo( GetCurrentProcessId() );
return 0;
}

You wrote:
When your talking how much memory a
win32 app can access they specifically
call it 'user-mode' memory which I
don't see as an option or at least I
don't know what column it really is.
Have a look at this article (written by the creator of Process Explorer, Dr. Mark Russinovich).
To be able to manage your Windows systems effectively you need to understand how Windows manages physical resources, such as CPUs and memory, as well as logical resources, such as virtual memory, handles, and window manager objects. Knowing the limits of those resources and how to track their usage enables you to attribute resource usage to the applications that consume them, effectively size a system for a particular workload, and identify applications that leak resources.

Related

How does GetWindowText get the name of a window owned by another process without a syscall to read that process's memory?

I wanted to figure out what the syscalls behind GetWindowText are. I wrote a simple program to call GetWindowText with a handle to a window in a different process.
int CALLBACK WinMain(HINSTANCE hInstance, HINSTANCE hPrevInstance, LPSTR lpCmdLine, int nCmdShow)
{
MessageBox(0,"Attach debugger and set bp","on GetWindowTextA",0);
HWND winmine = FindWindow(NULL,"Minesweeper");
if(winmine != NULL)
{
char buf[255] = "";
GetWindowTextA(winmine, buf, 254);
MessageBox(0,buf,"Found",0);
}
else
{
MessageBox(0,"?","Found nothing",0);
}
return 0;
}
I attached a debugger and stepped through the GetWindowTextA call, manually stepping through everything except these API calls (in order):
GetWindowThreadProcessId (in GetWindowLong)
InterlockedIncrement
WCSToMBEx (which is basically WideCharToMultiByte)
InterlockedDecrement
None of these API calls seem to be able to read a string in memory not owned by the calling process. I used a usermode debugger so I certainly didn't end up in kernelmode while stepping without realizing it. This means that GetWindowText got the window name without performing a context switch. Which seems to imply that the text for every window that exists is accessible without a context switch.. and that can't be right because there's no way Windows keeps a copy of the text for every single window/control on the system, on every single process.
I have read this article. It mentions that window names are stored in quote "a special place", but does not explain how this "special place" can be accessed from a different process without a syscall/context switching.
So I'm looking for any explanations as to how this is done. Any information you can provide is greatly appreciated.
GetWindowText got the window name without performing a context switch. Which seems to imply that the text for every window that exists is accessible without a context switch.
This info is stored in memory that is shared between all the processes that use user32.dll. You may try to search virtual space of your process for unicode names of other processes' windows.
It gets mapped into the process address space during user32.dll loading. There are some kernel structures/sections involved: win32k!gSharedInfo, win32k!ghSectionShared, win32k!gpsi and others (which I don't know of).
Actually, the lower 16 bits of HWND represents index into window info array with base address *(&user32!gSharedInfo + 1). The first field of this window info is the kernel address of another structure which contains all the shared window information. Subtracting the difference between kernel address of the section and its user-space mapping (which is stored in TEB!Win32ClientInfo) you can get relevant info.
user32!ValidateHwnd is the function that converts window handle into this address which can be used by inner user32 functions like user32!DefWindowProcWorker.
Pseudocode of GetWindowTextW looks like (excluding error-handling):
GetWindowTextW(HWND hwnd, wchar_t* buf, int size)
{
inner_hwnd = ValidateHwnd(hwnd);
if (TestWindowProcess(inner_hwnd))
SendMessageWorker(inner_hwnd, WM_GETTEXT, size, buf, FALSE);
else
DefWindowProcWorker(inner_hwnd, WM_GETTEXT, size, buf, FALSE);
}
DefWindowProcWorker which is called in your case with WM_GETTEXT will just parse the structure referenced by inner_hwnd and copy window's name into buf.
it seems that the text for EDITTEXT controls are not stored in this manner
I never knew all the info that was stored in there though it seems like a good choice to not pollute processes' virtual space with all kinds of user/gdi params. Besides, lower integrity processes should not be able to get higher integrity processes sensitive information.
because there's no way Windows keeps a copy of the text for every single window
The text most certainly exists, just not as a copy. The text for a window is stored in the virtual memory of the process that owns the window. Might be in RAM, not terribly likely if the process has been dormant for a while, definitely on disk in the paging file. Which doesn't stop GetWindowText() from making a copy. On-the-fly, when you call it.
GetWindowText() is limited, it is documented to only being capable of copying the caption text of a window, so it probably uses the desktop heap for the session to retrieve the text. Not otherwise a restriction to a winapi function like SendMessage(), you can use WM_GETTEXT to obtain a gigabyte from an Edit control. That certainly crosses the process boundary.
As an operating system function, SendMessage can of course break all the rules that apply to normal processes. The OS has no trouble addressing the VM of an arbitrary process. Rules that are routinely broken, your debugger does it as well. With functions that you can use to also break the rules, ReadProcessMemory() and WriteProcessMemory().

Resolving Error R6016 - Not enough space for thread data

My statically-linked Visual C++ 2012 program sporadically generates a CRTL error: "R6016 - Not enough space for thread data".
The minimal documentation from Microsoft says this error message is generated when a new thread is spawned, but not enough memory can be allocated for it.
However, my code only explicitly spawns a new thread in a couple of well-defined cases, neither of which are occurring here (although certainly the Microsoft libraries internally spawn threads at well). One user reported this problem when the program was just existing in the background.
Not sure if it's relevant, but I haven't overridden the default 1MB reserved stack size or heap size, and the total memory in use by my program is usually quite small (3MB-10MB on a system with 12GB actual RAM, over half of which is unallocated).
This happens very rarely (so I can't track it down), and it's been reported on more than one machine. I've only heard about this on Windows 8.1, but I wouldn't read too much into that.
Is there some compiler setting somewhere that might influence this error? Or programming mistake?
This turned out to be caused by calling CreateThread rather than _beginthread. Microsoft documentation in the Remarks section states that CreateThread causes conflicts when using the CRT library, and indeed, once we made the change, we never saw that error again.
You have to call TlsAlloc in DllMain if the Windows version is Vista or higher .
implicit TLS handling was rewritten in Windows Vista [...]
threadprivate and __declspec(thread) should function correctly in
run-time loaded DLLs since then.
BOOL APIENTRY DllMain(HINSTANCE hinstDll, DWORD fdwReason,
LPVOID lpvReserved)
{
static BOOL fFirstProcess = TRUE;
BOOL fWin32s = FALSE;
DWORD dwVersion = GetVersion();
static DWORD dwIndex;
if ( !(dwVersion & 0x80000000) && LOBYTE(LOWORD(dwVersion))<4 )
fWin32s = TRUE;
if (dwReason == DLL_PROCESS_ATTACH) {
if (fFirstProcess || !fWin32s) {
dwIndex = TlsAlloc();
}
fFirstProcess = FALSE;
}
}
kb 118816
When a program is started the size of the TLS is determined by taking
into account the TLS size required by the executable as well as the
TLS requirements of all other implicitly loaded DLLs. When you load
another DLL dynamically with LoadLibrary or unload it with
FreeLibrary, the system has to examine all running threads and to
enlarge or compact their TLS storage accordingly.
Your DLL code should be modified to use such TLS functions as TlsAlloc, and to allocate TLS if the DLL is loaded with LoadLibrary. Or, the DLL that is using __declspec(thread) should only be implicitly loaded into the application.
Bottom line: LoadLibrary ain't thread-safe.
I discovered that process is in 32 bit.
In this case I increase memory to process with the command
bcdedit /set increaseuserva 3072

Looking for an explanation of kernel driver I/O interface capability

I am looking at ways of interfacing to specific hardware I/O addresses from various Windows versions from 32-bit XP up 64-bit Win7 and beyond. There seem to be various solutions published with varying degrees of capability under different Windows versions and I am trying to understand the possibilities for creating my own kernel driver. The most basic kernal I/O R/W capability seems to be the direct I/O operations such as READ_PORT_UCHAR and WRITE_PORT_UCHAR (and their word and long derivatives). I have also seen the technique below which I dont understand, appearing to be some memory mapping capability of which I have no experience and can find little readable documentation. Could someone comment on the suitability / compatibility of READ_PORT_UCHAR / WRITE_PORT_UCHAR versus this mapping technique that I reproduce below please?
Thanks in advance.
case IOCTL_PHYMEM_MAP:
if (dwInBufLen==sizeof(PHYMEM_MEM) && dwOutBufLen==sizeof(PVOID))
{
PHYSICAL_ADDRESS phyAddr;
PVOID pvk, pvu;
phyAddr.QuadPart=(ULONGLONG)pMem->pvAddr;
//get mapped kernel address
pvk=MmMapIoSpace(phyAddr, pMem->dwSize, MmNonCached);
if (pvk)
{
//allocate mdl for the mapped kernel address
PMDL pMdl=IoAllocateMdl(pvk, pMem->dwSize, FALSE, FALSE, NULL);
if (pMdl)
{
PMAPINFO pMapInfo;
//build mdl and map to user space
MmBuildMdlForNonPagedPool(pMdl);
pvu=MmMapLockedPages(pMdl, UserMode);
//insert mapped infomation to list
pMapInfo=(PMAPINFO)ExAllocatePool(\
NonPagedPool, sizeof(MAPINFO));
pMapInfo->pMdl=pMdl;
pMapInfo->pvk=pvk;
pMapInfo->pvu=pvu;
pMapInfo->memSize=pMem->dwSize;
PushEntryList(&lstMapInfo, &pMapInfo->link);
DebugPrint("Map physical 0x%x to virtual 0x%x, size %u", \
pMem->pvAddr, pvu, pMem->dwSize);
RtlCopyMemory(pSysBuf, &pvu, sizeof(PVOID));
irp->IoStatus.Information=sizeof(PVOID);
}
else
{
//allocate mdl error, unmap the mapped physical memory
MmUnmapIoSpace(pvk, pMem->dwSize);
irp->IoStatus.Status=STATUS_INSUFFICIENT_RESOURCES;
}
}
else
irp->IoStatus.Status=STATUS_INSUFFICIENT_RESOURCES;
}
else
irp->IoStatus.Status=STATUS_INVALID_PARAMETER;
break;
What are these I/O ports that you're trying to access? It's generally a Really Bad Idea to go partying on ports that you don't own because you have no way of synchronizing access to those ports with the driver that owns them, the O/S, or the BIOS (it's possible to take an SMI and have the BIOS start talking to ports that it thinks it owns).
The code snippet provided is also a horribly bad idea and should be burned. Basically, all it's doing is mapping a kernel virtual address to a device register (MmMapIoSpace) and then doing the work to then map that device register into user mode (MmMapLockedPages). There are two obvious problems with it:
1) You don't know the caching attributes of the memory, so randomly specifying MmNonCached can hang the system
2) Same as with I/O ports, you can't just arbitrarily access a device's registers. You can't properly synchronize yourself with the driver that owns them, so you're doomed to eventually borking your system.
-scott

How to know what amount of memory I'm using in a process? win32 C++

I'm using
Win32 C++ in
CodeGear Builder 2009
Target is Windows XP Embedded.
I found the PROCESS_MEMORY_COUNTERS_EX struct
and I have created a siple function to return the
Memory consumption of my process
SIZE_T TForm1::ProcessPrivatBytes( DWORD processID )
{
SIZE_T lRetval = 0;
HANDLE hProcess;
PROCESS_MEMORY_COUNTERS_EX pmc;
hProcess = OpenProcess( PROCESS_QUERY_INFORMATION |
PROCESS_VM_READ,
FALSE, processID );
if (NULL == hProcess)
{
lRetval = 1;
}
else
{
if ( GetProcessMemoryInfo( hProcess, (PROCESS_MEMORY_COUNTERS*)&pmc, sizeof(pmc)) )
{
lRetval = pmc.WorkingSetSize;
lRetval = pmc.PrivateUsage;
}
CloseHandle( hProcess );
}
return lRetval;
}
//---------------------------------------------------------------------------
Do i have to use lRetval = pmc.WorkingSetSize; or lRetval = pmc.PrivateUsage;
the privateUsage are what I see in perfmon.
but what is that WorkingSetSize exactly.
I what to see every byte I allocate in the counter when I allocate it. Is this Posible?
regards
jvdn
This is a much tougher question than you probably realized. The reason is that Windows shares most executable code between processes (especially the ones that make up most of Windows itself) between processes. For example, there's normally ONE copy of kernel32.dll loaded into memory, but it'll normally be mapped into every process. Do you consider that part of the memory your process is "using" or not?
Private memory is what's unique to that particular process. This can be somewhat misleading too. Since the executable for your process could potentially be shared with another process (i.e. two instances of your program could be run), that's not counted as part of the private memory, even if (as is often the case) there's only one instance of it running.
The working set size is about 99.999% meaningless. What it returns is whatever has been set as the preferred working set size for the process. You can adjust that with SetProcessWorkingSetSize(). Windows has a working set trimmer that attempts to trim down working sets. If memory serves, it uses the working set size to guess at whether it's worth trying to trim the working set of this process -- i.e. if its current working set is larger than the working set size was set to, it tries to trim it down. Otherwise, it (mostly) leaves it alone.
Chances are that nothing you do will show you ever byte you allocate as you allocate it though. Calling Windows to allocate memory is fairly slow, so what's normally done is that the run-time library allocates a fairly big chunk of memory from Windows. When you allocate memory, the run-time library gives you a piece of that big chunk. Only when that chunk is gone does it go back to Windows and ask for more.

How can I find out how much of address space the application is consuming and report this to user?

I'm writing the memory manager for an application, as part of a team of twenty-odd coders. We're running out of memory quota and we need to be able to see what's going on, since we only appear to be using about 700Mb. I need to be able to report where it's all going - fragmentation etc. Any ideas?
You can use existing memory debugging tools for this, I found Memory Validator 1 quite useful, it is able to track both API level (heap, new...) and OS level (Virtual Memory) allocations and show virtual memory maps.
The other option which I also found very usefull is to be able to dump a map of the whole virtual space based on VirtualQuery function. My code for this looks like this:
void PrintVMMap()
{
size_t start = 0;
// TODO: make portable - not compatible with /3GB, 64b OS or 64b app
size_t end = 1U<<31; // map 32b user space only - kernel space not accessible
SYSTEM_INFO si;
GetSystemInfo(&si);
size_t pageSize = si.dwPageSize;
size_t longestFreeApp = 0;
int index=0;
for (size_t addr = start; addr<end; )
{
MEMORY_BASIC_INFORMATION buffer;
SIZE_T retSize = VirtualQuery((void *)addr,&buffer,sizeof(buffer));
if (retSize==sizeof(buffer) && buffer.RegionSize>0)
{
// dump information about this region
printf(.... some buffer information here ....);
// track longest feee region - usefull fragmentation indicator
if (buffer.State&MEM_FREE)
{
if (buffer.RegionSize>longestFreeApp) longestFreeApp = buffer.RegionSize;
}
addr += buffer.RegionSize;
index+= buffer.RegionSize/pageSize;
}
else
{
// always proceed
addr += pageSize;
index++;
}
}
printf("Longest free VM region: %d",longestFreeApp);
}
You can also find out information about the heaps in a process with Heap32ListFirst/Heap32ListNext, and about loaded modules with Module32First/Module32Next, from the Tool Help API.
'Tool Help' originated on Windows 9x. The original process information API on Windows NT was PSAPI, which offers functions which partially (but not completely) overlap with Tool Help.
Our (huge) application (a Win32 game) started throwing "Not enough quota" exceptions recently, and I was charged with finding out where all the memory was going. It is not a trivial job - this question and this one were my first attempts at finding out. Heap behaviour is unexpected, and accurately tracking how much quota you've used and how much is available has so far proved impossible. In fact, it's not particularly useful information anyway - "quota" and "somewhere to put things" are subtly and annoyingly different concepts. The accepted answer is as good as it gets, although enumerating heaps and modules is also handy. I used DebugDiag from MS to view the true horror of the situation, and understand how hard it is to actually thoroughly track everything.

Resources