Memory profiling tools and methods - windows-phone-7

I'm attempting to analyze the memory usage of our Windows Phone 7 app. Querying the ApplicationPeakMemoryUsage property yields a value of ~90 MB following a soak test. System.GC.GetTotalMemory(true) returns ~11 MB at this point, so the balance must be unmanaged memory. The app does not explicitly allocate any unmanaged memory, so I assume the balance is GPU assets, audio and the app binary itself.
By surrounding calls to ContentManager.Load() and GPU resource allocations (new RenderTarget2D(), etc.) with code similar to

GC.Collect();
GC.WaitForPendingFinalizers();
GC.Collect(); // second pass collects anything freed by the finalizers
long mem = (long)Microsoft.Phone.Info.DeviceExtendedProperties.GetValue("ApplicationCurrentMemoryUsage");

// ... perform loads/allocations ...

mem = (long)Microsoft.Phone.Info.DeviceExtendedProperties.GetValue("ApplicationCurrentMemoryUsage") - mem;
I am able to obtain approximate figures for the memory used by render buffers, texture/audio resources, etc. These total ~45-50 MB across my app. ApplicationCurrentMemoryUsage yields just under 10 MB at the very start of initialization. Subtracting the 11 MB managed heap as well (which is partly double-counting!), this leaves ~20 MB unaccounted for.
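The same before/after delta pattern can be sketched portably with Python's tracemalloc, which traces only managed-heap allocations (this is an analogue of the WP7 approach, not the WP7 API; the list of byte buffers below is a hypothetical stand-in for the real content loads):

```python
import tracemalloc

# Start tracing heap allocations, then take a "before" reading.
tracemalloc.start()
before, _ = tracemalloc.get_traced_memory()

# ... perform loads/allocations (hypothetical stand-in below) ...
assets = [bytes(1024) for _ in range(1000)]  # ~1 MB of fake "content"

# The delta approximates the bytes attributable to the loads.
after, _ = tracemalloc.get_traced_memory()
delta = after - before
```

As with ApplicationCurrentMemoryUsage, the delta is only approximate: anything else allocating between the two readings is attributed to the load as well.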
The Mango memory profiler tracks the totals but only breaks down allocations for the managed heap. What else might be using large quantities of unmanaged memory other than GPU resources, audio and the app binary itself? Are there any more sensible tools or methods for tracking memory than what I am doing?

Downloaded files (including images from the web) can use lots of memory. If you're using them be sure to free up the memory again properly (see http://blogs.msdn.com/b/swick/archive/2011/04/07/image-tips-for-windows-phone-7.aspx).

Are you using a WebBrowser control?
It has some flaws and can cause huge (and incremental) memory leaks in some scenarios, especially if the page contains a lot of media or complicated scripts, or when its page is reloaded/changed with unlucky timing.

Related

How to see exactly how much memory each add-in is using?

Is there a way for me to see exactly how much memory each Outlook add-in is using? I have a few customers on 32-bit Office who are all having issues with screen flashing and crashing and I suspect that we as a company have deployed too many add-ins, and even with Large Address Awareness (LAA), they're running out of memory which is causing Outlook to freak out.
I didn't see a way to do this in Outlook, so I created a .dmp file and opened it in WinDbg, but I'm new to this application and have no clue how to see specific memory usage by specific add-ins (the .dmp file is only of outlook.exe).
The following assumes plugins created in .NET.
The allocation of memory with a new statement goes to the .NET memory manager. In order to find out which plugin allocated the memory, that information would need to be stored in the .NET heap as well.
A UST (User Mode Stack Trace) database like the one available for the Windows Heap Manager is not available in .NET. Also, the .NET memory manager works directly on top of VirtualAlloc(), so it does not use the Windows Heap Manager. Basically, the reason is garbage collection.
Is there a way for me to see exactly how much memory each Outlook add-in is using?
No, since this information is not stored in crash dumps and there's no setting to enable it.
What you need is a memory profiler which is specific for .NET.
If you work with .NET and Visual Studio already, perhaps you're using JetBrains Resharper. The Ultimate Edition comes with a tool called dotMemory, so you might already have a license and you just need to install it via the control panel ("modify" Resharper installation).
It has (and other tools probably have as well) a feature to group memory allocations by assembly.
The accompanying screenshot shows memory allocated by an application called "MemoryPerformance". It retains 202 MB in objects, and those objects are mostly objects of the .NET framework (mscorlib).
The following assumes plugins created in C++ or other "native" languages, at least not .NET.
The allocation of memory with a new statement goes to HeapAlloc(). In order to find out who allocated the memory, that information would need to be stored in the heap as well.
However, you cannot provide that information in the new statement, and even if it were possible, you would need to rewrite all the new statements in your code.
Another way would be that HeapAlloc() has a look at the call stack at the time someone wants memory. In normal operation, that's too much cost (time-wise) and too much overhead (memory-wise). However, it is possible to enable the so called User Mode Stack Trace Database, sometimes abbreviated as UST database. You can do that with the tool GFlags, which ships with WinDbg.
The tool to capture memory snapshots is UMDH, also available with WinDbg. It will store the results as plain text files. It should be possible to extract statistical data from those USTs, however, I'm not aware of a tool that would do that, which means you would need to write one yourself.
The third approach is using the concept of "heap tagging". However, it's quite complex and also needs modifications in your code. I never implemented it, but you can look at the question How to benefit from Heap tagging by DLL?
Let's say the UST approach looks most feasible. How large should the UST database be?
Until now, 50 MB was sufficient for me to identify and fix memory leaks. However, for that use case it's not important to get information about all memory. It just needs enough samples to support a hypothesis. Those 50 MB are IMHO allocated in your application's memory, so it may affect the application.
The UST database only stores the addresses, not the call stack as text. So in a 32-bit application, each frame on the call stack only needs 32 bits of storage.
In your case, 50 MB will not be sufficient. Considering an average depth of 10 frames and an average allocation size of 256 bytes (4 bytes for an int, but also larger things like strings), you get
4 GB / 256 bytes = 16M allocations
16M allocations * 10 frames * 4 byte/frame = 640 MB UST
If the given assumptions are realistic (I can't guarantee that), you would need a 640 MB UST database size. This will influence your application much, since it reduces the memory from 4 GB to 3.3 GB, thus the OOM comes earlier.
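The sizing estimate above is plain arithmetic and can be checked directly, under the same stated assumptions (4 GB of allocations to account for, 256-byte average allocation, 10-frame average stack depth, 4 bytes per 32-bit frame):

```python
# Back-of-the-envelope UST database sizing, using the assumptions above.
address_space = 4 * 1024**3   # 4 GB of allocations to account for
avg_alloc = 256               # assumed average allocation size in bytes
frames_per_stack = 10         # assumed average call-stack depth
bytes_per_frame = 4           # one 32-bit return address per frame

allocations = address_space // avg_alloc                     # 16M allocations
ust_bytes = allocations * frames_per_stack * bytes_per_frame
print(ust_bytes // 1024**2)   # → 640 (MB)
```

Changing any single assumption scales the result linearly, so halving the stack depth or doubling the average allocation size would each halve the database.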
The UST information should also be available in the DMP file, if it was configured at the time the crash dump was created. Certainly not in your DMP file, otherwise you would have told us. However, it's not available in a way that's good for statistics. Using the UMDH text files IMHO is a better approach.
Is there a way for me to see exactly how much memory each Outlook add-in is using?
Not with the DMP file you have at the moment. It will still be hard with the tools available with WinDbg.
There are a few other options left:
Disable all plugins and measure the memory of Outlook itself. Then enable one plugin at a time and measure the memory with that plugin enabled. Calculate the difference to find out what additional memory that plugin needs.
Does it crash immediately at startup? Or later, say after 10 minutes of usage? Could it be a memory leak? Identifying a memory leak could be easier: just enable one plugin at a time and monitor memory usage over time. Use a memory profiler, not WinDbg. It will be much easier to use and it can draw the appropriate graphics you need.
Note that you need to define a clear process to measure memory. Some memory will only be allocated when you do something specific ("lazy initialization"). Perhaps you want to measure that memory, too.

Xcode Memory Utilized

So in Xcode, the Debug Navigator shows CPU Usage and Memory usage. When you click on Memory it says 'Memory Utilized'.
In my app I am using the latest RestKit (0.20.x), and every time I make a GET request using getObjectsAtPath (which doesn't even return a very large payload), the memory utilized increases by about 2 MB. So if I refresh my app 100 times, Memory Utilized will have grown by over 200 MB.
However, when I run the Leaks tool, the Live Bytes remain fairly small and do not increase with each new request. Live Bytes stays below 10 MB the whole time.
So do I have a memory issue or not? Memory Utilized grows like crazy, but Live Bytes suggests everything is okay.
You can use Heapshot Analysis to evaluate the situation. If that shows no growth, then the memory consumption may be virtual memory which may (for example) reside in a cache/store which may support eviction and recreation -- so you should also identify growth in Virtual Memory regions.
If you keep making requests (e.g. try 200 refreshes), the memory will likely decrease at some point - or you will have memory warnings and ultimately allocation requests may fail. Determine how memory is reduced, if this is the case. Otherwise, you will need to determine where it is created and possibly referenced.
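Heapshot analysis boils down to diffing allocation snapshots taken across repeated iterations and looking for monotonic growth. A minimal, portable analogue using Python's tracemalloc snapshots (this is a sketch of the technique, not Instruments itself, which works on Mach VM regions; the leaky handler below is hypothetical):

```python
import tracemalloc

leaked = []  # simulates state accidentally retained across "refreshes"

def refresh():
    # Hypothetical request handler that retains its payload by mistake.
    payload = bytes(10_000)
    leaked.append(payload)

tracemalloc.start()
snap1 = tracemalloc.take_snapshot()
for _ in range(100):
    refresh()
snap2 = tracemalloc.take_snapshot()

# Growth between the two heapshots, grouped by allocating source line;
# a genuine leak shows up as one line that keeps growing run after run.
stats = snap2.compare_to(snap1, "lineno")
growth = stats[0].size_diff  # the leaky allocation site should dominate
```

Repeating the snapshot/diff cycle and checking that the same site keeps growing is what distinguishes a leak from a cache that eventually evicts.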
Also, test on a device in this case. The simulator is able to utilise much more memory than a device simply because it has more to work with. Memory constraints are not simulated.

How does the size of managed code affect memory footprint?

I have been tasked with reducing memory footprint of a Windows CE 5.0 application. I came across Rob Tiffany's highly cited article which recommends using managed DLL to keep the code out of the process's slot. But there is something I don't understand.
The article says that
The JIT compiler is running in your slot and it pulls in IL from the 1 GB space as needed to compile the current call stack.
This means that all the code in the managed DLL can potentially eventually end up in the process's slot. While this will help other processes by not loading the code in common area how does it help this process? FWIW the article does mention that
It also reduces the amount of memory that has to be allocated inside your process slot.
My only thought is that just as the code is pulled into the slot it is also pushed/swapped out. But that is just a wild guess and probably completely false.
CF assemblies aren't loaded into the process slot like native DLLs are. They're actually accessed as memory-mapped files. This means that the size of the DLL is effectively irrelevant.
The managed heap also lies in shared memory, not your process slot, so object allocations are far less likely to cause process slot fragmentation or OOM's.
The JITter also doesn't just JIT and hold forever. It compiles what is necessary, and during a GC may very well pitch compiled code that is not being used, or that hasn't been used in a while. You're never going to see an entire assembly JITted and pulled into the process slot (well, if it's a small assembly, maybe, but it's certainly not typical).
Obviously some process slot memory has to be used to create some pointers, stack storage, etc etc, but by and large managed code has way less impact on the process slot limitations than native code. Of course you can still hit the limit with large stacks, P/Invokes, native allocations and the like.
In my experience, the area where people most often get into trouble with CF apps and memory is GDI objects and drawing. Bitmaps take up a lot of memory. Even though it's largely in shared memory, creating lots of them (along with brushes, pens, etc.) and not caching and reusing them is what most often gives a large managed app its memory footprint.
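The bitmap cost is easy to underestimate: an uncompressed 32-bpp bitmap occupies width × height × 4 bytes in memory, regardless of how small the compressed image on disk was. A quick check for a full WVGA (480×800) screen bitmap:

```python
def bitmap_bytes(width, height, bytes_per_pixel=4):
    # Uncompressed in-memory footprint of a bitmap surface.
    return width * height * bytes_per_pixel

full_screen = bitmap_bytes(480, 800)  # WVGA at 32 bpp
print(full_screen)                    # → 1536000 bytes, ~1.5 MB
```

At that rate, caching ten full-screen frames already costs roughly 15 MB, which is why reusing bitmaps, brushes and pens matters so much on a memory-constrained device.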
For a bit more detail this MSDN webcast on Compact Framework Memory Management, while old, is still very relevant.

What is private bytes, virtual bytes, working set?

I am trying to use the perfmon windows utility to debug memory leaks in a process.
This is how perfmon explains the terms:
Working Set is the current size, in bytes, of the Working Set of this process. The Working Set is the set of memory pages touched recently by the threads in the process. If free memory in the computer is above a threshold, pages are left in the Working Set of a process even if they are not in use. When free memory falls below a threshold, pages are trimmed from Working Sets. If they are needed they will then be soft-faulted back into the Working Set before leaving main memory.
Virtual Bytes is the current size, in bytes, of the virtual address space the process is using. Use of virtual address space does not necessarily imply corresponding use of either disk or main memory pages. Virtual space is finite, and by using too much of it, the process can limit its ability to load libraries.
Private Bytes is the current size, in bytes, of memory that this process has allocated that cannot be shared with other processes.
These are the questions I have:
Is it the Private Bytes which I should measure to be sure if the process is having any leaks as it does not involve any shared libraries and any leaks, if happening, will come from the process itself?
What is the total memory consumed by the process? Is it the Virtual Bytes or is it the sum of Virtual Bytes and Working Set?
Is there any relation between Private Bytes, Working Set and Virtual Bytes?
Are there any other tools that give a better idea of the memory usage?
The short answer to this question is that none of these values are a reliable indicator of how much memory an executable is actually using, and none of them are really appropriate for debugging a memory leak.
Private Bytes refer to the amount of memory that the process executable has asked for - not necessarily the amount it is actually using. They are "private" because they (usually) exclude memory-mapped files (i.e. shared DLLs). But - here's the catch - they don't necessarily exclude memory allocated by those files. There is no way to tell whether a change in private bytes was due to the executable itself, or due to a linked library. Private bytes are also not exclusively physical memory; they can be paged to disk or in the standby page list (i.e. no longer in use, but not paged yet either).
Working Set refers to the total physical memory (RAM) used by the process. However, unlike private bytes, this also includes memory-mapped files and various other resources, so it's an even less accurate measurement than the private bytes. This is the same value that gets reported in Task Manager's "Mem Usage" and has been the source of endless amounts of confusion in recent years. Memory in the Working Set is "physical" in the sense that it can be addressed without a page fault; however, the standby page list is also still physically in memory but not reported in the Working Set, and this is why you might see the "Mem Usage" suddenly drop when you minimize an application.
Virtual Bytes are the total virtual address space occupied by the entire process. This is like the working set, in the sense that it includes memory-mapped files (shared DLLs), but it also includes data in the standby list and data that has already been paged out and is sitting in a pagefile on disk somewhere. The total virtual bytes used by every process on a system under heavy load will add up to significantly more memory than the machine actually has.
So the relationships are:
Private Bytes are what your app has actually allocated, but include pagefile usage;
Working Set is the non-paged Private Bytes plus memory-mapped files;
Virtual Bytes are the Working Set plus paged Private Bytes and standby list.
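Those three relationships can be written down as a toy bookkeeping model (illustrative only, with made-up numbers; real counters include more categories than these four):

```python
# Toy bookkeeping for one process, in MB (all numbers are invented).
private_resident = 120  # private bytes currently resident in RAM
private_paged = 30      # private bytes swapped out to the pagefile
mapped_files = 50       # shared DLLs / memory-mapped files in RAM
standby = 10            # trimmed pages still cached on the standby list

private_bytes = private_resident + private_paged       # the commit charge
working_set = private_resident + mapped_files          # resident pages
virtual_bytes = working_set + private_paged + standby  # address-space view

print(private_bytes, working_set, virtual_bytes)  # → 150 170 210
```

The model makes the key point concrete: no single counter isolates "memory my code is using", because each one mixes private, shared and paged-out pages differently.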
There's another problem here; just as shared libraries can allocate memory inside your application module, leading to potential false positives reported in your app's Private Bytes, your application may also end up allocating memory inside the shared modules, leading to false negatives. That means it's actually possible for your application to have a memory leak that never manifests itself in the Private Bytes at all. Unlikely, but possible.
Private Bytes are a reasonable approximation of the amount of memory your executable is using and can be used to help narrow down a list of potential candidates for a memory leak; if you see the number growing and growing constantly and endlessly, you would want to check that process for a leak. This cannot, however, prove that there is or is not a leak.
One of the most effective tools for detecting/correcting memory leaks in Windows is actually Visual Studio (link goes to page on using VS for memory leaks, not the product page). Rational Purify is another possibility. Microsoft also has a more general best practices document on this subject. There are more tools listed in this previous question.
I hope this clears a few things up! Tracking down memory leaks is one of the most difficult things to do in debugging. Good luck.
The definition of the perfmon counters has been broken since the beginning and for some reason appears to be too hard to correct.
A good overview of Windows memory management is available in the video "Mysteries of Memory Management Revealed" on MSDN: It covers more topics than needed to track memory leaks (eg working set management) but gives enough detail in the relevant topics.
To give you a hint of the problem with the perfmon counter descriptions, here is the inside story about private bytes from "Private Bytes Performance Counter -- Beware!" on MSDN:
Q: When is a Private Byte not a Private Byte?
A: When it isn't resident.
The Private Bytes counter reports the commit charge of the process. That is to say, the amount of space that has been allocated in the swap file to hold the contents of the private memory in the event that it is swapped out. Note: I'm avoiding the word "reserved" because of possible confusion with virtual memory in the reserved state which is not committed.
From "Performance Planning" on MSDN:
3.3 Private Bytes
3.3.1 Description
Private memory is defined as memory allocated for a process which cannot be shared by other processes. This memory is more expensive than shared memory when multiple such processes execute on a machine. Private memory in (traditional) unmanaged DLLs usually consists of C++ statics and is of the order of 5% of the total working set of the DLL.
You should not try to use perfmon, task manager or any tool like that to determine memory leaks. They are good for identifying trends, but not much else. The numbers they report in absolute terms are too vague and aggregated to be useful for a specific task such as memory leak detection.
A previous reply to this question has given a great explanation of what the various types are.
You ask about a tool recommendation:
I recommend Memory Validator. Capable of monitoring applications that make billions of memory allocations.
http://www.softwareverify.com/cpp/memory/index.html
Disclaimer: I designed Memory Validator.
There is an interesting discussion here: http://social.msdn.microsoft.com/Forums/en-US/vcgeneral/thread/307d658a-f677-40f2-bdef-e6352b0bfe9e/
My understanding of this thread is that the freeing of small allocations is not reflected in Private Bytes or Working Set.
Long story short: if I call
p = malloc(1000);
free(p);
then Private Bytes reflects only the allocation, not the deallocation.
If I instead call
p = malloc(600 * 1024); /* anything larger than 512 KB */
free(p);
then Private Bytes correctly reflects both the allocation and the deallocation.
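That threshold behavior can be modeled with a toy allocator (this mimics the policy described in the thread, it does not call the real Windows heap): small blocks are served from pages the heap keeps committed even after free, while blocks over 512 KB go straight to the OS and are decommitted on free.

```python
THRESHOLD = 512 * 1024  # assumed cutoff for direct-to-OS allocations

class ToyHeap:
    """Models commit-charge accounting only, not real allocation."""
    def __init__(self):
        self.committed = 0  # stands in for Private Bytes

    def malloc(self, size):
        self.committed += size
        return size  # the "pointer" is just the size, for this demo

    def free(self, size):
        # Small blocks return to the heap's free lists and stay committed;
        # large blocks are decommitted back to the OS immediately.
        if size > THRESHOLD:
            self.committed -= size

heap = ToyHeap()
p = heap.malloc(1000)
heap.free(p)
small_after = heap.committed  # 1000: free() did not shrink the commit charge

q = heap.malloc(600 * 1024)
heap.free(q)
large_after = heap.committed  # back to 1000: the large block was decommitted
```

Under this model, a program that churns through many small allocations shows a Private Bytes plateau even when it has no leak, which is exactly why the counter is a trend indicator rather than proof.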

How much memory is my windows app really using?

I have a long-running memory hog of an experimental program, and I'd like to know its actual memory footprint. Task Manager says (in Windows 7 64-bit) that the app is consuming 800 MB of memory, but the total amount of memory allocated, also according to Task Manager, is 3.7 GB. The sum of all the allocated memory does not equal 3.7 GB. How can I determine, on the fly, how much memory my application is actually consuming?
Corollary: What memory is the task manager actually reporting? It doesn't seem to be all the memory that's allocated to the app itself.
As I understand it, Task Manager shows the Working Set:
working set: The set of memory pages recently touched by the threads of a process. If free memory in the computer is above a threshold, pages are left in the working set of a process even if they are not being used. When free memory falls below a threshold, pages are trimmed from the working set.
via http://msdn.microsoft.com/en-us/library/cc432779(PROT.10).aspx
You can get Task Manager to show Virtual Memory as well.
I usually use perfmon (Start -> Run... -> perfmon) to track memory usage, using the Private Bytes counter. It reflects memory allocated by your normal allocators (new/HeapAlloc/malloc, etc).
Memory is a tricky thing to measure. An application might reserve lots of virtual memory but not actually use much of it. Some of the memory might be shared; that is, a shared DLL might be loaded in to the address space of several applications but it is only loaded in to physical memory once.
A good measure is the working set, which is the set of pages in its virtual address space that have been accessed recently. What the meaning of 'accessed recently' is depends on the operating system and its page replacement algorithm. In other words, it is the actual set of virtual pages that are mapped in to physical memory and are in use at the moment. This is what the task manager shows you.
The virtual memory usage is the amount of virtual pages that have been reserved (note that not all of these will have actually been committed, that is, had physical backing store allocated for them). You can add this to the display in Task Manager by clicking View -> Select Columns.
The most important thing though: If you want to actually measure how much memory your program is using to see if you need to optimize some of it for space or choose better data structures or persist some things to disk, using the task manager is the wrong approach. You should almost certainly be using a profiler.
That depends on what memory you are talking about. Unfortunately there are many different ways to measure memory. For instance ...
Physical Memory Allocated
Virtual Memory Allocated
Virtual Memory Reserved (but not committed)
Private Bytes
Shared Bytes
Which metric are you interested in?
I think most people tend to be interested in the "Virtual Memory Allocated" category.
The memory statistics displayed by task manager are not nearly all the statistics available, nor are particularly well presented. I would use the great free tool from Microsoft Sysinternals, VMMap, to analyse the memory used by the application further.
If it is a long-running application, and the memory usage grows over time, it is going to be the heap that is growing. Parts of the heap may or may not be paged out to disk at any time, but you really need to optimize your heap usage. In this case you need to profile your application. If it is a .NET application then I can recommend Redgate's ANTS profiler. It is very easy to use. If it's a native application, then the Intel VTune profiler is pretty powerful. You don't need the source code of the process you are profiling for either tool.
Both applications have a free trial. Good luck.
P.S. Sorry I didn't include more hyperlinks to the tools, but this is my first post, and stackoverflow limits first posts to one hyperlink :-(
