windbg memory leak investigation - missing heap memory - windows

I am investigating a slow memory leak in a Windows application using windbg.
!heap -s gives the following output:
Heap Flags Reserv Commit Virt Free List UCR Virt Lock Fast
(k) (k) (k) (k) length blocks cont. heap
-------------------------------------------------------------------------------------
00000023d62c0000 08000002 1182680 1169996 1181900 15759 2769 78 3 2b63 LFH
00000023d4830000 08008000 64 4 64 2 1 1 0 0
00000023d6290000 08001002 1860 404 1080 43 7 2 0 0 LFH
00000023d6dd0000 08001002 32828 32768 32828 32765 33 1 0 0
External fragmentation 99 % (33 free blocks)
00000023d8fb0000 08001000 16384 2420 16384 2412 5 5 0 3355
External fragmentation 99 % (5 free blocks)
00000023da780000 08001002 60 8 60 5 2 1 0 0
-------------------------------------------------------------------------------------
This shows that the heap with address 00000023d62c0000 has over a gigabyte of reserved memory.
Next I ran the command !heap -stat -h 00000023d62c0000
heap # 00000023d62c0000
group-by: TOTSIZE max-display: 20
size #blocks total ( %) (percent of total busy bytes)
30 19b1 - 4d130 (13.81)
20 1d72 - 3ae40 (10.55)
ccf 40 - 333c0 (9.18)
478 8c - 271a0 (7.01)
27158 1 - 27158 (7.00)
40 80f - 203c0 (5.78)
410 79 - 1eb90 (5.50)
68 43a - 1b790 (4.92)
16000 1 - 16000 (3.94)
50 39e - 12160 (3.24)
11000 1 - 11000 (3.05)
308 54 - fea0 (2.85)
60 28e - f540 (2.75)
8018 1 - 8018 (1.43)
80 f2 - 7900 (1.36)
1000 5 - 5000 (0.90)
70 ac - 4b40 (0.84)
4048 1 - 4048 (0.72)
100 3e - 3e00 (0.69)
48 c9 - 3888 (0.63)
If I add up the total sizes of the heap blocks from the above command (0x4d130 + 0x3ae40 + ...), I get a few megabytes of allocated memory.
Am I missing something here? How can I find which blocks are consuming the gigabyte of allocated heap memory?
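For what it's worth, the totals in that column are hex, so the addition can be checked with a quick script; this is just arithmetic on the numbers shown above, not windbg output:

```python
# Sum the hex "total" column from the !heap -stat -h output above to see
# how little of the 1 GB heap the top-20 bucket stats actually account for.
totals = [
    "4d130", "3ae40", "333c0", "271a0", "27158", "203c0", "1eb90",
    "1b790", "16000", "12160", "11000", "fea0", "f540", "8018",
    "7900", "5000", "4b40", "4048", "3e00", "3888",
]
total_bytes = sum(int(t, 16) for t in totals)
print(total_bytes, f"{total_bytes / 2**20:.1f} MiB")  # -> 1969520, 1.9 MiB
```

So the displayed buckets cover roughly 1.9 MiB, which is indeed nowhere near the gigabyte of reserved memory.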

I believe that !heap -stat is broken for 64-bit dumps, at least large ones. I have instead used DebugDiag 1.2 for hunting memory leaks in 64-bit processes.


Description ENGINE LOG in IBM ILOG CPLEX

I want to understand the engine log of IBM ILOG CPLEX Studio for an ILP model. I have checked their documentation but could not get a clear idea.
Example of Engine log :
Version identifier: 22.1.0.0 | 2022-03-09 | 1a383f8ce
Legacy callback pi
Tried aggregator 2 times.
MIP Presolve eliminated 139 rows and 37 columns.
MIP Presolve modified 156 coefficients.
Aggregator did 11 substitutions.
Reduced MIP has 286 rows, 533 columns, and 3479 nonzeros.
Reduced MIP has 403 binaries, 0 generals, 0 SOSs, and 129 indicators.
Presolve time = 0.05 sec. (6.16 ticks)
Found incumbent of value 233.000000 after 0.07 sec. (9.40 ticks)
Probing time = 0.00 sec. (1.47 ticks)
Tried aggregator 2 times.
Detecting symmetries...
Aggregator did 2 substitutions.
Reduced MIP has 284 rows, 531 columns, and 3473 nonzeros.
Reduced MIP has 402 binaries, 129 generals, 0 SOSs, and 129 indicators.
Presolve time = 0.01 sec. (2.87 ticks)
Probing time = 0.00 sec. (1.45 ticks)
Clique table members: 69.
MIP emphasis: balance optimality and feasibility.
MIP search method: dynamic search.
Parallel mode: deterministic, using up to 8 threads.
Root relaxation solution time = 0.00 sec. (0.50 ticks)
Nodes Cuts/
Node Left Objective IInf Best Integer Best Bound ItCnt Gap
* 0+ 0 233.0000 18.0000 92.27%
* 0+ 0 178.0000 18.0000 89.89%
* 0+ 0 39.0000 18.0000 53.85%
0 0 22.3333 117 39.0000 22.3333 4 42.74%
0 0 28.6956 222 39.0000 Cuts: 171 153 26.42%
0 0 31.1543 218 39.0000 Cuts: 123 251 20.12%
0 0 32.1544 226 39.0000 Cuts: 104 360 17.55%
0 0 32.6832 212 39.0000 Cuts: 102 456 16.20%
0 0 33.1524 190 39.0000 Cuts: 65 521 14.99%
Detecting symmetries...
0 0 33.3350 188 39.0000 Cuts: 66 566 14.53%
0 0 33.4914 200 39.0000 Cuts: 55 614 14.12%
0 0 33.6315 197 39.0000 Cuts: 47 673 13.77%
0 0 33.6500 207 39.0000 Cuts: 61 787 13.72%
0 0 33.7989 206 39.0000 Cuts: 91 882 13.34%
* 0+ 0 38.0000 33.7989 11.06%
0 0 33.9781 209 38.0000 Cuts: 74 989 10.58%
0 0 34.0074 209 38.0000 Cuts: 65 1043 10.51%
0 0 34.2041 220 38.0000 Cuts: 63 1124 9.99%
0 0 34.2594 211 38.0000 Cuts: 96 1210 9.84%
0 0 34.3032 216 38.0000 Cuts: 86 1274 9.73%
0 0 34.3411 211 38.0000 Cuts: 114 1353 9.63%
0 0 34.3420 220 38.0000 Cuts: 82 1402 9.63%
0 0 34.3709 218 38.0000 Cuts: 80 1462 9.55%
0 0 34.4494 228 38.0000 Cuts: 87 1530 9.34%
0 0 34.4882 229 38.0000 Cuts: 97 1616 9.24%
0 0 34.5173 217 38.0000 Cuts: 72 1663 9.16%
0 0 34.5545 194 38.0000 Cuts: 67 1731 9.07%
0 0 34.5918 194 38.0000 Cuts: 76 1786 8.97%
0 0 34.6094 199 38.0000 Cuts: 73 1840 8.92%
0 0 34.6226 206 38.0000 Cuts: 77 1883 8.89%
0 0 34.6421 206 38.0000 Cuts: 53 1928 8.84%
0 0 34.6427 213 38.0000 Cuts: 84 1982 8.83%
Detecting symmetries...
0 2 34.6427 213 38.0000 34.6478 1982 8.82%
Elapsed time = 0.44 sec. (235.86 ticks, tree = 0.02 MB, solutions = 4)
GUB cover cuts applied: 32
Cover cuts applied: 328
Implied bound cuts applied: 205
Flow cuts applied: 11
Mixed integer rounding cuts applied: 17
Zero-half cuts applied: 35
Gomory fractional cuts applied: 1
Root node processing (before b&c):
Real time = 0.43 sec. (235.61 ticks)
Parallel b&c, 8 threads:
Real time = 0.27 sec. (234.23 ticks)
Sync time (average) = 0.11 sec.
Wait time (average) = 0.00 sec.
------------
Total (root+branch&cut) = 0.71 sec. (469.84 ticks)
Mainly I want to understand what Nodes, Left, Gap, root node processing, and parallel b&c mean.
I hope one of you can give a resource or explain it clearly, so that it can be helpful when someone starts using IBM ILOG CPLEX Studio in the future.
Thanks a lot in advance.
I am expecting someone to fill the knowledge gaps regarding the engine log of IBM's ILOG CPLEX Studio.
I recommend
Progress reports: interpreting the node log
https://www.ibm.com/docs/en/icos/12.8.0.0?topic=mip-progress-reports-interpreting-node-log
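Briefly: Node is the number of branch-and-bound nodes processed, Left is the number still open, Best Integer is the best feasible (incumbent) objective found so far, Best Bound is the proven bound, and Gap is the relative distance between the two. As a sketch of how the Gap column is computed (CPLEX's exact formula also adds a tiny epsilon to the denominator per its documentation, but this matches the log above):

```python
def relative_mip_gap(best_integer, best_bound):
    """Relative MIP gap as typically shown in the node log:
    |best integer - best bound| / |best integer|."""
    return abs(best_integer - best_bound) / abs(best_integer)

# First root-node row of the log above: incumbent 39.0000, bound 22.3333
print(f"{relative_mip_gap(39.0, 22.3333) * 100:.2f}%")  # -> 42.74%
```

Root node processing is the time spent solving and cutting at node 0 before branching; parallel b&c is the subsequent branch-and-cut phase run across threads.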

trying to understand how checksum is calculated

I am looking at this page and I am not sure how the author is calculating the checksum. I would contact the author directly, but I don't have his email address (it's not listed on GitHub).
This is a simple example of a packet with no variables. The author calculates the checksum to be 120 (I assumed this was hex, as all his other values are in hex). The sum of all the bytes is 0xBA, or 186 in base 10. His notes say "Checksum Low Bit, This bit is checksum of 1-5 bits (MOD 256, if necessary)", but I don't get what he is saying and I can't figure out how to reach his answer.
Get Version / Return Name
Byte 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
Request 16 2 80 20 2 120 16 3
Byte Sample hex Definition
hex (B10)
==== ==== ===== =============================
1 0x16 (22) Preamble 1
2 0x02 (2) Preamble 2
3 0x80 (128) Destination = Chlorinator
4 0x20 (32) Command = Get Name
5 0x02 (2) Not sure. Intellitouch uses 2. Aquarite uses 0. Any of them seem to work.
6 120 Checksum Low Bit, This bit is checksum of 1-5 bits (MOD 256, if necessary)
7 0x16 (22) Post-amble 1
8 0x3 (3) Post-amble 2
Any suggestions would be most appreciated!
Turns out that the commenters were 100% correct: the numbers were expressed in decimal, not hex as I had assumed.
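With the bytes read as decimal, the checksum is just the sum of bytes 1-5 mod 256; a minimal sketch:

```python
def checksum(packet_bytes):
    # Sum of the bytes preceding the checksum field, reduced mod 256
    # so it fits in a single byte.
    return sum(packet_bytes) % 256

# Bytes 1-5 of the request, read as decimal: 16, 2, 80, 20, 2
print(checksum([16, 2, 80, 20, 2]))  # -> 120
```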

Why does the execution time drop sharply (more than expected) as the number of processors increases?

I am executing my program 5,000,000 times in parallel using Parallel.For from F#.
The average execution time per task is given below.
Number of active cores : Execution Time (microseconds)
2 : 866
4 : 424
8 : 210
12 : 140
16 : 106
24 : 76
32 : 60
Given that, by doubling the number of cores, the maximum speedup we can get should be at most 2 (ideally exactly 2), what can be the reason for this sharp speedup?
2 * 866 = 1732
4 * 424 = 1696
8 * 210 = 1680
12 * 140 = 1680
16 * 106 = 1696
24 * 76 = 1824
32 * 60 = 1920
So as you increase parallelism, relative performance improves and then begins to fall. The improvement is possibly due to amortization of overhead costs like JIT compilation or in the algorithm that manages the parallelization.
The degradation as the degree of parallelism increases is often due to some sort of resource contention, excessive context switching, or the like.
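The products above can be reproduced with a quick sketch (cores × per-task time is proportional to total CPU work; a flat product would mean perfectly linear scaling):

```python
# Per-task execution time in microseconds, from the question's table.
times = {2: 866, 4: 424, 8: 210, 12: 140, 16: 106, 24: 76, 32: 60}

# cores * time: improves (overhead amortized), bottoms out, then degrades
# (contention, context switching) as parallelism keeps increasing.
for cores, t in times.items():
    print(f"{cores:2d} cores: {cores * t}")
```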

How to calculate classification error rate

Alright, this question is pretty hard. I am going to give you an example.
The left numbers are my algorithm's classifications and the right numbers are the original class numbers:
177 86
177 86
177 86
177 86
177 86
177 86
177 86
177 86
177 86
177 89
177 89
177 89
177 89
177 89
177 89
177 89
So here my algorithm merged 2 different classes into 1. As you can see, it merged classes 86 and 89 into one class. So what would the error be in the above example?
Or here is another example:
203 7
203 7
203 7
203 7
16 7
203 7
17 7
16 7
203 7
In the above example the left numbers are my algorithm's classifications and the right numbers are the original class ids. As can be seen, it misclassified 3 products (I am classifying identical commercial products). So in this example, what would the error rate be? How would you calculate it?
This question is pretty hard and complex. We have finished the classification, but we are not able to find the correct algorithm for calculating the success rate :D
Here's a longish example, a real confusion matrix with 10 input classes "0" - "9"
(handwritten digits),
and 10 output clusters labelled A - J.
Confusion matrix for 5620 optdigits:
True 0 - 9 down, clusters A - J across
-----------------------------------------------------
A B C D E F G H I J
-----------------------------------------------------
0: 2 4 1 546 1
1: 71 249 11 1 6 228 5
2: 13 5 64 1 13 1 460
3: 29 2 507 20 5 9
4: 33 483 4 38 5 3 2
5: 1 1 2 58 3 480 13
6: 2 1 2 294 1 1 257
7: 1 5 1 546 6 7
8: 415 15 2 5 3 12 13 87 2
9: 46 72 2 357 35 1 47 2
----------------------------------------------------
580 383 496 1002 307 670 549 557 810 266 estimates in each cluster
y class sizes: [554 571 557 572 568 558 558 566 554 562]
kmeans cluster sizes: [ 580 383 496 1002 307 670 549 557 810 266]
For example, cluster A has 580 data points, 415 of which are "8"s;
cluster B has 383 data points, 249 of which are "1"s; and so on.
The problem is that the output classes are scrambled, permuted;
they correspond in this order, with counts:
A B C D E F G H I J
8 1 4 3 6 7 0 5 2 6
415 249 483 507 294 546 546 480 460 257
One could say that the "success rate" is
75 % = (415 + 249 + 483 + 507 + 294 + 546 + 546 + 480 + 460 + 257) / 5620
but this throws away useful information —
here, that E and J both say "6", and no cluster says "9".
So, add up the biggest numbers in each column of the confusion matrix
and divide by the total.
But, how to count overlapping / missing clusters,
like the 2 "6"s, no "9"s here ?
I don't know of a commonly agreed-upon way
(doubt that the Hungarian algorithm
is used in practice).
Bottom line: don't throw away information; look at the whole confusion matrix.
NB such a "success rate" will be optimistic for new data !
It's customary to split the data into say 2/3 "training set" and 1/3 "test set",
train e.g. k-means on the 2/3 alone,
then measure confusion / success rate on the test set — generally worse than on the training set alone.
Much more can be said; see e.g.
Cross-validation.
You have to define the error criteria if you want to evaluate the performance of an algorithm, so I'm not sure exactly what you're asking. In some clustering and machine learning algorithms you define the error metric and it minimizes it.
Take a look at this
https://en.wikipedia.org/wiki/Confusion_matrix
to get some ideas
You have to define an error metric to measure yourself against. In your case, a simple method would be to define a properties mapping for your products as
p = properties(id)
where id is the product id, and p is likely a vector with one entry per property. Then you can define the error function e (or distance) between two products as
e = d(p1, p2)
Of course, each property must be evaluated to a number in this function. Then this error function can be used in the classification algorithm and learning.
In your second example, it seems that you treat the pair (203, 7) as a successful classification, so I think you already have a metric yourself. You may want to be more specific to get a better answer.
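As an illustrative sketch of such a distance function (the property vectors and their numeric encoding here are made up for the example, not part of the question):

```python
import math

def distance(p1, p2):
    # Euclidean distance between two numeric property vectors;
    # any metric appropriate to the domain could be substituted.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p1, p2)))

# Two hypothetical products encoded as (weight, price, category_code)
print(distance([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))  # -> 2.0
```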
Classification Error Rate (CER) is 1 - Purity (http://nlp.stanford.edu/IR-book/html/htmledition/evaluation-of-clustering-1.html)
ClusterPurity <- function(clusters, classes) {
  sum(apply(table(classes, clusters), 2, max)) / length(clusters)
}
Code from @john-colby's answer.
Or
CER <- function(clusters, classes) {
  1 - sum(apply(table(classes, clusters), 2, max)) / length(clusters)
}
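A Python equivalent of those R functions, as a sketch (same column-max convention as the confusion-matrix discussion earlier):

```python
from collections import Counter

def cluster_purity(clusters, classes):
    # For each cluster, count its most common true class, then divide
    # the sum of those maxima by the total number of points.
    per_cluster = {}
    for cl, cls in zip(clusters, classes):
        per_cluster.setdefault(cl, Counter())[cls] += 1
    return sum(c.most_common(1)[0][1] for c in per_cluster.values()) / len(clusters)

def classification_error_rate(clusters, classes):
    return 1 - cluster_purity(clusters, classes)

# Second example from the question: every true class is 7
clusters = [203, 203, 203, 203, 16, 203, 17, 16, 203]
classes = [7] * 9
print(cluster_purity(clusters, classes))  # -> 1.0
```

Note that purity comes out as 1.0 for that example even though one true class was split across three clusters (each cluster contains only class 7), which illustrates the warning above about relying on a single number instead of the whole confusion matrix.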

How to analyze <unclassified> memory usage in windbg

This is a .NET v4 Windows service application running on an x64 machine. At some point, after days of running steadily, the service's memory consumption spikes up like crazy until it crashes. I was able to catch it at 1.2 GB and capture a memory dump. Here is what I get.
If I run !address -summary in windbg on my dump file, I get the following result:
!address -summary
--- Usage Summary ------ RgnCount ------- Total Size -------- %ofBusy %ofTotal
Free 821 7ff`7e834000 ( 7.998 Tb) 99.98%
<unclassified> 3696 0`6eece000 ( 1.733 Gb) 85.67% 0.02%
Image 1851 0`0ea6f000 ( 234.434 Mb) 11.32% 0.00%
Stack 1881 0`03968000 ( 57.406 Mb) 2.77% 0.00%
TEB 628 0`004e8000 ( 4.906 Mb) 0.24% 0.00%
NlsTables 1 0`00023000 ( 140.000 kb) 0.01% 0.00%
ActivationContextData 3 0`00006000 ( 24.000 kb) 0.00% 0.00%
CsrSharedMemory 1 0`00005000 ( 20.000 kb) 0.00% 0.00%
PEB 1 0`00001000 ( 4.000 kb) 0.00% 0.00%
-
-
-
--- Type Summary (for busy) -- RgnCount ----- Total Size ----- %ofBusy %ofTotal
MEM_PRIVATE 5837 0`7115a000 ( 1.767 Gb) 87.34% 0.02%
MEM_IMAGE 2185 0`0f131000 (241.191 Mb) 11.64% 0.00%
MEM_MAPPED 40 0`01531000 ( 21.191 Mb) 1.02% 0.00%
-
-
--- State Summary ------------ RgnCount ------ Total Size ---- %ofBusy %ofTotal
MEM_FREE 821 7ff`7e834000 ( 7.998 Tb) 99.98%
MEM_COMMIT 6127 0`4fd5e000 ( 1.247 Gb) 61.66% 0.02%
MEM_RESERVE 1935 0`31a5e000 (794.367 Mb) 38.34% 0.01%
-
-
--Protect Summary(for commit)- RgnCount ------ Total Size --- %ofBusy %ofTotal
PAGE_READWRITE 3412 0`3e862000 (1000.383 Mb) 48.29% 0.01%
PAGE_EXECUTE_READ 220 0`0b12f000 ( 177.184 Mb) 8.55% 0.00%
PAGE_READONLY 646 0`02fd0000 ( 47.813 Mb) 2.31% 0.00%
PAGE_WRITECOPY 410 0`01781000 ( 23.504 Mb) 1.13% 0.00%
PAGE_READWRITE|PAGE_GUARD 1224 0`012f2000 ( 18.945 Mb) 0.91% 0.00%
PAGE_EXECUTE_READWRITE 144 0`007b9000 ( 7.723 Mb) 0.37% 0.00%
PAGE_EXECUTE_WRITECOPY 70 0`001cd000 ( 1.801 Mb) 0.09% 0.00%
PAGE_EXECUTE 1 0`00004000 ( 16.000 kb) 0.00% 0.00%
-
-
--- Largest Region by Usage ----Base Address -------- Region Size ----------
Free 0`8fff0000 7fe`59050000 ( 7.994 Tb)
<unclassified> 0`80d92000 0`0f25e000 ( 242.367 Mb)
Image fe`f6255000 0`0125a000 ( 18.352 Mb)
Stack 0`014d0000 0`000fc000 (1008.000 kb)
TEB 0`7ffde000 0`00002000 ( 8.000 kb)
NlsTables 7ff`fffb0000 0`00023000 ( 140.000 kb)
ActivationContextData 0`00030000 0`00004000 ( 16.000 kb)
CsrSharedMemory 0`7efe0000 0`00005000 ( 20.000 kb)
PEB 7ff`fffdd000 0`00001000 ( 4.000 kb)
First, why would <unclassified> show up once as 1.733 GB and another time as 242 MB? (This has been answered, thank you.)
Second, I understand that unclassified can mean managed code; however, my heap size according to !eeheap is only 248 MB, which roughly matches the 242 MB but is nowhere near the 1.733 GB. The dump file size is 1.2 GB, which is much higher than normal. Where do I go from here to find out what's using all the memory? Everything in the managed heap world is under 248 MB, but I'm using 1.2 GB.
Thanks
EDIT
If I do !heap -s I get the following:
LFH Key : 0x000000171fab7f20
Termination on corruption : ENABLED
Heap Flags Reserv Commit Virt Free List UCR Virt Lock Fast
(k) (k) (k) (k) length blocks cont. heap
-------------------------------------------------------------------------------------
Virtual block: 00000000017e0000 - 00000000017e0000 (size 0000000000000000)
Virtual block: 0000000045bd0000 - 0000000045bd0000 (size 0000000000000000)
Virtual block: 000000006fff0000 - 000000006fff0000 (size 0000000000000000)
0000000000060000 00000002 113024 102028 113024 27343 1542 11 3 1c LFH
External fragmentation 26 % (1542 free blocks)
0000000000010000 00008000 64 4 64 1 1 1 0 0
0000000000480000 00001002 3136 1380 3136 20 8 3 0 0 LFH
0000000000640000 00041002 512 8 512 3 1 1 0 0
0000000000800000 00001002 3136 1412 3136 15 7 3 0 0 LFH
00000000009d0000 00001002 3136 1380 3136 19 7 3 0 0 LFH
00000000008a0000 00041002 512 16 512 3 1 1 0 0
0000000000630000 00001002 7232 3628 7232 18 53 4 0 0 LFH
0000000000da0000 00041002 1536 856 1536 1 1 2 0 0 LFH
0000000000ef0000 00041002 1536 944 1536 4 12 2 0 0 LFH
00000000034b0000 00001002 1536 1452 1536 6 17 2 0 0 LFH
00000000019c0000 00001002 3136 1396 3136 16 6 3 0 0 LFH
0000000003be0000 00001002 1536 1072 1536 5 7 2 0 3 LFH
0000000003dc0000 00011002 512 220 512 100 60 1 0 2
0000000002520000 00001002 512 8 512 3 2 1 0 0
0000000003b60000 00001002 339712 168996 339712 151494 976 116 0 18 LFH
External fragmentation 89 % (976 free blocks)
Virtual address fragmentation 50 % (116 uncommited ranges)
0000000003f20000 00001002 64 8 64 3 1 1 0 0
0000000003d90000 00001002 64 8 64 3 1 1 0 0
0000000003ee0000 00001002 64 16 64 11 1 1 0 0
-------------------------------------------------------------------------------------
I've recently had a very similar situation and found a couple techniques useful in the investigation. None is a silver bullet, but each sheds a little more light on the problem.
1) vmmap.exe from SysInternals (http://technet.microsoft.com/en-us/sysinternals/dd535533) does a good job of correlating information on native and managed memory and presenting it in a nice UI. The same information can be gathered using the techniques below, but this is way easier and a nice place to start. Sadly, it doesn't work on dump files, you need a live process.
2) The "!address -summary" output is a rollup of the more detailed "!address" output. I found it useful to drop the detailed output into Excel and run some pivots. Using this technique I discovered that a large number of bytes listed as "<unclassified>" were actually MEM_IMAGE pages, likely copies of data pages that were loaded when the DLLs were loaded but then copied when the data was changed. I could also filter to large regions and drill in on specific addresses. Poking around in the memory dump with a toothpick and lots of praying is painful, but it can be revealing.
3) Finally, I did a poor man's version of the vmmap.exe technique above. I loaded up the dump file, opened a log, and ran !address, !eeheap, !heap, and !threads. I also targeted the thread environment blocks listed in ~*k with !teb. I closed the log file and loaded it up in my favorite editor. I could then find an unclassified block and search to see if it popped up in the output from one of the more detailed commands. You can pretty quickly correlate native and managed heaps to weed those out of your suspect unclassified regions.
These are all way too manual. I'd love to write a script that would take the output similar to what I generated in technique 3 above and output an mmp file suitable for viewing the vmmap.exe. Some day.
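A starting point for that kind of script might be a simple region-containment check over (base, size, label) tuples pulled from the various logs; this is a sketch with hypothetical addresses and labels, not a parser for real !address output:

```python
def find_region(addr, regions):
    """Return the (base, size, label) region containing addr, if any.
    regions: list of (base, size, label) tuples parsed from windbg logs."""
    for base, size, label in regions:
        if base <= addr < base + size:
            return (base, size, label)
    return None

# Hypothetical regions collected from !address / !eeheap / !heap output
regions = [
    (0x00480000, 0x00310000, "native heap"),
    (0x014D0000, 0x000FC000, "stack"),
    (0x80D92000, 0x0F25E000, "<unclassified>"),
]
print(find_region(0x80E00000, regions))  # lands in the unclassified region
```

The idea is to run this check for each suspect unclassified address against regions gathered from the more detailed commands, weeding out everything that a known heap or stack can explain.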
One last note: I correlated vmmap.exe's output with the !address output and noted the types of regions that vmmap could identify from various sources (similar to what !heap and !eeheap use) but that !address didn't know about. That is, these are things that vmmap.exe labeled but !address didn't:
.data
.pdata
.rdata
.text
64-bit thread stack
Domain 1
Domain 1 High Frequency Heap
Domain 1 JIT Code Heap
Domain 1 Low Frequency Heap
Domain 1 Virtual Call Stub
Domain 1 Virtual Call Stub Lookup Heap
Domain 1 Virtual Call Stub Resolve Heap
GC
Large Object Heap
Native heaps
Thread Environment Blocks
There were still a lot of "private" bytes unaccounted for, but again, I'm able to narrow the problem if I can weed these out.
Hope this gives you some ideas on how to investigate. I'm in the same boat so I'd appreciate what you find, too. Thanks!
"Usage Summary" tells you that you have 3696 unclassified regions totalling 1.733 GB.
"Largest Region" tells you that the largest of the unclassified regions is 242 MB.
The rest of the unclassified regions (3695 of them) together make up the difference to 1.733 GB.
Try doing a !heap -s and summing the Virt column to see the size of the native heaps; I think these also fall into the unclassified bucket.
(NB: earlier versions show the native heap explicitly in !address -summary.)
I keep a copy of Debugging Tools for Windows 6.11.1.404, which seems to be able to display something more meaningful for "unclassified".
With that version, I see a list of TEB addresses and then this:
0:000> !address -summary
--------- PEB fffde000 not found ----
TEB fffdd000 in range fffdb000 fffde000
TEB fffda000 in range fffd8000 fffdb000
...snip...
TEB fe01c000 in range fe01a000 fe01d000
ProcessParametrs 002c15e0 in range 002c0000 003c0000
Environment 002c0810 in range 002c0000 003c0000
-------------------- Usage SUMMARY --------------------------
TotSize ( KB) Pct(Tots) Pct(Busy) Usage
41f08000 ( 1080352) : 25.76% 34.88% : RegionUsageIsVAD
42ecf000 ( 1096508) : 26.14% 00.00% : RegionUsageFree
5c21000 ( 94340) : 02.25% 03.05% : RegionUsageImage
c900000 ( 205824) : 04.91% 06.64% : RegionUsageStack
0 ( 0) : 00.00% 00.00% : RegionUsageTeb
68cf8000 ( 1717216) : 40.94% 55.43% : RegionUsageHeap
0 ( 0) : 00.00% 00.00% : RegionUsagePageHeap
0 ( 0) : 00.00% 00.00% : RegionUsagePeb
0 ( 0) : 00.00% 00.00% : RegionUsageProcessParametrs
0 ( 0) : 00.00% 00.00% : RegionUsageEnvironmentBlock
Tot: ffff0000 (4194240 KB) Busy: bd121000 (3097732 KB)
-------------------- Type SUMMARY --------------------------
TotSize ( KB) Pct(Tots) Usage
42ecf000 ( 1096508) : 26.14% : <free>
5e6e000 ( 96696) : 02.31% : MEM_IMAGE
28ed000 ( 41908) : 01.00% : MEM_MAPPED
b49c6000 ( 2959128) : 70.55% : MEM_PRIVATE
-------------------- State SUMMARY --------------------------
TotSize ( KB) Pct(Tots) Usage
9b4d1000 ( 2544452) : 60.67% : MEM_COMMIT
42ecf000 ( 1096508) : 26.14% : MEM_FREE
21c50000 ( 553280) : 13.19% : MEM_RESERVE
Largest free region: Base bc480000 - Size 38e10000 (931904 KB)
With my "current" version (6.12.2.633) I get the following from the same dump. Two things I note:
The <unclassified> data seems to be the sum of HeapAlloc (RegionUsageHeap) and VirtualAlloc (RegionUsageIsVAD).
The lovely EFAIL error, which is no doubt partly responsible for the missing data!
I'm not sure how that'll help you with your managed code, but I think it actually answers the original question ;-)
0:000> !address -summary
Failed to map Heaps (error 80004005)
--- Usage Summary ---------------- RgnCount ----------- Total Size -------- %ofBusy %ofTotal
<unclassified> 7171 aab21000 ( 2.667 Gb) 90.28% 66.68%
Free 637 42ecf000 ( 1.046 Gb) 26.14%
Stack 603 c900000 ( 201.000 Mb) 6.64% 4.91%
Image 636 5c21000 ( 92.129 Mb) 3.05% 2.25%
TEB 201 c9000 ( 804.000 kb) 0.03% 0.02%
ActivationContextData 14 11000 ( 68.000 kb) 0.00% 0.00%
CsrSharedMemory 1 5000 ( 20.000 kb) 0.00% 0.00%
--- Type Summary (for busy) ------ RgnCount ----------- Total Size -------- %ofBusy %ofTotal
MEM_PRIVATE 7921 b49c6000 ( 2.822 Gb) 95.53% 70.55%
MEM_IMAGE 665 5e6e000 ( 94.430 Mb) 3.12% 2.31%
MEM_MAPPED 40 28ed000 ( 40.926 Mb) 1.35% 1.00%
--- State Summary ---------------- RgnCount ----------- Total Size -------- %ofBusy %ofTotal
MEM_COMMIT 5734 9b4d1000 ( 2.427 Gb) 82.14% 60.67%
MEM_FREE 637 42ecf000 ( 1.046 Gb) 26.14%
MEM_RESERVE 2892 21c50000 ( 540.313 Mb) 17.86% 13.19%
--- Protect Summary (for commit) - RgnCount ----------- Total Size -------- %ofBusy %ofTotal
PAGE_READWRITE 4805 942bd000 ( 2.315 Gb) 78.37% 57.88%
PAGE_READONLY 215 3cbb000 ( 60.730 Mb) 2.01% 1.48%
PAGE_EXECUTE_READ 78 2477000 ( 36.465 Mb) 1.21% 0.89%
PAGE_WRITECOPY 74 75b000 ( 7.355 Mb) 0.24% 0.18%
PAGE_READWRITE|PAGE_GUARD 402 3d6000 ( 3.836 Mb) 0.13% 0.09%
PAGE_EXECUTE_READWRITE 80 3b0000 ( 3.688 Mb) 0.12% 0.09%
PAGE_EXECUTE_WRITECOPY 80 201000 ( 2.004 Mb) 0.07% 0.05%
--- Largest Region by Usage ----------- Base Address -------- Region Size ----------
<unclassified> 786000 17d9000 ( 23.848 Mb)
Free bc480000 38e10000 ( 910.063 Mb)
Stack 6f90000 fd000 (1012.000 kb)
Image 3c3c000 ebe000 ( 14.742 Mb)
TEB fdf8f000 1000 ( 4.000 kb)
ActivationContextData 190000 4000 ( 16.000 kb)
CsrSharedMemory 7efe0000 5000 ( 20.000 kb)
Your best bet would be to use the !eeheap and !gchandles commands in windbg (http://msdn.microsoft.com/en-us/library/bb190764.aspx) and see if you can find what might be leaking or wrong that way.
Unfortunately, you probably won't get the exact help you're looking for, because diagnosing these kinds of issues is almost always very time-intensive, and anything beyond the simplest cases requires someone to do a full analysis of the dump. Basically, it's unlikely that someone will be able to point you towards a direct answer on Stack Overflow. Mostly, people will be able to point you to commands that might be helpful. You're going to have to do a lot of digging to find out more about what is happening.
I recently spent some time diagnosing a customer's issue where their app was using 70 GB before terminating (likely due to hitting an IIS app pool recycling limit, but still unconfirmed). They sent me a 35 GB memory dump. Based on my recent experience, here are some observations I can make about what you've provided:
In the !heap -s output, 284 MB of the 1.247 GB is shown in the Commit column. If you were to open this dump in DebugDiag, it would tell you that heap 0x60000 has 1 GB of committed memory. Add up the commit sizes of the 11 segments reported and you'll find that they only come to about 102 MB, not 1 GB. So annoying.
The "missing" memory isn't missing. It's actually hinted at in the !heap -s output by the "Virtual block:" lines. Unfortunately, !heap -s sucks and doesn't show the end address properly, and therefore reports the size as 0. Check the output of the following commands:
!address 17e0000
!address 45bd0000
!address 6fff0000
It will report the proper end address and therefore an accurate "Region Size". Even better, it gives a succinct version of the region size. If you add the sizes of those 3 regions to the 102 MB, you should be pretty close to 1 GB.
So what's in them? Well, you can look using dq. By spelunking you might find a hint as to why they were allocated. Perhaps your managed code calls some 3rd-party code which has a native side.
You might be able to find references into your heap by using !heap 6fff0000 -x -v. If there are references, you can see which memory regions they live in by using !address again. In my customer's issue I found a reference that lived in a region with "Usage: Stack". A "More info:" hint referenced the stack's thread, which happened to have some large basic_string append/copy calls at the top.
