Information about perf block events - linux-kernel

I need information about the following block (disk I/O) events available in perf. Where can I find detailed information about each event?
block:block_bio_backmerge
block:block_bio_bounce
block:block_bio_complete
block:block_bio_frontmerge
block:block_bio_queue
block:block_bio_remap
block:block_dirty_buffer
block:block_getrq
block:block_plug
block:block_rq_complete
block:block_rq_insert
block:block_rq_issue
block:block_rq_remap
block:block_rq_requeue
block:block_sleeprq
block:block_split
block:block_touch_buffer
block:block_unplug
Please help me with this.

As @osgx has already mentioned, these are software tracepoint events, part of the large set of tracepoints predefined in the kernel sources. You can list them by running:
sudo perf list | grep Tracepoint
The block based tracepoint events can give a fine detail of what the storage devices are doing when you run certain commands.
sudo perf record -e block:block_rq_complete -a sleep 10
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 1.412 MB perf.data (340 samples) ]
The block_rq_complete tracepoint traces I/O requests that have completed, either fully or partially.
sudo perf script
swapper 0 [006] 205791.409875: block:block_rq_complete: 8,16 RM () 390439208 + 8 [0]
swapper 0 [006] 205791.410439: block:block_rq_complete: 8,16 RM () 390439256 + 8 [0]
chrome 9526 [006] 205793.149462: block:block_rq_complete: 8,16 W () 424979920 + 8 [0]
chrome 9526 [006] 205793.149781: block:block_rq_complete: 8,16 W () 490387000 + 352 [0]
swapper 0 [006] 205794.547686: block:block_rq_complete: 8,16 WS () 432636024 + 1344 [0]
swapper 0 [006] 205794.558292: block:block_rq_complete: 8,16 WS () 432637368 + 1344 [0]
swapper 0 [006] 205794.566718: block:block_rq_complete: 8,16 WS () 432638712 + 544 [0]
swapper 0 [006] 205794.599791: block:block_rq_complete: 8,16 FF () 18446744073709551615 + 0 [0]
swapper 0 [006] 205794.599868: block:block_rq_complete: 8,16 WS () 432639256 + 8 [0]
swapper 0 [006] 205794.600792: block:block_rq_complete: 8,16 FF () 18446744073709551615 + 0 [0]
swapper 0 [006] 205794.600798: block:block_rq_complete: 8,16 WS () 432639256 + 0 [0]
swapper 0 [006] 205798.268989: block:block_rq_complete: 8,16 W () 462924840 + 8 [0]
swapper 0 [006] 205798.269079: block:block_rq_complete: 8,16 W () 462934720 + 8 [0]
swapper 0 [006] 205798.269118: block:block_rq_complete: 8,16 W () 462934752 + 8 [0]
swapper 0 [006] 205798.269158: block:block_rq_complete: 8,16 W () 462935416 + 8 [0]
swapper 0 [006] 205798.269195: block:block_rq_complete: 8,16 W () 462935592 + 8 [0]
swapper 0 [006] 205798.269241: block:block_rq_complete: 8,16 W () 476143872 + 8 [0]
swapper 0 [006] 205798.269265: block:block_rq_complete: 8,16 W () 476144624 + 8 [0]
swapper 0 [006] 205798.269283: block:block_rq_complete: 8,16 W () 476145360 + 8 [0]
The first 5 columns of the output are well understood (process name/command, PID, CPU, timestamp, and the name of the event that was sampled), so we'll start from the 6th column:
8,16 refers to the major and minor number of the device.
ls -l /dev/sdb
brw-rw---- 1 root disk 8, 16 Apr 8 07:52 /dev/sdb
The characters R, W, D, M, S and F describe the I/O operation being performed: 'R' is a read, 'W' a write, 'D' a discard, 'M' metadata, 'S' synchronous and 'F' a flush.
The numbers following the empty brackets () are the sector on the device where the I/O started and the number of sectors that were completed.
[0] is the error status of the request; 0 means it completed without error.
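The field layout described above can be turned into a small parser. The following is a minimal Python sketch, assuming the line format matches the `perf script` sample shown earlier; the regular expression, the `parse_rq_complete` helper and the field names are illustrative and not part of perf itself.

```python
import re

# Layout assumed from the sample output above:
#   comm pid [cpu] timestamp: block:block_rq_complete: MAJ,MIN RWBS () SECTOR + NR [ERR]
LINE_RE = re.compile(
    r"^\s*(?P<comm>.+?)\s+(?P<pid>\d+)\s+\[(?P<cpu>\d+)\]\s+"
    r"(?P<time>\d+\.\d+):\s+block:block_rq_complete:\s+"
    r"(?P<major>\d+),(?P<minor>\d+)\s+(?P<rwbs>[A-Z]+)\s+\((?P<cmd>[^)]*)\)\s+"
    r"(?P<sector>\d+)\s+\+\s+(?P<nr_sectors>\d+)\s+\[(?P<error>-?\d+)\]\s*$"
)


def parse_rq_complete(line):
    """Return the fields of one block_rq_complete line as a dict, or None if it does not match."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    rec = match.groupdict()
    for key in ("pid", "cpu", "major", "minor", "sector", "nr_sectors", "error"):
        rec[key] = int(rec[key])
    rec["time"] = float(rec["time"])
    return rec


# One line copied from the perf script output above.
sample = ("chrome 9526 [006] 205793.149781: block:block_rq_complete: "
          "8,16 W () 490387000 + 352 [0]")
rec = parse_rq_complete(sample)
print(rec["major"], rec["minor"], rec["rwbs"], rec["sector"], rec["nr_sectors"], rec["error"])
```

Since the output format can differ between kernel versions, the pattern would need adjusting if the tracepoint prints its fields differently on your system.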
Some level of information about most of the other events can be obtained here -
block events summary
Note that the definitions of these events keep changing and could be different for the Linux kernel you are using. I have attached the summary for kernel version 5.6.


Raspberry Pi 3B+ with multiple CAN buses (MCP2515)

I'm trying to connect 6 MCP2515 controllers over spi0. I have adapted an SPI overlay to add the necessary chip select lines. My new SPI overlay looks like this:
/dts-v1/;
/plugin/;

/ {
    compatible = "brcm,bcm2835", "brcm,bcm2836", "brcm,bcm2708", "brcm,bcm2709";

    fragment@0 {
        target = <&spi0>;
        frag0: __overlay__ {
            #address-cells = <1>;
            #size-cells = <0>;
            pinctrl-0 = <&spi0_pins &spi0_cs_pins>;
            status = "okay";
            cs-gpios = <&gpio 8 1>, <&gpio 7 1>, <&gpio 22 1>, <&gpio 23 1>, <&gpio 24 1>, <&gpio 25 1>;

            spidev@0 {
                compatible = "spidev";
                reg = <0>; /* CE0 */
                #address-cells = <1>;
                #size-cells = <0>;
                spi-max-frequency = <500000>;
            };

            spidev@1 {
                compatible = "spidev";
                reg = <1>; /* CE1 */
                #address-cells = <1>;
                #size-cells = <0>;
                spi-max-frequency = <500000>;
            };

            spidev@2 {
                compatible = "spidev";
                reg = <2>; /* CE2 */
                #address-cells = <1>;
                #size-cells = <0>;
                spi-max-frequency = <500000>;
            };

            spidev@3 {
                compatible = "spidev";
                reg = <3>; /* CE3 */
                #address-cells = <1>;
                #size-cells = <0>;
                spi-max-frequency = <500000>;
            };

            spidev@4 {
                compatible = "spidev";
                reg = <4>; /* CE4 */
                #address-cells = <1>;
                #size-cells = <0>;
                spi-max-frequency = <500000>;
            };

            spidev@5 {
                compatible = "spidev";
                reg = <5>; /* CE5 */
                #address-cells = <1>;
                #size-cells = <0>;
                spi-max-frequency = <500000>;
            };
        };
    };

    fragment@1 {
        target = <&gpio>;
        __overlay__ {
            spi0_cs_pins: spi0_cs_pins {
                brcm,pins = <7 8 22 23 24 25>;
                brcm,function = <1>; /* out */
            };
        };
    };
};
With this SPI overlay I have the 6 SPI devices in /sys/bus/spi/devices/
spi0.0 spi0.1 spi0.2 spi0.3 spi0.4 spi0.5
I have also made new overlays for the mcp2515 (can0 to can5) in order to bind them with the new chip select lines of spi0.
My /boot/config.txt looks like this:
dtoverlay=spi-gpio-cs-new
dtoverlay=mcp2515-can0,oscillator=8000000,interrupt=5
dtoverlay=mcp2515-can4,oscillator=8000000,interrupt=26
dtoverlay=mcp2515-can5,oscillator=8000000,interrupt=27
dmesg | grep mcp
[ 7.870207] mcp251x spi0.5 can0: MCP2515 successfully initialized.
[ 7.892886] mcp251x spi0.4 can1: MCP2515 successfully initialized.
[ 7.908725] mcp251x spi0.0 can2: MCP2515 successfully initialized.
ifconfig
can0: flags=193<UP,RUNNING,NOARP> mtu 16
unspec 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00 txqueuelen 10 (UNSPEC)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
can1: flags=193<UP,RUNNING,NOARP> mtu 16
unspec 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00 txqueuelen 10 (UNSPEC)
RX packets 36 bytes 180 (180.0 B)
RX errors 0 dropped 36 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
can2: flags=193<UP,RUNNING,NOARP> mtu 16
unspec 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00 txqueuelen 10 (UNSPEC)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
I only have 3 MCP2515 boards at my disposal for the moment. I have modified their power supply so that the CAN transceiver runs at 5V and the CAN controller at 3V, in order not to damage the Raspberry Pi GPIOs. The boards have been tested individually, and I was able to send and receive CAN frames with each of them. They are connected to the Raspberry Pi like this
Out of these 3 interfaces only can1 (spi0.4) is working! Using candump I can see CAN frames on the network.
My question is: why are can0 and can2 silent when I try to send or receive CAN messages (candump and cansend)?
Kernel interrupt table
CPU0 CPU1 CPU2 CPU3
17: 217 0 0 0 ARMCTRL-level 1 Edge 3f00b880.mailbox
18: 47 0 0 0 ARMCTRL-level 2 Edge VCHIQ doorbell
40: 0 0 0 0 ARMCTRL-level 48 Edge bcm2708_fb DMA
42: 352 0 0 0 ARMCTRL-level 50 Edge DMA IRQ
44: 3062 0 0 0 ARMCTRL-level 52 Edge DMA IRQ
45: 0 0 0 0 ARMCTRL-level 53 Edge DMA IRQ
48: 0 0 0 0 ARMCTRL-level 56 Edge DMA IRQ
56: 14104 0 0 0 ARMCTRL-level 64 Edge dwc_otg, dwc_otg_pcd, dwc_otg_hcd:usb1
78: 0 0 0 0 ARMCTRL-level 86 Edge 3f204000.spi
80: 158 0 0 0 ARMCTRL-level 88 Edge mmc0
81: 7450 0 0 0 ARMCTRL-level 89 Edge uart-pl011
86: 4207 0 0 0 ARMCTRL-level 94 Edge mmc1
161: 0 0 0 0 bcm2836-timer 0 Edge arch_timer
162: 1813 1900 2264 1528 bcm2836-timer 1 Edge arch_timer
165: 0 0 0 0 bcm2836-pmu 9 Edge arm-pmu
166: 0 0 0 0 lan78xx-irqs 17 Edge usb-001:004:01
167: 0 0 0 0 pinctrl-bcm2835 26 Edge spi0.5
168: 6 0 0 0 pinctrl-bcm2835 27 Edge spi0.4
169: 0 0 0 0 pinctrl-bcm2835 5 Level spi0.0
FIQ: usb_fiq
IPI0: 0 0 0 0 CPU wakeup interrupts
IPI1: 0 0 0 0 Timer broadcast interrupts
IPI2: 1469 2966 3711 4460 Rescheduling interrupts
IPI3: 203 798 542 445 Function call interrupts
IPI4: 0 0 0 0 CPU stop interrupts
IPI5: 55 85 41 24 IRQ work interrupts
IPI6: 0 0 0 0 completion interrupts
Err: 0
I can see from this table that the SPI devices have interrupts assigned, but only the one for spi0.4 actually fires. How can I activate the other 2 interrupts, for spi0.0 and spi0.5?
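The check done by eye on the interrupt table can also be scripted. Below is a minimal Python sketch, assuming the text has the same layout as `/proc/interrupts` above; `irq_totals` is a hypothetical helper and it simply uses the last whitespace-separated token of each row as the label. A zero count only shows that the line never fired, not why.

```python
def irq_totals(text):
    """Map each interrupt's trailing label to its count summed over all CPUs."""
    lines = text.splitlines()
    ncpu = len(lines[0].split())  # header row: CPU0 CPU1 ...
    totals = {}
    for line in lines[1:]:
        head, _, rest = line.partition(":")
        fields = rest.split()
        counts = fields[:ncpu]
        if len(counts) < ncpu or not all(c.isdigit() for c in counts):
            continue  # rows such as "Err:" or "FIQ:" have no per-CPU columns
        # Assumption: the last token is the most useful label (e.g. "spi0.4").
        label = fields[-1] if len(fields) > ncpu else head.strip()
        totals[label] = sum(int(c) for c in counts)
    return totals


# Rows copied from the interrupt table above.
sample = """CPU0 CPU1 CPU2 CPU3
167: 0 0 0 0 pinctrl-bcm2835 26 Edge spi0.5
168: 6 0 0 0 pinctrl-bcm2835 27 Edge spi0.4
169: 0 0 0 0 pinctrl-bcm2835 5 Level spi0.0
Err: 0"""

totals = irq_totals(sample)
silent = sorted(name for name, n in totals.items() if name.startswith("spi") and n == 0)
print(silent)
```

On a live system the same function could be fed the contents of `/proc/interrupts` instead of the sample string.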
It's working!!!
As I mentioned in my first post, only one board was working (can1, spi0.4). After I rechecked the two non-working boards, I discovered that one of them had hardware damage, which also kept the other board from working. In conclusion, my SPI and MCP overlays are fully functional!
Regards
Antmar
[ 7.846788] mcp251x spi0.0 can0: MCP2515 successfully initialized.
[ 7.888039] mcp251x spi0.1 can1: MCP2515 successfully initialized.
[ 7.924747] mcp251x spi0.2 can2: MCP2515 successfully initialized.
[ 7.936608] mcp251x spi0.3 can3: MCP2515 successfully initialized.
can0 241 [5] 67 A4 31 F0 C7
can1 2A0 [2] 02 93
can2 241 [5] 67 A4 31 F0 CB
can3 240 [2] 02 6A
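To compare what each interface receives, the candump lines can be decoded into an interface name, a numeric ID and a payload. This is a minimal Python sketch, assuming the default `candump` text layout shown above (interface, hexadecimal ID, length in brackets, data bytes); `parse_candump` is a hypothetical helper name.

```python
def parse_candump(line):
    """Split one candump line into (interface, CAN ID, payload bytes)."""
    iface, can_id, dlc, *data = line.split()
    length = int(dlc.strip("[]"))
    payload = bytes(int(byte, 16) for byte in data)
    if len(payload) != length:
        raise ValueError(f"expected {length} data bytes, got {len(payload)}")
    return iface, int(can_id, 16), payload


# Lines copied from the candump output above.
lines = [
    "can0 241 [5] 67 A4 31 F0 C7",
    "can1 2A0 [2] 02 93",
    "can2 241 [5] 67 A4 31 F0 CB",
    "can3 240 [2] 02 6A",
]
frames = [parse_candump(line) for line in lines]

# Group interfaces by the CAN ID they saw.
by_id = {}
for iface, can_id, payload in frames:
    by_id.setdefault(can_id, []).append(iface)
print({hex(can_id): ifaces for can_id, ifaces in by_id.items()})
```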

How to Print the Constraint Values or Results of a JuMP Model in Julia

I use Julia v1.4.1 and I have been trying to print/access the constraint values of my model below by following the instructions in the document at https://www.juliaopt.org/JuMP.jl/stable/solutions/#JuMP.value, but I keep getting errors. I shall very much appreciate your help with this task.
Thank you in advance.
using JuMP, Gurobi
## Define model Object & Parameters:----------#
m = Model(optimizer_with_attributes(Gurobi.Optimizer, "FeasibilityTol"=>1e-6, "MIPGap"=>3e-4, "IntFeasTol"=>1e-9, "TimeLimit"=>18000, "IterationLimit"=>500))
V = 5
dist =
[999 8 4 9 9
8 999 6 7 10
4 6 999 5 6
9 7 5 999 4
9 10 6 4 999]
cost =
[999 58 59 55 56
57 999 54 60 54
59 59 999 57 57
58 56 56 999 60
55 58 54 57 999]
death =
[9 1 1 1 1
1 9 1 1 1
1 1 9 1 1
1 1 1 9 1
1 1 1 1 9]
## define Variables:------------#
@variable(m, x[i=1:V,j=1:V], Bin) #decision binary variable
@variable(m, 0.0<=Q<=1.0) #mini_max variable
#3 Assign weights:________________________________#
w = Pair{Tuple{Int64,Int64},Float64}[]
for i=1:V, j=1:V
push!( w , (i,j) => i != j ? 0.3 : 0.7)
end
## define Objective function:---------------#
@objective(m, Min, Q) #variable for the min_max weighted percentage deviation from the target values for the goals.
### MOLP/MOMP/Goal/target:________________________________#
for (key, value) in w
    @constraints(m, begin
        (value*(sum(dist[i,j]*x[i,j] for i=1:V, j=1:V )-29)/29) <= Q
        (value*(sum(cost[i,j]*x[i,j] for i=1:V, j=1:V )-277)/277) <= Q
        (value*(sum(death[i,j]*x[i,j] for i=1:V, j=1:V )-5)/5) <= Q
    end)
end
##printing model results;
print(m)
status = JuMP.optimize!(m)
println("Objective value: ------> ", JuMP.objective_value(m))
Below are the different ways I am trying to print out the values of the constraints and their associated error messages:
julia> JuMP.value(DIST)
ERROR: `JuMP.value` is not defined for collections of JuMP types. Use Julia's broadcast syntax instead: `JuMP.value.(x)`.
Stacktrace:
[1] error(::String) at .\error.jl:33
[2] value(::Array{VariableRef,1}) at C:\Users\Doe67\.julia\packages\JuMP\MnJQc\src\variables.jl:962
[3] top-level scope at REPL[37]:1
julia> JuMP.value.(DIST)
ERROR: OptimizeNotCalled()
Stacktrace:
[1] _moi_get_result(::MathOptInterface.Utilities.CachingOptimizer{MathOptInterface.AbstractOptimizer,MathOptInterface.Utilities.UniversalFallback{MathOptInterface.Utilities.Model{Float64}}}, ::MathOptInterface.VariablePrimal, ::Vararg{Any,N} where N) at C:\Users\Doe67\.julia\packages\JuMP\MnJQc\src\JuMP.jl:811
[2] get(::Model, ::MathOptInterface.VariablePrimal, ::VariableRef) at C:\Users\Doe67\.julia\packages\JuMP\MnJQc\src\JuMP.jl:843
[3] value(::VariableRef; result::Int64) at C:\Users\Doe67\.julia\packages\JuMP\MnJQc\src\variables.jl:767
[4] value at C:\Users\Doe679\.julia\packages\JuMP\MnJQc\src\variables.jl:767 [inlined]
[5] _broadcast_getindex_evalf at .\broadcast.jl:631 [inlined]
[6] _broadcast_getindex at .\broadcast.jl:604 [inlined]
[7] getindex at .\broadcast.jl:564 [inlined]
[8] macro expansion at .\broadcast.jl:910 [inlined]
[9] macro expansion at .\simdloop.jl:77 [inlined]
[10] copyto! at .\broadcast.jl:909 [inlined]
[11] copyto! at .\broadcast.jl:864 [inlined]
[12] copy at .\broadcast.jl:840 [inlined]
[13] materialize(::Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{1},Nothing,typeof(value),Tuple{Array{VariableRef,1}}}) at .\broadcast.jl:820
[14] top-level scope at REPL[38]:1
julia> value(DIST)
ERROR: `JuMP.value` is not defined for collections of JuMP types. Use Julia's broadcast syntax instead: `JuMP.value.(x)`.
Stacktrace:
[1] error(::String) at .\error.jl:33
[2] value(::Array{VariableRef,1}) at C:\Users\Doe67\.julia\packages\JuMP\MnJQc\src\variables.jl:962
[3] top-level scope at REPL[39]:1
I wanted to share this information with anyone who might have a similar need or issue in the future. The directions provided by @blegat and @miles.lubin at https://discourse.julialang.org/t/how-to-print-the-values-of-constraints/40040/11 were very helpful in solving this problem. See the correction below. Thanks
### Constraints MOLP/MOMP/Goal/target:________________________________#
f(i, j) = i != j ? 0.3 : 0.7
DIST = @constraint(m, (sum(f(i,j)*dist[i,j]*x[i,j] for i=1:V, j=1:V) - 29)/29 <= Q)
COST = @constraint(m, (sum(f(i,j)*cost[i,j]*x[i,j] for i=1:V, j=1:V) - 277)/277 <= Q)
DEATH = @constraint(m, (sum(f(i,j)*death[i,j]*x[i,j] for i=1:V, j=1:V) - 2)/2 <= Q)
##printing model results:________________________________#
print(m)
status = JuMP.optimize!(m)
println("Objective value: ---> ", JuMP.objective_value(m))
println("Distance goal_target constraint: ---> ", JuMP.value(DIST))
println("Cost goal_target constraint: ---> ", JuMP.value(COST))
println("Expected Death goal_target constraint: ---> ", JuMP.value(DEATH))
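As a sanity check on the numbers a solver reports, the deviation expression on the left of `<= Q` can be evaluated by hand for a fixed assignment. The Python sketch below does that for the distance and cost goals, using the data from the question. The assignment `x` is a hypothetical cycle chosen only for illustration, not a solver result, and `goal_deviation` is an illustrative helper rather than anything from JuMP.

```python
V = 5
dist = [
    [999, 8, 4, 9, 9],
    [8, 999, 6, 7, 10],
    [4, 6, 999, 5, 6],
    [9, 7, 5, 999, 4],
    [9, 10, 6, 4, 999],
]
cost = [
    [999, 58, 59, 55, 56],
    [57, 999, 54, 60, 54],
    [59, 59, 999, 57, 57],
    [58, 56, 56, 999, 60],
    [55, 58, 54, 57, 999],
]


def weight(i, j):
    # Same rule as f(i, j) in the corrected model: 0.3 off the diagonal, 0.7 on it.
    return 0.3 if i != j else 0.7


def goal_deviation(coeff, x, target):
    """Evaluate (sum(weight * coeff * x) - target) / target for a fixed 0/1 matrix x."""
    total = sum(weight(i, j) * coeff[i][j] * x[i][j] for i in range(V) for j in range(V))
    return (total - target) / target


# Hypothetical assignment (not a solver result): the cycle 1 -> 3 -> 4 -> 5 -> 2 -> 1,
# written with 0-based indices.
arcs = [(0, 2), (2, 3), (3, 4), (4, 1), (1, 0)]
x = [[1 if (i, j) in arcs else 0 for j in range(V)] for i in range(V)]

dev_dist = goal_deviation(dist, x, 29)
dev_cost = goal_deviation(cost, x, 277)
print(dev_dist, dev_cost)
```

Any feasible value of Q has to be at least as large as each of these deviations for the chosen assignment.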

EC2 high stolen time without load

I see a very high percentage of stolen time on an EC2 web server (t2.micro) that is under no load (one current user), together with a high page load time. Is there a correlation between the high load time and the high stolen time? I have the same symptoms on another server of class t2.medium.
Do you have an explanation?
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 2  0      0  79824   7428 479172    0    0     0     0   52   49 18  0  0  0 82
 1  0      0  79792   7436 479172    0    0     0     6   54   49 18  0  0  0 82
 1  0      0  79824   7444 479172    0    0     0     5   54   51 18  0  0  0 82
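The `st` column that vmstat prints is derived from the steal counter in `/proc/stat`. The Python sketch below shows that calculation for two snapshots of the aggregate `cpu` line. The two sample lines are hypothetical values chosen to reproduce the 18% user and 82% steal split seen above, and `steal_percent` is an illustrative helper name.

```python
def steal_percent(before, after):
    """Steal share (in %) of CPU time between two aggregate 'cpu' lines of /proc/stat."""
    b = [int(v) for v in before.split()[1:]]
    a = [int(v) for v in after.split()[1:]]
    # Field order: user nice system idle iowait irq softirq steal guest guest_nice
    delta = [x - y for x, y in zip(a, b)]
    total = sum(delta[:8])  # guest time is already included in user/nice
    return 100.0 * delta[7] / total if total else 0.0


# Hypothetical snapshots, not measured data.
before = "cpu 100 0 10 50 0 0 0 400 0 0"
after = "cpu 118 0 10 50 0 0 0 482 0 0"
print(steal_percent(before, after))
```

On a live instance the two strings would be the first line of `/proc/stat`, read a few seconds apart.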

Fastest way to find the sign of different square

Given an image I and two matrices m_1 and m_2 (the same size as I), the function f is defined as:
Because my design only needs the sign of f, the function f can be rewritten as follows:
I think the second formula is faster than the first because:
It can ignore the square term.
It can compute the sign directly, instead of the two steps in the first equation: compute f, then check its sign.
Do you agree with me? Do you have another, faster formula for f?
I =[16 23 11 42 10
11 21 22 24 30
16 22 154 155 156
25 28 145 151 156
11 38 147 144 153];
m1 =[0 0 0 0 0
0 0 22 11 0
0 23 34 56 0
0 56 0 0 0
0 11 0 0 0];
m2 =[0 0 0 0 0
0 0 12 11 0
0 22 111 156 0
0 32 0 0 0
0 12 0 0 0];
The output f is
f =[1 1 1 1 1
1 1 -1 1 1
1 1 1 1 1
1 1 1 1 1
1 1 1 1 1]
I implemented the first way in MATLAB, but I have not finished the second. Could you help me implement the second way and compare the two?
UPDATE: I am adding the code from chepyle and Divakar to make the question clearer. Note that both give the same result f as above.
function compare()
I =[16 23 11 42 10
11 21 22 24 30
16 22 154 155 156
25 28 145 151 156
11 38 147 144 153];
m1 =[0 0 0 0 0
0 0 22 11 0
0 23 34 56 0
0 56 0 0 0
0 11 0 0 0];
m2 =[0 0 0 0 0
0 0 12 11 0
0 22 111 156 0
0 32 0 0 0
0 12 0 0 0];
function f=first_way()
f=sign((I-m1).^2-(I-m2).^2);
f(f==0)=1;
end
function f= second_way()
f = double(abs(I-m1) >= abs(I-m2));
f(f==0) = -1;
end
function f= third_way()
v1=abs(I-m1);
v2=abs(I-m2);
f= int8(v1>v2) + -1*int8(v1<v2); % need to convert to int from logical
f(f==0) = 1;
end
disp(['First way : ' num2str(timeit(@first_way))])
disp(['Second way: ' num2str(timeit(@second_way))])
disp(['Third way : ' num2str(timeit(@third_way))])
end
First way : 1.2897e-05
Second way: 1.9381e-05
Third way : 2.0077e-05
This seems to be comparable and might be a wee bit faster at times than the original approach -
f = sign(abs(I-m1) - abs(I-m2)) + sign(abs(m1-m2)) + ...
sign(abs(2*I-m1-m2)) - 1 -sign(abs(2*I-m1-m2) + abs(m1-m2))
Benchmarking Code
%// Create random inputs
N = 5000;
I = randi(1000,N,N);
m1 = randi(1000,N,N);
m2 = randi(1000,N,N);
num_iter = 20; %// Number of iterations for all approaches
%// Warm up tic/toc.
for k = 1:100000
tic(); elapsed = toc();
end
disp('------------------------- With Original Approach')
tic
for iter = 1:num_iter
out1 = sign((I-m1).^2-(I-m2).^2);
out1(out1==0)=-1;
end
toc, clear out1
disp('------------------------- With Proposed Approach')
tic
for iter = 1:num_iter
out2 = sign(abs(I-m1) - abs(I-m2)) + sign(abs(m1-m2)) + ...
sign(abs(2*I-m1-m2)) - 1 -sign(abs(2*I-m1-m2) + abs(m1-m2));
end
toc
Results
------------------------- With Original Approach
Elapsed time is 1.751966 seconds.
------------------------- With Proposed Approach
Elapsed time is 1.681263 seconds.
There is a problem with the accuracy of the second formula, but for the sake of comparison, here is how I would implement it in MATLAB, along with a third approach that avoids squaring and the sign() function, in line with your intent. Note that MATLAB's matrix and sign functions are already well optimized, so the second and third approaches are both slower.
function compare()
I =[16 23 11 42 10
11 21 22 24 30
16 22 154 155 156
25 28 145 151 156
11 38 147 144 153];
m1 =[0 0 0 0 0
0 0 22 11 0
0 23 34 56 0
0 56 0 0 0
0 11 0 0 0];
m2 =[0 0 0 0 0
0 0 12 11 0
0 22 111 156 0
0 32 0 0 0
0 12 0 0 0];
function f=first_way()
f=sign((I-m1).^2-(I-m2).^2);
end
function f= second_way()
v1=(I-m1);
v2=(I-m2);
f= int8(v1<=0 & v2>0) + -1* int8(v1>0 & v2<=0);
end
function f= third_way()
v1=abs(I-m1);
v2=abs(I-m2);
f= int8(v1>v2) + -1*int8(v1<v2); % need to convert to int from logical
end
disp(['First way : ' num2str(timeit(@first_way))])
disp(['Second way: ' num2str(timeit(@second_way))])
disp(['Third way : ' num2str(timeit(@third_way))])
end
The output:
First way : 9.4226e-06
Second way: 1.2247e-05
Third way : 1.1546e-05
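The variants that compare absolute values rely on the fact that, for real numbers, a^2 - b^2 and |a| - |b| always have the same sign. The Python sketch below checks this on the matrices from the question using plain lists. It is only a correctness check, not a benchmark, and it maps ties to 1 as the code in the UPDATE does; `by_squares` and `by_abs` are illustrative names.

```python
I = [
    [16, 23, 11, 42, 10],
    [11, 21, 22, 24, 30],
    [16, 22, 154, 155, 156],
    [25, 28, 145, 151, 156],
    [11, 38, 147, 144, 153],
]
m1 = [
    [0, 0, 0, 0, 0],
    [0, 0, 22, 11, 0],
    [0, 23, 34, 56, 0],
    [0, 56, 0, 0, 0],
    [0, 11, 0, 0, 0],
]
m2 = [
    [0, 0, 0, 0, 0],
    [0, 0, 12, 11, 0],
    [0, 22, 111, 156, 0],
    [0, 32, 0, 0, 0],
    [0, 12, 0, 0, 0],
]


def sign(value):
    return (value > 0) - (value < 0)


def by_squares(i, a, b):
    s = sign((i - a) ** 2 - (i - b) ** 2)
    return s if s != 0 else 1  # ties map to 1, as in the UPDATE code


def by_abs(i, a, b):
    s = sign(abs(i - a) - abs(i - b))
    return s if s != 0 else 1


def apply(fn):
    """Apply fn element-wise over I, m1 and m2."""
    return [[fn(i, a, b) for i, a, b in zip(ri, ra, rb)] for ri, ra, rb in zip(I, m1, m2)]


f_sq = apply(by_squares)
f_abs = apply(by_abs)
print(f_sq == f_abs)
for row in f_sq:
    print(row)
```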

native memory leak - how to find callstack of allocation source

Based on the following output of the !address -summary command, I think I have a native memory leak. To determine the call stacks where these allocations happen, I am following the article at http://www.codeproject.com/KB/cpp/MemoryLeak.aspx
0:000> !address -summary
TEB 7efdd000 in range 7efdb000 7efde000
TEB 7efda000 in range 7efd8000 7efdb000
TEB 7efd7000 in range 7efd5000 7efd8000
TEB 7efaf000 in range 7efad000 7efb0000
TEB 7efac000 in range 7efaa000 7efad000
ProcessParametrs 00441b78 in range 00440000 00540000
Environment 004407f0 in range 00440000 00540000
-------------------- Usage SUMMARY --------------------------
TotSize ( KB) Pct(Tots) Pct(Busy) Usage
551a000 ( 87144) : 04.16% 14.59% : RegionUsageIsVAD
5b8d3000 ( 1499980) : 71.53% 00.00% : RegionUsageFree
2cc3000 ( 45836) : 02.19% 07.68% : RegionUsageImage
4ff000 ( 5116) : 00.24% 00.86% : RegionUsageStack
0 ( 0) : 00.00% 00.00% : RegionUsageTeb
1c040000 ( 459008) : 21.89% 76.87% : RegionUsageHeap
0 ( 0) : 00.00% 00.00% : RegionUsagePageHeap
1000 ( 4) : 00.00% 00.00% : RegionUsagePeb
0 ( 0) : 00.00% 00.00% : RegionUsageProcessParametrs
0 ( 0) : 00.00% 00.00% : RegionUsageEnvironmentBlock
Tot: 7fff0000 (2097088 KB) Busy: 2471d000 (597108 KB)
0:000> !heap -s
LFH Key : 0x7fdcf95f
Termination on corruption : DISABLED
Heap Flags Reserv Commit Virt Free List UCR Virt Lock Fast
(k) (k) (k) (k) length blocks cont. heap
-----------------------------------------------------------------------------
00440000 00000002 453568 436656 453568 62 54 32 0 0 LFH
006b0000 00001002 64 16 64 4 2 1 0 0
002b0000 00041002 256 4 256 2 1 1 0 0
00620000 00001002 64 16 64 5 2 1 0 0
00250000 00001002 64 16 64 4 2 1 0 0
007d0000 00041002 256 4 256 0 1 1 0 0
005c0000 00001002 1088 388 1088 7 17 2 0 0 LFH
02070000 00041002 256 4 256 1 1 1 0 0
02270000 00041002 256 144 256 0 1 1 0 0 LFH
04e10000 00001002 3136 1764 3136 384 36 3 0 0 LFH
External fragmentation 21 % (36 free blocks)
-----------------------------------------------------------------------------
But when I run the !heap -p -a command, I don’t get any call stack, just the following. Any ideas on how to get the call stack of the allocation source?
0:000> !heap -p -a 0218e008
address 0218e008 found in
_HEAP @ 4e10000
HEAP_ENTRY Size Prev Flags UserPtr UserSize - state
0218e000 001c 0000 [00] 0218e008 000d4 - (busy)
You should use Deleaker. It's a powerful tool for debugging.
Use valgrind on Linux and Deleaker on Windows.
If you don't get a call stack from !heap -p -a, the likely reason is that gflags was not set up correctly:
Remember to use the correct image name, including .exe.
Try starting gflags interactively and going to the image tab, which might be easier.
Try enabling page heap, which also gives you call stacks.
I know nothing about Windows, but at least on Unix systems a debugger (like gdb on Linux) is useful to understand callstacks.
And you could also circumvent some of your issues by using e.g. Boehm's conservative garbage collector. On many systems you can also hunt memory leaks with the help of valgrind
