The question may be a bit awkward, but here's my detailed problem:
Currently I'm looking into setting up SysInternals' procdump.exe to monitor an application of ours that exhibits spurious disappearances -- that is, the user reports that the application is simply "gone" without any trace after a short visible hang of the application's window.
My first idea was to run procdump -e -x . MyApp.exe which would record a crash dump when the application encounters an unhandled exception, but then I saw that there is also a -t switch, that --
-t - Write a dump when the process terminates.
automatically generates a dump when the process terminates.
Now the problem
I have tested the -t switch with our app by inserting a ExitProcess or TerminateProcess call at a defined location where I can trigger it.
While the app behaves as expected, i.e. TerminateProcess immediately "kills" the running app and ExitProcess takes a while because global cleanup is run, the dump generated this way is useless in both cases.
The dumps I get for -t always contain only a sinlge thread (where the app was running over 20 thread at termination time) and the callstack isn't even at a useful location. (It just seems to be one random thread from the terminated app.)
Am I doing something wrong? Can I usefully use procdump -t to track down unexpected calls of process exit functions at all?
Can I usefully use procdump -t to track down unexpected calls of
process exit functions at all?
I think not and here's why:
test process calc.exe
CommandLine: "C:\Program Files\Sysinternals\procdump.exe" -t calc.exe
I try to carefully suggest that procdump is waiting on calc.exe process handle.
0:000> kb
ChildEBP RetAddr Args to Child
0017f2e0 77135e6c 75336872 00000002 0017f334 ntdll!KiFastSystemCallRet
0017f2e4 75336872 00000002 0017f334 00000001 ntdll!NtWaitForMultipleObjects+0xc
0017f380 76cbf14a 0017f334 0017f3a8 00000000 KERNELBASE!WaitForMultipleObjectsEx+0x100
0017f3c8 76cbf2c2 00000002 7ffdb000 00000000 kernel32!WaitForMultipleObjectsExImplementation+0xe0
0017f3e4 011c6135 00000002 0017f46c 00000000 kernel32!WaitForMultipleObjects+0x18
WARNING: Stack unwind information not available. Following frames may be wrong.
0017fc30 011c999e 00000003 013d1de0 013d1e78 procdump+0x6135
0017fc78 76cc1194 7ffdb000 0017fcc4 7714b495 procdump+0x999e
0017fc84 7714b495 7ffdb000 77ad79b5 00000000 kernel32!BaseThreadInitThunk+0xe
0017fcc4 7714b468 011c99f5 7ffdb000 00000000 ntdll!__RtlUserThreadStart+0x70
0017fcdc 00000000 011c99f5 7ffdb000 00000000 ntdll!_RtlUserThreadStart+0x1b
0:000> dd 17f46c
0017f46c 00000238 00000268
0:000> !handle 238 f
Handle 238
Type Process
Attributes 0
GrantedAccess 0x1fffff:
Delete,ReadControl,WriteDac,WriteOwner,Synch
Terminate,CreateThread,,VMOp,VMRead,VMWrite,DupHandle,CreateProcess,SetQuota,SetInfo,QueryInfo,SetPort
HandleCount 5
PointerCount 52
Name <none>
Object Specific Information
Process Id 1580
Parent Process 2476
Base Priority 8
In the crash dump file gets stack last complete process thread (TID 3136) just before the end of the process.
0:000> ~
. 0 Id: dc8.c40 Suspend: -1 Teb: 7ffdd000 Unfrozen
0:000> .formats c40
Evaluate expression:
Hex: 00000c40
Decimal: 3136
Crash dump file is created after the completion of the last thread, and before the end of the process.
Related
All!
I'm debugging one quite strange case of process hanging/running out of memory using standard Windows crash dump with WinDbg. Obviously, it runs out of address space because of too many threads being created (it is 32 bit process), and I'm trying to figure out what's wrong with threads initialization (see callstack #3 below), because besides threads with callstacks that are typical for this program, it has handful of threads with callstacks of 3 types like:
1)
00 02cefb08 77544413 02fc024c 00000000 02cefb8c ntdll_774f0000!NtWaitForAlertByThreadId+0xc
01 02cefb28 7754434d 00000000 00000000 ffffffff ntdll_774f0000!RtlpWaitOnAddressWithTimeout+0x33
02 02cefb6c 7754423f 00000004 00000000 00000000 ntdll_774f0000!RtlpWaitOnAddress+0xa5
03 02cefba8 7752a605 02fc0000 02fc0000 02fc04b0 ntdll_774f0000!RtlpWaitOnCriticalSection+0xaa
04 02cefbc8 7752a525 02fc0248 02cefc88 77533844 ntdll_774f0000!RtlpEnterCriticalSectionContended+0xd5
05 02cefbd4 77533844 02fc0248 62da3da7 02fc04b0 ntdll_774f0000!RtlEnterCriticalSection+0x45
06 02cefc88 77533688 02fc04b0 02fc04b8 00000007 ntdll_774f0000!RtlpFreeHeap+0x174
07 02cefcd8 110d27fc 02fc0000 00000000 02fc04b8 ntdll_774f0000!RtlFreeHeap+0x758
...
These threads are stuck behind critical section 02fc024c that is taken by non-longer existing thread, and it is quite hard to figure out, what happened to it.
There are some threads that try to end normally, but are stuck in the LdrpDrainWorkQueue:
2)
# ChildEBP RetAddr Args to Child
00 05e5fd54 77527631 00000064 00000000 00000000 ntdll_774f0000!NtWaitForSingleObject+0xc
01 05e5fd78 7752b105 65f13f5f 00404e7c 00000000 ntdll_774f0000!LdrpDrainWorkQueue+0xbd
02 05e5fe70 7755179c 00404e7c 00404e7c 1086eb50 ntdll_774f0000!LdrShutdownThread+0x85
03 05e5ff40 00404efe 00000000 0042cef4 0042cefc ntdll_774f0000!RtlExitUserThread+0x4c
04 05e5ff6c 00404ea6 05e5ffcc 004049b8 05e5ff80 abc!EndThread+0x6
05 05e5ff80 743962c4 1086eb50 743962a0 941a355e abc!ThreadWrapper+0x2a
06 05e5ff94 77550779 1086eb50 65f13ef3 00000000 kernel32!BaseThreadInitThunk+0x24
07 05e5ffdc 77550744 ffffffff 77573606 00000000 ntdll_774f0000!__RtlUserThreadStart+0x2f
08 05e5ffec 00000000 00404e7c 1086eb50 00000000 ntdll_774f0000!_RtlUserThreadStart+0x1b
Also, dump presents about 1400 threads on a very early stage of initialization, that were created during last 5 minutes of process life with a callstack like:
3)
# ChildEBP RetAddr Args to Child
00 0ed1fba4 77527631 00000064 00000000 00000000 ntdll_774f0000!NtWaitForSingleObject+0xc
01 0ed1fbcc 7752b586 6ec53d9b ffffffff 1ed9d000 ntdll_774f0000!LdrpDrainWorkQueue+0xbd
02 0ed1fcb4 77557d86 6ec53c27 00000000 00000000 ntdll_774f0000!LdrpInitializeThread+0x8d
03 0ed1fd08 77557ce0 00000000 00000000 0ed1fd24 ntdll_774f0000!_LdrpInitialize+0x6a
04 0ed1fd10 00000000 0ed1fd24 774f0000 00000000 ntdll_774f0000!LdrInitializeThunk+0x10
These threads are also waiting in the LdrpDrainWorkQueue for event LdrpLoadCompleteEvent to be signalled.
This event is related to parallel loader (for some reference, first answer fot his question from RbMm, somewhat similar yet different situation here) This event is created during process initialization and signalled after parallel DLL loading has finished, so all LdrpInitializeThread's could traverse DllMain's and signal THREAD_ATTACH. But I don't understand why it is in non-signalled state on a process that has been running for weeks? Does parallel loader work on LoadLibrary as well, so LdrpLoadCompleteEvent gets reset? Couldn't find it in disassembly.
In any case, I'm trying to understand why process has developed such strange callstacks before it was forcefully terminated. I could imagine, that some thread began loading DLL that caused LdrpLoadCompleteEvent to be reset, then some thread holding lock for the heap died in a bad manner, so dll loading couldn't have been completed, so LdrpLoadCompleteEvent was never signalled, hence no new threads could have been initialized. However, there's no any thread that is loading dll in the dump.
Any insight/hint regarding how such callstacks could have been developed, or what else I could do to squeeze more info from the dump, is welcome.
Thank you!
Your fundamental problem is architectural ... a legendary problem known as "thrashing."
Your system is probably designed with what I refer to as the "flaming-arrow approach." Whenever a new request comes in, "just light another flaming arrow and throw it into the air." Unfortunately you just can't do that.
The permanent solution to your problem unfortunately will never be solved by a debugger: it will require a redesign.
In windbg, I'm looking for a mechanism to take the output of a command (specifically, a command inside of a breakpoint) and have it appended to a file, and not written to a console.
Currently I setup the process with .logappend C:\path\to\log and then enable a few breakpoints with:
bp WIN32U!{function} ".echo '===WIN32K-START==='; k; .echo '===WIN32K-END==='; g"
This works great, except the volume of output written to the console causes serious performance issues. I'm hopeful there's a way to get the same output to my log file, without the overhead of writing to the windbg console.
You want the .outmask meta-command: https://learn.microsoft.com/en-us/windows-hardware/drivers/debugger/-outmask--control-output-mask-
.outmask allows you to control which message types are sent to the output window and log file. The /l switch can allow you to change just the types that reach the output window, without affecting which types will reach the log file.
For example, this command will turn off all output to the output window while still sending normal messages to the log file:
.outmask- /l 0xffff
Although probably .outmask- /l 1 is all you need, which turns off just normal message output, but errors and warnings will still show up in the output window. Use .outmask /d to reset output settings back to the default when you're done.
In combination with the ability of .printf to output different message types, you can make it so you still have some idea what's going on, as well. Turn off normal message output to the window with .outmask- /l 1. Now you can use .printf /oe "message" in a breakpoint command somewhere to write an error message, which will still be sent to the output window so you can tell what's happening at certain points in the process.
you can patch dbgeng!g_OutputControl global to disable writing to console and only write to log file
but I don't know if you will have a performance gain or not
looking for a txt file
C:\>dir /b *.txt
File Not Found
opening a debugging session
C:\>cdb calc
Microsoft (R) Windows Debugger Version 10.0.15063.400 X86
ntdll!LdrpDoDebuggerBreak+0x2c:
774005a6 cc int 3
in the opened debugging session
spawn a parent debugger to debug the windbg running your debuggee
0:000> .dbgdbg
Debugger spawned, connect with
"-remote npipe:icfenable,pipe=cdb_pipe,server=xxxx"
in the spawned parent patch the global and detach
ed dbgeng!g_OutputControl 0
.detach
q
open a logfile in your debugging session
0:000> .logappend c:\foo.txt
Opened log file 'c:\foo.txt'
set a conditional breakpoint and start the session
0:000> bp ntdll!RtlEnterCriticalSection "kb;gc"
0:000> bl
0 e 773a7790 0001 (0001) 0:**** ntdll!RtlEnterCriticalSection "kb;gc"
0:000> g
there is no console output here
doing a ctrl+c to stop session and quitting the session
eax=7ffde000 ebx=00000000 ecx=00000000 edx=773ff1d3 esi=00000000 edi=00000000
eip=77394108 esp=016ef8a8 ebp=016ef8d4 iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246
ntdll!DbgBreakPoint:
77394108 cc int 3
0:001> q
quit:
check for the log file and confirm if it has voluminous data written to it
RtlEnterCriticalSection Api is a very hot Api
C:\>dir /b *.txt
foo.txt
C:\>ls -l foo.txt
-rw-rw-rw- 1 0 **1754920** 2017-09-15 00:27 foo.txt
C:>head foo.txt
Opened log file 'c:\foo.txt'
0:000> bp ntdll!RtlEnterCriticalSection "kb;gc"
0:000> bl
0 e 773a7790 0001 (0001) 0:**** ntdll!RtlEnterCriticalSection "kb;gc"
0:000> g
ChildEBP RetAddr Args to Child
000cf114 77425f4b 000d0138 7724d80b 00000000 ntdll!RtlEnterCriticalSection
000cf158 773ea40a 000d0000 50180162 00000044 ntdll!RtlDebugAllocateHeap+0x9d
000cf23c 773b5ae0 00000044 00000000 00000000 ntdll!RtlpAllocateHeap+0xc4
000cf2c0 77384726 000d0000 40180060 00000044 ntdll!RtlAllocateHeap+0x23a
there are more tha 22k lines written to this file
C:\>wc -l foo.txt
22543 foo.txt
C:>tail foo.txt
000cf838 773c37be 00462d6c 7ffdb000 00000000 ntdll!__RtlUserThreadStart+0x70
000cf850 00000000 00462d6c 7ffdb000 00000000 ntdll!_RtlUserThreadStart+0x1b
(c80.8ec): Break instruction exception - code 80000003 (first chance)
eax=7ffde000 ebx=00000000 ecx=00000000 edx=773ff1d3 esi=00000000 edi=00000000
eip=77394108 esp=016ef8a8 ebp=016ef8d4 iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246
ntdll!DbgBreakPoint:
77394108 cc int 3
0:001> q
quit:
C:\>
One of the users of my command line application has reported what appears to be an infinite loop. They helpfully took a dump of the process (via Task Manager) while it was in this state and sent it to me.
I'm not sure how to get useful information out of this dump. My normal technique of windbg -z the-dump-file.dmp -y releases\v5.0.0 -i releases\v5.0.0 doesn't give me much information that I know how to interpret. Are there ghc-specific tools I can use instead?
Moving forward, are the build options I should add or other things I should do to my release process to make this kind of post-mortem debugging more fruitful?
Here's an example of the stacks that I'm seeing. Not much useful info, especially for someone used to debugging C/C++ code in WinDbg. :-)
0 Id: 112dc.cc18 Suspend: 1 Teb: 00000000`00341000 Unfrozen
*** ERROR: Module load completed but symbols could not be loaded for gbc.exe
# Child-SP RetAddr Call Site
00 00000000`01b7d8d0 00000000`01049f71 gbc+0xc5676e
01 00000000`01b7d930 00000000`0104b5b4 gbc+0xc49f71
02 00000000`01b7d9a0 00000000`0104c644 gbc+0xc4b5b4
03 00000000`01b7da60 00000000`0104c1fa gbc+0xc4c644
04 00000000`01b7dab0 00000000`0042545b gbc+0xc4c1fa
05 00000000`01b7db30 00000000`011c40a0 gbc+0x2545b
06 00000000`01b7db38 00000000`0535bee1 gbc+0xdc40a0
07 00000000`01b7db40 00000000`010ffd80 0x535bee1
08 00000000`01b7db48 00000000`0535bee1 gbc+0xcffd80
09 00000000`01b7db50 00007ffb`3581fb01 0x535bee1
0a 00000000`01b7db58 00007ffb`3581b850 imm32!?MSCTF_NULL_THUNK_DATA_DLB+0x2e9
0b 00000000`01b7db60 00000000`00000010 imm32!CtfImmGetCompatibleKeyboardLayout
0c 00000000`01b7db68 00000000`00000000 0x10
1 Id: 112dc.d324 Suspend: 1 Teb: 00000000`00349000 Unfrozen
# Child-SP RetAddr Call Site
00 00000000`05c2fc48 00007ffb`36441563 ntdll!ZwWaitForWorkViaWorkerFactory+0x14
01 00000000`05c2fc50 00007ffb`34172774 ntdll!TppWorkerThread+0x293
02 00000000`05c2ff60 00007ffb`36470d61 kernel32!BaseThreadInitThunk+0x14
03 00000000`05c2ff90 00000000`00000000 ntdll!RtlUserThreadStart+0x21
2 Id: 112dc.11b48 Suspend: 1 Teb: 00000000`0034b000 Unfrozen
# Child-SP RetAddr Call Site
00 00000000`0642dd38 00007ffb`32f2988f ntdll!ZwWaitForSingleObject+0x14
01 00000000`0642dd40 00000000`00ffca15 KERNELBASE!WaitForSingleObjectEx+0x9f
02 00000000`0642dde0 00000000`00000000 gbc+0xbfca15
Some resources that might be useful. (If there are more up-to-date ones, I would like to see them myself.)
https://ghc.haskell.org/trac/ghc/wiki/Debugging/CompiledCode
https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/debug-info.html
https://wiki.haskell.org/Debugging
A few important nuggets:
The runtime flag +RTS -? Will tell you what runtime flags add debugging information. These will start with +RTS -D. For example, +RTS -DS turns on a number of runtime assertions and sanity checks.
The strange names you see are encoded in something called Z-encoding. This is defined at https://ghc.haskell.org/trac/ghc/browser/ghc/compiler/cmm/CLabel.hs.
If you can recompile the code with debugging symbols on and threading off, and still reproduce the bug, you can set breakpoints (or hit control-C) inside the debugger and backtrace from there. You can examine memory with a command like print/a 0x006eb0c0 (although you seem to be using 64-bit pointers). You can see the assembly-language instruction that crashed with disassemble.
You need to use the -ddump-stg compile flag to see what the variable names mean, because that is the last phase of the transformation before the program is assembled, and the variable names you see in the debugger correspond to the ones here.
You can instrument the code with Debug.Trace.
I have the following scenario:
I am attached with the kernel debugger (Windbg) to a Hyper-V machine running Windows 10 64bit.
My process-to-be-debugged is a 32bit user-mode process which sometimes hangs the machine (it communicates with a minifilter), and therefore I cannot use user-mode debugger or remote debugger.
Now I have a symbols server, I know the process and the thread I want to investigate, how do I:
View the callstack for this thread only
Load symbols for my modules
Bonus question: For some reason, I have many instances of my program. Except the "active" one, the rest are not visible in Process Explorer, have no threads and 0 Handle count. What could cause this?
Things I tried:
!process ffffe08620a30800 7
(view all process threads)
...
THREAD **ffffe0862212f800** Cid 08a0.1cfc Teb: 0000000000d6e000 Win32Thread: 0000000000000000 RUNNING on processor 0
Not impersonating
DeviceMap ffffcb0a55817c30
Owning Process ffffe08620a30800 Image: avguard.exe
Attached Process N/A Image: N/A
Wait Start TickCount 27338460 Ticks: 1 (0:00:00:00.015)
Context Switch Count 2999214 IdealProcessor: 0
UserTime **00:17:51.125**
KernelTime 00:06:14.671
Win32 Start Address 0x00000000741bbfb4
Stack Init ffffb481db842dd0 Current ffffb481db842a10
Base ffffb481db843000 Limit ffffb481db83d000 Call 0
Priority 8 BasePriority 8 UnusualBoost 0 ForegroundBoost 0 IoPriority 2 PagePriority 5
Child-SP RetAddr : Args to Child : Call Site
ffffb481`db842c40 00000000`77b1222c : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KiSystemServiceExit+0x2f (TrapFrame # ffffb481`db842c40)
00000000`0333ed18 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : wow64cpu!CpupSyscallStub+0xc
...
This thread is the one I want to investigate. It has a high processor usage, as visible in bold: User Time: 17 mins. However the stack is not helpful.
Then I did:
.thread /p /r /w ffffe0862212f800
Implicit thread is now ffffe086`2212f800
Implicit process is now ffffe086`20a30800
.cache forcedecodeuser done
Loading User Symbols
.ModLoad: 00000000`009b0000 00000000`00a25000 C:\Program Files (x86)\my process.exe
.ModLoad: 00007ffb`67e20000 00007ffb`67ff1000 C:\WINDOWS\SYSTEM32\ntdll.dll
.ModLoad: 00000000`778d0000 00000000`77922000 C:\WINDOWS\System32\wow64.dll
.ModLoad: 00000000`77850000 00000000`778c7000 C:\WINDOWS\System32\wow64win.dll
.ModLoad: 00000000`77b10000 00000000`77b1a000 C:\WINDOWS\System32\wow64cpu.dll
The context is partially valid. Only x86 user-mode context is available.
x86 context set
1: kd:x86> kb
*** Stack trace for last set context - .thread/.cxr resets it
ChildEBP RetAddr Args to Child
WARNING: Frame IP not in any known module. Following frames may be wrong.
db842cc0 00000000 00000000 00000000 00000000 **0x819613ca**
What is this address 0x819613ca?
How can I extract the module in which this belongs? Or extract a meaningfull callstack?
How can I proceed now further with my investigation?
I am debugging a managed application using Son of Strike (SOS) in Visual studio 2010. I want to run a raw memory dump from a specific location, but I get "End of expression expected" error. If I attach WinDbg, then I can run same 'dd' command. How can I fix this problem?
!clrstack -l
OS Thread Id: 0xd5c (3420)
Child SP IP Call Site
0050eeac 002700eb ConsoleApplication2.Program.Main(System.String[])
LOCALS:
0x0050eeb0 = 0x0240c178
0x0050eebc = 0x00000000
0050f0fc 6b4c21bb [GCFrame: 0050f0fc]
dd 0x0240c178
End of expression expected
dd 0x0050eeb0
End of expression expected
In the Immediate window you have to use >dd 0x001AF2E0 to make it work. You have to type the > before dd.
dd 0x001AF2E0
End of expression expected
>dd 0x001AF2E0
0x001AF2E0 6d7c4938 ffffffff 001af34c 00000001
0x001AF2F0 002dd780 00000000 002dd780 ffffffff
0x001AF300 00000001 77a220f9 00000000 00713000
0x001AF310 002711a8 00000001 00000000 00000000
In the Command window you can just type dd 0x001AF2E0.
Type .cordll and see if the sos dll is loaded.
eg:
0:000> .cordll
CLR DLL status: Loaded DLL C:\Windows\Microsoft.NET\Framework\v2.0.50727\mscordacwks.dll