Ignore INVALID HANDLE error in AppVerifier?

I have a verifier DLL installed (via VerifierDlls) against a third-party application. The application keeps crashing with an invalid handle stop:
APPLICATION_VERIFIER_HANDLES_INVALID_HANDLE (300)
Invalid handle exception for current stack trace.
This stop is generated if the function on the top of the stack passed an
invalid handle to system routines. Usually a simple kb command will reveal
what is the value of the handle passed (must be one of the parameters -
usually the first one). If the value is null then this is clearly wrong.
If the value looks ok you need to use !htrace debugger extension to get a
history of operations pertaining to this handle value. In most cases it
must be that the handle value is used after being closed.
Arguments:
Arg1: 00000000c0000008, Exception code.
Arg2: 0000008397afefd0, Exception record. Use .exr to display it.
Arg3: 0000008397afe9a0, Context record. Use .cxr to display it.
Arg4: 0000000000000000, Not used.
Is there a way to avoid it? I tried hooking CloseHandle, but I don't know how to tell whether a handle is invalid.
From the procdump output I can tell that CloseHandle was the cause:
00 00007ffd`cc963851 : 00000000`00000000 00000000`00000000 0000ab17`238a5e24 00000000`00000002 : ntdll!NtWaitForMultipleObjects+0x14
01 00007ffd`cc962ae5 : 00000000`000016d8 00000000`00000000 00000000`000016d8 00000000`00001000 : ntdll!WerpWaitForCrashReporting+0x6d
02 00007ffd`cc961b97 : 00000000`00000000 00000099`fecfd5c0 00000000`00000020 00007ffd`cc98d68a : ntdll!RtlReportExceptionHelper+0x269
03 00007ffd`ad70ecc1 : 00000099`fecfce60 00000000`00000300 00000223`d06f4280 00000000`00000000 : ntdll!RtlReportException+0x77
04 00007ffd`cc9b5eb0 : 00000000`00000000 00007ffd`cca9b5e0 00000223`d06f4280 00000223`d06f4280 : verifier!AVrfpVectoredExceptionHandler+0x2b1
05 00007ffd`cc98fa3b : 00000099`fecfdab0 00000099`fecfd5c0 00000000`00000000 00000000`deff7850 : ntdll!RtlpCallVectoredHandlers+0x104
06 00007ffd`cc9f960a : 00000000`00000000 00000000`00000000 00007ffd`ad735ef0 00000000`00000000 : ntdll!RtlDispatchException+0x6b
07 00007ffd`ad7067ea : 00007ffd`ad735ef0 00000000`00000000 00007ffd`ad728744 00007ffd`ad73dd40 : ntdll!KiUserExceptionDispatch+0x3a
08 00007ffd`ad70ec59 : 00000099`fecfe130 00000000`00000000 00000223`d06f4280 00000099`fecffd20 : verifier!VerifierStopMessageEx+0x6e2
09 00007ffd`cc9b5eb0 : 00000000`00000000 00007ffd`cca9b5e0 00000223`d06f4280 00000223`d06f4280 : verifier!AVrfpVectoredExceptionHandler+0x249
0a 00007ffd`cc98fa3b : 00000099`fecff090 00000099`fecfea60 00000223`d06d0000 00000000`deff7850 : ntdll!RtlpCallVectoredHandlers+0x104
0b 00007ffd`cc991a59 : 00000099`fecfe9c0 00000000`00000024 00000099`fecfe900 00007ffd`ccaa7870 : ntdll!RtlDispatchException+0x6b
0c 00007ffd`cc9f967a : 00000000`00000000 00000000`000007ac 00007ff6`8b94d3f3 00007ffd`cc9f5bd0 : ntdll!RtlRaiseException+0x2d9
0d 00007ffd`ad71e0e1 : 00000223`d43b1660 00007ffd`cb024700 00007ff6`8b94d3f3 00000000`000007ac : ntdll!KiRaiseUserExceptionDispatcher+0x3a
0e 00007ffd`c94d6d82 : 00000000`000007ac 00000223`d43b1660 00000223`d028e040 00000223`d028e040 : verifier!AVrfpNtClose+0x51
0f 00007ffd`ad7201ad : 00000000`000007ac 00000099`fecff310 00000223`d028e038 00000000`000007ac : KERNELBASE!CloseHandle+0x62
10 00007ffd`ad720218 : 00000000`00000000 00000223`d028e038 00000000`00000000 00000000`00000000 : verifier!AVrfpCloseHandleCommon+0xa1
11 00007ff6`8b94d3f3 : 00000223`d0742fb0 00000099`fecff310 00000223`d028e038 00000000`00000000 : verifier!AVrfpKernel32CloseHandle+0x28
Any ideas?

You can disable individual Application Verifier checks. Run appverif (note that there are separate 64-bit and 32-bit versions) and locate the check category your error belongs to; going by the stop name, that should be Handles here.
Now comes the not very intuitive step: right-click that checkbox and choose "Verifier Stop Options".
You can then select stop 300 (which is yours) and change its behavior. I don't know exactly what each setting does, since I have never used them, but it sounds as if either "Ignore" or "Inactive" would be a good choice to get rid of the stop.
Don't forget to hit the "Save" button after closing the dialog.
The settings are stored in the registry somewhere below HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Image File Execution Options\ (64-bit).

Related

DPC_WATCHDOG_VIOLATION (133/1) Potentially related to NdisFIndicateReceiveNetBufferLists?

We have an NDIS LWF driver. On a single customer machine we get a DPC_WATCHDOG_VIOLATION 133/1 bugcheck when the user connects to a VPN in order to reach the internet. This could be related to our call to NdisFIndicateReceiveNetBufferLists: we raise the IRQL to DISPATCH_LEVEL before calling it (and lower it to its previous value afterward), and that call does appear in the !dpcwatchdog output shown below. We raise the IRQL as a workaround for another bug, explained here:
IRQL_UNEXPECTED_VALUE BSOD after NdisFIndicateReceiveNetBufferLists?
Now this is the bugcheck:
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************
DPC_WATCHDOG_VIOLATION (133)
The DPC watchdog detected a prolonged run time at an IRQL of DISPATCH_LEVEL
or above.
Arguments:
Arg1: 0000000000000001, The system cumulatively spent an extended period of time at
DISPATCH_LEVEL or above. The offending component can usually be
identified with a stack trace.
Arg2: 0000000000001e00, The watchdog period.
Arg3: fffff805422fb320, cast to nt!DPC_WATCHDOG_GLOBAL_TRIAGE_BLOCK, which contains
additional information regarding the cumulative timeout
Arg4: 0000000000000000
STACK_TEXT:
nt!KeBugCheckEx
nt!KeAccumulateTicks+0x1846b2
nt!KiUpdateRunTime+0x5d
nt!KiUpdateTime+0x4a1
nt!KeClockInterruptNotify+0x2e3
nt!HalpTimerClockInterrupt+0xe2
nt!KiCallInterruptServiceRoutine+0xa5
nt!KiInterruptSubDispatchNoLockNoEtw+0xfa
nt!KiInterruptDispatchNoLockNoEtw+0x37
nt!KxWaitForSpinLockAndAcquire+0x2c
nt!KeAcquireSpinLockAtDpcLevel+0x5c
wanarp!WanNdisReceivePackets+0x4bb
ndis!ndisMIndicateNetBufferListsToOpen+0x141
ndis!ndisMTopReceiveNetBufferLists+0x3f0e4
ndis!ndisCallReceiveHandler+0x61
ndis!ndisInvokeNextReceiveHandler+0x1df
ndis!NdisMIndicateReceiveNetBufferLists+0x104
ndiswan!IndicateRecvPacket+0x596
ndiswan!ApplyQoSAndIndicateRecvPacket+0x20b
ndiswan!ProcessPPPFrame+0x16f
ndiswan!ReceivePPP+0xb3
ndiswan!ProtoCoReceiveNetBufferListChain+0x442
ndis!ndisMCoIndicateReceiveNetBufferListsToNetBufferLists+0xf6
ndis!NdisMCoIndicateReceiveNetBufferLists+0x11
raspptp!CallIndicateReceived+0x210
raspptp!CallProcessRxNBLs+0x199
ndis!ndisDispatchIoWorkItem+0x12
nt!IopProcessWorkItem+0x135
nt!ExpWorkerThread+0x105
nt!PspSystemThreadStartup+0x55
nt!KiStartSystemThread+0x28
SYMBOL_NAME: wanarp!WanNdisReceivePackets+4bb
FOLLOWUP_NAME: MachineOwner
MODULE_NAME: wanarp
IMAGE_NAME: wanarp.sys
Below is the output of !dpcwatchdog. I still can't work out what is causing this bugcheck, or which function is spending too much time at DISPATCH_LEVEL. I suspect it is related to spin locking done by wanarp. Could this be a bug in wanarp? Note that we don't use any spin locks in our driver, and raising the IRQL ourselves should not cause a problem, since it is very common for NDIS indications to be made at DISPATCH_LEVEL.
So how can I find the root cause of this bugcheck? There are no other third-party LWFs in the NDIS stack.
3: kd> !dpcwatchdog
All durations are in seconds (1 System tick = 15.625000 milliseconds)
Circular Kernel Context Logger history: !logdump 0x2
DPC and ISR stats: !intstats /d
--------------------------------------------------
CPU#0
--------------------------------------------------
Current DPC: No Active DPC
Pending DPCs:
----------------------------------------
CPU Type KDPC Function
dpcs: no pending DPCs found
--------------------------------------------------
CPU#1
--------------------------------------------------
Current DPC: No Active DPC
Pending DPCs:
----------------------------------------
CPU Type KDPC Function
1: Normal : 0xfffff80542220e00 0xfffff805418dbf10 nt!PpmCheckPeriodicStart
1: Normal : 0xfffff80542231d40 0xfffff8054192c730 nt!KiBalanceSetManagerDeferredRoutine
1: Normal : 0xffffbd0146590868 0xfffff80541953200 nt!KiEntropyDpcRoutine
DPC Watchdog Captures Analysis for CPU #1.
DPC Watchdog capture size: 641 stacks.
Number of unique stacks: 1.
No common functions detected!
The captured stacks seem to indicate that only a single DPC or generic function is the culprit.
Try to analyse what other processors were doing at the time of the following reference capture:
CPU #1 DPC Watchdog Reference Stack (#0 of 641) - Time: 16 Min 17 Sec 984.38 mSec
# RetAddr Call Site
00 fffff805418d8991 nt!KiUpdateRunTime+0x5D
01 fffff805418d2803 nt!KiUpdateTime+0x4A1
02 fffff805418db1c2 nt!KeClockInterruptNotify+0x2E3
03 fffff80541808a45 nt!HalpTimerClockInterrupt+0xE2
04 fffff805419fab9a nt!KiCallInterruptServiceRoutine+0xA5
05 fffff805419fb107 nt!KiInterruptSubDispatchNoLockNoEtw+0xFA
06 fffff805418a9a9c nt!KiInterruptDispatchNoLockNoEtw+0x37
07 fffff805418da3cc nt!KxWaitForSpinLockAndAcquire+0x2C
08 fffff8054fa614cb nt!KeAcquireSpinLockAtDpcLevel+0x5C
09 fffff80546ba1eb1 wanarp!WanNdisReceivePackets+0x4BB
0a fffff80546be0b84 ndis!ndisMIndicateNetBufferListsToOpen+0x141
0b fffff80546ba7ef1 ndis!ndisMTopReceiveNetBufferLists+0x3F0E4
0c fffff80546bddfef ndis!ndisCallReceiveHandler+0x61
0d fffff80546ba4a94 ndis!ndisInvokeNextReceiveHandler+0x1DF
0e fffff8057c32d17e ndis!NdisMIndicateReceiveNetBufferLists+0x104
0f fffff8057c30d6c7 ndiswan!IndicateRecvPacket+0x596
10 fffff8057c32d56b ndiswan!ApplyQoSAndIndicateRecvPacket+0x20B
11 fffff8057c32d823 ndiswan!ProcessPPPFrame+0x16F
12 fffff8057c308e62 ndiswan!ReceivePPP+0xB3
13 fffff80546c5c006 ndiswan!ProtoCoReceiveNetBufferListChain+0x442
14 fffff80546c5c2d1 ndis!ndisMCoIndicateReceiveNetBufferListsToNetBufferLists+0xF6
15 fffff8057c2b0064 ndis!NdisMCoIndicateReceiveNetBufferLists+0x11
16 fffff8057c2b06a9 raspptp!CallIndicateReceived+0x210
17 fffff80546bd9dc2 raspptp!CallProcessRxNBLs+0x199
18 fffff80541899645 ndis!ndisDispatchIoWorkItem+0x12
19 fffff80541852b65 nt!IopProcessWorkItem+0x135
1a fffff80541871d25 nt!ExpWorkerThread+0x105
1b fffff80541a00778 nt!PspSystemThreadStartup+0x55
1c ---------------- nt!KiStartSystemThread+0x28
--------------------------------------------------
CPU#2
--------------------------------------------------
Current DPC: No Active DPC
Pending DPCs:
----------------------------------------
CPU Type KDPC Function
2: Normal : 0xffffbd01467f0868 0xfffff80541953200 nt!KiEntropyDpcRoutine
DPC Watchdog Captures Analysis for CPU #2.
DPC Watchdog capture size: 641 stacks.
Number of unique stacks: 1.
No common functions detected!
The captured stacks seem to indicate that only a single DPC or generic function is the culprit.
Try to analyse what other processors were doing at the time of the following reference capture:
CPU #2 DPC Watchdog Reference Stack (#0 of 641) - Time: 16 Min 17 Sec 984.38 mSec
# RetAddr Call Site
00 fffff805418d245a nt!KeClockInterruptNotify+0x453
01 fffff80541808a45 nt!HalpTimerClockIpiRoutine+0x1A
02 fffff805419fab9a nt!KiCallInterruptServiceRoutine+0xA5
03 fffff805419fb107 nt!KiInterruptSubDispatchNoLockNoEtw+0xFA
04 fffff805418a9a9c nt!KiInterruptDispatchNoLockNoEtw+0x37
05 fffff805418a9a68 nt!KxWaitForSpinLockAndAcquire+0x2C
06 fffff8054fa611cb nt!KeAcquireSpinLockRaiseToDpc+0x88
07 fffff80546ba1eb1 wanarp!WanNdisReceivePackets+0x1BB
08 fffff80546be0b84 ndis!ndisMIndicateNetBufferListsToOpen+0x141
09 fffff80546ba7ef1 ndis!ndisMTopReceiveNetBufferLists+0x3F0E4
0a fffff80546bddfef ndis!ndisCallReceiveHandler+0x61
0b fffff80546be3a81 ndis!ndisInvokeNextReceiveHandler+0x1DF
0c fffff80546ba804e ndis!ndisFilterIndicateReceiveNetBufferLists+0x3C611
0d fffff8054e384d77 ndis!NdisFIndicateReceiveNetBufferLists+0x6E
0e fffff8054e3811a9 ourdriver+0x4D70
0f fffff80546ba7d40 ourdriver+0x11A0
10 fffff8054182a6b5 ndis!ndisDummyIrpHandler+0x100
11 fffff80541c164c8 nt!IofCallDriver+0x55
12 fffff80541c162c7 nt!IopSynchronousServiceTail+0x1A8
13 fffff80541c15646 nt!IopXxxControlFile+0xC67
14 fffff80541a0aab5 nt!NtDeviceIoControlFile+0x56
15 ---------------- nt!KiSystemServiceCopyEnd+0x25
--------------------------------------------------
CPU#3
--------------------------------------------------
Current DPC: No Active DPC
Pending DPCs:
----------------------------------------
CPU Type KDPC Function
dpcs: no pending DPCs found
Target machine version: Windows 10 Kernel Version 19041 MP (4 procs)
Note that we also pass the NDIS_RECEIVE_FLAGS_DISPATCH_LEVEL flag to NdisFIndicateReceiveNetBufferLists if the current IRQL is DISPATCH_LEVEL.
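As a quick sanity check on the numbers above, Arg2 (the watchdog period) can be converted to wall-clock time using the tick length that !dpcwatchdog prints. This is only a sketch: it assumes Arg2 is a count of system ticks, which is how the bug check 0x133 documentation describes it when Arg1 is 1, and the helper name ticks_to_seconds is made up for illustration.

```python
# Convert the DPC watchdog period from the bugcheck arguments into seconds.
# Assumption: Arg2 is a tick count. The tick length is the value printed by
# !dpcwatchdog in this dump ("1 System tick = 15.625000 milliseconds").

TICK_MS = 15.625           # from the !dpcwatchdog header line
WATCHDOG_PERIOD = 0x1E00   # Arg2 of DPC_WATCHDOG_VIOLATION (133)


def ticks_to_seconds(ticks: int, tick_ms: float = TICK_MS) -> float:
    """Hypothetical helper: convert a tick count to seconds."""
    return ticks * tick_ms / 1000.0


if __name__ == "__main__":
    print(WATCHDOG_PERIOD, "ticks =", ticks_to_seconds(WATCHDOG_PERIOD), "s")
```

Under that assumption the period is 7680 ticks, which corresponds to 120 seconds.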
Edit 1:
Here is also the output of !locks, !qlocks and !ready. The contention count on one of the resources is 49135: is this normal or too high? Could it be related to our issue? The threads that own it or are waiting on it belong to ordinary processes such as chrome, csrss, etc.
3: kd> !kdexts.locks
**** DUMP OF ALL RESOURCE OBJECTS ****
KD: Scanning for held locks.
Resource @ nt!ExpTimeRefreshLock (0xfffff80542219440) Exclusively owned
Contention Count = 17
Threads: ffffcf8ce9dee640-01<*>
KD: Scanning for held locks.....
Resource @ 0xffffcf8cde7f59f8 Shared 1 owning threads
Contention Count = 62
Threads: ffffcf8ce84ec080-01<*>
KD: Scanning for held locks...............................................................................................
Resource @ 0xffffcf8ce08d0890 Exclusively owned
Contention Count = 49135
NumberOfSharedWaiters = 1
NumberOfExclusiveWaiters = 6
Threads: ffffcf8cf18e3080-01<*> ffffcf8ce3faf080-01
Threads Waiting On Exclusive Access:
ffffcf8ceb6ce080 ffffcf8ce1d20080 ffffcf8ce77f1080 ffffcf8ce92f4080
ffffcf8ce1d1f0c0 ffffcf8ced7c6080
KD: Scanning for held locks.
Resource @ 0xffffcf8ce08d0990 Shared 1 owning threads
Threads: ffffcf8cf18e3080-01<*>
KD: Scanning for held locks.........................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
Resource @ 0xffffcf8ceff46350 Shared 1 owning threads
Threads: ffffcf8ce6de8080-01<*>
KD: Scanning for held locks......................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
Resource @ 0xffffcf8cf0cade50 Exclusively owned
Contention Count = 3
Threads: ffffcf8ce84ec080-01<*>
KD: Scanning for held locks.........................
Resource @ 0xffffcf8cf0f76180 Shared 1 owning threads
Threads: ffffcf8ce83dc080-02<*>
KD: Scanning for held locks.......................................................................................................................................................................................................................................................
Resource @ 0xffffcf8cf1875cb0 Shared 1 owning threads
Contention Count = 3
Threads: ffffcf8ce89db040-02<*>
KD: Scanning for held locks.
Resource @ 0xffffcf8cf18742d0 Shared 1 owning threads
Threads: ffffcf8cee5e1080-02<*>
KD: Scanning for held locks....................................................................................
Resource @ 0xffffcf8cdceeece0 Shared 2 owning threads
Contention Count = 4
Threads: ffffcf8ce3a1c080-01<*> ffffcf8ce5625040-01<*>
Resource @ 0xffffcf8cdceeed48 Shared 1 owning threads
Threads: ffffcf8ce5625043-02<*> *** Actual Thread ffffcf8ce5625040
KD: Scanning for held locks...
Resource @ 0xffffcf8cf1d377d0 Exclusively owned
Threads: ffffcf8cf0ff3080-02<*>
KD: Scanning for held locks....
Resource @ 0xffffcf8cf1807050 Exclusively owned
Threads: ffffcf8ce84ec080-01<*>
KD: Scanning for held locks......
245594 total locks, 13 locks currently held
3: kd> !qlocks
Key: O = Owner, 1-n = Wait order, blank = not owned/waiting, C = Corrupt
Processor Number
Lock Name 0 1 2 3
KE - Unused Spare
MM - Unused Spare
MM - Unused Spare
MM - Unused Spare
CC - Vacb
CC - Master
EX - NonPagedPool
IO - Cancel
CC - Unused Spare
IO - Vpb
IO - Database
IO - Completion
NTFS - Struct
AFD - WorkQueue
CC - Bcb
MM - NonPagedPool
3: kd> !ready
KSHARED_READY_QUEUE fffff8053f1ada00: (00) ****------------------------------------------------------------
SharedReadyQueue fffff8053f1ada00: No threads in READY state
Processor 0: No threads in READY state
Processor 1: Ready Threads at priority 15
THREAD ffffcf8ce9dee640 Cid 2054.2100 Teb: 000000fab7bca000 Win32Thread: 0000000000000000 READY on processor 1
Processor 2: No threads in READY state
Processor 3: No threads in READY state
3: kd> dt nt!_ERESOURCE 0xffffcf8ce08d0890
+0x000 SystemResourcesList : _LIST_ENTRY [ 0xffffcf8c`e08d0610 - 0xffffcf8c`e08cf710 ]
+0x010 OwnerTable : 0xffffcf8c`ee6e8210 _OWNER_ENTRY
+0x018 ActiveCount : 0n1
+0x01a Flag : 0xf86
+0x01a ReservedLowFlags : 0x86 ''
+0x01b WaiterPriority : 0xf ''
+0x020 SharedWaiters : 0xffffae09`adcae8e0 Void
+0x028 ExclusiveWaiters : 0xffffae09`a9aabea0 Void
+0x030 OwnerEntry : _OWNER_ENTRY
+0x040 ActiveEntries : 1
+0x044 ContentionCount : 0xbfef
+0x048 NumberOfSharedWaiters : 1
+0x04c NumberOfExclusiveWaiters : 6
+0x050 Reserved2 : (null)
+0x058 Address : (null)
+0x058 CreatorBackTraceIndex : 0
+0x060 SpinLock : 0
3: kd> dx -id 0,0,ffffcf8cdcc92040 -r1 (*((ntkrnlmp!_OWNER_ENTRY *)0xffffcf8ce08d08c0))
(*((ntkrnlmp!_OWNER_ENTRY *)0xffffcf8ce08d08c0)) [Type: _OWNER_ENTRY]
[+0x000] OwnerThread : 0xffffcf8cf18e3080 [Type: unsigned __int64]
[+0x008 ( 0: 0)] IoPriorityBoosted : 0x0 [Type: unsigned long]
[+0x008 ( 1: 1)] OwnerReferenced : 0x0 [Type: unsigned long]
[+0x008 ( 2: 2)] IoQoSPriorityBoosted : 0x1 [Type: unsigned long]
[+0x008 (31: 3)] OwnerCount : 0x1 [Type: unsigned long]
[+0x008] TableSize : 0xc [Type: unsigned long]
3: kd> dx -id 0,0,ffffcf8cdcc92040 -r1 ((ntkrnlmp!_OWNER_ENTRY *)0xffffcf8cee6e8210)
((ntkrnlmp!_OWNER_ENTRY *)0xffffcf8cee6e8210) : 0xffffcf8cee6e8210 [Type: _OWNER_ENTRY *]
[+0x000] OwnerThread : 0x0 [Type: unsigned __int64]
[+0x008 ( 0: 0)] IoPriorityBoosted : 0x1 [Type: unsigned long]
[+0x008 ( 1: 1)] OwnerReferenced : 0x1 [Type: unsigned long]
[+0x008 ( 2: 2)] IoQoSPriorityBoosted : 0x1 [Type: unsigned long]
[+0x008 (31: 3)] OwnerCount : 0x0 [Type: unsigned long]
[+0x008] TableSize : 0x7 [Type: unsigned long]
Thanks for reporting this. I've tracked this down to an OS bug: there's a deadlock in wanarp. This issue appears to affect every version of the OS going back to Windows Vista.
I've filed internal issue task.ms/42393356 to track this: if you have a Microsoft support contract, your rep can get you status updates on that issue.
Meanwhile, you can partially work around this issue by either:
Indicating 1 packet at a time (NumberOfNetBufferLists==1); or
Indicating on a single CPU at a time
The bug in wanarp is exposed when 2 or more CPUs collectively process 3 or more NBLs at the same time. So either workaround would avoid the trigger conditions.
Depending on how much bandwidth you're pushing through this network interface, those options could be rather bad for CPU/battery/throughput. So please try to avoid pessimizing batching unless it's really necessary. (For example, you could make this an option that's off-by-default, unless the customer specifically uses wanarp.)
Note that you cannot fully prevent the issue yourself. Other drivers in the stack, including NDIS itself, have the right to group packets together, which would have the side effect of re-batching the packets that you carefully un-batched. However, I believe that you can make a statistically significant dent in the crashes if you just indicate 1 NBL at a time, or indicate multiple NBLs on 1 CPU at a time.
Sorry this is happening to you again! wanarp is... a very old codebase.
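The first workaround amounts to walking the NBL chain and indicating each element on its own. Real driver code would be C, walking the chain with NET_BUFFER_LIST_NEXT_NBL and calling NdisFIndicateReceiveNetBufferLists with NumberOfNetBufferLists == 1. What follows is only a language-neutral Python model of the chain-splitting logic. The names Nbl, indicate and indicate_one_at_a_time are hypothetical and stand in for the real NDIS types and calls.

```python
# Model of workaround 1: split a chain of NBLs and indicate them one by one.
# Nbl and indicate() are hypothetical stand-ins, not NDIS APIs.
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Nbl:
    payload: str
    next: Optional["Nbl"] = None


def indicate_one_at_a_time(head: Optional[Nbl],
                           indicate: Callable[[Nbl, int], None]) -> int:
    """Detach each NBL from the chain and indicate it alone.

    Returns the number of indications made.
    """
    count = 0
    current = head
    while current is not None:
        following = current.next   # remember the rest of the chain first
        current.next = None        # detach, so the indicated chain has length 1
        indicate(current, 1)       # second argument models NumberOfNetBufferLists
        count += 1
        current = following
    return count


if __name__ == "__main__":
    chain = Nbl("a", Nbl("b", Nbl("c")))
    seen = []
    made = indicate_one_at_a_time(
        chain, lambda nbl, num: seen.append((nbl.payload, num)))
    print(made, seen)
```

The model reads the next pointer before indicating, on the assumption that a driver should not rely on an NBL's link after handing it up the stack. As noted above, this only reduces the odds of the trigger condition, since other components may re-batch the packets.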

System becomes unresponsive due to kernel oops (IP: dev_queue_xmit+0x256/0x3f4)

A Linux system consistently becomes unresponsive with the serial console output below. Similar output is observed every time the issue occurs.
The steps to reproduce the issue are unknown as of now. However, the issue is not observed when all ACPI-related parameters are disabled in the BIOS.
I am a newbie at debugging kernel oopses. Please let me know what the problem could be and how I can resolve it. Any pointers or help would be much appreciated.
The stack trace is:
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<c06fbcdd>] dev_queue_xmit+0x256/0x3f4
*pdpt = 000000002ecb3001 *pde = 000000012974c067
Oops: 0000 [#1] SMP
last sysfs file: /sys/devices/pci0000:00/0000:00:1c.0/0000:01:00.0/net/eth0/broadcast
Modules linked in: tun nfnetlink_queue nfnetlink bluetooth rfkill ts_kmp xt_string 8021q garp nf_nat_pptp nf_conntrack_pptp nf_conntrack_proto_]
Pid: 14113, comm: snort Not tainted 2.6.33.3-85.fc13.i686.PAE #1 To be filled by O.E.M./To Be Filled By O.E.M.
EIP: 0060:[<c06fbcdd>] EFLAGS: 00210202 CPU: 1
EIP is at dev_queue_xmit+0x256/0x3f4
EAX: f6922000 EBX: f6bf5a80 ECX: ed524140 EDX: f6123380
ESI: f6248000 EDI: 00000000 EBP: eef7dbf0 ESP: eef7dbdc
DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Process snort (pid: 14113, ti=eef7c000 task=eef28cc0 task.ti=eef7c000)
Stack:
eef7dbec f6923300 f6bf5a80 ef48f360 ed524108 eef7dc14 c0722500 001aa8b4
<0> 00000000 ef48f350 ef48f300 00000028 f6bf5a80 001aa8b4 eef7dc20 c0722591
<0> f6bf5a80 eef7dc2c c0722819 f0fc4800 eef7dc34 c07216d5 eef7dc4c c0713b12
Call Trace:
[<c0722500>] ? ip_finish_output2+0x18e/0x1c6
[<c0722591>] ? ip_finish_output+0x59/0x5c
[<c0722819>] ? ip_output+0x74/0x79
[<c07216d5>] ? dst_output+0x9/0xb
[<c0713b12>] ? nf_reinject+0xa3/0xe6
[<f80ab427>] ? nfqnl_recv_verdict+0x1cf/0x1e0 [nfnetlink_queue]
[<f7e6b1ab>] ? nfnetlink_rcv_msg+0x118/0x149 [nfnetlink]
[<f7e6b0b9>] ? nfnetlink_rcv_msg+0x26/0x149 [nfnetlink]
[<c0711903>] ? netlink_sendmsg+0x72/0x221
[<f7e6b093>] ? nfnetlink_rcv_msg+0x0/0x149 [nfnetlink]
[<c0711130>] ? netlink_rcv_skb+0x30/0x76
[<f7e6b08c>] ? nfnetlink_rcv+0x1b/0x22 [nfnetlink]
[<c0710f6f>] ? netlink_unicast+0xbe/0x119
[<c0711aa5>] ? netlink_sendmsg+0x214/0x221
[<c06edfad>] ? __sock_sendmsg+0x45/0x4e
[<c06ee254>] ? sock_sendmsg+0x93/0xa7
[<c0442bfc>] ? irq_exit+0x39/0x5c
[<c0409c05>] ? do_IRQ+0x86/0x9a
[<c0408df0>] ? common_interrupt+0x30/0x38
[<c06f625f>] ? verify_iovec+0x57/0x6c
[<c06ee676>] ? sys_sendmsg+0x187/0x1eb
[<c06ee4c2>] ? sockfd_lookup_light+0x16/0x43
[<c06ee4aa>] ? fput_light+0xc/0xe
[<c06ef6d7>] ? sys_recvfrom+0x102/0x121
[<c06fbf04>] ? dev_kfree_skb_any+0x27/0x32
[<f88c3dfb>] ? e1000_put_txbuf+0x50/0x65 [e1000e]
[<f88c3ee8>] ? e1000_clean_tx_irq+0xa7/0x1dc [e1000e]
[<c05a6680>] ? might_fault+0x19/0x1b
[<c05a68eb>] ? copy_to_user+0x2f/0x108
[<c05a6680>] ? might_fault+0x19/0x1b
[<c06efe80>] ? sys_socketcall+0x15e/0x1a5
[<c040ff01>] ? syscall_trace_leave+0xa5/0xb8
[<c0782bdc>] ? syscall_call+0x7/0xb
[<c0780000>] ? acpi_processor_add+0x1f/0x74b
Code: 57 0c 66 89 83 80 00 00 00 8b 96 00 02 00 00 0f b7 c0 c1 e0 07 01 d0 89 45 f0 8b 78 04 66 8b 43 7e 80 e4 cf 80 cc 20 66 89 43 7e <83> 3f
EIP: [<c06fbcdd>] dev_queue_xmit+0x256/0x3f4 SS:ESP 0068:eef7dbdc
CR2: 0000000000000000
---[ end trace 5e9db4f99c9e9021 ]---
Kernel panic - not syncing: Fatal exception in interrupt
Message from syslogd@machine Pid: 14113, comm: snort Tainted: G D 2.6.33.3-85.fc13.i686.PAE #1
Call Trace:
[<c0780b4f>] ? printk+0xf/0x18
[<c0780a8d>] panic+0x39/0xec
[<c0783c90>] oops_end+0x92/0xa1
[<c04261c1>] no_context+0x13e/0x148
[<c04262b7>] __bad_area_nosemaphore+0xec/0xf4
[<c0784e87>] ? do_page_fault+0x0/0x2fa
[<c04262cc>] bad_area_nosemaphore+0xd/0x10
[<c078501b>] do_page_fault+0x194/0x2fa
[<c0784e87>] ? do_page_fault+0x0/0x2fa
[<c07832df>] error_code+0x73/0x78
[<c06fbcdd>] ? dev_queue_xmit+0x256/0x3f4
[<c0722500>] ip_finish_output2+0x18e/0x1c6
[<c0722591>] ip_finish_output+0x59/0x5c
[<c0722819>] ip_output+0x74/0x79
[<c07216d5>] dst_output+0x9/0xb
[<c0713b12>] nf_reinject+0xa3/0xe6
[<f80ab427>] nfqnl_recv_verdict+0x1cf/0x1e0 [nfnetlink_queue]
[<f7e6b1ab>] nfnetlink_rcv_msg+0x118/0x149 [nfnetlink]
[<f7e6b0b9>] ? nfnetlink_rcv_msg+0x26/0x149 [nfnetlink]
[<c0711903>] ? netlink_sendmsg+0x72/0x221
[<f7e6b093>] ? nfnetlink_rcv_msg+0x0/0x149 [nfnetlink]
[<c0711130>] netlink_rcv_skb+0x30/0x76
[<f7e6b08c>] nfnetlink_rcv+0x1b/0x22 [nfnetlink]
[<c0710f6f>] netlink_unicast+0xbe/0x119
[<c0711aa5>] netlink_sendmsg+0x214/0x221
[<c06edfad>] __sock_sendmsg+0x45/0x4e
[<c06ee254>] sock_sendmsg+0x93/0xa7
[<c0442bfc>] ? irq_exit+0x39/0x5c
[<c0409c05>] ? do_IRQ+0x86/0x9a
[<c0408df0>] ? common_interrupt+0x30/0x38
[<c06f625f>] ? verify_iovec+0x57/0x6c
[<c06ee676>] sys_sendmsg+0x187/0x1eb
[<c06ee4c2>] ? sockfd_lookup_light+0x16/0x43
[<c06ee4aa>] ? fput_light+0xc/0xe
[<c06ef6d7>] ? sys_recvfrom+0x102/0x121
[<c06fbf04>] ? dev_kfree_skb_any+0x27/0x32
[<f88c3dfb>] ? e1000_put_txbuf+0x50/0x65 [e1000e]
[<f88c3ee8>] ? e1000_clean_tx_irq+0xa7/0x1dc [e1000e]
[<c05a6680>] ? might_fault+0x19/0x1b
[<c05a68eb>] ? copy_to_user+0x2f/0x108
[<c05a6680>] ? might_fault+0x19/0x1b
[<c06efe80>] sys_socketcall+0x15e/0x1a5
[<c040ff01>] ? syscall_trace_leave+0xa5/0xb8
[<c0782bdc>] syscall_call+0x7/0xb
[<c0780000>] ? acpi_processor_add+0x1f/0x74b
I upgraded the kernel to 2.6.39 and updated the e1000e driver on Fedora 13, which resolved the issue. I'm posting this answer in case it helps others (even after such a long time; sorry for that).

How can I work out which process/thread owns the resource that my program is hanging on

I have a user mode process which is hanging when calling NtClose. That NtClose is hanging while trying to acquire a lock in the kernel. I believe it's the lock to the handle table. Here's the kernel part of the stack:
THREAD fffffa800bd4fb50 Cid 277c.21d8 Teb: 000007fffff80000 Win32Thread: 0000000000000000 WAIT: (WrResource) KernelMode Non-Alertable
fffffa80047bad20 SynchronizationEvent
IRP List:
fffffa80049f49c0: (0006,0430) Flags: 00000404 Mdl: 00000000
Not impersonating
DeviceMap fffff8a000008bc0
Owning Process fffffa800c195060 Image: My_Service.exe
Attached Process N/A Image: N/A
Wait Start TickCount 455527 Ticks: 223 (0:00:00:03.478)
Context Switch Count 1703
UserTime 00:00:00.015
KernelTime 00:00:00.109
Win32 Start Address 0x000000013f509190
Stack Init fffff8800c3e0fb0 Current fffff8800c3e0790
Base fffff8800c3e1000 Limit fffff8800c3db000 Call 0
Priority 10 BasePriority 8 UnusualBoost 2 ForegroundBoost 0 IoPriority 2 PagePriority 5
Child-SP RetAddr : Args to Child : Call Site
fffff880`0c3e07d0 fffff800`02ccc972 : fffffa80`0bd4fb50 fffffa80`0bd4fb50 fffff880`00000000 00000000`00000003 : nt!KiSwapContext+0x7a
fffff880`0c3e0910 fffff800`02cddd8f : 00000000`00000000 fffff880`0af2d400 fffff880`00000068 fffff880`0af2d408 : nt!KiCommitThreadWait+0x1d2
fffff880`0c3e09a0 fffff800`02cb7086 : 00000000`00000000 fffffa80`0000001b 00000000`00000000 fffff880`009eb100 : nt!KeWaitForSingleObject+0x19f
fffff880`0c3e0a40 fffff800`02cdc1ac : ffffffff`fd9da600 fffffa80`047bad20 fffffa80`03e1d238 00000000`00000200 : nt!ExpWaitForResource+0xae
fffff880`0c3e0ab0 fffff880`016e6f88 : 00000000`00000000 fffff8a0`0d555010 fffff880`0af2d840 fffff8a0`0a71e576 : nt!ExAcquireResourceExclusiveLite+0x14f
fffff880`0c3e0b20 fffff880`01652929 : fffffa80`06fc72c0 fffffa80`049f49c0 fffff880`0af2d550 fffffa80`0bd4fb50 : Ntfs!NtfsCommonCleanup+0x2705
fffff880`0c3e0f30 fffff800`02ccea37 : fffff880`0af2d550 00000000`00000000 00000000`00000000 00000000`00000000 : Ntfs!NtfsCommonCleanupCallout+0x19
fffff880`0c3e0f60 fffff800`02cce9f8 : 00000000`00000000 00000000`00000000 fffff880`0c3e1000 fffff800`02ce2e42 : nt!KySwitchKernelStackCallout+0x27 (TrapFrame @ fffff880`0c3e0e20)
fffff880`0af2d420 fffff800`02ce2e42 : 00000000`0000277c 00000000`00000002 00000000`00000002 fffff880`042f8965 : nt!KiSwitchKernelStackContinue
fffff880`0af2d440 fffff880`016529a2 : fffff880`01652910 00000000`00000000 fffff880`0af2d800 00000000`00000000 : nt!KeExpandKernelStackAndCalloutEx+0x2a2
fffff880`0af2d520 fffff880`016f3894 : fffff880`0af2d5f0 fffff880`0af2d5f0 fffff880`0af2d5f0 fffff880`0af2d760 : Ntfs!NtfsCommonCleanupOnNewStack+0x42
fffff880`0af2d590 fffff880`01145bcf : fffff880`0af2d5f0 fffffa80`049f49c0 fffffa80`049f4da8 fffffa80`03ef5010 : Ntfs!NtfsFsdCleanup+0x144
fffff880`0af2d800 fffff880`011446df : fffffa80`04e239a0 00000000`00000000 fffffa80`048cb100 fffffa80`049f49c0 : fltmgr!FltpLegacyProcessingAfterPreCallbacksCompleted+0x24f
fffff880`0af2d890 fffff800`02fe3fef : fffffa80`049f49c0 fffffa80`0c195060 00000000`00000000 fffffa80`04aa93d0 : fltmgr!FltpDispatch+0xcf
fffff880`0af2d8f0 fffff800`02fd1fe4 : 00000000`00000000 fffffa80`0c195060 fffff880`01165cb0 fffff800`02c64000 : nt!IopCloseFile+0x11f
fffff880`0af2d980 fffff800`02fd1da1 : fffffa80`0c195060 fffffa80`00000001 fffff8a0`18385220 00000000`00000000 : nt!ObpDecrementHandleCount+0xb4
fffff880`0af2da00 fffff800`02fd2364 : 00000000`0000cae8 fffffa80`0c195060 fffff8a0`18385220 00000000`0000cae8 : nt!ObpCloseHandleTableEntry+0xb1
fffff880`0af2da90 fffff800`02cd61d3 : fffffa80`0bd4fb50 fffff880`0af2db60 00000001`3f64afd8 00000000`00000000 : nt!ObpCloseHandle+0x94
My question is: using WinDbg, how can I work out which other process or thread on the system has acquired this kernel resource? (By the way, I'm looking at a full system dump from a customer; I don't have this reproduced under a debugger.)
So the answer was to use !kdext*.locks. It showed that the thread above was deadlocked with a System thread belonging to one of Symantec's antivirus drivers.
The locks causing the problem here were kernel ERESOURCE locks. I've discovered that there are two versions of !locks: one for user-mode critical sections (!ntsdexts.locks) and the other for kernel-mode locks (!kdext*.locks).
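When the !locks output is long, it can help to filter it for resources that have waiters and list their owning threads. The sketch below does that with plain text parsing. It assumes the output format seen in kernel-mode !locks listings like the excerpt embedded in the code, and it treats thread entries marked <*> as owners. The function names are invented for illustration and are not part of any debugger API.

```python
# Filter kernel-mode !locks output for contended ERESOURCEs and their owners.
# Assumptions: the line layout matches the SAMPLE excerpt, and "<*>" marks an
# owning thread. Some copies of the output show '#' instead of '@' after
# "Resource", so the parser does not depend on that character.
import re
from typing import Dict, List

ADDRESS_RE = re.compile(r"0x[0-9a-fA-F]+")
OWNER_RE = re.compile(r"([0-9a-fA-F]{16})-\d+<\*>")
WAITERS_RE = re.compile(r"NumberOf(?:Shared|Exclusive)Waiters\s*=\s*(\d+)")

SAMPLE = """\
Resource @ 0xffffcf8ce08d0890 Exclusively owned
Contention Count = 49135
NumberOfSharedWaiters = 1
NumberOfExclusiveWaiters = 6
Threads: ffffcf8cf18e3080-01<*> ffffcf8ce3faf080-01
Threads Waiting On Exclusive Access:
ffffcf8ceb6ce080 ffffcf8ce1d20080 ffffcf8ce77f1080 ffffcf8ce92f4080
ffffcf8ce1d1f0c0 ffffcf8ced7c6080
KD: Scanning for held locks.
Resource @ 0xffffcf8ce08d0990 Shared 1 owning threads
Threads: ffffcf8cf18e3080-01<*>
"""


def parse_locks(text: str) -> List[Dict]:
    """Return one record per resource: address, owner threads, waiter count."""
    resources: List[Dict] = []
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Resource"):
            match = ADDRESS_RE.search(line)
            resources.append({
                "address": match.group(0) if match else None,
                "owners": [],
                "waiters": 0,
            })
        elif not resources:
            continue  # skip banner lines before the first resource
        elif line.startswith("Threads:"):
            resources[-1]["owners"] += OWNER_RE.findall(line)
        else:
            waiters = WAITERS_RE.match(line)
            if waiters:
                resources[-1]["waiters"] += int(waiters.group(1))
    return resources


def contended(text: str) -> List[Dict]:
    """Keep only resources that have at least one waiter."""
    return [r for r in parse_locks(text) if r["waiters"] > 0]


if __name__ == "__main__":
    for res in contended(SAMPLE):
        print(res["address"], "owners:", res["owners"], "waiters:", res["waiters"])
```

Each owner address reported this way can then be passed to !thread in the debugger to see what the owning thread is doing.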

Windows singly linked list (_SINGLE_LIST_ENTRY)

I'm just doing some debugging on a Windows 7 crash dump, and I've come across a singly-linked list that I'm not able to fully understand.
Here's the output from WinDBG:
dt _GENERAL_LOOKASIDE_POOL fffff80002a14800 -b
....
0x000 SingleListHead: _SINGLE_LIST_ENTRY
+0x000 Next: 0x0000000000220001
....
From what I've been reading, it seems that each singly linked list begins with a list head, which contains a pointer to the first element in the list, or null if the list is empty.
Microsoft states (MSDN article):
For a SINGLE_LIST_ENTRY that serves as a list entry, the Next member
points to the next entry in the list, or NULL if there is no next
entry in the list. For a SINGLE_LIST_ENTRY that serves as the list
header, the Next member points to the first entry in the list, or NULL
if the list is empty.
I'm 99% sure this list contains some entries, but I don't understand how the value of 0x0000000000220001 is supposed to be pointing to anything. This value certainly doesn't resolve to a valid page mapping, so I can only assume it's some kind of offset. However, I'm not sure.
If anyone could help shine some light on this, I'd appreciate it.
Thanks
UPDATE
I've just found a document (translated from Chinese) that seems to explain the structure a little more. If anyone could offer some input on it, I'd appreciate it.
Lookaside List article
What I'm actually looking at is a lookaside list that Windows should be using for the allocation of IRPs, here's the full output from WinDBG (values changed from original question):
lkd> !lookaside iopsmallirplookasidelist
Lookaside "" @ fffff80002a14800 "Irps"
Type = 0000 NonPagedPool
Current Depth = 0 Max Depth = 4
Size = 280 Max Alloc = 1120
AllocateMisses = 127 FreeMisses = 26
TotalAllocates = 190 TotalFrees = 90
Hit Rate = 33% Hit Rate = 71%
lkd> dt _general_lookaside fffff80002a14800 -b
ntdll!_GENERAL_LOOKASIDE
+0x000 ListHead : _SLIST_HEADER
+0x000 Alignment : 0x400001
+0x008 Region : 0xfffffa80`01e83b11
+0x000 Header8 : <unnamed-tag>
+0x000 Depth : 0y0000000000000001 (0x1)
+0x000 Sequence : 0y001000000 (0x40)
+0x000 NextEntry : 0y000000000000000000000000000000000000000 (0)
+0x008 HeaderType : 0y1
+0x008 Init : 0y0
+0x008 Reserved : 0y11111111111111111101010000000000000011110100000111011000100 (0x7fffea0007a0ec4)
+0x008 Region : 0y111
+0x000 Header16 : <unnamed-tag>
+0x000 Depth : 0y0000000000000001 (0x1)
+0x000 Sequence : 0y000000000000000000000000000000000000000001000000 (0x40)
+0x008 HeaderType : 0y1
+0x008 Init : 0y0
+0x008 Reserved : 0y00
+0x008 NextEntry : 0y111111111111111111111010100000000000000111101000001110110001 (0xfffffa8001e83b1)
+0x000 HeaderX64 : <unnamed-tag>
+0x000 Depth : 0y0000000000000001 (0x1)
+0x000 Sequence : 0y000000000000000000000000000000000000000001000000 (0x40)
+0x008 HeaderType : 0y1
+0x008 Reserved : 0y000
+0x008 NextEntry : 0y111111111111111111111010100000000000000111101000001110110001 (0xfffffa8001e83b1)
+0x000 SingleListHead : _SINGLE_LIST_ENTRY
+0x000 Next : 0x00000000`00400001
+0x010 Depth : 4
+0x012 MaximumDepth : 0x20
+0x014 TotalAllocates : 0xbe
+0x018 AllocateMisses : 0x7f
+0x018 AllocateHits : 0x7f
+0x01c TotalFrees : 0x5a
+0x020 FreeMisses : 0x1a
+0x020 FreeHits : 0x1a
+0x024 Type : 0 ( NonPagedPool )
+0x028 Tag : 0x73707249
+0x02c Size : 0x118
+0x030 AllocateEx : 0xfffff800`029c30e0
+0x030 Allocate : 0xfffff800`029c30e0
+0x038 FreeEx : 0xfffff800`029c30d0
+0x038 Free : 0xfffff800`029c30d0
+0x040 ListEntry : _LIST_ENTRY [ 0xfffff800`02a147c0 - 0xfffff800`02a148c0 ]
+0x000 Flink : 0xfffff800`02a147c0
+0x008 Blink : 0xfffff800`02a148c0
+0x050 LastTotalAllocates : 0xbe
+0x054 LastAllocateMisses : 0x7f
+0x054 LastAllocateHits : 0x7f
+0x058 Future :
[00] 0
[01] 0
lkd> !slist fffff80002a14800
SLIST HEADER:
+0x000 Header16.Sequence : 40
+0x000 Header16.Depth : 1
SLIST CONTENTS:
fffffa8001e83b10 0000000000000000 0000000000000000
0000000000000404 0000000000000000
Sorry if some of the formatting is lost. Essentially, this should be a lookaside list containing chunks that are all the same size, 0x118 (sizeof(_IRP) + sizeof(_IO_STACK_LOCATION)).
However, I'm not entirely sure how the list is actually put together. I don't know whether this should be a singly linked list of memory chunks, or whether I'm reading all of it incorrectly.
For the small IRP list on Win7 x86 RTM:
lkd> !lookaside iopsmallirplookasidelist
Lookaside "" @ 82d5ffc0 "Irps"
....
lkd> dt _SINGLE_LIST_ENTRY 82d5ffc0
nt!_SINGLE_LIST_ENTRY
+0x000 Next : 0x86737e30 _SINGLE_LIST_ENTRY
....
lkd> !pool 0x86737e30
Pool page 86737e30 region is Nonpaged pool
*86737e28 size: a0 previous size: 48 (Allocated) *Irp
Pooltag Irp : Io, IRP packets
The size of the memory chunk is 0xa0 bytes:
lkd> ?? sizeof(_pool_header)+sizeof(_single_list_entry)+sizeof(_irp)+sizeof(_io_stack_location)
unsigned int 0xa0
which includes the pool header, the list pointer, the IRP, and one stack location.
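As a cross-check of the arithmetic above, here is a minimal Python sketch. The individual structure sizes are assumptions (commonly reported Windows 7 values, not read from either dump); only the totals 0xa0 and 0x118 and the two pool addresses come from the output shown in this thread.

```python
# Structure sizes in bytes. ASSUMED Windows 7 values, not taken from the dump;
# confirm on the target with e.g. `?? sizeof(nt!_IRP)` in the debugger.
SIZES_X86 = {
    "_POOL_HEADER": 0x08,
    "_SINGLE_LIST_ENTRY": 0x04,
    "_IRP": 0x70,
    "_IO_STACK_LOCATION": 0x24,
}
SIZES_X64 = {"_IRP": 0xD0, "_IO_STACK_LOCATION": 0x48}

# x86: the sum evaluated by the `??` expression in the answer.
chunk_x86 = sum(SIZES_X86.values())

# x64: the Size field of the lookaside list in the question (IRP + one stack location).
lookaside_size_x64 = SIZES_X64["_IRP"] + SIZES_X64["_IO_STACK_LOCATION"]

# !pool reports the chunk at 86737e28 while Next is 86737e30:
# the list entry sits immediately after the pool header.
header_gap = 0x86737E30 - 0x86737E28

print(hex(chunk_x86), hex(lookaside_size_x64), hex(header_gap))
```

With these assumed sizes the totals agree with the debugger output on both architectures, and the gap between the two addresses equals the assumed x86 pool header size.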
Minor update:
Author: Tarjei Mandt (aka @kernelpool)
In the _GENERAL_LOOKASIDE structure, SingleListHead.Next points to the first free pool chunk on the singly-linked lookaside list. The size of the lookaside list is limited by the value of Depth, periodically adjusted by the balance set manager according to the number of hits and misses on the lookaside list. Hence, a frequently used lookaside list will have a larger Depth value than an infrequently used list. The initial Depth is 4 (nt!ExMinimumLookasideDepth), with the maximum being MaximumDepth (256)...more
SINGLE_LIST_ENTRY implements intrusive linked lists. Look at struct list_head, which offers similar functionality within the Linux kernel.
As for the .Next member, it really is a pointer to a SINGLE_LIST_ENTRY that is most likely embedded inside another struct.
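In the x64 dump from the question, however, SingleListHead sits at offset +0x000 of the same union as the _SLIST_HEADER, so its Next field overlays the Depth/Sequence quadword rather than a pointer. The following is a minimal Python sketch of how the two quadwords split up. The bit widths are the ones shown in the Header16 part of the dt output; the function and field names are hypothetical, chosen only for illustration.

```python
def decode_slist_header16(low_qword, high_qword):
    """Split the two quadwords of an x64 SLIST_HEADER (Header16 layout)."""
    return {
        "Depth": low_qword & 0xFFFF,        # bits 0..15 of the first quadword
        "Sequence": low_qword >> 16,        # bits 16..63 of the first quadword
        "HeaderType": high_qword & 0x1,     # bit 0 of the second quadword
        "Init": (high_qword >> 1) & 0x1,    # bit 1 of the second quadword
        "NextEntry": high_qword >> 4,       # bits 4..63 of the second quadword
        # Entries are 16-byte aligned on x64, so the low 4 bits of the real
        # pointer are zero and are reused for the flag bits above.
        "FirstEntry": (high_qword >> 4) << 4,
    }


# Alignment and Region values copied from the dt output in the question.
fields = decode_slist_header16(0x0000000000400001, 0xFFFFFA8001E83B11)
for name, value in fields.items():
    print(f"{name:10} = {value:#x}")
```

Read this way, 0x400001 is Depth 1 with Sequence 0x40, and the first entry is fffffa8001e83b10, which is the address !slist prints under SLIST CONTENTS. By the same split, the 0x220001 from the original question would be Depth 1 with Sequence 0x22.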

Windows PDB file contains multiple symbols for same address? [duplicate]

Possible Duplicate:
Why two functions print the same address?
I am working with PDB symbol files for an application which processes them (via the DbgHelp API). I have come across a strange issue where a PDB file will contain multiple different public symbol entries for the same address!
For example, using the latest Microsoft PDB file for kernel32.dll (wow64) on Windows 7 (x64), we can dump the following information and see 31 different entries for the same address 0x10b1a6e:
C:\Program Files (x86)\Windows Kits\8.0\Debuggers\x86>dbh.exe c:\symbols\wkernel32.pdb\D08F1E131D1F4D97B4AB2F64E00CFC8B2\wkernel32.pdb m 10b1a6e
index address name
7a 10b1a6e : MFInitAttributesFromBlob
179 10b1a6e : MFCreateSourceReaderFromURL
2fc 10b1a6e : MFCreateASFMediaSinkActivate
5b6 10b1a6e : MFCreateWMVEncoderActivate
61d 10b1a6e : MFAddPeriodicCallback
64c 10b1a6e : MFPutWorkItem
825 10b1a6e : MFCreateAlignedMemoryBuffer
c12 10b1a6e : MFGetAttributesAsBlob
d26 10b1a6e : MFCreateMFVideoFormatFromMFMediaType
f1a 10b1a6e : MFFrameRateToAverageTimePerFrame
1129 10b1a6e : MFCreateProxyLocator
1277 10b1a6e : MFSerializeAttributesToStream
12b3 10b1a6e : MFEnumDeviceSources
146d 10b1a6e : MFCreateWMAEncoderActivate
164c 10b1a6e : MFBeginUnregisterWorkQueueWithMMCSS
1bfc 10b1a6e : MFCreateSourceReaderFromMediaSource
1d25 10b1a6e : MFInitMediaTypeFromWaveFormatEx
1d72 10b1a6e : MFGetStrideForBitmapInfoHeader
1efb 10b1a6e : CopyPropertyStore
1f8d 10b1a6e : MFDeserializePresentationDescriptor
1fb5 10b1a6e : MFCreateSampleGrabberSinkActivate
1fe4 10b1a6e : MFCreateASFStreamingMediaSinkActivate
23a3 10b1a6e : MFDeserializeAttributesFromStream
24c0 10b1a6e : MFConvertFromFP16Array
26f7 10b1a6e : MFSerializePresentationDescriptor
2877 10b1a6e : MFCreatePresentationDescriptor
2ab7 10b1a6e : MFCreateSourceReaderFromByteStream
2b4a 10b1a6e : MFGetWorkQueueMMCSSClass
2e08 10b1a6e : MFInitMediaTypeFromMFVideoFormat
2ef0 10b1a6e : MFCreateSinkWriterFromMediaSink
2eff 10b1a6e : MFConvertToFP16Array
The above example is one of many addresses containing duplicates. Normally there is one symbol entry at any address. It simply doesn't make sense to have multiple symbol entries for the same address AFAIK!!
Can anybody enlighten me as to:
Why this is happening?
Can these duplicate entries be resolved to their unique locations?
Thanks.
There are multiple symbols for the same address because all of those functions are identical. In your case, they are all functions of the form
HRESULT MFBlahBlahBlah(...)
{
return E_NOTIMPL;
}
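Since identical bodies end up at one address (plausibly because the linker folded the identical functions into a single copy, though that mechanism is an assumption here), there is no separate location to recover; a tool can only treat the names as aliases of one another. A minimal Python sketch of that grouping follows, assuming the `index address : name` line format of the dbh.exe output shown in the question; `group_by_address` is a hypothetical helper, not part of DbgHelp.

```python
from collections import defaultdict


def group_by_address(dbh_lines):
    """Map each address to the list of symbol names reported at it."""
    groups = defaultdict(list)
    for line in dbh_lines:
        left, sep, name = line.partition(":")
        parts = left.split()
        if not sep or len(parts) != 2:
            continue  # header line or anything not shaped "index address : name"
        try:
            address = int(parts[1], 16)
        except ValueError:
            continue  # address column is not hexadecimal
        groups[address].append(name.strip())
    return dict(groups)


# A few lines copied from the listing in the question.
sample = [
    "index address name",
    "7a 10b1a6e : MFInitAttributesFromBlob",
    "64c 10b1a6e : MFPutWorkItem",
    "1efb 10b1a6e : CopyPropertyStore",
]
aliases = group_by_address(sample)
for address, names in aliases.items():
    print(f"{address:#x}: {len(names)} names, e.g. {names[0]}")
```

A symbol consumer built this way would report every alias for an address, or pick one by its own policy, instead of expecting a one-to-one mapping.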

Resources