Analyzing a BSOD (WinDbg) beyond !analyze -v - windows

I'm trying to analyze a BSOD (0xFC,ATTEMPTED_EXECUTE_OF_NOEXECUTE_MEMORY) which occurred on a Windows 10 machine during a restart, but I'm stuck.
!analyze -v gave useful information like:
DEFAULT_BUCKET_ID: WIN8_DRIVER_FAULT
PROCESS_NAME: System
MODULE_NAME: mrxsmb
IMAGE_NAME: mrxsmb.sys
FAILURE_BUCKET_ID: 0xFC_mrxsmb!MRxSmbQueryLbfoTeamCapability
BUCKET_ID: 0xFC_mrxsmb!MRxSmbQueryLbfoTeamCapability
PRIMARY_PROBLEM_CLASS: 0xFC_mrxsmb!MRxSmbQueryLbfoTeamCapability
# Child-SP RetAddr Call Site
00 ffffd000`20505ba8 fffff802`a03e1a7c nt!KeBugCheckEx
01 ffffd000`20505bb0 fffff802`a0319229 nt! ?? ::FNODOBFM::`string'+0x1377c
02 ffffd000`20505bf0 fffff802`a031a661 nt!MiSystemFault+0x7e9
03 ffffd000`20505ce0 fffff802`a03c8bbc nt!MmAccessFault+0x5f1
04 ffffd000`20505e40 fffff800`1cf70030 nt!KiPageFault+0x13c
05 ffffd000`20505fd8 fffff802`a072fea2 ndis!_NULL_IMPORT_DESCRIPTOR+0x558
06 ffffd000`20505fe0 fffff802`a07300e6 nt!IopXxxControlFile+0x1162
07 ffffd000`205061b0 fffff802`a03ca1a3 nt!NtDeviceIoControlFile+0x56
08 ffffd000`20506220 fffff802`a03c2860 nt!KiSystemServiceCopyEnd+0x13
09 ffffd000`20506428 fffff800`1d7df2aa nt!KiServiceLinkage
0a ffffd000`20506430 fffff800`1d818b5b mrxsmb!MRxSmbQueryLbfoTeamCapability+0x16a
0b ffffd000`20506570 fffff800`1cbc1c7d mrxsmb!MRxSmbPnPBindingHandler+0x1ab
0c ffffd000`205066e0 fffff800`1cbc2194 TDI!TdiNotifyNewPnpClient+0x12d
0d ffffd000`20506710 fffff800`1cbc1db0 TDI!TdiExecuteRequest+0x2d4
0e ffffd000`20506760 fffff800`1cbc184c TDI!TdiHandleSerializedRequest+0x120
0f ffffd000`205067d0 fffff800`1d7e0b96 TDI!TdiRegisterPnPHandlers+0x12c
10 ffffd000`20506800 fffff800`1d81ab71 mrxsmb!MRxSmbStartTransport+0xfe
11 ffffd000`20506890 fffff800`1d228f2f mrxsmb!MRxSmbStart+0x51
12 ffffd000`205068e0 fffff800`1d816faf rdbss!RxStartMinirdr+0xef
13 ffffd000`20506930 fffff800`1d7d366e mrxsmb!MRxSmbDevFcbXXXControlFile+0x1eb
14 ffffd000`205069a0 fffff800`1d2567de mrxsmb!SmbShellDevFcbXXXControlFile+0xe
15 ffffd000`205069d0 fffff800`1d256665 rdbss!RxXXXControlFileCallthru+0xb6
16 ffffd000`20506a00 fffff800`1d2694d9 rdbss!RxCommonDevFCBFsCtl+0xa5
17 ffffd000`20506a50 fffff802`a0367c11 rdbss!RxFspDispatch+0xe9
18 ffffd000`20506b10 fffff802`a034f399 nt!IopProcessWorkItem+0x81
19 ffffd000`20506b80 fffff802`a02c3b35 nt!ExpWorkerThread+0xe9
1a ffffd000`20506c10 fffff802`a03c4b16 nt!PspSystemThreadStartup+0x41
1b ffffd000`20506c60 00000000`00000000 nt!KiStartSystemThread+0x16
This information gave me some idea of what went wrong:
The source of the problem might be in the mrxsmb!MRxSmbQueryLbfoTeamCapability function according to WinDbg.
MRxSMB is the so called "SMB Mini-Redirector". A redirector is a component which is part of a remote file system driver. The rdbss stuff is also part of this redirector architecture which MRxSMB is built on.
The TDI stuff stands for "Transport Driver Interface" and is according to the Windows Internals book some sort of legacy "kernel-mode device driver that implements the kernel-mode portion of a networking API's implementation".
From that I assume Windows tries to access a remote filesystem via SMB. I think it actually tries to access a network drive which is not accessible.
But I'd like to understand why this causes a BSOD.
My questions are:
How does WinDbg know that the problem starts with mrxsmb!MRxSmbQueryLbfoTeamCapability+0x16a and doesn't occur later?
ndis!_NULL_IMPORT_DESCRIPTOR looks suspicious. According to google such a descriptor marks the end of a loaded library. But why does it then show up as function here?
If I take the return address of ndis!_NULL_IMPORT_DESCRIPTOR (fffff802`a072fea2) and go some bytes back in memory I get this:
1: kd> u fffff802`a072fe90 fffff802`a072fea2
nt!IopXxxControlFile+0x1150:
fffff802`a072fe90 eb13 jmp nt!IopXxxControlFile+0x1165 (fffff802`a072fea5)
fffff802`a072fe92 488bd7 mov rdx,rdi
fffff802`a072fe95 488b8c2488000000 mov rcx,qword ptr [rsp+88h]
fffff802`a072fe9d e8fe8dc1ff call nt!IofCallDriver (fffff802`a0348ca0)
fffff802`a072fea2 448be0 mov r12d,eax
Is my assumption correct that the ndis!_NULL_IMPORT_DESCRIPTOR+0x558 frame is actually entered through nt!IofCallDriver, since that call is the last instruction before the return address of ndis!_NULL_IMPORT_DESCRIPTOR+0x558? If so, why does this function not show up in the stack trace?
How can I actually get function parameters? From my understanding I can't because the calling convention is that the first 4 parameters are passed in registers rather than the stack and I only have the register contents of the last operation?
How to proceed?
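The five bytes at fffff802`a072fe9d in the disassembly above can be decoded by hand to check where the call goes and where it returns. The following is a minimal sketch in Python; the function name decode_rel32_call is made up for illustration, and the address and bytes are copied from the `u` output in the question.

```python
import struct

def decode_rel32_call(address: int, raw: bytes) -> tuple[int, int]:
    """Decode an x86-64 near call (opcode E8 followed by a rel32).

    Returns (target, return_address). The displacement is relative to the
    address of the next instruction, which is also the return address.
    """
    if len(raw) != 5 or raw[0] != 0xE8:
        raise ValueError("not an E8 rel32 call")
    (displacement,) = struct.unpack("<i", raw[1:])  # signed, little-endian
    return_address = address + 5
    target = (return_address + displacement) & 0xFFFFFFFFFFFFFFFF
    return target, return_address

# Address and instruction bytes taken from the question's disassembly.
target, ret = decode_rel32_call(0xFFFFF802A072FE9D, bytes.fromhex("e8fe8dc1ff"))
print(hex(target), hex(ret))
```

The decoded return address equals the RetAddr shown for frame 05, and the decoded target equals the address WinDbg prints for nt!IofCallDriver, which is consistent with the assumption that frame 05 was reached through that call.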

Related

Why is CreateFileA succeeding in Windows XP and failing in Windows 10?

I'm working through a reverse engineering exercise using olly.
00402D71 |. 6A 00 PUSH 0 ; /hTemplateFile = NULL
00402D73 |. 68 80000000 PUSH 80 ; |Attributes = NORMAL
00402D78 |. 6A 03 PUSH 3 ; |Mode = OPEN_EXISTING
00402D7A |. 6A 00 PUSH 0 ; |pSecurity = NULL
00402D7C |. 6A 00 PUSH 0 ; |ShareMode = 0
00402D7E |. 68 00000080 PUSH 80000000 ; |Access = GENERIC_READ
00402D83 |. FF75 F0 PUSH DWORD PTR SS:[EBP-10] ; |C:\Windows\system32\eLearnRE#5.dat
00402D86 |. E8 DF642300 CALL <JMP.&KERNEL32.CreateFileA> ; \CreateFileA
On Windows 10, I created the eLearnRE#5.dat file and verified that NTFS permissions make it readable. But even like that I get 0xFFFFFFFF with ERROR_FILE_NOT_FOUND (00000002) as a result of the CALL CreateFileA.
This is what I tried already (without success)
Copy and pasted the name from ollydbg to make sure I don't have any typos
Compared the two filenames letter by letter in notepad to check for typos
Patched the process memory to try reading an existing file in C:\temp\test.txt (just in case c:\windows\System32\eLearnRE#5.dat had some special restrictions because it's in a system folder)
Put some data into the file just in case it had trouble handling a blank file for some reason.
Ran olly as Administrator
This works perfectly fine in Windows XP, but not in Windows 10. Why?
The credit for the answer goes to Hans Passant for https://learn.microsoft.com/en-us/windows/win32/winprog64/file-system-redirector
On 64-bit Windows, accesses by 32-bit applications to C:\Windows\System32 are redirected to C:\Windows\SysWOW64.
As soon as I moved the file to C:\Windows\SysWOW64, things started to work as expected.
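To make the redirection concrete, here is a rough Python sketch of the path mapping a 32-bit process sees on 64-bit Windows. The helper name wow64_redirect is hypothetical, the default Windows directory is an assumption, and the sketch deliberately ignores the subdirectories that the documentation lists as exempt from redirection.

```python
import ntpath

def wow64_redirect(path: str, windows_dir: str = r"C:\Windows") -> str:
    """Return the path a 32-bit process would really open on 64-bit Windows.

    Simplified model: anything under <windows_dir>\\System32 is mapped to
    <windows_dir>\\SysWOW64; exempt subdirectories are not modelled.
    """
    system32 = ntpath.join(windows_dir, "System32")
    syswow64 = ntpath.join(windows_dir, "SysWOW64")
    normalized = ntpath.normpath(path)
    # Windows paths are case-insensitive, so compare case-folded forms.
    folded = ntpath.normcase(normalized)
    prefix = ntpath.normcase(system32)
    if folded == prefix or folded.startswith(prefix + "\\"):
        return syswow64 + normalized[len(system32):]
    return path

print(wow64_redirect(r"C:\Windows\system32\eLearnRE#5.dat"))
```

With this model, the file the 32-bit exercise binary asks for in System32 is looked up in SysWOW64, which matches the fix described above.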

Debugging process deadlock (LdrpDrainWorkQueue/LdrpLoadCompleteEvent)

All!
I'm debugging a rather strange case of a process hanging and running out of memory, using a standard Windows crash dump with WinDbg. Evidently it runs out of address space because too many threads are created (it is a 32-bit process), and I'm trying to figure out what is wrong with thread initialization (see callstack #3 below). Besides threads with callstacks that are typical for this program, it has a handful of threads with callstacks of three types:
1)
00 02cefb08 77544413 02fc024c 00000000 02cefb8c ntdll_774f0000!NtWaitForAlertByThreadId+0xc
01 02cefb28 7754434d 00000000 00000000 ffffffff ntdll_774f0000!RtlpWaitOnAddressWithTimeout+0x33
02 02cefb6c 7754423f 00000004 00000000 00000000 ntdll_774f0000!RtlpWaitOnAddress+0xa5
03 02cefba8 7752a605 02fc0000 02fc0000 02fc04b0 ntdll_774f0000!RtlpWaitOnCriticalSection+0xaa
04 02cefbc8 7752a525 02fc0248 02cefc88 77533844 ntdll_774f0000!RtlpEnterCriticalSectionContended+0xd5
05 02cefbd4 77533844 02fc0248 62da3da7 02fc04b0 ntdll_774f0000!RtlEnterCriticalSection+0x45
06 02cefc88 77533688 02fc04b0 02fc04b8 00000007 ntdll_774f0000!RtlpFreeHeap+0x174
07 02cefcd8 110d27fc 02fc0000 00000000 02fc04b8 ntdll_774f0000!RtlFreeHeap+0x758
...
These threads are stuck behind critical section 02fc024c, which is held by a thread that no longer exists, and it is quite hard to figure out what happened to it.
There are some threads that try to end normally, but are stuck in the LdrpDrainWorkQueue:
2)
# ChildEBP RetAddr Args to Child
00 05e5fd54 77527631 00000064 00000000 00000000 ntdll_774f0000!NtWaitForSingleObject+0xc
01 05e5fd78 7752b105 65f13f5f 00404e7c 00000000 ntdll_774f0000!LdrpDrainWorkQueue+0xbd
02 05e5fe70 7755179c 00404e7c 00404e7c 1086eb50 ntdll_774f0000!LdrShutdownThread+0x85
03 05e5ff40 00404efe 00000000 0042cef4 0042cefc ntdll_774f0000!RtlExitUserThread+0x4c
04 05e5ff6c 00404ea6 05e5ffcc 004049b8 05e5ff80 abc!EndThread+0x6
05 05e5ff80 743962c4 1086eb50 743962a0 941a355e abc!ThreadWrapper+0x2a
06 05e5ff94 77550779 1086eb50 65f13ef3 00000000 kernel32!BaseThreadInitThunk+0x24
07 05e5ffdc 77550744 ffffffff 77573606 00000000 ntdll_774f0000!__RtlUserThreadStart+0x2f
08 05e5ffec 00000000 00404e7c 1086eb50 00000000 ntdll_774f0000!_RtlUserThreadStart+0x1b
Also, the dump shows about 1400 threads at a very early stage of initialization that were created during the last 5 minutes of the process's life, with a callstack like:
3)
# ChildEBP RetAddr Args to Child
00 0ed1fba4 77527631 00000064 00000000 00000000 ntdll_774f0000!NtWaitForSingleObject+0xc
01 0ed1fbcc 7752b586 6ec53d9b ffffffff 1ed9d000 ntdll_774f0000!LdrpDrainWorkQueue+0xbd
02 0ed1fcb4 77557d86 6ec53c27 00000000 00000000 ntdll_774f0000!LdrpInitializeThread+0x8d
03 0ed1fd08 77557ce0 00000000 00000000 0ed1fd24 ntdll_774f0000!_LdrpInitialize+0x6a
04 0ed1fd10 00000000 0ed1fd24 774f0000 00000000 ntdll_774f0000!LdrInitializeThunk+0x10
These threads are also waiting in the LdrpDrainWorkQueue for event LdrpLoadCompleteEvent to be signalled.
This event is related to the parallel loader (for some reference, see the first answer to this question from RbMm; a somewhat similar yet different situation is described here). The event is created during process initialization and signalled after parallel DLL loading has finished, so that all LdrpInitializeThread calls can traverse the DllMains and signal THREAD_ATTACH. But I don't understand why it is in the non-signalled state in a process that has been running for weeks. Does the parallel loader work on LoadLibrary as well, so that LdrpLoadCompleteEvent gets reset? I couldn't find it in the disassembly.
In any case, I'm trying to understand why the process developed such strange callstacks before it was forcefully terminated. I could imagine that some thread began loading a DLL, which caused LdrpLoadCompleteEvent to be reset; then some thread holding the heap lock died in a bad manner, so DLL loading could not complete and LdrpLoadCompleteEvent was never signalled; hence no new threads could be initialized. However, there is no thread in the dump that is loading a DLL.
Any insight/hint regarding how such callstacks could have been developed, or what else I could do to squeeze more info from the dump, is welcome.
Thank you!
Your fundamental problem is architectural ... a legendary problem known as "thrashing."
Your system is probably designed with what I refer to as the "flaming-arrow approach." Whenever a new request comes in, "just light another flaming arrow and throw it into the air." Unfortunately you just can't do that.
The permanent solution to your problem unfortunately will never be solved by a debugger: it will require a redesign.
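One common alternative to starting a new thread per request is a fixed-size worker pool. The sketch below is only an illustration in Python of that general idea, not necessarily the redesign the answer has in mind; handle_request, serve, and the pool size of 8 are all made-up placeholders.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

MAX_WORKERS = 8  # assumed cap; tune for the real workload
worker_ids = set()
worker_ids_lock = threading.Lock()

def handle_request(request_id: int) -> int:
    """Placeholder for the real per-request work."""
    with worker_ids_lock:
        worker_ids.add(threading.get_ident())  # record which thread ran us
    return request_id * 2

def serve(requests, max_workers: int = MAX_WORKERS) -> list[int]:
    # Requests queue up inside the executor instead of each getting a thread,
    # so the thread count stays bounded no matter how many requests arrive.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(handle_request, requests))

results = serve(range(1000))
print(len(results), "requests handled by", len(worker_ids), "threads")
```

A burst of 1000 requests is handled by at most MAX_WORKERS threads here, so the address space consumed by thread stacks cannot grow without bound.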

Replace bootloader on sama5d3 from the running linux system

I'd like to replace the first stage bootloader in the nand flash on a sama5d36 based system running 4.1.0-linux4sam_5.1 and buildroot-2016.02.
I can replace the kernel image with flashcp just fine, but when I try it with the bootloader, flashcp runs without errors, but the system doesn't boot afterwards, stays at the ROMBOOT prompt.
buildroot:~# flashcp -v at91bootstrap.bin /dev/mtd0
Erasing block: 1/1 (100%)
Writing kb: 14/14 (100%)
Verifying kb: 14/14 (100%)
buildroot:~# reboot
[...]
Sent SIGKILL to all processes
Requesting system reboot
�RomBOOTRestarting system
Then I can write the same bootloader image with sam-ba, and it will boot, so the image is good. How can it be flashed in Linux, without user intervention?
There should be a 208 byte header preceding the actual boot code at the beginning of the flash.
From the SAMA5D3 Datasheet (that I should have read before posting the question)
After Initialization and Reset command, the Boot Program reads the first page without an ECC check, to determine if the NAND parameter header is present. The header is made of 52 times the same 32-bit word (for redundancy reasons) which must contain NAND and PMECC parameters used to correctly perform the read of the rest of the data in the NAND.
The header is of course there when I dump the contents of the boot sector
buildroot:~# hd < /dev/mtd0 | head -4
00000000 05 24 90 c0 05 24 90 c0 05 24 90 c0 05 24 90 c0 |.$...$...$...$..|
*
000000d0 0e 00 00 ea 05 00 00 ea 05 00 00 ea 05 00 00 ea |................|
000000e0 05 00 00 ea cc 3b 00 00 06 00 00 ea 06 00 00 ea |.....;..........|
The first four bytes are repeated over and over, and the ARM jump table begins at offset 0xD0 (= 208 = 52 * 4).
sam-ba takes care of this header when it writes the boot sector, but the Linux mtd driver and flashcp treat it as ordinary data, so I have to supply it myself.
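Supplying the header amounts to prepending 52 copies of one 32-bit word to the boot image before writing it. Here is a minimal Python sketch of that step; the function name add_nand_header is made up, and the word value is simply the one visible in the hex dump above (bytes 05 24 90 c0 read as little-endian), which encodes this particular board's NAND and PMECC parameters and should not be assumed valid elsewhere.

```python
import struct

HEADER_WORDS = 52  # the datasheet quote says the word is repeated 52 times

def add_nand_header(boot_image: bytes, header_word: int) -> bytes:
    """Prepend the 208-byte NAND parameter header to a raw boot image."""
    header = struct.pack("<I", header_word) * HEADER_WORDS
    return header + boot_image

# Assumption: word taken from the dump in this post, not a general default.
HEADER_WORD_FROM_DUMP = 0xC0902405

# Stand-in for the contents of at91bootstrap.bin (first jump-table entry only).
image = add_nand_header(bytes.fromhex("0e0000ea"), HEADER_WORD_FROM_DUMP)
print(len(image), image[:8].hex())
```

The sketch only covers building the file contents; the resulting bytes would then be written with flashcp in place of the bare at91bootstrap.bin.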

What information can I glean from a dump of a ghc Haskell process on Windows?

One of the users of my command line application has reported what appears to be an infinite loop. They helpfully took a dump of the process (via Task Manager) while it was in this state and sent it to me.
I'm not sure how to get useful information out of this dump. My normal technique of windbg -z the-dump-file.dmp -y releases\v5.0.0 -i releases\v5.0.0 doesn't give me much information that I know how to interpret. Are there ghc-specific tools I can use instead?
Moving forward, are there build options I should add or other things I should do to my release process to make this kind of post-mortem debugging more fruitful?
Here's an example of the stacks that I'm seeing. Not much useful info, especially for someone used to debugging C/C++ code in WinDbg. :-)
0 Id: 112dc.cc18 Suspend: 1 Teb: 00000000`00341000 Unfrozen
*** ERROR: Module load completed but symbols could not be loaded for gbc.exe
# Child-SP RetAddr Call Site
00 00000000`01b7d8d0 00000000`01049f71 gbc+0xc5676e
01 00000000`01b7d930 00000000`0104b5b4 gbc+0xc49f71
02 00000000`01b7d9a0 00000000`0104c644 gbc+0xc4b5b4
03 00000000`01b7da60 00000000`0104c1fa gbc+0xc4c644
04 00000000`01b7dab0 00000000`0042545b gbc+0xc4c1fa
05 00000000`01b7db30 00000000`011c40a0 gbc+0x2545b
06 00000000`01b7db38 00000000`0535bee1 gbc+0xdc40a0
07 00000000`01b7db40 00000000`010ffd80 0x535bee1
08 00000000`01b7db48 00000000`0535bee1 gbc+0xcffd80
09 00000000`01b7db50 00007ffb`3581fb01 0x535bee1
0a 00000000`01b7db58 00007ffb`3581b850 imm32!?MSCTF_NULL_THUNK_DATA_DLB+0x2e9
0b 00000000`01b7db60 00000000`00000010 imm32!CtfImmGetCompatibleKeyboardLayout
0c 00000000`01b7db68 00000000`00000000 0x10
1 Id: 112dc.d324 Suspend: 1 Teb: 00000000`00349000 Unfrozen
# Child-SP RetAddr Call Site
00 00000000`05c2fc48 00007ffb`36441563 ntdll!ZwWaitForWorkViaWorkerFactory+0x14
01 00000000`05c2fc50 00007ffb`34172774 ntdll!TppWorkerThread+0x293
02 00000000`05c2ff60 00007ffb`36470d61 kernel32!BaseThreadInitThunk+0x14
03 00000000`05c2ff90 00000000`00000000 ntdll!RtlUserThreadStart+0x21
2 Id: 112dc.11b48 Suspend: 1 Teb: 00000000`0034b000 Unfrozen
# Child-SP RetAddr Call Site
00 00000000`0642dd38 00007ffb`32f2988f ntdll!ZwWaitForSingleObject+0x14
01 00000000`0642dd40 00000000`00ffca15 KERNELBASE!WaitForSingleObjectEx+0x9f
02 00000000`0642dde0 00000000`00000000 gbc+0xbfca15
Some resources that might be useful. (If there are more up-to-date ones, I would like to see them myself.)
https://ghc.haskell.org/trac/ghc/wiki/Debugging/CompiledCode
https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/debug-info.html
https://wiki.haskell.org/Debugging
A few important nuggets:
The runtime flag +RTS -? will tell you what runtime flags add debugging information. These will start with +RTS -D. For example, +RTS -DS turns on a number of runtime assertions and sanity checks.
The strange names you see are encoded in something called Z-encoding. This is defined at https://ghc.haskell.org/trac/ghc/browser/ghc/compiler/cmm/CLabel.hs.
If you can recompile the code with debugging symbols on and threading off, and still reproduce the bug, you can set breakpoints (or hit control-C) inside the debugger and backtrace from there. You can examine memory with a command like print/a 0x006eb0c0 (although you seem to be using 64-bit pointers). You can see the assembly-language instruction that crashed with disassemble.
You need to use the -ddump-stg compile flag to see what the variable names mean, because that is the last phase of the transformation before the program is assembled, and the variable names you see in the debugger correspond to the ones here.
You can instrument the code with Debug.Trace.
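For the Z-encoding point above, a small decoder can make symbol names in a debugger easier to read. The Python sketch below handles only a subset of the two-character escapes (no tuple or Unicode escapes), the function name zdecode is made up, and the sample symbol is just an illustrative name in the usual package_Module_name_closure shape.

```python
# Subset of GHC's Z-encoding escapes; tuple and Unicode forms are omitted.
Z_TABLE = {
    "ZL": "(", "ZR": ")", "ZM": "[", "ZN": "]", "ZC": ":", "ZZ": "Z",
    "zz": "z", "za": "&", "zb": "|", "zc": "^", "zd": "$", "ze": "=",
    "zg": ">", "zh": "#", "zi": ".", "zl": "<", "zm": "-", "zn": "!",
    "zp": "+", "zq": "'", "zr": "\\", "zs": "/", "zt": "*", "zu": "_",
    "zv": "%",
}

def zdecode(name: str) -> str:
    """Decode the common two-character Z-encoding escapes in a symbol name."""
    out = []
    i = 0
    while i < len(name):
        pair = name[i:i + 2]
        if pair in Z_TABLE:
            out.append(Z_TABLE[pair])
            i += 2
        else:
            # Not an escape we know: copy the character through unchanged.
            out.append(name[i])
            i += 1
    return "".join(out)

print(zdecode("base_GHCziBase_zpzp_closure"))
```

Unknown escapes are passed through untouched, so the output is still recognisable when a name uses a form this table does not cover.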

How to automate task in OllyDBG ? Using Ollyscript or any other tool

How to automate task in OllyDBG? I want to create a script.
I bp on a certain address and go to its stack arguments. There are four stack arguments. I use the address of the 2nd argument and the size from the 4th argument and dump the HEX data into a log file.
For example :
I set a breakpoint at 512A12 and I see the following in stack:
00192003 005DB123 RETURN to program.005DB123 from program.00512A12
00192011 0018F058 - 1st argument
00192028 03F48D78 - 2nd argument
00192032 03F48D78 - 3rd argument
00192036 00000500 - 4th argument
So I dump the data from address starting at 00192028 03F48D78 - 2nd argument where 03F48D78 is the address to start the dump from.
And I use the size from the 4th argument 00192036 00000500 - 4th argument where 500 is the size of the data to be dumped.
The data is saved in a log file. It is a hex data from the dump section in OllyDBG.
Now I want to loop this function and automate it. How can this automation be done in Ollyscript or something else?
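The stack layout described above can be sketched in Python to show which slots hold the dump address and size. This assumes a 32-bit process paused at the function's first instruction, so the dword at ESP is the return address and the arguments follow it; the function name stack_arguments is made up, and the values are the ones from the example.

```python
import struct

def stack_arguments(stack: bytes, count: int) -> list[int]:
    """Return the first `count` dword arguments from a stack snapshot.

    stack[0:4] is assumed to be the return address, so argument n lives
    at offset 4 * n (the 2nd at +0x8, the 4th at +0x10).
    """
    return list(struct.unpack_from(f"<{count}I", stack, 4))

# Snapshot rebuilt from the example: return address, then four arguments.
snapshot = struct.pack(
    "<5I", 0x005DB123, 0x0018F058, 0x03F48D78, 0x03F48D78, 0x00000500
)
args = stack_arguments(snapshot, 4)
start, size = args[1], args[3]
print(hex(start), hex(size))
```

In the example this gives a start address of 03F48D78 and a size of 500 (hex), which are the two values any automation has to read at each breakpoint hit.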
Afaik none of existing plugins offer windbg's .writemem functionality
uploaded below is an ollydbg plugin enhanced from anonymouse (openrce blog)
modified cmdline plugin with an added command .writemem
Download and put the dll in ollydbg 1.10 plugin path.
possible crash path fixed (FindModule -> Mod-name can be null if FindModule return null in such cases added "unknown_module" string as modulename for sprintf_s)
http://wikisend.com/download/750442/cmdline.dll
This plugin is an enhanced version of modified cmdline plugin for ollydbg 1.10 by anonymouse (openrce blog) and contains one extra command .writemem
enhancements being original source altered considerably to make it compile with visual studio 2010 express (old commands not tested)
one extra command .writemem added (similar to windbg's .writemem)
usage as follows alt+f1 or plugin -> cmdline plugin .
in the dialog box type
writemem [esp+0x4] dword [esp+0x10] c:\dumps
or may be
writemem 0x403085 0x45 f:\foo\blah
The first command dumps the number of bytes stored at [esp+0x10], starting from the address stored at [esp+0x4], to the pre-existing folder c:\dumps.
The second command dumps 0x45 bytes from 0x403085 to the specified folder.
To automate this, use the command with the conditional log breakpoint's "pass commands to plugin when paused" functionality (Shift+F4).
in the edit box enter
.writemem <address> <size> <folder path>
.run
when ever the breakpoint is hit the memory contents will be dumped
This tutorial might help:
http://x9090.blogspot.com/2009/07/ollyscript-tutorial-unpack-upx.html
Also, you can read some scripts carefully to learn more about Olly scripting.
The actual function I need in the end to complete my task is as following.
It's an encryption/decryption function.
0x1 PUSH EBP - the encrypted data is loaded.
0x5 RETN - the data is decrypted
So the encrypted/decrypted data is loaded in the stack values as I earlier explained.
1 - When bp hits 0x1 PUSH EBP
2 - We goto stack values
3 - We follow the 2nd argument as start address for the chunk of encrypted data and the 4th argument for the size of the data.
4 - We dump using writemem from cmdline mod
Also we follow the same for decrypted data
1 - When bp hits 0x5 RETN
2 - We goto stack values
3 - We follow the 2nd argument as start address for the chunk of decrypted data and the 4th argument for the size of the data.
4 - We dump using writemem from cmdline mod.
I was wondering if the data could be saved in one text file called encdec.txt
Encrypted HEX Values
Decrypted HEX Values.
For example:
ENC - 88 F4 62 71 3D 25 CD 7C 72 76 8E 14 95 0B D1 8B
DEC - 3E 2E BA 24 FA 22 47 A0 00 0F A5 0E F7 B0 9C 32
If the above is done then I need to automate the HEX search and replace values from encdec.txt on the target encrypted file.
So the automation would check the line " ENC - 88 F4 62 71 3D 25 CD 7C 72 76 8E 14 95 0B D1 8B " and search for it on the targeted file and replace the values with hex values in "DEC - 3E 2E BA 24 FA 22 47 A0 00 0F A5 0E F7 B0 9C 32"
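The search-and-replace step could be sketched in Python as follows. This assumes encdec.txt contains alternating "ENC - ..." and "DEC - ..." lines exactly as in the example and that each pair has equal length; parse_pairs and apply_pairs are made-up helper names, and the sample data is the pair shown above embedded in dummy bytes.

```python
def parse_hex_line(line: str, tag: str) -> bytes:
    """Parse a line like 'ENC - 88 F4 ...' into raw bytes."""
    prefix, _, rest = line.partition("-")
    if prefix.strip() != tag:
        raise ValueError(f"expected a {tag} line, got: {line!r}")
    return bytes.fromhex(rest)  # fromhex skips the spaces between bytes

def parse_pairs(text: str) -> list[tuple[bytes, bytes]]:
    """Read alternating ENC/DEC lines into (encrypted, decrypted) pairs."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    return [
        (parse_hex_line(enc, "ENC"), parse_hex_line(dec, "DEC"))
        for enc, dec in zip(lines[0::2], lines[1::2])
    ]

def apply_pairs(data: bytes, pairs: list[tuple[bytes, bytes]]) -> bytes:
    """Replace every encrypted chunk in data with its decrypted chunk."""
    for enc, dec in pairs:
        if len(enc) != len(dec):
            raise ValueError("ENC and DEC chunks must have the same length")
        data = data.replace(enc, dec)
    return data

encdec_text = """\
ENC - 88 F4 62 71 3D 25 CD 7C 72 76 8E 14 95 0B D1 8B
DEC - 3E 2E BA 24 FA 22 47 A0 00 0F A5 0E F7 B0 9C 32
"""
pairs = parse_pairs(encdec_text)
target = b"HEAD" + pairs[0][0] + b"TAIL"  # dummy stand-in for the real file
patched = apply_pairs(target, pairs)
print(patched.hex(" "))
```

For a real file, the bytes would be read with open(path, "rb") and the result written back out. Note that the replacements are applied one pair after another, so a decrypted chunk that happens to equal a later encrypted chunk would be replaced again.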

Resources