NASM Assembly PE32 - What is the Optional Header Data Directory value - winapi

I'm trying to recode an existing EXE from scratch and am having trouble figuring out what value the IMAGE_OPTIONAL_HEADER struct element "DataDirectory" should have.
It's part of the PE32 header.
I'm using NASM and the WIN32N.INC file.
I know that the IMAGE_OPTIONAL_HEADER struct element "DataDirectory" is declared with the size DQ. That's because the struct IMAGE_DATA_DIRECTORY has the elements "VirtualAddress" and "isize", which are both DD.
STRUC IMAGE_DATA_DIRECTORY
.VirtualAddress RESD 1
.isize RESD 1
ENDSTRUC
STRUC IMAGE_OPTIONAL_HEADER
.Magic RESW 1
.MajorLinkerVersion RESB 1
.MinorLinkerVersion RESB 1
.SizeOfCode RESD 1
.SizeOfInitializedData RESD 1
.SizeOfUninitializedData RESD 1
.AddressOfEntryPoint RESD 1
.BaseOfCode RESD 1
.BaseOfData RESD 1
.ImageBase RESD 1
.SectionAlignment RESD 1
.FileAlignment RESD 1
.MajorOperatingSystemVersion RESW 1
.MinorOperatingSystemVersion RESW 1
.MajorImageVersion RESW 1
.MinorImageVersion RESW 1
.MajorSubsystemVersion RESW 1
.MinorSubsystemVersion RESW 1
.Reserved1 RESD 1
.SizeOfImage RESD 1
.SizeOfHeaders RESD 1
.CheckSum RESD 1
.Subsystem RESW 1
.DllCharacteristics RESW 1
.SizeOfStackReserve RESD 1
.SizeOfStackCommit RESD 1
.SizeOfHeapReserve RESD 1
.SizeOfHeapCommit RESD 1
.LoaderFlags RESD 1
.NumberOfRvaAndSizes RESD 1
.DataDirectory RESQ 1
ENDSTRUC
So what exact values do the DataDirectory elements have? There are many more data directories than just one: export directory RVA + size, import directory RVA + size, etc.
Do I just put the offset of the first virtual address in "VirtualAddress" and its size in "isize"? That would be my guess, but I'm not sure about it.

It is an array of IMAGE_DATA_DIRECTORY structs. MSDN tells you what the struct looks like:
typedef struct _IMAGE_DATA_DIRECTORY {
DWORD VirtualAddress;
DWORD Size;
} IMAGE_DATA_DIRECTORY, *PIMAGE_DATA_DIRECTORY;
The NumberOfRvaAndSizes field tells you how many there are. Usually 16 but there can be fewer.
Each directory entry gives the RVA and size of the thing it "points" to. The IMAGE_DIRECTORY_ENTRY_* defines tell you what each index means. For example, IMAGE_DIRECTORY_ENTRY_DEBUG is 6, and that entry gives the location of the IMAGE_DEBUG_DIRECTORY and the total size of it and its data.
For more information, see the PE/COFF format documentation and the Matt Pietrek "An In-Depth Look into the Win32 Portable Executable File Format" and "Peering Inside the PE: A Tour of the Win32 Portable Executable File Format" MSDN/MSJ articles.
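To make the array layout concrete, here is a minimal Python sketch of how the 16 entries could be packed and read back. The helper names and the RVA and size values are made up for illustration; only the two-DWORD entry layout and the index constants come from the headers discussed above. In NASM terms this corresponds to reserving 16 * 8 = 128 bytes for DataDirectory rather than a single RESQ.

```python
import struct

# Index constants as named in winnt.h (IMAGE_DIRECTORY_ENTRY_*).
IMAGE_DIRECTORY_ENTRY_EXPORT = 0
IMAGE_DIRECTORY_ENTRY_IMPORT = 1
IMAGE_DIRECTORY_ENTRY_DEBUG = 6
IMAGE_NUMBEROF_DIRECTORY_ENTRIES = 16


def pack_data_directories(entries, count=IMAGE_NUMBEROF_DIRECTORY_ENTRIES):
    """Hypothetical helper: entries maps index -> (rva, size); unused slots stay zero."""
    table = [(0, 0)] * count
    for index, (rva, size) in entries.items():
        table[index] = (rva, size)
    # Each IMAGE_DATA_DIRECTORY is two little-endian DWORDs: VirtualAddress, Size.
    return b"".join(struct.pack("<II", rva, size) for rva, size in table)


def unpack_data_directories(blob, count):
    """Hypothetical helper: read `count` (rva, size) pairs back out of the raw bytes."""
    return [struct.unpack_from("<II", blob, i * 8) for i in range(count)]


# Made-up RVA and size for an import directory; every other entry is left empty.
blob = pack_data_directories({IMAGE_DIRECTORY_ENTRY_IMPORT: (0x2000, 0x28)})
directories = unpack_data_directories(blob, IMAGE_NUMBEROF_DIRECTORY_ENTRIES)
```

An entry whose RVA and size are both zero simply means that directory is not present in the image.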

Each entry contains an RVA and size.
The most important ones are:
the one at index 0 [export directory],
the one at index 1 [import directory],
the one at index 5 [relocation table].
Now, depending on what you try to achieve, this table may be completely useless to you.
It is, in fact, a kind of "shortcut" for the loader, allowing it to quickly look up particular portions of data without having to iterate over the whole section header table first. Thus, it is really only useful at execution time. If you just want to inspect the PE file without it being loaded into virtual memory, it will not provide any useful information.
As Anders already said, there are usually 16 of them, although in the PE file I'm currently researching I can find only 10, as the NumberOfRvaAndSizes field tells me. That field is the last fixed-size entry of the optional header and is a DOUBLEWORD. The .DataDirectory array follows it, and each of its entries is a QUADWORD-sized pair of DOUBLEWORDs, so if I interpret your RESQ entry correctly, it reserves room for only one entry.
EDIT: Turned out that there are indeed 16 entries, since the value "10" is in hexadecimal form...

Related

Tool to "un-define" a symbol in a relocatable ELF symbol table

Is there any utility to patch arbitrary symbols in ELF symbol table so that defined symbol becomes undefined? For example here is readelf --syms for a file that I'm going to process
Symbol table '.symtab' contains 8 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
...
5: 0000000000000000 13 FUNC WEAK DEFAULT 3 my_message
6: 0000000000000000 19 FUNC GLOBAL DEFAULT 5 print_msg
7: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND puts
And here is expected output for the same binary where my_message has been un-defined:
Symbol table '.symtab' contains 8 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
...
5: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND my_message
6: 0000000000000000 19 FUNC GLOBAL DEFAULT 5 print_msg
7: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND puts
The ELF file itself is relocatable. The modification should alter only the symbol table. The actual section that contains the original symbol definition should remain unchanged.
I've checked GNU Binutils and objcopy might be what I'm looking for but so far I haven't figured out any option (or combination) that would give me above described behavior.
In fact such tool should be straightforward enough to implement (even with no extra libraries like BFD), but I'm wondering if there is some existing thing that I might miss.
You may look at the 'anonymizer' example of the ELFIO library. The example overrides a symbol's name. Overriding a symbol's type can be implemented similarly, but you will need to process the '.symtab' section.
It is not exactly a tool, but a library that lets you implement such a tool.
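As a rough illustration of the patch described in the question, here is a Python sketch that rewrites a single 24-byte Elf64_Sym entry so that it reads as an undefined global symbol. It assumes a little-endian ELF64 file, and `undefine_symbol` is a hypothetical helper name. Locating '.symtab' in the file and finding the right entry are left out.

```python
import struct

# Elf64_Sym layout: st_name, st_info, st_other, st_shndx, st_value, st_size.
ELF64_SYM = struct.Struct("<IBBHQQ")

STB_GLOBAL = 1   # binding stored in the high nibble of st_info
STB_WEAK = 2
STT_NOTYPE = 0   # type stored in the low nibble of st_info
STT_FUNC = 2
SHN_UNDEF = 0    # section index meaning "undefined"


def undefine_symbol(entry: bytes) -> bytes:
    """Hypothetical helper: return the entry rewritten as GLOBAL NOTYPE UND."""
    st_name, _info, st_other, _shndx, _value, _size = ELF64_SYM.unpack(entry)
    st_info = (STB_GLOBAL << 4) | STT_NOTYPE
    # Keep the name and visibility; clear section index, value and size.
    return ELF64_SYM.pack(st_name, st_info, st_other, SHN_UNDEF, 0, 0)


# An entry shaped like my_message above: WEAK FUNC, section 3, size 13.
# The name offset (1) is made up for this example.
original = ELF64_SYM.pack(1, (STB_WEAK << 4) | STT_FUNC, 0, 3, 0, 13)
patched = undefine_symbol(original)
```

The patched bytes would then be written back at the same offset inside '.symtab', leaving every other section untouched.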

Dumping W32pServiceTable

I want to see what function in win32k.sys driver handles specific syscall number.
I attach WinDbg to a GUI process, since win32k.sys is a session space driver.
Then I shift the first DWORD value right by 4 bits, add the base address of W32pServiceTable, and use the u command to show the function in WinDbg, but the address isn't valid. I checked KiSystemCall64 and it seems to be doing the same thing.
!process 0 0 winlogon.exe
.process /p (PROCESS addr)
.reload
Answer: DWORD value from table is loaded with this instruction
movsxd r11,dword ptr [r10+rax*4]
The W32pServiceTable DWORD values have bit 31 set, so movsxd sign-extends them and fills the upper 32 bits of r11 with ones. Adding r11 to the table base address then leads to the correct function.
These values are negative so you need to preserve that when you shift off the bits. For example:
0: kd> dd win32k!W32pServiceTable L1
fffff88b`d1568000 ff8c8340
0: kd> u win32k!W32pServiceTable + ffffffff`fff8c834 L1
win32k!NtUserGetThreadState:
fffff88b`d14f4834 4883ec28 sub rsp,28h
Also, WinDbg is very picky/weird/broken/unpredictable when it comes to sign extension so you need to be careful about how you do this. For example, this doesn't work:
0: kd> u win32k!W32pServiceTable + fff8c834 L1
fffff88c`d14f4834 ?? ???
Due to WinDbg zero extending the value. But this does:
0: kd> u win32k!W32pServiceTable + (fff8c834) L1
win32k!NtUserGetThreadState:
fffff88b`d14f4834 4883ec28 sub rsp,28h
Because the () causes WinDbg to sign extend instead of zero extend.
Lastly, this happens even on the normal service table, it's not just a Win32k thing.
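The arithmetic can be checked outside the debugger. The following Python sketch mirrors the movsxd, shift and add sequence using the table base and entry from the WinDbg output above. The function names are made up for illustration.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def sign_extend32(value):
    """Interpret the low 32 bits of value as a signed integer, as movsxd does."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def service_target(table_base, entry):
    """Hypothetical helper: table base plus the sign-extended entry shifted right by 4."""
    # Python's >> on a negative int is an arithmetic shift, so the sign is preserved.
    offset = sign_extend32(entry) >> 4
    return (table_base + offset) & MASK64


# Values taken from the dd output above.
target = service_target(0xFFFFF88BD1568000, 0xFF8C8340)
```

With these inputs the result matches the address of win32k!NtUserGetThreadState shown in the u output.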

addiu instruction encoding (MIPS,GCC)

Here is addiu instruction opcode (16-bit instructions, GCC option -mmicromips):
full instruction: addiu sp,sp,-280
opcode, hexa: 4F75
opcode, binary: 1001(instruction) 11101(sp is $29) 110101
My goal is to detect all instructions of this kind (addiu sp,sp,)
and then to decode the immediate, -280 in the case above, in order to follow sp.
What I don't understand is the encoding of (-280).
Linked to: How to get a call stack backtrace?(GCC,MIPS,no frame pointer)
microMIPS has a specialized ADDIUSP instruction, which the assembler chose to use. The first 6 bits are the opcode 010011, the next 9 bits are the encoded immediate 110111010 = 0x1BA, and the LSB is fixed at 1.
The encoding for the immediate uses scaling by 4 and sign extension. Given that 0x1BA = -70 (using 9 bits) the value is -70 * 4 = -280.
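The decoding described above can be sketched in Python as follows. `decode_addiusp` is a hypothetical helper that applies only the plain 9-bit sign extension and scaling by 4 from this answer. The microMIPS manual should be checked for any special-cased encodings near the ends of the range, which this sketch does not model.

```python
def decode_addiusp(insn):
    """Return the stack adjustment in bytes for a 16-bit ADDIUSP, or None if no match."""
    major_opcode = (insn >> 10) & 0x3F
    # ADDIUSP: opcode bits 010011 with the least significant bit set to 1.
    if major_opcode != 0b010011 or (insn & 1) != 1:
        return None
    imm9 = (insn >> 1) & 0x1FF
    if imm9 & 0x100:          # sign bit of the 9-bit field
        imm9 -= 0x200
    return imm9 * 4           # the immediate is scaled by 4


adjustment = decode_addiusp(0x4F75)
```

For the halfword 0x4F75 from the question, the field is 0x1BA, which is -70 as a 9-bit signed value, giving an adjustment of -280.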

Reassigning non-absolute variables in OSX's assembler

The following assembler directives, when compiled with clang on OSX, produce an error:
.set link,0
test:
.int link
.set link,test
test2:
.int link
.set link,test2
The error:
$ clang test.s
test.s:7:13: error: invalid reassignment of non-absolute variable 'link'
.set link,test2
^
I want to use link in a macro as a variable that keeps track of the last defined word, to build a linked list (as in JONESFORTH).
As far as I know, you can't redefine normal symbols. The way I see it, you have two choices. Either you allocate a local label number to store your link address (as these can be redefined) or you use preprocessed assembly. For both cases, you probably want to use a macro to declare your nodes.
Example:
.macro declare_node list_id
.ifndef link_head_\list_id
link_head_\list_id : .int 0
.else
.int \list_id\()b-4
.endif
\list_id :
.endm
test:
declare_node 100
.int 42 # node data
test2:
declare_node 100
.int 314 # node data
test3:
declare_node 101
.int 173 # node data
test4:
declare_node 101
.int 141 # node data
Here, a numerical list id is used as the local label, so you can declare multiple lists.
I have the same problem (jonesforth). I have not found out why Apple's assembler doesn't allow redefining symbols, but it is what it is.
I worked around this by manually passing the last defined word as an argument to the defcode macro. It's ugly as hell, and error-prone.
.macro defcode name, length, flags, label, link
.const_data
.balign 8
.globl name_\label
name_\label :
.quad \link // link
.byte \flags+\length // flags + length byte
.ascii \name // the name
.balign 8 // padding to next 8 byte boundary
.globl \label
\label :
.quad code_\label // codeword
.text
.balign 8
.globl code_\label
code_\label : // assembler code follows
.endmacro
Then call the macro like
defcode "BRANCH",6,0,BRANCH,name_TICK
...
NEXT
defcode "0BRANCH",7,0,ZBRANCH,name_BRANCH
...
NEXT
I'd be super excited to learn about better ways to handle it.

How to display managed objects with certain value in one of the fields in WinDbg using SOS (or SOSEX)?

My problem is this:
0:000> !DumpHeap -type Microsoft.Internal.ReadLock -stat
------------------------------
Heap 0
total 0 objects
------------------------------
Heap 1
total 0 objects
------------------------------
Heap 2
total 0 objects
------------------------------
Heap 3
total 0 objects
------------------------------
total 0 objects
Statistics:
MT Count TotalSize Class Name
000007fef3d14088 74247 2375904 Microsoft.Internal.ReadLock
Total 74247 objects
The way I read this output is that I have 74,247 Microsoft.Internal.ReadLock instances on my heap. However, some of them are probably pending collection.
I want to display only those which are not pending collection.
For example, 0000000080f88e90 is the address of one of these objects and it is garbage. I know it, because:
0:000> !mroot 0000000080f88e90
No root paths were found.
0:000> !refs 0000000080f88e90 -target
Objects referencing 0000000080f88e90 (Microsoft.Internal.ReadLock):
NONE
0:000> !do 0000000080f88e90
Name: Microsoft.Internal.ReadLock
MethodTable: 000007fef3d14088
EEClass: 000007fef3c63410
Size: 32(0x20) bytes
File: C:\Windows\Microsoft.Net\assembly\GAC_MSIL\System.ComponentModel.Composition\v4.0_4.0.0.0__b77a5c561934e089\System.ComponentModel.Composition.dll
Fields:
MT Field Offset Type VT Attr Value Name
000007fef3d13fb0 400001e 8 ...oft.Internal.Lock 0 instance 0000000080001010 _lock
000007fef0a8c7d8 400001f 10 System.Int32 1 instance 1 _isDisposed
As one can see, both sosex.mroot and sosex.refs indicate no one references it, plus dumping its fields reveals that it was disposed through IDisposable, so it makes sense that the object is garbage (I know that being disposed does not imply the object is garbage, but it is in this case).
Now I want to display all those instances which are not garbage. I guess I am to use the .foreach command. Something like this:
.foreach(entry {!dumpheap -type Microsoft.Internal.ReadLock -short}){.if (???) {.printf "%p\n", entry} }
My problem is that I have no idea what goes into the .if condition.
I am able to inspect the _isDisposed field like this:
0:000> dd 0000000080f88e90+10 L1
00000000`80f88ea0 00000001
But .if expects an expression and all I have is a command output. If I knew how to extract information from the command output and arrange it as an expression then I could use it as the .if condition and be good.
So, my question is this - is there a way to get the field value as an expression suitable for .if? Alternatively, is it possible to parse the command output in a way suitable for using the result as the .if condition?
I didn't have an example which uses ReadLock objects, but I tried with Strings and this is my result:
.foreach (entry {!dumpheap -short -type Microsoft.Internal.ReadLock})
{
.if (poi(${entry}+10) == 1)
{
.printf "%p\n", ${entry}
}
}
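I'm using poi() to read pointer-sized data from the address. Also note that I'm using ${entry}, not entry, in both poi() and .printf. You might also like !do ${entry} inside the .if. As written, the condition matches objects whose _isDisposed field is 1; compare against 0 to list the ones that have not been disposed. Because _isDisposed is a 32-bit Int32, dwo() reads exactly that field, whereas poi() reads 8 bytes on a 64-bit target.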
In one line for copy/paste:
.foreach (entry {!dumpheap -short -type Microsoft.Internal.ReadLock}) {.if (poi(${entry}+10) == 1) {.printf "%p\n", ${entry}}}
