PE Format: Why the IAT can be empty, and an MS Detours myth

To my knowledge, the import address table (IAT) is a table of
imported functions. But lately I found that in some executables the
IAT is empty: in the IAT's data directory entry, both VirtualAddress
and Size are zero. To my surprise, an executable without an IAT can still run.
Then I found some code in MS Detours:
    // If the file doesn't have an IAT_DIRECTORY, we create it...
    if (inh.IAT_DIRECTORY.VirtualAddress == 0) {
        inh.IAT_DIRECTORY.VirtualAddress = obBase;
        inh.IAT_DIRECTORY.Size = cbNew;
    }
MS Detours has an API called DetourCreateProcessWithDllExA. As its
name says, it launches an executable with the specified DLLs: it
creates the process in suspended mode, modifies the import table
(adding the DLLs), and resumes the main thread. The code above is
part of this procedure.
In my tests, if you comment out the code above, the process
crashes at the very beginning. Even more surprising, you can set
VirtualAddress and Size to arbitrary values, for example:
    // If the file doesn't have an IAT_DIRECTORY, we create it...
    if (inh.IAT_DIRECTORY.VirtualAddress == 0) {
        inh.IAT_DIRECTORY.VirtualAddress = 123;
        inh.IAT_DIRECTORY.Size = 456;
    }
And it works! I don't know why. It seems that obBase and cbNew do
not make any sense either.
Q1: Why can the IAT be empty?
Q2: Why must MS Detours modify the IAT? What is going on?
Edit:
An executable with an empty IAT may be a packed executable, although I still don't know the answers to these questions.
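
One way to check whether a given file really has an empty IAT directory entry is to read the data directory array from the optional header. The following is a minimal Python sketch, assuming a well-formed PE32 or PE32+ image already read into memory. The function name read_data_directory is made up for illustration; the offsets (e_lfanew at 0x3C, data directories at offset 96 or 112 of the optional header) and the indices 1 (import table) and 12 (IAT) come from the PE format.

```python
import struct

IMAGE_DIRECTORY_ENTRY_IMPORT = 1   # import table (array of import descriptors)
IMAGE_DIRECTORY_ENTRY_IAT = 12     # import address table


def read_data_directory(image, index):
    """Return (VirtualAddress, Size) of data directory `index`.

    Hypothetical helper: assumes `image` holds a well-formed PE file.
    """
    e_lfanew = struct.unpack_from("<I", image, 0x3C)[0]
    if bytes(image[e_lfanew:e_lfanew + 4]) != b"PE\0\0":
        raise ValueError("not a PE image")
    # Skip the 4-byte signature and the 20-byte IMAGE_FILE_HEADER.
    opt = e_lfanew + 4 + 20
    magic = struct.unpack_from("<H", image, opt)[0]
    if magic == 0x10B:        # PE32
        dirs = opt + 96
    elif magic == 0x20B:      # PE32+
        dirs = opt + 112
    else:
        raise ValueError("unknown optional header magic")
    # NumberOfRvaAndSizes sits directly before the directory array.
    count = struct.unpack_from("<I", image, dirs - 4)[0]
    if index >= count:
        return (0, 0)
    return struct.unpack_from("<II", image, dirs + 8 * index)
```

For the executables described in the question, index 12 would come back as (0, 0) while index 1 would still be non-zero.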

Q1:
The IAT directory can be empty because the Windows loader does not need the information it contains. Everything the loader needs is in the import table. See IMAGE_IMPORT_DESCRIPTOR -> FirstThunk in WinNT.h.
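
To illustrate where FirstThunk comes from, here is a small Python sketch that parses an array of IMAGE_IMPORT_DESCRIPTOR records (five DWORDs each, ending at an all-zero record). The function name and the sample RVAs are invented for illustration; only the field layout is taken from WinNT.h.

```python
import struct


def parse_import_descriptors(table):
    """Parse 20-byte IMAGE_IMPORT_DESCRIPTOR records from `table`.

    Parsing stops at the first all-zero descriptor, which terminates
    the array.
    """
    descriptors = []
    for offset in range(0, len(table) - 19, 20):
        fields = struct.unpack_from("<5I", table, offset)
        if fields == (0, 0, 0, 0, 0):
            break
        oft, stamp, chain, name, ft = fields
        descriptors.append({
            "OriginalFirstThunk": oft,
            "TimeDateStamp": stamp,
            "ForwarderChain": chain,
            "Name": name,
            "FirstThunk": ft,   # RVA of this DLL's slice of the IAT
        })
    return descriptors
```

Each descriptor carries its own FirstThunk RVA, so the per-DLL address slots can be located without consulting data directory entry 12.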

Related

Why does PE need the Original First Thunk (OFT)?

There is the "First Thunk" (FT), which the loader overwrites with the correct addresses at load time.
But when does a PE use the OFT?
Does a PE even need it?
The original first thunk is needed if the imports are bound but the imported .DLL does not match.
On a fresh unpatched version of Windows, the addresses of all functions in the base .DLLs (ntdll, kernel32, user32 etc.) are known. Take shell32 for example: it links to kernel32!CreateProcess, and the true address of CreateProcess can be stored directly in shell32. This is called import binding, and it lets the loader skip the step where it looks up the addresses of the imported functions.
This does not work if the imported .DLL has not been loaded at its preferred address, or if the .DLL has changed (security update etc.). If this happens, the loader has to look up the functions "the normal way", and the original first thunk array has to be used because that is the only place where the RVAs of the function names are stored.
If import binding is not used then the original first thunk array is optional and might not be present.
ASLR has probably made this optimization irrelevant.
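
The decision described above can be modeled in a few lines of Python. This is only a sketch under assumptions: it models the older binding scheme, in which the import descriptor's TimeDateStamp holds the timestamp of the DLL it was bound against, and it does not cover the newer scheme that keeps the stamps in a separate bound import directory. The function and parameter names are hypothetical.

```python
def can_use_bound_addresses(descriptor_stamp, dll_stamp, loaded_at_preferred_base):
    """Return True if pre-bound addresses in FirstThunk can be trusted.

    descriptor_stamp: TimeDateStamp from the import descriptor
                      (0 means the imports were never bound).
    dll_stamp:        timestamp of the DLL that is actually loaded.
    """
    if descriptor_stamp == 0:
        return False                      # not bound at all
    if descriptor_stamp != dll_stamp:
        return False                      # DLL changed, e.g. by an update
    return bool(loaded_at_preferred_base)  # relocated DLL invalidates binding
```

Whenever this returns False for a bound image, the loader has to fall back to the names reachable through OriginalFirstThunk.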
Let me summarize a lot of things for you here. When you load a library, for example Milad.dll, and then try to call a function from it such as MPrint, the Windows dynamic loader has to resolve the address of MPrint before it can be called. How does the OS resolve the address of that function?
Windows goes through some fairly complicated steps, which I will describe in simple terms. To resolve the address of a function in a DLL, the loader consults three tables in that DLL's export directory: the export name pointer table, the export ordinal table and the export address table. These tables are pointed to by the AddressOfNames, AddressOfNameOrdinals and AddressOfFunctions members of the export directory (IMAGE_EXPORT_DIRECTORY) in the PE structure.
After the OS maps Milad.dll into the address space of the target process (as LoadLibrary does), it resolves each imported function, in the same way GetProcAddress would, and writes the resulting addresses into the import address table (IAT) of the importing module.
The importing module contains an array of import descriptors (IMAGE_IMPORT_DESCRIPTOR), one per DLL, with the members OriginalFirstThunk, TimeDateStamp, ForwarderChain, Name and FirstThunk. These members point to the important data.
Name points to the name of the DLL that the process imports from. In
this example the DLL is Milad.dll.
OriginalFirstThunk points to the import name table (INT), which lists
the names (or ordinals) of the functions imported from Milad.dll. The
loader searches for each name in the DLL's export name pointer table.
The index at which the name is found is then used to read the export
ordinal table, and the entry at that index is another integer value.
FirstThunk points to the IAT. The integer taken from the export
ordinal table in the previous step is an index into the DLL's export
address table, and the entry at that index holds the RVA of the
function. Once the loader has found the correct address, it writes
that address into the IAT slot for MPrint, so the process can call
the function through that slot.
This is a simplified explanation of what the loader does to resolve the addresses of functions in DLLs via the Name, OFT (INT) and FT (IAT) members of IMAGE_IMPORT_DESCRIPTOR.
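
The name-to-address lookup on the exporting side can be sketched as follows. This is a simplified Python model, not the loader's actual code: the three lists stand in for the arrays behind AddressOfNames, AddressOfNameOrdinals and AddressOfFunctions, the sample names other than MPrint and all RVAs are invented, and a linear scan is used where a real implementation could binary-search the sorted name array.

```python
def lookup_export(names, name_ordinals, functions, wanted):
    """Resolve an exported name to a function RVA.

    names:         exported names (parallel to name_ordinals)
    name_ordinals: for names[i], an index into `functions`
    functions:     function RVAs, indexed by that ordinal index
    """
    for i, name in enumerate(names):
        if name == wanted:
            return functions[name_ordinals[i]]
    return None  # name is not exported


# Hypothetical export tables for Milad.dll.
names = ["MHelp", "MPrint"]
name_ordinals = [1, 0]
functions = [0x1500, 0x1200]
mprint_rva = lookup_export(names, name_ordinals, functions, "MPrint")
```

The extra level of indirection through the ordinal array is what allows the name array to be sorted independently of the address array.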
We need to know that when the PE file is loaded into memory, the PE loader looks at the IMAGE_THUNK_DATAs and IMAGE_IMPORT_BY_NAMEs and determines the addresses of the imported functions. It then replaces the IMAGE_THUNK_DATAs in the array pointed to by FirstThunk with the real addresses of the functions, so by the time the PE file is ready to run, that array holds real addresses. The array of RVAs pointed to by OriginalFirstThunk remains unchanged, so that if the need arises to find the names of the imported functions, the PE loader can still find them.
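
A minimal Python model of that step, assuming 32-bit thunks: each non-zero entry is either an import by ordinal (high bit set, ordinal in the low 16 bits) or the RVA of an IMAGE_IMPORT_BY_NAME record. The resolve callback and the addresses used with it are placeholders for whatever actually looks up the export; they are not part of any real API.

```python
IMAGE_ORDINAL_FLAG32 = 0x80000000


def decode_thunk32(value):
    """Classify one 32-bit IMAGE_THUNK_DATA value."""
    if value & IMAGE_ORDINAL_FLAG32:
        return ("ordinal", value & 0xFFFF)
    return ("name_rva", value)


def fill_first_thunk(original_first_thunk, resolve):
    """Build the FirstThunk array from the OriginalFirstThunk array.

    `resolve(kind, value)` is a hypothetical callback returning the
    function address. The input list is left unchanged, mirroring how
    the OriginalFirstThunk array survives loading.
    """
    first_thunk = []
    for value in original_first_thunk:
        if value == 0:          # a zero thunk terminates the array
            break
        first_thunk.append(resolve(*decode_thunk32(value)))
    first_thunk.append(0)
    return first_thunk
```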

How do call instructions get generated for imported functions in a compiled module

I am not sure if I am phrasing the question correctly, but basically I want to know how the call instruction is generated when calling an imported function from another library.
For example
GetModuleFileName(...)
is compiled to
call 0x4D0000
where 0x4D0000 is the address of the imported function which is dynamic.
How does Windows set up those calls, and would it be possible to circumvent this and set a custom address instead?
The address used in the call statement isn't dynamic. It's a relative address that's fixed at link time, like a call to any other function. That's because the call is actually to a stub, and the stub performs an indirect jump to the real function. The indirect jump uses a memory operand that refers to a location in the import table. When the executable (or DLL) is loaded, Windows updates the import table with the addresses of all the functions the executable or DLL uses in any DLLs it's linked to.
So if an executable has a call instruction like this:
    call _GetModuleFileNameA@12
Then somewhere else in the same executable is a stub like this:
    _GetModuleFileNameA@12:
        jmp [__imp__GetModuleFileNameA@12]
And somewhere in the import table there is a definition like this:
    __imp__GetModuleFileNameA@12:
        DD ?
Windows sets the value of __imp__GetModuleFileNameA@12 in the import table when the executable (or DLL) is loaded. There's not much you can do to change this, though it's not too hard to change the value after the executable (or DLL) has been loaded. Note that the import table might be located in a read-only section, meaning you may need to change the virtual memory protections in order to do this.
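
To find which import table slot a stub jumps through, you can decode the stub itself. The sketch below assumes a 32-bit x86 stub, where jmp [address] is encoded as the bytes FF 25 followed by a 4-byte absolute address; on x64 the same opcode uses a RIP-relative displacement, which this sketch does not handle. The function name and the sample address are made up.

```python
import struct


def iat_slot_from_stub(stub):
    """Return the absolute address of the slot a 32-bit stub jumps through.

    Expects the 6-byte encoding of `jmp dword ptr [addr]`: FF 25 addr32.
    """
    if len(stub) < 6 or bytes(stub[0:2]) != b"\xff\x25":
        raise ValueError("not a jmp [addr32] stub")
    return struct.unpack_from("<I", stub, 2)[0]


# Hypothetical stub that jumps through the slot at 0x004020A0.
stub = b"\xff\x25" + struct.pack("<I", 0x004020A0)
slot_address = iat_slot_from_stub(stub)
```

The 4 bytes stored at that slot address are what Windows fills in at load time, and they are what you would overwrite to redirect the call.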

Adding an export to a DLL without recompiling it

I have a DLL that I want to use/debug. I don't have the source.
Looking at it with IDA, I found 3 things :
DllMain does nothing
The code I need is self contained in a function that only calls a few Windows API. It does not reference anything else in that DLL.
That self contained function is not exported
I could extract the assembly code and link it to a C program, but I wonder:
Is it possible to (and how should I) add an entry to the export table of an existing DLL without recompiling it?
Yes, you can do that, but most tools don't support it. With CFF Explorer, for example, it's easier to convert an existing export into the one you want: just edit the function RVA and the exported name. Since you only need the one function, it shouldn't be a problem that you're removing some other export.
You could even do it with a hex editor, since it doesn't involve moving anything or rebuilding the header. It's just an in-place edit.
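
The in-place edit of the function RVA amounts to overwriting one 4-byte entry in the export address table. A minimal Python sketch, assuming you have already located the file offset of the AddressOfFunctions array (which requires converting its RVA to a file offset via the section table, not shown here). The function name, offsets and RVAs are hypothetical.

```python
import struct


def retarget_export(image, functions_offset, index, new_rva):
    """Overwrite entry `index` of the export address table in place.

    image:            bytearray holding the DLL file contents
    functions_offset: file offset of the AddressOfFunctions array
    Returns the RVA that was stored there before.
    """
    slot = functions_offset + 4 * index
    old_rva = struct.unpack_from("<I", image, slot)[0]
    struct.pack_into("<I", image, slot, new_rva)
    return old_rva
```

The file size and every other structure stay untouched, which is why no headers need rebuilding. Renaming the export is a similar in-place edit as long as the new name is not longer than the old one.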

PE .idata section

According to the documentation I've read, the import directory for a Windows executable is typically placed in a section called .idata. (I know the names are effectively just comments, but 'typically... called' presumably means the Microsoft tool chain will use that name by default.)
When I compile and link a simple C test program with the Microsoft compiler and then dumpbin the result, there is no section called .idata. There is, however, in the optional header, a positive RVA and size of import directory, so the import table is there.
Is the import directory nowadays placed in a section with a different name, or am I missing something?
Indeed, in the executable I just built, there is no .idata section.
Using PE Explorer, we can see that the Import Table and the IAT are stored as part of the .rdata section (note the "Pointing Directories" column).
On the Data Directories page, we see that the virtual address of the Import Table is 0x403354. This lands within the range of the .rdata section (0x403000 - 0x403C00).
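
The same range check can be done programmatically. In this Python sketch the .rdata range and the import table address are the values quoted above; the .text and .data ranges are invented to make the example complete.

```python
def section_containing(sections, address):
    """Return the name of the section whose [start, end) range holds address."""
    for name, start, end in sections:
        if start <= address < end:
            return name
    return None


sections = [
    (".text", 0x401000, 0x403000),   # hypothetical range
    (".rdata", 0x403000, 0x403C00),  # range quoted above
    (".data", 0x404000, 0x405000),   # hypothetical range
]
import_table_section = section_containing(sections, 0x403354)
```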
Interestingly (and somewhat frustratingly), the PE loader for IDA synthetically "creates" an .idata section which doesn't actually exist in the file.

How can I make an import library in binary form that can be linked by VC/GCC?

First, sorry for my poor English.
I'm writing a tool that builds an import library for a DLL on Windows. It outputs obj files in binary form, which I can then combine with the linker.
Currently it can generate OBJs containing an import descriptor, an import lookup table and an import thunk.
I nearly have it working but am stuck on a few problems.
I studied the import library (.lib) files generated by VC and GCC and decided to imitate what GCC does.
I found that they contain IMAGE_IMPORT_DESCRIPTORs and IMAGE_THUNK_DATA32s, just the same as what I saw in an EXE.
So I built my own in the same way, but the linker won't generate the EXE I want.
I want it to lay out my lib in this order:
.idata$2 (IMAGE_IMPORT_DESCRIPTOR)
.idata$4 as FirstThunk
.idata$4 contains NullThunk
.idata$5 as OriginalFirstThunk
.idata$5 contains NullThunk
.idata$6 contains DLL's filename
I know that the suffix after the $ in a section name tells the linker how to order the data, and that this is what NULL_IMPORT_DESCRIPTOR in .idata$3 relies on.
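
As a rough model of that ordering rule: contributions that share the part of the name before the $ end up in one output section, ordered by the part after the $. The Python sketch below only models that sort; it says nothing about how the linker orders contributions that have the same full name, which is exactly the case for the NullThunk and the other thunks. The function name is made up.

```python
def link_order(section_names):
    """Order section contributions by base name, then by the $ suffix."""
    def key(name):
        base, _, suffix = name.partition("$")
        return (base, suffix)
    return sorted(section_names, key=key)


ordered = link_order([".idata$6", ".idata$2", ".idata$5", ".idata$4", ".idata$3"])
```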
The problem is that the linker strips a section that is not referenced, even if another section in the same obj is referenced. For example, I put the NullThunk and the DLL's name in a file called dllname, which contains:
section 1: .idata$4, 0x00000000
section 2: .idata$5, 0x00000000
section 3: .idata$6, a.dll\0
symbols: _DllName, external, sect3+0
_DllName is referenced by the import descriptor, so its section is linked, but .idata$4 and .idata$5 are stripped.
So the NullThunk is not linked, and what I see in CFF Explorer is that the EXE the linker generated imports hundreds of symbols from my DLL.
The second question is: how can I make sure the linker puts the NullThunk after the IAT, given that they are both in .idata$5?
And another problem: when I make my lib contain 2 imported functions, the linker selects only one of them and throws the other away. In detail, I have 2 functions: int __stdcall add(int, int); and sub. In the linker's generated code, the call to sub seems correct, but the call to add becomes "call [RVA:0] (ff 25 00400000)". What happened?
After a week of trial and error I finally gave up and am asking for help.
This is the file my tool generated (I packed the objs into a lib):
http://filebin.ca/19oJUzj8z1vN/add.lib
Neither GCC nor VC generates a correct EXE when linking against this import library.
How can I solve these problems?
Regards,
LeiMing
I have solved this problem.
I just renamed all members in the generated archive (.lib file) to the same name
and put the NullThunk object at the end of the archive (as the last member),
and the 3 problems disappeared.
I don't know the reason in detail, but it works.
Now my tool can generate import libraries successfully.
Other points that need care are the order of the offsets and strings in the first and second linker members. Padding between archive members also needs care.
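
On the padding point, here is a minimal Python sketch of writing one archive member, assuming the usual COFF archive layout: a 60-byte ASCII header (name, date, user ID, group ID, mode, size, end marker) followed by the data, with a single newline byte appended when the data length is odd so that the next member starts on an even offset. It only handles names short enough to fit in the header; the function name and the sample member are invented.

```python
def archive_member(name, data):
    """Build one archive member: 60-byte header, data, optional pad byte."""
    if len(name) > 15:
        raise ValueError("long names need the longnames member (not handled)")
    header = (
        f"{name + '/':<16}"    # Name, terminated by '/'
        f"{0:<12}"             # Date
        f"{'':<6}"             # User ID
        f"{'':<6}"             # Group ID
        f"{0:<8}"              # Mode
        f"{len(data):<10}"     # Size of data, excluding padding
        "`\n"                  # End of header
    ).encode("ascii")
    padding = b"\n" if len(data) % 2 else b""
    return header + data + padding


member = archive_member("add.o", b"abc")
```

Giving every member the same name, as in the fix above, would just mean passing the same name for each obj.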

Resources