Are there any sections that the PE loader does not load? Or is every section listed in the section headers loaded? In ELF files, the program headers (segments) that are supposed to be loaded are the ones flagged with PT_LOAD. Is there anything similar to that in PE files?
PS. I found the flag IMAGE_SCN_MEM_DISCARDABLE. Are sections flagged with that not loaded?
When a relocation section is available, but the PE image does not need to be relocated, the loader does not load the relocation section. If a PE image has been digitally signed, it carries a certificate table. This table is not a section: it is appended to the file and located through the data directory by file offset, and it is not loaded by the loader. Additionally, if a debug section is available, this is also not loaded by the loader.
Well, DOS Stub is not a Section!
As a general rule, some parts of the PE file can be read but not mapped in memory (like relocations), and some parts are not mapped at all. Debugging information at the end of the file is an instance of the latter.
Usually, data placed at the end of the file, past every part of the file that is mapped, is not mapped in memory.
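To check this on a concrete file, here is a minimal sketch that walks the section table and reports which sections carry IMAGE_SCN_MEM_DISCARDABLE, and where the unmapped data past the last section begins. The names list_sections and overlay_offset are made up for this example, and it assumes a well-formed image held in memory as bytes.

```python
import struct

IMAGE_SCN_MEM_DISCARDABLE = 0x02000000  # section can be discarded as needed

def list_sections(data: bytes):
    """Return (name, raw_offset, raw_size, discardable) for each section header."""
    pe = struct.unpack_from("<I", data, 0x3C)[0]          # e_lfanew: offset of "PE\0\0"
    if data[:2] != b"MZ" or data[pe:pe + 4] != b"PE\0\0":
        raise ValueError("not a PE image")
    num_sections, = struct.unpack_from("<H", data, pe + 6)
    opt_size, = struct.unpack_from("<H", data, pe + 20)
    table = pe + 24 + opt_size                            # section table follows the optional header
    out = []
    for i in range(num_sections):
        off = table + 40 * i                              # each section header is 40 bytes
        name = data[off:off + 8].rstrip(b"\0").decode("ascii", "replace")
        raw_size, raw_ptr = struct.unpack_from("<II", data, off + 16)
        flags, = struct.unpack_from("<I", data, off + 36)
        out.append((name, raw_ptr, raw_size, bool(flags & IMAGE_SCN_MEM_DISCARDABLE)))
    return out

def overlay_offset(data: bytes) -> int:
    """File offset just past the raw data of the last section."""
    return max((ptr + size for _, ptr, size, _ in list_sections(data)), default=0)
```

Anything in the file at or beyond overlay_offset(data) belongs to no section, which is where appended debug data or a certificate table would typically sit.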
As the title says, how to tell if an exe will load a DLL statically or dynamically by looking at the PE file header?
In other words, how to tell if the DLL is part of the executable, or will be called by the loader?
Thanks
Let me first clarify some terminology to avoid confusion.
Anything executed within a DLL is by definition dynamic. But, a DLL may be statically bound or dynamically bound to an executable.
With static binding, the EXE links against a DLL's import library (actually a .LIB file that is built alongside the DLL). Corresponding DLL function prototypes in header files will usually be declared with __declspec(dllimport). This results in the EXE being filled with stubs for each DLL symbol that are filled in by the Windows loader at runtime. This is facilitated by the final EXE having an import section structure in its PE headers listing all the DLLs to be resolved by the Windows loader at runtime and their symbolic names (e.g. functions). Windows then does all the dirty work to find and load these DLLs and referenced symbolic addresses before the EXE starts execution of the primary thread at its entry point. If Windows fails to find any DLL(s) or referenced symbolic addresses, the EXE won't start.
With dynamic binding, the EXE explicitly invokes code to load DLL(s) and resolve symbolic addresses. This is done using the two KERNEL32 API functions: LoadLibrary() and GetProcAddress(). If an EXE does this, there will be no associated import section describing this DLL and its symbols, and the Windows loader will happily load the EXE knowing nothing about said DLL(s). It is then application defined as to how to handle success or failure of LoadLibrary() and/or GetProcAddress().
It is worth noting at this point, that libraries like the C-Runtime may be provided in DLL form (dynamic library) or static form (static library). If you link to one of these libraries statically, there will be no DLL import section in the built EXE's PE header and no function stubs to resolve at runtime for that library. Instead of stubs, these symbols (functions and/or data variables) become part of the EXE. Static library functions and/or data are copied into the EXE and are assigned relative addresses explicitly by the linker; no different than if those symbols were implemented directly by the EXE. Additionally, there will be no LoadLibrary() or GetProcAddress() resolution either implicitly (by the Windows loader) or explicitly in code for these functions as they will be directly present and self-contained within the final EXE. As a side-note, debugging symbols may be used in this case to try and differentiate between EXE implemented functions and library implemented functions (should you care) but this is highly dependent on the settings used to build both the EXE and the static library.
With terminology cleared up, let me attempt to answer your question! :)
Let me also add that I'm not going to go into the specifics of bound and unbound import symbols for a module's import section, because this distinction has nothing to do with the original question and has more to do with speeding up the work done by the Windows loader. If you are interested in those details, however, you can read up on Microsoft's PE COFF Specification.
To see if an EXE is statically bound to a DLL, you can either parse the PE headers yourself to locate the DLL imports section or use one of dozens of tools to do this for you, such as Dependency Walker. If you load your EXE in Dependency Walker for example, you will see a list of all statically bound DLLs in the top-left pane underneath the EXE itself. If any of these DLLs are not found at runtime, the program will fail to load. In the right pane, top table, you will see symbols (e.g. functions) that are referenced in the EXE for the selected DLL. All of these symbols must additionally be found for the EXE to load. The lower table simply shows all of the symbols exported by the DLL, referenced or not.
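If you would rather parse the headers yourself, the following is a minimal sketch of reading the import directory to list the statically bound DLLs. The function name imported_dlls is made up for this example. It assumes a well-formed PE32 or PE32+ file read fully into memory, and it ignores delay-load imports, which live in a separate directory.

```python
import struct

def imported_dlls(data: bytes):
    """Return the DLL names listed in the import directory of a PE file."""
    pe = struct.unpack_from("<I", data, 0x3C)[0]           # e_lfanew
    if data[:2] != b"MZ" or data[pe:pe + 4] != b"PE\0\0":
        raise ValueError("not a PE image")
    num_sections, = struct.unpack_from("<H", data, pe + 6)
    opt_size, = struct.unpack_from("<H", data, pe + 20)
    opt = pe + 24
    magic, = struct.unpack_from("<H", data, opt)
    # Data directories start 96 bytes into a PE32 optional header, 112 for PE32+.
    dirs = opt + (112 if magic == 0x20B else 96)
    import_rva, _size = struct.unpack_from("<II", data, dirs + 8)  # directory entry 1
    if import_rva == 0:
        return []

    sections = []
    for i in range(num_sections):
        off = opt + opt_size + 40 * i
        vsize, vaddr, raw_size, raw_ptr = struct.unpack_from("<IIII", data, off + 8)
        sections.append((vaddr, max(vsize, raw_size), raw_ptr))

    def rva_to_offset(rva):
        # Translate a relative virtual address into a file offset.
        for vaddr, size, raw_ptr in sections:
            if vaddr <= rva < vaddr + size:
                return rva - vaddr + raw_ptr
        raise ValueError(f"RVA {rva:#x} is not inside any section")

    names = []
    desc = rva_to_offset(import_rva)
    while True:
        # IMAGE_IMPORT_DESCRIPTOR is five DWORDs; the fourth is the RVA of the DLL name.
        fields = struct.unpack_from("<5I", data, desc)
        if not any(fields):                                 # all-zero descriptor ends the table
            break
        start = rva_to_offset(fields[3])
        names.append(data[start:data.index(b"\0", start)].decode("ascii", "replace"))
        desc += 20
    return names
```

An empty result only means there are no load-time imports; it says nothing about DLLs the program loads later on its own.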
If the EXE uses dynamic binding (also called manual binding) for a given DLL, there will be no import section for that DLL and thus you won't see it referenced in tools like Dependency Walker. BUT, you can click on KERNEL32.DLL in Dependency Walker (all EXEs will have this dependency, though there are exceptions to this rule I won't get into here) and locate references to LoadLibrary() and GetProcAddress(). Unfortunately most EXEs reference these functions in boilerplate code such as the C-Runtime so this won't tell you too much.
If you want to dig deeper into trying to figure out which DLLs are manually loaded by an application, the first thing to try is to locate the DLL name string by searching the EXE for it. Note that the DLL name string need not end in ".DLL" as LoadLibrary() automatically assumes this extension if not provided. The standard tool for searching for strings within a binary module is Sysinternals Strings. This works great for modules that make no attempt to hide what they are doing.
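As a rough illustration of what such a string search does, here is a minimal sketch that scans a binary for printable ASCII and UTF-16LE runs and keeps those containing ".dll". The function name find_dll_name_strings and the minimum length of 4 are choices made for this example, not the behaviour of Sysinternals Strings. It will miss names stored without the extension, as noted above.

```python
import re

def find_dll_name_strings(data: bytes, min_len: int = 4):
    """Return printable ASCII and UTF-16LE strings in data that contain '.dll'."""
    # Runs of printable ASCII bytes.
    ascii_pat = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    # Runs of printable ASCII characters encoded as UTF-16LE (char followed by a zero byte).
    wide_pat = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)
    found = [m.group().decode("ascii") for m in ascii_pat.finditer(data)]
    found += [m.group().decode("utf-16-le") for m in wide_pat.finditer(data)]
    return sorted({s for s in found if ".dll" in s.lower()})
```

A typical use would be find_dll_name_strings(open(path, "rb").read()), where path is the EXE under inspection. Expect some noise, since neighbouring printable bytes can get glued onto a name.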
With that said, obfuscated code (found in unpackers, viruses and the like) may obfuscate or encrypt DLL names as well as the functions referenced. Code may even choose to resolve LoadLibrary() and GetProcAddress() at runtime to further hinder efforts to figure out what it is doing. Your best bet in these situations is to use a tool like Sysinternals Process Monitor or a debugger with verbose logging enabled to watch the DLLs being loaded as the program runs. You can also use a disassembler (such as IDA) to try and piece together what the code is doing. To find out what DLL symbols are being used, you might start the EXE in a debugger and wait for the initial break at the entry-point. Then add a breakpoint on the first instruction in KERNEL32.GetProcAddress. Run the program. When that breakpoint is hit, the function arguments (on the stack for 32-bit x86, in registers for x64) will contain the name of the symbol being resolved.
As you can see, if an application resolves DLL symbols manually (dynamic binding), the process of figuring out what DLLs are being referenced is not as straightforward.
What I know:
winnt.h contains the structure/definition of a PE file and its components (Windows), and ELF32.h contains the structure of an ELF file and the definition of each of its components (Linux).
What I think (understanding/observation):
I understand that winnt.h not only contains the PE structure/definition but also contains various macros and types (for Windows NT), and that it is a child header file of windows.h (so, based on my understanding, winnt.h has another important application: providing the Win API etc.). However, based on my observation, elf32.h only contains the definition/structure of an ELF file (and nothing more than that).
My question: what is the application/functionality of winnt.h when it comes to compiling/interpreting/executing a PE file?
I understand that winnt.h has another application (providing Win API/macros/etc.) and is a perfect guide to understanding/dismantling a PE file, but how is this file specifically used by the OS?
Does the compiler use it to build the PE file from source?
Does the OS use it to interpret the PE file?
*And the same question for ELF32 (or 64).h and ELF files.
Any answer is much appreciated.
My application is sometimes started from a network share, and some customers reported an External exception C0000006 when running the application. According to my Google research this "may" be related to the image getting paged out and then failing to reload from the network. A workaround for this is telling Windows to load the complete image file into the swap and run it from there by setting the IMAGE_FILE_NET_RUN_FROM_SWAP flag.
My application also depends on various .bpl and .dll libraries that are loaded at runtime. Only some of those can be changed by me; some are supplied by other vendors. What happens to these libraries if the exe has this flag set? Are they also loaded into the swap file, or are they still paged out and reloaded when needed? Would I need to include this flag in the libraries too?
The flag applies only to the PE module which sets it. So, setting the flag in an EXE does not mean that modules loaded by that EXE are affected by the flag. Each module (DLL, package etc.) that is loaded by your EXE will be treated by the loader according to the PE options specified in that module.
So, you'll need to set the PE flag on each module that resides on a network share.
For what it's worth, I'd recommend adding IMAGE_FILE_REMOVABLE_RUN_FROM_SWAP as well.
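For modules you cannot rebuild, the flags can in principle be set by patching the Characteristics field of the COFF file header. The following is only a sketch of that idea: the function name set_run_from_swap is made up, it works on a copy of the file held in a bytearray, and it does not update the optional-header checksum. Patching a digitally signed module this way will invalidate its signature, so check with the vendor first.

```python
import struct

IMAGE_FILE_REMOVABLE_RUN_FROM_SWAP = 0x0400
IMAGE_FILE_NET_RUN_FROM_SWAP = 0x0800

def set_run_from_swap(data: bytearray) -> int:
    """Set both run-from-swap bits in the COFF Characteristics field; return the new value."""
    pe = struct.unpack_from("<I", data, 0x3C)[0]           # e_lfanew
    if data[:2] != b"MZ" or data[pe:pe + 4] != b"PE\0\0":
        raise ValueError("not a PE image")
    field = pe + 4 + 18       # Characteristics is the last WORD of the 20-byte file header
    flags, = struct.unpack_from("<H", data, field)
    flags |= IMAGE_FILE_NET_RUN_FROM_SWAP | IMAGE_FILE_REMOVABLE_RUN_FROM_SWAP
    struct.pack_into("<H", data, field, flags)
    return flags
```

Where you control the build, setting the flags at link time is the cleaner route; the patch above is mainly useful for seeing exactly which two bits are involved.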
I have to create a wrapper DLL that exports some symbols (functions). Within its resources it contains another encrypted DLL that actually does the job.
Upon the wrapper DLL initialization it decrypts the original one, saves it in a file, and loads into the address space by LoadLibrary. However I'd like to avoid saving this DLL in a file.
I know that this doesn't guarantee a bullet-proof protection, actually one may dump the process virtual memory and see it there. I also know that it's possible to create a file with FILE_FLAG_DELETE_ON_CLOSE attribute, which ensures this file will be deleted as soon as the process terminates. But still I'd like to know if there's an option to load the DLL "not from a file".
So far I thought about the following:
Allocate a virtual memory block with adequate protection (PAGE_EXECUTE_READ or PAGE_EXECUTE_READWRITE), preferably at the image's preferred base address.
Extract/decrypt the DLL image there.
If the image base address isn't its preferred address, do the relocation "manually". That is, analyze the relocation table and patch the image in place.
Handle the image imports. Load its dependency DLLs and fill symbol addresses.
Invoke its initialization function (DllMain).
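The relocation step in the list above can be sketched as follows. This is an illustration of the base relocation table format only, not a loader: the function name apply_relocations is made up, the image is assumed to be already laid out in memory order (sections copied to their RVAs) in a bytearray, and only the two common fixup types are handled. delta is the actual base minus the preferred base.

```python
import struct

IMAGE_REL_BASED_ABSOLUTE = 0   # padding entry, nothing to patch
IMAGE_REL_BASED_HIGHLOW = 3    # 32-bit address
IMAGE_REL_BASED_DIR64 = 10     # 64-bit address

def apply_relocations(image: bytearray, reloc_rva: int, reloc_size: int, delta: int) -> int:
    """Patch a memory-layout image in place; return the number of fixups applied."""
    applied = 0
    pos, end = reloc_rva, reloc_rva + reloc_size
    while pos + 8 <= end:
        # Each block starts with the page RVA and the block size (header included).
        page_rva, block_size = struct.unpack_from("<II", image, pos)
        if block_size < 8:
            break
        for entry_pos in range(pos + 8, pos + block_size, 2):
            entry, = struct.unpack_from("<H", image, entry_pos)
            kind, target = entry >> 12, page_rva + (entry & 0xFFF)
            if kind == IMAGE_REL_BASED_ABSOLUTE:
                continue
            if kind == IMAGE_REL_BASED_HIGHLOW:
                fmt, mask = "<I", 0xFFFFFFFF
            elif kind == IMAGE_REL_BASED_DIR64:
                fmt, mask = "<Q", 0xFFFFFFFFFFFFFFFF
            else:
                raise ValueError(f"unsupported relocation type {kind}")
            value, = struct.unpack_from(fmt, image, target)
            struct.pack_into(fmt, image, target, (value + delta) & mask)
            applied += 1
        pos += block_size
    return applied
```

reloc_rva and reloc_size would come from the base relocation entry of the data directory. Import resolution and the DllMain call are separate steps and are not shown here.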
That is, I can do the work of the loader. But unfortunately there are some areas where the DLL loaded by the above trick will behave differently, since it's not a properly-loaded DLL from the OS's perspective. This includes the following:
The DllMain requires the DLL "module handle", which is just its base address. It may use this handle in calls to various API functions, such as LoadResource. Those calls will probably fail.
There will be problems with exception handling. The OS won't see the DLL's SAFESEH section, hence its internal exception handling code won't be invoked (it's a 64-bit DLL, which means SAFESEH is mandatory for exception handling).
Here's my question: Is there an API to properly load the DLL into the process address space without the need for it to be in a file? An alternative variant of LoadLibrary that works, say, on a file mapping instead of a file-system file?
Thanks in advance.
Yes, it is possible to load a DLL which is located in the resources of another image and execute it without needing a file! Take a look at this article, this is exactly what you want. It works, I tried it.
I have an old game executable with a large section of debug symbols, apparently in the Codeview format. How can I view the contents of this section in a human-readable format?
Current Windows compilers do not put the debug symbols into the image file itself; they only put a reference to an external symbols file into the image. The debug data goes into a separate symbols file with the PDB (Program Database) extension. As you mentioned, this format is also named CodeView. In your case, since the debug section is large, it looks like you might be confronted with a really old symbols format.
This article explains the different symbol formats.
Okay, given that this is for 32-bit Windows, I believe the normal Windows symbol handler API should be able to read the data. From there, it's pretty much a matter of deciding what data you want, and how you want it formatted.