How does software like Scylla find the start of a PE file on Windows?

I was dumping a PE file out of a process and was wondering how the tool had found the PE file in memory.
At first I thought that it was looking for the DOS string, but according to the documentation the software can also find PE files that are not loaded, so that is out of the question.

There are many ways to find loaded modules in memory if they were loaded in the normal way by the Windows OS loader or LoadLibrary, because the Process Environment Block (PEB) contains a pointer named 'Ldr' to the PEB_LDR_DATA structure, which holds a linked list of all loaded modules. This is the same list of loaded modules that Windows uses for the ToolHelp API CreateToolhelp32Snapshot.
If the module has been removed from Ldr.InMemoryOrderModuleList, or was loaded using a manual mapping routine, this won't work, in which case you could detect the module by scanning memory for the predictable PE header.
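To make the header scan concrete, here is a minimal sketch that works on a plain byte buffer rather than on a live process. It checks each page-aligned offset for the "MZ" DOS signature, follows e_lfanew (at offset 0x3C) and looks for the "PE\0\0" signature there. The function name find_pe_headers and the page-sized scan step are my own assumptions for illustration; this is not how Scylla or any specific tool is implemented.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Magic values and offsets from the PE format; no Windows headers needed.
constexpr std::uint16_t kDosMagic     = 0x5A4D;      // "MZ" read as little-endian
constexpr std::uint32_t kPeMagic      = 0x00004550;  // "PE\0\0" read as little-endian
constexpr std::size_t   kLfanewOffset = 0x3C;        // IMAGE_DOS_HEADER::e_lfanew
constexpr std::size_t   kScanStep     = 0x1000;      // assumption: images start on a page boundary

// Reads a value without requiring alignment. Assumes a little-endian host.
template <typename T>
static T read_le(const std::uint8_t* p) {
    T value;
    std::memcpy(&value, p, sizeof value);
    return value;
}

// Returns every offset in `mem` that looks like the start of a PE image.
std::vector<std::size_t> find_pe_headers(const std::uint8_t* mem, std::size_t size) {
    std::vector<std::size_t> hits;
    for (std::size_t off = 0; off + kLfanewOffset + 4 <= size; off += kScanStep) {
        if (read_le<std::uint16_t>(mem + off) != kDosMagic) continue;

        const std::uint32_t lfanew = read_le<std::uint32_t>(mem + off + kLfanewOffset);
        // Reject e_lfanew values that would point outside the buffer.
        if (lfanew > size - off || size - off - lfanew < 4) continue;

        if (read_le<std::uint32_t>(mem + off + lfanew) == kPeMagic) hits.push_back(off);
    }
    return hits;
}
```

Against a real process, the buffer would have to be filled region by region (for example with ReadProcessMemory), and a match is only a candidate: checking both signatures reduces false positives but does not prove that the image is valid.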
If the PE header has been erased and the module is not in the linked list (which is possible), this becomes more difficult. You would need some sort of heuristic that exploits the somewhat predictable nature of a PE file such as a DLL.
For instance, you have the PE file for the process, so you know which imports and relocations are applied. You also know which modules are loaded and where, so if you find memory pages outside of these locations whose page protection is set to executable, you can be pretty confident that they belong to hidden or at least unknown modules.
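The "executable pages outside known modules" check can be sketched as a simple range comparison. The types MemRange and MemRegion and the function find_unknown_executable are hypothetical names made up for this example. On Windows the region list would typically come from walking the address space with VirtualQueryEx and the module list from the loader data, but here both are plain inputs so the logic stands on its own.

```cpp
#include <cstdint>
#include <vector>

struct MemRange  { std::uint64_t base; std::uint64_t size; };
struct MemRegion { MemRange range; bool executable; };

// True if `inner` lies completely inside `outer`.
static bool contains(const MemRange& outer, const MemRange& inner) {
    return inner.base >= outer.base &&
           inner.base + inner.size <= outer.base + outer.size;
}

// Returns the executable regions that are not covered by any known module.
std::vector<MemRegion> find_unknown_executable(const std::vector<MemRegion>& regions,
                                               const std::vector<MemRange>& known_modules) {
    std::vector<MemRegion> unknown;
    for (const MemRegion& region : regions) {
        if (!region.executable) continue;  // only executable memory is interesting here

        bool covered = false;
        for (const MemRange& module : known_modules) {
            if (contains(module, region.range)) { covered = true; break; }
        }
        if (!covered) unknown.push_back(region);
    }
    return unknown;
}
```

The result is a list of candidates for closer inspection, not a verdict. Each reported region still needs to be examined before concluding that it holds a hidden module.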
Here are two excellent repos that may shed some light on the topic: Hollows Hunter and PE-Sieve.

Related

Does the machine code that implements Win APIs stay in OS kernel memory space, or is it compiled into the app?

If this question deals with too basic a matter, please forgive me.
As a somewhat-close-to-beginner-level programmer, I really wonder about this: is the underlying code of every Win API function compiled into an app when the app is built, or does the machine code for the Win APIs stay in memory as part of the OS from the time the PC boots, with apps merely using it?
All the APIs of an OS are used by many apps by means of function calls. So I thought that, rather than every individual app including the API machine code on its own, apps just contain the header or signature needed to call the APIs, and the addresses of the API machine code are mapped when the app is launched.
I am sorry that I failed to make this question succinct due to my poor English. I really would like to get your insights. Thank you.
The implementation for (most) API calls is provided by the system by way of compiled modules (Portable Executable images). Application code only contains enough information so that the system can identify and load the required modules, and resolve the respective imports.
As an example consider the following code that shows a message box, waits for it to close, and then exits the program:
#include <Windows.h>

int main()
{
    ::MessageBoxW(nullptr, L"Foo", L"Bar", MB_OK);
}
Given the function signature (declared in WinUser.h, which gets pulled in from Windows.h), the compiler can almost generate a call instruction. It knows the number of arguments, their expected types, and the order and location the callee expects them in. What's missing is the actual target address inside user32.dll. That is only known after the process has been fully initialized and has had the user32.dll module mapped into its address space.
Clearly, the compiler cannot postpone code generation until after load time. It needs to generate a call instruction now. Since we know that "all problems in computer science can be solved by another level of indirection" that's what the compiler does, too: Instead of emitting a direct call instruction it generates an indirect call. The difference is that, while a direct call immediately needs to provide the target address, an indirect call can specify the address at which the target address is stored.
In x86 assembly, instead of having to say
    call _MessageBoxW@16 ; uh-oh, not yet known
the compiler can conveniently delegate the call to the Import Address Table (IAT):
    call dword ptr [__imp__MessageBoxW@16]
Disaster averted: we've bought ourselves just enough time to fix things up before the code actually executes.
Once a process object is created, the system hands over control to its primary thread to finish initialization. Part of that initialization is loading dependencies (such as user32.dll here). Once that has completed, the system finally knows the load address (and ultimately the addresses of imported symbols, such as _MessageBoxW@16), and can overwrite the IAT entry at address __imp__MessageBoxW@16 with the imported function address.
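The indirection can be imitated in portable C++ with a function pointer that plays the role of an IAT slot. Everything in this sketch (double_it, imp_double_it, app_code, resolve_imports) is a made-up stand-in; it only mirrors the idea of "call through a slot that gets filled in later" and is not how the Windows loader is actually written.

```cpp
#include <map>
#include <string>

using ApiFn = int (*)(int);

// Stand-in for a function exported by a system DLL.
int double_it(int x) { return 2 * x; }

// Stand-in for the IAT slot: one pointer-sized variable per imported function.
// It is empty until the "loader" has run.
ApiFn imp_double_it = nullptr;

// Application code. It was compiled before the target address was known,
// so it calls indirectly through the slot instead of calling double_it directly.
int app_code(int x) { return imp_double_it(x); }

// Stand-in for the loader: look the import up by name in the module's
// export table and patch the slot with the resolved address.
bool resolve_imports(const std::map<std::string, ApiFn>& exports) {
    const auto it = exports.find("double_it");
    if (it == exports.end()) return false;  // unresolved import: loading fails
    imp_double_it = it->second;
    return true;
}
```

Calling app_code before resolve_imports would dereference a null pointer, which is the reason the loader has to finish patching the table before any application code runs.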
And that is approximately how the system provides implementations for system services without requiring client applications to know where (physically) they will find them.
I'm saying "approximately" because things are somewhat more involved in reality. If that is something you'll want to learn about, I'll leave it up to Raymond Chen. He has published a series of blog entries covering this topic in far more detail:
How were DLL functions exported in 16-bit Windows?
How were DLL functions imported in 16-bit Windows?
How are DLL functions exported in 32-bit Windows?
Exported functions that are really forwarders
Rethinking the way DLL exports are resolved for 32-bit Windows
Calling an imported function, the naive way
How a less naive compiler calls an imported function
Issues related to forcing a stub to be created for an imported function
What happens when you get dllimport wrong?
Names in the import library are decorated for a reason
Why can't I GetProcAddress a function I dllexport'ed?

Implicit interprocess shared memory on Windows?

What I would like to do is mark a specific area of memory as being automatically shared between processes of the same image/binary, similar to __declspec(allocate)... and __pragma(section...).
I know that I can use named pipes or equivalent, but for this purpose I would like to avoid system calls or additional overhead. I'm just unsure if there is any way to inform the NT kernel to map a specific range of pages automatically for each process of an image. I haven't found anything on MSDN, though MSDN doesn't include undocumented functionality (by definition), which I am fine with using.
I also don't see any specific PE section names/flags which would indicate such, though it is possible that I am missing something.
Edit: I've noticed that there is actually a PE section flag IMAGE_SCN_MEM_SHARED, though I need to investigate how it works.
You can use #pragma comment(linker, "/SECTION:.shared,RWS") and #pragma data_seg(".shared") to declare things in a shared memory segment (only works in Visual Studio). See Sharing Variables Between Win32 Executables.
Otherwise, if that is not an option for you, the only other way to share memory between processes is to use a Memory Mapped File via CreateFileMapping() and MapViewOfFile/Ex(). See Creating Named Shared Memory.
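For the section-based approach, a minimal sketch could look like this. The section name .shared comes from the answer above; the variable g_instance_count and the helper register_instance are hypothetical names chosen for the example. The pragmas are specific to the Microsoft toolchain, so other compilers will most likely ignore them and treat the variable as an ordinary per-process global.

```cpp
// MSVC-specific. Other compilers ignore these pragmas (possibly with a warning),
// in which case g_instance_count is just a normal global and nothing is shared.
#pragma comment(linker, "/SECTION:.shared,RWS")  // R = read, W = write, S = shared

#pragma data_seg(".shared")
// Variables in the segment must be explicitly initialized; uninitialized
// ones are placed in the default uninitialized data section instead.
volatile long g_instance_count = 0;
#pragma data_seg()  // switch back to the default data section

// Counts callers. Assumption: this is not atomic, so real cross-process code
// would use an interlocked operation or a named mutex around the update.
long register_instance() {
    g_instance_count = g_instance_count + 1;
    return g_instance_count;
}
```

With the section marked shared, every process running the same image file sees the same g_instance_count. Keep in mind that pointers stored in such a section are only meaningful inside the process that wrote them.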

{$IMAGEBASE $13140000} directive in a unit from an advanced hooking / injection library: explanation needed

What I have done so far:
I found it in the AfxCodeHook.pas unit by Aphex.
I have also skimmed a bunch of interesting sample codes using it:
Inject Library (how to inject a DLL into another process).
Inject Library Ex (how to inject a DLL into another process using the Ex method).
Create Process Ex (how to inject a DLL into a created process using the Ex method).
Inject Executable (InjectLibraryEx's true power: the ability to inject EXE files).
Simple Api Hooking (how to use afxcodehook to manipulate calls to windows apis).
I have also read:
Embarcadero's RAD Studio help entry on Image base address and
Base address page in Wikipedia.
Question:
I am looking for an informed opinion and a simple explanation of the {$IMAGEBASE $13140000} directive in layman's terms from seasoned Delphi coders.
This specifies the preferred base address of the DLL. If the DLL can be loaded at this address, then the loader will do so. If it cannot, then it needs to be relocated, and all the absolute addresses in the DLL (such as the targets of absolute jumps) need to be adjusted to the new address.
When the loader attempts to map a DLL into a process address space, it first reads the preferred base address. Then it works out the size of the DLL. Finally it checks to see if a contiguous block of memory stretching from the base address to the base address + size can be found. If so then the DLL is loaded at the preferred base address. If another DLL, or the exe resides at the preferred base address, then the DLL will need to be relocated. If the application has reserved heap memory that overlaps with the preferred DLL load address space, then the DLL will need to be relocated.
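The decision described above boils down to an overlap test followed by a fix-up pass. Here is a rough sketch under simplifying assumptions: the address space is modelled as a list of occupied ranges, and the image's absolute addresses are given as a plain vector instead of being located through the relocation table. ImageRange, can_load_at_preferred and apply_relocations are names invented for this example.

```cpp
#include <cstdint>
#include <vector>

struct ImageRange { std::uint64_t base; std::uint64_t size; };

// Two half-open ranges [base, base + size) overlap if each starts before the other ends.
static bool overlaps(const ImageRange& a, const ImageRange& b) {
    return a.base < b.base + b.size && b.base < a.base + a.size;
}

// True if the whole block [preferred, preferred + image_size) is still free.
bool can_load_at_preferred(std::uint64_t preferred, std::uint64_t image_size,
                           const std::vector<ImageRange>& in_use) {
    const ImageRange wanted{preferred, image_size};
    for (const ImageRange& used : in_use) {
        if (overlaps(wanted, used)) return false;  // another DLL, the EXE, a heap...
    }
    return true;
}

// Relocation: shift every absolute address by the distance between the
// address the image was linked for and the address it actually got.
void apply_relocations(std::vector<std::uint64_t>& absolute_addresses,
                       std::uint64_t preferred_base, std::uint64_t actual_base) {
    const std::uint64_t delta = actual_base - preferred_base;  // unsigned wrap also handles lower bases
    for (std::uint64_t& address : absolute_addresses) address += delta;
}
```

For the directive in the question, preferred would be 0x13140000: an unusual value that other modules are unlikely to ask for, which makes a collision (and therefore relocation) less likely.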
If a DLL needs to be relocated then its physical pages cannot be shared between processes. The Windows system DLLs have carefully chosen base addresses to ensure that there are no collisions and that they can be shared.
Nowadays, Address Space Layout Randomization (ASLR) complicates matters even further.
You can learn more from these articles:
Dr. Dobbs: Rebasing Win32 DLLs
Peering Inside the PE: A Tour of the Win32 Portable Executable File Format

Is it possible to replace Loader of an OS? Any way to obtain the control over Loader?

I was just wondering if it is possible to replace the loader (the executable program loader, not the boot loader) of an operating system (Windows is my choice). Are there any third-party loaders available that would patch the default one?
Is there any way through which I can obtain control over the OS loader? I mean, I want the things it is doing to be visible to me (each and every step).
If you ask me why I want to do this: for learning purposes.
No. Process creation and the user-mode loader in ntdll are tied together (PsCreateProcess will directly map in ntdll and jump to it so that it can finish resolving modules and setting up the process), so you cannot replace it.
If you want to play with this sort of thing then Linux is the way to go.
The loader is part of the kernel, but as you have access to all the kernel source you can play with it to your heart's content.
Linux has pluggable executable file formats, so it is possible to add an extra program loader which will do its own custom stuff with executable files, rather than the standard ones (ELF, shell scripts, binfmt_misc).
The binfmt_misc module allows you to write custom loaders for executable programs entirely in userspace; this is commonly used to execute non-native binaries or interpreted binaries such as Java, CLR executables etc.
On the other hand if you wanted to replace the ELF loader with something else you can make a binfmt module directly in the kernel. Look at fs/binfmt_* for examples. The ELF loader itself is in there.
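As a userspace analogy for "pluggable executable formats", the sketch below keeps a list of format handlers and asks each one in turn whether it recognizes the file, based on its leading magic bytes. This only illustrates the dispatch idea. The real kernel handlers in fs/binfmt_* are C code with a different interface, and FormatHandler, default_handlers and pick_loader are names made up for this example.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// One pluggable handler: a name plus a "do I recognize this file?" check.
struct FormatHandler {
    std::string name;
    bool (*recognizes)(const unsigned char* data, std::size_t size);
};

// ELF files start with 0x7F 'E' 'L' 'F'. The literal is split so that
// the 'E' is not parsed as part of the \x7f hex escape.
static bool is_elf(const unsigned char* data, std::size_t size) {
    return size >= 4 && std::memcmp(data, "\x7f" "ELF", 4) == 0;
}

// Scripts start with "#!" followed by the interpreter path.
static bool is_script(const unsigned char* data, std::size_t size) {
    return size >= 2 && data[0] == '#' && data[1] == '!';
}

std::vector<FormatHandler> default_handlers() {
    return {{"elf", &is_elf}, {"script", &is_script}};
}

// Tries each handler in registration order; returns "" if none accepts the file.
std::string pick_loader(const std::vector<FormatHandler>& handlers,
                        const unsigned char* data, std::size_t size) {
    for (const FormatHandler& handler : handlers) {
        if (handler.recognizes(data, size)) return handler.name;
    }
    return "";
}
```

Supporting a new format then means appending another handler to the list, which is roughly the role a binfmt_misc registration or an additional binfmt module plays on a real system.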
Since each of the answers and comments gives useful information, I have compiled all of them into a single post.
"I was just wondering if it is possible to replace Loader (executable program loader not the boot loader) of an Operating System (Windows is my choice)."
No. On Windows, process creation and the user-mode loader in ntdll are tied together (PsCreateProcess will directly map in ntdll and jump to it so that it can finish resolving modules and setting up the process), so you cannot replace it.
However, there are resources available describing the format and loading of processes.
Here is a quite old but still up-to-date pair of MSDN articles on PE files (EXE and DLL):
Part I. An In-Depth Look into the Win32 Portable Executable File Format by Matt Pietrek (MSDN Magazine, February 2002)
Part II. An In-Depth Look into the Win32 Portable Executable File Format by Matt Pietrek (MSDN Magazine, March 2002)
You can use this information to write an app that starts a given executable.
If you are more interested in Linux and the ELF format, you will find all you need on Google.
"Is there any way through which I can obtain the control over the OS Loader? I mean, I want things it is doing to be visible to me(each and every step)."
On Windows, you can get some visibility into the loader at work by enabling Loader Snaps. You do this with gflags.exe (part of Debugging Tools for Windows). There's a nice gflags.exe reference http://www.osronline.com/DDKx/ddtools/gflags_4n77.htm . With Show Loader Snaps enabled, you can see loader trace messages by starting the application under a debugger (WinDBG).
If you want to play with this sort of thing, then Linux is the best way to go.
The loader is part of the kernel, but as you have access to all the kernel source you can play with it to your heart's content.
The loaders for the various binary formats are in fs/binfmt_*.c in the Linux source (fs/binfmt_elf.c is the loader used for executables in ELF format, i.e. the vast majority).
The dynamic loader /lib{,64}/ld-linux.so.2 is also used for dynamically linked binaries. It is an example of an "interpreter" as referenced by the code in binfmt_elf.c.
No, you cannot replace the OS loader, but there are resources available describing the format and loading of processes.
Here is a quite old but still up-to-date MSDN article on PE files (EXE and DLL): http://msdn.microsoft.com/en-us/magazine/cc301805.aspx
You can use this information to write an app that starts a given executable.
If you are more interested in Linux and the ELF format, you will find all you need on Google.

Is there an ideal size for executable modules on Windows?

I've been taking note of the .exe file size of many applications.
I saw that Visual Studio 2005 has an .exe size of 453KB, and VS2008 of 1.04MB because they divide the application into many parts (.exe + many .dll files).
I saw also that MS Outlook has a very large .exe file (11.8MB) while MS Word is very small (398KB)!
After pondering the things that I had seen, I was left with these questions:
Is there an advantage to having a small .exe, even if the final size of the application (all DLLs loaded) is much larger?
And if so, at what size is it good to begin breaking up an application into separate modules?
A large executable probably uses fewer shared libraries (less reusable code), meaning that the application is probably more memory intensive than one that uses shared libraries. If you want a system with a minimal memory footprint, you want shared libraries.
A big exe is likely to be more self-contained, which might make it easier to keep a stable deployment.
As a general rule, a big EXE may mean that you have collected all the dependencies into a single file, so there are no dependency problems such as DLL hell. However, every time you fix anything in a small module, the user has to download the whole big EXE again.
On the other hand, a small EXE may mean that you have split your application into separate modules, which are easy to maintain and upgrade individually, so the user does not have to download one big EXE.
These articles seem to be a very nice summary of the file format that Windows executables (and DLLs) use:
An In-Depth Look into the Win32 Portable Executable File Format, Part 1 and Part 2
As far as I have been able to see, the author nowhere mentions whether a certain executable size is better than a smaller or bigger one.
So this could mean that the size of an executable doesn't really matter to Windows.
On the other hand, I think that when you open an executable in Windows, the file will be scanned by virus scanners (if any) before it is handed off to the OS for execution. So a larger executable could mean a longer startup time in the presence of a scanning application.

Resources