Can I manually insert ImageBase value of PE file? - portable-executable

Can I manually insert ImageBase value of PE file?
Basically:
ImageBase of DLL = 10000000
ImageBase of EXE = 00400000
If this is possible, I want to change ImageBase to a random address.
How can I do this?

You can easily change the base address AND prevent Windows from relocating your executable module to a random base. I should stress that if you have access to the build environment, you should prefer specifying the base address and preventing the DYNAMICBASE flag from being placed in the module to begin with at build time, allowing the linker to make the proper optimizations. To do this with MSVC, you'd specify linker flags:
/BASE:0x400000
/DYNAMICBASE:NO
Altering the image base after the fact CAN be done and will work for simple modules, but in some instances it could result in crashes, depending on how the code was generated. Sometimes there is little choice, because one does not have access to the original source code.
Code and data accesses may hardcode values based on the ImageBase the module was originally linked with. If you want to modify a module after it has been built, read on.
While Address Space Layout Randomization (ASLR) behavior was introduced in Windows Vista, the modifications suggested here WILL work on ANY version of Windows.
NOTE: The preceding statement assumes that Microsoft doesn't, in the future, start randomizing image base addresses without regard to the relevant PE flags in the header, or refuse to load these modules altogether. As of the present versions of Windows 10, Windows honors images that DO NOT contain the IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE flag and does not dynamically relocate them.
Using a hex editor, a tool like MSVC's editbin, or even your own code, make the following modifications to the PE headers of the desired module to set a FIXED base-load address:
-set the desired IMAGE_OPTIONAL_HEADER -> ImageBase (e.g. 0x400000)
e.g.: editbin.exe /rebase:base=0x400000 <YOUR_MODULE>
-remove the 0x0040 (DYNAMIC_BASE) bit from the IMAGE_OPTIONAL_HEADER -> DllCharacteristics flags, or use editbin:
e.g.: editbin /dynamicbase:no <YOUR_MODULE>
-if not using editbin, you will need to recalculate the header checksum, or just leave it at zero for any module that is not a driver or a start-up Windows service; editbin updates the checksum automatically.
NOTES:
-manually changing the module's base address may require that you walk the .reloc section entries and perform the fixups for your new base address yourself, either statically or at runtime (simulating what the Windows loader does); not doing so could result in crashes. To avoid this hassle, just remove the DYNAMIC_BASE flag and leave the base address the same as when the module was built. That still prevents ASLR, even though the original base address doesn't change.
-the editbin version must come from MSVC 2005 SP1 (8.0.50727.161) or later to support the /dynamicbase argument; the editbin in any free modern version of the MSVC C++ toolset has this feature. In my experience the /rebase option might report the cryptic "LNK1175: failed to rebase ; error 487" even for modules without a .reloc section, which ultimately forces you to use a PE editor to change ImageBase.
-the changes above may break embedded digital signature checks, or anything else that verifies the integrity of the original file, since the file has been modified.
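The header edits described above can be sketched in a few lines of Python. This is only an illustration, not a replacement for editbin: the function name `set_fixed_base` is made up for this example, the field offsets are the standard PE32/PE32+ optional-header offsets, and the checksum is simply zeroed (which, as noted above, is only acceptable for modules that are not drivers or start-up services). It does NOT apply any .reloc fixups, so the caveat about changing ImageBase still applies.

```python
import struct

IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE = 0x0040


def set_fixed_base(image: bytes, new_base: int) -> bytes:
    """Return a copy of a PE image with ImageBase set and DYNAMIC_BASE cleared.

    Hypothetical helper for illustration; it only edits header fields.
    """
    buf = bytearray(image)
    if buf[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew (offset 0x3C in the DOS header) points at the "PE\0\0" signature.
    (pe_off,) = struct.unpack_from("<I", buf, 0x3C)
    if buf[pe_off:pe_off + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # The optional header follows the 4-byte signature and 20-byte COFF header.
    opt = pe_off + 24
    (magic,) = struct.unpack_from("<H", buf, opt)
    if magic == 0x10B:      # PE32: 32-bit ImageBase at offset 28
        struct.pack_into("<I", buf, opt + 28, new_base)
    elif magic == 0x20B:    # PE32+: 64-bit ImageBase at offset 24
        struct.pack_into("<Q", buf, opt + 24, new_base)
    else:
        raise ValueError("unknown optional header magic")
    # DllCharacteristics sits at offset 70 in both PE32 and PE32+.
    (flags,) = struct.unpack_from("<H", buf, opt + 70)
    flags &= ~IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE & 0xFFFF
    struct.pack_into("<H", buf, opt + 70, flags)
    # CheckSum sits at offset 64; zero it instead of recalculating.
    struct.pack_into("<I", buf, opt + 64, 0)
    return bytes(buf)
```

A typical use would be to read the module with `open(path, "rb").read()`, pass the bytes through `set_fixed_base`, and write the result to a new file, keeping the original as a backup.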

As far as I remember, the Windows PE loader decides on the base loading address (ImageBase in your question), and you cannot select it manually unless you write a PE loader yourself.
Starting with Windows Vista, Windows uses an address randomizer to select a random base loading address. So it is no longer something like 0x10000000 or 0x00400000, and it changes on every run unless the process is started in special situations such as debug mode.

Related

How does Windows handle multiple DLLs loaded in memory without position independent code?

Linux and MacOS leverage the power of Position-Independent Code. There's no such thing on Windows and yet programs can link against shared DLLs normally. I can't seem to find good documentation on this topic, besides a couple of terse articles on the Microsoft website (here and here).
Does Windows just copy the DLL code in memory and adjust function addresses as needed? What if two programs link against the same library? Could the virtual memory mechanism be involved somehow?
Windows PE (.exe/.dll) files contain relocation data that allows the loader to adjust addresses as required if the code is loaded at an address other than the intended base address.
The relocation table is essentially just a list of offsets within the binary that need to be adjusted. For example, if a .dll with a base address of 0x100000 is instead loaded at 0x300000, each of the addresses listed in the relocation table will have (0x300000 - 0x100000) = 0x200000 added to it.
Further details on the format of the relocation data within the PE file, and on the structure of such files generally, can be found here: https://learn.microsoft.com/en-us/previous-versions/ms809762(v=msdn.10)#pe-file-base-relocations
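The arithmetic above can be sketched as follows. This is a deliberately simplified model: the real .reloc section groups entries into per-page blocks with a type for each entry, whereas here the relocation table is assumed to be a flat list of offsets that each point at a 32-bit absolute address. The function name `apply_relocations` is invented for this example.

```python
import struct


def apply_relocations(image: bytearray, offsets, preferred_base: int, actual_base: int) -> int:
    """Add the load delta to every 32-bit address listed in `offsets`.

    Simplified model: `offsets` stands in for the relocation table.
    """
    delta = actual_base - preferred_base
    for off in offsets:
        (value,) = struct.unpack_from("<I", image, off)
        # Wrap to 32 bits, as a 32-bit loader would.
        struct.pack_into("<I", image, off, (value + delta) & 0xFFFFFFFF)
    return delta


# Two pointers linked against a preferred base of 0x100000.
image = bytearray(16)
struct.pack_into("<I", image, 4, 0x100010)
struct.pack_into("<I", image, 12, 0x100020)

delta = apply_relocations(image, [4, 12], 0x100000, 0x300000)
print(hex(delta))  # 0x200000
```

After the call, the two stored pointers read 0x300010 and 0x300020, i.e. the same offsets into the module at its new base.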

MapViewOfFileEx(): Does porting an application to x64 require us to specify a different address (64-bit) for the argument lpBaseAddress?

I'm in the process of porting an existing win32 application to x64.
In one of the modules, I see a fixed base address passed to MapViewOfFileEx() as the "lpBaseAddress" argument. The value passed is 0x20000000.
In one of the porting guidelines, I read that we should stay away from such "magic numbers" while porting to x64.
But, the code using the base address 0x20000000 is a legacy one and is called from lots of other modules for shared memory allocation. So, I'm hesitant to change the value of this address while porting to x64.
I'd like to know if the code ported to x64 will work well with the same base address?
As a side note, I also see that the current (x86) build invokes the linker with a /base option value of 0x1C000000, i.e. -base:0x1C000000.
Does this have any relation to the valid value of base address we can request from MapViewOfFileEx()?
Any insight/ideas will be greatly appreciated.
Edit:
To clarify, this question doesn't pertain to any addresses per se. What I want to know is whether a 32-bit constant address passed to MapViewOfFileEx() can be reused while porting to x64 platform. The reference to linker option "base" was to ask if the address specified as the base address while linking has any relation to the address lpBaseAddress we pass to MapViewOfFileEx().
This is a bit of a non-question. The real issue is why the file must be mapped at that address, and I'm having a tough time believing that changing the 'legacy' code to be more flexible is completely off the table.
Calling MapViewOfFileEx with a specific base address is really, really dangerous. There is never any guarantee that Windows will be able to honour that request: sooner or later, even if it's only one time in a hundred (which is the worst kind of bug, no?), that address will already be occupied. ASLR is a case in point, or Windows might have put the heap there, or whatever.
So, tl;dr: don't do that. Just don't. Find another way.

Is the same DLL guaranteed to be mapped to the same virtual address in every process using it?

I'm studying Windows system internals and the question is just a guess.
I have learned that a DLL is a form of shared library, so at least the code section of the same DLL is shared between the processes using it (by adding the same page entries to the page tables of these processes). The code section usually contains things like jump tables, which need to be relocated (i.e. the run-time virtual address must be written in to fix each pointer) before the code is ready to be executed.
Assume that the same DLL, aa.dll, is mapped in two different processes at different virtual addresses (e.g. 0x00400000 in a.exe and 0x00410000 in b.exe). The same pointer (at .text+0x100) will be fixed up to different addresses (e.g. 0x00400100 in a.exe and 0x00410100 in b.exe). So we would have to make a copy of the code section and change it to suit each process. Then how can the code section be shared?
Am I right?
Answering my own question. The first time a DLL is loaded, Windows tries to load it at its preferred address, which requires no relocation (i.e. no fixing of addresses to account for where the code segment actually ended up). If it cannot be loaded at the preferred address, it is allocated virtual pages at a free address, backed by the DLL file itself (not the swap file) but marked as copy-on-write. Windows then has to go and fix up the code using the relocation table. Hopefully only a small percentage of the code needs to be fixed up; each page of code that is changed is copied on write and placed in physical memory somewhere.
I believe this happens each time a process cannot load a DLL at the preferred address. This is why popular DLLs sometimes need to be rebased, so that their preferred addresses don't conflict.
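To get a feel for how much of a relocated DLL stops being shared, one can count which pages contain at least one fixup, since those are the pages that would be copied on write. The sketch below assumes 4 KiB pages and a flat list of fixup offsets (the real relocation table is organised in per-page blocks), and it ignores fixups that straddle a page boundary. The function name `pages_touched_by_fixups` is invented for this example.

```python
PAGE_SIZE = 0x1000  # assumed 4 KiB pages


def pages_touched_by_fixups(fixup_offsets, image_size: int):
    """Return (sorted page indexes that get modified, total page count)."""
    total_pages = (image_size + PAGE_SIZE - 1) // PAGE_SIZE
    dirty = {off // PAGE_SIZE for off in fixup_offsets}
    return sorted(dirty), total_pages


# Three fixups in a 5-page image: two land in page 0, one in page 2.
dirty_pages, total = pages_touched_by_fixups([0x10, 0x20, 0x2004], 0x5000)
print(dirty_pages, total)  # [0, 2] 5
```

In this example only two of the five pages would become private copies; the other three could still be shared with other processes.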

ASLR and Windows System DLLs for non-aware executables?

From a Microsoft article:
Address Space Layout Randomization (ASLR)
ASLR moves executable images into random locations when a system boots, making it harder for exploit code to operate predictably. For a component to support ASLR, all components that it loads must also support ASLR. For example, if A.exe consumes B.dll and C.dll, all three must support ASLR. By default, Windows Vista and later will randomize system DLLs and EXEs, but DLLs and EXEs created by ISVs must opt in to support ASLR using the /DYNAMICBASE linker option.
I don't quite get it. Take the base system DLLs loaded by every process on Windows: NtDll.dll and kernel32.dll.
If I have a non-aware executable, will these system DLLs use ASLR? That is, will they load at a different base address after every system reboot on Win 7 for this executable, or will they always load at the same base address after a reboot, like they do on Win XP?
To make it more clear what I mean: My typical dummy program's startup stack will look like this:
write_cons.exe!wmain() Line 8 C++
write_cons.exe!__tmainCRTStartup() Line 583 + 0x19 bytes C
write_cons.exe!wmainCRTStartup() Line 403 C
> kernel32.dll!_BaseProcessStart@4() + 0x23 bytes
Looking at the asm of BaseProcessStart, I see on my XP box here:
_BaseProcessStart@4:
7C817054 push 0Ch
7C817056 push 7C817080h
7C81705B call __SEH_prolog (7C8024D6h)
7C817060 and dword ptr [ebp-4],0
...
Now what interests me is the following:
On Windows XP, the address will always be 0x7C817054, regardless of how many times I reboot this machine. If I were on Win7 with ASLR, will this address change between reboots if the executable that loads kernel32.dll is not enabled for ASLR?
(Note: For me, at the moment, there is only one minor use case this address would be useful for: in Visual Studio, I can only set a "Data Breakpoint" for assembly-level functions, that is, a breakpoint @ 0x7... If I want to break in a specific ntdll.dll or kernel32.dll function, on Windows XP I do not have to adjust my breakpoints between reboots. With ASLR kicking in (the scope of this question) I would have to change the Data Breakpoints between reboots.)
Technically, whether the system DLLs get relocated or not shouldn't matter, as the linker binds to symbols, not addresses. These symbols are resolved by the runtime loader into addresses in the loaded system DLLs, so your binary should be none the wiser. From what I've seen, however, Windows 7 will reset the base randomization on every reboot, including for system DLLs (note: this is from debugging WOW64 apps on Windows Server 2008 R2). You can also disable ASLR system-wide via some registry edits, but that's not really relevant...
Update:
The section on ASLR in this article explains what gets relocated and when.
It doesn't mention whether the base will reset on every reboot, but for system DLLs it's never going to be guaranteed to load at the same address twice, reboot or no reboot.
The important thing is that, according to the article, everything needs to opt in to ASLR for system DLLs to be relocated.
Your program will resolve calls into system DLLs wherever they happen to be loaded. But, unless your executable is linked with /DYNAMICBASE, it will not be given a randomized base address. In other words, your exe will always load at the same base address.
If you want your exe to load at a randomized address, then you have to link it with /DYNAMICBASE, and every DLL that it references must also have been linked with /DYNAMICBASE. The system DLLs (starting in Vista) are all linked with /DYNAMICBASE.
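Whether a given exe or DLL was linked with /DYNAMICBASE can be checked by looking at the DYNAMIC_BASE bit (0x0040) of DllCharacteristics in its PE optional header. The following is a rough sketch that works on the raw file bytes; the function name `has_dynamic_base` is invented here, and the offsets are the standard PE header offsets.

```python
import struct

IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE = 0x0040


def has_dynamic_base(image: bytes) -> bool:
    """Report whether a PE image opts in to ASLR (hypothetical helper)."""
    if image[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew at 0x3C gives the offset of the "PE\0\0" signature.
    (pe_off,) = struct.unpack_from("<I", image, 0x3C)
    if image[pe_off:pe_off + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # Optional header starts 24 bytes after the signature offset;
    # DllCharacteristics is at offset 70 for both PE32 and PE32+.
    (flags,) = struct.unpack_from("<H", image, pe_off + 24 + 70)
    return bool(flags & IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE)
```

Running it over the exe and each DLL it references (for example with `has_dynamic_base(open(path, "rb").read())`) shows which modules have opted in.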

what's in a .exe file?

So a .exe file is a file that can be executed by windows, but what exactly does it contain? Assembly language that's processor specific? Or some sort of intermediate statement that's recognized by windows which turns it into assembly for a specific processor? What exactly does windows do with the file when it "executes" it?
MSDN has an article "An In-Depth Look into the Win32 Portable Executable File Format" that describes the structure of an executable file.
Basically, a .exe contains several blobs of data and instructions on how they should be loaded into memory. Some of these sections happen to contain machine code that can be executed (other sections contain program data, resources, relocation information, import information, etc.)
I suggest you get a copy of Windows Internals for a full description of what happens when you run an exe.
For a native executable, the machine code is platform specific. The .exe's header indicates what platform the .exe is for.
When running a native .exe the following happens (grossly simplified):
A process object is created.
The exe file is read into that process's memory. Different sections of the .exe (code, data, etc.) are mapped in separately and given different permissions (code is execute, data is read/write, constants are read-only).
Relocations occur in the .exe (addresses get patched if the .exe was not loaded at its preferred address.)
The import table is walked and dependent DLLs are loaded.
DLLs are mapped in a similar way to .exe files, with relocations occurring and their dependent DLLs being loaded. Imported functions from DLLs are resolved.
The process starts execution at an initial stub in NTDLL.
The initial loader stub runs the entry points for each DLL, and then jumps to the entry point of the .exe.
Managed executables contain MSIL (Microsoft Intermediate Language) and may be compiled so they can target any CPU that the CLR supports. I am not that familiar with the inner workings of the CLR loader (what native code initially runs to boot strap the CLR and start interpreting the MSIL) - perhaps someone else can elaborate on that.
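As a small illustration of the header indicating the target platform: the COFF file header that follows the "PE\0\0" signature starts with a 16-bit Machine field. The sketch below reads it from the raw file bytes; the function name `pe_machine` and the short name table are made up for this example, and only three common machine values are listed.

```python
import struct

# A few well-known IMAGE_FILE_MACHINE_* values; the table is not exhaustive.
MACHINE_NAMES = {0x014C: "x86", 0x8664: "x64", 0xAA64: "ARM64"}


def pe_machine(image: bytes) -> str:
    """Return a short name for the platform a PE file targets."""
    if image[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew at 0x3C gives the offset of the "PE\0\0" signature.
    (pe_off,) = struct.unpack_from("<I", image, 0x3C)
    if image[pe_off:pe_off + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # Machine is the first field of the COFF header, right after the signature.
    (machine,) = struct.unpack_from("<H", image, pe_off + 4)
    return MACHINE_NAMES.get(machine, hex(machine))
```

Called as `pe_machine(open(path, "rb").read())`, it returns a name from the table, or the raw hex value for machine types the table does not know.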
I can tell you what the first two bytes of a .exe file contain: the characters 'MZ'.
They are the initials of Mark Zbikowski, the guy who designed the exe file format.
http://en.wikipedia.org/wiki/Mark_Zbikowski
1's and 0's!
This wikipedia link will give you all the info you need on the Portable Executable format used for Windows applications.
An EXE file is really a type of file known as a Portable Executable. It contains binary data that can be read and executed by the processor (essentially x86 instructions). There's also a lot of header data and other miscellaneous content. The actual executable code is located in a section called .text and is stored as machine instructions (processor specific). This code (as well as other parts of the .EXE) is loaded into memory, and the CPU is pointed at it and starts executing. (Note that a lot more is actually going on; this is a simplified explanation.)
