Why does editbin /swaprun:CD /swaprun:NET change two bytes? - windows

Calling editbin for a dll with the options /swaprun:CD and /swaprun:NET changes the PE header word of the dll, setting bits $0400 and $0800 (so actually it only changes the high byte).
That's what it is supposed to do.
But it also changes another byte (see hex comparison).
Can anybody explain to me what this byte means and why it is being changed?
edit: To clarify:
editbin with these options is supposed to set the
IMAGE_FILE_REMOVABLE_RUN_FROM_SWAP and IMAGE_FILE_NET_RUN_FROM_SWAP bits in the PE header's Characteristics field (which is a 16 bit word). This is the first byte I am talking about. None of these flags is stored in the second byte, so why does the tool change more than necessary and what does it mean?

IMAGE_FILE_HEADER.Characteristics |= IMAGE_FILE_REMOVABLE_RUN_FROM_SWAP|IMAGE_FILE_NET_RUN_FROM_SWAP;
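So Characteristics goes from 0x2122 to 0x2d22 (= 0x2122 | 0x0c00).
In addition, IMAGE_OPTIONAL_HEADER.CheckSum is changed from 0x000a3c31 to 0x000a4831. editbin updates the image checksum after modifying the header, and that is the other byte you see changing. The checksum differs by exactly 0x0c00, the same amount that was added to Characteristics.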
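To make the arithmetic concrete, here is a minimal sketch in C. The two flag values are the documented ones (0x0400 and 0x0800). The function pe_checksum_sketch is a hypothetical helper that follows the commonly described PE checksum scheme (add up the file as 16-bit little-endian words with the carry folded back in, skip the CheckSum field itself, then add the file length). Treat it as an illustration under that assumption, not as editbin's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#ifndef IMAGE_FILE_REMOVABLE_RUN_FROM_SWAP
#define IMAGE_FILE_REMOVABLE_RUN_FROM_SWAP 0x0400
#endif
#ifndef IMAGE_FILE_NET_RUN_FROM_SWAP
#define IMAGE_FILE_NET_RUN_FROM_SWAP 0x0800
#endif

/* Hypothetical helper: checksum over `len` bytes, treating the 4-byte
 * CheckSum field at `checksum_offset` (assumed to be even) as zero. */
uint32_t pe_checksum_sketch(const uint8_t *data, size_t len,
                            size_t checksum_offset)
{
    uint32_t sum = 0;

    assert(data != NULL && checksum_offset % 2 == 0);

    for (size_t i = 0; i + 1 < len; i += 2) {
        /* Skip the two 16-bit words that hold the checksum itself. */
        if (i == checksum_offset || i == checksum_offset + 2)
            continue;
        sum += (uint32_t)data[i] | ((uint32_t)data[i + 1] << 8);
        sum = (sum & 0xffffu) + (sum >> 16);   /* fold the carry back in */
    }
    if (len & 1u) {                            /* odd trailing byte */
        sum += data[len - 1];
        sum = (sum & 0xffffu) + (sum >> 16);
    }
    return sum + (uint32_t)len;                /* file length is added last */
}
```

Because the sum is additive, setting bits worth 0x0c00 in one header word raises the checksum by 0x0c00 as long as no carry wraps around, which is consistent with 0x000a3c31 becoming 0x000a4831.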

Related

Finding and patching an instruction in a DLL

I have a (C++) program where, in one of its dll's, the following is done:
if (m_Map.GetMaxValue() >= MAX_CLASSES) {
I have two binaries of this program (compiled with various versions of Visual Studio), one where MAX_CLASSES was #define'd to 50, and one where it was 75. These binaries were made from different branches of the code and the other functionality is different as well. What I need is a version of the binary where the MAX_CLASSES was defined as 50, except with the higher limit i.e. 75.
So a sane person would change the constant in the source code of the branch I need, rebuild and go home. But, building this software is complex because it's old, the dependencies and tooling are old, etc.; plus I have issues with building the installers, and data and so on. So, I thought, how about I just patch this binary so that this one constant is changed directly in the DLL. I have vague recollections of doing similar things in the 1990's for, eh, probably 'educational' purposes.
But times have changed and I barely remember doing it, let alone how I did things back then. I opened the DLL (one where the limit is set to 75, this is the binary I have at hand - I will have to re-do this as soon as I have the actual binary with the 50 limit, so the following references 75 i.e. 0x4b for illustrating the principle) in Ghidra and after some poking around, I found the following:
18005160e  3c 4b              CMP  AL,0x4b
180051610  0f 82 19 01 00 00  JC   LAB_18005172f
Which in the decompiler window I could link back to
if (bVar3 < 0x4b)
and some operations after that that I can map to the source code of the function I have.
Now my questions are:
how do I interpret the values above (the Ghidra output) wrt to the binary layout of the dll? When I hover over the first column value ('18005160e') in Ghidra, I get values for 'imagebase offset', 'memory block offset', 'function offset' and 'byte source offset'. Is this 'byte source offset' the physical address from the start of the dll where these instructions start? The actual value in this hover balloon is 50a0eh - is that Ghidra's notation for 0x50a0e ? I.e. does the trailing 'h' denote 'hex'?
I then tried to open the dll in a regular hex editor ('Hex Editor Neo' which I like to use to view/edit binary data files), and went to offset 0x50a0e, and looked for the values '3c 4b' around there which I didn't find. I searched for this byte sequence in the whole file, and found 7 occurrences, none of which are around 0x50a0e, leading me to think I'm misinterpreting Ghidra's 'byte source offset' here.
how do I make a 'patcher' for this? I would think what I need is a program that only does
FILE* fh = fopen("mydll.dll", "r+b");    /* open for in-place binary update */
fseek(fh, 0x[magic constant], SEEK_SET);
fputc(0x4b, fh);                         /* overwrite one byte */
fclose(fh);
where '0x[magic constant]' is hopefully just the value I got from Ghidra in 'byte source offset'? Or is there anything else I need to consider here? Is there any software tool that can generate a patcher program?
Thanks.
18005160e is a VA, a Virtual Address.
It is the sum of a Base Address (most likely 180000000) and an RVA, a Relative Virtual Address.
Find the Base Address of the DLL with any PE inspecting tool (e.g. CFF Explorer) or Ghidra itself.
Subtract the base address from 18005160e to get the RVA. Let's say the result is 5160e.
Now you need to find which section this RVA lies in. Again, use a PE inspection tool to list the sections with their virtual start (RVA) and virtual size.
Say the RVA lies in the .text section, which starts at RVA 1000.
Subtract this start RVA from the result above: 5160e - 1000 = 5060e.
This is the offset of the instruction in the .text section.
To find the offset in the file, just add the raw offset at which the section starts in the file (again you can find this with a PE inspection tool).
Say the .text section starts at file offset 400; then 5060e + 400 = 50a0e is the file offset corresponding to the VA 18005160e. With these example values, that is the same number as the 50a0eh that Ghidra shows as 'byte source offset'.
This is all PE 101.
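The steps above can be sketched in C as follows. This is only an illustration: SectionInfo, va_to_file_offset and patch_byte are hypothetical names, the section table has to be filled in from whatever your PE tool reports, and the patcher checks the old byte first so that a wrong offset does not silently corrupt the DLL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical mirror of the section header fields needed here. */
typedef struct {
    uint32_t virtual_address;  /* RVA at which the section starts */
    uint32_t virtual_size;     /* size of the section in memory */
    uint32_t raw_offset;       /* file offset of the section data */
    uint32_t raw_size;         /* number of bytes stored in the file */
} SectionInfo;

/* VA -> RVA -> offset inside the section -> file offset.
 * Returns 0 on success, -1 if the VA is not backed by file data. */
int va_to_file_offset(uint64_t va, uint64_t image_base,
                      const SectionInfo *secs, size_t count, uint32_t *out)
{
    assert(secs != NULL && out != NULL);
    if (va < image_base)
        return -1;
    uint64_t rva = va - image_base;
    for (size_t i = 0; i < count; i++) {
        uint64_t start = secs[i].virtual_address;
        if (rva >= start && rva - start < secs[i].raw_size) {
            *out = (uint32_t)(rva - start + secs[i].raw_offset);
            return 0;
        }
    }
    return -1;
}

/* Overwrite one byte, but only if it currently holds `expected`.
 * `fh` must be open for binary update, e.g. fopen(path, "r+b"). */
int patch_byte(FILE *fh, long offset, unsigned char expected,
               unsigned char replacement)
{
    if (fseek(fh, offset, SEEK_SET) != 0)
        return -1;
    int old = fgetc(fh);
    if (old == EOF || (unsigned char)old != expected)
        return -1;
    /* A seek is required when switching from reading to writing. */
    if (fseek(fh, offset, SEEK_SET) != 0)
        return -1;
    if (fputc(replacement, fh) == EOF)
        return -1;
    return fflush(fh) == 0 ? 0 : -1;
}
```

With the example numbers from this answer (base 180000000, .text at RVA 1000 and file offset 400), va_to_file_offset maps 18005160e to 50a0e. For the CMP AL,imm8 instruction the constant is the second byte, so the byte to patch would be one past the instruction's offset, and for the 50 to 75 change you would call patch_byte with expected 0x32 and replacement 0x4b.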

DOS stub in a PE file [duplicate]

Lately, I analyzed some Windows executable files using a hex editor. The PE header starts at address 0x100, so there are 256 Bytes of data before the PE image actually starts. The first 256 Bytes:
I know the following about the file structure
0x00 - 0x3F: This is the MZ header (64 bytes long).
0x40 - 0x4D: These 14 bytes encode seven x86 (16 bit mode) instructions, which are used to print "This program cannot run in DOS mode.\r\r\n" to the screen, using a DOS system call (interrupt 0x21).
0x4E - 0x78: This is the string "This program cannot run in DOS mode.\r\r\n" with a dollar-sign at the end, which tells DOS that this is the end of the string.
0x79 - 0x7F: These are NULL bytes; I guess that they are inserted for alignment.
So I know what the first 128 bytes are for. My question is: What are the next 128 bytes (0x80 - 0xFF) used for? (The PE image starts after them at 0x100.)
It's the so-called undocumented "Rich header". It's a weakly encrypted block of data inserted by the Microsoft linker that indicates what Microsoft tools were used to make the executable. It includes version information from the object files linked, so includes information on what compilers, assemblers and other tools were used.
To decode the Rich header search for the Rich marker and then obtain the 32-bit encryption key that follows. Then working backwards from the Rich marker, XOR the key with the 32-bit values stored there until you find a decoded DanS marker. In between these two markers will be a list of pairs of 32-bit values. The first value of the pair identifies the Microsoft tool used, and the second value indicates how many linked object files were created using this tool. The upper 16-bit part of the tool id value indicates what kind of tool it was, and the lower 16-bit part identifies the build version of the tool.
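A minimal sketch of that decoding procedure in C, for illustration only. RichEntry and decode_rich are hypothetical names. The sketch also assumes that the DanS marker is followed by three padding values before the pairs begin, which is how the format is usually described but is not stated above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    uint16_t tool_id;  /* upper 16 bits: kind of tool */
    uint16_t build;    /* lower 16 bits: build version of the tool */
    uint32_t count;    /* number of linked objects made with this tool */
} RichEntry;

static uint32_t read_u32le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns the number of entries written to `out`, or -1 if no
 * Rich/DanS pair is found in the first `len` bytes of the file. */
int decode_rich(const uint8_t *buf, size_t len, RichEntry *out, int max_out)
{
    const uint32_t DANS = 0x536e6144u;  /* "DanS" read as little-endian */
    size_t rich = 0, start;
    int found = 0, n = 0;

    assert(buf != NULL && out != NULL);

    /* 1. Find the "Rich" marker; the XOR key follows it. */
    for (size_t i = 0; i + 8 <= len; i++) {
        if (memcmp(buf + i, "Rich", 4) == 0) {
            rich = i;
            found = 1;
            break;
        }
    }
    if (!found)
        return -1;
    uint32_t key = read_u32le(buf + rich + 4);

    /* 2. Walk backwards, XORing with the key, until "DanS" appears. */
    start = rich;
    found = 0;
    while (start >= 4) {
        start -= 4;
        if ((read_u32le(buf + start) ^ key) == DANS) {
            found = 1;
            break;
        }
    }
    if (!found)
        return -1;

    /* 3. Skip DanS plus three padding values (assumption), then
     *    decode the (tool id, count) pairs up to the Rich marker. */
    for (size_t p = start + 16; p + 8 <= rich && n < max_out; p += 8) {
        uint32_t id = read_u32le(buf + p) ^ key;
        out[n].tool_id = (uint16_t)(id >> 16);
        out[n].build = (uint16_t)(id & 0xffffu);
        out[n].count = read_u32le(buf + p + 4) ^ key;
        n++;
    }
    return n;
}
```

Passing the first few hundred bytes of the executable is enough, since the block sits between the DOS stub and the PE header.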

How would one restore missing PE headers?

I have a binary file which once was a valid PE executable, but all the headers were erased (DOS-header, PE-header and sections table). I managed to guess that one section is .text since if converted to asm in IDA it shows some valid asm code. .rdata was easy to find as well since it contains some strings which correspond to program's logic. But no further progress. I guess I'm not the first one to stumble upon this problem and there are tools/methods to generate PE headers. Any suggestions?
I think you will run into some problems that you can't fully fix:
the entry point (where the binary begins executing)
the relocations (but you can fix the base address to skip them)
the base address (in general it is always the same; you just need to know whether the binary is x86 or x64)
the libraries it uses and the external functions it imports
possibly the resources; for instance, py2exe creates a resource for the Python bytecode
and lastly, certainly some other things, e.g. if the binary uses TLS or FLS

x86 segmentation, DOS, MZ file format, and disassembling

I'm disassembling "Test Drive III". It's a 1990 DOS game. The *.EXE has MZ format.
I've never dealt with segmentation or DOS, so I would be grateful if you answered some of my questions.
1) The game's system requirements mention 286 CPU, which has protected mode. As far as I know, DOS was 90% real mode software, yet some applications could enter protected mode. Can I be sure that the app uses the CPU in real mode only? IOW, is it guaranteed that the segment registers contain actual offset of the segment instead of an index to segment descriptor?
2) Said system requirements mention 1 MB of RAM. How is this amount of RAM even meant to be accessed if the uppermost 384 KB of the address space are reserved for stuff like MMIO and ROM? I've heard about UMBs (using holes in UMA to access RAM) and about HMA, but it still doesn't allow to access the whole 1 MB of physical RAM. So, was precious RAM just wasted because its physical address happened to be reserved for UMA? Or maybe the game uses some crutches like LIM EMS or XMS?
3) Is CS incremented automatically when the code crosses segment boundaries? Say, the IP reaches 0xFFFF, and what then? Does CS switch to the next segment before next instruction is executed? Same goes for SS. What happens when SP goes all the way down to 0x0000?
4) The MZ header of the executable looks like this:
signature 23117 "0x5a4d"
bytes_in_last_block 117
blocks_in_file 270
num_relocs 0
header_paragraphs 32
min_extra_paragraphs 3349
max_extra_paragraphs 65535
ss 11422
sp 128
checksum 0
ip 16
cs 8385
reloc_table_offset 30
overlay_number 0
Why does it have no relocation information? How is it even meant to run without address fixups? Or is it built as completely position-independent code consisting of program-counter-relative instructions? The game comes with a cheat utility which is also an MZ executable. Despite being much smaller (8448 bytes - so small that it fits into a single segment), it still has relocation information:
offset 1
segment 0
offset 222
segment 0
offset 272
segment 0
This allows IDA to properly disassemble the cheat's code. But the game EXE has nothing, even though it clearly has lots of far pointers.
5) Is there even such thing as 'sections' in DOS? I mean, data section, code (text) section etc? The MZ header points to the stack section, but it has no information about data section. Is data and code completely mixed in DOS programs?
6) Why even have a stack section in the EXE file at all? It has nothing but zeroes. Why waste disk space instead of just saying, "start the stack from here", as is done with a BSS section?
7) MZ header contains information about initial values of SS and CS. What about DS? What's its initial value?
8) What does an MZ executable have after the exe data? The cheat utility has whole 3507 bytes in the end of the executable file which look like
__exitclean.__exit.__restorezero._abort.DGROUP#.__MMODEL._main._access.
_atexit._close._exit._fclose._fflush._flushall._fopen._freopen._fdopen
._fseek._ftell._printf.__fputc._fputc._fputchar.__FPUTN.__setupio._setvbuf
._tell.__MKNAME._tmpnam._write.__xfclose.__xfflush.___brk.___sbrk._brk._sbrk
.__chmod.__close._ioctl.__IOERROR._isatty._lseek.__LONGTOA._itoa._ultoa.
_ltoa._memcpy._open.__open._strcat._unlink.__VPRINTER.__write._free._malloc
._realloc.__REALCVT.DATASEG#.__Int0Vector.__Int4Vector.__Int5Vector.
__Int6Vector.__C0argc.__C0argv.__C0environ.__envLng.__envseg.__envSize
Is this some kind of debugging symbol information?
Thank you in advance for your help.
Re. 1. No, you can't be sure until you prove otherwise to yourself. One giveaway would be the presence of MOV CR0, ... in the code.
Re. 2. While marketing materials aren't to be confused with an engineering specification, there's a technical reason for this. A 286 CPU could address more than 1M of physical address space. The RAM was only "wasted" in real mode, and only if an EMM (or EMS) driver wasn't used. On 286 systems, the RAM past 640kb was usually "pushed up" to start at the 1088kb mark. The ISA and on-board peripherals' memory address space was mapped 1:1 into the 640-1024kb window. To use the RAM from the real mode needed an EMM or EMS driver. From protected mode, it was simply "there" as soon as you set up the segment descriptor correctly.
If the game actually needed the extra 384kb of RAM over the 640kb available in the real mode, it's a strong indication that it either switched to protected mode or required the services of an EMM or EMS driver.
Re. 3. I wish I remembered that. On reflection, I wish not :) Someone else please edit or answer separately. Hah, I did know it at some point in time :)
Re. 4. You say "[the code] has lots of instructions like call far ptr 18DCh:78Ch". This implies one of three things:
Protected mode is used and the segment part of the address is a selector into the segment descriptor table.
There is code there that relocates those instructions without DOS having to do it.
There is code there that forcibly relocates the game to a constant position in the address space. If the game doesn't use DOS to access on-disk files, it can remove DOS completely and take over, gaining lots of memory in the process. I don't recall whether you could exit from the game back to the command prompt. Some games were "play until you reboot".
Re. 5. The .EXE header does not "point" to any stack, there is no stack section you imply, the concept of sections doesn't exist as far as the .EXE file is concerned. The SS register value is obtained by adding the segment the executable was loaded at with the SS value from the header.
It's true that the linker can arrange sections contiguously in the .EXE file, but such sections' properties are not included in the .EXE header. They often can be reverse-engineered by inspecting the executable.
Re. 6. The SS and SP values in the .EXE header are not file pointers. The EXE file might have a part that maps to the stack, but that's entirely optional.
Re. 7. This is already asked and answered here.
Re. 8. This looks like a debug symbol list. The cheat utility was linked with the debugging information left in. You can have completely arbitrary data there - often it'd be various resources (graphics, music, etc.).

Which of the MS-DOS header fields are mandatory/optional?

The above is the complete list of MS-DOS header fields, but I don't know which of them are mandatory and which are optional, does anyone know?
If you're trying to create a PE image, e_magic (magic number) and e_lfanew (file address of the new exe header) are the only mandatory fields that you have to fill in. e_lfanew should point to the IMAGE_NT_HEADERS structure.
Back in 2006, someone set out to create the world's smallest PE file. For this he wrote a small PE fuzzer and used the smallest possible code base:
return 42;
He managed to get the following PE sizes:
If you are too busy to read the entire page, here is a summary of the results:
Smallest possible PE file: 97 bytes
Smallest possible PE file on Windows 2000: 133 bytes
Smallest PE file that downloads a file over WebDAV and executes it: 133 bytes
You can check his work here:
http://www.phreedom.org/research/tinype/
He also states the required header values. These are:
e_magic
e_lfanew
Machine
NumberOfSections
SizeOfOptionalHeader
Characteristics
OptionalHeader:
Magic
AddressOfEntryPoint
ImageBase
SectionAlignment
FileAlignment
MajorSubsystemVersion
SizeOfImage
SizeOfHeaders
Subsystem
SizeOfStackCommit
SizeOfHeapReserve
For MS-DOS, all of the headers are mandatory.
For Win9x and above, e_lfanew must be the offset from the start of the image to the start of the IMAGE_NT_HEADERS, and e_magic must be IMAGE_DOS_SIGNATURE ('MZ').
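As a rough illustration of what a reader of the file relies on, here is a minimal C sketch that uses only those two DOS header fields to locate the NT headers. find_nt_headers is a hypothetical name. The offsets are the standard ones (e_magic at 0, e_lfanew at 0x3c), and the sketch assumes a conventional file in which the whole 64-byte DOS header is present.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns the file offset of IMAGE_NT_HEADERS, or -1 if the buffer
 * does not look like a PE image. */
int64_t find_nt_headers(const uint8_t *buf, size_t len)
{
    assert(buf != NULL);

    if (len < 0x40)                       /* DOS header is 64 bytes */
        return -1;
    if (buf[0] != 'M' || buf[1] != 'Z')   /* e_magic, IMAGE_DOS_SIGNATURE */
        return -1;

    /* e_lfanew: 32-bit little-endian value at offset 0x3c. */
    uint32_t e_lfanew = (uint32_t)buf[0x3c] |
                        ((uint32_t)buf[0x3d] << 8) |
                        ((uint32_t)buf[0x3e] << 16) |
                        ((uint32_t)buf[0x3f] << 24);

    if (e_lfanew > len || len - e_lfanew < 4)
        return -1;
    if (memcmp(buf + e_lfanew, "PE\0\0", 4) != 0)  /* NT signature */
        return -1;
    return (int64_t)e_lfanew;
}
```

Every other DOS header field is ignored here, which matches the statement that only e_magic and e_lfanew matter for a PE image.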
