PE file section RVA calculation

PE file section RVA calculation - windows

I'm currently looking through a PE file's section table, both from the raw data on the disk, and through a couple of PE analysers. I'm a little confused over how some addresses are being interpreted.
For example. From the raw PE image on disk, I see this:
.text virtualSize: 0x1A0F71 virtualAddress: 0x1000 rawSize: 0x1A1000
However, when using some PE analysers (LordPE, pedump.me), I see this:
.text virtualSize: 0x114d41 virtualAddress: 0x1000 rawSize: 0x114e00
I'm not sure how these values are being interpreted. It is something to do with alignment, and the image's base address?
Any input would be appreciated.
Thanks

Maybe this would help you to solve the problem:

This requires psychic debugging, the size of a section isn't affected by RVA. The crystal ball says that you are actually looking at two different files. And your PE dumper utilities are 32-bit programs that you run on a 64-bit operating system.
You have to understand the File System Redirector. A 32-bit process will be redirected from c:\windows\system32 to c:\windows\syswow64 and from c:\program files to c:\program files (x86). So your PE dumper utilities could well be opening the 32-bit version of an executable instead. And yes, the .text section will be substantially smaller.
Copy the file to a directory that's not affected by redirection, like your Documents folder.

Related

DOS inserting segment addresses at runtime

I noticed a potential bug in some code i'm writing.
I though that if I used mov ax, seg segment_name, the program might be non-portable and only work on one machine in a specific configuration since the load location can vary from machine to machine.
So I decided to disassemble a program containing just that one instruction on two different machines running DOS and I found that the problem was magically solved.
Output of debug on machine one: 0C7A:014C B8BB0C MOV AX,0CBB
Output of debug on machine two: 06CA:014C B80B07 MOV AX,070B
After hex dumping the program I found that the unaltered bytes are actually B84200.
Manually inserting those bytes back into the program results in mov ax, 0042
So does the PE format store references to those instructions and update them at runtime?

As Peter Cordes noted, MS-DOS doesn't use the PECOFF executable format that Windows uses. It has it's own "MZ" executable format, named after the first two bytes of the executable that identify as being in this format.
The MZ format supports the use of multiple segments through a relocation table containing relocations. These relocations are just simple segment:offset values that indicate the location of 16-bit segment values that need to be adjusted based on where the executable was loaded in memory. MS-DOS performs these adjustments by simply adding the actual load segment of the program to the value contained in the executable. This means that without relocations applied the executable would only work if loaded at segment 0, which happens to be impossible.
Note this isn't just necessary for a program to work on multiple machines, it's also necessary for the same program to work reliably on the same machine. The load address can change based on what various configuration details, was well as other programs and drivers that have already been loaded in memory, so the load address of an MS-DOS executable is essentially unpredictable.
Working backwards from your example, we can tell where your example program was loaded into memory on both machines. Since 0042h was relocated into 0CBBh on the first machine and into 070Bh on the second machine, we know MS-DOS loaded your program on the two machines at segments 0C79h and 06C9h respectively:
0CBB - 0042 = 0C79
070B - 0042 = 06C9
From that we can determine that your example executable has the entry 0001:014D, or equivalent segment:offset value, in it's relocation table:
0C7A:014D - 0C79:0000 = 0001:014D
06CA:014D - 06C9:0000 = 0001:014D
This entry indicates the unrelocated location of the 16-bit immediate operand of the mov ax, seg segname instruction that needs adjusting.

How can a windows executable be of only 128 bytes

Look into this post which describes a technique to put an executable code in the first 128 bytes of a DICOM file i.e. in the preamble section. This way the DICOM can be viewed as both a DICOM and an PE executable file.
This git repo demonstrates the same. However they don't show the code, instead only has the binaries.
Now my question. How can an executable be kept only in 128 bytes because I understand a minimal exe will take at least a few KBs from this, this and this SO posts?

From looking at image 1 it appears pretty simple, the valid DOS header is placed in the free area while the full PE image is embedded later in the file, the author put it between two legitimate DICOM meta entries for example. The DOS header is really short and has a field named e_lfanew which holds the file offset to IMAGE_NT_HEADERS. In other words you don't actually need 128 bytes for the full image, you can embed it anywhere in the file as long as it doesn't interfere with DICOM, all that's needed at the start is the dos header.

Before answering how to put an executable in 128 bytes, we need to understand a few things first.
A dicom file must have the characters DICM (File extension) on the bytes 121-124 (Prefix section) to be recognized as a dicom file
A windows executable file must have the DOS Header in the first 64 bytes of the file to be able to be executable as per the PE(Portable Executable) File Format.
Combining the above 2 points a new file format is created called PEDICOM which is both a dicom as well as an executable. The PEDICOM has the architecture as shown in the image above.
The PEDICOM contains both the header and content of the executable file in different sections because an entire executable can’t be fit inside 128 bytes.
Windows provides a list of structures and Win32 APIs to read/write these PE files section by section in winnt.h header.
Creating a PEDICOM file:
DOS header (IMAGE_DOS_HEADER) has 1 field named e_lfanew which contains the offset of the actual PE content. This allows to keep an entire executable code in at least 2 memory locations.
The PE Header (IMAGE_NT_HEADER) has the number of sections and the pointes to the sections (Code, Data, Stack etc.)
Now to answer the original question, an entire executable can't be kept in 128 bytes. However 128 bytes of data are sufficient to declare a file as executable i.e. the dos header and the dos stub can be kept in the 128 bytes while the rest of the executable can be kept somewhere else, in this case in a private dicom tag and a field in the header can point to this. Make the containing file a valid and legitimate executable.

How would one restore missing PE headers?

I have a binary file which once was a valid PE executable, but all the headers were erased (DOS-header, PE-header and sections table). I managed to guess that one section is .text since if converted to asm in IDA it shows some valid asm code. .rdata was easy to find as well since it contains some strings which correspond to program's logic. But no further progress. I guess I'm not the first one to stumble upon this problem and there are tools/methods to generate PE headers. Any suggestions?

I think you will have some problem that you couldn't fix
the entry point ( where the binary begin)
the relocation (but you can fix the base adress to skip it)
the base adress (but in general it is always the same just need to know if it x86 or x64)
the library used it and the extern functions
perhaps the resourse for instance py2exe create a resource for the python bytecode
and last things bu certainly some other if you have a tls fls in the binary

8051 with Keil-C filesize issue using a Megawin processor

I've written a program using Keil C for a MegaWin 8051 MPC82G516A. When I check the file size of the Intel generated hex file it has a size of 8kb (I see the code in the binary code window), but when I go to program the device using Megawin's tool it increases the code size to around 29kb!? Can anyone provide the reason for why it might be doing this?
Also, something else that is strange is the fact that it seems to be writing the code at the top of the processor memory and not at the start. There are like 4 bytes at the start of the code, but the complete rest of it is in the end of the memory.
Please help
Cameron.

you write that the file size of the intel hex file with your code is about 8k. Part of your program is written to the bottom of the address space.
Another part is written to the bottom of the address space.
The intel hex file not only contains the program code but also the address where that code should be written to.
You can check yourself if your file contains code for the bottom and the top of the address space.
Some information about intel hex format: http://www.keil.com/support/man/docs/oh166/oh166_ih_record.htm
If this is the case you can check the .m51 file which is generated by the linker during the build process.
This file contains information about the modules included in your programm and the addresses they are linked to.
Perhaps there is some linker setting in your project which tells the linker behave as you tell.

I was miss-reading the file size on the editor. Also, Keil's free version starts at position 2000Kb. It's part of it's limitations of the evaluation version.

Dumpbin file offsets for system binaries

So for system32 binaries, dumpbin will report:
...
6178 entry point (0000000140006178)
...
SECTION HEADER #1
.text name
63E6 virtual size
1000 virtual address (0000000140001000 to 00000001400073E5)
6400 size of raw data
400 file pointer to raw data (00000400 to 000067FF)
But the entry point at RVA 6178 maps to file offset 6178-1000+400 = 5578h (21,880), but the file opened in a hex editor only goes up to 4A00h (18,944).
Also, the file size is reported as 38,400 bytes by the shell.
So it seems that the .text section is encrypted or some other magic for system binaries. Anyone know what's going on?

As far as I can see, the Entry point is not pointing to the .text section. This is not unusual for encrypted binaries. Using CFF Explorer or other tools like PeStudio, PE Explorer, etc, you might take a look at the mapping of the Entry Point and the pointed Section.

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio