"Relative Virtual Addresses", relative to what? - windows

I have just been reading about the offsets of instructions which they are in a file on the disk, the RVA and the VA once they are loaded into the memory. I also read that if a PE file were loaded into the memory exactly as it were in the disk, the RVA would be same as the file offsets(and that it would be very unusual for that to happen).
My doubt is - under normal circumstances, what are these RVA's relative to? The start of that particular PE data structure?
Edit: by PE data structure I mean - PE header, DOS header, DOS stub, PE file header, Image optional header, Section table and Data directories.

RVA is the address relative to the image base address, after having been loaded into memory.
The MS PE/COFF specification says:
Relative virtual address. In an image file, the address of an item after it is loaded into memory, with the base address of the image file subtracted from it. The RVA of an item almost always differs from its position within the file on disk (file pointer).

Related

How can i calculate the file offset of the memory virtual address of the export table?

so, i was trying to read a DLL file, everything was fine till i reach the Optional Header Data Directories, specifically its first member, the Export Table.
My problem is that i can't move the offset of my reader because the virtual address member is based on memory VA, and my reader is based on file offset. May a visual example helps:
As you can see, the loaded virtual address that this PE viewer reads at the Export Table Address from the Data Directory(Optional Header) is the value 0x00002630(lets refer to it as hex1 from now on).
However, when i click on the Export Table to see the actual content, the program does the conversion of this address from memory to file offset, redirecting me to this address as the result:
The address that it redirects me is the 0x00001a30(lets refer to it as hex2 from now on).
I did some tests on my own like dividing the hex1 per 8 because i thought it could be the transition from memory alignment which is 4096 and the file alignment which is 512 but it didn't gave me the same result as hex2. I also did some weird stuff to try to get that formula, but it gave me even more bizarre results.
So, my question would be, how could i get/calculate that file offset(hex2) if i only know the memory offset at the Data Directory(hex1)?
Assuming you are using MSVC C/C++, you first need to locate the array of IMAGE_SECTION_HEADER structures following the Optional Header. The SDK has a macro called IMAGE_FIRST_SECTION(pNtHeaders) in which you just pass the pointer of your PE header to make this process easier. It basically just skips past the optional header in memory which is where the section headers start. This macro will also work on either 32-bit or 64-bit Windows PE files.
Once you have the address of the IMAGE_SECTION_HEADER array, you loop through the structures up to FileHeader.NumberOfSections using pointer math. Each of the structures describe the relative starting of a memory address (VirtualAddress) for each of the named PE sections along with the file offset (PointerToRawData) to that section within the file you have loaded.
The size of the section WITHIN the file is SizeOfRawData. At this point, you now have everything you need to translate any given RVA to a file offset. First range check each IMAGE_SECTION_HEADER's VirtualAddress with the RVA you are looking up. I.e.:
if (uRva >= pSect->VirtualAddress && (uRva < (pSect->VirtualAddress + pSect->SizeOfRawData))
{
//found
}
If you find a matching section, you then subtract the VirtualAddress from your lookup RVA, then add the PointerToRawData offset:
uFileOffset = uRva - pSect->VirtualAddress + pSect->PointerToRawData
This results in an offset from the beginning of the file corresponding to that RVA. At this point you have translated the RVA to a file offset.
NOTE: Due to padding, incorrect PE files, etc., you may find not all RVAs will map to a location within the file at which point you might display an error message.

What parts of a PE file are mapped into memory by the MS loader?

What parts of a PE file are mapped into memory by the MS loader?
From the PE documentation, I can deduce the typical format of a PE executable (see below).
I know, by inspection, that all contents of the PE file, up to and including the section headers, gets mapped into memory exactly as stored on disk.
What happens next?
Is the remainder of the file also mapped (here I refer to the Image Pages part in the picture below), so that the whole file is in memory exactly like stored on disk, or is the loader more selective than that?
In the documentation, I've found the following snippet:
Another exception is that attribute certificate and debug information
must be placed at the very end of an image file, with the attribute
certificate table immediately preceding the debug section, because the
loader does not map these into memory. The rule about attribute
certificate and debug information does not apply to object files,
however.
This is really all I can find about loader behavior; it just says that these two parts must be placed last in the file, since they don't go into memory.
But, if the loader loads everything except these two parts, and I set the section RVA's suffiently high, then section data will actually be duplicated in memory (once in the mapped file and once for the position specified by the RVA)?
If possible, link to places where I can read further about loading specific to MS Windows.
Finding this information is like an egg hunt, because MS always insists on using its own terminology when the COFF description uses AT&T terms.
What parts of a PE file are mapped into memory by the MS loader?
It depends.
All sections covered by a section header are mapped into the run-time address space.
However sections that have an RVA of 0 are not mapped and thus never loaded.
Each debug directory entry identifies the location and size of a block of debug information. The RVA specified may be 0 if the debug information is not covered by a section header (i.e., it resides in the image file and is not mapped into the run-time address space). If it is mapped, the RVA is its address.
Memory contains an exact replica of the file on disk.
Note that executables and dll's are mapped into virtual memory, not physical!
As you access the executable parts of it are swapped into RAM as needed.
If a section is not accessed then it obviously does not get swapped into physical RAM, it is however still mapped into virtual memory.
You can read up on everything you might ever want to know about PE files (and more) on MSDN.
Your quote is lifted from the documentation of the COFF file format.
The critical part is:
The rule on attribute certificate and debug information does not apply to object files.
From: https://support.microsoft.com/en-us/kb/121460
Size: Size of the optional header, which is included for executable files but not object files. An object file should have a value of 0 here.
Ergo: executable files or not object files, they are image files.
as such the exception to the rule does not apply to them.

IDA Pro disassembly shows ? instead of hex or plain ascii in .data?

I am using IDA Pro to disassemble a Windows DLL file. At one point I have a line of code saying
mov esi, dword_xxxxxxxx
I need to know what the dword is, but double-clicking it brings me to the .data page and everything is in question marks.
How do I get the plain text that is supposed to be there?
If you see question marks in IDA, this means that there's no physical data at this location on the file (on your disk drive).
Sections in PE files have a physical size (given by the SizeOfRawData field of the section header). This physical size (on disk) might be different from the size of the section once it is mapped onto the process memory by the Windows' loader (this size is given by the VirtualSize field of the section header).
So, if the VirtualSize field is bigger than the SizeOfRawData field, a part of the section has no physical existence and it exists only in memory (once the file is mapped onto the process address space).
On most case, at program entry point, you can assume this memory is filled with 0 (but some parts of the memory might be written by the windows loader).
To get the locations where the data is being written, read or loaded you can use cross-references (xref). Here's an example :
Click on the name of the data from which you want the xref :
Then press 'x', you'll be shown all known (to ida) location where the data is used :
The second column indicates how the data is used:
r means it is read
w means it is written
o means it is loaded as a pointer

Calculating Memory offset of any Section of a Win PE File from Disk offset

Suppose I have a PE file(E.G. Notepad.exe). Suppose when the file is saved in hard-disk, the .text section of notepad.exe is at 0xabcdefgh offset.
So, how can I calculate/predict the offset of .text section when the same executable (notepad.exe) will be loaded into memory at the time of its execution, assuming ASLR is not enabled?
Thanks in Advance.
PE files are not position independent. Instead, they have a preferred load address, and if the OS is unable (because the address space is already used, or because ASLR is in effect) to load it in this address, it has to relocate it. See here:
http://en.wikipedia.org/wiki/Portable_Executable#Relocations
So, if ASLR (Address Space Layout Randomization) is not enabled, it should load at the offset specified by the preferred load address specified in the header. This may not be the case for DLLs, but for executables it should be.
You can get more info on the file format here:
http://www.wotsit.org/list.asp?fc=5

Dumpbin file offsets for system binaries

So for system32 binaries, dumpbin will report:
...
6178 entry point (0000000140006178)
...
SECTION HEADER #1
.text name
63E6 virtual size
1000 virtual address (0000000140001000 to 00000001400073E5)
6400 size of raw data
400 file pointer to raw data (00000400 to 000067FF)
But the entry point at RVA 6178 maps to file offset 6178-1000+400 = 5578h (21,880), but the file opened in a hex editor only goes up to 4A00h (18,944).
Also, the file size is reported as 38,400 bytes by the shell.
So it seems that the .text section is encrypted or some other magic for system binaries. Anyone know what's going on?
As far as I can see, the Entry point is not pointing to the .text section. This is not unusual for encrypted binaries. Using CFF Explorer or other tools like PeStudio, PE Explorer, etc, you might take a look at the mapping of the Entry Point and the pointed Section.

Resources