How is SizeOfImage in the PE optional header computed?
Trying to learn the PE format, I've come across the SizeOfImage field in the optional header.
To quote the documentation:
The size (in bytes) of the image, including all headers, as the image
is loaded in memory. It must be a multiple of SectionAlignment.
However, I've experienced that if I set this field wrongly, then the executable won't run and an error 193 (badly formatted excutable) is displayed:
How do I compute the SizeOfImage field, and why won't an executable run if its set wrong (e.g. the executable runs if it's set to 0x00003000 but not 0x00004000 or 0x00002000)?
The safest way that I know of is to loop through each section and find the section to be loaded last in memory (i.e. the highest address). You can almost always assume this is the last section and just skip directly to that section if you trust your PE file (such as if you are using a standard linker, etc). You begin with calculating the end-of-data pointer of that section as follows:
pEndOfLastSection = pLastSection->VirtualAddress + pLastSection->Misc.VirtualSize + pOptionalHeader->ImageBase
pEndOfLastSection now represents the end of the actual section's data (as it exists in the file, padded to file alignment, but not padded to memory alignment) and doesn't include any padding the loader must add to ensure the section fits exactly within the granularity of the memory section alignment.
Despite the other fields that might seem to store the end of the section rounded up to the next nearest "memory" alignment, you must you must perform this calculation yourself on the pEndOfLastSection pointer. I wrote the following function which so far has worked for my purposes:
//
// peRoundUpToAlignment() - rounds dwValue up to nearest dwAlign
//
DWORD peRoundUpToAlignment(DWORD dwAlign, DWORD dwVal)
{
if (dwAlign)
{
//do the rounding with bitwise operations...
//create bit mask of bits to keep
// e.g. if section alignment is 0x1000 1000000000000
// we want the following bitmask 11111111111111111111000000000000
DWORD dwMask = ~(dwAlign-1);
//round up by adding full alignment (dwAlign-1 since if already aligned we don't want anything to change),
// then mask off any lower bits
dwVal = (dwVal + dwAlign-1) & dwMask;
}
return(dwVal);
} //peRoundUpToAlignment()
Now take your pEndOfLastSecion and pass it to the rounding function as follows:
//NOTE: we are rounding to memory section alignment, not file
pEndOfLastSectionMem = peRoundUpToAlignment(pOptionalHeader->SectionAlignment,pEndOfLastSection)
Now you have a "simulated" pointer to the end of the PE file as it would be loaded in memory. NOTE: the end pointers calculated above actually point 1 byte past the last byte of the last section; this allows you to subtract them from their base to get the size. Once you have the end pointer of the last section as it would be loaded in memory, you can subtract this from the loader base and get the size of the PE file as it should be loaded in memory, and this size is obviously the same regardless of where the loader might relocate the PE file:
uCalcSizeOfFile = pEndOfLastSectionMem - pOptionalHeader->ImageBase
Unless the image has been tampered with, the calculation of uCalcSizeOfFile above should be equal to the pOptionalHeader->SizeOfImage field.
Related
I am trying to mimic the behavior of CString::LoadString(HINSTANCE hInst, DWORD id, WORD langID) without introducing a dependency on MFC into my app. So I walked through the source. The first thing it does is to immediately call AtlGetStringResourceImage(hInst, id, langID), and then this in turn contains the following line of code:
hResource = ::FindResourceExW(hInst, (LPWSTR)RT_STRING, MAKEINTRESOURCEW((id>>4)+1), langID);
(It's not verbatim like this, but I trimmed out some unimportant stuff).
What is the meaning of shifting the ID by 4 and adding 1? According to the documentation of FindResourceEx, you should pass in MAKEINTRESOURCE(id), and I can't find any example code that is manipulating the id before passing it to MAKEINTRESOURCE. At the same time, if I make my code call MAKEINTRESOURCE(id) then it doesn't work and FindResourceEx returns null, whereas if I use the above shift + add, then it does work.
Can anyone explain this?
From the STRINGTABLE resource documentation:
RC allocates 16 strings per section and uses the identifier value to determine which section is to contain the string. Strings whose identifiers differ only in the bottom 4 bits are placed in the same section.
The code you are curious about locates the section a given string identifier is stored in by ignoring the low 4 bits.
In
http://en.redinskala.com/finding-the-ep/
there is information about how to find the file offset of the entry point in a exe-file.
Here I can read that
EP (File) = AddressOfEntryPoint – BaseOfCode + .text[PointerToRawData]
+ FileAlignment
However, when I have been calculating this myself (I used a couple of different exe files) I have came to the conclusion that
Offset of entry point in EXE file = AddressOfEntryPoint + .text[PointerToRawData] -
.text[VirtualAddress]
Where AddressOfEntryPoint is fetched from IMAGE_OPTIONAL_HEADER and the other two values from the IMAGE_SECTION_HEADER.
Is the information on that web page false? Adding FileAlignment like they do just seems wrong, it does not make sense. Or does it? A file alignment suggests that I should use modulo or something to compute a value. If BaseOfCode and FileAlignment is the same value (mostly they are), it would not disturb adding them to the calculation, but how would it make sense?
Correct, you don't need to use the FileAlignment value at all.
The algorithm should be something like as follow (very similar to yours):
Get AddressOfEntryPoint from IMAGE_OPTIONAL_HEADER.AddressOfEntryPoint (this is a VA)
Search in which section header this VA resides (usually the 1st one, but you should really search in all section headers).
Once you have the right section header, get its VirtualAddress and PointerToRawData fields.
Subtract VirtualAddress from AddressOfEntryPoint: you now have a "delta"
As the exactly same delta applies to offsets, then: add "delta" to PointerToRawData.
You simply don't need FileAlignment because the section in which the entry point lies is already aligned on that value.
I'm using imagehlp.h to parse my binary and get LOADED_IMAGE.
-LoadedImage.FileHeader->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_LOAD_CONFIG].VirtualAddress
The above gets me the RVA for IMAGE_LOAD_CONFIG_DIRECTORY
As I unsderstand, RVA are offsets to my desired data structure.
So, (PIMAGE_LOAD_CONFIG_DIRECTORY)(RVA+LoadedImage.MappedAddress)
should return me my structure correctly (?).
Is this the right way to convert RVA to meaningful pointer.
I am not sure because the timestamp value in
PIMAGE_LOAD_CONFIG_DIRECTORY->TimeDateStamp does not show correctly.
On examination of the memory,
My LoadedImage.MappedAddress points me to "MZ" header. Which is the start of the binary file. Which means my baseaddress is correct. So I conclude that I'm not correctly using the RVA.
Anyone knows the correct way to use the RVA?
Before anyone points out, I've check that the value of virtualaddress for OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_LOAD_CONFIG].VirtualAddress` is not 0 before proceeding further.
My application has a number of modules that each require some variables to be stored in off-chip non-volatile memory. To make the reading and writing of these easier, I'm trying to collect them together into a contiguous region of RAM, to that the NVM driver can address a single block of memory when communicating with the NVM device.
To achieve this, I have created a custom linker script containing the following section definition.
.nvm_fram :
{
/* Include the "nvm_header" input section first. */
*(.nvm_header)
/* Include all other input sections prefixed with "nvm_" from all modules. */
*(.nvm_*)
/* Allocate a 16 bit variable at the end of the section to hold the CRC. */
. = ALIGN(2);
_gld_NvmFramCrc = .;
LONG(0);
} > data
_GLD_NVM_FRAM_SIZE = SIZEOF(.nvm_fram);
The data region is defined in the MEMORY section using the standard definition provided by Microchip for the target device.
data (a!xr) : ORIGIN = 0x1000, LENGTH = 0xD000
One example of a C source file which attempts to place its variables in this section is the NVM driver itself. The driver saves a short header structure at teh beginning of the NVM section so that it can verify the content of the NVM device before loading it into RAM. No linker error reported for this variable.
// Locate the NVM configuration in the non-volatile RAM section.
nvm_header_t _nvmHeader __attribute__((section(".nvm_header")));
Another module that has variables to store in the .nvm_fram section is the communications (CANopen) stack. This saves the Module ID and bitrate in NVM.
// Locate LSS Slave configuration in the non-volatile RAM section.
LSS_slave_config_t _slaveConfig __attribute__((section(".nvm_canopen"))) =
{ .BitRate = DEFAULT_BITRATE, .ModuleId = DEFAULT_MODULEID };
Everything compiles nicely, but when the linker runs, the following error stops the build.
elf-ld.exe: Link Error: attributes for input section '.nvm_canopen' conflict with
output section '.nvm_fram'
It's important that the variables can be initialised with values by the crt startup, as shown by the _slaveConfig declaration above, in case the NVM driver cannot load them from the NVM device (it's blank, or the software version has changed, etc.). Is this what's causing the attributes mismatch?
There are several questions here and on the Microchip forums, which relate to accessing symbols that are defined in the linker script from C. Most of these concern values in the program Flash memory and how to access them from C; I know how to do this. There is a similar question, but this doesn't appear to address the attributes issue, and is a little confusing due to being specific to a linker for a different target processor.
I've read the Microchip linker manual and various GCC linker documents online, but can't find the relevant sections because I don't really understand what the error means and how it relates to my code. What are the 'input and output section attributes', where are they specified in my code, and how do I get them to match eachother?
The problem is due to the _nvmHeader variable not having an initial value assigned to it in the C source, but the _slaveConfig variable does.
This results in the linker deducing that the .nvm_fram output section is uninitialised (nbss) from the .nvm_header input section attributes. So, when it enconters initialised data in the .nvm_canopen input section from the _slaveConfig variable, there is a mismatch in the input section attributes: .nvm_fram is for uninitialised data, but .nvm_canopen contains initialised data.
The solution is to ensure that all variables that are to be placed in the .nvm_fram output section are given initial values in the C source.
// Type used to hold metadata for the content of the NVM.
typedef struct
{
void* NvmBase; // The original RAM address.
uint16_t NvmSize; // The original NVM section size.
} nvm_header_t;
// The linker supplies the gld_NVM_FRAM_SIZE symbol as a 'number'.
// This is represented as the address of an array of unspecified
// length, so that it cannot be implicitly dereferenced, and cast
// to the correct type when used.
extern char GLD_NVM_FRAM_SIZE[];
// The following defines are used to convert linker symbols into values that
// can be used to initialise the _nvmHeader structure below.
#define NVM_FRAM_BASE ((void*)&_nvmHeader)
#define NVM_FRAM_SIZE ((uint16_t)GLD_NVM_FRAM_SIZE)
// Locate the NVM configuration in the non-volatile RAM section.
nvm_header_t _nvmHeader __attribute__((section(".nvm_header"))) =
{
.NvmBase = NVM_FRAM_BASE, .NvmSize = NVM_FRAM_SIZE
};
The answer is therefore that the output section attributes may be determined partly by the memory region in which the section is to be located and also by the attributes of the first input section assigned to it. Initialised and uninitialised C variables have different input section attributes, and therefore cannot be located within the same output section.
I was reading a tutorial on PE and it says
Go to the section table either by adding ImageBase to SizeOfHeaders
but SizeOfHeaders is
The size of all headers+section table
so if we add SizeOfHeaders to ImageBase won't we jump at the sections rather than the table?
SizeOfHeaders is not used to find out the position of the section table, even if they might match in some files (but I don't expect so).
Here's how it's done in the Windows headers (and thus the system loader):
#define IMAGE_FIRST_SECTION( ntheader ) ((PIMAGE_SECTION_HEADER) \
((ULONG_PTR)(ntheader) + \
FIELD_OFFSET( IMAGE_NT_HEADERS, OptionalHeader ) + \
((ntheader))->FileHeader.SizeOfOptionalHeader \
))
Note that the actual value of SizeOfOptionalHeader is not checked; it can be very big or even negative - some malware uses it trick to fool analyzing tools.
See here for more details and even nastier tricks.
SizeOfHeaders indeed is the size of the entire header, including the DOS stub.
To get the address of the section table, first get the address of the optional header, and add FileHeader.SizeOfOptionalHeader.