Memory mapping of binary to VAS - memory-management

When a new process is created with fork(), its address space is created as well, i.e. new page table entries are created for the new process which are exactly the same as the parent process's.
After fork(), exec() is called. What happens during the exec() system call?
I read in the book "Operating System Concepts" that when a new program is executed, the process is given a new empty VAS. Does that mean that the page table entries created during fork() get deleted/modified? What is the meaning of an empty VAS?
How is the memory mapping of the binary to the VAS performed? How does the loader know which addresses of the VAS should be mapped to the corresponding parts of the binary file?
I am really confused here.

When you call exec, the kernel loads the binary and sets up a whole new set of page tables (replacing the old ones).
The loader gets the address to load the binary at from the binary itself (basically it does read() to get the headers and other stuff that's not code, then mmap() to actually load the code and data in the binary).
So it looks at the binary and figures out how it should be loaded, then does mmap(), passing in an address to map at for each part of the binary that needs to be in a different place (i.e. the code and data sections are probably two different calls to mmap(); the .bss section would be mapped from /dev/zero).
Note that depending on the OS and the binary being loaded, some of this may be handled by the kernel directly or by a userspace loader (on UNIXish systems ld.so would be the loader; it handles shared object loading).
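As a rough illustration of "the loader gets the address from the binary itself", here is a minimal sketch that reads the PT_LOAD program headers of a 64-bit little-endian ELF image. Each entry tells a loader which file range to map at which virtual address and with what protection. The names load_segments and demo_image are made up for this example, and the demo image is a synthetic header with hypothetical addresses, not a real executable.

```python
import struct
from typing import NamedTuple

ELF_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")   # ELF64 file header, 64 bytes
PROGRAM_HEADER = struct.Struct("<IIQQQQQQ")       # ELF64 program header, 56 bytes
PT_LOAD = 1                                       # segment type: "map me into memory"


class LoadSegment(NamedTuple):
    file_offset: int
    vaddr: int
    file_size: int
    mem_size: int
    flags: str


def load_segments(image: bytes) -> list[LoadSegment]:
    """Return the PT_LOAD entries a loader would mmap(), in file order."""
    fields = ELF_HEADER.unpack_from(image, 0)
    ident, phoff, phentsize, phnum = fields[0], fields[5], fields[9], fields[10]
    if ident[:4] != b"\x7fELF" or ident[4] != 2 or ident[5] != 1:
        raise ValueError("not a 64-bit little-endian ELF image")
    segments = []
    for i in range(phnum):
        p_type, p_flags, p_offset, p_vaddr, _paddr, p_filesz, p_memsz, _align = \
            PROGRAM_HEADER.unpack_from(image, phoff + i * phentsize)
        if p_type == PT_LOAD:
            # PF_R = 4, PF_W = 2, PF_X = 1
            flags = "".join(c if p_flags & bit else "-"
                            for c, bit in (("r", 4), ("w", 2), ("x", 1)))
            segments.append(LoadSegment(p_offset, p_vaddr, p_filesz, p_memsz, flags))
    return segments


def demo_image() -> bytes:
    """Build a synthetic header with one code and one data segment (hypothetical values)."""
    ident = b"\x7fELF" + bytes([2, 1, 1]) + bytes(9)
    header = ELF_HEADER.pack(ident, 2, 62, 1, 0x401000, 64, 0, 0, 64, 56, 2, 0, 0, 0)
    text = PROGRAM_HEADER.pack(PT_LOAD, 5, 0, 0x400000, 0x400000, 0x2000, 0x2000, 0x1000)
    data = PROGRAM_HEADER.pack(PT_LOAD, 6, 0x2000, 0x602000, 0x602000, 0x100, 0x900, 0x1000)
    return header + text + data


segments = load_segments(demo_image())
for seg in segments:
    print(hex(seg.vaddr), seg.flags, "file bytes:", seg.file_size, "memory bytes:", seg.mem_size)
```

In the data segment the memory size is larger than the file size; the extra zero-filled tail is where .bss would live.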

Related

Can someone explain the Windows ZwMapViewOfSection system call so that a noob (me) can understand?

I'm investigating a set of Windows API system calls made by a piece of malware running in a sandbox so that I can understand its malicious intent. Unfortunately, I'm struggling to understand the ZwMapViewOfSection function described in documentation: https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/content/wdm/nf-wdm-zwmapviewofsection
Now, I do understand that this function is related to the mapping of physical memory to virtual memory in a page table. Apart from that, I find the documentation arcane and not friendly to beginners. I am also confused why they are calling blocks of physical memory "sections" rather than "frames" (if that is what they are indeed referring to; it's not clear to me). Can anyone provide a more intuitive explanation of this system call and what it does in general? Is this a common system call for programs or is it limited to malware? Thank you.
It is extremely common for normal programs to make this call (not directly of course), every program is going to invoke it multiple times during initialization at the very least (ZwMapViewOfSection is used when performing the memory backing used to implement the executable sections of code itself). Not so common during normal program code but not uncommon either. Particularly common if the program performs dynamic DLL loads but legitimate programs can also do memory-mapped IO for their own reasons.
It operates on memory section objects (I've never really understood that name either) which are one part of the link between disc files and memory-mapped regions, the section is created via ZwCreateSection or opened with ZwOpenSection and then the other part comes into play with ZwMapViewOfSection.
What part of this, exactly, is confusing you? Knowing that would make it far easier to provide an informative response.
As far as I understand it, you have to open the file and acquire a file handle, which you then map with CreateFileMapping, which calls NtCreateSection, which calls MmCreateSection. If the file is mapped for the first time, a new segment object and control area are created first; then, depending on whether the section is created for a data file, an image file or is page-file backed, MiCreateDataFileMap, MiCreateImageFileMap or MiCreatePagingFileMap is called.
MiCreateDataFileMap sets up the subsection object and section object. In the normal case only one subsection is created, but under some special conditions multiple subsections are used, e.g. if the file is very large. For data files, the subsection object field SubsectionBase is left blank. Instead, the SegmentPteTemplate field of the segment object is set up properly, which can be used to create the PPTEs when necessary. This defers the creation of PPTEs until a view is mapped for the first time, which avoids wasting memory when very large data files are mapped. Note that a PPTE is a PTE that is serving as a prototype PTE, whereas an _MMPTE_PROTOTYPE is a PTE that is pointing to a prototype.
MiCreateImageFileMap creates the section object, loads the PE header of the specified file and verifies it; then one subsection is created for the PE header and one for each PE section. If a very small image file is mapped, then only one subsection is created for the complete file. Besides the subsections, the related PPTEs for each of them are also created, and their page protection flags are set according to the protection settings of the related PE section. These PPTEs will be used as a template for building the real PTEs when a view is mapped and accessed.
After a section is created, it can be mapped into the address space by creating a view from it. The flProtect passed to CreateFileMapping specifies the protection of the section object. All mapped views of the object must be compatible with this protection. You specify dwMaximumSizeLow and dwMaximumSizeHigh to be 0 in order for the maximum size of the mapping to be set to the current length of the file automatically.
You then pass the returned section object handle to MapViewOfFile, which calls NtMapViewOfSection on it, which calls MmMapViewOfSegment, which calls MmCreateMemoryArea, which is where the view is mapped into the VAD of the process with the protection dwDesiredAccess supplied to MapViewOfFile, which serves as the protection type for all PTEs that the VAD entry covers. Passing dwNumberOfBytesToMap = 0 with a file offset of 0 to MapViewOfFile maps the whole file.
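To see the "size 0 means the whole file" convention from user code without calling the Win32 API directly, here is a small sketch using Python's mmap module, which on Windows is implemented on top of CreateFileMapping and MapViewOfFile (and on mmap() elsewhere). The helper name map_whole_file and the temporary file are assumptions made for the example.

```python
import mmap
import os
import tempfile


def map_whole_file(path: str) -> bytes:
    """Map an entire file read-only and return a copy of the mapped bytes."""
    with open(path, "rb") as f:
        # A length of 0 asks for a mapping that covers the file's current size.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as view:
            return bytes(view[:])


# Create a small throwaway file, map it, then clean up.
fd, demo_path = tempfile.mkstemp()
os.write(fd, b"section-backed bytes")
os.close(fd)
contents = map_whole_file(demo_path)
os.remove(demo_path)
print(len(contents), contents)
```

Nothing is read from disk when the mapping is created; pages are faulted in when the view is first touched (here, by the slice that copies the bytes out).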
When a view is mapped, I believe that all of the PTEs are made to point to the prototype PTEs and are given the protection of the PPTE. For an image file, the PPTEs have already been initialised to subsection PTEs. For a data file, the PPTEs for the view need to be initialised to subsection PTEs. The VAD entry for the view is now created. The VAD entry protection isn't always reflective of the protection of the PTEs it covers, because it can cover multiple subsections and multiple blocks within those subsections.
The first time an address in the mapping is actually accessed, the subsection prototype PTE is filled in on demand with the allocated physical page filled with the I/O write for that range and the process PTE is filled in with that same address. For an image, the PPTE was already filled in when the subsections were created along with protection information derived from the section header characteristics in the image, and it just fills in the PTE with that address and the protection information in it.
When the PTE is trimmed from the process working set, the working set manager accesses the PFN to locate the PPTE address, decreases the share count, and it inserts the PPTE address into the PTE.
I'm not sure when a VAD PTE (which has a prototype bit and a prototype address of 0xFFFFFFFF0000 and is not valid) occurs. I would have thought the PPTEs are always there at their virtual address and can be pointed to as soon as the VAD entry is created.

Loading data segment of already loaded shared library

For the global offset table to work, the GOT must be at a fixed offset from the text segment. Now assume that a program needs a shared library. Assume also that the shared library is already loaded by the OS for some other process. Now for our program, since the text section of the shared library is already loaded, it just needs to load the data segment. The shared library's text section is mapped into the virtual address space of our process. But what if there is already some data/text or whatever at the fixed offset from the virtual address of our shared library? How does the dynamic linker resolve that conflict? One approach would be to leave R_386_GOTPC in the text section until load time and let the dynamic linker change it to the new offset. Is this how it is done in practice?
On GNU, even the same DSO is mapped at different addresses in different processes. No data at all is shared between them. This means that the GOT is just private data (like .data), and is initialized at load time with the proper addresses (either stubs or the proper function addresses with BIND_NOW).
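A toy model of that idea, not how any real dynamic linker is written: the library's symbol offsets are the same for every process, but each process gets its own GOT filled in from its own load address. The function name build_got, the symbol offsets and the base addresses are all hypothetical.

```python
def build_got(load_base: int, symbol_offsets: dict[str, int]) -> dict[str, int]:
    """Fill a per-process GOT: absolute address = load base + offset inside the DSO."""
    return {name: load_base + offset for name, offset in symbol_offsets.items()}


# Offsets are a property of the library file and are identical in every process.
LIB_SYMBOLS = {"puts": 0x1A30, "malloc": 0x2C80}   # hypothetical offsets

# The same DSO mapped at two different (hypothetical) bases in two processes.
got_a = build_got(0x7F10_0000_0000, LIB_SYMBOLS)
got_b = build_got(0x7F55_4000_0000, LIB_SYMBOLS)

for name in LIB_SYMBOLS:
    print(name, hex(got_a[name]), hex(got_b[name]))
```

The code pages can stay identical and shared because they only refer to GOT slots; the differing absolute addresses live in each process's private GOT.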
(This assumes that prelink is not in use, which is somewhat broken anyway.)

How OS catches illegal memory references at paging scheme?

I am trying to understand how the OS catches all illegal memory access in a system which uses Paging. (32 bits, x86, Paging enabled).
To be more specific, let's suppose I have a tiny App which is just 1 Page in size. Considering that an MS OS takes the upper half of the 'virtual memory address space' and that my tiny EXE occupies just 4k of the lower half of the VMAS, then:
1) How does the OS realize that there is an 'illegal memory reference/access' going on when my code tries to write to a memory location outside of my own Exe's 4k? (Obviously, that pointer wasn't obtained from a 'malloc' or similar call).
2) How are Page Tables managed for that tiny Exe? Does the OS have to define all 1 M Page Entries (-1 Page Entry) with a 'Non-Present' attribute set and 'System' owned? (When that 'process' is created).
Any advice or comment is welcome.
EDIT:
Just to make things clear, the answer (compiled from all generous contributions) is:
In order to catch an illegal reference for unallocated memory, the VMAS for the App should be marked as User & Non-Present and the rest of the VMAS should be marked as Kernel & Non-Present.
(Of course, allocated memory is with User attribute. Take note that User & Non-Present is at 'process creation' before its first run!. After that it changes to User & Present).
That way the hardware monitor will catch any access outside of the App boundary!!!
And the Page Fault handler will assume an illegal access because no User code is allowed to access (read/write) a Kernel page.
[VMAS= Virtual Memory Address Space]
1) How does the OS realize that there is an 'illegal memory reference/access' going on when my code tries to write to a memory location outside of my own Exe's 4k? (Obviously, that pointer wasn't obtained from a 'malloc' or similar call).
A sequence of events has to take place. The processor takes as inputs (a) the logical page being accessed; (b) the type of access; and (c) the processor mode to determine whether an access is valid.
1. Is there a page table entry for the page? If not => access violation.
2. Is the page table entry marked valid? The processing here is system specific, depending upon whether the page tables can distinguish between an invalid page table entry and a valid entry that is not mapped to a page frame. In the former case => access violation. In the latter case, it triggers a page fault and the OS has to determine whether to trigger an access violation or load the page.
3. Does the page table permit the type of access for the current processor mode? If not => access violation.
If the hardware triggers an access violation exception, it switches to kernel mode and invokes the OS's access violation handler.
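The sequence of checks above can be sketched as a toy model. This is a simplification for illustration, not how any particular MMU or OS implements it; PageEntry, check_access and the example table are invented names and values.

```python
from dataclasses import dataclass


@dataclass
class PageEntry:
    present: bool    # is the page currently mapped to a page frame?
    writable: bool   # are writes permitted?
    user: bool       # may user-mode code touch it?


def check_access(table: dict[int, PageEntry], page: int,
                 write: bool, user_mode: bool) -> str:
    """Classify one memory access against a (toy) page table."""
    entry = table.get(page)
    if entry is None:
        return "access violation"      # no page table entry at all
    if not entry.present:
        return "page fault"            # the OS decides: load the page or kill the process
    if user_mode and not entry.user:
        return "access violation"      # user code touching a kernel page
    if write and not entry.writable:
        return "access violation"      # write to a read-only page
    return "ok"


table = {
    0x400: PageEntry(present=True, writable=False, user=True),    # code page
    0x401: PageEntry(present=False, writable=True, user=True),    # paged-out data
    0x80000: PageEntry(present=True, writable=True, user=False),  # kernel page
}

print(check_access(table, 0x400, write=False, user_mode=True))
print(check_access(table, 0x400, write=True, user_mode=True))
print(check_access(table, 0x401, write=True, user_mode=True))
print(check_access(table, 0x80000, write=False, user_mode=True))
print(check_access(table, 0x123, write=False, user_mode=True))
```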
2) How are Page Tables managed for that tiny Exe? Does the OS have to define all 1 M Page Entries (-1 Page Entry) with a 'Non-Present' attribute set and 'System' owned? (When that 'process' is created).
Operating systems provide system services for mapping memory into the process address space. Generally, the program loader reads the instructions in the EXE file and calls page mapping system services to set up the initial state of the application.
When this occurs depends upon the operating system. In eunuchs-land, a process is a clone of its parent. The running of a program takes place in an exec___ system call. Some operating systems have a background command processor that allows multiple applications to be run sequentially within a single process.
From there, it is up to the application to manage the pages mapped to its address space. That is done by calling system services. For example "malloc" calls will cause the application to use system services to map pages.
The initial state of the application is likely to have holes of invalid user addresses. In fact, the range of valid addresses is not likely to be contiguous within the logical address space.
Each page has, among others, the following attributes: Present and Read/Write.
Accessing a page that is not present, or writing a read-only page, generates a privileged event called a page fault. This event takes the form of the CPU executing a specific routine that the OS set up.
Hence the OS is informed of the event and the attempt that was made.
The structures used to implement paging are hierarchical: pages are grouped into directories, and directories into higher directories. There are usually four levels on x86-64; 32-bit x86 without PAE uses two.
Like in a file system, only the directories needed to reach the specific page need to be created.
A definitive source of information is the Intel manuals, specifically the third volume.
This answer intentionally uses simplified words.
How does the OS realize that there is an 'illegal memory reference/access' going on when my code tries to write to a memory location outside of my own Exe's 4k? (Obviously, that pointer wasn't obtained from a 'malloc' or similar call).
A page fault is raised and the page fault handler gets executed. In the case of an invalid memory access it terminates the program. In the case of an access of swapped memory, it restores the memory contents from the disk into the main memory again and lets the program continue.
How are Page Tables managed for that tiny Exe? Does the OS have to define all 1 M Page Entries (-1 Page Entry) with a 'Non-Present' attribute set and 'System' owned? (When that 'process' is created).
On x86, there are two-level page structures: page directories and page tables. Assuming your program fits in a single page, the OS will initialise a page directory that contains only one valid entry pointing to a page table, and that page table will contain only one valid entry pointing to the page containing the needed memory.
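For 32-bit x86 with 4 KiB pages and no PAE, a virtual address splits into a 10-bit directory index, a 10-bit table index and a 12-bit offset. The sketch below shows that split and models a sparse page directory as nested dictionaries, so only the entries actually needed exist. The function names, the address and the frame number are illustrative assumptions.

```python
def split_address(vaddr: int) -> tuple[int, int, int]:
    """Split a 32-bit virtual address into (directory index, table index, page offset)."""
    return (vaddr >> 22) & 0x3FF, (vaddr >> 12) & 0x3FF, vaddr & 0xFFF


def map_page(directory: dict[int, dict[int, int]], vaddr: int, frame: int) -> None:
    """Record vaddr -> frame, creating the page table only when it is first needed."""
    pde, pte, _ = split_address(vaddr)
    directory.setdefault(pde, {})[pte] = frame


directory: dict[int, dict[int, int]] = {}
map_page(directory, 0x00401000, frame=0x1234)   # the tiny program's single page

print(split_address(0x00401234))   # indices and offset for an address inside that page
print(len(directory), "directory entry,", len(directory[1]), "table entry")
```

Every other directory slot stays empty (not present), so an access anywhere else fails the lookup without the OS having to fill in a million entries.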

How does a PE file get mapped into memory?

So I have been researching the PE format for the last couple of days, and I still have a couple of questions.
Does the data section get mapped into the process' memory, or does the program read it from the disk?
If it does get mapped into its memory, how can the process acquire the offset of the section? ( And other sections )
Is there any way to get the entry point of a process that has already been mapped into memory, without touching the file on disk?
Does the data section get mapped into the process' memory
Yes, but that mapping is unlikely to survive for very long: the program is apt to write to that section, which triggers a copy-on-write page copy, after which the page is backed by the paging file instead of the PE file.
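The copy-on-write behaviour can be observed from user code with a private file mapping. This sketch uses Python's mmap with ACCESS_COPY on a throwaway file rather than a real PE image; the function name private_write_demo and the file contents are made up for the example.

```python
import mmap
import os
import tempfile


def private_write_demo() -> tuple[bytes, bytes]:
    """Write through a copy-on-write mapping; return (bytes in memory, bytes on disk)."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"initial data")
        os.close(fd)
        with open(path, "r+b") as f:
            # ACCESS_COPY: writes change this process's private pages, never the file.
            with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_COPY) as view:
                view[0:7] = b"PATCHED"
                in_memory = bytes(view[:])
        with open(path, "rb") as f:
            on_disk = f.read()
    finally:
        os.remove(path)
    return in_memory, on_disk


in_memory, on_disk = private_write_demo()
print(in_memory, on_disk)
```

The written page is no longer backed by the file, which is the same effect the answer describes for a written-to data section.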
how can the process acquire the offset of the section?
The linker already calculated the offsets of variables in the section. It might be relocated, common for DLLs that have an awkward base address that's already in use when the DLL gets loaded. In which case the relocation table in the PE file is used by the loader to patch the addresses in the code. The pages that contain such patched code get the same treatment as the data section, they are no longer backed by the PE file and cannot be shared between processes.
Is there any way to get the entry point of a process
The entire PE file gets mapped to memory, including its headers. So you can certainly read IMAGE_OPTIONAL_HEADER.AddressOfEntryPoint from memory without reading the file. Do keep in mind that it is painful if you do this for another process since you don't have direct access to its virtual address space. You'd have to use ReadProcessMemory(), that's fairly little joy and unlikely to be faster than reading the file. The file is pretty likely to be present in the file system cache. The Address Space Layout Randomization feature is apt to give you a headache, designed to make it hard to do these kind of things.
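As a sketch of reading AddressOfEntryPoint from headers that are already in memory: given the bytes found at a module's base address, follow e_lfanew to the PE signature, skip the 20-byte COFF file header, and read the field at offset 16 of the optional header. The function name entry_point_rva, the synthetic header bytes and the module base are assumptions for illustration; for another process the bytes would have to come from ReadProcessMemory.

```python
import struct


def entry_point_rva(image: bytes) -> int:
    """Return IMAGE_OPTIONAL_HEADER.AddressOfEntryPoint from in-memory PE headers."""
    if image[:2] != b"MZ":
        raise ValueError("missing DOS header")
    (e_lfanew,) = struct.unpack_from("<I", image, 0x3C)
    if image[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    optional = e_lfanew + 4 + 20            # skip the signature and the COFF file header
    (magic,) = struct.unpack_from("<H", image, optional)
    if magic not in (0x10B, 0x20B):         # PE32 or PE32+
        raise ValueError("unknown optional header magic")
    # AddressOfEntryPoint sits at offset 16 in both PE32 and PE32+.
    (rva,) = struct.unpack_from("<I", image, optional + 16)
    return rva


# Synthetic headers, just enough fields for the parser (all values hypothetical).
demo = bytearray(0x200)
demo[0:2] = b"MZ"
struct.pack_into("<I", demo, 0x3C, 0x80)        # e_lfanew
demo[0x80:0x84] = b"PE\0\0"
struct.pack_into("<H", demo, 0x98, 0x20B)       # optional header magic (PE32+)
struct.pack_into("<I", demo, 0x98 + 16, 0x1540) # AddressOfEntryPoint

module_base = 0x7FF6_1234_0000                  # hypothetical load address
entry_va = module_base + entry_point_rva(bytes(demo))
print(hex(entry_va))
```

The field is a relative virtual address, so the actual entry address is the module's load address plus this value, whatever base ASLR happened to pick.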
Does the data section get mapped into the process' memory, or does the program read it from the disk?
It's mapped into process' memory.
If it does get mapped into its memory, how can the process acquire the offset of the section? ( And other sections )
By means of a relocation table: every reference to a global object (data or function) from the executable code, that uses direct addressing, has an entry in this table so that the loader patches the code, fixing the original offset. Note that you can make a PE file without relocation section, in which case all data and code sections have a fixed offset, and the executable has a fixed entry point.
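The patching step can be sketched in a few lines, assuming 32-bit absolute addresses and a flat list of fixup offsets (a real PE relocation table is organised in per-page blocks with typed entries, which this ignores). The function name apply_relocations and all addresses below are hypothetical.

```python
import struct


def apply_relocations(image: bytearray, fixup_offsets: list[int],
                      preferred_base: int, actual_base: int) -> None:
    """Add the load delta to every 32-bit absolute address listed in fixup_offsets."""
    delta = actual_base - preferred_base
    for offset in fixup_offsets:
        (value,) = struct.unpack_from("<I", image, offset)
        struct.pack_into("<I", image, offset, (value + delta) & 0xFFFFFFFF)


# A fake code section holding one absolute pointer to data at 0x00403000,
# computed by the linker under the assumption that the image loads at 0x00400000.
section = bytearray(16)
struct.pack_into("<I", section, 4, 0x00403000)

# The preferred base was taken, so the loader placed the module at 0x10000000.
apply_relocations(section, [4], preferred_base=0x00400000, actual_base=0x10000000)
(patched,) = struct.unpack_from("<I", section, 4)
print(hex(patched))
```

If the module loads at its preferred base the delta is zero and nothing needs patching.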
Is there any way to get the entry point of a process that has already been mapped into memory, without touching the file on disk?
Not sure, but if by "not touching" you mean not even reading the file, then you may figure it out by walking up the stack.
Yes, all sections that are described in the PE header get mapped into memory. The IMAGE_SECTION_HEADER struct tells the loader how to map it (the section can for example be much bigger in memory than on disk).
I'm not quite sure if I understand what you are asking. Do you mean how does code from the code section know where to access data in the data section? If the module loads at the preferred load address then the addresses that are generated statically by the linker are correct, otherwise the loader fixes the addresses with relocation information.
Yes, the Windows loader also loads the PE header into memory at the base address of the module. There you can find all the info that was in the file's PE header, including the entry point.
I can recommend this article for everything about the PE format, especially on relocations.
Does the data section get mapped into the process' memory, or does the program read it from the disk?
Yes. Before execution begins, the operating system's loader (on Windows and Linux alike) must map everything into memory.
If it does get mapped into its memory, how can the process acquire the offset of the section? ( And other sections )
The PE file has a well-defined structure; the loader parses that information to obtain the relative virtual address of each section relative to ImageBase. Also, if ASLR (the address randomization feature) is active on the system, the loader has to use relocation information to resolve those offsets.
Is there any way to get the entry point of a process that has already been mapped into memory, without touching the file on disk?
Nope. To calculate the OEP, the operating system's loader uses the ImageBase + EntryPoint member values of the optional header structure, and in some particular places, when address randomization is enabled, it uses the relocation table to resolve all addresses. So we can't do anything without parsing the PE file on disk.

How much of shared object is loaded to memory

Suppose there is a shared object file, say libComponent.so, which is made up of two object files, Component_1.o and Component_2.o.
And there is an application which links to libComponent.so but only uses Component_1.o functions.
Will the entire shared object, i.e. libComponent.so, be loaded into memory when the application runs and uses the shared object file, or just Component_1.o?
Is there an option available in gcc compiler to toggle this behaviour of only loading the required symbols from a shared object ?
Well, it depends on what you mean by 'loaded'.
The dynamic linker will map all of the library into the process's virtual memory space and will fill in entries in the executable's import table for each library function used with the addresses of functions in the shared library. But filling in the import table doesn't actually load from those addresses, so they won't be loaded into physical memory.
From then on, the library code will be paged into physical memory on demand when the function is called, just like any other pageable memory in the process's virtual address space. If a function is never called (directly from the application or indirectly from another library function called by the application), it won't be paged in. (Well, paging occurs with page size granularity, so you might pull in a function the application doesn't call if it's next to a function it does call. Some compilers use profile-guided optimization to place functions commonly called together next to each other to minimize the number of pages used.)
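The page-granularity point can be made concrete with a little arithmetic. This sketch assumes 4 KiB pages and a made-up function layout (the names, offsets and sizes are hypothetical) and computes which pages of the mapping would be faulted in when one function runs.

```python
def pages_touched(offset: int, size: int, page_size: int = 4096) -> range:
    """Page numbers covered by a function at byte `offset` of the given `size`."""
    first = offset // page_size
    last = (offset + size - 1) // page_size
    return range(first, last + 1)


# Hypothetical layout of three functions inside a library's text segment: (offset, size).
functions = {
    "used": (4000, 200),        # straddles the boundary between pages 0 and 1
    "neighbour": (4200, 100),   # never called, but lives on page 1
    "far_away": (20000, 300),   # never called, on a page of its own
}

resident = set(pages_touched(*functions["used"]))
for name, (offset, size) in functions.items():
    came_along = any(page in resident for page in pages_touched(offset, size))
    print(name, "resident" if came_along else "not paged in")
```

Calling only "used" also brings "neighbour" into physical memory, because it shares a page, while "far_away" stays on disk.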
(Aside: if your library wasn't compiled to use position-independent code and it's loaded at its non-default base address, the linker will need to fix up addresses in the code when it's loaded, which would cause the entire library to be paged in. This could be done lazily when each page is first loaded, though I'm not sure which linkers do this.)

Resources