Partial unmap of Win32 memory-mapped file

Partial unmap of Win32 memory-mapped file - winapi

I have some code (which I cannot change) that I need to get working in a native Win32 environment. This code calls mmap() and munmap(), so I have created those functions using CreateFileMapping(), MapViewOfFile(), etc., to accomplish the same thing. Initially this works fine, and the code is able to access files as expected. Unfortunately the code goes on to munmap() selected parts of the file that it no longer needs.
x = mmap(0, size, PROT_READ, MAP_SHARED, fd, 0);
...
munmap(x, hdr_size);
munmap(x + foo, bar);
...
Unfortunately, when you pass a pointer into the middle of the mapped range to UnmapViewOfFile() it destroys the entire mapping. Even worse, I can't see how I would be able to detect that this is a partial un-map request and just ignore it.
I have tried calling VirtualFree() on the range but, unsurprisingly, this produces ERROR_INVALID_PARAMETER.
I'm beginning to think that I will have to use static/global variables to track all the open memory mappings so that I can detect and ignore partial unmappings, but I hope you have a better idea...
edit:
Since I wasn't explicit enough above: the docs for UnMapViewOfFile do not accurately reflect the behavior of that function.
Un-mapping the whole view and remapping pieces is not a good solution because you can only suggest a base address for a new mapping, you can't really control it. The semantics of munmap() don't allow for a change to the base address of the still-mapped portion.
What I really need is a way to find the base address and size of a already-mapped memory area.
edit2: Now that I restate the problem that way, it looks like the VirtualQuery() function will suffice.

It is quite explicit in the MSDN Library docs for UnmapViewOfFile:
lpBaseAddress A pointer to the
base address of the mapped view of a
file that is to be unmapped. This
value must be identical to the value
returned by a previous call to the
MapViewOfFile or MapViewOfFileEx
function.
You changing the mapping by unmapping the old one and creating a new one. Unmapping bits and pieces isn't well supported, nor would it have any useful side-effects from a memory management point of view. You don't want to risk getting the address space fragmented.
You'll have to do this differently.

You could keep track each mapping and how many pages of it are still allocated by the client and only free the mapping when that counter reaches zero. The middle sections would still be mapped, but it wouldn't matter since the client wouldn't be accessing that memory anyway.
Create a global dictionary of memory mappings through this interface. When a mapping request comes through, record the address, size and number of pages that are in the range. When a unmap request is made, find out which mapping owns that address and decrease the page count by the number of pages that are being freed. When that count reaches zero, really unmap the view.

Related

Can someone explain the Windows ZwMapViewOfSection system call so that a noob (me) can understand?

I'm investigating a set of Windows API system calls made by a piece of malware running in a sandbox so that I can understand its malicious intent. Unfortunately, I'm struggling to understand the ZwMapViewOfSection function described in documentation: https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/content/wdm/nf-wdm-zwmapviewofsection
Now, I do understand that this function is related to the mapping of physical memory to virtual memory in a page table. Apart from that, I find the documentation arcane and not friendly to beginners. I am also confused why they are calling blocks of physical memory "sections" rather than "frames" (if that is what they are indeed referring to -- its not clear to me). Can anyone provide a more intuitive explanation about this system call and what it does in general? Is this a common system call for programs or is it limited to malware? Thank You.

It is extremely common for normal programs to make this call (not directly of course), every program is going to invoke it multiple times during initialization at the very least (ZwMapViewOfSection is used when performing the memory backing used to implement the executable sections of code itself). Not so common during normal program code but not uncommon either. Particularly common if the program performs dynamic DLL loads but legitimate programs can also do memory-mapped IO for their own reasons.
It operates on memory section objects (I've never really understood that name either) which are one part of the link between disc files and memory-mapped regions, the section is created via ZwCreateSection or opened with ZwOpenSection and then the other part comes into play with ZwMapViewOfSection.
What part of this, exactly, is confusing you? Knowing that would make it far easier to provide an informative response.

As far as I understand it, you have to open the file and acquire a file handle which you then map with CreateFileMapping, which will call NtCreateSection, which calls MmCreateSection. If the file is mapped for the first time a new segment object and control area are created first then depending on whether the section is created for a data, image or page-file backed MiCreateDataFileMap, MiCreateImageFileMap or MiCreatePagingFileMap is called.
MiCreateDataFileMap sets up the subsection object and section object. In the normal case only one subsection is created, but under some special conditions multiple subsections are used, e.g. if the file is very large. For data files, the subsection object field SubsectionBase is left blank. Instead the SegmentPteTemplate field of the segment object is setup properly which can be used to create the PPTEs when necessary. This defers the creation of PPTEs until a view is mapped for the first time which avoids wasting memory when very large data files are mapped. Note a PPTE is a PTE that is serving as a prototype PTE, but an _MMPTE_PROTOTYPE is a PTE that is pointing to a prototype.
MiCreateImageFileMap creates the section object and loads the PE header of the specified file and verifies it then one subsection is created for the PE header and one for each PE section. If a very small image file is mapped then only one subsection is created for the complete file. Besides the subsections also the related PPTEs for each of them are created and their page protection flags are set according to the protection settings of the related PE section. These PPTEs will be used as a template for building the real PTEs when a view is mapped and accessed.
After a section is created it can be mapped into the address space by creating a view from it. The flProtect passed to CreateFileMapping specifies the protection of the section object. All mapped views of the object must be compatible with this protection. You specify dwMaximumSizeLow and dwMaximumSizeHigh to be 0 in order for dwMaximumSizeHigh to be set to the length of the file automatically.
You then pass the returned section object handle to MapViewOfFile, which will calls NtMapViewOfSection on it, which calls MmMapViewOfSegment, which calls MmCreateMemoryArea, which is where the view is mapped into the VAD of the process with the protection dwDesiredAccess supplied to MapViewOfFile, which serves as the protection type for all PTEs that the VAD entry covers. dwNumberOfBytesToMap = 0 and dwFileOffsetLow = 0 in MapViewOfFile maps the whole file.
When a view is mapped, I believe that all of the PTEs are made to point to the prototype PTEs and are given the protection of the PPTE. For an image file, the PPTEs have already been initialised to subsection PTEs. For a data file, the PPTEs for the view need to be initialised to subsection PTEs. The VAD entry for the view is now created. The VAD entry protection isn't always reflective of the protection of the PTEs it covers, because it can cover multiple subsections and multiple blocks within those subsections.
The first time an address in the mapping is actually accessed, the subsection prototype PTE is filled in on demand with the allocated physical page filled with the I/O write for that range and the process PTE is filled in with that same address. For an image, the PPTE was already filled in when the subsections were created along with protection information derived from the section header characteristics in the image, and it just fills in the PTE with that address and the protection information in it.
When the PTE is trimmed from the process working set, the working set manager accesses the PFN to locate the PPTE address, decreases the share count, and it inserts the PPTE address into the PTE.
I'm not sure when a VAD PTE (which have a prototype bit and prototype address of 0xFFFFFFFF0000 and is not valid) occurs. I would have thought the PPTEs are always there at their virtual address and can be pointed to as soon as the VAD entry is created.

C++ function for "kernel memory check" before accessing

I am developing a simple device driver for study. With a lot of testing, I am creating so many errors which finally leads my computer to blue screen. I am sure that the reason for this is memory crash. So now I want to check if my code can access to Kernel memory before going further.
My question is what function can check whether it is accessible or not in kernel memory. For instance, there is a pointer structure with two fields and I want to access the first field which is also a pointer but do now know whether it really has an accessible value or just trash value.
In this given context, I need to check it out to make sure that I am not getting blue screen.
Thanks in advance!

this is impossible for kernel memory. you must know exactly are kernel address is valid and will be valid during access. if you get address from user mode - you can and must use ProbeForRead or ProbeForWrite. but this is only for user-mode buffer. for any kernel memory (even valid and resident) this function just raise exception. from invalid access to kernel memory address no any protection. try - except handler not help here - you just got PAGE_FAULT_IN_NONPAGED_AREA bug check
For instance, there is a pointer structure with two fields and I want
to access the first field which is also a pointer but do now know
whether it really has an accessible value or just trash value.
from where you got this pointer ? who fill this structure ? you must not check. you must know at begin that this pointer is valid and context of structure will be valid during time you use it. otherwise your code already wrong and buggy

Accessing user space data from linux kernel

This is an assignment problem which asks for partial implementation of process checkpointing:
The test program allocates an array, does a system call and passes the start and end address of array to the call. In the system call function I have to save the contents in the give range to a file.
From my understanding, I could simply use copy_from_usr function to save the contents from the give range. However since the assignment is based on topic "Process address space", I probably need to walk through page tables. Say I manage to get the struct pages that correspond to given range. How do I get the data corresponding to the pages?
Can I just use page_to_virt function and access data directly?
Since the array is contiguous in virtual space, I guess I will just need to translate the starting address to page and then back to virtual address and then just copy the range size of data to file. Is that right?

I think copy_from_user() is ok, nothing else needed. When executing the system call, although it trap to kernel space, the context is still the process context which doing the system call. The kernel still use the process's page table. So just to use copy_from_user(), and nothing else needed.
Okey, if you want to do this experiment, I think you can use the void __user *vaddr to traverse the mm->pgd(page table), using pgd_offset/pud_offset/pmd_offset/pte_offset to get the page physical address(page size alignment). Then in kernel space, using ioremap() to create a kernel space mapping, then using the kernel virtual address(page size) + offset(inside the page), you get the start virtual address of the array. Now in kernel, you can using the virtual address to access the array.

VirtualQueryEx how do i parse the results (how do i know if the page is readable or not by ReadProcessMemory?)

I am trying to learn how to access other process's memory. I'm using ReadProcessMemory but it fails, supposedly because i'm reading from an unaccessible memory. I think I should use VirtualQueryEx first to find out if the memory I'm trying to read is readable, but I don't know how to parse the results of VirtualQueryEx.
VirtualQueryEx returns a structure described here http://msdn.microsoft.com/en-us/library/windows/desktop/aa366775%28v=vs.85%29.aspx
Is there any good simple way to know whether the values in the MEMORY_BASIC_INFORMATION structure tell that the page is readable by ReadProcessMemory or not?
This structure has several parameters that, from what I understand, define whether or not memory is accessible:
DWORD AllocationProtect;
DWORD State;
DWORD Protect;
DWORD Type;
Its not even clear on what combination of parameters actually allows reading the memory (for example, should I read Protection flag from AllocationProtect or Protect, does State other than Commit mean that memory is unreadable, etc) and making a huge if (this is that and that is that or this is that....) doesn't seem like a clean solution.
There's got to be a better way, but I don't seem to be able to find it...

Actually the clue is in the documentation you linked:
AllocationProtect
The memory protection option when the region was initially allocated. This member can be one
of the memory protection constants or 0 if the caller does not have access.
VirtualProtect or VirtualProtectEx could have been used meanwhile to change the protection (per-page, while the initial protection was for the whole allocated range). So you'd have to check the Protect member of the structure against the Memory Protection Constants in the MSDN documentation that state that read is allowed. Checking for a single bit doesn't seem to be enough.

Can address space be recycled for multiple calls to MapViewOfFileEx without chance of failure?

Consider a complex, memory hungry, multi threaded application running within a 32bit address space on windows XP.
Certain operations require n large buffers of fixed size, where only one buffer needs to be accessed at a time.
The application uses a pattern where some address space the size of one buffer is reserved early and is used to contain the currently needed buffer.
This follows the sequence:
(initial run) VirtualAlloc -> VirtualFree -> MapViewOfFileEx
(buffer changes) UnMapViewOfFile -> MapViewOfFileEx
Here the pointer to the buffer location is provided by the call to VirtualAlloc and then that same location is used on each call to MapViewOfFileEx.
The problem is that windows does not (as far as I know) provide any handshake type operation for passing the memory space between the different users.
Therefore there is a small opportunity (at each -> in my above sequence) where the memory is not locked and another thread can jump in and perform an allocation within the buffer.
The next call to MapViewOfFileEx is broken and the system can no longer guarantee that there will be a big enough space in the address space for a buffer.
Obviously refactoring to use smaller buffers reduces the rate of failures to reallocate space.
Some use of HeapLock has had some success but this still has issues - something still manages to steal some memory from within the address space.
(We tried Calling GetProcessHeaps then using HeapLock to lock all of the heaps)
What I'd like to know is there anyway to lock a specific block of address space that is compatible with MapViewOfFileEx?
Edit: I should add that ultimately this code lives in a library that gets called by an application outside of my control

You could brute force it; suspend every thread in the process that isn't the one performing the mapping, Unmap/Remap, unsuspend the suspended threads. It ain't elegant, but it's the only way I can think of off-hand to provide the kind of mutual exclusion you need.

Have you looked at creating your own private heap via HeapCreate? You could set the heap to your desired buffer size. The only remaining problem is then how to get MapViewOfFileto use your private heap instead of the default heap.
I'd assume that MapViewOfFile internally calls GetProcessHeap to get the default heap and then it requests a contiguous block of memory. You can surround the call to MapViewOfFile with a detour, i.e., you rewire the GetProcessHeap call by overwriting the method in memory effectively inserting a jump to your own code which can return your private heap.
Microsoft has published the Detour Library that I'm not directly familiar with however. I know that detouring is surprisingly common. Security software, virus scanners etc all use such frameworks. It's not pretty, but may work:
HANDLE g_hndPrivateHeap;
HANDLE WINAPI GetProcessHeapImpl() {
return g_hndPrivateHeap;
}
struct SDetourGetProcessHeap { // object for exception safety
SDetourGetProcessHeap() {
// put detour in place
}
~SDetourGetProcessHeap() {
// remove detour again
}
};
void MapFile() {
g_hndPrivateHeap = HeapCreate( ... );
{
SDetourGetProcessHeap d;
MapViewOfFile(...);
}
}
These may also help:
How to replace WinAPI functions calls in the MS VC++ project with my own implementation (name and parameters set are the same)?
How can I hook Windows functions in C/C++?
http://research.microsoft.com/pubs/68568/huntusenixnt99.pdf

Imagine if I came to you with a piece of code like this:
void *foo;
foo = malloc(n);
if (foo)
free(foo);
foo = malloc(n);
Then I came to you and said, help! foo does not have the same address on the second allocation!
I'd be crazy, right?
It seems to me like you've already demonstrated clear knowledge of why this doesn't work. There's a reason that the documention for any API that takes an explicit address to map into lets you know that the address is just a suggestion, and it can't be guaranteed. This also goes for mmap() on POSIX.
I would suggest you write the program in such a way that a change in address doesn't matter. That is, don't store too many pointers to quantities inside the buffer, or if you do, patch them up after reallocation. Similar to the way you'd treat a buffer that you were going to pass into realloc().
Even the documentation for MapViewOfFileEx() explicitly suggests this:
While it is possible to specify an address that is safe now (not used by the operating system), there is no guarantee that the address will remain safe over time. Therefore, it is better to let the operating system choose the address. In this case, you would not store pointers in the memory mapped file, you would store offsets from the base of the file mapping so that the mapping can be used at any address.
Update from your comments
In that case, I suppose you could:
Not map into contiguous blocks. Perhaps you could map in chunks and write some intermediate function to decide which to read from/write to?
Try porting to 64 bit.

As the earlier post suggests, you can suspend every thread in the process while you change the memory mappings. You can use SuspendThread()/ResumeThread() for that. This has the disadvantage that your code has to know about all the other threads and hold thread handles for them.
An alternative is to use the Windows debug API to suspend all threads. If a process has a debugger attached, then every time the process faults, Windows will suspend all of the process's threads until the debugger handles the fault and resumes the process.
Also see this question which is very similar, but phrased differently:
Replacing memory mappings atomically on Windows

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio