Is there a way, in Windows, to check if a page in in memory or in disk(swap space)?
The reason I want know this is to avoid causing page fault if the page is in disk, by not accessing that page.
There is no documented way that I am aware of for accomplishing this in user mode.
That said, it is possible to determine this in kernel mode, but this would involve inspecting the Page Table Entries, which belong to the Memory Manager - not something that you really wouldn't want to do in any sort of production code.
What is the real problem you're trying to solve?
The whole point of Virtual Memory is to abstract this sort of thing away. If you are storing your own data and in user-land, put it in a data-structure that supports caching and don't think about pages.
If you are writing code in kernel-space, I know in linux you need to convert a memory address from a user-land to a kernal-space one, then there are API calls in the VMM to get at the page_table_entry, and subsequently the page struct from the address. Once that is done, you use logical operators to check for flags, one of which is "swapped". If you are trying to make something fast though, traversing and messing with memory at the page level might not be the most efficient (or safest) thing to do.
More information is needed in order to provide a more complete answer.
Related
I'm doing this as a personal project, I want to make a visualizer for this data. but the first step is getting the data.
My current plan is to
make my program debug the target process step through it
each step record the EIP from every thread's context within the target process
construct the memory address the instruction uses from the context and store it.
Is there an easier or built in way to do this?
Have a look at Intel PIN for dynamic binary instrumentation / running a hook for every load / store instruction. intel-pin
Instead of actually single-stepping in a debugger (extremely slow), it does binary-to-binary JIT to add calls to your hooks.
https://software.intel.com/sites/landingpage/pintool/docs/81205/Pin/html/index.html
Honestly the best way to do this is probably instrumentation like Peter suggested, depending on your goals. Have you ever ran a script that stepped through code in a debugger? Even automated it's incredibly slow. The only other alternative I see is page faults, which would also be incredibly slow but should still be faster than single step. Basically you make every page not in the currently executing section inaccessible. Any RW access outside of executing code will trigger an exception where you can log details and handle it. Of course this has a lot of flaws -- you can't detect RW in the current page, it's still going to be slow, it can get complicated such as handling page execution transfers, multiple threads, etc. The final possible solution I have would be to have a timer interrupt that checks RW access for each page. This would be incredibly fast and, although it would provide no specific addresses, it would give you an aggregate of pages written to and read from. I'm actually not entirely sure off the top of my head if Windows exposes that information already and I'm also not sure if there's a reliable way to guarantee your timers would get hit before the kernel clears those bits.
I'm trying to write a pseudo kernel driver (it uses CVE 2018-8120 to get kernel permission so it's technically not a driver) and I want to be as safe as possible when entering ring0. I'm writing a function to read and write MSR's from userland, and before the transition to ring0 I'm trying to guarantee that the void pointer given to my function can be written, I decided the ideal way to do this was to make it writable if it is not already.
The problem is that the only way I know how to do this is with VirtualProtect() and NtAllocateVirtualMemory, but VirtualProtect() sometimes fails and returns an error instead. I want to know precisely where these access permissions are stored (in ram? in some special CPU register?) how I can obtain their address and how can I modify them directly?
User-mode code should never try to muck around in kernel data structures, and any properly written kernel will prevent it anyway. The best way for user mode code to ensure that an address can be written is to write to it. If the page was not already writeable, the page fault will cause the kernel to make it so.
Nevertheless, the kernel code /cannot/ rely on the application having done so, for two reasons:
1) Even if the application does it properly, the page might be unmapped again before (or after) entering ring 0.
2) The kernel should /never/ rely on application code to do the right thing. It always has to protect itself.
The access permissions information and page data is stored in the page directory, page table, CR0 and CR3.
More information can be found here: https://wiki.osdev.org/Paging.
I'm trying to write a toy working set estimator, by keeping track of page faults over a period of time. Whenever a page is faulted in, I want to record that it was touched. The scheme breaks down when I try to keep track of accesses to already-present pages. If a page is read from or written to without triggering a fault, I have no way of tracking the access.
So then, I want to be able to cause a "lightweight" fault to occur on a page access. I've heard of some method at some point, but I didn't understand why it worked so it didn't stick in my mind. Dirty bit maybe?
You can use mprotect with PROT_NONE ("Page cannot be accessed"). Then any access to the given page will cause a fault.
The usual way to do this is to simply clear the "present" bit for the page, while leaving the page in memory and the necessary kernel data structures in place so that the kernel knows this.
However, depending on the architecture in question you may have better options - for example, on x86 there is an "Accessed" flag (bit 5 in the PTE) that is set whenever the PTE is used in a linear address translation. You can simply clear this bit whenever you like, and the hardware will set it to record that the page was touched.
Using either of these methods you will need to clear the cached translation for that page out of the TLB - on x86 you can use the INVLPG instruction.
I'd like to my Windows C++ program to be able to read the number of hard page faults it has caused. The program isn't running as administrator. Edited to add: To be clear, I'm not as interested in the aggregate page fault count of the whole system.
It looks like ETW might export counters for this, but I'm having a lot of difficulty figuring out the API, and it's not clear what's accessible by regular users as compared to administrators.
Does anyone have an example of this functionality lying around? Or is it simply not possible on Windows?
(OT, but isn't it sad how much easier this is on *nix? gerusage() and you're done.)
afai can tell the only way to do this would be to use ETW (Event Tracing for Windows) to monitor kernel Hard Page Faults. The event payload has a thread ID that you might be able to correlate with an existing process (this is going to be non-trivial btw) to produce a running per-process count. I don't see any way to get historical information per process.
I can guarantee you that this is A Hard Problem because Process Explorer supports only Page Faults (soft or hard) in its per-process display.
http://msdn.microsoft.com/en-us/magazine/ee412263.aspx
A page fault occurs when a sought-out
page table entry is invalid. If the
requested page needs to be brought in
from disk, it is called a hard page
fault (a very expensive operation),
and all other types are considered
soft page faults (a less expensive
operation). A Page Fault event payload
contains the virtual memory address
for which a page fault happened and
the instruction pointer that caused
it. A hard page fault requires disk
access to occur, which could be the
first access to contents in a file or
accesses to memory blocks that were
paged out. Enabling Page Fault events
causes a hard page fault to be logged
as a page fault with a type Hard Page
Fault. However, a hard fault typically
has a considerably larger impact on
performance, so a separate event is
available just for a hard fault that
can be enabled independently. A Hard
Fault event payload has more data,
such as file key, offset and thread
ID, compared with a Page Fault event.
I think you can use GetProcessMemoryInfo() - Please refer to http://msdn.microsoft.com/en-us/library/ms683219(v=vs.85).aspx for more information.
Yes, quite sad. Or you could just not assume Windows is so gimp that it doesn't even provide a page fault counter and look it up: Win32_PerfFormattedData_PerfOS_Memory.
There is a C/C++ sample on Microsoft's site that explain how to read performance counters: INFO: PDH Sample Code to Enumerate Performance Counters and Instances
You can copy/paste it and I think you're interested by the "Memory" / "Page Reads/sec" counters, as stated in this interesting article: The Basics of Page Faults
This is done with performance counters in windows. It's been a while since I've done anything with them. I don't recall whether or not you need to run as administrator to query them.
[Edit]
I don't have example code to provide but according to this page, you can get this information for a particular process:
Process : Page Faults/sec. This is an
indication of the number of page
faults that occurred due to requests
from this particular process.
Excessive page faults from a
particular process are an indication
usually of bad coding practices.
Either the functions and DLLs are not
organized correctly, or the data set
that the application is using is being
called in a less than efficient
manner.
I don't think you need administrative credential to enumerate the performance counters. A sample at codeproject Performance Counters Enumerator
I am trying to understand if we can add our page fault handlers / exception handlers in kernel / user mode and handle the fault we induced before giving the control back to the kernel.
The task here will be not modifying the existing kernel code (do_page_fault fn) but add a user defined handler which will be looked up when a page fault or and exception is triggered
One could find tools like "kprobe" which provide hooks at instruction, but looks like this will not serve my purpose.
Will be great if somebody can help me understand this or point to good references.
From user space, you can define a signal handler for SIGSEGV, so your own function will be invoked whenever an invalid memory access is made. When combined with mprotect(), this lets a program manage its own virtual memory, all from user-space.
However, I get the impression that you're looking for a way to intercept all page faults (major, minor, and invalid) and invoke an arbitrary kernel function in response. I don't know a clean way to do this. When I needed this functionality in my own research projects, I ended up adding code to do_page_fault(). It works fine for me, but it's a hack. I would be very interested if someone knew of a clean way to do this (i.e., that could be used by a module on a vanilla kernel).
If you don't won't to change the way kernel handles these fault and just add yours before, then kprobes will server your purpose. They are a little difficult to handle, because you get arguments of probed functions in structure containing registers and on stack and you have to know, where exactly did compiler put each of them. BUT, if you need it for specific functions (known during creation of probes), then you can use jprobes (here is a nice example on how to use both), which require functions for probing with exactly same arguments as probed one (so no mangling in registers/stack).
You can dynamically load a kernel module and install jprobes on chosen functions without having to modify your kernel.
You want can install a user-level pager with gnu libsegsev. I haven't used it, but it seems to be just what you are looking for.
I do not think it would be possible - first of all, the page fault handler is a complex function which need direct access to virtual memory subsystem structures.
Secondly, imagine it would not be an issue, yet in order to write a page fault handler in user space you should be able to capture a fault which is by default a force transfer to kernel space, so at least you should prevent this to happen.
To this end you would need a supervisor to keep track of all memory access, but you cannot guarantee that supervisor code was already mapped and present in memory.