PIN_CALLER_TRACKS_DIRTY_DATA in User Mode - Windows

One possible solution to the problem of "Why does WriteFile call ReadFile, and how do I avoid it?" is to write to the file using CcPreparePinWrite with PIN_CALLER_TRACKS_DIRTY_DATA. Essentially, this makes the cache manager map a file section into memory without having to read it from disk, since the entire section is assumed to be overwritten.
The PIN_CALLER_TRACKS_DIRTY_DATA flag is commonly used in cases where a file system is managing a log file that is written to but not read from. Because the existing file data will be overwritten and not read, the cache manager may return pages of zeros instead of faulting in the actual pages of file data from disk.
This is all great in theory, but it seems quite complicated to achieve in practice, especially since these are kernel-mode functions that cannot be called from a user-mode application.
Is there any way to achieve this behaviour using the regular WriteFile API? Or is there any good resource that further explains how to make use of the Cache Manager Routines?

Related

Where are page permissions stored on hardware and how can I alter them directly?

I'm trying to write a pseudo kernel driver (it uses CVE-2018-8120 to gain kernel privileges, so it's technically not a driver), and I want to be as safe as possible when entering ring 0. I'm writing a function to read and write MSRs from userland, and before the transition to ring 0 I want to guarantee that the void pointer given to my function can be written to. I decided the ideal way to do this was to make it writable if it is not already.
The problem is that the only ways I know to do this are VirtualProtect() and NtAllocateVirtualMemory, but VirtualProtect() sometimes fails and returns an error instead. I want to know precisely where these access permissions are stored (in RAM? in some special CPU register?), how I can obtain their address, and how I can modify them directly.
User-mode code should never try to muck around in kernel data structures, and any properly written kernel will prevent it anyway. The best way for user mode code to ensure that an address can be written is to write to it. If the page was not already writeable, the page fault will cause the kernel to make it so.
Nevertheless, the kernel code /cannot/ rely on the application having done so, for two reasons:
1) Even if the application does it properly, the page might be unmapped again before (or after) entering ring 0.
2) The kernel should /never/ rely on application code to do the right thing. It always has to protect itself.
The access-permission information and page data are stored in the page directory, the page tables, and the CR0 and CR3 registers.
More information can be found here: https://wiki.osdev.org/Paging.

Is there a comparison or performance table of the different uses of FlushFileBuffers and FILE_FLAG_NO_BUFFERING?

I'm about to choose between using FlushFileBuffers after each write to a file and using FILE_FLAG_NO_BUFFERING each time I need to open the same file.
But I did not find any performance comparison about the use of one option or the other. Well, except this advice in MSDN:
If an application is performing multiple writes to disk and also needs to ensure critical data is written to persistent media, the application should use unbuffered I/O instead of frequently calling FlushFileBuffers. To open a file for unbuffered I/O, call the CreateFile function with the FILE_FLAG_NO_BUFFERING and FILE_FLAG_WRITE_THROUGH flags. This prevents the file contents from being cached and flushes the metadata to disk with each write.
So I'm assuming that, for an application that will write and read many times, it is better not to use FlushFileBuffers. But is there any comparison article, blog, or forum post about the use of them? And what if the file is being closed after the write? My google-fu has not turned up anything yet.

Read a File From Cache, But Without Polluting the Cache (in Windows)

Windows has a FILE_FLAG_NO_BUFFERING flag that allows you to specify whether or not you want your I/O to be cached by the file system.
That's fine, but what if I want to use the cache if possible, but avoid modifying it?
In other words, how do you tell Windows the following?
Read this file from the cache if it's already cached, but my data doesn't exhibit locality, so do not put it into the cache!
The SCSI standard defines a Disable Page Out bit that does precisely this, so I'm wondering how (if at all) it is possible to use that feature from Windows (with cooperation of the file system cache too, of course)?
Edit: TL;DR:
What's the equivalent of FILE_FLAG_WRITE_THROUGH for reads?
About the closest Windows provides to what you're asking is FILE_FLAG_WRITE_THROUGH.
I see two flags that look suspiciously like what you are asking for:
FILE_FLAG_RANDOM_ACCESS
FILE_FLAG_SEQUENTIAL_SCAN
The latter's documentation clearly suggests that it won't retain pages in the cache, though it will probably read ahead sequentially. The former's documentation is completely opaque, but would seem to imply what you want: if the access pattern is quite random, hanging onto pages for later reuse would be a waste of memory.
Keep in mind that, for files, the Windows kernel will always use some pages of 'cache' to hold the I/O; it has nowhere else to put the data. So it's not meaningful to say 'don't cache it', only 'evict this file's pages before evicting some other pages.'

Is I/O with section object(CreateFileMapping) faster than basic apis(Read/WriteFile)?

Option 1: use CreateFileMapping and MapViewOfFile, then do I/O with a function like memcpy.
Option 2: just use Read/WriteFile.
Is the first one faster than the second? I don't understand why it would be.
Is it that, if we use a section object, we get more cache benefits from the VMM or the Cache Manager?
File memory mapping is faster when page out occurs, as the file itself is used as paging storage.
If the memory in the memory-mapped file is not changed, there is no need to flush the page to the paging file, as the data is already in the file and Windows can reread the page from disk. .EXE and .DLL files are loaded using this mechanism and are thus their own page storage.
If the memory in the memory-mapped file is written, then page-out is the same as if the paging file had been used; possibly faster, as the same place on disk is reused (subject to NTFS optimisations).
The plain APIs consume page-file-backed memory to hold the contents of the file while it is in memory.
From a slightly different perspective, both APIs are optimised: the Read/WriteFile path may actually use memory-mapped files under the hood, so by mapping the file yourself you get a micro-optimisation from using the lower abstraction.
Both mechanisms will employ the VMM/Cache manager.
Use of ReadFile/WriteFile involves several extra memory-block copy operations, so it will be slower than use of MMFs. Another question is how much slower it will be: that is what you need to measure yourself.

Fastest way to pass a file's contents from Kernel to User mode?

I'll try to be brief, but fully descriptive:
This is Windows-specific. Using the Windows Driver Development Kit (DDK).
I am writing a Kernel Mode Driver (KMD) for the first time, having no prior experience in Kernel Mode. I am playing around currently with the "scanner" mini-filter sample which comes with the DDK, and expanding upon it for practice. The "scanner" mini-filter is a basic outline for a generic "anti-virus" type scanning driver which hooks file creates/closes and operates on the associated file to scan for a "bad word" before approving/denying the requested operation.
The end goal is to scan the file with the user-mode application when it is opened, deciding whether or not the mini-filter should allow the operation to complete, without noticeable slow-down to the process or user which is attempting to open the file. I will also want to scan the entire file again when a save is attempted to decide whether or not to allow the save to complete successfully or deny the save. The mini-filter sample lays out the groundwork for how to hook these calls, but is a bit weak in the actually "scanning" portion.
I am looking at expanding the sample to scan the entire file that has been opened, such as to generate a hash, rather than just the first 1k (the sample's limit). I have modified the sample to read the entirety of the file and send it using the same mechanisms within the original sample. This method uses FltReadFile to read the file within the KMD and FltSendMessage to send the buffer to the user-mode component. The user-mode application is using GetQueuedCompletionStatus to grab the notifications from the KMD and process the buffers.
However, I'm noticing that this process seems to be pretty slow compared to a normal open/read in C++ using the standard library (fstream). This method takes approximately 4 to 8 times longer than simply opening and reading the file in a simple C++ user app. I have adjusted buffer sizes to see if that makes a noticeable improvement, and while it can help slightly, the benefits have not been significant.
Since I am looking to scan files in 'real-time', this rate of transfer is highly disappointing and prohibitive. Is there a faster way to transfer a file's contents from a Kernel-Mode Driver to a User-Mode Application?
I can suggest several solutions:
Use DeviceIoControl with METHOD_OUT_DIRECT transfer type to pass large amounts of data.
Create memory section and map it to your process (remember about limited address space on 32-bit platforms).
Pass file path to your application and open it there.
