As we know, in Windows NT kernel, there are three ways to post a work item to execute in a system thread environment at PASSIVE_LEVEL.
i.e. ExQueueWorkItem, FltQueueGenericWorkItem, and FltQueueDeferredIoWorkItem.
However, I just wonder their differences and their respective application scenarios.
Any explanations?
From the documentation of each API:
ExQueueWorkItem : Can be used by drivers where there isn't any framework apis provided for such work. The documentation suggests to use IoQueueWorkItem instead.
FltQueueGenericWorkItem: For minifilter drivers, shall use this to do any non-IO related work. Like some periodic cleanup etc.
FltQueueDeferredIoWorkItem: For minifilter drivers for work related to IO operation. I.e. if you are filtering an IO, you can defer some work related to that IO using this function.
Related
The Windows Antimalware scan Interface (AMSI) contains abstractions which can be used to call the currently active virus scanner in Windows:
https://learn.microsoft.com/en-us/windows/desktop/amsi/antimalware-scan-interface-functions
There are 2 methods related to initialization:
AmsiInitialize
AmsiUninitialize
AmsiInitialize returns "A handle of type HAMSICONTEXT that must be passed to all subsequent calls to the AMSI API.".
After initialization is complete, I can use AmsiScanBuffer to scan a buffer for malware.
My question:
Can I use the same context concurrently from many threads in my application, or do I need to create one per thread from which I'm going to call the methods?
Reading the documentation, for AsmiUnitialize, it tells me that When the app is finished with the AMSI API it must call AmsiUninitialize.. This tells me that the context can be used for many calls, but it doesn't tell me anything about thread safety or concurrency.
Generally, API calls that are not specifically marked as thread-safe are not (this is usually true for any library). The easiest solution is to open an AMSI handle per thread.
(P.S. This only works with Windows Defender so far as I 've tested).
I need to use workqueue-like feature on Mac OSX (kernel mode driver) and am looking for a way to add work into a queue to be processed by a kernel thread later. Conceptually this is the same thing as workqueue feature available in Linux kernel. Is there something similar on XNU kernel as well?
I don't think there's a direct equivalent as such, although I admit I'm not intimately familiar with the Linux side, so I'll avoid comparing and just tell you about what's available on macOS/xnu.
I/O Kit IOWorkLoops
If you're building an I/O Kit driver, and especially if you're writing a secondary interrupt handler, you'll be using IOWorkLoops. Interrupts are abstracted by IOEventSource objects, which schedule secondary interrupt handlers to run on the driver's IOWorkLoop.
Each IOWorkLoop wraps one kernel thread and also provides a serialisation/locking mechanism for resources shared with that thread. All jobs submitted to a workloop either explicitly through an IOCommandGate or the workloop object directly, or as a result of an IOEventSource event will be serialised. Note that IOCommandGate jobs will run synchronously on the calling thread, not the workloop thread.
As always with macOS/OSX internals, you will want to look at the header file comments and possibly the implementation in the xnu source for details. I personally find IOWorkLoops a bit clumsy for some tasks, but if you're dealing with PCI devices, etc. you don't really have a choice.
thread_call
A more lightweight background work mechanism is the thread_call API. It's defined in <kern/thread_call.h> and supports running functions on an OS-managed background thread, optionally after a delay or with a specific priority. This is probably closer to what you know from Linux, has a fairly straightforward API, but is not suitable for secondary interrupt handlers.
I need a two-way communication between a kernel-mode WFP driver and a user-mode application. The driver initiates the communication by passing a URL to the application which then does a categorization of that URL (Entertainment, News, Adult, etc.) and passes that category back to the driver. The driver needs to know the category in the filter function because it may block certain web pages based on that information. I had a thread in the application that was making an I/O request that the driver would complete with the URL and a GUID, and then the application would write the category into the registry under that GUID where the driver would pick it up. Unfortunately, as the driver verifier pointed out, this is unstable because the Zw registry functions have to run at PASSIVE_LEVEL. I was thinking about trying the same thing with mapped memory buffers, but I’m not sure what the interrupt requirements are for that. Also, I thought about lowering the interrupt level before the registry function calls, but I don't know what the side effects of that are.
You just need to have two different kinds of I/O request.
If you're using DeviceIoControl to retrieve the URLs (I think this would be the most suitable method) this is as simple as adding a second I/O control code.
If you're using ReadFile or equivalent, things would normally get a bit messier, but as it happens in this specific case you only have two kinds of operations, one of which is a read (driver->application) and the other of which is a write (application->driver). So you could just use WriteFile to send the reply, including of course the GUID so that the driver can match up your reply to the right query.
Another approach (more similar to your original one) would be to use a shared memory buffer. See this answer for more details. The problem with that idea is that you would either need to use a spinlock (at the cost of system performance and power consumption, and of course not being able to work on a single-core system) or to poll (which is both inefficient and not really suitable for time-sensitive operations).
There is nothing unstable about PASSIVE_LEVEL. Access to registry must be at PASSIVE_LEVEL so it's not possible directly if driver is running at higher IRQL. You can do it by offloading to work item, though. Lowering the IRQL is usually not recommended as it contradicts the OS intentions.
Your protocol indeed sounds somewhat cumbersome and doing a direct app-driver communication is probably preferable. You can find useful information about this here: http://msdn.microsoft.com/en-us/library/windows/hardware/ff554436(v=vs.85).aspx
Since the callouts are at DISPATCH, your processing has to be done either in a worker thread or a DPC, which will allow you to use ZwXXX. You should into inverted callbacks for communication purposes, there's a good document on OSR.
I've just started poking around WFP but it looks like even in the samples that they provide, Microsoft reinject the packets. I haven't looked into it that closely but it seems that they drop the packet and re-inject whenever processed. That would be enough for your use mode engine to make the decision. You should also limit the packet capture to a specific port (80 in your case) so that you don't do extra processing that you don't need.
Is there any way to hook all disk writes going thru the system, and receive the file names of whatever's being modified, using the Win32 API? Or is this something that would require writing a driver?
You can't do this in user mode, it needs to be kernel mode and so that means a driver. You need a File System Filter Driver.
If you don't care about intercepting the actual data, and only want to know which files are being modified/created/deleted then you can use the ReadDirectoryChangesW API to get that info from userland. Note however that it's one of the hardest functions to use effectively and efficiently, and you should be familiar with IOCP to use it correctly.
In Native API Microsoft exports two versions of each api call, one prefixed with Zw and one with Nt, for eg. ZwCreateThread and NtCreateThread.
My question is what is the difference between those two versions of the calls and when and why one should use Zw or Nt exclusively?
To my understanding Zw version ensures that the caller resides in kernel mode, whereas Nt doesn't.
I am also wondering about the specific meaning for Zw and Nt prefixes/abbreviations?
One can guess Nt probably refers to NT(New Technology) Windows family or Native(probably not)?
As for Zw, does it stand for something?
Update:
Aside from Larry Osterman's answer (which you should definitely read), there's another thing I should mention:
Since the NtXxx variants perform checks as though the call is coming from user-mode, this means that any buffers passed to the NtXxs function must reside in user-mode address spaces, not kernel-mode. So if you call a function like NtCreateFile in your driver and pass it pointers to kernel-mode buffers, you will get back a STATUS_ACCESS_VIOLATION because of this.
See Using Nt and Zw Versions of the Native System Services Routines.
A kernel-mode driver calls the Zw version of a native system services routine to inform the routine that the parameters come from a trusted, kernel-mode source. In this case, the routine assumes that it can safely use the parameters without first validating them. However, if the parameters might be from either a user-mode source or a kernel-mode source, the driver instead calls the Nt version of the routine, which determines, based on the history of the calling thread, whether the parameters originated in user mode or kernel mode.
Native system services routines make additional assumptions about the parameters that they receive. If a routine receives a pointer to a buffer that was allocated by a kernel-mode driver, the routine assumes that the buffer was allocated in system memory, not in user-mode memory. If the routine receives a handle that was opened by a user-mode application, the routine looks for the handle in the user-mode handle table, not in the kernel-mode handle table.
Also, Zw doesn't stand for anything. See What Does the Zw Prefix Mean?:
The Windows native system services routines have names that begin with the prefixes Nt and Zw. The Nt prefix is an abbreviation of Windows NT, but the Zw prefix has no meaning. Zw was selected partly to avoid potential naming conflicts with other APIs, and partly to avoid using any potentially useful two-letter prefixes that might be needed in the future.
I was going to leave this as a comment on Merhdad's answer but it got too long...
Mehrdad's answer is 100% accurate. It is also slightly misleading. The "PreviousMode" article linked to by the "Using Nt and Zw..." article Mehrdad goes into it in more detail. Paraphrasing:
The primary difference between the Nt and Zw API calls is that the Zw calls go through the system call dispatcher, but for drivers, Nt calls are direct calls to the API.
When a driver calls a Zw API, the only real effect of running through the system call dispatcher is that it sets KeGetPreviousMode() to KernelMode instead of UserMode (obviously to user mode code, the Zw and Nt forms are identical). When the various system calls see that ExGetPreviousMode is KernelMode, they bypass access checking (since drivers can do anything).
If a driver calls the NT form of the APIs, it's possible that it will fail because of the access checks.
A concrete example: if a driver calls NtCreateFile, the NtCreateFile will call SeAccessCheck() to see if the application which called into the driver has permissions to create the file. If that same driver called ZwCreateFile, the NtCreateFile API call won't call SeAccessCheck because ExGetPreviousMode returned KernelMode and thus the driver is assumed to have access to the file.
It's important for driver authors to understand the difference between the two since it can have profound implications for security...
Zw stands for Zero Wait (no time waste for parameters validation).