Best way to communicate from KEXT to Daemon and block until result is returned from Daemon - macos

In my KEXT, I listen for file closes via a vnode or file-scope listener. For certain (very few) files, I need to send the file path to my system daemon, which does some processing (this has to happen in the daemon) and returns the result to the KEXT. The file-close call must block until I get the response from the daemon; based on the result, I then perform some operation in the close call and return from it successfully. There is a lot of discussion of KEXT communication on the forums, but it is inconclusive and mostly very old (around 2002). On Windows this requirement can be handled by the FltSendMessage(...) API; I am looking for its equivalent on the Mac.
Here is what I have looked at, along with a summary of my understanding:
Mach messages: Provide a very good way of doing bidirectional communication using sender and reply ports with a queueing mechanism. However, the Mach message APIs (e.g. mach_msg, mach_port_allocate, bootstrap_look_up) don't appear to be KPIs. The Mach API mach_msg_send_from_kernel can be used, but that alone will not give bidirectional communication. Is my understanding right?
IOUserClient: This appears to be more about communicating from user space to the KEXT, with some callbacks from the KEXT. I did not find a way to initiate communication from the KEXT to the daemon and then wait for the result. Am I missing something?
Sockets: This could be the last resort, since I would have to implement the entire bidirectional communication channel from KEXT to daemon myself.
ioctl/sysctl: I don't know much about these. From what I have read, they are not a recommended option, especially for bidirectional communication.
RPC/MIG: Again, I don't know much about it. It looks complicated from what I have seen, and I am not sure it is the recommended way.
KUNCUserNotification: This appears to just provide a notification to the user from the KEXT; it does not meet my requirement.
The supported platforms are Mac OS X 10.5 onwards. Given this requirement, can someone suggest an approach and provide some pointers?
Thanks in advance.

The pattern I've used for that process is to have the user-space process initiate a socket connection to the KEXT; the KEXT creates a new thread to handle messages over that socket and sleeps the thread. When the KEXT detects an event it needs to respond to, it wakes the messaging thread and uses the existing socket to send data to the daemon. On receiving a response, control is passed back to the requesting thread to decide whether to veto the operation.
I don't know of any single resource that will describe that whole pattern completely, but the relevant KPIs are discussed in Mac OS X Internals (which seems old, but the KPIs haven't changed much since it was written) and OS X and iOS Kernel Programming (which I was a tech reviewer on).
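To make the shape of that pattern concrete, here is a minimal sketch of the KEXT side using a kernel control socket. Everything specific here is an assumption of mine rather than code from either book: the control name, the single-client/single-request structure, the omitted lock and error handling, and the missing timeout a real implementation would need.

#include <sys/param.h>         /* PSOCK */
#include <sys/systm.h>
#include <sys/kern_control.h>  /* ctl_register, ctl_enqueuedata */
#include <sys/kpi_mbuf.h>
#include <sys/proc.h>          /* msleep, wakeup */
#include <kern/locks.h>
#include <string.h>

static kern_ctl_ref g_ctl_ref;   /* returned by ctl_register() */
static u_int32_t    g_unit;      /* unit of the connected daemon */
static lck_mtx_t   *g_lock;      /* assume allocated during kext start */
static int          g_verdict;
static int          g_have_verdict;

/* The daemon connected to our kernel control socket; remember its unit. */
static errno_t my_ctl_connect(kern_ctl_ref ref, struct sockaddr_ctl *sac,
                              void **unitinfo)
{
    g_unit = sac->sc_unit;
    return 0;
}

/* The daemon sent its verdict back down over the socket. */
static errno_t my_ctl_send(kern_ctl_ref ref, u_int32_t unit, void *unitinfo,
                           mbuf_t m, int flags)
{
    int verdict = 0;
    mbuf_copydata(m, 0, sizeof(verdict), &verdict);
    mbuf_freem(m);

    lck_mtx_lock(g_lock);
    g_verdict = verdict;
    g_have_verdict = 1;
    wakeup(&g_have_verdict);     /* wake the blocked listener thread */
    lck_mtx_unlock(g_lock);
    return 0;
}

/* Called from the vnode-close listener: push the path up, block for reply. */
static int ask_daemon(const char *path)
{
    lck_mtx_lock(g_lock);
    g_have_verdict = 0;
    ctl_enqueuedata(g_ctl_ref, g_unit, (void *)path, strlen(path) + 1, 0);
    while (!g_have_verdict)      /* real code wants a timeout here */
        msleep(&g_have_verdict, g_lock, PSOCK, "verdict", NULL);
    lck_mtx_unlock(g_lock);
    return g_verdict;
}

Registration happens once at load time with ctl_register() and a kern_ctl_reg whose ctl_connect and ctl_send fields point at the callbacks above; the daemon connects with a PF_SYSTEM/SYSPROTO_CONTROL socket (see the sketch under the setsockopt() workaround answer below).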

For what it's worth, autofs uses what I assume you mean by "RPC-Mig", so it's not too complicated (MIG is used to describe the RPC calls, and the stub code it generates handles calling the appropriate Mach-message sending and receiving code; there are special options to generate kernel-mode stubs).
However, it doesn't need to do any lookups, as automountd (the user-mode daemon to which the autofs kext sends messages) has a "host special port" assigned to it. Doing the lookups to find an arbitrary service would be harder.
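For illustration, this is roughly what fetching a host special port looks like from user space. Treat it as a hedged sketch: HOST_AUTOMOUNTD_PORT is the slot reserved for automountd, so a third-party daemon cannot simply reuse it, and whether the call succeeds without privileges depends on the OS version.

#include <mach/mach.h>
#include <mach/host_special_ports.h>  /* HOST_AUTOMOUNTD_PORT, HOST_LOCAL_NODE */
#include <stdio.h>

int main(void)
{
    mach_port_t automountd = MACH_PORT_NULL;
    kern_return_t kr = host_get_special_port(mach_host_self(),
                                             HOST_LOCAL_NODE,
                                             HOST_AUTOMOUNTD_PORT,
                                             &automountd);
    if (kr != KERN_SUCCESS) {
        /* May fail without privileges, depending on the OS version. */
        fprintf(stderr, "host_get_special_port: 0x%x\n", kr);
        return 1;
    }
    /* `automountd` is now a send right to the daemon's service port. */
    printf("got port %u\n", automountd);
    return 0;
}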

If you want to use the socket established with ctl_register() on the KEXT side, then beware: communication from the kext to user space (via ctl_enqueuedata()) works fine, but the opposite direction is buggy on 10.5.x and 10.6.x.
After about 70,000 or 80,000 send() calls with SOCK_DGRAM in the PF_SYSTEM domain, the entire network stack breaks, with disastrous consequences for the whole system (a hard power-off is the only way out). This has been fixed in 10.7.0. In our project I work around it by using setsockopt() for the user-space-to-kext direction, as we only send very small data (just to allow or disallow some operation).
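A sketch of that workaround, with the option constant, the verdict format, and the control name "com.example.filecheck" invented by me: the daemon pushes its small reply down with setsockopt() instead of send(), and the kext picks it up in its ctl_setopt callback (registered in the same kern_ctl_reg as the other callbacks).

/* Shared header: made-up option id for passing a verdict to the kext. */
#define MYCTL_OPT_VERDICT 1

/* Kext side: ctl_setopt callback registered via kern_ctl_reg.ctl_setopt. */
static errno_t my_ctl_setopt(kern_ctl_ref ref, u_int32_t unit, void *unitinfo,
                             int opt, void *data, size_t len)
{
    if (opt == MYCTL_OPT_VERDICT && len == sizeof(int)) {
        int verdict = *(int *)data;
        /* ... store the verdict and wakeup() the waiting thread ... */
        (void)verdict;
        return 0;
    }
    return EINVAL;
}

/* Daemon side: connect to the control and reply via setsockopt(). */
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/ioctl.h>
#include <sys/kern_control.h>
#include <sys/sys_domain.h>
#include <string.h>
#include <unistd.h>

int send_verdict(int verdict)
{
    struct ctl_info info;
    struct sockaddr_ctl addr;
    int fd = socket(PF_SYSTEM, SOCK_DGRAM, SYSPROTO_CONTROL);
    if (fd < 0)
        return -1;

    memset(&info, 0, sizeof(info));
    strlcpy(info.ctl_name, "com.example.filecheck", sizeof(info.ctl_name));
    ioctl(fd, CTLIOCGINFO, &info);      /* resolve control name -> id */

    memset(&addr, 0, sizeof(addr));
    addr.sc_len = sizeof(addr);
    addr.sc_family = AF_SYSTEM;
    addr.ss_sysaddr = AF_SYS_CONTROL;
    addr.sc_id = info.ctl_id;
    addr.sc_unit = 0;                   /* next free unit */
    connect(fd, (struct sockaddr *)&addr, sizeof(addr));

    /* send() is the buggy path on 10.5/10.6; use setsockopt() instead. */
    int rc = setsockopt(fd, SYSPROTO_CONTROL, MYCTL_OPT_VERDICT,
                        &verdict, sizeof(verdict));
    close(fd);
    return rc;
}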

Related

Resolve Windows socket error WSAENOBUFS (10055)

Our application has a feature to actively connect to the customers' internal factory network and send a message when inspection events occur. The customer enters the IP address and port number of their machine and application into our software.
I'm using a TClientSocket in blocking mode and have provided callback functions for the OnConnect and OnError events. Assuming the abovementioned feature has been activated, when the application starts I call the following code in a separate thread:
// Attempt active connection
try
  m_socketClient.Active := True;
except
  // Swallow the exception here; OnError fires and the retry happens there
end;

// Later...
// If `OnConnect` fires and the socket is connected...send some data!
// If `OnError` fires...call `m_socketClient.Active := True;` again
When the IP + port are valid, the feature works well. But if not, after several thousand errors (and many hours or even days), Windows socket error 10055 (WSAENOBUFS) eventually occurs and the application crashes.
Various articles such as this one from ServerFramework and this one from Microsoft talk about exhausting the Windows non-paged pool and mention (1) actively managing the number of outstanding asynchronous send operations and (2) releasing the data buffers that were used for the I/O operations.
My question is how to achieve this, and it is three-fold:
A) Am I doing something wrong that causes memory to be leaked? For example, is there some missing cleanup code in the OnError handler?
B) How do you monitor I/O buffers to see if they are being exhausted? I've used Process Explorer to confirm my application is the cause of the leak, but ideally I'd need some programmatic way of measuring this.
C) Apart from restarting the application, is there a way to ask Windows to clear out or release I/O operation data buffers?
Code samples in Delphi, C/C++, or C# are fine.
A) The cause of the resource leak was a programming error. When the OnError event occurs, Socket.Close() should be called to release low-level resources associated with the socket.
B) The memory leak does not show up in the standard working-set memory use of the process. Open handles belonging to your process need to be monitored, which is possible with GetProcessHandleCount (see the sketch at the end of this answer). See this answer in Delphi, which was tested and works well. This answer in C++ was not tested, but it is the accepted answer, so it should work. Of course, you should be able to use GetProcessHandleCount directly in C++.
C) After much research, I must conclude that just like a normal memory leak, you cannot just ask Windows to "clean up" after you! The handle resource has been leaked by your application and you must find and fix the cause (see A and B above).
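To illustrate point B, here is a minimal C sketch of polling the handle count with GetProcessHandleCount; the polling interval and any alerting threshold are up to you.

#include <windows.h>
#include <stdio.h>

int main(void)
{
    DWORD handles = 0;
    if (GetProcessHandleCount(GetCurrentProcess(), &handles))
        printf("open handles: %lu\n", (unsigned long)handles);
    else
        printf("GetProcessHandleCount failed: %lu\n",
               (unsigned long)GetLastError());
    /* A count that grows steadily across connect/error cycles indicates
       the socket handles are not being closed. */
    return 0;
}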

Sharing Mach ports with child processes

I am doing a comparison of different IPC mechanisms available on Mac OS X (pipes, sockets, System V IPC, etc.), and I would like to see how Mach ports compare to the higher-level alternatives. However, I've run into a very basic issue: getting send rights to ports across processes (specifically, across a parent process and a child process).
Unlike file descriptors, ports are generally not carried over to forked processes. This means that some other way to transfer them must be established. Just about the only relevant page I could find about this was this one, and they state in an update that their method no longer works and never was guaranteed to, even though it was suggested by an Apple engineer in 2009. (It involved replacing the bootstrap port, and doing that now breaks XPC.) The replacement they suggest uses deprecated functions, so that's not a very appealing solution.
Besides, one thing I liked about the old solution is that ports remained pretty much private between the processes that used them. There was no need to broadcast the existence of the port, just as pipes (from the pipe call) work once forked. (I'll probably live with it if there's another solution, but it's a little annoying.)
So, how do you pass a send right to a Mach port from a parent process to a child process?
bootstrap_register is deprecated, but bootstrap_check_in isn't, and it can be used to register your port, which can later be retrieved by the child process using bootstrap_look_up. (This still doesn't provide the privacy you're looking for, unfortunately.)
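A hedged sketch of that suggestion follows; the service name is invented, and whether an ad-hoc bootstrap_check_in succeeds depends on the OS version and how the process was launched.

#include <mach/mach.h>
#include <servers/bootstrap.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    mach_port_t service = MACH_PORT_NULL;
    kern_return_t kr = bootstrap_check_in(bootstrap_port,
                                          "com.example.parent-port",
                                          &service);
    if (kr != KERN_SUCCESS) {
        fprintf(stderr, "bootstrap_check_in: %s\n", bootstrap_strerror(kr));
        return 1;
    }

    if (fork() == 0) {
        /* Child: obtain a send right to the same port by name. */
        mach_port_t send_right = MACH_PORT_NULL;
        kr = bootstrap_look_up(bootstrap_port, "com.example.parent-port",
                               &send_right);
        /* ... the child can now mach_msg() to the parent ... */
        _exit(kr == KERN_SUCCESS ? 0 : 1);
    }

    /* ... the parent receives on `service` ... */
    return 0;
}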
The recommended solution is not to use Mach IPC directly at all, but to implement your child process as an XPC service; in that case you can use the XPC API, which uses Mach IPC behind the scenes, yet you don't have to deal with any of the details. You get an easy API to send XPC messages in the parent and an easy API to receive XPC messages in the child, which can also pass back replies easily. The system handles all the hard parts for you.
https://developer.apple.com/library/mac/documentation/MacOSX/Conceptual/BPSystemStartup/Chapters/CreatingXPCServices.html
If you cannot use the XPC API, keep in mind that when you register your service with bootstrap_check_in() (which is not deprecated), it won't be private, but if you do so in a user-space process it will be private to your login session: root processes won't see it, and neither will processes of other users. If you do it in a root process, though, it will be visible to all sessions.
Note, however, that you can control who may send you IPC messages and who may not. You can request a mach_msg_audit_trailer_t when receiving a Mach message; that gives you access to the audit_token_t of the sender, and using audit_token_to_pid() you can get the sender's pid_t. Since you know the PID of your child, you can simply ignore all messages not sent by your child process (passing them to mach_msg_destroy() to avoid leaking resources). So you cannot prevent your port from being discoverable, but you can prevent any process other than your child from using it (sketched below).
And last but not least, you can just give your port a random name; after all, only your child process needs to know it, so you can dynamically generate a name in the parent process and pass it along to the child. That way the port can still be seen if software scans for ports, but most software just uses hardcoded names anyway.
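A sketch of that PID filtering, assuming a fixed message layout of my own and omitting most error handling; audit_token_to_pid() comes from bsm/libbsm.h (link with -lbsm).

#include <mach/mach.h>
#include <bsm/libbsm.h>   /* audit_token_to_pid */

typedef struct {
    mach_msg_header_t header;
    char              body[256];
    mach_msg_audit_trailer_t trailer;  /* space for the kernel's trailer */
} msg_buf_t;

kern_return_t receive_from_child(mach_port_t rx_port, pid_t child_pid)
{
    msg_buf_t msg;
    kern_return_t kr = mach_msg(&msg.header,
            MACH_RCV_MSG |
            MACH_RCV_TRAILER_TYPE(MACH_MSG_TRAILER_FORMAT_0) |
            MACH_RCV_TRAILER_ELEMENTS(MACH_RCV_TRAILER_AUDIT),
            0, sizeof(msg), rx_port, MACH_MSG_TIMEOUT_NONE, MACH_PORT_NULL);
    if (kr != KERN_SUCCESS)
        return kr;

    /* The kernel appends the trailer just past the (padded) message body. */
    mach_msg_audit_trailer_t *tr = (mach_msg_audit_trailer_t *)
        ((char *)&msg + round_msg(msg.header.msgh_size));

    if (audit_token_to_pid(tr->msgh_audit) != child_pid) {
        mach_msg_destroy(&msg.header);  /* ignore strangers, free resources */
        return KERN_PROTECTION_FAILURE;
    }
    /* ... process the message; it really came from the child ... */
    return KERN_SUCCESS;
}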
One thing you might try (although it's a gross hack) is hijacking the exception ports as an inheritance mechanism: set a custom port as an exception port in the parent, fork the child, have the child retrieve the custom port from its exception port and send its task port to the parent; then the parent resets its own exception port and the child's, and the two proceed from there with a communication channel. See task_set_exception_ports().
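A hedged sketch of that hack, with all error handling and the save/restore of the original handlers omitted; the EXC_SYSCALL slot is chosen arbitrarily here, and the whole thing relies on exception ports being inherited across fork().

#include <mach/mach.h>
#include <unistd.h>

int main(void)
{
    mach_port_t channel = MACH_PORT_NULL;
    mach_port_allocate(mach_task_self(), MACH_PORT_RIGHT_RECEIVE, &channel);
    mach_port_insert_right(mach_task_self(), channel, channel,
                           MACH_MSG_TYPE_MAKE_SEND);

    /* Park a send right in one of the parent's exception-port slots. */
    task_set_exception_ports(mach_task_self(), EXC_MASK_SYSCALL, channel,
                             EXCEPTION_DEFAULT, THREAD_STATE_NONE);

    if (fork() == 0) {
        /* Child: fish the inherited port back out of the same slot. */
        exception_mask_t       masks[EXC_TYPES_COUNT];
        mach_port_t            ports[EXC_TYPES_COUNT];
        exception_behavior_t   behaviors[EXC_TYPES_COUNT];
        thread_state_flavor_t  flavors[EXC_TYPES_COUNT];
        mach_msg_type_number_t count = EXC_TYPES_COUNT;
        task_get_exception_ports(mach_task_self(), EXC_MASK_SYSCALL,
                                 masks, &count, ports, behaviors, flavors);
        mach_port_t inherited = ports[0];
        /* ... send the parent our task port over `inherited`, then both
           sides reset their exception ports and talk over the channel ... */
        (void)inherited;
        _exit(0);
    }
    /* Parent: restore the original exception port, receive on `channel`. */
    return 0;
}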

Multi-client RPC routing daemon

I've spent a few weeks looking for a daemon that is capable of the following:
Handling arbitrary numbers of clients from other network hosts.
Allowing clients to register services under an identity.
Allowing other clients to make RPC calls to these registered services, with reliable error handling (if client registered for that service detaches, the daemon should report an error to the caller, not leave it hanging).
Allowing clients to register to receive events, which other clients can broadcast.
The simpler, the better.
The closest thing I've found in the F/LOSS world is D-Bus, which meets all of these criteria except the first: it cannot reliably work with remote connections. I have spent time looking into other options (ESB and MQ daemons -- spending the most time with 0MQ and RabbitMQ) and they all fall short in one way or another: ESBs tend to be extremely complicated with a high learning curve, and MQ daemons tend to provide half of the solution (routing) while leaving the other half (error recovery) extremely complicated, if not impossible.
I'm not looking for any fantastic routing capabilities beyond one client saying "I provide service 'foo'" and another client saying "I want to invoke method 'bar' on service 'foo'."
It seems like something like this would exist already, and I hesitate to roll my own.
The use case is that I will have many processes across many hosts. Each host will have a governor process/service that will provide control of the lifetimes of these processes from a control panel. The processes themselves will also be services, allowing for direct inquiry about status as well as reconfiguration requests from this panel. The trick is that processes will come and go, and so static configuration of endpoints isn't something I really want to bother with; I would rather have each process be responsible for telling a daemon "I'm service so-and-so" and then let the daemon do the inter-client routing.
The only piece I'm really missing to develop this system is something that can route RPC requests between all of these processes. Is there any such daemon out there, or is there some other model that would better fit my needs?

Two-way communication between kernel-mode driver and user-mode application?

I need two-way communication between a kernel-mode WFP driver and a user-mode application. The driver initiates the communication by passing a URL to the application, which then categorizes that URL (Entertainment, News, Adult, etc.) and passes the category back to the driver. The driver needs to know the category in its filter function because it may block certain web pages based on that information. I had a thread in the application making an I/O request that the driver would complete with the URL and a GUID; the application would then write the category into the registry under that GUID, where the driver would pick it up. Unfortunately, as Driver Verifier pointed out, this is unstable because the Zw registry functions have to run at PASSIVE_LEVEL. I was thinking about trying the same thing with mapped memory buffers, but I'm not sure what the IRQL requirements are for that. I also thought about lowering the IRQL before the registry calls, but I don't know what the side effects of that would be.
You just need to have two different kinds of I/O request.
If you're using DeviceIoControl to retrieve the URLs (I think this would be the most suitable method), this is as simple as adding a second I/O control code (a sketch follows at the end of this answer).
If you're using ReadFile or equivalent, things would normally get a bit messier, but as it happens in this specific case you only have two kinds of operations, one of which is a read (driver->application) and the other of which is a write (application->driver). So you could just use WriteFile to send the reply, including of course the GUID so that the driver can match up your reply to the right query.
Another approach (more similar to your original one) would be to use a shared memory buffer. See this answer for more details. The problem with that idea is that you would either need to use a spinlock (at the cost of system performance and power consumption, and of course not being able to work on a single-core system) or to poll (which is both inefficient and not really suitable for time-sensitive operations).
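For concreteness, here is a user-mode sketch of the two-IOCTL scheme; the device name, control codes, and structures are all invented for illustration, and the driver side (pending the get-work IRP until a URL arrives) is not shown.

#include <windows.h>
#include <winioctl.h>
#include <stdio.h>

#define IOCTL_GET_URL      CTL_CODE(FILE_DEVICE_UNKNOWN, 0x800, \
                                    METHOD_BUFFERED, FILE_ANY_ACCESS)
#define IOCTL_SET_CATEGORY CTL_CODE(FILE_DEVICE_UNKNOWN, 0x801, \
                                    METHOD_BUFFERED, FILE_ANY_ACCESS)

typedef struct {      /* hypothetical wire format shared with the driver */
    GUID id;          /* matches the reply to the query */
    char url[1024];
} URL_REQUEST;

typedef struct {
    GUID  id;
    ULONG category;   /* e.g. 0 = allow, 1 = adult, 2 = news, ... */
} URL_REPLY;

static ULONG categorize(const char *url) { (void)url; return 0; /* stub */ }

int main(void)
{
    HANDLE dev = CreateFileW(L"\\\\.\\MyWfpFilter",
                             GENERIC_READ | GENERIC_WRITE, 0, NULL,
                             OPEN_EXISTING, 0, NULL);
    if (dev == INVALID_HANDLE_VALUE)
        return 1;

    for (;;) {
        URL_REQUEST req;
        URL_REPLY   rep;
        DWORD       got = 0;
        /* Blocks until the driver completes the pended IRP with a URL. */
        if (!DeviceIoControl(dev, IOCTL_GET_URL, NULL, 0,
                             &req, sizeof(req), &got, NULL))
            break;
        rep.id = req.id;
        rep.category = categorize(req.url);
        DeviceIoControl(dev, IOCTL_SET_CATEGORY, &rep, sizeof(rep),
                        NULL, 0, &got, NULL);
    }
    CloseHandle(dev);
    return 0;
}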
There is nothing unstable about PASSIVE_LEVEL. Registry access must happen at PASSIVE_LEVEL, so it isn't possible directly if the driver is running at a higher IRQL; you can do it by offloading the work to a work item, though (see the sketch after this answer). Lowering the IRQL manually is usually not recommended, as it contradicts the OS's intentions.
Your protocol indeed sounds somewhat cumbersome and doing a direct app-driver communication is probably preferable. You can find useful information about this here: http://msdn.microsoft.com/en-us/library/windows/hardware/ff554436(v=vs.85).aspx
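A sketch of that work-item offload (names and the context structure are mine; assume the device object and the context allocation happen elsewhere in the driver):

#include <ntddk.h>

typedef struct _LOOKUP_CTX {
    PIO_WORKITEM   WorkItem;
    UNICODE_STRING Url;        /* captured at DISPATCH_LEVEL */
} LOOKUP_CTX, *PLOOKUP_CTX;

/* Runs at PASSIVE_LEVEL, where ZwOpenKey/ZwQueryValueKey are legal. */
static VOID LookupWorker(PDEVICE_OBJECT DeviceObject, PVOID Context)
{
    PLOOKUP_CTX ctx = (PLOOKUP_CTX)Context;
    UNREFERENCED_PARAMETER(DeviceObject);
    /* ... do the registry work here ... */
    IoFreeWorkItem(ctx->WorkItem);
    ExFreePoolWithTag(ctx, 'LkUp');
}

/* Called from the WFP callout at DISPATCH_LEVEL. */
VOID QueueLookup(PDEVICE_OBJECT DeviceObject, PLOOKUP_CTX ctx)
{
    ctx->WorkItem = IoAllocateWorkItem(DeviceObject);
    if (ctx->WorkItem != NULL)
        IoQueueWorkItem(ctx->WorkItem, LookupWorker,
                        DelayedWorkQueue, ctx);
}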
Since the callouts run at DISPATCH_LEVEL, your processing has to be done in a worker thread (not a DPC, which also runs at DISPATCH_LEVEL), which will allow you to use the ZwXxx functions. You should look into inverted callbacks for communication purposes; there's a good document on OSR.
I've just started poking around WFP, but it looks like even in the samples they provide, Microsoft reinjects the packets. I haven't looked into it that closely, but it seems they drop the packet and re-inject it once it has been processed; that would be enough time for your user-mode engine to make the decision. You should also limit the packet capture to a specific port (80 in your case) so that you don't do extra processing you don't need.

Many-to-many messaging on local machine without broker

I'm looking for a mechanism to use to create a simple many-to-many messaging system to allow Windows applications to communicate on a single machine but across sessions and desktops.
I have the following hard requirements:
Must work across all Windows sessions on a single machine.
Must work on Windows XP and later.
No global configuration required.
No central coordinator/broker/server.
Must not require elevated privileges from the applications.
I do not require guaranteed delivery of messages.
I have looked at many, many options. This is my last-ditch request for ideas.
The following have been rejected for violating one or more of the above requirements:
ZeroMQ: In order to do many-to-many messaging a central broker is required.
Named pipes: Requires a central server to receive messages and forward them on.
Multicast sockets: Requires a properly configured network card with a valid IP address, i.e. a global configuration.
Shared Memory Queue: To create shared memory in the global namespace requires elevated privileges.
Multicast sockets so very nearly work. What else can anyone suggest? I'd consider anything from pre-packaged libraries to bare-metal Windows API functionality.
(Edit 27 September) A bit more context:
By 'central coordinator/broker/server', I mean a separate process that must be running at the time that an application tries to send a message. The problem I see with this is that it is impossible to guarantee that this process really will be running when it is needed. Typically a Windows service would be used, but there is no way to guarantee that a particular service will always be started before any user has logged in, or to guarantee that it has not been stopped for some reason. Run on demand introduces a delay when the first message is sent while the service starts, and raises issues with privileges.
Multicast sockets nearly worked because they avoid the need for a central coordinator process entirely and do not require elevated privileges from the applications sending or receiving multicast packets. But you have to have a configured IP address - you can't do multicast on the loopback interface (even though multicast with TTL=0 on a configured NIC behaves as one would expect of loopback multicast) - and that is the deal-breaker.
Maybe I am completely misunderstanding the problem, especially the "no central broker" requirement, but have you considered something based on tuple spaces?
--
After the exchange in the comments, please consider the following as my "definitive" answer:
Use a file-based solution, and host the directory tree on a RAM disk to ensure good performance.
I'd also suggest having a look at the following StackOverflow discussion (even though it's Java-based) for possible pointers on how to manage locking and transactions on the filesystem.
This one (.NET based) may be of help, too.
How about UDP broadcasting?
Couldn't you use a localhost socket?
/Tony
In the end I decided that one of the hard requirements had to go, as the problem could not be solved in any reasonable way as originally stated.
My final solution is a Windows service running a named pipe server. Any application or service can connect to an instance of the pipe and send messages. Any message received by the server is echoed to all pipe instances (a sketch follows at the end of this answer).
I really liked p.marino's answer, but in the end it looked like a lot of complexity for what is really a very basic piece of functionality.
The other possibility that appealed to me, though again it fell at the complexity hurdle, was to write a kernel driver to manage the multicasting. Several mechanisms would have been possible in this case, but the overhead of writing a bug-free kernel driver was just too high.
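A minimal sketch of that echo-server design in C, with an invented pipe name; a production version needs a security descriptor that permits cross-session access, overlapped I/O, and message framing, none of which are shown here.

#include <windows.h>
#include <stdio.h>

#define PIPE_NAME   L"\\\\.\\pipe\\ExampleMessageBus"
#define MAX_CLIENTS 64

static HANDLE g_clients[MAX_CLIENTS];
static CRITICAL_SECTION g_lock;

/* Echo a received message to every connected pipe instance. */
static void broadcast(const char *msg, DWORD len)
{
    EnterCriticalSection(&g_lock);
    for (int i = 0; i < MAX_CLIENTS; i++) {
        DWORD written;
        if (g_clients[i] != NULL)
            WriteFile(g_clients[i], msg, len, &written, NULL);
    }
    LeaveCriticalSection(&g_lock);
}

static DWORD WINAPI client_thread(LPVOID param)
{
    HANDLE pipe = (HANDLE)param;
    char buf[512];
    DWORD got;
    while (ReadFile(pipe, buf, sizeof(buf), &got, NULL))
        broadcast(buf, got);
    /* Client disconnected: drop it from the table. */
    EnterCriticalSection(&g_lock);
    for (int i = 0; i < MAX_CLIENTS; i++)
        if (g_clients[i] == pipe)
            g_clients[i] = NULL;
    LeaveCriticalSection(&g_lock);
    CloseHandle(pipe);
    return 0;
}

int main(void)
{
    InitializeCriticalSection(&g_lock);
    for (;;) {
        HANDLE pipe = CreateNamedPipeW(PIPE_NAME, PIPE_ACCESS_DUPLEX,
                PIPE_TYPE_MESSAGE | PIPE_READMODE_MESSAGE | PIPE_WAIT,
                PIPE_UNLIMITED_INSTANCES, 512, 512, 0, NULL);
        if (pipe == INVALID_HANDLE_VALUE)
            return 1;
        if (ConnectNamedPipe(pipe, NULL) ||
            GetLastError() == ERROR_PIPE_CONNECTED) {
            EnterCriticalSection(&g_lock);
            for (int i = 0; i < MAX_CLIENTS; i++)
                if (g_clients[i] == NULL) { g_clients[i] = pipe; break; }
            LeaveCriticalSection(&g_lock);
            HANDLE t = CreateThread(NULL, 0, client_thread, pipe, 0, NULL);
            if (t != NULL)
                CloseHandle(t);
        } else {
            CloseHandle(pipe);
        }
    }
}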
