What I'm ultimately trying to do is write a daemon-type process that monitors the start/death of another process I'm interested in watching.
I know GCD has the proc dispatch source type, but you need a PID for that, and I wouldn't know that info from the monitoring process.
So are there any OSX APIs that let you know what processes have been started / stopped? How do programs like Activity Monitor, or 'top' do it?
It seems pretty clear, from their behavior, that Activity Monitor and top are polling rather than responding to asynchronous events. For instance, you can easily contrive a situation in which a short-lived process never appears in top or Activity Monitor. It stands to reason that if there were an event-driven mechanism, the system tools would use it (at least Activity Monitor would, since it is OSX-specific; top might be too BSD-general).
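To make the polling idea concrete, here is a minimal sketch of what a snapshot-and-diff monitor might look like. It is not how Activity Monitor or top are actually implemented; `list_pids` is a hypothetical callable that you would have to back with a real process-listing call on your platform, and `diff_snapshots` and `watch` are names made up for this illustration.

```python
import time
from typing import Callable, Iterable, List, Set, Tuple


def diff_snapshots(previous: Set[int], current: Set[int]) -> Tuple[Set[int], Set[int]]:
    """Return (started, exited) PIDs between two process-list snapshots."""
    return current - previous, previous - current


def watch(list_pids: Callable[[], Iterable[int]],
          rounds: int,
          interval: float = 1.0) -> List[Tuple[str, int]]:
    """Poll `list_pids` (hypothetical, supplied by the caller) `rounds` times
    and collect ("start", pid) / ("exit", pid) events seen between polls."""
    events: List[Tuple[str, int]] = []
    previous = set(list_pids())
    for _ in range(rounds):
        time.sleep(interval)
        current = set(list_pids())
        started, exited = diff_snapshots(previous, current)
        events.extend(("start", pid) for pid in sorted(started))
        events.extend(("exit", pid) for pid in sorted(exited))
        previous = current
    return events
```

A process that starts and exits between two calls to `list_pids` appears in neither snapshot, which is exactly the short-lived-process blind spot described above.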
dtrace has hooks that are triggered for exec events, and fs_usage also has the capability to log exec/spawn events, but both of these require root privileges, and therefore likely have access to facilities that top and Activity Monitor don't.
If running in user space is a requirement, you might consider a user space app that communicates with a kext or something like that.
I'm doing some research into the way launchd loads its services from plist files under /Library/LaunchDaemons/ or via the command launchctl load.
So far I've gathered information from various sources and pieced together the following vague picture, as I understand it:
Upon service loading (launchctl load), the launchctl process sends launchd an appropriate XPC message, and launchd then forks into a new process with the context of xpcproxy.
This generic process waits for another XPC call from launchd telling it to run its real process context according to the LaunchDaemon plist.
Does this explanation sound right? Perhaps somebody can help me make it more accurate?
thanks
This is actually a bit more complicated. The kernel is composed of two parts, BSD and the mach kernel; the latter being responsible for management of memory and process scheduling.
Each mach process has one or more mach tasks (really task port rights!). When an application is first launched, it has just one right, the bootstrap port, allowing communication with launchd. Note that a task port right is uni-directional, so a launching process that has the right to communicate with launchd must give a right for launchd to communicate back to it.
When an XPC message is received by launchd, it depends upon the Launch Daemon as to what action it takes. It's possible that the message is for a service that runs with a network port that may or may not be running. If running, it forwards any arguments from the calling process to the running service. If not running, it can provide the service on demand by launching the process first.
More specifically, you asked about launchctl load. Since launchd is no longer open source, the next best resource is the reverse engineering work by Jonathan Levin, author of Mac OS X and iOS Internals and, more recently, his self-published books on *OS Internals.
You'll find his slides about launchd here, but probably more useful to you is his version of launchctl, jlaunchctl, which is open source.
Finally, if you want to view content of XPC messages between processes, disable SIP and use Jonathan's invaluable XPoCe tool.
I am in the process of designing a crash handler solution for one of our applications that creates a crash dump file using the MiniDumpWriteDump() function. While reading up on the topic I have seen the recommendations to invoke MiniDumpWriteDump() from an external process to maximize the chance that the dump file contains the correct information. The common solution seems to be to run a watchdog process in parallel to the application process. When the application crashes it somehow contacts the watchdog process, providing it with the information that is required to create the crash dump. Then the application goes to sleep until it is terminated by the watchdog process.
I can imagine such a watchdog process being run continually as a background service. This has many implications, starting with "who creates the service?", but also "which user does the service run as?", and "how does the application contact the service?" etc. It seems a pretty heavy-weight solution which I don't feel is appropriate for the scope of my task.
A simpler approach is suggested by this SO answer: Launch a guard process on application startup that is tightly coupled to the application process. This is pretty good, but it still leaves me with the tasks of 1) keeping somewhere in the application the information on how to contact the guard process in case of a crash; and 2) making sure to terminate the guard process if the application process shuts down normally.
The simplest solution of all would be to launch the crash dump handler process at the time the crash occurs, passing all the information that is required to create the crash dump as arguments to the process. This information consists of:
The process ID of the application process that crashed
The thread ID of the thread that crashed
The address of the EXCEPTION_POINTERS structure that describes the exception that caused the crash
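As a rough illustration of how these three values could travel on a command line, here is a small sketch. The `CrashInfo` type, the `to_argv` and `from_argv` helpers, and the handler name in the usage note are all made up for this example; the real application and handler would be native code. Note that the address only makes sense inside the crashed process, so the handler would have to treat it as a pointer into that process (the ClientPointers member of MINIDUMP_EXCEPTION_INFORMATION exists for that purpose).

```python
from typing import List, NamedTuple


class CrashInfo(NamedTuple):
    pid: int                 # process ID of the crashed application
    thread_id: int           # ID of the thread that raised the exception
    exception_pointers: int  # address of EXCEPTION_POINTERS, valid only in the crashed process


def to_argv(handler: str, info: CrashInfo) -> List[str]:
    """Build the argument vector for a (hypothetical) dump handler executable."""
    return [handler, str(info.pid), str(info.thread_id),
            f"0x{info.exception_pointers:x}"]


def from_argv(argv: List[str]) -> CrashInfo:
    """Parse the same three values back on the handler side."""
    if len(argv) != 4:
        raise ValueError("expected: handler <pid> <thread-id> <exception-pointers>")
    return CrashInfo(int(argv[1]), int(argv[2]), int(argv[3], 16))
```

For example, `to_argv("dumphandler.exe", CrashInfo(4242, 17, 0x7FFD1234))` produces an argument list that `from_argv` turns back into the same `CrashInfo`.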
This "fire and forget" approach is compelling because it does not require any state retention, nor any complicated over-time process management. In fact, the approach seems so overwhelmingly simple that I cannot help but feel that I am overlooking something.
What are the arguments against such an approach?
The main argument against the "fire and forget" approach, as I called it, is that it is not safe to launch a new process at a time when the application is already in a state where it is about to crash.
Because of that I went for the "guard process" approach. It brings a number of challenges with it, for which Hans Passant has outlined a solution.
I also added a bit of code in this answer that should help with deep-copying the all-important EXCEPTION_POINTERS data structure.
Using WER, as proposed in the comments, also looks like a good alternative to writing your own guard process. I must admit I have not investigated this any further, though.
When an application process launches an XPC helper process, it doesn't actually do the fork()/exec() itself in the classic UNIX style. Instead, it sends a message to launchd, which does the dirty work for it. Thus, if you query the parent process on the XPC process, it comes back as the launchd process.
However, if you open Activity Monitor in the hierarchical process view, the XPC helper processes are all shown below the original application that requested them.
In the software I'm working on, knowing this relationship between processes would be extremely useful. So far we've been using the regular BSD parent process information, but as everything moves towards XPC, this isn't much use anymore.
So:
Where is the "original" parent process information stored for XPC processes?
How does Activity Monitor access it?
There is a kext involved, so I'd be happy to pull this information straight out in the kernel instead of userspace, but I can't seem to even figure out where it's stored.
Update: Discussion on Apple's darwin-kernel mailing list: http://lists.apple.com/archives/darwin-kernel/2015/Mar/msg00001.html
I imagine that launchd knows what you are looking for.
The Service Management framework has a function that might give you what you are looking for easily:
CFDictionaryRef SMJobCopyDictionary(CFStringRef domain, CFStringRef jobLabel);
We have a system where there are typically two processes running on the same system. One process handles the GUI and the other runs like a service (although for historical reasons, it's not a service, just an exe with no visible window).
The two processes undertake IPC mainly via registered messages asynchronously - i.e. we use RegisterWindowMessage() in both processes to define a largish set of messages that effectively form the API to the server process.
I have written a "hands-free" monitoring application that uses SetWindowsHookEx() to monitor and display the message queues of both processes and provide some level of decoding of the way the API is being utilised and how notifications are being propagated to the GUI process (each individual window can subscribe to notifications from the server directly).
There are a large number of messages in both directions, so I have filtering, summary counts etc. that let me focus on particular activity. All this can be done without affecting the live code, which is good.
This all works well, but it now would be very useful to be able to "tag" a message originating in the GUI so I can trace the same message when it's processed by the server. This would be enormously useful for debugging and diagnosing system issues, but I can't find a clean way (actually I can't find any way!) of doing this without adding such support to our registered message API, which would be a lot of work and involves more risk than I'm comfortable with at the moment. It gets further complicated by the fact that the server pre-processes some messages and then does a PostMessage() back to itself to perform the action, so the originating message can get "lost".
Has anyone here tackled this type of problem? If so, can you give me some pointers? If not, then are there any documented or undocumented ways of adding a small block of data to a Windows message and retrieving it later? I've looked at SetMessageExtraInfo() but that seems to be per-queue rather than per-message.
FindWindow or FindWindowEx will give you the details of the GUI window. Compare those details with the intercepted message.
I would like to create an application that will monitor apps running on my machine, and respond to situations where an application has beachballed. Is it possible (using any of the various OSX programming tools -- I'll teach myself Objective-C for this) to detect whether this has happened? If so, can someone give me a short code sample that does so?
I'm afraid I don't know the actual classes or functions involved, but I can give you an outline of the process.
First, understand that every Application (perhaps every Window) has an event queue backing it. Each is serviced by a thread that just pops an event* off the queue, does some processing, and then returns to waiting for the next event. A "beachball" comes up (when forced by the system) when the event queue isn't getting serviced quickly enough. A "frozen" event queue implies that an application locked up when responding to some event in the past.
Now - outside of debugging contexts - you shouldn't be able to reach into another application and fiddle with a thread's event queue to see if it's getting serviced. But what you could do instead is periodically post an event that would elicit a response, and if ever that response doesn't come within some timeout you know the application is "locked up".
This constitutes polling, so be wary of the performance implications.
*Events are things like key down, key up, mouse moved, repaint, and so on.
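The post-an-event-and-wait idea can be sketched independently of any OSX API. Below, a plain thread with a queue stands in for an application's event loop; `event_loop`, `is_responsive`, and the "ping"/"block"/"quit" event kinds are all invented for this illustration, and a real monitor would need some actual channel into the target application.

```python
import queue
import threading


def is_responsive(inbox: queue.Queue, timeout: float) -> bool:
    """Post a ping event and wait up to `timeout` seconds for the reply."""
    reply: queue.Queue = queue.Queue(maxsize=1)
    inbox.put(("ping", reply))
    try:
        reply.get(timeout=timeout)
        return True
    except queue.Empty:
        return False


def event_loop(inbox: queue.Queue) -> None:
    """Stand-in for an application's event loop: pop one event, handle it, repeat."""
    while True:
        kind, payload = inbox.get()
        if kind == "ping":
            payload.put("pong")   # answer the monitor
        elif kind == "block":
            payload.wait()        # simulate a handler that hangs until released
        elif kind == "quit":
            return
```

While the loop is stuck in a "block" handler, the ping sits unanswered in the queue and `is_responsive` returns False once the timeout expires; as soon as the handler returns, later pings are answered again. The timeout is the tuning knob: too short and a busy but healthy application gets flagged, too long and a hang is noticed late.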
Besides the WindowServer itself, the other system components that I know of that can detect unresponsiveness are the force quit dialog, spindump (which collects sampling profiles of applications while they are unresponsive), and Activity Monitor (presumably via its pmTool privileged subprocess). Perhaps running strings on pmTool might provide hints about what system calls to use?
Note that none of these evidently does its job by polling, because no application is ever detected as unresponsive until it fails to respond to an event — if an application hangs/does a lot of computation without checking its event queue, but it receives no events during that time, then it is not reported as unresponsive.