FSEvents treatment of disk changed by other OS - macos

I'm seeing strange behavior on FSEvents where I mount my drive in recovery mode and on reboot get zero fsevents in my stream.
I do the following:
1. Boot regularly
2. Record current event with FSEventsGetCurrentEventId()
3. Boot in recovery mode and modify a file in the watched path
4. Restart the system
When this happens, I get no events at all when I use the FSEvents API. The only flag it sends is the kFSEventStreamEventFlagHistoryDone sentinel, even if I had made other changes in the regular OS.
This ars technica review seems to imply that when you mount on some other device you should get the kFSEventStreamEventFlagMustScanSubDirs flag, but I'm not seeing that behavior. Has anybody encountered this before? Is there a better way of detecting and dealing with the case that the drive has been mounted somewhere else while the OS was off?
Update: I tried the same thing booting from Linux and modifying the file system. I did not get the same strange behavior of 0 events no matter what, but I also didn't get an event from the directory I changed or a MustScanSubDirs flag.
Update 2: In this thread, the accepted response says that Time Machine detects that the logs are out of date in the above situations. Does anybody know how to detect that the logs are out of date? That check could be used instead of a flag.

I think you need to also store the UUID of the FSEvents database in step #2, and check for it in step #4.
This behavior is vaguely mentioned in Apple's documentation (emphasis added):
Note: Because disks can be modified by computers running earlier versions of OS X (or potentially other operating systems), you should treat the events list as advisory rather than a definitive list of all changes to the volume. If a disk is modified by a computer running a previous version of OS X, the historical log is discarded.
For example, backup software should still periodically perform a full sweep of any volume to ensure that no changes fall through the cracks.
Note the bit about the historical log being discarded, then look at the reference (emphasis added):
FSEventStreamGetLatestEventId() -> Initially, this returns the
sinceWhen value supplied when the stream was created; thereafter, it
is updated with the highest-numbered event ID mentioned in the current
batch of events just before invoking the client's callback. Clients
can store this value persistently as long as they also store the UUID
for the device (obtained via FSEventsCopyUUIDForDevice()). Clients can
then later supply this event ID as the sinceWhen parameter to
FSEventStreamCreateRelativeToDevice(), as long as its UUID matches
what you stored. This works because the FSEvents service stores events
in a persistent, per-volume database. In this regard, the stream of
event IDs acts like a global, system-wide clock, but bears no relation
to any particular timebase.
FSEventsCopyUUIDForDevice() -> Gets a UUID that uniquely identifies
the FSEvents database for that volume. If the database gets discarded
then its replacement will have a different UUID so that clients will
be able to detect this situation and avoid trying to use event IDs
that they stored as the sinceWhen parameter to the
FSEventStreamCreate...() functions.
Note that the UUID is per-device, so if you have any filesystems mounted inside your directory tree, you'll probably need to get the UUID of each of them.
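To make the store-and-compare step concrete, here is a minimal sketch of the bookkeeping in Python. It does not call FSEvents itself: the UUID and event ID are assumed to come from FSEventsCopyUUIDForDevice() and FSEventStreamGetLatestEventId(), and save_checkpoint, resume_event_id and the JSON file layout are hypothetical names made up for illustration.

```python
import json
from pathlib import Path
from typing import Optional


def save_checkpoint(path: Path, device_uuid: str, event_id: int) -> None:
    """Persist the event ID together with the UUID of the FSEvents database."""
    path.write_text(json.dumps({"uuid": device_uuid, "event_id": event_id}))


def resume_event_id(path: Path, current_uuid: Optional[str]) -> Optional[int]:
    """Return the stored event ID if it can still be trusted, else None.

    None means nothing was stored, no UUID is available now, or the UUID
    differs from the stored one (the database was discarded and replaced).
    In all of these cases the caller should rescan instead of resuming.
    """
    if current_uuid is None or not path.exists():
        return None
    saved = json.loads(path.read_text())
    if saved.get("uuid") != current_uuid:
        return None
    return int(saved["event_id"])
```

If resume_event_id() returns None, treat the stored event ID as stale and fall back to a full scan of the watched tree. With several volumes under the watched path, you would keep one checkpoint per device.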
Good luck!

Related

How to pass user setting to Driver Extension (MacOS)?

I am writing a DriverKit extension whose goal is to block some categories of USB devices, such as flash drives. The driver should block (match to) any device of the relevant device classes, except those that are whitelisted (based on their vendor and product ID). The whitelist can be set dynamically by the user of the application.
The question is how to pass this data to the driver, as reading from a file or something like the Windows registry is not available in DriverKit. The tricky part is that the driver requires the whitelist data before the device is matched.
From what I understand, a device can be rejected by returning an error from the Start() method and returning from it prematurely. I had the idea of sending the data while the driver is running this function, but this is not possible because communication via IOUserClient is not available until the Start() method returns.
Is this somehow doable?
As far as I'm aware, communicating with user space apps from the initial Start() method is not possible from DriverKit extensions. As you say, IOUserClients are the mechanism to use for user space communication, and those aren't available until the service is started and registered. You can have your driver match IOResources/IOUserResources so it is always loaded, but each matched service starts up an independent process of your dext, and I'm not aware of a way to directly communicate between these instances.
If I understand you correctly, you're trying to block other drivers from acquiring the device. I don't think the solution you have in mind will help you with this. If you return success from Start(), your dext will drive the device. If you return failure, no driver is loaded for the device, because matching has already concluded. So other drivers would never get a chance anyway, regardless of whether the device is on your allow-list or deny-list.
It's new in DriverKit 21 (i.e. macOS Monterey), and I've not had a chance to try it yet, but there is an API for reading files, OSMappedFile. I would imagine that the DriverKit sandbox will have something to say about which files a dext can open, but it seems worth exploring whether you can open configuration files this way.
Note that none of this will help you during early boot, as your dext will never be considered for matching at that time. And you may not be able to get required entitlements from Apple to build a dext which matches USB device classes rather than specific product/vendor ID patterns. (Apologies for repeating myself, but other users may come across this answer and not be aware of this issue.)

Why does didUpdateLocations: only have one location after locationManagerDidResumeLocationUpdates:

I have location services working in iOS8.
It is set for kCLLocationAccuracyBest using startMonitoringSignificantLocationChanges to restart when in the background and startUpdatingLocation for accuracy.
When I set pausesLocationUpdatesAutomatically = YES, the location services get paused and resumed as expected. However, the following call to didUpdateLocations: only has one location in it.
I was expecting to receive a bunch of locations that were received by the OS while the delivery was paused. Am I missing something here? Does it have anything to do with deferredLocationUpdatesAvailable?
This answer talks about a post on Apple Dev Forum, but I get nothing when searching for pausesLocationUpdatesAutomatically.
Please note: this issue has nothing to do with calling requestAlwaysAuthorization or setting prompts in info.plist.
Further to the answer by Quentin Hayot...
The documentation for pausesLocationUpdatesAutomatically states:
Allowing the location manager to pause updates can improve battery life on the target device without sacrificing location data.
This is highly misleading.
What it really means is when location manager pauses it will sacrifice location data: the app does not get and never will get location updates until the location manager resumes.
It should explain that paused location updates are completely different from deferred location updates.
To improve battery life, the app should call allowDeferredLocationUpdatesUntilTraveled:timeout: which does deliver all location updates gathered while deferring to locationManager:didUpdateLocations:.
The documentation for locationManager:didUpdateLocations: states:
If updates were deferred or if multiple locations arrived before they could be delivered, the array may contain additional entries.
which is reasonably clear, but it could state that it has nothing to do with, and should not be confused with, pausesLocationUpdatesAutomatically.
When you pause the location updates, the system considers that you don't need the location for now. Resuming it will only give you the current location.
This is normal behavior.

DirectInput, multiple joysticks, multiple users

My situation is that I have multiple joystick input devices, from the same manufacturer. The idea is that you set them up once, and then any user could log in and play without having to set them up again. The problem is how to uniquely identify each stick so at runtime you associate each stick to the correct saved configuration. I'm using DirectInput since these are joysticks, not gamepads.
The problem is that despite what is claimed here as well as on other MSDN pages, the InstanceGUID is only unique per user, not per machine! At least this is what I am seeing, which seems to be corroborated by one other poster here. Ideally I want a new user to be able to log in and just run with the existing setup, but I don't see any way to associate the correct joystick with the correct saved button mapping without a reliable GUID that doesn't change (and that isn't the same for both sticks, like the product GUID).
All these sticks are USB joysticks if that helps.
Maybe there's a way to reliably enumerate the sticks? But without understanding the underlying architecture of how the USB ports are scanned for Joysticks, I'm not sure if I can always assume the "first" stick will actually be the first for every user, or if I unplug a stick and plug it back in, even to the same port.
UPDATE: This article contains the claim that InstanceId is per computer, not per user.

How to guarantee file integrity without mandatory file lock on OS X?

AFAIK, OS X is a BSD derivative, which doesn't have actual mandatory file locking. If so, it seems that I have no way to prevent write access from other programs even while I am writing a file.
How can I guarantee file integrity in such an environment? I don't care about integrity after my program has exited, because that's the user's responsibility. But I think I need some kind of guarantee at least while my program is running.
How do other programs guarantee file content integrity without mandatory locking? Especially database programs. If there's common technique or recommended practice, please let me know.
Update
I am looking for this for the data layer of a GUI application for non-engineer users. Currently, my program has these constraints:
The data is too big to fit in RAM, and is hard even to copy temporarily. So it cannot be read or written atomically, and has to be used directly from disk while the program is running.
It is a long-running professional GUI content editor used by people who are not engineers. Even so, they can still access the file at the same time with Finder or other programs, so they can accidentally delete or overwrite the file currently in use. The problem is that users don't understand what is actually happening, and expect the program to take care of file integrity at least while it is running.
I think the only way to guarantee the file's integrity in this situation is:
1. Open the file with a system-wide exclusive mandatory lock. Now the file is the program's responsibility.
2. Check for integrity.
3. Use the file like external memory while the program is running.
4. Write all the modifications.
5. Unlock. Now the file is the user's responsibility.
Because OS X lacks a system-wide mandatory lock, I don't know how to do this. But I still believe there's a way to achieve this kind of file integrity that I just don't know about, and I want to know how everybody else handles it.
This question is not about my own programming errors; that's a separate problem. The current problem is protecting data from other programs that don't respect advisory file locks. Also, users are usually root and the program runs as the same user, so plain Unix file permissions are not useful.
You have to look at the problem that you are trying to actually solve with mandatory locking.
File content integrity is not guaranteed by mandatory locking. Unless you keep your file locked 24/7, file integrity will still depend on all processes observing file format/access conventions (and can still fail due to hard drive errors etc.).
What mandatory locking protects you against is programming errors that (by accident, not out of malice) fail to respect the proper locking protocols. At the same time, that protection is only partial, since failure to acquire a lock (mandatory or not) can still lead to file corruption. Mandatory locking can also reduce possible concurrency more than needed. In short, mandatory locking provides more protection than advisory locking against software defects, but the protection is not complete.
One solution to the problem of accidental corruption is to use a library that is aggressively tested for preserving data integrity. One such library (there are others) is SQLite (see also here and here for more information). On OS X, Core Data provides an abstraction layer over SQLite as a data store. Obviously, such an approach should be complemented by replication/backup so that you have protection against other causes of data corruption where the storage layer cannot help you (media failure, accidental deletion).
Additional protection can be gained by restricting file access to a database and allowing access only through a gateway (such as a socket or messaging library). Then you will just have a single process running that merely acquires a lock (and never releases it). This setup is fairly easy to test; the lock is merely to prevent having more than one instance of the gateway process running.
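A minimal sketch of the single-instance lock for such a gateway process, in Python and assuming a POSIX system where flock() is available. The function name and the lock-file path are hypothetical. The lock is advisory, which is fine here because it only has to keep a second copy of the gateway from starting.

```python
import fcntl
import os
from typing import Optional


def acquire_instance_lock(lock_path: str) -> Optional[int]:
    """Try to become the only running gateway.

    Returns an open file descriptor that holds the lock, or None if
    another holder already has it.
    """
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # Exclusive and non-blocking: fail at once instead of waiting.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd
```

The gateway would call this once at startup, exit if it gets None, and otherwise keep the descriptor open for its whole lifetime. The kernel drops the lock when the process exits, even after a crash.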
One simple solution would be to simply hide the file from the user until your program is done using it.
There are various ways to hide files. It depends on whether you're modifying an existing file that was previously visible to the user or creating a new file. Even if modifying an existing file, it might be best to create a hidden working copy and then atomically exchange its contents with the file that's visible to the user.
One approach to hiding a file is to create it in a location which is not normally visible to users. (That is, it's not necessary that the file be totally impossible for the user to reach, just out of the way so that they won't stumble on it.) You can obtain such a location using -[NSFileManager URLForDirectory:inDomain:appropriateForURL:create:error:] and passing NSItemReplacementDirectory and NSUserDomainMask for the first two parameters. See the -replaceItemAtURL:withItemAtURL:backupItemName:options:resultingItemURL:error: method for how to atomically move the file into its final place.
You can set a file to be hidden using various APIs. You can use -[NSURL setResourceValue:forKey:error:] with the key NSURLIsHiddenKey. You can use the chflags() system call to set UF_HIDDEN. The old Unix standby is to use a filename starting with a period ('.').
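As a rough illustration of the working-copy idea, here is a Python sketch that uses a dot-prefixed file in the same directory and a plain POSIX rename (os.replace) rather than the NSFileManager calls mentioned above. That is a swapped-in technique, and both function names are hypothetical. It also copies the whole file, so it only fits when a full copy is affordable.

```python
import os
import shutil
import tempfile


def make_working_copy(target: str) -> str:
    """Copy `target` to a dot-prefixed file in the same directory."""
    directory = os.path.dirname(os.path.abspath(target))
    # Same directory keeps the later rename on one filesystem.
    fd, working = tempfile.mkstemp(prefix=".", suffix=".working", dir=directory)
    os.close(fd)
    shutil.copyfile(target, working)
    return working


def commit_working_copy(working: str, target: str) -> None:
    """Flush the working copy to disk, then rename it over `target`."""
    fd = os.open(working, os.O_RDWR)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)
    # os.replace is an atomic rename on POSIX: readers see the old
    # file or the new one, never a half-written mix.
    os.replace(working, target)
```

All edits go to the path returned by make_working_copy(); the visible file stays untouched until commit_working_copy() swaps the new version in.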
Here are some details about this topic:
https://developer.apple.com/library/ios/documentation/FileManagement/Conceptual/FileSystemProgrammingGuide/FileCoordinators/FileCoordinators.html
Now I think the basic policy on OS X is something like this.
Always allow access by any process.
Always be prepared for shared data file mutation.
Be notified when other processes mutate the file content, and respond properly. For example, you can display an error to end users if another process is trying to access the file. Users will then learn that this is bad and will not do it again.
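A crude way to notice such mutations, sketched in Python under the assumption that polling os.stat() is acceptable. This is not the file coordinator/presenter mechanism from the linked guide, just a portable stand-in, and the function names are hypothetical.

```python
import os
from typing import Optional, Tuple

Signature = Tuple[int, int, int]


def file_signature(path: str) -> Optional[Signature]:
    """Return (inode, size, mtime in ns), or None if the file is gone."""
    try:
        st = os.stat(path)
    except FileNotFoundError:
        return None
    return (st.st_ino, st.st_size, st.st_mtime_ns)


def changed_since(path: str, last_seen: Optional[Signature]) -> bool:
    """True if the file's signature differs from `last_seen`."""
    return file_signature(path) != last_seen
```

The program would record a signature after each of its own writes and compare before the next access. A write that keeps the size and lands within the same timestamp tick can be missed, so this is only a first line of defense.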

How to emulate shm_open on Windows?

My service needs to store a few bits of information (at least 20 bits or so, but I can easily make use of more) such that
it persists across service restarts, even if the service crashed or was otherwise terminated abnormally
it does not persist across a reboot
can be read and updated with very little overhead
If I store this information in the registry or in a file, it will not get automatically emptied when the system reboots.
Now, if I were on a modern POSIX system, I would use shm_open, which would create a shared memory segment which persists across process restarts but not system reboots, and I could use shm_unlink to clean it up if the persistent data somehow got corrupted.
I found MSDN: Creating Named Shared Memory and started reimplementing pieces of it within my service; this basically uses CreateFileMapping(INVALID_HANDLE_VALUE, ..., PAGE_READWRITE, ..., "Global\\my_service") instead of shm_open("/my_service", O_RDWR | O_CREAT, ...).
However, I have a few concerns, especially centered around the lifetime of this pagefile-backed mapping. I haven't found answers to these questions in the MSDN documentation:
Does the mapping persist across reboots?
If not, does the mapping disappear when all open handles to it are closed?
If not, is there a way to remove or clear the mapping? Doesn't need to be while it's in use.
If it does persist across reboots, or does disappear when unreferenced, or is not able to be reset manually, this method is useless to me.
Can you verify or find faults in these points, and/or recommend a different approach?
If there were a directory that were guaranteed to be cleaned out upon reboot, I could save data in a temporary file there, but it still wouldn't be ideal: under certain system loads, we are encountering file open/write failures (rare, under 0.01% of the time, but still happening), and this functionality is to be used in the logging path. I would like not to introduce any more file operations here.
The shared memory mapping would not persist across reboots and it will disappear when all of its handles are closed. A memory mapping object is a kernel object - they always get deleted when the last reference to them goes away, either explicitly via a CloseHandle or when the process containing the reference exits.
Try creating a registry key with RegCreateKeyEx with REG_OPTION_VOLATILE - the data will not be preserved when the corresponding hive is unloaded. This will be at system shutdown for HKLM or user logoff for HKCU.
Sounds like maybe you want serialization instead of shared memory? If that is indeed appropriate for your application, the way you serialize will depend on your language. If you're using C++, check out boost::serialization. C# undoubtedly has lots of serialization options (like Java), if that's what you're using.

Resources