Where do UTIs come from? - macos

I've read a lot of blog posts and SO-questions about Uniform Type Identifiers and how OS X handles file types. However, there are still some things I just don't get:
How are UTIs created by the system for each file? As a developer I passively declare a UTI for my file type, but the system is responsible for assigning the UTI to each matching file.
My current impression is that UTIs are created on-the-fly by the Finder according to the file extension.
Where are UTIs stored on the file system level? I've learned that the UTI can be displayed with the mdls command. Does that imply that the UTI is stored along with the Spotlight metadata? What if Spotlight is turned off?
Is it correct that there is no API to manually add or change a UTI for a specific file?

There's actually not that much magic to it. You've asked several different questions, so I'll try to give you each of the answers:
How are UTIs created by the system for each file?
Launch Services maintains a database of all applications (and certain other types of bundles) on your Mac and relevant information declared in their Info.plist files. It updates this information automatically—I think it has a daemon monitor the file system to watch for changes to applications, but I don't know the details. What I do know is that you can ask a tool called lsregister to dump the entire database for you. In Terminal on Mountain Lion:
$ /System/Library/Frameworks/CoreServices.framework/Versions/A/Frameworks/LaunchServices.framework/Versions/A/Support/lsregister -dump
The various UTType functions also access this Launch Services database (although I'm not sure if they do it directly or if they communicate with some kind of daemon that does it for them).
Where are UTIs stored on the file system level?
Well, the actual Launch Services database seems to be located somewhere different on each Mac. On mine, it appears to be at /private/var/folders/mf/1xd7vlw90dj5p4z1r5f800bc000101/C/com.apple.LaunchServices-0371025.csstore. (At least, lsregister holds this file open while it's working; I'm not actually sure what's in it, but I assume it's the database.)
This is just a list of the declared UTIs, though. There is no UTI field attached to a given file. When you ask Cocoa for a file's UTI—through, say, -[NSWorkspace typeOfFile:error:] or -[NSURL getResourceValue:forKey:error:]—it actually extracts the path extension from the file name and then calls UTTypeCreatePreferredIdentifierForTag() to fetch the relevant UTI. (It's a little more complicated than that, because it's also looking at things like whether the path leads to a directory or device file or something, but that's the basic idea.)
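To make the "derived from the name, not stored on the file" idea concrete, here is a toy model in Python. It does not call Launch Services; DECLARED_TYPES and type_of_file are hypothetical stand-ins for the real database and lookup, and the fallback for unknown extensions is a simplification (the real system generates a dynamic identifier instead).

```python
import os

# Hypothetical stand-in for the Launch Services database: a few
# extension -> UTI declarations. The real table is built from the
# Info.plist files of installed applications.
DECLARED_TYPES = {
    "txt": "public.plain-text",
    "png": "public.png",
    "pdf": "com.adobe.pdf",
    "xml": "public.xml",
}


def type_of_file(path, is_directory=False):
    """Derive a type identifier from the file name alone (toy model)."""
    if is_directory:
        return "public.folder"
    ext = os.path.splitext(path)[1].lstrip(".").lower()
    # Simplification: unknown extensions fall back to the generic base type.
    return DECLARED_TYPES.get(ext, "public.data")
```

In this model, renaming report.pdf to report.txt changes the result of type_of_file without touching anything else about the file, which is the sense in which the type is computed rather than stored.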
Does that imply that the UTI is stored along the Spotlight meta data? What if Spotlight is turned off?
Spotlight does keep UTIs of files in its database, but that's just so it can quickly search and filter by type. Like everything else in the Spotlight index, this information is not canonical; it's just used to quickly search data that's actually stored elsewhere. If you turn off Spotlight, that's fine—nothing else depends on it.
Is it correct that there is no API to manually add or change a UTI for a specific file?
Yes, because that UTI is calculated at runtime from other information about the file. Changing a file's UTI makes about as much sense as changing the length of its name—you can't do it without changing the name itself.

Related

What is a "context" used for in regards to a Windows NT MiniFilter Driver?

I built a very simple minifilter driver as part of a lesson on minifilters. I've also read the minifilter documentation that Microsoft provides, which is in the form of a PDF doc, as well as this reference. These guides explain how to set up a context and an instance. However, they do not explain why one would use a context and/or instance and what they are for. My very small filter driver used NULL for both context and instance and still operates, so I am wondering what the use case for these constructs is.
There are many reasons why you would want to use contexts for files, volumes, and so on. Filters, and even file systems, could certainly operate without them, but the performance would be really bad.
Imagine this scenario: you are an AV (AntiVirus) and want to scan some files to check if they contain malicious code or not.
You register your minifilter and callbacks and now you are being called and you need to make a decision on a file as it is opened.
There are a few steps involved:
You query the file name and security context
You read the file contents
Alternatively hash the file with a SHA256 to see if it matches in your AV database for example
You check if the file is digitally signed, also part of your check
You parse the file's PE header if it has one to see what kind of file or executable it is to help you in your decision
You apply your policy on the file based on all the information above
Now let's assume the file is clean and goes away. If you cannot hold on to the information that you just learned about the file, you will have to redo all of it the next time the file is opened. Your performance will suck, and your OS will crash and burn slowly to the ground.
This is where contexts come in handy.
Now that you have all this information about the file, you store all of it in your context that is then associated with this file. Next time you see the file you simply query its context and have all the information you need.
Of course, some things will need to be updated. For example, if you notice the file has been changed, you mark it as dirty and update as needed on the next Create or Cleanup callback.
Alternatively, you could keep your own cache: when the file is closed for good and the context you associated with it is about to be freed, you save the information yourself.
Then, the next time the file is opened, you look up the saved context for the file (NTFS supports unique file IDs for files), associate it with the file again, and immediately know everything you need to know about that file.
This is only one usage, but now you can think for yourself of many more scenarios where they are useful.
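The caching pattern described above can be sketched outside the kernel. The following Python model is only an illustration of the idea, not minifilter code: FileContext, ContextCache, on_open and on_write are hypothetical names, and the scan function passed in stands for the expensive hashing, signature and PE checks.

```python
from dataclasses import dataclass


@dataclass
class FileContext:
    """What the scanner learned about one file (hypothetical fields)."""
    sha256: str
    is_signed: bool
    verdict: str
    dirty: bool = False


class ContextCache:
    """Keeps one context per file ID so the expensive scan runs rarely."""

    def __init__(self, scan):
        self._scan = scan      # expensive function: file_id -> FileContext
        self._contexts = {}    # file_id -> FileContext
        self.scans = 0         # how many times the expensive scan ran

    def on_open(self, file_id):
        ctx = self._contexts.get(file_id)
        if ctx is None or ctx.dirty:
            # First sighting, or the file changed since the last scan.
            ctx = self._scan(file_id)
            self.scans += 1
            self._contexts[file_id] = ctx
        return ctx

    def on_write(self, file_id):
        ctx = self._contexts.get(file_id)
        if ctx is not None:
            ctx.dirty = True   # rescan lazily on the next open
```

Opening the same file ID twice runs the scan once; a write in between marks the context dirty and forces one rescan on the next open.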

Can a file have localization on MacOS from a Xcode project

What I mean is this: I am developing a course. I have 10 files that I want to provide to the user for free, files that the user may use to follow the course. The problem is that the course is multi-language. Suppose, for example, that the files are music tracks.
When the user clicks a button, all these tracks are copied from the bundle to a directory on the user's computer. That folder opens automatically and the user sees the files there, but they have the names I originally assigned to them in one language. I want a file called "music1.mp3", for example, to be seen by Portuguese speakers as "música1.mp3".
OK, I know. I can copy the files to the directory and then rename them on the fly to the corresponding language, getting their names from Localizable.strings. But I wonder if there is a way to add localization internally to the files, so that if the user moves them between computers with different languages they show up with a different name, according to the computer's default language, provided that I have localized for that language. My idea is that the names always appear in English, unless I have localized for that specific language.
I know Apple does that for specific directories of macOS, like Pictures, Movies, etc., which appear with other names if I change the macOS language.
Is there a way to do that? Or will I have to rename everything after copying?
No, I don't think this is possible.
The localization of directory names like Library happens at the system level -- those directories are flagged as having localized names (by putting a hidden file named .localized in them!), and the system looks up the appropriate name in a system strings file.
As far as I'm aware, there is no supported way to add extra directory names to this localization table, nor is there any way to localize the name of a plain file.

File type misery - Cocoa

So we recently shipped a document based application with an unfortunate oversight: the UTI for our main document type was left blank. We had a name for it, but the identifier was straight up empty.
Everything still worked great, but then we went to add another file type to the mix. The new file type is simply xml (conforms to public.xml). We set that up and dropped it into the document. This is when we caught our oversight on the first document type's UTI.
Now, if we so much as touch this document type, BOOM. The application can't read any files it has created of that type. We really want to clean this up, so what's the best way to do so?
My question is essentially:
How do you migrate your main document type in a document based application?
First, it's very difficult to debug this type of problem on the machine you're using to cut builds. The dynamic UTI system gets confused as to which app owns which files. To solve this issue, there is a command you can run in Terminal to clear out the file associations on your system.
Next, we tackled the actual document types of our application. Ultimately, we want to support just two document types, our custom type and the xml type. However, we had to keep that empty, dynamically generated UTI that was shipped. In "Document types", we have three: the two we actually want to support and the legacy one we no longer want. For the first two, our application is an "Editor". For the legacy one, we changed it to "Reader".
Another thing that really helped our system out is using exported and imported UTIs. We told the system our application imports the XML type, and exports the two others.
We've done some pretty significant testing, including deployment, and this configuration works like a charm.

How do I open .bin/.idx database files?

I'm attempting to open some database files used by a legacy application that I know almost nothing about. The databases appear to be in file pairs of a bin and idx, for example: Cust.bin and Cust.idx.
I have never seen this type of database before and wasn't able to find anything useful through Google. I also don't know what language or tool the developer used for this app, but it seems that he used the default generic icon for his published executable. This is it:
Can anyone tell me anything about this application, what type of database it uses and how I might open the database myself?
The program that was using this database was a custom written application by a former consultant.
I never did figure out what type of database he was using, or how to open it properly. But I did manage to extract all the data out of it. I opened the file up in EditPad and found that all records had fixed-length fields. With this knowledge I was able to easily write a small application to parse all the binary data and export everything to .csv.
So I was ultimately able to get the data. Woot!
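For anyone facing a similar file, a fixed-length extraction along these lines might look like the Python sketch below. The record layout is purely hypothetical (a 4-byte little-endian ID, a 20-byte name and a 10-byte phone field); the real layout has to be worked out by inspecting the file in a hex or text editor, as described above.

```python
import csv
import io
import struct

# Hypothetical layout: uint32 id, 20-byte name, 10-byte phone, no padding.
RECORD = struct.Struct("<I20s10s")


def _text(raw):
    """Strip NUL/space padding from a fixed-width field and decode it."""
    return raw.rstrip(b"\x00 ").decode("latin-1")


def parse_records(data):
    """Split a byte string into fixed-length records; ignore a short tail."""
    rows = []
    usable = len(data) - len(data) % RECORD.size
    for offset in range(0, usable, RECORD.size):
        cust_id, name, phone = RECORD.unpack_from(data, offset)
        rows.append([cust_id, _text(name), _text(phone)])
    return rows


def to_csv(rows):
    """Render the parsed rows as CSV text with a header line."""
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow(["id", "name", "phone"])
    writer.writerows(rows)
    return out.getvalue()
```

Reading the file with open(path, "rb").read() and passing the bytes to parse_records is enough for small files; the .idx file is not needed for a straight dump if the .bin file holds the records themselves, which is an assumption here.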

Where do Firefox extensions store data?

I want to write a plugin for GNOME Do that will work with Firefox extension data (for example, with URL Alias patterns). I have looked through the files in my profile folder (~/.mozilla/firefox/.default/), but haven't found anything related.
Can anybody help me?
I unzipped the XPI, and it looks like the key data is stored in the preference system under the urlalias branch. This is serialized to disk in the prefs.js file. Each line is a single preference, so it should be pretty simple to parse (you could conceivably use Firefox's XPCOM interface, but that's probably unnecessary).
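A line-by-line parse could look roughly like this Python sketch. It assumes each preference is written as a single user_pref("name", value); line and that values are strings, numbers or booleans; read_branch is a hypothetical helper, and the key names the extension uses under urlalias are not known here.

```python
import json
import re

# One preference per line: user_pref("some.name", <value>);
PREF_LINE = re.compile(r'^user_pref\("((?:[^"\\]|\\.)*)",\s*(.*)\);\s*$')


def read_branch(lines, branch):
    """Collect the preferences whose names start with '<branch>.'."""
    prefs = {}
    prefix = branch + "."
    for line in lines:
        match = PREF_LINE.match(line.strip())
        if not match:
            continue  # comments, blank lines, anything unexpected
        name, raw = match.groups()
        if not name.startswith(prefix):
            continue
        try:
            # Strings, integers and true/false are usually JSON-compatible.
            prefs[name] = json.loads(raw)
        except ValueError:
            prefs[name] = raw  # keep the raw text if it is not
    return prefs
```

Calling read_branch(open(path_to_prefs_js), "urlalias") would then return a dictionary of just that extension's settings.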
