I'm writing an app for macOS whose primary goal is to manage arbitrary user files in a certain manner. 'Management' includes arbitrarily reading/writing/updating these files. Internally, management is not a discrete event and may include several idle periods; to the user, however, it must appear as a single discrete operation.
Note: The term 'user' covers any and all user activity (e.g. via Finder) and user-initiated processes (e.g. other apps opened by the user; not running as root, i.e. with privileges similar to my own application's).
My app does not store these files in an owned container (e.g. sandboxed app container), but rather runs continuously in the background keeping track of these files, monitoring for changes and managing them as necessary.
The duration of this 'management' may vary from a few milliseconds to a few hours.
I'm trying to write a construct (e.g. a class or struct) to encapsulate references to these 'hot' files (i.e. files under management). During management, the user must not be able to read, write to, or delete these files unless the app is explicitly quit (whether through a normal quit or a force quit).
Is there any way I can "lock" a file so as to prevent the user from reading/writing/updating it, or even from modifying its permissions?
Here are two possible solutions:
Copy the file to an undisclosed location, manage it, and overwrite the old file. This is undesirable for multiple reasons: copying is expensive and impractical for large files, the user is not explicitly aware that the file is under management, and it does nothing to prevent other processes from seeing the original file as "free".
Modify file permissions. I'm not sure if this is even possible (please let me know in detail if it is!), but if my process could modify file permissions so as to prevent user access, it would solve the essence of my problem. However, if anything were to prevent my app from 'unlocking' these files (be it through a crash, force quit, etc.), it would leave the files inaccessible to the user.
A third option, though not really a solution, would be to simply not attempt to 'lock' any of these files. I could just monitor the files continuously and alert the user of any failure. I really don't want to do this, hence the question.
The second solution seems quite promising. I can't, however, find any high-level APIs that let me interface with the file ACLs (access-control-lists). I'm not even sure whether I'm correct in my understanding of how it would work, so feel free to build upon that thought and turn it into a concrete answer.
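To make the permissions idea concrete, here is a minimal sketch of what the low-level side could look like using the BSD layer on macOS (the higher-level acl(3) family, acl_get_file/acl_set_file and friends, also exists if you want real ACL entries). It takes an advisory exclusive lock via open(2) with O_EXLOCK, which cooperating processes respect but Finder and arbitrary apps may not, and strips permissions with fchmod(2) for the duration of management. The path is hypothetical, and this is not a complete or crash-safe solution; it has exactly the "app dies while the file is locked" failure mode described above.

    #include <fcntl.h>
    #include <sys/stat.h>
    #include <unistd.h>
    #include <stdio.h>

    int main(void)
    {
        const char *path = "/Users/me/Documents/managed.dat";  /* hypothetical */

        /* Advisory exclusive lock: honoured by processes that also use
           O_EXLOCK/flock, but it does not stop Finder or arbitrary apps. */
        int fd = open(path, O_RDWR | O_EXLOCK | O_NONBLOCK);
        if (fd < 0) { perror("open"); return 1; }

        /* Remember the original mode, then strip all permissions so other
           non-root processes cannot open the file while it is "hot". */
        struct stat st;
        if (fstat(fd, &st) != 0) { perror("fstat"); return 1; }
        if (fchmod(fd, 0) != 0)  { perror("fchmod"); return 1; }

        /* ... manage the file through fd ... */

        /* Restore permissions and release the lock. If the app crashes
           before reaching this point, the file stays inaccessible. */
        fchmod(fd, st.st_mode & 07777);
        close(fd);
        return 0;
    }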
I'm also curious as to how Finder seems to know whether files are being used by other processes. Again, I think I know but I'm not entirely sure, so better ask it here with the main question.
My application currently writes files to its installation directory which means Program Files isn't a valid option (I know this isn't ideal). But I would also prefer my installer (Inno) not to require admin rights i.e. no UAC; I think Google Chrome does this.
Which common location would make sense to default to with both these restrictions in mind?
If you really want to make a per-user installer that does not require admin permissions, the correct settings to use are:
[Setup]
PrivilegesRequired=lowest
DefaultDirName={userpf}\YourAppName
Note that (addressing Glytzhkof's concerns) this is a local folder, not a roaming folder. If you want settings to roam then you will still need your application to keep them in (your language's equivalent of) the {userappdata}\YourAppName folder. Regardless, the user will have to separately install the software on each machine that they intend to use it on (but this is typically the best option anyway).
Some of the downsides of making a per-user application are:
You cannot use admin permissions when installing. In particular this means that you cannot install many other components (runtimes, libraries, etc) that you might have wanted to use in your application. You also can't use features like regserver and restartreplace. (This doesn't necessarily mean that you cannot still use these components, just that it's a larger hassle if the user does not already have them installed.)
If a single machine has multiple users (common for families and in some workplaces) then they will have duplicate independent copies of your application, which have to be individually upgraded by each user. This particularly annoys IT departments as they prefer doing central upgrades, and if your app is large it may waste disk space.
If the reason you don't want to make a normal {pf} based application is simply that you want to be lazy and store settings files in the program's folder, then it's probably better overall to rethink this decision. It's not hard to do it "right".
There are basically 3 types of files: 1: user data, 2: application settings, and 3: binaries. Plus a few exceptions. I assume Harry is suggesting to write to a user's application data folder with configuration and settings files, and not the whole application. Don't ever put binaries or data here, but do save settings files here.
The whole concept of "roaming files" is a bad idea in my opinion. It clogs your user profile, increases logon time on each computer, and causes all kinds of synchronization issues when people leave multiple machines logged on simultaneously for weeks at a time. The whole roaming concept works only in theory in my opinion - it depends on user discipline and application quality in its data management. Rarely edited and mergeable files can work, if the application is good. I have seen it work well for spell checker custom dictionaries and similar. The real solution is client/server applications with back-end databases for persisting settings. Everything else will eventually fail - though if it's a lightweight app that might not matter.
User data should be saved to the "My Documents" location only, and only a few configuration settings should roam. If a network is set up to allow "My Documents" to roam, the system administrator should be shot immediately :-). It must be a server share accessible regardless of the computer the user is logged on to.
I've flown off the handle here and answered too many questions you didn't ask. I just hate seeing people head for problems they might not know about. If you have a super small application that is basically "portable", as we call it in deployment - an application you can run off a single folder on a USB stick - then save everything in user data, and keep the application small and lightweight with a single settings file and a binary. No UAC or admin rights should be needed.
I'm contemplating using launchd to watch a file structure on my computer, using WatchPaths to tell whether one of these directories changes, though I will need to create a new property list file to account for each directory. My concern is how scalable this is: will I notice a drop in my computer's performance if I'm watching 10, 100, 1000 or more directories, or are the resources consumed by watching this many paths concentrated in memory rather than processing?
These jobs are going to be used to handle when a file is deleted or renamed, and will update a manifest in the root of this file structure, so my application will know what files are where without walking the tree; I'm trying to make the application a little more responsive and aware. Should these jobs be daemons or agents? I've presumed agents, because I don't see how this structure could be modified without a user being logged in, although these jobs would have no need to create a GUI.
Will launchd be scalable enough to handle an arbitrarily sized file structure in this manner?
Are there other options? (Portability would be nice.)
Detecting whether files have been renamed, created, or otherwise modified is better accomplished using FSEvents or kqueues.
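For reference, here is a minimal FSEvents sketch in C (the same API is available from Objective-C and Swift). A single stream watches one root recursively, so you would not need one launchd job or plist per directory; the root path and the one-second latency below are placeholders. Build with: cc watcher.c -framework CoreServices

    #include <CoreServices/CoreServices.h>
    #include <stdio.h>

    /* Callback invoked whenever something changes under the watched root. */
    static void onEvents(ConstFSEventStreamRef stream, void *info,
                         size_t numEvents, void *eventPaths,
                         const FSEventStreamEventFlags flags[],
                         const FSEventStreamEventId ids[])
    {
        char **paths = (char **)eventPaths;
        for (size_t i = 0; i < numEvents; i++) {
            if (flags[i] & kFSEventStreamEventFlagItemRenamed)
                printf("renamed: %s\n", paths[i]);
            else if (flags[i] & kFSEventStreamEventFlagItemRemoved)
                printf("removed: %s\n", paths[i]);
            else
                printf("changed: %s\n", paths[i]);
        }
    }

    int main(void)
    {
        /* Hypothetical root; FSEvents recurses into subdirectories, so one
           stream covers the whole tree regardless of its size. */
        CFStringRef root = CFSTR("/path/to/watched/root");
        CFArrayRef paths = CFArrayCreate(NULL, (const void **)&root, 1,
                                         &kCFTypeArrayCallBacks);

        FSEventStreamRef stream = FSEventStreamCreate(
            NULL, &onEvents, NULL, paths,
            kFSEventStreamEventIdSinceNow,
            1.0,                                /* latency in seconds */
            kFSEventStreamCreateFlagFileEvents  /* per-file events */);

        FSEventStreamScheduleWithRunLoop(stream, CFRunLoopGetCurrent(),
                                         kCFRunLoopDefaultMode);
        FSEventStreamStart(stream);
        CFRunLoopRun();
        return 0;
    }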
OK, I wrote an application that uses the Adobe ActiveX control for displaying PDF files.
The Adobe ActiveX control loads files only from the file system, so I need to feed a file path to this control.
The problem is that I don't want to store PDF files on the file system. Not even temporarily! I want to store my PDF files only in memory, and I want to use the Adobe ActiveX control.
So I need:
1) A way to fake a file on the file system, so this control would "think" there is a file, but it would actually be loaded from memory
2) A way to create a file on the file system that would be "visible" to only one application, so my PDF control could load it and other users wouldn't even see it.
3) Something else
PS: I'm not asking you to "finish my homework"; I'm just asking - is there a way to do this?
You can almost do it (meaning: no, you can't, but you can do something that comes close).
Creating a file with FILE_ATTRIBUTE_TEMPORARY does in principle create a file, temporarily. However, as long as there is sufficient buffer cache (which is normally always the case unless your file is tens to hundreds of megabytes), the system will not write to disk. This is not just something that happens accidentally, but the actual specified behaviour of this flag.
Further, specifying 0 as share mode and FILE_FLAG_DELETE_ON_CLOSE will prevent any other process from opening your file for as long as you keep it open, even if someone knows it's there, and the file will "disappear" when you close it. Even if your application crashes, the OS will clean up behind you (if DRM is the reason). If you're in super paranoia mode and worried about the system bluescreening while your file exists, you can additionally schedule a pending move too. This will, in case of a system crash, remove the file during boot.
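As a rough illustration of those flags (a sketch, with a hypothetical path and minimal error handling):

    #include <windows.h>
    #include <stdio.h>

    int main(void)
    {
        /* Hypothetical temp-folder path; in real code, build it from
           GetTempPathW plus a unique name. */
        const wchar_t *path = L"C:\\Users\\me\\AppData\\Local\\Temp\\doc.tmp";

        HANDLE h = CreateFileW(
            path,
            GENERIC_READ | GENERIC_WRITE,
            0,                                   /* share mode 0: nobody else can open it */
            NULL,
            CREATE_ALWAYS,
            FILE_ATTRIBUTE_TEMPORARY             /* keep it in the cache, avoid disk writes */
            | FILE_FLAG_DELETE_ON_CLOSE,         /* remove it when the last handle closes   */
            NULL);
        if (h == INVALID_HANDLE_VALUE) {
            fprintf(stderr, "CreateFileW failed: %lu\n", GetLastError());
            return 1;
        }

        /* ... write the in-memory PDF bytes into the handle, hand the path to
           the viewer, and keep h open for the lifetime of the document ... */

        CloseHandle(h);                          /* the file disappears here */
        return 0;
    }

One caveat: with a share mode of 0, the viewer control cannot open the path through its own handle either, so in practice you may have to relax this to FILE_SHARE_READ, trading a little of the exclusivity for usability.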
Lastly, given NTFS, you can create an alternate stream with a random, preferably unique name (e.g. the SHA-1 of the document, or a UUID) on any file or even directory. Alternate streams on directories are ... a kind of nasty hack, but entirely legal and they work just fine, and they don't appear in Explorer. This will not really make your file invisible, but nearly so (in almost every practical aspect, anyway). If you're a good citizen, you will want to use the system temp folder for such a thing, not the program folder or some other place that you shouldn't write to.
Creating an alternate stream is dead easy too, just use the normal file or directory name and append a colon (:) and the name of the stream you want. No extra API needed.
Other than that, it gets kind of hard. You can of course always create something like a RAM disk (it would be tough to hide it, though), or try to use one of the stream-from-memory functions to fool an application into reading from a memory buffer that it believes is a file... but that's not trivial stuff.
If something needs to be on a file system to pass to another application, you cannot hide it or limit it to certain processes. Anything your app can see, anything else at the same privilege level can also see and access. You may be able to lock it, but how depends on what you want to protect against.
Remember that the user's PC is theirs, not yours so they have full access to everything on it.
You can create a virtual disk and limit access to it to only specific applications. To do this you would have to write a file system driver or a file system filter driver. Both work in kernel mode and are tricky to write and maintain. Our company offers components that let you avoid writing drivers yourself and write business logic in user mode (we provide the drivers in those products).
Your most obvious option is to get rid of Adobe Reader control and use some third-party component that displays PDFs and can load them from memory.
But in general, a smart hacker would be able to capture your data unless you (a) use a non-standard data format, and/or (b) stream the data from the server dynamically rather than keeping the complete data on the computer. These are not bulletproof solutions either, but they make a hacker's work much harder.
Whiteboard Overview
(Two whiteboard diagrams, "Internal" and "Global" - 1000 x 750 px, ~130 kB JPEGs hosted on ImageShack - accompanied the original question.)
Additional Information
I should mention that each user (of the client boxes) will be working straight off the /Foo share. Due to the nature of the business, users will never need to see or work on each other's documents concurrently, so conflicts of this nature will never be a problem. Access needs to be as simple as possible for them, which probably means mapping a drive to their respective /Foo/username sub-directory.
Additionally, no one but my applications (in-house and the ones on the server) will be using the FTP directory directly.
Possible Implementations
Unfortunately, it doesn't look like I can use off-the-shelf tools such as WinSCP, because some other logic needs to be intimately tied into the process.
I figure there are two simple ways for me to accomplish the above on the in-house side.
Method one (slow):
Walk the /Foo directory tree every N minutes.
Diff with the previous tree using a combination of timestamps (which can be faked by file copying tools, but that's not relevant in this case) and checksumming.
Merge changes with off-site FTP server.
Method two:
Register for directory change notifications (e.g., using ReadDirectoryChangesW from the WinAPI, or FileSystemWatcher if using .NET).
Log changes.
Merge changes with off-site FTP server every N minutes.
I'll probably end up using something like the second method due to performance considerations.
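For the change-notification piece of method two, a minimal synchronous sketch in C might look like the following (the share root is hypothetical; production code would use overlapped I/O, handle buffer-overflow notifications, and feed the events into a change log rather than printing them):

    #include <windows.h>
    #include <stdio.h>

    int main(void)
    {
        HANDLE dir = CreateFileW(
            L"D:\\Foo",                          /* hypothetical share root  */
            FILE_LIST_DIRECTORY,
            FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
            NULL, OPEN_EXISTING,
            FILE_FLAG_BACKUP_SEMANTICS,          /* required to open a directory */
            NULL);
        if (dir == INVALID_HANDLE_VALUE) return 1;

        BYTE buf[64 * 1024];
        DWORD bytes;
        while (ReadDirectoryChangesW(dir, buf, sizeof(buf),
                                     TRUE,       /* watch the whole subtree */
                                     FILE_NOTIFY_CHANGE_FILE_NAME |
                                     FILE_NOTIFY_CHANGE_LAST_WRITE |
                                     FILE_NOTIFY_CHANGE_SIZE,
                                     &bytes, NULL, NULL)) {
            FILE_NOTIFY_INFORMATION *fni = (FILE_NOTIFY_INFORMATION *)buf;
            for (;;) {
                wprintf(L"action %lu: %.*s\n", fni->Action,
                        (int)(fni->FileNameLength / sizeof(WCHAR)),
                        fni->FileName);
                if (fni->NextEntryOffset == 0) break;
                fni = (FILE_NOTIFY_INFORMATION *)((BYTE *)fni + fni->NextEntryOffset);
            }
        }
        CloseHandle(dir);
        return 0;
    }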
Problem
Since this synchronization must take place during business hours, the first problem that arises is during the off-site upload stage.
While I'm transferring a file off-site, I effectively need to prevent the users from writing to the file (e.g., use CreateFile with FILE_SHARE_READ or something) while I'm reading from it. The internet upstream speeds at their office are nowhere near symmetrical to the file sizes they'll be working with, so it's quite possible that they'll come back to the file and attempt to modify it while I'm still reading from it.
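As a sketch of that idea: opening the file with a share mode of only FILE_SHARE_READ lets other processes keep reading it but blocks writes and deletes for as long as the handle stays open (the function name is mine; real code would need to retry or fall back gracefully if the user already has the file open for writing):

    #include <windows.h>

    /* Hold writers off for the duration of the upload read. */
    static HANDLE open_for_upload(const wchar_t *path)
    {
        return CreateFileW(path,
                           GENERIC_READ,
                           FILE_SHARE_READ,       /* no FILE_SHARE_WRITE / _DELETE */
                           NULL,
                           OPEN_EXISTING,
                           FILE_FLAG_SEQUENTIAL_SCAN,
                           NULL);
    }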
Possible Solution
The easiest solution to the above problem would be to create a copy of the file(s) in question elsewhere on the file-system and transfer those "snapshots" without disturbance.
The files (some will be binary) that these guys will be working with are relatively small, probably ≤20 MB, so copying (and therefore temporarily locking) them will be almost instant. The chances of them attempting to write to the file in the same instant that I'm copying it should be close to nil.
This solution seems kind of ugly, though, and I'm pretty sure there's a better way to handle this type of problem.
One thing that comes to mind is something like a file system filter that takes care of the replication and synchronization at the IRP level, kind of like what some A/Vs do. This is overkill for my project, however.
Questions
This is the first time that I've had to deal with this type of problem, so perhaps I'm thinking too much into it.
I'm interested in clean solutions that don't require going overboard with the complexity of their implementations. Perhaps I've missed something in the WinAPI that handles this problem gracefully?
I haven't decided what I'll be writing this in, but I'm comfortable with: C, C++, C#, D, and Perl.
After the discussions in the comments my proposal would be like so:
Create a partition on your data server, about 5GB for safety.
Create a Windows Service project in C# that would monitor your data drive / location.
When a file has been modified, create a local copy of the file, preserving the same directory structure, and place it on the new partition.
Create another service that would do the following:
Monitor Bandwidth Usages
Monitor file creations on the temporary partition.
Transfer several files at a time (Use Threading) to your FTP Server, abiding by the bandwidth usages at the current time, decreasing / increasing the worker threads depending on network traffic.
Remove the files from the partition that have successfully transferred.
So basically you have your drives:
C: Windows Installation
D: Share Storage
X: Temporary Partition
Then you would have following services:
LocalMirrorService - Watches D: and copies to X: with the dir structure
TransferClientService - Moves files from X: to ftp server, removes from X:
It also uses multiple threads to move several files at once and monitors bandwidth.
I would bet that this is the idea you had in mind, but it seems like a reasonable approach as long as you're really good with your application development and are able to create a solid system that handles most issues.
When a user edits a document in Microsoft Word, for instance, the file will change on the share and may be copied to X: even though the user is still working on it. Within Windows you can check whether the file handle is still held open by the user; if it is, you can watch for when the user actually closes the document, so that all their edits are complete, and only then migrate it to drive X:.
That being said, if the user is working on the document and their PC crashes for some reason, the document's file handle may not get released until the document is opened again at a later date, thus causing issues.
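There is no single "is this document still open in Word" API, but one common approximation is to attempt an exclusive open and look for a sharing violation. A rough sketch (it will also report "free" for files that simply don't exist or can't be accessed, so treat it as a heuristic):

    #include <windows.h>
    #include <stdbool.h>

    /* Try to open the file with no sharing; a sharing violation means some
       other process (e.g. Word) still holds a handle to it. */
    static bool file_is_free(const wchar_t *path)
    {
        HANDLE h = CreateFileW(path, GENERIC_READ, 0, NULL,
                               OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
        if (h == INVALID_HANDLE_VALUE)
            return GetLastError() != ERROR_SHARING_VIOLATION;
        CloseHandle(h);
        return true;
    }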
For anyone in a similar situation (I'm assuming the person who asked the question implemented a solution long ago), I would suggest an implementation of rsync.
rsync.net's Windows Backup Agent does what is described in method 1, and can be run as a service as well (see "Advanced Usage"). Though I'm not entirely sure if it has built-in bandwidth limiting...
Another (probably better) solution that does have bandwidth limiting is Duplicati. It also properly backs up currently-open or locked files. Uses SharpRSync, a managed rsync implementation, for its backend. Open source too, which is always a plus!
I have an application that has to support modifying some registry data depending on the kind of 'installation' that is desired. At present, I have no problem hard-coding it to get elevation and make the changes to the entire local machine, but that is far from nice, as ideally I would also like to support per-user installations. I could hardcode that instead, but then I lose the local-machine stuff. To be precise, the changes in question involve file association changes, COM stuff, etc.
How can I properly support both usage scenarios? Currently I use a set of ON/OFF checkboxes for the variety of associations.
Should I change this meaning on, for example, a MachineInstall file existing in my apps directory, and if not assume User install?
Is it an expected/valid/whatever usecase to say that someone might want to do some things for the entire machine, and some things only for the user? (E.g. mixing of the two.)
Or should I change the entire UI, move away from checkboxes and move to some sort of combobox going 'None/User/Local'? Then again, I think this might have some sort of breakage once you involve multiple users and combinations.
To give an indication, I personally expect the application in question to have its uses for everyone on a computer and as such lean towards the Local-Machine as a 'default', if that makes any sort of difference.
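If it helps to picture the two scopes: the same association data can be written either under HKEY_CURRENT_USER\Software\Classes (per-user, no elevation needed) or HKEY_LOCAL_MACHINE\Software\Classes (per-machine, requires elevation), so a 'None / User / Machine' choice in the UI boils down to selecting the root key. A minimal sketch with a hypothetical extension and ProgID:

    #include <windows.h>

    /* Write a file-association ProgID either per-user or per-machine. */
    static LONG register_prog_id(BOOL perMachine)
    {
        HKEY root = perMachine ? HKEY_LOCAL_MACHINE : HKEY_CURRENT_USER;
        HKEY key;
        LONG rc = RegCreateKeyExW(root,
                                  L"Software\\Classes\\.myext",   /* hypothetical */
                                  0, NULL, 0, KEY_SET_VALUE, NULL, &key, NULL);
        if (rc != ERROR_SUCCESS)
            return rc;            /* e.g. ERROR_ACCESS_DENIED if HKLM without elevation */
        rc = RegSetValueExW(key, NULL, 0, REG_SZ,
                            (const BYTE *)L"MyApp.Document",      /* hypothetical ProgID */
                            sizeof(L"MyApp.Document"));
        RegCloseKey(key);
        return rc;
    }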
I am likely overthinking the matters quite a bit, so any and all input is very much appreciated. :)
P.S.
Now, someone is probably going to say 'do not do all that stuff from your app, do it from the installer instead'. And they probably have a point, but the point here is to allow easy changing of these settings from within the application. To top it off, I am not using .MSI install packages, because they make working with 32/64-bit specific executables a disaster requiring merge modules, spawning other MSIs depending on the situation, and so forth (I forgot the details the last time I dug into it and then dropped the matter). I don't have that knowledge, nor the time to learn all the intricacies of MSI installations, so it is out as far as I am concerned. To boot, my application is perfectly capable of functioning without any of those registry entries being present, and that is by design. In a way, one might compare it to Process Explorer from Sysinternals, which does not require an installer but can be unzipped and take over the task manager etc. without a problem if the user wants, or simply run stand-alone.