According to the MSDN documentation, transactional NTFS doesn't seem to allow one to block on opening a file for write; instead, the open operation fails with ERROR_SHARING_VIOLATION. I'd like to block on writes instead. How can I do this?
Ideally I'd like the following properties for the solution:
Works over a network share (so no local named mutex handles)
Auto-releases if the owning process dies
Doesn't require a separate file (named streams are OK)
Allows the locking wait to have a timeout (or be cancellable from another thread or APC)
Does anyone have some experience with a locking method that works with transactional NTFS with these properties?
I'm not sure I understand exactly what you're asking. TXF doesn't work across SMB shares.
My knee-jerk suggestion would be that if you were already using files for this before adopting TXF, you could continue to use a file for it in non-transacted mode...
FYI, the reason TXF fails these transactional lock conflicts is to help applications avoid deadlocks.
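A minimal sketch of that non-transacted fallback, assuming the lock target is a named stream on the data file (data.dat:lock is a made-up name) opened outside any transaction. The retry loop supplies the timeout, and closing the handle (or the owning process dying) releases the lock; this is only an illustration, not something TXF itself provides.

    /* Sketch: acquire an advisory "lock" by opening a non-transacted handle
     * with no sharing, retrying until a timeout expires.
     * The lock target ("data.dat:lock", a named stream) is an assumption
     * made for illustration; a separate lock file would work the same way. */
    #include <windows.h>

    HANDLE AcquireLockHandle(const wchar_t *lockPath, DWORD timeoutMs)
    {
        DWORD start = GetTickCount();
        for (;;) {
            HANDLE h = CreateFileW(lockPath,
                                   GENERIC_WRITE,
                                   0,                      /* no sharing: acts as the lock */
                                   NULL,
                                   OPEN_ALWAYS,
                                   FILE_ATTRIBUTE_NORMAL,
                                   NULL);
            if (h != INVALID_HANDLE_VALUE)
                return h;                                  /* lock acquired */
            if (GetLastError() != ERROR_SHARING_VIOLATION)
                return INVALID_HANDLE_VALUE;               /* real error */
            if (GetTickCount() - start >= timeoutMs)
                return INVALID_HANDLE_VALUE;               /* timed out */
            Sleep(50);                                     /* back off and retry */
        }
    }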
Related
We have a series of applications running on Windows that use file locking to protect concurrent access to their data (shared files on a network drive).
Sometimes one of these processes fails to release a lock and everything freezes until that process is killed. Finding out who holds the lock is not always trivial (an admin has to go to the file server and check the open network files, then go to the workstation, find the process, and kill it).
We have a message queue system between the applications that is serviced by a background thread, so, in theory, it would be possible to send a message to every process asking whether it holds a lock on a specific file and, if it does, maybe take an action (such as killing the process if the lock has been held for longer than a few seconds).
So, the question is: is there a way for a thread to know if a different thread of the same process holds a lock (LockFile) against a given file?
I'm not sure if there is an API to query this, but a process can query itself with the LockFileEx function:
A shared lock can overlap an exclusive lock if both locks were created using the same file handle. When a shared lock overlaps an exclusive lock, the only possible access is a read by the owner of the locks.
The other thread could query and see if it can get shared access.
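A minimal sketch of that probe, assuming the probing thread can reach the same HANDLE the exclusive lock was taken on; note the result is not fully conclusive, since success only means no other handle holds a conflicting lock on the range.

    /* Sketch: probe whether a byte range is locked by a *different* handle.
     * Assumes hFile is the same handle the exclusive lock would have been
     * taken on. Returns TRUE if a non-blocking shared lock could be taken,
     * i.e. either we own the exclusive lock on this handle, or no one does. */
    #include <windows.h>

    BOOL CanGetSharedLock(HANDLE hFile, DWORD offset, DWORD length)
    {
        OVERLAPPED ov = {0};
        ov.Offset = offset;

        if (LockFileEx(hFile, LOCKFILE_FAIL_IMMEDIATELY, 0, length, 0, &ov)) {
            /* Got the shared lock; release it again, we only wanted to probe. */
            UnlockFileEx(hFile, 0, length, 0, &ov);
            return TRUE;
        }
        return FALSE;   /* another handle/process holds a conflicting lock */
    }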
If you control the file format you could write the computer name and process id to the start of the file every time you take the lock. File memory mappings can view the file contents even while it is locked.
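A rough sketch of that stamping idea, written through a memory mapping so it is visible even while the byte-range lock is held. The header layout is purely an assumption made up for this example, and hFile must have been opened with GENERIC_READ | GENERIC_WRITE.

    /* Sketch: record who currently holds the lock in a small header at the
     * start of the file, via a memory-mapped view. */
    #include <windows.h>

    typedef struct {
        char  computerName[64];
        DWORD processId;
    } LockOwnerStamp;

    void StampLockOwner(HANDLE hFile)   /* hFile needs read+write access */
    {
        HANDLE hMap = CreateFileMappingW(hFile, NULL, PAGE_READWRITE,
                                         0, sizeof(LockOwnerStamp), NULL);
        if (!hMap) return;

        LockOwnerStamp *stamp = (LockOwnerStamp *)
            MapViewOfFile(hMap, FILE_MAP_WRITE, 0, 0, sizeof(LockOwnerStamp));
        if (stamp) {
            DWORD len = sizeof(stamp->computerName);
            GetComputerNameA(stamp->computerName, &len);
            stamp->processId = GetCurrentProcessId();
            UnmapViewOfFile(stamp);
        }
        CloseHandle(hMap);
    }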
While writing a driver, I came across the issue described below.
Consider a multithreaded application accessing the same device file through the same FD. Between the calls to OPEN and RELEASE, some resources (say, a mutex) are shared by the thread group. These resources are used during the READ/WRITE calls and are eventually given up or destroyed during RELEASE.
If one thread is accessing the resource during READ/WRITE and another thread simultaneously invokes RELEASE by calling close, how does the VFS ensure that RELEASE is not called while at least one thread is still inside READ, WRITE, or the like? What mechanism handles this protection?
The kernel layer above the device drivers keeps track of how many references to an open file exist and does not call the release function until all of those references have been closed. This is somewhat documented in LDD3: http://tjworld.net/books/ldd3/#TheReleaseMethod
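The same reference-counting behaviour can be illustrated from userspace, since descriptors are just references to the open file; /dev/yourdev below is a placeholder device node.

    /* Userspace sketch of the reference counting: closing one descriptor
     * does not release the open file while another reference still exists. */
    #include <fcntl.h>
    #include <unistd.h>

    int main(void)
    {
        char buf[16];

        int fd1 = open("/dev/yourdev", O_RDONLY);   /* driver's open() runs once */
        int fd2 = dup(fd1);                         /* second reference, no open() */

        close(fd1);                                 /* release() is NOT called yet */
        read(fd2, buf, sizeof(buf));                /* still works via fd2 */

        close(fd2);                                 /* last reference gone:
                                                       release() runs now */
        return 0;
    }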
AFAIK, OS X is a BSD derivative, which doesn't have actual mandatory file locking. If so, it seems I have no way to prevent write access from other programs, even while I am writing a file.
How can I guarantee file integrity in such an environment? I don't care about integrity after my program has exited, because that's then the user's responsibility. But I think I need at least some kind of guarantee while my program is running.
How do other programs, especially database programs, guarantee file content integrity without mandatory locking? If there's a common technique or recommended practice, please let me know.
Update
I am looking at this for the data layer of a GUI application for non-engineer users. Currently, my program has these constraints:
The data is too big to fit in RAM, and even too big to copy temporarily. So it cannot be read/written atomically and has to be used directly from disk while the program is running.
It is a long-running professional GUI content editor used by people who are not engineers. Even so, those users can still access the file simultaneously with Finder or other programs, so they can accidentally delete or overwrite the file currently in use. The problem is that users don't understand what is actually happening, and they expect the program to maintain file integrity at least while it is running.
I think the only way to guarantee the file's integrity in the current situation is:
Open the file with a system-wide exclusive mandatory lock. Now the file is the program's responsibility.
Check its integrity.
Use the file like external memory while the program is running.
Write out all the modifications.
Unlock. Now the file is the user's responsibility.
Because OS X lacks a system-wide mandatory lock, I don't know how to do this. Still, I believe there's a way to achieve this kind of file integrity that I just don't know about, and I want to know how everybody else handles it.
This question is not about my own programming errors; that's a separate problem. The current problem is protecting data from other programs that don't respect advisory file locks. Also, users are usually root and the program runs as the same user, so ordinary Unix file permissions are no help.
You have to look at the problem that you are trying to actually solve with mandatory locking.
File content integrity is not guaranteed by mandatory locking: unless you keep your file locked 24/7, integrity will still depend on all processes observing file format/access conventions (and can still fail due to hard drive errors etc.).
What mandatory locking protects you against is programming errors that (by accident, not out of malice) fail to respect the proper locking protocols. At the same time, that protection is only partial, since failure to acquire a lock (mandatory or not) can still lead to file corruption. Mandatory locking can also reduce possible concurrency more than needed. In short, mandatory locking provides more protection than advisory locking against software defects, but the protection is not complete.
One solution to the problem of accidental corruption is to use a library that is aggressively tested for preserving data integrity. One such library (there are others) is SQLite (see also here and here for more information). On OS X, Core Data provides an abstraction layer over SQLite as the data store. Obviously, such an approach should be complemented by replication/backup so that you have protection against other causes of data corruption where the storage layer cannot help you (media failure, accidental deletion).
Additional protection can be gained by restricting file access to a database and allowing access only through a gateway (such as a socket or messaging library). Then you will just have a single process running that merely acquires a lock (and never releases it). This setup is fairly easy to test; the lock is merely to prevent having more than one instance of the gateway process running.
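The single-instance guard for such a gateway is the easy part; a minimal sketch using flock() on an assumed lock-file path:

    /* Sketch: ensure only one gateway process runs, by holding an exclusive
     * flock() on a lock file for the life of the process. The path is an
     * assumption for illustration; the lock is released automatically if
     * the process dies. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/file.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("/tmp/mygateway.lock", O_CREAT | O_RDWR, 0644);
        if (fd < 0 || flock(fd, LOCK_EX | LOCK_NB) != 0) {
            fprintf(stderr, "another gateway instance is already running\n");
            return EXIT_FAILURE;
        }

        /* ... serve requests over a socket; never release the lock ... */
        pause();
        return 0;
    }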
One simple solution would be to hide the file from the user until your program is done using it.
There are various ways to hide files. It depends on whether you're modifying an existing file that was previously visible to the user or creating a new file. Even if modifying an existing file, it might be best to create a hidden working copy and then atomically exchange its contents with the file that's visible to the user.
One approach to hiding a file is to create it in a location which is not normally visible to users. (That is, it's not necessary that the file be totally impossible for the user to reach, just out of the way so that they won't stumble on it.) You can obtain such a location using -[NSFileManager URLForDirectory:inDomain:appropriateForURL:create:error:] and passing NSItemReplacementDirectory and NSUserDomainMask for the first two parameters. See the -replaceItemAtURL:withItemAtURL:backupItemName:options:resultingItemURL:error: method for how to atomically move the file into its final place.
You can set a file to be hidden using various APIs. You can use -[NSURL setResourceValue:forKey:error:] with the key NSURLIsHiddenKey. You can use the chflags() system call to set UF_HIDDEN. The old Unix standby is to use a filename starting with a period ('.').
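For the chflags() route, a small sketch; the existing flags are preserved, and clearing UF_HIDDEN again when you are done is up to you.

    /* Sketch: hide a working file from Finder by setting the UF_HIDDEN flag. */
    #include <sys/stat.h>
    #include <unistd.h>

    int hide_file(const char *path)
    {
        struct stat st;
        if (stat(path, &st) != 0)
            return -1;
        /* keep existing flags, add UF_HIDDEN */
        return chflags(path, st.st_flags | UF_HIDDEN);
    }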
Here are some details about this topic:
https://developer.apple.com/library/ios/documentation/FileManagement/Conceptual/FileSystemProgrammingGuide/FileCoordinators/FileCoordinators.html
Now I think the basic policy on OSX is something like this.
Always allow access by any process.
Always be prepared for shared data file mutation.
Get notified when another process mutates the file content, and respond appropriately. For example, you can display an error to end users if another process is trying to access the file; users will then learn that's bad and won't do it again.
Usually, when an application writes to one of its files on disk, the file's modified timestamp changes.
Sometimes, though, the modified timestamp does not change after a write; in my case the writing is done by an application written in ProvideX (a Business Basic derivative, I believe). A program like MyTrigger will not pick up on the write operation either, but Sysinternals Process Monitor does log the disk activity.
It seems obvious that there are different ways to ask Windows to perform write operations, and the request could then be hooked or logged in various ways as well.
I need to be able to hook the write operations coming from the ProvideX application. Any pointers on the different ways windows writes to disk, and the type of hooks available for them would be greatly appreciated.
Thanks
A user-mode process can write to a file either with the WriteFile API function or via MMF, the memory-mapped file API (CreateFileMapping/MapViewOfFile/write to the memory block). Maybe your application goes the MMF way. MMF writes to files very differently from the WriteFile API, but both lead to the same end point: an IRP sent to the file system driver. A file system filter driver (such as the one used by the Sysinternals tools) can track write requests at that IRP level. It is technically possible to distinguish write operations initiated by MMF from those initiated by WriteFile, as different IRPs are sent (cached and non-cached writing is involved). It seems that the directory change monitoring function in Windows tracks only one IRP type, and this causes MyTrigger to miss the change.
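For reference, a minimal sketch of that MMF write path, as opposed to a plain WriteFile call; the file name and sizes are placeholders, and the mapping extends the file to 4096 bytes if it is smaller.

    /* Sketch: write to a file through a memory-mapped view rather than
     * WriteFile. This is the code path that change-notification-based
     * monitoring tools tend to miss. */
    #include <windows.h>
    #include <string.h>

    void WriteViaMapping(void)
    {
        HANDLE hFile = CreateFileW(L"example.dat", GENERIC_READ | GENERIC_WRITE,
                                   0, NULL, OPEN_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
        if (hFile == INVALID_HANDLE_VALUE) return;

        HANDLE hMap = CreateFileMappingW(hFile, NULL, PAGE_READWRITE, 0, 4096, NULL);
        if (hMap) {
            char *view = (char *)MapViewOfFile(hMap, FILE_MAP_WRITE, 0, 0, 4096);
            if (view) {
                memcpy(view, "hello", 5);          /* modifies the file contents */
                FlushViewOfFile(view, 4096);       /* push dirty pages to disk */
                UnmapViewOfFile(view);
            }
            CloseHandle(hMap);
        }
        CloseHandle(hFile);
    }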
This is intended to be a lightweight generic solution, although the problem is currently with an IIS CGI application that needs to log a timeline of events (second resolution) to troubleshoot a situation where a later request ends up in the MySQL database BEFORE the earlier request!
So it boils down to logging debug statements to a single text file.
I could write a service that manages a queue as suggested in this thread:
Issue writing to single file in Web service in .NET
but deploying the service on each machine is a pain
or I could use a global mutex, but this would require each instance to open and close the file for each write
or I could use a database, which would handle this for me, but it doesn't make sense to use a database like MySQL to troubleshoot a timeline issue with itself. SQLite is another possibility, but this thread
http://www.perlmonks.org/?node_id=672403
suggests that it is not a good choice either.
I am really looking for a simple approach, something as blunt as writing to individual files for each process and consolidating them occasionally with a scheduled app. I do not want to over-engineer this, nor spend a week implementing it. It is only needed occasionally.
Suggestions?
Try the simplest solution first: each write to the log opens and closes the file. If you experience problems with this, which you probably won't, look for another solution.
You can use file locking. Lock the file for writing, write the message, unlock.
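A minimal sketch of that combination, opening, locking, appending, and closing on every log call; the UNC path is a placeholder.

    /* Sketch: each process opens the shared log, takes an exclusive
     * byte-range lock, appends one line, and closes the handle again. */
    #include <windows.h>
    #include <string.h>

    void LogLine(const char *line)
    {
        HANDLE h = CreateFileW(L"\\\\server\\share\\debug.log",
                               GENERIC_WRITE,
                               FILE_SHARE_READ | FILE_SHARE_WRITE,
                               NULL, OPEN_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
        if (h == INVALID_HANDLE_VALUE) return;

        OVERLAPPED ov = {0};
        DWORD written;
        /* lock the whole file so concurrent writers don't interleave */
        if (LockFileEx(h, LOCKFILE_EXCLUSIVE_LOCK, 0, MAXDWORD, MAXDWORD, &ov)) {
            SetFilePointer(h, 0, NULL, FILE_END);   /* append */
            WriteFile(h, line, (DWORD)strlen(line), &written, NULL);
            UnlockFileEx(h, 0, MAXDWORD, MAXDWORD, &ov);
        }
        CloseHandle(h);
    }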
My suggestion, to preserve performance, is to think about asynchronous logging. Why not send your log data over UDP to a service listening on a port, and have that service write to the log file?
I would also suggest some kind of central logger that can be called by each process asynchronously. Whether the communication is UDP or RPC or something else is an implementation detail.
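A rough sketch of the sending side of such a scheme; the host (127.0.0.1), port (5140), and message format are assumptions for illustration.

    /* Sketch: fire-and-forget a log line over UDP to a central logger. */
    #include <winsock2.h>
    #include <ws2tcpip.h>
    #include <string.h>
    #pragma comment(lib, "ws2_32.lib")

    void UdpLog(const char *line)
    {
        WSADATA wsa;
        if (WSAStartup(MAKEWORD(2, 2), &wsa) != 0) return;

        SOCKET s = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
        if (s != INVALID_SOCKET) {
            struct sockaddr_in dst = {0};
            dst.sin_family = AF_INET;
            dst.sin_port = htons(5140);
            inet_pton(AF_INET, "127.0.0.1", &dst.sin_addr);

            sendto(s, line, (int)strlen(line), 0,
                   (struct sockaddr *)&dst, sizeof(dst));
            closesocket(s);
        }
        WSACleanup();
    }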
Even though it's an old post, has anyone got an idea why not to use the following approach:
Creating/opening a file with a share mode of FILE_SHARE_WRITE.
Having a named global mutex, and opening it.
Whenever a file write is desired, lock the mutex first, then write to the file.
Any input?
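For concreteness, a minimal sketch of that pattern, with placeholder names; note that a named mutex is per-machine, so this serialises writers on one host but not across a network share.

    /* Sketch: every writer opens the file with FILE_SHARE_WRITE and
     * serialises its writes with a global named mutex. */
    #include <windows.h>
    #include <string.h>

    void WriteUnderMutex(const char *data)
    {
        HANDLE hMutex = CreateMutexW(NULL, FALSE, L"Global\\MyFileWriteMutex");
        HANDLE hFile  = CreateFileW(L"shared.dat", GENERIC_WRITE,
                                    FILE_SHARE_READ | FILE_SHARE_WRITE,
                                    NULL, OPEN_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
        if (!hMutex || hFile == INVALID_HANDLE_VALUE) {
            if (hMutex) CloseHandle(hMutex);
            if (hFile != INVALID_HANDLE_VALUE) CloseHandle(hFile);
            return;
        }

        WaitForSingleObject(hMutex, INFINITE);      /* lock the mutex first */
        DWORD written;
        SetFilePointer(hFile, 0, NULL, FILE_END);
        WriteFile(hFile, data, (DWORD)strlen(data), &written, NULL);
        ReleaseMutex(hMutex);

        CloseHandle(hFile);
        CloseHandle(hMutex);
    }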