My program creates a file, writes to it, closes it, and then renames it. For one customer, the rename often fails with a sharing violation, and I have been unable to reproduce the issue.
The program is asynchronous and multithreaded. The create and write are guaranteed to have completed by the time of the close and rename, but the close and rename may happen in either order because they run in different threads.
The customer assures me that no antivirus or backup programs are installed, and we have tried with Windows Search disabled.
When the close happens before or after the rename everything works (the file is opened with shared read+write+delete flags). However, when they are happening very close in time it sometimes fails. When running with ProcessMonitor, the error does not occur.
I know that the rename is made up of several file operations (open, set information, and close, at least), so I assume that the file close can be interleaved with the file rename, which seems to be at the heart of the problem.
I am going to be able to work around the issue by ensuring that the file is closed after the rename. But I don't understand exactly what causes the sharing violation and I would like to know more why this is an issue. Can anyone give me more information on what happens?
I'm using Transactional NTFS to atomize multiple writes to several files.
The problem is that after commit, I may not be able to reopen a file,
perhaps because of a race condition.
The sequence of events is:
NTFS transaction is created with CreateTransaction
Files are opened with CreateFileTransacted
Writes are done to the files
Files are closed with CloseHandle
Transaction is committed with CommitTransaction
Files are reopened with CreateFile for read/write
The last step sometimes fails with error code 3:
ERROR_PATH_NOT_FOUND - The system cannot find the path specified.
When re-executing the program, the file is then found.
This happens rarely, but in a completely random manner, meaning not always
when reopening the same file.
My theory is that if terminating the transaction by Windows takes a long
time, the files are not available for opening in read/write mode until
the transaction terminates. My program then fails when trying to open
my own files in non-transaction mode.
I think that to avoid this problem, I need to wait for the transaction
to complete before reopening the files.
However, I have not found any documented method for doing that.
No clever answers, so I had to implement my own dummy one:
If an I/O error happens on opening a file that was just closed,
the solution was to loop on opening several times while in-between calling
Sleep() to release the CPU, before deciding that a catastrophic error had occurred.
Dummy solution, but it solved the problem.
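The retry loop described above can be sketched roughly as follows. This is a Python illustration rather than the original Win32 code; the function name `open_with_retry` and its `attempts` and `delay` parameters are made up for the example, and the right values depend on how long the transaction takes to settle.

```python
import time


def open_with_retry(path, mode="r+b", attempts=5, delay=0.2):
    """Try to open path, sleeping between attempts; re-raise the last error."""
    for attempt in range(1, attempts + 1):
        try:
            return open(path, mode)
        except OSError:
            if attempt == attempts:
                raise  # still failing after all attempts: give up
            time.sleep(delay)  # release the CPU, then try again
```

Only after the final attempt fails is the error treated as a real one and passed on to the caller.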
While a file is being loaded with load/require, is it locked against writing?
If not, how can I exclusively lock the file against writing during load/require?
File#flock is probably the tool to use, but I don't know the answer to the first question, nor how to combine it with load/require.
The only protection an open file gives you is against deletion: if the file is deleted (technically, unlinked from the filesystem and orphaned), you can still read its contents through the open handle. Closing the file forfeits any access to it from that point on. That's how it works on POSIX-type systems in any case; Windows may be different.
There's nothing to prevent another process from over-writing part of the file or truncating it while your process is trying to do its thing.
Remember that File#flock is simply a polite way of requesting a lock. Unless the other process that's about to manipulate the file is equally polite and checks for the lock, you have no guarantees about the state of your file. Processes are free to ignore the lock and mangle your file without warning.
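To make the "polite" nature of the lock concrete, here is a small POSIX-only sketch. It uses Python's `fcntl.flock` instead of Ruby's File#flock, on the assumption that both wrap the same underlying flock call. The helper names `try_exclusive_lock` and `demo` are invented for the illustration.

```python
import fcntl


def try_exclusive_lock(f):
    """Return True if an exclusive advisory lock was obtained without blocking."""
    try:
        fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return True
    except BlockingIOError:
        return False


def demo(path):
    with open(path, "w") as holder:
        holder_locked = try_exclusive_lock(holder)
        # A polite second opener asks for the lock and is refused.
        with open(path, "r+") as polite:
            polite_locked = try_exclusive_lock(polite)
        # An impolite writer never asks, and its write goes through anyway.
        with open(path, "a") as rude:
            rude.write("mangled")
    with open(path) as f:
        return holder_locked, polite_locked, f.read()
```

The holder gets the lock and the polite opener is turned away, yet the file still ends up containing what the impolite writer put there.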
The only way to be sure nobody touches your file is to copy it to a private /tmp directory, test that the thing copied correctly, and read it in from there. That's an extremely paranoid thing to do so I'd hope you have a compelling reason before going down that road.
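A rough sketch of that paranoid approach, in Python for illustration. `private_copy` and `sha256_of` are hypothetical helper names. The check only detects a source that differs from the copy once copying has finished; it cannot prove the source was never touched midway.

```python
import hashlib
import os
import shutil
import tempfile


def sha256_of(path):
    """Hash a file in chunks so large files need not fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def private_copy(src):
    """Copy src into a fresh private directory and verify the copy."""
    private_dir = tempfile.mkdtemp()  # accessible only to the creating user
    dst = os.path.join(private_dir, os.path.basename(src))
    shutil.copyfile(src, dst)
    if sha256_of(src) != sha256_of(dst):
        raise OSError("source changed while copying: " + src)
    return dst
```

The caller then reads from the returned path instead of the original.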
If you can control all the processes that access your file and make them well-behaved citizens and use a consistent locking mechanism for the file you'll probably be fine. If that's not the case you may want to have a master process that grants access to the files on an exclusive basis using some kind of IPC signalling.
In a couple of scripts that I use, I have an intermittent problem.
Sometimes the script fails when trying to delete a file. According to the error log, this is because the file is being accessed by another process. I'm guessing that Windows has not had time to release the file after the previous operation on it ended.
What amount of time would be a good guesstimate after which windows should have had time to release the file again?
If the Windows app is done working with the file it should be closed instantly, because presumably they closed their file handles. There is no delay in time to unlock a file after a file close operation.
If a program exits without closing its file handles, Windows will free them for it (just not instantly). Usually it doesn't take long, but it can be any amount of time; I haven't seen it take longer than a couple of seconds. Proper cleanup should still be done to avoid the file staying locked.
It's also worth mentioning that not all programs open files in a locked way. A program can open a file while specifying what type of access it would like to give other processes, and it can also lock portions of the file. It may open the file with full read/write permissions for other processes.
If you have no control over the process that is not closing its file handles, but you need to execute it, you could write some kind of loop to keep trying the file for a few seconds.
As another user has posted, it should be done instantly if the file has been closed correctly, with an indeterminate delay until the OS sorts it out otherwise...
Always, always dispose of resources correctly.
I have a VB6 program running on Windows 7. It copies a large number of files, and sometimes FileCopy fails with an access violation (roughly once every 60 to 500 files).
I cannot reproduce it with a single file; the problem only happens during such mass-copying operations.
It makes no difference, if source/target are on hard disks, network shares or CD-ROMs.
What could trigger this problem?
EDIT: My question might be a little bit convoluted, so here's some more data:
Run 1:
Start copying 5.000 files
Access violation on file #983
Access violation on file #1437
Access violation on file #1499
Access violation on file #2132
Access violation on file #3456
Access violation on file #4320
Done
Run 2:
Start copying 5.000 files
Access violation on file #60
Access violation on file #3745
Done
Observations
The affected files are always different
The number of affected files tends to decrease if the same file batch is copied multiple times in succession.
Running as Administrator makes no difference
The application has read/write access to all necessary file system objects
This problem happens on Windows 7 workstations only!
Best guess: Is it possible that another user/application is using the specified file at the time the process is running (anti-virus scanner, Win7 search indexing tool, Windows Defender, etc.)? You might try booting the machine in safe mode to eliminate any of the background services/apps, then run the process and see.
Is there any consistency in the file types or size of the files causing the issue?
Is the machine low on resources? RAM/Disk Space
You said it occurs on Win7 – is it multiple Win7 machines or just one? (This helps to rule out system resources vs. software/OS.)
Any hints from the event viewer (control panel > admin tools) – doubtful
Does the process take a long time to complete? If you can take the performance hit you might look at destroying and recreating the FSO object after every copy or every X files to make sure there isn’t some odd memory leak issue with Win7/VB6.
Not necessarily a recommended solution, but if all else fails you could handle that error, save the files that trigger it in a dictionary/collection, and loop through the process again with those files when done. No guarantee it wouldn’t happen again.
Not enough information (as you probably know). Do you log the activity? If not, it's a good place to start. Knowing whether certain files are the problem, and if the issue is repeatable, can help narrow it down.
In your case I would also trap (and log) all errors and retry N times after waiting N seconds. You could be trying to copy in-use files locked by another process, and a retry may allow time for that lock to go away.
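A minimal sketch of the log-and-retry idea, written in Python rather than VB6 for illustration. The names `copy_with_retry` and `copy_batch` and the default attempt count and delay are assumptions, not part of the original program.

```python
import logging
import shutil
import time

log = logging.getLogger("copy")


def copy_with_retry(src, dst, attempts=3, delay=1.0):
    """Return True once the copy succeeds, False after the last failed attempt."""
    for attempt in range(1, attempts + 1):
        try:
            shutil.copyfile(src, dst)
            return True
        except OSError as exc:
            # Log every failure so patterns (file, time, error) become visible.
            log.warning("copy %s -> %s failed (attempt %d/%d): %s",
                        src, dst, attempt, attempts, exc)
            if attempt < attempts:
                time.sleep(delay)
    return False


def copy_batch(pairs, **retry_options):
    """Copy every (src, dst) pair; return the pairs that never succeeded."""
    return [(s, d) for s, d in pairs if not copy_with_retry(s, d, **retry_options)]
```

The list returned by `copy_batch` can be fed back in for another pass later, and the log shows whether the same files keep failing.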
Really, more data is the key, and logging is the way to get it.
Is there any chance your antivirus program or some indexer is getting in the way?
Try creating a procmon trace while reproducing the error and see what is actually failing. With the trace you can see whether another program is causing the issue, or whether your app is trying to write somewhere it shouldn't (incorrect permissions) or can't (a temp/scratch directory without enough space).
Check out the presentations linked to on the procmon page or Mark Russinovich's blog for some cool examples of using this tool to solve various Windows/application mysteries.
Is there a hidden/system file in the directory that is potentially blocking it?
Does running the VB6 App with right-click "Run As Administrator" make a difference?
Is the point where it dies at the max # of files in the directory? e.g. Are you sure the upper limit on whatever loop structure you are using in VB6 is correct (Count vs count -1)?
UNIX file-locking is dead-easy: The operating system assumes that you know what you are doing and lets you do what you want:
For example, if you try to delete a file which another process has opened, the operating system will usually let you do it. The original process still keeps its file handles until it terminates, at which point the file system will quietly recycle the disk resources. No fuss, that's the way I like it.
How different things are on Windows: If I try to delete a file which another process is using, I get an operating-system error. The file is untouchable until the original process releases its lock on the file. That was great back in the single-user days of MS-DOS, when any locking process was likely to be on the same computer that contained the files; on a network, however, it's a nightmare:
Consider what happens when a process hangs while writing to a shared file on a Windows file-server. Before the file can be deleted we have to locate the computer and ID the process on that computer which originally opened the file. Only then can we kill the process and delete our unwanted file.
What a nuisance!
Is there a way to make this better? What I want is for file-locking on Windows to behave like file-locking in UNIX. I want the operating system to just let me do what I want because I'm in charge and I know what I'm doing...
...so can it be done?
No. Windows is designed for the "average user", that is people who don't understand anything about a computer. Therefore, the OS tries to be smart to avoid PEBKACs. To quote Bill Gates: "There are no issues with Windows that any number of people want to be fixed." Of course, he knows that 99.9999% of all Windows users can't tell whether the program just did something odd because of them or the guy who wrote it.
Unix was designed when the world was more simple and anyone close enough to a computer to touch it, probably knew how to assemble it from dirty sand. Therefore, the OS usually lets you do what you want because it assumes that you know better (and if you didn't, you will next time).
Technical answer: Unix allocates an "i-node" when you create a file. I-nodes can be shared between processes. If two processes create the same file (that is, two processes call create() with the same path), then you end up with two i-nodes. This is by design. It allows for a fancy security feature: You can create files which no one can open but yourself:
1. Open a file
2. Delete it (but keep the file handle)
3. Use the file any way you like
4. Close the file
After step #2, the only process in the universe who can access the file is the one who created it (unless you want to read the hard disk block by block). The OS will keep the data alive until you either close the file or your process dies (at which time Unix will clean up after you).
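The four steps can be tried out with a short POSIX-only sketch. It is in Python for illustration, and `use_unlinked_file` is a made-up name. On Windows the unlink would normally fail while the handle is open.

```python
import os


def use_unlinked_file(path, payload):
    """Open, delete, use, and close a file; report what was observed."""
    f = open(path, "w+b")           # 1. open (and create) the file
    os.unlink(path)                 # 2. delete its name; the handle stays valid
    visible = os.path.exists(path)  # the path is gone for everyone else
    f.write(payload)                # 3. use the file any way you like
    f.seek(0)
    data = f.read()
    f.close()                       # 4. close: the OS can now free the storage
    return visible, data
```

After the unlink the path no longer exists, yet the data written afterwards can still be read back through the open handle.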
This design is the foundation of all Unix filesystems. The Windows file system NTFS works much the same way, but the high-level API is different. Many applications open files in exclusive mode, which prevents anyone, even backup programs, from reading the file. This is even true for applications which just display information, like PDF viewers.
That means you'll have to fix all the Windows applications to achieve the desired effect. If you have access to the source, you can create a file in a shared mode. That would allow other processes to access it at the same time but then, you will have to check before every read/write if the file still exists, whether someone has made changes, etc.
According to MSDN, you can pass the share-mode flag FILE_SHARE_DELETE in the third parameter of CreateFile() (dwShareMode), which:
Enables subsequent open operations on a file or device to request delete access.
Otherwise, other processes cannot open the file or device if they request delete access.
If this flag is not specified, but the file or device has been opened for delete access, the function fails.
Note Delete access allows both delete and rename operations.
http://msdn.microsoft.com/en-us/library/aa363858(VS.85).aspx
So if you can control your applications, you can use this flag.
Note that Process Explorer allows force-closing of file handles (for processes local to the box on which you are running it) via Handle -> Close Handle.
Unlocker purports to do a lot more, and provides a helpful list of other tools.
Also deleting on reboot is an option (though this sounds like not what you want)
That doesn't really help if the hung process still has the handle open. It won't release the resources until that hung process releases the handle. But anyway, in Windows it is possible to force close a file out from under a process that's using it. Process Explorer from sysinternals.com will let you look at and close handles that a process has open.