Using ReadDirectoryChangesW to read changes to the folder itself (WINDOWS)

From the doc (ReadDirectoryChangesW):
"Retrieves information that describes the changes within the specified directory. The function does not report changes to the specified directory itself."
My question is: What do I use to report changes to the specified directory itself?
I want to capture changes not only to the folder's contents, at any depth, but also to the folder itself: for example, I want to detect when the folder itself has been deleted.
One strategy would be to actually monitor for changes on the parent of the folder I'm really interested in and then use that to generate an event when the folder I'm interested in is deleted. This works but has the potential to generate thousands of 'uninteresting' events.
A second strategy is to have a recursive monitor for everything under the folder I'm actually interested in and a non-recursive monitor on its parent. The non-recursive monitor would then be able to tell me when the real folder of interest is deleted.
The second strategy generates fewer events and is the one I would like to use. BUT: it doesn't work 'in process'. That is, if I monitor the folder of interest recursively (HANDLE A) and its parent non-recursively (HANDLE B), and then delete the folder of interest from the same process, no removal event is generated for it (even though I can verify from a console that the folder no longer exists). My suspicion is that this is because HANDLE A on the folder is still open: even though I included the FILE_SHARE_DELETE flag in the CreateFileW call that gave me HANDLE A, it simply doesn't work.
Note that 'Out of process', i.e. when I delete the folder from within a completely separate process, the above strategy does work.
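To make the parent-side half of the second strategy concrete, here is a minimal sketch of the filtering step only. It assumes the notifications read from the parent's handle have already been decoded into (action, file name) pairs; the function name and that tuple shape are made up for illustration. The two action codes are the FILE_ACTION_* values from winnt.h. Passing FILE_NOTIFY_CHANGE_DIR_NAME as the notify filter on the parent should also keep file-level noise out of that watch.

```python
# Action codes as defined for FILE_NOTIFY_INFORMATION.Action in winnt.h.
FILE_ACTION_REMOVED = 0x00000002
FILE_ACTION_RENAMED_OLD_NAME = 0x00000004


def target_gone(events, target_name):
    """Return True if any (action, file_name) pair says target_name was
    removed or renamed away. Names are relative to the watched parent.

    Hypothetical helper: `events` stands in for already-decoded
    FILE_NOTIFY_INFORMATION records.
    """
    # casefold() only approximates Windows' case-insensitive comparison.
    wanted = target_name.casefold()
    for action, file_name in events:
        if file_name.casefold() != wanted:
            continue  # a sibling of the folder of interest: ignore it
        if action in (FILE_ACTION_REMOVED, FILE_ACTION_RENAMED_OLD_NAME):
            return True
    return False
```

This does nothing about the in-process problem described above; it only shows how few of the parent's events need to be looked at.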
So, what are my options?
Many thanks,
Ben.

Related

Find next file (but not FindNextFile)

If a user opens a file in a program (for example using GetOpenFileNameW, DragQueryFileW, command line argument, or whatever else to get the path, and a subsequent CreateFileW call), is there a way to find the next file in the parent directory of the opened file?
The obvious solution is to cycle through the results from FindNextFileW or NtQueryDirectoryFileEx until the opened file is encountered, and just open the next file.
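For reference, the obvious solution can be sketched in a few lines. This is a portable Python illustration of the enumerate-and-scan approach rather than a direct FindNextFileW loop; the function name is made up, and sorting by case-folded name is an assumption about what "next" should mean, since directory enumeration order is not guaranteed in general.

```python
import os


def next_file(path):
    """Return the path of the file that follows `path` in its directory,
    or None if `path` is the last file or is no longer listed."""
    directory, name = os.path.split(os.path.abspath(path))
    with os.scandir(directory) as entries:
        # Sort explicitly so that "next" is well defined.
        names = sorted((e.name for e in entries if e.is_file()),
                       key=str.casefold)
    try:
        index = names.index(name)  # exact-case match on the stored name
    except ValueError:
        return None  # the file vanished between opening and listing
    if index + 1 == len(names):
        return None
    return os.path.join(directory, names[index + 1])
```

Every call re-reads the whole directory, which is exactly the cost the question is trying to avoid.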
However, this seems undesirable.
First, because these functions use paths (instead of, for example, a handle), the original file is decoupled from the search algorithm, so the original file might not even be encountered in that search. This is not much of an issue (failing in this case is the expected outcome), and it could probably be resolved by temporarily changing the sharing mode, using LockFile or similar (though I would like to avoid that).
Second, this cycling search would have to be repeated every time, because the contents of the directory might have changed (retaining hFindFile does not work, because only FindFirstFileW calls NtQueryDirectoryFileEx and enumerates the contents of the directory). That seems like unnecessary work and might even affect performance (for example, if the directory contains a lot of files).
In theory, any file system has some way of enumerating the files in a directory, meaning there is some ordered data structure of the files' metadata. Getting the next file should only involve going back from the existing file handle to that file's entry, and then getting the next entry from that data structure. So there does not seem to be a fundamental reason why this cannot be done more sanely.
I thought maybe there exists a better way to do this somewhere in WinAPI...
Same question for finding the previous file.

How to remove a directory synchronously in Windows

RemoveDirectory() is documented as only marking a directory for deletion. I have an application where I have to be sure that the directory is actually deleted (because I create a new one with the same name, or delete directories recursively).
The first idea I had was to use GetFileAttributes() to test whether the directory still exists, or to use SHFileOperation() for deletion. But when running a long test, at some point both solutions fail: CreateDirectory() fails.
Is there a solution for this?
This video by Niall Douglas at CppCon 2015 covers the solution in detail, starting at about 7:30.
The idea is to first rename (move) the file or directory to another place (on the same volume), which happens synchronously, and then delete it, which happens asynchronously.
Consider this tree:
C:\Users\me\
    foo\
        bar\
            obsolete.txt
If you try to remove bar after deleting obsolete.txt, it may fail because there can be a delay before obsolete.txt is really deleted.
Instead suppose you first move obsolete.txt to C:\Users\me, and give it a temporary name to ensure it doesn't collide with another obsolete.txt in the directory. Maybe you prefix it with a GUID, like 2DCD7863-456C-4B6C-AD84-C4F5E8009D81_obsolete.txt. Now you can delete the file using that temporary name, and, even if there's a delay before it's really deleted, you know bar is truly empty. You can now delete bar or create a new obsolete.txt in bar without worries of a conflict.
To remove bar (a directory) on the way to deleting foo (the root of the tree you're trying to delete), you play the same game. Move it to the parent of the root, call RemoveDirectory, and then proceed along your merry way knowing that it will eventually be deleted.
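The rename-then-delete game can be sketched as follows. This is a minimal Python illustration, not the code from the talk: the function name, the `staging_dir` parameter (the directory outside the tree that doomed items are moved to, which must be on the same volume) and the UUID-prefix naming are all assumptions made for the example.

```python
import os
import uuid


def remove_via_rename(path, staging_dir):
    """Move a file or empty directory into staging_dir under a unique
    temporary name, then delete it there."""
    path = os.path.normpath(path)
    # Hypothetical naming scheme: a random UUID prefix avoids collisions.
    temp_name = f"{uuid.uuid4()}_{os.path.basename(path)}"
    doomed = os.path.join(staging_dir, temp_name)
    # Once the rename returns, the original name is free to reuse.
    os.rename(path, doomed)
    # The delete itself may finish later if another handle is still open.
    if os.path.isdir(doomed):
        os.rmdir(doomed)
    else:
        os.remove(doomed)
```

For the tree above, calling it for obsolete.txt, then bar, then foo, each time with C:\Users\me as the staging directory, follows the order described in the answer.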
Possible options:
1. Delete the directory and check for its existence afterwards.
   If no handle was open, it is deleted. If a handle is still open, there is another problem. Optionally, you can wait a few ms after each existence check until it disappears.
2. Delete all files inside the directory.
   You mentioned you want to recreate it, so just delete its contents. Doing this allows you to see which files/folders are still open inside the directory.
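The delete-then-poll option might look like this. It is only a sketch: the function name, the timeout and the polling interval are arbitrary choices for illustration, and a False return means something still holds the directory open.

```python
import os
import time


def rmdir_and_wait(path, timeout=2.0, interval=0.01):
    """Remove an empty directory, then poll until it has really
    disappeared. Returns False if it is still there after `timeout`
    seconds."""
    os.rmdir(path)
    deadline = time.monotonic() + timeout
    while os.path.exists(path):
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)  # wait a few ms between existence checks
    return True
```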

How should I mark a folder as processed in a script?

A script shall process files in a folder on a Windows machine and mark it as done once it is finished in order to not pick it up in the next round of processing.
My tendency is to let the script rename the folder to a different name, like adding "_done".
But on Windows, renaming a folder is not possible if some process has the folder or a file within it open. In this setup, there is a minor chance that some user may have the folder open.
Alternatively I could just write a stamp-file into that folder.
Are there better alternatives?
Is there a way to force the renaming anyway, in particular when it is on a shared drive or some NAS drive?
You have several options:
1. Put a token file of some sort in each processed folder and skip the folders that contain said file.
2. Keep track of the last folder processed and only process newer ones (either by timestamp or, since they're numbered sequentially, by sequence number).
3. Rename the folder.
Since you've already stated that other users may already have the folder/files open, we can rule out #3.
In this situation, I'm in favor of option #1. Even though you'll end up with extra files, anyone who needs to figure out which folders have already been processed has a quick, easy way to see it with the naked eye, rather than having to find a counter somewhere in a different file. It's also a bit less code to write, so there are fewer pieces to break.
Option #2 is good in this situation as well (I've used both depending on the circumstances), but I tend to favor it for things that a human wouldn't really need to care about or need to look for very often.
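As a rough illustration of option #1, a token-file check could be as small as this. The marker name `.processed` and both function names are made up for the example; any file name the processing script agrees on would do.

```python
from pathlib import Path

MARKER = ".processed"  # hypothetical token-file name


def unprocessed_folders(root):
    """Return the subfolders of root that do not contain the token file."""
    return sorted(p for p in Path(root).iterdir()
                  if p.is_dir() and not (p / MARKER).exists())


def mark_processed(folder):
    """Drop the token file into a folder once processing has finished."""
    (Path(folder) / MARKER).touch()
```

Creating a new file inside the folder does not require renaming or moving anything a user might have open, which is the reason for preferring it here.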

Windows remembering lower case filename, how to force it to forget?

Here's my problem:
I've got source files I'm publishing (.dita files, publishing using Oxygen) and I need to change capitalization on a lot of them, along with folders and subfolders that they're in. Everything is in source control using SVN.
When I change only an initial cap, say, and leave everything about the filename the same otherwise, Windows "remembers" the lower case name, and that's what gets published, even though the source name is now upper case.
I can even search for the filename, for example Foobar.dita, and the search results will show me "foobar.dita". When I go to that location directly in the file explorer, the file is named Foobar.dita. It's not a duplicate, it's the same file.
What I understand from reading up on this is that Windows isn't case-sensitive, but it "remembers" the filename as one case or the other. So my question is, if I can't force Windows to be case-sensitive, can I somehow force Windows to forget the filename? I've tried deleting it from both Windows and SVN, and recreating it, but it still gets read as lower case when it's initial cap.
If I rename the file, even slightly, it solves the problem, but many of the filenames are just what they need to be, and it's a lot more work to rename them (to think of another good filename) than just to change to initial cap.
UPDATE:
Here's where I read about the "remembering" idea, in response two, the one with 7 recommendations.
To be explicit: I'm not updating from SVN and thus turning it back to lower case, it's upper case in SVN. It appears upper case in the Windows folder.
UPDATE II: This seems to be what I'm up against:
http://support.microsoft.com/kb/100625
In NTFS, you can create unique file names, stored in the same directory, that differ only in case. For example, the following filenames can coexist in one directory on an NTFS volume:
CASE.TXT
case.txt
case.TXT
However, if you attempt to open one of these files in a Win32 application, such as Notepad, you would only have access to one of the files, regardless of the case of the filename you type in the Open File dialog box.
So it sounds like the only answer is rename the files, not just change case.

NTFS Journal USN_REASON_HARD_LINK_CHANGE event

I've written a program that reads the NTFS index and journal similar to what is described here:
http://ejrh.wordpress.com/2012/07/06/using-the-ntfs-journal-for-backups/
And it works fairly well.
In addition to the normal journal events (USN_REASON_CLOSE, USN_REASON_FILE_CREATE, USN_REASON_FILE_DELETE, etc.), I'm receiving an event with reason USN_REASON_HARD_LINK_CHANGE. I'd like to be able to update the directory index according to this event but I can't find any information about it. The only documentation is:
An NTFS file system hard link is added to or removed from the file or
directory. An NTFS file system hard link, similar to a POSIX hard
link, is one of several directory entries that see the same file or
directory.
What does this mean? Where was the hard link created? Or was it removed? How do I get more information about what happened?
I know this is ancient, but I stumbled upon this while researching a related problem. Here's what I found: The hard-links are a complicating factor when reading the USN. You can get journal entries describing change to a single file reference number by way of changes made through any hard-link that's been created. Generally, and to the original question, hard-links are alternative directory entries through which a single file might be accessed. Thus, all the file's characteristics are shared for each link (except for the names and parent file reference numbers). Technically, you can't tell which entry is the original and which is a link.
A subtle difference does exist, and it manifests if you query the master file table (using DeviceIoControl and FSCTL_ENUM_USN_DATA). The query will return only a single representative file regardless of how many links exist. You can query for the links using NtQueryInformationFile with the FileHardLinkInformation class. I think of the entry returned by the MFT query as the main entry and the NtQueryInformationFile-returned items as links. However, the main entry can get deleted and one of the links will get promoted, so it's only a housekeeping thought and little else.
Note that a problem arises when one of the hard-links is moved or renamed. In this case, the journal entries for the rename or move reflect the filename and parent file reference number of the affected link. The problem arises if you ask for only the summary "on-close" records. In that case, you won't ever see the USN_REASON_RENAME_OLD_NAME record, because that USN entry never gets a USN_REASON_CLOSE associated with it. Without this tidbit, you won't be able to easily determine which link's name or location was changed. You have to read the USN journal with ReturnOnlyOnClose set to 0 in READ_USN_JOURNAL_DATA_V0. This is a far chattier query, but without it, you can't accurately associate the change with one link or the other.
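For anyone building the request by hand, here is a sketch of the V0 input buffer with the on-close flag (ReturnOnlyOnClose in the Windows headers) cleared. The structure layout follows READ_USN_JOURNAL_DATA_V0 in winioctl.h; the helper name and the choice of an all-ones reason mask are assumptions for illustration, and the actual DeviceIoControl call is left out.

```python
import ctypes


class READ_USN_JOURNAL_DATA_V0(ctypes.Structure):
    """Input buffer for FSCTL_READ_USN_JOURNAL (V0 layout)."""
    _fields_ = [
        ("StartUsn", ctypes.c_int64),            # USN to start reading from
        ("ReasonMask", ctypes.c_uint32),         # USN_REASON_* bits to return
        ("ReturnOnlyOnClose", ctypes.c_uint32),  # 0 = all records, not just close summaries
        ("Timeout", ctypes.c_uint64),
        ("BytesToWaitFor", ctypes.c_uint64),
        ("UsnJournalID", ctypes.c_uint64),
    ]


def every_record_request(journal_id, start_usn=0):
    """Hypothetical helper: ask for every record, including the
    intermediate ones that never carry the close bit."""
    return READ_USN_JOURNAL_DATA_V0(
        StartUsn=start_usn,
        ReasonMask=0xFFFFFFFF,  # assumption: do not filter by reason
        ReturnOnlyOnClose=0,
        Timeout=0,
        BytesToWaitFor=0,
        UsnJournalID=journal_id,
    )
```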
As always with the USN, I expect you'll need to go through a bit of trial and error to get it to work right. These observations/guesses may, I hope, be helpful:
When the last hard link to a file is deleted, the file is deleted; so if the last hard link has been removed you should see USN_REASON_FILE_DELETE instead of USN_REASON_HARD_LINK_CHANGE. I believe that each reference number refers to a file (or directory, but NTFS doesn't support multiple hard links to directories AFAIK) rather than to a hard link. So immediately after the event is recorded, at least, the file reference number should still be valid, and point to another name for the file.
If the file still exists, you can look it up by reference number and use FindFirstFileNameW and friends to find the current links. Comparing this to the event record in question plus any relevant later events should give you enough information, although if multiple hard links for the same file are deleted and/or created you might not be able to reconstruct the order in which this happened, and if you don't have enough information about the prior state of the file system you might not be able to identify the deleted hard links. I don't know whether that would matter to you or not.
If the file no longer exists, you should still be able to identify it by the USN record in which it was deleted. Again, taking all relevant events into consideration, and with enough information about the prior state, you should be able to reconstruct most of what happened, if not the order.
There is some hope that we can do better than this: the file name and/or ParentFileReference number in the event record might refer to the hard link that was created or deleted, rather than to an arbitrary link to the file. In this case you'll have all the relevant information about the sequence of events except for whether any particular event was a create or a delete, which you should be able to work out by looking at the current state of the file and working backwards through the records.
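If that hope holds, the "working backwards" step could be sketched like this. Everything here rests on assumptions: that the name and parent reference in each USN_REASON_HARD_LINK_CHANGE record identify the affected link, that each such record toggles exactly one link, and that the current link set per file is known (for example from FindFirstFileNameW). The function name and the tuple shapes are made up; only the reason constant comes from the Windows headers.

```python
USN_REASON_HARD_LINK_CHANGE = 0x00010000


def classify_link_changes(records, current_links):
    """Label each hard-link-change record as 'added' or 'removed'.

    records: (file_ref, parent_ref, name, reason) tuples, oldest first.
    current_links: dict mapping file_ref to the set of (parent_ref, name)
    links that exist now.
    """
    state = {ref: set(links) for ref, links in current_links.items()}
    classified = []
    # Walk from newest to oldest, undoing each change as we go.
    for file_ref, parent_ref, name, reason in reversed(list(records)):
        if not reason & USN_REASON_HARD_LINK_CHANGE:
            continue
        links = state.setdefault(file_ref, set())
        link = (parent_ref, name)
        if link in links:
            links.discard(link)  # it exists afterwards, so it was added
            kind = "added"
        else:
            links.add(link)      # it is gone afterwards, so it was removed
            kind = "removed"
        classified.append((file_ref, parent_ref, name, kind))
    classified.reverse()  # report in journal order
    return classified
```

Renames of a link between two records are not handled, so the rename records would need to be folded into the same backwards walk.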
I assume you've already looked for nearby change records that might contain additional information? There isn't, for example, a USN_REASON_RENAME_NEW_NAME record generated when a hard link is created or a USN_REASON_RENAME_OLD_NAME when a hard link is removed? Or paired USN_REASON_HARD_LINK_CHANGE records, one for the file, one for the directory containing the affected hard link to the file? (Wishful thinking, I expect, but it wouldn't hurt to look!)
For testing purposes, you can create hard links with the mklink command.
