Find next file (but not FindNextFile) - windows

If a user opens a file in a program (for example using GetOpenFileNameW, DragQueryFileW, command line argument, or whatever else to get the path, and a subsequent CreateFileW call), is there a way to find the next file in the parent directory of the opened file?
The obvious solution is to cycle through the results from FindNextFileW or NtQueryDirectoryFileEx until the opened file is encountered, and just open the next file.
However, this seems undesirable.
First, because these functions use paths (instead of, for example, a handle), the original file is decoupled from the search algorithm, so the original file might not even be encountered in that search. This is not much of an issue (as failing in this case is the expected outcome), and it could probably be resolved by (temporarily) changing the sharing mode, using LockFile or similar (though I would like to avoid that).
Second, this cycling search would have to be done every time, because the contents of the directory might have changed (retaining hFindFile does not work, because only FindFirstFileW calls NtQueryDirectoryFileEx and enumerates the contents of the directory). This seems like unnecessary work and might even affect performance (for example, if the directory contains a lot of files).
In theory, any file system has some way of enumerating the files in a directory, meaning there is some ordered data structure of the files' metadata. Getting the next file should only involve going back from the existing file handle to that file's entry, and then getting the next entry from that data structure. So there does not seem to be a fundamental reason why this cannot be done more sanely.
I thought maybe there exists a better way to do this somewhere in WinAPI...
Same question for finding the previous file.
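For reference, the cycling search described above can be sketched in a few lines. This is only an illustration in Python rather than WinAPI; the function name neighbor_file and the case-insensitive name ordering are assumptions, and the file system's own enumeration order may differ.

```python
import os

def neighbor_file(path, step=1):
    """Return the file `step` entries after `path` in its directory, or None.

    Assumption: entries are ordered by a case-insensitive name sort.
    """
    directory = os.path.dirname(os.path.abspath(path))
    target = os.path.basename(path)
    # Enumerate the directory afresh on every call, since its contents
    # may have changed since the previous lookup.
    with os.scandir(directory) as entries:
        names = sorted((e.name for e in entries if e.is_file()),
                       key=str.casefold)
    try:
        index = names.index(target)
    except ValueError:
        return None  # the opened file was renamed or deleted in the meantime
    index += step
    if 0 <= index < len(names):
        return os.path.join(directory, names[index])
    return None
```

Passing step=-1 gives the previous file. The sketch has both drawbacks named in the question: it works from the path, not the open handle, and it re-reads the whole directory each time.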

Related

Is there a way to limit my executable's ability to delete to only files it has created?

I'm on Windows writing a C++ executable that deletes and replaces some files in a directory it creates during an earlier run session. Maybe I'm a little panicky, but since my directory and file arguments for the deletions are generated by parsing an input file's path, I worry about the parse throwing out a much higher or different directory due to an oversight and systematically deleting unrelated files unintentionally.
Is there a way to limit my executable's reign to only include write/delete access to files it has created during earlier run sessions, while retaining read access to everything else? Or at least provide a little extra peace of mind that, even if I really mis-speak my strings to DeleteFileA() and RemoveDirectoryA() I'll avoid causing catastrophic damage?
It doesn't need to be a restriction to the entire executable, it's good enough if it limits the function calls to delete and remove in some way.
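One way to limit only the delete calls, as opposed to the whole executable, is to route every deletion through a wrapper that refuses paths outside a known directory. The following is a minimal sketch of that idea in Python rather than C++; safe_delete and allowed_root are hypothetical names, and the check is by directory, not by "files this program created".

```python
from pathlib import Path

def safe_delete(target, allowed_root):
    """Delete `target` only if it resolves to a path strictly inside `allowed_root`."""
    root = Path(allowed_root).resolve()
    path = Path(target).resolve()
    # Resolve first so that ".." segments and symlinks cannot escape the root.
    if root not in path.parents:
        raise PermissionError(f"refusing to delete outside {root}: {path}")
    path.unlink()
```

A mis-parsed path that points at a higher or unrelated directory then raises an error before anything is deleted. The same check could be applied in C++ before calling DeleteFileA() or RemoveDirectoryA().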

How should I mark a folder as processed in a script?

A script shall process files in a folder on a Windows machine and mark it as done once it is finished in order to not pick it up in the next round of processing.
My tendency is to let the script rename the folder to a different name, like adding "_done".
But on Windows, renaming a folder is not possible if some process has the folder or a file within it open. In this setup, there is a minor chance that some user may have the folder open.
Alternatively I could just write a stamp-file into that folder.
Are there better alternatives?
Is there a way to force the renaming anyway, in particular when it is on a shared drive or some NAS drive?
You have several options:
1. Put a token file of some sort in each processed folder and skip the folders that contain said file
2. Keep track of the last folder processed and only process ones newer (Either by time stamp or (since they're numbered sequentially), by sequence number)
3. Rename the folder
Since you've already stated that other users may already have the folder/files open, we can rule out #3.
In this situation, I'm in favor of option #1, even though you'll end up with extra files. If someone needs to figure out which folders have already been processed, they have a quick, easy way of seeing that with the naked eye, rather than trying to find a counter somewhere in a different file. It's also a bit less code to write, so there are fewer pieces to break.
Option #2 is good in this situation as well (I've used both depending on the circumstances), but I tend to favor it for things that a human wouldn't really need to care about or need to look for very often.
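The token-file approach can be sketched roughly as follows, here in Python. The marker name ".processed" and the function names are hypothetical; any file name that does not clash with the real data would do.

```python
from pathlib import Path

MARKER = ".processed"  # hypothetical marker file name

def pending_folders(base):
    """Return the subfolders of `base` that do not yet contain the marker file."""
    return sorted(p for p in Path(base).iterdir()
                  if p.is_dir() and not (p / MARKER).exists())

def mark_done(folder):
    """Create the marker file inside the folder, leaving the folder name unchanged."""
    (Path(folder) / MARKER).touch()
```

A processing round would loop over pending_folders(base), do its work on each folder, and call mark_done(folder) only after that folder finished successfully.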

Windows remembering lower case filename, how to force it to forget?

Here's my problem:
I've got source files I'm publishing (.dita files, publishing using Oxygen) and I need to change capitalization on a lot of them, along with folders and subfolders that they're in. Everything is in source control using SVN.
When I change only an initial cap, say, and leave everything about the filename the same otherwise, Windows "remembers" the lower case name, and that's what gets published, even though the source name is now upper case.
I can even search for the filename, for example Foobar.dita, and the search results will show me "foobar.dita". When I go to that location directly in the file explorer, the file is named Foobar.dita. It's not a duplicate, it's the same file.
What I understand from reading up on this is that Windows isn't case-sensitive, but it "remembers" the filename as one case or the other. So my question is, if I can't force Windows to be case-sensitive, can I somehow force Windows to forget the filename? I've tried deleting it from both Windows and SVN, and recreating it, but it still gets read as lower case when it's initial cap.
If I rename the file, even slightly, it solves the problem, but many of the filenames are just what they need to be, and it's a lot more work to rename them (to think of another good filename) than just to change to initial cap.
UPDATE:
Here's where I read about the "remembering" idea: in response two, the one with 7 recommendations.
To be explicit: I'm not updating from SVN and thus turning it back to lower case, it's upper case in SVN. It appears upper case in the Windows folder.
UPDATE II: This seems to be what I'm up against:
http://support.microsoft.com/kb/100625
In NTFS, you can create unique file names, stored in the same directory, that differ only in case. For example, the following filenames can coexist in one directory on an NTFS volume:
CASE.TXT
case.txt
case.TXT
However, if you attempt to open one of these files in a Win32 application, such as Notepad, you would only have access to one of the files, regardless of the case of the filename you type in the Open File dialog box.
So it sounds like the only answer is rename the files, not just change case.
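To check which spelling the directory itself stores, independent of what a search result or publishing tool reports, one could list the directory and compare names case-insensitively. This is only a diagnostic sketch in Python; stored_names is a hypothetical helper, not part of Oxygen or SVN.

```python
import os

def stored_names(directory, name):
    """Return the entry names in `directory` that match `name` when case is ignored."""
    wanted = name.casefold()
    return [entry for entry in os.listdir(directory)
            if entry.casefold() == wanted]
```

If stored_names(folder, "foobar.dita") returns the upper-case spelling, the lower-case name is coming from somewhere other than the directory entry. More than one result would mean the directory really holds names that differ only in case.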

NTFS Journal USN_REASON_HARD_LINK_CHANGE event

I've written a program that reads the NTFS index and journal similar to what is described here:
http://ejrh.wordpress.com/2012/07/06/using-the-ntfs-journal-for-backups/
And it works fairly well.
In addition to the normal journal events (USN_REASON_CLOSE, USN_REASON_FILE_CREATE, USN_REASON_FILE_DELETE, etc.), I'm receiving an event with reason USN_REASON_HARD_LINK_CHANGE. I'd like to be able to update the directory index according to this event, but I can't find any information about it. The only documentation is:
An NTFS file system hard link is added to or removed from the file or
directory. An NTFS file system hard link, similar to a POSIX hard
link, is one of several directory entries that see the same file or
directory.
What does this mean? Where was the hard link created? Or was it removed? How do I get more information about what happened?
I know this is ancient, but I stumbled upon this while researching a related problem. Here's what I found: The hard-links are a complicating factor when reading the USN. You can get journal entries describing change to a single file reference number by way of changes made through any hard-link that's been created. Generally, and to the original question, hard-links are alternative directory entries through which a single file might be accessed. Thus, all the file's characteristics are shared for each link (except for the names and parent file reference numbers). Technically, you can't tell which entry is the original and which is a link.
A subtle difference does exist, and it manifests if you query the master file table (using DeviceIoControl and FSCTL_ENUM_USN_DATA). The query will return only a single representative file regardless of how many links exist. You can query for the links using NtQueryInformationFile, querying for FILE_HARD_LINK_INFORMATION. I think of the entry returned by the MFT query as the main entry and the NtQueryInformationFile-returned items as links...however, the main entry can get deleted and one of the links will get promoted...so it's only a housekeeping thought and little else.
Note that a problem arises where one of the hard-links is moved or renamed. In this case, the journal entries for the rename or move reflect the filename and parent file reference number of the affected link. The problem arises if you ask for only the summary "on-close" records. In such a case, you won't ever see the USN_REASON_RENAME_OLD_NAME record...because that USN entry never gets a USN_REASON_CLOSE associated with it. Without this tidbit, you won't be able to easily determine which link's name or location was changed. You have to read the USN journal with ReturnOnlyOnClose set to 0 in READ_USN_JOURNAL_DATA_V0. This is a far chattier query, but without it, you can't accurately associate the change with one link or the other.
As always with the USN, I expect you'll need to go through a bit of trial and error to get it to work right. These observations/guesses may, I hope, be helpful:
When the last hard link to a file is deleted, the file is deleted; so if the last hard link has been removed you should see USN_REASON_FILE_DELETE instead of USN_REASON_HARD_LINK_CHANGE. I believe that each reference number refers to a file (or directory, but NTFS doesn't support multiple hard links to directories AFAIK) rather than to a hard link. So immediately after the event is recorded, at least, the file reference number should still be valid, and point to another name for the file.
If the file still exists, you can look it up by reference number and use FindFirstFileNameW and friends to find the current links. Comparing this to the event record in question plus any relevant later events should give you enough information, although if multiple hard links for the same file are deleted and/or created you might not be able to reconstruct the order in which this happened, and if you don't have enough information about the prior state of the file system you might not be able to identify the deleted hard links. I don't know whether that would matter to you or not.
If the file no longer exists, you should still be able to identify it by the USN record in which it was deleted. Again, taking all relevant events into consideration, and with enough information about the prior state, you should be able to reconstruct most of what happened, if not the order.
There is some hope that we can do better than this: the file name and/or ParentFileReference number in the event record might refer to the hard link that was created or deleted, rather than to an arbitrary link to the file. In this case you'll have all the relevant information about the sequence of events except for whether any particular event was a create or a delete, which you should be able to work out by looking at the current state of the file and working backwards through the records.
I assume you've already looked for nearby change records that might contain additional information? There isn't, for example, a USN_REASON_RENAME_NEW_NAME record generated when a hard link is created or a USN_REASON_RENAME_OLD_NAME when a hard link is removed? Or paired USN_REASON_HARD_LINK_CHANGE records, one for the file, one for the directory containing the affected hard link to the file? (Wishful thinking, I expect, but it wouldn't hurt to look!)
For testing purposes, you can create hard links with the mklink command.
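As a small illustration of the point that hard links are just several names for one file, the following Python sketch creates a link with os.link (used here in place of mklink) and compares the file identity behind both names. link_summary is a hypothetical helper; on Windows, Python reports the file index in st_ino, which is assumed here to correspond to the low part of the file reference number.

```python
import os
import tempfile

def link_summary(path):
    """Return (device, file id, link count) for `path`."""
    info = os.stat(path)
    return info.st_dev, info.st_ino, info.st_nlink

if __name__ == "__main__":
    folder = tempfile.mkdtemp()
    first = os.path.join(folder, "first.txt")
    second = os.path.join(folder, "second.txt")
    open(first, "w").close()
    os.link(first, second)        # add a second name for the same file
    print(link_summary(first))    # same identity and link count ...
    print(link_summary(second))   # ... as seen through the other name
    os.remove(first)              # removes one name; the file remains
    print(link_summary(second))
```

Running this against a volume whose journal you are reading is one way to generate USN_REASON_HARD_LINK_CHANGE records on demand and see which name and parent reference they carry.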

How to get a file's arrival time to a directory using Perl?

Assume a file is copied or moved to a directory by some other program. I want to get the time that this file was copied/moved to this folder. That is, I want the time that the file first appears in this directory.
Note that this file might exist before it was moved/copied or it might not.
This is not any of the time information that can be obtained by File::stat. Thanks.
You may find File::ChangeNotify helpful; it tracks file and directory changes. I would also suggest looking at incron, which can track various events and changes of files in filesystems.
My guess is you want the time the file was closed after being first written. This may or may not be available, and will be OS-specific. Most OSes track file creation, last modification, and last read (or some subset of those). If none of those work for you, you're out of luck unless you control the creation and writing of the file in your application code, in which case you can use whatever you like.
While it may not be the best way to do it, for the copying case, if you make a file handle $fh, you can keep checking for file existence using -e $fh. As soon as you find that the file exists, record that moment's time.
You may find more interesting -X $fileHandle stuff here.
If nothing else has happened in that directory, this will be the modification time of the directory.
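The polling idea from the answers above can be sketched as follows. It is written in Python rather than Perl, purely as an illustration; record_arrivals and the first_seen dictionary are hypothetical names, and the recorded time is only as accurate as the polling interval.

```python
import os
import time

def record_arrivals(directory, first_seen, now=None):
    """Record the current time for every entry not seen in `directory` before."""
    now = time.time() if now is None else now
    for name in os.listdir(directory):
        # setdefault keeps the earliest time for names already recorded.
        first_seen.setdefault(name, now)
    return first_seen
```

Calling record_arrivals(folder, seen) in a loop with a short sleep between calls yields, for each file, the time of the first poll at which it was present. Files that were already there when polling started get the time of the first poll, not their true arrival time.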

Resources