I am about to design a win32 process that periodically polls the contents of a specific folder (let's call it "incoming") for the arrival of folders.
These new folders will be created occasionally by another process that copies a directory/folder tree from somewhere else into the "incoming" folder. A folder may even be created by a user dragging and dropping it in Windows File Explorer.
When my process polls, it will notice the presence of a new folder, delve several levels down into the folder tree, and read the contents of some of the files it finds there.
It is important that my process waits until the whole folder tree has finished copying before it attempts to read it, otherwise it may miss some files that have not been copied yet.
So is there a way for my process to be sure that the entire folder tree has been fully copied? For example, is there a lock or property that gets set on the top level folder until all lower level folders and files have finished copying?
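One heuristic sometimes used for this kind of polling, sketched here only as an illustration and not as a guarantee, is to treat the tree as finished once several consecutive polls see no change in its paths, sizes, or modification times. The function names, the polling interval, and the number of quiet polls below are all assumptions made for the sketch; a copy that stalls for longer than the quiet period would still fool it.

```python
import os
import time


def snapshot(root):
    """Map each path under root (relative to root) to (size, mtime_ns)."""
    state = {}
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            try:
                st = os.stat(full)
            except OSError:
                continue  # entry vanished or is not accessible yet
            state[os.path.relpath(full, root)] = (st.st_size, st.st_mtime_ns)
    return state


def wait_until_stable(root, interval=2.0, quiet_polls=3, timeout=600.0):
    """Return True once quiet_polls consecutive snapshots are identical.

    Returns False if the tree keeps changing until the timeout expires.
    """
    deadline = time.monotonic() + timeout
    previous = snapshot(root)
    unchanged = 0
    while time.monotonic() < deadline:
        time.sleep(interval)
        current = snapshot(root)
        if current == previous:
            unchanged += 1
            if unchanged >= quiet_polls:
                return True
        else:
            unchanged = 0  # something changed, so restart the quiet count
        previous = current
    return False
```

A caller would run `wait_until_stable(new_folder)` after first noticing the folder and read the files only if it returns True.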
Related
Here's the scenario: We have a computer running Windows 10 which has a directory that's backed up nightly. The backups are done with a batch file that uses Robocopy and is scheduled via Windows. The parameters are such that the backup will always add any new files or existing file edits to the destination, but it will never delete files from the destination that have been deleted in the source. It essentially archives all files which are in the source directory at the end of each day.
Here's the tricky part. The source directory is very large, and occasionally someone finds a duplicate file (or several duplicates of a file) in it. When that happens, we need to delete all but one copy of the file, and then we need to access the backup directory manually, locate the file there, and do the same. This is tedious and time-consuming as it's not rare for someone to notice an entire subdirectory full of files that exist 5+ times each.
What we're looking for is a way to scan the source directory and all subdirectories inside for duplicate files and remove all but one copy of them, and then a way to reflect that into the destination. I've assumed that we will not be able to use Robocopy to reflect the changes in the destination due to the nature of the backup script it's running, but we do have the ability to run any third-party software on the destination directory as well, essentially running an action in both directories to clean each of them of duplicate files.
On that note, I'm not against using third-party tools to make this cleaner or more efficient, I'm just not aware of any.
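As a rough sketch of the scan described above (not a recommendation of a specific tool), duplicates can be found by grouping files by size and then by a content hash. The function names here are hypothetical, SHA-256 is an arbitrary choice of hash, and the sketch only reports what it would remove; it deletes nothing. Because the removals are returned as paths relative to the scanned root, the same list could in principle be applied to the backup directory, assuming both trees share the same layout.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Return lists of paths under root whose contents are identical."""
    by_size = defaultdict(list)
    for dirpath, _dirs, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            by_size[os.path.getsize(full)].append(full)

    by_hash = defaultdict(list)
    for size, paths in by_size.items():
        if len(paths) < 2:
            continue  # a unique size cannot have a duplicate
        for p in paths:
            by_hash[(size, file_digest(p))].append(p)
    return [sorted(paths) for paths in by_hash.values() if len(paths) > 1]


def relative_removals(root, groups):
    """Keep the first path of each group; return the rest relative to root."""
    removals = []
    for paths in groups:
        removals.extend(os.path.relpath(p, root) for p in paths[1:])
    return sorted(removals)
```

Which copy to keep (here simply the first path in sorted order) is a policy decision the sketch does not try to make for you.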
There is one way to solve this. I ran into the same problem and found that a batch file can handle it.
There are two main commands:
xcopy
robocopy
For your needs, xcopy will be helpful.
xcopy backs up the file or folder you specify. With the /d switch it copies only source files that are newer than the existing destination files, so after you change some data only the changed files are copied again, not the whole tree. With /y, the older destination copies are overwritten without a prompt.
HOW TO DO
Open NotePad and type
xcopy "source file" "destination" /y/e/d/c/f/h/i/z/j
Then save the file with a ".bat" extension.
For more options, see the documentation:
https://learn.microsoft.com/en-us/windows-server/administration/windows-commands/xcopy
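To make the "copy only what is newer" rule concrete, here is a rough Python analogue of what the /e and /d switches do together. It is only an illustration of the rule, not how xcopy is implemented, and the function name is made up for the sketch.

```python
import os
import shutil


def copy_newer(src_root, dst_root):
    """Copy files that are missing from dst_root or newer in src_root.

    Returns the list of destination paths that were written.
    """
    copied = []
    for dirpath, _dirs, filenames in os.walk(src_root):
        rel = os.path.relpath(dirpath, src_root)
        target_dir = os.path.normpath(os.path.join(dst_root, rel))
        os.makedirs(target_dir, exist_ok=True)  # like /e: empty folders too
        for name in filenames:
            src = os.path.join(dirpath, name)
            dst = os.path.join(target_dir, name)
            # like /d: skip files whose destination copy is already current
            if not os.path.exists(dst) or os.path.getmtime(src) > os.path.getmtime(dst):
                shutil.copy2(src, dst)  # copy2 also preserves the timestamp
                copied.append(dst)
    return copied
```

Running it twice in a row copies nothing the second time, because `shutil.copy2` carries the source modification time over to the destination.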
Usually I do it when I am about to publish an application with a lot of code changes applied. The problem is that I do it to feel good, not because I am certain that the process really needs it.
I am not asking about cleaning the folders but deleting them. What valid reasons are there to do so?
I do not see a big difference between deleting the bin folders and completely clearing them. Either way, the entire folder content is removed and new content is copied. The ASP.NET server is automatically restarted (if you didn't shut it down first) when the bin folder content changes.
I personally usually delete the folder content but do not delete the folder itself. But as I said, there is no big difference at all.
I am looking for a bit of advice on how the Windows file system differentiates between files that are copied (copied and pasted from another location) and files that are created (a new file created in a folder).
A bit of background so this makes more sense: I have an application that is used to move files. The application monitors a directory, and when a file is placed in the directory it moves it elsewhere. However, I am having issues where the application will not pick up a file that is created within the monitored directory, but will pick up files that have been created elsewhere and copied into the monitored directory.
Any advice on how Windows differentiates, or if it does at all, would be greatly appreciated.
This is running on Microsoft Windows Server 2008 R2 Standard. I can't dig into the code and see what is going on under the hood unfortunately, so need to get an idea of the difference if any there would be.
Filesystems have no notion of "copying" a file. Any copy is a sequence of file open/read/write/close operations. The same applies to moving a file to a different filesystem. Moving within the same filesystem, though, is an operation native to the filesystem and can be done with a single request to it.
Now about your problem. Most likely you catch the creation of the file (before the data is written), and when your application reacts, the file is still open for writing. So you need to wait until the file is closed.
Depending on how you do the monitoring, such waiting is done in different ways. In filesystem filters you wait for the file close operation. With the .NET FileSystemWatcher there's no way to track the file close operation, but I saw a couple of tricks here on StackOverflow (don't have a link though, sorry).
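One such trick, sketched here in Python and not necessarily the one those posts describe, is to keep retrying an open that asks for write access until it stops failing. This assumes the writer holds the file without sharing write access, in which case Windows refuses the open and Python raises PermissionError. The function name and the timing values are made up for the sketch, and a file that is genuinely read-only would simply time out.

```python
import time


def wait_until_closed(path, interval=0.5, timeout=30.0):
    """Return True once the file can be opened for read/write access.

    Returns False if the open is still refused when the timeout expires.
    A missing file raises FileNotFoundError, which is left to the caller.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # "r+b" requests read/write access without truncating the file
            with open(path, "r+b"):
                return True
        except PermissionError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

On systems without mandatory sharing modes the open succeeds immediately, so the check only tells you something on Windows.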
[Screenshot: properties of a file on the D: drive, where it was created]
[Screenshot: properties of the same file after it was copied to the E: drive]
As you can see, the file that was copied to the E: drive has the time of the copy as its creation time, while its modification time is still the last modification time of the file in its previous location.
So I guess this illustrates how Windows differentiates between copied files and created files.
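The same comparison can be sketched in code. The helper names below are made up for illustration, and the check is only a heuristic: a copied file that is edited afterwards will have a modification time later than its creation time again. The sketch also assumes Windows semantics, where `st_ctime` has traditionally held the creation time (newer Python versions expose `st_birthtime` there instead); on Unix-like systems `st_ctime` means something different.

```python
import os


def creation_and_modification(path):
    """Return (created, modified) timestamps in seconds since the epoch."""
    st = os.stat(path)
    # Prefer st_birthtime where it exists; fall back to st_ctime otherwise.
    created = getattr(st, "st_birthtime", st.st_ctime)
    return created, st.st_mtime


def looks_copied(created, modified):
    """True when the file was created later than it was last modified."""
    return created > modified
```

For example, `looks_copied(*creation_and_modification(path))` would be True for the copy on the E: drive described above, as long as nobody has edited it since.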
I have a small application that displays the contents of a log file, somewhat transmogrified for readability. As the log file gets rewritten occasionally and Windows file system semantics prohibit deletion of open files, I create a hardlink to the file.
Obviously, this needs to happen on the same file system as the original file -- at present, I create the hardlink in the same directory, which I believe can be reasonably assumed to fulfill this requirement; the result is that a temporary file shows up in the directory listing where the user just clicked to open the file, which is ugly.
Is there a way to create a hardlink so that it does not show up (the customer where the program is used has several junctions in their directory tree, so it cannot be assumed that a specific directory is on the same filesystem), or is there a better method to read a file that another process may want to delete and rewrite (e.g. by catching their access and closing the file before letting the other process's access go through), so the program can be used on archived (readonly) log files without modification?
No
It won't help if you could. Sharing spans links.
Use the solution posed by Hans Passant as a comment.
Is it possible to lock a directory in Windows so as to ensure that no other process is reading or modifying files inside the directory for the duration of the lock, while at the same time allowing the process with the lock to modify and move files and directory itself freely?
This is not a real answer, but as a workaround:
Move the directory to a subdirectory specific to your application, which is on the same volume.
Advantages:
Prevents users and other programs from modifying the files at the old location, as the files will no longer be there
Importantly, will fail if a process already has a file open within that directory, thus ensuring that the "acquired" lock is indeed "exclusive"
Disadvantages:
It's a hack
The software will need to be adapted to work with the directory at a different path than where it was initially
Users and programs attempting to access the files will encounter unusual behavior or errors ("Path not found" instead of "Access denied")
Does not protect against programs that may poke into your application-specific subdirectory
Will leave the directory "locked" (moved to a location the user probably can't find) if your program crashes while the "lock" is "held"
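The workaround above can be sketched as a small context manager. The names `directory_moved_aside` and `holding_root` are made up for the sketch, and it assumes the holding directory is on the same volume so that the move is a plain rename. On Windows the rename is expected to raise an OSError when another process has a file open inside the directory, which is what makes the "lock" exclusive.

```python
import contextlib
import os


@contextlib.contextmanager
def directory_moved_aside(path, holding_root):
    """Move a directory into holding_root for the duration of the block.

    Yields the temporary location and moves the directory back afterwards.
    """
    os.makedirs(holding_root, exist_ok=True)
    held = os.path.join(holding_root, os.path.basename(os.path.normpath(path)))
    os.rename(path, held)  # raises OSError if the move is refused
    try:
        yield held
    finally:
        os.rename(held, path)  # restore the original location
```

Inside the `with` block the application works on the yielded path. The `finally` clause covers ordinary exceptions, but a hard crash would still leave the directory in the holding location, as noted in the disadvantages.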