How Windows differentiates between Copied files and Created files

I am looking for a bit of advice on how the Windows file system differentiates between files that are copied (copied and pasted from another location) and files that are created (a new file created in a folder).
A bit of background so this makes more sense: I have an application that is used to move files. The application monitors a directory, and when a file is placed in the directory it moves it elsewhere. However, I am having issues where the application will not pick up a file that is created within the monitored directory, but will pick up files that have been created elsewhere and copied into the monitored directory.
Any advice on how Windows differentiates, or if it does at all, would be greatly appreciated.
This is running on Microsoft Windows Server 2008 R2 Standard. Unfortunately I can't dig into the code and see what is going on under the hood, so I need to get an idea of what the difference, if any, would be.

Filesystems don't know about the operation of "copying" a file. Any copy is a sequence of file open/read/write/close operations. The same applies to moving a file to a different filesystem. Moving within the same filesystem, though, is an operation native to the filesystem, and it can be done with a single command.
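To make the distinction concrete, here is a minimal Python sketch (an illustration of the idea, not anything from the original post) of what a "copy" amounts to from the filesystem's point of view, versus a same-filesystem move:

    import os

    def copy_file(src, dst, bufsize=1 << 20):
        # A "copy" is just open/read/write/close from the filesystem's
        # point of view -- the destination is a brand-new file.
        with open(src, "rb") as fin, open(dst, "wb") as fout:
            while chunk := fin.read(bufsize):
                fout.write(chunk)

    # A move within one filesystem, by contrast, is a single rename of
    # the directory entry; no file data is read or written.
    # os.rename("dir_a/report.txt", "dir_b/report.txt")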
Now, about your problem: most likely you catch the creation of the file (before the data is written), and when your application reacts, the file is still open for writing. So you need to wait until the file is closed.
Depending on how you do the monitoring, such waiting is done in different ways. In a filesystem filter you wait for the file close operation. With .NET's FileSystemWatcher there's no way to track the file close operation, but I have seen a couple of tricks for this here on StackOverflow (I don't have a link, though, sorry).
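One common workaround (a rough sketch of the idea, not any specific trick from those StackOverflow posts) is to retry opening the file for write access until that stops failing; this assumes the copying process opened the file without write sharing, which is typical on Windows:

    import os
    import time

    def wait_until_closed(path, timeout=30.0, poll=0.5):
        # On Windows, a file still being written by another process is
        # usually opened without write sharing, so open(..., "rb+")
        # raises PermissionError until the writer closes it.
        deadline = time.monotonic() + timeout
        while time.monotonic() < deadline:
            try:
                with open(path, "rb+"):
                    return True
            except OSError:
                time.sleep(poll)
        return False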

[Screenshot: properties of a file on the D: drive, where it was originally created]
[Screenshot: properties of the same file after being copied to the E: drive]
As you can see, the file that was copied to the E: drive has the later creation time, recording when it was copied there, while its modification time carries over the last time the file was modified in its previous location.
So I guess this illustrates how Windows differentiates between copied files and created files.
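You can check this yourself with a small Python sketch (the path is hypothetical; on Windows, os.stat reports the creation time in st_ctime):

    import datetime
    import os

    def timestamps(path):
        st = os.stat(path)
        # On Windows, st_ctime is the creation time and st_mtime is the
        # last-modification time.
        created = datetime.datetime.fromtimestamp(st.st_ctime)
        modified = datetime.datetime.fromtimestamp(st.st_mtime)
        print(f"{path}\n  created:  {created}\n  modified: {modified}")
        if created > modified:
            print("  creation is later than modification -> likely copied here")

    timestamps(r"E:\example.txt")  # hypothetical path to the copied file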

Related

How can I mirror deleted duplicates from a source into a destination?

Here's the scenario: We have a computer running Windows 10 which has a directory that's backed up nightly. The backups are done with a batch file utilizing Robocopy and scheduled via Windows. The parameters are such that the backup will always add any new files or edits to existing files into the destination, but it will never delete files from the destination that have been deleted in the source. It essentially archives all files which are in the source directory at the end of each day.
Here's the tricky part. The source directory is very large, and occasionally someone finds a duplicate file (or several duplicates of a file) in it. When that happens, we need to delete all but one copy of the file, and then we need to access the backup directory manually, locate the file there, and do the same. This is tedious and time-consuming as it's not rare for someone to notice an entire subdirectory full of files that exist 5+ times each.
What we're looking for is a way to scan the source directory and all subdirectories inside for duplicate files and remove all but one copy of them, and then a way to reflect that into the destination. I've assumed that we will not be able to use Robocopy to reflect the changes in the destination due to the nature of the backup script it's running, but we do have the ability to run any third-party software on the destination directory as well, essentially running an action in both directories to clean each of them of duplicate files.
On that note, I'm not against using third-party tools to make this cleaner or more efficient, I'm just not aware of any.
There is one way to solve this problem; I was suffering from it too, but I found how to do it with a batch file.
There are mainly two commands:
xcopy
robocopy
For your needs here, (1) xcopy will be helpful.
xcopy will back up your specific file or folder; even if you have changed some megabytes of data, it will copy the new data and will not replace the previous data - it will make a new copy.
HOW TO DO IT
Open Notepad and type
xcopy "source file" "destination" /y /e /d /c /f /h /i /z /j
and then save the Notepad file with a ".bat" extension.
For further details, see the URL below:
https://learn.microsoft.com/en-us/windows-server/administration/windows-commands/xcopy
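As for the duplicate-scanning half of the question, here is a minimal Python sketch of the idea (the hashing policy and invocation are my assumptions, not part of the original setup); it could be run once against the source directory and once against the backup:

    import hashlib
    import os
    import sys

    def remove_duplicates(root):
        # Walk the tree, hash every file's contents, and delete all but
        # the first file seen for each distinct hash.
        seen = {}  # hex digest -> path of the first file with that content
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                digest = hashlib.sha256()
                with open(path, "rb") as f:
                    for chunk in iter(lambda: f.read(1 << 20), b""):
                        digest.update(chunk)
                key = digest.hexdigest()
                if key in seen:
                    print(f"duplicate of {seen[key]}: removing {path}")
                    os.remove(path)
                else:
                    seen[key] = path

    if __name__ == "__main__":
        remove_duplicates(sys.argv[1])  # e.g. the source dir, then the backup dir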

Visual Studio - File changes not saved to original disk location?

Ok, complete Visual Studio & Windows development noob here - there's gotta be an easy answer to this.
I've just started working on porting a Linux C++ library to Windows. Existing source tree is on the Linux file system, VS is running in a Windows 7 VM, which has the Linux file system mounted.
I added the source tree to a new project. I was initially doing the edits on the Linux side, but now I've done a few from the VS IDE - and those edits aren't showing up on the disk?? I've done the typical save: Ctrl-S, done the "save all": Ctrl-Shift-S, saved from the menu, etc. If I look at the files on the disk from the Linux side, the changes aren't there.
I've shut down & restarted VS, and it still sees the changes on restart. How do I get the changes back on the actual disk so I can commit to subversion, etc.?
I've confirmed that the files & file system are read/writable from the Windows VM.
I'm sure this made sense to somebody, but I'll be damned if I get it.
Visual Studio Professional 2013 on Windows 7
You shouldn't be reading/writing the same directory under both environments, imo. Not the least of the reasons is that *nix and Windows have different ideas about line endings.
It would be much better to keep a git repository on your host OS (or on a server like GitHub) and pull/push to that repo from your Windows VM. Git is smart enough to handle all the line endings, symbolic links, permissions, etc. automagically.
I have seen similar behavior using the BC++ IDE.
In my case I was trying to edit files that were hard links to files in a second directory (on the same NTFS file system).
The IDE uses some mechanism to reposition the file being edited into the _history backup directory.
I.e. the editor unlinks the original file in the original directory, relinks it in the _history subdirectory, and creates a completely new directory entry for the edited file.
The hard-linked file I created in the second directory thus remains linked to the backup file in the _history directory, so when I edit the file in the second directory with Notepad, the modifications appear in the _history backup file (or vice versa) but not in the file at the original location.
It's not like a simple text editor (Notepad), where the edited file is opened, read, and closed, and on save is reopened, written, and closed using the same directory entry.
I presume the IDE is using a low-level Windows file system function to rename/link the original file into the _history directory, and that this mechanism does not support/recognize NTFS hard links. I suspect that in your case VS uses a similar relinking mechanism (specific to NTFS) that likewise would not work with files on the mounted Linux file system.
VS may be storing edits in a temporary file (perhaps hidden, or in some other temporary directory) so the original file is not lost if the IDE crashes. When the file save is committed, it attempts to link the original file into the backup location and then relink the temporary edit file into the original directory entry's place; but because the NTFS linking mechanism is not compatible with the Linux file system, nothing happens.
[I do observe temporary files appearing like this when editing Microsoft Office documents. Notepad++ also does this, so I suspect VS is doing the same thing.]
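The behavior is easy to reproduce outside any IDE. A small Python sketch (file names invented for the demo; it needs a filesystem that supports hard links, such as NTFS):

    import os

    with open("original.txt", "w") as f:
        f.write("version 1\n")
    os.link("original.txt", "hardlink.txt")  # second name for the same data

    # A rename-based "safe save", the way many IDEs commit edits: write
    # the new contents to a temp file, then rename it over the original.
    with open("original.txt.tmp", "w") as f:
        f.write("version 2\n")
    os.replace("original.txt.tmp", "original.txt")

    with open("original.txt") as f:
        print(f.read())   # version 2 -- a brand-new file under the old name
    with open("hardlink.txt") as f:
        print(f.read())   # still version 1 -- it still references the old data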

Fastest way to move files within remote computer from Cocoa application?

I have files stored in a shared directory on one computer and a Cocoa Application running on another computer on the same LAN.
I want the application to move files within the shared directory.
I'm using -[NSFileManager copyItemAtPath:toPath:error:], but sometimes it seems extremely slow, regardless of file size. Why would that operation take so much longer than performing it directly on the shared directory's computer?
I'd guess (I don't know for sure) that NSFileManager first downloads the file to be copied and then re-uploads it under the different name; the last thing it does is remove the original file. The downloading and uploading, of course, take time.
The reason for this procedure is that most protocols don't have a 'copy' command, so the client has to do all the work itself, using the procedure described above.
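If that guess is right, a move (rename) within the share avoids the round trip entirely; in Cocoa that would presumably mean -[NSFileManager moveItemAtPath:toPath:error:] instead of the copy. A rough Python illustration of the difference (the mount point and file names are hypothetical):

    import os
    import shutil
    import time

    SHARE = "/Volumes/shared"                 # hypothetical mount of the share
    src = os.path.join(SHARE, "big.bin")

    t0 = time.monotonic()
    shutil.copyfile(src, os.path.join(SHARE, "big-copy.bin"))
    t1 = time.monotonic()                     # every byte crossed the network twice
    os.rename(os.path.join(SHARE, "big-copy.bin"),
              os.path.join(SHARE, "big-moved.bin"))
    t2 = time.monotonic()                     # one rename request to the server

    print(f"copy:   {t1 - t0:.2f}s")
    print(f"rename: {t2 - t1:.2f}s")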

How can I atomically replace a file on a webserver so it's latest version is continually available?

I'm working on a project that generates Google Earth KML files and saves the file to a web-accessible directory. It's running on Windows with ActivePerl. (not my preferred platform but it's what I must work with.)
The method I'm using for this is: write to temp.kml, use File::Copy to copy temp.kml to real.kml. This occurs once a second.
Google Earth grabs this real.kml via an apache2 webserver. The problem is, errors get thrown when Google Earth grabs the real.kml at the same time as temp.kml is being copied to real.kml.
I understand that there's a good chance this is unavoidable, but is there any way that I can minimize the frequency of errors thrown?
Instead of copying the file, why not just move it from your temp directory to the web directory once your processing has finished? If your temp directory is on the same filesystem as the web directory, this should result in only the name of the file changing, while the contents remain unchanged. There should be a smaller chance of a race condition.
Use File::Copy's move function to move the file into place.
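For illustration, here is the same write-then-rename idea as a Python sketch (the directory and file names are hypothetical); File::Copy's move does the equivalent in Perl when the temp file and the web directory are on one filesystem:

    import os
    import tempfile

    WEB_DIR = r"C:\apache\htdocs\kml"   # hypothetical web-accessible directory

    def publish(kml_text, name="real.kml"):
        # Write the new contents to a temp file on the SAME filesystem,
        # then rename it over the live file. os.replace is an atomic swap
        # on POSIX and uses MoveFileEx(..., MOVEFILE_REPLACE_EXISTING) on
        # Windows, so readers see either the old file or the new one,
        # never a partial write.
        fd, tmp = tempfile.mkstemp(dir=WEB_DIR, suffix=".kml")
        with os.fdopen(fd, "w") as f:
            f.write(kml_text)
        os.replace(tmp, os.path.join(WEB_DIR, name))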

Renaming A Running Process' File Image On Windows

I have a Windows service application on Vista SP1 and I've found that users are renaming its executable file (while it's running) and then rebooting, thus causing it to fail to start on next bootup because the service manager can no longer find the exe file since it's been renamed.
I seem to recall that with older versions of Windows you couldn't do this because the OS placed a lock on the file. Even with Vista SP1 I still cannot copy over the existing file while it's running - Windows reports that the file is in use, which makes sense. So why am I allowed to rename it? What happens if Windows needs to page in code from the exe but the file has been renamed since the process started? I ran Process Monitor while renaming the exe file, etc., but it didn't report anything strange and just logged the name change like any other file operation.
Does anyone know what's going on behind the scenes here? It seems counterintuitive that Windows would allow a running process' filename (or its dependent DLLs) to be changed. What am I missing?
Your concept is wrong: the filename is not the center of the file-I/O universe - the handle to the open file is. The file is not moved to a different section of the disk when you rename it; it's still in the same place, and the OS's internal data structure for the open file still points there. The bottom line is that your observations are correct: you can rename a running program without causing problems, and you can create a new file with the same name as the running program once you've renamed it. This is actually useful behavior if you want to update software while it's running.
As long as the file is still there, Windows can still read from it - it's the underlying file that matters, not its name.
I can happily rename running executables on my XP machine.
The OS keeps an open handle to the .exe file. Renaming the file simply changes some filesystem metadata about it, without invalidating open handles. So when the OS goes to page in more code, it just uses the file handle it already has open.
Replacing the file (writing over its contents) is another matter entirely, and I'm guessing the OS opens the file with the FILE_SHARE_WRITE flag unset, so no other process can write to the .exe file.
It might be a stupid question, but why do users have access to rename the file if they are not supposed to rename it? That said, it's allowed because, as the good answers point out, the open handle to the file isn't lost until the application exits. And there are some uses for it as well, even though I'm not convinced that updating an application by renaming its file is good practice.
You might consider having your service listen to changes to the directory that your service is installed in. If it detects a rename, then it could rename itself back to what it's supposed to be.
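As a rough sketch of that idea (paths and names are hypothetical, and a real service would more likely use ReadDirectoryChangesW or .NET's FileSystemWatcher rather than polling):

    import os
    import time

    SERVICE_DIR = r"C:\Program Files\MyService"  # hypothetical install dir
    EXPECTED = "MyService.exe"                   # hypothetical exe name

    def guard_exe_name(poll_seconds=5.0):
        # If the expected exe name vanishes but a single .exe remains,
        # assume it was renamed and rename it back.
        expected = os.path.join(SERVICE_DIR, EXPECTED)
        while True:
            if not os.path.exists(expected):
                exes = [n for n in os.listdir(SERVICE_DIR)
                        if n.lower().endswith(".exe")]
                if len(exes) == 1:
                    os.rename(os.path.join(SERVICE_DIR, exes[0]), expected)
            time.sleep(poll_seconds)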
There are two aspects to the notion of file here:
The data on the disk - that's the actual file.
The name (or names - there could be several, or none) which you can give that data - the directory entries.
What you are renaming is the directory entry, which still references the same data. Windows doesn't care about your doing so, as it can still access the data when it needs to. The running process is mapped to the data, not to the name.
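A quick way to see the handle/name separation (a Python sketch; run it on a POSIX filesystem, since an ordinary Python open on Windows doesn't grant the delete/rename sharing that the executable loader does):

    import os

    with open("data.txt", "w") as f:
        f.write("hello from the original name\n")

    handle = open("data.txt")             # the handle refers to the data...
    os.rename("data.txt", "renamed.txt")  # ...while the directory entry changes
    print(handle.read())                  # still reads fine: the handle tracks
                                          # the data, not the name
    handle.close()
    os.remove("renamed.txt")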
