Renaming A Running Process' File Image On Windows - windows

I have a Windows service application on Vista SP1 and I've found that users are renaming its executable file (while it's running) and then rebooting, thus causing it to fail to start on next bootup because the service manager can no longer find the exe file since it's been renamed.
I seem to recall that with older versions of Windows you couldn't do this because the OS placed a lock on the file. Even with Vista SP1 I still cannot copy over the existing file when it's running - Windows reports that the file is in use - makes sense. So why should I be allowed to rename it? What happens if Windows needs to page in a new code page from the exe but the file has been renamed since it was started? I ran Process Monitor while renaming the exe file, etc, but Process Mon didn't report anything strange and just logged changing the filename like any other file.
Does anyone know what's going on here behind the scenes? It seems counterintuitive that Windows would allow the file behind a running process (or its dependent DLLs) to be renamed. What am I missing here?

Your mental model is off: the filename is not the center of the file I/O universe; the handle to the open file is. Renaming a file does not move it to a different part of the disk. The data stays where it is, and the internal data structure for the open file still points to the same place. Bottom line: your observations are correct. You can rename a running program without causing problems, and once you've renamed it you can create a new file with the same name as the running program. This is actually useful behavior if you want to update software while it is running.

As long as the file is still there, Windows can still read from it - it's the underlying file that matters, not its name.
I can happily rename running executables on my XP machine.

The OS keeps an open handle to the .exe file. Renaming the file simply changes some filesystem metadata about the file, without invalidating open handles. So when the OS goes to page in more code, it just uses the file handle it already has open.
Replacing the file (writing over its contents) is another matter entirely, and I'm guessing the OS opens with the FILE_SHARE_WRITE flag unset, so no other processes can write to the .exe file.

Might be a stupid question, but why do users have permission to rename the file if they are not supposed to rename it? But yeah, it's allowed because, as the good answers point out, the open handle to the file isn't lost until the application exits. And there are some uses for it as well, even though I'm not convinced that updating an application by renaming its file is good practice.

You might consider having your service listen to changes to the directory that your service is installed in. If it detects a rename, then it could rename itself back to what it's supposed to be.

There are two aspects to the notion of file here:
The data on the disk - that's the actual file.
The file-name (could be several or none) which you can give that data - called directory entries.
What you are renaming is the directory entry, which still references the same data. Windows doesn't care about your doing so, as it still can access the data when it needs to. The running process is mapped to the data, not the name.
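The name-versus-data distinction can be illustrated with an ordinary file. This is a minimal sketch, not a demonstration of how the Windows loader works: the file names are made up, and it assumes a POSIX system. On Windows the rename in this sketch may fail with PermissionError, because Python's open() typically does not request the delete/rename sharing that the loader's own handle allows.

```python
import os
import tempfile


def read_after_rename():
    """Open a file, rename it, then read through the handle opened earlier."""
    with tempfile.TemporaryDirectory() as d:
        old = os.path.join(d, "service.exe")   # hypothetical names
        new = os.path.join(d, "renamed.exe")
        with open(old, "wb") as f:
            f.write(b"payload")

        handle = open(old, "rb")               # reference to the data, not the name
        try:
            os.rename(old, new)                # only the directory entry changes
            data = handle.read()               # still reads the same data
            return data, os.path.exists(old), os.path.exists(new)
        finally:
            handle.close()
```

The handle keeps working because it refers to the file's data; only a lookup by the old name fails afterwards, which is the service manager's situation at the next boot.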

Related

How Windows differentiates between Copied files and Created files

I am looking for a bit of advice on how the Windows file system differentiates between files that are copied (copied and pasted from another location) and files that are created (a new file created in a folder).
A bit of background so this makes more sense: I have an application that is used to move files. The application monitors a directory, and when a file is placed in the directory it moves it elsewhere. However, I am having issues where the application will not pick up a file that is created within the monitored directory but will pick up files that have been created elsewhere and are copied into the monitored directory.
Any advice on how Windows differentiates, or if it does at all, would be greatly appreciated.
This is running on Microsoft Windows Server 2008 R2 Standard. I can't dig into the code and see what is going on under the hood unfortunately, so need to get an idea of the difference if any there would be.
Filesystems have no notion of "copying" a file. Any copy is a sequence of file open/read/write/close operations. The same applies to moving a file to a different filesystem. Moving within the same filesystem, though, is an operation native to the filesystem and can be done with a single command to it.
Now about your problem. Most likely you catch the creation of the file (before the data is written), and when your application reacts, the file is still opened for writing. So you need to wait until the file is closed.
Depending on how you do monitoring, such waiting is done in different ways. In filesystem filters you wait for file close operation. With .NET FileSystemWatcher there's no way to track file close operation, but I saw a couple of tricks here on StackOverflow (don't have a link though, sorry).
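One common workaround when no close notification is available is to poll until the file stops changing before touching it. The sketch below is only a heuristic under that assumption: the function name and its parameters are hypothetical, and a writer that pauses longer than the polling window would still fool it.

```python
import os
import time


def wait_until_stable(path, interval=0.5, checks=3, timeout=30.0):
    """Return True once the file size has stayed the same for `checks`
    consecutive polls, or False if `timeout` seconds pass first."""
    deadline = time.monotonic() + timeout
    last_size = -1
    stable = 0
    while time.monotonic() < deadline:
        try:
            size = os.path.getsize(path)
        except OSError:
            size = -1                      # file missing or not readable yet
        if size >= 0 and size == last_size:
            stable += 1
            if stable >= checks:
                return True
        else:
            stable = 0                     # size changed: start counting again
            last_size = size
        time.sleep(interval)
    return False
```

A mover would call wait_until_stable(path) after the creation event and only move the file when it returns True.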
[Image: a file on the D: drive, as originally created]
[Image: the same file after being copied to the E: drive]
As you can see, the copy on the E: drive has the time of the copy as its creation time, while its modification time is still the last modification time of the file in its previous location.
So I guess this illustrates how Windows differentiates between copied files and created files.

Rename a file that multiple processes are trying to use

I have 2 applications running in parallel, both doing the following:
check for file not containing "processed"
process the file and then rename it to filename+processed
for every file, only one application shall use it (on a first come first served basis)
I get the files and I also lock them so the other application cannot process them. But when it comes to renaming the file I run into a problem. To rename the file, I wanted to use the File.renameTo function. However, for that to work, I have to release the lock on the file. But once I release the lock, another process may try to use the file, which is exactly what should not happen.
Is there any way to prevent the application B from using the file between application A releasing the lock and finishing renaming the file?
EDIT
Some more information:
File creation if the file doesn't exist has to be prevented.
The file will be processed with RandomAccessFile (with read and write permission; this creates a new file if it doesn't exist).
Note: On Linux, one can rename a file that is locked, so this problem doesn't occur there. However, on Windows a locked file cannot be renamed; I have to release the lock, then rename it. But the window during which the lock is released lets other applications see that the file is available, and they will then try to use it.
Windows applications can do this using the SetFileInformationByHandle function, which allows you to rename the file using the handle you already have open. You probably can't do this natively from Java.
However, a more straightforward solution would be to rename the file (to filename+processing, for example) before you start processing it. Whichever process successfully renames the file in this way is the one responsible for processing it and eventually renaming it to filename+processed.
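The rename-to-claim idea can be sketched as follows. This is an illustrative sketch in Python rather than the asker's Java; the function names and the ".processing"/".processed" suffixes are assumptions chosen to mirror the answer. It relies on a rename within one directory succeeding for only one of two competing processes, because the source name no longer exists for the loser.

```python
import os


def try_claim(path, suffix=".processing"):
    """Try to claim `path` by renaming it. Returns the new name on success,
    or None if another process got there first (or the rename was refused)."""
    claimed = path + suffix
    try:
        os.rename(path, claimed)
    except OSError:
        # Source already renamed by someone else, or the OS refused the rename.
        return None
    return claimed


def mark_processed(claimed, suffix=".processing"):
    """Rename a claimed file from name+'.processing' to name+'.processed'."""
    done = claimed[: -len(suffix)] + ".processed"
    os.rename(claimed, done)
    return done
```

Each application would call try_claim on a candidate, process the file only if the result is not None, and call mark_processed when finished. No lock is needed across the rename because the rename itself is the claim.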

How to let Windows know that a file is "being used" by my application?

I'm making a simple VB.net application, which basically asks the user for multiple files and later it will need to access the selected files and modify them.
Right now, I'm saving the full paths of the selected files, and in the future, the application will iterate through each path, open the file from such path, and modify it.
The problem with that is that the user could select a file (so the full path is saved) and then they delete or move the file before my application modifies it.
Normally, I'd throw an error saying "File not found", but I'm under the impression that Windows had a feature that would disallow you from deleting/moving/renaming a file because "a program was using it" - which is a feature that would fit way better for my application.
I'm not very advanced with VB.NET, but I suppose that if I "open" a file using my application (with some IO thing), the feature I mentioned earlier would indeed trigger and the user would be unable to modify the file because it is "opened" by my application.
However, since my only desire is to "reserve" files, it seems to be quite wasteful to actually open them when I don't really need to (yet). Is there a way to tell Windows I need a certain file to be intact?
Opening files (with specifying desired sharing mode) is the way to do that.
I don't believe there is anything really wrong with keeping multiple files open (though you still can't do anything about cases such as a removable drive being removed). In the old days there were restrictions on the number of open files per process, but that is no longer a practical limitation; see Pushing the Limits of Windows: Handles.
There is an easy solution: open each file in exclusive mode.
It should look like this:
Sub test()
    ' FileShare.None: no other process can open the file while this handle is open.
    Dim FS = System.IO.File.Open("path", IO.FileMode.Open, IO.FileAccess.ReadWrite, IO.FileShare.None)
    ' Keep FS referenced for as long as the file must stay reserved, then call FS.Close().
End Sub
But beware: you now hold an open file handle, and if the code responsible for closing the files fails without terminating the application, the files will stay locked for a very long time (until the app shuts down).
You can use a Using block or a Try/Catch/Finally block; I don't know enough about your program to recommend one over the other.

Files under Program Files have a split personality

I have a Ruby application I'm installing (along with a packaged ruby interpreter) under Program Files on Windows 7 with an NSIS-built installer. In order to debug it, I edited one of the files to add some debugging statements. After that, I uninstalled the package and ran a new version of the installer which includes a new copy of the edited file, without debugging statements.
Now, I can't get the new copy to load into ruby. If I run type <filename> in cmd.exe, or open the file in Notepad.exe or Firefox, I see the new version. If I run ruby -e "puts File.read('<filename>')", or open the file in emacs, I see the old version.
If, in Windows Explorer, I copy the file to a new filename, everything can see the new contents at that filename. If I delete the original file and rename the copy to replace the original, the split personality returns.
This situation survives a reboot, so it's not a simple matter of a file being accidentally held open.
What on earth is going on here? Is there some aspect of the install process that might be checkpointing the file in a way I can revert, or at least switch off while I'm debugging the installer?
update
If I run ruby -e "puts File.read('<filename>')" in a console that is run as administrator, I see the correct, new contents. How should I be managing this file?
I think it has to do with UAC file system virtualization. Check whether your file exists in C:\Users\<username>\AppData\Local\VirtualStore. If it does, delete it from the VirtualStore.
The fact you see the correct file when running Administrator console proves that it's because of virtualization: UAC virtualization is turned off for elevated processes.
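To check for a virtualized copy, it helps to compute where it would live. The sketch below assumes the VirtualStore mirrors the original path minus the drive letter under the user's local AppData folder, which matches the location given above; the function name and the sample paths are hypothetical. It uses ntpath so the path arithmetic behaves the same on any platform.

```python
import ntpath


def virtual_store_path(real_path, local_appdata):
    """Map a path such as C:\\Program Files\\App\\x.rb to the location where
    UAC virtualization would keep a per-user copy (assumed layout)."""
    _drive, rest = ntpath.splitdrive(real_path)   # ("C:", "\\Program Files\\...")
    return ntpath.join(local_appdata, "VirtualStore", rest.lstrip("\\/"))
```

On a real machine, local_appdata would come from the LOCALAPPDATA environment variable, and the result could be passed to os.path.exists to see whether a stale copy is shadowing the installed file.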
In general, do not put files you plan to change a lot in Program Files. From Vista onward, there is an interesting way things work to "allow" you to write to a protected file, but it really gets stored in your app data directories, not actually in Program Files. So, utilities that go through the Windows API find the "new" version of the file correctly, but utilities that are more low-level (ruby.exe) only find the existing version. If you navigate to that folder, do you see a "Compatibility Files" button right above the contents? Press that and you'll see your updated version.
Scott Hanselman wrote a good article about this when it was introduced in Vista.
You can only write to the real file from an elevated process (for example, one started with "Run as administrator").

Is it safe to recompile an executable while it's running?

What happens if I recompile an executable while it's running? Does the operating system read all of the executable's contents into memory when it starts running it, so it will never read the new executable file? Or will it read sections of the new executable file thinking it hasn't changed, leading to possibly undefined behaviour?
What if I have a script running which repeatedly invokes an executable in a loop, and I recompile the executable while the script is running. Is it guaranteed that future iterations of the loop will invoke the new executable, and only the result of the invocation that was in progress when the switch was made might be corrupted?
My OS is Linux, but I'm also curious about what happens on Windows.
Since this is a conventional compiler, that writes out an executable file, let's follow it in Linux.
The first thing to know is that on Linux a filename doesn't refer to the file's data directly. A filename is a directory entry that points to the file itself (its inode), and the file exists independently of any name. A file doesn't actually need to have a filename, but if it doesn't, it will be difficult to refer to it.
If a process is using a file and you delete it, or replace it by creating a new file under the same name, the process will continue using the old file through the reference it already holds, not through the name. Any new process using the file, or looking it up, will get the new version (if you replaced it) or fail to find it (if you deleted it). Once all the processes are through with the old file, it will be deleted from the file system.
Therefore, if you recompile and create a new executable of the same name, you won't affect the running process. It will continue to use the old executable. Any new process that tries to open the file will get the new one. If you've got system("foo"); in a loop, each time it executes it it will see what the filename foo means right then.
Windows handles files differently. In general, if there's a process using a file, the file is locked and may not be deleted or replaced.
It depends.
If the OS reads the whole executable into memory and doesn't refer back to the disk image, then yes, you can recompile it while it is "in use".
In practice this doesn't always happen. If the OS keeps a file handle open on the executable (as Windows does), this will prevent the file from being deleted and/or overwritten.
With Linux/Unix it is possible to overwrite a file that's "in use". See David Thornley's answer for a detailed explanation.
In Windows you can't delete a locked file but what most people don't know is that you can move or rename a running exe.
So you could:
1. Move the old exe to a temp directory on the same drive.
2. Schedule it for deletion on the next reboot: MoveFileEx(name, NULL, MOVEFILE_DELAY_UNTIL_REBOOT);
3. Move a new exe into its place.
The old program will keep running but new processes will use the new file.
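The move-aside-then-replace steps can be sketched as below. This is a rough illustration, not a tested updater: the function and parameter names are hypothetical, the MoveFileExW call is made through ctypes only on Windows, its return value is ignored here, and scheduling a delete at reboot normally needs administrative rights.

```python
import os

MOVEFILE_DELAY_UNTIL_REBOOT = 0x4  # flag value from the Windows SDK headers


def swap_in_new_file(target, new_file, parked):
    """Move `target` aside to `parked` (same drive), then move `new_file`
    into its place. Returns the parked path."""
    os.rename(target, parked)            # the running process keeps its handle
    if os.name == "nt":
        import ctypes
        # Ask Windows to delete the parked file at the next reboot.
        ctypes.windll.kernel32.MoveFileExW(
            parked, None, MOVEFILE_DELAY_UNTIL_REBOOT)
    os.rename(new_file, target)          # new processes now find the new file
    return parked
```

All three paths should be on the same volume so that each rename is a pure directory-entry change rather than a copy.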
Under Linux, executables are demand paged into memory as needed. The executable on disk becomes the backing store for the application. This means you cannot modify the executable on disk or you will affect a running application. If you try to open(2) an in-use executable for writing, you will get an ETXTBSY (Text file busy) error (check the man page for open(2)).
As many others have said, you can remove the file from the filesystem (unlink(2)) and the kernel will maintain a reference to it and not delete it from disk until there are no more references (when the process exits, it will release its reference to the file). This means you can effectively "overwrite" an in-use executable by first removing it and then creating a new file with the same name as the old file.
So, it comes down to how the compiler creates the executable when "overwriting" an existing file. If it just opens the file for writing and truncates it (O_WRONLY|O_CREAT|O_TRUNC), then it will fail with an ETXTBSY error. If it first removes the existing output file and creates a new one, it will work without error.
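The remove-then-create path can be demonstrated with an ordinary file standing in for the executable. This is a sketch that assumes a POSIX system; the file name and helper names are made up, and it does not reproduce the ETXTBSY case, which needs a file that is actually being executed.

```python
import os
import tempfile


def replace_by_unlink(path, new_bytes):
    """Remove the old directory entry, then create a fresh file with the same name."""
    os.unlink(path)
    with open(path, "wb") as f:
        f.write(new_bytes)


def demo():
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "foo")
        with open(path, "wb") as f:
            f.write(b"old build")

        old_handle = open(path, "rb")      # stands in for the running process
        try:
            replace_by_unlink(path, b"new build")
            with open(path, "rb") as f:    # a "new process" looking up the name
                new_view = f.read()
            return old_handle.read(), new_view
        finally:
            old_handle.close()
```

The existing handle still sees the old contents while a fresh open by name sees the new ones, which is why a loop that invokes the program by name picks up the new build on its next iteration.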
I would imagine it wouldn't let you replace the file, since windows locked it down while it was in use.
It depends. From what I've experienced, on Linux you can still be running a program if you delete it (and it's not too large). But I don't think that's defined behavior.
As far as the loop goes, depending on how you're invoking the executable, you will likely end up crashing your script when it goes to execute a program that has only been half written.
In Windows you can't: if the executable is still running, the file will be locked. If the exe isn't actually running, new runs should pick up the new one, depending on how your script is coded, among other things.
I don't know about Linux.
The Executable might be loaded completely into the memory on startup, however if it's large enough, and running long enough, the OS might decide to swap out some unused parts of it.
Since the OS assumes that the program's file is still there, there is no reason to actually write these memory blocks into the swap file. So they are simply invalidated and reused. If the program needs these pages again, the OS loads them from the executable file.
In Windows this actually happens automagically, since a loaded module is a memory-mapped file. That also means that the file is locked during its execution, and you will not be able to overwrite it easily.
Not sure about Linux, but IIRC it does the swapping the same way.

Resources