What happens if I recompile an executable while it's running? Does the operating system read all of the executable's contents into memory when it starts running it, so it will never read the new executable file? Or will it read sections of the new executable file thinking it hasn't changed, leading to possibly undefined behaviour?
What if I have a script running which repeatedly invokes an executable in a loop, and I recompile the executable while the script is running. Is it guaranteed that future iterations of the loop will invoke the new executable, and only the result of the invocation that was in progress when the switch was made might be corrupted?
My OS is Linux, but I'm also curious about what happens on Windows.
Since this is a conventional compiler that writes out an executable file, let's follow it in Linux.
The first thing to know is that a Linux filename doesn't directly refer to the file, but rather to a directory entry, which in turn points to the file itself (its inode). The file exists independently of the filename. A file doesn't actually need to have a filename, but if it doesn't it will be difficult to refer to it.
If a process is using a file, and you replace or delete it, the process will continue using the old file through the reference it already holds, not through the filename. Any new process using the file, or looking it up, will get the new version (if you replaced it) or fail to find it (if you deleted it). Once all the processes are through with the old file, it will be deleted from the file system.
Therefore, if you recompile and create a new executable of the same name, you won't affect the running process. It will continue to use the old executable. Any new process that tries to open the file will get the new one. If you've got system("foo"); in a loop, each time it executes it will look up what the filename foo means right then.
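The behaviour described above can be tried out with ordinary data files. This is a minimal sketch, assuming a POSIX system such as Linux; the function name replace_while_open and the file names are made up for illustration, and an open file handle stands in for the running process.

```python
import os
import tempfile


def replace_while_open(directory):
    """Return (what the old handle reads, what a fresh open reads)."""
    path = os.path.join(directory, "foo")
    with open(path, "w") as f:
        f.write("old build")

    # Stand-in for the running process: it holds the old file open.
    old_handle = open(path)

    # Stand-in for the compiler: write the new build under a temporary
    # name, then rename it over the old name.
    new_path = path + ".new"
    with open(new_path, "w") as f:
        f.write("new build")
    os.replace(new_path, path)

    try:
        seen_by_old = old_handle.read()
    finally:
        old_handle.close()
    with open(path) as f:
        seen_by_new = f.read()
    return seen_by_old, seen_by_new


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(replace_while_open(d))
```

The handle opened before the replacement still reads the old contents, while anything that opens the name afterwards gets the new file.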
Windows handles files differently. In general, if there's a process using a file, the file is locked and may not be deleted or replaced.
It depends.
If the OS reads the whole of the executable into memory and doesn't refer back to the disk image, then yes, you can recompile it while it is "in use".
In practice this doesn't always happen. If the OS keeps a file handle open on the executable (like Windows does), this will prevent the file from being deleted and/or overwritten.
With Linux/Unix it is possible to overwrite a file that's "in use". See David Thornley's answer for a detailed explanation.
In Windows you can't delete a locked file but what most people don't know is that you can move or rename a running exe.
So you could
move the old exe to a temp directory on the same drive
schedule it for deletion on the next reboot: MoveFileEx(name, NULL, MOVEFILE_DELAY_UNTIL_REBOOT);
move a new exe in its place.
The old program will keep running but new processes will use the new file.
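The three steps above could be scripted roughly as follows. This is only a sketch: swap_in_new_build and its parameters are hypothetical names, the delayed-deletion step is left as a comment because it needs the Win32 MoveFileEx call, and it has not been exercised here against an executable that is actually running.

```python
import os
import tempfile


def swap_in_new_build(current, new_build, aside_dir):
    """Move `current` aside, then move `new_build` into its place.

    `aside_dir` is assumed to be on the same drive as `current`.
    Returns the path the old file was moved to.
    """
    aside = os.path.join(aside_dir, os.path.basename(current) + ".old")
    os.replace(current, aside)      # step 1: move the old exe out of the way
    # Step 2 (not done here): on Windows the old file could be scheduled
    # for deletion with MoveFileEx(aside, NULL, MOVEFILE_DELAY_UNTIL_REBOOT).
    os.replace(new_build, current)  # step 3: move the new exe into place
    return aside
```

Because both moves are renames on one drive, no file contents are copied, which is why the same-drive requirement matters.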
Under Linux, executables are demand paged into memory as needed. The executable on disk becomes the backing store for the application. This means you cannot modify the executable on disk or you will affect a running application. If you try to open(2) an in-use executable for writing, you will get an ETXTBSY (Text file busy) error (check the man page for open(2)).
As many others have said, you can remove the file from the filesystem (unlink(2)) and the kernel will maintain a reference to it and not delete it from disk until there are no more references (when the process exits, it will release its reference to the file). This means you can effectively "overwrite" an in-use executable by first removing it and then creating a new file with the same name as the old file.
So, it comes down to how the compiler creates the executable when "overwriting" an existing file. If it just opens the file for writing and truncates it (O_WRONLY|O_CREAT|O_TRUNC), then it will fail with an ETXTBSY error. If it first removes the existing output file and creates a new one, it will work without error.
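The difference between the two strategies shows up in the inode numbers. The sketch below assumes a POSIX system and uses a plain data file held open by a reader, so it does not trigger ETXTBSY (that only applies to a file being executed); inode_after_rewrite is a made-up helper name.

```python
import os
import tempfile


def inode_after_rewrite(path, remove_first):
    """Rewrite `path` while a reader holds it open.

    Returns (inode held by the reader, inode the name refers to afterwards).
    """
    with open(path, "w") as f:
        f.write("old")
    reader = open(path)  # stand-in for the process using the file
    try:
        if remove_first:
            os.unlink(path)  # the name goes away; the reader keeps the inode
        # Mode "w" opens with O_WRONLY|O_CREAT|O_TRUNC.
        with open(path, "w") as f:
            f.write("new")
        return os.fstat(reader.fileno()).st_ino, os.stat(path).st_ino
    finally:
        reader.close()
```

With remove_first=False the name still refers to the reader's inode, so the reader's data was truncated underneath it. With remove_first=True the name refers to a brand-new inode and the reader's file is untouched.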
I would imagine it wouldn't let you replace the file, since Windows locks it while it is in use.
It depends. From what I've experienced, on Linux you can still be running a program if you delete it (and it's not too large). But I don't think that's defined behavior.
As far as the loop goes, depending on how you're invoking the executable, you will likely end up crashing your script when it goes to execute a program that has only been half written.
In Windows you can't if the executable is still running; the file will be locked. If the exe isn't actually running, the new runs should pick up the new one, depending on how your script is coded among other things.
I don't know about Linux.
The executable might be loaded completely into memory on startup; however, if it's large enough and runs long enough, the OS might decide to swap out some unused parts of it.
Since the OS assumes that the program's file is still there, there is no reason to actually write these memory blocks into the swap file. So they are simply invalidated and reused. If the program needs these pages again, the OS loads them from the executable file.
In Windows this actually happens automagically, since a loaded module is a memory mapped file. That also means that the file is locked during its execution, and you will not be able to overwrite it easily.
Not sure about Linux, but IIRC it does the swapping the same way.
I'm creating a program (doesn't really matter the objective but it happens to be purely to mess around and learn more about windows) which reads and writes to a file which is in use by another program (for example notepad or word).
Obviously I'm having trouble deleting it as I'm getting an access denied error because the file is in use.
My first idea was I should use CloseHandle (kernel32.dll) to close the handle to that file, but I have no clue how to find that handle in the first place.
Any ideas? I'm doing this in Rust, so if there are any language-specific suggestions that would be best but if not, that's more than fine too.
On another note, what would happen to the program after the handle has been closed? Would word or notepad still be able to edit it or would a subsequent save delete the changes made by my program or perhaps it wouldn't even save?
This behaviour you observe is not related to Rust or to any other programming language, since this is system-specific.
The CreateFileA() win32 call offers, thanks to its third parameter (dwShareMode), a means to explicitly specify how sharing could happen with the open file.
Unfortunately (for you) this call is performed beforehand by the other program you try to hijack, not yours; your program cannot do anything, it's too late once the file is open.
Note that on UNIX the situation is different because the path to the file in the file-system is just a reference to the content of this file, as is the file descriptor returned by an open() operation.
Thus, if you remove (rm) the file indicated by this path, you just remove the reference (unlink()) but not its actual content if it is still referenced by an open file descriptor.
The actual deletion of the file content only happens when no reference to it exists anymore.
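A short sketch of the UNIX behaviour just described, assuming a POSIX system (on Windows the unlink would fail while Python has the file open); read_after_unlink is a made-up name.

```python
import os
import tempfile


def read_after_unlink(path):
    """Unlink `path` while it is open; return (name_exists, data)."""
    with open(path, "w") as f:
        f.write("still here")
    with open(path) as f:
        os.unlink(path)                     # removes the name only
        name_exists = os.path.exists(path)  # the path is gone
        data = f.read()                     # the content is still reachable
    # Closing the last descriptor is what lets the content be freed.
    return name_exists, data
```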
I have a GUI (lxn/walk) app patcher that downloads a file via ftp, streams it to a temporary file and extracts the contents to update the local files. The remove file command is deferred.
This works unless the user exits the program while the file is downloading; in that case the file isn’t deleted.
I tried to fix this by doing a graceful exit: catching the signal and removing the file there. Unfortunately, it throws an error that the file can’t be deleted because it is being used by another program. That makes sense, because the other program is actually the patcher itself, still writing to the temporary file.
Now I’m stuck and don’t know what to do to make sure that the temporary file is automatically gone once the patcher is not running. How do I do that correctly?
The file could also be created as a normal file, not just a temp file. I would just like to ask too, where in windows is best to write a temporary file?
Now I’m stuck and don’t know what to do to make sure that the temporary file is automatically gone once the patcher is not running. How do I do that correctly?
There are no guaranteed ways to accomplish this as many things beyond the control of the application can cause it to exit. A power failure or kernel panic due to some hardware issue can crash the machine or force it to be restarted.
A strategy that is in common use is to implement a check on program startup for the status of the previous run. Some applications create a lock file at start and remove it on graceful exit. If this lock file exists when the program is restarted, this means the previous run did not result in a clean exit, and the application can take any corrective action. The exact action to be taken depends on the nature of the application, some refuse to start, others give warnings to users.
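The lock-file strategy might look like the sketch below. It is written in Python rather than the asker's Go, and begin_run, end_run and the lock file location are hypothetical; what to do after an unclean exit (for example, deleting a leftover temporary download) is up to the application.

```python
import os
import tempfile


def begin_run(lock_path):
    """Create the lock file; return True if the previous run exited uncleanly."""
    unclean = os.path.exists(lock_path)
    with open(lock_path, "w") as f:
        f.write(str(os.getpid()))
    return unclean


def end_run(lock_path):
    """Remove the lock file on graceful exit."""
    os.remove(lock_path)
```

A patcher would call begin_run at startup, clean up any stale temporary files if it returns True, and call end_run only on a graceful exit.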
I would just like to ask too, where in windows is best to write a temporary file?
Each OS has its own location for temporary files. If you eliminate the dir argument to TempFile, it will create it in the appropriate location, as mentioned in the documentation:
TempFile creates a new temporary file in the directory dir, opens the
file for reading and writing, and returns the resulting *os.File. The
filename is generated by taking pattern and adding a random string to
the end. If pattern includes a "*", the random string replaces the
last "*". If dir is the empty string, TempFile uses the default
directory for temporary files (see os.TempDir). Multiple programs
calling TempFile simultaneously will not choose the same file. The
caller can use f.Name() to find the pathname of the file. It is the
caller's responsibility to remove the file when no longer needed.
From os.TempDir we see the following:
On Unix systems, it returns $TMPDIR if non-empty, else /tmp. On
Windows, it uses GetTempPath, returning the first non-empty value
from %TMP%, %TEMP%, %USERPROFILE%, or the Windows directory. On
Plan 9, it returns /tmp.
The directory is neither guaranteed to exist nor have accessible
permissions.
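For comparison only, Python's standard tempfile module follows the same idea as the Go functions quoted above: with no directory given, the file lands in the platform's default temporary directory. This is an analogue, not the Go API, and write_temp is a made-up helper.

```python
import os
import tempfile


def write_temp(data, suffix=".part"):
    """Write `data` to a new file in the default temp directory; return its path."""
    # delete=False: the caller is responsible for removing the file,
    # as with the TempFile function quoted above.
    with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as f:
        f.write(data)
        return f.name
```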
I have 2 applications running in parallel, both doing the following:
check for file not containing "processed"
process the file and then rename it to filename+processed
for every file, only one application shall use it (on a first come first served basis)
I get the files and I also lock them so the other application cannot process them. But when it comes to renaming the file I get a problem. To rename the file, I wanted to use the File.renameTo function. However, for that to work, I have to release the lock on the file. But when I release the lock, another process may try to use the file, and exactly that should not happen.
Is there any way to prevent the application B from using the file between application A releasing the lock and finishing renaming the file?
EDIT
Some more information:
File creation if the file doesn't exist has to be prevented.
The file will be processed with RandomAccessFile (with read and write permission; this creates a new file if it doesn't exist).
Note: On Linux, one can rename a file that is locked, so this problem doesn't occur there. However, on Windows a locked file cannot be renamed; I have to release the lock, then rename it. But the time during which the lock is released enables other applications to see that the file is available, and then they will try to use it.
Windows applications can do this using the SetFileInformationByHandle function, which allows you to rename the file using the handle you already have open. You probably can't do this natively from Java.
However, a more straightforward solution would be to rename the file (to filename+processing, for example) before you start processing it. Whichever process successfully renames the file in this way is the one responsible for processing it and eventually renaming it to filename+processed.
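The rename-first approach can be sketched as follows, in Python rather than the asker's Java. The names try_claim and finish and the ".processing" / ".processed" suffixes are illustrative assumptions. The point is that only one process can succeed in renaming a given file, because for everyone else the source name no longer exists.

```python
import os
import tempfile


def try_claim(path):
    """Try to claim `path` by renaming it; return the new path or None."""
    claimed = path + ".processing"
    try:
        os.rename(path, claimed)
    except (FileNotFoundError, FileExistsError):
        # Someone else renamed it first (the source is gone), or, on
        # Windows, the destination already exists.
        return None
    return claimed


def finish(claimed):
    """Mark a claimed file as done: name.processing -> name.processed."""
    done = claimed[: -len(".processing")] + ".processed"
    os.rename(claimed, done)
    return done
```

Each application would loop over candidate files, call try_claim, skip any file for which it returns None, and call finish once processing is complete.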
I'm trying to reverse-engineer a program that does some basic parsing: text in, text out. I've got an executable "reference implementation" and the source code to what must be a different version, since the compiled source output != executable output.
The process creates and deletes temporary files very quickly in a multi-step parsing process. If I could take a look at the individual temporary files, I could get some great diagnostic data to narrow down where my source differs from the binary.
Is there any way to do any of the following?
Freeze a directory so that file creation will work but file deletion will fail silently?
Run a program in "slow motion" so that I can look at the files that it creates?
Log everything that a program does, including any data written out to files?
Running a tool like NTFS Undelete should give you the chance to recover the temporary files it's creating then deleting. Combine this with ProcMon from Sysinternals to get the right filenames.
You didn't mention what OS you're doing this on, but assuming you're using Windows...
You might be able to make use of SysInternals tools like Process Explorer and Process Monitor to get a better idea of the files being accessed. As far as I know, there's no "write-only" option on folders. For "slowing down" the program, you'd just need to use a slower computer. For logging, the SysInternals tools will help out quite a bit. Once you have the names of the files being created, you could try preventing their deletion by opening the files in a stream from another process. That would prevent the system from being able to delete them.
There are two ways to attack this:
Run various small test cases through both systems and notice the differences. Since the test cases are small, you should be able to figure out why your code works differently than the executable.
Disassemble the executable and remove all the "delete temp file" instructions. Depending on how this works, this could be a very complex task (say when there is no central place where it happens).
I have a Windows service application on Vista SP1 and I've found that users are renaming its executable file (while it's running) and then rebooting, thus causing it to fail to start on next bootup because the service manager can no longer find the exe file since it's been renamed.
I seem to recall that with older versions of Windows you couldn't do this because the OS placed a lock on the file. Even with Vista SP1 I still cannot copy over the existing file when it's running - Windows reports that the file is in use - makes sense. So why should I be allowed to rename it? What happens if Windows needs to page in a new code page from the exe but the file has been renamed since it was started? I ran Process Monitor while renaming the exe file, etc, but Process Mon didn't report anything strange and just logged changing the filename like any other file.
Does anyone know what's going on here behind the scenes? It seems counterintuitive that Windows would allow a running process' filename (or its dependent DLLs) to be changed. What am I missing here?
Your concept is wrong: the filename is not the center of the file-I/O universe; the handle to the open file is. The file is not moved to a different section of the disk when you rename it. It stays in the same place, and the internal data structure for the open file still points to that place. Bottom line: your observations are correct. You can rename a running program without causing problems, and once you've renamed it you can create a new file with the name the running program used to have. This is actually useful behavior if you want to update software while the software is running.
As long as the file is still there, Windows can still read from it - it's the underlying file that matters, not its name.
I can happily rename running executables on my XP machine.
The OS keeps an open handle to the .exe file. Renaming the file simply changes some filesystem metadata about the file, without invalidating open handles. So when the OS goes to page in more code, it just uses the file handle it already has open.
Replacing the file (writing over its contents) is another matter entirely, and I'm guessing the OS opens with the FILE_SHARE_WRITE flag unset, so no other processes can write to the .exe file.
Might be a stupid question, but why do users have access to rename the file if they are not supposed to rename it? But yeah, it's allowed because, as the good answers point out, the open handle to the file isn't lost until the application exits. And there are some uses for it as well, even though I'm not convinced updating an application by renaming its file is a good practice.
You might consider having your service listen to changes to the directory that your service is installed in. If it detects a rename, then it could rename itself back to what it's supposed to be.
There are two aspects to the notion of file here:
The data on the disk - that's the actual file.
The file-name (could be several or none) which you can give that data - called directory entries.
What you are renaming is the directory entry, which still references the same data. Windows doesn't care about your doing so, as it still can access the data when it needs to. The running process is mapped to the data, not the name.
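The split between data and directory entries can be observed directly. This sketch assumes a filesystem that supports hard links (typical on Linux, and also available on NTFS); two_names_one_file and the file names are made up for illustration.

```python
import os
import tempfile


def two_names_one_file(directory):
    """Give one file a second name, then rename the first; return inode info."""
    first = os.path.join(directory, "a.txt")
    second = os.path.join(directory, "b.txt")
    renamed = os.path.join(directory, "c.txt")
    with open(first, "w") as f:
        f.write("data")
    ino = os.stat(first).st_ino
    os.link(first, second)     # second directory entry for the same data
    os.rename(first, renamed)  # changes a directory entry, not the data
    return ino, os.stat(second).st_ino, os.stat(renamed).st_ino
```

All three values identify the same underlying file: adding a name or renaming one never touches the data itself.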