How to remove a file and avoid introducing a race condition? - c++11

I try to remove a file and recreate it if it is corrupted. I used remove() and unlink(), but both might lead to race conditions. (I found it while I run static analysis from Parasoft.)
Does anybody know an alternative?

Related

Is there a way to limit my executable's ability to delete to only files it has created?

I'm on Windows writing a C++ executable that deletes and replaces some files in a directory it creates during an earlier run session. Maybe I'm a little panicky, but since my directory and file arguments for the deletions are generated by parsing an input file's path, I worry about the parse throwing out a much higher or different directory due to an oversight and systematically deleting unrelated files unintentionally.
Is there a way to limit my executable's reign to only include write/delete access to files it has created during earlier run sessions, while retaining read access to everything else? Or at least provide a little extra peace of mind that, even if I really mis-speak my strings to DeleteFileA() and RemoveDirectoryA() I'll avoid causing catastrophic damage?
It doesn't need to be a restriction to the entire executable, it's good enough if it limits the function calls to delete and remove in some way.

Changing directories in go routines

I am trying to change directories in a go routine to directory x. I now want to use a different go routine that changes the directory to directory y. Will the execution of my first go routine be affected by this change to the current working directory in the second go routine? The purpose of wanting to do this is to introduce parallelism while doing similar tasks. If it does end up changing the CWD, what should be an alternative approach (forking...)?
Like mentioned in the comments, keeping track of the current working directory in each goroutine will cause problems.
Try using filepath.Abs to capture the absolute directory and store that instead. Then each goroutine can operate on it's own directory without worrying about it being "switched" under the hood. Just be sure you're not modifying the same file accidentally by multiple goroutines.
Edit: Removing a chunk of text per #Evan's comment. Use absolute paths :p
#Evan has identified a fundamental flaw in attempting to use the 'change working directory' (CWD) system call.
I believe that #Evan is correct, and that the CWD is a thread property on some OS's.
As #Evan pointed out, a goroutine could be resheduled (for example at a function call, channel access, or system call) onto a different thread.
The implications are, it may be impossible to change the CWD (if Chdir() could change the threads CWD) because Go's runtime chooses to reschedule the goroutine on a different thread; its CWD could change invisibly and unpredictably.
Edit: I would not expect Chdir() to do anything other than change the CWD for the process. However, the documentation for the package has no mention of 'process'.
Worse, the runtime may change how things work with releases.
Even worse, it would be very hard to debug. It may be a 'Heisenberg problem', where any attempt to debug it (for example by calling a function, which the runtime may use as a reschedule point) may actually changes the behaviour in an unpredictable way.
Keep track of absolute path names. This is explicit, clear, and would even work across goroutines without any need for synchronisation. Hence it is simpler, and easier to test and debug.

The bizarre case of the file that both is and isn’t there

In .Net 3.5, I have the following code.
If File.Exists(sFilePath & IndexFileName & ".NX") Then
Kill(sFilePath & IndexFileName & ".NX")
End If
At runtime, on one client's machine, I get the following exception, over and over, when this code executes
Source: Microsoft.VisualBasic
TargetSite: Microsoft.VisualBasic.FileSystem.Kill
Message: No files found matching 'I:\RPG\HGIAPVXD.NX'.
StackTrace:
at Microsoft.VisualBasic.FileSystem.Kill(String PathName)
(More trace that identifies the exact line of code.)
There are two people on different machines running this code, but only one of them is getting the exception. The exception does not happen every time, but it is happening regularly. (Multiple times every hour.) The code is not in a loop, nor does it run continuously, more like once every couple of minutes or so.
On the surface, this looks like a race condition, but given how infrequently this code is run and how often the error is happening I think there must be something else going on.
I would appreciate any suggestions on how I can track down what is really going on here. A solution to keep the error from happening would be even better.
I guess the first question to ask is "IS the file really there or not?" and if so, does it have any specical attributes (Is it Read-only or Hidden, or System --- or a Directory)?
Note the Microsoft.VisualBasic.FileSystem.Kill specifically looks for, and silently skips, any file marked "System" or "Hidden". For pretty much any other problem you would have gotten a different exception.
as James pointed out the Kill functions checks if the file in case is a system or hidden, you better use System.IO.File.Delete() instead
Try
System.IO.File.Delete(sFilePath & IndexFileName & ".NX")
Catch ex As System.Exception
...
End Try
using File.Exits is not neccasary because File.Delete() checks this by itself.
Is there any chance that the I: drive is a network drive? it could be some network issue... or then maybe a race condition

Is it possible to create a file that cannot be copied?

To restrict the scope, let assume we are in Windows world only.
Also assume we don't want to play with permission policy.
Is it possible for us to create a file that cannot be copied?
Thank you in advance.
"Trying to make digital files uncopyable is like trying to make water not wet." ~ Bruce Schneier
No. You can't create a file that a SYSADMIN can't copy. You could encrypt it, though.
Well, how about creating a file that uses up more than 50% of the total space on that machine and that is not compressible?
For instance, let us assume that you want to save a boolean (true or false) in such a fashion.
Depending on its value, you could then write a bit stream of ones or zeroes and encrypt said stream using some kind of encryption algorith, such as AES in CBC mode. This gives you the added advantage of error correction. Even in case of massive data corruption, you should be able to recover your boolean by checking whether ones or zeroes are prevalent in the decrypted stream.
In that case you cannot copy it around (completely) on the machine...
Of course, any type of external memory that can be added to the system would pose a problem in this scenario. But the file would be already encrypted, so don't worry about it too much...
Any file that can be read can have its contents written to another location (such as another file, i.e. copied).
The only thing you can do is limit who/what can read the file.
What is the motivation behind? If it is a read-only file, you can have it as embedded resources within your assembly.
Nice try, RIAA.
But seriously, no you can not. It is always possible to copy, you can just make it it more difficult for people to make sense of the file or try to hide it using like encryption. Spotify does it.
If you really try hard thou, you cold make a root-kit for windows and use it to prevent windows from even knowing about the file and also prevent copies. The file will still be there and copy-able by other tools, or Linux accessing the ntfs.
If in a running process you open a file and hold an exclusive lock, then other processes cannot read the file until you close the handle or your process terminates. However, as admin you could forcibly remove the lock handle.
Short answer: No.
You can, of course, use security settings to limit who can read the file. But if someone can read it, then they can copy it. Even if you found some operating system trick to disable "ordinary" copying, if someone can read the file, they can extract the contents, store it in memory, and then write it somewhere else.
You can encrypt the contents so it's only useful to your own program, that knows how to decrypt it.
That's about it.
When using Windows 7 to copy some files from a hard drive, certain files popped up a message saying they could not be copied in their entirety; certain data would be omitted from the copy. I suspect that had something to do with slack space at the end of the files, though I thought the message was curious. I would have expected the copy operation to just ignore the slack space.
If you are running old (OLD) versions of windows, there are certain characters you can put in the filename that make it invalid, not listed in folders, etc. They were used a lot in the old pub ftp days of filesharing ;)
In the old DOS days, you used to be able to flag disk sectors as bad and still read from them. This meant the OS ignored the sector in question but your application would know where to look and be able to get the data. Not sure this would work these days.
Another old MS-DOS trick was to put a space character in the middle of the filename (yes, spaces were valid characters for filenames). Since there was no method on the command line to escape a space, the file couldn't be copied using the DOS commands.
This answer is outside Windows so yeah
Dont know if its already been said but what about a file that is an inseperable part of the firmware so that it is always on AND running, perhaps it has firmware that generates a sequence that is required for the other . AN incedental effect of its running is to prevent any 80% or more of its code from being replicated. Lets say its on an entirely different board, protected by surge protectors, heavy em proof shielding and anything else required to make it completely unerasable.
If its possible to make a program that is ALWAYS on and running as long as the copying software is running then yes.
I have another way and this IS with windows. I will come to your house and give you a disk, i will then proceed to destroy every single computer you put the disk into. This doesnt work on XP
Well technically you could create and write to a write-only network share.

File Unlocking and Deleting as single operation

Please note this is not duplicate of File r/w locking and unlink. (The difference - platform. Operations of files like locking and deletion have totally different semantics, thus the sultion would be different).
I have following problem. I want to create a file system based session storage where each session data is stored in simple file named with session ids.
I want following API: write(sid,data,timeout), read(sid,data,timeout), remove(sid)
where sid==file name, Also I want to have some kind of GC that may remove all timed-out sessions.
Quite simple task if you work with single process but absolutly not trivial when working with multiple processes or even over shared folders.
The simplest solution I thought about was:
write/read:
hanlde=CreateFile
LockFile(handle)
read/write data
UnlockFile(handle)
CloseHanlde(handle)
GC (for each file in directory)
hanlde=CreateFile
LockFile(handle)
check if timeout occured
DeleteFile
UnlockFile(handle)
CloseHanlde(handle)
But AFIAK I can't call DeleteFile on opended locked file (unlike in Unix where file locking is
not mandatory and unlink is allowed for opened files.
But if I put DeleteFile outside of Locking loop bad scenario may happen
GC - CreateFile/LockFile/Unlock/CloseHandle,
write - oCreateFile/LockFile/WriteUpdatedData/Unlock/CloseHandle
GC - DeleteFile
Does anybody have an idea how such issue may be solved? Are there any tricks that allow
combine file locking and file removal or make operation on file atomic (Win32)?
Notes:
I don't want to use Database,
I look for a solution for Win32 API for NT 5.01 and above
Thanks.
I don't really understand how this is supposed to work. However, deleting a file that's opened by another process is possible. The process that creates the file has to use the FILE_SHARE_DELETE flag for the dwShareMode argument of CreateFile(). A subsequent DeleteFile() call will succeed. The file doesn't actually get removed from the file system until the last handle on it is closed.
You currently have data in the record that allows the GC to determine if the record is timed out. How about extending that housekeeping info with a "TooLateWeAlreadyTimedItOut" flag.
GC sets TooLateWeAlreadyTimedItOut = true
Release lock
<== writer comes in here, sees the "TooLate" flag and so does not write
GC deletes
In other words we're using a kind of optimistic locking approach. This does require some additional complexity in the Writer, but now you're not dependent upon any OS-specifc wrinkles.
I'm not clear what happens in the case:
GC checks timeout
GC deletes
Writer attempts write, and finds no file ...
Whatever you have planned for this case can also be used in the "TooLate" case
Edited to add:
You have said that it's valid for this sequence to occur:
GC Deletes
(Very slightly later) Writer attempts a write, sees no file, creates a new one
The writer can treat "tooLate" flag as a identical to this case. It just creates a new file, with a different name, use a version number as a trailing part of it's name. Opening a session file the first time requires a directory search, but then you can stash the latest name in the session.
This does assume that there can only be one Writer thread for a given session, or that we can mediate between two Writer threads creating the file, but that must be true for your simple GC/Writer case to work.
For Windows, you can use the FILE_FLAG_DELETE_ON_CLOSE option to CreateFile - that will cause the file to be deleted when you close the handle. But I'm not sure that this satisfies your semantics (because I don't believe you can clear the delete-on-close attribute.
Here's another thought. What about renaming the file before you delete it? You simply can't close the window where the write comes in after you decided to delete the file but what if you rename the file before deleting it? Then when the write comes in it'll see that the session file doesn't exist and recreate it.
The key thing to keep in mind is that you simply can't close the window in question. IMHO there are two solutions:
Adding a flag like djna mentioned or
Require that a per-session named mutex be acquired which has the unfortunate side effect of serializing writes on the session.
What is the downside of having a TooLate flag? In other words, what goes wrong if you delete the file prematurely? After all your system has to deal with the file not being present...

Resources