When is it necessary to flush a file?
I never do it because I call File.Close and I assume the file is flushed automatically. Is that right?
You'll notice that an os.File doesn't have a .Flush(). It doesn't need one because it isn't buffered: writes to it are direct syscalls that write to the file.
When your program exits (even if it crashes), all files it has open will be closed automatically by the operating system, and the file system will write your changes to disk when it gets around to it (sometimes up to a few minutes after your program exits).
Calling os.File.Sync() will call the fsync() syscall, which forces the file system to flush its buffers to disk. This guarantees that your data is on disk and persistent even if the system is powered down or the operating system crashes.
In most programs you don't need to call .Sync().
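For the cases where you do want that guarantee, here is a minimal sketch of the pattern. The function name writeDurably and its parameters are made up for illustration; only os.Create, Write, Sync and Close are real os package calls.

```go
package main

import (
	"os"
)

// writeDurably (hypothetical helper) writes data to path and asks the
// operating system to flush it to stable storage before returning.
func writeDurably(path string, data []byte) error {
	f, err := os.Create(path)
	if err != nil {
		return err
	}
	// Write issues a write syscall directly; os.File has no user-space buffer.
	if _, err := f.Write(data); err != nil {
		f.Close()
		return err
	}
	// Sync calls fsync, which returns once the kernel reports the data flushed.
	if err := f.Sync(); err != nil {
		f.Close()
		return err
	}
	// Close only releases the descriptor; it does not imply an fsync.
	return f.Close()
}
```

You would call it as writeDurably("out.txt", []byte("hello\n")) and check the returned error. If you wrap the file in a bufio.Writer, that wrapper does need its own Flush() before Sync().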
File.Sync() is a call to the fsync syscall. You should be able to find more under that name.
Keep in mind that fsync is not the same as fflush and is not executed before close.
You generally don't need to call it. The file will be written to disk anyway after some time, provided there is no power failure in the meantime.
It looks like most recommendations here are not to call fsync(), but in general it depends on your application's requirements. If you are doing critical file reads/writes, it's always recommended to call fsync().
This link has more details on when file.Sync() will help:
http://www.microhowto.info/howto/atomically_rewrite_the_content_of_a_file.html#idp31936
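One common way to rewrite a file atomically is: write to a temporary file in the same directory, sync it, then rename it over the original. Below is a rough sketch of that idea in Go. The name atomicWriteFile is hypothetical, and the final directory sync assumes a POSIX-style system (it may not work on Windows).

```go
package main

import (
	"os"
	"path/filepath"
)

// atomicWriteFile (hypothetical helper) replaces the content of path so that
// readers see either the old content or the new content, never a partial write.
func atomicWriteFile(path string, data []byte, perm os.FileMode) error {
	dir := filepath.Dir(path)
	// The temporary file lives in the same directory so the rename stays
	// within one file system.
	tmp, err := os.CreateTemp(dir, ".tmp-*")
	if err != nil {
		return err
	}
	tmpName := tmp.Name()
	// Clean up on failure; after a successful rename this fails harmlessly.
	defer os.Remove(tmpName)

	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Chmod(perm); err != nil {
		tmp.Close()
		return err
	}
	// Flush the new content to disk before it becomes visible under path.
	if err := tmp.Sync(); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	if err := os.Rename(tmpName, path); err != nil {
		return err
	}
	// Sync the directory so the rename itself is persisted (POSIX assumption).
	d, err := os.Open(dir)
	if err != nil {
		return err
	}
	defer d.Close()
	return d.Sync()
}
```

This is where Sync() earns its keep: without it, a crash shortly after the rename could leave the new name pointing at a file whose content never reached the disk.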
When you want to ensure data integrity as much as possible. For example, what happens if your program crashes before it comes to closing the file?
Related
In my program I call ReadDirectoryChangesW to listen for file events in a given directory. The problem is that some events (e.g. FILE_ACTION_ADDED) are signaled when the file is opened, not when it is closed. This means the file will be locked by another process for some unspecified amount of time and CreateFileW will return an error.
The question is: how do I open the file when the other process is done with it? I can tolerate race conditions (e.g. some other process manages to delete the file after it's closed, but before I open it), but I'd like to avoid busy waiting.
Options I see/considered so far:
Asynchronous CreateFileW. That would be an ideal solution, but it's not possible - all user-space APIs for opening a file are synchronous by design (see a great explanation).
Listening for FILE_NOTIFY_CHANGE_LAST_ACCESS. This almost works - notification on close is sent only when the other process wrote some bytes to the file.
I also found some resources on kernel filter drivers, which can detect file close event. Probably works, but seems a bit too complex.
Busy loop continuously calling CreateFileW until it succeeds. Overutilizes the CPU, but is the only thing that actually works. I'm worried I'm stuck with this approach.
I'd like to load a file such that the contents are either in memory or on disk but not both. A simple open, read, and delete should accomplish the task. But it seems that it's up to the OS to decide when to flush the delete command to hardware. On Linux, a call to sync() should accomplish this. On Windows, the closest you can get is FlushFileBuffers(), which takes a handle. When deleting a file, you don't use handles, just paths. Is there a way to force Windows to flush a delete request to disk rather than queuing or caching it?
While a file is being load-ed/require-d, is it locked against writing?
If not, how can I exclusively lock the file from writing during load/require?
Presumably File#flock should be used in that case, but I don't know the answer to the first question, nor how to combine it with load/require.
On POSIX-type systems, the only protection an open file gives you is against deletion: if the file is deleted (technically, unlinked from the filesystem and orphaned), you can still read its contents through the open handle. Once you close the file, you lose all access to it. Windows may be different.
There's nothing to prevent another process from overwriting part of the file or truncating it while your process is trying to do its thing.
Remember that File#flock is simply a polite way of requesting a lock. Unless the other process that's about to manipulate the file is equally polite and checks for the lock, you have no guarantees about the state of your file. Processes are free to ignore the lock and mangle your file without warning.
The only way to be sure nobody touches your file is to copy it to a private /tmp directory, test that the thing copied correctly, and read it in from there. That's an extremely paranoid thing to do so I'd hope you have a compelling reason before going down that road.
If you can control all the processes that access your file, and make them well-behaved citizens that use a consistent locking mechanism for the file, you'll probably be fine. If that's not the case, you may want to have a master process that grants access to the files on an exclusive basis using some kind of IPC signalling.
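To make the "well-behaved citizens" idea concrete, here is a small sketch of cooperating readers and writers built on flock(2). It is written in Go rather than Ruby, and the helper names writeLocked and readLocked are invented for this example. syscall.Flock is only available on Unix-like systems, and the lock is advisory: it only protects against processes that also take it.

```go
package main

import (
	"io"
	"os"
	"syscall"
)

// writeLocked (hypothetical helper) replaces the file content while holding
// an exclusive advisory lock.
func writeLocked(path string, data []byte) error {
	// Do not truncate on open: truncation must wait until the lock is held.
	f, err := os.OpenFile(path, os.O_RDWR|os.O_CREATE, 0644)
	if err != nil {
		return err
	}
	defer f.Close()
	// Blocks until no other cooperating process holds a lock on the file.
	if err := syscall.Flock(int(f.Fd()), syscall.LOCK_EX); err != nil {
		return err
	}
	defer syscall.Flock(int(f.Fd()), syscall.LOCK_UN)
	if err := f.Truncate(0); err != nil {
		return err
	}
	_, err = f.Write(data)
	return err
}

// readLocked (hypothetical helper) reads the whole file while holding a
// shared advisory lock, so cooperating writers wait until the read is done.
func readLocked(path string) ([]byte, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()
	if err := syscall.Flock(int(f.Fd()), syscall.LOCK_SH); err != nil {
		return nil, err
	}
	defer syscall.Flock(int(f.Fd()), syscall.LOCK_UN)
	return io.ReadAll(f)
}
```

In the Ruby case the equivalent would be to take the lock with File#flock, read the source yourself while holding it, and only then evaluate it. A process that writes the file without calling flock is not stopped by any of this.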
I've been going through the WinAPI documentation for a while, but I can't seem to find an answer. What I'm trying to achieve is to give a program a file name that it can open and work with as if it were a normal file on disk. But I want this object to live in memory.
I tried using named pipes and they work in some situations, but not always. I create a named pipe and pass it to the child process as a regular file. When the process exits, I collect the data from the pipe.
program.exe \\.\pipe\input_pipe
I ran into some limitations, though. One is that pipes are not seekable. The second is that they have to be opened with exactly the right permissions. The third is that you cannot pre-put any data into a duplex pipe before it has been opened on the other end. Is there any way to overcome those limitations of named pipes?
Or maybe there is some other kind of object that could be opened with CreateFile and then accessed with ReadFile and WriteFile. So far the only solution I see is to create a file system driver and implement all the functionality myself.
To be clear, I cannot change the child program I'm running. The main idea is to give that program something that it would think is a normal file.
UPDATE: I'm not looking for a solution that involves installation of any external software.
Memory-mapped files would allow you to do what you want.
EDIT:
On rereading the question: since the receiving program already uses CreateFile/ReadFile/WriteFile and cannot be modified, this will not work. I cannot think of a way to do what the OP wants other than a third-party or self-written RAMDisk solution.
The simplest solution might be, as you seem to suggest, using a Ramdisk to make a virtual drive mapped to memory. Then obviously, any files you write to or read from that virtual drive will be completely contained in RAM (assuming it doesn't get paged to disk).
I've done that a few times myself to speed up a process that was entirely disk-bound.
Call CreateFile but with FILE_ATTRIBUTE_TEMPORARY and probably FILE_FLAG_DELETE_ON_CLOSE as well.
The system will then try to keep the file in cache and avoid writing it to disk unless it runs low on physical memory.
UNIX file-locking is dead-easy: The operating system assumes that you know what you are doing and lets you do what you want:
For example, if you try to delete a file which another process has opened, the operating system will usually let you do it. The original process still keeps its file handles until it terminates, at which point the file system will quietly recycle the disk resources. No fuss, that's the way I like it.
How different things are on Windows: if I try to delete a file which another process is using, I get an operating system error. The file is untouchable until the original process releases its lock on the file. That was great back in the single-user days of MS-DOS, when any locking process was likely to be on the same computer that contained the files; on a network, however, it's a nightmare:
Consider what happens when a process hangs while writing to a shared file on a Windows file-server. Before the file can be deleted we have to locate the computer and ID the process on that computer which originally opened the file. Only then can we kill the process and delete our unwanted file.
What a nuisance!
Is there a way to make this better? What I want is for file-locking on Windows to behave like file-locking in UNIX. I want the operating system to just let me do what I want because I'm in charge and I know what I'm doing...
...so can it be done?
No. Windows is designed for the "average user", that is people who don't understand anything about a computer. Therefore, the OS tries to be smart to avoid PEBKACs. To quote Bill Gates: "There are no issues with Windows that any number of people want to be fixed." Of course, he knows that 99.9999% of all Windows users can't tell whether the program just did something odd because of them or the guy who wrote it.
Unix was designed when the world was simpler and anyone close enough to a computer to touch it probably knew how to assemble it from dirty sand. Therefore, the OS usually lets you do what you want because it assumes that you know better (and if you didn't, you will next time).
Technical answer: Unix allocates an "i-node" when you create a file. An i-node can be shared between processes that open the same path. The path itself is only a directory entry pointing at the i-node: if a file is unlinked while a process still has it open and another process then creates a file at the same path, you end up with two i-nodes. This is by design. It allows for a fancy security feature: you can create files which no one can open but yourself:
Open a file
Delete it (but keep the file handle)
Use the file any way you like
Close the file
After step #2, the only process in the universe that can access the file is the one that created it (unless you want to read the hard disk block by block). The OS will keep the data alive until you either close the file or your process dies (at which time Unix will clean up after you).
This design is the foundation of all Unix filesystems. The Windows file system NTFS works much the same way but the high-level API is different. Many applications open files in exclusive mode (which prevents anyone, even backup programs, from reading the file). This is even true for applications which just display information, like PDF viewers.
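The four steps above can be sketched in a few lines of Go. The function name privateScratch is made up for illustration, and the sketch assumes a POSIX-style system where removing an open file is allowed.

```go
package main

import (
	"io"
	"os"
)

// privateScratch (hypothetical helper) creates a file, unlinks it right away,
// and keeps working through the still-open handle.
func privateScratch(dir string, data []byte) ([]byte, error) {
	// Step 1: open (create) the file. An empty dir means the default temp dir.
	f, err := os.CreateTemp(dir, "scratch-*")
	if err != nil {
		return nil, err
	}
	// Step 4: closing the handle lets the OS reclaim the disk space.
	defer f.Close()

	// Step 2: delete the directory entry; the open handle stays valid.
	if err := os.Remove(f.Name()); err != nil {
		return nil, err
	}

	// Step 3: use the file any way you like, here a write followed by a read.
	if _, err := f.Write(data); err != nil {
		return nil, err
	}
	if _, err := f.Seek(0, io.SeekStart); err != nil {
		return nil, err
	}
	return io.ReadAll(f)
}
```

After the os.Remove call the path no longer exists, yet the write and read still succeed because they go through the handle rather than the path.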
That means you'll have to fix all the Windows applications to achieve the desired effect. If you have access to the source, you can create a file in a shared mode. That would allow other processes to access it at the same time but then, you will have to check before every read/write if the file still exists, whether someone has made changes, etc.
According to MSDN, you can pass the share-mode flag FILE_SHARE_DELETE in the third parameter of CreateFile() (dwShareMode), which:
Enables subsequent open operations on a file or device to request delete access.
Otherwise, other processes cannot open the file or device if they request delete access.
If this flag is not specified, but the file or device has been opened for delete access, the function fails.
Note Delete access allows both delete and rename operations.
http://msdn.microsoft.com/en-us/library/aa363858(VS.85).aspx
So if you can control your applications, you can use this flag.
Note that Process Explorer allows force-closing file handles (for processes local to the box on which you are running it) via Handle -> Close Handle.
Unlocker purports to do a lot more, and provides a helpful list of other tools.
Deleting on reboot is also an option (though it sounds like that is not what you want).
That doesn't really help if the hung process still has the handle open. It won't release the resources until that hung process releases the handle. But anyway, in Windows it is possible to force close a file out from under a process that's using it. Process Explorer from sysinternals.com will let you look at and close handles that a process has open.