When the policy for a disk in Windows XP and Vista is set to enable write caching on the hard disk, is there a way to flush a file that has just been written, and ensure that it has been committed to disk?
I want to do this programmatically in C++.
Closing the file does perform a flush at the application level, but not at the operating system level. If the power is removed from the PC after closing the file, but before the operating system has flushed the disk write cache, the file is lost, even though it was closed.
.NET FileStream.Flush() will NOT flush the Windows cache for that file's content; Flush() only flushes the .NET-internal file buffer. In .NET 4.0, Microsoft addressed this by adding an optional parameter to Flush() which, when set to true, causes the Win32 FlushFileBuffers function to be called. In .NET 3.5 and below your only choice is to call FlushFileBuffers via P/Invoke. See the community comment on MSDN's FileStream.Flush page for how to do this.
You should not fix this at the time you close the file. Windows will cache, unless you open the file passing FILE_FLAG_WRITE_THROUGH to CreateFile().
You may also want to pass FILE_FLAG_NO_BUFFERING; this tells Windows not to keep a copy of the bytes in cache.
This is more efficient than FlushFileBuffers(), according to the CreateFile documentation on MSDN.
See also file buffering and file caching on MSDN.
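For illustration, here is a minimal Win32 sketch of opening a file with FILE_FLAG_WRITE_THROUGH; the file name, payload, and helper function are placeholders, and error handling is abbreviated:

    // Open with FILE_FLAG_WRITE_THROUGH so each WriteFile() call does not
    // return until the data has been pushed through the cache to the disk.
    #include <windows.h>

    bool WriteDurably(const wchar_t* path, const void* data, DWORD size)
    {
        HANDLE h = CreateFileW(path,
                               GENERIC_WRITE,
                               0,                      // no sharing
                               nullptr,
                               CREATE_ALWAYS,
                               FILE_ATTRIBUTE_NORMAL | FILE_FLAG_WRITE_THROUGH,
                               nullptr);
        if (h == INVALID_HANDLE_VALUE)
            return false;

        DWORD written = 0;
        BOOL ok = WriteFile(h, data, size, &written, nullptr);
        CloseHandle(h);
        return ok && written == size;
    }

If you additionally pass FILE_FLAG_NO_BUFFERING, keep in mind that buffer addresses, transfer sizes, and file offsets must then be sector-aligned, as described in the File Buffering article.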
You haven't specified the development environment, so:
.NET
IO streams have a .Flush method that does what you want.
Win32 API
There is the FlushFileBuffers call, which takes a file handle as argument.
EDIT (based on a comment from the OP): FlushFileBuffers does not need administrative privileges; it does only if the handle passed to it is a handle to a volume rather than to a single file.
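As a rough sketch (assuming a handle already opened with write access; the helper name is illustrative), the Win32 call is simply:

    #include <windows.h>

    // Flush both the application's pending writes and the OS cache pages for
    // this file down to the device. For a handle to an ordinary file, no
    // administrative privileges are required.
    bool FlushToDisk(HANDLE h)
    {
        return FlushFileBuffers(h) != FALSE;
    }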
You should also note that your data might not get flushed to the actual disk even when invoking a flush method of your framework's API.
Calling the flush method only tells the kernel to flush its pages to disk. However, if the disk's write cache is turned on, the drive is allowed to delay the actual writing process indefinitely.
In order to ensure that your data gets written to the physical layer, you have to turn off the write cache in your operating system. This most often comes with a performance penalty of up to one or two orders of magnitude when dealing with a lot of small I/O operations.
Battery-backed power (a UPS) or disks that accept commands to flush their write cache are other options for dealing with this problem.
According to the Microsoft documentation, you would use _flushall and link in COMMODE.OBJ to ensure that all buffers are committed to disk.
See here: https://jeffpar.github.io/kbarchive/kb/066/Q66052/
When you initially open your file using fopen, include the "c" mode option as the LAST OPTION:
fopen( path, "wc") // w - write mode, c - allow immediate commit to disk
Then when you want to force a flush to disk, call
_flushall()
We made this call before calling
fclose()
We experienced the exact issue you described and this approach fixed it.
Note that this approach does NOT require administrative rights; as noted above, FlushFileBuffers requires them only when it is passed a volume handle rather than a handle to an individual file.
From the above site:
"Microsoft C/C++ version 7.0 introduces the "c" mode option for the fopen() function. When an application opens a file and specifies the "c" mode, the run-time library writes the contents of the file buffer to disk when the application calls the fflush() or _flushall() function."
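Putting the pieces above together, a minimal sketch (assuming the Microsoft CRT and that COMMODE.OBJ is linked in; the file name, payload, and helper name are placeholders) might look like this:

    #include <cstdio>

    bool WriteAndCommit(const char* text)
    {
        FILE* f = fopen("data.txt", "wc");   // "c": commit buffer contents on flush
        if (!f)
            return false;

        fputs(text, f);
        _flushall();                         // flush and commit buffered data to disk
        return fclose(f) == 0;
    }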
You can open/create the file with the FileOptions.WriteThrough flag, which causes writes to go directly to the disk, bypassing any intermediate caches.
E.g.
var file = File.Open(
    "1.txt",
    new FileStreamOptions
    {
        Mode = FileMode.Create,
        Access = FileAccess.Write,
        Options = FileOptions.WriteThrough
    });

// - OR -

var file = new FileStream(
    "1.txt",
    FileMode.Create,
    FileAccess.Write,
    FileShare.None,
    4096,
    FileOptions.WriteThrough);
Apparently the Windows file cache flushes data to disk asynchronously, even when using the synchronous WriteFile() API. Quoting "File Caching" on MSDN:
By default, [...] write operations write file data to the system file cache rather than to the disk, and this type of cache is referred to as a write-back cache.
Assuming that write-through and no-buffering flags are not used, what happens if the actual write to disk fails? Can clients be notified of such failures? What is the expected client error handling model for such failures? "Fire and forget" and "Write and pray" come to mind but maybe there is something else.
Secondary question: are there certain classes of errors that are guaranteed to be detected early? E.g., will WriteFile() always return an error if the disk is full, even though the actual write to disk is deferred?
I would like to know how to write reliable file I/O that responds to these kinds of errors without disabling the Windows file cache.
Bonus points: is this handled differently on other operating systems? Can you recommend a good resource on the topic?
In Windows 7, the user is notified via a pop-up dialog from the notification area.
Normal errors (such as the disk being full, or lack of permissions) are reported back to the application immediately; these do not cause late failures.
Late failures can only happen in a handful of situations, such as a hardware failure or operating system crash. They can also happen when writing to a network share if the connection drops unexpectedly for any reason.
In most cases, it doesn't make sense for an application to worry about this. Data loss is to be expected under these circumstances; let the user deal with it.
If the data you are writing is unusually important, then you may need to worry, in which case you will have to use the write-through and/or no-buffering flags.
There is no third option.
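As a rough sketch of the model described above (the handle and helper name are illustrative): the immediate class of errors shows up on the WriteFile() call itself, while only hardware or connection failures can surface later.

    #include <windows.h>
    #include <cstdio>

    // Immediate failures (disk full, permissions, invalid handle) show up as a
    // FALSE return from WriteFile with a specific error code.
    bool WriteChecked(HANDLE h, const void* data, DWORD size)
    {
        DWORD written = 0;
        if (!WriteFile(h, data, size, &written, nullptr) || written != size)
        {
            std::fprintf(stderr, "WriteFile failed: %lu\n", GetLastError());
            return false;
        }
        return true;
    }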
I guess NTFS (the Windows file system) has some cache. Suppose I have a file which is frequently accessed (read-only). How can I check whether this file is in the file system cache? Can I increase the file system cache size?
Check
http://blogs.technet.com/b/askperf/archive/2010/08/13/introduction-to-the-new-sysinternals-tool-rammap.aspx
You can use RamMap, which will give you a dedicated view of how the current system is caching files.
Also note that the cache is not organized per file, but rather per block/page.
There is no direct way from user space to detect if a file has been cached (partially or completely). In a multithreaded/multiprocessing environment, once you have received this information, it is instantly out of date.
There is no "limit" to caching in Windows that can be adjusted (although my data is Windows 7 and prior versions). The cache manager simply uses the memory manager to place data into memory and get callbacks when physical memory needs to be reclaimed (say, by an application's demands). The memory manager trades off file cache against memory demands of processes.
The FileShare enumeration offers various flags such as Read, Write, Delete, etc. Normally I'd think that sharing a file for deletion only allows deletion, but nothing else (like reading).
However, I vaguely recall that Windows only differentiates between read-only and full access to files, so sharing for deletion might actually allow writing to the file as well. Sadly this is from many years back, and I've found neither the original source nor any related info. Is there a reliable spec on the actual behavior? Does it depend on the OS or the file system?
Take a look at the documentation for the CreateFile Function.
FILE_SHARE_DELETE:
Enables subsequent open operations on a file or device to request delete access. Otherwise, other processes cannot open the file or device if they request delete access. If this flag is not specified, but the file or device has been opened for delete access, the function fails. Note: Delete access allows both delete and rename operations.
The documentation doesn't mention that read access is allowed, too.
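As an illustration of the quoted rule, here is a rough Win32 sketch (file name and helper are placeholders): the second open asks for DELETE access, which only succeeds because the first open included FILE_SHARE_DELETE.

    #include <windows.h>

    void DemonstrateShareDelete()
    {
        // First open: read access, and explicitly allow later opens that
        // request delete access.
        HANDLE first = CreateFileW(L"shared.txt",
                                   GENERIC_READ,
                                   FILE_SHARE_READ | FILE_SHARE_DELETE,
                                   nullptr, OPEN_ALWAYS,
                                   FILE_ATTRIBUTE_NORMAL, nullptr);

        // Second open: requests DELETE access (needed for delete or rename).
        // Without FILE_SHARE_DELETE on the first open, this would fail with
        // ERROR_SHARING_VIOLATION.
        HANDLE second = CreateFileW(L"shared.txt",
                                    DELETE,
                                    FILE_SHARE_READ | FILE_SHARE_DELETE,
                                    nullptr, OPEN_EXISTING,
                                    FILE_ATTRIBUTE_NORMAL, nullptr);

        if (second != INVALID_HANDLE_VALUE) CloseHandle(second);
        if (first != INVALID_HANDLE_VALUE) CloseHandle(first);
    }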
Usually, when an application writes to one of its files on disk, the file's modified timestamp changes.
Sometimes, and in my case it is an application written in ProvideX (a Business BASIC derivative, I believe) doing the writing, the modified timestamp does not change after a write. A program like MyTrigger will not pick up on the write operation either, but Sysinternals Process Monitor does log the disk activity.
It seems obvious that there are different ways to ask windows to perform write operations, and the request could then be hooked or logged in various different ways as well.
I need to be able to hook the write operations coming from the ProvideX application. Any pointers on the different ways windows writes to disk, and the type of hooks available for them would be greatly appreciated.
Thanks
A user-mode process can write to a file either using the WriteFile API function or using MMF, the memory-mapped file API (CreateFileMapping/MapViewOfFile/write to the mapped memory block). Maybe your application goes the MMF way. MMF writes to files very differently from the WriteFile API, but both lead to the same end point - an IRP sent to the file system driver. A file system filter driver (such as the one used by the Sysinternals tools) can track write requests at that IRP level. It is technically possible to distinguish between write operations initiated by MMF and by WriteFile, as different IRPs are sent (cached and non-cached writing is involved). It seems that the directory change monitoring function in Windows tracks only one IRP type, and this causes MyTrigger to miss the change.
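For reference, a minimal sketch of the MMF write path mentioned above (the file name, payload, and helper name are placeholders; error handling is abbreviated):

    #include <windows.h>
    #include <cstring>

    // Writes go into a mapped view instead of through WriteFile, so they reach
    // the file system driver via different IRPs.
    bool WriteViaMapping(const char* text, DWORD size)
    {
        HANDLE file = CreateFileW(L"mapped.dat", GENERIC_READ | GENERIC_WRITE,
                                  0, nullptr, CREATE_ALWAYS,
                                  FILE_ATTRIBUTE_NORMAL, nullptr);
        if (file == INVALID_HANDLE_VALUE)
            return false;

        HANDLE mapping = CreateFileMappingW(file, nullptr, PAGE_READWRITE,
                                            0, size, nullptr);
        void* view = mapping ? MapViewOfFile(mapping, FILE_MAP_WRITE, 0, 0, size)
                             : nullptr;
        if (view)
        {
            std::memcpy(view, text, size);   // the actual "write"
            FlushViewOfFile(view, size);     // push dirty pages to the file system
            UnmapViewOfFile(view);
        }
        if (mapping) CloseHandle(mapping);
        CloseHandle(file);
        return view != nullptr;
    }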
My service needs to store a few bits of information (at least 20 bits or so, but I can easily make use of more) such that
it persists across service restarts, even if the service crashed or was otherwise terminated abnormally
it does not persist across a reboot
can be read and updated with very little overhead
If I store this information in the registry or in a file, it will not get automatically emptied when the system reboots.
Now, if I were on a modern POSIX system, I would use shm_open, which would create a shared memory segment which persists across process restarts but not system reboots, and I could use shm_unlink to clean it up if the persistent data somehow got corrupted.
I found MSDN's Creating Named Shared Memory article and started reimplementing pieces of it within my service; this basically uses CreateFileMapping(INVALID_HANDLE_VALUE, ..., PAGE_READWRITE, ..., "Global\\my_service") instead of shm_open("/my_service", O_RDWR, O_CREAT).
However, I have a few concerns, especially centered around the lifetime of this pagefile-backed mapping. I haven't found answers to these questions in the MSDN documentation:
Does the mapping persist across reboots?
If not, does the mapping disappear when all open handles to it are closed?
If not, is there a way to remove or clear the mapping? Doesn't need to be while it's in use.
If it does persist across reboots, or does disappear when unreferenced, or is not able to be reset manually, this method is useless to me.
Can you verify or find faults in these points, and/or recommend a different approach?
If there were a directory that were guaranteed to be cleaned out upon reboot, I could save data in a temporary file there, but it still wouldn't be ideal: under certain system loads, we are encountering file open/write failures (rare, under 0.01% of the time, but still happening), and this functionality is to be used in the logging path. I would like not to introduce any more file operations here.
The shared memory mapping would not persist across reboots and it will disappear when all of its handles are closed. A memory mapping object is a kernel object - they always get deleted when the last reference to them goes away, either explicitly via a CloseHandle or when the process containing the reference exits.
Try creating a registry key with RegCreateKeyEx and REG_OPTION_VOLATILE - the data will not be preserved when the corresponding hive is unloaded. This happens at system shutdown for HKLM, or at user logoff for HKCU.
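A rough sketch of that suggestion (the key path, value name, and helper are placeholders; writing under HKLM assumes the service has the appropriate rights):

    #include <windows.h>

    bool SaveVolatileCounter(DWORD value)
    {
        HKEY key = nullptr;
        LONG rc = RegCreateKeyExW(HKEY_LOCAL_MACHINE,
                                  L"SOFTWARE\\MyService\\Volatile",  // placeholder path
                                  0, nullptr,
                                  REG_OPTION_VOLATILE,               // key is not saved to disk
                                  KEY_READ | KEY_WRITE,
                                  nullptr, &key, nullptr);
        if (rc != ERROR_SUCCESS)
            return false;

        rc = RegSetValueExW(key, L"Counter", 0, REG_DWORD,
                            reinterpret_cast<const BYTE*>(&value), sizeof(value));
        RegCloseKey(key);
        return rc == ERROR_SUCCESS;
    }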
Sounds like maybe you want serialization instead of shared memory? If that is indeed appropriate for your application, the way you serialize will depend on your language. If you're using C++, check out Boost.Serialization. C# undoubtedly has lots of serialization options (like Java), if that's what you're using.