How do I create a memory-mapped file without a backing file on OSX? - macos

I want to use a library that uses file descriptors as the basic means to access its data. For performance reasons, I don't want to have to commit files to the disk each before I use this library's functions.
I want to create (large) data blobs on the fly, and call into the library to send them to a server. As it stands, I have to write the file to disk, open it, pass the FD to the library, wait for it to finish, then delete the file on disk. Since I can re-create the blobs on demand (and they're not so large that they cause excessive virtual memory paging), saving them to disk buys me nothing, and incurs a large performance penalty.
Is it possible to assign a FD to a block of data that resides only as a memory-mapped entity?

You could mount a memory-backed filesystem: http://lists.apple.com/archives/darwin-kernel/2004/Sep/msg00004.html
Using this mechanism will increase memory pressure on the system, and will probably be paged out if memory pressure is great enough. It might be worthwhile to make it a configuration option, in case the user would rather some other application have first-choice of the memory.
Another option is to use POSIX shared memory segments: http://opengroup.org/onlinepubs/007908799/xsh/shm_open.html (I haven't used POSIX shared memory segments myself; if I understand them correctly, they were designed to solve exactly this problem.)
The shm_open() function creates a memory object and returns a file descriptor. You could then mmap(2) that file descriptor, do your work, and pass the file descriptor to the library.
Don't forget to shm_unlink the object when you're done; POSIX shared memory segments, message queues, and semaphore arrays don't automatically go away when the last process exits.

Related

read/write to a disk without a file system

I would like to know if anybody has any experience writing data directly to disk without a file system - in a similar way that data would be written to a magnetic tape. In particular I would like to know if/how data is written in blocks, and whether a certain blocksize needs to be specified (like it does when writing to tape), and if there is a disk equivalent of a tape file mark, which separates the archives written to a tape.
We are creating a digital archive for over 1 PB of data, and we want redundancy built in to the system in as many levels as possible (by storing multiple copies using different storage media, and storage formats). Our current system works with tapes, and we have a mechanism for storing the block offset of each archive on each tape so we can restore it.
We'd like to extend the system to work with disk volumes without having to change much of the logic. Another advantage of not having a file system is that the solution would be portable across Operating Systems.
Note that the ability to browse the files on disk is not important in this application, since we are considering this for an archival copy of data which is not accessed independently. Also note that we would already have an index of the files stored in the application database, which we also write to the end of the tape/disk when it is almost full.
EDIT 27/05/2020: It seems that accessing the disk device as a raw/character device is what I'm looking for.

Can I create a file in Windows that only exists in memory - and if so, how?

This question is not a duplicate of any of these existing questions:
How can I store an object File that only exists in memory as a file inside of my storage system? - This question is not about Java's File API.
Temp file that exists only in RAM? - This is close to what I'm asking, except the OP isn't asking how to create files from memory for the purposes of sharing passing them to child-processes
I'm not asking about Win32's Memory-mapped File either - as they're essentially the opposite of what I'm after: a memory-mapped file is a file-on disk that's mapped to a process' virtual memory space - whereas what I want is a file that exists in the OS' filesystem (but not the disk's physical filesystem) like a mount-point and that file's data is mapped to an existing buffer in memory.
I.e., with Memory Mapped Files, writing/writing to a byte at a particular buffer address and offset in memory will cause the byte at the same offset from the start of the file to be modified - but the file physically exists on-disk, which isn't what I want.
To elaborate and to provide context:
I have an ASP.NET Core server-side application that receives request streams sized between 1 and 10MB on a regular basis. This program will run only on Windows / Windows Server, so using Windows-specific functionality is fine.
75% of the time my application just reads through these streams by itself and that's it.
But a minority of the time it needs to have a separate applications read the data which it starts using Process.Start and passing the file-name as a command-line argument.
It passes the data to these separate applications by saving the stream to a temporary file on-disk and passing the filename of that stream.
Unfortunately it can't write the content to the child-process's stdin because some of the those programs expect a file on-disk rather than reading from stdin.
Additionally, while the machine it's running on has lots of RAM (so keeping the streams buffered in-memory is fine) it has slow spinning-rust HDDs, which is further reason to avoid temporary files on-disk.
I'd like to avoid unnecessary buffering and copies - ideally I'd like to stream the entire 1-10MB request into a single in-memory buffer, and then expose that same buffer to other processes and use that same buffer as the backing for a temporary file.
If I were on Linux, I could use tmpfs - it isn't perfect:
To my knowledge, an existing process can't instruct the OS to take an existing region of its virtual-memory and map a file in tmpfs to that memory region, instead tmpfs still requires that the file be populated by writing (i.e. copying) all of the data to its file-descriptor - which is counter to the aim of having a zero-copy system.
Windows' built-in RAM-disk functionality is limited to providing the basis for a RAM-disk implementation via a third-party device-driver - I'm surprised that Microsoft never shipped Windows with a built-in RAM-disk GUI or API, especially given their relative simplicity.
The ImDisk program is an implementation of a RAM-disk using Microsoft's RAM-disk driver platform, but as far as I can tell while it's more like tmpfs in that it can create a file that exists only in-memory, it doesn't allow the file's data to be backed by a buffer directly accessible to a running process (or a shared-memory buffer).
CreateFileMapping with hFile = INVALID_HANDLE_VALUE "creates a file mapping object of a specified size that is backed by the system paging file instead of by a file in the file system".
From Raymond Chen's The source of much confusion: “backed by the system paging file”:
In other words, “backed by the system paging file” just means “handled like regular virtual memory.”
If the memory is freed before it ever gets paged out, then it will never get written to the system paging file.

Why are read operations much faster after writing files using OS and disk buffers?

I am writing about a hundred files with a size of 50MB each sequentially to a directory on my disk using CreateFile() and WriteFile(). In a second steps, the contents of those files are read using CreateFile() and ReadFile().
I noticed some partially weird things:
If I pass FILE_FLAG_NO_BUFFERING | FILE_FLAG_WRITE_THROUGH when writing the files, reading takes a noticably long time (usually hundreds of milliseconds). However, when I do not pass those flags (but use FlushFileBuffers() instead), writing appears to happen at roughly the same speed but reading those files after writing them is blazingly fast (less than 20 milliseconds per file!).
How is this possible? How do the flags passed when writing 5000MB of data affect reading later? Does the disk cache the whole 5GB in its cache?
When you pass FILE_FLAG_NO_BUFFERING then you are telling the system not to put the data in its disk cache. Then when you read the data, the system has to get the data from the disk.
When you omit FILE_FLAG_NO_BUFFERING, the system can put the data in its disk cache. And so when you read the data subsequently, it can be read directly from memory, which is faster than disk.
From https://support.microsoft.com/en-us/kb/99794:
The FILE_FLAG_WRITE_THROUGH flag for CreateFile() causes any writes made to that handle to be written directly to the file without being buffered. The data is cached (stored in the disk cache); however, it is still written directly to the file. This method allows a read operation on that data to satisfy the read request from cached data (if it's still there), rather than having to do a file read to get the data. The write call doesn't return until the data is written to the file. This applies to remote writes as well--the network redirector passes the FILE_FLAG_WRITE_THROUGH flag to the server so that the server knows not to satisfy the write request until the data is written to the file.
The FILE_FLAG_NO_BUFFERING takes this concept one step further and eliminates all read-ahead file buffering and disk caching as well, so that all reads are guaranteed to come from the file and not from any system buffer or disk cache.
You might find this article from Raymond Chen of interest: We’re currently using FILE_FLAG_NO_BUFFERING and FILE_FLAG_WRITE_THROUGH, but we would like our WriteFile to go even faster. An excerpt:
A customer said that their program’s I/O pattern is to open a file and
then every so often write about 100KB of data into the file. They are
currently using the FILE_FLAG_NO_BUFFERING and FILE_FLAG_WRITE_THROUGH
flags to open a file, and they wanted to know what else they could do
to make their writes go even faster.
Um, for one thing, you stop passing those two flags!
Those two flags in combination basically mean “Give me the slowest
possible I/O performance!” because they force all I/O to go through to
the physical media right away.

Fastest way to send large blobs of data from one program to another in Windows?

I need to send large blobs of data (~10MB) from one program to another in Windows 7. I would like a method that allows for at least a gigabyte per second total throughput with very low system load. To simplify this, all blobs may be the same size, and one program may be a child process of the other.
Method 1: Memory map the same file in both programs: CreateFileMapping() / MapViewOfFile()
In this case, the memory mapped file(s) presumably contains room for several blobs in a ring buffer. There would need to be some external mechanism to synchronize access to the ring buffer.
Method 2: Create named data sections
Method 3: WriteProcessMemory (suggested by Hristo Iliev below, thanks!)
Method 4: Read/write files on a RAM disk.
Method 5: Read/write to an anonymous pipe.
Method ?: Anything else? Perhaps write over TCP, use MPI, ...
I know that memory-mapped files (method 1) are considered the standard solution to this problem :)
How fast are memory-mapped files? (rough order of magnitude)
Is there an even faster method?
How much worse is the performance of the other methods? Which ones of them can hit GB/sec throughput?
If using memory mapped files, what is the best way for the programs to synchronize access to the data being passed? (ie: how would the producer indicate to the consumer that a new blob is available, and how would the consumer indicate it is done with a particular blob?)
If using memory mapped files, is it better to have one file for all blobs together (ring buffer in a file), or one file for each blob (ring buffer of files)?
You could also use WriteProcessMemory and have the first process to directly post the data into the address space of the second process. You'd need to develop a protocol of some kind. For example, the second process could send the virtual address of its receive buffer to the first process via a named pipe or a shared memory block, then the first process copies the data using WriteProcessMemory and when it is finished, signals the second one via a semaphore or something. This ought to be the fastest way to send data between two processes as it involves a single copy operation. The first process would need to obtain the proper rights on the second one and that should not be a problem as long as both processes belong to the same user.

Memory mapped files optional write possible?

When using memory-mapped files it seems it is either read-only, or write-only. By this I mean you can't:
have one open for writing, and later decide not to save it
have open open for reading, and later decide to save it
Our application uses a writeable memory-mapped file to save data files, but since the user might want to exit without saving changes, we have to use a temporary file which the user actually edits. When the user opts to save the changes, the original file is overwritten with the temporary file so it has the latest changes. This is cumbersome because the files can be very large (>1GB) and it takes a long time to copy them.
I've tried many combinations of the flags used to create the file mapping but none seem to allow the flexibility of saving on demand. Can anyone confirm this is the case? Our application is written in Delphi, but it uses the standard Windows API to create the mapping, in our case
FMapHandle := CreateFileMapping(FFileHandle, nil, PAGE_READWRITE, 0, 2 * 65536, nil);
FBasePointer := MapViewOfFile(FileMapHandle, FILE_MAP_WRITE, FileOffsetHigh,
FileOffsetLow, NumBytes);
I don't think you can. By that I mean you may be able to, but it doesn't make any sense to me :-)
The whole point of a memory-mapped file is that it's a window onto the actual file. If you don't wany changes reflected in the file, you'll probably have to do something like batch up the changes in a data structure (e.g., an array of base address, size and data) and apply them when saving.
In which case, you wouldn't actually need the memory mapped file, just read in and maintain the chunks you want to change (lock the file first if there's a chance of multi-user access).
Update:
Have you thought of the possibility of, when doing a save, deleting the original file and just renaming the temporary file to the original file name? That's likely to be much faster than copying 1G of data from temporary to original. That way, if you don't want it saved, just delete the temporary file and keep the original.
You'll still have to copy the original data to the temporary file when loading but you won't have to copy the temporary data back (whether you save it or not) - that would halve the time taken.
Possible, but non-trivial.
You have to understand memory mapped basics, and the difference between the three modes of memory-mapped files. Both set aside a part of your virtual address space and create a mapping entry in an internal table. No physical RAM is initially allocated. Hence, when you DO try to access the memory, the CPU faults and the OS has to fix up. It does so by copying the file contents to RAM and mapping the RAM to your process, at the faulting address.
Now, the difference between the three modes is how the descriptors are set on the mapped pages. In all cases you get read access on the pages. (The first mode). However, if you ask for write access and subsequently write to it, on your first write the page is marked as writeable and dirty. It can then be written back to the original file, at the discretion of the OS (Second mode). Finally, it's possible to get copy-on-write semantics. You still start out with only read access to the page in memory. When you write to it, the CPU still faults and the OS needs to fix it up. With copy-on-write, that fixup is done by setting the backing store of the changed page to the page file, instead of the original mapped file.
So, in your case you want to use copy-on-write mode. If the user decides to discard the modifications, no problem. You simply discard the memory mapping. All pages that were modified in memory, and were backed by the page file are also discarded.
If the user does decide to save, you've got a slightly harder task. You now need to figure out which parts of the file have changed. Those changes are in memory, and you need to reapply those to the source file. You can do this with Page Guards. So, when the user decides to save, copy all modified pages to a separate memory block, remap the (unchanged) file for write, and apply the changes.

Resources