I was wondering how I can make I/O faster by writing and reading temporary files in main memory. I used to write to /dev/shm on Linux.
But now I'm on Windows 7. Does anyone know the answer?
Thanks
If I understood it correctly (based on this post), what you are looking for is memory-mapped files.
You can use CreateFile() with FILE_ATTRIBUTE_TEMPORARY, and Windows should try to keep it in cache as much as possible.
Related
Is there a tmpfs kind of solution where I can write into Ruby's memory, which persists only until that Ruby instance is complete?
File.write('/ruby_tmpfs/path/to/file', 'Some glorious content')
It gets consumed in the same script like this:
read_file_function_i_cannot_change_which_expects_file_path('/ruby_tmpfs/path/to/file')
Is there a tmpfs kind of solution where I can write into Ruby's memory, which persists only until that Ruby instance is complete?
tmpfs is a temporary file storage paradigm implemented in many Unix-like operating systems. It is intended to appear as a mounted file system, but data is stored in volatile memory instead of a persistent storage device.
https://en.wikipedia.org/wiki/Tmpfs
I've never heard of such a feature in Ruby or its stdlib.
Searching for "ruby in-memory file" revealed memfs, which I have never heard of before today, but sounds relevant.
MemFs is an in-memory filesystem ... intended for tests, but you can use it for any other scenario needing an in-memory file system.
Using only the stdlib, Dir.mktmpdir is probably the best alternative. It will use non-volatile storage, but the OS will eventually delete it.
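A short sketch of that stdlib approach; the helper below merely mirrors the path-expecting function from the question and is only a stand-in:

```ruby
require 'tmpdir'

# Hypothetical stand-in for the unmodifiable function from the question.
def read_file_function_i_cannot_change_which_expects_file_path(path)
  File.read(path)
end

content = nil
# The directory (and everything in it) is removed when the block exits.
Dir.mktmpdir('ruby_tmpfs') do |dir|
  path = File.join(dir, 'some_file')
  File.write(path, 'Some glorious content')
  content = read_file_function_i_cannot_change_which_expects_file_path(path)
end
puts content   # prints "Some glorious content"
```

The file does land on disk (or wherever the OS temp directory lives), but its lifetime is tied to the block, which covers the "persists until this Ruby instance is done" requirement.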
Since macOS Mojave, open("/dev/rdisk0", O_RDONLY) fails because of System Integrity Protection.
Now I am trying to build a kernel extension for accessing rdisk0.
How do I implement the equivalent of the user-space C API (open/read)?
Or is there another way to access the raw bytes of /dev/rdisk0?
There has been no way to do raw disk access from user mode since Mojave (or maybe even since High Sierra, I don't remember).
You will need to use the vnode_open and vn_rdwr functions from vnode.h.
There are some comments in that header; they should help you choose the correct parameters.
I've been going through the WinAPI documentation for a while, but I can't seem to find an answer. What I'm trying to achieve is to give a program a file name that it can open and work with as if it were a normal file on disk, but I want the object to live in memory.
I tried using named pipes, and they work in some situations, but not always. I create a named pipe and pass it to the child process as a regular file. When the process exits, I collect the data from the pipe.
program.exe \\.\pipe\input_pipe
I faced some limitations, though. One is that pipes are not seekable. The second is that they must be opened with exactly the right permissions. And the third one I found is that you cannot pre-fill a duplex pipe with data before it has been opened on the other end. Is there any way to overcome these limitations of named pipes?
Or maybe there is some other kind of object that could be opened with CreateFile and then accessed with ReadFile and WriteFile. So far the only solution I see is to create a file system driver and implement all the functionality myself.
Just to make it clear I wanted to point out that I cannot change the child program I'm running. The main idea is to give that program something that it would think is a normal file.
UPDATE: I'm not looking for a solution that involves installation of any external software.
Memory-mapped files would allow you to do what you want.
EDIT:
On rereading the question: since the receiving program already uses CreateFile/ReadFile/WriteFile and cannot be modified, this will not work. I cannot think of a way to do what the OP wants outside of a third-party or self-written RAM disk solution.
The simplest solution might be, as you seem to suggest, using a RAM disk to make a virtual drive mapped to memory. Then, obviously, any files you write to or read from that virtual drive will be completely contained in RAM (assuming it doesn't get paged to disk).
I've done that a few times myself to speed up a process that was entirely disk-bound.
Call CreateFile but with FILE_ATTRIBUTE_TEMPORARY and probably FILE_FLAG_DELETE_ON_CLOSE as well.
The file will then never hit the disk unless the system is low on physical memory.
I'm currently using the FindFirstFile and FindNextFile APIs to recursively iterate through directories, searching for files that match given criteria. I noticed that "dir /s" gives better performance than my program. I tried checking the events in Process Monitor, and it looks like cmd.exe/dir is querying the disk device driver directly. Is there any way I can achieve something similar with DeviceIoControl()? I'm very new to device drivers, though not new to programming. Attaching the procmon output for reference:
Regards,
Use FindFirstFile and FindNextFile. That's the API; using DeviceIoControl directly is either a mess or not possible (I don't know exactly).
Have you tried FindFirstFileEx with its FIND_FIRST_EX_LARGE_FETCH flag and the FindExInfoBasic info level?
You can call ZwQueryDirectoryFile directly. Going further down to the driver level would require sending a bunch of IRPs and would probably be overkill.
"dir /s" is using FindFirst/Next. It doesn't do any special magic to enumerate the files.
QueryDirectory appears to be how Procmon exposes what FindFirst/Next does to get its data from the file system.
http://ntfs-search.sourceforge.net/
It works well, and it's faster: it opens the volume and parses the file system structures directly.
But it only works on NTFS.
Profile your app; your bottleneck is likely to be elsewhere. Some of these options are like taking out a shotgun to shoot a fly...
-scott
This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
How to invalidate the file system cache?
I'm writing a disk intensive win32 program. The first time it runs, it runs a lot slower while it scans the user's folders using FindFirstFile()/FindNextFile().
How can I repeat this first time performance without rebooting? Is there any way to force the system to discard everything in its disk cache?
I know that if I were reading a single file, I can disable caching by passing the FILE_FLAG_NO_BUFFERING flag to a call to CreateFile(). But it doesn't seem possible to do this when searching for files.
Have you thought about doing it on a different volume, and dismounting / remounting the volume? That will cause the vast majority of everything to be re-read from disk (though the cache down there won't care).
You need to create enough memory pressure to cause the memory manager and cache manager to discard the previously cached results. For the cache manager, you could try to open a large (i.e., bigger than physical RAM) file with caching enabled and then read it backwards (to avoid any sequential I/O optimizations). The interactions between the VM and cache manager are a little more complex and much more dependent on OS version.
There are also caches on the controller (possibly, but unlikely) and on the disk drive itself (likely). There are specific IoCtls to flush this cache, but in my experience, disk firmware is untested in this arena.
Check out the Clear function of CacheSet by Sysinternals.
You could avoid a physical reboot by using a virtual machine.
I tried all the methods in the answers, including CacheSet, but they would not work for FindFirstFile()/FindNextFile(). Here is what worked:
Scanning files over the network. When scanning a shared drive, it seems that Windows does not cache the folders, so it is slow every time.
The simplest way to make any algorithm slower is to insert calls to Sleep(). This can reveal lots of problems in multi-threaded code, and that is what I was really trying to do.