Improving filesystem access on a remote fileserver - windows

I have a large file server machine which contains several terabytes of image data that I generally access in chunks. I'm wondering if there is anything special that I can do to hint to the OS that a specific set of documents should be preloaded into memory to improve the access time for that subset of files when they are loaded over a file share.
I can supply a parent directory that contains all of the files that comprise a given chunk before I start to access them.
The first thing that comes to mind is to simply write a service that will iterate through the files in the specified path, load them into process memory and then free the memory in hopes that the OS filesystem cache holds on to them, but I was wondering if there is a more explicit way to do this.
It would save a lot of work if I could re-use the existing file share access paradigm rather than requiring the access to these files to go through a memory caching layer.
The files in question will almost always be accessed in a readonly manner.
I'm working on Windows Server 2003/2008

Two approaches come to mind:
1) Set the server to be optimized for file serving. This used to be in the properties for file & printer sharing, but seems to have gone away in Windows 2008. This is set via the registry in:
HKLM\SYSTEM\CurrentControlSet\Control\Session Manager\Memory
Management\LargeSystemCache=1
HKLM\SYSTEM\CurrentControlSet\Services\LanmanServer\Parameters\Size=3
http://technet.microsoft.com/en-us/library/cc784562.aspx as ref.
2) Ensure that both endpoints are either windows 2008/windows 2008, or windows 2008/Vista. There are significant performance improvements in SMB 2.0 as well as the IP stack which improve performance greatly. This may not be an option due to cost, organizational constraints, or procurement lead time, but I thought I'd mention it.
http://technet.microsoft.com/en-us/library/bb726965.aspx as ref.

Related

Load dependent file transfer

I'm attempting to simultaneously copy and write large amounts of data to on a server system using Windows Server 2012.
The writing happens from multiple applications with high data-rates, approaching approximately 30% of the I/O limits of the local system.
The transfer happens between the local storage and a network storage with a transfer rate of up to several gbit.
Because the applications writing the data must not suffer from low memory during their operation, the transfer must be aware of the load on the source system and possibly restrict itself during the transfer process.
Attempts have been made to apply basic Windows-based tools (x/copy, robocopy). x/copy is unsuitable because it does not supply throttling capabilites. Robocopy has proven either cause memory problems by it's throttling method or when not throttling by exceeding the available network/memory limits.
Now here's the fun part: Using the standard windows explorer copy GUI does not exhibit any of these problems. In fact, it copies blazingly fast while showing no noticeable impact at all on the system performance.
The next step would involve creating a custom copy process using basic win api to mimic the desirable explorer copy process behavior.
Q: What basic api copy command is the explorer copy using ... copyFile2? ... a completely unavailable internal copy command? Should I consider other options?
The IFileOperation interface lets you access the copy function Explorer uses.
The documentation for IFileOperation::CopyItem has a full example on how to use it.

Creating Virtual Folders and hooking them into the file system

I have a big collection of folders for projects I'm working on. I've been trying to find a better way to sort them all for a long time and I want to write an app that creates groups based on whatever criteria I say, such as "folders from 2011" or "folders containing a x type of file" etc.
This is fairly straightforward, and wouldn't present much of a problem to code using its own UI in winForms or WPF or something. But I think it would be far better if I could make these folders appear to be part of the filesystem, so other apps (like existing file explorers) can see them.
Is this possible? Would it cause problems I haven't considered? How do I go about doing it if it is possible?
One way I thought of doing it would be to have the app monitor the filesystem and create folder shortcuts every time there's a change, but I'm curious about whether its possible to actually present a fake filesystem to explorer through a 'gateway' folder
EDIT: Ok it's obviously possible since http://www.virtualfolder.net/ can do it, and now that I think of it so can TrueCrypt, although it would be nice if it didn't have to appear as a separate drive. So the question becomes, how do I implement it?
You can create a Shell Namespace Extension that gathers the file information you want and displays it within Windows Explorer any way you wish. You can choose where your extension is located, whether as its own top-level node, a child of another system virtual folder/extension, or as a child of a file system folder.
Writing a SNE is not trivial, but it is a lot easier then writing a lower-level file system driver, and it does not require special driver-oriented compilers. Any compiler that supports developing COM objects will work.
This is accomplished using filesystem drivers or filesystem filter drivers. First let you create a virtual filesystem and mount it to a drive letter and also to a folder on NTFS drive (folder must exist but its contents are "replaced" with a virtual filesystem directory tree). Filesystem filter drivers let you introduce virtual files and folders in existing folders without replacing them.
VirtualFolder uses filesystem driver as it creates a drive letter.
Both types of drivers are written in C and work in kernel-mode. Writing them requires deep knowledge of Windows internals and experience with driver development (since filesystem drivers are one of the most complicated driver types).
We offer several products related to virtual storage. One of them, Callback File System, is a filesystem driver. It calls your user-mode code to perform actual filesystem functions. Another product, CallbackFilter, is an FS filter driver (and it also calls your user-mode code). However, current version of CallbackFilter doesn't let you introduce virtual files and folders (this would be implemented in the next release).
There's also Pismo File Mount product available, they use filter driver techniques. You can check with them if what you need can be accomplished.
From what I gather you are looking for a way to present the results of predefined file queries to appear as though they are located at a specific location in the file system. If that is correct you may want to look into Hard Links and Junctions. There are limits on what you can do with these file system services. However it is really straight forward to implement.

Windows - download a file on-demand, when FileNotFound in file system?

I want to put some sort of "hook" into windows (only has to work on Windows Server 2008 R2 and above) which when I ask for a file on disk and it's not there it then requests it from a web server and caches it locally.
The files are immutable and have unique file names.
The application which is trying to open these files is written in C and just opens a file using the operating system in the normal way. Say it calls OpenFile asking for c:\scripts\1234.12.script, and that is there then it will just open it normally. If then it asks for c:\scripts\1234.13.script and it isn't then my hook in the operating system will then go and ask my web service for the file, download it and then return that file as it it were there all the time.
I'd prefer to write this as a usermode process (I've never written a windows driver), it should only fire when files are not found in a specific folder, and I'd prefer if possible to write it in a managed language (C# would be perfect). The files are small (< 50kB) and the web service is fast and the internet connection blinding so I'm not expecting it to take more than a second to download the file.
My question is - where do I start looking for information about this kind of thing? And if anyone has done anything similar - do you know what options I have (eg can it be done in C#?)?
You would need to create a kernel-mode filesystem filter driver which would intercept requests for opening such files and would "fake" those files. I should say that this is a very complicated task even for driver development. Our CallbackFilter product would be able to solve your problem however mechanism for "faking" files is not yet ready (we plan this feature for CallbackFilter 3). Until then I don't know any user-mode solutions (frankly speaking, no kernel-mode solutions as well) that would solve your problem.
If you can change the folder the application is accessing, then you can create a virtual file system and map it to the drive letter or a folder on NTFS drive. From the virtual file system you can direct most requests to/from real disk and if the file doesn't exist, you can download the file and cache it. Our other product, Callback File System, lets you do what I described in user-mode. If you have a one-time task you need to accomplish, and don't have a budget for it, please contact us anyway and maybe we can find some solution. There also exists an open-source solution with similar (but not so comprehensive) functionality named Dokan, yet I will refrain from commenting on its quality.
You can also try Dokan , it open source and you can check its discussion group for question and guides.

Cheat exclusive access locked files in Windows (7)

I am currently on a mission loading files into pagecache, and I want to load locked files, too. The goal is nothing more than pro-actively keeping a dataset in RAM, reducing loading times within third party applications.
Shadow copies were my first thought on this, but unfortunately seem to have separated pagecaches.
So is there any way cheating around the exclusive lock mechanism? Like fetching file fragment location on disk, accessing whole disk and reading directly (which I fear is another separated pagecache, anyways)?
Or is there a very different approach to directing the pagecache, e.g. some Windows API that can be told to load a specific file into pagecache?
You can access locked files in Windows from kernel-mode driver, or using our RawDisk product. But for your task (speed up DB file access) this won't work right as Windows' filesystem cache size is limited (it won't accommodate GBs of data).
In general, if I were to develop a large software project (for small application the amount of work needed is just enormous) I'd do the following: create a virtual drive backed by in-memory storage, present the DB file to the application via that virtual disk and flush drive contents to the disk on change asynchronously. All of this should be done in kernel mode (this is where development time grows to 12-15 man-months of work).
In theory the same can be done using one of our Virtual Storage products, but going back into user mode for callback handling would eliminate all that you gain from moving the data into RAM.

Rewrite Registry File in Windows

I have been trying to find a way to "defragment" the registry on my Windows machine. Firstly, does this make sense? Any benefits in doing this? (Not much love on superuser.com) Secondly, I am looking for a way to rewrite the registry using C/C++ with Windows API. Is there a way to read the registry and write it to a new file getting rid of unused bytes along the way? (I might have to write the new file and then boot into another OS/disk before I can overwrite the original... but I am willing to take that risk.)
Microsoft's PageDefrag does exactly this, as it states on its page "PageDefrag uses the standard file defragmentation APIs to defragment the files."
(A copy of the linked article is here because in typical MSDN style their link is dead.)
http://www.larshederer.homepage.t-online.de/erunt/ - NTREGOPT NT Registry Optimizer
Similar to Windows 9x/Me, the registry files in an NT-based system
can become fragmented over time, occupying more space on your hard
disk than necessary and decreasing overall performance. You should
use the NTREGOPT utility regularly, but especially after installing
or uninstalling a program, to minimize the size of the registry files
and optimize registry access.
The program works by recreating each registry hive "from scratch",
thus removing any slack space that may be left from previously
modified or deleted keys.
http://reboot.pro/index.php?showtopic=11212 - Offreg.dll MS WDK Offline Registry Library
The offline registry library (Offreg.dll) is used to modify a registry hive outside the active system registry. This library is intended for registry update scenarios such as servicing an operating system image. The library supports registry hive formats starting with Windows XP.
Developer Audience
http://reboot.pro/topic/11312-offline-registry/ - Offline Registry MS WDK Command-Line Tool
A command line tool that will allow one to read and write to an offline registry hive.
Reading the values should be possible.
But I've never seen any spec for how the registry files are written to disk, and unless you could find one you'd have to reverse engineer those files in your OS (might be differences between XP and 7 etc). Then you have to remember that the registry isn't just one file, it's multiple files and some of them belongs to certain users and I think they use SIDs rather than user names so even if you move them to a new computer, you have to be sure it's the same OS version with the same users with the same SIDs set up on it.
All this for little or no gain so I'd agree with the superuser users that it wouldn't make sense.

Resources