Load-dependent file transfer - winapi

I'm attempting to simultaneously copy and write large amounts of data on a server system running Windows Server 2012.
The writing happens from multiple applications at high data rates, approaching roughly 30% of the local system's I/O limits.
The transfer runs between local storage and a network storage with a transfer rate of up to several Gbit/s.
Because the applications writing the data must not suffer from low memory during their operation, the transfer must be aware of the load on the source system and possibly restrict itself during the transfer process.
Attempts have been made with the basic Windows tools (xcopy, robocopy). xcopy is unsuitable because it provides no throttling capabilities. Robocopy has proven to either cause memory problems through its throttling method or, when not throttling, to exceed the available network/memory limits.
Now here's the fun part: using the standard Windows Explorer copy GUI does not exhibit any of these problems. In fact, it copies blazingly fast while showing no noticeable impact on system performance.
The next step would involve creating a custom copy process using the basic Win API to mimic the desirable Explorer copy behavior.
Q: Which basic API copy call is the Explorer copy using ... CopyFile2? ... a completely unavailable internal copy command? Should I consider other options?

The IFileOperation interface lets you access the copy function Explorer uses.
The documentation for IFileOperation::CopyItem has a full example on how to use it.

Related

Will Disk File Have Better Performance When Opened Exclusively?

I notice that many disk storage systems, such as SQLite and IStream (created on a file), get better performance when they are opened exclusively.
For SQLite, it is at "PRAGMA LOCKING_MODE" section in https://blog.devart.com/increasing-sqlite-performance.html
For IStream, based on the documentation for SHCreateStreamOnFileEx at https://learn.microsoft.com/zh-cn/windows/win32/stg/stgm-constants, it says: "In transacted mode, sharing of STGM_SHARE_DENY_WRITE or STGM_SHARE_EXCLUSIVE can significantly improve performance because they do not require snapshots."
Therefore, I wonder whether, in Windows, a general disk file will get better performance if I open it in read mode together with exclusive share mode. In the past, when opening a file for reading, I only set its share mode to deny-write instead of exclusive, even though no other processes or threads would try to read the file at the same time.

Cheating exclusive-access locked files in Windows (7)

I am currently on a mission loading files into pagecache, and I want to load locked files, too. The goal is nothing more than pro-actively keeping a dataset in RAM, reducing loading times within third party applications.
Shadow copies were my first thought on this, but they unfortunately seem to have separate pagecaches.
So is there any way of cheating around the exclusive-lock mechanism? Like fetching the file fragments' locations on disk, accessing the whole disk, and reading them directly (which I fear goes through yet another separate pagecache anyway)?
Or is there a very different approach to directing the pagecache, e.g. some Windows API that can be told to load a specific file into pagecache?
You can access locked files in Windows from a kernel-mode driver, or using our RawDisk product. But for your task (speeding up DB file access) this won't work well, as the size of Windows' filesystem cache is limited (it won't accommodate GBs of data).
In general, if I were to develop this as a large software project (for a small application the amount of work needed is just enormous), I'd do the following: create a virtual drive backed by in-memory storage, present the DB file to the application via that virtual disk, and flush the drive contents to disk asynchronously on change. All of this should be done in kernel mode (this is where development time grows to 12-15 man-months of work).
In theory the same can be done using one of our Virtual Storage products, but going back into user mode for callback handling would eliminate all that you gain from moving the data into RAM.

How does srv.sys decide on raw vs core mode depending on the underlying filesystem?

We are developing a file system for Windows using IFS Kit.
We started to investigate a performance problem which caused our file system I/O to be much slower when shared over the network. After looking at it with FileMon and TCPView from Sysinternals, we found that when an NTFS/FAT volume was shared, the SMB client and server transferred I/O in 60K blocks, while when sharing our file system they used 4K blocks.
These two block sizes correspond to the SMB "raw" (60K) and "core" (4K) modes - this is explained here by Microsoft.
The problem is that we cannot figure out what in our file system causes the Windows share server (srv.sys) to choose core mode (4K) for our file system and raw mode (60K) for NTFS and FAT.
Even hints at what to check are welcome.
The issue was resolved by setting the FO_CACHE_SUPPORTED flag in the file object of our file system.
From the support page: "When you use Windows NT Explorer to copy a file from the client to a remote computer, data is typically transferred in Core mode in 4 KB blocks."
Have you tried this from a command line?

Improving filesystem access on a remote fileserver

I have a large file server machine which contains several terabytes of image data that I generally access in chunks. I'm wondering if there is anything special that I can do to hint to the OS that a specific set of documents should be preloaded into memory to improve the access time for that subset of files when they are loaded over a file share.
I can supply a parent directory that contains all of the files that comprise a given chunk before I start to access them.
The first thing that comes to mind is to simply write a service that iterates through the files in the specified path, loads them into process memory, and then frees the memory in the hope that the OS filesystem cache holds on to them - but I was wondering if there is a more explicit way to do this.
It would save a lot of work if I could re-use the existing file share access paradigm rather than requiring the access to these files to go through a memory caching layer.
The files in question will almost always be accessed in a readonly manner.
I'm working on Windows Server 2003/2008
Two approaches come to mind:
1) Set the server to be optimized for file serving. This used to be in the properties for file & printer sharing, but seems to have gone away in Windows 2008. It is set via the registry:
HKLM\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management\LargeSystemCache = 1
HKLM\SYSTEM\CurrentControlSet\Services\LanmanServer\Parameters\Size = 3
http://technet.microsoft.com/en-us/library/cc784562.aspx as ref.
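The two settings above as an importable .reg fragment (same keys, with the values expressed as DWORDs):

```reg
Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management]
"LargeSystemCache"=dword:00000001

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\LanmanServer\Parameters]
"Size"=dword:00000003
```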
2) Ensure that both endpoints are either Windows 2008/Windows 2008 or Windows 2008/Vista. SMB 2.0 and the reworked IP stack bring significant performance improvements. This may not be an option due to cost, organizational constraints, or procurement lead time, but I thought I'd mention it.
http://technet.microsoft.com/en-us/library/bb726965.aspx as ref.

Is there any way of throttling CPU/Memory of a process?

Problem: I have a developers machine (read: fast, lots of memory), but the user has a users machine (read: slow, not very much memory).
I can simulate a slow network using Fiddler (http://www.fiddler2.com/fiddler2/)
I can look at how CPU is used over time for a process using Process Explorer (http://technet.microsoft.com/en-us/sysinternals/bb896653.aspx).
Is there any way I can restrict the amount of CPU a process can have, or the amount of memory a process can have in order to simulate a users machine more effectively? (In order to isolate performance problems for instance)
I suppose I could use a VM, but I'm looking for something a bit lighter.
I'm using Windows XP, but a solution for any Windows machine would be welcome. Thanks.
The Platform SDK used to come with stress tools for doing just this back in the good old days (STRESS.EXE and CPUSTRESS.EXE), and they might still be there - check your Platform SDK and/or Visual Studio installation for these two files; unfortunately I have neither the PSDK nor VS installed on the machine I'm typing from.
Other tools:
memory: performance & reliability (e.g. handling failed memory allocation): can use EatMem
CPU: performance & reliability (e.g. race conditions): can use CPU Burn, Prime95, etc
handles (GDI, User): reliability (e.g. handling failed GDI resource allocation): ??? may have to write your own, but running out of GDI handles (buggy GTK apps would usually eat them all away until all other apps on the system would start falling dead like flies) is a real test for any Windows app
disk: performance & reliability (e.g. handling disk full): DiskFiller, etc.
AppVerifier has a low-resource simulation feature.
You could also try setting the priority of your process to be very low.
You can run MemAlloc to chew up RAM, possibly a few copies at once.
I found a related question:
Set Windows process (or user) memory limit
The accepted answer for that question links to the Windows API's SetProcessWorkingSetSize, so it's an API call rather than a ready-made tool for limiting the amount of memory that a process can use.
In terms of changing the amount of CPU resources a process can use, if you don't mind the granularity of per-core limiting of resources, Task Manager can change the processor affinity of a process.
In Task Manager, right-click a process and select "Set Affinity...", then select the processor cores that the process can be assigned to.
If the development machine has many cores but the user machine only has one, then, rather than allowing the process to run on all the available cores, set the process' processor affinity to only one core.
It has nothing to do with SetProcessWorkingSetSize. Just use the internal Win32 kernel APIs to restrict CPU usage.
