I need to read the contents of several thousands of small files at startup. On linux, just using fopen and reading is very fast. On Windows, this happens very slowly.
I have switched to using Overlapped I/O (Asynchronous I/O) using ReadFileEx, where Windows does a callback when data is ready to read.
However, the actual thousands of calls to CreateFile itself are still a bottleneck. Note that I supply my own buffers, turn on the NO_BUFFERING flag, give the SERIAL hint, etc. However, the calls to CreateFile take several 10s of seconds, whereas on linux everything is done much faster.
Is there anything that can be done to get these files ready for reading more quickly?
The call to CreateFile is:
hFile = CreateFile(szFullFileName,
GENERIC_READ,
FILE_SHARE_READ | FILE_SHARE_WRITE,
NULL,
OPEN_EXISTING,
FILE_ATTRIBUTE_NORMAL | FILE_FLAG_OVERLAPPED | FILE_FLAG_NO_BUFFERING | FILE_FLAG_SEQUENTIAL_SCAN,
NULL);
CreateFile in kernel32.dll has some extra overhead compared to the kernel syscall NtCreateFile in ntdll.dll. This is the real function that CreateFile calls to ask the kernel to open the file. If you need to open a large number of files, NtOpenFile will be more efficient by avoiding the special cases and path translation that Win32 has-- things that wouldn't apply to a bunch of files in a directory anyway.
NTSYSAPI NTSTATUS NTAPI NtOpenFile(OUT HANDLE *FileHandle, IN ACCESS_MASK DesiredAccess, IN OBJECT_ATTRIBUTES *ObjectAttributes, OUT IO_STATUS_BLOCK *IoStatusBlock, IN ULONG ShareAccess, IN ULONG OpenOptions);
HANDLE Handle;
OBJECT_ATTRIBUTES Oa = {0};
UNICODE_STRING Name_U;
IO_STATUS_BLOCK IoSb;
RtlInitUnicodeString(&Name_U, Name);
Oa.Length = sizeof Oa;
Oa.ObjectName = &Name_U;
Oa.Attributes = CaseInsensitive ? OBJ_CASE_INSENSITIVE : 0;
Oa.RootDirectory = ParentDirectoryHandle;
Status = NtOpenFile(&Handle, FILE_READ_DATA, &Oa, &IoSb, FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE, FILE_SEQUENTIAL_ONLY);
Main downside: this API is not supported by Microsoft for use in user mode. That said, the equivalent function is documented for kernel mode use and hasn't changed since the first release of Windows NT in 1993.
NtOpenFile also allows you to open a file relative to an existing directory handle (ParentDirectoryHandle in the example) which should cut down on some of the filesystem overhead in locating the directory.
In the end, NTFS may just be too slow in handling directories with large numbers of files as Carey Gregory said.
Try paging in the MFT efficiently before issuing the Create file. This can be done by issuing FSCTL_ENUM_USN_DATA.
Related
When working with filesystem files on Windows, and specifically with the CreateFile API:
With regard to access sharing, that is having multiple, independent, CreateFile calls to open the same file, possibly with different flags and sharing modes, does it matter in any way whether file access is performed from within the same process or from a different process?
That is, once someone has opened a file with CreateFile(..., FILE_SHARE_READ, ...), noone should be able to open the same file with GENERIC_WRITE access. Does it matter whether different calls originate from within the same process, or from different processes?
My impression so far is that process boundaries do not matter for independent CreateFile calls to the same file. (They do matter for handle inheritance, etc.)
But that docs contain such gems as:
To enable a process to share a file or device while another process
has the file or device open, use a compatible combination of one or
more of the following values. For more information about valid
combinations of this parameter with the dwDesiredAccess parameter, see
Creating and Opening Files.
which doesn't exactly inspire confidence.
No. In general such access restrictions to files don't care if they are done from the same or different process.
The file sharing flags work always the same way.
As #xMRi answered, it does not matter if the Valid second call to CreateFile is in the same process or not.
Creating and Opening Files
illustrates the valid combinations of two calls to CreateFile using
various access modes and sharing modes (dwDesiredAccess, dwShareMode
respectively). It does not matter in which order the CreateFile calls
are made.
An example to illustrate.
#include <Windows.h>
#include <Tchar.h>
int main()
{
//...
HANDLE hFile1 = CreateFile(TEXT("tempfile"),
GENERIC_READ | GENERIC_WRITE,
FILE_SHARE_READ,
NULL,
CREATE_ALWAYS,
0,
NULL);
HANDLE hFile2 = CreateFile(TEXT("tempfile"),
GENERIC_READ,
FILE_SHARE_READ| FILE_SHARE_WRITE,
NULL,
OPEN_ALWAYS,
0,
NULL);
if (hFile1 != INVALID_HANDLE_VALUE && hFile2 != INVALID_HANDLE_VALUE)
{
HANDLE hFile = hFile1;
//HANDLE hFile = hFile2;
FILE_BASIC_INFO fdi{};
fdi.FileAttributes = FILE_ATTRIBUTE_TEMPORARY | FILE_ATTRIBUTE_NORMAL;
BOOL fResult = SetFileInformationByHandle(hFile,
FileBasicInfo,
&fdi,
sizeof(FILE_BASIC_INFO));
if (fResult)
{
// File will be deleted upon CloseHandle.
_tprintf(TEXT("SetFileInformationByHandle marked tempfile for deletion\n"));
// ...
// Now use the file for whatever temp data storage you need,
// it will automatically be deleted upon CloseHandle or
// application termination.
// ...
}
else
{
_tprintf(TEXT("error %lu: SetFileInformationByHandle could not mark tempfile for deletion\n"),
GetLastError());
}
CloseHandle(hFile);
// At this point, the file is closed and deleted by the system.
}
else
{
_tprintf(TEXT("error %lu: could not create tempfile\n"),
GetLastError());
}
//...
}
How do i write to a physical drive in Windows 7?
I am trying to write to a physical disk (e.g. \\.\PhysicalDrive0) in Windows 7.
This question has been asked to death, but has never been answered. It is something that used to work in Windows XP, but Microsoft intentionally broke in Windows Vista. Microsoft provides hints about how to do it, but nobody has even been able to figure it out.
It used to work
In the olden days, writing to a physical disk was allowed (as long as you were an administrator). The method to do it was even documented in a Knowledge Base article:
INFO: Direct Drive Access Under Win32
To open a physical hard drive for direct disk access (raw I/O) in a Win32-based application, use a device name of the form
\\.\PhysicalDriveN
where N is 0, 1, 2, and so forth, representing each of the physical drives in the system.
You can open a physical or logical drive using the CreateFile() application programming interface (API) with these device names provided that you have the appropriate access rights to the drive (that is, you must be an administrator). You must use both the CreateFile() FILE_SHARE_READ and FILE_SHARE_WRITE flags to gain access to the drive.
All that changed in Windows Vista, when addition security restrictions were put in place.
How do you write to a physical disk?
Many people, and many answers, on many stackoverflow questions confuse:
writing to a physical disk (e.g. \\.\PhysicalDrive0), and
writing to a logical volume (e.g. \\.\C$)
Microsoft notes the restrictions placed on both kinds of operations:
Blocking Direct Write Operations to Volumes and Disks
Write operations on a DASD (Direct access storage device) volume handle will succeed if:
the file system is not mounted, or if
The sectors being written to are the boot sectors.
The sectors being written to reside outside file system space.
The file system has been locked implicitly by requesting exclusive write access.
The file system has been locked explicitly by sending down a lock/dismount request.
The write request has been flagged by a kernel-mode driver that indicates that this check should be bypassed. The flag is called SL_FORCE_DIRECT_WRITE and it is in the IrpSp->flags field. This flag is checked by both the file system and storage drivers.
In my case i am asking about writing to a Physical, not a Logical one. Microsoft notes the new set of restrictions on writing to a physical disk handle:
Write operations on a disk handle will succeed if:
The sectors being written to do not fall within a file system.
The sectors being written to fall within a mounted file system that is locked explicitly.
The sectors being written to fall within a file system that is not mounted or the volume has no file system.
My sectors being written do fall within a file system --> fail
My sectors being written do fall within mounted, unlocked, file system --> fail
My sectors being written do fall within a file system that is mounted, and in inside a logical volume that has a file system.
The hints on how to make it work revolve around:
unmounting a file system
locking a file system
But the question is how do you unmount a file system? How do you lock a file system?
What are you doing now?
I am able to read all physical sectors of a disk; that is no problem. The problem is when i want to write to a physical sector of the disk.
The current code i have is, in pseudo-code:
void ZeroSector(Int64 PhysicalSectorNumber)
{
String diskName := '\\.\PhysicalDrive0';
DWORD desiredAccess := GENERIC_READ or GENERIC_WRITE;
//INFO: Direct Drive Access Under Win32
//https://support.microsoft.com/en-us/kb/100027
//says you nedd both
DWORD shareMode := FILE_SHARE_READ or FILE_SHARE_WRITE;
//Open the physical disk
hDisk := CreateFile(diskName, desiredAccess, shareMode,
nil, OPEN_EXISTING, 0, 0);
if hDisk = INVALID_HANDLE_VALUE
RaiseLastWin32Error();
try
{
Int32 bytesPerPhysicalSector := 4096; //Determined elsewhere using IOCTL_STORAGE_QUERY_PROPERTY+STORAGE_ACCESS_ALIGNMENT_DESCRIPTOR
//Setup buffer for what we will be writing
Byte[] buffer = new Byte[bytesPerPhysicalSector];
//Calculate the byte offset of where the sector is
Int64 byteOffset = PhysicalSectorNumber * bytesPerPhysicalSector;
//Seek to that byte offset
SetFilePointer(hDisk, byteOffset.Lo, byteOffset.Hi, FILE_BEGIN);
//Write the buffer
DWORD numberOfBytesWritten;
if (!WriteFile(hDisk, buffer, bytesPerPhysicalSector, out numberOfBytesWritten, nil))
RaiseLastWin32Error();
}
finally
{
CloseHandle(hDisk);
}
}
Surprisingly:
i can open the physical disk for GENERIC_READ + GENERIC_WRITE access
it doesn't fail until the actual WriteFile, which fails with:
ERROR_ACCESS_DENIED
How to do what Microsoft says
Microsoft said that my write would fail, and they were right. They said that i need to explicitly lock the file system:
Write operations on a disk handle will succeed if:
The sectors being written to fall within a mounted file system that is locked explicitly.
Except i don't know how to do that.
I know i probably have to use DeviceIoControl and one of the IOCTLS to "lock" a volume. But that presents three challenges:
figuring out which volume(s) are on the physical disk selected
figuring out which IOCTL to use
figuring out how to unlock the locked volumes
Ignoring those problems, i blindly tried the LockFile API. Just before calling WriteFile:
//Try to lock the physical sector we will be writing
if (!LockFile(DiskHandle, byteOffset.Lo, byteOffset.Hi, bytesPerPhysicalSector, 0)
RaiseLastWin32Error();
That fails with:
ERROR_INVALID_FUNCTION (1)
Check out FSCTL_LOCK_VOLUME, FSCTL_DISMOUNT_VOLUME control codes. I believe you would have to enum all volumes you have on your disk, and then dismount-and-lock them. After lock succeeded, the disk is all yours.
You probably won't be able to do this on a system drive though. I'd also guess that there will be caveats with volumes that contain page files.
after I have done this, I can write to the corresponding \\.\PhysicalDrive3. I could not before:
HANDLE hVol = CreateFileA(
"\\\\.\\X:",
FILE_READ_DATA | FILE_WRITE_DATA,
FILE_SHARE_READ | FILE_SHARE_WRITE,
NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
DWORD unused;
BOOL b = DeviceIoControl(hVol, FSCTL_DISMOUNT_VOLUME, NULL, 0, NULL, 0, &unused, NULL);
if (!b) {
printf("%u", GetLastError());
abort();
}
...
HANDLE h = CreateFileA(
argv[1], // that's my \\physicaldrive3
FILE_READ_DATA | FILE_WRITE_DATA,
FILE_SHARE_READ | FILE_SHARE_WRITE,
NULL,
OPEN_EXISTING,
0,
NULL);
...
bResult = WriteFile(h, buf, cb, &dwWritten, &foo);
if (!bResult) {
// used to fail without messing with vol handle
printf("Failed writing data. Error = %d.\n", GetLastError());
return 0;
}
Supposedly it is possible to actually open and read directories on NTFS volumes. However, my code to try this wasn't working, so I tried google, which found me this.
The key observation there seems to be that you must use FILE_FLAG_BACKUP_SEMANTICS. So, trimming that down, I basically get:
HANDLE hFile = CreateFile(L"C:\\temp", GENERIC_READ, FILE_SHARE_READ,
0, OPEN_EXISTING, FILE_FLAG_BACKUP_SEMANTICS, 0);
DWORD dwFileSize = GetFileSize(hFile, 0);
char* buf = new char[dwFileSize];
DWORD dwBytesRead = 0;
BOOL b = ReadFile(hFile, buf, dwFileSize, &dwBytesRead, 0);
Seems pretty straight-forward. Unfortunately, it doesn't work.
The CreateFile and GetFileSize both work (handle is not INVALID_HANDLE_VALUE, non-zero and plausible file size), but the ReadFile returns FALSE, dwBytesRead is zero, and GetLastError returns 1 ("Incorrect function"). Huh.
While I was typing this question, the 'Similar Questions' prompt showed me this. That business about using AdjustTokenPrivileges made a lot of sense. However, it didn't help. Adding ReadFile (and using c:\temp) to that example gives the same behavior. A closer reading of the CreateFile docs shows that even without the SE_BACKUP_NAME privilege, I should be able to open the file due to admin privileges.
I've tried a number of permutations:
Different ways of specifying the directory name (c:\temp, c:\temp\, \\.\c:\temp, \\?\c:\temp\, etc).
Different directories
Different drives
Different share options (0, FILE_SHARE_READ, FILE_SHARE_READ | FILE_SHARE_WRITE)
Different access permissions (GENERIC_READ, FILE_LIST_DIRECTORY, FILE_LIST_DIRECTORY + FILE_READ_EA + FILE_READ_ATTRIBUTES, FILE_LIST_DIRECTORY + FILE_READ_EA + FILE_READ_ATTRIBUTES + FILE_TRAVERSE)
I can't see any flags that might apply other than FILE_FLAG_BACKUP_SEMANTICS (which I assume is required), but I tried FILE_FLAG_NO_BUFFERING and a 4096 byte aligned buffer. Nope.
I'm (currently) trying 152 permutations, and none of the ReadFiles are working. What am I missing?
Is my original assumption here incorrect? Is it not really possible to 'read' from a directory? Or is there just some trick I'm still missing?
What else should I mention?
I'm running as an admin, and can do a CreateFile on the volume.
My program is 64bit, built for unicode.
Windows 7 x64
NTFS 3.1 volume
It's cloudy outside (Hey, you never know what might matter...)
If you want to open a stream then you need to include the stream name and/or type as part of the path:
c:\foo:bar A.K.A. c:\foo:bar:$DATA
c:\foo::$INDEX_ALLOCATION
The default $DATA stream is used if you don't specify a stream. $DATA stores a files "normal data".
If you want the list of files in the directory then you can use GetFileInformationByHandleEx(FileIdBothDirectoryInfo) (and NtQueryDirectoryFile on older systems).
It looks like Jonathan Potter has given the correct answer. Despite prompting, he has elected not to post his comments as an answer. So I'm going to create one based on his responses in order to close the question.
In short: "You can open a handle to a directory to do certain things, but calling ReadFile on it isn't one of them."
What things? These things. This list includes:
BackupRead
BackupSeek
BackupWrite
GetFileInformationByHandle
GetFileSize
GetFileTime
GetFileType
ReadDirectoryChangesW
SetFileTime
In summary: While you can "open" directories and "read" certain information about them, you can't actually use ReadFile. If you want to read the DirName::$INDEX_ALLOCATION information, you'll have to use a different approach.
Starting with the ndisprot sample from Microsoft I try to write a NDIS protocol driver. From User space I try to read and write to the device simultaneous (out of two threads). Since I don't receive any packets, the ReadFile system call blocks. I'm not able to complete a WriteFile system call in this state.
CHAR NdisProtDevice[] = "\\\\.\\\\NDISprot";
CHAR * pNdisProtDevice = &NdisProtDevice[0];
this.iHandle = CreateFile(pNdisProtDevice,
GENERIC_WRITE | GENERIC_READ, 0, 0, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, 0);
// Blocks, because no frames arrive
bSuccess = (BOOLEAN)ReadFile(Handle,
(LPVOID)pReadBuf,
PacketLength,
&BytesRead,
NULL);
...
// Called some seconds later from another thread, while ReadFile still blocking...
bSuccess = (BOOLEAN)WriteFile(Handle,
pWriteBuf,
PacketLength,
&BytesWritten,
NULL);
I added some debug messages and discovered that the driver function associated with IRP_MJ_WRITE (NdisprotWrite) gets not even called! Something between the user space application and the driver blocks concurrent access to the device \Device\NDISprot.
How can I concurrent Read and Write to the file?
By default, you can only have one outstanding I/O request per usermode handle. Either open multiple handles, or open your one handle with FILE_FLAG_OVERLAPPED. (Once you use FILE_FLAG_OVERLAPPED, you also generally need to use OVERLAPPED structures - make sure you've got the gist of it by skimming this and this.)
I'm trying to call IOCTL_BTH_GET_LOCAL_INFO using DeviceIoControl, which I believe it can be done (accordingly to Bluetooth Profile Driver IOCTLs).
I'm on a Windows 7 x64 using Visual Studio 2012 (probably with default configuration).
The handle have a valid value (I removed the validation code) but DeviceIoControl always returns ERROR_INVALID_USER_BUFFER (error 1784).
Here's the code:
int main() {
BTH_LOCAL_RADIO_INFO buffer;
BOOL fStatus;
HANDLE h;
DWORD returned = 0;
h = CreateFile(
TEXT("\\\\.\\BthPan"),
GENERIC_READ | GENERIC_WRITE ,
FILE_SHARE_READ | FILE_SHARE_WRITE,
NULL,
OPEN_EXISTING,
0,
NULL);
fStatus = DeviceIoControl(
h,
IOCTL_BTH_GET_LOCAL_INFO,
NULL, 0,
(LPVOID)&buffer, sizeof(BTH_LOCAL_RADIO_INFO),
&returned,
(LPOVERLAPPED) NULL
);
(...)
After some research I tried the following solutions:
Changing the structure pack alignment to 1/4/8 byte (with VS options);
Using values which are 8-byte aligned (later I've found out that
this was already happening, even with data types smaller than 8 bytes). After a while I've read somewhere that DeviceIoControl deals with misaligment for you, so probably no need to worry about that.
All of the solutions above have failed. What do you think it is? VS have a bunch of configurations for Win32, but that never gave me a problem before (first time with IOCTL though).
I've seen some of that code on 32feet.NET, so probably it's just an error of mine (I can't see any difference).
You're sending IOCTL_BTH_GET_LOCAL_INFO to the wrong device (Bluetooth Personal Area Network instead of Bluetooth Radio).
So I suggest you to use BluetoothFindFirstRadio, BluetoothFindNextRadio and BluetoothFindRadioClose to simply iterate through local Bluetooth radios, rather than to guess the correct DOS Device Names for them.