In Windows, should I use CreateFile or fopen, portability aside? - windows

What are the differences, and in which cases would one or the other prove superior in some way?

First of all, fopen can be used only for simple, portable operations on files.
CreateFile, on the other hand, can be used not only for operations on files but also on directories (with the corresponding options), pipes, and various Windows devices.
CreateFile has a lot of additional useful switches, like FILE_FLAG_NO_BUFFERING, FILE_ATTRIBUTE_TEMPORARY and FILE_FLAG_SEQUENTIAL_SCAN, which can be very useful in different scenarios.
You can use CreateFile with a filename longer than MAX_PATH characters. This can be important for some server applications or for ones that must be able to open any file (a virus scanner or a backup application, for example). It is enabled by using namespace semantics (the \\?\ path prefix), though this mode has its own concerns, like the ability to actually create a file named ".." or L"\xfeff\x20\xd9ab" (good luck trying to delete them later).
You can use CreateFile in different security scenarios, and I mean more than just the use of security attributes. If the current process holds the SE_BACKUP_NAME or SE_RESTORE_NAME privilege (as Administrators typically do) and enables it, it can use CreateFile (with FILE_FLAG_BACKUP_SEMANTICS) to open any file, even one to which the security descriptor grants you no access.
If you only want to read the content of a file, you can use CreateFile, CreateFileMapping and MapViewOfFile to create a file mapping. Then you can work with the file as with a block of memory, which can possibly increase your application's speed.
There are also other uses of the function, which are described in detail in the corresponding MSDN article.
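To make the long-filename point concrete, here is a minimal sketch of how an absolute path could be turned into its extended-length form before being handed to CreateFileW. The helper name to_extended_path is hypothetical, not part of the Windows API, and the sketch assumes the input is already a fully qualified path using backslashes. Paths with this prefix skip the usual normalization, which is also why names like ".." become creatable.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: writes the extended-length form of an absolute
 * path into out. Returns 0 on success, -1 if out is too small.
 *   C:\dir\file       ->  \\?\C:\dir\file
 *   \\server\share\f  ->  \\?\UNC\server\share\f
 * Paths that already start with \\?\ or \\.\ are copied unchanged. */
int to_extended_path(const wchar_t *abs_path, wchar_t *out, size_t out_len)
{
    const wchar_t *prefix;
    const wchar_t *rest;
    size_t need;

    assert(abs_path != NULL && out != NULL);

    if (wcsncmp(abs_path, L"\\\\?\\", 4) == 0 ||
        wcsncmp(abs_path, L"\\\\.\\", 4) == 0) {
        prefix = L"";                    /* already a namespace path */
        rest = abs_path;
    } else if (wcsncmp(abs_path, L"\\\\", 2) == 0) {
        prefix = L"\\\\?\\UNC\\";        /* UNC path: drop the leading \\ */
        rest = abs_path + 2;
    } else {
        prefix = L"\\\\?\\";             /* drive-letter path */
        rest = abs_path;
    }

    need = wcslen(prefix) + wcslen(rest) + 1;   /* +1 for the terminator */
    if (need > out_len)
        return -1;

    wcscpy(out, prefix);
    wcscat(out, rest);
    return 0;
}
```

The resulting string would be passed to the wide-character CreateFileW; fopen has no comparable way to opt in.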
So, to summarize: you have to use fopen only if you have hard portability requirements or need to pass a FILE* to some external library. In all other cases I would recommend CreateFile.
For best results, I would also advise learning the Windows API specifically, as it has many features you can put to good use.
UPDATED: Not directly related to your question, but I also recommend taking a look at the transactional I/O functions, which are supported starting with Windows Vista. With this feature you can commit a group of operations on files, directories, or the registry as one atomic transaction: either all of them take effect or none do. It is a very powerful and interesting tool. If you are not ready to use the transactional I/O functions now, you can start with CreateFile and port your application to transactional I/O later.

That really depends on what type of program you are writing. If it is supposed to be portable, fopen will make your life easier. fopen will call CreateFile "behind the scenes".
Some more advanced options (cache control, file access control, etc) are only available if you are using the Win32 API (they depend on the Win32 file handle, as opposed to the FILE pointer in stdio), so if you are writing a pure Win32 application, you may want to use CreateFile.
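As a rough side-by-side of the two approaches, the sketch below counts the bytes in a file. On Windows it goes through CreateFileA with an explicit share mode and the FILE_FLAG_SEQUENTIAL_SCAN hint; elsewhere it falls back to plain fopen/fread so the same file still compiles. The function name count_bytes and the 4 KB buffer size are my own choices for illustration.

```c
#include <assert.h>
#include <stdio.h>

#ifdef _WIN32
#include <windows.h>
#endif

/* Returns the number of bytes in the file, or -1 on failure. */
long long count_bytes(const char *path)
{
    long long total = 0;
    char buf[4096];

    assert(path != NULL);
#ifdef _WIN32
    /* Win32 route: the share mode and the cache hint are explicit. */
    HANDLE h = CreateFileA(path, GENERIC_READ,
                           FILE_SHARE_READ,            /* others may read too */
                           NULL, OPEN_EXISTING,
                           FILE_FLAG_SEQUENTIAL_SCAN,  /* front-to-back read */
                           NULL);
    if (h == INVALID_HANDLE_VALUE)
        return -1;
    for (;;) {
        DWORD got = 0;
        if (!ReadFile(h, buf, sizeof buf, &got, NULL)) {
            CloseHandle(h);
            return -1;
        }
        if (got == 0)               /* end of file */
            break;
        total += got;
    }
    CloseHandle(h);
#else
    /* Portable route: stdio offers no share mode or cache hint. */
    FILE *fp = fopen(path, "rb");
    size_t got;
    if (fp == NULL)
        return -1;
    while ((got = fread(buf, 1, sizeof buf, fp)) > 0)
        total += (long long)got;
    if (ferror(fp)) {
        fclose(fp);
        return -1;
    }
    fclose(fp);
#endif
    return total;
}
```

The two branches do the same job; the difference is only in how much control the call site has over sharing and caching.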

CreateFile lets you:
Open a file for asynchronous I/O
Pass optimization hints like FILE_FLAG_SEQUENTIAL_SCAN
Set security and inheritance settings without threading issues
The two don't return the same handle type. With the FILE object from fopen you can call other runtime functions such as fputs, and you can also convert it to a "native" file handle.

Whenever possible, prefer object oriented wrappers that support RAII, like fstream or boost file IO objects.
You should, of course, care about the share mode, so fopen() and STL are insufficient.

Related

Is there a way to read a text file in 32 bits assembly in windows?

How do I write a code which copies a text file using assembly and in windows?
My compiler is masm.
Not really sure whether you want to read from the file to memory and do something with that, or simply create a copy. In the first case, use CreateFile, otherwise go with CopyFile. You'll need to link against kernel32.lib (the import library for kernel32.dll) to be able to use these functions.
In Windows, interacting with the OS involves calling API functions rather than making interrupt calls as in Linux.
If you just want to copy the file, call CopyFile. If you want to read the file, do some processing, and then write, you'll need CreateFile, ReadFile, and WriteFile. (You can find documentation for those functions from the CopyFile link above.)
I don't have a link to a good tutorial on calling Windows API functions from assembly language. Searching reveals some information, but nothing that I'd call a good tutorial. You'll have to look for examples and try things.

Implementing a custom file namespace scheme in Windows?

Is it possible to create a new, arbitrary, file namespace scheme in Windows?
As best I understand, Windows currently understands two or three file system or file-system-like namespace schemes:
The namespace scheme we all know and love, eg, C:\path\to\file.
UNC paths, eg, \\server\path\to\file
One, perhaps uncommon scheme - the Windows NT Object Manager, eg, \\.\Device\COM1 - see WinObj on SysInternals, usually accessed by programs by calling CreateFile, though this is not really a file system.
Is it possible to implement a custom namespace scheme that would be universally, automatically used by the rest of the operating system? Perhaps a filter driver or some other specialized kernel-mode driver? I'm out of my league here, but I'm genuinely curious.
I don't have anything concrete, but let's say I wanted to implement a kernel driver that not only understands how to read and write OpenVMS file systems, but also implements some sort of filter driver so that userland programs could use standard Files-11 syntax to access such a filesystem.
For example, an existing program calls OpenFile("[DIR1.DIR2.DIR3]FILE.EXT;10"); and somehow a custom handler deals with it transparently, and lo, notepad can read and write VMS files. More importantly, perhaps, some ported program that expects OpenVMS Files-11 path strings just works. Simply mapping the OpenVMS file system into the regular Windows file system as D:\dir1\dir2\file.ext would be insufficient.
I should clarify that my OpenVMS reference is just an example; I'd be looking for a more generic solution. This could be for OpenVMS Files-11, MVS, standard Unix syntax a la /path/to/thing, or something I just cooked up myself.
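To show the kind of translation such a handler would have to start with, here is a rough user-mode sketch that splits an OpenVMS-style spec like the one above into a backslash-separated relative path and a version number. The function name files11_to_relative and the convention of reporting 0 for "no version given" are my own assumptions, and it only handles the simple [DIR.DIR]NAME.EXT;VER shape, not devices, nodes, or wildcards.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: "[DIR1.DIR2]FILE.EXT;10" -> "DIR1\DIR2\FILE.EXT",
 * with *version set to 10 (or 0 when no ";n" suffix is present).
 * Returns 0 on success, -1 on malformed input or a too-small buffer. */
int files11_to_relative(const char *spec, char *out, size_t outsz, int *version)
{
    const char *close_br, *semi, *p;
    size_t n = 0;

    assert(spec != NULL && out != NULL && version != NULL);
    if (outsz == 0 || spec[0] != '[')
        return -1;
    close_br = strchr(spec, ']');
    if (close_br == NULL)
        return -1;

    /* Directory part: dots between the brackets become backslashes. */
    for (p = spec + 1; p < close_br; ++p) {
        if (n + 1 >= outsz)
            return -1;
        out[n++] = (*p == '.') ? '\\' : *p;
    }
    /* Separator between the directory part and the file name. */
    if (n > 0) {
        if (n + 1 >= outsz)
            return -1;
        out[n++] = '\\';
    }
    /* File name: everything after ']' up to an optional ';'. */
    semi = strchr(close_br + 1, ';');
    for (p = close_br + 1; *p != '\0' && p != semi; ++p) {
        if (n + 1 >= outsz)
            return -1;
        out[n++] = *p;
    }
    out[n] = '\0';

    *version = (semi != NULL) ? atoi(semi + 1) : 0;
    return 0;
}
```

Doing this in a library is the easy part; the question is whether the OS can be made to do it for every program transparently.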
I'm aware of shell-based namespace extensions, and compatibility layers like cygwin, but that's not what I'm looking for.
So SO, what do you think? Is this possible? Where do you start?

Is it possible to write a libPOSIX for Windows (Win32) without requiring a background service or DLL that's always loaded?

I know about Cygwin, and I know of its shortcomings. I also know about the slowness of fork, but not why on Earth it's not possible to work around that. I also know Cygwin requires a DLL. I also understand POSIX defines a whole environment (shell, etc...), that's not really what I care about here.
My question is asking if there is another way to tackle the problem. I see more and more of POSIX functionality being implemented by the MinGW projects, but there's no complete solution providing a full-blown (comparable to Linux/Mac/BSD implementation status) POSIX functionality.
The question really boils down to:
Can the Win32 API (as of MSVC20??) be efficiently used to provide a complete POSIX layer over the Windows API?
Perhaps this will turn out to be a full libc that only taps into the OS library for low-level things like filesystem access, threads, and process control. But I don't know exactly what else POSIX consists of. I doubt a library can turn Win32 into a POSIX-compliant entity.
POSIX <> Win32.
If you're trying to write apps that target POSIX, why are you not using some variant of *N*X? If you prefer to run Windows, you can run Linux/BSD/whatever inside Hyper-V/VMWare/Parallels/VirtualBox on your PC/laptop/etc.
Windows used to have a POSIX compliant environment that ran alongside the Win32 subsystem, but was discontinued after NT4 due to lack of demand. Microsoft bought Interix and released Services For Unix (SFU). While it's still available for download, SFU 3.5 is now deprecated and no longer developed or supported.
As to why fork is so slow, you need to understand that fork isn't just "Create a new process", it's "create a new process (itself an expensive operation) which is a duplicate of the calling process along with all memory".
In *N*X, the forked process is mapped to the same memory pages as the parent (i.e. it is pretty quick) and is only given new pages as and when the forked process tries to modify any shared pages. This is known as copy-on-write. This is largely achievable because in UNIX, there is no hard barrier between the parent and forked processes.
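The effect of fork's duplicate-then-diverge semantics can be seen with a small POSIX-only sketch (it will not build with a plain Win32 toolchain). The child writes to a variable it inherited, and the parent checks that its own copy is untouched. The function name value_after_child_write is hypothetical, and the code only demonstrates the isolation that copy-on-write preserves, not how the kernel implements it.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks, lets the child overwrite its copy of a variable, and returns
 * the value the parent sees afterwards. Returns -1 on any failure. */
int value_after_child_write(void)
{
    volatile int value = 1;
    int status = 0;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: this write lands in the child's private copy only. */
        value = 2;
        _exit(value == 2 ? 0 : 1);
    }

    /* Parent: wait for the child, then look at our own copy. */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;
    return value;
}
```

The parent still sees 1 after the child has stored 2, because the write went to the child's own copy of the page.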
In NT, on the other hand, all processes are separated by a barrier enforced by CPU hardware. In NT, the easiest way to spawn a parallel activity which has access to your process' memory and resources, is to create a thread. Threads run within the memory space of the creating process and have access to all of the process' memory and resources.
You can also share data between processes via various forms of IPC, RPC, Named Pipes, mailslots, memory-mapped files but each technique has its own complexities, performance characteristics, etc. Read this for more details.
Because it tries to mimic UNIX, Cygwin's 'fork' operation creates a new child process (in its own isolated memory space) and has to duplicate every page of memory in the parent process within the newly forked child. This can be a very costly operation.
Again, if you want to write POSIX code, do so in *N*X, not NT.
How about this
Most of the Unix API is implemented by the POSIX.DLL dynamically loaded (shared) library. Programs linked with POSIX.DLL run under the Win32 subsystem instead of the POSIX subsystem, so programs can freely intermix Unix and Win32 library calls.
From http://en.wikipedia.org/wiki/UWIN
The UWIN environment may be what you're looking for, but note that it is hosted at research.att.com; while UWIN is distributed under a liberal license, it is not the GNU license. Also, as it is research for AT&T, and only secondarily something that they are distributing for use, there are a lot of issues with documentation.
For more info, see my write-up in the last answer to Regarding 'for' loop in KornShell
Hmm, the main UWIN link in that post is bad; try
http://www2.research.att.com/sw/download/
Also, You can look at
https://mailman.research.att.com/pipermail/uwin-users/
OR
https://mailman.research.att.com/pipermail/uwin-developers/
To get a sense of the features vs issues.
I hope this helps.
The question really boils down to: Can the Win32 API (as of MSVC20??) be efficiently used to provide a complete POSIX layer over the Windows API?
Short answer: No.
"Complete POSIX" means fork(), mmap(), signal() and such, and these are [almost] impossible to implement on NT.
To drive the point home: GNU Hurd has problems with fork() as well, because Hurd kernel is not POSIX.
NT is not POSIX either.
Another difference is persistence:
In POSIX-compliant systems it is possible to create system objects and leave them there. Examples of such objects are named pipes and shared memory objects (shms). You can create a named pipe or a shm, and leave it in the filesystem (or in a special filesystem-like place) where other processes will be able to access it. The downside is that a process might die and fail to clean up after itself, leaving unused objects behind (you know about zombie processes? same thing).
In NT every object is reference-counted, and is destroyed as soon as its last handle is closed. Files are among the few objects that persist.
Symlinks are a filesystem feature, and don't exactly depend on NT kernel, but current implementation (in Vista and later) is incapable of creating object-type-agnostic symlinks. That is, a symlink is either a file or a directory, and must link to either a file or a directory. If the target has wrong type, the symlink won't work. You can give it the right type if the target exists when you create the symlink, but POSIX requires that symlinks may be created without their target existing. I can't imagine a use-case for a symlink that points first to a file, then to a directory, but POSIX says that this should work, and if it doesn't, you're not completely POSIX-compliant. Or if your symlinking API/utility can be given an option that specifies the right type, when target doesn't exist, that also breaks POSIX compatibility.
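The POSIX behaviour described here can be checked with a short sketch (POSIX-only, and the function name check_type_agnostic_symlink is my own). It creates a symlink whose target does not exist yet, confirms the link itself is there while following it fails, then creates the target as a directory and confirms the same link now resolves to it.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 0 if the system behaves as POSIX describes, -1 otherwise.
 * Both names are relative to the current directory and must not exist. */
int check_type_agnostic_symlink(const char *target, const char *linkpath)
{
    struct stat st;
    int ok = 0;

    /* Creating the link succeeds even though the target is missing. */
    if (symlink(target, linkpath) != 0)
        return -1;

    /* The link itself exists... */
    if (lstat(linkpath, &st) != 0 || !S_ISLNK(st.st_mode))
        ok = -1;
    /* ...but following it fails because nothing is there yet. */
    if (ok == 0 && (stat(linkpath, &st) == 0 || errno != ENOENT))
        ok = -1;

    /* Create the target as a directory; the same link now resolves to it. */
    if (ok == 0 && mkdir(target, 0700) != 0)
        ok = -1;
    if (ok == 0 && (stat(linkpath, &st) != 0 || !S_ISDIR(st.st_mode)))
        ok = -1;

    /* Clean up regardless of the outcome. */
    rmdir(target);
    unlink(linkpath);
    return ok;
}
```

Nothing in the symlink() call says whether the target will be a file or a directory, which is exactly the information the NT implementation wants up front.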
It is possible to replicate some POSIX features to some degree (such as "integer descriptors in a single namespace, referencing any I/O object, and being select()able") without sacrificing [much] performance, but that is still a major undertaking, and the POSIX interface is really restrictive (that is, if you could just add one more argument to that function, it would have been possible to Do The Right Thing...but you couldn't, unless you want to throw POSIX compliance away).
Your best bet is not to rely on POSIX features that are difficult to port to non-POSIX systems, or to abstract them in such a way that lower levels may have separate implementations for different OSes, and upper levels do not care about the details.

Using Named Pipes as Files

Simple question here (though perhaps not such a simple answer):
Is it possible to specify a path for an (existing) named pipe that can be used by programs as if they were opening on a normal file?
According to this MSDN page, named pipes on the local computer can be referenced using the following path syntax: \\.\pipe\PipeName, yet I'm having no luck using this from standard Windows programs.
As a side point, if anyone has any suggestions for interfacing with programs that are only capable of using the file-system in a more efficient manner than physical I/O (e.g. named pipes), I would be glad to take them.
It would only work if the programs are using the Win32 API CreateFile() function to open the files.

Win32 equivalent of opendir

Would anyone know what the Win32 equivalent of opendir is (or if it even exists) ?
Obviously I could use FindFirstFile(Ex) with FindNextFile, but appending * to the path seems like such a hackish way to do it.
FindFirstFile and FindNextFile are the appropriate Win32 APIs. Assuming you're writing C++ code, as a portable alternative you could consider using directory_iterator from the Boost Filesystem library (which is implemented on Windows using FindFirstFile and FindNextFile).
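For comparison, here is a small sketch that counts the entries in a directory both ways: FindFirstFileA/FindNextFileA with the appended \* on Windows, and opendir/readdir elsewhere. The function name count_dir_entries is hypothetical, and the Windows branch assumes the path fits in MAX_PATH once the wildcard is added.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#ifdef _WIN32
#include <windows.h>
#else
#include <dirent.h>
#endif

/* Both APIs report "." and ".."; skip them when counting. */
static int is_dot_entry(const char *name)
{
    return strcmp(name, ".") == 0 || strcmp(name, "..") == 0;
}

/* Returns the number of entries in dir, or -1 if it cannot be listed. */
int count_dir_entries(const char *dir)
{
    int count = 0;

    assert(dir != NULL);
#ifdef _WIN32
    char pattern[MAX_PATH];
    WIN32_FIND_DATAA fd;
    HANDLE h;
    int len = snprintf(pattern, sizeof pattern, "%s\\*", dir);

    if (len < 0 || (size_t)len >= sizeof pattern)
        return -1;
    h = FindFirstFileA(pattern, &fd);   /* opens the search and reads entry 1 */
    if (h == INVALID_HANDLE_VALUE)
        return -1;
    do {
        if (!is_dot_entry(fd.cFileName))
            ++count;
    } while (FindNextFileA(h, &fd));
    FindClose(h);
#else
    DIR *d = opendir(dir);
    struct dirent *e;

    if (d == NULL)
        return -1;
    while ((e = readdir(d)) != NULL) {
        if (!is_dot_entry(e->d_name))
            ++count;
    }
    closedir(d);
#endif
    return count;
}
```

The wildcard is just how the search pattern is expressed; structurally the loop is the same as the opendir/readdir one.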
I believe you can use CreateFile with FILE_FLAG_BACKUP_SEMANTICS and then BackupRead to read directory data, but I'm not sure what format the data is actually in. Also, you would need to be running as a user with the SE_BACKUP_NAME privilege enabled, so this is not really suitable in a general purpose application.

Resources