Is there a Windows equivalent of `openat`? - winapi

The title pretty much says it: on Windows, can I somehow simulate multiple working directories in a multithreaded application by using something similar to openat()?

Yes, there is, and it's called NtCreateFile() (https://msdn.microsoft.com/en-us/library/bb432380(v=vs.85).aspx) :)
openat() takes an open fd to a base directory from which path operations start. Similarly, you can supply a HANDLE as the ObjectAttributes.RootDirectory of NtCreateFile(), and relative paths are then resolved against whatever that directory's current path is.
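For illustration, here is a minimal sketch of an openat()-style helper built on that (the helper name and flag choices are mine, not a standard API; it assumes the declarations in winternl.h and linking against ntdll.lib):

    #include <windows.h>
    #include <winternl.h>

    #pragma comment(lib, "ntdll.lib")

    #ifndef NT_SUCCESS
    #define NT_SUCCESS(s) (((NTSTATUS)(s)) >= 0)
    #endif

    /* CreateDisposition/CreateOptions values from the DDK headers, defined
       here in case this SDK's winternl.h doesn't provide them. */
    #ifndef FILE_OPEN
    #define FILE_OPEN                    0x00000001
    #endif
    #ifndef FILE_SYNCHRONOUS_IO_NONALERT
    #define FILE_SYNCHRONOUS_IO_NONALERT 0x00000020
    #endif
    #ifndef FILE_NON_DIRECTORY_FILE
    #define FILE_NON_DIRECTORY_FILE      0x00000040
    #endif

    /* Open relativePath relative to the directory handle baseDir,
       roughly like openat(baseDir, relativePath, O_RDONLY).
       The relative path must not begin with a backslash. */
    HANDLE OpenAtWin(HANDLE baseDir, PCWSTR relativePath)
    {
        UNICODE_STRING name;
        RtlInitUnicodeString(&name, relativePath);   /* e.g. L"sub\\file.txt" */

        OBJECT_ATTRIBUTES attrs;
        attrs.Length                   = sizeof(attrs);
        attrs.RootDirectory            = baseDir;    /* the openat() dirfd equivalent */
        attrs.ObjectName               = &name;
        attrs.Attributes               = 0x40;       /* OBJ_CASE_INSENSITIVE */
        attrs.SecurityDescriptor      = NULL;
        attrs.SecurityQualityOfService = NULL;

        IO_STATUS_BLOCK iosb;
        HANDLE file = NULL;
        NTSTATUS status = NtCreateFile(&file, FILE_GENERIC_READ, &attrs, &iosb,
                                       NULL,                   /* AllocationSize */
                                       FILE_ATTRIBUTE_NORMAL,
                                       FILE_SHARE_READ,
                                       FILE_OPEN,              /* open existing only */
                                       FILE_SYNCHRONOUS_IO_NONALERT |
                                           FILE_NON_DIRECTORY_FILE,
                                       NULL, 0);
        return NT_SUCCESS(status) ? file : INVALID_HANDLE_VALUE;
    }

The base directory HANDLE itself can be opened with CreateFileW() and FILE_FLAG_BACKUP_SEMANTICS, as shown in the next snippet.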
If going direct to the NT kernel API is too much for you and you want to remain within Win32, you can retrieve the current path of the base directory HANDLE easily enough (see https://msdn.microsoft.com/en-us/library/windows/desktop/aa366789(v=vs.85).aspx). If you open the directory without FILE_SHARE_DELETE in the share mode, nobody can rename or delete it, so the retrieved path cannot change for as long as you keep the HANDLE open.
You then stitch together the retrieved path with the relative path using normal string concatenation.
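A sketch of that Win32-only fallback, assuming GetFinalPathNameByHandleW (Vista and later) as the path-retrieval function; the paths are hypothetical:

    #include <windows.h>
    #include <stdio.h>

    int wmain(void)
    {
        /* Open the base directory. Omitting FILE_SHARE_DELETE from the share
           mode stops anyone renaming or deleting it while we hold the HANDLE. */
        HANDLE dir = CreateFileW(L"C:\\some\\base\\dir",
                                 FILE_READ_ATTRIBUTES,
                                 FILE_SHARE_READ | FILE_SHARE_WRITE,  /* no FILE_SHARE_DELETE */
                                 NULL, OPEN_EXISTING,
                                 FILE_FLAG_BACKUP_SEMANTICS,  /* required to open a directory */
                                 NULL);
        if (dir == INVALID_HANDLE_VALUE) return 1;

        /* Recover the directory's current path from the handle
           (the result carries a \\?\ prefix with the default flags). */
        WCHAR base[MAX_PATH];
        DWORD n = GetFinalPathNameByHandleW(dir, base, MAX_PATH, FILE_NAME_NORMALIZED);
        if (n == 0 || n >= MAX_PATH) { CloseHandle(dir); return 1; }

        /* Stitch base + relative path together with plain concatenation. */
        WCHAR full[2 * MAX_PATH];
        swprintf(full, 2 * MAX_PATH, L"%ls\\%ls", base, L"sub\\file.txt");
        wprintf(L"%ls\n", full);

        CloseHandle(dir);
        return 0;
    }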
The NT kernel API approach is the only solution on Windows that lets the base directory be arbitrarily renamed by third-party processes without causing races in your code; i.e. it's the only true equivalent of POSIX openat(). I agree it's very unfortunate that Win32 doesn't expose this facility. In fact, Win32 doesn't expose atomic renames either, a very useful POSIX facility for avoiding lock files, and one the NT kernel API also provides.

Related

Does Windows have an equivalent to Linux/FreeBSD's `RESOLVE_BENEATH` and `O_RESOLVE_BENEATH`?

I want to open a file, while ensuring that no component of the path goes outside a certain directory.
On Linux, I can accomplish this using openat2 and RESOLVE_BENEATH, and on FreeBSD there is the O_RESOLVE_BENEATH flag with openat.
Is there any way to achieve the same effect on Windows? Note that this needs to be on a per-call basis — globally restricting my process to a given directory is not useful.
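For reference, a sketch of the Linux-side pattern the question describes (openat2 needs kernel 5.6+ and has no wrapper in older glibc, so this goes through syscall(2)):

    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <linux/openat2.h>   /* struct open_how, RESOLVE_BENEATH */
    #include <string.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    /* Open "path" relative to dirfd, failing if resolution (symlinks,
       "..", absolute paths) would escape the tree rooted at dirfd. */
    static int open_beneath(int dirfd, const char *path)
    {
        struct open_how how;
        memset(&how, 0, sizeof how);
        how.flags   = O_RDONLY;
        how.resolve = RESOLVE_BENEATH;
        return (int)syscall(SYS_openat2, dirfd, path, &how, sizeof how);
    }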

Create file or registry key without calling NTDLL.DLL

I know that ntdll is always present in the running process but is there a way (not necessarily supported/stable/guaranteed to work) to create a file/key without ever invoking ntdll functions?
NTDLL is at the bottom of the user-mode hierarchy; some of its functions switch to kernel mode to perform their tasks. If you want to duplicate its code, then I suppose there is nothing stopping you from decompiling NtCreateFile to figure out how it works. Keep in mind that on 32-bit Windows there are three different instructions used to enter kernel mode (depending on the CPU type), the exact transition mechanism and where the transition code lives change between versions, and the system call ids change between versions (and even service packs). Tables of system call ids for the various Windows builds have been published online.
I assume you are doing this to avoid people hooking your calls? Detecting your calls? Either way, I can't recommend that you try to do this. Having to test on a huge set of different Windows versions is unmanageable and your software might break on a simple Windows update at any point.
You could create a custom kernel driver that does the work for you but then you are on the hook for getting all the security correct. At least you would have documented functions to call in the kernel.
Technically, the registry is stored in %WINDIR%\System32\config (or %WINDIR%\SysWOW64\config), except for your own user's registry, which is stored in your profile as %USERPROFILE%\NTUSER.DAT.
And now, the problems...
You don't normally have even read access to this folder, even from an elevated process. You'd need to change (and badly mess up) the permissions simply to read it.
Even for your own registry, you can't open the binary file; you'll get a sharing violation. The same applies to the system/local-machine hives: in fact, you can't open ANY registry file of the current machine/session. You would need to shut down Windows and mount its system drive in another machine/OS to be able to open, and maybe edit, the registry files.
The real registry isn't a simple text file like the .reg exports. It's a database (its internal structure has been documented in various places). Even with full access to the binary files, it won't be fun to add something inside "from scratch", without any software support.
So it's technically possible (after all, Windows does it, right?), but I doubt it can be done in a reasonable amount of time, and I simply can't see any benefit in doing it since, as you said, ntdll is ALWAYS present, loaded and available to be used.
If the purpose is to hack the current machine and/or bypass a lack of privileges, it's a hopeless approach, since you'll need even more privileges to do it, like being able to open the case and extract the system drive, or to boot another operating system on the same machine. And if that's possible, there are already tools to access an offline Windows installation, found on a well-known "Boot CD", so there's still no need to write to the registry without Windows' support.

Creating Virtual Folders and hooking them into the file system

I have a big collection of folders for projects I'm working on. I've been trying to find a better way to sort them all for a long time and I want to write an app that creates groups based on whatever criteria I say, such as "folders from 2011" or "folders containing a x type of file" etc.
This is fairly straightforward, and wouldn't present much of a problem to code with its own UI in WinForms or WPF or something. But I think it would be far better if I could make these folders appear to be part of the filesystem, so other apps (like existing file explorers) can see them.
Is this possible? Would it cause problems I haven't considered? How do I go about doing it if it is possible?
One way I thought of doing it would be to have the app monitor the filesystem and create folder shortcuts every time there's a change, but I'm curious whether it's possible to actually present a fake filesystem to Explorer through a 'gateway' folder.
EDIT: OK, it's obviously possible, since http://www.virtualfolder.net/ can do it, and now that I think of it so can TrueCrypt, although it would be nice if it didn't have to appear as a separate drive. So the question becomes: how do I implement it?
You can create a Shell Namespace Extension that gathers the file information you want and displays it within Windows Explorer any way you wish. You can choose where your extension is located, whether as its own top-level node, a child of another system virtual folder/extension, or as a child of a file system folder.
Writing a SNE is not trivial, but it is a lot easier than writing a lower-level file system driver, and it does not require special driver-oriented compilers. Any compiler that supports developing COM objects will work.
This is accomplished using filesystem drivers or filesystem filter drivers. Filesystem drivers let you create a virtual filesystem and mount it to a drive letter, or to a folder on an NTFS drive (the folder must exist, but its contents are "replaced" by the virtual filesystem's directory tree). Filesystem filter drivers let you introduce virtual files and folders into existing folders without replacing them.
VirtualFolder uses a filesystem driver, as it creates a drive letter.
Both types of drivers are written in C and run in kernel mode. Writing them requires deep knowledge of Windows internals and experience with driver development (filesystem drivers are among the most complicated driver types).
We offer several products related to virtual storage. One of them, Callback File System, is a filesystem driver that calls your user-mode code to perform the actual filesystem functions. Another product, CallbackFilter, is an FS filter driver (which also calls your user-mode code). However, the current version of CallbackFilter doesn't let you introduce virtual files and folders (this is planned for the next release).
There's also the Pismo File Mount product, which uses filter-driver techniques. You can check with them whether what you need can be accomplished.
From what I gather, you are looking for a way to present the results of predefined file queries as though they were located at a specific place in the file system. If that is correct, you may want to look into hard links and junctions. There are limits to what you can do with these file system services, but they are really straightforward to use.
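As an illustration, hard links and directory symbolic links can be created directly from Win32 (true junctions need FSCTL_SET_REPARSE_POINT plumbing or the mklink /J command); all paths below are hypothetical:

    #include <windows.h>

    int wmain(void)
    {
        /* Hard link: a second directory entry for the same file.
           Works only for files, and only within one NTFS volume. */
        if (!CreateHardLinkW(L"C:\\Groups\\2011\\report.txt",      /* new name */
                             L"C:\\Projects\\Alpha\\report.txt",   /* existing file */
                             NULL))
            return (int)GetLastError();

        /* Directory symbolic link: makes an existing folder appear at a
           second location. Needs admin rights (or Developer Mode on
           recent Windows 10 and later). */
        if (!CreateSymbolicLinkW(L"C:\\Groups\\2011\\Alpha",       /* link to create */
                                 L"C:\\Projects\\Alpha",           /* target folder */
                                 SYMBOLIC_LINK_FLAG_DIRECTORY))
            return (int)GetLastError();

        return 0;
    }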

Windows equivalent to Linux namespaces (per-process filesystem mounts)?

Linux has a feature called namespaces, which let you give a different "view" of the filesystem to different processes. In Windows terms, this would be useful for example if you had a legacy program "floyd" that always loaded its configuration from C:\floyd\floyd.ini. If Windows had namespaces, you could write a wrapper script which would create a namespace in which to run floyd, making it so when Alice ran the script, floyd would start up in an environment where C:\floyd existed but actually pointed to C:\Users\Alice\Floyd.
Now you may be thinking, "OK, just use soft or hard links and make C:\floyd an alias for C:\Users\Alice." But with namespaces, Bob can also run the startup script, but his instance of floyd (on the same computer, running at the same time) will see C:\floyd with the contents of, say, C:\Users\Bob\Program Settings\Floyd Config (or any other path we like).
You can do this on Linux with namespaces. Is there something similar or analogous on Windows? It's fine if it requires writing a C program, and it's OK if it only works on recent versions of Windows.
NTFS junctions and symbolic links are really simple cases of reparse points (hard links, by contrast, are plain extra directory entries). Reparse points are typed and can implement more advanced behavior; for instance, they're also used for "offline storage" (transparent migration of files to and from secondary storage). You could therefore implement per-user symbolic links by creating a new reparse point type.
The reparse point type even has an explicit "name surrogate" bit which, if set, indicates that reparse points of that type are some kind of symbolic link.
You can even have multiple reparse points in a path. Files inside your symbolic namespace can therefore still be migrated to secondary storage; you'd just have two reparse points in the path.
I think Virtual Store does this automatically for legacy programs that try to write to non-standard directories, so the legacy program writes to a user- and program-specific directory instead of to C:\floyd.
This sounds like Windows Vista's file system virtualization. For example, it can silently redirect c:\Program Files\Floyd to c:\Users\<username>\AppData\Local\VirtualStore\Program Files\Floyd. However, file system virtualization isn't nearly as configurable as Linux namespaces. From what I can tell, it applies any time a 32-bit interactive process opens for writing a file, folder, or registry key that's only writable by administrators. (So you typically end up with some read-only files under c:\Program Files and some per-user writable files under c:\Users\<username>\AppData\Local\VirtualStore.)
An application virtualization product can probably also do this, although those are often more complicated and more expensive.
You can use hardlinks for that, but only with NTFS. http://en.wikipedia.org/wiki/Hard_link
I don't think Windows has a per-process virtual FS view.
The most relevant thing is probably special environment folders, such as %temp%, %appdata%, %localappdata%. Not that they're equivalent, but they fulfil the same purpose.
You can define your own environment variables and then use '%myspecialplace%\myfile.txt' to access them.
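A minimal sketch of that idea (the variable name and paths are made up for the example):

    #include <windows.h>
    #include <stdio.h>

    int wmain(void)
    {
        /* Point a custom variable at a per-user location... */
        SetEnvironmentVariableW(L"MYSPECIALPLACE", L"C:\\Users\\Alice\\FloydData");

        /* ...then expand it wherever a path is needed. */
        WCHAR path[MAX_PATH];
        ExpandEnvironmentStringsW(L"%MYSPECIALPLACE%\\myfile.txt", path, MAX_PATH);
        wprintf(L"%ls\n", path);   /* C:\Users\Alice\FloydData\myfile.txt */
        return 0;
    }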
As a horrid kludge (and here I book my passage to programming hell) could you use a NamedPipe for C:\Floyd that mapped the IO operations onto a file specific to the current process user?
I know it's not pretty and I don't know enough about NamedPipes (FIFOs in other dialects) on Windows to know how feasible this is.
Dan
There are several things that come to my mind.
First of all, you can create a file system filter driver (or use a ready-made driver, such as our CallbackFilter product) that would redirect all file system calls coming from the application to another location. This is close to the virtualization you mentioned, though it won't change the list of drive letters. Such an approach is both powerful and non-trivial, so look at the other option first.
And the other option is:
there exist several products (Thinstall, and MoleBox if memory serves) that "wrap" the application, redirecting its file I/O to some other location. There was also an SDK for doing the same, but I don't remember its name at all.
e.g. http://www.msigeek.com/4819/file-re-direction-using-correctfilepaths-shim-to-fix-broken-applications
But I think it's not configurable per user, although the target can vary per user through environment-variable substitution.
Most programs, though, store configuration in the registry, in which case RegOverridePredefKey would do the trick.
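A sketch of the RegOverridePredefKey approach (the key path is hypothetical); the override applies only to the calling process:

    #include <windows.h>

    int wmain(void)
    {
        /* Create a per-user key to stand in for HKEY_LOCAL_MACHINE. */
        HKEY redirect;
        if (RegCreateKeyExW(HKEY_CURRENT_USER,
                            L"Software\\MyWrapper\\RedirectedHKLM",  /* hypothetical */
                            0, NULL, 0, KEY_ALL_ACCESS, NULL,
                            &redirect, NULL) != ERROR_SUCCESS)
            return 1;

        /* From now on, this process's accesses to HKEY_LOCAL_MACHINE
           are served from the key created above. */
        RegOverridePredefKey(HKEY_LOCAL_MACHINE, redirect);

        /* ... load/run the legacy component here ... */

        RegOverridePredefKey(HKEY_LOCAL_MACHINE, NULL);  /* restore the real HKLM */
        RegCloseKey(redirect);
        return 0;
    }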
There's a shortage of good solutions for this. For simplicity, I can't improve on using NTFS soft links (junctions), though as you correctly point out, this creates issues if you want per-user configuration. As MSalters says, NTFS soft links are just a special case of reparse points, so you could do something more general by implementing a new reparse type, if you don't mind some work digging into NTFS.
(Junction is a pretty useful tool when experimenting with NTFS soft links: http://technet.microsoft.com/en-us/sysinternals/bb896768.aspx )
You could just take a direct approach - give each user (or your program initialization if you only care about one piece of software) a logon script that sets up the appropriate junction into their user directory (and make sure you clean it up afterwards). But it's clumsy.
In general the right Windows approach is to put things into the folders pointed at by the %localappdata% (from Vista on) and, more generally, %userprofile% system variables.
Windows file system virtualization is intended to ensure this in the cases where it applies.

Locking Executing Files: Windows does, Linux doesn't. Why?

I noticed when a file is executed on Windows (.exe or .dll), it is locked and cannot be deleted, moved or modified.
Linux, on the other hand, does not lock executing files and you can delete, move, or modify them.
Why does Windows lock when Linux does not? Is there an advantage to locking?
Linux has a reference-count mechanism, so you can delete the file while it is executing, and it will continue to exist as long as some process (which previously opened it) has an open handle to it. The directory entry for the file is removed when you delete it, so it cannot be opened any more, but processes already using the file can still use it. Once all processes using the file terminate, it is deleted automatically.
Windows does not have this capability, so it is forced to lock the file until all processes executing from it have finished.
I believe the Linux behavior is preferable. There are probably some deep architectural reasons, but the prime (and simple) reason I find most compelling is that in Windows you sometimes cannot delete a file, have no idea why, and all you know is that some process is keeping it in use. In Linux that never happens.
As far as I know, Linux does lock executables when they're running; however, it locks the inode. This means that you can delete the "file", but the inode remains on the filesystem, untouched, and all you really deleted is a link.
Unix programs use this way of thinking about the filesystem all the time: create a temporary file, open it, delete the name. Your file still exists, but the name is freed up for others to use and no one else can see it.
Linux does lock the files. If you try to overwrite a file that's executing you will get ETXTBSY (Text file busy). You can, however, remove the file, and the kernel will delete it when the last reference to it is dropped. (If the machine wasn't cleanly shut down, these files are the cause of the "Deleted inode had zero d-time" messages when the filesystem is checked: they weren't fully deleted, because a running process had a reference to them, and now they are.)
This has some major advantages: you can upgrade a running process by deleting the executable, replacing it, then restarting the process. Even init can be upgraded like this: replace the executable and send it a signal, and it'll re-exec() itself, without requiring a reboot. (This is normally done automatically by your package management system as part of its upgrade.)
Under Windows, replacing a file that's in use appears to be a major hassle, generally requiring a reboot to make sure no processes are still running it.
There can be some problems, though. For example, if you have an extremely large logfile and you remove it, but forget to tell the logging process to reopen the file, it'll hold the reference, and you'll wonder why your disk didn't suddenly get a lot more free space.
You can also use this trick under Linux for temporary files: open the file, delete it, then continue to use it. When your process exits (for whatever reason, even a power failure), the file will be deleted.
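A sketch of that trick:

    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    int main(void)
    {
        /* Create a uniquely named scratch file... */
        char path[] = "/tmp/scratch.XXXXXX";
        int fd = mkstemp(path);
        if (fd < 0) { perror("mkstemp"); return 1; }

        /* ...and immediately delete its name. The data stays reachable
           through fd; the kernel frees it when the last descriptor goes,
           even if the process dies or the machine loses power. */
        unlink(path);

        write(fd, "still usable\n", 13);
        close(fd);   /* storage reclaimed here */
        return 0;
    }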
Programs like lsof and fuser (or just poking around in /proc/<pid>/fd) can show you which processes have open files that no longer have a name.
I think Linux/Unix doesn't use the same locking mechanics because it was built from the ground up as a multi-user system, which has to expect multiple users using the same file, maybe even for different purposes.
Is there an advantage to locking? Well, it could possibly reduce the number of pointers the OS has to manage, but nowadays the savings are pretty negligible. The biggest advantage I can think of is that you avoid some user-visible ambiguity: if user A is running a binary file and user B deletes it, the actual file has to stick around until user A's process completes; yet if user B or any other user looks for it on the file system, they won't find it, even though it continues to take up space. Not really a huge concern to me.
I think it's largely a question of backwards compatibility with Windows file systems.
I think you're being too absolute about Windows. Normally, it doesn't allocate swap space for the code part of an executable. Instead, it keeps a lock on the executable & DLLs; if discarded code pages are needed again, they're simply reloaded from the file. But with /SWAPRUN, those pages are kept in swap. This is used for executables on CDs or network drives, and in that case Windows doesn't need to lock the files.
For .NET, look at Shadow Copy.
Whether executed code in a file should be locked is a design decision, and MS simply decided to lock, because that has clear advantages in practice: you don't need to know which code, in which version, is used by which application. This is a major problem with the default Linux behavior, and one most people simply ignore. If system-wide libs are replaced, you can't easily know which apps use the code of those libs; at best, the package manager knows some users of those libs and restarts them. But that only works for general and well-known things like, say, Postgres and its libs.

The more interesting scenario is when you develop your own application against some third-party libs and those get replaced, because most of the time the package manager simply doesn't know about your app. And this isn't only a problem for native C code; it can happen with almost anything. Just use httpd with mod_perl and some Perl libs installed by a package manager, and let the package manager update those Perl libs for whatever reason: it won't restart your httpd, simply because it doesn't know the dependency. There are plenty of examples like this one, because any file can potentially contain code that's in use in memory by some runtime; think of Java, Python and the like.
So there's a good reason for the opinion that locking files by default may be a good choice. You don't need to agree with those reasons, though.
So what did MS do? They created an API that gives the calling application the chance to decide whether files should be locked, but the default effect of that API is to give the first calling application an exclusive lock. Have a look at CreateFile and its dwShareMode argument. That is why you might not be able to delete files in use by some application: it simply didn't care about your use case, used the default values, and therefore got an exclusive lock from Windows on the file.
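A sketch of both ends of that dwShareMode decision (the path is hypothetical):

    #include <windows.h>

    int wmain(void)
    {
        /* Cooperative open: the share mode explicitly permits other
           processes to read, write, and even delete or rename the file
           while we hold the handle. */
        HANDLE h = CreateFileW(L"C:\\data\\log.txt", GENERIC_READ,
                               FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
                               NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
        if (h != INVALID_HANDLE_VALUE) CloseHandle(h);

        /* Exclusive open: a dwShareMode of 0 (the exclusive default the
           answer above describes) makes every other open attempt fail
           with ERROR_SHARING_VIOLATION until this handle is closed. */
        h = CreateFileW(L"C:\\data\\log.txt", GENERIC_READ,
                        0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
        if (h != INVALID_HANDLE_VALUE) CloseHandle(h);
        return 0;
    }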
Please don't believe people telling you that Windows doesn't use ref counting on HANDLEs or doesn't support hard links; that is completely wrong. Almost every API using HANDLEs documents its behavior regarding ref counting, and you can easily read in almost any article about NTFS that it does indeed support hard links and always did. Since Windows Vista it supports symlinks as well, and the support for hard links has been improved with APIs to read all the hard links of a given file, etc.
Additionally, you may simply want to compare the structures used to describe a file in e.g. ext4 with those of NTFS; they have a lot in common. Both work with the concept of extents, which separates data from attributes like the file name, and an inode is pretty much just an older name for a similar concept. Wikipedia even lists both file systems in its article on extents.
There's really a lot of FUD on the net around file locking in Windows compared to other OSs, just as there is about defragmentation. Some of this FUD can be ruled out simply by reading a bit of Wikipedia.
NT variants have the openfiles command, which will show which processes have handles on which files. It does, however, require enabling the system global flag 'maintain objects list'. Running openfiles /local /? tells you how to do this, and also warns that a performance penalty is incurred by doing so.
Executables are progressively mapped into memory when run, meaning that portions of the executable are loaded on demand. If the file could be changed or removed before all sections were mapped in, that could cause major instability.
