Mounting a stream encoder as a drive in windows? - windows

For a variety of reasons, revolving around cost of copy and the travails of the Windows filesystem, I need to mount a stream encoder as a drive, so that incoming data can simply be blindly directed at this "drive", aggregated, and encoded, without the source program being any the wiser. This would be basically trivial on Linux, but seems to be an uphill struggle on windows.
Specifically, I'd like to be able to "mount" a tar builder, which I know sounds odd, but there's a compelling reason for doing this. Is there a utility or library that deals with this? Perhaps an obscure part of the Windows API?
This looks promising... but it seems intended to mount folders or similar, rather than "devices." I do have control of where the data is written, so I can specify an arbitrary path.

Having experience with virtual drives (see our Virtual Storage product line) I can say that your task needs some redefining. As said in comments, drives (or, better say, filesystems) in Windows are expected to be namely filesystems (unlike Unix world), and as such they must support certain reading and enumeration operations, which is not something you'd expect.
Probably the closest you can do is a virtual drive in memory whose contents are then passed to your application in some way. The user will drag the data to your drive and on unmounting (or on other command) drive contents are passed to other program.
Several of our products can be used for your task (see CallbackDisk, Callback File System and SolFS OS Edition on Virtual Storage page), yet they are all commercial products. If you have a one-time or short-term task, you can build something for your use with a trial key.
There exist free approaches to your task, namely Pismo File Mount and Dokan, but I don't know how well they fit.

Related

Creating Virtual Folders and hooking them into the file system

I have a big collection of folders for projects I'm working on. I've been trying to find a better way to sort them all for a long time and I want to write an app that creates groups based on whatever criteria I say, such as "folders from 2011" or "folders containing a x type of file" etc.
This is fairly straightforward, and wouldn't present much of a problem to code using its own UI in winForms or WPF or something. But I think it would be far better if I could make these folders appear to be part of the filesystem, so other apps (like existing file explorers) can see them.
Is this possible? Would it cause problems I haven't considered? How do I go about doing it if it is possible?
One way I thought of doing it would be to have the app monitor the filesystem and create folder shortcuts every time there's a change, but I'm curious about whether its possible to actually present a fake filesystem to explorer through a 'gateway' folder
EDIT: Ok it's obviously possible since http://www.virtualfolder.net/ can do it, and now that I think of it so can TrueCrypt, although it would be nice if it didn't have to appear as a separate drive. So the question becomes, how do I implement it?
You can create a Shell Namespace Extension that gathers the file information you want and displays it within Windows Explorer any way you wish. You can choose where your extension is located, whether as its own top-level node, a child of another system virtual folder/extension, or as a child of a file system folder.
Writing a SNE is not trivial, but it is a lot easier then writing a lower-level file system driver, and it does not require special driver-oriented compilers. Any compiler that supports developing COM objects will work.
This is accomplished using filesystem drivers or filesystem filter drivers. First let you create a virtual filesystem and mount it to a drive letter and also to a folder on NTFS drive (folder must exist but its contents are "replaced" with a virtual filesystem directory tree). Filesystem filter drivers let you introduce virtual files and folders in existing folders without replacing them.
VirtualFolder uses filesystem driver as it creates a drive letter.
Both types of drivers are written in C and work in kernel-mode. Writing them requires deep knowledge of Windows internals and experience with driver development (since filesystem drivers are one of the most complicated driver types).
We offer several products related to virtual storage. One of them, Callback File System, is a filesystem driver. It calls your user-mode code to perform actual filesystem functions. Another product, CallbackFilter, is an FS filter driver (and it also calls your user-mode code). However, current version of CallbackFilter doesn't let you introduce virtual files and folders (this would be implemented in the next release).
There's also Pismo File Mount product available, they use filter driver techniques. You can check with them if what you need can be accomplished.
From what I gather you are looking for a way to present the results of predefined file queries to appear as though they are located at a specific location in the file system. If that is correct you may want to look into Hard Links and Junctions. There are limits on what you can do with these file system services. However it is really straight forward to implement.

How can I know whether a particular file on a Windows machine supports Alternate Data Streams?

Using the raw Windows programming API from C/C++ and a file handle or a path to a file, folder, link, etc; how can I programmatically decide whether the file (etc) supports ADS (Alternate Data Streams)?
I assume one thing I have to know is whether the file is on an NTFS partition, but then again for all I know it might be possible to mount some kind of Mac or *nix filesystems which support data forks or alternate data streams of some kind, and all such cases might be covered by a single API call or data structure.
Secondly I'm not sure whether every kind of object that can exist on an NTFS partition can have ADSs - such as folders, symlinks, hardlinks, anything else?
What API etc can handle all cases to tell me whether a given file etc has the ability to have ADSs?
(For this question I'm not looking for whether files have ADSs, just whether its possible for files to have them. It could include a file I've just created for instance.)
ADS is a feature of NTFS. You can use GetVolumeInformation() to detect if a given path is on an NTFS file system, and even if that volume supports ADS at all. AFAIK, only a real file can have an ADS attached to it. You can use GetFileAttributes() to detect if a path is a file, directory, symbolic link, etc.
Like any other file, Directories can also host other ADS! Any file object on NTFS can store more than one DATA Stream. The 'visible' one is named, any additional data stream is 'invisible' as far as Explorer is concerned. Actually, at the prompt now one can display ADS using the /R switch when invoking dir.

How to read disk file entries faster than FindFile API? [duplicate]

I am in the middle of writing a tool that finds lost files of an iTunes library, for both Mac and Windows. On the Mac, I can quickly find files by naming using the wonderful "CatalogSearch" function.
On Windows, however, there seems to be no OS API for searching by file name (or is there?).
After some googling, I learned that there are tools (like TFind, Everything) that read the NTFS directory directly and scan it to find files by name.
I would like to do the same, but without having to start from scratch (although I've written quite a few disk tools in the past, I've never had the energy to dig into NTFS).
I wonder if there are ready-made libs around, possibly as a .dll, that would give me this search feature: Pass in a file name, get back its path.
Alternatively, what about the Windows indexing service? At least when I tried this on a recently installed XP Home system, the Search operation under the Start menu would actually scan all directories, which suggests that it has no complete database. As I'm not a Windows user at all, I wonder why this isn't working.
In the end, the complete solution I need is: I have a list of file names to find, and I need code that searches the entire disk (or uses a DB for it) to get me all results in one go. E.g, the search should not start a new full scan for every file I'm looking up. That's why I think the MFT way would be optimal, as it could quickly iterate over all names, comparing each to my list.
The best way to solve your problem seems to be by using the Windows Change Journal.
Problem: If it is not enabled for a volume or the volume is a non-NTFS you need a fallback (or enable the Change Journal if it is NTFS). You need administrator rights as well to access the Change Journal.
You get the files by using the FSCTL_ENUM_USN_DATA and DeviceIOControll with LowUsn=0. This directly accesses the MFT and writes all filenames into the supplied buffer. Because it sequentially acesses the MFT it is faster than the FindFirstFile API.

Finding a set of file names quickly on NTFS volumes, ideally via its MFT

I am in the middle of writing a tool that finds lost files of an iTunes library, for both Mac and Windows. On the Mac, I can quickly find files by naming using the wonderful "CatalogSearch" function.
On Windows, however, there seems to be no OS API for searching by file name (or is there?).
After some googling, I learned that there are tools (like TFind, Everything) that read the NTFS directory directly and scan it to find files by name.
I would like to do the same, but without having to start from scratch (although I've written quite a few disk tools in the past, I've never had the energy to dig into NTFS).
I wonder if there are ready-made libs around, possibly as a .dll, that would give me this search feature: Pass in a file name, get back its path.
Alternatively, what about the Windows indexing service? At least when I tried this on a recently installed XP Home system, the Search operation under the Start menu would actually scan all directories, which suggests that it has no complete database. As I'm not a Windows user at all, I wonder why this isn't working.
In the end, the complete solution I need is: I have a list of file names to find, and I need code that searches the entire disk (or uses a DB for it) to get me all results in one go. E.g, the search should not start a new full scan for every file I'm looking up. That's why I think the MFT way would be optimal, as it could quickly iterate over all names, comparing each to my list.
The best way to solve your problem seems to be by using the Windows Change Journal.
Problem: If it is not enabled for a volume or the volume is a non-NTFS you need a fallback (or enable the Change Journal if it is NTFS). You need administrator rights as well to access the Change Journal.
You get the files by using the FSCTL_ENUM_USN_DATA and DeviceIOControll with LowUsn=0. This directly accesses the MFT and writes all filenames into the supplied buffer. Because it sequentially acesses the MFT it is faster than the FindFirstFile API.

Windows equivalent to Linux namespaces (per-process filesystem mounts)?

Linux has a feature called namespaces, which let you give a different "view" of the filesystem to different processes. In Windows terms, this would be useful for example if you had a legacy program "floyd" that always loaded its configuration from C:\floyd\floyd.ini. If Windows had namespaces, you could write a wrapper script which would create a namespace in which to run floyd, making it so when Alice ran the script, floyd would start up in an environment where C:\floyd existed but actually pointed to C:\Users\Alice\Floyd.
Now you may be thinking, "OK, just use soft or hard links and make C:\floyd an alias for C:\Users\Alice." But with namespaces, Bob can also run the startup script, but his instance of floyd (on the same computer, running at the same time) will see C:\floyd with the contents of, say, C:\Users\Bob\Program Settings\Floyd Config (or any other path we like).
You can do this on Linux with namespaces. Is there something similar or analogous on Windows? It's fine if it requires writing a C program, and it's OK if it only works on recent versions of Windows.
NTFS hard links are really a simple case of reparse points. Reparse points are typed, and can include more advanced behavior. For instance, they're also used for "offline storage" (transparent migration of files to and from secondary storage). You can therefore also use reparse points to implement per-user symbolic links, by creating a new reparse type.
The reparse point type even has an explicit "Name surrogate" bit, which (if set) indicates that reparse points of those types are some kind of symbolic link.
You can even have multiple reparse points in a path. Therefore files inside your symbolic namepace can still be migrated to secondary storage - you'd just have two reparse points in the path.
I think Virtual Store does this automatically for legacy programs that try to write to nonstandard directories. So the legacy program writes to a user- and program-specific directory instead to C:\floyd.
This sounds like Windows Vista's file system virtualization. For example, it can silently redirect c:\Program Files\Floyd to c:\Users\<username>\AppData\Local\VirtualStore\Program Files\Floyd. However, file system virtualization isn't nearly as configurable as Linux namespaces. From what I can tell from reading, file system virtualization should apply any time a 32-bit interactive process opens for writing a file, folder, or registry key that's only writable by administrators. (So you typically end up with some read-only files under c:\Program Files and some per-user writable files under c:\Users\<username>\AppData\Local\VirtalStore.)
An application virtualization product can probably also do this, although those are often more complicated and more expensive.
You can use hardlinks for that, but only with NTFS. http://en.wikipedia.org/wiki/Hard_link
I think windows doesn't have virtual FS view per process.
The most relevant thing is probably special environment folders, such as %temp%, %appdata%, %localappdata%. Not that they're equivalent, but they fulfil the same purpose.
You can define your own environment variables then use '%myspecialplace%\myfile.txt' to access them.
As a horrid kludge (and here I book my passage to programming hell) could you use a NamedPipe for C:\Floyd that mapped the IO operations onto a file specific to the current process user?
I know it's not pretty and I don't know enough about NamedPipes (FIFOs in other dialects) on Windows to know how feasible this is.
Dan
There are several things that come to my mind.
First of all you can create a file system filter driver (or use a ready driver, such as our CallbackFilter product) that would redirect all file system calls, coming from the application, to other location. This is close to virtualization that you mentioned but this won't change the list of drive letters though. Such approach is both powerful and non-trivial, so see the other option first of all.
And the other option is:
there exist several products (Thinstall, Molebox if memory serves) that "wrap" the application redirecting it's file I/O to some other location. There was also some SDK to do the same, but I don't remember it's name at all.
e.g. http://www.msigeek.com/4819/file-re-direction-using-correctfilepaths-shim-to-fix-broken-applications
But I think it's not configurable per user, although the target can vary per-user based on environment variable substitution.
Most programs, though, store configuration in the registry, in which case RegOverridePredefKey would do the trick.
There's a shortage of good solutions for this. For simplicity, I can't improve on using NTFS soft links (junctions) for this - as you correctly point out, this creates issues if you want per-user configuration. As MSalters correctly says, all NTFS soft and hard links are just special cases of reparse points, so you could do something more general by impplementing a new reparse type, if you don't mind some work digging into NTFS..
(Junction is a pretty useful tool when experimenting with NTFS soft links: http://technet.microsoft.com/en-us/sysinternals/bb896768.aspx )
You could just take a direct approach - give each user (or your program initialization if you only care about one piece of software) a logon script that sets up the appropriate junction into their user directory (and make sure you clean it up afterwards). But it's clumsy.
In general the right Windows approach is to put things into the folders pointed at by the %localappdata% (from Vista on) and, more generally, %userprofile% system variables.
Win file system virtualization is intended to ensure this in the cases where it applies.

Resources