How does Windows link specific files to specific thumbnails in the thumbnail cache? - windows

According to Wikipedia:
Thumbnail caching was introduced in Windows 2000; wherein the thumbnails were stored in the image file's alternate data stream if the operating system was installed on a drive with the NTFS file system.
Here, it's clear that Windows used to associate thumbnail information directly with a file using its alternate data stream (on an NTFS file system). However, since Windows Vista:
...thumbnail previews are stored in a centralized location on the system... The cache is stored at %userprofile%\AppData\Local\Microsoft\Windows\Explorer as a number of files with the label thumbcache_xxx.db (numbered by size)
However, I have yet to find anywhere that explains how Windows associates individual thumbnails in the cache with specific files on the file system.
Does Windows associate the thumbnail image with the checksum of the file? It seems unlikely/nonoptimal because Windows would have to compute the checksum of every item in a folder (when accessed) if it wanted to properly display the correct thumbnails.
Does Windows use something lower level like the NTFS file ID of the file? But then how would it work on other file systems like FAT which don't assign fixed file IDs?
I have yet to find any good answers so I would really appreciate any help I could get!

Related

How can I know whether a particular file on a Windows machine supports Alternate Data Streams?

Using the raw Windows programming API from C/C++ and a file handle or a path to a file, folder, link, etc; how can I programmatically decide whether the file (etc) supports ADS (Alternate Data Streams)?
I assume one thing I have to know is whether the file is on an NTFS partition, but then again for all I know it might be possible to mount some kind of Mac or *nix filesystems which support data forks or alternate data streams of some kind, and all such cases might be covered by a single API call or data structure.
Secondly I'm not sure whether every kind of object that can exist on an NTFS partition can have ADSs - such as folders, symlinks, hardlinks, anything else?
What API etc can handle all cases to tell me whether a given file etc has the ability to have ADSs?
(For this question I'm not looking for whether files have ADSs, just whether it's possible for files to have them. It could include a file I've just created, for instance.)
ADS is a feature of NTFS. You can use GetVolumeInformation() to detect whether a given path is on an NTFS file system, and even whether that volume supports ADS at all (the FILE_NAMED_STREAMS bit in the returned file system flags). AFAIK, only a real file can have an ADS attached to it. You can use GetFileAttributes() to detect whether a path is a file, directory, symbolic link, etc.
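A minimal sketch of the GetVolumeInformation() check, written in Python with ctypes rather than C/C++ to keep it short. The helper names supports_named_streams and volume_flags are made up for this example; FILE_NAMED_STREAMS (0x00040000) is the documented file system flag. The volume query only runs on Windows, while the flag test itself is plain bit arithmetic.

```python
import ctypes
import sys

# Documented bit in the file system flags returned by GetVolumeInformationW.
FILE_NAMED_STREAMS = 0x00040000


def supports_named_streams(fs_flags: int) -> bool:
    """Return True if the flags say the volume supports named streams (ADS)."""
    return bool(fs_flags & FILE_NAMED_STREAMS)


def volume_flags(root: str) -> int:
    """Query the file system flags of a volume root path (Windows only).

    The root path needs a trailing backslash, e.g. the root of drive C.
    """
    if sys.platform != "win32":
        raise OSError("GetVolumeInformationW is only available on Windows")
    flags = ctypes.c_uint32(0)
    fs_name = ctypes.create_unicode_buffer(261)
    ok = ctypes.windll.kernel32.GetVolumeInformationW(
        ctypes.c_wchar_p(root),  # volume root path
        None, 0,                 # volume name is not needed here
        None, None,              # serial number, max component length
        ctypes.byref(flags),     # receives the file system flags
        fs_name, len(fs_name),   # receives the file system name, e.g. "NTFS"
    )
    if not ok:
        raise ctypes.WinError()
    return flags.value
```

On a Windows machine this would be used as supports_named_streams(volume_flags("C:\\")). Checking the flag rather than comparing the file system name against "NTFS" should also cover other file systems that report named-stream support.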
Like any other file, directories can also host ADS! Any file object on NTFS can store more than one data stream. The 'visible' one is the default, unnamed stream; any additional, named data stream is 'invisible' as far as Explorer is concerned. At the command prompt, you can now display ADS by passing the /R switch to dir.

How to read disk file entries faster than FindFile API? [duplicate]

I am in the middle of writing a tool that finds lost files of an iTunes library, for both Mac and Windows. On the Mac, I can quickly find files by name using the wonderful "CatalogSearch" function.
On Windows, however, there seems to be no OS API for searching by file name (or is there?).
After some googling, I learned that there are tools (like TFind, Everything) that read the NTFS directory directly and scan it to find files by name.
I would like to do the same, but without having to start from scratch (although I've written quite a few disk tools in the past, I've never had the energy to dig into NTFS).
I wonder if there are ready-made libs around, possibly as a .dll, that would give me this search feature: Pass in a file name, get back its path.
Alternatively, what about the Windows indexing service? At least when I tried this on a recently installed XP Home system, the Search operation under the Start menu would actually scan all directories, which suggests that it has no complete database. As I'm not a Windows user at all, I wonder why this isn't working.
In the end, the complete solution I need is: I have a list of file names to find, and I need code that searches the entire disk (or uses a DB for it) to get me all results in one go. That is, the search should not start a new full scan for every file I'm looking up. That's why I think the MFT way would be optimal, as it could quickly iterate over all names, comparing each to my list.
The best way to solve your problem seems to be by using the Windows Change Journal.
Problem: If it is not enabled for a volume, or the volume is not NTFS, you need a fallback (or you can enable the Change Journal if the volume is NTFS). You also need administrator rights to access the Change Journal.
You get the files by calling DeviceIoControl with FSCTL_ENUM_USN_DATA and LowUsn=0. This directly accesses the MFT and writes the file names into the supplied buffer. Because it accesses the MFT sequentially, it is faster than the FindFirstFile API.
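The matching step the question asks for (one pass over all names, compared against the whole list) might look roughly like the sketch below. It does not call DeviceIoControl; the Entry tuple is a hypothetical stand-in for the file reference number, parent file reference number and file name carried by each enumerated USN record, and find_paths and root_ref are names invented for this example.

```python
from typing import Dict, Iterable, List, NamedTuple, Set


class Entry(NamedTuple):
    """Hypothetical stand-in for the fields of one enumerated USN record."""
    file_ref: int    # the record's file reference number
    parent_ref: int  # reference number of the containing directory
    name: str        # file or directory name without path


def find_paths(entries: Iterable[Entry], wanted: Set[str],
               root_ref: int) -> Dict[str, List[str]]:
    """Scan all entries once and return paths for every wanted name."""
    wanted_folded = {w.casefold() for w in wanted}  # case-insensitive match
    by_ref: Dict[int, Entry] = {}
    hits: List[Entry] = []

    # Single pass: index every entry and remember the ones we look for.
    for entry in entries:
        by_ref[entry.file_ref] = entry
        if entry.name.casefold() in wanted_folded:
            hits.append(entry)

    def build_path(entry: Entry) -> str:
        # Walk the parent references up to the root to rebuild the path.
        parts = [entry.name]
        seen = {entry.file_ref}
        ref = entry.parent_ref
        while ref != root_ref and ref in by_ref and ref not in seen:
            seen.add(ref)
            parent = by_ref[ref]
            parts.append(parent.name)
            ref = parent.parent_ref
        return "\\".join(reversed(parts))

    result: Dict[str, List[str]] = {}
    for entry in hits:
        result.setdefault(entry.name.casefold(), []).append(build_path(entry))
    return result
```

Paths are rebuilt only after the scan, because a directory's record can appear after the records of the files it contains.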

Change Journal for Blocks in Windows (NTFS)

I have written a backup tool that is able to back up files and images of volumes for Windows. To detect which files have changed, I use the Windows Change Journal. I already use the shadow copy functionality to make a consistent copy of both the files and the volume images.
To detect which blocks have changed, I use hashes at the moment. This means the whole volume has to be read once per backup, because the hashes of all blocks have to be calculated to see which blocks have changed.
The backup integrated into Windows 7 is able to create incremental volume images without checking all blocks. I wasn't able to find an API for a kind of block level change journal.
Does anybody know how to access this information?
(I'm willing to dive deep into NTFS internals - even reading and parsing special files)
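For reference, the hash-based detection described above can be sketched as follows. This is only an illustration under assumptions: the 4096-byte block size, the choice of SHA-256 and the function names are not taken from the tool in question. It also makes the cost visible, since every block must be read to compute its hash.

```python
import hashlib
from typing import BinaryIO, List


def block_hashes(stream: BinaryIO, block_size: int = 4096) -> List[bytes]:
    """Read the whole stream and return one SHA-256 digest per block."""
    hashes = []
    while True:
        block = stream.read(block_size)
        if not block:
            break
        hashes.append(hashlib.sha256(block).digest())
    return hashes


def changed_blocks(old: List[bytes], new: List[bytes]) -> List[int]:
    """Return indices of blocks in `new` that differ from `old` or are new."""
    changed = [i for i, (a, b) in enumerate(zip(old, new)) if a != b]
    # Blocks beyond the end of the old image count as changed.
    changed.extend(range(min(len(old), len(new)), len(new)))
    return changed
```

An incremental image would then contain only the blocks whose indices are returned by changed_blocks, plus the new hash list for the next run.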
I don't think block-level change info is available anywhere. Most probably, the Windows 7 integrated backup installs a File System Filter Driver, as some backup products and anti-virus software do. A filter driver can intercept all file system calls and in this way know which blocks changed. If you do this, you can basically build your own change journal that works at block level, but only for the files that you are interested in.
I would really like to know a better answer myself here.
When you say Windows Change Journal I take it you are referring to the NTFS USN? It looks very much like the Windows 7 backup uses a combination of VSC and NTFS USN to detect changes and create incremental images much like you are already doing.

Folder with Extension

I'm looking to have Windows recognize that certain folders are associated with my application, maybe by naming the folder 'folder.myExt'.
Can this be done via the registry?
A bit more info
- This is for a cross-platform app (that's why I suggested the folder with an extension; the Mac can handle that)
- The RAD tool I'm using doesn't read and write binary data efficiently enough, as the size of this 'folder' will be upwards of 2000 files and 500 MB
Folders in Windows aren't subject to the name.extension rules at all; there's only one entry in the registry's file type handling for "folder" types. (If you try to change it, you're going to have very, very rough times ahead.)
The only simple way to get the effect you're after would be to do what OpenOffice, MS Office 2007, and large video games have been doing for some time: use a ZIP file as a container. (It doesn't have to be a "ZIP" exactly, but some type of readily available container file type is better than writing your own.) Like OO.org and Office 2K7, you can just use a custom extension and designate your app as the handler. This will also work on Macs, so it can be cross-platform. It may not be fast, however. Using low or no compression may help with that.
You can have an "extension" on your folder, but as far as I know, Windows just treats it all as the folder name and opens the folder like normal when you click on it.
The few times I messed with opening a .app on my Windows system, it acted like it was a normal folder.
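As a rough illustration of the container idea, here is a minimal Python sketch using the standard zipfile module with compression turned off (ZIP_STORED). The function names are hypothetical, and the ".myExt" extension is simply the one from the question; registering the extension with your application is a separate step not shown here.

```python
import zipfile
from typing import Dict, List


def write_container(path: str, files: Dict[str, bytes]) -> None:
    """Pack the given members into one uncompressed ZIP container file."""
    with zipfile.ZipFile(path, "w", compression=zipfile.ZIP_STORED) as zf:
        for name, data in files.items():
            zf.writestr(name, data)


def list_members(path: str) -> List[str]:
    """Return the member names stored in the container."""
    with zipfile.ZipFile(path) as zf:
        return zf.namelist()


def read_member(path: str, name: str) -> bytes:
    """Read a single member without unpacking the rest of the container."""
    with zipfile.ZipFile(path) as zf:
        return zf.read(name)
```

For example, write_container("project.myExt", {"media/clip.bin": b"..."}) produces a single file that any ZIP tool can open, and read_member pulls out one entry at a time, which matters when the container holds thousands of files.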
