SHParseDisplayName fails with ERROR_FILE_NOT_FOUND when passed existing directory whose name ends with a space character - windows

I'm developing a file manager application and noticed that some functions don't work with an existing folder whose name ends with a space, e.g. "E:\1 \". This isn't specific to this particular folder; it affects any folder whose name has a space as its last character. For such folders, SHParseDisplayName returns ERROR_FILE_NOT_FOUND.
I'm calling SHParseDisplayName like so from C++:
ITEMIDLIST* idPtr = nullptr;
const auto result = SHParseDisplayName(L"E:\\1 \\", nullptr, &idPtr, 0, nullptr);
The documentation doesn't specify any edge cases, nor any ways in which the input path should be pre-processed. Regardless, I tried decorating it with quotes:
SHParseDisplayName(L"\"E:\\1 \\\"", nullptr, &idPtr, 0, nullptr);
And supplying an extended-length path with the "\\?\" prefix:
SHParseDisplayName(L"\\\\?\\E:\\1 \\", nullptr, &idPtr, 0, nullptr);
Both of which result in E_INVALIDARG.
Of note: SHParseDisplayName does work properly for items nested inside such a folder, e.g. L"E:\\1 \\some_internal_folder\\"; it fails only for the folder whose own name ends with a space.
Is there any workaround? Windows Explorer seems to work just fine with such folders (as one would expect).
Also, SHParseDisplayName isn't the only Windows API function that fails for such folders. Another example of the same behavior is ILCreateFromPathW.

File and Folder names that begin or end with the ASCII Space (0x20)
will be saved without these characters. File and Folder names that end
with the ASCII Period (0x2E) character will also be saved without this
character. All other trailing or leading whitespace characters are
retained.
The Win32 API (CreateFile, FindFirstFile, etc.) uses a direct method
to enumerate the files and folders on a local or remote file system.
All files and folders are discoverable regardless of the inclusion or
location of whitespace characters.
Refer to "Support for Whitespace characters in File and Folder names"
and to the blog post that notes: "MS-DOS also allowed spaces in file names, although vanishingly few programs knew how to access them."
So for existing files or folders whose names end with a space, either use the Win32 API (CreateFile, FindFirstFile, etc.) or rename them so that the name has no leading or trailing whitespace.
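To find the names that call for this special handling, a small check on each name component can help. The following is a minimal sketch in portable C++; the helper name endsWithSpaceOrDot is hypothetical and not part of any Windows API. It looks only at a single file or folder name, not a full path, and tests for the trailing ASCII space (0x20) and period (0x2E) mentioned in the quote above.

```cpp
#include <string>

// Hypothetical helper (not a Windows API): returns true when a single
// file or folder name ends with an ASCII space (0x20) or period (0x2E).
// Pass one name component, not a full path.
bool endsWithSpaceOrDot(const std::wstring& name)
{
    // The special entries "." and ".." are not real names, so skip them.
    if (name.empty() || name == L"." || name == L"..")
        return false;

    const wchar_t last = name.back();
    return last == L' ' || last == L'.';
}
```

A file manager could run such a check on each name it enumerates and route matching items through the file-system functions instead of the shell functions.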

Related

QT: Escape slash / in saving location on a mac

I have HTML input from which I am supposed to extract two strings, build a document title of the form <string 1> / <string 2>, create a PDF from the source on the user's Mac desktop, and name it as described.
I do know that a slash in a document name is not such a brilliant idea, but this is what I am asked to do.
The problem is that the forward slash is interpreted as a folder separator on the Mac and not as part of the document's name, which means QPainter fails to print to PDF because it interprets "string1 /" as a folder that doesn't exist.
BTW when omitting the / my code is working fine.
How am I supposed to escape the /?
Here's the string building logic:
QString docTitle;
docTitle.append(string1);
docTitle.append(" / ");
docTitle.append(string2);
On OS X, the name of a file at the level of the APIs is different from the display name that is shown to the user in the Finder, open and save panels, etc.
At the level of the APIs, file names simply can't contain slashes. They are reserved for separating names within a path. There's no form of escaping or quoting to allow it.
However, you can create a file whose name will be displayed with a slash in the UI.
Basically, the slash (/) and colon (:) characters swap roles. The display names of files can't include a colon, because it's reserved. (This is a holdover of the old HFS file system used in Classic Mac OS.) So, one aspect of the conversion from names-in-the-APIs to display names is to convert from colons to slashes. Thus, if you want a file whose display name has a slash, you actually use a colon.
A file whose name as per the APIs is "Important legal document 06:13:2015.pdf" will be displayed in the UI as "Important legal document 06/13/2015.pdf". Likewise, if a user names a file in a save dialog or in the Finder as "Important legal document 06/13/2015.pdf", it will end up with a name which, when observed via the APIs, will be "Important legal document 06:13:2015.pdf".
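The conversion described above can be sketched in a few lines. This is only an illustration in plain C++ using std::string; the function name displayNameToApiName is hypothetical. It assumes the input is a single file name (not a full path), so every slash in it is meant to be shown in the UI.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: turns the name a user should see (with '/')
// into the name to pass to the file APIs (with ':').
// Apply it to the file name only, never to the directory part of a path.
std::string displayNameToApiName(std::string name)
{
    std::replace(name.begin(), name.end(), '/', ':');
    return name;
}
```

With a QString such as docTitle from the question, the same substitution could presumably be done with docTitle.replace('/', ':') before the title is appended to the desktop path.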

Win32 API: Reading very long "Win32 file namespace" names

Sorry, I haven't tested this myself, but MSDN says we can make a very long file name (more than MAX_PATH, i.e. 260 characters) by specifying the "Win32 file namespace":
http://msdn.microsoft.com/en-us/library/windows/desktop/aa365247.aspx#win32_file_namespaces Naming Files, Paths, and Namespaces > Win32 File Namespaces
That's easy with the CreateFile API because its signature accepts LPCTSTR lpFileName which incurs no restriction about the input length:
http://msdn.microsoft.com/en-us/library/windows/desktop/aa363858.aspx CreateFile function (Windows)
But how can we read such a long file name? WIN32_FIND_DATA returned by FindFirstFile contains only TCHAR cFileName[MAX_PATH].
http://msdn.microsoft.com/en-us/library/windows/desktop/aa365740.aspx WIN32_FIND_DATA structure (Windows)
http://msdn.microsoft.com/en-us/library/windows/desktop/aa364418.aspx FindFirstFile function (Windows)
Will we perhaps get 8.3 name instead when the actual file name doesn't fit into cFileName[MAX_PATH]?
For CreateFile, you can escape the MAX_PATH limit by using the Unicode version of the API and the special "\\?\" prefix (written L"\\\\?\\" in a C or C++ string literal).
For WIN32_FIND_DATA things are a little different. That record contains the file names as inline, fixed-length character arrays. However, the file names in the record contain only the name of the object relative to its container, that is, relative to the directory in which it lives. So the MAX_PATH limit on these fields is in fact not a limitation, because each component in a path is itself limited in length, typically to no more than 255 characters.
The limitation of path components to 255 characters in length is discussed in the very MSDN article to which you linked: Naming Files, Paths, and Namespaces.
The Windows API has many functions that also have Unicode versions to permit an extended-length path for a maximum total path length of 32,767 characters.
This type of path is composed of components separated by backslashes, each up to the value returned in the lpMaximumComponentLength parameter of the GetVolumeInformation function (this value is commonly 255 characters).
To specify an extended-length path, use the "\\?\" prefix. For example, "\\?\D:\very long path".
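As an illustration of the prefix rules quoted above, here is a minimal sketch in portable C++. The function name toExtendedLengthPath is hypothetical. It assumes the input already uses backslashes and contains no "." or ".." components, since prefixed paths are passed to the system with minimal modification. Relative paths are returned unchanged because the prefix cannot be used with them.

```cpp
#include <string>

// Hypothetical helper: adds the extended-length prefix to an absolute path.
//   D:\dir\file      becomes  \\?\D:\dir\file
//   \\server\share   becomes  \\?\UNC\server\share
// Anything else (relative paths, already prefixed paths) is returned as is.
std::wstring toExtendedLengthPath(const std::wstring& path)
{
    // Already in the \\?\ or \\.\ namespace: leave untouched.
    if (path.size() >= 4 && path[0] == L'\\' && path[1] == L'\\' &&
        (path[2] == L'?' || path[2] == L'.') && path[3] == L'\\')
        return path;

    // UNC path: drop the two leading backslashes and add the UNC form.
    if (path.size() >= 2 && path[0] == L'\\' && path[1] == L'\\')
        return L"\\\\?\\UNC\\" + path.substr(2);

    // Drive-letter path: letter, colon, backslash.
    const bool isLetter = !path.empty() &&
        ((path[0] >= L'A' && path[0] <= L'Z') ||
         (path[0] >= L'a' && path[0] <= L'z'));
    if (isLetter && path.size() >= 3 && path[1] == L':' && path[2] == L'\\')
        return L"\\\\?\\" + path;

    return path;
}
```

The result is meant for the Unicode versions of the API functions, such as CreateFileW or FindFirstFileW.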

Does the "SubstituteName" string in the PathBuffer of a REPARSE_DATA_BUFFER structure always start with the prefix "\??\", and if so, why?

I am trying to use Windows API functions compatible with Windows XP and up to find the target of a junction or symbolic link. I am using CreateFile to get a handle to the reparse point, then DeviceIoControl with the FSCTL_GET_REPARSE_POINT flag to read the reparse data into a REPARSE_DATA_BUFFER. Then, I use the offsets and lengths in the buffer to extract the SubstituteName and PrintName strings.
In Windows 8, extracting the PrintName works perfectly, giving me a normal path (i.e. c:\filename.ext), but in XP the PrintName section of the REPARSE_DATA_BUFFER seems to always have a length of 0, leaving me with an empty string.
Using the SubstituteName seems to work in both, but I always end up with a prefix of \??\ at the beginning of the file path (i.e. \??\c:\filename.ext). (As a side note, fsutil reparsepoint query shows the \??\ prefix as well.)
I've read through much of the documentation on MSDN, but I can't find any explanation of this prefix. If the prefix is guaranteed to begin every SubstituteName, then I can just exclude the first four characters when I copy the file path from the buffer, but I'm not sure that this is the case. I would love to know if the "\??\" prefix appears in the SubstituteName for all Microsoft reparse points and why.
The Windows kernel has a "DOS Devices namespace" \DosDevices\ which is basically where anything you can open with CreateFile resides. (QueryDosDevice is a function which gives you all the members of that namespace.)
Because it's such a commonly used path, \??\ also redirects to that namespace. So, to the kernel, the path C:\Windows is invalid -- it should really be written as something like \??\C:\Windows. That's where this notation comes from.
The \??\ prefix means the path is not parsed. It is not guaranteed on every name, so you will have to look for the prefix on a per-name basis and skip it if present.
Update: I could not find any definitive documentation explaining exactly what \??\ actually represents, but here are some links that mention the \??\ prefix in action:
http://www.flexhex.com/docs/articles/hard-links.phtml
Note that szTarget string must contain the path prefixed with the "non-parsed" prefix "\??\", and terminated with the backslash character, for example "\??\C:\Some Dir\".
http://social.msdn.microsoft.com/Forums/en-US/vbgeneral/thread/908b3927-1ee9-4e03-9922-b4fd49fc51a6
http://mjunction.googlecode.com/svn-history/r5/trunk/MJunction/MJunction/JunctionPoint.cs
This prefix indicates to NTFS that the path is to be treated as a non-interpreted path in the virtual file system.
Private Const NonInterpretedPathPrefix As String = "\??\"
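Since the prefix is not guaranteed on every name, the safe approach is to test for it before skipping it rather than always dropping four characters. A minimal sketch in portable C++ follows; the function name stripNtPrefix is hypothetical, and the input is assumed to be the SubstituteName already copied out of the REPARSE_DATA_BUFFER into a std::wstring.

```cpp
#include <string>

// Hypothetical helper: removes a leading \??\ from a SubstituteName
// if, and only if, it is present. Other names are returned unchanged.
std::wstring stripNtPrefix(const std::wstring& substituteName)
{
    const std::wstring prefix = L"\\??\\";  // the four characters \ ? ? and \.

    if (substituteName.compare(0, prefix.size(), prefix) == 0)
        return substituteName.substr(prefix.size());

    return substituteName;
}
```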

SetCurrentDirectory to a path with dot at the end

The Win32 SetCurrentDirectory() function fails to change the current directory to a path with a dot at the end; GetLastError returns 2 (The system cannot find the file specified.).
What's wrong?
File names are not allowed to end in dots, and the behaviour is not guaranteed if they do.
http://msdn.microsoft.com/en-us/library/windows/desktop/aa365247(v=vs.85).aspx#naming_conventions
Do not end a file or directory name with a space or a period. Although
the underlying file system may support such names, the Windows shell
and user interface does not.

CGI DLL (built in Delphi) physical path

I deployed a CGI DLL built with Delphi 2007 on a Windows 2008 server. Internally I need to use the current DLL path.
Normally I can use GetModuleFileName or GetModuleName, but on the server they both return:
\\?\c:\my\correct\path
Why the first 4 characters? It looks like a network path? Is there any way to exclude those first 4 characters?
The pertinent documentation is this:
Maximum Path Length Limitation
In the Windows API (with some exceptions discussed in the following
paragraphs), the maximum length for a path is MAX_PATH, which is
defined as 260 characters. A local path is structured in the following
order: drive letter, colon, backslash, name components separated by
backslashes, and a terminating null character. For example, the
maximum path on drive D is "D:\some 256-character path string<NUL>"
where "<NUL>" represents the invisible terminating null character for
the current system codepage. (The characters < > are used here for
visual clarity and cannot be part of a valid path string.)
Note File I/O functions in the Windows API convert "/" to "\" as part
of converting the name to an NT-style name, except when using the
"\\?\" prefix as detailed in the following sections.
The Windows API has many functions that also have Unicode versions to
permit an extended-length path for a maximum total path length of
32,767 characters. This type of path is composed of components
separated by backslashes, each up to the value returned in the
lpMaximumComponentLength parameter of the GetVolumeInformation
function (this value is commonly 255 characters). To specify an
extended-length path, use the "\\?\" prefix. For example, "\\?\D:\very
long path".
Note The maximum path of 32,767 characters is approximate, because
the "\\?\" prefix may be expanded to a longer string by the system at
run time, and this expansion applies to the total length.
The "\\?\" prefix can also be used with paths constructed according to
the universal naming convention (UNC). To specify such a path using
UNC, use the "\\?\UNC\" prefix. For example, "\\?\UNC\server\share",
where "server" is the name of the computer and "share" is the name of
the shared folder. These prefixes are not used as part of the path
itself. They indicate that the path should be passed to the system
with minimal modification, which means that you cannot use forward
slashes to represent path separators, or a period to represent the
current directory, or double dots to represent the parent directory.
Because you cannot use the "\\?\" prefix with a relative path,
relative paths are always limited to a total of MAX_PATH characters.
As long as you are calling Unicode versions of Windows API functions, there's no need to strip the "\\?\" prefix, because the path that you have been handed is a valid path.
As we discovered in the comments, you were calling an ANSI version of an API function, and when you do that, the "\\?\" prefix is not valid. So, stick to Unicode API functions and it's all good!
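If the path does have to be handed to code that cannot accept the prefix, such as the ANSI functions mentioned above, the prefix can be removed by plain string handling. The sketch below is in portable C++ rather than Delphi, and the function name stripExtendedLengthPrefix is hypothetical. It handles both prefix forms from the quoted documentation; the stripped path is only usable if it fits within MAX_PATH.

```cpp
#include <string>

// Hypothetical helper: removes the extended-length prefix if present.
//   \\?\c:\my\path          becomes  c:\my\path
//   \\?\UNC\server\share    becomes  \\server\share
// Paths without the prefix are returned unchanged.
std::wstring stripExtendedLengthPrefix(const std::wstring& path)
{
    const std::wstring uncPrefix = L"\\\\?\\UNC\\";
    const std::wstring extPrefix = L"\\\\?\\";

    // Check the longer UNC form first, since it starts with the short form.
    if (path.compare(0, uncPrefix.size(), uncPrefix) == 0)
        return L"\\\\" + path.substr(uncPrefix.size());

    if (path.compare(0, extPrefix.size(), extPrefix) == 0)
        return path.substr(extPrefix.size());

    return path;
}
```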
