Haskell Directory creates invalid symlink on Windows

This year System.Directory was updated to include createFileLink and createDirectoryLink actions, and for me on Windows 10 both work fine for relative paths.
When I use either on an absolute path (of about 50 character length, so I suppose in unicode it exceeds 260) it prepends \\?\ (i.e. "\\\\?\\") to the paths, which can be seen from DIR as follows
<SYMLINKD> source [\\?\T:\Code\hLink\binaries\dest]
<SYMLINK> source.txt [\\?\T:\Code\hLink\binaries\dest\source.txt]
The directory link works fine, but the file link doesn't do anything; it doesn't even say that the target file is missing.
When I create a file link using MKLINK without \\?\ in the absolute path it works fine as well, and when I create either link using MKLINK with \\?\ it has the same result.
Is this a Windows problem? Can I make Haskell use short path format instead? (Using Win10 so apparently I can enable long paths via registry)
Should the Windows API be passing the \\?\ header to symlinks at all?
References:
MaxPath and the meaning of \\?\, plus disabling path limitations on Win10
https://msdn.microsoft.com/en-us/library/aa365247(VS.85).aspx#maxpath
Changelog reporting the addition of \\?\ to win32 calls https://hackage.haskell.org/package/directory-1.3.1.1/changelog

Related

Where is the use of "\\?\" defined?

This command deletes all files and subfolders in a folder:
rd /s "\\?\D:\TestFolder"
I got this command snippet from a YouTube video.
Could someone explain what this, \\?\, does?
It's the prefix that bypasses Windows path normalization. With it you can access paths that are not valid in the Win32 namespace, such as names ending with a dot or a space (D:\TestFolder\folder ending with space \file name ending with dot.), or files whose path is longer than MAX_PATH (260 characters in older Windows).
For file I/O, the "\\?\" prefix to a path string tells the Windows APIs to disable all string parsing and to send the string that follows it straight to the file system. For example, if the file system supports large paths and file names, you can exceed the MAX_PATH limits that are otherwise enforced by the Windows APIs. For more information about the normal maximum path limitation, see the previous section Maximum Path Length Limitation.
Naming Files, Paths, and Namespaces - Win32 File Namespaces
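To make the two cases above concrete, here is a minimal sketch that checks whether a path already carries the prefix and whether it would need it, either because of its length or because a component ends with a dot or a space. The helper names hasExtendedPrefix and needsExtendedPrefix and the constant kMaxPath are made up for illustration; the 260 limit is the MAX_PATH value quoted above, and real code should rely on the Win32 API rather than on this check.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// MAX_PATH as quoted in the text above (the limit includes the terminating NUL).
constexpr std::size_t kMaxPath = 260;

// Hypothetical helper: true if the path already starts with the \\?\ prefix.
bool hasExtendedPrefix(const std::string& path) {
    return path.rfind("\\\\?\\", 0) == 0;
}

// Hypothetical helper: true if the path is too long for MAX_PATH or if any
// backslash-separated component ends with '.' or ' '. The relative
// components "." and ".." are not counted.
bool needsExtendedPrefix(const std::string& path) {
    if (path.size() >= kMaxPath) return true;
    std::size_t start = 0;
    while (start <= path.size()) {
        std::size_t end = path.find('\\', start);
        if (end == std::string::npos) end = path.size();
        if (end > start) {
            const std::string comp = path.substr(start, end - start);
            const char last = comp.back();
            if (comp != "." && comp != ".." && (last == '.' || last == ' ')) {
                return true;
            }
        }
        start = end + 1;
    }
    return false;
}
```

For example, needsExtendedPrefix("D:\\TestFolder\\file name ending with dot.") is true, while an ordinary short path such as "D:\\TestFolder\\file.txt" does not need the prefix.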
See
Dots at the end of file name?
How to copy files that have too long of a filepath in Windows?

Windows directory that will never contain non-ASCII characters for temp file?

Using MinGW 7.3.0 on Windows, Hunspell can't load the dictionary files from locations that have non-ASCII characters because of Windows limitations. I've tried everything[1] and I'm now resorting to copying the file to a path without non-ASCII characters before giving it to Hunspell. What is a good location to copy it to?
[1]
Windows requires wchar_t support for std::iostream.open() to work right, which MinGW does not implement
std::filesystem can solve this, but it is only available in GCC 8
Hunspell insists on loading files on its own, it is not possible to pass the read files as strings to it
The "natural" fit would be to use the user's chosen temporary directory (or a subdirectory thereof) (see %temp% or GetTempPath()). However, that defaults to something that contains the user name (which can contain "non-ASCII" characters; e.g. c:\users\Ø¥Ć¼\AppData\LocalLow\Temp) or something arbitrary (regarding character set) altogether.
So you're most likely best off to choose some directory that
a) does not contain off-limits characters from the get go. For example, a directory underneath C:\ProgramData that you choose yourself (e.g. the application name) that you know does not contain non-ASCII characters.
b) Let the user decide where to put these files, and make sure it is only permissible to enter a path that contains allowed characters.
c) Pass the "short path name" to Hunspell, which should not contain non-ASCII characters for compatibility with FAT file system traits. For example, the short path name for c:\temp\Ø¥Ć¼ is c:\temp\571D~1.
You can see the short names for directories using cmd.exe /c dir /x:
C:\temp>dir /x
...
19.07.2019 15:30 <DIR> .
19.07.2019 15:30 <DIR> ..
19.07.2019 15:30 <DIR> 571D~1 Ø¥Ć¼
How you can invoke the GetShortPathName Win32 API from MinGW I don't know, but I would assume that it is possible.
Also make sure to review the MSDN page for the above function for trade-offs, e.g. short names are not supported everywhere (e.g. SMB + see comments below).
From this bug tracker:
In WIN32 environment, use UTF-8 encoded paths started with the long
path prefix \\?\ to handle system-independent character encoding
and very long path names (without the long path prefix Hunspell will
use fopen() with system-dependent character encoding instead of
_wfopen()).
So the actual solution seems to be:
Call GetFullPathNameW to normalize the path. Required because paths with long path prefix \\?\ are passed to the NT API unchanged.
Prepend L"\\\\?\\" to the normalized path (backslashes doubled because of C string literal requirements).
For a UNC path, you have to use the "UNC" device directly, i. e. L"\\\\server\\share" → L"\\\\?\\UNC\\server\\share" (thanks eryksun).
Encode the path in UTF-8, e. g. using WideCharToMultiByte() with CP_UTF8.
Pass the final UTF-8 encoded path to Hunspell.
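The prefixing step (including the UNC special case) can be sketched as plain string handling. This is only a sketch under assumptions: addLongPathPrefix is a made-up helper name, the input is assumed to be fully qualified and already normalized (e.g. by GetFullPathNameW), device paths such as \\.\ are not handled, and the UTF-8 encoding step is still left to WideCharToMultiByte().

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: adds the long path prefix to a path that is assumed
// to be fully qualified and already normalized.
std::wstring addLongPathPrefix(const std::wstring& fullPath) {
    const std::wstring prefix = L"\\\\?\\";          // \\?\ as a C++ literal
    const std::wstring uncPrefix = L"\\\\?\\UNC\\";  // \\?\UNC\ as a C++ literal

    if (fullPath.compare(0, prefix.size(), prefix) == 0) {
        return fullPath;  // already prefixed, leave unchanged
    }
    if (fullPath.compare(0, 2, L"\\\\") == 0) {
        // \\server\share becomes \\?\UNC\server\share
        return uncPrefix + fullPath.substr(2);
    }
    return prefix + fullPath;  // drive path such as C:\dir\file
}
```

With this, L"C:\\dict\\en_US.dic" becomes L"\\\\?\\C:\\dict\\en_US.dic", and the result would then be converted to UTF-8 before being handed to Hunspell.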
It looks like C:\Windows\Temp is still a valid path you can write to yourself.

Is the ClearCase path syntax a standard one -- '##' and the string after it

Is the ClearCase path syntax a universal one?
my_source.c##\main\10.1_bugfix\another_branch\0
Is this path a standard one?
I invoked the following in Emacs and it works. How does Emacs diff understand this path? Does ClearCase somehow tell the operating system how to interpret it, or does emacs-diff know this path syntax?
$ diff my_source.c##\main\10.1_bugfix\another_branch\0 my_source.c
This is an extended path.
It is working with dynamic views, which give access to the branches and versions of an element.
See "Base ClearCase path meaning".
In your case, you access version 0 of the branch main\10.1_bugfix\another_branch.
See also the IBM technote "About the version-extended path" for an example, and pathnames_ccase for the doc:
You can add characters to the end of a relative or full path name, turning it into a VOB-extended path name.
VOB-extended path names that specify versions of elements are the most commonly used; they are called version-extended path names.

Windows program files path names?

This may be a silly question, but I can't figure out how to search Google for why some code I read writes paths this way: \\progra~1
What do ~ and 1 mean?
I tried executing in Windows Run the same path but changing numbers and these are the results:
C:\progra~1 -> Opens Program Files
C:\progra~2 -> Opens Program Files (x86)
C:\progra~3 -> Opens ProgramData
C:\progra~4 -> Opens ProgramDevices, a folder I created in C:\
Why? Is this like a Match or something in the Folder names list?
For example a regex like "progra" and then to show the ~1 (First) match in some X order or ~2 (Second) ... etc?
It's a compatibility mode with the old (really old) 8.3 file naming convention. The ~n represents the instance of the name that has the same root characters.
In your example:
Program Files and Program Files (x86) have the same root characters Progra.
Hence one gets progra~1, the next progra~2 etc.
8.3 compatibility can be turned off for a disk partition.
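As a deliberately simplified illustration of the ~n pattern (not the actual algorithm the file system uses, which also deals with extensions, invalid characters and other cases), the sketch below keeps up to six characters of the long name, drops spaces and dots, upper-cases the result and appends ~n. The function name shortNameCandidate is made up for this example.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Simplified illustration only: builds a name of the form ROOT~n from the
// first six usable characters of a long directory name.
std::string shortNameCandidate(const std::string& longName, int n) {
    assert(n >= 1);  // the counter starts at 1
    std::string root;
    for (unsigned char c : longName) {
        if (c == ' ' || c == '.') continue;  // spaces and dots are dropped
        if (root.size() == 6) break;         // keep at most six root characters
        root += static_cast<char>(std::toupper(c));
    }
    return root + "~" + std::to_string(n);
}
```

Under this simplification, "Program Files" with n = 1 gives PROGRA~1 and "Program Files (x86)" with n = 2 gives PROGRA~2, which matches the pattern seen in the question.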
Exactly, it's a pattern counter.
Check out also this answer: What does %~d0 mean in a Windows batch file?
You can find more examples of different variables with modifiers here:
https://technet.microsoft.com/en-us/library/bb490909.aspx
(ctrl-f for "Variable substitution")

How to get the parent folder of a Windows user's profile path using C++

I am trying to get the parent folder of a Windows user's profile path, but I couldn't find any "parameter" to get this using SHGetSpecialFolderPath. So far I am using CSIDL_PROFILE.
Expected Path:
Win7 - "C:\Users"
Windows XP - "C:\Documents and Settings"
For most purposes other than displaying the path to a user, it should work to append "\\.." (or "..\\" if it ends with a backslash) to the path in question.
With shell library version 6.0 you have CSIDL_PROFILES (not to be confused with CSIDL_PROFILE), which gives you what you want. This value was removed (see here), so you have to use your own workaround.
On any prior version you'll have to implement your own workaround, such as looking for the possible path separator(s), i.e. \ and / on Windows, and terminate the string at the last one. A simple version of this could use strrchr (or wcsrchr) to locate the backslash and then, assuming the string is writable, terminate the string at that location.
Example:
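#include <string.h>

char* path;
// Retrieve the path at this point, e.g. "C:\\Users\\username"
// (path must point to a writable, NUL-terminated buffer)
char* lastSlash = strrchr(path, '\\');
if(!lastSlash)
    lastSlash = strrchr(path, '/'); // fall back to forward slashes
if(lastSlash)
    *lastSlash = 0; // cut the string at the last separator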
Or of course GetProfilesDirectory (that eluded me) which you pointed out in a comment to this answer.
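For the separator-based workaround, a C++ variant with std::string avoids the writable-buffer requirement and handles a mix of both separators. This is just a sketch: parentDirectory is a made-up name, and it does no special handling of trailing separators or drive roots.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: returns everything before the last path separator.
std::string parentDirectory(const std::string& path) {
    // find_last_of matches whichever of '\\' or '/' occurs last
    const std::string::size_type pos = path.find_last_of("\\/");
    if (pos == std::string::npos) {
        return path;  // no separator found, leave the path unchanged
    }
    return path.substr(0, pos);
}
```

For "C:\\Users\\username" this returns "C:\\Users". Where it is available, GetProfilesDirectory remains the more direct route.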

Resources