The documentation says of nNumberOfLinks: "The number of links to this file. For the FAT file system, this member is always 1. For the NTFS file system, it can be more than 1."
What does "the number of links to the file" mean? If the file is the target of 3 symlinks, is nNumberOfLinks 3, or does it have some other meaning?
Looking into the implementation of GetFileInformationByHandle in the ReactOS source code (https://doxygen.reactos.org/da/d02/dll_2win32_2kernel32_2client_2file_2fileinfo_8c_source.html), we can see that the field nNumberOfLinks gets populated as follows (error checking removed).
errCode = NtQueryInformationFile(hFile,
&IoStatusBlock,
&FileStandard,
sizeof(FILE_STANDARD_INFORMATION),
FileStandardInformation);
lpFileInformation->nNumberOfLinks = FileStandard.NumberOfLinks;
As per the documentation of FILE_STANDARD_INFORMATION (https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/wdm/ns-wdm-_file_standard_information):
NumberOfLinks
The number of hard links to the file.
So nNumberOfLinks will be the number of hard links, as mentioned by dxiv in the comments.
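To see what counts as a hard link in practice, here is a minimal Python sketch. It assumes a file system that supports hard links (NTFS does, FAT does not) and reads `os.stat().st_nlink`, which I would expect to report the same count as nNumberOfLinks on Windows. The file names and the function name are made up for illustration.

```python
import os
import tempfile


def hard_link_counts():
    """Return the link count of a file before and after adding a hard link."""
    with tempfile.TemporaryDirectory() as folder:
        original = os.path.join(folder, "original.txt")  # hypothetical name
        with open(original, "w") as handle:
            handle.write("data")
        before = os.stat(original).st_nlink
        # A hard link is a second directory entry that names the same file.
        os.link(original, os.path.join(folder, "hardlink.txt"))
        after = os.stat(original).st_nlink
        return before, after


if __name__ == "__main__":
    print(hard_link_counts())
```

A symlink created with `os.symlink` would not change the count, because a symlink is a separate file that only stores a path to its target.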
The FindNextFile WinAPI function is used to list the contents of directories. Microsoft states in the documentation that the order is file system dependent, although NTFS should return names in alphabetical order most of the time:
The order in which this function returns the file names is dependent on the file system type. With the NTFS file system and CDFS file systems, the names are usually returned in alphabetical order. With FAT file systems, the names are usually returned in the order the files were written to the disk, which may or may not be in alphabetical order. However, as stated previously, these behaviors are not guaranteed.
My application needs the objects in a directory to be ordered. Because the majority of Windows users use NTFS, I would like to optimize my application for that case, so I use the function _wcsicmp to compare names. Most of the time this is correct and the results from FindNextFile are already sorted according to _wcsicmp. However, sometimes the results are not sorted. I thought that was natural, because FindFirstFile doesn't guarantee the order and I must sort anyway (in case of another file system). Then I noticed a strange pattern: the character '_' seems to be returned after letters. A folder containing (a.txt, b.txt, _.txt) is returned in the order a, b, _, whereas _wcsicmp sorts it as _, a, b. Tested on Windows 8.1. I ran some tests and this behavior is consistent.
Can someone explain what comparison criteria NTFS uses? Or why FindNextFile returns names out of alphabetical order?
Because the NTFS sort rules are not as simple as plain alphabetical order. Here is an MSDN blog article that sheds some light on the problem:
Why do NTFS and Explorer disagree on filename sorting?
One reason for this can be that NTFS captures the case mapping table at the time the drive is formatted and continues to use that table, even if the OS's case mapping tables change subsequently.
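For the underscore case specifically, one plausible explanation (my reading, not something quoted from the article) is that NTFS compares names after mapping them to upper case, while _wcsicmp compares lower-cased characters. The underscore (0x5F) sits between the upper-case and lower-case ASCII letters, so the two folds disagree. A small Python sketch, where `str.upper` and `str.lower` only approximate the real case tables and the function names are hypothetical:

```python
names = ["a.txt", "b.txt", "_.txt"]


def sort_like_wcsicmp(items):
    # Lower-case fold: '_' (0x5F) sorts before 'a' (0x61).
    return sorted(items, key=str.lower)


def sort_like_ntfs(items):
    # Upper-case fold: 'A' (0x41) and 'B' (0x42) sort before '_' (0x5F).
    return sorted(items, key=str.upper)


if __name__ == "__main__":
    print(sort_like_wcsicmp(names))
    print(sort_like_ntfs(names))
```

With these three names the upper-case fold reproduces the a, b, _ order observed from FindNextFile.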
You can use CompareStringEx with the SORT_DIGITSASNUMBERS flag.
The minimum system requirement for this function is Windows Vista (the SORT_DIGITSASNUMBERS flag itself requires Windows 7).
int result = CompareStringEx(LOCALE_NAME_USER_DEFAULT, SORT_DIGITSASNUMBERS,
lpString1, cchCount1, lpString2, cchCount2, NULL, NULL, 0);
The return value of this function is unusual: it is 1, 2, or 3 on success (0 means failure):
#define CSTR_LESS_THAN 1 // string 1 less than string 2
#define CSTR_EQUAL 2 // string 1 equal to string 2
#define CSTR_GREATER_THAN 3 // string 1 greater than string 2
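Subtracting 2 from a nonzero result gives the familiar negative/zero/positive convention used by strcmp-style functions. A tiny sketch of that mapping, written in Python for brevity; the helper name is hypothetical and the constants are copied from the definitions above:

```python
CSTR_LESS_THAN = 1
CSTR_EQUAL = 2
CSTR_GREATER_THAN = 3


def to_c_convention(cstr_result):
    """Map a CompareStringEx-style result to -1, 0, or 1."""
    if cstr_result == 0:
        # 0 is not a comparison result; it signals that the call failed.
        raise ValueError("comparison failed (result was 0)")
    return cstr_result - 2
```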
You can also try _wcsicoll on older systems. If I recall correctly, _wcsicoll comes closer to Windows's own sort order, but still does not match it exactly.
I've been writing a program in R that outputs randomization schemes for a research project I'm working on with a few other people this summer, and I'm done with the majority of it, except for one feature. Part of what I've been doing is making it really user friendly, so that the program will prompt the user for certain pieces of information, and therefore know what needs to be randomized. I have it set up to check every piece of user input to make sure it's a valid input, and give an error message/prompt the user again if it's not. The only thing I can't quite figure out is how to get it to check whether or not the file name for the .csv output is valid. Does anyone know if there is a way to get R to check if a string makes a valid windows file name? Thanks!
These characters aren't allowed: /\:*?"<>|. So warn the user if it contains any of those.
Some other names are also disallowed: CON, PRN, AUX, NUL, COM1 to COM9, LPT1 to LPT9.
You probably want to check that the filename is valid using a regular expression. See this other answer for a Java example that should take minimal tweaking to work in R.
https://stackoverflow.com/a/6804755/134830
You may also want to check the filename length (260 characters for maximum portability, though longer names are allowed on some systems).
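The checks listed so far can be sketched as follows. This is in Python rather than R, purely as an illustration of the logic; the function name and messages are hypothetical, the reserved list is the standard set of Windows device names, and it assumes a reserved name is also rejected when it carries an extension (for example NUL.csv).

```python
import re

# Characters that Windows does not allow in a file name.
FORBIDDEN_CHARS = re.compile(r'[/\\:*?"<>|]')

# Reserved device names: CON, PRN, AUX, NUL, COM1-COM9, LPT1-LPT9.
RESERVED_NAMES = {"CON", "PRN", "AUX", "NUL"} | {
    f"{prefix}{number}" for prefix in ("COM", "LPT") for number in range(1, 10)
}


def filename_problems(filename, max_length=260):
    """Return a list of reasons why filename looks invalid (empty if none)."""
    problems = []
    if not filename:
        problems.append("is empty")
    if FORBIDDEN_CHARS.search(filename):
        problems.append("contains a forbidden character")
    # Compare the part before the first dot, ignoring case.
    stem = filename.split(".", 1)[0].upper()
    if stem in RESERVED_NAMES:
        problems.append("uses a reserved device name")
    if len(filename) > max_length:
        problems.append("is too long")
    return problems
```

Returning a list of reasons rather than a single TRUE/FALSE makes it easier to tell the user what to change when prompting again.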
Finally, in R, if you try to create a file in a directory that doesn't exist, it will still fail, so you need to split the name up into the filename and directory name (using basename and dirname) and try to create the directory first, if necessary.
That said, David Heffernan gives good advice in his comment to let Windows do the work of deciding whether or not it can create the file: you don't want to erroneously tell the user that a filename is invalid.
You want something a little like this:
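nice_file_create <- function(filename)
{
  directory_name <- dirname(filename)
  if(!file.exists(directory_name))
  {
    # recursive = TRUE also creates any missing parent directories
    ok <- dir.create(directory_name, recursive = TRUE)
    if(!ok)
    {
      warning("The directory of that path could not be created.")
      return(invisible(FALSE))
    }
  }
  # file.create() returns FALSE (with a warning) on failure; it does not throw an error
  ok <- suppressWarnings(file.create(filename))
  if(!ok)
  {
    warning("The file could not be created.")
  }
  invisible(ok)
}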
But test it thoroughly first! There are all sorts of edge cases where things can fall over: try UNC network path names, "~", and paths with "." and ".." in them.
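The same "just try it and see whether the OS accepts it" idea, sketched in Python for comparison. The function name is hypothetical, and unlike file.create in R this version refuses to overwrite a file that already exists.

```python
import os


def try_create(path):
    """Return True if an empty file could be created at path, else False."""
    directory = os.path.dirname(path)
    try:
        if directory:
            # Create the directory (and any missing parents) first.
            os.makedirs(directory, exist_ok=True)
        # Mode "x" fails instead of overwriting an existing file.
        with open(path, "x"):
            pass
    except (OSError, ValueError):
        # OSError: the OS rejected the name or location.
        # ValueError: the name contains an embedded null character.
        return False
    return True
```

A FALSE result only says that creation failed, not why, so the prompt to the user should stay generic ("that file could not be created, please try another name").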
I'd suggest that the easiest way to make sure a filename is valid is to use fs::path_sanitize().
It removes control characters, reserved characters, and Windows-reserved filenames, truncating the string at 255 bytes in length.
I'm using the following function to delete a file to the recycle bin: (C++, MFC, Unicode)
bool DeleteFileToPaperbasket (CString filename)
{
TCHAR Buffer[2048+4];
_tcsncpy_s (Buffer, 2048+4, filename, 2048);
Buffer[_tcslen(Buffer)+1]=0; //Double-Null-Termination
SHFILEOPSTRUCT s;
s.hwnd = NULL;
s.wFunc = FO_DELETE;
s.pFrom = Buffer;
s.pTo = NULL;
s.fFlags = FOF_ALLOWUNDO | FOF_SILENT | FOF_NOERRORUI;
s.fAnyOperationsAborted = false;
s.hNameMappings = NULL;
s.lpszProgressTitle = NULL;
int rc = SHFileOperation(&s);
return (rc==0);
}
This works nicely for most files. But if path+filename exceeds 255 characters (while still being much shorter than 2048 characters), SHFileOperation returns 124, which is DE_INVALIDFILES.
But what's wrong? I checked everything a million times. The path is double-null terminated, I'm not using \\?\ and it works for short filenames.
I'm totally out of ideas...
I think backwards compatibility is biting you in several ways, and I'd need to actually see the paths you're using and implement some error checking code to help. But here are some hints.
You would not get a DE_INVALIDFILES 0x7C "The path in the source or destination or both was invalid." for a max path violation, you'd get a DE_PATHTOODEEP 0x79 "The source or destination path exceeded or would exceed MAX_PATH."
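These return values are pre-Win32 error codes that can change, and have changed, over time. The SHFileOperation documentation advises against using GetLastError with this function, so the only thing you can reliably conclude from a nonzero return value is that the operation failed.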
Also, taken from the SHFileOperation function documentation: "If you do not check fAnyOperationsAborted as well as the return value, you cannot know that the function accomplished the full task you asked of it and you might proceed under incorrect assumptions."
You should not be using this API for extremely long path names; on Vista and later it has been replaced by the IFileOperation interface.
The explanation for why it may work in Explorer and not through this legacy API is taken from the MSDN page on Naming Files, Paths, and Namespaces:
The shell and the file system have different requirements. It is possible to create a path with the Windows API that the shell user interface is not able to interpret properly.
Hope this was helpful
The recycle bin doesn't support files whose paths exceed MAX_PATH in length. You can verify this for yourself by trying to recycle such a file in Explorer - you will get an error message about the path being too long.
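If that limit is the cause, a cheap pre-check before attempting to recycle lets the program fall back to something else (a permanent delete, or a message to the user). A minimal sketch in Python for brevity, assuming the limit is MAX_PATH (260 characters including the terminating null); the function name is hypothetical, and since the question reports failures from about 255 characters the limit is left as a parameter.

```python
MAX_PATH = 260  # Windows constant; counts the terminating null character


def fits_in_max_path(full_path, limit=MAX_PATH):
    """Return True if full_path plus its terminating null fits within limit."""
    return len(full_path) + 1 <= limit
```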
I created a GUI and used uiimport to import a dataset into the MATLAB workspace. I would like to pass this imported data to another function in MATLAB. How do I do that? I tried the following, but it doesn't pick up the data from the MATLAB workspace. Any ideas?
[file_input, pathname] = uigetfile( ...
{'*.txt', 'Text (*.txt)'; ...
'*.xls', 'Excel (*.xls)'; ...
'*.*', 'All Files (*.*)'}, ...
'Select files');
uiimport(file_input);
M = dlmread(file_input);
X = fread(M);
I think that you need to assign the result of this statement:
uiimport(file_input);
to a variable, like this
dataset = uiimport(file_input);
and then pass that to your next function:
M = dlmread(dataset);
This is a very basic feature of Matlab, which suggests to me that you would find it valuable to read some of the on-line help and some of the documentation for Matlab. When you've done that you'll probably find neater and quicker ways of doing this.
EDIT: Well, @Tim, if all else fails RTFM. So I did, and my previous answer is incorrect. What you need to pass to dlmread is the name of the file to read. So you use either uiimport or dlmread to read the file, but not both. Which one you use depends on what you are trying to do and on the format of the input file. So, go RTFM and I'll do the same. If you are still having trouble, update your question and provide details of the contents of the file.
In your script you have three ways to read the file. Choose one of them depending on your file format. But first I would combine the file name with the path:
file_input = fullfile(pathname,file_input);
I wouldn't use UIIMPORT in a script, since the user can change how the data is read, and the variable name depends on the file name and on the user.
With DLMREAD you can only read numerical data from the file. You can also skip some number of rows or columns with
M = dlmread(file_input,'\t',1,1);
skipping the first row and one column on the left.
Or you can define a range in kind of Excel style. See the DLMREAD documentation for more details.
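To make the meaning of the two offsets concrete, here is a rough Python analogue of that call. It is only an illustration of the zero-based row and column offsets, not a reimplementation of DLMREAD; the function name and the sample data are made up, and it works on a string rather than a file to stay self-contained.

```python
def read_delimited(text, delimiter="\t", row_offset=0, col_offset=0):
    """Parse numeric delimited text, skipping leading rows and columns."""
    rows = []
    for line in text.splitlines()[row_offset:]:
        if not line.strip():
            continue  # ignore blank lines
        fields = line.split(delimiter)[col_offset:]
        rows.append([float(field) for field in fields])
    return rows


# Hypothetical file contents: a header row and a label column.
sample = "id\tx\ty\n1\t2.5\t3\n2\t4\t5.5\n"
matrix = read_delimited(sample, "\t", 1, 1)
```

Here `matrix` holds only the numeric block, with the header row and the first column skipped.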
The filename you pass to DLMREAD must be a string. Don't pass a file handle or any data. You will get "Filename must be a string", if it's not a string. Easy.
FREAD reads data from a binary file. See the documentation if you really have to do it.
There are many other functions to read the data from file. If you still have problems, show us an example of your file format, so we can suggest the best way to read it.
I have a folder with these files:
alongfilename1.txt <--- created first
alongfilename3.txt <--- created second
When I run DIR /x in command prompt, I see these short names assigned:
ALONGF~1.TXT alongfilename1.txt
ALONGF~2.TXT alongfilename3.txt
Now, if I add another file:
alongfilename1.txt
alongfilename2.txt <--- created third
alongfilename3.txt
I see this:
ALONGF~1.TXT alongfilename1.txt
ALONGF~3.TXT alongfilename2.txt
ALONGF~2.TXT alongfilename3.txt
Fine. It seems to be assigning the "~#" according to the date/time that I created the file. Is this correct?
Now, if I delete "alongfilename1.txt", the other two files keep their short names.
ALONGF~3.TXT alongfilename2.txt
ALONGF~2.TXT alongfilename3.txt
When will that ID (in this case, ~1) be released for use in another short name? Will it ever?
Also, is it possible that a file on my machine has a short name of X, whereas the same file has a short name of Y on another machine? I'm particularly concerned for installations whose custom actions utilize DOS short names.
Thanks, guys.
If I were you, I would never rely on any version of any file system driver (be it Microsoft's, be it another OS's) to be consistent about the algorithm it uses to generate short file names. The exact behavior of the Microsoft Fastfat and NTFS drivers is not "officially" documented (except as somewhat high level overviews) thus are not part of the API contract. What works today might not work tomorrow if you update the driver.
In addition, there is absolutely no requirement that short names contain tilde characters - see for example this post by Raymond Chen.
There's a treasure trove of info to be found about this topic in the MSDN blogs - for example:
Registry key to force Windows to use short filenames
NTFS curiosities (Part I): Short file names
Also, do not rely on the sole presence of alphanumerical characters. Look at the Linux VFAT driver which says, for example, that any combination of uppercase letters, digits, and the following characters is valid: $ % ' ` - @ { } ~ ! # ( ) & _ ^. NTFS will operate in compatibility mode with that...
The short filename is created with the file. The algorithm works like this (usually, but see moocha's reply):
counter = 1
stripped_filename = strip_dots(strip_non_ascii_characters(filename))
shortfn = first_6_characters(stripped_filename)
while (file_exists(shortfn + "~" + counter + "." + extension)) {
increment counter by 1
if more digits are added to counter, shorten shortfn by 1
/* e.g. if counter comes to 9 and shortf~9.txt is taken. try short~10.txt next */
}
This means that once the file is created, it will keep its short name until it's deleted.
As soon as the file is deleted, the short name may be used again.
If you move the file somewhere else, it may get a new short name (e.g. you're moving c:\somefilewithlongname.txt ("c:\somefi~1.txt") to d:\stuff\somefilewithlongname.txt, if there's d:\stuff\somefileelse.txt ("d:\stuff\somefi~1.txt"), the short name of the moved file will be somefi~2.txt). It seems that the short name is only persistent within a given directory on a given machine.
So: the short filenames will be generated by the filesystem, usually by the method outlined above. It is better to assume that short filenames are not persistent, as c:\longfi~1.txt on one machine might be "c:\longfilename.txt", whereas on another it might be "c:\longfish_story.txt"; also, when a file is deleted, the short name is immediately available again.
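The pseudocode above can be turned into a small runnable sketch. This follows the pseudocode only; it is not what any particular file system driver does, and the function name and the `existing` set (standing in for the short names already present in the directory) are hypothetical.

```python
def short_name(long_name, existing):
    """Generate an 8.3-style short name that is not already in existing."""
    stem, dot, extension = long_name.rpartition(".")
    if not dot:
        stem, extension = long_name, ""
    # Drop non-ASCII characters, spaces, and extra dots, then upper-case.
    clean = "".join(c for c in stem if c.isascii() and c not in " .").upper()
    extension = extension[:3].upper()
    counter = 1
    while True:
        suffix = f"~{counter}"
        # The base shrinks as the counter gains digits: 6 chars for ~1, 5 for ~10.
        candidate = clean[: 8 - len(suffix)] + suffix
        if extension:
            candidate += "." + extension
        if candidate not in existing:
            return candidate
        counter += 1


if __name__ == "__main__":
    taken = set()
    for name in ["alongfilename1.txt", "alongfilename3.txt", "alongfilename2.txt"]:
        short = short_name(name, taken)
        taken.add(short)
        print(short, name)
```

Creating the files in the order 1, 3, 2 assigns ~1, ~2, ~3 in creation order, which matches the DIR /x output in the question.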
I believe MSDOS stores the association between the long and the short name in a per directory file.
It does not depend on the date/time.
If you move your files to a new directory, the algorithm mentioned by Piskvor is applied again from scratch.
In the new directory (after a move), you will get:
ALONGF~1.TXT alongfilename1.txt
ALONGF~2.TXT alongfilename2.txt
ALONGF~3.TXT alongfilename3.txt
even though alongfilename2.txt has initially been created third.
This link says how NTFS does it. I would guess it's still the same idea on more recent versions.
In Windows 2000, both FAT and NTFS use the Unicode character set for their names, which contain several forbidden characters that MS-DOS cannot read. To generate a short MS-DOS-readable file name, Windows 2000 deletes all of these characters from the LFN and removes any spaces. Because an MS-DOS-readable file name can have only one period, Windows 2000 also removes all extra periods from the file name. Next, Windows 2000 truncates the file name, if necessary, to six characters and appends a tilde ( ~ ) and a number. For example, each non-duplicate file name is appended with ~1 . Duplicate file names end with ~2 , then ~3, and so on. After the file names are truncated, the file name extensions are truncated to three or fewer characters. Finally, when displaying file names at the command line, Windows 2000 translates all characters in the file name and extension to uppercase.
When the files are provided by a network server which is running Samba, then the short names are generated by the server, and they do not follow a predictable pattern.
So it is not safe to assume that you can predict the form of the short name.
G:\>dir /x *.txt
Directory of G:\
08/25/2009 12:34 PM 1,848 S2XYYV~1.TXT strace_output.txt
03/01/2010 05:32 PM 325,428 TEY7IH~O.TXT tomcat-dump-march-1.txt
03/11/2010 12:01 AM 5,811 DI356A~S.TXT ddmget-output.txt
01/23/2009 01:03 PM 313,880 DLA94Q~K.TXT ddm-log-fn.txt
04/20/2010 07:42 PM 7,491 A50QZP~A.TXT april-20-2010.txt