Linux identifies files by looking at the magic number at the start of the file's header. How does Windows do it? Does it also have some kind of magic-number mechanism, or does it rely only on the file extension?
It relies only on the extension, as provided by the filesystem; the contents of the file are not examined. See e.g. this article - it talks about Windows XP, but AFAIK the general behavior is shared by all released versions of Windows so far: http://support.microsoft.com/kb/307859
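To make the difference concrete, here is a minimal sketch of the two strategies side by side. The signature table and both function names are made up for illustration; real tools such as file(1) know thousands of signatures, and this is not how the Windows shell is actually implemented.

```python
import os

# A handful of well-known signatures, for illustration only.
MAGIC_NUMBERS = {
    b"\x89PNG\r\n\x1a\n": "png",
    b"%PDF-": "pdf",
    b"PK\x03\x04": "zip",
    b"MZ": "exe",
}


def type_from_extension(path):
    """Extension-based lookup: only the file name is consulted."""
    ext = os.path.splitext(path)[1].lstrip(".").lower()
    return ext or None


def type_from_magic(header):
    """Magic-number lookup: only the leading bytes are consulted."""
    for signature, kind in MAGIC_NUMBERS.items():
        if header.startswith(signature):
            return kind
    return None
```

A PDF renamed to photo.txt is "txt" to the first function and "pdf" to the second, which is exactly the mismatch the question is about.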
I have a bunch of files in a Git repo that work fine on macOS and Linux, but fail on Windows because the file names contain ?
How do I configure Git, only on my local Windows machine, so that I can check the repo out, have it automatically convert the file names to something Windows will allow, and push back with the original file names preserved?
The best way to do this is to use the Windows Subsystem for Linux. It has a fully POSIX-compatible file system and is fully capable of storing arbitrary byte values (with the normal exceptions).
It is almost certainly not going to be possible to use Git for Windows on a native Windows file system, although if you happened to format an external hard disk as UDF (which, I believe, requires the full disk, not just a partition, to be formatted that way on Windows), then you could probably check it out there. UDF is at least capable of handling these characters on Unix.
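If you first want to know which paths in the repo are affected, something like the following sketch could help. It encodes the usual Win32 naming rules (reserved characters, reserved device names, trailing space or dot); the function names are hypothetical, and it only detects problems, it does not make Git rename anything.

```python
import re

# Characters the Win32 API rejects in a file name component.
_INVALID_CHARS = re.compile(r'[<>:"/\\|?*\x00-\x1f]')

# Device names that are reserved regardless of extension.
_RESERVED = (
    {"CON", "PRN", "AUX", "NUL"}
    | {f"COM{i}" for i in range(1, 10)}
    | {f"LPT{i}" for i in range(1, 10)}
)


def windows_name_problems(name):
    """Return a list of reasons why one path component is invalid on Windows."""
    problems = []
    bad = sorted(set(_INVALID_CHARS.findall(name)))
    if bad:
        problems.append("invalid characters: " + " ".join(repr(c) for c in bad))
    if name.split(".")[0].upper() in _RESERVED:
        problems.append("reserved device name")
    if name.endswith((" ", ".")):
        problems.append("trailing space or dot")
    return problems


def repo_path_problems(repo_path):
    """Check every component of a '/'-separated path as Git stores it."""
    return {
        part: found
        for part in repo_path.split("/")
        if (found := windows_name_problems(part))
    }
```

You could feed it the output of `git ls-files` on a Linux or macOS clone to get the list of names that would need changing.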
I thought extended file attributes existed in NTFS which Windows supports. I cannot find a cmd for accessing/updating attributes.
Is there a flavor of Windows (and its file system) that supports this?
I tried getfattr, setfattr, and a number of other commands. attrib is not it either.
If extended attributes are to remain portable across filesystems (even virtual ones implemented in FUSE) then all target platforms need to present an api in userspace (a cmd or set of cmds).
The closest thing to UNIX extended attributes is EAs: NTFS stores per-file metadata called Extended Attributes (EAs), which allow data to be stored as an attribute of a file or folder.
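A related mechanism, alternate data streams (ADS, see the quote below), is what IE uses to mark a file as having been "downloaded from the web": the mark is kept in a stream named Zone.Identifier rather than in an EA.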
From Wikipedia:
On Windows NT, limited-length extended attributes are supported by
FAT, HPFS, and NTFS. This was implemented as part of the OS/2
subsystem. They are notably used by the NFS server of the Interix
POSIX subsystem in order to implement Unix-like permissions. The
Windows Subsystem for Linux added in the Windows 10 Anniversary Update
uses them for similar purposes, storing the Linux file mode, owner,
device ID (if applicable), and file times in the extended
attributes. Additionally, NTFS can store infinite-length extended
attributes in the form of alternate data streams (ADS), a type of
resource fork. Plugins for the file manager Total Commander, like NTFS
Descriptions and QuickSearch eXtended support filtering the
file list by or searching for metadata contained in ADS Streams. Ref.
If you want to do something security related you want to take a look at the Discretionary Access Control List (DACL) functionality; http://www.windowsecurity.com/articles/Understanding-Windows-NTFS-Permissions.html
PowerShell can help with setting the mode and the extended file and folder attributes, but this unfortunately applies only to regular attributes (not EAs).
I found something related to NTFS attribs in the 3G-Fuse source that might be helpful. However, I doubt that's truly portable.
Here's a GitHub repo containing a tool that lets you manipulate EAs: https://github.com/jschicht/EaTools - this apparently uses NtCreateFile and NtSetEaFile to set and modify them.
(I also asked specifically about .NET APIs to maybe implement a cross-platform tool: .NET: How to set "extended file attributes" in a cross-platform way? - currently there is no such API.)
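For the alternate data streams mentioned in the Wikipedia quote (not EAs proper, which still need something like NtSetEaFile), ordinary file APIs are enough: on NTFS a stream is addressed as "file:streamname". A minimal sketch, with hypothetical helper names, and with the assumption that it runs on Windows against an NTFS volume:

```python
def stream_path(path, stream):
    """Build the NTFS 'file:stream' name for an alternate data stream."""
    if not stream or ":" in stream:
        raise ValueError("stream name must be non-empty and contain no ':'")
    return f"{path}:{stream}"


def write_stream(path, stream, text):
    # Only meaningful on Windows with NTFS. On other systems this would
    # simply create an ordinary file whose name contains a colon.
    with open(stream_path(path, stream), "w", encoding="utf-8") as f:
        f.write(text)


def read_stream(path, stream):
    # Same assumption as write_stream: Windows plus NTFS.
    with open(stream_path(path, stream), "r", encoding="utf-8") as f:
        return f.read()
```

This is not portable in the getfattr/setfattr sense: the data is lost as soon as the file is copied to a filesystem without stream support.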
Is it safe to assume that Windows local and network file paths are NOT case sensitive?
Yes. Windows (local) file systems, including NTFS as well as FAT and its variants, are case insensitive (normally). The underlying implementation of a network file system may be case sensitive; however, most software that allows Windows to access it (such as SMB) will automatically make case-sensitive file systems appear case insensitive to Windows.
For details, I'd read the section in the Wikipedia article on filenames.
Case sensitivity on Windows is actually implemented in how the application opens the files. NTFS can be a case-sensitive file system and can happily store files with identical names differing only by case in the same directory.
On Windows all files are ultimately opened via the CreateFile API - If the FILE_FLAG_POSIX_SEMANTICS flag is passed to the call (and the file system being accessed is natively case-sensitive) then the file will be opened based on an exact name match. If FILE_FLAG_POSIX_SEMANTICS is not passed then the filesystem does a case-insensitive file open and will open one of the files with a matching name. If there is more than one it's undefined as to which one is actually opened.
Most C and C++ runtime implementations on Windows do not provide any access to this mechanism and never use this flag so the only way to get access to case-sensitive behaviors is to use the Windows API directly.
tl;dr - Your language runtime probably exposes your filesystem as case insensitive or case preserving. You can, if you use the Windows API directly, access supported filesystems in a fully case-sensitive way.
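The lookup rule described above can be modeled in a few lines. This is only a toy model of the name matching, not a call to CreateFile; the function name is made up, and str.upper() merely approximates the upcase table NTFS really uses.

```python
def matching_entries(names, wanted, posix_semantics=False):
    """Return the directory entries an open of `wanted` could resolve to.

    names: the entries stored in one directory (case preserved).
    posix_semantics: models passing FILE_FLAG_POSIX_SEMANTICS.
    """
    if posix_semantics:
        # Exact match: at most one entry can qualify.
        return [n for n in names if n == wanted]
    # Case-insensitive match: several entries may qualify, and which one
    # is actually opened is undefined.
    return [n for n in names if n.upper() == wanted.upper()]
```

With entries "Readme.txt" and "README.TXT" in the same directory, a default open of "readme.txt" has two candidates, while an exact-match open of "README.TXT" has exactly one.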
NO. It is not a safe assumption.
The other answers are informative but regardless of what they say it is not a safe assumption and continues to become more unsafe as time goes on.
NTFS - Can be case sensitive.
I use it on a per-directory basis but you can also do it with entire NTFS drives.
https://devblogs.microsoft.com/commandline/per-directory-case-sensitivity-and-wsl/
WSL - Is case sensitive.
Linux GUI apps and Android apps are coming to Windows. They will all be running on a case-sensitive file system by default, locally.
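Since the behavior can differ per directory, the safer route is to probe instead of assume. A minimal sketch (the function name is hypothetical): create a temporary file with upper-case letters in its name and check whether the lower-cased name resolves to it.

```python
import os
import tempfile


def is_case_sensitive(directory):
    """Probe whether name lookups in `directory` are case sensitive."""
    # The prefix contains upper-case letters, so the lower-cased name
    # is guaranteed to differ from the real one.
    with tempfile.NamedTemporaryFile(prefix="CaseProbe", dir=directory) as probe:
        folded = os.path.join(directory, os.path.basename(probe.name).lower())
        # If the lower-cased name also resolves, lookups ignore case.
        return not os.path.exists(folded)
```

The probe file is deleted when the with block exits, so the directory is left as it was. The directory must be writable for this to work.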
I just stumbled on this article on MSDN that says a path can be 259 characters + NUL termination, but if you prefix it with "\\?\" WinAPI allows you to use the maximum total path length of 32,767 characters.
Eager to see it working, I tried using that prefix from Explorer (on XP SP3), but it doesn't work at all (on any path). If you put \\?\C:\Path\to\an\existing.file in Explorer's address bar, it gives a "file not found" error.
So I'm confused. Can I code something for (non-ancient) Windows that makes full use of the mentioned path size on NTFS? Why doesn't Explorer use it?
There is a set of API calls that work with extended paths and some that do not. The MSDN documentation usually mentions this.
Note that if you just type that path into Windows Explorer under XP, this does not work, because the extended-path syntax is just an escape sequence for the Win32 API and not for Windows Explorer. In Win7 this does work, because many people expected it to.
Also for long paths, it does help if you change the working directory or open up explorer with a sub-directory as a root.
Before someone tells me to RTFM...
Note that these examples are intended for use with the Windows API functions and do not all necessarily work with Windows shell applications such as Windows Explorer.
[...]
For file I/O, the "\\?\" prefix to a path string tells the Windows APIs to disable all string parsing and to send the string that follows it straight to the file system. For example, if the file system supports large paths and file names, you can exceed the MAX_PATH limits that are otherwise enforced by the Windows APIs.
On a secondary note, this makes me wonder about the possibilities of hiding files (or finding such files) from explorer by using illegal file names.
Are you asking why all components in Windows do not support it, or are you asking whether it's legal to use these long paths?
You can definitely safely use them, but you may irritate someone who wants to use tools like Explorer to browse them. We see paths like this all the time in the wild. Sometimes people are pretty surprised when they can't use MY_FAVORITE_TOOL to delete it...
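If you do decide to use them from your own code, the conversion to the "\\?\" form is mechanical. Here is a sketch (the function name is made up) that uses Python's ntpath module so it behaves the same on any OS. Because the prefix turns off the API's own path parsing, the path has to be absolute and already normalized, and UNC paths take the "\\?\UNC\" form.

```python
import ntpath

EXTENDED_PREFIX = "\\\\?\\"        # the four characters \\?\
EXTENDED_UNC_PREFIX = "\\\\?\\UNC\\"


def to_extended_path(path):
    """Convert an absolute Windows path to its extended-length form."""
    if path.startswith(EXTENDED_PREFIX):
        return path  # already in extended form
    if not ntpath.isabs(path):
        raise ValueError("extended-length paths must be absolute")
    # Resolve '..' and '/' ourselves, since the API no longer will.
    norm = ntpath.normpath(path)
    if norm.startswith("\\\\"):
        # \\server\share\... becomes \\?\UNC\server\share\...
        return EXTENDED_UNC_PREFIX + norm[2:]
    return EXTENDED_PREFIX + norm
```

The result is meant for the wide-character Win32 file functions (or runtimes that call them), not for shell tools, which matches the caveat quoted from MSDN above.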
I have been trying to find a way to "defragment" the registry on my Windows machine. Firstly, does this make sense? Any benefits in doing this? (Not much love on superuser.com) Secondly, I am looking for a way to rewrite the registry using C/C++ with Windows API. Is there a way to read the registry and write it to a new file getting rid of unused bytes along the way? (I might have to write the new file and then boot into another OS/disk before I can overwrite the original... but I am willing to take that risk.)
Microsoft's PageDefrag does exactly this, as it states on its page "PageDefrag uses the standard file defragmentation APIs to defragment the files."
(A copy of the linked article is here because in typical MSDN style their link is dead.)
http://www.larshederer.homepage.t-online.de/erunt/ - NTREGOPT NT Registry Optimizer
Similar to Windows 9x/Me, the registry files in an NT-based system
can become fragmented over time, occupying more space on your hard
disk than necessary and decreasing overall performance. You should
use the NTREGOPT utility regularly, but especially after installing
or uninstalling a program, to minimize the size of the registry files
and optimize registry access.
The program works by recreating each registry hive "from scratch",
thus removing any slack space that may be left from previously
modified or deleted keys.
http://reboot.pro/index.php?showtopic=11212 - Offreg.dll MS WDK Offline Registry Library
The offline registry library (Offreg.dll) is used to modify a registry hive outside the active system registry. This library is intended for registry update scenarios such as servicing an operating system image. The library supports registry hive formats starting with Windows XP.
Developer Audience
http://reboot.pro/topic/11312-offline-registry/ - Offline Registry MS WDK Command-Line Tool
A command line tool that will allow one to read and write to an offline registry hive.
Reading the values should be possible.
But I've never seen any spec for how the registry files are written to disk, and unless you could find one you'd have to reverse engineer those files in your OS (there might be differences between XP and 7, etc.). Then you have to remember that the registry isn't just one file: it's multiple files, and some of them belong to certain users. I think they use SIDs rather than user names, so even if you move them to a new computer, you have to be sure it's the same OS version with the same users with the same SIDs set up on it.
All this for little or no gain so I'd agree with the superuser users that it wouldn't make sense.