I'm trying to find out total number of currently open file descriptors (or any file related objects) on current OS.
My current betting is sysctl(3), and I think kern.num_files does the job. But I'm not actually sure what it means, and I can't find any manual page entry or standard spec for kern.num_files. This makes me nervous.
kern.num_files is listed on man 3 sysctl, but only names are listed and says nothing about what it actually means.
Command line sysctl -a(2) lists and report some value for kern.num_files.
sysctl.h does not define a name looks like kern.num_files
although it even contains names like KERN_FILE that are considers private/deprecated.
Is this actually the count of system-wide open FD? Where can I find any spec for this?
If kern.num_files is not the number, what is recommended way to get the total open FD count?
Related
If a user opens a file in a program (for example using GetOpenFileNameW, DragQueryFileW, command line argument, or whatever else to get the path, and a subsequent CreateFileW call), is there a way to find the next file in the parent directory of the opened file?
The obvious solution is to cycle through the results from FindNextFileW or NtQueryDirectoryFileEx until the opened file is encountered, and just open the next file.
However, this seems undesireable.
First, because these functions use paths (instead of for example a handle), the original file is decoupled from the search algorithm, so the original file might not even get encountered in that search. This is not much of an issue (as failing in this case is the expected outcome), and it probably could be resolved with (temporarly) changing the sharing mode, using LockFile or similar (though I would like to avoid that).
Second, this cycling search would have to be done every time, because the contents of the directory might have changed (retaining hFindFile does not work, because only FindFirstFileW calls NtQueryDirectoryFileEx and enumerates the contents of the directory). Which seems like unnecessary work and might even affect performance (for example if the directory contains a lot of files).
In theory any file system has some way of enumerating the files in a directory. Meaning there is some ordered data structure of the files' metadata. And getting the next file should only involve going back from the existing file handle to that file's entry, and then getting the next entry from that data structure. So there does not seem to be a fundamental reason why this cannot be done more sanely.
I thought maybe there exist a better way to do this somewhere in WinAPI...
Same question for finding the previous file.
I'm having trouble monitoring a file for changes. I need to be able to know when a file changes, and when it does, I need the new line that was added. I intend to parse each line and find ones that match certain criteria, and act on information in those lines. I know the expected number of matching lines ahead of time, but I do not know how many lines in total will be added to the file, or where the matching lines will be.
I've tried 2 packages so far, with no avail.
fsnotify/fsnotify
As fas as I can tell, fsnotify can only tell me when a file is modified, not what the details of the modification was. Since I need to know what exactly was added to the file, this is no good for me.
(As a side-question, can this be run in a loop? The example that I tried exited after just one modification. I need to monitor for multiple modifications.)
hpcloud/tail
This package tries to mimic the Unix tail command, but it seems to have its own issues. The output that I get includes timestamps and other data - I just want the added line, nothing else. Also, it seems to think a file has been modified multiple times, even when it's just one edit. Further, the deal breaker here is that it does not output the last line if the line was not followed by a newline character.
Delegating to tail
I came across this answer, which suggests to delegate this work to the tail command itself, but I need this to work cross-platform (specifically, macOS, Linux and Windows). I don't believe that an equivalent command exists on Windows.
How do I go about tackling this?
#user2515526,
Usually changed diff is out of scope of file watchers' functionality, because, you know, you could change an image, and a watcher would need to keep a track several Mb of a diff in memory, and what if we have thousands of files?
However, as bad as it sounds, this may be exactly the way you want to implement this (sure, depends on your app, etc. - could be fine for text files), i.e. - keeping a map of diffs (1 diff per file) since last modification. Cannot say I like it, but sounds like fsnotify has no support for changes/diffs that you need.
Also, regarding your question about running in a loop, maybe you can get some hints here: https://github.com/kataras/iris/blob/8370d76910cdd8de043753ed81ae080eae8dc798/utils/file.go
Its a framework that allows to build a server that watches for TypeScript file changes. So sounds similar to your case/question.
Cheers,
-D
I've written a program that reads the NTFS index and journal similar to what is described here:
http://ejrh.wordpress.com/2012/07/06/using-the-ntfs-journal-for-backups/
And It works fairly well.
In addition to the normal journal events USN_REASON_CLOSE, USN_REASON_FILE_CREATE, USN_REASON_FILE_DELETE etc' I'm receiving an event with reason USN_REASON_HARD_LINK_CHANGE. I'd like to be able to update the directory index according to this event but I can't find any information about it. The only documentation is:
An NTFS file system hard link is added to or removed from the file or
directory. An NTFS file system hard link, similar to a POSIX hard
link, is one of several directory entries that see the same file or
directory.
What does this mean? where was the hard-link created? or was it removed? how do I get more information about what happened?
I know this is ancient, but I stumbled upon this while researching a related problem. Here's what I found: The hard-links are a complicating factor when reading the USN. You can get journal entries describing change to a single file reference number by way of changes made through any hard-link that's been created. Generally, and to the original question, hard-links are alternative directory entries through which a single file might be accessed. Thus, all the file's characteristics are shared for each link (except for the names and parent file reference numbers). Technically, you can't tell which entry is the original and which is a link.
A subtle difference does exist, and it manifests if you query the master file table (using DeviceIOControl and Fsctl_Enum_Usn_Data). The query will return only a single representative file regardless of how many links exist. You can query for the links using NtQueryInformationFile, querying for FILE_HARD_LINK_INFORMATION. I think of the entry returned by the MFT query as the main entry and the NtQueryInformationFile-returned items as links...however, the main entry can get deleted and one of the links will get promoted...so it's only a housekeeping thought and little else.
Note that a problem arises where one of the hard-links is moved or renamed. In this case, the journal entries for the rename or move reflect the filename and parent file reference number of the affected link. The problem arises if you ask for only the summary "on-close" records. In such a case, you won't ever see the USN_REASON_RENAME_OLD_NAME record...because that USN entry never gets an associated REASON_CLOSE associated with it. Without this tidbit, you won't be able to easily determine which link's name or location was changed. You have to read the usn with ReadOnlyOnClose set to 0 in the Read_Usn_Journal_Data_V0. This is a far chattier query, but without it, you can't accurately associate the change with one link or the other.
As always with the USN, I expect you'll need to go through a bit of trial and error to get it to work right. These observations/guesses may, I hope, be helpful:
When the last hard link to a file is deleted, the file is deleted; so if the last hard link has been removed you should see USN_REASON_FILE_DELETE instead of USN_REASON_HARD_LINK_CHANGE. I believe that each reference number refers to a file (or directory, but NTFS doesn't support multiple hard links to directories AFAIK) rather than to a hard link. So immediately after the event is recorded, at least, the file reference number should still be valid, and point to another name for the file.
If the file still exists, you can look it up by reference number and use FindFirstFileNameW and friends to find the current links. Comparing this to the event record in question plus any relevant later events should give you enough information, although if multiple hard links for the same file are deleted and/or created you might not be able to reconstruct the order in which this happened, and if you don't have enough information about the prior state of the file system you might not be able to identify the deleted hard links. I don't know whether that would matter to you or not.
If the file no longer exists, you should still be able to identify it by the USN record in which it was deleted. Again, taking all relevant events into consideration, and with enough information about the prior state, you should be able to reconstruct most of what happened, if not the order.
There is some hope that we can do better than this: the file name and/or ParentFileReference number in the event record might refer to the hard link that was created or deleted, rather than to an arbitrary link to the file. In this case you'll have all the relevant information about the sequence of events except for whether any particular event was a create or a delete, which you should be able to work out by looking at the current state of the file and working backwards through the records.
I assume you've already looked for nearby change records that might contain additional information? There isn't, for example, a USN_REASON_RENAME_NEW_NAME record generated when a hard link is created or a USN_REASON_RENAME_OLD_NAME when a hard link is removed? Or paired USN_REASON_HARD_LINK_CHANGE records, one for the file, one for the directory containing the affected hard link to the file? (Wishful thinking, I expect, but it wouldn't hurt to look!)
For testing purposes, you can create hard links with the mklink command.
I want to write a program that allows or blocks processes while openning a file depending on a policy.
I could make a control by checking the name of the program. However, it would not be enough because user can change the name to pass the policy. i.e. let's say that policy doesn't allow a.exe to access txt files whereas b.exe is allowed. If user change a.exe with b.exe, i cannot block it.
On the other hand, verifying portable executable signature is not enough for me, because i don't care whether the executable signed or not. I just want to identify the executable that is wanted to execute even its name is changed.
For this type of case, what would you propose? Any solutions are welcome.
Thanks in advance
There are many ways to identify an executable file. Here is a simple list:
Name:
The most simple and straightforward approach is to identify a file by its name. But it is one of the easiest things to change, and you already ruled that out.
Date:
Files have an access, creation, and modification date, and they are managed by the operating system. They are not foolproof, or maybe not even accurate.
Also, they are very simple to change.
Version Information:
Since we are talking about executable files, then most executable files have version information attached to it. For example, original file name, file version, product version, company, description, etc. You can check these fields if you are sure the user cannot modify them by editing your executable. It doesn't require you to keep a database of allowed files. However, it does require you to have something to compare to, like company name, or a product name. Also, If someone made an executable with the same value
they can run instead of the allowed one and bypass your protection.
Location:
If the file is located in a specific place and is protected by file system access rights, and it cannot be changed, then you can use that. You can, for example, put the allowed files in a folder where the user (without admin rights) can only read/execute them, but not rename/move. Then identify the file with its location. If it is run from this location, then allow it, else block it. It is good as it doesn't need a database of
allowed/blocked files, it just compares the location, it it is a valid one, then allow, and you can keep adding and removing files to allowed locations
without affecting your program.
Size:
If the file has a specific file size, you can quickly check its size and compare it. But it is unreliable as files can be changed/patched and without any change in size. You can avoid that by also applying a CRC check to detect if the content of the file changed.
But, both size and CRC can be changed. Also, this requires you to have a list of file names and their sizes/CRC, and keeping it up to date.
Signature:
Deanna mentioned that you can self-sign your executable files. Then check if the signature matches yours and allow/deny based on that. This seems to be a good way if
it is okay for you to sign all the executable files you want to allow. It doesn't require you to keep an updated list of allowed files.
Hash:
arx also pointed out, that you can hash the files. It is one of the slowest methods, as it requires the file to be hashed every time it is executed, then compared to a list of files. But it very reliable as it can uniquely identify each file and hard to break. But, you will need to keep an up to date database of every file hash.
Finally, and depend on your needs and options, you can mix two or more ways together to get the results you want. Like, checking file name + location, etc.
I hope I covered most of things, but I'm sure there are more ways. Anyone can freely edit my post to include anything that I have missed.
I would recommend using the signature, if it has one, or the hash otherwise. Apps such as Office that update frequently are more likely to be signed, whereas smaller apps downloaded off the Internet are unlikely to ever be updated and so should have a consistent hash.
Let's take for example notepad. How can I in my application be 100% sure whether notepad is running or not?
By 100% I mean, if there is another process whose name is "notepad.exe" which in fact is not a real notepad but for example an imitation, I don't want to detect it. Only real notepads.
I've already thought about reading the process memory but it's more difficult than it appears to be, because of memory displacements etc.
The standard way is by name, right? But for me it is really important, that it is not any other program since I want to interact with it what would critical fail if I found a wrong process.
Does anyone know a good way of doing this?
PS: There is no specific programming language to do it in. If possible I would prefer an indipendent solution. But if required, I specifically use .Net/C#.
The only way to be 99.9%1 sure you're looking at the right executable is to validate the file's digital signature. For example, you'd ensure that the notepad.exe in question was signed by "Microsoft Corporation".
I'd do something like this:
Get the list of running processes.
Filter down to name of interest (notepad.exe)
Get each process' image [executable] path.
Validate that the Authenticode signature is valid and trusted.
Compare the name of the signer to the expected value.
Success! You can be very certain this is the correct file.
This method avoids issues like having to know ahead of time where the file should be located (which is nearly impossible – Notepad is installed in two locations), what its hash value should be (obviously bound to change), or strange user behavior (replacing Notepad with some other text editor).
1 - of course, it's impossible to be 100% sure. Someone really determined could self-sign an executable with the expected signer name and add the certificate to their machine's root store, causing the signature to appear valid.
Well, I haven't been confronted to that kind of problem, but you can first check if the process is running by searching by name (in your case, that would be notepad.exe), parse the Process.GetProcesses() list for that, then get
Process.StartInfo.FileName
and see if this is the path to the Notepad executable, that would do the deal, right ?
What exactly do you know of the executable we want to be running? If you knew the filesize that could work as a "hack". Use #josh3736 's method, but replace point 4 and 5 by comparing the filesize with the one you know [different versions will have different sizes, but if there are not too many you can hardcode them]. Calculation a Md5-Hashtag would look more professional, but would do basicly the same thing.
**
If your process has a GUI: you could use EnumWindows for the children to get Edit-Boxes etc. Find something destinctive for your "notepad.exe" and check if it's there.