Linux page-cache list - linux-kernel

I want to get a list of all PFNs that belong to the page cache. One way is to go over each open file/inode and get the address_space pages.
Is there a simpler way? I cannot seem to find one big list of page-cache pages.
Is there any such list or API I can use?

Yes, something like what you said: the address_space pointer of an inode is called i_mapping.
For example, fs/drop_caches.c contains a function that walks the page cache of every inode on a superblock:
static void drop_pagecache_sb(struct super_block *sb, void *unused)
{
    struct inode *inode, *toput_inode = NULL;

    spin_lock(&inode_sb_list_lock);
    list_for_each_entry(inode, &sb->s_inodes, i_sb_list) {
        spin_lock(&inode->i_lock);
        if ((inode->i_state & (I_FREEING|I_WILL_FREE|I_NEW)) ||
            (inode->i_mapping->nrpages == 0)) {
            spin_unlock(&inode->i_lock);
            continue;
        }
        __iget(inode);
        spin_unlock(&inode->i_lock);
        spin_unlock(&inode_sb_list_lock);
        invalidate_mapping_pages(inode->i_mapping, 0, -1);
        iput(toput_inode);
        toput_inode = inode;
        spin_lock(&inode_sb_list_lock);
    }
    spin_unlock(&inode_sb_list_lock);
    iput(toput_inode);
}
So instead of calling invalidate_mapping_pages(), use the i_mapping pointer to enumerate the pages cached for each inode (once you hold a struct page, page_to_pfn() gives you its PFN).
As for enumerating the blocks, and thus identifying each page's PFN, you can follow this:
http://www.makelinux.net/books/ulk3/understandlk-CHP-15-SECT-2#understandlk-CHP-15-SECT-2.6
(15.2.6. Searching Blocks in the Page Cache).

Related

c++ access PEB_LDR_DATA struct member by offset

I am new to C++ and I am trying to access the InLoadOrderModuleList member of the PEB_LDR_DATA structure.
I tried this:
// pebLdrData has type PPEB_LDR_DATA
PLIST_ENTRY firstitem_InMemoryOrderModuleList = ((PLIST_ENTRY)(pebLdrData + 0x0010)-> Flink);
without success. How should I access it?
LIST_ENTRY is how Windows does linked lists internally. There is plenty of information about them online if you need more details, but there are two things you need to know here:
First, the next/back pointers don't point to the head of the object (as they do in most implementations), so to get to the head of the object you have to fix up the pointer based on the offset of the LIST_ENTRY member. This is where the CONTAINING_RECORD macro comes in.
Second, you don't want to do this fixup on the first LIST_ENTRY in the PEB_LDR_DATA object. Think of it as the "head" pointer; you need to move through its Flink before you get to the data you care about.
Sample code:
LIST_ENTRY *current_record = NULL;
LIST_ENTRY *start = &(pebLdrData->InLoadOrderModuleList);
// move off the initial list entry to the first actual object
current_record = start->Flink;
while (true)
{
    // find the head of the object
    LDR_DATA_TABLE_ENTRY *module_entry = (LDR_DATA_TABLE_ENTRY*)
        CONTAINING_RECORD(current_record, LDR_DATA_TABLE_ENTRY, InLoadOrderLinks);
    printf("%wZ\n", &module_entry->BaseDllName);
    // advance to the next object
    current_record = current_record->Flink;
    if (current_record == start)
    {
        break;
    }
}
Another solution is to declare your own typedefs of LDR_DATA_TABLE_ENTRY and PEB_LDR_DATA with their full layouts, since the definitions in winternl.h hide most members behind reserved fields.
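The two points about LIST_ENTRY can be tried out without any Windows headers. The following is only a portable sketch: ListEntry, ModuleEntry and CONTAINER_OF are hypothetical stand-ins for LIST_ENTRY, LDR_DATA_TABLE_ENTRY and CONTAINING_RECORD, not the real definitions. It also hints at why the attempt in the question misbehaves: adding 0x10 to a PPEB_LDR_DATA advances by 0x10 whole structures, whereas a byte offset has to be applied to a char pointer, which is what the macro does.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for LIST_ENTRY: a circular, doubly linked list node.
struct ListEntry {
    ListEntry* Flink;
    ListEntry* Blink;
};

// Hypothetical stand-in for LDR_DATA_TABLE_ENTRY. The link member is
// deliberately not the first field, so a fixup is really needed.
struct ModuleEntry {
    int id;
    ListEntry links;
    const char* name;
};

// Same idea as CONTAINING_RECORD: step back from the embedded link to the
// start of the enclosing object, using a byte offset on a char pointer.
#define CONTAINER_OF(ptr, type, member) \
    (reinterpret_cast<type*>(reinterpret_cast<char*>(ptr) - offsetof(type, member)))

// An empty list is a head that points at itself.
inline void init_head(ListEntry& head) { head.Flink = head.Blink = &head; }

// Appends node at the tail of the circular list owned by head.
inline void push_back(ListEntry& head, ListEntry& node) {
    node.Blink = head.Blink;
    node.Flink = &head;
    head.Blink->Flink = &node;
    head.Blink = &node;
}

// Walks the list. The head is never converted to a ModuleEntry; the loop
// starts at head.Flink and stops when it comes back around to the head.
inline std::vector<std::string> module_names(ListEntry& head) {
    std::vector<std::string> names;
    for (ListEntry* cur = head.Flink; cur != &head; cur = cur->Flink) {
        ModuleEntry* entry = CONTAINER_OF(cur, ModuleEntry, links);
        names.push_back(entry->name);
    }
    return names;
}
```

Compared with the while (true) loop above, testing cur against the head before touching an entry also copes with an empty list.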

How to get the timestamp of when a disk is made offline from diskmgmt or other ways in windows?

I want to know the time when a disk is made offline by user. Is there a way to know this through WMI classes or other ways?
If you cannot find a way to do it through the Win32 API, WMI or similar, I know of an alternative you could look into as a last resort.
What about using NtQueryVolumeInformationFile with the FileFsVolumeInformation class? It retrieves data about the volume, which you then read through the FILE_FS_VOLUME_INFORMATION structure. This includes the volume creation time.
At the end of the post I've left some links for further reading so you can finish this off however you like. One important point first: the documentation will lead you to an enum definition for _FSINFOCLASS, but a definition copy-pasted from MSDN probably won't work. You need to set the first entry of the enum to 1 manually; otherwise the first entry is 0, every following entry is one less than it should be, and NtQueryVolumeInformationFile fails with STATUS_INVALID_INFO_CLASS.
Here is the edited version, which should work.
typedef enum _FSINFOCLASS {
    FileFsVolumeInformation = 1,
    FileFsLabelInformation,
    FileFsSizeInformation,
    FileFsDeviceInformation,
    FileFsAttributeInformation,
    FileFsControlInformation,
    FileFsFullSizeInformation,
    FileFsObjectIdInformation,
    FileFsDriverPathInformation,
    FileFsVolumeFlagsInformation,
    FileFsSectorSizeInformation,
    FileFsDataCopyInformation,
    FileFsMetadataSizeInformation,
    FileFsMaximumInformation
} FS_INFORMATION_CLASS, *PFS_INFORMATION_CLASS;
Once you've opened a handle to the disk, you can call NtQueryVolumeInformationFile like this:
NTSTATUS NtStatus = 0;
HANDLE FileHandle = NULL;
IO_STATUS_BLOCK IoStatusBlock = { 0 };
FILE_FS_VOLUME_INFORMATION FsVolumeInformation = { 0 };
...
// Open the handle to the disk here, and then check that you have a valid handle.
...
NtStatus = NtQueryVolumeInformationFile(FileHandle,
    &IoStatusBlock,
    &FsVolumeInformation,
    sizeof(FILE_FS_VOLUME_INFORMATION),
    FileFsVolumeInformation);
...
If NtStatus indicates success (e.g. STATUS_SUCCESS), you can read the VolumeCreationTime (LARGE_INTEGER) field of the FILE_FS_VOLUME_INFORMATION structure through the FsVolumeInformation variable.
Your final task is to turn the LARGE_INTEGER VolumeCreationTime into proper date/time information. The last two links below focus on that topic and should help you sort it out.
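As a rough illustration of that last step, here is a portable sketch of the arithmetic. It assumes VolumeCreationTime uses the usual Windows file-time representation (a count of 100-nanosecond intervals since 1601-01-01 UTC); the function and constant names are made up for this example. On Windows itself you would more likely hand the value to FileTimeToSystemTime.

```cpp
#include <cassert>
#include <cstdint>

// Number of 100-nanosecond ticks in one second.
constexpr std::int64_t kTicksPerSecond = 10000000LL;

// Seconds between 1601-01-01 and 1970-01-01 (both UTC).
constexpr std::int64_t kEpochDifferenceSeconds = 11644473600LL;

// Converts a tick count (for example VolumeCreationTime.QuadPart, under the
// assumption stated above) to seconds since the Unix epoch.
constexpr std::int64_t windows_ticks_to_unix_seconds(std::int64_t ticks) {
    return ticks / kTicksPerSecond - kEpochDifferenceSeconds;
}
```

The result can then be passed to the usual time_t based formatting functions.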
See the following for more information.
https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/content/ntifs/nf-ntifs-ntqueryvolumeinformationfile
https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/content/wdm/ne-wdm-_fsinfoclass
https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/content/ntddk/ns-ntddk-_file_fs_volume_information
https://msdn.microsoft.com/en-us/library/windows/desktop/ms724280.aspx
https://blogs.msdn.microsoft.com/joshpoley/2007/12/19/datetime-formats-and-conversions/

memory leak delete linked list node

Newbie question. Suppose I have a C++11 linked list implementation with
template <typename X> struct Node {
    X value;
    Node* next;
    Node(X x) {
        this->value = x;
        this->next = nullptr;
    }
};
and later in the code I create a pointer variable
X x = something;
Node<X>* node = new Node<X>(x);
and still later I do
delete node;
Is the x stored within node destructed when this statement is executed?
You may tell me I should use std::list instead of writing my own, but right
now I'm just trying to educate myself on pointers.
Since you did not provide a custom destructor, the compiler generates a default one for you, which calls the destructors of the members.
Now, the answer to your question really depends on what your X is :) If it is a type with a destructor (like std::string), the stored value will be properly destroyed. But if it is a "naked pointer" (like int *), only the pointer itself goes away; whatever it points to is not deleted, and that is a memory leak if nothing else frees it.
N.B. You create your x on the stack and the node stores a copy of it, so I really-really-really hope that X provides proper copy semantics, otherwise you may end up with an invalid object stored in your node!
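The difference between the two cases can be checked with a small experiment. This is just a sketch: Tracked is a hypothetical helper that counts how many of its instances are alive, and Node is a slightly tidied copy of the struct from the question.

```cpp
#include <cassert>

template <typename X>
struct Node {
    X value;
    Node* next;
    explicit Node(X x) : value(x), next(nullptr) {}
};

// Hypothetical helper: counts how many Tracked objects currently exist.
struct Tracked {
    static int alive;
    Tracked() { ++alive; }
    Tracked(const Tracked&) { ++alive; }
    ~Tracked() { --alive; }
};
int Tracked::alive = 0;

// The node stores a Tracked by value: deleting the node runs the implicitly
// generated destructor, which destroys the stored value as well.
inline int leaked_with_value_member() {
    int before = Tracked::alive;
    Node<Tracked>* node = new Node<Tracked>(Tracked{});
    delete node;
    return Tracked::alive - before;
}

// The node stores a raw pointer: deleting the node destroys only the pointer,
// so the pointee is still alive afterwards.
inline int leaked_with_raw_pointer_member() {
    int before = Tracked::alive;
    Node<Tracked*>* node = new Node<Tracked*>(new Tracked);
    Tracked* pointee = node->value;
    delete node;
    int leaked = Tracked::alive - before;
    delete pointee;  // clean up by hand so the experiment itself does not leak
    return leaked;
}
```

Storing a std::unique_ptr<T> instead of a raw pointer would make the second case behave like the first, because the smart pointer's destructor deletes what it owns.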

Print a singly linked list backwards with no recursion, in two passes at most, using constant extra memory, leaving it intact

You must print a singly linked list backwards:
Without recursion
With constant extra memory
In linear time
Leaving the list intact
Two passes at most (added later)
Invert the list, print it forwards, invert again. Each step can be done without violating restrictions except the last one.
EDIT: As cube notes in the comments the second and the third stages can be combined into one pass. This gives two passes – first reverse, then print while reversing again.
Building on sharptooth's reply, you can combine the printing and second inversion in the same pass.
Edit: The list is "left intact" from a single-threaded point of view, because the post-condition equals the pre-condition.
Edit 2: Not sure how I got the answer, but I'll take it since I've hit the rep cap for the day. I gave sharptooth a +1 too.
Here's a C# implementation that satisfies all the current rules. It mutates the list during execution, but the list is restored before returning.
using System;
using System.Diagnostics;
namespace SO1135917.Classes
{
    public class ReverseListPrinter
    {
        public static void Execute(Node firstNode, Action<Node> action)
        {
            Reverse(Reverse(firstNode, null), action);
        }
        private static Node Reverse(Node firstNode, Action<Node> action)
        {
            Node node = firstNode;
            Debug.Assert(node != null);
            Node nextNode = node.Next;
            node.Next = null;
            while (node != null)
            {
                if (action != null)
                    action(node);
                if (nextNode == null)
                    break;
                Node nextNode2 = nextNode.Next;
                nextNode.Next = node;
                node = nextNode;
                nextNode = nextNode2;
            }
            return node;
        }
    }
}
There is one problem, however, and that is that the state of the list is undefined if an exception should occur in the above methods. Probably not impossible to handle though.
A subversion repository of the above code, with unit tests, for Visual Studio 2008 is available here, username and password is both 'guest' without the quotes.
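The exception concern raised for the C# version could be handled along the following lines. This is a C++ sketch rather than a port of that code: Node, reverse and for_each_backwards are names chosen for this example. The first pass reverses the list; the second pass visits each node and re-reverses as it goes, and if the action throws, the remaining links are restored before the exception is passed on.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

// Minimal singly linked node used only for this example.
struct Node {
    int value;
    Node* next;
};

// Reverses the list in place and returns the new head (the old tail).
inline Node* reverse(Node* head) {
    Node* prev = nullptr;
    while (head) {
        Node* next = head->next;
        head->next = prev;
        prev = head;
        head = next;
    }
    return prev;
}

// Calls action(value) for every node from last to first, using two passes
// and constant extra memory. The list is back in its original order when the
// function returns or throws.
template <typename Action>
void for_each_backwards(Node* head, Action action) {
    Node* cur = reverse(head);   // pass 1: reverse
    Node* restored = nullptr;    // part of the list already put back in order
    try {
        while (cur) {            // pass 2: visit and re-reverse
            action(cur->value);  // may throw; cur's link is still untouched here
            Node* next = cur->next;
            cur->next = restored;
            restored = cur;
            cur = next;
        }
    } catch (...) {
        while (cur) {            // finish the re-reversal without visiting
            Node* next = cur->next;
            cur->next = restored;
            restored = cur;
            cur = next;
        }
        throw;
    }
}
```

Calling the action before touching the current node's link is what keeps the recovery simple: at the moment of the throw, every node is either fully restored or still fully reversed.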
You can first check the length of the list. Then create a print-buffer, which you fill in backwards as you traverse the list once again for the information.
Or
You can create another linked list where you add all the printing data in the front when you traverse the first list, and then print the second list from front to back.
Either way makes only two passes at most. The first idea could be done in one pass if you have a header struct that keeps track of the amount of elements in the list.
Edit: I just realised that these ideas do not use constant memory.
The only way to do this sensibly seems to be sharptooth's reply, but that requires three passes.
A function like the following might solve your issue. Note that it rescans the list from the head for every node, so it takes quadratic rather than linear time:
void invert_print(PtNo l){
    PtNo ptaux = l;
    PtNo last = NULL; /* stays NULL for an empty list, so nothing is printed */
    PtNo before;
    /* find the last node */
    while(ptaux != NULL){
        last = ptaux;
        ptaux = ptaux->next;
    }
    /* print the current last node, then rescan from the head for its predecessor */
    while(ptaux != last){
        printf("%s\n", last->info.title);
        ptaux = l;
        before = last;
        while(ptaux != before){
            last = ptaux;
            ptaux = ptaux->next;
        }
    }
}
you will need a structure like the following:
typedef struct InfoNo{
    char title[20];
}InfoNo;
typedef struct aPtNo{
    struct InfoNo info;
    struct aPtNo* next;
}*PtNo;
Objective-C Link class with reverse method:
Link.h
#import <Foundation/Foundation.h>
@interface Link : NSObject
@property(nonatomic) int value;
@property(nonatomic) Link *next;
- (Link*)reversedList;
@end
Link.m
#import "Link.h"
@implementation Link
- (Link*)reversedList {
    Link *head = nil;
    Link *link = self;
    while (link) {
        // save reference to next link
        Link *next = link.next;
        // "insert" link at the head of the list
        link.next = head;
        head = link;
        // continue processing the rest of the list
        link = next;
    }
    return head;
}
@end

Tree traversal algorithm for directory structures with a lot of files

When recursively traversing through a directory structure, what is the most efficient algorithm to use if you have more files than directories? I notice that when using depth-first traversal, it seems to take longer when there are a lot of files in a given directory. Does breadth-first traversal work more efficiently in this case? I have no way to profile the two algorithms at the moment so your insights are very much welcome.
EDIT: In response to alphazero's comment, I'm using PHP on a Linux machine.
Since you have more files than directories, it does not appear that you are dealing with very deeply nested directories that would make DFS take more memory (and hence somewhat more time) than BFS. Essentially, BFS and DFS both do the same thing (i.e. visit every node of the graph), so in general their speeds should not differ by any significant amount.
It is difficult to say why exactly your DFS is slower without actually seeing your implementation. Are you sure you are not visiting the same nodes more than once due to links/shortcuts in your filesystem? You will also probably get a significant speedup if you use an explicit stack based DFS rather than recursion.
You probably only want to scan the contents of a directory once per directory, so processing order (whether you process a directory's contents before or after visiting other directories) probably matters more than whether you're doing a depth-first or breadth-first search. Depending on your file system, it may also be more efficient to process file nodes right after stat-ing them to see whether they are files or directories, rather than coming back to them later. So I'd suggest a pre-order depth-first search as a starting point, as it is the easiest to implement and the most likely to have good cache/seek performance.
In summary, pre-order depth-first: on entering a directory, list its contents, process any files in that directory, and save a list of child directory names. Then enter each child directory in turn. Just use the program's call stack as the stack, unless you know you have vastly deep directory structures.
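The question is about PHP, but the shape of that traversal is the same in any language. Below is a C++17 sketch using std::filesystem; collect_files is a name invented for this example, and it uses an explicit stack of pending directories instead of the call stack, as one of the answers above suggests. Each directory is listed exactly once, files are handled as soon as they are seen, and symbolic links are skipped so that a link cannot cause a directory to be visited twice.

```cpp
#include <cassert>    // only needed if you add checks like the usage note's
#include <filesystem>
#include <fstream>    // only needed to create sample files when trying this out
#include <string>
#include <utility>
#include <vector>

namespace fs = std::filesystem;

// Pre-order, depth-first walk: on entering a directory, record its files
// immediately and remember its subdirectories for later.
inline std::vector<fs::path> collect_files(const fs::path& root) {
    std::vector<fs::path> files;
    std::vector<fs::path> pending{root};   // explicit stack of directories
    while (!pending.empty()) {
        fs::path dir = std::move(pending.back());
        pending.pop_back();
        for (const fs::directory_entry& entry : fs::directory_iterator(dir)) {
            if (entry.is_symlink()) {
                continue;                  // avoid revisiting nodes via links
            }
            if (entry.is_directory()) {
                pending.push_back(entry.path());
            } else {
                files.push_back(entry.path());  // "process" the file right away
            }
        }
    }
    return files;
}
```

In real use the push_back on files would be replaced by whatever processing each file needs, so that nothing but the list of pending directories has to be kept in memory.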
It makes sense that breadth-first would work better. When you enter your root folder, you create a list of items you need to deal with. Some of those items are files and some are directories.
If you use breadth-first, you would deal with the files in the directory and forget about them before moving on to one of the child directories.
If you use depth-first, you need to keep growing a list of files to deal with later as you drill deeper down. This would use more memory to maintain your list of files to deal with, possibly causing more page faults, etc...
Plus, you'd need to go through the list of new items anyway to figure out which ones are directories that you can drill into. You would need to go through that same list (minus the directories) again when you've gotten to the point of dealing with the files.
Traverse the directory structure using BFS (as Igor mentioned).
When you reach a directory, start a thread to list all the files in that directory.
Kill the thread once it finishes listing/traversing the files.
So there will be a separate thread for each directory to list its files.
EXAMPLE:
root
- d1
  - d1.1
  - d1.2
  - f1.1 ... f1.100
- d2
  - d2.1
  - d2.2
  - d2.3
  - f2.1 ... f2.200
- d3
....
OUTPUT might look like this ->
got d1
started thread to get files of d1
got d2
started thread to get files of d2
done with files in d1
got d3
started thread to get files of d3
got d1.1
started thread to get files of d1.1
got d1.2
started thread to get files of d1.2
So by the time you come back to traverse the depths of a directory,
the thread that gets its files will have (almost) finished its job.
Hope this is helpful.
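On Windows, something like the following DirectoryTreeReader class (MFC) should be effective. It reads each directory's contents in one go and keeps the directories it is currently inside on an explicit stack; note that this makes the traversal depth-first rather than breadth-first.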
static const uint64 DIRECTORY_INDICATOR = -1;//std::numeric_limits <uint64>::max();
class DirectoryContent {
public:
    DirectoryContent(const CString& path)
        : mIndex(-1) // wraps to the largest size_t; the first Advance() moves it to 0
    {
        CFileFind finder;
        // Only call FindNextFile() if FindFile() found something.
        BOOL keepGoing = finder.FindFile(path + L"\\*.*");
        while (keepGoing) {
            keepGoing = finder.FindNextFile();
            if (finder.IsDots()) {
                // Do nothing...
            } else if (finder.IsDirectory()) {
                mPaths.push_back(finder.GetFilePath());
                mSizes.push_back(DIRECTORY_INDICATOR);
            } else {
                mPaths.push_back(finder.GetFilePath());
                mSizes.push_back(finder.GetLength());
            }
        }
    }
    bool OutOfRange() const {
        return mIndex >= mPaths.size();
    }
    void Advance() {
        ++mIndex;
    }
    bool IsDirectory() const {
        return mSizes[mIndex] == DIRECTORY_INDICATOR;
    }
    const CString& GetPath() const {
        return mPaths[mIndex];
    }
    uint64 GetSize() const {
        return mSizes[mIndex];
    }
private:
    CStrings mPaths;
    std::vector <uint64> mSizes;
    size_t mIndex;
};
class DirectoryTreeReader{
    // Not copyable: declared private and left unimplemented.
    DirectoryTreeReader& operator=(const DirectoryTreeReader& other);
    DirectoryTreeReader(const DirectoryTreeReader& other);
public:
    DirectoryTreeReader(const CString& startPath)
        : mStartPath(startPath){
        Reset();
    }
    void Reset() {
        // Argh!, no clear() in std::stack
        while(!mDirectoryContents.empty()) {
            mDirectoryContents.pop();
        }
        mDirectoryContents.push( DirectoryContent(mStartPath) );
        Advance();
    }
    // Moves to the next file, descending into directories as they are found.
    void Advance() {
        bool keepGoing = true;
        while(keepGoing) {
            if (mDirectoryContents.empty()){
                return;
            }
            mDirectoryContents.top().Advance();
            if (mDirectoryContents.top().OutOfRange()){
                mDirectoryContents.pop();
            } else if ( mDirectoryContents.top().IsDirectory() ){
                mDirectoryContents.push( DirectoryContent(mDirectoryContents.top().GetPath()) );
            } else {
                keepGoing = false;
            }
        }
    }
    bool OutOfRange() const {
        return mDirectoryContents.empty();
    }
    const CString& GetPath() const {
        return mDirectoryContents.top().GetPath();
    }
    uint64 GetSize() const {
        return mDirectoryContents.top().GetSize();
    }
private:
    const CString mStartPath;
    std::stack <DirectoryContent> mDirectoryContents;
};
