ext4 pointers from in-memory inode - linux-kernel

I'm trying to retrieve in a kernel module the direct/indirect etc addresses in an ext4 file system inode. I understand that I need to look into ext_inode_info struct (I do this via container_of using the relevant vfs_inode).
But to which field am I supposed to look into?
Where can I find for example the first direct pointer? I thought it was stored in i_data array (it is in ext3_inode_info).
But for an ext4 inode when I examine the first entry in i_data, I get a sector address that is not remotely similar to the real sector holding the address of the first data block.
Any help will be appreciated.
==EDIT==
ok, so I seemed to have understood the basic problem. I have an extent-based ext4 file system. Wasn't aware of this change, and that this is enabled by default. So is there a simple way to extract the physical addresses of blocks by offset? I'm trying again as verification to look at the first physical block (logical 0), by looking at the first extent, but I get some gibberish numbers (though consistent and unique for every inode/file, so some progress was made).

Related

Working of mmap()

I am trying to get an idea on how does memory mapping take place using the system call mmap.
So far I know mmap takes arguments from the user and returns a logical address of where the file is stored. When the user tries to access it takes this address to the map table converts it to a a physical address and carries the operation as requested.
However I found articles as code example and Theoretical explanation
What it mentions is the memory mapping is carried out as:
A. Using system call mmap ()
B. file operations using (struct file *filp, struct vm_area_struct *vma)
What I am trying to figure out is:
How the arguments passed in the mmap system call are used in the struct vm_area_struct *vma) More generally how are these 2 related.
for instance: the struct vm_area_struct has arguments such as starting address, ending address permissions,etc. How are the values sent by the user used to fill values of these variables.
I am trying to write a driver so, Does the kernal fill the values for variables in the structure for us and I simply use it to call and pass values to remap_pfn_range
And a more fundamental question, why is a different file systems operation needed. The fact that mmap returns the virtual address means that it has already achieved a mapping doesnt it ?
Finally I am not that clear about how the entire process would work in user as well as kernal space. Any documentation explaining the process in details would be helpful.

Read a actual physical block

I have LBN->PBN map.
LBN - Logical Block Number.
PBN - Physical Block Number.
I can get each (LBN, PBN) entry from the above map.
Is there any API that I can use to read the data from the actual block device using this PBN. I am currently working in the linux kernel code /drivers/md.
Try to look inside. I also grep thru sources and found a lot of accessor functions relative to fs.
The Linux Kernel API

How device name is copied in ip_rt_ioctl in fib_frontend.c

I have one doubt in ip_rt_ioctl function
In case of route addition, first a copy_from_user is made for the structure struct rtentry and then the copied data from is subsequently used in rtentry_to_fib_config function, including the rtentry.rt_dev field which usually is the device name.
My understanding is copy_from_user does a shallow copy. So since the rtentry.rt_dev field is again a character pointer. So likely the contents of the pointer will not get copied.
Hence even after copy the device name will be pointer to the user space address.
So is it right to access the user space address from kernel space ?
It's OK to refer to user-space address from kernel-space while kernel is bound to that process' context (this is true for syscall handlers). In that case, proper page table is set and it's safe to refer to user process' memory.
However, you should always check validity of address or use copy_from_user() that does that.

Partial unmap of Win32 memory-mapped file

I have some code (which I cannot change) that I need to get working in a native Win32 environment. This code calls mmap() and munmap(), so I have created those functions using CreateFileMapping(), MapViewOfFile(), etc., to accomplish the same thing. Initially this works fine, and the code is able to access files as expected. Unfortunately the code goes on to munmap() selected parts of the file that it no longer needs.
x = mmap(0, size, PROT_READ, MAP_SHARED, fd, 0);
...
munmap(x, hdr_size);
munmap(x + foo, bar);
...
Unfortunately, when you pass a pointer into the middle of the mapped range to UnmapViewOfFile() it destroys the entire mapping. Even worse, I can't see how I would be able to detect that this is a partial un-map request and just ignore it.
I have tried calling VirtualFree() on the range but, unsurprisingly, this produces ERROR_INVALID_PARAMETER.
I'm beginning to think that I will have to use static/global variables to track all the open memory mappings so that I can detect and ignore partial unmappings, but I hope you have a better idea...
edit:
Since I wasn't explicit enough above: the docs for UnMapViewOfFile do not accurately reflect the behavior of that function.
Un-mapping the whole view and remapping pieces is not a good solution because you can only suggest a base address for a new mapping, you can't really control it. The semantics of munmap() don't allow for a change to the base address of the still-mapped portion.
What I really need is a way to find the base address and size of a already-mapped memory area.
edit2: Now that I restate the problem that way, it looks like the VirtualQuery() function will suffice.
It is quite explicit in the MSDN Library docs for UnmapViewOfFile:
lpBaseAddress A pointer to the
base address of the mapped view of a
file that is to be unmapped. This
value must be identical to the value
returned by a previous call to the
MapViewOfFile or MapViewOfFileEx
function.
You changing the mapping by unmapping the old one and creating a new one. Unmapping bits and pieces isn't well supported, nor would it have any useful side-effects from a memory management point of view. You don't want to risk getting the address space fragmented.
You'll have to do this differently.
You could keep track each mapping and how many pages of it are still allocated by the client and only free the mapping when that counter reaches zero. The middle sections would still be mapped, but it wouldn't matter since the client wouldn't be accessing that memory anyway.
Create a global dictionary of memory mappings through this interface. When a mapping request comes through, record the address, size and number of pages that are in the range. When a unmap request is made, find out which mapping owns that address and decrease the page count by the number of pages that are being freed. When that count reaches zero, really unmap the view.

How do they read clusters/cylinders/sectors from the disk?

I needed to recover the partition table I deleted accidentally. I used an application named TestDisk. Its simply mind blowing. I reads each cylinder from the disk. I've seen similar such applications which work with MBR & partitioning.
I'm curious.
How do they read
clusters/cylinders/sectors from the
disk? Is there some kind of API for this?
Is it again OS dependent? If so whats the way to for Linux & for windows?
EDIT:
Well, I'm not just curious I want a hands on experience. I want to write a simple application which displays each LBA.
Cylinders and sectors (wiki explanation) are largely obsoleted by the newer LBA (logical block addressing) scheme for addressing drives.
If you're curious about the history, use the Wikipedia article as a starting point. If you're just wondering how it works now, code is expected to simply use the LBA address (which works largely the same way as a file does - a linear array of bytes arranged in blocks)
It's easy due to the magic of *nix special device files. You can open and read /dev/sda the same way you'd read any other file.
Just use open, lseek, read, write (or pread, pwrite). If you want to make sure you're physically fetching data from a drive and not from kernel buffers you can open with the flag O_DIRECT (though you must perform aligned reads/writes of 512 byte chunks for this to work).
For *nix, there have been already answers (/dev directory); for Windows, there are the special objects \\.\PhisicalDriveX, with X as the number of the drive, which can be opened using the normal CreateFile API. To actually perform reads or writes you have then to use the DeviceIoControl function.
More info can be found in "Physical Disks and Volumes" section of the CreateFile API documentation.
I'm the OP. I'm combining Eric Seppanen's & Matteo Italia's answers to make it complete.
*NIX Platforms:
It's easy due to the magic of *nix special device files. You can open and read /dev/sda the same way you'd read any other file.
Just use open, lseek, read, write (or pread, pwrite). If you want to make sure you're physically fetching data from a drive and not from kernel buffers you can open with the flag O_DIRECT (though you must perform aligned reads/writes of 512 byte chunks for this to work).
Windows Platform
For Windows, there are the special objects \\.\PhisicalDriveX, with X as the number of the drive, which can be opened using the normal CreateFile API. To perform reads or writes simply call ReadFile and WriteFile (buffer must be aligned on sector size).
More info can be found in "Physical Disks and Volumes" section of the CreateFile API documentation.
Alternatively you can also you DeviceIoControl function which sends a control code directly to a specified device driver, causing the corresponding device to perform the corresponding operation.
On linux, as root, you can save your MBR like this (Assuming you drive is /dev/sda):
dd if=/dev/sda of=mbr bs=512 count=1
If you wanted to read 1Mb from you drive, starting at the 10th MB:
dd if=/dev/sda of=1Mb bs=1Mb count=1 skip=10

Resources