Read an actual physical block - linux-kernel

I have LBN->PBN map.
LBN - Logical Block Number.
PBN - Physical Block Number.
I can get each (LBN, PBN) entry from the above map.
Is there any API that I can use to read the data from the actual block device using this PBN? I am currently working in the Linux kernel code under drivers/md.

Try to look inside The Linux Kernel API documentation. I also grepped through the sources and found a lot of accessor functions related to fs.
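As a hedged sketch of one option inside drivers/md: if you already hold an open struct block_device (as an md target does) and your PBN is expressed in units of the device block size, the buffer-head helper __bread() reads that block synchronously. The pbn_read wrapper below is an illustrative name, not an existing kernel function.

#include <linux/blkdev.h>
#include <linux/buffer_head.h>

/* Illustrative helper (not a kernel API): read one physical block, given in
 * units of 'blocksize', from an already opened block device.  The caller
 * must brelse() the returned buffer_head when done with bh->b_data. */
static struct buffer_head *pbn_read(struct block_device *bdev,
                                    sector_t pbn, unsigned int blocksize)
{
    struct buffer_head *bh = __bread(bdev, pbn, blocksize);

    if (!bh)
        return NULL;    /* I/O error or block out of range */

    /* bh->b_data now holds the 'blocksize' bytes of physical block 'pbn'. */
    return bh;
}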

Related

Equivalent to ZwQueryVirtualMemory that works on system memory?

ZwQueryVirtualMemory reports on virtual memory in the address space of a process. I would like to do the same thing, but for paged memory in system space. Is there an equivalent function that deals with system space instead of process space?
You could use ZwQuerySystemInformation with SystemModuleInformation to enumerate all loaded drivers. Then you can find the entry you want and get the base address and size of the driver. If you want to do it properly, get only the base of the target driver (either with the same method as above or via PsLoadedModuleList) and then walk its sections manually using the PE headers.
Also, a tip: if you are going to copy the memory in order to dump it, use MmCopyMemory.
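A minimal sketch of that approach from a kernel-mode driver: SystemModuleInformation and the RTL_PROCESS_MODULES layout are not declared in the public WDK headers, so the definitions below are the commonly used hand-written ones and should be verified against your target Windows version; find_driver_base is an illustrative helper name, not a system API.

#include <ntddk.h>

#define SystemModuleInformation 0x0B

typedef struct _RTL_PROCESS_MODULE_INFORMATION {
    HANDLE Section;
    PVOID MappedBase;
    PVOID ImageBase;
    ULONG ImageSize;
    ULONG Flags;
    USHORT LoadOrderIndex;
    USHORT InitOrderIndex;
    USHORT LoadCount;
    USHORT OffsetToFileName;
    UCHAR FullPathName[256];
} RTL_PROCESS_MODULE_INFORMATION, *PRTL_PROCESS_MODULE_INFORMATION;

typedef struct _RTL_PROCESS_MODULES {
    ULONG NumberOfModules;
    RTL_PROCESS_MODULE_INFORMATION Modules[1];
} RTL_PROCESS_MODULES, *PRTL_PROCESS_MODULES;

NTSYSAPI NTSTATUS NTAPI ZwQuerySystemInformation(
    ULONG SystemInformationClass, PVOID SystemInformation,
    ULONG SystemInformationLength, PULONG ReturnLength);

static NTSTATUS find_driver_base(const char *name, PVOID *base, ULONG *size)
{
    ULONG len = 0;
    PRTL_PROCESS_MODULES mods;
    NTSTATUS status;
    ULONG i;

    /* First call only reports the required buffer size; add slack in case
     * the module list grows before the second call. */
    ZwQuerySystemInformation(SystemModuleInformation, NULL, 0, &len);
    len += 4096;
    mods = ExAllocatePoolWithTag(NonPagedPool, len, 'sdoM');
    if (!mods)
        return STATUS_INSUFFICIENT_RESOURCES;

    status = ZwQuerySystemInformation(SystemModuleInformation, mods, len, &len);
    if (NT_SUCCESS(status)) {
        status = STATUS_NOT_FOUND;
        for (i = 0; i < mods->NumberOfModules; i++) {
            PRTL_PROCESS_MODULE_INFORMATION m = &mods->Modules[i];
            if (strstr((const char *)m->FullPathName, name)) {
                *base = m->ImageBase;   /* driver base address */
                *size = m->ImageSize;   /* driver image size */
                status = STATUS_SUCCESS;
                break;
            }
        }
    }
    ExFreePoolWithTag(mods, 'sdoM');
    return status;
}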

Intel Pin Tool: Get instruction from address

I'm using Intel's Pin Tool to do some binary instrumentation, and was wondering if there is an API to get the instruction byte code at a given address.
Something like:
instruction = getInstructionatAddr(addr);
where addr is the desired address.
I know the function Instruction (used in many of the simple/manual examples) given by Pin gets the instruction, but I need to know the instructions at other addresses. I searched the web to no avail. Any help would be appreciated!
CHEERS
wondering if there an API to get the instruction byte code at a given
address
Yes, it's possible, but in a somewhat contrived way: with PIN you are usually interested in what is executed (or manipulated through the executed instructions), so everything outside the code / data flow is of no interest to PIN.
PIN uses (and thus ships with) Intel XED, which is an instruction encoder/decoder.
In your PIN installation you should have an extra folder with two sub-directories: xed-ia32 and xed-intel64 (choose the one that suits your architecture). The main include file for XED is xed-interface.h, located in the include folder of the aforementioned directories.
In your Pintool, given any address in the virtual space of your pintooled program, use the PIN_SafeCopy function to read the program's memory (and thus the bytes at the given address). The advantage of PIN_SafeCopy is that it fails gracefully even if it can't read the memory, and it can read "shadowed" parts of the memory.
Use XED to decode the instruction bytes for you.
For an example of how to decode an instruction with XED, see the first example program.
As the small example uses a hardcoded buffer (namely itext in the example program), replace this hardcoded buffer with the destination buffer you used with PIN_SafeCopy.
Obviously, you should make sure that the memory you are reading really contains code.
AFAIK, it is not possible to get an INS type (the usual type describing an instruction in PIN) from an arbitrary address as only addresses in the code flow will "generate" an INS type.
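A minimal sketch of those two steps, assuming a 64-bit target and the XED that ships with PIN (xed_state_init2 and xed_format_context may be spelled slightly differently in older XED releases); decode_at is an illustrative helper, not a PIN API, and xed_tables_init() must have been called once at tool startup.

/* Inside a pintool (pintool sources compile as C++). */
#include "pin.H"
extern "C" {
#include "xed-interface.h"
}

static void decode_at(ADDRINT addr)
{
    /* Step 1: read up to 15 bytes (the maximum x86 instruction length) from
     * the pintooled program's memory; PIN_SafeCopy fails gracefully. */
    UINT8 buf[15];
    size_t copied = PIN_SafeCopy(buf, (VOID *)addr, sizeof(buf));
    if (copied == 0)
        return;                          /* address not readable */

    /* Step 2: decode those bytes with XED (64-bit mode assumed here). */
    xed_state_t dstate;
    xed_state_init2(&dstate, XED_MACHINE_MODE_LONG_64, XED_ADDRESS_WIDTH_64b);

    xed_decoded_inst_t xedd;
    xed_decoded_inst_zero_set_mode(&xedd, &dstate);

    if (xed_decode(&xedd, buf, (unsigned int)copied) == XED_ERROR_NONE) {
        char text[128];
        /* Older XED releases spell this xed_format / xed_format_context with
         * fewer arguments; adjust to the headers in your extra/ directory. */
        xed_format_context(XED_SYNTAX_INTEL, &xedd, text, (int)sizeof(text),
                           addr, 0, 0);
        /* 'text' now holds the disassembly of the instruction at addr. */
    }
}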
As a side note:
I know the function Instruction (used in many of the simple/manual
examples) given by Pin gets the instruction
The Instruction routine used in many PIN examples is called an "instrumentation routine": its name is not relevant in itself.
PIN_SafeCopy may help you. This API copies memory content from the address space of the target process into a specified buffer.

Accessing user space data from linux kernel

This is an assignment problem which asks for partial implementation of process checkpointing:
The test program allocates an array, does a system call, and passes the start and end address of the array to the call. In the system call function I have to save the contents of the given range to a file.
From my understanding, I could simply use the copy_from_user function to save the contents of the given range. However, since the assignment is based on the topic "Process address space", I probably need to walk through the page tables. Say I manage to get the struct pages that correspond to the given range. How do I get the data corresponding to the pages?
Can I just use the page_to_virt function and access the data directly?
Since the array is contiguous in virtual space, I guess I will just need to translate the starting address to a page and then back to a virtual address, and then copy the range's worth of data to the file. Is that right?
I think copy_from_user() is OK; nothing else is needed. When executing the system call, although it traps into kernel space, the context is still that of the process doing the system call. The kernel still uses the process's page tables. So just use copy_from_user(), and nothing else is needed.
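A minimal sketch of that approach, assuming the system call receives the user-space start and end addresses; the syscall name checkpoint_range and the file-writing step are illustrative placeholders, not part of the assignment's interface.

#include <linux/uaccess.h>
#include <linux/mm.h>
#include <linux/syscalls.h>

SYSCALL_DEFINE2(checkpoint_range, unsigned long, start, unsigned long, end)
{
    size_t len = end - start;
    void *kbuf;

    if (end <= start)
        return -EINVAL;

    kbuf = kvmalloc(len, GFP_KERNEL);
    if (!kbuf)
        return -ENOMEM;

    /* We are in process context, so the caller's page tables are live and
     * copy_from_user() can resolve the user addresses directly. */
    if (copy_from_user(kbuf, (const void __user *)start, len)) {
        kvfree(kbuf);
        return -EFAULT;
    }

    /* ... write kbuf out to the checkpoint file here ... */

    kvfree(kbuf);
    return 0;
}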
OK, if you want to do this as an experiment, I think you can take the void __user *vaddr and walk mm->pgd (the page table), using pgd_offset/pud_offset/pmd_offset/pte_offset to get the physical address of the page (page-size aligned). Then, in kernel space, use ioremap() to create a kernel mapping; the kernel virtual address of the page plus the offset inside the page gives you the starting virtual address of the array. Now, in the kernel, you can use that virtual address to access the array.
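And a sketch of the page-table-walk variant, assuming the classic pgd/pud/pmd/pte layout (kernels since 4.11 insert a p4d level, and page-table locking is omitted for brevity); it maps the resulting page with kmap_atomic() rather than ioremap(), which is the more conventional way to get a temporary kernel mapping of a RAM page. walk_to_page and copy_page_bytes are illustrative names.

#include <linux/mm.h>
#include <linux/highmem.h>
#include <linux/sched.h>
#include <linux/string.h>

/* Walk the process's page tables for one user address and return the
 * backing struct page, or NULL if it is not present. */
static struct page *walk_to_page(struct mm_struct *mm, unsigned long vaddr)
{
    pgd_t *pgd;
    pud_t *pud;
    pmd_t *pmd;
    pte_t *pte;
    struct page *page = NULL;

    pgd = pgd_offset(mm, vaddr);
    if (pgd_none(*pgd) || pgd_bad(*pgd))
        return NULL;

    pud = pud_offset(pgd, vaddr);    /* pre-4.11 signature; newer: p4d first */
    if (pud_none(*pud) || pud_bad(*pud))
        return NULL;

    pmd = pmd_offset(pud, vaddr);
    if (pmd_none(*pmd) || pmd_bad(*pmd))
        return NULL;

    pte = pte_offset_map(pmd, vaddr);
    if (pte_present(*pte))
        page = pte_page(*pte);
    pte_unmap(pte);

    return page;
}

/* Copy up to one page's worth of data backing 'vaddr' into 'dst'. */
static size_t copy_page_bytes(unsigned long vaddr, void *dst, size_t len)
{
    struct page *page = walk_to_page(current->mm, vaddr);
    unsigned long offset = vaddr & ~PAGE_MASK;
    void *kaddr;

    if (!page)
        return 0;
    if (len > PAGE_SIZE - offset)
        len = PAGE_SIZE - offset;

    kaddr = kmap_atomic(page);           /* temporary kernel mapping */
    memcpy(dst, kaddr + offset, len);
    kunmap_atomic(kaddr);
    return len;
}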

ext4 pointers from in-memory inode

I'm trying to retrieve, in a kernel module, the direct/indirect etc. addresses from an ext4 file system inode. I understand that I need to look into the ext4_inode_info struct (I do this via container_of using the relevant vfs_inode).
But which field am I supposed to look into?
Where can I find, for example, the first direct pointer? I thought it was stored in the i_data array (it is in ext3_inode_info).
But for an ext4 inode when I examine the first entry in i_data, I get a sector address that is not remotely similar to the real sector holding the address of the first data block.
Any help will be appreciated.
==EDIT==
OK, so I seem to have understood the basic problem. I have an extent-based ext4 file system. I wasn't aware of this change, or that it is enabled by default. So is there a simple way to extract the physical addresses of blocks by offset? As verification I'm trying again to look at the first physical block (logical 0) by looking at the first extent, but I get some gibberish numbers (though consistent and unique for every inode/file, so some progress was made).
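A speculative sketch of reading the first extent, assuming the extent tree still fits in the inode body (eh_depth == 0): EXT4_I(inode)->i_data then starts with a struct ext4_extent_header followed by struct ext4_extent entries. These definitions live in fs/ext4/ext4.h and fs/ext4/ext4_extents.h, which are private to the ext4 tree, so an out-of-tree module has to copy them. first_data_block is an illustrative helper, not an ext4 API.

#include <linux/fs.h>
#include "ext4.h"            /* EXT4_I(), ext4_test_inode_flag()           */
#include "ext4_extents.h"    /* ext4_extent_header, ext4_extent, EXT4_EXT_MAGIC */

static ext4_fsblk_t first_data_block(struct inode *inode)
{
    struct ext4_inode_info *ei = EXT4_I(inode);
    struct ext4_extent_header *eh = (struct ext4_extent_header *)ei->i_data;
    struct ext4_extent *ex;

    /* Only handle an extent-mapped inode whose tree is a leaf in the inode. */
    if (!ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS) ||
        eh->eh_magic != EXT4_EXT_MAGIC ||
        le16_to_cpu(eh->eh_entries) == 0 ||
        le16_to_cpu(eh->eh_depth) != 0)
        return 0;   /* deeper trees must be walked via the index entries */

    ex = (struct ext4_extent *)(eh + 1);   /* first extent follows the header */

    /* Physical start block of the extent covering logical block 0. */
    return ((ext4_fsblk_t)le16_to_cpu(ex->ee_start_hi) << 32) |
            le32_to_cpu(ex->ee_start_lo);
}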

Accessing block device from kernel module

I am interested in developing a kernel module that binds two block devices into a new block device, in such a manner that the first block device contains the data at mount time and the other is considered empty. Every write is made to the second partition, so on the next mount the base filesystem remains unchanged. I know of solutions like UnionFS, but those are filesystem-based, while I want to develop this a layer lower, block-based.
Can anyone tell me how I could open and read/write a block device from a kernel module? Preferably without using a userspace program for reading/writing the merged block devices. I found a similar topic here, but the answer was rather unsatisfying, because the filp_* functions are meant for reading small config files, not for (large) block device I/O.
Since the interface for creating block devices is standardized, I was thinking of direct (or almost direct) access to the functions implementing the source devices, as I will be required to export similar functions anyway. If I could do that, I would simply create some proxy functions calling the appropriate functions on the source devices. Can I somehow obtain a pointer to a gendisk structure that belongs to a different driver?
This serves only my own purposes (satisfying curiosity being the main one), so I am not worried about seriously messing up my kernel.
Or does somebody know if a module like that already exists?
The source code in the device mapper driver will suit your needs. Look at the code in the Linux source in Linux/drivers/md/dm-*.
You don't need to access the other device's gendisk structure, but rather its request queue. You can prepare I/O requests and push them down the other device's queue, and it will do the rest itself.
I have implemented a simple block device that opens another block device. Take a look in my post describing it:
stackbd: Stacking a block device over another block device
Here are some examples of functions that you need for accessing another device's gendisk.
The way to open another block device using its path ("/dev/"):
struct block_device *bdev_raw = lookup_bdev(dev_path);
printk("Opened %s\n", dev_path);

if (IS_ERR(bdev_raw))
{
    printk("stackbd: error opening raw device <%lu>\n", PTR_ERR(bdev_raw));
    return NULL;
}

if (!bdget(bdev_raw->bd_dev))
{
    printk("stackbd: error bdget()\n");
    return NULL;
}

if (blkdev_get(bdev_raw, STACKBD_BDEV_MODE, &stackbd))
{
    printk("stackbd: error blkdev_get()\n");
    bdput(bdev_raw);
    return NULL;
}
The simplest example of passing an I/O request from one device to another is to remap it without modifying it. Notice in the following code that the bi_bdev field is modified to point to a different device. One can also modify the block address (bi_sector) and the data itself.
static void stackbd_io_fn(struct bio *bio)
{
    bio->bi_bdev = stackbd.bdev_raw;
    trace_block_bio_remap(bdev_get_queue(stackbd.bdev_raw), bio,
                          bio->bi_bdev->bd_dev, bio->bi_sector);
    /* No need to call bio_endio() */
    generic_make_request(bio);
}
Consider examining the code for the dm / md block devices in drivers/md - these existing drivers create a block device that stores data on other block devices.
In fact, you could probably implement your idea as another "RAID personality" in md, and thereby make use of the existing userspace tools for setting up the devices.
You know that if you're a GPL'd kernel module you can just call open(), read(), write(), etc. from kernel mode, right?
Of course this approach has certain caveats, including having to fork from kernel mode to create a context for your file handle to live in.

Resources