User space access from Kernel space - get_user_pages - linux-kernel

I'd like to pass a pointer to user space memory into a function in my kernel module. I don't want to use copy_from_user. I've read that I should use the get_user_pages function.
For example, for one page:
struct page **pages;
pages = kmalloc(1 * sizeof(*pages), GFP_KERNEL);
down_read(&current->mm->mmap_sem);
get_user_pages(current, current->mm, uaddr, 1, 1, 0, pages, NULL);
up_read(&current->mm->mmap_sem);
uaddr is an address in user space.
After doing this, am I allowed to cast uaddr and pass it into my kernel module function? Or do I have to use these struct pages in some way?
Why do I have to use down_read/up_read?
When I am done, do I have to call SetPageDirty() and page_cache_release()?

No, you cannot access the userspace pages directly via uaddr. get_user_pages() fills in the struct page pointers so that the kernel can reach the physical pages that back the userspace pages. Note also that those physical pages are most unlikely to be contiguous, so you must be careful to use the correct index into the pages array, counted from the page that contains uaddr.
get_user_pages() walks (and may fault in) this process's page mappings, so mmap_sem must be held for reading to keep the address space from changing while the lookup runs.
When you are done with the pages that get_user_pages() pinned, you must release them with the functions you mention: SetPageDirty() if you wrote to the page, then page_cache_release().
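The point about picking the correct page index can be made concrete with a little arithmetic. The following is a userspace sketch only, assuming 4 KiB pages; the demo_ names are hypothetical helpers and not kernel API. It shows how many pages a buffer spans, and which pages[] entry and in-page offset a given byte of the buffer falls on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. The kernel's own PAGE_SHIFT may differ. */
#define DEMO_PAGE_SHIFT 12
#define DEMO_PAGE_SIZE  ((uintptr_t)1 << DEMO_PAGE_SHIFT)

/* Number of pages touched by the byte range [uaddr, uaddr + len). */
static size_t demo_nr_pages(uintptr_t uaddr, size_t len)
{
    assert(len > 0);
    uintptr_t first = uaddr >> DEMO_PAGE_SHIFT;
    uintptr_t last  = (uaddr + len - 1) >> DEMO_PAGE_SHIFT;
    return (size_t)(last - first + 1);
}

/* Index into the pages[] array for byte number i of the buffer. */
static size_t demo_page_index(uintptr_t uaddr, size_t i)
{
    return (size_t)(((uaddr + i) >> DEMO_PAGE_SHIFT) -
                    (uaddr >> DEMO_PAGE_SHIFT));
}

/* Offset of byte number i inside its page. */
static size_t demo_page_offset(uintptr_t uaddr, size_t i)
{
    return (size_t)((uaddr + i) & (DEMO_PAGE_SIZE - 1));
}
```

For example, a 32-byte buffer starting at 0x1000ff0 spans two pages: bytes 0 to 15 live at offset 0xff0 onward in pages[0], and byte 16 is byte 0 of pages[1].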

This is not what get_user_pages is for (and no - you can't then just cast and pass uaddr into your kernel module function).
If you don't want to call copy_from_user in the calling function, then just pass a void __user * to your module function and have it do the copy_from_user.

You can only use the user pages for page-level operations, for example setting up scatter/gather DMA into userspace memory. You cannot use them to access user space directly from kernel-mode code; that is what the copy_to_user()/copy_from_user() functions are for. Unless you're moving large amounts of data, why not use those functions?

Once you have a valid user space address, use get_user_pages() to get the struct page pointer. Once you have the struct page pointer, you have to map it to a kernel virtual address with kmap() before you can access it in kernel mode. Hope that helps.

Related

Working of mmap()

I am trying to get an idea of how memory mapping takes place using the mmap system call.
So far I know that mmap takes arguments from the user and returns a logical address of where the file is stored. When the user tries to access it, this address is looked up in the map table, converted to a physical address, and the requested operation is carried out.
However, I found articles with a code example and a theoretical explanation.
They say that the memory mapping is carried out by:
A. Using system call mmap ()
B. file operations using (struct file *filp, struct vm_area_struct *vma)
What I am trying to figure out is:
How are the arguments passed in the mmap system call used in the struct vm_area_struct *vma? More generally, how are these two related?
For instance, struct vm_area_struct has fields such as the starting address, ending address, permissions, etc. How are the values sent by the user used to fill in these fields?
I am trying to write a driver, so: does the kernel fill in the fields of the structure for us, so that I simply use them to call remap_pfn_range?
And a more fundamental question: why is a separate file operation needed? The fact that mmap returns a virtual address means that it has already achieved a mapping, doesn't it?
Finally, I am not that clear about how the entire process works in user space as well as kernel space. Any documentation explaining the process in detail would be helpful.

Can we access memory through a struct page structure

Can we access memory through a struct page structure?
Note: The page belongs to high memory and has not been mapped to kernel logical address space.
Yes, we can access a page belonging to highmem through struct page's virtual field. But in your case you can't, because, as you mentioned, the highmem page is not mapped into kernel virtual memory.
To access it you need to create a mapping, either permanent or temporary.
To create a permanent mapping, map the page with kmap:
void *kmap(struct page *page)
This function works on either high or low memory. If the page structure belongs to a page in low memory, the page's virtual address is simply returned. If the page resides in high memory, a permanent mapping is created and the address is returned. The function may sleep, so kmap() works only in process context. Because the number of permanent mappings is limited (if not, we would not be in this mess and could just permanently map all memory), high memory should be unmapped when no longer needed. This is done via the following function, which unmaps the given page:
void kunmap(struct page *page)
The temporary mapping can be created via:
void *kmap_atomic(struct page *page, enum km_type type)
This function does not sleep, so it can be called in interrupt context, and the caller must not sleep while holding the mapping. It is called temporary because the next call to kmap_atomic will overwrite the previous mapping.
If the virtual field has no value, you cannot access that physical frame. The simple reason is that struct page records the mapping between physical and virtual addresses, and a system with a large amount of memory cannot map all of it into kernel space, so high memory is mapped dynamically. To access that memory it must be mapped, i.e. void *virtual must not be NULL.

Accessing user space data from linux kernel

This is an assignment problem which asks for partial implementation of process checkpointing:
The test program allocates an array, makes a system call, and passes the start and end addresses of the array to the call. In the system call function I have to save the contents of the given range to a file.
From my understanding, I could simply use the copy_from_user function to save the contents of the given range. However, since the assignment is based on the topic "Process address space", I probably need to walk through the page tables. Say I manage to get the struct pages that correspond to the given range. How do I get the data corresponding to the pages?
Can I just use the page_to_virt function and access the data directly?
Since the array is contiguous in virtual space, I guess I just need to translate the starting address to a page and then back to a virtual address, and then copy the range's worth of data to the file. Is that right?
I think copy_from_user() is fine, and nothing else is needed. When the system call executes, it traps into kernel space, but the context is still that of the process making the system call, and the kernel still uses that process's page table. So just use copy_from_user().
OK, if you want to do this experiment, I think you can use the void __user *vaddr to traverse the page table starting at mm->pgd, using pgd_offset/pud_offset/pmd_offset/pte_offset to get the page's physical address (page-size aligned). Then, in kernel space, use ioremap() to create a kernel mapping of that page (note that ioremap() is meant for device memory; for ordinary RAM pages, kmap() on the struct page is the usual choice). The kernel virtual address of the page plus the offset inside the page gives you the start address of the array, and the kernel can then access the array through that address.
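To illustrate what the pgd_offset/pud_offset/pmd_offset/pte_offset walk described above is indexing, here is a userspace sketch of how a virtual address splits into per-level table indices. It assumes x86-64 with 4-level paging and 4 KiB pages (9 index bits per level, 12 offset bits); other architectures and configurations use different shifts. The demo_ names are hypothetical and not kernel API.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: x86-64, 4-level paging, 4 KiB pages. */
#define DEMO_PAGE_SHIFT     12
#define DEMO_PMD_SHIFT      21
#define DEMO_PUD_SHIFT      30
#define DEMO_PGDIR_SHIFT    39
#define DEMO_PTRS_PER_TABLE 512

struct demo_walk {
    unsigned pgd;    /* index into the top-level table */
    unsigned pud;    /* index into the second-level table */
    unsigned pmd;    /* index into the third-level table */
    unsigned pte;    /* index into the last-level table */
    unsigned offset; /* byte offset inside the 4 KiB page */
};

/* Split a user-space virtual address into its table indices. */
static struct demo_walk demo_split_vaddr(uint64_t vaddr)
{
    struct demo_walk w;

    /* Under the stated assumptions, user addresses fit in 47 bits. */
    assert((vaddr >> 47) == 0);

    w.pgd    = (unsigned)((vaddr >> DEMO_PGDIR_SHIFT) & (DEMO_PTRS_PER_TABLE - 1));
    w.pud    = (unsigned)((vaddr >> DEMO_PUD_SHIFT)   & (DEMO_PTRS_PER_TABLE - 1));
    w.pmd    = (unsigned)((vaddr >> DEMO_PMD_SHIFT)   & (DEMO_PTRS_PER_TABLE - 1));
    w.pte    = (unsigned)((vaddr >> DEMO_PAGE_SHIFT)  & (DEMO_PTRS_PER_TABLE - 1));
    w.offset = (unsigned)(vaddr & (((uint64_t)1 << DEMO_PAGE_SHIFT) - 1));
    return w;
}
```

Only the offset field carries over unchanged to the physical address; every time the array crosses a page boundary the pte index changes, and the physical page found there need not be adjacent to the previous one. This is why translating only the starting address is not enough for a range that spans several pages.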

What happens when I printk a char * that was initialized in userspace?

I implemented a new system call as an intro exercise. All it does is take in a buffer and printk that buffer. I later learned that the correct practice would be to use copy_from_user.
Is this just a precautionary measure to validate the address, or is my system call causing some error (page fault?) that I cannot see?
If it is just a precautionary measure, what is it protecting against?
Thanks!
There are several reasons.
Some architectures employ segmented memory, where there is a separate segment for the user memory. In that case, copy_from_user is essential to actually get the right memory address.
The kernel has access to everything, including (almost by definition) a lot of privileged information. Not using copy_from_user could allow information disclosure if a user passes in a kernel address. Worse, if you are writing to a user-supplied buffer without copy_to_user, the user could overwrite kernel memory.
You'd like to prevent the user from crashing the kernel module just by passing in a bad pointer; using copy_from_user protects against faults so e.g. a system call handler can return EFAULT in response to a bad user pointer.
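The last point can be observed from user space with an existing system call. The sketch below, which assumes Linux and uses a hypothetical helper name, hands write() a buffer inside a PROT_NONE mapping. The kernel's user-copy fails when it touches the buffer, and the call reports an error instead of crashing.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Write 16 bytes from an inaccessible buffer into a pipe.
 * Returns the errno reported by write(), 0 if write() unexpectedly
 * succeeded, or -1 if the setup itself failed.
 */
static int demo_write_bad_pointer(void)
{
    int fds[2];
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);

    if (pipe(fds) != 0)
        return -1;

    /* A valid user-range address that may be neither read nor written. */
    void *bad = mmap(NULL, (size_t)page, PROT_NONE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (bad == MAP_FAILED) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    errno = 0;
    ssize_t n = write(fds[1], bad, 16);
    int err = (n < 0) ? errno : 0;

    munmap(bad, (size_t)page);
    close(fds[0]);
    close(fds[1]);
    return err;
}
```

A pipe is used rather than /dev/null because the write has to actually read the user buffer for the fault to occur.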

Change user space memory protection flags from kernel module

I am writing a kernel module that has access to a particular process's memory. I have done an anonymous mapping on some of the user space memory with do_mmap():
#define MAP_FLAGS (MAP_PRIVATE | MAP_FIXED | MAP_ANONYMOUS)
prot = PROT_WRITE;
retval = do_mmap(NULL, vaddr, vsize, prot, MAP_FLAGS, 0);
vaddr and vsize are set earlier, and the call succeeds. After I write to that memory block from the kernel module (via copy_to_user), I want to remove the PROT_WRITE permission on it (like I would with mprotect in normal user space). I can't seem to find a function that will allow this.
I attempted unmapping the region and remapping it with the correct protections, but that zeroes out the memory block, erasing all the data I just wrote; setting MAP_UNINITIALIZED might fix that, but, from the man pages:
MAP_UNINITIALIZED (since Linux 2.6.33)
Don't clear anonymous pages. This flag is intended to improve performance on embedded
devices. This flag is only honored if the kernel was configured with the
CONFIG_MMAP_ALLOW_UNINITIALIZED option. Because of the security implications, that option
is normally enabled only on embedded devices (i.e., devices where one has complete
control of the contents of user memory).
so, while that might do what I want, it wouldn't be very portable. Is there a standard way to accomplish what I've suggested?
After some more research, I found a function called get_user_pages() (best documentation I've found is here) that returns a list of pages from userspace at a given address that can be mapped to kernel space with kmap() and written to that way (in my case, using kernel_read()). This can be used as a replacement for copy_to_user() because it allows forcing write permissions on the pages retrieved. The only drawback is that you have to write page by page, instead of all in one go, but it does solve the problem I described in my question.
In userspace there is a system call, mprotect, that can modify the protection flags on an existing mapping. You probably need to follow the implementation of that system call, or maybe simply call it directly from your code. See mm/mprotect.c.
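For reference, this is what the userspace side of that suggestion looks like. It is a minimal sketch, assuming Linux and a hypothetical helper name; it does not show how to make the equivalent change from inside a kernel module. It only demonstrates that mprotect() drops the write permission on an existing anonymous mapping without zeroing its contents, unlike unmapping and remapping.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Map one anonymous page read/write, copy msg into it, then remove
 * write permission with mprotect().
 * Returns 1 if the data is still intact afterwards, 0 if it is not,
 * and -1 if mmap() or mprotect() failed.
 */
static int demo_write_then_protect(const char *msg)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0 && strlen(msg) < (size_t)page);

    char *p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    strcpy(p, msg);

    /* Same mapping, same contents, write permission removed. */
    if (mprotect(p, (size_t)page, PROT_READ) != 0) {
        munmap(p, (size_t)page);
        return -1;
    }

    int intact = (strcmp(p, msg) == 0);
    munmap(p, (size_t)page);
    return intact;
}
```

After the mprotect() call the page is still readable, so the comparison is safe; a store to p at that point would raise SIGSEGV.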

Resources