How to distinguish between CPU and GPU virtual address within Linux kernel module? - linux-kernel

Within a user space program, I can easily tell whether a pointer ptr targets CPU memory (allocated with regular memory allocations) or GPU memory (allocated with cudaMalloc) using the following code.
CUDA_POINTER_ATTRIBUTE_P2P_TOKENS tokens;
CUresult status = cuPointerGetAttribute(&tokens, CU_POINTER_ATTRIBUTE_P2P_TOKENS, (CUdeviceptr)(uintptr_t)ptr);
if (CUDA_SUCCESS == status) {
    /* ptr refers to GPU memory */
}
How can a Linux kernel module perform the same test, given a user-space virtual address?
Note that my kernel module uses the NVIDIA driver for RDMA; for example, it calls the function nvidia_p2p_get_pages. As this function takes a lot of time, I'd like to check, before calling it, whether the virtual address ptr actually points to GPU memory.

Related

Accessing a page from device memory in userspace using linux

I have device memory mapped to a kernel virtual address via ioremap. Userspace needs to access a page at offset x within this device memory.
The way I can achieve this right now is by calling mmap in userspace and implementing a small memory mapping on the driver side.
Is there any way to use the offset (let's assume the kernel passes the offset to userspace) and achieve the same thing without creating any mapping on the driver side?
Can ioremapped kernel virtual addresses be used here?

How to mmap a file in linux kernel space?

I am trying to mmap a file in a Linux kernel module. I have tried the function do_mmap_pgoff, but the address it returns is a virtual address in the current process's user space, i.e., below the kernel boundary. Instead, I want to map the file into kernel space and get the kernel virtual address of the mapped region. Is there any kernel API in Linux that supports this operation? Thanks

Shared Memory between User Space and Kernel Threads

I am developing a kernel application which involves kthreads. I create an array of structures and allocate its memory using malloc in user space. Then I call a system call (which I implemented) and pass the address of the array to kernel space. In the system call handler I create 2 kthreads which will monitor the array. A kthread can change some values, and user-space threads can also change some values. The idea is to use the array as shared memory. But when I access the memory in kernel space (using copy_from_user), the data has somehow changed. I can verify that the address is the same in the kernel as when it was assigned, but copy_from_user returns what look like garbage values.
Also is the following statement ok?
int kthread_run_function(void *data)
{
    struct entry tmp;

    copy_from_user(&tmp, data, sizeof(struct entry));
    return 0;
}
This is not OK because copy_from_user() copies from the current user process (which should be obvious, since there's no way to tell it which user process to copy from).
In a syscall invoked by your userspace process this is OK, because the current process is your userspace process. However, within the kernel thread the current process could be any other process on the system - so you're copying from a random process's memory, which is why you get garbage.
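The point that a user pointer only makes sense in the address space of the process that owns it can be illustrated from user space. The following is a minimal sketch, not kernel code: the function name address_is_per_process is made up for illustration, and fork() stands in for "some other process". Parent and child use the same virtual address, yet a write in the child is not seen by the parent.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if a write made by a child process through
 * the same virtual address is NOT visible in the parent, -1 on error. */
int address_is_per_process(void)
{
    static volatile int value = 1;  /* same virtual address in parent and child */
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;                  /* fork failed */
    if (pid == 0) {
        value = 2;                  /* changes the child's private copy only */
        _exit(0);
    }
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return value == 1;              /* the parent still sees its own value */
}
```

A kernel thread calling copy_from_user() is in a similar position: the numeric address is unchanged, but it is resolved against whatever address space happens to be active, not the one the pointer came from.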
If you want to share memory between the kernel and a userspace process, the right way to do this is to have the kernel allocate it, then allow the userspace process to map it into its address space with mmap(). The kernel thread and the userspace process will use different pointers to refer to the memory region - the kernel thread will use a pointer to the memory allocated within the kernel address space, and the userspace process will use a pointer to the memory region returned by mmap().
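As a rough user-space analogy of the shared-mapping idea, the sketch below shares one struct between two processes through mmap(). It is only an illustration under stated assumptions: the anonymous MAP_SHARED mapping stands in for the buffer a driver would allocate, the field in struct entry is invented, and in the real setup user space would instead call mmap() on the driver's device file, with the driver providing an mmap file operation.

```c
#define _DEFAULT_SOURCE             /* for MAP_ANONYMOUS on glibc */
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

struct entry { int value; };        /* field is hypothetical */

/* Returns the value the parent reads after the child wrote 42 through
 * the shared mapping, or -1 on error. */
int shared_mapping_demo(void)
{
    int status;
    int result = -1;
    pid_t pid;
    struct entry *shared = mmap(NULL, sizeof(*shared),
                                PROT_READ | PROT_WRITE,
                                MAP_SHARED | MAP_ANONYMOUS, -1, 0);

    if (shared == MAP_FAILED)
        return -1;
    shared->value = 0;

    pid = fork();
    if (pid == 0) {
        shared->value = 42;         /* write goes to the shared page */
        _exit(0);
    }
    if (pid > 0 && waitpid(pid, &status, 0) == pid)
        result = shared->value;     /* the parent observes the child's write */

    munmap(shared, sizeof(*shared));
    return result;
}
```

Each side may see the region at a different virtual address; what is shared is the underlying memory, not the pointer value.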
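No, generally it's not OK: the data argument of a kernel thread is normally a kernel virtual address, not a user virtual address.
Even if you called kthread_create with the data argument equal to a __user pointer, copy_from_user() would only be meaningful while the thread runs with the address space of the process that owns that pointer, which a kernel thread does not have by default.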

Somewhat newb question about assy and the heap

Ultimately I am just trying to figure out how to dynamically allocate heap memory from within assembly.
If I call Linux sbrk() from assembly code, can I use the address returned as I would use the address of a statically declared chunk of memory (i.e., one in the .data section of my program listing)?
I know Linux uses the hardware MMU if present, so I am not sure if what sbrk returns is a 'raw' pointer to real RAM, or is it a cooked pointer to RAM that may be modified by Linux's VM system?
I read this: How are sbrk/brk implemented in Linux?. I suspect I can not use the return value from sbrk() without worry: the MMU fault on access-non-allocated-address must cause the VM to alter the real location in RAM being addressed. Thus assy, not linked against libc or what-have-you, would not know the address has changed.
Does this make sense, or am I out to lunch?
Unix user processes live in virtual memory, no matter whether they are written in assembler or Fortran, and should not care about physical addresses. That is the kernel's business: the kernel sets up and manages the MMU. You don't have to worry about it. Page faults are handled automatically and transparently.
sbrk(2) returns a virtual address specific to the process, if that's what you were asking.
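A small C sketch of the same idea (the equivalent assembly would issue the brk system call directly): the address returned by sbrk() is used like any other buffer, and any page fault on first touch is resolved by the kernel without the program noticing. The function name sbrk_roundtrip is made up for illustration.

```c
#define _DEFAULT_SOURCE             /* exposes sbrk() on glibc */
#include <stdint.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: grow the heap by `size` bytes and check that the new
 * region behaves like ordinary memory.
 * Returns 1 if the bytes read back correctly, 0 if not, -1 if sbrk failed. */
int sbrk_roundtrip(size_t size)
{
    /* sbrk returns the old program break, i.e. the start of the new region. */
    unsigned char *block = sbrk((intptr_t)size);
    size_t i;

    if (block == (void *)-1)
        return -1;

    memset(block, 0xAB, size);      /* first touch may fault; the kernel handles it */
    for (i = 0; i < size; i++)
        if (block[i] != 0xAB)
            return 0;
    return 1;                       /* region is left allocated */
}
```

For example, sbrk_roundtrip(4096) is expected to return 1 on a system where the break can be moved.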

User to kernel mode big picture?

I have to implement a char device as an LKM (loadable kernel module).
I know some basics about operating systems, but I feel I don't have the big picture.
In a C program, when I make a syscall, what I think happens is that the CPU switches to ring 0, goes to the syscall vector, and jumps to a function in kernel memory space that handles it. (I think it executes int 0x80 with the offset into the syscall vector in eax, but I'm not sure.)
Then I'm in the syscall itself, but I guess that for the kernel it is still the same process as before, only now in kernel mode; I mean the current PCB is that of the process that made the syscall.
So far, so good? Correct me if something is wrong.
Other questions: how can I read/write process memory?
If in the syscall handler I refer to an address, say 0xbfffffff, what does that address mean? Is it a physical one? Some kernel virtual one?
To read/write user-space memory from the kernel, you need to use functions such as get_user or __copy_to_user.
See the User Space Memory Access API of the Linux Kernel.
Your own code can never run in ring 0 from a regular process; a system call only switches to kernel code.
To run your own code in ring 0, you'll have to write a kernel module.
And you never have to deal with physical addresses: 0xbfffffff represents an address in the virtual address space of your process.
Big picture:
Everything happens in assembly. On Intel CPUs there is a set of privileged instructions which can only be executed in ring 0 (http://en.wikipedia.org/wiki/Privilege_level). To make the transition into ring 0, you can use the "int" or "sysenter" instruction:
what all happens in sysenter instruction is used in linux?
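From C, that transition is normally hidden behind libc. As a hedged illustration, the sketch below issues a system call by number through the libc syscall() function and compares the result with the ordinary wrapper; which trap instruction libc actually uses (int 0x80, sysenter, or syscall) depends on the architecture and libc build. The function name raw_getpid_matches_wrapper is made up for illustration.

```c
#define _GNU_SOURCE                 /* for syscall() on glibc */
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the system call made by number and the
 * libc wrapper getpid() report the same process ID. */
int raw_getpid_matches_wrapper(void)
{
    /* syscall() loads the call number and arguments into registers and
     * executes the trap instruction; the kernel runs the handler on behalf
     * of the calling process and returns the result. */
    long raw = syscall(SYS_getpid);

    return raw == (long)getpid();
}
```

Both paths end up in the same kernel handler, running in the context of the calling process, which is why the two results agree.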
Then, inside ring 0 (which is your kernel mode), memory accesses require the privilege levels to match, as determined by the DPL/CPL/RPL attribute bits associated with the segment registers and descriptors:
http://duartes.org/gustavo/blog/post/cpu-rings-privilege-and-protection/
You may ask how the CPU initializes memory and registers in the first place: at boot, an x86 CPU runs in real mode, which is unprotected (there is no ring concept), so everything is possible and a lot of setup work is done there.
As for virtual vs. non-virtual (physical) memory addresses: just remember that anything in a register used for memory addressing is always a virtual address (if the MMU is set up and protected mode is enabled). Look at the picture here (notice that anything coming from the CPU is a virtual address; only the memory bus sees physical addresses):
http://en.wikipedia.org/wiki/Memory_management_unit
As for memory separation between userspace and kernel, you can read here:
http://www.inf.fu-berlin.de/lehre/SS01/OS/Lectures/Lecture14.pdf

Resources