Linux User Process Context to Access User virtual memory - memory-management

Say I have the user context data stored behind a kernel memory pointer, and I also have a user-space char * pointer. I then create a kernel thread and hand it both pointers. Can the thread access the user-space data through that pointer? I can access it inside the system call, but can I also access it from a kernel thread? What about from a workqueue?
Say my userprocess calls a system call
//User Application
char* abc = "This is data.";
syscall(340, abc);
in syscall handler
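void sys_340(void* p) {
    th = kthread_run(kt_func, p, "kth"); // kthread_run(threadfn, data, namefmt)
    //might also store process context as I am in system call!! How?
}
int kt_func(void *p) {
    // p still holds the user-space address passed to the system call
    while(1){ printk("Line: %s\n", (char *)p); msleep(1000); }
}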
I want kt_func to print "This is data." once every second.

Kernel threads can access user-space memory, but a valid pointer alone is not enough: a kernel thread has no user address space of its own, so it also needs a reference to the address space (the mm) of the process that owns the pointer. As your code suggests, you want the system call to start a new kernel thread that prints something every second, and I assume the system call returns right after creating the thread. The problem is this: once the system call has returned, both the user process and the kernel thread can access the memory pointed to by p. How would you synchronize access to p? (Perhaps through another system call.)
That said, I cannot see a use case for what you are doing.

In your syscall handler, you could do something like
struct mm_struct *mm = get_task_mm(current);
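to stash away the memory mapping of the process making the system call. This takes a reference on the mm (and returns NULL if the task has none), so drop it with mmput(mm) once the kernel thread is done with it. Then later in your kernel thread you can do something like
access_remote_vm(mm, (unsigned long)p, my_kernel_buf, length, 0);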
to do the equivalent of copy_from_user() on the original task's memory.

Related

CUDA dynamic parallelism: Access child kernel results in global memory

I am currently trying my first dynamic parallelism code in CUDA. It is pretty simple. In the parent kernel I am doing something like this:
int aPayloads[32];
// Compute aPayloads start values here
int* aGlobalPayloads = nullptr;
cudaMalloc(&aGlobalPayloads, (sizeof(int) *32));
cudaMemcpyAsync(aGlobalPayloads, aPayloads, (sizeof(int)*32), cudaMemcpyDeviceToDevice);
mykernel<<<1, 1>>>(aGlobalPayloads); // Modifies data in aGlobalPayloads
cudaDeviceSynchronize();
// Access results in payload array here
Assuming that I do things right so far, what is the fastest way to access the results in aGlobalPayloads after kernel execution? (I tried cudaMemcpy() to copy aGlobalPayloads back to aPayloads but cudaMemcpy() is not allowed in device code).
You can directly access the data in aGlobalPayloads from your parent kernel code, without any copying:
mykernel<<<1, 1>>>(aGlobalPayloads); // Modifies data in aGlobalPayloads
cudaDeviceSynchronize();
int myval = aGlobalPayloads[0];
A couple of notes:
1. I'd encourage careful error checking (read the whole accepted answer here). You do it in device code the same way as in host code.
2. The programming guide states: "May not pass in local or shared memory pointers". Your usage of aPayloads is a local memory pointer.
If for some reason you want that data to be explicitly put back in your local array, you can use in-kernel memcpy for that:
memcpy(aPayloads, aGlobalPayloads, sizeof(int)*32);
int myval = aPayloads[0]; // retrieves the same value
(that is also how I would fix the issue I mention in item 2 - use in-kernel memcpy)

KSPIN_LOCK blocks when acquiring from Driver's main thread

I have a KSPIN_LOCK which is shared among a Windows driver's main thread and some threads I created with PsCreateSystemThread. The problem is that the main thread blocks if I try to acquire the spinlock and doesn't unblock. I'm very confused as to why this happens.. it's probably somehow connected to the fact that the main thread runs at driver IRQL, while the other threads run at PASSIVE_LEVEL as far as I know.
NOTE: If I only run the main thread, acquiring/releasing the lock works just fine.
NOTE: I'm using the functions KeAcquireSpinLock and KeReleaseSpinLock to acquire/release the lock.
Here's my checklist for a "stuck" spinlock:
Make sure the spinlock was initialized with KeInitializeSpinLock. If the KSPIN_LOCK holds uninitialized garbage, then the first attempt to acquire it will likely spin forever.
Check that you're not acquiring it recursively/nested. KSPIN_LOCK does not support recursion, and if you try it, it will spin forever.
Normal spinlocks must be acquired at IRQL <= DISPATCH_LEVEL. If you need something that works at DIRQL, check out [1] and [2].
Check for leaks. If one processor acquires the spinlock, but forgets to release it, then the next processor will spin forever when trying to acquire the lock.
Ensure there are no memory-safety issues. If code randomly writes a non-zero value on top of the spinlock, that'll cause it to appear to be acquired, and the next acquisition will spin forever.
Some of these issues can be caught easily and automatically with Driver Verifier; use it if you're not using it already. Other issues can be caught if you encapsulate the spinlock in a little helper that adds your own asserts. For example:
typedef struct _MY_LOCK {
    KSPIN_LOCK Lock;
    ULONG OwningProcessor;
    KIRQL OldIrql;
} MY_LOCK;
void MyInitialize(MY_LOCK *lock) {
    KeInitializeSpinLock(&lock->Lock);
    lock->OwningProcessor = (ULONG)-1;
}
void MyAcquire(MY_LOCK *lock) {
    ULONG current = KeGetCurrentProcessorIndex();
    NT_ASSERT(KeGetCurrentIrql() <= DISPATCH_LEVEL);
    NT_ASSERT(current != lock->OwningProcessor); // check for recursion
    KeAcquireSpinLock(&lock->Lock, &lock->OldIrql);
    NT_ASSERT(lock->OwningProcessor == (ULONG)-1); // check lock was inited
    lock->OwningProcessor = current;
}
void MyRelease(MY_LOCK *lock) {
    NT_ASSERT(KeGetCurrentProcessorIndex() == lock->OwningProcessor);
    lock->OwningProcessor = (ULONG)-1;
    KeReleaseSpinLock(&lock->Lock, lock->OldIrql);
}
Wrappers around KSPIN_LOCK are common. The KSPIN_LOCK is like a race car that has all the optional features stripped off to maximize raw speed. If you aren't counting microseconds, you might reasonably decide to add back the heated seats and FM radio by wrapping the low-level KSPIN_LOCK in something like the above. (And with the magic of #ifdefs, you can always take the airbags out of your retail builds, if you need to.)

how to print debug from both user-space and kernel-space

I am learning embedded systems.
I need to print debug info on the console from both a user-space daemon and kernel space. I use printf in user space and printk(KERN_CRIT) in kernel space.
However, the output is interleaved and out of order. I guess KERN_CRIT messages are printed very quickly. Is there a clean way to do this?
Thanks so much.
ftrace can resolve your problem.
In linux kernel, you can use "trace_printk" instead of "printk" to log the information, and at the same time in user space you can write the log to the file "trace_marker".
For kernel space:
#include <linux/kernel.h>
...
trace_printk("Hello, kernel trace printk !\n");
...
For user space
...
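#include <fcntl.h>
#include <stdarg.h>
#include <stdio.h>
#include <unistd.h>
int trace_fd = -1;
void trace_write(const char *fmt, ...)
{
    va_list ap;
    char buf[256];
    int n;
    if (trace_fd < 0)
        return;
    va_start(ap, fmt);
    n = vsnprintf(buf, sizeof(buf), fmt, ap);
    va_end(ap);
    if (n < 0)
        return;
    if (n >= (int)sizeof(buf))
        n = sizeof(buf) - 1; /* the message was truncated */
    write(trace_fd, buf, n);
}
...
/* trace_marker lives in the tracefs mount, usually /sys/kernel/tracing
   (or /sys/kernel/debug/tracing on older setups) */
trace_fd = open("/sys/kernel/tracing/trace_marker", O_WRONLY);
trace_write("Hello, trace in user space \n");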
...
You can find detailed information about ftrace in the Linux kernel source tree, under Documentation/trace/ftrace.txt (ftrace.rst in newer kernels).
There are also some introductions to ftrace; please focus on trace_printk and trace_marker:
Debugging the kernel using Ftrace - part 1
Debugging the kernel using Ftrace - part 2
This seems like a problem of synchronising between user and kernel space. Two solutions come to mind.
First, create a debugfs or sysfs interface which holds just one value representing a binary semaphore. Before printing, user program and kernel each will first "down" the value in debugfs or sysfs file. After printing it will "up" it. This can be achieved via wrapper function or macro.
Second, create a debugfs interface. Kernel will always send its logs to that interface rather than printk them. A user space daemon can constantly check that debugfs file. The user program wanting to print will also send its logs to the user space daemon. The daemon can use appropriate synchronisation mechanism like mutex, to ensure that logs never overlap.

How to send Notification from Kernel to user space application using SYSFS

I'm working on a USB ACM driver where I need to send a notification from kernel space to a user-space application in order to invoke a callback function. I'm not very familiar with using kernel-to-user interfaces in code. How well can sysfs help in this scenario? Please share some sample code that uses sysfs so that I can get an idea of how to implement it in my code; I could not find any. Also, please tell me about any other easy way to achieve kernel-to-user-space notification. Thanks in advance.
My suggestion would be to create a sysfs interface to your kernel driver that userspace can access. Each sysfs attribute if created with the correct properties can be read like a file from userspace. You can then use the poll function from userspace to poll for an action on that file. To trigger this action from Kernel space, you can use the sysfs_notify function on your attribute and it will cause your userspace code to wake up. Here is how I would do it
Kernel
1. Create Kobject or attach an attribute to a previous kobject
2. When you want to signal userspace call sysfs_notify on the kobject and attribute
Userspace
1. Create a new thread that will block while waiting for the sysfs_notify
2. Open the sysfs attribute from this thread
3. poll the attribute; once sysfs_notify is called from the kernel, it will unblock your poll
4. Call your event handling function
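The userspace steps above can be sketched roughly as follows. This is only a sketch under stated assumptions, not the driver's actual interface: the names open_sysfs_attr and wait_for_sysfs_event are hypothetical, and the attribute path is whatever your driver exposes. It assumes the kernel side calls sysfs_notify() on the attribute, which a poller sees as POLLPRI | POLLERR. The attribute has to be read once before polling, and re-read from offset 0 afterwards to pick up the new value.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <stdio.h>
#include <unistd.h>

/* Open the attribute read-only; returns the descriptor or -1 on failure. */
int open_sysfs_attr(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        perror(path);
    return fd;
}

/*
 * Block until the attribute behind `fd` is signalled or `timeout_ms`
 * elapses (-1 waits forever). Returns the number of bytes of the new value
 * copied into `buf`, 0 on timeout, or -1 on error.
 */
ssize_t wait_for_sysfs_event(int fd, char *buf, size_t len, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLPRI | POLLERR };
    char scratch[64];
    int ready;

    assert(buf != NULL && len > 0);

    /* Consume the current value first so only later changes wake poll(). */
    if (lseek(fd, 0, SEEK_SET) < 0)
        return -1;
    while (read(fd, scratch, sizeof(scratch)) > 0)
        ;

    /* Retry if a signal handler interrupts the wait. */
    do {
        ready = poll(&pfd, 1, timeout_ms);
    } while (ready < 0 && errno == EINTR);
    if (ready <= 0)
        return ready;   /* 0: timed out, -1: poll failed */

    /* Rewind and read the value the kernel just flagged as changed. */
    if (lseek(fd, 0, SEEK_SET) < 0)
        return -1;
    return read(fd, buf, len);
}
```

The waiting thread would open the attribute once with open_sysfs_attr(), then call wait_for_sysfs_event(fd, buf, sizeof(buf), -1) in a loop and run the event handler each time it returns a positive length. Keeping the descriptor open across iterations avoids reopening the attribute for every wait.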
Another alternative would be to use eventfd. You create eventfd, pass the integer file descriptor to kernel space (e.g. through sysfs, or ioctl), then convert the file descriptor you got from user space to eventfd_ctx in kernel space. This is it - you have your notification channel.
User space
#include <sys/eventfd.h>
int efd = eventfd(0, 0);
// pass 'efd' to kernel space
Kernel space
#include <linux/eventfd.h>
// Set up
int efd = /* get from user space - sysfs/ioctl/write/... */;
struct eventfd_ctx* efd_ctx = eventfd_ctx_fdget(efd);
// Signal user space
eventfd_signal(efd_ctx, 1);
// Tear down
eventfd_ctx_put(efd_ctx);

Shared Memory between User Space and Kernel Threads

I am developing a kernel application that involves kthreads. I create an array of structures and allocate its memory with malloc in user space. Then I call a system call (which I implemented) and pass the address of the array to kernel space. In the system call handler I create 2 kthreads that monitor the array. A kthread can change some values, and user-space threads can also change some values. The idea is to use the array as shared memory. But when I access the memory in kernel space (using copy_from_user), the data has somehow changed. I can verify that the address in the kernel is the same as the one that was assigned in user space, yet copy_from_user returns what look like garbage values.
Also is the following statement ok?
int kthread_run_function(void* data){
struct entry tmp;
copy_from_user(&tmp, data, sizeof(struct entry));
}
This is not OK because copy_from_user() copies from the current user process (which should be obvious, since there's no way to tell it which user process to copy from).
In a syscall invoked by your userspace process this is OK, because the current process is your userspace process. However, within the kernel thread the current process could be any other process on the system - so you're copying from a random process's memory, which is why you get garbage.
If you want to share memory between the kernel and a userspace process, the right way to do this is to have the kernel allocate it, then allow the userspace process to map it into its address space with mmap(). The kernel thread and the userspace process will use different pointers to refer to the memory region - the kernel thread will use a pointer to the memory allocated within the kernel address space, and the userspace process will use a pointer to the memory region returned by mmap().
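The user-space half of that arrangement might look like the sketch below. It is a sketch under assumptions, not a complete solution: the function name map_shared_region is hypothetical, and so is any device node you pass to it. It also assumes the kernel side (not shown) has allocated the buffer and provides an mmap handler for that node.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Map `len` bytes exported by the file or device node at `path` into this
 * process. Returns the mapping, or NULL on failure.
 */
void *map_shared_region(const char *path, size_t len)
{
    int fd;
    void *region;

    assert(path != NULL && len > 0);

    fd = open(path, O_RDWR);
    if (fd < 0)
        return NULL;

    /* MAP_SHARED makes stores visible to the other side of the mapping. */
    region = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);   /* the mapping stays valid after the descriptor is closed */
    return region == MAP_FAILED ? NULL : region;
}
```

The process then reads and writes the array through the returned pointer, while the kernel threads use the kernel's own pointer to the same pages. Access from both sides still needs some synchronization scheme, and the mapping should be released with munmap() when it is no longer needed.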
No, generally it's not OK since data is kernel virtual address, not a user virtual address.
However, IFF you called kthread_create with the data argument equal to an __user pointer, this should be ok.
