When the mmap callback of a struct file_operations is invoked in a Linux kernel module, can we assume that vma->vm_mm->mmap_sem is already held before the callback runs?
Or do we have to explicitly call down_write(&vma->vm_mm->mmap_sem) before doing remap_pfn_range?
The mmap file operation handler should assume the mmap lock is already write-locked when it is called. The mmap file handler is called via call_mmap() via mmap_region() via do_mmap(), and the following comment appears before the do_mmap() function in "mm/mmap.c":
/*
* The caller must write-lock current->mm->mmap_lock.
*/
N.B. The mmap lock was renamed from mmap_sem to mmap_lock in Linux kernel 5.8. The corresponding comment in the 5.7 kernel is:
/*
* The caller must hold down_write(&current->mm->mmap_sem).
*/
do_mmap() is called via do_mmap_pgoff() (in "include/linux/mm.h") via vm_mmap_pgoff() (in "mm/util.c") via ksys_mmap_pgoff() (in "mm/mmap.c") via the mmap_pgoff() syscall handler (in "mm/mmap.c"). (N.B. From kernel version 5.9 onwards, do_mmap_pgoff() is eliminated and do_mmap() is called directly from vm_mmap_pgoff().) The mmap lock is write-locked in vm_mmap_pgoff().
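As a rough sketch of what this means for a driver, the handler can call remap_pfn_range() directly and must not take the mmap lock itself. The names my_mmap and my_fops are hypothetical, and the fragment assumes that vma->vm_pgoff already carries the page frame number to map. It only builds inside a kernel tree, not as a user-space program:

```c
#include <linux/fs.h>
#include <linux/mm.h>
#include <linux/module.h>

/* Hypothetical mmap handler; called with the mmap lock held for writing. */
static int my_mmap(struct file *filp, struct vm_area_struct *vma)
{
    unsigned long size = vma->vm_end - vma->vm_start;

    /*
     * No down_write()/mmap_write_lock() here: vm_mmap_pgoff() already
     * holds the lock for writing, and taking it again in the same task
     * would deadlock.
     *
     * Assumption: vm_pgoff is a valid PFN for this device. A real driver
     * must validate the offset and size before mapping anything.
     */
    return remap_pfn_range(vma, vma->vm_start, vma->vm_pgoff,
                           size, vma->vm_page_prot);
}

static const struct file_operations my_fops = {
    .owner = THIS_MODULE,
    .mmap  = my_mmap,
};
```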
We can use a mutex lock in the POSIX API as follows:
/* acquire the mutex lock */
pthread_mutex_lock(&mutex);
/* critical section */
/* release the mutex lock */
pthread_mutex_unlock(&mutex);
Does the POSIX API put a waiting thread to sleep? Where is the waiting queue? Is the waiting queue not visible to the user?
Does the POSIX API put a waiting thread to sleep?
The POSIX API is just an API; it can be implemented in different ways.
In Linux, the POSIX Threads library uses futexes to implement mutexes. When a mutex is contended, the pthread implementation will use the futex(2) syscall to request intervention from the kernel, which puts to sleep or wakes up threads as needed. So yes, threads can definitely be put to sleep when calling pthread_mutex_lock().
One thing to note, as the Wikipedia article suggests, is that a properly programmed futex-based lock does not use system calls except when the lock is contended. This is exactly the case for the POSIX Threads library, so you may have perfectly functioning and synchronized programs using threads that never issue futex(2) syscalls.
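To make the fast-path idea concrete, here is a deliberately simplified sketch of a futex-style lock. It is not glibc's implementation: sketch_mutex, sketch_lock() and the slow_paths counter are made-up names, and the slow path calls sched_yield() where a real lock would call futex(2) with FUTEX_WAIT. The only point is that an uncontended lock/unlock pair is a couple of atomic operations in user space and never reaches the slow path.

```c
#include <sched.h>
#include <stdatomic.h>

/* Hypothetical lock: state is 0 when free and 1 when held. */
struct sketch_mutex {
    atomic_int state;
    atomic_int slow_paths;  /* times a real lock would have entered the kernel */
};

static void sketch_lock(struct sketch_mutex *m)
{
    int expected = 0;

    /* Fast path: one compare-and-swap in user space, no system call. */
    if (atomic_compare_exchange_strong(&m->state, &expected, 1))
        return;

    /* Slow path: the lock is contended. */
    atomic_fetch_add(&m->slow_paths, 1);
    do {
        sched_yield();      /* stand-in for futex(FUTEX_WAIT) */
        expected = 0;
    } while (!atomic_compare_exchange_strong(&m->state, &expected, 1));
}

static void sketch_unlock(struct sketch_mutex *m)
{
    /* A real lock would also issue FUTEX_WAKE if there were waiters. */
    atomic_store(&m->state, 0);
}

/* Lock and unlock repeatedly from one thread; report slow-path entries. */
int uncontended_slow_paths(int iterations)
{
    struct sketch_mutex m;
    int i;

    atomic_init(&m.state, 0);
    atomic_init(&m.slow_paths, 0);
    for (i = 0; i < iterations; i++) {
        sketch_lock(&m);
        sketch_unlock(&m);
    }
    return atomic_load(&m.slow_paths);
}
```

Called from a single thread, uncontended_slow_paths(1000) should return 0, because no lock attempt ever finds the mutex already taken.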
Where is the waiting queue? Is the waiting queue not visible to the user?
Since mutexes are based on futexes, and futex contention is ultimately handled by the kernel, the waiting queue resides in kernel space and is not visible from user space. You can see the implementation in the file kernel/futex.c of the Linux kernel source code (in kernel 5.16 and later the code lives under the kernel/futex/ directory).
I have registered a threaded interrupt handler as below:
ret = devm_request_threaded_irq(dev,
                                data->irq,
                                &abc_handle_irq,
                                &abc_thread_irq,
                                IRQF_SHARED,
                                DEVICE_NAME,
                                (void *)pdev);
abc_thread_irq() is the thread function, which acts as the bottom half in this mechanism. This thread shares a resource with process context and hence needs a lock. Question 1: can this lock be a mutex (a struct mutex taken with mutex_lock()), or must it be taken with spin_lock_bh()? I am looking at code that uses mutex_lock(), which to my mind is not OK: it can put the bottom half to sleep, and if the bottom-half thread is running in atomic context, that would in turn cause a kernel panic. The threaded interrupt mechanism is new to me, so I need help.
Thanks.
I've started to learn Linux driver programs, but I'm finding it a little difficult.
I've been studying the i2c driver, and I got quite confused regarding the entry point of the driver program. Does the driver program start at the module_init() macro?
I'd also like to know how I can follow the way the driver program runs. I got the book Linux Device Drivers, but I'm still quite confused. Could you help me? Thanks a lot.
I'll take the i2c driver as an example. There are so many functions in it; how can I work out how the functions in the i2c drivers relate to one another?
A device driver is not a "program" that has a main() with a start point and an exit point. It's more like an API or a library or a collection of routines. In this case, it's a set of entry points declared by module_init(), module_exit(), perhaps EXPORT_SYMBOL(), and structures that list entry points for operations.
For block devices, the driver is expected to provide the list of operations it can perform by declaring its functions for those operations in (from include/linux/blkdev.h):
struct block_device_operations {
    int (*open) ();
    int (*release) ();
    int (*ioctl) ();
    int (*compat_ioctl) ();
    int (*direct_access) ();
    unsigned int (*check_events) ();
    /* ->media_changed() is DEPRECATED, use ->check_events() instead */
    int (*media_changed) ();
    void (*unlock_native_capacity) ();
    int (*revalidate_disk) ();
    int (*getgeo)();
    /* this callback is with swap_lock and sometimes page table lock held */
    void (*swap_slot_free_notify) ();
    struct module *owner;
};
For char devices, the driver is expected to provide the list of operations it can perform by declaring its functions for those operations in (from include/linux/fs.h):
struct file_operations {
    struct module *owner;
    loff_t (*llseek) ();
    ssize_t (*read) ();
    ssize_t (*write) ();
    ssize_t (*aio_read) ();
    ssize_t (*aio_write) ();
    int (*readdir) ();
    unsigned int (*poll) ();
    long (*unlocked_ioctl) ();
    long (*compat_ioctl) ();
    int (*mmap) ();
    int (*open) ();
    int (*flush) ();
    int (*release) ();
    int (*fsync) ();
    int (*aio_fsync) ();
    int (*fasync) ();
    int (*lock) ();
    ssize_t (*sendpage) ();
    unsigned long (*get_unmapped_area)();
    int (*check_flags)();
    int (*flock) ();
    ssize_t (*splice_write)();
    ssize_t (*splice_read)();
    int (*setlease)();
    long (*fallocate)();
};
For platform devices, the driver is expected to provide the list of operations it can perform by declaring its functions for those operations in (from include/linux/platform_device.h):
struct platform_driver {
    int (*probe)();
    int (*remove)();
    void (*shutdown)();
    int (*suspend)();
    int (*resume)();
    struct device_driver driver;
    const struct platform_device_id *id_table;
};
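The driver, especially a char driver, does not have to support every operation listed. Note that C designated initializers (for example .read = my_read, where my_read is the driver's own function) make these structures easy to code: you name only the entries you implement, and the rest are left NULL.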
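The same pattern can be imitated in user space. The sketch below is only an analogy, not kernel code: sketch_ops, sketch_register() and the sketch_do_*() dispatchers are hypothetical stand-ins for a subsystem's operations structure and the code that calls into it. The pretend driver fills in only the entry it supports, and the dispatcher returns an error for entries left NULL.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, much-reduced stand-in for struct file_operations. */
struct sketch_ops {
    int  (*open)(void);
    long (*read)(char *buf, size_t len);
    long (*write)(const char *buf, size_t len);
};

/* The only operation this pretend driver implements. */
static long demo_read(char *buf, size_t len)
{
    memset(buf, 'x', len);
    return (long)len;
}

/* Entries that are not named in the initializer stay NULL. */
const struct sketch_ops demo_ops = {
    .read = demo_read,
};

/* The "subsystem" side: remembers the table and dispatches through it. */
static const struct sketch_ops *registered;

void sketch_register(const struct sketch_ops *ops)
{
    registered = ops;
}

long sketch_do_read(char *buf, size_t len)
{
    if (!registered || !registered->read)
        return -EINVAL;     /* operation not supported by this driver */
    return registered->read(buf, len);
}

long sketch_do_write(const char *buf, size_t len)
{
    if (!registered || !registered->write)
        return -EINVAL;     /* operation not supported by this driver */
    return registered->write(buf, len);
}
```

After sketch_register(&demo_ops), a call to sketch_do_read() reaches demo_read(), while sketch_do_write() returns -EINVAL because the driver never supplied a write entry.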
Does the driver program start at the module_init() macro?
The driver's init() routine specified in module_init() will be called during boot (when statically linked in) or when the module is dynamically loaded. The driver passes its structure of operations to the device's subsystem when it registers itself during its init().
These device driver entry points, e.g. open() or read(), are typically executed when the user app invokes a C library call (in user space) and after a switch to kernel space. Note that the i2c driver you're looking at is a platform driver for a bus that is used by leaf devices, and its functions exposed by EXPORT_SYMBOL() would be called by other drivers.
Only the driver's init() routine specified in module_init() is guaranteed to be called. The driver's exit() routine specified in module_exit() would only be executed if/when the module is dynamically unloaded. The driver's op routines will be called asynchronously (just like its interrupt service routine) in unknown order. Hopefully user programs will invoke an open() before issuing a read() or an ioctl() operation, and invoke other operations in a sensible fashion. A well-written and robust driver should accommodate any order or sequence of operations, and produce sane results to ensure system integrity.
It would probably help to stop thinking of a device driver as a program. They're completely different. A program has a specific starting point, does some stuff, and has one or more fairly well defined (well, they should be, anyway) exit points. Drivers have some stuff to do when they first get loaded (e.g. module_init() and other stuff), and may or may not ever do anything ever again (you can forcibly load a driver for hardware your system doesn't actually have), and may have some stuff that needs to be done if the driver is ever unloaded. Aside from that, a driver generally provides some specific entry points (system calls, ioctls, etc.) that user-land applications can access to request the driver to do something.
Horrible analogy, but think of a program kind of like a car - you get in, start it up, drive somewhere, and get out. A driver is more like a vending machine - you plug it in and make sure it's stocked, but then people just come along occasionally and push buttons to make it do something.
Actually, you are talking about an I2C platform (native) driver. First you need to understand how the module_init() of a platform driver gets called, compared with other loadable modules.
/*
 * module_init() - driver initialization entry point
 * @x: function to be run at kernel boot time or module insertion
 *
 * module_init() will either be called during do_initcalls() (if
 * builtin) or at module insertion time (if a module). There can only
 * be one per module.
 */
For the i2c driver, you can refer to http://www.linuxjournal.com/article/7136 and
http://www.embedded-bits.co.uk/2009/i2c-in-the-2632-linux-kernel/
A kernel module starts at its initialization function, which is usually marked with the __init macro just in front of the function name.
The __init macro tells the Linux kernel that the function is an initialization function, and that the resources it uses can be freed once the initialization function has run.
There are other macros, module_init() and module_exit(), that identify the initialization and release functions [as described above].
These two macros are used if the device driver is meant to work as a loadable and removable kernel module at run time [i.e. using the insmod or rmmod command].
In short: as soon as you do insmod, the module's init function runs and registers the driver with the driver subsystem, and the subsystem then calls .probe when a matching device is found.
Every time the driver's functionality is requested by a user application, the corresponding functions are invoked through these callbacks.
"Linux Device Driver" is a good book but it's old!
Basic example:
#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/module.h>

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Name and e-mail");
MODULE_DESCRIPTION("my_first_driver");

/* Runs once when the module is loaded (or at boot if built in). */
static int __init insert_mod(void)
{
    printk(KERN_INFO "Module constructor\n");
    return 0;
}

/* Runs when the module is unloaded with rmmod. */
static void __exit remove_mod(void)
{
    printk(KERN_INFO "Module destructor\n");
}

module_init(insert_mod);
module_exit(remove_mod);
An up-to-date tutorial, really well written, is "Linux Device Drivers Series"
I come across this term every now and then.
And now I really need a clear explanation as I wish to use some MPI routines that
are said not to be interrupt-safe.
I believe it's another wording for reentrant. If a function is reentrant it can be interrupted in the middle and called again.
For example:
void function()
{
    lock(mtx);
    /* code ... */
    unlock(mtx);
}
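This function can clearly be called by different threads (the mutex will protect the code inside). But if a signal arrives after lock(mtx) and the signal handler calls the function again on the same thread, it will deadlock, because the interrupted code still holds the mutex and cannot release it until the handler returns. So it's not interrupt-safe.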
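A small user-space sketch of that scenario, under some assumptions: a plain flag (lock_held) stands in for the mutex so that the second entry can be detected instead of hanging, and raise(SIGUSR1) stands in for a signal that happens to arrive inside the critical section. None of these names come from a real library.

```c
#include <signal.h>

static volatile sig_atomic_t lock_held;         /* stand-in for mtx */
static volatile sig_atomic_t reentry_detected;  /* would-be deadlocks */
static volatile sig_atomic_t raise_inside;      /* simulate a badly timed signal */

/* Returns 0 normally, -1 where a blocking lock would have deadlocked. */
static int function(void)
{
    if (lock_held) {
        /* A real lock(mtx) would wait here forever. */
        reentry_detected++;
        return -1;
    }
    lock_held = 1;

    /* critical section: the "interrupt" arrives while the lock is held */
    if (raise_inside) {
        raise_inside = 0;
        raise(SIGUSR1);     /* the handler runs before raise() returns */
    }

    lock_held = 0;
    return 0;
}

/* The handler calls the same function the interrupted code is inside. */
static void on_signal(int signo)
{
    (void)signo;
    function();
}

/* Install the handler, trigger one signal inside the critical section. */
int demo_reentry(void)
{
    signal(SIGUSR1, on_signal);
    raise_inside = 1;
    function();
    return reentry_detected;
}
```

demo_reentry() reports one detected re-entry: the handler tried to take a lock that the interrupted code was still holding.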
Code that is safe from concurrent access from an interrupt is said to be interrupt-safe.
Consider a situation where your process is in a critical section and an asynchronous event interrupts it, and the code handling that event accesses the same shared resource that the process was in the middle of using.
It is a major bug if an interrupt occurs in the middle of code that is manipulating a resource and the interrupt handler can access the same resource. Locking can save you!