Correct lock to use in linux character driver - linux-kernel

I am writing a simple character device driver. (kernel 2.6.26)
Multiple concurrent reader & writers are expected.
I am not sure what type of lock is best used to synchronize a short access to internal structures.
Any advice will be most appreciated

Compare with http://www.kernel.org/pub/linux/kernel/people/rusty/kernel-locking/c214.html . An old document from before when mutexes existed, but given mutexes are a sleeping lock, they count towards user context.
spinlock — spinlock_bh — mutex — semaphore
If your data structures are only ever accessed by functions whose execution is triggered by userspace, all lock primitives are available to you. It depends on gut feeling of how short a "short access" is.
And then there is RCU as a fifth way of doing things, though it is somewhat not a locking primitive in its own right. (It is used together with one of the lock primitives.)

Start with a mutex. Once you've got it working you can think about reworking the locking.

Related

What is the use-case for TryEnterCriticalSection?

I've been using Windows CRITICAL_SECTION since the 1990s and I've been aware of the TryEnterCriticalSection function since it first appeared. I understand that it's supposed to help me avoid a context switch and all that.
But it just occurred to me that I have never used it. Not once.
Nor have I ever felt I needed to use it. In fact, I can't think of a situation in which I would.
Generally when I need to get an exclusive lock on something, I need that lock and I need it now. I can't put it off until later. I certainly can't just say, "oh well, I won't update that data after all". So I need EnterCriticalSection, not TryEnterCriticalSection
So what exactly is the use case for TryEnterCriticalSection?
I've Googled this, of course. I've found plenty of quick descriptions on how to use it but almost no real-world examples of why. I did find this example from Intel that, frankly doesn't help much:
CRITICAL_SECTION cs;
void threadfoo()
{
while(TryEnterCriticalSection(&cs) == FALSE)
{
// some useful work
}
// Critical Section of Code
LeaveCriticalSection (&cs);
}
// other work
}
What exactly is a scenario in which I can do "some useful work" while I'm waiting for my lock? I'd love to avoid thread-contention but in my code, by the time I need the critical section, I've already been forced to do all that "useful work" in order to get the values that I'm updating in shared data (for which I need the critical section in the first place).
Does anyone have a real-world example?
As an example you might have multiple threads that each produce a high volume of messages (events of some sort) that all need to go on a shared queue.
Since there's going to be frequent contention on the lock on the shared queue, each thread can have a local queue and then, whenever the TryEnterCriticalSection call succeeds for the current thread, it copies everything it has in its local queue to the shared one and releases the CS again.
In C++11 therestd::lock which employs deadlock-avoidance algorithm.
In C++17 this has been elaborated to std::scoped_lock class.
This algorithm tries to lock on mutexes in one order, and then in another, until succeeds. It takes try_lock to implement this approach.
Having try_lock method in C++ is called Lockable named requirement, whereas mutexes with only lock and unlock are BasicLockable.
So if you build C++ mutex on top of CTRITICAL_SECTION, and you want to implement Lockable, or you'll want to implement lock avoidance directly on CRITICAL_SECTION, you'll need TryEnterCriticalSection
Additionally you can implement timed mutex on TryEnterCriticalSection. You can do few iterations of TryEnterCriticalSection, then call Sleep with increasing delay time, until TryEnterCriticalSection succeeds or deadline has expired. It is not a very good idea though. Really timed mutexes based on user-space WIndows synchronization objects are implemented on SleepConditionVariableSRW, SleepConditionVariableCS or WaitOnAddress.
Because windows CS are recursive TryEnterCriticalSection allows a thread to check whether it already owns a CS without risk of stalling.
Another case would be if you have a thread that occasionally needs to perform some locked work but usually does something else, you could use TryEnterCriticalSection and only perform the locked work if you actually got the lock.

Is there a version/equivalent of SleepConditionVariableCS() which uses a non-recursive mutex (like) object?

I am writing the Windows port for a Linux application and I am trying to find a suitable replacement for pthread_cond_wait(). The closest alternative seems to be SleepConditionVariableCS(). However I am unwilling to use this function because it uses CriticalSections which are basically lightweight recursive mutexes. I would prefer a non-recursive lock object alternative - is there one?
P.S. -
In the application in place of pthread mutexes I am using Semaphores with maximum count 1.
If recursive mutexes are as problematic as stated by David Butenhof then why does Windows provide only recursive Mutexes (or CriticalSection) as an option? Is this a massive #Fail on part of Windows or is David Butenhof outdated/wrong?
Windows Vista and later provide Slim Reader/Writer (SRW) Locks as a non-recursive lock object alternative 1.
As the documentation states:
An SRW lock is the size of a pointer. The advantage is that it is fast to update the lock state. The disadvantage is that very little state information can be stored, so SRW locks cannot be acquired recursively. In addition, a thread that owns an SRW lock in shared mode cannot upgrade its ownership of the lock to exclusive mode.
A Windows Condition Variable can use a SRW lock instead of a CriticalSection lock. See SleepConditionVariableSRW().
1: PS. Here's another view on the good vs. bad of recursive locks .

Does Go support volatile / non-volatile variables?

I'm new to the language so bear with me.
I am curious how GO handles data storage available to threads, in the sense that non-local variables can also be non-volatile, like in Java for instance.
GO has the concept of channel, which, by it's nature -- inter thread communication, means it bypasses processor cache, and reads/writes to heap directly.
Also, have not found any reference to volatile in the go lang documentation.
TL;DR: Go does not have a keyword to make a variable safe for multiple goroutines to write/read it. Use the sync/atomic package for that. Or better yet Do not communicate by sharing memory; instead, share memory by communicating.
Two answers for the two meanings of volatile
.NET/Java concurrency
Some excerpts from the Go Memory Model.
If the effects of a goroutine must be observed by another goroutine,
use a synchronization mechanism such as a lock or channel
communication to establish a relative ordering.
One of the examples from the Incorrect Synchronization section is an example of busy waiting on value.
Worse, there is no guarantee that the write to done will ever be
observed by main, since there are no synchronization events between
the two threads. The loop in main is not guaranteed to finish.
Indeed, this code(play.golang.org/p/K8ndH7DUzq) never exits.
C/C++ non-standard memory
Go's memory model does not provide a way to address non-standard memory. If you have raw access to a device's I/O bus you'll need to use assembly or C to safely write values to the memory locations. I have only ever needed to do this in a device driver which generally precludes use of Go.
The simple answer is that volatile is not supported by the current Go specification, period.
If you do have one of the use cases where volatile is necessary, such as low-level atomic memory access that is unsupported by existing packages in the standard library, or unbuffered access to hardware mapped memory, you'll need to link in a C or assembly file.
Note that if you do use C or assembly as understood by the GC compiler suite, you don't even need cgo for that, since the [568]c C/asm compilers are also able to handle it.
You can find examples of that in Go's source code. For example:
http://golang.org/src/pkg/runtime/sema.goc
http://golang.org/src/pkg/runtime/atomic_arm.c
Grep for many other instances.
For how memory access in Go does work, check out The Go Memory Model.
No, go does not support the volatile or register statement.
See this post for more information.
This is also noted in the Go for C++ Programmers guide.
The Go Memory Model documentation explains why the concept of 'volatile' has no application in Go.
Loosely: Among other things, goroutines are free to keep goroutine-local changes cached in registers so those changes are not observable by other goroutines. To "flush" those changes to memory, a synchronization must be performed. Either by using locks or by communicating (channel send or receive).

Which mutex lock variant should I use in Linux kernel developing?

AFAIK, the mutex API was introduced to the kernel after LDD3 (Linux device drivers 3rd edition) was written so it's not described in the book.
The book describes how to use the kernel's semaphore API for mutex functionality.
It suggest to use down_interruptable() instead of down():
You do not, as a general rule,
want to use noninterruptible operations unless there truly is no alternative. Non-interruptible operations are a good way to create unkillable processes (the dreaded
“D state” seen in ps), and annoy your users [Linux Device Drivers 3rd ed]
Now. here's my question:
The mutex API has two "similar" functions:
mutex_lock_killable() an mutex_lock_interruptable(). Which one should I choose?
Use mutex_lock_interruptible() function to allow your driver to be interrupted by any signal.
This implies that your system call should be written so that it can be restarted.
(Also see ERESTARTSYS.)
Use mutex_lock_killable() to allow your driver to be interrupted only by signals that actually kill the process, i.e., when the process has no opportunity to look at the results of your system call, or even to try it again.
Use mutex_lock() when you can guarantee that the mutex will not be held for a long time.

Cocoa Lock that does not use cpu power

I need a lock in cocoa that does not use one cpu when I try to lock it and it is locked somewhere else. Something that is implemented in the kernel scheduler.
It sounds like you're trying to find a lock that's not a spin lock. EVERY lock must use some CPU, or else it couldn't function. :-)
NSLock is the most obvious in Cocoa. It has a simple -lock, -unlock interface and uses pthread mutexes in its implementation. There are a number of more sophisticated locks in Cocoa for more specific needs: NSRecursiveLock, NSCondition, NSDistributedLock, etc.
There is also the #synchronized directive which is even simpler to use but has some additional overhead to it.
GCD also has a counted semaphore object if you're looking for something like that.
My recommendation is that, instead of locks, you look at using NSOperations and an NSOperationQueue where you -setMaxConcurrentOperationCount: to 1 to access the shared resource. By using a single-wide operation queue, you can guarantee that only one thing at a time will make use of a resource, while still allowing for multiple threads to do so.
This avoids the need for locks, and since everything is done in user space, can provide much better performance. I've replaced almost all of my locking around shared resources with this technique, and have been very pleased with the results.
Do you mean "lock" as in a mutex between threads, or a mutex between processes, or a mutex between disparate resources on a network, or...?
If it's between threads, you use NSLock. If it's between processes, then you can use POSIX named semaphores.
If you really want kernel locks and know what you are doing, you can use
<libkern/OSAtomic.h>
Be sure to always use the "barrier" variants. These are faster and much more dangerous than posix locks. If you can target 10.6 with new code, then GCD is a great way to go. There is a great podcast on using the kernel synchronization primitives at: http://www.mac-developer-network.com/shows/podcasts/lnc/lnc032/

Resources