Does the kernel data structure klist provide thread-safe access? - linux-kernel

Does the linked list wrapper described in klist.h provide thread-safe access to its nodes for both reading and writing?

I assume that by "reading and writing" you actually mean "iterating and adding/removing" (we're talking about lists, right?).
In this sense they are thread-safe: you do not have to perform manual locking on them because the functions defined in lib/klist.c use the internal spinlock of the klist structure.
Do not use these functions from interrupt or bottom-half context, because the locking done inside them is not spin_lock_irqsave() or spin_lock_bh().
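To make that concrete, here is a minimal sketch (struct my_item, my_list and the helper names are made up) of driving the lib/klist.c API from process context. The get/put callbacks are left NULL for brevity; real users normally take a reference on the embedding object there:

    #include <linux/kernel.h>
    #include <linux/klist.h>

    struct my_item {
        int value;
        struct klist_node node;
    };

    static struct klist my_list;

    static void my_init(void)
    {
        /* NULL get/put callbacks for brevity; see the note above */
        klist_init(&my_list, NULL, NULL);
    }

    /* Adding takes the klist's internal spinlock; no caller locking needed. */
    static void my_add(struct my_item *item)
    {
        klist_add_tail(&item->node, &my_list);
    }

    /* Iteration is likewise safe against concurrent adds and removals. */
    static void my_walk(void)
    {
        struct klist_iter iter;
        struct klist_node *kn;

        klist_iter_init(&my_list, &iter);
        while ((kn = klist_next(&iter))) {
            struct my_item *item = container_of(kn, struct my_item, node);
            pr_info("value = %d\n", item->value);
        }
        klist_iter_exit(&iter);
    }

Each of these calls takes and releases the klist's internal spinlock, which is exactly why they must not be called from interrupt or bottom-half context.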

Related

Implementation of MTA COM server

I can't find any sample code on the prerequisites of an MTA-compliant COM object. I tried changing the ThreadingModel registry key of my object from Apartment to Both, and it results in a crash when a secondary thread calls the method before any data is accessed.
If STA COMs require a message pump, what kind of plumbing code do MTA COM objects require?
I do not think that there is anything special about MTA, except that you need to use synchronization primitives like mutexes to synchronize access to your internal structures. Does the "Multithreaded Apartments" documentation not give you all that you need?
Quoting from the documentation, emphasis is mine:
Because calls to objects are not serialized in any way, multithreaded object concurrency offers the highest performance and takes the best advantage of multiprocessor hardware for cross-thread, cross-process, and cross-machine calling. This means, however, that the code for objects must provide synchronization in their interface implementations, typically through the use of synchronization primitives such as event objects, critical sections, mutexes, or semaphores, which are described later in this section. In addition, because the object doesn't control the lifetime of the threads that are accessing it, no thread-specific state may be stored in the object (in thread local storage).
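To sketch what that means in practice (MyObjectState and its functions are hypothetical, standing in for a COM object's internal state), every method body has to serialize access itself, for example with a Win32 critical section:

    #include <windows.h>

    /* Hypothetical internal state of an MTA COM object: COM will not
     * serialize calls into it, so every method must do so itself. */
    typedef struct {
        CRITICAL_SECTION lock;
        long counter;
    } MyObjectState;

    void MyObjectState_Init(MyObjectState *s)
    {
        InitializeCriticalSection(&s->lock);
        s->counter = 0;
    }

    /* Body of a COM method: any thread in the MTA may call this at any
     * time, so shared state is only touched while the lock is held. */
    long MyObject_Increment(MyObjectState *s)
    {
        long result;
        EnterCriticalSection(&s->lock);
        result = ++s->counter;
        LeaveCriticalSection(&s->lock);
        return result;
    }

    void MyObjectState_Destroy(MyObjectState *s)
    {
        DeleteCriticalSection(&s->lock);
    }

There is no message pump to set up; the plumbing is just this kind of locking around your own data.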

Does Go support volatile / non-volatile variables?

I'm new to the language so bear with me.
I am curious how Go handles data storage available to threads, in the sense that non-local variables can also be non-volatile, like in Java for instance.
Go has the concept of a channel, which, by its nature as inter-thread communication, means it bypasses the processor cache and reads/writes to the heap directly.
Also, I have not found any reference to volatile in the Go language documentation.
TL;DR: Go does not have a keyword to make a variable safe for multiple goroutines to write/read. Use the sync/atomic package for that. Or, better yet: Do not communicate by sharing memory; instead, share memory by communicating.
Two answers for the two meanings of volatile
.NET/Java concurrency
Some excerpts from the Go Memory Model.
If the effects of a goroutine must be observed by another goroutine,
use a synchronization mechanism such as a lock or channel
communication to establish a relative ordering.
One of the examples from the Incorrect Synchronization section shows busy-waiting on a value.
Worse, there is no guarantee that the write to done will ever be
observed by main, since there are no synchronization events between
the two threads. The loop in main is not guaranteed to finish.
Indeed, this code (play.golang.org/p/K8ndH7DUzq) never exits.
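For comparison, here is a sketch of that program repaired with a channel, along the lines of the corrected examples in the memory model document. Closing (or sending on) the channel is a synchronization event, so main is guaranteed to observe the write to a:

    package main

    var a string
    var done = make(chan struct{})

    func setup() {
        a = "hello, world"
        close(done) // synchronization event: happens-before the receive below
    }

    func main() {
        go setup()
        <-done // blocks until setup has run; no busy waiting
        print(a)
    }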
C/C++ non-standard memory
Go's memory model does not provide a way to address non-standard memory. If you have raw access to a device's I/O bus you'll need to use assembly or C to safely write values to the memory locations. I have only ever needed to do this in a device driver which generally precludes use of Go.
The simple answer is that volatile is not supported by the current Go specification, period.
If you do have one of the use cases where volatile is necessary, such as low-level atomic memory access that is unsupported by existing packages in the standard library, or unbuffered access to hardware mapped memory, you'll need to link in a C or assembly file.
Note that if you do use C or assembly as understood by the GC compiler suite, you don't even need cgo for that, since the [568]c C/asm compilers are also able to handle it.
You can find examples of that in Go's source code. For example:
http://golang.org/src/pkg/runtime/sema.goc
http://golang.org/src/pkg/runtime/atomic_arm.c
Grep for many other instances.
For how memory access in Go does work, check out The Go Memory Model.
No, Go does not support the volatile or register keyword.
See this post for more information.
This is also noted in the Go for C++ Programmers guide.
The Go Memory Model documentation explains why the concept of 'volatile' has no application in Go.
Loosely: among other things, goroutines are free to keep goroutine-local changes cached in registers, so those changes are not observable by other goroutines. To "flush" those changes to memory, synchronization must be performed, either by using locks or by communicating (a channel send or receive).
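As a sketch of the sync/atomic route from the TL;DR above (the variable names are made up), atomic loads and stores force the access through the memory system instead of letting the value stay cached in a register:

    package main

    import (
        "fmt"
        "runtime"
        "sync/atomic"
    )

    func main() {
        var done int32 // shared flag, accessed only via sync/atomic

        go func() {
            atomic.StoreInt32(&done, 1) // atomic write, visible to other goroutines
        }()

        for atomic.LoadInt32(&done) == 0 {
            runtime.Gosched() // yield so the other goroutine can run
        }
        fmt.Println("observed the atomic store")
    }

Even so, a channel or sync.Mutex is usually the more idiomatic choice.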

What does __rcu stand for in Linux?

I am new to linux kernel. My question is about the task_struct.
I know that each task_struct has a reference to its parent process via a pointer to the task_struct of the parent.
After looking at the task_struct definition in sched.h I noticed the following:
struct task_struct __rcu *real_parent; /* real parent process */
I found that it is defined in compiler.h. I guess that "__rcu" stands for "read copy update".
Can someone clarify the syntax?
Read-copy-update is an algorithm that allows readers concurrent access to a data structure without having to lock it. It can be read about here.
If the kernel is built with the CONFIG_SPARSE_RCU_POINTER config option, __rcu is defined in include/linux/compiler.h as
# define __rcu __attribute__((noderef, address_space(4)))
This is an annotation for the Sparse code analysis tool, which can warn about certain things the programmer may have overlooked. How this is relevant to RCU is explained in Documentation/RCU/checklist.txt:
__rcu sparse checks: tag the pointer to the RCU-protected data
structure with __rcu, and sparse will warn you if you
access that pointer without the services of one of the
variants of rcu_dereference().
rcu_dereference() returns a pointer that can be safely dereferenced by the code and documents the programmer's intention to protect the pointer with the RCU mechanism, enabling tools like Sparse to check for programming errors and omissions.
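As an illustrative sketch (struct foo and gbl_foo are hypothetical names), a reader touches such a tagged pointer like this:

    #include <linux/rcupdate.h>

    struct foo {
        int value;
    };

    /* The __rcu tag tells Sparse this pointer may only be dereferenced
     * through the RCU accessors. */
    static struct foo __rcu *gbl_foo;

    static int read_foo_value(void)
    {
        struct foo *p;
        int v;

        rcu_read_lock();              /* enter the read-side critical section */
        p = rcu_dereference(gbl_foo); /* safe, properly ordered pointer fetch */
        v = p ? p->value : -1;
        rcu_read_unlock();
        return v;
    }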
RCU stands for "read, copy, update". It is an algorithm that allows multiple readers to access data which can be updated or even deleted at the same time by writers.
Under RCU, writers still have to ensure mutual exclusion with regard to one another, but readers do not acquire a lock. Care has to be taken that the shared data structure is updated in ways that do not violate read integrity. If something has to be removed or deleted, the unlinking of that item from the data structure can be done in parallel with the readers but the actual deletion of the memory has to wait until the last reader has finished.
Rather than making the readers acquire a lock, the whereabouts of the readers are inferred in other ways. Threads can announce their intent to browse the data structure by joining a "read side critical section" which is not really a lock but a kind of global phase.
For instance, suppose that some threads entered the RCU read-side critical section in phase 0. An updater has performed a deletion and wants to free a piece of memory. It simply has to wait for all threads in the system to vacate phase 0. In the meanwhile, other readers are looking at the data structure already, but when they declare their intent to RCU, they do so by entering the RCU read-side critical section under phase 1. Only the phase 0 threads can possibly still have a pointer to the object that was removed, and so when the last thread leaves phase 0, the object can safely be freed. Newly arriving threads in phase 1 do not see the object, because it has been removed from the data structure, so they have no way to find it.
RCU takes advantage of the idea that we do not need lock objects that are "owned" in order to know information like "no thread can be accessing this object any more".
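Continuing the hypothetical gbl_foo sketch from above, the update side looks roughly like this (rcu_dereference_protected() and lockdep_is_held() only exist on newer kernels):

    #include <linux/rcupdate.h>
    #include <linux/slab.h>
    #include <linux/spinlock.h>

    static DEFINE_SPINLOCK(foo_lock); /* writers still exclude one another */

    static void update_foo(int new_value)
    {
        struct foo *new_p, *old_p;

        new_p = kmalloc(sizeof(*new_p), GFP_KERNEL);
        if (!new_p)
            return;
        new_p->value = new_value;

        spin_lock(&foo_lock);
        old_p = rcu_dereference_protected(gbl_foo,
                                          lockdep_is_held(&foo_lock));
        rcu_assign_pointer(gbl_foo, new_p); /* publish; readers now see new_p */
        spin_unlock(&foo_lock);

        synchronize_rcu(); /* wait for every pre-existing ("phase 0") reader */
        kfree(old_p);      /* no reader can still hold old_p */
    }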

Correct lock to use in linux character driver

I am writing a simple character device driver. (kernel 2.6.26)
Multiple concurrent readers and writers are expected.
I am not sure what type of lock is best used to synchronize a short access to internal structures.
Any advice will be most appreciated
Compare with http://www.kernel.org/pub/linux/kernel/people/rusty/kernel-locking/c214.html. It is an old document from before mutexes existed, but since mutexes are sleeping locks, they fall under the same user-context rules it describes.
spinlock — spinlock_bh — mutex — semaphore
If your data structures are only ever accessed by functions whose execution is triggered by userspace, all lock primitives are available to you. It depends on gut feeling of how short a "short access" is.
And then there is RCU as a fifth way of doing things, though it is not quite a locking primitive in its own right. (It is used together with one of the lock primitives.)
Start with a mutex. Once you've got it working you can think about reworking the locking.
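As a sketch of that starting point (the buffer and names are made up), a mutex works precisely because read() and write() run in user context, where sleeping is allowed:

    #include <linux/fs.h>
    #include <linux/mutex.h>
    #include <linux/uaccess.h>

    static DEFINE_MUTEX(my_mutex); /* protects my_buf and my_len */
    static char my_buf[64];
    static size_t my_len;

    static ssize_t my_read(struct file *f, char __user *buf,
                           size_t count, loff_t *ppos)
    {
        ssize_t ret;

        mutex_lock(&my_mutex); /* may sleep; fine in user context */
        if (count > my_len)
            count = my_len;
        ret = copy_to_user(buf, my_buf, count) ? -EFAULT : (ssize_t)count;
        mutex_unlock(&my_mutex);
        return ret;
    }

Note that copy_to_user() itself may sleep (on a page fault), which is legal while holding a mutex but would be a bug while holding a spinlock.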

Cocoa Lock that does not use cpu power

I need a lock in Cocoa that does not consume a whole CPU when I try to lock it and it is already locked somewhere else. Something that is implemented in the kernel scheduler.
It sounds like you're trying to find a lock that's not a spin lock. EVERY lock must use some CPU, or else it couldn't function. :-)
NSLock is the most obvious in Cocoa. It has a simple -lock, -unlock interface and uses pthread mutexes in its implementation. There are a number of more sophisticated locks in Cocoa for more specific needs: NSRecursiveLock, NSCondition, NSDistributedLock, etc.
There is also the @synchronized directive, which is even simpler to use but has some additional overhead to it.
GCD also has a counted semaphore object if you're looking for something like that.
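For example, here is a sketch of GCD's counted semaphore used as a simple mutual-exclusion lock (initial count 1); dispatch_semaphore_wait() parks the waiting thread in the kernel rather than spinning:

    #include <dispatch/dispatch.h>
    #include <stdio.h>

    int main(void)
    {
        dispatch_semaphore_t lock = dispatch_semaphore_create(1);

        dispatch_semaphore_wait(lock, DISPATCH_TIME_FOREVER); /* acquire */
        puts("in critical section");
        dispatch_semaphore_signal(lock);                      /* release */

        dispatch_release(lock);
        return 0;
    }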
My recommendation is that, instead of locks, you look at using NSOperations and an NSOperationQueue where you -setMaxConcurrentOperationCount: to 1 to access the shared resource. By using a single-wide operation queue, you can guarantee that only one thing at a time will make use of a resource, while still allowing for multiple threads to do so.
This avoids the need for locks, and since everything is done in user space, can provide much better performance. I've replaced almost all of my locking around shared resources with this technique, and have been very pleased with the results.
Do you mean "lock" as in a mutex between threads, or a mutex between processes, or a mutex between disparate resources on a network, or...?
If it's between threads, you use NSLock. If it's between processes, then you can use POSIX named semaphores.
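For the between-processes case, here is a sketch with a POSIX named semaphore (the name "/my-app-lock" is made up). sem_wait() sleeps in the kernel rather than spinning, so a blocked process burns no CPU:

    #include <fcntl.h>
    #include <semaphore.h>
    #include <stdio.h>

    int main(void)
    {
        /* An initial count of 1 makes the semaphore act as a cross-process lock. */
        sem_t *lock = sem_open("/my-app-lock", O_CREAT, 0644, 1);
        if (lock == SEM_FAILED) {
            perror("sem_open");
            return 1;
        }

        sem_wait(lock); /* acquire: blocks if another process holds it */
        puts("in critical section");
        sem_post(lock); /* release */

        sem_close(lock);
        return 0;
    }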
If you really want kernel locks and know what you are doing, you can use
<libkern/OSAtomic.h>
Be sure to always use the "barrier" variants. These are faster and much more dangerous than POSIX locks. If you can target 10.6 with new code, then GCD is a great way to go. There is a great podcast on using the kernel synchronization primitives at: http://www.mac-developer-network.com/shows/podcasts/lnc/lnc032/
