Atomicity guaranteed on "lock" methods in Linux? - linux-kernel

We know that the following kernel methods in Linux allow us to apply various locking mechanisms on shared data. But does Linux guarantee atomicity of the methods themselves? With the exception of the methods related to normal and reader-writer spin locks, which cannot sleep, wouldn't it be catastrophic if a thread of execution were preempted while it had partially executed any of the other methods listed below?
Spin Lock Methods
spin_lock();
spin_lock_irq();
spin_lock_irqsave();
spin_unlock();
spin_unlock_irq();
spin_unlock_irqrestore();
spin_lock_init();
spin_trylock();
spin_is_locked();
Reader-Writer Spin Lock Methods
read_lock();
read_lock_irq();
read_lock_irqsave();
read_unlock();
read_unlock_irq();
read_unlock_irqrestore();
write_lock();
write_lock_irq();
write_lock_irqsave();
write_unlock();
write_unlock_irq();
write_unlock_irqrestore();
write_trylock();
rwlock_init();
Semaphore Methods
sema_init();
init_MUTEX();
init_MUTEX_LOCKED();
down_interruptible();
down();
down_trylock();
up();
Reader-Writer Semaphore Methods
init_rwsem();
down_read();
up_read();
down_write();
up_write();
down_read_trylock();
down_write_trylock();
downgrade_write();
Mutex Methods
mutex_lock();
mutex_unlock();
mutex_trylock();
mutex_is_locked();
Completion Variable Methods
init_completion();
wait_for_completion();
complete();

If these functions were not atomic with respect to the lock itself, they would not work at all. And last time I looked, my Linux did work.
Most of these functions indeed disable preemption while doing their stuff.

Semaphore and reader-writer semaphore operations guard their internal state with a spinlock of their own (and holding that spinlock disables preemption), so the methods themselves are atomic even on SMP systems.
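As a rough illustration of why that holds, here is a sketch of down(), paraphrased from the kernel's kernel/locking/semaphore.c (simplified, not the literal source):

    /* Paraphrased from kernel/locking/semaphore.c: the semaphore guards its
     * own count with a raw spinlock, which is what makes down()/up()
     * themselves atomic, even on SMP. */
    void down(struct semaphore *sem)
    {
        unsigned long flags;

        raw_spin_lock_irqsave(&sem->lock, flags);  /* IRQs off, preemption off */
        if (likely(sem->count > 0))
            sem->count--;                  /* fast path: semaphore was free */
        else
            __down(sem);                   /* slow path: sleep until up();
                                            * the internal lock is dropped
                                            * while sleeping */
        raw_spin_unlock_irqrestore(&sem->lock, flags);
    }

The caller may well sleep waiting for the semaphore, but the bookkeeping on sem->count can never be observed half-done.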

Related

What happens when a task is executing a critical section but needs to be scheduled out on a uniprocessor system with preemption disabled?

Here is a scenario. Let's say that a kernel task is running on a uniprocessor system with preemption disabled. The task acquires a spin lock. Now it is executing its critical section. At this point, what if the time slice available for this task expires and it has to be scheduled out?
Does the spin_lock have a mechanism to prevent this?
Can it be scheduled out? If yes, then what happens to the critical section?
Can it be interrupted by an IRQ? (Assuming that preemption is disabled)
Is this scenario feasible? In other words, could this scenario happen?
From the kernel code, I understand that spin_lock is basically a no-op on a uniprocessor with preemption disabled. To be accurate, all it does is call barrier().
I understand why it is a no-op (it is a uniprocessor, and no other task could be manipulating the data at that instant), but I still don't understand how the critical section goes uninterrupted (by IRQs or scheduling).
What am I missing here? Pointers to the relevant Linux kernel code would be really helpful.
My basic assumptions:
32 bit Linux kernel
Actually, spin_lock() disables preemption by calling preempt_disable() before it tries to acquire the lock, so scenarios #1, #2, and #3 could never happen.
From recent source code, spin_lock() eventually calls __raw_spin_lock(), which calls preempt_disable() before calling spin_acquire() to take the lock. spin_lock_irqsave(), which is commonly used in interrupt context, follows a similar path.
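For reference, the SMP version looks roughly like this (condensed from include/linux/spinlock_api_smp.h; the exact code varies with kernel version and debug options):

    /* Condensed from include/linux/spinlock_api_smp.h; details vary with
     * kernel version and lockdep/debug configuration. */
    static inline void __raw_spin_lock(raw_spinlock_t *lock)
    {
        preempt_disable();                             /* first: no preemption */
        spin_acquire(&lock->dep_map, 0, 0, _RET_IP_);  /* lockdep bookkeeping */
        LOCK_CONTENDED(lock, do_raw_spin_trylock, do_raw_spin_lock); /* spin */
    }

Note the order: preemption is already off before the code ever starts spinning on the lock word.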
Regarding #3: if the variable is shared between process and interrupt context, you should always use spin_lock_irq()/spin_lock_irqsave() instead of spin_lock() to avoid a deadlock scenario.
The mechanism that handles expiring time slices is the timer interrupt. The interrupt sets the TIF_NEED_RESCHED flag for the process. When returning from the timer's interrupt context back to your critical section, a check is made as to whether the process should be preempted because TIF_NEED_RESCHED is set. Since preemption is disabled, nothing happens, and execution returns to your critical section.
When your critical section is over, releasing the lock calls preempt_enable() to re-enable preemption. At that moment another preemption check is made. Since the TIF_NEED_RESCHED flag is set and preemption is now enabled, the process is preempted.
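Conceptually, the re-enable path looks like this (simplified sketch; the real, config-dependent macros live in include/linux/preempt.h):

    /* Simplified sketch of preempt_enable() from include/linux/preempt.h. */
    #define preempt_enable()                          \
    do {                                              \
        barrier();                                    \
        if (unlikely(preempt_count_dec_and_test()))   \
            __preempt_schedule();  /* pending TIF_NEED_RESCHED: switch now */ \
    } while (0)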
Spin locks disable preemption.
No, because preemption is disabled.
Yes. There are spin lock versions that disable IRQs to prevent this.
No, because spin locks disable preemption.
Spinlocks don't really exist on uniprocessor systems anyway, because they don't make sense there. If a thread that doesn't own the lock attempts to acquire it, that means the thread that does own it is not currently running (there is only one CPU), so there is no point spin-waiting for a thread that isn't executing. For this reason, spinlocks on uniprocessor builds are optimized away to just disabling preemption, so that no other thread can enter the critical section.
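You can see this in include/linux/spinlock_api_up.h, where the uniprocessor "lock" boils down to disabling preemption plus an annotation for static analysis (slightly condensed):

    /* Slightly condensed from include/linux/spinlock_api_up.h: on UP builds
     * there is nothing to spin on, so "locking" is just a preemption toggle. */
    #define ___LOCK(lock) \
        do { __acquire(lock); (void)(lock); } while (0)

    #define __LOCK(lock) \
        do { preempt_disable(); ___LOCK(lock); } while (0)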

Disadvantage of using mutex in interrupt context

What is the disadvantage of using a mutex in interrupt context? Why is a spin lock preferred here?
A mutex will force the caller to sleep if it is contended, and sleeping is illegal when preemption is disabled or in interrupt context.
Many functions in the kernel sleep (i.e. call schedule()) directly or indirectly: you can never call them while holding a spinlock, or with preemption disabled. This also means you need to be in user context: calling them from an interrupt is illegal.
The following is worth reading...
https://www.kernel.org/pub/linux/kernel/people/rusty/kernel-locking/c557.html
There's a ton of information in that doc.
When a thread tries to acquire a mutex and does not succeed, because another thread has already acquired it, the thread goes to sleep until it is woken up - which is fatal if it happens in an ISR.
When a thread fails to acquire a spin_lock, on the other hand, it continuously tries to acquire it until it finally succeeds, thus avoiding sleeping in an ISR. Using spin_lock in the "top half" is common practice when writing Linux device driver interrupt handlers.
Hence you should use a spin-lock instead!
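A minimal sketch of the usual division of labour (all names here are invented for illustration):

    /* Hypothetical driver fragment: plain spin_lock in the ISR, the _irqsave
     * variant in process context so the ISR cannot fire on this CPU while we
     * hold the lock (which would deadlock). */
    #include <linux/interrupt.h>
    #include <linux/spinlock.h>

    static DEFINE_SPINLOCK(dev_lock);      /* illustrative name */
    static int dev_pending;

    static irqreturn_t my_irq_handler(int irq, void *dev_id)
    {
        spin_lock(&dev_lock);              /* never sleeps: safe in an ISR */
        dev_pending++;
        spin_unlock(&dev_lock);
        return IRQ_HANDLED;
    }

    static void my_process_context_path(void)
    {
        unsigned long flags;

        spin_lock_irqsave(&dev_lock, flags);    /* also masks local IRQs */
        dev_pending = 0;
        spin_unlock_irqrestore(&dev_lock, flags);
    }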

Synchronization level (executive or kernel) used by monitors, mutex, and semaphore

I understand that the kernel can synchronize processes via the spinlock method. However, when it comes down to one processor, how does it do so? How does it use a synchronization object to ensure mutual exclusion?
Is a semaphore at the level of the executive? How does the kernel come into play here?
Are mutexes only implemented at the level of the kernel? They do not give off a signal or message when the resource is free.
You've got several questions here:
I understand that the kernel can synchronize processes via the spinlock method. However, when it comes down to one processor, how does it do so? How does it use a synchronization object to ensure mutual exclusion?
On uni-processor machines, acquiring a spinlock simply raises the IRQL to DISPATCH_LEVEL or higher - a thread at such an elevated IRQL cannot be pre-empted, so synchronization is guaranteed.
Is a semaphore at the level of the executive? How does the kernel come into play here?
Semaphores, mutexes (and most waitable objects, for that matter) are Kernel Dispatch Objects. Such objects are implemented by the kernel, and are made available to user-mode applications via various functions exported by KERNEL32.DLL (CreateEvent/Mutex/Semaphore, et al.). In addition, the "kernel comes into play" by scheduling thread waits and awakening threads that are waiting on synchronization objects.
Are mutexes only implemented at the level of the kernel?
Mutex objects are indeed kernel dispatch objects (KMUTEX). A mutex object is signaled when it is un-owned. When a thread acquires a mutex, its state goes to non-signaled, which means that any other thread that attempts to acquire it will be put into a wait state until either the mutex is acquired or the wait times out.
For more detailed explanations on kernel dispatcher objects, as well as Windows synchronization in general, have a peek at the latest version of "Windows Internals" - every Windows developer should have a copy of this on their desk, IMHO.
'They do not give off a signal or message when the resource is free' - sure they do: they are an inter-thread signaling mechanism! A thread waiting on the mutex is signaled and made ready when the protected resource is released, thereby acquiring the mutex.
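From user mode, that wait/signal behaviour looks like this in plain Win32 C (a minimal sketch; error handling omitted for brevity):

    /* Minimal Win32 sketch of the mutex wait/signal behaviour described
       above; error handling omitted for brevity. */
    #include <windows.h>

    HANDLE hMutex;

    DWORD WINAPI worker(LPVOID arg)
    {
        /* Blocks until the mutex is signaled (un-owned); acquiring it
           makes it non-signaled again. */
        WaitForSingleObject(hMutex, INFINITE);
        /* ... touch the protected resource ... */
        ReleaseMutex(hMutex);   /* signals the mutex, readying one waiter */
        return 0;
    }

    int main(void)
    {
        hMutex = CreateMutex(NULL, FALSE, NULL);  /* initially un-owned */
        worker(NULL);
        CloseHandle(hMutex);
        return 0;
    }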
Spinlocks are generally not used on single-core processors - there is no point. TBH, spinlocks need great care on multi-core and clustered systems too if their use is not to be counter-productive.

How best to synchronize memory access shared between kernel and user space, in Windows

I can't find any function to acquire a spinlock in the Win32 APIs.
Is there a reason?
When I need to use spinlock, what do I do?
I know there is an InitializeCriticalSectionAndSpinCount function.
But that's not what I want.
Edit:
I want to synchronize memory that will be shared between kernel space and user space - the memory will be mapped.
I need to lock it when I access the data structure, and the locking time will be very short.
The data structure (suppose it is a queue) manages event handles that the two sides use to interact with each other.
What synchronization mechanism should I use?
A spinlock is clearly not appropriate for user-level synchronization. From http://www.microsoft.com/whdc/driver/kernel/locks.mspx:
All types of spin locks raise the IRQL to DISPATCH_LEVEL or higher. Spin locks are the only synchronization mechanism that can be used at IRQL >= DISPATCH_LEVEL. Code that holds a spin lock runs at IRQL >= DISPATCH_LEVEL, which means that the system's thread switching code (the dispatcher) cannot run and, therefore, the current thread cannot be pre-empted.
Imagine if it were possible to take a spin lock in user mode: Suddenly the thread would not be able to be pre-empted. So on a single-cpu machine, this is now an exclusive and real-time thread. The user-mode code would now be responsible for handling interrupts and other kernel-level tasks. The code could no longer access any paged memory, which means that the user-mode code would need to know what memory is currently paged and act accordingly. Cats and dogs living together, mass hysteria!
Perhaps a better question would be to tell us what you are trying to accomplish, and ask what synchronization method would be most appropriate.
There is a managed user-mode SpinLock as described here. Handle with care, as advised in the docs - it's easy to go badly wrong with these locks.
The only way to access this in native code is via the Win32 API you named already - InitializeCriticalSectionAndSpinCount and its siblings.
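A sketch of that API in use (the spin count of 4000 is just an illustrative value):

    /* Sketch: a critical section that spins briefly on contention before
       falling back to a kernel wait. The spin count is illustrative. */
    #include <windows.h>

    CRITICAL_SECTION g_cs;

    void init(void)
    {
        /* On multiprocessor machines, a contended EnterCriticalSection spins
           up to ~4000 iterations before blocking; on uniprocessor machines
           the spin count is ignored. */
        InitializeCriticalSectionAndSpinCount(&g_cs, 4000);
    }

    void access_shared_queue(void)
    {
        EnterCriticalSection(&g_cs);
        /* ... short, bounded work on the shared structure ... */
        LeaveCriticalSection(&g_cs);
    }

Note, though, that a CRITICAL_SECTION lives entirely in user mode and cannot be shared with kernel code; for the kernel/user scenario above, the kernel side needs its own synchronization.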

Avoiding sleep while holding a spinlock

I've recently read section 5.5.2 (Spinlocks and Atomic Context) of LDDv3 book:
Avoiding sleep while holding a lock can be more difficult; many kernel functions can sleep, and this behavior is not always well documented. Copying data to or from user space is an obvious example: the required user-space page may need to be swapped in from the disk before the copy can proceed, and that operation clearly requires a sleep. Just about any operation that must allocate memory can sleep; kmalloc can decide to give up the processor, and wait for more memory to become available unless it is explicitly told not to. Sleeps can happen in surprising places; writing code that will execute under a spinlock requires paying attention to every function that you call.
It's clear to me that spinlocks must always be held for the minimum time possible and I think that it's relatively easy to write correct spinlock-using code from scratch.
Suppose, however, that we have a big project where spinlocks are widely used.
How can we make sure that functions called from critical sections protected by spinlocks will never sleep?
Thanks in advance!
What about enabling "Sleep-inside-spinlock checking" for your kernel? It is usually found under Kernel Debugging when you run make config. You might also try to duplicate its behavior in your code.
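In recent kernels the relevant option is CONFIG_DEBUG_ATOMIC_SLEEP (older trees called the same check CONFIG_DEBUG_SPINLOCK_SLEEP), i.e. something like this in your .config:

    # Option names as found in recent kernels (older trees used
    # CONFIG_DEBUG_SPINLOCK_SLEEP for the same check).
    CONFIG_DEBUG_ATOMIC_SLEEP=y
    CONFIG_DEBUG_PREEMPT=y

With these set, a sleep attempted inside a spinlocked region triggers a loud "BUG: sleeping function called from invalid context" warning at runtime.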
One thing I have noticed on a lot of projects is that people seem to misuse spinlocks - they get used instead of the other locking primitives that should have been used.
A Linux spinlock only really exists in multiprocessor builds (in uniprocessor builds the spinlock preprocessor defines are mostly empty); spinlocks are for short-duration locks on a multiprocessor platform.
If code fails to acquire a spinlock, it just spins the processor until the lock is free. So either another process running on a different processor must free the lock, or possibly it could be freed by an interrupt handler - but the wait-event mechanism is a much better way of waiting on an interrupt.
The irqsave spinlock primitive is a tidy way of disabling/enabling interrupts so a driver can lock out an interrupt handler, but it should only be held long enough for the process to update some variables shared with that handler; if you disable interrupts, you are not going to be scheduled.
If you need to lock out an interrupt handler, use a spinlock with irqsave.
For general kernel locking you should be using the mutex/semaphore API, which will sleep on the lock if it needs to.
To lock against code running in other processes, use a mutex/semaphore.
To lock against code running in an interrupt context, use irq save/restore or spinlock irqsave/irqrestore (see the sketch after this list).
To lock against code running on other processors, use spinlocks and avoid holding the lock for long.
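An illustrative fragment putting those rules together (names invented for the example):

    /* Illustrative: a mutex for process-context-only data, a spinlock with
     * irqsave for data shared with an interrupt handler. */
    #include <linux/mutex.h>
    #include <linux/spinlock.h>

    static DEFINE_MUTEX(cfg_lock);        /* process context only: may sleep */
    static DEFINE_SPINLOCK(stats_lock);   /* shared with the IRQ handler */

    static void update_config(void)
    {
        mutex_lock(&cfg_lock);            /* sleeps politely if contended */
        /* ... long or even sleeping work is fine here ... */
        mutex_unlock(&cfg_lock);
    }

    static void update_stats(void)
    {
        unsigned long flags;

        spin_lock_irqsave(&stats_lock, flags);   /* IRQ handler locked out */
        /* ... keep this window as short as possible ... */
        spin_unlock_irqrestore(&stats_lock, flags);
    }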
I hope this helps
