use of spin variants in network processing - linux-kernel

I have written a Kernel module that is interacting with net-filter hooks.
The net-filter hooks operate in Softirq context.
I am accessing a global data structure
"Hash Table" from the softirq context as well as from Process context. The process context access is due to a sysctl file being used to modify the contents of the Hash-table.
I am using spinlock_irq_save.
Is this choice of spin_lock api correct ?? In terms of performance and locking standards.
what would happen if an interrupt is scheduled on another processor? while on the current processor lock is already hold by a process context code?

Firstly:
So, with all the above details I concluded that my softirqs can run concurrently on both cores.
Yes, this is correct. Your softirq handler may be executed "simultaneously on more than one CPU".
Your conclusion to use spinlocks sounds correct to me. However, this assumes that the critical section (ie., that which is executed with the spinlock held) has the following properties:
It must not sleep (for example, acquire a blocking mutex)
It should be as short as possible
Generally, if you're just updating your hash table, you should be fine here.
If an IRQ handler tries to acquire a spinlock that is held by a process context, that's fine. As long as your process context does not sleep with that lock held, the lock should be released within a short amount of time, allowing the IRQ handler to make forward progress.

I think the solution is appropriate . Softirqs anyways runs with preemption disabled . To share a data with a process, the process must also disable both preemption and interrupts. In case of timer, which only reduces the time stamp of an entry can do it atomically i.e. the time stamp variable must be atomic. If in another core softirqs run and wants to acquire the spinlock, when it is already held in the other core,it must wait.

Related

What happens when a task is executing critical section but it needs to be scheduled out on a uniprocessor system with preemption disabled?

Here is a scenario. Let’s say that a kernel task is running on a uniprocessor system with preemption disabled. The task acquires a spin lock. Now it is executing it’s critical section. At this time, what if the time slice available for this task expires and it has to schedule out?
Does the spin_lock have a mechanism to prevent this?
Can it be scheduled out? If yes, then what happens to the critical section?
Can it be interrupted by an IRQ? (Assuming that preemption is disabled)
Is this scenario feasible? In other words, could this scenario happen?
From the kernel code, I understand that the spin_lock is basically a nop on a uniprocessor with preemption disabled. To be accurate, all it does is barrier()
I understand why it is a nop (as it is a uniprocessor and no other task could be manipulating the data at that instant) but I still don’t understand how it could be uninterrupted(due to IRQs or scheduling).
What am I missing here? Pointers to the Linux kernel code which indicates about this could be really helpful.
My basic assumptions:
32 bit Linux kernel
Actually spin_lock() disables preemption by calling preempt_disable() before it tries to acquire the lock, so scenario #1, #2, #3 could never happen.
From recent source code, spin_lock() eventually calls __raw_spin_lock(), which calls preempt_disable() before calling spin_acquire() to acquire the lock. spin_lock_irqsave() which is commonly used in interrupt context has similar context.
Regarding #3, if the variable is shared between process/interrupt context, you should always use spin_lock_irq()/spin_lock_irqsave() instead of spin_lock() to avoid deadlock scenario.
The mechanism that handles time slices expiring is a timer interrupt. The interrupt will set the TIF_NEEDS_RESCHED flag for the process. When returning from the timer's interrupt context back to your critical section, a check will be made whether or not to preempt the process due to the TIF_NEEDS_RESCHED flag. Since preemption is disabled, nothing will happen and it will return to your critical section.
When your critical section is over, the release of the lock will call preempt_enable() to reenable preemption. At that moment another check is done as to whether or not to preempt. Since the TIF_NEEDS_RESCHED flag is set and preemption is now enabled, the process will be preempted.
Spin locks disable preemption.
No, because preemption is disabled.
Yes. There are spin lock versions that disable IRQs to prevent this.
No because spin locks disable preemption.
Spinlocks don't exist on unitprocessor systems anyway because they don't make sense. If a a thread that doesn't own the lock attempts to acquire it, that means that the thread that does own it is currently asleep (only one cpu). So there's no reason to spin wait for something that's asleep. For this reason spinlocks are optimized away in these cases to just a preemption disable so that no other thread can touch the critical section.

Usage of spinlocks and semaphore in linux in process and interupt context

What would happen if I use semaphore and mutex locks in interrupt context?
Normally semaphore is used in synchronization mechanism. What would happen if I use this one in an interrupt context?
I am working on a project on gpio pins and when interrupt happens, I have to send one signal in ISR. I am using spinlocks.
What would happen if I use semaphore and mutext in ISR?
Waiting in mutexes and semaphores are implemented using switching current task state to TASK_INTERRUPTIBLE/TASK_UNINTERRUPTIBLE and similar with futher call to schedule().
Calling schedule() with current task state differed from TASK_RUNNING leads to switching to another process. And if current refers to interrupt context, you will never return back to it, because scheduling can switch only to the process.
So, when you lock contended(that is, currently locked) semaphore/mutex in interrupt context, you just lost current execution "thread".
If you lock semaphore/mutex which is uncontended(currently not locked), execution will be correct except warning in system log about improper semaphore/mutex usage.

How does Kernel handle the lock in process context when an interrupt comes?

First of all sorry for a little bit ambiguity in Question... What I want to understand is the below scenario
Suppose porcess is running, it holds one lock, Now after acquiring the lock HW interrupt is generated, So How kernel will handle this situation, will it wait for lock ? if yes, what if the interrupt handler need to access that lock or the shared data protected by that lock in process ?
The Linux kernel has a few functions for acquiring spinlocks, to deal with issues like the one you're raising here. In particular, there is spin_lock_irq(), which disables interrupts (on the CPU the process is running on) and acquires the spinlock. This can be used when the code knows interrupts are enabled before the spinlock is acquired; in case the function might be called in different contexts, there is also spin_lock_irqsave(), which stashes away the current state of interrupts before disabling them, so that they can be reenabled by spin_unlock_irqrestore().
In any case, if a lock is used in both process and interrupt context (which is a good and very common design if there is data that needs to be shared between the contexts), then process context must disable interrupts (locally on the CPU it's running on) when acquiring the spinlock to avoid deadlocks. In fact, lockdep ("CONFIG_PROVE_LOCKING") will verify this and warn if a spinlock is used in a way that is susceptible to the "interrupt while process context holds a lock" deadlock.
Let me explain some basic properties of interrupt handler or bottom half.
A handler can’t transfer data to or from user space, because it doesn’t execute in the context of a process.
Handlers also cannot do anything that would sleep, such as calling wait_event, allocating memory with anything other than GFP_ATOMIC, or locking a semaphore
handlers cannot call schedule.
What i am trying to say is that Interrupt handler runs in atomic context. They can not sleep as they cannot be rescheduled. interrupts do not have a backing process context
The above is by design. You can do whatever you want in code, just be prepared for the consequences
Let us assume that you acquire a lock in interrupt handler(bad design).
When an interrupt occur the process saves its register on stack and start ISR. now after acquiring a lock you would be in a deadlock as their is no way ISR know what the process was doing.
The process will not be able to resume execution until it is done it with ISR
In a preemptive kernel the ISR and the process can be preempt but for a non-preemptive kernel you are dead.

Avoid spinlock deadlock

Imagine that a device function holds a spinlock to control access to the device. While the lock is held, the device issues an interrupt, which causes an interrupt handler to run. The interrupt handler, before accessing the device, must also obtain the lock.
Suppose that the interrupt handler executes in the same processor as the code that took out the lock originally.
Knowing that to hold spinlock disables preemption on the relevant processor, is it possible that the code that holds the spinlock be executed on another processor (because of preemption on this processor)? (We suppose that this is a SMP machine)
Is it possible that the code that holds the spinlock be executed on another processor (because of preemption on this processor)?
No, the code just keeps waiting for the interrupt handler to return.
Just use spin_lock_irq*(), or spin_lock_bh() if you also want to protect against softirqs/tasklets.

Avoiding sleep while holding a spinlock

I've recently read section 5.5.2 (Spinlocks and Atomic Context) of LDDv3 book:
Avoiding sleep while holding a lock can be more difficult; many kernel functions can sleep, and this behavior is not always well documented. Copying data to or from user space is an obvious example: the required user-space page may need to be swapped in from the disk before the copy can proceed, and that operation clearly requires a sleep. Just about any operation that must allocate memory can sleep; kmalloc can decide to give up the processor, and wait for more memory to become available unless it is explicitly told not to. Sleeps can happen in surprising places; writing code that will execute under a spinlock requires paying attention to every function that you call.
It's clear to me that spinlocks must always be held for the minimum time possible and I think that it's relatively easy to write correct spinlock-using code from scratch.
Suppose, however, that we have a big project where spinlocks are widely used.
How can we make sure that functions called from critical sections protected by spinlocks will never sleep?
Thanks in advance!
What about enabling "Sleep-inside-spinlock checking" for your kernel ? It is usually found under Kernel Debugging when you run make config. You might also try to duplicate its behavior in your code.
One thing I noticed on a lot of projects is people seem to misuse spinlocks, they get used instead of the other locking primitives that should have be used.
A linux spinlock only exists in multiprocessor builds (in single process builds the spinlock preprocessor defines are empty) spinlocks are for short duration locks on a multi processor platform.
If code fails to aquire a spinlock it just spins the processor until the lock is free. So either another process running on a different processor must free the lock or possibly it could be freed by an interrupt handler but the wait event mechanism is much better way of waiting on an interrupt.
The irqsave spinlock primitive is a tidy way of disabling/ enabling interrupts so a driver can lock out an interrupt handler but this should only be held for long enough for the process to update some variables shared with an interrupt handler, if you disable interupts you are not going to be scheduled.
If you need to lock out an interrupt handler use a spinlock with irqsave.
For general kernel locking you should be using mutex/semaphore api which will sleep on the lock if they need to.
To lock against code running in other processes use muxtex/semaphore
To lock against code running in an interrupt context use irq save/restore or spinlock_irq save/restore
To lock against code running on other processors then use spinlocks and avoid holding the lock for long.
I hope this helps

Resources