7e7eb5ae4e
Most Linux-kernel uses of locking are straightforward, but there are corner-case uses that rely on less well-known aspects of the lock and unlock primitives. This commit therefore adds a locking.txt and litmus tests in Documentation/litmus-tests/locking to explain these corner-case uses. [ paulmck: Apply Andrea Parri feedback for klitmus7. ] [ paulmck: Apply Akira Yokosawa example-consistency feedback. ] Reviewed-by: Akira Yokosawa <akiyks@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
299 lines
8.5 KiB
Plaintext
299 lines
8.5 KiB
Plaintext
Locking
|
|
=======
|
|
|
|
Locking is well-known and the common use cases are straightforward: Any
|
|
CPU holding a given lock sees any changes previously seen or made by any
|
|
CPU before it previously released that same lock. This last sentence
|
|
is the only part of this document that most developers will need to read.
|
|
|
|
However, developers who would like to also access lock-protected shared
|
|
variables outside of their corresponding locks should continue reading.
|
|
|
|
|
|
Locking and Prior Accesses
|
|
--------------------------
|
|
|
|
The basic rule of locking is worth repeating:
|
|
|
|
Any CPU holding a given lock sees any changes previously seen
|
|
or made by any CPU before it previously released that same lock.
|
|
|
|
Note that this statement is a bit stronger than "Any CPU holding a
|
|
given lock sees all changes made by any CPU during the time that CPU was
|
|
previously holding this same lock". For example, consider the following
|
|
pair of code fragments:
|
|
|
|
/* See MP+polocks.litmus. */
|
|
void CPU0(void)
|
|
{
|
|
WRITE_ONCE(x, 1);
|
|
spin_lock(&mylock);
|
|
WRITE_ONCE(y, 1);
|
|
spin_unlock(&mylock);
|
|
}
|
|
|
|
void CPU1(void)
|
|
{
|
|
spin_lock(&mylock);
|
|
r0 = READ_ONCE(y);
|
|
spin_unlock(&mylock);
|
|
r1 = READ_ONCE(x);
|
|
}
|
|
|
|
The basic rule guarantees that if CPU0() acquires mylock before CPU1(),
|
|
then both r0 and r1 must be set to the value 1. This also has the
|
|
consequence that if the final value of r0 is equal to 1, then the final
|
|
value of r1 must also be equal to 1. In contrast, the weaker rule would
|
|
say nothing about the final value of r1.
|
|
|
|
|
|
Locking and Subsequent Accesses
|
|
-------------------------------
|
|
|
|
The converse to the basic rule also holds: Any CPU holding a given
|
|
lock will not see any changes that will be made by any CPU after it
|
|
subsequently acquires this same lock. This converse statement is
|
|
illustrated by the following litmus test:
|
|
|
|
/* See MP+porevlocks.litmus. */
|
|
void CPU0(void)
|
|
{
|
|
r0 = READ_ONCE(y);
|
|
spin_lock(&mylock);
|
|
r1 = READ_ONCE(x);
|
|
spin_unlock(&mylock);
|
|
}
|
|
|
|
void CPU1(void)
|
|
{
|
|
spin_lock(&mylock);
|
|
WRITE_ONCE(x, 1);
|
|
spin_unlock(&mylock);
|
|
WRITE_ONCE(y, 1);
|
|
}
|
|
|
|
This converse to the basic rule guarantees that if CPU0() acquires
|
|
mylock before CPU1(), then both r0 and r1 must be set to the value 0.
|
|
This also has the consequence that if the final value of r1 is equal
|
|
to 0, then the final value of r0 must also be equal to 0. In contrast,
|
|
the weaker rule would say nothing about the final value of r0.
|
|
|
|
These examples show only a single pair of CPUs, but the effects of the
|
|
locking basic rule extend across multiple acquisitions of a given lock
|
|
across multiple CPUs.
|
|
|
|
|
|
Double-Checked Locking
|
|
----------------------
|
|
|
|
It is well known that more than just a lock is required to make
|
|
double-checked locking work correctly, This litmus test illustrates
|
|
one incorrect approach:
|
|
|
|
/* See Documentation/litmus-tests/locking/DCL-broken.litmus. */
|
|
void CPU0(void)
|
|
{
|
|
r0 = READ_ONCE(flag);
|
|
if (r0 == 0) {
|
|
spin_lock(&lck);
|
|
r1 = READ_ONCE(flag);
|
|
if (r1 == 0) {
|
|
WRITE_ONCE(data, 1);
|
|
WRITE_ONCE(flag, 1);
|
|
}
|
|
spin_unlock(&lck);
|
|
}
|
|
r2 = READ_ONCE(data);
|
|
}
|
|
/* CPU1() is the exactly the same as CPU0(). */
|
|
|
|
There are two problems. First, there is no ordering between the first
|
|
READ_ONCE() of "flag" and the READ_ONCE() of "data". Second, there is
|
|
no ordering between the two WRITE_ONCE() calls. It should therefore be
|
|
no surprise that "r2" can be zero, and a quick herd7 run confirms this.
|
|
|
|
One way to fix this is to use smp_load_acquire() and smp_store_release()
|
|
as shown in this corrected version:
|
|
|
|
/* See Documentation/litmus-tests/locking/DCL-fixed.litmus. */
|
|
void CPU0(void)
|
|
{
|
|
r0 = smp_load_acquire(&flag);
|
|
if (r0 == 0) {
|
|
spin_lock(&lck);
|
|
r1 = READ_ONCE(flag);
|
|
if (r1 == 0) {
|
|
WRITE_ONCE(data, 1);
|
|
smp_store_release(&flag, 1);
|
|
}
|
|
spin_unlock(&lck);
|
|
}
|
|
r2 = READ_ONCE(data);
|
|
}
|
|
/* CPU1() is the exactly the same as CPU0(). */
|
|
|
|
The smp_load_acquire() guarantees that its load from "flags" will
|
|
be ordered before the READ_ONCE() from data, thus solving the first
|
|
problem. The smp_store_release() guarantees that its store will be
|
|
ordered after the WRITE_ONCE() to "data", solving the second problem.
|
|
The smp_store_release() pairs with the smp_load_acquire(), thus ensuring
|
|
that the ordering provided by each actually takes effect. Again, a
|
|
quick herd7 run confirms this.
|
|
|
|
In short, if you access a lock-protected variable without holding the
|
|
corresponding lock, you will need to provide additional ordering, in
|
|
this case, via the smp_load_acquire() and the smp_store_release().
|
|
|
|
|
|
Ordering Provided by a Lock to CPUs Not Holding That Lock
|
|
---------------------------------------------------------
|
|
|
|
It is not necessarily the case that accesses ordered by locking will be
|
|
seen as ordered by CPUs not holding that lock. Consider this example:
|
|
|
|
/* See Z6.0+pooncelock+pooncelock+pombonce.litmus. */
|
|
void CPU0(void)
|
|
{
|
|
spin_lock(&mylock);
|
|
WRITE_ONCE(x, 1);
|
|
WRITE_ONCE(y, 1);
|
|
spin_unlock(&mylock);
|
|
}
|
|
|
|
void CPU1(void)
|
|
{
|
|
spin_lock(&mylock);
|
|
r0 = READ_ONCE(y);
|
|
WRITE_ONCE(z, 1);
|
|
spin_unlock(&mylock);
|
|
}
|
|
|
|
void CPU2(void)
|
|
{
|
|
WRITE_ONCE(z, 2);
|
|
smp_mb();
|
|
r1 = READ_ONCE(x);
|
|
}
|
|
|
|
Counter-intuitive though it might be, it is quite possible to have
|
|
the final value of r0 be 1, the final value of z be 2, and the final
|
|
value of r1 be 0. The reason for this surprising outcome is that CPU2()
|
|
never acquired the lock, and thus did not fully benefit from the lock's
|
|
ordering properties.
|
|
|
|
Ordering can be extended to CPUs not holding the lock by careful use
|
|
of smp_mb__after_spinlock():
|
|
|
|
/* See Z6.0+pooncelock+poonceLock+pombonce.litmus. */
|
|
void CPU0(void)
|
|
{
|
|
spin_lock(&mylock);
|
|
WRITE_ONCE(x, 1);
|
|
WRITE_ONCE(y, 1);
|
|
spin_unlock(&mylock);
|
|
}
|
|
|
|
void CPU1(void)
|
|
{
|
|
spin_lock(&mylock);
|
|
smp_mb__after_spinlock();
|
|
r0 = READ_ONCE(y);
|
|
WRITE_ONCE(z, 1);
|
|
spin_unlock(&mylock);
|
|
}
|
|
|
|
void CPU2(void)
|
|
{
|
|
WRITE_ONCE(z, 2);
|
|
smp_mb();
|
|
r1 = READ_ONCE(x);
|
|
}
|
|
|
|
This addition of smp_mb__after_spinlock() strengthens the lock
|
|
acquisition sufficiently to rule out the counter-intuitive outcome.
|
|
In other words, the addition of the smp_mb__after_spinlock() prohibits
|
|
the counter-intuitive result where the final value of r0 is 1, the final
|
|
value of z is 2, and the final value of r1 is 0.
|
|
|
|
|
|
No Roach-Motel Locking!
|
|
-----------------------
|
|
|
|
This example requires familiarity with the herd7 "filter" clause, so
|
|
please read up on that topic in litmus-tests.txt.
|
|
|
|
It is tempting to allow memory-reference instructions to be pulled
|
|
into a critical section, but this cannot be allowed in the general case.
|
|
For example, consider a spin loop preceding a lock-based critical section.
|
|
Now, herd7 does not model spin loops, but we can emulate one with two
|
|
loads, with a "filter" clause to constrain the first to return the
|
|
initial value and the second to return the updated value, as shown below:
|
|
|
|
/* See Documentation/litmus-tests/locking/RM-fixed.litmus. */
|
|
void CPU0(void)
|
|
{
|
|
spin_lock(&lck);
|
|
r2 = atomic_inc_return(&y);
|
|
WRITE_ONCE(x, 1);
|
|
spin_unlock(&lck);
|
|
}
|
|
|
|
void CPU1(void)
|
|
{
|
|
r0 = READ_ONCE(x);
|
|
r1 = READ_ONCE(x);
|
|
spin_lock(&lck);
|
|
r2 = atomic_inc_return(&y);
|
|
spin_unlock(&lck);
|
|
}
|
|
|
|
filter (1:r0=0 /\ 1:r1=1)
|
|
exists (1:r2=1)
|
|
|
|
The variable "x" is the control variable for the emulated spin loop.
|
|
CPU0() sets it to "1" while holding the lock, and CPU1() emulates the
|
|
spin loop by reading it twice, first into "1:r0" (which should get the
|
|
initial value "0") and then into "1:r1" (which should get the updated
|
|
value "1").
|
|
|
|
The "filter" clause takes this into account, constraining "1:r0" to
|
|
equal "0" and "1:r1" to equal 1.
|
|
|
|
Then the "exists" clause checks to see if CPU1() acquired its lock first,
|
|
which should not happen given the filter clause because CPU0() updates
|
|
"x" while holding the lock. And herd7 confirms this.
|
|
|
|
But suppose that the compiler was permitted to reorder the spin loop
|
|
into CPU1()'s critical section, like this:
|
|
|
|
/* See Documentation/litmus-tests/locking/RM-broken.litmus. */
|
|
void CPU0(void)
|
|
{
|
|
int r2;
|
|
|
|
spin_lock(&lck);
|
|
r2 = atomic_inc_return(&y);
|
|
WRITE_ONCE(x, 1);
|
|
spin_unlock(&lck);
|
|
}
|
|
|
|
void CPU1(void)
|
|
{
|
|
spin_lock(&lck);
|
|
r0 = READ_ONCE(x);
|
|
r1 = READ_ONCE(x);
|
|
r2 = atomic_inc_return(&y);
|
|
spin_unlock(&lck);
|
|
}
|
|
|
|
filter (1:r0=0 /\ 1:r1=1)
|
|
exists (1:r2=1)
|
|
|
|
If "1:r0" is equal to "0", "1:r1" can never equal "1" because CPU0()
|
|
cannot update "x" while CPU1() holds the lock. And herd7 confirms this,
|
|
showing zero executions matching the "filter" criteria.
|
|
|
|
And this is why Linux-kernel lock and unlock primitives must prevent
|
|
code from entering critical sections. It is not sufficient to only
|
|
prevent code from leaving them.
|