NeilBrown 6d183de407 md/raid5: fix newly-broken locking in get_active_stripe.
commit 566c09c53455d7c4f1 raid5: relieve lock contention in get_active_stripe()

modified the locking in get_active_stripe(), reducing the range
protected by the (highly contended) device_lock.
Unfortunately it reduced the range too much, opening up some races.

One race can occur if get_priority_stripe runs between the
test on sh->count and device_lock being taken.
This will mean that sh->lru is empty while get_active_stripe
thinks ->count is zero, resulting in a 'BUG' firing.

Another race happens if __release_stripe is called immediately
after sh->count is tested and found to be non-zero.  If STRIPE_HANDLE
is not set, get_active_stripe should increment ->active_stripes
when it increments ->count from 0, but as it didn't think it was 0,
it doesn't.
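
The second race can be replayed deterministically in a single thread.
The following is only a sketch: the racy_* names and structs are
hypothetical stand-ins for the kernel code, the unlocked test and the
later update are split into two calls so that the release can be
placed between them, and the STRIPE_HANDLE bit is reduced to a bool.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins, not the kernel definitions. */
struct racy_stripe { int count; bool handle; };   /* sh->count, STRIPE_HANDLE */
struct racy_conf   { int active_stripes; };       /* conf->active_stripes */

/* Step 1 of the broken path: test the count without holding the lock. */
bool racy_test_count_nonzero(const struct racy_stripe *sh)
{
    return sh->count != 0;
}

/* Step 2 of the broken path: act on the (possibly stale) test result. */
void racy_take_reference(struct racy_conf *conf, struct racy_stripe *sh,
                         bool saw_nonzero)
{
    if (!saw_nonzero && !sh->handle)
        conf->active_stripes++;   /* only done if count looked like 0 */
    sh->count++;
}

/* Drop a reference; the last drop makes the stripe inactive. */
void racy_release(struct racy_conf *conf, struct racy_stripe *sh)
{
    if (--sh->count == 0 && !sh->handle)
        conf->active_stripes--;
}

/* Replay the bad interleaving and return the resulting active_stripes. */
int racy_replay(void)
{
    struct racy_conf conf = { .active_stripes = 1 };
    struct racy_stripe sh = { .count = 1, .handle = false };

    bool saw_nonzero = racy_test_count_nonzero(&sh); /* sees count == 1 */
    racy_release(&conf, &sh);        /* other CPU drops the last reference */
    racy_take_reference(&conf, &sh, saw_nonzero);    /* 0 -> 1, no increment */

    assert(sh.count == 1);           /* the stripe is referenced again ... */
    return conf.active_stripes;      /* ... but is not counted as active */
}
```

In this replay the stripe ends up with one reference while
active_stripes stays at zero, which is the undercount described above.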

Extending device_lock to cover the test on sh->count closes these
races.
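
As a rough userspace illustration of that locking pattern (not the
kernel code itself), the sketch below tests the count and updates the
related state inside one critical section. All model_* names are
hypothetical; a C11 atomic_flag spin loop stands in for device_lock,
and the list membership of sh->lru is reduced to a bool.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins, not the kernel definitions. */
struct model_conf {
    atomic_flag device_lock;   /* stands in for conf->device_lock */
    int active_stripes;        /* stands in for conf->active_stripes */
};

struct model_stripe {
    int count;                 /* stands in for sh->count */
    bool on_lru;               /* true while sh->lru would be non-empty */
    bool handle;               /* stands in for the STRIPE_HANDLE bit */
};

void model_conf_init(struct model_conf *conf)
{
    atomic_flag_clear(&conf->device_lock);
    conf->active_stripes = 0;
}

void model_lock(struct model_conf *conf)
{
    while (atomic_flag_test_and_set_explicit(&conf->device_lock,
                                             memory_order_acquire))
        ;  /* spin until the flag is released */
}

void model_unlock(struct model_conf *conf)
{
    atomic_flag_clear_explicit(&conf->device_lock, memory_order_release);
}

/* Take a reference: the test on count is made with the lock held. */
void model_get_active_stripe(struct model_conf *conf, struct model_stripe *sh)
{
    model_lock(conf);
    if (sh->count == 0) {
        if (!sh->handle)
            conf->active_stripes++;
        assert(sh->on_lru);    /* an unreferenced stripe must be on a list */
        sh->on_lru = false;
    }
    sh->count++;
    model_unlock(conf);
}

/* Drop a reference under the same lock. */
void model_release_stripe(struct model_conf *conf, struct model_stripe *sh)
{
    model_lock(conf);
    assert(sh->count > 0);
    if (--sh->count == 0) {
        sh->on_lru = true;
        if (!sh->handle)
            conf->active_stripes--;
    }
    model_unlock(conf);
}
```

Because both functions make their decision on count while holding the
same lock, neither can act on a value the other has since changed.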

While we are here, fix the two BUG tests:
 - If count is zero, then lru really must not be empty, or we've
   lost the stripe_head somehow - no other tests are relevant.
 - STRIPE_ON_RELEASE_LIST is completely independent of ->lru so
   testing it is pointless.

Reported-and-tested-by: Brassow Jonathan <jbrassow@redhat.com>
Reviewed-by: Shaohua Li <shli@kernel.org>
Fixes: 566c09c53455d7c4f1
Signed-off-by: NeilBrown <neilb@suse.de>
2013-11-28 11:00:15 +11:00