linux/drivers/md
NeilBrown f94c0b6658 md/raid5: fix interaction of 'replace' and 'recovery'.
If a device in a RAID4/5/6 is being replaced while another is being
recovered, then the writes to the replacement device currently don't
happen, resulting in corruption when the replacement completes and the
new drive takes over.

This is because the replacement writes are only triggered when
's.replacing' is set and not when the similar 's.sync' is set (which
is the case during resync and recovery - it means all devices need to
be read).

So schedule those writes when s.replacing is set as well.

In this case we cannot use "STRIPE_INSYNC" to record that the
replacement has happened as that is needed for recording that any
parity calculation is complete.  So introduce STRIPE_REPLACED to
record if the replacement has happened.

For safety we should also check that STRIPE_COMPUTE_RUN is not set.
This has a similar effect to the "s.locked == 0" test.  The latter
ensure that now IO has been flagged but not started.  The former
checks if any parity calculation has been flagged by not started.
We must wait for both of these to complete before triggering the
'replace'.

Add a similar test to the subsequent check for "are we finished yet".
This possibly isn't needed (is subsumed in the STRIPE_INSYNC test),
but it makes it more obvious that the REPLACE will happen before we
think we are finished.

Finally if a NeedReplace device is not UPTODATE then that is an
error.  We really must trigger a warning.

This bug was introduced in commit 9a3e1101b8
(md/raid5:  detect and handle replacements during recovery.)
which introduced replacement for raid5.
That was in 3.3-rc3, so any stable kernel since then would benefit
from this fix.

Cc: stable@vger.kernel.org (3.3+)
Reported-by: qindehua <13691222965@163.com>
Tested-by: qindehua <qindehua@163.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2013-07-25 16:46:57 +10:00
..
bcache Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2013-07-04 11:40:58 -07:00
persistent-data dm persistent metadata: add space map threshold callback 2013-05-10 14:37:20 +01:00
bitmap.c md: replace strict_strto*() with kstrto*() 2013-06-14 08:10:26 +10:00
bitmap.h md/bitmap: record the space available for the bitmap in the superblock. 2012-05-22 13:55:34 +10:00
dm-bio-prison.c dm: add cache target 2013-03-01 22:45:51 +00:00
dm-bio-prison.h dm: add cache target 2013-03-01 22:45:51 +00:00
dm-bio-record.h
dm-bufio.c dm bufio: submit writes outside lock 2013-07-10 23:41:18 +01:00
dm-bufio.h dm bufio: prefetch 2012-03-28 18:41:29 +01:00
dm-cache-block-types.h dm: add cache target 2013-03-01 22:45:51 +00:00
dm-cache-metadata.c dm cache: replace memcpy with struct assignment 2013-05-10 14:37:18 +01:00
dm-cache-metadata.h dm cache: policy ignore hints if generated by different version 2013-03-20 17:21:28 +00:00
dm-cache-policy-cleaner.c dm cache: policy change version from string to integer set 2013-03-20 17:21:27 +00:00
dm-cache-policy-internal.h dm cache: policy change version from string to integer set 2013-03-20 17:21:27 +00:00
dm-cache-policy-mq.c dm cache: policy change version from string to integer set 2013-03-20 17:21:27 +00:00
dm-cache-policy.c dm cache: policy change version from string to integer set 2013-03-20 17:21:27 +00:00
dm-cache-policy.h dm cache policy: fix description of lookup fn 2013-05-10 14:37:17 +01:00
dm-cache-target.c dm cache: fix arm link errors with inline 2013-07-10 23:41:17 +01:00
dm-crypt.c block: Convert some code to bio_for_each_segment_all() 2013-03-23 14:26:30 -07:00
dm-delay.c dm: rename request variables to bios 2013-03-01 22:45:47 +00:00
dm-exception-store.c dm: replace simple_strtoul 2012-07-27 15:07:59 +01:00
dm-exception-store.h
dm-flakey.c dm flakey: correct ctr alloc failure mesg 2013-07-10 23:41:17 +01:00
dm-io.c dm kcopyd: add WRITE SAME support to dm_kcopyd_zero 2012-12-21 20:23:37 +00:00
dm-ioctl.c dm: optimize use SRCU and RCU 2013-07-10 23:41:18 +01:00
dm-kcopyd.c dm kcopyd: introduce configurable throttling 2013-03-01 22:45:49 +00:00
dm-linear.c dm: rename request variables to bios 2013-03-01 22:45:47 +00:00
dm-log-userspace-base.c Merge branch 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux 2011-11-06 19:44:47 -08:00
dm-log-userspace-transfer.c connector/userns: replace netlink uses of cap_raised() with capable() 2012-05-10 23:21:39 -04:00
dm-log-userspace-transfer.h
dm-log.c dm: use memweight() 2012-07-30 17:25:16 -07:00
dm-mpath.c dm mpath: fix ioctl deadlock when no paths 2013-07-10 23:41:15 +01:00
dm-mpath.h
dm-path-selector.c md: Add module.h to all files using it implicitly 2011-10-31 19:31:18 -04:00
dm-path-selector.h
dm-queue-length.c dm: reject trailing characters in sccanf input 2012-03-28 18:41:26 +01:00
dm-raid1.c block: Use bio_sectors() more consistently 2013-03-23 14:15:30 -07:00
dm-raid.c MD: Remember the last sync operation that was performed 2013-06-26 12:38:24 +10:00
dm-region-hash.c dm raid1: fix crash with mirror recovery and discard 2012-07-20 14:25:03 +01:00
dm-round-robin.c dm: reject trailing characters in sccanf input 2012-03-28 18:41:26 +01:00
dm-service-time.c dm: reject trailing characters in sccanf input 2012-03-28 18:41:26 +01:00
dm-snap-persistent.c md: Add in export.h for files using EXPORT_SYMBOL 2011-10-31 19:31:19 -04:00
dm-snap-transient.c md: Add in export.h for files using EXPORT_SYMBOL 2011-10-31 19:31:19 -04:00
dm-snap.c dm snapshot: fix error return code in snapshot_ctr 2013-05-10 14:37:15 +01:00
dm-stripe.c dm stripe: fix regression in stripe_width calculation 2013-05-10 14:37:14 +01:00
dm-switch.c dm: add switch target 2013-07-10 23:41:19 +01:00
dm-sysfs.c
dm-table.c dm: optimize use SRCU and RCU 2013-07-10 23:41:18 +01:00
dm-target.c dm: rename request variables to bios 2013-03-01 22:45:47 +00:00
dm-thin-metadata.c dm thin: generate event when metadata threshold passed 2013-05-10 14:37:21 +01:00
dm-thin-metadata.h dm thin: generate event when metadata threshold passed 2013-05-10 14:37:21 +01:00
dm-thin.c dm thin: fix metadata dev resize detection 2013-05-19 18:57:50 +01:00
dm-uevent.c md: Add in export.h for files using EXPORT_SYMBOL 2011-10-31 19:31:19 -04:00
dm-uevent.h
dm-verity.c dm verity: use __ffs and __fls 2013-07-10 23:41:17 +01:00
dm-zero.c dm: rename request variables to bios 2013-03-01 22:45:47 +00:00
dm.c dm: optimize reorder structure 2013-07-10 23:41:18 +01:00
dm.h dm: introduce per_bio_data 2012-12-21 20:23:38 +00:00
faulty.c block: Add bio_end_sector() 2013-03-23 14:15:29 -07:00
Kconfig dm: add switch target 2013-07-10 23:41:19 +01:00
linear.c block: Add bio_end_sector() 2013-03-23 14:15:29 -07:00
linear.h md/linear: typedef removal: linear_conf_t -> struct linear_conf 2011-10-11 16:48:54 +11:00
Makefile dm: add switch target 2013-07-10 23:41:19 +01:00
md.c md: Remove recent change which allows devices to skip recovery. 2013-07-18 14:18:03 +10:00
md.h MD: Remember the last sync operation that was performed 2013-06-26 12:38:24 +10:00
multipath.c MD: change the parameter of md thread 2012-10-11 13:34:00 +11:00
multipath.h md/multipath: typedef removal: multipath_conf_t -> struct mpconf 2011-10-11 16:48:57 +11:00
raid0.c md: fix buglet in RAID5 -> RAID0 conversion. 2013-06-26 12:38:19 +10:00
raid0.h md: add proper merge_bvec handling to RAID0 and Linear. 2012-03-19 12:46:39 +11:00
raid1.c md/raid1: fix bio handling problems in process_checks() 2013-07-18 14:18:04 +10:00
raid1.h md/raid1: prevent merging too large request 2012-07-31 10:03:53 +10:00
raid5.c md/raid5: fix interaction of 'replace' and 'recovery'. 2013-07-25 16:46:57 +10:00
raid5.h md/raid5: fix interaction of 'replace' and 'recovery'. 2013-07-25 16:46:57 +10:00
raid10.c md/raid10: remove use-after-free bug. 2013-07-25 16:46:53 +10:00
raid10.h MD RAID10: Improve redundancy for 'far' and 'offset' algorithms (part 1) 2013-02-26 11:55:30 +11:00