md/raid5: deadlock between retry_aligned_read with barrier io
A chunk-aligned read increments the active_aligned_reads counter and decrements it after the sub-device handles the read successfully. When a read error occurs, however, the read is redispatched by raid5d, and active_aligned_reads is not decremented until a stripe head can be grabbed in retry_aligned_read.

Now suppose a barrier io arrives, sets conf->quiesce to 2, and waits until both active_stripes and active_aligned_reads are zero. The retried chunk-aligned read gets stuck in get_active_stripe, waiting until conf->quiesce becomes 0. retry_aligned_read and the barrier io are now waiting for each other.

One possible solution is to ignore conf->quiesce and let the retried aligned read finish.

I reproduced this deadlock and tested this patch on centos6.0.

Signed-off-by: NeilBrown <neilb@suse.de>
parent d592a99691
commit 2844dc32ea
@@ -5115,7 +5115,7 @@ static int retry_aligned_read(struct r5conf *conf, struct bio *raid_bio)
 			/* already done this stripe */
 			continue;
 
-		sh = get_active_stripe(conf, sector, 0, 1, 0);
+		sh = get_active_stripe(conf, sector, 0, 1, 1);
 
 		if (!sh) {
 			/* failed to get a stripe - must wait */
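The circular wait described in the commit message, and the effect of flipping the last argument, can be sketched with a toy model. This is not kernel code: `struct toy_conf` and the `toy_*` helpers are hypothetical names invented for illustration, and the sketch assumes (as the message implies) that the final argument of get_active_stripe is a flag that lets the caller skip the wait on conf->quiesce.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the r5conf fields named in the
 * commit message. The real struct r5conf has many more members. */
struct toy_conf {
	int quiesce;              /* 0 = normal I/O; 2 = barrier in progress (per the message) */
	int active_stripes;       /* stripes currently in use */
	int active_aligned_reads; /* chunk-aligned reads not yet completed */
};

/* Models only the quiesce check: could a caller obtain a stripe now?
 * The noquiesce parameter is an assumed name for the flag the patch sets to 1. */
static bool toy_can_get_stripe(const struct toy_conf *conf, bool noquiesce)
{
	return conf->quiesce == 0 || noquiesce;
}

/* The barrier io proceeds only once both counters have dropped to zero. */
static bool toy_barrier_can_finish(const struct toy_conf *conf)
{
	return conf->active_stripes == 0 && conf->active_aligned_reads == 0;
}

/* A retried aligned read deadlocks when it is still counted, the barrier
 * is therefore blocked, and the read cannot get the stripe it needs in
 * order to complete and drop the counter. */
static bool toy_retry_deadlocks(const struct toy_conf *conf, bool noquiesce)
{
	bool read_pending = conf->active_aligned_reads > 0;

	return read_pending &&
	       !toy_barrier_can_finish(conf) &&
	       !toy_can_get_stripe(conf, noquiesce);
}
```

With quiesce set to 2 and one aligned read pending, `toy_retry_deadlocks(&conf, false)` models the old call (last argument 0) and reports the cycle, while `toy_retry_deadlocks(&conf, true)` models the patched call and does not, because the retried read can take a stripe, finish, and let the counter reach zero.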