mirror of git://sourceware.org/git/lvm2.git synced 2024-10-28 03:27:58 +03:00
lvm2/lib
Jonathan Brassow d5896f0afd Mirror: Fix hangs and lock-ups caused by attempting label reads of mirrors
There is a problem with the way mirrors have been designed to handle
failures that is resulting in stuck LVM processes and hung I/O.  When
mirrors encounter a write failure, they block I/O and notify userspace
to reconfigure the mirror to remove failed devices.  This process is
open to a couple of races:
1) Any LVM process other than the one that is meant to deal with the
mirror failure can attempt to read the mirror, fail, and block other
LVM commands (including the repair command) from proceeding due to
holding a lock on the volume group.
2) If there are multiple mirrors that suffer a failure in the same
volume group, a repair can block while attempting to read the LVM
label from one mirror while trying to repair the other.

Mitigation of these races has been attempted by disallowing label reading
of mirrors that are either suspended or are indicated as blocking by
the kernel.  While this has narrowed the window of opportunity for hitting
the above problems considerably, it hasn't closed it completely.  This is
because it is still possible to start an LVM command, read the status of
the mirror as healthy, and then perform the read for the label at the
moment after the failure is discovered by the kernel.

I can see two solutions to this problem:
1) Allow users to configure whether mirrors can be candidates for LVM
labels (i.e. whether PVs can be created on mirror LVs).  If the user
chooses to allow label scanning of mirror LVs, it will be at the expense
of a possible hang in I/O or LVM processes.
2) Instrument a way to allow asynchronous label reading - allowing
blocked label reads to be ignored while continuing to process the LVM
command.  This action would allow LVM commands to continue even
though they would have otherwise blocked trying to read a mirror.  They
can then release their lock and allow a repair command to commence.  In
the event of race #2 above, the repair command already in progress can
continue and repair the failed mirror.

This patch brings solution #1.  If solution #2 is developed later on, the
configuration option created in #1 can be negated - allowing mirrors to
be scanned for labels by default once again.
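
As a rough illustration of solution #1, the decision of whether to read a
label from a device could be sketched as below.  This is only a sketch under
stated assumptions, not the code from this patch: the names scan_policy,
dev_state, ignore_mirrors and should_read_label are all hypothetical and are
not taken from LVM2.

```c
#include <stdbool.h>

/* Hypothetical device kinds; these are NOT LVM2's real types. */
enum dev_kind { DEV_PLAIN, DEV_MIRROR_LV };

/* Hypothetical stand-in for the configuration option described above. */
struct scan_policy {
    bool ignore_mirrors;   /* true: never read labels from mirror LVs */
};

/* Hypothetical snapshot of what the command knows about a device. */
struct dev_state {
    enum dev_kind kind;
    bool suspended;        /* device is suspended */
    bool blocking;         /* kernel reports the mirror as blocking I/O */
};

/* Decide whether a label read should be attempted on this device. */
static bool should_read_label(const struct scan_policy *p,
                              const struct dev_state *d)
{
    if (d->kind != DEV_MIRROR_LV)
        return true;

    /* Solution #1: with the option set, mirrors are never candidates. */
    if (p->ignore_mirrors)
        return false;

    /*
     * Earlier mitigation: skip mirrors known to be suspended or blocking.
     * This remains racy, because the mirror can fail after this status
     * check and before the label read is issued.
     */
    return !(d->suspended || d->blocking);
}
```

With ignore_mirrors set, a mirror is skipped even when it currently looks
healthy, which is what closes the remaining window; with it cleared, only
the status check stands between the command and a possibly blocking read.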
2013-10-22 19:14:33 -05:00
..
activate Mirror: Fix hangs and lock-ups caused by attempting label reads of mirrors 2013-10-22 19:14:33 -05:00
cache metadata: Fix metadata repair paths when lvmetad is used. 2013-10-09 14:44:01 +02:00
commands Mirror: Fix hangs and lock-ups caused by attempting label reads of mirrors 2013-10-22 19:14:33 -05:00
config Mirror: Fix hangs and lock-ups caused by attempting label reads of mirrors 2013-10-22 19:14:33 -05:00
datastruct Switch to return void 2012-02-08 12:52:58 +00:00
device filters: Add NVM Express (nvme). 2013-10-09 20:08:07 +01:00
display display: fix units for sizes <1k 2013-07-18 17:55:58 +01:00
error cleanup: drop unneeded included header files 2012-08-23 14:37:20 +02:00
filters filter-mpath: remove superfluous error message about mpath major not equal to dm major 2013-08-21 14:07:01 +02:00
format1 cleanup: drop unused headers 2013-06-16 00:07:32 +02:00
format_pool cleanup: drop unused headers 2013-06-16 00:07:32 +02:00
format_text activation: flag temporary LVs internally 2013-10-23 14:09:37 +02:00
freeseg cleanup: drop unneeded included header files 2012-08-23 14:37:20 +02:00
label logging: classify log_debug messages 2013-01-07 22:30:29 +00:00
locking activation: flag temporary LVs internally 2013-10-23 14:09:37 +02:00
log logging: tidy log_sys_error when string empty 2013-08-12 18:40:41 +01:00
metadata activation: flag temporary LVs internally 2013-10-23 14:09:37 +02:00
mirror config: add profile arg to find_config_tree_str 2013-07-02 15:19:09 +02:00
misc Mirror: Fix hangs and lock-ups caused by attempting label reads of mirrors 2013-10-22 19:14:33 -05:00
mm config: add profile arg to find_config_tree_bool 2013-07-02 15:19:09 +02:00
properties lvm2app: Add thin and thin pool lv creation 2013-07-12 16:52:16 -05:00
raid fix: also make commit b4637 work without dmeventd 2013-09-30 08:17:56 +02:00
replicator cleanup: drop unneeded included header files 2012-08-23 14:37:20 +02:00
report lvs: Add seg_size_pe field. 2013-09-23 21:50:14 +01:00
snapshot snapshot: rework parsing of snapshot metadata 2013-10-14 00:26:58 +02:00
striped Add activation/use_linear_target enabled by default. (prajnoha) 2011-11-28 20:37:51 +00:00
thin fix: also make commit b4637 work without dmeventd 2013-09-30 08:17:56 +02:00
unknown cleanup: drop unneeded included header files 2012-08-23 14:37:20 +02:00
uuid Revert the #include changes. Need to fix this at the #include site for now, and 2011-07-18 14:34:33 +00:00
zero cleanup: drop unneeded included header files 2012-08-23 14:37:20 +02:00
Makefile.in filters: check for mpath before opening devs 2013-08-13 23:26:58 +01:00