1
0
mirror of git://sourceware.org/git/lvm2.git synced 2024-12-22 17:35:59 +03:00
Commit Graph

119 Commits

Author SHA1 Message Date
Jonathan Brassow
d5896f0afd Mirror: Fix hangs and lock-ups caused by attempting label reads of mirrors
There is a problem with the way mirrors have been designed to handle
failures that is resulting in stuck LVM processes and hung I/O.  When
mirrors encounter a write failure, they block I/O and notify userspace
to reconfigure the mirror to remove failed devices.  This process is
open to a couple races:
1) Any LVM process other than the one that is meant to deal with the
mirror failure can attempt to read the mirror, fail, and block other
LVM commands (including the repair command) from proceeding due to
holding a lock on the volume group.
2) If there are multiple mirrors that suffer a failure in the same
volume group, a repair can block while attempting to read the LVM
label from one mirror while trying to repair the other.

Mitigation of these races has been attempted by disallowing label reading
of mirrors that are either suspended or are indicated as blocking by
the kernel.  While this has closed the window of opportunity for hitting
the above problems considerably, it hasn't closed it completely.  This is
because it is still possible to start an LVM command, read the status of
the mirror as healthy, and then perform the read for the label at the
moment after a the failure is discovered by the kernel.

I can see two solutions to this problem:
1) Allow users to configure whether mirrors can be candidates for LVM
labels (i.e. whether PVs can be created on mirror LVs).  If the user
chooses to allow label scanning of mirror LVs, it will be at the expense
of a possible hang in I/O or LVM processes.
2) Instrument a way to allow asynchronous label reading - allowing
blocked label reads to be ignored while continuing to process the LVM
command.  This would action would allow LVM commands to continue even
though they would have otherwise blocked trying to read a mirror.  They
can then release their lock and allow a repair command to commence.  In
the event of #2 above, the repair command already in progress can continue
and repair the failed mirror.

This patch brings solution #1.  If solution #2 is developed later on, the
configuration option created in #1 can be negated - allowing mirrors to
be scanned for labels by default once again.
2013-10-22 19:14:33 -05:00
Alasdair G Kergon
04d9a52684 release 2.02.103
52 files changed, 598 insertions(+), 264 deletions(-)
2013-10-04 14:32:23 +01:00
Peter Rajnoha
8cf0810d57 thin: rename thin_pool_chunk_size_calculation -> ..size_policy and rename "default" policy to "generic"
Just to be consistent with existing naming we use.
2013-10-04 12:30:33 +02:00
Peter Rajnoha
b4637bd298 fix: make it possible to compile with --disable-devmapper again
Some code has been added recently which makes it impossible to compile
when "configure --disable-devmapper" is used. This patch just shuffles
the code around so it's under proper #ifdef DEVMAPPER_SUPPORT.
2013-09-27 13:58:55 +02:00
Peter Rajnoha
cc9e65c391 thin: use appropriate default value based on allocation/thin_pool_chunk_size_calculation setting
If thin_pool_chunk_size_calculation is set to "default", use 64KiB,
otheriwse 512KiB for "performance".
2013-09-25 16:06:38 +02:00
Peter Rajnoha
8bf425005c conf: add allocation/thin_pool_chunk_size_calculation
Add allocation/thin_pool_chunk_size_calculation lvm.conf
option to select a method for calculating thin pool chunk
sizes and define two possible values - "default" and "performance".
2013-09-25 16:06:38 +02:00
Alasdair G Kergon
a3a5f58c21 reporting: Add devtypes command.
Add internal devtypes reporting command to display built-in recognised
block device types.  (The output does not include any additional
types added by a configuration file.)

> lvm devtypes -o help
  Device Types Fields
  -------------------
    devtype_all            - All fields in this section.
    devtype_name           - Name of Device Type exactly as it appears in /proc/devices.
    devtype_max_partitions - Maximum number of partitions. (How many device minor numbers get reserved for each device.)
    devtype_description    - Description of Device Type.

> lvm devtypes
  DevType       MaxParts Description
  aoe                 16 ATA over Ethernet
  ataraid             16 ATA Raid
  bcache               1 bcache block device cache
  blkext               1 Extended device partitions
...
2013-09-18 01:09:15 +01:00
Jonathan Brassow
c13d1b11b2 RAID: Make "raid10" the default striped + mirror segment type
When both the '-i' and '-m' arguments are specified on the command
line, use the "raid10" segment type.  This way, the native RAID10
personality is used through dm-raid rather than layering a mirror
on striped LVs.  If the old behavior is desired, the '--type'
argument to use would be "mirror" rather than "raid10".
2013-08-06 14:15:08 -05:00
Jonathan Brassow
7e1083c985 RAID: Make "raid1" the default mirror segment type 2013-08-06 14:13:55 -05:00
Zdenek Kabelac
22fc80982a thin: add thin_repair and thin_dump options
Add new configure lvm.conf options for binaries thin_repair
and thin_dump.

Those are part of device-mapper-persistent-data package
and will be used for recovery of thin_pool.
2013-07-31 15:30:47 +02:00
Zdenek Kabelac
3075784955 thin: add spare lvcreate support
Add --poolmetadataspare option and creates and handles
pool metadata spare lv when thin pool is created.
With default setting 'y' it tries to ensure, spare has
at least the size of created LV.
2013-07-18 18:22:44 +02:00
Peter Rajnoha
8d1e511363 conf: add activation/auto_set_activation_skip
The activation/auto_set_activation_skip enables/disables automatic
adding of the ACTIVATION_SKIP LV flag. By default thin snapshots
are flagged to be skipped during activation.

And by default, the auto_set_activation_skip is enabled.
2013-07-12 20:54:17 +02:00
Peter Rajnoha
9f6cfc9de4 report: add vg_profile and lv_profile to report the profile attached to VG/LV
vgs -o vg_profile ...
lvs -o lv_profile ...
2013-07-02 15:22:12 +02:00
Peter Rajnoha
f88690221b config: make DEFAULT_MAX_HISTORY unconditional 2013-03-06 12:47:23 +01:00
Peter Rajnoha
386886f71c config: refer to config nodes using assigned IDs
For example, the old call and reference:

  find_config_tree_str(cmd, "devices/dir", DEFAULT_DEV_DIR)

...now becomes:

  find_config_tree_str(cmd, devices_dir_CFG)

So we're referring to the named configuration ID instead
of passing the configuration path and the default value
is taken from central config definition in config_settings.h
automatically.
2013-03-06 10:14:33 +01:00
Jonathan Brassow
70f57996b3 RAID: Add new 'raid10_segtype_default' setting in lvm.conf
If '--mirrors/-m' and '--stripes/-i' are used together when creating
a logical volume, mirrors-over-stripes is currently chosen.  The user
can override this by using the '--type raid10' option on creation.
However, we want a place where we can set the default behavior to
'raid10' explicitly - similar to the "mirror" and "raid1" tunable,
mirror_segtype_default.

A follow-on patch should use this new setting to change the default
from "mirror" to "raid10", as this is the preferred segment type.
2013-02-20 15:10:04 -06:00
Jonathan Brassow
0e4ffd9d3b clean-up: Rename lvm.conf setting 'mirror_region_size' to 'raid_region_size'
We have been using 'mirror_region_size' in lvm.conf as the default region
size for RAID logical volumes as well as mirror logical volumes.  Since,
"raid" is more inclusive and representative than "mirror", I have changed
the name of this setting.  We must still check for the old setting and warn
the user if we are overriding it with the new setting if both happen to be
present.
2013-02-20 14:40:17 -06:00
Alasdair G Kergon
7f747a0d73 logging: add debug classes
Add log/debug_classes to lvm.conf to allow debug messages to be
classified and filtered at runtime.

The dm_errno field is only used by log_error(), so I've redefined it
for log_debug() messages to hold the message class.

By default, all existing messages appear, but we can add categories that
generate high volumes of data, such as logging all traffic to/from
lvmetad.
2013-01-07 22:25:19 +00:00
Zdenek Kabelac
1ef9831018 thin: support configurable thin pool defaults
Configurable settings for thin pool create
if they are not specified on command line.

New supported lvm.conf options are:
  allocation/thin_pool_chunk_size
  allocation/thin_pool_discards
  allocation/thin_pool_zero
2012-11-26 12:16:47 +01:00
Alasdair G Kergon
438e0050df config: add silent mode
Accept -q as the short form of --quiet.
Suppress non-essential standard output if -q is given twice.
Treat log/silent in lvm.conf as equivalent to -qq.
Review all log_print messages and change some to
log_print_unless_silent.

When silent, the following commands still produce output:
dumpconfig, lvdisplay, lvmdiskscan, lvs, pvck, pvdisplay,
pvs, version, vgcfgrestore -l, vgdisplay, vgs.
[Needs checking.]

Non-essential messages are shifted from log level 4 to log level 5
for syslog and lvm2_log_fn purposes.
2012-08-25 20:35:48 +01:00
Zdenek Kabelac
e866931169 Improve thin_check option passing
Update a way we handle option passing - so we now support path and options
with space inside.
Fix dm name usage for thin pools with '-' in name.
Use new lvm.conf option thin_check_options to pass in options as string array.
2012-03-14 17:12:05 +00:00
Zdenek Kabelac
90423c1200 Fit thin pool metadata into 128MB
If the lvcreate may decide some automagical values for a user,
try to keep the pool metadata size into 128MB range for optimal
perfomance (as suggested by Joe).

So if the pool metadata size and chunk_size were not specified,
try to select such values they would fit into 128MB size.
2012-03-05 14:19:13 +00:00
Zdenek Kabelac
6c7a6c07ee Add support for thin check
Use libdm callback to execute thin_check before activation
thin pool and after deactivation as well.

Supporting thin_check_executable which may pass in extra options for
the tool.
2012-03-02 21:49:43 +00:00
Alasdair Kergon
a1991f101d pre-release 2012-01-26 14:02:42 +00:00
Zdenek Kabelac
10e80a212f Update verbose lvs to print metadata_percent info
Update lvs  -o fields in WHATS_NEW.
2012-01-25 11:32:41 +00:00
Zdenek Kabelac
c54998209d Update lvdisplay to show more info about thin LVs
Reformat name and path how the LV is represented with lvm1 compatible option,
to switch to the old way - which had number of  problem - i.e. many links
do not exist - since for private devices we are not creating them.
Add more info about thin pools and volumes.
2012-01-20 16:59:58 +00:00
Zdenek Kabelac
e58b5dd8e8 Thin add new display field for lvs
New field Data% is able to display info about
thin_pool, thin, snapshot and has generic meaning here.

Simple Time/Host field are here to display host and time creation.
2012-01-19 15:34:32 +00:00
Zdenek Kabelac
0e0f706f2e Thin automatic policy based extension 2011-12-21 13:10:52 +00:00
Zdenek Kabelac
2bc1d7598e Thin add dmeventd support
This is basic version with still few unresolved issue mainly in case,
when the pool resize is failing.
2011-12-21 13:08:11 +00:00
Jonathan Earl Brassow
d098140177 Add policy based automated repair of RAID logical volumes
The RAID plug-in for dmeventd now calls 'lvconvert --repair' to address failures
of devices in a RAID logical volume.  The action taken can be either to "warn"
or "allocate" a new device from any spares that may be available in the
volume group.  The action is designated by setting 'raid_fault_policy' in
lvm.conf - the default being "warn".
2011-12-06 19:30:15 +00:00
Alasdair Kergon
8dd6036da4 Add activation/use_linear_target enabled by default. (prajnoha)
LVM metadata knows only of striped segments - not linear ones.
The activation code detects segments with a single stripe and switches
them to use the linear target.

If the new lvm.conf setting is set to 0 (e.g. in a test script), this
'optimisation' is turned off.
2011-11-28 20:37:51 +00:00
Milan Broz
07113beea3 Do not scan device if it is part of active multipath.
Add filter which tries to check if scanned device is part
of active multipath.

Firstly, only SCSI major number devices are handled in filter.

Then it checks if device has exactly one holder (in sysfs) and
if it is device-mapper device and DM-UUID is prefixed by "MPATH-".

If so, this device is filtered out.

The whole filter can be switched off by setting
mpath_component_detection in lvm.conf.

https://bugzilla.redhat.com/show_bug.cgi?id=597010

Signed-off-by: Milan Broz <mbroz@redhat.com>
2011-11-11 15:11:08 +00:00
Zdenek Kabelac
bd15208cd7 Thin add thin_pool_metadata_require_separate_pvs
Allow to set different policy for pool from mirrors.
2011-11-04 22:44:21 +00:00
Zdenek Kabelac
b8cac455bd Thin supports poolmetadatasize setting
Add option to set pool metadatasize.
For passing size parameter reuse region_size.
2011-11-04 22:43:10 +00:00
Zdenek Kabelac
57f4dfc653 Reduce preallocated stack size
Go with just 64KiB for stack.

Closer inspection should be made, whether we actually need to play with
settings at all.

Since default stack size is 8MB and gets mapped via page locking thus,
it seems there is no big help with preallocation of stack to some value.
2011-10-11 09:13:39 +00:00
Peter Rajnoha
9fa1d30a1c Add activation/retry_deactivation to lvm.conf to retry deactivation of an LV. 2011-09-22 17:39:56 +00:00
Alasdair Kergon
dbb48de507 Add a new 'thin_pool' output field to 'lvs.
A gentle reminder that anyone relying on the output of reporting commands
like lvs in scripts must use -o to guarantee they get the fields they expect.

The default sequence of fields can change from release to release.
Equally, the 'attr' fields can have new values introduced and/or characters
appended to them.
2011-09-09 00:54:49 +00:00
Zdenek Kabelac
cf98c05082 Add detect_internal_vg_cache_corruption to lvm.conf
Add config option to enable crc checking of VG structures.
Currently it's disabled by default.

For the internal test-suite this check it is enabled.

Note: In the case the internal error is detected, debug build with
compile option DEBUG_ENFORCE_POOL_LOCKING helps to catch the source
of the problem.
2011-08-11 17:46:13 +00:00
Jonathan Earl Brassow
3041b72f06 Add dmeventd monitoring for RAID devices. 2011-08-11 05:00:20 +00:00
Jonathan Earl Brassow
cac52ca4ce Add basic RAID segment type(s) support.
Implementation described in doc/lvm2-raid.txt.

Basic support includes:
- ability to create RAID 1/4/5/6 arrays
- ability to delete RAID arrays
- ability to display RAID arrays
Notable missing features (not included in this patch):
- ability to clean-up/repair failures
- ability to convert RAID segment types
- ability to monitor RAID segment types
2011-08-02 22:07:20 +00:00
Peter Rajnoha
c212019f3a Change DEFAULT_UDEV_SYNC to 1 so udev_sync is used even without any config.
This should be set by default! Normally we have "activation/udev_sync = 1"
in lvm.conf (example.conf.in). But if we use lvm2 without any config file
(or without a definition within '--config' option) the DEFAULT_UDEV_SYNC
is used instead. Together with verify_udev_operations=0 (when we rely on
udev fully), this can cause races as the node could be missing when needed.

(See also https://bugzilla.redhat.com/show_bug.cgi?id=723144)
2011-08-02 10:49:57 +00:00
Alasdair Kergon
2243718fae Add framework for validation of ioctls. Doesn't do any checks yet.
dmsetup --checks
libdevmapper: dm_task_enable_checks()
lvm.conf: activation/checks=1
2011-07-01 14:09:19 +00:00
Alasdair Kergon
0437bccc3c Move udev_only logic inside stacked node op code.
(We still need to treat add+readhead+del as a no-op.)
Rename udev_fallback to verify_udev_operations.
Rename --udevfallback to --verifyudev
2011-06-27 21:43:58 +00:00
Peter Rajnoha
418663b61c Disable udev fallback by default and add activation/udev_fallback to lvm.conf.
We've used udev fallback code till now to check whether udev
created/removed the entries in /dev correctly and if not,
a repair was done (giving a warning messagea about that).

This patch adds a possibility to enable this additional check
and subsequent fallback only when required (debugging purposes
mostly) and trust udev completely.

So let's disable the fallback code by default and add a new
configuration option "activation/udev_fallback".

(The original code for creating the nodes will still be used
in case the device directory that is set in lvm.conf differs
from the one that udev uses and also when activation/udev_rules
is set to 0 - otherwise we would end up with no nodes/symlinks
at all)
2011-06-17 14:50:53 +00:00
Alasdair Kergon
2c56f60db4 Set pv_min_size to 2048KB to exclude floppy drives.
Previously was 512.
2011-04-28 17:33:34 +00:00
Peter Rajnoha
edcda01a1e Obtain device list from udev by default if LVM2 is compiled with udev support.
Also, add a new 'obtain_device_list_from_udev' setting to lvm.conf with which
we can turn this feature on or off if needed.

If set, the cache of block device nodes with all associated symlinks
will be constructed out of the existing udev database content.
This avoids using and opening any inapplicable non-block devices or
subdirectories found in the device directory. This setting is applied
to udev-managed device directory only, other directories will be scanned
fully. LVM2 needs to be compiled with udev support for this setting to
take effect. N.B. Any device node or symlink not managed by udev in
udev directory will be ignored with this setting on.
2011-04-22 12:05:32 +00:00
Mike Snitzer
fdc8670327 Add "devices/issue_discards" to lvm.conf.
Issue discards on lvremove if enabled and both storage and kernel have support.
2011-04-12 21:59:01 +00:00
Alasdair Kergon
92ffcda183 Various changes to the allocation algorithms: Expect some fallout.
There is a lot to test.

Two new config settings added that are intended to make the code behave
closely to the way it did before - worth a try if you find problems.
2011-02-27 00:38:31 +00:00
Alasdair Kergon
b83af51668 Add global/metadata_read_only to use unrepaired metadata in read-only cmds. 2010-10-25 11:20:54 +00:00
Petr Rockai
a341cab721 Implement automatic snapshot extension with dmeventd, and add two new options
to lvm.conf in the activation section: 'snapshot_autoextend_threshold' and
'snapshot_autoextend_percent', that define how to handle automatic snapshot
extension. The former defines when the snapshot should be extended: when its
space usage exceeds this many percent. The latter defines how much extra space
should be allocated for the snapshot, in percent of its current size.
2010-10-15 16:28:14 +00:00