1
0
mirror of git://sourceware.org/git/lvm2.git synced 2024-12-31 21:18:26 +03:00
Commit Graph

3223 Commits

Author SHA1 Message Date
Alasdair G Kergon
cbfb5a98b5 filters: power2 devs get precedence if PVIDs match
Give precedence to EMC "power2" devices with duplicate PVIDs like
we already do with "emcpower" devices.
2013-03-11 20:10:49 +00:00
Jonathan Brassow
31c24dd9f2 RAID: Code changes missing from previous commit (bbc6378)
Previous commit included changes to WHATSNEW, but the code changes
were missing.  Here is the description from the previous commit:
commit bbc6378b73
Author: Jonathan Brassow <jbrassow@redhat.com>
Date:   Thu Feb 21 11:31:36 2013 -0600

    RAID:  Make 'lvchange --refresh' restore transiently failed RAID PVs

    A new function (dm_tree_node_force_identical_table_reload) was added to
    avoid the suppression of identical table reloads.  This allows RAID LVs
    to reload the on-disk superblock information that contains which devices
    have failed and the bitmaps.  If the failed device has returned, this has
    the effect of restoring the device and initiating recovery.  Without this
    patch, the user had to completely deactivate their RAID LV and re-activate
    it in order to restore the failed device.  Now they simply need to
    suspend and resume (which is done by 'lvchange --refresh').

    The identical table suppression is only avoided if the LV is not PARTAIL
    (i.e. all of it's devices can be seen and read by LVM) and the kernel
    status of the array contains failed devices.  In other words, the function
    will only be called in the case where we may have success in restoring
    a failed device in the array.
2013-03-06 10:17:11 -06:00
Jonathan Brassow
ed6f3945fd clean-up: Typo 's/should had/should have/' 2013-03-06 08:42:03 -06:00
Peter Rajnoha
f88690221b config: make DEFAULT_MAX_HISTORY unconditional 2013-03-06 12:47:23 +01:00
Peter Rajnoha
7d6991e900 dumpconfig: add --ignoreadvanced and --ignoreunsupported switch
lvm dumpconfig [--ignoreadvanced] [--ignoreunsupported]

--ignoreadvanced causes the advanced configuration options to be left
out on dumpconfig output

--ignoreunsupported causes the options that are not officially supported
to be lef out on dumpconfig output
2013-03-06 10:46:36 +01:00
Peter Rajnoha
7fd04bd93a config: add comment note about advanced and unsupported config nodes
This shows up in the output as a short commentary:

  $ lvm dumpconfig --type default --withcomments metadata/disk_areas
  # Configuration option metadata/disk_areas.
  # This configuration option is advanced.
  # This configuration option is not officially supported.
  disk_areas=""
2013-03-06 10:46:36 +01:00
Peter Rajnoha
088d88cfe2 dumpconfig: add --withcomments and --withversions switch
lvm dumpconfig [--withcomments] [--withversions]

The --withcomments causes the comments to appear on output before each
config node (if they were defined in config_settings.h).

The --withversions causes a one line extra comment to appear on output
before each config node with the version information in which the
configuration setting first appeared.
2013-03-06 10:46:36 +01:00
Peter Rajnoha
e29cd366a2 config: add support for enhanced config node output
There's a possibility to interconnect the dm_config_node with an
ID, which in our case is used to reference the configuration
definition ID from config_settings.h. So simply interconnecting
struct dm_config_node with struct cfg_def_item.

This patch also adds support for enhanced config node output besides
existing "output line by line". This patch adds a possibility to
register a callback that gets called *before* the config node is
processed line by line (for example to include any headers on output)
and *after* the config node is processed line by line (to include any
footers on output). Also, it adds the config node reference itself
as the callback arg in addition to have a possibility to extract more
information from the config node itself if needed when processing the
output callback (e.g. the key name, the id, or whether this is a
section or a value etc...).

If the config node from lvm.conf/--config tree is recognized and valid,
it's always coupled with the config node definition ID from
config_settings.h:

 struct dm_config_node {
   int id;
   const char *key;
   struct dm_config_node *parent, *sib, *child;
   struct dm_config_value *v;
 }

For example if the dm_config_node *cn holds "devices/dev" configuration,
then the cn->id holds "devices_dev_CFG" ID from config_settings.h, -1 if
not found in config_settings.h and 0 if matching has not yet been done.

To support the enhanced config node output, a new structure has been
defined in libdevmapper to register it:

  struct dm_config_node_out_spec {
    dm_config_node_out_fn prefix_fn; /* called before processing config node lines */
    dm_config_node_out_fn line_fn; /* called for each config node line */
    dm_config_node_out_fn suffix_fn; /* called after processing config node lines */
  };

Where dm_config_node_out_fn is:

  typedef int (*dm_config_node_out_fn)(const struct dm_config_node *cn, const char *line, void *baton);

(so in comparison to existing callbacks for config node output, it has
an extra dm_config_node *cn arg in addition)

This patch also adds these functions to libdevmapper:
  - dm_config_write_node_out
  - dm_config_write_one_node_out

...which have exactly the same functionality as their counterparts
without the "out" suffix. The "*_out" functions adds the extra hooks
for enhanced config output (prefix_fn and suffix_fn mentioned above).

One can still use the old interface for config node output, this is
just an enhancement for those who'd like to modify the output more
extensively.
2013-03-06 10:46:36 +01:00
Peter Rajnoha
34350963d1 dumpconfig: add --type, --atversion and --validate arg
lvm dumpconfig [--type {current|default|missing|new}] [--atversion] [--validate]

This patch adds above-mentioned args to lvm dumpconfig and it maps them
to creation and writing out a configuration tree of a specific type
(see also previous commit):

  - current maps to CFG_TYPE_CURRENT
  - default maps to CFG_TYPE_DEFAULT
  - missing maps to CFG_TYPE_MISSING
  - new maps to CFG_TYPE_NEW

If --type is not defined, dumpconfig defaults to "--type current"
which is the original behaviour of dumpconfig before all these changes.

The --validate option just validates current configuration tree
(lvm.conf/--config) and it writes a simple status message:

  "LVM configuration valid" or "LVM configuration invalid"
2013-03-06 10:46:36 +01:00
Peter Rajnoha
245b85692e config: use config checks and add support for creating trees from config definition (config_def_create_tree fn)
Configuration checking is initiated during config load/processing
(_process_config fn) which is part of the command context
creation/refresh.

This patch also defines 5 types of trees that could be created from
the configuration definition (config_settings.h), the cfg_def_tree_t:

  - CFG_DEF_TREE_CURRENT that denotes a tree of all the configuration
    nodes that are explicitly defined in lvm.conf/--config

  - CFG_DEF_TREE_MISSING that denotes a tree of all missing
    configuration nodes for which default valus are used since they're
    not explicitly used in lvm.conf/--config

  - CFG_DEF_TREE_DEFAULT that denotes a tree of all possible
    configuration nodes with default values assigned, no matter what
    the actual lvm.conf/--config is

  - CFG_DEF_TREE_NEW that denotes a tree of all new configuration nodes
    that appeared in given version

  - CFG_DEF_TREE_COMPLETE that denotes a tree of the whole configuration
    tree that is used in LVM2 (a combination of CFG_DEF_TREE_CURRENT +
    CFG_DEF_TREE_MISSING). This is not implemented yet, it will be added
    later...

The function that creates the definition tree of given type:

  struct dm_config_tree *config_def_create_tree(struct config_def_tree_spec *spec);

Where the "spec" specifies the tree type to be created:

  struct config_def_tree_spec {
    cfg_def_tree_t type;	/* tree type */
    uint16_t version;		/* tree at this LVM2 version */
    int ignoreadvanced;		/* do not include advanced configs */
    int ignoreunsupported;	/* do not include unsupported configs */
  };

This tree can be passed to already existing functions that write
the tree on output (like we already do with cmd->cft).

There is a new lvm.conf section called "config" with two new options:

  - config/checks which enables/disables checking (enabled by default)

  - config/abort_on_errors which enables/disables aborts on any type of
    mismatch found in the config (disabled by default)
2013-03-06 10:46:35 +01:00
Peter Rajnoha
e38aaddb5e config: add support for configuration check (config_def_check fn)
Add support for configuration checking - type checking and recognition
of registered configuration settings that LVM2 understands and also
check the structure of the configuration. Log error on any mismatch
found.

A hash over all allowed configuration paths is created which helps
with matching the exact configuration (lvm.conf/--config tree) with
the configuration item definition from config_settings.h in an
efficient and one-step way.

Two more helper flags are introduced for each configuration definition
item:

  - CFG_USED which marks the item as being used (lvm.conf/--config)
    This helps with identifying missing configuration options
    (and for which defaults were used) when traversing the tree later.

  - CFG_VALID which denotes that the item has already been checked and
    it was found valid. This improves performance, so if the check
    is called once again on the same tree which was not reloaded, we
    can just return the state from previous check (with a possibility
    to force the check if needed).

The new function that config.h exports and which is going to be used
to perform the configuration checking is:

  int config_def_check(struct cmd_context *cmd, int force, int skip, int suppress_messages)

...which is exported internally via config.h.
2013-03-06 10:17:18 +01:00
Peter Rajnoha
386886f71c config: refer to config nodes using assigned IDs
For example, the old call and reference:

  find_config_tree_str(cmd, "devices/dir", DEFAULT_DEV_DIR)

...now becomes:

  find_config_tree_str(cmd, devices_dir_CFG)

So we're referring to the named configuration ID instead
of passing the configuration path and the default value
is taken from central config definition in config_settings.h
automatically.
2013-03-06 10:14:33 +01:00
Peter Rajnoha
a3d891a290 config: add structs to represent config definition and register config_settings.h content
This patch adds basic structures that encapsulate the config_settings.h
content - it takes each item and puts it in structures:

  - cfg_def_type_t to define config item type

  - cfg_def_value_t to define config item (default) value

  - flags used to define the nature and use of the config item:
      - CFG_NAME_VARIABLE for items with variable names (e.g. tags)
      - CFG_ALLOW_EMPTY for items where empty value is allowed
      - CFG_ADVANCED for items which are considered as "advanced settings"
      - CFG_UNSUPPORTED for items which are not officially supported
        (config options mostly for internal use and testing/debugging)

  - cfg_def_item_t to encapsulate the whole definition of the config
    definition itself

Each config item is referenced by named ID, e.g. "devices_dir_CFG"
instead of directly typing the path "devices/dir" as it was before.

This patch also adds cfg_def_get_path helper function to get the
config setting path up to the root for given config ID
(it returns the path in form of "abc/def/.../xyz" where the "abc"
is the topmost element).
2013-03-06 10:14:33 +01:00
Peter Rajnoha
e947c362dd config: add config_settings.h
This file centrally defines all recognized LVM2 configuration
sections and settings. Each item here has its parent, set of
allowed types, default value, brief comment, version the setting
first appeared in and flags that further define the nature of
the configuration setting and its use.
2013-03-06 10:14:32 +01:00
Peter Rajnoha
6ea68f233c config: add vsn macro
The 'vsn' macro encodes the LVM2 version major, minor
and patchlevel number in a packed form using 16 bits.
2013-03-06 08:52:55 +01:00
Peter Rajnoha
a9d0e25627 cleanup: remove struct pv_header_extension reference from struct pv_header
Just to prevent accidental and improper use when reading the layout
from disk because of the already existing disk_areas_xl[0] lists
that are variable in size. We can read pv_header_extension only
after we know exactly where the lists end...
2013-02-27 10:47:24 +01:00
Peter Rajnoha
9d5a3c16dd lvmetad: fix to properly process embedding area 2013-02-27 10:36:49 +01:00
Peter Rajnoha
ea69cda4b0 report: add reporting fields for Embedding Area start and size
There are new reporting fields for Embedding Area: ea_start and ea_size.

An example of 1m Embedding Area and relevant reporting fields:
raw/~ # pvs -o pv_name,pe_start,ea_start,ea_size
  PV         1st PE  EA start EA size
  /dev/sda     2.00m    1.00m   1.00m
2013-02-26 14:46:42 +01:00
Peter Rajnoha
b778653f03 pv_header_extension: add support for writing PV header extension (flags & Embedding Area)
The PV header extension information (PV header extension version, flags
and list of Embedding Area locations) is stored just beyond the PV header base.

When calculating the Embedding Area start value (ea_start), the same logic is
used as when calculating the pe_start value for Data Area - the value must
follow exactly the same alignment restrictions for its start value
(the alignment detected automatically or provided via command line using
the --dataalignment and --dataalignmentoffset arguments).

The Embedding Area is placed at the very start of the PV, starting at
ea_start. The Data Area starting at pe_start is placed next. The pe_start is
still properly aligned. Due to the pe_start alignment, it's possible that the
resulting Embedding Area size (ea_size) ends up bigger in size than requested
(but never less than requested).
2013-02-26 11:28:00 +01:00
Peter Rajnoha
9dbe25709e pv_header_extension: add support for reading PV header extension (flags & Embedding Area)
New tools with PV header extension support will read the extension
if it exists and it's not an error if it does not exist (so old PVs
will still work seamlessly with new tools).

Old tools without PV header extension support will just ignore any
extension.

As for the Embedding Area location information (its start and size),
there are actually two places where this is stored:
  - PV header extension
  - VG metadata

The VG metadata contains a copy of what's written in the PV header
extension about the Embedding Area location (NULL value is not copied):

    physical_volumes {
        pv0 {
          id = "AkSSRf-difg-fCCZ-NjAN-qP49-1zzg-S0Fd4T"
          device = "/dev/sda"     # Hint only

          status = ["ALLOCATABLE"]
          flags = []
          dev_size = 262144       # 128 Megabytes
          pe_start = 67584
          pe_count = 23   # 92 Megabytes
          ea_start = 2048
          ea_size = 65536 # 32 Megabytes
        }
    }

The new metadata fields are "ea_start" and "ea_size".
This is mostly useful when restoring the PV by using existing
metadata backups (e.g. pvcreate --restorefile ...).

New tools does not require these two fields to exist in VG metadata,
they're not compulsory. Therefore, reading old VG metadata which doesn't
contain any Embedding Area information will not end up with any kind
of error but only a debug message that the ea_start and ea_size values
were not found.

Old tools just ignore these extra fields in VG metadata.
2013-02-26 11:27:23 +01:00
Peter Rajnoha
60c5d4c42f pv_header_extension: add supporting infrastructure for PV header extension (flags & Embedding Area)
PV header extension comes just beyond the existing PV header base:

PV header base (existing):
 - uuid
 - device size
 - null-terminated list of Data Areas
 - null-terminater list of MetaData Areas

PV header extension:
 - extension version
 - flags
 - null-terminated list of Embedding Areas

This patch also adds "eas" (Embedding Areas) list to lvmcache (lvmcache_info)
and it also adds support for common operations on the list (just like for
already existing "das" - Data Areas list):
 - lvmcache_add_ea
 - lvmcache_update_eas
 - lvmcache_foreach_ea
 - lvmcache_del_eas

Also, add ea_start and ea_size to struct physical_volume for processing
PV Embedding Area location throughout the code (currently only one
Embedding Area is supported, though the definition on disk allows for
more if needed in the future...).

Also, define FMT_EAS format flag to mark that the format actually
supports Embedding Areas (currently format-text only).
2013-02-26 11:25:16 +01:00
Peter Rajnoha
6d8de3638c cleanup: use struct pvcreate_restorable_params throughout 2013-02-26 11:25:11 +01:00
Peter Rajnoha
6692b17777 cleanup: add struct pvcreate_restorable_params and move relevant items from pvcreate_params
Extract restorable PV creation parameters from struct pvcreate_params into
a separate struct pvcreate_restorable_params for clarity and also for better
maintainability when adding any new items later.
2013-02-26 11:24:38 +01:00
Zdenek Kabelac
71f4934500 activation: fix pvmove partial tree creation
Do not try to add LV again into the partial tree, if it's been
already added. Otherwise we may end in endless loop.
2013-02-23 12:09:12 +01:00
Zdenek Kabelac
b73de73151 thin: lvconvert support for external origin
Add basic support for converting LV into an external origin volume.

Syntax:

lvconvert --thinpool vg/pool  --originname renamed_origin -T origin

It will convert volume  'origin' into a thin volume, which will
use 'renamed_origin' as an external read-only origin.
All read/write into origin will go via 'pool'.

renamed_origin volume is read-only volume, that could be activated
only in read-only mode, and cannot be modified.
2013-02-23 10:38:20 +01:00
Zdenek Kabelac
2cba0ea9f9 thin: removal of external_origin 2013-02-23 10:37:01 +01:00
Zdenek Kabelac
30c13eff37 thin: report external origin
Use the field 'origin' for reporting external origin lv name.

For thin volumes with external origin, report the size of
external origin size via:

  lvs -o+origin_size
2013-02-23 10:37:01 +01:00
Zdenek Kabelac
87331dc419 thin: add support for external origin
Add internal support for thin volume's external origin.
2013-02-23 10:36:58 +01:00
Zdenek Kabelac
d023b2d12f lvremove: easier removal of dependent lvs
Add function to remove lvs which are depending on removed lv
prior the lv is removed.

User is asked for confirmation.
2013-02-23 10:31:05 +01:00
Zdenek Kabelac
3679bb1cd9 activation: simplify activation code
Reorder activation code to look similar for preload tree and
activation tree.

Its also give much better suppport for device stacking,
since now we also support activation of snapshot which might
be then used for other devices.
2013-02-23 10:30:03 +01:00
Zdenek Kabelac
0631d233d8 activation: add _add_layer_target_to_dtree
Add function for creation of simple linear mapping over layer device.
2013-02-23 10:29:08 +01:00
Zdenek Kabelac
520cc9a7f8 thin: replace _thin_layer with lv_layer()
Use consitently lv_layer function internally for thin pool layer name.
2013-02-23 10:28:04 +01:00
Zdenek Kabelac
78b23f3595 activation: extend _cached_info
Add layer string to support check of layered devices.
2013-02-23 10:28:01 +01:00
Peter Rajnoha
303e86adc8 pvcreate: fix alignment to incorporate alignment offset if PV has 0 MDAs
If zero metadata copies are used, there's no further recalculation of
PV alignment that happens when adding metadata areas to the PV and
which actually calculates the alignment correctly as a matter of fact.
So fix this for "PV without MDA" case as well.

Before this patch:
[1] raw/~ # pvcreate --dataalignment 8m --dataalignmentoffset 4m
--metadatacopies 1 /dev/sda
  Physical volume "/dev/sda" successfully created
[1] raw/~ # pvs -o pv_name,pe_start
  PV         1st PE
  /dev/sda    12.00m
[1] raw/~ # pvcreate --dataalignment 8m --dataalignmentoffset 4m
--metadatacopies 0 /dev/sda
  Physical volume "/dev/sda" successfully created
[1] raw/~ # pvs -o pv_name,pe_start
  PV         1st PE
  /dev/sda     8.00m

After this patch:
[1] raw/~ # pvcreate --dataalignment 8m --dataalignmentoffset 4m
--metadatacopies 1 /dev/sda
  Physical volume "/dev/sda" successfully created
[1] raw/~ # pvs -o pv_name,pe_start
  PV         1st PE
  /dev/sda    12.00m
[1] raw/~ # pvcreate --dataalignment 8m --dataalignmentoffset 4m
--metadatacopies 0 /dev/sda
  Physical volume "/dev/sda" successfully created
[1] raw/~ # pvs -o pv_name,pe_start
  PV         1st PE
  /dev/sda    12.00m

Also, remove a superfluous condition "pv->pe_start < pv->pe_align" in:
  if (pe_start == PV_PE_START_CALC && pv->pe_start < pv->pe_align)
    pv->pe_start = pv->pe_align ...
This part of the condition is not reachable as with the PV_PE_START_CALC,
we always have pv->pe_start set to 0 from the PV struct initialisation
(...the pv->pe_start value is just being calculated).
2013-02-21 14:51:19 +01:00
Jonathan Brassow
70f57996b3 RAID: Add new 'raid10_segtype_default' setting in lvm.conf
If '--mirrors/-m' and '--stripes/-i' are used together when creating
a logical volume, mirrors-over-stripes is currently chosen.  The user
can override this by using the '--type raid10' option on creation.
However, we want a place where we can set the default behavior to
'raid10' explicitly - similar to the "mirror" and "raid1" tunable,
mirror_segtype_default.

A follow-on patch should use this new setting to change the default
from "mirror" to "raid10", as this is the preferred segment type.
2013-02-20 15:10:04 -06:00
Jonathan Brassow
dc2ce71313 clean-up: Remove a FIXME question that has been settled
It is ok for us to use the shorthand 'lv_is_virtual' to detect error
targets in a RAID LV when searching for candidates for device replacement.
2013-02-20 15:03:58 -06:00
Jonathan Brassow
bd0ee420b5 RAID: Allow remove/replace of sub-LVs composed of error segments.
When a device fails, we may wish to replace those segments with an
error segment.  (Like when a 'vgreduce --removemissing' removes a
failed device that happens to be a RAID image/meta.)  We are then left
with images that we will eventually want to remove or replace.

This patch allows us to pull out these virtual "error" sub-LVs.  This
allows a user to 'lvconvert -m -1 vg/lv' to extract the bad sub-LVs.
Sub-LVs with error segments are considered for extraction before other
possible devices so that good devices are not accidentally removed.

This patch also adds the ability to replace RAID images that contain error
segments.  The user will still be unable to run 'lvconvert --replace'
because there is no way to address the 'error' segment (i.e. no PV
that it is associated with).  However, 'lvconvert --repair' can be
used to replace the image's error segment with a new PV.  This is also
the most appropriate way to do it, since the LV will continue to be
reported as 'partial'.
2013-02-20 14:58:56 -06:00
Jonathan Brassow
845852d6b4 RAID: Make 'vgreduce --removemissing' work with RAID LVs
Currently it is impossible to remove a failed PV which has a RAID LV
on it.  This patch fixes the issue by replacing the failed PV with an
'error' segment within the affected sub-LVs.  Once there is no longer
a RAID LV using the PV, it can be removed.

Most often, it is better to replace a failed RAID device with a spare.
(You can use 'lvconvert --repair <vg>/<LV>' to accomplish that.)
However, if there are no spares in the volume group and none will be
added, it is useful to be able to removed the failed device.

Following patches address the ability to perform 'lvconvert' operations
on RAID LVs that contain sub-LVs composed of 'error' segments.
2013-02-20 14:52:46 -06:00
Jonathan Brassow
0e4ffd9d3b clean-up: Rename lvm.conf setting 'mirror_region_size' to 'raid_region_size'
We have been using 'mirror_region_size' in lvm.conf as the default region
size for RAID logical volumes as well as mirror logical volumes.  Since,
"raid" is more inclusive and representative than "mirror", I have changed
the name of this setting.  We must still check for the old setting and warn
the user if we are overriding it with the new setting if both happen to be
present.
2013-02-20 14:40:17 -06:00
Peter Rajnoha
a7d6a612b8 fix: 'Couldn't read extent size' --> '... extent start' 2013-02-21 13:33:27 +01:00
Peter Rajnoha
722ca363f0 report: fix pvs -o pv_free reporting for PVs with 0 PEs
[0] raw/~ # lsblk -o NAME,SIZE /dev/sda
NAME  SIZE
sda   128M

[0] raw/~ # pvcreate --dataalignment 128m /dev/sda
  Physical volume "/dev/sda" successfully created

[0] raw/~ # vgcreate vg /dev/sda
  Volume group "vg" successfully created

[0] raw/~ # lvcreate -l1 vg
  Volume group "vg" has insufficient free space (0 extents): 1 required.

Before this patch:
[0] raw/~ # pvs -o pv_name,pv_free
  PV         PFree
  /dev/sda   128.00m

After this patch:
[0] raw/~ # pvs -o pv_name,pv_free
  PV         PFree
  /dev/sda      0
2013-02-21 13:28:07 +01:00
Zdenek Kabelac
e566faaae6 cleanup: old style gcc 2013-02-05 16:54:12 +01:00
Zdenek Kabelac
d97605beaf cleanup: preserve signesss and type size on return values 2013-02-05 16:54:11 +01:00
Zdenek Kabelac
7910b6c0ba thin: update pool_is_active
Change it to take LV and move it to exported header - seems
to be a better fit for usability from tools/ directory.
2013-02-05 16:54:11 +01:00
Zdenek Kabelac
c984d8fbab thin: properly unmark volume after detach
When the volume is detached form thin pool,
unmask THIN_VOLUME flag and reset related pointers.
2013-02-05 14:40:37 +01:00
Zdenek Kabelac
11eaf1c98c thin: add function pool_is_active
This internal function check for active pool device.
For cluster it checks every thin volume,
On the non-clustered VG we need to check just
for presence of -tpool device.
2013-02-05 14:35:44 +01:00
Zdenek Kabelac
9d445f371c report: leave empty report field for 0
Since we do not support LVs with 0 size, use this value
as 'error' value for devices without origin, and leave this
field blank as in other cases.
2013-02-05 14:32:37 +01:00
Zdenek Kabelac
ddeb37f282 cleanup: add internal error check
Check if 'is_removable' is defined and report internal error,
if it's missing.
2013-02-05 14:27:24 +01:00
Jonathan Brassow
f5cd9c3563 clean-up: Another functiont that can use 'lv_layer'
lib/activate/dev_manager.c:dev_manager_raid_status() can also use
the new 'lv_layer' function.
2013-02-04 17:10:16 -06:00
Zdenek Kabelac
a4870c79ca thin: use noflush for obtaining transaction_id
Do not flush thin pool data, when reading transation_id status.
2013-02-04 19:05:56 +01:00
Zdenek Kabelac
153ce89af3 cleanup: comment update
Just update code comment and use single line if().
2013-02-04 19:05:43 +01:00
Zdenek Kabelac
b37a0a39e3 cleanup: indent line 2013-02-04 19:01:11 +01:00
Zdenek Kabelac
8ed0b6f312 thin: replace is_active with send_messages
Since is_active is only used for thinp
replace struct member with more meaningful
send_messages flag
2013-02-04 19:01:10 +01:00
Zdenek Kabelac
4af4241ba4 use lv_layer 2013-02-04 19:01:10 +01:00
Zdenek Kabelac
ca7abbce8a activate: add lv_layer function
Add function to return layer name for LV.
2013-02-04 19:01:10 +01:00
Zdenek Kabelac
9f433e6ee3 cleanup: postpone lv_is_thin_volume check
Code move to make it easier to follow and
call _add_dev_to_dtree() in the separate if() branch
for thin volumes.
2013-02-04 19:00:19 +01:00
Jonathan Brassow
801d4f96a8 RAID: Improve 'lvs' attribute reporting of RAID LVs and sub-LVs
There are currently a few issues with the reporting done on RAID LVs and
sub-LVs.  The most concerning is that 'lvs' does not always report the
correct failure status of individual RAID sub-LVs (devices).  This can
occur when a device fails and is restored after the failure has been
detected by the kernel.  In this case, 'lvs' would report all devices are
fine because it can read the labels on each device just fine.
Example:
[root@bp-01 lvm2]# lvs -a -o name,vg_name,attr,copy_percent,devices vg
  LV            VG   Attr      Cpy%Sync Devices
  lv            vg   rwi-a-r--   100.00 lv_rimage_0(0),lv_rimage_1(0)
  [lv_rimage_0] vg   iwi-aor--          /dev/sda1(1)
  [lv_rimage_1] vg   iwi-aor--          /dev/sdb1(1)
  [lv_rmeta_0]  vg   ewi-aor--          /dev/sda1(0)
  [lv_rmeta_1]  vg   ewi-aor--          /dev/sdb1(0)

However, 'dmsetup status' on the device tells us a different story:
  [root@bp-01 lvm2]# dmsetup status vg-lv
  0 1024000 raid raid1 2 DA 1024000/1024000

In this case, we must also be sure to check the RAID LVs kernel status
in order to get the proper information.  Here is an example of the correct
output that is displayed after this patch is applied:
[root@bp-01 lvm2]# lvs -a -o name,vg_name,attr,copy_percent,devices vg
  LV            VG   Attr      Cpy%Sync Devices
  lv            vg   rwi-a-r-p   100.00 lv_rimage_0(0),lv_rimage_1(0)
  [lv_rimage_0] vg   iwi-aor-p          /dev/sda1(1)
  [lv_rimage_1] vg   iwi-aor--          /dev/sdb1(1)
  [lv_rmeta_0]  vg   ewi-aor-p          /dev/sda1(0)
  [lv_rmeta_1]  vg   ewi-aor--          /dev/sdb1(0)

The other case where 'lvs' gives incomplete or improper output is when a
device is replaced or added to a RAID LV.  It should display that the RAID
LV is in the process of sync'ing and that the new device is the only one
that is not-in-sync - as indicated by a leading 'I' in the Attr column.
(Remember that 'i' indicates an (i)mage that is in-sync and 'I' indicates
an (I)mage that is not in sync.)  Here's an example of the old incorrect
behaviour:
[root@bp-01 lvm2]# lvs -a -o name,vg_name,attr,copy_percent,devices vg
  LV            VG   Attr      Cpy%Sync Devices
  lv            vg   rwi-a-r--   100.00 lv_rimage_0(0),lv_rimage_1(0)
  [lv_rimage_0] vg   iwi-aor--          /dev/sda1(1)
  [lv_rimage_1] vg   iwi-aor--          /dev/sdb1(1)
  [lv_rmeta_0]  vg   ewi-aor--          /dev/sda1(0)
  [lv_rmeta_1]  vg   ewi-aor--          /dev/sdb1(0)
[root@bp-01 lvm2]# lvconvert -m +1 vg/lv; lvs -a -o name,vg_name,attr,copy_percent,devices vg
  LV            VG   Attr      Cpy%Sync Devices
  lv            vg   rwi-a-r--     0.00 lv_rimage_0(0),lv_rimage_1(0),lv_rimage_2(0)
  [lv_rimage_0] vg   Iwi-aor--          /dev/sda1(1)
  [lv_rimage_1] vg   Iwi-aor--          /dev/sdb1(1)
  [lv_rimage_2] vg   Iwi-aor--          /dev/sdc1(1)
  [lv_rmeta_0]  vg   ewi-aor--          /dev/sda1(0)
  [lv_rmeta_1]  vg   ewi-aor--          /dev/sdb1(0)
  [lv_rmeta_2]  vg   ewi-aor--          /dev/sdc1(0)                            ** Note that all the images currently are marked as 'I' even though it is
   only the last device that has been added that should be marked.

Here is an example of the correct output after this patch is applied:
[root@bp-01 lvm2]# lvs -a -o name,vg_name,attr,copy_percent,devices vg
  LV            VG   Attr      Cpy%Sync Devices
  lv            vg   rwi-a-r--   100.00 lv_rimage_0(0),lv_rimage_1(0)
  [lv_rimage_0] vg   iwi-aor--          /dev/sda1(1)
  [lv_rimage_1] vg   iwi-aor--          /dev/sdb1(1)
  [lv_rmeta_0]  vg   ewi-aor--          /dev/sda1(0)
  [lv_rmeta_1]  vg   ewi-aor--          /dev/sdb1(0)
[root@bp-01 lvm2]# lvconvert -m +1 vg/lv; lvs -a -o name,vg_name,attr,copy_percent,devices vg
  LV            VG   Attr      Cpy%Sync Devices
  lv            vg   rwi-a-r--     0.00 lv_rimage_0(0),lv_rimage_1(0),lv_rimage_2(0)
  [lv_rimage_0] vg   iwi-aor--          /dev/sda1(1)
  [lv_rimage_1] vg   iwi-aor--          /dev/sdb1(1)
  [lv_rimage_2] vg   Iwi-aor--          /dev/sdc1(1)
  [lv_rmeta_0]  vg   ewi-aor--          /dev/sda1(0)
  [lv_rmeta_1]  vg   ewi-aor--          /dev/sdb1(0)
  [lv_rmeta_2]  vg   ewi-aor--          /dev/sdc1(0)
** Note only the last image is marked with an 'I'.  This is correct and we can
   tell that it isn't the whole array that is sync'ing, but just the new
   device.

It also works under snapshots...
[root@bp-01 lvm2]# lvs -a -o name,vg_name,attr,copy_percent,devices vg
  LV            VG   Attr      Cpy%Sync Devices
  lv            vg   owi-a-r-p    33.47 lv_rimage_0(0),lv_rimage_1(0),lv_rimage_2(0)
  [lv_rimage_0] vg   iwi-aor--          /dev/sda1(1)
  [lv_rimage_1] vg   Iwi-aor-p          /dev/sdb1(1)
  [lv_rimage_2] vg   Iwi-aor--          /dev/sdc1(1)
  [lv_rmeta_0]  vg   ewi-aor--          /dev/sda1(0)
  [lv_rmeta_1]  vg   ewi-aor-p          /dev/sdb1(0)
  [lv_rmeta_2]  vg   ewi-aor--          /dev/sdc1(0)
  snap          vg   swi-a-s--          /dev/sda1(51201)
2013-02-01 11:33:54 -06:00
Jonathan Brassow
37ffe6a13a RAID: Cache previous results of lv_raid_dev_health for future use
We can avoid many dev_manager (ioctl) calls by caching the results of
previous calls to lv_raid_dev_health.  Just considering the case where
'lvs -a' is called to get the attributes of a RAID LV and its sub-lvs,
this function would be called many times.  (It would be called at least
7 times for a 3-way RAID1 - once for the health of each sub-LV and once
for the health of the top-level LV.)  This is a good idea because the
sub-LVs are processed in groups along with their parent RAID LV and in
each case, it is the parent LV whose status will be queried.  Therefore,
there only needs to be one trip through dev_manager for each time the
group is processed.
2013-02-01 11:32:18 -06:00
Jonathan Brassow
c8242e5cf4 RAID: Add RAID status accessibility functions
Similar to the way thin* accesses its kernel status, we add a method
for RAID to grab the various values in its status output without the
higher levels (LVM) having to understand how to parse the output.
Added functions include:
        - lib/activate/dev_manager.c:dev_manager_raid_status()
          Pulls the status line from the kernel

        - libdm/libdm-deptree.c:dm_get_status_raid()
          Parses status line and puts components into dm_status_raid struct

        - lib/activate/activate.c:lv_raid_dev_health()
          Accesses dm_status_raid to deliver raid dev_health string

The new structure and functions can provide a more unified way to access
status information.  ('lv_raid_percent' could switch to using these
functions, for example.)
2013-02-01 11:31:47 -06:00
Petr Rockai
1e4a9534f4 lvmetad: Call _lvmetad_handle_reply in lvmetad_vg_lookup. 2013-01-16 11:19:33 +01:00
Sebastian Ott
9602e68577 filters: add scm devices
Fix this:
pvcreate /dev/scma
  Device /dev/scma not found (or ignored by filtering).

Reported-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
2013-01-11 09:24:07 +01:00
Alasdair G Kergon
06abb2dd4c logging: classify log_debug messages
Place most log_debug() messages into a class.
2013-01-07 22:30:29 +00:00
Alasdair G Kergon
7f747a0d73 logging: add debug classes
Add log/debug_classes to lvm.conf to allow debug messages to be
classified and filtered at runtime.

The dm_errno field is only used by log_error(), so I've redefined it
for log_debug() messages to hold the message class.

By default, all existing messages appear, but we can add categories that
generate high volumes of data, such as logging all traffic to/from
lvmetad.
2013-01-07 22:25:19 +00:00
Alasdair G Kergon
b617109fff lvmetad: fix format1 updates
fmt1 doesn't have a separate commit function: updates take effect
immediately vg_write is called, so we must update lvmetad at this
point if we're going to go on and ask lvmetad for the VG metadata
again before calling the commit function (though that's probably an
unsupported and pointless thing to do anyway as the client must
already have that data and it cannot have changed because it's locked
and with devs suspended we shouldn't be communicating with lvmetad;
so when that's fixed properly, this fix here can be reverted).

This problem showed up as an internal error when lvremoving an LVM1
snapshot.

> Internal error: LV snap1 (00000000000000000000000000000001) missing from preload metadata

https://bugzilla.redhat.com/891855
2013-01-05 03:17:35 +00:00
Alasdair G Kergon
48e1ae7f6a lvmetad: add basic client-side debug logging
First attempt at showing precisely what use any command is making of
lvmetad in the -vvvv trace information.
2013-01-05 00:35:50 +00:00
Alasdair G Kergon
41e7f45258 lvmetad: rename device vars and move _token_update
Move _token_update() to avoid the need for _lvmetad_send prototype.

Use 'dev' consistently for a struct device * variable.
Use 'devno' for a dev_t.
2013-01-04 23:45:22 +00:00
Alasdair G Kergon
6d760b2c63 lvmetad: improve client logging when connecting
Rename lvmetad_warning() to lvmetad_connect_or_warn().

Log all connection attempts on the client side, whether successful or not.

Reduce some nesting and remove a redundant assertion.
2013-01-04 23:22:30 +00:00
Jonathan Brassow
970dfbcd69 RAID: Limit replacement of devices when array is not in-sync.
If a RAID array is not in-sync, replacing devices should not be allowed
as a general rule.  This is because the contents used to populate the
incoming device may be undefined because the devices being read where
not in-sync.  The kernel enforces this rule unless overridden by not
allowing the creation of an array that is not in-sync and includes a
devices that needs to be rebuilt.

Since we cannot know the sync state of an LV if it is inactive, we must
also enforce the rule that an array must be active to replace devices.

That leaves us with the following conditions:
1) never allow replacement or repair of devices if the LV is in-active
2) never allow replacement if the LV is not in-sync
3) allow repair if the LV is not in-sync, but warn that contents may
   not be recoverable.

In the case where a user is performing the repair on the command line via
'lvconvert --repair', the warning is printed before the user is prompted
if they would like to replace the device(s).  If the repair is automated
(i.e. via dmeventd and policy is "allocate"), then the device is replaced
if possible and the warning is printed.
2012-12-18 14:40:42 -06:00
Zdenek Kabelac
401c9aba4a pv_read: add missing check for valid info
If the lvmcache_info_from_pvid() fails to find valid
info, invoke the lookup by dev, and only in this case
call lvmcache_info_from_pvid() again.

Also check for the result of info and return
error directly, so the NULL is not passed
to lvmcache_get_label().
2012-12-15 17:23:27 +01:00
Zdenek Kabelac
e012d0635d lvmetad: check id_read_format error status
Detect error from id_read_format() function.
2012-12-15 17:23:27 +01:00
Zdenek Kabelac
ff5612c0c3 format-text: check for _text_create_text_instance
Test if 'fid' creation failed and report stack trace,
break the loop and do not pass NULL fid further.
2012-12-15 17:23:23 +01:00
Zdenek Kabelac
740ab81d03 log: move abort past syslog
When the abort_on_internal_errors is enabled, we aborted prior
the syslog logging output.

Since such fatal error gets level _LOG_FATAL it should
not be blocked by debug_level() check so lets move it further,
to get abort error logged also via syslog.
2012-12-15 17:22:48 +01:00
Zdenek Kabelac
575c4ed964 cleanup: use proper const in apply_lvname_restrictions
Better constness used for reserved prefixes and strings.
Also simplify a bit validate_name and use direct char
checks isntead of 2 strcmp() calls.
2012-12-15 14:57:40 +01:00
Zdenek Kabelac
21f6511bc2 cleanup: reorder code
Swap if() test condition and check for failure
and use traditional 'stack' trace.
2012-12-15 14:57:40 +01:00
Zdenek Kabelac
8ab4334505 cleanup: ignore return values
These dm_snprintfs should not fail, since enough space is reserved.
So return intentionaly ignored.
2012-12-15 14:57:40 +01:00
Petr Rockai
f14f2d4378 lvmetad: Fix autoactivation for MDA-less PVs.
Calling pvscan --cache with -aay on a PV without an MDA would spuriously fail
with an internal error, because of an incorrect assumption that a parsed VG
structure was always available. This is not true and the autoactivation handler
needs to call vg_read to obtain metadata in cases where the PV had no MDAs to
parse. Therefore, we pass vgid into the handler instead of the (possibly NULL)
VG coming from the PV's MDA.
2012-12-12 13:19:04 +01:00
Marian Csontos
ff5c1c576c lvmetad: use dm_config_destroy to free pvmeta
Release pvmeta handler with proper dm_config_destroy() function.
TODO: Fix primary fault for this internal error.

Signed-off-by: mcsontos@redhat.com
2012-12-11 11:55:12 +01:00
Zdenek Kabelac
17be6d5210 thin: fix test for dicards ignore settings
Arghh, this was bad last-minute shortening of if() expression
in the commit 1ef9831018.

dm_tree_node_set_thin_pool_discard() must not run in the same
expression as check for non-power-2 discard, otherwise
there are 2 calls for dm_tree_node_set_thin_pool_discard
and whole setting of discards is missinterpretted.

In-relase fix it by using proper parentheses {}.
2012-12-11 11:26:19 +01:00
Zdenek Kabelac
ec49f07b0d mirrors: fix leak in device_is_usable mirror check
Function _ignore_blocked_mirror_devices was not release
allocated strings images_health and log_health.

In error paths it was also not releasing dm_task structure.

Swaped return code of _ignore_blocked_mirror_devices and
use 1 as success.

In _parse_mirror_status use log_error if memory allocation
fails and few more errors so they are no going unnoticed
as debug messages.

On error path always clear return values and free strings.

For dev_create_file  use cache mem pool to avoid memleak.
2012-12-11 11:15:22 +01:00
Peter Rajnoha
35a4d70aad activation: don't miss the log on empty {auto_activation|read_only|}_volume_list
Addendum to previous commit...
2012-12-04 14:12:36 +01:00
Peter Rajnoha
e2be2652ad Allow empty activation/{auto_activation|read_only|}_volume_list config option.
In case we don't want to activate, autoactivate or have the
VG/LV read-only. Primarily targeted for the auto_activation_volume_list,
but it makes no harm for other settings (the part of the code
that reads these three settings is shared, but there's no
reason to separate it only for this change).
2012-12-04 10:33:54 +01:00
Zdenek Kabelac
a530c70b21 thin: update thin feature detection
Safe 1 static var and keep whole detection within one function.
2012-12-03 13:03:41 +01:00
Zdenek Kabelac
5ec20e267f thin: reworked thin feature detection
Rework thin feature detection to support runtime
section to allow to disable them selectively.

New lvm.conf option is born: global/thin_disabled_features
2012-12-03 11:57:40 +01:00
Zdenek Kabelac
6987a353de thin: add detach_pool_metadata_lv
Add internal function detach_pool_metadata_lv().
2012-12-02 17:56:29 +01:00
Zdenek Kabelac
9ec474f38a lvm2api: fix size reporting
API is reporting all sizes as 64bit integers in bytes.
Fix at those places, where sectors were returned
to remain consistent.
2012-12-02 17:55:08 +01:00
Peter Rajnoha
ed9751d9fa udev: add a warning message if DM_DISABLE_UDEV set and udev running
$ export DM_DISABLE_UDEV=1

$ dmsetup create test --table "0 1 zero"
Udev is running and DM_DISABLE_UDEV environment variable is set. Bypassing udev, device-mapper library will manage device nodes in device directory.

$ lvchange -ay vg/lvol0
  Udev is running and DM_DISABLE_UDEV environment variable is set. Bypassing udev, LVM will manage logical volume symlinks in device directory.
  Udev is running and DM_DISABLE_UDEV environment variable is set. Bypassing udev, LVM will obtain device list by scanning device directory.
  Udev is running and DM_DISABLE_UDEV environment variable is set. Bypassing udev, device-mapper library will manage device nodes in device directory.
2012-11-29 15:57:43 +01:00
Peter Rajnoha
4891a735d3 udev: recognize DM_DISABLE_UDEV environment variable
Setting this environment variable will cause a full fallback
to old direct node and symlink management in libdevmapper and lvm2.

It means:

 - disabling udev synchronization
   (--noudevsync in dmsetup and --noudevsync + activation/udev_sync=0
    lvm2 config)
 - disabling dm and any subsystem related udev rules
   (--noudevrules in dmsetup and activation/udev_rules=0 lvm2 config)
 - management of nodes/symlinks under /dev directly by libdevmapper/lvm2
   (--verifyudev in dmsetup and activation/verify_udev_operations=1
    lvm2 config)
 - not obtaining any device list from udev database
   (devices/obtain_device_list_from_udev=0 lvm2 config)

Note: we could set all of these before - there's no functional change!
However the DM_DISABLE_UDEV environment variable is a nice shortcut
to make it easier for libdevmapper users so that one can switch off all
of the udev management off at one go directly on the command line,
without a need to modify any source or add any extra switches.
2012-11-29 14:03:48 +01:00
Zdenek Kabelac
0387e70d76 thin: fix property discard for lvm2api
Discards property is string and may have these values:
  ignore, nopassdown, passdown
2012-11-27 14:09:49 +01:00
Zdenek Kabelac
09b7ceea95 thin: allow restore with --force
Allow restoring metadata with thin pool volumes.
No validation is done for this case within vgcfgrestore tool -
thus incorrect metadata may lead to destruction of pool content.
2012-11-27 14:08:24 +01:00
Alasdair G Kergon
8c49aa79e7 filters: Add STEC skd and Violin vtms devices 2012-11-26 14:55:17 +00:00
Zdenek Kabelac
1ef9831018 thin: support configurable thin pool defaults
Configurable settings for thin pool create
if they are not specified on command line.

New supported lvm.conf options are:
  allocation/thin_pool_chunk_size
  allocation/thin_pool_discards
  allocation/thin_pool_zero
2012-11-26 12:16:47 +01:00
Zdenek Kabelac
683b1f0625 thin: detect discards for non-power-2
Check if target supports discards for chunk sizes,
that are not power of 2 (just multiple of 64K),
and enable it in case it's supported by thin kernel target.
2012-11-26 12:14:47 +01:00
Petr Rockai
60668f823e Automatically restore MISSING PVs with no MDAs. 2012-11-25 20:41:56 +01:00
Jonathan Brassow
fb0cee9a66 RAID: Do not allow --splitmirrors on RAID10 logical volumes.
RAID10 does not have the ability to split off images for independent
use.  So, 'lvconvert --splitmirrors' will not work and must be
disallowed.
2012-11-21 18:39:26 -06:00
Zdenek Kabelac
400f644286 lv_manip: fix regresion from bf2741376d
Commit bf2741376d started to use
lv_is_active() instead of call for lv_info & info.exists so
we cover also cluster activated devices.
For snapshost the conversion was not correct and introduced
regression by blocking creation of snapshot of inactive LV.

Fix it by assigning lv_is_active() directly.
Note: we still have minor issue to fix - to make
lv_is_???? function able to return error states since
lv_info() may fail.
2012-11-21 12:15:09 +01:00
Zdenek Kabelac
d5697b29ee mm: skip mlocking [vectors]
Somehow forgotten:
https://www.redhat.com/archives/linux-lvm/2012-June/msg00019.html
Need for arm architecture support.
2012-11-20 10:02:51 +01:00
Zdenek Kabelac
2e96ea4a89 liblvm: internal API change
Return LV/NULL instead of 1/0 which saves lookup for created LV.
2012-11-19 14:37:30 +01:00
Zdenek Kabelac
cf5242a670 lvconvert: store target attributes
Target tells us its version, and we may allow different set of options
to be supported with different version of driver.

Idea is to provide individual feature flags and later be
able to query for them.
2012-11-19 14:17:10 +01:00
Petr Rockai
983f0b46f2 lvmetad: Init lazily, to avoid socket access on config overrides. 2012-10-30 09:15:47 +01:00
Peter Rajnoha
7c59199d49 lvmetad: warn only if use_lvmetad=1 and locking_type=3 2012-10-29 16:20:35 +01:00
Peter Rajnoha
10492b238d lvmetad: whats_new + more explanation for previous commit 2012-10-25 14:47:45 +02:00
Petr Rockai
2fdd0840d5 lvmetad: Disable and warn when locking_type is 3. 2012-10-25 14:31:08 +02:00
Jonathan Brassow
b248ba0a39 mirror: Avoid reading mirrors with failed devices in mirrored log
Commit 9fd7ac7d03 did not handle mirrors
that contained mirrored logs.  This is because the status line of the
mirror does not give an indication of the health of the mirrored log,
as you can see here:
        [root@bp-01 lvm2]# dmsetup status vg-lv vg-lv_mlog
        vg-lv: 0 409600 mirror 2 253:6 253:7 400/400 1 AA 3 disk 253:5 A
        vg-lv_mlog: 0 8192 mirror 2 253:3 253:4 7/8 1 AD 1 core
Thus, the possibility for LVM commands to hang still persists when mirror
have mirrored logs.  I discovered this while performing some testing that
does polling with 'pvs' while doing I/O and killing devices.  The 'pvs'
managed to get between the mirrored log device failure and the attempt
by dmeventd to repair it.  The result was a very nasty block in LVM
commands that is very difficult to remove - even for someone who knows
what is going on.  Thus, it is absolutely essential that the log of a
mirror be recursively checked for mirror devices which may be failed
as well.

Despite what the code comment says in the aforementioned commit...
+ * _mirrored_transient_status().  FIXME: It is unable to handle mirrors
+ * with mirrored logs because it does not have a way to get the status of
+ * the mirror that forms the log, which could be blocked.
... it is possible to get the status of the log because the log device
major/minor is given to us by the status output of the top-level mirror.
We can use that to query the log device for any DM status and see if it
is a mirror that needs to be bypassed.  This patch does just that and is
now able to avoid reading from mirrors that have failed devices in a
mirrored log.
2012-10-25 00:42:45 -05:00
Jonathan Brassow
9fd7ac7d03 mirror: Avoid reading from mirrors that have failed devices
Addresses: rhbz855398 (Allow VGs to be built on cluster mirrors),
           and other issues.

The LVM code attempts to avoid reading labels from devices that are
suspended to try to avoid situations that may cause the commands to
block indefinitely.  When scanning devices, 'ignore_suspended_devices'
can be set so the code (lib/activate/dev_manager.c:device_is_usable())
checks any DM devices it finds and avoids them if they are suspended.

The mirror target has an additional mechanism that can cause I/O to
be blocked.  If a device in a mirror fails, all I/O will be blocked
by the kernel until a new table (a linear target or a mirror with
replacement devices) is loaded.  The mirror indicates that this condition
has happened by marking a 'D' for the faulty device in its status
output.  This condition must also be checked by 'device_is_usable()' to
avoid the possibility of blocking LVM commands indefinitely due to an
attempt to read the blocked mirror for labels.

Until now, mirrors were avoided if the 'ignore_suspended_devices'
condition was set.  This check seemed to suggest, "if we are concerned
about suspended devices, then let's ignore mirrors altogether just
in case".  This is insufficient and doesn't solve any problems.  All
devices that are suspended are already avoided if
'ignore_suspended_devices' is set; and if a mirror is blocking because
of an error condition, it will block the LVM command regardless of the
setting of that variable.

Rather than avoiding mirrors whenever 'ignore_suspended_devices' is
set, this patch causes mirrors to be avoided whenever they are blocking
due to an error.  (As mentioned above, the case where a DM device is
suspended is already covered.)  This solves a number of issues that weren't
handled before.  For example, pvcreate (or any command that does a
pv_read or vg_read, which eventually call device_is_usable()) will be
protected from blocked mirrors regardless of how
'ignore_suspended_devices' is set.  Additionally, a mirror that is
neither suspended nor blocking is /allowed/ to be read regardless
of how 'ignore_suspended_devices' is set.  (The latter point being the
source of the fix for rhbz855398.)
2012-10-23 23:10:33 -05:00
Jonathan Brassow
e191780947 RAID: Make RAID 4/5/6 display sync status under heading s/Copy%/Cpy%Sync
The heading 'Copy%' is specific to PVMOVE volumes, but can be generalized
to apply to LVM mirrors also.  It is a bit awkward to use 'Copy%' for
RAID 4/5/6, however - 'Sync%' would be more appropriate.  This is why
RAID 4/5/6 have not displayed their sync status by any means available to
'lvs' yet.

Example (old):
[root@hayes-02 lvm2]# lvs vg
  LV      VG   Attr      LSize  Pool Origin Data%  Move Log Cpy%Sy Convert
  lv    vg   -wi-a----  1.00g
  raid1 vg   rwi-a-r--  1.00g                             100.00
  raid4 vg   rwi-a-r--  1.01g
  raid5 vg   rwi-a-r--  1.01g
  raid6 vg   rwi-a-r--  1.01g

This patch changes the heading to 'Cpy%Sync' and allows RAID 4/5/6 to print
their sync percent in this field.

Example (new):
[root@hayes-02 lvm2]# lvs vg
  LV    VG   Attr      LSize Pool Origin Data%  Move Log Cpy%Sync Convert
  lv    vg   -wi-a---- 1.00g
  raid1 vg   rwi-a-r-- 1.00g                               100.00
  raid4 vg   rwi-a-r-- 1.01g                               100.00
  raid5 vg   rwi-a-r-- 1.01g                               100.00
  raid6 vg   rwi-a-r-- 1.01g                               100.00
2012-10-23 21:19:27 -05:00
Jonathan Brassow
6db461e3b0 mirror/raid: Move 'copy_percent' to common code (mirror.c -> lv_manip.c)
The 'copy_percent' function takes the 'extents_copied' field from each
segment in an LV to create the numerator for the ratio that is to
become the copy_percent.  (Otherwise known as the 'sync' percent for
non-pvmove uses, like mirror LVs and RAID LVs.)  This function safely
works on RAID - not just mirrors - so it is better to have it in
lv_manip.c rather than mirror.c.

There's a lot of different functions that do a lot of different things
in lv_manip.c, so I placed the function near a function in lv_manip.c
that it was close to in metadata-exported.h.  Different placement in the
file or a different name for the function may be useful.
2012-10-23 20:33:54 -05:00
Zdenek Kabelac
bf2741376d Use lv_is_active instead of lv_info()
Usage of lv_is_active makes it more obvious what is being checked.
2012-10-17 15:42:31 +02:00
Zdenek Kabelac
e431b19bac cleanup: move log_error upward in code stack
Report log_error earlier.
2012-10-17 15:41:44 +02:00
Zdenek Kabelac
f260f99d57 cleanup: switch log_error to log_warn
Use log_warn to print non-fatal warning messages.

Use of log_error would confuse checker for testing
whether proper error has been reported for some real error.
2012-10-17 15:41:35 +02:00
Zdenek Kabelac
b89963a7c3 cleanup: swap return values
Use lvm standard return code for success/fail  1/0.
2012-10-17 15:37:26 +02:00
Jonathan Brassow
7519d881ef Clean-up: Adjust message to be clearer on action taken and why
A message is printed when the region_size of a RAID LV is adjusted
to allow for large (> ~1TB) LVs.  The message wasn't very clear.
Hopefully, this is better.
2012-10-15 15:09:05 -05:00
Petr Rockai
08ba1b4472 lvmetad: Only print scanning messages when scanning 1 device. 2012-10-15 12:45:50 +02:00
Zdenek Kabelac
2393b468a4 lvmetad: fix previous commit
Ooops patch conversion for gcc cleanup missed this line.
2012-10-15 00:44:31 +02:00
Zdenek Kabelac
6595cae6e9 cleanup: resolve dereferencing type-punned pointer
fix gcc warning:
dereferencing type-punned pointer will break strict-aliasing rules
Replace call by value and pass just const pointer to pvid.
2012-10-14 23:14:00 +02:00
Zdenek Kabelac
4379365cae lvmetad: fix memory leaks in error paths
Destroy interator in error path.
Releasy any possible allocated buffer from buffer_append_f
and buffer_append_vf  in error path.
2012-10-13 19:19:50 +02:00
Zdenek Kabelac
5df6ec24bf cleanup: used old C standard 2012-10-13 19:18:33 +02:00
Zdenek Kabelac
feea5003cc cleanup: remove unneeded headers
Header do not provide any needed symbols.
2012-10-13 19:13:25 +02:00
Zdenek Kabelac
0f155e699c lvmetad: release_token without lvmetad
Missing wrapper when not building with lvmetad support
2012-10-12 17:16:28 +02:00
Zdenek Kabelac
31d8c3ee85 debug: do not play with fds with valgring
When valgrind usage is desired by user (--enable-valgrind-pool)
skip playing/closing/reopenning with descriptors - it makes
valgridng useless.

Make sleep delay for clvmd start longer.
2012-10-12 17:02:30 +02:00
Zdenek Kabelac
ee7143cd02 lvmetead: release token
Release allocated memory when destroing toolcontext
2012-10-12 17:01:22 +02:00
Zdenek Kabelac
be291e1064 thin: lvm2api return origin property for thin LV 2012-10-12 12:20:55 +02:00
Petr Rockai
d6d207006a lvmetad: Fix the fix for 813766 (lvmetad connection warning). 2012-10-12 11:22:47 +02:00
Petr Rockai
28776b9526 lvmetad: Make --sysinit suppress connection failure warnings. 2012-10-12 10:58:04 +02:00
Zdenek Kabelac
9ee071705b cleanup: fix compiler warnings
remove unused vars
move var declarations into the front of functions.
fix some sign warnings
2012-10-12 10:25:07 +02:00
Alasdair G Kergon
ee3cfa4184 python: Add bindings for liblvm2app.
Use configure --enable-python_bindings to generate them.

Note that the Makefiles do not yet control the owner or permissions of
the two new files on installation.
2012-10-12 02:08:47 +01:00
Zdenek Kabelac
316ce655a3 thin: raise required version to 1.4
Stay safe and require 1.4 (kernel 3.6) for non-power-of-2
support for thin pool chunk_size.
2012-10-11 14:09:07 +02:00
Petr Rockai
deea86c7f4 pvscan --cache: Also read metadata from LVM1 PVs (BZ 863401). 2012-10-10 21:55:24 +02:00
Zdenek Kabelac
b6512b10ae cleanup: fix typos 2012-10-10 21:22:11 +02:00
Zdenek Kabelac
ca09c9ab4c thin: support non power of 2 chunk size
Support thin chunk size with multiple of 64KiB if user has
thin-pool target version at least 1.2.
2012-10-10 21:21:00 +02:00
Petr Rockai
71d718a4a4 lvmetad: Warn if lvmetad is running but disabled. 2012-10-10 13:54:29 +02:00
Petr Rockai
ee4c75c8b7 dev-cache: Make dev_iter_create work with a NULL filter. 2012-10-08 16:18:10 +02:00
Petr Rockai
582a344cd6 lvmetad: In pvscan --cache, update the token directly. 2012-10-08 14:38:22 +02:00
Zdenek Kabelac
ff13206c7e report: call snapshot percent with cow only
Ensure lv_snapshot_percent is used only with snapshot LVs.
2012-10-08 12:16:53 +02:00
Zdenek Kabelac
1da6c1495a lvm2api: fix data percent reporting for thin, snap
Use same logic for lvm2api as we use lvs reporting.
data_percent is meant to be superset for snap_percent.
2012-10-05 10:37:09 +02:00
Jonathan Brassow
9efd3fb604 RAID: Do not allow RAID LVs in a cluster volume group.
It would be possible to activate a RAID LV exclusively in a cluster
volume group, but for now we do not allow RAID LVs to exist in a
clustered volume group at all.  This has two components:
	1) Do not allow RAID LVs to be created in a clustered VG
	2) Do not allow changing a VG from single-machine to clustered
	   if there are RAID LVs present.
2012-10-03 15:52:54 -05:00
Zdenek Kabelac
d442c3ef0c liblvm: insert layer with subvolume renames
Rename also subvolumes if we are inserting _tdata layer.
(Currently it breaks mirrors if it would be generic, needs fixing).
2012-10-03 15:13:32 +02:00
Zdenek Kabelac
cf8e1a0093 thin: origin only suspend
Skip tree creating when used with origin_only flag.
2012-10-03 15:05:55 +02:00
Zdenek Kabelac
21c401006c liblvm: add lv_rename_update
Support lv_rename without directly updating metatata.
It can save some metadata commits in some cases,
i.e. when LVs are offline.
2012-10-03 15:03:49 +02:00
Jonathan Brassow
886656e4ac RAID: Fix problems with creating, extending and converting large RAID LVs
MD's bitmaps can handle 2^21 regions at most.  The RAID code has always
used a region_size of 1024 sectors.  That means the size of a RAID LV was
limited to 1TiB.  (The user can adjust the region_size when creating a
RAID LV, which can affect the maximum size.)  Thus, creating, extending or
converting to a RAID LV greater than 1TiB would result in a failure to
load the new device-mapper table.

Again, the size of the RAID LV is not limited by how much space is allocated
for the metadata area, but by the limitations of the MD bitmap.  Therefore,
we must adjust the 'region_size' to ensure that the number of regions does
not exceed the limit.  I've added code to do this when extending a RAID LV
(which covers 'create' and 'extend' operations) and when up-converting -
specifically from linear to RAID1.
2012-09-27 16:51:22 -05:00
Petr Rockai
662a2122f6 libdaemon: Split daemon-shared.[hc] into daemon-io.[hc] and config-util.[hc]. 2012-09-26 17:26:23 +02:00
Petr Rockai
5f5832e318 lvremove: Ask before discarding data areas. 2012-09-26 17:26:23 +02:00
Petr Rockai
1ff2245c23 lvmetad: Give inconsistent metadata warnings in pvscan --cache. 2012-09-26 17:26:23 +02:00
Petr Rockai
c731bb1ee1 lvmetad: Fix #845269: SEGV on corrupt lvmetad response. 2012-09-26 17:26:23 +02:00
Petr Rockai
d2d6663428 lvmetad: Clear metadata/PV cache before a token-triggered rescan. 2012-09-26 17:26:23 +02:00
Petr Rockai
ca0c8673b2 lib/cache/lvmetad: s/pvscan_lvmetad/lvmetad_pvscan/ in the API 2012-09-26 17:26:23 +02:00
Petr Rockai
c9f56d639b lvmetad: Use "%" PRId64 in place of "%d" for extra clarity. 2012-09-26 17:26:16 +02:00
Petr Rockai
c7b17836ea Implement devices/global_filter.
The global filter is applied first, and is also applied in pvscan --cache (which
is called from udev rules to keep lvmetad updated). Cf. example.conf.
2012-09-26 14:49:15 +02:00
Petr Rockai
2276379a71 lib/cache/lvmetad: Refactor to use dm_config_tree in requests.
We were using daemon_send_simple until now, but it is no longer adequate, since
we need to manipulate requests in a generic way (adding a validity token to each
request), and the tree-based request interface is much more suitable for this.
2012-09-26 14:49:15 +02:00
Petr Rockai
ea14d5159c libdaemon: Extend and refactor APIs.
- move common dm_config_tree manipulation functions from lvmetad-core to
  daemon-shared
- add config-tree-based request manipulation APIs to daemon-client
- factor out _v (va_list) variants of most variadic functions in libdaemon
2012-09-26 14:49:09 +02:00
Petr Rockai
72d82e21d4 dev-cache: Make it possible to pass in a NULL filter. 2012-09-26 12:23:34 +02:00