mirror of git://sourceware.org/git/lvm2.git synced 2025-01-05 13:18:20 +03:00
Commit Graph

3181 Commits

Author SHA1 Message Date
Peter Rajnoha
ea36d0501e cleanup: remove unused 'pv_by_path' fn
The pv_by_path fn might also be dangerous to use, as it does not take
into account any metadata areas other than the ones found on the PV
itself. If metadata was not found on the PV referenced by the path,
it returned no PV though it might have been referenced by metadata
elsewhere (on other PVs...).
2013-03-19 14:57:36 +01:00
Peter Rajnoha
7e5e2dd4ee vgextend: do not allow PV with 0 MDAs to be added while already in a VG
If extending a VG and including a PV with 0 MDAs that was already
a part of a VG, the vgextend allowed that PV to be added and we
ended up *with one PV in two VGs*!

The vgextend code used the 'pv_by_path' fn that returned a PV for
a given path. However, when the PV did not have any metadata areas,
the fn just returned a PV without any reference to existing VG.
Consequently, any checks for the existing VG failed.

[0] raw/~ # pvcreate --metadatacopies 0 /dev/sda
  Physical volume "/dev/sda" successfully created

[0] raw/~ # pvcreate --metadatacopies 1 /dev/sdb
  Physical volume "/dev/sdb" successfully created

[0] raw/~ # vgcreate vg1 /dev/sda /dev/sdb
  Volume group "vg1" successfully created

[0] raw/~ # pvcreate --metadatacopies 1 /dev/sdc
  Physical volume "/dev/sdc" successfully created

[0] raw/~ # vgcreate vg2 /dev/sdc
  Volume group "vg2" successfully created

Before this patch (incorrect):
[0] raw/~ # vgextend vg2 /dev/sda
  Volume group "vg2" successfully extended

With this patch (correct):
[0] raw/~ # vgextend vg2 /dev/sda
  Physical volume '/dev/sda' is already in volume group 'vg1'
  Unable to add physical volume '/dev/sda' to volume group 'vg2'.
2013-03-19 14:57:36 +01:00
Peter Rajnoha
59878d0129 metadata: add 'allow_orphan' arg to find_pv_by_name fn
Before, the find_pv_by_name call always failed if the PV found was orphan.
However, we might use this function even for a PV that is not part of any VG.
This patch adds 'allow_orphan' arg to find_pv_by_name fn that allows that.
2013-03-19 14:57:31 +01:00
Peter Rajnoha
5b6bab2e30 cleanup: remove superfluous wrappers
_find_pv_by_name -> find_pv_by_name
_find_pv_in_vg -> find_pv_in_vg
_find_pv_in_vg_by_uuid -> find_pv_in_vg_by_uuid

The only callers of the underscored variants were their wrappers
without the underscore. No other part of the code referenced the
underscored variants.
2013-03-19 13:58:02 +01:00
Zdenek Kabelac
b36a776a7f thin: move update_pool_params
Now that we can recognize preset arguments, move
the code for updating thin pool related values
into /lib portion of the code.
2013-03-13 15:13:54 +01:00
Zdenek Kabelac
f06dd8725a thin: mark passed args
Keep a flag recording whether a given thin pool argument was passed on the
command line or was 'estimated'.

Call of update_pool_params() must not change cmdline given args and
needs to know this info.

Since there is a need to move this update function into /lib, we cannot
use arg_count().

FIXME: we need some generic mechanism here.
2013-03-13 15:13:54 +01:00
Zdenek Kabelac
b9fe52e811 cleanup: move comment 2013-03-13 15:13:50 +01:00
Zdenek Kabelac
293a06c39a cleanup: indent 2013-03-13 15:13:42 +01:00
Alasdair G Kergon
cbfb5a98b5 filters: power2 devs get precedence if PVIDs match
Give precedence to EMC "power2" devices with duplicate PVIDs like
we already do with "emcpower" devices.
2013-03-11 20:10:49 +00:00
Jonathan Brassow
31c24dd9f2 RAID: Code changes missing from previous commit (bbc6378)
Previous commit included changes to WHATSNEW, but the code changes
were missing.  Here is the description from the previous commit:
commit bbc6378b73
Author: Jonathan Brassow <jbrassow@redhat.com>
Date:   Thu Feb 21 11:31:36 2013 -0600

    RAID:  Make 'lvchange --refresh' restore transiently failed RAID PVs

    A new function (dm_tree_node_force_identical_table_reload) was added to
    avoid the suppression of identical table reloads.  This allows RAID LVs
    to reload the on-disk superblock information that contains which devices
    have failed and the bitmaps.  If the failed device has returned, this has
    the effect of restoring the device and initiating recovery.  Without this
    patch, the user had to completely deactivate their RAID LV and re-activate
    it in order to restore the failed device.  Now they simply need to
    suspend and resume (which is done by 'lvchange --refresh').

    The identical table suppression is only avoided if the LV is not PARTIAL
    (i.e. all of its devices can be seen and read by LVM) and the kernel
    status of the array contains failed devices.  In other words, the function
    will only be called in the case where we may have success in restoring
    a failed device in the array.
2013-03-06 10:17:11 -06:00
Jonathan Brassow
ed6f3945fd clean-up: Typo 's/should had/should have/' 2013-03-06 08:42:03 -06:00
Peter Rajnoha
f88690221b config: make DEFAULT_MAX_HISTORY unconditional 2013-03-06 12:47:23 +01:00
Peter Rajnoha
7d6991e900 dumpconfig: add --ignoreadvanced and --ignoreunsupported switch
lvm dumpconfig [--ignoreadvanced] [--ignoreunsupported]

--ignoreadvanced causes the advanced configuration options to be left
out of dumpconfig output

--ignoreunsupported causes the options that are not officially supported
to be left out of dumpconfig output
2013-03-06 10:46:36 +01:00
Peter Rajnoha
7fd04bd93a config: add comment note about advanced and unsupported config nodes
This shows up in the output as a short commentary:

  $ lvm dumpconfig --type default --withcomments metadata/disk_areas
  # Configuration option metadata/disk_areas.
  # This configuration option is advanced.
  # This configuration option is not officially supported.
  disk_areas=""
2013-03-06 10:46:36 +01:00
Peter Rajnoha
088d88cfe2 dumpconfig: add --withcomments and --withversions switch
lvm dumpconfig [--withcomments] [--withversions]

The --withcomments causes the comments to appear on output before each
config node (if they were defined in config_settings.h).

The --withversions causes a one line extra comment to appear on output
before each config node with the version information in which the
configuration setting first appeared.
2013-03-06 10:46:36 +01:00
Peter Rajnoha
e29cd366a2 config: add support for enhanced config node output
There's a possibility to interconnect the dm_config_node with an
ID, which in our case is used to reference the configuration
definition ID from config_settings.h. So simply interconnecting
struct dm_config_node with struct cfg_def_item.

This patch also adds support for enhanced config node output besides
existing "output line by line". This patch adds a possibility to
register a callback that gets called *before* the config node is
processed line by line (for example to include any headers on output)
and *after* the config node is processed line by line (to include any
footers on output). The config node reference itself is also passed
as a callback arg, which makes it possible to extract more information
from the config node if needed when processing the output callback
(e.g. the key name, the id, or whether this is a section or a value).

If the config node from lvm.conf/--config tree is recognized and valid,
it's always coupled with the config node definition ID from
config_settings.h:

 struct dm_config_node {
   int id;
   const char *key;
   struct dm_config_node *parent, *sib, *child;
   struct dm_config_value *v;
 }

For example if the dm_config_node *cn holds "devices/dev" configuration,
then the cn->id holds "devices_dev_CFG" ID from config_settings.h, -1 if
not found in config_settings.h and 0 if matching has not yet been done.

To support the enhanced config node output, a new structure has been
defined in libdevmapper to register it:

  struct dm_config_node_out_spec {
    dm_config_node_out_fn prefix_fn; /* called before processing config node lines */
    dm_config_node_out_fn line_fn; /* called for each config node line */
    dm_config_node_out_fn suffix_fn; /* called after processing config node lines */
  };

Where dm_config_node_out_fn is:

  typedef int (*dm_config_node_out_fn)(const struct dm_config_node *cn, const char *line, void *baton);

(so in comparison to existing callbacks for config node output, it has
an extra dm_config_node *cn arg in addition)

This patch also adds these functions to libdevmapper:
  - dm_config_write_node_out
  - dm_config_write_one_node_out

...which have exactly the same functionality as their counterparts
without the "out" suffix. The "*_out" functions add the extra hooks
for enhanced config output (prefix_fn and suffix_fn mentioned above).

One can still use the old interface for config node output, this is
just an enhancement for those who'd like to modify the output more
extensively.
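
The prefix/line/suffix hook order described above can be sketched as follows. This is a minimal illustration, not the libdevmapper implementation: struct node, struct sink, out_prefix, out_line and write_node_out are hypothetical stand-ins, and the "1 on success, 0 on failure" return convention is an assumption.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for struct dm_config_node; only the fields used here. */
struct node {
	int id;
	const char *key;
};

/* Same shape as the dm_config_node_out_fn typedef quoted above. */
typedef int (*node_out_fn)(const struct node *cn, const char *line, void *baton);

struct node_out_spec {
	node_out_fn prefix_fn;	/* called before the node's lines */
	node_out_fn line_fn;	/* called for each line */
	node_out_fn suffix_fn;	/* called after the node's lines */
};

/* The baton in this sketch: a small buffer that collects the output. */
struct sink {
	char text[256];
};

/* Append a + b + newline; return 1 on success, 0 if it does not fit. */
static int sink_add(struct sink *s, const char *a, const char *b)
{
	size_t used = strlen(s->text);
	int n = snprintf(s->text + used, sizeof(s->text) - used, "%s%s\n", a, b);

	return n >= 0 && (size_t) n < sizeof(s->text) - used;
}

/* Header hook: uses the node itself, which the old callbacks could not see. */
int out_prefix(const struct node *cn, const char *line, void *baton)
{
	(void) line;	/* no line is passed to the prefix hook in this sketch */
	return sink_add(baton, "# Configuration option ", cn->key);
}

/* Per-line hook: copies the line through unchanged. */
int out_line(const struct node *cn, const char *line, void *baton)
{
	(void) cn;
	return sink_add(baton, "", line);
}

/* Hypothetical driver: prefix once, each line in order, suffix once. */
int write_node_out(const struct node *cn, const char *const *lines, size_t count,
		   const struct node_out_spec *spec, void *baton)
{
	size_t i;

	if (spec->prefix_fn && !spec->prefix_fn(cn, NULL, baton))
		return 0;
	for (i = 0; i < count; i++)
		if (spec->line_fn && !spec->line_fn(cn, lines[i], baton))
			return 0;
	if (spec->suffix_fn && !spec->suffix_fn(cn, NULL, baton))
		return 0;
	return 1;
}
```

Leaving a hook NULL skips it, so a caller that sets only line_fn gets plain line-by-line output.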
2013-03-06 10:46:36 +01:00
Peter Rajnoha
34350963d1 dumpconfig: add --type, --atversion and --validate arg
lvm dumpconfig [--type {current|default|missing|new}] [--atversion] [--validate]

This patch adds above-mentioned args to lvm dumpconfig and it maps them
to creation and writing out a configuration tree of a specific type
(see also previous commit):

  - current maps to CFG_TYPE_CURRENT
  - default maps to CFG_TYPE_DEFAULT
  - missing maps to CFG_TYPE_MISSING
  - new maps to CFG_TYPE_NEW

If --type is not defined, dumpconfig defaults to "--type current"
which is the original behaviour of dumpconfig before all these changes.

The --validate option just validates current configuration tree
(lvm.conf/--config) and it writes a simple status message:

  "LVM configuration valid" or "LVM configuration invalid"
2013-03-06 10:46:36 +01:00
Peter Rajnoha
245b85692e config: use config checks and add support for creating trees from config definition (config_def_create_tree fn)
Configuration checking is initiated during config load/processing
(_process_config fn) which is part of the command context
creation/refresh.

This patch also defines 5 types of trees that could be created from
the configuration definition (config_settings.h), the cfg_def_tree_t:

  - CFG_DEF_TREE_CURRENT that denotes a tree of all the configuration
    nodes that are explicitly defined in lvm.conf/--config

  - CFG_DEF_TREE_MISSING that denotes a tree of all missing
    configuration nodes for which default values are used since they're
    not explicitly used in lvm.conf/--config

  - CFG_DEF_TREE_DEFAULT that denotes a tree of all possible
    configuration nodes with default values assigned, no matter what
    the actual lvm.conf/--config is

  - CFG_DEF_TREE_NEW that denotes a tree of all new configuration nodes
    that appeared in given version

  - CFG_DEF_TREE_COMPLETE that denotes a tree of the whole configuration
    tree that is used in LVM2 (a combination of CFG_DEF_TREE_CURRENT +
    CFG_DEF_TREE_MISSING). This is not implemented yet, it will be added
    later...

The function that creates the definition tree of given type:

  struct dm_config_tree *config_def_create_tree(struct config_def_tree_spec *spec);

Where the "spec" specifies the tree type to be created:

  struct config_def_tree_spec {
    cfg_def_tree_t type;	/* tree type */
    uint16_t version;		/* tree at this LVM2 version */
    int ignoreadvanced;		/* do not include advanced configs */
    int ignoreunsupported;	/* do not include unsupported configs */
  };

This tree can be passed to already existing functions that write
the tree on output (like we already do with cmd->cft).
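
How a spec might select definition items for a tree can be sketched as below. This is only an illustration under stated assumptions: the enum, the ITEM_* flag values, struct item and item_in_tree are hypothetical names, and treating "new" as "first appeared in exactly spec->version" is an assumption drawn from the description of CFG_DEF_TREE_NEW.

```c
#include <stdint.h>

/* Hypothetical mirror of the four implemented tree types. */
enum tree_type { TREE_CURRENT, TREE_MISSING, TREE_DEFAULT, TREE_NEW };

/* Hypothetical flag bits; the real values are not given in the text. */
#define ITEM_ADVANCED    0x1u
#define ITEM_UNSUPPORTED 0x2u
#define ITEM_USED        0x4u	/* set when found in lvm.conf/--config */

struct item {
	uint16_t since;		/* packed version the setting first appeared in */
	unsigned flags;
};

struct tree_spec {
	enum tree_type type;
	uint16_t version;
	int ignoreadvanced;
	int ignoreunsupported;
};

/* Return 1 if the item belongs in the tree described by spec, else 0. */
int item_in_tree(const struct item *it, const struct tree_spec *spec)
{
	if (spec->ignoreadvanced && (it->flags & ITEM_ADVANCED))
		return 0;
	if (spec->ignoreunsupported && (it->flags & ITEM_UNSUPPORTED))
		return 0;

	switch (spec->type) {
	case TREE_CURRENT:
		return (it->flags & ITEM_USED) != 0;
	case TREE_MISSING:
		return (it->flags & ITEM_USED) == 0;
	case TREE_DEFAULT:
		return 1;
	case TREE_NEW:
		return it->since == spec->version;
	}
	return 0;
}
```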

There is a new lvm.conf section called "config" with two new options:

  - config/checks which enables/disables checking (enabled by default)

  - config/abort_on_errors which enables/disables aborts on any type of
    mismatch found in the config (disabled by default)
2013-03-06 10:46:35 +01:00
Peter Rajnoha
e38aaddb5e config: add support for configuration check (config_def_check fn)
Add support for configuration checking - type checking and recognition
of registered configuration settings that LVM2 understands and also
check the structure of the configuration. Log error on any mismatch
found.

A hash over all allowed configuration paths is created which helps
with matching the exact configuration (lvm.conf/--config tree) with
the configuration item definition from config_settings.h in an
efficient and one-step way.

Two more helper flags are introduced for each configuration definition
item:

  - CFG_USED which marks the item as being used (lvm.conf/--config)
    This helps with identifying missing configuration options
    (and for which defaults were used) when traversing the tree later.

  - CFG_VALID which denotes that the item has already been checked and
    it was found valid. This improves performance, so if the check
    is called once again on the same tree which was not reloaded, we
    can just return the state from previous check (with a possibility
    to force the check if needed).

The new function that performs the configuration checking is:

  int config_def_check(struct cmd_context *cmd, int force, int skip, int suppress_messages)

...which is exported internally via config.h.
2013-03-06 10:17:18 +01:00
Peter Rajnoha
386886f71c config: refer to config nodes using assigned IDs
For example, the old call and reference:

  find_config_tree_str(cmd, "devices/dir", DEFAULT_DEV_DIR)

...now becomes:

  find_config_tree_str(cmd, devices_dir_CFG)

So we're referring to the named configuration ID instead
of passing the configuration path and the default value
is taken from central config definition in config_settings.h
automatically.
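
The idea of looking a setting up by ID, with the default coming from one central table, can be sketched as follows. This is a toy model, not the LVM2 code: CFG_ITEMS, struct cfg_def, struct cfg_override and find_cfg_str are hypothetical, and the paths and default values in the table are illustrative only.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical miniature of a central definition list: ID, path, default. */
#define CFG_ITEMS(X) \
	X(devices_dir_CFG, "devices/dir", "/dev") \
	X(log_file_CFG, "log/file", "")

/* One enum constant per item, generated from the list above. */
enum cfg_id {
#define X(id, path, def) id,
	CFG_ITEMS(X)
#undef X
	CFG_COUNT
};

struct cfg_def {
	const char *path;
	const char *default_str;
};

/* Definition table indexed by ID, generated from the same list. */
static const struct cfg_def cfg_defs[CFG_COUNT] = {
#define X(id, path, def) [id] = { path, def },
	CFG_ITEMS(X)
#undef X
};

/* Stand-in for values explicitly set in lvm.conf/--config. */
struct cfg_override {
	const char *path;
	const char *value;
};

/* Return the explicitly set value for id, or the central default. */
const char *find_cfg_str(const struct cfg_override *ov, size_t count, enum cfg_id id)
{
	size_t i;

	for (i = 0; i < count; i++)
		if (!strcmp(ov[i].path, cfg_defs[id].path))
			return ov[i].value;
	return cfg_defs[id].default_str;
}
```

Because the path and the default live in one place, a caller can no longer pass a mistyped path or a default that disagrees with another call site.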
2013-03-06 10:14:33 +01:00
Peter Rajnoha
a3d891a290 config: add structs to represent config definition and register config_settings.h content
This patch adds basic structures that encapsulate the config_settings.h
content - it takes each item and puts it in structures:

  - cfg_def_type_t to define config item type

  - cfg_def_value_t to define config item (default) value

  - flags used to define the nature and use of the config item:
      - CFG_NAME_VARIABLE for items with variable names (e.g. tags)
      - CFG_ALLOW_EMPTY for items where empty value is allowed
      - CFG_ADVANCED for items which are considered as "advanced settings"
      - CFG_UNSUPPORTED for items which are not officially supported
        (config options mostly for internal use and testing/debugging)

  - cfg_def_item_t to encapsulate the whole definition of the config
    definition itself

Each config item is referenced by named ID, e.g. "devices_dir_CFG"
instead of directly typing the path "devices/dir" as it was before.

This patch also adds cfg_def_get_path helper function to get the
config setting path up to the root for given config ID
(it returns the path in form of "abc/def/.../xyz" where the "abc"
is the topmost element).
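
Building an "abc/def/.../xyz" path from parent links can be sketched like this. It is a minimal illustration, not the cfg_def_get_path implementation: struct item, the items table and cfg_path are hypothetical, and the real function's signature is not shown in the text.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical definition item: its own name plus the index of its parent. */
struct item {
	const char *name;
	int parent;	/* index into items[], or -1 for a top-level section */
};

static const struct item items[] = {
	{ "devices", -1 },	/* 0 */
	{ "dir", 0 },		/* 1: devices/dir */
	{ "metadata", -1 },	/* 2 */
	{ "disk_areas", 2 },	/* 3: metadata/disk_areas */
};

/*
 * Write the path of items[id] into buf, topmost element first.
 * Returns the path length, or -1 if buf is too small.
 */
int cfg_path(int id, char *buf, size_t len)
{
	size_t used = 0;
	int n;

	/* Emit the parent's path first so the topmost element comes first. */
	if (items[id].parent >= 0) {
		n = cfg_path(items[id].parent, buf, len);
		if (n < 0)
			return -1;
		used = (size_t) n;
	}

	n = snprintf(buf + used, len - used, "%s%s", used ? "/" : "", items[id].name);
	if (n < 0 || (size_t) n >= len - used)
		return -1;

	return (int) used + n;
}
```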
2013-03-06 10:14:33 +01:00
Peter Rajnoha
e947c362dd config: add config_settings.h
This file centrally defines all recognized LVM2 configuration
sections and settings. Each item here has its parent, set of
allowed types, default value, brief comment, version the setting
first appeared in and flags that further define the nature of
the configuration setting and its use.
2013-03-06 10:14:32 +01:00
Peter Rajnoha
6ea68f233c config: add vsn macro
The 'vsn' macro encodes the LVM2 version major, minor
and patchlevel number in a packed form using 16 bits.
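
One way to pack three version numbers into 16 bits is sketched below. The 3/5/8 bit split (major/minor/patchlevel) and all names here are assumptions made for illustration; the commit message only states that the result fits in 16 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: 3 bits major | 5 bits minor | 8 bits patchlevel. */
#define VSN_PACK(major, minor, patch) \
	((uint16_t) (((major) << 13) | ((minor) << 8) | (patch)))

/* Checked variant: each field must fit into its assumed bit width. */
uint16_t vsn_pack_checked(unsigned major, unsigned minor, unsigned patch)
{
	assert(major < 8 && minor < 32 && patch < 256);
	return VSN_PACK(major, minor, patch);
}

/* Field extractors for the same assumed layout. */
unsigned vsn_major(uint16_t v) { return v >> 13; }
unsigned vsn_minor(uint16_t v) { return (v >> 8) & 0x1f; }
unsigned vsn_patch(uint16_t v) { return v & 0xff; }
```

With the most significant field in the highest bits, packed values compare in version order, so "appeared in or before version X" becomes a plain integer comparison.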
2013-03-06 08:52:55 +01:00
Peter Rajnoha
a9d0e25627 cleanup: remove struct pv_header_extension reference from struct pv_header
Just to prevent accidental and improper use when reading the layout
from disk because of the already existing disk_areas_xl[0] lists
that are variable in size. We can read pv_header_extension only
after we know exactly where the lists end...
2013-02-27 10:47:24 +01:00
Peter Rajnoha
9d5a3c16dd lvmetad: fix to properly process embedding area 2013-02-27 10:36:49 +01:00
Peter Rajnoha
ea69cda4b0 report: add reporting fields for Embedding Area start and size
There are new reporting fields for Embedding Area: ea_start and ea_size.

An example of 1m Embedding Area and relevant reporting fields:
raw/~ # pvs -o pv_name,pe_start,ea_start,ea_size
  PV         1st PE  EA start EA size
  /dev/sda     2.00m    1.00m   1.00m
2013-02-26 14:46:42 +01:00
Peter Rajnoha
b778653f03 pv_header_extension: add support for writing PV header extension (flags & Embedding Area)
The PV header extension information (PV header extension version, flags
and list of Embedding Area locations) is stored just beyond the PV header base.

When calculating the Embedding Area start value (ea_start), the same logic is
used as when calculating the pe_start value for Data Area - the value must
follow exactly the same alignment restrictions for its start value
(the alignment detected automatically or provided via command line using
the --dataalignment and --dataalignmentoffset arguments).

The Embedding Area is placed at the very start of the PV, starting at
ea_start. The Data Area starting at pe_start is placed next. The pe_start is
still properly aligned. Due to the pe_start alignment, it's possible that the
resulting Embedding Area size (ea_size) ends up bigger in size than requested
(but never less than requested).
2013-02-26 11:28:00 +01:00
Peter Rajnoha
9dbe25709e pv_header_extension: add support for reading PV header extension (flags & Embedding Area)
New tools with PV header extension support will read the extension
if it exists and it's not an error if it does not exist (so old PVs
will still work seamlessly with new tools).

Old tools without PV header extension support will just ignore any
extension.

As for the Embedding Area location information (its start and size),
there are actually two places where this is stored:
  - PV header extension
  - VG metadata

The VG metadata contains a copy of what's written in the PV header
extension about the Embedding Area location (NULL value is not copied):

    physical_volumes {
        pv0 {
          id = "AkSSRf-difg-fCCZ-NjAN-qP49-1zzg-S0Fd4T"
          device = "/dev/sda"     # Hint only

          status = ["ALLOCATABLE"]
          flags = []
          dev_size = 262144       # 128 Megabytes
          pe_start = 67584
          pe_count = 23   # 92 Megabytes
          ea_start = 2048
          ea_size = 65536 # 32 Megabytes
        }
    }

The new metadata fields are "ea_start" and "ea_size".
This is mostly useful when restoring the PV by using existing
metadata backups (e.g. pvcreate --restorefile ...).
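
The numbers in the metadata excerpt above can be cross-checked with a little arithmetic. The sketch below assumes 512-byte sectors and a 4 MiB extent size (inferred from "23 extents = 92 Megabytes"); the function names are hypothetical.

```c
#include <stdint.h>

#define SECTOR_SIZE 512u	/* assumed sector size in bytes */

/* Convert a sector count to whole MiB. */
uint64_t sectors_to_mib(uint64_t sectors)
{
	return sectors * SECTOR_SIZE / (1024u * 1024u);
}

/* First sector past the Embedding Area. */
uint64_t ea_end(uint64_t ea_start, uint64_t ea_size)
{
	return ea_start + ea_size;
}

/* Whole extents that fit between pe_start and the end of the device. */
uint64_t extents_in_data_area(uint64_t dev_size, uint64_t pe_start,
			      uint64_t extent_sectors)
{
	return (dev_size - pe_start) / extent_sectors;
}
```

For the excerpt: ea_start (2048 sectors, 1 MiB) plus ea_size (65536 sectors, 32 MiB) gives 67584 sectors, which is exactly pe_start, so the Data Area begins right where the Embedding Area ends. The remaining 95 MiB hold 23 whole extents of 4 MiB.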

New tools do not require these two fields to exist in VG metadata,
they're not compulsory. Therefore, reading old VG metadata which doesn't
contain any Embedding Area information will not end up with any kind
of error but only a debug message that the ea_start and ea_size values
were not found.

Old tools just ignore these extra fields in VG metadata.
2013-02-26 11:27:23 +01:00
Peter Rajnoha
60c5d4c42f pv_header_extension: add supporting infrastructure for PV header extension (flags & Embedding Area)
PV header extension comes just beyond the existing PV header base:

PV header base (existing):
 - uuid
 - device size
 - null-terminated list of Data Areas
 - null-terminated list of MetaData Areas

PV header extension:
 - extension version
 - flags
 - null-terminated list of Embedding Areas

This patch also adds "eas" (Embedding Areas) list to lvmcache (lvmcache_info)
and it also adds support for common operations on the list (just like for
already existing "das" - Data Areas list):
 - lvmcache_add_ea
 - lvmcache_update_eas
 - lvmcache_foreach_ea
 - lvmcache_del_eas

Also, add ea_start and ea_size to struct physical_volume for processing
PV Embedding Area location throughout the code (currently only one
Embedding Area is supported, though the definition on disk allows for
more if needed in the future...).

Also, define FMT_EAS format flag to mark that the format actually
supports Embedding Areas (currently format-text only).
2013-02-26 11:25:16 +01:00
Peter Rajnoha
6d8de3638c cleanup: use struct pvcreate_restorable_params throughout 2013-02-26 11:25:11 +01:00
Peter Rajnoha
6692b17777 cleanup: add struct pvcreate_restorable_params and move relevant items from pvcreate_params
Extract restorable PV creation parameters from struct pvcreate_params into
a separate struct pvcreate_restorable_params for clarity and also for better
maintainability when adding any new items later.
2013-02-26 11:24:38 +01:00
Zdenek Kabelac
71f4934500 activation: fix pvmove partial tree creation
Do not try to add LV again into the partial tree, if it's been
already added. Otherwise we may end up in an endless loop.
2013-02-23 12:09:12 +01:00
Zdenek Kabelac
b73de73151 thin: lvconvert support for external origin
Add basic support for converting LV into an external origin volume.

Syntax:

lvconvert --thinpool vg/pool  --originname renamed_origin -T origin

It will convert volume  'origin' into a thin volume, which will
use 'renamed_origin' as an external read-only origin.
All read/write into origin will go via 'pool'.

renamed_origin volume is read-only volume, that could be activated
only in read-only mode, and cannot be modified.
2013-02-23 10:38:20 +01:00
Zdenek Kabelac
2cba0ea9f9 thin: removal of external_origin 2013-02-23 10:37:01 +01:00
Zdenek Kabelac
30c13eff37 thin: report external origin
Use the field 'origin' for reporting external origin lv name.

For thin volumes with an external origin, report the size of
the external origin via:

  lvs -o+origin_size
2013-02-23 10:37:01 +01:00
Zdenek Kabelac
87331dc419 thin: add support for external origin
Add internal support for thin volume's external origin.
2013-02-23 10:36:58 +01:00
Zdenek Kabelac
d023b2d12f lvremove: easier removal of dependent lvs
Add a function to remove LVs that depend on the LV being removed
before that LV itself is removed.

User is asked for confirmation.
2013-02-23 10:31:05 +01:00
Zdenek Kabelac
3679bb1cd9 activation: simplify activation code
Reorder activation code to look similar for preload tree and
activation tree.

It also gives much better support for device stacking,
since we now also support activation of a snapshot, which might
then be used for other devices.
2013-02-23 10:30:03 +01:00
Zdenek Kabelac
0631d233d8 activation: add _add_layer_target_to_dtree
Add function for creation of simple linear mapping over layer device.
2013-02-23 10:29:08 +01:00
Zdenek Kabelac
520cc9a7f8 thin: replace _thin_layer with lv_layer()
Consistently use the lv_layer function internally for the thin pool layer name.
2013-02-23 10:28:04 +01:00
Zdenek Kabelac
78b23f3595 activation: extend _cached_info
Add layer string to support check of layered devices.
2013-02-23 10:28:01 +01:00
Peter Rajnoha
303e86adc8 pvcreate: fix alignment to incorporate alignment offset if PV has 0 MDAs
If zero metadata copies are used, the PV alignment is never recalculated,
because that recalculation (which is the step that computes the alignment
correctly) only happens when metadata areas are added to the PV.
So fix this for the "PV without MDA" case as well.

Before this patch:
[1] raw/~ # pvcreate --dataalignment 8m --dataalignmentoffset 4m
--metadatacopies 1 /dev/sda
  Physical volume "/dev/sda" successfully created
[1] raw/~ # pvs -o pv_name,pe_start
  PV         1st PE
  /dev/sda    12.00m
[1] raw/~ # pvcreate --dataalignment 8m --dataalignmentoffset 4m
--metadatacopies 0 /dev/sda
  Physical volume "/dev/sda" successfully created
[1] raw/~ # pvs -o pv_name,pe_start
  PV         1st PE
  /dev/sda     8.00m

After this patch:
[1] raw/~ # pvcreate --dataalignment 8m --dataalignmentoffset 4m
--metadatacopies 1 /dev/sda
  Physical volume "/dev/sda" successfully created
[1] raw/~ # pvs -o pv_name,pe_start
  PV         1st PE
  /dev/sda    12.00m
[1] raw/~ # pvcreate --dataalignment 8m --dataalignmentoffset 4m
--metadatacopies 0 /dev/sda
  Physical volume "/dev/sda" successfully created
[1] raw/~ # pvs -o pv_name,pe_start
  PV         1st PE
  /dev/sda    12.00m

Also, remove a superfluous condition "pv->pe_start < pv->pe_align" in:
  if (pe_start == PV_PE_START_CALC && pv->pe_start < pv->pe_align)
    pv->pe_start = pv->pe_align ...
This part of the condition is not reachable as with the PV_PE_START_CALC,
we always have pv->pe_start set to 0 from the PV struct initialisation
(...the pv->pe_start value is just being calculated).
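
The transcripts above are consistent with "round up to the alignment, then add the alignment offset". A minimal sketch of that arithmetic, in 512-byte sectors, follows; calc_pe_start and its min_start parameter (the first sector past the label and any metadata area) are hypothetical and this is not the LVM2 code path.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Round min_start up to the next multiple of align, then add align_offset.
 * All values are in sectors; align must be non-zero.
 */
uint64_t calc_pe_start(uint64_t min_start, uint64_t align, uint64_t align_offset)
{
	uint64_t aligned;

	assert(align > 0);
	aligned = (min_start + align - 1) / align * align;

	return aligned + align_offset;
}
```

With --dataalignment 8m (16384 sectors) and --dataalignmentoffset 4m (8192 sectors), any small non-zero min_start yields 24576 sectors, i.e. the 12.00m shown after the patch, whether or not a metadata area precedes the data. Dropping the offset gives the 8.00m seen before the patch in the 0 MDA case.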
2013-02-21 14:51:19 +01:00
Jonathan Brassow
70f57996b3 RAID: Add new 'raid10_segtype_default' setting in lvm.conf
If '--mirrors/-m' and '--stripes/-i' are used together when creating
a logical volume, mirrors-over-stripes is currently chosen.  The user
can override this by using the '--type raid10' option on creation.
However, we want a place where we can set the default behavior to
'raid10' explicitly - similar to the "mirror" and "raid1" tunable,
mirror_segtype_default.

A follow-on patch should use this new setting to change the default
from "mirror" to "raid10", as this is the preferred segment type.
2013-02-20 15:10:04 -06:00
Jonathan Brassow
dc2ce71313 clean-up: Remove a FIXME question that has been settled
It is ok for us to use the shorthand 'lv_is_virtual' to detect error
targets in a RAID LV when searching for candidates for device replacement.
2013-02-20 15:03:58 -06:00
Jonathan Brassow
bd0ee420b5 RAID: Allow remove/replace of sub-LVs composed of error segments.
When a device fails, we may wish to replace those segments with an
error segment.  (Like when a 'vgreduce --removemissing' removes a
failed device that happens to be a RAID image/meta.)  We are then left
with images that we will eventually want to remove or replace.

This patch allows us to pull out these virtual "error" sub-LVs.  This
allows a user to 'lvconvert -m -1 vg/lv' to extract the bad sub-LVs.
Sub-LVs with error segments are considered for extraction before other
possible devices so that good devices are not accidentally removed.

This patch also adds the ability to replace RAID images that contain error
segments.  The user will still be unable to run 'lvconvert --replace'
because there is no way to address the 'error' segment (i.e. no PV
that it is associated with).  However, 'lvconvert --repair' can be
used to replace the image's error segment with a new PV.  This is also
the most appropriate way to do it, since the LV will continue to be
reported as 'partial'.
2013-02-20 14:58:56 -06:00
Jonathan Brassow
845852d6b4 RAID: Make 'vgreduce --removemissing' work with RAID LVs
Currently it is impossible to remove a failed PV which has a RAID LV
on it.  This patch fixes the issue by replacing the failed PV with an
'error' segment within the affected sub-LVs.  Once there is no longer
a RAID LV using the PV, it can be removed.

Most often, it is better to replace a failed RAID device with a spare.
(You can use 'lvconvert --repair <vg>/<LV>' to accomplish that.)
However, if there are no spares in the volume group and none will be
added, it is useful to be able to remove the failed device.

Following patches address the ability to perform 'lvconvert' operations
on RAID LVs that contain sub-LVs composed of 'error' segments.
2013-02-20 14:52:46 -06:00
Jonathan Brassow
0e4ffd9d3b clean-up: Rename lvm.conf setting 'mirror_region_size' to 'raid_region_size'
We have been using 'mirror_region_size' in lvm.conf as the default region
size for RAID logical volumes as well as mirror logical volumes.  Since
"raid" is more inclusive and representative than "mirror", I have changed
the name of this setting.  We must still check for the old setting and warn
the user if we are overriding it with the new setting if both happen to be
present.
2013-02-20 14:40:17 -06:00
Peter Rajnoha
a7d6a612b8 fix: 'Couldn't read extent size' --> '... extent start' 2013-02-21 13:33:27 +01:00
Peter Rajnoha
722ca363f0 report: fix pvs -o pv_free reporting for PVs with 0 PEs
[0] raw/~ # lsblk -o NAME,SIZE /dev/sda
NAME  SIZE
sda   128M

[0] raw/~ # pvcreate --dataalignment 128m /dev/sda
  Physical volume "/dev/sda" successfully created

[0] raw/~ # vgcreate vg /dev/sda
  Volume group "vg" successfully created

[0] raw/~ # lvcreate -l1 vg
  Volume group "vg" has insufficient free space (0 extents): 1 required.

Before this patch:
[0] raw/~ # pvs -o pv_name,pv_free
  PV         PFree
  /dev/sda   128.00m

After this patch:
[0] raw/~ # pvs -o pv_name,pv_free
  PV         PFree
  /dev/sda      0
2013-02-21 13:28:07 +01:00
Zdenek Kabelac
e566faaae6 cleanup: old style gcc 2013-02-05 16:54:12 +01:00
Zdenek Kabelac
d97605beaf cleanup: preserve signesss and type size on return values 2013-02-05 16:54:11 +01:00
Zdenek Kabelac
7910b6c0ba thin: update pool_is_active
Change it to take LV and move it to exported header - seems
to be a better fit for usability from tools/ directory.
2013-02-05 16:54:11 +01:00
Zdenek Kabelac
c984d8fbab thin: properly unmark volume after detach
When the volume is detached from the thin pool,
unmask THIN_VOLUME flag and reset related pointers.
2013-02-05 14:40:37 +01:00
Zdenek Kabelac
11eaf1c98c thin: add function pool_is_active
This internal function checks for an active pool device.
For a clustered VG it checks every thin volume;
for a non-clustered VG we only need to check
for the presence of the -tpool device.
2013-02-05 14:35:44 +01:00
Zdenek Kabelac
9d445f371c report: leave empty report field for 0
Since we do not support LVs with 0 size, use this value
as 'error' value for devices without origin, and leave this
field blank as in other cases.
2013-02-05 14:32:37 +01:00
Zdenek Kabelac
ddeb37f282 cleanup: add internal error check
Check if 'is_removable' is defined and report internal error,
if it's missing.
2013-02-05 14:27:24 +01:00
Jonathan Brassow
f5cd9c3563 clean-up: Another functiont that can use 'lv_layer'
lib/activate/dev_manager.c:dev_manager_raid_status() can also use
the new 'lv_layer' function.
2013-02-04 17:10:16 -06:00
Zdenek Kabelac
a4870c79ca thin: use noflush for obtaining transaction_id
Do not flush thin pool data when reading the transaction_id status.
2013-02-04 19:05:56 +01:00
Zdenek Kabelac
153ce89af3 cleanup: comment update
Just update code comment and use single line if().
2013-02-04 19:05:43 +01:00
Zdenek Kabelac
b37a0a39e3 cleanup: indent line 2013-02-04 19:01:11 +01:00
Zdenek Kabelac
8ed0b6f312 thin: replace is_active with send_messages
Since is_active is only used for thinp,
replace the struct member with a more meaningful
send_messages flag.
2013-02-04 19:01:10 +01:00
Zdenek Kabelac
4af4241ba4 use lv_layer 2013-02-04 19:01:10 +01:00
Zdenek Kabelac
ca7abbce8a activate: add lv_layer function
Add function to return layer name for LV.
2013-02-04 19:01:10 +01:00
Zdenek Kabelac
9f433e6ee3 cleanup: postpone lv_is_thin_volume check
Code move to make it easier to follow and
call _add_dev_to_dtree() in the separate if() branch
for thin volumes.
2013-02-04 19:00:19 +01:00
Jonathan Brassow
801d4f96a8 RAID: Improve 'lvs' attribute reporting of RAID LVs and sub-LVs
There are currently a few issues with the reporting done on RAID LVs and
sub-LVs.  The most concerning is that 'lvs' does not always report the
correct failure status of individual RAID sub-LVs (devices).  This can
occur when a device fails and is restored after the failure has been
detected by the kernel.  In this case, 'lvs' would report all devices are
fine because it can read the labels on each device just fine.
Example:
[root@bp-01 lvm2]# lvs -a -o name,vg_name,attr,copy_percent,devices vg
  LV            VG   Attr      Cpy%Sync Devices
  lv            vg   rwi-a-r--   100.00 lv_rimage_0(0),lv_rimage_1(0)
  [lv_rimage_0] vg   iwi-aor--          /dev/sda1(1)
  [lv_rimage_1] vg   iwi-aor--          /dev/sdb1(1)
  [lv_rmeta_0]  vg   ewi-aor--          /dev/sda1(0)
  [lv_rmeta_1]  vg   ewi-aor--          /dev/sdb1(0)

However, 'dmsetup status' on the device tells us a different story:
  [root@bp-01 lvm2]# dmsetup status vg-lv
  0 1024000 raid raid1 2 DA 1024000/1024000

In this case, we must also be sure to check the RAID LVs kernel status
in order to get the proper information.  Here is an example of the correct
output that is displayed after this patch is applied:
[root@bp-01 lvm2]# lvs -a -o name,vg_name,attr,copy_percent,devices vg
  LV            VG   Attr      Cpy%Sync Devices
  lv            vg   rwi-a-r-p   100.00 lv_rimage_0(0),lv_rimage_1(0)
  [lv_rimage_0] vg   iwi-aor-p          /dev/sda1(1)
  [lv_rimage_1] vg   iwi-aor--          /dev/sdb1(1)
  [lv_rmeta_0]  vg   ewi-aor-p          /dev/sda1(0)
  [lv_rmeta_1]  vg   ewi-aor--          /dev/sdb1(0)

The other case where 'lvs' gives incomplete or improper output is when a
device is replaced or added to a RAID LV.  It should display that the RAID
LV is in the process of sync'ing and that the new device is the only one
that is not-in-sync - as indicated by a leading 'I' in the Attr column.
(Remember that 'i' indicates an (i)mage that is in-sync and 'I' indicates
an (I)mage that is not in sync.)  Here's an example of the old incorrect
behaviour:
[root@bp-01 lvm2]# lvs -a -o name,vg_name,attr,copy_percent,devices vg
  LV            VG   Attr      Cpy%Sync Devices
  lv            vg   rwi-a-r--   100.00 lv_rimage_0(0),lv_rimage_1(0)
  [lv_rimage_0] vg   iwi-aor--          /dev/sda1(1)
  [lv_rimage_1] vg   iwi-aor--          /dev/sdb1(1)
  [lv_rmeta_0]  vg   ewi-aor--          /dev/sda1(0)
  [lv_rmeta_1]  vg   ewi-aor--          /dev/sdb1(0)
[root@bp-01 lvm2]# lvconvert -m +1 vg/lv; lvs -a -o name,vg_name,attr,copy_percent,devices vg
  LV            VG   Attr      Cpy%Sync Devices
  lv            vg   rwi-a-r--     0.00 lv_rimage_0(0),lv_rimage_1(0),lv_rimage_2(0)
  [lv_rimage_0] vg   Iwi-aor--          /dev/sda1(1)
  [lv_rimage_1] vg   Iwi-aor--          /dev/sdb1(1)
  [lv_rimage_2] vg   Iwi-aor--          /dev/sdc1(1)
  [lv_rmeta_0]  vg   ewi-aor--          /dev/sda1(0)
  [lv_rmeta_1]  vg   ewi-aor--          /dev/sdb1(0)
  [lv_rmeta_2]  vg   ewi-aor--          /dev/sdc1(0)
** Note that all the images currently are marked as 'I' even though it is
   only the last device that has been added that should be marked.

Here is an example of the correct output after this patch is applied:
[root@bp-01 lvm2]# lvs -a -o name,vg_name,attr,copy_percent,devices vg
  LV            VG   Attr      Cpy%Sync Devices
  lv            vg   rwi-a-r--   100.00 lv_rimage_0(0),lv_rimage_1(0)
  [lv_rimage_0] vg   iwi-aor--          /dev/sda1(1)
  [lv_rimage_1] vg   iwi-aor--          /dev/sdb1(1)
  [lv_rmeta_0]  vg   ewi-aor--          /dev/sda1(0)
  [lv_rmeta_1]  vg   ewi-aor--          /dev/sdb1(0)
[root@bp-01 lvm2]# lvconvert -m +1 vg/lv; lvs -a -o name,vg_name,attr,copy_percent,devices vg
  LV            VG   Attr      Cpy%Sync Devices
  lv            vg   rwi-a-r--     0.00 lv_rimage_0(0),lv_rimage_1(0),lv_rimage_2(0)
  [lv_rimage_0] vg   iwi-aor--          /dev/sda1(1)
  [lv_rimage_1] vg   iwi-aor--          /dev/sdb1(1)
  [lv_rimage_2] vg   Iwi-aor--          /dev/sdc1(1)
  [lv_rmeta_0]  vg   ewi-aor--          /dev/sda1(0)
  [lv_rmeta_1]  vg   ewi-aor--          /dev/sdb1(0)
  [lv_rmeta_2]  vg   ewi-aor--          /dev/sdc1(0)
** Note only the last image is marked with an 'I'.  This is correct and we can
   tell that it isn't the whole array that is sync'ing, but just the new
   device.
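
To make the two attribute characters discussed above concrete, here is a minimal Python sketch (not lvm2 code) that decodes them from an Attr string as printed in these examples. The function name and the returned keys are hypothetical. It assumes the 9-character Attr layout shown in the transcripts, where the first character is 'i' or 'I' for RAID images and the ninth character is 'p' for a (p)artial LV, as in the lvs man page.

```python
from typing import Dict, Optional


def describe_raid_attr(attr: str) -> Dict[str, Optional[bool]]:
    """Decode the image/sync and partial flags of an lvs Attr string.

    Hypothetical helper for illustration only; assumes the 9-character
    Attr layout seen in the transcripts above.
    """
    if len(attr) < 9:
        raise ValueError("expected an Attr string of at least 9 characters")

    is_image = attr[0] in ("i", "I")
    return {
        "is_image": is_image,
        # 'i' = image in sync, 'I' = image not in sync; not applicable otherwise
        "in_sync": (attr[0] == "i") if is_image else None,
        # ninth character 'p' marks a partial LV (a device is missing/failed)
        "partial": attr[8] == "p",
    }
```

For example, `describe_raid_attr("Iwi-aor--")` reports an image that is not in sync and not partial, which is the state expected only for the newly added lv_rimage_2.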

It also works under snapshots...
[root@bp-01 lvm2]# lvs -a -o name,vg_name,attr,copy_percent,devices vg
  LV            VG   Attr      Cpy%Sync Devices
  lv            vg   owi-a-r-p    33.47 lv_rimage_0(0),lv_rimage_1(0),lv_rimage_2(0)
  [lv_rimage_0] vg   iwi-aor--          /dev/sda1(1)
  [lv_rimage_1] vg   Iwi-aor-p          /dev/sdb1(1)
  [lv_rimage_2] vg   Iwi-aor--          /dev/sdc1(1)
  [lv_rmeta_0]  vg   ewi-aor--          /dev/sda1(0)
  [lv_rmeta_1]  vg   ewi-aor-p          /dev/sdb1(0)
  [lv_rmeta_2]  vg   ewi-aor--          /dev/sdc1(0)
  snap          vg   swi-a-s--          /dev/sda1(51201)
2013-02-01 11:33:54 -06:00
Jonathan Brassow
37ffe6a13a RAID: Cache previous results of lv_raid_dev_health for future use
We can avoid many dev_manager (ioctl) calls by caching the results of
previous calls to lv_raid_dev_health.  Just considering the case where
'lvs -a' is called to get the attributes of a RAID LV and its sub-lvs,
this function would be called many times.  (It would be called at least
7 times for a 3-way RAID1 - once for the health of each sub-LV and once
for the health of the top-level LV.)  This is a good idea because the
sub-LVs are processed in groups along with their parent RAID LV and in
each case, it is the parent LV whose status will be queried.  Therefore,
there only needs to be one trip through dev_manager for each time the
group is processed.
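
The saving described above can be illustrated with a small Python sketch. It is not the lvm2 implementation: the class name, the dict-based cache keyed by parent LV name, and the `trips` counter are all assumptions for illustration (the real code may remember only the most recent result).

```python
from typing import Callable, Dict


class RaidHealthCache:
    """Remember the health string already fetched for a parent RAID LV."""

    def __init__(self, query: Callable[[str], str]) -> None:
        self._query = query              # stands in for the dev_manager (ioctl) call
        self._cache: Dict[str, str] = {}
        self.trips = 0                   # number of real queries performed

    def health(self, parent_lv: str) -> str:
        # Only go to the (expensive) query when this parent LV is not cached.
        if parent_lv not in self._cache:
            self.trips += 1
            self._cache[parent_lv] = self._query(parent_lv)
        return self._cache[parent_lv]
```

With this sketch, the seven lookups mentioned for a 3-way RAID1 (one per sub-LV plus one for the top-level LV) all ask about the same parent LV, so only the first one reaches the query function.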
2013-02-01 11:32:18 -06:00
Jonathan Brassow
c8242e5cf4 RAID: Add RAID status accessibility functions
Similar to the way thin* accesses its kernel status, we add a method
for RAID to grab the various values in its status output without the
higher levels (LVM) having to understand how to parse the output.
Added functions include:
        - lib/activate/dev_manager.c:dev_manager_raid_status()
          Pulls the status line from the kernel

        - libdm/libdm-deptree.c:dm_get_status_raid()
          Parses status line and puts components into dm_status_raid struct

        - lib/activate/activate.c:lv_raid_dev_health()
          Accesses dm_status_raid to deliver raid dev_health string

The new structure and functions can provide a more unified way to access
status information.  ('lv_raid_percent' could switch to using these
functions, for example.)
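
As a rough illustration of what "parses status line and puts components into a struct" means, here is a Python sketch that parses a full `dmsetup status` line of the form quoted in the 'lvs' attribute reporting commit (`0 1024000 raid raid1 2 DA 1024000/1024000`). It is not the libdm code: the `RaidStatus` fields and function names are hypothetical, and the health characters are interpreted per the kernel dm-raid documentation ('A' alive and in-sync, 'a' alive but not in-sync, 'D' dead/failed).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RaidStatus:
    raid_type: str
    dev_count: int
    dev_health: str        # one character per device, e.g. "DA"
    insync_regions: int
    total_regions: int


def parse_raid_status(line: str) -> RaidStatus:
    """Parse '<start> <len> raid <type> <#devs> <health> <insync>/<total>'."""
    fields = line.split()
    if len(fields) < 7 or fields[2] != "raid":
        raise ValueError("not a raid target status line")
    raid_type, dev_count, dev_health, ratio = fields[3:7]
    insync, total = ratio.split("/")
    status = RaidStatus(raid_type, int(dev_count), dev_health,
                        int(insync), int(total))
    if len(status.dev_health) != status.dev_count:
        raise ValueError("health string does not match device count")
    return status


def failed_devices(status: RaidStatus) -> List[int]:
    """Return the indices of devices the kernel reports as dead ('D')."""
    return [i for i, c in enumerate(status.dev_health) if c == "D"]
```

Applied to the quoted line, the sketch reports device 0 as failed, which matches the 'p' flag shown on lv_rimage_0 and lv_rmeta_0 in the corrected 'lvs' output.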
2013-02-01 11:31:47 -06:00
Petr Rockai
1e4a9534f4 lvmetad: Call _lvmetad_handle_reply in lvmetad_vg_lookup. 2013-01-16 11:19:33 +01:00
Sebastian Ott
9602e68577 filters: add scm devices
Fix this:
pvcreate /dev/scma
  Device /dev/scma not found (or ignored by filtering).

Reported-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
2013-01-11 09:24:07 +01:00
Alasdair G Kergon
06abb2dd4c logging: classify log_debug messages
Place most log_debug() messages into a class.
2013-01-07 22:30:29 +00:00
Alasdair G Kergon
7f747a0d73 logging: add debug classes
Add log/debug_classes to lvm.conf to allow debug messages to be
classified and filtered at runtime.

The dm_errno field is only used by log_error(), so I've redefined it
for log_debug() messages to hold the message class.

By default, all existing messages appear, but we can add categories that
generate high volumes of data, such as logging all traffic to/from
lvmetad.
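
The filtering idea can be sketched in Python as follows. This is only an illustration under assumptions: the class names, their bit values, and the rule that unclassified messages always appear are chosen here for the example and are not taken from the lvm2 source.

```python
from enum import IntFlag
from typing import Iterable


class DebugClass(IntFlag):
    # Illustrative classes only; the real set is defined by lvm2.
    DEVICES = 0x1
    ACTIVATION = 0x2
    LVMETAD = 0x4


def enabled_mask(names: Iterable[str]) -> DebugClass:
    """Build a bitmask from a list of class names, as a config list might supply."""
    mask = DebugClass(0)
    for name in names:
        mask |= DebugClass[name.upper()]   # KeyError for an unknown class name
    return mask


def should_log(message_class: DebugClass, mask: DebugClass) -> bool:
    """Unclassified messages (class 0) always appear; others need their bit set."""
    return message_class == 0 or bool(message_class & mask)
```

A high-volume class such as the hypothetical `LVMETAD` above would then stay silent unless it is named in the configured list.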
2013-01-07 22:25:19 +00:00
Alasdair G Kergon
b617109fff lvmetad: fix format1 updates
fmt1 doesn't have a separate commit function: updates take effect
immediately vg_write is called, so we must update lvmetad at this
point if we're going to go on and ask lvmetad for the VG metadata
again before calling the commit function (though that's probably an
unsupported and pointless thing to do anyway as the client must
already have that data and it cannot have changed because it's locked
and with devs suspended we shouldn't be communicating with lvmetad;
so when that's fixed properly, this fix here can be reverted).

This problem showed up as an internal error when lvremoving an LVM1
snapshot.

> Internal error: LV snap1 (00000000000000000000000000000001) missing from preload metadata

https://bugzilla.redhat.com/891855
2013-01-05 03:17:35 +00:00
Alasdair G Kergon
48e1ae7f6a lvmetad: add basic client-side debug logging
First attempt at showing precisely what use any command is making of
lvmetad in the -vvvv trace information.
2013-01-05 00:35:50 +00:00
Alasdair G Kergon
41e7f45258 lvmetad: rename device vars and move _token_update
Move _token_update() to avoid the need for _lvmetad_send prototype.

Use 'dev' consistently for a struct device * variable.
Use 'devno' for a dev_t.
2013-01-04 23:45:22 +00:00
Alasdair G Kergon
6d760b2c63 lvmetad: improve client logging when connecting
Rename lvmetad_warning() to lvmetad_connect_or_warn().

Log all connection attempts on the client side, whether successful or not.

Reduce some nesting and remove a redundant assertion.
2013-01-04 23:22:30 +00:00
Jonathan Brassow
970dfbcd69 RAID: Limit replacement of devices when array is not in-sync.
If a RAID array is not in-sync, replacing devices should not be allowed
as a general rule.  This is because the contents used to populate the
incoming device may be undefined because the devices being read were
not in-sync.  The kernel enforces this rule, unless overridden, by not
allowing the creation of an array that is not in-sync and includes a
device that needs to be rebuilt.

Since we cannot know the sync state of an LV if it is inactive, we must
also enforce the rule that an array must be active to replace devices.

That leaves us with the following conditions:
1) never allow replacement or repair of devices if the LV is inactive
2) never allow replacement if the LV is not in-sync
3) allow repair if the LV is not in-sync, but warn that contents may
   not be recoverable.

In the case where a user is performing the repair on the command line via
'lvconvert --repair', the warning is printed before the user is prompted
if they would like to replace the device(s).  If the repair is automated
(i.e. via dmeventd and policy is "allocate"), then the device is replaced
if possible and the warning is printed.
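
The three conditions above amount to a small decision table. The following Python sketch expresses them directly; the function name, the "replace"/"repair" strings, and the message texts are hypothetical and not lvm2 output.

```python
from typing import Tuple


def check_device_change(operation: str, lv_active: bool,
                        lv_in_sync: bool) -> Tuple[bool, str]:
    """Return (allowed, message) for a RAID 'replace' or 'repair' request."""
    if operation not in ("replace", "repair"):
        raise ValueError("operation must be 'replace' or 'repair'")

    # 1) never allow replacement or repair if the LV is inactive
    if not lv_active:
        return False, "LV must be active"

    if not lv_in_sync:
        # 2) never allow replacement if the LV is not in-sync
        if operation == "replace":
            return False, "LV is not in-sync"
        # 3) allow repair, but warn that contents may not be recoverable
        return True, "warning: contents may not be recoverable"

    return True, ""
```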
2012-12-18 14:40:42 -06:00
Zdenek Kabelac
401c9aba4a pv_read: add missing check for valid info
If the lvmcache_info_from_pvid() fails to find valid
info, invoke the lookup by dev, and only in this case
call lvmcache_info_from_pvid() again.

Also check for the result of info and return
error directly, so the NULL is not passed
to lvmcache_get_label().
2012-12-15 17:23:27 +01:00
Zdenek Kabelac
e012d0635d lvmetad: check id_read_format error status
Detect error from id_read_format() function.
2012-12-15 17:23:27 +01:00
Zdenek Kabelac
ff5612c0c3 format-text: check for _text_create_text_instance
Test if 'fid' creation failed and report stack trace,
break the loop and do not pass NULL fid further.
2012-12-15 17:23:23 +01:00
Zdenek Kabelac
740ab81d03 log: move abort past syslog
When abort_on_internal_errors is enabled, we aborted prior to
the syslog logging output.

Since such a fatal error gets level _LOG_FATAL, it should
not be blocked by the debug_level() check, so let's move the abort
further down, so the error is also logged via syslog.
2012-12-15 17:22:48 +01:00
Zdenek Kabelac
575c4ed964 cleanup: use proper const in apply_lvname_restrictions
Better constness used for reserved prefixes and strings.
Also simplify validate_name a bit and use direct char
checks instead of 2 strcmp() calls.
2012-12-15 14:57:40 +01:00
Zdenek Kabelac
21f6511bc2 cleanup: reorder code
Swap if() test condition and check for failure
and use traditional 'stack' trace.
2012-12-15 14:57:40 +01:00
Zdenek Kabelac
8ab4334505 cleanup: ignore return values
These dm_snprintfs should not fail, since enough space is reserved.
So the return value is intentionally ignored.
2012-12-15 14:57:40 +01:00
Petr Rockai
f14f2d4378 lvmetad: Fix autoactivation for MDA-less PVs.
Calling pvscan --cache with -aay on a PV without an MDA would spuriously fail
with an internal error, because of an incorrect assumption that a parsed VG
structure was always available. This is not true and the autoactivation handler
needs to call vg_read to obtain metadata in cases where the PV had no MDAs to
parse. Therefore, we pass vgid into the handler instead of the (possibly NULL)
VG coming from the PV's MDA.
2012-12-12 13:19:04 +01:00
Marian Csontos
ff5c1c576c lvmetad: use dm_config_destroy to free pvmeta
Release pvmeta handler with proper dm_config_destroy() function.
TODO: Fix primary fault for this internal error.

Signed-off-by: mcsontos@redhat.com
2012-12-11 11:55:12 +01:00
Zdenek Kabelac
17be6d5210 thin: fix test for discards ignore settings
Arghh, this was a bad last-minute shortening of the if() expression
in the commit 1ef9831018.

dm_tree_node_set_thin_pool_discard() must not run in the same
expression as the check for non-power-2 discard, otherwise
there are 2 calls to dm_tree_node_set_thin_pool_discard() and
the whole setting of discards is misinterpreted.

In-release fix: use proper braces {}.
2012-12-11 11:26:19 +01:00
Zdenek Kabelac
ec49f07b0d mirrors: fix leak in device_is_usable mirror check
Function _ignore_blocked_mirror_devices was not releasing the
allocated strings images_health and log_health.

In error paths it was also not releasing the dm_task structure.

Swapped the return code of _ignore_blocked_mirror_devices to
use 1 as success.

In _parse_mirror_status use log_error if memory allocation
fails, and for a few more errors, so they do not go unnoticed
as debug messages.

On error path always clear return values and free strings.

For dev_create_file use cache mem pool to avoid memleak.
2012-12-11 11:15:22 +01:00
Peter Rajnoha
35a4d70aad activation: don't miss the log on empty {auto_activation|read_only|}_volume_list
Addendum to previous commit...
2012-12-04 14:12:36 +01:00
Peter Rajnoha
e2be2652ad Allow empty activation/{auto_activation|read_only|}_volume_list config option.
In case we don't want to activate, autoactivate or have the
VG/LV read-only. Primarily targeted for the auto_activation_volume_list,
but it does no harm for other settings (the part of the code
that reads these three settings is shared, but there's no
reason to separate it only for this change).
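
The difference between an unset list and an empty one can be sketched in Python. This is a simplified model, not the lvm2 matching code: it assumes an unset list (`None`) places no restriction, as for auto_activation_volume_list, and it only handles plain "vgname" and "vgname/lvname" entries. The function name is hypothetical.

```python
from typing import List, Optional


def passes_volume_list(vg: str, lv: str,
                       volume_list: Optional[List[str]]) -> bool:
    """Return True if vg/lv is selected by the given list.

    None models an option that is not set at all (assumed: no restriction).
    An empty list is defined but matches nothing.
    """
    if volume_list is None:
        return True
    return any(entry in (vg, f"{vg}/{lv}") for entry in volume_list)
```

Under these assumptions an empty list selects no LV, which is how one would switch off autoactivation entirely.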
2012-12-04 10:33:54 +01:00
Zdenek Kabelac
a530c70b21 thin: update thin feature detection
Save 1 static var and keep the whole detection within one function.
2012-12-03 13:03:41 +01:00
Zdenek Kabelac
5ec20e267f thin: reworked thin feature detection
Rework thin feature detection to support a runtime
section that allows features to be disabled selectively.

New lvm.conf option is born: global/thin_disabled_features
2012-12-03 11:57:40 +01:00
Zdenek Kabelac
6987a353de thin: add detach_pool_metadata_lv
Add internal function detach_pool_metadata_lv().
2012-12-02 17:56:29 +01:00
Zdenek Kabelac
9ec474f38a lvm2api: fix size reporting
API is reporting all sizes as 64bit integers in bytes.
Fix those places where sectors were returned,
to remain consistent.
2012-12-02 17:55:08 +01:00
Peter Rajnoha
ed9751d9fa udev: add a warning message if DM_DISABLE_UDEV set and udev running
$ export DM_DISABLE_UDEV=1

$ dmsetup create test --table "0 1 zero"
Udev is running and DM_DISABLE_UDEV environment variable is set. Bypassing udev, device-mapper library will manage device nodes in device directory.

$ lvchange -ay vg/lvol0
  Udev is running and DM_DISABLE_UDEV environment variable is set. Bypassing udev, LVM will manage logical volume symlinks in device directory.
  Udev is running and DM_DISABLE_UDEV environment variable is set. Bypassing udev, LVM will obtain device list by scanning device directory.
  Udev is running and DM_DISABLE_UDEV environment variable is set. Bypassing udev, device-mapper library will manage device nodes in device directory.
2012-11-29 15:57:43 +01:00
Peter Rajnoha
4891a735d3 udev: recognize DM_DISABLE_UDEV environment variable
Setting this environment variable will cause a full fallback
to old direct node and symlink management in libdevmapper and lvm2.

It means:

 - disabling udev synchronization
   (--noudevsync in dmsetup and --noudevsync + activation/udev_sync=0
    lvm2 config)
 - disabling dm and any subsystem related udev rules
   (--noudevrules in dmsetup and activation/udev_rules=0 lvm2 config)
 - management of nodes/symlinks under /dev directly by libdevmapper/lvm2
   (--verifyudev in dmsetup and activation/verify_udev_operations=1
    lvm2 config)
 - not obtaining any device list from udev database
   (devices/obtain_device_list_from_udev=0 lvm2 config)

Note: we could set all of these before - there's no functional change!
However, the DM_DISABLE_UDEV environment variable is a nice shortcut
that makes it easier for libdevmapper users to switch off all
of the udev management in one go directly on the command line,
without a need to modify any source or add any extra switches.
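
The shortcut can be pictured as one switch overriding the four lvm2 config settings listed above. The Python sketch below is an illustration only: the function name is hypothetical, and treating any non-empty value of the variable as "set" is an assumption, not something the commit message states.

```python
from typing import Dict, Mapping


def effective_udev_settings(env: Mapping[str, str],
                            configured: Dict[str, int]) -> Dict[str, int]:
    """Apply the DM_DISABLE_UDEV shortcut on top of configured settings."""
    settings = dict(configured)
    # Assumption: any non-empty value enables the fallback.
    if env.get("DM_DISABLE_UDEV"):
        settings.update({
            "activation/udev_sync": 0,
            "activation/udev_rules": 0,
            "activation/verify_udev_operations": 1,
            "devices/obtain_device_list_from_udev": 0,
        })
    return settings
```

Passing `os.environ` as `env` would model a real command invocation; the explicit parameter just keeps the sketch easy to exercise.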
2012-11-29 14:03:48 +01:00
Zdenek Kabelac
0387e70d76 thin: fix property discard for lvm2api
Discards property is string and may have these values:
  ignore, nopassdown, passdown
2012-11-27 14:09:49 +01:00
Zdenek Kabelac
09b7ceea95 thin: allow restore with --force
Allow restoring metadata with thin pool volumes.
No validation is done for this case within vgcfgrestore tool -
thus incorrect metadata may lead to destruction of pool content.
2012-11-27 14:08:24 +01:00
Alasdair G Kergon
8c49aa79e7 filters: Add STEC skd and Violin vtms devices 2012-11-26 14:55:17 +00:00
Zdenek Kabelac
1ef9831018 thin: support configurable thin pool defaults
Configurable settings for thin pool create
if they are not specified on command line.

New supported lvm.conf options are:
  allocation/thin_pool_chunk_size
  allocation/thin_pool_discards
  allocation/thin_pool_zero
2012-11-26 12:16:47 +01:00
Zdenek Kabelac
683b1f0625 thin: detect discards for non-power-2
Check whether the target supports discards for chunk sizes
that are not a power of 2 (just a multiple of 64K),
and enable them if the thin kernel target supports it.
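
The arithmetic behind this check is small enough to sketch in Python. The names are hypothetical and the boolean `target_supports_non_power_2` stands in for whatever the real feature detection reports; this is not the lvm2 code.

```python
CHUNK_GRANULARITY = 64 * 1024  # chunk sizes are multiples of 64K (in bytes)


def is_power_of_two(n: int) -> bool:
    """True when n has exactly one bit set."""
    return n > 0 and (n & (n - 1)) == 0


def discards_usable(chunk_size: int, target_supports_non_power_2: bool) -> bool:
    """Decide whether discards can be enabled for a given chunk size in bytes."""
    if chunk_size <= 0 or chunk_size % CHUNK_GRANULARITY != 0:
        raise ValueError("chunk size must be a positive multiple of 64K")
    # Power-of-2 sizes are fine; other sizes need target support.
    return is_power_of_two(chunk_size) or target_supports_non_power_2
```

A 192K chunk (3 x 64K) is the simplest size that is valid but not a power of 2, so it only gets discards when the target reports support.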
2012-11-26 12:14:47 +01:00