mirror of git://sourceware.org/git/lvm2.git synced 2024-12-22 17:35:59 +03:00
1488 Commits

Author SHA1 Message Date
Zdenek Kabelac
0670bfeb59 thin: validation catch multiseg thin pool/volumes
Multisegment thin pools and volumes are not supported.
Catch such error code path early.
2013-09-07 03:32:07 +02:00
Zdenek Kabelac
4c001a7854 thin: fix resize of stacked thin pool volume
When the pool is created from non-linear target the more complex rules
have to be used and stacking needs to properly decode args for _tdata
LV. Also proper allocation policies are being used according to those
set in lvm2 metadata for data and metadata LVs.

Also properly check for an active pool and add extra code to activate it
temporarily.

With this fix it's now possible to use:

lvcreate -L20 -m2 -n pool vg  --alloc anywhere
lvcreate -L10 -m2 -n poolm vg --alloc anywhere
lvconvert --thinpool vg/pool --poolmetadata vg/poolm

lvresize -L+10 vg/pool
2013-09-07 03:24:48 +02:00
Jonathan Brassow
72d6bdd6b9 misc: make lv_is_on_pv use for_each_sub_lv to walk LV tree
Make lv_is_on_pv use for_each_sub_lv to walk the LV tree.  This
reduces code duplication.
2013-08-23 11:03:28 -05:00
Jonathan Brassow
e5c0213168 Thin: Make 'lv_is_on_pv(s)' work with thin types
The pool metadata LV must be accounted for when determining what PVs
are in a thin-pool.  The pool LV must also be accounted for when
checking thin volumes.

This is a prerequisite for pvmove working with thin types.
2013-08-23 08:49:16 -05:00
Jonathan Brassow
f1e3640df3 Misc: Make get_pv_list_for_lv() available to more than just RAID
The function 'get_pv_list_for_lv' will assemble all the PVs that are
used by the specified LV.  It uses 'for_each_sub_lv' to traverse all
of the sub-lvs which may compose it.
2013-08-23 08:40:13 -05:00
Peter Rajnoha
f74e8fe044 thin: fix commit e195b5227e
Check chunk_size range unconditionally.
2013-08-06 16:28:12 +02:00
Peter Rajnoha
e195b5227e thin: apply VG profile if creating a new thin pool
When creating a new thin pool and there's no profile requested
via "lvcreate --profile ...", inherit any VG profile if it's attached.

Currently this applies to these settings:
  allocation/thin_pool_chunk_size
  allocation/thin_pool_discards
  allocation/thin_pool_zero
2013-08-06 11:42:40 +02:00
Zdenek Kabelac
7b58f10442 thin: move setting of THIN_POOL
Set flag when attaching data LV, which makes the segment THIN_POOL.
2013-07-31 15:27:38 +02:00
Alasdair G Kergon
b6bfddcd0a alloc: fix lvextend when stripe number varies
The PREFERRED allocation mechanism requires the number of areas in the
previous LV segment to match the number in the new segment being
allocated.  If they do not match, the code may crash.
  E.g. https://bugzilla.redhat.com/989347

Introduce A_AREA_COUNT_MATCHES and when not set avoid referring
to the previous segment with the contiguous and cling policies.
2013-07-29 19:35:45 +01:00
Jonathan Brassow
f5a205668b Revert a previous change
commit d00d45a8b6 introduced changes
that are causing cluster mirror tests to fail.  Ultimately, I think
the change was right, but a proper clean-up will have to wait.
The portion of the commit we are reverting correlates to the
following commit comment:
    2) lib/metadata/mirror.c:_delete_lv() - should have been calling
       _activate_lv_like_model() with 'mirror_lv'.  This is because
       'mirror_lv' is the LV that the overall operation is being
       performed on.  We need to use this LV as the basis for
       determining whether to activate locally, or across the
       cluster, etc.
It appears that when legs or logs are removed from a mirror, they
are being activated before they are deleted in order to make them
top-level LVs that can be acted upon.  When doing this, it appears
they are not activated based on the characteristics of the mirror
from which they came.  IOW, if the mirror was exclusively active,
the sub-LVs are activated globally.  This is a no-no.  This then
made it impossible to activate_lv_like_model if the model was
"mirror_lv" instead of "lv" in _delete_lv().  Thus, at some point
this change should probably be put back and those locations where
the sub-LVs are being improperly activated "shared" instead of
EX should be corrected.
2013-07-24 14:18:07 -05:00
Zdenek Kabelac
5597dc3652 thin: not zeroing for non-zeroed thin pool snaps
Do not zero initial 4KB of thin snapshot volume for thin pool with
disabled zeroing.
2013-07-24 01:15:31 +02:00
Jonathan Brassow
d00d45a8b6 Clean-up: Addressing a few FIXME's
Three fixme's addressed in this commit:
1) lib/metadata/lv_manip.c:_calc_area_multiple() - this could be
   safely changed to a comment explaining that currently because
   RAID10 can only have a 2-way mirror, we don't need to know the
   number of stripes.  However, we will need to know that in the
   future if RAID10 is to support more than 2-way mirroring.

2) lib/metadata/mirror.c:_delete_lv() - should have been calling
   _activate_lv_like_model() with 'mirror_lv'.  This is because
   'mirror_lv' is the LV that the overall operation is being
   performed on.  We need to use this LV as the basis for
   determining whether to activate locally, or across the
   cluster, etc.

3) tools/lvcreate.c:_lvcreate_params() - Minor clean-up.  If
   '-m 0' is given, treat it as though the mirroring argument
   was not given (i.e. as though the requested segment type
   was 'stripe' and not mirror).
2013-07-23 14:46:22 -05:00
Zdenek Kabelac
373f95a921 snapshot: update merging fix
Activation is needed only for clustered VG.
For non-clustered VG skip activation, since deactivate_lv()
is called without problems (no testing for lock presence).

(updates f6ded62291)
2013-07-23 15:15:04 +02:00
Zdenek Kabelac
6311be29e4 thin: use 64bit arithmetic for checking meta size
Avoid overflow since extents are just 32bit values.

(in release fix 87aca628)
2013-07-23 14:58:07 +02:00
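The overflow that commit 6311be29e4 guards against can be illustrated with a small Python sketch. Python integers do not overflow, so 32-bit wraparound is simulated with a mask. The function names, the limit value, and the sample numbers are illustrative assumptions and are not taken from the lvm2 sources.

```python
# Sketch only: simulates a size check done in 32-bit vs 64-bit arithmetic.
# All names and the limit below are hypothetical, chosen for illustration.

U32_MASK = 0xFFFFFFFF
U64_MASK = 0xFFFFFFFFFFFFFFFF

# Assumed illustrative limit: 16 GiB expressed in 512-byte sectors.
MAX_METADATA_SECTORS = 16 * 1024 * 1024 * 2


def size_sectors_32bit(extents, extent_size):
    """Multiply as if the result were stored in a 32-bit unsigned value."""
    return (extents * extent_size) & U32_MASK


def size_sectors_64bit(extents, extent_size):
    """Multiply as if the result were stored in a 64-bit unsigned value."""
    return (extents * extent_size) & U64_MASK


def exceeds_limit(size_sectors):
    """Return True when the size is above the assumed metadata limit."""
    return size_sectors > MAX_METADATA_SECTORS


# 2**20 extents of 8192 sectors (4 MiB) each: the true product is 2**33.
wrapped = size_sectors_32bit(2**20, 8192)
full = size_sectors_64bit(2**20, 8192)
print(wrapped, exceeds_limit(wrapped))
print(full, exceeds_limit(full))
```

With the 32-bit product the oversized request wraps around and slips past the check; widening the arithmetic before multiplying avoids that.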
Alasdair G Kergon
84801d7c34 thin: rename extend_pool to create_pool 2013-07-23 13:33:14 +01:00
Zdenek Kabelac
f6ded62291 snapshot: fix merging
When the merging of a snapshot is finished, we need to clean dm table
entries for the snapshot and -cow device. So for a merging snapshot
we have to activate_lv the plain 'cow' LV and let the table
resolver do its work - shortly a deactivation_lv() request
will follow - in cluster this needs the LV lock to be held by clvmd.

Also update a test - add a small wait - in case lvremove is not 'fast enough'
and the merging process has not been stopped and $lv1 removed in background.
Otherwise the following lvcreate occasionally finds name $lv1 still in use.

(in release fix)
2013-07-22 16:26:00 +02:00
Zdenek Kabelac
ea68f08501 cleanup: remove unused headers 2013-07-22 12:41:21 +02:00
Petr Rockai
6d2604f026 metadata: Fix tracking of read_status flags in _vg_make_handle. 2013-07-22 12:04:47 +02:00
Petr Rockai
3ed7f78ff4 metadata: Do not ignore errors in _vg_update_vg_ondisk. 2013-07-22 12:00:48 +02:00
Petr Rockai
f897fcbd95 metadata: Do not try to maintain an ondisk copy of orphan VGs. 2013-07-22 11:51:35 +02:00
Zdenek Kabelac
3075784955 thin: add spare lvcreate support
Add the --poolmetadataspare option; create and handle the
pool metadata spare LV when a thin pool is created.
With the default setting 'y' it tries to ensure the spare has
at least the size of the created LV.
2013-07-18 18:22:44 +02:00
Zdenek Kabelac
a916bf7eeb thin: removal of spare disables recovery
Warn user when removing spare LV.
Remove spare automatically, when last pool from VG is removed.
2013-07-18 18:22:44 +02:00
Zdenek Kabelac
915cc5a2fa thin: report 'e' volume type pool metadata spare
Reuse m'e'tadata volume type for spar'e' volume as well.
Essentially they are related and there is no big reason
to introduce new flag.
2013-07-18 18:22:44 +02:00
Zdenek Kabelac
460d0254eb thin: add pool metadata spare lv support
Add support for pool's metadata spare volume.
2013-07-18 18:22:43 +02:00
Zdenek Kabelac
08df7ba844 thin: improve pool creation activation order
Pool creation involves clearing of metadata device
which triggers udev watch rule we cannot udev synchronize with
in current code.

This metadata device needs to be activated locally,
so in cluster mode deactivation and reactivation
is always needed.

However for non-clustered mode we may reload table
via suspend/resume path which avoids collision with
udev watch rule which was occasionally triggering
retry deactivation loop.

Code has been also split into 2 separate code paths
for thin pools and thin volumes which improved readability
of the code as well.

Deactivation has been moved out of extend_pool() and
decision is now in _lv_create_an_lv() which knows
the change mode.
2013-07-18 18:22:43 +02:00
Zdenek Kabelac
d72f34af41 thin: fix error paths for metadata creation
Since we vg_write&commit metadata LV inside  lv_extend() call,
proper restore is needed in case something fails.

So add bad: section which deactivates activated LV
and removes it from VG.

Also check early for metadata LV name length failure.
2013-07-18 18:22:43 +02:00
Zdenek Kabelac
7afa9cebcb thin: fix error path in creation path
Remove some calls to revert_new_lv when no LV has been created/committed so far.
When the pool update failed - then only revert is needed.
2013-07-18 18:22:43 +02:00
Zdenek Kabelac
1d3f7953bd lv_manip: move some validation code before archiving
Run as many tests as we can before actually modifying metadata.
Also avoids unnecessary archiving.
2013-07-18 18:22:43 +02:00
Zdenek Kabelac
7b4b97b731 snapshot: local activation for clear COW device
To clear snapshot cow device in cluster enforce local
activation here.
2013-07-18 18:22:43 +02:00
Zdenek Kabelac
b5dfe4bec2 metadata: add is_change_activating
Add simple inline function to check, whether the change is activating.
(better than a macro since we get type checking).
2013-07-18 18:22:42 +02:00
Zdenek Kabelac
e8fc77bd6d cleanup: move detection of change mode before commit point
Following patch will need to know change state before commit point.
Also the test mode should properly report all ongoing operation.
2013-07-18 18:22:42 +02:00
Zdenek Kabelac
20187fc190 cleanup: use dm_list_empty
Check for empty list directly.
2013-07-18 18:22:42 +02:00
Zdenek Kabelac
8bd67f9e5a cleanup: exit earlier in lv_rename_update
List doesn't need to be created when the metadata are not updated.
2013-07-18 18:16:29 +02:00
Peter Rajnoha
73e7f6c45f profile: move profile assignment to common lv_create_empty fn 2013-07-17 11:31:54 +02:00
Zdenek Kabelac
925701d9f3 thin: write back when command successfully finished
Remove the backup() call from update_pool_lv() as it was duplicated there,
and properly order the backup() call after lvresize,
so there is just one such call.
2013-07-15 15:48:32 +02:00
Zdenek Kabelac
42881c8877 thin: send messages to active pool
If the thin pool is known to be active, messages can be passed
to the pool even when the created thin volume is not going to be
activated.

So we do not need to stack a large list of messages, and we validate
and catch creation errors earlier in this case.

Replace the test for valid activation combination with simpler list of
deactivation combinations.
2013-07-15 15:47:25 +02:00
Zdenek Kabelac
9ba7783350 cleanup: update comments
Add and indent.
2013-07-15 15:40:46 +02:00
Peter Rajnoha
8d1e511363 conf: add activation/auto_set_activation_skip
The activation/auto_set_activation_skip enables/disables automatic
adding of the ACTIVATION_SKIP LV flag. By default thin snapshots
are flagged to be skipped during activation.

And by default, the auto_set_activation_skip is enabled.
2013-07-12 20:54:17 +02:00
Peter Rajnoha
283c93a858 lvs: add 's(k)ip activation' lv_attr field
A new 10-th bit in lvs lv_attr field to indicate whether an LV
has the ACTIVATION_SKIP flag attached (stored in metadata).
2013-07-12 20:54:17 +02:00
Peter Rajnoha
7dc8c84b18 activation: add support for skipping activation of selected LVs
Also add -k/--setactivationskip y/n and -K/--ignoreactivationskip
options to lvcreate.

The --setactivationskip y sets the flag in metadata for an LV to
skip the LV during activation. Also, the newly created LV is not
activated.

Thin snapshots have this flag set automatically if not specified
directly by the --setactivationskip y/n option.

The --ignoreactivationskip overrides the activation skip flag set
in metadata for an LV (just for the run of the command - the flag
is not changed in metadata!)

A few examples for the lvcreate with the new options:

  (non-thin snap LV => skip flag not set in MDA + LV activated)
  raw/~ $ lvcreate -l1 vg
    Logical volume "lvol0" created
  raw/~ $ lvs -o lv_name,attr vg/lvol0
    LV    Attr
    lvol0 -wi-a----

  (non-thin snap LV + -ky => skip flag set in MDA + LV not activated)
  raw/~ $ lvcreate -l1 -ky vg
    Logical volume "lvol1" created
  raw/~ $ lvs -o lv_name,attr vg/lvol1
    LV    Attr
    lvol1 -wi------

  (non-thin snap LV + -ky + -K => skip flag set in MDA + LV activated)
  raw/~ $ lvcreate -l1 -ky -K vg
    Logical volume "lvol2" created
  raw/~ $ lvs -o lv_name,attr vg/lvol2
    LV    Attr
    lvol2 -wi-a----

  (thin snap LV => skip flag set in MDA (default behaviour) + LV not activated)
  raw/~ $ lvcreate -L100M -T vg/pool -V 1T -n thin_lv
    Logical volume "thin_lv" created
  raw/~ $ lvcreate -s vg/thin_lv -n thin_snap
    Logical volume "thin_snap" created
  raw/~ $ lvs -o name,attr vg
    LV        Attr
    pool      twi-a-tz-
    thin_lv   Vwi-a-tz-
    thin_snap Vwi---tz-

  (thin snap LV + -K => skip flag set in MDA (default behaviour) + LV activated)
  raw/~ $ lvcreate -s vg/thin_lv -n thin_snap -K
    Logical volume "thin_snap" created
  raw/~ $ lvs -o name,attr vg/thin_lv
    LV      Attr
    thin_lv Vwi-a-tz-

  (thin snap LV + -kn => no skip flag in MDA (default behaviour overridden) + LV activated)
  [0] raw/~ # lvcreate -s vg/thin_lv -n thin_snap -kn
    Logical volume "thin_snap" created
  [0] raw/~ # lvs -o name,attr vg/thin_snap
    LV        Attr
    thin_snap Vwi-a-tz-
2013-07-12 20:39:07 +02:00
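The interaction of -k/--setactivationskip, -K/--ignoreactivationskip and the thin-snapshot default described in commit 7dc8c84b18 can be summarised in a short Python sketch. It models only the decision shown in the commit's examples; the function name and parameters are hypothetical and this is not the lvm2 implementation. The auto_set_activation_skip parameter stands in for the activation/auto_set_activation_skip setting added by commit 8d1e511363.

```python
# Sketch only: models the outcomes listed in the commit message examples.
# Function and parameter names are hypothetical, not lvm2 identifiers.

def lvcreate_activation(is_thin_snapshot, set_skip=None, ignore_skip=False,
                        auto_set_activation_skip=True):
    """Return (skip_flag_in_metadata, lv_activated) for a new LV.

    set_skip    -- None when -k was not given, otherwise "y" or "n"
    ignore_skip -- True when -K was given for this command run
    """
    if set_skip is None:
        # No explicit choice: thin snapshots get the flag by default,
        # provided the automatic behaviour is enabled.
        skip_flag = is_thin_snapshot and auto_set_activation_skip
    else:
        skip_flag = (set_skip == "y")

    # -K overrides the flag for this run only; the metadata keeps the flag.
    activated = (not skip_flag) or ignore_skip
    return skip_flag, activated


# The six cases from the commit message, in order.
print(lvcreate_activation(False))                    # plain LV
print(lvcreate_activation(False, set_skip="y"))      # -ky
print(lvcreate_activation(False, "y", True))         # -ky -K
print(lvcreate_activation(True))                     # thin snapshot
print(lvcreate_activation(True, ignore_skip=True))   # thin snapshot, -K
print(lvcreate_activation(True, set_skip="n"))       # thin snapshot, -kn
```

Keeping the stored flag separate from the per-command override mirrors the commit's note that -K does not change the flag in metadata.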
Alasdair G Kergon
aaa9a68b2f metadata: cleanup comments and formatting 2013-07-09 12:34:48 +01:00
Alasdair G Kergon
f56a1819e9 tools: remove metadata-exported.h
metadata-exported.h is included by tools.h
2013-07-09 03:07:55 +01:00
Alasdair G Kergon
9d5bdc91ca tools: remove metadata.h 2013-07-09 02:51:24 +01:00
Alasdair G Kergon
8adddbf101 pvcreate: remove metadata.h header
Files in tools/ should only use metadata-exported.h not metadata.h.
Rename pvcreate_locked to pvcreate_single.
2013-07-09 02:37:56 +01:00
Alasdair G Kergon
7c6526aae2 lvresize: separate validation from action
Start separating the validation from the action in the basic lvresize
code moved to the library.
Remove incorrect use of command line error codes from lvresize library
functions.  Move errors.h to tools directory to reinforce this,
exporting public versions of the error codes in lvm2cmd.h for dmeventd
plugins to use.
2013-07-06 03:28:21 +01:00
Zdenek Kabelac
a64239f225 cleanup: use plain unsigned types 2013-07-05 17:20:57 +02:00
Zdenek Kabelac
7319bc0420 cleanup: move declaration to front 2013-07-05 17:20:57 +02:00
Zdenek Kabelac
f88f5a1ca3 thin: move alloc_pool_metadata
Move function from /tool to /lib to thin_manip.c
Since lvm2api will need to move many things into /lib anyway.
2013-07-04 13:33:41 +02:00
Zdenek Kabelac
6f335ffa35 sigint: improve logic on for sigint reaction
Fix and improve handling on sigint.

Always check for signal presence *before* calling of command,
so it will not call the command when break was hit.

If the command has finished successfully there is
no problem to mark the command ok and not report interrupt at all.

Fix a couple of related stack; reports and assignments.
2013-07-03 14:46:42 +02:00
Mike Snitzer
f9e0adcce5 snapshot: Rename snapshot segment returning methods from find_*_cow to find_*_snapshot
find_cow -> find_snapshot, find_merging_cow -> find_merging_snapshot.
Will allow thin snapshot code to reuse these methods without confusion.
2013-07-02 16:26:03 -04:00
Tony Asleson
50db109e20 liblvm: Moved additional pv resize code
The pv resize code required that a lvm_vg_write be done
to commit the change.  When the method to list all PVs was added,
including ones that are not associated with a VG, we had no way
for the user to make the change persistent.
Thus additional resize code was moved and now liblvm calls into
a resize function that does indeed write the changes out, thus
not requiring the user to explicitly write out the changes.

Signed-off-by: Tony Asleson <tasleson@redhat.com>
2013-07-02 14:24:34 -05:00
Tony Asleson
6d6ccded35 lib2app: Added PV create. V2
V2: Correct call to lock_vol

Signed-off-by: Tony Asleson <tasleson@redhat.com>
2013-07-02 14:24:34 -05:00
Tony Asleson
8ddb1b4abf _get_pvs: Remove unused variable
Signed-off-by: Tony Asleson <tasleson@redhat.com>
2013-07-02 14:24:34 -05:00
Tony Asleson
e33ac7b1ed lvm2app: Implement lvm_pv_remove V2
Code move and changes to support calling code from
command line and from library interface.

V2 Change lock_vol call

Signed-off-by: Tony Asleson <tasleson@redhat.com>
2013-07-02 14:24:34 -05:00
Tony Asleson
ef3ab801e8 lvm2app: Add function to retrieve list of PVs V3
As locks are held, you need to call the included function
to release the memory and locks when done traversing the
list of physical volumes.

V2: Rebase fix
V3: Prevent VGs from getting cached and then write protected.

Signed-off-by: Tony Asleson <tasleson@redhat.com>
2013-07-02 14:24:33 -05:00
Tony Asleson
49d7596581 lvm2app: Implement lv resize (v3)
Simplified version of lv resize.

v3: Rebase changes to make work.  Needed to set sizeargs = 1
to indicate to resize that we are asking for a size based
resize.

Signed-off-by: Tony Asleson <tasleson@redhat.com>
2013-07-02 14:24:33 -05:00
Tony Asleson
c43ce46ba7 lvm2app: Move core lv re-size code (v6)
Moved to allow use from command line and for library use.

v3,v4,v5,v6: rebase changes

Signed-off-by: Tony Asleson <tasleson@redhat.com>
2013-07-02 14:24:33 -05:00
Peter Rajnoha
9f6cfc9de4 report: add vg_profile and lv_profile to report the profile attached to VG/LV
vgs -o vg_profile ...
lvs -o lv_profile ...
2013-07-02 15:22:12 +02:00
Peter Rajnoha
24a84549a8 thin: make selected thinp settings profilable
These settings are customizable by profiles:

	allocation/thin_pool_zero
	allocation/thin_pool_discards
	allocation/thin_pool_chunk_size
	activation/thin_pool_autoextend_threshold
	activation/thin_pool_autoextend_percent
2013-07-02 15:22:11 +02:00
Peter Rajnoha
6f0427cc56 lv: add lv_config_profile fn
Returns LV's profile if it exists, VG's profile otherwise
(LV inherits VG's profile).
2013-07-02 15:22:11 +02:00
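The fallback rule stated for lv_config_profile in commit 6f0427cc56 (the LV's own profile if it exists, otherwise the VG's) can be sketched in Python as follows. The classes and field names are hypothetical stand-ins for the real structures and serve only to show the inheritance order.

```python
# Sketch only: hypothetical stand-ins for the VG/LV structures.
from dataclasses import dataclass
from typing import Optional


@dataclass
class VolumeGroup:
    name: str
    profile: Optional[str] = None   # profile name attached to the VG, if any


@dataclass
class LogicalVolume:
    name: str
    vg: VolumeGroup
    profile: Optional[str] = None   # profile name attached to the LV, if any


def lv_config_profile(lv):
    """Return the LV's profile if set, otherwise inherit the VG's profile."""
    return lv.profile if lv.profile is not None else lv.vg.profile


vg = VolumeGroup("vg", profile="vg_default")
print(lv_config_profile(LogicalVolume("pool", vg)))
print(lv_config_profile(LogicalVolume("fast", vg, profile="lv_custom")))
```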
Peter Rajnoha
e21e38cf74 metadata: add support for storing profile name in metadata (during vgcreate/lvcreate)
If "vgcreate/lvcreate --profile <profile_name>" is used, the profile
name is automatically stored in metadata for making it possible to
load it automatically next time the VG/LV is used.
2013-07-02 15:19:09 +02:00
Peter Rajnoha
d6a91da4be config: add profile arg to find_config_tree_bool 2013-07-02 15:19:09 +02:00
Peter Rajnoha
50bf2c0db1 config: add profile arg to find_config_tree_int 2013-07-02 15:19:09 +02:00
Peter Rajnoha
eeb7b0f7fa config: add profile arg to find_config_tree_node 2013-07-02 15:19:09 +02:00
Peter Rajnoha
c5e6bc393e metadata: read VG/LV profile name from metadata if it exists and load it
This is per VG/LV profile loading on demand. The profile itself is saved
in struct volume_group/logical_volume as "profile" field so we can
reference it whenever needed.
2013-07-02 15:19:09 +02:00
Zdenek Kabelac
e30028004b archiver: do not archive vg more than once
Do not keep multiple archives for the executed command.
Reuse the ALLOCATABLE_PV from pv status for
ARCHIVED_VG vg status. Mark VG with the bit with the
first archivation.
2013-07-01 23:09:26 +02:00
Zdenek Kabelac
afea2bf598 cleanup: move bit flags in order
Preserve the sequence of bits.
2013-07-01 23:06:41 +02:00
Zdenek Kabelac
da9263905c cleanup: use "" around type in error message
A slightly cleaner error message for a wrong discards parameter.
2013-06-25 13:47:39 +02:00
Zdenek Kabelac
2a059f2358 cleanup: use log_print instead of log_error
Since the message has informational character and doesn't lead
to exit of the command - reduce the log level to info print as we
use for other similar types.

Reindent next print message.
2013-06-25 13:47:39 +02:00
Zdenek Kabelac
f990b7298d cleanup: return lv_is_ as 1 or 0
Do not return 64bit values - return just plain int 0 or 1
2013-06-18 22:13:42 +02:00
Zdenek Kabelac
d4308a558d snapshot: fix max size limit check for COW device
Use proper max size as a multiple of extent size.
And use 64bit arithmetic for validation of minsize.
(in release fix).
2013-06-17 09:37:50 +02:00
Zdenek Kabelac
2f334b16d2 cleanup: use struct assign
Simpler code with struct assign.
Drop unneeded zeroing of zallocated memory.
2013-06-17 09:37:06 +02:00
Zdenek Kabelac
2636cae139 clean: remove unneeded assign
Since init_unknown_segtype returns zalloced memory,
NULL assign is not needed.
2013-06-17 09:34:56 +02:00
Zdenek Kabelac
5d73c0c674 cleanup: access pool segs with check
Add some minor checks after first_seg(lv).
Use direct check for thin pool segment.
Drop some unneeded {}
2013-06-16 00:07:33 +02:00
Zdenek Kabelac
8fb5f63637 mirror: add missing error message
When a user has not proceeded with conversion,
print the error message why the command has failed.
2013-06-16 00:07:32 +02:00
Alasdair G Kergon
c2dc21d89f text: miscellaneous comments & message tweaks 2013-06-15 01:28:54 +01:00
Peter Rajnoha
c6f48b7c1a refactor: make device type recognition code common for general use
Changes:

- move device type registration out of "type filter" (filter.c)
to a separate and new dev-type.[ch] for common use throughout the code

- the structure for keeping the major numbers detected for available
device types and available partitioning is stored in the
"dev_types" structure now

- move common partitioning detection code to dev-type.[ch] as well
together with other device-related functions bound to dev_types
(see dev-type.h for the interface)

The dev-type interface contains all common functions used to detect
subsystems/device types, signature/superblock recognition code,
type-specific device properties and other common device properties
(bound to dev_types), including partitioning support.

- add dev_types instance to cmd context as cmd->dev_types for common use

- use cmd->dev_types throughout as a central point for providing
information about device types
2013-06-12 12:08:56 +02:00
Peter Rajnoha
657abb08e0 cleanup: use libdm's dm_sysfs_dir() for sysfs directory throughout
And remove superfluous cmd->sysfs_dir and
set_sysfs_dir_path/sysfs_dir_path fn from lvm-globals.[ch].
2013-06-12 11:44:58 +02:00
Zdenek Kabelac
72c3ae253e thin: add helper functions
Add find_pool_lv() and pool_can_resize_metadata().
2013-06-11 14:03:30 +02:00
Zdenek Kabelac
01ef97fcbb thin: report 'e' metadata type with higher priority
Giving volume type information about being a 'metadata' type of volume
has higher priority than e.g. the 'mirror' or 'thin' flag - for those
types we have 'target attr' (7th field).
2013-06-11 14:03:08 +02:00
Zdenek Kabelac
c0290489c3 thin: report o as volume type for external origin
Reuse 'o' attr for lvs report also for external origin.
2013-06-11 14:02:41 +02:00
Zdenek Kabelac
7151ede767 thin: report t for thin pool and volume
Do not mark internal device _tdata and _tmeta as having target type 't'.
They have the target type on their own (i.e. mirror, raid).
2013-06-11 13:58:16 +02:00
Zdenek Kabelac
1dcba13dfc cleanup: remove {} 2013-06-11 13:55:26 +02:00
Petr Rockai
f5a3bef276 format1: Fix snapshot reload in lv_remove.
The special suspend/resume code in lv_remove for LVM1 snapshots was interspersed
with a vg_commit call. However, while with LVM1 metadata, vg_commit is
technically a no-op, the activation code relied on the ondisk and incore
metadata being the same, since on LVM1, a "commit" happens in vg_write
already. Since the "ondisk" metadata was previously not available with format1
(and incore was silently used instead, via lvmcache), the problem was masked.
2013-06-10 21:01:57 +02:00
Petr Rockai
2cce2f67ab metadata: Fix a pool CRC failure due to "late" ondisk copy creation. 2013-06-10 17:26:38 +02:00
Petr Rockai
f65dd341a5 locking: Make it possible to pass down an LV to activation code.
Previously, we have relied on UUIDs alone, and on lvmcache to make getting a
"new copy" of VG metadata fast. If the code which triggers the activation has
the correct VG metadata at hand (the version which is currently on disk), it can
now hand it to the activation code directly.
2013-06-10 17:26:38 +02:00
Petr Rockai
5d5f2306bd Add and track an "ondisk" pointer to struct volume_group.
This allows us to get the current on-disk version of the metadata whenever we
have the current in-flight version, without a recourse to scanning or lvmcache.
2013-06-10 17:26:29 +02:00
Petr Rockai
c1e851e208 Move export_vg_to_config_tree alongside export_vg_to_buffer. 2013-06-10 15:55:55 +02:00
Zdenek Kabelac
31f3274ed8 mirror: implement check for remotely active LV
If the mirror is active exclusively and locally, then we may proceed.
2013-05-31 21:42:31 +02:00
Jonathan Brassow
562c678ee2 DM RAID: Add ability to throttle sync operations for RAID LVs.
This patch adds the ability to set the minimum and maximum I/O rate for
sync operations in RAID LVs.  The options are available for 'lvcreate' and
'lvchange' and are as follows:
  --minrecoveryrate <Rate> [bBsSkKmMgG]
  --maxrecoveryrate <Rate> [bBsSkKmMgG]
The rate is specified in size/sec/device.  If a suffix is not given,
kiB/sec/device is assumed.  Setting the rate to 0 removes the preference.
2013-05-31 11:25:52 -05:00
Zdenek Kabelac
eb7e206a73 snapshot: add cow_max_extents
Add more precise calculation of the maximum usable snapshot size.
Using only percentage fails for small size of snapshot and extents.
2013-05-30 17:30:15 +02:00
Zdenek Kabelac
59962d8d3e snapshot: require 3 chunks for creation
There is no point in creating a 2-chunk snapshot,
since the snapshot is invalidated immediately with the first write
as there is no free chunk for COW blocks
(2 chunks are used by the snap header and the 1st. metadata chunk).

Enhance error message about the lowest usable size.
2013-05-30 17:28:03 +02:00
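The reasoning in commit 59962d8d3e (two chunks are taken by the snapshot header and the first metadata chunk, so at least a third is needed for COW data) can be written out as a small Python sketch. The function names are hypothetical and sizes are in arbitrary units such as sectors; this is not the lvm2 code.

```python
# Sketch only: hypothetical helpers illustrating the 3-chunk minimum.

# Per the commit message: one chunk for the snapshot header and one for
# the first metadata chunk.
RESERVED_CHUNKS = 2
MIN_SNAPSHOT_CHUNKS = RESERVED_CHUNKS + 1


def usable_cow_chunks(total_chunks):
    """Chunks left for COW blocks after the reserved chunks are subtracted."""
    return max(total_chunks - RESERVED_CHUNKS, 0)


def min_snapshot_size(chunk_size):
    """Smallest snapshot size that leaves at least one chunk for COW data."""
    return MIN_SNAPSHOT_CHUNKS * chunk_size


def snapshot_size_ok(size, chunk_size):
    """Return True when a snapshot of this size can hold any COW data."""
    return size // chunk_size >= MIN_SNAPSHOT_CHUNKS


print(usable_cow_chunks(2), usable_cow_chunks(3))
print(min_snapshot_size(8))
print(snapshot_size_ok(16, 8), snapshot_size_ok(24, 8))
```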
Zdenek Kabelac
2f1a571c97 fid: fix reset of PV fid
Avoid hitting memory corruption (double free) in code path,
where PV FID has been already destroyed and the released pointer
was left in PV structure and could have been tried to be released
from there 2nd. time with final context destruction.
2013-05-30 16:52:39 +02:00
Peter Rajnoha
732859d21f refactor: rename embedding area -> bootloader area 2013-05-28 12:37:22 +02:00
Zdenek Kabelac
9966842810 snapshot: skip monitor for large cows
If snapshot cow device is already big enough to
cover whole origin, do not monitor it.
2013-05-27 10:35:43 +02:00
Zdenek Kabelac
77952151af snapshot: add lv_is_cow_covering_origin
Add function to check whether the size of the cow is already big enough
to cover whole origin.
2013-05-27 10:34:53 +02:00
Zdenek Kabelac
3ba3bc0d66 cleanup: drop backtrace
After log_error/log_warn there is no point to show <backtrace>
in debug log trace from the next code line.
2013-05-27 10:28:32 +02:00
Zdenek Kabelac
8cbacd2474 lv_manip: use lv_is_active
Updated reverted commit.
The usage of lv_is_active() is needed here, so the
(!lv_is_active_exclusive_locally) gives the correct
report.
2013-05-20 16:47:33 +02:00
Jonathan Brassow
06ac797f42 Clean-up: Replace 'lv_is_active' with more correct/specific variants
There are places where 'lv_is_active' was being used where it was
more correct to use 'lv_is_active_locally'.  For example, when checking
for the existence of a kernel instance before asking for its status.
Most of the time these would work correctly.  (RAID is only allowed on
non-clustered VGs at the moment, which means that 'lv_is_active' and
'lv_is_active_locally' would give the same result.)  However, it is
more correct to use the proper variant and it helps with future
scenarios where targets might be allowed exclusively (or clustered) in
a cluster VG.
2013-05-16 10:36:56 -05:00
Peter Rajnoha
4777eb6872 lvconvert: check for snapshot-merge support before merge init 2013-05-16 08:21:57 +02:00
Alasdair G Kergon
f12d88f840 activation: fix lv_is_active regressions
Try to fix commit bf2741376d.

lv_is_active is not the same as lv_info(cmd, org, 0, &info, 0, 0).

Introduce and use lv_is_active_locally.
2013-05-15 02:13:31 +01:00
Mike Snitzer
8ad7865b42 Fix alignment of PV data area if detected alignment less than 1 MB
This fixes a long standing regression since LVM2 2.02.74 (commit 4efb1d9c,
"Update heuristic used for default and detected data alignment.")

The default PE alignment could be used (via MAX()) even if it was
determined that the device's MD stripe width, or minimal_io_size or
optimal_io_size were not factors of the default PE alignment (either 64K
or the newer default of 1MB, etc).  This bug would manifest if the
default PE alignment was larger than the overriding hint that the
device provided (e.g. default of 1MB vs optimal_io_size of 768K).

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2013-05-13 15:56:47 -04:00
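The alignment rule described in commit 8ad7865b42 can be illustrated in Python: a device-provided hint should override the default PE alignment when the hint is not a factor of the default, instead of the larger value always winning via MAX(). This is a sketch of the rule as stated in the commit message; the function names are hypothetical and the handling of several hints at once is deliberately left out.

```python
# Sketch only: illustrates the "hint is not a factor of the default" rule.
# Names are hypothetical; sizes are in bytes.

KIB = 1024
MIB = 1024 * KIB


def alignment_overrides_default(hint, default_align):
    """A non-zero hint overrides the default unless it divides it evenly."""
    return hint != 0 and default_align % hint != 0


def buggy_pe_align(hint, default_align):
    """The behaviour the commit message describes as the regression."""
    return max(hint, default_align)


def fixed_pe_align(hint, default_align):
    """Use the hint when the default is not a multiple of it."""
    if alignment_overrides_default(hint, default_align):
        return hint
    return default_align


# The commit's example: default of 1 MiB vs optimal_io_size of 768 KiB.
print(buggy_pe_align(768 * KIB, MIB) // KIB)
print(fixed_pe_align(768 * KIB, MIB) // KIB)
# A 64 KiB hint divides 1 MiB, so the default already satisfies it.
print(fixed_pe_align(64 * KIB, MIB) // KIB)
```

In the 768 KiB case the MAX() result of 1 MiB is not a multiple of the hint, which is why the data area ended up misaligned for that device.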
Zdenek Kabelac
3d0cb0611e lv: fix typedef
Since older gcc is not accepting duplication of same typedef,
stay with predeclared enum type.
2013-05-03 16:02:43 +02:00
Zdenek Kabelac
2d3700ba42 report: improve reporting of active state
For reporting stacked or joined devices properly in cluster,
we need to report their activation state according the lock,
which activated this device tree.

This is getting a bit complex - current code tries simple approach -

For snapshot - return status for origin.
For thin pool - return status of the first known active thin volume.
For the rest of them - try to use dependency list of LVs and skip
known exceptions.  This should be able to recursively deduce top level
device for given LV.

(in release fix)
2013-05-03 15:43:52 +02:00
Zdenek Kabelac
d2d71330c3 lv: add lv_active_change
Make a separate /lib function for the change of activation state
of the LV.

(in release update)
2013-05-03 15:43:19 +02:00
Zdenek Kabelac
8d004b5127 report: show active state of LV
For non clustered VG - show  "active"/""

For clustered VG it's more complex:

"local exclusive"
"remote exclusive"
"locally"
"remotely"
2013-04-25 17:33:24 +02:00
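The report strings listed in commit 8d004b5127 can be mapped from a few boolean inputs, as in the Python sketch below. The commit message gives only the strings themselves; how they are selected from local/remote/exclusive state is an assumption made here for illustration, and the function and parameter names are hypothetical.

```python
# Sketch only: the selection logic is an assumption; only the strings
# come from the commit message.

def lv_active_report(clustered, active_locally, active_remotely=False,
                     exclusive=False):
    """Return the active-state string for an LV report field."""
    if not clustered:
        return "active" if active_locally else ""
    if active_locally:
        return "local exclusive" if exclusive else "locally"
    if active_remotely:
        return "remote exclusive" if exclusive else "remotely"
    return ""


print(repr(lv_active_report(False, True)))
print(repr(lv_active_report(True, True, exclusive=True)))
print(repr(lv_active_report(True, False, active_remotely=True)))
```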
Zdenek Kabelac
8b18ab76d2 report: show dmeventd monitoring status
Add new lvs segment field 'Monitor' showing 3 states:

"monitored" - LV is monitored by dmeventd.

"not monitored" - LV is currently not being monitored by dmeventd

"" (empty) - LV does not support monitoring, or dmeventd support
             is not compiled in.
2013-04-25 17:33:24 +02:00
Zdenek Kabelac
f84f12a6a3 snapshot: rework cluster creation and removal
Support for exclusive activation of snapshots revealed some problems.

When snapshot is created, COW LV is activated first (for clearing) and
then it's transformed into snapshot's COW LV, but it has left the lock
for such LV active in cluster and this lock could not have been removed
from dlm, unless snapshot has been removed within same dlm session.

If the user tried to remove snapshot after rebooting node, the lock was
missing, and COW LV could not have been detached.

Patch modifies the approach in this way:

Always deactivate COW LV for clustered vg after clearing (so it's
activated again via implicit snapshot activation rule when snapshot is activated).

When snapshot is removed, activate COW LV as independent LV, so the lock
will exist for such LV, but only when the snapshot is active.

Also add test case for testing snapshot removal after cluster reboot.
2013-04-25 17:33:24 +02:00
Zdenek Kabelac
dd4fdce16c cleanup: drop unused assignment
Assigned values are unused.
2013-04-21 23:14:04 +02:00
Zdenek Kabelac
5e7eae59da lv_manip: check remove_seg_from_segs_using_this_lv()
Add missing check for result of remove_seg_from_segs_using_this_lv().
Failure is reported as internal error.
2013-04-21 23:10:43 +02:00
Zdenek Kabelac
24f8daa13d raid: test for target_pvs
If target_pvs is NULL do not call lv_is_on_pvs()
2013-04-21 23:07:00 +02:00
Zdenek Kabelac
4ca6a4105d thin: lvcreate better support for AAY
Test rather for changes which are deactivating.
2013-04-21 23:06:23 +02:00
Jonathan Brassow
2e0740f7ef RAID: Add writemostly/writebehind support for RAID1
'lvchange' is used to alter a RAID 1 logical volume's write-mostly and
write-behind characteristics.  The '--writemostly' parameter takes a
PV as an argument with an optional trailing character to specify whether
to set ('y'), unset ('n'), or toggle ('t') the value.  If no trailing
character is given, it will set the flag.
Synopsis:
        lvchange [--writemostly <PV>:{t|y|n}] [--writebehind <count>] vg/lv
Example:
        lvchange --writemostly /dev/sdb1:y --writebehind 512 vg/raid1_lv
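
The argument format described above can be illustrated with a small
parser.  This is only a sketch, not the code used by lvchange: the names
'parse_writemostly' and 'enum wm_action' are hypothetical, and the
handling of malformed input is an assumption.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical action codes for the optional trailing character. */
enum wm_action { WM_SET, WM_UNSET, WM_TOGGLE };

/*
 * Split "<PV>[:{y|n|t}]" into the PV path and the requested action.
 * Without a trailing ":<char>" the flag is set, as the text describes.
 * Returns 0 on success, -1 on an unknown suffix or an unusable PV name.
 */
static int parse_writemostly(const char *arg, char *pv, size_t pv_len,
			     enum wm_action *action)
{
	size_t len = strlen(arg);

	*action = WM_SET;	/* default when no trailing character is given */

	if (len >= 2 && arg[len - 2] == ':') {
		switch (arg[len - 1]) {
		case 'y': *action = WM_SET; break;
		case 'n': *action = WM_UNSET; break;
		case 't': *action = WM_TOGGLE; break;
		default: return -1;
		}
		len -= 2;	/* drop the ":<char>" suffix from the PV name */
	}

	if (!len || len >= pv_len)
		return -1;

	memcpy(pv, arg, len);
	pv[len] = '\0';
	return 0;
}
```

With this sketch "/dev/sdb1:t" yields the PV "/dev/sdb1" with WM_TOGGLE,
while a plain "/dev/sdb1" yields WM_SET.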

The last character in the 'lv_attr' field is used to show whether a device
has the WriteMostly flag set.  It is signified with a 'w'.  If the device
has failed, the 'p'artial flag has priority.

Example ("nosync" raid1 with mismatch_cnt and writemostly):
[~]# lvs -a --segment vg
  LV                VG   Attr      #Str Type   SSize
  raid1             vg   Rwi---r-m    2 raid1  500.00m
  [raid1_rimage_0]  vg   Iwi---r--    1 linear 500.00m
  [raid1_rimage_1]  vg   Iwi---r-w    1 linear 500.00m
  [raid1_rmeta_0]   vg   ewi---r--    1 linear   4.00m
  [raid1_rmeta_1]   vg   ewi---r--    1 linear   4.00m

Example (raid1 with mismatch_cnt, writemostly - but failed drive):
[~]# lvs -a --segment vg
  LV                VG   Attr      #Str Type   SSize
  raid1             vg   rwi---r-p    2 raid1  500.00m
  [raid1_rimage_0]  vg   Iwi---r--    1 linear 500.00m
  [raid1_rimage_1]  vg   Iwi---r-p    1 linear 500.00m
  [raid1_rmeta_0]   vg   ewi---r--    1 linear   4.00m
  [raid1_rmeta_1]   vg   ewi---r-p    1 linear   4.00m

A new reportable field has been added for writebehind as well.  If
write-behind has not been set or the LV is not RAID1, the field will
be blank.
Example (writebehind is set):
[~]# lvs -a -o name,attr,writebehind vg
  LV            Attr      WBehind
  lv            rwi-a-r--     512
  [lv_rimage_0] iwi-aor-w
  [lv_rimage_1] iwi-aor--
  [lv_rmeta_0]  ewi-aor--
  [lv_rmeta_1]  ewi-aor--

Example (writebehind is not set):
[~]# lvs -a -o name,attr,writebehind vg
  LV            Attr      WBehind
  lv            rwi-a-r--
  [lv_rimage_0] iwi-aor-w
  [lv_rimage_1] iwi-aor--
  [lv_rmeta_0]  ewi-aor--
  [lv_rmeta_1]  ewi-aor--
2013-04-15 13:59:46 -05:00
Zdenek Kabelac
2e39392daf cleanup: remove unused lvl_idx 2013-04-12 11:26:31 +02:00
Jonathan Brassow
ff64e3500f RAID: Add scrubbing support for RAID LVs
New options to 'lvchange' allow users to scrub their RAID LVs.
Synopsis:
	lvchange --syncaction {check|repair} vg/raid_lv

RAID scrubbing is the process of reading all the data and parity blocks in
an array and checking to see whether they are coherent.  'lvchange' can
now initiate the two scrubbing operations: "check" and "repair".  "check"
will go over the array and record the number of discrepancies but not
repair them.  "repair" will correct the discrepancies as it finds them.

'lvchange --syncaction repair vg/raid_lv' is not to be confused with
'lvconvert --repair vg/raid_lv'.  The former initiates a background
synchronization operation on the array, while the latter is designed to
repair/replace failed devices in a mirror or RAID logical volume.

Additional reporting has been added for 'lvs' to support the new
operations.  Two new printable fields (which are not printed by
default) have been added: "syncaction" and "mismatches".  These
can be accessed using the '-o' option to 'lvs', like:
	lvs -o +syncaction,mismatches vg/lv
"syncaction" will print the current synchronization operation that the
RAID volume is performing.  It can be one of the following:
        - idle:   All sync operations complete (doing nothing)
        - resync: Initializing an array or recovering after a machine failure
        - recover: Replacing a device in the array
        - check: Looking for array inconsistencies
        - repair: Looking for and repairing inconsistencies
The "mismatches" field will print the number of discrepancies found during
a check or repair operation.

The 'Cpy%Sync' field already available to 'lvs' will print the progress
of any of the above syncactions, including check and repair.

Finally, the lv_attr field has changed to accommodate the scrubbing operations
as well.  The role of the 'p'artial character in the lv_attr report field
has expanded.  "Partial" is really an indicator for the health of a
logical volume and it makes sense to extend this to include other health
indicators as well, specifically:
        'm'ismatches:  Indicates that there are discrepancies in a RAID
                       LV.  This character is shown after a scrubbing
                       operation has detected that portions of the RAID
                       are not coherent.
        'r'efresh   :  Indicates that a device in a RAID array has suffered
                       a failure and the kernel regards it as failed -
                       even though LVM can read the device label and
                       considers the device to be ok.  The LV should be
                       'r'efreshed to notify the kernel that the device is
                       now available, or the device should be 'r'eplaced
                       if it is suspected of failing.
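
The way a single lv_attr character can carry several health indicators
can be sketched as a priority check.  The flag names and 'health_char'
below are hypothetical.  That 'p' wins over 'w' is stated in the
writemostly commit; the relative order of 'r' and 'm' is an assumption
made for this illustration only.

```c
/* Hypothetical health flags for one LV; not the lvm2 status bits. */
enum {
	HEALTH_PARTIAL     = 1 << 0,	/* a device is missing */
	HEALTH_REFRESH     = 1 << 1,	/* kernel regards a device as failed */
	HEALTH_MISMATCH    = 1 << 2,	/* scrubbing found discrepancies */
	HEALTH_WRITEMOSTLY = 1 << 3	/* WriteMostly flag is set */
};

/*
 * Pick the one character shown in the health position of lv_attr.
 * Only the highest-priority condition is reported; '-' means none.
 * Assumed order: partial, refresh, mismatches, writemostly.
 */
static char health_char(unsigned flags)
{
	if (flags & HEALTH_PARTIAL)
		return 'p';
	if (flags & HEALTH_REFRESH)
		return 'r';
	if (flags & HEALTH_MISMATCH)
		return 'm';
	if (flags & HEALTH_WRITEMOSTLY)
		return 'w';
	return '-';
}
```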
2013-04-11 15:33:59 -05:00
Petr Rockai
382fc878d7 lvmetad: Check for reappeared PVs. 2013-04-03 12:48:28 +02:00
Zdenek Kabelac
d24c01a414 thin: lvcreate external origin snapshot support 2013-04-02 15:17:31 +02:00
Zdenek Kabelac
435e0bb608 cleanup: indent line 2013-04-02 15:17:05 +02:00
Peter Rajnoha
5c93f3997b metadata: use PV's internal UNLABELLED_PV flag more consistently
Set when new PV created, cleared on PV write.
2013-03-25 16:21:59 +01:00
Peter Rajnoha
ea36d0501e cleanup: remove unused 'pv_by_path' fn
The pv_by_path might also be dangerous to use as it does not
take into account any metadata areas other than the ones found on the PV
itself. If metadata was not found on the PV referenced by the path,
it returned no PV though it might have been referenced by metadata
elsewhere (on other PVs...).
2013-03-19 14:57:36 +01:00
Peter Rajnoha
7e5e2dd4ee vgextend: do not allow PV with 0 MDAs to be added while already in a VG
If extending a VG and including a PV with 0 MDAs that was already
a part of a VG, the vgextend allowed that PV to be added and we
ended up *with one PV in two VGs*!

The vgextend code used the 'pv_by_path' fn that returned a PV for
a given path. However, when the PV did not have any metadata areas,
the fn just returned a PV without any reference to existing VG.
Consequently, any checks for the existing VG failed.

[0] raw/~ # pvcreate --metadatacopies 0 /dev/sda
  Physical volume "/dev/sda" successfully created

[0] raw/~ # pvcreate --metadatacopies 1 /dev/sdb
  Physical volume "/dev/sdb" successfully created

[0] raw/~ # vgcreate vg1 /dev/sda /dev/sdb
  Volume group "vg1" successfully created

[0] raw/~ # pvcreate --metadatacopies 1 /dev/sdc
  Physical volume "/dev/sdc" successfully created

[0] raw/~ # vgcreate vg2 /dev/sdc
  Volume group "vg2" successfully created

Before this patch (incorrect):
[0] raw/~ # vgextend vg2 /dev/sda
  Volume group "vg2" successfully extended

With this patch (correct):
[0] raw/~ # vgextend vg2 /dev/sda
  Physical volume '/dev/sda' is already in volume group 'vg1'
  Unable to add physical volume '/dev/sda' to volume group 'vg2'.
2013-03-19 14:57:36 +01:00
Peter Rajnoha
59878d0129 metadata: add 'allow_orphan' arg to find_pv_by_name fn
Before, the find_pv_by_name call always failed if the PV found was orphan.
However, we might use this function even for a PV that is not part of any VG.
This patch adds 'allow_orphan' arg to find_pv_by_name fn that allows that.
2013-03-19 14:57:31 +01:00
Peter Rajnoha
5b6bab2e30 cleanup: remove superfluous wrappers
_find_pv_by_name -> find_pv_by_name
_find_pv_in_vg -> find_pv_in_vg
_find_pv_in_vg_by_uuid -> find_pv_in_vg_by_uuid

The only callers of the underscored variants were their wrappers
without the underscore. No other part of the code referenced the
underscored variants.
2013-03-19 13:58:02 +01:00
Zdenek Kabelac
b36a776a7f thin: move update_pool_params
Now we may recognize preset arguments, move
the code for updating thin pool related values
into /lib portion of the code.
2013-03-13 15:13:54 +01:00
Zdenek Kabelac
f06dd8725a thin: mark passed args
Keep the flag whether given thin pool argument has been given on command
line or it's been 'estimated'

Call of update_pool_params() must not change cmdline given args and
needs to know this info.

Since there is a need to move this update function into /lib, we cannot
use arg_count().

FIXME: we need some generic mechanism here.
2013-03-13 15:13:54 +01:00
Peter Rajnoha
386886f71c config: refer to config nodes using assigned IDs
For example, the old call and reference:

  find_config_tree_str(cmd, "devices/dir", DEFAULT_DEV_DIR)

...now becomes:

  find_config_tree_str(cmd, devices_dir_CFG)

So we're referring to the named configuration ID instead
of passing the configuration path and the default value
is taken from central config definition in config_settings.h
automatically.
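
The idea of looking a setting up by ID, with the path and default kept
in one central table, can be sketched as follows.  This is a simplified
illustration and not the lvm2 implementation: 'struct cfg_def',
'struct cfg_entry' and 'cfg_find_str' are hypothetical names, and the
"/dev" default is an assumed placeholder for DEFAULT_DEV_DIR.

```c
#include <stddef.h>
#include <string.h>

/* One ID per configuration node; only devices_dir_CFG comes from the text. */
enum cfg_id {
	devices_dir_CFG,
	CFG_COUNT
};

/* Central definition of a node: its path and its default value. */
struct cfg_def {
	const char *path;
	const char *default_str;
};

/* Stand-in for the central table kept in config_settings.h. */
static const struct cfg_def cfg_defs[CFG_COUNT] = {
	[devices_dir_CFG] = { "devices/dir", "/dev" },	/* assumed default */
};

/* A setting as found in the user's configuration (hypothetical type). */
struct cfg_entry {
	const char *path;
	const char *value;
};

/*
 * Look up a string setting by ID.  The caller no longer passes the path
 * or the default; both are taken from the central table.
 */
static const char *cfg_find_str(const struct cfg_entry *conf, size_t count,
				enum cfg_id id)
{
	for (size_t i = 0; i < count; i++)
		if (!strcmp(conf[i].path, cfg_defs[id].path))
			return conf[i].value;

	return cfg_defs[id].default_str;
}
```

Keeping the default next to the path means a call site cannot pass a
default that disagrees with another call site for the same node.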
2013-03-06 10:14:33 +01:00
Peter Rajnoha
b778653f03 pv_header_extension: add support for writing PV header extension (flags & Embedding Area)
The PV header extension information (PV header extension version, flags
and list of Embedding Area locations) is stored just beyond the PV header base.

When calculating the Embedding Area start value (ea_start), the same logic is
used as when calculating the pe_start value for Data Area - the value must
follow exactly the same alignment restrictions for its start value
(the alignment detected automatically or provided via command line using
the --dataalignment and --dataalignmentoffset arguments).

The Embedding Area is placed at the very start of the PV, starting at
ea_start. The Data Area starting at pe_start is placed next. The pe_start is
still properly aligned. Due to the pe_start alignment, it's possible that the
resulting Embedding Area size (ea_size) ends up bigger in size than requested
(but never less than requested).
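
The placement rule above can be sketched with plain alignment
arithmetic.  This is an illustration under simplifying assumptions: all
values are in sectors, the alignment offset is ignored, and 'round_up',
'struct pv_layout' and 'place_areas' are hypothetical names rather than
lvm2 functions.

```c
#include <stdint.h>

/* Round 'value' up to the next multiple of 'align' (align must be > 0). */
static uint64_t round_up(uint64_t value, uint64_t align)
{
	return ((value + align - 1) / align) * align;
}

struct pv_layout {
	uint64_t ea_start;	/* start of the Embedding Area */
	uint64_t ea_size;	/* resulting size, may exceed the request */
	uint64_t pe_start;	/* start of the Data Area */
};

/*
 * Place the Embedding Area first and the Data Area right after it.
 * 'min_start' is the first sector that may be used (assumed input).
 * Both starts follow the same alignment, so any gap up to the aligned
 * pe_start is absorbed into the Embedding Area.
 */
static struct pv_layout place_areas(uint64_t min_start,
				    uint64_t ea_size_requested,
				    uint64_t align)
{
	struct pv_layout l;

	l.ea_start = round_up(min_start, align);
	l.pe_start = round_up(l.ea_start + ea_size_requested, align);
	l.ea_size = l.pe_start - l.ea_start;

	return l;
}
```

With an alignment of 2048 sectors, a first usable sector of 2048 and a
requested size of 65536 sectors, this gives ea_start = 2048 and
pe_start = 67584.  A request of 1000 sectors would grow to 2048.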
2013-02-26 11:28:00 +01:00
Peter Rajnoha
9dbe25709e pv_header_extension: add support for reading PV header extension (flags & Embedding Area)
New tools with PV header extension support will read the extension
if it exists and it's not an error if it does not exist (so old PVs
will still work seamlessly with new tools).

Old tools without PV header extension support will just ignore any
extension.

As for the Embedding Area location information (its start and size),
there are actually two places where this is stored:
  - PV header extension
  - VG metadata

The VG metadata contains a copy of what's written in the PV header
extension about the Embedding Area location (NULL value is not copied):

    physical_volumes {
        pv0 {
          id = "AkSSRf-difg-fCCZ-NjAN-qP49-1zzg-S0Fd4T"
          device = "/dev/sda"     # Hint only

          status = ["ALLOCATABLE"]
          flags = []
          dev_size = 262144       # 128 Megabytes
          pe_start = 67584
          pe_count = 23   # 92 Megabytes
          ea_start = 2048
          ea_size = 65536 # 32 Megabytes
        }
    }

The new metadata fields are "ea_start" and "ea_size".
This is mostly useful when restoring the PV by using existing
metadata backups (e.g. pvcreate --restorefile ...).

New tools do not require these two fields to exist in VG metadata,
they're not compulsory. Therefore, reading old VG metadata which doesn't
contain any Embedding Area information will not end up with any kind
of error but only a debug message that the ea_start and ea_size values
were not found.

Old tools just ignore these extra fields in VG metadata.
2013-02-26 11:27:23 +01:00
Peter Rajnoha
60c5d4c42f pv_header_extension: add supporting infrastructure for PV header extension (flags & Embedding Area)
PV header extension comes just beyond the existing PV header base:

PV header base (existing):
 - uuid
 - device size
 - null-terminated list of Data Areas
 - null-terminated list of MetaData Areas

PV header extension:
 - extension version
 - flags
 - null-terminated list of Embedding Areas

This patch also adds "eas" (Embedding Areas) list to lvmcache (lvmcache_info)
and it also adds support for common operations on the list (just like for
already existing "das" - Data Areas list):
 - lvmcache_add_ea
 - lvmcache_update_eas
 - lvmcache_foreach_ea
 - lvmcache_del_eas

Also, add ea_start and ea_size to struct physical_volume for processing
PV Embedding Area location throughout the code (currently only one
Embedding Area is supported, though the definition on disk allows for
more if needed in the future...).

Also, define FMT_EAS format flag to mark that the format actually
supports Embedding Areas (currently format-text only).
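
The extension layout listed above can be pictured with the following
sketch.  The struct and function names are hypothetical and the field
widths are assumptions; the real on-disk definitions live in the
format-text code.  It shows only how a null-terminated list of areas
can be walked.

```c
#include <stddef.h>
#include <stdint.h>

/* One area location; an all-zero entry terminates a list (assumption). */
struct area_locn {
	uint64_t offset;
	uint64_t size;
};

/* Hypothetical in-memory view of the PV header extension. */
struct pv_header_ext {
	uint32_t version;		/* extension version */
	uint32_t flags;			/* extension flags */
	const struct area_locn *eas;	/* null-terminated Embedding Areas */
};

/* Count the entries that precede the terminating all-zero entry. */
static size_t count_areas(const struct area_locn *list)
{
	size_t n = 0;

	while (list[n].offset || list[n].size)
		n++;

	return n;
}
```

Because the list is terminated rather than counted, the on-disk format
can hold more than one Embedding Area later without a layout change.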
2013-02-26 11:25:16 +01:00
Peter Rajnoha
6d8de3638c cleanup: use struct pvcreate_restorable_params throughout 2013-02-26 11:25:11 +01:00
Peter Rajnoha
6692b17777 cleanup: add struct pvcreate_restorable_params and move relevant items from pvcreate_params
Extract restorable PV creation parameters from struct pvcreate_params into
a separate struct pvcreate_restorable_params for clarity and also for better
maintainability when adding any new items later.
2013-02-26 11:24:38 +01:00
Zdenek Kabelac
2cba0ea9f9 thin: removal of external_origin 2013-02-23 10:37:01 +01:00
Zdenek Kabelac
30c13eff37 thin: report external origin
Use the field 'origin' for reporting external origin lv name.

For thin volumes with external origin, report the size of
the external origin via:

  lvs -o+origin_size
2013-02-23 10:37:01 +01:00
Zdenek Kabelac
87331dc419 thin: add support for external origin
Add internal support for thin volume's external origin.
2013-02-23 10:36:58 +01:00
Zdenek Kabelac
d023b2d12f lvremove: easier removal of dependent lvs
Add function to remove LVs which depend on the removed LV
before that LV is removed.

User is asked for confirmation.
2013-02-23 10:31:05 +01:00
Peter Rajnoha
303e86adc8 pvcreate: fix alignment to incorporate alignment offset if PV has 0 MDAs
If zero metadata copies are used, there's no further recalculation of
PV alignment that happens when adding metadata areas to the PV and
which actually calculates the alignment correctly as a matter of fact.
So fix this for "PV without MDA" case as well.

Before this patch:
[1] raw/~ # pvcreate --dataalignment 8m --dataalignmentoffset 4m
--metadatacopies 1 /dev/sda
  Physical volume "/dev/sda" successfully created
[1] raw/~ # pvs -o pv_name,pe_start
  PV         1st PE
  /dev/sda    12.00m
[1] raw/~ # pvcreate --dataalignment 8m --dataalignmentoffset 4m
--metadatacopies 0 /dev/sda
  Physical volume "/dev/sda" successfully created
[1] raw/~ # pvs -o pv_name,pe_start
  PV         1st PE
  /dev/sda     8.00m

After this patch:
[1] raw/~ # pvcreate --dataalignment 8m --dataalignmentoffset 4m
--metadatacopies 1 /dev/sda
  Physical volume "/dev/sda" successfully created
[1] raw/~ # pvs -o pv_name,pe_start
  PV         1st PE
  /dev/sda    12.00m
[1] raw/~ # pvcreate --dataalignment 8m --dataalignmentoffset 4m
--metadatacopies 0 /dev/sda
  Physical volume "/dev/sda" successfully created
[1] raw/~ # pvs -o pv_name,pe_start
  PV         1st PE
  /dev/sda    12.00m

Also, remove a superfluous condition "pv->pe_start < pv->pe_align" in:
  if (pe_start == PV_PE_START_CALC && pv->pe_start < pv->pe_align)
    pv->pe_start = pv->pe_align ...
This part of the condition is not reachable as with the PV_PE_START_CALC,
we always have pv->pe_start set to 0 from the PV struct initialisation
(...the pv->pe_start value is just being calculated).
2013-02-21 14:51:19 +01:00
Jonathan Brassow
dc2ce71313 clean-up: Remove a FIXME question that has been settled
It is ok for us to use the shorthand 'lv_is_virtual' to detect error
targets in a RAID LV when searching for candidates for device replacement.
2013-02-20 15:03:58 -06:00
Jonathan Brassow
bd0ee420b5 RAID: Allow remove/replace of sub-LVs composed of error segments.
When a device fails, we may wish to replace those segments with an
error segment.  (Like when a 'vgreduce --removemissing' removes a
failed device that happens to be a RAID image/meta.)  We are then left
with images that we will eventually want to remove or replace.

This patch allows us to pull out these virtual "error" sub-LVs.  This
allows a user to 'lvconvert -m -1 vg/lv' to extract the bad sub-LVs.
Sub-LVs with error segments are considered for extraction before other
possible devices so that good devices are not accidentally removed.

This patch also adds the ability to replace RAID images that contain error
segments.  The user will still be unable to run 'lvconvert --replace'
because there is no way to address the 'error' segment (i.e. no PV
that it is associated with).  However, 'lvconvert --repair' can be
used to replace the image's error segment with a new PV.  This is also
the most appropriate way to do it, since the LV will continue to be
reported as 'partial'.
2013-02-20 14:58:56 -06:00
Jonathan Brassow
845852d6b4 RAID: Make 'vgreduce --removemissing' work with RAID LVs
Currently it is impossible to remove a failed PV which has a RAID LV
on it.  This patch fixes the issue by replacing the failed PV with an
'error' segment within the affected sub-LVs.  Once there is no longer
a RAID LV using the PV, it can be removed.

Most often, it is better to replace a failed RAID device with a spare.
(You can use 'lvconvert --repair <vg>/<LV>' to accomplish that.)
However, if there are no spares in the volume group and none will be
added, it is useful to be able to remove the failed device.

Following patches address the ability to perform 'lvconvert' operations
on RAID LVs that contain sub-LVs composed of 'error' segments.
2013-02-20 14:52:46 -06:00
Jonathan Brassow
0e4ffd9d3b clean-up: Rename lvm.conf setting 'mirror_region_size' to 'raid_region_size'
We have been using 'mirror_region_size' in lvm.conf as the default region
size for RAID logical volumes as well as mirror logical volumes.  Since
"raid" is more inclusive and representative than "mirror", I have changed
the name of this setting.  We must still check for the old setting and warn
the user if we are overriding it with the new setting if both happen to be
present.
2013-02-20 14:40:17 -06:00
Peter Rajnoha
722ca363f0 report: fix pvs -o pv_free reporting for PVs with 0 PEs
[0] raw/~ # lsblk -o NAME,SIZE /dev/sda
NAME  SIZE
sda   128M

[0] raw/~ # pvcreate --dataalignment 128m /dev/sda
  Physical volume "/dev/sda" successfully created

[0] raw/~ # vgcreate vg /dev/sda
  Volume group "vg" successfully created

[0] raw/~ # lvcreate -l1 vg
  Volume group "vg" has insufficient free space (0 extents): 1 required.

Before this patch:
[0] raw/~ # pvs -o pv_name,pv_free
  PV         PFree
  /dev/sda   128.00m

After this patch:
[0] raw/~ # pvs -o pv_name,pv_free
  PV         PFree
  /dev/sda      0
2013-02-21 13:28:07 +01:00
Zdenek Kabelac
7910b6c0ba thin: update pool_is_active
Change it to take LV and move it to exported header - seems
to be a better fit for usability from tools/ directory.
2013-02-05 16:54:11 +01:00
Zdenek Kabelac
c984d8fbab thin: properly unmark volume after detach
When the volume is detached from the thin pool,
unmask THIN_VOLUME flag and reset related pointers.
2013-02-05 14:40:37 +01:00
Zdenek Kabelac
11eaf1c98c thin: add function pool_is_active
This internal function checks for active pool device.
For cluster it checks every thin volume,
On the non-clustered VG we need to check just
for presence of -tpool device.
2013-02-05 14:35:44 +01:00
Zdenek Kabelac
ddeb37f282 cleanup: add internal error check
Check if 'is_removable' is defined and report internal error,
if it's missing.
2013-02-05 14:27:24 +01:00
Zdenek Kabelac
153ce89af3 cleanup: comment update
Just update code comment and use single line if().
2013-02-04 19:05:43 +01:00
Zdenek Kabelac
ca7abbce8a activate: add lv_layer function
Add function to return layer name for LV.
2013-02-04 19:01:10 +01:00
Jonathan Brassow
801d4f96a8 RAID: Improve 'lvs' attribute reporting of RAID LVs and sub-LVs
There are currently a few issues with the reporting done on RAID LVs and
sub-LVs.  The most concerning is that 'lvs' does not always report the
correct failure status of individual RAID sub-LVs (devices).  This can
occur when a device fails and is restored after the failure has been
detected by the kernel.  In this case, 'lvs' would report all devices are
fine because it can read the labels on each device just fine.
Example:
[root@bp-01 lvm2]# lvs -a -o name,vg_name,attr,copy_percent,devices vg
  LV            VG   Attr      Cpy%Sync Devices
  lv            vg   rwi-a-r--   100.00 lv_rimage_0(0),lv_rimage_1(0)
  [lv_rimage_0] vg   iwi-aor--          /dev/sda1(1)
  [lv_rimage_1] vg   iwi-aor--          /dev/sdb1(1)
  [lv_rmeta_0]  vg   ewi-aor--          /dev/sda1(0)
  [lv_rmeta_1]  vg   ewi-aor--          /dev/sdb1(0)

However, 'dmsetup status' on the device tells us a different story:
  [root@bp-01 lvm2]# dmsetup status vg-lv
  0 1024000 raid raid1 2 DA 1024000/1024000

In this case, we must also be sure to check the RAID LVs kernel status
in order to get the proper information.  Here is an example of the correct
output that is displayed after this patch is applied:
[root@bp-01 lvm2]# lvs -a -o name,vg_name,attr,copy_percent,devices vg
  LV            VG   Attr      Cpy%Sync Devices
  lv            vg   rwi-a-r-p   100.00 lv_rimage_0(0),lv_rimage_1(0)
  [lv_rimage_0] vg   iwi-aor-p          /dev/sda1(1)
  [lv_rimage_1] vg   iwi-aor--          /dev/sdb1(1)
  [lv_rmeta_0]  vg   ewi-aor-p          /dev/sda1(0)
  [lv_rmeta_1]  vg   ewi-aor--          /dev/sdb1(0)
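
How the kernel status line can be turned into per-image state is
sketched below.  It assumes the dm-raid health characters 'A' (alive and
in-sync), 'a' (alive but not in-sync) and 'D' (dead/failed), one
character per device.  'struct raid_status' and the functions are
hypothetical and do not reflect how lvm2 actually queries the kernel.

```c
#include <stdio.h>
#include <string.h>

#define HEALTH_MAX 64

/* Hypothetical parsed view of a 'dmsetup status' line for a raid target. */
struct raid_status {
	unsigned dev_count;		/* number of devices in the array */
	char health[HEALTH_MAX];	/* one health character per device */
};

/*
 * Parse "<start> <length> raid <raid_type> <#devs> <health_chars> ...".
 * Returns 0 on success, -1 if the line does not match or the number of
 * health characters differs from the device count.
 */
static int parse_raid_status(const char *line, struct raid_status *st)
{
	if (sscanf(line, "%*u %*u raid %*s %u %63s",
		   &st->dev_count, st->health) != 2)
		return -1;

	return strlen(st->health) == st->dev_count ? 0 : -1;
}

/* The kernel regards device 'idx' as failed. */
static int raid_image_failed(const struct raid_status *st, unsigned idx)
{
	return idx < st->dev_count && st->health[idx] == 'D';
}

/* Device 'idx' is alive and fully synchronized. */
static int raid_image_in_sync(const struct raid_status *st, unsigned idx)
{
	return idx < st->dev_count && st->health[idx] == 'A';
}
```

For the status line shown above ("... raid raid1 2 DA ..."), the first
device is reported as failed and the second as in-sync, which matches
the 'p' shown on lv_rimage_0 and lv_rmeta_0.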

The other case where 'lvs' gives incomplete or improper output is when a
device is replaced or added to a RAID LV.  It should display that the RAID
LV is in the process of sync'ing and that the new device is the only one
that is not-in-sync - as indicated by a leading 'I' in the Attr column.
(Remember that 'i' indicates an (i)mage that is in-sync and 'I' indicates
an (I)mage that is not in sync.)  Here's an example of the old incorrect
behaviour:
[root@bp-01 lvm2]# lvs -a -o name,vg_name,attr,copy_percent,devices vg
  LV            VG   Attr      Cpy%Sync Devices
  lv            vg   rwi-a-r--   100.00 lv_rimage_0(0),lv_rimage_1(0)
  [lv_rimage_0] vg   iwi-aor--          /dev/sda1(1)
  [lv_rimage_1] vg   iwi-aor--          /dev/sdb1(1)
  [lv_rmeta_0]  vg   ewi-aor--          /dev/sda1(0)
  [lv_rmeta_1]  vg   ewi-aor--          /dev/sdb1(0)
[root@bp-01 lvm2]# lvconvert -m +1 vg/lv; lvs -a -o name,vg_name,attr,copy_percent,devices vg
  LV            VG   Attr      Cpy%Sync Devices
  lv            vg   rwi-a-r--     0.00 lv_rimage_0(0),lv_rimage_1(0),lv_rimage_2(0)
  [lv_rimage_0] vg   Iwi-aor--          /dev/sda1(1)
  [lv_rimage_1] vg   Iwi-aor--          /dev/sdb1(1)
  [lv_rimage_2] vg   Iwi-aor--          /dev/sdc1(1)
  [lv_rmeta_0]  vg   ewi-aor--          /dev/sda1(0)
  [lv_rmeta_1]  vg   ewi-aor--          /dev/sdb1(0)
  [lv_rmeta_2]  vg   ewi-aor--          /dev/sdc1(0)
** Note that all the images currently are marked as 'I' even though it is
   only the last device that has been added that should be marked.

Here is an example of the correct output after this patch is applied:
[root@bp-01 lvm2]# lvs -a -o name,vg_name,attr,copy_percent,devices vg
  LV            VG   Attr      Cpy%Sync Devices
  lv            vg   rwi-a-r--   100.00 lv_rimage_0(0),lv_rimage_1(0)
  [lv_rimage_0] vg   iwi-aor--          /dev/sda1(1)
  [lv_rimage_1] vg   iwi-aor--          /dev/sdb1(1)
  [lv_rmeta_0]  vg   ewi-aor--          /dev/sda1(0)
  [lv_rmeta_1]  vg   ewi-aor--          /dev/sdb1(0)
[root@bp-01 lvm2]# lvconvert -m +1 vg/lv; lvs -a -o name,vg_name,attr,copy_percent,devices vg
  LV            VG   Attr      Cpy%Sync Devices
  lv            vg   rwi-a-r--     0.00 lv_rimage_0(0),lv_rimage_1(0),lv_rimage_2(0)
  [lv_rimage_0] vg   iwi-aor--          /dev/sda1(1)
  [lv_rimage_1] vg   iwi-aor--          /dev/sdb1(1)
  [lv_rimage_2] vg   Iwi-aor--          /dev/sdc1(1)
  [lv_rmeta_0]  vg   ewi-aor--          /dev/sda1(0)
  [lv_rmeta_1]  vg   ewi-aor--          /dev/sdb1(0)
  [lv_rmeta_2]  vg   ewi-aor--          /dev/sdc1(0)
** Note only the last image is marked with an 'I'.  This is correct and we can
   tell that it isn't the whole array that is sync'ing, but just the new
   device.

It also works under snapshots...
[root@bp-01 lvm2]# lvs -a -o name,vg_name,attr,copy_percent,devices vg
  LV            VG   Attr      Cpy%Sync Devices
  lv            vg   owi-a-r-p    33.47 lv_rimage_0(0),lv_rimage_1(0),lv_rimage_2(0)
  [lv_rimage_0] vg   iwi-aor--          /dev/sda1(1)
  [lv_rimage_1] vg   Iwi-aor-p          /dev/sdb1(1)
  [lv_rimage_2] vg   Iwi-aor--          /dev/sdc1(1)
  [lv_rmeta_0]  vg   ewi-aor--          /dev/sda1(0)
  [lv_rmeta_1]  vg   ewi-aor-p          /dev/sdb1(0)
  [lv_rmeta_2]  vg   ewi-aor--          /dev/sdc1(0)
  snap          vg   swi-a-s--          /dev/sda1(51201)
2013-02-01 11:33:54 -06:00
Alasdair G Kergon
06abb2dd4c logging: classify log_debug messages
Place most log_debug() messages into a class.
2013-01-07 22:30:29 +00:00
Alasdair G Kergon
b617109fff lvmetad: fix format1 updates
fmt1 doesn't have a separate commit function: updates take effect
immediately vg_write is called, so we must update lvmetad at this
point if we're going to go on and ask lvmetad for the VG metadata
again before calling the commit function (though that's probably an
unsupported and pointless thing to do anyway as the client must
already have that data and it cannot have changed because it's locked
and with devs suspended we shouldn't be communicating with lvmetad;
so when that's fixed properly, this fix here can be reverted).

This problem showed up as an internal error when lvremoving an LVM1
snapshot.

> Internal error: LV snap1 (00000000000000000000000000000001) missing from preload metadata

https://bugzilla.redhat.com/891855
2013-01-05 03:17:35 +00:00