1
0
mirror of git://sourceware.org/git/lvm2.git synced 2025-01-06 17:18:29 +03:00
Commit Graph

1868 Commits

Author SHA1 Message Date
Petr Rockai
bead8ef5f0 metadata: Nuke the exported "pv_read" function. 2013-11-17 21:41:27 +01:00
Petr Rockai
ba6d6f0028 metadata: Fix handling of orphan PV linking & re-linking. 2013-11-17 21:41:27 +01:00
Petr Rockai
651d5093ed metadata: Avoid pv_read in find_pv_by_name. 2013-11-17 21:41:27 +01:00
Petr Rockai
2f5c12e3a8 pvremove: Avoid using pv_read in favour of scanning. 2013-11-17 21:41:27 +01:00
Petr Rockai
603b45e0ed pvresize: Do not use pv_read (get the PV from orphan VG). 2013-11-17 21:41:27 +01:00
Petr Rockai
e1a63905d1 metadata: Do not try to check vg_name of a NULL PV. 2013-11-17 21:41:26 +01:00
Zdenek Kabelac
89a8c7bea6 cleanup: share the nonremoval message
Use common goto label for not remove log error.
2013-11-15 12:38:37 +01:00
Zdenek Kabelac
9f6209b878 activation: improve activation
This patch fixes mostly cluster behavior but also updates
non-cluster reaction where calls like   'lvchange -aln'
lead to incorrect errors for some segment types.

Fix the implicit activation rules where some segment types could
be activated only in exclusive mode in cluster.
lvm2 command was not preserver 'local' property and incorrectly
converted local activations in to plain exclusive, so the local
activation could have activate volumes exclusively, but remotely.
2013-11-01 13:03:50 +01:00
Zdenek Kabelac
c3e674ad30 activation: _lv_activate is ok when filtered.
If the volume_list filters out volume from activation,
it is still success result for this function.
Change the error message back to verbose level.

Detect if the volume is active localy before zeroing,
so we report error a bit later for cases, where volume
could not be activated because it doesn't pass through volume
list  (but user still could create volume when he disables
zeroing)
2013-11-01 13:02:36 +01:00
Peter Rajnoha
039bdad732 activation: flag temporary LVs internally
Add LV_TEMPORARY flag for LVs with limited existence during command
execution. Such LVs are temporary in way that they need to be activated,
some action done and then removed immediately. Such LVs are just like
any normal LV - the only difference is that they are removed during
LVM command execution. This is also the case for LVs representing
future pool metadata spare LVs which we need to initialize by using
the usual LV before they are declared as pool metadata spare.

We can optimize some other parts like udev to do a better job if
it knows that the LV is temporary and any processing on it is just
useless.

This flag is orthogonal to LV_NOSCAN flag introduced recently
as LV_NOSCAN flag is primarily used to mark an LV for the scanning
to be avoided before the zeroing of the device happens. The LV_TEMPORARY
flag makes a difference between a full-fledged LV visible in the system
and the LV just used as a temporary overlay for some action that needs to
be done on underlying PVs.

For example: lvcreate --thinpool POOL --zero n -L 1G vg

- first, the usual LV is created to do a clean up for pool metadata
  spare. The LV is activated, zeroed, deactivated.

- between "activated" and "zeroed" stage, the LV_NOSCAN flag is used
  to avoid any scanning in udev

- betwen "zeroed" and "deactivated" stage, we need to avoid the WATCH
  udev rule, but since the LV is just a usual LV, we can't make a
  difference. The LV_TEMPORARY internal LV flag helps here. If we
  create the LV with this flag, the DM_UDEV_DISABLE_DISK_RULES
  and DM_UDEV_DISABLE_OTHER_RULES flag are set (just like as it is
  with "invisible" and non-top-level LVs) - udev is directed to
  skip WATCH rule use.

- if the LV_TEMPORARY flag was not used, there would normally be
  a WATCH event generated once the LV is closed after "zeroed"
  stage. This will make problems with immediated deactivation that
  follows.
2013-10-23 14:09:37 +02:00
Peter Rajnoha
ee878bc52c coverity: assigned variable not used and reassigned later 2013-10-16 15:06:43 +02:00
Jonathan Brassow
f58b26b633 RAID: Report RAID images split with tracking as out-of-sync ("I").
Split image should have an out-of-sync attr ('I') - always.  Even if
the RAID LV has not been written to since the LV was split off, it is
still not part of the group that makes up the RAID and is therefore
"out-of-sync".
2013-10-14 10:48:44 -05:00
Zdenek Kabelac
1146691afc snapshot: deactivate virtual snapshot first
Since the virtual snapshot has no reason to stay alive once we
detach related snapshot - deactivate whole thing in front of
snapshot removal - otherwice the code would get tricky for
support in cluster.

The correct full solution would require to have transactions
for libdm operations.

Also enable to the check for snapshot being opened prior
the origin deactivation, otherwise we could easily end
with the origin being deactivate, but snapshot still kept
active, desynchronizing locking state in cluster.
2013-10-14 00:25:15 +02:00
Zdenek Kabelac
81504ba70c snapshot: move virtsnap code from tool to lib
Move code for removal dependency from tool's remove.c
into lib's manipulation code.

Same code then works with lvm2app.
2013-10-12 00:14:52 +02:00
Petr Rockai
0decd7553a metadata: Fix metadata repair paths when lvmetad is used. 2013-10-09 14:44:01 +02:00
Peter Rajnoha
ce7489ed22 activation: add support for flagging an LV to skip udev scanning during activation
A common scenario is during new LV creation when we need to wipe the
newly created LV and avoid any udev scanning before this stage otherwise
it could cause the device (the LV) to be claimed by some other subsystem
for which there were stale metadata within LV data.

This patch adds possibility to mark the LV we're just about to wipe with
a flag that gets passed to udev via DM_COOKIE as a subsystem specific
flag - DM_SUBSYSTEM_UDEV_FLAG0 (in this case the subsystem is "LVM")
so LVM udev rules will take care of handling that.
2013-10-08 13:43:14 +02:00
Alasdair G Kergon
04d9a52684 release 2.02.103
52 files changed, 598 insertions(+), 264 deletions(-)
2013-10-04 14:32:23 +01:00
Peter Rajnoha
8cf0810d57 thin: rename thin_pool_chunk_size_calculation -> ..size_policy and rename "default" policy to "generic"
Just to be consistent with existing naming we use.
2013-10-04 12:30:33 +02:00
Alasdair G Kergon
baf95bbff7 cmdline: Add --ignoreskippedcluster.
Accept --ignoreskippedcluster with pvs, vgs, lvs, pvdisplay, vgdisplay,
lvdisplay, vgchange and lvchange to avoid the 'Skipping clustered
VG' errors when requesting information about a clustered VG
without using clustered locking and still exit with success.

The messages can still be seen with -v.
2013-10-01 21:20:10 +01:00
Peter Rajnoha
24ffd5244f thin: better dbg msgs and avoid uninit. value on chunk size recalc 2013-09-30 08:58:57 +02:00
Jonathan Brassow
acdc731e83 RAID: Fix _sufficient_pes_free calculation for RAID
lib/metadata/lv_manip.c:_sufficient_pes_free() was calculating the
required space for RAID allocations incorrectly due to double
accounting.  This resulted in failure to allocate when available
space was tight.

When RAID data and metadata areas are allocated together, the total
amount is stored in ah->new_extents and ah->alloc_and_split_meta is
set.  '_sufficient_pes_free' was adding the necessary metadata extents
to ah->new_extents without ever checking ah->alloc_and_split_meta.
This often led to double accounting of the metadata extents.  This
patch checks 'ah->alloc_and_split_meta' to perform proper calculations
for RAID.

This error is only present in the function that checks for the needed
space, not in the functions that do the actual allocation.
2013-09-26 11:30:07 -05:00
Peter Rajnoha
78cba8eb3f thin: calculate thin pool chunk size based on device IO hints
If "default" thin pool chunk size calculation method is selected,
use minimum_io_size, otherwise optimal_io_size for "performance"
device hint exposed in sysfs. If there appear to be PVs with
different hints presented, use their least common multiple.

If the hint is less than the default value defined for the
calculation method, use the default value instead.
2013-09-25 16:06:38 +02:00
Peter Rajnoha
cc9e65c391 thin: use appropriate default value based on allocation/thin_pool_chunk_size_calculation setting
If thin_pool_chunk_size_calculation is set to "default", use 64KiB,
otheriwse 512KiB for "performance".
2013-09-25 16:06:38 +02:00
Jonathan Brassow
c37c59e155 Test/clean-up: Indent clean-up and additional RAID resize test
Better indenting and a test for bug 1005434 (parity RAID should
extend in a contiguous fashion).
2013-09-24 21:32:53 -05:00
Jonathan Brassow
5ded7314ae RAID: Fix broken allocation policies for parity RAID types
A previous commit (b6bfddcd0a) which
was designed to prevent segfaults during lvextend when trying to
extend striped logical volumes forgot to include calculations for
RAID4/5/6 parity devices.  This was causing the 'contiguous' and
'cling_by_tags' allocation policies to fail for RAID 4/5/6.

The solution is to remember that while we can compare
	ah->area_count == prev_lvseg->area_count
for non-RAID, we should compare
	(ah->area_count + ah->parity_count) == prev_lvseg->area_count
for a general solution.
2013-09-24 21:32:10 -05:00
Zdenek Kabelac
b29adbbc4d raid: add lv_is_raid()
More readable then status bit flag masking...
2013-09-23 11:35:15 +02:00
Zdenek Kabelac
30432bd604 cleanup: skip call of detect...
SInce we know the pool was locked and we want to reloc pool again,
just use '1' directly.
2013-09-23 11:35:15 +02:00
Zdenek Kabelac
b8ea27ac97 cleanup: hide gcc warning
Older gcc is giving misleading warning:

metadata/lv_manip.c:4018: warning: ‘seg’ may be used uninitialized in
this function

But warning free compilation is better.
2013-09-11 23:40:45 +02:00
Jonathan Brassow
2691f1d764 RAID: Make RAID single-machine-exclusive capable in a cluster
Creation, deletion, [de]activation, repair, conversion, scrubbing
and changing operations are all now available for RAID LVs in a
cluster - provided that they are activated exclusively.

The code has been changed to ensure that no LV or sub-LV activation
is attempted cluster-wide.  This includes the often overlooked
operations of activating metadata areas for the brief time it takes
to clear them.  Additionally, some 'resume_lv' operations were
replaced with 'activate_lv_excl_local' when sub-LVs were promoted
to top-level LVs for removal, clearing or extraction.  This was
necessary because it forces the appropriate renaming actions the
occur via resume in the single-machine case, but won't happen in
a cluster due to the necessity of acquiring a lock first.

The *raid* tests have been updated to allow testing in a cluster.
For the most part, this meant creating devices with '-aey' if they
were to be converted to RAID.  (RAID requires the converting LV to
be EX because it is a condition of activation for the RAID LV in
a cluster.)
2013-09-10 16:33:22 -05:00
Jonathan Brassow
ca51435153 Misc/RAID: Enable resume_lv to handle some renaming conflicts.
When images and their associated metadata are removed from a RAID1 LV,
the remaining sub-LVs are "shifted" down to fill the gaps.  For
example, if there is a 3-way mirror:
	[0][1][2]
and we remove device#0, the devices will be shifted down
	[1][2]
and renamed.
	[0][1]

This can create a problem for resume_lv (specifically,
dm_tree_activate_children) during the renaming process though.  This
is because it will attempt to rename the higher indexed sub-LVs first
and find that it cannot because there are currently other sub-LVs with
that name.  The solution is to check for a conflicting name before
attempting to rename.  If a conflict is found and that conflicting
sub-LV is also in the process of renaming, we can defer the current
rename until the conflicting sub-LV has renamed and cleared the
conflict.

Now that resume_lv can handle these types of rename conflicts, we can
remove the workaround in RAID that was attempting to resume a RAID1
LV from the bottom-up in order to force a proper rename in assending
order before attempting a resume on the top-level LV.  This "hack"
only worked for single machine use-cases of LVM.  Clearing this up
paves the way for exclusive activation of RAID LVs in a cluster.
2013-09-09 15:07:28 -05:00
Zdenek Kabelac
0670bfeb59 thin: validation catch multiseg thin pool/volumes
Multisegment thin pools and volumes are not supported.
Catch such error code path early.
2013-09-07 03:32:07 +02:00
Zdenek Kabelac
4c001a7854 thin: fix resize of stacked thin pool volume
When the pool is created from non-linear target the more complex rules
have to be used and stacking needs to properly decode args for _tdata
LV. Also proper allocation policies are being used according to those
set in lvm2 metadata for data and metadata LVs.

Also properly check for active pool and extra code to active it
temporarily.

With this fix it's now possible to use:

lvcreate -L20 -m2 -n pool vg  --alloc anywhere
lvcreate -L10 -m2 -n poolm vg --alloc anywhere
lvconvert --thinpool vg/pool --poolmetadata vg/poolm

lvresize -L+10 vg/pool
2013-09-07 03:24:48 +02:00
Jonathan Brassow
72d6bdd6b9 misc: make lv_is_on_pv use for_each_sub_lv to walk LV tree
Make lv_is_on_pv use for_each_sub_lv to walk the LV tree.  This
reduces code duplication.
2013-08-23 11:03:28 -05:00
Jonathan Brassow
e5c0213168 Thin: Make 'lv_is_on_pv(s)' work with thin types
The pool metadata LV must be accounted for when determining what PVs
are in a thin-pool.  The pool LV must also be accounted for when
checking thin volumes.

This is a prerequisite for pvmove working with thin types.
2013-08-23 08:49:16 -05:00
Jonathan Brassow
f1e3640df3 Misc: Make get_pv_list_for_lv() available to more than just RAID
The function 'get_pv_list_for_lv' will assemble all the PVs that are
used by the specified LV.  It uses 'for_each_sub_lv' to traverse all
of the sub-lvs which may compose it.
2013-08-23 08:40:13 -05:00
Peter Rajnoha
f74e8fe044 thin: fix commit e195b5227e
Check chunk_size range unconditionally.
2013-08-06 16:28:12 +02:00
Peter Rajnoha
e195b5227e thin: apply VG profile if creating a new thin pool
When creating a new thin pool and there's no profile requested
via "lvcreate --profile ...", inherit any VG profile if it's attached.

Currently this applies to these settings:
  allocation/thin_pool_chunk_size
  allocation/thin_pool_discards
  allocation/thin_pool_zero
2013-08-06 11:42:40 +02:00
Zdenek Kabelac
7b58f10442 thin: move setting of THIN_POOL
Set flag when attaching data LV which make segment THIN_POOL.
2013-07-31 15:27:38 +02:00
Alasdair G Kergon
b6bfddcd0a alloc: fix lvextend when stripe number varies
The PREFERRED allocation mechanism requires the number of areas in the
previous LV segment to match the number in the new segment being
allocated.  If they do not match, the code may crash.
  E.g. https://bugzilla.redhat.com/989347

Introduce A_AREA_COUNT_MATCHES and when not set avoid referring
to the previous segment with the contiguous and cling policies.
2013-07-29 19:35:45 +01:00
Jonathan Brassow
f5a205668b Revert a previous change
commit d00d45a8b6 introduced changes
that are causing cluster mirror tests to fail.  Ultimately, I think
the change was right, but a proper clean-up will have to wait.
The portion of the commit we are reverting correlates to the
following commit comment:
    2) lib/metadata/mirror.c:_delete_lv() - should have been calling
       _activate_lv_like_model() with 'mirror_lv'.  This is because
       'mirror_lv' is the LV that the overall operation is being
       performed on.  We need to use this LV as the basis for
       determining whether to activate locally, or across the
       cluster, etc.
It appears that when legs or logs are removed from a mirror, they
are being activated before they are deleted in order to make them
top-level LVs that can be acted upon.  When doing this, it appears
they are not activated based on the characteristics of the mirror
from which they came.  IOW, if the mirror was exclusively active,
the sub-LVs are activated globally.  This is a no-no.  This then
made it impossible to activate_lv_like_model if the model was
"mirror_lv" instead of "lv" in _delete_lv().  Thus, at some point
this change should probably be put back and those location where
the sub-LVs are being improperly activated "shared" instead of
EX should be corrected.
2013-07-24 14:18:07 -05:00
Zdenek Kabelac
5597dc3652 thin: not zeroing for non-zeroed thin pool snaps
Do not zero initial 4KB of thin snapshot volume for thin pool with
disabled zeroing.
2013-07-24 01:15:31 +02:00
Jonathan Brassow
d00d45a8b6 Clean-up: Addressing a few FIXME's
Three fixme's addressed in this commit:
1) lib/metadata/lv_manip.c:_calc_area_multiple() - this could be
   safely changed to a comment explaining that currently because
   RAID10 can only have a 2-way mirror, we don't need to know the
   number of stripes.  However, we will need to know that in the
   future if RAID10 is to support more than 2-way mirroring.

2) lib/metadata/mirror.c:_delete_lv() - should have been calling
   _activate_lv_like_model() with 'mirror_lv'.  This is because
   'mirror_lv' is the LV that the overall operation is being
   performed on.  We need to use this LV as the basis for
   determining whether to activate locally, or across the
   cluster, etc.

3) tools/lvcreate.c:_lvcreate_params() - Minor clean-up.  If
   '-m 0' is given, treat it as though the mirroring argument
   was not given (i.e. as though the requested segment type
   was 'stripe' and not mirror).
2013-07-23 14:46:22 -05:00
Zdenek Kabelac
373f95a921 snapshot: update merging fix
Activation is needed only for clustered VG.
For non-clustered VG skip activation, since deactivate_lv()
is called without problems (no testing for lock presence).

(updates f6ded62291)
2013-07-23 15:15:04 +02:00
Zdenek Kabelac
6311be29e4 thin: use 64bit arithmetic for checking meta size
Avoid overflow since extents are just 32bit values.

(in release fix 87aca628)
2013-07-23 14:58:07 +02:00
Alasdair G Kergon
84801d7c34 thin: rename extend_pool to create_pool 2013-07-23 13:33:14 +01:00
Zdenek Kabelac
f6ded62291 snapshot: fix merging
When the merging of snapshot is finished, we need to clean dm table
intries for snapshot and -cow device. So for merging snapshot
we have to activate_lv plain 'cow' LV and let the table
resolver to its work - shortly deactivation_lv() request
will follow - in cluster this needs LV lock to be held by clvmd.

Also update a test - add small wait - if lvremove is not 'fast enough'
and merging process has not been stopped and $lv1 removed in background.
Ortherwise the following lvcreate occasionally finds name $lv1 still in use.

(in release fix)
2013-07-22 16:26:00 +02:00
Zdenek Kabelac
ea68f08501 cleanup: remove unused headers 2013-07-22 12:41:21 +02:00
Petr Rockai
6d2604f026 metadata: Fix tracking of read_status flags in _vg_make_handle. 2013-07-22 12:04:47 +02:00
Petr Rockai
3ed7f78ff4 metadata: Do not ignore errors in _vg_update_vg_ondisk. 2013-07-22 12:00:48 +02:00
Petr Rockai
f897fcbd95 metadata: Do not try to maintain an ondisk copy of orphan VGs. 2013-07-22 11:51:35 +02:00
Zdenek Kabelac
3075784955 thin: add spare lvcreate support
Add --poolmetadataspare option and creates and handles
pool metadata spare lv when thin pool is created.
With default setting 'y' it tries to ensure, spare has
at least the size of created LV.
2013-07-18 18:22:44 +02:00
Zdenek Kabelac
a916bf7eeb thin: removal of spare disables recovery
Warn user when removing spare LV.
Remove spare automatically, when last pool from VG is removed.
2013-07-18 18:22:44 +02:00
Zdenek Kabelac
915cc5a2fa thin: report 'e' volume type pool metadata spare
Reuse m'e'tadata volume type for spar'e' volume as well.
Essentially they are related and there is no big reason
to introduce new flag.
2013-07-18 18:22:44 +02:00
Zdenek Kabelac
460d0254eb thin: add pool metadata spare lv support
Add support for pool's metadata spare volume.
2013-07-18 18:22:43 +02:00
Zdenek Kabelac
08df7ba844 thin: improve pool creation activation order
Pool creation involves clearing of metadata device
which triggers udev watch rule we cannot udev synchronize with
in current code.

This metadata devices needs to be activated localy,
so in cluster mode deactivation and reactivation
is always needed.

However for non-clustered mode we may reload table
via suspend/resume path which avoids collision with
udev watch rule which was occasionaly triggering
retry deactivation loop.

Code has been also split into 2 separate code paths
for thin pools and thin volumes which improved readability
of the code as well.

Deactivation has been moved out of extend_pool() and
decision is now in _lv_create_an_lv() which knows
the change mode.
2013-07-18 18:22:43 +02:00
Zdenek Kabelac
d72f34af41 thin: fix error paths for metadata creation
Since we vg_write&commit metadata LV inside  lv_extend() call,
proper restore is needed in case something fails.

So add bad: section which deactivates activated LV
and removes it from VG.

Also check early for metadata LV name lengh fail.
2013-07-18 18:22:43 +02:00
Zdenek Kabelac
7afa9cebcb thin: fix error path in creation path
Remove some calls to revert_new_lv when no LV has been created/commited so far.
When the pool update failed - then only revert is needed.
2013-07-18 18:22:43 +02:00
Zdenek Kabelac
1d3f7953bd lv_manip: move some validation code before archiving
Make as much test we can, before actualy modifying metadata.
Avoids also unnecessary archiving.
2013-07-18 18:22:43 +02:00
Zdenek Kabelac
7b4b97b731 snapshot: local activation for clear COW device
To clear snapshot cow device in cluster enforce local
activation here.
2013-07-18 18:22:43 +02:00
Zdenek Kabelac
b5dfe4bec2 metadata: add is_change_activating
Add simple inline function to check, whether the change is activating.
(better then macro since we get type checking).
2013-07-18 18:22:42 +02:00
Zdenek Kabelac
e8fc77bd6d cleanup: move detection of change mode before commit point
Following patch will need to know change state before commit point.
Also the test mode should properly report all ongoing operation.
2013-07-18 18:22:42 +02:00
Zdenek Kabelac
20187fc190 cleanup: use dm_list_empty
Check for empty list directly.
2013-07-18 18:22:42 +02:00
Zdenek Kabelac
8bd67f9e5a cleanup: exit earlier in lv_rename_update
List doesn't need to be created when the metadata are not updated.
2013-07-18 18:16:29 +02:00
Peter Rajnoha
73e7f6c45f profile: move profile assignment to common lv_create_empty fn 2013-07-17 11:31:54 +02:00
Zdenek Kabelac
925701d9f3 thin: write back when command successfully finished
Remove backup() call from update_pool_lv() as it's been there
duplicated and preperly order backup() call after lvresize,
so there is just one such call.
2013-07-15 15:48:32 +02:00
Zdenek Kabelac
42881c8877 thin: send messages to active pool
If the thin pool is known to be active, messages can be passed
to the pool even when the created thin volume is not going to be
activated.

So we do not need to stack large list of message and validate
and catch creation errors earlier in this case.

Replace the test for valid activation combination with simpler list of
deactivation combinations.
2013-07-15 15:47:25 +02:00
Zdenek Kabelac
9ba7783350 cleanup: update comments
Add and indent.
2013-07-15 15:40:46 +02:00
Peter Rajnoha
8d1e511363 conf: add activation/auto_set_activation_skip
The activation/auto_set_activation_skip enables/disables automatic
adding of the ACTIVATION_SKIP LV flag. By default thin snapshots
are flagged to be skipped during activation.

And by default, the auto_set_activation_skip is enabled.
2013-07-12 20:54:17 +02:00
Peter Rajnoha
283c93a858 lvs: add 's(k)ip activatoin' lv_attr field
A new 10-th bit in lvs lv_attr field to indicate whether an LV
has the ACTIVATION_SKIP flag attached (stored in metadata).
2013-07-12 20:54:17 +02:00
Peter Rajnoha
7dc8c84b18 activation: add support for skipping activation of selected LVs
Also add -k/--setactivationskip y/n and -K/--ignoreactivationskip
options to lvcreate.

The --setactivationskip y sets the flag in metadata for an LV to
skip the LV during activation. Also, the newly created LV is not
activated.

Thin snapsots have this flag set automatically if not specified
directly by the --setactivationskip y/n option.

The --ignoreactivationskip overrides the activation skip flag set
in metadata for an LV (just for the run of the command - the flag
is not changed in metadata!)

A few examples for the lvcreate with the new options:

  (non-thin snap LV => skip flag not set in MDA + LV activated)
  raw/~ $ lvcreate -l1 vg
    Logical volume "lvol0" created
  raw/~ $ lvs -o lv_name,attr vg/lvol0
    LV    Attr
    lvol0 -wi-a----

  (non-thin snap LV + -ky => skip flag set in MDA + LV not activated)
  raw/~ $ lvcreate -l1 -ky vg
    Logical volume "lvol1" created
  raw/~ $ lvs -o lv_name,attr vg/lvol1
    LV    Attr
    lvol1 -wi------

  (non-thin snap LV + -ky + -K => skip flag set in MDA + LV activated)
  raw/~ $ lvcreate -l1 -ky -K vg
    Logical volume "lvol2" created
  raw/~ $ lvs -o lv_name,attr vg/lvol2
    LV    Attr
    lvol2 -wi-a----

  (thin snap LV => skip flag set in MDA (default behaviour) + LV not activated)
  raw/~ $ lvcreate -L100M -T vg/pool -V 1T -n thin_lv
    Logical volume "thin_lv" created
  raw/~ $ lvcreate -s vg/thin_lv -n thin_snap
    Logical volume "thin_snap" created
  raw/~ $ lvs -o name,attr vg
    LV        Attr
    pool      twi-a-tz-
    thin_lv   Vwi-a-tz-
    thin_snap Vwi---tz-

  (thin snap LV + -K => skip flag set in MDA (default behaviour) + LV activated)
  raw/~ $ lvcreate -s vg/thin_lv -n thin_snap -K
    Logical volume "thin_snap" created
  raw/~ $ lvs -o name,attr vg/thin_lv
    LV      Attr
    thin_lv Vwi-a-tz-

  (thins snap LV + -kn => no skip flag in MDA (default behaviour overridden) + LV activated)
  [0] raw/~ # lvcreate -s vg/thin_lv -n thin_snap -kn
    Logical volume "thin_snap" created
  [0] raw/~ # lvs -o name,attr vg/thin_snap
    LV        Attr
    thin_snap Vwi-a-tz-
2013-07-12 20:39:07 +02:00
Alasdair G Kergon
aaa9a68b2f metadata: cleanup comments and formatting 2013-07-09 12:34:48 +01:00
Alasdair G Kergon
f56a1819e9 tools: remove metadata-exported.h
metadata-exported.h is included by tools.h
2013-07-09 03:07:55 +01:00
Alasdair G Kergon
9d5bdc91ca tools: remove metadata.h 2013-07-09 02:51:24 +01:00
Alasdair G Kergon
8adddbf101 pvcreate: remove metadata.h header
Files in tools/ should only use metadata-exported.h not metadata.h.
Rename pvcreate_locked to pvcreate_single.
2013-07-09 02:37:56 +01:00
Alasdair G Kergon
7c6526aae2 lvresize: separate validation from action
Start separating the validation from the action in the basic lvresize
code moved to the library.
Remove incorrect use of command line error codes from lvresize library
functions.  Move errors.h to tools directory to reinforce this,
exporting public versions of the error codes in lvm2cmd.h for dmeventd
plugins to use.
2013-07-06 03:28:21 +01:00
Zdenek Kabelac
a64239f225 cleanup: use plain unsigned types 2013-07-05 17:20:57 +02:00
Zdenek Kabelac
7319bc0420 cleanup: move declaration to front 2013-07-05 17:20:57 +02:00
Zdenek Kabelac
f88f5a1ca3 thin: move alloc_pool_metadata
Move function from /tool to /lib to thin_manip.c
Since lvm2api will need to move many things into /lib anyway.
2013-07-04 13:33:41 +02:00
Zdenek Kabelac
6f335ffa35 sigint: improve logic on for sigint reaction
Fix and improve handling on sigint.

Always check for signal presence *before* calling of command,
so it will not call the command when break was hit.

If the command has been finished succesfully there is
no problem to mark the command ok and not report interrupt at all.

Fix cuple related stack; reports and assignments.
2013-07-03 14:46:42 +02:00
Mike Snitzer
f9e0adcce5 snapshot: Rename snapshot segment returning methods from find_*_cow to find_*_snapshot
find_cow -> find_snapshot, find_merging_cow -> find_merging_snapshot.
Will thin snapshot code to reuse these methods without confusion.
2013-07-02 16:26:03 -04:00
Tony Asleson
50db109e20 liblvm: Moved additional pv resize code
The pv resize code required that a lvm_vg_write be done
to commit the change.  When the method to add the ability
to list all PVs, including ones that are not assocated with
a VG we had no way for the user to make the change persistent.
Thus additional resize code was move and now liblvm calls into
a resize function that does indeed write the changes out, thus
not requiring the user to explicitly write out he changes.

Signed-off-by: Tony Asleson <tasleson@redhat.com>
2013-07-02 14:24:34 -05:00
Tony Asleson
6d6ccded35 lib2app: Added PV create. V2
V2: Correct call to lock_vol

Signed-off-by: Tony Asleson <tasleson@redhat.com>
2013-07-02 14:24:34 -05:00
Tony Asleson
8ddb1b4abf _get_pvs: Remove unused variable
Signed-off-by: Tony Asleson <tasleson@redhat.com>
2013-07-02 14:24:34 -05:00
Tony Asleson
e33ac7b1ed lvm2app: Implement lvm_pv_remove V2
Code move and changes to support calling code from
command line and from library interface.

V2 Change lock_vol call

Signed-off-by: Tony Asleson <tasleson@redhat.com>
2013-07-02 14:24:34 -05:00
Tony Asleson
ef3ab801e8 lvm2app: Add function to retrieve list of PVs V3
As locks are held, you need to call the included function
to release the memory and locks when done transversing the
list of physical volumes.

V2: Rebase fix
V3: Prevent VGs from getting cached and then write protected.

Signed-off-by: Tony Asleson <tasleson@redhat.com>
2013-07-02 14:24:33 -05:00
Tony Asleson
49d7596581 lvm2app: Implement lv resize (v3)
Simplified version of lv resize.

v3: Rebase changes to make work.  Needed to set sizeargs = 1
to indicate to resize that we are asking for a size based
resize.

Signed-off-by: Tony Asleson <tasleson@redhat.com>
2013-07-02 14:24:33 -05:00
Tony Asleson
c43ce46ba7 lvm2app: Move core lv re-size code (v6)
Moved to allow use from command line and for library use.

v3,v4,v5,v6: rebase changes

Signed-off-by: Tony Asleson <tasleson@redhat.com>
2013-07-02 14:24:33 -05:00
Peter Rajnoha
9f6cfc9de4 report: add vg_profile and lv_profile to report the profile attached to VG/LV
vgs -o vg_profile ...
lvs -o lv_profile ...
2013-07-02 15:22:12 +02:00
Peter Rajnoha
24a84549a8 thin: make selected thinp settings profilable
These settins are customizable by profiles:

	allocation/thin_pool_zero
	allocation/thin_pool_discards
	allocation/thin_pool_chunk_size
	activation/thin_pool_autoextend_threshold
	activation/thin_pool_autoextend_percent
2013-07-02 15:22:11 +02:00
Peter Rajnoha
6f0427cc56 lv: add lv_config_profile fn
Returns LV's profile if it exists, VG's profile otherwise
(LV inherits VG's profile).
2013-07-02 15:22:11 +02:00
Peter Rajnoha
e21e38cf74 metadata: add support for storing profile name in metadata (during vgcreate/lvcreate)
If "vgcreate/lvcreate --profile <profile_name>" is used, the profile
name is automatically stored in metadata for making it possible to
load it automatically next time the VG/LV is used.
2013-07-02 15:19:09 +02:00
Peter Rajnoha
d6a91da4be config: add profile arg to find_config_tree_bool 2013-07-02 15:19:09 +02:00
Peter Rajnoha
50bf2c0db1 config: add profile arg to find_config_tree_int 2013-07-02 15:19:09 +02:00
Peter Rajnoha
eeb7b0f7fa config: add profile arg to find_config_tree_node 2013-07-02 15:19:09 +02:00
Peter Rajnoha
c5e6bc393e metadata: read VG/LV profile name from metadata if it exists and load it
This is per VG/LV profile loading on demand. The profile itself is saved
in struct volume_group/logical_volume as "profile" field so we can
reference it whenever needed.
2013-07-02 15:19:09 +02:00
Zdenek Kabelac
e30028004b archiver: do not archive vg more then once
Do not keep multiple archives for the executed command.
Reuse the ALLOCATABLE_PV from pv status for
ARCHIVED_VG vg status. Mark VG with the bit with the
first archivation.
2013-07-01 23:09:26 +02:00
Zdenek Kabelac
afea2bf598 cleanup: move bit flags in order
Preseve the sequence of bits.
2013-07-01 23:06:41 +02:00
Zdenek Kabelac
da9263905c cleanup: use "" around type in error message
A bit more cleaner error message for wrong discards paramater.
2013-06-25 13:47:39 +02:00
Zdenek Kabelac
2a059f2358 cleanup: use log_print instead of log_error
Since reduce the message has informational character and doesn't lead
to exit of the command - reduce the log level to info print as we
use for other similar types.

Reindent next print message.
2013-06-25 13:47:39 +02:00
Zdenek Kabelac
f990b7298d cleanup: return lv_is_ as 1 or 0
Do not return 64bit values - return just plain int 0 or 1
2013-06-18 22:13:42 +02:00
Zdenek Kabelac
d4308a558d snapshot: fix max size limit check for COW device
Use proper max size as a multiple of extent size.
And use 64bit arithmentic for validation of minsize.
(in release fix).
2013-06-17 09:37:50 +02:00
Zdenek Kabelac
2f334b16d2 cleanup: use struct assign
Simplier code with struct assign.
Drop unneeded zeroing of zallocated memory.
2013-06-17 09:37:06 +02:00
Zdenek Kabelac
2636cae139 clean: remove unneeded assign
Since init_unknown_segtype returns zalloced memory,
NULL assign is not needed.
2013-06-17 09:34:56 +02:00
Zdenek Kabelac
5d73c0c674 cleanup: access pool segs with check
Add some minor checks after first_seg(lv).
Use direct check for thin pool segment.
Drop some unneeded {}
2013-06-16 00:07:33 +02:00
Zdenek Kabelac
8fb5f63637 mirror: add missing error message
When a user has not proceeded with conversion,
print the error message why the command has failed.
2013-06-16 00:07:32 +02:00
Alasdair G Kergon
c2dc21d89f text: miscellaneous comments & message tweaks 2013-06-15 01:28:54 +01:00
Peter Rajnoha
c6f48b7c1a refactor: make device type recognition code common for general use
Changes:

- move device type registration out of "type filter" (filter.c)
to a separate and new dev-type.[ch] for common use throughout the code

- the structure for keeping the major numbers detected for available
device types and available partitioning available is stored in
"dev_types" structure now

- move common partitioning detection code to dev-type.[ch] as well
together with other device-related functions bound to dev_types
(see dev-type.h for the interface)

The dev-type interface contains all common functions used to detect
subsystems/device types, signature/superblock recognition code,
type-specific device properties and other common device properties
(bound to dev_types), including partitioning support.

- add dev_types instance to cmd context as cmd->dev_types for common use

- use cmd->dev_types throughout as a central point for providing
information about device types
2013-06-12 12:08:56 +02:00
Peter Rajnoha
657abb08e0 cleanup: use libdm's dm_sysfs_dir() for sysfs directory throughout
And remove superfluous cmd->sysfs_dir and
set_sysfs_dir_path/sysfs_dir_path fn from lvm-globals.[ch].
2013-06-12 11:44:58 +02:00
Zdenek Kabelac
72c3ae253e thin: add helper functions
Add find_pool_lv() and pool_can_resize_metadata().
2013-06-11 14:03:30 +02:00
Zdenek Kabelac
01ef97fcbb thin: report 'e' metadata type with higher priority
Giving volume type information about being 'metadata' type of volume
has higher priority then i.e.  'mirror' or 'thin' flag - for those
type we have 'target attr' (7th. field).
2013-06-11 14:03:08 +02:00
Zdenek Kabelac
c0290489c3 thin: report o as volume type for external origin
Reuse 'o' attr for lvs report also for external origin.
2013-06-11 14:02:41 +02:00
Zdenek Kabelac
7151ede767 thin: report t for thin pool and volume
Do not mark internal device _tdata and _tmeta as having target type 't'.
They have the target type on their own (i.e. mirror, raid).
2013-06-11 13:58:16 +02:00
Zdenek Kabelac
1dcba13dfc cleanup: remove {} 2013-06-11 13:55:26 +02:00
Petr Rockai
f5a3bef276 format1: Fix snapshot reload in lv_remove.
The special suspend/resume code in lv_remove for LVM1 snapshots was interpsersed
with a vg_commit call. However, while with LVM1 metadata, vg_commit is
technically a no-op, the activation code relied on the ondisk and incore
metadata being the same, since on LVM1, a "commit" happens in vg_write
already. Since the "ondisk" metadata was previously not available with format1
(and incore was silently used instead, via lvmcache), the problem was masked.
2013-06-10 21:01:57 +02:00
Petr Rockai
2cce2f67ab metadata: Fix a pool CRC failure due to "late" ondisk copy creation. 2013-06-10 17:26:38 +02:00
Petr Rockai
f65dd341a5 locking: Make it possible to pass down an LV to activation code.
Previously, we have relied on UUIDs alone, and on lvmcache to make getting a
"new copy" of VG metadata fast. If the code which triggers the activation has
the correct VG metadata at hand (the version which is currently on disk), it can
now hand it to the activation code directly.
2013-06-10 17:26:38 +02:00
Petr Rockai
5d5f2306bd Add and track an "ondisk" pointer to struct volume_group.
This allows us to get the current on-disk version of the metadata whenever we
have the current in-flight version, without a recourse to scanning or lvmcache.
2013-06-10 17:26:29 +02:00
Petr Rockai
c1e851e208 Move export_vg_to_config_tree alongside export_vg_to_buffer. 2013-06-10 15:55:55 +02:00
Zdenek Kabelac
31f3274ed8 mirror: implement check for remotely active LV
If the mirror is active exclusively and locally, then we may proceed.
2013-05-31 21:42:31 +02:00
Jonathan Brassow
562c678ee2 DM RAID: Add ability to throttle sync operations for RAID LVs.
This patch adds the ability to set the minimum and maximum I/O rate for
sync operations in RAID LVs.  The options are available for 'lvcreate' and
'lvchange' and are as follows:
  --minrecoveryrate <Rate> [bBsSkKmMgG]
  --maxrecoveryrate <Rate> [bBsSkKmMgG]
The rate is specified in size/sec/device.  If a suffix is not given,
kiB/sec/device is assumed.  Setting the rate to 0 removes the preference.
2013-05-31 11:25:52 -05:00
Zdenek Kabelac
eb7e206a73 snapshot: add cow_max_extents
Add more precise calculation of the maximum usable snapshot size.
Using only percentage fails for small size of snapshot and extents.
2013-05-30 17:30:15 +02:00
Zdenek Kabelac
59962d8d3e snapshot: require 3 chunks for creation
There is no point in creation of 2chunks snapshot,
since the snapshot is invalidated immeditelly with the first write
as there is no free chunk for COW blocks
(2 chunks are used by the snap header and the 1st. metadata chunk).

Enhance error message about the lowest usable size.
2013-05-30 17:28:03 +02:00
Zdenek Kabelac
2f1a571c97 fid: fix reset of PV fid
Avoid hitting memory corruption (double free) in code path,
where PV FID has been already destroyed and the released pointer
was left in PV structure and could have been tried to be released
from there 2nd. time with final context destruction.
2013-05-30 16:52:39 +02:00
Peter Rajnoha
732859d21f refactor: rename embedding area -> bootloader area 2013-05-28 12:37:22 +02:00
Zdenek Kabelac
9966842810 snapshot: skip monitor for large cows
If snapshot cow device is already big enough to
cover whole origin, do not monitor it.
2013-05-27 10:35:43 +02:00
Zdenek Kabelac
77952151af snapshot: add lv_is_cow_covering_origin
Add function to check is size of cow is already big enough
to cover whole origin.
2013-05-27 10:34:53 +02:00
Zdenek Kabelac
3ba3bc0d66 cleanup: drop backtrace
After log_error/log_warn there is no point to show <backtrace>
in debug log trace from the next code line.
2013-05-27 10:28:32 +02:00
Zdenek Kabelac
8cbacd2474 lv_manip: use lv_is_active
Updated reverted commit.
The usage of lv_is_active() is needed here, so the
(!lv_is_active_exclusive_locally) gives the correct
report.
2013-05-20 16:47:33 +02:00
Jonathan Brassow
06ac797f42 Clean-up: Replace 'lv_is_active' with more correct/specific variants
There are places where 'lv_is_active' was being used where it was
more correct to use 'lv_is_active_locally'.  For example, when checking
for the existance of a kernel instance before asking for its status.
Most of the time these would work correctly.  (RAID is only allowed on
non-clustered VGs at the moment, which means that 'lv_is_active' and
'lv_is_active_locally' would give the same result.)  However, it is
more correct to use the proper variant and it helps with future
scenarios where targets might be allowed exclusively (or clustered) in
a cluster VG.
2013-05-16 10:36:56 -05:00
Peter Rajnoha
4777eb6872 lvconvert: check for snapshot-merge support before merge init 2013-05-16 08:21:57 +02:00
Alasdair G Kergon
f12d88f840 activation: fix lv_is_active regressions
Try to fix commit bf2741376d.

lv_is_active is not the same as lv_info(cmd, org, 0, &info, 0, 0).

Introduce and use lv_is_active_locally.
2013-05-15 02:13:31 +01:00
Mike Snitzer
8ad7865b42 Fix alignment of PV data area if detected alignment less than 1 MB
This fixes a long standing regression since LVM2 2.02.74 (commit 4efb1d9c,
"Update heuristic used for default and detected data alignment.")

The default PE alignment could be used (via MAX()) even if it was
determined that the device's MD stripe width, or minimal_io_size or
optimal_io_size were not factors of the default PE alignment (either 64K
or the newer default of 1MB, etc).  This bug would manifest if the
default PE alignment was larger than the overriding hint that the
device provided (e.g. default of 1MB vs optimal_io_size of 768K).

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2013-05-13 15:56:47 -04:00
Zdenek Kabelac
3d0cb0611e lv: fix typedef
Since older gcc is not accepting duplication of same typedef,
stay with predeclared enum type.
2013-05-03 16:02:43 +02:00
Zdenek Kabelac
2d3700ba42 report: improve reporting of active state
For reporting stacked or joined devices properly in cluster,
we need to report their activation state according the lock,
which activated this device tree.

This is getting a bit complex - current code tries simple approach -

For snapshot - return status for origin.
For thin pool - return status of the first known active thin volume.
For the rest of them - try to use dependency list of LVs and skip
known execptions.  This should be able to recursively deduce top level
device for given LV.

(in release fix)
2013-05-03 15:43:52 +02:00
Zdenek Kabelac
d2d71330c3 lv: add lv_active_change
Make a separate /lib function for the change of activation state
of the LV.

(in release update)
2013-05-03 15:43:19 +02:00
Zdenek Kabelac
8d004b5127 report: show active state of LV
For non clustered VG - show  "active"/""

For clustered VG its more complex:

"local exclusive"
"remote exclusive"
"locally"
"remotely"
2013-04-25 17:33:24 +02:00
Zdenek Kabelac
8b18ab76d2 report: show dmeventd monitoring status
Add new lvs segment field 'Monitor' showing 3 states:

"monitored" - LV is monitored by dmeventd.

"not monitored" - LV is currently not being monitored by dmeventd

"" (empty) - LV does not support monitoring, or dmeventd support
             is not compiled in.
2013-04-25 17:33:24 +02:00
Zdenek Kabelac
f84f12a6a3 snapshot: rework cluster creation and removal
Support for exclusive activation of snapshots revealed some problems.

When snapshot is created, COW LV is activated first (for clearing) and
then it's transformed into snapshot's COW LV, but it has left the lock
for such LV active in cluster and this lock could not have been removed
from dlm, unless snapshot has been removed within same dlm session.

If the user tried to remove snapshot after rebooting node, the lock was
missing, and COW LV could not have been detached.

Patch modifes the approach in this way:

Always deactivate COW LV for clustered vg  after clearing (so it's
activated again via imlicit snapshot activation rule when snapshot is activated).

When snapshot is removed, activate COW LV as independend LV, so the lock
will exist for such LV, but only when the snapshot is active.

Also add test case for testing snapshot removal after cluster reboot.
2013-04-25 17:33:24 +02:00
Zdenek Kabelac
dd4fdce16c cleanup: drop unused assignment
Assigned values are unused.
2013-04-21 23:14:04 +02:00
Zdenek Kabelac
5e7eae59da lv_manip: check remove_seg_from_segs_using_this_lv()
Add missing check for result of remove_seg_from_segs_using_this_lv().
Failure is reported as internal error.
2013-04-21 23:10:43 +02:00
Zdenek Kabelac
24f8daa13d raid: test for target_pvs
If target_pvs is NULL do not call lv_is_on_pvs()
2013-04-21 23:07:00 +02:00
Zdenek Kabelac
4ca6a4105d thin: lvcreate better support for AAY
Test rather for changes which are deactivating.
2013-04-21 23:06:23 +02:00
Jonathan Brassow
2e0740f7ef RAID: Add writemostly/writebehind support for RAID1
'lvchange' is used to alter a RAID 1 logical volume's write-mostly and
write-behind characteristics.  The '--writemostly' parameter takes a
PV as an argument with an optional trailing character to specify whether
to set ('y'), unset ('n'), or toggle ('t') the value.  If no trailing
character is given, it will set the flag.
Synopsis:
        lvchange [--writemostly <PV>:{t|y|n}] [--writebehind <count>] vg/lv
Example:
        lvchange --writemostly /dev/sdb1:y --writebehind 512 vg/raid1_lv

The last character in the 'lv_attr' field is used to show whether a device
has the WriteMostly flag set.  It is signified with a 'w'.  If the device
has failed, the 'p'artial flag has priority.

Example ("nosync" raid1 with mismatch_cnt and writemostly):
[~]# lvs -a --segment vg
  LV                VG   Attr      #Str Type   SSize
  raid1             vg   Rwi---r-m    2 raid1  500.00m
  [raid1_rimage_0]  vg   Iwi---r--    1 linear 500.00m
  [raid1_rimage_1]  vg   Iwi---r-w    1 linear 500.00m
  [raid1_rmeta_0]   vg   ewi---r--    1 linear   4.00m
  [raid1_rmeta_1]   vg   ewi---r--    1 linear   4.00m

Example (raid1 with mismatch_cnt, writemostly - but failed drive):
[~]# lvs -a --segment vg
  LV                VG   Attr      #Str Type   SSize
  raid1             vg   rwi---r-p    2 raid1  500.00m
  [raid1_rimage_0]  vg   Iwi---r--    1 linear 500.00m
  [raid1_rimage_1]  vg   Iwi---r-p    1 linear 500.00m
  [raid1_rmeta_0]   vg   ewi---r--    1 linear   4.00m
  [raid1_rmeta_1]   vg   ewi---r-p    1 linear   4.00m

A new reportable field has been added for writebehind as well.  If
write-behind has not been set or the LV is not RAID1, the field will
be blank.
Example (writebehind is set):
[~]# lvs -a -o name,attr,writebehind vg
  LV            Attr      WBehind
  lv            rwi-a-r--     512
  [lv_rimage_0] iwi-aor-w
  [lv_rimage_1] iwi-aor--
  [lv_rmeta_0]  ewi-aor--
  [lv_rmeta_1]  ewi-aor--

Example (writebehind is not set):
[~]# lvs -a -o name,attr,writebehind vg
  LV            Attr      WBehind
  lv            rwi-a-r--
  [lv_rimage_0] iwi-aor-w
  [lv_rimage_1] iwi-aor--
  [lv_rmeta_0]  ewi-aor--
  [lv_rmeta_1]  ewi-aor--
2013-04-15 13:59:46 -05:00
Zdenek Kabelac
2e39392daf cleanup: remove unused lvl_idx 2013-04-12 11:26:31 +02:00
Jonathan Brassow
ff64e3500f RAID: Add scrubbing support for RAID LVs
New options to 'lvchange' allow users to scrub their RAID LVs.
Synopsis:
	lvchange --syncaction {check|repair} vg/raid_lv

RAID scrubbing is the process of reading all the data and parity blocks in
an array and checking to see whether they are coherent.  'lvchange' can
now initaite the two scrubbing operations: "check" and "repair".  "check"
will go over the array and recored the number of discrepancies but not
repair them.  "repair" will correct the discrepancies as it finds them.

'lvchange --syncaction repair vg/raid_lv' is not to be confused with
'lvconvert --repair vg/raid_lv'.  The former initiates a background
synchronization operation on the array, while the latter is designed to
repair/replace failed devices in a mirror or RAID logical volume.

Additional reporting has been added for 'lvs' to support the new
operations.  Two new printable fields (which are not printed by
default) have been added: "syncaction" and "mismatches".  These
can be accessed using the '-o' option to 'lvs', like:
	lvs -o +syncaction,mismatches vg/lv
"syncaction" will print the current synchronization operation that the
RAID volume is performing.  It can be one of the following:
        - idle:   All sync operations complete (doing nothing)
        - resync: Initializing an array or recovering after a machine failure
        - recover: Replacing a device in the array
        - check: Looking for array inconsistencies
        - repair: Looking for and repairing inconsistencies
The "mismatches" field with print the number of descrepancies found during
a check or repair operation.

The 'Cpy%Sync' field already available to 'lvs' will print the progress
of any of the above syncactions, including check and repair.

Finally, the lv_attr field has changed to accomadate the scrubbing operations
as well.  The role of the 'p'artial character in the lv_attr report field
as expanded.  "Partial" is really an indicator for the health of a
logical volume and it makes sense to extend this include other health
indicators as well, specifically:
        'm'ismatches:  Indicates that there are discrepancies in a RAID
                       LV.  This character is shown after a scrubbing
                       operation has detected that portions of the RAID
                       are not coherent.
        'r'efresh   :  Indicates that a device in a RAID array has suffered
                       a failure and the kernel regards it as failed -
                       even though LVM can read the device label and
                       considers the device to be ok.  The LV should be
                       'r'efreshed to notify the kernel that the device is
                       now available, or the device should be 'r'eplaced
                       if it is suspected of failing.
2013-04-11 15:33:59 -05:00
Petr Rockai
382fc878d7 lvmetad: Check for reappeared PVs. 2013-04-03 12:48:28 +02:00
Zdenek Kabelac
d24c01a414 thin: lvcreate external origin snapshot support 2013-04-02 15:17:31 +02:00
Zdenek Kabelac
435e0bb608 cleanup: indent line 2013-04-02 15:17:05 +02:00
Peter Rajnoha
5c93f3997b metadata: use PV's internal UNLABELLED_PV flag more consistently
Set when new PV created, cleared on PV write.
2013-03-25 16:21:59 +01:00
Peter Rajnoha
ea36d0501e cleanup: remove unused 'pv_by_path' fn
The pv_by_path might be also dangerous to use as it does not
count with any other metadata areas but the ones found on the PV
itself. If metadata was not found on the PV referenced by the path,
it returned no PV though it might have been referenced by metadata
elsewhere (on other PVs...).
2013-03-19 14:57:36 +01:00
Peter Rajnoha
7e5e2dd4ee vgextend: do not allow PV with 0 MDAs to be added while already in a VG
If extending a VG and including a PV with 0 MDAs that was already
a part of a VG, the vgextend allowed that PV to be added and we
ended up *with one PV in two VGs*!

The vgextend code used the 'pv_by_path' fn that returned a PV for
a given path. However, when the PV did not have any metadata areas,
the fn just returned a PV without any reference to existing VG.
Consequently, any checks for the existing VG failed.

[0] raw/~ # pvcreate --metadatacopies 0 /dev/sda
  Physical volume "/dev/sda" successfully created

[0] raw/~ # pvcreate --metadatacopies 1 /dev/sdb
  Physical volume "/dev/sdb" successfully created

[0] raw/~ # vgcreate vg1 /dev/sda /dev/sdb
  Volume group "vg1" successfully created

[0] raw/~ # pvcreate --metadatacopies 1 /dev/sdc
  Physical volume "/dev/sdc" successfully created

[0] raw/~ # vgcreate vg2 /dev/sdc
  Volume group "vg2" successfully created

Before this patch (incorrect):
[0] raw/~ # vgextend vg2 /dev/sda
  Volume group "vg2" successfully extended

With this patch (correct):
[0] raw/~ # vgextend vg2 /dev/sda
  Physical volume '/dev/sda' is already in volume group 'vg1'
  Unable to add physical volume '/dev/sda' to volume group 'vg2'.
2013-03-19 14:57:36 +01:00
Peter Rajnoha
59878d0129 metadata: add 'allow_orphan' arg to find_pv_by_name fn
Before, the find_pv_by_name call always failed if the PV found was orphan.
However, we might use this function even for a PV that is not part of any VG.
This patch adds 'allow_orphan' arg to find_pv_by_name fn that allows that.
2013-03-19 14:57:31 +01:00
Peter Rajnoha
5b6bab2e30 cleanup: remove superfluous wrappers
_find_pv_by_name -> find_pv_by_name
_find_pv_in_vg -> find_pv_in_vg
_find_pv_in_vg_by_uuid -> find_pv_in_vg_by_uuid

The only callers of the underscored variants were their wrappers
without the underscore. No other part of the code referenced the
underscored variants.
2013-03-19 13:58:02 +01:00
Zdenek Kabelac
b36a776a7f thin: move update_pool_params
Now we may recongnize preset arguments, move
the code for updating thin pool related values
into /lib portion of the code.
2013-03-13 15:13:54 +01:00
Zdenek Kabelac
f06dd8725a thin: mark passed args
Keep the flag whether given thin pool argument has been given on command
line or it's been 'estimated'

Call of update_pool_params() must not change cmdline given args and
needs to know this info.

Since there is a need to move this update function into /lib, we cannot
use arg_count().

FIXME: we need some generic mechanism here.
2013-03-13 15:13:54 +01:00
Peter Rajnoha
386886f71c config: refer to config nodes using assigned IDs
For example, the old call and reference:

  find_config_tree_str(cmd, "devices/dir", DEFAULT_DEV_DIR)

...now becomes:

  find_config_tree_str(cmd, devices_dir_CFG)

So we're referring to the named configuration ID instead
of passing the configuration path and the default value
is taken from central config definition in config_settings.h
automatically.
2013-03-06 10:14:33 +01:00
Peter Rajnoha
b778653f03 pv_header_extension: add support for writing PV header extension (flags & Embedding Area)
The PV header extension information (PV header extension version, flags
and list of Embedding Area locations) is stored just beyond the PV header base.

When calculating the Embedding Area start value (ea_start), the same logic is
used as when calculating the pe_start value for Data Area - the value must
follow exactly the same alignment restrictions for its start value
(the alignment detected automatically or provided via command line using
the --dataalignment and --dataalignmentoffset arguments).

The Embedding Area is placed at the very start of the PV, starting at
ea_start. The Data Area starting at pe_start is placed next. The pe_start is
still properly aligned. Due to the pe_start alignment, it's possible that the
resulting Embedding Area size (ea_size) ends up bigger in size than requested
(but never less than requested).
2013-02-26 11:28:00 +01:00
Peter Rajnoha
9dbe25709e pv_header_extension: add support for reading PV header extension (flags & Embedding Area)
New tools with PV header extension support will read the extension
if it exists and it's not an error if it does not exist (so old PVs
will still work seamlessly with new tools).

Old tools without PV header extension support will just ignore any
extension.

As for the Embedding Area location information (its start and size),
there are actually two places where this is stored:
  - PV header extension
  - VG metadata

The VG metadata contains a copy of what's written in the PV header
extension about the Embedding Area location (NULL value is not copied):

    physical_volumes {
        pv0 {
          id = "AkSSRf-difg-fCCZ-NjAN-qP49-1zzg-S0Fd4T"
          device = "/dev/sda"     # Hint only

          status = ["ALLOCATABLE"]
          flags = []
          dev_size = 262144       # 128 Megabytes
          pe_start = 67584
          pe_count = 23   # 92 Megabytes
          ea_start = 2048
          ea_size = 65536 # 32 Megabytes
        }
    }

The new metadata fields are "ea_start" and "ea_size".
This is mostly useful when restoring the PV by using existing
metadata backups (e.g. pvcreate --restorefile ...).

New tools does not require these two fields to exist in VG metadata,
they're not compulsory. Therefore, reading old VG metadata which doesn't
contain any Embedding Area information will not end up with any kind
of error but only a debug message that the ea_start and ea_size values
were not found.

Old tools just ignore these extra fields in VG metadata.
2013-02-26 11:27:23 +01:00
Peter Rajnoha
60c5d4c42f pv_header_extension: add supporting infrastructure for PV header extension (flags & Embedding Area)
PV header extension comes just beyond the existing PV header base:

PV header base (existing):
 - uuid
 - device size
 - null-terminated list of Data Areas
 - null-terminater list of MetaData Areas

PV header extension:
 - extension version
 - flags
 - null-terminated list of Embedding Areas

This patch also adds "eas" (Embedding Areas) list to lvmcache (lvmcache_info)
and it also adds support for common operations on the list (just like for
already existing "das" - Data Areas list):
 - lvmcache_add_ea
 - lvmcache_update_eas
 - lvmcache_foreach_ea
 - lvmcache_del_eas

Also, add ea_start and ea_size to struct physical_volume for processing
PV Embedding Area location throughout the code (currently only one
Embedding Area is supported, though the definition on disk allows for
more if needed in the future...).

Also, define FMT_EAS format flag to mark that the format actually
supports Embedding Areas (currently format-text only).
2013-02-26 11:25:16 +01:00
Peter Rajnoha
6d8de3638c cleanup: use struct pvcreate_restorable_params throughout 2013-02-26 11:25:11 +01:00
Peter Rajnoha
6692b17777 cleanup: add struct pvcreate_restorable_params and move relevant items from pvcreate_params
Extract restorable PV creation parameters from struct pvcreate_params into
a separate struct pvcreate_restorable_params for clarity and also for better
maintainability when adding any new items later.
2013-02-26 11:24:38 +01:00
Zdenek Kabelac
2cba0ea9f9 thin: removal of external_origin 2013-02-23 10:37:01 +01:00
Zdenek Kabelac
30c13eff37 thin: report external origin
Use the field 'origin' for reporting external origin lv name.

For thin volumes with external origin, report the size of
external origin size via:

  lvs -o+origin_size
2013-02-23 10:37:01 +01:00
Zdenek Kabelac
87331dc419 thin: add support for external origin
Add internal support for thin volume's external origin.
2013-02-23 10:36:58 +01:00
Zdenek Kabelac
d023b2d12f lvremove: easier removal of dependent lvs
Add function to remove lvs which are depending on removed lv
prior the lv is removed.

User is asked for confirmation.
2013-02-23 10:31:05 +01:00
Peter Rajnoha
303e86adc8 pvcreate: fix alignment to incorporate alignment offset if PV has 0 MDAs
If zero metadata copies are used, there's no further recalculation of
PV alignment that happens when adding metadata areas to the PV and
which actually calculates the alignment correctly as a matter of fact.
So fix this for "PV without MDA" case as well.

Before this patch:
[1] raw/~ # pvcreate --dataalignment 8m --dataalignmentoffset 4m
--metadatacopies 1 /dev/sda
  Physical volume "/dev/sda" successfully created
[1] raw/~ # pvs -o pv_name,pe_start
  PV         1st PE
  /dev/sda    12.00m
[1] raw/~ # pvcreate --dataalignment 8m --dataalignmentoffset 4m
--metadatacopies 0 /dev/sda
  Physical volume "/dev/sda" successfully created
[1] raw/~ # pvs -o pv_name,pe_start
  PV         1st PE
  /dev/sda     8.00m

After this patch:
[1] raw/~ # pvcreate --dataalignment 8m --dataalignmentoffset 4m
--metadatacopies 1 /dev/sda
  Physical volume "/dev/sda" successfully created
[1] raw/~ # pvs -o pv_name,pe_start
  PV         1st PE
  /dev/sda    12.00m
[1] raw/~ # pvcreate --dataalignment 8m --dataalignmentoffset 4m
--metadatacopies 0 /dev/sda
  Physical volume "/dev/sda" successfully created
[1] raw/~ # pvs -o pv_name,pe_start
  PV         1st PE
  /dev/sda    12.00m

Also, remove a superfluous condition "pv->pe_start < pv->pe_align" in:
  if (pe_start == PV_PE_START_CALC && pv->pe_start < pv->pe_align)
    pv->pe_start = pv->pe_align ...
This part of the condition is not reachable as with the PV_PE_START_CALC,
we always have pv->pe_start set to 0 from the PV struct initialisation
(...the pv->pe_start value is just being calculated).
2013-02-21 14:51:19 +01:00
Jonathan Brassow
dc2ce71313 clean-up: Remove a FIXME question that has been settled
It is ok for us to use the shorthand 'lv_is_virtual' to detect error
targets in a RAID LV when searching for candidates for device replacement.
2013-02-20 15:03:58 -06:00
Jonathan Brassow
bd0ee420b5 RAID: Allow remove/replace of sub-LVs composed of error segments.
When a device fails, we may wish to replace those segments with an
error segment.  (Like when a 'vgreduce --removemissing' removes a
failed device that happens to be a RAID image/meta.)  We are then left
with images that we will eventually want to remove or replace.

This patch allows us to pull out these virtual "error" sub-LVs.  This
allows a user to 'lvconvert -m -1 vg/lv' to extract the bad sub-LVs.
Sub-LVs with error segments are considered for extraction before other
possible devices so that good devices are not accidentally removed.

This patch also adds the ability to replace RAID images that contain error
segments.  The user will still be unable to run 'lvconvert --replace'
because there is no way to address the 'error' segment (i.e. no PV
that it is associated with).  However, 'lvconvert --repair' can be
used to replace the image's error segment with a new PV.  This is also
the most appropriate way to do it, since the LV will continue to be
reported as 'partial'.
2013-02-20 14:58:56 -06:00
Jonathan Brassow
845852d6b4 RAID: Make 'vgreduce --removemissing' work with RAID LVs
Currently it is impossible to remove a failed PV which has a RAID LV
on it.  This patch fixes the issue by replacing the failed PV with an
'error' segment within the affected sub-LVs.  Once there is no longer
a RAID LV using the PV, it can be removed.

Most often, it is better to replace a failed RAID device with a spare.
(You can use 'lvconvert --repair <vg>/<LV>' to accomplish that.)
However, if there are no spares in the volume group and none will be
added, it is useful to be able to removed the failed device.

Following patches address the ability to perform 'lvconvert' operations
on RAID LVs that contain sub-LVs composed of 'error' segments.
2013-02-20 14:52:46 -06:00
Jonathan Brassow
0e4ffd9d3b clean-up: Rename lvm.conf setting 'mirror_region_size' to 'raid_region_size'
We have been using 'mirror_region_size' in lvm.conf as the default region
size for RAID logical volumes as well as mirror logical volumes.  Since,
"raid" is more inclusive and representative than "mirror", I have changed
the name of this setting.  We must still check for the old setting and warn
the user if we are overriding it with the new setting if both happen to be
present.
2013-02-20 14:40:17 -06:00
Peter Rajnoha
722ca363f0 report: fix pvs -o pv_free reporting for PVs with 0 PEs
[0] raw/~ # lsblk -o NAME,SIZE /dev/sda
NAME  SIZE
sda   128M

[0] raw/~ # pvcreate --dataalignment 128m /dev/sda
  Physical volume "/dev/sda" successfully created

[0] raw/~ # vgcreate vg /dev/sda
  Volume group "vg" successfully created

[0] raw/~ # lvcreate -l1 vg
  Volume group "vg" has insufficient free space (0 extents): 1 required.

Before this patch:
[0] raw/~ # pvs -o pv_name,pv_free
  PV         PFree
  /dev/sda   128.00m

After this patch:
[0] raw/~ # pvs -o pv_name,pv_free
  PV         PFree
  /dev/sda      0
2013-02-21 13:28:07 +01:00
Zdenek Kabelac
7910b6c0ba thin: update pool_is_active
Change it to take LV and move it to exported header - seems
to be a better fit for usability from tools/ directory.
2013-02-05 16:54:11 +01:00
Zdenek Kabelac
c984d8fbab thin: properly unmark volume after detach
When the volume is detached form thin pool,
unmask THIN_VOLUME flag and reset related pointers.
2013-02-05 14:40:37 +01:00
Zdenek Kabelac
11eaf1c98c thin: add function pool_is_active
This internal function check for active pool device.
For cluster it checks every thin volume,
On the non-clustered VG we need to check just
for presence of -tpool device.
2013-02-05 14:35:44 +01:00
Zdenek Kabelac
ddeb37f282 cleanup: add internal error check
Check if 'is_removable' is defined and report internal error,
if it's missing.
2013-02-05 14:27:24 +01:00
Zdenek Kabelac
153ce89af3 cleanup: comment update
Just update code comment and use single line if().
2013-02-04 19:05:43 +01:00
Zdenek Kabelac
ca7abbce8a activate: add lv_layer function
Add function to return layer name for LV.
2013-02-04 19:01:10 +01:00
Jonathan Brassow
801d4f96a8 RAID: Improve 'lvs' attribute reporting of RAID LVs and sub-LVs
There are currently a few issues with the reporting done on RAID LVs and
sub-LVs.  The most concerning is that 'lvs' does not always report the
correct failure status of individual RAID sub-LVs (devices).  This can
occur when a device fails and is restored after the failure has been
detected by the kernel.  In this case, 'lvs' would report all devices are
fine because it can read the labels on each device just fine.
Example:
[root@bp-01 lvm2]# lvs -a -o name,vg_name,attr,copy_percent,devices vg
  LV            VG   Attr      Cpy%Sync Devices
  lv            vg   rwi-a-r--   100.00 lv_rimage_0(0),lv_rimage_1(0)
  [lv_rimage_0] vg   iwi-aor--          /dev/sda1(1)
  [lv_rimage_1] vg   iwi-aor--          /dev/sdb1(1)
  [lv_rmeta_0]  vg   ewi-aor--          /dev/sda1(0)
  [lv_rmeta_1]  vg   ewi-aor--          /dev/sdb1(0)

However, 'dmsetup status' on the device tells us a different story:
  [root@bp-01 lvm2]# dmsetup status vg-lv
  0 1024000 raid raid1 2 DA 1024000/1024000

In this case, we must also be sure to check the RAID LVs kernel status
in order to get the proper information.  Here is an example of the correct
output that is displayed after this patch is applied:
[root@bp-01 lvm2]# lvs -a -o name,vg_name,attr,copy_percent,devices vg
  LV            VG   Attr      Cpy%Sync Devices
  lv            vg   rwi-a-r-p   100.00 lv_rimage_0(0),lv_rimage_1(0)
  [lv_rimage_0] vg   iwi-aor-p          /dev/sda1(1)
  [lv_rimage_1] vg   iwi-aor--          /dev/sdb1(1)
  [lv_rmeta_0]  vg   ewi-aor-p          /dev/sda1(0)
  [lv_rmeta_1]  vg   ewi-aor--          /dev/sdb1(0)

The other case where 'lvs' gives incomplete or improper output is when a
device is replaced or added to a RAID LV.  It should display that the RAID
LV is in the process of sync'ing and that the new device is the only one
that is not-in-sync - as indicated by a leading 'I' in the Attr column.
(Remember that 'i' indicates an (i)mage that is in-sync and 'I' indicates
an (I)mage that is not in sync.)  Here's an example of the old incorrect
behaviour:
[root@bp-01 lvm2]# lvs -a -o name,vg_name,attr,copy_percent,devices vg
  LV            VG   Attr      Cpy%Sync Devices
  lv            vg   rwi-a-r--   100.00 lv_rimage_0(0),lv_rimage_1(0)
  [lv_rimage_0] vg   iwi-aor--          /dev/sda1(1)
  [lv_rimage_1] vg   iwi-aor--          /dev/sdb1(1)
  [lv_rmeta_0]  vg   ewi-aor--          /dev/sda1(0)
  [lv_rmeta_1]  vg   ewi-aor--          /dev/sdb1(0)
[root@bp-01 lvm2]# lvconvert -m +1 vg/lv; lvs -a -o name,vg_name,attr,copy_percent,devices vg
  LV            VG   Attr      Cpy%Sync Devices
  lv            vg   rwi-a-r--     0.00 lv_rimage_0(0),lv_rimage_1(0),lv_rimage_2(0)
  [lv_rimage_0] vg   Iwi-aor--          /dev/sda1(1)
  [lv_rimage_1] vg   Iwi-aor--          /dev/sdb1(1)
  [lv_rimage_2] vg   Iwi-aor--          /dev/sdc1(1)
  [lv_rmeta_0]  vg   ewi-aor--          /dev/sda1(0)
  [lv_rmeta_1]  vg   ewi-aor--          /dev/sdb1(0)
  [lv_rmeta_2]  vg   ewi-aor--          /dev/sdc1(0)                            ** Note that all the images currently are marked as 'I' even though it is
   only the last device that has been added that should be marked.

Here is an example of the correct output after this patch is applied:
[root@bp-01 lvm2]# lvs -a -o name,vg_name,attr,copy_percent,devices vg
  LV            VG   Attr      Cpy%Sync Devices
  lv            vg   rwi-a-r--   100.00 lv_rimage_0(0),lv_rimage_1(0)
  [lv_rimage_0] vg   iwi-aor--          /dev/sda1(1)
  [lv_rimage_1] vg   iwi-aor--          /dev/sdb1(1)
  [lv_rmeta_0]  vg   ewi-aor--          /dev/sda1(0)
  [lv_rmeta_1]  vg   ewi-aor--          /dev/sdb1(0)
[root@bp-01 lvm2]# lvconvert -m +1 vg/lv; lvs -a -o name,vg_name,attr,copy_percent,devices vg
  LV            VG   Attr      Cpy%Sync Devices
  lv            vg   rwi-a-r--     0.00 lv_rimage_0(0),lv_rimage_1(0),lv_rimage_2(0)
  [lv_rimage_0] vg   iwi-aor--          /dev/sda1(1)
  [lv_rimage_1] vg   iwi-aor--          /dev/sdb1(1)
  [lv_rimage_2] vg   Iwi-aor--          /dev/sdc1(1)
  [lv_rmeta_0]  vg   ewi-aor--          /dev/sda1(0)
  [lv_rmeta_1]  vg   ewi-aor--          /dev/sdb1(0)
  [lv_rmeta_2]  vg   ewi-aor--          /dev/sdc1(0)
** Note only the last image is marked with an 'I'.  This is correct and we can
   tell that it isn't the whole array that is sync'ing, but just the new
   device.

It also works under snapshots...
[root@bp-01 lvm2]# lvs -a -o name,vg_name,attr,copy_percent,devices vg
  LV            VG   Attr      Cpy%Sync Devices
  lv            vg   owi-a-r-p    33.47 lv_rimage_0(0),lv_rimage_1(0),lv_rimage_2(0)
  [lv_rimage_0] vg   iwi-aor--          /dev/sda1(1)
  [lv_rimage_1] vg   Iwi-aor-p          /dev/sdb1(1)
  [lv_rimage_2] vg   Iwi-aor--          /dev/sdc1(1)
  [lv_rmeta_0]  vg   ewi-aor--          /dev/sda1(0)
  [lv_rmeta_1]  vg   ewi-aor-p          /dev/sdb1(0)
  [lv_rmeta_2]  vg   ewi-aor--          /dev/sdc1(0)
  snap          vg   swi-a-s--          /dev/sda1(51201)
2013-02-01 11:33:54 -06:00
Alasdair G Kergon
06abb2dd4c logging: classify log_debug messages
Place most log_debug() messages into a class.
2013-01-07 22:30:29 +00:00
Alasdair G Kergon
b617109fff lvmetad: fix format1 updates
fmt1 doesn't have a separate commit function: updates take effect
immediately vg_write is called, so we must update lvmetad at this
point if we're going to go on and ask lvmetad for the VG metadata
again before calling the commit function (though that's probably an
unsupported and pointless thing to do anyway as the client must
already have that data and it cannot have changed because it's locked
and with devs suspended we shouldn't be communicating with lvmetad;
so when that's fixed properly, this fix here can be reverted).

This problem showed up as an internal error when lvremoving an LVM1
snapshot.

> Internal error: LV snap1 (00000000000000000000000000000001) missing from preload metadata

https://bugzilla.redhat.com/891855
2013-01-05 03:17:35 +00:00
Jonathan Brassow
970dfbcd69 RAID: Limit replacement of devices when array is not in-sync.
If a RAID array is not in-sync, replacing devices should not be allowed
as a general rule.  This is because the contents used to populate the
incoming device may be undefined because the devices being read where
not in-sync.  The kernel enforces this rule unless overridden by not
allowing the creation of an array that is not in-sync and includes a
devices that needs to be rebuilt.

Since we cannot know the sync state of an LV if it is inactive, we must
also enforce the rule that an array must be active to replace devices.

That leaves us with the following conditions:
1) never allow replacement or repair of devices if the LV is in-active
2) never allow replacement if the LV is not in-sync
3) allow repair if the LV is not in-sync, but warn that contents may
   not be recoverable.

In the case where a user is performing the repair on the command line via
'lvconvert --repair', the warning is printed before the user is prompted
if they would like to replace the device(s).  If the repair is automated
(i.e. via dmeventd and policy is "allocate"), then the device is replaced
if possible and the warning is printed.
2012-12-18 14:40:42 -06:00
Zdenek Kabelac
401c9aba4a pv_read: add missing check for valid info
If the lvmcache_info_from_pvid() fails to find valid
info, invoke the lookup by dev, and only in this case
call lvmcache_info_from_pvid() again.

Also check for the result of info and return
error directly, so the NULL is not passed
to lvmcache_get_label().
2012-12-15 17:23:27 +01:00
Zdenek Kabelac
6987a353de thin: add detach_pool_metadata_lv
Add internal function detach_pool_metadata_lv().
2012-12-02 17:56:29 +01:00
Zdenek Kabelac
0387e70d76 thin: fix property discard for lvm2api
Discards property is string and may have these values:
  ignore, nopassdown, passdown
2012-11-27 14:09:49 +01:00
Jonathan Brassow
fb0cee9a66 RAID: Do not allow --splitmirrors on RAID10 logical volumes.
RAID10 does not have the ability to split off images for independent
use.  So, 'lvconvert --splitmirrors' will not work and must be
disallowed.
2012-11-21 18:39:26 -06:00
Zdenek Kabelac
400f644286 lv_manip: fix regresion from bf2741376d
Commit bf2741376d started to use
lv_is_active() instead of call for lv_info & info.exists so
we cover also cluster activated devices.
For snapshost the conversion was not correct and introduced
regression by blocking creation of snapshot of inactive LV.

Fix it by assigning lv_is_active() directly.
Note: we still have minor issue to fix - to make
lv_is_???? function able to return error states since
lv_info() may fail.
2012-11-21 12:15:09 +01:00
Zdenek Kabelac
2e96ea4a89 liblvm: internal API change
Return LV/NULL instead of 1/0 which saves lookup for created LV.
2012-11-19 14:37:30 +01:00
Zdenek Kabelac
cf5242a670 lvconvert: store target attributes
Target tells us its version, and we may allow different set of options
to be supported with different version of driver.

Idea is to provide individual feature flags and later be
able to query for them.
2012-11-19 14:17:10 +01:00
Jonathan Brassow
6db461e3b0 mirror/raid: Move 'copy_percent' to common code (mirror.c -> lv_manip.c)
The 'copy_percent' function takes the 'extents_copied' field from each
segment in an LV to create the numerator for the ratio that is to
become the copy_percent.  (Otherwise known as the 'sync' percent for
non-pvmove uses, like mirror LVs and RAID LVs.)  This function safely
works on RAID - not just mirrors - so it is better to have it in
lv_manip.c rather than mirror.c.

There's a lot of different functions that do a lot of different things
in lv_manip.c, so I placed the function near a function in lv_manip.c
that it was close to in metadata-exported.h.  Different placement in the
file or a different name for the function may be useful.
2012-10-23 20:33:54 -05:00
Zdenek Kabelac
bf2741376d Use lv_is_active instead of lv_info()
Usage of lv_is_active makes it more obvious what is being checked.
2012-10-17 15:42:31 +02:00
Zdenek Kabelac
e431b19bac cleanup: move log_error upward in code stack
Report log_error earlier.
2012-10-17 15:41:44 +02:00
Zdenek Kabelac
f260f99d57 cleanup: switch log_error to log_warn
Use log_warn to print non-fatal warning messages.

Use of log_error would confuse checker for testing
whether proper error has been reported for some real error.
2012-10-17 15:41:35 +02:00
Zdenek Kabelac
b89963a7c3 cleanup: swap return values
Use lvm standard return code for success/fail  1/0.
2012-10-17 15:37:26 +02:00
Jonathan Brassow
7519d881ef Clean-up: Adjust message to be clearer on action taken and why
A message is printed when the region_size of a RAID LV is adjusted
to allow for large (> ~1TB) LVs.  The message wasn't very clear.
Hopefully, this is better.
2012-10-15 15:09:05 -05:00
Zdenek Kabelac
6595cae6e9 cleanup: resolve dereferencing type-punned pointer
fix gcc warning:
dereferencing type-punned pointer will break strict-aliasing rules
Replace call by value and pass just const pointer to pvid.
2012-10-14 23:14:00 +02:00
Zdenek Kabelac
be291e1064 thin: lvm2api return origin property for thin LV 2012-10-12 12:20:55 +02:00
Zdenek Kabelac
9ee071705b cleanup: fix compiler warnings
remove unused vars
move var declarations into the front of functions.
fix some sign warnings
2012-10-12 10:25:07 +02:00
Zdenek Kabelac
b6512b10ae cleanup: fix typos 2012-10-10 21:22:11 +02:00
Jonathan Brassow
9efd3fb604 RAID: Do not allow RAID LVs in a cluster volume group.
It would be possible to activate a RAID LV exclusively in a cluster
volume group, but for now we do not allow RAID LVs to exist in a
clustered volume group at all.  This has two components:
	1) Do not allow RAID LVs to be created in a clustered VG
	2) Do not allow changing a VG from single-machine to clustered
	   if there are RAID LVs present.
2012-10-03 15:52:54 -05:00
Zdenek Kabelac
d442c3ef0c liblvm: insert layer with subvolume renames
Rename also subvolumes if we are inserting _tdata layer.
(Currently it breaks mirrors if it would be generic, needs fixing).
2012-10-03 15:13:32 +02:00
Zdenek Kabelac
21c401006c liblvm: add lv_rename_update
Support lv_rename without directly updating metatata.
It can save some metadata commits in some cases,
i.e. when LVs are offline.
2012-10-03 15:03:49 +02:00
Jonathan Brassow
886656e4ac RAID: Fix problems with creating, extending and converting large RAID LVs
MD's bitmaps can handle 2^21 regions at most.  The RAID code has always
used a region_size of 1024 sectors.  That means the size of a RAID LV was
limited to 1TiB.  (The user can adjust the region_size when creating a
RAID LV, which can affect the maximum size.)  Thus, creating, extending or
converting to a RAID LV greater than 1TiB would result in a failure to
load the new device-mapper table.

Again, the size of the RAID LV is not limited by how much space is allocated
for the metadata area, but by the limitations of the MD bitmap.  Therefore,
we must adjust the 'region_size' to ensure that the number of regions does
not exceed the limit.  I've added code to do this when extending a RAID LV
(which covers 'create' and 'extend' operations) and when up-converting -
specifically from linear to RAID1.
2012-09-27 16:51:22 -05:00
Petr Rockai
5f5832e318 lvremove: Ask before discarding data areas. 2012-09-26 17:26:23 +02:00
Petr Rockai
2276379a71 lib/cache/lvmetad: Refactor to use dm_config_tree in requests.
We were using daemon_send_simple until now, but it is no longer adequate, since
we need to manipulate requests in a generic way (adding a validity token to each
request), and the tree-based request interface is much more suitable for this.
2012-09-26 14:49:15 +02:00
Alasdair G Kergon
290ae4791e lvs: add partial attribute 2012-09-19 12:49:40 +01:00
Alasdair G Kergon
b737ff01e4 discards: skip when removing LVs on missing PVs
Don't try to issue discards to a missing PV to avoid segfault.
Prevent lvremove from removing LVs that have any part missing.

https://bugzilla.redhat.com/857554
2012-09-19 12:48:56 +01:00
Jonathan Brassow
2a6712ddef RAID1: Clear the LV_NOTSYNCED flag when a RAID1 LV is converted to linear
Failing to clear the LV_NOTSYNCED flag when converting a RAID1 LV to
linear can result in the flag being present after an upconvert - even
if the sync is performed when upconverting.
2012-09-14 16:26:53 -05:00
Jonathan Brassow
116bcb3ea4 RAID1: Like mirrors, do not allow adding images to LV created w/ --nosync
Mirrors do not allow upconverting if the LV has been created with --nosync.
We will enforce the same rule for RAID1.  It isn't hugely critical, since
the portions that have been written will be copied over to the new device
identically from either of the existing images.  However, the unwritten
sections may be different, causing the added image to be a hybrid of the
existing images.

Also, we are disallowing the addition of new images to a RAID1 LV that has
not completed the initial sync.  This may be different from mirroring, but
that is due to the fact that the 'mirror' segment type "stacks" when adding
a new image and RAID1 does not.  RAID1 will rebuild a newly added image
"inline" from the existant images, so they should be in-sync.
2012-09-14 16:12:52 -05:00
Jonathan Brassow
cdb0339319 RAID: Disallow addition of RAID images while array is not in-sync
We cannot add images to a RAID array while it is not in-sync.  The
kernel will simply reject the table, saying:
	'rebuild' specified while array is not in-sync
Now we check to ensure the LV is in-sync before attempting image
additions.
2012-09-10 17:15:20 -05:00
Jonathan Brassow
b49b98d50c RAID: '--test' should not cause a valid create command to fail
It is necessary when creating a RAID LV to clear the new metadata areas.
Failure to do so could result in a prepopulated bitmap that would cause
the new array to skip syncing portions of the array.  It is a requirement
that the metadata LVs be activated and cleared in the process of creating.
However in test mode, this requirement should be lifted - no new LVs should
be created or written to.
2012-09-05 14:32:06 -05:00
Jonathan Brassow
c3eb3a7687 cleanup: Use segtype->ops->name() instead of segtype->name where applicable
When printing a message for the user and the lv_segment pointer is available,
use segtype->ops->name() instead of segtype->name.  This gives a better
user-readable name for the segment.  This is especially true for the
'striped' segment type, which prints "linear" if there is an area_count of
one.
2012-09-05 11:35:54 -05:00
Alasdair G Kergon
3acc85caa8 buffering: use unbuffered silent mode for liblvm
Disable private buffering when using liblvm.
When private stdin/stdout buffering is not used always use silent
mode.
2012-08-26 00:15:45 +01:00
Alasdair G Kergon
438e0050df config: add silent mode
Accept -q as the short form of --quiet.
Suppress non-essential standard output if -q is given twice.
Treat log/silent in lvm.conf as equivalent to -qq.
Review all log_print messages and change some to
log_print_unless_silent.

When silent, the following commands still produce output:
dumpconfig, lvdisplay, lvmdiskscan, lvs, pvck, pvdisplay,
pvs, version, vgcfgrestore -l, vgdisplay, vgs.
[Needs checking.]

Non-essential messages are shifted from log level 4 to log level 5
for syslog and lvm2_log_fn purposes.
2012-08-25 20:35:48 +01:00
Jonathan Brassow
4047e4dfb1 RAID: Add support for RAID10
This patch adds support for RAID10.  It is not the default at this
stage.  The user needs to specify '--type raid10' if they would like
RAID10 instead of stacked mirror over stripe.
2012-08-24 15:34:19 -05:00
Zdenek Kabelac
7b300a803c cleanup: add some missing stack backtraces 2012-08-23 14:38:48 +02:00
Zdenek Kabelac
fd417db274 check: add internal errors for unexpected paths
Adding couple INTERNAL_ERROR reports for unwanted parameters:

Ensure the 'top' metadata node cannot be NULL for lvmetad.

Make obvious vginfo2 cannot be NULL.

Report internal error if handler and vg is undefined.

Check for handle in poll_vg().

Ensure seg is not NULL in dev_manager_transient().

Report missing read_ahead for _lv_read_ahead_single().

Check for report handler in dm_report_object().

Check missing VG in _vgreduce_single().
2012-08-23 14:37:52 +02:00
Zdenek Kabelac
195fe03075 cleanup: use proper activation_change_t 2012-08-23 14:37:38 +02:00
Zdenek Kabelac
bd67a3151a cleanup: uint64_t casts 2012-08-23 14:37:21 +02:00
Zdenek Kabelac
286cd2006b cleanup: drop unneeded included header files
This headers were not resolving anything used for compiled .c files.
Remove unused util.c file.
2012-08-23 14:37:20 +02:00
Peter Rajnoha
00877fe47b mirror: reconfigure_mirror_images not used 2012-08-15 10:44:19 +02:00
Zdenek Kabelac
54c24193f5 thin: lvcreate --discards 2012-08-09 16:25:52 +02:00
Alasdair G Kergon
701b4a8363 thin: use discards as plural rather than singular
Global change from --discard to --discards, as that feels more natural.
2012-08-07 21:24:41 +01:00
Alasdair G Kergon
7b5ea9a5a8 thin: tighten discard string conversions
Respond with "unknown" rather than a NULL pointer if there's an
internal error and the discard value is invalid.

Don't accept 'no_passdown' or 'no-passdown' variants in the LVM
metadata: this is written by the program so should only ever contain
"nopassdown" and should be validated strictly against that.
2012-08-07 18:37:35 +01:00
Alasdair G Kergon
adfa778a58 thin: order discard enum alphabetically 2012-08-07 18:36:40 +01:00
Alasdair G Kergon
4dbf872a9f reports: invalid snaps do not capitalise lv_attr
No longer capitalise first LV attribute char for invalid snapshots.
This state is available from the 5th char now (I or S).
2012-07-27 20:19:28 +01:00
Jonathan Brassow
5555d2a000 RAID: Fix segfault when attempting to replace RAID 4/5/6 device
Commit 8767435ef8 allowed RAID 4/5/6
LV to be extended properly, but introduced a regression in device
replacement - a critical component of fault tolerance.

When only 1 or 2 drives are being replaced, the 'area_count' needed
can be equal to the parity_count.  The 'area_multiple' for RAID 4/5/6
was computed as 'area_count - parity_devs', which could result in
'area_multiple' being 0.  This would ultimately lead to a division by
zero error.  Therefore, in calc_area_multiple, it is important to take
into account the number of areas that are being requested - just as
we already do in _alloc_init.
2012-07-24 19:02:06 -05:00
Zdenek Kabelac
ebbf7d8e68 thin: add discard support for thin pool
Add arg support for discard.
Add discard ignore, nopassdown, passdown (=default) support.
Flags could be set per pool.

lvcreate [--discard {ignore|no_passdown|passdown}]  vg/thinlv
2012-07-18 14:36:57 +02:00
Jonathan Brassow
8767435ef8 RAID: Fix extending size of RAID 4/5/6 logical volumes.
Reducing a RAID 4/5/6 LV or extending it with a different number of
stripes is still not implemented.  This patch covers the "simple" case
where the LV is extended with the same number of stripes as the orginal.
2012-06-26 09:44:54 -05:00
Alasdair G Kergon
2cec4b4a77 alloc: fix raid --alloc anywhere double allocs
If _alloc_parallel_area for raid devices chooses an area already used
up, it doesn't notice that it has no space left in it and leaves
later code trying to place a zero-length area into the LV.

https://bugzilla.redhat.com/832596
2012-06-28 23:26:42 +01:00
Peter Rajnoha
a2f4ccd839 lvcreate: add --activate ay (autoactivate)
One can use "lvcreate --aay" to have the newly created volume
activated or not activated based on the activation/auto_activation_volume_list
this way.

Note: -Z/--zero is not compatible with -aay, zeroing is not used in this case!
When using lvcreate -aay, a default warning message is also issued that zeroing
is not done.
2012-06-28 09:44:07 -04:00
Peter Rajnoha
95ced7a7be activate: add autoactivation hooks
Define an 'activation_handler' that gets called automatically on
PV appearance/disappearance while processing the lvmetad_pv_found
and lvmetad_pv_gone functions that are supposed to update the
lvmetad state based on PV availability state. For now, the actual
support is for PV appearance only, leaving room for PV disappearance
support as well (which is a more complex problem to solve as this
needs to count with possible device stack).

Add a new activation change mode - CHANGE_AAY exposed as
'--activate ay/-aay' argument ('activate automatically').

Factor out the vgchange activation functionality for use in other
tools (like pvscan...).
2012-06-28 09:42:47 -04:00
Peter Rajnoha
2729720fd3 args: add --activate synonym for --available arg
We're refererring to 'activation' all over the code and we're talking
about 'LVs being activated' all the time so let's use 'activation/activate'
everywhere for clarity and consistency (still providing the old
'available' keyword as a synonym for backward compatibility with
existing environments).
2012-06-28 09:42:44 -04:00
Alasdair G Kergon
07a25c249b discards: don't discard reconfigured extents
Update release_lv_segment_area not to discard any PV extents,
as it also gets used when moving extents between LVs.
Instead, call a new function release_and_discard_lv_segment_area() in
the two places where data should be discarded - lv_reduce() and
remove_mirrors_from_segments().
2012-06-27 22:12:01 +01:00
Alasdair G Kergon
e59f6981e6 discards: split discard from release_pv_segment
Separate discard_pv_segment out of release_pv_segment
2012-06-27 22:11:54 +01:00
Alasdair G Kergon
a5ddb347e5 allocation: allow release_lv_segment_area to fail
Allow release_lv_segment_area to fail as functions it calls can fail.
2012-06-27 22:11:49 +01:00
Zdenek Kabelac
6f3cd63551 cleanup: replace memset with struct initilization
Simplifies the code, properly detects too long socket paths,
drops unused parameter.
2012-06-22 13:23:03 +02:00
Alasdair Kergon
e0ed1b458d Warn of deadlock risk when using snapshots of mirror segment type. 2012-05-14 16:18:57 +00:00
Alasdair Kergon
8b59522d67 Fix cling policy not to behave like normal policy if no previous LV seg.
Fix alloc cling to cling to PVs already found with contiguous policy.
2012-05-11 22:53:13 +00:00
Alasdair Kergon
8a689fd04d Fix allocation policy loop so it doesn't continue beyond cling using later
policies it shouldn't be using when --alloc cling is specified but no tags
are defined.
2012-05-11 22:19:12 +00:00
Alasdair Kergon
01cfbe14f1 Append _TO_LVSEG to names of internal A_CONTIGUOUS and A_CLING flags.
Remove some unnecesary prev_lvseg checks.
2012-05-11 18:59:01 +00:00
Alasdair Kergon
51514ae62f Always include debug mesg when cling to allocated is set. 2012-05-11 15:32:19 +00:00
Alasdair Kergon
086829459b Refactor _has_matching_pv_tag to provide a fn that takes PV structs. 2012-05-11 15:26:30 +00:00
Peter Rajnoha
81c215de54 More comments on metadata area types. 2012-05-10 11:03:07 +00:00
Peter Rajnoha
8c3e4b43f1 Comment on auxiliary metadata areas. 2012-05-10 10:37:49 +00:00
Zdenek Kabelac
98f2e3d974 Fix regression in for_each_sub_lv
pool_lv is not a sub lv in terms for this function.
It has caused problem with renaming thin_volume, where it has tried to
rename pool LV as well.
2012-05-09 12:12:21 +00:00
Jonathan Earl Brassow
eb2d70293d Fix up-convert when mirror activation is controled by volume_list and tags.
When mirrors are up-converted, a transient mirror layer is put in so that
only the new devices are sync'ed.  That transient layer must carry the tags
of the original mirror LV, otherwise it will fail to activate when activation
is regulated by lvm.conf:activation/volume_list.  The conversion would then
fail.

The fix is to do exactly the same thing that is being done for linear ->
mirror converting (lib/metadata/mirror.c:_init_mirror_log()).  We copy the
tags temporarily for the new LV and remove them after the activation.
2012-05-05 02:08:46 +00:00
Jonathan Earl Brassow
1e4e9548b1 Disallow snapshots of mirror segment types.
Snapshots of RAID logical volumes are allowed (including "raid1").  However,
snapshots of "mirror" logical volumes has been disallowed due to unsolvable
issues inherent to the design.  The fact that mirroring (dm-raid1.c) must
stop all I/O as the result of a failure and wait for userspace intervention
can lead to a circular dependency if userspace is simultaneously waiting for
snapshots (on mirrors) to make an I/O update before proceeding.

Various snapshot on mirror tests have been removed as a result.
2012-05-01 19:21:24 +00:00
Jonathan Earl Brassow
ac6e1e3e8d Disallow changing cluster attribute of VG while RAID LVs are active.
Mirror and snapshot LVs are already checked for when switching the cluster
attribute of a VG.  This patch adds RAID.
2012-04-25 13:38:41 +00:00
Jonathan Earl Brassow
dfd024d3a8 Allow a subset of failed devices to be replaced in RAID LVs.
If two devices in an array failed, it was previously impossible to replace
just one of them.  This patch allows for the replacement of some, but perhaps
not all, failed devices.
2012-04-24 20:05:31 +00:00
Jonathan Earl Brassow
a7feae8a6e Fix code that performs RAID device replacement while under snapshot.
The code should have been calling [suspend|resume]_lv_origin() rather than
[suspend|resume]_lv.

This addresses bug 807069.
2012-04-12 03:16:37 +00:00
Jonathan Earl Brassow
187486c7bb Fix inability to split RAID1 image while specifying a particular PV.
The logic for resuming the original and newly split LVs was not properly
done to handle situations where anything but the last device in the array
was split.  It did not take into account the possible name collisions that
might occur when the original LV undergoes the shifting and renaming of its
sub-LVs.
2012-04-11 14:20:19 +00:00
Jonathan Earl Brassow
c0b5886f18 RAID LVs could not handle a down-convert if a device other than the last one
in the array was specified for removal.  This change addresses that (bz806111).
2012-04-11 01:23:29 +00:00
Zdenek Kabelac
8a81716325 Minor fixes
Just small updates and remove <backtrace> after log_error.
2012-03-28 11:11:25 +00:00
Milan Broz
7076d1439b Fix pvmove if LV is activated exclusively but cmirror is not running.
In this case we should allow to use local mirror, check for cmirror
should apply only for lvconvert/lvcreate.

Introduced in 2.02.86 by removing !(lv->status & ACTIVATE_EXCL).

(Partially workaround, it is minimalistic patch for now.)
2012-03-23 16:28:40 +00:00
Zdenek Kabelac
2caa558e7c Update and fix monitoring of thin pool devices
Code adds better support for monitoring of thin pool devices.
update_pool_lv uses DMEVENTD_MONITOR_IGNORE to not manipulate with monitoring.
vgchange & lvchange are checking real thin pool device for existance
as we are using   _tpool  real device and visible LV pool device might not
be even active (_tpool is activated implicitely for any thin volume).
monitor_dev_for_events is another _lv_postorder like code it might be worth
to think about reusing it here - for now update the code to properly
monitory thin volume deps.
For unmonitoring add extra code to check the usage of thin pool - in case it's in use
unmonitoring of thin volume is skipped.
2012-03-23 09:58:04 +00:00
Jonathan Earl Brassow
dc7b1640ed Fix name conflicts that prevent down-converting RAID1 when specifying a device
When down-converting a RAID1 device, it is the last device that is extracted
and removed when the user does not specify a particular device.  However,
when a device is specified (and it is not the last), the device is removed and
the remaining sub-LVs are "shifted down" to fill the hole.  This cause problems
when resuming the LV because if the shifted devices were resumed (and thus
renamed) before the sub-LV being extracted, there would be a name conflict.
The solution is to resume the extracted sub-LVs first so that they can be
properly renamed preventing a possible conflict.

This addresses bug 801967.
2012-03-15 20:00:54 +00:00
Zdenek Kabelac
0f2bbb8763 Removing call of release_vg(NULL)
Since we are in error path were vg must be always NULL,
skip call of release_vg() like we do in other places.
2012-03-12 14:43:12 +00:00
Zdenek Kabelac
e19271418c Using %u modifier to print unsigned values. 2012-03-12 14:41:42 +00:00
Zdenek Kabelac
aa9ebf4494 Switch to normal log_verbose message
Here it's not an error case - so do not push this message to stderr.
2012-03-12 14:18:28 +00:00
Zdenek Kabelac
aeaec150c0 Some more missing supposedly 64bit operations.
Avoid use 32bit math for extent_size.
2012-03-05 15:05:24 +00:00
Zdenek Kabelac
20c40a0807 Use 64bit math
Prevent 32bit overflow and resulting weird error reports when working
with TB sizes..
2012-03-05 14:12:57 +00:00
Zdenek Kabelac
01c62d8f9d Just make error message more clear
Make more obvious, the origin LV for snapshot must be active.
2012-03-04 15:57:27 +00:00
Alasdair Kergon
9c159ea320 Pass struct device around internally rather than dev_t.
Add 3rd daemon return state "unknown" for lookups that are carried out
successfully but don't find the item requested.
Avoid issuing error messages when it's expected that a device that's
being looked up in lvmetad might not be there.
2012-03-02 20:46:36 +00:00
Alasdair Kergon
d742cdf327 Change pvscan --lvmetad to pvscan --cache. 2012-03-02 18:09:46 +00:00
Zdenek Kabelac
6a5706a3a5 Remove support for TRIM message
It's been unsupporte for now - and it's not going to be
implemented for thin pool kernel driver - so dropping
appearence of TRIM from libdm and lvm.
2012-03-02 13:26:08 +00:00
Zdenek Kabelac
52f76a7682 Test alloc fail 2012-03-01 21:49:32 +00:00
Zdenek Kabelac
f874903a35 Add stack trace for lv_reduce
Use common code call with stack trace.

TODO: maybe the release_lv_segment_area()
should be actually able to return error code to upper level.
2012-03-01 10:09:36 +00:00
Zdenek Kabelac
2455ce226d Fix leak of FID structure 2012-03-01 09:46:38 +00:00
Zdenek Kabelac
24ab6328f7 Wipe initial 4KiB for non-zeroed thin volumes
If the thin pool has disabled zeroing  (created with -Zn), we at least
clear initial 4KiB of such thin volume (provisions 1st block).

If lvcreate is executed with '-an' command will abort (same way like we for
normal LV - however for normal LV option -Zn may skip clearing completely,
for thin volumes this option is not supported (applies only for pools).
2012-02-29 22:08:57 +00:00
Alasdair Kergon
5b613cff97 Pass 'single_device' parameter down to suppress 'Can't find uuid' messages
when reading VG text metadate and called from pvscan --lvmetad.

(Longer-term, that check needs moving outside of that code.)
2012-02-29 02:35:35 +00:00
Petr Rockai
996fe0a836 Fix a whitespace bug in the last checkin. 2012-02-29 00:19:14 +00:00
Petr Rockai
059e87ce7b Attempt a fix for lvm shell accumulating copies of orphan PVs with each "pvs"
invocation.
2012-02-29 00:18:27 +00:00
Zdenek Kabelac
0650d875e8 Test dm_hash_insert() failures mem failures 2012-02-28 11:12:58 +00:00
Zdenek Kabelac
58afa3e7fb Check error from _lv_each_dependency
_lv_mark_if_partial_collect cannot fail, however it's good to keep
checking here as we do in all other cases.
2012-02-28 11:10:45 +00:00
Zdenek Kabelac
a46cc72fd2 Add some stack traces for dev_close error paths 2012-02-28 10:11:35 +00:00
Zdenek Kabelac
0475b86268 Explicitely check list size of segments
instead of checking for NULL from last_seg and first_seg.
2012-02-28 10:08:20 +00:00
Zdenek Kabelac
b9141fcefa Add stack traces for lock_vol failures
Adding at least stack traces with some FIXMEs for cases,
where we might want to do something cleaver - maybe fail command
or give user hints something is not going well ?

For remote_backup is stack probably 'good' enough for now.
2012-02-27 11:35:59 +00:00
Zdenek Kabelac
9737943c4c Fix missing break
Bug introduced with addition of internal error default case.
Seem like this code is not used.
TODO: add coverage test.
2012-02-27 11:13:48 +00:00
Zdenek Kabelac
efe228a42e Just reindent with tabs 2012-02-27 09:51:31 +00:00
Zdenek Kabelac
2960604178 Add explicit cast for time() ret value
To keep all numbers with same sign
2012-02-23 22:31:23 +00:00
Zdenek Kabelac
219e040062 Drop backtrace after log_error
Just a minor change to not give backtrace when log_error has been just
reported.
2012-02-23 22:24:47 +00:00
Jonathan Earl Brassow
870762d8e3 Require number of stripes to be greater than parity devices in higher RAID.
Also, add some comments to code that I recently added that may be unclear
otherwise.
2012-02-23 17:36:35 +00:00
Petr Rockai
dae0822698 The lvmetad client-side integration. Only active when use_lvmetad = 1 is set in
lvm.conf *and* lvmetad is running.
2012-02-23 13:11:07 +00:00
Jonathan Earl Brassow
9bdfb30720 Fix allocation code to allow replacement of single RAID 4/5/6 device.
The code fail to account for the case where we just need a single device
in a RAID 4/5/6 array.  There is no good way to tell the allocation functions
that we don't need parity devices when we are allocating just a single device.
So, I've used a bit of a hack.  If we are allocating an area_count that is <=
the parity count, then we can assume we are simply allocating a replacement
device (i.e. no need to include parity devices in the calculations).  This
should make sense in most cases.  If we need to allocate replacement devices
due to failure (or moving), we will never allocate more than the parity count;
or we would cause the array to become unusable.  If we are creating a new device,
we should always create more stripes than parity devices.
2012-02-23 03:57:23 +00:00
Alasdair Kergon
d860272b00 Check all tags and LV names are in a valid form in vg_validate. 2012-02-23 00:11:01 +00:00
Jonathan Earl Brassow
0e92b70f71 *** empty log message *** 2012-02-22 17:14:38 +00:00
Zdenek Kabelac
d81498a824 Initialize dmeventd monitoring for every command
Read lvm.conf setting for monitoring for each command. So we should not
activate monitoring if the default compilation is set to monitor during
lvconvert commnads.

Patch also removes check for  clustered VG and allows to disable monitoring
for clustered VG with the assumption, the problem with monitoring and dmeventd
flag passing for INGNORE is already fixed.
2012-02-15 15:18:43 +00:00
Zdenek Kabelac
73e62cdc11 Add internal error for unsupported code paths
Patch mainly helps static analyzers to better work with code paths
lvm code should never trigger.
2012-02-13 11:25:56 +00:00
Zdenek Kabelac
cbe6bcd593 Add check for rimage name allocation failure 2012-02-13 11:10:37 +00:00
Zdenek Kabelac
52f2f3eae4 Add free_orphan_vg
Move commod code to destroy orphan VG into free_orphan_vg() function.
Use orphan vgmem for creation of PV lists.
Remove some free_pv_fid() calls (FIXME: check all of them)
FIXME: Check whether we could merge release_vg back again for all VGs.
2012-02-13 11:03:59 +00:00
Zdenek Kabelac
65079de265 If the same fid is already same avoid ref_counting 2012-02-13 11:01:34 +00:00
Zdenek Kabelac
960ee343f3 Add missing test for failure of lvmcache_foreach_pv 2012-02-13 10:58:20 +00:00
Zdenek Kabelac
bbf98c19a8 Log error reporting for failing _alloc_pv
Drop unneeded zeroing of zalloced memory region.
2012-02-13 10:51:52 +00:00
Alasdair Kergon
b719e3d323 FMT_INSTANCE_VG is redundant now 2012-02-12 23:01:19 +00:00
Alasdair Kergon
ba14fff2af FMT_INSTANCE_PV is no longer used 2012-02-12 22:37:24 +00:00
Alasdair Kergon
f3f6c17250 use stack consistently if 0 is considered an error 2012-02-12 21:42:43 +00:00
Alasdair Kergon
2a434c479d missing error mesg 2012-02-12 21:37:03 +00:00
Alasdair Kergon
10670c641b remove unused bits after fid changes 2012-02-12 20:19:39 +00:00
Petr Rockai
6e41729eb8 Keep a global (per-format) orphan_vg and keep any and all orphan PVs linked to
it. Avoids the need for FMT_INSTANCE_PV and enables further simplifications. No
functional change, internal refactor only.
2012-02-10 02:53:03 +00:00
Petr Rockai
8e5f7cf3dc Move lvmcache data structures behind an API (making the structures private to
lvmcache.c). No functional change.
2012-02-10 01:28:27 +00:00
Peter Rajnoha
5fa417a9c0 Stop processing lvextend if trying to extend a mirror that is being recovered.
Missing correct return value in lv_extend fn.
2012-02-09 15:13:42 +00:00
Zdenek Kabelac
a7e2da0585 Thin add pool_below_threshold
Test both data and metadata percent usage.
2012-02-08 13:05:38 +00:00
Zdenek Kabelac
94f88a4f14 Fix test for lv_snapshot_percent
Do not check for PERCENT_MERGE_FAILED if the lv_snapshot_percent() failed.
(test for snap_percent would be testing uninitialized value).
2012-02-08 13:02:07 +00:00
Zdenek Kabelac
462835faa0 Switch to return void
List delete cannot fail, so there is no reason to test for error.
2012-02-08 12:52:58 +00:00
Zdenek Kabelac
d75c5f06f0 Replace snprintf with dm_snprintf
snprintf testing for negative is replaced with dm_snprintf where this
test really works.
Add missing test for result of dm_snprintf().
2012-02-08 11:40:02 +00:00
Alasdair Kergon
b167ca28b0 Adjust comments 2012-02-01 15:05:53 +00:00
Zdenek Kabelac
42b5c54092 Add synchornization point in mirror log init.
Put extra sync point when mirror log is deactivated and before
it's activated for the second time.
2012-02-01 13:50:36 +00:00
Alasdair Kergon
1368dc905a lost line 2012-02-01 02:11:43 +00:00
Alasdair Kergon
72abf1d880 Track unreserved space for all alloc policies and then permit NORMAL to place
log and data on same single PV.
2012-02-01 02:10:45 +00:00
Zdenek Kabelac
7012268499 Thin for_each_sub_lv
Adapt to scan thin dependency LVs
2012-01-26 21:39:32 +00:00
Zdenek Kabelac
42beb34826 Set missing header define 2012-01-25 22:37:48 +00:00
Zdenek Kabelac
9b1fe5a062 Thin clear stacked message for thin pool
Before removing thin pool LV always make sure, stacked message
for previous run are cleared - but allow to remove any
device that should have been created
(i.e. creation of snapshot failed - so the message for snapshot creation
may be replaced with delete message within unfinished transaction).

Also commit messages after lv remove - so free space is released in pool.
2012-01-25 11:27:42 +00:00
Zdenek Kabelac
89764fd494 Thin skip activation when there are no thin message
If the list with thin messages is empty, do not touch thin pool device.
2012-01-25 09:17:15 +00:00
Zdenek Kabelac
c771a70882 Thin correct activation order
When the message is passed only in resume path the order needs
to be corrected.
2012-01-25 09:15:44 +00:00
Zdenek Kabelac
3dadb176ce Thin use suspend/resume_lv_origin
Use origin_only support for thin volume when thin snapshot is created.
2012-01-25 09:14:25 +00:00
Zdenek Kabelac
2258242f6c Thin use origin_only for thin pools as well
Extend the usage of origin_only flag to allow resume of thin pool LV
(when it's active) to pass only the messages.

origin_only flag will skip detection of already resumed tree for thin_pool,
so we do not need to suspend the tree and we just send messages.
2012-01-25 09:13:10 +00:00
Zdenek Kabelac
1dede50c85 Thin check for lv_thin_pool_percent error status
Check has been missing.
2012-01-25 09:02:35 +00:00
Zdenek Kabelac
0926438aad Thin prevent removal of its data and metadata LVs
LVs cannot be removed while there are linked to thin pool.
(Gives better error message, than validation).
2012-01-25 08:57:25 +00:00
Zdenek Kabelac
d55aa53816 Thin fix transaction_id incrementation and code refactoring
Add pool_has_message and use it in attach_pool_message.
Also update header to make more obvious which segment type is
expected as parameter.
Rename  'read_only' to  'no_update' (no auto update transaction_id)
to better fit how it's used.
Fix problem when there was only one stacked message replaced with delete
message that caused unwanted transaction_id increase.
2012-01-25 08:55:19 +00:00
Zdenek Kabelac
c217690f4c Thin dependency scan support
Go through pool_lv and metadata_lv LVs when doing recursive scan.
2012-01-25 08:50:10 +00:00
Alasdair Kergon
3f61871f38 Caller is still entitled to reference an LV that's unlinked, so don't
tamper with struct contents.
2012-01-24 14:53:59 +00:00
Jonathan Earl Brassow
6cf3274732 Use suspend|resume_origin_only when up-converting RAID LVs, as mirrors do.
Failure to do so results in "Performing unsafe table load while X device(s) are
known to be suspended" errors.  While fixing the problem in this way works and
is consistent with the way the mirror segment type does it, it would be nice
to find a solution that uses the generic suspend/resume calls.

Also included in this check-in are additions to the test suite that perform
conversions on RAID LVs under a snapshot.  These tests are disabled for the
time being due to a kernel bug that is yet to be tracked down.
2012-01-24 14:33:38 +00:00
Milan Broz
095d95d0a8 Properly show LV removal message.
(Fix regression in commit 6e181ba96d)
2012-01-24 14:15:52 +00:00
Alasdair Kergon
46c67b5279 Use chunk_size consistently for thin_pool within LVM. 2012-01-24 00:55:03 +00:00
Alasdair Kergon
5c9eae9647 Reorder fns in libdm-deptree.
Tweak dm_config interface and remove FIXMEs.
2012-01-23 17:46:31 +00:00
Mike Snitzer
fc0f2d5031 Prompt if request is made to remove a snapshot whose "Merge failed". 2012-01-20 22:04:16 +00:00
Mike Snitzer
27e21a4adc Allow removal of an invalid snapshot that was to be merged on next activation.
Don't allow a user to merge an invalid snapshot.
2012-01-20 22:03:48 +00:00
Mike Snitzer
d658922f36 Use m and M lv_attr to indicate that a snapshot merge failed in lvs.
snapshot (m)erge failed, suspended snapshot (M)erge failed
2012-01-20 22:03:03 +00:00
Zdenek Kabelac
6515946e4d Thin cleanup
Reorder condition so the code is better readable (and shorter).
2012-01-20 10:56:30 +00:00
Zdenek Kabelac
f881095a69 Drop hack in segtype reporting
Since striped name function knows when to report 'linear' instead of
'stripe' type name - drop it from this place.

This fixes problem when reporting segtype e.g. for thin-pool which
is also using area_count=1 to store thin data device reference.

It also returns properly strduped memory instead of badly casted const char*.
2012-01-20 10:55:28 +00:00
Zdenek Kabelac
f82bddb76c Thin disable snapshot creation when pool is over the threshold.
Since snapshot needs to suspend origin - it might lead to pool userspace
deadlock (as the pool will wait for new space in case it would be overfilled,
but dmeventd would not be able to resize it, as the lvcreate operation would
have kept the VG lock.)
To minimize the risk of such scenario - we prevent to create new snapshot
in case we are over the threshold - but beware, there is still small timewindow,
so keep threshold at some reasonable level!
2012-01-19 15:39:41 +00:00
Zdenek Kabelac
e58b5dd8e8 Thin add new display field for lvs
New field Data% is able to display info about
thin_pool, thin, snapshot and has generic meaning here.

Simple Time/Host field are here to display host and time creation.
2012-01-19 15:34:32 +00:00
Zdenek Kabelac
53d7985fa1 Add support to keep info about creation time and host for each LV
Basic support to keep info when the LV was created.
Host and time is stored into LV mda section.

FIXME: Current version doesn't support configurable string via lvm.conf
and used fixed version strftime "%Y-%m-%d %T %z".
2012-01-19 15:31:45 +00:00
Zdenek Kabelac
d8106dfee2 Thin rename seg var pool_metadata_lv to metadata_lv
Better fits the code.
2012-01-19 15:23:50 +00:00
Alasdair Kergon
8f95d94b4f Show read-only activation in display tools. 2012-01-12 16:58:43 +00:00
Zdenek Kabelac
f582793f1b Thin rename internal thin pool segment
Use matching name as kernel target - useful when function like
_percent is using this for validation.
2011-12-21 12:54:19 +00:00
Alasdair Kergon
66e5b7f53c Reinstate support for format1 snapshots, but issue deprecated warning.
I anticipate removing support for snapshots with lvm1-formatted metadata in a
future release.
2011-12-20 00:02:18 +00:00
Alasdair Kergon
289ed221d0 update FIXMEs 2011-12-10 00:47:23 +00:00
Jonathan Earl Brassow
9711057499 Don't allow two images to be split and tracked from a RAID LV at one time
Also, don't allow a splitmirror operation on a RAID LV that is already tracking
a split, unless the operation is to stop the tracking and complete the split.
Example:
~> lvconvert --splitmirrors 1 --trackchanges vg/lv /dev/sdc1
# Now tracking changes - image can be merged back or split-off for good
~> lvconvert --splitmirrors 1 -n new_name vg/lv /dev/sdc1
# ^ Completes split ^

If a split is performed on a RAID that is tracking an already split image and
PVs are provided, we must ensure that
 1) the already split LV is represented in the PVs
 2) we are careful to split only the tracked image
2011-12-01 00:21:04 +00:00
Jonathan Earl Brassow
a927e401f1 Do not allow users to change the name of RAID sub-LVs or the name of the
RAID LV if it is tracking changes for a split image.
2011-12-01 00:09:34 +00:00
Jonathan Earl Brassow
2ba1e8fccc The LV_REBUILD flag is not internal - bad comments in metadata-exported.h updated 2011-11-30 02:20:13 +00:00
Jonathan Earl Brassow
0c506d9a40 Support the ability to replace specific devices in a RAID array.
RAID is not like traditional LVM mirroring.  LVM mirroring required failed
devices to be removed or the logical volume would simply hang.  RAID arrays can
keep on running with failed devices.  In fact, for RAID types other than RAID1,
removing a device would mean substituting an error target or converting to a
lower level RAID (e.g. RAID6 -> RAID5, or RAID4/5 to RAID0).  Therefore, rather
than removing a failed device unconditionally and potentially allocating a
replacement, RAID allows the user to "replace" a device with a new one.  This
approach is a 1-step solution vs the current 2-step solution.

example> lvconvert --replace <dev_to_remove> vg/lv [possible_replacement_PVs]

'--replace' can be specified more than once.

example> lvconvert --replace /dev/sdb1 --replace /dev/sdc1 vg/lv
2011-11-30 02:02:10 +00:00
Zdenek Kabelac
900f5f8187 Replace dynamic buffer allocations for PATH_MAX
Use static buffer instead of stack allocated buffer.
This reduces stack size usage of lvm tool and the
change is very simple.

Since the whole library is not thread safe - it should not
add any new problems - and if there will be some conversion
it's easy to convert this to use some preallocated buffer.
2011-11-18 19:31:09 +00:00
Zdenek Kabelac
8deeeb07ea Unlock memory for vg_write
For write we do not need to hold memory locked.
This relaxes many conditions and avoid problems when allocating
a lot of memory for writting metadata buffers.
(In case of huge MDA size this would lead to mismatch between
locked and unlocked memory region size).

Add also internal check we are not writing in critical section.
2011-11-18 19:28:00 +00:00
Zdenek Kabelac
37f274ced9 Query before removing inactive snapshots
Removal of an inactive origin removes also all related snapshots.

When we now support 'old' external snapshots with thin volumes,
removal of pool will not only drop all thin volumes, but as
a consequence also all snapshots - which might be seen a bit
unexpected for the user - so add a query to confirm such action.

lvremove -f will skip the prompt.
2011-11-18 19:25:20 +00:00
Zdenek Kabelac
91e4512619 Adjusted mirror region size only for mirrors and raids
Update region_size only for mirror and raid targets.
This fixes warning messages when vg is using small
extent size like 1KiB and no mirror/raid is created,
but the user still got the message:

$> vgcreate -s 1K vg  <pvs>
$> lvcreate -L10K vg
Using reduced mirror region size of 4 sectors
2011-11-15 17:32:12 +00:00
Zdenek Kabelac
5f129d15b1 Thin update prompt message
Enhance message with info about how many thin volumes are going to
be removed with thin pool removal.
2011-11-15 17:29:52 +00:00
Zdenek Kabelac
8542953f74 Reorder AND test condition
Take the easiest condition for checking first since they must
apply all together, check local conditions first before doing
more expensive tests.
2011-11-15 17:27:41 +00:00
Zdenek Kabelac
25de8ca372 Thin supports only thin volumes as snapshot origins
It's currently of the scope to properly solve the snapshoting
of internal thin devs so prevent non-toplevel snapshots here.
2011-11-15 17:23:51 +00:00
Zdenek Kabelac
dd0c58c69b Add missing stack reporting
also remove unneeded {}
2011-11-12 22:53:23 +00:00
Zdenek Kabelac
3af072cc63 Thin use items iterator and stack reporting 2011-11-12 22:52:18 +00:00
Zdenek Kabelac
651ef6be82 Missing stack printing 2011-11-12 22:51:20 +00:00
Zdenek Kabelac
6e89eb9a52 Small comment and indent updates 2011-11-10 12:43:05 +00:00
Zdenek Kabelac
f201498f99 Thin test min thin_pool size for at least 1 chunk 2011-11-10 12:42:36 +00:00
Zdenek Kabelac
39fc633957 Thin align volume size on chunk boundary size
If the extent_size is smaller then the chunk_size we may try
to find better aligment (wasting less space).

i.e. using  4KB extent_size and  64KB chunk size will
lead to creation of 64KB aligned thin volume.
2011-11-10 12:42:15 +00:00
Zdenek Kabelac
74e53e8bc0 Thin disable pool create without activation 2011-11-10 12:39:01 +00:00
Alasdair Kergon
3da4ed712e Must not override alloc policy specified by user. 2011-11-07 13:54:54 +00:00
Zdenek Kabelac
65e88e6b3c Thin add error message for double delete
Add few more internal error messages.
2011-11-07 11:04:45 +00:00
Zdenek Kabelac
97d7e5aedb Thin supports snapshots
Full support for thin snapshots.
Create and remove is supported.

TODO: lvconvert support is not yes available.
2011-11-07 11:03:47 +00:00
Zdenek Kabelac
11721819a7 Thin reindent code
Drop indention level
Add extra internal error.
2011-11-07 10:59:07 +00:00
Zdenek Kabelac
87371d48cc Thin revert code for exclusive pool activation
There are no limits on thin-pool activation now.
Revert code that is no longer needed.
2011-11-07 10:58:13 +00:00
Zdenek Kabelac
4079a8f298 Avoid lvextend to overflow
Add extra check to extent_count overflow.
Use internal define MAX_EXTENT_COUNT instead UINT32_MAX.
2011-11-04 22:49:53 +00:00
Zdenek Kabelac
83baa0b778 Thin pool allocation simplified
Support allocation of metadata from the same PV, if the VG
is build only from one PV.

As thinp is not mirror - we do not require 2 PVs
for basic thin usage as user is losing only perfomance.
2011-11-04 22:45:52 +00:00
Zdenek Kabelac
bd15208cd7 Thin add thin_pool_metadata_require_separate_pvs
Allow to set different policy for pool from mirrors.
2011-11-04 22:44:21 +00:00
Zdenek Kabelac
b8cac455bd Thin supports poolmetadatasize setting
Add option to set pool metadatasize.
For passing size parameter reuse region_size.
2011-11-04 22:43:10 +00:00
Alasdair Kergon
13dc67cda7 Add missing lvrename mirrored log recursion in for_each_sub_lv. 2011-11-04 01:31:23 +00:00
Zdenek Kabelac
1cae10a36c Thin keep pool device in the same state
Leave the optimalisation to be done differently and preserve
availability state of the pool device.
2011-11-03 15:58:20 +00:00
Zdenek Kabelac
9aa24bd034 Thin no device is created - so nothing to revert here 2011-11-03 15:46:51 +00:00
Zdenek Kabelac
466a8ebf9d Thin removing unused detach_pool_messages 2011-11-03 14:57:04 +00:00
Zdenek Kabelac
92384bfd0b Thin using update_pool_lv
Replace detach_pool_messages with update_pool_lv.
Move creation code from to 'if' condition into 1.
Ensure creation has finished all previous message operations.
2011-11-03 14:56:20 +00:00
Zdenek Kabelac
73b7bf961b Thin genering update_pool_lv function
Function to trigger pool message passing via resume,
or resize of the pool itself independently on other thins.
2011-11-03 14:53:58 +00:00
Zdenek Kabelac
dc964ab0d3 Thin uses _tdata instead of _tpool for data LV
Switch to different suffix and keep -tpool reserved for overlay device name.
2011-11-03 14:38:36 +00:00
Zdenek Kabelac
1f5c98270d Thin code cleanup
Use iterate_items for list processing.
2011-11-03 14:36:40 +00:00
Zdenek Kabelac
25de9addb6 Thin fix compile warns
Test for dm_snprintf < 0.
Add header for moved backup.
2011-10-30 22:52:08 +00:00
Zdenek Kabelac
7654abc26f Thin creation without activation
All thins are created with the next activation and VG is updated
without messages. Only some basic commands works.
(i.e. lvcreate -an  -V10 -T mvg/pool)
There can be some combination to confuse this system.

This functionality for snapshots is going to be interesting.
2011-10-30 22:07:38 +00:00
Zdenek Kabelac
f0df05e1dd Cleanup unsuccessfully created thin LV
If something fails during creation of thin LV remove such LV
and deactivate in case it's been already tried to activate
(i.e. thin kernel driver fails for some reason.)
2011-10-30 22:02:18 +00:00
Zdenek Kabelac
96279ac1c0 Make detach_pool_message visible for tools
Move there also vg_write and vg_commit.
2011-10-30 22:01:39 +00:00
Zdenek Kabelac
f8d46bd256 Thin cleanups
Fix/cleanup several error messages.
Remove test for seg_is_thin which could never be true there.
Replace (1<<24) with predefined constant.
2011-10-30 22:00:57 +00:00
Zdenek Kabelac
0968dfcd03 Thin support for stripe
Support stripe options to create thin data pool LV.

TODO: combine chunk size and stripe size.
2011-10-28 20:32:54 +00:00
Zdenek Kabelac
daa10ad0fd Thin pool resize support for data LV
Support for extension of pool data LV.

TODO: figure out thin volume for suspend/resume in cluster.
2011-10-28 20:31:01 +00:00
Zdenek Kabelac
e5b12b305f Thin support for lvrename
Rename pool's metadata lv _tmeta together with pool and _tdata.
2011-10-28 20:29:32 +00:00
Zdenek Kabelac
a1d5aaf725 Thin pool activation change
To ensure we properly handle LV cluster locking - explicitely do
not allow to change the availability of the thin pool that is in use
for some thin LV.

As soon as the thin volume is created the only way to activate pool
is via implicit dependency.

Ignore thinpool open count for lv/vgchange operations.
2011-10-28 20:28:00 +00:00
Zdenek Kabelac
2b71bcd0cb Improve lv_extend stack reporting
and some code cleanup with setting return value.
2011-10-28 20:23:24 +00:00
Zdenek Kabelac
c590a9cdbc Thin error messages clenaup and some indent 2011-10-28 20:19:26 +00:00
Zdenek Kabelac
dd3bb2bac3 Remove thin code from mirror/raid lv_extend 2011-10-28 20:18:32 +00:00
Zdenek Kabelac
2fa836e843 Extend virtual segment instead of adding new one
Before adding a new virtual segment to LV, check first whether
the last segment isn't already of the same type. In this case
extend last segment instead of creating the new one.

Thin volumes should have always only 1 virtual segment, but it
helps also to virtual snapshot or error segtype..
2011-10-28 20:17:55 +00:00
Zdenek Kabelac
bd4b840879 Add last_seg
Implement a function to return the last segment in a LV.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
2011-10-28 20:12:54 +00:00
Jonathan Earl Brassow
682309e0b8 Disallow 'mirrored' log for cluster mirrors.
Git commit ID 0864378250 was meant to disallow
'mirrored' logs for cluster mirrors.  However, when add_mirror_log is used
to create the log (as is now the case when using 'lvcreate' or converting only
the log) the check is bypassed.

This patch adds the check to add_mirror_log.
2011-10-25 13:17:04 +00:00
Zdenek Kabelac
eafbdf3029 Don't print char type[8] as a plain string
pvck prints 'extra' character from the label since there is no '\0'
after the struct label entry and just uint64_t follows directly.
So avoid it by limiting 8 chars to be printed.

https://www.redhat.com/archives/lvm-devel/2011-January/msg00109.html

Signed-off-by: Paul Bolle <pebolle tiscali nl>
2011-10-24 10:24:39 +00:00
Zdenek Kabelac
72ff89d279 Always use vg memory pool for allocated lv segment
Remove mem pool parameter from alloc_lv_segment()
Since we should always allocate LV segment from the vg mempool.
2011-10-23 16:02:01 +00:00
Zdenek Kabelac
aef13649ea Remove old thin code from _lv_insert_empty_sublvs
Since thin is not able to use _lv_insert_empty_sublvs,
remove its appearence from this function.

Start to use extend_pool() function for desired functionality
and modify lv_extend() for this.
2011-10-22 16:48:59 +00:00
Zdenek Kabelac
dc225f58a9 Remove extra empty check
dm_list_splice handles empty list itself, no need to duplicate code.
2011-10-22 16:46:34 +00:00
Zdenek Kabelac
f4c77bd0e3 Recoded way to insert thin pool into vg
Code in _lv_insert_empty_sublvs  was not able to provide proper
initialization order for thin pool LV.

New function extend_pool() first adds metadata segment to pool LV which
is still visible. Such LV is activate and cleared.

Then new meta LV is created and metadata segments are moved there.
Now the preallocated pool data segment is attached to the pool LV
and layer _tpool is created. Finaly segment is marked as thin_pool.
2011-10-22 16:44:23 +00:00
Zdenek Kabelac
06b8248d63 Make move_lv_segment non-static
This function could be useful for other _manip source files.

Use dm_list manipulation function for provided functionality,
which make the code more readable and avoid touching list
internal details here.
2011-10-22 16:42:10 +00:00
Zdenek Kabelac
f0c9160df4 Store transaction_id with created thin lv
So we know the creation history and this should be useful with vgcfgrestore.
2011-10-21 11:38:35 +00:00
Zdenek Kabelac
4d925f5785 Remove double-hack for setting metadata size
Drop the second lv_extend and set 128MB directly in the first hack place.
2011-10-21 09:55:50 +00:00
Zdenek Kabelac
3bc417488d Thin pool now support chunk size as well
Use chunksize option to specify data_block_size for thin pool target.
Drop low_water_mark to zero.
2011-10-21 09:55:07 +00:00
Zdenek Kabelac
22f40c4efe Ensure right activation order
Couple FIXMEs put into the code for parts of the code which may be
improved later, since we might be able to add 'lazy' device creation later.
For now require exclusive activation.
2011-10-20 10:35:14 +00:00
Zdenek Kabelac
3f53c059e9 Add _BLOCK_ to define
Use DM_THIN_MIN_DATA_BLOCK_SIZE and
DM_THIN_MAX_DATA_BLOCK_SIZE to make it more obvious, for which
this define is useful in thin API.
2011-10-20 10:28:41 +00:00
Zdenek Kabelac
759b9592ba Update error message
Drop INTERNAL_ERROR from public API functions.
Improve some messages.
2011-10-19 16:42:14 +00:00
Zdenek Kabelac
8de912b677 Simple validation of messages in mda
Check we do not combine multiple messages for same LV target
and switch to use  'delete_id' to make it clear for what this device_id
is being used.
2011-10-19 16:39:09 +00:00
Zdenek Kabelac
3dcce042f6 Drop messages referencing deleted LV
lvremove may remove problematic LV for thin target.
2011-10-19 16:37:30 +00:00
Zdenek Kabelac
97d0f72c92 Just indent changes
Some tabs & spaces.
2011-10-19 16:36:39 +00:00
Zdenek Kabelac
b04e977851 Remove test for thin_pool
Since both functions are called during mda read - we don't have full LV info
at this moment.
2011-10-19 16:32:34 +00:00
Zdenek Kabelac
a25434a3a3 Message support for thin provisiong
lvm part of messaging.

Each message is now stored it's own thin pool section:

message1 {
	create = lv
}

Messages are queued to thin pool dm target when this target
is going to be resumed or used through some dependency.

Currently  'delete' message are purely queued and processed
with next thin pool resume operation (i.e. create_thin).

WARNING - thin provisioning support is developmental code.
2011-10-17 14:17:09 +00:00
Jonathan Earl Brassow
a551de6152 Use a more correct macro for 'seg_is_linear'
It is better to check 'seg->area_count == 1' than '!seg->stripe_size'.
2011-10-14 14:21:32 +00:00
Zdenek Kabelac
d4f134b8f6 Check for refresh_filter failure
Properly detect if the filters were refreshed properly.

(May needs few more fixes ??)

Filter refresh may fail because it may be out of free file descriptors
when clvmd gets overloaded.
2011-10-11 09:09:00 +00:00
Jonathan Earl Brassow
f60175c308 Add the ability to convert LVs of "mirror" segtype to "raid1" segtype.
Example:
~> lvconvert --type raid1 vg/mirror_lv

Steps to convert "mirror" to "raid1"
1) Allocate a RAID metadata LV for each mirror image from the same PVs
   on which they are located.
2) Clear the metadata LVs.  This involves writing LVM metadata, so we don't
   change any aspects of the mirror LV before this so that the user can easily
   remove LVs from the failed convert attempt while retaining the original
   mirror.
3) Remove the mirror log, if it exists.
4) Add metadata LVs to mirror LV
5) Rename mirror sub-lvs (s/mimage/rimage/)
6) Change flags and segtype from mirror to raid1
2011-10-07 14:56:01 +00:00
Jonathan Earl Brassow
d3582e0252 Add the ability to convert linear LVs to RAID1
Example:
~> lvconvert --type raid1 -m 1 vg/lv

The following steps are performed to convert linear to RAID1:
1) Allocate a metadata device from the same PV as the linear device
   to provide the metadata/data LV pair required for all RAID components.
2) Allocate the required number of metadata/data LV pairs for the
   remaining additional images.
3) Clear the metadata LVs.  This performs a LVM metadata update.
4) Create the top-level RAID LV and add the component devices.

We want to make any failure easy to unwind.  This is why we don't create the
top-level LV and add the components until the last step.  Should anything
happen before that, the user could simply remove the unnecessary images.  Also,
we want to ensure that the metadata LVs are cleared before forming the array to
prevent stale information from polluting the new array.

A new macro 'seg_is_linear' was added to allow us to distinguish linear LVs
from striped LVs.
2011-10-07 14:52:26 +00:00
Jonathan Earl Brassow
a80192b6a7 Allow 'nosync' extension of mirrors.
This patch allows a mirror to be extended without an initial resync of the
extended portion.  It compliments the existing '--nosync' option to lvcreate.
This action can be done implicitly if the mirror was created with the '--nosync'
option, or explicitly if the '--nosync' option is used when extending the device.

Here are the operational criteria:
1) A mirror created with '--nosync' should extend with 'nosync' implicitly
[EXAMPLE]# lvs vg; lvextend -L +5G vg/lv ; lvs vg
  LV   VG   Attr     LSize Pool Origin Snap%  Move Log     Copy%  Convert
  lv   vg   Mwi-a-m- 5.00g                         lv_mlog 100.00
  Extending 2 mirror images.
  Extending logical volume lv to 10.00 GiB
  Logical volume lv successfully resized
  LV   VG   Attr     LSize  Pool Origin Snap%  Move Log     Copy%  Convert
  lv   vg   Mwi-a-m- 10.00g                         lv_mlog 100.00

2) The 'M' attribute ('M' signifies a mirror created with '--nosync', while 'm'
signifies a mirror created w/o '--nosync') must be preserved when extending a
mirror created with '--nosync'.  See #1 for example of 'M' attribute.

3) A mirror created without '--nosync' should extend with 'nosync' only when
'--nosync' is explicitly used when extending.
[EXAMPLE]# lvs vg; lvextend -L +5G vg/lv; lvs vg
  LV   VG   Attr     LSize  Pool Origin Snap%  Move Log     Copy%  Convert
  lv   vg   mwi-a-m- 20.00m                         lv_mlog 100.00
  Extending 2 mirror images.
  Extending logical volume lv to 5.02 GiB
  Logical volume lv successfully resized
  LV   VG   Attr     LSize Pool Origin Snap%  Move Log     Copy%  Convert
  lv   vg   mwi-a-m- 5.02g                         lv_mlog   0.39
vs.
[EXAMPLE]# lvs vg; lvextend -L +5G vg/lv --nosync; lvs vg
  LV   VG   Attr     LSize  Pool Origin Snap%  Move Log     Copy%  Convert
  lv   vg   mwi-a-m- 20.00m                         lv_mlog 100.00
  Extending 2 mirror images.
  Extending logical volume lv to 5.02 GiB
  Logical volume lv successfully resized
  LV   VG   Attr     LSize Pool Origin Snap%  Move Log     Copy%  Convert
  lv   vg   Mwi-a-m- 5.02g                         lv_mlog 100.00

4) The 'm' attribute must change to 'M' when extending a mirror created without
'--nosync' is extended with the '--nosync' option.  (See #3 examples above.)

5) An inactive mirror's sync percent cannot be determined definitively, so it
must not be allowed to skip resync.  Instead, the extend should ask the user if
they want to extend while performing a resync.
[EXAMPLE]# lvchange -an vg/lv
[EXAMPLE]# lvextend -L +5G vg/lv
  Extending 2 mirror images.
  Extending logical volume lv to 10.00 GiB
  vg/lv is not active.  Unable to get sync percent.
Do full resync of extended portion of vg/lv?  [y/n]: y
  Logical volume lv successfully resized

6) A mirror that is performing recovery (as opposed to an initial sync) - like
after a failure - is not allowed to extend with either an implicit or
explicit nosync option.  [You can simulate this with a 'corelog' mirror because
when it is reactivated, it must be recovered every time.]
[EXAMPLE]# lvcreate -m1 -L 5G -n lv vg --nosync --corelog
  WARNING: New mirror won't be synchronised. Don't read what you didn't write!
  Logical volume "lv" created
[EXAMPLE]# lvs vg
  LV   VG   Attr     LSize Pool Origin Snap%  Move Log Copy%  Convert
  lv   vg   Mwi-a-m- 5.00g                             100.00
[EXAMPLE]# lvchange -an vg/lv; lvchange -ay vg/lv; lvs vg
  LV   VG   Attr     LSize Pool Origin Snap%  Move Log Copy%  Convert
  lv   vg   Mwi-a-m- 5.00g                               0.08
[EXAMPLE]# lvextend -L +5G vg/lv
  Extending 2 mirror images.
  Extending logical volume lv to 10.00 GiB
  vg/lv cannot be extended while it is recovering.

7) If 'no' is selected in #5 or if the condition in #6 is hit, it should not
result in the mirror being resized or the 'm/M' attribute being changed.


NOTE:  A mirror created with '--nosync' behaves differently than one created
without it when performing an extension.  The former cannot be extended when
the mirror is recovering (unless in-active), while the latter can.  This is
a reasonable thing to do since recovery of a mirror doesn't take long (at
least in the case of an on-disk log) and it would cause far more time in
degraded mode if the extension w/o '--nosync' was allowed.  It might be
reasonable to add the ability to force the operation in the future.  This
should /not/ force a nosync extension, but rather force a sync'ed extension.
IOW, the user would be saying, "Yes, yes... I know recovery won't take long
and that I'll be adding significantly to the time spent in degraded mode, but
I need the extra space right now!".
2011-10-06 15:32:26 +00:00
Jonathan Earl Brassow
b19f01212e Fix splitmirror in cluster having different DM/LVM views of storage.
This patch also does some clean-up of the splitmirrors code.

I've attempted to clean-up the splitmirrors code to make it easier to
understand with fewer operations.  I've tried to reduce the number of
metadata operations without compromising the intermediate stages which
are necessary for easy clean-up in the even of failure.

These changes now correctly handle cluster situations - including exclusive
cluster mirrors.  Whereas before, a splitmirror operation would result in
remote nodes having LVM commands report the newly split LV with a proper
name while DM commands would report the old (pre-split) names of the device.
IOW, there was a kernel/userspace mismatch.
2011-10-06 14:55:39 +00:00
Jonathan Earl Brassow
6c0b0e5d9a Revert initial solution to bug 733114 - I/O error message during splitmirror
The original commit comments can be located via this git commit ID:
	7d8e615c0b

There were three possible solutions to the original problem proposed in the
initial check-in.  The one chosen was as follows:
    2) Do like _remove_mirror_images does and suspend the original, then suspend
    the sub-lv (the error target), then resume the sub-lv, and finally resume the
    original LV.  This seems like extra pointless operations to me, but it doesn't
    produce the error message (although, I'm not sure why) and it allows us to
    leave the visible flag in place.
Turns out, the cluster also views the extra suspend/resume operations as
pointless too and ignores them.  So, this solution doesn't work in a cluster.
Further, I've noticed that in addition to the remote cluster nodes still getting
I/O errors from scanning the error target, they also have a different LVM and
DM views of the same LV.  IOW, while the LVM level (gotten from the LVM metadata)
sees the correct name for the newly split LV, device-mapper still maintains the
old names.

Because the original fix failed to completely fix the problem (or work-around it)
and because a better solution must be found to address the additional cluster
issue of device renaming, I am reverting the above mentioned commit.
2011-10-06 14:49:16 +00:00
Zdenek Kabelac
565a4bfc49 Move defines to header
Make limits for thin data_block_size and device_id part of public API.

FIXME: read them possible from some kernel header file in the future ?
But we may need to support different values for different versions ?
2011-10-06 11:05:56 +00:00
Zdenek Kabelac
01ef6510b0 Missed rename pool->thin_pool
Fix compilation
2011-10-03 19:10:52 +00:00
Zdenek Kabelac
04a4715cb8 Add code to activate thin target
Code to zero pool metadata lv when pool is created.
Add code to create thin target via message sending.

(Revert is missing)
2011-10-03 18:43:39 +00:00
Zdenek Kabelac
d35a117e4b Add simple function for lookup of some free device_id
Initial simple implementation for finding some free device_id.
2011-10-03 18:39:17 +00:00
Zdenek Kabelac
38796c3d47 Fix bad error message for thinp validation 2011-09-29 09:03:36 +00:00
Zdenek Kabelac
aebf2d5cdc Add experimental code for activation of thinp targets
No dm messages yes - just a base functionality in the steps of other targets.
For now usable only for debugging and tracing.
2011-09-29 08:56:38 +00:00
Alasdair Kergon
1c26860d82 Abort if _finish_pvmove suspend_lvs fails instead of cleaning up incompletely.
Change suspend_lvs to call vg_revert internally.
Change vg_revert to void and remove superfluous calls after failed vg_commit.
2011-09-27 17:09:42 +00:00
Jonathan Earl Brassow
efa3621a59 Add 'Volume Type' lv_attr characters for RAID and RAID_IMAGE.
RAID_META is already handled.
2011-09-23 15:17:54 +00:00
Peter Rajnoha
125712bea0 Replace open_count check with holders/mounted_fs check on lvremove path.
Before, we used to display "Can't remove open logical volume" which was
generic. There 3 possibilities of how a device could be opened:
  - used by another device
  - having a filesystem on that device which is mounted
  - opened directly by an application

With the help of sysfs info, we can distinguish the first two situations.
The third one will be subject to "remove retry" logic - if it's opened
quickly (e.g. a parallel scan from within a udev rule run), this will
finish quickly and we can remove it once it has finished. If it's a
legitimate application that keeps the device opened, we'll do our best
to remove the device, but we will fail finally after a few retries.
2011-09-22 17:33:50 +00:00
Jonathan Earl Brassow
40c85cf1d7 When up-converting a RAID1 array, we need to allocate new larger arrays for
seg->areas and seg->meta_areas.  We also need to copy the memory from the
old arrays to the newly allocated arrays.  The amount of memory to copy was
determined by seg->area_count.  However, seg->area_count was being set to the
higher value after copying the 'seg->areas' information, but before copying
the 'seg->meta_areas' information.  This means we were copying more memory
than necessary for 'seg->meta_areas' - something that could lead to a segfault.
2011-09-22 15:33:21 +00:00
Jonathan Earl Brassow
4026cb6fd1 fix compiler warning.
Compiler says variable may be used uninitialized.  It can't be, but we
initialize the variable to NULL anyway.  Also, remove the double initialization
of another variable.
2011-09-19 14:28:23 +00:00
Jonathan Earl Brassow
eb607100ef Fix Bug 738832 - core to disk log conversion fails with internal error
This bug showed up when trying to add a log to a mirror whose images are on
multiple devices.  This is an intra-release regression and no WHATS_NEW
entry will be added.  The error was introduce in the following commit:
	2d8a2f35c7

The solution is to recognise in _alloc_init that if there are no mirrors
or stripes specified, then 'new_extents' should be zero.
2011-09-16 18:39:03 +00:00
Jonathan Earl Brassow
a514067448 After suspend/resume following a splitmirror op, call sync_local_dev_names
to settle udev before calling deactivate_lv.

This is an intra-release regression (no WHATS_NEW entry required).  It is
part of the fix for the current WHATS_NEW entry:
  Work around resume_lv causing error LV scanning during splitmirror operation.
2011-09-16 16:41:37 +00:00
Zdenek Kabelac
a6d50bef2f Remove thin volumes before thin pools
When user wants to remove thin pool - check if there are no thin volumes using it.
If so - query before removal (or -ff for no question) and remove them first.
2011-09-16 12:12:51 +00:00
Zdenek Kabelac
4a0c6df8df Reset LV status when unlinking LV from VG
When LV is unlinked, we want to catch problem in vg_validate,
that LV has changed.

i.e. catch LV has been removed and is no long thin_pool while still
being referenced by some thin volume.
2011-09-16 11:59:22 +00:00
Zdenek Kabelac
94147f3f29 Trim spaces on EOL 2011-09-16 11:53:14 +00:00
Petr Rockai
fd7d4adc57 Fix the divisibility check in the allocator for the mirror+stripe case (require
divisibility by stripe count alone, not by (mirror*stripe)).
2011-09-16 09:59:42 +00:00
Milan Broz
c81a322337 Activate virtual snapshot origin exclusively (only on local node in cluster). 2011-09-14 14:20:16 +00:00
Zdenek Kabelac
e24be2abe4 Add suggest parentheses around '&&'
Follow gcc suggestion.
2011-09-14 10:03:15 +00:00
Zdenek Kabelac
886d005616 LVM_WRITE and LVM_READ are 64bit constants
Revert John patch, which fixed only 1 place where ~LVM_WRITE was in use and
convert ommited LVM_READ/WRITE flags to 64bit constants as well.
(Since both 'status' flags for LV and VG are 64bit.)
2011-09-14 09:57:35 +00:00
Zdenek Kabelac
3e25de05a9 Add missing underscores to local static functions 2011-09-14 09:54:21 +00:00
Jonathan Earl Brassow
462579d54e Additional fixes for lv_mirror_count.
Changing lv_mirror_count to only count the AREA_LVs made the function
stop working for PVMOVE mirrors.  A conditional has been added to fix
that problem.  Additionally, when counting the images in a mirror stack,
we don't need to subtract 1 from the count we get back from the
lv_mirror_count call on the temporary mirror layer.  (This is because we
are no falsely counting the top layer of the temporary mirror.)
2011-09-14 04:10:26 +00:00
Jonathan Earl Brassow
9cb27929e9 Fix for bug 734252 - problem up converting striped mirror after image failure
lv_mirror_count was not able to handle mirrors of stripes properly.  When a
failed device is removed, the MIRRORED status flag is removed from the LV
conditionally based on the results of lv_mirror_count.  However, lv_mirror_count
trusted the MIRRORED flag - thinking any such LV must be mirrored.  It would
happily assign first_seg(lv)->area_count as the number of mirrors, but when
a mirrored striped LV was reduced to a simple striped LV area_count would be
the number of /stripes/ not the number of /mirrors/.  A result higher than 1
would be returned from lv_mirror_count, the MIRRORED flag would not be cleared,
and the LV would fail to be up-converted properly in lvconvert_mirrors_aux
because of it.
2011-09-14 02:45:36 +00:00
Jonathan Earl Brassow
46f0efbfce Fix bug 733400 - Mirror down conversion when specifying the secondary leg is broke
The operation of deactivating the residual error target LV after removing a
mirror layer can cause a "device in-use" conflict with udev.  Giving udev a
poke before calling deactivate_lv eliminates the conflict.  The stick used
to poke udev is 'sync_local_dev_names'.
2011-09-13 21:13:33 +00:00
Jonathan Earl Brassow
c94c47abd7 Fix for bug 737200 - Can't create mirrored-log mirror on a VG with small extents
Kernel requires a mirror to be at least 1 region large.  So,
if our mirror log is itself a mirror, it must be at least
1 region large.  This restriction may not be necessary for
non-mirrored logs, but we apply the rule anyway.

(The other option is to make the region size of the log
mirror smaller than the mirror it is acting as a log for,
but that really complicates things.  It's much easier to
keep the region_size the same for both.)
2011-09-13 18:42:57 +00:00
Jonathan Earl Brassow
f5e43f061a Better fix for bug 737125 - unable to create mirror on 1K extent size VG
WHATS_NEW entry:
Fix log size calculation when only a log is being added to a mirror.

The original fix pass the mirror LV to allocate_extents (rather than
passing NULL) so that _alloc_init could correctly determine the necessary
size of the mirror log.  In the previous check-in, I noted:
    In order to get a decent value computed, we need to pass in the 'lv' argument
    to allocate_extents.  This would normally imply a desire for cling/contiguous
    allocation to the given LV, but since we are not allocating any parallel
    extents and only log extents, it works fine.
However, passing in the LV did have unintended consequences on the placement of
the log.  The better solution is to pass in the number of extext that are in
the mirror LV instead of the LV itself.  This will not cause the allocator to
reserve that number of extents, because 'stripes' and 'mirrors' are specified
as 0.  Thus, 'extents' is used to calculate the size of the log, but won't
affect how much is allocated.
2011-09-13 18:11:38 +00:00
Jonathan Earl Brassow
0c89ef513a Changing RAID status flags to 64-bit broke some binary flag operations.
LVM_WRITE is a 32-bit flag.  Now that RAID[_IMAGE|_META] are 64-bit,
and'ing a RAID LV's status against LVM_WRITE can reset the higher order
flags.

A similar thing will affect thinp flags if not careful.
2011-09-13 16:33:21 +00:00
Jonathan Earl Brassow
cc9dc919e6 Fix for bug 737125 - unable to create mirror on 1K extent size VG
_alloc_init calculates the number of necessary log extents via
'mirror_log_extents'.  'mirror_log_extents' takes 3 arguments: region_size,
pe_size, and size of the mirror LV.  Unfortunately, _alloc_init is guessing at
the mirror size by using 'ah->new_extents / ah->area_multiple' - the number of
extents that the mirror images have.  However, this is /always/ wrong when
allocating the log separately.  Further, the log is always allocated separately
unless we are up-converting the mirror at the same time.  It was by luck alone
that a default value of '1' reflects what we want in most cases.

In order to get a decent value computed, we need to pass in the 'lv' argument
to allocate_extents.  This would normally imply a desire for cling/contiguous
allocation to the given LV, but since we are not allocating any parallel
extents and only log extents, it works fine.
2011-09-13 14:37:48 +00:00
Jonathan Earl Brassow
6d0aa801a0 Fix for bug 733114.
When an image is split from a 2-way mirror, the original mirror is converted to
a linear device.  To do this, the top "layer" must be removed.  The segments
are transferred from the sub-lv to the top-level LV and the link is severed.
The former sub-lv - having its segments transferred - now contains a temporary
error target.

When the original LV is resumed, the old sub-lv that now contains an error
segment is activated and scanned.  This is what causes the I/O error messages.
There are three ways to fix this problem:

1) Do not set the sub-lv which contains the error target as "visible" before
suspending the original LV.  This way, when the original is resumed, the sub-lv
device node is not created and it is not scanned - avoiding the error messages.
 The problem with this approach is that if the machine crashes after the
resume, it leaves the *hidden* LV in place and the user has a more difficult
time noticing that it needs to be cleaned up.  Thus, this type of processing is
frowned upon.

2) Do like _remove_mirror_images does and suspend the original, then suspend
the sub-lv (the error target), then resume the sub-lv, and finally resume the
original LV.  This seems like extra pointless operations to me, but it does not
produce the error message (although, I'm not sure why) and it allows us to
leave the visible flag in place.

3) Flag the sub-lv (error target) with a "do not scan" flag.  This seems like
the cleanest approach, but I have been unable to find the method for doing
this.  LVs get tagged in such a way by _get_udev_flags, but in this case the
resume of the original LV also resumes the error target LV without running it
through _get_udev_flags (likely because they are no longer linked).  Could
there be something wrong in resume_lv?

Option #2 was chosen to fix this bug, but it seems like more of a workaround
for now.
2011-09-13 13:59:19 +00:00
Alasdair Kergon
5081181b5d Append z to lv_attr if new blocks will be zeroed. 2011-09-09 01:15:18 +00:00
Alasdair Kergon
dbb48de507 Add a new 'thin_pool' output field to 'lvs.
A gentle reminder that anyone relying on the output of reporting commands
like lvs in scripts must use -o to guarantee they get the fields they expect.

The default sequence of fields can change from release to release.
Equally, the 'attr' fields can have new values introduced and/or characters
appended to them.
2011-09-09 00:54:49 +00:00
Alasdair Kergon
52e3f9dd5e Add 7th lv_attr char to show the related kernel target.
Add thin volume types to lv_attr.
2011-09-08 20:55:39 +00:00
Alasdair Kergon
ef78ebf35a lvcreate/remove thin_pool and thin volumes (--driverloaded n only) 2011-09-08 16:41:18 +00:00
Alasdair Kergon
1abaaab1bc Terminate pv_attr field correctly. (2.02.86) 2011-09-07 13:42:00 +00:00
Zdenek Kabelac
f32b76a193 Minor change for pv_create api
Switch int to unsigned type.
2011-09-07 08:34:21 +00:00
Alasdair Kergon
bb6f9b10db pool attach fns & more field renaming 2011-09-06 22:43:56 +00:00
Alasdair Kergon
b88362ff95 add thin_manip.c like the other manip files
move basic lv_is_* to macros
data_lv -> pool_lv - we decided to call it 'pool' everywhere now
2011-09-06 19:25:42 +00:00
Alasdair Kergon
2ef5b7cca6 Start using 64-bit status flags - most of the code already handles them.
tdata -> tpool
remove commented out definitions from metadata.h
formatting clean-ups
2011-09-06 18:49:31 +00:00
Alasdair Kergon
dd44cccefe else 2011-09-06 15:39:46 +00:00
Alasdair Kergon
9ac61d2ba2 lvcreate parsing for thin provisioning.
The rest is incomplete so this isn't usable yet.
2011-09-06 00:26:42 +00:00
Jonathan Earl Brassow
da23255cc9 Fix for bug 732142: Unsafe table load during mirror image split
There was a bad sequence:
*) Make changes to LV layout to split images (e.g. 4-way -> 2-way/2-way)
1) vg_write, suspend_lv(original_mirror), vg_commit
2) activate_lv(newly_split_lv)
3) resume_lv(original_mirror)

Step #2 is not allowed.  However, without it, the resume of the original
mirror will also resume its former sub-LVs - making it impossible to
activate the newly split LV due to the changes in layering, pointers, and
names that had already been made.  Additionally, the resume or the original
brings the sub-lv's online with names that differ from the metadata on disk -
also a no-no.  Thus, the split must be done in stages such that the active LVs
always reflect what is in the committed LVM metadata.

First, alter the original mirror by releasing the images.  The images are made
visible and independent as an intermediate stage.  (This way, we can have
consistency between LVM metadata and active LVs.)  The second stage collects
the recently split LVs, deactivates them, forms them into a mirror if necessary,
and then activates them.  It is a bit of a circuitous method, but it is the only
way to split a mirror from a mirror and obey these general rules:
1) Never [de]activate sub-lvs when the top-level LV is suspended
2) Avoid having active LVs that differ from the description in the LVM metadata

Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
2011-09-01 19:22:11 +00:00
Zdenek Kabelac
3caa77f831 Use size_t return type
Since these function returns buffer size - use size_t type for them.
2011-09-01 10:25:22 +00:00
Petr Rockai
e59e2f7c3c Move the core of the lib/config/config.c functionality into libdevmapper,
leaving behind the LVM-specific parts of the code (convenience wrappers that
handle `struct device` and `struct cmd_context`, basically). A number of
functions have been renamed (in addition to getting a dm_ prefix) -- namely,
all of the config interface now has a dm_config_ prefix.
2011-08-30 14:55:15 +00:00
Alasdair Kergon
11bfaa1df8 same for segtype_is_thin 2011-08-26 18:17:05 +00:00
Alasdair Kergon
6fbf1c6b56 seg_is_thin includes both thin_pool and thin_volume 2011-08-26 18:15:14 +00:00
Alasdair Kergon
42914557d5 thin - hide unimplemented dso fn; remove duplicate origin_lv field; add
some lvcreate struct parms
2011-08-26 17:40:53 +00:00
Zdenek Kabelac
e82bd6249b Initial code for read/write of thin metadata lv segments 2011-08-26 13:37:47 +00:00
Zdenek Kabelac
9d32170d5c Add registration of thin_pool segment
Register thin and thin_pool segment via multiple_segtypes.
2011-08-25 10:00:09 +00:00
Alasdair Kergon
f9b92564a7 Fix raid shared lib segtype registration (2.02.87). 2011-08-24 13:41:46 +00:00
Zdenek Kabelac
3ba4a19510 Initial code layout for thin provisioning target
Only registers init_thin_segtype

Option --with-thin=internal needed for compilation.
For now useful only for developememt!
2011-08-24 08:27:49 +00:00
Alasdair Kergon
c31d14d786 Remove incorrect error message added in 2.02.87. 2011-08-19 22:55:07 +00:00
Alasdair Kergon
1d64dcfbf7 clarify comment 2011-08-19 19:35:50 +00:00
Alasdair Kergon
ba7df3de88 avoid multi-line calc with incorrect intermediate var contents 2011-08-19 16:41:26 +00:00
Alasdair Kergon
3250b38583 _ for static fns 2011-08-19 15:59:15 +00:00
Jonathan Earl Brassow
a2facf4ad4 Add ability to merge back a RAID1 image that has been split w/ --trackchanges
Argument layout is very similar to the merge command for snapshots.
2011-08-18 19:43:08 +00:00
Jonathan Earl Brassow
f439e65b64 Add support for m-way to n-way up-convert in RAID1 (no linear to n-way yet)
This patch adds the ability to upconvert a raid1 array - say from 2-way to
3-way.  It does not yet support upconverting linear to n-way.

The 'raid' device-mapper target allows for individual components (images) of
an array to be specified for rebuild.  This mechanism is used when adding
new images to the array so that the new images can be resync'ed while the
rest of the images in the array can remain 'in-sync'.  (There is no
mirror-on-mirror layering required.)
2011-08-18 19:41:21 +00:00
Jonathan Earl Brassow
6d04311efa Add the ability to split an image from the mirror and track changes.
~> lvconvert --splitmirrors 1 --trackchanges vg/lv
The '--trackchanges' option allows a user the ability to use an image of
a RAID1 array for the purposes of temporary read-only access.  The image
can be merged back into the array at a later time and only the blocks that
have changed in the array since the split will be resync'ed.  This
operation can be thought of as a partial split.  The image is never completely
extracted from the array, in that the array reserves the position the device
occupied and tracks the differences between the array and the split image via
a bitmap.  The image itself is rendered read-only and the name (<LV>_rimage_*)
cannot be changed.  The user can complete the split (permanently splitting the
image from the array) by re-issuing the 'lvconvert' command without the
'--trackchanges' argument and specifying the '--name' argument.
	~> lvconvert --splitmirrors 1 --name my_split vg/lv
Merging the tracked image back into the array is done with the '--merge'
option (included in a follow-on patch).
	~> lvconvert --merge vg/lv_rimage_<n>

The internal mechanics of this are relatively simple.  The 'raid' device-
mapper target allows for the specification of an empty slot in an array
via '- -'.  This is what will be used if a partial activation of an array
is ever required.  (It would also be possible to use 'error' targets in
place of the '- -'.)  If a RAID image is found to be both read-only and
visible, then it is considered separate from the array and '- -' is used
to hold it's position in the array.  So, all that needs to be done to
temporarily split an image from the array /and/ cause the kernel target's
bitmap to track (aka "mark") changes made is to make the specified image
visible and read-only.  To merge the device back into the array, the image
needs to be returned to the read/write state of the top-level LV and made
invisible.
2011-08-18 19:38:26 +00:00
Jonathan Earl Brassow
a324baf6a1 Add --splitmirrors support for RAID1 (1 image only)
Users already have the ability to split an image from an LV of "mirror"
segtype.  This patch extends that ability to LVs of "raid1" segtype.

This patch only allows a single image to be split off, however.  (The
"mirror" segtype allows an arbitrary number of images to be split off.
e.g.  4-way => 3-way/linear, 2-way/2-way, linear,3-way)
2011-08-18 19:34:18 +00:00
Jonathan Earl Brassow
63d32fb6a6 When down-converting RAID1, don't activate sub-lvs between suspend/resume
of top-level LV.

We can't activate sub-lv's that are being removed from a RAID1 LV while it
is suspended.  However, this is what was being used to have them show-up
so we could remove them.  'sync_local_dev_names' is a sufficient and
proper replacement and can be done after the top-level LV is resumed.
2011-08-18 19:31:33 +00:00
Jonathan Earl Brassow
4903b85d23 Compiler warning fixes, better error messaging, and cosmetic changes.
1) add new function 'raid_remove_top_layer' which will be useful
to other conversion functions later (also cleans up code)
2) Add error messages if raid_[extract|add]_images fails
3) Add function prototypes to prevent compiler warnings when
compiling with '--with-raid=shared'
2011-08-13 04:28:34 +00:00
Jonathan Earl Brassow
a22515c87f Various code clean-ups (s/malloc/zalloc/, new msgs, etc)
Fix a couple more issues that kabi found.
- Add some error messages in failure cases
- s/malloc/zalloc/
- use vg->vgmem for lv names instead of vg->cmd->mem
2011-08-11 21:32:18 +00:00
Jonathan Earl Brassow
2100c90dd7 Add missing checks for function return codes.
Some functions were being called without having their return values checked.
2011-08-11 19:38:00 +00:00
Jonathan Earl Brassow
b2fa9b43dc Add some log_error msg's and fix potential segfault
Thanks to kabi for spotting these - especially the possibility for
segfault if a loop runs all the way through without finding a match.
2011-08-11 19:17:10 +00:00
Jonathan Earl Brassow
4aebd52c4c Add ability to down-convert RAID1 arrays.
Also, add some simple RAID tests to testsuite.
2011-08-11 18:24:40 +00:00
Zdenek Kabelac
031c986ea8 Lock memory for shared VG
Use debug pool locking functionality. So the command could check,
whether the memory in the pool has not been modified.

For lv_postoder() instead of unlocking and locking for every changed
struct status member do it once when entering and leaving function.
(mprotect would trap each such memory access).
Currently lv_postoder() does not modify other part of vg structure
then status flags of each LV with flags that are reverted back to
its original state after function exit.
2011-08-11 17:34:30 +00:00
Zdenek Kabelac
bb115a7a6c Cache and share generated VG structs
Extend vginfo cache with cached VG structure. So if the same metadata
are use, skip mda decoding in the case, the same data are in use.
This helps for operations like activation of all LVs in one VG,
where same data were decoded giving the same output result.

Patch adds 1-to-1 connection between volume_group and lvmcache_vginfo.
2011-08-11 17:24:23 +00:00
Peter Rajnoha
47d7f00e16 Fix possible format instance memory leaks and premature releases in _vg_read. 2011-08-11 16:31:40 +00:00
Jonathan Earl Brassow
66d9675559 Fix renaming of RAID logical volumes.
The function 'for_each_sub_lv', which rename uses, was not handling the
RAID metadata areas.  Thus, the metadata LVs were not being renamed.
2011-08-11 03:29:51 +00:00
Zdenek Kabelac
530b00a652 Just add new lines between header comment 2011-08-10 20:26:41 +00:00
Zdenek Kabelac
077a6755ff Replace free_vg with release_vg
Move the free_vg() to  vg.c  and replace free_vg  with release_vg
and make the _free_vg internal.

Patch is needed for sharing VG in vginfo cache so the release_vg function name
is a better fit here.
2011-08-10 20:25:29 +00:00
Zdenek Kabelac
789f9c55e5 Remove INCONSISTENT_VG flag
As this flag could not have been set by the current code - removing it.

Note: because of the wrong code logic this call:

lvmcache_update_vg(correct_vg, correct_vg->status & PRECOMMITTED &
			   (inconsistent ? INCONSISTENT_VG : 0));

had always passed '0' - now after flag removal it's passing
PRECOMMITTED flag in - this present functinal change in this patch.

To match the original functionality - 0 had to be always passed.
More testing is needed here.
2011-08-10 20:17:33 +00:00
Jonathan Earl Brassow
e01bcc6884 Fix compiler warning.
Compiler complaining that meta_lv could be used uninitialized.  (Not true
because it is protected by 'clear_metadata'.)  I switched to using 'lv->vg',
as it makes no difference to vg_[write|commit].
2011-08-10 16:44:17 +00:00
Peter Rajnoha
0127a9a525 Remove unused 'origin' variable in lv_remove_single function. 2011-08-05 09:21:13 +00:00
Zdenek Kabelac
425862fb95 Remove unused inconsistent_seqno
Last usage was removed in Petr's commit related to VG mda repair fix
where relaxed check starts to ignore inconsistencies coming from
PVs that are marked MISSING - thus removing unused variable.
2011-08-04 15:18:10 +00:00
Jonathan Earl Brassow
cac52ca4ce Add basic RAID segment type(s) support.
Implementation described in doc/lvm2-raid.txt.

Basic support includes:
- ability to create RAID 1/4/5/6 arrays
- ability to delete RAID arrays
- ability to display RAID arrays
Notable missing features (not included in this patch):
- ability to clean-up/repair failures
- ability to convert RAID segment types
- ability to monitor RAID segment types
2011-08-02 22:07:20 +00:00
Jonathan Earl Brassow
7411a44871 Remove and unneeded parameter from build_parallel_areas_from_lv() 2011-07-19 16:37:42 +00:00
Jonathan Earl Brassow
aa6599e687 Fix potential null ptr deref in 'origin_from_cow'
return NULL rather than segfaulting if lv->snapshot is not set
2011-07-19 16:23:52 +00:00
Alasdair Kergon
ee840ff14c Move snapshot deactivation logic into lib/activate, fixing the
teardown sequence.  (Previously the snapshot was deactivated while its
origin was active and before its removal was committed to disk, so
restarting after a crash at the point would leave corruption.)
2011-07-08 12:48:41 +00:00
Alasdair Kergon
0f2a4ca2b5 When suspending, automatically preload newly-visible existing LVs
Let's find out if this makes things better or worse overall...
2011-06-30 18:25:18 +00:00
Alasdair Kergon
1d7649f36b Reinstate correct permissions when creating mirrors. 2011-06-29 17:05:53 +00:00
Alasdair Kergon
e189a84f57 Append 'm' attribute to pv_attr for missing PVs. 2011-06-29 14:56:33 +00:00
Alasdair Kergon
140615dafb remove unused var after recent patch 2011-06-24 23:39:09 +00:00
Jonathan Earl Brassow
9e0edb7ee5 Fix to preserve exclusive activation of mirror while up-converting.
When an LVM mirror is up-converted (an additional image added), it creates
a temporary mirror stack.  The lower-level mirror in the stack that is
created was not being activated exclusively - violating the exclusive nature
of the original mirror.  We now check for exclusive activation of a mirror
before converting it, and if found, we ensure that the temporary mirror
is also exclusively activated.
2011-06-23 14:00:58 +00:00
Milan Broz
6adbb95b82 Fail allocation if number of extents not divisible by area count
Allocation should fail early if this condition is not met.

Quick fix for https://bugzilla.redhat.com/show_bug.cgi?id=707779
2011-06-23 10:53:24 +00:00
Jonathan Earl Brassow
9e277b9e2c Fix issue preventing cluster mirror creation.
Mirrors used to be created by first creating a linear device and then adding
the other images plus the log.  Now mirrors are created by creating all the
images in one go and then adding the log separately.  The new way ran into
the condition that cluster mirrors cannot change the log type (in the case
of creation, from core -> disk) while the mirror is not active.  (It isn't
active because it is in the process of being created.)  The reason this
condition is in place is because a remote node may have the mirror active, and
we don't want to alter the log underneath it.

What we really needed was a way of checking if the mirror was active remotely
but not locally, and in that case do not allow a change of the log.  I've added
this check, and cluster mirrors can now be created again.
2011-06-22 21:31:21 +00:00
Zdenek Kabelac
bebe60b70c Code move of vg_mark_partial() up in stack
It's useful to keep the partial flag cached - so just move the call
for vg_mark_partil_lvs() into import_vg_from_config_tree() so it gets
evaluated before it goes through the lvmcache.

This patch should not present any functional change.

Note: It is rather temporal solution - proper place is probably inside the
'read' call back - but needs some more discussion.
For now using this minor hack.
2011-06-17 14:39:10 +00:00
Zdenek Kabelac
93a98c2672 Remove unused internal flag ACTIVATE_EXCL from the code 2011-06-17 14:30:58 +00:00
Zdenek Kabelac
f50a76379a Remove test for status flag
As the ACTIVATE_EXCL could be set only in clvmd code - there is no
use for this test in lv_add_mirrors() function only called from
tools context.

FIXME: Add cluster test case for this.
2011-06-17 14:27:34 +00:00