mirror of git://sourceware.org/git/lvm2.git synced 2024-12-21 13:34:40 +03:00
Commit Graph

3066 Commits

Author SHA1 Message Date
Peter Rajnoha
9738a02d3d filter-mpath: fix primary device lookup failure for partition when processing mpath filter
If a persistent filter is used and we're refreshing filters (just like we
do for pvcreate now after commit 54685c20fc),
we can't rely on getting the primary device of the partition from the cache:
such a device could already be filtered out by the persistent filter and we'd
get a device cache lookup failure for it.

For example:

$ lvm dumpconfig --type diff
devices {
	obtain_device_list_from_udev=0
}

$ lsblk /dev/sda
NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda      8:0    0  128M  0 disk
`-sda1   8:1    0  127M  0 part

$ cat /etc/lvm/cache/.cache | grep sda
		"/dev/sda1",

$ pvcreate /dev/sda1
  dev_is_mpath: failed to get device for 8:1
  Physical volume "/dev/sda1" successfully created

The problematic part of the code called dev_cache_get_by_devt
to get the device for the device number supplied. Then the code
used dev_name(dev) to get the name, which was then used to check
whether there's any mpath on top of this dev...

This patch uses sysfs to get the base device name for the partition
instead, hence avoiding the device cache, which is the correct
approach here.
2014-08-08 10:49:19 +02:00
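As a rough illustration of the sysfs lookup this relies on (using the 8:1
major:minor from the example above; the patch itself reads sysfs directly
rather than shelling out), the parent of a partition's /sys/dev/block entry
is the whole-disk device:

$ basename "$(readlink -f /sys/dev/block/8:1/..)"
sda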
Peter Rajnoha
c52c9a1e31 activation: if LV inactive and non-clustered, do not issue "Cannot deactivate" on -aln
The message "Cannot deactivate remotely exclusive device locally." makes
sense only for clustered LV. If the LV is non-clustered, then it's
always exclusive by definition and if it's already deactivated, this
message pops up inappropriately as those two conditions are met.

So issue the message only if the conditions are met AND we have clustered VG.
2014-08-07 16:44:09 +02:00
Peter Rajnoha
ea662ca060 pvmove: remove spurious "Skipping mirror LV" message on pvmove of clustered mirror
With cmirrord, we can do pvmove of clustered mirror. The code checking
suitability of LVs on the PV being moved issued a message if a mirror
LV was found and the VG was clustered. However, the actual pvmove did
work correctly.

The top-level mirror LV is actually skipped in the code since it's
always layered on top of the internal LVs making up the mirror LV, and for
pvmove we consider only these internal devices as they're the ones actually
layered on top of concrete PVs. But we don't need to issue any message
about skipping the top-level mirror LV - it's misleading here.
2014-08-07 15:23:58 +02:00
Alasdair G Kergon
26885ea119 post-release 2014-08-05 02:12:20 +01:00
Alasdair G Kergon
9d4e1e51a9 pre-release 2014-08-05 02:07:35 +01:00
Petr Rockai
46262f163b WHATS_NEW: lvscan --cache SEGV fix 2014-08-04 17:05:08 +02:00
Peter Rajnoha
54685c20fc filters: fix regression caused by commit e80884cd08
Commit e80884cd08 tried to dump filters
for them to be reevaluated when creating a PV to avoid overwriting
any existing signature that may have been created after last
scan/filtering.

However, we need to call refresh_filters instead of
persistent_filter->dump since dump requires proper rescanning to fill
up the persistent filter again. This is true only for pvcreate,
though, not for vgcreate with PV creation, where the scanning happens
before this PV creation and hence the next rescan (if not a full scan)
does not fill the persistent filter.

Also, move refresh_filters so that it's called sooner and only for
pvcreate; vgcreate already calls lvmcache_label_scan(cmd, 2), which
then calls refresh_filters itself, so there's no need to reevaluate this again.

This caused the persistent filter (/etc/lvm/cache/.cache file) to be
wrong and contain only the PV just being processed with
vgcreate <vg_name> <pv_name_to_create>.

This regression caused other block devices to be filtered out when
vgcreate with PV creation was used and the persistent filter was then
used by any other LVM command afterwards.
2014-08-01 11:39:53 +02:00
Alasdair G Kergon
c7b9f0ab42 lvresize: Allow approximation with +%FREE.
Make lvresize -l+%FREE support approximate allocation.

Move the existing 'Reducing/Extending' message to verbose level
and change it to say 'up to' if approximate allocation is being used.

Replace it with a new message that gives the actual old and new size or
says 'unchanged'.
2014-08-01 00:35:43 +01:00
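A hedged usage sketch of the approximate allocation (VG/LV names and sizes
are invented; the output shown is only indicative of the new message):

$ lvresize -l +100%FREE vg/lv
  Size of logical volume vg/lv changed from 1.00 GiB (256 extents) to 3.00 GiB (768 extents).
  Logical volume lv successfully resized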
Peter Rajnoha
ef85997980 metadata: remove spurious "Physical volume <dev_name> not found"
This is an addendum to commit 2e82a070f3,
which fixed these spurious messages that appeared after commit
651d5093ed ("avoid pv_read in
find_pv_by_name").

There was one more "not found" message issued in case the device
could not be found in the device cache (commit 2e82a07 fixed this only
for the PV lookup itself). But if we "allow_unformatted" for
find_pv_by_name, we should not issue this message even if
the device can't be found in the dev cache: we just need to know
whether there's a PV or not so the code can decide on the next steps,
and we don't want to issue any messages if either the device itself
or the PV is not found.

For example, when we were creating a new PV (and so allow_unformatted = 1)
and the device had a signature on it which caused it to be filtered
by device filter (e.g. MD signature if md filtering is enabled),
or it was part of some other subsystem (e.g. multipath), this message
was issued on find_pv_by_name call which was misleading.

Also, remove misleading "stack" call in case find_pv_by_name
returns NULL in pvcreate_check - any error state is reported
later by pvcreate_check code so no need to "stack" here.

There's one more, proper check that issues the "not found" message if
the device can't be found in the device cache within the pvcreate_check fn,
so this situation is still covered properly later in the code.

Before this patch (/dev/sda contains MD signature and is therefore filtered):

$ pvcreate /dev/sda
  Physical volume /dev/sda not found
WARNING: linux_raid_member signature detected on /dev/sda at offset 4096. Wipe it? [y/n]:

With this patch applied:

$ pvcreate /dev/sda
WARNING: linux_raid_member signature detected on /dev/sda at offset 4096. Wipe it? [y/n]:

Non-existent devices are still caught properly:

$ pvcreate /dev/sdx
  Device /dev/sdx not found (or ignored by filtering).
2014-07-31 10:03:30 +02:00
Alasdair G Kergon
7cff640d9a activation: Fix upgrades using uuid suffixes.
2.02.106 added suffixes to some LV uuids in the kernel.

If any of these LVs is activated with 2.02.105 or earlier,
and then a later version is used, the LVs appear invisible and
activation commands fail.

The code now has to check the kernel for both old and new uuids.
2014-07-30 21:55:11 +01:00
Alasdair G Kergon
321bed7137 post-release 2014-07-23 16:23:52 +01:00
Alasdair G Kergon
25fa725b05 pre-release 2014-07-23 16:05:22 +01:00
Petr Rockai
66686a5bc5 WHATS_NEW: lvscan --cache, dmeventd RAID + lvmetad 2014-07-22 22:48:41 +02:00
Zdenek Kabelac
894eda4707 thin and cache: unify pool common code
Fix get_pool_params to only read params.
Add poolmetadataspare option to get_pool_params.
Move all profile code into update_pool_params.
Move recalculate code into pool_manip.c
2014-07-22 22:41:38 +02:00
Alasdair G Kergon
513fd029a6 config: Adjust description of activation_mode. 2014-07-21 15:50:47 +01:00
Zdenek Kabelac
8f9f180139 lvconvert: improve merge validation
Easier check for conflicting options with --merge.
2014-07-17 16:16:00 +02:00
Zdenek Kabelac
7abad9ef88 lvconvert: improve splitsnapshot test
Easier check for conflicting options with --splitsnapshot.
2014-07-17 16:15:30 +02:00
Peter Rajnoha
17b92001ea WHATS_NEW: recent commits
Commits:
d169ff1e03
bccc2bef33
f76879ba44
2014-07-11 14:09:05 +02:00
Zdenek Kabelac
970989655f lvconvert: update for thin and cache
Major update of lvconvert code to handle cache and thin
related targets.

Code tries to unify handling of cache and thin pools.
Better supports lvm2 syntax:

lvconvert --type cache --cachepool vg/pool vg/cache
lvconvert --type thin --thinpool vg/pool vg/extorg
lvconvert --type cache-pool vg/pool
lvconvert --type thin-pool vg/pool

as well as:

lvconvert --cache --cachepool vg/pool vg/cache
lvconvert --thin --thinpool vg/pool vg/extorg
lvconvert --cachepool vg/pool
lvconvert --thinpool vg/pool

While catching many more command line errors.
(Yet a couple of paths still need more tests.)

Detects as many cmdline errors as possible prior to opening the VG.

Uses a single lvconvert_name_params to convert LV names.

Detects as many incompatibilities in the VG as possible prior to prompting.

Uses a single prompt to confirm the whole conversion.

TODO: the code still needs fixes...
2014-07-11 13:32:22 +02:00
Zdenek Kabelac
4db5d78cef display: show C only for cache and cachepool
Keep the target type (attr6) the same as the cache data and metadata volume has
(i.e. it will show 'raid' type if the metadata is raid).
2014-07-11 12:50:44 +02:00
Zdenek Kabelac
56c5ad7b19 lvconvert: snapshot prompts to confirm conversion
Since the type of the passed LV is changed and its data content destroyed,
query the user with a prompt to confirm this operation.
Also add proper wiping of the header.

Using '--yes' will skip this prompt:

lvconvert -s --yes vg/lv vg/lvcow
2014-07-11 12:49:55 +02:00
Zdenek Kabelac
64828d877e lvconvert: fix return codes
Error codes in some functions are directly used
as the command result - thus return 0 is not an error code
but success - switch to proper ECMD_FAILED.
2014-07-11 12:49:02 +02:00
Zdenek Kabelac
f9d80c9d31 cache: add tool support
Introducing cache tool support.
2014-07-11 12:47:35 +02:00
Jonathan Brassow
be75076dfc activation: Add "degraded" activation mode
Currently, we have two modes of activation, an unnamed nominal mode
(which I will refer to as "complete") and "partial" mode.  The
"complete" mode requires that a volume group be 'complete' - that
is, no missing PVs.  If there are any missing PVs, no affected LVs
are allowed to activate - even RAID LVs which might be able to
tolerate a failure.  The "partial" mode allows anything to be
activated (or at least attempted).  If a non-redundant LV is
missing a portion of its addressable space due to a device failure,
it will be replaced with an error target.  RAID LVs will either
activate or fail to activate depending on how badly their
redundancy is compromised.

This patch adds a third option, "degraded" mode.  This mode can
be selected via the '--activationmode {complete|degraded|partial}'
option to lvchange/vgchange.  It can also be set in lvm.conf.
The "degraded" activation mode allows RAID LVs with a sufficient
level of redundancy to activate (e.g. a RAID5 LV with one device
failure, a RAID6 with two device failures, or RAID1 with n-1
failures).  RAID LVs with too many device failures are not allowed
to activate - nor are any non-redundant LVs that may have been
affected.  This patch also makes the "degraded" mode the default
activation mode.

The degraded activation mode does not yet work in a cluster.  A
new cluster lock flag (LCK_DEGRADED_MODE) will need to be created
to make that work.  Currently, there is limited space for this
extra flag and I am looking for possible solutions.  One possible
solution is to usurp LCK_CONVERT, as it is not used.  When the
locking_type is 3, the degraded mode flag simply gets dropped and
the old ("complete") behavior is exhibited.
2014-07-09 22:56:11 -05:00
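A minimal usage sketch (VG/LV names invented; the lvm.conf section placement
is an assumption, while the setting name follows the activation_mode option
described above):

$ lvchange -ay --activationmode degraded vg/raid5_lv
$ vgchange -ay --activationmode partial vg

or persistently in lvm.conf:

activation {
	activation_mode = "degraded"
}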
Peter Rajnoha
bccc2bef33 report: add lv_active_{locally,remotely,exclusively} LV reporting fields
lv_active_{locally,remotely,exclusively} display the original
"lv_active" field split into separate binary (yes/no) fields so that
we can create selection criteria on them.
2014-07-09 15:57:05 +02:00
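For illustration, the new fields can be requested like any other lvs
reporting field (VG name invented):

$ lvs -o lv_name,lv_active,lv_active_locally,lv_active_remotely,lv_active_exclusively vg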
Peter Rajnoha
4b65d7ec72 WHATS_NEW: commits a473435..7021c8f1 2014-07-07 16:52:43 +02:00
Peter Rajnoha
6dc7b783c8 metadata: fix regression causing PVs not in VGs to be marked as allocatable
If the PV is not yet in a VG, it's not allocatable.
A regression introduced by commit 0283c439ec
(_pv_create) and later commit a7ca101517
(pv_read).
2014-07-07 14:07:21 +02:00
Alasdair G Kergon
137ed3081a report: Add lv_parent field.
Only defined for thin/cache/raid/mirror at this stage as it
relies on get_only_segment_using_this_lv().
2014-07-03 23:49:34 +01:00
Alasdair G Kergon
1e1c2769a7 vgsplit: Fix VG component of lvid.
Fix VG component of lvid in vgsplit and vgmerge.
Update vg_validate() to detect the error.
Call lv_is_active() before moving LV into new VG, not after.
2014-07-03 19:06:04 +01:00
Alasdair G Kergon
64ce3a8066 report: Add lv_dm_path and lv_full_name fields. 2014-07-02 17:24:05 +01:00
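An illustrative invocation of the two new fields (VG name invented):

$ lvs -o lv_full_name,lv_dm_path vg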
Alasdair G Kergon
5bfa2ec21d report: Exclude hidden devices from lv_path field. 2014-07-02 14:57:00 +01:00
Zdenek Kabelac
7bdf4719e8 pool: delay conversion prompt
First validate as many params as possible before prompting the user
about conversion to data and metadata LV.
2014-07-02 10:45:39 +02:00
Zdenek Kabelac
3af761ba16 thin: fix chunk_size conversion prompt skip
Using --force only enables prompting for the dangerous operation.
The user has to add --yes to skip this prompt.
2014-07-02 10:43:56 +02:00
Zdenek Kabelac
93a80018ae lvremove: remove thin volumes on damaged pools
Support removal of thin volumes with --force --force
when the thin pool is damaged.

This way it's possible to remove a thin pool with
unrepairable metadata without requiring manual
editing of lvm2 metadata.

lvremove -ff vg/pool

removes all thin volumes and the pool even when
the thin pool cannot be activated (to accept
removal of the thin volumes in kernel metadata).
2014-07-02 10:37:52 +02:00
Zdenek Kabelac
0b872ce870 raid: don't skip prompt with force
'--yes' is meant to be used to skip all new prompts.
(--force just adds more prompts.)
2014-07-02 10:36:32 +02:00
Alasdair G Kergon
c77197c688 make: Fix pofile and .d file generation.
Use builddir not srcdir with make pofile.

Append 'incfile:' lines to %.d files to handle newly-missing dependencies
without 'make clean' after a file is moved or deleted.
2014-07-02 00:48:50 +01:00
Zdenek Kabelac
667f93b7d9 uuid: add more private uuid suffixes
Use suffixes for easier detection of private volumes.

This commit makes older volume UUIDs incompatible and
most probably needs a machine reboot after upgrade.
2014-06-30 12:17:07 +02:00
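For illustration, the kernel device-mapper UUIDs carrying these suffixes can
be inspected for active LVs with dmsetup (no specific suffix value is implied
here beyond what the commit describes):

$ dmsetup info -c -o name,uuid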
Zdenek Kabelac
6da14a82c6 thin: do not create reserved LVs
When creating the pool's metadata - create an initial LV for clearing with some
generic name and, after the volume is created & cleared, rename it to the
reserved name '_tmeta/_cmeta'.

We should not expose 'reserved' names for public LVs.
2014-06-30 12:16:05 +02:00
Zdenek Kabelac
eadcea2dae thin: repaired LV uses _meta%d
Don't leave a 'regular' LV with a reserved suffix for the user.
After successful repair use a 'normal' (non-reserved) LV name
for the backup of the original metadata.
2014-06-30 12:15:13 +02:00
Jonathan Brassow
ed3c2537b8 raid: Allow repair to reuse PVs from same image that suffered a PV failure
When repairing RAID LVs that have multiple PVs per image, allow
replacement images to be reallocated from the PVs that have not
failed in the image if there is sufficient space.

This allows for scenarios where a 2-way RAID1 is spread across 4 PVs,
where each image lives on two PVs but doesn't use the entire space
on any of them.  If one PV fails and there is sufficient space on the
remaining PV in the image, the image can be reallocated on just the
remaining PV.
2014-06-25 22:26:06 -05:00
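A hedged sketch of the repair invocation this change affects (VG/LV name
invented; lvconvert --repair is the usual entry point for replacing failed
RAID images):

$ lvconvert --repair vg/raid1_lv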
Jonathan Brassow
b35fb0b15a raid/misc: Allow creation of parallel areas by LV vs segment
I've changed build_parallel_areas_from_lv to take a new parameter
that allows the caller to build parallel areas by LV vs by segment.
Previously, the function created a list of parallel areas for each
segment in the given LV.  When it came time for allocation, the
parallel areas were honored on a segment basis.  This was problematic
for RAID because any new RAID image must avoid being placed on any
PVs used by other images in the RAID.  For example, if we have a
linear LV that has half its space on one PV and half on another, we
do not want an up-convert to use either of those PVs.  It should
especially not wind up with the following, where the first portion
of one LV is paired up with the second portion of the other:
------PV1-------  ------PV2-------
[ 2of2 image_1 ]  [ 1of2 image_1 ]
[ 1of2 image_0 ]  [ 2of2 image_0 ]
----------------  ----------------
Previously, it was possible for this to happen.  The change makes
it so that the returned parallel areas list contains one "super"
segment (seg_pvs) with a list of all the PVs from every actual
segment in the given LV and covering the entire logical extent range.

This change allows RAID conversions to function properly when there
are existing images that contain multiple segments that span more
than one PV.
2014-06-25 21:20:41 -05:00
Peter Rajnoha
e80884cd08 filters: always reevaluate filter before creating a PV
...to avoid using a cached value (persistent filter) and therefore
not noticing any change made after the last scan/filtering - the state
of the device may have changed, for example new signatures added.

$ lvm dumpconfig --type diff
allocation {
	use_blkid_wiping=0
}
devices {
	obtain_device_list_from_udev=0
}

$ cat /etc/lvm/cache/.cache | grep sda

$ vgscan
  Reading all physical volumes.  This may take a while...
  Found volume group "fedora" using metadata type lvm2

$ cat /etc/lvm/cache/.cache | grep sda
		"/dev/sda",

$ parted /dev/sda mklabel gpt
Information: You may need to update /etc/fstab.

$ parted /dev/sda print
Model: QEMU QEMU HARDDISK (scsi)
Disk /dev/sda: 134MB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags:

Number  Start  End  Size  File system  Name  Flags

$ cat /etc/lvm/cache/.cache | grep sda
		"/dev/sda",

====

Before this patch:
$ pvcreate /dev/sda
  Physical volume "/dev/sda" successfully created

With this patch applied:
$ pvcreate /dev/sda
  Physical volume /dev/sda not found
  Device /dev/sda not found (or ignored by filtering).
2014-06-25 16:24:28 +02:00
Alasdair G Kergon
29ca0573ba post-release 2014-06-23 15:23:09 +01:00
Alasdair G Kergon
8d27f8e003 pre-release 2014-06-23 14:03:32 +01:00
Alasdair G Kergon
78533f72d3 locking: Introduce LCK_ACTIVATION.
Take a local file lock to prevent concurrent activation/deactivation of LVs.
Thin/cache types and an extension for cluster support are excluded for
now.

'lvchange -ay $lv' and 'lvchange -an $lv' should no longer cause trouble
if issued concurrently: the new lock should make sure they
activate/deactivate $lv one-after-the-other, instead of overlapping.

(If anyone wants to experiment with the cluster patch, please get in touch.)
2014-06-20 13:24:02 +01:00
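To illustrate the race this serializes (VG/LV name invented), two commands
hitting the same LV concurrently now take the activation lock in turn instead
of overlapping:

$ lvchange -ay vg/lv & lvchange -an vg/lv; wait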
Alasdair G Kergon
b33091cb11 pvmove: tidy 2014-06-19 13:40:47 +01:00
Zdenek Kabelac
aa3e413093 lvchange: better --refresh of raid and mirrors
Use lv_check_not_in_use() to detect an opened device.
Plain info.open_count is not good enough with udev
randomly opening devices.
2014-06-19 12:01:34 +02:00
Peter Rajnoha
63f5be0170 WHATS_NEW: commits 7dbbc05a69c4cb9756464720cad29e3c1ed971c3..b16f5633ab199dedfd25f08562f686a6fb4aba9d
Report selection support...
2014-06-18 10:48:53 +02:00
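As a hedged illustration of the selection support referenced here (VG name
and criteria invented; the exact value syntax for binary fields is an
assumption), selection criteria are passed with -S/--select:

$ lvs -S 'lv_size > 100m && lv_active_locally=1' vg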
Jonathan Brassow
5ebff6cc9f pvmove: Enable all-or-nothing (atomic) pvmoves
pvmove can be used to move single LVs by name or multiple LVs that
lie within the specified PV range (e.g. /dev/sdb1:0-1000).  When
moving more than one LV, the portions of those LVs that are in the
range to be moved are added to a new temporary pvmove LV.  The LVs
then point to the range in the pvmove LV, rather than the PV
range.

Example 1:
	We have two LVs in this example.  After they were
	created, the first LV was grown, yielding two segments
	in LV1.  So, there are two LVs with a total of three
	segments.

	Before pvmove:
	      ---------  ---------   ---------
	      | LV1s0 |  | LV2s0 |   | LV1s1 |
	      ---------  ---------   ---------
	         |           |           |
	   -------------------------------------
	PV | 000 - 255 | 256 - 511 | 512 - 767 |
	   -------------------------------------

	After pvmove inserts the temporary pvmove LV:
	          ---------   ---------   ---------
	          | LV1s0 |   | LV2s0 |   | LV1s1 |
	          ---------   ---------   ---------
	              |           |           |
	        -------------------------------------
	pvmove0 |   seg 0   |   seg 1   |   seg 2   |
	        -------------------------------------
	              |           |           |
	        -------------------------------------
	PV      | 000 - 255 | 256 - 511 | 512 - 767 |
	        -------------------------------------

	Each of the affected LV segments now points to a
	range of blocks in the pvmove LV, which purposefully
	corresponds to the segments moved from the original
	LVs into the temporary pvmove LV.

The current implementation goes on from here to mirror the temporary
pvmove LV by segment.  Further, as the pvmove LV is activated, only
one of its segments is actually mirrored (i.e. "moving") at a time.
The rest are either complete or not addressed yet.  If the pvmove
is aborted, those segments that are completed will remain on the
destination and those that are not yet addressed or in the process
of moving will stay on the source PV.  Thus, it is possible to have
a partially completed move - some LVs (or certain segments of LVs)
on the source PV and some on the destination.

Example 2:
	What 'example 1' might look like if it was half-way
	through the move.
	             ---------   ---------   ---------
	             | LV1s0 |   | LV2s0 |   | LV1s1 |
	             ---------   ---------   ---------
	                 |           |           |
	           -------------------------------------
	pvmove0    |   seg 0   |   seg 1   |   seg 2   |
	           -------------------------------------
	                 |           |           |
	                 |     -------------------------
	source PV        |     | 256 - 511 | 512 - 767 |
	                 |     -------------------------
	                 |           ||
	           -------------------------
	dest PV    | 000 - 255 | 256 - 511 |
	           -------------------------

This update allows the user to specify that they would like the
pvmove mirror created "by LV" rather than "by segment".  That is,
the pvmove LV becomes an image in an encapsulating mirror along
with the allocated copy image.

Example 3:
	A pvmove that is performed "by LV" rather than "by segment".

	                   ---------   ---------
	                   | LV1s0 |   | LV2s0 |
	                   ---------   ---------
	                       |           |
	                 -------------------------
	        pvmove0  |  * LV-level mirror *  |
	                 -------------------------
                             /                \
	   pvmove_mimage0   /          pvmove_mimage1
	   -------------------------   -------------------------
	   |   seg 0   |   seg 1   |   |   seg 0   |   seg 1   |
	   -------------------------   -------------------------
	        |            |               |           |
	   -------------------------   -------------------------
	   | 000 - 255 | 256 - 511 |   | 000 - 255 | 256 - 511 |
	   -------------------------   -------------------------
	           source PV                    dest PV

The thing that differentiates a pvmove done in this way and a simple
"up-convert" from linear to mirror is the preservation of the
distinct segments.  A normal up-convert would simply allocate the
necessary space with no regard for segment boundaries.  The pvmove
operation must preserve the segments because they are the critical
boundary between the segments of the LVs being moved.  So, when the
pvmove copy image is allocated, all corresponding segments must be
allocated.  The code that merges adjoining segments that are part of
the same LV when the metadata is written must also be avoided in
this case.  This method of mirroring is unique enough to warrant its
own definitional macro, MIRROR_BY_SEGMENTED_LV.  This joins the two
existing macros: MIRROR_BY_SEG (for original pvmove) and MIRROR_BY_LV
(for user created mirrors).

The advantage of performing pvmove in this way is that all of the
affected LVs can be moved together.  It is an all-or-nothing approach
that leaves all LV segments on the source PV if the move is aborted.
Additionally, a mirror log can be used (in the future) to provide tracking
of progress, allowing the copy to continue where it left off in the event
there is a deactivation.
2014-06-17 22:59:36 -05:00
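A brief usage sketch (device names invented), assuming the all-or-nothing
behaviour is requested with pvmove's --atomic option:

$ pvmove --atomic /dev/sdb1 /dev/sdc1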
Zdenek Kabelac
494db11004 snapshot: %ORIGIN is relative to data size
Let's use the size of the origin as the real base for the percentage
calculation, and 'silently' add the needed metadata space for the snapshot.

So now the command 'lvcreate -s -l100%ORIGIN vg/lv' should always create a
snapshot big enough to handle a full device overwrite.
2014-06-17 13:41:01 +02:00