IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
Sort out the lvresize calculation code to handle size changes
specified as physical extents as well as logical extents
and to process mirror resizing and raid extensions correctly.
The 'approx alloc' option was masking the underlying problem.
When testing conversion sanity, we checked lv->status & MIRRORED
which encompasses both old mirrors and raid1 mirrors. But we need to
ban only the old mirrors here hence allow raid1 mirrors.
The maximum stripe size is equal to the volume group PE size. If that
size falls below the STRIPE_SIZE_MIN, the creation of RAID 4/5/6 volumes
becomes impossible. (The kernel will fail to load a RAID 4/5/6 mapping
table with a stripe size less than STRIPE_SIZE_MIN.) So, we report an
error if it is attempted.
This is very rare because reducing the PE size down that far limits the
size of the PV below that of modern devices.
The lv_layout and lv_type fields together help with LV identification.
We can do basic identification using the lv_attr field which provides
very condensed view. In contrast to that, the new lv_layout and lv_type
fields provide more detialed information on exact layout and type used
for LVs.
For top-level LVs which are pure types not combined with any
other LV types, the lv_layout value is equal to lv_type value.
For non-top-level LVs which may be combined with other types,
the lv_layout describes the underlying layout used, while the
lv_type describes the use/type/usage of the LV.
These two new fields are both string lists so selection (-S/--select)
criteria can be defined using the list operators easily:
[] for strict matching
{} for subset matching.
For example, let's consider this:
$ lvs -a -o name,vg_name,lv_attr,layout,type
LV VG Attr Layout Type
[lvol1_pmspare] vg ewi------- linear metadata,pool,spare
pool vg twi-a-tz-- pool,thin pool,thin
[pool_tdata] vg rwi-aor--- level10,raid data,pool,thin
[pool_tdata_rimage_0] vg iwi-aor--- linear image,raid
[pool_tdata_rimage_1] vg iwi-aor--- linear image,raid
[pool_tdata_rimage_2] vg iwi-aor--- linear image,raid
[pool_tdata_rimage_3] vg iwi-aor--- linear image,raid
[pool_tdata_rmeta_0] vg ewi-aor--- linear metadata,raid
[pool_tdata_rmeta_1] vg ewi-aor--- linear metadata,raid
[pool_tdata_rmeta_2] vg ewi-aor--- linear metadata,raid
[pool_tdata_rmeta_3] vg ewi-aor--- linear metadata,raid
[pool_tmeta] vg ewi-aor--- level1,raid metadata,pool,thin
[pool_tmeta_rimage_0] vg iwi-aor--- linear image,raid
[pool_tmeta_rimage_1] vg iwi-aor--- linear image,raid
[pool_tmeta_rmeta_0] vg ewi-aor--- linear metadata,raid
[pool_tmeta_rmeta_1] vg ewi-aor--- linear metadata,raid
thin_snap1 vg Vwi---tz-k thin snapshot,thin
thin_snap2 vg Vwi---tz-k thin snapshot,thin
thin_vol1 vg Vwi-a-tz-- thin thin
thin_vol2 vg Vwi-a-tz-- thin multiple,origin,thin
Which is a situation with thin pool, thin volumes and thin snapshots.
We can see internal 'pool_tdata' volume that makes up thin pool has
actually a level10 raid layout and the internal 'pool_tmeta' has
level1 raid layout. Also, we can see that 'thin_snap1' and 'thin_snap2'
are both thin snapshots while 'thin_vol1' is thin origin (having
multiple snapshots).
Such reporting scheme provides much better base for selection criteria
in addition to providing more detailed information, for example:
$ lvs -a -o name,vg_name,lv_attr,layout,type -S 'type=metadata'
LV VG Attr Layout Type
[lvol1_pmspare] vg ewi------- linear metadata,pool,spare
[pool_tdata_rmeta_0] vg ewi-aor--- linear metadata,raid
[pool_tdata_rmeta_1] vg ewi-aor--- linear metadata,raid
[pool_tdata_rmeta_2] vg ewi-aor--- linear metadata,raid
[pool_tdata_rmeta_3] vg ewi-aor--- linear metadata,raid
[pool_tmeta] vg ewi-aor--- level1,raid metadata,pool,thin
[pool_tmeta_rmeta_0] vg ewi-aor--- linear metadata,raid
[pool_tmeta_rmeta_1] vg ewi-aor--- linear metadata,raid
(selected all LVs which are related to metadata of any type)
lvs -a -o name,vg_name,lv_attr,layout,type -S 'type={metadata,thin}'
LV VG Attr Layout Type
[pool_tmeta] vg ewi-aor--- level1,raid metadata,pool,thin
(selected all LVs which hold metadata related to thin)
lvs -a -o name,vg_name,lv_attr,layout,type -S 'type={thin,snapshot}'
LV VG Attr Layout Type
thin_snap1 vg Vwi---tz-k thin snapshot,thin
thin_snap2 vg Vwi---tz-k thin snapshot,thin
(selected all LVs which are thin snapshots)
lvs -a -o name,vg_name,lv_attr,layout,type -S 'layout=raid'
LV VG Attr Layout Type
[pool_tdata] vg rwi-aor--- level10,raid data,pool,thin
[pool_tmeta] vg ewi-aor--- level1,raid metadata,pool,thin
(selected all LVs with raid layout, any raid layout)
lvs -a -o name,vg_name,lv_attr,layout,type -S 'layout={raid,level1}'
LV VG Attr Layout Type
[pool_tmeta] vg ewi-aor--- level1,raid metadata,pool,thin
(selected all LVs with raid level1 layout exactly)
And so on...
_pvcreate_check() has two missing requirements:
After refreshing filters there must be a rescan.
(Otherwise the persistent filter may remain empty.)
After wiping a signature, the filters must be refreshed.
(A device that was previously excluded by the filter due to
its signature might now need to be included.)
If several devices are added at once, the repeated scanning isn't
strictly needed, but we can address that later as part of the command
processing restructuring (by grouping the devices).
Replace the new pvcreate code added by commit
54685c20fc "filters: fix regression caused
by commit e80884cd080cad7e10be4588e3493b9000649426"
with this change to _pvcreate_check().
The filter refresh problem dates back to commit
acb4b5e4de "Fix pvcreate device check."
If using persistent filter and we're refreshing filters (just like we
do for pvcreate now after commit 54685c20fc),
we can't rely on getting the primary device of the partition from the cache
as such device could be already filtered by persistent filter and we get
a device cache lookup failure for such device.
For example:
$ lvm dumpconfig --type diff
devices {
obtain_device_list_from_udev=0
}
$lsblk /dev/sda
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 128M 0 disk
`-sda1 8:1 0 127M 0 part
$cat /etc/lvm/cache/.cache | grep sda
"/dev/sda1",
$pvcreate /dev/sda1
dev_is_mpath: failed to get device for 8:1
Physical volume "/dev/sda1" successfully created
The problematic part of the code called dev_cache_get_by_devt
to get the device for the device number supplied. Then the code
used dev_name(dev) to get the name which is then used in check
whether there's any mpath on top of this dev...
This patch uses sysfs to get the base name for the partition
instead, hence avoiding the device cache which is a correct
approach here.
The message "Cannot deactivate remotely exclusive device locally." makes
sense only for clustered LV. If the LV is non-clustered, then it's
always exclusive by definition and if it's already deactivated, this
message pops up inappropriately as those two conditions are met.
So issue the message only if the conditions are met AND we have clustered VG.
With cmirrord, we can do pvmove of clustered mirror. The code checking
suitability of LVs on the PV being moved issued a message if a mirror
LV was found and the VG was clustered. However, the actual pvmove did
work correctly.
The top-level mirror LV is actually skipped in the code since it's
always layered on top of internal LVs making up the mirror LV and for pvmove
we consider these internal devices only as they're actually layered on
top of concrete PVs then. But we don't need to issue any message here
about skipping the top-level mirror LV - it's misleading here.
Commit e80884cd08 tried to dump filters
for them to be reevaluated when creating a PV to avoid overwriting
any existing signature that may have been created after last
scan/filtering.
However, we need to call refresh_filters instead of
persistent_filter->dump since dump requires proper rescannig to fill
up the persistent filter again. However, this is true only for pvcreate
but not for vgcreate with PV creation where the scanning happens before
this PV creation and hence the next rescan (if not full scan), does not
fill the persistent filter.
Also, move refresh_filters so that it's called sooner and only for
pvcreate, vgcreate already calls lvmcache_label_scan(cmd, 2) which
then calls refresh_filters itself, so no need to reevaluate this again.
This caused the persistent filter (/etc/lvm/cache/.cache file) to be
wrong and contain only the PV just being processed with
vgcreate <vg_name> <pv_name_to_create>.
This regression caused other block devices to be filtered out in case
the vgcreate with PV creation was used and then the persistent filter
is used by any other LVM command afterwards.
Make lvresize -l+%FREE support approximate allocation.
Move existing "Reducing/Extending' message to verbose level
and change it to say 'up to' if approximate allocation is being used.
Replace it with a new message that gives the actual old and new size or
says 'unchanged'.
This is addendum to commit 2e82a070f3
which fixed these spurious messages that appeared after commit
651d5093ed ("avoid pv_read in
find_pv_by_name").
There was one more "not found" message issued in case the device
could not be found in device cache (commit 2e82a07 fixed this only
for PV lookup itself). But if we "allow_unformatted" for
find_pv_by_name, we should not issue this message even in case
the device can't be found in dev cache as we just need to know
whether there's a PV or not for the code to decide on next steps
and we don't want to issue any messages if either device itself
is not found or PV is not found.
For example, when we were creating a new PV (and so allow_unformatted = 1)
and the device had a signature on it which caused it to be filtered
by device filter (e.g. MD signature if md filtering is enabled),
or it was part of some other subsystem (e.g. multipath), this message
was issued on find_pv_by_name call which was misleading.
Also, remove misleading "stack" call in case find_pv_by_name
returns NULL in pvcreate_check - any error state is reported
later by pvcreate_check code so no need to "stack" here.
There's one more and proper check to issue "not found" message if
the device can't be found in device cache within pvcreate_check fn
so this situation is still covered properly later in the code.
Before this patch (/dev/sda contains MD signature and is therefore filtered):
$ pvcreate /dev/sda
Physical volume /dev/sda not found
WARNING: linux_raid_member signature detected on /dev/sda at offset 4096. Wipe it? [y/n]:
With this patch applied:
$ pvcreate /dev/sda
WARNING: linux_raid_member signature detected on /dev/sda at offset 4096. Wipe it? [y/n]:
Non-existent devices are still caught properly:
$ pvcreate /dev/sdx
Device /dev/sdx not found (or ignored by filtering).
2.02.106 added suffixes to some LV uuids in the kernel.
If any of these LVs is activated with 2.02.105 or earlier,
and then a later version is used, the LVs appear invisible and
activation commands fail.
The code now has to check the kernel for both old and new uuids.
Fix get_pool_params to only read params.
Add poolmetadataspare option to get_pool_params.
Move all profile code into update_pool_params.
Move recalculate code into pool_manip.c
Major update of lvconvert code to handle cache and thin.
related targets.
Code tries to unify handling of cache and thin pools.
Better supports lvm2 syntax:
lvconvert --type cache --cachepool vg/pool vg/cache
lvconvert --type thin --thinpool vg/pool vg/extorg
lvconvert --type cache-pool vg/pool
lvconvert --type thin-pool vg/pool
as well as:
lvconvert --cache --cachepool vg/pool vg/cache
lvconvert --thin --thinpool vg/pool vg/extorg
lvconvert --cachepool vg/pool
lvconvert --thinpool vg/pool
While catching much more command line errors.
(Yet couple paths still needs more tests)
Detects as much cmdline errors prior opening VG.
Uses single lvconvert_name_params to convert LV names.
Detects as much incompatibilies in VG prior prompting.
Uses single prompt to confirm whole conversion.
TODO: still the code needs fixes...
Since the type passed LV is changed and content of data detroyed,
query user with prompt to confirm this operation.
Also add a proper wiping of header.
Using '--yes' will skip this prompt:
lvconvert -s --yes vg/lv vg/lvcow
Currently, we have two modes of activation, an unnamed nominal mode
(which I will refer to as "complete") and "partial" mode. The
"complete" mode requires that a volume group be 'complete' - that
is, no missing PVs. If there are any missing PVs, no affected LVs
are allowed to activate - even RAID LVs which might be able to
tolerate a failure. The "partial" mode allows anything to be
activated (or at least attempted). If a non-redundant LV is
missing a portion of its addressable space due to a device failure,
it will be replaced with an error target. RAID LVs will either
activate or fail to activate depending on how badly their
redundancy is compromised.
This patch adds a third option, "degraded" mode. This mode can
be selected via the '--activationmode {complete|degraded|partial}'
option to lvchange/vgchange. It can also be set in lvm.conf.
The "degraded" activation mode allows RAID LVs with a sufficient
level of redundancy to activate (e.g. a RAID5 LV with one device
failure, a RAID6 with two device failures, or RAID1 with n-1
failures). RAID LVs with too many device failures are not allowed
to activate - nor are any non-redundant LVs that may have been
affected. This patch also makes the "degraded" mode the default
activation mode.
The degraded activation mode does not yet work in a cluster. A
new cluster lock flag (LCK_DEGRADED_MODE) will need to be created
to make that work. Currently, there is limited space for this
extra flag and I am looking for possible solutions. One possible
solution is to usurp LCK_CONVERT, as it is not used. When the
locking_type is 3, the degraded mode flag simply gets dropped and
the old ("complete") behavior is exhibited.
lv_active_{locally,remotely,exclusively} display the original
"lv_active" field in a more separate way so that we can create
selection criteria in a binary-based form (yes/no).
Support remove of thin volumes With --force --force
when thin pools is damaged.
This way it's possible to remove thin pool with
unrepairable metadata without requiring to
manually edit lvm2 metadata.
lvremove -ff vg/pool
removes all thin volumes and pool even when
thin pool cannot be activated (to accept
removal of thin volumes in kernel metadata)
Use builddir not srcdir with make pofile.
Append 'incfile:' lines to %.d files to handle newly-missing dependencies
without 'make clean' after a file is moved or deleted.
Use suffixes for easier detection of private volumes.
This commit makes older volume UUIDs incompatible and
it most probably needs machine reboot after upgrade.
When creating pool's metadata - create initial LV for clearing with some
generic name and after the volume is create & cleared - rename it to
reserved name '_tmeta/_cmeta'.
We should not expose 'reserved' names for public LVs.
When repairing RAID LVs that have multiple PVs per image, allow
replacement images to be reallocated from the PVs that have not
failed in the image if there is sufficient space.
This allows for scenarios where a 2-way RAID1 is spread across 4 PVs,
where each image lives on two PVs but doesn't use the entire space
on any of them. If one PV fails and there is sufficient space on the
remaining PV in the image, the image can be reallocated on just the
remaining PV.
I've changed build_parallel_areas_from_lv to take a new parameter
that allows the caller to build parallel areas by LV vs by segment.
Previously, the function created a list of parallel areas for each
segment in the given LV. When it came time for allocation, the
parallel areas were honored on a segment basis. This was problematic
for RAID because any new RAID image must avoid being placed on any
PVs used by other images in the RAID. For example, if we have a
linear LV that has half its space on one PV and half on another, we
do not want an up-convert to use either of those PVs. It should
especially not wind up with the following, where the first portion
of one LV is paired up with the second portion of the other:
------PV1------- ------PV2-------
[ 2of2 image_1 ] [ 1of2 image_1 ]
[ 1of2 image_0 ] [ 2of2 image_0 ]
---------------- ----------------
Previously, it was possible for this to happen. The change makes
it so that the returned parallel areas list contains one "super"
segment (seg_pvs) with a list of all the PVs from every actual
segment in the given LV and covering the entire logical extent range.
This change allows RAID conversions to function properly when there
are existing images that contain multiple segments that span more
than one PV.
...to avoid using cached value (persistent filter) and therefore
not noticing any change made after last scan/filtering - the state
of the device may have changed, for example new signatures added.
$ lvm dumpconfig --type diff
allocation {
use_blkid_wiping=0
}
devices {
obtain_device_list_from_udev=0
}
$ cat /etc/lvm/cache/.cache | grep sda
$ vgscan
Reading all physical volumes. This may take a while...
Found volume group "fedora" using metadata type lvm2
$ cat /etc/lvm/cache/.cache | grep sda
"/dev/sda",
$ parted /dev/sda mklabel gpt
Information: You may need to update /etc/fstab.
$ parted /dev/sda print
Model: QEMU QEMU HARDDISK (scsi)
Disk /dev/sda: 134MB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags:
Number Start End Size File system Name Flags
$ cat /etc/lvm/cache/.cache | grep sda
"/dev/sda",
====
Before this patch:
$ pvcreate /dev/sda
Physical volume "/dev/sda" successfully created
With this patch applied:
$ pvcreate /dev/sda
Physical volume /dev/sda not found
Device /dev/sda not found (or ignored by filtering).