IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
The backup_restore_vg is used directly for restoring the VG from backup.
It's also used to do the VG conversions from one metadata format to
another which means vgconvert calls backup_restore_vg too.
When restoring VG from backup, we need to rewrite/write PV headers as
PVs may have been orphans before and now they're becoming part of some
VG - we need to write the PV_EXT_USED flag at least.
When using the backup_restore_vg for vgconvert, we need to write
completely new PV header in different format.
Avoid the special "pv_write" call and handling that was used before
this patch in vgconvert (vgconvert_single function to be more precise)
and reuse existing internal interface to register PV header for writing
(or rewriting) via vg->pvs_to_write list instead like we do it elsewhere
in the code.
This patch also resolves a problem in which PV headers with target
format were written in the vgconvert_single fn as orphans and VG
metadata were added later on - this was a tiny hack actually.
We can't do this now - we need to write the PV as belonging
to a VG because otherwise the PV_EXT_USED flag won't be written
properly (if the PV header is written as orphan, the PV_EXT_USED
is set to 0, of course, even though metadata are attached later).
So this patch removes this tiny inconsistency which was passing
just fine before because we didn't have any relation to the VG
in PV header before. Now we have the PV_EXT_USED flag which says
the "PV is used in some VG".
The same check as we already do for orphan PVs, just the other way
round now: if the PV is surely part of some VG and any PV the VG
contains does not have the PV_EXT_USED flag set, repair it.
For example - /dev/sda here is in VG vg and it's incorrectly not
marked as used by PV_EXT_USED flag:
pvs --binary -o pv_ext_vsn,pv_in_use
WARNING: Volume Group vg is not consistent.
WARNING: Repairing Physical Volume /dev/sda that is in Volume Group vg but not marked as used.
PV VG Fmt Attr PSize PFree ExtVsn PInUse
/dev/sda vg lvm2 a-- 124.00m 124.00m 2 1
PV header extension versions:
0 - the original PV without any extensions
1 - bootloader area support added
2 - PV_EXT_USED flag support added
So do the associated checks related to PV_EXT_USED flag only if
PV header extension found is of version 2 and higher.
If we know that the PV is orphan, meaning there's at least one MDA on
that PV which does not reference any VG and at the same time there's
PV_EXT_USED flag set, we're certainly in an inconsistent state and we
need to fix this.
For example, such situation can happen during vgremove/vgreduce if we
removed/reduced the VG, but we haven't written PV headers yet because
vgremove stopped abruptly for whatever reason just before writing new
PV headers with updated state, including PV extension flags (and so the
PV_EXT_USED flag).
However, in case the PV has no MDAs at all, we can't double-check
whether the PV_EXT_USED is correct or not - if that PV is marked
as used, it's either:
- really used (but other disks with MDAs are missing)
- or the error state as described above is hit
User needs to overwrite the PV header directly if it's really clear
the PV having no MDAs does not belong to any VG and at the same time
it's still marked as being in use (pvcreate -ff <dev_name> will fix this).
For example - /dev/sda here has 1 MDA, orphan and is incorrectly marked
with PV_EXT_USED flag:
$ pvs --binary -o+pv_in_use
WARNING: Found inconsistent standalone Physical Volumes.
WARNING: Repairing flag incorrectly marking Physical Volume /dev/sda as used.
PV VG Fmt Attr PSize PFree InUse
/dev/sda lvm2 --- 128.00m 128.00m 0
For example:
$ pvs -o pv_name,vg_name,pv_in_use
PV VG InUse
/dev/sda vg used
/dev/sdb
/dev/sdc used
(sda is part of vg - it's used
sdb is not part of vg - it's not used
sdc is part of vg, but MDAs missing - it's used)
Scenario:
$ pvcreate /dev/sda
Physical volume "/dev/sda" successfully created
We're adding the PV to a VG.
Before this patch:
$ vgcreate vg /dev/sda
Physical volume "/dev/sda" successfully created
Volume group "vg" successfully created
With this path applied:
$ vgcreate vg /dev/sda
Volume group "vg" successfully created
...and verbose log containing: "Physical volume "/dev/sda" successfully written"
Make sure we won't use a PV that is already marked as used. Normally,
VG metadata would stop us from doing that, but we can run into a
situation where such metadata is missing because PVs with MDAs
are missing and the PVs left are the ones with 0 MDAs.
(/dev/sda in this example has 0 MDAs and it belongs to a VG,
but other PVs with MDA are missing)
$ pvs -o pv_name,pv_mda_count /dev/sda
PV #PMda
/dev/sda 0
$ pvcreate /dev/sda
PV '/dev/sda' is marked as belonging to a VG but its metadata is missing.
Can't initialize PV '/dev/sda' without -ff.
$ pvchange -u /dev/sda
PV '/dev/sda' is marked as belonging to a VG but its metadata is missing.
Can't change PV '/dev/sda' without -ff.
Physical volume /dev/sda not changed
0 physical volumes changed / 1 physical volume not changed
$ pvremove /dev/sda
PV '/dev/sda' is marked as belonging to a VG but its metadata is missing.
(If you are certain you need pvremove, then confirm by using --force twice.)
$ vgcreate vg /dev/sda
Physical volume '/dev/sda' is marked as belonging to a VG but its metadata is missing.
Unable to add physical volume '/dev/sda' to volume group 'vg'.
We'll use this struct in subsequent patches for PVs which should
be rewritten, not just created. So rename struct pv_to_create to
struct pv_to_write for clarity.
Address this gcc warning:
metadata/lv.c:243: warning: initialized field overwritten
metadata/lv.c:243: warning: (near initialization for 'status.seg_status')
Present with e.g.: gcc version 4.3.2 (Debian 4.3.2-1.1)
Simplify calculation of extents rounding needed for
segment size.
Segment size has to divisible by 'extent count' needed to contain
whole stripe. LVM currently does not support stripes across segment.
In case the stripe size is bigger then extent size,
require bigger rounding.
'verbose' was marked as a boolean option while it
takes integer args - so it has limited usage to 0 or 1,
but we supported 0-4 at least.
Fix it by switching to corrent int type.
(Hopefully noone was trying to use this variable as true/yes/false/no
way - as the would be unsupported/undocumented).
Reporter noticed lvm2 incorrectly translated
lvm2 threshold value to water mark in commit:
99237f0908
Fix it by properly translating size to number of
blocks in thin-pool and then calc for free blocks
matching configured lvm2 threshold value.
Reported-by: Ming-Hung Tsai <mingnus@gmail.com>
Normally, we generate and provide lvm.conf file where use_blkid_wiping
is set based on whether support for this is compiled in or not. This was
generated properly based on configure.
However, if lvm.conf is not used at all (someone deletes it) or the value
in lvm.conf is commented out (user edited it), we still need to use
proper default value that is based on DEFAULT_USE_BLKID_WIPING taken
from configure script - we used hardcoded value of "1" in this case
by mistake.
We already do check for suspended devs within udev rules where
the pvscan is to update lvmetad. So the check for suspended devs
in "pre-lvmetad" chain is not useful here - remove it - it may
be a source of hardly to detect races anyway (if udev rule detects
the device is not suspended and then the pvscan instance sees the
dev as suspended, we may end up not reacting to the event properly).
lvm1 and pool format do not support bootloader areas and we need to
remove any existing associated bootloader areas when we read lvm1 and
pool labels.
This has its importance if we're converting from one format to another
and we're reusing lvmcache in long-running commands (e.g. clvmd or lvm
shell) and we need to make lvmcache consistent and valid for current format.
Non-dm devices have ID_PART_TABLE_TYPE variable exported in
udev db from blkid scan for *both* whole devices and partitions.
We used ID_PART_ENTRY_DISK in addition to decide whether this
is the whole device or partition and then we filtered out only
whole devices where the partition table really is.
However, ID_PART_ENTRY_DISK was added in blkid 2.20 so we need
to use a different set of variables to decide on whole devices
and partitions on systems where older blkid is still used.
Now, we use ID_PART_TABLE_TYPE to detect that there's something
related to partitioning with this device and we use DEVTYPE variable
instead to decide between whole device (DEVTYPE="disk") and partition
(DEVTYPE="partition").
For dm devices it's simpler, we have ID_PART_TABLE_TYPE variable\
set in udev db for whole devices. It's not set for partitions,
hence we don't need more variable in addition to make the decision
on whole device vs. partition (dm devices do not have regular
partitions, hence DEVTYPE can't be used anyway, it's always set
to "disk" for whole disks and partitions).
Add "size" and "size_seqno" to struct device to cache device's size
and also to control its lifetime - the cached value is valid as long
as the global _dev_size_seqno is equal to the device's size_seqno,
otherwise we need to get the size again and cache the new value.
This patch also adds new dev_size_seqno_inc() fn for the appropriate
parts of the code to increment current global value of _dev_size_seqno
and hence to cause all currently cached values for device sizes to
be invalidated.
The device size is now cached because we're planning to reuse this
information for further checks and we want to avoid checking it more
than necessary to save resources.
The extent size must fits all blocks in 4294967295 sectors
(in 512b units) this is 1/2 KiB less then 2TiB.
So while previous statement 'suggested' 2TiB is still acceptable value,
make it clear it's not.
As now we support any multiples of 128KB as extent size -
values like 2047G will still 'flow-in' otherwise the largest power-of-2
supported value is 1TiB.
With 1TiB user needs 8388608 extents for 8EiB device.
(FYI such device is already unusable with todays glibc-2.22.90-27)
4GiB extent size is currently the smallest extent size which allows
a user to create 8EiB devices (with 2GiB it's less then 8EiB).
TODO: lvm2 may possibly print amount of 'lost/unused space' on a PV,
since using such ridiculously sized extent size may result in huge
space being left unaccessible.