Since we now keep LV names valid all the time (as they are part
of the radix_tree) - there is a problem with this renaming code, which
for a moment used a duplicated name in the vg struct.
Fix it by iterating LVs backwards - which avoids breaking consistency
and also actually makes the code simpler.
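To illustrate why the backward order avoids a transient duplicate, here is a
minimal, purely hypothetical C sketch (not the actual LVM renaming code):
renumbering "lv0".."lv2" to "lv1".."lv3" in forward order would momentarily
reuse a name that a later, not-yet-renamed entry still holds, while the
backward order never does.

    #include <stdio.h>
    #include <string.h>

    #define COUNT 3

    /* Return 1 if 'name' is currently used by any entry other than 'self'. */
    static int name_in_use(char names[][16], int self, const char *name)
    {
        for (int i = 0; i < COUNT; i++)
            if (i != self && !strcmp(names[i], name))
                return 1;
        return 0;
    }

    int main(void)
    {
        char names[COUNT][16] = { "lv0", "lv1", "lv2" };

        /* Rename lvN -> lv(N+1), iterating backwards so the new name is
         * never one that a not-yet-renamed entry still owns. */
        for (int i = COUNT - 1; i >= 0; i--) {
            char new_name[16];

            snprintf(new_name, sizeof(new_name), "lv%d", i + 1);
            if (name_in_use(names, i, new_name)) {
                fprintf(stderr, "transient duplicate: %s\n", new_name);
                return 1;
            }
            strcpy(names[i], new_name);
        }

        for (int i = 0; i < COUNT; i++)
            printf("%s\n", names[i]);
        return 0;
    }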
Fix stripe count and stripe size parameter validation for RAID LVs and
include the existing automatic setting of these parameters based
on the current shape of the RAID LV in case they are not fully set
on the command line.
Previously, this was done only for a certain subset of cases given by this
condition (where 'stripes' is the '-i|--stripes' cmd line arg
and 'stripe_size' is the '-I|--stripesize' cmd line arg):
!(stripes == 1 || (stripes > 1 && stripe_size))
This condition is a bit hard to follow at first sight and there
are no comments around explaining why it is used,
so let's analyze it a bit more.
First, let's convert this to an equivalent condition (De Morgan's law)
so it's easier for humans to read:
stripes != 1 && !(stripes > 1 && stripe_size)
Note: Both stripes and stripe_size are unsigned integers, so they can't be negative.
Now, based on that condition, we were running the code to deduce the
stripes/stripe_size and do the checks ("the code") only if both of these
are true:
- stripes is different from 1
- we don't have stripes > 1 and stripe_size defined at the same time
But this is not correct in all cases (see the small sketch after this list), because:
A) if someone uses stripes = 0, then "the code" is executed
(correct)
B) if someone uses stripes = 1, then "the code" is not executed
(wrong: we still need to be able to check the args against
the existing RAID LV stripes to see whether they match)
- if someone uses stripes > 1, then "the code" is:
C) if stripe_size = 0, executed
(correct)
D) if stripe_size > 0, not executed
(wrong: we still want to check against existing RAID LV stripes)
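The following small standalone C sketch (illustrative only, not the LVM
source) enumerates the four cases above against the original condition and
confirms that B) and D) are the ones where "the code" is wrongly skipped:

    #include <stdio.h>

    /* The original gating condition, as used before this fix. */
    static int runs_the_code(unsigned stripes, unsigned stripe_size)
    {
        return !(stripes == 1 || (stripes > 1 && stripe_size));
    }

    int main(void)
    {
        const struct { const char *label; unsigned stripes, stripe_size; } cases[] = {
            { "A) stripes = 0",                  0, 0 },
            { "B) stripes = 1",                  1, 0 },
            { "C) stripes > 1, stripe_size = 0", 3, 0 },
            { "D) stripes > 1, stripe_size > 0", 3, 128 },
        };

        for (unsigned i = 0; i < sizeof(cases) / sizeof(cases[0]); i++)
            printf("%-33s -> %s\n", cases[i].label,
                   runs_the_code(cases[i].stripes, cases[i].stripe_size) ?
                   "executed" : "NOT executed");
        return 0;
    }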
Current issues with this condition:
Case B) ends up with a segfault:
❯ lvextend -i 1 -l+1 vg/lvol0
Rounding size 4.00 MiB (1 extents) up to stripe boundary size 8.00 MiB (2 extents).
Segmentation fault (core dumped)
Case D) ends up with errors like:
❯ lvextend -i 3 -l+1 -I128k vg/lvol0
Rounding size 4.00 MiB (1 extents) up to stripe boundary size 8.00 MiB (2 extents).
Rounding size (4 extents) up to stripe boundary size for segment (5 extents).
Size of logical volume vg/lvol0 changed from 8.00 MiB (2 extents) to 20.00 MiB (5 extents).
LV lvol0: segment 1 with len=5 has inconsistent area_len 3
Couldn't read all logical volumes for volume group vg.
Failed to write VG vg.
Conclusion:
The condition needs to be removed so we always run "the code" to check
the striping args given on the command line against the existing RAID LV
striping. The reason is that we don't want to allow changing the stripe
count for RAID LVs through lvextend and we need to end up with the
error:
"Unable to extend <RAID segment type> segment type with different number of stripes"
(Changing the striping is supported only through lvconvert's reshaping functionality.)
When converting a VG to locktype sanlock, a new
lease is allocated for each existing LV. Finding
a new lease location involved searching the lvmlock
LV from the start for an unused location, which
would be very slow with many LVs. Improve this by
starting each search from the last used location.
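A minimal sketch of the idea (hypothetical helpers and a plain bitmap in
place of the real lvmlock LV layout, not the lvmlockd code): remember the
offset of the last allocated lease and start the next scan there instead of
always scanning from the beginning.

    #include <stdio.h>
    #include <stdbool.h>
    #include <stddef.h>

    /* Hypothetical lease slot map standing in for the lvmlock LV. */
    #define SLOT_COUNT 1024
    static bool slot_used[SLOT_COUNT];

    /* Offset of the most recently allocated lease; the next search starts
     * here, so allocating N leases in a row costs O(N) instead of O(N^2). */
    static size_t last_used;

    static int find_free_lease_slot(size_t *result)
    {
        for (size_t n = 0; n < SLOT_COUNT; n++) {
            size_t i = (last_used + n) % SLOT_COUNT;

            if (!slot_used[i]) {
                slot_used[i] = true;
                last_used = i;
                *result = i;
                return 0;
            }
        }
        return -1;    /* no free slot left */
    }

    int main(void)
    {
        size_t slot;

        for (int i = 0; i < 5; i++)
            if (!find_free_lease_slot(&slot))
                printf("allocated lease slot %zu\n", slot);
        return 0;
    }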
Searching for a committed LV by uuid can
be expensive for commands like 'vgremove' - so for
this part introduce an 'lv_uuids' radix_tree that is
built on the first access to lv_committed().
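A minimal sketch of the lazy-build pattern described here, with a sorted
array and hypothetical stand-in types in place of the real radix_tree and
LVM structures: the uuid index is created on the first by-uuid lookup and
reused by all later ones.

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    struct lv {            /* hypothetical stand-in for a committed LV */
        char uuid[40];
        char name[32];
    };

    struct vg {            /* hypothetical stand-in for the committed VG */
        struct lv *lvs;
        unsigned lv_count;
        struct lv **uuid_index;    /* built lazily on first by-uuid lookup */
    };

    static int cmp_lv_uuid(const void *a, const void *b)
    {
        const struct lv *const *la = a, *const *lb = b;

        return strcmp((*la)->uuid, (*lb)->uuid);
    }

    static int cmp_key_uuid(const void *key, const void *elem)
    {
        const struct lv *const *le = elem;

        return strcmp(key, (*le)->uuid);
    }

    static struct lv *find_lv_by_uuid(struct vg *vg, const char *uuid)
    {
        struct lv **found;

        if (!vg->uuid_index) {
            /* First access: build and sort the index once. */
            if (!(vg->uuid_index = malloc(vg->lv_count * sizeof(*vg->uuid_index))))
                return NULL;
            for (unsigned i = 0; i < vg->lv_count; i++)
                vg->uuid_index[i] = &vg->lvs[i];
            qsort(vg->uuid_index, vg->lv_count, sizeof(*vg->uuid_index), cmp_lv_uuid);
        }

        found = bsearch(uuid, vg->uuid_index, vg->lv_count,
                        sizeof(*vg->uuid_index), cmp_key_uuid);
        return found ? *found : NULL;
    }

    int main(void)
    {
        struct lv lvs[2] = { { "aaaa-1111", "lvol0" }, { "bbbb-2222", "lvol1" } };
        struct vg vg = { lvs, 2, NULL };
        struct lv *lv = find_lv_by_uuid(&vg, "bbbb-2222");

        printf("%s\n", lv ? lv->name : "not found");
        free(vg.uuid_index);
        return 0;
    }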
Since there is a group of commands that need to access 'lv_list'
while still needing to search for an LV by its name, make the whole
struct lv_list a member of the logical_volume structure.
This makes it easy to also return the 'lv_list' that links this LV
within the VG.
Also the patch should not use more memory, since we were allocating
an lv_list for each LV anyway when linking the LV to the VG.
Since find_lv_by_name() is now using radix_tree(),
use the same 'search for /' in the LV name for both
find_lv() & find_lv_in_vg().
TODO: Possibly refactor the code and use only dm_list
instead of lv_list and dereference the LV with container_of()
(thus saving a pointer within struct logical_volume) - but
we currently use 'lv_list' in many places...
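A minimal sketch of the container_of() idea from the TODO, using simplified
stand-in structures (not the real LVM definitions): once the list node is
embedded in the LV, the back-pointer kept in lv_list becomes redundant,
because the LV can be recovered from the node's address.

    #include <stdio.h>
    #include <stddef.h>

    #define container_of(ptr, type, member) \
        ((type *)((char *)(ptr) - offsetof(type, member)))

    struct dm_list {            /* simplified list node */
        struct dm_list *n, *p;
    };

    struct logical_volume {     /* simplified stand-in */
        const char *name;
        struct dm_list list;    /* embedded node linking the LV within its VG */
    };

    int main(void)
    {
        struct logical_volume lv = { "lvol0", { NULL, NULL } };
        struct dm_list *node = &lv.list;

        /* Recover the LV from the embedded list node - no back-pointer needed. */
        struct logical_volume *found =
            container_of(node, struct logical_volume, list);

        printf("%s\n", found->name);
        return 0;
    }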
Add lvm.conf config/validate_metadata configurable setting.
Allows disabling validation of the volume_group structure before
writing it to disk.
The call of vg_validate() is supposed to catch any inconsistency
in the in-memory volume group structure and possibly abort the
command early, before making any more 'damage', in case the VG struct
is found inconsistent after some metadata manipulation.
This is almost always useful for development - and also for normal users,
as for small metadata sizes this doesn't add too much overhead.
However, if the volume_group size is large and operations are just
adding/removing simple LVs, this validation time may add noticeably
to the final command running time.
So if the user seeks the highest command performance and does
not do any 'complex' metadata manipulation, it's reasonably safe
to disable validation here (by using the setting "none").
With the presence of uniq_insert, use this function here as well
for extra protection and to check for a duplicate lv_name
when inserting a new name into the radix_tree.
When 'lvresize -r' is used to resize the volume, it's valid to
resize an LV even to its current size, as the command then runs
the fs-resize utility to possibly upsize the fs to the current
volume size.
The return code of such a command then reflects the return value
of this fs-resize tool.
This fixes the regression introduced when support
for the --fs option was added (2.03.17).
Replace usage of dm_hash with radix_tree to quickly find an LV name
within a VG and also to index PV names within the set of available PVs.
This PV index is only needed during the import, but instead
of passing a 'radix_tree *' everywhere, just keep it within
the VG struct as well, and once the parsing is finished, release
this PV index radix_tree.
This also makes it easier to replace this structure
in the future if needed.
lv_set_name now uses radix_tree remove+insert to keep the lv_names
tree in sync and usable for find_lv queries.
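A compact, purely illustrative C sketch of the remove+insert rename pattern
(a plain array stands in for the lv_names radix_tree; all helpers here are
hypothetical), including a uniq_insert-style duplicate check:

    #include <stdio.h>
    #include <string.h>

    /* Tiny stand-in for the lv_names index: one slot per LV name. */
    #define MAX_LVS 8
    static char lv_names[MAX_LVS][32];

    static int names_find(const char *key)
    {
        for (int i = 0; i < MAX_LVS; i++)
            if (!strcmp(lv_names[i], key))
                return i;
        return -1;
    }

    /* Rename = remove the old key, then insert the new one, so by-name
     * lookups always reflect the LV's current name. */
    static int lv_set_name_sketch(const char *old_name, const char *new_name)
    {
        int i = names_find(old_name);

        if (i < 0 || names_find(new_name) >= 0)    /* missing or duplicate */
            return -1;
        lv_names[i][0] = '\0';                     /* remove the old key */
        snprintf(lv_names[i], sizeof(lv_names[i]), "%s", new_name);  /* insert */
        return 0;
    }

    int main(void)
    {
        snprintf(lv_names[0], sizeof(lv_names[0]), "lvol0");
        lv_set_name_sketch("lvol0", "data_lv");
        printf("data_lv at slot %d, lvol0 at slot %d\n",
               names_find("data_lv"), names_find("lvol0"));
        return 0;
    }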
Enhance usage with uniq_insert and also try to better
utilize the CPU cache by doing a smaller loop for individual
hashing of lvname and, separately, lvid.
Also correct the usage of 'continue' within the validation of
historical names, as it should report as many errors
as it can within the loop.
Since we detect the 'debug' level after calling 'log_debug()', all
the arguments are evaluated, so in this case display_lvname() was
preparing a string that is not used when debugging is not enabled.
Since these strings are on a 'hot path' and it's already known
which VG is being worked on, in these few cases just use lv->name.
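A small standalone sketch of the problem (generic code, not the LVM logging
macros): the argument expression is evaluated before the callee can check
the debug level, so an expensive formatting helper runs even when its output
is thrown away.

    #include <stdio.h>

    static int debug_enabled = 0;    /* debugging is off */

    /* Stand-in for display_lvname(): builds "vgname/lvname" on every call. */
    static const char *display_name(const char *vg, const char *lv)
    {
        static char buf[64];

        printf("  (display_name called - wasted work)\n");
        snprintf(buf, sizeof(buf), "%s/%s", vg, lv);
        return buf;
    }

    /* Stand-in for log_debug(): the level check happens inside the call,
     * after the caller has already evaluated all of its arguments. */
    static void log_debug_sketch(const char *msg)
    {
        if (!debug_enabled)
            return;
        printf("debug: %s\n", msg);
    }

    int main(void)
    {
        /* Even with debugging off, display_name() is evaluated here. */
        log_debug_sketch(display_name("vg", "lvol0"));

        /* The cheap hot-path variant: the VG is already known from context,
         * so passing just the LV name avoids the extra formatting. */
        log_debug_sketch("lvol0");
        return 0;
    }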
Split single check_lv_segments() into 2 separate
versions so they can be called independently.
This allows 'skipping' the already performed segment
check after it has been imported to the VG and also
avoids another repeated check when validating
segments with the complete VG.
**
check_lv_segments_incomplete_vg()
this checks just basic LV segment properties and does not
validate those requiring the full VG.
**
check_lv_segments_complete_vg()
Remaining checks that expect the complete VG to be present.
ATM this rather saves a lot of unnecessary log entries, as it grabs
the global autoextend_threshold (profile == NULL) just once instead
of resolving it every time with a NULL profile.
Track whether the import has ever seen an LV segment with log_lv,
and call the mirror fixup only in this case.
Also avoid repeated lookups of get_segtype_from_string() for
SEG_TYPE_NAME_MIRROR.
Use a bigger memory pool chunk size to reduce the number of
memory pool extensions when handling larger metadata, but do not
make it noticeably bigger when handling small ones...
Use the same larger value also when allocating the VG memory pool.
When the BLKZEROOUT ioctl fails, it should not stop us from trying direct
zeroing as a fallback action, since the ioctl is an optimization only.
We should be able to continue with new LV creation if that
direct fallback then succeeds.
Related report: https://issues.redhat.com/browse/RHEL-58737
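A minimal sketch of the fallback pattern (a standalone example, not the LVM
wiping code): try the BLKZEROOUT ioctl first and, if it fails for any
reason, fall back to zeroing the range by writing directly.

    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include <fcntl.h>
    #include <sys/ioctl.h>
    #include <linux/fs.h>    /* BLKZEROOUT */

    /* Zero 'len' bytes at 'offset' on an already opened block device. */
    static int zero_range(int fd, unsigned long long offset, unsigned long long len)
    {
        unsigned long long range[2] = { offset, len };
        char buf[4096];

        /* Fast path: ask the kernel/device to zero the range for us. */
        if (!ioctl(fd, BLKZEROOUT, range))
            return 0;

        /* Fallback: BLKZEROOUT failed (e.g. unsupported), write zeroes directly. */
        memset(buf, 0, sizeof(buf));
        while (len) {
            size_t chunk = len < sizeof(buf) ? len : sizeof(buf);
            ssize_t w = pwrite(fd, buf, chunk, offset);

            if (w <= 0)
                return -1;
            offset += w;
            len -= w;
        }
        return 0;
    }

    int main(int argc, char **argv)
    {
        int fd, r;

        if (argc != 2) {
            fprintf(stderr, "usage: %s /dev/<block-device>\n", argv[0]);
            return 1;
        }
        if ((fd = open(argv[1], O_WRONLY)) < 0)
            return 1;
        r = zero_range(fd, 0, 1024 * 1024);    /* zero the first 1 MiB */
        close(fd);
        return r ? 1 : 0;
    }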
commit a125a3bb50 "lv_remove: reduce commits for removed LVs"
changed "lvremove <vgname>" from removing one LV at a time,
to removing all LVs in one vg write/commit. It also changed
the behavior if some of the LVs could not be removed, from
removing those LVs that could be removed, to removing nothing
if any LV could not be removed. This caused a regression in
shared VGs using sanlock, in which the on-disk lease was
removed for any LV that could be removed, even if the command
decided to remove nothing. This would leave LVs without a
valid on-disk lease, and "lock failed: error -221" would be
returned for any command attempting to lock the LV.
Fix this by not freeing the on-disk leases until after the
command has decided to go ahead and remove everything, and
has written the VG metadata.
Before the fix:
node1: lvchange -ay vg/lv1
node2: lvchange -ay vg/lv2
node1: lvs
lv1 test -wi-a----- 4.00m
lv2 test -wi------- 4.00m
node2: lvs
lv1 test -wi------- 4.00m
lv2 test -wi-a----- 4.00m
node1: lvremove -y vg/lv1 vg/lv2
LV locked by other host: vg/lv2
(lvremove removed neither of the LVs, but it freed
the lock for lv1, which could have been removed
except for the proper locking failure on lv2.)
node1: lvs
lv1 test -wi------- 4.00m
lv2 test -wi------- 4.00m
node1: lvremove -y vg/lv1
LV vg/lv1 lock failed: error -221
(The lock for lv1 is gone, so nothing can be done with it.)
This provides better hints when trying to resize the fs on top of an LV.
Also needs a3f6d2f593 for proper operation.
❯ lvs -o name,size vg/swap
lv_name lv_size
swap 60.00m
Before:
❯ lvextend -L72m vg/swap
Size of logical volume vg/swap changed from 60.00 MiB (15 extents) to 72.00 MiB (18 extents).
Logical volume vg/swap successfully resized.
❯ lvreduce -L60m vg/swap
File system swap found on vg/swap.
File system device usage is not available from libblkid.
❯ lvreduce -L50m vg/swap
Rounding size to boundary between physical extents: 52.00 MiB.
File system swap found on vg/swap.
File system device usage is not available from libblkid.
After:
❯ lvextend -L72m vg/swap
Size of logical volume vg/swap changed from 60.00 MiB (15 extents) to 72.00 MiB (18 extents).
Logical volume vg/swap successfully resized.
❯ lvreduce -L60m vg/swap
File system swap found on vg/swap.
File system size (60.00 MiB) is equal to the requested size (60.00 MiB).
File system reduce is not needed, skipping.
Size of logical volume vg/swap changed from 72.00 MiB (18 extents) to 60.00 MiB (15 extents).
Logical volume vg/swap successfully resized.
❯ lvreduce -L50m vg/swap
Rounding size to boundary between physical extents: 52.00 MiB.
File system swap found on vg/swap.
File system size (60.00 MiB) is larger than the requested size (52.00 MiB).
File system reduce is required and not supported (swap).
lvremove of a thin LV while the pool is inactive would
leave the pool locked but inactive.
lvcreate of a thin snapshot while the pool is inactive
would leave the pool locked but inactive.
lvcreate of a thin LV could activate the pool to check
a threshold before the pool lock was acquired in lvmlockd.
The lv_hash wasn't being passed to the seg-specific text import
functions, so they were doing many find_lv() calls, which consume
a lot of time when there are many LVs in the metadata.
Revert 373372c8ab and instead update
our validation code to handle LVs with an empty segment - currently
we should need this only for the pvmove operation, thus such an LV should
have the name 'pvmove%u'.
This fixes a problem where a user tried e.g. pvmove on a VG with a single
PV - as reported: https://github.com/lvmteam/lvm2/issues/148
Reported-by: bob@redhat.com