IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
Commit 989626926c
introduced 2 new tests
lvconvert-raid-takeover-linear_to_raid4.sh and
lvconvert-raid-takeover-raid4_to_linear.sh
which involve raid reshaping.
Bump the checked dm-raid target version to 1.14.0
which has reshape kernel fixes to avoid test suite
runs to hang.
Bump target version to 1.14.0 which contains fixes
for reshape deadlock/corruption to allow tests to
run once the respective fixes show up in kernels.
Remove now superfluous multi-core checks.
Resolves: rhbz1501145
Related: rhbz1514539
Related: rhbz1586123
Related: rhbz1613039
Allow "lvconvert --type linear RaidLV" on a raid4 LV
providing convenient interim steps to convert to linear.
Add respective new test
lvconvert-raid-takeover-raid4_to_linear.sh
and
lvconvert-raid-takeover-linear_to_raid4.sh
for linear to raid4 once on it.
When converting from striped/raid0/raid0_meta
to raid6 with > 2 stripes, allow possible
direct conversion (to raid6_n_6).
In case of 2 stripes, first convert to raid5_n to restripe
to at least 3 data stripes (the raid6 minimum in lvm2) in
a second conversion before finally converting to raid6_n_6.
As before, raid6_n_6 then can be converted
to any other raid6 layout.
Enhance lvconvert-raid-takeover.sh to test the
2 stripes conversions to raid6.
Resolves: rhbz1624038
"lvconvert --type linear RaidLV" on striped and raid4/5/6/10
have to provide the convenient interim layouts. Fix involves
a cleanup to the convenience type function.
As a result of testing, add missing sync waits to
lvconvert-raid-reshape-linear_to_raid6-single-type.sh.
Resolves: rhbz1447809
Conversion to striped from raid0/raid0_meta is directly possible.
Fix a regression setting superfluous interim raid5_n conversion type
introduced by commit bd7cdd0b09.
Add new test script lvconvert-raid0-striped.sh.
Resolves: rhbz1608067
Check allocation of thin-pool works on 2PVs, when one is so full,
that even metadata do not fit there (as they need at least 2M,
while 99% of 63MB fills >62MB)
The md filter can operate in two native modes:
- normal: reads only the start of each device
- full: reads both the start and end of each device
md 1.0 devices place the superblock at the end of the device,
so components of this version will only be identified and
excluded when lvm uses the full md filter.
Previously, the full md filter was only used in commands
that could write to the device. Now, the full md filter
is also applied when there is an md 1.0 device present
on the system. This means the 'pvs' command can avoid
displaying md 1.0 components (at the cost of doubling
the i/o to every device on the system.)
(The md filter can operate in a third mode, using udev,
but this is disabled by default because there have been
problems with reliability of the info returned from udev.)
Add tests for linear <-> striped|raid* conversions.
Add region_size config to reshape tests to avoid test
failures in case of it being defined unexpectedly in lvm.conf.
Related: rhbz1439925
Related: rhbz1447809
We have been warning about duplicate devices (and disabling lvmetad)
immediately when the dup was detected (during label_scan). Move the
warnings (and the disabling) to happen later, after label_scan is
finished.
This lets us avoid an unwanted warning message about duplicates
in the special case were md components are eliminated during the
duplicate device resolution.
This minor patch fixes grammar in a few messages which get
printed to users. It also fixes the same grammar mistake in
several comments.
Signed-off-by: Rick Elrod <relrod@redhat.com>
--
While newer system can detect need for 4K mkfs, on older test machines
running test suite over 4k is reporting problems.
Some more generic solution is needed thought.
This test can't use brd (ramdisk) as backend since for some
weird reason lsblk is not listing these device.
TODO: test could be probably rewritten to avoid using lsblk somehow??
When the backend device supports only 4K blocks (like ramdisk)
we cannot use for testing any smaller blocksize.
So recalc test for 4K extent size.
We may possibly introduce one list extra test that
can be executed on devices with 512b sectors to
check lvm2 support those min extent sizes...
In case "lvconvert -mN RaidLV" was used on a degraded
raid1 LV, success was returned instead of an error.
Provide message to inform about the need to repair first
before changing number of mirrors and exit with error.
Add new lvconvert-m-raid1-degraded.sh test.
Resolves: rhbz1573960
There are likely more bits of code that can be removed,
e.g. lvm1/pool-specific bits of code that were identified
using FMT flags.
The vgconvert command can likely be reduced further.
The lvm1-specific config settings should probably have
some other fields set for proper deprecation.
Instead of using delayer device user 'zero' device and let mirror
do some real work which takes some time.
In case the test machine is too fast - mirror might need to be made bigger
to meet needed criteria.
Also move all test needed this 'zero' PV trick to the end of test
so $dev2 and $dev4 are covered with 'zero' and can take any amount of
write without consuming any real space.
For reporting commands (pvs,vgs,lvs,pvdisplay,vgdisplay,lvdisplay)
we do not need to repeat the label scan of devices in vg_read if
they all had matching metadata in the initial label scan. The
data read by label scan can just be reused for the vg_read.
This cuts the amount of device i/o in half, from two reads of
each device to one. We have to be careful to avoid repairing
the VG if we've skipped rescanning. (The VG repair code is very
poor, and will be redone soon.)
Test that no (Sub)LV remnants persist if the volume group is
not listed in configuration variable activation/volume_list,
hence not activatable thus causing initialization of rmeta
SubLVs to fail.
Related: rhbz1161347
Correct testing with format 1 and mq policy.
Add testing of 'smq'
Fix testing with clvmd - where logged message is part of clvmd log
and we can only check command status.
Use 4K chunks since some older kernels are not capable
to create striped volumes with smaller size.
TODO: lvm2 should detect this ahead and avoid kernel
reporting "Invalid chunk".
Add three new raid tests with io load and table
reloads during reshape for target 1.13.2.
Add a raid0 to raid10 conversion test.
Also add more signals to trap in lvconvert-raid-reshape-load.sh.
Avoiding "$(get first_extent_sector "$d")" in the loop
allows the test to succeed in the cluster. Further cluster
analysis needed to get to the core reason.
The lvm2 test suite aims at small test resource footprints
(few PVs, small PV sizes) to run on tmpfs backed loop device.
OTOH, lvconvert-reshape-raid.sh aims to test the maxima of
supported total stripes of 64. This patch adds a prerequisite
conditional to skip tests using more than 14 stripes.
It requires the target version 1.13.1 to avoid deadlocks.
In some cases the message could be slightly misleading so use
here rather conditional.
TODO:
In future we may possibly further tune the message in case we are
certain the level of redundancy protection has not been reduced.
The lvchange-raid[456].sh test checks that mismatches can be detected
properly. It does this by writing garbage to the back half of one of
the legs directly. When performing a "check" or "repair" of mismatches,
MD does a good job going directly to disk and bypassing any buffers that
may prevent it from seeing mismatches. However, in the case of RAID4/5/6
we have the stripe cache to contend with and this is not bypassed. Thus,
mismatches which have /just/ happened to an area that now populates the
stripe cache may be overlooked. This isn't a serious issue, however,
because the stripe cache is short-lived and reasonably small. So, while
there may be a small window of time between the disk changing underneath
the RAID array and when you run a "check"/"repair" - causing a mismatch
to be missed - that would be no worse than if a user had simply run a
"check" a few seconds before the disk changed. IOW, it simply isn't worth
making a fuss over dropping the stripe cache before beginning a "check" or
"repair" (which we actually did attempt to do a while back).
So, to get the test running smoothly, we simply deactivate and reactivate
the LV to force the stripe cache to be dropped and then proceed. We could
just as easily wait a few seconds for the stripe cache to empty also.
When a "recover" is just starting for a RAID LV, it is possible to get
"idle" for the sync action if the status is issued quickly enough. This
is fine, the MD thread just hasn't gotten things going yet. However,
the /need/ for a "recover" should be marked in md->recovery and it would
be simple enough to fix the kernel so this doesn't happen. May eventually
want a separate bug for this, but for now it fits with RHBZ 1507719.
There are two known bugs in the lvconvert-raid-status-validation.sh
test. The first one I consider to be more of an annoyance (1507719).
The second one I consider to be more serious (1507729).
RHBZ 1507719 simply documents the fact that the three RAID status
fields may not always be coherent due to the way they are set and
unset when the MD thread is shutting down and starting up. For
example, the sync ratio may be 100% but the sync action may not
yet have switched to "idle" and the health characters may not yet
all be 'A's (i.e. the devices set to InSync).
RHBZ 1507729 is more serious. The sync ratio can be 100% for a
short period of time after upconverting linear -> RAID1. It is
reset to 0 once the MD sync thread gets to work on it. It does
this because, technically, the array /is/ in-sync if the new
devices are excluded - i.e. the data is 100% available and
consistent. I'm not sure what to do about this problem, but we'd
much rather not have this state that looks exactly like the
end of the process when the sync ratio is 100% because the
"recover" process finished, but the sync action and health
characters haven't been updated yet. Put simply, the problem
is that we can't tell if a sync is starting or finished based
on the status output.
Preload reiserfs module for the case, fs is present/compiled for a
kernel but it's not present in memory.
Size reducition needs --yes confirmation to preceed for reiserfs.
Correct reported message when thin snapshot has been already merged.
So lvm2 is no longer reporting "Mergins of snapshot X will occur..."
(even with swapped names).
the bug in LUKS grow/shrink decision in fsadm was
masked due to fact that default LVM2 extent size
was larger than LUKS1 default data offset for dm-crypt
mapping. The new test address this bug.
Fix code checking that the 2nd mda which is at the end of disk really
fits the available free space and avoid any DA and MDA interleaving when
we already have DA preallocated. This mainly applies when we're restoring
a PV from VG backup using pvcreate --restorefile where we may already have
some DA preallocated - this means the PV was in a VG before with already
allocated space from it (the LVs were created). Hence we need to avoid
stepping into DA - the MDA can never ever be inside in such case!
The code responsible for this calculation was already in
_text_pv_add_metadata_area fn, but it had a bug in the calculation where
we subtracted one more sector by mistake and then the code could still
incorrectly allocate the MDA inside existing DA. The patch also renames
the variable in the code so it doesn't confuse us in future.
Also, if the 2nd mda doesn't fit, don't silently continue with just 1
MDA (at the start of the disk). If 2nd mda was requested and we can't
create that due to unavailable space, error out correctly (the patch
also adds a test to shell/pvcreate-operation.sh for this case).
Add missing get_devs.
When $7 is not given use empty string.
See if we can live with less RAM disk for PVs.
Drop limitation on single core as presence 1.12 should address this.
Possible misspelling: FAILED_MIXED_STR may not be assigned, but FAIL_MIXED_STR is.
Possible misspelling: FAILED_MULTI_STR may not be assigned, but FAIL_MULTI_STR is.
Possible misspelling: FAILED_BLACK_STR may not be assigned, but FAIL_BLACK_STR is.
New tests to add checking for '100%' in-sync at start of "recover"
process (it shouldn't happen, but I've seen it before). Also,
check status over the whole cycle of various sync processes ("resync"
and "recover").
First test in this file checks whether 'aa' is ever spotted during
a "recover" operation (it should not be). More tests should follow
in this file to look for oddities in status output - especially as
it relates to the sync_ratio, dev_health, and sync_action fields.
For the test clean-up, I was providing too many devices to the first
command - possibly allowing it to allocate in the wrong place. I was
also not providing a device for the second command - virtually ensuring
the test was not performing correctly at times.
This patch ensures that under normal conditions (i.e. not during repair
operations) that users are prevented from removing devices that would
cause data loss.
When a RAID1 is undergoing its initial sync, it is ok to remove all but
one of the images because they have all existed since creation and
contain all the data written since the array was created. OTOH, if the
RAID1 was created as a result of an up-convert from linear, it is very
important not to let the user remove the primary image (the source of
all the data). They should be allowed to remove any devices they want
and as many as they want as long as one original (primary) device is left
during a "recover" (aka up-convert).
This fixes bug 1461187 and includes the necessary regression tests.
Two of the sync actions performed by the kernel (aka MD runtime) are
"resync" and "recover". The "resync" refers to when an entirely new array
is going through the process of initializing (or resynchronizing after an
unexpected shutdown). The "recover" is the process of initializing a new
member device to the array. So, a brand new array with all new devices
will undergo "resync". An array with replaced or added sub-LVs will undergo
"recover".
These two states are treated very differently when failures happen. If any
device is lost or replaced while "resync", there are no worries. This is
because any writes created from the inception of the array have occurred to
all the devices and can be safely recovered. Even though non-initialized
portions will still be resync'ed with uninitialized data, it is ok. However,
if a pre-existing device is lost (aka, the original linear device in a
linear -> raid1 convert) during a "recover", data loss can be the result.
Thus, writes are errored by the kernel and recovery is halted. The failed
device must be restored or removed. This is the correct behavior.
Unfortunately, we were treating an up-convert from linear as a "resync"
when we should have been treating it as a "recover". This patch
removes the special case for linear upconvert. It allows each new image
sub-LV to be marked with a rebuild flag and treats the array as 'in-sync'.
This has the correct effect of causing the upconvert to be treated as a
"recover" rather than a "resync". There is no need to flag these two states
differently in LVM metadata, because they are already considered differently
by the kernel RAID metadata. (Any activation/deactivation will properly
resume the "recover" process and not a "resync" process.)
We make this behavior change based on the presense of dm-raid target
version 1.9.0+.
Code path missed validation of lvcreate --cachepool argument.
If the non cache-pool LV was passed in, code has still continued
further work and failed later on internal error. Validate this
condition at right place now.
With recent updates for thin pool monitoring in version 169
we lost multiple WARNINGs to be printed in syslog, when
pool crossed 80%, 85%, 90%, 95%, 100%.
Restore this logic as we want to keep user informed more
then just once when 80% boundary is passed.
If the device size does not match the size requested
by --setphysicalvolumesize, then prompt the user.
Make the pvcreate checking/prompting code handle
multiple prompts for the same device, since the
new prompt can be in addition to the existing
prompt when the PV is in a VG.