IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
When the fabric failure occurs, it will lose the connection with hosts
instantly, and after a while it can recovery back so that the hosts can
continue to access the drives.
For this case, the locking manager should be reliable for this case and
can dynamically handle this case and allows user to continue to use the
VG/LV with associated locking scheme.
This patch adds a testing to emulate the fabric faliure, verify LVM
commands for this case. The testing usage is:
# make check_lvmlockd_idm \
LVM_TEST_BACKING_DEVICE=/dev/sdo3,/dev/sdp3,/dev/sdp4 \
LVM_TEST_FAILURE=1 T=idm_fabric_failure.sh
Signed-off-by: Leo Yan <leo.yan@linaro.org>
After the lvmlockd abnormally exits and relaunch the daemon, if LVM
commands continue to run, lvmlockd and the backend lock manager (e.g.
sanlock lock manager or IDM lock manager) should can continue to serve
the requests from LVM commands.
This patch adds a test to emulate lvmlockd failure, and verify the LVM
commands after lvmlockd recovers back. Below is an example for testing
the case:
# make check_lvmlockd_idm \
LVM_TEST_BACKING_DEVICE=/dev/sdo3,/dev/sdp3,/dev/sdp4 \
LVM_TEST_FAILURE=1 T=lvmlockd_failure.sh
Signed-off-by: Leo Yan <leo.yan@linaro.org>
This patch is to add the stress testing, which launches three threads,
one thread is for creating/removing PV, one thread is for
creating/removing VG, and the last one thread is for LV operations.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
This patch is to add the stress testing, which launches two threads,
each thread creates LV, activate and deactivate LV in the loop; so this
can test for multi-threading in lvmlockd and its backend lock manager.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
This patch is to add the stress testing, which loops to create LV,
activate and deactivate LV in the single thread.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
This patch is to introduce testing option LVM_TEST_LOCK_TYPE_IDM, with
specifying this option, the Seagate IDM lock manager will be launched as
backend for testing. Also add the prepare and remove shell scripts for
IDM.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
While we heavily try to spot arrays that are not yet in-sync,
some kernels tends to block our lvm2 command in kernel,
while we resume these smaller raid arrays even for 5 seconds.
But since the result is not really wrong - report these
check failures only as TEST WARNING.
The autoactivation property can be specified in lvcreate
or vgcreate for new LVs/VGs, and the property can be changed
by lvchange or vgchange for existing LVs/VGs.
--setautoactivation y|n
enables|disables autoactivation of a VG or LV.
Autoactivation is enabled by default, which is consistent with
past behavior. The disabled state is stored as a new flag
in the VG metadata, and the absence of the flag allows
autoactivation.
If autoactivation is disabled for the VG, then no LVs in the VG
will be autoactivated (the LV autoactivation property will have
no effect.) When autoactivation is enabled for the VG, then
autoactivation can be controlled on individual LVs.
The state of this property can be reported for LVs/VGs using
the "-o autoactivation" option in lvs/vgs commands, which will
report "enabled", or "" for the disabled state.
Previous versions of lvm do not recognize this property. Since
autoactivation is enabled by default, the disabled setting will
have no effect in older lvm versions. If the VG is modified by
older lvm versions, the disabled state will also be dropped from
the metadata.
The autoactivation property is an alternative to using the lvm.conf
auto_activation_volume_list, which is still applied to to VGs/LVs
in addition to the new property.
If VG or LV autoactivation is disabled either in metadata or in
auto_activation_volume_list, it will not be autoactivated.
An autoactivation command will silently skip activating an LV
when the autoactivation property is disabled.
To determine the effective autoactivation behavior for a specific
LV, multiple settings would need to be checked:
the VG autoactivation property, the LV autoactivation property,
the auto_activation_volume_list. The "activation skip" property
would also be relevant, since it applies to both normal and auto
activation.
Switch to plain 'kill' we should no longer need SIGKILL
as polling can be interrupted.
Resolve problem in aux wait_pvmove_lv_ready() that was using
lvm command to check for UUID - but this was interferring with
VG lock and it's been delaying confirmation.
So reducing slow-down of test - so it can run faster.
Looks like there was some missed versioning increase during devel.
So with kernel >= 4.18 version 1.19 is enough to look like 1.20
However backported 1.19 targets seems to not provide all
the capabilities.
Added comment the 'lvs' already initiates dmeventd
Note: we don't have any query mechanism to check if dmeventd
is already running except access of socket which basically
starts dmeventd if it's not running.
For determinist test results lvm2/dm service shall not be present
and running in the system as it may randomize test results.
In case they are found present, this test ends with warning (not failure).
Some older instancies of 'mdadm' opened legs in RW and
closed and opened again and expected exlusive access.
But here udev rule can be fired - so on these versions
slow down whole mdadm runtime by using strace, to
give system a bit more time to finish udev rule.
Just like lvm command ignores 0/xxxx report from judging the status.
Avoid using infinite loop and limit report checking to 100 checks.
If it would need more - something is not right.
Combination of throttling and slowed device is a bit faster.
Also add FIXME about the mutliple spawn polling processing
when activating invidual LV for a pvmove.
Use for testing new mdadm_create aux wrapper.
Place functionality into a 2 pass loop - one for 'auto' other for 'start'.
Share same tests between raid level 0 and level 1 version of raid.
We have here some kind of catch-22 - since older kernels do
use 'resync' while new 'recover' for initial raid synchronization.
So now - how do we recognize in which state of raid we are.
ATM seems to be simplest to simply keep disabled droping of primary raid
leg unless we are in sync.
FIXME: we should add a target version check and enable removal
Adding full filesystem sync, trying to fight with strange error from losetup:
losetup: loopa: failed to set up loop device: Resource temporarily unavailable
loop0: detected capacity change from 0 to 4096
loop_set_block_size: loop0 () has still dirty pages (nrpages=13)
Also reuse internal aux wipefs_a
This reverts commit 99b6173f10.
These tests are disabled with lvmlockd because they use
snapshots without an origin which is not permitted in a
shared vg.
user creates a file listing real devices they want
lvm tests to use, and sets LVM_TEST_DEVICE_LIST.
lvm tests can use these with prepare_real_devs
and get_real_devs.
Other aux functions do not work with these devs.
To better test actually fsadm in test suite - avoid setting
LVM_BINARY locally - since test setup already modifies
PATH to find test's lvm binary as the 1st. in path.
Move extra md component detection into the label scan phase.
It had been in set_pv_devices which was deep within the vg_read
phase, which wasn't a good place (better to detect that earlier.)
Now that pv metadata info is available in the scan phase, the pv
details (size and device_hint) can be used for extra md checking.
Use the device_hint from the pv metadata to trigger a full md
component check if the device_hint begins with /dev/md.
Stop triggering full md component checks based on missing
udev info for a dev.
Changes to tests to reflect that the code is now detecting
md components in some test case that it wasn't before.
A cachevol can be forcibly detached when it's missing devices.
Also allow this if it's damaged/invalid and unrepairable.
This would be needed to recover data from the origin LV after
a cachevol is lost or damaged beyond repair.
In cases where lvconvert does not detect a fs block size on the
device, it falls back to choosing a writecache block size based
on the device's LBS and PBS (tries to match those.)
If the user specifies a writecache block size on the command
line (--cachesettings block_size=4096|512), lvconvert currently
fails and reports an error if the user-specified value does not
match the value lvconvert would have chosen based on LBS and PBS.
The purpose of allowing a user-specified value on the command line
is to override what lvconvert would otherwise do, so change this
to just print a warning that the user value does not match the
value that would be chosen based on the LBS/PBS, and then take
the user-specified value as the writecache block size.
In case legs of a raid0 LV are removed, the lvdisplay command still
reports 'available' though raid0 is not providing any resilience
compared to the other raid levels.
Also lvdisplay does not display '(partial)' in case of missing raid0
legs as oposed to the lvs command.
Enhance lvdisplay to report "NOT available" for any RaidLV type in case
too many legs are inaccessible hence causing data loss. I.e. any leg
for raid0, all for raid1, more than 1 for raid4/5, more than 2 for raid6
and in case of completely lost mirror groups for raid10.
Add test/shell/lvdisplay-raid.sh.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1872678
When a writecache sublv or an integrity metadata sublv
are partial (missing a dev), set the partial flag on
the upper level LV also, as is done for other sublvs.
Fix the two-step writecache detach in commit c32d7fed4f.
In the case of uncache, the cachevol is removed after
detaching the writecache. When the detach is finished
in the second step, the remove must wait until then.
Verify that corruption is corrected for raid levels other
than raid1. For other raid levels, attempt to corrupt the
given file pattern on each underlying device, since we don't
know which device contains the file being corrupted.
This ensures that corruption is actually be introduced
when testing the other raid levels.
Verify that corruption is being corrected by checking
the integritymismatches count is non-zero for the raid LV,
which includes the total from all images (since we don't
know which image will have the corruption.)
Test case where filesystem has been corrected via fsck.
In such case fsck returns '1' as success and should be
handled in a same way as '0' since fs is correct.
Restructure the pvscan code, and add new temporary files
that list pvids in a VG, used for processing PVs that
have no metadata.
The new temp files, in /run/lvm/pvs_lookup/<vgname>, allow a
proper pvscan --cache to be done on PVs that have no metadata.
pvscan --cache <dev> is only supposed to read <dev>, but when
<dev> has no metadata, this had not been possible. The
command had to fall back to scanning all devices to read all
VG metadata to get the list of all PVIDs needed to check for
a complete VG. Now, the temp file can be used in place of
reading metadata from all PVs on the system.
Since 'BLKZEROOUT' streams out more block at once, at can easily
zero-out larger set of blocks after 1st. failing one.
So the test is adapted to fully 'hide' swap header under error target.
Cover the case where two copies of metadata have the
same seqno but different checksums. Also elaborate
on an existing fixme in the code for this case, since
we should be doing something better for this case.
This had been uncovering an issue with reopening
fds in readwrite mode.
This test seems to be hitting some corner case in handling
out-of-metadata space condintion in thin-pool.
Add few more aid notes and functionality.
Also add missing '|| true' with now direct-IO dd command.
Shorten running time of the test.
Fix some issues in invoked resizing script to it returns
correct return code and dmeventd can be a little bit quicker
in this test.
When loop can't handle sector-size option - failure caused double fail
for access of unbound variable
Also fix expression for 'rm' and remove loops after loop release.
If the test runs of loop device backend with 512 sectors,
xfs selects this smaller sector size and then data do not fit
(we would need -l9 with most of 'raids').
With 4K sectors data always fits.
Shorten and make the test easily readable by moving same code into
function and removed one duplicated test for 512,4096 combination.
Always use scsi_debug - since default ramdisk or loop device backend
is unpredictible.
Use new SKIP_WITH_LOW_SPACE and set higher requirement for free space.
But still this test can't run on system's tmpfs directories -
as they typically provide less then 2G of space and when the test
runs there it also provisioning for all READ pages!)
BRD (ramdisk) device should work.
Extend a _wait_recalc() loop for slower hw.
When creating large raid which do not need to be fully synchronized use
them on delay devices - so even less data needs read/write.
Remove unneeded lvchange as lvcreate is already leaving LV inactive.
Replace printf with awk as generator.
mm
While the previous commit c9b40083fc
decresed version to 1.19 for using bigger datasets, it's not
been quite right - so from our bb machine it looks like
bigger metadata consumption started with 1.19 and kernel 4.18
(fc27)
Use bigger volume and slowdown writing to cache device.
This allows more simple to reach 'dirty' state.
Also document exactly 1 SIGINT has to fire aborting of flushing.
For proper checking of extension progress require version 1.15
It looks with older versoin extension happens during very slow
resume within lvm command - although speed is still somewhat slow
with latest version.
Speed-up a bit the first synchronization with just 50ms write delay,
but later set also delay on read to slowdown lvextend.
FIXME: there are still things to look at:
0 229376 raid raid1 2 AA 229376/229376 idle 0 0
0 229376 raid raid1 2 AA 0/229376 frozen 0 0 -
0 262144 raid raid1 2 AA 229376/262144 repair 0 0 -
0 262144 raid raid1 2 AA 229376/262144 repair 0 0 -
0 262144 raid raid1 2 AA 245888/262144 repair 0 0 -
On test system with 'default' filter (aka accept all) test
after enabling device can suffer from automatic system
activation - so for created LVs setup skipping this automatic
activation. This should prevent getting LVs into table
with pvscan service.
The test was using a raid+integrity LV without
first waiting for the integrity sync, which could
cause the test to fail (depending on init speed)
where it depends on integrity to work in uninitialized
areas.
Also use cmp instead of diff.
dm-integrity stores checksums of the data written to an
LV, and returns an error if data read from the LV does
not match the previously saved checksum. When used on
raid images, dm-raid will correct the error by reading
the block from another image, and the device user sees
no error. The integrity metadata (checksums) are stored
on an internal LV allocated by lvm for each linear image.
The internal LV is allocated on the same PV as the image.
Create a raid LV with an integrity layer over each
raid image (for raid levels 1,4,5,6,10):
lvcreate --type raidN --raidintegrity y [options]
Add an integrity layer to images of an existing raid LV:
lvconvert --raidintegrity y LV
Remove the integrity layer from images of a raid LV:
lvconvert --raidintegrity n LV
Settings
Use --raidintegritymode journal|bitmap (journal is default)
to configure the method used by dm-integrity to ensure
crash consistency.
Initialization
When integrity is added to an LV, the kernel needs to
initialize the integrity metadata/checksums for all blocks
in the LV. The data corruption checking performed by
dm-integrity will only operate on areas of the LV that
are already initialized. The progress of integrity
initialization is reported by the "syncpercent" LV
reporting field (and under the Cpy%Sync lvs column.)
Example: create a raid1 LV with integrity:
$ lvcreate --type raid1 -m1 --raidintegrity y -n rr -L1G foo
Creating integrity metadata LV rr_rimage_0_imeta with size 12.00 MiB.
Logical volume "rr_rimage_0_imeta" created.
Creating integrity metadata LV rr_rimage_1_imeta with size 12.00 MiB.
Logical volume "rr_rimage_1_imeta" created.
Logical volume "rr" created.
$ lvs -a foo
LV VG Attr LSize Origin Cpy%Sync
rr foo rwi-a-r--- 1.00g 4.93
[rr_rimage_0] foo gwi-aor--- 1.00g [rr_rimage_0_iorig] 41.02
[rr_rimage_0_imeta] foo ewi-ao---- 12.00m
[rr_rimage_0_iorig] foo -wi-ao---- 1.00g
[rr_rimage_1] foo gwi-aor--- 1.00g [rr_rimage_1_iorig] 39.45
[rr_rimage_1_imeta] foo ewi-ao---- 12.00m
[rr_rimage_1_iorig] foo -wi-ao---- 1.00g
[rr_rmeta_0] foo ewi-aor--- 4.00m
[rr_rmeta_1] foo ewi-aor--- 4.00m
To write a new/repaired pv_header and label_header:
pvck --repairtype pv_header --file <file> <device>
This uses the metadata input file to find the PV UUID,
device size, and data offset.
To write new/repaired metadata text and mda_header:
pvck --repairtype metadata --file <file> <device>
This requires a good pv_header which points to one or two
metadata areas. Any metadata areas referenced by the
pv_header are updated with the specified metadata and
a new mda_header. "--settings mda_num=1|2" can be used
to select one mda to repair.
To combine all header and metadata repairs:
pvck --repair --file <file> <device>
It's best to use a raw metadata file as input, that was
extracted from another PV in the same VG (or from another
metadata area on the same PV.) pvck will also accept a
metadata backup file, but that will produce metadata that
is not identical to other metadata copies on other PVs
and other areas. So, when using a backup file, consider
using it to update metadata on all PVs/areas.
To get a raw metadata file to use for the repair, see
pvck --dump metadata|metadata_search.
List all instances of metadata from the metadata area:
pvck --dump metadata_search <device>
Save one instance of metadata at the given offset to
the specified file (this file can be used for repair):
pvck --dump metadata_search --file <file>
--settings "metadata_offset=<off>" <device>
When running cluster test with clvmd, the actual 'monitoring'
happens in cluster - so the 'already monitored' message
is also logged within clvmd code and the command cannot
see such effect.
clvmd was incapable to report this information back to command
so it cannot be displayed this way.
Add 'lvs -o+seg_monitor' validation which also works in clustered mode.
Improve the implementation of extracting all text metadata
copies from the metadata area. Use this for the existing
metadata_all dump option.
Add a new metadata_search dump option which does not use
lvm headers to find metadata, but looks in standard
locations. This is useful if headers are damaged and
can't be used to locate metadata.
Adding '-v' to metadata_all or metadata_search will add
the description and creation_time to the printed list of
metadata instances that are found.
In the hex dump output, grep for the vgname
followed by one space. This allows for test pids
with up to seven digits, which are used to contruct
the variable vgname used by the test. Otherwise
the long vgname wraps to the next line and fails to
match in grep.
When an LV is used as a writecache cachevol, give
it the LV name a _cvol suffix. Remove the suffix
when the cachevol is detached, restoring the
original LV name.
Use /dev/md33 instead of /dev/md0 to reduce chances of
conflicting with an existing name.
Only call 'mdadm --stop /dev/md33' for cleanup and don't
use 'mdadm --stop --scan' to avoid stopping other md devs.
Due to a dm-raid target flaw fixed in target version 1.15.0,
extents of raid sets don't get resynchronized when new MD bitmp
pages have to be allocated due to the extension.
Introduce lvextend-raid.sh to test this flaw.
Related: rhbz1671964
When an online PV completed a VG, the standard
activation functions were used to activate the VG.
These functions use a full scan of all devs.
When many pvscans are run during startup and need
to activate many VGs, scanning all devs from all
the pvscans can take a long time.
Optimize VG activation in pvscan to scan only the
devs in the VG being activated. This makes use of
the online file info that was used to determine
the VG was complete.
The downside of this approach is that pvscan activation
will not detect duplicate PVs and block activation,
where a normal activation command (which scans all
devices) would.
Some older BB with older cryptsetup tool do not 'retry' on remove
and when remove is issued right after 'fsck' - it might be
rejected with:
Device @PREFIX@-tcrypt2 is busy.
Try to use udevadm settle.
The cache repair utility does not yet work with a cachevol
(where metadata and data exist on the same LV.) So, warn
and prompt if writeback is specified with a cachevol.
Previously the VG metadata description field (which contains
the command line) was only included in backup/archive copies
of the metadata. Now also include it in the metadata written
to the metadata areas.